Introduction: Flash-based Data Management
With advances in electronic technology, flash memory has emerged as a new storage medium and is widely used in embedded and portable devices, in areas such as mobile communication, industrial control, aeronautics and astronautics, and notebook computers. As flash capacity grows rapidly, data management on flash has become a major new challenge, motivating research on flash-based databases and applications as well as on their frameworks and structure. This project studies the fundamental theory and design principles of flash-based databases, covering key problems such as system architecture, storage management and indexing, query processing, and transaction processing.
Motivation

Flash memory has distinctive physical characteristics: asymmetric I/O (writes cost more than reads), erase-before-rewrite, and a limited number of erase cycles. Conventional disk-based databases perform poorly when applied directly to flash memory. To improve performance, conventional database designs must be reworked around these characteristics. Our research addresses storage, indexing, query processing, and transaction processing.

Research Work
  • Storage Management
    SSDs, which are built from flash memory chips, are increasingly common in everyday use, and the trend is for SSDs to replace hard disks as secondary storage. However, because of their high cost and low capacity, SSDs will coexist with hard disks in computers for a long time. For this hybrid setting, we study how to distribute data between SSD and HDD.

  • Indexing

    IO Evaluation
    Conventional I/O evaluation assumes that reads and writes cost the same. On flash memory, a write costs several times as much as a read. Because of this asymmetry, the I/O performance of a flash-based database needs to be re-evaluated: read and write costs should be accounted for separately, and the cost of erase operations should be included as well.
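
A minimal sketch of such a cost model follows. The class name, function names, and per-operation costs are illustrative assumptions, not measurements or part of the project. It shows how a plan that looks cheaper under a symmetric model can turn out more expensive once writes and erases are priced separately.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FlashCostModel:
    """Per-operation costs in microseconds (assumed values, for illustration only)."""
    read_us: float = 25.0
    write_us: float = 200.0
    erase_us: float = 1500.0

    def cost(self, reads: int, writes: int, erases: int) -> float:
        # Reads, writes, and erases are priced separately.
        return (reads * self.read_us
                + writes * self.write_us
                + erases * self.erase_us)


def symmetric_cost(reads: int, writes: int, unit_us: float = 25.0) -> float:
    """Disk-style estimate: every page I/O costs the same, erases are ignored."""
    return (reads + writes) * unit_us


model = FlashCostModel()
plan_a = {"reads": 100, "writes": 10, "erases": 0}  # read-heavy plan
plan_b = {"reads": 40, "writes": 40, "erases": 1}   # fewer I/Os, but write-heavy

symmetric_a = symmetric_cost(plan_a["reads"], plan_a["writes"])
symmetric_b = symmetric_cost(plan_b["reads"], plan_b["writes"])
flash_a = model.cost(**plan_a)
flash_b = model.cost(**plan_b)
```

Under the symmetric estimate plan_b wins because it issues fewer I/Os; under the flash-aware estimate plan_a wins because it avoids most writes and the erase.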

    Flash-based Index
    To rewrite data on flash memory, the block containing the data must first be erased. Because erase operations are expensive, data is not rewritten in place; instead, a new version is written to another location. This method is called out-of-place update. Out-of-place update causes cascading updates in conventional balanced-tree indexes: updating a leaf node moves it to a new location, so its parent must be rewritten to point there, and so on for every ancestor node. Our research aims to reduce the performance impact of these cascading updates.
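
The cascade can be illustrated with a small sketch. The append-only store and the dict-based node layout are hypothetical simplifications, not the project's index design. Every write lands at a new address, so changing one leaf rewrites one page per tree level.

```python
class AppendOnlyStore:
    """Hypothetical page store: a write never overwrites, it appends."""

    def __init__(self):
        self.pages = []

    def write(self, content):
        self.pages.append(content)
        return len(self.pages) - 1  # address of the newly written page


def update_leaf(store, root_addr, path, new_value):
    """Update the leaf reached by following `path` (child indexes from the root).

    Returns (new_root_addr, page_writes).
    """
    # Walk down, remembering the address of each node on the path.
    addrs = [root_addr]
    for idx in path:
        addrs.append(store.pages[addrs[-1]]["children"][idx])

    # Out-of-place update: the new leaf version gets a new address.
    new_addr = store.write({"value": new_value})
    writes = 1

    # Each ancestor must be rewritten to point at its child's new address.
    for depth in range(len(path) - 1, -1, -1):
        children = list(store.pages[addrs[depth]]["children"])
        children[path[depth]] = new_addr
        new_addr = store.write({"children": children})
        writes += 1
    return new_addr, writes


store = AppendOnlyStore()
leaf_a = store.write({"value": "a"})
leaf_b = store.write({"value": "b"})
inner = store.write({"children": [leaf_a, leaf_b]})
root = store.write({"children": [inner]})

new_root, page_writes = update_leaf(store, root, [0, 1], "b2")
```

In this three-level tree a single leaf change costs three page writes, one per level, and the old versions remain in the store until they are reclaimed.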

     

  • Query Processing and Optimization
    Query processing is a core part of a database, and queries read large amounts of data. Because hard disks have poor random-read performance, conventional query strategies try to avoid random reads. Flash memory, by contrast, has excellent random-read performance, and our research aims to improve query performance by fully exploiting it. In addition, queries write large amounts of temporary data, and writes are expensive compared with reads. Our research aims to reduce write cost by reducing both the number of writes and the amount of data written.

  • Transaction Management
    Transaction management plays a key role when databases manage large volumes of enterprise data. Implementing transaction processing on a flash-based database is a major challenge. Transaction workloads consist mainly of small random writes, and flash memory has poor random-write performance, even lower than that of hard disks. Conventional transaction processing therefore performs poorly on flash memory. Our research aims to redesign transaction processing around the characteristics of flash memory.
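
One general way to sidestep small random writes is to collect updates in memory and append them to a log one full page at a time. The sketch below illustrates that idea only; the class name, page size, and record size are assumptions, and it is not presented as the project's transaction design.

```python
class LogBuffer:
    """Hypothetical write buffer that turns small updates into page-sized appends."""

    def __init__(self, page_size=4096):
        self.page_size = page_size
        self.buffer = bytearray()
        self.page_writes = 0  # number of page-sized appends issued so far

    def append(self, record: bytes):
        self.buffer += record
        # Emit a page whenever a full page of log data has accumulated.
        while len(self.buffer) >= self.page_size:
            del self.buffer[:self.page_size]
            self.page_writes += 1

    def commit(self):
        # Force out the partially filled last page so the updates are durable.
        if self.buffer:
            self.buffer.clear()
            self.page_writes += 1


log = LogBuffer()
updates = 100
for _ in range(updates):
    log.append(b"x" * 64)  # one small 64-byte update record (assumed size)
log.commit()

naive_page_writes = updates         # one in-place page write per small update
logged_page_writes = log.page_writes
```

Here 100 small updates cost two sequential page appends instead of 100 scattered page writes. The trade-off is that logged updates must later be merged back into the data pages, and an update is durable only once its page has been appended.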

  • Grant
    • 2009-2012 Flash-based Database Research (Key Project)
      Granted by the National Natural Science Foundation of China (NSFC) under grant number 60833005
    Seminar
    Patent
    Publications
    • Shaoyi Yin, Philippe Pucheral, Xiaofeng Meng, PBFilter: A Sequential Indexing Scheme for Flash-Based Embedded Systems, accepted by EDBT2009.
    • S. Yin, P. Pucheral, X. Meng: PBFilter: Indexing Flash-Resident Data through Partitioned Summaries. In Proceedings of the ACM 17th Conference on Information and Knowledge Management (CIKM 2008), pages 1333-1334, Napa Valley, California, October 26-30, 2008.
    • L. Xiang, D. Zhou and X. Meng: A New Dynamic Hash Index for Flash-based Storage. In Proceedings of the 9th International Conference on Web-Age Information Management (WAIM 2008), Zhangjiajie, China.
    • S. Yin, J. Chen, X. Meng, C. Lai. The Storage and Recovery of PhoneDB, Computer Science, Vol.32(Supplement A):358-362,2005,8,NDBC2005. (in Chinese with English abstract).
    Reference
    • Intel Corporation. Understanding the flash translation layer (FTL) specification. Intel Technical Report.
    • M. Rosenblum, J. K. Ousterhout. The Design and Implementation of a Log-Structured File System. ACM Trans. Comput. Syst. (TOCS) 10(1):26-52 (1992)
    • E. Gal, S. Toledo. Algorithms and data structures for flash memories. ACM Comput. Surv. (CSUR) 37(2):138-163 (2005)
    • A.-B. Bityutskiy. JFFS3 design issues. http://www.linux-mtd.infradead.org
    • D. Z. Yazti, S. Lin, V. Kalogeraki, D. Gunopulos and W. A. Najjar. MicroHash: An Efficient Index Structure for Flash-Based Sensor Devices. FAST 2005.
    • G.-J. Kim, S.C. Baek, H.-S. Lee, H.-D. Lee, M. J. Joe: LGeDBMS: A Small DBMS for Embedded System with Flash Memory. VLDB 2006:1255-1258
    • D. Myers. On the Use of NAND Flash Memory in High-Performance Relational Databases. MIT MSc Thesis.
    • S. Nath, A. Kansal. FlashDB: Dynamic Self-tuning Database for NAND Flash. IPSN 2007
    • C. H. Wu, T. W. Kuo, and L. P. Chang. An Efficient B-Tree Layer Implementation for Flash-Memory Storage Systems. ACM Transactions on Embedded Computing Systems 2007
    • G. Graefe. The Five-Minute Rule Twenty Years Later, and How Flash Memory Changes the Rules. ACM DaMoN 2007.
    • S. W. Lee, and B. Moon. Design of Flash-Based DBMS: An In-Page Logging Approach. SIGMOD 2007.
    • B. Moon, C. Park, S. W. Lee. A Case for Flash Memory SSD in Enterprise Database Applications. SIGMOD 2008.
    • K. Ross. Modeling the Performance of Algorithms on Flash Memory Devices. DaMoN 2008.
    • M. A Shah, S. Harizopoulos, J. L. Wiener, and G. Graefe. Fast Scans and Joins using Flash Drives. DaMoN 2008.
    • S. Yin, P. Pucheral, X. Meng: PBFilter: Indexing Flash-Resident Data through Partitioned Summaries. In Proceedings of the ACM 17th Conference on Information and Knowledge Management (CIKM 2008), pages 1333-1334, Napa Valley, California, October 26-30, 2008.
    • Ioannis Koltsidas, Stratis Viglas: Flashing up the storage layer. PVLDB 1(1):514-525 (2008)
    • Suman Nath, Phillip B. Gibbons: Online maintenance of very large random samples on flash storage. PVLDB 1(1):970-983 (2008)
    • Y. Li, B. He, Q. Luo and K. Yi. Tree Indexing on Flash Disks. ICDE 2009.
    • Shimin Chen. FlashLogging: Exploiting Flash Devices for Synchronous Logging Performance. In Proceedings of the 28th ACM SIGMOD International Conference on Management of Data (SIGMOD'09), Providence, RI, June-July 2009.
    Maintained by Zhongyuan Wang. Copyright © 2007-2008 WAMDM. All rights reserved.