Introduction: Flash-based Data Management

With the development of electronic technology, flash memory emerges as new data storage and has been widely used in embedded and portable devices such as mobile communication, industry control, Aeronautics & Astronautics and Notebook. With the rapid increase of the capacity of flash memory, data management on flash becomes a new great challenge, which incurs researches to promote the significant development on flash-based database and application, as well as the framework and structure of flash-based database. This project researches fundamental theory and design principles of flash-based database including a series of key problems such as system architecture, storage management and indexing, query processing , transaction processing.

[Top]        
Motivation

Flash Memory has special physical characteristics, such as unsymmetric I/O, erase before rewrite, limited erase times. Conventional disk-based database only get low performance when directly applied on flash memory. In order to improve performance of flash-based database, we need to redesign conventional database according to characteristics of flash memory. We will do our research from storage, index, query and transaction processing.

[Top]        
Research Work
  • Storage Management

    SSD, which is maken up of flash memory chip, are used in common lives more and more widely. It becomes a trend that SSD will replace hard disk as new secondary storage devices. But due to the high cost and low capacity, SSD will be coexist with hard disk in computer in long time. Based on this application, we will research how to distribute data between SSD and HDD.

  • Indexing

    IO Evaluation
    Convential IO evaluation policy is based on the same cost of read and write. The write cost is multi-times of that of read for flash memory. Duo to the characteristic of unsymmetric IO, We need to re-evaluate the IO performance of flash-based database. Cost of I and O should be taken of apartly. Besides this, the cost of erase should also be thought about.

    Flash-based Index
    If we want to rewrite data on flash memory, we must erase the block of the data. Due to high cost of erase operation, we do not rewrite data in-place, but write a new version of data in other place. This method is called out-of-place update. Out-of-place update will lead to cascade update of convential balanced tree index. Update of a leaf node will lead to updates of ancient nodes. Our research will try to reduce the impact of performance because of cascade update.

     

  • Query Processing and Optimization
    Query is a very important part of database. Query need read a lot of data. Because low random read performance of disk, convential query policies try to avoid random read. Unlike hard disk, flash memory has excellent random read performance. Our research will try enhance query performance after taking full use of random read performance. On the other hand, plenty of temporary data will be wrtten during query. Write is a high cost operation when compared with read. Our research will reduce write cost by reducing write times and write content.

  • Transaction Management
    Transaction management plays key role when databases are used to manage mass data of enterprises. It is a big challenge to implement transaction processing on flash-based database. The workloads of transaction are mainly random small writes. Flash memory has low random write performance, even lower than hard disk. Convential transaction processing has low performance on flash memory because random small writes is not suitable to flash memory. Our research will try to redesign transaction processing policy according to characteristics of flash memory.

  • SSD-based hybrid system
    SSD-based hybrid cache system is currently a hot research topic.SSD serves as a cache layer between DRAM and hard disk. The main memory buffer is consulted when the buffer manager gets a request for a page. If the data do not reside in memory, its corresponding page will be fetched from SSD or disk. When the main memory has no available space, the system selects a victim page to swap out, the page may be granted a chance to stay in the SSD cache for an extended period. The key issues in SSD-based extension buffer system can be attributed to the following two questions. What and how data pages should be swapped out DRAM? When and what data pages should be staged into or out of the flash cache?

    [Top]        
  • Grant
    • 2009-2012 Flash-based Database Research (Key Project)
      Granted by the Natural Science Foundation of China(NSFC) under grant number 60833005
    Seminar
    Publications
    • W.Lai, Y.Fan, X Meng. Scan and Join Optimization by Exploiting Internal Parallelism of Flash-Based Solid State Drives. Proceedings of the 14th International Conference on Web-Age Information Management (WAIM 2013): 381-392, Beidaihe, China (the Best Paper Runner Up of WAIM 2013).
    • Y.Fan, X.Meng. MixSL: An Efficient Transaction Recovery Model in Flash-Based DBMS. Proceedings of the 14th International Conference on Web-Age Information Management (WAIM 2013): 393-404, Beidaihe, China
    • Y.Fan, X. Meng. Transaction Recovery Model of Databases Based on PCM and Flash memory. Chinese Journal of Computers, Vol 36(8): 1582-1591, 2013, 8. (NDBC2013, Haerbin)(the Best Paper Runner-up Award).
    • J. Wang, W. Lai, X. Meng: Flash-Based:Studies,Techniques and Forecasts. Journal of Computers, Vol36(8): 1549-1567, 2013, 8.
    • J. Wang, W. Lai, X. Meng: SFCM: A SSD-Friendly Cache Management Policy for Hybrid Storage Systems. In Proceedings of the WAIM Workshops 2013, pages 1-12, Qinhuangdao,China, June 14-16, 2013.
    • Y. Fan, W. Lai, X. Meng: Database Table Scan and Aggregation by Exploiting Internal Parallelism of SSDs. Chinese Journal of Computers, Vol 35(11): 2327-2336, 2012, 11. (NDBC2012, Hefei)(the Sa Shi Xuan Best Paper Award)
    • Z. Lu, X. Qi, W. Cao, X. Meng: LB-Logging: A Highly Efficient Recovery Technique for Flash-Based Database. In Proceedings of the 13th International Conference on Web-Age Information Management(WAIM 2012), pages: 375-386, Harbin, China, August 18-20, 2012.
    • Y. Fan, W. Cao, X. Meng: A Study of Space Reclamation on Flash-Based Append-only Storage Management. In Proceedings of the DASFAA Workshops 2012, pages 28-39, Busan, South Korea.
    • Q. Cao, Z. Liang, Y. Fan, X. Meng: A Flash-Based Decomposition Storage Model. In Proceedings of the DASFAA Workshops 2012, pages 73-80, Busan, South Korea.
    • X. Tang, X. Meng ,Z. Liang, Z. Lu: Cost-Based Buffer Management Algorithm for Flash Database Systems. Journal of Software, Vol.22(12): 2951-2964,2011,12.
    • X. Qi, X. Tang, Z. Liang, X. Meng: OAFTL: An Efficient Flash Translation Layer for Enterprise Application. Chinese Journal of Computers, Vol 48(10):1918-1926, 2011,10.(NDBC2011, Shanghai)(the Sa Shi Xuan Best Paper Award)
    • X.Han,W.Cao,X.Meng:Virtual Memory Management for Main-Memory KV Database Using Solid State Disk. Journal of Frontiers of Computer Science and Technology, Vol. 5, No.8: 686-694, Aug. 2011.(NDBC2011, Shanghai)
    • X. Meng, P. Jin, W.Cao,L.Yue: Report on the first international workshop on flash-based database systems.SIGMOD Record. Vol 40(2):40-44 ( 2011).
    • X. Meng, L.Yue, J. Xu: Flash-Based Database Systems: Experiences from the FlashDB Project.(invited talk). In Proceedings of the DASFAA2011 Workshop on Flash-based Database Systems (FlashDB2011): 240, April 22, 2011, Hong Kong.
    • Z. Liang, Y. Fan, X. Meng: A Novel Method to Extend Flash Memory Lifetime in Flash-Based DBMS. In Proceedings of the DASFAA2011 Workshop on Flash-based Database Systems (FlashDB2011): 190-201, April 22, 2011, Hong Kong.
    • X. Tang, X. Meng: FClock: An Adaptive Buffer Replacement Algorithm for SSD. Chinese Journal of Computers, Vol.33, No.8: 1460-1471, Aug. 2010.(NDBC2010, Beijing)
    • Z. Lu, X. Meng, D. Zhou: HV-recovery: A High Efficient Recovery Technique for Flashed-based Database. Chinese Journal of Computers. Vol.33(12): 2258-2266.(NDBC2010, Beijing)(the Sa Shi Xuan Best Paper Award)
    • Z. Liang, D. Zhou, X. Meng: Sub-Join: Query Optimization Algorithm for Flash-Based Database. Journal of Frontiers of Computer Science & Technology, Vol.4(5):401-409, 2010.5.
    • D. Zhou, Z. Liang, X. Meng: HF-Tree: An Update-Efficient Index for Flash Memory. Journal of Computer Research and Development(CRAD), Vol.47(5):832-840, 2010.5.
    • D. Zhou, X. Meng: A Flash-Aware Random Write Optimized Database. In Proceedings of the 11th International Conference on Mobile Data Management (MDM 2010): 276-278,May 23-26, 2010, Kansas City, Missouri, USA.
    • X. Tang, X. Meng: ACR: an Adaptive Cost-Aware Buffer Replacement Algorithm for Flash Storage Devices. In Proceedings of the 11th International Conference on Mobile Data Management (MDM 2010): 33-42,May 23-26, 2010, Kansas City, Missouri, USA.
    • Shaoyi Yin, Philippe Pucheral, Xiaofeng Meng, PBFilter: A Sequential Indexing Scheme for Flash-Based Embedded Systems, accepted by EDBT2009.
    • S. Yin, P. Pucheral, X. Meng: PBFilter: Indexing Flash-Resident Data through Partitioned Summaries.In Proceedings of the ACM 17th Conference on Information and Knowledge Management(CIKM2008), page 1333-1334, Napa Valley, California, October 26-30, 2008.
    • L. Xiang, D. Zhou and X. Meng: A New Dynamic Hash Index for Flashbased Storage. In Proceedings of 9th International Conference on Web-Age Information Management (WAIM 2008) , Zhangjajie, China.
    • S. Yin, J. Chen, X. Meng, C. Lai. The Storage and Recovery of PhoneDB, Computer Science, Vol.32(Supplement A):358-362,2005,8,NDBC2005. (in Chinese with English abstract).
    [Top]        
    Reference
    • Intel Corporation. Understanding the flash translation layer (FTL) specification. Intel Technical Report.
    • M. Rosenblum, J. K. Ousterhout. The Design and Implementation of a Log-Structured File System. ACM Trans. Comput. Syst. (TOCS) 10(1):26-52 (1992)
    • E. Gal, S. Toledo. Algorithms and data structures for flash memories. ACM Comput. Surv. (CSUR) 37(2):138-163 (2005)
    • A.-B. Bityutskiy. JFFS3 design issues. http://www.linux-mtd.infradead.org
    • D. Z. Yazti, S. Lin, V. Kalogeraki, D. Gunopulos and W. A. Najjar. MicroHash: An Efficient Index Stucuture for Flash-Based Sensor Devices. FAST 2005.
    • G.-J. Kim, S.C. Baek, H.-S. Lee, H.-D. Lee, M. J. Joe: LGeDBMS: A Small DBMS for Embedded System with Flash Memory. VLDB 2006:1255-1258
    • D. Myers. On the Use of NAND Flash Memory in High-Performance Relational Databases. MIT Msc Thesis.
    • S. Nath, A. Kansal. FlashDB: Dynamic Self-tuning Database for NAND Flash. IPSN 2007
    • C. H. Wu, T. W. Kuo, and L. P. Chang. An Efficient B-Tree Layer Implementation for Flash-Memory Storage Systems. ACM Transactions on Embedded Computing Systems 2007
    • G. Graefe. The Five-Minute Rule Twenty Years Later, and How Flash Memory Changes the Rules. ACM DaMoN 2007.
    • S. W. Lee, and B. Moon. Design of Flash-Based DBMS: An In-Page Logging Approach. SIGMOD 2007.
    • B. Moon, C. Park, S. W. Lee. A Case for Flash Memory SSD in Enterprise Database Applications. SIGMOD 2008.
    • K. Ross. Modeling the Performance of Algorithms on Flash Memory Devices. DaMoN 2008.
    • M. A Shah, S. Harizopoulos, J. L. Wiener, and G. Graefe. Fast Scans and Joins using Flash Drives. DaMoN 2008.
    • S. Yin, P. Pucheral, X. Meng: PBFilter: Indexing Flash-Resident Data through Partitioned Summaries.In Proceedings of the ACM 17th Conference on Information and Knowledge Management(CIKM2008), page 1333-1334, Napa Valley, California, October 26-30, 2008.
    • Ioannis Koltsidas, Stratis Viglas: Flashing up the storage layer. PVLDB 1(1):514-525 (2008)
    • Suman Nath, Phillip B. Gibbons: Online maintenance of very large random samples on flash storage. PVLDB 1(1):970-983 (2008)
    • Y. Li, B. He, Q. Luo and K. Yi. Tree Indexing on Flash Disks. ICDE 2009.
    • Shimin Chen. FlashLogging: Exploiting Flash Devices for Synchronous Logging Performance. In Proceedings of the 28th ACM SIGMOD International Conference on Management of Data (SIGMOD'09), Providence, RI, June-July 2009.M
    • Gokul S,Vijayan P,Mahesh B,Ted W. Extending SSD Lifetimes with Disk-Based Write Caches//Proceedings of the 8th USENIX Conference on File and Storage Technologies (FAST'10),San Jose,USA,2010:101-114
    • Canim M, Mihaila G A, Bhattacharjee B, Ross K A,Lang C A.SSD Bufferpool Extensions for Database Systems.PVLDB,2010,3(2):1435-1446
    • Do J,Zhang D H,Patel J,David J,Jeffrey F,Halverson N.Turbocharging DBMS buffer pool using SSDs//Proceedings of the ACM SIGMOD Conference ,Athens,Greece,2011:1113–1124
    • Kang W H,Lee S W,Moon B.Flash-based Extended Cache for Higher Throughput and Faster Recovery.PVLDB,2012,5(11):1615-1626
    • Do J,Zhang D F,Patel J M,DeWitt D J.Fast Peak-to-Peak Behavior with SSD Buffer Pool// Proceedings of the 29th International Confe-rence on Data Engineering (ICDE'13),Brisbane, Australia, 2013:1129-1140
    [Top]        
    Maintained by Zhongyuan Wang( ) Copyright © 2007-2008 WAMDM, All rights reserved