By analyzing existing native XML storage technologies, we proposed storing XML data according to its schema information and implemented four storage strategies that automatically cluster XML data into the same storage unit according to the features of the given XML document. Further, elements of the same type may be organized together or in document order, depending on query requirements. Such a storage mechanism can greatly facilitate query processing. The prototype we implemented, OrientStore, was published in VLDB 2003 and has been cited twice. Athena Vakali et al. wrote of our method that “The advantages of this approach are evident in several new applications” in their paper published in IEEE Internet Computing 9(2): 62-69 (2005).
We have applied for a patent based on this method (No: 200410073869.5).
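The clustering idea above can be illustrated with a minimal sketch. It is not the OrientStore implementation: it assumes that the root-to-element tag path can stand in for an element's schema type, and the function name `cluster_by_type` is hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def cluster_by_type(xml_text):
    """Group element text values by root-to-element tag path.

    Assumption: the tag path approximates the schema type, so all
    elements of one type end up in the same storage unit (a list here).
    """
    root = ET.fromstring(xml_text)
    clusters = defaultdict(list)

    def visit(node, path):
        path = path + "/" + node.tag
        # Within a cluster, values stay in document order.
        clusters[path].append((node.text or "").strip())
        for child in node:
            visit(child, path)

    visit(root, "")
    return dict(clusters)
```

For a document with two `book` elements that each hold a `title`, both titles land in the single cluster keyed by `/lib/book/title`, so a query over titles reads one unit instead of walking the whole document.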
We focus on updates under range-code XML numbering. The update strategy is to reserve numbering space. The problems are how to reserve the space and how to reallocate it when necessary. We propose a space-reserving algorithm and a renumbering algorithm, respectively, for these two problems. Our experimental results show that the method is effective and efficient: it greatly reduces the cost of update, and more than 85% of the data does not cause renumbering.
This idea was published in WWWJ 2005 and Journal of Software 2005(5).
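The reserve-then-renumber strategy can be sketched as follows. This is an illustration under stated assumptions, not the published algorithms: the fixed gap `GAP`, the split-in-thirds allocation, and the whole-tree renumbering fallback are all simplifications chosen for brevity, and the names `number_tree` and `insert_leaf` are hypothetical.

```python
import xml.etree.ElementTree as ET

GAP = 10  # hypothetical size of the reserved numbering space


def number_tree(root, gap=GAP):
    """Assign (start, end) range codes, skipping `gap` numbers between codes."""
    codes = {}
    counter = 0

    def visit(node):
        nonlocal counter
        counter += gap
        start = counter
        for child in node:
            visit(child)
        counter += gap
        codes[node] = (start, counter)

    visit(root)
    return codes


def insert_leaf(root, parent, index, new_node, codes, gap=GAP):
    """Insert new_node as child `index` of parent; return True if renumbered."""
    children = list(parent)
    # The new code must fit strictly between its left and right neighbours.
    low = codes[children[index - 1]][1] if index > 0 else codes[parent][0]
    high = codes[children[index]][0] if index < len(children) else codes[parent][1]
    parent.insert(index, new_node)
    if high - low >= 3:
        # Enough reserved space: split it in thirds, no other code changes.
        step = (high - low) // 3
        codes[new_node] = (low + step, low + 2 * step)
        return False
    # Reserved space exhausted: fall back to renumbering the whole tree
    # (a simplification; a real renumbering algorithm would be more local).
    codes.clear()
    codes.update(number_tree(root, gap))
    return True
```

Repeated insertions at the same position consume the reserved gap step by step; only when no integer pair fits between the neighbours does the sketch renumber, which is the case the reserved space is meant to make rare.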
This work was published in ICDE 2005, and the corresponding demo was published in SIGMOD 2004. It was cited by two VLDB 2005 papers and is considered one of the representative works on sequence-based methods. To date, this work has been cited 16 times. K. Hima Prasad et al. pointed out that “Recently sequence based query processing is gaining importance because of its holistic query processing feature” in their paper “K. Hima Prasad, Ch. Rajesh, P. Sreenivasa Kumar: Handling Updates in Sequence Based XML Query Processing. In Proceedings of the International Conference on Management of Data (COMAD 2005), Hyderabad, India, December 20-22, 2005.”
Based on this work, we have applied for two patents (No: 200810056098.7, 200810056100.0).
XQuery is the recommended standard for XML querying. XQuery processing strategies fall into two categories: core-syntax-based strategies (node-oriented) and algebra-based strategies (set-oriented). Neither handles XQuery well: the syntax-based strategy is inefficient and hard to optimize, while current algebra-based strategies cannot accommodate the flexible programming characteristics of XQuery. After summarizing the current state and unsolved problems of earlier algebra-based work, we proposed an effective XQuery algebra system, OrientXA, which embodies ideas from both strategies. OrientXA introduces the notion of the Construct Pattern Tree for the first time, and its Construct operator captures the flexible characteristics of XQuery. Thanks to its expressive operators, it can express all the queries in the W3C use cases and the XMark benchmark.
This work was published in WAIM 2004, Journal of Software, and Journal of Computer Research and Development.
OrientX is a schema-based, integrated native XML database system built by the WAMDM Lab, Renmin University of China, under NSFC grant 60273018. It includes the following functional modules: native storage, schema manager, index manager, and query engines. Schema information, which plays a vital role in the system, affects storage granularity, indexing structure, and query optimization; together these support efficient XML query processing. OrientX was accepted into the XQuery implementation list of the W3C.
The Chair of “Dagstuhl Seminar on XQuery Implementation Paradigms, 2006” has pointed out that “Your native XML database system OrientX is clearly recognized as a highly significant contribution in this research area and the seminar organizers are looking forward to your attendance”.
The very large volume of data is a significant problem for Ontology data management in the Semantic Web environment. In addition, the continual growth of web resources causes frequent updates to Ontology data, so supporting efficient update is another important problem.
Most existing work tries to solve these problems with methods based on relational databases (RDB). However, an RDB is not designed for the features of Ontology data: there is a large difference between the complex graph model of an Ontology and the simple flat model of relational data. RDB-based Ontology data management has to divide the Ontology graph into simple relations and transform graph-based queries into a set of join operations on relational tables. This mismatch between the two models limits RDB-based methods in managing large-scale Ontology data. In addition, RDB-based methods typically pre-compute the implicit inference data and materialize it in storage. Although this guarantees query efficiency, it greatly increases the cost of update: when explicit data is updated, maintaining the materialized data is expensive. In fact, most existing Ontology management systems cannot support update effectively.
To manage large volumes of Ontology data efficiently, a novel storage method was proposed that designs a native storage structure according to the characteristics of Ontology data and breaks through the restrictions of the RDB model. Its most notable characteristic is that it leverages the XML data model and adopts a tree structure (see Figure 1) to store the class and property hierarchies of an Ontology. Tree structures preserve the original hierarchies in the Ontology data, so the implicit inference data implied by the class and property hierarchies need not be materialized, which reduces the cost of update. The storage structure also supports updates directly: it maintains data consistency through simple operations at little cost. Based on this storage, a corresponding query processing method supports Ontology queries in SPARQL. Besides query processing, inference ability is also an important aspect of Ontology data management; initial and incremental inference algorithms were proposed to guarantee the completeness and efficiency of the inference procedure.
OrientX is a native XML data management system, and an extended version, OrientX/Ontology, has been developed on top of it as a special version for Ontology data. It implements the novel storage method and supports query and inference, which is significant for both researchers and engineers. At present, it can load more than 200M documents; further work to improve the query engine and support larger-volume documents is under development.
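Why a tree-shaped class hierarchy avoids materializing inferred data can be shown with a small sketch. It is not the OrientX/Ontology storage format: it assumes a single-inheritance class hierarchy held in memory, and the names `ClassNode`, `add_subclass`, and `all_instances` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ClassNode:
    """One class in the hierarchy; subclasses are stored as child nodes."""
    name: str
    children: list = field(default_factory=list)
    instances: set = field(default_factory=set)  # explicitly asserted only

    def add_subclass(self, name):
        child = ClassNode(name)
        self.children.append(child)
        return child

    def all_instances(self):
        """Instances of this class, including those implied by subclasses."""
        result = set(self.instances)
        for child in self.children:
            # Implicit membership is derived by walking the subtree at
            # query time instead of being stored with every superclass.
            result |= child.all_instances()
        return result
```

Because an instance is stored only under the class it was asserted for, adding or removing it touches a single node; a materializing scheme would instead have to update one record per superclass.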
Using structured query languages such as XQuery and XPath is too restrictive for users who want to retrieve information from an XML document, whereas XML keyword search spares them the burden of understanding the underlying schema and query language and has therefore been studied extensively in the past few years. However, some problems remain unaddressed, and these form our research focus.
Our research focuses on the following problems:
- 2006-2007 Supporting Context in XML Data Management Systems (Principal Investigator)
Granted by the China-Greece international cooperation project
- 2005 Ontology based Data Management (Principal Investigator)
Granted by the IBM University project
- 2004-2007 XML Data Management (Principal Investigator)
Granted by the Program for New Century Excellent Talents in University (NCET)
- J. Zhou, X. Meng, T. Ling: Efficient Processing of Partially Specified Twig Pattern Queries, Accepted by Science in China Series E: Information Sciences.
- J. Zhu, W. Wang, X. Meng: Efficient Processing of Complex XML Twig Query. In Proceedings of the 9th International Conference on Web-Age Information Management (WAIM 2008), Zhangjiajie, China.
- J. Huang, J. Xu, J. Zhou, X. Meng: MLCEA: An Entity Based Semantics for XML Keyword Search, 2008.10 (NDBC2008, Guilin) (in Chinese)
- J. Zhu, W. Wang, J. Zhou, X. Meng: Efficient Processing of XML Twig Pattern Based on Related Semantics. Journal of Computer Research and Development (Suppl.), 2008.10 (NDBC2008, Guilin) (in Chinese)
- X. Zhang, X. Meng, J. Zhu, W. Wang, J. Huang: OrientStore+: A Native XML Storage Strategy for Efficient Update. Journal of Computer Research and Development, Vol. 44 Suppl.: 368-373, 2007.10 (NDBC2007, Haikou, Best Paper Award) (in Chinese)
- J. Zhou, X. Meng, X. Zhang, J. Huang: Keyword Based Multiple Query Processing over XML Streams. Journal of Computer Research and Development, Vol. 44 Suppl.: 392-397, 2007.10 (NDBC2007, Haikou) (in Chinese)
- J. Zhou, M. Xie, X. Meng: TwigStack+: Holistic Twig Join Pruning Using Extended Solution Extension. Wuhan University Journal of Natural Sciences, Vol. 12, No. 5: 855-860, 2007.9 (4th Web Information System and Application(WISA2007), Beijing, Best Paper Award)
- J. Zhou, X. Meng, Y. Jiang, M. Xie: F-Index: A Flattened Structural Index for Speeding up Twig Query Processing. Journal of Software, Vol.18(6):1429-1442, June, 2007.
- X. Meng, X. Wang, M. Xie, et al.: OrientX: An Integrated, Schema-Based Native XML Database System. Wuhan University Journal of Natural Sciences, 11(5):1192-1196, Nov., 2006. (The Third Web Information System and Application (WISA2006), Nanjing, Nov 3-5, 2006.)
- X. Wang, X. Zhang, M. Xie, X. Meng, J. Zhou: Keyword Search on XML Streams. Journal of Computer Research and Development, Vol. 43 Suppl., 2006.10 (NDBC2006)
- M. Xie, X. Wang, X. Zhang, X. Meng, J. Zhou: Ordered XPath Query Processing on XML Stream. Journal of Computer Research and Development, Vol. 43 Suppl., 2006.10 (NDBC2006)
- X. Wang, J. Ou, X. Meng, Y. Chen: Abox Inference for Large Scale OWL-Lite Data. To appear in Proceedings of the 2nd International Conference on Semantics, Knowledge, and Grids (SKG2006), Guilin, China, Oct. 31 - Nov. 3, 2006. (Regular paper 18%)
- Y. Chen, J. Ou, Y. Jiang, X. Meng: HStar - a Semantic Repository for Large Scale OWL Documents. In Proceedings of the First Asian Semantic Web Conference (ASWC2006), pages 415-428, Beijing, China, September 3-7, 2006. Lecture Notes in Computer Science 4185, Springer. (Full Paper 36/208=18%)
- H. Wang, X. Meng: On the Sequencing of Tree Structures for XML Indexing. In Proceedings of the 21st International Conference on Data Engineering (ICDE 2005), pages 372-373, Tokyo, Japan, April 2005.
- J. X. Yu, D. Luo, X. Meng, H. Lu: Dynamically Updating XML Data: Numbering Scheme Revisited. World Wide Web, Vol. 8(1):5-26, March, 2005.
- X. Meng, D. Luo, J. Ou: Extended Role Based Access Control Method for XML Documents. Wuhan University Journal of Natural Sciences, Vol. 9(5):740-744, Sept., 2004.
- Y. Wang, H. Wang, X. Meng, S. Wang: Estimating the Selectivity of XML Path Expression with Predicates by Histograms. In Proceedings of the 5th International Conference on Web-Age Information Management (WAIM 2004), pages 409-418, Dalian, China, July 15-17, 2004. Lecture Notes in Computer Science 3129, Springer.
- X. Meng, Y. Jiang, Y. Chen, H. Wang: XSeq: An Index Infrastructure for Tree Pattern (Demo). In Proceedings of the ACM SIGMOD International Conference on Management of Data (SIGMOD 2004), pages 941-942, Paris, France, June 13-18, 2004.
- D. Luo, T. Chen, T. W. Ling, X. Meng: On View Transformation Support for a Native XML Database. In Proceedings of the 9th International Conference on Database Systems for Advanced Applications (DASFAA 2004), pages 226-231, Jeju Island, Korea, March 17-19, 2004. Lecture Notes in Computer Science 2973, Springer.
- X. Meng, D. Luo, M.L. Lee, J. An: OrientStore: A Schema Based Native XML Storage System (Demo). In Proceedings of the 29th International Conference on Very Large Data Bases (VLDB2003), pages 1057-1060, Berlin, Germany, September 9-12, 2003.
- J. Wang, X. Meng, S. Wang: Integrating Path Index with Value Index for XML Data. In Proceedings of the Fifth Asia Pacific Web Conference (APWeb2003), pages 95-100, Xi'an, China, 27-29 September 2003. LNCS 2642.