 (EMC Research China) Cloud based Personal Information Management – Introduction to EMC Cloud Computing
With the requirements of automatic online storage and backup, round-the-clock access, and secure sharing and publishing of personal digital information, it is inevitable that personal information management will migrate into the cloud. The goal of a personal information cloud service is to let you securely access and organize all your information anytime, anywhere, using any device, and never lose any of it. EMC is creating a new cloud services business called Decho ('digital echo', referring to the reverberating accesses to information in a user's digital environment) by joining Mozy (cloud backup) and Pi (personal information) together. It will use EMC data centres around the planet to store consumer and business files, using Mozy's software front end to provide data ingest and access services and Pi's metadata software to manage and verify personal information. Decho can deliver on the promise of cloud-based personal information management and can help individuals everywhere preserve, manage and enrich the information most important to them.
 (Tsinghua University) Understanding and Comments on Cloud Computing
Cloud computing is a new concept proposed in recent years. This talk first compares cloud computing with traditional distributed computing and grid computing to help in understanding the concept and characteristics of cloud computing, and then introduces some possible research directions, both in cloud computing platforms and in combining them with applications.
 (Web Group) Information Credibility on the Web
Credibility on the Web is an important research topic. This presentation is a survey that introduces the concept of information credibility. Methods of identifying credibility in six different Web scenarios are introduced in detail, including P2P networks, online discussion forums, Wikipedia and so on. In addition, information credibility criteria and two typical evaluation methods are introduced. Finally, a brief summary of information credibility on the Web is given.
 (Web Group) Web Spam
A survey of currently popular web spam that introduces two spam techniques
 (Mobile Group) Uncertainty Reasoning Under Correlative Knowledge 
Uncertainty reasoning permeates many aspects of human life. First we consider the key components of an uncertainty reasoning system, then we propose a framework for reasoning under correlative knowledge.
 (Mobile Group) The probabilistic complex event detection in pervasive computing
Extracting complex events from low-level atomic events is becoming more and more important in daily life. However, current event detection research often assumes that the data is precise. In many real applications, the data is instead imprecise. In addition, little research has been carried out on making better use of time information. Given the importance of event queries and the rapidly increasing amount of probabilistic data collected, we propose some temporal semantics, a data model and event query techniques.
 (XML Group) Introduction to OrientX
A brief introduction to the OrientX database system, including the architecture, main features, storage management, a demonstration of OrientX3.0, etc.
 (XML Group) OrientX3.0 and its improvements
This ppt mainly deals with the XQuery implementation in OrientX3.0, including navigation-based query processing and algebra-based query processing. Then, the efficiency problem in the current version is analyzed. Finally, we show the implementation issues of XQuery/Update.
 (XML Group) Features of new version and plans
The feature definitions of the new version of OrientX and its plans

 (Web Group) 2nd Stage Development Plan of OrientSpace
Introduced the development plan of OrientSpace in the second stage.
 (Web Group) Evolution of Personal Dataspace using User Feedback 
Evolution is one of the most important features of dataspace systems. Evolution in a personal dataspace is different from that of a dataspace in the data integration context. We propose an evolution framework for personal dataspaces using user feedback. This framework can perform the evolution in a pay-as-you-go fashion.
 (Web Group) EASE: TaskSpace: A Task Based Model For Personal Dataspace Management 
In this paper, we propose a task based model for organizing personal data items
 (Mobile Group) 2008 workshop on flash-based database 
The 2008 workshop on flash-based databases was held in Hefei, Anhui. More than twenty researchers took part in this workshop. The topics covered storage, indexing, query processing and transaction processing. After the introduction to the workshop, we show our recent research on queries on flash disks. Some naive ideas are introduced and initial experiments have been done. Finally we show some pictures of Hefei.
 (XML Group) Conditional Random Fields Model
A conditional random field (CRF) is a type of discriminative probabilistic model most often used for the labeling or parsing of sequential data, such as natural language text or biological sequences. I am trying to apply it to XML keyword refinement.
 (Web Group) Cumulated gain-based evaluation of IR techniques
The author develops four new measures to evaluate the efficiency of different IR techniques: CG, DCG, NCG and NDCG. The first one accumulates the relevance scores of retrieved documents along the ranked result list. The second one is similar but applies a discount factor to the relevance scores in order to devalue late-retrieved documents. The third (fourth) one computes the relative-to-the-ideal performance of IR techniques, based on the (discounted) cumulated gain they are able to yield. Then the author examines five different IR techniques using the new measures on TREC-7 data and discusses the parameters and the limitations of the new measures.
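As a minimal sketch of the four measures described above, the following Python computes CG, DCG, NCG and NDCG for one ranked list. The gain values are made up for illustration and the function names are hypothetical. The logarithmic discount (no discount before rank b, then division by the base-b logarithm of the rank) is an assumption, since the summary does not spell out the discount factor, and the ideal vector is simply the same gains sorted in descending order, which is a simplification.

```python
import math

def cumulated_gain(gains):
    """CG: running sum of relevance scores down the ranked list."""
    total, out = 0.0, []
    for g in gains:
        total += g
        out.append(total)
    return out

def discounted_cumulated_gain(gains, base=2):
    """DCG: like CG, but gains at rank >= base are divided by log_base(rank)."""
    total, out = 0.0, []
    for rank, g in enumerate(gains, start=1):
        total += g if rank < base else g / math.log(rank, base)
        out.append(total)
    return out

def normalized(actual, ideal):
    """Divide each position of the actual vector by the ideal vector."""
    return [a / i if i else 0.0 for a, i in zip(actual, ideal)]

# Illustrative relevance scores (0-3) of six retrieved documents, in rank order.
gains = [3, 2, 3, 0, 1, 2]
# Assumption: the ideal ranking is the same documents sorted by relevance.
ideal = sorted(gains, reverse=True)

cg = cumulated_gain(gains)
dcg = discounted_cumulated_gain(gains)
ncg = normalized(cg, cumulated_gain(ideal))
ndcg = normalized(dcg, discounted_cumulated_gain(ideal))
```

The normalized vectors stay between 0 and 1 at every rank, which makes it possible to compare techniques across queries with different numbers of relevant documents.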

 (Mobile Group) Summary for Attending CIKM08
This presentation shares our trip experience with our lab. It covers the number of papers and the acceptance rate, three keynotes, and some interesting talks and posters.
 (XML Group) Revisiting XML Keyword Search 
In this talk, I mainly discussed existing keyword search methods, then I introduced the initial idea of keyword search in XML stream.

 (Web Group) Experimental Results On Approximate Membership Checking 
This experiment is about the problem of identifying substrings of input text strings that approximately match some member of a potentially large dictionary. This problem arises in several important applications such as extracting named entities from text documents and identifying biological concepts from biomedical literature.
 (XML Group) Analysis of SQL/XML 
This PPT is mainly concerned with the characteristics of SQL/XML, especially the position of XML queries in SQL. Then an initial solution is given for integrating XML queries into an existing relational DB.
 (Mobile Group) The Declarative Programming Language:Ruby
Ruby is a language of careful balance. Its creator, Yukihiro “matz” Matsumoto, blended parts of his favorite languages (Perl, Smalltalk, Eiffel, Ada, and Lisp) to form a new language that balanced functional programming with imperative programming. Important language features include iteration, expressions, strings, control structures, everything is an object, naming conventions, method access, classes, singleton methods, blocks, exception processing, threads, and so on.
 (Web Group) Progress on Dataspace 
Introduced our progress on system development and research on Dataspace.
 (Web Group) TEXEM: An Entity-based Task Extraction Approach for Emails(For NDBC2008)
An introduction to our work on task extraction from emails that is to be presented in NDBC 2008.
 (Web Group) A Data Driven Approach for Automatic Wrapper Generation and Maintenance(For NDBC2008)
This paper proposes a novel method, called the data driven approach, to generate and maintain wrappers automatically. This approach matches data items between source pages and target pages using the same semantic record from different websites in one domain or from different templates in one site. Then it generates or maintains wrappers with these mapped data items.
 (XML Group) MLCEA: An Entity Based Semantics for XML Keyword Search(For NDBC2008)
Defining effective semantics which is used to determine the data fragments returned to users is the key problem in XML keyword search. We proposed new semantics
 (XML Group) Efficient Processing of XML Twig Query based on Related Semantics(For NDBC2008)
Though keyword search is easy to use, it has the inherent limitation of restricted expressive capability. Structured query languages have powerful expressive ability; however, users must have a full understanding of the underlying schema information. We propose an extension to XPath by introducing a novel related semantics and an efficient algorithm, rTwigStack, based on this semantics.
 (Mobile Group) The TPC-C and Current Testing Benchmark
TPC Benchmark™ C (TPC-C) is an OLTP workload that is regarded as a standard for performance evaluation. To evaluate software performance, a TPC-C testing system is needed.

 (Web Group) Approximate membership lookup 
How to efficiently extract all substrings from input documents that approximately match a record in a dictionary.
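The problem statement can be illustrated with a deliberately naive baseline: slide a token window over the text and keep every window whose edit distance to a dictionary record is within a threshold. This is only a sketch of the problem, not the efficient method the talk covers; the function names, the use of edit distance as the similarity measure, and the sample data are all assumptions.

```python
def edit_distance(a, b):
    """Levenshtein distance computed row by row with dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution or match
        prev = cur
    return prev[-1]

def approximate_matches(text, dictionary, max_dist=1):
    """Return (substring, record) pairs whose edit distance is <= max_dist.

    Brute force: every window with the same token count as the record is
    compared against it, so the cost grows with text size times dictionary size.
    """
    tokens = text.split()
    hits = []
    for entry in dictionary:
        width = len(entry.split())
        for start in range(len(tokens) - width + 1):
            candidate = " ".join(tokens[start:start + width])
            if edit_distance(candidate.lower(), entry.lower()) <= max_dist:
                hits.append((candidate, entry))
    return hits

hits = approximate_matches("he moved to new yrok city last year",
                           ["new york", "los angeles"], max_dist=2)
```

The brute-force scan makes the efficiency concern concrete: a practical solution needs a filter that discards most windows before any distance is computed.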
 (Web Group) OwnerCorrelation: A new Framework for Personal Dataspace Management 
Correlation to the owner is the fundamental characteristic of every object in a personal dataspace (PDS), and it may play an important role in data operations on the PDS. Based on this assumption, we propose a new concept, OwnerCorrelation (OC), to describe the relation between the owner and the other objects in the PDS, and we propose using a Personal Taskspace (PTS) to model the characteristics of the owner entity, which offers a new perspective on PDSMS techniques.
 (Mobile Group) A Summary of Indoor Navigation 
Traditional locating methods cannot be used in indoor environments because of problems with signal strength, accuracy, and so on. There are many differences between indoor navigation and outdoor navigation, so research on indoor navigation is becoming a hot topic. This paper analyzes the challenging problems of indoor navigation. It summarizes current research on navigation patterns, navigation techniques and navigation systems. Finally it points out future research directions.
 (Mobile Group) Summary of Recent Work
In this talk, I mainly introduce the simple idea of recent work. The idea comes from papers of two different aspects. One is about self-tuning and the other is about storage strategy.
 (Mobile Group) Summary for research in Hong Kong
This presentation shares my work and Dr. Xu's work in Hong Kong with our lab.
 (XML Group) Summary of Recent Work 
In this talk, I mainly discuss recent work done at NUS, which consists of two parts. The first is about extending XPath and was submitted to ICDE09; the second is about XML keyword search and will be submitted to DASFAA.

C-DBLP Project Presentation
The C-DBLP Project is a Chinese computer science document-integration system based on data integration technology. The system is developed by the Web Group in the Lab of Web and Mobile Data Management (WAMDM), Renmin University of China. Data in the C-DBLP system, which includes papers published in well-known journals and conferences, is organized by author to provide efficient and easy-to-use document retrieval services.
 (Web Group) Introduction to Progress of OrientSpace 
Introduced some development progress of OrientSpace in this summer.
 (Undergraduate) An example of the report of PG code review--executor 
 (Web Group) EASE: An Effective 3-in-1 Keyword Search Method for Unstructured, Semi-structured and Structured Data 
This is a paper published in SIGMOD 2008 (Guoliang Li, et al.). In this paper, an efficient and adaptive keyword search method, called EASE, is proposed for indexing and querying large collections of heterogeneous data. It proposes an extended inverted index to facilitate keyword-based search, and presents a novel ranking mechanism for enhancing search effectiveness.
 (XML Group) Keyword Proximity Search in Complex Data Graphs 
This is a SIGMOD'08 paper about keyword proximity search on graphs. Previous approaches try to answer queries on graphs by solving the Steiner-tree problem. But they have two major flaws
 (XML Group) Template of the Report of PG Code Review
 (XML Group) Efficient Merging and Filtering Algorithms for Approximate String Searches
I present my research experience and this ICDE paper, in which we propose several new efficient merging and filtering algorithms for approximate string searches.
 (Web Group) Introduction to online advertising and AdCenter labs
A brief introduction to online Advertising and the demo developed by Microsoft AdCenter labs.
 (Web Group) Faceted Search
A brief introduction to faceted search and the differences between faceted search and others.
 (Web Group) Introduction to Freebase
Freebase claims to be an open, shared database of the world's knowledge. Compared with Wikipedia, the information it stores and organizes is more structured. Freebase is a novel application of Web 2.0 and the semantic web.

 (Web Group) Research on Personal Dataspace Management
The explosion in the amount of digital information has made Personal Information Management (PIM) a hot topic. Personal data is always distributed, disorderly, personalized, heterogeneous and evolving, which poses many challenges to effective and efficient Personal Dataspace Management (PDSM). In the paper, by highlighting the importance of users in a Personal Dataspace Management System (PDSMS), we propose a user-centered framework. We first show research issues, related work, main research problems and challenges in this area. We then introduce the current research work and the preliminary results. Finally, the research plan of my PhD project is presented for discussion.
 (Mobile Group) Continuous Density Queries for Moving Objects
A density query returns the regions with a density higher than some user-specified threshold. Although many studies have been done on density queries for moving objects in highly dynamic scenarios, they all focused on how to answer snapshot density queries for moving objects. This presentation proposed an approach which continuously monitors dense regions for moving objects.
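To make the notion of a density query concrete, here is a minimal snapshot sketch that partitions the plane into a uniform grid and reports the cells whose density exceeds the threshold. The grid partitioning, the function name and the sample points are assumptions made for illustration; the continuous monitoring approach proposed in the presentation is not reproduced here.

```python
from collections import Counter

def dense_cells(points, cell_size, threshold):
    """Return grid cells whose density (objects per unit area) exceeds threshold.

    Each cell is identified by its integer (column, row) index in a uniform
    grid with square cells of side cell_size.
    """
    counts = Counter((int(x // cell_size), int(y // cell_size))
                     for x, y in points)
    area = cell_size * cell_size
    return sorted(cell for cell, n in counts.items() if n / area > threshold)

# Illustrative object positions at one time instant (a snapshot).
points = [(0.5, 0.5), (1.2, 1.9), (1.8, 0.3), (7.0, 7.5), (9.9, 9.1)]
result = dense_cells(points, cell_size=2.0, threshold=0.5)
```

Re-running this function on every position update would answer the continuous version correctly but wastefully, which is the cost that an incremental monitoring method aims to avoid.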

 (XML Group) Keyword Search on XML Tree 
This presentation introduces keyword search on XML trees, including the construction of the index, the processing of the query and the analysis of the result.
 (Mobile Group) A new IO mechanism for transaction processing on flash database 
With the widespread use of PDAs, MP3 players and DCs, flash memory, as a new electronic storage device, is becoming more and more popular and important. There is an increasing trend of using databases to manage the growing data on flash memory. Moreover, transaction processing is needed to ensure the correctness of complicated applications. Based on an analysis of the traditional transaction IO mechanism, this paper presents a new IO mechanism for transactions.
 (Web Group) An Introduction to Desktop Search
This presentation gives a general overview of current desktop search tools and introduces current research in this field. As this field is becoming more active and is closely related to PIM, desktop search deserves our attention.
 (XML Group) Dagstuhl Seminar on Ranked XML Querying
This report introduces some interesting talks about ranked XML querying presented in Dagstuhl Seminar.

 (Web Group) Introduction to Cloud Computing
This presentation introduces Cloud Computing, which looks like a classic disruptive technology. We also show the relationship between Web 2.0, Grid Computing and Cloud Computing. Finally, we discuss several Cloud Computing cases and the future of Cloud Computing.
 (Mobile Group) FlashDB & LazyHash
One problem in designing indexes for flash-based storage is that hardware platforms and workloads differ widely. FlashDB and LazyHash try to optimize for all situations by tuning themselves.
 (Web Group) An Event Based Email Processing Approach
Introduced an approach to help users process their emails by showing them a collection of events rather than plain text.
 (XML Group) Discussion of recent work of XML group 
In this talk, I present the ongoing work of the XML group and raise some interesting problems encountered in practice to discuss with all members of our lab.
 (XML Group) Relaxed Twig Query Processing 
Twig query processing can achieve high performance when compared to binary join processing. But two problems exist
 (XML Group) A New Query Semantic 
This presentation introduces several query semantics for keyword search on XML documents and their disadvantages, and we also define a new query semantics.
 (Web Group) Automatic Scientific Paper Classification
Automatic scientific paper classification has become an important research topic due to the increasing number of large scientific paper collections. Several works have explored this problem but still have some drawbacks, and the results cannot meet users' needs in practice. We focus on improving the performance of current work by using a search engine.
 (XML Group) LaTeX: An Intro 
A brief introduction of LaTeX.
 (Web Group) An efficiency for personal dataspace integration 
An adaptive strategy for personal data integration is proposed, and a prototype system is developed and demonstrated in the paper.
 (Web Group) A New Method for Deep Web Data Integration 
We propose a new method for Deep Web Data Integration which can implement precise extraction

 (XML Group) XQuery/Update processing 
This talk focuses on the processing technique of XQuery/Update, whose draft was proposed by W3C in the past year. The talk shows how to process XQuery/Update based on XML algebra, and how to optimize Transform queries based on XML algebra.
 (Web Group) Research of Entity Identification for Web data management 
Entity identification means finding the records that refer to the same real-world entity in two or more data sources. In web data integration, the variety and variability of the data sources make entity identification on public data a very challenging problem. In addition, some special applications require entity identification on private data.
 (XML Group) Integrity Auditing of Outsourced Data 
Some research points on integrity auditing of outsourced data, things that need to be done before graduation, and work for the near future
 (Mobile Group) Privacy Protection in Location Based Service 
This talk analyzes the privacy problem in location based services and the state-of-the-art related work. We also propose two solutions for different privacy issues.

 (XML Group) Finding What You Want to Find from Complex Structured XML Databases 
This talk focuses on how to provide users with a simpler interface to express their query requirements when the structure of the given XML document is very complex.
 (Web Group) Similarity Measures in Deep Web Data Integration
This presentation introduces the existing similarity measures and proposes the challenges and solutions in Deep Web data integration.
 (Web Group) Jobtong System Progress and Research Topics
This presentation introduced the Jobtong system, which is an effective Deep Web Data Integration System. It also showed the progress of Jobtong in this term and proposed plans for future work.
 (Mobile Group) Location Privacy-Preserving against Maximum Movement Boundary Attack 
This talk proposes a new cloaking algorithm that defends against the Maximum Movement Boundary Attack. It is an undirected-graph based cloaking algorithm which incrementally finds cliques and the anonymized set. The MBR of the anonymized set is the final cloaking region.
 (Mobile Group) Flash DBMS Index and Transaction
This presentation proposes a novel index model for flash DBMS. The initial experiments show great performance compared with FTL, JFFS3 and IPL. In addition, two interesting questions about transactions on flash are raised.
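Only the last step described above, turning an anonymized set into a cloaking region, is simple enough to sketch without guessing at the algorithm. The snippet below computes the minimum bounding rectangle (MBR) of a set of user locations. The function name and coordinates are hypothetical, and the clique-finding step that selects the anonymized set is not shown.

```python
def minimum_bounding_rectangle(locations):
    """Return (min_x, min_y, max_x, max_y) enclosing all (x, y) locations."""
    if not locations:
        raise ValueError("anonymized set must not be empty")
    xs = [x for x, _ in locations]
    ys = [y for _, y in locations]
    return (min(xs), min(ys), max(xs), max(ys))

# Illustrative anonymized set: locations of users cloaked together.
anonymized_set = [(2.0, 3.0), (5.0, 1.0), (4.0, 6.0)]
cloaking_region = minimum_bounding_rectangle(anonymized_set)
```

Reporting the rectangle instead of an exact point means the service provider sees a region that contains every user in the set.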
 (Web Group) SNS&DBRef
Introduced the concepts of SNS, analyzed the features of mainstream SNS, and proposed some ideas on constructing DBRef.
 (Mobile Group) Study on a Time Management Model for Distributed Workflow Systems
This paper builds an integrated time management model for distributed workflow systems, called DWfS-TMM (Distributed Workflow System-Time Management Model). It supports distributed collaboration in the modeling and operation of business activities across different time zones, multiple time granularities and different working times.
 (Mobile Group) Semantic based Request Diversity in LBS 
This presentation proposes a new privacy protection model for LBS, which considers the semantic information of queries and the diversity of request information. An anonymization algorithm based on this model is also proposed, as well as some pruning heuristics and algorithm optimizations.

 (XML Group) A glance of the Data Warehouse implemented by CVICSE
This presentation gives a glance at the DWH implemented by CVICSE
 (XML Group) Introduction to XML IR —Scoring and Ranking
This report summarizes XML scoring and ranking approaches in XML information retrieval.
 (Mobile Group) A Problem of Concurrency Control in A Flash-based DBMS 
We surveyed how to control concurrency in a flash-based DBMS and found an interesting new problem!
 (Web Group) DBRef:Discussion on New Features
This presentation shows the current version of the DBRef System and puts forward some challenges and problems in the system. The main aim is to discuss the work that follows.
 (Web Group) Uncertainty in Data Integration
This presentation can be divided into three parts. In the first part the detailed introduction about the paper "Data Integration with Uncertainty" will be given. The second part is the brief introduction about the Workshop on Management of Uncertain Data. And the third part is about uncertainty in Deep Web.
 (Mobile Group) Uncertain Data
This presentation shows some query processing algorithms for location data with uncertainty.
 (XML Group) Introduction of Probabilistic XML
This talk presents a review of probabilistic XML and its application in recent years, including query semantics, computing the probability of query results and the application of probabilistic XML in data integration.
 (Web Group) Social Network and Collaborative Task Management in Email
This presentation shows some challenges and problems in email management and our view of the problem. A draft proposal of MIR is also presented here.
 (Web Group) Introduction to Natural Language Processing
This report is mainly about natural language processing and metadata-based web information extraction. It will provide our research group with an open platform for natural language processing.

 (XML Group) OrientX 3.0 Demo
This presentation is mainly about OrientX, a native XML database system developed by WAMDM. During the last year, we recoded OrientX2.5 and added several new features, including a new architecture, XQuery/Update, a set of programming APIs and visualization. The update module is the most important, and we then demonstrate XQuery/Update in the new version, OrientX3.0.
 (Mobile Group) Location Management and Moving Objects Databases 
This report summarizes moving objects research and the direction that we are focusing on. Specifically, it includes indexing technology, query processing technology, uncertainty and probabilistic queries, and spatio-temporal data mining.
 (XML Group) Introduction to XML IR
This ppt concerns the processing workflow of XML information retrieval (XML IR), including query semantics, processing algorithms, scoring and ranking, and representation of the query result. It mainly deals with the first two parts (query semantics and algorithms).
 (XML Group) A Review of VLDB2007
A short review of the VLDB 2007 conference, including all the sessions and demos attended

 (Web Group) Introduction for Deep Web
 (Web Group) Dataspace- A New Research Focus
 (XML Group) XML Query Process Technology
 (XML Group) Integrity Auditing of Outsourced Data                                                                        
 (Mobile Group) Flash Data Management

 (XML Group) Efficient Processing of Partially Specified Twig Queries
Partially specified twig queries give the user the most flexibility in specifying the structural constraints of a query. This presentation focuses on how to express partially specified twig queries in a more concise but effective way and how to process them more efficiently.
 (Mobile Group) Design of Flash-Based DBMS: An In-Page Logging Approach
The popularity of high-density flash memory as a data storage medium has increased steadily for a wide spectrum of computing devices. It is thus not inconceivable to consider running a full database system on flash-only computing platforms or running an embedded database system on lightweight computing devices. In this paper, the authors present a new design called in-page logging (IPL) for flash memory based database servers.

 (Web Group) Indexing Dataspaces
To support users’ queries to a heterogeneous mix of data in dataspaces, this paper proposed several indexing plans based on the characteristics of dataspace. All of those are essentially based on extended inverted lists. They combine keyword queries and structure-aware queries to establish such indexes that are the most suitable to dataspaces’ characteristics and users’ requests.
 (Mobile Group) A Survey on Flash-based Database Index Methods
To support efficient lookups on the primary key of database records stored on NAND flash devices, some index methods have been proposed. We analyse these methods and build a cost model for each of them. By comparing them on several important metrics, we find the advantages and drawbacks of each index structure.

 (XML Group) New OrientX Storage for XML Update
In order to support XML update more efficiently, we introduce an improved implementation of OrientX storage which is based on the original implementation. The new implementation changes the old model of OrientX only a little while supporting XML update efficiently.
 (Mobile Group) Quality Aware Privacy Protection for Location-based Services (DASFAA 2007)
In this paper, we have discussed the problem of quality-aware privacy protection in location-based services. We classified the privacy requirements into location anonymity and identifier anonymity. To protect both kinds of anonymity, we give some solutions. Experimental evaluation has verified the effectiveness of our model and the proposed cloaking algorithms under various privacy and QoS requirements.
 (Mobile Group) Clustering Moving Objects in Spatial Networks (DASFAA 2007)
In this paper, we studied the problem of clustering moving objects in a spatial road network and proposed a framework to address this problem. By introducing a notion of cluster block (CB), this framework, on one hand, amortizes the cost of clustering into CB maintenance and combination based on the object movement feature in the road network; and on the other hand, it efficiently supports different clustering criteria.
 (Web Group) Dasfaa07-EasyQuerier (DASFAA 2007)
In a web database integration system, there are two potential problems when using integrated interfaces in practice. First, if the number of domains is large, it may be difficult for users to find the correct domain. Second, the integrated interfaces can become too complicated for ordinary users to use. EasyQuerier allows users to submit keyword-based queries to access Web databases by first mapping a keyword-based user query to a suitable domain and then translating the user query to a well-formatted query on the integrated interface of the found domain.

 (XML Group) Consider improving the system OrientX
A new design for OrientX. The architecture, core components and the related research topic that may be interesting.
 (Web Group) Introduction to JobTong
These slides introduce how the JobTong System, a solution for Deep Web integration, works. By now, JobTong has more than 300,000 job records. In these slides, the structure of JobTong is presented and the next steps are proposed.
 (Mobile Group) RFID Data Management
RFID technology has gained significant momentum in the past few years. In addition to applications in retail and distribution, RFID technology holds the promise to simplify airline luggage management, healthcare and library management. We present a brief introduction to RFID technology and highlight some results of RFID data management, including storage and modeling of RFID data, warehousing and mining massive RFID data sets, data cleaning and existing demos.
 (XML Group) Introduction to European academic visit
In this presentation, the research topics of the Knowledge and Database Systems Lab, NTU Athens, are introduced first; they cover a wide range of the database research area. Then two XML-related problems are introduced: context-aware databases and partially specified twig pattern queries.
 (Mobile Group) Novel Forms of Nearest Neighbor Search
These slides present five novel forms of nearest neighbor search: conventional NN queries, reverse NN queries, aggregate NN queries, NN queries with validity information and skyline queries. Some algorithms to answer these queries are also introduced.
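Of the query types listed above, the skyline query is the easiest to show in a few lines: it returns every point that no other point dominates, where one point dominates another if it is at least as good in every dimension and strictly better in at least one. The quadratic scan below is only a sketch, assuming smaller values are better; the function names and hotel data are hypothetical, and the algorithms introduced in the slides are not reproduced.

```python
def dominates(p, q):
    """True if p is no worse than q in every dimension and better in one."""
    return (all(a <= b for a, b in zip(p, q))
            and any(a < b for a, b in zip(p, q)))

def skyline(points):
    """Return the points not dominated by any other point (O(n^2) scan)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

# Illustrative hotels as (price, distance to the beach); lower is better.
hotels = [(50, 8), (80, 2), (60, 5), (90, 9), (70, 5)]
result = skyline(hotels)
```

In this data the hotel at (90, 9) drops out because (50, 8) is both cheaper and closer, and (70, 5) drops out because (60, 5) is cheaper at the same distance.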

 (Web Group) About PIM
In these slides, background knowledge of Personal Information Management (PIM) is introduced first, including the development history of PIM, the origin of the PIM concept and the current state of PIM research; in particular, the PIM workshop, which has been held twice, is presented. Then the related work on PIM is summarized and the related branches are discussed. Finally, some ideas and possible research topics are raised and briefly described, and an outlook on PIM research is given.

  (Web Group)  Query translation in Web database integration
  (Mobile Group)  Flash DBMS
  (Web Group)  Data Integration - Achievements and Perspectives in the Last Ten Years
  (Web Group)  Mashup related
  (Mobile Group)  Trustworthy Keyword Search for Regulatory Compliant Record Retention

  (Web Group)  Introduction for Deep Web                               
  (XML Group)  OrientX: A Native XML Database
  (Web Group)  essential google                                                                        
  (XML Group)  OrientX: Sum-up and Future
