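Research on Key Technologies for Privacy Protection in Big Data Opening and Governance
In recent years, privacy protection has become a core issue in research on, and applications of, big-data-driven management and decision-making. Traditional privacy protection theories and techniques can no longer cover what privacy means for big data, so they need to be rethought and repositioned. This project therefore takes the privacy problems raised by big data integration and fusion, querying and analysis, and publishing and sharing as its starting point, and proposes a big data privacy protection framework. The framework comprises privacy risk monitoring and assessment techniques, proactive privacy protection techniques, query privacy protection techniques, and provenance-based accountability techniques. The results will be used to build a prototype privacy protection system for big data management and decision-making, with mobile communications as the demonstration domain, to verify the effectiveness and efficiency of the proposed mechanisms and models on real data. The project aims to provide theory, methods, technical support, and new ideas for further research on and application of big data privacy protection.
Project Title
Major Research Plan of the National Natural Science Foundation of China, "Research on Key Technologies for Privacy Protection in Big Data Opening and Governance" (9164620003), January 2017 - December 2020
Project Description
Figure 1. Privacy protection framework
Starting from the requirements of big data sharing and governance, and building on the privacy protection framework in Figure 1, the project's main research topics are: (1) a detection system and assessment techniques for big data privacy risk; (2) proactive privacy protection techniques for big data; (3) indexing and integration techniques for private data under the framework; (4) techniques that protect both data privacy and query privacy when the framework processes user queries; (5) efficient analysis techniques for private data under the framework; (6) accountability techniques for privacy leaks caused by collecting, processing, storing, and destroying big data under the framework; (7) demonstration applications of privacy protection in big-data-driven management and decision-making.
Project Work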
· PrivKV: Key-Value Data Collection with Local Differential Privacy
Local differential privacy (LDP), where each user perturbs her data locally before sending it to an untrusted data collector, is a new and promising technique for privacy-preserving distributed data collection. To the best of our knowledge, there is no existing LDP work on key-value data, which is an extremely popular NoSQL data model and the generalized form of set-valued and numerical data. In this paper, we study the problem of frequency and mean estimation on key-value data by designing three methods, namely PrivKV, PrivKVM, and PrivKVM+.
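To make the perturb-locally, calibrate-at-the-collector idea concrete, the following is a minimal sketch of binary randomized response, a standard LDP building block. It is not the PrivKV protocol itself: the function names, the simulated population, and the parameter values are assumptions chosen for illustration, and it estimates only the frequency of a single key, not the mean of its values. It requires epsilon > 0.

```python
import math
import random


def keep_probability(epsilon):
    """Probability of reporting the true bit under binary randomized response."""
    return math.exp(epsilon) / (1.0 + math.exp(epsilon))


def perturb_bit(bit, epsilon, rng):
    """User side: report the true bit with probability p, otherwise flip it."""
    p = keep_probability(epsilon)
    return bit if rng.random() < p else 1 - bit


def estimate_frequency(reports, epsilon):
    """Collector side: unbiased estimate of the fraction of true 1-bits.

    E[observed] = (1 - p) + f * (2p - 1), so f = (observed - (1 - p)) / (2p - 1).
    """
    p = keep_probability(epsilon)
    observed = sum(reports) / len(reports)
    return (observed - (1.0 - p)) / (2.0 * p - 1.0)


def simulate(n_users=100_000, true_frequency=0.3, epsilon=1.0, seed=7):
    """Hypothetical population: each user holds one bit (1 = possesses the key)."""
    rng = random.Random(seed)
    true_bits = [1 if rng.random() < true_frequency else 0 for _ in range(n_users)]
    reports = [perturb_bit(b, epsilon, rng) for b in true_bits]
    return sum(true_bits) / n_users, estimate_frequency(reports, epsilon)


if __name__ == "__main__":
    truth, estimate = simulate()
    print(f"true frequency: {truth:.4f}, LDP estimate: {estimate:.4f}")
```

A smaller epsilon pushes the keep probability toward 0.5, which strengthens privacy but increases the variance of the calibrated estimate.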
· AppPrivacy: Analyzing Data Collection and Privacy Leakage from Mobile Apps
While collecting legitimate usage data, many mobile applications (apps) have reportedly posed privacy threats to their host mobile devices and to individuals, who are, unfortunately, unaware of data leaks and of measures to protect themselves against them. In this poster, we present a system that analyzes the two sides of the mobile application ecosystem: data collection and privacy risk. The system consists of three main modules that correspond to mobile apps, users, and service providers, respectively. To the best of our knowledge, this is the first work to evaluate privacy risk by analyzing data collection and privacy leakage from mobile apps.
· Protecting Location Privacy against Location-Dependent Attacks in Mobile Services
Most existing k-anonymity location cloaking algorithms are concerned with snapshot user locations only and cannot effectively prevent location-dependent attacks when users' locations are continuously updated. Therefore, adopting both location k-anonymity and cloaking granularity as privacy metrics, we propose a new incremental clique-based cloaking algorithm, called ICliqueCloak, to defend against location-dependent attacks. The efficiency and effectiveness of the proposed ICliqueCloak algorithm are validated by a series of carefully designed experiments. The experimental results also show that the price paid for defending against location-dependent attacks is small.
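The two privacy metrics and the idea of a location-dependent attack can be illustrated with a small sketch. It is not the ICliqueCloak algorithm: it only checks a given rectangular cloaking region, and it assumes that cloaking granularity is expressed as a minimum region area and that an adversary bounds a user's movement by a known maximum speed. All names (Rect, satisfies_metrics, within_max_movement_boundary) are hypothetical.

```python
import math
from typing import List, NamedTuple, Sequence, Tuple

Point = Tuple[float, float]


class Rect(NamedTuple):
    """Axis-aligned cloaking region."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    def area(self) -> float:
        return (self.x_max - self.x_min) * (self.y_max - self.y_min)

    def contains(self, point: Point) -> bool:
        x, y = point
        return self.x_min <= x <= self.x_max and self.y_min <= y <= self.y_max

    def distance_to(self, point: Point) -> float:
        """Euclidean distance from a point to the rectangle (0 if inside)."""
        x, y = point
        dx = max(self.x_min - x, 0.0, x - self.x_max)
        dy = max(self.y_min - y, 0.0, y - self.y_max)
        return math.hypot(dx, dy)

    def corners(self) -> List[Point]:
        return [(self.x_min, self.y_min), (self.x_min, self.y_max),
                (self.x_max, self.y_min), (self.x_max, self.y_max)]


def satisfies_metrics(region: Rect, users: Sequence[Point],
                      k: int, min_area: float) -> bool:
    """Location k-anonymity (at least k users covered) plus the assumed
    granularity metric (region area at least min_area)."""
    covered = sum(1 for p in users if region.contains(p))
    return covered >= k and region.area() >= min_area


def within_max_movement_boundary(previous: Rect, current: Rect,
                                 max_speed: float, elapsed: float) -> bool:
    """True if every point of `current` is reachable from `previous` within
    `elapsed` time at `max_speed`. Checking the four corners suffices because
    the reachable set is convex. If this fails, an adversary who knows
    max_speed can rule out part of `current`."""
    reach = max_speed * elapsed
    return all(previous.distance_to(c) <= reach for c in current.corners())
```

For example, with a previous region Rect(0, 0, 5, 4) and a current region Rect(3, 2, 8, 6), the farthest corner of the current region lies about 3.61 units from the previous region, so the check passes for a reach of 4 units and fails for a reach of 3.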
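· China Privacy Risk Index Analysis Report
This report surveys and analyzes how the personal data of mobile device users (hereafter, mobile users) was collected in 2019. Taking the perspective of the two main data parties in mobile scenarios, data owners (mobile users) and data collectors (app developers), it designs a quantitative privacy risk model and establishes a China privacy risk index system to reveal the characteristics of privacy risk along each dimension. Using 30 million real users obtained by region-stratified sampling as the sample, the report analyzes mobile users' app usage and the causes of privacy risk. It then discusses the group-level privacy risk characteristics of these 30 million users through data-collector privacy risk indices (single-app and multi-app data collectors) and data-owner privacy risk indices (regional, demographic, and behavioral privacy risk indices).
Project Outcomes
Journal Papers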
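(1) Xiaofeng Meng* ; Minjie Zhu; Junxu Liu; Research on Quantifying Privacy Risk for Large-Scale Users, Journal of Information Security Research, 2019, 5(9):778-788. Grant acknowledged 1st.
(2) Xiaofeng Meng* ; Chaohong Ma; Chen Yang; A Survey of Machine-Learning-Based Database Systems, Journal of Computer Research and Development, 2019, 56(9):1803-1820. Grant acknowledged 4th.
(3) Xiaofeng Meng* ; Minjie Zhu; Lixin Liu; Junxu Liu; Research on Data Monopoly and Its Governance Models, Journal of Information Security Research, 2019, 5(9):789-797. Grant acknowledged 1st.
(4) Zheng Huo; Xiaofeng Meng* ; A Trajectory Data Publishing Method Satisfying Differential Privacy, Chinese Journal of Computers, 2017, 41(2):400-412. Grant acknowledged 1st.
(5) Chunkai Wang; Xiaofeng Meng* ; An Online Join Method for Skewed Data Streams, Journal of Software, 2017, 29(3):869-882. Grant acknowledged 4th.
(6) Chunkai Wang; Xiaofeng Meng* ; Qi Guo; Zujian Weng; Chen Yang; Automating Characterization Deployment in Distributed Data Stream Management Systems, IEEE Transactions on Knowledge and Data Engineering (TKDE), 2017, 29(12):2669-2681. Grant acknowledged 1st.
(7) Shuo Wang; Aishan Maoliniyazi; Xinle Wu; Xiaofeng Meng* ; Emo2Vec: Learning emotional embeddings via multi-emotion category, ACM Transactions on Internet Technology, 2020, 20(2):1-17. Grant acknowledged 2nd.
(8) Junxu Liu; Xiaofeng Meng* ; A Survey of Privacy Protection in Machine Learning, Journal of Computer Research and Development, 2020, 57(2):346-362. PKU Chinese core journal. Grant acknowledged 1st.
(9) Shuo Wang; Zhijuan Du; Xiaofeng Meng* ; Research Progress on Large-Scale Knowledge Graph Completion, SCIENTIA SINICA Informationis, 2020, 50(4):93-117. Grant acknowledged 4th.
(10) Xiaojian Zhang; Congcong Fu; Xiaofeng Meng* ; Face Image Publishing Combining Matrix Factorization and Differential Privacy, Journal of Image and Graphics, 2020, 25(4):655-668. Grant acknowledged 2nd.
(11) Kerui Chen; Xiaofeng Meng* ; Interpretability of Machine Learning, Journal of Computer Research and Development, 2020, 57(9):1971-1986. Grant acknowledged 1st.
(12) Qingqing Ye; Xiaofeng Meng* ; Minjie Zhu; Zheng Huo; A Survey of Local Differential Privacy, Journal of Software, 2018, 28(7):1981-2005. Grant acknowledged 1st.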
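(13) Xiaojian Zhang; Kaizhong Jin; Xiaofeng Meng; A Private Spatial Decomposition Method Based on Adaptive Grids, Chinese Journal of Computers, 2018, 55(6):1143-1156. Grant acknowledged 3rd.
(14) Xiaojian Zhang; Congcong Fu; Xiaofeng Meng; Differential Privacy Protection for Face Image Publishing, Journal of Image and Graphics, 2018, 23(9):1305-1315. Grant acknowledged 3rd.
(15) Xiaojian Zhang; Nan Fu; Xiaofeng Meng; A Spatial Range Query Method Based on Local Differential Privacy, Journal of Computer Research and Development, 2020, 57(4):847-858. Grant acknowledged 3rd.
(16) Xiaojian Zhang; Nan Fu; Xiaofeng Meng; An Accurate Key-Value Data Collection Method Based on Local Differential Privacy, Chinese Journal of Computers, 2020, 43(8):1479-1492. Grant acknowledged 3rd.
(17) Xiaojian Zhang; Li Chen; Kaizhong Jin; Xiaofeng Meng; A Private High-Dimensional Data Publishing Method Based on Junction Trees, Journal of Computer Research and Development, 2018, 55(12):1178-1193. Grant acknowledged 3rd.
(18) Huili Peng; Kaizhong Jin; Congcong Fu; Nan Fu; Xiaojian Zhang; A Private Sequential Pattern Mining Method Based on Sequence Lattices, Acta Electronica Sinica, 2020, 48(1):153-163. Grant acknowledged 2nd.
(19) Haibin Zheng; Qianhong Wu; Jan Xie; Zhenyu Guan; Bo Qin; Zhiqiang Gu; An organization-friendly blockchain system, Computers & Security, 2019, 88(1):0167-4048. Grant acknowledged 9th.
(20) Lin Zhong; Qianhong Wu; Jan Xie; Zhenyu Guan* ; Bo Qin; A secure large-scale instant payment system based on blockchain, Computers & Security, 2019, 83:349-364. Grant acknowledged 9th.
(21) Lin Zhong; Qianhong Wu; Jan Xie; Zhenyu Guan* ; Bo Qin; A secure versatile light payment system based on blockchain, Future Generation Computer Systems, 2019, 93:327-337. Grant acknowledged 9th.
(22) Jibin Fu; Xiaojian Zhang* ; Liping Ding; MAXGDDP: A Differential-Privacy-Based Algorithm for Publishing Decision Data, Journal on Communications, 2018, 39(3):136-146. Grant acknowledged 2nd.
(23) Lixin Liu; Xiaofeng Meng* ; Blockchain and Data Governance, Bulletin of National Natural Science Foundation of China, 2020, 34(1):12-17. Grant acknowledged 1st.
(24) Xiaofeng Meng* ; Leixia Wang; Junxu Liu; Data Privacy, Monopoly, and Fairness in the Age of Artificial Intelligence, Big Data Research, 2020, 6(1):35-46. Grant acknowledged 1st.
(25) Kaizhong Jin; Huili Peng* ; Xiaojian Zhang; A Trajectory Pattern Mining Algorithm Based on Differential Privacy, Journal of Computer Applications, 2017, 37(10):2938-2945. Grant acknowledged 2nd.
(26) Wang, Y.; Ding, Y.; Wu, Q.; Wei, Y.; Qin, B.; Privacy-preserving cloud-based road condition monitoring with source authentication in VANETs, IEEE Transactions on Information Forensics and Security, 2018, 14(7):1779-1790. Grant acknowledged 6th.
(27) Yi Zhang; Xiaofeng Meng* ; InterTris: Representation Learning for Domain Knowledge Graphs with Triple Interaction, Chinese Journal of Computers. Grant acknowledged 4th.
(28) Minjie Zhu; Qingqing Ye; Xiaofeng Meng* ; Xin Yang; Permission-Based Privacy Risk Quantification for Mobile Applications, SCIENTIA SINICA Informationis. Grant acknowledged 1st.
(29) Qingqing Ye; Haibo Hu; Man Ho Au; Xiaofeng Meng* ; Xiaokui Xiao; LF-GDPR: A Framework for Estimating Graph Metrics with Local Differential Privacy, IEEE Transactions on Knowledge and Data Engineering. Grant acknowledged 3rd.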
Conference Papers
(1) Xin Qiao; Lixiaoyang Wang; Bo Qin; Hong Chen; Suyun Zhao; Privacy Protection Method Based on Access Control, APSIPA, Hawaii, 2018-11-12 to 2018-11-15. Grant acknowledged 8th.
(2) Q. Ye; H. Hu; X. Meng* ; H. Zheng; PrivKV: Key-Value Data Collection with Local Differential Privacy, IEEE Symposium on Security and Privacy, USA, 2019-5-22 to 2019-5-24. Grant acknowledged 1st.
(3) Qingqing Ye; Haibo Hu; Man Ho Au; Xiaofeng Meng* ; Xiaokui Xiao; Towards Locally Differentially Private Generic Graph Metric Estimation, The 36th IEEE International Conference on Data Engineering (ICDE 2020), Dallas, Texas, USA, 2020-4-20 to 2020-4-24. Grant acknowledged 1st.
(4) Chen Yang; Xiaofeng Meng* ; Zhihui Du; Cloud based Real-Time and Low Latency Scientific Event Analysis, 2018 IEEE International Conference on Big Data, Seattle, WA, USA, 2018-12-10 to 2018-12-13. Grant acknowledged 3rd.
(5) Chen Yang; Xiaofeng Meng* ; Zhihui Du; JiaMing Qiu; Kenan Liang; Yongjie Du; Zhiqiang Duan; Xiaobin Ma; AstroServ Distributed Database for Serving Large-Scale Full Life-Cycle Astronomical Data, International Conference on Big Scientific Data Management (BigSDM), Beijing, China, 2018-11-30 to 2018-12-01. Grant acknowledged 3rd.
(6) Chunkai Wang; Xiaofeng Meng* ; Partitioning Road Network Streams Based on Runtime Correlation Discovery, 18th IEEE International Conference on Mobile Data Management, KAIST, Daejeon, 2017-5-29 to 2017-6-1. Grant acknowledged 1st.
(7) Yongjie Du; Xiaofeng Meng* ; Chen Yang; Zhiqiang Duan; Real-Time Query Enabled by Variable Precision in Astronomy, International Conference on Big Scientific Data Management (BigSDM), Beijing, China, 2019-11-30 to 2019-12-01. Grant acknowledged 3rd.
(8) Zujian Weng; Qi Guo; Chunkai Wang; Xiaofeng Meng* ; AdaStorm Resource Efficient Storm with Adaptive Configuration, 2017 IEEE 33rd International Conference on Data Engineering, San Diego, CA, USA, 2017-4-19 to 2017-4-22. Grant acknowledged 6th.
(9) Zhiqiang Duan; Chen Yang* ; Xiaofeng Meng; Yongjie Du; Continuous Cross Identification in Large-scale Dynamic Astronomical Data Flow, International Conference on Big Scientific Data Management (BigSDM), Beijing, China, 2019-11-30 to 2019-12-01. Grant acknowledged 3rd.
(10) Zhiqiang Duan; Chen Yang; Xiaofeng Meng* ; Yongjie Du; Xukang Zhang; Jiaming Qiu; Xiaobin Ma; Zhihui Du; Baoning Niu; Chao Wu; SciDetector: Scientific Event Discovery by Tracking Variable Source Data Streaming, 2019 IEEE 35th International Conference on Data Engineering (ICDE), Macao, 2019-4-8 to 2019-4-11. Grant acknowledged 2nd.
(11) Chen Yang; Zhihui Du; Xiaofeng Meng* ; Yongjie Du; A Frequency Scaling Based Performance Indicator Framework for Big Data Systems, International Conference on Database Systems for Advanced Applications (DASFAA 2019), Chiang Mai, Thailand, 2019-4-22 to 2019-4-25. Grant acknowledged 3rd.
(12) Zehui Hao; Zhongyuan Wang; Xiaofeng Meng* ; Jun Yan; Qiuyue Wang; Semantic Definition Ranking, International Conference on Database Systems for Advanced Applications (DASFAA 2017), Suzhou, China, 2017-5-27 to 2017-5-30. Grant acknowledged 5th.
(13) Xinle Wu; Lei Wang; Shuo Wang; Xiaofeng Meng* ; Linfeng Li; Haitao Huang; Xiaohong Zhang; A Unified Adversarial Learning Framework for Semi-supervised Multi-target Domain Adaptation, International Conference on Database Systems for Advanced Applications (DASFAA 2020), Jeju, South Korea, 2020-9-24 to 2020-9-27. Grant acknowledged 1st.
(14) Shuo Wang; Xiaofeng Meng* ; Multi-Emotion Category Improving Embedding for Sentiment Classification, 27th ACM International Conference on Information and Knowledge Management (CIKM 2018), Lingotto, Torino, Italy, 2018-10-22 to 2018-10-26. Grant acknowledged 3rd.
(15) Shuo Wang; Zehui Hao; Xiaofeng Meng* ; Qiuyue Wang; ScholarGraph: A Chinese Knowledge Graph of Chinese Scholars, The Eleventh International Conference on Language Resources and Evaluation (LREC 2018), Miyazaki, Japan, 2018-5-7 to 2018-5-12. Grant acknowledged 4th.
(16) Yi Zhang; Zhijuan Du; Xiaofeng Meng* ; EMT: A Tail-Oriented Method for Specific Domain Knowledge Graph Completion, Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD 2019), Macao, 2019-4-14 to 2019-4-17. Grant acknowledged 5th.
(17) Yang C; Du YJ; Du ZH; Meng XF* ; Micro Analysis to Enable Energy-Efficient Database Systems, Proceedings of the 23rd International Conference on Extending Database Technology (EDBT), Copenhagen, Denmark, 2020-3-30 to 2020-4-2. Grant acknowledged 2nd.
(18) Qing Tang; Chen Yang; Xiaofeng Meng* ; Zhihui Du; Benchmarking Database Ingestion Ability with Real-Time Big Astronomical Data, BenchCouncil International Symposium on Benchmarking, Denver, Colorado, USA, 2019-11-14 to 2019-11-16. Grant acknowledged 6th.
(19) Shengna Guo; Xiaofeng Meng* ; Density Peaks Clustering with Differential Privacy, The Biennial Conference on Innovative Data Systems Research (CIDR 2017), Chaminade, California, USA, 2017-1-8 to 2017-1-11. Grant acknowledged 1st.
(20) Zhijuan Du; Zehui Hao; Xiaofeng Meng* ; Qiuyue Wang; CirE: Circular Embeddings of Knowledge Graphs, International Conference on Database Systems for Advanced Applications (DASFAA), Suzhou, China, 2017-3-27 to 2017-3-30. Grant acknowledged 5th.
(21) Huadi Zheng; Qingqing Ye; Haibo Hu; Chengfang Fang; Jie Shi; BDPL: A Boundary Differential Private Layer against Machine Learning Model Extraction Attacks, The 24th European Symposium on Research in Computer Security (ESORICS 2019), Luxembourg, 2019-9-23 to 2019-9-27. Grant acknowledged 3rd.
(22) Shuo Wang* ; Knowledge Representation for Emotion Intelligence, IEEE 35th International Conference on Data Engineering (ICDE 2019), Macao, 2019-4-8 to 2019-4-11. Grant acknowledged 5th.
(23) Yuxuan Zhang; Jianghong Wei; Xiaojian Zhang* ; Xuexian Hu; Wenfen Liu; Two-Phase Algorithm for Generating Synthetic Graph Under Local Differential Privacy, The 8th International Conference on Communication and Network Security (ICCNS 2018), Qingdao, China, 2018-11-2 to 2018-11-4. Grant acknowledged 6th.
(24) Ninghui Li(#); Qingqing Ye; Mobile data collection and analysis with local differential privacy, IEEE International Conference on Mobile Data Management, China, 2019-6-12 to 2019-6-14. Grant acknowledged 3rd.
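Monographs
(1) Xiao Pan; Zheng Huo; Xiaofeng Meng; Privacy Management for Location Big Data, China Machine Press, 2017.
Patents
(1) Xiaofeng Meng; Zhijuan Du; A Knowledge Graph Embedding Method Fusing Multiple Kinds of Background Knowledge, 2020-3-31 to 2040-3-31, China, ZL201710549884.X.
(2) Xiaofeng Meng; Chunkai Wang; A Processing Method for Online Joins over Skewed Data Streams, 2019-11-15 to 2039-11-15, China, ZL201710542086.4.
(3) Xiaofeng Meng; Minjie Zhu; A Quantitative Assessment Method for the Privacy Risk of Data-Collecting Apps, 2019-12-13 to 2039-12-13, China, ZL201710623492.3.
(4) Xiaofeng Meng; Yi Zhang; A Highly Adaptive Knowledge Base Completion Method, 2020-1-10 to 2030-1-10, China, ZL201710630354.8.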
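Conference Talks
(1) Xiaofeng Meng; Scientific Big Data Management Systems: Practice and Outlook, 4th Scientific Data Conference, Kunming, Yunnan, 2017-8-2 to 2017-8-4.
(2) Xiaofeng Meng; Writing and Reflections on Academic Monographs, CCF YOCSEF Visit to Springer Nature event, Springer Beijing office, 2017-9-15 to 2017-9-15.
(3) Xiaofeng Meng; Outlook on Scientific Big Data Management, 2017 Training Course for Officials in Charge of Cybersecurity and Informatization, Chengdu, 2017-11-21 to 2017-11-21.
(4) Xiaofeng Meng; Big Data and Knowledge Graph Construction, "Big Data and Knowledge Graphs" Academic Symposium, Xizang Minzu University, 2018-6-8 to 2018-6-8.
(5) Xiaofeng Meng; Social Computing: Current State and Outlook, High-Level Academic Symposium on Big Data and the Transformation of the Social Sciences, International Conference Center, Science Park, Harbin Institute of Technology, 2019-1-12 to 2019-1-13.
(6) Xiaofeng Meng; Data Governance Theory and Practice with Chinese Characteristics, 17th China Conference on Information Systems and Applications, Guangzhou, 2020-9-23 to 2020-9-25.
(7) Xiaofeng Meng; Raising the Impact of Chinese-Language Journals in the Era of Data Intelligence, China Association for Science and Technology "Excellence Plan Talent Training Program", Suzhou, 2020-10-16 to 2020-10-17.
(8) Xiaofeng Meng; Problems and Challenges in Data Governance, 1st Symposium on the Digital Economy and Population Development, Ganzhou, 2020-11-15 to 2020-11-15.
(9) Xiaofeng Meng; Data Governance in the Era of Big Data Intelligence, Annual Academic Meeting of the Lenovo Research Expert Committee, Beijing, 2020-12 to 2020-12.
(10) Qingqing Ye; Haibo Hu; Local Differential Privacy: Tools, Challenges, and Opportunities, WISE 2019 (International Conference on Web Information Systems Engineering), Hong Kong, 2020-1-19 to 2020-1-22.
(11) Xiaojian Zhang; Qingqing Ye; Lin Sun; Differential Privacy: Basic Theory and Practical Applications, 2020 Cyberspace Security Summer Workshop, Nanjing University of Aeronautics and Astronautics, 2020-8-17 to 2020-8-17.
(12) Xiaojian Zhang; Privacy Protection: Methods and Applications, 12th International Conference on Management of e-Commerce and e-Government (ICMeCG 2018), Zhengzhou, 2018-9-21 to 2018-9-23.
(13) Zhijuan Du; Knowledge Graph Embedding Representations, "Big Data and Knowledge Graphs" Academic Symposium, Xizang Minzu University, 2018-6-8 to 2018-6-8.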
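(14) Shuo Wang; Knowledge Graph Fundamentals and Examples, "Big Data and Knowledge Graphs" Academic Symposium, Xizang Minzu University, 2018-6-8 to 2018-6-8.
(15) Yi Zhang; The Academic Space System: ScholarSpace, "Big Data and Knowledge Graphs" Academic Symposium, Xizang Minzu University, 2018-6-8 to 2018-6-8.
(16) Xinle Wu; Relation Extraction Based on Deep Learning, "Big Data and Knowledge Graphs" Academic Symposium, Xizang Minzu University, 2018-6-8 to 2018-6-8.
(17) Xiaojian Zhang; Differential Privacy Protection: Methods and Applications, 4th Conference on Data Security and Privacy Protection (ChinaPrivacy 2019), Guilin, Guangxi, 2019-10-25 to 2019-10-27.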
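Software Copyrights
(1) Xiaofeng Meng; GWAC Astronomical Big Data System, 2017SR465001, original acquisition, full rights, 2017-5-29.
(2) Junxu Liu; OrientDP Differential Privacy Principle Demonstration System, 2018SR029766, original acquisition, full rights, 2017-5-30.
(3) Xiaofeng Meng; Storm Parameter Tuning and Detection Software, 2017SR464912, original acquisition, full rights, 2017-5-18.
(4) Minjie Zhu; Mobile User Privacy Risk Quantification and Monitoring System, 2018SR028674, original acquisition, full rights, 2017-8-6.
(5) Yi Zhang; Interactive Analysis System for Scholar Academic Knowledge Graphs, 2017SR464975, original acquisition, full rights, 2017-1-1.
(6) Yongtai Wu; ScholarFinding Scholar Relationship Discovery System, 2019SR0512967, original acquisition, full rights, 2018-12-6.
(7) Yujie Shao; Field-Classification-Based Ranking System for Chinese Universities, 2019SR0512976, original acquisition, full rights, 2018-12-7.
(8) Yongjie Du; Chen Yang; Distributed Astronomical Big Data Generator, 2019SR0512984, original acquisition, full rights, 2018-12-24.
(9) Zhiqiang Duan; Chen Yang; Astronomical Data Processing Pipeline, 2019SR0577396, original acquisition, full rights, 2018-12-24.
(10) Yongjie Du; Chen Yang; Real-Time Astronomical Big Data Query Engine, 2019SR0625233, original acquisition, full rights, 2018-12-24.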
