A Collection of Paper Abstracts for Computer Science English [2]

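A note up front: I write for 【程序员宝藏】 and focus on original, practical technical content. I love technology, open source and sharing. My two series, 【计算机基础面试问题】 (computer fundamentals interview questions) and 【计算机基础主干知识】 (core computer fundamentals), have been well received, and more original series are on the way. If you are interested in computer fundamentals or programming, follow along and we can grow together!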

Reference article for this series:

Computer Science English (essential for improving your technical English)

III. Big Data

1. A Big Data Security Scheme Based on High-Performance Cryptography Implementation

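Abstract (translated from the Chinese): The current trend in information technology is artificial intelligence based on big data computing. Big data processing under cloud computing, fog computing, edge computing and similar models is a powerful driver of economic development, but it also faces enormous security risks. Cryptography is the core technology for big data security. The confidentiality, authentication and privacy protection of big data require solving high-speed encryption and decryption of massive data, highly concurrent large-scale user authentication, and privacy protection and computation over encrypted data, all of which depend on fast implementations of the underlying cryptographic algorithms. For the logical architecture of big data security applications, fast algorithms are given for the underlying Chinese national standard algorithms SM4-XTS and SM2 and for large-integer modular exponentiation. They are verified on a Xilinx KC705 development board, and experimental data are reported. The experiments show that the work is advanced in three respects: 1) the SM4-XTS implementation fills a domestic gap in this direction; 2) the SM2 signature achieves high performance, ahead of comparable domestic products; 3) large-integer modular exponentiation is applied to the productization of homomorphic cryptography, filling a domestic gap for such products.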

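Keywords (translated from the Chinese): SM4-XTS, SM2, large-integer modular exponentiation, fast implementation of cryptographic algorithms, big data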

Abstract: The current trend in information technology is artificial intelligence built on big data computing. Big data processing under cloud computing, fog computing, edge computing and other computing models has contributed enormously to economic development, but it also brings serious data security risks. Cryptography is the core technology for big data security. Ensuring the confidentiality, authentication and privacy of big data requires solving three problems: first, high-speed encryption and decryption of massive data; second, authentication of large numbers of highly concurrent users; third, privacy protection in data mining. Solving these problems requires fast implementations of the underlying cryptographic algorithms. Targeting the logical architecture of big data security applications, this paper gives fast algorithms for the standard cryptographic algorithms SM4-XTS and SM2 and for modular exponentiation of large integers, and verifies them on a Xilinx KC705 development board. The experimental results show that the work is advanced in three respects: 1) The implementation of SM4-XTS fills a gap in this direction in China. 2) The SM2 signature implementation achieves high performance, ahead of comparable domestic products. 3) The modular exponentiation is applied to the productization of homomorphic cryptography, and its performance is ahead of other similar products.

Key words: SM4-XTS, SM2, modular exponentiation, high-speed implementation of cryptographic algorithm, big data
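To make "modular exponentiation of large integers" concrete, here is a minimal software sketch of the textbook square-and-multiply method. It is only an illustration: the abstract describes an FPGA implementation whose design is not given here, and the function name `mod_pow` is made up for this example.

```python
def mod_pow(base: int, exponent: int, modulus: int) -> int:
    """Compute (base ** exponent) % modulus by right-to-left square-and-multiply.

    Illustrative sketch only; not the hardware algorithm from the paper.
    """
    if modulus <= 0:
        raise ValueError("modulus must be positive")
    if exponent < 0:
        raise ValueError("exponent must be non-negative")

    result = 1 % modulus          # handles modulus == 1
    base %= modulus
    while exponent > 0:
        if exponent & 1:          # current lowest bit of the exponent is set
            result = (result * base) % modulus
        base = (base * base) % modulus   # square for the next bit
        exponent >>= 1
    return result


# Small example, plus a large-integer check against Python's built-in pow.
small = mod_pow(4, 13, 497)
big_ok = mod_pow(2**521 - 1, 2**255 - 19, 2**127 - 1) == pow(2**521 - 1, 2**255 - 19, 2**127 - 1)
print(small, big_ok)
```

The loop runs once per bit of the exponent, so the cost grows with the bit length of the exponent rather than with its value, which is why this operation is practical for the very large integers used in homomorphic schemes.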

2. Research Review of Knowledge Graph and Its Application in Medical Domain

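Abstract (translated from the Chinese): With the arrival of the medical big data era, knowledge interconnection has attracted wide attention. Extracting useful medical knowledge from massive data is the key to medical big data analysis. Knowledge graph technology provides a means of extracting structured knowledge from massive text and images, and its combination with big data and deep learning is becoming a core driving force for artificial intelligence. Knowledge graphs have broad application prospects in the medical domain, where applied research will play an important role in resolving the conflict between the short supply of high-quality medical resources and the continuously growing demand for medical services. Research on medical knowledge graphs is still exploratory, and existing knowledge graph techniques commonly suffer from low efficiency, many restrictions and poor extensibility in the medical domain. First, given the highly specialized and structurally complex nature of medical big data, the architecture and construction techniques of medical knowledge graphs are analyzed comprehensively. Second, the key techniques and research progress of the four modules of a medical knowledge graph (knowledge representation, knowledge extraction, knowledge fusion and knowledge reasoning) are surveyed, and these techniques are analyzed and compared experimentally. In addition, the current applications of medical knowledge graphs in clinical decision support, intelligent semantic retrieval for healthcare, medical question answering and other medical services are introduced. Finally, the problems and challenges of current research are discussed and analyzed, and future prospects are outlined.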

Keywords (translated from the Chinese): knowledge graph, smart healthcare, big data, knowledge fusion, natural language processing

Abstract: With the advent of the medical big data era, knowledge interconnection has received extensive attention. How to extract useful medical knowledge from massive data is the key to medical big data analysis. Knowledge graph technology provides a means to extract structured knowledge from massive texts and images. The combination of knowledge graphs, big data technology and deep learning technology is becoming the core driving force for the development of artificial intelligence. Knowledge graph technology has broad application prospects in the medical domain, where it will play an important role in resolving the contradiction between the insufficient supply of high-quality medical resources and the continuously increasing demand for medical services. At present, research on medical knowledge graphs is still in the exploratory stage. Existing knowledge graph technology generally suffers from low efficiency, many restrictions and poor extensibility in the medical domain. This paper first analyzes the architecture and construction technology of medical knowledge graphs in light of the strong specialization and complex structure of big data in the medical domain. Second, the key technologies and research progress of the four modules of a medical knowledge graph, namely knowledge extraction, knowledge representation, knowledge fusion and knowledge reasoning, are summarized. In addition, the application status of medical knowledge graphs in clinical decision support, intelligent medical semantic retrieval, medical question answering systems and other medical services is introduced. Finally, the existing problems and challenges of current research are discussed and analyzed, and its future development is considered.

Key words: knowledge graph, smart healthcare, big data, knowledge fusion, natural language processing
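As a small illustration of two of the modules named in the abstract, knowledge representation and knowledge reasoning, the sketch below stores facts as (head, relation, tail) triples and derives new `is_a` facts by transitivity. The triples, the relation names and the function `infer_is_a` are all invented for this example and are not taken from the surveyed paper.

```python
def infer_is_a(triples):
    """Return the input triples plus every is_a fact implied by transitivity.

    Toy rule-based reasoning: if (a, is_a, b) and (b, is_a, c), add (a, is_a, c).
    """
    facts = set(triples)
    changed = True
    while changed:                      # repeat until no new fact is derived
        changed = False
        is_a_pairs = [(h, t) for h, r, t in facts if r == "is_a"]
        for a, b in is_a_pairs:
            for c, d in is_a_pairs:
                new_fact = (a, "is_a", d)
                if b == c and a != d and new_fact not in facts:
                    facts.add(new_fact)
                    changed = True
    return facts


# Hypothetical toy facts, for illustration only.
triples = {
    ("influenza", "is_a", "viral infection"),
    ("viral infection", "is_a", "infectious disease"),
    ("influenza", "has_symptom", "fever"),
}
inferred = infer_is_a(triples)
print(sorted(inferred - triples))
```

Real medical knowledge graphs use far richer representations and reasoning methods; the point here is only that a triple set plus a rule is enough to derive a fact that was never stated explicitly.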

3. Learning Behavior Analysis and Prediction Based on MOOC Data

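Abstract (translated from the Chinese): With the rise of massive open online courses (MOOCs) over the past two years, educational big data analysis is becoming an emerging research direction. In the autumn of 2013, Peking University launched six MOOCs on Coursera. By mining the massive learning behavior data from more than 80,000 enrollments in these six courses, this work tries to show several facets of MOOC learning activity. It is also the first to classify learners according to the characteristics of learning behavior in Chinese-language MOOCs, in order to examine more deeply the relationship between learning behavior and learning outcomes. On that basis, it is also the first to predict learners' final outcomes from a selection of their typical behavioral features. The data show that feature analysis based on learning behavior can effectively determine whether a learner will complete the course and earn a certificate, and can identify potentially serious learners, which provides a basis for more accurate evaluation of MOOC teaching in the future.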

Keywords (translated from the Chinese): MOOC, learner type, learning behavior, data analysis, grade prediction

Abstract: With the boom in MOOCs (massive open online courses) over the past two years, educational data analysis has become a promising research field in which the quality of teaching and learning can be, and is being, quantified to improve educational effectiveness and even to advance modern higher education. In the autumn of 2013, Peking University released its first six courses on the Coursera platform. By mining and analyzing the massive learning behavior data of over 80,000 participants in these courses, this paper endeavors to show several facets of learning activity in MOOCs. Meanwhile, according to the characteristics of learning behavior in Chinese MOOCs, learners are classified into several groups, and the relationship between their learning behavior and performance is then studied thoroughly. Based on this work, we find that a learner's performance, in terms of whether he or she eventually earns a certificate, can be predicted from several features of his or her learning behavior. Experimental results indicate that a model trained on these features can effectively estimate whether a learner is likely to complete the course successfully. In addition, this method has the potential to partially evaluate the quality of both teaching and learning in practice.

Key words: massive open online course (MOOC), engagement style, learning behavior, data analysis, performance prediction
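The idea of predicting certification from behavioral features can be sketched with a tiny logistic regression. Everything specific below is an assumption made for illustration: the three features (fraction of videos watched, fraction of quizzes submitted, scaled forum activity), the eight hand-written learners, and the choice of logistic regression itself, since the abstract does not name the features or the model used in the paper.

```python
import math


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logistic(X, y, lr=0.5, epochs=5000):
    """Fit weights and bias by full-batch gradient descent on the log loss."""
    n, n_features = len(X), len(X[0])
    w, b = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(n_features):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(w, b, x) -> bool:
    """True if the model estimates that the learner earns a certificate."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b) >= 0.5


# Hypothetical learners: [videos watched, quizzes submitted, forum activity], all in 0..1.
X = [
    [0.90, 0.80, 0.3], [0.80, 0.90, 0.1], [0.95, 1.00, 0.6], [0.70, 0.75, 0.0],
    [0.20, 0.10, 0.0], [0.10, 0.00, 0.1], [0.30, 0.20, 0.0], [0.05, 0.00, 0.0],
]
y = [1, 1, 1, 1, 0, 0, 0, 0]   # 1 = certified, 0 = not certified (made-up labels)

w, b = train_logistic(X, y)
print(predict(w, b, [0.85, 0.90, 0.2]), predict(w, b, [0.10, 0.05, 0.0]))
```

With real MOOC logs the interesting work is in choosing and extracting the features; the classifier on top can be this simple or considerably more elaborate.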

4. Knowledge Graph Construction Techniques

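Abstract (translated from the Chinese): Google's knowledge graph technology has attracted wide attention in recent years, but because little technical material has been publicly disclosed, its substance and value are hard to see clearly. Starting from the definition and technical architecture of the knowledge graph, this paper gives a comprehensive bottom-up analysis of the key techniques involved in building one. 1) The definition and meaning of the knowledge graph are explained, and a technical framework for knowledge graph construction is given, divided into three layers according to the level of abstraction of the input knowledge material: the information extraction layer, the knowledge fusion layer and the knowledge processing layer. 2) The state of research on the key techniques in each layer is described by category, gradually revealing how knowledge graph technology works and how it relates to neighboring disciplines. 3) The major challenges and key problems currently facing knowledge graph construction are summarized.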

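Keywords (translated from the Chinese): knowledge graph, Semantic Web, information retrieval, semantic search engine, natural language processing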

Abstract: Google’s knowledge graph technology has drawn a lot of research attention in recent years. However, due to the limited public disclosure of technical details, people find it difficult to understand the substance and value of this technology. In this paper, we introduce the key techniques involved in the construction of a knowledge graph in a bottom-up way, starting from a clearly defined concept and a technical architecture of the knowledge graph. First, we describe in detail the definition and meaning of the knowledge graph, and then propose a technical framework for knowledge graph construction, in which the construction process is divided into three layers according to the level of abstraction of the input knowledge materials: the information extraction layer, the knowledge integration layer, and the knowledge processing layer. Second, the research status of the key technologies for each layer is surveyed comprehensively and examined critically, in order to gradually reveal how knowledge graph technology works, its state-of-the-art progress, and its relationship with related disciplines. Finally, five major research challenges in this area are summarized, and the corresponding key research issues are highlighted.

Key words: knowledge graph, semantic Web, information retrieval, semantic search engine, natural language processing
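The three-layer framework in the abstract (information extraction, knowledge fusion or integration, knowledge processing) can be pictured with a deliberately tiny pipeline. The sentences, the single regular-expression pattern, the alias table and all function names are hypothetical stand-ins; real systems replace each step with far more capable techniques.

```python
import re
from collections import defaultdict

# Hypothetical raw text; only one sentence shape is recognized in this sketch.
SENTENCES = [
    "Peking is located in China.",
    "Beijing is located in China.",
    "Shanghai is located in China.",
]
ALIASES = {"Peking": "Beijing"}   # assumed alias table used for entity alignment
PATTERN = re.compile(r"^(?P<head>.+?) is located in (?P<tail>.+?)\.$")


def extract(sentences):
    """Layer 1, information extraction: turn text into (head, relation, tail) triples."""
    triples = []
    for sentence in sentences:
        match = PATTERN.match(sentence)
        if match:
            triples.append((match["head"], "located_in", match["tail"]))
    return triples


def fuse(triples, aliases):
    """Layer 2, knowledge fusion: map different names of one entity to a canonical name."""
    return [(aliases.get(h, h), r, aliases.get(t, t)) for h, r, t in triples]


def process(triples):
    """Layer 3, knowledge processing: deduplicate and index the facts by head entity."""
    graph = defaultdict(set)
    for h, r, t in triples:
        graph[h].add((r, t))
    return graph


graph = process(fuse(extract(SENTENCES), ALIASES))
for entity in sorted(graph):
    print(entity, sorted(graph[entity]))
```

After fusion, the facts about "Peking" and "Beijing" collapse into a single node, which is the kind of redundancy the fusion layer is meant to remove before the graph is used.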

5. A Parallel Algorithm for Mining Interactive Features from Large Scale Sequences

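Abstract (translated from the Chinese): Sequences are an important data type found across many application domains, and sequence-based feature selection has broad practical uses. An interactive feature set is a group of features whose correlation with the target, taken as a whole, is significantly stronger than that of any single member. Mining interactive features from large-scale sequences runs into a combinatorial explosion over loci and is computationally very demanding. To address this, with high-throughput sequencing data from biology as the background, a new high-order interactive feature mining algorithm based on parallel processing and evolutionary computation is proposed. The number of loci is the fundamental factor limiting the efficiency of interaction mining. The sequence-block parallel strategy of existing methods is abandoned in favor of a loci-block parallel approach, which has an inherent efficiency advantage. Further, the concept of the maximal allelic common subsequence (MACS) is proposed, and a MACS-based strategy for partitioning feature regions is designed. This strategy narrows the search for interactive features to many "fragment" spaces and guarantees that no interactive features exist across different fragments, avoiding the high communication cost caused by computational coupling. Interactive feature selection is then carried out with a parallel ant colony algorithm based on substitution search. Experimental results on many real and synthetic datasets confirm that the proposed PACOIFS algorithm outperforms comparable algorithms in both effectiveness and efficiency.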

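Keywords (translated from the Chinese): interactive features, data mining, large-scale sequences, ant colony algorithm, parallel computing, maximal allelic common subsequence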

Abstract: Sequences are an important type of data that exists widely in various domains, so feature selection from sequence data is of practical significance in many applications. Interactive features are a set of features, each of which is weakly correlated with the target, but which as a whole are strongly correlated with the target. Mining interactive features from large-scale sequence data is very challenging because of the combinatorial explosion of loci. To address the problem, against the background of high-throughput sequencing in biology, a parallel evolutionary algorithm for mining high-order interactive features is proposed in this paper. Instead of a sequence-block-based parallel strategy, the work adopts a loci-based one, since the number of loci is the fundamental factor that restricts efficiency. Further, we propose the concept of the maximal allelic common subsequence (MACS) and a MACS-based strategy for partitioning feature regions. Under this strategy, the search range for interactive features is narrowed to many fragment spaces, and interactions are guaranteed not to exist between different fragments. Finally, a parallel ant colony algorithm based on substitution search is developed to conduct interactive feature selection. Extensive experiments on real and synthetic datasets show that the proposed PACOIFS algorithm is superior to competing algorithms in both efficiency and effectiveness.

Key words: interactive features, data mining, large scale sequence, ant colony algorithm, parallel computation, maximal allelic common subsequence (MACS)
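The definition of interactive features (individually weak, jointly strong) is easiest to see on an extreme toy case: a target that is the XOR of two binary features. The sketch below measures relevance with mutual information, which is a choice made for this example only; the abstract does not say which correlation measure the paper uses, and the data are synthetic.

```python
import math
from collections import Counter
from itertools import product


def mutual_information(xs, ys) -> float:
    """Empirical mutual information I(X; Y) in bits between two equal-length sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px, py = Counter(xs), Counter(ys)
    mi = 0.0
    for (x, y), count in joint.items():
        p_xy = count / n
        mi += p_xy * math.log2(p_xy / ((px[x] / n) * (py[y] / n)))
    return mi


# All four combinations of two binary features; the target is their XOR.
samples = list(product([0, 1], repeat=2))
feature_a = [a for a, _ in samples]
feature_b = [b for _, b in samples]
target = [a ^ b for a, b in samples]

mi_a = mutual_information(feature_a, target)                          # feature A alone
mi_b = mutual_information(feature_b, target)                          # feature B alone
mi_ab = mutual_information(list(zip(feature_a, feature_b)), target)   # A and B jointly
print(mi_a, mi_b, mi_ab)
```

Each feature alone carries no information about the target, while the pair determines it completely. A selector that scores features one at a time would discard both, which is why interaction mining has to search over combinations of loci and quickly hits the combinatorial explosion the abstract mentions.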
