Python_jieba library


jieba is an important third-party Python library for Chinese word segmentation. It splits a piece of Chinese text into a sequence of words. The name means "stutter", because the segmented output reads like someone stuttering through a sentence.


Example 1:

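import jieba  # third-party Chinese word segmentation library

# Read the input text (assumes data.txt is UTF-8 encoded)
with open('data.txt', 'r', encoding='utf-8') as f:
    lines = f.readlines()

# 'w+' creates out.txt if it is missing and allows reading it back
with open('out.txt', 'w+', encoding='utf-8') as f:
    for line in lines:
        line = line.strip()          # remove leading and trailing whitespace
        if not line:
            continue                 # skip blank lines
        wordList = jieba.lcut(line)  # segment the line in precise mode
        # One word per line; the trailing newline keeps the last word of
        # this line from being glued to the first word of the next line
        f.write('\n'.join(wordList) + '\n')
    f.seek(0)                        # rewind to read back what was written
    txt = f.read()
print(txt)

jieba.lcut() is one of the most commonly used jieba functions. It segments text in precise mode and returns the words as a list.

Output:

Building prefix dict from the default dictionary ...
Loading model from cache C:\Users\蒙山知府\AppData\Local\Temp\jieba.cache
Loading model cost 0.931 seconds.
Prefix dict has been built successfully.
内容简介
编辑
整个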
故事
在
东汉
末年
至
西晋
初
的
历史
大
背景
下
展开
。
东汉
末年
,
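
Example 1 writes the segmented words and then reads them back through the same file handle, so the open mode matters. The sketch below uses only the standard library and a temporary directory to show how 'r+' and 'w+' differ. The function name mode_demo and the file name demo.txt are made up for illustration.

```python
import os
import tempfile

def mode_demo():
    """Return what each open mode does, using a throwaway file."""
    results = {}
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, 'demo.txt')  # hypothetical file name

        # 'r+' fails when the file does not exist yet
        try:
            open(path, 'r+', encoding='utf-8').close()
            results['r+ on missing file'] = 'opened'
        except FileNotFoundError:
            results['r+ on missing file'] = 'FileNotFoundError'

        # 'w+' creates the file; seek(0) rewinds before reading back
        with open(path, 'w+', encoding='utf-8') as f:
            f.write('old content\n')
            f.seek(0)
            results['w+ read back'] = f.read()

        # 'r+' on an existing file overwrites from the start but keeps the tail
        with open(path, 'r+', encoding='utf-8') as f:
            f.write('new')
            f.seek(0)
            results['r+ keeps tail'] = f.read()

        # 'w+' empties the file before writing
        with open(path, 'w+', encoding='utf-8') as f:
            f.write('new\n')
            f.seek(0)
            results['w+ truncates'] = f.read()
    return results

if __name__ == '__main__':
    for name, value in mode_demo().items():
        print(name, '->', repr(value))
```

In short, 'r+' needs an existing file and can leave stale text behind when the new content is shorter, while 'w+' always starts from an empty file.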

Example 2:

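import jieba

# Read the whole file at once (assumes data.txt is UTF-8 encoded)
with open('data.txt', 'r', encoding='utf-8') as f1:
    data = f1.read()

data1 = jieba.lcut(data)
seen = set()  # words already written, for fast duplicate checks
with open('out1.txt', 'w+', encoding='utf-8') as f:
    for x in data1:
        # keep words of at least 3 characters, first occurrence only
        if len(x) >= 3 and x not in seen:
            f.write(x + '\n')
            seen.add(x)
    f.seek(0)  # rewind to read back what was written
    txt = f.read()
print(txt)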

data.txt text (the file itself is in Chinese, as the output shows; an English translation is given here):

Differences and connections between artificial intelligence, machine learning and deep learning
Some say artificial intelligence (AI) is the future, some say it is science fiction, and some say it is already part of our daily lives. All of these statements can be correct; it depends on which kind of artificial intelligence you mean.
Earlier this year, Google DeepMind's AlphaGo defeated the South Korean 9-dan Go master Lee Se-dol. When the media described DeepMind's victory, they used all three terms: artificial intelligence (AI), machine learning, and deep learning. All three played a role in AlphaGo's win over Lee Se-dol, but they are not the same thing.

Today we use the simplest approach, concentric circles, to visualize how the three relate to one another and how they are applied.

Differences and connections between artificial intelligence, machine learning and deep learning

As the figure above shows, artificial intelligence is the earliest and largest idea, the outermost circle; next comes machine learning, which appeared somewhat later; innermost is deep learning, the core driver of today's artificial intelligence explosion.

In the 1950s, artificial intelligence was once viewed with great optimism. Later, smaller subsets of artificial intelligence developed: first machine learning, then deep learning, which is itself a subset of machine learning. Deep learning has had an unprecedented impact.
| From concept to prosperity

In 1956, several computer scientists gathered at the Dartmouth Conferences and proposed the concept of "artificial intelligence". Since then, AI has lingered in people's minds and slowly incubated in research laboratories. Over the following decades, opinion on artificial intelligence swung between two poles: it was either hailed as the prophecy of a bright future for human civilization, or dismissed as the fantasy of technology madmen and thrown on the scrap heap. Frankly, both voices still coexisted until 2012.

Over the past few years, especially since 2015, AI has exploded. Much of this is due to the widespread use of GPUs, which make parallel computing faster, cheaper, and more effective. It is also due to the combination of practically unlimited storage capacity and a sudden flood of data (big data): image data, text data, transaction data, and map data have all grown explosively.

Let us trace, step by step, how computer scientists developed artificial intelligence from its earliest beginnings into applications used every day by hundreds of millions of users.

| Artificial Intelligence (AI): giving machines human intelligence

Differences and connections between artificial intelligence, machine learning and deep learning

As early as that conference in the summer of 1956, the pioneers of artificial intelligence dreamed of using the computers that had just appeared to build complex machines with the same essential characteristics as human intelligence. This is what we now call "strong AI" (General AI). Such an all-powerful machine would have all of our senses (perhaps even more than we have) and all of our reason, and could think as we do.

We keep seeing such machines in movies: friendly ones like C-3PO in Star Wars, and evil ones like the Terminator. Strong AI currently exists only in movies and science fiction, and the reason is easy to understand: we have not been able to build it, at least not yet.

What we can achieve today is generally called "weak AI" (Narrow AI). Weak AI is technology that can perform a specific task as well as, or even better than, a person. Examples include image classification on Pinterest and face recognition on Facebook.

These are examples of weak AI in practice. These technologies implement specific parts of human intelligence. But how are they implemented? Where does this intelligence come from? This brings us to the next circle inward, machine learning.

Output:

Building prefix dict from the default dictionary ...
Loading model from cache C:\Users\蒙山知府\AppData\Local\Temp\jieba.cache
Loading model cost 0.947 seconds.
Prefix dict has been built successfully.
人工智能
日常生活
一部分
早些时候
Google
DeepMind
AlphaGo
machine
learning
deep
同心圆
可视化
展现出
如上图
五十年代
曾一度
前所未有
1956
计算机
科学家
达特茅斯
Dartmouth
Conferences
实验室
几十年
极反转
被称作
人类文明
垃圾堆
坦白说
2012
2015
GPU
广泛应用
并行计算
组合拳
一点点
Artificial
Intelligence
General
无所不能
星球大战
3PO
终结者
科幻小说
不难理解
Narrow
Pinterest
Facebook
人脸识别
在实践中
从何而来
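
The filter in Example 2 keeps each word of at least three characters the first time it appears. Below is a minimal sketch of that step on its own, assuming the tokens have already been produced (for example by jieba.lcut). The function name unique_long_words and the sample token list are made up for illustration.

```python
def unique_long_words(tokens, min_len=3):
    """Return tokens with len >= min_len, first occurrence only, in order."""
    long_tokens = (t for t in tokens if len(t) >= min_len)
    # dict keys are unique and keep insertion order (Python 3.7+),
    # so this drops duplicates without reordering the words
    return list(dict.fromkeys(long_tokens))

if __name__ == '__main__':
    # hypothetical token list standing in for the result of jieba.lcut
    sample = ['人工智能', '是', '计算机', '科学', '人工智能', 'GPU', '计算机']
    print(unique_long_words(sample))
```

Note that len() counts characters, not words, which is why ASCII tokens such as GPU and 1956 also pass the length test in the output above.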

Example 3:

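import jieba

# Read the whole file at once (assumes data.txt is UTF-8 encoded)
with open("data.txt", 'r', encoding='utf-8') as f:
    data = f.read()

d = {}  # word -> number of occurrences
data2 = jieba.lcut(data)
for D in data2:
    if len(D) >= 3:
        # dict.get(key, default) returns d[key] if the key exists, otherwise
        # default, so a word seen for the first time starts from 0
        d[D] = d.get(D, 0) + 1
ls = list(d.items())
ls.sort(key=lambda x: x[1], reverse=True)  # sort by frequency, highest first
with open("out2.txt", 'w+', encoding='utf-8') as f:
    for L in ls:
        f.write(L[0] + ":" + str(L[1]) + '\n')
    f.seek(0)  # rewind to read back what was written
    txt = f.read()
print(txt)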

Output:

Building prefix dict from the default dictionary ...
Loading model from cache C:\Users\蒙山知府\AppData\Local\Temp\jieba.cache
Loading model cost 0.906 seconds.
Prefix dict has been built successfully.
人工智能:24
同心圆:3
计算机:3
一部分:2
DeepMind:2
AlphaGo:2
learning:2
1956:2
科学家:2
日常生活:1
早些时候:1
Google:1
machine:1
deep:1
可视化:1
展现出:1
如上图:1
五十年代:1
曾一度:1
前所未有:1
达特茅斯:1
Dartmouth:1
Conferences:1
实验室:1
几十年:1
极反转:1
被称作:1
人类文明:1
垃圾堆:1
坦白说:1
2012:1
2015:1
GPU:1
广泛应用:1
并行计算:1
组合拳:1
一点点:1
Artificial:1
Intelligence:1
General:1
无所不能:1
星球大战:1
3PO:1
终结者:1
科幻小说:1
不难理解:1
Narrow:1
Pinterest:1
Facebook:1
人脸识别:1
在实践中:1
从何而来:1
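
The counting and sorting in Example 3 can also be written with collections.Counter from the standard library. This is an alternative sketch, not the code that produced the output above; the function name word_frequencies and the sample tokens are hypothetical.

```python
from collections import Counter

def word_frequencies(tokens, min_len=3):
    """Count tokens of at least min_len characters, most frequent first."""
    counts = Counter(t for t in tokens if len(t) >= min_len)
    # most_common() sorts by count, highest first; words with equal
    # counts stay in the order they were first seen
    return counts.most_common()

if __name__ == '__main__':
    # hypothetical token list standing in for the result of jieba.lcut
    sample = ['人工智能', '计算机', '的', '人工智能', '同心圆', '计算机', '人工智能']
    for word, count in word_frequencies(sample):
        print(word + ':' + str(count))
```

Because ties keep first-seen order, the result is ordered the same way as the stable list.sort call in Example 3.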
Origin blog.csdn.net/wayne6515/article/details/104456452