This article describes basic data storage and interaction in Python, covering:
File Storage: TXT, JSON, CSV
Relational database: MySQL (pymysql module)
Non-relational databases: MongoDB (pymongo module), Redis (redis module)
1. Text storage: a simple example that crawls Zhihu Explore topics and stores the questions and answers in a TXT file
## Text storage: crawl Zhihu Explore topics and store the questions and answers in a TXT file
from pyquery import PyQuery as pq
import requests

url = 'https://www.zhihu.com/explore'
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.130 Safari/537.36'
}
# Download the page and parse it with pyquery
html = requests.get(url=url, headers=headers).text
doc = pq(html)
items = doc.find('.ExploreCollectionCard-contentItem').items()
for item in items:
    question = item('.ExploreCollectionCard-contentTitle').text()
    # The excerpt has the form "author:answer"; split it on the colon
    parts = item('.ExploreCollectionCard-contentExcerpt').text().split(':')
    author = parts[0]
    answer = ''.join(parts[1:])
    # Append each record to the file, followed by a separator line of 50 '=' characters
    with open('zhihu_explore.txt', 'a', encoding='utf-8') as f:
        f.write('\n'.join([question, author, answer]))
        f.write('\n' + '=' * 50 + '\n')
'''
Saved text:
? Year can memorize new concepts 3.4 text you lu luce possible. . I successfully completed. WANG Jiang-tao is a six-step method. . See the results of human flesh my English learning laboratory assistant in human flesh. Human flesh experimenter drifting away. . . I checked my learning log. Year or complete new concept of drifting back four of 1234. From the beginning of December 2016, back-to-March 2018. From the beginning of December 2016, 2017 4 ...
==================================================
there are no super nice symbol can put a nickname? The rain stopped the enemy scattered flowers super cute ah ¹⁹⁹⁴ ²⁰⁰⁷ ¹⁹⁹⁵ ²⁰⁰⁸ ¹⁹⁹⁶ ⁰¹²³⁴⁵⁶⁷⁸⁹ ²⁰⁰⁹ ¹⁹⁹⁷ ²⁰¹⁰ ¹⁹⁹⁸ ²⁰¹¹ ¹⁹⁹⁹ ...
==================================================
which overturned the people's understanding of the history of archaeological discoveries? Qingyuan Cultural Heritage in 2002, in lajia Qinghai, the archaeological team members accidentally discovered on the floor of turn buckle bowl of noodles 4,000 years ago leaned see only remaining Qijia culture upside down basket patterns red pottery bowl, the bowl was retained slug visible traces of yellow crimped strand material, weathering very serious, only a little thin material remained epidermal scientific identification, and found that the main component of millet, ...
==================================================
wine really tastes it? Xugong Zi Send you a wine list, all come back to say hello again to drink does not taste good. Two years ago, I do not drink, drink liquor only a spicy flavor. Begun to taste the wine in the cup is vibrato "lover's tears." Not to mention tasty, it can not be difficult to drink, the main point is that heart literary mischief; with the increasing need to work and work stress, drinking, wine tasting into a routine, only to find the wine ...
==================================================
traditional methods to solve the short text similar to BM25 the degree of the problem Cong NLP introduced before it TF-IDF short text similarity calculation, see Cong NLP traditional TF-IDF method to solve the problem of short text similarity, thinking to put this series introduce all, can be considered their induction summary, today will introduce how to use the short text BM25 algorithm to calculate the similarity. Previous short text similarity algorithm research articles, we held over such a scene, in ...
==================================================
Shannon read | ReZero: weighted residuals connection accelerate the convergence depth model Shannon Technology Posted ReZero is All You Need: Fast convergence at Large Depth authors Thomas Bachlechner, Bodhisattwa Prasad Majumder, Huanru Henry Mao, Garrison W. Cottrell, Julian McAuley paper links https://arxiv.org/abs/2003.04887 code to connect https://github.com/majumderb/rezero...
==================================================
100 a material site, we've used up Zhihu user a long time not to share resources for everyone, is not it also the recent very hungry. Good things must share the fishes, the following resources are on welfare when it! Google flat design manual https://material.google.com/ domestic learning website http://www.wanyouyingli.com/ common function chart http://easings.net/zh-cn domestic tour visual design center ...
==================================================
you seen any weird website? Lin concise my favorites bleeding the coffers again! (Treasure boy attack! The ending to 1 egg, Marxists Internet Archive we always joked before "Matt is very difficult, the test is what stuff." But you do not know there is a group of unknown groups, they do not expect anything in return, one and all for the cause of the work. the collection from around the world in the past, present and future for the communist ...
==================================================
'''
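Reading the stored text back is the other half of the job. The following is a minimal sketch, assuming the file layout produced by the crawler above (question, author, and answer on consecutive lines, then a line of 50 '=' characters). The helper names `format_record` and `parse_records` are hypothetical and not part of the original article, and the sample records are made up for illustration.

```python
SEPARATOR = '=' * 50  # same separator line the crawler example writes


def format_record(question, author, answer):
    """Render one record in the layout the crawler example writes."""
    return '\n'.join([question, author, answer]) + '\n' + SEPARATOR + '\n'


def parse_records(text):
    """Split stored text back into a list of dicts (hypothetical helper)."""
    records = []
    for chunk in text.split(SEPARATOR):
        lines = chunk.strip().split('\n')
        if len(lines) < 3:
            continue  # skip empty or incomplete chunks
        records.append({
            'question': lines[0],
            'author': lines[1],
            # An answer may span several lines, so rejoin the remainder
            'answer': '\n'.join(lines[2:]),
        })
    return records


# Made-up sample data standing in for the contents of zhihu_explore.txt
sample = (
    format_record('Q1?', 'alice', 'First answer')
    + format_record('Q2?', 'bob', 'Line one\nline two')
)
records = parse_records(sample)
print(records)
```

To parse a real file, read it with `open('zhihu_explore.txt', encoding='utf-8').read()` and pass the result to `parse_records`. The sketch breaks down if an answer itself contains a run of 50 '=' characters, which is one reason a structured format such as JSON or CSV is usually a better fit for records.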
2. JSON file storage
## JSON file storage
## JSON has two common types, arrays and objects, which correspond to lists and dictionaries in Python; both may be nested
## Reading JSON: JSON strings must use double quotes, otherwise parsing fails
import json

json_str = '''
[
    {"name": "dmr", "age": "25", "score": "80"},
    {"name": "asx", "age": "23", "score": "81"}
]
'''
print(type(json_str))
data = json.loads(json_str)
print(type(data))
# Read values; the dictionary get method returns None instead of raising an error when the key is missing
print(data[0]['name'])
print(data[0].get('age'))
'''
Output:
<class 'str'>
<class 'list'>
dmr
25
'''

## Writing JSON: json.dumps converts a Python object into a JSON string; single-quoted Python strings come out double-quoted
d = {
    'name': ['dmr', 'asx', '逗比'],
    'age': '25',
}
# Convert the data into a JSON string
data_json = json.dumps(d)
# indent sets the number of spaces used for indentation
data_json2 = json.dumps(d, indent=2)
# ensure_ascii=False keeps Chinese characters readable instead of escaping them
data_json3 = json.dumps(d, indent=2, ensure_ascii=False)
print(data_json)
print(data_json2)
print(data_json3)
# Save the three strings to a file (for comparison only; three documents in one file is not a single valid JSON document)
with open('data.json', 'w', encoding='utf-8') as f:
    f.write('\n'.join([data_json, data_json2, data_json3]))
'''
Output:
{"name": ["dmr", "asx", "\u9017\u6bd4"], "age": "25"}
{
  "name": [
    "dmr",
    "asx",
    "\u9017\u6bd4"
  ],
  "age": "25"
}
{
  "name": [
    "dmr",
    "asx",
    "逗比"
  ],
  "age": "25"
}
'''
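For file storage it is often simpler to let the json module read and write the file object directly with `json.dump` and `json.load`, rather than building a string first. The sketch below is one way to do that round trip; the temporary directory is an assumption made here so the example does not overwrite an existing data.json.

```python
import json
import os
import tempfile

data = {'name': ['dmr', 'asx', '逗比'], 'age': '25'}

# Write into a temporary directory so no existing file is touched
path = os.path.join(tempfile.mkdtemp(), 'data.json')

# json.dump serializes straight to the file object
with open(path, 'w', encoding='utf-8') as f:
    json.dump(data, f, indent=2, ensure_ascii=False)

# json.load parses the file object back into Python objects
with open(path, 'r', encoding='utf-8') as f:
    loaded = json.load(f)

print(loaded == data)
```

Because the file holds exactly one JSON document, it can be loaded back without any manual splitting.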
3. CSV file storage: tabular data stored as plain text
## CSV files store tabular data as plain text
## Writing
import csv

# newline='' lets the csv module control line endings, as the csv documentation recommends
with open('data.csv', 'w', newline='', encoding='utf-8') as csvf:
    # Create a writer on the file handle
    writer = csv.writer(csvf)
    # Create a writer on the same file handle with a custom delimiter
    writer2 = csv.writer(csvf, delimiter=' ')
    # Write rows one at a time
    writer.writerow(['id', 'name', 'age'])
    writer.writerow(['0001', 'dmr', '25'])
    writer.writerow(['0002', 'asx', '23'])
    writer.writerow(['0003', 'scy', '26'])
    writer2.writerow(['0004', 'test', '22'])
    # Wrap the separator in a list; a bare string would be split into one field per character
    writer2.writerow(['================================='])
    # Write multiple rows at once
    writer2.writerows([['id', 'name', 'age'], ['0001', 'dmr', '25'], ['0003', 'scy', '26']])
    writer2.writerow(['================================='])
    # Write rows given as dictionaries
    fieldnames = ['id', 'name', 'age']
    writer3 = csv.DictWriter(csvf, fieldnames=fieldnames)
    # Write the header row from fieldnames
    writer3.writeheader()
    # Write the content
    writer3.writerow({'id': '0001', 'name': 'dmr', 'age': '25'})
    writer3.writerow({'id': '0002', 'name': 'asx', 'age': '23'})
    writer3.writerow({'id': '0003', 'name': 'scy', 'age': '26'})

## Reading
import csv

with open('data.csv', 'r', newline='', encoding='utf-8') as csvf:
    reader = csv.reader(csvf)
    print(reader)
    for row in reader:
        print(row)

# Read the file with the pandas module
import pandas as pd

data = pd.read_csv('data.csv')
print(data)
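Rows written with `csv.DictWriter` can be read back as dictionaries with `csv.DictReader`, which takes the keys from the header row. The following sketch shows that round trip under the assumption that the file contains a single header and one consistent delimiter; it writes to a temporary directory (an assumption made here) rather than the data.csv used above.

```python
import csv
import os
import tempfile

fieldnames = ['id', 'name', 'age']
rows = [
    {'id': '0001', 'name': 'dmr', 'age': '25'},
    {'id': '0002', 'name': 'asx', 'age': '23'},
    {'id': '0003', 'name': 'scy', 'age': '26'},
]

# Write into a temporary directory so no existing file is touched
path = os.path.join(tempfile.mkdtemp(), 'data.csv')

with open(path, 'w', newline='', encoding='utf-8') as csvf:
    writer = csv.DictWriter(csvf, fieldnames=fieldnames)
    writer.writeheader()    # header row: id,name,age
    writer.writerows(rows)  # one line per dictionary

with open(path, 'r', newline='', encoding='utf-8') as csvf:
    # DictReader uses the first row as keys for every following row
    loaded = list(csv.DictReader(csvf))

for row in loaded:
    print(row['id'], row['name'], row['age'])
```

The csv module returns every value as a string, so an id such as '0001' keeps its leading zeros. pandas infers numeric types by default, so when that matters it is worth passing `dtype=str` to `pd.read_csv`.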
4. Relational database: MySQL
pymysql module: https://www.cnblogs.com/Caiyundo/p/9578925.html
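The linked post covers pymysql in detail. As a rough illustration of the connect, execute, commit, fetch pattern, here is a sketch that swaps in the standard-library sqlite3 module, because it runs without a database server. This is a stand-in, not pymysql itself: both modules follow the Python DB-API 2.0 (PEP 249), but with pymysql the connection would come from `pymysql.connect(...)` and the placeholders would be `%s` instead of `?`. The `students` table and its columns are hypothetical.

```python
import sqlite3

# An in-memory database; nothing is written to disk
conn = sqlite3.connect(':memory:')
try:
    cur = conn.cursor()
    # Hypothetical table used only for this illustration
    cur.execute('CREATE TABLE students (id TEXT PRIMARY KEY, name TEXT, age INTEGER)')
    # Parameterized inserts avoid building SQL by string concatenation
    cur.executemany(
        'INSERT INTO students (id, name, age) VALUES (?, ?, ?)',
        [('0001', 'dmr', 25), ('0002', 'asx', 23), ('0003', 'scy', 26)],
    )
    conn.commit()
    # Query with a bound parameter and fetch all matching rows
    cur.execute('SELECT id, name, age FROM students WHERE age > ? ORDER BY id', (24,))
    rows = cur.fetchall()
finally:
    conn.close()

print(rows)
```

Keeping the values out of the SQL string and passing them as parameters is the main habit to carry over to pymysql, since it protects against SQL injection regardless of the driver.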
5. Non-relational databases: MongoDB, Redis
pymongo module: https://www.cnblogs.com/Caiyundo/p/9480265.html
redis module: https://www.cnblogs.com/Caiyundo/p/9561548.html