Using pandas to read MySQL data


Working with MySQL from Python is convenient, but repeatedly fetching data from the server is inefficient. Today I tried pulling the data out of the MySQL database with pandas instead. It is very convenient: the data is fetched once, and pandas' rich set of analysis functions is then at hand.

  1. First, establish a database connection and keep it in a shared attribute conn, which holds a Connection object.
import pymysql
import pandas as pd


class cardb():
    conn = None

    def connect_db(self):
        db_host = "XXXX"
        user = "XXXX"
        pw = "XXXX"
        try:
            # Open the database connection (recent PyMySQL versions expect keyword arguments)
            self.conn = pymysql.connect(host=db_host, user=user, password=pw,
                                        database="yiche_car_info",
                                        use_unicode=True, charset='utf8mb4')
            return self.conn
        except Exception as e:
            print("Database connection failed! Error: %s" % e)
            return None
  2. Use the established connection to fetch the data with pandas
        # Inside a method of cardb
        try:
            # Reconnect if the server has dropped the connection
            self.conn.ping(reconnect=True)
        except Exception as e:
            print("%s" % e)
            return None
        sql = "select * from viewcarinfo"
        df = pd.read_sql_query(sql, con=self.conn)
        print(df)

The result is a DataFrame. The row labels are called the index, and each column is addressed by its name, much like a table.

       pz_id cartype_id              pz_name  ...    车型级别 车身型式    前大灯
0    m139120      m4758       2020款1.2L手动超值版  ...     小型车   两厢     卤素
1    m111122      m2790         2014款1.3L标准版  ...     小型车   三厢     卤素
2    m133807      m3067       2019款1.5L手动进取版  ...     小型车   三厢    LED
3    m133409      m4758      2019款1.2LAMT舒适版  ...     小型车   两厢     卤素
4    m139423      m3167       2020款1.4L手动焕新版  ...     小型车   三厢     卤素
..       ...        ...                  ...  ...     ...  ...    ...
343  m129677      m4586  2018款5.3L手自一体白宫一号4座  ...  全尺寸SUV  SUV    LED
344  m136613      m3859      2020款6.0TW12标准版  ...     豪华车   三厢    LED
345  m131852      m4373       2019款S680双调典藏版  ...     豪华车   三厢  矩阵LED
346  m132800      m2078    2019款GT6.0TW12敞篷版  ...     豪华车  敞篷车  矩阵LED
347  m125538      m3044    2017款6.8T手自一体长轴距版  ...     豪华车   三厢     氙气
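
As a rough illustration of fetching everything once and then inspecting the result, here is a minimal sketch. It uses an in-memory SQLite database as a stand-in for the MySQL server so that it runs without one, which is an assumption and not the setup used above. The column name headlight and the inserted rows are likewise made up. Note also that recent pandas versions tend to warn when given a raw PyMySQL connection and recommend an SQLAlchemy engine instead.

```python
import sqlite3

import pandas as pd

# In-memory SQLite database standing in for the MySQL server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE viewcarinfo (pz_id TEXT, cartype_id TEXT, headlight TEXT)")
conn.executemany(
    "INSERT INTO viewcarinfo VALUES (?, ?, ?)",
    [("m139120", "m4758", "halogen"), ("m133807", "m3067", "LED")],  # made-up rows
)
conn.commit()

# A single query pulls the whole table into memory as a DataFrame.
df = pd.read_sql_query("SELECT * FROM viewcarinfo", con=conn)
conn.close()

print(list(df.columns))  # column labels
print(list(df.index))    # default integer row labels
print(df.shape)          # (number of rows, number of columns)
```

After this point no further round trips to the database are needed; all filtering and counting happens on df in memory.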
  3. Next, create the Python dictionary that will later be saved as JSON
json_res = {}
json_res["item_name"] = []
json_res["item_option"] = {}
json_res["item_value"] = []
json_res["car_prosys_name"] = car_prosys_name
json_res["car_prosys_value"] = []
json_res["car_prosys_series_value"] = []
json_res["car_pricezone_name"] = car_pricezone_name
json_res["car_pricezone_value"] = []
json_res["car_pricezone_series_value"] = []
  4. Analyze the data with pandas
    Querying data: df.loc selects rows, and combining it with a boolean condition filters the data.
    A. Filtering by equality or by comparison
item="卤素"
item1 = df.loc[df["前大灯"] == item]  

With == (equal to), > (greater than) and < (less than) you can select the rows whose values satisfy an equality or comparison condition.
n1 = item1[str_feild].count()
This counts the rows whose headlight ("前大灯") value equals "卤素" (halogen).
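
The following sketch shows both kinds of condition on a small DataFrame. The data is made up for illustration; only the column name "前大灯" (headlight) comes from the table above, and the numeric price column is an assumption.

```python
import pandas as pd

# Hypothetical sample data; "price" is an assumed numeric column.
df = pd.DataFrame(
    {
        "pz_id": ["m1", "m2", "m3", "m4"],
        "前大灯": ["卤素", "LED", "卤素", "矩阵LED"],
        "price": [5.0, 9.5, 6.2, 300.0],
    }
)

halogen = df.loc[df["前大灯"] == "卤素"]  # equality condition
expensive = df.loc[df["price"] > 6.0]     # comparison condition

n_halogen = halogen["前大灯"].count()     # non-null values in that column
n_expensive = len(expensive)              # number of rows

print(n_halogen, n_expensive)
```

count() skips missing values in the chosen column, while len() counts every selected row, so the two can differ when the column contains nulls.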

B. Fuzzy text filters
For fuzzy (substring) text matching, use the .str.contains function, for example:

item="卤素"
item2 = df.loc[df["前大灯"].str.contains(item)]

This selects every row whose headlight text contains "卤素" (halogen), so those rows can then be counted:

n2 = item2[str_feild].count()
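
A small sketch of substring filtering, with made-up values. One thing worth knowing: if the column contains missing values, .str.contains returns NaN for them and df.loc then refuses the mask, so passing na=False is a common precaution. regex=False is used here on the assumption that a literal match is wanted.

```python
import pandas as pd

# Hypothetical values, including one missing entry.
df = pd.DataFrame({"前大灯": ["卤素", "LED", "卤素+LED日行灯", None]})

# na=False treats missing values as "no match"; regex=False matches the text literally.
mask = df["前大灯"].str.contains("卤素", na=False, regex=False)
item2 = df.loc[mask]
n2 = item2["前大灯"].count()

print(n2)
```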

n2 counts the rows whose headlight value contains "卤素" (halogen).

C. Combining conditions
You can also combine two or more conditions: "&" means AND and "|" means OR. Wrap each condition in parentheses, because "&" and "|" bind more tightly than "==".

item3 = df.loc[(df[str_feild] == item) & (df[prosys_name] == ps_name)]
n3 = item3[str_feild].count()

This counts the rows that satisfy both df[str_feild] == item and df[prosys_name] == ps_name.
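
Here is a minimal sketch of AND, OR and negation on made-up data. The column names are taken from the table above; the "~" operator for NOT is an addition not mentioned in the text.

```python
import pandas as pd

# Hypothetical sample rows.
df = pd.DataFrame(
    {
        "前大灯": ["卤素", "卤素", "LED", "LED"],
        "车型级别": ["小型车", "豪华车", "小型车", "豪华车"],
    }
)

is_halogen = df["前大灯"] == "卤素"
is_small = df["车型级别"] == "小型车"

both = df.loc[is_halogen & is_small]         # AND
either = df.loc[is_halogen | is_small]       # OR
neither = df.loc[~(is_halogen | is_small)]   # NOT (OR)

print(len(both), len(either), len(neither))
```

Naming the boolean masks first avoids the parenthesis problem entirely and keeps long filters readable.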

  5. Export the dictionary as JSON; the file can then serve as a data source for Flask or Django.
import json
import sys

save_path = sys.path[0] + json_path + str_feild_id + ".json"
with open(save_path, 'w') as wr:
    json.dump(json_res, wr)
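
How the dictionary gets filled is not shown above. One plausible way, sketched here under that assumption, is to take the counts from value_counts() and write them out. The temporary file location and the sample data are hypothetical; ensure_ascii=False is optional and only keeps the Chinese text readable in the file.

```python
import json
import os
import tempfile

import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({"前大灯": ["卤素", "LED", "卤素", "氙气"]})

counts = df["前大灯"].value_counts()  # most frequent value first

json_res = {
    "item_name": [str(name) for name in counts.index],
    # int() converts NumPy integers, which the json module cannot serialize
    "item_value": [int(v) for v in counts.values],
}

save_path = os.path.join(tempfile.mkdtemp(), "headlight.json")  # hypothetical path
with open(save_path, "w", encoding="utf-8") as wr:
    json.dump(json_res, wr, ensure_ascii=False)

# Read the file back to confirm that it round-trips.
with open(save_path, encoding="utf-8") as rd:
    loaded = json.load(rd)
print(loaded)
```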
  6. Closing notes on pandas
    This is only a first look at pandas; it has many more uses. It can be seen as a higher-level companion to NumPy: NumPy is
    mostly used for arrays and matrices, while pandas targets more advanced data processing in the style of Excel or SQL. It can
    also talk to a database directly and reads and writes many formats (commonly CSV, Excel and JSON), which makes it a powerful
    tool for processing large amounts of data.
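
As a brief illustration of the export formats mentioned, the sketch below writes a made-up DataFrame to CSV and JSON text and reads the CSV back. Excel export is left out because it needs an extra package.

```python
import io

import pandas as pd

# Hypothetical two-row sample.
df = pd.DataFrame({"pz_id": ["m139120", "m133807"], "前大灯": ["卤素", "LED"]})

csv_text = df.to_csv(index=False)                            # CSV as a string
json_text = df.to_json(orient="records", force_ascii=False)  # list of row objects

# Reading the CSV text back gives the same values again.
round_trip = pd.read_csv(io.StringIO(csv_text))

print(csv_text)
print(json_text)
```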

Origin blog.csdn.net/qq_43662503/article/details/104679163