Copyright notice: this is an original article by the blogger and may not be reproduced without the blogger's permission. https://blog.csdn.net/chenKFKevin/article/details/84974148
Install pyhive, then connect to Presto and read the query result with pandas:
import pandas as pd
from sqlalchemy.engine import create_engine
from pyhive import hive
# Prepare the query
sql = "select * from table"
engine1 = create_engine('presto://ip:port/hive/default')
# Fetch the data into a DataFrame
df = pd.read_sql(sql, engine1)
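The connection string above follows the pattern presto://host:port/catalog/schema, where `hive` is the catalog and `default` is the schema. As a minimal sketch, a small helper can assemble it from its parts; `presto_url` and its default arguments are hypothetical names introduced here for illustration, not part of pyhive or SQLAlchemy.

```python
def presto_url(host, port, catalog="hive", schema="default"):
    """Build a SQLAlchemy URL for the Presto dialect that pyhive registers.

    Hypothetical helper: the defaults mirror the 'hive/default' part of the
    connection string used in this post.
    """
    return "presto://{}:{}/{}/{}".format(host, port, catalog, schema)


# Example with made-up connection details
url = presto_url("localhost", 8080)
print(url)
```

The result can be passed straight to `create_engine(url)` in place of the hard-coded string.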
Connect to Hive and insert data into a Hive table:
# 'ip', port, 'db' and 'xxx' are placeholders; replace them with your own HiveServer2 settings
conn = hive.connect(host='ip', port=port, database='db', username='xxx', auth='NONE')
cursor = conn.cursor()
sql_2 = "Insert into table partition (pt='xxx')" + \
"(column1, column2, column3, column4, column5) values "
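# Here I concatenate multiple value tuples and insert them all in one statement.
# Each entry in df['sql'] is assumed to be a "(v1, v2, v3, v4, v5)," fragment ending with a comma.
for i in df['sql'].tolist():
    sql_2 += i
# Strip the trailing comma before executing
cursor.execute(sql_2[:-1])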
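The snippet above assumes the DataFrame already has a `sql` column holding ready-made value fragments. As a rough sketch of how such a statement could be built from raw rows instead, the helpers below render each value as a literal and join the tuples with commas, so there is no trailing comma to strip. `quote_literal` and `build_insert` are hypothetical names, the backslash-escaping rule is my assumption about what Hive accepts, and table and column names are inserted as-is, so they must come from trusted input.

```python
import numbers


def quote_literal(value):
    """Render one Python value as a SQL literal (hypothetical helper)."""
    if value is None:
        return "NULL"
    if isinstance(value, bool):  # check bool before numbers: bool is an int subclass
        return "true" if value else "false"
    if isinstance(value, numbers.Number):
        return str(value)
    # Assumption: backslash escapes are accepted inside single-quoted strings
    text = str(value).replace("\\", "\\\\").replace("'", "\\'")
    return "'" + text + "'"


def build_insert(table, partition, columns, rows):
    """Build a single multi-row INSERT statement (hypothetical helper)."""
    values = ", ".join(
        "(" + ", ".join(quote_literal(v) for v in row) + ")" for row in rows
    )
    return "INSERT INTO TABLE {} PARTITION ({}) ({}) VALUES {}".format(
        table, partition, ", ".join(columns), values
    )


# Example with made-up table, partition, and rows
stmt = build_insert("t", "pt='2018'", ["a", "b"], [(1, "x"), (2, None)])
print(stmt)
```

With a DataFrame, the rows could come from `df.itertuples(index=False, name=None)`, and the resulting string would be passed to `cursor.execute`.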
If you run into an error, see the solution discussed at https://github.com/cloudera/impyla/issues/267.