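Hadoop components: a column-oriented open-source database (9): connecting to HBase from Python via Thrift

Operating HBase from Python via Thrift

Thrift can be used from many languages, but I could not find a command-line client for it on Linux. If the server has a Python environment, connecting from Python is a quick way to test.

Confirm that HBase and the Thrift service are installed and running

For installing and starting HBase and Thrift, see the two articles below.

Note: I am using the HBase service from the CDH distribution here. If you installed HBase on its own, see the appendix at the end of this article.

Hadoop basics: Hadoop in practice (7): Hadoop management tools: installing Hadoop with Cloudera Manager: offline installation of Cloudera Manager and CDH 5.8

Hadoop components: a column-oriented open-source database (3): introduction to and installation of Thrift, HBase's interface

Run the following command as root (a regular user account may not be able to see processes started by root):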

jps

Output:

root@master:/# jps
3332 Jps
3254 ThriftServer
2685 HMaster

HMaster in the list means the HBase service is running; ThriftServer means the Thrift service is running.
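The same check can be scripted. The sketch below is only an illustration: the function name `running_java_processes` is made up here, and the sample text is copied from the jps output shown above rather than read from a live system.

```python
def running_java_processes(jps_output):
    """Return the set of process names found in `jps` output."""
    names = set()
    for line in jps_output.splitlines():
        parts = line.split()
        # A jps line looks like "<pid> <name>"; skip prompts and blank lines.
        if len(parts) == 2 and parts[0].isdigit():
            names.add(parts[1])
    return names


# Sample copied from the output above; in practice this text would come
# from actually running jps.
sample = """\
3332 Jps
3254 ThriftServer
2685 HMaster
"""

found = running_java_processes(sample)
for required in ("HMaster", "ThriftServer"):
    status = "running" if required in found else "NOT running"
    print("%s: %s" % (required, status))
```

If either name is reported as not running, start the corresponding service before continuing.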

Connecting to HBase from Python 2

Check the environment

Confirm the Python version and whether pip is installed

[zzq@host252 ~]$ python --version
Python 2.7.11

pip

[zzq@host252 ~]$ pip

Usage:   
  pip <command> [options]

Commands:
  install                     Install packages.
  download                    Download packages.
  uninstall                   Uninstall packages.
  freeze                      Output installed packages in requirements format.
  list                        List installed packages.
  show                        Show information about installed packages.
  check                       Verify installed packages have compatible dependencies.
  config                      Manage local and global configuration.
  search                      Search PyPI for packages.
  wheel                       Build wheels from your requirements.
  hash                        Compute hashes of package archives.
  completion                  A helper command used for command completion.
  help                        Show help for commands.

General Options:
  -h, --help                  Show help.
  --isolated                  Run pip in an isolated mode, ignoring environment variables and user configuration.
  -v, --verbose               Give more output. Option is additive, and can be used up to 3 times.
  -V, --version               Show version and exit.
  -q, --quiet                 Give less output. Option is additive, and can be used up to 3 times (corresponding to WARNING, ERROR, and CRITICAL logging levels).
  --log <path>                Path to a verbose appending log.
  --proxy <proxy>             Specify a proxy in the form [user:passwd@]proxy.server:port.
  --retries <retries>         Maximum number of retries each connection should attempt (default 5 times).
  --timeout <sec>             Set the socket timeout (default 15 seconds).
  --exists-action <action>    Default action when a path already exists: (s)witch, (i)gnore, (w)ipe, (b)ackup, (a)bort).
  --trusted-host <hostname>   Mark this host as trusted, even though it does not have valid or any HTTPS.
  --cert <path>               Path to alternate CA bundle.
  --client-cert <path>        Path to SSL client certificate, a single file containing the private key and the certificate in PEM format.
  --cache-dir <dir>           Store the cache data in <dir>.
  --no-cache-dir              Disable the cache.
  --disable-pip-version-check
                              Don't periodically check PyPI to determine whether a new version of pip is available for download. Implied with --no-index.
  --no-color                  Suppress colored output
[zzq@host252 ~]$ 

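Possible problem: bash: pip: command not found

Fix: symlink the pip from the relevant Python installation into a directory on the system PATH.

First look up where pip is installed:

find / -name pip

Then create a symlink (adjust the source path to whatever find reported):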

ln -sv /usr/local/python/bin/pip /usr/bin/pip

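Create a virtual environment

To avoid affecting the system Python environment, it is best to create a separate virtual environment to work in (you can also skip this and work directly in the system Python).

The virtualenv script used here requires Python 2.7 or later.

Install it with:

pip install virtualenv
or
pip2 install virtualenv -i https://pypi.douban.com/simple

After installation, verify it with: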

[zzq@host252 ~]$ virtualenv
You must provide a DEST_DIR
Usage: virtualenv [OPTIONS] DEST_DIR

Options:
  --version             show program's version number and exit
  -h, --help            show this help message and exit
  -v, --verbose         Increase verbosity.
  -q, --quiet           Decrease verbosity.
  -p PYTHON_EXE, --python=PYTHON_EXE
                        The Python interpreter to use, e.g.,
                        --python=python3.5 will use the python3.5 interpreter
                        to create the new environment.  The default is the
                        interpreter that virtualenv was installed with
                        (/usr/bin/python3.6)
  --clear               Clear out the non-root install and start from scratch.
  --no-site-packages    DEPRECATED. Retained only for backward compatibility.
                        Not having access to global site-packages is now the
                        default behavior.
  --system-site-packages
                        Give the virtual environment access to the global
                        site-packages.
  --always-copy         Always copy files rather than symlinking.
  --relocatable         Make an EXISTING virtualenv environment relocatable.
                        This fixes up scripts and makes all .pth files
                        relative.
  --no-setuptools       Do not install setuptools in the new virtualenv.
  --no-pip              Do not install pip in the new virtualenv.
  --no-wheel            Do not install wheel in the new virtualenv.
  --extra-search-dir=DIR
                        Directory to look for setuptools/pip distributions in.
                        This option can be used multiple times.
  --download            Download preinstalled packages from PyPI.
  --no-download, --never-download
                        Do not download preinstalled packages from PyPI.
  --prompt=PROMPT       Provides an alternative prompt prefix for this
                        environment.
  --setuptools          DEPRECATED. Retained only for backward compatibility.
                        This option has no effect.
  --distribute          DEPRECATED. Retained only for backward compatibility.
                        This option has no effect.
  --unzip-setuptools    DEPRECATED.  Retained only for backward compatibility.
                        This option has no effect.

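Create the virtual environment with the following commands:

mkdir my-python2hbase-env
cd my-python2hbase-env

Create it:
virtualenv  project-env

Check the current directory:

pwd

Output:

/home/zzq/my-python2hbase-env

Activate the virtual environment: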

source /home/zzq/my-python2hbase-env/project-env/bin/activate
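To confirm that the interpreter you are about to use really is the one inside the virtual environment, a small check along these lines may help. It is a sketch, not part of the original setup; `in_virtualenv` is a name chosen for this example.

```python
import sys


def in_virtualenv():
    """Return True when the interpreter runs inside a virtual environment."""
    # Older virtualenv releases set sys.real_prefix; venv and newer
    # virtualenv releases make sys.base_prefix differ from sys.prefix.
    if hasattr(sys, "real_prefix"):
        return True
    return getattr(sys, "base_prefix", sys.prefix) != sys.prefix


print("interpreter: %s" % sys.executable)
print("inside a virtual environment: %s" % in_virtualenv())
```

After activation, the interpreter path should point into project-env/bin.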

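Install the dependencies

Two packages are needed: thrift and hbase-thrift.

Several Python packages can connect to HBase:
HBase-Thrift
happybase
hbase-python (PyPI repository)
hbase-python (GitHub)

We use HBase-Thrift here.

Install the thrift package

pip install thrift

A successful installation prints: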

(project-env) [root@host3 my-python2hbase-env]# pip install thrift
DEPRECATION: Python 2.7 will reach the end of its life on January 1st, 2020. Please upgrade your Python as Python 2.7 won't be maintained after that date. A future version of pip will drop support for Python 2.7. More details about Python 2 support in pip, can be found at https://pip.pypa.io/en/latest/development/release-process/#python-2-support
Collecting thrift
  Downloading https://files.pythonhosted.org/packages/c6/b4/510617906f8e0c5660e7d96fbc5585113f83ad547a3989b80297ac72a74c/thrift-0.11.0.tar.gz (52kB)
     |████████████████████████████████| 61kB 46kB/s 
Collecting six>=1.7.2
  Downloading https://files.pythonhosted.org/packages/73/fb/00a976f728d0d1fecfe898238ce23f502a721c0ac0ecfedb80e0d88c64e9/six-1.12.0-py2.py3-none-any.whl
Building wheels for collected packages: thrift
  Building wheel for thrift (setup.py) ... done
  Created wheel for thrift: filename=thrift-0.11.0-cp27-cp27mu-linux_x86_64.whl size=264173 sha256=8392860fa66ddd575b004c4d1ef13f1a462c01a779ddfa1929db42bcebe26a34
  Stored in directory: /root/.cache/pip/wheels/be/36/81/0f93ba89a1cb7887c91937948519840a72c0ffdd57cac0ae8f
Successfully built thrift
Installing collected packages: six, thrift
Successfully installed six-1.12.0 thrift-0.11.0
(project-env) [root@host3 my-python2hbase-env]# 

Install the hbase-thrift package

pip install hbase-thrift

A successful installation prints:

(project-env) [root@host3 my-python2hbase-env]# pip install hbase-thrift
DEPRECATION: Python 2.7 will reach the end of its life on January 1st, 2020. Please upgrade your Python as Python 2.7 won't be maintained after that date. A future version of pip will drop support for Python 2.7. More details about Python 2 support in pip, can be found at https://pip.pypa.io/en/latest/development/release-process/#python-2-support
Collecting hbase-thrift
  Downloading https://files.pythonhosted.org/packages/89/f7/dbb6c764bb909ed361c255828701228d8c9867d541cfef84127e6f3704cc/hbase-thrift-0.20.4.tar.gz
Requirement already satisfied: Thrift in ./project-env/lib/python2.7/site-packages (from hbase-thrift) (0.11.0)
Requirement already satisfied: six>=1.7.2 in ./project-env/lib/python2.7/site-packages (from Thrift->hbase-thrift) (1.12.0)
Building wheels for collected packages: hbase-thrift
  Building wheel for hbase-thrift (setup.py) ... done
  Created wheel for hbase-thrift: filename=hbase_thrift-0.20.4-cp27-none-any.whl size=19705 sha256=c3334f4d28c385ec7b29fda6db64c128c76e08e4bc2cfe9e1d20ff8dbd813629
  Stored in directory: /root/.cache/pip/wheels/fe/51/f2/afb7b010cd97910aa0b651d492735a38ed69a93a817444904e
Successfully built hbase-thrift
Installing collected packages: hbase-thrift
Successfully installed hbase-thrift-0.20.4
You have mail in /var/spool/mail/root
(project-env) [root@host3 my-python2hbase-env]# 

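Python code for connecting via thrift

HBase currently has two Thrift interfaces (call them thrift and thrift2), and they are not compatible with each other.

First, the code for connecting via thrift:

vi query.py

Note: replace localhost and port 9090 (the default Thrift port) with the values for your own server.

Enter the following: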

from thrift import Thrift
from thrift.transport import TSocket
from thrift.transport import TTransport
from thrift.protocol import TBinaryProtocol
 
from hbase import Hbase
from hbase.ttypes import *
 
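# Host and port of the HBase Thrift server; adjust to your setup.
transport = TSocket.TSocket('localhost', 9090)
 
# Wrap the raw socket in a buffered transport and use the binary protocol.
transport = TTransport.TBufferedTransport(transport)
protocol = TBinaryProtocol.TBinaryProtocol(transport)
 
client = Hbase.Client(protocol)
transport.open()
# List all table names (available on the thrift1 interface only).
print client.getTableNames()
transport.close()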
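If the script hangs or fails to connect, it can help to first check that the Thrift port is reachable at all. The following is a minimal sketch using only the standard library; `port_is_open` is a helper name invented for this example, and localhost:9090 is simply the default mentioned above.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        conn = socket.create_connection((host, port), timeout=timeout)
    except (socket.error, socket.timeout):
        # Connection refused, host unreachable, or timed out.
        return False
    conn.close()
    return True


# Replace with the host and port of your own Thrift server.
print("Thrift port reachable: %s" % port_is_open("localhost", 9090))
```

A reachable port only shows that something is listening; it does not tell you whether the server speaks thrift or thrift2.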

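Possible problem: thrift.Thrift.TApplicationException: Invalid method name: 'getTableNames'

Cause

The Thrift interface used by the client does not match the one served by the HBase thrift server.

The thrift server was started as thrift2, while the client is talking thrift.

Solution
Since the root cause is the mismatch between client and server, there are two ways to fix it:

1. Start the server-side thrift server with the thrift (thrift1) interface.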

hbase-daemon.sh stop thrift2
# start command
hbase-daemon.sh start thrift

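If you want to use the convenient happybase module to connect to HBase, you must use thrift, because happybase does not yet support thrift2.

2. Keep the server on thrift2 and connect with a thrift2 client, as described in the next section.

Python code for connecting via thrift2

Connecting via thrift2 from Python is a little more involved.

Generate the client code: mind the difference between thrift and thrift2

You need the Thrift compiler installed to generate HBase's cross-language API.
The .thrift definition file that the compiler reads is located as follows.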

For a natively installed HBase the path is:
$HBASE_HOME/src/main/resources/org/apache/hadoop/hbase/thrift/Hbase.thrift

For an HBase installed through CDH the path is:
/opt/cloudera/parcels/CDH/lib/hue/apps/hbase/thrift/Hbase.thrift

If you cannot find it, search the whole filesystem:

sudo find  /  -name "Hbase.thrift"


Generate the Python code with:

thrift --gen py  /opt/cloudera/parcels/CDH/lib/hue/apps/hbase/thrift/Hbase.thrift

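If you get the error -bash: thrift: command not found

you need to install thrift; see Hadoop components: a column-oriented open-source database (3): introduction to and installation of Thrift, HBase's interface

The command creates a gen-py directory under the current directory.

CDH's HBase only ships the thrift1 definition file, so the thrift2 hbase.thrift has to be found elsewhere.

The HDP distribution of HBase ships both versions; the path and commands may look like this: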

# path of the HBase .thrift files on HDP
cd /usr/hdp/3.0.0.0-1634/hbase/include/thrift/
# generate Python code
# both the thrift1 and thrift2 definitions are in this directory; pick one
thrift -gen py hbase1.thrift
# or
thrift -gen py hbase2.thrift

If you are not using the HDP build of HBase, get the HBase source project from GitHub, which contains both the thrift1 and thrift2 definition files:

 thrift --gen py ../../../../../hbase-thrift/src/main/resources/org/apache/hadoop/hbase/thrift/Hbase.thrift
or
 thrift --gen py ../../../../../hbase-thrift/src/main/resources/org/apache/hadoop/hbase/thrift2/hbase.thrift

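You can even download the pre-generated files directly:
https://github.com/apache/hbase/tree/master/hbase-examples/src/main/python

Here we simply download the pre-generated thrift2 files.

Link: https://pan.baidu.com/s/1s3iysNJHW7s8lW6ni4qxrw
Extraction code: is1j

Generating from the thrift1 definition yields this set of Python files: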

[zzq@host252 thrift-0.10.0]$ ll gen-py/hbased/
total 440
-rw-rw-r--. 1 zzq zzq    326 Nov 28 19:38 constants.py
-rw-rw-r--. 1 zzq zzq 384499 Nov 28 19:38 Hbase.py
-rwxr-xr-x. 1 zzq zzq  14386 Nov 28 19:38 Hbase-remote
-rw-rw-r--. 1 zzq zzq     43 Nov 28 19:38 __init__.py
-rw-rw-r--. 1 zzq zzq  38776 Nov 28 19:38 ttypes.py
[zzq@host252 thrift-0.10.0]$ 

Generating from the thrift2 definition yields these files:

$ ls gen-py
gen-py/hbase/__init__.py
gen-py/hbase/constants.py
gen-py/hbase/THBaseService.py
gen-py/hbase/ttypes.py

Because thrift2 has no getTableNames() method, we first create a test table by hand.

hbase shell

hbase(main):001:0> create "example", NAME => "family"
0 row(s) in 1.6480 seconds

=> Hbase::Table - example
hbase(main):002:0> 

Suppose the gen-py path is:
/home/zzq/thrift2/gen-py

Then create the test script test.py:

vim test.py

Enter the following:

import sys
import os
import time

from thrift.transport import TTransport
from thrift.transport import TSocket
from thrift.transport import THttpClient
from thrift.protocol import TBinaryProtocol

# Add path for local "gen-py/hbase" for the pre-generated module
sys.path.append("/home/zzq/thrift2/gen-py")
from hbase import THBaseService
from hbase.ttypes import *

print "Thrift2 Demo"
print "This demo assumes you have a table called \"example\" with a column family called \"family\""

host = "192.168.30.250"
port = 9090
framed = False

socket = TSocket.TSocket(host, port)
if framed:
  transport = TTransport.TFramedTransport(socket)
else:
  transport = TTransport.TBufferedTransport(socket)
protocol = TBinaryProtocol.TBinaryProtocol(transport)
client = THBaseService.Client(protocol)

transport.open()

table = "example"

put = TPut(row="row1", columnValues=[TColumnValue(family="family",qualifier="qualifier1",value="value1")])
print "Putting:", put
client.put(table, put)

get = TGet(row="row1")
print "Getting:", get
result = client.get(table, get)

print "Result:", result

transport.close()

Run it with:

python test.py

Output:

[zzq@host252 ~]$ vi test.py
[zzq@host252 ~]$ python test.py 
Thrift2 Demo
This demo assumes you have a table called "example" with a column family called "family"
Putting: TPut(durability=None, timestamp=None, cellVisibility=None, attributes=None, columnValues=[TColumnValue(qualifier='qualifier1', family='family', tags=None, timestamp=None, value='value1', type=None)], row='row1')
Getting: TGet(storeOffset=None, existence_only=None, authorizations=None, filterString=None, timestamp=None, maxVersions=None, timeRange=None, filterBytes=None, targetReplicaId=None, consistency=None, attributes=None, storeLimit=None, cacheBlocks=None, columns=None, row='row1')
Result: TResult(partial=False, stale=False, columnValues=[TColumnValue(qualifier='qualifier1', family='family', tags=None, timestamp=1575022330934, value='value1', type=None)], row='row1')
[zzq@host252 ~]$ 

Connecting to HBase from Python 3

Check the environment

Confirm the Python version and whether pip is installed

(project-env) [zzq@host252 ~]$ python --version
Python 3.6.5
(project-env) [zzq@host252 ~]$ pip3

Usage:   
  pip3 <command> [options]

Commands:
  install                     Install packages.
  download                    Download packages.
  uninstall                   Uninstall packages.
  freeze                      Output installed packages in requirements format.
  list                        List installed packages.
  show                        Show information about installed packages.
  check                       Verify installed packages have compatible dependencies.
  config                      Manage local and global configuration.
  search                      Search PyPI for packages.
  wheel                       Build wheels from your requirements.
  hash                        Compute hashes of package archives.
  completion                  A helper command used for command completion.
  debug                       Show information useful for debugging.
  help                        Show help for commands.

General Options:
  -h, --help                  Show help.
  --isolated                  Run pip in an isolated mode, ignoring environment variables and user configuration.
  -v, --verbose               Give more output. Option is additive, and can be used up to 3 times.
  -V, --version               Show version and exit.
  -q, --quiet                 Give less output. Option is additive, and can be used up to 3 times (corresponding to WARNING, ERROR, and CRITICAL logging levels).
  --log <path>                Path to a verbose appending log.
  --proxy <proxy>             Specify a proxy in the form [user:passwd@]proxy.server:port.
  --retries <retries>         Maximum number of retries each connection should attempt (default 5 times).
  --timeout <sec>             Set the socket timeout (default 15 seconds).
  --exists-action <action>    Default action when a path already exists: (s)witch, (i)gnore, (w)ipe, (b)ackup, (a)bort.
  --trusted-host <hostname>   Mark this host or host:port pair as trusted, even though it does not have valid or any HTTPS.
  --cert <path>               Path to alternate CA bundle.
  --client-cert <path>        Path to SSL client certificate, a single file containing the private key and the certificate in PEM format.
  --cache-dir <dir>           Store the cache data in <dir>.
  --no-cache-dir              Disable the cache.
  --disable-pip-version-check
                              Don't periodically check PyPI to determine whether a new version of pip is available for download. Implied with --no-index.
  --no-color                  Suppress colored output
(project-env) [zzq@host252 ~]$ 

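Install the dependencies

Two packages are needed: thrift and hbase-thrift.

Several Python packages can connect to HBase:
HBase-Thrift
happybase
hbase-python (PyPI repository)
hbase-python (GitHub)

We use HBase-Thrift here.

Install the thrift package

pip3 install thrift

Install the hbase-thrift package

pip3 install hbase-thrift

Python 3 code for connecting via thrift1

HBase currently has two Thrift interfaces (call them thrift1 and thrift2), and they are not compatible with each other.

First, the code for connecting via thrift1:

vi query.py

Note: replace the host address and port 9090 (the default Thrift port) with the values for your own server.

Enter the following: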

from thrift import Thrift
from thrift.transport import TSocket
from thrift.transport import TTransport
from thrift.protocol import TBinaryProtocol
 
from hbase import Hbase
from hbase.ttypes import *
 
transport = TSocket.TSocket('192.168.30.250', 9090)
 
transport = TTransport.TBufferedTransport(transport)
protocol = TBinaryProtocol.TBinaryProtocol(transport)
 
client = Hbase.Client(protocol)
transport.open()
print(client.getTableNames())

Run it:

python query.py

It fails with the following error:

(project-env) [zzq@host252 ~]$ python query.py
Traceback (most recent call last):
  File "query.py", line 6, in <module>
    from hbase import Hbase
  File "/home/zzq/my-python2hbase-env/project-env/lib/python3.6/site-packages/hbase/Hbase.py", line 2066
    except IOError, io:
                  ^
SyntaxError: invalid syntax

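Connecting via thrift requires importing the hbase package, which comes from the separately installed third-party package hbase-thrift. That package was written for Python 2, so loading it under Python 3 runs into compatibility problems.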
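The failure happens when the file is parsed, before any HBase call is made. The sketch below illustrates this kind of incompatibility with the built-in compile(); it uses the Python 2 print statement as the example, since that spelling is rejected by every Python 3 release. The traceback above shows a second Python 2-only spelling, `except IOError, io:`, whose modern form is `except IOError as io:`. The helper name `compiles` is made up for this example.

```python
def compiles(source):
    """Return True if `source` is valid syntax for the running interpreter."""
    try:
        compile(source, "<snippet>", "exec")
    except SyntaxError:
        return False
    return True


PY2_ONLY = 'print "table names"\n'      # Python 2 print statement
PORTABLE = 'print("table names")\n'     # valid on Python 2.7 and Python 3

print("print statement compiles: %s" % compiles(PY2_ONLY))
print("print() call compiles:    %s" % compiles(PORTABLE))
```

Running the same check against a downloaded Hbase.py before copying it into site-packages is a quick way to see whether it was really converted for Python 3.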

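Others have published versions of these files modified to work with Python 3. Download the Python 3 versions and use them to replace Hbase.py and ttypes.py under /usr/local/lib/python3.6/site-packages/hbase/.

Inside a virtual environment, locate the directory as follows: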

(project-env) [zzq@host252 ~]$ which python
~/my-python2hbase-env/project-env/bin/python
(project-env) [zzq@host252 ~]$ ls ~/my-python2hbase-env/project-env/
bin  include  lib  lib64
(project-env) [zzq@host252 ~]$ ls ~/my-python2hbase-env/project-env/lib/
python3.6
(project-env) [zzq@host252 ~]$ ls ~/my-python2hbase-env/project-env/lib/python3.6/
abc.py          codecs.py                     copy.py           encodings     __future__.py   hmac.py    keyword.py    no-global-site-packages.txt  os.py         reprlib.py      site-packages     sre_parse.py  tempfile.py  warnings.py
base64.py       collections                   copyreg.py        enum.py       genericpath.py  importlib  lib-dynload   ntpath.py                    posixpath.py  re.py           site.py           stat.py       tokenize.py  weakref.py
bisect.py       _collections_abc.py           distutils         fnmatch.py    hashlib.py      imp.py     linecache.py  operator.py                  __pycache__   rlcompleter.py  sre_compile.py    struct.py     token.py     _weakrefset.py
_bootlocale.py  config-3.6m-x86_64-linux-gnu  _dummy_thread.py  functools.py  heapq.py        io.py      locale.py     orig-prefix.txt              random.py     shutil.py       sre_constants.py  tarfile.py    types.py
(project-env) [zzq@host252 ~]$ ls ~/my-python2hbase-env/project-env/lib/python3.6/site-packages/hbase
hbase/                         hbase_thrift-0.20.4.dist-info/ 
(project-env) [zzq@host252 ~]$ ls ~/my-python2hbase-env/project-env/lib/python3.6/site-packages/hbase/
constants.py  Hbase.py  __init__.py  __pycache__  ttypes.py

Download location:

Link: https://pan.baidu.com/s/1-yKP1ghu2IAswnXzWpGNbw
Extraction code: d132

Replace the files with the following commands:

(project-env) [zzq@host252 ~]$ unzip hbase3.6.zip 
Archive:  hbase3.6.zip
  inflating: hbase3.6/Hbase.py       
  inflating: hbase3.6/readme         
  inflating: hbase3.6/ttypes.py  

(project-env) [zzq@host252 ~]$ cp hbase3.6/Hbase.py  ~/my-python2hbase-env/project-env/lib/python3.6/site-packages/hbase/
(project-env) [zzq@host252 ~]$ cp hbase3.6/ttypes.py   ~/my-python2hbase-env/project-env/lib/python3.6/site-packages/hbase/

(project-env) [zzq@host252 ~]$ ll ~/my-python2hbase-env/project-env/lib/python3.6/site-packages/hbase/
total 276
-rw-rw-r--. 1 zzq zzq    150 Nov 28 12:13 constants.py
-rw-rw-r--. 1 zzq zzq 240677 Dec  2 16:44 Hbase.py
-rw-rw-r--. 1 zzq zzq     43 Nov 28 12:13 __init__.py
drwxrwxr-x. 2 zzq zzq   4096 Nov 28 12:13 __pycache__
-rw-rw-r--. 1 zzq zzq  25228 Dec  2 16:44 ttypes.py

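Possible problem: thrift.Thrift.TApplicationException: Invalid method name: 'getTableNames'

Cause

The Thrift interface used by the client does not match the one served by the HBase thrift server.

The thrift server was started as thrift2, while the client is talking thrift.

Solution
Since the root cause is the mismatch between client and server, there are two ways to fix it:

1. Start the server-side thrift server with the thrift (thrift1) interface.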

hbase-daemon.sh stop thrift2
# start command
hbase-daemon.sh start thrift

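2. To connect to thrift2 on the server instead, see the next section.

Python 3 code for connecting via thrift2

Generate the client code: mind the difference between thrift and thrift2

The process is much the same as for Python 2. Note that the code must be generated with Thrift 0.10.0 or later for it to support Python 3.5 and above.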

 thrift --gen py ../../../../../hbase-thrift/src/main/resources/org/apache/hadoop/hbase/thrift/Hbase.thrift
or
 thrift --gen py ../../../../../hbase-thrift/src/main/resources/org/apache/hadoop/hbase/thrift2/hbase.thrift

Again, the pre-generated files can be downloaded directly:
https://github.com/apache/hbase/tree/master/hbase-examples/src/main/python

Here we simply download the pre-generated thrift2 files.

Link: https://pan.baidu.com/s/1s3iysNJHW7s8lW6ni4qxrw
Extraction code: is1j

Generating from the thrift1 definition yields this set of Python files:

[zzq@host252 thrift-0.10.0]$ ll gen-py/hbased/
total 440
-rw-rw-r--. 1 zzq zzq    326 Nov 28 19:38 constants.py
-rw-rw-r--. 1 zzq zzq 384499 Nov 28 19:38 Hbase.py
-rwxr-xr-x. 1 zzq zzq  14386 Nov 28 19:38 Hbase-remote
-rw-rw-r--. 1 zzq zzq     43 Nov 28 19:38 __init__.py
-rw-rw-r--. 1 zzq zzq  38776 Nov 28 19:38 ttypes.py
[zzq@host252 thrift-0.10.0]$ 

Generating from the thrift2 definition yields these files:

$ ls gen-py
gen-py/hbase/__init__.py
gen-py/hbase/constants.py
gen-py/hbase/THBaseService.py
gen-py/hbase/ttypes.py

Because thrift2 has no getTableNames() method, we first create a test table by hand.

hbase shell

hbase(main):001:0> create "example", NAME => "family"
0 row(s) in 1.6480 seconds

=> Hbase::Table - example
hbase(main):002:0> 

Suppose the gen-py path is:
/home/zzq/thrift2/gen-py

Then create the test script test.py:

vim test.py

Enter the following:

import sys
import os
import time

from thrift.transport import TTransport
from thrift.transport import TSocket
from thrift.transport import THttpClient
from thrift.protocol import TBinaryProtocol

# Add path for local "gen-py/hbase" for the pre-generated module
sys.path.append("/home/zzq/thrift2/gen-py")
from hbase import THBaseService
from hbase.ttypes import *

print("Thrift2 Demo")
print("This demo assumes you have a table called \"example\" with a column family called \"family\"")

host = "192.168.30.250"
port = 9090
framed = False

socket = TSocket.TSocket(host, port)
if framed:
  transport = TTransport.TFramedTransport(socket)
else:
  transport = TTransport.TBufferedTransport(socket)
protocol = TBinaryProtocol.TBinaryProtocol(transport)
client = THBaseService.Client(protocol)

transport.open()

table = "example"

tableName = str.encode(table)
rowKey =str.encode('row2')
put = TPut()
put.row = rowKey
columnValues=[TColumnValue(family=str.encode("family"),qualifier=str.encode("qualifier2"),value=str.encode("value2"))]
put.columnValues = columnValues
result = client.put(tableName, put)
 
rowKey =str.encode('row2')  
get = TGet()
get.row = rowKey
result = client.get(tableName, get)
print(result.row)
print(result.columnValues)
for i in result.columnValues:
     print(i.value)


transport.close()

Run it with:

python test.py

Output:

(project-env) [zzq@host252 ~]$ python test.py 
Thrift2 Demo
This demo assumes you have a table called "example" with a column family called "family"
b'row2'
[TColumnValue(family=b'family', qualifier=b'qualifier2', value=b'value2', timestamp=1575285482023, tags=None, type=None)]
b'value2'
(project-env) [zzq@host252 ~]$ 
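Under Python 3 the row key, family, qualifier and value all come back as bytes, and the timestamp is in milliseconds since the Unix epoch. A sketch of turning one cell into something readable follows. It assumes UTF-8 encoded values; `Cell` is a hypothetical stand-in for the generated TColumnValue class so the example runs without a server, and `readable` is a helper name invented here.

```python
from collections import namedtuple
from datetime import datetime, timedelta, timezone

# Hypothetical stand-in for the generated TColumnValue class; only the
# fields used below are modelled.
Cell = namedtuple("Cell", "family qualifier value timestamp")

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def readable(cell, encoding="utf-8"):
    """Decode a cell's byte fields and convert its millisecond timestamp."""
    family = cell.family.decode(encoding)
    qualifier = cell.qualifier.decode(encoding)
    return {
        "column": family + ":" + qualifier,
        "value": cell.value.decode(encoding),
        # Integer milliseconds keep the conversion exact.
        "written_at": EPOCH + timedelta(milliseconds=cell.timestamp),
    }


# Values copied from the output above.
cell = Cell(b"family", b"qualifier2", b"value2", 1575285482023)
info = readable(cell)
print(info["column"], "=", info["value"], "written at", info["written_at"].isoformat())
```

With real results, the same function could be applied to each item of result.columnValues, since those objects expose the same four attribute names.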



Possible error: ImportError: cannot import name 'THBaseService'

(project-env) [zzq@host252 ~]$ python3.6  test.py
Traceback (most recent call last):
  File "test.py", line 12, in <module>
    from hbase import THBaseService
ImportError: cannot import name 'THBaseService'

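Cause: Python finds the hbase package under project-env/lib/python3.6/site-packages/hbase/ first, because sys.path.append puts the gen-py directory at the end of the search path.

As a result, the generated code in the gen-py directory is never picked up.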
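The search-order effect can be reproduced without HBase. The sketch below builds two throwaway directories that each contain a package with the same made-up name (`demo_hbase_pkg`), one standing in for site-packages and one for gen-py, and compares sys.path.append with sys.path.insert(0, ...). It only illustrates how Python resolves the import; whether putting gen-py first is appropriate for your environment is something to verify yourself.

```python
import importlib
import os
import sys
import tempfile


def make_package(root, name, origin):
    """Create a minimal package `name` under `root` that records its origin."""
    pkg_dir = os.path.join(root, name)
    os.makedirs(pkg_dir)
    with open(os.path.join(pkg_dir, "__init__.py"), "w") as fh:
        fh.write("ORIGIN = %r\n" % origin)


def import_fresh(name):
    """Import `name`, discarding any previously cached copy first."""
    sys.modules.pop(name, None)
    importlib.invalidate_caches()
    return importlib.import_module(name)


# Two directories holding a package with the same name.
site_dir = tempfile.mkdtemp()
gen_dir = tempfile.mkdtemp()
make_package(site_dir, "demo_hbase_pkg", "site-packages")
make_package(gen_dir, "demo_hbase_pkg", "gen-py")

sys.path.insert(0, site_dir)   # already early on the path, like site-packages
sys.path.append(gen_dir)       # what sys.path.append("…/gen-py") does
appended = import_fresh("demo_hbase_pkg").ORIGIN

sys.path.remove(gen_dir)
sys.path.insert(0, gen_dir)    # search the generated code first instead
inserted = import_fresh("demo_hbase_pkg").ORIGIN

print("append    -> %s" % appended)
print("insert(0) -> %s" % inserted)
```

With append, the copy that was already on the path wins; with insert(0, ...), the generated copy is found first.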

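Fix 1

Rename the directory

Rename the generated gen-py directory to genpy; otherwise the import fails under Python 3.

Fix 2: overwrite the files in the installed package with the newly generated ones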

(project-env) [zzq@host252 ~]$ cp thrift2/gen-py/hbase/*    ~/my-python2hbase-env/project-env/lib/python3.6/site-packages/hbase/
(project-env) [zzq@host252 ~]$ ll ~/my-python2hbase-env/project-env/lib/python3.6/site-packages/hbase/
total 1200
-rw-rw-r--. 1 zzq zzq    366 Dec  2 16:59 constants.py
-rw-rw-r--. 1 zzq zzq 240677 Dec  2 16:44 Hbase.py
-rw-rw-r--. 1 zzq zzq     51 Dec  2 16:59 __init__.py
-rw-rw-r--. 1 zzq zzq    199 Dec  2 16:59 __init__.pyc
drwxrwxr-x. 2 zzq zzq   4096 Dec  2 16:49 __pycache__
-rw-rw-r--. 1 zzq zzq 369677 Dec  2 16:59 THBaseService.py
-rw-rw-r--. 1 zzq zzq 359818 Dec  2 16:59 THBaseService.pyc
-rw-rw-r--. 1 zzq zzq  14357 Dec  2 16:59 THBaseService-remote
-rw-rw-r--. 1 zzq zzq 119702 Dec  2 16:59 ttypes.py
-rw-rw-r--. 1 zzq zzq  97317 Dec  2 16:59 ttypes.pyc

Further reading

https://blog.csdn.net/qq_21153619/article/details/86502624

https://blog.csdn.net/m0_37634723/article/details/79191420

https://blog.csdn.net/zjerryj/article/details/80045657

https://blog.csdn.net/luanpeng825485697/article/details/81048468

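Appendix: installing standalone HBase and starting thrift

Install the JDK

Set JAVA_HOME, which HBase depends on.

Reference article

Linux software (1): installing the JDK on CentOS

Download HBase

Download page: http://hbase.apache.org/downloads.html

Install HBase locally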

root@master:/usr/local/setup_tools# tar -zxvf hbase-2.0.0-bin.tar.gz 
root@master:/usr/local/setup_tools# mv hbase-2.0.0 /usr/local/
root@master:/usr/local/setup_tools# cd /usr/local
root@master:/usr/local# ls | grep hbase
hbase-2.0.0


root@master:/usr/local/hbase-2.0.0# vi /etc/profile

export HBASE_HOME=/usr/local/hbase-2.0.0
export PATH=.:$PATH:$JAVA_HOME/bin:$SCALA_HOME/bin:$HADOOP_HOME/bin:$SPARK_HOME/bin:$HIVE_HOME/bin:$FLUME_HOME/bin:$ZOOKEEPER_HOME/bin:$KAFKA_HOME/bin:$IDEA_HOME/bin:$eclipse_HOME:$MAVEN_HOME/bin:$ALLUXIO_HOME/bin:$HBASE_HOME/bin

root@master:/usr/local/hbase-2.0.0# source /etc/profile

Configuration

Edit hbase-site.xml to set the root directory where data is stored.

root@master:/usr/local/hbase-2.0.0/conf# vi hbase-site.xml
<configuration>
    <property>
        <name>hbase.rootdir</name>
        <value>file:///usr/local/hbase-2.0.0/data</value>
    </property>
 
</configuration>
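To double-check what a given hbase-site.xml actually sets, the file can be read with the standard library. This is a sketch under the assumption that the file follows the simple property/name/value layout shown above; `read_properties` is a helper name invented for this example, and the sample string is the configuration from this article rather than a file read from disk.

```python
import xml.etree.ElementTree as ET


def read_properties(xml_text):
    """Return {name: value} for every <property> in an hbase-site.xml string."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        if name is not None:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


SAMPLE = """<configuration>
    <property>
        <name>hbase.rootdir</name>
        <value>file:///usr/local/hbase-2.0.0/data</value>
    </property>
</configuration>"""

config = read_properties(SAMPLE)
print("hbase.rootdir = %s" % config.get("hbase.rootdir"))
```

For a real installation, pass the contents of conf/hbase-site.xml to the function instead of the sample string.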

Start HBase

root@master:/usr/local/hbase-2.0.0# cd bin
root@master:/usr/local/hbase-2.0.0/bin# ls
considerAsDead.sh     hbase             hbase-config.cmd  hbase-jruby             master-backup.sh  replication               start-hbase.sh  zookeepers.sh
draining_servers.rb   hbase-cleanup.sh  hbase-config.sh   hirb.rb                 region_mover.rb   rolling-restart.sh        stop-hbase.cmd
get-active-master.rb  hbase.cmd         hbase-daemon.sh   local-master-backup.sh  regionservers.sh  shutdown_regionserver.rb  stop-hbase.sh
graceful_stop.sh      hbase-common.sh   hbase-daemons.sh  local-regionservers.sh  region_status.rb  start-hbase.cmd           test


root@master:/usr/local/hbase-2.0.0/bin# start-hbase.sh
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/hbase-2.0.0/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/hadoop-2.6.0/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/alluxio-1.7.0-hadoop-2.6/client/alluxio-1.7.0-client.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
running master, logging to /usr/local/hbase-2.0.0/logs/hbase-root-master-master.out


root@master:/usr/local/hbase-2.0.0/bin# jps
2757 Jps
2685 HMaster

Use the hbase shell

root@master:/usr/local/hbase-2.0.0/bin#  hbase shell
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/hbase-2.0.0/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/hadoop-2.6.0/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/alluxio-1.7.0-hadoop-2.6/client/alluxio-1.7.0-client.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
HBase Shell
Use "help" to get list of supported commands.
Use "exit" to quit this interactive shell.
Version 2.0.0, r7483b111e4da77adbfc8062b3b22cbe7c2cb91c1, Sun Apr 22 20:26:55 PDT 2018
Took 0.0044 seconds                                                                                                                                                    
hbase(main):001:0>

hbase(main):003:0> version
2.0.0, r7483b111e4da77adbfc8062b3b22cbe7c2cb91c1, Sun Apr 22 20:26:55 PDT 2018
Took 0.0054 seconds                                                                                                                                                    
hbase(main):004:0> 

Start the HBase thrift service

root@master:/usr/local/hbase-2.0.0/bin# hbase-daemon.sh start thrift
running thrift, logging to /usr/local/hbase-2.0.0/logs/hbase-root-thrift-master.out
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/hbase-2.0.0/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/hadoop-2.6.0/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/alluxio-1.7.0-hadoop-2.6/client/alluxio-1.7.0-client.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.

root@master:/usr/local/hbase-2.0.0/bin# jps
3332 Jps
3254 ThriftServer
2685 HMaster