HBase (2): Common HBase Shell Syntax

HBase Shell is the command-line tool that ships with HBase. It lets us maintain and operate on HBase tables. To enter it, run the command hbase shell.

1. Basic operations

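In the HBase shell, status shows the current state of the service, version shows the HBase version, and whoami shows the logged-in user together with the authentication method.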

hbase(main):001:0> status
1 active master, 0 backup masters, 3 servers, 0 dead, 1.6667 average load

hbase(main):002:0> version
1.2.0-cdh5.9.2, rUnknown, Tue Apr  4 01:53:51 PDT 2017

hbase(main):003:0> whoami
root (auth:SIMPLE)
    groups: root

The help command in the HBase shell is very useful. Use help to list all commands, and help 'command_name' to get detailed information on a single command. For example:

hbase(main):004:0> help 'create'
Creates a table. Pass a table name, and a set of column family
specifications (at least one), and, optionally, table configuration.
Column specification can be a simple string (name), or a dictionary
(dictionaries are described below in main help output), necessarily 
including NAME attribute. 
Examples:

Create a table with namespace=ns1 and table qualifier=t1
  hbase> create 'ns1:t1', {NAME => 'f1', VERSIONS => 5}

Create a table with namespace=default and table qualifier=t1
  hbase> create 't1', {NAME => 'f1'}, {NAME => 'f2'}, {NAME => 'f3'}
  hbase> # The above in shorthand would be the following:
  hbase> create 't1', 'f1', 'f2', 'f3'
  hbase> create 't1', {NAME => 'f1', VERSIONS => 1, TTL => 2592000, BLOCKCACHE => true}
  hbase> create 't1', {NAME => 'f1', CONFIGURATION => {'hbase.hstore.blockingStoreFiles' => '10'}}

2. Namespaces

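In HBase, a namespace is a logical grouping of tables. Tables in the same namespace usually serve a similar purpose, and namespaces are also the unit for security controls such as quotas and permissions. You can think of a namespace as the equivalent of a database name in a relational database.
HBase predefines two built-in namespaces:
• hbase: the system namespace, which holds HBase's internal tables
• default: every table created without an explicit namespace goes here
We can create a namespace with the create_namespace command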

hbase(main):005:0> create_namespace 'xudong'

Use drop_namespace to delete a namespace

hbase(main):008:0> drop_namespace 'xudong'
0 row(s) in 0.0620 seconds

Use describe_namespace to show the metadata configured for a namespace

hbase(main):010:0> describe_namespace 'xudong'
DESCRIPTION                                                                                                                           
{NAME => 'xudong'}                                                                                                                    
1 row(s) in 0.0200 seconds

Use list_namespace to show all namespaces

hbase(main):011:0> list_namespace
NAMESPACE                                                                                                                             
default                                                                                                                               
hbase                                                                                                                                 
xudong                                                                                                                                
3 row(s) in 0.0170 seconds

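Note: the plain list command shows all user tables across every namespace (tables outside default appear as namespace:table). It accepts an optional regular expression to filter the output.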

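By default, create 'table_name', 'column_family1' puts the table in the default namespace. To create a table in a namespace of your own, prefix the table name with the namespace, as in create 'namespace:table_name', 'column_family1':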

hbase(main):013:0> create 'xudong:userTest','info'
0 row(s) in 1.3280 seconds
=> Hbase::Table - xudong:userTest
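The naming rule above can be illustrated with a small Python sketch. This is only a toy model of how a table name maps to a namespace, not HBase's own parsing code; the function name split_table_name is made up for this example.

```python
DEFAULT_NAMESPACE = "default"


def split_table_name(name):
    """Split 'namespace:qualifier' into its parts.

    A bare name without ':' is assumed to belong to the 'default' namespace.
    """
    namespace, sep, qualifier = name.partition(":")
    if not sep:
        return DEFAULT_NAMESPACE, name
    return namespace, qualifier


print(split_table_name("xudong:userTest"))  # ('xudong', 'userTest')
print(split_table_name("test1"))            # ('default', 'test1')
```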

Use list_namespace_tables to list all tables in a given namespace

hbase(main):015:0> list_namespace_tables 'xudong'
TABLE                                                                                                                                 
userTest                                                                                                                              
1 row(s) in 0.0280 seconds

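3. DDL statements
The create command creates an HBase table. Its first argument is the table name, followed by a list of column families. Each column family can independently specify the number of versions to keep (VERSIONS), how long data is retained (TTL), whether the block cache is enabled (BLOCKCACHE), and so on.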

hbase(main):025:0> create 'test1',{NAME => 'f1',VERSIONS => 1},{NAME => 'f2',VERSIONS => 1}

Use describe to view the table's metadata

hbase(main):027:0> describe 'test1'
Table test1 is ENABLED                                                                                                                
test1                                                                                                                                 
COLUMN FAMILIES DESCRIPTION                                                                                                           
{NAME => 'f1', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NON
E', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'
}                                                                                                                                     
{NAME => 'f2', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NON
E', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'
}                                                                                                                                     
2 row(s) in 0.0460 seconds
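To make the VERSIONS attribute in the describe output more concrete, here is a rough Python sketch of what "keep at most N versions per cell" means. It is a simplified model under the assumption that only the newest timestamps survive; real HBase prunes extra versions lazily (at read time and during compaction), and put_cell is a hypothetical helper, not an HBase API.

```python
def put_cell(versions, timestamp, value, max_versions=1):
    """Return a new {timestamp: value} map holding only the newest versions."""
    updated = dict(versions)
    updated[timestamp] = value
    # Sort by timestamp, newest first, and keep at most max_versions entries.
    newest = sorted(updated.items(), reverse=True)[:max_versions]
    return dict(newest)


cell = {}
for ts, value in [(100, "a"), (200, "b"), (300, "c")]:
    cell = put_cell(cell, ts, value, max_versions=2)

print(cell)             # {300: 'c', 200: 'b'}
print(cell[max(cell)])  # a plain read returns the newest version: c
```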

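You can also enable or disable the table with enable and disable, and check its current state with is_enabled and is_disabled.
Use exists to check whether a table exists.

Use alter to change table attributes, such as the attributes of a column family. The most common uses are adding and deleting a column family.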

hbase(main):028:0> alter 'test1','f3'
Updating all regions with the new schema...
1/1 regions updated.
Done.
0 row(s) in 2.0380 seconds

hbase(main):030:0> alter 'test1','delete' => 'f3'
Updating all regions with the new schema...
1/1 regions updated.
Done.
0 row(s) in 1.9510 seconds
The column family can also be deleted with the following statement:
alter 'test1',{ NAME => 'f3', METHOD => 'delete'}

Use describe to check whether these two statements succeeded.

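4. Adding and querying data (put & get)
First we create a table test2 with three column families (name, baseinfo and extinfo) and insert some data. Columns under a column family do not need to be created in advance.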

hbase(main):031:0> create 'test2','name','baseinfo','extinfo'
0 row(s) in 1.2640 seconds

=> Hbase::Table - test2
hbase(main):032:0> put 'test2','xudong','baseinfo:age','20'
0 row(s) in 0.1990 seconds

hbase(main):033:0> put 'test2','xudong','baseinfo:edu','hfut'
0 row(s) in 0.0190 seconds

hbase(main):034:0> put 'test2','michael','baseinfo:age','26'
0 row(s) in 0.0100 seconds

hbase(main):035:0> put 'test2','michael','baseinfo:career','it'
0 row(s) in 0.0160 seconds

hbase(main):036:0> put 'test2','michael','extinfo:hobby','code'
0 row(s) in 0.0100 seconds

Get all data for a given row key

hbase(main):037:0> get 'test2','xudong'
COLUMN                             CELL                                                                                               
baseinfo:age                      timestamp=1499694539376, value=20                                                                  
baseinfo:edu                      timestamp=1499694574274, value=hfut                                                                
2 row(s) in 0.0320 seconds

You can also get all data for a given row key and column family

hbase(main):038:0> get 'test2','michael','baseinfo'
COLUMN                             CELL                                                                                               
 baseinfo:age                      timestamp=1499694608816, value=26                                                                  
 baseinfo:career                   timestamp=1499694672665, value=it                                                                  
2 row(s) in 0.0300 seconds
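The put and get behaviour above can be mimicked with a small in-memory Python model. It is only a sketch of the data model (fixed column families, columns created on first write); ToyTable and its methods are invented for illustration and ignore timestamps and versions.

```python
class ToyTable:
    """Toy model: column families are fixed, qualifiers appear on first put."""

    def __init__(self, name, *families):
        self.name = name
        self.families = set(families)
        self.rows = {}  # row key -> {"family:qualifier": value}

    def put(self, row, column, value):
        family = column.split(":", 1)[0]
        if family not in self.families:
            # Assumption: writing to an undeclared family is rejected.
            raise KeyError(f"unknown column family: {family}")
        self.rows.setdefault(row, {})[column] = value

    def get(self, row, family=None):
        cells = self.rows.get(row, {})
        if family is None:
            return dict(cells)
        return {c: v for c, v in cells.items() if c.split(":", 1)[0] == family}


test2 = ToyTable("test2", "name", "baseinfo", "extinfo")
test2.put("michael", "baseinfo:age", "26")
test2.put("michael", "baseinfo:career", "it")
test2.put("michael", "extinfo:hobby", "code")

print(test2.get("michael", "baseinfo"))
# {'baseinfo:age': '26', 'baseinfo:career': 'it'}
```

Note that a misspelled row key simply yields an empty result rather than an error, which matches how get behaves in the shell.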

5. DML statements

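The delete command removes a single cell, identified by table, row key and column. The example below deletes only baseinfo:age for the row michael; baseinfo:career is untouched, as the scan output in the next section confirms. The get that follows returns 0 rows only because the row key is misspelled as 'micahel'. Internally, delete does not erase the data right away: it writes a delete marker (tombstone) that hides the cell from both get and scan, and the data is physically removed during a later major compaction.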

hbase(main):005:0> delete 'test2','michael','baseinfo:age'
0 row(s) in 0.0660 seconds

hbase(main):006:0> get 'test2','micahel','baseinfo'
COLUMN                             CELL                                                                                               
0 row(s) in 0.0110 seconds
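How a delete can hide a cell without erasing it immediately can be sketched in Python. This is a toy model, not HBase internals: it assumes a delete marker covers every version at or before its timestamp, and the ToyRow class and its raw flag are illustrative names only.

```python
class ToyRow:
    """Toy model of one row with delete markers (tombstones)."""

    def __init__(self):
        self.cells = {}       # column -> list of (timestamp, value)
        self.tombstones = {}  # column -> timestamp of the newest delete marker

    def put(self, column, timestamp, value):
        self.cells.setdefault(column, []).append((timestamp, value))

    def delete(self, column, timestamp):
        # Record a marker instead of removing the stored versions.
        previous = self.tombstones.get(column, timestamp)
        self.tombstones[column] = max(previous, timestamp)

    def get(self, raw=False):
        visible = {}
        for column, versions in self.cells.items():
            cutoff = self.tombstones.get(column)
            live = [(ts, v) for ts, v in versions
                    if raw or cutoff is None or ts > cutoff]
            if live:
                visible[column] = max(live)[1]  # newest surviving version
        return visible


row = ToyRow()
row.put("baseinfo:age", 1, "26")
row.put("baseinfo:career", 2, "it")
row.delete("baseinfo:age", 3)

print(row.get())          # {'baseinfo:career': 'it'}
print(row.get(raw=True))  # {'baseinfo:age': '26', 'baseinfo:career': 'it'}
```

A put with a timestamp later than the marker becomes visible again in this model, because the marker only covers older versions.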

6. Scan

Use scan to scan the whole table.

hbase(main):010:0> scan 'test2'
ROW                                COLUMN+CELL                                                                                        
 bye                               column=baseinfo:age, timestamp=1499778403437, value=29                                             
 michael                           column=baseinfo:career, timestamp=1499694672665, value=it                                          
 michael                           column=extinfo:hobby, timestamp=1499694732650, value=code                                          
 xudong                            column=baseinfo:age, timestamp=1499694539376, value=20                                             
 xudong                            column=baseinfo:edu, timestamp=1499694574274, value=hfut                                           
3 row(s) in 0.0750 seconds

You can also scan just one column or one column family

hbase(main):012:0> scan 'test2',{COLUMNS => 'baseinfo:age'}
ROW                                COLUMN+CELL                                                                                        
 bye                               column=baseinfo:age, timestamp=1499778403437, value=29                                             
 xudong                            column=baseinfo:age, timestamp=1499694539376, value=20                                             
2 row(s) in 0.0190 seconds

hbase(main):014:0> scan 'test2',{COLUMNS =>'baseinfo'}
ROW                                COLUMN+CELL                                                                                        
 bye                               column=baseinfo:age, timestamp=1499778403437, value=29                                             
 michael                           column=baseinfo:career, timestamp=1499694672665, value=it                                          
 xudong                            column=baseinfo:age, timestamp=1499694539376, value=20                                             
 xudong                            column=baseinfo:edu, timestamp=1499694574274, value=hfut                                           
3 row(s) in 0.0280 seconds

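Besides COLUMNS, scan supports several other modifiers: LIMIT (maximum number of rows returned), STARTROW (the row key to start from; HBase first locates the region containing this key and then scans forward), STOPROW (the row key to stop at), TIMERANGE (restrict results to a timestamp range), VERSIONS (number of versions to return), and FILTER (filter rows by condition).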
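The effect of STARTROW, STOPROW and LIMIT can be sketched in Python over sorted row keys. This toy scan function is an assumption-laden model, not the HBase client: it treats the start row as inclusive and the stop row as exclusive (the default behaviour of a scan), and compares keys as plain strings.

```python
from bisect import bisect_left


def scan(rows, startrow=None, stoprow=None, limit=None):
    """Return (row key, cells) pairs in sorted key order.

    startrow is inclusive, stoprow is exclusive, limit caps the row count.
    """
    keys = sorted(rows)
    begin = bisect_left(keys, startrow) if startrow is not None else 0
    end = bisect_left(keys, stoprow) if stoprow is not None else len(keys)
    selected = keys[begin:end]
    if limit is not None:
        selected = selected[:limit]
    return [(key, rows[key]) for key in selected]


rows = {
    "xudong": {"baseinfo:age": "20", "baseinfo:edu": "hfut"},
    "bye": {"baseinfo:age": "29"},
    "michael": {"baseinfo:career": "it", "extinfo:hobby": "code"},
}

print([k for k, _ in scan(rows, startrow="c")])                     # ['michael', 'xudong']
print([k for k, _ in scan(rows, startrow="bye", stoprow="michael")])  # ['bye']
print([k for k, _ in scan(rows, limit=2)])                          # ['bye', 'michael']
```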

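7. Filter
FILTER is a very powerful modifier that lets you specify a set of conditions for filtering. For example, to return only the cells whose value equals 29: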

hbase(main):016:0> scan 'test2',FILTER => "ValueFilter(=,'binary:29')"
ROW                                COLUMN+CELL                                                                                        
 bye                               column=baseinfo:age, timestamp=1499778403437, value=29                                             
1 row(s) in 0.0610 seconds

Or only the cells whose value contains the substring 2

hbase(main):017:0> scan 'test2',FILTER => "ValueFilter(=,'substring:2')"
ROW                                COLUMN+CELL                                                                                        
 bye                               column=baseinfo:age, timestamp=1499778403437, value=29                                             
 xudong                            column=baseinfo:age, timestamp=1499694539376, value=20                                             
2 row(s) in 0.0260 seconds

FILTER supports combining multiple conditions with parentheses, AND and OR. If you are interested, see the official API documentation.
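The two comparators used above, and the idea of combining conditions, can be approximated in Python. This is a toy sketch rather than the real filter language: it only handles the = operator, treats binary: as an exact match and substring: as a case-insensitive containment test, and the helper names value_filter, all_of and any_of are made up for this example.

```python
def value_filter(comparator):
    """Build a predicate on cell values from 'binary:x' or 'substring:x'."""
    kind, _, operand = comparator.partition(":")
    if kind == "binary":
        return lambda value: value == operand
    if kind == "substring":
        return lambda value: operand.lower() in value.lower()
    raise ValueError(f"unsupported comparator: {kind}")


def all_of(*predicates):
    """Combine predicates the way AND does."""
    return lambda value: all(p(value) for p in predicates)


def any_of(*predicates):
    """Combine predicates the way OR does."""
    return lambda value: any(p(value) for p in predicates)


# (row, column, value) cells taken from the scan output in section 6.
cells = [
    ("bye", "baseinfo:age", "29"),
    ("michael", "baseinfo:career", "it"),
    ("michael", "extinfo:hobby", "code"),
    ("xudong", "baseinfo:age", "20"),
    ("xudong", "baseinfo:edu", "hfut"),
]


def apply_filter(predicate):
    return [(row, col) for row, col, value in cells if predicate(value)]


print(apply_filter(value_filter("binary:29")))
# [('bye', 'baseinfo:age')]
print(apply_filter(value_filter("substring:2")))
# [('bye', 'baseinfo:age'), ('xudong', 'baseinfo:age')]
print(apply_filter(any_of(value_filter("binary:it"), value_filter("binary:code"))))
# [('michael', 'baseinfo:career'), ('michael', 'extinfo:hobby')]
```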
