Cassandra Study Notes, Part 5: Using Keys the Right Way

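NoSQL data models are usually denormalized: they encourage storing redundant data so that queries stay simple and you never have to write very complex SQL statements.

Cassandra has the following five kinds of key: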

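Primary Key: identifies a single row; it can consist of one column or several.

    PRIMARY KEY(key_part_one, key_part_two)

        key_part_one - partition key
        key_part_two - clustering key

    PRIMARY KEY((k_part_one, k_part_two), k_clust_one, k_clust_two, k_clust_three)

        (k_part_one, k_part_two) - composite partition key
        k_clust_one, k_clust_two, k_clust_three - composite clustering key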

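Partition Key: decides where a row lives. Cassandra hashes the partition key and uses the result to choose which node stores the record. The partition key can be a single column or a composite of several columns. A primary key must be unique within a table: writing a row whose primary key already exists overwrites the existing row instead of creating a duplicate.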
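
As a rough illustration of the hashing step, the Python sketch below maps a partition key to one of a few nodes. It is only a model under stated assumptions: the node names are made up, MD5 stands in for whatever partitioner the cluster actually uses, and a plain modulo replaces Cassandra's token ring.

```python
import hashlib

# Hypothetical node names, for illustration only.
NODES = ["node-a", "node-b", "node-c"]

def node_for(partition_key: tuple) -> str:
    """Pick a node from the hash of the whole partition key.

    All partition key columns are hashed together, so a composite
    partition key only identifies a node when every part is known.
    """
    raw = "\x00".join(str(part) for part in partition_key).encode("utf-8")
    # MD5 is a stand-in hash, not Cassandra's partitioner.
    token = int.from_bytes(hashlib.md5(raw).digest()[:8], "big")
    return NODES[token % len(NODES)]

# The same partition key always lands on the same node.
print(node_for(("northamerica",)) == node_for(("northamerica",)))  # True
```

Because the node is derived from the full key, knowing only one part of a composite partition key is not enough to locate the data.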

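Cassandra attaches a write timestamp to the data it stores. When several versions of the same row exist, Cassandra returns the one with the most recent timestamp.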
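
The "newest timestamp wins" rule can be sketched in a few lines of Python. The `Version` type and the sample values are hypothetical, and the sketch ignores details of real reconciliation such as tie-breaking and deletions.

```python
from typing import List, NamedTuple

class Version(NamedTuple):
    value: str
    write_time: int  # assumed unit: microseconds since the epoch

def reconcile(versions: List[Version]) -> Version:
    """Return the version carrying the newest write timestamp."""
    return max(versions, key=lambda v: v.write_time)

versions = [
    Version("john", 1_000),
    Version("johnny", 3_000),
    Version("jon", 2_000),
]
print(reconcile(versions).value)  # johnny
```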

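Composite Key: a partition key made up of more than one column, such as (k_part_one, k_part_two) above.
Compound Key: a primary key made up of more than one column, that is, a partition key plus one or more clustering keys.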
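Clustering Key: used mainly for range queries. When you restrict clustering columns in a query, you must supply them in the order in which they were declared in the table.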

-- create the table
-- note the state column
CREATE TABLE users (
  mainland text,
  state text,
  uid int,
  name text,
  zip int,
  PRIMARY KEY ((mainland), state, uid)
);

-- insert some rows
insert into users (mainland, state, uid, name, zip) VALUES ( 'northamerica', 'washington', 1, 'john', 98100);
insert into users (mainland, state, uid, name, zip) VALUES ( 'northamerica', 'texas', 2, 'lukas', 75000);
insert into users (mainland, state, uid, name, zip) VALUES ( 'northamerica', 'delaware', 3, 'henry', 19904);
insert into users (mainland, state, uid, name, zip) VALUES ( 'northamerica', 'delaware', 4, 'dawson', 19910);
insert into users (mainland, state, uid, name, zip) VALUES ( 'centraleurope', 'italy', 5, 'fabio', 20150);
insert into users (mainland, state, uid, name, zip) VALUES ( 'southamerica', 'argentina', 6, 'alex', 10840);

A valid query:

  select * from users where mainland = 'northamerica' and state > 'ca' and state < 'ny';

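(mainland), state, uid together form the compound primary key. mainland is the partition key; it is located by hash, so it is matched with =. state and uid are the clustering keys; they are kept in a sorted map, so they can be used for range queries. Conceptually the layout is: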

Map<RowKey, SortedMap<ColumnKey, ColumnValue>>
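
To make the map-of-sorted-maps idea concrete, here is a minimal Python model of the users table. It is a sketch, not how Cassandra is implemented: a dict plays the hash lookup on mainland, a sorted list plays the SortedMap on (state, uid), and the helper names `insert` and `select_state_range` are invented for this example.

```python
from bisect import bisect_left, bisect_right

# partition key -> list of ((state, uid), data), kept sorted by (state, uid)
table = {}

def insert(mainland, state, uid, name, zip_code):
    rows = table.setdefault(mainland, [])
    key = (state, uid)
    i = bisect_left([k for k, _ in rows], key)
    data = {"name": name, "zip": zip_code}
    if i < len(rows) and rows[i][0] == key:
        rows[i] = (key, data)        # same primary key: overwrite
    else:
        rows.insert(i, (key, data))  # keep clustering order

def select_state_range(mainland, low, high):
    """Model of: mainland = ? AND state > low AND state < high."""
    rows = table.get(mainland, [])      # hash lookup, equality only
    states = [k[0] for k, _ in rows]    # already sorted
    start = bisect_right(states, low)   # first state > low
    end = bisect_left(states, high)     # first state >= high
    return rows[start:end]

insert("northamerica", "washington", 1, "john", 98100)
insert("northamerica", "texas", 2, "lukas", 75000)
insert("northamerica", "delaware", 3, "henry", 19904)
insert("northamerica", "delaware", 4, "dawson", 19910)
insert("centraleurope", "italy", 5, "fabio", 20150)

result = select_state_range("northamerica", "ca", "ny")
print([data["name"] for _, data in result])  # ['henry', 'dawson']
```

The range on state becomes a contiguous slice of the sorted list, which is why a range restriction on a clustering key is cheap once the partition is known.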

An invalid query:

-- state is not provided
select * from users where mainland = 'northamerica' and uid < 5;

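Cassandra's data as a whole can be thought of as one huge nested map. You can only descend through it one level at a time, in order; you cannot skip a level in the middle.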
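
A small Python illustration of why a level cannot be skipped, assuming the same sorted-by-(state, uid) model of the northamerica partition: the uid values are ordered only within one state, not across the whole partition.

```python
# Rows of the 'northamerica' partition in storage order,
# i.e. sorted by the clustering key (state, uid).
partition = sorted([
    ("washington", 1), ("texas", 2), ("delaware", 3), ("delaware", 4),
])

uids_in_storage_order = [uid for _, uid in partition]
print(uids_in_storage_order)  # [3, 4, 2, 1]

# Within a single state the uids are sorted, so a condition on
# state = ? AND uid < ? maps onto a contiguous slice.
delaware_uids = [uid for state, uid in partition if state == "delaware"]
print(delaware_uids)  # [3, 4]
```

Without a value for state, a condition on uid alone does not correspond to any contiguous range in this ordering, so it cannot be answered by a simple range scan.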

create table stackoverflow (
key_part_one text,
key_part_two int,
data text,
PRIMARY KEY(key_part_one, key_part_two)
);

select * from stackoverflow where key_part_one = 'ronaldo';          -- correct, no problem

select * from stackoverflow where key_part_two = 9 ALLOW FILTERING;  -- only works with ALLOW FILTERING
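
The difference between the two queries can be pictured with a small Python model. The sample rows are made up, and the function names are hypothetical; the point is only that the second query has no partition to go to and must look everywhere.

```python
# partition key -> {clustering key -> data}; the sample rows are made up
stackoverflow = {
    "ronaldo": {7: "a", 9: "b"},
    "messi": {9: "c", 10: "d"},
}

def by_partition(key_part_one):
    """Model of WHERE key_part_one = ?: one direct lookup."""
    return stackoverflow.get(key_part_one, {})

def by_clustering_only(key_part_two):
    """Model of WHERE key_part_two = ? ALLOW FILTERING: visit every partition."""
    hits = []
    for part, rows in stackoverflow.items():
        if key_part_two in rows:
            hits.append((part, key_part_two, rows[key_part_two]))
    return hits

print(by_partition("ronaldo"))  # {7: 'a', 9: 'b'}
print(by_clustering_only(9))    # [('ronaldo', 9, 'b'), ('messi', 9, 'c')]
```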

PRIMARY KEY((col1, col2), col3, col4)

Valid WHERE conditions:

col1 and col2
col1 and col2 and col3
col1 and col2 and col3 and col4

Invalid WHERE conditions:
col1 (with only part of the partition key, Cassandra cannot work out where the row is)
col1 and col2 and col4
anything that does not contain both col1 and col2
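
The rules above can be condensed into a small checker. This is a simplified sketch: `is_valid_where` is a hypothetical helper, and it deliberately ignores secondary indexes, ALLOW FILTERING, and which operator is applied to each column.

```python
def is_valid_where(partition_cols, clustering_cols, restricted):
    """Check a set of restricted columns against the primary key layout.

    Simplified rule: every partition key column must be restricted, and
    the restricted clustering columns must form a prefix of the declared
    clustering order.
    """
    restricted = set(restricted)
    if not set(partition_cols) <= restricted:
        return False
    seen_gap = False
    for col in clustering_cols:
        if col in restricted:
            if seen_gap:       # a restricted column after a skipped one
                return False
        else:
            seen_gap = True
    return True

pk, ck = ["col1", "col2"], ["col3", "col4"]
print(is_valid_where(pk, ck, ["col1", "col2"]))                  # True
print(is_valid_where(pk, ck, ["col1", "col2", "col3"]))          # True
print(is_valid_where(pk, ck, ["col1", "col2", "col3", "col4"]))  # True
print(is_valid_where(pk, ck, ["col1"]))                          # False
print(is_valid_where(pk, ck, ["col1", "col2", "col4"]))          # False
```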

Summary:

    Storage in Cassandra is a 2-level nested map:

    Partition Key -> Clustering Key -> Data

    partition key: can be used in WHERE with = and IN

    clustering key: can be used in WHERE with <, <=, =, >=, > and IN
