Ceph review summary

One, Ceph concepts

  Ceph is a unified, distributed storage system designed for outstanding performance, reliability and scalability. Ceph provides a file system, block storage and object storage in one unified system, embodied in a distributed cluster that can be expanded dynamically.

Features:

(1) High performance:

  a. Abandons the traditional centralized metadata addressing scheme and uses the CRUSH algorithm instead, giving balanced data distribution and high parallelism.
  b. Replica placement rules can take disaster-recovery isolation domains into account, for example placing copies across rooms or with rack awareness (a hedged example follows this list).
  c. Can scale to thousands of storage nodes and supports data volumes from the TB to the PB level.
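
The rack-aware placement mentioned in item b is expressed through CRUSH rules. A minimal sketch, assuming a Luminous-or-later cluster whose CRUSH map already contains rack buckets; the rule and pool names are made up:

ceph osd crush rule create-replicated rack_rule default rack   # place each replica in a different rack
ceph osd pool set mypool crush_rule rack_rule                  # apply the rule to an existing pool (name assumed)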

(2) High Availability:

  a. The number of replicas can be flexibly controlled (see the sketch after this list).
  b. Supports failure-domain separation and strong data consistency.
  c. Automatic self-healing repair in a variety of failure scenarios.
  d. No single point of failure; automatic management.
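
For item a, the replica count is a per-pool setting. A hedged example, reusing the "vms" pool created later in this post:

ceph osd pool set vms size 3       # keep three copies of every object
ceph osd pool set vms min_size 2   # still serve IO when only two copies are available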

(3) Scalability:

  a. Decentralized design.
  b. Flexible expansion.
  c. Performance increases linearly as nodes are added.

(4) Feature-rich:

  a. Supports three storage interfaces: block storage, file storage and object storage.
  b. Supports custom interfaces and drivers for multiple languages.

Two, Ceph components

(1) Monitors: Monitor and maintain the various cluster maps, and also provide authentication and logging services. The monitor map contains end-node information, including the Ceph cluster ID and the monitor host names, IPs and ports. It also stores the current version and the most recent changes; the monitor map can be viewed with "ceph mon dump".
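
The monitor map mentioned above can be inspected from any admin node; a small sketch using only the standard CLI:

ceph mon dump   # print the monitor map: fsid, monitor names, addresses and ports
ceph mon stat   # one-line monitor status, including the current quorum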

(2) MDS (Metadata Server): The Ceph metadata server, which mainly stores the metadata of the Ceph file system. Note: Ceph block storage and Ceph object storage do not need an MDS.
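
To check the MDS state, assuming a CephFS file system has already been created, something like the following can be used (the fs status subcommand is available on recent releases):

ceph mds stat   # summary of MDS daemons and their states
ceph fs status  # per-filesystem view of MDS ranks and the data/metadata pools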

(3) OSD: The object storage daemon, although despite the name it is not tied to the object storage interface. It corresponds to a physical disk; the data of each cluster node is stored on its physical disks in the form of objects. The OSD is responsible for storing data and for handling data replication, recovery, backfilling and rebalancing; the vast majority of the work of storing data is done by the OSD daemons. When building Ceph OSDs, SSD disks and xfs-formatted partitions are recommended. In addition, each OSD performs heartbeat checks on other OSDs and reports the results to the monitors.
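
The OSD view of the cluster can be checked with the standard CLI; a short sketch:

ceph osd stat   # how many OSDs exist and how many are up/in
ceph osd df     # per-OSD capacity, usage and weight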

(4) RADOS: Reliable Autonomic Distributed Object Store. RADOS is the foundation of the Ceph storage cluster. In Ceph all data is stored in the form of objects, and no matter what type of data it is, RADOS is responsible for saving these objects. The RADOS layer ensures that the data always remains consistent.
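
Objects can be written to and read from RADOS directly with the rados tool, which makes the object layer easy to see; a hedged example with an assumed pool name "testpool" and assumed local file names:

rados -p testpool put hello ./hello.txt   # store a local file as the object "hello"
rados -p testpool ls                      # list objects in the pool
rados -p testpool get hello ./hello.out   # read the object back into a local file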

(5) Librados: The librados library provides an access interface for the application layer. It also provides the native interfaces on which the block storage, object storage and file system layers are built.

(6) RADOSGW: The gateway interface, which provides the object storage service. It uses librgw and librados, allowing applications to establish a connection with the Ceph object store, and provides RESTful APIs compatible with S3 and Swift (OpenStack).
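
Before S3 or Swift clients can talk to RADOSGW, a gateway user is needed; a minimal sketch with an assumed uid:

radosgw-admin user create --uid=demo --display-name="Demo User"   # prints the access and secret keys used by S3 clients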

(7) RBD: The block device. It supports resizing and thin provisioning, and stores its data across multiple OSDs.
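
A short sketch of the resizing and thin-provisioning behaviour, using assumed pool and image names:

rbd create --size 10240 vms/disk01   # 10 GiB image; space is thin-provisioned, not pre-allocated
rbd info vms/disk01                  # show size, object size and striping details
rbd resize --size 20480 vms/disk01   # grow the image to 20 GiB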

(8) CephFS: The Ceph file system, a POSIX-compliant file system built on top of the native librados interface.
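
A hedged example of mounting CephFS with the kernel client; the monitor address and the secret file path are assumptions:

mount -t ceph 192.168.1.10:6789:/ /mnt/cephfs -o name=admin,secretfile=/etc/ceph/admin.secret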

Three, Ceph data storage process

  Whichever storage interface is used (object, block or file system), the data to be stored is split into objects. The object size can be adjusted by the administrator and is usually 2 MB or 4 MB. Each object has a unique OID, generated from ino and ono:

  ino: the file's File ID, a globally unique identifier for each file
  ono: the number of the slice (which piece of the file this object is)

 Ceph addressing goes through at least the following three mappings:

  (1) File -> Object mapping
  (2) Object -> PG mapping: hash(OID) & mask -> PGID
  (3) PG -> OSD mapping: the CRUSH algorithm (see the command sketch below)
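
The whole chain (object -> PG -> OSD) can be observed for any object with the osd map subcommand; the pool and object names below are just examples:

ceph osd map vms disk01.rbd   # prints the PG id and the up/acting OSD set chosen by CRUSH for this object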

  pool: a logical partition Ceph uses when storing data; it plays the role of a namespace. Each pool contains a (configurable) number of PGs, and the objects in a PG are mapped onto different OSDs. A pool is distributed across the entire cluster. A pool can also act as a fault-isolation domain, so isolation is not handled uniformly but chosen according to the user scenario.
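
The per-pool settings mentioned here (PG count, replica size, CRUSH rule) can be listed in one place; a small sketch:

ceph osd pool ls detail   # shows each pool's replica size, min_size, pg_num and crush rule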

Four, Ceph IO process

1, Normal IO process

 

Steps:

  1) The client creates a cluster handle.
  2) The client reads the configuration file.
  3) The client connects to a monitor and obtains the cluster map information.
  4) For read/write IO, the client uses the CRUSH map to locate the corresponding primary OSD data node.
  5) The primary OSD data node writes the data to the other two replica nodes at the same time.
  6) Wait for the write status of the primary node and the other two replica nodes.
  7) After the primary node and the replica nodes all report a successful write, the result is returned to the client and the IO write is complete.

2, New-primary IO process

Steps of the new-primary IO process:

  1) The client connects to a monitor and obtains the cluster map information.
  2) At the same time, the new primary osd1 has no PG data, so it proactively reports to the monitor and asks osd2 to take over as temporary primary.
  3) The temporary primary osd2 performs a full data sync to the new primary osd1.
  4) The client's read/write IO connects directly to the temporary primary osd2.
  5) osd2 receives the read/write IO and writes to the other two replica nodes at the same time.
  6) Wait for osd2 and the other two replicas to complete their writes.
  7) Once all three copies are written successfully, osd2 returns to the client; the client IO is then complete.
  8) When osd1 has finished syncing the data, the temporary primary osd2 hands over the primary role.
  9) osd1 becomes the primary and osd2 becomes a replica.
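
While the temporary-primary arrangement and the full sync in steps 2-3 and 8-9 are in progress, the affected PGs pass through remapped/backfilling states, which can be watched with the standard status commands; a short sketch:

ceph pg stat        # PG state summary (states such as active+remapped+backfilling appear during the switch-over)
ceph health detail  # lists the degraded or remapped PGs while recovery is running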

Five, Ceph commands

1, Check the Ceph cluster status:

ceph -s

2, View the OSD status:

ceph osd tree

3, List the pools:

ceph osd lspools

4, Create a pool:

ceph osd pool create vms 1024
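
On Luminous and later releases, a newly created pool should also be tagged with the application that will use it, otherwise the cluster reports a warning; a hedged follow-up for the pool above, assuming it is intended for RBD:

ceph osd pool application enable vms rbd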
