A brief look at the basic concepts of storage RAID

1. The basic concept of RAID

RAID (Redundant Array of Independent Disks, or disk array) means "an array of independent disks with redundancy". A disk array combines many independent disks into one large disk group, and the combined effect of the individual disks supplying data together improves the performance of the whole disk system.

2. Development of RAID

RAID technology was originally intended to combine multiple small-capacity hard drives to obtain more storage.
Nowadays, the RAID technology we usually talk about has more to do with data protection. In other words, when a physical device fails, RAID can be used to prevent data loss.

3. The main functions of RAID

· Striping data across the hard disks so that data is accessed in blocks, which reduces the drives' mechanical seek time and improves data access speed
· Accessing several hard disks in the array simultaneously (parallel access), which reduces the mechanical seek time of the hard disks and improves data access speed
· Mirroring data or storing parity information to provide redundant protection for the data
· Combining multiple hard disks into one logical disk group to provide larger storage capacity

4. RAID data organization form

RAID achieves efficient access to data by striping the disk array.
Strip: one or more consecutive sectors on a single hard disk. It is the smallest unit of one data read or write on a hard disk, and it is the element that makes up a stripe.
Stripe: the strips at the same position (same number) on the multiple hard disks of one disk array.

Stripe width: the number of data member disks in a stripe.
Stripe depth: the capacity of a single strip.
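The strip and stripe terms above can be made concrete with a small address calculation. This is only a sketch: the function name `locate_block` and its parameters are hypothetical, and it assumes a simple round-robin layout without parity, with the stripe depth measured in blocks.

```python
def locate_block(logical_block, stripe_width, stripe_depth):
    """Map a logical block number to (disk index, stripe number, offset in strip).

    stripe_width: number of data member disks in a stripe
    stripe_depth: number of blocks in one strip (assumed unit)
    """
    strip_number = logical_block // stripe_depth    # which strip, counted across the array
    offset_in_strip = logical_block % stripe_depth  # position inside that strip
    disk_index = strip_number % stripe_width        # strips rotate over the disks
    stripe_number = strip_number // stripe_width    # same-numbered strips form one stripe
    return disk_index, stripe_number, offset_in_strip


# With 4 data disks and 2 blocks per strip, block 9 is the second block
# of strip 4, which sits on disk 0 in stripe 1.
print(locate_block(9, stripe_width=4, stripe_depth=2))  # (0, 1, 1)
```

Real controllers also have to account for parity placement and metadata, which this sketch leaves out.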

5. RAID data protection

Method 1: Mirroring. Save a copy of the data on another redundant hard disk.
Method 2: Parity check algorithm (XOR). Identical bits give false (0) and differing bits give true (1), so a lost block can be recomputed from the remaining blocks and the parity.
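The XOR parity method can be tried out on a few bytes. The sketch below is illustrative only: `xor_parity` is a hypothetical helper name, and the sample byte strings are made up for the example.

```python
from functools import reduce


def xor_parity(strips):
    """Return the byte-wise XOR of equally sized byte strings."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*strips))


data = [b"\x0f\xf0", b"\x33\x33", b"\x55\xaa"]
parity = xor_parity(data)

# Pretend the second strip is lost: XOR of the survivors and the parity restores it.
rebuilt = xor_parity([data[0], data[2], parity])
print(rebuilt == data[1])  # True
```

Because XOR is its own inverse, the same function both computes the parity and rebuilds a missing strip.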

6. RAID status

RAID technology combines multiple physical hard disks into a RAID group, and the group maintains its own state:
· When all hard disks in the RAID group are working normally, the state of the RAID group is normal
· When some hard disks have failed but the RAID group can still prevent data loss, and the data recovery process has not yet started, the state is called degraded
· When a failed hard disk has been replaced, or the system has a hot spare disk, and the recovery process has started but not yet finished, the state is called reconstruction (or rebuilding)
· After reconstruction succeeds, the state of the RAID group returns to normal
· When the number of failed hard disks exceeds the number of failures the RAID type can tolerate, data recovery cannot be completed, and the state is called RAID group failure

Create -> Created successfully (normal) -> Disk failure (degraded) -> Disk recovery (reconstruction) -> Normal work

Create -> Created successfully (normal) -> Disk failure (degraded) -> Too many disk failures, no redundant disks (RAID group failure)

Whether the degraded RAID can complete data reconstruction depends on the type of RAID used, the number of hard drive failures and the availability of replacement hard drives

A hot spare disk is a hard disk in the system that is designated to replace a failed member of a RAID group. Its job is to take over the data of the disk it replaces.
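The state transitions described in this section can be summarized as a small decision function. This is a sketch under assumptions: `raid_state` and its parameters are hypothetical names, and the number of tolerated failures is passed in by the caller rather than derived from a RAID level.

```python
def raid_state(failed_disks, tolerated_failures, rebuilding=False):
    """Return the RAID group state for the given situation.

    failed_disks: number of member disks currently failed
    tolerated_failures: failures the RAID type can survive (assumed input)
    rebuilding: True once a replacement disk or hot spare is being rebuilt
    """
    if failed_disks == 0:
        return "normal"
    if failed_disks > tolerated_failures:
        return "failed"
    return "reconstruction" if rebuilding else "degraded"


# A group that tolerates one failure, walked through the first path above.
print(raid_state(0, 1))                   # normal
print(raid_state(1, 1))                   # degraded
print(raid_state(1, 1, rebuilding=True))  # reconstruction
print(raid_state(2, 1))                   # failed
```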

7. RAID implementation

1. Hardware implementation: Hardware RAID uses a dedicated RAID adapter, hard disk controller, or storage processor. The RAID controller has its own processor, I/O processing chip, and memory, which improve resource utilization and data transfer speed. The RAID controller manages routing and the cache, and controls the data flow between the host and the RAID group. Hardware RAID is usually used in servers.

2. Software implementation: Software RAID has no processor or I/O processing chip of its own and depends entirely on the host processor, so a slow CPU cannot meet the demands of RAID processing. Software RAID is commonly used on enterprise storage devices.

8. RAID and LVM comparison

1. LVM (Logical Volume Manager). Its main use is flexible management of disk capacity: partitions can be enlarged or shrunk as needed, so the remaining disk space is put to better use.
Simply put, LVM combines physical partitions or disks by software.

2. RAID focuses more on performance improvement and data security.

9. Comparison and analysis of various RAID levels

Depending on how the disks are combined, RAID is divided into different levels.

1. RAID0
(1) RAID0 description
RAID0 is also called Stripe or Striping. It has the highest storage performance of all RAID levels, with the best read and write performance, but it has no fault tolerance: damage to any one disk results in the loss of all data in the RAID group.
(2) The principle of RAID0
RAID0 improves storage performance by distributing continuous data across multiple disks, so that the system's read and write requests are carried out in parallel on those disks. RAID0 requires at least one physical disk, though striping only spreads data when there are two or more.
Process of accessing data:
In a four-disk example, the system splits an I/O request sent to the logical disk group (RAID group) into four operations, each of which corresponds to one physical disk. Data that would otherwise be handled sequentially is spread over the four disks, and the four operations are done in parallel. In theory, the access efficiency of RAID0 is therefore N times that of a single disk, where N is the number of disks, but in practice it is not that fast.
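The distribution of continuous data over several disks can be simulated in memory. The following is a sketch only: `raid0_write` and `raid0_read` are hypothetical names, each "disk" is just a bytearray, and nothing here runs in parallel.

```python
def raid0_write(data, disk_count, strip_size):
    """Distribute data over disk_count in-memory 'disks' in strip_size chunks."""
    disks = [bytearray() for _ in range(disk_count)]
    for i in range(0, len(data), strip_size):
        disks[(i // strip_size) % disk_count] += data[i:i + strip_size]
    return disks


def raid0_read(disks, strip_size, length):
    """Reassemble the original bytes by visiting the disks in rotation."""
    out = bytearray()
    strip_count = -(-length // strip_size)  # ceiling division
    for strip in range(strip_count):
        disk = strip % len(disks)
        start = (strip // len(disks)) * strip_size
        out += disks[disk][start:start + strip_size]
    return bytes(out[:length])


disks = raid0_write(b"ABCDEFGHIJKLMNOP", disk_count=4, strip_size=2)
print([bytes(d) for d in disks])  # [b'ABIJ', b'CDKL', b'EFMN', b'GHOP']
print(raid0_read(disks, strip_size=2, length=16))  # b'ABCDEFGHIJKLMNOP'
```

Each disk holds only every fourth strip, which is why losing any single disk makes the whole sequence unrecoverable.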

(3) RAID0 production scenarios
Because RAID0 offers high storage efficiency but no fault tolerance, it suits applications with high performance requirements and modest data security requirements:
· Multiple identical RS node servers behind a load-balancing cluster
· Master nodes in distributed file storage
· Multiple slave servers in MySQL master-slave replication
· Services with high performance requirements and low redundancy requirements
(4) RAID0 comprehensive description

Index | Description
Capacity | The capacity of all member hard disks added together. RAID0 has the highest storage performance in RAID; the principle is to store continuous data on different hard disks and access the data in parallel.
Performance | RAID0's performance is the best in RAID, with the highest access efficiency
Redundancy | RAID0 has no redundancy; if one hard disk fails, the entire RAID group is unavailable
Occasion | Suitable for large-scale concurrent reads and writes where high data redundancy is not required
Advantage | Fast, with no loss of capacity
Disadvantage | No redundancy

2. RAID1
(1) RAID1 description
RAID1 is also known as Mirror or Mirroring. Its purpose is to maximize the availability and repairability of user data. It keeps a 100% copy of the data, at the cost of lower write performance; read performance is little affected.
(2) The principle of RAID1
RAID1 automatically copies 100% of the data that the user writes to one disk onto another disk, so that two copies of the data are stored.
RAID1 must consist of two disks, and its usable capacity is determined by the smaller disk.
Process of accessing data:
Writing data: the same data must be written to two different disks, so write efficiency is comparatively low.
Reading data: the primary disk is read first. If the read succeeds, the system does not touch the data on the backup disk; if the read fails, the system goes back and reads the data from the backup disk.
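The write-to-both, read-primary-first behavior can be sketched with two dictionaries standing in for the disks. The class name `Mirror` and its methods are hypothetical, and a missing key is used here as a stand-in for a failed read.

```python
class Mirror:
    """Toy two-disk mirror; each 'disk' is a dict from block number to bytes."""

    def __init__(self):
        self.primary = {}
        self.backup = {}

    def write(self, block, payload):
        # Every write goes to both disks, which is why writes gain no speed.
        self.primary[block] = payload
        self.backup[block] = payload

    def read(self, block):
        # Try the primary first; fall back to the backup only if that fails.
        try:
            return self.primary[block]
        except KeyError:
            return self.backup[block]


m = Mirror()
m.write(0, b"important")
m.primary.clear()  # simulate losing the primary disk
print(m.read(0))   # b'important'
```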
(3) RAID1 application scenario
RAID1 has high reliability and low performance, making it suitable for the following application scenarios:
· Data that is not accessed by large numbers of external users but is particularly important. Government bodies and similar organizations can choose RAID1 when storing important data.

(4) RAID1 comprehensive description

Index | Description
Capacity | Half of the capacity is used for backup, so the loss is 50%
Performance | RAID1 cannot improve storage performance; in theory its write performance is not much different from a single disk
Redundancy | Among all RAID levels, RAID1 provides the highest data security
Occasion | Suitable for enterprises that need high data redundancy and do not need high performance
Advantage | Good redundancy
Disadvantage | Low performance, large capacity loss, high cost

3. RAID5
(1) RAID5 description
RAID5 is a storage solution that balances storage performance, data security, and storage cost.

(2) The principle of RAID5
RAID5 requires three or more physical disks and can be given a hot spare for fault recovery. It uses parity for reliability: data is completely lost only when two hard disks are damaged at the same time. When one disk fails, its data can be rebuilt from the parity and served temporarily; when two disks fail, the parity can no longer rebuild the data.
Process of accessing data:
Writing data: similar to RAID0, data is stored in parallel and write efficiency is high (parity must also be written, so efficiency is lower than RAID0 but higher than RAID1). One hard disk's worth of space holds the parity, so the capacity of one hard disk is lost.
Reading data: read efficiency is high.
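How parity lets a single failed disk be rebuilt can be sketched for one stripe. The names `raid5_stripe` and `rebuild` are hypothetical, and the way the parity position rotates here is only one possible layout, chosen for illustration; real controllers use several different rotations.

```python
from functools import reduce


def xor_bytes(chunks):
    """Byte-wise XOR of equally sized byte strings."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*chunks))


def raid5_stripe(data_strips, stripe_number):
    """Lay out one stripe over len(data_strips) + 1 disks.

    The parity position moves with the stripe number (assumed rotation).
    """
    disk_count = len(data_strips) + 1
    parity_disk = (disk_count - 1 - stripe_number) % disk_count
    stripe = list(data_strips)
    stripe.insert(parity_disk, xor_bytes(data_strips))
    return stripe, parity_disk


def rebuild(stripe, lost_disk):
    """Recover the strip on lost_disk from all other strips in the stripe."""
    return xor_bytes([s for i, s in enumerate(stripe) if i != lost_disk])


stripe, parity_disk = raid5_stripe([b"AA", b"BB"], stripe_number=0)
print(parity_disk)                      # 2
print(rebuild(stripe, 0) == stripe[0])  # True
```

With two strips missing from the same stripe there is no longer enough information for the XOR to recover either of them, which matches the two-disk failure case described above.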
(3) RAID5 applicable scenarios
RAID5 can be seen as a compromise between RAID0 and RAID1, suitable where performance and redundancy both matter but neither requirement is very high. RAID5 is the most widely applied RAID level.
· File and application servers
· Database servers
(4) RAID5 comprehensive description

Index | Description
Capacity | The capacity of one disk is lost to store the parity
Performance | RAID5 has a read speed similar to that of RAID0, but because of the additional parity information, writes are slower than on RAID0
Redundancy | Tolerates the loss of one disk. RAID5 data security is lower than RAID1, but its disk utilization is higher
Occasion | Suitable where performance and redundancy both matter but neither requirement is particularly high
Advantage | Very high read efficiency (slightly lower than RAID0), medium write efficiency
Disadvantage | A hard disk failure has some impact on throughput; the controller design is more complicated, and rebuilding is more complicated

4. RAID10
(1) RAID10 description
RAID10 combines the advantages of RAID0 and RAID1, offering high performance and redundancy at the same time, but at a higher cost.

(2) The principle of RAID10
RAID10 requires at least 4 disks, and the number of disks must be even. No matter how many hard disks are used, half of the capacity is lost.
The specific data access process is not described here. If you are interested, you can read about it elsewhere.
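Since RAID10 stripes across mirrored pairs, whether the group survives depends on which disks fail, not just how many. The sketch below is illustrative: `raid10_survives` is a hypothetical name, and the pairing of disks (0, 1), (2, 3), and so on is an assumption made for the example.

```python
def raid10_survives(disk_count, failed):
    """Return True if a RAID10 group still holds all of its data.

    Assumes disks (0, 1), (2, 3), ... form the mirrored pairs.
    """
    if disk_count < 4 or disk_count % 2:
        raise ValueError("RAID10 needs an even number of disks, at least 4")
    failed = set(failed)
    # Data is lost only when both disks of the same mirrored pair have failed.
    return all(not {a, a + 1} <= failed for a in range(0, disk_count, 2))


print(raid10_survives(4, [0, 2]))  # True: one disk lost from each pair
print(raid10_survives(4, [0, 1]))  # False: a whole pair is gone
```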

5. Comparison of common RAID levels

RAID level | RAID0 | RAID1 | RAID5 | RAID10
Alias | Stripe | Mirror | Distributed parity stripe | Mirrored stripe array
Fault tolerance | No | Yes | Yes | Yes
Redundancy type | None | Copy | Parity check | Copy
Hot spare option | No | Yes | Yes | Yes
Read performance | High | Low | High | Medium
Random write performance | High | Low | Low | Medium
Available capacity (generally not more than 16) | N * single hard disk capacity | (N / 2) * single hard disk capacity | (N - 1) * single hard disk capacity | (N / 2) * single hard disk capacity
Typical application environment | Fast reads and writes with low security requirements, such as graphics workstations | Random writes with high security requirements, such as servers and database storage | Random data transfer with high security requirements, such as databases and storage | Large data volumes with high security requirements, such as banking and finance
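The available-capacity row of the table translates directly into a small calculation. This is a sketch: `usable_capacity` is a hypothetical name, all disks are assumed to be the same size, and no check is made that the disk count is valid for the level.

```python
def usable_capacity(level, disk_count, disk_size):
    """Usable capacity for N equally sized disks, following the table above."""
    data_disks = {
        "RAID0": disk_count,        # N * single disk capacity
        "RAID1": disk_count // 2,   # (N / 2) * single disk capacity
        "RAID5": disk_count - 1,    # (N - 1) * single disk capacity
        "RAID10": disk_count // 2,  # (N / 2) * single disk capacity
    }
    return data_disks[level] * disk_size


# Four disks of 2 TB each.
for level in ("RAID0", "RAID5", "RAID10"):
    print(level, usable_capacity(level, 4, 2))
```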

To summarize the trade-offs:
RAID0 balances cost and performance.
RAID1 has the best reliability.
RAID10 balances reliability and performance.
RAID5 offers something of each.

10. Several less commonly used RAID levels

1. RAID3
RAID3 uses a single dedicated disk as the parity disk, so the usable capacity is that of N-1 hard disks.

2. RAID50
RAID50 is a RAID level that combines RAID5 and RAID0 in two tiers: the first tier is RAID5 and the second tier is RAID0.
RAID50 requires at least six hard drives, because each RAID5 group requires three hard drives and at least two RAID5 groups are striped together.
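The six-disk minimum and the resulting capacity can be checked with a short calculation. The function name `raid50_capacity` is hypothetical, and the sketch assumes every RAID5 group has the same number of equally sized disks.

```python
def raid50_capacity(groups, disks_per_group, disk_size):
    """Usable capacity of RAID5 groups striped together with RAID0."""
    if groups < 2 or disks_per_group < 3:
        raise ValueError("RAID50 needs at least 2 RAID5 groups of 3 disks each")
    # Each RAID5 group gives up one disk's worth of space to parity.
    return groups * (disks_per_group - 1) * disk_size


print(raid50_capacity(groups=2, disks_per_group=3, disk_size=1))  # 4
```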

Origin blog.csdn.net/flat0809/article/details/98738317