Alibaba technical expert: 5 slides that thoroughly explain the second-level data recovery mechanism for "delete the database and run" incidents

Source | Alibaba Technology (ali_tech)

Author | Fan Jun

Introduction: Data security has been raised to unprecedented importance, and data protection has become an increasingly sensitive topic, because downtime has an ever greater impact on enterprise users. Alibaba technical expert Fan Jun starts from the current state and trends of data security, then covers the problem definition, traditional solutions, and the solutions of today's cloud vendors, explains what continuous data protection is, and proposes a verifiable, elastic continuous data protection scheme (Elastic Assured Continuous Data Protection).

I. Summary

Conventional continuous data protection solutions capture the change log of written data either at the guest OS level or at a proprietary storage level, which affects the storage performance of production machines to a greater or lesser degree. Once moved to the cloud, they also increase the customer's compute and storage costs. Even with a hybrid deployment architecture, network bandwidth and implementation complexity make them harder to deploy than a purely cloud-based approach, so it is difficult to meet traditional enterprise customers' demand for a lower RPO (Recovery Point Objective) and RTO (Recovery Time Objective). Although continuous data protection (CDP) overlaps somewhat with snapshots and replication in product positioning, CDP has a broader scope: it focuses on data protection, recovery, and more efficient business continuity, and is not limited to snapshots and data movement.

Pangu2.0's new block storage architecture provides an opportunity to implement continuous data protection, in particular its log-structured block device (Log Structured Block Device), whose new data write mode, log storage, and snapshots greatly facilitate the implementation. I believe that as enterprises accelerate their move to the cloud, this will satisfy traditional enterprise users' urgent demand for advanced data protection with low RTO and low RPO while preserving storage performance. However, data backup must also take operability into account: whether the backed-up data can actually be restored largely determines how effective the data protection is.

II. Challenges of data protection

Today, data security has been raised to unprecedented importance, and data protection has become an increasingly sensitive topic, because downtime has an ever greater impact on enterprise users. In 2017, viruses and ransomware such as WannaCry, Petya, and Locky, frequent accidental database deletions, and even direct attacks on users' backup software raised cloud users' expectations for data security and data protection higher and higher.

Data is becoming increasingly important: data = resource = asset

In January 2017, the GitLab accidental database deletion incident touched a nerve across the industry and exposed major information security risks. Notably, during GitLab's recovery it turned out that only the db1.staging database could be used for recovery; the other five backup mechanisms were unusable. db1.staging held data from six hours earlier, and a limited transfer rate made the recovery slow. GitLab eventually lost almost six hours of data.

Therefore, reducing the risk of data loss, shrinking the data protection window, limiting users' losses, and providing an efficient recovery mechanism are urgent user needs. The incident also shows how important verifiable recovery and a low RTO are for data protection: compared with storage cost, the recoverability of data is the lifeline that matters at such a moment.

III. Definition of continuous data protection

The Storage Networking Industry Association (SNIA) defines continuous data protection as a method that captures or tracks data changes and stores them independently of the production data, so that data can be restored to any past point in time. Continuous data protection can be implemented at the block, file, or application level, and its recovery granularity is fine enough to provide an almost unlimited number of recovery points.

Gartner, one of the world's most authoritative IT research and consulting firms, defines it as follows: continuous data protection is a recovery method that continuously or near-continuously captures or tracks changes to files or data blocks and saves them in the form of logs. This capability provides finer-grained recovery points, reduces data loss, and makes recovery to any point possible. Some CDP solutions can be configured to capture data changes continuously (true CDP) or at scheduled times (quasi-CDP).

Describing the state of a CDP system requires two concepts: RPO and RTO.

  • RPO (Recovery Point Objective): how much data, measured in time, may be lost when a disaster occurs. For periodic backup this corresponds to the backup interval.

  • RTO (Recovery Time Objective): how long the business may be interrupted after a disaster before it is running again, that is, the recovery time.

  • True CDP is defined as RPO = 0 with RTO close to zero. When the RPO is not zero, the scheme is called near CDP (quasi-CDP).
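As a rough illustration of how RPO relates to the spacing of recovery points, the following Python sketch computes how many seconds of writes are lost when restoring to the latest recovery point before a failure. The function names and the sample timestamps are hypothetical and chosen only for this example.

```python
from bisect import bisect_right


def data_loss_seconds(recovery_points, failure_time):
    """Seconds of writes lost when restoring to the latest point before the failure.

    recovery_points: sorted timestamps (in seconds) at which a restorable copy exists.
    """
    idx = bisect_right(recovery_points, failure_time)
    if idx == 0:
        raise ValueError("no recovery point exists before the failure")
    return failure_time - recovery_points[idx - 1]


def classify(rpo_seconds):
    """Label a CDP scheme using the RPO rule from the list above."""
    return "true CDP" if rpo_seconds == 0 else "near CDP"


DAY = 24 * 60 * 60
daily_backups = [0, DAY, 2 * DAY]              # one backup per day
per_second_log = list(range(0, 2 * DAY + 1))   # a change captured every second

failure = DAY + 5 * 60 * 60                    # five hours into the second day
daily_loss = data_loss_seconds(daily_backups, failure)
cdp_loss = data_loss_seconds(per_second_log, failure)
```

With daily backups the loss in this example is five hours of writes, while a log with one recovery point per second loses nothing at this failure time.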

IV. Features of continuous data protection

Traditional data protection solutions focus on periodic backup of data and have always come with problems such as the backup window, data consistency, and the impact on production systems. CDP gives users a new data protection tool. System administrators no longer need to care about the backup process, because the CDP system continuously monitors changes to critical data and protects it automatically. After a disaster, they simply select the point in time to restore to and recover the data quickly.

Compared with conventional disaster recovery technologies, continuous data protection has the following distinctive features:

1. It can greatly improve the recovery point objective (RPO). Traditional data protection technologies typically back up every 24 hours (once a day), so users face the risk of losing up to 24 hours of data. With snapshot technology the risk can be reduced to a few hours, and with CDP the data loss can be reduced to a few seconds (the time precision of course varies across CDP products and solutions). In effect, traditional data protection manages copies of data at a "single point in time" (Single Point-In-Time), whereas continuous data protection can protect data at "any point in time" (Any Point-In-Time).

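2. Replication can obtain the latest state of the data by staying synchronized with the production data, but it cannot avoid data loss caused by human logical errors or virus attacks. When production data is damaged for such reasons (for example, data is deleted by mistake), replication synchronizes the damaged state to the standby storage system, so the standby data is damaged as well. A CDP system can restore the data to any point in time before the damage occurred, which eliminates this risk.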

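3. Because both the recovery time and the recovered objects have finer granularity, data recovery under continuous data protection is also more flexible. Some current products and solutions allow end users (not just system administrators) to restore data directly, which is a considerable convenience.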
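The difference between replication and CDP described in feature 2 can be illustrated with a small Python sketch. It is a toy model under stated assumptions: the `Replica` and `CdpJournal` classes, the `run_demo` helper, and the key/value data are all hypothetical, and a value of `None` stands for a deletion.

```python
class Replica:
    """Synchronous replica: mirrors every change, including destructive ones."""

    def __init__(self):
        self.data = {}

    def apply(self, key, value):
        if value is None:          # None models a delete
            self.data.pop(key, None)
        else:
            self.data[key] = value


class CdpJournal:
    """Keeps every change with its timestamp so any earlier state can be rebuilt."""

    def __init__(self):
        self.entries = []          # (timestamp, key, value) in arrival order

    def record(self, ts, key, value):
        self.entries.append((ts, key, value))

    def restore(self, point_in_time):
        state = {}
        for ts, key, value in self.entries:
            if ts > point_in_time:
                break              # ignore everything after the chosen point
            if value is None:
                state.pop(key, None)
            else:
                state[key] = value
        return state


def run_demo():
    replica, journal = Replica(), CdpJournal()
    # The change at t=3 is an accidental deletion.
    changes = [(1, "orders", "v1"), (2, "orders", "v2"), (3, "orders", None)]
    for ts, key, value in changes:
        replica.apply(key, value)
        journal.record(ts, key, value)
    return replica.data, journal.restore(point_in_time=2)


replica_state, restored_state = run_demo()
```

In this model the replica faithfully reproduces the deletion and ends up empty, while the journal can rebuild the state as of t=2, just before the mistake.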

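V. Implementation approaches

The key technology behind continuous data protection is recording and storing data changes so that data can be restored quickly to any point in time. In general, there are three implementation modes: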

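  • Baseline reference data mode. A reference copy of the data is created, a log of differences is recorded as the production data changes, and data is restored on demand from the log. The principle is simple and the mode is relatively easy to implement. However, recovery has to start from the original reference data and roll forward step by step, so recovery takes a long time, and the closer the recovery point is to the current time, the longer it takes.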

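  • Replicated reference data mode. The production data and the reference copy are synchronized in real time, and a rollback log or rollback events are recorded during synchronization. Data is restored on demand from the rollback log (Undo Log). This mode works in exactly the opposite way to the baseline reference data mode: the closer the recovery point is to the current time, the shorter the recovery. However, both the data and the log must be synchronized while data is being saved, which requires more system resources.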

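  • Synthetic reference data mode. This mode is a compromise between the two modes above and achieves a good balance of resource usage and recovery time. It requires complex software management and data processing, however, and is more difficult to implement.

Continuous data protection technologies and solutions can be implemented in several ways.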

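Different traditional vendors have built different continuous data protection models. Following SNIA's shared storage model, products and solutions that implement continuous data protection can be divided into application-based, file-based, and block-based continuous data protection. This article mainly discusses CDP at the block level. Block-based CDP runs directly on the physical storage device or the logical volume manager, and can even run at the data transport layer. When a data block is written to the production storage device, the CDP system can capture a copy of the data and store it on another storage device. Block-based data protection is in turn implemented in three ways: host-based, transport-layer-based, and storage-based.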
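To make the baseline and replicated reference data modes concrete at the block level, here is a minimal Python sketch. It models a volume as a dictionary from block number to data and counts how many log entries each mode has to process to reach a recovery point. The class names, the `_ABSENT` marker, and the sample writes are hypothetical; this is not how any particular product implements CDP.

```python
_ABSENT = object()  # marks "this block did not exist before the write"


class BaselineReferenceCdp:
    """Baseline copy plus a redo log; recovery replays forward from the baseline."""

    def __init__(self, baseline):
        self.baseline = dict(baseline)
        self.redo_log = []  # (timestamp, block, new_data), oldest first

    def write(self, ts, block, data):
        self.redo_log.append((ts, block, data))

    def recover(self, target_ts):
        image, applied = dict(self.baseline), 0
        for ts, block, data in self.redo_log:
            if ts > target_ts:
                break
            image[block] = data
            applied += 1
        return image, applied


class ReplicatedReferenceCdp:
    """Up-to-date replica plus an undo log; recovery rolls back from the replica."""

    def __init__(self, baseline):
        self.replica = dict(baseline)
        self.undo_log = []  # (timestamp, block, old_data), oldest first

    def write(self, ts, block, data):
        # Both the replica and the log are updated on every write.
        self.undo_log.append((ts, block, self.replica.get(block, _ABSENT)))
        self.replica[block] = data

    def recover(self, target_ts):
        image, applied = dict(self.replica), 0
        for ts, block, old in reversed(self.undo_log):
            if ts <= target_ts:
                break
            if old is _ABSENT:
                image.pop(block, None)
            else:
                image[block] = old
            applied += 1
        return image, applied


baseline = {0: "a0", 1: "b0"}
writes = [(1, 0, "a1"), (2, 1, "b1"), (3, 0, "a2"), (4, 1, "b2")]
redo_mode = BaselineReferenceCdp(baseline)
undo_mode = ReplicatedReferenceCdp(baseline)
for ts, block, data in writes:
    redo_mode.write(ts, block, data)
    undo_mode.write(ts, block, data)
```

Recovering to timestamp 3 yields the same image in both modes, but the redo log replays three entries while the undo log rolls back only one. Recovering to timestamp 1 reverses the cost, which matches the trade-off described in the list above.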

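VI. CDP in traditional data protection products

The following analysis looks at three vendors' products, FalconStor CDP, Veeam CDP, and EMC RecoverPoint, which come from different backgrounds and are reasonably representative. FalconStor represents traditional continuous data protection products. EMC, a traditional storage vendor, acquired RecoverPoint to build its own data protection suite; the solution is built on EMC's own storage and provides protection from physical machines to virtual machines. Veeam is a rising star in virtual machine protection. It focuses on data protection for the VMware and Hyper-V virtualization platforms and extends to the cloud, and its current solution relies on VMware's VAIO framework for capturing virtualization data.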

 

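EMC RecoverPoint/SE is a comprehensive solution for the EMC CLARiiON series arrays, while EMC RecoverPoint is a comprehensive solution for the entire data center. Both products provide synchronous local replication using continuous data protection (CDP), as well as synchronous and asynchronous continuous remote replication (CRR) with any-point-in-time recovery. Running CDP and CRR together on a RecoverPoint appliance provides concurrent local and remote (CLR) data protection, so the same data can be protected both locally and remotely with a single solution.

FalconStor's CDP solution integrates data backup, system recovery, disaster recovery, and local and remote disaster tolerance. FalconStor CDP is a disk-based, integrated backup and disaster recovery solution that provides real-time backup and instant recovery of files, databases, and operating systems, and integrates local and remote disaster recovery with verification and drills.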

 

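VII. Data protection approaches of the major cloud vendors

AWS provides only native snapshot functionality and tools that help customers move to the cloud; features such as data backup rely on traditional data protection vendors. Azure provides basic virtual-machine-based backup and recovery, but does not offer advanced features such as CDP.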

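VIII. Verifiable, elastic continuous data protection (CDP)

Gartner's description of an elastic cloud backup engine specifies several characteristics of successful elastic backup: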

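  • An elastic cloud backup engine needs a fast RTO, which requires the backup engine and data recovery to be in the same data center.

  • An elastic cloud backup engine needs full backups without excessive WAN data transfer, and must separate backup duties from the production machine.

  • It must also ensure that the data is recoverable.

Continuous data protection (CDP) is in essence an advanced data protection scheme. When it is provided by the cloud vendor, it has an elasticity that traditional backup lacks. To move to the cloud, traditional vendors inevitably have to transfer data to the cloud over the WAN, which inevitably consumes CPU and I/O resources. To avoid this resource consumption they may fall back on scheduled tasks, which cannot guarantee even basic elastic backup, let alone CDP. Verifiability emphasizes the reliability and operability of a CDP solution. To ensure cross-volume consistency of application data, the volumes need to be placed in a consistency group (Consistency Group), and application consistency (Application Consistency) must be ensured as well.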
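One simple way to picture a consistency group is a set of volumes whose writes share a single sequence counter, so that recovery cuts every volume at the same point in the overall write order. The Python sketch below models only that idea. The `ConsistencyGroup` class and the volume names are hypothetical, and application consistency (for example, quiescing the application before a recovery point is marked) is not modeled.

```python
class ConsistencyGroup:
    """Journals writes to several volumes under one shared sequence counter."""

    def __init__(self, volume_names):
        self.journals = {name: [] for name in volume_names}
        self.sequence = 0

    def write(self, volume, block, data):
        """Record a write and return its position in the group-wide order."""
        self.sequence += 1
        self.journals[volume].append((self.sequence, block, data))
        return self.sequence

    def recover(self, sequence):
        """Rebuild every volume as of the same group-wide sequence number."""
        images = {}
        for name, journal in self.journals.items():
            image = {}
            for seq, block, data in journal:
                if seq > sequence:
                    break
                image[block] = data
            images[name] = image
        return images


# A database that keeps its log and its data on two separate volumes.
group = ConsistencyGroup(["data", "log"])
group.write("log", 0, "begin")
cut = group.write("data", 7, "row")
group.write("log", 1, "commit")
images = group.recover(cut)
```

Recovering at `cut` returns both volumes as they were right after the data write: the log volume holds the "begin" record but not the "commit", so the two images describe the same moment rather than two unrelated ones.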

 

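IX. Conclusion

Data protection should not be a remedy applied after the damage is done; it requires preparation in advance. As enterprises move to the cloud at a rapid pace, traditional enterprises' demand for cloud data protection is becoming more prominent. As data grows in importance, users are more sensitive to data loss than ever before, which makes the gap between cloud data protection and user needs more apparent.

Traditional block-storage-based continuous data protection mostly depends on specific storage devices, so it lacks the elasticity of a cloud implementation and does not suit the complexity of a distributed cloud environment.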

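As an important complement to traditional and hybrid cloud data protection, continuous data protection will gain the attention of enterprise users as new solutions emerge. The new block storage architecture of Pangu2.0 provides an opportunity to implement continuous data protection in the cloud. As enterprises accelerate their move to the cloud, it will satisfy the urgent demand of traditional high-end enterprise users for data protection with low RTO and low RPO while preserving storage performance.

Follow-up articles will focus on cloud continuous data protection based on the baseline reference data model. The solution implements continuous data protection on Pangu2.0 Block Storage, and the articles will describe its mechanism for recovering data within seconds.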


Origin blog.csdn.net/yellowzf3/article/details/104568322