Common database architecture schemes

First, database architecture principles

  • High availability

  • High performance

  • Consistency

  • Scalability


Second, common architecture schemes

Scheme One: master-standby architecture. Only the master database serves reads and writes; the standby database exists for redundancy and failover

jdbc:mysql://vip:3306/xxdb

1. High availability analysis: highly available. If the master database goes down, keepalived (just one possible tool) automatically switches over to the standby database. This process is transparent to the business layer; no code or configuration changes are needed.

2. High performance analysis: reads and writes all go to the master database, which easily becomes a bottleneck. Most Internet applications read far more than they write, so reads become the bottleneck first and then drag down write performance. In addition, the standby database is purely a backup, so resource utilization is only 50%. Scheme Two addresses this point.

3. Consistency analysis: reads and writes all go to the master database, so there is no data consistency problem.

4. Scalability analysis: read performance cannot be scaled by adding slave databases, so overall performance cannot be raised that way.

5. Feasibility analysis: two points affect practical adoption. First, performance is only average. This can be improved by building efficient indexes and introducing a cache to raise read performance, which is also the general-purpose approach. Second, scalability is poor. This can be addressed by sharding (splitting databases and tables).
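The sharding mentioned in the last point can be illustrated with a very small routing function. This is only a sketch under stated assumptions: the database prefix `xxdb_`, the table prefix `t_order_`, the shard counts, and routing by a numeric user ID are all hypothetical choices made for illustration, not something the schemes above prescribe.

```python
def route(user_id, db_count=2, tables_per_db=4):
    """Map a numeric user ID to a (database, table) pair.

    Hypothetical layout: db_count databases, each holding tables_per_db
    tables, with rows spread evenly by user_id modulo the total slot count.
    """
    if user_id < 0:
        raise ValueError("user_id must be non-negative")
    slot = user_id % (db_count * tables_per_db)
    database = f"xxdb_{slot // tables_per_db}"
    table = f"t_order_{slot % tables_per_db}"
    return database, table
```

All rows for one user land in the same database and table, so single-user queries touch one shard. Queries that span users have to fan out across shards, which is one of the costs of this approach.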

Scheme Two: dual-master architecture. Both master databases serve traffic at the same time, with load balancing

 jdbc:mysql://vip:3306/xxdb

1. High availability analysis: highly available. If one master database goes down, the other master keeps serving. This process is transparent to the business layer; no code or configuration changes are needed.

2. High performance analysis: compared with Scheme One, read and write performance both improve, roughly doubling.

3. Consistency analysis: data consistency problems exist. See the consistency solutions below.

4. Scalability analysis: the setup can of course be extended to a ring of three masters, but I do not recommend it (it adds another layer of data synchronization, so synchronization takes longer). If you must scale at the database architecture level, extend to Scheme Four.

5. Feasibility analysis: two points affect practical adoption. First, data consistency, which the consistency solutions address. Second, primary key conflicts, which can be solved by having a distributed ID generation service issue all IDs.
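The article does not say how the ID generation service works. One common approach is a Snowflake-style generator, sketched below. Everything here is an assumption for illustration: the class name `IdGenerator`, the bit layout (millisecond timestamp, 10-bit worker ID, 12-bit sequence), and the custom epoch are hypothetical.

```python
import threading
import time


class IdGenerator:
    """Snowflake-style ID generator (hypothetical bit layout, for illustration)."""

    EPOCH_MS = 1_577_836_800_000  # 2020-01-01 00:00:00 UTC, an arbitrary custom epoch
    WORKER_BITS = 10
    SEQ_BITS = 12

    def __init__(self, worker_id):
        if not 0 <= worker_id < (1 << self.WORKER_BITS):
            raise ValueError("worker_id out of range")
        self.worker_id = worker_id
        self._lock = threading.Lock()
        self._last_ms = -1
        self._seq = 0

    @staticmethod
    def _now_ms():
        return int(time.time() * 1000)

    def next_id(self):
        with self._lock:
            now = self._now_ms()
            if now < self._last_ms:
                # Refuse to issue IDs that could collide with earlier ones.
                raise RuntimeError("clock moved backwards")
            if now == self._last_ms:
                self._seq = (self._seq + 1) & ((1 << self.SEQ_BITS) - 1)
                if self._seq == 0:
                    # Sequence exhausted for this millisecond: wait for the next one.
                    while now <= self._last_ms:
                        now = self._now_ms()
            else:
                self._seq = 0
            self._last_ms = now
            timestamp = now - self.EPOCH_MS
            return (
                (timestamp << (self.WORKER_BITS + self.SEQ_BITS))
                | (self.worker_id << self.SEQ_BITS)
                | self._seq
            )
```

If each master (or each application node) is given a distinct `worker_id`, the IDs it produces cannot collide with those of the other nodes, and IDs from a single node increase over time. A simpler alternative in MySQL is to give each master a different auto-increment offset with the same step, at the cost of fixing the number of masters up front.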

Scheme Three: master-slave architecture. One master with multiple slaves, with read/write separation

 jdbc:mysql://master-ip:3306/xxdb

 jdbc:mysql://slave1-ip:3306/xxdb

 jdbc:mysql://slave2-ip:3306/xxdb

1. High availability analysis: the master database is a single point of failure, while the slave databases are highly available. Once the master goes down, write service is no longer available.

2. High performance analysis: most Internet applications read far more than they write, so reads become the bottleneck first and then drag down overall performance. Once read performance improves, overall performance improves too. In addition, the master database can do without indexes, and the online and offline slave databases can carry different indexes. (If there are several online slaves, they should still carry the same indexes, otherwise the loss outweighs the gain. The offline slave is the one developers query when troubleshooting production issues, so it can carry more indexes.)

3. Consistency analysis: data consistency problems exist. See the consistency solutions below.

4. Scalability analysis: read performance can be scaled by adding slave databases, which raises overall performance. (The side effect is that the more slaves there are, the more clients pull the binlog from the master, which hurts the master's performance, and data synchronization takes longer to complete.)

5. Feasibility analysis: two points affect practical adoption. First, data consistency, which the consistency solutions address. Second, the master is a single point of failure, and I have not yet thought of a good solution for that.

Note: a question to think about: what happens when one slave database goes down? How should the read load-balancing strategy under read/write separation tolerate faults?
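One possible answer to the question in the note is to balance reads round-robin across the slaves, skip any slave that is marked as down, and fall back to the master when no slave is left. The sketch below assumes exactly that policy. The class `ReadWriteRouter` and its method names are hypothetical, and real health checking (how a slave gets marked down or up) is left out.

```python
import itertools


class ReadWriteRouter:
    """Round-robin read routing that skips slaves marked as down."""

    def __init__(self, master, slaves):
        self.master = master
        self.slaves = list(slaves)
        self._down = set()
        self._next_index = itertools.cycle(range(len(self.slaves)))

    def mark_down(self, node):
        self._down.add(node)

    def mark_up(self, node):
        self._down.discard(node)

    def pick_for_write(self):
        # Writes always go to the master.
        return self.master

    def pick_for_read(self):
        # Try each slave at most once, starting from the round-robin position.
        for _ in range(len(self.slaves)):
            candidate = self.slaves[next(self._next_index)]
            if candidate not in self._down:
                return candidate
        # No healthy slave left: fall back to the master.
        return self.master


router = ReadWriteRouter("master-ip", ["slave1-ip", "slave2-ip"])
```

Falling back to the master keeps reads available, but it also shifts the whole read load onto the master, so in practice the fallback may need a limit of its own.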

Scheme Four: dual-master + master-slave architecture, a seemingly perfect solution

 jdbc:mysql://vip:3306/xxdb

 jdbc:mysql://slave1-ip:3306/xxdb

 jdbc:mysql://slave2-ip:3306/xxdb

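1. High availability analysis: highly available.

2. High performance analysis: high performance.

3. Consistency analysis: data consistency problems exist. See the consistency solutions below.

4. Scalability analysis: read performance can be scaled by adding slave databases, which raises overall performance. (The side effects are the same as in Scheme Two.)

5. Feasibility analysis: same as Scheme Two, but there is one more layer of data synchronization, so data lag is more severe.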


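Third, consistency solutions

Category one: consistency solutions between the master and slave databases:

Note: the circled parts of the figure are where data synchronization takes place. Synchronization (the slave pulls the binlog from the master and replays it) takes time, and during that window the master and slave data can be inconsistent. If a read request arrives while synchronization is in progress, it gets the old data from the slave, as shown in the figure below.

Now that the cause of the inconsistency is known, here are several solutions for reference: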

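1. Ignore it: if the business can tolerate the delay, simply leave it alone.

2. Force reads to the master: adopt the master-standby architecture, send both reads and writes to the master, and use a cache to scale database read performance. One thing to be aware of: if the cache goes down, an avalanche effect may follow, although distributed caches are generally highly available.

3. Selectively read the master: on a write, generate a key from database + table + business characteristics, put it in the cache, and set a timeout (greater than or equal to the master-slave synchronization time). On a read, generate the key the same way and check the cache first. On a hit, read the master; otherwise read a slave. The cost is one extra cache read/write, which is basically negligible.

4. Semi-synchronous replication: the write request returns only after master-slave synchronization has completed. This is what people usually call semi-sync replication. It relies on a native database feature and is fairly simple to set up. The cost is higher write latency and lower throughput.

5. Database middleware: introduce an open-source (mycat, etc.) or in-house database middle layer. As I understand it, the idea is the same as selectively reading the master. Database middleware is relatively costly, and it adds one more layer.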
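Option 3 (selectively reading the master) can be sketched as follows. This is a minimal illustration under stated assumptions: an in-memory dict stands in for the real cache, the class `SelectiveReadRouter` and its method names are hypothetical, and the key format `db:table:id` is just one way to combine database, table, and business characteristics.

```python
import time


class SelectiveReadRouter:
    """Route reads to the master for keys that were written recently."""

    def __init__(self, sync_window_seconds, clock=time.monotonic):
        # The timeout should be at least the master-slave synchronization time.
        self._ttl = sync_window_seconds
        self._clock = clock
        self._recent_writes = {}  # key -> expiry time; stands in for a real cache

    @staticmethod
    def make_key(db, table, business_id):
        return f"{db}:{table}:{business_id}"

    def on_write(self, db, table, business_id):
        """Record the write so that reads of the same key go to the master."""
        key = self.make_key(db, table, business_id)
        self._recent_writes[key] = self._clock() + self._ttl

    def target_for_read(self, db, table, business_id):
        """Return "master" while the marker is live, otherwise "slave"."""
        key = self.make_key(db, table, business_id)
        expiry = self._recent_writes.get(key)
        if expiry is not None and self._clock() < expiry:
            return "master"
        self._recent_writes.pop(key, None)  # drop an expired marker, if any
        return "slave"
```

Only keys that were written within the synchronization window are sent to the master, so the bulk of the read traffic still lands on the slaves.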

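Category two: consistency solutions between the DB and the cache

First, a look at the usual way a cache is used:

Step 1: evict the cache entry;

Step 2: write to the database;

Step 3: read the cache; on a hit, return the value, otherwise read the database;

Step 4: after reading the database, write the value into the cache.

Note: with this approach, Figure 1 does not produce a DB/cache inconsistency; Figure 2 does, namely when 4.read executes before 3.sync. If nothing is done, the data in the cache may stay dirty indefinitely. The solution is as follows:

Note: when setting a cache entry, always add an expiration time, in case the delayed cache eviction fails!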
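The four steps, together with the delayed eviction and the expiration time mentioned in the note, can be sketched as below. The figure that showed the solution is not available here, so this sketch assumes it is the usual delayed second eviction: evict again some time after the write, once synchronization should have finished. A dict stands in for the database, and `TtlCache`, `write`, and `read` are hypothetical names.

```python
import threading
import time


class TtlCache:
    """Tiny in-memory cache in which every entry carries an expiration time."""

    def __init__(self):
        self._data = {}
        self._lock = threading.Lock()

    def get(self, key):
        with self._lock:
            item = self._data.get(key)
            if item is None:
                return None
            value, expires_at = item
            if time.monotonic() >= expires_at:
                del self._data[key]
                return None
            return value

    def set(self, key, value, ttl_seconds):
        with self._lock:
            self._data[key] = (value, time.monotonic() + ttl_seconds)

    def delete(self, key):
        with self._lock:
            self._data.pop(key, None)


def write(db, cache, key, value, delay_seconds):
    cache.delete(key)  # step 1: evict the cache entry
    db[key] = value    # step 2: write to the database
    # Assumed fix: evict once more after the sync window, in case a read
    # refilled the cache with stale data in the meantime.
    timer = threading.Timer(delay_seconds, cache.delete, args=(key,))
    timer.daemon = True
    timer.start()
    return timer


def read(db, cache, key, ttl_seconds):
    value = cache.get(key)  # step 3: read the cache and return on a hit
    if value is not None:
        return value
    value = db.get(key)     # miss: read the database
    if value is not None:
        cache.set(key, value, ttl_seconds)  # step 4: fill the cache, with a TTL
    return value
```

The expiration time is the safety net: if the delayed eviction never runs (for example because the process dies), the stale entry still disappears once its TTL runs out.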


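Fourth, some personal views

1. Architecture evolution

  • Evolution path 1: Scheme One -> Scheme One + sharding -> Scheme Two + sharding -> Scheme Four + sharding;

  • Evolution path 2: Scheme One -> Scheme One + sharding -> Scheme Three + sharding -> Scheme Four + sharding;

  • Evolution path 3: Scheme One -> Scheme Two -> Scheme Four -> Scheme Four + sharding;

  • Evolution path 4: Scheme One -> Scheme Three -> Scheme Four -> Scheme Four + sharding;

2. Personal views

1. Adding a cache and indexes is the general-purpose way to improve database performance;

2. Sharding (splitting databases and tables) brings huge benefits, but it also brings some problems. For details, see "Database sharding: vertical or horizontal?"

3. Whether it is master-standby + sharding or master-slave + read/write separation + sharding, the specific business scenario has to be considered. After four years of growth at a certain company (某8到家), the vast majority of its database architectures still use Scheme One and Scheme One + sharding, and only a very small portion use Scheme Three + read/write separation + sharding. In addition, the database cloud services offered by Alibaba Cloud are also master-standby setups; getting master-slave with read/write separation requires a second round of architecture work.

4. Remember one thing: any architecture that ignores the business scenario is just fooling around.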

Origin blog.csdn.net/Dome_/article/details/91508573