Questions raised
What is consistent hashing? Suppose there are four cache servers, N0, N1, N2 and N3, and we need to store the data OBJECT1, OBJECT2, OBJECT3, OBJECT4, OBJECT5, OBJECT6, OBJECT7 and OBJECT8.
We want this data cached across the four servers, so the question is: how do we design the storage strategy? That is, which server should ObjectX be stored on?
There are a few ways to approach this problem.
1. The remainder hash scheme
Use hash(ObjectX) % 4 to determine the server node.
Suppose hash(OBJECT1) = 2. Since 2 % 4 = 2, Object1 should be stored on node N2.
Suppose hash(OBJECT2) = 3. Since 3 % 4 = 3, Object2 should be stored on node N3.
Suppose hash(OBJECT3) = 1. Since 1 % 4 = 1, Object3 should be stored on node N1.
Suppose hash(OBJECT4) = 0. Since 0 % 4 = 0, Object4 should be stored on node N0.
Suppose hash(OBJECT5) = 5. Since 5 % 4 = 1, Object5 should be stored on node N1.
Suppose hash(OBJECT6) = 6. Since 6 % 4 = 2, Object6 should be stored on node N2.
Suppose hash(OBJECT7) = 7. Since 7 % 4 = 3, Object7 should be stored on node N3.
Suppose hash(OBJECT8) = 8. Since 8 % 4 = 0, Object8 should be stored on node N0.
Suppose we need to read the Object3 data. Since hash(OBJECT3) = 1 and 1 % 4 = 1, we only need to access node N1.
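The remainder scheme above can be sketched in a few lines. This is only an illustration: it assumes Python, takes the hash values from the example as given rather than computing a real hash, and the names HASHES and node_for are made up for this sketch.

```python
# Hash values assumed in the text (illustrative, not real hash outputs).
HASHES = {
    "OBJECT1": 2, "OBJECT2": 3, "OBJECT3": 1, "OBJECT4": 0,
    "OBJECT5": 5, "OBJECT6": 6, "OBJECT7": 7, "OBJECT8": 8,
}


def node_for(name, node_count=4):
    """Pick a node with the remainder scheme: hash(ObjectX) % node_count."""
    return f"N{HASHES[name] % node_count}"


# Placement of every object across the four servers N0..N3.
placement = {name: node_for(name) for name in HASHES}
for name, node in placement.items():
    print(name, "->", node)
```

Reading an object uses the same function: node_for("OBJECT3") names the only node that has to be queried.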
1.1 Now suppose N3 suddenly fails and goes offline
We now have to rebuild the cache.
Use hash(ObjectX) % 3 to determine the server node.
Suppose hash(OBJECT1) = 2. Since 2 % 3 = 2, Object1 should be stored on node N2.
Suppose hash(OBJECT2) = 3. Since 3 % 3 = 0, Object2 should be stored on node N0.
Suppose hash(OBJECT3) = 1. Since 1 % 3 = 1, Object3 should be stored on node N1.
Suppose hash(OBJECT4) = 0. Since 0 % 3 = 0, Object4 should be stored on node N0.
Suppose hash(OBJECT5) = 5. Since 5 % 3 = 2, Object5 should be stored on node N2.
Suppose hash(OBJECT6) = 6. Since 6 % 3 = 0, Object6 should be stored on node N0.
Suppose hash(OBJECT7) = 7. Since 7 % 3 = 1, Object7 should be stored on node N1.
Suppose hash(OBJECT8) = 8. Since 8 % 3 = 2, Object8 should be stored on node N2.
At this point, to keep the data correct, we need to migrate:
Object2 from N3 to N0
Object5 from N1 to N2
Object6 from N2 to N0
Object7 from N3 to N1
Object8 from N0 to N2
1.2 Now suppose we add a new server N4
We again have to rebuild the cache.
Use hash(ObjectX) % 5 to determine the server node.
Suppose hash(OBJECT1) = 2. Since 2 % 5 = 2, Object1 should be stored on node N2.
Suppose hash(OBJECT2) = 3. Since 3 % 5 = 3, Object2 should be stored on node N3.
Suppose hash(OBJECT3) = 1. Since 1 % 5 = 1, Object3 should be stored on node N1.
Suppose hash(OBJECT4) = 0. Since 0 % 5 = 0, Object4 should be stored on node N0.
Suppose hash(OBJECT5) = 5. Since 5 % 5 = 0, Object5 should be stored on node N0.
Suppose hash(OBJECT6) = 6. Since 6 % 5 = 1, Object6 should be stored on node N1.
Suppose hash(OBJECT7) = 7. Since 7 % 5 = 2, Object7 should be stored on node N2.
Suppose hash(OBJECT8) = 8. Since 8 % 5 = 3, Object8 should be stored on node N3.
At this point, to keep the data correct, we need to migrate:
Object5 from N1 to N0
Object6 from N2 to N1
Object7 from N3 to N2
Object8 from N0 to N3
Object2 stays on N3, because 3 % 4 and 3 % 5 are both 3.
As the two cases above show, once the number of machines changes, a large part of the cache has to move. In other words, most cached entries become invalid at the same time, which is likely to cause a cache avalanche.
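The effect of changing the node count can be checked mechanically. The sketch below assumes Python and reuses the hash values from the example; placement_for and moved are hypothetical helper names, not part of any library.

```python
# Hash values assumed in the text (illustrative, not real hash outputs).
HASHES = {
    "OBJECT1": 2, "OBJECT2": 3, "OBJECT3": 1, "OBJECT4": 0,
    "OBJECT5": 5, "OBJECT6": 6, "OBJECT7": 7, "OBJECT8": 8,
}


def placement_for(node_count):
    """Map each object to a node index using hash % node_count."""
    return {name: h % node_count for name, h in HASHES.items()}


def moved(before, after):
    """Return the objects whose node index differs between two placements."""
    return sorted(name for name in before if before[name] != after[name])


base = placement_for(4)
moved_after_removal = moved(base, placement_for(3))   # N3 goes offline
moved_after_addition = moved(base, placement_for(5))  # N4 is added

print("N3 offline:", moved_after_removal)
print("N4 added:  ", moved_after_addition)
```

Running it lists, for each case, the objects that end up on a different node than before.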
2. Consistent hashing scheme
Now we switch to the following strategy:
0 < hash(ObjectX) % 8 <= 2: stored on N0
2 < hash(ObjectX) % 8 <= 4: stored on N1
4 < hash(ObjectX) % 8 <= 6: stored on N2
6 < hash(ObjectX) % 8 <= 8: stored on N3
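A range lookup like this can be sketched with a sorted list of upper bounds. This is a minimal illustration, assuming Python; BOUNDS, NODES and node_for are names invented here. Note that hash % 8 only yields 0 to 7, so the ranges as written leave a remainder of 0 unassigned; the sketch assumes that 0 wraps around to 8, which is an assumption and not something the text states.

```python
import bisect

# Upper bound of each range and the node that owns it, as listed in the text.
BOUNDS = [2, 4, 6, 8]
NODES = ["N0", "N1", "N2", "N3"]


def node_for(hash_value, bounds=BOUNDS, nodes=NODES):
    """Return the node whose range (lower, upper] contains hash_value % 8."""
    slot = hash_value % 8
    if slot == 0:
        slot = 8  # ASSUMPTION: a remainder of 0 wraps around to the last range
    # bisect_left returns the index of the first upper bound >= slot.
    return nodes[bisect.bisect_left(bounds, slot)]


for h in range(9):
    print(h, "->", node_for(h))
```

Keeping the bounds sorted means a lookup is a binary search, and changing the owner of one range does not touch the others.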
2.1 Now suppose N3 suddenly fails and goes offline
We again have to rebuild the cache. The strategy is adjusted as follows:
0 < hash(ObjectX) % 8 <= 2: stored on N0
2 < hash(ObjectX) % 8 <= 4: stored on N1
4 < hash(ObjectX) % 8 <= 6: stored on N2
6 < hash(ObjectX) % 8 <= 8: stored on N0
At this point, to keep the data correct, we need to migrate the ObjectX data from N3 to N0. Only the data that belonged to N3 is affected.
2.2 Now suppose we add a new server N4
We again have to rebuild the cache. The strategy is adjusted as follows:
0 < hash(ObjectX) % 8 <= 2: stored on N0
2 < hash(ObjectX) % 8 <= 4: stored on N1
4 < hash(ObjectX) % 8 <= 5: stored on N2
5 < hash(ObjectX) % 8 <= 6: stored on N4
6 < hash(ObjectX) % 8 <= 8: stored on N3
At this point, to keep the data correct, we need to copy data from N2 to N4. Only the data that belonged to N2 is affected.
Comparing the two approaches, scheme 2 is clearly better. Scheme 2 is consistent hashing.
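The adjustments in 2.1 and 2.2 can be checked the same way as the remainder scheme. The sketch below assumes Python, reuses the example hash values from section 1, and assumes that a remainder of 0 wraps around to 8 (the text does not say how 0 is handled). The table names and the helpers node_for and moved are illustrative only.

```python
import bisect

# Hash values assumed in section 1 (illustrative, not real hash outputs).
HASHES = {
    "OBJECT1": 2, "OBJECT2": 3, "OBJECT3": 1, "OBJECT4": 0,
    "OBJECT5": 5, "OBJECT6": 6, "OBJECT7": 7, "OBJECT8": 8,
}

# Each table lists (upper bound of the range, owning node), sorted by bound.
ORIGINAL = [(2, "N0"), (4, "N1"), (6, "N2"), (8, "N3")]
N3_OFFLINE = [(2, "N0"), (4, "N1"), (6, "N2"), (8, "N0")]
N4_ADDED = [(2, "N0"), (4, "N1"), (5, "N2"), (6, "N4"), (8, "N3")]


def node_for(hash_value, table):
    """Return the node whose range (lower, upper] contains hash_value % 8."""
    slot = hash_value % 8 or 8  # ASSUMPTION: remainder 0 wraps around to 8
    bounds = [upper for upper, _ in table]
    return table[bisect.bisect_left(bounds, slot)][1]


def moved(before, after):
    """Return the objects that land on a different node under the new table."""
    return sorted(
        name for name, h in HASHES.items()
        if node_for(h, before) != node_for(h, after)
    )


moved_offline = moved(ORIGINAL, N3_OFFLINE)
moved_added = moved(ORIGINAL, N4_ADDED)
print("N3 offline:", moved_offline)
print("N4 added:  ", moved_added)
```

Under these assumptions, the only objects that move are ones that were on N3 in the first case and on N2 in the second, which matches the claim that a change affects a single node's data.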
2.3 Shortcomings
The fewer machines there are, the more uneven the load on each machine becomes. The solution is to add virtual nodes and adjust the strategy as follows. As you can imagine, the more data there is, the more evenly it is distributed.
0 < hash(ObjectX) % 8 <= 1: stored on N0
1 < hash(ObjectX) % 8 <= 2: stored on N1
2 < hash(ObjectX) % 8 <= 3: stored on N2
3 < hash(ObjectX) % 8 <= 4: stored on N3
4 < hash(ObjectX) % 8 <= 5: stored on N0
5 < hash(ObjectX) % 8 <= 6: stored on N1
6 < hash(ObjectX) % 8 <= 7: stored on N2
7 < hash(ObjectX) % 8 <= 8: stored on N3
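To get a feel for the difference, the load per node can be counted for both tables. This is a toy comparison under stated assumptions: Python, the eight example hash values from section 1, and a remainder of 0 wrapping around to 8. With only eight objects it illustrates the idea and proves nothing about real workloads; PLAIN, VIRTUAL, node_for and load are names made up for the sketch.

```python
import bisect
from collections import Counter

# Hash values assumed in section 1 (illustrative, not real hash outputs).
HASHES = {
    "OBJECT1": 2, "OBJECT2": 3, "OBJECT3": 1, "OBJECT4": 0,
    "OBJECT5": 5, "OBJECT6": 6, "OBJECT7": 7, "OBJECT8": 8,
}

# (upper bound, node) tables: one range per node vs. two virtual nodes each.
PLAIN = [(2, "N0"), (4, "N1"), (6, "N2"), (8, "N3")]
VIRTUAL = [(1, "N0"), (2, "N1"), (3, "N2"), (4, "N3"),
           (5, "N0"), (6, "N1"), (7, "N2"), (8, "N3")]


def node_for(hash_value, table):
    """Return the node whose range (lower, upper] contains hash_value % 8."""
    slot = hash_value % 8 or 8  # ASSUMPTION: remainder 0 wraps around to 8
    bounds = [upper for upper, _ in table]
    return table[bisect.bisect_left(bounds, slot)][1]


def load(table):
    """Count how many of the example objects each node receives."""
    return Counter(node_for(h, table) for h in HASHES.values())


plain_load = load(PLAIN)
virtual_load = load(VIRTUAL)
print("one range per node:", dict(plain_load))
print("virtual nodes:     ", dict(virtual_load))
```

For this particular sample the plain table gives N3 three objects and N1 one, while the virtual-node table gives every node two.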
3. The principle of consistent hashing
There are already many explanations of the principle online, so it is not elaborated further here.