(转)深入剖析Redis主从复制

一、主从概述

Redis 支持 Master-Slave（主从）模式，Redis Server 可以设置为另一个 Redis Server 的主机（从机），从机定期从主机拿数据。特殊的，一个从机同样可以设置为一个 Redis Server 的主机，这样一来 Master-Slave 的分布看起来就是一个有向无环图 DAG，如此形成 Redis Server 集群，无论是主机还是从机都是 Redis Server，都可以提供服务。

在配置后，主机可负责读写服务，从机只负责读。Redis 提高这种配置方式，为的是让其支持数据的弱一致性，即最终一致性。在业务中，选择强一致性还是弱一致性，应该取决于具体的业务需求，像微博，完全可以使用弱一致性模型；像淘宝，可以选用强一致性模型。

Redis 主从复制的实现主要在 replication.c 中。

这篇文章涉及较多的代码，但我已经尽量删繁就简，达到能说明问题本质。为了保留代码的原生性并让读者能够阅读原生代码的注释，剖析 Redis 的几篇文章都没有删除代码中的英文注释，并已加注释。

二、积压空间

在《深入剖析 Redis AOF 持久化策略》中，介绍了更新缓存的概念，举一个例子：客户端发来命令：set name Jhon，这一数据更新被记录为：*3\r\n$3\r\nSET\r\n$4\r\nname\r\n$3\r\nJhon\r\n，并存储在更新缓存中。

同样，在主从连接中，也有更新缓存的概念。只是两者的用途不一样，前者被写入本地，后者被写入从机，这里我们把它称为积压空间。

更新缓存存储在 server.repl_backlog，Redis 将其作为一个环形空间来处理，这样做节省了空间，避免内存再分配的情况。

 
        struct  
        redisServer { 
       
        /* Replication (master) */ 
       
        // 最近一次使用（访问）的数据集 
       
        int  
        slaveseldb;                  
        /* Last SELECTed DB in replication output */ 
       
        // 全局的数据同步偏移量 
       
        long  
        long  
        master_repl_offset;    
        /* Global replication offset */ 
       
        // 主从连接心跳频率 
       
        int  
        repl_ping_slave_period;      
        /* Master pings the slave every N seconds */ 
       
        // 积压空间指针 
       
        char  
        *repl_backlog;              
        /* Replication backlog for partial syncs */ 
       
        // 积压空间大小 
       
        long  
        long  
        repl_backlog_size;     
        /* Backlog circular buffer size */ 
       
        // 积压空间中写入的新数据的大小 
       
        long  
        long  
        repl_backlog_histlen;  
        /* Backlog actual data length */ 
       
        // 下一次向积压空间写入数据的起始位置 
       
        long  
        long  
        repl_backlog_idx;      
        /* Backlog circular buffer current offset */ 
       
        // 积压数据的起始位置，是一个宏观值 
       
        long  
        long  
        repl_backlog_off;      
        /* Replication offset of first byte 
       
        in the backlog buffer. */ 
       
        // 积压空间有效时间 
       
        time_t  
        repl_backlog_time_limit;  
        /* Time without slaves after the backlog gets released. */ 
       
        }

积压空间中的数据变更记录是什么时候被写入的？在执行一个 Redis 命令的时候，如果存在数据的修改（写），那么就会把变更记录传播。Redis 源码中是这么实现的：call()->propagate()->replicationFeedSlaves()

注释：命令真正执行的地方在 call() 中，call() 如果发现数据被修改（dirty），则传播 propagrate()，replicationFeedSlaves() 将修改记录写入积压空间和所有已连接的从机。

这里可能会有疑问：为什么把数据添加入积压空间，又把数据分发给所有的从机？为什么不仅仅将数据分发给所有从机呢？

因为有一些从机会因特殊情况（？？？）与主机断开连接，注意从机断开前有暂存主机的状态信息，因此这些断开的从机就没有及时收到更新的数据。Redis 为了让断开的从机在下次连接后能够获取更新数据，将更新数据加入了积压空间。从 replicationFeedSlaves() 实现来看，在线的 Slave 能马上收到数据更新记录；因某些原因暂时断开连接的 Slave，需要从积压空间中找回断开期间的数据更新记录。如果断开的时间足够长，Master 会拒绝 Slave 的部分同步请求，从而 Slave 只能进行全同步。

下面是源码注释：

 
        // call() 函数是执行命令的核心函数，真正执行命令的地方 
       
        /* Call() is the core of Redis execution of a command */ 
       
        void  
        call(redisClient *c,  
        int  
        flags) { 
       
        ...... 
       
        /* Call the command. */ 
       
        c->flags &= ~(REDIS_FORCE_AOF | REDIS_FORCE_REPL); 
       
        redisOpArrayInit(&server.also_propagate); 
       
        // 脏数据标记，数据是否被修改 
       
        dirty = server.dirty; 
       
        // 执行命令对应的函数 
       
        c->cmd->proc(c); 
       
        dirty = server.dirty - dirty; 
       
        duration = ustime() - start; 
       
        ...... 
       
        // 将客户端请求的数据修改记录传播给 AOF 和从机 
       
        /* Propagate the command into the AOF and replication link */ 
       
        if  
        (flags & REDIS_CALL_PROPAGATE) { 
       
        int  
        flags = REDIS_PROPAGATE_NONE; 
       
        // 强制主从复制 
       
        if  
        (c->flags & REDIS_FORCE_REPL) flags |= REDIS_PROPAGATE_REPL; 
       
        // 强制 AOF 持久化 
       
        if  
        (c->flags & REDIS_FORCE_AOF) flags |= REDIS_PROPAGATE_AOF; 
       
        // 数据被修改 
       
        if  
        (dirty) 
       
        flags |= (REDIS_PROPAGATE_REPL | REDIS_PROPAGATE_AOF); 
       
        // 传播数据修改记录 
       
        if  
        (flags != REDIS_PROPAGATE_NONE) 
       
        propagate(c->cmd, c->db->id, c->argv, c->argc, flags); 
       
        } 
       
        ...... 
       
        } 
       
        // 向 AOF 和从机发布数据更新 
       
        /* Propagate the specified command (in the context of the specified database id) 
       
        * to AOF and Slaves. 
       
        * 
       
        * flags are an xor between: 
       
        * + REDIS_PROPAGATE_NONE (no propagation of command at all) 
       
        * + REDIS_PROPAGATE_AOF (propagate into the AOF file if is enabled) 
       
        * + REDIS_PROPAGATE_REPL (propagate into the replication link) 
       
        */ 
       
        void  
        propagate( 
        struct  
        redisCommand *cmd,  
        int  
        dbid, robj **argv,  
        int  
        argc, 
       
        int  
        flags) 
       
        { 
       
        // AOF 策略需要打开，且设置 AOF 传播标记，将更新发布给本地文件 
       
        if  
        (server.aof_state != REDIS_AOF_OFF && flags & REDIS_PROPAGATE_AOF) 
       
        feedAppendOnlyFile(cmd, dbid, argv, argc); 
       
        // 设置了从机传播标记，将更新发布给从机 
       
        if  
        (flags & REDIS_PROPAGATE_REPL) 
       
        replicationFeedSlaves(server.slaves,dbid,argv,argc); 
       
        } 
       
        // 向积压空间和从机发送数据 
       
        void  
        replicationFeedSlaves(list *slaves,  
        int  
        dictid, robj **argv,  
        int  
        argc) { 
       
        listNode *ln; 
       
        listIter li; 
       
        int  
        j, len; 
       
        char  
        llstr[REDIS_LONGSTR_SIZE]; 
       
        // 没有积压数据且没有从机，直接退出 
       
        /* If there aren't slaves, and there is no backlog buffer to populate, 
       
        * we can return ASAP. */ 
       
        if  
        (server.repl_backlog == NULL && listLength(slaves) == 0)  
        return 
        ; 
       
        /* We can't have slaves attached and no backlog. */ 
       
        redisAssert(!(listLength(slaves) != 0 && server.repl_backlog == NULL)); 
       
        /* Send SELECT command to every slave if needed. */ 
       
        if  
        (server.slaveseldb != dictid) { 
       
        robj *selectcmd; 
       
        // 小于等于 10 的可以用共享对象 
       
        /* For a few DBs we have pre-computed SELECT command. */ 
       
        if  
        (dictid >= 0 && dictid < REDIS_SHARED_SELECT_CMDS) { 
       
        selectcmd = shared.select[dictid]; 
       
        }  
        else  
        { 
       
        // 不能使用共享对象，生成 SELECT 命令对应的 redis 对象 
       
        int  
        dictid_len; 
       
        dictid_len = ll2string(llstr,  
        sizeof 
        (llstr), dictid); 
       
        selectcmd = createObject(REDIS_STRING, 
       
        sdscatprintf(sdsempty(), 
       
        "*2\r\n$6\r\nSELECT\r\n$%d\r\n%s\r\n" 
        , 
       
        dictid_len, llstr)); 
       
        } 
       
        // 这里可能会有疑问：为什么把数据添加入积压空间，又把数据分发给所有的从机？ 
       
        // 为什么不仅仅将数据分发给所有从机呢？ 
       
        // 因为有一些从机会因特殊情况（？？？）与主机断开连接，注意从机断开前有暂存 
       
        // 主机的状态信息，因此这些断开的从机就没有及时收到更新的数据。redis 为了让 
       
        // 断开的从机在下次连接后能够获取更新数据，将更新数据加入了积压空间。 
       
        // 将 SELECT 命令对应的 redis 对象数据添加到积压空间 
       
        /* Add the SELECT command into the backlog. */ 
       
        if  
        (server.repl_backlog) feedReplicationBacklogWithObject(selectcmd); 
       
        // 将数据分发所有的从机 
       
        /* Send it to slaves. */ 
       
        listRewind(slaves, &li); 
       
        while 
        ((ln = listNext(&li))) { 
       
        redisClient *slave = ln->value; 
       
        addReply(slave, selectcmd); 
       
        } 
       
        // 销毁对象 
       
        if  
        (dictid < 0 || dictid >= REDIS_SHARED_SELECT_CMDS) 
       
        decrRefCount(selectcmd); 
       
        } 
       
        // 更新最近一次使用（访问）的数据集 
       
        server.slaveseldb = dictid; 
       
        // 将命令写入积压空间 
       
        /* Write the command to the replication backlog if any. */ 
       
        if  
        (server.repl_backlog) { 
       
        char  
        aux[REDIS_LONGSTR_SIZE+3]; 
       
        // 命令个数 
       
        /* Add the multi bulk reply length. */ 
       
        aux[0] =  
        '*' 
        ; 
       
        len = ll2string(aux + 1,  
        sizeof 
        (aux) - 1, argc); 
       
        aux[len+1] =  
        '\r' 
        ; 
       
        aux[len+2] =  
        '\n' 
        ; 
       
        feedReplicationBacklog(aux, len + 3); 
       
        // 逐个命令写入 
       
        for  
        (j = 0; j < argc; j++) { 
       
        long  
        objlen = stringObjectLen(argv[j]); 
       
        /* We need to feed the buffer with the object as a bulk reply 
       
        * not just as a plain string, so create the $..CRLF payload len 
       
        * ad add the final CRLF */ 
       
        aux[0] =  
        '$' 
        ; 
       
        len = ll2string(aux + 1,  
        sizeof 
        (aux) - 1, objlen); 
       
        aux[len+1] =  
        '\r' 
        ; 
       
        aux[len+2] =  
        '\n' 
        ; 
       
        /* 每个命令格式如下： 
       
        $3 
       
        *3 
       
        SET 
       
        *4 
       
        NAME 
       
        *4 
       
        Jhon*/ 
       
        // 命令长度 
       
        feedReplicationBacklog(aux, len + 3); 
       
        // 命令 
       
        feedReplicationBacklogWithObject(argv[j]); 
       
        // 换行 
       
        feedReplicationBacklog(aux + len + 1, 2); 
       
        } 
       
        } 
       
        // 立即给每一个从机发送命令 
       
        /* Write the command to every slave. */ 
       
        listRewind(slaves, &li); 
       
        while 
        ((ln = listNext(&li))) { 
       
        redisClient *slave = ln->value; 
       
        // 如果从机要求全同步，则不对此从机发送数据 
       
        /* Don't feed slaves that are still waiting for BGSAVE to start */ 
       
        if  
        (slave->replstate == REDIS_REPL_WAIT_BGSAVE_START)  
        continue 
        ; 
       
        /* Feed slaves that are waiting for the initial SYNC (so these commands 
       
        * are queued in the output buffer until the initial SYNC completes), 
       
        * or are already in sync with the master. */ 
       
        // 向从机命令的长度 
       
        /* Add the multi bulk length. */ 
       
        addReplyMultiBulkLen(slave, argc); 
       
        // 向从机发送命令 
       
        /* Finally any additional argument that was not stored inside the 
       
        * static buffer if any (from j to argc). */ 
       
        for  
        (j = 0; j < argc; j++) 
       
        addReplyBulk(slave, argv[j]); 
       
        } 
       
        }

三、主从数据同步机制概述

Redis 主从同步有两种方式（或者所两个阶段）：全同步和部分同步。

主从刚刚连接的时候，进行全同步；全同步结束后，进行部分同步。当然，如果有需要，Slave 在任何时候都可以发起全同步。Redis 策略是，无论如何，首先会尝试进行部分同步，如不成功，要求从机进行全同步，并启动 BGSAVE……BGSAVE 结束后，传输 RDB 文件；如果成功，允许从机进行部分同步，并传输积压空间中的数据。

下面这幅图，总结了主从同步的机制：

如需设置 Slave，Master 需要向 Slave 发送 SLAVEOF hostname port，从机接收到后会自动连接主机，注册相应读写事件（syncWithMaster())。

 
        // 修改主机 
       
        void  
        slaveofCommand(redisClient *c) { 
       
        if  
        (!strcasecmp(c->argv[1]->ptr,  
        "no" 
        ) && 
       
        !strcasecmp(c->argv[2]->ptr,  
        "one" 
        )) { 
       
        // slaveof no one 断开主机连接 
       
        if  
        (server.masterhost) { 
       
        replicationUnsetMaster(); 
       
        redisLog(REDIS_NOTICE,  
        "MASTER MODE enabled (user request)" 
        ); 
       
        } 
       
        }  
        else  
        { 
       
        long  
        port; 
       
        if  
        ((getLongFromObjectOrReply(c, c->argv[2], &port, NULL) != REDIS_OK)) 
       
        return 
        ; 
       
        // 可能已经连接需要连接的主机 
       
        /* Check if we are already attached to the specified slave */ 
       
        if  
        (server.masterhost && !strcasecmp(server.masterhost,c->argv[1]->ptr) 
       
        && server.masterport == port) { 
       
        redisLog(REDIS_NOTICE, 
       
        "SLAVE OF would result into synchronization with the master we are 
       
        already connected with. No operation performed."); 
       
        addReplySds(c, sdsnew( 
        "+OK Already connected to specified master\r\n" 
        )); 
       
        return 
        ; 
       
        } 
       
        // 断开之前连接主机的连接，连接新的。 replicationSetMaster() 并不会真正连接主机， 
       
        // 只是修改 struct server 中关于主机的设置。真正的主机连接在 replicationCron() 中完成 
       
        /* There was no previous master or the user specified a different one, 
       
        * we can continue. */ 
       
        replicationSetMaster(c->argv[1]->ptr, port); 
       
        redisLog(REDIS_NOTICE,  
        "SLAVE OF %s:%d enabled (user request)" 
        , 
       
        server.masterhost, server.masterport); 
       
        } 
       
        addReply(c,shared.ok); 
       
        } 
       
        // 设置新主机 
       
        /* Set replication to the specified master address and port. */ 
       
        void  
        replicationSetMaster( 
        char  
        *ip,  
        int  
        port) { 
       
        sdsfree(server.masterhost); 
       
        server.masterhost = sdsdup(ip); 
       
        server.masterport = port; 
       
        // 断开之前主机的连接 
       
        if  
        (server.master) freeClient(server.master); 
       
        disconnectSlaves();  
        /* Force our slaves to resync with us as well. */ 
       
        // 取消缓存主机 
       
        replicationDiscardCachedMaster();  
        /* Don't try a PSYNC. */ 
       
        // 释放积压空间 
       
        freeReplicationBacklog();  
        /* Don't allow our chained slaves to PSYNC. */ 
       
        // cancelReplicationHandshake() 尝试断开数据传输和主机连接 
       
        cancelReplicationHandshake(); 
       
        server.repl_state = REDIS_REPL_CONNECT; 
       
        server.master_repl_offset = 0; 
       
        } 
       
        // 管理主从连接的定时程序定时程序，每秒执行一次 
       
        // 在 serverCorn() 中调用 
       
        /* --------------------------- REPLICATION CRON  ----------------------------- */ 
       
        /* Replication cron funciton, called 1 time per second. */ 
       
        void  
        replicationCron( 
        void 
        ) { 
       
        ...... 
       
        // 如果需要（ REDIS_REPL_CONNECT），尝试连接主机，真正连接主机的操作在这里 
       
        /* Check if we should connect to a MASTER */ 
       
        if  
        (server.repl_state == REDIS_REPL_CONNECT) { 
       
        redisLog(REDIS_NOTICE,  
        "Connecting to MASTER %s:%d" 
        , 
       
        server.masterhost, server.masterport); 
       
        if  
        (connectWithMaster() == REDIS_OK) { 
       
        redisLog(REDIS_NOTICE,  
        "MASTER <-> SLAVE sync started" 
        ); 
       
        } 
       
        } 
       
        ...... 
       
        }

四、全同步

接着自动发起 PSYNC 请求 Master 进行全同步。无论如何，Redis 首先会尝试部分同步，如果失败才尝试全同步。而刚刚建立连接的 Master-Slave 需要全同步。

从机连接主机后，会主动发起 PSYNC 命令，从机会提供 master_runid 和 offset，主机验证 master_runid 和 offset 是否有效？master_runid 相当于主机身份验证码，用来验证从机上一次连接的主机，offset 是全局积压空间数据的偏移量。

验证未通过则，则进行全同步：主机返回 +FULLRESYNC master_runid offset（从机接收并记录 master_runid 和 offset，并准备接收 RDB 文件）接着启动 BGSAVE 生成 RDB 文件，BGSAVE 结束后，向从机传输，从而完成全同步。

 
        // 连接主机 connectWithMaster() 的时候，会被注册为回调函数 
       
        void  
        syncWithMaster(aeEventLoop *el,  
        int  
        fd,  
        void  
        *privdata,  
        int  
        mask) { 
       
        char  
        tmpfile 
        [256], *err; 
       
        int  
        dfd, maxtries = 5; 
       
        int  
        sockerr = 0, psync_result; 
       
        socklen_t errlen =  
        sizeof 
        (sockerr); 
       
        ...... 
       
        // 这里尝试向主机请求部分同步，主机会回复以拒绝或接受请求。如果拒绝部分同步， 
       
        // 会返回 +FULLRESYNC master_runid offset 
       
        // 从机接收后准备进行全同步    psync_result = slaveTryPartialResynchronization(fd); 
       
        if  
        (psync_result == PSYNC_CONTINUE) { 
       
        redisLog(REDIS_NOTICE, 
       
        "MASTER <-> SLAVE sync: Master accepted a Partial Resynchronization." 
        ); 
       
        return 
        ; 
       
        } 
       
        // 执行全同步 
       
        /* Fall back to SYNC if needed. Otherwise psync_result == PSYNC_FULLRESYNC 
       
        * and the server.repl_master_runid and repl_master_initial_offset are 
       
        * already populated. */ 
       
        // 未知结果，进行出错处理 
       
        if  
        (psync_result == PSYNC_NOT_SUPPORTED) { 
       
        redisLog(REDIS_NOTICE, 
        "Retrying with SYNC..." 
        ); 
       
        if  
        (syncWrite(fd, 
        "SYNC\r\n" 
        ,6,server.repl_syncio_timeout*1000) == -1) { 
       
        redisLog(REDIS_WARNING, 
        "I/O error writing to MASTER: %s" 
        , 
       
        strerror 
        ( 
        errno 
        )); 
       
        goto  
        error; 
       
        } 
       
        } 
       
        // 为什么要尝试 5次？？？ 
       
        /* Prepare a suitable temp file for bulk transfer */ 
       
        while 
        (maxtries--) { 
       
        snprintf( 
        tmpfile 
        , 256,  
        "temp-%d.%ld.rdb" 
        , ( 
        int 
        )server.unixtime, ( 
        long  
        int 
        )getpid()); 
       
        dfd = open( 
        tmpfile 
        , O_CREAT|O_WRONLY|O_EXCL, 0644); 
       
        if  
        (dfd != -1)  
        break 
        ; 
       
        sleep(1); 
       
        } 
       
        if  
        (dfd == -1) { 
       
        redisLog(REDIS_WARNING, 
       
        "Opening the temp file needed for MASTER <-> SLAVE synchronization: %s" 
        , 
       
        strerror 
        ( 
        errno 
        )); 
       
        goto  
        error; 
       
        } 
       
        // 注册读事件，回调函数 readSyncBulkPayload()， 准备读 RDB 文件 
       
        /* Setup the non blocking download of the bulk file. */ 
       
        if  
        (aeCreateFileEvent(server.el,fd, AE_READABLE, readSyncBulkPayload, NULL) == AE_ERR) { 
       
        redisLog(REDIS_WARNING,  
        "Can't create readable event for SYNC: %s (fd=%d)" 
        , 
       
        strerror 
        ( 
        errno 
        ), fd); 
       
        goto  
        error; 
       
        } 
       
        // 设置传输 RDB 文件数据的选项 
       
        // 状态 
       
        server.repl_state = REDIS_REPL_TRANSFER; 
       
        // RDB 文件大小 
       
        server.repl_transfer_size = -1; 
       
        // 已经传输的大小 
       
        server.repl_transfer_read = 0; 
       
        // 上一次同步的偏移，为的是定时写入磁盘 
       
        server.repl_transfer_last_fsync_off = 0; 
       
        // 本地 RDB 文件套接字 
       
        server.repl_transfer_fd = dfd; 
       
        // 上一次同步 IO 时间 
       
        server.repl_transfer_lastio = server.unixtime; 
       
        // 临时文件名 
       
        server.repl_transfer_tmpfile = zstrdup( 
        tmpfile 
        ); 
       
        return 
        ; 
       
        error: 
       
        close(fd); 
       
        server.repl_transfer_s = -1; 
       
        server.repl_state = REDIS_REPL_CONNECT; 
       
        return 
        ; 
       
        }

全同步请求的数据是 RDB 数据文件和积压空间中的数据。关于 RDB 数据文件，请参看《深入剖析 Redis RDB 持久化策略》。如果没有后台持久化 BGSAVE 进程，那么 BGSVAE 会被触发，否则所有请求全同步的 Slave 都会被标记为等待 BGSAVE 结束。BGSAVE 结束后，Master 会马上向所有的从机发送 RDB 文件。

 
        // 主机 SYNC 和 PSYNC 命令处理函数，会尝试进行部分同步和全同步 
       
        /* SYNC ad PSYNC command implemenation. */ 
       
        void  
        syncCommand(redisClient *c) { 
       
        ...... 
       
        // 主机尝试部分同步，失败的话向从机发送 +FULLRESYNC master_runid offset，接着启动 BGSAVE 
       
        // 执行全同步： 
       
        /* Full resynchronization. */ 
       
        server.stat_sync_full++; 
       
        /* Here we need to check if there is a background saving operation 
       
        * in progress, or if it is required to start one */ 
       
        if  
        (server.rdb_child_pid != -1) { 
       
        /*  存在 BGSAVE 后台进程。 
       
        1.如果 master 现有所连接的所有从机 slaves 当中有存在 REDIS_REPL_WAIT_BGSAVE_END 的从机， 
       
        那么将从机 c 设置为 REDIS_REPL_WAIT_BGSAVE_END； 
       
        2.否则，设置为 REDIS_REPL_WAIT_BGSAVE_START*/ 
       
        /* Ok a background save is in progress. Let's check if it is a good 
       
        * one for replication, i.e. if there is another slave that is 
       
        * registering differences since the server forked to save */ 
       
        redisClient *slave; 
       
        listNode *ln; 
       
        listIter li; 
       
        // 检测是否已经有从机申请全同步 
       
        listRewind(server.slaves, &li); 
       
        while 
        ((ln = listNext(&li))) { 
       
        slave = ln->value; 
       
        if  
        (slave->replstate == REDIS_REPL_WAIT_BGSAVE_END)  
        break 
        ; 
       
        } 
       
        if  
        (ln) { 
       
        // 存在状态为 REDIS_REPL_WAIT_BGSAVE_END 的从机 slave， 
       
        // 就将此从机 c 状态设置为 REDIS_REPL_WAIT_BGSAVE_END， 
       
        // 从而在 BGSAVE 进程结束后，可以发送 RDB 文件， 
       
        // 同时将从机 slave 中的更新复制到此从机 c。 
       
        /* Perfect, the server is already registering differences for 
       
        * another slave. Set the right state, and copy the buffer. */ 
       
        // 将其他从机上的待回复的缓存复制到从机 c 
       
        copyClientOutputBuffer(c, slave); 
       
        // 修改从机 c 状态为「等待 BGSAVE 进程结束」 
       
        c->replstate = REDIS_REPL_WAIT_BGSAVE_END; 
       
        redisLog(REDIS_NOTICE,  
        "Waiting for end of BGSAVE for SYNC" 
        ); 
       
        }  
        else  
        { 
       
        // 不存在状态为 REDIS_REPL_WAIT_BGSAVE_END 的从机，就将此从机 c 状态设置为 
       
        // REDIS_REPL_WAIT_BGSAVE_START，即等待新的 BGSAVE 进程的开启。 
       
        // 修改状态为「等待 BGSAVE 进程开始」 
       
        /* No way, we need to wait for the next BGSAVE in order to 
       
        * register differences */ 
       
        c->replstate = REDIS_REPL_WAIT_BGSAVE_START; 
       
        redisLog(REDIS_NOTICE,  
        "Waiting for next BGSAVE for SYNC" 
        ); 
       
        } 
       
        }  
        else  
        { 
       
        // 不存在 BGSAVE 后台进程，启动一个新的 BGSAVE 进程 
       
        /* Ok we don't have a BGSAVE in progress, let's start one */ 
       
        redisLog(REDIS_NOTICE,  
        "Starting BGSAVE for SYNC" 
        ); 
       
        if  
        (rdbSaveBackground(server.rdb_filename) != REDIS_OK) { 
       
        redisLog(REDIS_NOTICE,  
        "Replication failed, can't BGSAVE" 
        ); 
       
        addReplyError(c,  
        "Unable to perform background save" 
        ); 
       
        return 
        ; 
       
        } 
       
        // 将此从机 c 状态设置为 REDIS_REPL_WAIT_BGSAVE_END，从而在 BGSAVE 进程结束后， 
       
        // 可以发送 RDB 文件，同时将从机 slave 中的更新复制到此从机 c。 
       
        c->replstate = REDIS_REPL_WAIT_BGSAVE_END; 
       
        // 清理脚本缓存？？？ 
       
        /* Flush the script cache for the new slave. */ 
       
        replicationScriptCacheFlush(); 
       
        } 
       
        if  
        (server.repl_disable_tcp_nodelay) 
       
        anetDisableTcpNoDelay(NULL, c->fd);  
        /* Non critical if it fails. */ 
       
        c->repldbfd = -1; 
       
        c->flags |= REDIS_SLAVE; 
       
        server.slaveseldb = -1;  
        /* Force to re-emit the SELECT command. */ 
       
        listAddNodeTail(server.slaves,c); 
       
        if  
        (listLength(server.slaves) == 1 && server.repl_backlog == NULL) 
       
        createReplicationBacklog(); 
       
        return 
        ; 
       
        } 
       
        // BGSAVE 结束后，会调用 
       
        /* A background saving child (BGSAVE) terminated its work. Handle this. */ 
       
        void  
        backgroundSaveDoneHandler( 
        int  
        exitcode,  
        int  
        bysignal) { 
       
        // 其他操作 
       
        ...... 
       
        // 可能从机正在等待 BGSAVE 进程的终止 
       
        /* Possibly there are slaves waiting for a BGSAVE in order to be served 
       
        * (the first stage of SYNC is a bulk transfer of dump.rdb) */ 
       
        updateSlavesWaitingBgsave(exitcode == 0 ? REDIS_OK : REDIS_ERR); 
       
        } 
       
        // 当 RDB 持久化(backgroundSaveDoneHandler())结束后，会调用此函数 
       
        // RDB 文件就绪，给所有的从机发送 RDB 文件 
       
        /* This function is called at the end of every background saving. 
       
        * The argument bgsaveerr is REDIS_OK if the background saving succeeded 
       
        * otherwise REDIS_ERR is passed to the function. 
       
        * 
       
        * The goal of this function is to handle slaves waiting for a successful 
       
        * background saving in order to perform non-blocking synchronization. */ 
       
        void  
        updateSlavesWaitingBgsave( 
        int  
        bgsaveerr) { 
       
        listNode *ln; 
       
        int  
        startbgsave = 0; 
       
        listIter li; 
       
        listRewind(server.slaves,&li); 
       
        while 
        ((ln = listNext(&li))) { 
       
        redisClient *slave = ln->value; 
       
        // 等待 BGSAVE 开始。调整状态为等待下一次 BGSAVE 进程的结束 
       
        if  
        (slave->replstate == REDIS_REPL_WAIT_BGSAVE_START) { 
       
        startbgsave = 1; 
       
        slave->replstate = REDIS_REPL_WAIT_BGSAVE_END; 
       
        // 等待 BGSAVE 结束。准备向 slave 发送 RDB 文件 
       
        }  
        else  
        if  
        (slave->replstate == REDIS_REPL_WAIT_BGSAVE_END) { 
       
        struct  
        redis_stat buf; 
       
        // 如果 RDB 持久化失败， bgsaveerr 会被设置为 REDIS_ERR 
       
        if  
        (bgsaveerr != REDIS_OK) { 
       
        freeClient(slave); 
       
        redisLog(REDIS_WARNING,  
        "SYNC failed. BGSAVE child returned an error" 
        ); 
       
        continue 
        ; 
       
        } 
       
        // 打开 RDB 文件 
       
        if  
        ((slave->repldbfd = open(server.rdb_filename, O_RDONLY)) == -1 || 
       
        redis_fstat(slave->repldbfd, &buf) == -1) { 
       
        freeClient(slave); 
       
        redisLog(REDIS_WARNING, 
       
        "SYNC failed. Can't open/stat DB after BGSAVE: %s" 
        ,  
        strerror 
        ( 
        errno 
        )); 
       
        continue 
        ; 
       
        } 
       
        slave->repldboff = 0; 
       
        slave->repldbsize = buf.st_size; 
       
        slave->replstate = REDIS_REPL_SEND_BULK; 
       
        // 如果之前有注册写事件，取消 
       
        aeDeleteFileEvent(server.el,slave->fd,AE_WRITABLE); 
       
        // 注册新的写事件,sendBulkToSlave() 传输 RDB 文件 
       
        if  
        (aeCreateFileEvent(server.el, slave->fd, AE_WRITABLE, sendBulkToSlave, 
       
        slave) == AE_ERR) { 
       
        freeClient(slave); 
       
        continue 
        ; 
       
        } 
       
        } 
       
        } 
       
        // startbgsave == REDIS_ERR 表示 BGSAVE 失败，再一次进行 BGSAVE 尝试 
       
        if  
        (startbgsave) { 
       
        /* Since we are starting a new background save for one or more slaves, 
       
        * we flush the Replication Script Cache to use EVAL to propagate every 
       
        * new EVALSHA for the first time, since all the new slaves don't know 
       
        * about previous scripts. */ 
       
        replicationScriptCacheFlush(); 
       
        if  
        (rdbSaveBackground(server.rdb_filename) != REDIS_OK) { 
       
        /* BGSAVE 可能 fork 失败，所有等待 BGSAVE 的从机都将结束连接。 
       
        * 这是 redis 自我保护的措施，fork 失败很可能是内存紧张 
       
        */ 
       
        listIter li; 
       
        listRewind(server.slaves,&li); 
       
        redisLog(REDIS_WARNING,  
        "SYNC failed. BGSAVE failed" 
        ); 
       
        while 
        ((ln = listNext(&li))) { 
       
        redisClient *slave = ln->value; 
       
        if  
        (slave->replstate == REDIS_REPL_WAIT_BGSAVE_START) 
       
        freeClient(slave); 
       
        } 
       
        } 
       
        } 
       
        }

五、部分同步

如上所说，无论如何，Redis 首先会尝试部分同步。部分同步即把积压空间缓存的数据，即更新记录发送给从机。

从机连接主机后，会主动发起 PSYNC 命令，从机会提供 master_runid 和 offset，主机验证 master_runid 和 offset 是否有效？

验证通过则，进行部分同步：主机返回 +CONTINUE（从机接收后会注册积压数据接收事件），接着发送积压空间数据。

 
        // 连接主机 connectWithMaster() 的时候，会被注册为回调函数 
       
        void  
        syncWithMaster(aeEventLoop *el,  
        int  
        fd,  
        void  
        *privdata,  
        int  
        mask) { 
       
        char  
        tmpfile 
        [256], *err; 
       
        int  
        dfd, maxtries = 5; 
       
        int  
        sockerr = 0, psync_result; 
       
        socklen_t errlen =  
        sizeof 
        (sockerr); 
       
        ...... 
       
        // 尝试部分同步，主机允许进行部分同步会返回 +CONTINUE，从机接收后注册相应的事件 
       
        /* Try a partial resynchonization. If we don't have a cached master 
       
        * slaveTryPartialResynchronization() will at least try to use PSYNC 
       
        * to start a full resynchronization so that we get the master run id 
       
        * and the global offset, to try a partial resync at the next 
       
        * reconnection attempt. */ 
       
        // 函数返回三种状态： 
       
        // PSYNC_CONTINUE：表示会进行部分同步，在 slaveTryPartialResynchronization() 
       
        // 中已经设置回调函数 readQueryFromClient() 
       
        // PSYNC_FULLRESYNC：全同步，会下载 RDB 文件 
       
        // PSYNC_NOT_SUPPORTED：未知 
       
        psync_result = slaveTryPartialResynchronization(fd); 
       
        if  
        (psync_result == PSYNC_CONTINUE) { 
       
        redisLog(REDIS_NOTICE, 
       
        "MASTER <-> SLAVE sync: Master accepted a Partial Resynchronization." 
        ); 
       
        return 
        ; 
       
        } 
       
        // 执行全同步 
       
        ...... 
       
        } 
       
        // 函数返回三种状态： 
       
        // PSYNC_CONTINUE：表示会进行部分同步，已经设置回调函数 
       
        // PSYNC_FULLRESYNC：全同步，会下载 RDB 文件 
       
        // PSYNC_NOT_SUPPORTED：未知 
       
        #define PSYNC_CONTINUE 0 
       
        #define PSYNC_FULLRESYNC 1 
       
        #define PSYNC_NOT_SUPPORTED 2 
       
        int  
        slaveTryPartialResynchronization( 
        int  
        fd) { 
       
        char  
        *psync_runid; 
       
        char  
        psync_offset[32]; 
       
        sds reply; 
       
        /* Initially set repl_master_initial_offset to -1 to mark the current 
       
        * master run_id and offset as not valid. Later if we'll be able to do 
       
        * a FULL resync using the PSYNC command we'll set the offset at the 
       
        * right value, so that this information will be propagated to the 
       
        * client structure representing the master into server.master. */ 
       
        server.repl_master_initial_offset = -1; 
       
        if  
        (server.cached_master) { 
       
        // 缓存了上一次与主机连接的信息，可以尝试进行部分同步，减少数据传输 
       
        psync_runid = server.cached_master->replrunid; 
       
        snprintf(psync_offset,  
        sizeof 
        (psync_offset),  
        "%lld" 
        , 
       
        server.cached_master->reploff + 1); 
       
        redisLog(REDIS_NOTICE, 
       
        "Trying a partial resynchronization (request %s:%s)." 
        , 
       
        psync_runid, psync_offset); 
       
        }  
        else  
        { 
       
        // 未缓存上一次与主机连接的信息，进行全同步 
       
        // psync ? -1 可以获取主机的 master_runid 
       
        redisLog(REDIS_NOTICE,  
        "Partial resynchronization not possible (no cached master)" 
        ); 
       
        psync_runid =  
        "?" 
        ; 
       
        memcpy 
        (psync_offset,  
        "-1" 
        , 3); 
       
        } 
       
        // 向主机发送命令，并接收回复 
       
        /* Issue the PSYNC command */ 
       
        reply = sendSynchronousCommand(fd,  
        "PSYNC" 
        , psync_runid, psync_offset, NULL); 
       
        // 全同步 
       
        if  
        (! 
        strncmp 
        (reply,  
        "+FULLRESYNC" 
        , 11)) { 
       
        char  
        *runid = NULL, *offset = NULL; 
       
        /* FULL RESYNC, parse the reply in order to extract the run id 
       
        * and the replication offset. */ 
       
        runid =  
        strchr 
        (reply,  
        ' ' 
        ); 
       
        if  
        (runid) { 
       
        runid++; 
       
        offset =  
        strchr 
        (runid,  
        ' ' 
        ); 
       
        if  
        (offset) offset++; 
       
        } 
       
        if  
        (!runid || !offset || (offset-runid-1) != REDIS_RUN_ID_SIZE) { 
       
        redisLog(REDIS_WARNING, 
       
        "Master replied with wrong +FULLRESYNC syntax." 
        ); 
       
        /* This is an unexpected condition, actually the +FULLRESYNC 
       
        * reply means that the master supports PSYNC, but the reply 
       
        * format seems wrong. To stay safe we blank the master 
       
        * runid to make sure next PSYNCs will fail. */ 
       
        memset 
        (server.repl_master_runid, 0, REDIS_RUN_ID_SIZE + 1); 
       
        }  
        else  
        { 
       
        // 拷贝 runid 
       
        memcpy 
        (server.repl_master_runid, runid, offset-runid-1); 
       
        server.repl_master_runid[REDIS_RUN_ID_SIZE] =  
        '\0' 
        ; 
       
        server.repl_master_initial_offset = strtoll(offset,NULL,10); 
       
        redisLog(REDIS_NOTICE,  
        "Full resync from master: %s:%lld" 
        , 
       
        server.repl_master_runid, 
       
        server.repl_master_initial_offset); 
       
        } 
       
        /* We are going to full resync, discard the cached master structure. */ 
       
        replicationDiscardCachedMaster(); 
       
        sdsfree(reply); 
       
        return  
        PSYNC_FULLRESYNC; 
       
        } 
       
        // 部分同步 
       
        if  
        (! 
        strncmp 
        (reply,  
        "+CONTINUE" 
        , 9)) { 
       
        /* Partial resync was accepted, set the replication state accordingly */ 
       
        redisLog(REDIS_NOTICE,  
        "Successful partial resynchronization with master." 
        ); 
       
        sdsfree(reply); 
       
        // 缓存主机替代现有主机，且为 PSYNC（部分同步） 做好准备c 
       
        replicationResurrectCachedMaster(fd); 
       
        return  
        PSYNC_CONTINUE; 
       
        } 
       
        /* If we reach this point we receied either an error since the master does 
       
        * not understand PSYNC, or an unexpected reply from the master. 
       
        * Reply with PSYNC_NOT_SUPPORTED in both cases. */ 
       
        // 接收到主机发出的错误信息 
       
        if  
        ( 
        strncmp 
        (reply,  
        "-ERR" 
        , 4)) { 
       
        /* If it's not an error, log the unexpected event. */ 
       
        redisLog(REDIS_WARNING,  
        "Unexpected reply to PSYNC from master: %s" 
        , reply); 
       
        }  
        else  
        { 
       
        redisLog(REDIS_NOTICE, 
       
        "Master does not support PSYNC or is in " 
       
        "error state (reply: %s)" 
        , reply); 
       
        } 
       
        sdsfree(reply); 
       
        replicationDiscardCachedMaster(); 
       
        return  
        PSYNC_NOT_SUPPORTED; 
       
        } 
       
        // 主机 SYNC 和 PSYNC 命令处理函数，会尝试进行部分同步和全同步 
       
        /* SYNC ad PSYNC command implemenation. */ 
       
        void  
        syncCommand(redisClient *c) { 
       
        ...... 
       
        // 主机尝试部分同步，允许则进行部分同步，会返回 +CONTINUE，接着发送积压空间 
       
        /* Try a partial resynchronization if this is a PSYNC command. 
       
        * If it fails, we continue with usual full resynchronization, however 
       
        * when this happens masterTryPartialResynchronization() already 
       
        * replied with: 
       
        * 
       
        * +FULLRESYNC <runid> <offset> 
       
        * 
       
        * So the slave knows the new runid and offset to try a PSYNC later 
       
        * if the connection with the master is lost. */ 
       
        if  
        (!strcasecmp(c->argv[0]->ptr,  
        "psync" 
        )) { 
       
        // 部分同步 
       
        if  
        (masterTryPartialResynchronization(c) == REDIS_OK) { 
       
        server.stat_sync_partial_ok++; 
       
        return 
        ;  
        /* No full resync needed, return. */ 
       
        }  
        else  
        { 
       
        // 部分同步失败，会进行全同步，这时会收到来自客户端的 runid 
       
        char  
        *master_runid = c->argv[1]->ptr; 
       
        /* Increment stats for failed PSYNCs, but only if the 
       
        * runid is not "?", as this is used by slaves to force a full 
       
        * resync on purpose when they are not albe to partially 
       
        * resync. */ 
       
        if  
        (master_runid[0] !=  
        '?' 
        ) 
       
        server.stat_sync_partial_err++; 
       
        } 
       
        }  
        else  
        { 
       
        /* If a slave uses SYNC, we are dealing with an old implementation 
       
        * of the replication protocol (like redis-cli --slave). Flag the client 
       
        * so that we don't expect to receive REPLCONF ACK feedbacks. */ 
       
        c->flags |= REDIS_PRE_PSYNC_SLAVE; 
       
        } 
       
        // 执行全同步： 
       
        ...... 
       
        } 
       
        // 主机尝试是否能进行部分同步 
       
        /* This function handles the PSYNC command from the point of view of a 
       
        * master receiving a request for partial resynchronization. 
       
        * 
       
        * On success return REDIS_OK, otherwise REDIS_ERR is returned and we proceed 
       
        * with the usual full resync. */ 
       
        int  
        masterTryPartialResynchronization(redisClient *c) { 
       
        long  
        long  
        psync_offset, psync_len; 
       
        char  
        *master_runid = c->argv[1]->ptr; 
       
        char  
        buf[128]; 
       
        int  
        buflen; 
       
        /* Is the runid of this master the same advertised by the wannabe slave 
       
        * via PSYNC? If runid changed this master is a different instance and 
       
        * there is no way to continue. */ 
       
        if  
        (strcasecmp(master_runid, server.runid)) { 
       
        // 当因为异常需要与主机断开连接的时候，从机会暂存主机的状态信息，以便 
       
        // 下一次的部分同步。 
       
        // 1）master_runid 是从机提供一个因缓存主机的 runid， 
       
        // 2）server.runid 是本机（主机）的 runid。 
       
        // 匹配失败，说明是本机（主机）不是从机缓存的主机，这时候不能进行部分同步， 
       
        // 只能进行全同步 
       
        // "?" 表示从机要求全同步 
       
        // 什么时候从机会要求全同步？？？ 
       
        /* Run id "?" is used by slaves that want to force a full resync. */ 
       
        if  
        (master_runid[0] !=  
        '?' 
        ) { 
       
        redisLog(REDIS_NOTICE, 
        "Partial resynchronization not accepted: " 
       
        "Runid mismatch (Client asked for '%s', I'm '%s')" 
        , 
       
        master_runid, server.runid); 
       
        }  
        else  
        { 
       
        redisLog(REDIS_NOTICE,  
        "Full resync requested by slave." 
        ); 
       
        } 
       
        goto  
        need_full_resync; 
       
        } 
       
        // 从参数中解析整数，整数是从机指定的偏移量 
       
        /* We still have the data our slave is asking for? */ 
       
        if  
        (getLongLongFromObjectOrReply(c, c->argv[2], &psync_offset, NULL) != REDIS_OK) 
       
        goto  
        need_full_resync; 
       
        // 部分同步失败的情况 
       
        if  
        (!server.repl_backlog ||  
        /*不存在积压空间*/ 
       
        psync_offset < server.repl_backlog_off ||   
        /*psync_offset 太过小， 
       
        即从机错过太多更新记录， 
       
        安全起见，实行全同步*/ 
       
        /*psync_offset 越界*/ 
       
        psync_offset > (server.repl_backlog_off + server.repl_backlog_histlen)) 
       
        // 经检测，不满足部分同步的条件，转而进行全同步 
       
        { 
       
        redisLog(REDIS_NOTICE, 
       
        "Unable to partial resync with the slave  
        for  
        lack of 
       
        backlog (Slave request was: %lld).", psync_offset); 
       
        if  
        (psync_offset > server.master_repl_offset) { 
       
        redisLog(REDIS_WARNING, 
       
        "Warning: slave tried to PSYNC with an offset that is greater 
       
        than the master replication offset."); 
       
        } 
       
        goto  
        need_full_resync; 
       
        } 
       
        // 执行部分同步： 
       
        // 1）标记客户端为从机 
       
        // 2）通知从机准备接收数据。从机收到 +CONTINUE 会做好准备 
       
        // 3）开发发送数据 
       
        /* If we reached this point, we are able to perform a partial resync: 
       
        * 1) Set client state to make it a slave. 
       
        * 2) Inform the client we can continue with +CONTINUE 
       
        * 3) Send the backlog data (from the offset to the end) to the slave. */ 
       
        // 将连接的客户端标记为从机 
       
        c->flags |= REDIS_SLAVE; 
       
        // 表示进行部分同步 
       
        // #define REDIS_REPL_ONLINE 9 /* RDB file transmitted, sending just 
       
        // updates. */ 
       
        c->replstate = REDIS_REPL_ONLINE; 
       
        // 更新 ack 的时间 
       
        c->repl_ack_time = server.unixtime; 
       
        // 添加入从机链表 
       
        listAddNodeTail(server.slaves, c); 
       
        // 告诉从机可以进行部分同步，从机收到后会做相关的准备（注册回调函数） 
       
        /* We can't use the connection buffers since they are used to accumulate 
       
        * new commands at this stage. But we are sure the socket send buffer is 
       
        * emtpy so this write will never fail actually. */ 
       
        buflen = snprintf(buf,  
        sizeof 
        (buf),  
        "+CONTINUE\r\n" 
        ); 
       
        if  
        (write(c->fd, buf, buflen) != buflen) { 
       
        freeClientAsync(c); 
       
        return  
        REDIS_OK; 
       
        } 
       
        // 向从机写积压空间中的数据，积压空间存储有「更新缓存」 
       
        psync_len = addReplyReplicationBacklog(c, psync_offset); 
       
        redisLog(REDIS_NOTICE, 
       
        "Partial resynchronization request accepted. Sending %lld bytes of backlog 
       
        starting from offset %lld.", psync_len, psync_offset); 
       
        /* Note that we don't need to set the selected DB at server.slaveseldb 
       
        * to -1 to force the master to emit SELECT, since the slave already 
       
        * has this state from the previous connection with the master. */ 
       
        refreshGoodSlavesCount(); 
       
        return  
        REDIS_OK;  
        /* The caller can return, no full resync needed. */ 
       
        need_full_resync: 
       
        ...... 
       
        // 向从机发送 +FULLRESYNC runid repl_offset 
       
        }

六、暂缓主机

从机因为某些原因，譬如网络延迟（PING 超时，ACK 超时等），可能会断开与主机的连接。这时候，从机会尝试保存与主机连接的信息，譬如全局积压空间数据偏移量等，以便下一次的部分同步，并且从机会再一次尝试连接主机。注意一点，如果断开的时间足够长，部分同步肯定会失败的。

 
        void  
        freeClient(redisClient *c) { 
       
        listNode *ln; 
       
        /* If this is marked as current client unset it */ 
       
        if  
        (server.current_client == c) server.current_client = NULL; 
       
        // 如果此机为从机，已经连接主机，可能需要保存主机状态信息，以便进行 PSYNC 
       
        /* If it is our master that's beging disconnected we should make sure 
       
        * to cache the state to try a partial resynchronization later. 
       
        * 
       
        * Note that before doing this we make sure that the client is not in 
       
        * some unexpected state, by checking its flags. */ 
       
        if  
        (server.master && c->flags & REDIS_MASTER) { 
       
        redisLog(REDIS_WARNING, 
        "Connection with master lost." 
        ); 
       
        if  
        (!(c->flags & (REDIS_CLOSE_AFTER_REPLY| 
       
        REDIS_CLOSE_ASAP| 
       
        REDIS_BLOCKED| 
       
        REDIS_UNBLOCKED))) 
       
        { 
       
        replicationCacheMaster(c); 
       
        return 
        ; 
       
        } 
       
        } 
       
        ...... 
       
        } 
       
        // 为了实现部分同步，从机会保存主机的状态信息后才会断开主机的连接，主机状态信息 
       
        // 保存在 server.cached_master 
       
        // 会在 freeClient() 中调用，保存与主机连接的状态信息，以便进行 PSYNC 
       
        void  
        replicationCacheMaster(redisClient *c) { 
       
        listNode *ln; 
       
        redisAssert(server.master != NULL && server.cached_master == NULL); 
       
        redisLog(REDIS_NOTICE, 
        "Caching the disconnected master state." 
        ); 
       
        // 从客户端列表删除主机的信息 
       
        /* Remove from the list of clients, we don't want this client to be 
       
        * listed by CLIENT LIST or processed in any way by batch operations. */ 
       
        ln = listSearchKey(server.clients,c); 
       
        redisAssert(ln != NULL); 
       
        listDelNode(server.clients,ln); 
       
        // 保存主机的状态信息 
       
        /* Save the master. Server.master will be set to null later by 
       
        * replicationHandleMasterDisconnection(). */ 
       
        server.cached_master = server.master; 
       
        // 注销事件，关闭连接 
       
        /* Remove the event handlers and close the socket. We'll later reuse 
       
        * the socket of the new connection with the master during PSYNC. */ 
       
        aeDeleteFileEvent(server.el,c->fd,AE_READABLE); 
       
        aeDeleteFileEvent(server.el,c->fd,AE_WRITABLE); 
       
        close(c->fd); 
       
        /* Set fd to -1 so that we can safely call freeClient(c) later. */ 
       
        c->fd = -1; 
       
        // 修改连接的状态，设置 server.master = NULL 
       
        /* Caching the master happens instead of the actual freeClient() call, 
       
        * so make sure to adjust the replication state. This function will 
       
        * also set server.master to NULL. */ 
       
        replicationHandleMasterDisconnection(); 
       
        }

七、总结

简单来说，主从同步就是 RDB 文件的上传下载；主机有小部分的数据修改，就把修改记录传播给每个从机。这篇文章详述了 Redis 主从复制的内部协议和机制。接下来的几篇关于 Redis 的文章，主要是其内部数据结构。

(转)深入剖析Redis主从复制

猜你喜欢