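The error reported by the MongoDB Node.js client is the result of attempting a write after the Pool instance has already called destroy() on itself. A Pool instance destroys itself for one of two reasons:
1) 30 consecutive reconnect attempts all fail within 30 seconds (the defaults are reconnectTries = 30 and reconnectInterval = 1000 ms)
2) Authentication fails during re-authentication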
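How to reproduce:
Start the MongoDB server as a single instance, then start a web service that embeds the mongo client. Run kill -9 on the mongod process and wait more than 30 seconds. Any request that the web service now sends to mongod fails with "server instance pool was destroyed". After mongod is restarted, the same error is still reported; the client does not recover on its own. In a production environment this is a serious problem.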
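Solutions:
- The best option is to replace the single instance with a Replica Set (primary/secondary). Even if the primary and secondary nodes go down at the same time, the client does not report the "server instance pool was destroyed" error, and as soon as the server is reachable again the client recovers automatically. (In an actual test, the primary and secondary nodes were down for 1 hour before being restored, and the client recovered automatically.)
- If you must use a single instance, set the reconnectTries parameter to a sufficiently large number. With the setting shown below and the default reconnectInterval of 1000 ms, the client connection pool can recover as long as the server becomes reachable again within 24 hours.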
"options" : {
    "reconnectTries": 86400
},
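As a rough check of the 24-hour figure, the retry window is approximately reconnectTries × reconnectInterval. The sketch below assumes the default values that appear in the source trace (30 tries, 1000 ms); `reconnectWindowSeconds` is a hypothetical helper written for illustration, not part of the driver, and it ignores the time each connection attempt itself takes.

```javascript
// Hypothetical helper (not a driver API): estimates how long the pool
// keeps retrying before it destroys itself.
function reconnectWindowSeconds(options) {
  // Fall back to the defaults quoted from pool.js when a value is absent.
  const tries = options.reconnectTries !== undefined ? options.reconnectTries : 30;
  const intervalMs = options.reconnectInterval !== undefined ? options.reconnectInterval : 1000;
  return (tries * intervalMs) / 1000;
}

console.log(reconnectWindowSeconds({}));                        // 30 (defaults)
console.log(reconnectWindowSeconds({ reconnectTries: 86400 })); // 86400, i.e. 24 hours
```

The estimate is a lower bound in practice, since a failed attempt may take longer than zero time before the next interval starts.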
Source code trace:
mongodb-core\lib\topologies\server.js
function basicWriteValidations(self) {
  if (!self.s.pool) return new MongoError('server instance is not connected');
  if (self.s.pool.isDestroyed()) return new MongoError('server instance pool was destroyed');
}
mongodb-core\lib\connection\pool.js
Pool.prototype.isDestroyed = function() {
  return this.state === DESTROYED || this.state === DESTROYING;
};
The function Pool.prototype.destroy is responsible for setting state to DESTROYING and DESTROYED:
Pool.prototype.destroy = function(force)
The first case in which destroy() is called: reconnection has failed 30 times (the default reconnectTries).
reconnect: true,
reconnectInterval: 1000,
reconnectTries: 30,
this.retriesLeft = this.options.reconnectTries;
function _connectionFailureHandler(self) {
  return function() {
    if (this._connectionFailHandled) return;
    this._connectionFailHandled = true;
    // Destroy the connection
    this.destroy();
    // Count down the number of reconnects
    self.retriesLeft = self.retriesLeft - 1;
    // How many retries are left
    if (self.retriesLeft <= 0) {
      // Destroy the instance
      self.destroy();
      // Emit close event
      self.emit(
        'reconnectFailed',
        new MongoNetworkError(
          f(
            'failed to reconnect after %s attempts with interval %s ms',
            self.options.reconnectTries,
            self.options.reconnectInterval
          )
        )
      );
    } else {
      self.reconnectId = setTimeout(attemptReconnect(self), self.options.reconnectInterval);
    }
  };
}
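The countdown above can be illustrated with a small stand-alone model. This is a simplified sketch, not the real mongodb-core Pool: `ToyPool`, `onConnectionFailure` and `validateWrite` are hypothetical names, and timers and sockets are left out so that only the state logic remains.

```javascript
// Simplified model of the retry countdown; NOT the real mongodb-core Pool.
const DISCONNECTED = 'disconnected';
const DESTROYED = 'destroyed';

class ToyPool {
  constructor(options) {
    // Defaults mirror the values quoted from pool.js.
    this.options = Object.assign({ reconnectTries: 30, reconnectInterval: 1000 }, options);
    this.retriesLeft = this.options.reconnectTries;
    this.state = DISCONNECTED;
  }
  isDestroyed() {
    return this.state === DESTROYED;
  }
  destroy() {
    this.state = DESTROYED;
  }
  // Called once per failed reconnect attempt.
  onConnectionFailure() {
    if (this.isDestroyed()) return;
    this.retriesLeft -= 1;
    // Once the budget is used up the pool is destroyed, and nothing
    // schedules another attempt: the state is terminal.
    if (this.retriesLeft <= 0) this.destroy();
  }
  // Mirrors the check in basicWriteValidations.
  validateWrite() {
    if (this.isDestroyed()) return new Error('server instance pool was destroyed');
    return undefined;
  }
}

const pool = new ToyPool();
for (let i = 0; i < 30; i++) pool.onConnectionFailure();
console.log(pool.isDestroyed());           // true
console.log(pool.validateWrite().message); // server instance pool was destroyed
```

Because no code path leaves the destroyed state, restarting the server afterwards changes nothing for the model, which matches the behaviour seen in the reproduction steps.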
The second case in which destroy() is called: authentication fails during re-authentication.
reauthenticate(self, connection, function(err) {
  if (self.state === DESTROYED || self.state === DESTROYING) return self.destroy();
  // We have an error emit it
  if (err) {
    // Destroy the pool
    self.destroy();
    // Emit the error
    return self.emit('error', err);
  }
  // Authenticate
  authenticate(self, args, connection, function(err) {
    if (self.state === DESTROYED || self.state === DESTROYING) return self.destroy();
    // We have an error emit it
    if (err) {
      // Destroy the pool
      self.destroy();
      // Emit the error
      return self.emit('error', err);
    }
    // Set connected mode
    stateTransition(self, CONNECTED);
    // Move the active connection
    moveConnectionBetween(connection, self.connectingConnections, self.availableConnections);
    // if we have a minPoolSize, create a connection
    if (self.minSize) {
      for (let i = 0; i < self.minSize; i++) _createConnection(self);
    }
    // Emit the connect event
    self.emit('connect', self);
  });
});
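The authentication path can be modelled in the same spirit. This is only a sketch: `ToyAuthPool` and the failing authenticator passed to `connect` are hypothetical stand-ins built on Node's EventEmitter, kept just detailed enough to show that a single authentication error both destroys the pool and surfaces as an 'error' event.

```javascript
const { EventEmitter } = require('events');

// Toy model of the handshake path; NOT the real mongodb-core Pool.
class ToyAuthPool extends EventEmitter {
  constructor() {
    super();
    this.state = 'connecting';
  }
  destroy() {
    this.state = 'destroyed';
  }
  isDestroyed() {
    return this.state === 'destroyed';
  }
  // authenticateFn is a hypothetical stand-in for the driver's authenticate step;
  // it receives a Node-style callback.
  connect(authenticateFn) {
    authenticateFn((err) => {
      if (err) {
        // Same order as the quoted source: destroy first, then emit the error.
        this.destroy();
        return this.emit('error', err);
      }
      this.state = 'connected';
      this.emit('connect', this);
    });
  }
}

const authPool = new ToyAuthPool();
const seenErrors = [];
authPool.on('error', (err) => seenErrors.push(err.message));
// Simulate a server that rejects the credentials.
authPool.connect((callback) => callback(new Error('Authentication failed.')));
console.log(authPool.isDestroyed()); // true
console.log(seenErrors);             // [ 'Authentication failed.' ]
```

An 'error' listener is registered before connecting because an EventEmitter throws when 'error' is emitted with no listener attached.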