MongoDB operations: Replica Set

A replica set is a group of servers (mongod actually) that maintain the same data set, with one primary which takes client requests, and multiple secondaries that keep copies of the primary's data. If the primary crashes, secondaries can elect a new primary from amongst themselves.

Replication from primary to secondaries is asynchronous.

ref:
https://docs.mongodb.com/v3.6/replication/
https://www.safaribooksonline.com/library/view/mongodb-the-definitive/9781491954454/ch08.html
https://www.percona.com/blog/2018/10/10/mongodb-replica-set-scenarios-and-internals/

Concepts

  • Primary: A node that accepts writes and is the leader for voting. There can be only one primary.
  • Secondary: A node that replicates from the primary or another secondary and can be used for reads. There can be a max of 127.
  • Arbiter: The node does not hold data and only participates in the voting. Also, it cannot be elected as the primary.
    • In the event your node count is an even number, add one of these to break the tie. Never add one where it would make the count even.
  • Priority 0 node: The node cannot be selected as the primary. You might want to lower priority of some slow nodes.
    • Priority allows you to prefer specific nodes are primary.
  • Vote 0 node: The node does not participate in the voting.
    • In some cases, having more than eight nodes means additional nodes must not vote.
  • Hidden node: The hidden node must be a priority 0 node and is invisible to the driver which unable to take queries from clients.
  • Delayed node: The delayed node must be a hidden node, and its data lag behind the primary for some time.
  • Tags: Grants special ability to make queries directly to specific nodes. Useful for BI, geo-locality, and other advanced functions.

ref:
https://docs.mongodb.com/manual/core/replica-set-elections/
https://docs.mongodb.com/manual/core/replica-set-priority-0-member/
https://docs.mongodb.com/manual/core/replica-set-hidden-member/
https://docs.mongodb.com/manual/core/replica-set-delayed-member/

Common Architectures

ref:
https://docs.mongodb.com/v3.6/core/replica-set-architectures/
https://www.percona.com/blog/2018/03/22/the-anatomy-of-a-mongodb-replica-set/

Three-Node Replica Set: Primary, Secondary, Secondary

ref:
https://docs.mongodb.com/v3.6/tutorial/deploy-replica-set/
https://docs.mongodb.com/v3.6/tutorial/expand-replica-set/

If you are running MongoDB cluster on Kubernetes, PLEASE USE THE FULL DNS NAME (FQDN). DO NOT use something like pod-name.service-name.

$ mongo mongodb-rs0-0.mongodb-rs0.default.svc.cluster.local
> rs.initiate({
   _id : "rs0",
   members: [
      {_id: 0, host: "mongodb-rs0-0.mongodb-rs0.default.svc.cluster.local:27017"},
      {_id: 1, host: "mongodb-rs0-1.mongodb-rs0.default.svc.cluster.local:27017"},
      {_id: 2, host: "mongodb-rs0-2.mongodb-rs0.default.svc.cluster.local:27017"}
   ]
})
{
    "ok" : 1,
    "operationTime" : Timestamp(1531223087, 1),
    "$clusterTime" : {
        "clusterTime" : Timestamp(1531223087, 1),
        "signature" : {
            "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
            "keyId" : NumberLong(0)
        }
    }
}
rs0:PRIMARY> db.isMaster()

ref:
https://docs.mongodb.com/v3.6/reference/method/rs.initiate/

$ mongo mongodb-rs0-2.mongodb-rs0.default.svc.cluster.local
rs0:SECONDARY> rs.slaveOk()
rs0:SECONDARY> show dbs
rs0:SECONDARY> rs.conf()
{
    "_id" : "rs0",
    "version" : 1,
    "protocolVersion" : NumberLong(1),
    "members" : [
        {
            "_id" : 0,
            "host" : "mongodb-rs0-0.mongodb-rs0.default.svc.cluster.local:27017",
            "arbiterOnly" : false,
            "buildIndexes" : true,
            "hidden" : false,
            "priority" : 1,
            "tags" : {

            },
            "slaveDelay" : NumberLong(0),
            "votes" : 1
        },
        {
            "_id" : 1,
            "host" : "mongodb-rs0-1.mongodb-rs0.default.svc.cluster.local:27017",
            "arbiterOnly" : false,
            "buildIndexes" : true,
            "hidden" : false,
            "priority" : 1,
            "tags" : {

            },
            "slaveDelay" : NumberLong(0),
            "votes" : 1
        },
        {
            "_id" : 2,
            "host" : "mongodb-rs0-2.mongodb-rs0.default.svc.cluster.local:27017",
            "arbiterOnly" : false,
            "buildIndexes" : true,
            "hidden" : false,
            "priority" : 1,
            "tags" : {

            },
            "slaveDelay" : NumberLong(0),
            "votes" : 1
        }
    ],
    "settings" : {
        "chainingAllowed" : true,
        "heartbeatIntervalMillis" : 2000,
        "heartbeatTimeoutSecs" : 10,
        "electionTimeoutMillis" : 10000,
        "catchUpTimeoutMillis" : -1,
        "catchUpTakeoverDelayMillis" : 30000,
        "getLastErrorModes" : {

        },
        "getLastErrorDefaults" : {
            "w" : 1,
            "wtimeout" : 0
        },
        "replicaSetId" : ObjectId("5b449c2f9269bb1a807a8cdf")
    }
}
rs0:SECONDARY> rs.status()
{
    "set" : "rs0",
    "date" : ISODate("2018-07-10T11:47:48.474Z"),
    "myState" : 1,
    "term" : NumberLong(1),
    "heartbeatIntervalMillis" : NumberLong(2000),
    "optimes" : {
        "lastCommittedOpTime" : {
            "ts" : Timestamp(1531223260, 1),
            "t" : NumberLong(1)
        },
        "readConcernMajorityOpTime" : {
            "ts" : Timestamp(1531223260, 1),
            "t" : NumberLong(1)
        },
        "appliedOpTime" : {
            "ts" : Timestamp(1531223260, 1),
            "t" : NumberLong(1)
        },
        "durableOpTime" : {
            "ts" : Timestamp(1531223260, 1),
            "t" : NumberLong(1)
        }
    },
    "members" : [
        {
            "_id" : 0,
            "name" : "mongodb-rs0-0.mongodb-rs0.default.svc.cluster.local:27017",
            "health" : 1,
            "state" : 1,
            "stateStr" : "PRIMARY",
            "uptime" : 381,
            "optime" : {
                "ts" : Timestamp(1531223260, 1),
                "t" : NumberLong(1)
            },
            "optimeDate" : ISODate("2018-07-10T11:47:40Z"),
            "electionTime" : Timestamp(1531223098, 1),
            "electionDate" : ISODate("2018-07-10T11:44:58Z"),
            "configVersion" : 1,
            "self" : true
        },
        {
            "_id" : 1,
            "name" : "mongodb-rs0-1.mongodb-rs0.default.svc.cluster.local:27017",
            "health" : 1,
            "state" : 2,
            "stateStr" : "SECONDARY",
            "uptime" : 181,
            "optime" : {
                "ts" : Timestamp(1531223260, 1),
                "t" : NumberLong(1)
            },
            "optimeDurable" : {
                "ts" : Timestamp(1531223260, 1),
                "t" : NumberLong(1)
            },
            "optimeDate" : ISODate("2018-07-10T11:47:40Z"),
            "optimeDurableDate" : ISODate("2018-07-10T11:47:40Z"),
            "lastHeartbeat" : ISODate("2018-07-10T11:47:46.599Z"),
            "lastHeartbeatRecv" : ISODate("2018-07-10T11:47:47.332Z"),
            "pingMs" : NumberLong(0),
            "syncingTo" : "mongodb-rs0-0.mongodb-rs0.default.svc.cluster.local:27017",
            "configVersion" : 1
        },
        {
            "_id" : 2,
            "name" : "mongodb-rs0-2.mongodb-rs0.default.svc.cluster.local:27017",
            "health" : 1,
            "state" : 2,
            "stateStr" : "SECONDARY",
            "uptime" : 181,
            "optime" : {
                "ts" : Timestamp(1531223260, 1),
                "t" : NumberLong(1)
            },
            "optimeDurable" : {
                "ts" : Timestamp(1531223260, 1),
                "t" : NumberLong(1)
            },
            "optimeDate" : ISODate("2018-07-10T11:47:40Z"),
            "optimeDurableDate" : ISODate("2018-07-10T11:47:40Z"),
            "lastHeartbeat" : ISODate("2018-07-10T11:47:46.599Z"),
            "lastHeartbeatRecv" : ISODate("2018-07-10T11:47:47.283Z"),
            "pingMs" : NumberLong(0),
            "syncingTo" : "mongodb-rs0-0.mongodb-rs0.default.svc.cluster.local:27017",
            "configVersion" : 1
        }
    ],
    "ok" : 1,
    "operationTime" : Timestamp(1531223260, 1),
    "$clusterTime" : {
        "clusterTime" : Timestamp(1531223260, 1),
        "signature" : {
            "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
            "keyId" : NumberLong(0)
        }
    }
}

Three-Node Replica Set: Primary, Secondary, Arbiter

If your replica set has an even number of members, add an arbiter to obtain a majority of votes in an election for primary. Arbiters do not require dedicated hardware.

ref:
https://docs.mongodb.com/v3.6/tutorial/add-replica-set-arbiter/

Issues

Change Replica Set Name

  1. Start mongod without --replSet
  2. Run db.system.replset.remove({_id: 'oldReplicaSetName'}) in MongoDB Shell
  3. Start mongod with --replSet "newReplicaSetName"

ref:
https://stackoverflow.com/questions/33400607/how-do-i-rename-a-mongodb-replica-set

InvalidReplicaSetConfig: Our replica set configuration is invalid or does not include us

$ kubectl logs -f mongodb-rs0-0
REPL_HB [replexec-10] Error in heartbeat (requestId: 20048) to mongodb-rs0-2.mongodb-rs0:27017, response status: InvalidReplicaSetConfig: Our replica set configuration is invalid or does not include us
$ mongo mongodb-rs0-2.mongodb-rs0.default.svc.cluster.local
rs0:OTHER> rs.status()
{
    "state" : 10,
    "stateStr" : "REMOVED",
    "uptime" : 631,
    "optime" : {
        "ts" : Timestamp(1531224140, 1),
        "t" : NumberLong(1)
    },
    "optimeDate" : ISODate("2018-07-10T12:02:20Z"),
    "ok" : 0,
    "errmsg" : "Our replica set config is invalid or we are not a member of it",
    "code" : 93,
    "codeName" : "InvalidReplicaSetConfig",
    "operationTime" : Timestamp(1531224140, 1),
    "$clusterTime" : {
        "clusterTime" : Timestamp(1531224790, 1),
        "signature" : {
            "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
            "keyId" : NumberLong(0)
        }
    }
}

$ mongo mongodb-rs0-0.mongodb-rs0.default.svc.cluster.local
rs0:PRIMARY> rs.conf() 
{
    "_id" : "rs0",
    "version" : 9,
    "protocolVersion" : NumberLong(1),
    "members" : [
        {
            "_id" : 0,
            "host" : "mongodb-rs0-0.mongodb-rs0.default.svc.cluster.local:27017",
            "arbiterOnly" : false,
            "buildIndexes" : true,
            "hidden" : false,
            "priority" : 1,
            "tags" : {

            },
            "slaveDelay" : NumberLong(0),
            "votes" : 1
        },
        {
            "_id" : 1,
            "host" : "mongodb-rs0-1.mongodb-rs0.default.svc.cluster.local:27017",
            "arbiterOnly" : false,
            "buildIndexes" : true,
            "hidden" : false,
            "priority" : 1,
            "tags" : {

            },
            "slaveDelay" : NumberLong(0),
            "votes" : 1
        },
        {
            "_id" : 2,
            "host" : "mongodb-rs0-2.mongodb-rs0.default.svc.cluster.local:27017",
            "arbiterOnly" : false,
            "buildIndexes" : true,
            "hidden" : false,
            "priority" : 1,
            "tags" : {

            },
            "slaveDelay" : NumberLong(0),
            "votes" : 1
        }
    ],
    "settings" : {
        "chainingAllowed" : true,
        "heartbeatIntervalMillis" : 2000,
        "heartbeatTimeoutSecs" : 10,
        "electionTimeoutMillis" : 10000,
        "catchUpTimeoutMillis" : -1,
        "catchUpTakeoverDelayMillis" : 30000,
        "getLastErrorModes" : {

        },
        "getLastErrorDefaults" : {
            "w" : 1,
            "wtimeout" : 0
        },
        "replicaSetId" : ObjectId("5b449c2f9269bb1a807a8cdf")
    }
}

The faulty member's state is REMOVED (it was once in a replica set but was subsequently removed) and shows Our replica set config is invalid or we are not a member of it. In fact, the real issue is that the removed node is sill in the list of replica set members.

You could just manually remove the broken node from the replica set on the primary, restart the node, and re-add the node.

$ mongo mongodb-rs0-0.mongodb-rs0.default.svc.cluster.local
rs0:PRIMARY> rs.remove("mongodb-rs0-2.mongodb-rs0.default.svc.cluster.local:27017")

# restart the Pod
$ kubectl delete mongodb-rs0-2

$ mongo mongodb-rs0-0.mongodb-rs0.default.svc.cluster.local
rs0:PRIMARY> rs.add("mongodb-rs0-2.mongodb-rs0.default.svc.cluster.local:27017")

ref:
https://stackoverflow.com/questions/47439781/mongodb-replica-set-member-state-is-other
https://docs.mongodb.com/v3.6/tutorial/remove-replica-set-member/
https://docs.mongodb.com/manual/reference/replica-states/

db.isMaster(): Does not have a valid replica set config

rs0:OTHER> db.isMaster()
{
    "hosts" : [
        "mongodb-rs0-0.mongodb-rs0.default.svc.cluster.local:27017",
        "mongodb-rs0-1.mongodb-rs0.default.svc.cluster.local:27017",
        "mongodb-rs0-2.mongodb-rs0.default.svc.cluster.local27017"
    ],
    "setName" : "rs0",
    "ismaster" : false,
    "secondary" : false,
    "info" : "Does not have a valid replica set config",
    "isreplicaset" : true,
    "maxBsonObjectSize" : 16777216,
    "maxMessageSizeBytes" : 48000000,
    "maxWriteBatchSize" : 100000,
    "localTime" : ISODate("2018-07-10T14:34:48.640Z"),
    "logicalSessionTimeoutMinutes" : 30,
    "minWireVersion" : 0,
    "maxWireVersion" : 6,
    "readOnly" : false,
    "ok" : 1,
    "operationTime" : Timestamp(1531232610, 1),
    "$clusterTime" : {
        "clusterTime" : Timestamp(1531232610, 1),
        "signature" : {
            "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
            "keyId" : NumberLong(0)
        }
    }
}

You could just re-configure the replica set and only keep reachable members.

rs0:OTHER> oldConf = rs.conf()
rs0:OTHER> oldConf.members = [oldConf.members[0]]
rs0:OTHER> rs.reconfig(oldConf, {force: true})
rs0:PRIMARY> rs.add("mongodb-rs0-1.mongodb-rs0.default.svc.cluster.local:27017")
rs0:PRIMARY> rs.add("mongodb-rs0-2.mongodb-rs0.default.svc.cluster.local:27017")

ref:
https://docs.mongodb.com/v3.6/tutorial/reconfigure-replica-set-with-unavailable-members/

Change Replica Set Name

  1. Stop mongod
  2. Start mongod --bind_ip_all --port 27017 --dbpath /data/db without --replSet
  3. Remove the old Replica Set name
use admin
db.getCollection('system.version').remove({_id: 'shardIdentity'})

use local
db.getCollection('system.replset').remove({_id: 'rs0'})
  1. Start mongod --bind_ip_all --port 27017 --dbpath /data/db --shardsvr --replSet sh0

ref:
https://stackoverflow.com/questions/33400607/how-do-i-rename-a-mongodb-replica-set

Connect To A Replica Set Cluster

ref:
https://api.mongodb.com/python/current/examples/high_availability.html

Use Connection Pools

ref:
https://api.mongodb.com/python/current/faq.html#how-does-connection-pooling-work-in-pymongo