Using Chef to erect a MongoDB data center

This is a detailed example of individual Chef clients, each set up for a different purpose in a MongoDB data center. The table below accompanies Transcription of Notes on Setting Up MongoDB and Chef on a VM in a Data Center.

Colors indicate tightly coupled relationships. The set-ups below cover the nodes listed under "Featured in table below". "Rolling up" refers to a second step that invokes the MongoDB shell to finalize set-up. The basic set-up bounces the MongoDB service (binary) so that it is running in its fully configured state, which must happen before any deeper MongoDB configuration. Nodes that are implied but not treated explicitly are listed under "Implicit/suggested".

Featured in table below:

1. db01 --single, replica node
2. db03 --replica node that rolls up a replica set
3. db07 --configuration server
4. db10 --sharding router that rolls itself up
5. db12 --arbiter that rolls itself up

Implicit/suggested:

db02 --single, replica node
db04 --single, replica node
db05 --single, replica node
db06 --replica node that rolls up replica set 2
db08 --second configuration server
db09 --third configuration server
db11 --second shard that rolls up shard 2
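
The two-step ordering described above (basic set-up first, rollup second) can be checked mechanically against a node's run list. The following is only a sketch in plain Ruby: the `ROLLUP_AFTER` table and the `rollup_order_ok?` helper are hypothetical names introduced for illustration and are not part of Chef or of the cookbook shown here.

```ruby
# Hypothetical table: each rollup recipe mapped to the base recipe that
# must already have configured and bounced the MongoDB service.
ROLLUP_AFTER = {
  "recipe[mongodb::replicaset]"  => "recipe[mongodb::replica]",
  "recipe[mongodb::add-shard]"   => "recipe[mongodb::sharding]",
  "recipe[mongodb::add-arbiter]" => "recipe[mongodb::arbiter]"
}

# Returns true when every rollup recipe present in the run list appears
# after its base recipe.
def rollup_order_ok?( run_list )
  ROLLUP_AFTER.all? do |rollup, base|
    rollup_at = run_list.index( rollup )
    next true if rollup_at.nil?          # this node has no such rollup step
    base_at = run_list.index( base )
    !base_at.nil? && rollup_at > base_at
  end
end

db03_run_list = [
  "recipe[apt]",
  "recipe[mongodb]",
  "recipe[mongodb::replica]",
  "recipe[mongodb::replicaset]"
]
puts rollup_order_ok?( db03_run_list )
```

A run list with no rollup recipe at all passes the check, which matches the simple replica nodes such as db01.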

1

Node

db01.json:
{
    "normal" : { "port" : 27017 },
    "name": "db01",
    "override": { },
    "default": { },
    "json_class": "Chef::Node",
    "automatic": { },
    "run_list":
    [
        "recipe[apt]",
        "recipe[mongodb]",
        "recipe[mongodb::replica]",
        "role[install-database-node]",
        "role[install-replica-node]"
    ],
    "chef_type": "node"
}
		

Recipe

replica.rb:
# Copy the upstart configuration file to /etc/init...
template "/etc/init/mongodb.conf" do
  source "replica-upstart.conf.erb"
  owner "root"
  group "root"
  mode 00644           # -rw-r--r--
end

template "/data/mongodb/mongodb.conf" do
  source "replica.conf.erb"
  owner "mongodb"
  group "mongodb"
  mode 00644           # -rw-r--r--
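  notifies :restart, "service[mongodb]", :immediately
end

# Make sure the service is enabled and running so that replicaset.rb can
# do its magic. The service name is assumed to be "mongodb", matching the
# upstart job file /etc/init/mongodb.conf.
service "mongodb" do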
  provider Chef::Provider::Service::Upstart
  action [ :enable, :start ]
end
		

Data bag

(none for simple replica nodes)

Role

rollups/install-replica-node.rb:
name "install-replica-node"
description "Role for managing MongoDB database nodes"

for_database_servers = %w{
	recipe[apt]
	recipe[mongodb]
	recipe[mongodb::replica]
}

run_list for_database_servers
		
2

Node

db03.json:
{
    "normal": { "rollup" : "replicaset-1" },
    "name": "db03",
    "override": { },
    "default": { },
    "json_class": "Chef::Node",
    "automatic": { },
    "run_list":
    [
        "recipe[apt]",
        "recipe[mongodb]",
        "recipe[mongodb::replica]",
        "recipe[mongodb::replicaset]",
        "role[install-database-node]",
        "role[install-replica-node]",
        "role[config-replicaset]"
    ],
    "chef_type": "node"
}
		

Recipe

Use replica.rb, plus...

replicaset.rb:
which = node[ :rollup ]           # i.e.: "replicaset-1"
bag   = data_bag_item( "rollups", which )

# (see this code in the appendix)
load 'get-config-command.rb'
configuration = get_configuration_command( bag )

# Create and initiate the replica set. This works because the MongoDB
# service was bounced just before we were invoked. Both statements run in
# a single shell invocation because the "config" variable would not
# survive into a second one.
execute "initiate-replicaset" do
  command "mongo --eval '#{configuration}; rs.initiate( config )'"
  retries 6
  retry_delay 10
end
		

Data bag

rollups/replicaset-1.json:
{
    "id"          : "replicaset-1",
    "name"        : "rs-1",
    "description" : "Shard 1 replica set",
    "replica_1" :
    {
      "hostname" : "16.86.193.100",
      "port" : 37017,
      "node" : "db01"
    },
    "replica_2" :
    {
      "hostname" : "16.86.193.101",
      "port" : 37018,
      "node" : "db02"
    },
    "replica_3" :
    {
      "hostname" : "16.86.193.102",
      "port" : 37019,
      "node" : "db03"
    }
}
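
To see what this data bag turns into, here is a minimal sketch in plain Ruby, outside Chef, that parses the same JSON and builds the `config = ...` statement for the MongoDB shell. It uses the standard `json` library rather than the appendix helper; the inline `bag_json` string stands in for the data bag item, and the variable names are illustrative assumptions.

```ruby
require "json"

# Inline copy of the data bag item; inside Chef it would come from the
# Chef server instead.
bag_json = %q(
  {
    "id"        : "replicaset-1",
    "name"      : "rs-1",
    "replica_1" : { "hostname" : "16.86.193.100", "port" : 37017, "node" : "db01" },
    "replica_2" : { "hostname" : "16.86.193.101", "port" : 37018, "node" : "db02" },
    "replica_3" : { "hostname" : "16.86.193.102", "port" : 37019, "node" : "db03" }
  }
)

# symbolize_names gives symbol keys, the style the recipes use.
bag = JSON.parse( bag_json, symbolize_names: true )

# Pick out replica_1, replica_2, ... and order them numerically.
replica_keys = bag.keys.select { |key| key.to_s.start_with?( "replica_" ) }
replica_keys = replica_keys.sort_by { |key| key.to_s.split( "_" ).last.to_i }

# Member ids are numbered from 0 in bag order.
members = replica_keys.each_with_index.map do |key, index|
  replica = bag[ key ]
  { _id: index, host: "#{replica[ :hostname ]}:#{replica[ :port ]}" }
end

config          = { _id: bag[ :name ], members: members }
shell_statement = "config = #{JSON.generate( config )}"
puts shell_statement
```

Generating the object with `JSON.generate` sidesteps hand-built quoting, since JSON is also a valid JavaScript object literal for the shell.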
		

Role

config-replicaset.rb:
name "config-replicaset"
description "Role for rolling up MongoDB replica sets"

for_database_servers = %w{
	recipe[apt]
	recipe[mongodb]
	recipe[mongodb::replica]
	recipe[mongodb::replicaset]
}

run_list for_database_servers
		
3

Node

db07.json:
{
    "normal": { "rollup" : "configsvr_1" },
    "name": "db07",
    "override": { },
    "default": { },
    "json_class": "Chef::Node",
    "automatic": { },
    "run_list":
    [
        "recipe[apt]",
        "recipe[mongodb]",
        "recipe[mongodb::configsvr]",
        "role[install-database-node]",
        "role[install-configsvr]"
    ],
    "chef_type": "node"
}
		

Recipe

configsvr.rb:
# Copy the upstart configuration file to /etc/init.
cookbook_file "/etc/init/mongodb-configsvr.conf" do
  source "mongodb-configsvr.conf"
  owner "root"
  group "root"
  mode 00644           # -rw-r--r--
end

which = node[ :rollup ]           # i.e.: "configsvr_1"
bag   = data_bag_item( "rollups", "configsvr" )

# (see this code in the appendix)
load 'get-hostname-and-port.rb'
hostname_port = get_hostname_and_port( which, bag )
hostname      = hostname_port[ 0 ]
port          = hostname_port[ 1 ]

# Copy the configuration server's /etc/mongodb.conf-equivalent to
# /data/mongodb.
template "/data/mongodb/configsvr.conf" do
  source "configsvr.conf.erb"
  owner "mongodb"
  group "mongodb"
  mode 00644           # -rw-r--r--
  variables( {
    :hostname => hostname,
    :port     => port
    }
  )
end

directory "/data/mongodb/configsvr" do
  action :create
  owner "mongodb"
  group "mongodb"
  mode 00755           # -rwxr-xr-x
end

directory "/data/mongodb/configsvr/log" do
  action :create
  owner "mongodb"
  group "mongodb"
  mode 00755           # -rwxr-xr-x
end
		

Data bag

rollups/configsvr.json:
{
    "id"          : "configsvr",
    "description" : "Configuration server",
    "configsvr_1" :
    {
      "hostname" : "16.86.193.100",
      "port" : 37017,
      "node" : "db07"
    },
    "configsvr_2" :
    {
      "hostname" : "16.86.193.101",
      "port" : 37018,
      "node" : "db08"
    },
    "configsvr_3" :
    {
      "hostname" : "16.86.193.102",
      "port" : 37019,
      "node" : "db09"
    }
}
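
As an illustration of how this bag is consumed, the sketch below builds the comma-separated `hostname:port` list that the sharding router needs for its configuration servers. It is plain Ruby with an inline hash standing in for the data bag item; that the list ends up in the `configdb` setting of the sharding template is an assumption, since the template itself is not shown.

```ruby
# Inline stand-in for the "configsvr" data bag item (symbol keys, as the
# recipes use).
bag = {
  id:          "configsvr",
  configsvr_1: { hostname: "16.86.193.100", port: 37017, node: "db07" },
  configsvr_2: { hostname: "16.86.193.101", port: 37018, node: "db08" },
  configsvr_3: { hostname: "16.86.193.102", port: 37019, node: "db09" }
}

# Keep only the servers actually present in the bag.
servers = %i[ configsvr_1 configsvr_2 configsvr_3 ].map { |key| bag[ key ] }.compact

# Join them as hostname:port pairs separated by commas.
configdb = servers.map { |server| "#{server[ :hostname ]}:#{server[ :port ]}" }.join( "," )
puts configdb
```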
		

Role

install-configsvr.rb:
name "install-configsvr"
description "Role for erecting a MongoDB configuration server"

for_database_servers = %w{
	recipe[apt]
	recipe[mongodb]
	recipe[mongodb::configsvr]
}

run_list for_database_servers
		
4

Node

db10.json:
{
    "normal": { "rollup" : "shard-1" },
    "name": "db10",
    "override": { },
    "default": { },
    "json_class": "Chef::Node",
    "automatic": { },
    "run_list":
    [
        "recipe[apt]",
        "recipe[mongodb]",
        "recipe[mongodb::sharding]",
        "recipe[mongodb::add-shard]",
        "role[install-database-node]",
        "role[install-sharding-router]"
    ],
    "chef_type": "node"
}
		

Recipe

sharding.rb:
# Sets ownership and privileges on the sharding subdirectories
# which could be newly created.
directory "/data/mongodb/sharding" do
  action :create
  owner "mongodb"
  group "mongodb"
  mode 00755           # -rwxr-xr-x
end

directory "/data/mongodb/sharding/log" do
  action :create
  owner "mongodb"
  group "mongodb"
  mode 00755           # -rwxr-xr-x
end

cookbook_file "/etc/init/mongodb-sharding.conf" do
  source "mongodb.conf"
  owner "root"
  group "root"
  mode 00644           # -rw-r--r--
end

which_bag = node[ :rollup ]
bag = data_bag_item( "rollups", which_bag )

# Get port on which the sharding router will listen.
port = bag[ :port ]

# Get the bagname for the configsvr bag, then from it, get the
# configuration server list (hostnames and ports).
bagname = bag[ :configsvr_bag ]
bag     = data_bag_item( "rollups", bagname )

# (see this code in the appendix)
load 'get-config-server-list.rb'
configsvr_list = get_configuration_server_list( bag )


# Copy the sharding router /etc/mongodb.conf-equivalent to /data/mongodb.
template "/data/mongodb/mongodb.conf" do
  source "sharding.conf.erb"
  owner "mongodb"
  group "mongodb"
  mode 00644           # -rw-r--r--
  variables( {
    :port => port,
    :configsvr_list => configsvr_list
    }
  )
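  notifies :restart, "service[mongodb-sharding]", :immediately
end

# Make sure the service is enabled and running so that add-shard.rb can do
# its magic. The service name is assumed to be "mongodb-sharding", matching
# the upstart job file /etc/init/mongodb-sharding.conf.
service "mongodb-sharding" do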
  provider Chef::Provider::Service::Upstart
  action [ :enable, :start ]
end
		
add-shard.rb:
# Look up this shard's bag, then the replica set bag it points to.
shard_bag      = data_bag_item( "rollups", node[ :rollup ] )
replicaset_bag = data_bag_item( "rollups", shard_bag[ :replicaset_bag ] )

# (see this code in the appendix)
load 'get-add-shard-shell-command.rb'
shell_command = get_add_shard_shell_command( replicaset_bag )

# Add the shard through the mongos daemon (sharding router). This works
# because the MongoDB service was bounced just before we were invoked.
execute "add-shard" do
  command shell_command
  retries 6
  retry_delay 10
end
		

Data bag

rollups/shard-1.json:
{
    "id"             : "shard-1",
    "description"    : "Shard 1",
    "replicaset_bag" : "replicaset-1",
    "configsvr_bag"  : "configsvr"
}
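
The shard bag holds no hostnames itself; it only names other bags. A small sketch of that indirection, in plain Ruby with an in-memory hash standing in for the "rollups" data bag: `ROLLUPS` and `fetch_rollup` are hypothetical names used only for this illustration.

```ruby
# Hypothetical in-memory stand-in for the "rollups" data bag.
ROLLUPS = {
  "shard-1" => {
    id:             "shard-1",
    replicaset_bag: "replicaset-1",
    configsvr_bag:  "configsvr"
  },
  "replicaset-1" => {
    id:        "replicaset-1",
    name:      "rs-1",
    replica_1: { hostname: "16.86.193.100", port: 37017, node: "db01" }
  }
}

# Fetch one item by id, failing loudly when the reference dangles.
def fetch_rollup( id )
  ROLLUPS.fetch( id ) { raise KeyError, "no rollup item named #{id}" }
end

# Follow the reference from the shard bag to its replica set bag.
shard      = fetch_rollup( "shard-1" )
replicaset = fetch_rollup( shard[ :replicaset_bag ] )

# A shard is identified as replica-set-name/hostname:port of one member.
seed       = replicaset[ :replica_1 ]
shard_spec = "#{replicaset[ :name ]}/#{seed[ :hostname ]}:#{seed[ :port ]}"
puts shard_spec
```

Failing on a dangling reference at lookup time is a design choice for the sketch; it surfaces a misnamed bag before any shell command is composed.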
		

Role

install-sharding-router.rb:
name "install-sharding-router"
description "Role for erecting a MongoDB sharding router"

for_database_servers = %w{
	recipe[apt]
	recipe[mongodb]
	recipe[mongodb::sharding]
	recipe[mongodb::add-shard]
}

run_list for_database_servers
		
5

Node

db12.json:
{
    "normal": { "rollup" : "arbiter-1" },
    "name": "db12",
    "override": { },
    "default": { },
    "json_class": "Chef::Node",
    "automatic": { },
    "run_list":
    [
        "recipe[apt]",
        "recipe[mongodb]",
        "recipe[mongodb::arbiter]",
        "recipe[mongodb::add-arbiter]",
        "role[install-database-node]",
        "role[install-arbiter]"
    ],
    "chef_type": "node"
}
		

Recipe

arbiter.rb:
# Sets ownership and privileges on the arbiter subdirectories
# which could be newly created.
directory "/data/mongodb/arbiter" do
  action :create
  owner "mongodb"
  group "mongodb"
  mode 00755           # -rwxr-xr-x
end

directory "/data/mongodb/arbiter/log" do
  action :create
  owner "mongodb"
  group "mongodb"
  mode 00755           # -rwxr-xr-x
end

bagname      = node[ :rollup ]
bag          = data_bag_item( "rollups", bagname )
port         = bag[ :port ]
bag          = data_bag_item( "rollups", bag[ :replicaset_bag ] )
replSet_name = bag[ :name ]

# Copy the upstart configuration file to /etc/init. Copy the
# arbiter's /etc/mongodb.conf-equivalent to /data/mongodb.
cookbook_file "/etc/init/mongodb.conf" do
  source "mongodb.conf"
  owner "root"
  group "root"
  mode 00644           # -rw-r--r--
end

template "/data/mongodb/mongodb.conf" do
  source "arbiter.conf.erb"
  owner "mongodb"
  group "mongodb"
  mode 00644           # -rw-r--r--
  variables( {
    :port    => port,
    :replSet => replSet_name
    }
  )
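  notifies :restart, "service[mongodb]", :immediately
end

# Make sure the service is enabled and running so that add-arbiter.rb can
# do its magic. The service name is assumed to be "mongodb", matching the
# upstart job file /etc/init/mongodb.conf.
service "mongodb" do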
  provider Chef::Provider::Service::Upstart
  action [ :enable, :start ]
end
		
add-arbiter.rb:
bagname  = node[ :rollup ]
bag      = data_bag_item( "rollups", bagname )
hostname = bag[ :hostname ]
port     = bag[ :port ]

# --------------------------------------------------------------------
# Add arbiter to replica set. This works because the MongoDB service
# was bounced just before we were invoked.
execute "add-arbiter" do
  command "mongo --eval 'rs.addArb( \"#{hostname}:#{port}\" )'"
  retries 6
  retry_delay 10
end
		

Data bag

rollups/arbiter-1.json:
{
    "id"             : "arbiter-1",
    "description"    : "Shard 1 arbiter",
    "hostname"       : "16.86.192.99",
    "port"           : 27016,
    "replicaset_bag" : "replicaset-1"
}
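
Quoting is the fragile part of turning this bag into a shell command, because the MongoDB statement contains double quotes of its own. One way to avoid hand-escaping is Ruby's standard `shellwords` library, sketched here in plain Ruby outside Chef; the inline `arbiter` hash stands in for the data bag item.

```ruby
require "shellwords"

# Inline stand-in for the arbiter data bag item.
arbiter = { id: "arbiter-1", hostname: "16.86.192.99", port: 27016 }

# The statement the MongoDB shell should evaluate.
statement = %Q{rs.addArb( "#{arbiter[ :hostname ]}:#{arbiter[ :port ]}" )}

# Shellwords.escape makes the whole statement one shell argument.
command = "mongo --eval #{Shellwords.escape( statement )}"
puts command
```

Splitting the command again with `Shellwords.split` gives back exactly three words, the last being the original statement, which is a quick way to confirm the quoting held.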
		

Role

install-arbiter.rb:
name "install-arbiter"
description "Role for installing MongoDB arbiter node"

for_database_servers = %w{
	recipe[apt]
	recipe[mongodb]
	recipe[mongodb::arbiter]
	recipe[mongodb::add-arbiter]
}

run_list for_database_servers
		

Appendix: Extra code for above recipes

These are Ruby code resources loaded and used by the recipes.

get-config-command.rb:
# =====================================================================
# Derive the MongoDB shell command to erect the replica set. Here's our
# example; likely the hostnames will be IP addresses rather than human-
# friendly names as in this example.
#
# config =
# {
#    _id:"replicas", members:
#    [
#        { _id:0, host:"db01:37017" },
#        { _id:1, host:"db02:37018" },
#        { _id:2, host:"db03:37019" },
#        { _id:3, host:"db04:37020" }
#    ]
# }
def get_configuration_command( bag )
  replica_id = "_id : \"%s\"" % bag[ :name ]

  # We'll expect up to 4 replica nodes, though we'd really only expect 3. If
  # an even number, likely, there's going to be an arbiter, but we don't care
  # about that in here.
  replicas = [ :replica_1, :replica_2, :replica_3, :replica_4 ].map { |key| bag[ key ] }

  # Skip any replica missing from the bag and number the rest from 0.
  present = replicas.reject { |replica| replica.nil? || replica.empty? }
  members = present.each_with_index.map do |replica, id|
    "{ _id:%d, host:\"%s:%s\" }" % [ id, replica[ :hostname ], replica[ :port ] ]
  end

  # Build the configuration statement:
  configuration = "config = { " + replica_id + ", members: [" + members.join( ", " ) + "] }"

  return configuration
end

get-hostname-and-port.rb:
# =====================================================================
# Get hostname/IP address and port number for the configuration server.
# The caller passes in which configuration server in the bag to use.
# {
#     "id"          : "configsvr",
#     "description" : "configuration server list--hostnames and ports",
#     "configsvr_1" : { "hostname" : "16.86.193.100", "port" : 37017 },
#     "configsvr_2" : { "hostname" : "16.86.193.101", "port" : 37018 },
#     "configsvr_3" : { "hostname" : "16.86.193.102", "port" : 37019 }
# }
def get_hostname_and_port( which, bag )
  configsvr = bag[ which.to_sym ]
  hostname  = configsvr[ :hostname ]
  port      = configsvr[ :port ]

  return hostname, port
end

get-config-server-list.rb:
# =====================================================================
# Set up configuration server list for launching the sharding router.

# Gather the list of configuration servers from the known bag.
def get_configuration_server_list( bag )
  # Collect "hostname:port" for each configuration server that is fully
  # specified in the bag; one that is missing or only halfway established
  # becomes nil.
  servers = [ :configsvr_1, :configsvr_2, :configsvr_3 ].map do |key|
    configsvr = bag[ key ]
    next nil if configsvr.nil? || configsvr.empty?
    hostname = configsvr[ :hostname ]
    port     = configsvr[ :port ]
    next nil if hostname.nil? || hostname.empty? || port.nil?
    hostname + ":" + port.to_s
  end

  # We must have either one configuration server or three. Without the
  # first, there is nothing to return. TODO: tighten the logic and
  # error-handling here?
  if servers[ 0 ].nil?
    puts "Missing hostname or port number, we're screwed!"
    return nil
  end

  # We'll give all three a shot only if all three are known; otherwise
  # we'll ignore anything else halfway established and use just the first.
  return servers.compact.length == 3 ? servers.join( "," ) : servers[ 0 ]
end

get-add-shard-shell-command.rb:
# =====================================================================
# Derive the MongoDB shell command to add the shard.
def get_add_shard_shell_command( replicaset_bag )
  replSet_name = replicaset_bag[ :name ]

  # Get the name of the replica set that will belong to this shard, plus
  # one of its hostname:port tuples (the first replica found in the bag).
  keys    = [ :replica_1, :replica_2, :replica_3, :replica_4 ]
  replica = keys.map { |key| replicaset_bag[ key ] }.find { |r| !r.nil? && !r.empty? }
  raise "No replica found in bag for replica set #{replSet_name}" if replica.nil?

  hostname = replica[ :hostname ]
  port     = replica[ :port ]

  # Shards are added with sh.addShard() in the MongoDB shell.
  shell_command = "mongo --eval 'sh.addShard( \"#{replSet_name}/#{hostname}:#{port}\" )'"
  return shell_command
end