3.2. Cluster Set Up
This section describes everything you need to know to prepare, install, and set up your first CouchDB 2.x cluster.
3.2.1. Ports and Firewalls
CouchDB uses the following ports:
Port Number | Protocol | Recommended binding | Usage |
---|---|---|---|
5984 | tcp | As desired, by default localhost | Standard clustered port for all HTTP API requests |
5986 | tcp | localhost or private network ONLY | Administrative tasks such as node and shard management |
4369 | tcp | All interfaces by default | Erlang port mapper daemon (epmd) |
Random above 1024 (see below) | tcp | Automatic | Communication with other CouchDB nodes in the cluster |
CouchDB in clustered mode uses the port 5984, just as in a standalone configuration, but it also uses 5986 for node-local APIs. These APIs are administrative tools only, such as node and shard management. Do not use port 5986 for any other reason. The port is slated to be deprecated in a future CouchDB release.
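As a quick local check that the two interfaces behave as described, you can query both ports on a running node (a minimal sketch; adjust the address if you have changed the default bindings):
- # Clustered API, the port you expose to applications:
- curl http://127.0.0.1:5984/
- # Node-local API, administrative use only; never expose it publicly:
- curl http://127.0.0.1:5986/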
Warning
Never expose the node-local port to the public Internet.
By default, CouchDB exposes port 5986 only on localhost. If you have a secondary network connection on nodes for management purposes only, it is acceptable to expose the port on that network as well.
CouchDB uses Erlang-native clustering functionality to achieve a clustered installation. Erlang uses TCP port 4369 (EPMD) to find other nodes, so all servers must be able to speak to each other on this port. In an Erlang cluster, all nodes are connected to all other nodes, in a mesh network configuration.
Warning
If you expose the port 4369 to the Internet or any other untrusted network, then the only thing protecting you is the Erlang cookie.
Every Erlang application running on that machine (such as CouchDB) then uses automatically assigned ports for communication with other nodes. Yes, this means random ports. This will obviously not work with a firewall, but it is possible to force an Erlang application to use a specific port range.
This documentation will use the range TCP 9100-9200, but this range is unnecessarily broad. If you only have a single Erlang application running on a machine, the range can be limited to a single port: 9100-9100, since the ports epmd assigns are for inbound connections only. Three CouchDB nodes running on a single machine, as in a development cluster scenario, would need three ports in this range.
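As an illustration, with a plain iptables firewall the rules might look like the following; the 192.168.0.0/24 subnet is a placeholder for whatever private network your cluster nodes share:
- # Allow clients to reach the clustered HTTP API:
- iptables -A INPUT -p tcp --dport 5984 -j ACCEPT
- # Allow only the cluster subnet to reach epmd and the Erlang distribution port range:
- iptables -A INPUT -s 192.168.0.0/24 -p tcp --dport 4369 -j ACCEPT
- iptables -A INPUT -s 192.168.0.0/24 -p tcp --dport 9100:9200 -j ACCEPT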
3.2.2. Configure and Test the Communication with Erlang
3.2.2.1. Make CouchDB use correct IP|FQDN and the open ports
In file etc/vm.args change the line -name couchdb@127.0.0.1 to -name couchdb@<reachable-ip-address|fully-qualified-domain-name> which defines the name of the node. Each node must have an identifier that allows remote systems to talk to it. The node name is of the form <name>@<reachable-ip-address|fully-qualified-domain-name>.
The name portion can be couchdb on all nodes, unless you are running more than 1 CouchDB node on the same server with the same IP address or domain name. In that case, we recommend names of couchdb1, couchdb2, etc.
The second portion of the node name must be an identifier by which other nodes can access this node – either the node’s fully qualified domain name (FQDN) or the node’s IP address. The FQDN is preferred so that you can renumber the node’s IP address without disruption to the cluster. (This is common in cloud-hosted environments.)
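For example, on a node whose (hypothetical) FQDN is couch1.example.com, the relevant line in etc/vm.args would read:
- -name couchdb@couch1.example.com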
Open etc/vm.args, on all nodes, and add -kernel inet_dist_listen_min 9100 and -kernel inet_dist_listen_max 9200 like below:
- -name ...
- -setcookie ...
- ...
- -kernel inet_dist_listen_min 9100
- -kernel inet_dist_listen_max 9200
Again, a small range is fine, down to a single port (set both to 9100) if you only ever run a single CouchDB node on each machine.
3.2.2.2. Confirming connectivity between nodes
For this test, you need 2 servers with working hostnames. Let us call them server1 and server2.
On server1:
- erl -name bus@192.168.0.1 -setcookie 'brumbrum' -kernel inet_dist_listen_min 9100 -kernel inet_dist_listen_max 9200
Then on server2:
- erl -name car@192.168.0.2 -setcookie 'brumbrum' -kernel inet_dist_listen_min 9100 -kernel inet_dist_listen_max 9200
An explanation of the commands:
- erl: the Erlang shell.
- -name bus@192.168.0.1: the name of the Erlang node and its IP address or FQDN.
- -setcookie 'brumbrum': the “password” used when nodes connect to each other.
- -kernel inet_dist_listen_min 9100: the lowest port in the range.
- -kernel inet_dist_listen_max 9200: the highest port in the range.
This gives us 2 Erlang shells. shell1 on server1, shell2 on server2. Time to connect them. Enter the following, being sure to end the line with a period (.):
In shell1:
- net_kernel:connect_node(car@server2).
This will connect to the node called car on the server called server2.
If that returns true, then you have an Erlang cluster, and the firewalls are open. This means that 2 CouchDB nodes on these two servers will be able to communicate with each other successfully. If you get false or nothing at all, then you have a problem with the firewall, DNS, or your settings. Try again.
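As an optional extra check, you can ask shell1 which nodes it is currently connected to; the other node’s name should appear in the returned list:
- nodes().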
If you’re concerned about firewall issues, or having trouble connecting all nodes of your cluster later on, repeat the above test between all pairs of servers to confirm that connectivity and system configuration are correct.
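If a connection fails, it can also help to confirm on each server which distribution ports epmd has actually registered, so you can verify they fall inside the range your firewall allows (an optional diagnostic; it assumes the epmd binary is on your PATH):
- # Lists registered Erlang node names and the TCP ports they listen on:
- epmd -names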
3.2.3. Preparing CouchDB nodes to be joined into a cluster
Before you can add nodes to form a cluster, you must have them listening on anIP address accessible from the other nodes in the cluster. You should also ensurethat a few critical settings are identical across all nodes before joining them.
The settings we recommend you set now, before joining the nodes into a cluster, are:

1. etc/vm.args settings as described in the previous two sections
2. At least one server administrator user (and password)
3. Bind the node’s clustered interface (port 5984) to a reachable IP address
4. A consistent UUID. The UUID is used in identifying the cluster when replicating. If this value is not consistent across all nodes in the cluster, replications may be forced to rewind the changes feed to zero, leading to excessive memory, CPU and network use.
5. A consistent httpd secret. The secret is used in calculating and evaluating cookie and proxy authentication, and should be set consistently to avoid unnecessary repeated session cookie requests.
If you use a configuration management tool, such as Chef, Ansible, Puppet, etc., then you can place these settings in a .ini file and distribute them to all nodes ahead of time. Be sure to pre-encrypt the password (cutting and pasting from a test instance is easiest) if you use this route to avoid CouchDB rewriting the file.
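As a rough illustration, such a .ini fragment might look like the following; the UUID, secret and admin values are placeholders you must replace, and the admin password must already be in CouchDB’s pre-hashed form (copied from a test instance):
- [couchdb]
- uuid = FIRST-UUID-GOES-HERE
- [chttpd]
- bind_address = 0.0.0.0
- [couch_httpd_auth]
- secret = SECOND-UUID-GOES-HERE
- [admins]
- admin = PRE-HASHED-PASSWORD-GOES-HERE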
If you do not use configuration management, or are just experimenting with CouchDB for the first time, use these commands once per server to perform steps 2-5 above. Be sure to change the password to something secure, and again, use the same password on all nodes. You may have to run these commands locally on each node; if so, replace <server-IP|FQDN> below with 127.0.0.1.
- # First, get two UUIDs to use later on. Be sure to use the SAME UUIDs on all nodes.
- curl http://<server-IP|FQDN>:5984/_uuids?count=2
- # CouchDB will respond with something like:
- # {"uuids":["60c9e8234dfba3e2fdab04bf92001142","60c9e8234dfba3e2fdab04bf92001cc2"]}
- # Copy the provided UUIDs into your clipboard or a text editor for later use.
- # Use the first UUID as the cluster UUID.
- # Use the second UUID as the cluster shared http secret.
- # Create the admin user and password:
- curl -X PUT http://<server-IP|FQDN>:5984/_node/_local/_config/admins/admin -d '"password"'
- # Now, bind the clustered interface to all IP addresses available on this machine
- curl -X PUT http://<server-IP|FQDN>:5984/_node/_local/_config/chttpd/bind_address -d '"0.0.0.0"'
- # Set the UUID of the node to the first UUID you previously obtained:
- curl -X PUT http://<server-IP|FQDN>:5984/_node/_local/_config/couchdb/uuid -d '"FIRST-UUID-GOES-HERE"'
- # Finally, set the shared http secret for cookie creation to the second UUID:
- curl -X PUT http://<server-IP|FQDN>:5984/_node/_local/_config/couch_httpd_auth/secret -d '"SECOND-UUID-GOES-HERE"'
3.2.4. The Cluster Setup Wizard
CouchDB 2.x comes with a convenient Cluster Setup Wizard as part of the Fauxton web administration interface. For first-time cluster setup, and for experimentation, this is your best option.
It is strongly recommended that the minimum number of nodes in a cluster is 3. For more explanation, see the Cluster Theory section of this documentation.
After installation and initial start-up of all nodes in your cluster, ensuring all nodes are reachable and the pre-configuration steps listed above have been completed, visit Fauxton at http://<server1>:5984/_utils#setup. You will be asked to set up CouchDB as a single-node instance or set up a cluster.
When you click “Setup Cluster” you are asked for admin credentials again, and then to add nodes by IP address. To get more nodes, go through the same install procedure on other machines. Be sure to specify the total number of nodes you expect to add to the cluster before adding nodes.
Now enter each node’s IP address or FQDN in the setup wizard, ensuring you also enter the previously set server admin username and password.
Once you have added all nodes, click “Setup” and Fauxton will finish the cluster configuration for you.
To check that all nodes have been joined correctly, visit http://<server-IP|FQDN>:5984/_membership on each node. The returned list should show all of the nodes in your cluster:
- {
- "all_nodes": [
- "couchdb@server1",
- "couchdb@server2",
- "couchdb@server3"
- ],
- "cluster_nodes": [
- "couchdb@server1",
- "couchdb@server2",
- "couchdb@server3"
- ]
- }
The all_nodes section is the list of expected nodes; the cluster_nodes section is the list of actually connected nodes. Be sure the two lists match.
Now your cluster is ready and available! You can send requests to any one ofthe nodes, and all three will respond as if you are working with a singleCouchDB cluster.
For a proper production setup, you’d now set up an HTTP proxy in front of the nodes that does load balancing and SSL termination, if desired. We recommend HAProxy. See our example configuration for HAProxy. All you need is to adjust the IP addresses or hostnames and ports.
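As a rough sketch of what such a proxy configuration covers (hostnames and the health-check endpoint are illustrative, SSL termination is omitted, and the full HAProxy example configuration remains the authoritative reference):
- frontend couchdb_in
-     mode http
-     bind *:80
-     default_backend couchdb_nodes
- backend couchdb_nodes
-     mode http
-     balance roundrobin
-     option httpchk GET /_up
-     server couch1 server1:5984 check
-     server couch2 server2:5984 check
-     server couch3 server3:5984 check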
3.2.5. The Cluster Setup API
If you would prefer to manually configure your CouchDB cluster, CouchDB exposes the _cluster_setup endpoint for that purpose. After installation and initial setup/config, we can set up the cluster. On each node we need to run the following command to set up the node:
- curl -X POST -H "Content-Type: application/json" http://admin:password@127.0.0.1:5984/_cluster_setup -d '{"action": "enable_cluster", "bind_address":"0.0.0.0", "username": "admin", "password":"password", "node_count":"3"}'
After that we can join all the nodes together. Choose one node as the “setup coordination node” to run all these commands on. This “setup coordination node” only manages the setup and requires all other nodes to be able to see it and vice versa. It has no special purpose beyond the setup process; CouchDB does not have the concept of a “master” node in a cluster.
Setup will not work with unavailable nodes. All nodes must be online and properlypreconfigured before the cluster setup process can begin.
To join a node to the cluster, run these commands for each node you want to add:
- curl -X POST -H "Content-Type: application/json" http://admin:password@<setup-coordination-node>:5984/_cluster_setup -d '{"action": "enable_cluster", "bind_address":"0.0.0.0", "username": "admin", "password":"password", "port": 5984, "node_count": "3", "remote_node": "<remote-node-ip>", "remote_current_user": "<remote-node-username>", "remote_current_password": "<remote-node-password>" }'
- curl -X POST -H "Content-Type: application/json" http://admin:password@<setup-coordination-node>:5984/_cluster_setup -d '{"action": "add_node", "host":"<remote-node-ip>", "port": <remote-node-port>, "username": "admin", "password":"password"}'
This will join the two nodes together. Keep running the above commands for each node you want to add to the cluster. Once this is done, run the following command to complete the cluster setup and add the system databases:
- curl -X POST -H "Content-Type: application/json" http://admin:password@<setup-coordination-node>:5984/_cluster_setup -d '{"action": "finish_cluster"}'
Verify install:
- curl http://admin:password@<setup-coordination-node>:5984/_cluster_setup
Response:
- {"state":"cluster_finished"}
Verify all cluster nodes are connected:
- curl http://admin:password@<setup-coordination-node>:5984/_membership
Response:
- {
- "all_nodes": [
- "couchdb@couch1",
- "couchdb@couch2",
- "couchdb@couch3",
- ],
- "cluster_nodes": [
- "couchdb@couch1",
- "couchdb@couch2",
- "couchdb@couch3",
- ]
- }
Ensure the all_nodes and cluster_nodes lists match.
Your CouchDB cluster is now set up.