
High Availability RabbitMQ cluster in OpenStack

The RabbitMQ service is the heart of inter-process communication in OpenStack, and in a production deployment you want to configure a RabbitMQ cluster in order to achieve high availability for the message queues.

There are two types of RabbitMQ nodes: disk nodes and RAM nodes. RAM nodes require less IOPS because their resource management state is not written to disk, but every cluster needs at least one disk node.

In this post I'm going to configure a three-node cluster: one disk node and two RAM nodes.

Installing the RabbitMQ server in all your nodes is as simple as running these commands:

$ echo "deb http://www.rabbitmq.com/debian/ testing main" | sudo tee /etc/apt/sources.list.d/rabbitmq.list
$ wget -O - https://www.rabbitmq.com/rabbitmq-signing-key-public.asc | sudo apt-key add -
$ sudo apt-get update
$ sudo apt-get install rabbitmq-server

To start from scratch, stop the RabbitMQ application on each node and reset its queues and configuration:

$ sudo rabbitmqctl stop_app
$ sudo rabbitmqctl reset
$ sudo rabbitmqctl start_app

Note: after the cluster is created, the RAM nodes can be reset without problems, but the disk node cannot be reset because it is the only disk node in the cluster. To reset it, remove its data from disk:

rabbitmqctl stop_app
/etc/init.d/rabbitmq-server stop
rm -rf /var/lib/rabbitmq/mnesia/rabbit@node1*
/etc/init.d/rabbitmq-server start
rabbitmqctl start_app

Now, if you run the cluster status command, you will see the cluster running with only your disk node:

rabbitmqctl cluster_status
Cluster status of node 'rabbit@node1' ...
[{nodes,[{disc,['rabbit@node1']},
         {ram,[]}]},
 {running_nodes,['rabbit@node1']},
 {cluster_name,<<"rabbit@node1">>},
 {partitions,[]}]

On the disk node, create the user for the OpenStack services, set its permissions and set the cluster name:

$ sudo rabbitmqctl add_user openstack openstack_pass
$ sudo rabbitmqctl set_permissions -p / openstack ".*" ".*" ".*"
$ sudo rabbitmqctl set_cluster_name openstack

Set the queue HA policy to ensure that all queues, except those with auto-generated names, are mirrored across all running nodes:

rabbitmqctl set_policy ha-all '^(?!amq\.).*' '{"ha-mode": "all"}'
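
The pattern `'^(?!amq\.).*'` is a negative lookahead: it matches any queue name that does NOT start with "amq.", the prefix RabbitMQ uses for auto-generated queue names. GNU grep with `-P` can demonstrate which names the policy would cover (the queue names below are made up for illustration):

```shell
# Only the names not starting with "amq." are printed,
# i.e. the queues the ha-all policy would mirror:
printf 'amq.gen-x9Zw\nscheduler_fanout\nnotifications.info\n' \
  | grep -P '^(?!amq\.).*'
```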

All members of a RabbitMQ cluster must share the same Erlang cookie. Find the cookie on the first node and copy it to the other nodes; it is located at:

$ cat /var/lib/rabbitmq/.erlang.cookie
SRITXWMZBCBIRFZMQOAQ
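
A minimal sketch of propagating node1's cookie, run from node1. The node names and root SSH access are assumptions for illustration; the ownership and strict permissions are what Erlang expects on the cookie file:

```shell
# Copy the local cookie to each RAM node, then restore
# rabbitmq ownership and read-only permissions on it.
COOKIE=/var/lib/rabbitmq/.erlang.cookie
for host in node2 node3; do
  scp "$COOKIE" "root@${host}:${COOKIE}"
  ssh "root@${host}" "chown rabbitmq:rabbitmq ${COOKIE} && chmod 400 ${COOKIE}"
done
```

Stop the rabbitmq-server service on each node before replacing its cookie, and start it again afterwards.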

Join the other two nodes to the cluster as RAM nodes by running the following commands on node2 and node3:

$ sudo rabbitmqctl stop_app
$ sudo rabbitmqctl join_cluster --ram rabbit@node1
$ sudo rabbitmqctl start_app

The cluster is now complete:

$ sudo rabbitmqctl cluster_status
Cluster status of node 'rabbit@node1' ...
[{nodes,[{disc,['rabbit@node1']},
         {ram,['rabbit@node2','rabbit@node3']}]},
 {running_nodes,['rabbit@node1','rabbit@node2',
                 'rabbit@node3']},
 {cluster_name,<<"openstack">>},
 {partitions,[]}]

As an additional step, you can enable the RabbitMQ management plugin on one or all of your nodes:

$ sudo rabbitmq-plugins enable rabbitmq_management

Create a new user for the management interface:

$ sudo rabbitmqctl add_user admin admin_pass
$ sudo rabbitmqctl set_permissions -p / admin ".*" ".*" ".*"
$ sudo rabbitmqctl set_user_tags admin administrator

And finally, open your browser and go to:

http://server-name:15672
user: admin
pass: admin_pass
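
You can also check the management interface from the command line through its HTTP API (this assumes the plugin is enabled on that node and the admin user above exists):

```shell
# Query the management HTTP API for a cluster overview;
# a JSON document is returned on success.
curl -s -u admin:admin_pass http://server-name:15672/api/overview
```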

Once you have configured your RabbitMQ cluster, you can configure the OpenStack services to use the cluster and its mirrored queues. These are the relevant options; check the OpenStack documentation for each service, as the exact settings may vary:

[oslo_messaging_rabbit]
rabbit_hosts=node1:5672,node2:5672,node3:5672
rabbit_retry_interval=1
rabbit_retry_backoff=2
rabbit_max_retries=0
rabbit_ha_queues=true
rabbit_userid = openstack
rabbit_password = openstack_pass
amqp_auto_delete = true
amqp_durable_queues=True

How to reset cluster configuration in Proxmox 2

If you have already created the Proxmox cluster but want to change its configuration, for example to change the hostname of a node or the network the nodes use to communicate in the cluster, you can remove the cluster and create it again.

First, make a backup of the cluster:

cp -a /etc/pve /root/pve_backup

Stop cluster service:

/etc/init.d/pve-cluster stop

Unmount /etc/pve if it is mounted:

umount /etc/pve

Stop the cman cluster manager service:

/etc/init.d/cman stop

Remove cluster configuration:

# rm /etc/cluster/cluster.conf
# rm -rf /var/lib/pve-cluster/*

Start the cluster service again:

/etc/init.d/pve-cluster start

Now you can create a new cluster:

# pvecm create newcluster 

Restore the cluster and virtual machine configuration from the backup:

# cp /root/pve_backup/*.cfg /etc/pve/
# cp /root/pve_backup/qemu-server/*.conf /etc/pve/qemu-server/
# cp /root/pve_backup/openvz/* /etc/pve/openvz/

UPDATE: This post is also valid for changing the hostname of a node in a cluster, or for moving a node between two clusters. When you have removed a node from the cluster, it still appears in the Proxmox node tree; to remove it from the tree, delete the node's directory from another node in the cluster:

# rm -rf /etc/pve/nodes/HOSTNAME