
Manage interface bonding through the sysfs interface

This is an easy way to manage your bonds through the sysfs interface.

Load the bonding module:

# modprobe bonding

Create a new bonding bond0:

echo "+bond0" >  /sys/class/net/bonding_masters

View the existing bonds:

# cat /sys/class/net/bonding_masters
bond0 bond1

Add slave interfaces to bond0:

echo "+eth0" > /sys/class/net/bond0/bonding/slaves
echo "+eth1" > /sys/class/net/bond0/bonding/slaves

Remove an interface from an existing bond:

echo "-eth0" > /sys/class/net/bond0/bonding/slaves

Remove the bond0 interface:

echo "-bond0" >  /sys/class/net/bonding_masters

Change the bonding mode (The bond interface must be down before the mode can be changed.):

echo balance-alb > /sys/class/net/bond0/bonding/mode

or

echo 6 > /sys/class/net/bond0/bonding/mode
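
Since the bond must be down first, the whole sequence looks roughly like this (a small sketch reusing the same mode):

ifconfig bond0 down
echo balance-alb > /sys/class/net/bond0/bonding/mode
ifconfig bond0 up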

A full example:

# modprobe bonding
# modprobe e100
# echo balance-alb > /sys/class/net/bond0/bonding/mode
# ifconfig bond0 192.168.1.1 netmask 255.255.255.0 up
# echo 100 > /sys/class/net/bond0/bonding/miimon
# echo +eth0 > /sys/class/net/bond0/bonding/slaves
# echo +eth1 > /sys/class/net/bond0/bonding/slaves
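
You can verify the result of the example above by reading the bonding driver status:

# cat /proc/net/bonding/bond0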

References: https://www.kernel.org/doc/Documentation/networking/bonding.txt [3.4]

Live migration with OpenStack on Ubuntu 14.04

In this post I am going to configure the compute nodes to enable live migration of KVM instances backed by Ceph. In my set-up, the Cinder volumes and the Nova instances' ephemeral disks are backed by Ceph, so all the compute nodes can see all the storage.

Assuming that Cinder and Nova are correctly integrated with Ceph, we have to follow these steps to set up live migration:

In the libvirt-bin service configuration file we have to add the -l flag to the libvirtd options so that it listens on a TCP socket.
/etc/default/libvirt-bin

# Defaults for libvirt-bin initscript (/etc/init.d/libvirt-bin)
# This is a POSIX shell fragment

# Start libvirtd to handle qemu/kvm:
start_libvirtd="yes"

# options passed to libvirtd, add "-l" to listen on tcp
libvirtd_opts="-d -l"

In the libvirtd configuration, set the options needed to listen on TCP:
/etc/libvirt/libvirtd.conf

# Flag listening for secure TLS connections on the public TCP/IP port.
listen_tls = 0
# Listen for unencrypted TCP connections on the public TCP/IP port.
listen_tcp = 1
tcp_port = "16509"
# Override the default configuration which binds to all network
# interfaces. This can be a numeric IPv4/6 address, or hostname
listen_addr = "172.17.16.117"
# Authentication.
#
#  - none: do not perform auth checks. If you can connect to the
#          socket you are allowed. This is suitable if there are
#          restrictions on connecting to the socket (eg, UNIX
#          socket permissions), or if there is a lower layer in
#          the network providing auth (eg, TLS/x509 certificates)
auth_unix_ro = "none"
auth_unix_rw = "none"
auth_tcp = "none"

Because we are not setting any authentication for the TCP connection, in a production environment you should take additional measures to ensure that only certain servers are allowed to connect to this port, for example with iptables.
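
A minimal iptables sketch that only allows hosts from the compute network to reach the libvirt TCP port (the subnet below is just a placeholder matching the listen address above):

$ sudo iptables -A INPUT -p tcp --dport 16509 -s 172.17.16.0/24 -j ACCEPT
$ sudo iptables -A INPUT -p tcp --dport 16509 -j DROP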

Configure the QEMU processes to run as the root user and group.
/etc/libvirt/qemu.conf

# The user for QEMU processes run by the system instance. It can be
# specified as a user name or as a user id. The qemu driver will try to
# parse this value first as a name and then, if the name doesn't exist,
# as a user id.
user = "root"
# The group for QEMU processes run by the system instance. It can be
# specified in a similar way to user.
group = "root"
# Whether libvirt should dynamically change file ownership
# to match the configured user/group above. Defaults to 1.
# Set to 0 to disable file ownership changes.
dynamic_ownership = 0

Once the changes are made restart the libvirt-bin service:

$ sudo service libvirt-bin restart
libvirt-bin stop/waiting
libvirt-bin start/running, process 21411

Check that libvirtd is listening on TCP port 16509:

$ sudo netstat -npta | grep 16509  
tcp        0      0 172.17.16.117:16509     0.0.0.0:*               LISTEN    21411/libvirtd  

Set the flags needed for live migration in the [libvirt] section of nova.conf:
/etc/nova/nova.conf

[libvirt]
[..]
live_migration_flag=VIR_MIGRATE_UNDEFINE_SOURCE,VIR_MIGRATE_PEER2PEER,VIR_MIGRATE_LIVE,VIR_MIGRATE_TUNNELLED
live_migration_uri=qemu+tcp://%s/system
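
Before trying a migration, it is worth checking that one compute node can reach another's libvirtd over TCP, for example (the hostname is a placeholder):

$ virsh -c qemu+tcp://host2/system list --all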

Assuming that the compute nodes have different hardware, you have to set a common CPU model in the nova.conf configuration file. You can set kvm64, the most compatible model across Intel and AMD platforms, or, if you have Intel CPUs like me, you can set SandyBridge. In any case, the model you select must be supported on all the compute nodes.

/etc/nova/nova.conf

[libvirt]
[..]
type = qemu
cpu_mode=custom
cpu_model=kvm64

or, if all your compute nodes have Intel CPUs:

[libvirt]
[..]
type = qemu
cpu_mode=custom
cpu_model=SandyBridge

You can see all the CPU models that KVM supports with:

$ /usr/bin/qemu-system-x86_64 -cpu help
x86           qemu64  QEMU Virtual CPU version 2.0.0                  
x86           phenom  AMD Phenom(tm) 9550 Quad-Core Processor         
x86         core2duo  Intel(R) Core(TM)2 Duo CPU     T7700  @ 2.40GHz 
x86            kvm64  Common KVM processor                            
x86           qemu32  QEMU Virtual CPU version 2.0.0                  
x86            kvm32  Common 32-bit KVM processor                     
x86          coreduo  Genuine Intel(R) CPU           T2600  @ 2.16GHz 
x86              486                                                  
x86          pentium                                                  
x86         pentium2                                                  
x86         pentium3                                                  
x86           athlon  QEMU Virtual CPU version 2.0.0                  
x86             n270  Intel(R) Atom(TM) CPU N270   @ 1.60GHz          
x86           Conroe  Intel Celeron_4x0 (Conroe/Merom Class Core 2)   
x86           Penryn  Intel Core 2 Duo P9xxx (Penryn Class Core 2)    
x86          Nehalem  Intel Core i7 9xx (Nehalem Class Core i7)       
x86         Westmere  Westmere E56xx/L56xx/X56xx (Nehalem-C)          
x86      SandyBridge  Intel Xeon E312xx (Sandy Bridge)                
x86          Haswell  Intel Core Processor (Haswell)                  
x86       Opteron_G1  AMD Opteron 240 (Gen 1 Class Opteron)           
x86       Opteron_G2  AMD Opteron 22xx (Gen 2 Class Opteron)          
x86       Opteron_G3  AMD Opteron 23xx (Gen 3 Class Opteron)          
x86       Opteron_G4  AMD Opteron 62xx class CPU                      
x86       Opteron_G5  AMD Opteron 63xx class CPU                      
x86             host  KVM processor with all supported host features (only available in KVM mode)

After these changes, if you see a message like this:

$ sudo nova live-migration 6fba9cbe-66e2-484d-ba90-18ad519865ff host3
ERROR (BadRequest): Unacceptable CPU info: CPU doesn't have compatibility.

It could be caused by this bug #1082414. In Juno, as a workaround, you can comment out line number 5010, “self._compare_cpu(source_cpu_info)”, in the libvirt driver:
/usr/lib/python2.7/dist-packages/nova/virt/libvirt/driver.py

# Compare CPU
source_cpu_info = src_compute_info['cpu_info']
#self._compare_cpu(source_cpu_info)

In Kilo this bug should be fixed, so no changes are needed in driver.py.

I’m not so sure that the following is a requirement for live migration, but it definitely is for the migration process and for instance resize, because some commands are run through an SSH connection.

Enable SSH access between compute nodes for the nova user. First, edit each /etc/passwd file and enable shell access for the nova user:
/etc/passwd

nova:x:108:113::/var/lib/nova:/bin/sh
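
Alternatively, instead of editing /etc/passwd by hand, you can change the shell with chsh:

$ sudo chsh -s /bin/sh nova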

Put this SSH configuration file in the nova user's home directory to avoid host key checking between the compute nodes.
/var/lib/nova/.ssh/config

Host *
    StrictHostKeyChecking no
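
SSH is strict about permissions, so make sure the .ssh directory and its contents belong to nova and are not readable by others, for example:

$ sudo chown -R nova:nova /var/lib/nova/.ssh
$ sudo chmod 700 /var/lib/nova/.ssh
$ sudo chmod 600 /var/lib/nova/.ssh/config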

On each compute node, create an RSA key pair as the nova user:

$ ssh-keygen 
Generating public/private rsa key pair.
Enter file in which to save the key (/var/lib/nova/.ssh/id_rsa): 
Created directory '/var/lib/nova/.ssh'.
Enter passphrase (empty for no passphrase): 
Enter same passphrase again: 
Your identification has been saved in /var/lib/nova/.ssh/id_rsa.
Your public key has been saved in /var/lib/nova/.ssh/id_rsa.pub.
The key fingerprint is:
e1:97:a7:f5:10:71:bb:1f:9a:91:dd:c8:66:22:be:49 nova@host
The key's randomart image is:
+--[ RSA 2048]----+
|            . .  |
|             o . |
|        .   . .  |
|       . . . ooo.|
|        S + =o*o.|
|         o = *+..|
|          E  o. .|
|         . o     |
|          o      |
+-----------------+

Copy the contents of all the generated public keys into an authorized_keys file and share it on all the compute nodes for the nova user:
/var/lib/nova/.ssh/authorized_keys

ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDl+XPbYlzlDm3F+5N2SCiZlCRL/wZ9WAD3xwC5uNeza7NbQwy9jL5t2jHQn+bLMHP27GJO5Afl0cx9aPMe+mUvXDf0kk1yhND/eqRauNjQ/NONhUT9VDMiQBL7F28xWD+d0XTSr/G1/ddYxt/ouoZF94nPXCLmzqY4JdwWCq2VV/ChJRAXqs0tzPpOxmAGWNm7+mOxL4SFiFRCHR4LxxveV5rf10EzrOJFOEewUQ51yTqn8tuIs59nPuVzwNezYVJ4iZM3gcdm+rnE/40I/sodePDhiuIVkcT0Zl1stGVxVJrpsUtzE8+YsZLe+aH/IlsHXMPdpCIbinyv0vmzIG1H nova@host1
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDUTvfP4RmRdRXIlWn72X+y+DKnwiDlz9iWqB+0zVhMmy3T4bYY4Okw5qXCZ6xOA2BLzsuY07QLNdFCHDs6FjPjEtT+A8U4w3x4aZDwS+jgl6eC3vpTU/rkEpCDF/KOvkvoP+U8zuKS4r1r5+UAoFAKvDCM8RGGwY6mC2+uEqv23at9OIrWrbkdHVlVnxhSYk4prg2PnePMFchs3Sh9yEaLw/3F2wGBJGjYbVkfAu87UbQy6mRqWepJx8qSP2XYvIuVKleYpHS41Vk3H/+L4tTR0ibYBD+eDR80IRN4qGE6vzdf7hJW1Gl0Ozx9fzSzO0u6f/8254PqrNxya0PMmCbb nova@host2
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDapvnExGGOKVx0XVqTPNWTwXR0kXLfzb2se1slb7oAL7clZShhUKDwFHOVRO16tV7k/VD3mEf0Z+VBmU2MyxXa5nOIwbBCIIy9E/01fXh9QcP5dn1Qs8GzsoNh4j3AHSDbmYgsaG0d+BrBxmF/HpU+qZvBOMudT8reXT++5VQFNMP5cXkd6b8gyeYlrRH2SAaa7kIy44z3ZqQHzmFA+TJwYSrMoawgpdDE75HWQMAgiECXFK2Nb71+gd9sHOttzNPGmSx6TmbkHAi1W9rGYSZ88n1+19tHbnyZi+Qn8HYvKmLMyQFhje71DMwzK3FzbSpZuTaMfiEslRS9skYD6OTd nova@host3
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQCrBNLab4QNjAIwGm7Ajc0CGHrtSlLnbV447vAdc/QWRoU+yiBlv4NxWq3aOogczuq6ar3hufXAnUX7ClMTon6f2Fcq/cv2D5V8YkXG7NtZQUKj0F6R27dEOUMPX64w2PGZen2QpcJNxLJXokbdTnDRc2odJ+0kw8rGKWDPioeLDjw5Qrb6EfddxWBJLbk3+gravyc2zHWMCzLUhRU4JMxBMutk3AXV2XBUflnOBoUMFixv8Mrm4wWQE3w29dZGL6wYtl2dAt9YENo9UIko/jVreuAc5gTIr4v1iywzaDivLT2HR2BjqTkABOd9cuWw6o7ZS0lTTPf8skGxAGNSOoQT nova@host4

Check that you can run the ls command on a remote host from your compute nodes:

$ ssh nova@host1 ls -l /etc/nova/nova.conf 
-rw-r----- 1 nova nova 3329 sep 21 11:17 /etc/nova/nova.conf

Now you should be able to live-migrate instances between your compute nodes, and the instance resize/migration should work without problems too.
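
For example, to migrate an instance and then confirm on which hypervisor it ended up (the UUID and target host are placeholders taken from the earlier example):

$ nova live-migration 6fba9cbe-66e2-484d-ba90-18ad519865ff host3
$ nova show 6fba9cbe-66e2-484d-ba90-18ad519865ff | grep hypervisor_hostname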

Interface bonding on Ubuntu 14.04

Install the required packages ifenslave and ethtool:

$ sudo apt-get install ifenslave ethtool

To prevent issues, make sure that the bonding module is listed in the /etc/modules file. This way the module will be loaded at boot time.

$ cat /etc/modules
# /etc/modules: kernel modules to load at boot time.
#
# This file contains the names of kernel modules that should be loaded
# at boot time, one per line. Lines beginning with "#" are ignored.
# Parameters can be specified after the module name.

lp
rtc
bonding
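
To load the module right away without rebooting:

$ sudo modprobe bonding
$ lsmod | grep bonding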

Configure the network with the new bond0 interface.

$ cat /etc/network/interfaces

# This file describes the network interfaces available on your system
# and how to activate them. For more information, see interfaces(5).

# The loopback network interface
auto lo
iface lo inet loopback

# The primary network interface
auto eth0
iface eth0 inet manual
  bond-master bond0

auto eth1
iface eth1 inet manual
  bond-master bond0

auto bond0
iface bond0 inet static
  address 172.17.16.10
  netmask 255.255.255.0
  gateway 172.17.16.1
  bond-miimon 100
  bond-mode balance-alb
  bond-slaves eth0 eth1
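
The simplest way to apply the change is a reboot; otherwise something like the following should work (a sketch only; run it from a console, since an SSH session over these interfaces may drop):

$ sudo ifdown eth0 eth1
$ sudo ifup bond0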

I configured the balance-alb mode here, which can balance both outgoing and incoming traffic without any special switch support, but the network drivers must support ethtool so the bonding driver can retrieve the link speed from them.
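
You can quickly check what ethtool reports for each slave, for example:

$ sudo ethtool eth0 | grep -E 'Speed|Duplex|Link detected'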

The most commonly used bonding modes are:
– active-backup
– balance-alb
– 802.3ad

You should check the bonding documentation for the features of each mode here:

https://www.kernel.org/doc/Documentation/networking/bonding.txt

To check that the new bond0 interface is working:

$ sudo cat /proc/net/bonding/bond0 
Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)

Bonding Mode: adaptive load balancing
Primary Slave: None
Currently Active Slave: eth0
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 0
Down Delay (ms): 0

Slave Interface: eth0
MII Status: up
Speed: 10000 Mbps
Duplex: full
Link Failure Count: 1
Permanent HW addr: 0c:c4:7a:34:e8:a2
Slave queue ID: 0

Slave Interface: eth1
MII Status: up
Speed: 10000 Mbps
Duplex: full
Link Failure Count: 1
Permanent HW addr: 0c:c4:7a:34:e8:a3
Slave queue ID: 0

Calamari on Ubuntu 14.04 with Firefly

Calamari is a great console developed by the Inktank team, but building it used to be a very difficult and laborious process. The good news is that they are now building binary packages and publishing them in a repository, which makes the Calamari installation a lot easier. Although the packages are in a dev stage, the console is very usable. The repository can be found here:

http://download.ceph.com/calamari/

To set up the repository run the following command:

echo "deb http://download.ceph.com/calamari/devel/trusty/ trusty main" > /etc/apt/sources.list.d/calamari.list

Then get the repository key:

wget -q -O- 'https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/autobuild.asc' | sudo apt-key add -

And finally install the packages:

$ sudo apt-get install calamari-server calamari-clients

Run the initialization:

$ sudo calamari-ctl initialize
[INFO] Loading configuration..
[INFO] Starting/enabling salt...
[INFO] Starting/enabling postgres...
[INFO] Updating database...
[INFO] Initializing web interface...
[INFO] Starting/enabling services...
[INFO] Updating already connected nodes.
[INFO] Restarting services...
[INFO] Complete.

If you want to go through the process of building the packages yourself, you can do it with the help of these guides:

http://ceph.com/calamari/docs/operations/server_install.html
https://ceph.com/category/calamari/

In fact, I had to build the calamari-server package from the sources because I couldn’t make the binary package work on my server; the initialize process always finished with errors. I guess there was some bug in the binary package that was fixed in the sources. To save you time, if you want to install the package I built, you can get it here: calamari-server_1.3.0.1-111-g208b255_amd64.deb

In case you built the packages from the sources or used the package I posted here, you should check this post too.

Calamari needs SaltStack to run, but I faced issues with the latest versions of SaltStack; I guess Calamari is only compatible with the 2014.7 branch. To install this version you can add this repository:

echo "deb http://ppa.launchpad.net/saltstack/salt2014-7/ubuntu trusty main" > /etc/apt/sources.list.d/saltstack-salt-trusty.list

Install the repository key:

gpg --keyserver keyserver.ubuntu.com --recv-key 0E27C0A6 && gpg -a --export 0E27C0A6 | sudo apt-key add -

Or simply run this command to get the repository working on your machine:

$ sudo add-apt-repository ppa:saltstack/salt2014-7

Now you can install the salt-master and salt-minion if needed:

$ sudo apt-get install salt-master

Make sure that your Ceph nodes have the same Salt version as your master:

$ sudo dpkg -l | grep salt
ii  salt-common                          2014.7.5+ds-1ubuntu1             all          shared libraries that salt requires for all packages
ii  salt-master                          2014.7.5+ds-1ubuntu1             all          remote manager to administer servers via salt
ii  salt-minion                          2014.7.5+ds-1ubuntu1             all          client package for salt, the distributed remote execution system

You can check that the Calamari services are working OK:

$ sudo supervisorctl status
carbon-cache                     RUNNING    pid 1088, uptime 23:19:02
cthulhu                          RUNNING    pid 27734, uptime 0:15:00

If something is wrong you should check the /var/log/calamari directory.
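
For example (the exact log file names may vary between versions):

$ sudo tail -n 50 /var/log/calamari/*.log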


pg X.Y is stuck stale for , current state stale+active+clean, last acting [N]

I got these states when I removed the last OSD assigned to a pool with size 1 in the crushmap. Of course, I didn’t have any precious data in it, but to avoid removing the pool I tried reassigning it to a new root and new OSDs through a crushmap rule.
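
For reference, pointing a pool at a different crush rule on Firefly looks roughly like this (the pool name and rule id are placeholders):

# ceph osd pool set mypool crush_ruleset 2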

ceph health detail
HEALTH_WARN 9 pgs stale; 9 pgs stuck stale
pg 18.6 is stuck stale for 8422.941233, current state stale+active+clean, last acting [13]
pg 18.1 is stuck stale for 8422.941247, current state stale+active+clean, last acting [13]
pg 19.0 is stuck stale for 8422.941251, current state stale+active+clean, last acting [13]
pg 18.0 is stuck stale for 8422.941252, current state stale+active+clean, last acting [13]
pg 19.1 is stuck stale for 8422.941255, current state stale+active+clean, last acting [13]
pg 18.3 is stuck stale for 8422.941254, current state stale+active+clean, last acting [13]
pg 19.2 is stuck stale for 8422.941258, current state stale+active+clean, last acting [13]
pg 18.2 is stuck stale for 8422.941259, current state stale+active+clean, last acting [13]
pg 19.3 is stuck stale for 8422.941263, current state stale+active+clean, last acting [13]

The PGs show that their last acting (and now removed) OSD was number 13, and indeed this OSD no longer exists in the cluster.

If I try querying the pg:

# ceph pg 18.6 query
Error ENOENT: i don't have pgid 18.6

The data inside those PGs is not valid, so I tried recreating the PG:

# ceph pg force_create_pg 18.6
pg 18.6 now creating, ok

Remember that I reassigned the pool to a new root in the crushmap, so there are plenty of OSDs available for the pool. But now the PG is stuck in the “creating” state forever:

pg 18.6 is stuck inactive since forever, current state creating, last acting []

I supposed that the problem was with the PG number of the pool; I thought that the pool couldn’t create more PGs because of its pg_num setting.
I tried increasing the pool’s PG number and the PGs were finally created correctly.
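
The commands for that are roughly the following (the pool name and the PG counts are placeholders):

# ceph osd pool set mypool pg_num 64
# ceph osd pool set mypool pgp_num 64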

I followed these steps for documentation purposes, but if you don’t mind losing the data inside the pool, the best option is to remove the pool and create it again.