
Creating a drbd for an existing Xen domain

I needed some VMs to be available on a backup node, which I accomplished with the Distributed Replicated Block Device, or DRBD. My host machine is Debian 6.

This post replaced an older one I made.

First install drbd:

aptitude -P install drbd8-utils

Then make some config files. First adjust /etc/drbd.d/global.conf (I only had to uncomment the notify rules):

global {
        usage-count yes;
        # minor-count dialog-refresh disable-ip-verification
}
 
common {
        protocol C;
 
        handlers {
                pri-on-incon-degr "/usr/lib/drbd/notify-pri-on-incon-degr.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";
                pri-lost-after-sb "/usr/lib/drbd/notify-pri-lost-after-sb.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";
                local-io-error "/usr/lib/drbd/notify-io-error.sh; /usr/lib/drbd/notify-emergency-shutdown.sh; echo o > /proc/sysrq-trigger ; halt -f";
                # fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
                split-brain "/usr/lib/drbd/notify-split-brain.sh root";
                out-of-sync "/usr/lib/drbd/notify-out-of-sync.sh root";
                # before-resync-target "/usr/lib/drbd/snapshot-resync-target-lvm.sh -p 15 -- -c 16k";
                # after-resync-target /usr/lib/drbd/unsnapshot-resync-target-lvm.sh;
        }
 
        startup {
                # wfc-timeout degr-wfc-timeout outdated-wfc-timeout wait-after-sb;
 
                # The timeout value when the last known state of the other side was available.
                wfc-timeout 0;
 
                # Timeout value when the last known state was disconnected.
                degr-wfc-timeout 180;
        }
 
        disk {
                # on-io-error fencing use-bmbv no-disk-barrier no-disk-flushes
                # no-disk-drain no-md-flushes max-bio-bvecs   
        }
 
        net {
                # sndbuf-size rcvbuf-size timeout connect-int ping-int ping-timeout max-buffers
                # max-epoch-size ko-count allow-two-primaries cram-hmac-alg shared-secret
                # after-sb-0pri after-sb-1pri after-sb-2pri data-integrity-alg no-tcp-cork
        }
 
        syncer {
                # rate after al-extents use-rle cpu-mask verify-alg csums-alg
        }
}

Then I made a resource for my existing logical volume:

resource r0
{
  meta-disk internal;
  device /dev/drbd1;
 
  startup
  {
    # The timeout value when the last known state of the other side was available.
    wfc-timeout 0;
 
    # Timeout value when the last known state was disconnected.
    degr-wfc-timeout 180;
  }
 
  syncer
  {
    # This is recommended only for low-bandwidth lines, to only send those
    # blocks which really have changed.
    #csums-alg md5;
 
    # Set to about half your net speed
    rate 8M;
 
    # It seems that this option moved to the 'net' section in drbd 8.4.
    verify-alg md5;
  }
 
  net
  {
    # The manpage says this is recommended only in pre-production (because of its performance), to determine
    # if your LAN card has a TCP checksum offloading bug. 
    #data-integrity-alg md5;
  }
 
  disk
  {
    # Detach causes the device to work over-the-network-only after the
    # underlying disk fails. Detach is not default for historical reasons, but is
    # recommended by the docs.
    # However, the Debian defaults in drbd.conf suggest the machine will reboot in that event...
    on-io-error detach;
 
    # LVM doesn't support barriers, so disabling it. It will revert to flush. Check wo: in /proc/drbd. If you don't disable it, you get IO errors.
    no-disk-barrier;
  }
 
  on top
  {
    disk /dev/universe/lvtest;
    address 192.168.2.6:7789;
  }
 
  on bottom
  {
    disk /dev/universe/lvtest;
    address 192.168.2.7:7790;
  }
}

Copy all config files to the slave machine (and write an rsync-script for it…).
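
Something like this would do as a starting point for such a script (an untested sketch; "bottom" stands in for the secondary node's hostname from the resource config above):

rsync -av /etc/drbd.conf /etc/drbd.d root@bottom:/etc/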

I learned that Linux 3.1 now has write barriers enabled by default for ext3 (they already were for ext4). This causes bugs and IO errors with xen-blkfront, so barriers need to be disabled:

# grep barrier /etc/fstab
/dev/xvda2 / ext3 barrier=0 0 1

I'll see if there are bug reports about this and file them if necessary.

The DRBD metadata is going to be written on the actual LV (that's what meta-disk internal means), so on the primary node we need to make space by shrinking the filesystem (you can also grow the LV instead):

e2fsck -f /dev/universe/lvtest
resize2fs /dev/universe/lvtest 500M # or whatever size is a tad smaller than the actual LV.
drbdadm create-md r0
drbdadm up r0

On the secondary node, make the device as well:

drbdadm create-md r0
drbdadm up r0

Then we can start syncing and re-grow it. On the primary:

drbdadm -- --overwrite-data-of-peer primary r0 # the -- is necessary because of weird option handling by drbdadm.
resize2fs /dev/drbd1
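
While the initial sync runs, you can keep an eye on its progress:

watch -n1 cat /proc/drbd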

As far as mount is concerned, the logical volume has now been converted from ext3 to drbd:

# mount /dev/universe/lvtest /mnt/temp
mount: unknown filesystem type 'drbd'

Then, it is recommended you create /etc/modprobe.d/drbd.conf with:

options drbd disable_sendpage=1

I don't know exactly what it does, but it's recommended by the DRBD docs when you put Xen domains on DRBD devices.
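
If you want to check that the parameter actually took effect after (re)loading the module, its current value should be readable through sysfs:

cat /sys/module/drbd/parameters/disable_sendpage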

In Xen, you can configure the disk device of a VM like this (though, as I learned, this doesn't work with pygrub):

disk = [ 'drbd:resource,xvda,w' ]
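
Because that syntax doesn't work with pygrub, one workaround is to point the VM at the DRBD device directly (a sketch, not what the drbd: helper does for you):

disk = [ 'phy:/dev/drbd1,xvda,w' ]

The downside is that you lose the automatic primary promotion described below, so you have to run drbdadm primary r0 yourself before starting the domain.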

DRBD has installed the necessary scripts in /etc/xen/scripts to support this. Xen will now automatically promote a DRBD device to primary when you start a VM.

Be warned: because of that, don't put the VM in the /etc/xen/auto dir on the fallback node as well, otherwise whichever machine is faster will start the VM, preventing the other machine from starting it (because you can't have two primaries).

Then, I noticed that Debian arranges its boot process erroneously, starting xendomains before drbd. I commented on an old bug report about this.

You can fix it by adding xendomains to the following lines in /etc/init.d/drbd:

# X-Start-Before: heartbeat corosync xendomains
# X-Stop-After:   heartbeat corosync xendomains 
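
After changing those headers, the dependency-based boot ordering needs to be regenerated, which on Squeeze should be a matter of:

insserv drbd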

Mdadm (software RAID) schedules monthly checks of your array. You can do that for DRBD too, on the primary node, with a cronjob in /etc/cron.d/:

42 0 * * 0    root    /sbin/drbdadm verify all

One last thing: the docs state that when you perform a verify and it detects an out-of-sync device, all you have to do is disconnect and connect. That didn’t work for me. Instead, I ran the following on the secondary node (the one I had destroyed with dd) to initiate a resync:

drbdadm invalidate r0

Setting max memory of a Xen Dom0

I’ve had some issues with Xen crashing when I wanted to create a DomU for which the Dom0 had to shrink (see bug report). Therefore, it’s better to force a memory limit on the dom0. That is done with a kernel param.

Add this to /etc/default/grub:

# Start dom0 with less RAM
GRUB_CMDLINE_XEN_DEFAULT="dom0_mem=512M"

And make sure you disable ballooning of dom0 in /etc/xen/xend-config.sxp:

(enable-dom0-ballooning no)

Then run update-grub2 and reboot.
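
After the reboot, you can verify that dom0 is really capped at the configured amount:

xm list Domain-0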

Fixing a Xen DomU’s grub config

When you use xen-create-image to bootstrap an Ubuntu VM, it sets up a grub config file, menu.lst. However, this boot config is not kept up to date on newer Ubuntu releases, because they use grub2 (which uses grub.cfg, not menu.lst). And since Xen's pygrub looks at menu.lst first, a stale file means it will always boot an old kernel.

I 'fixed' it like this (I have yet to devise a real fix, but this works; actually, I think it is a bug):

The grub hooks in Debian and Ubuntu don't take into account that the machine might be running as a paravirtualized VM. Therefore, grub can't find /dev/xvda to install itself on, which it shouldn't be trying in the first place. Bug reports exist about this, but it doesn't seem to be deemed important. The result is that the menu.lst created by xen-create-image is never updated and kernel updates are never booted.

Pygrub prefers menu.lst over grub.cfg (which would give problems when upgrading to grub2), but you can also use that to your advantage. Edit /etc/kernel-img.conf to look like this (do_symlinks = yes and no hooks):

do_symlinks = yes
relative_links = yes
do_bootloader = no
do_bootfloppy = no
do_initrd = yes
link_in_boot = no
postinst_hook =
postrm_hook   =

And then make /boot/grub/menu.lst with this:

default         0
timeout         2
 
title           Marauder
root            (hd0,0)
kernel          /vmlinuz root=/dev/xvda2 ro
initrd          /initrd.img

Then uninstall grub. This way, you always boot the newest kernel. (Actually, I found that you can't always uninstall grub. I now have machines with grub-pc and grub-common installed, but with an empty /boot/grub containing only that menu.lst. This keeps aptitude from complaining about grub-pc failing to configure, because it's unable to detect the BIOS device for /dev/xvda2, or whatever device you use.)
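
Because do_symlinks = yes keeps /vmlinuz and /initrd.img pointing at the newest installed kernel, and the menu.lst above references exactly those paths, a quick sanity check after a kernel upgrade is:

ls -l /vmlinuz /initrd.img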

Tracking Xen domU’s with munin

To view statistics of your xen server with Munin (source):

cd /usr/local/share/
mkdir -p munin/plugins
cd munin/plugins
wget http://wiki.kartbuilding.net/xen_traffic_all
wget http://wiki.kartbuilding.net/xen_cpu_percent
chmod 755 xen_traffic_all xen_cpu_percent
ln -s /usr/local/share/munin/plugins/xen_traffic_all /etc/munin/plugins/
ln -s /usr/local/share/munin/plugins/xen_cpu_percent /etc/munin/plugins/
vim /etc/munin/plugin-conf.d/munin-node

#add the following:
[xen_traffic_all]
user root
[xen_cpu_percent]
user root

/etc/init.d/munin-node restart
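
To test the plugins without waiting for the next Munin poll, you can run them by hand through munin-run:

munin-run xen_cpu_percent
munin-run xen_traffic_all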


I wanted to attach the scripts, but because of upload problems, I can’t…

Getting a Xen hvc0 on a stock Ubuntu

When you install Ubuntu in Xen with xen-create-image, the console is automatically handled. If you want to add a console to a stock-installed Ubuntu, add this file to /etc/init and call it hvc0.conf:

# hvc0 - getty
#
# This service maintains a getty on hvc0 from the point the system is
# started until it is shut down again.
 
start on stopped rc RUNLEVEL=[2345]
stop on runlevel [!2345]
 
respawn
exec /sbin/getty -8 38400 hvc0
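
There's no need to reboot; upstart can start the new job right away:

start hvc0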

Xen console name on different kernel versions

I've just been struggling to get a Xen console working for Ubuntu 8.04 (Hardy). By default, xen-create-image uses hvc0, but that only exists since kernel 2.6.26 (I don't know if that's only pv_ops or also xen-patched kernels). Hardy uses 2.6.24, so there it's xvc0. The xen-create-image command or the xen-tools.conf config file therefore needs the parameter serial_device=xvc0.
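
For a permanent setting, the corresponding line in /etc/xen-tools/xen-tools.conf would look like this (the script further down passes it as a command-line flag instead):

serial_device = xvc0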

Making a Debian Squeeze machine into a Xen virtual machine host

My attempts to get Xen working right on Debian stable (Lenny) were not really successful. Xen has had some interesting developments and the version in Lenny is just too old. Plus, the Debian bootstrap scripts used to create images don't support Ubuntu Maverick… Squeeze (testing) has the newest Xen and debootstrap, so that's cool. I used the AMD64 architecture.

First install Xen:

aptitude -P install xen-hypervisor-4.0-amd64 linux-image-xen-amd64 xen-tools

Debian Squeeze uses Grub 2, and the defaults are wrong for Xen. The Xen hypervisor (and not just a Xen-ready kernel!) should be the first entry, so do this:

mv -i /etc/grub.d/10_linux /etc/grub.d/50_linux

Then, disable the OS prober, so that you don't get boot entries for each virtual machine you install on a logical volume.

"" >> /etc/default/grub "# Disable OS prober to prevent virtual machines on logical volumes from appearing in the boot menu." >> /etc/default/grub "GRUB_DISABLE_OS_PROBER=true" >> /etc/default/grub
update-grub2

Per default (on Debian anyway), Xen tries to save the state of the VMs upon shutdown. I've had some problems with that, so I set these params in /etc/default/xendomains to make sure they get shut down normally:

XENDOMAINS_RESTORE=false
XENDOMAINS_SAVE=""

In /etc/xen/xend-config.sxp I enabled the network bridge (change an existing or commented-out line for this). I also set some other useful params (for me):

(network-script network-bridge)
(dom0-min-mem 196)
(dom0-cpus 0)
(vnc-listen '127.0.0.1')
(vncpasswd '')

(Also, don't forget to disable ballooning and set a max memory, as described above.)

Then I edited /etc/xen-tools/xen-tools.conf. This config contains default values the xen-create-image script will use. Most important are:

# Virtual machine disks are created as logical volumes in volume group universe (LVM storage is much faster than file)
lvm = universe
 
install-method = debootstrap
 
size   = 50Gb      # Disk image size.
memory = 512Mb    # Memory size
swap   = 2Gb    # Swap size
fs     = ext3     # use the EXT3 filesystem for the disk image.
dist   = `xt-guess-suite-and-mirror --suite` # Default distribution to install.
 
gateway    = x.x.x.x
netmask    = 255.255.255.0
 
# When creating an image, interactively setup root password
passwd = 1
 
# I think this was already the default, but it doesn't hurt to set it explicitly.
mirror = `xt-guess-suite-and-mirror --mirror`
 
mirror_maverick = http://nl.archive.ubuntu.com/ubuntu/
 
# Ext3 had some weird settings per default, like noatime. I don't want that, so I changed it to 'defaults'.
ext3_options     = defaults
ext2_options     = defaults
xfs_options      = defaults
reiserfs_options = defaults
btrfs_options    = defaults
 
# Let xen-create-image use pygrub, so that the grub config from the VM is used, which means you no longer need to store kernels outside the VMs. Keeps this very flexible.
pygrub=1
 
# scsi is specified because when creating maverick for instance, the xvda disk that is used can't be accessed. 
# The scsi flag causes names like sda to be used.
# scsi=1 # no longer needed 

I created the following script to let me easily make VMs:

#!/bin/bash
 
# Script to easily create VM's. Hardy, maverick and Lenny are tested
 
dist=$1
hostname=$2
ip=$3
 
if [ -z "$hostname" -o -z "$ip" -o -z "$dist" ]; then
  echo "No dist, hostname or ip specified"
  echo ""
  echo "Usage: $0 dist hostname ip"
  exit 1
fi
 
if [ "$dist" == "hardy" ]; then
  serial_device="--serial_device xvc0"
fi
 
xen-create-image $serial_device --hostname $hostname --ip $ip --vcpus 2 --pygrub --dist $dist

At this point, you can reboot (well, you could earlier, but well…).

Usage of the script should be simple. When creating a VM named ‘machinex’, start it and attach console:

xm create -c /etc/xen/machinex.cfg

You can escape the console with ctrl-]. Place a symlink in /etc/xen/auto to start the VM on boot.
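
For example:

ln -s /etc/xen/machinex.cfg /etc/xen/auto/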

As a sidenote: when creating a Lenny VM, the script installs a Xen kernel in the VM. When installing Maverick, it installs a normal kernel. Normal kernels since version 2.6.32 (I believe) support pv_ops, meaning they can run on hypervisors like Xen's.
