Salt

Posted on April 24, 2014 in System

For a few months now, I've been inclined to test and use Salt Stack. I manage a lot of heterogeneous platforms, but each one is composed of similar machines that do the same things.

For example, every three months I'm asked to install new packages or configure a new printer on the desktop machines of our datacenter's collaborators. What a great use case :)

Introduction

Salt is comparable to Puppet and Chef, which are also deployment and automation tools, but I find it more lightweight.

Installation

It seems that Salt Stack is not yet in the official Ubuntu repositories, so we'll use the project's PPA.

Things to do on your master host:

apt-get install python-software-properties
add-apt-repository ppa:saltstack/salt

apt-get update
apt-get install salt-master

Things to do on your client host:

apt-get install python-software-properties
add-apt-repository ppa:saltstack/salt

apt-get update
apt-get install salt-minion

By default, a Salt Minion will try to connect to the DNS name "salt"; if the Minion is able to resolve that name correctly, no configuration is needed. If the DNS name "salt" does not resolve, you need to point the Minion to your Master in /etc/salt/minion:

master: 192.168.0.2

Restart everything

Master

/etc/init.d/salt-master restart

Minion

/etc/init.d/salt-minion restart

Communication

Communication between the Master and your Minions is encrypted with AES. But before they can talk to each other, each Minion's key must be accepted by the Master.

List all keys:

$ salt-key -L
Accepted Keys:
Unaccepted Keys:
NOC1-VTY2
NOC2-VTY2
NOC3-VTY2
NOC4-VTY2
Rejected Keys:

Accept all keys

$ salt-key -A

Accept one key

$ salt-key -a NOC1-VTY2

If you list your keys again you should get an output like this:

$ salt-key -L
Accepted Keys:
NOC1-VTY2
NOC2-VTY2
NOC3-VTY2
NOC4-VTY2
Unaccepted Keys:
Rejected Keys:

You can now test the communication between your Master and one or all of your Minions:

$ salt 'NOC1-VTY2' test.ping
NOC1-VTY2:
    True
$ salt '*' test.ping
NOC3-VTY2:
    True
NOC4-VTY2:
    True
NOC1-VTY2:
    True
NOC2-VTY2:
    True

Deployment

Now, I want to be able to add another computer to our NOC team without having to push all the configurations manually (NIS/NFS/packages, etc.).

There are two major pieces: the file_roots directive and the top.sls file. According to the documentation, an SLS (or SaLt State) file is a representation of the state a system should be in.

file_roots

In your /etc/salt/master file, you need to uncomment the file_roots directive. It defines the location of the Salt file server and the SLS definitions. Mine looks like this:

file_roots:
  base:
    - /srv/salt/

After this modification, restart your salt-master.
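
With the Ubuntu packaging used above, that's the same init script as earlier:

/etc/init.d/salt-master restart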

top.sls

Doing specific stuff to specific machines is the main purpose of Salt. This is defined within the top.sls file.

This can be done by:

Ways                  Example
Globbing              "webserver*prod*"
Regular Expressions   "^(memcache|web).(qa|prod).loc$"
Lists                 "dev1,dev2,dev3"
Grains                "os:CentOS"
Pillar
Node Groups
Compound Matching

This is my top.sls file:

base:
   '*':
     - nagios.client
   'os:Ubuntu':
     - match: grain
     - repos.online
   '^NOC(\d)+-VTY2$':
     - match: pcre
     - yp.install
     - yp.nsswitch
     - nfs.mount_noc

base:

base:
   '*':
     - nagios.client

This block declares the global environment every minion must apply. In this case, every machine will be assigned the nagios.client state, which executes /srv/salt/nagios/client.sls.
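
I won't paste my whole state here, but a minimal client.sls could look like the following. This is a hypothetical sketch (the package and service names assume the stock Ubuntu NRPE agent), not my actual file:

# /srv/salt/nagios/client.sls -- hypothetical example
nagios-nrpe-server:
  pkg.installed: []
  service.running:
    - require:
      - pkg: nagios-nrpe-server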

os:Ubuntu

This section matches machines using the Salt "grains" system, basically system attributes. It will execute /srv/salt/repos/online.sls.
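
The same grain matching works from the command line, which is a handy way to check which minions a target will hit before wiring it into top.sls:

$ salt -G 'os:Ubuntu' test.ping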

'^NOC(\d)+-VTY2$'

This section matches hostnames using Perl-compatible regular expressions. If the hostname of the machine matches this regex, it will be assigned the listed states, executing /srv/salt/yp/install.sls, /srv/salt/yp/nsswitch.sls and /srv/salt/nfs/mount_noc.sls.
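
Once top.sls is in place, you can test the regex target from the command line and then apply everything; state.highstate tells each minion to apply all the states assigned to it in the top file:

$ salt -E '^NOC(\d)+-VTY2$' test.ping
$ salt '*' state.highstate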


How-To Debootstrap

Posted on April 23, 2014 in System

For my infrastructure purposes, I often need to install servers as fast as possible. Most of my servers come with 4 disks and one or more RAID cards.

I usually don't trust the RAID cards, so I always create one RAID 0 per disk in order to use every logical volume as if it were a real disk.

And I always use the following partition scheme:

mount   size
/boot   200M
/       *

hpacucli

# find your slot
slot=`hpacucli ctrl all show | grep -i slot | awk '{print $6}'`
hpacucli ctrl slot=$slot ld 1 delete
# create one raid0 per physical disk
for phys in `hpacucli ctrl all show config | grep physicaldrive | awk '{print $2}'`;
do
  hpacucli controller slot=$slot create type=ld drives=$phys raid=0
done;

Cleaning

If you use an old server, you must do some cleaning first.

Let's start by zeroing the first 100MB of each disk in order to be sure to erase the partition table and the MBR:

for i in {a..d} ;
do
  dd if=/dev/zero of=/dev/sd$i count=100 bs=1M
done

Afterwards, let's notify the kernel about the device changes:

partprobe

MSDOS partitions

for i in {a..d} ;
do
  parted /dev/sd$i --script -- mklabel msdos
  parted /dev/sd$i -a optimal --script -- unit MB mkpart primary 1 200
  parted /dev/sd$i -a optimal --script -- unit MB mkpart primary 200 -1
done;

GPT partitions

For GPT partitions, you need to create a BIOS boot partition: a small partition of at least 1MB.

for i in {a..d} ;
do
    parted /dev/sd$i --script -- mklabel gpt
    parted /dev/sd$i -a optimal --script -- unit MB mkpart grub fat32 1mb 2mb
    parted /dev/sd$i -a optimal --script -- unit MB set 1 bios_grub on
    parted /dev/sd$i -a optimal --script -- unit MB mkpart primary 2mb 200
    parted /dev/sd$i -a optimal --script -- unit MB mkpart primary 200 -1
done;

Installation

I prefer to use software RAID with mdadm. If you want to boot from an mdadm volume, it needs to use the 0.90 metadata format. For your /, use the RAID level you want and don't pass any metadata parameter, so it takes the default 1.2 format.

/!\ If you use GPT partitions, be aware that /dev/sdx1 is the BIOS boot partition, not your future /boot; start at /dev/sdx2.

# for msdos partitions
mdadm --create /dev/md0 --metadata=0.90 --assume-clean --raid-devices=4 --level=1 /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1
mdadm --create /dev/md1 --assume-clean --raid-devices=4 --level=6 /dev/sda2 /dev/sdb2 /dev/sdc2 /dev/sdd2

# for gpt partitions
mdadm --create /dev/md0 --metadata=0.90 --assume-clean --raid-devices=4 --level=1 /dev/sda2 /dev/sdb2 /dev/sdc2 /dev/sdd2
mdadm --create /dev/md1 --assume-clean --raid-devices=4 --level=6 /dev/sda3 /dev/sdb3 /dev/sdc3 /dev/sdd3

Let's format the RAID volumes

mkfs.ext4 /dev/md0
mkfs.ext4 /dev/md1

Let's start the debootstrap session. I use a basic /etc/apt/sources.list generated with a convenient sources.list generator.

mkdir /mnt/root
mount /dev/md1 /mnt/root
apt-get update; apt-get install -y debootstrap
debootstrap trusty /mnt/root
mount -o bind /dev /mnt/root/dev
mount -o bind /proc /mnt/root/proc
mount -o bind /sys /mnt/root/sys

# basic fstab
echo "proc            /proc   proc    defaults                0       0
/dev/md1 /       ext4    errors=remount-ro       0       1
/dev/md0        /boot   ext4    defaults                0       2
" > /mnt/root/etc/fstab

echo "#############################################################
################### OFFICIAL UBUNTU REPOS ###################
#############################################################

###### Ubuntu Main Repos
deb http://fr.archive.ubuntu.com/ubuntu/ trusty main restricted universe multiverse
deb-src http://fr.archive.ubuntu.com/ubuntu/ trusty main restricted universe multiverse

###### Ubuntu Update Repos
deb http://fr.archive.ubuntu.com/ubuntu/ trusty-security main restricted universe multiverse
deb http://fr.archive.ubuntu.com/ubuntu/ trusty-updates main restricted universe multiverse
deb-src http://fr.archive.ubuntu.com/ubuntu/ trusty-security main restricted universe multiverse
deb-src http://fr.archive.ubuntu.com/ubuntu/ trusty-updates main restricted universe multiverse
" > /mnt/root/etc/apt/sources.list

Now we can chroot into the installed volume and prepare the OS:

cd /mnt/root
chroot .
# mount /boot for the future kernel installation
mount /boot
# generate a few locales
locale-gen fr_FR.UTF-8
locale-gen fr_FR
locale-gen en_US.UTF-8
locale-gen en_US
update-locale

apt-get update
# don't forget to install mdadm on the system so it can boot correctly
apt-get install -y mdadm lvm2
# install the required kernel
apt-get install -y linux-image-generic
# install an openssh-server so you can remotely have access to the system
apt-get install -y openssh-server
# change your root password!!
echo "root:changeme"|chpasswd

Stop the services that were started inside the chroot:

/etc/init.d/ssh stop

Unmount everything, sync the last I/O and reboot:

umount /boot
exit
umount /mnt/root/dev
umount /mnt/root/proc
umount /mnt/root/sys
sync
reboot

LVM

Work in progress

Rescue

Without LVM

If you happen to boot a rescue live CD on one of these configurations, it will detect the RAID arrays, but not with the correct device names.

mdadm -S /dev/md126
mdadm -S /dev/md127
mdadm --examine --scan /dev/sda{1..4} >> /etc/mdadm/mdadm.conf
mdadm --assemble --scan

Your /dev/md0 and /dev/md1 should now come online:

mkdir -p /mnt/root
mount /dev/md1 /mnt/root
mount -o bind /dev /mnt/root/dev
mount -o bind /proc /mnt/root/proc
mount -o bind /sys /mnt/root/sys
chroot /mnt/root

Here you go!

Credits

Thanks to my friends Pierre Tourbeaux and Michael Kozma for all the advice and debugging over the years :)



ZFSonLinux

Posted on April 18, 2014 in System

At Online, we've been trying ZFS On Linux on a few services.

Here's a small how-to (and also a reminder) on how to install it and manage it:

Install

    $ apt-add-repository --yes ppa:zfs-native/stable
    $ apt-get update && apt-get install ubuntu-zfs

ZFS comes with a software-RAID-like system:

RAID type   RAID-Z type   Disks you can lose   Min disks
RAID5       raidz         1 disk               3 disks
RAID6       raidz2        2 disks              4 disks
RAID7       raidz3        3 disks              5 disks

Now we're going to create a zpool called storage

    $ zpool create -f storage raidz2 c2d{1..5}

If we want to add MOAR disks:

    $ zpool add -f storage raidz2 c2d{6..10}

Here are a few problems I've experienced:

ZFS Resilvering (replace a drive)

If you've got some spare disks, you should add them to your pool as spares:

    $ zpool add storage spare c2d11 c2d12

By doing so, if a disk fails, ZFS will automatically replace the failed disk with a spare. Personally, I prefer to do it manually. Assuming c2d4 failed and we want to replace it with c2d11, let's do this:

    $ zpool replace storage c2d4 c2d11

You will now have c2d11 resilvering your entire zpool. Once the resilver ends, the failed disk is ejected from the zpool.
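
To keep an eye on the resilver, you can poll the pool status (storage being the pool created earlier):

    $ zpool status storage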

ZFS Scrubbing

ZFS has a scrub feature to detect and correct silent errors. You could compare it to ECC RAM (RAM with error recovery). The scrub feature checks every block of your pool against a SHA-256 checksum.

You can invoke a scrub manually, or be forced to live through one when a disk fails and you have to replace it.
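
Starting one manually is a one-liner; here's a quick sketch against the storage pool from above, with zpool status to follow the progress:

    $ zpool scrub storage
    $ zpool status storage

    # cancel a running scrub if it gets in the way
    $ zpool scrub -s storage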

Recently, on a 200T system, I replaced a failed disk with a spare one. It scrubbed the whole 200T. zpool status was mentioning a duration of about 500 hours of scrubbing. Time to hang yourself.

Fortunately, there are some tunable settings in /sys/module/zfs/parameters:

    # Prioritize resilvering by setting the delay to zero
    $ echo 0 > zfs_resilver_delay

    # Prioritize scrubs by setting the delay to zero
    $ echo 0 > zfs_scrub_delay

These changes take effect immediately, and I haven't experienced any problems afterwards. Everything synced in 60 hours.

Here are a few other parameters to tune your scrub:

parameter                  default  description
zfs_top_maxinflight        32       maximum I/Os per top-level vdev
zfs_resilver_delay         2        number of ticks to delay resilver
zfs_scrub_delay            4        number of ticks to delay scrub
zfs_scan_idle              50       idle window in clock ticks
zfs_scan_min_time_ms       1000     min milliseconds to scrub per txg
zfs_free_min_time_ms       1000     min milliseconds to free per txg
zfs_resilver_min_time_ms   3000     min milliseconds to resilver per txg
zfs_no_scrub_io            0        (bool) set to disable scrub I/O
zfs_no_scrub_prefetch      0        (bool) set to disable scrub prefetching
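
If you want one of these values to survive a reboot, the usual way with ZFS on Linux is a modprobe options file rather than echoing into /sys after every boot. A sketch (pick whichever parameters you actually need):

    # /etc/modprobe.d/zfs.conf -- read when the zfs module is loaded
    options zfs zfs_scrub_delay=0 zfs_resilver_delay=0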


Symfony performance

Posted on April 18, 2014 in System

Over the last few weeks, we've stumbled upon a few performance problems on our Symfony2 backend. For the record, it's a 50k-line codebase, with lots of features and custom bundles.

autoloader

On the first request, Symfony's PHP code must discover all the classes of your project. It does a lot of stat/open/read/close calls on each file of your project. We've observed 100% CPU usage for a few seconds, the time required for the autoloader to discover everything.

By default, the Composer autoloader is dumped without the --optimize flag.

So we had to customize our Fabric deployment script by adding:

  $ php composer.phar dump-autoload --optimize

For example, our autoloader file was about 300 lines before. With the --optimize flag, it now has more than 5,000 lines.

To be continued with APC support and the PHP 5.5 OPcache.

