Introducing OpenStack through Ironic (II) Dec 19, 2015

After the philosophical first part, now it is time to get to work. Let's see how to use Ironic to show how beneficial it is to manage resources through the OpenStack APIs, starting from zero and using Ansible to create a standalone Metal as a Service.

Metal as a Service implementation with Ansible

Ironic concepts are explained in the wiki of the GitHub repository. This section focuses on the Ansible implementation, assuming:

  • All clients are allowed to enroll/deploy/decommission servers. A user just needs to know the URL of the Ironic API and provide an empty (fake) token (no Keystone integration).
  • The network is plain; at this point there is no need to control switches and ports (no Neutron integration). The Ironic server and the bare metal clients are connected directly to the PXE network (default native VLAN) using one NIC. The same applies to the IPMI network; it is possible to set up another NIC to specifically access that network.
  • Images are saved on the Ironic server, on local disk, and are made available to the other servers over HTTP when using the Ironic Python Agent driver (IPA), or over TFTP (served by Ironic-Conductor) when using the default PXE driver. This is not needed if you want to use another HTTP server or even object storage (like S3 or Swift).

The server deployment was tested on Ubuntu Trusty, and the setup playbook installs all packages from the distribution repositories and the official Ubuntu Cloud Archive repository, so no development packages are installed, only the official packages needed at runtime.
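As a rough reference, enabling that repository by hand looks like the following sketch (the setup playbook takes care of it for you, and the Cloud Archive pocket used here, liberty, is just an assumption that depends on the Ironic release you target):

# Enable the Ubuntu Cloud Archive pocket on Trusty (pocket name is an assumption)
sudo apt-get install -y software-properties-common
sudo add-apt-repository -y cloud-archive:liberty
sudo apt-get update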

The Ansible roles have no dependencies on each other, so they can be reused in other projects, and they accept more parameters than the ones required for this setup:

  • roles/mysql to setup a MySQL server and databases needed for Ironic services.
  • roles/rabbitmq to setup a message queue using RabbitMQ.
  • roles/ironic to setup the OpenStack Ironic daemons and services.
  • roles/dnsmasq to setup a PXE server to use with Ironic conductor.
  • roles/configdrive (only for the client playbooks)
  • roles/monit (optional) to setup process control with Monit.
  • roles/nginx (optional) to setup HTTP image repo server (for IPA).

The last two roles are not strictly needed if you do not want to use Monit and you already have another HTTP repository where images can be stored and reached from the bare metal servers. Note that the configdrive role is only needed for the client playbooks (add-baremetal.yml and del-baremetal.yml), not for the Ironic server, and it also makes use of the HTTP server; so, as a starting point, it is probably better to just use it and introduce your modifications afterwards. The roles were implemented following these patterns, trying to keep them simple and reusable, but the idea can easily be implemented with other provisioning tools.

Installing the Ironic server

The playbook setup-ironic.yml calls the previous roles with the proper parameters. The deployment can be split across different servers (database, messaging, api, conductor) by defining the hosts/IPs in the inventory file hosts/ironic. That inventory file also contains the main variables used in the playbooks, such as the DHCP range for PXE and the passwords for MySQL and RabbitMQ. baremetal_server_dhcp_params defines a range for the ramdisk, but note that it can be completely different from the final IP address the bare metal host will get (because that one can be defined by Cloud-Init); for this reason the lease time can be small (~30m) and the IP range can be short (~20 IPs).

[provisioning:children]
database
messaging
ironic
client
api
conductor

[provisioning:vars]
# Server side variables
baremetal_server_database_rootpass=root
baremetal_server_database_name=ironic
baremetal_server_database_user=ironic
baremetal_server_database_pass=mysql
baremetal_server_msg_vhost=ironic
baremetal_server_msg_user=ironic
baremetal_server_msg_pass=rabbitmq
baremetal_server_dhcp_params=["10.0.1.100", "10.0.1.120", "30m"]
baremetal_server_dhcp_domain=pxe.local
baremetal_server_http_root_path=/var/lib/ironic/http
baremetal_server_images_path=/var/lib/ironic/http/images
baremetal_server_images_deploy_path=/var/lib/ironic/http/deploy
baremetal_server_configdrive_path="/var/lib/ironic/http/metadata"
# Client variables
baremetal_ironic_token=" "
baremetal_domain=example.com
baremetal_nameservers=[ "8.8.8.8" ]
baremetal_ssh_key="ssh-rsa AAAAB3NzaC1yc2EAAAADAQAB...tion"

The current implementation installs all the services on the same host. Apart from the MySQL and RabbitMQ services, it is possible to split Ironic-API and Ironic-Conductor onto different servers (or deploy multiple Ironic-Conductors) by toggling the parameters ironic_api and ironic_conductor of the ironic role:

 - hosts: api
   sudo: true
   roles:
   - role: ironic 
     ironic_api: true
     ironic_conductor: false
#    ...

 - hosts: conductor
   sudo: true
   roles:
   - role: ironic
     ironic_api: false
     ironic_conductor: true
#    ...
   - role: dnsmasq

Note that each Ironic-Conductor host also needs the dnsmasq role. So, in the Ansible inventory file (hosts/ironic in this case), remove api and conductor from the provisioning section and add:

[ironic:children]
api
conductor

It is possible to have an HTTP image repository on each conductor by also adding the nginx role, but then you have to take care of distributing the images to all the Ironic-Conductor servers.

To run the playbook, just change the IPs and SSH user in the inventory file to match your environment. Note that Ansible thinks it is deploying on different hosts (the reason why there are different host names pointing to the same IP is to avoid overwriting the Monit control scripts), but you can simplify that configuration by changing the hosts definition of the playbooks according to your inventory file (have a look at the Vagrant playbook site.yml!). Finally, remember that the playbook is idempotent, so run it as many times as you need; it will not restart the services unless it is necessary:

ansible-playbook -i hosts/ironic setup-ironic.yml
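Once it finishes, a quick sanity check is to query the Ironic-API root, which lists the supported API versions and does not require a token (the hostname is a placeholder for your server):

# The Ironic API should answer on port 6385
curl http://<IRONIC_SERVER>:6385/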

Ironic Python Agent ramdisk

IPA is an agent for controlling and deploying bare metal machines managed by Ironic, running in the machines' memory:

  1. The agent is a small python application that is meant to be embedded in a ramdisk.
  2. The target machine boots this ramdisk, which starts the agent process.
  3. The agent then exposes a REST API that Ironic uses to interact with the agent.
  4. The API calls pluggable backends to perform actions on the machine.

One of those actions is deploying an image on the physical machine; others could be wiping the disk, creating a RAID setup …

The IPA ramdisk image must be copied to the Ironic-Conductor(s), because it is used by Dnsmasq to offer it to the unprovisioned machines via PXE. The agent_ipmitool driver on the conductor will power the machine on and control the boot order (PXE vs disk). The playbook that installs the Ironic server, setup-ironic.yml, already does it for you …

- name: Download IPA CoreOS ramdisk
  get_url:
    url: http://tarballs.openstack.org/ironic-python-agent/coreos/files/coreos_production_pxe_image-oem.cpio.gz
    dest: "/coreos_production_pxe_image-oem.cpio.gz"
    force: no

- name: Download IPA CoreOS kernel
  get_url:
    url: http://tarballs.openstack.org/ironic-python-agent/coreos/files/coreos_production_pxe.vmlinuz
    dest: "/coreos_production_pxe.vmlinuz"
    force: no

… but be careful: those tasks run only once. When Ironic is updated they will not update the IPA ramdisk; to force the update, just rename those files and run the playbook again to download the new version (or do it manually). Also, if you install the Ironic Kilo version (or any version other than the latest), the IPA you get is not going to be compatible with it, because these tasks always download the latest release.
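For example, forcing a refresh of the IPA ramdisk can be as simple as this sketch (assuming the destination paths used by the tasks above):

# On the Ironic-Conductor host: move the old files aside so get_url downloads them again
sudo mv /coreos_production_pxe.vmlinuz /coreos_production_pxe.vmlinuz.old
sudo mv /coreos_production_pxe_image-oem.cpio.gz /coreos_production_pxe_image-oem.cpio.gz.old
# Re-run the setup playbook to fetch the latest IPA build
ansible-playbook -i hosts/ironic setup-ironic.yml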

Take into account that, when defining a new physical machine with the Ironic client, one has to specify the paths to the IPA ramdisk kernel and image relative to the Ironic-Conductor service. This is not the ideal situation, because a client should not know those details, but for now the ugliness can be mitigated by the playbooks, hiding those details from the end users.
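To illustrate what that means, this is roughly the kind of call the client playbooks issue through the Ironic client (the IPMI values are placeholders, and the exact driver_info keys may differ between Ironic releases, so treat it as a sketch):

ironic node-create -d agent_ipmitool \
  -i ipmi_address=<IPMI_IP> -i ipmi_username=<IPMI_USER> -i ipmi_password=<IPMI_PASS> \
  -i deploy_kernel=file:///coreos_production_pxe.vmlinuz \
  -i deploy_ramdisk=file:///coreos_production_pxe_image-oem.cpio.gz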

Creating images to deploy on the physical machines

The last step needed to start using our Metal as a Service is creating the images to deploy on the physical machines. Because we are using the IPA ramdisk to deploy the image, the images have to meet some requirements to be compatible with this deployment method:

  • Whole disk image with the boot loader included.
  • If not using GPT, the image cannot have more than 3 primary partitions.
  • Cloud-Init running at boot time with config-drive support.
  • RAW or qcow2 image format.

The second point is because IPA will create one partition for the config-drive after the image is deployed; if there is no free primary partition (the maximum is 4), it will not be able to do that and the process will fail.
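Once the image is built (see the Packer steps below), an easy way to double-check the layout is with libguestfs, which gets installed in step 2; the image path matches the Packer output used later:

# List the partitions of the generated image: no more than 3 primary partitions
# should be in use, so IPA can still add the config-drive partition
virt-filesystems -a output-ubuntu-14.04/trusty.qcow2 --partitions --long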

In order to automate the process I have decided to use Packer instead of the official disk image tools, because it allows defining LVM support on the hardware for managing snapshots, resizing disks, moving filesystems independently of the physical disks, etc. Also, I wanted a simple way to add more features (drivers, programs, configurations, …) using scripts, and Packer makes that easy to implement.

The repository has some examples of how to create images; let's see how to install and run Packer to create an Ubuntu Trusty image:

  1. Install Packer

    # Download packer (choose the latest version)
    wget https://dl.bintray.com/mitchellh/packer/packer_0.8.6_linux_amd64.zip  -O /tmp/packer.zip
    sudo unzip /tmp/packer.zip -d /usr/local/bin
    rm -f /tmp/packer.zip
    
  2. Install Qemu-kvm to run a local VM and create the template

    # Install the packages
    sudo apt-get install qemu-utils virtinst virt-viewer libguestfs-tools
    sudo apt-get install qemu-kvm qemu-system libvirt-bin bridge-utils virt-manager
    # Your user should be part of the libvirtd group to manage vms
    sudo adduser $USER libvirtd
    sudo adduser $USER kvm
    # change the current group ID during this session.
    newgrp libvirtd
    newgrp kvm
    # Check if everything is working:
    virsh -c qemu:///system list
    
  3. Run Packer using a template definition. Have a look at the preseed file in the folder ubuntu/http to define an alternative disk layout for the image. The rest of the configuration settings embedded in the image are defined by the scripts in the folder ubuntu/scripts.

    # Building an ubuntu baremetal image
    cd ubuntu
    packer build  ubuntu-14.04.latest-amd64.json
    # output image in the new folder output-ubuntu-14.04 
    
  4. Finally, just copy the image to the HTTP repository (Nginx, which in this implementation is on the same server as the Ironic-Conductor(s)) so it is ready to use, and work out the md5 checksum:

    md5sum output-ubuntu-14.04/trusty.qcow2 > output-ubuntu-14.04/trusty.md5
    # Copy the md5 and qcow to the location defined by baremetal_server_images_path 
    # (on the ironic role is ironic_pxe_images_path)
    scp output-ubuntu-14.04/trusty.*  <IRONIC_SERVER>:/var/lib/ironic/http/images/
    

The image has to be available at http://<IRONIC_SERVER>/images
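A quick check that Nginx is serving both the image and its checksum (hostname placeholder as before):

curl -I http://<IRONIC_SERVER>/images/trusty.qcow2
curl http://<IRONIC_SERVER>/images/trusty.md5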

Deploying a physical machine

Once the image is on the Ironic server, it is time to use the client playbooks to deploy a physical server. Of course, you can use the Ironic command line directly, but that would be a bit hard to manage with a lot of servers (also for creating the cloud-config configuration).

Those playbooks rely on the Ironic client, but in the future, with Ansible 2.0, they will use the OpenStack os_ironic and os_ironic_node modules to simplify the tasks. Anyway, they are good examples for following the steps needed in the manual process.

Also note that now it is time to use the configdrive role, which is mainly in charge of:

  • Creating the network configuration files for the nodes. It is capable of creating complex network configurations (bonding, VLANs, …) with static IPs, so when the server boots for the first time it does so with the proper network configuration. It works on RedHat and Debian derivative distributions.
  • Providing the user_data script for Cloud-Init, which is in charge of creating additional users, extending the LVM volumes and the filesystems, triggering other hooks, etc.
  • Creating the ISO volume with all the configuration files in the format expected by Cloud-Init.

Those actions are executed on the HTTP (Nginx) repository, because the image and the config-drive specified when the physical server is defined will be picked up from there by the IPA agent in order to be dumped onto the physical disk. The config-drive configuration is generated in two formats: a folder named after the <UUID> of the physical machine (useful only for humans, it can be deleted; see the configuration parameters of the role), and the compressed ISO volume of the same folder, called <UUID>.gz. Remember you can check them at http://<IRONIC_SERVER>/metadata.
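For instance, once a node has been enrolled, you can fetch its generated config-drive from the repository (a sketch; <UUID> is the Ironic node UUID, and the path matches baremetal_server_configdrive_path above):

curl -O http://<IRONIC_SERVER>/metadata/<UUID>.gz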

Before deploying the machine, have a look at the file vars/baremetal.yml to check and, if needed, redefine these variables:

baremetal_conductor_server: "<IRONIC_SERVER>"
baremetal_ironic_url: "http://<IRONIC_SERVER>:6385"

Depending on the inventory file, you will have to point to the hostname (or IP address) of the Ironic-Conductor and set up the Ironic-API URL (in this configuration both point to the same server).

It is also good to have a look at some settings related to the image you want to deploy. Edit, copy or create a file vars/baremetal_image-<name>.yml, in this case vars/baremetal_image-trusty.yml (we have created an Ubuntu Trusty image):

---
baremetal_driver: "agent_ipmitool"
# Paths relative to the Ironic-Conductor server(s)
baremetal_deploy_kernel: "file:///coreos_production_pxe.vmlinuz"
baremetal_deploy_ramdisk: "file:///coreos_production_pxe_image-oem.cpio.gz"
# Depends on each image
baremetal_image_type: "Debian"
baremetal_image: "http://<IRONIC_SERVER>/images/trusty.qcow2"
baremetal_image_checksum:
baremetal_image_kernel:
baremetal_image_ramdisk:
baremetal_image_rootsize:

The image checksum can be empty if you have created the md5 file <name>.md5 next to the qcow2 image on the HTTP repository. The rest of the parameters are not needed because it is a complete disk image, deployed by the agent_ipmitool driver.

Now, finally, let’s deploy a machine! Create a file with the parameters of the server in the servers folder, for example servers/test-server-01.yml:

---
# Specific variables for this server
baremetal_ipmi_ip: 10.0.0.1
baremetal_ipmi_user: IPMIUSER
baremetal_ipmi_pass: IPMIPASS
# Main network parameters, this variable is needed for the playbook.
# It can be in a different VLAN or network than the PXE one
baremetal_fqdn: test-server-01.example.com
baremetal_mac: 00:92:fa:ab:0e:ba
baremetal_ip: 10.100.100.10
baremetal_netmask: 255.255.255.0
baremetal_gw: 10.100.100.1
# Image name, defined in vars/baremetal_image-<name>.yml
baremetal_os: trusty
# If you change the interface name to eth1 (for example)
# Remember to change the backend devices!
baremetal_network_list:
  - device: "bond0"
    type: "bond"
    bond_mode: "1"
    address: ""
    netmask: ""
    gateway: ""
    nameservers: ["8.8.8.8"]
    domain: ""
    backend: ["eth0", "eth2"]
  - device: "eth1.500"
    type: "vlan"
    address: "192.168.100.100"
    netmask: "255.255.255.0"
    backend: ["eth1"]
# Remember, the image needs support for bonding, vlan, bridge ...
# in order to get them working.

Optionally, with the same name but a different extension, define a user_data file for Cloud-Init. In this case it will create a user and expand the LVs and filesystems to fill all the available space:

#cloud-config

users:
  - name: admin
    lock_passwd: False
    plain_text_passwd: 'admin'
    gecos: Admin User
    groups: [wheel, adm, audio, cdrom, dialout, floppy, video, dip]
    sudo: ["ALL=(ALL) NOPASSWD:ALL"]
    shell: /bin/bash

runcmd:
  - lvextend -r -l +30%FREE /dev/mapper/system-root
  - lvextend -r -l +60%FREE /dev/mapper/system-var
  - lvextend -r -l +10%FREE /dev/mapper/system-tmp

And run the Ansible playbook, providing the name of the server/file:

ansible-playbook -i hosts/ironic -e id=test-server-01 add-baremetal.yml
Server Name: test-server-01

PLAY [Define and deploy physical servers using Ironic] ************************ 

[...]

And done! Just check with the Ironic client to see the status:

export OS_AUTH_TOKEN=" "
export IRONIC_URL=http://<IRONIC_SERVER>:6385/
ironic node-list 
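To follow a specific node in more detail, node-show includes the provision_state and power_state fields; the deployment is finished when provision_state reaches active (the UUID comes from the node-list output):

ironic node-show <UUID>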
