Every developer knows that writing code to fix a problem is great, but they’ll also tell you horror stories about coming back to that code months or years later to make a change and not knowing how to verify that their changes didn’t break anything further down the line. The same problem exists in the infrastructure and devops world when you automate your infrastructure with code.

Recently, I’ve been faced with updating the software my Ansible role library deploys for a project, and I have needed a way to ensure that all of the upgrades go smoothly. Manual tests aren’t really an option due to the number of combinations that would need testing, and since some environments still require the older versions, I have to make sure my changes don’t break any existing installs either. Luckily, while trying out Chef Rally on a lark I ran into both Kitchen and Inspec.

Since both of these tools come from the Chef ecosystem, their integration with Chef is a lot cleaner, but that doesn’t mean you can’t use them to make your Ansible life easier! With a little cleverness and a very helpful whitepaper from the Chef blog, we can get Kitchen to manage our VMs and test scenarios while Inspec checks that everything in our roles works as expected.

Why Test Roles?

Why are we testing the roles anyway? Doesn’t it make more sense to test the final configuration we intend to push to our infrastructure? Yes! But why not both? Think of testing the roles individually as unit tests: they are quicker to run and perform basic sanity checks for the most common scenarios. They allow us to efficiently confirm that changes don’t break the software we’re installing and make it much simpler to test whether a software upgrade runs afoul of the role logic. Full integration or functional testing is still valuable for checking the interactions between different roles, but it may take far longer to run and is thus better suited to a scheduled testing cycle than to an ad-hoc or on-push one.

Project Structure

Below you see a mock project structure laid out including a role library and a test directory for the role. Within that test directory we can see multiple Ansible playbooks and Ruby files that define the Inspec tests for each of our different scenarios.

project/
  |- config/
  |- roles/
    |- role/
      |- defaults/
      |- tasks/
      |- templates/
      ...
      |- tests/
        |- kitchen.yml
        |- scenarios/
          |- default/
            |- main.yml
            |- test.rb
          |- additional-scenario/
            |- main.yml
            |- test.rb

Software Requirements

To get this experiment off the ground we’ll need several software packages as well as one Ruby gem that lets Kitchen drive Ansible.

  1. VirtualBox - quick local VMs
  2. Vagrant - an easier way to control those VMs
  3. Ansible - our provisioning tool of choice
  4. Inspec - the validation tool
  5. Kitchen - orchestrate all of the lower layers efficiently
  6. kitchen-ansible - Kitchen plugin to install and run Ansible

I’m on a Mac, so most of my utilities are installed through Homebrew, which seems to work pretty well. The only hiccup I ran into was the lack of a Homebrew package for kitchen-ansible. To install it I had to run the Ruby gem executable directly and step through a few confirmation prompts. This also seems to clobber the link Homebrew makes for the kitchen-vagrant plugin, so you may need to manually reinstall that with gem as well.

The Role

Our example role is going to be dead simple because that really isn’t the focus here. We’re going to install Tomcat and assume that the role directory already defines the package name, the service name, the path to server.xml, a default HTTP port, and a template for server.xml. So, our simplified tasks/main.yml looks like the following:


---
- name: Install Tomcat.
  package:
    name: '{{ tomcat_package_name }}'
    state: present
- name: Template server.xml
  notify: restart tomcat
  template:
    src: server.xml.j2
    dest: '{{ tomcat_server_xml_path }}'
- name: Start and enable Tomcat
  service:
    name: '{{ tomcat_service_name }}'
    state: started
    enabled: yes

Now that we have the functional part of our role out of the way we can move on to setting up our testing suite for it. If you need more info on writing Ansible playbooks or how roles are structured, the documentation is quite helpful and well structured.
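For reference, the tasks above lean on a handful of variables and a "restart tomcat" handler that are assumed to exist already. A minimal sketch of what those two files might contain is shown here. The values (package and service name tomcat, the /etc/tomcat/server.xml path) are assumptions based on a stock CentOS 7 install, and the tomcat_http_port variable name is hypothetical, so adjust them to match your own role.

```yaml
# defaults/main.yml (assumed values; adjust for your platform)
---
tomcat_package_name: tomcat
tomcat_service_name: tomcat
tomcat_server_xml_path: /etc/tomcat/server.xml
tomcat_http_port: 8080   # hypothetical variable name, consumed by server.xml.j2

# handlers/main.yml (a separate file; the name must match "notify: restart tomcat")
---
- name: restart tomcat
  service:
    name: '{{ tomcat_service_name }}'
    state: restarted
```

Keeping these values in defaults rather than hard-coding them in the tasks is what lets a test scenario override them later.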

Setting Up The Default Scenario

We will start by creating the tests directory within our role and the scenarios directory below that. Finally, we will create the default directory and a main.yml along with a test.rb inside it. The idea here is to have a baseline test that runs the role with few or no additional options passed to it beyond its defaults. This means our Ansible playbook, main.yml, is going to be really simple to write!

---
- hosts: test-kitchen
  become: yes
  roles:
    - tomcat

Now we need to attack the verification side of this puzzle, and we do that by writing a little Ruby for Inspec to run against our server after Ansible has done its provisioning. You can look up all the different resources that Inspec has available in the documentation, but for now we will put the following into test.rb:

# Check that the Tomcat service is running and enabled.
describe service('tomcat') do
  it { should be_installed }
  it { should be_enabled }
  it { should be_running }
end

# Check that it is listening on port 8080
describe port(8080) do
  it { should be_listening }
end

This little snippet tells Inspec that there are four tests it should run: three against the Tomcat service and one against port 8080, which we have defined as the default HTTP port for the role. If all of the conditions are true when Inspec runs, it will pass and return zero as an exit code. If any of them fail, Inspec will exit with a non-zero code and print out which of the conditions was not met.
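The same pattern extends to non-default scenarios. As a sketch, a playbook for the additional-scenario directory from the project structure could override one of the role defaults. The tomcat_http_port variable name and the 9090 value are assumptions for illustration; use whatever variable your role actually exposes for the HTTP port.

```yaml
# scenarios/additional-scenario/main.yml
---
- hosts: test-kitchen
  become: yes
  roles:
    - role: tomcat
      vars:
        tomcat_http_port: 9090   # hypothetical variable; overrides the role default
```

The test.rb that sits next to this playbook would then check that port 9090, rather than 8080, is listening.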

Getting Into The Kitchen

Now it’s time to wire it all together with Kitchen so it’s easy to manage all the different platforms and scenarios we want to run. In the top level tests directory we will create a new file called kitchen.yml which will be how we configure Kitchen.

---
driver:
  name: vagrant

provisioner:
  hosts: test-kitchen
  name: ansible_playbook
  roles_path: ../../
  require_ansible_repo: true
  ansible_verbose: true
  ansible_version: latest
  require_chef_for_busser: false

platforms:
  - name: geerlingguy/centos7

verifier:
  name: inspec

suites:
  - name: default
    provisioner:
      playbook: scenarios/default/main.yml
    verifier:
      inspec_tests:
        - path: scenarios/default/test.rb

The key elements here are the provisioner and suites keys, which define how we will set up each of the test servers and which configuration goes with which set of tests. You’ll see that under provisioner the roles_path key is a relative path, so that Ansible can resolve any other roles the tested role depends on. This is also where you configure which version of Ansible runs your playbooks, so developing a methodology for keeping this in sync with the version you use to deploy is vital.
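One way to keep the versions in sync is to replace latest with an explicit version. The fragment below is only a sketch: the version number is an arbitrary example, and whether a given version can actually be installed depends on the package source the provisioner uses on each platform, so check the kitchen-ansible documentation before relying on it.

```yaml
provisioner:
  name: ansible_playbook
  hosts: test-kitchen
  roles_path: ../../
  require_ansible_repo: true
  ansible_verbose: true
  ansible_version: 2.9.27   # example pin (an assumption); match the version you deploy with
  require_chef_for_busser: false
```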

The real fun is in the suites key though, as this is where we match up our playbooks with our tests. You will see that each suite is given a unique name and that we can call out a specific playbook for each scenario we’d like to test. You’ll also see that in the verification step we can list multiple test files for Inspec to run before the suite is considered to have passed. This can be handy if you have a special case where you still need to test all of your default conditions but have a few additional constraints as well. You can reuse the code you already have and keep different concerns in their own files, making everything easier to maintain going forward.
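As a sketch of that reuse, a second suite might run the baseline checks and then its own. The additional-scenario name and paths come from the mock project structure; what that scenario's playbook and test file contain is left as an assumption.

```yaml
suites:
  - name: default
    provisioner:
      playbook: scenarios/default/main.yml
    verifier:
      inspec_tests:
        - path: scenarios/default/test.rb
  - name: additional-scenario
    provisioner:
      playbook: scenarios/additional-scenario/main.yml
    verifier:
      inspec_tests:
        # Reuse the baseline checks, then add the scenario-specific ones.
        - path: scenarios/default/test.rb
        - path: scenarios/additional-scenario/test.rb
```

Reusing the default test file only makes sense when the scenario leaves the default expectations intact. A scenario that moves the HTTP port, for example, would need its own port check instead of the baseline one.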

Running Your Tests

The first order of business is to get Kitchen to bring up our VMs and run our provisioner against them. You do this by going into your tests directory and running kitchen converge, which boots a VM for each of your platform and suite combinations and then runs the given playbook against it. If at any point a provisioner fails the others will continue and the error will be logged for you to review later.

Once all of that is done, you can tell Kitchen to run your Inspec tests with kitchen verify to make sure the playbooks did what you expected. This again checks each of your platform and suite VMs individually and logs any failures to a file for later investigation. You can also target specific VMs by passing a name or regular expression that matches one or more of the instances you’ve defined. This lets you log into instances that did not finish successfully, attempt a fix, and then retest without the hassle of rebuilding the machine or rerunning tests you already know succeeded.

A handy shortcut through this process is the aptly named kitchen test command, which combines kitchen converge and kitchen verify into a single command and, as a bonus, destroys the VMs that pass their tests so you don’t have to. This means that you could reasonably set up a CI pipeline that runs these tests automatically and only have to check on VMs when something goes wrong. When combined with code quality tools like ansible-lint, you can start to assemble a proper software development toolkit around your devops practice.

Adding a Platform

Sometimes you get lucky and can choose to deploy your roles to a specific version of a specific OS, but more often you need to deploy them across a range of different platforms. Thankfully, Kitchen makes testing across platforms really easy! We’ll edit kitchen.yml and add another Vagrant box entry under the platforms key, like below.

platforms:
  - name: geerlingguy/centos6
  - name: geerlingguy/centos7

Now when we run kitchen test it will spin up two VMs, one CentOS 6 and the other CentOS 7, and run our playbooks and tests against both. Then, as in the single-platform case, if either of them fails a test, Kitchen will save the whole run log to a file for us to review and leave the VM up for troubleshooting.
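If the long box names make instances awkward to refer to, one option is to give each platform a short name and point the Vagrant driver at the box explicitly. This sketch assumes the box setting of the kitchen-vagrant driver; the short platform names are arbitrary.

```yaml
platforms:
  - name: centos-6
    driver:
      box: geerlingguy/centos6
  - name: centos-7
    driver:
      box: geerlingguy/centos7
```

Kitchen builds instance names from the suite and platform names, so the default suite on these platforms would show up as default-centos-6 and default-centos-7, which are easier to pass to kitchen converge or kitchen verify.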

In Summary

Not only is testing your roles and infrastructure code good practice, but we’ve seen that it can be fairly painless as well. By applying tools and methods we already know are effective in the software development world, we can avoid a lot of pain and suffering when we deal with our other automations. While Kitchen and Inspec were originally developed around Chef, we have seen how they can be leveraged to provide an effective and handy testing framework for Ansible and other configuration management tools.