Devops - Intro to Infrastructure As Code

This post is backdated to the original presentation date to preserve context and show my growth over the last three years.

- Thomas (2020-01-30)

Since our last episode I’ve spent half a year focused exclusively on DevOps topics at the day job. With the added experience and perspective, I was excited when I was asked to give another talk at Sapient about how to actually implement DevOps in practice. The slides that accompanied that talk can be found here, along with example code here and here. Hopefully, this post will fill in any blanks they leave behind. We will cover how to start virtualizing and deploying your infrastructure with code. We will also touch on repository structure for these kinds of projects and on the factors to consider when selecting your tools.

On our last episode…

Last time we covered what DevOps entails in the theoretical and cultural sense as well as some of the benefits it offers to an organization. This time around we’re going to cover some of the technical aspects of implementing a DevOps practice starting with my favorite topics: infrastructure and automation. We will also briefly touch on how version control affects this process and some suggested practices for organizing your code.

We’re covering those first because they are among the easiest to start practicing and they provide the quickest benefits. The combination of virtual servers, infrastructure configuration, and build automation removes a huge number of rote tasks and allows your team to focus on what is really important: delivering value!

Version Control … Really, Use It!

I know I beat on this drum a lot, but version control is vital to a successful DevOps practice. If you can’t track what configuration is applied to your infrastructure, you have only made manual configuration faster; you have not made it any easier to know what state you are in. Every script you write and every bit of configuration you do should end up in version control somewhere, and you should strive to execute only scripts that have been versioned and code reviewed by your peers.

Repository Layouts

On a more practical side, there are issues with repository layout that you should address early on in any DevOps rollout. It isn’t impossible to change course after the fact, but it is much easier to do right the first time. The main concern is how tightly coupled your reusable components will be to a particular project or set of configuration. For example, in an Ansible and Vagrant based project, one possible repository structure is the “all-in-one”:

$ project/
|- .gitignore
|- Vagrantfile
|- site.yml
|- roles/
  |- nginx/
    |- defaults/
      |- main.yml
    |- tasks/
      |- main.yml
    |- README.md
|- group_vars/
  |- all.yml
  |- webservers.yml

This layout is advantageous if keeping your configuration and roles in sync is very important and if your roles are so heavily customized to the project that they are unlikely to be reusable elsewhere. It is also easier to deliver to the client, since you can just hand over the entire repository instead of needing to describe the exact commit or branch you were using when you deployed their project.

On the other hand, if you are working with multiple projects for the same client or projects that share many standard components, a split layout may be a better choice. You can actually see an example of a split layout here for the configuration and here for the roles with the example code from this presentation.

$ project/
|- .gitignore
|- Vagrantfile
|- site.yml
|- ansible.cfg
|- group_vars/
  |- all.yml
  |- webservers.yml
$ library/
|- nginx/
|- jenkins/
|- php/

This layout allows you to make updates to a role, for example to improve security, and have all of your dependent projects benefit from it automatically. It also allows you to make a portable collection of reusable components that will get you up and running faster in new projects.

Repo Tips and Tricks

When thinking about your version control, there are a few issues to keep in mind so that everything runs smoothly. First of all, you will want to plan ahead so you aren’t faced with a single SSH key of failure. The keys you use to administer your servers should be stored somewhere central and secure and should be backed up regularly. The last thing you want is for your lead engineer to lose a laptop and leave you unable to log into anything because nobody has the master key anymore. In the same vein, you need to be careful about committing application credentials or passwords to your repository in plain text. Each of the major configuration management tools has its own implementation of secret storage, so be sure you use it! For Ansible, Ansible Vault is easy to use and can even encrypt single strings within a larger unencrypted file to make searching and diffing easier.
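As a rough sketch of what an inline-encrypted value looks like, the variables file below keeps one ordinary setting in plain text next to a single vaulted string. The file contents, both variable names, and the secret are hypothetical, and the ciphertext is shown as a placeholder rather than real output.

```yaml
# group_vars/webservers.yml (hypothetical contents, for illustration only)

# Plain-text settings stay searchable and diff cleanly.
nginx_worker_processes: 2

# Generated with something like:
#   ansible-vault encrypt_string 'example-secret' --name 'db_password'
# The !vault tag tells Ansible to decrypt this value at run time.
db_password: !vault |
  $ANSIBLE_VAULT;1.1;AES256
  <hex ciphertext emitted by ansible-vault goes here>
```

Running a playbook that reads this file then requires the vault password, for example by passing --ask-vault-pass to ansible-playbook.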

Lastly, an issue that only affects the split repository layout is your configuration management tool being unable to find its role library, or finding that its library is out of sync with the configuration. With Ansible, the ansible.cfg file allows you to set a relative path to your role library (the roles_path setting), and it is advantageous to create a team or project standard for where the library will be checked out. Additionally, while it can be advantageous to have your development environments configured from the latest and greatest, you should use tags and specific releases to ensure that your higher environments have exactly the configuration versions you expect them to.
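One possible way to pin role versions is a requirements file consumed by ansible-galaxy. This is an alternative to checking the library out by hand, not the approach used in the example repositories, and it assumes each role lives in its own Git repository rather than in one shared library repository. The URLs, role names, and tags below are all hypothetical.

```yaml
# requirements.yml (hypothetical): pin each role to a tagged release
- name: nginx
  src: https://git.example.com/devops/role-nginx.git
  scm: git
  version: v1.2.0   # may be a tag, a branch, or a commit hash

- name: jenkins
  src: https://git.example.com/devops/role-jenkins.git
  scm: git
  version: v0.9.1
```

Installing with something like ansible-galaxy install -r requirements.yml -p ../library would place the pinned roles in the directory that ansible.cfg points to; the ../library path is likewise an assumption.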

Virtualization

When selecting your virtualization tools, the biggest factors are going to be your client’s requirements and the resources you have available to work with. If your client needs their application deployed entirely in-house, basing your configuration on AWS may not be the best use of time. For most of my local development I use VirtualBox controlled by Vagrant because it gives me a quick and portable platform to test against.

Vagrant and VirtualBox

Once you have Vagrant and VirtualBox set up on your host, creating VMs is really quite easy. Vagrant uses the concept of “base boxes”: semi-standardized VM images from which you create your individual virtual machines. By writing a Vagrantfile you can configure the resources for any number of VMs that you would like to create and control. You can see an example of a basic configuration in the 01_vagrant directory of the devops-demo-config repository.

Vagrant and AWS

Vagrant also supports other VM providers, like AWS or Azure, which lets you set up test environments in the cloud. These platforms usually need additional credentials, so it is important to keep your secrets separate and out of your version control. Also, unlike with VirtualBox, you need to specify a dummy base box along with an AMI ID to determine what VM image you deploy. Many AMIs do not use the default vagrant user or SSH key that Vagrant expects, so you will need to provide the correct username for your AMI as well as keypair information to successfully log in to your new VM. A full example can be found in the 02_aws directory of the configuration repository.

Automation and Infrastructure as Code

Creating VMs quickly is all well and good, but it isn’t terribly helpful if you don’t have a repeatable way to configure them. This is where automation and the idea of infrastructure as code really start to shine. Vagrant integrates with several different configuration management tools; pick one that suits your organization and requirements, then get to work!

Ansible

My tool of choice is Ansible since it is written in Python and uses a fairly simple YAML based configuration. An executable “script” in Ansible is called a playbook, and for simple tasks a single playbook may be sufficient for your needs. For example, if you look in 03_ansible you’ll see a simple playbook that only installs the Git client. However, with larger projects it makes sense to bundle up reusable parts so they can be shared between multiple instances.
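For readers who have not seen a playbook before, a minimal sketch of one that installs Git might look like the following. This is not the file from 03_ansible; the play name, host pattern, and task name are assumptions chosen for illustration.

```yaml
# site.yml: a minimal playbook sketch (names and host pattern are assumptions)
- name: Install the Git client
  hosts: all        # run against every host in the inventory
  become: true      # package installs need elevated privileges
  tasks:
    - name: Ensure git is installed
      package:      # generic module that delegates to the OS package manager
        name: git
        state: present
```

Because the task declares a desired state rather than a command to run, applying the playbook a second time should report no changes.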

Ansible roles are how you bundle up reusable components within Ansible. In most cases one install of MariaDB is much like another save for a password or two, so why rewrite it each time you need to do an install? If you look at the devops-demo-roles repository you will see a very basic role for NGINX. The advantage of this role system is that if I have this role deployed in a hundred different places and need to make a security change, I only have to update it in one place: the role.
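To illustrate how a project consumes a role while overriding only the handful of values that differ, here is a hedged sketch of a play that applies an nginx role. The variable nginx_listen_port is hypothetical and is not taken from the devops-demo-roles repository; a real role would declare its overridable values in defaults/main.yml.

```yaml
# site.yml: applying a shared role with one project-specific override
- name: Configure web servers
  hosts: webservers
  become: true
  roles:
    - role: nginx
      vars:
        nginx_listen_port: 8080   # hypothetical variable overriding a role default
```

Everything else about the install comes from the role itself, so a fix made there reaches every project that references it.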

Jenkins

The last piece of our puzzle is, in fact, automating our automations. For this we are going to pull in a well-known build tool: Jenkins. With a little more configuration we will be able to do push-button deploys for not only our code, but also the infrastructure it runs on. The trick is in the Jenkinsfile, which allows us to define our Jenkins pipeline in flat text instead of through the Jenkins Web UI. This has two distinct advantages:

It is easier to share pipeline configuration and move it to new Jenkins instances if needed.
It is possible to version control and track changes to your pipeline with the version control tools you are already using.

The different steps of the build process are defined as stages and can even be parallelized if needed. This can be particularly useful if you have multiple playbooks that need to be run for a service tier that aren’t interdependent. Instead of waiting for them to finish in serial you can declare the entire tier as a parallel stage and only have to wait for the single longest one.

Credential management can be a bit of a problem in Jenkins, and it should be mentioned that anyone with sufficient access to your Jenkins master server has access to all of the credentials loaded there. If at all possible, it is wise to store your credentials in some external secure storage and only pull them into Jenkins as needed.

Conclusion

DevOps is a rapidly evolving field, but learning about it and applying some of the tools shown today will help you deploy faster and more reliably. That in turn means more time spent on delivering the features your client wants and less time trying to remember how your servers are configured.