Ansible or Why Docker is not enough
According to Wikipedia, an ansible is a fictional machine capable of instantaneous or superluminal communication. Typically it is depicted as a lunch-box-sized object with some combination of microphone, speaker, keyboard and display. It can send and receive messages to and from a corresponding device over any distance whatsoever with no delay.
This would be quite a cool device, but it is not the Ansible I want to talk about today: Ansible is Simple IT Automation.
Ansible is a free (GNU General Public License) software platform for configuring and managing computers. It is written in Python, but don't worry, you will hardly need to write any Python code. Instead, you declare your machine roles in YAML files with an (admittedly) awkward syntax. Because of that, the learning curve is a bit steep, yet once you have written some configuration the syntax will soon feel familiar. And the best thing about Ansible: it is a natural fit for Docker. How? A picture is worth a thousand words:
If you have already worked with tools like Chef, Puppet, Salt, or CFEngine, you might wonder: why yet another configuration tool? Ansible has one important difference compared to all the others: its agent-less architecture. This means you do not have to install any software on the machines you want to manage. Ansible just needs SSH access to these machines and runs the commands in a remote SSH shell. This makes the installation of Ansible much easier because it avoids the chicken-and-egg problem of other tools: their agent needs to be installed before the configuration management tool can be used.
So how does Ansible work?
Ansible is organized in playbooks. These YAML files follow the Ansible syntax and are written in a declarative manner. That means the configuration files don't specify how to reach the desired state (which steps have to be executed); instead, you describe the desired end state.
This is the playbook needed to set up a machine for our Elasticsearch cluster:
It tells Ansible which roles (more on roles below) an Elasticsearch node in our cluster needs. These are:
- lingohub.common: installs software packages that we want on all our nodes, e.g. htop, build-essential, git, curl, wget, ...
- lingohub.oracle-jdk: since Elasticsearch needs Java, we apply a role that installs the Oracle JDK
- lingohub.elasticsearch: this will set up Elasticsearch and will inject the configuration as needed for a cluster node
- lingohub.gce.logging: sets up GCE logger configuration
- stackdriver-ansible-role: sets up configuration for the GCE Stackdriver monitoring tool
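Since the playbook listing itself is not reproduced here, this is a sketch of how such a playbook could look. The host group, role names, and vault file name are taken from this article; the remaining keys are illustrative:

```yaml
# elasticsearch.yml -- sketch of the cluster playbook described above
- hosts: elasticsearch             # all hosts in the 'elasticsearch' inventory group
  become: yes                      # privilege escalation for package installation
  vars_files:
    - vars/credentials.vault.yml   # encrypted secrets, see the Vault section below
  roles:
    - lingohub.common
    - lingohub.oracle-jdk
    - lingohub.elasticsearch
    - lingohub.gce.logging
    - stackdriver-ansible-role
```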
Now if you run this playbook:
ansible-playbook --ask-vault-pass --diff elasticsearch.yml
Ansible will turn these roles into the needed CLI commands and will execute them via SSH on all hosts that belong to the 'elasticsearch' group. This will happen in parallel on all machines, so setting up a whole cluster is as fast as setting up a single machine.
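The 'elasticsearch' group itself is defined in the Ansible inventory, a simple INI-style file. A hypothetical example (the host names are made up):

```ini
# /etc/ansible/hosts -- INI-style inventory (hypothetical hosts)
[elasticsearch]
es-node-1.example.com
es-node-2.example.com
es-node-3.example.com
```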
Nice fact: if you run this command a second time, Ansible will tell you that no further changes need to be executed on the remote machines to reach the defined state. That is because all Ansible commands are written in an idempotent way: each command checks which steps have to be taken to reach the defined state, and if a step is not needed to reach the expected state, it is not executed.
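On such a second run, the play recap at the end will look roughly like this (the host name and the ok count are illustrative; changed=0 is the important part):

```
PLAY RECAP *********************************************************
es-node-1.example.com : ok=12   changed=0   unreachable=0   failed=0
```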
These 12 lines of YAML configuration are obviously not enough information to set up a whole cluster. This is where roles come into play. They are used to specify exactly how to set up a server, e.g. for Elasticsearch. A role is a well-defined folder/file structure (convention over configuration).
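That convention looks roughly like this (only the most common directories of a role are shown):

```
roles/
  lingohub.elasticsearch/
    tasks/main.yml       # the steps / end states of the role
    defaults/main.yml    # configurable variables with default values
    templates/           # Jinja2 templates for configuration files
    handlers/main.yml    # reactions to changes, e.g. service restarts
    files/               # static files to copy to the server
```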
The file "tasks/main.yml" describes the end state for this role, e.g.:
- this deb package has to be in state "installed"
- this service has to be set up to run on startup
- this linux user has to be present
- this configuration file has to be rendered by the built-in Jinja2 template engine
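The four bullet points above could be expressed in a tasks file roughly like this. The package, service, and user names are illustrative, not taken from the actual role:

```yaml
# tasks/main.yml -- sketch of the end states listed above
- name: ensure the Elasticsearch deb package is installed
  apt:
    name: elasticsearch
    state: present

- name: ensure the service is running and starts on boot
  service:
    name: elasticsearch
    state: started
    enabled: yes

- name: ensure the elasticsearch linux user is present
  user:
    name: elasticsearch
    state: present

- name: render the configuration file from a Jinja2 template
  template:
    src: elasticsearch.yml.j2
    dest: /etc/elasticsearch/elasticsearch.yml
```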
I won't go into detail on how to write your own role; the Ansible Documentation is a better source for that. And to be honest, I have written just a few roles from scratch. There is a better way to get roles for your servers:
Ansible Galaxy is a directory that hosts roles maintained by the community, so I start by searching this directory to find the role that suits our needs best. A search for Elasticsearch roles yields about 40 results. Way too many, which is why I came up with a checklist to filter roles. This list ensures a certain quality level and a minimum of dependencies:
- Are the files publicly available? If not I will skip this role immediately.
- Does it depend on other roles? If yes, I will skip taking a closer look and will try to find a standalone role.
- Is there any documentation how to configure this role? If not, I will skip this role immediately.
- For a deeper look, I check "defaults/main.yml" to see how configurable the role is (e.g. the Elasticsearch version, JVM settings, and all other variables needed to configure Elasticsearch properly).
- A look at "tasks/main.yml" tells me the scope of the role (which software packages it installs, whether it includes proper configuration, whether it optionally installs tools that support a reliable server installation, e.g. monit).
- Does the role reproduce the steps recommended by the maintainer of the service (in this case the Elasticsearch documentation), or does it seem to be written in a "trial and error" manner?
- Are there any vulnerability issues (e.g. the usage of non-configurable default passwords)?
As mentioned above, I haven't invested much time in writing roles from scratch yet, but I really took my time to find the most suitable roles on Ansible Galaxy. I still had to adapt every role to our needs, but at least they are built on a solid foundation.
To give you an example, this is the Elasticsearch role that I used as a template for our role: f500.elasticsearch.
GitHub is a great way to share playbooks and roles with your team. This works fine until your configuration needs secrets (e.g. database passwords, API keys, salts). Our internal security policy doesn't allow storing data like this in a repository (and I highly discourage it as well!). So where to store this information? Luckily, Ansible has something called Ansible Vault.
If you take a look at the Elasticsearch playbook above, you will see the entry "vars/credentials.vault.yml". This file holds all our secrets and is encrypted using a choosable cipher. It can only be edited using the Vault tools and your secret password. The option "--ask-vault-pass" tells "ansible-playbook" to ask for this password and apply it to all encrypted files before executing the playbook.
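For reference, these are the ansible-vault commands to create and maintain such an encrypted file (ansible-vault ships with Ansible; each command prompts for the vault password):

```
# create a new encrypted file
ansible-vault create vars/credentials.vault.yml

# edit it later -- decrypted into a temporary file, re-encrypted on save
ansible-vault edit vars/credentials.vault.yml

# encrypt an existing plain-text file in place
ansible-vault encrypt vars/credentials.vault.yml
```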
Ad Hoc Mode
When starting to use Ansible, you do not have to begin by writing a playbook. Using the command "ansible -a" you can run any arbitrary CLI command on your machines. E.g.:
ansible all -a "sudo apt-get upgrade"
This will run the given command in parallel on all of your machines defined in your inventory (filters can be given) and will show you the output of the executed remote commands.
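Besides raw shell commands, you can also invoke any Ansible module ad hoc with the -m option. E.g., to run the same upgrade through the apt module, limited to the 'elasticsearch' group and with privilege escalation:

```
ansible elasticsearch -m apt -a "upgrade=dist" --become
```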
I hope you enjoyed my introduction to Ansible. Obviously this article can only scratch the surface of what is possible with Ansible. If you really want to give it a try, head over to the Ansible Documentation and start writing your first playbook! If you are using Docker, you will soon see how these two are just a perfect match.
There is one last reason why I think Ansible is an awesome configuration management tool: Ansible playbooks and roles are living documentation. The files describe exactly how your system is set up. If your company has a readme titled "How to add an Elasticsearch instance to our cluster?", chances are high it is already outdated. That doesn't happen with Ansible.