Learning Ansible 101

What is Ansible?

Ansible is an open-source IT automation/configuration management tool.

Why use Ansible?

  • Provisioning

  • Configuration Management

  • Continuous Delivery

  • Application Deployment

  • Security Compliance

  • Agentless

  • Reduce the repetitive tasks

  • Works well on localhost workloads, or cloud servers like AWS or even on private cloud servers.

Understanding Ansible configs, playbooks and Inventory

Installing Ansible from a distribution package usually creates a config file at /etc/ansible/ansible.cfg. This file controls all of Ansible's default settings.

Say you have multiple playbooks for different purposes, for example web servers and db servers, at /opt/web-playbook/ and /opt/db-playbook/. If each set of playbooks needs a different Ansible configuration, we can keep a separate config file in each directory: /opt/web-playbook/ansible.cfg (for web stuff) and /opt/db-playbook/ansible.cfg (for db stuff).

If you want to use a config file other than the one at the default location, for example /opt/web-playbook/ansible.cfg, point the ANSIBLE_CONFIG environment variable at it:

$ ANSIBLE_CONFIG=/opt/web-playbook/ansible.cfg ansible-playbook playbook.yml

Priority for config files (Ansible uses the first file it finds and ignores the rest):

  1. The environment variable ANSIBLE_CONFIG takes first preference.

  2. Then comes the ansible.cfg in the current working directory, that is the directory the playbook is run from, for example /opt/web-playbook/ansible.cfg

  3. Third priority goes to the config file in the user's home directory, ~/.ansible.cfg, if present.

  4. Lastly, Ansible looks for the config file at the default location /etc/ansible/ansible.cfg
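To see which config file actually won, a small playbook can print it. This is a minimal sketch that assumes a reasonably recent Ansible release, where the special variable ansible_config_file is available; the play name is made up for illustration.

```yaml
---
# Prints the path of the ansible.cfg that Ansible picked up for this run.
- name: Show the active Ansible config file
  hosts: localhost
  gather_facts: false
  tasks:
    - name: Print the config file path
      ansible.builtin.debug:
        var: ansible_config_file
```

Running it from /opt/web-playbook/ and then from another directory should show the path changing according to the priority list.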

Inventory

With Ansible we can configure multiple devices using playbooks. Ansible is agentless, that is we do not need any agent installed on the devices being configured. An SSH connection (for Linux) or WinRM (for Windows) is enough for them to be configured.

The information about these devices/servers lives in inventory files, which by default look like an INI file. If we don't create an inventory file, Ansible uses a default inventory file located at /etc/ansible/hosts. Consider this similar to the hosts file in Linux, that is it points a name to a certain server.

web         ansible_host=server1.company.com ansible_connection=ssh
db          ansible_host=server2.company.com ansible_connection=winrm

Inventory files can be used to create aliases, set usernames/passwords for SSH connections, and specify the type of connection (i.e. how Ansible connects to the servers).

We can also group multiple servers based on role, location etc. in the inventory file, like:

[servers:children]
webserver
dbserver

[webserver]
web ansible_host=server.web.com

[dbserver]
db ansible_host=server.web.com
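Ansible also accepts inventories written in YAML. As a rough sketch, the same grouping could look like this (host names copied from the INI example above; the file name inventory.yaml is an assumption):

```yaml
# inventory.yaml (hypothetical file name)
all:
  children:
    servers:            # parent group, same as [servers:children]
      children:
        webserver:
          hosts:
            web:
              ansible_host: server.web.com
        dbserver:
          hosts:
            db:
              ansible_host: server.web.com
```

It would be passed to a playbook run the same way as an INI inventory, with the -i option.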

Variables

Ansible uses Jinja2 syntax to reference variables. We can define variables in our playbooks like:

-
    name: Set a configuration
    hosts: web
    vars:
        http_port: 8081
    tasks:
    - firewalld:
        # firewalld expects port/protocol and a state
        port: "{{ http_port }}/tcp"
        permanent: true
        state: enabled

In the above code, we have defined a variable http_port as 8081 & this variable is used in the next lines as {{ http_port }}

We can also set variables in a different place: instead of setting the value in the playbook, we can set it in the inventory file, next to the host it applies to.

# Sample Inventory file
web http_port=8081

Variable Precedence

We can assign the values for variables in multiple places in different files like the inventory file, playbooks or a different variable file, so which variable value will take the precedence?

Say we have the following inventory file /etc/ansible/hosts


web1             ansible_host=172.20.0.1   dns_server=10.5.5.2
web2             ansible_host=172.20.0.2
web3             ansible_host=172.20.0.3

[web_servers]
web1
web2
web3

[web_servers:vars]
dns_server=10.5.5.3

In the above inventory file we have defined dns_server in two places: once on the host, on line 1, and again on the group. So which dns_server does web1 actually get?

→ Host variables take precedence over group variables
So web1 will have dns_server as 10.5.5.2, while web2 & web3 will have dns_server as 10.5.5.3
Next, what happens when the variable is defined in a playbook as well?
Say

---
- name: Configure DNS server
  hosts: all
  vars:
    dns_server: 10.5.5.1
  tasks:
    - nsupdate:
        server: '{{ dns_server }}'

A playbook variable takes precedence over both host & group variables from the inventory file, so the dns_server used here will be 10.5.5.1
Last but not least, the highest priority goes to the --extra-vars option on the command line when running the playbook, for example

$ ansible-playbook playbook.yaml --extra-vars "dns_server=10.5.5.0"

There is a long list of places where variables can be defined. You can see all of them, in order of precedence, here:

https://docs.ansible.com/ansible/latest/playbook_guide/playbooks_variables.html#variable-precedence-where-should-i-put-a-variable
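One way to check the precedence rules yourself is to print the effective value on every host. The sketch below assumes the inventory shown earlier (web1, web2, web3 with dns_server set on the host and on the group); the play and task names are made up.

```yaml
---
- name: Show which dns_server value wins
  hosts: all
  gather_facts: false
  vars:
    dns_server: 10.5.5.1   # play-level value, overrides inventory host and group values
  tasks:
    - name: Print the effective value for each host
      ansible.builtin.debug:
        msg: "{{ inventory_hostname }} uses dns_server={{ dns_server }}"
```

Removing the vars block should fall back to the inventory values, and adding --extra-vars "dns_server=10.5.5.0" on the command line should override everything.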

Register a variable

Say we have a playbook where the output of one task has to be used in a later task. How do we do it? Or maybe we just want to view the output of a task?
We do this by registering a variable

---
- name: Check /etc/hosts file
  hosts: all
  tasks: 
  - shell: cat /etc/hosts
    register: result
  - debug: 
        var: result

Using the debug module we can see the value of the variable result. If you don't want to add a debug task, you can instead pass -v when running the playbook, like:

$ ansible-playbook -i inventory playbook.yaml -v
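A registered result is a dictionary, so later tasks can pick out individual fields. The following is a small sketch using the command module's usual return values (rc, stdout_lines); the task names are illustrative.

```yaml
---
- name: Use a registered result in later tasks
  hosts: all
  tasks:
    - name: Read /etc/hosts
      ansible.builtin.command: cat /etc/hosts
      register: result
      changed_when: false      # reading a file never changes the host

    - name: Show only the captured output, line by line
      ansible.builtin.debug:
        var: result.stdout_lines

    - name: Run only if the previous command succeeded
      ansible.builtin.debug:
        msg: "cat exited with return code {{ result.rc }}"
      when: result.rc == 0
```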

Ansible Facts

When you run any Ansible playbook, the very first task it runs is one that is not specified in the playbook: Gathering Facts. Here Ansible collects information about each target server. To view these facts we can use the debug module with the variable ansible_facts

---
- name: playbook file
  hosts: all
  tasks:
  - debug:
        var: ansible_facts

It collects basic information about the servers such as processor, storage, architecture, memory etc.
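Printing all of ansible_facts is noisy, so in practice you usually read individual keys. This sketch assumes Linux targets, where facts such as distribution, architecture and memtotal_mb are normally present.

```yaml
---
- name: Use individual facts
  hosts: all
  tasks:
    - name: Print a few selected facts
      ansible.builtin.debug:
        msg: >-
          {{ inventory_hostname }} runs {{ ansible_facts['distribution'] }}
          {{ ansible_facts['distribution_version'] }} on
          {{ ansible_facts['architecture'] }} with
          {{ ansible_facts['memtotal_mb'] }} MB of memory
```

If a play does not need facts at all, setting gather_facts: false at the play level skips the gathering step and saves time.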

Magic Variables

Say we have 3 different servers and a variable is defined for only one of them. How do we access that variable from the other servers? As an example, consider the following inventory file and playbook

# Inventory file
web1     ansible_host=172.20.1.0
web2     ansible_host=172.20.1.1 dns_server=10.0.1.1
web3     ansible_host=172.20.1.2

# Playbook
---
- name: Print DNS Server
  hosts: all
  tasks:
  - debug:
        msg: "{{ hostvars['web2'].dns_server }}"

The above prints web2's dns_server value on all the web servers. The magic variable hostvars lets any host read variables that were defined for another host.
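hostvars is not the only magic variable. As a short sketch, inventory_hostname and group_names are also commonly used; the example assumes an inventory that contains a host named web2, as in the sample above.

```yaml
---
- name: Explore magic variables
  hosts: all
  gather_facts: false
  tasks:
    - name: Show this host's inventory name and the groups it belongs to
      ansible.builtin.debug:
        msg: "{{ inventory_hostname }} is in groups {{ group_names }}"

    - name: Read dns_server from web2, with a fallback if it is not set
      ansible.builtin.debug:
        msg: "{{ hostvars['web2'].dns_server | default('not set') }}"
```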

Playbooks

A playbook is where we define what we want Ansible to do, that is where we set up all the tasks & workloads for a single server or multiple servers. Example playbook file

---
- name: 'Execute two commands on node01'
  hosts: node01
  become: yes
  tasks:
    - name: 'Execute a date command'
      command: date
    - name: 'Task to display hosts file on node01'
      command: 'cat /etc/hosts'
- name: 'Execute a command on node02'
  hosts: node02
  become: yes
  tasks:
    - name: 'Task to display hosts file on node02'
      command: cat /etc/hosts

To run the above playbook, assuming it is saved as playbook.yaml:

$ ansible-playbook -i inventory playbook.yaml

Verifying a playbook

There are cases, such as updating a production cluster, where running our playbook might break things. To verify a playbook before applying it, Ansible offers different modes:

  • Check Mode : Ansible's dry run, where no actual changes are made on the hosts. It gives a preview of the changes without applying them. Use the --check option to run a playbook in check mode.

  • Diff Mode : This mode shows a before & after comparison of what a task changes, for example the lines changed in a file, so we can understand & verify the impact of a playbook. Use the --diff option to run a playbook in diff mode. It is often combined with --check to preview the differences without applying them.

  • Syntax Check: Using this mode we can just check whether the syntax of the playbook is correct. Use the --syntax-check option to run the playbook in syntax check mode.
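Check mode and diff mode can also be controlled per task with the check_mode and diff keywords. The sketch below is illustrative only; the file path and the line being added are made up.

```yaml
---
- name: Per-task control of check and diff mode
  hosts: all
  tasks:
    - name: Always run, even under --check (read-only command)
      ansible.builtin.command: cat /etc/hosts
      check_mode: false
      changed_when: false

    - name: Only ever preview this change, even in a normal run
      ansible.builtin.lineinfile:
        path: /etc/hosts
        line: "10.0.0.10 app.internal"   # hypothetical entry
      check_mode: true
      diff: true
```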

Conditionals

While writing a playbook, we may need to configure multiple servers with different operating systems or architectures. When installing dependencies, the package manager then differs, say yum or apt, which adds complexity to our playbook. It is easier to add a condition that checks the server's OS and uses yum or apt accordingly. We can use Ansible's built-in facts in these conditions, for example

---
- name: "Install NGINX"
  hosts: all
  tasks: 
  - name: "Install nginx on debian"
    apt: 
      name: nginx
      state: present
    when: ansible_os_family == "Debian"
  - name: "Install nginx on Redhat"
    yum: 
      name: nginx
      state: present
    when: ansible_os_family == "RedHat"

Here the when keyword is used as a conditional. We can also combine several conditions with or & and.
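As a small sketch of combining conditions: a list under when means every entry must be true, while or is written inline. The version number used here is just an example value.

```yaml
---
- name: Combine conditions
  hosts: all
  tasks:
    - name: Runs only when both conditions hold
      ansible.builtin.debug:
        msg: "Debian family host with major version 12"
      when:
        - ansible_facts['os_family'] == "Debian"
        - ansible_facts['distribution_major_version'] == "12"

    - name: Runs when either condition holds
      ansible.builtin.debug:
        msg: "RedHat or Suse family host"
      when: ansible_facts['os_family'] == "RedHat" or ansible_facts['os_family'] == "Suse"
```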

We can also use a conditional in a loop. The condition is then evaluated separately for each item.

---
- name: "Install Softwares"
  hosts: all
  vars: 
    packages:
        - name: nginx
          required: True
        - name: mysql
          required: True
        - name: apache
          required: False
  tasks:
  - name: Install "{{ item.name }}" on Debian 
    apt:
       name: "{{item.name}}"
       state: present
    when: item.required == True
    loop: "{{ packages }}"

For the first item, the above is equivalent to writing:

- name: Install "{{ item.name }}" on Debian
  vars:
    item:
        name: nginx
        required: True
  apt:
    name: "{{ item.name }}"
    state: present
  when: item.required == True

Same for mysql & apache.

Use of Ansible facts & variables in conditional

Say we have different web servers and we want a specific package version installed only on servers running a particular distribution, such as Ubuntu. Depending on the flavor, there are different ways to install stuff

- name: Install Nginx on Ubuntu 18.04
  apt:
    name: nginx=1.18.0
    state: present
  when: ansible_facts['distribution'] == 'Ubuntu' and ansible_facts['distribution_major_version'] == '18'

Here ansible_facts gives us information about the servers, which we use to decide whether a task should run.
Variables can be used in a similar way to adapt a task,

- name: Deploy config files
  template:
    src: "{{ app_env }}_config.j2"
    dest: "/etc/myapp/config.conf"
  vars:
    app_env: production

Here we can use the above variable app_env according to our environment configuration needs in a playbook.

Loops

Consider a playbook, in which we will create multiple users,

- name: Create Users
  hosts: localhost
  tasks: 
    - user: name=joe    state=present
    - user: name=george state=present
    # ...and so on

Say we have a very long list of users to add, which is just repetitive stuff. To avoid this we use loops.

A simpler way is to have a single task which loops over all the users

- name: Create Users
  hosts: localhost
  tasks:
    - user: name='{{ item }}'    state=present
      loop:
        - joe
        - george
        - ravi
        - mani
        - kiran
        # ...and so on

We use the variable item in the task; on each iteration it is replaced with the next name in the loop.

Say we also need to set a UID while creating each user. How do we do it?

- name: Create Users
  hosts: localhost
  tasks:
    - user:
        name: "{{ item.name }}"
        uid: "{{ item.uid }}"
        state: present
      loop:
        - name: joe
          uid: 1011
        - name: george
          uid: 1012
        # ...and so on

In the above, each entry in the loop is available as the variable item, so the name is read as item.name & the uid as item.uid
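The list does not have to be written inside the task. As a sketch, it can live in a variable (here a hypothetical users list) and the loop output can be kept short with loop_control:

```yaml
---
- name: Create users from a variable
  hosts: localhost
  vars:
    users:                       # hypothetical list, normally kept in a vars file
      - name: joe
        uid: 1011
      - name: george
        uid: 1012
  tasks:
    - name: Ensure each user exists
      ansible.builtin.user:
        name: "{{ item.name }}"
        uid: "{{ item.uid }}"
        state: present
      loop: "{{ users }}"
      loop_control:
        label: "{{ item.name }}"   # show only the name in the task output
```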

Modules

Ansible modules are units of code that can control system resources or execute system commands. They can be grouped into:

  • System modules: Actions performed at system level, like creating users and groups, managing iptables, working with services etc.

  • Command modules: Actions that execute a command or script on a host.

  • File modules: Actions performed on files, like getting info, copying, archiving etc.

  • Database modules: Actions performed on various databases like MongoDB, MySQL etc.

  • Cloud modules: Help to work with different cloud providers like AWS, Azure etc.

  • and many more..

Example for a command module:

- 
    name: Play1
    hosts: localhost
    tasks: 
    - name: Execute command 'date'
      command: date
    - name: Display resolv.conf contents
      command: cat /etc/resolv.conf

To use the command module, each task has a name key and a command key, that is a name for the task & the exact command to be executed. The above playbook first executes the date command & then displays the contents of the resolv.conf file

The command module also accepts parameters. In case you need to change directory before executing the command, you can use the parameter chdir

- 
    name: Play1
    hosts: localhost
    tasks: 
    - name: Execute command 'date'
      command: date
    - name: Display resolv.conf contents
      command: cat resolv.conf chdir=/etc

This ensures Ansible changes directory before executing the command. Similarly, the creates parameter names a path: if that path already exists, the command is skipped. Here the directory is only created when it does not exist yet.

- name: Create /folder if it does not exist
  command: mkdir /folder creates=/folder
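The same parameters can be written in structured form instead of inline key=value pairs, which tends to be easier to read. A minimal sketch using the command module's cmd, chdir and creates parameters:

```yaml
---
- name: Command module with explicit parameters
  hosts: localhost
  tasks:
    - name: Display resolv.conf contents from within /etc
      ansible.builtin.command:
        cmd: cat resolv.conf
        chdir: /etc
      changed_when: false        # reading a file changes nothing

    - name: Create /folder only if it is missing
      ansible.builtin.command:
        cmd: mkdir /folder
        creates: /folder
```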

The command module takes a free-form command to run. The documentation calls this input free_form, but there is no parameter actually named free_form. Not every module accepts free-form input. For example:

- name: Copy file from source to destination
  copy: src=/source_file dest=/destination

As seen above, the copy module does not take free-form input. It needs the named parameters src and dest to copy a file from a source to a destination. For more information on Ansible built-in modules check out: https://docs.ansible.com/ansible/latest/collections/ansible/builtin/index.html

Script module: Runs a local script on a remote node after transferring it. If you want to run a script on, say, 100 nodes, you don't have to copy the script to each node yourself. Ansible takes care of copying the script to all nodes and then executing it.

- 
    name: Play1
    hosts: localhost
    tasks:
    - name: Run a script on remote server
      script: /some/local/script.sh -arg1 -arg

Use the script module and give it the location of the script on the Ansible controller machine, followed by the arguments for the script.

Service module : This is used to manage services like starting,stopping or restarting a service.

-
    name: Start Services
    hosts: localhost
    tasks:
    - name: Start the database service
      service: name=postgresql   state=started

Why is the state started & not start?

We are not asking Ansible to start the service; we are asking it to ensure that the service is in the started state. If the service is already running, Ansible does nothing. This is called idempotency: running the same task again gives the same result, because the task only brings the system to the expected state.

For example if we have a script which adds a nameserver to resolv.conf file,

# Sample Script
echo "nameserver 10.1.250.10" >> /etc/resolv.conf

If we keep running the above script, we’ll get the following output in resolv.conf, that is we’ll be having duplicate entries

# /etc/resolv.conf

nameserver 10.1.250.1
nameserver 10.1.250.2
nameserver 10.1.250.10
nameserver 10.1.250.10

The same task can be done using the lineinfile module, which is idempotent

-
    name: Add DNS server to resolv.conf
    hosts: localhost
    tasks:
    -  lineinfile:
            path: /etc/resolv.conf
            line: 'nameserver 10.1.250.10'

If we use the above playbook to add a nameserver entry , we’ll get the following output that is there will be no duplicates

# /etc/resolv.conf
nameserver 10.1.250.1
nameserver 10.1.250.2
nameserver 10.1.250.10

Plugins

What are plugins & why are they needed in Ansible?

While using Ansible in the real world, you might have resources like VPCs spread across the world. It can get difficult to manage these resources in real time, for example getting real-time updates or logs, which makes configuring devices even harder.

To resolve such issues we use plugins. Plugins are libraries or bits of code which extend the functionality of base Ansible.

There are various plugins available such as:

  • Dynamic inventory plugin: This builds our inventory from real-time data, for example a cloud provider's API, instead of a hand-maintained file, which makes it a lot easier to configure our devices.

  • Module plugin: This helps us to configure cloud resources with custom configurations. These help to connect seamlessly with a cloud provider's API.

  • Action plugin: We can define high level tasks using this plugin, like configuring load balancers, SSL certificates or firewall rules.

  • There are more plugins like lookup plugins, filter plugins, connection plugins etc. Refer to https://docs.ansible.com/ansible/latest/plugins/plugins.html for more info.
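Lookup and filter plugins are the ones you meet most often inside a playbook. The sketch below uses the built-in env lookup and the upper filter; the choice of the HOME variable is only for illustration.

```yaml
---
- name: Use a lookup plugin and a filter plugin
  hosts: localhost
  gather_facts: false
  tasks:
    - name: Read an environment variable on the control node and transform a string
      ansible.builtin.debug:
        msg: "HOME is {{ lookup('ansible.builtin.env', 'HOME') }} and {{ 'ansible' | upper }} is upper-cased"
```

Lookups run on the control node, so the HOME value printed here is the controller's, not the managed host's.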

Handlers

Imagine you have a large infrastructure and frequently make changes to the web server's config. Modifying the config file alone is not enough; the web service also needs to be restarted, and doing that manually becomes complex and time consuming. This is where handlers come in. We define the restart as a handler and have the task that changes the config notify it. This creates a dependency between the task & the handler: whenever the task actually changes the config file, the handler is triggered and restarts the service.

  • Handlers are triggered by events/notifications.

  • Defined in playbook and executed when notified by a task.

  • Manage actions based on system state/configuration changes.

- name: Deploy application
  hosts: application_servers
  tasks: 
    - name: Copy Application Code
      copy:
        src: app_code/
        dest: /opt/application/
      notify:  Restart Application Service
  handlers:
    - name: Restart Application Service
      service: 
          name: application_service 
          state: restarted

In the above code, the copy module copies files from source to destination. Its notify directive triggers the handler when the copy task reports a change, and the handler uses the service module to restart the given service. A handler runs once, at the end of the play, no matter how many tasks notified it.
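Sometimes a later task depends on the restart having already happened. In that case the pending handlers can be run early with the meta module's flush_handlers action. This is a sketch built on the example above; the port number 8080 is an assumption.

```yaml
---
- name: Deploy and verify application
  hosts: application_servers
  tasks:
    - name: Copy application code
      ansible.builtin.copy:
        src: app_code/
        dest: /opt/application/
      notify: Restart Application Service

    - name: Run pending handlers now instead of at the end of the play
      ansible.builtin.meta: flush_handlers

    - name: Wait until the service accepts connections (hypothetical port)
      ansible.builtin.wait_for:
        port: 8080
        timeout: 30

  handlers:
    - name: Restart Application Service
      ansible.builtin.service:
        name: application_service
        state: restarted
```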

Roles

As the word says, we can assign roles to blank servers to turn them into, say, a database server, a web server or a backup server. Assigning a role in automation means doing everything needed to make a server a database server or a web server, such as installing the prerequisites for MySQL, installing the MySQL packages, configuring the MySQL service etc.

We can do these tasks by running a playbook, so if a playbook is enough, why do we need roles?
This set of basic tasks, such as installing prerequisites or certain packages, stays the same across servers. So instead of copying the playbook again and again for different servers, we can package it into a role and reuse it later. Next time we can directly assign the role in our playbook and the tasks will run on all the given servers, for example

# MYSQL - Role
tasks:
    - name: Install Pre-Requisites
      yum: name=pre-req-packages state=present
    - name: Install MySQL Packages
      yum: name=mysql     state=present
    - name: Start MySQL Service
      service: name=mysql     state=started

To use the above role in a playbook we can:

- name: Install and Configure MySQL
  hosts: db-server1.......db-server100
  roles:
    - mysql

After running the above playbook, all the servers will be assigned the role mysql, that is it will run all the tasks defined in the MySQL role.

The goal of using roles is to make your work reusable.

Roles help to organize our project: we can have separate directories for, say, tasks, vars, defaults and handlers.
Roles also help to share our code, for example through Ansible Galaxy https://galaxy.ansible.com/
Ansible Galaxy has a tool to get started with our project.

$ ansible-galaxy init mysql

The above command creates the role's directory structure, that is templates, tasks, vars, handlers etc. Then move your task code into the tasks directory
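As a sketch of what ends up in the tasks directory: a role's task file is a plain list of tasks, without the hosts or tasks keys a play would have. The path below assumes the role sits in a roles directory next to the playbook.

```yaml
# roles/mysql/tasks/main.yml (assumed location)
# A plain list of tasks; the play that uses the role supplies the hosts.
- name: Install MySQL packages
  ansible.builtin.yum:
    name: mysql
    state: present

- name: Start MySQL service
  ansible.builtin.service:
    name: mysql
    state: started
```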

How to use roles in a playbook

Say we have our playbooks in a playbook directory

- name: Install and Configure MySQL
  hosts: db-server
  roles:
    - mysql

In the above playbook we have assigned the role mysql, but how does the playbook know where to pick this role from? There are a couple of ways to do so.
We can create a directory called roles within the playbook directory and move everything created by the ansible-galaxy command inside this roles folder.
Or we can move the role into a common location where Ansible looks it up, such as /etc/ansible/roles. This is one of the default locations Ansible searches for roles.

We can also change this location by updating the roles_path parameter in ansible.cfg

# /etc/ansible/ansible.cfg
[defaults]
roles_path = /etc/ansible/roles

You can also search a role in ansible-galaxy via cli:

$ ansible-galaxy search mysql

We can install the role directly via

$ ansible-galaxy install <role_name>
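Roles become more reusable when they take parameters. As a rough sketch, a variable can be passed to a role from the playbook; mysql_port is a hypothetical variable that the role would have to actually use, for example in a template.

```yaml
---
- name: Install and Configure MySQL
  hosts: db-server
  roles:
    - role: mysql
      vars:
        mysql_port: 3306   # hypothetical variable consumed inside the role
```

A sensible default for such a variable would normally go into the role's defaults directory so the playbook only overrides it when needed.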

Collections

We can use Ansible collections to access specialized network automation content. Collections such as cisco.ios, junipernetworks.junos and arista.eos offer vendor-specific modules, roles and plugins for managing each vendor's devices.

$ ansible-galaxy collection install cisco.ios

The above command installs the Cisco IOS collection.

  • Package & distribute modules, roles , plugins etc

  • Self contained

  • Community & vendor created

Collections give us expanded functionality. For example, we can use a collection that adds AWS functionality via

---
- hosts: localhost
  collections:
    - amazon.aws

  tasks:
    - name: Create an S3 bucket
      s3_bucket:
        name: my-bucket
        region: us-west-1

But first we have to install the AWS collection via

$ ansible-galaxy collection install amazon.aws
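Instead of listing the collection under the collections keyword, a task can name the module by its fully qualified collection name. This sketch assumes the amazon.aws collection is installed, that it provides the s3_bucket module, and that AWS credentials are already configured on the control node.

```yaml
---
- hosts: localhost
  tasks:
    - name: Create an S3 bucket using the fully qualified module name
      amazon.aws.s3_bucket:
        name: my-bucket
        region: us-west-1
        state: present
```

The fully qualified form makes it obvious which collection a module comes from, which helps when several collections are in use.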

We can also list all the required collections in a YAML file & install them with a single command

# requirements.yml
---
collections:
    - name: amazon.aws
      version: "1.5.0"
    # For a git source, name is the repository URL and version is a branch, tag or commit
    - name: https://github.com/ansible-collections/community.mysql
      type: git
      version: "1.2.1"

We can install the above collections via

$ ansible-galaxy collection install -r requirements.yml

Hope that was enough Ansible for now. 😁
