Efficiently Sharing Data in Ansible
In this article, you will learn how to efficiently share data between roles and playbooks in Ansible.
In certain scenarios, you may need to retrieve data from a cloud provider or perform complex calculations, which can be time-consuming. Let's assume you use an API to launch a virtual machine and then want to reuse the resulting data and share it among other roles. In the beginning, we don't know the IP address of the machine. Once we know the IP, we want to configure only that one machine, both to speed up execution and to avoid wasting time.
Example 1:
- name: Configure frontends
  hosts: frontends
  gather_facts: no
  roles:
    - Create users
    - Setup sshd

- name: Configure backends
  hosts: backends
  gather_facts: no
  roles:
    - Create users
    - Setup sshd
Note. A play is a set of tasks or roles targeted against a set of hosts; a playbook consists of one or more plays. In example 1, there are two plays: Configure frontends and Configure backends.
Variables defined in the play Configure frontends cannot be used in Configure backends, because variables are shared between plays only per host. The scope of a variable set with set_fact is tied to the hosts of the one play it was set in.
In other words, if you generate passwords in the role Create users during Configure frontends, those passwords cannot be used by the same role in Configure backends, even though we wish both groups to have identical passwords.
I would like to demonstrate how this restriction can be bypassed.
Approach 1
Let’s have a look at the simple playbook:
Example 2:
- name: Create server
  hosts: localhost
  gather_facts: no
  vars:
    server_type: cpx21
  roles:
    - prepare_server
    - add_host_to_zabbix
    - lynis
What do we have in the playbook above?
- We launch the playbook on localhost
- We don’t collect data from hosts
- We set some variables
- We launch specific roles
Everything in the playbook is necessary for creating a virtual machine. This playbook is launched locally on the Ansible control machine, which calls the cloud provider's API, creates a machine, and then configures it. Seems fine, but there is a problem. After prepare_server, we have an IP address that we didn't know beforehand and got from the provider. To launch the next roles that configure the machine (lynis), we have to run Ansible tasks on the remote machine, and we have to pass its IP address to the add_host_to_zabbix role, because the monitoring system needs it.
But how can other roles be launched with the right parameters and on the right target?
Someone might say, "reload your dynamic inventory, use a name for the server," and then use delegate_to. That's possible if you pass the name as an external variable while the server is being created, like:
$ ansible-playbook -e "server_name=myserver.cloud" prepare_server.yml
Or export an environment variable. However, this works only in this particular case, so it is not suitable for every situation; moreover, it is not convenient. What if we don't pass the server's name but generate it?
If you paid attention, this also makes the run slower, because Ansible must gather data from the provider one more time, which is exactly what we wanted to avoid.
Especially for this purpose, Ansible has the add_host module. It adds a host to a temporary in-memory inventory, which you can use as temporary storage for variables associated with the new host; these variables would not be available if you simply reloaded your inventory.
A code snippet:
- name: Add host to group 'just_created'
  add_host:
    name: "{{ server_name }}"
    groups: just_created
    var: "{{ myvalue }}"
So, here we add the virtual machine to the group just_created and set some additional variables, which can be accessed later like this:
- set_fact:
    server_ip: "{{ hostvars[server_name].var }}"
Or:
- name: Show all the hosts matching the group just_created
  debug:
    msg: "host is {{ hostvars[item]['inventory_hostname'] }}, var is {{ hostvars[item]['var'] }}"
  with_items: "{{ groups['just_created'] }}"
Thus you can add and share any type of data (key-value or dicts).
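For example, nothing stops you from attaching a whole dict to the temporary host (the variable names here are illustrative, not from the article's roles):

```yaml
- name: Add host with structured data
  add_host:
    name: "{{ server_name }}"
    groups: just_created
    server_info:          # an arbitrary extra parameter becomes a host variable
      ip: "{{ server_ip }}"
      datacenter: fsn1
      tags:
        - web
        - monitored

# Later, in any play of the same run:
- debug:
    msg: "IP is {{ hostvars[server_name].server_info.ip }}"
```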
We can rewrite the playbook above as follows:
Example 3:
- name: Create server
  hosts: localhost
  gather_facts: no
  vars:
    server_type: cpx21
  roles:
    - prepare_server

- name: Install Lynis and Zabbix
  hosts: just_created
  gather_facts: no
  roles:
    - add_host_to_zabbix
    - lynis
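The glue between the two plays in example 3 is an add_host call at the end of the prepare_server role. A minimal sketch, assuming the cloud module's response was registered in a hypothetical created_server variable (not the author's actual code):

```yaml
# tail of roles/prepare_server/tasks/main.yml (sketch)
- name: Register the new machine in the in-memory inventory
  add_host:
    name: "{{ created_server.name }}"
    groups: just_created
    ansible_host: "{{ created_server.ip }}"   # so the next play can connect to it
```

Setting ansible_host here is what lets the second play actually reach the freshly created machine by its IP.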
Approach 2
Another approach I would like to consider is the good old well-known way: temporary files.
Let's say we have a playbook and wish to share a generated openssl pre-shared key:
Example 4:
- name: Create server
  hosts: localhost
  gather_facts: no
  vars:
    server_type: cpx21
  roles:
    - prepare_server
    - add_host_to_zabbix_server

- name: Install Zabbix
  hosts: just_created
  gather_facts: no
  roles:
    - install_zabbix_agent
    - lynis
The role add_host_to_zabbix_server has a task that generates a pre-shared key for establishing an encrypted connection between the Zabbix server and the Zabbix agent; therefore, this key must be propagated to both hosts. We should save this value somewhere in the role. The add_host_to_zabbix_server role does the following to save the PSK:
- name: Create a temporary file
  tempfile:
    state: file
  register: tempfile

- name: Save the key to the temp file
  copy:
    content: "{{ openssl_key }}"
    dest: "{{ tempfile.path }}"
Here we save the openssl_key for further use.
Note. As in programming, any allocated resource must be released once it is no longer needed. The last action must be deleting this temporary file, especially if it contains sensitive data. I omit this action in the article; don't forget it.
Once we have saved the openssl_key, we need a way to read it back. Here is a code snippet for use in the install_zabbix_agent role:
- set_fact:
    # tempfile was registered on localhost in the previous play,
    # so we reach it through hostvars
    psk: "{{ lookup('file', hostvars['localhost'].tempfile.path) }}"
set_fact defines a variable that can be used later in the same play (in the example, the lynis role is still to come).
With the approach we just considered, we can store data persistently: it is not lost between playbook launches unless you delete it:
- name: Remove the temporary file
  file:
    path: "{{ tempfile.path }}"
    state: absent
Of course, you can use your own names for temporary files; just note that the file module does not expand wildcards such as /tmp/ansible.*, so delete the exact registered path.
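If you would rather sweep up every leftover file by pattern, a hedged sketch using the find module (register name is illustrative):

```yaml
- name: Find leftover temporary files
  find:
    paths: /tmp
    patterns: "ansible.*"
  register: leftover_tmp

- name: Remove them
  file:
    path: "{{ item.path }}"
    state: absent
  loop: "{{ leftover_tmp.files }}"
```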
Approach 3
This is the most convenient method to exchange data between roles and even playbooks, and it allows storing data permanently. You probably guessed it: external storage. It can be Redis, Consul, another key-value store, or even a relational database. I prefer Redis: I keep passwords and API access tokens there, and I also store intermediate data such as IP addresses or generated keys. It is the most versatile method for storing and sharing variables.
All previous examples were trivial and didn’t require storing massive data.
Just imagine you need to keep data about all visited hosts between launches, or you would like to run Ansible across different bunches of hosts and move some data from one group of servers to another. This is the significant difference from approaches 1 and 2.
For this goal, Redis is most suitable.
Ansible offers two ways of working with Redis:
- The community.general.redis_data module for saving data into Redis
- The redis lookup plugin for retrieving data by its key
So, let’s consider a playbook. The task is to add all servers under Zabbix monitoring.
Example 5:
- name: Install Zabbix agent
hosts: all
gather_facts: no
roles:
- install_zabbix_agent
- name: Configure Zabbix server
hosts: all
gather_facts: no
roles:
- add_host_to_zabbix_server
In example 5, as you can see, the first play runs on every server and generates the PSKs; to configure the Zabbix server afterwards, we must store all these PSKs together with the associated server names (we actually also need some additional info, such as TLSPSKIdentity; it is omitted here).
In install_zabbix_agent/tasks/main.yml:
- name: Generate TLS PSK
  shell: /usr/bin/openssl rand -hex 32
  register: tls_psk

- name: Store PSK in Redis
  delegate_to: localhost
  community.general.redis_data:
    login_host: 127.0.0.1
    key: "{{ server_ip }}"
    value: "{{ tls_psk.stdout }}"
    tls: false
    state: present
We generate a pre-shared key and save it in Redis under a key equal to the server's IP address.
In the role add_host_to_zabbix_server, we look up the pre-shared key:
- set_fact:
    tls_psk: "{{ lookup('redis', server_ip) }}"
And finally, we delete it:
- name: Delete the TLS PSK from Redis
  delegate_to: localhost
  community.general.redis_data:
    tls: false
    key: "{{ server_ip }}"
    state: absent
Pay attention: when working with Redis, you should specify a host to connect to. In the case above, Redis runs on the same host as Ansible, so I use delegate_to: localhost. In your setup, specify the correct host to connect to.
An advantage of using Redis is the possibility of exchanging data with other applications as well. Then your application can request necessary information in Redis, which is impossible in the two first approaches. Additionally, Redis can be used as a cache for facts that extremely accelerate playbook running.
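As a pointer for that fact-cache use case: Ansible can be told to keep gathered facts in Redis via ansible.cfg. A minimal sketch (the connection string assumes a local Redis on the default port and database 0; tune the timeout to your needs):

```ini
[defaults]
gathering = smart
fact_caching = redis
# host:port:db
fact_caching_connection = localhost:6379:0
# keep cached facts for 24 hours
fact_caching_timeout = 86400
```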
Last but not least, my personal recommendation is to look at the Ansible blockinfile module. It can be used in a particular variant of the temporary-file approach, if your program can parse the resulting file. For example, this module is pretty useful for collecting data into one file for WireGuard.
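A hedged sketch of that WireGuard use case (the inventory name, paths, and variable names are illustrative): each host appends its own peer section to a shared config, fenced by a per-host marker so repeated runs update the section in place instead of duplicating it:

```yaml
- name: Collect this host's peer section into the WireGuard config
  delegate_to: wireguard-server   # hypothetical inventory name of the VPN server
  blockinfile:
    path: /etc/wireguard/wg0.conf
    marker: "# {mark} ANSIBLE MANAGED PEER {{ inventory_hostname }}"
    block: |
      [Peer]
      PublicKey = {{ wg_public_key }}
      AllowedIPs = {{ wg_address }}/32
```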
Conclusion
Sharing data can be a challenging task in certain scenarios, and many DevOps professionals struggle to find the most effective solutions. However, I have demonstrated several techniques that can be used to efficiently accomplish this task.