Lessons Learned while Automating Fetch of Ansible Vault Encryption Passwords using 1Password CLI

The other day I was unable to decrypt a few of my Ansible Vault encrypted host_var files in a playbook. As best I can tell, the problem was related to my use of an executable vault-password-file and the 1Password CLI for fetching passwords.

Follow along for a frightening story and some interesting technical tidbits.

Background on how I stored Vault passwords

I rely on Ansible Vault to encrypt sensitive data in my homelab Ansible playbooks. I store each Vault password in 1Password and retrieve it using the 1Password command line interface (CLI).

Generating a new Vault password goes like this:

Create a new password item in 1Password, or duplicate an existing one¹.
Let 1Password generate a random password containing letters, numbers, and symbols. I pick the length arbitrarily.
Save it with a name in this template: <PLAYBOOK NAME> - Ansible Vault Password.

Ansible playbooks are run on a control node in my homelab. Retrieving a Vault’s password for use looks like this:

Create a file .vault_pass in the directory containing playbook.yaml.
Put the below script in it.
Make the file executable.
In ansible.cfg set vault_password_file = ./.vault_pass.

The simple script:

#!/bin/zsh
VAULT_ID="<VAULT_ID>"
VAULT_ANSIBLE_NAME="<PLAYBOOK NAME> - Ansible Vault Password"
op item get "$VAULT_ANSIBLE_NAME" --vault $VAULT_ID --fields password

The VAULT_ID alpha-numeric string can be sourced by running op vault list. The script runs the op item get command to fetch the specific item out of 1Password, and grabs just the password field on it.
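The last two setup steps (making the file executable and pointing ansible.cfg at it) could be scripted roughly as follows. This is only a sketch: setup_vault_pass is a hypothetical helper name, and it assumes .vault_pass already exists in the playbook directory and that the setting lives in the [defaults] section of ansible.cfg.

```shell
#!/bin/sh
# Hypothetical helper: prepare the playbook directory given as $1.
# Assumes $1/.vault_pass already exists.
setup_vault_pass() {
    dir="$1"
    # Make the password script executable.
    chmod +x "$dir/.vault_pass" || return 1
    # Write a minimal ansible.cfg only if none exists; an existing
    # config is left untouched and must be edited by hand.
    if [ ! -e "$dir/ansible.cfg" ]; then
        printf '[defaults]\nvault_password_file = ./.vault_pass\n' \
            > "$dir/ansible.cfg"
    fi
}
```

From the playbook directory this would be called as setup_vault_pass . (the dot meaning the current directory).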

Now when I run ansible-vault encrypt host_vars/my-host, Ansible executes .vault_pass which triggers the op command and the 1Password Agent asks me to authenticate. Following successful authentication the password value is returned to Ansible Vault, and the file is encrypted using it. This is very convenient!

A problem decrypting

One day, an attempt to decrypt a host_vars file failed with an error, something about no suitable password being found for decryption.

My heart stopped.

Normally I don’t think about which password secures the files because it “just works”. The 1Password Agent is running on my machine, I run a command that needs a password, the Agent supplies that password, and life goes on. It’s 100% transparent to me, which is great from a usability standpoint!

After a few deep breaths, I started debugging by manually supplying the password to the Vault command:

ansible-vault decrypt host_vars/my-host --ask-vault-pass

The --ask-vault-pass flag let me type in the password used to decrypt the file. I copied the password from 1Password and pasted it into the CLI when prompted. Everything worked fine. But why was my .vault_pass file solution no longer working?

I noticed that when I manually ran the op item get command I received this:

# Command
op item get --vault=$VAULT_ID "$VAULT_ANSIBLE_NAME" --fields password

# Response
ID:          Password_ID
Title:       <PLAYBOOK NAME> - Ansible Vault Password
Vault:       Vault Name (Vault_ID)
Created:     11 months ago
Updated:     11 months ago by Matt Edwards
Favorite:    false
Tags:        ansible
Version:     1
Category:    PASSWORD
Fields:
  password:    [use 'op item get <Password_ID> --reveal' to reveal]
  password:    [use 'op item get <Password_ID> --reveal' to reveal]

Well that looks… less than promising. I was not familiar with the --reveal flag; it was added in v2.30.0 of the 1Password CLI (released 2024-07-29). The return code of op item get without --reveal was still 0, which leaves me wondering whether a run of ansible-vault encrypt had used that output as the password. That output would have been:

password:    [use 'op item get <Password_ID> --reveal' to reveal]

Using this as a raw string would not decrypt the vault. Oh well.

Peculiarly, there are two different fields both named password. Which one was being returned? In the 1Password app GUI, the item showed only a single password value and no historical passwords for that field. Where was this second value coming from?

Changing 1Password CLI commands

1Password has a second CLI command that can return a value from a saved item: op read. I ran it alongside op item get to determine whether they would return the same value. I also added the --reveal flag to op item get.

op item get --vault=$VAULT_ID "$VAULT_ANSIBLE_NAME" --fields password --reveal
# Response: ABC

op read "op://HomeNetwork/pihole - ansible vault password/password"
# Response: DEF

The two commands returned two different values.²

Uh oh. But at least I have both values and surely one of them must decrypt the file.

…nope.

The script is missing something

Deeper into this debugging I noticed something I should have caught immediately. For this particular playbook folder the .vault_pass file was not granted executable permissions.

$ ls -l .vault_pass
-rw-r--r--  1 matt  staff  251 July  29 13:53 .vault_pass

According to Ansible’s (somewhat skimpy!) documentation of the matter, the config value of vault_password_file should be a path which is one of:

A file containing the password string
An executable file which returns the password string

Since this file was not executable maybe Ansible had been using the raw contents of the script as the password?

Unfortunately, this is where the mystery ends, as it is too late to go back and validate this theory: I had already decided to nuke-and-pave the file. I sorely wish I had saved it so I could test the theory.
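Had the original files survived, the theory could have been checked with something like the sketch below. It assumes that Ansible treats a non-executable password file as plain text, and try_script_text_as_password is a hypothetical helper name; only ansible-vault view and its --vault-password-file option come from Ansible itself.

```shell
#!/bin/sh
# Hypothetical check: does the raw text of a script ($1) decrypt the
# vault-encrypted file ($2)? Returns ansible-vault's exit status.
try_script_text_as_password() {
    script="$1"
    encrypted="$2"
    # mktemp creates a non-executable file, so Ansible should read the
    # copy as a password file instead of running it.
    copy=$(mktemp) || return 1
    cat "$script" > "$copy"
    ansible-vault view --vault-password-file "$copy" "$encrypted" > /dev/null
    status=$?
    rm -f "$copy"
    return "$status"
}
```

A call such as try_script_text_as_password .vault_pass host_vars/my-host would succeed only if the script text really was the encryption password.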

Start from scratch

Since the host machine was alive in prod, I decided to simply re-create the host_vars/my-host file manually. I logged in, grabbed the passwords, and put them in a new file created using ansible-vault create host_vars/my-host.

A little annoying, but the impacted playbook had a fairly tame host_vars file, so I opted for the simplest solution to the problem.

Lessons learned

Make sure script files are executable. This is a strange case where you don’t get a failure if a script file which needs the executable permission does not have it. Instead, Ansible treats a non-executable vault_password_file as a plain file containing the password, so the contents of the script itself can end up being used as the password.
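A small pre-flight check can catch a missing executable bit before any encryption happens. The sketch below is one possible way to do it; require_executable is a hypothetical helper name, not something Ansible provides.

```shell
#!/bin/sh
# Hypothetical pre-flight helper: refuse to continue when the vault
# password script ($1) is missing or lacks the executable bit.
require_executable() {
    file="$1"
    if [ ! -f "$file" ]; then
        echo "error: $file not found" >&2
        return 1
    fi
    if [ ! -x "$file" ]; then
        echo "error: $file is not executable; run: chmod +x $file" >&2
        return 1
    fi
    return 0
}
```

It could guard a run like this: require_executable ./.vault_pass && ansible-vault encrypt host_vars/my-host.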

Automation can be a risk! If a script is returning a password used to encrypt something, you had better hope that script returns the value you actually have stored in your password manager. Both the structure of my Ansible Vault encrypted files and the passwords in them are mostly not kept anywhere else, other than on the live systems that use them. Losing the encryption key to those files is a PITA because not only might I have to rotate passwords, I also need to work out all of the variable names I had defined in that file. On the other hand, this sort of automation means I don’t have to copy-and-paste passwords into my CLI when running a playbook, and risk accidentally pasting a password somewhere I should not have.

Upgrading your tools quickly can be dangerous. I routinely update my tools. 1Password CLI is installed via Homebrew, and I update all of those Homebrew-installed packages using brew update && brew upgrade about once per week. It’s good hygiene! But it can also result in situations like this. Had I been using op directly from the command line, I might have noticed that passwords were no longer returned without the new --reveal flag; however, I had wrapped op in an executable script whose output I never saw. I do not think this is either a good thing or a bad thing, just a regular thing that can happen as you layer various tools on top of each other in your workflow.
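One way to regain some visibility is to have the wrapper validate what the CLI returned before handing it to Ansible. The sketch below is an assumption about what such a guard might look like; is_plausible_secret is a hypothetical name, and the placeholder pattern is copied from the op item get output shown earlier in this post.

```shell
#!/bin/sh
# Hypothetical guard: succeed only when the candidate value ($1) looks
# like a real secret rather than empty output, multi-line output, or
# the "[use 'op item get ... --reveal' to reveal]" placeholder.
is_plausible_secret() {
    value="$1"
    nl='
'
    # Reject empty output.
    [ -n "$value" ] || return 1
    case "$value" in
        # Reject the concealed-field placeholder text.
        *"[use 'op item get "*" --reveal' to reveal]"*) return 1 ;;
        # Reject anything spanning more than one line.
        *"$nl"*) return 1 ;;
    esac
    return 0
}
```

In a password script this might be called as is_plausible_secret "$secret" || exit 1 before printing the value, so that Ansible should see a failing script instead of a bogus password.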

Use the right tool for the job. As I dug through 1Password’s documentation for op item and op read, I noticed this line:

To retrieve the contents of a specific field, use op read instead. When using service accounts, you must specify a vault with the --vault flag or through piped input.

Guess that settles it: I’ll use op read from now on. The syntax of op read is a bit nicer, with no more alpha-numeric VAULT_ID in the script.
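A .vault_pass built on op read might look roughly like the sketch below. The secret reference is a placeholder in the op://vault/item/field form used elsewhere in this post, and fetch_vault_password is a hypothetical function name; the empty-value check is an extra precaution, not something op requires.

```shell
#!/bin/sh
# Sketch of a .vault_pass script built on `op read`.
# SECRET_REF is a placeholder; substitute your own vault and item names.
SECRET_REF="op://<VAULT NAME>/<PLAYBOOK NAME> - Ansible Vault Password/password"

fetch_vault_password() {
    # Capture the value first so a failed or empty lookup produces an
    # error instead of handing Ansible an empty password.
    secret=$(op read "$1") || return 1
    if [ -z "$secret" ]; then
        echo "error: empty value returned for $1" >&2
        return 1
    fi
    printf '%s\n' "$secret"
}
```

The real script would end with the line fetch_vault_password "$SECRET_REF" so that the password is printed for Ansible to read.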

Test all of your assumptions. While writing this blog post, I decided to make a testing item in 1Password with two fields named “password”. What does op read return in this case?

op read "op://HomeNetwork/Test Password/password"

For a password-type item in 1Password, it returns the special/first password field. It ignores any additional fields I add named “password”. And if I remove the value from the special/first password field, it returns an empty string even if a secondary field is defined and named “password” with a value in it.

¹ I have not yet tested this, but I suspect the Duplicate function in 1Password for Mac may have introduced the item with duplicate password fields.
² Hopefully it’s clear that ABC and DEF are not real passwords I would consider using!