Exploring and Provisioning Infrastructure With Packer

Posted on Mar 21, 2023 | By Andrei Buzoianu, Elif Samedin | 15 minutes read

We attended HashiTalks 2023 with an introductory talk about Packer: where it sits in the DevOps ecosystem and how to leverage its capabilities in order to adhere to an immutable infrastructure approach.

This post is a follow-up to that talk.

Overview

We’ll talk about Infrastructure as Code, Immutable Infrastructure, Golden Images and how Packer fits into all this, and we will also provide a demonstration by building a Virtual Machine image and a Container image.

What Is Packer?

Packer is, without doubt, one of the tools that shouldn’t be missing from a DevOps Engineer’s ammo bag. Its objective is to create identical machine images for multiple platforms (KVM, Vagrant, VMware, AWS, Azure, Google Cloud, Docker) from a single source configuration.

Thus, it drastically shortens the time it takes to deploy new instances. By baking the needed configuration into golden images, we also shift our mindset toward immutable infrastructure. This in turn brings some major benefits, such as reduced deployment time and less configuration drift.

Packer can generate identical images for VMs or containers from a single source configuration, which means all applications, libraries and configurations are already baked into the image.

Why Infrastructure as Code?

The Infrastructure as Code approach ensures repeatable deployments. To achieve this, we usually focus on several areas of coverage: Infrastructure Templating, Infrastructure Management, One-Time Configuration, and Post-Install Configuration and Deployment. In terms of coverage area, Packer sits at the Infrastructure Templating level.

Where does Packer sit compared with a few of the other popular IaC tools in the infrastructure automation landscape?

Infrastructure as Code Tool Landscape

Packer History

Packer was first announced in 2013, and in terms of recent IT infrastructure history, it can be regarded as a veteran.

Packer History

Why Packer?

Up to this point we have seen what Packer is and how it has evolved. So what does Packer solve?

  • Immutable Infrastructure
  • Configuration Drift
  • Deployment & Time-to-Market
  • Maintaining some control over costs - since projects employing “good” automation generally tend to absorb fewer costs

Immutable infrastructure

A traditional mutable server infrastructure dictates that servers are continually updated and modified in place.

Well, immutable infrastructure is another paradigm, one in which servers are never modified after they’re deployed. Typically, configuration management is (only) used during image creation.

The goal is to make the deployment process more predictable; other benefits are consistency and reliability, which translate into a more stable infrastructure.

The classic example here is that we switch from SSHing into servers to a predictable process of replacing the entire server with an image that, upon validation, already contains the appropriate changes (such as updates, fixes, configuration changes).

Configuration drift

Any system configuration will, over time, diverge from its established known-good baseline or even industry-standard benchmarks, if you follow those.

In reality, even a small drift can expose your organization to data breaches or downtime. Proper configuration and management of infrastructure components is vital to security, compliance and business continuity. Anything from day-to-day operational tasks such as software patching or hardware maintenance to poor documentation (or the lack thereof) can cause configuration drift.

Deployment - Golden Images

As a definition, a golden image is a template for a virtual machine (VM), virtual desktop, server or hard disk drive. A golden image may also be referred to as a clone image, master image or base image. A golden image usually carries up-to-date OS packages, along with the software and configurations needed for a particular task.

Here are some of the reasons to use Golden Images and the problems we try to solve:

  • image reusability in an Infrastructure as Code mindset
  • satisfy requirements for specific configuration or preinstalled software on a given machine
  • avoid manually installing and configuring common software
  • save time when deploying many resources at once

Deployment - Faster Time to Market

Automation does increase the speed at which development, testing or production environments are provisioned. Instead of a torrent of time-consuming manual processes, by codifying and documenting every aspect of our systems we fight slow provisioning and shorten the time needed to scale up, or to take down production infrastructure when necessary.

In the software development business, a significant aspect of controlling cost is the reduction of operating expenditures by managing the time spent on development, testing and deployment. The result is a faster time-to-market for one’s product.

Packer Time-to-Market

The ability to reduce time-to-market for new products is a key differentiator in high-performance IT shops. Time-to-market normally includes the time it takes to come up with the product idea, the design and testing cycle, any supplier sourcing and auditing process, the deployment to production, and everything else that needs to happen before the product reaches the customer. Because the novelty and profitability of any new product rapidly decrease if it is not launched on time, or once it is copied, releasing your innovative product as soon as possible is crucial for margin increase, market growth and customer loyalty. Everything we just talked about can be easily distinguished in this diagram by looking at the highlighted area conveniently named “profit”.

This is most probably one of the reasons why Gartner predicts that by 2025, 70% of organizations will implement structured automation to deliver flexibility and efficiency, up from 20% of organizations in 2021.

Managing costs

Unplanned manual processes add delays, which translate into overall cost. Infrastructure as Code captures details about instance types, configurations, security groups and the relationships between all these resources. This is an added benefit that helps teams work within a defined cost policy.

Before Infrastructure as Code, the source of truth for collecting information was the runtime environment itself. By leveraging automation extensively, modern projects employing Infrastructure as Code create the opportunity to analyze these complex systems prior to their actual provisioning, thus reviewing everything from the architecture, security and threat modeling, up to cost analysis.

Packer Terminology

The Packer workflow takes a source (which can be a Docker image, an operating system installation medium in ISO format, an existing machine, a clone of an existing virtual machine, etc.) and creates a machine with the builder. It can then execute tasks defined by the provisioners, and finally it generates some output.

Packer Workflow

Templates

Templates are either HCL or JSON files which define one or more builds by configuring the various components of Packer. Packer uses HCL2 (HashiCorp Configuration Language, version 2), which includes a number of built-in blocks that one can use. A block is a container for configuration. Blocks can be defined in multiple files, and Packer will build the image using all the files from a given directory.
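
As a minimal illustration (the file name, plugin version and values here are ours, not part of the demo that follows), a template is just a handful of such blocks:

```hcl
# minimal.pkr.hcl — a hypothetical, minimal HCL2 template

packer {
  required_plugins {
    docker = {
      version = ">= 1.0.0"
      source  = "github.com/hashicorp/docker"
    }
  }
}

# Where the machine comes from
source "docker" "example" {
  image  = "alpine:3.17"
  commit = true
}

# What happens to it before it becomes an image
build {
  sources = ["source.docker.example"]

  provisioner "shell" {
    inline = ["apk add --no-cache curl"]
  }
}
```

The packer, source and build blocks above could just as well live in three separate .pkr.hcl files in the same directory.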

Builders

Builders are responsible for creating machines and generating images from them for various platforms. For example, there are separate builders for EC2, VMware, Hetzner Cloud, etc. Take the vsphere-iso Packer builder, which is able to create new images for use with VMware vSphere. The builder takes a source image, runs any provisioning necessary on the image after launching it, then snapshots it into a reusable image. This reusable image can then be used as the foundation of new servers launched within vSphere.

For parallel builds, one can create multiple sources and then add those sources to the array in the build block; the sources do not need to be of the same type. When the build is run, Packer produces multiple images.
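
For example, a hypothetical build block referencing two Docker sources (the source names and base images below are ours) would produce both images in parallel:

```hcl
source "docker" "debian" {
  image  = "debian:bullseye"
  commit = true
}

source "docker" "rocky" {
  image  = "rockylinux:9"
  commit = true
}

build {
  # Packer runs one build per source, in parallel
  sources = ["source.docker.debian", "source.docker.rocky"]
}
```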

Provisioners

These are components of Packer that install and configure software within a running machine prior to that machine being turned into a static image. Example provisioners include shell scripts, Ansible, Puppet, etc.

This could be something as simple as installing the latest security updates for the operating system, or as complex as a managed deployment of an entire application.
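
The simple case, sketched against the source.qemu.rocky-9 source used later in this post (and assuming the build connects as root, so no sudo is needed):

```hcl
build {
  sources = ["source.qemu.rocky-9"]

  # Apply the latest security updates before the machine is snapshotted
  provisioner "shell" {
    inline = ["dnf -y upgrade --security"]
  }
}
```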

Post-Processors

Post-processors perform certain tasks after the build has completed. These tasks might be uploading the image to vSphere, generating Vagrant box files or importing into AWS.

They take the result of a builder or of another post-processor and process it to create a new artifact. Those might be tasks that upload the resulting image to a Cloud Provider, or tasks that import a container image into Docker locally or push it to a Docker registry.
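
As a sketch of the Vagrant case, a post-processor that turns the QEMU artifact built later in this post into a Vagrant box (the output file name is our assumption) might look like:

```hcl
build {
  sources = ["source.qemu.rocky-9"]

  # Wrap the resulting qcow2 artifact into a Vagrant box
  post-processor "vagrant" {
    output = "rocky-9.box"
  }
}
```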

There are other terms used throughout the Packer documentation where the meaning may or may not be immediately obvious. To mention some:

  • Artifacts: the results of a single build, usually a set of IDs or files representing a machine image. Worth mentioning here is that a builder produces a single artifact.
  • Builds: a single task that eventually produces an image for a single platform. Multiple builds can run in parallel.
  • Commands: sub-commands of the packer program (such as init, build or version) that perform some job.

Hands-on

We are now going to show some examples in which we leverage Packer’s capabilities in order to get a KVM Virtual Machine image and a Docker image.

Qemu

KVM virtual machine images can be created using the Qemu Packer builder. The builder creates a virtual machine by starting with a blank one, booting it from the installation media, installing an operating system onto the virtual hard drive, provisioning software within the OS, and then shutting it down. The output of the Qemu builder is a directory containing the image file required to run the virtual machine on KVM.

The source block:

hashitalks2023 (master) $ cat qemu/sources.pkr.hcl 
source "qemu" "rocky-9" {
  iso_urls     = ["qemu/iso/Rocky-${var.major_release}.${var.minor_release}-${var.erratum}-${var.arch}-dvd.iso", "https://download.rockylinux.org/pub/rocky/${var.major_release}/isos/${var.arch}/Rocky-${var.major_release}.${var.minor_release}-${var.erratum}-${var.arch}-dvd.iso"]
  iso_checksum = "file:./qemu/iso/sha256sum.txt"

  headless  = false
  disk_size = "${var.disk_size}"
  memory    = "${var.memory}"
  cpus      = "${var.cpu}"

  qemuargs = [
    ["-m", "1024M"],
    ["-smp", "2"],
    ["-cpu", "host"]
  ]

  output_directory = "qemu/output-${local.timestamp}"
  shutdown_command = "echo 'packer' | poweroff"
  format           = "qcow2"
  accelerator      = "kvm"
  vm_name          = "Rocky-${var.major_release}.${var.minor_release}-${var.erratum}-${var.arch}-dvd.qcow2"
  net_device       = "virtio-net"
  disk_interface   = "virtio"

  http_directory = "qemu/http"
  boot_command   = ["<tab> inst.text inst.ks=http://{{ .HTTPIP }}:{{ .HTTPPort }}/ks.cfg<enter><wait>"]

  boot_wait               = "10s"
  communicator            = "ssh"
  ssh_username            = "${var.ssh_username}"
  ssh_password            = "${var.ssh_password}"
  ssh_port                = "22"
  ssh_timeout             = "20m"
  ssh_handshake_attempts  = "40"
  pause_before_connecting = "5s"
}

Variables defined within our Packer configuration:

packer (master) $ cat qemu/variables.pkr.hcl 
variable "cpu" {
  type    = number
  default = 1
}

variable "disk_size" {
  type    = number
  default = 32768
}

variable "memory" {
  type    = number
  default = 1024
}

variable "arch" {
  type    = string
  default = "x86_64"
}

variable "major_release" {
  type    = string
  default = env("MAJOR_RELEASE")
}

variable "minor_release" {
  type    = string
  default = env("MINOR_RELEASE")
}

variable "erratum" {
  type    = string
  default = env("ASYNCHRONOUS_ERRATUM")
}

variable "ssh_username" {
  type    = string
  default = env("SSH_USERNAME")
}

variable "ssh_password" {
  type    = string
  default = env("SSH_PASSWORD")
}
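
Since several of these defaults are read from environment variables via env(), those variables need to be exported before building. A hypothetical invocation (the values below are ours; adjust them to the Rocky Linux release you are targeting):

```shell
# Hypothetical values for the env()-backed variables above
export MAJOR_RELEASE=9
export MINOR_RELEASE=1
export ASYNCHRONOUS_ERRATUM=20221214.1
export SSH_USERNAME=root
export SSH_PASSWORD=packer
```

With these in place, packer build qemu can resolve the variable defaults shown above.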

The image is then configured using the Ansible Packer provisioner, which runs Ansible playbooks. It dynamically creates an Ansible inventory file configured to use SSH, runs an SSH server, executes ansible-playbook, and marshals Ansible plays through the SSH server to the machine being provisioned by Packer.

hashitalks2023 (master) $ cat qemu/build.pkr.hcl 
build {
  name    = "Rocky-9"
  sources = ["source.qemu.rocky-9"]

  provisioner "ansible" {
    inventory_directory = "./../ansible/inventories"
    playbook_file       = "./../ansible/plays/common.yml"
    extra_arguments     = ["--extra-vars", "upgrade_all_packages=yes"]
    user                = "root"
  }

  post-processor "shell-local" {
    inline = ["cp qemu/output-${local.timestamp}/Rocky-${var.major_release}.${var.minor_release}-${var.erratum}-${var.arch}-dvd.qcow2 /var/lib/libvirt/images/"]
  }
}

As you already know, Ansible is an open source tool that automates provisioning, configuration management, application deployment, and many other manual processes. One key benefit that Ansible provides is that its modules are idempotent. This means that the result of performing the action once is exactly the same as the result of performing it repeatedly without any intervening actions.

In our case, Ansible will execute the common Role which we have developed in-house in order to ensure that the servers we manage have a universal baseline. Thus, we aim to achieve some of the following:

  • Set certain networking configurations, such as disabling IPv6
  • Enforce a certain SELinux policy
  • Ensure that certain packages are installed and that all packages are up to date, which also touches on the necessity of maintaining a good patch level

Our targeted builder has its own directory, qemu. To build, simply issue:

hashitalks2023 (master) $ packer build qemu

After the build is done, we have a post-processor which will copy the resulting image to /var/lib/libvirt/images/. Let’s use the Virtual Machine Manager to test the output.

  • We’ll import the existing disk image (Packer Import Existing Disk)
  • You can see the image we just created (Packer Storage)
  • Use generic as the Operating System (Packer OS)
  • Leave defaults for resources (Packer Resources)
  • Name our instance hashitalks2023 (Packer Name of Instance)
  • And done, it works! (Packer Running Instance)

Docker

The Docker Packer builder creates Docker images using Docker. The builder starts a container, runs provisioners inside it, and then either exports the container for reuse or commits the image. Packer builds Docker containers without using Dockerfiles.

By avoiding Dockerfiles, it is able to provision containers with portable scripts or configuration management systems that are not tied to Docker at all. Additionally, it has a straightforward mental model: setting up containers is very similar to setting up a regular virtualized or dedicated server.

The Docker builder must run on a machine that has Docker Engine installed. Consequently, the builder only works where Docker is available, and it does not support running against a remote Docker host.
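
The docker/ directory itself is not listed here, but a sketch of a source and build that would produce output like the one below can be reconstructed from the log (the base image, file paths and tags are taken from the output; the block layout and omitted provisioners are our assumptions):

```hcl
source "docker" "nginx" {
  image  = "nginx:1.23.3"
  commit = true
}

build {
  sources = ["source.docker.nginx"]

  # Upload the static content served by nginx
  provisioner "file" {
    source      = "docker/files/index.html"
    destination = "/usr/share/nginx/html/index.html"
  }

  # Tag the committed image, as seen in the build output
  post-processor "docker-tag" {
    repository = "hashitalks/2023"
    tags       = ["HashiTalks2023", "1.0.0"]
  }
}
```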

hashitalks2023 (master) $ packer build docker
docker.nginx: output will be in this color.

==> docker.nginx: Creating a temporary directory for sharing data...
==> docker.nginx: Pulling Docker image: nginx:1.23.3
    docker.nginx: 1.23.3: Pulling from library/nginx
    ...
    docker.nginx: Status: Downloaded newer image for nginx:1.23.3
    docker.nginx: docker.io/library/nginx:1.23.3
    ...
    docker.nginx: Container ID: 47f0ac04005c54fb2604804cbc1c6b25820e696a63b4454d1832efb6a683bf2b
==> docker.nginx: Using docker communicator to connect: 172.17.0.2
==> docker.nginx: Uploading docker/files/HashiTalks2023.png => /usr/share/nginx/html/HashiTalks2023.png
    docker.nginx: HashiTalks2023.png 311.18 KiB / 311.18 KiB 
==> docker.nginx: Uploading docker/files/index.html => /usr/share/nginx/html/index.html
    docker.nginx: index.html 265 B / 265 B 
==> docker.nginx: Pausing 5s before the next provisioner...
==> docker.nginx: Provisioning with shell script: /tmp/packer-shell2653829607
    ...
==> docker.nginx: Committing the container
    docker.nginx: Image ID: sha256:1969d791f60553d2d83c1170edf03074b3957126468a89f409b291b230444792
==> docker.nginx: Killing the container: 47f0ac04005c54fb2604804cbc1c6b25820e696a63b4454d1832efb6a683bf2b
==> docker.nginx: Running post-processor:  (type docker-tag)
    docker.nginx (docker-tag): Tagging image: sha256:1969d791f60553d2d83c1170edf03074b3957126468a89f409b291b230444792
    docker.nginx (docker-tag): Repository: hashitalks/2023:HashiTalks2023
    docker.nginx (docker-tag): Tagging image: sha256:1969d791f60553d2d83c1170edf03074b3957126468a89f409b291b230444792
    docker.nginx (docker-tag): Repository: hashitalks/2023:1.0.0
Build 'docker.nginx' finished after 24 seconds 429 milliseconds.

==> Wait completed after 24 seconds 429 milliseconds

==> Builds finished. The artifacts of successful builds are:
--> docker.nginx: Imported Docker image: sha256:1969d791f60553d2d83c1170edf03074b3957126468a89f409b291b230444792
--> docker.nginx: Imported Docker image: hashitalks/2023:1.0.0 with tags hashitalks/2023:HashiTalks2023 hashitalks/2023:1.0.0

Packer does support several builders, and changing the target is just a matter of changing the builder, or using multiple builders in the same Packer file. The build steps remain intact and will be the same whether you build a Docker image or an AWS AMI, for example (you can even build both at the same time).

Let’s look at the results.

hashitalks2023 (master) $ docker image ls|grep hashitalks
hashitalks/2023                       1.0.0            1969d791f605   About a minute ago   144MB
hashitalks/2023                       HashiTalks2023   1969d791f605   About a minute ago   144MB

You can see the tags we set up. Now let’s run a container using our image:

hashitalks2023 (master) $ docker run -d -p 8080:80 hashitalks/2023:1.0.0
78fd408575a938a1136fb3d233de6487c56d3d0cb2eef0be9155e277d87f6648

And do a simple curl…

hashitalks2023 (master) $ curl localhost:8080
<!DOCTYPE html>
<html>
<body>

<div>
  <style>
    .center {
      display: block;
      margin-left: auto;
      margin-right: auto;
      width: 50%;
    }
  </style>
  <img class="center" src="HashiTalks2023.png" alt="HashiTalks 2023" />
</div>

</body>
</html>

Final thoughts

We turned the spotlight on Packer as a tool for creating identical machine images for various platforms from a single source configuration, and addressed challenges such as configuration drift and control over costs.

Packer helps keep development, staging, and production as similar as possible. But that is not all, for us it is also a prerequisite for:

  • Quick Infrastructure Deployment
  • Multi-provider Portability
  • Better Stability
  • Improved Control over Costs

If you missed the most recent edition of HashiTalks, you can definitely watch our talk, or you can see the entire playlist, which is available here.