Introducing Glu: Deployment Coordination as Code

This blog post introduces our new open-source project, Glu.
Check it out and try it for yourself at https://github.com/get-glu/glu

Teams are deploying more frequently than ever before. While this acceleration drives innovation, it has transformed what should be a streamlined software delivery process into a labyrinth of manual interventions, disparate tools, and constant firefighting.

Modern development teams face an increasingly complex deployment environment. Microservices architectures, while powerful, have multiplied the number of components that need to be coordinated during releases. Multi-cloud deployments add another layer of complexity. Security requirements grow more stringent by the day. In this environment, traditional deployment approaches are breaking under the strain.

Kelsey Hightower gets it

Enter the Deployment Doom Loop

We call this the "Deployment Doom Loop": a vicious cycle in which developers find themselves:

Drowning in Manual Tasks: From triggering deployments to coordinating across teams, the time spent on manual tasks is overwhelming.
Overwhelmed by Tool Fragmentation: Most teams cobble together a mix of CI/CD tools, monitoring solutions, and deployment scripts. This fragmentation creates visibility gaps and makes it nearly impossible to maintain a consistent deployment process across projects.
Lacking Deployment Visibility: Without centralized oversight, tracking deployment status becomes a full-time job. Questions like "What version is running in production?" or "Who approved this deployment?" require investigation across multiple tools and systems.
Struggling with Compliance: As organizations scale, compliance and security requirements become more stringent. Manual processes make it difficult to enforce consistent security checks and maintain proper audit trails.

The cost of these challenges extends far beyond mere inconvenience. Development velocity suffers. Team morale deteriorates. And ultimately, the business's ability to deliver value to customers is compromised.

A Brief Aside on How We Deploy at Flipt

We encountered many of these problems ourselves when working on Flipt Cloud.

At Flipt, we've implemented GitOps principles where Git commits drive all artifact promotions. Our toolchain combines GitHub Actions, Timoni, and FluxCD to manage deployments from staging through production environments. This process orchestrates our container builds, deployment configurations, and Kubernetes cluster updates while maintaining proper health checks at each step.

Deployment pipeline at Flipt

The typical deployment flow looks like this:

1. A PR containing a code or configuration change is merged.
2. A GitHub Action kicks off on merge, builds the resulting Docker image(s), and pushes them to an OCI registry (GitHub Container Registry).
3. Another GitHub Action runs, using Timoni to generate a deployment bundle for our staging environment with the updated artifact SHA from step 2. This Timoni deployment bundle is tracked in Git.
4. On merge, FluxCD runs and updates our staging K8s cluster, applying the desired artifacts. A number of health checks then run and alert us if there are any failures.
5. After testing in staging, a developer triggers another GitHub Action to run through steps 3-4 again, this time promoting from staging to production.
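
Stripped of tooling, the flow above moves a single image digest from one environment to the next through Git commits. Here is a minimal Go sketch of that idea. The `Environment` type and `Promote` function are hypothetical names made up for illustration; they do not correspond to our actual GitHub Actions, Timoni, or FluxCD configuration.

```go
package main

import (
	"errors"
	"fmt"
)

// Environment is a hypothetical stand-in for the deployment bundle
// that is tracked in Git for one environment.
type Environment struct {
	Name   string
	Digest string // image digest the environment is pinned to
}

// Promote copies the digest pinned in src to dst, mimicking the commit
// that updates the next environment's bundle. It reports whether
// anything changed, so repeating a promotion is a no-op.
func Promote(src, dst *Environment) (bool, error) {
	if src.Digest == "" {
		return false, errors.New("nothing to promote: " + src.Name + " has no digest")
	}
	if src.Digest == dst.Digest {
		return false, nil
	}
	dst.Digest = src.Digest
	return true, nil
}

func main() {
	// A merge produced a new image and staging is already pinned to it.
	staging := &Environment{Name: "staging", Digest: "sha256:abc123"}
	production := &Environment{Name: "production", Digest: "sha256:000000"}

	// After testing, a developer promotes staging to production.
	changed, err := Promote(staging, production)
	if err != nil {
		fmt.Println("promotion failed:", err)
		return
	}
	fmt.Println(changed, production.Digest)
}
```

In the real pipeline each of these assignments is a commit, which is exactly why answering "what changed?" later means digging through Git history.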

Deployment Challenges

While this Rube Goldberg machine works pretty well for us (when things go right), it's not without its problems.

First, integrating these tools requires extensive configuration across YAML, Bash, CUE, and other scripting languages. Each integration point introduces complexity and potential failure modes, making the system brittle and difficult to maintain.

Second, it's not at all simple to find out which version of our app is running in each environment. We have to play detective, jumping from a GitHub Action run to a PR to yet another PR to find the code change that was actually deployed. GitOps solved one traceability problem, in that you know which Docker image SHA was deployed, but that SHA is nearly meaningless to most developers and certainly to product.

I found myself constantly pinging George over Slack with questions like ‘Is my latest change in production?’, ‘When was the last time we promoted to production?’, and ‘Oh crap, how do I roll back?!’

If we have these problems in just a two-person startup, I shudder to think how many similar issues larger teams run into on a daily basis.

Introducing Glu

These challenges led us to develop Glu, a framework that codifies deployment pipelines in an intuitive, maintainable way.

Glu addresses these challenges through three core design principles:

Convention-driven library design: Glu connects deployment pipeline components through conventions, simplifying the process of building and maintaining pipelines.
Integration with existing deployment tools: Glu works alongside existing tools like FluxCD, ArgoCD, and Terraform, rather than replacing them.
Environment-agnostic architecture: Glu's design allows it to work with any environment, not just Kubernetes.

Glu UI showing an example deployment pipeline

By following the conventions that Glu provides, you get the following out of the box:

An API for interacting with the state of your pipelines.
An optional dashboard UI for exploring your pipelines and triggering manual promotions.
The ability to develop and test your CD pipelines locally.

In short, the goal of Glu is to be the missing piece that makes it easy to glue together your deployment pipeline, as referenced above in the Kelsey Hightower post.

An Example Pipeline

The following example demonstrates how Glu orchestrates a multi-environment deployment pipeline. Pay particular attention to how the code naturally expresses the flow from OCI repository through staging to production:

return builder.New[*AppResource](glu.NewSystem(ctx, glu.Name("gitops-example"), glu.WithUI(ui.FS()))).
    BuildPipeline(glu.Name("gitops-example-app"), func() *AppResource {
        return &AppResource{
            Image: "ghcr.io/get-glu/gitops-example/app",
        }
    }, func(b builder.PipelineBuilder[*AppResource]) error {
        // fetch the configured OCI repository source named "app"
        ociSource, err := builder.OCISource(b, "app")
        if err != nil {
            return err
        }

        // fetch the configured Git repository source named "gitopsexample"
        gitSource, err := builder.GitSource(b, "gitopsexample")
        if err != nil {
            return err
        }

        // build a phase which sources from the OCI repository
        ociPhase, err := b.NewPhase(glu.Name("oci"), ociSource)
        if err != nil {
            return err
        }

        // build a phase for the staging environment which sources from the git repository
        // configure it to promote from the OCI phase
        staging, err := b.NewPhase(glu.Name("staging", glu.Label("url", "http://0.0.0.0:30081")),
            gitSource, core.PromotesFrom(ociPhase))
        if err != nil {
            return err
        }

        // build a phase for the production environment which sources from the git repository
        // configure it to promote from the staging git phase
        _, err = b.NewPhase(glu.Name("production", glu.Label("url", "http://0.0.0.0:30082")),
            gitSource, core.PromotesFrom(staging))
        if err != nil {
            return err
        }

        // return configured pipeline to the system
        return nil
    }).
    Run()

We won't get into all the details in this post, but the above example should give you a good idea of how easy it is to create your deployment pipelines with Glu.

If you're interested in seeing the full code for the example above, you can find it in our GitOps Example repository.

In about 50 lines of Go code, you get a fully functional deployment pipeline that is type-safe, easy to understand, and comes with a UI to quickly answer questions like "What version is running in production?" and "How did it get there?".
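
To make the "How did it get there?" question a little more concrete, here is a small, self-contained sketch of how a chain of phases linked by promotion edges can be walked to answer it. Treat it as a conceptual model only: the `Phase` type, its fields, and the `Lineage` and `InSync` functions are hypothetical names for illustration, not Glu's API or implementation.

```go
package main

import (
	"fmt"
	"strings"
)

// Phase is a hypothetical model of one stage in a pipeline.
type Phase struct {
	Name         string
	Digest       string // version currently observed in this phase
	PromotesFrom *Phase // upstream phase, nil for the first phase
}

// Lineage returns the phase names from p back to the origin of the pipeline.
func Lineage(p *Phase) []string {
	var names []string
	for cur := p; cur != nil; cur = cur.PromotesFrom {
		names = append(names, cur.Name)
	}
	return names
}

// InSync reports whether p runs the same version as its upstream phase.
// A phase without an upstream is always considered in sync.
func InSync(p *Phase) bool {
	return p.PromotesFrom == nil || p.Digest == p.PromotesFrom.Digest
}

func main() {
	oci := &Phase{Name: "oci", Digest: "sha256:abc123"}
	staging := &Phase{Name: "staging", Digest: "sha256:abc123", PromotesFrom: oci}
	production := &Phase{Name: "production", Digest: "sha256:000000", PromotesFrom: staging}

	fmt.Println(strings.Join(Lineage(production), " <- "))
	fmt.Println("staging in sync:", InSync(staging))
	fmt.Println("production in sync:", InSync(production))
}
```

With these sample values, production traces back through staging to the OCI phase, and only production is out of sync with its upstream, which is the kind of at-a-glance answer a pipeline view is meant to give.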

Before closing, let's quickly list out what Glu can do today and also what we have planned for the future.

What Glu Can Do Today

Define your deployment pipelines in code with a focus on readability and simplicity.
Develop and test your pipelines locally.
Trigger manual promotions through the provided dashboard UI.
Query and trigger your pipelines over HTTP with a simple-to-use API.
Integrate with your existing Git repositories and OCI registries.
Create pull requests to propose changes to your pipelines on GitHub.
Trigger automated reconciliations of your pipelines on a schedule.

What We Have Planned

Add support for additional source types (e.g. Helm Charts).
Add support for additional target types (e.g. Terraform modules).
Integrate with your existing CI/CD systems (e.g. GitHub Actions) to track artifact history and trigger automated promotions.
Add support for additional Git SCM providers (e.g. GitLab).
Enable querying external systems (e.g. Prometheus, Healthchecks, etc.) to gate pipeline promotion.
Add status reporting, history, and rollback capabilities.
Add support for additional notification channels (e.g. Slack).

Wrapping Up

Glu is still under heavy development, but we've already found it to be a useful tool for our own deployment pipelines and we're looking forward to sharing it more with the community.

Check out the Glu Documentation for more information on getting started. We also put together a quickstart guide to help you get up and running in no time with a simple example.

It's still early days (read: no tests, expect breaking changes), but we're excited to see where this goes.

The code is available on GitHub and we'd love to hear any feedback you have. We are also very welcoming of contributions if you are interested in helping out!

We would also love to hear about your own deployment pipeline issues and how we can help. Feel free to come chat with us on Discord, where we have a channel dedicated to Glu.

Thanks for reading!