
Tracking your software relationships with Eiffel

2023-07-18

Post by: Magnus Bäck

Whiteboard with Eiffel event scribbles

Do you have a complex continuous integration (CI) pipeline that you’d like to capture information from? Maybe metrics, input for compliance or software bills of materials (SBOMs), or details about all activities and their relationships so that you can make a custom visualization of the pipeline?

Or maybe you want to split your pipeline into independent pieces that trigger each other? Then the open source Eiffel event protocol might be something for you. Eiffel defines a vocabulary for describing things that can happen in a CI pipeline and its surrounding ecosystem, for example:

  • A source code commit is pushed somewhere.
  • An activity with a particular name has started.
  • An artifact is built.
  • The execution of a test case has started (or completed).
  • Someone’s level of confidence in an artifact has changed.

With Eiffel, these occurrences are described in JSON documents that we call events, and they’re published in real time by the involved systems so that others can consume and act upon them. Eiffel is just a protocol and doesn’t care how the events are distributed, but a publish-subscribe pattern is usually used: anyone with something to announce sends their events to a message broker that distributes them to everyone who has expressed interest. The publisher doesn’t need to know anything about who’s listening. The Eiffel open source ecosystem has so far been written for RabbitMQ, but another broker such as Kafka could be used instead.
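To make the idea of an event as a JSON document a bit more concrete, here’s a minimal sketch in Python. It assumes the general shape I’d expect from the protocol documentation: a `meta` object with identity, type, version, and time, a type-specific `data` object, and a list of `links`. The `make_event` helper, the shortened type name, and the version string are made up for illustration; real events use longer type names and carry more required fields.

```python
import json
import time
import uuid


def make_event(event_type, data, links=None):
    """Build a simplified Eiffel-style event as a plain dict."""
    return {
        "meta": {
            "id": str(uuid.uuid4()),          # unique identity of this event
            "type": event_type,               # shortened stand-in for the real type name
            "version": "1.0.0",               # placeholder version string
            "time": int(time.time() * 1000),  # creation time, ms since the epoch
        },
        "data": data,          # payload specific to the event type
        "links": links or [],  # references to other events, none in this example
    }


# Hypothetical event announcing that an activity has been triggered.
event = make_event("ActivityTriggered", {"name": "Foo_Build"})
print(json.dumps(event, indent=2))
```

A publisher would send the serialized document to the message broker; nothing in the event itself says who is expected to receive it.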

Obviously, Eiffel didn’t invent events and message buses. Many systems involved in CI pipelines can already publish events in some form, so what value does Eiffel add? There are a few reasons why it makes sense to have these systems emit Eiffel events instead of (or in addition to) their native events. First off, it’s a single vocabulary to describe CI events, so if you have multiple systems producing events you can process all of them in the same way. Relatedly, you can obtain the events from a single source, i.e. your message queue of Eiffel events becomes your one-stop shop for CI events. The last and arguably most important reason introduces a concept usually not found in other event representations, namely the ability to express relationships between events, or links in Eiffel parlance. Each Eiffel event can reference zero, one, or more other Eiffel events and describe what relationship it has to each of them.

Axis has been expanding its use of Eiffel over the past few years, and a large part of the rather complex AXIS OS pipeline is described with Eiffel.

In this post, I’ll give a few examples of how Eiffel events can model real-world systems and why I think it makes sense to do so. If you’re already sold on the concept and want to dig into the protocol documentation, or just want to use it as a reference on the side while reading, have a look at github.com/eiffel-community/eiffel.

A simple example

To better understand what it means to describe the relationships between events, let’s look at some of the events we listed earlier and connect them into a graph:

This tells us that a source commit is pushed to a branch, how that event becomes the trigger (cause) of an activity — e.g. a CI job — that creates an artifact, and how the artifact creation declares that it was built from the source code commit that triggered the build. Note that the edges point backward in time. These events form a directed acyclic graph (DAG), not entirely unlike how the Git version control system links the commits that make up the history of the repository.
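The same three-event graph can be sketched as data. This is only an illustration: the short ids, the shortened type names, and the link type labels (in particular `BUILT_FROM`) are placeholders I picked for readability and may not match the names the protocol defines. The `upstream` helper is hypothetical too; it simply follows links backward in time.

```python
# Three simplified events keyed by id. Each link points at an earlier event.
events = {
    "scs1": {"type": "SourceChangeSubmitted", "links": []},
    "act1": {
        "type": "ActivityTriggered",
        "links": [{"type": "CAUSE", "target": "scs1"}],
    },
    "art1": {
        "type": "ArtifactCreated",
        "links": [
            {"type": "CONTEXT", "target": "act1"},     # created by this activity
            {"type": "BUILT_FROM", "target": "scs1"},  # placeholder link type
        ],
    },
}


def upstream(event_id, events):
    """Return the ids of all events reachable by following links backward."""
    seen = []
    stack = [event_id]
    while stack:
        current = stack.pop()
        for link in events[current]["links"]:
            target = link["target"]
            if target not in seen:
                seen.append(target)
                stack.append(target)
    return seen


print(sorted(upstream("art1", events)))  # → ['act1', 'scs1']
```

Starting from the artifact, we reach both the activity that built it and the commit it was built from, without knowing anything about the CI system that ran the job.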

For a human observer with knowledge of how the system has been set up, collecting and visualizing this information might not add much value. They would probably know where to find the CI job that produced an artifact with a particular identity, the UI of the CI system would show why the job was triggered, and so on.

However, for larger systems composed of dozens or hundreds of artifacts, each with its own CI pipeline (possibly implemented with different tools), most people probably only know a small part of the entire system. This is especially true if the system is spread across geographic, organizational, or tech stack boundaries. Trying to find information outside of your realm is often difficult, boring, time-consuming, or all of the above. Eventually, someone is going to write a tool to help collect whatever piece of information is frequently wanted, but that may prove both difficult and prone to depend on the implementation details of various components.

A more realistic example

Let’s expand the earlier example to something that more resembles the kind of real-world CI pipeline that people could find hard to understand. A pipeline isn’t just activities building artifacts from newly pushed code; you run various kinds of tests, set baselines, create new artifacts from existing artifacts, evaluate test results and make verdicts, and eventually maybe deploy the artifacts in an environment or publish them to a new location.

Oof. That’s a lot of nodes and edges. I think we need to go through them step by step:

  1. Like in the earlier example, we start with a commit that has been pushed to a Git repository. In this expanded example we also link it to the commit’s previous version, i.e. its parent commit.
  2. The commit triggers an activity named Foo_Build. The activity links to the previous execution of the same configuration (the exact definition depends on what kind of activity it is).
  3. The activity creates an artifact that declares what source it was built from. In this case, it’s the same source that caused the artifact to be built but that’s not always the case.
  4. The artifact creation links to an event that defines the environment in which the artifact was created. A typical example of an environment is the host or device where something is built or tested.
  5. The activity also publishes the artifact, i.e. makes it available from a particular URL. In this case, the artifact creation and publication are carried out by the same activity but that doesn’t have to be the case.
  6. The publication of the artifact triggers a test case execution. Apart from telling us what triggered it, that event also tells us what’s being tested — the artifact.
  7. The test case execution completes with a passing result.
  8. Because of the passing test the confidence level “ready for release” is reached for the artifact. In other cases, the confidence level decision could be made by a human and based on multiple inputs.
  9. The newly gained confidence in the artifact triggers an activity named Foo_Release.
  10. During the Foo_Release activity the artifact is published to a new location (under /release instead of /dev).
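The ten steps above could be written down as data roughly like this. It’s a sketch under loud assumptions: the ids, the shortened type names, and the link type labels are placeholders (`BUILT_FROM` in particular is invented), and `scs0` and `act0` stand in for the parent commit and the previous Foo_Build execution. The hypothetical `links_pointing_forward` helper checks the property mentioned earlier, namely that every edge points backward in time.

```python
# (id, type, [(link type, target id), ...]) in the order the events were sent.
PIPELINE = [
    ("scs0", "SourceChangeSubmitted", []),  # parent commit
    ("act0", "ActivityTriggered", []),      # previous Foo_Build execution
    ("scs1", "SourceChangeSubmitted", [("PREVIOUS_VERSION", "scs0")]),   # step 1
    ("act1", "ActivityTriggered", [                                      # step 2
        ("CAUSE", "scs1"),
        ("PREVIOUS_ACTIVITY_EXECUTION", "act0"),
    ]),
    ("env1", "EnvironmentDefined", []),                                  # step 4
    ("art1", "ArtifactCreated", [                                        # steps 3-4
        ("CONTEXT", "act1"),
        ("BUILT_FROM", "scs1"),
        ("ENVIRONMENT", "env1"),
    ]),
    ("pub1", "ArtifactPublished", [("CONTEXT", "act1"), ("ARTIFACT", "art1")]),  # 5
    ("tct1", "TestCaseTriggered", [("CAUSE", "pub1"), ("IUT", "art1")]),         # 6
    ("tcf1", "TestCaseFinished", [("TEST_CASE_EXECUTION", "tct1")]),             # 7
    ("clm1", "ConfidenceLevelModified", [("CAUSE", "tcf1"), ("SUBJECT", "art1")]),  # 8
    ("act2", "ActivityTriggered", [("CAUSE", "clm1")]),                          # 9
    ("pub2", "ArtifactPublished", [("CONTEXT", "act2"), ("ARTIFACT", "art1")]),  # 10
]


def links_pointing_forward(pipeline):
    """Return (source, target) pairs whose target was not sent before the source."""
    seen = set()
    bad = []
    for event_id, _event_type, links in pipeline:
        for _link_type, target in links:
            if target not in seen:
                bad.append((event_id, target))
        seen.add(event_id)
    return bad


print(links_pointing_forward(PIPELINE))  # → []
```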

Let’s circle back to some of the questions we asked in the first paragraph and evaluate how this Eiffel model of reality can help us:

  • Capturing metrics: Yes, please. By running the stream of Eiffel events through a stream processor you can e.g. measure the time between the source code push event and the artifact build event. The processor can even be written in such a way that it’s independent of what the component’s pipeline looks like and what systems are involved.
  • Input for audits or SBOMs: Yes. The artifact declares which source it was built from, and you can track exactly which tests were run and how the release decision was made. These are, of course, not the only things one might need in a compliance or SBOM context, but it’s a good starting point.
  • Visualizing the end-to-end steps involved in building and releasing a piece of software: Yes, we can use the activity events to draw a flow chart with the steps and how they relate.
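The metrics idea from the list above can be sketched in a few lines. The event shape is simplified (flat `id`, `type`, and `time` fields instead of a full event), the `BUILT_FROM` link label is a placeholder of mine, and `lead_times` is a hypothetical helper rather than part of any Eiffel tool. A real stream processor would also have to cope with events arriving late or out of order.

```python
events = [
    {"id": "scs1", "type": "SourceChangeSubmitted", "time": 1_000, "links": []},
    {"id": "act1", "type": "ActivityTriggered", "time": 2_000,
     "links": [{"type": "CAUSE", "target": "scs1"}]},
    {"id": "art1", "type": "ArtifactCreated", "time": 61_000,
     "links": [{"type": "CONTEXT", "target": "act1"},
               {"type": "BUILT_FROM", "target": "scs1"}]},
]


def lead_times(events):
    """Map each artifact event id to the ms elapsed since its source event."""
    by_id = {e["id"]: e for e in events}
    result = {}
    for e in events:
        if e["type"] != "ArtifactCreated":
            continue
        for link in e["links"]:
            source = by_id.get(link["target"])
            if source and source["type"] == "SourceChangeSubmitted":
                result[e["id"]] = e["time"] - source["time"]
    return result


print(lead_times(events))  # → {'art1': 60000}
```

Note that the function only looks at the link from the artifact to its source, so it doesn’t need to know which activities ran in between or which CI system ran them.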

Instead of just looking at a particular subset of events that represent what follows from, say, the push of a particular commit, we can analyze events across multiple instances of a pipeline (or across different pipelines). For example, we could check if a test case has failed recently, and if so, what source was being tested. Also, what was the build or test environment at the time? Or in which pipelines has a particular test run in the past week?
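Here’s a rough sketch of the first of those questions: find failed test cases and follow the links back to the source that was tested. Again the event shape is simplified, and the field names (`verdict`, `name`, `commit`), the link labels, and both helper functions are assumptions made for the example.

```python
events = [
    {"id": "scs1", "type": "SourceChangeSubmitted", "commit": "abc123", "links": []},
    {"id": "art1", "type": "ArtifactCreated",
     "links": [{"type": "BUILT_FROM", "target": "scs1"}]},
    {"id": "tct1", "type": "TestCaseTriggered", "name": "smoke_test",
     "links": [{"type": "IUT", "target": "art1"}]},
    {"id": "tcf1", "type": "TestCaseFinished", "verdict": "FAILED",
     "links": [{"type": "TEST_CASE_EXECUTION", "target": "tct1"}]},
]


def follow(event, link_type, by_id):
    """Return the first event that `event` links to with the given link type."""
    for link in event["links"]:
        if link["type"] == link_type:
            return by_id.get(link["target"])
    return None


def sources_of_failed_tests(events):
    """Return (test name, commit) pairs for every failed test case."""
    by_id = {e["id"]: e for e in events}
    found = []
    for e in events:
        if e["type"] != "TestCaseFinished" or e.get("verdict") != "FAILED":
            continue
        # Walk: finished test -> triggered test -> tested artifact -> source.
        triggered = follow(e, "TEST_CASE_EXECUTION", by_id)
        artifact = follow(triggered, "IUT", by_id) if triggered else None
        source = follow(artifact, "BUILT_FROM", by_id) if artifact else None
        if source:
            found.append((triggered["name"], source["commit"]))
    return found


print(sources_of_failed_tests(events))  # → [('smoke_test', 'abc123')]
```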

As a final example, we can use the events as the source of custom pipeline visualizations. CI tools usually provide nice graphs that show the execution progress of a particular pipeline instance, but that’s insufficient if you start connecting separate pipelines, as you often do in larger systems. Let’s consider the Foo_Release activity in the graph above. It promotes artifacts to another location as part of the release. It’s easy to imagine that part of the pipeline being owned by a different team than the developer team that owns the build of the component itself. It could also be implemented in a different CI system. Either way, the developer team probably considers it part of their pipeline and therefore something they’d want to be able to observe. A visualization based on the Eiffel events emitted by all involved systems wouldn’t have to care about the ownership or which CI system underpins each node in the graph.
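As a small sketch of such a visualization, the function below turns a handful of simplified events into Graphviz DOT text. The `to_dot` helper, the `label` field, and the link labels are invented for the example, and it assumes labels contain no double quotes. The point is only that the drawing code never asks which team or CI system produced an event.

```python
events = [
    {"id": "act1", "label": "Foo_Build", "links": []},
    {"id": "art1", "label": "artifact",
     "links": [{"type": "CONTEXT", "target": "act1"}]},
    {"id": "clm1", "label": "ready for release",
     "links": [{"type": "SUBJECT", "target": "art1"}]},
    {"id": "act2", "label": "Foo_Release",
     "links": [{"type": "CAUSE", "target": "clm1"}]},
]


def to_dot(events):
    """Render events as nodes and their links as labeled edges in DOT syntax."""
    lines = ["digraph pipeline {"]
    for e in events:
        event_id, label = e["id"], e["label"]
        lines.append(f'  "{event_id}" [label="{label}"];')
    for e in events:
        for link in e["links"]:
            source, target, kind = e["id"], link["target"], link["type"]
            lines.append(f'  "{source}" -> "{target}" [label="{kind}"];')
    lines.append("}")
    return "\n".join(lines)


dot = to_dot(events)
print(dot)
```

If Graphviz is installed, saving the output to a file and running something like `dot -Tsvg pipeline.dot -o pipeline.svg` should produce a picture of the graph.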

Summary and next steps

In this article, I’ve tried to explain why Eiffel as a vocabulary for expressing what goes on in CI pipelines can be useful when developing larger software systems. Most CI tools work great for team-level insights but are often insufficient if you want to measure, visualize, or just understand what goes on at the product or enterprise level.

If you found this interesting there’s more to read on eiffel-community.github.io. If you prefer watching, the Eiffel Community YouTube channel contains several introductory videos that explain different aspects of the protocol. Also, keep an eye on this blog for follow-up posts about topics like the challenges of building observable event-triggered CI pipelines.
