Summary
Getting software delivered to an environment where users can interact with it requires many steps along the way. In some cases the journey involves a large number of interdependent workflows that need to be orchestrated across technical and organizational boundaries, making it difficult to know what the current status is. Faced with such a complex delivery workflow, the engineers at Ericsson created a message-based protocol and accompanying tooling to let the various actors in the process provide information about the events that happen across the different stages. In this episode Daniel Ståhl and Magnus Bäck explain how the Eiffel protocol allows you to build a tooling-agnostic visibility layer for your software delivery process, letting you answer all of your questions about what happens between writing a line of code and your users executing it.
Announcements
- Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
- When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show!
- Your host as usual is Tobias Macey and today I’m interviewing Daniel Ståhl and Magnus Bäck about Eiffel, an open protocol for platform agnostic communication for CI/CD systems
Interview
- Introductions
- How did you get introduced to Python?
- Can you describe what Eiffel is and the story behind it?
- What are the goals of the Eiffel protocol and ecosystem?
- What is the role of Python in the Eiffel ecosystem?
- What are some of the types of questions that someone might ask about their CI/CD workflow?
- How does Eiffel help to answer those questions?
- Who are the personas that you would expect to interact with an Eiffel system?
- Can you describe the core architectural elements required to integrate Eiffel into the software lifecycle?
- How have the design and goals of the Eiffel protocol/architecture changed or evolved since you first began working on it?
- What are some example workflows that an engineering/product team might build with Eiffel?
- What are some of the challenges that teams encounter when integrating Eiffel into their delivery process?
- What are the most interesting, innovative, or unexpected ways that you have seen Eiffel used?
- What are the most interesting, unexpected, or challenging lessons that you have learned while working on Eiffel?
- When is Eiffel the wrong choice?
- What do you have planned for the future of Eiffel?
Keep In Touch
- Daniel
- d-stahl-ericsson on GitHub
- Magnus
- magnusbaeck on GitHub
Picks
- Tobias
- Daniel
- Magnus
Closing Announcements
- Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management.
- Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
- If you’ve learned something or tried out a project from the show then tell us about it! Email hosts@podcastinit.com with your story.
- To help other people find the show please leave a review on iTunes and tell your friends and co-workers.
Links
- Eiffel
- Ericsson
- Axis Communications
- Hudson CI framework
- Spinnaker
- Jenkins
- Tekton
- Gradle
- Artifactory
- JSON Schema
- RabbitMQ
- Prometheus
- Continuous Delivery Foundation
- CD Events
- XKCD Competing Standards
- Python Eiffel SDK
- Pydantic
The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA
Hello, and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you're ready to launch your next app or want to try a project you hear about on the show, you'll need somewhere to deploy it. So take a look at our friends over at Linode. With the launch of their managed Kubernetes platform, it's easy to get started with the next generation of deployment and scaling powered by the battle-tested Linode platform. Go to pythonpodcast.com/linode, that's l-i-n-o-d-e, today and get a $100 credit to try out a Kubernetes cluster of your own. And don't forget to thank them for their continued support of this show. Your host as usual is Tobias Macey, and today I'm interviewing Daniel Ståhl and Magnus Bäck about Eiffel, an open protocol for platform agnostic communication for CI/CD systems. So, Daniel, can you start by introducing yourself? Hello. My name is Daniel Ståhl, and I'm at Ericsson.
[00:01:12] Unknown:
I'm also an associate professor and researcher at Linköping University and
[00:01:17] Unknown:
have been working on Eiffel for the past decade, more or less. And, Magnus, how about yourself? I've worked in the space of software releases since about 2004 in various roles, both operationally and in development of tools to automate release processes and so on. So I'm currently at Axis Communications, where I work on systems for developing Linux based surveillance cameras and other similar products.
[00:01:39] Unknown:
And going back to you, Daniel, do you remember how you first got introduced to Python? I don't, really. I've used Python for a long time, not least in my research projects, actually. It's one of the tools in the toolbox, but I don't recall when I was first introduced to it.
[00:01:54] Unknown:
Magnus, how about you? Do you remember when you first came across Python?
[00:01:57] Unknown:
Yeah. I do. It was when we started doing Android phones at Sony Ericsson in 2009. So the tooling around that was written in Python. I was a Perl guy prior to that, but we switched over to Python for all automation stuff at that point.
[00:02:12] Unknown:
And so in terms of the Eiffel project and protocol, I'm wondering if you can describe a bit about what it is and some of the story behind it and the goals that you are trying to achieve by introducing this protocol and the surrounding frameworks?
[00:02:27] Unknown:
Well, the story behind it, I guess, for that we need to step into the wayback machine to, like, 2011, 2012, thereabouts, when agile was a thing and everybody was doing continuous integration and continuous delivery as well, everybody had their own pipeline and so forth, and we at Ericsson at that point. And I should say that Ericsson has a long history of developing networking solutions, not least cellular, going back to the very first generation. And what some people maybe don't realize is that these are actually vast software systems with thousands and thousands of engineers working on them. And in that situation, you know, when I'm a developer, maybe I'm working on a component which is integrated into a platform, which might be important for several different network nodes delivered to customers in different network topologies and different variants and so forth. You have a rather complex network of interdependencies between different software assets.
And in that situation, it can be very difficult when everybody's doing their own thing. Everybody has their own pipeline, and, you know, we all automate our tests, and we're all running Hudson as it was back then or some other tool or whatever people might prefer. It's very difficult to get this kind of cross-organizational transparency on what is actually going on. I can keep track of what's happening in my component, but it's very difficult for me to understand who's actually using it, how it's going for them, what kind of feedback I might get from them. Did my software actually make it to a customer yet? Which customers? Where was it integrated?
How was it tested? And so forth. So getting this kind of transparency is really, really hard, and that's really where Eiffel comes in, trying to address that. Because this integration network is a polyglot environment, all kinds of tools involved. Trying to get a large organization to unite on a single set of tools is, at least in our case, nearly impossible. And we realized that even if we were to succeed at that point, tools change. Right? Technologies shift. So a few years down the line, it would be something else. We needed some kind of distributed, decentralized system to be able to get the kind of traceability we needed.
I think one engineer I talked to put it like this: in a time before Eiffel, making a software change was like throwing it into a black hole, and then you got angry mails back a few weeks later. And that's the kind of situation we wanted to address, giving people that kind of traceability, understanding what happened to my software, who picked it up, where was it integrated, how was it tested, where was it delivered,
[00:05:22] Unknown:
and also providing that feedback from both ends, both upstream and downstream in the integration flow. So you're mentioning the issue of being able to understand where in the overall life cycle a given piece of software, or a particular change that you've made to the code, is. So as you're saying, there are these different stages of the build-test-release cycle. And as a software engineer, unless you're the one who's managing all those different stages, it can definitely be difficult to understand where we are in this process. So for a simple web application where it's just I build the code and then it goes out to this one web server or this fleet of web servers, it can be fairly straightforward to understand. But for the case of, for instance, the Android application that you were mentioning, you know, it's a much more convoluted process, and there are different upgrade cycles that end users might have. And so I'm curious, what are some of the types of questions or pieces of information that are necessary to track to be able to understand more about that overall flow and the overall process and life cycle of those software changes?
[00:06:24] Unknown:
Right. Yeah. And I think an important point is it's not just a build-test-release cycle, but many, many different such cycles. You know, every little asset in the vast dependency network has its own cycle.
[00:06:37] Unknown:
So trying to get an overview of that is really the tricky part. We're certainly a lot smaller an operation than Ericsson, but we have the same problems. Maybe it's as simple a thing as generating a change log, if you want that for the whole system, where you only have the binary blob. It has a name, it has a version, but what was the source repository, or repositories, used for building that? If you wanna dig into, you know, suddenly things are crashing, we think it's this component, what was changed in there? The source code might not even come from the same Git server. It could be something else,
[00:07:13] Unknown:
and simple things like that, where it can certainly help out. Yeah. And as for use cases, Tobias, you could imagine something like, you know, let's say you have an incident in customer deployments. Right? And then you want to troubleshoot that, and you want to try to find out, okay, so what actually changed from this version to that version? And that can be quite hard in a complex environment, especially in a polyglot environment, where you would essentially need to go into each component's or each team's individual continuous integration tool and check their build scripts and their build configurations to try to pinpoint what changed from one version to the next.
So instead, we want this kind of transparent overlay, a shared language, whichever tools you happen to use, whichever build scripts, whichever language you use for your software. We want this shared protocol that tells us what actually changed
[00:08:08] Unknown:
so we don't have to do that kind of archaeology. We can just figure out what changed through a simple lookup. And so as far as Eiffel itself, as I mentioned at the opening, it's a protocol first and foremost, and there are different components that interact in this overall ecosystem. And so I'm wondering if you can give a bit of detail into the role that Python has played in the evolution and implementation of Eiffel as a protocol and as a concrete implementation?
[00:08:38] Unknown:
It hasn't played a major role. We do use Python for various scripts in the protocol repo. And, definitely, some of the tools in the ecosystem are written in Python. We also have a lot of Java. We have some Go and so on. So it is polyglot, and that's part of the point. I wouldn't say Python plays a greater role than any other language.
[00:09:00] Unknown:
In terms of the people who are interacting with the Eiffel system, I'm curious what the broad personas are and how that helps to inform where you're spending your time and energy on enhancing the protocol or building reference implementations or building tooling or visualization layers, and just some of the kinds of interactions that take place with Eiffel and how that plays into a given CI/CD framework?
[00:09:29] Unknown:
With the caveat that people use different names and phrase roles and responsibilities in different ways, I guess the main and most obvious one is the developer, who wants to understand what happened to my software, where is it going, where is it in the pipeline, am I causing problems with my changes, not just in my little part of the system, but downstream from me? And, also, these dependencies that I have, that I am integrating, what happened to them? Can I trust them? How were they tested? How were they used by others? But, also, from a testing point of view, it's a way of understanding what did we test where, which requirements did we verify, which environments was this executed in, which tests were executed for this version or that version, and how do they compare.
From a kind of release management point of view, it's a way of getting an overview of what is the actual delta between this version and that version, what did we test, what did we not test, in which environments did that happen? From a configuration management point of view, it gives a record of what goes into any particular version of your software, including third party components as well. Very much of what we do is, after all, open source, which we benefit from and contribute to. So getting an overview of everything that goes into your build, again, in a technology agnostic way. You know, I don't want to care about whether you happen to use Tekton or Spinnaker, or if you use Jenkins, or if you use Artifactory or Nexus, or if you use Maven or Gradle.
I don't want to care about those things. I just want us to share a common protocol where you can inform me of the things that you have done so that I can pick up your stuff and integrate it into my stuff, and then see what happens to that downstream.
[00:11:26] Unknown:
And, certainly, if you build on top of Eiffel, you could extract various metrics and things that will be interesting for management, for example. So depends on what you build upon Eiffel. Again, it's just a protocol. I see it more as an, you know, information platform. Platform is an overused word, but it depends very much on what you build on top of it. And you can use it for traceability, like Daniel mentioned, but you can now also use it to actually trigger activities in the pipeline itself. You know, a particular artifact is produced, you act upon that and sign the files in the artifact, and produce a new artifact that contains the signed files, for example. That might in turn get picked up by some other component and be deployed to a test environment or something like that. Elaborating on that, this triggering of activities is an important point
[00:12:15] Unknown:
because you can do quite complex triggering and complex logic in your pipeline. But, again, it doesn't only depend on what happens in a particular continuous integration tool, but on what happens across many pipelines. You could say things like, well, if there is a new artifact coming from that component team over there, and it has actually been built into that product over there, and it has been tested such and such with those results,
[00:12:46] Unknown:
then I want to do something. I'm not interested in testing out my big system unless the subcomponent that I depend upon has reached some kind of confidence level. So I'm not wasting test resources, lab resources, hardware, etcetera, on something that hasn't been proven not to be garbage.
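The cross-pipeline triggering Daniel describes can be sketched in a few lines. This is a simplified illustration over toy dict-shaped events: the event type names approximate Eiffel's vocabulary, and the ELEMENT and CONTEXT link type names are assumptions for the example, not the exact schema.

```python
# Trigger logic sketch: act on a new artifact only once it has been built
# into a downstream composition AND a test suite run against that
# composition has passed, so no test resources are wasted on "garbage".

def find_links(event, link_type):
    """IDs of all events this event links to with the given link type."""
    return [l["target"] for l in event.get("links", []) if l["type"] == link_type]

def should_trigger(artifact_event, all_events):
    """True once the artifact is part of a composition with a passing test suite."""
    artifact_id = artifact_event["meta"]["id"]
    composition_ids = {
        e["meta"]["id"] for e in all_events
        if e["meta"]["type"] == "EiffelCompositionDefinedEvent"
        and artifact_id in find_links(e, "ELEMENT")
    }
    return any(
        e["meta"]["type"] == "EiffelTestSuiteFinishedEvent"
        and e["data"].get("outcome", {}).get("verdict") == "PASSED"
        and composition_ids.intersection(find_links(e, "CONTEXT"))
        for e in all_events
    )
```

Because the subscriber only inspects events, it never needs to know which CI tool emitted them.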
[00:13:02] Unknown:
Given the capabilities that it provides, I'm curious what your sense is as to the level of scale or complexity that is necessary or useful to start thinking about incorporating Eiffel into a CI/CD workflow, or if there is a kind of minimum size at which point it is too much additional overhead to make it worth the effort of getting it integrated and building on top of it? I would say it would have to be fairly large.
[00:13:32] Unknown:
If you just have a monorepo, you build one product, you only have, like, one test stage or something, it's gonna be way too much overhead. If you have multiple stages of delivery where one pipeline delivers to the next, for example, with intervening tests, and you have lots of third party components that you wanna track, then it might be worth considering. But as for, you know, exact numbers,
[00:13:55] Unknown:
I don't know. No, that's fair. I was just looking for a kind of ballpark, rough estimate. As with everything in software, it depends. You know, there might be some small software shop that has one relatively straightforward delivery pipeline, but they wanna be able to instrument that, you know, all the way through to be able to, as you were saying earlier, track certain metrics about what is our delivery speed, how long does it take to go from I made a change in this source file to this file is live in production, kind of a thing. That's a fair point. But as a rule of thumb, if I were a simple webshop, say,
[00:14:29] Unknown:
and I had a single pipeline and a single tool, then, no, I wouldn't make looking into Eiffel a priority. I would rather try to use the tooling I already had to get better transparency.
[00:14:42] Unknown:
It's also a matter of, you know, the amount of tooling. Even if you have multiple teams and some software hierarchy in terms of deliveries, if they're all using the same tooling, for example, even if you are fairly large, maybe it doesn't make that much sense. But if you have some teams using Jenkins, some using Jenkins X, some using Tekton, and so on, then the cognitive overhead for people to understand that becomes much larger, and integrating them point to point probably also becomes more difficult. So you might wanna have some kind of, you know, dashboard or overview tool anyway. And instead of building it from scratch, it might be a good idea to look at Eiffel for doing that. That's a very good point, and lock-in is part of it. Because even if you can build a kind of
[00:15:31] Unknown:
system on top of a particular tool suite and say, well, in our company, we're all going to use Jenkins or what have you, and we're going to have this huge server or the same servers, and you can enforce that at a particular point in time. In 5 years, you might realize you actually want to switch out that technology. And then that might be a difficult thing to do, because you have built yourself into it. One point of creating the Eiffel protocol was to be technology agnostic and make it easy to switch out parts underneath, to make it frictionless to switch out one server here, let one team experiment with something else over there, so we don't have to all go in lockstep.
[00:16:16] Unknown:
Digging into the specifics of the Eiffel protocol and the architectural elements, I'm wondering if you can describe the kind of broad design of the protocol and the systems and the integration points for the CICD workflow and the software delivery process.
[00:16:34] Unknown:
So the protocol, it's based on JSON schemas. I think there are 23 different kinds of events. They're individually versioned, have JSON schemas, and each describes one particular kind of event that might occur in the pipeline. So there are a couple of different groups of events that are related, and events link to each other via their IDs, so we're actually building a graph in the end. And that's a really important point that I don't think we've really covered so far, because that's what allows you to get the traceability. You have the artifact creation event, which basically states, okay, an artifact with this ID and version has been created, and it could link to the source code that was used to build the artifact. It could also link to an environment describing the host or container environment used to produce the artifact, and so on. So you've got artifacts. You have activities that are really just empty containers within which you could run tests or produce artifacts.
But activities have outcomes. They have logs and so on. And activities can be used if you wanna visualize your pipeline. You could basically extract the activities and how they're connected and get the sort of node graph that you would see in a CI visualization tool, for example. There are also various test related events, test suites, and test cases: now this test execution has started, now it finished, and these were the results, and so on.
[00:18:12] Unknown:
An important point is that this directed acyclic graph, or DAG, always points backwards in time. So events, as we call them, always refer to something that happened previously, and they can do that in specifically typed ways. So like Magnus mentioned, an artifact can, for instance, say, well, you know, I was built in this particular environment, and at the same time, I was built from this particular composition, pointing to another event, with that event describing the composition. And that event describes the composition by pointing to other events declaring other artifacts or other repositories.
And so it goes. So you can trace all the way from a final delivery through these different events in the graph, all the way to maybe an issue that triggered a software change, that triggered the building of an artifact, that triggered the tests, that then triggered the composition of something else, and so on and so forth. And we find that that is quite a powerful concept, and it becomes very agnostic in the sense that there are no point to point integrations. You know, I never listen to your pipeline per se, but I'm listening to messages that are, you know, publish-subscribe.
Messages saying, there is now a new artifact of this type that I'm interested in. And so as a recipient of that, I don't really care who published it. I don't care if you built it using this or that build script or this or that pipeline. All I'm interested in is someone saying there's a new version of this artifact and where to pick it up. And then someone else, or perhaps the same agent, perhaps even a manual agent (this doesn't have to be all automated, even though we encourage automation, obviously), some agent says, well, now we executed a test.
The reason we executed the test was this or that happened. So you get this kind of rather intricate multidimensional, if you will, graph of events, which gives you very good visibility into what is going on. And it kind of caters to different use cases so that you don't have to use the full thing. You can select the types of events that you think are relevant to you and kind of start out with those. Maybe just saying, well, you know, we use events to declare artifacts and where to find them and what they were built from. And I'm happy with that, and you don't really need to do any more. And then if you want to, you can build on top of that graph by beginning to emit, say, test events declaring tests being started and tests finishing and test results.
And then you can add on top of that, using other types of events describing other aspects of your integration flow. So the system is designed to be very modular in that sense.
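The backwards-pointing graph Daniel describes lends itself to a simple walk: every link targets an earlier event, so tracing a delivery back to the change that caused it always terminates. A minimal sketch with three invented toy events (the type and link names here are illustrative, not the official Eiffel ones):

```python
# Walk the event graph backwards in time, yielding each upstream event.
def trace_back(event_id, events_by_id, depth=0):
    """Yield (depth, event) pairs for an event and everything it links to."""
    event = events_by_id[event_id]
    yield depth, event
    for link in event.get("links", []):
        yield from trace_back(link["target"], events_by_id, depth + 1)

# Toy lineage: a test run links to an artifact, which links to a change.
events_by_id = {
    "change-1":   {"meta": {"id": "change-1", "type": "SourceChangeSubmitted"}, "links": []},
    "artifact-1": {"meta": {"id": "artifact-1", "type": "ArtifactCreated"},
                   "links": [{"type": "CAUSE", "target": "change-1"}]},
    "test-1":     {"meta": {"id": "test-1", "type": "TestSuiteFinished"},
                   "links": [{"type": "SUBJECT", "target": "artifact-1"}]},
}

lineage = [e["meta"]["type"] for _, e in trace_back("test-1", events_by_id)]
# lineage == ["TestSuiteFinished", "ArtifactCreated", "SourceChangeSubmitted"]
```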
[00:21:14] Unknown:
What are some of the pieces of information that you have found to be necessary as a common subset across all of the different event types to be able to understand, maybe, who the common actors are, or how the chaining of these events works? I know that you said that there was the ID field, and maybe how do you manufacture that ID in order to ensure that it is deterministic as a piece of software or an artifact travels through the different life cycle stages?
[00:21:44] Unknown:
One way of putting it is every event has one meta field, one data field, and one links field. The meta field is the same for all event types. The data field is different for each type and depends on the kind of information we're trying to represent. So, for instance, let's pick an example. We have an event called the artifact published event. Right? The kind of information we provide there is the locations where the artifact is published, the name of those locations, the type of those locations, and any URIs. That's what we want to convey with that. So we try to keep it as small and condensed as possible for each event type. Now, the artifact published event can then link to an artifact.
If we want to describe the actual artifact that was published, then we link to it with a specific link type. And the linking is done via a universally unique ID that is randomized. So every event has a UUID we can use to link to it. The meta part of the event, as I mentioned, is also the same. And the kind of information we put in there is: we have the ID, which is on a UUID format; we have the type and the version of the event; and then we have a timestamp. And then, optionally, you can describe something about the source. You know, where did this come from? Did it come from a particular agent? You can put that in there if you want to.
And then we have a security block in there in the metadata that is meant for integrity protection, to ensure that nothing was tampered with, depending on whether this is an environment that you can trust or not.
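A minimal sketch of the meta/data/links envelope described here. The field layout follows the transcript; the version string, the purl-style identity, and the ARTIFACT link name are illustrative assumptions rather than the exact schema:

```python
import time
import uuid

def new_event(event_type, version, data, links):
    """Assemble the shared envelope: the meta block is common to all
    event types, while data is type-specific and links point backwards."""
    return {
        "meta": {
            "id": str(uuid.uuid4()),          # randomized UUID other events link to
            "type": event_type,
            "version": version,
            "time": int(time.time() * 1000),  # epoch-millisecond timestamp
        },
        "data": data,
        "links": links,
    }

created = new_event("EiffelArtifactCreatedEvent", "3.0.0",
                    {"identity": "pkg:example/app@1.4.2"}, [])
published = new_event(
    "EiffelArtifactPublishedEvent", "3.0.0",
    {"locations": [{"type": "ARTIFACTORY",
                    "uri": "https://artifacts.example.com/app-1.4.2.jar"}]},
    [{"type": "ARTIFACT", "target": created["meta"]["id"]}],  # link by UUID
)
```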
[00:23:35] Unknown:
And that's basically it in the meta field. The time field could be used for metrics, for example, since you know when something was sent. So if you wanna see how long an activity takes, you would grab the activity triggered event and the activity finished event and subtract the meta.time fields, and you would get the total duration of that activity.
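That metric is a one-line subtraction over two events' meta fields. A sketch, assuming the millisecond timestamps described above (the event type names are invented for the example):

```python
def activity_duration_ms(triggered, finished):
    """Duration of an activity: finished meta.time minus triggered
    meta.time, both in epoch milliseconds."""
    return finished["meta"]["time"] - triggered["meta"]["time"]

triggered = {"meta": {"type": "ActivityTriggered", "time": 1_700_000_000_000}}
finished  = {"meta": {"type": "ActivityFinished",  "time": 1_700_000_042_500}}
# activity_duration_ms(triggered, finished) → 42500 (42.5 seconds)
```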
[00:23:56] Unknown:
For a team or an organization that's interested in investing in the Eiffel protocol, building out their own implementations, and being able to gather and analyze these pieces of information, what are some of the infrastructure level requirements? What are some of the integration steps that are necessary to start propagating and storing and analyzing these events?
[00:24:22] Unknown:
You would wanna have some kind of message bus. All the tools in the ecosystem use RabbitMQ, but the protocol itself doesn't care. You could use carrier pigeon if you like, or, more realistically, Kafka or similar. But right now, RabbitMQ is the prevailing choice. In practice, you also wanna have an event repository, where everything sent on the bus is stored and made searchable. Because sooner or later, even if you have a listener to the system that listens for events and acts upon them in real time, events that arrive will have links to previous events. And unless each service has a cache of all past events, which, for various reasons, is not entirely realistic, it needs to be able to query the event repository to obtain those events.
I would say it's a bit of a drawback in one sense that an event isn't self contained. Events are usually pretty small, and it's the links that make them really useful. So the drawback is that you have to obtain those linked events somehow, either by caching or by obtaining them from the event repository.
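A toy stand-in for the event repository Magnus describes, to show why it is needed: events arriving off the bus carry only link target IDs, so a consumer comes back to the repository to materialize the linked events. A real repository would be a persistent, searchable service, not an in-memory dict:

```python
class EventRepository:
    """Minimal sketch: store every event seen on the bus by its meta.id
    and resolve link targets on demand."""

    def __init__(self):
        self._by_id = {}

    def store(self, event):
        self._by_id[event["meta"]["id"]] = event

    def resolve_links(self, event):
        """Return the previously stored events this event links to."""
        return [self._by_id[link["target"]]
                for link in event.get("links", [])
                if link["target"] in self._by_id]
```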
[00:25:31] Unknown:
As far as the specific workflows, we've talked a little bit about being able to send events when, you know, maybe an issue is generated, a particular artifact lands in an artifact repository, an artifact is deployed to a server. What are some of the types of workflows that an engineering or product team might build using Eiffel, both in terms of the events that they're sending and how they're actually using that information, aggregated across those event flows, to feed back into their development cycle?
[00:26:08] Unknown:
It depends on, you know, what your goal is with Eiffel, and that's something that depends on your organization. I mean, one thing could be to trigger test activities based on artifacts that we've produced, without having point to point integrations. So you could go as simple as just, you know, inserting some kind of script at the end of your build script that emits the artifact events, and then you would have a listener on the bus that would act upon that. Depending on your setup, that may or may not be a huge improvement over whatever it is you're using now. That could be one sort of way in. The other could be to go down the traceability route, say, product structure. If you have a monorepo, it's pretty simple.
But if you have a system that's built up of various components, just understanding the structure of the product: we have multiple products, and they have different subcomponents. What goes into the finished product, and what source code goes into each subcomponent? So the structure itself could be an interesting artifact that you might wanna visualize or somehow publish, and then the source code could be another. In that case, you would probably have to hook into your source control system somehow, or just create the events afterwards based on a checked out repository. That's also a possibility.
So it depends very much on, you know, what the desired results are.
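As a sketch of the "script at the end of your build script" route Magnus mentions: assemble the artifact event as JSON and hand it to the message bus. The identity format and the CAUSE link type are illustrative assumptions, and the publishing step is shown only as a comment since it needs a live broker:

```python
import json
import time
import uuid

def artifact_created_event(identity, change_event_id):
    """The JSON blob an end-of-build hook could publish, linking the new
    artifact back to the change event that caused it."""
    return json.dumps({
        "meta": {"id": str(uuid.uuid4()),
                 "type": "EiffelArtifactCreatedEvent",
                 "version": "3.0.0",
                 "time": int(time.time() * 1000)},
        "data": {"identity": identity},
        "links": [{"type": "CAUSE", "target": change_event_id}],
    })

payload = artifact_created_event("pkg:example/app@1.4.2", "change-event-uuid")
# With a RabbitMQ client such as pika, publishing is then roughly:
#   channel.basic_publish(exchange="eiffel", routing_key="", body=payload)
```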
[00:27:32] Unknown:
I can only second that. And I would say there are mainly two categories of use cases. One is for driving your pipelines, and one is for analyzing them. And then there are different needs within those. But you can use Eiffel to actually drive the behavior of your pipelines by, as Magnus mentioned, reacting to events that are emitted. Now, that tends to be driven by an engineering point of view. As a software developer, I want something to trigger my builds in my pipeline, that sort of thing. And I don't want to integrate with people's different pipeline tools. I want an interface to integrate with.
The other part is more analysis driven, which is typically more driven from a management point of view. We need to understand what's happening. We need to understand the product structure. We need to understand how long things take, what is the lead time, that sort of thing. Depending on where you're coming from, you're going to be looking at different types of events emitted from different agents at first, but then you can always build on top of that. You know, flesh out your graph with more information and add more agents as you go along to satisfy more use cases and more user perspectives.
[00:28:55] Unknown:
Selfishly stealing from an example that I'm working through right now: I have a fairly large Django application that I deploy onto an EC2 server. And as part of that build, I also have some operational tooling that goes into it. So I've got my log shipper, I've got my service discovery tool, I've got my agent for being able to retrieve secrets at runtime. And then I also have a JavaScript application that gets deployed to an S3 bucket and is used as part of the full experience. And so as the operations person, I might get a question from someone else on the team to say, what is the specific version of the application that has been deployed?
When was it deployed? You know, being able to answer those types of questions right now might rely on me incorporating all that information explicitly into my build pipeline to make sure that I have all of the sources as inputs into the pipeline so that I can see, you know, okay. This pipeline ran. This is when it finished. And going back to the beginning, these are the versions of the inputs. But what would be the process for being able to use Eiffel to answer all of those questions in a more kind of streamlined fashion?
[00:30:14] Unknown:
Well, what we probably have then is the agent that does the deployment would emit an event saying that we are now starting this deployment activity. And that would link to a composition saying, you know, what are the things we are deploying, and that will happen by referring to events. So every asset that you deploy would be represented by a different event saying this artifact has now been created so that we can point to it in this deployment information. So every asset and every action that you want to be able to look at needs to be represented by an event. That means the different pipelines need to emit an event to the message bus, for instance, RabbitMQ.
And then these events need to be stored somewhere where you can do a lookup to see what happened. Would you agree with that, Magnus?
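As a sketch of the event graph just described, the snippet below builds three linked Eiffel-style events in Python: an artifact, a composition that contains it, and a deployment activity that points at the composition. The meta/data/links envelope follows the general Eiffel event shape, but the schema version, field values, and the purl-style identity here are illustrative assumptions rather than values taken from the spec.

```python
import json
import time
import uuid

def eiffel_event(event_type, data, links=()):
    """Build a minimal Eiffel-style event envelope: meta, data, links."""
    return {
        "meta": {
            "type": event_type,
            "version": "3.0.0",  # illustrative; each event type has its own schema version
            "id": str(uuid.uuid4()),
            "time": int(time.time() * 1000),  # milliseconds since the epoch
        },
        "data": data,
        "links": [{"type": link_type, "target": target} for link_type, target in links],
    }

# Every deployed asset is represented by its own event ...
artifact = eiffel_event(
    "EiffelArtifactCreatedEvent",
    {"identity": "pkg:pypi/myapp@1.4.2"},  # made-up identity
)
# ... a composition event points at the assets being deployed ...
composition = eiffel_event(
    "EiffelCompositionDefinedEvent",
    {"name": "myapp-deployment"},
    links=[("ELEMENT", artifact["meta"]["id"])],
)
# ... and the deployment activity links to the composition.
activity = eiffel_event(
    "EiffelActivityTriggeredEvent",
    {"name": "deploy-to-staging"},
    links=[("COMPOSITION", composition["meta"]["id"])],
)

payload = json.dumps(activity)  # this JSON blob is what would go onto the message bus
```

Each event refers to the others only by the ID in its links, which is what lets a later lookup reconstruct what was deployed and when.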
[00:31:15] Unknown:
Yeah. Definitely. Makes sense. So you would typically add to your build script. You would add something that gathers the information that you wanna include. Again, it depends on whether you want to include source code information or if it's enough with just the artifact identity, for example. Do you wanna capture the environment where things were built, for example? And then assembling those JSON blobs and sending them. Regarding deployment, there currently aren't any standard events for deployment. That's something we should definitely start working on.
But there's nothing stopping you from making up your own events
[00:31:50] Unknown:
that build upon the existing ones. That would be totally fine. Or you can use a generic event, like just saying it's an activity. That's true. Going into deployment is interesting because we have had quite a few more philosophical conversations on where does this end. You know, what do we not want to describe with these events? You know, if we go into deployment, then soon we kind of rub up against various monitoring solutions, like maybe Prometheus or something like that, and we don't want to compete with anything like that. So where are the kind of conceptual boundaries?
How far do we take the protocol? It's not an obvious question to answer, but at the moment, as you said, there are no events representing that. There are no events representing deployment to the target environment or what is going on in the target environment. But then it's an open protocol,
[00:32:50] Unknown:
so it can be added to. Taking that a little bit further, another interesting challenge that I'm working through right now is I have this pipeline. I'm able to deliver these changes into a given environment, but then I need to alert somebody to say, this is ready for testing and validation. So then the next step of the workflow is somewhat manual, and so I need to be able to get some feedback to say, okay. This is now ready to go from, you know, a validation to a user acceptance testing environment or from user acceptance testing into production. Emit an event back, and then I have another stage of the pipeline that says, okay. This has been accepted for this environment, so now I can trigger the pipeline that pushes that into the next stage.
And, you know, being able to manage that graduation of changes from 1 environment to another, particularly in the case where you maybe don't have dedicated branches for a single repository to indicate when a particular artifact is going from 1 environment to another where maybe everything is using trunk based development or you have a composition of dependencies that are all being bundled together into a particular deployment that all need to get moved together.
[00:34:09] Unknown:
There is an event called confidence level modified, where you state that some kind of subject, for example, an artifact, has reached a confidence level, and that's just an arbitrary string. So it could be, you know, acceptance test passed, or ready for this or that environment. And it could also indicate why this decision was made and point to test results, for example. Or it could be just manually sent based on someone's feelings about a particular artifact, for example. And that in turn could be used on a dashboard. It could trigger other activities in the pipeline.
And you can also have a hierarchy of confidence level changes, where you have 1 confidence level which is based on other confidence levels. You know, the unit test, the functional test, the acceptance tests, and the manual test. That means we are, you know, prod ready or something. Quite a typical use case might be to present a dashboard like that to, say, a release manager,
[00:35:11] Unknown:
aggregating all the test information from all the different pipelines involved in producing the system. But then at the end of the day, maybe it's a judgment call. You know, do we want to release this? Do we feel good about it? You literally push a button in the interface. And then the dashboard itself emits a confidence level event saying, yes. We now feel good about this, which then in turn maybe triggers another pipeline, which starts some deployment activities, giving rise to even more events. So we can see what happened downstream from us pushing that button.
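A rough illustration of the confidence level hierarchy just described, assuming the general shape of EiffelConfidenceLevelModifiedEvent (a name, a value such as SUCCESS or FAILURE, a SUBJECT link to the artifact, and CAUSE links to the underlying verdicts); the field names here are a sketch rather than a verbatim schema:

```python
import time
import uuid

def confidence_event(name, value, subject_id, cause_ids=()):
    """Sketch of an EiffelConfidenceLevelModifiedEvent; fields are illustrative."""
    return {
        "meta": {
            "type": "EiffelConfidenceLevelModifiedEvent",
            "version": "3.0.0",
            "id": str(uuid.uuid4()),
            "time": int(time.time() * 1000),
        },
        "data": {"name": name, "value": value},  # e.g. SUCCESS / FAILURE / INCONCLUSIVE
        "links": (
            [{"type": "SUBJECT", "target": subject_id}]
            + [{"type": "CAUSE", "target": cause} for cause in cause_ids]
        ),
    }

artifact_id = str(uuid.uuid4())  # stand-in for a previously sent ArtifactCreated event id

# Lower-level verdicts emitted by individual test pipelines.
unit = confidence_event("unit-tests-passed", "SUCCESS", artifact_id)
acceptance = confidence_event("acceptance-tests-passed", "SUCCESS", artifact_id)

# An aggregating verdict: "prod-ready" only if every underlying level succeeded.
verdict = "SUCCESS" if all(
    event["data"]["value"] == "SUCCESS" for event in (unit, acceptance)
) else "FAILURE"
prod_ready = confidence_event(
    "prod-ready", verdict, artifact_id,
    cause_ids=[unit["meta"]["id"], acceptance["meta"]["id"]],
)
```

The release manager's button on the dashboard would emit an event just like `prod_ready`, with the CAUSE links recording why the decision was made.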
[00:35:39] Unknown:
For teams who are investing in the Eiffel protocol and building tooling and platforms around it, what are some of the other technical or conceptual or organizational challenges that they might encounter?
[00:35:52] Unknown:
I would say, you know, setting up the links could be a bit tricky, because it typically helps to have deep tool integration. For example, if you have some kind of CI system where you wanna model builds in that system with activities, and then you can connect artifacts to those activities. To get the activity events correct, you would probably wanna have the tool itself send the activity events. That could be difficult. You might need to write a plugin or something that might not be available to you. You could also send the activity events within the build steps themselves. That could be error prone. You could lose events if the build just terminates early, for example. You would have to guarantee that you can send all the events you wanna send. On the other hand, if you're doing things from the outside, you would want to inject the ID of the activity event into the build itself. Otherwise, it will have difficulties linking things that it produces to the activity. So building the DAG can sometimes be a little bit difficult if you are sort of mixing paradigms, if you're not completely in the Eiffel domain, moving from the reality domain where you have, you know, commit IDs.
You don't link to commit IDs. You link to Eiffel events that represent those commits. And doing that could be a little bit difficult.
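To make the "link to events, not commit IDs" point concrete, here is a minimal sketch: the commit is represented by its own event, a downstream artifact event links to that event's ID, and recovering the commit means following the link through the stored graph. The CAUSE link type and the field names are illustrative assumptions.

```python
import uuid

def event(event_type, data, links=()):
    """Trimmed-down event envelope for illustration."""
    return {
        "meta": {"type": event_type, "id": str(uuid.uuid4())},
        "data": data,
        "links": list(links),
    }

# The commit itself is represented by an event; downstream events link to
# the event id, never to the git SHA directly.
change = event(
    "EiffelSourceChangeSubmittedEvent",
    {"gitIdentifier": {"commitId": "0123abc", "repoName": "myapp"}},
)
artifact = event(
    "EiffelArtifactCreatedEvent",
    {"identity": "pkg:pypi/myapp@1.4.3"},
    links=[{"type": "CAUSE", "target": change["meta"]["id"]}],  # link type illustrative
)

# Recovering the commit for an artifact is a lookup through the event graph.
events_by_id = {e["meta"]["id"]: e for e in (change, artifact)}
cause = events_by_id[artifact["links"][0]["target"]]
commit = cause["data"]["gitIdentifier"]["commitId"]
```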
[00:37:16] Unknown:
In terms of the community around Eiffel, how would you characterize the current level of adoption or the types of implementations or available integrations that people can take off the shelf and experiment with as they start to explore the protocol and understand how it fits into their own processes?
[00:37:38] Unknown:
Well, there is a basic set of components. In terms of an event repository, which everyone probably needs, you could set up your own database. And there are a couple of API implementations built on top of such a database. I think all those implementations use MongoDB. So there are a few such components. There's also 1 component that allows you to consume messages from the bus and insert them into the database. So there are some components. There is a Jenkins plugin that should be quite plug and play, so to speak. If you're using Jenkins, you'll get activity events, and it'll make it easier for you to emit other events from within your builds. Those are, I think, the most important pieces available.
There are a number of experiments as well, but I always say that the reasonably production ready pieces are the ones I mentioned. And in terms of the community, we're very happy to have several companies.
[00:38:29] Unknown:
Ericsson, where I work myself, is a very active member, as is Axis. And we have other companies as well on board, and we're happy to see that number grow. For the components, like Magnus described, there are multiple existing components that you can pick up. The core part is obviously the protocol. And then we really strive for having a smorgasbord, if you will, surrounding that protocol with different tooling based on different technologies, as might suit your needs depending on the purposes you want to put Eiffel to. So as mentioned, for instance, an event repository is a central part of it, and there are different options you might go with there.
And then there are other options, for instance, how you might want to plug it into your pipeline or how you might want to plug it in your artifact repository
[00:39:21] Unknown:
and so forth. It's also worth mentioning that Eiffel is part of the Continuous Delivery Foundation, which is actually how I came across it in the first place, which is a subset of the Linux Foundation and I believe kind of either a subset or a sibling organization to the Cloud Native Computing Foundation. So it's also another place where folks can find it and maybe discover some other interesting projects that they might want to use alongside it. There are a couple of folks that are active both within the Eiffel community and the community for CD Events,
[00:39:54] Unknown:
which is an emerging standard based on CloudEvents. So we'll see if we can get some kind of integration between Eiffel and CD Events, make it possible to translate between them. But that's still work in progress.
[00:40:08] Unknown:
And in your experience, both using it in your own organizations and building on top of it and working with the community and helping them understand its utility, what are some of the most interesting or innovative or unexpected ways that you've seen the Eiffel protocol used?
[00:40:23] Unknown:
Well, I think some of the most interesting things are these complete real time visualizations of large networks of pipelines. You know, I, as a developer, can trace step by step what has actually happened to the software that I just made and see, you know, second by second, essentially, which pipelines it is traversing, where it is integrated, which deliveries it is included in. I think that is 1 of the more powerful and most striking uses that I see, where you can follow your own software on its journey through a very
[00:40:57] Unknown:
complex web of interdependencies. And we're building something like that internally in order to make sure that we avoid the lock in effect of various CI systems and other technical ways tying us to particular implementations or limitations of CI systems, and present a view that will be consumable by both developers and project managers and so on. So that not only people with deep technical knowledge can track what's going on in the pipeline. Because a developer can almost always do that. They know the tricks, they know the URLs, the files to look at, and so on. But on a higher level, people don't know that. In your own work of investing in the Eiffel protocol
[00:41:41] Unknown:
and the community and using it in your own jobs, what are some of the most interesting or unexpected or challenging lessons that you've learned in the process?
[00:41:49] Unknown:
I've learned that event driven pipelines are hard because on the surface, it's pretty simple. You receive an event, and you act upon it. But, normally, in a pipeline, you wanna have the observability to understand when something goes wrong, what are the downstream effects of that? Or when is this particular pipeline instance actually done? When is it time to give up? And with an event driven pipeline, how would you know that? It's the individual components of the pipeline that know what to react to. So it's very hard for an outside observer to know all that. On the other hand, with an orchestrated pipeline, you would typically have a YAML file or similar that declares exactly what's going to happen.
But that has its own challenges in terms of inflexibility. So we are trying to find some kind of hybrid between the 2 where we can get the benefits of the event driven pipeline, but also the observability of the orchestrated 1. I don't know where we're gonna go there.
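The observability problem Magnus describes shows up even in a toy event-driven dispatcher: each component registers its own reactions to event types, so no single declaration describes the whole pipeline. This is a simplified sketch, not actual Eiffel tooling; the event types and handlers here are made up.

```python
# Each pipeline component independently declares what it reacts to; there is
# no central YAML-style definition an outside observer could consult.
handlers = {}

def on(event_type):
    """Decorator registering a component's reaction to one event type."""
    def register(fn):
        handlers.setdefault(event_type, []).append(fn)
        return fn
    return register

@on("EiffelArtifactCreatedEvent")
def trigger_tests(event):
    # One component's reaction: start testing any newly created artifact.
    return f"testing {event['data']['identity']}"

@on("EiffelTestSuiteFinishedEvent")
def publish_verdict(event):
    # Another component reacts to finished test suites.
    return f"verdict for suite {event['data']['name']}"

def dispatch(event):
    """Deliver an incoming bus message to every component subscribed to it."""
    return [fn(event) for fn in handlers.get(event["meta"]["type"], [])]

results = dispatch(
    {"meta": {"type": "EiffelArtifactCreatedEvent"},
     "data": {"identity": "pkg:pypi/myapp@1.4.2"}}
)
```

Knowing when the pipeline as a whole is "done" would require inspecting every registry like `handlers` across every component, which is exactly the observability gap being discussed.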
[00:42:54] Unknown:
I would say I have learned that creating an information model that is technology agnostic and generic, and yet precise and expressive enough, is really, really hard. It's kind of like, if you're familiar with XKCD, there's this strip: there are 14 different competing standards, that's ridiculous, we need 1 universal standard, and then soon there are 15 competing standards. It's very easy to end up in that situation. So trying to really get to the core of conceptually what is the relevant information in any given situation, what is it we're trying to express, what are the relevant entities that need to be represented?
You don't do that in, you know, in an evening. Absolutely.
[00:43:45] Unknown:
And to your point, Magnus, about being able to understand the ramifications of modifying a pipeline or a particular software change, it puts me in mind of the data lineage tracking that's being explored in the data engineering ecosystem of being able to understand, okay. If I make this change to my ETL job, what are the downstream systems or dashboards that are going to be impacted? And it'll be interesting to see how well that maps into the software delivery process, particularly when you have these large and convoluted graphs of, you know, delivery where maybe I'm making a change to a library that's 1 component of 15 different downstream applications, and those applications are then maybe used by various other systems. You know, if I make this change, what is the actual, you know, magnitude of the downstream effect that this is going to have?
[00:44:36] Unknown:
That's an interesting point. I mean, you can actually use Eiffel to, you know, close that feedback loop and predict what's going to happen in the future based on past behavior. And I think there was an experiment, I don't remember if you were involved with that, Daniel, but there was some work done on visualizing the pipeline without having any prior knowledge of what it looked like, and showing, like, a heat map of where the common paths were. I saw that experiment a few years ago, and it was pretty cool. That's definitely very interesting.
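Predicting downstream impact from past events boils down to a graph walk over recorded links. A minimal sketch, with a made-up dependency graph standing in for link data harvested from an event repository:

```python
from collections import defaultdict, deque

# Downstream neighbours in the delivery graph, as they might be reconstructed
# from the links of past events (edges here are invented for illustration).
downstream = defaultdict(list)
edges = [("lib", "app1"), ("lib", "app2"), ("app1", "dashboard")]
for upstream, dependent in edges:
    downstream[upstream].append(dependent)

def impacted(start):
    """Breadth-first walk returning everything downstream of a change."""
    seen, queue = set(), deque([start])
    while queue:
        node = queue.popleft()
        for nxt in downstream.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen
```

With real event data, a change to `lib` would surface `app1`, `app2`, and the dashboard as candidates for downstream effects, which is essentially the lineage question Tobias raises.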
[00:45:06] Unknown:
So for people who are intrigued and considering whether Eiffel is something that they want to invest in and build on top of, what are the cases where it's the wrong choice and they'll be better suited with just going with a vanilla pipeline definition or building their own sort of visibility events into their pipeline to be able to understand what their delivery process looks like? If the pipeline is too simple,
[00:45:30] Unknown:
then you should probably look elsewhere. There is an overhead, both cognitive, in terms of understanding the protocol and how it could be applied to your process, because usually you have a process and you want to describe that with Eiffel, there's an overhead there, and then tooling overhead. So it would have to be a fairly large operation for it to be worthwhile, I think.
[00:45:53] Unknown:
As you continue to use Eiffel in your own work and contribute to the protocol and the community, what are some of the things you have planned for the near to medium term, either as far as upgrades to the protocol or additional reference implementations or out of the box tooling that people can use to get started with Eiffel? When it comes to the protocol,
[00:46:13] Unknown:
we are gonna look into the events that define source code changes because we found that the existing events have some issues that make it hard to describe what's going on. And also deployment events is probably something we're gonna look into fairly soon as well. Ecosystem wise, we have, in the past 6 months, added a couple of SDKs that make it easy to work with events in your code. If you're, you know, just using Python, you can just have your, you know, anonymous dicts, and it's pretty easy to work with JSON data. If you have typed languages, that's a lot less fun. So there are SDKs for Go and for .NET currently.
We have a limited 1 for Python. That probably should be migrated to using data classes or maybe Pydantic or something. We also have a visualizer that we are probably gonna open source during the spring. It's a sort of low level visualizer where you can start with 1 event, and it'll show you information about that event and the surrounding adjacent events, the events it links to, and you can traverse the graph that way. That's a useful debugging tool for those working with pipeline development to understand what events are being emitted and so on. Not meant for end user consumption, but it's certainly useful for those working with Eiffel in an organization.
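As a sketch of what the dataclass migration mentioned above might look like, the snippet below models the meta/data/links envelope with standard library dataclasses. This is not the actual Eiffel Python SDK; the class and field layout are assumptions based on the general event shape.

```python
from dataclasses import asdict, dataclass, field
import time
import uuid

@dataclass
class Meta:
    """Event metadata; defaults generate a fresh id and timestamp."""
    type: str
    version: str = "3.0.0"  # illustrative schema version
    id: str = field(default_factory=lambda: str(uuid.uuid4()))
    time: int = field(default_factory=lambda: int(time.time() * 1000))

@dataclass
class Link:
    type: str    # e.g. CAUSE, SUBJECT, COMPOSITION
    target: str  # id of the linked event

@dataclass
class EiffelEvent:
    meta: Meta
    data: dict
    links: list[Link] = field(default_factory=list)

    def to_payload(self) -> dict:
        """Serialize to the plain-dict form suitable for json.dumps."""
        return asdict(self)

event = EiffelEvent(
    meta=Meta(type="EiffelArtifactCreatedEvent"),
    data={"identity": "pkg:pypi/myapp@1.4.2"},  # made-up identity
)
payload = event.to_payload()
```

Compared with anonymous dicts, the typed fields catch misspelled keys at construction time, which is the "a lot less fun without types" problem Magnus alludes to.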
[00:47:30] Unknown:
Alright. Well, for anybody who wants to get in touch with either of you and follow along with the work that you're doing, we'll have you add your preferred contact information to the show notes. And so with that, I'll move us into the picks. This week, I'm going to choose the movie Red Notice. Watched that recently. It's just a goofy, fun adventure film. So definitely worth checking out if you're looking to, you know, kill an evening with something not too thought provoking, but still entertaining. So with that, I'll pass it to you, Daniel. Do you have any picks this week? We just had our 3rd kid recently.
[00:48:01] Unknown:
And between work and just keeping my offspring clothed and fed and happy, we maybe just end up watching Netflix in the evening. We watched the second season of The Witcher recently, which is
[00:48:17] Unknown:
a pleasant surprise, actually. I'll second that. I enjoyed that as well. Magnus, how about you? What do you have for picks this week?
[00:48:23] Unknown:
On the same note, really, Lego. Our kid recently turned 4, so we've switched from the Duplos to the real small pieces of LEGO, and it's very relaxing and enjoyable to build things. So far, I'm really the only 1 doing the building, but it's quite a lot of fun.
[00:48:41] Unknown:
Absolutely. Children are great to give us an excuse to actually play with all the toys that we can't justify spending our time with otherwise.
[00:48:49] Unknown:
Oh, yes.
[00:48:50] Unknown:
Alright. Well, thank you both very much for taking the time today to join me and share the work that you've been doing with Eiffel and help me understand more about its use cases and utility. So definitely a very interesting project and endeavor. So I appreciate all of the time that each of you have put into that, and I hope you enjoy the rest of your day. Thank you, Tobias.
[00:49:09] Unknown:
Thank you.
[00:49:12] Unknown:
Thank you for listening. Don't forget to check out our other show, the Data Engineering Podcast at dataengineeringpodcast.com for the latest on modern data management. And visit the site at pythonpodcast.com to subscribe to the show, sign up for the mailing list, and read the show notes. And if you've learned something or tried out a project from the show, then tell us about it. Email host@podcastinit.com with your story. To help other people find the show, please leave a review on iTunes and tell your friends and coworkers.
Introduction to Guests and Topic
The Story and Goals of Eiffel Protocol
Challenges in Software Development and Transparency
Personas and Use Cases for Eiffel
Design and Architecture of Eiffel Protocol
Infrastructure and Integration Requirements
Handling Manual and Automated Processes
Community and Adoption of Eiffel
Innovative Uses and Lessons Learned
When to Use and Not Use Eiffel
Future Plans for Eiffel Protocol
Picks and Recommendations