Summary
The way that your application handles data and the way that it is represented in your database don’t always match, which leads to a lot of brittle abstractions to reconcile the two. To reduce that friction, instead of overwriting the state of your application on every change, you can log all of the events that take place and then render the current state from that sequence of events. John Bywater joins me this week to discuss his work on the Event Sourcing library, why you might want to use it in your applications, and how it can change the way that you think about your data.
Preface
- Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
- I would like to thank everyone who supports the show on Patreon. Your contributions help to make the show sustainable.
- When you’re ready to launch your next project you’ll need somewhere to deploy it. Check out Linode at podcastinit.com/linode and get a $20 credit to try out their fast and reliable Linux virtual servers for running your awesome app. And now you can deliver your work to your users even faster with the newly upgraded 200 Gbit network in all of their datacenters.
- If you’re tired of cobbling together your deployment pipeline then it’s time to try out GoCD, the open source continuous delivery platform built by the people at ThoughtWorks who wrote the book about it. With GoCD you get complete visibility into the life-cycle of your software from one location. To download it now go to podcastinit.com/gocd. Professional support and enterprise plugins are available for added peace of mind.
- Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. You can reach me on Twitter at @Podcast__init__ or email hosts@podcastinit.com
- To help other people find the show please leave a review on iTunes, or Google Play Music, tell your friends and co-workers, and share it on social media.
- Your host as usual is Tobias Macey and today I’m interviewing John Bywater about event sourcing, an architectural approach to make your data layer easier to scale and maintain.
Interview
- Introductions
- How did you get introduced to Python?
- Can you start by describing the concept of event sourcing and the benefits that it provides?
- What is the event sourcing library and what was your reason for starting it?
- What are some of the reasons that someone might not want to implement an event sourcing approach in their persistence layer?
- Given that you are storing a record for each event that occurs on a domain object, how does that affect the amount of storage necessary to support an event sourced application?
- What is the impact on performance and latency from an end user perspective when the application is using event sourcing to render the current state of the system?
- What does the internal architecture and design of your library look like and how has that evolved over time?
- In the case where events are delivered out of order, how can you ensure that the present view of an object is reflected accurately?
- For someone who wants to incorporate an event sourcing design into an existing application, how would they do that?
- How do you manage schema changes in your domain model when you need to reconstruct present state from the beginning of an object’s event sequence?
- What are some of the most interesting uses of event sourcing that you have seen?
- What are some of the features or improvements that you have planned for the future of your event sourcing library?
Keep In Touch
- John
- johnbywater on GitHub
- @johnbywater on Twitter
Picks
- Tobias
- John
Links
- CKAN
- Data.gov
- Patterns of Enterprise Application Architecture
- Object Relational Impedance Mismatch
- Event Sourcing (Pattern)
- Event Sourcing (Library)
- N-Tiered Architecture
- Domain Driven Design
- Event Storming
- ORM, The Vietnam of Computer Science
- Vaughn Vernon, Implementing Domain Driven Design
- Active Record Pattern
- Optimistic Concurrency Control
- Paxos
- DynamoDB
- Martin Fowler
- Eric Evans
- The Dark Side of Event Sourcing
The intro and outro music is from Requiem for a Fish by The Freak Fandango Orchestra / CC BY-SA
Hello, and welcome to Podcast.__init__, the podcast about Python and the people who make it great. I would like to thank everyone who supports the show on Patreon. Your contributions help to make the show sustainable. When you're ready to launch your next project, you'll need somewhere to deploy it, so you should check out Linode at podcastinit.com/linode and get a $20 credit to try out their fast and reliable Linux virtual servers for running your app. And now you can deliver your work to your users even faster with the newly upgraded 200 gigabit network in all of their data centers. If you're tired of cobbling together your deployment pipeline, then it's time to try out GoCD, the open source continuous delivery platform built by the people at ThoughtWorks who wrote the book about it. With GoCD, you get complete visibility into the life cycle of your software from one location.
To download it now, go to podcastinit.com/gocd. Professional support and enterprise plug-ins are available for added peace of mind. You can visit the site at podcastinit.com to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions, I would love to hear them. You can reach me on Twitter at @Podcast__init__ or email me at hosts@podcastinit.com. To help other people find the show, please leave a review on iTunes or Google Play Music. Tell your friends and coworkers and share it on social media. Your host as usual is Tobias Macey. And today, I'm interviewing John Bywater about event sourcing, an architectural approach to make your data layer easier to scale and maintain. So, John, could you please introduce yourself?
[00:01:34] Unknown:
Hi. My name is John Bywater. I'm a software developer in London. I graduated in engineering from Oxford and did robot development for a while, then got into consulting in Cambridge, and eventually started doing my own thing. I've been involved in various open source projects, one of which is the event sourcing project, which we're talking about today. And do you remember how you first got introduced to Python? Yes, I do. In about 2004, I met a guy called Rufus Pollock who was starting the Open Knowledge Foundation, and he wanted to develop some tools in Python. The first thing we built was a project hosting system that integrated different version control systems, such as Subversion and Git, different trackers, such as Trac and Redmine, and other common tools such as wikis and blogs, so users could start projects, add services such as a repository and a tracker, and add members who would have access to the project services. Unfortunately, shortly after we finished that project, GitHub became very popular, and we all started using GitHub. Eventually, we abandoned the project hosting project. The second project in Python was called CKAN, which was more successful. CKAN is now being used around the world to publish open data. For example, the US government's open data website, data.gov, uses CKAN.
We factored out from these projects a library that followed the patterns of enterprise application architecture, and I used this library for several other projects. Eventually, to my frustration, the domain model library became increasingly difficult to develop, for reasons that I didn't understand at the time. I later realized the trouble I was sitting on was what's known as the impedance mismatch between object-oriented software and relational databases. There's a long Wikipedia page about it. I had wanted to arrive at something general and complete, but it became increasingly exhausting and complicated. Firstly, there were temporal properties, then temporal objects, then the differences between object graphs that seemed to be like branches that could be merged. I worked on that for a bit. I was always able to get something working, but then I would think of something else that it couldn't do, and these subsequent thoughts never seemed to follow on very well from each other. I could tell it wasn't going very well, and after spending a lot of time on it, I eventually abandoned the project. When I look back, it was obvious from the number and the different kinds of patterns in Patterns of Enterprise Application Architecture that something wasn't completely resolved. Anyway, it was a big problem I didn't understand at the time, and event sourcing eventually appeared as a solution.
[00:03:59] Unknown:
And can you describe briefly what event sourcing is and some of the benefits that it provides to people who are using it?
[00:04:06] Unknown:
One definition of event sourcing suggests the state of an event sourced application is determined by a sequence of events. Another definition has event sourcing as a persistence mechanism for domain driven design. It is common for the state of a software application to be distributed or partitioned across a set of entities or aggregates in a domain model, so the application's sequence of events is really a set of sequences, one for each entity or aggregate. It's a different approach. The benefit of event sourcing lies in the sympathy between what you're coding in an event sourced system, which is the events, and the domain, which is inherently event based, in that the interesting thing in any domain is what happens, and what happens leads to a state of affairs, which could be taken as the object model.
But it's the events which lead to those states of affairs, and I feel those are more interesting. The benefit of event sourcing is that it allows you to code against what happens in the domain. That's not quite strictly true, because there we're talking about domain events, and the domain events don't necessarily need to be persisted: you could have the domain events changing objects and then save those objects. But event sourcing gives an economy of scope between coding an event based system and persisting the state of that system, because you can simply persist the events which are being published, and you don't have to do anything else.
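To make that concrete, here is a minimal sketch of deriving current state from a sequence of events, in plain Python with illustrative names rather than the library's actual API:

```python
from dataclasses import dataclass

# Illustrative domain events: each one records something that happened.
@dataclass(frozen=True)
class AccountOpened:
    account_id: str

@dataclass(frozen=True)
class AmountDeposited:
    account_id: str
    amount: int

def mutate(state, event):
    """Apply one event to the current state, returning the new state."""
    if isinstance(event, AccountOpened):
        return {"id": event.account_id, "balance": 0}
    if isinstance(event, AmountDeposited):
        return {**state, "balance": state["balance"] + event.amount}
    raise TypeError(f"unknown event: {event!r}")

# The stored sequence of events is the source of truth...
events = [AccountOpened("a1"), AmountDeposited("a1", 50), AmountDeposited("a1", 25)]

# ...and the current state is derived by replaying them in order.
state = None
for event in events:
    state = mutate(state, event)

assert state == {"id": "a1", "balance": 75}
```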
[00:05:41] Unknown:
And from my reading about the architectural pattern of event sourcing, it also allows for fixing logical errors in the way that you're processing the events. If you find a bug in your system and you want to fix the way in which you are interpreting the events, you can just replay the series of events to create a new view on the data, while the individual events themselves remain immutable. One of the other benefits of the overall approach is that it can potentially give you the ability to time travel in your data. Rather than asking, for instance, what is this person's address today, you can ask what their address was 5 years ago. In a traditional relational database system, you can only ask for the present state of the information, unless you explicitly create a new table with the history of the changes to that person's address or state.
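As a rough sketch of that time travel idea, assuming each stored event carries a timestamp (all names here are illustrative):

```python
from datetime import datetime

# Each stored event carries a timestamp; replaying only a prefix of the
# sequence yields the state as it was at that moment.
events = [
    {"occurred_at": datetime(2012, 1, 1), "address": "12 Old Road"},
    {"occurred_at": datetime(2017, 6, 1), "address": "34 New Street"},
]

def address_at(events, as_of):
    """Replay only the events that occurred on or before `as_of`."""
    address = None
    for event in events:
        if event["occurred_at"] <= as_of:
            address = event["address"]
    return address

assert address_at(events, datetime(2014, 1, 1)) == "12 Old Road"
assert address_at(events, datetime(2018, 1, 1)) == "34 New Street"
```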
[00:06:44] Unknown:
Indeed. And at that point, we can't say that the advantage of event sourcing is that you can have a history, because you can have a history with other systems too. The advantage of event sourcing lies in the primacy of the events in the system, and the avoidance of the problems that you encounter when you make the state of affairs primary and the events secondary
[00:07:07] Unknown:
to that. So you wrote a library to implement the event sourcing pattern in Python. So I'm wondering if you can describe a bit about the library itself and what your reasoning was for starting work on it.
[00:07:19] Unknown:
Sure. The library provides mechanisms useful in event sourced applications: a way for events to be stored and retrieved, and a way for events to be replayed to obtain current state. In addition, there are a few classes for making an event sourced application, such as domain event, domain entity, aggregate, repository, persistence policy, application, and so on. I started the project because it seemed that with a good library, it would be much easier to suggest doing things with event sourcing. You don't have to say, it's a good idea, but I would need to delay the start of your project by a couple of weeks so I can write a framework that will save us loads of time. You can just say, there's a library here, we can start to use it. Also, I wanted something I could add to over time, one good piece of open source code, not a number of different little frameworks that I had to leave behind when I changed job.
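A rough sketch of how those pieces fit together, in plain Python with illustrative names (not the library's actual API): an entity records pending events, and a persistence policy stores each event when the entity is saved.

```python
class DomainEvent:
    """Illustrative base class: something that happened to an entity."""
    def __init__(self, entity_id, **data):
        self.entity_id = entity_id
        self.data = data

class Entity:
    """Collects the events it triggers until they are saved."""
    def __init__(self, entity_id):
        self.id = entity_id
        self.pending_events = [DomainEvent(entity_id, type="Created")]

    def rename(self, name):
        self.pending_events.append(
            DomainEvent(self.id, type="Renamed", name=name))

class PersistencePolicy:
    """Stores every pending event of an entity that gets saved."""
    def __init__(self, event_store):
        self.event_store = event_store  # assumed to have .append()

    def save(self, entity):
        for event in entity.pending_events:
            self.event_store.append(event)
        entity.pending_events.clear()
```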
[00:08:12] Unknown:
And so going back to the ideas of when somebody might want to implement event sourcing, what are some of the reasons that they might not want to include that in the approach for the persistence layer of their application?
[00:08:25] Unknown:
It's a good question, to think about when someone might not want to do it. So let's look at what brings us to event sourcing and see what isn't included in that. For me, this term, persistence layer, comes from the layered architecture, which, when I started work, was known as n-tiered architecture, often with n equal to 2, so that you have a presentation layer and a database layer. Before the web and mobile, many applications were developed by dragging widgets onto views and hooking them up to database queries. The domain model was in the database schema, and interleaving a domain model layer between the presentation layer and the persistence layer was hard work.
It's hard to think how event sourcing could have figured in those applications. But as soon as there are different presentation technologies, the common stuff, let's say the business logic, needs to be pushed down below the presentation layer so it isn't duplicated in each interface. We can put business logic in the database, but then it's hard to maintain. Writing complicated SQL is not really object oriented software development, so we don't get the benefits of that genre: unit testing, refactoring, the agile approach, and so on don't really apply very well. So we're looking to put it in the middle, between the presentation and persistence layers, so we can write proper software. It became known as the domain layer because it, more than any of the other layers, reflects the domain supported by the application. So now we've got objects in the presentation layer and objects in the domain layer, but also a relational database that echoes the names of the domain layer, with tables named after the domain object classes and columns named after attributes and so on. It's all quite promising at first, and it works well for lots of things, but it brings us to the impedance mismatch I mentioned earlier. It brings us to all sorts of different kinds of energy draining complications.
Event sourcing finally purges that: the data are domain events, but the database doesn't know that. The database just has sequences of items. Now we don't have any problems. We have objects in all 3 layers, and we can develop software without getting caught up with infrastructure. All we need to do is return to the domain and ask what happens. Event storming is the practice that's developed in that particular corner. It's all very easy, but some people have only read the Django book, and event sourcing isn't in the Django book, so you've got to watch out. If you ask people casually if they've heard of event sourcing, almost everybody says no. It's very common to find a situation where people are comfortable with the n-tiered architecture as it appears in its half baked form, with an ORM and the impedance mismatch.
But, hey, you might be able to live with the impedance mismatch. It will drain your energy, but you might be able to live with that. It can seem more effort to understand event sourcing, to develop the skills, to develop a well factored, standalone, event sourced domain model, to habituate to a software development process that is sympathetic to the architecture, and above all to understand what it is producing: a ubiquitous language that hangs off the names of the domain events. In a situation where there aren't really the skills to do event sourcing, it might not be desirable to try. But it's quite hard to think of a domain of human activity that isn't essentially constituted by what happens, with the resulting state of affairs deriving from those events. So we get to the equality, nature equals history. And if we're asking about a persistence layer, we're already talking about layered architecture, which comes from the world of developing the scope of a system, the scope of a process, from working to support a domain of human activity.
And since all domains of human activity are constituted by the events which take place, it's always going to be natural when supporting those domains to name events, to cluster them into aggregates that respond to commands by doing some work and publishing the results as an event, and to respond to those events according to a policy that causes further commands. So it's quite hard to think of a domain that wouldn't be susceptible to a domain event oriented approach, but it's quite easy to think of a situation where the skills just don't reach that far. And in such a circumstance, that might be the reason not to implement event sourcing.
[00:12:24] Unknown:
And throughout this conversation, we've been referring a lot to the domain of the application, and from reading through the documentation that you have for the project, and from past readings, the reason for using that particular terminology is that event sourcing originated from the area of domain driven design. So I'm wondering if you can take a brief aside and explain a bit about what you mean when you're referring to the domain of the application and how that relates to domain driven design?
[00:12:58] Unknown:
Well, when I think about the domain of an application, we're talking about something that's more general than a particular site. You're trying to identify the kind of work that takes place that's supported by the software. The approach that I take is that the software is something that's automating the work to a greater or lesser extent. So there's a scope of the work, and then there's an extent to which that scope of work is covered by a system. The scope of the system is less than the scope of the work, but the system tries to automate the work to some extent. And when we're looking to understand the work, we're looking at the domain and not a particular site, so that we can find things that tend to happen in that kind of work.
And it's bounded by an event and a response. If we're thinking about a process, the way I think about a domain, anyway, is as a process that is basically a collection of event responses that work towards a goal. So the process isn't, for example, a fire engine charging around town aimlessly. The process is that the fire station responds to a call, and then they charge around town with the goal of putting out the fire. It's the triggering event and the outcome which is desired, which is worked towards, that constitutes a process. And those can be kind of stacked up, and the domain is many of those triads, event, response, goal triads.
And then the software system is picking out things within that work which would be useful to automate, some useful supports for that process. So, for example, it's quite hard to make a call to a fire station if there's no telephone, so a telephone would be useful so you don't have to walk there. And, you know, some tracking system so you know all the different fires taking place in town, so you can see which ones you successfully put out and redirect firemen to the ones which aren't out, still burning. And then you can develop support for fire stations as a domain, rather than the particular fire station at the end of your road with all its idiosyncrasies, which it may depart from, or perhaps not. In any case, the domain is the place in which the work happens, and that work can take place in different sites in different ways, but you're aiming for the more general work rather than the particular site. And given that you're storing a record for each event that occurs on one of those domain objects, I'm wondering
[00:15:40] Unknown:
how that affects the amount of storage necessary to support an application that's using the event sourcing pattern.
[00:15:47] Unknown:
Yeah. It's a good question. If an object changes lots of times, then there'll be more data in an event sourced system than in a system where the object is stored as a single record that's updated. But often people like to save a copy of the object each time it's changed, so that attributes that aren't changed are stored over and over again. In that case, an event sourced system would require less storage. Another aspect I've noticed is that even when people are not comfortable coding domain events and would rather have a database schema that is reminiscent, if not reflective, of the domain model, they like to write lots of log messages that report on what is happening, so a record was created, or it was updated, and so on.
When scaling a system, it can happen that logging all these messages, which really trace the domain events but in a way that minimizes their value, requires lots of storage, especially when they're loaded into a search engine so they can be queried, which leads to the need for each log message to include the ID of the event, the timestamp, and so on, so you can pick out the log messages which pertain to a particular user action and thereby hope to identify what the system you developed does. And if something can't be found, we still aren't sure whether it happened and the log message didn't come through. You can spend a lot of time investigating log messages when really all you're trying to do is find out what's happening in your system, because you haven't coded that directly as domain events. Anyway, instead of all of that junk, the software can just publish events. The events can be stored, and the stored events can be queried. If you're doing that, then it takes up more storage than if you were simply updating one record.
But often you don't just stop at updating one record. You want to keep a trace of what's happening, and all of the other ways of doing that can often use more storage than you actually need simply to store the events.
[00:17:37] Unknown:
Yeah. Your point about using log messages as a half measure for tracking the events in a system is definitely poignant, because I've spent far too much of my time trying to trace through log messages to understand what was actually happening at different steps of the application.
[00:17:56] Unknown:
Yeah. We all do. And the thing that makes me smile is that the data that you're putting into these log messages ends up being almost identical to the things which are happening in the domain events. It's just that if people aren't used to dealing with the domain events directly, then they look around to grasp onto other things, and, you know, you resort to Logstash, Kibana, and so on.
[00:18:23] Unknown:
And to your point about duplicating the information that is contained in the overall state of the object, when all you really care about for a particular event is the attributes that actually changed: are there ways that you can reduce the overall amount of data storage by just storing the diff, and then using that as a means of constructing the current state of the object, by referencing from the root node when the object was first created and then replaying those diffs against each other to reconstruct the current state?
[00:19:01] Unknown:
Yes, you can do that. I mean, that's something that I tried to do maybe 6 or 7 years ago, and it just becomes increasingly difficult. What's the diff of a many-to-many? You know, what's the diff of a many-to-many relationship between 2 temporal objects? It just becomes harder and harder to actually make it happen in a general way. And if it's not general, it's broken, because you're trying to do something and it just doesn't support that particular thing.
[00:19:26] Unknown:
Yeah. So it becomes the time-space trade-off, where you're trading the overall amount of storage space for the amount of computational time that's necessary to reconstruct the current state of that object.
[00:19:38] Unknown:
Indeed.
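One standard way to cap that replay cost, common in the event sourcing literature though not discussed here, is snapshotting: store a snapshot of the state every so often and replay only the events that follow it. A minimal sketch, assuming hypothetical snapshot and event store interfaces:

```python
def current_state(snapshot_store, event_store, entity_id, mutate):
    """Load the latest snapshot, then replay only the newer events.

    `snapshot_store` and `event_store` are assumed interfaces; `mutate`
    applies one event to the state, as in the earlier replay sketch.
    """
    snapshot = snapshot_store.get_latest(entity_id)  # assumed call
    state = snapshot.state if snapshot else None
    version = snapshot.version if snapshot else 0
    # Only the events after the snapshot need to be replayed, so the
    # cost of reading an entity no longer grows with its whole history.
    for event in event_store.events_after(entity_id, version):  # assumed
        state = mutate(state, event)
    return state
```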
[00:19:40] Unknown:
And what is the overall impact on performance and latency from an end user perspective when the application is using event sourcing? I guess it depends on the way that it's implemented, but when you're storing the individual events and then you want to be able to report back to the user the value that they were trying to retrieve, or what the current state of the overall system is, you would potentially need to replay those events to create a view on the overall system at a given point in time. So does that increase the overall latency in terms of confirming a commit to the database?
[00:20:19] Unknown:
Yeah. It's a difficult question. It's difficult to compare 2 different things that aren't entirely fixed. But the way I'll answer it is, if the question is what is the impact on performance and latency from an end user perspective when the application is using event sourcing to render the current state of the system, I would say that obtaining the current state of the system by replaying all events is likely to take longer than reading records that directly record that current state. However, in practice, the current state of the system is rendered in views which are developed to support particular uses.
If the data required in a view is stored in a way that requires very little computation to retrieve, then the view will perform faster than if the data required in the view is stored in a way that takes lots of effort to assemble. If the domain model is persisted into tables using an ORM, then simple lists of objects of the same type will work well, but pulling a more complex report across different tables will take longer. And similarly, it will take a long time to go through all the events in an application to find anything at all, which is why, in an event sourced application, it is necessary to have view data that is updated by the events. If the view data is prepared, then it would be strange to store the view data in a way that requires a lot of work to assemble when rendering the view. We can expect the views of an event sourced application to have good performance by virtue of their dedicated purpose.
But if we return to the application which has lots of tables to store domain objects, it comes as a big disappointment that the objects aren't already naturally a good fit for the views, and we don't know whether to spend time optimizing the queries, reworking the domain model, or building a dedicated view model and figuring out how to catch all the differences that result from model changes. This is relatively hard work, and if there isn't enough time and the application can be shipped with a slow view, then it will be. That's why it's more likely with an event sourced system that the views will be fast. There aren't any complications that divert developer energy away from the task of developing a view that performs quickly.
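As a rough sketch of such a dedicated view (illustrative names, not the library's API): a projection applies each published event to a denormalized read model, so rendering the view is a plain lookup with no replaying.

```python
# A denormalized read model kept up to date by a projection.
balances_view = {}  # account_id -> current balance

def project(event):
    """Apply one domain event to the read-side view as it is published."""
    if event["type"] == "AccountOpened":
        balances_view[event["account_id"]] = 0
    elif event["type"] == "AmountDeposited":
        balances_view[event["account_id"]] += event["amount"]

for event in [
    {"type": "AccountOpened", "account_id": "a1"},
    {"type": "AmountDeposited", "account_id": "a1", "amount": 50},
]:
    project(event)

assert balances_view == {"a1": 50}  # reads are now simple lookups
```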
[00:22:29] Unknown:
And taking a concrete example of, for instance, a PostgreSQL database where you're writing the events into one table, the sequence of events, and then you want to be able to report on current state: would best practice suggest that you just write the events into that sequence table and then create a database view that queries the most recent values? Or would you, for instance, write the event into that sequence table and then also update a canonical record of the current state in a separate table, and then report from that separate table?
[00:23:08] Unknown:
Well, there are choices there. You have to decide what you're trying to do and why. The nice thing with an event sourced system is that you can introduce views later. You can decide later what views you want and then initialise them by replaying all the events in the system, which will take a while, but then, once they're initialised, update them from events that continue to happen. In the Postgres example, you'd have to decide whether you wanted the reports to be something which you can generate quickly or not. If it's not something that you need to be able to generate quickly and you can run it over a weekend by going through all the events, then you can do that. But if it's something that you want at the push of a button, it kind of needs to be sitting there, so you'd want to keep it updated as things in the system happen. And, obviously, there are lots of choices around that as well. But there's nothing in event sourcing which forces you to have slow views, and there's nothing that stops you from going through all the events every time you want to find out anything at all.
It's a matter of using the available developer energy to optimize the system that you're developing over time. The thing with views is that you have to decide what you want to see, and event storming helps to do this, because if you want to decide which command to issue, you need to look at some data. So you get the little green Post-it notes showing views, and that's what you need to see before you make the decision about something. You want to see what the state of affairs is with the pertinent objects, and you want a view to do that. So that view needs to be there, and you need to be able to see it quickly, so you need to keep it updated, unless you don't, because it's a slow report that gets generated over the weekend, or something like that.
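A minimal sketch of the shape being discussed, using Python's standard library sqlite3 as a stand-in for Postgres (table and column names are illustrative): events are appended to a sequence table, and a separate current-state table is updated in the same transaction.

```python
import json
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE events (
        entity_id TEXT, position INTEGER, payload TEXT,
        PRIMARY KEY (entity_id, position)
    );
    CREATE TABLE current_state (
        entity_id TEXT PRIMARY KEY, payload TEXT
    );
""")

def append_event(entity_id, position, payload, new_state):
    # Append the event and update the report table in one transaction,
    # so readers of current_state never see a half-applied change. The
    # primary key on (entity_id, position) rejects concurrent writers.
    with db:
        db.execute("INSERT INTO events VALUES (?, ?, ?)",
                   (entity_id, position, json.dumps(payload)))
        db.execute("INSERT OR REPLACE INTO current_state VALUES (?, ?)",
                   (entity_id, json.dumps(new_state)))

append_event("a1", 0, {"type": "AccountOpened"}, {"balance": 0})
```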
[00:25:08] Unknown:
I could definitely see the case where using event sourcing for reporting is necessary. And conversely, for a transactional system in a typical CRUD web application, where somebody wants to write something and then be immediately able to retrieve it, you could potentially use something like a database trigger that will update a separate table with the current state based on any event records that get written to the sequence table.
[00:25:42] Unknown:
Indeed. Or you could have an application log that follows all of the events that are happening in the application, and have a notification log and archived logs like Vaughn Vernon describes in the appendix of his Implementing Domain-Driven Design book, Appendix A, I think it is. There's a whole discussion in that book. I've implemented some of this in the event sourcing library. If you wanted one context to follow another context that it depends on, then you want the depending context to pull notifications, and it needs to be able to see when one of those notifications is missing, so it needs to be integer sequenced.
And then you can bracket those up into pages of events, you know, 1 to 20, 21 to 40, and so on, and those are very suitable for caching in the HTTP world, and for scaling. And then you can have an event notification system over the top of that, which tells the downstream components when something's happened, so they can catch up, either by using the notification directly or by then pulling from the notification log the things that they don't have.
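A rough sketch of that paged, integer-sequenced notification log (illustrative names; the pattern itself is described in Vernon's book): full pages never change, so an HTTP front end can cache them indefinitely.

```python
PAGE_SIZE = 20  # pages cover notifications 1-20, 21-40, and so on

def get_page(log, first):
    """Return the page of notifications starting at sequence number
    `first`, where `log` is a list whose item i has number i + 1."""
    items = log[first - 1 : first - 1 + PAGE_SIZE]
    archived = len(items) == PAGE_SIZE  # a full page is immutable
    return {
        "items": items,
        "archived": archived,  # safe to cache forever when True
        "next": first + PAGE_SIZE if archived else None,
    }

log = [f"event-{n}" for n in range(1, 45)]
page = get_page(log, 21)
assert page["items"][0] == "event-21" and page["archived"]
```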
[00:26:58] Unknown:
And bringing us specifically to the work that you've done with the event sourcing library, I'm wondering what the internal architecture and design of the library looks like, and how that has evolved over time?
[00:27:13] Unknown:
Yeah. So, it follows layered architecture. The top-level folders are application, infrastructure, domain, and interfaces. So there's a domain layer, an infrastructure layer, an application layer, and an interface layer, but most of the code is in the domain and infrastructure layers. There are about 6,000 lines of code altogether in the library, with about 2,000 in the domain layer and about 2,000 in the infrastructure layer. The event sourcing mechanism that I originally wanted to reuse is the infrastructure layer.
The persistence layer, which is effectively the infrastructure there, has an event store. That's the first object, really, in the infrastructure layer for persistence: an event store which allows domain events to be stored and retrieved. Internally, the event store's concerns are separated into an active record strategy object, which is like a manager that encapsulates the database management system. The active record strategy object has an active record class, which encapsulates stored event records. Having the active record class allows variation in the stored event schema within a particular database system, and having the active record strategy allows variation in the database technology used to store events.
The 2 main variants of active record are the time-sequenced event and the integer-sequenced event. At first, there was just a stored event class that used Cassandra, and events were sequenced by timestamp. However, the alternative of sequencing by integer rather than timestamp was necessary to support optimistic concurrency control in a distributed system. And since there was still value in having some kinds of events sequenced by timestamp, it was necessary to support variation in the active record class. That was the main growing pain: figuring out that timestamps aren't really a very solid foundation for sequencing the events of entities in a distributed system, a system that requires some optimistic concurrency control to maintain the consistency of the entity. We can avoid events being jumbled due to time differences across the network by using a central time server, but then we introduce a single point of failure. And even with a central time server, to have robust optimistic concurrency control, we need to know if an event is the next one in the sequence, and you can't do that with timestamps.
You need an integer, and the Paxos protocol, for example, as implemented in Cassandra with the IF NOT EXISTS feature. So if there's any contention, one thread can successfully append an event to a sequence, and the others will fail and will need to refresh their state before trying again. Another point of evolution was storing events in SQL databases, which introduced the active record strategy class, which should perhaps just be called manager. There are only 2 active record strategy classes in the library at the moment, one for Cassandra and one for SQLAlchemy.
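A rough sketch of that integer-based optimistic concurrency check (illustrative in-memory code, standing in for a unique index or Cassandra's IF NOT EXISTS lightweight transaction):

```python
class ConcurrencyError(Exception):
    pass

class InMemoryEventStore:
    """Toy store keyed by (entity_id, version)."""
    def __init__(self):
        self._rows = {}

    def append(self, entity_id, version, event):
        key = (entity_id, version)
        if key in self._rows:
            # Another writer appended this version first: the caller
            # must refresh its state and retry with the next version.
            raise ConcurrencyError(key)
        self._rows[key] = event

store = InMemoryEventStore()
store.append("a1", 0, "AccountOpened")
try:
    store.append("a1", 0, "AccountOpened")  # contended write fails
except ConcurrencyError:
    pass  # refresh state, then retry with version 1
```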
I'd like to support other database services, such as Amazon's DynamoDB, but I haven't done that so far. Another point of development was optimizing the queries. We don't need to search for all the events at a particular version, and coding a general index that allows for that compromises performance when there are lots of events, so optimizing the queries was an important thing to do. Another development was encryption of the domain event before it is stored, which provided for application-level encryption, something that was required at a place I was working. So that's the infrastructure layer, which was the original motivation, as I said. The library also has a domain layer. The domain layer uses the event sourcing infrastructure and has DDD, domain driven design, shaped things such as an aggregate and a repository.
As described in Martin Fowler's Patterns of Enterprise Application Architecture, the repository presents a dictionary-like interface. The event sourced repository uses the given key to select events, which are replayed to obtain an entity that is returned to the caller. The aggregate has, for example, a save method, which writes a list of pending events to the database. I'm not entirely sure these things strictly belong in this library, but since event sourcing is a persistence mechanism for domain driven design, it does make sense to have some DDD shaped things that directly use the event sourcing infrastructure.
Otherwise, you would have to figure that out yourself, and do it over and over if you had to do it a number of times. And if you had to figure it out on your own, you might spend time wondering if you did it in a good way. So having some classes in the library can show how it can be done. They're quite simple, so you aren't going to miss out on loads of great updates by duplicating these classes in your own code. Similarly, if you want to do things differently, perhaps, for example, distribute the mutator function across the events so they can apply themselves, there's nothing in the library that makes it difficult to do things in a different way.
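A rough sketch of that dictionary-like repository (illustrative names, not the library's actual classes): the key selects the entity's stored events, which are replayed to return the entity.

```python
class EventSourcedRepository:
    """Presents entities through a dict-like interface by replaying
    their stored events on demand."""
    def __init__(self, event_store, mutate):
        self._event_store = event_store  # assumed to have .events_for()
        self._mutate = mutate            # applies one event to state

    def __getitem__(self, entity_id):
        events = self._event_store.events_for(entity_id)  # assumed call
        if not events:
            raise KeyError(entity_id)
        state = None
        for event in events:
            state = self._mutate(state, event)
        return state

# Usage: entity = repository["a1"] replays a1's events and returns it.
```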
[00:32:26] Unknown:
And reading through the documentation, the majority of the examples use a database as the means for storing and processing the events. But in my reading about event sourcing, particularly for larger scale infrastructures, I've also seen a message broker used as the means of propagating those events through the system, usually in the form of Kafka. Do you have support built into the library for being able to take that approach as well? Or is that just a different concern from what you were trying to tackle in the library?
[00:32:59] Unknown:
Yeah. We haven't supported Kafka yet. I haven't done any work on DynamoDB, and I'm pretty sure there's something that uses RabbitMQ, which is a message broker. But it depends what you're using the message broker for. If we're publishing events outside the process boundary before they're persisted, so that something else is persisting them, then there's a question: do you return before the event is persisted? Because if you just publish the event and then the application wants to carry on, if it doesn't know that the event's been persisted, it needs to start polling for whether the event's been persisted before it carries on. Otherwise, you don't know quite where you're up to, I guess. So it's complicated. You can publish the event and not save it in process, not save it synchronously, but I have never really done that. It seems to be building a castle on sand a little bit. I think it's quite useful if, when you make an update to an object, the return happens after the event's been stored, so that you know that when you go and read an object you just created from the repository, it's in the repository. It's not going to be appearing half a second later. It's going to be there. I think it's a nice quality to have, to keep things in process like that, and synchronous.
[00:34:28] Unknown:
Yeah. From my reading, the main driver for using a message broker as the means of publishing events is in the case of a data pipeline, where you're retrieving the events and then you want to be able to publish them to multiple different backing stores for different purposes, whether it's for batch processing of reports or historical archiving, less so than if you were trying to use it in a traditional web application where you want to be able to persist the events and then read them back out. Right.
[00:35:02] Unknown:
So in that case, what we're doing is having dependent systems basically being updated from the application's event stream. That needs to be provided for directly, and the way that Vaughn Vernon suggests to do that is with a notification log, so that downstream things can pull as a last resort. They can always pull from what is basically a fixed archive of the events: they can get the current page, then go back in a kind of linked way to the point they've already seen, and then work forwards again, updating themselves as they go. You can make that event driven by pushing events to these downstream components.
But at that point, you're risking events getting out of sequence and, you know, being duplicated or being missed. So that's why I think it's important to have the pulling on the notification log underneath it, to fall back onto. But these are very different things. It's slightly different from the in-process synchronous persistence of the event, where it's stored before the command returns, which is one thing. Doing it across processes involves asynchronous operations.
And at that point, you want to make sure, I mean, maybe you don't really care if things are out of sequence or a little bit jumbled up, or a few things are missing. If we go back to the logging case, if my 'a happened' appears just before 'b happened', and a and b aren't causally related, I don't really care if b shows up in the log before a, because it doesn't matter. But if you do care about the sequence of events, and you care to have a faithfully reproduced sequence in a different process, then you need to make sure that the first context is actually generating a contiguous sequence of events that can be followed.
In other words, an application log that goes across all of the different entity sequences, because otherwise you don't know which entity sequences there are, and it's difficult to get them all out. With Cassandra, for example, it all comes back in a big jumbled order, and you have to go through everything to see if you've missed something. So if you're wanting to do things across a process boundary, the most important aspect, really, if you have entities to maintain, is to replicate the original sequence faithfully. And I just don't know how to do that apart from using integer sequences, so that, you know, if you're on 5 and you don't have 4, then you're missing 4, and you need to go and get 4 before you can apply 5. If you've got a timestamp of 53, you don't know if there was something that happened at timestamp 52 or 51, or whether nothing happened at those, if the last one was at 50.
We don't know if there were a million things that happened between those 2 or whether there was nothing at all. So that's the difficulty. It's having an integer sequence for an application that can be bigger than any one database partition can take. You don't really want to put all of your application log in one database partition, for example one column family in Cassandra, because you're going to fill it up, and then every time you need to do something, you're always going to be hitting the same bit of disk. It's not going to work very well. So, really, what you want to do is have an abstracted integer sequence running across a lot of different partitions, but in a way that gives you order-1 performance on appending to it and getting an item at a particular index. So in the library, there's something called big array, which actually does this, which gives a kind of hierarchy of arrays.
Entities have a sequence in the database, and that sits in a partition. But the arrays sit over these sequences and give a hierarchy of sequences, so that you've got, you know, the size of an array to the power of the number of levels, a huge space in there, more than you could possibly fill up. Each item can be accessed in order-1 time, you can append to it, you can find the end of the list very quickly, and it spans across partitions. So you don't have the problem of just hitting one disk, and you don't have the problem of worrying whether you're going to run out of numbers in your application log. You get an effectively unlimited application stream, and then things can pull from that, from the notification log. It took a little bit of work to figure this out, to figure out what the problem actually was, and also how to solve it in a stable way. But the important thing is that a downstream thing shouldn't miss an event, so if something happens in the system, it needs to be in the application log. Unless you're doing transactions, which you can't really do in Cassandra across different column families, you can't have a transaction that allows you to write to the application log and the aggregate sequence at the same time.
So it seems to me that you have to write to the application sequence first, to avoid the situation where you write the domain event and then fail to put it into the application log. You need to put the domain event in the application log and then save it into the aggregate's sequence, so that if the application log write fails, the event effectively hasn't been published. That way you can't get into a situation where events are happening that aren't being notified. But it does allow the situation where an event is notified that maybe failed to write, or hasn't quite finished writing into the database. So that's the kind of residual problem.
And downstream contexts can handle that, maybe by just working slightly behind, or if they come across an entry in the application log that isn't actually stored in an aggregate stream, they can pause, or try again, or, you know, if 5 seconds have gone by, assume it's not going to happen. Or, if it's important to catch up if it does actually happen, you can code that into the downstream thing. I mean, if you had a transaction where you could write these 2 things together, then obviously the problem goes away. But you just can't do that with a system like Cassandra, for scaling reasons; it doesn't scale so well. So that's what I've been thinking about with the downstream thing. I came across this need to have a large integer sequence and to distribute it in a way that performs across different database partitions, and I came up with something called big array, which seems to work quite well. Anyway, I was just trying to follow what Vaughn Vernon was saying in his notification log stuff in Implementing Domain-Driven Design. I think his suggestions are really good, and I tried to get things to work.
And I think I did, but it's complicated, really. I think that's the thing with event sourcing: it appears to be simple at first, but, actually, it turns out there are a number of subtleties, which you can avoid if you don't want to have a distributed system or microservices or these things. If you just want to have a big monolith, you can just code events and, you know, get them back out when you need to present the object. But there are some more challenging aspects which you can come across too.
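A rough sketch of that gap-detection idea (illustrative names; `fetch` is an assumed call against the upstream notification log): a downstream follower applies notification n+1 only after n, going back for anything missing.

```python
class Follower:
    """Applies integer-sequenced notifications strictly in order,
    fetching any gaps before moving on."""
    def __init__(self, fetch, apply):
        self.fetch = fetch    # fetch(n) -> the notification numbered n
        self.apply = apply    # updates the downstream view
        self.position = 0     # highest notification number applied

    def receive(self, number, notification):
        # Fill any gap first: on receiving 5 when we've only applied 3,
        # go and get 4 before applying 5.
        while self.position + 1 < number:
            self.apply(self.fetch(self.position + 1))
            self.position += 1
        if number == self.position + 1:
            self.apply(notification)
            self.position = number
        # Numbers at or below self.position are duplicates: ignore them.
```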
[00:42:51] Unknown:
For somebody who wants to incorporate event sourcing into an existing application, is that something that would be feasible, and if so, how would they go about doing that? And, alternatively, is it something that would be more beneficial to incorporate into a greenfield project that you're starting, and then use it from the beginning?
[00:43:13] Unknown:
Yeah. It's a really good question. I mean, there's a choice there: do you just do event sourcing at the beginning of a project, and if the project started without event sourcing, do you just forget about event sourcing? To keep things simple, perhaps the answer is yes. But often the answer is that we do want to change an existing application to use event sourcing for some reason, and there are lots of ways of doing that. For someone who wants to incorporate an event sourcing design into an application, it's quite a complicated topic. Of course, it's much easier to start by doing event sourcing and not switch halfway through a project.
That's why I started the event sourcing project, so you can start from the beginning without any friction. If you have a legacy code base and you want to convert it, then you can start in the middle and work out. You can pick the most central object, write tests around it, figure out how it is used and decouple it from the rest of the system, and then replace this object with an event sourced object. This can be repeated with the next most important object until all objects have been replaced. Or you can start at the edge and work in, beginning by picking off helper objects that have few dependencies and gradually working towards the most important objects.
Alternatively, when things aren't coded in terms of events, you can often find lots of procedural style code, for example save methods with calls at the bottom to update other objects, which in turn have another save method which calls other stuff, giving long chains of calls like that. This situation can be unpicked by publishing an event rather than making a call, and then having a subscriber or a policy make the call when the event happens. In this way, you can start to shadow what happens in the system with events, which name what happens directly. These events can be refactored into the entity methods without disturbing any existing code.
When everything that depends on the existing code is changed to depend on the events, the existing functionality can be removed. The biggest challenge of refactoring a large application like that will be the skills in the room, which will have been habituated on the legacy style, and the length of time the code will be in a mixed state will try everybody's patience. So perhaps a safer approach is to separate a part of the system out as a service and then develop an event sourced service that replaces the legacy service. Similar to the first approach I mentioned, you can start by introducing an interface to some objects that were being used directly, so you can use them indirectly. Then you can separate those objects into a different system and call the same interface, but now the interface would be going across a process boundary or a network.
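A rough sketch of that unpicking (illustrative names): the call that used to sit at the bottom of a save method moves into a subscriber, so the event names what happens and the chain of calls is broken.

```python
# Before (illustrative): procedural chains of calls inside save methods.
#
#   def save(self):
#       ...write self...
#       self.invoice.update_totals()  # long chains of calls like this

# After: publish an event naming what happened; subscribers react to it.
subscribers = []

def publish(event):
    for handler in subscribers:
        handler(event)

def update_invoice_totals(event):
    # The downstream call now lives in a subscriber (a policy),
    # decoupled from the object that changed.
    if event["type"] == "OrderSaved":
        print("updating invoice totals for", event["order_id"])

subscribers.append(update_invoice_totals)
publish({"type": "OrderSaved", "order_id": "o1"})
```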
By separating out a little team, the people who are interested and willing to do something a little bit different can focus and work together within an organizational boundary that corresponds to a system boundary, which helps to keep the group dynamics simpler and less contentious. An even safer approach is to think forward to replacing the current system entirely, which gives opportunities for introducing event sourcing at the start of a project. But, from my experience, the thing that's important is the skills in the room. People develop their skills around the work that they're doing, and if they've been developing a non-event-sourced system, then that's what they'll be familiar with. It's really asking a lot for those people to just leave all that behind, start to do lots of new things, and have a whole mixed bunch of code. Eric Evans mentions this in a discussion about microservices where he talks about the rough and tumble of development. If he's using the phrase rough and tumble of development, I guess he's had a bit of rough and tumble where people have started to say, oh, DDD, it's a lot of hard work. Event sourcing, who needs that anyway? We've got Two Scoops of Django. We know what we're doing with our ORM. We can just carry on like this.
And then you can have a bit of energy draining attrition while two different schools slug it out within the same context boundary, the same software development project. It's horrible, and there's no real answer to that. There's no real way of managing harmony there. It's not a situation that's conducive to everybody getting along. You've got some people who've got a vision for departing from what everybody else is used to, and the story there isn't clear. And when the story isn't clear, it can be aggravating for people. And if it's aggravating for people, you're going to get the rough and tumble of development. And the rough and tumble of development is going to crowd out the more experimental, the newer, the less well established practices. It's going to crowd out innovation. It's going to crowd out departing from how things are, which is what you need to do when you're doing development. But people often do development by remaining in comfort zones, and one way to remain in your comfort zone is to make sure that you exclude other ways of doing things. If you want to succeed in a situation like that, you need to separate out the team with an organizational boundary. You can call it a microservice, but, really, what you're doing is separating out a little group dynamic and nurturing it, in distinction from an older way of doing things. So it depends on the skills in the room, I think, most strongly, rather than the approach itself. And if you want to develop new skills, it's not necessarily the case that everybody's dead keen and passionately wanting to do that. It doesn't make sense to try and force it down people's throats. You have to go with the flow, and if there are people who want to do it, then you need to protect those people and encourage them to do this good work, and protect them from people who might just want to stop it happening so that their way of doing things remains known as the best and only way of doing things.
I don't think this is uncommon. It's maybe a slightly unpleasant dark side of software development, but I think it's important. I think Eric Evans mentions it for an important reason, and these organizational distinctions are obviously essential when you're thinking about a bunch of people developing software together.
[00:49:29] Unknown:
It's always interesting to bring in the people context of technical problems, because it's all too often overlooked in the common discourse when trying to figure out how to build our technologies and our software, because ultimately all software is built to serve the purposes of the humans that use it. When you have an event sourced application that has been in operation for a while, inevitably you're going to need to change the schema and content of the data structures for those different events. So I'm wondering how that manifests in an event sourced application, and in particular how you would account for that when you're reconstructing present state from the beginning of an object's event sequence.
[00:50:20] Unknown:
Yeah. It's a really good question, and it's a difficult question. There's a really good paper called The Dark Side of Event Sourcing, which I hope refers to event sourcing in general and not this project. For migration of immutable events, that paper outlines 5 strategies that are available. Multiple versions, that's one strategy. Upcasting is another strategy. Lazy transformation is the third one. In-place transformation is the fourth. And the fifth one is copy and transformation, where you rewrite events into a new event store. It's not an essentially difficult problem, but it does require some thought, and, like with many other things, mixing approaches might not be optimal. So if we think about it, what we've got is stored events, and if the model changes, we could change the model in different ways. We could change the model by changing the events which constitute the objects, or we could change the model by augmenting those events with new versions of those events, so we could start to record which version of an event it was. Or if we just have completely different events, and it doesn't really make sense to version them because you changed the name of something, or you merged events, or you made a distinction within an event between 2 different events, then at some point versioning isn't really going to stretch to it, and you'd want to just rewrite your history, which gives us the transformation strategies. So there's versioning, and there's upcasting, which is where you say, this event is how we used to do it in the past, and whenever we see this one, we change it into this other thing, which is the new way of looking at it, and then we apply that event. So you can kind of upcast.
Lazy transformation is where, you know, if you hit on some events which are old, then that's the time when you rewrite them. So you don't just run a batch job, take your system offline, rework everything, and then switch it all back on with everything transformed; you do it in a lazy way. Then there's in-place transformation and copy transformation. In-place would be where you just rewrite the events in the event store as you have it. And copy transformation would be where you read the events out of your event store, change them, write the changed events into a new store, and then presumably get your system to use that new store. So there are different ways of doing it, but you just need to remember that there are events, and if you want to change those events, you need to pick a way of making sure that the system is picking up the new events somehow. So there are different ways of going from A to B. You can assume that you can rewrite A to B as soon as you pick up A. But if that's complicated, then you might want to just rewrite the whole event stream and then carry on with a completely transformed situation.
But that's all you have to do, because there's nothing else other than the events.
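And as a sketch of the copy-and-transformation strategy he describes, again with hypothetical names rather than any real library's API, assuming a simple in-memory store with append and read_all methods:

```python
# Copy-and-transformation sketch: read every event out of the old store,
# rewrite it, and append it to a new store the system then switches over to.
# 'EventStore' here is a stand-in, not a class from any particular library.

class EventStore:
    def __init__(self):
        self._events = []

    def append(self, event: dict) -> None:
        self._events.append(event)

    def read_all(self):
        return iter(self._events)

def transform(event: dict) -> dict:
    """Rewrite one stored event into the new schema (e.g. rename an event type)."""
    if event["type"] == "CartItemAdded":
        event = {**event, "type": "ItemAddedToCart"}
    return event

old_store = EventStore()
old_store.append({"type": "CartItemAdded", "sku": "ABC-123", "qty": 2})

new_store = EventStore()
for event in old_store.read_all():
    new_store.append(transform(event))

# From here on, the application reads from new_store only; the old history is
# preserved untouched in old_store until you're confident in the migration.
```

The trade-off against in-place transformation is storage for safety: you pay for a second copy of the history, but the original remains available if the transformation turns out to be wrong.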
[00:53:20] Unknown:
Yeah, I can see that, at least from some perspective, transforming the model and representation of the event structure is simpler in an event sourced application than it is in a relational database, for instance, where you need to transform the state of everything at once because you are changing the table structure. Whereas with event sourcing you can, as you said, just change the way that the events are written out, and then you can sort of backport that to all of the previous events by replaying them into the new schema.
[00:53:53] Unknown:
And what are some of the most interesting uses of event sourcing that you've seen, whether in particular implementations with your library or just the pattern in general? So I was involved in a startup in East London about three years ago, and it was just a thoroughly event driven domain. It was a little smart AI application which sat over all your cloud services: email, calendar, notes, documents, and so on. And it would give you a timeline of what had happened across all of your stuff, so you could see calendar changes amongst emails and documents and whatnot. And then it would give you smart recommendations.
So the idea was that it would be able to infer from your meetings the documents which pertain to the discussions with the people who are going to the meeting, so it could give you all of the material for that meeting, all together, just before the meeting happened. And it would be able to do other smart recommendations and other stuff on top of all of these events coming out of the cloud services. So it was a matter of having a polymorphic event sourced file model that all of the different objects were mapped into: all the data we pulled out of these cloud services was considered to be files, but different kinds of files. So we had a little polymorphic event sourced model of files of different kinds. We were picking up on streams of events, and then streaming those events into the data science stuff, and other bits and pieces into views, so that when you hit your app, everything just came up straight away. You know, the timeline didn't have to go through all the events to figure out what was in it; there was a timeline view whose data was already being updated, and you just went and got your timeline instantly. And the recommendations were similar.
So that's when I started doing event sourcing stuff. It just seemed like a really good fit, and it was, and we got it working quite quickly, and then it worked. It was quite interesting to see how good a fit it was with all of the cloud services, how they essentially were just giving you streams of differences, streams of events, and how that was all styled. You could pick up on those things, update stuff based on them, and code those as domain events. It was really neat and very understandable and worked quite well. It's just that I started doing it with timestamp-based events, because the CTO there really wanted timestamps, which was good; it just turns out that timestamps didn't really get you all the way there. That was very interesting.
To go to the other end of the scale, another project was in a large corporation, and they had a need for great scale. I mean, the domain wasn't the most technically complicated of domains, and I can't really talk about it because it was under an NDA. But it was massive. You know, they wanted huge scale. It was a kind of classic situation where everything happens in just a few minutes, rather than people having to hang on to the website for half an hour and then maybe not get what they wanted. It was a big, high scale, event based system, and we scaled it up so it actually worked at scale on cloud infrastructure. It was a huge project, really. And that was interesting because of the scale, because of the size of the corporation that was doing this thing, and the ambition that they had for this system. So that was interesting.
The first one was interesting for the architectural aspect of it. You know, it was the first event sourcing project I did. And the other project I mentioned was doing that at scale, in an established corporation rather than a little startup. So those are the two most interesting pieces of event sourcing that I've seen. And looking forward,
[00:58:08] Unknown:
what are some of the features or improvements that you have planned for the future of your library?
[00:58:13] Unknown:
Yeah. Well, thanks for asking. It's kinda complete, in that there's application-level encryption, there are different types of events, and there's optimistic concurrency control with some DDD classes. It's well factored and it's got 100% test coverage. So the things that I've got in mind to do on it are, first, better documentation; there's a GitHub issue about that. I had this little bunch of enthusiasts I discussed it with, and it seems that the documentation wasn't quite as good as it could be. There was a huge README file, and I restructured that into some Sphinx documentation, and I think I didn't quite cut it in the right way. So the new documentation will have a much clearer distinction between the infrastructure layer, the kind of internal event sourced persistence stuff that was the original motivation for the library, and will just cover that; then, separately from that, it will introduce the DDD classes which use that stuff. At the moment the DDD classes are kinda used in a repetitious way to give examples of the different ways in which the internal mechanisms can be used, and it's not optimal, so I've said I'm gonna rewrite it along better lines. That's one thing. And the other thing is broader support for popular database services. You mentioned Kafka; I mentioned DynamoDB.
So there's some work to be done there. I don't know exactly when I'm gonna do it, but I think it's important to do. Then there's event migration: better support for that. I don't know whether there's code which could help with that, or whether it's just a matter of understanding the different approaches, so maybe a bit of documentation. I don't know. And then better examples. On the Slack channel we've been talking about some better examples, and there are two little projects where people have started to use event sourcing: a little to-do thing, and another application that someone wanted to write. I think that's happened in the last couple of weeks, and it would be really nice if we had some better examples.
And then, looking at the technicalities, the thing which I haven't quite satisfied myself about is this propagation of events from one context to another. The problem seems to have two parts. If you want an integer sequence for a single application, you need to find a way that works across database partitions. And if you want entity event sequences and an application notification sequence, you don't have transactions, and you need to find a way to make sure there is a notification for each domain event. The first problem I solved, as I mentioned before, with a structure I call big array. And the second problem I think I nearly solved, but the residual issue, as I mentioned before, is disambiguating, when there is a notification but not a domain event, whether the domain event is still appearing or will never appear. If that's something a downstream context can cope with, then that sort of solves the second problem too. But there's some finishing off that needs to be done there on the notification log. There's some JSON API stuff, an interface layer for that, and it's well tested and it works. It's just that I don't feel entirely satisfied with it, perhaps for no reason at all; I just need to leave it for a little bit, come back to it, and talk about it with some people. I think internally, within a single process, it's largely complete, but it needs documenting in a better way, and I think that's important. And then there's this idea of propagating events across contexts, which works, but if it needs to be really robust, I think we've got that, but I'm not entirely convinced. So there's just an outstanding worry, perhaps; whether it's a real concern or just a worry, I'm not quite sure. But propagating events from one context to another is still something I'm thinking about. I don't think it's actually completely done.
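To illustrate the disambiguation problem John describes, here is a minimal sketch of a downstream consumer following a notification sequence. All of the names (follow, notification_log, the timing constants) are hypothetical, not the library's notification log interface; the point is only the policy decision between "the event is still in flight" and "the event will never appear":

```python
import time

# Sketch of the downstream-consumer problem: notifications form a contiguous
# integer sequence, but a slot can be allocated while its domain event never
# arrives. The consumer must decide between "still in flight" and "gone".
# Hypothetical names throughout; not the library's notification log API.

POLL_INTERVAL = 0.5   # seconds between retries for a missing slot
GIVE_UP_AFTER = 10.0  # after this long, assume the event will never appear

def follow(notification_log, start=1):
    """Yield notifications in order, skipping slots that never materialise.

    `notification_log` only needs a dict-like .get(position) that returns
    None when the slot hasn't (yet) been written.
    """
    position = start
    while True:
        deadline = time.monotonic() + GIVE_UP_AFTER
        while True:
            notification = notification_log.get(position)
            if notification is not None:
                yield notification
                break
            if time.monotonic() >= deadline:
                # The slot exists in the sequence but the event never appeared.
                # A downstream context that can cope with gaps records this
                # position as skipped and moves on, preserving ordering.
                break
            time.sleep(POLL_INTERVAL)
        position += 1
```

The timeout is the whole policy here: because there are no cross-partition transactions, the consumer can never be told definitively that a slot is dead, so it has to trade waiting time against the risk of skipping an event that was merely slow.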
[01:02:15] Unknown:
But internally, it's kinda done. Alright. Well, are there any other topics or questions that you think we should cover before we start to close out the show? No, I think your questions have been really excellent and
[01:02:28] Unknown:
demonstrate a very good understanding of this particular topic. And I don't consider myself a kind of expert in event sourcing. I'm just trying to figure it out and write some code which would be useful for me, and perhaps others, going forward. So I think I can, you know, use it to accumulate things that I've learned, and maybe show the limitations of my understanding to other people so they can help me learn the things that I just got stuck on. That is the idea of it, really, anyway. So maybe there's a completely different take on this that I've missed, a different way of doing it. But I've tried to think about it exhaustively, and I've tried to do my best on it. I think your questions really cover the scope that I've discovered in doing this. I don't know what you've missed, and I'm not sure you've missed anything at all. Sure. Yeah. It's definitely useful to have a concrete implementation
[01:03:26] Unknown:
to use as a starting point for any conversation, because then everybody has common ground to work from; you can look specifically at the ways that it's implemented in the code and then work from there. So for anybody who wants to get in touch with you or follow the work that you're up to, I'll have you add your preferred contact information to the show notes. And with that, I'll move us to the picks. My pick this week is a conference presentation by Corey Quinn called Heresy in the Church of Docker, and it's just a really well put together and humorous look at the hype that has come up around Docker and how it will be the be-all and end-all of our deployment problems. It talks about how Docker is great for development but still has a lot of operational concerns that it doesn't address, and looks at the ways that the common perception of it as a panacea for our deployment problems hasn't quite come to pass. It's definitely entertaining and educational, and I highly recommend anybody who's interested in Docker take a look at that. And with that, I'll pass it to you, John. Do you have any picks for us this week? Well, I mean, the thing I've actually been working on for the last few weeks is something called
[01:04:40] Unknown:
Quant DSL, which is a domain specific language for quant analytics. And, you know, finance is quite an important thing: if you want to build some energy infrastructure, then you need to get the finance around that, so you need to be able to price these things. And there are loads of groups doing energy stuff, and they all seem to buy these proprietary software modeling tools that cost hundreds of thousands of pounds. So for the last maybe five or six years, with a friend of mine from Oxford who did his PhD in this stuff, we've been developing some open source maths code for quant analytics and finance and trading, which is really quite excellent stuff. I mean, I don't mean to blow my own trumpet, but he did a lot of the maths.
And I've been working on the domain specific language aspects of it. We can price power stations and gas storage facilities, and you can build correlated, multi-factor models between different markets. I mean, it's really quite advanced stuff, and there's nothing else really like it in the open source space; it's just a lot of proprietary tools. So trying to disrupt that proprietary fintech modeling thing is, I think, quite interesting. I mean, if we look at the energy infrastructure in the UK, you've had the government agreeing to quite shockingly high strike prices for nuclear power plants, and nobody can really question these things, because there's no real ability to turn the handle on the maths.
But with a system like this, you can start to do valuations. Internally, within energy companies at the moment, what a lot of people do is just knock something up quite quickly in a spreadsheet or in some kind of ad hoc Python code. And then, you know, it takes a little while, and obviously that can't go into production if a deal goes through. So the structurers are trying to knock things up, and then there are organizational boundaries to getting these things into production. If you had a domain specific language, then you could easily experiment with different structures, and then if something happened, you could easily put that into production. So I think it's a really nice, clever project, and I've been working on it quite a lot recently, trying to get something going with that. So it's not really a pick, but it's the thing that I've been
[01:07:22] Unknown:
really focusing on for the last few weeks. Yeah, that definitely sounds interesting, so thank you for sharing that, and thank you for taking the time out of your day to join me and talk about the work you've been doing with event sourcing and the library for being able to codify it. It's definitely something that I find interesting, and I am sure that other members of my audience will as well. It definitely seems like you've done a good job with it. So thank you for that, and I hope you enjoy the rest of your day. Well, thank you so much for having approached me about this. It's very encouraging and exciting, and I hope you have a great day in Boston today.
Chapters
Introduction to John Bywater and Event Sourcing
John Bywater's Journey with Python
Challenges in Domain Model Libraries
Understanding Event Sourcing
Event Sourcing Library in Python
When Not to Use Event Sourcing
Domain Driven Design and Event Sourcing
Storage Implications of Event Sourcing
Performance and Latency in Event Sourcing
Implementing Event Sourcing with PostgreSQL
Event Sourcing in Distributed Systems
Event Sourcing with Message Brokers
Propagating Events Across Contexts
Incorporating Event Sourcing into Existing Applications
Schema Changes in Event Sourcing
Interesting Uses of Event Sourcing
Future Improvements for the Event Sourcing Library
Closing Thoughts