Summary
A common piece of advice when starting anything new is to "begin with the end in mind". In order to help the engineers at Wayfair manage the complete lifecycle of their applications Joshua Woodward runs a team that provides tooling and assistance along every step of the journey. In this episode he shares some of the lessons and tactics that they have developed while assisting other engineering teams with starting, deploying, and sunsetting projects. This is an interesting look at the inner workings of large organizations and how they invest in the scaffolding that supports their myriad efforts.
Announcements
- Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
- When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show!
- So now your modern data stack is set up. How is everyone going to find the data they need, and understand it? Select Star is a data discovery platform that automatically analyzes & documents your data. For every table in Select Star, you can find out where the data originated, which dashboards are built on top of it, who’s using it in the company, and how they’re using it, all the way down to the SQL queries. Best of all, it’s simple to set up, and easy for both engineering and operations teams to use. With Select Star’s data catalog, a single source of truth for your data is built in minutes, even across thousands of datasets. Try it out for free and double the length of your free trial today at pythonpodcast.com/selectstar. You’ll also get a swag package when you continue on a paid plan.
- Your host as usual is Tobias Macey and today I’m interviewing Joshua Woodward about how the application lifecycle team at Wayfair uses Python to accelerate application delivery and improve developer experience
Interview
- Introductions
- Josh Woodward; for the past year I have been managing the application lifecycle team at Wayfair. Prior to that, IC on the Python platforms team. Embedded with teams looking to decouple from the monolith. Saw pain points firsthand.
- How did you get introduced to Python?
- High school physics class, TI84 Calculator, friend wrote a program to solve vector problems, I thought it was amazing.
- Used TI-Basic to solve specific physics problems for me. (Give fixed inputs, run through equation, get outputs)
- Approaching college, thinking about student loans.
- Heard about python and decided to give it a shot.
- Wrote program to simulate various payback / interest scenarios.
- Went to college for ME, switched to SE when I found out my dorm neighbors were using python + turtle to draw cool images
- Can you describe what the role of the application lifecycle team is and the story behind it?
- Story behind it:
- Around 2018, in a state where we had deploy congestion; challenging to iterate and ship changes. Tech org invested in containerization and decoupling to directly combat this problem. Teams incentivized to decouple.
- While on python platforms, the team had already been experimenting with code templating.
- Standard cookiecutter template for flask apps.
- Wayfair experimenting with Kubernetes late 2017.
- Spent 1 year embedding with 4 different teams to help knowledge transfer re: k8s, containers, application setup, python best practices, testing, linting, etc – through that we got a lot of great feedback on our tooling.
- Took senior engineers weeks to get something set up.
- Know who to contact, click the right buttons, file the right ticket
- Approach: Counted manual steps. Something like 60 distinct / atomic activities that had to be performed to get a "hello world" response from a basic flask app in production.
- Focus on reducing manual steps
- Released product (Mamba, on theme of snakes)
- Initially, supporting one main user story.
- User story: "As an engineer, I would like to create a production ready application in 10 minutes so that I can have a reliable and standardized application setup that follows best practices."
- grew out of python platforms, created own team with own scope, that was about 1.5 years ago.
- What is your team’s scope now?
- Team Scope is to facilitate the creation, maintenance, and decommissioning of decoupled applications at Wayfair.
- What are the interfaces that your team has to the rest of the organization?
- People Interfaces:
- We value getting feedback on our work to build strong products.
- Make assumptions, Willing to be wrong. Validate assumptions with customers.
- Software Interfaces:
- for mamba, CLI at first
- Backstage (open sourced from spotify)
- Lots of Github
- What is your method of determining what projects to work on?
- (See above.) Known pain points. Intuition. Free day Fridays. Being comfortable taking risks (using Friday time). Vet solutions with customers.
- How do you measure the impact of your work on the rest of the organization?
- We don’t force use of our products. Adoption of tooling.
- Number of microservices being spun up.
- Number of automated pull requests being created, merged.
- DORA metrics throughput (deployment frequency, lead time for changes) and stability (change failure rate, mean time to recovery)
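The DORA throughput metrics called out above come down to simple arithmetic over commit and deploy timestamps. As a minimal sketch (the deploy records and function names here are made up for illustration, not Wayfair's actual instrumentation):

```python
from datetime import datetime, timedelta

# Hypothetical deploy records: (commit_time, deploy_time) pairs.
deploys = [
    (datetime(2022, 3, 1, 9, 0), datetime(2022, 3, 1, 11, 0)),
    (datetime(2022, 3, 2, 14, 0), datetime(2022, 3, 3, 10, 0)),
    (datetime(2022, 3, 4, 8, 0), datetime(2022, 3, 4, 9, 30)),
]

def deployment_frequency(deploys, window_days):
    """Deploys per day over the observation window."""
    return len(deploys) / window_days

def median_lead_time(deploys):
    """Median commit-to-deploy time (DORA 'lead time for changes')."""
    lead_times = sorted(deploy - commit for commit, deploy in deploys)
    mid = len(lead_times) // 2
    if len(lead_times) % 2:
        return lead_times[mid]
    return (lead_times[mid - 1] + lead_times[mid]) / 2

print(deployment_frequency(deploys, window_days=7))  # deploys per day
print(median_lead_time(deploys))                     # a timedelta
```

The stability metrics (change failure rate, mean time to recovery) are analogous ratios and averages over incident records.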
- What is the role of Python in your work?
- we use it and love it!
- existing skillset from incubation phase within python platforms
- right tool for the job
- lightweight automation
- hitting lots of APIs
- define lots of user facing specifications (json, yaml)
- pydantic has been great for creating descriptive, human and machine specifications.
- open source (we rely on it, we also have some presence)
- cookiecutter -> columbo
- gitpython -> pygitops
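The notes mention pydantic for descriptive, human- and machine-readable specifications. The team's actual models aren't shown in the episode; as a rough stdlib-only analogue of the idea (pydantic adds parsing, error aggregation, and schema export on top of this), a versioned spec with validation might look like:

```python
from dataclasses import dataclass, field

# Hypothetical spec for a user-facing YAML document, parsed to a dict
# upstream. Field names are illustrative, not Wayfair's real schema.
@dataclass
class ChangeSpec:
    version: int
    name: str
    paths: list = field(default_factory=list)

    def __post_init__(self):
        # Enforce the kinds of validations pydantic handles declaratively.
        if self.version not in (1, 2):
            raise ValueError(f"unsupported spec version: {self.version}")
        if not self.name:
            raise ValueError("name must be non-empty")

raw = {"version": 1, "name": "enforce-line-length", "paths": ["pyproject.toml"]}
spec = ChangeSpec(**raw)
print(spec.name)  # enforce-line-length
```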
- Can you tell me more about your application creation solution? Who can use it, and what does it actually do?
- Written in python, though it templates out code for any language.
- Runs automation to onboard an application to production
- git repo, build pipeline, calling out to various APIs to signal a new app is present
- Wayfair has a variety of applications (python, java, .net, php, javascript, some go)
- Team interested in integrating with our solution will create a github repository containing 1..* cookiecutter template(s)
- Provide a specification for what questions to ask users.
- Limitation with cookiecutter: the approach to asking questions isn’t dynamic, and there’s a lack of validation.
- Pat Lannigan -> Columbo (open sourced). Python DSL to describe the set of questions to ask users.
- python fastapi application will have a completely different set of questions than a java library for example.
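The core idea behind Columbo's DSL is that each question can decide whether it should be asked based on earlier answers. A tiny sketch of that mechanism (the real Columbo library offers richer interaction types like Echo, Confirm, Choice, and Text plus validation; everything below is illustrative):

```python
# Each question is a dict; "should_ask" is an optional predicate over
# the answers collected so far, which is what makes the flow dynamic.
def ask(questions, answer_func):
    answers = {}
    for q in questions:
        if q.get("should_ask", lambda a: True)(answers):
            answers[q["name"]] = answer_func(q["name"])
    return answers

questions = [
    {"name": "language"},
    # Follow-up only relevant when templating a Python web app.
    {"name": "wsgi_server", "should_ask": lambda a: a.get("language") == "python"},
]

# Simulated user input in place of interactive prompts.
canned = {"language": "java", "wsgi_server": "gunicorn"}
print(ask(questions, canned.get))  # {'language': 'java'} — follow-up skipped
```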
- You had mentioned that another part of your team scope is to facilitate the maintenance of applications. Can you tell me more about that?
- Reduce engineering toil around keeping applications up to date.
- Average engineer owns several to dozens of repos
- Create automated pull requests:
- Versioned dependencies (Renovate)
- Propagating platform changes (Gator)
- Ex1: python apps use "black" to format code and our python platform team would like to prescribe a line length. Our tooling can be used to declare desired changes. yaml specification -> pr automation at scale.
- Ex2: shared library, new version released, breaking interface change. Code instructions for performing AST manipulation and resolving breaking change for people.
- Shift from "We need you to do this" to "I am proactively letting you know that something needs to change, and I also made the change for you!"
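The second example, resolving a breaking interface change via AST manipulation, can be sketched with Python's stdlib `ast` module (the function names are hypothetical; in practice a tool like libcst is often preferred because it preserves comments and formatting, which `ast.unparse` does not):

```python
import ast  # requires Python 3.9+ for ast.unparse

# Suppose a shared library renamed `old_fetch` to `fetch_v2` (made-up
# names). A NodeTransformer rewrites call sites so an automated PR can
# resolve the breaking change for downstream repos.
class RenameCall(ast.NodeTransformer):
    def visit_Call(self, node):
        self.generic_visit(node)  # rewrite nested calls first
        if isinstance(node.func, ast.Name) and node.func.id == "old_fetch":
            node.func.id = "fetch_v2"
        return node

source = "result = old_fetch(url, timeout=5)\n"
tree = RenameCall().visit(ast.parse(source))
print(ast.unparse(tree))  # result = fetch_v2(url, timeout=5)
```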
- How do you actually go about creating automated pull requests?
- manual steps would involve cloning, checking out feature branch, applying code changes, staging / committing, pushing up branch, creating the PR
- gitpython is an existing and extremely powerful tool, but its API is fairly involved and (by design) doesn’t provide the type of high-level abstractions that we need.
- created pygitops (open sourced), built completely on top of gitpython
- high level abstractions for the workflow I described.
- coolest / most pythonic part about it is the "feature branch" context manager.
- code changes are made in the context of a feature branch
- when you intentionally or accidentally leave the context of a feature branch, we want certain things to be true (default / main branch, clean workdir, no unstaged changes)
- when writing PR automation, don’t have to worry about this!
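The invariant behind that feature-branch context manager can be modeled in a few lines. This sketch uses a fake repo object instead of a real GitPython repo, and the names are illustrative rather than pygitops' actual API; the point is that the cleanup runs whether you leave the block normally or via an exception:

```python
from contextlib import contextmanager

class FakeRepo:
    """Stand-in for a GitPython repo, tracking branch and dirtiness."""
    def __init__(self):
        self.branch = "main"
        self.dirty = False

    def checkout(self, branch):
        self.branch = branch

    def clean(self):
        self.dirty = False

@contextmanager
def feature_branch(repo, name):
    repo.checkout(name)
    try:
        yield repo
    finally:
        # Runs even if PR automation raises mid-change: clean workdir,
        # back on the default branch.
        repo.clean()
        repo.checkout("main")

repo = FakeRepo()
with feature_branch(repo, "automated-change"):
    repo.dirty = True  # simulate edits/commits on the feature branch
print(repo.branch, repo.dirty)  # main False
```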
- Can you describe some of the more technical details about how your change propagation system (Gator) works?
- heavily inspired by kubernetes resource model (resources are defined via a declarative specification)
- Kubernetes itself ships with resources that implement behaviors of common resources (pods, services, etc)
- Gator’s execution model is broken up into two parts:
- what repos to act on (Source)
- what are the changes that need to be applied. (Output)
- Ex: Source to proxy github search. Write a github search query to get back a list of repos
- Output to scan a repo for regex pattern at specified paths and replace with some fixed term. Very popular, engineers love find and replace.
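The Source/Output split can be illustrated with a declarative spec applied to in-memory file contents. The spec keys and the github search string below are hypothetical stand-ins for Gator's real schema; only the popular regex find-and-replace Output is modeled:

```python
import re

# Hypothetical declarative change spec in the spirit of Gator.
spec = {
    "source": {"github_search": "org:acme topic:python"},  # selects repos
    "output": {
        "type": "regex_replace",
        "paths": ["Dockerfile"],
        "pattern": r"python:3\.8",
        "replacement": "python:3.10",
    },
}

def apply_output(files, output):
    """Apply a regex_replace Output to a repo's files (path -> text)."""
    pattern = re.compile(output["pattern"])
    return {
        path: (pattern.sub(output["replacement"], text)
               if path in output["paths"] else text)
        for path, text in files.items()
    }

repo_files = {"Dockerfile": "FROM python:3.8-slim\n", "README.md": "python:3.8\n"}
print(apply_output(repo_files, spec["output"])["Dockerfile"])
# FROM python:3.10-slim
```

A real run would iterate this over every repo returned by the Source, then open automated pull requests with the diffs.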
- What are the most interesting, innovative, or unexpected ways that you have seen mamba / gator used?
- resource model of gator supports the idea that we don’t know what we don’t know
- reference k8s, CRDs, resource model.
- container execution
- log4j identification and remediation
- automate some of the work for identifying vulnerabilities
- java platform team was able to use java native tooling in the environment of their choosing to identify vulnerable apps.
- What are the most interesting, unexpected, or challenging lessons that you have learned while working on application lifecycle concerns?
- What do you have planned for the future of application lifecycle management/developer experience improvements at Wayfair?
- Hope to start open sourcing interesting aspects of our change propagation tool (Gator)
- Whether maintaining many open source projects or operating at the enterprise level, we think that some of our patterns and approaches can be shared! yaml -> code changes
Keep In Touch
Picks
- Tobias
- Nocciolata hazelnut spread
- Joshua
Links
- pygitops
- columbo
- backstage
- renovate
- DORA metrics
- TI-84 Calculator
- TI BASIC
- Wayfair Python Platforms Team Podcast Episode
- Pydantic
- Helm
- PyUp
- GitPython
The intro and outro music is from Requiem for a Fish by The Freak Fandango Orchestra / CC BY-SA
Hello, and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you're ready to launch your next app or want to try a project you hear about on the show, you'll need somewhere to deploy it. So take a look at our friends over at Linode. With the launch of their managed Kubernetes platform, it's easy to get started with the next generation of deployment and scaling powered by the battle tested Linode platform, including simple pricing, node balancers, 40 gigabit networking, dedicated CPU and GPU instances, and worldwide data centers.
Go to pythonpodcast.com/linode, that's l i n o d e, today and get a $100 credit to try out a Kubernetes cluster of your own. And don't forget to thank them for their continued support of this show. So now your modern data stack is set up. How is everyone going to find the data they need and understand it? Select Star is a data discovery platform that automatically analyzes and documents your data. For every table in Select Star, you can find out where the data originated, which dashboards are built on top of it, who's using it in the company and how they're using it, all the way down to the SQL queries. Best of all, it's simple to set up and easy for both engineering and operations teams to use.
With Select Star's data catalog, a single source of truth for your data is built in minutes, even across thousands of datasets. Try it out for free and double the length of your free trial today at pythonpodcast.com/selectstar. You'll also get a swag package when you continue on a paid plan. Your host as usual is Tobias Macey. And today, I'm interviewing Joshua Woodward about how the application life cycle team at Wayfair uses Python to accelerate application delivery and improve developer experience. So, Josh, can you start by introducing yourself? Sure, Tobias. My name is Josh Woodward. And for the past year, I've been managing the application life cycle team at Wayfair.
[00:02:04] Unknown:
Prior to that, I was an IC on the Python platform team. 1 of the really cool things that I got to do was embed with teams looking to decouple from our PHP monolith. 1 thing I really enjoyed about that is I got to see a lot of pain points from that experience firsthand, and that kind of folds directly into a lot of what I wanna talk about with you today. And do you remember how you first got introduced to Python? Yeah. It was pretty interesting. In my high school physics class, I, at the time, didn't know much about programming, but I had a TI-84 calculator, which is capable of writing TI-BASIC programs. So I was amazed when a friend of mine wrote a program to solve vector problems. You know, input any amount of vectors with their angle and magnitude, and you can figure out the resultant vector. I thought that was really neat. So I then started using TI-BASIC to solve other pretty specific physics problems for me. It was pretty limited and rudimentary, but, you know, you'd give it a fixed set of inputs. It would run through a very, very specific equation. So I'd write, like, a program for each type of problem I wanted to solve. So as I approached college, I was thinking about student loans. I heard about Python and decided to give it a shot. I wrote a really, really basic script to simulate various, you know, payback and interest scenarios.
It wasn't anything super complex. I think I just multiplied some numbers together and printed them out. But that was kind of the primer for when I went to college initially as a mechanical engineer. Within 3 weeks, I think I switched my major to software engineering. Some friends of mine down the hall were using Python to draw cool images for homework. They were using turtle in Python, so I thought that was really cool. And I was jealous they got to draw pictures for homework. So that was kind of all I needed to switch over. Yeah. The turtle module is definitely always fun to play around with, and it's a good way to get people interested.
[00:03:52] Unknown:
For sure. And so you mentioned that for the past year, you've been managing the application life cycle team at Wayfair. I'm wondering if you can describe what the role and context of that team is and some of the story behind its formation and responsibilities.
[00:04:09] Unknown:
Sure. I'll start by kind of giving the context in the background about how the team came to be. Around 2018 at Wayfair, we were in a state where we had a lot of deploy congestion. It was challenging for engineers to iterate and ship changes to our PHP monolith. Our tech org and our leadership had kind of invested in containerization and decoupling as a way to directly combat this problem. So we had already decided that we wanted to double down on decoupling from the monolith. So at that time, you know, teams were incentivized to do that. When I was on Python platforms, the team had already been experimenting with code templating. It was fairly basic. You know, there was a repository that housed a cookie cutter template, and folks interested in creating their own decoupled application could clone that repo and run the basic cookie cutter command to template out some code. And that was all that it did. So it kind of gave you the code itself, but left it kind of at that point.
When I spent about a year embedding with 4 different teams to kind of help knowledge transfer around setting up, you know, containers, the application, Python best practices, testing, linting, we got a lot of great feedback on our tooling. But 1 thing that stood out is it took even senior engineers with a lot of domain expertise weeks to get something set up. You know, you had to know exactly who to contact, how to click the right buttons, what ticket to file, and exactly how to fill it out. And we saw that as, like, a pretty big problem. So kind of the approach we took is we started by counting the manual steps that it actually took to get a very, very basic hello world rest API in our production environment. And to our surprise, it took about, like, 60 distinct steps, which was, like, pretty awful. Again, like, really experienced senior engineers would take weeks to get this set up. So without really having a focus on the product at first, we just had a focus on reducing the amount of manual steps. We knew that no matter what we did, we wanted to kind of drive that figure down. So I think I remember years ago, 1 of my kind of quarterly objectives was, can we take this number of 60 and reduce it to 40? So we kind of got started thinking about that. We ultimately released a product, which we call Mamba. It's on the theme of snakes, which is really cool.
And, initially, the application life cycle team, which at the time was incubating within Python platforms, supported 1 main user story, which was, you know, as an engineer, I would like to create a production ready application in 10 minutes so that I can have a reliable and standardized application setup that follows best practices. That team ultimately grew out of Python platforms. We created our own team with our own scope. At this point, that was about 1 and a half years ago. So
[00:06:48] Unknown:
summer 2020 at that time. Given the time since the initial development of the application life cycle team to where we are now, how has the scope and responsibilities changed or evolved and some of the ways that the team is integrated into the rest of the organization started to formulate and become more standardized?
[00:07:13] Unknown:
So when the team was created, we had the 1 solution to manage the application creation process, but we decided to carve out somewhat of an ambitious scope. And we decided that instead of just focusing on application creation, it made sense to deal with the entire life cycle of kind of abstractly managing a decoupled application. So not only did we want to support the creation of these apps, but we also recognized that even maintaining these apps and keeping them up to date and keeping them running in our production environment involved a lot of toil for our developers, so we wanted to support that as well. Also, we care about application decommissioning.
1 thing we wanted to avoid was having a bunch of things running in our production environment that didn't really have anything referencing it and weren't actually providing production value. So we felt that it was necessary to also spin down applications at the end of their life.
[00:08:03] Unknown:
As far as the sort of formalization of that life cycle where a lot of times you say, at the outset, I have the need to be able to build the service or product, but there's usually not a lot of thought that goes into the sunsetting aspect of a project. It may just stay alive indefinitely and start to become kind of Frankensteined with unrelated functionality just because it's already there. And I'm wondering what your responsibility is as far as being able to help maybe combat that kind of organic growth and helping to create that overall life cycle plan at the outset and just some of the ways that that is becoming standardized in the Wayfair organizational culture?
[00:08:50] Unknown:
I think of the 3 parts of our scope, decommissioning is where we have invested the least amount of effort. So right now, we have a self-service workflow where you can opt in to have your application decommissioned. We'll do some very, very basic stuff, you know, renaming your GitHub repo, spinning down your build pipelines, etcetera. The idea is you should be able to generate and decommission apps indefinitely using the same name and the same terminology. Though, we aren't doing anything super intelligent on the side of, you know, proactively identifying applications that would be candidates for decommissioning.
Though 1 project that we're working on now is kind of getting that insight into the various state of the applications that we have at Wayfair. So we think that insight will help us, you know, focus attention to things that require it. Given the fact that your
[00:09:40] Unknown:
mission is to be kind of an enabling team to the rest of the engineering group, I'm curious how you think about identifying what are the areas of greatest leverage that you can focus your efforts in identifying and scoping the types of projects that you want to work on, whether that's from external inputs from other engineering teams or visibility of the work that's in flight and maybe internal insights that you're building from working with the various stakeholders across the organization?
[00:10:11] Unknown:
Cool. Yeah. So I think 1 of the fundamental approaches we take is we're willing to make assumptions, but we're also willing to be wrong. So anytime we want to build a new solution or kind of experiment, we like to validate our assumptions with our customers. So we'll kind of take baby steps initially and make sure that what we set out to build is indeed being used and provides value. Part of this is knowing customer pain points. So the approach I described earlier where I went and embedded with different teams, that's something that I think has really helped us historically and currently for sure. Another thing that is interesting about the way we work is we have a concept called free day Fridays, which is just about what it sounds like, where on Fridays, engineers are encouraged to spend time solving problems that they really wouldn't get to solve in their, you know, otherwise day to day work, the work that's, you know, fulfilling objectives and key results. That's where our innovation happens. You know, we're comfortable taking risks and using that Friday time to do some things that are a little bit out of the box even if they're thrown away. These solutions that I wanna talk to you about today all came from that free day Friday time where we got to experiment.
People ended up liking kind of the stuff that we were building, and we just took it from there. So we never once said, we're gonna build and release this new product, and it's gonna be great. We were always, you know, solving a problem or kind of, like, building off of innovation
[00:11:35] Unknown:
that we had kind of landed on. Before we dig into the specific projects, I'm also interested in understanding how you approach the kind of marketing of the capabilities that you're providing to the rest of the organization because Wayfair is a fairly large company. My understanding is that there are a number of different physical locations, and I'm sure that over the past couple of years, a lot of your work has been done remotely. And so just the overall challenge of raising awareness of the work that you're doing so that other teams can take advantage of it and some of the ways that you measure the overall impact that you're having on the kind of developer velocity of the engineering group at Wayfair?
[00:12:16] Unknown:
Definitely. Yeah. Wayfair is a pretty big place. There's about 3,000-plus engineers. So we do some level of broad marketing. You know, we'll send out what we call release notes when something is ready to be announced publicly, but release notes certainly aren't the main contributor to adoption. 1 of the things we focus on is selling our products and our solutions to our early adopters. So we really love working with people that love working with us. So the people that are giving us positive feedback and helping make our product better, we really enjoy working with them. And those people end up being our promoters. So we try to find, you know, the interesting problems that need to be solved and kinda do our homework to vet that these are indeed problems that need to be solved. But we'll just try to build a strong product in close collaboration with these early adopters. And, you know, by the time we have a mature product, it's already being marketed for us. So 1 thing about our work is that we don't force people to use our products. 1 of the ways we measure success is by adoption of our tooling. So like I mentioned, we'll market things to the early adopter category and from there, see kind of where things land. So if we see that traffic to a product or net usage is increasing organically, that's usually a good sign. Early on, we cared about looking at, you know, just the net number of microservices being spun up because we are heavily involved in the application creation space. We kind of viewed more is better, which is kind of narrow minded. But, like, I think when we're spinning up that new product and decoupling, that was the way to go. A lot of what we do also has to do with automated pull requests. Like, a lot of our work has to do with that. So we will just measure the raw number of automated pull requests being created and merged. Again, you know, more automated pull requests being merged isn't necessarily a good thing.
So all of our work, we do try to tie to what are called the DORA metrics.
Those are 4 distinct measurements that Wayfair instruments and other, you know, enterprises do as well, and they have to do with throughput. So deployment frequency and lead time for changes as well as stability, which is change failure rate and mean time to recovery. So the idea is, like, if our tooling can help deployment frequency increase and lead time for changes go down, I think we're doing our job well. Given the fact that the
[00:14:34] Unknown:
initial core software capability of Wayfair is built around this PHP monolith. I'm curious if you can talk to the role that Python has in the work that you're doing and some of the ways that you think about the selection of language and ecosystem and tooling for being able to provide to other members of the engineering org?
[00:14:56] Unknown:
So within our team, we prefer to use Python for a couple of reasons. 1 of the main reasons was just based on history. We do have that existing skill set from our incubation phase within Python platforms. But, also, we do think it's the right tool for the job that we're doing. We run a lot of lightweight automation. We're hitting a lot of APIs. We define a lot of user facing specifications in order to integrate with our tools. So we just find Python to be really, really easy to use for what we're trying to do. An example is something like Pydantic has been fantastic for creating descriptive human and machine readable specifications.
So a lot of the way we interface from a technical standpoint, with our customers is through YAML specifications, and I can talk more about that later. But Pydantic has made it really, really easy to version these specifications and enforce, you know, different validations. That's stuff that, sure, would be possible in other programming languages, but we just find it really, really easy to do with Python.
[00:15:55] Unknown:
In terms of the actual specific tools and work that you're doing, 1 of the things that you mentioned is simplifying the process of being able to spin up a new application and get it into production. I'm wondering if you can talk to the way that you have built that system and some of the approach that you've taken to be able to reduce those manual steps and automate as much as possible while still being understandable and maintainable for the consumers?
[00:16:23] Unknown:
So 1 thing that's really neat about this is although the system itself is written in Python, it can template out code for any language. So Wayfair supports Python, PHP, Java, dot net, JavaScript, some Go. So there's just a lot of variety out there, and we don't see ourselves as being opinionated at all as far as, like, what the right tooling choices, and we want to enable all sorts of solutions to be built. So our system generally is 2 phases. It will both template out code according to templates and run automation to onboard an application to production. So that's things like creating a GitHub repository, creating a build pipeline, calling out to various APIs to signal that a new app is present, stuff like that. There's a lot of, like, some proprietary stuff, but just things that developers would have to go through in order to, you know, stand their thing up. So any team interested in integrating with our solution will create a GitHub repository containing 1 to many cookie cutter templates.
So it is using cookie cutter under the hood. 1 of the cool things is on top of that, they will provide a specification for what questions to ask users. So with cookie cutter, generally, you'll provide a JSON payload describing what questions you want asked and what their default values are. For us, there was a little bit of a limitation where we want our question asking approach to be dynamic. An example is a Python FastAPI application will have a completely different set of questions that we wanna ask our users than a Java library, for example. Additionally, if you are building a new application and you want to use a database or have an integration with a messaging application such as Kafka or Google Pub Sub, and you say, yes. I want that thing. We may have an entirely different set of questions we wanna ask in addition. So 1 of my teammates, Pat Lannigan, wrote an open source library called Columbo.
And what that offers our template designers is basically a Python DSL to describe the set of questions to ask users. It allows them to determine whether or not a question should be asked in the 1st place based on previous answers. So there's the dynamism. It allows for a rich validation, default values, and a lot of different types of questions that can be asked as well. So you might just wanna echo some text. You might want a Boolean value. You might want a multi choice, those types of things.
[00:18:45] Unknown:
And as far as the templating aspect and being able to multipath the questions that are being asked, I'm wondering if you can talk to some of the ways that you and your team have had to educate other members of the org about how to think about either building out their own templates or some of the edge cases that they need to be aware of or, you know, maybe integrating language ecosystem best practices into these templates to be able to make sure that they're as useful as possible and that you don't, you know, generate a bunch of boilerplate, and then the end user then has to go and change, you know, 20% of what's there? Yeah. I think that's a great question. It puts our team in a very, very interesting position where we aren't super opinionated about the content of the templates themselves
[00:19:34] Unknown:
and want to enable teams that are interested in creating project templates to do so. We don't have too much insight or standards in terms of what the actual content of the generated application is, though we do have ways of verifying that templates are doing what they're supposed to do. So the template repositories themselves, we know where these template designers will create their logic for their templates. We have enabled different build steps to basically verify, hey. For all of these templates that you have and all of these questions that you're going to ask, we wanna make sure that at a minimum, we are capable of successfully running the cookie cutter command to actually generate out a new application.
And there are various things that we might wanna check and assert on your application. If this base set of expectations isn't met, we will fail your build. A really high-level example: Wayfair is undergoing a migration from on-premise data centers to Google Cloud Platform. And in one of our manifest files where we describe the data center to deploy containers to, you can imagine, because we're at such a high-leverage point in the application creation process, we probably want to assert that all templates now are, at a minimum, deploying new applications to GCP and not our old on-premise data centers. So those are the types of opinions we will hold, and we will hold them strongly. But aside from that, it's very much on the implementers to maintain standards.
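As a hedged illustration of that kind of build-step assertion (the datacenter names, file layout, and check below are all hypothetical; the real pipeline presumably generates the project first, e.g. via cookiecutter with default answers, and then inspects the output):

```python
from pathlib import Path
import tempfile

# Hypothetical check mirroring the GCP migration example above: after a
# template is rendered, fail the build if any manifest still targets the
# old on-premise datacenters. The datacenter names are made up.
ON_PREM_DATACENTERS = {"dc1", "dc2"}

def assert_deploys_to_gcp(project_dir: Path) -> None:
    """Raise if any rendered manifest targets an on-prem datacenter."""
    for manifest in project_dir.rglob("manifest.yaml"):
        text = manifest.read_text()
        for dc in ON_PREM_DATACENTERS:
            if f"datacenter: {dc}" in text:
                raise AssertionError(f"{manifest} still targets on-prem {dc!r}")

# Quick demonstration against a generated-looking directory on disk.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "manifest.yaml").write_text("datacenter: gcp-us-central1\n")
    assert_deploys_to_gcp(root)  # passes: no on-prem target found
```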
We've found it to be, like, fairly successful so far where the teams that own these templates are incentivized internally to have these best practices in mind. So that hasn't really been a problem for us. Though if something out in the wild is found as an improvement to be made, we generally fold it right back into the templates.
[00:21:22] Unknown:
One of the always interesting and challenging aspects of templating applications from a cookiecutter is that at the time that they're created, they represent the best practice, and, you know, maybe they have the updated dependencies in the generated lock file, for instance. But as time goes on and bit rot sets in and the language ecosystem evolves, you might go back to that template and say, okay, you know, we used to use unittest and nose, now we wanna use pytest, and now we need to update the version of Django or Flask or what have you in the cookiecutter so that when you start a new project, you're starting at the latest version. And then being able to bring that forward into applications that were previously generated from that template. I know, for instance, there's the Copier project that is designed to handle some of that capability, and there are some add-ons to the cookiecutter ecosystem to handle that aspect. And I'm just wondering how you think about the ongoing maintenance of those templates for being able to bring in those upgrades to dependencies or upgrades to best practices or the organizational standards, and then being able to alert teams who previously used that template to those changes to determine when and how to incorporate those updates?
[00:22:45] Unknown:
Yeah. It's a really, really hard problem. Not just hard, but, like, borderline impossible when you're dealing with the scale that Wayfair does, because the diff that you have between a template and applications that exist out in the wild can be anything. We don't really know what the state of all the different applications is, and that presents a really big challenge. So that actually kind of folds into the second portion of our team's scope, which is application maintenance. I think our stance is we acknowledge that over time, applications are going to be out of date with the latest and greatest standards. An app generated today might look great, but 3 years from now might not be reflective of, you know, the desired tooling or standards or even dependencies. So instead of trying to solve the technical problem from a template standpoint (you know, hey, this template's evolving, let's fold these changes in directly),
We've taken a little bit of a different approach where through different pieces of tooling that create automated pull requests,
[00:23:45] Unknown:
we use that to try to keep applications up to date. I can tell you a little bit more about that. Yeah. Definitely interested in exploring some of that kind of automated maintenance aspect and some of the ways that you're able to introspect the current structure of the project and the dependencies and understand what types of modifications are useful to make, which ones are safe to make, and just being able to hook that into the overall ongoing life cycle and maintenance of the systems that you're supporting?
[00:24:14] Unknown:
Totally. I'll break up the application maintenance space into 2 broad categories: one of them is very easy to describe, and the other is less easy to describe. The first one I'll just call versioned dependencies. So if you have a Python application, you might rely on Python packages. You might rely on some base Docker image. If you're using Kubernetes and a technology such as Helm, you are potentially relying on Helm charts. And if you have a build system like we use, you know, that may have versioned plugins as well. So managing those versioned dependencies is one of the problems that we've addressed over the past year and a half.
We started off using a variety of in house solutions, but are now favoring an open source solution called Renovate, which is able to handle all of the things I described above. It's highly configurable, and it allows application engineers to really, really describe the way that they want to manage their versioned dependencies. So our team takes direct responsibility for helping application owners keep their dependencies up to date, and that's kind of 1 half of the application maintenance
[00:25:21] Unknown:
problem. And so the dependency management aspect of it, you know, I've definitely seen a few different tools that approach that. So there's the Dependabot project that was acquired by GitHub, and there are language-specific options. I have come across Renovate, but haven't used it specifically. So I'm interested in understanding some of the benefits that it provides given its stance as a multi-language-ecosystem tool, some of the ways that it's able to work across languages, and maybe some of the capabilities that would be nice to have from some of these more focused solutions such as the PyUp bot or something like that? The way we view Renovate and its advantages over other tooling that we've used before is that Renovate is very, very, very configurable.
[00:26:06] Unknown:
So we don't have to build our own configuration and kind of reinvent the wheel. We just get to tell people, hey, listen, you know, Renovate's open source docs are really great on how to set up auto-merging and how to, you know, write your own plugins for managing various nuanced types of dependencies that maybe Renovate doesn't handle out of the box. Certainly that's an advantage, where it's, like, very, very flexible. Another interesting aspect is the way that Renovate actually ships, which is through a Docker image; I believe it's a Docker image and/or, like, an NPM package. Because Renovate is open source and not a vendored solution, we have to kind of stand it up and run it in house, and from a technical standpoint, the way that we actually do that is very, very straightforward. Whereas with other tools, instead of shipping a Docker image or an NPM package, it's more of an SDK where we have to stitch everything together, kind of reinvent the wheel config-wise, and figure out how to expose different things to users. We just don't have to solve a lot of those problems with Renovate. So for those reasons, it's certainly preferred.
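For reference, Renovate is typically configured through a JSON file checked into the repository; a minimal config along the lines described (see Renovate's own documentation for the authoritative option names and defaults) might look like:

```json
{
  "extends": ["config:base"],
  "packageRules": [
    {
      "matchUpdateTypes": ["minor", "patch"],
      "automerge": true
    }
  ]
}
```

Here the shared preset handles sensible defaults, and the package rule auto-merges low-risk minor and patch bumps, which is the kind of per-team tuning being described.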
Digression: I don't wanna flame Dependabot. We use Dependabot at Wayfair. We don't use it in GitHub, but there's the open source Dependabot core, and it's basically a Ruby SDK. So we've had to, like, stitch together a bunch of, like, weird janky Ruby stuff and write our own Ruby scripts to figure out how to do this. We've also had to literally reinvent the wheel config-wise to figure out how do we take Dependabot's config that they've documented and then, like, expose that to our users in a way that is compatible with the way that we've implemented the, like, Ruby SDK. So I just didn't wanna speak ill of Dependabot, but that's, like, what we're going through. Fair enough. Okay.
[00:27:49] Unknown:
And the other aspect of the kind of automated maintenance that you mentioned is beyond just keeping the dependencies up to date. There's also the question of maybe doing vulnerability and security scanning, linting, maybe adding in some fitness functions for enforcing different architectural aspects of the application. I'm wondering if you can talk to some of the ways that you have worked with teams to automate some of that and some of the tools or internal capabilities that you've developed as a result.
[00:28:20] Unknown:
Absolutely. Going back to the question you were asking before about, you know, the diff when an app is created from a template and time passes: we realized pretty early on that that was a big problem for us, and we wanted to address it. So the way we view that is we just wanted a general solution for creating pull requests at scale, basically creating a platform that allowed various engineers at Wayfair to describe changes that they wanted to make at scale. So we have an in-house solution called Gator that we've built, which is really, really neat. An example of something you might wanna do with Gator: imagine you're the Python platform team at Wayfair, and Python apps are using the Black tool to format code. And our Python platform team, for some reason (this wouldn't actually happen), is opinionated about the line length that they would like all Python apps to use. Our tooling enables them to make declarations like: hey, I want to find all Python apps, I want to search in a specific file such as a pyproject.toml, and I might wanna do a modification, like a regex replace, to look for, you know, the line length of 88 and replace it with 120.
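Gator's YAML config is internal, but the core of that basic example is just a scoped regex replacement; a toy Python sketch of the operation (the file contents and line-length values simply mirror the example above):

```python
import re

def bump_line_length(pyproject_text: str) -> str:
    """Replace Black's line-length of 88 with 120 in pyproject.toml content."""
    return re.sub(
        r"^(line[-_]length\s*=\s*)88$",  # match either line-length or line_length
        r"\g<1>120",
        pyproject_text,
        flags=re.MULTILINE,
    )

before = "[tool.black]\nline-length = 88\n"
print(bump_line_length(before))
# [tool.black]
# line-length = 120
```

In the platform described, a few lines of YAML would declare the target repos, the file to search, and a replacement like this, and the system would fan it out as pull requests.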
That's fairly basic. But with a couple lines of YAML, we enable the Python platform team to make that type of change at scale. Another, more complex example: imagine you own a shared library that many, many applications are using. A new version of it is released, but it includes a breaking interface change. Historically, you know, that would happen, and we'd ask people: hey, please use this new version of the library, and you have a month to do it. And, obviously, people get upset about that, and when they inevitably don't do it, things break. So to support the shift from "we need you to do this" to "hey, I'm proactively letting you know that something needs to change, and I made the change for you," we, through the system, allow for very, very complex code manipulation to happen.
We basically allow you to run your own container in your own environment that describes exactly the type of code modification you wanna do. So our Python platform team has gone as far as performing AST manipulation to resolve the breaking change for people. So, you know, the YAML config will reference a Docker container where they've written these instructions up. That's how it integrates with our system. Those are the types of things that are being done with our system. Been very, very helpful on a variety of use cases, things that we wouldn't have expected are being done with it, and it's really neat.
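Wayfair's actual migrations are more involved and run inside those containers, but as a minimal sketch of AST-based code modification in Python (standard library only, requires Python 3.9+ for ast.unparse; the function rename here is a made-up breaking change, not a real one from the episode):

```python
import ast

class RenameCall(ast.NodeTransformer):
    """Rewrite calls to the old `get_user(...)` as the new `fetch_user(...)`."""

    def visit_Call(self, node):
        self.generic_visit(node)  # rewrite nested calls first
        if isinstance(node.func, ast.Name) and node.func.id == "get_user":
            node.func = ast.Name(id="fetch_user", ctx=ast.Load())
        return node

def migrate(source: str) -> str:
    """Parse source, apply the rename, and emit the updated code."""
    tree = RenameCall().visit(ast.parse(source))
    return ast.unparse(ast.fix_missing_locations(tree))

print(migrate("u = get_user(42)"))
# u = fetch_user(42)
```

A transformation like this, packaged in a container and referenced from the YAML config, is the shape of "resolving the breaking change for people" described above.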
[00:30:53] Unknown:
On the AST manipulation and code restructuring aspect of it. I know that in Python, there are a lot of capabilities built into the AST and pretty similarly with some of the other dynamic languages. I'm curious if there are any language environments where you have run up against complexities or limitations in the kind of introspectability of the software to be able to automate some of these changes and you've had to fall back on other approaches?
[00:31:19] Unknown:
You know, I think 1 of the benefits of being where we are in terms of running this platform is we don't need to know about those details. So we haven't really dug into those. It's kind of like, hey. We'll give you the tools to do whatever you need to do. And if you can't, that's kind of on you to figure out. So, you know, we're not necessarily privy to all of those details. I think, like, within different teams, they're trying to figure out how to do some complex stuff. But I'm kind of unaware of all of that, which I think is kind of a good thing in this case. So we've taken the approach of being as unopinionated as possible with our tooling and and letting people do what they wanna do. To the point that you're making about being able to
[00:31:57] Unknown:
automatically make pull requests and suggest changes to either update dependencies or improve code structures or introduce bug fixes. What is your approach for being able to actually automate the creation and management of those pull requests across such a large number of repositories?
[00:32:17] Unknown:
Yeah. It's an interesting problem. We use the GitPython library, which is an extremely powerful tool. It basically sits on top of Git and allows you to do anything that Git does. Its API is fairly involved and, by design, doesn't provide the type of high-level abstractions we need. Like, if you think about how a human will create a pull request: they'll likely clone a repo, check out a feature branch, apply some changes, stage, commit, push up the branch, and create the PR. So we thought it made sense to have high-level abstractions that mirror that workflow.
We have a couple of systems at Wayfair that create automated PRs, so it made sense for us to first, you know, create those within one system and then abstract it out into a library that we could reuse. We ended up creating an open source library called pygitops. Again, it's built completely on top of GitPython, which we think is an amazing library; it just adds some stuff on top of it. And, again, it's a high-level abstraction for the workflow I described. One of the coolest parts about it, that I think is, like, the most Pythonic part that I'm really in love with, is the feature branch context manager.
So, again, as a human, if you are making code changes that ultimately get pushed up, you're likely going to be making those changes in the context of a feature branch. So, you know, when you intentionally or accidentally leave the context of a feature branch, we always want certain things to be true. We always want you to be on the default or, you know, main branch of your repo. We want a clean working directory, we want no unstaged changes, etcetera. This is important because when you're creating many, many feature branches and PRs against many repos kind of all at once in batch, we wanna make sure that the Git state of that repository is always clean. We've had some interesting edge cases where changes from one PR get pulled into another PR, or we have Git just completely barf and break things because, you know, we had an unstaged change when we checked out a feature branch or something crazy like that. And we built a lot of those learnings and edge cases into pygitops.
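pygitops' real API is built on GitPython and differs from this, but the invariant described (always land back on the default branch with a clean working directory, even when something raises mid-change) can be sketched with a toy context manager; the FakeRepo below is a stand-in for illustration, not a GitPython Repo:

```python
from contextlib import contextmanager

class FakeRepo:
    """Stand-in for a Git repository, just enough to show the invariant."""

    def __init__(self, default_branch="main"):
        self.default_branch = default_branch
        self.active_branch = default_branch
        self.dirty = False  # True when there are unstaged changes

    def checkout(self, branch):
        self.active_branch = branch

    def discard_changes(self):
        self.dirty = False

@contextmanager
def feature_branch(repo, branch_name):
    """Check out a feature branch; on exit (normal or via exception),
    restore the default branch and a clean working directory."""
    repo.checkout(branch_name)
    try:
        yield repo
    finally:
        repo.discard_changes()
        repo.checkout(repo.default_branch)

repo = FakeRepo()
try:
    with feature_branch(repo, "update-deps"):
        repo.dirty = True
        raise RuntimeError("something went wrong mid-change")
except RuntimeError:
    pass

print(repo.active_branch, repo.dirty)
# main False
```

The try/finally is what makes batch PR automation safe: no matter how one repo's change fails, the next change starts from a known-clean state.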
You know, it's become more stable over time. When you write PR automation, you generally don't have to worry about those edge cases anymore as a result. pygitops has 100% unit test coverage, and it's not just a lot of, like, mock unit tests. Because of the way Git works, and because a remote repo can just be another directory, we get to use pytest's tmp_path fixture in our test cases. So we'll start our test cases by, you know, creating a remote repo on disk. We'll clone it, we'll do some stuff to it, and we'll make a lot of assertions on a really live repo. So we're not really faking anything; we're making sure that things work in a sensible way, which I think is a really, really powerful part of Git, GitPython, Python itself, pytest, all of the above.
[00:35:06] Unknown:
1 of the things that's always interesting to explore is the philosophy around when and how to open source projects when building specifically for internal consumption. And I'm curious if you can talk to how you think about which tools are worth open sourcing, whether you bias towards building the tools that you're supporting the rest of the organization with in a way that it can be released publicly. And if you generally approach the sort of open sourcing of the tools up front or if you build the tool first and then decide, okay. This can be extracted. And then just some of the process that's involved in actually sanitizing it and updating it so that it is more broadly applicable and not tied to your organizational assumptions?
Yeah. Great question. I think
[00:35:56] Unknown:
in both of the open source libraries I referenced, there's a little bit of a different story. So I will say with Columbo, which again is the library that allows for specification of the questions being asked in that kind of entire workflow, the teammate of mine that wrote it, Patrick Lanigan, I think had a more intentional approach from the beginning, where he had a good vision for the project itself and knew that it would be a good candidate for open source. And that came after, you know, doing a lot of research on solutions that were available, which didn't do exactly what we needed them to do. And so because it led to us having to create a custom solution, I think part of his idea was giving back to the community from the very beginning and having it be open sourced.
The story with pygitops is a little bit different. Initially, we had Git automation that touches just about every project that we run in a variety of ways. So it definitely started off, you know, 2 years ago as: how do we even do this in the first place? Like, how do you actually technically make this possible? So it was very scrappy, very, very iterative. We didn't have a design in mind, and so it was just a whole mess of various things that we needed to do. And it wasn't until we had system number 2 that wanted to do a very similar thing that I had even considered abstracting things out. So we then took the step of kind of, like, making a little bit of a cleaner API and creating a library that was reusable. And by the time we had the 3rd or 4th system actually using pygitops, we thought that it would be appropriate to open source it, just because, you know, we really like GitPython, and we wanted to, like, give it credit and call out the accomplishments of GitPython where possible, but also provide a nice abstraction for other folks looking to create PRs.
[00:37:44] Unknown:
And in terms of your work that you've done with the engineering group at Wayfair and some of the tools that you have built to enable their application delivery and life cycle management and improve the overall developer experience and workflow? What are some of the most interesting or innovative or unexpected ways that you have seen the different tools that you've built used and maybe some of the interesting approaches that you have developed internally to be able to speed up the time to delivery for these different applications?
[00:38:19] Unknown:
I think this answers your question in a way. One of the really, really interesting things that happened in recent history: if you're familiar with the Log4j issue that happened back in December, the change propagation system I told you about earlier, Gator, has a flexible resource model, and it kind of supports the notion of, like, we don't know what we don't know. So it allows people to very abstractly, you know, target repos and then run some operations against them. So one of the really, really neat things is that Gator was actually used in the remediation of Log4j, where, as I mentioned, we allow implementers of these Gator change sets to call out to a container. So we actually had Java platform engineers writing some bash scripts to detect whether or not a repo had a Log4j vulnerability, and that was able to be plugged into our system. The output of all that was that we were able to create GitHub issues on these repos to kind of flag them to our internal ops teams. And that was a pretty interesting way that our tooling was used.
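The actual detection scripts were bash and internal, but a rough Python analogue of the idea (naively scanning build files for a log4j-core pin below the patched 2.17.0 release; the version parsing here is deliberately simplistic and for illustration only) might look like:

```python
from pathlib import Path
import re
import tempfile

# Match a log4j-core coordinate followed by a dotted version number.
LOG4J_RE = re.compile(r"log4j-core[:\s>]*([\d.]+)")

def vulnerable_files(repo_root: Path):
    """Return build files that appear to pin log4j-core below 2.17.0."""
    flagged = []
    candidates = list(repo_root.rglob("pom.xml")) + list(repo_root.rglob("build.gradle"))
    for build_file in candidates:
        m = LOG4J_RE.search(build_file.read_text())
        if m:
            version = tuple(int(p) for p in m.group(1).split("."))
            if version < (2, 17, 0):
                flagged.append(build_file)
    return flagged

# Demonstration against a throwaway repo checkout on disk.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "build.gradle").write_text(
        "implementation 'org.apache.logging.log4j:log4j-core:2.14.1'\n"
    )
    print([f.name for f in vulnerable_files(root)])
# ['build.gradle']
```

Plugged into a change-propagation system like the one described, the output of a scan like this is what would drive the automatic creation of GitHub issues on flagged repos.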
[00:39:21] Unknown:
In your own experience of working on this team and helping to figure out how to reduce friction in the process from going from idea to delivery? What are some of the most interesting or unexpected or challenging lessons that you've learned in the process?
[00:39:37] Unknown:
One of the lessons that I've learned along the way, that I think is applicable to, you know, most of software engineering, especially when you're doing platform work for a large enterprise, goes back to the diffusion of innovation, or the adoption curve, that I talked about earlier, where, you know, now we know very well to target the early majority, and we know that that's a formula that works for us for all the reasons I described earlier. One hard lesson along the way was kind of this expectation that I had that I'd build a thing and then everyone would immediately wanna use it. Even in recent history, you know, maybe, like, a year and a half ago, I built a solution that I just kind of assumed people would wanna use. Some people were really excited about it, and that was great and reassuring. But it was kind of a letdown when not everyone wanted to adopt it right away. You know, when you think about things in the context of that adoption curve, it makes total sense: you're marketing to a huge set of people with vastly different concerns and sets of priorities and tolerances for risk. That was definitely a challenge, kind of, like, a tough pill to swallow. But I like us thinking about our work in this new format where we market to people that wanna work with us, and then from there, you know, things spread organically. So it's okay if, you know, it takes a year for a solution to reach a different engineer that wouldn't otherwise have adopted it immediately. Like, we're totally okay with that. As you continue to work with the various teams at Wayfair and
[00:41:05] Unknown:
manage your own team's concerns, what are some of the things you have planned for the near to medium term that you're excited to dig into?
[00:41:14] Unknown:
Yeah. I think the focus for us for the rest of the year is continuing to do what we're doing and supporting the same products we've been supporting, but with an eye for how it fits into our larger platform. In other words, we've built some useful stuff that has, you know, smoothed out developer experience and helped reduce toil. And now it's a matter of folding these into other products in a way that's very, very seamless. So if you're a Wayfair engineer, you don't have to understand, you know, product A, B, and C and tool X, Y, Z. You're just using, you know, Wayfair's platform, and the stuff that we've talked about today is just a part of it. So that's pretty vague and abstract, but a lot of it is, you know, getting our products and our tooling to a place where they fit more cohesively with surrounding tools. Are there any other aspects of the work that you're doing on the application life cycle team at Wayfair that we didn't discuss yet that you'd like to cover before we close out the show? Yeah. One thing we didn't talk about yet is the change propagation system, which we call Gator. I think it's an interesting thing to consider for open sourcing. The tool itself and the entire product are quite complex, and there's a lot of, you know, Wayfair-specific stuff involved there.
Though the core technology, which, again, is very similar to the Kubernetes resource model, allows you to specify abstractly ways to grab repos that we care about and what you actually want to, you know, run against them in terms of code changes that automate PRs. I think that central piece is an interesting part of, you know, the innovation that we've done with Gator, and it would be a good candidate for open source. We've had feedback from others in the Wayfair engineering org that, you know, maintain open source projects of their own on public GitHub and would be interested in running Gator against those repos. And so we think that if we were to open source even a very, very minimalistic way to replicate part of what we're doing here, then individuals may run it against their own repos, against their pet projects, and enterprises may even adopt it. We have to be thoughtful about, like, the approach we take there because, again, I think there are a lot of technical challenges with ripping out this central piece. But I think it could be done, and I think that would be a really, really interesting thing for us to consider. So I hope to be able to give you an update in the future, but that may be something to look forward to.
[00:43:37] Unknown:
Alright. Well, for anybody who wants to get in touch with you, I'll have you add your preferred contact information to the show notes. And so with that, we'll move us into the picks. This week, I'm going to choose something called Nocciolata, which is a hazelnut spread. It's very similar to the Nutella product that more people will be familiar with. Just a very tasty treat: spread it on a piece of toast for a snack. I just had some of that before this show. So definitely always great to add to whatever you're eating. So with that, I'll pass it to you, Josh. What do you have for picks this week? Sure. So I love simulation video games. One game that I really like playing on my PC is called Cities: Skylines, so I do want to
[00:44:16] Unknown:
recommend that. More specifically, though, I've been watching a YouTuber called City Planner Plays. It's a guy who actually is a city planner by profession, but he also loves simulation games. He has a series called Verde Beach, and it's, like, I think, 75 parts by now. He's been doing it for the past 2 years, but I've been watching him iteratively build a city. He, of course, pays a ton of attention to detail
[00:44:41] Unknown:
and has an eye for all of the things that he considers in his job as a city planner. So it's really, really interesting to watch, and I recommend that. That's a fun thing to follow along with. Yeah, it's definitely interesting to think about how accurately the simulation responds to the real-world city planning considerations that he's adding to it. For sure. Alright. Well, thank you very much for taking the time today to join me and share the work that you're doing at Wayfair to help your application teams build and deliver faster. It's definitely always interesting to think about the platform aspects of being able to help developers do their jobs. So I appreciate all of the time and energy you put into that and the open source projects that you've released out of that effort, and I hope you enjoy the rest of your day. Cool. Well, thanks for chatting with me, Tobias. Really appreciate it.
Thank you for listening. Don't forget to check out our other show, the Data Engineering Podcast at dataengineeringpodcast.com for the latest on modern data management. And visit the site at pythonpodcast.com to subscribe to the show, sign up for the mailing list, and read the show notes. And if you've learned something or tried out a project from the show, then tell us about it. Email host@podcastinit.com with your story. To help other people find the show, please leave a review on iTunes and tell your friends and coworkers.
Introduction and Sponsor Messages
Interview with Joshua Woodward Begins
Joshua's Introduction to Python
Role and Context of the Application Life Cycle Team at Wayfair
Evolution and Integration of the Application Life Cycle Team
Identifying Areas of Greatest Leverage
Role of Python in Wayfair's Engineering
Simplifying Application Deployment
Educating Teams on Template Creation
Ongoing Maintenance of Templates
Automated Maintenance and Dependency Management
Automated Pull Requests and Code Changes
Automating Pull Requests Across Repositories
Open Sourcing Internal Tools
Interesting Uses of Built Tools
Lessons Learned in Reducing Friction
Future Plans for the Application Life Cycle Team
Potential Open Sourcing of Gator
Closing Remarks and Picks