Simplify And Scale Your Software Development Cycles By Putting On Pants (Build Tool)

Hello, and welcome to Podcast.__init__, the podcast about Python and the people who make it great. Building data integration workflows is time consuming and tedious,

requiring an unpleasant amount of boilerplate code to do it right.

Rivery is a managed platform for building your ELT pipelines that offers the industry's first native integration with Python,

allowing you to seamlessly load and export Pandas data frames to and from all of your databases,

services, and data warehouses with a few clicks and no extra code.

Rivery is hosting a live demo of their first-class Python support on February 22nd. And when you use the promo code Python during registration, you will be entered to win a brand new Series 7 Apple Watch.

Go to pythonpodcast.com/rivery

today to learn more and register.

When you're ready to launch your next app or want to try a project you hear about on the show, you'll need somewhere to deploy it. So take a look at our friends over at Linode.

With the launch of their managed Kubernetes platform, it's easy to get started with the next generation of deployment and scaling powered by the battle tested Linode platform, including simple pricing, node balancers,

40 gigabit networking,

dedicated CPU and GPU instances, and worldwide data centers.

Go to pythonpodcast.com/linode, that's L-I-N-O-D-E, today and get a $100 credit to try out a Kubernetes cluster of your own. And don't forget to thank them for their continued support of this show.

Your host as usual is Tobias Macey. And today, I'm interviewing Eric Arellano, Stu Hood, and Andreas Stenius about the Pants build tool and all of the work that has gone into it over the past year and a half. So, Eric, can you start by introducing yourself?

Yeah. Absolutely. I'm Eric. I use they/them pronouns.

I have been using Python for about 6 years and have been a maintainer of Pants for a little over 3 years. I first started using the project as an intern at Foursquare and ended up changing my internship project about halfway through the summer to lead Pants' Python 3 migration. I fell in love with the community and have been contributing to it since, for a little bit while working at Twitter and now at a startup called Toolchain.

And, Stu, how about yourself?

Sure. I was introduced to Python via the Pants project after spending a lot of time on the JVM. I've been working on Pants for something like 8 or 9 years now. We'll get more into the history, but not too deep. It's a lot of history. And I think Python has been just a huge boon for the project, so I'm really happy to talk more about how we use it.

And, Andreas, how about yourself?

Yeah. My name is Andreas Stenius. I joined the Pants build community about a year ago and fell in love quite straightaway with the design, the community, the spirit, and everything there is. So I've been working closely with the team and became a maintainer last summer. I've been working on the Docker integration and the Docker backend in Pants.

Eric and Stu, you already mentioned a bit about how you got introduced to Python. So, Andreas, do you remember how you first got introduced to Python?

I was introduced to Python as a DevOps developer at the company where I'm currently employed. That was 7 years ago, thereabouts, so I've been doing mainly Python development for the past 7 years.

And so in terms of the Pants project itself, Stu, as you mentioned, it has a relatively long history. And for folks that wanna dig into the details of that, I'll point them back at the previous interview that you and Eric were on about a year and a half ago. But for folks who haven't listened to that yet, can you give the CliffsNotes version of what Pants is and some of the story behind how it came to be and how you got to where you are today?

Yeah. For sure. I think one of the things about Pants's history is that many build tools, and I'll explain a little bit more about what I mean there, start as single-project build tools and then evolve in the other direction. They evolve to support monorepos or to support larger projects that are made up of more units of code. Pants started in the opposite direction. It started from attempting to support monorepos as well as it possibly could and then scaling down as much as we possibly can.

So by a build system, I mean, we are a tool for executing all of the steps between writing your code and taking it to production, which involves running your tests with as much parallelism as possible, running linters, formatters, and type checkers, doing codegen for you such that you don't have to commit it to your repository, making your scripts as formal and checkable as possible, running REPLs, and doing packaging and publishing. So there are a lot of steps that you would otherwise have to script inconsistently and then maintain scripts for. All of those steps, we strive to pipeline, run in parallel, and run accurately and correctly.

And so we integrate with all the tools you're used to, like pip and Pytest and Black and MyPy and Twine and Docker. But then we attempt to provide the most consistent and easy-to-use interface that we can atop those. In many cases, those tools are shaped similarly. So you have maybe 8 or 9 linters that people will use commonly with Python. And almost no one knows the right arguments to invoke all of them, so they're going to put them in a script. We remove the need for that script, and we additionally prevent somebody having to write a script that's going to figure out how to invoke linters in parallel on only the changed files.

So we're focused on improving the developer experience regardless of repo size. But as I said, our history has been in monorepos, and that's where we still shine the most.
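To make the "script you would otherwise write" concrete, here is a minimal sketch of a hand-rolled helper that runs several linters in parallel on only the changed Python files. It is not how Pants is implemented; the `LINTERS` table, the function names, and the use of `py_compile` as a stand-in linter are all assumptions made for illustration.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor

# Hypothetical linter table: name -> argv prefix. A real project would list
# its own tools here; py_compile is only a stand-in that ships with Python.
LINTERS = {
    "compile-check": [sys.executable, "-m", "py_compile"],
}


def python_files(changed):
    """Keep only the changed paths that look like Python sources."""
    return sorted(path for path in changed if path.endswith(".py"))


def run_linter(name, argv, files):
    """Run one linter on the given files and report its exit code."""
    proc = subprocess.run([*argv, *files], capture_output=True, text=True)
    return name, proc.returncode


def lint_changed(changed, linters=LINTERS):
    """Run every linter concurrently, but only on changed Python files."""
    files = python_files(changed)
    if not files:
        return {}
    with ThreadPoolExecutor() as pool:
        futures = [
            pool.submit(run_linter, name, argv, files)
            for name, argv in linters.items()
        ]
        return dict(future.result() for future in futures)
```

Even this small version has to decide which files count, how to fan out the work, and how to report results, which is the kind of per-repository glue the speakers describe wanting to remove.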

And to your point about the scope of concerns that are encapsulated by build tools, I'm wondering if you can talk through the range of considerations that Pants is trying to address and some of the pieces that you have explicitly opted not to include in the Pants experience and instead defer to other systems or toolchains?

I can say that our bias has been toward inclusion into the core of Pants for a few reasons. And one of them is that our plugin API isn't stable yet. And so when people write plugins for Pants in the core, we gain the benefit of experience: a plugin that does something vastly different from what we could have expected, and that lives in the core, shows us how easy the plugin API is to use. We can notice the rough edges and help. And it gives anyone who contributes to the project the benefit of us maintaining that code.

So as to the scope, we would like to be surprised. We would like somebody to come along and say, hey, we wanna add a plugin for some tool that I've never heard of. That can absolutely be in scope if there are non-private companies that wanna use it, or even if there are enough private companies that wanna use it. As long as it's not completely custom internal code, we would be interested in accepting it. And I think Andreas, in particular, has a lot of recent experience with understanding a use case better than we do and then contributing something where we say, hey, that's in scope, we just didn't know. So thank you for bringing that with this Docker work.

Yeah. We also take the approach that if you as a user experience a particular need, or you're confused, for example, by a certain part of our docs, it's very likely that other people also have the same problem, and we simply don't know that they do. So we always appreciate it whenever people come on our Slack, for example, and share that they're confused by a part of our docs or that they have a certain feature request, because we assume that they're speaking for dozens of other users who might not even have tried Pants yet. But because of that feedback, they'll be able to have a better experience.

And to the point of having a consistent interface for all of the different stages of the software development life cycle, there are a number of other tools that have aimed to take a similar approach, with maybe the most venerable being Make, and some more recent entries being something like the Earthly project. I'm wondering if you can share your thoughts on the considerations that go into building this common interface and its design, and maybe some of the ways that Pants compares to tools like Make or Earthly or some of the other environments that try to give this consistent experience?

I think one thing that is definitely different from Make, and I don't have experience with Earthly, but would be interested, is that we are attempting to attach more semantic meaning to groups of things that you might wanna do. So Pants has separate goals, capital-G Goals, for various tasks that you might wanna execute. And we're trying to bundle all of the things that have the same semantic meaning into one command line invocation. So if you wanna test, that's one goal. If you wanna lint, that's a separate goal. If you wanna check, that's a third goal. And lint and check are an interesting case because we've been differentiating between lightweight linters that are mostly about style and heavier-weight checks like type checking. Right? So MyPy and Pyright, and even, ironically, Pylint are fairly heavy checks that use the transitive dependencies, all of your code, essentially. Anytime they check anything, they take a lot longer to run. And so we consider them to have a different semantic meaning than just linters.

A better example, though, is probably packaging and publishing. They have a clear semantic meaning. If you were going to implement packaging and publishing in Make, for example, you'd be doing it from scratch, and you'd be figuring out what the semantics of publishing and packaging should be. Whereas when we model packaging and publishing, we're thinking through what that usually looks like for a user and making sure that the arguments are consistent and the behavior is consistent and performant.

Another major differentiator for Pants is that it has a really fine-grained understanding of your project's dependencies. So it understands, for example, that this file depends on that one, which depends on this other one. And with that information, you can do things like only run tests that have been impacted. So if you change a common library, we can figure out that 10 of your 100 tests have been affected, and we don't need to rerun the others. That all happens automatically.
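The idea of selecting only the impacted tests can be sketched as a walk over a file-level dependency graph. This is an illustration of the general technique, not the Pants engine; the `DEPENDENCIES` table, the file names, and the `tests/` prefix convention are all hypothetical.

```python
from collections import defaultdict, deque

# Hypothetical file-level dependency graph: file -> files it imports.
DEPENDENCIES = {
    "app/util.py": [],
    "app/models.py": ["app/util.py"],
    "app/api.py": ["app/models.py"],
    "tests/test_util.py": ["app/util.py"],
    "tests/test_models.py": ["app/models.py"],
    "tests/test_api.py": ["app/api.py"],
    "tests/test_docs.py": [],
}


def invert(dependencies):
    """Turn 'file -> what it depends on' into 'file -> what depends on it'."""
    dependents = defaultdict(set)
    for source, deps in dependencies.items():
        for dep in deps:
            dependents[dep].add(source)
    return dependents


def affected_tests(changed, dependencies):
    """Breadth-first walk from the changed files through their dependents."""
    dependents = invert(dependencies)
    seen = set(changed)
    queue = deque(changed)
    while queue:
        current = queue.popleft()
        for dependent in dependents.get(current, ()):
            if dependent not in seen:
                seen.add(dependent)
                queue.append(dependent)
    # Assumption for this sketch: test files live under tests/.
    return sorted(path for path in seen if path.startswith("tests/"))


print(affected_tests(["app/models.py"], DEPENDENCIES))
# prints ['tests/test_api.py', 'tests/test_models.py']
```

In this toy graph, editing `app/models.py` selects two of the four test files; `tests/test_util.py` and `tests/test_docs.py` are skipped because nothing they depend on changed.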

Normally, to have that really fine-grained information with older build systems like Bazel, you have to have a lot of boilerplate where you essentially duplicate your imports in these things called BUILD files, where you explicitly teach the tool that this depends on that. Instead, with Pants, it's a major goal of the project that it be really easy to adopt and a joy to use. So we have a thing called dependency inference, where we'll read your import statements for you and then map those imports back to the rest of your project. So you get those benefits of fine-grained metadata and fine-grained dependencies without having a bunch of boilerplate that you need to maintain.
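As a rough illustration of reading import statements and mapping them back to project files, the sketch below uses Python's standard `ast` module. It is a simplified stand-in for the idea, not the inference Pants actually performs; the function names and the `module_to_file` mapping are hypothetical, and relative imports and `from package import module` forms are deliberately ignored.

```python
import ast


def imported_modules(source):
    """Collect the absolute module names that a source string imports."""
    modules = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            modules.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            modules.add(node.module)
    return modules


def infer_dependencies(source, module_to_file):
    """Map imports to first-party files; unknown modules are skipped."""
    deps = set()
    for module in imported_modules(source):
        # Try the full dotted name first, then progressively shorter prefixes.
        parts = module.split(".")
        while parts:
            candidate = ".".join(parts)
            if candidate in module_to_file:
                deps.add(module_to_file[candidate])
                break
            parts.pop()
    return sorted(deps)


# Hypothetical first-party modules in the repository.
MODULE_TO_FILE = {
    "app.models": "app/models.py",
    "app.util": "app/util.py",
}

SOURCE = "import os\nfrom app.models import User\nimport app.util\n"
print(infer_dependencies(SOURCE, MODULE_TO_FILE))
# prints ['app/models.py', 'app/util.py']
```

The standard-library import (`os`) drops out because it does not map to any file in the repository, which is the behavior you would want before feeding the result into a dependency graph.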

Another aspect of having this common interface to all the different broad tasks that you might do in the software development life cycle that is interesting to think about as a user of Pants is how much ownership of all of that you want to push fully into Pants versus how much you might want to also execute via these other more dedicated tools. The thing that comes to mind most notably for myself is pre-commit, where I have a number of pre-commit checks that I might want to run using their default, out-of-the-box supported plugins. So Flake8, MyPy, Black, all of those can run as pre-commit checks. I can also have Pants run all of those things. And so I can get some measure of consistency by ensuring that they're deferring to, like, setup.cfg or pyproject.toml for how those different executables should run. But there's also the question of, do I want pre-commit to run all these checks independently, or do I want pre-commit to call into Pants, or do I want Pants to call into pre-commit? So it's about figuring out the different contexts in which you want to execute each tool.

Absolutely. And I think that we're similar to pre-commit in that they are choosing a particular semantic task, which is everything that blocks committing. But by choosing that as their only goal, they are essentially limiting the scope of what you can do with pre-commit. You're probably not going to run your entire test suite with pre-commit because that's not gonna scale with your project. You don't wanna wait for that. Right? So that one semantic scope, things that are fast enough to run just before I commit, is just one of Pants's multiple goals. We provide a consistent interface across the other tasks that you might do beyond the things that are fast enough to run before commit. So if pre-commit is adding value, using pre-commit to call us would get you some consistency in that you can then use Pants directly or use Pants in CI.

I think having the whole suite of goals is kind of an interesting thing because in common usage, you're gonna iterate on code by running your tests or linting or type checking. And you may not want to run all of those. You're gonna cherry-pick which of the things you know are relevant at a particular point in time. If you know that you haven't made enough of a change, say you've only changed a comment, you're not gonna bother running all of the tests. Maybe you're just gonna run formatting because the doc formatter might complain about that otherwise. The developer is usually picking and choosing how much they need to do at any given point in time, and we're just providing a consistent interface for that.

Yeah. And that being said, another major thing we've really focused on in the past year is allowing you to incrementally adopt Pants. We know a lot of organizations already have really big repos and might have workflows and tools that are working well for them. And we want it to be easy to incrementally add Pants: you might start by only using it for your formatters and linters, for example, or only using it for tests. So there have been a lot of features where we can complement the workflow that you already have. For example, one of our users is still sticking with their current test runner while they migrate, but they're using Pants's dependency information that I was talking about earlier to grab the metadata about which tests to run. So they run Pants for the query and then pipe it into their original test workflow.

At work, we have used pre-commit to invoke Pants to get consistent output. There shouldn't be a difference whether I run the pre-commit hook or run Pants directly; it should say the same thing about the current state of my code. If pre-commit ran the same tool, but potentially with some different configuration, that would possibly not be the case. And it has worked out really well, I think.

And in terms of the Pants project itself, as I mentioned at the beginning, we did an episode about the Pants project about a year and a half ago. And since that time, it seems like there's been a very rapid uptick in the pace of development, the size of the community, and the number of capabilities that have been built into it. I'm wondering if you can give some of the notable changes in the project and its ecosystem and community that have taken place over that time.

I would say that the largest change in the last year has been that we've spent a lot of time going more polyglot, adding some more languages.

In particular, we've added support for Go, Scala, and Java, and, if you count our Docker integration, which Andreas did a fantastic job with, Dockerfiles, which are a language in and of themselves, I suppose. We've learned a huge amount and improved the plugin API to support those languages better. And users have been satisfied that they've been able to keep this consistent interface across multiple languages within their repository, which is one of the promises that we are attempting to fulfill.

We also have spent more time improving the lock file story. I think in our last conversation, one of the gray areas was the question of having incompatible projects within a monorepo, those with overlapping requirements where one library requires one version and another library requires another. Having those overlapping requirements without having a global lock file is interesting, because the way single-project, polyrepo tools like Poetry and tox and Pipfile.lock or pip freeze would do it is they'd have a lock file per project or unit of code. And we think that we've found a really great design that threads the needle between one lock file for your entire repository and one lock file per unit of code, with potentially inconsistent, incompatible dependencies between all of your units of code.

And we're hoping to ship that in the next few weeks. We can definitely talk more about that.

Yeah. Beyond code changes, you're right that the community has grown a lot in the past year. We've had a lot of new users and organizations joining our Slack every day. One of the things I've been most excited about is honing in on what being a part of our community means. Originally, we used to primarily think of contributions in terms of code. And this past year, we restructured everything so that we no longer call our maintainers committers; we now call them maintainers. And we recognize that contributions take a ton of different forms, including docs. But one of the big things we've focused on is how useful it is to get feedback about where things are confusing and about feature requests. Even if you never write a piece of code or change docs, simply letting us know how you're using Pants is extremely helpful to our project so that we can focus on making the tool more useful for everyday users.

Going back to the question of the scope of the project and figuring out what you want to build, obviously, if somebody comes to you from the community with a contribution, there's not as much scoping to do there. But there is also the question of, does this belong in core or as a plugin? And I'm wondering if you can talk through some of the conversations that you have as the core maintainer team and some of the ways that you engage with the community to figure out the overall scope of what belongs in core versus what belongs as a plugin, and how to prioritize the baseline capabilities that the Pants tool should provide out of the box and the interfaces that it can expose to give end users the control to customize it to suit their needs?

I think one useful recent example was that the Autoflake plugin starts to cross the line between an auto-formatter and a sort of fixer, where it does potentially change the semantic meaning of your code by deleting import statements, and it's incredibly useful. That was a contribution from the community, and I think we have to continue to bias toward inclusion in all cases because contributions like that force us to think about where our semantic boundaries between these goals are wrong.

I think the hit rate when somebody is considering whether something should go upstream or not is pretty high. And there's survivorship bias, of course, because there are probably a lot of things that people keep private and don't tell us about. But when people have even an inkling that something might be useful to the wider project, our answer is yes, that is useful. And something like 95 or 98% of the use cases that people have had so far fit into the buckets that we have designed, the goals. And I think that's been reassuring. We continue to like to be surprised. If somebody really wants to lean in on making deployment something that we should be orchestrating with Pants, that's something we could definitely discuss. It's not something we have a goal for yet. Right now, we will go as far as packaging and publishing, but we're not necessarily gonna orchestrate the restarting of your cluster, for example. People might have custom goals for that. It's something that we'd be willing to discuss and include. But it's just always a learning experience to have potential contributions. And so that's where we're at. Bring your ideas, and we're gonna bias toward bringing them on board.

To your point of making it easier to incrementally adopt Pants and easier to get started, I know that the tailor capability hadn't quite made it into the core or was just very recently added to the core the last time we spoke. And I know that that has been going through some evolution, and I've used it myself a few times. So I'm wondering if you can talk to some of the ergonomic improvements that you've added to make it easier to manage adoption and getting started, and to reduce the level of effort that's required to try out Pants and understand if it's right for your organization?

Yeah. We have a really strong belief in the project that the tool should adapt to you rather than you having to adapt to the tool. So we set up Pants intentionally so that it can handle multiple different code structures. Like we were talking about, it's possible and hopefully easy to integrate Pants incrementally in addition to your current workflow. So we made a couple of changes to make the onboarding process even faster. One of them, the one you're talking about with tailor, is that we have these BUILD files, which are usually only 1 to 2 lines, that give us metadata about your code. So you can do things like setting a timeout on certain tests. Now we'll scan your repository and generate those 1 to 2 line files for you. And along with that, we also infer your dependencies like we talked about.
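For readers who have not seen one, a BUILD file of the kind described here might look roughly like the sketch below. BUILD files use Python syntax; the directory layout, the target name, and the timeout value are assumptions chosen for illustration, and the exact target types available depend on the Pants version in use.

```python
# BUILD (hypothetical example, placed next to a directory of Python code)

# Covers the non-test Python files in this directory.
python_sources()

# Covers the test files. The timeout (in seconds) is the kind of metadata
# that cannot be inferred from the code, so it is declared explicitly.
python_tests(
    name="tests",
    timeout=120,
)
```

The point made in the conversation is that files like this stay short because dependencies are inferred from imports rather than listed by hand.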

Another really important part of ergonomics that we think about is the difference between power users and everyday users. Pants is something that usually every engineer at your organization will use. And we hear a lot from power users, who are the people on our Slack or who are opening GitHub issues. But we try to think a lot about that everyday user who might not be as active, but is still using Pants, and put a lot of focus on optimizing our experience for them, which includes things like a really intense focus on error messages. Honestly, we assume that most everyday users don't very thoroughly read our docs. And rather than expecting users to change and adapt to us, we try to adapt to the user. So within the past year, we've audited a lot of our error messages and rewritten and improved them so that even if you didn't read our docs, you can intuit what's going on and figure it out.

In terms of the challenges that teams run into as they're starting to work through the adoption of Pants, and maybe they're starting to move into a monorepo structure for their code for the first time, what are some of the complexities that they run into as they're starting to figure out how to architect the repository layout, how to architect the workflow of their Pants configurations, figuring out

what are the appropriate places to add the

Python case, what are the appropriate places to add the Python distribution configurations

versus just letting it be a Python source target, the types of custom plugins that they might want to build to simplify their workflows, to say, you know, this is my versioning scheme, so that across the board, whenever I run Pants, it will generate the right version for setup.py. Just any of those kinds of considerations as they're starting to scale adoption?

I think at a fundamental level, monorepos are about a few things, but the primary thing that they're about is a desire for consistency and scalability. And consistency really matters, not just across the projects that you have, but maybe also across multiple languages if you do have multiple languages in a repository. And so the challenges of adopting Pants and adopting a monorepo depend a lot on whether you're converting from a polyrepo, having lots of projects in different repositories, or converting from a monorepo using a different tool to using Pants. The challenges you encounter in those two cases are pretty different. The thing when converting from polyrepo to monorepo is you are probably already inconsistent unless you've done a huge amount of work to reuse the boilerplate. For example, if you've used a template generator, you've generated the code in one place, but then you've committed it. And so it can diverge because people are gonna edit all that boilerplate, and they're gonna end up with inconsistent requirements and scripts and all this. So going from polyrepo to monorepo, it really depends how quickly you're trying to get the consistency.

And so while adopting Pants, part of the thinking behind our resolve strategy that I mentioned earlier, the idea of not necessarily having a single global resolve for your repository and not necessarily having a resolve per project, is that it's definitely a spectrum between a long-time monorepo, with incredible consistency across all the projects and a single version of almost everything, versus just onboarding to that experience. And it's not necessarily the case that a monorepo that is 100% consistent and doesn't allow the use of multiple versions, which some tools make difficult, is better. You know? It's not always the case that stricter is better. It's a spectrum, and having the flexibility to have inconsistency is an important thing. So it depends on which end of the spectrum you're migrating from. For people migrating from a monorepo with a different tool, we might be making things more flexible and easier. And going from a polyrepo, we might be applying the consistency that you've been lacking by having a bunch of copy-pasted code in a bunch of repos.

So, on this resolves strategy, we're very optimistic that in the next few weeks, or about a month, we'll have more to share. The other thing is, anytime somebody comes to us, they have huge test suites. And I think our goal is that Pants makes CI a one-liner regardless of your repository size. We have a long way to go to achieve that, but CI being a one-liner regardless of repository size requires a lot of things. It requires caching. If your repository is huge, it might require remote execution, which we also support. If you want to run a variety of different goals and maybe even package and publish, you'd like to be able to include all of those on one line. And if you include all of those in your invocation of Pants, you would want it all to run concurrently.

So we think there's an opportunity to remove a huge amount of the boilerplate of CI config, and the lock-in that you have at various CI providers with huge amounts of YAML, and probably YAML generators and YAML generator generators, to essentially make the CI experience very, very similar to what you run on the command line, where you just run it for a smaller scope. Right? I'm testing just this one file as opposed to the entire repository, but the command is similar. And the scalability means you don't need an entire separate framework for CI versus local. So for teams onboarding to monorepos, that's the promise that we'd like to achieve. And the challenge is always that the larger the project, the more work it is to achieve that goal. So we'll continue working on that, and we're happy to help teams onboard to this monorepo experience.

And to that point of consistency and onboarding, and particularly as you're expanding into supporting multiple different language runtimes, one of the complexities that comes about there is being able to manage the execution environment, and a lot of developer teams these days are leaning on Docker for that. And I know that, Andreas, you recently added support for Docker natively into the Pants build toolchain. I'm wondering if you can talk to how that manifests and what that workflow looks like for people who are using Pants and want to be able to use Docker to manage the actual execution context without having to do a bunch of setup on developer machines or having to replicate that in their CI ecosystem.

One of the pain points I was looking forward to solving when I discovered Pants was to be able to deprecate and get away from the custom tools we have built around how to build and manage our Docker images. We have this custom Docker build tool that we call Welder. It's basically our version of a multistage Dockerfile from before Docker had support for multistage builds. What it does is set up all the different arguments to run Docker, building the images in sequence in various pipelines, and tagging and pushing images left and right. Maintaining those build pipelines using that tool was becoming increasingly difficult.

Enter Pants. I fell in love with the engine and the rule system and thought that, hey, it shouldn't be too difficult to implement our Docker infrastructure needs in Pants. And reading more about it, I noticed that there was demand from others who had asked for Docker support in Pants. So last summer I raised my hand and said I would be interested in implementing Docker support for Pants, and I got the go-ahead for that. The experience was a real delight.

I implemented it incrementally, starting with basic support for just invoking a simple Docker build command and integrating with tailor to generate the BUILD file necessary for adding the Docker image target. What Pants needs to know in order for you to use Docker is just where the Dockerfile is and whatever dependencies you want to have included in your build context for Docker. If you use a PEX file, which is a Python executable that you can package with Pants, it can even infer the dependency on that from your Dockerfile.
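A BUILD file for the setup described here might look roughly like the following sketch. The target names, the `main.py` entry point, and the assumption that a `Dockerfile` sits in the same directory are all hypothetical, and the fields available on each target depend on the Pants version.

```python
# BUILD (hypothetical layout: main.py and a Dockerfile live in this directory)

# Packages the application as a PEX, a self-contained Python executable.
pex_binary(
    name="app",
    entry_point="main.py",
)

# Declares a Docker image built from the Dockerfile in this directory.
docker_image(
    name="image",
)
```

In this sketch no dependency from the image to the PEX is written out; the conversation above describes that link being inferred when the Dockerfile refers to the packaged PEX file.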

From that, we have built on top to manage publishing images to registries. Also, you can chain your Docker images. If you have a common base image, you build that first and then go on to build your other images that depend on that base image.

And thanks to the infrastructure that we already have, all of that works with change detection. So if a file has been edited, the Docker images that depend on that file, perhaps a whole chain of them, will show up in change detection. You can determine which Docker images need to be republished.
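The combination of chained images and change detection can be sketched with a topological sort: build base images first, and mark an image stale if one of its files changed or its base image is stale. This illustrates the idea only and is not the Pants implementation; the `IMAGES` table and its file names are hypothetical, and `graphlib` requires Python 3.9 or later.

```python
from graphlib import TopologicalSorter

# Hypothetical images: name -> (base image or None, files copied into it).
IMAGES = {
    "base": (None, ["requirements.txt"]),
    "api": ("base", ["app/api.py"]),
    "worker": ("base", ["app/worker.py"]),
    "docs": (None, ["docs/index.md"]),
}


def images_to_rebuild(changed_files, images):
    """Return the stale images, ordered so that base images come first."""
    changed = set(changed_files)
    # TopologicalSorter expects node -> set of predecessors (here: the base).
    graph = {
        name: ({base} if base else set())
        for name, (base, _files) in images.items()
    }
    stale = set()
    order = []
    for name in TopologicalSorter(graph).static_order():
        base, files = images[name]
        # Stale if its own inputs changed or it was built on a stale base.
        if changed.intersection(files) or base in stale:
            stale.add(name)
            order.append(name)
    return order


print(images_to_rebuild(["app/api.py"], IMAGES))
# prints ['api']
```

Changing `requirements.txt` in this toy setup would mark `base`, `api`, and `worker` for rebuilding, with `base` first, while `docs` is left alone.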

And I think from an architectural changes perspective,

this gets to my point about CI and making that essentially a 1 liner. I think

to achieve that, you would need to actually

rebuild the relevant images.

You might, if you had any sort of native code going into those Docker images,

want to execute the compilation

of the relevant wheels

either inside the Docker image or outside of them, but in parallel. Right? For as many Docker images as you have, you might want that to execute on a remote machine if it can.

And so

the Docker support is fantastic. 1 of the frontiers that we'd like to continue to explore with it is continuing to remove the steps sort of before and after

pants in CI, which might include, well, okay. I've got a custom wheel that I need to build in order to put it in the Docker container, or I'm going to, you know, invoke Docker to do something before the build. If we continue to expand our support for essentially

cross building

in Docker,

your execution platform might be macOS. It might be Linux,

but you essentially cross build into Docker

from your local platform. That's sort of,

you know, a local developer running on Windows

or macOS

transparently using Docker only when they need to for the cross building portion

is something that we'd like to continue to push in the next year. There's improvements planned in that area.

Right now, you know, cross platform Docker builds are possible as long as there's no native code. So

we'd like to improve that. In terms of the support for multiple language runtimes,

that's also an interesting challenge beyond just managing the execution context as far as

this goal of consistency

and

ease of adoption

for

end users of the Pants build tool. And given that pants itself is written largely in Python with a Rust core for the execution engine,

I'm wondering how you've approached the

sort of design of the experience for these additional language runtimes so that it feels idiomatic

and approachable for people who are maybe not Python native or maybe don't even use Python in the repository at all?

Yeah. It's a great question.

So we support now Go, Java,

Scala. We've supported Python for the past 10 years. And 1 of the really interesting ones is shell support.

And with each of those languages,

we spend a lot of time first thinking about what are the unique strengths of pants in this ecosystem and what does this ecosystem already do well.

We very much view the role of pants as complementary.

So for example, with Shell,

pants hooks up with the amazing ShellCheck linter

and the shfmt formatter, which is kind of like Black. It will make your scripts pretty automatically.

And then a

unit test framework called shunit2.

And

we decided to focus on those 3 things with shell

rather than trying to hook up with things like running or packaging

because Shell already has

really good simple support for things like executing your script. And we didn't think that pants would add that much value to it, whereas we could add a lot of value in that we will install

those tools like ShellCheck for you,

making sure that everyone, whether you're in CI or you're different developers, that they're all using the exact same version,

then run it with this consistent interface.

We'll run that all in parallel so that you can run shell check at the same time as flake8 and black and isort and so on.
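As a rough picture of running several tools at once behind a single interface, here is a small sketch. It is not Pants code: the function names are made up, and the commands are stand-ins that call the Python interpreter so the example runs anywhere; a real setup would invoke pinned versions of tools such as ShellCheck, shfmt, flake8, or Black.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List


def run_tool(command: List[str]) -> int:
    """Run one tool as a subprocess and return its exit code."""
    return subprocess.run(command, capture_output=True).returncode


def run_all(commands: Dict[str, List[str]]) -> Dict[str, int]:
    """Run every named command concurrently and collect exit codes by name."""
    with ThreadPoolExecutor(max_workers=len(commands) or 1) as pool:
        futures = {name: pool.submit(run_tool, cmd) for name, cmd in commands.items()}
        return {name: future.result() for name, future in futures.items()}


# Stand-in commands (assumptions for illustration): one exits 0, one exits 1.
STAND_INS = {
    "passing-linter": [sys.executable, "-c", "raise SystemExit(0)"],
    "failing-linter": [sys.executable, "-c", "raise SystemExit(1)"],
}

if __name__ == "__main__":
    for name, code in run_all(STAND_INS).items():
        print(name, "ok" if code == 0 else "failed")
```

Because every tool is reduced to a name and an exit code, the caller sees the same interface whether the tool checks shell scripts or Python files.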

Same with go. Go already has really strong tooling. So we actually leverage a lot of the underlying Go tooling

and make it better with things like this consistent interface

and benefits like caching.

Yeah. And I would also say that the check goal that I referenced earlier is an interesting example of making a consistent experience across multiple language ecosystems.

Mypy had actually been in a goal called type check until recently. And we deprecated that goal and renamed it to check

because there is a shared semantic meaning

across

all of Python,

Go, and JVM languages like Java and Scala,

which is that you want to do as much as possible to ensure that your code is correct.

And

that might be type checking, but it might just be compilation.

And so for Go,

Java, and Scala, it is compilation. The check goal runs compilation.

And it does so in a similar amount of time, sort of in relative terms to mypy, you know, type checking your code. Mypy is definitely a little bit faster. But needing your transitive dependencies

in this goal

is sort of a common thread.

So the check goal is an example of, you know, finding the shared semantic meaning across languages.

It's definitely not linting. It's doing something more heavyweight. You definitely wanna run it before you submit your code. You may not necessarily wanna iterate on it for absolutely every edit. Maybe you do.

So I think that's an example of this consistency that we're trying to apply. To that point of consistency

across these different language environments,

how would you characterize the overall,

I don't know if you'd call it feature coverage or

coverage of specific targets or goals that are supported across these environments and

any kind of foundational changes that were necessary

in Pants itself to be able to support

adding these different runtime environments?

Go was interesting in that users have very different expectations of what they are going to build. It's directory centric, which isn't really super common.

Python's very file centric, and Java and Scala are as well.

So as we added these languages, we had some surprises. But at the same time, I think the JVM languages were pleasantly

straightforward to add.

So I think the foundation that we have has proved itself to be really useful. At the same time, we have definitely noticed now with half a dozen languages that there's some boilerplate for plug in authors

that we'd like to remove. So we will definitely be doing a little bit more work as Go

and Java and Scala

are stabilized themselves

to remove that boilerplate internally

so that, you know, as the next dozen languages are added,

there's less for rule authors to either trip on or have to sort of mindlessly copy. I would say that the other thing that we've definitely

seen as we've added these other languages is the dependency inference has been a success in all of them. It's useful

sort of regardless of how structured

your import statements are or aren't. You know, Java and Scala have gained sort of significant benefit in that you can compile at a very fine grained level.

In Go, it's just an expectation that you don't have to, you know, write a bunch of metadata

about your build in order for things to compile

because the Go tooling uses your import statements sort of the same way dependency inference does. Likewise, you know, the Python ecosystem people don't want, you know, to repeat themselves,

and we are trying to avoid that. So I think dependency inference scaling to all these languages has been really important as well. I think as we continue to expand language support,

I think it'll be interesting to lean in further on the assumption

that there is dependency inference

and see what we can do to further lower

either the boilerplate for users when they're, you know, creating a repository or for plug in authors.

You know, what would it look like for dependency inference not to be optional? How much can we remove

in that case?

It's on by default, to be clear.
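For readers who have not seen dependency inference, the sketch below shows the basic idea for Python sources: read the import statements and map them to the files that own those modules. It is a simplified illustration, not the Pants implementation. The `MODULE_OWNERS` table and file paths are assumptions, and relative imports are deliberately skipped.

```python
import ast
from typing import Dict, Set

# Hypothetical mapping from importable module name to the file that owns it.
MODULE_OWNERS: Dict[str, str] = {
    "project.util": "src/project/util.py",
    "project.models": "src/project/models.py",
}


def imported_modules(source: str) -> Set[str]:
    """Collect the module names mentioned in absolute import statements."""
    modules: Set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            modules.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            modules.add(node.module)
    return modules


def infer_dependencies(source: str, owners: Dict[str, str]) -> Set[str]:
    """Map imports to owning files; imports with no known owner are ignored."""
    return {owners[m] for m in imported_modules(source) if m in owners}


EXAMPLE = """\
import os
from project.util import slugify
import project.models
"""

print(sorted(infer_dependencies(EXAMPLE, MODULE_OWNERS)))
```

The standard library import `os` has no owner in the table, so it is dropped, which is why no hand-written dependency metadata is needed for the two first-party modules.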

And as far as the

growth of the community and overall adoption of the project, I'm curious what your

strategy is for being able

to scale the community and scale the

interaction and engagement patterns that have helped you go from small scale to where you are now and how you're thinking about the continued growth of the ecosystem

now that you seem to be kind of at a tipping point where you're adding these additional

language environments. You're trying to expand beyond your initial base of enthusiastic and power users into people who are finding it and maybe just want to

have something that runs and they don't necessarily care about being as enthusiastically

engaged or just some of that overall community growth aspect of the project?

My first interaction

with the community when I approached pants

was

to

ask about

a feature that I

thought was missing.

And the response I got really quickly from Eric was that,

sure, why don't you put up a PR for it?

And so

I was rather delighted

in

the welcoming

spirit.

The welcoming spirit was really encouraging

for me to contribute more

and to

get to know pants better and learn more about it.

And

I think I see that in

other members of the community too

that come in, and

they're enthusiastic about what they see, and they have all these ideas.

How those ideas are

welcomed,

how they are received,

that encourages people to stay and continue to invest and go deeper

into the community and become contributors or maintainers

in the long run. Yeah. I think as a maintainer, it's always a question how much time do we spend

directly interacting with the community and, for example, mentoring possible new contributors.

1 thing that really helps frame our perspective here is the idea of the curse of knowledge,

which in some Buddhist circles is called beginner's mind.

That curse of knowledge is once you learn something, it's really hard to go back to where you didn't know that thing before. So once you're a Pants power user,

it's hard to remember what it was first like when you were using it and take that perspective no matter how much you try.

And like we were talking about earlier, we think a lot about power users versus those everyday users or people who are just using this to get something done. We want it to be a great experience for everyday users.

So we very intentionally

seek that feedback

from beginners and think that beginners often have a

perspective

that makes the entire project a lot better. We're always eager to

actively

support people who are trying out pants for the first time, who wanna do a new contribution. We often pair program with them, for example.

And beyond helping them to have a good experience, it's also helping the project that we get to see things from their perspective.

Yeah. And it's also the case that a huge number of people who use pants will never contribute to it. We love any sort of contribution, and I guess that depends on the definition of the word contribute.

A lot of people's contribution

might also just be answering questions. So in terms of scaling a project,

building a community that's welcoming

and that sort of echoes and

people pass on the assistance that they received,

it goes a long way.

You know, identifying

which people

to encourage

contributions from or patches,

you always ask, but you're also willing to dive in and do it yourself.

So we love all contributions, but if you can't, you know, we're still we're always willing to help. Another aspect of the

community growth and community engagement

and the ease of adoption is having

useful

examples that you can point to of this is how you use pants, or here's a list of plug ins that you can use in your project that are generally available.

And I'm curious

what your thinking is as far as how to

encourage

that level of contribution and adoption so that you don't necessarily retread the same

ground with everybody. You can say, you know, if you're coming from this language ecosystem, here is kind of the reference implementation of how you can get up and running. Here are a set of useful plug ins and things like that. And also just

because of the fact that a lot of the plug in development happens inside the monorepo,

ways to think about

architecting that experience

so that it is more conducive to contributing those plug ins either back upstream to pants or into

a, you know, a repository or even just like an awesome list on GitHub or something like that?

Yeah. I can say that we love when people contribute examples

because it's a demonstration

of

how

much boilerplate

we still have.

It also is sort of a way for us to learn how people learned about the project. Right? If they didn't discover some feature and we can help improve

their example,

that's a lesson that we can take back to improve our documentation

so that when they're getting started, they don't they don't need to, you know, bend over backwards.

And I think the other thing about example code

is that it's always a good

thing to look at

in terms of how much boilerplate you have. Right? If our examples all consisted of a single line, like I was promising with CI,

then we would know that we had no more boilerplate to remove. Right? What's the example? Well, I run this 1 line, and you run that 1 line. Oh, how can we reduce that further?

So the size of any given example, you wanna push that down. You wanna play a game of golf in terms of the total number of lines required to accomplish some goal. So we love seeing contributed examples.

Also, to be clear, our goal is absolutely to stabilize the plug in API

and not require that people write things in core. I really hope that that is going to be the point

when we blossom in terms of lists

of third party plug ins existing. Right? Right now, you know, we absolutely love a third party plug in. It's going to be easier

in many cases for both the contributor

and any consumers of the plug in if it lives in core because we can essentially maintain it for you. We can continue to expand the API.

3rd party plug ins are absolutely something that we want to, you know, further encourage.

We promise that we're gonna make that easier in the future.

As you have been

growing and scaling the community and the project itself and growing the number of use cases that it supports, what are some of the most interesting or innovative or unexpected ways that you've seen it applied?

I tend to

push the boundaries of what you're meant to do with a piece of software or technology or anything, really.

1 of the first things my mind started doing when I discovered pants was,

where else can I use this?

So I have a kind

of proof of concept

project

where I try to

leverage

the rule engine of pants for

any kind of Python application.

So instead of writing kind of plugins

and build files and running the regular

pants

command line to

get into the engine,

you can instead use the pants engine as a library

that you load from your Python code

and set up the engine and then just start writing rules right off the bat. It was surprisingly

easy to get working, actually. So

there's an underpants

kind of library that does that.

No pants, the name that keeps on giving.

I was really appreciative recently. We had somebody contribute a PyOxidizer plug in, and it was mind bending because when I

first considered integrating

PyOxidizer

because PyOxidizer is written in Rust and Pants is as well,

I

initially

went down the path of attempting to integrate

directly with the Rust code, but a contributor came along and essentially dropped it in atop our Python distribution support.

So PyOxidizer is going to be supported relatively soon in an experimental

fashion. Probably, the 2.10 release will have some experimental support.

And I think that was innovative and unexpected in how simple it ended up being when I ignored the fact that there was Rust involved in both of these projects, and they can integrate across the distribution boundary, the Python distribution boundary.

So I think that's both innovative and exciting as an alternative to Docker for folks who know their deployment environment really, really well.

And in your experience

of being contributors and maintainers to the pants project and I'm sure consumers of it as well, what are some of the most interesting or unexpected or challenging lessons that you've learned in the process?

For me, it's been how awesome of a combination

Rust and Python is when used together.

For any Spanish speakers, I actually gave a Python talk last year about when to use Rust native extensions.

So about 70% of pants is written in Python 3, and about 30%

our core engine is written in Rust,

which is the engine that

schedules everything, like how to sequence which task and handles things like caching.

Rust is pretty new for me. I've only learned it in the past year.

There's this amazing library called PyO3

that makes it surprisingly

easy to

integrate Rust and Python, so that we have both Rust code calling code we wrote in Python and the other way around.

Yeah. And I would say that something that's not necessarily

new

and probably a classic issue is just that async code

where you're no longer

using operating system

thread stacks, whether it be in Python or on the JVM

or in Rust,

is

kind of a pain to observe and to apply metrics to and to get stack traces from.

Luckily, we have sort of a growing body of infrastructure

to make all of that possible,

but it does, in some cases, end up coming down to creating your own tools,

for,

you know, observability and performance work, which is definitely a focus for us. Go, for example, you know, it can't use the operating system level stack traces. You need custom tooling for that. So

async observability: async has all the benefits, but a few downsides as well.

And so for folks who are interested in being able to

have consistency

of their experience across the software development life cycle? What are the cases where Pants is the wrong choice and they're better off using either

discrete tooling or some homegrown solution or something built into their CI framework, etcetera?

Sure. So the most obvious answer is that there's still some languages

and ecosystems that we don't have first class support yet for. A big 1 is JavaScript. There are some ways to

get JavaScript working with Pants, but in general, we find that most users for now are still managing JavaScript,

using tools

like Yarn and NPM

and integrating with Python. That's 1 of the biggest things for this next year. We are just wrapping up a community survey, and it's something we hear from our community that we'd love to add proper first class support for.

So even if Pants doesn't yet support your language,

the plug in API means that we can add support and the community is really responsive. But that's an obvious case where you might need to

use multiple different workflows and tools.

And the amount of boilerplate

required to get a project going is still a thing that can mean that pants is not absolutely perfect for your tiny, tiny, tiny project. Right? The bar on whether pants becomes useful,

continues to move, and hopefully, we continue to move it in the direction of,

sure, as soon as you hit 200 lines, you know, pants is worth adding to your project. Right? We'd like that bar to be ever lower.

And I think we've done a reasonably good job, but there's always more to do. You've mentioned a number of things you have planned for the near to medium term future of the pants project. And so rather than digging more into that, I guess, I'd be interested to explore some of the

sustainability of the project and how you're able to

spend so much of your time and focus on continuing to scale it and grow it and make it available for end users?

Well, I can say that the pants community has ever

more open source maintainers

from more diverse backgrounds. That's a great thing. We're super happy to have Andreas. We have another maintainer that came on board recently from another organization.

And that's always helpful. Like, you want your open source project to have a really diverse community, and that is our goal. At the same time, we also have corporate backing, and that is a useful thing. Right? People need support in their project

when they have, you know, either enterprise use cases or other things.

So I think that we are continuing to strive for a good balance of open source governance

and corporate support when you need it. From my perspective, it is either we develop the tooling

ourselves in house

with a custom

tool chain and everything

and with all the maintenance that comes with it,

or

we can get involved with an open source 1 where we have a whole community that will pitch in and help develop it.

We can develop the features that we think make sense for us

and get the benefit of all the additional features we didn't even think of from everyone else.

Together, we maintain it and bring it forward. So it's a win win situation to be part of.

Well, for anybody who wants to get in touch and follow along with each of you, I'll have you add your preferred contact information to the show notes. And so with that, I'll move us into the picks.

And this week, I'm going to choose a new show I started watching recently called The Last Kingdom on Netflix. It's just

a very engaging and well written and well executed historical

drama fiction

about the kind of early middle ages or dark ages in England and focused on the invasion of the Danes and their conquest of England and surrounding regions. And it's a really well done show, so definitely recommend that for folks who are looking for something to watch. And so with that, I'll pass it to you, Eric. What do you have for a pick this week? Sure. My pick is

a new show on Netflix from Jonathan Van Ness, who is 1 of the hosts on Queer Eye. And Jonathan has a new show called Getting Curious.

1 of the episodes that came out last week is a 30 minute segment on gender non binary people,

which like I imagine most listeners grew up not really realizing that non binary people exist and how much gender

controls things like how we dress and how we talk and what sports you can play and so on. So I thought it was a really engaging and informative

30 minute episode

that can possibly help you better get to know your coworkers

or family members or even open source maintainers.

And, Stu, how about yourself?

I've really enjoyed the Checks and Balance podcast, which is sort of The Economist's American podcast.

They really introduce history in a useful way, and so it helps to put, you know, the story of the day in context.

So every episode, you know, is a good lesson, not just about the present, but also the past. And, Andreas, how about you? What's your pick for this week? My pick would be the

book by Andy and Dave, The Pragmatic Programmer.

I read it many years ago when I was

starting out as a software developer. It has

influenced me

deeply

from then on.

So I can highly recommend it.

They published the 20th anniversary

edition a few years ago. It's a classic. And not only has it supported my programming career, it's also literally supporting my laptop right now by raising it a few inches off the table. So

great book.

I have it memorized, so I don't need to crack it very often.

Another

related 1 that's really good is The Effective Engineer.

Alright. Well, thank you all very much for taking the time today to join me and share the work that you've been doing on pants. It's a tool that I've been enjoying using and has helped a lot with some of the projects that I'm building at work. So thank you all for the time and energy you put into that, and I hope you enjoy the rest of your day. Thank you, Tobias. It's always a pleasure. Thank you.

Thank you for listening. Don't forget to check out our other show, the Data Engineering podcast at dataengineeringpodcast.com

for the latest on modern data management.

And visit the site at pythonpodcast.com to subscribe to the show, sign up for the mailing list, and read the show notes. And if you've learned something or tried out a project from the show, then tell us about it. Email host at podcastinit.com

with your story.

To help other people find the show, please leave a review on iTunes and tell your friends and coworkers.
