Summary
One of the draws of Python is how dynamic and flexible the language can be. Sometimes, that flexibility can be problematic if the format of variables at various parts of your program is unclear or the descriptions are inaccurate. The growing middle ground is to use type annotations as a way of providing some verification of the format of data as it flows through your application and enforcing gradual typing. To make it simpler to get started with type hinting, Carl Meyer and Matt Page, along with other engineers at Instagram, created MonkeyType to analyze your code as it runs and generate the type annotations. In this episode they explain how that process works, how it has helped them reduce bugs in their code, and how you can start using it today.
Preface
- Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
- I would like to thank everyone who supports us on Patreon. Your contributions help to make the show sustainable.
- When you’re ready to launch your next project you’ll need somewhere to deploy it. Check out Linode at podcastinit.com/linode and get a $20 credit to try out their fast and reliable Linux virtual servers for running your awesome app. And now you can deliver your work to your users even faster with the newly upgraded 200 GBit network in all of their datacenters.
- If you’re tired of cobbling together your deployment pipeline then it’s time to try out GoCD, the open source continuous delivery platform built by the people at ThoughtWorks who wrote the book about it. With GoCD you get complete visibility into the life-cycle of your software from one location. To download it now go to podcastinit.com/gocd. Professional support and enterprise plugins are available for added peace of mind.
- Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. You can reach me on Twitter at @Podcast__init__ or email hosts@podcastinit.com
- To help other people find the show please leave a review on iTunes, or Google Play Music, tell your friends and co-workers, and share it on social media.
- A few announcements before we start the show:
- There’s still time to get your tickets for PyCon Colombia, happening February 9th and 10th. Go to pycon.co to learn more and register.
- There is also still time to register for the O’Reilly Software Architecture Conference in New York Feb 25-28. Use the link podcastinit.com/sacon-new-york to register and save 20%
- If you work with data or want to learn more about how the projects you have heard about on the show get used in the real world then join me at the Open Data Science Conference in Boston from May 1st through the 4th. It has become one of the largest events for data scientists, data engineers, and data driven businesses to get together and learn how to be more effective. To save 60% off your tickets go to podcastinit.com/odsc-east-2018 and register.
- Your host as usual is Tobias Macey and today I’m interviewing Carl Meyer and Matt Page about MonkeyType, a system to collect type information at runtime for your Python 3 code
Interview
- Introductions
- How did you get introduced to Python?
- What is MonkeyType and how did the project get started?
- How much overhead does the MonkeyType tracing add to the running system, and what techniques have you used to minimize the impact on production systems?
- Given that the type information is collected from call traces at runtime, and some functions may accept multiple different types for the same arguments (e.g. `add`), do you have any logic that will allow for combining that information into a higher-order type that gets set as the annotation?
- How does MonkeyType function internally and how has the implementation evolved over the time that you have been working on it?
- Once the type annotations are present in your code base, what other tooling are you using to take advantage of that information?
- It seems as though using MonkeyType to trace your running production systems could be a way to inadvertently identify dead sections of code that aren’t being executed. Have you investigated ways to use the collected type information to perform that analysis?
- What have been some of the most challenging aspects of building, using, and maintaining MonkeyType?
- What have been some of the most interesting or noteworthy things that you have learned in the process of working on and with MonkeyType?
- What have you found to be the most useful and most problematic aspects of the typing capabilities provided in recent versions of Python?
- For someone who wants to start using MonkeyType today, what is involved in getting it set up and using it in a new or existing codebase?
- What features or improvements do you have planned for future releases of MonkeyType?
Keep In Touch
- Carl
- Matt
- @void_star on Twitter
Picks
- Tobias
- Carl
- Matt
Links
- MonkeyType
- Dive Into Python
- Python 3 Typing Module
- MyPy
- Mike Krieger
- PyAnnotate
- Type Annotations
- Type Stubs
- PEP 523 Frame Evaluation API
- Scuba
- Haskell
- Rust
- PEP 563 Postponed Evaluation of Annotations
- Gary Bernhardt – Ideology
- coverage.py
The intro and outro music is from Requiem for a Fish by The Freak Fandango Orchestra / CC BY-SA
Hello, and welcome to Podcast.__init__, the podcast about Python and the people who make it great. I would like to thank everyone who supports the show on Patreon. Your contributions help to make the show sustainable. When you're ready to launch your next project, you'll need somewhere to deploy it, so you should check out Linode at podcastinit.com/linode and get a $20 credit to try out their fast and reliable Linux virtual servers for running your app. And now you can deliver your work to your users even faster with the newly upgraded 200 gigabit network in all of their data centers. If you're tired of cobbling together your deployment pipeline, then it's time to try out GoCD, the open source continuous delivery platform built by the people at ThoughtWorks who wrote the book about it. With GoCD, you get complete visibility into the life cycle of your software from one location. To download it now, go to podcastinit.com/gocd. Professional support and enterprise plugins are available for added peace of mind. You can visit the site at podcastinit.com to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions, I would love to hear them. You can reach me on Twitter at @Podcast__init__ or email me at host@podcastinit.com.
To help other people find the show, please leave a review on iTunes or Google Play Music, tell your friends and coworkers, and share it on social media. We've got a few announcements before we start the show. First, there's still time to get your tickets for PyCon Colombia, happening February 9th and 10th. Go to pycon.co to learn more and register. There is also still time to register for the O'Reilly Software Architecture Conference happening in New York, February 25th to 28th. Use the link podcastinit.com/sacon-new-york to register and save 20%.
If you work with data or want to learn more about how the projects you heard about on the show get used in the real world, then join me at the Open Data Science Conference happening in Boston from May 1st through the 4th. It has become one of the largest events for data scientists, data engineers, and data driven businesses to get together and learn how to be more effective. To save 60% off your tickets, go to podcastinit.com/odsc-east-2018 and register. Your host as usual is Tobias Macey, and today I'm interviewing Carl Meyer and Matt Page about MonkeyType, a system to collect type information at runtime for your Python 3 code.
[00:02:25] Unknown:
So Carl, could you start by introducing yourself? Sure. Yeah. My name is Carl Meyer. I've been a Python developer since around the turn of the millennium and have been working at Instagram for about a year and a half. And, Matt, how about yourself? Yeah. So
[00:02:40] Unknown:
I've been writing Python, I think, for a little bit less time than Carl. I started in college in, like, 2005. And I've been working at Instagram for about a year now.
[00:02:50] Unknown:
And, Carl, again, do you remember how you first got introduced to Python?
[00:02:53] Unknown:
Yeah. I was actually working on a content management system written in PHP, and I was bored and poking around on the Internet looking for more interesting things to do than the work I was supposed to be doing. And I found the classic Dive Into Python book by Mark Pilgrim, worked through all the exercises there, read through the book, and kinda fell in love with the language. That was, yeah, back right around the year 2000 or 2001, somewhere in there. And then I did a bunch of side projects in Python a few years after that, and eventually started a consulting company building Django based web applications.
[00:03:30] Unknown:
And, Matt, how about yourself?
[00:03:32] Unknown:
Trying to think. I think I started programming well, so I used to be, like, a huge security nerd back in high school and then in college. And I started using Python mostly to build tooling for, like, injecting packets on the network and parsing packets and stuff like that. And then that kind of morphed into writing a bunch of Python at my job before Instagram, just on, like, random control plane stuff for a large distributed system.
[00:03:59] Unknown:
Alright. And so as I mentioned at the open, we're talking about the work you guys have done on monkeytype at Instagram. So I'm wondering if you can talk a bit about what the project is and does and how you got started with it.
[00:04:13] Unknown:
So, yeah, MonkeyType, like you mentioned in the open, traces your function calls in your Python code. So when your code is running at runtime, it traces the calls that it sees and records the type of every argument and the type of the return value, and then stores those in a database. And then it can query that database and generate type stubs, or even apply type annotations to your code, using the new Python 3 syntax for type annotations. And then, of course, you can use a type checker like mypy to do type analysis on your code and reveal type errors. It also has a documentation benefit, letting you see the expected argument types and return types of the functions in your code base.
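To make that workflow concrete, here is a hypothetical before/after; the function name and types are invented for illustration, not taken from Instagram's code:

```python
# Before tracing: an unannotated function.
def format_user_id(user_id, prefix):
    return prefix + str(user_id)

# After tracing calls like format_user_id(42, "user-") and applying the
# generated stub, the signature carries annotations reflecting the observed
# argument and return types:
def format_user_id(user_id: int, prefix: str) -> str:
    return prefix + str(user_id)
```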
[00:04:55] Unknown:
And what was the motivation for starting the project?
[00:04:58] Unknown:
So I think this kinda started out with Mike Krieger, who's our CTO, diving into this, or just the project of adding types to Instagram. And Instagram is an enormous code base; it's over a million lines of Python. And as he was going along, he's kind of like, oh, man, this is really painful to do manually. Wouldn't it be awesome, you know, if we had some way to do this kind of automatically by just observing types at runtime? And he sort of wrote, I think it was, like, a decorator to do this, and then posted something to an internal workplace group, and then people kind of piled onto it and were like, oh, hey, we think we can actually do this in a much more scalable way. And that was sort of how the project got started.
[00:05:43] Unknown:
And at about the same time that you guys released Monkeytype as an open source project, PyAnnotate came out of Dropbox. I'm wondering if you can do a bit of compare and contrast as to the capabilities or the sort of target audience for Monkeytype and how they operate compared to each other.
[00:06:00] Unknown:
So we actually talked with Dropbox at PyCon last spring. So we knew that they knew that we were working on this, and we knew that they were working on it, so we were aware that there was parallel work going on. But the main difference is that Instagram is all Python 3, and Dropbox is Python 2. So PyAnnotate is Python 2 centric in the sense that it generates type comments as opposed to Python 3 style type annotations, whereas MonkeyType is totally Python 3 centric. It requires Python 3.6 and generates Python 3 type annotations.
[00:06:39] Unknown:
There's also just some stuff that MonkeyType has that, I think, makes it a little nicer for kind of incrementally doing this, because it lets you build up a database over time, but it also produces stub files, which I don't think PyAnnotate does. So you can kind of look at annotations in a nice textual representation before you actually apply them. And then we also have a bunch of other stuff, like type rewriters, to take what can be sort of unwieldy, automatically generated types and automatically massage them into something that's a little bit nicer to read and closer to what a human would actually produce. Yeah. I think the incremental point is really key, and we should make sure that's clear. Like, my understanding is that with PyAnnotate, you pretty much have to do it
[00:07:21] Unknown:
all in one shot. Like, you run some code, trace the types, and then generate the stubs right then and there. Whereas with MonkeyType, it's an ongoing, incremental thing. There's a backing store system where you can plug in some database, it uses SQLite by default, and over time you can keep adding more and more call traces into that store and then generate stubs based on that data anytime.
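For illustration, here is the same hypothetical signature in the two output styles being contrasted here:

```python
# PyAnnotate targets Python 2, so it emits PEP 484 type comments:
def scale(value, factor):
    # type: (float, float) -> float
    return value * factor

# MonkeyType targets Python 3.6+ and applies inline annotations instead:
def scale_annotated(value: float, factor: float) -> float:
    return value * factor
```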
[00:07:47] Unknown:
And was there anything particular to the 3.6 release that you decided to only support from that forward? Or was it just because that was the version of Python that you were running against and you didn't want to, put in the extra effort to verify that it would function against 3.5?
[00:08:04] Unknown:
Very much just that that's what we were using. That's something that we could do now that it's open source, so it's actually probably a great thing to do, to figure out what we can do to support more Python versions. But, yeah, for us, we were really focused, you know, on just making this work for Instagram.
[00:08:19] Unknown:
Yeah. I think the Python 3 requirement is pretty baked in, but there's no reliance anywhere on, like, the variable annotation syntax that came in with 3.6. So I think if somebody wanted to add 3.5 support, or even earlier, it'd probably be fine.
[00:08:33] Unknown:
And as you mentioned, a lot of the generated annotations come from potentially running it against production system. So I'm wondering how much overhead the tracing capabilities add to the running environment and what techniques you've used to minimize the potential impact on those production systems.
[00:08:52] Unknown:
So I think anecdotally, we've seen kind of like a 2 to 3x slowdown. We do a few different things to try and minimize that as much as possible. So for us, because we do run this in production, we run it on a very small number of requests. And because it's incremental, it turns out that if you have a code base as large as ours, a lot of your code changes very slowly, so even if you're sampling at a pretty low rate, you actually get reasonably good coverage. We also do some other stuff: there's an option to sample function calls, so in addition to sampling requests, we can sample a fraction of function calls. And then one of the slowest parts of the whole tracing process is actually figuring out what function is being called, and so we cache that pretty aggressively.
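A minimal sketch of request-level sampling like what Matt describes, assuming MonkeyType's `monkeytype.trace()` context manager (mentioned later in the conversation); this is illustrative, not Instagram's actual middleware:

```python
import random

import monkeytype

SAMPLE_RATE = 0.01  # illustrative: trace roughly 1% of requests

def handle_request(request, respond):
    # `respond` stands in for whatever actually services the request.
    if random.random() < SAMPLE_RATE:
        with monkeytype.trace():
            return respond(request)
    return respond(request)
```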
[00:09:39] Unknown:
I was gonna say, if you really wanted to squeeze all the possible performance out of something like this, in Python 3.6 there is the new frame evaluation API. And there is a project, I think it's called Typical, that's trying to use the frame evaluation API to do this tracing instead of what we use, which is just sys.setprofile, the same thing that cProfile and all existing profilers use. And I think using the frame evaluation API would probably be quite a bit faster than using sys.setprofile, but it would also involve replacing a few lines of pretty clean Python with a bunch of lines of pretty hairy C code. So we haven't really been motivated to do that.
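A toy version of the sys.setprofile mechanism Carl mentions, showing how call and return events expose the concrete argument and return types; MonkeyType's real tracer does much more (function resolution, filtering, storage):

```python
import sys

def profiler(frame, event, arg):
    if event == "call":
        code = frame.f_code
        names = code.co_varnames[:code.co_argcount]
        arg_types = {name: type(frame.f_locals[name]).__name__ for name in names}
        print(f"call {code.co_name} {arg_types}")
    elif event == "return":
        print(f"return from {frame.f_code.co_name}: {type(arg).__name__}")

def add(a, b):
    return a + b

sys.setprofile(profiler)
add(1, 2)          # prints the traced argument and return types
sys.setprofile(None)
```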
[00:10:17] Unknown:
And one of the things that I noticed you called out in the documentation as well is the fact that, because of the dynamic nature of Python, the types of the arguments and even the return types of some functions aren't necessarily going to be uniform based on, you know, the inputs to a given request or the code paths that lead to a particular function. So I'm wondering if there's a way for the recorded type information to be sort of melded together when you're generating the annotations so that you can ensure that there's appropriate coverage of all the different types of inputs that are valid for a given function?
[00:10:59] Unknown:
We have the concept of something called a type rewriter. So, I guess, MonkeyType stores traces; for a given function call, it'll store the concrete types that it saw for arguments and return values. But then at query time, it'll grab all of the traces that it's seen, and so that kind of gives us a way to massage those things. What we have right now is some relatively simple ones, but you could imagine writing whatever you wanted in order to produce more human readable types, or I guess types closer to what you would write as a human being, that might be more general than the concrete types. So, I mean, specifically, what we do right now is, like, just say you have a function add
[00:11:41] Unknown:
that sometimes takes 2 integers and returns an integer, and sometimes takes 2 strings and returns a string. We'll see those different cases in the traced calls, presuming that whatever code you traced exercised those different variants. So we'll see one trace that has 2 integer arguments and an integer return, and another one that has 2 string arguments and a string return. And what MonkeyType does is pretty simple: it just takes, for argument a, everything it's ever seen and puts it into a big union. So we'll get a union of int and str.
Same thing for argument b, same thing for the return type. And then, like Matt mentioned, we do have these type rewriters that will take that union and do some simplifications on it. The default ones we have are pretty simple: if there's more than, I think, 5 things in the union, we'll just turn it into an Any, because at that point it's not a very readable annotation. And then we have some other rewriters, but the core of it's pretty simple. And it's worth pointing out that the annotations that MonkeyType generates are functional and usable, but often they could be improved by a human looking at them. For instance, in the add case, MonkeyType isn't gonna notice that 2 integer arguments always resulted in an integer return, and the same for strings; you're just gonna get union, union, union. But, ideally, that would be an overload, where you'd say, well, if the arguments are both integers, the return is an integer, and the same for strings. That's something that today MonkeyType isn't gonna do for you. You, the developer reviewing it, need to notice that there's an invariant there that MonkeyType didn't catch and maybe improve that annotation.
But MonkeyType gives you pretty good raw material to work from. And, in theory, it'd be possible at some point to add a little more smarts there. The current type rewriter system that we have only operates on one argument at a time, one type at a time, so it wouldn't be capable of generating things like overloads or generics that require kind of seeing the whole picture of the function, but it wouldn't be that hard for us to add something like that in. It's just a question of what the return on investment is, because it's also not that hard to look over the annotations and modify them yourself. Yeah.
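In stub form, the difference Carl describes for the hypothetical `add` function looks roughly like this; the first version is what simple union-ing of traces yields, and the second is the hand-refined overload that MonkeyType will not infer today:

```python
from typing import Union, overload

# Generated-style stub after tracing (int, int) -> int and (str, str) -> str:
def add(a: Union[int, str], b: Union[int, str]) -> Union[int, str]: ...

# Hand-refined alternative capturing the invariant that the argument and
# return types move together:
@overload
def add(a: int, b: int) -> int: ...
@overload
def add(a: str, b: str) -> str: ...
```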
[00:13:45] Unknown:
And when it's generating the types for the annotations, does it also support annotating functions or variables with custom classes that are found within your code base? So that if, for instance, you have a class object of type Car and that is one of the expected arguments for a function input, does it generate that class as the type for the annotation, or does it just say that it's, you know, of type object or something like that?
[00:14:11] Unknown:
No. I mean, it'll find the correct type, so in that case it'll be type Car. And it actually does the necessary work to generate the correct imports too. So another nice thing about MonkeyType, right, is that you can then look at the stub, and if it seems reasonable, then you can apply it. And if it applies successfully, the resulting Python code should actually be functional. So, like, it'll add an import for Car if you added a type of Car to your code.
[00:14:39] Unknown:
And so we've talked a little bit at a high level about how the system works and how you make use of it. So I'm wondering if you can dig a bit deeper and talk about the internals of the project and how it's implemented, and then also how that implementation has evolved over the time that you've been working on it. Sure. Yeah. So, I think Carl mentioned this earlier, but,
[00:15:02] Unknown:
MonkeyType uses Python's profiling hooks. That's kind of how it interposes on function calls and returns. So, basically, when it gets invoked, when a function is being called, it goes and looks at kind of like the breadcrumbs that it has, and it tries to figure out what the actual function being called is. And that actually turned out to be difficult. Figuring out the argument types, or capturing the concrete argument types, is pretty easy; they're just there, accessible via the stack frame. But actually figuring out what function gets called is pretty tricky. The only breadcrumb you get from the stack frame that is useful for this, for the most part, is the code object, and that just gives you the function name. And then you kind of have to go groveling around in the environment to figure out what function got called. And so a fair amount of work went into making that search process work for the different kinds of functions that can actually be called. So, like, C functions, and the thing that you think about when you think of functions, which are functions that are more or less declared at module scope and aren't annotated as, like, static methods or something. But we also need to handle instance methods, class methods, and static methods. And then we actually have internally a bunch of, you know, custom decorators for optimizing various sorts of things that we needed to support. So, yeah, a solid chunk of work went into making that work over time.
[00:16:21] Unknown:
Yeah. We spent a lot of time on that, I remember. We use Cython a fair bit at Instagram for speeding up hot spots in our code, and we have some function decorators that are actually implemented in Cython compiled code. And so I remember spending at least half a day or a day just working on making sure that was all still working in the open source version. So, hopefully, somebody out there gets some benefit from that. Yeah.
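A rough sketch of the "groveling around" Matt describes: given only a frame, the code object provides just a name, and finding the actual function object takes a search. This is illustrative, handles only the simple cases (module-level functions and plain instance methods), and is not MonkeyType's actual lookup logic:

```python
import inspect

def resolve_function(frame):
    """Best-effort lookup of the function object executing in `frame`."""
    code = frame.f_code
    # Module-level function: its name usually resolves in the frame's globals.
    candidate = frame.f_globals.get(code.co_name)
    if getattr(candidate, "__code__", None) is code:
        return candidate
    # Instance method: the first argument is conventionally `self`, so search
    # its class, using getattr_static to avoid running descriptors.
    names = code.co_varnames[:code.co_argcount]
    if names:
        receiver = frame.f_locals.get(names[0])
        if receiver is not None:
            attr = inspect.getattr_static(type(receiver), code.co_name, None)
            func = getattr(attr, "__func__", attr)  # unwrap classmethod/staticmethod
            if getattr(func, "__code__", None) is code:
                return func
    return None  # nested functions, decorated functions, etc. need more work
```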
[00:16:47] Unknown:
And when you first started the project, did you know from the outset that you were going to end up open sourcing the work you were doing with Monkeytype? Or is that a decision that came later after it was already functional? And if that's the case, what was the process like for cleaning it up and getting it ready to be released?
[00:17:05] Unknown:
I mean, I don't think we really knew from the outset we were gonna open source it. I don't know. Carl, do you remember, like, kind of when we started talking about doing that? I don't remember exactly. I think it was while we were working on it,
[00:17:17] Unknown:
because, I mean, it became clear pretty quickly that this was useful and that other people would find it useful too. Yeah. So that would have been, like, last spring sometime, a month or two before PyCon. Yeah. Yeah. So, I mean, we kinda had designs to open source this for actually
[00:17:30] Unknown:
a long time. And, yeah, I was kinda just slacking on actually getting it done, and then Carl was finally like, hey, let's do this, and, you know, set up some sort of work days where we just put our heads down and kind of grinded out the work that needed to get done in order to open source it. And I think that actually worked pretty well, and it wasn't too much work. The only parts that we really had to provide open source implementations for were, I think, the storage, because we use a custom storage engine that's backed by our analytical database called Scuba. But outside of that well, now, outside of that, we actually are running on
[00:18:10] Unknown:
open source MonkeyType. Yeah. I don't know. What else did we have to do? Or what else was, like, important stuff that we did to get it open sourced? Yeah. I mean, it depends what your expectations are, I guess. In some ways, I think people think you just take the code and open source it, but there's actually quite a lot that we had to do just in terms of thinking about use cases outside of ours. For instance, the entire CLI was something that we added, monkeytype run, because internally we just basically had MonkeyType in, like, a Django middleware, so it was running on web requests. But we hadn't really implemented anything for how you'd use it with a library or a test suite or anything like that. So, I mean, the core obviously didn't change too much, but there's a fair bit that we had to think about in terms of how are people gonna use this who aren't us.
[00:18:53] Unknown:
Yeah. And in terms of processes, is there already an existing set of steps and license requirements, etcetera, within Instagram for producing open source libraries and modules? I'm just curious what the logistical process looks like from that perspective, as far as what you need to do to open source code that you have developed internally.
[00:19:16] Unknown:
I mean, there is a pretty well defined process. I was actually pretty surprised at how well defined and easy it was. I don't know what your impression of it was, Carl, but I was pretty impressed at how quickly we were able to do it. Yeah. Totally. I mean, Instagram is part of Facebook, obviously, and Facebook has a lot of open source code, so it's something they've done before, and it was a pretty good process, and we really didn't have any trouble.
[00:19:40] Unknown:
And so you mentioned that once the type annotations are present in your code base, you leverage mypy as a way to statically analyze and ensure that the functions, and the way that they interact, are adhering to the contracts that are defined by those types. I'm just wondering if there is any other tooling that you use to take advantage of that information, either during the sort of build and release cycle or even at runtime?
[00:20:08] Unknown:
No other tooling at this point. I mean, I think one of the big advantages that developers get from the type annotations doesn't even require any tooling; it's just the documentation value. I mean, a lot of our teams, even before Python 3 or type annotations came along, were already trying to annotate argument and return value types in one way or another using some kind of docstring syntax, so that idea has been around for a while. But, of course, if you do it using comments or docstrings, they tend to get out of date, and then they're worse than not having them at all. So I think a lot of people are finding value from just having them there as documentation. But as far as doing anything with them in an automated way or via tooling, mypy is the only thing we're doing right now.
[00:20:51] Unknown:
And when I was reading through the blog post that you wrote about MonkeyType when you were releasing it, you identified the fact that you have already found and fixed a number of bugs that were existing in the code base that were just sort of hiding before you added that type information. So I'm wondering if there are any specific instances or sort of nasty bugs that were uncovered by adding these type annotations and then executing mypy against them.
[00:21:18] Unknown:
Boy, we've had so many. I don't know if I could count them, but I'm trying to think of any, like, particular instance I could describe. Do you have any that come to mind, Matt?
[00:21:26] Unknown:
I don't know about, like, particular instances. There are definitely, like, classes of errors that obviously we see less of now, or hopefully see less of now. So, like, attribute errors, obviously. And then, I don't know, the one that I like the most is, like, NoneType has no attribute blah blah blah.
[00:21:46] Unknown:
Yeah. Yeah. I would say probably we've caught more Optional bugs than anything else, just bugs where some variable can be None, but we have code that is assuming it isn't. That's probably been the most common single class. It also, sort of at a higher level, reveals some code organization patterns that I think had readability problems anyway, but mypy kind of calls them out as problematic. A common case of that is mixin classes that make a series of undocumented and kind of opaque assumptions about what attributes or methods will be present on whatever class they end up being mixed into. Right? So the mixin class itself isn't really standalone; it has to be mixed into a class that has certain attributes or methods, but that's not clearly documented anywhere. It's just kind of there, and it works currently. And mypy will call that out and be like, hey, you've got this class here, and it thinks that it has this attribute x, and I don't see that anywhere. What's going on? So there are ways in which static type checking asks you to write code a little more carefully, and my feeling from what we've seen so far is that that's almost entirely a good thing. I mean, I'm sure there are cases where some people would disagree, but
[00:22:53] Unknown:
Yeah. That specific example is kinda interesting because I think I generally agree that that's a pattern that a static type checker discourages, or at least forces you to think a little bit harder about and be stricter about how you write a mixin. But that's one thing where I think we had trouble. When we were doing the initial push of adding types and kind of bootstrapping the type annotations, when we encountered those, or at least when I encountered those, it was sort of hard, because you wanna get this initial blanketing of types, but unless you go and do the hard work of actually refactoring that code, you can't make mypy happy. So I don't know if there's an opportunity there to improve mypy. I would imagine that other users of Python also have similar patterns in their code, and we could improve mypy so that it could at least handle that case, as a mechanism to let you move forward
[00:23:44] Unknown:
even though, you know, later on you're gonna need to come back and fix it, because it is kind of a code smell. Yeah. Certainly, working with legacy code and adding types to it, you're gonna have to make use of the escape hatches that mypy does provide, which is, like, annotating things as type Any, or just using a type ignore comment, or casting. So we've been pretty free to use those where we need to. But there are certain things that are quite painful to deal with. Like, I mean, we had, I think, something like, I don't know, 10,000 type errors just from running mypy over the code base without adding any type annotations, which kinda goes counter to the theory of gradual typing, which is that you sort of pay as you go. Mypy does that to a great extent. I mean, it doesn't type check the bodies of unannotated functions, but it's not really complete in the sense that there are a number of classes of errors related to, like, class attributes or module variables or a variety of things where you will get errors even before you've annotated anything. So that's something where mypy could probably be improved a little bit, at least for annotating such a big legacy code base.
But we just kinda powered through those type errors and got it down to 0 and went from there.
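A small, hypothetical example of the mixin pattern Carl and Matt describe above: mypy flags the first version because nothing declares where `name` comes from, and declaring the assumed attribute is one common way to make the contract explicit:

```python
class GreetingMixin:
    def greet(self) -> str:
        # error reported by mypy: "GreetingMixin" has no attribute "name"
        return "Hello, " + self.name

class ExplicitGreetingMixin:
    name: str  # the attribute this mixin assumes its host class provides

    def greet(self) -> str:
        return "Hello, " + self.name
```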
[00:24:54] Unknown:
And for cases where you have added in those placeholder types to just allow mypy to run through to completion, does monkeytype then still allow you to generate more accurate annotations to override the, you know, for instance, any types that you've peppered through the code base?
[00:25:12] Unknown:
That's a great question, because it allows me to hype a feature that was just added to open source MonkeyType today, which we're looking forward to using. So, typically, MonkeyType will always privilege an existing type annotation in the code over what it finds in the traces. There are enough cases where you can improve MonkeyType's raw annotation that we figure, if they're different, we're gonna privilege what you've actually explicitly annotated in your code. But there are times when you kinda wanna check your annotations and see: does this annotation actually match what we're seeing? Could it be more specific? Is it just wrong? So we've wanted to do that for a while and haven't gotten around to it, but somebody just this week contributed a pull request to MonkeyType. So it now has a flag, ignore existing annotations, where it'll generate a stub based only on the traces that you've collected, ignoring the existing types in your code, which is useful for that use case of trying to improve your existing annotations.
[00:26:11] Unknown:
And it also seems as though, particularly when you're running MonkeyType in your production environment, that by virtue of executing these traces and storing the type information for the functions that it identifies, it's a way to inadvertently identify dead sections of code. Have you looked at using that information to perform the analysis necessary to find those unexecuted code paths and potentially prune them from your repo?
[00:26:43] Unknown:
You don't really need any of the kinda hard parts of MonkeyType to do that. You could use just, like, a regular profiler, which records which functions are called and how frequently they get called. I mean, we've built some profilers to do this, but we didn't use MonkeyType specifically for it. When you're trying to identify dead code, compared to MonkeyType, it's more important to get really good coverage, because you wanna avoid falsely identifying something as dead just because it's called infrequently. So it's valuable to get your sampling rate as
[00:27:11] Unknown:
high as you can when you're doing that kind of work. And so you don't really want the overhead of all the extra work monkey type is doing to collect types in that case. And what have been some of the most challenging and interesting aspects of building and using and maintaining monkey type?
[00:27:27] Unknown:
So, like I kind of said earlier, a lot of work has gone into basically making MonkeyType production ready. And when you're building something like this that operates in, like, the bowels of Python, you get exposed to all of the weird edge cases of different ways that functions can get called, and different kinds of functions that get called, when you run it on production code. So there are a couple of interesting examples. One of them is we have internally this concept of, like, a lazy property, and I think Django has it; it's called cached property. Basically, we annotate functions that are expensive to compute with these, and so the first time you access the property, it does the expensive computation and then it memoizes the result, but it memoizes it as a property. So the first time this thing gets called, MonkeyType will see it and it'll think, okay, I have this function call. But it turns out that, the way our implementation at least works, by doing the attribute lookup for the function, or sorry, by doing the function lookup for that lazy property, you end up creating a side effect. And this is just an example of a more general class of problem: the whole function search process needs to be side effect free. And in our case, the side effect we were creating was causing Django to crash, which is obviously not great. So I spent a bunch of time writing a version of getattr. Like, a lot of the function lookup code uses getattr pretty heavily, but that can obviously invoke Python's pretty dynamic attribute lookup process, which executes arbitrary code. So writing a version of getattr that's side effect free took a little bit of time. And then I discovered that, well, pro tip, the inspect module has getattr_static, which does this for you. So if you need to do this, don't waste any time.
Another example is, like I said, figuring out whether or not the return type that you get is actually valid. The profiling API just invokes you with a return event, but it doesn't tell you why the function is returning. It can be because the function threw an exception, it returned normally, or it's actually a generator and it's yielding. So to generate meaningful annotations, we actually need to be able to distinguish between these cases. And this one was kind of a puzzler for a little while, and then we realized that, oh, hey, we can actually look at the last instruction that the interpreter executed as a way of figuring out which of those cases it is. So if it's returning normally, you'll get, like, the RETURN_VALUE opcode. If it's yielding, you'll see a YIELD_VALUE. And if it's throwing an exception, you'll see, like, the raise opcode or something. But, yeah, those are just examples of all of the guts of Python you get exposed to. And I think working through those edge cases was, at least for me, one of the biggest challenges.
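A tiny reproduction of the hazard Matt describes, using functools.cached_property (Python 3.8+) as a stand-in for the internal lazy property: plain getattr() runs the expensive computation as a side effect, while inspect.getattr_static() retrieves the descriptor without executing anything:

```python
import inspect
from functools import cached_property  # stand-in for the internal lazy property

class Report:
    computations = 0

    @cached_property
    def summary(self):
        Report.computations += 1  # stands in for an expensive, side-effecting call
        return "expensive summary"

r = Report()
inspect.getattr_static(r, "summary")   # returns the descriptor; no side effect
assert Report.computations == 0
getattr(r, "summary")                  # triggers the computation
assert Report.computations == 1
```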
[00:30:14] Unknown:
And are there any other particularly interesting aspects of the project where you learned something that you weren't expecting to encounter?
[00:30:22] Unknown:
I mean, I think, like, the side-effecting attribute resolution process was something that, in hindsight, is obvious, but when I was writing the code, I was like, why is this happening? This doesn't make any sense. Outside of that, I don't know. I definitely have a much greater appreciation for all of the complexity that is involved when you type, like, foo.bar, and, like, descriptors and all of that. I don't know, Carl. Do you have anything else there? No. That about covers it, I think. Okay.
[00:30:55] Unknown:
And just the fact that Python has added this gradual typing capability and the built in typing module has caused a fair amount of discussion and debate, you know, on both sides of the argument. And so I'm wondering now that you've been working with actually using these type annotations and applying the static analysis of the typing, what you have found to be the most useful and the most problematic aspects of the capabilities that Python has for providing this type information, particularly if you have any experience with other more heavily typed languages such as Haskell or Rust or, you know, something that's more in the middle ground like Java?
[00:31:38] Unknown:
Yeah. I mean, so I think, first of all, overall, we've been quite happy with adding types to our code. I mean, just comparing a large untyped Python code base to a large, well, increasingly more typed Python code base, we have no regrets or desire to go back. And we've seen a lot of really strong, kind of organic uptake from our development teams, really without a lot of pushing from us. People have been picking up MonkeyType and running with it, and the proportion of our code that's typed has been increasing quickly. So we see that as a pretty strong signal that developers are getting value from this.
Yeah. So I guess as far as problematic aspects, I think one of the most painful aspects of the type annotations themselves is the fact that they're evaluated at runtime currently. That's caused some performance issues. Like, if you look into the details of the way generic classes work in the typing module, it's pretty hairy. So if you have a generic class like List or something, in your typing system you wanna know, okay, a list of what? Is it a list of strings? A list of integers? So the way you represent that in Python's typing system is you say List, square bracket, int, to say this is a list of integers.
It's fine for mypy to understand that, but because type annotations are evaluated at runtime also, that has to work in the Python interpreter itself. And the way that's implemented is that when you index into the List type, it dynamically creates a new subtype that represents a list of integers. So in our code base, we have enough code that we have some cases where this results in literally thousands of subclasses dynamically created, because we'll have so many different variants of some container type where it contains some different type. And that actually has resulted in significant performance issues in a few specific cases. It hasn't been a blocker; we've been able to find workarounds. But also, the fact that annotations are evaluated at runtime means you have to quote your type annotation anytime you're dealing with, like, a cyclic reference or a forward reference.
That's not terrible, but it's a little ugly. It kinda feels like a wart when you're using it. The good news on this one is that another Facebook developer, who's worked closely with us on the whole typing effort, has written PEP 563, and that's been accepted. And so in Python 3.7, annotations will no longer be evaluated at runtime by default. Or maybe there's a future import you need; I forget exactly how it works. I'd have to go back and look at that. But we'll be able to avoid having them evaluated at runtime, which just takes away that whole class of problems, which will be really nice. As far as some of the tooling, mypy is awesome. We've been really grateful to the team that works on that for all the work they've done, and, overall, it's worked very well.
There are definitely issues where we see a few false positives. Those usually aren't that hard to work around; you just use a type ignore comment or whatever and make a note of it, and later on it gets fixed, and then you can remove that comment. One of our biggest pain points is just that, with the amount of code we have, type checking it is pretty slow. But there is a lot of work in progress. I know the mypy team is working really hard now on some improved incremental modes and a daemon mode, where mypy will stay resident in memory and you can just keep asking it for updated type checks. So some of that work should make a big difference to the speed. Overall, I think it's really early days for type checked Python, and I think the tooling is gonna continue to improve rapidly. And even with the current state of things, we've been pretty happy with it. Do you have more to add there, Matt?
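The PEP 563 behavior Carl refers to is enabled in Python 3.7 via a __future__ import: annotations are stored as strings instead of being evaluated at import time, so forward references need no quoting and no typing subclasses are built at runtime, while they can still be resolved on demand:

```python
from __future__ import annotations

from typing import List, get_type_hints

class Node:
    def children(self) -> List[Node]:  # forward reference, no quotes needed
        return []

print(Node.children.__annotations__)  # {'return': 'List[Node]'} -- just a string
print(get_type_hints(Node.children))  # resolved on demand to typing.List[Node]
```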
[00:35:50] Unknown:
I mean, I'm probably just gonna, like, fanboy out on types in large software systems in general. Yeah. I mean, I think I agree with everything that Carl said. And I think it's definitely worth investing in improving the tooling to get it to the point where we're all really happy with it and types become kind of a normal part of the way that you write code. Because at my last company, we wrote a ton of Python, and right when I was leaving, we were rewriting a lot of stuff in Go. And this wasn't for, like, performance reasons or anything; it was because we had so much code, and we had pretty decent unit test coverage, but we were still pretty afraid of making changes in it.
Because we didn't have, like, confidence that we would make a change and we wouldn't break something. And I think that, like, types actually give you a lot more confidence in being able to make changes in large software systems. So they're definitely worthwhile.
[00:36:50] Unknown:
I don't know. That's my 2¢. Yeah. I definitely agree with that. And, I mean, you mentioned comparing to other languages. I like Haskell a lot, and Python is obviously never gonna be Haskell, and Python's type annotation system isn't getting it anywhere close. But I think it hits a pretty good sweet spot for still being Python: you have all of the dynamic nature of Python available to you, and the type system, the gradual typing and the availability of things like Any and casts, gives you all the escape hatches you need to get out of the strict typing and do something a little crazy if you really need to. But for the majority of your code, you can
[00:37:28] Unknown:
get the benefits of a little bit more confidence in the correctness of your code. Yeah. And, I mean, even though mypy and the type system provided are still relatively new, I think it's actually pretty expressive. And, like, coming from Go, which doesn't have generics, right, it's actually, I think, quite satisfactory. Like, I've been pretty happy with it. And have you found that the presence of types in the code base has caused you to change your approach to your software development cycle? Like, do you start thinking about the type signatures of the functions before you actually write the bodies, versus doing it, you know, from the inside out? Or have you found that you just leverage it as an additional piece of information and your overall development cycle has remained largely the same? I don't know. It's weird, because I've written both a lot of dynamically typed code and statically typed code, and so I think my development process tends to be a little bit more on the statically typed side. So it doesn't really change too much whether or not I have annotations available to me with Python. I don't know. Carl, what about you? Yeah. I mean, I guess my feeling on that is that typically, when I wrote Python before type annotations, I still had a pretty clear idea in mind of what types I was expecting. I just wasn't recording that anywhere or able to check it. So I guess I wouldn't really say it
[00:38:45] Unknown:
changes the way I think about writing code or the way I write code. It's more just, I mean, it's like another complementary way to rule out an entire class of bugs. Like, you know, if you imagine a plane of all the possible bugs, testing is a great way to verify correctness at a bunch of specific points in that plane, which is very useful because it's very granular and specific. And type checking is a way to just draw a line across that plane and rule out an entire category of possible bugs. So I think they're very complementary ways to gain more confidence in the correctness of your code.
[00:39:19] Unknown:
And for somebody who wants to start using MonkeyType, what is involved in getting it set up and using it in either a new or an existing code base? Yeah. So I'm gonna risk sounding like an advertisement here, but it's actually really very easy. So,
[00:39:34] Unknown:
if you pip install MonkeyType, the next step depends a little bit on what kind of code base you're tracing. Like, is it a library? Is it a web application? But there's a couple of different ways you can get started collecting call traces. You can use monkeytype run, which lets you run any Python script, similar to coverage run if you use that with the coverage.py module. So you can run any script, or your entire test suite, or whatever, under MonkeyType tracing. Or if you need more control, if you're, like, tracing requests in a web application, there's a context manager you can apply, and everything within the context manager will be traced using MonkeyType. And you can do that just literally with a line of code, or using monkeytype run, without any configuration. The stock configuration works out of the box and will store traces to a SQLite database locally. So you can get started with no configuration: just stick the trace context manager in, or use monkeytype run, and go. And then once you've collected a few traces, you can use monkeytype stub with whatever module, and you'll get a usable type stub out of that. And then you can use monkeytype apply to actually apply it to your code and get the annotations directly in your code. And then all of this, of course, once you do have a production deployment, is configurable. So you can decide where you wanna store your call traces. You can write your own custom back end to store them in any database you like. You can provide your own type rewriters.
You can even provide your own call logger to implement whatever crazy logic you want in terms of deciding which traces are interesting to you and worth storing. So, basically, MonkeyType solves the hard problem of getting useful type traces and then lets you configure everything else
[00:41:07] Unknown:
if you need to. I think just kind of the batteries included approach is pretty nice. Like, having monkeytype run just work out of the box definitely lowers the barrier to entry. And I think, too, one of the unintended benefits is the fact that you don't necessarily
[00:41:23] Unknown:
need to use it specifically on code that you're writing. If you're just trying to obtain additional type information about libraries or external dependencies that you're leveraging, you can use it to generate the stub modules and get the type information for those other projects that you don't necessarily have control of, so that you can then allow that type information to propagate through the code that you are maintaining.
[00:41:47] Unknown:
I think that is the first. I haven't I haven't thought about using it like that.
[00:41:51] Unknown:
That's actually great. That's awesome. Yeah. I think the limiting factor always is that you need some code to trace, right? And the quality of the stubs you get is gonna be dependent on how well the code you're tracing actually exercises the library. So if you're using it that way for a library, you can totally do that, provided you have a good test suite for the library, or some corpus of code exercising that library that actually uses the variety of types. Because, basically, what you get out is what you put in, in terms of the types you see.
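A minimal sketch of that point, assuming the default configuration records calls from your own modules into the local SQLite store: only the types actually exercised under tracing can appear in the generated stub.

```python
import monkeytype

def describe(value):
    return repr(value)

with monkeytype.trace():
    describe(42)        # observed: (int) -> str
    describe("hello")   # observed: (str) -> str
    # describe(3.14) is never called, so float cannot appear in the stub
    # later produced by the `monkeytype stub` command for this module.
```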
[00:42:20] Unknown:
Yeah. One of the things I sort of thought would be a fun exercise would be to actually go through GitHub, look at some of the popular Python 3.6 projects that are unannotated, and just try using MonkeyType on their unit test suites to generate some stubs for them, and then just submit a bunch of diffs being, like, adding types, adding types, adding types. I don't know how well it would work, but it'd be interesting.
[00:42:47] Unknown:
And are there any future improvements or additional features that you have planned for upcoming releases of monkey type?
[00:42:55] Unknown:
Yeah. We've got a list on GitHub of 15 or so bugs and improvements, everything from good first issues to tackle to more complex improvements. So anyone can come check that out and see if there's anything that interests you. Some specific ones that we could highlight: Matt talked earlier about descriptors and having to add support for various kinds of descriptors on classes. For instance, in open source MonkeyType, we recently added support for Django's cached property, but in order to do that, we had to hard code it into the core of MonkeyType, and it only applies to Django's cached property, even though there are several other libraries. I think Werkzeug has a cached property that works basically the same way, and I think there's even a dedicated library for that cached property implementation. So the support we added currently only works for Django's version of it, which is kind of a dumb limitation.
So one thing we'd like to do is have, like, a pluggable system for adding support for descriptor types, so you can do that without having to hack MonkeyType itself. Did you have anything else in mind as far as improvements, Matt, that you wanna highlight? I think, well, extending the type rewriter system
[00:44:01] Unknown:
is something that I think would be useful. I'm not sure how generic we can write things, or how generic we can make the types that we generate, but that's something that I think would go a long way towards, I guess, further reducing human interaction. The other thing that I kind of wonder about: one problem we've seen is that as we've started adding types into the code base, people have started using type aliases. And because of the way that MonkeyType is built, when it actually stores the traces, it's storing essentially the fully qualified name of the type. And since that's evaluated at runtime, if you have an alias, it'll actually just get back the raw type and not the alias, and what you actually want in the annotation is the alias.
So I'm not even sure if it's possible for us to solve this problem. But if we could extend the stub generation process to use some information that we glean from either static analysis or some usage of the AST, we might be able to fix that and generate better annotations.
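A small illustration (hypothetical names) of the alias problem Matt raises: at runtime an alias is just the object it refers to, so a trace can only ever record the underlying type, and a regenerated annotation loses the alias.

```python
from typing import Dict

UserId = int                # the alias a human would prefer to see
Profile = Dict[str, str]

def load_profile(user_id: UserId) -> Profile:
    return {"id": str(user_id)}

print(UserId is int)  # True -- the alias has no separate runtime identity, so a
                      # trace of load_profile records plain int / Dict[str, str]
```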
[00:45:00] Unknown:
Alright. Are there any other topics that you think we should talk about before we start to close out the show? I can't think of anything. I don't know. I feel like we covered a lot of stuff. I don't have anything else in mind. Alright. So for anybody who wants to get in touch with either of you and follow the work that you're up to, I'll have you each add your preferred contact information to the show notes. And with that, I'll move us into the picks. And so for my pick this week, I'm going to choose a new set of headphones that I picked up, because the ones that I had been using for recording were open back, so the audio from guests would start to cycle back through the microphone. And so I recently picked up a pair of closed back, sort of recording studio headphones, and they're the LyxPro HAS-30, which were pretty good pricing. They're only $60 on Amazon, whereas a lot of the other types of headphones for studio recording are generally upwards of the $100 range. So they're a really good feature set for the price, and I've been enjoying them so far. So I definitely recommend those for anybody who's looking for a new set of headphones that are decently priced and have good noise isolation. Nice. And so with that, I'll pass it to you, Carl. Do you have any picks this week? Well, you said we could venture outside of tech for our picks. So Absolutely. As far outside as you would like. Nice. Well, I've been,
[00:46:14] Unknown:
so I'm gonna go with my Netflix binge watching habits. I've been overdosing on British crime dramas recently and having a lot of fun with it. So I recently watched the entire way through Broadchurch, and I'm now halfway through Happy Valley. They're both excellent shows that I highly recommend.
[00:46:31] Unknown:
Cool. Okay. Well, I'm gonna sort of stay in tech, but it's gonna be cooking tech. Nice. So I recently got the Anova sous vide. It's, like, the Bluetooth sous vide, and this thing is amazing. So I am not a huge cook, but I like cooking, and I also have a 3 year old daughter. And so this thing, it's super easy to use, there's relatively little cleanup, and I've used it, like, 3 times, and everything I've cooked has been cooked perfectly. And so you just seal whatever you're cooking, kinda squeeze out the air, stick it in a water bath, put this thing in it, and then you can pretty much set it and just walk away. And so I'll set it and then go clean up, or go hang out with my daughter or whatever, come back, and whatever I put in there is now perfectly cooked. So if you're into cooking, or you don't have a lot of time, or you wanna be able to do something else while you're cooking, I highly recommend it.
[00:47:26] Unknown:
That's awesome.
[00:47:27] Unknown:
Alright. I think you're the second person to mention that within the past couple of weeks, so it definitely seems to be a popular item. I'll have to, take a look at that myself. Alright. Well, I appreciate the both of you taking time out of your day to join me and talk about the work that you're doing with monkey type. It's definitely a very interesting project and 1 that I am hoping to take advantage of at my work very soon. So thank you both for the work you've put into that and for your time this evening, and I hope you enjoy the rest of your day. Great. Thanks, Tobias, for inviting us. Appreciate it. Yeah. Thanks a lot. This was fun.
Introduction and Announcements
Guest Introductions: Carl Meyer and Matt Page
Overview of Monkeytype
Comparison with PyAnnotate
Performance and Overhead in Production
Handling Dynamic Types and Type Rewriters
Internals and Evolution of Monkeytype
Open Sourcing Monkeytype
Using MyPy and Other Tooling
Bugs Found and Fixed with Type Annotations
Improving Existing Annotations
Identifying Dead Code
Challenges and Interesting Aspects
Python's Gradual Typing and Tooling
Impact on Software Development Cycle
Getting Started with Monkeytype
Future Improvements and Features
Closing Remarks and Picks