Summary
As Python developers we are fond of the dynamic nature of the language. Sometimes, though, it can get a bit too dynamic and that’s where having some type information would come in handy. Mypy is a project that aims to add that missing level of detail to function and variable definitions so that you don’t have to go hunting 5 levels deep in the stack to understand what shape that data structure is supposed to be. This week we spoke with David Fisher and Greg Price about their work on Mypy and its use within Dropbox and the broader community. They explained how it got started, how it works under the covers, and why you should consider adding it to your projects.
Brief Introduction
- Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
- I would like to thank everyone who has donated to the show. Your contributions help us make the show sustainable. For details on how to support the show you can visit our site at pythonpodcast.com
- Linode is sponsoring us this week. Check them out at linode.com/podcastinit and get a $20 credit to try out their fast and reliable Linux virtual servers for your next project
- We are also sponsored by Sentry this week. Stop hoping your users will report bugs. Sentry’s real-time tracking gives you insight into production deployments and information to reproduce and fix crashes. Check them out at getsentry.com
- Visit our site to subscribe to our show, sign up for our newsletter, read the show notes, and get in touch.
- To help other people find the show you can leave a review on iTunes, or Google Play Music, and tell your friends and co-workers
- Join our community! Visit discourse.pythonpodcast.com for your opportunity to find out about upcoming guests, suggest questions, and propose show ideas.
- Your hosts as usual are Tobias Macey and Chris Patti
- Today we’re interviewing David Fisher and Greg Price about Mypy, a library for adding optional static types to your Python code.
Interview with David Fisher and Greg Price
- Introductions
- How did you get introduced to Python? – Chris
- Can you explain a bit about what Mypy is and its origin story? – Tobias
- What are the benefits of using Mypy for both new and existing projects? – Tobias
- How does the Mypy compilation step work? – Tobias
- What are the biggest technical challenges in implementing Mypy? – Chris
- Are there any limitations imposed by the syntax of Python that prevented you from implementing any features or syntax that you would have liked to include in Mypy? – Tobias
- In Guido’s keynote from this year’s PyCon he mentioned some tentative plans for adding variable type declarations to the Python syntax in one of the next major releases. How much of that idea was inspired by Mypy? – Tobias
- Type theory is a large and complex problem domain. Can you explain where Mypy falls in this space? – Tobias
- Which language(s) had the biggest influence on the particular syntax and semantics used in Mypy? – Tobias
- What kinds of type definitions and guarantees can be encoded using Mypy? – Tobias
- Can you talk a bit about user defined types as implemented in Mypy? – Chris
- How has the inclusion of the typing module in the Python standard library influenced the evolution of Mypy? – Tobias
- Did the inclusion of multiple inheritance add any implementation complexity to Mypy? – Chris
- Do you know of any formal studies that have been performed to research the ergonomics or efficiency gains of static or gradual type systems? – Tobias
- What does the future roadmap for Mypy look like? – Tobias
Keep In Touch
$ pip3 install mypy-lang
Bug reports, feature requests, questions welcome on issue tracker: github.com/python/mypy
Picks
- Tobias
- Functional Geekery – Andreas Stefik episode about studies performed on the human factors of development
- Soft Skills Engineering Podcast
- Chris
- David
- fzf – a fuzzy finder
- Thinking, Fast And Slow by Daniel Kahneman
- Ringworld
- Greg
- On Proof and Progress in Mathematics, essay by Bill Thurston
- Axiomatic by Greg Egan
Links
- GitHub repo, and CONTRIBUTING file
- PEP 484
- PyCon 2016 workshop slides
- Typeshed shared repo for stubs
- Other tools (PyCharm, pylint, pytype, …) using PEP 484 types
The intro and outro music is from Requiem for a Fish by The Freak Fandango Orchestra / CC BY-SA
Hello, and welcome to Podcast.__init__, the podcast about Python and the people who make it great. I would like to thank everyone who has donated to the show. Your contributions help us make the show sustainable. Linode is sponsoring us this week. Check them out at linode.com/podcastinit, and get a $20 credit to try out their fast and reliable Linux virtual servers for your next project. We are also sponsored by Sentry this week. Stop hoping your users will report bugs. Sentry's real-time tracking gives you insight into production deployments and information to reproduce and fix crashes. Check them out at getsentry.com and use the code podcastinit at signup to get a $50 credit. You can also visit our site to subscribe to our show, sign up for our newsletter, read the show notes, and get in touch at podcastinit.com.
And to help other people find the show, you can leave a review on iTunes or Google Play Music and tell your friends and coworkers. You can also join our community forum at discourse.pythonpodcast.com for your opportunity to find out about upcoming guests, suggest questions, propose show ideas, and follow up with past guests. Your hosts, as usual, are Tobias Macey and Chris Patti. And today, we're interviewing David Fisher and Greg Price about mypy, a library for adding optional static types to your Python code. So could you guys introduce yourselves? How about you go first, David?
[00:01:20] Unknown:
Sure. So hi, everyone. I'm David. I've been working at Dropbox for the past couple years and on mypy itself since the beginning of this one.
[00:01:30] Unknown:
And, Greg, how about you?
[00:01:32] Unknown:
I'm Greg. I also work at Dropbox and have been doing Python for a long time and long wished that it had types. And I'm very happy that we're now, like, granting my wish.
[00:01:42] Unknown:
So how were you folks introduced to Python?
[00:01:46] Unknown:
So I was introduced to Python, well, actually, it was part of a low-level C class my freshman year of college. My TA at the time held a special section on Python, just as a bonus thing. It wasn't really needed for the class at all, but it was just something that he liked a lot. And I had my mind blown at the time by the fact that you could use negative indices to index from the back of arrays. I just thought that was the coolest idea ever. I really liked what I saw in the small taste of it that he showed us. And so I started doing some side projects in it. Nothing too exciting, just playing around with it, and that started my love for it, I would say. And how about you, Greg?
[00:02:22] Unknown:
Sure. So when I was in high school, I actually used to do computing contests, and my brother did too. And one year when he won one of these contests, the prize as usual was you got to pick a book. And the book he picked was one on this language, actually, that neither of us had heard of before, called Python. And he got really into it, and that's where I got into it.
[00:02:40] Unknown:
Great. And can you guys explain a bit about what mypy is and some of its origin story?
[00:02:46] Unknown:
Yeah. So the good news here, which we're telling everybody in Python wherever we can, is that Python now has static types. They're optional. They have no effect at runtime, but they're really good for making a code base easier to understand. And that's why we're really excited about them, why both David and I are, and why Dropbox is excited about them and actually sponsoring full-time work on it. So the static types are the center of the story, and mypy is the type checker, which is a key part of making that whole thing work. And where we come from in really being interested in static types is about understanding a code base, understanding a Python code base, especially one that's larger, that maybe many people are working in. So there's this little story that I've lived out many more times than I would like, where I'm trying to understand some little piece of code. And it's just a couple of lines maybe, just a handful of lines. And I wanna know what it does.
And okay, it calls some method on a thing. Great. What method is that? I wanna go and look up that method so I can find out what it does, either via documentation, if there is such a thing, or the implementation. And I can search the code base for methods by that name, but there might be a lot of methods by that name. There could be a hundred in a giant code base like Dropbox's, or there could even just be 2, and that's enough to potentially be a pain. And the question of which method is actually being called is a question of what type the thing is that you're calling it on. And the same story occurs for all kinds of questions you might wanna answer about how some piece of code works. You have to know what types the things are that it's operating on. And traditionally, in a Python code base, what you end up doing is, okay, well, what type is this thing? Well, it was passed in as a parameter to this function. Great. I go and search for where this function is called and figure out what types, again, the arguments are that we pass there. And now it's the same problem recursively. Maybe it's obvious at that spot, or maybe I have to go search for the caller's caller. And I end up searching through and reading many thousands of lines of code, potentially, just to understand what one line of code was doing. And that's the thing that I hope to never do again.
And if only, when I or whoever else was writing that code in the first place, we had just written down one line at the top of the function saying what the types of the arguments are supposed to be. That would terminate that whole search right at the start, and it makes it much, much faster to figure out what this code is doing. And that's now what we're doing, what many people are doing across the Dropbox code base, and what many people are doing across a number of open source code bases and elsewhere. You write down the types at the top of the function, you write down the types wherever it's not obvious, and that's the static types.
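As a minimal sketch of the "one line at the top of the function" idea described above, assuming Python 3 annotation syntax: the `Invoice` class and `describe` function are hypothetical names made up for illustration and do not come from the conversation.

```python
from typing import List


class Invoice:
    """Hypothetical class, used only for illustration."""

    def __init__(self, amounts: List[float]) -> None:
        self.amounts = amounts

    def total(self) -> float:
        return sum(self.amounts)


def describe(invoice: Invoice, currency: str) -> str:
    # The signature line answers "what is `invoice`?" directly,
    # so a reader does not need to search through the callers.
    return "{:.2f} {}".format(invoice.total(), currency)
```

A reader who lands on `describe` can see from the signature alone that `invoice.total()` refers to `Invoice.total`.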
And the crucial part, the part that is new, effectively, that's needed in order to make this all work: you always could have written down the types in a comment or docstring. Right? And people even had conventions for how to write them down in docstrings. The trouble is, and I think probably anybody that spends much time in a code base that a lot of people worked on over time sees this play out, comments become lies. Somebody writes down a comment, and it might be truthful at the time, and then the code changes. And somebody changing the code doesn't see the comment, doesn't think to update it, and the comment becomes no longer true.
And once you see that happen a few times, you realize you can't rely on the comments, so you end up actually doing the same work of understanding the code
[00:06:08] Unknown:
because you can't trust the comment to tell you the answer. And so the critical thing is that you write it down in a way that a machine can actually check is not a lie. And that machine check is the type checker, and mypy is a type checker. And as for the origin story, mypy was started, I guess, about 4 years ago now, by Jukka Lehtosalo in 2012 as part of his PhD work. And it was originally a sort of custom language, which is how it got the name, a language that was very similar to Python but not quite. But then he ended up meeting Guido at PyCon in 2013, and Guido convinced him to switch this from a custom language to just type checking pure Python. And it had some work on and off by various contributors in the years since. But at the beginning of this year, we really started working on it full time. Me, Greg, Guido, Reid Barton, mainly,
[00:07:01] Unknown:
and a few other contributors putting a good bit of time into it as well.
[00:07:06] Unknown:
Going back to your point about searching through a large code base, trying to figure out where different things are coming from: to some extent, the origin of methods, at least, is somewhat alleviated by the strong focus on namespacing in Python. But it's definitely a lot easier to have that information very local to the function definition, because then you don't have to go looking back at the top of the file, which could be many thousands of lines back, and then trying to remember, okay, where was I when I first started this search? So, I don't really agree, actually, about namespacing. By namespacing, I think maybe you mean module names and package names, or hierarchical names of modules? Correct.
[00:07:42] Unknown:
That works great for invoking top-level functions that are defined at the top level of some module, because you went and imported them from that module, or you imported the module and refer to the module in the code, either way. But if instead you invoke a method on some thing, you had a class somewhere that defines a method, and you may have had several unrelated classes that define methods with the same name, because popular names are popular. In the Dropbox code base, take the name validate. Great word. Lots of people choose it. There are well over 100 different methods named validate. In a smaller code base, there might be 2 or 3, but that's enough to be uncertain what something is doing until you go and work out which of those it actually is. And the question of which it is is precisely: what is the type of the thing that you're invoking the method on? You just have this thing that was a parameter to the function, or that you got from an attribute on something else, and you don't have a module name anywhere. The only way to work out what it is is to actually figure out the type.
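The `validate` situation can be sketched as follows. Both classes and the `handle` function are hypothetical and invented for illustration; only the method name `validate` comes from the discussion.

```python
class EmailAddress:
    """Hypothetical class with its own validate method."""

    def __init__(self, value: str) -> None:
        self.value = value

    def validate(self) -> bool:
        return "@" in self.value


class UploadRequest:
    """Unrelated hypothetical class that also defines validate."""

    def __init__(self, size_bytes: int) -> None:
        self.size_bytes = size_bytes

    def validate(self) -> bool:
        return 0 < self.size_bytes <= 10 * 1024 * 1024


def handle(request: UploadRequest) -> str:
    # The parameter annotation tells the reader (and a type checker) that this
    # call resolves to UploadRequest.validate, not EmailAddress.validate.
    return "accepted" if request.validate() else "rejected"
```

Searching the code base for `def validate` finds both methods; the annotation on `request` is what picks one of them.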
[00:08:36] Unknown:
Yeah. That's definitely true where it's really only useful for top level functions. So it definitely makes it clear for why mypy is useful for existing and particularly large projects. What about for smaller projects or projects that are just getting started? What would lead somebody to want to use them, particularly, for instance, where the entire program exists in a single file?
[00:08:56] Unknown:
Well, I think there are a few reasons that you might wanna do that. The first is, I personally find types to be a pretty helpful way to structure my thinking about my own code as I'm writing it. It just lends some clarity to what I'm writing down. And if I start by writing the types at the top of a function, it'll be a little bit easier for me to write the function body a lot of the time. Also, mypy is a type checker. It's not just an annotation system. So it'll catch a pretty decent variety of errors, which I think is pretty nice. And also, even with just one file of code, if you come back to it 2 or 3 weeks later, you may still have this moment. You won't have as big a search to do, but you may still have a moment of, alright, what is this function doing? And then you've gotta dig and figure it out a little bit. Whereas if it had types, it may be immediately obvious.
[00:09:45] Unknown:
So can you explain a bit more in detail what the actual compilation step looks like? What kind of operations it's doing under the covers?
[00:09:52] Unknown:
Sure. So it's not actually compilation per se. It's more like a linter on steroids, but there are several stages that the code moves through. First, we have to parse it all, obviously. And we do this with the typed_ast module that we've created. We wanted something that would be very fast, but also conform exactly to the Python language spec, and also be able to parse type comments. And so we wanted to use the Python built-in ast module, but that doesn't parse comments at all. The benefit of ast, obviously, is that it is very fast and corresponds exactly to Python, because it's the module that CPython uses when it interprets something. So we actually forked the ast module from both Python 2.7 and Python 3.5, made them compile on Python 3, and added in the ability to parse type comments. And so now we can parse both Python 2 and Python 3 from Python 3, which mypy is written in, and get those type comments out. So in our first stage, we parse. We get a Python AST.
And then we put it through this semantic analysis step. Basically, it's figuring out what refers to what. So if you have some x in the body of a function, you figure out, okay, this actually refers to the x that is passed in as an argument. Or if you have some class definition, you figure out, okay, that's imported from such and such a module at the top of the file. And so we make all those connections in this next pass. And then finally, we have a type checking pass where we go through the code bit by bit and determine what type every different expression has, and at the same time check that there are no errors when things are being used. So an example from the type checking phase might be something like: as we've recursed down into a function, we see, okay, this is an integer literal. So now we know it has type int, and then we can propagate that out. Or in a more complicated case, we see this is a function call, so let's use the information from the semantic analysis phase to figure out what function it refers to exactly. And now we know what types it's expecting to take as arguments. And so we can recursively run this type check on all the arguments, make sure those types are compatible, and print an error if they're not.
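The "integer literal has type int, then propagate outward" step can be illustrated with a toy. This is only a sketch under stated assumptions: it uses the standard library `ast` module rather than typed_ast, it handles only literals and `+`, and the functions `infer` and `infer_source` are hypothetical names. It is not how mypy itself is implemented.

```python
import ast


def infer(node: ast.expr) -> str:
    """Toy inference: return a type name for a very small set of expressions."""
    if isinstance(node, ast.Constant):
        # A literal's type is known immediately, e.g. 1 -> "int".
        return type(node.value).__name__
    if isinstance(node, ast.BinOp) and isinstance(node.op, ast.Add):
        # Recurse into both operands, then check that they are compatible.
        left, right = infer(node.left), infer(node.right)
        if left == right and left in ("int", "float", "str"):
            return left
        raise TypeError(
            "unsupported operand types for +: {} and {}".format(left, right)
        )
    raise NotImplementedError(ast.dump(node))


def infer_source(source: str) -> str:
    # Parse a single expression and infer the type of its root node.
    tree = ast.parse(source, mode="eval")
    return infer(tree.body)
```

Calling `infer_source("1 + 2")` walks down to the two literals and propagates `int` back up, while `infer_source("1 + 'a'")` reports the mismatch without ever evaluating the expression.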
[00:12:02] Unknown:
I have a thought, actually, that I'll add on the benefits for small pieces of code, which is amplifying the bit where one of the functions of it is to find errors that are in your code. One of the more frustrating things that can happen when you write some little script is, often you write a small amount of code, just dozens or hundreds of lines for a script you're gonna run, and you're gonna go and run it on some data. Right? Or it's gonna go and fetch some stuff from the web or something. It's gonna do a bunch of processing. And an experience that certainly I've had more times than I would like is, you have the script. It's going. It's going. It's going. It's doing its work. And right at the very end, it blows up with some stupid bug. And all of that intermediate work that's done is just lost as the script fails.
It may be blowing up with an AttributeError because you mistyped something, or blowing up with some error because you have a None somewhere, which was because you basically made a type error. All those classes of errors, which you're especially likely to have in a script you're running basically for the first time, because that's how script development often is for processing data and so on, are the kinds that mypy is especially likely to find for you. And so it's been a little while since I wrote a complicated script, but definitely the next time I do, I would very much want to be writing type annotations from the beginning so that I can have that benefit, and so I never again have that experience where it blows up on a stupid bug after doing a bunch of work and destroys all the work.
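A small sketch of the "None somewhere" failure mode in a data-processing script. The functions `parse_count` and `total` are hypothetical; the point is only that the `Optional[int]` return type makes the possible `None` visible before the script is run on real data.

```python
import re
from typing import List, Optional


def parse_count(line: str) -> Optional[int]:
    """Return the first integer in the line, or None if there is none."""
    match = re.search(r"\d+", line)
    return int(match.group()) if match else None


def total(lines: List[str]) -> int:
    result = 0
    for line in lines:
        count = parse_count(line)
        # Without this check, `result += count` would fail at runtime on the
        # first line that has no number, possibly late in a long run. A checker
        # that tracks Optional values can report the misuse up front.
        if count is not None:
            result += count
    return result
```

With the `None` check removed, the script would only fail once it reached an input line without a number, which is exactly the late failure described above.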
[00:13:31] Unknown:
Yeah. I've definitely run into situations like that because I've spent a fair bit of my career doing various data munging, and that's particularly painful when the data that you're operating on or the outputs that you're dealing with are not idempotent.
[00:13:44] Unknown:
Mhmm. Yeah. Yeah. Right. Right. Yeah. That's that's actually dangerous. Right? Yeah.
[00:13:48] Unknown:
So what are the biggest technical challenges in implementing mypy?
[00:13:53] Unknown:
So I'd say the design decisions are actually the hardest part. Python seems simple from a high level, and as a user, most of the time it is. But when you look at the nitty-gritty of the semantics, it's actually really very complicated, and there are a lot of very dynamic things going on. And so figuring out where the right place is to provide structure on this dynamic behavior is very difficult, because we wanna support as much of the full Python semantics as possible. We don't wanna say, hey, we're making this really restricted subset of Python that you can type. We want to try to type as much of the language as we can, but you still have to carve out some boundaries. If you're going to be calling getattr with some string that comes from somewhere, hey, we're not gonna be able to check that that attribute actually exists. That's potentially just impossible. And so figuring out where those boundaries lie, and what to do about various edge cases, has been quite a difficult task, I would say.
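The getattr boundary mentioned above can be sketched like this. `Config`, `read_static`, and `read_dynamic` are hypothetical names for illustration.

```python
class Config:
    """Hypothetical settings object."""

    timeout = 30


def read_static(cfg: Config) -> int:
    # The attribute name is written in the source, so a checker can confirm
    # that Config has a `timeout` attribute and that it is an int.
    return cfg.timeout


def read_dynamic(cfg: Config, name: str) -> object:
    # The attribute name only exists at runtime, so there is nothing for a
    # static checker to look up; a bad name surfaces as AttributeError.
    return getattr(cfg, name)
```

Both functions behave the same when `name` is `"timeout"`, but only the first form gives a static tool anything to verify.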
[00:14:52] Unknown:
Yeah. And I imagine that it's probably difficult to prevent it from being overly opinionated and thereby restricting the types of things people can do with mypy.
[00:15:02] Unknown:
You know, there are always gonna be valid programs that won't type check. But what we're aiming for is, if you have to convert your program to some form that type checks, the type-checking form should be clearer anyway. Hopefully, in converting it to type check, you're just having to remove code smells or stop doing weird things that maybe you shouldn't have been doing in the first place. And so that's the main place where we feel good about making the trade-off of saying, hey, we're not gonna support this right now. Yeah. That's the thing we very much see people doing. Both people
[00:15:33] Unknown:
in some parts of Dropbox, working with some of the code that's been around for a long time and doesn't always have the best choices, and also other people I've seen adopting it outside. They see some code, they wanna put types on it, and we say, well, we don't wanna support that pattern. And then they say, oh, well, I was actually looking for an excuse to get rid of that anyway, and now is the time to do that. So when we hear that, we're happy. Everybody's happy. We don't really mind that there have to be some changes to the code for it to type check. But we, in general, want to minimize the changes that have to be made to the code base in order to put types on it. And we want to maximize the amount of the code that can end up with types on it with a minimum of change to the code. It'll always be okay to have some parts of the code base just not have types on them. That'll always be totally fine. But we want to be able to use types as much as possible, to get the benefit of them in as much of the code base as possible, without having to make changes. And so doing that, while maximizing the benefit that they give by giving useful structure to the code, is the trade-off, the boundary that always has to be found.
[00:16:30] Unknown:
Yeah. And there's also a certain critical mass whereby you have a certain percentage of your code that has type annotations, and those types can then propagate out throughout the rest of the code base that doesn't necessarily have the annotations already labeled, but you can provide those inferences based on how those changes are propagating throughout the overall code.
[00:16:50] Unknown:
Well, it actually turns out that we shy away from type inference for the most part. And this is a purposeful decision, because we want people to be able to roll out types to their code bases gradually, and that's partly because we have a very large code base that we're rolling out types to gradually. But, say, a couple years ago, nobody had types in their code at all. And so every use of this is gonna be a gradual rollout, unless you're starting a new project. And part of that is being very permissive about code that doesn't have type annotations in it. And so we let you gradually expand your area of annotated code, and everywhere that's annotated, we check that it actually type checks, but we're pretty careful not to reach into unannotated code to yell at you about errors there. Because otherwise, if you have a 100,000-plus line code base, and you add types to one file and run mypy for the first time, you'd get just a huge deluge of errors, and you'd be completely overwhelmed, and you wouldn't even wanna get started. And so this lets you actually roll it out in a file-by-file way and make things better slowly. And even if a lot of those are legitimate errors,
[00:17:58] Unknown:
or legitimate bugs that may be caught, no one can deal with those things if they get them all at once, basically. Although that's actually an option people have, and people do choose different strategies than this. Because as David said, when it does go and analyze all the code even without annotations, it often finds quite real bugs. That's a strategy that some users take. For example, one of the open source projects using it is Zulip, a chat system in Python. They've made the choice: you know what, we only have a few tens of thousands of lines of code, I forget just how much, less than 100,000, and we like finding bugs, so we're gonna pass this flag to mypy which says, actually, go and check all the function definitions, even the ones that don't have annotations on them. When you pass that flag, it still will not make a lot of assumptions about that code, and there are some things that it totally won't be able to find. And so they went through and had to systematically fix a good number of bugs to make it clean with that flag. Mostly small bugs, because they had good tests. If you had less comprehensive tests, you'd probably find more bugs. They fixed bugs and also changed some things where mypy required the code to be restructured in certain ways in a few places. And then they got it to be totally clean with that option, and now they have that option on when they run in CI, and with every change you make to the code base, it has to remain clean and not introduce any bugs. That's a strategy choice that you have to make in adopting it on a code base. For the entire Dropbox code base, there are millions of lines of code, and that would, for all the reasons David just said, be a strategy that would not work.
It's really important that you be able to adopt it gradually and not deal today with finding and fixing all the bugs in the entire code base.
Yeah. It's definitely important to be able to gradually ramp up the level of aggression on checkers like that. And it is useful too, where even if you're not running it across your entire code base day to day, being able to occasionally just say, okay, how far away are we from eliminating all these issues, just as a periodic progress report? Yeah. I think it can be. I think the strategy for adopting this on a given code base always crucially has to involve putting it in your CI system. You run it the same way as running your tests. Hopefully, you run your tests, or most of your tests, on every commit you make. Mypy is quite fast compared to any decent test suite. You run mypy on every commit you make, and you don't let a commit come in that breaks the mypy build. And a crucial part of making that work well is that it's actually quite easy to get it running cleanly on your whole code base, because by default it will be very conservative and not make a lot of checks. It will not, by default, go and try to find all kinds of bugs that might be there somewhere. It will start by making very few assumptions, checking very few things, and as you go and add annotations, it will check more and more. And so you can start by having totally clean results because it's not checking very much yet. And as you start adding more annotations, it checks more and more, and you maintain the invariant that it's always totally clean. And the reason that's a really important thing to maintain is that it's what lets you actually trust the type annotations.
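A sketch of what gradual adoption looks like in a single file. `legacy_join` and `greet` are hypothetical names. The flag is not named in the conversation; mypy's `--check-untyped-defs` option appears to be the one that matches the description, which is an assumption on the editor's part.

```python
from typing import List


def legacy_join(items, sep):
    # No annotations: by default the body of this function is not type
    # checked, and its parameters and result are treated as dynamic.
    return sep.join(items)


def greet(names: List[str]) -> str:
    # Annotated: this body is checked. The call into legacy_join is allowed
    # because the unannotated function accepts and returns dynamic values.
    return "Hello, " + legacy_join(names, ", ")
```

Adding annotations to `legacy_join` later widens the checked area by one function without forcing any other file to change.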
And if you don't keep up on that, and the type annotations just degrade to be like docstrings, then there's the problem we mentioned before, which is that you can't trust them and you have to go back to your search. Another thing right there that's maybe worth discussing: you mentioned type inference. And there are different approaches you will sometimes see to type checking and type systems in a typed code base. In one style of programming, which there are some projects trying to do in Python, and there are also languages where this is the aesthetic, you write virtually no type annotations anywhere, and you're really proud of your type checker being very smart and being able to infer how the types flow everywhere. And that is a thing that can work for some purposes. But actually, even if you have a really, really smart type checker that is able to do that successfully, I think it doesn't serve the main purpose that we are really excited about types for, which is to better understand the code. Because for that purpose, the crucial part actually is that you do write down the types. And so the balance we've chosen, which is also the balance that seems to work well in some other languages, even when you have the option of not writing very many, is: at the top of every function, you write down the types. Partly that's because it helps out the type checker and makes it easier for us to make the type checker fast. But partly it's because that's really the point: for it to be written there in the source code for the benefit of the next reader. You write it down at the top of your function, and you also write it down in other places where it might not be obvious to a human reader, or perhaps not obvious to the type checker. And so the consequence is that it's there. It's written.
It's quite lightweight. Just one line at the top of the function. It's not like in some languages where on almost every line you maybe have to write some types. It's a line at the top of the function and in a few other places, and so it serves this purpose very well of telling the reader what's going on. And then the type checker has a lighter-weight level of type inference where, within a function, it will work out: you started with these parameters, you went and did some straightforward operations on them, I know what types those produce, I know what type you got here, etcetera, etcetera. And it can follow the thread of how the types evolve through the function.
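The division of labor described above, one annotated signature plus inference inside the body, might look like this. `average_length` is a hypothetical function; the comments state the types a checker could be expected to work out from the annotated parameters.

```python
from typing import List


def average_length(words: List[str]) -> float:
    # Only the signature is annotated. The types of the locals below follow
    # from the parameter types and the operations applied to them.
    total = 0                    # int, from the literal
    for word in words:           # word is str, from List[str]
        stripped = word.strip()  # str.strip returns str
        total += len(stripped)   # len returns int
    count = len(words)           # int
    return total / count if count else 0.0
```

None of the local variables needs its own annotation, yet each has a single type that can be followed line by line.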
[00:22:39] Unknown:
Are there any limitations imposed by the syntax or semantics of Python that prevented you from implementing any features or syntax that you would have liked to include in mypy?
[00:22:48] Unknown:
One of the things that you have to do right now, even in Python 3, which has a nice function annotation syntax, is that if you're trying to annotate a variable, you have to put that annotation in a comment. And we think that at some point down the line, we would like to find a syntax for variable definitions that doesn't require you to use comments,
[00:23:07] Unknown:
but right now you do. Yeah. And in fact, you mentioned in another question that you heard Guido say on stage at PyCon that he would like Python 3.6 or 3.7 to actually have that kind of syntax. This might be a good point also to emphasize that mypy works great on Python 2 as well as Python 3. That's very important to us at Dropbox, because Dropbox's code base is entirely Python 2. Python 2 has no built-in syntax for any kind of type annotation as Python 3 does, and so the consequence is that you just have to put the annotations in a comment. If you look at PEP 484, you'll see there's a specification there for precisely what that syntax should look like, which we developed and standardized in PEP 484, discussing it with people in the community, because Python 2 support was very important to us. It's in a comment, which makes it look a little ugly, but it works fine.
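For readers who have not seen it, the comment syntax from PEP 484 looks roughly like the sketch below. The function and variable names are invented for illustration. On Python 2 the typing module is a separate install; on Python 3.5 and later it is in the standard library.

```python
from typing import List


def greet(name, times):
    # type: (str, int) -> List[str]
    # The signature comment above is what a checker reads on Python 2,
    # where "def greet(name: str, ...)" would be a syntax error.
    return ["Hello, " + name] * times


# Variable annotations also go in a comment.
seen = []  # type: List[str]
seen.extend(greet("mypy", 2))
print(seen)
```

Because the annotations live in comments, the same file runs unchanged on both Python 2 and Python 3.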
[00:23:58] Unknown:
And another approach, at least in the typing module, was to introduce the concept of stub files. Does mypy support those as well for being able to add those type annotations to Python 2 code as well as an alternative to the inline comments?
[00:24:14] Unknown:
So the short answer is yes, with a small complication. The thing about stub files is that, for us, the really crucial step is to write down what the types are right there in the source code. When you're reading the source code, you have the types right there as the reader. If instead you put them in a file off to the side, it doesn't really serve that purpose. Where stub files are very useful, even in our strategy, is when you have a third-party library that you don't control and that you don't want to start patching. Stub files are a way of writing the types separately from that, in a separate file you control more easily. And, actually, that's also how we do it for the standard library. There's a shared repository that is bundled with mypy, and that is also used by other tools, for writing down types for the standard library and for third-party libraries. And that repository is called typeshed? Yeah. So you can use stub files for code that's really your own. As I said, it doesn't really serve what we see as the main purpose. And there's a small wrinkle, which is that if you do that, I believe mypy won't actually type check the file that you have a stub file next to. Partly, that's because that's just not a workflow that makes sense to us. Partly, it's because the main use case for stub files for us is, again, these third-party libraries and so on, where it wouldn't make sense to try to type check them, because what are you going to do about the errors? And related to that, in fact, are extension modules. Even for your own code, if you have an extension module you've written, that's C code. It's not Python.
There isn't really a sensible way to type check that, or maybe there could be, but it would be a whole project of its own to type check an extension module. And so for that, again, you write a stub file, which is then very handy for your code that uses that extension module. But we don't attempt to type check the extension module against the stubs you wrote.
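To make the stub idea concrete, here is a sketch of what a stub for an extension module might contain. The module name fastmath and both functions are hypothetical. Stub files conventionally use the .pyi extension and ordinary Python syntax, with "..." standing in for every function body.

```python
# Contents of a hypothetical stub file, fastmath.pyi, describing functions
# that are really implemented in C. Only the signatures matter to a checker.
from typing import List


def dot(a: List[float], b: List[float]) -> float: ...


def normalize(v: List[float]) -> List[float]: ...
```

Code that imports the extension module can then be checked against these signatures, while the C implementation itself is not checked against the stub, as described above.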
[00:25:58] Unknown:
Type theory is a very large and complex problem domain. Can you explain a bit where mypy falls in that space and what the biggest influences are on how the particular typing syntax and semantics came into play?
[00:26:12] Unknown:
Yeah. So I think the most important thing about mypy from this perspective is that it's a gradual type system. I think we've already talked a reasonable amount about how this is something that lets you roll it out slowly over a code base over time. And this is something that we took mainly from somewhat recent academic work that came about in the last few years. In particular, there's this idea of an Any type, which is basically an escape hatch from the type system. The formalism of it is that it's treated a bit separately from all other types. So that was a really crucial idea. In terms of, say, expressiveness, I would say it's more expressive than Java overall and less expressive than, say, Haskell or Scala.
I would say closer to Java than to those. But you write a lot less than in Java. Yeah. And in terms of the languages that influenced it, honestly, we tried to keep it just as Pythonic as possible. On the team, we have a pretty wide swath of languages that we're familiar with by way of background, but we didn't look to those languages too strongly in our design of PEP 484 and mypy, because we really wanted to make it feel Pythonic and not at all as if people were having to write some other language in Python. As David said, by far the major source of the semantics of the type system is Python itself. And the main ideas that come to mind that are not already there in the Python type system are,
[00:27:37] Unknown:
the Any type, which, as David explained, comes not from any other language in particular but from some academic work of recent years, and generics. Generics are a great idea. At this point, they're decades old, and they've become a really widespread idea. There's no one language we can really attribute them to. They're so widespread that when new languages come out, like Swift, of course they have generics. That's just table stakes for a good new language in this decade, and so we have those. And I think they fit very naturally with the way that people have thought about Python code in the past.
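A small sketch of how the Any escape hatch supports gradual adoption, under the assumption that some older code has no annotations yet. All names here are invented for illustration.

```python
from typing import Any, List


def legacy_parse(raw):
    # No annotations: a gradual checker treats the parameter and the
    # return value as Any and does not complain about how they are used.
    return raw.split(",")


def total_length(words: List[str]) -> int:
    # Fully annotated: calls to this function are checked.
    return sum(len(w) for w in words)


def load(blob: Any) -> List[str]:
    # Any is compatible with every type in both directions, so the
    # untyped result can flow into typed code at this boundary.
    return legacy_parse(blob)


print(total_length(load("a,bb,ccc")))
```

The annotated functions get checked today, and legacy_parse can be annotated later without touching its callers.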
[00:28:09] Unknown:
And another interesting type idea that is implemented in mypy and in the typing module is the idea of the union type for being able to accept a range of different possible types based on how you're using the function.
[00:28:24] Unknown:
Yeah. And that's not something that you see in that many other languages. I mean, it's sort of like a sum type. But different. But also different, yeah. And that one
[00:28:33] Unknown:
mainly came about because that's a pretty common idiom in Python, and so we just support it. Yeah. I think it really goes in the category of stuff that came from how Python code is written. There is a formalism for it, which people have had in type theory for a while, and that's, of course, helpful in being able to write a type checker that understands it. But as a feature it really comes from Python proper.
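The idiom being referred to is a function that accepts more than one kind of argument and sorts it out with isinstance. A minimal sketch, with a made-up function name:

```python
from typing import Union


def normalize_port(port: Union[int, str]) -> int:
    # Callers may pass 8080 or "8080"; the Union says both are acceptable.
    if isinstance(port, str):
        # Inside this branch a checker can narrow port to str.
        return int(port)
    # Here only the int case remains.
    return port


print(normalize_port("8080") + normalize_port(1))
```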
[00:28:51] Unknown:
Yeah. Because a lot of other languages where you would use a function in that manner would generally implement the function lookup as a function overload, and so that would negate the need for any sort of union type. But because Python doesn't have function overloading, and instead...
[00:29:15] Unknown:
And it turns out, actually, that sometimes you want to express overload-style things. Let's say you have a function whose return type depends on which element of the union you take in. You can actually write this out. You can say, well, if I give it an int, then it'll give me back a string. And if I give it a list of ints, then it'll give me back a list of strings,
[00:29:39] Unknown:
for example. And this is only at the type level; we're not providing any multiple dispatch machinery for you. Yeah. So another way of saying it is that there actually is a concept called overload in PEP 484. It's there because, especially in the standard library, there's a large number of functions that have somewhat complicated behavior, where you could have written it to some degree with a union. You could say, well, it takes arguments that are a union of either this or that. But to really describe the behavior precisely, you want to pretend as if it's several different functions with different types. And as David says, the way it's actually implemented at runtime is just, you know, no change. There's never any runtime change with any of the mypy stuff. But at a conceptual level, to understand what's happening at the type level, it's helpful to describe it as if it were an overload. That's a common thing to do, especially in the standard library, so we do support that as a way of describing the type of something.
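The int-to-string and list-to-list example from the conversation can be written with the overload decorator from the typing module. This is a sketch with an invented function name. The decorated signatures exist only for the checker; the last, undecorated definition is the one that runs.

```python
from typing import List, Union, overload


@overload
def describe(value: int) -> str: ...


@overload
def describe(value: List[int]) -> List[str]: ...


def describe(value: Union[int, List[int]]) -> Union[str, List[str]]:
    # One real implementation; no dispatch machinery is generated.
    if isinstance(value, list):
        return [str(v) for v in value]
    return str(value)


print(describe(3), describe([1, 2]))
```

With only the Union signature, a checker would have to assume either result type at every call site; the overloads let it pick the precise one.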
[00:30:30] Unknown:
So can you talk a bit about user defined types as implemented in mypy?
[00:30:34] Unknown:
Yeah. So this is a question that we actually get asked a lot, but it turns out there's very little difference between user-defined types and built-in types in mypy. In fact, pretty much all of the built-in types are defined in a stub file,
[00:30:47] Unknown:
that mypy reads in just like any other file you would write. Yeah. In fact, all of them are there. There's a handful of very special behaviors for a handful of built-in types that have some special-case support in mypy. The great majority of everything you would say about a built-in type is said exactly the same way that you would say it about a type you write yourself. Yeah. It's basically just for types that you literally can't write down definition-wise. Like, True is,
[00:31:10] Unknown:
well, you can't use True as a variable name in Python 3. This is something that was fixed since Python 2. And that means that you can't write it down on the left-hand side of an equals sign in the stub file. And so that's why it's a special-cased built-in. But aside from that, there's no magic going on behind the scenes for the most part.
[00:31:31] Unknown:
And is that largely due to the fact that the type module in Python is sort of the basis for all other types, including custom classes?
[00:31:42] Unknown:
Yeah. It probably falls out of the fact that in Python, none of the built-in classes do anything especially magical. I mean, okay, maybe they're actually implemented in C, but the way they behave is just like any other Python class. And so that means that we can treat everything
[00:31:57] Unknown:
pretty much the same here. I noticed in the documentation that you had special casing for lists and dicts of string or int,
[00:32:09] Unknown:
versus any other type. It's possible you're seeing type aliases, actually, where you can give a name to a type. This is a fairly simple type, but if you have a list of strings and you use it everywhere, you could call it, I don't know, maybe there'd be some more specific name, but you could call it StringList. So you just say StringList equals List of str and then use StringList everywhere instead. This is more useful if what you really have is a list of tuples of strings and ints, where the third element of the tuple is a dict of this and that, and so on. So you have this big, complex nested type, and you can just make a type alias that refers to it and use that everywhere instead. Maybe the thing worth backing up and saying specifically is that we do support, and I'm very glad that we do, saying, for example,
[00:32:54] Unknown:
not only that, for this function, the types of its arguments are supposed to be: this argument is an int, this argument is a string. You can say this argument is a list. You could say it's a list of whatever. But more often, what you want to say is that it's a list of ints, or a list of strings, or a list of this class over there that I defined. It's a list of any type, really, that you can put there. In fact, most often when you write that a function takes a list as an argument, you're going to want to say what types the elements are supposed to be. And the same with a dict: you're going to say what types the keys and values are supposed to be. And so mypy gives you the ability to do that with what we call generics, much like many other languages that come with generics. Yeah. And we also have this cool thing in addition called type variables, which let you define more complex relationships,
[00:33:34] Unknown:
between arguments and other arguments, or arguments and return values. So for example, you could have a function that takes in a set of something and returns just an element from that set. And mypy would know that if you pass in a set of ints, then the function returns an int, and if you pass in a set of strings, it returns a string, etcetera.
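The three ideas in this answer (type aliases, generic containers, and type variables) can be sketched together as follows. The alias name Record and the function names are invented for the example.

```python
from typing import Dict, List, Set, Tuple, TypeVar

# A type alias is just an assignment that names a big nested type.
Record = Tuple[str, int, Dict[str, List[int]]]


def first_name(records: List[Record]) -> str:
    # List[Record] states what the elements are, not just "a list".
    return records[0][0]


T = TypeVar("T")


def pick(items: Set[T]) -> T:
    # The same T appears in the argument and the result, so a checker
    # knows that a Set[int] gives back an int and a Set[str] a str.
    return next(iter(items))


print(first_name([("ada", 1, {"scores": [9, 10]})]), pick({7}))
```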
[00:33:51] Unknown:
And that sounds like a use case where you could compose some of the different types where, for instance, if you had a list that can and most likely will include both strings and ints, you would do a list where the type is a union of string and int.
[00:34:05] Unknown:
Yeah. Right. Mhmm. Absolutely.
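The composition the host describes would look something like this sketch, with a made-up function name:

```python
from typing import List, Union


def render(cells: List[Union[str, int]]) -> str:
    # Each element may be a str or an int; str() handles both.
    return ", ".join(str(cell) for cell in cells)


print(render(["total", 3, "items"]))
```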
[00:34:07] Unknown:
So how has the inclusion of the typing module in the standard library influenced the evolution of mypy?
[00:34:13] Unknown:
My understanding is that the inclusion of the typing module in the Python standard library was a helpful way to make sure that PEP 484 actually got standardized at some point. I mean, it makes sense. There are a lot of opinions around what types should look like and what they could look like. And at some point, everyone needed to come to a consensus so people could start really building off of it. And I think saying, okay, typing is going to be included in 3.5, we'd better figure this out by then, really helped the standardization actually complete. So for that, I think it was pretty useful. In some other ways, it's a little bit unfortunate, in that because typing is part of the standard library, we can't add new things to it very quickly. A recent example was that we wanted to be able to refer to the type of a class, not just the type of an instance of a class. That required adding a new thing to typing, which was Type with a capital T. And that meant that we had to wait for 3.5.2 to come out before releasing the new version of the typing module and being able to properly support that feature. I guess, in general, zooming out a little bit, mypy and PEP 484 have coevolved. PEP 484 was strongly influenced by the existence of mypy, and seeing what sort of things work in practice is pretty important for determining what the standards could be. At the same time, it's not just mypy striking out on its own and saying, oh, yeah, we did this thing, now we're going to get it standardized. No. That actually has a pretty big effect on mypy. When we're considering a feature, we usually raise it with the more general typing community and say, hey, we're thinking of doing this. What are people's thoughts?
What form should it take? And so on. And so there's often quite a good amount of discussion before we implement things in mypy. And I'll say also, one thing that I'm very happy about with respect to PEP 484 and the typing module is that it is a standard, and one that is really
[00:36:05] Unknown:
a working one that other people implement. So when you write down these type annotations, which people can read and which the mypy type checker will check for you, they're also understandable to PyCharm, so that it can give you its IDE features when you're in some other part of the code base, because it knows exactly what's going on with this type from the annotations you've written down. And they're understandable to other tools people are writing for analyzing Python code, including pytype, which is a project from Google, and Semmle, which is a code analysis tool. In general, for things people write that want to analyze Python code, there's this standard, and people are picking it up and implementing it. So you write down the types once in a standard way, and they're useful to every tool that wants to understand the Python code. And that's really powerful.
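The Type addition mentioned earlier in this answer is for functions that take a class itself as an argument. A minimal sketch, using invented class and function names:

```python
from typing import Type, TypeVar


class User:
    def __init__(self, name: str) -> None:
        self.name = name


class Admin(User):
    pass


U = TypeVar("U", bound=User)


def make(cls: Type[U], name: str) -> U:
    # cls is a class object (User or a subclass), not an instance.
    # The result is an instance of whichever class was passed in.
    return cls(name)


print(type(make(Admin, "root")).__name__)
```

Without Type, the cls parameter could only be annotated loosely, and a checker could not tell that make(Admin, ...) returns an Admin.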
[00:36:50] Unknown:
Yeah. I think that by sort of shoehorning the typing module into Python 3.5, it was a good way for Guido to force the hand of anyone who had any sort of opinion, because it's all too easy for a discussion, particularly about types, to get mired down in academic discourse and never really end up going anywhere in particular. But giving a deadline forced people to really voice their opinions and, if they had any strong opposition, to make that known.
[00:37:17] Unknown:
So they could have their way. And all the people that were really most invested in making this good came to the table. We had very useful discussions,
[00:37:26] Unknown:
and continue to do so, especially with the PyCharm folks, who are very keen on this, and that's certainly made it better. I think that's one of the aspects of Python and its community that really, really shines. Right? In some other language communities, that would have never happened. There would be 15 different incompatible versions of type annotations, everybody would insist to the last that their way is superior, the language would never get a standard, and, ultimately, it would never gain widespread adoption. And this is proving that there are advantages to having this sort of: this is the way we're going to do it. Okay. Now let's move forward.
[00:38:02] Unknown:
And I think the other piece of brilliance there too is that just because the typing module is present doesn't preclude any other particular syntax because function annotations can still be interpreted however you want depending on what tool you're leveraging to analyze them.
[00:38:18] Unknown:
Yeah. You still have the opt-out. That's right. Although technically they are specified now: PEP 484 does say that, unless you explicitly opt out, function annotations are supposed to have this particular meaning. Yeah. But you have a way to opt out. You also don't have to run a type checker. Yeah. And also, Python itself doesn't really do very much with them at all. So as long as they're valid Python expressions, you can still have them be whatever you want, and you're not practically going to run into any problem. Although I probably wouldn't advise that, because it's now against the language standard. Yeah.
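Both points can be seen in a short sketch: the interpreter stores annotations without enforcing them, and the no_type_check decorator from the typing module is one of the opt-outs PEP 484 defines. The function names are invented for the example.

```python
from typing import no_type_check


def area(width: float, height: float) -> float:
    return width * height


# Python only records the annotations; it does not enforce them at runtime.
print(area.__annotations__["return"] is float)
print(area("ab", 2))  # runs anyway; a type checker would flag this call


@no_type_check
def legacy(x: "anything goes", y: 42) -> "not a type":
    # Annotations here are arbitrary expressions, and the decorator tells
    # checkers not to interpret them as types.
    return (x, y)


print(legacy(1, 2))
```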
[00:38:51] Unknown:
So did the inclusion of multiple inheritance add any implementation complexity to mypy?
[00:38:57] Unknown:
Yeah. Some. I think, as in every language that has multiple inheritance, it makes things more complex. Python is no exception. Many years ago, when that was added to Python, the MRO, the method resolution order, was a good idea. That makes it simpler, but it's still complicated. It's a thing that anything working on Python code has to understand and deal with.
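For reference, this is the kind of lookup order a tool has to reproduce to know which method a call resolves to. The class names are invented for the example.

```python
class Base:
    def who(self) -> str:
        return "Base"


class Left(Base):
    def who(self) -> str:
        return "Left"


class Right(Base):
    def who(self) -> str:
        return "Right"


class Child(Left, Right):
    pass


# The method resolution order decides which who() an instance gets.
print([cls.__name__ for cls in Child.__mro__])
print(Child().who())
```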
[00:39:16] Unknown:
Are you guys aware of any formal studies that have been performed to research the ergonomics or efficiency gains of static or gradual type systems in programming language communities?
[00:39:24] Unknown:
Yeah. This is sort of a funny area. For decades, I think, there's been a line of research in academia on this. I think it's always been a tough subject to research in an academic way, because the reality of how a system like this, one that makes code easier to understand, is beneficial is something that doesn't show up in a few hours, in your first few hours working with a language that was just made up. The real benefits come in your second week of using it a lot, which is perfectly fine for a programming language or an API or systems like it, where you learn a thing and you work in it for years. But it makes it very hard to do a study in the standard academic way. So people do them, but I think it's hard to draw much of a conclusion from them most of the time. And I think cross-language comparisons are, in general, pretty difficult. There's a study that I've seen a number of times,
[00:40:16] Unknown:
mentioned reasonably frequently, actually, that says, oh, there isn't much benefit from static typing. But the way that they've drawn that conclusion is, well, we looked at C and C++ and Java, and, oh, wow, on average those are more verbose and have a similar amount of bugs to Python and Ruby. It's like, well, wait a second. That's not the static types. There's a lot more going on; you really can't make that comparison. Yeah. So I think there are also some pretty unfortunate studies. And there are other studies that conclude that it totally is beneficial, but even for those I think it's hard to really do it in the usual academic fashion. What I
[00:40:52] Unknown:
do think is really quite informative is this. Before we invested all this time, we knew from our own experience, and the experience of many other engineers we talked to and respected, that it seemed like this would be really valuable. And then we did build it and make it real, starting really this year, making it a practical thing you could use on a large code base. And we've had more and more people, especially at Dropbox originally and now people outside as well, coming to use it. And so just three months, two months even, after that really began, we did a little survey here at Dropbox of people that had started using types and asked them what they thought. We made it an anonymous survey; we didn't even know who was giving which answers. And people were overwhelmingly very glad that they had done it. They overwhelmingly said that it made the code easier to read, made the code easier to understand, made them more productive working in it, and just made the code better. It was even more positive than we hoped for. And so I think that's a very real kind of evidence: people with a wide range of backgrounds, people who have spent a lot of time writing code, people who spend a lot of time working in the very same code bases that they're now adopting types in, say, yeah, this makes me a lot more productive, makes me a lot happier, makes the code better. And I think that the same
[00:42:08] Unknown:
views are what's driving more and more people at Dropbox to use it, and also, as people outside try it, to keep using it and adopt it more widely on their code. Yeah. I mean, here at Dropbox, we're not going out and annotating large swaths of code ourselves. We're building the tool, and people are coming to us basically saying, hey, I'm excited; when can I start using this in my code? What do I need to do to get started? And the adoption is really just taking off without us
[00:42:34] Unknown:
having to go and do the annotation work ourselves. Yeah. One person on a team will start using it, and they'll go and show it to all their teammates and teach them how to use it and get them using it, and then everybody on the team will be using it and will be coming and asking questions and saying, hey, can you support this feature? Or, hey, here's a pull request to fix a bug I found, or add a feature I wanted, etcetera. And so that's very, very encouraging and very satisfying.
[00:43:01] Unknown:
Yeah. It would be interesting to see some more formal academic studies being done to compare usage of programming languages that both optionally have static type systems or can also be used in a more dynamic manner to have those types of comparisons. So Python with the addition of MyPy and then also JavaScript with the addition of TypeScript, for instance.
[00:43:23] Unknown:
Yeah. If somebody can do a study that really is persuasive and provides that kind of evidence, that'd be great.
[00:43:29] Unknown:
I think it is really hard to do in the standard academic forms. Yeah. But the fact that mypy is now just an optional static type system directly on Python could potentially enable those studies in a way that wasn't possible before. Because even TypeScript has some nontrivial differences from JavaScript in addition to the static types. Yeah. Whereas mypy is really Python. I mean, when you run the code, you're just running normal Python, but you have this additional static type layer.
[00:43:55] Unknown:
So what does the future roadmap for MyPy look like?
[00:43:59] Unknown:
So the first thing that's coming up here, which is something that I'm pretty excited about, is what we call strict None checking. This is a work-in-progress feature that will hopefully come out in the next, say, month, roughly. Basically, what it does is let you specify in the type system whether something can ever be None or not. And if you're trying to use, say, a variable that can be None, mypy will warn you if you don't check for None before using it. This basically gives you the tools to eliminate almost all NoneType errors, which I think is pretty exciting. So that's one of the last few larger changes to the type system that I think we'll see. For that reason, it's important to get it in soon. And, also, I think it just has pretty high potential.
[00:44:43] Unknown:
Yeah. If you look at Dropbox in production, on our giant front-end web app, at the exceptions that get thrown, a large fraction are precisely this kind of bug, where it's an AttributeError: 'NoneType' object has no attribute whatever.
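A sketch of the kind of code strict None checking is aimed at, assuming the spelling Optional[str] from the typing module for "either a str or None". The function names and the sample data are invented for the example.

```python
from typing import Dict, Optional


def find_email(users: Dict[str, str], name: str) -> Optional[str]:
    # dict.get returns None when the key is missing, so the return
    # type has to admit None.
    return users.get(name)


def email_domain(users: Dict[str, str], name: str) -> str:
    email = find_email(users, name)
    # Without this check, email.split(...) could fail at runtime with the
    # AttributeError mentioned above; a strict checker would insist on it.
    if email is None:
        return "unknown"
    return email.split("@")[1]


users = {"ada": "ada@example.com"}
print(email_domain(users, "ada"), email_domain(users, "bob"))
```

The declared return type of email_domain is a plain str, which tells the reader that no None check is needed on its result.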
[00:44:58] Unknown:
Right. Because this thing can be set to None. And the difficulty with debugging these normally is that they can be introduced at many, many places in the code base. You don't know which of these many functions along the way returns None or assigns it to this attribute, and so on. With this strict None checking, you'll get an error right there. If you have a function that can return None and you try to assign it to an attribute that's not allowed to be None, you'll just get an error right on that line, and you'll be able to say, uh-huh, here's where something's going wrong. I need to check if this is None and do the appropriate thing. Or maybe the attribute actually does need to be None, but then mypy will tell you every time you use it: if you're using it in a place where it really had better not be None, mypy will make sure you've checked for it. And you can also see this through the lens, again, of understanding the code better, in that now you know
[00:45:45] Unknown:
not only, oh, this is supposed to be a string versus an instance of this class I defined over here. You'll actually know for sure: do I need to check this for None? Because it'll either be, it's spelled Optional, Optional[str], meaning that it's either a string or None, or it'll be straight-up str. And if it's straight-up str, it cannot be None,
[00:46:03] Unknown:
in any type-checked code. So that's, I guess, the first big thing on the horizon. We're also doing a bunch of perf work. We have a really big code base that we want to type check here, and so we're making things faster. I think probably the most groundbreaking feature there is this incremental mode that we currently have in a somewhat experimental stage, which basically caches the results after every run. So if you just change one file and you rerun mypy, it won't have to go and type check all of your other code. And so hopefully, it should be able to be really blazing fast. There's also still a fair amount of miscellaneous bugs and features that we're working on. Python has a lot of interesting behavior in various cases.
And so that's basically going to be something that, for the foreseeable future, we're always going to improve. We definitely encourage people, if they use mypy and they run into some issue: don't worry. Post on the issue tracker, and we will figure out what's going on and fix it. And I guess further down the line, we have some exciting things like editor integration and IDE-like features. So jump to definition, or maybe some sort of refactoring support, find occurrences, things like that. Or just being able to highlight some expression in your editor and say, hey, give me the type of this. We think that's something that would be pretty exciting. And then also far down the line, we want to have a sort of plugin system. So if you have, say, a protobuf specification.
[00:47:22] Unknown:
Or a Django model or a SQL alchemy call. Yeah. You can That you'd want. For example, say you define a Django model. Right? This is this has this is a thing that has a lot of structure. You went you went to find this this class that inherits from Django model, whatever. And you go and define there's a particular way you define it as some fields, some columns. And then when you go and have an instance of this, like, a row from this model, you'll have particular attributes, and they'll have particular types. And that all comes in a very predictable way from the the definition you gave to that class. But the relationship between that class definition and the types you get on the instances is is sort of a particular choice that's made in the genuine implementation. It's not a thing. It'd be very complicated to to have a type system itself be rich enough in a general way to just write that down in the type system. You could you know, when people, like, push really hard on type theory, they'll devise type systems where you can write all that stuff down. Like, in Haskell, you could write down down all kinds of stuff like that. But that's not I think it actually makes sense for for for Python or or a Python type system. And so what we expect to do instead is have a system for plug ins where corresponding to the sort of Django model itself or SQLAlchemy or protocol buffers or or other things that have this this kind of, like, particular structure relationship that's distinctive between, like, a definition somewhere and the types you get elsewhere to correspondingly have sort of a type level definition of that. To have, well, there's the the Django model like actual implementation, and there's some code that corresponds to that that, you know, initially, like, we'd be writing this stuff. 
But in the long run, I think in a world with typing everywhere, the Django developers would have the Django mypy plugin, and the SQLAlchemy developers would have the SQLAlchemy plugin, and so on, which describes: okay, when the type checker sees the thing that is, like, inheriting from this base class or whatever, it goes and invokes this plugin, which is just code that runs in the type checker. You know, this is all very much on the drawing board. We haven't defined what this API will be, but it'd be some kind of API with the type checker to be able to describe what the resulting types are gonna be from, like, looking at the definition of that class. And that, I think, will be really crucial for a lot of things that happen in many Python applications, including at Dropbox. We have certainly SQLAlchemy, certainly protocol buffers, certainly some other things that we have internally that are sort of ORM-like. We're gonna very much want something along the lines of this kind of plugin system.
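To make the problem described above concrete, here is a minimal sketch of an ORM-like base class. Everything in it (`Column`, `Model`, `User`, `_columns`) is hypothetical and written for illustration; it is not Django's or SQLAlchemy's API, and it does not show any Mypy plugin interface, which the speakers say had not been designed yet. It only shows why a checker that reads the class body alone would see the wrong type for instance attributes.

```python
# Hypothetical mini-ORM for illustration only; not a real library's API.
from typing import Any, Dict


class Column:
    """Declares a column and the Python type its values should have."""

    def __init__(self, python_type: type) -> None:
        self.python_type = python_type


class Model:
    """Base class whose instances get one attribute per declared Column."""

    def __init__(self, **values: Any) -> None:
        for name, column in self._columns().items():
            value = values[name]
            if not isinstance(value, column.python_type):
                raise TypeError(f"{name} must be {column.python_type.__name__}")
            # On the class, `name` is a Column; on the instance it becomes
            # a plain value of the column's declared type.
            setattr(self, name, value)

    @classmethod
    def _columns(cls) -> Dict[str, Column]:
        # Only looks at attributes declared directly on this class.
        return {k: v for k, v in vars(cls).items() if isinstance(v, Column)}


class User(Model):
    name = Column(str)
    age = Column(int)


user = User(name="Ada", age=36)
```

Reading only the body of `User`, a static checker would infer that `user.name` is a `Column`, while at runtime it is a `str`. The mapping from declaration to instance type lives in `Model.__init__`, which is the kind of library-specific rule the speakers suggest a plugin could describe to the type checker.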
[00:49:35] Unknown:
So are there any questions that we didn't ask that you think we should have or any other topics that you'd like to bring up before we move on?
[00:49:42] Unknown:
I think we covered it. Yeah. I'll just say, yeah, we're very much excited at Dropbox about the use of static types, and more and more of our code has static types on it. I was saying at PyCon just a few weeks ago that we had over 75,000 lines of annotated code. We're actually, I think, just about to break 100,000 lines, probably in the next few days, as it keeps going up and up. Again, none of us on the mypy team are doing this; a large number of other people at Dropbox are picking this up every day, and many people outside as well, including in open source projects like Zulip and Tornado, and other companies. And I would encourage anybody listening to this to try it out, try using it on your code, and I think you'll like the results. And you can install it straight from pip. The package, for historical reasons, is called mypy-lang.
So pip install mypy-lang in Python 3. And certainly any kind of bug you run into, any kind of feature request, anything you wish it would do, any questions you have, please come to the issue tracker, github.com/python/mypy.
[00:50:42] Unknown:
And we're very responsive there and always love to hear from people. Yeah. And just, Greg said it kinda quickly, but just to be clear, it's a Python 3 thing. So you'd better do, like, pip3 probably, or python3 -m pip install mypy-lang. Yeah. And just to underline, we said this earlier, but it works great
[00:50:59] Unknown:
on Python 2 code as well as Python 3 code. At Dropbox, it's almost all Python 2 code, and we're very happy with it. MyPy itself runs under Python 3 and then analyzes the Python 2 or 3 code.
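As a rough illustration of how code can carry types and still be valid Python 2, here is a small sketch using the comment-based annotation syntax from PEP 484. The function name `total_length` is made up for this example. Under Python 2 the `typing` import would need the backport package; the exact Mypy option for selecting Python 2 mode is not stated in the conversation, so it is left out here.

```python
from typing import List


def total_length(words):
    # type: (List[str]) -> int
    """Return the combined length of all the given words."""
    # The signature lives in a comment, so the Python 2 interpreter ignores it,
    # while a type checker that understands PEP 484 can read it.
    return sum(len(word) for word in words)


result = total_length(["my", "py"])
```

The interpreter treats the annotation as an ordinary comment, so runtime behavior is unchanged; a call such as `total_length(42)` is the kind of mistake a checker could flag before the code runs.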
[00:51:09] Unknown:
Are there any areas that our listeners can help with that the project would appreciate contributions
[00:51:15] Unknown:
from? Yeah. I mean, there's always a lot of work to be done. We have a number of issues that are marked, kind of, difficulty easy. Easy doesn't mean easy easy, but it means easier at least. Yeah. And relatively good to do as the first thing you might do. Yeah. Definitely come talk to us first. We're always excited about new contributors. We have
[00:51:37] Unknown:
an IRC channel right now. We're thinking about switching to Gitter maybe. Yeah. It's all described right at the root of the repo on GitHub. There is a contributing file, so you can read there all about how to get involved. So anybody excited about getting involved, please take a look at that. Please talk to us by any of those means. I think actually for most people, the best first step is to just download and install it and start trying it on some of your own code. And I think generally you'll pretty quickly run into better questions about how it works, or more things you wish it would do, or maybe even a bug. And that's a great hook for getting started and getting involved.
[00:52:13] Unknown:
Yeah. There's some interest in that on my team at work to get it running on some of our projects. And I'd also like to say that it's very great and forward thinking of Dropbox to invest in the development of Mypy, understanding the returns that they're going to have in terms of developer efficiency.
[00:52:31] Unknown:
Yeah. That's right. I feel exactly the same way. I'm very glad that we're doing it, And we have hundreds of engineers working in Python every day, so I think the returns are actually very, very strong for Dropbox and, like, very strong fundamental reasons why it's a good choice that Dropbox is doing this. But a lot of companies might not be that forward thinking. I'm very glad that we're doing it.
[00:52:48] Unknown:
So for anybody who wants to follow you guys in particular and keep in touch with what you're doing, what would be the best way for them to do that? David, how about you go first?
[00:52:57] Unknown:
Follow me on GitHub, probably. I actually don't have that much of an online presence at the moment. Not a problem. What's your GitHub username? I'm ddfisher. So two d's and then Fisher, like a fisherman.
[00:53:09] Unknown:
Okay. Great. And, Greg, how about you? Yeah. I have a pretty similar answer. I'm not, like, an active Twitter user or anything. You can find me certainly on GitHub at gnprice. That's g, n as in November. Also, if you just do a Google search for my name, you'll find contact information and other things I've done on my web page.
[00:53:33] Unknown:
Great. So with that, I will move us on into the picks. My first pick this week is an episode of the podcast Functional Geekery, where he interviewed Andreas Stefik. And the subject that they discussed was actually the history of formal studies about the various human factors of development. So things including the ergonomics of programming languages and type systems and various other things like that. It was a very interesting episode. It had a lot of really interesting pieces of information and references to different studies, so it's definitely worth taking a listen to and seems particularly relevant to our discussion today. And another podcast that I started listening to recently is the Soft Skills Engineering podcast, and that's a couple of developers who have been long-time members of the Ruby Rogues podcast. And they started a show where they take listener questions having to do with various aspects of being a software engineer. Then they do their best to answer the questions, and there's generally a decent amount of humor involved. So definitely worth taking a listen to. And I will pass it on to you, Chris.
[00:54:37] Unknown:
Thanks. My first pick is a beer, unsurprisingly. It's from Grimm Artisanal Ales. I've gushed about their beers before, and the most recent one that I tried is called Lucky Cloud. It's this really interesting, it almost kind of looks goldish, but with this almost luminescent yellow tinge to it. Their beers are always very interesting in terms of color and shade, and they're also darn tasty. This one is a puckeringly sour, dry-hopped beer with all these really interesting fruity notes because of the hops that were used. I think it's New Zealand hops. I'm not sure. But really tasty stuff. Definitely check it out.
My next pick is a tool. I've been editing a lot of JSON lately, and this thing makes it easy to slice it and dice it on the command line: syntax check it, parse it, you name it. It's called jq
[00:55:33] Unknown:
and if you're working in JSON frequently, you really should check it out. It's impressive stuff. And that's it for me. David, what do you have for us for picks? So my first pick is a command line utility called fzf, which is a command line fuzzy finder. And I've actually gone through quite a large number of different fuzzy finders for accessing things through my editor and also just being able to quickly pull up files from a large repository in my terminal. And fzf has actually been by far the best that I've tried out of that quite large number. It's just really, really great to be able to, you know, if I know some file name and it's nested 5 directories deep in my project, I just start up fzf, type the file name, and boom, I'm right there. And I found that to be really, really huge. So that I've been very excited about. In fact, David got me using fzf a few months ago, especially in my editor. I'm very glad to be using it. It's very fast and,
[00:56:22] Unknown:
I just this week was making a change that involved, like, going to dozens of different files and making small fixes in them, and it was a lot faster than it would have been if I had to type out the entire names or, like, tab-complete in the traditional way. And it has editor integration for, I think, a bunch of different editors. So definitely recommend.
[00:56:36] Unknown:
My second pick is Thinking, Fast and Slow by Daniel Kahneman. It's a great exposition of human cognition, basically, and partly all the ways in which it breaks down, and it breaks down pretty badly. But it's also great as sort of an introspective book for understanding how, you know, how I work. It's not really a self-help book. It doesn't really provide much hope per se, but it still gives a lot of interesting insight. And my last pick is another book, which is Ringworld by Larry Niven. It's a real sci-fi classic in my opinion. It's kind of, I guess, an exploration story on a massive, massive scale. And I think it does a really good job of conveying a sense of incredible scale, which I haven't gotten so much out of other books.
[00:57:25] Unknown:
Yeah. I'll second your choice on that. The entire series of Ringworld novels are pretty incredible. Yeah. 30. And I think it's also a really interesting exploration of of human behavior
[00:57:36] Unknown:
strangely enough given the alien landscape that the people find themselves in.
[00:57:42] Unknown:
Cool. Alright. So a couple of things for you. The first actually is an essay written by a mathematician named Bill Thurston, who was one of the greatest mathematicians of the 20th century. And it's called On Proof and Progress in Mathematics. It's sort of about mathematics, but it's really about how humans progress in learning things: how humanity comes to know more and understand more, how people communicate with each other as they learn more things, and how difficult it can be to convey the complex ways of thinking we have in our heads to other people. And so I think it's a really good reflection on all of that.
And as a sort of special bonus, although he's a mathematician, it has some remarks about software specifically, which are surprisingly spot on. I'll just give a little teaser: he says that programming is harder than math to get right. So that's great. I encourage everybody to read it. It's only, like, 17 pages long, and you can find it on the web. The other thing I recommend is a book, I guess also a science fiction book, by Greg Egan, who's an author in Australia, still active today. And what I especially like from Egan is the short stories. I'll recommend particularly his first collection, called Axiomatic. What runs through a lot of Egan's work, and in particular all the stories in Axiomatic, is thinking through personal identity and memory, and how that changes when people can be on computers, or computers can be behaving like people, or where technology starts getting intimately tied with our brains. And it has some really interesting reflections on it. I think Egan is one of the better thinkers that we have writing about that. So that's Axiomatic, the first short story collection by Egan. Great. I really appreciate the both of you taking the time out of your day to join us and tell us all more about Mypy,
[00:59:35] Unknown:
and I definitely intend to take a closer look at it and start using it in more of my own projects. So thank you for that. Cool. Thank you. Yeah. It's something that we're always super excited about.
[00:59:46] Unknown:
Super excited to talk to people about. Super excited for people to start using. So thanks. Thanks, guys. Good night. Bye. Good night.
Introduction and Sponsor Mentions
Interview with David Fisher and Greg Price
Guest Introductions
Introduction to Python
Early Experiences with Python
What is MyPy?
Understanding Static Types
Challenges in Large Codebases
Benefits for Small Projects
Compilation and Type Checking
Catching Errors Early
Technical Challenges in MyPy
Gradual Rollout of Types
Adoption Strategies
Limitations and Syntax
Type Theory and Influences
Union Types and Overloading
User Defined Types
Generics and Type Variables
Typing Module in Standard Library
Multiple Inheritance
Studies on Static Typing
Future Roadmap for MyPy
Community Contributions
How to Get Involved
Picks and Recommendations