Summary
The divide between Python 2 and 3 lasted a long time, and in recent years all of the new features were added to version 3. To help bridge the gap and extend the viability of version 2, Naftali Harris created Tauthon, a fork of Python 2 that backports features from Python 3. In this episode he explains his motivation for creating it, the process of maintaining it and backporting features, and the ways that it is being used by developers who are unable to make the leap. This was an interesting look at how things might have been if the elusive Python 2.8 had been created as a more gentle transition.
Announcements
- Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
- When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $60 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show!
- You listen to this show because you love Python and want to keep your skills up to date, and machine learning is finding its way into every aspect of software engineering. Springboard has partnered with us to help you take the next step in your career by offering a scholarship to their Machine Learning Engineering career track program. In this online, project-based course every student is paired with a Machine Learning expert who provides unlimited 1:1 mentorship support throughout the program via video conferences. You’ll build up your portfolio of machine learning projects and gain hands-on experience in writing machine learning algorithms, deploying models into production, and managing the lifecycle of a deep learning prototype. Springboard offers a job guarantee, meaning that you don’t have to pay for the program until you get a job in the space. Podcast.__init__ is exclusively offering listeners 20 scholarships of $500 to eligible applicants. It only takes 10 minutes and there’s no obligation. Go to pythonpodcast.com/springboard and apply today! Make sure to use the code AISPRINGBOARD when you enroll.
- Your host as usual is Tobias Macey and today I’m interviewing Naftali Harris about his work on Tauthon, a fork of Python 2 that backports features from Python 3.
Interview
- Introductions
- How did you get introduced to Python?
- Can you start by describing what Tauthon is and your motivations for creating it?
- What’s the story behind the name?
- What types of applications and environments are you using Tauthon in?
- How much adoption of Tauthon have you seen?
- What are some of the different ways that your users are employing it?
- Is this the missing "2.8" release? In other words, is this intended to be a bridge for simplifying the migration of existing Python 2 code to Python 3, or as an extended support window for Python 2?
- What features have you backported from Python 3?
- What is your process for identifying and prioritizing features to bring into Tauthon?
- What is your workflow for implementing the backported functionality in Tauthon?
- What are some of the cases where you have had to compromise on the functionality or syntax of a feature that you have backported in order to fit into Python 2?
- What is your governing philosophy for how to manage syntax or behavior differences between Python 2 and 3?
- What have been the most challenging features to backport and maintain?
- What are some of the ways that Tauthon might break existing Python 2 code?
- What is the story for compatibility with libraries that are Python 3 only?
- What have you seen in terms of adoption of Tauthon?
- Do you have any sense of the commonalities among those users?
- What are some of the ecosystem challenges that face users of Tauthon? (e.g. pip support, package compatibility, etc.)
- What are some of the most interesting, unexpected, or challenging lessons that you have learned in the process of creating and maintaining Tauthon?
- What are your long-term plans for Tauthon, and how have they changed since you first started working on it?
Keep In Touch
- Website
- @naftaliharris on Twitter
- naftaliharris on GitHub
Picks
- Tobias
- Naftali
Links
- Tauthon
- Function Annotations
- Tau
- Nick Coghlan
- MyPy
- Matrix Multiplication Operator
- Python 3.9 PEG Parser
- lazysorted
- nonlocal keyword
- Valgrind
The intro and outro music is from Requiem for a Fish by The Freak Fandango Orchestra / CC BY-SA
Hello, and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you're ready to launch your next app or want to try a project you hear about on the show, you'll need somewhere to deploy it. So take a look at our friends over at Linode. With 200 gigabit private networking, node balancers, a 40 gigabit public network, fast object storage, and a brand new managed Kubernetes platform, all controlled by a convenient API, you've got everything you need to scale up. And for your tasks that need fast computations, such as training machine learning models or running your CI and CD pipelines, they've got dedicated CPU and GPU instances. Go to pythonpodcast.com/linode, that's L-I-N-O-D-E, today to get a $20 credit and launch a new server in under a minute. And don't forget to thank them for their continued support of this show.
You listen to this show because you love Python and want to keep your skills up to date, and machine learning is finding its way into every aspect of software engineering. Springboard has partnered with us to help you take the next step in your career by offering a scholarship to their Machine Learning Engineering career track program. In this online, project-based course, every student is paired with a machine learning expert who provides unlimited one-to-one mentorship support throughout the program via video conferences. You'll build up your portfolio of machine learning projects and gain hands-on experience in writing machine learning algorithms, deploying models into production, and managing the lifecycle of a deep learning prototype.
Springboard offers a job guarantee, meaning that you don't have to pay for the program until you get a job in the space. Podcast.__init__ is exclusively offering listeners 20 scholarships of $500 to eligible applicants. It only takes 10 minutes, and there's no obligation. Go to pythonpodcast.com/springboard and apply today, and make sure to use the code AISPRINGBOARD when you enroll. Your host as usual is Tobias Macey. And today, I'm interviewing Naftali Harris about his work on Tauthon, a fork of Python 2 that backports features from Python 3. So Naftali, can you start by introducing yourself?
[00:02:08] Unknown:
Hi everybody. I'm Naftali Harris, super excited to be on the show today. I've been writing code in Python for 10 or 12 years and I'm super excited to tell you about Tauthon. And do you remember how you first got introduced to Python? I do, actually. It was the first real programming language that I learned after getting my start programming TI calculators. I learned it in high school, in 9th or 10th grade.
[00:02:32] Unknown:
And the real motivation was I wanted to write a chess engine. So I heard about this new Python programming language and picked it up. And so now you have been using it for a while, and you ended up creating this fork of Python 2 in the form of Tauthon. I'm wondering if you can just start by giving a bit of a description about what it is and what your motivation was for creating it. Yeah. So Tauthon is a fork of Python 2.7
[00:02:57] Unknown:
that is totally backwards compatible with Python 2, but nonetheless includes a lot of the exciting new features from Python 3 that a lot of people will move to Python 3 for. So, essentially, you can take your Python 2 code, run it exactly as is, but start using some of the exciting new features from Python 3 such as function annotations, the matrix multiplication operator, argumentless super,
[00:03:18] Unknown:
async/await, stuff like that. And I'm assuming that the name is a bit of a joke about
[00:03:24] Unknown:
tau being twice pi, but I'm wondering if you can give a bit more of the story behind how you selected it. Yeah. Well, I actually have to give credit to Nick Coghlan, who, if I recall correctly, is the one that actually suggested it as the name. But that's exactly right. The name actually comes from tau, which is 2 times pi, and I think there's a lot of interesting stuff in there. If you look at the actual digits, tau equaling 2 pi, you actually have a 6.28 in there. And so the 28 is maybe a little bit of a, you know, tongue-in-cheek joke. If there were to be a Python version, 2.8 would probably be the one to have it be.
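The arithmetic behind the name can be checked in a couple of lines. This is a minimal sketch that assumes a Python 3.6+ interpreter, where the standard library exposes `math.tau`; the variable names are only for illustration.

```python
import math

# math.tau was added to the standard library in Python 3.6.
tau = math.tau

# Tau is defined as twice pi.
assert tau == 2 * math.pi

# Formatted to two decimal places, the digits after the point are "28".
tau_two_places = f"{tau:.2f}"
print(tau_two_places)
```

The printed value starts with 6.28, which is where the "2.8" joke in the answer above comes from.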
[00:03:59] Unknown:
And so in terms of the use cases and the ways that you're using Tauthon yourself or seeing it used by others, what are some of the main environments or types of applications that it's being employed with? Well, I run it personally on my personal laptop. Over the years, I've just written a lot of different code, mostly targeting 2.7. And so I use it personally on my laptop just to run all my old scripts without having to upgrade them all. And obviously, the main appeal of it is that you can keep using your Python 2 code past the end of life and, as you said, be able to take advantage of some of the new features.
[00:04:34] Unknown:
But what is some of the feedback you've gotten from other people who are adopting Tauthon? I think people think it's pretty cool. I mean, I think the concept of it is appealing to a lot of different people that don't necessarily feel the need to upgrade their code to Python 3. I think the other thing is that it's sort of a proof of concept of an alternative future that we could have lived in. We don't right now. But I think it's a pretty convincing demonstration that, had we wanted to, we could have shipped a lot of the new features that people are excited about in Python 3 on top of Python 2 in a backwards compatible way. The main exception, of course, being the Unicode defaults for Python 3. And
[00:05:11] Unknown:
when Python 3 was still in the early stages and Python 2 still had a decent amount of life left in it, there was a lot of discussion and furor over the idea of a Python 2.8 that was the sort of missing bridge between Python 2 and Python 3, where instead Python 2.7 was the end of everything. And that caused a lot of people to have to go through fairly painful upgrade cycles to make their code either work in both Python 2 and 3 or jump directly to 3. And so is Tauthon, in some ways, the spiritual 2.8 release that never happened? Well, I certainly cannot call it Python 2.8,
[00:05:49] Unknown:
but I wouldn't even view it as a bridge. I would view it as really an alternative. If you think about it, the Python community over the last 10 years, the sort of main push that has been happening, all of the sort of political capital of Python has been poured into moving from Python 2 to Python 3. And, to be honest, I think that's largely been successful. For the last year or two, I think almost everything new is on Python 3. Finally, some of the legacy projects are migrating. But I think if you look over the last 10 years, the sort of main push from Python was really moving from 2 to 3. And I don't think that was the best use of everyone's time and efforts. I think instead of potentially making everybody upgrade their code to Python 3 and spending years and years doing that, I think we might have been better served by instead
[00:06:40] Unknown:
pursuing a different path, which was keeping the interpreter backwards compatible but nonetheless giving everybody new features that they could use. Really, a sort of different way that the community could have evolved. I don't think we saw that, in all candor, but I think that Tauthon would have been a sort of different way that we could have gone. And so now that the Python 2 support window is over and it's end of life, there are people who are discussing what that means for people who still have their Python 2 code bases and don't have the time or intent of porting them to Python 3. And there are discussions of the option for commercial companies to provide long term support releases and security patches for the Python 2 interpreters that people are still running. And I'm wondering what your thoughts are on the viability of
[00:07:25] Unknown:
Tauthon as an option for those people who do wanna keep using Python 2 and do wanna still be able to keep getting security fixes without having to go through the effort of porting. Yeah. I mean, I would say that if somebody wants to take Tauthon and run with it that way, I'd be super supportive. I'm not working on it full time right now. I'm a cofounder and CEO of a startup, which takes up the vast majority of my time as opposed to maintaining Tauthon. There are a group of maintainers that have stepped up to the plate to work on Tauthon, which I super appreciate.
But I do think that for organizations that want to continue running their Python 2 code and have that continue to work, Tauthon could be a viable option for them. And in terms of the overall feature set of Tauthon as it compares to Python 2.7,
[00:08:13] Unknown:
what are some of the main capabilities that you've backported from Python 3? And what is your process for determining which features to bring back and prioritizing
[00:08:22] Unknown:
the ordering of them, given that you don't have a full-time investment in it? Honestly, the things that we prioritize backporting are the ones that I think are the coolest features from Python 3. So some of the things that are included are function annotations, which I think is a really exciting idea, particularly for organizations that started by writing small code bases and then, as they grew bigger, are discovering that the dynamic typing can lead to type errors in production. So I think function annotations and mypy, which is associated with it, is a really exciting addition to Python 3 and something that we've backported into Tauthon. Keyword-only arguments is another really exciting one. I'm really excited about async and await, which I think is one of the coolest new things you can do in Python 3, which we've backported to Tauthon as well. There's also some new convenience things such as argumentless super or the new metaclass syntax that's available in Python 3. Personally, I think my pet favorite is actually the underscores in numeric literals, which is just really nice from a writing perspective. If you write 10000000, it's hard to tell if that's 1,000,000 or 100,000,000 unless there's the underscores. And so I find that really helpful personally. A lot of different quality-of-life improvements: the matrix multiplication operator, yield from, the nonlocal keyword, a whole bunch of stuff like that. Essentially, the idea is really anything that, if you read the articles like "Why upgrade to Python 3?", typically they'll say two things. They'll say number one, there's a lot of cool new features in Python 3. And number two, we clean up a lot of the mistakes we made in Python 2. Tauthon tries to do everything in category one and, of course, can't do any of the things in category two.
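For readers who have not used the features listed in this answer, here is a small sketch that exercises several of them together. It is written as ordinary Python 3 (3.7 or later, because of `asyncio.run`); the transcript says Tauthon accepts these constructs as well, but that is not verified here. Every class and function name below is hypothetical and exists only for illustration.

```python
import asyncio


class Matrix2x2:
    """Tiny 2x2 matrix, only here to demonstrate the @ operator."""

    def __init__(self, a: int, b: int, c: int, d: int) -> None:
        self.rows = ((a, b), (c, d))

    def __matmul__(self, other: "Matrix2x2") -> "Matrix2x2":
        (a, b), (c, d) = self.rows
        (e, f), (g, h) = other.rows
        return Matrix2x2(a * e + b * g, a * f + b * h,
                         c * e + d * g, c * f + d * h)


class Base:
    def describe(self) -> str:
        return "base"


class Child(Base):
    def describe(self) -> str:
        # Argumentless super(): no need to write super(Child, self).
        return "child of " + super().describe()


def scale(value: float, *, factor: float = 2.0) -> float:
    # Function annotations plus a keyword-only argument (after the bare *).
    return value * factor


def make_counter():
    count = 0

    def increment() -> int:
        nonlocal count  # rebind the variable in the enclosing scope
        count += 1
        return count

    return increment


def flatten(nested):
    for inner in nested:
        yield from inner  # delegate iteration to each inner iterable


async def double_later(x: int) -> int:
    await asyncio.sleep(0)  # hand control back to the event loop once
    return 2 * x


ten_million = 10_000_000  # underscores in numeric literals

identity = Matrix2x2(1, 0, 0, 1)
m = Matrix2x2(1, 2, 3, 4)
product = (m @ identity).rows

counter = make_counter()
counter()
second_count = counter()

flat = list(flatten([[1, 2], [3]]))
doubled = asyncio.run(double_later(21))
```

None of these constructs parse on stock Python 2.7, which is the gap the backports described here are meant to close.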
So really, in terms of prioritizing which things to actually backport, priorities have been what are the coolest features in Python 3, the ones that people are most excited about, and let's backport them into Tauthon. In terms of the actual effort of backporting those capabilities,
[00:10:12] Unknown:
particularly as Python 3 marches forward and continues to evolve, what have been some of the most challenging aspects of being able to keep that functionality running in Python 2.7? And how much has that difficulty changed or evolved as Python 3 continues to add new capabilities and add new changes to it, particularly with the upcoming 3.9 where they're introducing an entirely new parser? Yeah. I mean, I would say that,
[00:10:39] Unknown:
historically, it's actually been surprisingly easy. The way that I did this was by looking for the pull request in Python 3 that added the new functionality, looking at each of the different commits that did that, and then very carefully, by hand, applying those different commits onto the Python 2 codebase. So I really can't take very much credit at all for the code because I'm genuinely just taking work that the core Python developer group has done and taking those different changes that they made and carefully applying them to Python 2. It's not as easy as literally just doing git diff and piping it to git apply. You have to actually do it by hand. And so I wrote all the code by hand, but I had a really good starting point in the change requests that the core developers had already done. I will say that the core developers are an incredibly talented group. I look up to them all. And if any of you are listening to this, thank you very much for the work that you do. It's really incredible. And did you have
[00:11:34] Unknown:
much of a background prior to working on Tauthon in actually digging into the CPython code base and the interpreter, or any other related work?
[00:11:43] Unknown:
I've done some different C extensions for Python before. I actually really love the C programming language. I think it's pretty incredible. Probably the most relevant thing that I've done is I wrote a C extension called lazysorted, which works just like the sorted function in Python, except instead of actually sorting the list, it returns an object which is logically but not physically sorted. And so it actually will sort the list lazily. So, for example, if you just wanted the median from a list, you could sort the whole list, which takes n log n time, and then pick off the middle element. But actually, there are algorithms that will give you the median in linear time as opposed to n log n time. And so I wrote this C extension, lazysorted, which allows you to logically sort the list. So when you call lazysorted on something, it just takes the list, copies it, and tells you that it's sorted. It's not actually sorted, but when you request the median element it will sort the list just enough to actually figure out what that is and do that in linear time. So that was probably the most relevant thing I'd done prior, where I wasn't hacking on the Python interpreter itself,
[00:12:54] Unknown:
but I was working with the C code. And as far as bringing in these new capabilities, what are some of the things that you've had to do more research on or gain more of a foundational understanding of before you can comfortably and carefully bring in those capabilities? And what are some of the things that you've learned in the process of digging through the interpreter and bringing in that functionality?
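The lazysorted answer a moment earlier contrasts a full n log n sort with finding the median in linear time. One well-known way to get there is randomized quickselect, which runs in expected (not worst-case) linear time. The sketch below is a pure-Python illustration of that general technique; it is not lazysorted's actual code or API, and `select_kth` is a hypothetical name.

```python
import random


def select_kth(values, k):
    """Return the k-th smallest item (0-based) without fully sorting."""
    items = list(values)  # copy, so the caller's list is left untouched
    if not 0 <= k < len(items):
        raise IndexError("k out of range")
    lo, hi = 0, len(items) - 1
    while True:
        if lo == hi:
            return items[lo]
        pivot = items[random.randint(lo, hi)]
        # Three-way partition of items[lo..hi] around the pivot value.
        lt, i, gt = lo, lo, hi
        while i <= gt:
            if items[i] < pivot:
                items[lt], items[i] = items[i], items[lt]
                lt += 1
                i += 1
            elif items[i] > pivot:
                items[i], items[gt] = items[gt], items[i]
                gt -= 1
            else:
                i += 1
        # items[lo:lt] < pivot, items[lt:gt + 1] == pivot, the rest > pivot.
        if k < lt:
            hi = lt - 1      # answer lies in the "smaller" block
        elif k > gt:
            lo = gt + 1      # answer lies in the "larger" block
        else:
            return pivot     # k falls inside the block equal to the pivot


data = [7, 1, 5, 3, 9]
median = select_kth(data, len(data) // 2)
```

Only the side of the partition that contains position `k` is ever revisited, which is why the rest of the list can stay unsorted.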
[00:13:16] Unknown:
Well, I started really basic. If I recall correctly, I started with the underscores in numeric literals, or maybe with matrix multiplication. Those are things that are a little bit walled off from some of the more complicated changes like async and await. And I sort of learned the process of adding new things to the interpreter by starting a little bit smaller. I think since working on this project, I've learned a lot more about how the interpreter works. As I mentioned, I already had a lot of respect to start with, but I've gotten even more respect for the core developer team, which is really doing some incredible work. And I just sort of learned how the interpreter works overall. In terms of the
[00:13:56] Unknown:
overall capabilities of the Python 3 functionality, there are some instances where it's going to conflict with Python 2, either because of clashes in potential keyword usage or because the underlying functionality of Python 2 isn't exactly what the feature in Python 3 was built upon. So what are some of those cases where you've had to compromise
[00:14:18] Unknown:
on either the syntax or the feature set of one of the Python 3 capabilities that you're bringing back into Python 2? Yeah. I can give you a good example here. One of the cool things from Python 3 is finer-grained operating system errors. So for example, in Python 2, if you try to open a file that doesn't exist, you'll get an IOError and you have to check the errno from that to figure out that the actual operating system error is that the file doesn't exist. And in Python 3, it just throws a FileNotFoundError, which is a lot easier to work with, a lot more convenient, and a lot more semantically correct, I would argue. So in Tauthon, we want to be able to use that same fine-grained OS error, but do that in a way that's non-breaking with Python 2. And so what we did is actually introduce a new class of errors that you can catch, but not actually throw them the same way that they're thrown in Python 3. So, for example, in Python 3, if you open a file that doesn't exist, Python will throw a FileNotFoundError and you can then catch the FileNotFoundError. In Tauthon, if you try to open a file that doesn't exist, it'll throw an IOError, but you can actually catch it with a FileNotFoundError. So, sort of compromises like that, that maintain some of the old functionality of Python 2 but allow you to use some of the new features from 3. And in those cases where there is a potential conflict with how Python 2 operates, what has been your governing philosophy for how to manage the changes in the syntax or behavior as to how those features are presented in Python 3 and bringing them into Tauthon?
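The two error-handling styles contrasted in the answer above look roughly like this. The sketch runs on Python 3, where `IOError` is an alias of `OSError` and `FileNotFoundError` is a subclass of it, so both functions behave the same. It does not reproduce Tauthon's internals: the claim that Tauthon raises a plain `IOError` which an `except FileNotFoundError` clause still catches comes from the transcript. The function names and the file name are hypothetical.

```python
import errno
import os
import tempfile

# A path inside a fresh temporary directory, so it cannot already exist.
missing_path = os.path.join(tempfile.mkdtemp(), "does_not_exist.txt")


def open_python2_style(path):
    """Python 2 idiom: catch the broad error, then inspect its errno."""
    try:
        with open(path) as handle:
            return handle.read()
    except IOError as exc:
        if exc.errno == errno.ENOENT:  # "No such file or directory"
            return "missing"
        raise  # any other I/O problem is not ours to handle


def open_python3_style(path):
    """Python 3 idiom: catch the specific exception class directly."""
    try:
        with open(path) as handle:
            return handle.read()
    except FileNotFoundError:
        return "missing"


old_result = open_python2_style(missing_path)
new_result = open_python3_style(missing_path)
```

The second form drops the errno comparison entirely, which is the convenience the answer is pointing at.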
The core idea is keep the code backwards compatible. So everything that's done is with backwards compatibility in mind. The exceptions to backwards compatibility in Tauthon are incredibly pedantic. So, for example, the exceptions are things like if you literally check sys.version and you depend on it being literally 2.7, then obviously your code is going to break because we changed the version. Or, for example, if you depend on the abstract syntax tree, well, we changed that, so obviously that's not gonna work either. Or if you do things like depend on not being able to use async and await, and you expect an error to be thrown when you try using code like that. Obviously, since we introduced those new keywords, that's technically not backwards compatible, but in the most pedantic way possible. And you can't literally write code without making changes like that. But everything else will work just as is. I mean, the entire 2.7 test suite will pass, and the only places where it wouldn't are, again, where the abstract syntax tree has changed or stuff of that sort. In bringing the features from Python 3 into Tauthon, are you also backporting the test cases so that you can continue to have comfort in the forward capability of Tauthon as you bring in more changes and you wanna make sure that everything's running as expected? Yeah, of course. So if you look at the test suite for Tauthon, the bulk of it is stuff from 2.7.
And then there's another class of tests that come from Python 3 to test that the functionality actually works as it's supposed to. And then there's a third class of tests which test the parts of it that are sort of specific to Tauthon. So, for example, the nonlocal keyword, which is present in Python 3: you can also use it in Tauthon, but it's not technically a keyword. And the reason for that is we want to be backwards compatible with people that use nonlocal as a variable name. And so in Tauthon, you can actually use nonlocal both as a variable name and also to designate that a particular variable is actually nonlocal. And so in Tauthon, the tests include both the new tests about using nonlocal that are present in Python 3,
[00:17:55] Unknown:
but we also have tests that show that you can still use nonlocal as an actual variable name. Are there any cases where you have either been tempted to or actually gone through with implementing new functionality that's unique to Tauthon because of its continued support for the Python 2 ecosystem? Or is that something that you have consciously decided to not accept as either a new feature set or as requests from other people? No.
[00:18:23] Unknown:
It's been very clear. I mean, the mission of Tauthon is backporting stuff from Python 3 while maintaining backwards compatibility with Python 2. So there's no functionality in it that you wouldn't find in Python 3. We're not doing anything like removing the GIL or anything like that. That would be on my list personally, but I realize that's incredibly challenging. But, no, we're very focused.
[00:18:49] Unknown:
And as far as the maintenance of the features that you're back porting and the existing features, have there been any cases where you have had bug fixes that have been difficult to bring back or new reported errors because of conflicts with the Python 3 functionality and how it manifests in the Python 2 code base? Bringing back the bug fixes has been relatively straightforward.
[00:19:13] Unknown:
In fact, especially for some of the older features that were implemented earlier on, if you look at the commit history for any of that new functionality in Python 3, you'll typically find that a new feature is released, and then people discover bugs as it goes into the wild, and then there are bug fixes that happen afterwards. Again, credit to the core developer team that often these bugs are incredibly pedantic, but they're fixed nonetheless to make it as close to perfect as possible. And so in the process of backporting stuff to Tauthon, I would backport both the initial feature as well as those bug fixes over time. And in some cases you have to be a little bit careful in terms of, again, maintaining that backwards compatibility, but I was able to do that without too much difficulty. And have most of the features that you brought back been more core to the actual
[00:20:08] Unknown:
language itself and the interpreter? Or have you also been doing a fair bit of copying from the standard library to bring in some of the new capabilities, like the IPv6 support and things like that? The sort of approach that I took was initially starting with the core language itself, and then after that, doing the libraries. And the main reason for that, frankly, is that the libraries are written in Python 3.
[00:20:28] Unknown:
And if you start by backporting the libraries, you have to take a lot of the code, which is oftentimes using new functionality from Python 3, and you have to remove that functionality to make it work in Tauthon. So I didn't want to do that. Instead, we first worked on the core language, and then after that started backporting some of the different standard library features.
[00:20:52] Unknown:
And in terms of that compatibility with libraries that rely on Python 3 functionality, what is the story in Tauthon for being able to use some of those libraries that might be Python 3 only, particularly new things that have moved to being Python 3 only such as Django or NumPy? The support is actually
[00:21:10] Unknown:
relatively solid, I would say. I mean, Python 3 has some new things that Python 2 doesn't, obviously. And Tauthon was designed to run stuff in Python 2 as opposed to run stuff in Python 3. But, nonetheless, the actual support has been relatively solid, I would say. Like, most Python code that's written in Python 3, depending on which Python 3 it is, will run in Tauthon unless you're doing something that's pretty Python 3 specific. Or if you are, it'll run with a couple of reasonable changes. But again, this sort of focuses on the legacy code that's in Python 2. And are there any other elements of the surrounding Python ecosystem
[00:21:49] Unknown:
that have been challenging to make work with Tauthon? Maybe things like pip or some of the
[00:21:57] Unknown:
test capabilities or CI services that people might rely on for being able to verify their own code? Well, certainly, probably one of the biggest challenges has been just the distribution of Tauthon. Right now, you basically have to clone it from GitHub and then build it yourself and install it. And that's a challenge as opposed to, you know, installing it on Debian
[00:22:21] Unknown:
or with brew, you know, with your preferred package manager of choice. So I would say that's been a challenge for sure. And as far as the overall adoption, you mentioned that you have had some people who have stepped in to help with maintenance of it, and you've got a decent body of people who are using it for their own work. But what are some of the commonalities that you've seen among the people who have adopted it, whether it's shared industries or commonalities in terms of their background or regions or anything like that? I would say it's sort of a different
[00:22:53] Unknown:
kind of personality type perhaps, or a different sort of focus. For example, myself personally: my company works in financial services. We're in general very careful. We try hard to keep things working. And then on the other side, there's the "move fast and break things" sort of model. And I think developers oftentimes fall somewhere on that spectrum, where on one side, it's like we'll move slowly and deliberately and carefully, and on the other side, we will change things rapidly and always be living in the future. And that's where you have a different JavaScript framework every month. And I think some of the folks that are excited about Tauthon are a little bit closer to that first side of the spectrum. And in terms of your vision for Tauthon and your plans for it going into the future, especially now that Python 2 is officially unsupported,
[00:23:42] Unknown:
what are your thoughts on its long term viability or the overall time horizon that you plan to keep working on it and keep bringing back features from Python 3? Yeah. I mean, so I personally started this project, but
[00:23:54] Unknown:
unfortunately don't really have the time to be personally supporting it full time. And so that's why I think there's that group of contributors who stepped up and are maintaining Tauthon, who I really appreciate. Stefan, who I super appreciate, has stepped up on this. But, I mean, in terms of the future, for folks that are interested in continuing to run their Python 2 code, and I think there's still a reasonably sized class of them, I would encourage them to either move to Tauthon or move to a distribution that will include security fixes for their code. I mean, I do think that the Python 3 migration has largely been successful. My company, we use 3.7, I believe, and are running it well. And I do believe that actually Python 3 is a better language than Python 2, or than Tauthon, in isolation.
I just don't agree with the idea that we should spend 10 years getting everyone to migrate to the new thing. I think we could have used those 10 years in a different way. But now that that's largely happened, I think that in the future, we can spend some
[00:25:00] Unknown:
time thinking about things that are hopefully not migration. So a decent amount of the work that you have done with Tauthon has been focused on a lot of the visible capabilities of Python 3 in terms of new syntax or things like async, where it offers some new runtime capabilities. How much of your time or effort has been spent on bringing in some of the performance improvements that Python 3 adds as far as better memory usage, improvements in the garbage collector? A lot of people are pretty excited about the dictionaries that are ordered by default. I'm wondering what your thoughts are on the sort of mostly invisible aspects of Python 3 and your thoughts on bringing those into the Python 2 code base. Honestly, we've mostly focused on backporting some of the more visible features,
[00:25:48] Unknown:
because I think those are oftentimes the ones that, rightly or not, get sort of more excitement. And so I haven't done as much on some of the performance improvements and things like that. But in general I'm very excited about those sorts of things. I mean, I think to the extent that some of the code can be sped up or, in other ways, use less resources overall, that'd be a very good thing. And in terms of your experience
[00:26:12] Unknown:
of starting this Tauthon package and bringing in these new features and keeping it up to date with some of the capabilities of Python 3 as it evolves, what are some of the most interesting or unexpected or challenging lessons that you've learned in the process?
[00:26:25] Unknown:
Memory management was super hard. I remember at some point there was a memory leak and I couldn't find it. And I forget which feature I was backporting, but I literally spent a week digging around for this thing. And, you know, memory was leaking. I'd run it in Valgrind and, you know, there was a leak. And, you know, maybe one of the core developers would have, you know, been able to spot this in an hour or something. But I spent, really, literally a whole week and I thought I was going nuts. You know, eventually, I found it. You know, I could find the reference count incrementing and then, you know, saw where it wasn't getting decremented, and that was my leak. But that was incredibly challenging to find. And so I think one of the lessons I learned was, you know, I thought I was being extremely careful, and I think I was being extremely careful when writing it. But after that, I was even more careful, especially with the reference counting, to just be extra sure that, you know, it wasn't creating some bug that would then take another week for me to actually find and squash. That was actually one of the hardest bugs I've ever dealt with in my life.
[00:27:30] Unknown:
And in terms of your overall work on Tauthon and your experience of helping more people continue to use their Python 2 code bases and possibly have a viable bridge that makes it easier for them to incrementally add Python 3 support, what are some of the other things that you have enjoyed from that or unexpected outcomes that you didn't anticipate at the beginning?
[00:27:56] Unknown:
I've just really enjoyed the process of hacking on the interpreter, to be honest. I think that the code is incredibly beautiful. I think Python, all of the languages, if you will, are incredibly beautiful. I think there's a ton of really, really deep thought that's gone into every facet of the language, and learning more about that has been, for me, incredibly fulfilling. You know, you can see that every aspect of it was thought over very, very carefully, argued over. You know, ultimately, I think the decisions that have been made have been pretty solid overall in terms of language design. And I think just digging deep into how things actually work and how they're implemented has been, for me, incredibly interesting.
[00:28:37] Unknown:
Are there any new capabilities in recent releases of Python 3 or features that you haven't yet backported that you're excited to be able to bring into Tauthon, or anything that you are sort of looking forward to as far as, you know, some of the new releases of Python?
[00:28:53] Unknown:
To be honest, not really. I've been pretty focused,
[00:28:57] Unknown:
in the last couple of months just on my startup. So I actually, you know, probably for one of the first times in my life, haven't been following the most recent releases super closely. Alright. Well, for anybody who wants to get in touch with you or follow along with the work that you're doing, I'll have you add your preferred contact information to the show notes. And so with that, I'll move us into the picks. And this week, I'm going to choose first off the PyCon 2020 online content, because we were all forced to stay inside our homes and not travel to PyCon US this year. They have fortunately found a way to at least bring some of PyCon to you, with people recording their talks and putting them up on YouTube, and poster sessions, and some of the elements of the language summit. So they've put together a nice website where new content is added every week for being able to at least get some sense of what's going on in the Python community and learn some new stuff there. So I'll add the link to the show notes. And my other pick is, I've been using a framework called Dagster for being able to do a lot of ETL work recently, and I've just been enjoying that. It's really well designed, has a lot of great elements for being able to abstract out different portions of the pipeline and the execution context to make it easy for testing the logic in isolation, and then being able to get useful metadata out of the pipeline as it's executing. So if you're looking for a way to be able to have some ordered execution of steps and workflow management for your data, it's definitely a great framework for that and I recommend it. And so with that, I'll pass it to you, Naftali. Do you have any picks this week? Yeah, I would say, I'll do two,
[00:30:26] Unknown:
one, as a startup founder, I don't think I would be able to live with myself if I didn't toot my own horn just a little bit for our startup. It's called SentiLink. We detect a new form of fraud for banks and lenders, and we are hiring engineers. So, if you are looking for a job, super talented, and are interested in fraud and identity, we would love to hire you. We'll put the link in the notes for the show. And the other pick, I would say, is one piece of software that is just really incredible. It's part of Python, of course, but Timsort. If you haven't already learned about this, it's really incredible. It's made its way into other languages' sorting as well, such as Java.
Wikipedia has a really great description, but even better is the one that comes from Tim Peters himself, which is in the Python code base. And so, if you're interested in sorting, this is really something special. I mean, it's way better. Not better necessarily, but there's a lot of thought that went into it. And, you know, the stuff you learned in school about, you know, quicksort or merge sort or bubble sort or something, this is a whole new level. So I would really encourage you to take a look at that. It's really, really impressive. Yeah. I've definitely,
[00:31:33] Unknown:
been impressed with it as well. And I've heard a lot of references to other communities and academic work being based on the work that's gone into that. So it always makes me proud as a Python user to have that be built into the code base and have it be something that was a result of somebody who was early in the community and making their contribution to it. Alright. Well, thank you very much for taking the time today to join me and discuss your work on Tauthon. It's a very interesting project and one that definitely gives a lifeline to people who still have a lot of Python 2 code that they wanna keep running, either because it works just fine and they don't wanna have to move it to Python 3 and potentially add bugs, or they don't have the time or capability to upgrade it to Python 3. So definitely a good thing for the community. So I appreciate all of your time and effort on that front, and I hope you enjoy the rest of your day. Yeah. You too, Tobias. Great to chat today.
Thank you for listening. Don't forget to check out our other show, the Data Engineering Podcast at dataengineeringpodcast.com, for the latest on modern data management. And visit the site at pythonpodcast.com to subscribe to the show, sign up for the mailing list, and read the show notes. And if you've learned something or tried out a project from the show, then tell us about it. Email host@podcastinit.com with your story. To help other people find the show, please leave a review on iTunes and tell your friends and coworkers.
Introduction and Sponsorships
Interview with Naftali Harris
Overview of Tauthon
Use Cases and Feedback
Feature Set and Backporting Process
Challenges in Maintaining Tauthon
Conflicts and Compatibility
Testing and Quality Assurance
Focus and Future of Tauthon
Adoption and Community
Long-term Viability
Performance Improvements
Lessons Learned
Closing Remarks and Picks