Summary
The current buzz in data science and big data is around the promise of deep learning, especially when working with unstructured data. One of the most popular frameworks for building deep learning applications is PyTorch, in large part because of its focus on ease of use. In this episode Adam Paszke explains how he started the project, how it compares to other frameworks in the space such as Tensorflow and CNTK, and how it has evolved to support deploying models into production and on mobile devices.
Announcements
- Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
- When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, scalable shared block storage, node balancers, and a 40 Gbit/s public network, all controlled by a brand new API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they just launched dedicated CPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show!
- Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. You can reach me on Twitter at @Podcast__init__ or email hosts@podcastinit.com
- To help other people find the show please leave a review on iTunes and tell your friends and co-workers
- Join the community in the new Zulip chat workspace at pythonpodcast.com/chat
- Check out the Practical AI podcast from our friends at Changelog Media to learn and stay up to date with what’s happening in AI
- You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For even more opportunities to meet, listen, and learn from your peers you don’t want to miss out on this year’s conference season. We have partnered with O’Reilly Media for the Strata conference in San Francisco on March 25th and the Artificial Intelligence conference in NYC on April 15th. Here in Boston, starting on March 17th, you still have time to grab a ticket to the Enterprise Data World, and from April 30th to May 3rd is the Open Data Science Conference. Go to pythonpodcast.com/conferences to learn more and take advantage of our partner discounts when you register.
- Your host as usual is Tobias Macey and today I’m interviewing Adam Paszke about PyTorch, an open source deep learning platform that provides a seamless path from research prototyping to production deployment
Interview
- Introductions
- How did you get introduced to Python?
- Can you start by explaining what deep learning is and how it relates to machine learning and artificial intelligence?
- Can you explain what PyTorch is and your motivation for creating it?
- Why was it important for PyTorch to be open source?
- There is currently a large and growing ecosystem of deep learning tools built for Python. Can you describe the current landscape and how PyTorch fits in relation to projects such as Tensorflow and CNTK?
- What are some of the ways that PyTorch is different from Tensorflow and CNTK, and what are the areas where these frameworks are converging?
- How much knowledge of machine learning, artificial intelligence, or neural network topologies are necessary to make use of PyTorch?
- What are some of the foundational topics that are most useful to know when getting started with PyTorch?
- Can you describe how PyTorch is architected/implemented and how it has evolved since you first began working on it?
- You recently reached the 1.0 milestone. Can you talk about the journey to that point and the goals that you set for the release?
- What are some of the other components of the Python ecosystem that are most commonly incorporated into projects based on PyTorch?
- What are some of the most novel, interesting, or unexpected uses of PyTorch that you have seen?
- What are some cases where PyTorch is the wrong choice for a problem?
- What is the process for incorporating these new techniques and discoveries into the PyTorch framework?
- What are the areas of active research that you are most excited about?
- What are some of the most interesting/useful/unexpected/challenging lessons that you have learned in the process of building and maintaining PyTorch?
- What do you have planned for the future of PyTorch?
Keep In Touch
Picks
- Tobias
- Adam
- In Praise Of Copying by Marcus Boon
Links
- PyTorch
- University of Warsaw
- Poland
- Polish Olympiad In Informatics
- Deep Learning
- Automatic Differentiation
- Torch 7
- Lua
- Tensorflow
- CNTK
- Tensorflow 2
- Caffe2
- EPFL (Ecole polytechnique fédérale de Lausanne)
- Fast.ai
- TorchScript
- ONNX
- Transfer Learning
- C++
- Reinforcement Learning
- NumPy
- SciPy
- MatPlotLib
The intro and outro music is from Requiem for a Fish by The Freak Fandango Orchestra / CC BY-SA
Today, I'm interviewing Adam Paszke about PyTorch, an open source deep learning platform that provides a seamless path from research prototyping to production deployment. Hello, and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you're ready to launch your next app or want to try a project you hear about on the show, you'll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 gigabit private networking, scalable shared block storage, node balancers, and a 40 gigabit public network, all controlled by a brand new API, you've got everything you need to scale. And for those tasks that need fast computation, such as training machine learning models or building your deployment pipeline, they just launched dedicated CPU instances.
Go to pythonpodcast.com/linode, that's L-I-N-O-D-E, to get a $20 credit today and launch a new server in under a minute. And don't forget to say thanks for their continued support of the show. And don't forget to visit the site at pythonpodcast.com to subscribe to the show, sign up for the newsletter, and read the show notes. And to help other people find the show, please leave a review on iTunes and tell your friends and coworkers. And to keep the conversation going, go to pythonpodcast.com/chat. To learn and stay up to date with what's happening in artificial intelligence, check out this podcast from our friends over at the Changelog.
[00:01:29] Unknown:
Practical AI is a show hosted by Daniel Whitenack and Chris Benson about making artificial intelligence practical, productive, and accessible to everyone. You'll hear from AI influencers and practitioners, and they'll keep you up to date with the latest news and resources so you can cut through all the hype. As you were at the Thanksgiving table with your friends and family, were you talking about the fear of AI? Well, I wasn't at the Thanksgiving table because my wife has forbidden me from doing so.
[00:01:54] Unknown:
Oh, it's off limits for me, lest I drive her insane because I never stop. New episodes premiere every Monday. Find the show at changelog.com/practicalai
[00:02:04] Unknown:
or wherever you listen to podcasts.
[00:02:11] Unknown:
Registration for PyCon US, the largest annual gathering across the community, is open now. So don't forget to get your ticket, and I'll see you there.
[00:02:19] Unknown:
You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For even more opportunities to meet, listen, and learn from your peers, you don't want to miss out on this year's conference season. We have partnered with O'Reilly Media for the Strata Conference in San Francisco on March 25th and the Artificial Intelligence Conference in New York City on April 15th. In Boston, starting on March 17th, you still have time to grab a ticket to Enterprise Data World. And from April 30th to May 3rd is the Open Data Science Conference.
Go to pythonpodcast.com/conferences to learn more and to take advantage of our partner discounts when you register.
[00:03:00] Unknown:
Your host as usual is Tobias Macey. And today, I'm interviewing Adam Paszke about PyTorch, an open source deep learning platform that provides a seamless path from research prototyping to production deployment. So, Adam, could you start by introducing yourself? Hi, everyone. I'm Adam Paszke, and I'm one of the lead developers of PyTorch.
[00:03:19] Unknown:
And as a, like, day-to-day occupation, I'm a student at the University of Warsaw in computer science and maths. And do you remember how you first got introduced to Python? So I started with Python as part of my university courses. It was kind of a special course that, like, let you skip the basic, let's say, web development courses, database theory, and stuff like this. Like, we still kind of needed to pass exams on those. But it was really cool because it was, like, a selected group of people who were supposed to maintain a system which is used for judging submissions at the Polish Olympiad in Informatics. So, basically, people are submitting source code for programs, like, for algorithmic questions.
And then you have, like, a whole system which kind of compiles them, evaluates them, and displays the results. And this is written in Python. It's open source. So, yeah, I was working on this for a year, and that's how I got started. Then I had a bit of a break, but, you know, then came back and stuck around for a few years working on PyTorch
[00:04:25] Unknown:
now. And so before we get too deep into PyTorch specifically, I'm wondering if you can start by explaining a bit about what deep learning is and how it relates to some of the terms around machine learning and artificial intelligence that people might be familiar with. Basically, deep learning is,
[00:04:43] Unknown:
a specific, let's say, subset of machine learning. Like, there are a lot of different machine learning methods, and deep learning is, let's call it, a set of algorithms or a set of models which got really common in recent years because it allowed people to train models which empirically work better with real-world data and, like, can classify images or generate audio and all this stuff, which was really hard using the previous state-of-the-art methods. So it's really a set of algorithms that's kind of taken over the field and revolutionized it in recent years. And machine learning versus artificial intelligence, I guess it's kind of a philosophical question
[00:05:26] Unknown:
if they're the same or not. So it's just a subfield, I would say. And so can you describe what the PyTorch project is and some of the story and motivation around your creation of it? So PyTorch is,
[00:05:40] Unknown:
first and foremost, it's really a deep learning library, written mostly in Python. You know, it's kind of designed around Pythonic semantics, and it really tries to embrace them so that people kind of, you know, have a single mindset when they're writing their programs. It really attempts to be very composable with Python programs. But also, at a somewhat lower level, a view that I've been trying to push in different places, like in one of my talks at PyCon, for example, is that PyTorch really is somewhat like NumPy, except it has some nicer features which NumPy today lacks. And then, you know, there are all of those deep learning helpers on top, but this is not really the only kind of viewpoint on what PyTorch is. And kind of the, let's say, two most important features which are missing from NumPy are accelerator support, so you can do computation on GPU, for example, and high-performance automatic differentiation.
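As an aside for readers following along, the two NumPy gaps Adam mentions, accelerator support and automatic differentiation, can be sketched in a few lines of PyTorch. This is an illustrative example, not code from the episode:

```python
import torch

# NumPy-style array math, with an optional move to a GPU if one is present
x = torch.ones(3, 2)
if torch.cuda.is_available():
    x = x.cuda()  # the same code then runs on the accelerator

# Automatic differentiation: mark a tensor as requiring gradients,
# build a computation out of ordinary expressions, then backpropagate
w = torch.tensor([2.0, 3.0], requires_grad=True)
y = (w * w).sum()  # y = w0^2 + w1^2
y.backward()       # fills w.grad with dy/dw = 2*w
print(w.grad)      # tensor([4., 6.])
```

The gradient here is computed by the framework, not written by hand, which is the "high performance automatic differentiation" being described.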
A bit about the motivation for creating it. So PyTorch really didn't come out of the void. Basically, it grew out of the Torch 7 project, which was another machine learning library, except it was written in Lua. It's a library that I basically used before PyTorch existed because I wanted to learn more about machine learning. And the only problem is that, you know, even though I really liked the library, even though I really liked the design, when you went on the Internet and saw the comments, everyone was like, yeah, it's a nice library, but it's in Lua. So Lua, unfortunately, has an almost nonexistent scientific computing ecosystem. So if you're missing a package, you'd better know C and write bindings to C functions that let you, like, load, I don't know, audio files, which is a relatively standard use case, let's say. So, you know, Lua was kind of a pain point for the library. So I kind of had this idea. I was involved a bit with the development of Torch 7, not from the beginning. I actually came relatively late in its lifetime, but I did send a couple of pull requests. And so, basically, I was following the news, and I was trying to help with decoupling some of the core parts, because the math was happening in a C library that was then exposed to Lua. I kind of helped decouple the C parts from the Lua parts. And so, seeing that all other libraries are in Python and people are really liking Python, this kind of idea of, you know, starting to write Torch, but in Python, sprung to my mind. So that's what I started working on. Then I also talked to Soumith Chintala, who was a maintainer of Torch 7 at that time.
And, basically, we kind of started working on this, and then not only ported the whole Torch 7 implementation to Python, we also kind of improved on a lot of the pain points that people had, making it even more seamless
[00:08:46] Unknown:
to do the research they want to do. And as you were creating PyTorch and porting the Torch 7 library, was there any consideration of it not being an open source project? Or was that something that you just established as sort of a default because of the fact that the Torch 7 library was already available and you just wanted to continue with that legacy?
[00:09:06] Unknown:
Yeah. We never even considered making it closed source. Torch 7 was completely open source. And, you know, even when we started working on this, there were already so many open source alternative libraries that you could use to train, do research, or deploy those models to production that, basically, there is kind of no space for closed source solutions, I think, in this area at least. The open source solutions are of too high quality for that to happen, I believe.
[00:09:39] Unknown:
And as you mentioned, there is already a fairly large and growing ecosystem of deep learning and machine learning tools, particularly built for Python. So can you describe how PyTorch fits into that current landscape, particularly in relation to projects such as TensorFlow and CNTK and the Caffe2 framework?
[00:10:00] Unknown:
So that's a good question, because today we have a lot of different frameworks which, it seems, are trying to solve the same problem. And basically, the reason why people keep creating new frameworks is that, I think, a lot of people found the existing ways of expressing those problems, using the tools that were available to them, insufficient. And so they really wanted to change how they express things. And, you know, a lot of people had different opinions. And so people who differed in how they wanted to approach those problems essentially started creating different groups, which kind of developed separate frameworks. Right? So PyTorch comes out of one of those groups, which, you know, was also kind of revolving around Torch 7 previously.
So that's kind of the reason why there are so many libraries, I think, which attempt to do the same thing. And even today, we are still unsure what's the right answer to how those things should be expressed. And so all of those libraries are really iterating. Like, if you look, they really change between versions. Even in PyTorch, we went through, I think, a few iterations of the syntax for the most basic things you can do with the library, just so that it's the easiest for people to use. Regarding comparisons with TensorFlow, at least, because I think that's what I'm the most familiar with: TensorFlow just comes from a slightly different background. Like, PyTorch arose from a group that was kind of interested in machine learning research. And so the thing we embrace the most is user experience and kind of hackability, and just, basically, the ability to tweak every single thing in the library to do the thing that you wanted to do, such that you have almost no limitations if you actually want to experiment with new crazy models and you just want to try out some crazy ideas.
TensorFlow, on the other hand, I think was mostly developed as, like, an inference platform. So the first and foremost thing they kind of aimed for with TensorFlow, I think, at the beginning (it definitely changed since then), was that it can efficiently execute all of those computations. So in TensorFlow version 1, basically, instead of using the Python program to actually run the computation, the Python program was really like a metaprogramming language, which would build a TensorFlow program. And then TensorFlow would have a completely separate interpreter, which would actually evaluate whatever structure your Python program built. Right? So those are two fundamentally different approaches, and they really started from two different places. But both have their own limitations.
So the approach that PyTorch takes, where you basically use Python code to drive everything, is really nice because it feels natural. And in fact, if you look at TensorFlow 2, the second version, they basically put the, let's say, PyTorch mode, which is called eager mode in TensorFlow 2, as the default, because that's something that is most intuitive to users. But at the same time, they still have a good base for efficient inference on, like, heavily distributed systems or on mobile systems, which are relatively restricted. And so this inference part was something that was missing from PyTorch at the beginning. And so now with PyTorch 1.0, we're kind of making up for those issues by writing the PyTorch compiler, which can take your Python source code and then interpret it as a program in, again, some interpreter which we implement inside our libraries, which can then be shipped to, like, mobile phones and so on. So I think every framework started from a slightly different set of primitives. And, you know, as time passes by, as developers see what kinds of things people like to use, they will basically start to converge on the same fundamental set of principles that they build their software on. And so that's, I think, what is happening nowadays.
So, yeah, the distinction definitely used to be much larger than it is today. You know, everyone still has their own biases, like, from the earliest days.
[00:14:35] Unknown:
But, like, everyone just keeps converging to the same point, I think. Yeah. It's definitely interesting as I've been hearing about some of the, as you said, convergence in terms of the primary concerns between TensorFlow and PyTorch and some of the other deep learning libraries, where you were saying that PyTorch started from the perspective of user experience and TensorFlow started from the perspective of performance. And just through natural evolution and using useful ideas from each other, you're starting to tend towards the same set of capabilities, but with a slightly different overarching feel and
[00:15:12] Unknown:
mission for what the projects are originally intended for. Yeah. I mean, just to kind of clarify what you said: obviously, we held performance in very high regard from the beginning as well. It's not like that; we've been trying to match the performance of TensorFlow starting from day zero, which I think was pretty successful. The only difference is that to run PyTorch programs, you really need a Python interpreter, and that's not possible in every single case. Like, you really won't be shipping Python with your mobile app. Right?
[00:15:44] Unknown:
In terms of being able to make effective use of PyTorch, how much advanced knowledge of things like machine learning or artificial intelligence or neural network topologies should users know in order to be able to build these different models or experiment with PyTorch?
[00:16:02] Unknown:
So I don't think it really takes a lot to get started. Like, ultimately, PyTorch is used to teach deep learning in a lot of courses, both at traditional universities, like at Stanford, where I know it's used for their deep learning course, and at EPFL. So you definitely can use PyTorch even if you don't know anything, because even online, if you look at fast.ai, you'll basically find resources which will tell you how to start doing those things entirely from scratch. But then, as you start learning those things, you'll obviously see more and more advanced features and start to use them. So I don't think there's a really high bar for this. You can look at the PyTorch tutorials as well, for that matter. So you should be able to start with almost no knowledge except for, let's say, linear algebra and basic calculus, I think. And in the tagline for PyTorch, it says that it provides a seamless path from research prototyping to production deployment. So can you talk to some of the features of the library that enable that transition
[00:17:12] Unknown:
and some of the edge cases that users should be looking out for as they are making that transition from exploration and trying to reach a particular outcome and then actually getting it ready for deploying into production for end user interaction?
[00:17:30] Unknown:
Yeah. So this is definitely a kind of slogan of our 1.0 release, because, as I mentioned, those were kind of the points which were slightly weaker previously. Like, you really couldn't export PyTorch code which would run without Python. And so the way we try to approach this right now is, you know, we don't really want people to rewrite too much of their programs, because we really believe that Python is very nice for research. And, ultimately, if you look at the development cycle of deep learning models, you're going through a lot of iterations of research, and you'll be trying different experiments with different parameters.
And only after you're through all of this, which requires you to change your program entirely many times, you will reach a single successful solution, which you will eventually want to, let's say, deploy to a mobile phone. And so this is basically when those features come in. And so there are two most important ways we currently recommend to do this. One is tracing. So if you have no significant use of control flow, let's say, in the representation of your model, you can basically just run a function on example inputs, and we'll record every single PyTorch call that happened inside. And later, we will be able to replay the exact sequence of those calls. This is not something that solves every problem, because sometimes you really have models which run for a different number of steps depending on different inputs and so on. And so tracing wouldn't export such models correctly. And so there, what you should use if you have control flow which you want to export is called script mode. And so this is basically a set of decorators you can put on your functions. And what they will do is inspect the source of your function. They will analyze the source code statically. And based on this, they will decide if they understand what's in there, because, you know, we call it TorchScript. It's basically a subset of Python which conforms to Python semantics. So every single TorchScript program is a valid Python program; it's just not necessarily the other way around. But if your function happens to be written in the subset that we understand, we can basically encode this using some kind of internal representation that we have. And later, this can be encoded in a file and exported and run from pure C++ environments.
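To make the two paths concrete, here is a small illustrative sketch of tracing versus script mode, using the `torch.jit.trace` and `torch.jit.script` APIs the episode is describing (the functions themselves are made up for the example):

```python
import torch

def double(x):
    return x * 2

# Tracing: run the function once on example inputs and record the ops.
# This is fine here because there is no input-dependent control flow.
traced = torch.jit.trace(double, torch.ones(3))
print(traced(torch.tensor([1.0, 2.0])))  # tensor([2., 4.])

# Script mode: the decorator statically analyzes the Python source,
# so data-dependent control flow like this loop is preserved on export.
@torch.jit.script
def repeat_until(x: torch.Tensor, limit: float) -> torch.Tensor:
    while float(x.sum()) < limit:
        x = x * 2
    return x

print(repeat_until(torch.ones(3), 10.0))  # loop count depends on the input
```

Tracing `repeat_until` instead would bake in however many loop iterations the example input happened to trigger, which is exactly the failure mode described above.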
Basically, there's a shared library that you can link into any application, and you can use that to later run inference of your model. So that's how it's done. And is that export artifact the ONNX binary, or O-N-N-X for anybody who's not familiar with it? No. So ONNX is one way to do this, and it's, let's say, the standardized way. But ONNX, while being great for exchanging between frameworks, still doesn't really cover all of the operators that PyTorch has. And TorchScript has a lot more features. Like, it supports lists, it supports dictionaries, it supports custom user types, basically, let's say, named tuples from the Python world. So there are really features which can't be easily lowered into ONNX. And this is why we use a different format for this. And then you can, of course, convert our format into ONNX. You can try to lower this, but the default path to productionize the models would be slightly different, because it would use a different interpreter. Not every framework would be capable of interpreting PyTorch programs as they are, or, definitely, it wouldn't be easy to define the translation.
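A minimal sketch of that export path, using a trivial stand-in module (a real model and file layout would of course differ); the serialized file is what a C++ process would load via libtorch:

```python
import os
import tempfile
import torch

class Scale(torch.nn.Module):
    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x * 3

# Compile the module to TorchScript and serialize it to a single file;
# that file can be loaded from C++ with no Python interpreter present.
scripted = torch.jit.script(Scale())
path = os.path.join(tempfile.mkdtemp(), "scale.pt")
scripted.save(path)

# Reload the artifact (here in Python; in production this would be
# torch::jit::load(path) from the C++ shared library) and run inference.
restored = torch.jit.load(path)
print(restored(torch.tensor([1.0, 2.0])))  # tensor([3., 6.])
```

The saved file contains the TorchScript program itself, not a pickle of Python code, which is why it survives without the original source.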
[00:21:10] Unknown:
And so ONNX is essentially just a least common denominator for being able to share some subset of the models, or to target that as a means of interchange between different researchers, but not
[00:21:30] Unknown:
a production environment for end users to engage with. Yes. Exactly. Plus, using a custom data format allows us to experiment with it a little. So, basically, we could add more features without waiting for them to become available in ONNX and so on. So it's not all that different from ONNX, but we have a much, much wider set of operators and data types
[00:21:52] Unknown:
than ONNX supports. And we're digressing a little bit, but I suppose, just for the sake of argument with ONNX, would the main use case for having a model in that format be to enable something like transfer learning, where you have generated a base model from some particular set of research, and then you want to share that among other people to be able to leverage within their framework of choice to then build on top of that using the already completed training
[00:22:19] Unknown:
as part of their workflow? Sure. I guess that's one of the use cases for ONNX. Like, ultimately, it's just a format that many frameworks are supposed to understand, just to kind of make the porting of those models easier. But at the same time, you need to understand that when you export a model from a framework and there is an operation which doesn't have a corresponding op in ONNX, it needs to be somehow lowered into the more constrained format, just because, you know, not everyone might implement this op in this way. So basically every conversion between one format and another is lossy. Right? So, yeah, I guess I digressed a bit. I'm not sure if that's really the answer to your question. Yes, this is one of the use cases for ONNX.
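As a rough sketch of the transfer learning workflow Tobias describes, using a tiny stand-in network instead of a real pretrained model (in practice the base would be loaded from a checkpoint or a model exchanged via ONNX):

```python
import torch
import torch.nn as nn

# A tiny stand-in for a pretrained base; the weights would normally
# come from someone else's completed training run.
base = nn.Sequential(nn.Linear(8, 16), nn.ReLU())

# Freeze the pretrained weights so only the new head gets trained
for p in base.parameters():
    p.requires_grad = False

# Attach a fresh task-specific head on top of the frozen base
model = nn.Sequential(base, nn.Linear(16, 3))
trainable = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.SGD(trainable, lr=0.1)

out = model(torch.randn(4, 8))   # batch of 4 inputs
loss = out.pow(2).mean()         # placeholder loss for the sketch
loss.backward()
optimizer.step()                 # updates only the head's parameters
```

Only the head's weight and bias end up in `trainable`, so the "already completed training" in the base is reused as-is.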
[00:23:04] Unknown:
Fair enough. Yeah. There are a lot of rabbit holes to go down. It's a big topic area. So can you dig a bit deeper into how PyTorch itself is architected and implemented, and some of the ways that it has evolved since you first began working on it? Yeah. That's a very good question as well. Basically, PyTorch
[00:23:22] Unknown:
kind of started as a pure Python implementation. Really, the whole framework was written in Python except for the core kernels that do the math, because for those, if you really want to match the performance of existing libraries, you really need this part to live in highly optimized C and C++. But a lot of, let's say, the logic of the framework, like automatic differentiation and so on, we started with an implementation in Python. And then, throughout the history, we've been kind of moving away from this choice. Like, we've been moving more and more things to C++, mostly for two reasons. One is that some of this code is really hot. Like, it's getting executed at every single line of user programs. And so the overhead of this code is really, really important, and Python sometimes just can't deliver the same performance. Although, as an anecdote, one downside is that when I ported part of the automatic differentiation system, which was, like, 30 lines of Python code, I think it expanded into a few hundred lines of C++ code that works with the CPython API and so on. So it's definitely not easy to convert this, but we've been seeing measurable and significant performance gains from it. But this is mostly because it is really hot code that's heavily exercised by the users. We don't really typically see those kinds of speedups if you were to just write your, let's say, machine learning model in Python versus in C++ itself. So this is really only relevant for the library bits.
But at the same time, you know, we have some users who are really interested in running models in pure C++, mostly because they have existing research pipelines in C++, like a lot of reinforcement learning research on games. A lot of games kind of only have C++ APIs. And so those people naturally started their projects in C++, so they would really like to also have PyTorch features in C++, and this is something that we've been trying to address as well. Like, with the 1.0 release, we also have a beta version of our Python interface, except in C++. So that also required moving a lot of things from Python to C++, although at this point, we've been so far ahead in this work that it wasn't too hard to actually add those C++ bindings. So today, really, most of PyTorch is just a C++ library and then a thin layer of Python bindings on top of this. And so as you mentioned, you had recently reached the 1.0 milestone
[00:26:05] Unknown:
release. So can you talk a bit about the journey to get to that point and some of the goals that you had set for that 1.0 milestone?
[00:26:12] Unknown:
So the most important goal for this particular release, as I've mentioned, is this research-to-production path. This is something people have been asking us about over and over, so we really wanted to do something about it. Addressing it didn't strictly require a 1.0 release, but if you want to convince people that you really think your software is ready to be used in production systems, as it already is used in many production systems today, you really need to add that 1.0 at some point, because it makes things, let's say, more official for a lot of people.

But at the same time, this also means we're declaring that we won't be changing our APIs as we please anymore. We're giving up a little bit of flexibility, because we're comfortable that the set of primitives we've settled on at this point is sufficient to express most machine learning use cases effectively and conveniently.
[00:27:21] Unknown:
And we've gone back and forth a bit between the terms deep learning and machine learning, and most of the contexts where I've heard PyTorch referenced involve deep learning and neural network implementations. But is it also useful in a broadly defined machine learning context, without necessarily needing the deep learning techniques that leverage things like GPUs or highly parallel compute units?
[00:27:48] Unknown:
So if you have machine learning algorithms that really don't need the kind of parallel array operations that deep learning needs, PyTorch might not be the best way to implement them. For things like decision trees, or gradient boosted decision trees, there are different Python packages that implement those algorithms in C++, but they don't use the same set of primitives that PyTorch exposes, so those are definitely not good candidates to implement in PyTorch. But on the other hand, I also think you really shouldn't limit the use cases of PyTorch to deep learning, or machine learning for that matter. Pretty much every single place where you use NumPy is a sign that you need parallel operations on multidimensional arrays of data, and every such place is a potential candidate for PyTorch, where you might also exploit some of its features, like running on GPUs, which was previously not very common for a lot of data analysis programs.
[00:29:00] Unknown:
And do you think that there's room for something like a NumPy-compatible API layer for PyTorch, to make it possible to drop it in where NumPy is being used and give access to some of that parallelization and GPU computation?
[00:29:16] Unknown:
Yeah, that's definitely possible, and it's also a relatively common feature request; we just never had the manpower to actually tackle it. If someone really writes this, we'll probably be happy to integrate it into PyTorch, because we know it's somewhat needed. But there's so much core work and maintenance, and so much improving of the core features that we have, that we never actually got around to writing it.
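One common way to structure such a compatibility layer is a thin module of NumPy-style function names that forwards to a swappable backend. Here is a minimal pure-Python sketch, with a toy list-based backend standing in for real tensors; all of the names below are hypothetical, not an actual PyTorch or NumPy API:

```python
# Sketch of the "NumPy-compatible API layer" idea: familiar function names
# in front, a swappable backend behind. The toy backend works on plain
# Python lists; in the scenario discussed, it would instead dispatch to
# GPU-capable tensors.

class ListBackend:
    """Toy backend: elementwise math on plain Python lists."""
    def asarray(self, data):
        return list(data)
    def add(self, a, b):
        return [x + y for x, y in zip(a, b)]
    def multiply(self, a, b):
        return [x * y for x, y in zip(a, b)]
    def total(self, a):
        return sum(a)

_backend = ListBackend()  # a real layer would let callers swap this out

def asarray(data):
    return _backend.asarray(data)

def add(a, b):
    return _backend.add(a, b)

def multiply(a, b):
    return _backend.multiply(a, b)

def total(a):
    return _backend.total(a)

a = asarray([1.0, 2.0, 3.0])
b = asarray([4.0, 5.0, 6.0])
dot = total(multiply(a, b))   # 1*4 + 2*5 + 3*6 = 32
```

Code written against the front-end names never touches the backend directly, which is what would let the same script run on either NumPy arrays or GPU tensors.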
[00:29:43] Unknown:
And what are some of the other components of the Python ecosystem, particularly in the data analysis space, that are most commonly used in conjunction with PyTorch on a given project?
[00:29:54] Unknown:
I think it's just the usual suite of tools that people use to manipulate numerical data in Python. NumPy is used in pretty much every single PyTorch script, because people really like working with NumPy and really know its API. Apart from that, definitely SciPy and Matplotlib. Those are the most common, but they're also just the most common Python packages in general, so you'll find them in pretty much every numerical program written in Python, PyTorch programs included. I'm not sure there's anything else beyond those.
[00:30:33] Unknown:
And PyTorch is not necessarily a low level library, but it's a framework component that other things are built with or on top of. So what are some of the most novel or interesting or unexpected ways that you've seen PyTorch used?
[00:30:54] Unknown:
I think someone wrote a ray tracer in PyTorch, which was pretty unexpected. There's a library that can simulate physical systems and, at the same time, differentiate with respect to the parameters that define them, which is also pretty cool. There's an ordinary differential equation solver in PyTorch, so it's definitely not restricted to deep learning, as I mentioned. But I'd really like to see more cool use cases outside of this space.
[00:31:22] Unknown:
And what are some of the cases where you think PyTorch is the wrong choice for a given problem where it's either overkill or just doesn't really fit the particular use case?
[00:31:34] Unknown:
So if someone doesn't want to do any research, and just wants to reimplement the most standard model that's available out there as a result of the research, take the pretrained one, and ship it to a mobile app, then as of today there are probably easier ways to do that than with PyTorch. Although we're working on addressing this issue, so it will definitely get better in the upcoming months.
[00:32:08] Unknown:
Deep learning, machine learning, and artificial intelligence are areas of active research, and PyTorch is built in such a way that it's conducive to those activities. I'm wondering what the overall process is for incorporating some of the new techniques or discoveries that are found in the machine learning ecosystem into PyTorch itself, either as convenience methods or as foundational components of the library?
[00:32:37] Unknown:
So since PyTorch is core to a lot of projects, we definitely don't want to bloat the library with too many things that would end up not getting used later. We have a high bar for accepting new methods, say new optimizers or new functions, into PyTorch: someone has to show us that the work has been published, hopefully accepted at a conference, and that there's a number of papers that build upon that research and actually use those results. Only then will we start considering implementing such things. As I mentioned, that's to limit the bloat, because if we start putting every single thing in, deprecation becomes painful; there's always that one person who still uses it. And it really increases the maintenance cost.

And since we don't have infinitely many people working on this, it's really important for us to keep the library lean, so that we can keep reinventing the internals, improving the performance of the things that people most commonly use, and focusing on those things. But at the same time, we really welcome all kinds of feature requests. So if you think something should be included, don't feel discouraged from opening an issue; we're really open to discussion. Worst case, we'll discuss it and decide that it might not be the best fit at this particular moment in time, but we really want to hear back from our users about what they want. So please do open feature requests.
[00:34:20] Unknown:
And are there any areas of ongoing research that you are particularly interested in or excited about?
[00:34:28] Unknown:
So I'm not following very closely a lot of the recent developments in, let's say, the latest modeling techniques, so that's not what's been most exciting to me lately, although there has been some exciting news there. The part I've been focusing on lately is the TorchScript part, which is this programming language thing. There have been a lot of people moving over from the space of programming languages, thinking about how to improve existing languages to better assist developers of machine learning applications. So I'm really interested in the collaboration between those two fields, because I really think they can greatly improve the existing tools.
But, you know, the exact way in which that will happen remains to be seen.
[00:35:24] Unknown:
And in terms of your own experience of building and maintaining PyTorch, what are some of the most useful lessons that you've learned?
[00:35:35] Unknown:
I think it's most important to just always be curious and always be welcoming to the people who come to your project. Ultimately, a library with no users is kind of useless; you're really building those tools to assist them, and getting any contributors is something you should be really thankful for. I can add that we've had a lot of luck in having a lot of good people come to PyTorch. We have a handful of contributors whom we didn't know before we started working on the library, who joined and have now implemented some of the most important features in the library, or at least helped us a lot in doing so. We've definitely had a lot of luck in this, and it's something I'm very grateful for.
[00:36:19] Unknown:
And in terms of the overall technical implementation, what have been some of the most challenging or unexpected aspects that you have dealt with in the process?
[00:36:30] Unknown:
So PyTorch is a library that touches a lot of, let's say, systems programming issues, so there's definitely a lot of interesting stuff that happened and needed to be solved. Some of the most complicated problems, I think, are around multiprocessing. Because we can't really use threads to parallelize Python computations, and people want to parallelize things like data loading, we needed a good solution for multiprocessing, for sharing our multidimensional arrays between processes.

And this involves a whole bunch of system-level programming, and unfortunately, the common APIs for implementing shared memory are not super helpful or super convenient. I think it took us something like two years to get a really good implementation of a data loader that wouldn't leave memory around if your process crashed, and that made sure all the subprocesses it spawned get killed if your process crashes or if you kill it manually. There are a lot of things to get wrong in that space, I think, and that's where we really spent a lot of time thinking about those things.
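The kind of cross-process sharing he describes can be illustrated with the standard library's `multiprocessing.shared_memory` module (Python 3.8+). This is a simplified sketch assuming a Unix-style `fork` start method, not PyTorch's actual data loader:

```python
# Simplified sketch of sharing a buffer between processes without copying,
# in the spirit of the data-loader plumbing described above. The explicit
# close/unlink is the point: forgetting cleanup is exactly the
# "leave memory around after a crash" failure mode mentioned here.
import multiprocessing as mp
from multiprocessing import shared_memory

def worker(name, n):
    # The child attaches to the existing segment by name and fills it
    # in place; no bytes travel through a pipe or queue.
    shm = shared_memory.SharedMemory(name=name)
    shm.buf[:n] = bytes(range(n))
    shm.close()

def load_batch(n=8):
    ctx = mp.get_context("fork")  # assumption: Unix-style fork is available
    shm = shared_memory.SharedMemory(create=True, size=n)
    try:
        p = ctx.Process(target=worker, args=(shm.name, n))
        p.start()
        p.join()
        return bytes(shm.buf[:n])
    finally:
        shm.close()
        shm.unlink()  # release the segment even if the worker misbehaved

batch = load_batch()
```

A production loader additionally has to guarantee that crashed or orphaned workers get reaped and their segments reclaimed, which is where the years of effort he mentions went.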
[00:37:49] Unknown:
And looking forward, what do you have planned for the future of PyTorch in terms of overall improvements or new features or general growth in the ecosystem?
[00:38:00] Unknown:
So, yeah, we're definitely looking forward to improving the mobile export experience. PyTorch is merging with Caffe2, and Caffe2 already has solutions for this, so one of the biggest engineering efforts happening today is merging those two code bases. They already use the same C++-level data structures to describe memory, so we're writing a unified library that will back both Caffe2 and PyTorch, and later we'll also integrate a lot of the nice features where Caffe2 is better than PyTorch today, to improve this experience, using the PyTorch front end and the script mode to bridge to this exportable subset. So I think that's where a lot of things will be happening. And at the same time, we'll be working to improve the subset of the Python language that we support through TorchScript, to make it even easier to port user programs and export them.
[00:39:09] Unknown:
And are there any other aspects of the PyTorch project, or deep learning, or your overall experience of building and working on this library, that we didn't discuss yet that you would like to cover before we close out the show?
[00:39:22] Unknown:
Yeah. I guess it's just that, ultimately, we know that there are still things that might be hard to express using today's tools, and so I'm really looking forward to discovering what the next steps are for the whole field.
[00:39:49] Unknown:
Alright. Well, for anybody who wants to follow along with you or get in touch, I'll have you add your preferred contact information to the show notes. And with that, I'll move us into the picks. Today, I'm going to choose the book Un Lun Dun by China Miéville. I've mentioned before that he is one of my favorite authors, and this book in particular is a very interesting and engaging young adult novel. I've started reading it with one of my kids recently, and it's great to revisit it. So definitely recommended reading for anyone of any age. And with that, I'll pass it to you. Adam, do you have any picks this week?
[00:40:26] Unknown:
Yeah. Since we're on books, one of the more interesting ones I've read lately, which a friend lent me, is In Praise of Copying by Marcus Boon. I was actually expecting something slightly different from this book, because the topic of copying and, let's say, its ethical consequences is something that's really discussed today. But ultimately the book doesn't focus on those problems; it's definitely more philosophical in nature. It basically observes that copying is, let's say, the underlying principle of how the universe operates. So if someone's a more philosophical type, they might enjoy it, I think.
[00:41:18] Unknown:
Alright, I'll definitely take a look at that. Thank you very much for taking the time today to join me and discuss the work that you've been doing with PyTorch. It's a project that I've been keeping an eye on for a while with a lot of interest, so I appreciate all the effort that you and the other people working on it have put in. I want to thank you for that again, and I hope you enjoy the rest of your day.
[00:41:39] Unknown:
Yeah. Thanks a lot for the invite, and thanks a lot to everyone who listened.
Introduction to the Episode and Guest
Adam Paszke's Background and Introduction to Python
Understanding Deep Learning and Machine Learning
The PyTorch Project: Origins and Motivation
Open Source Philosophy of PyTorch
PyTorch in the Deep Learning Ecosystem
Convergence of Deep Learning Frameworks
Getting Started with PyTorch
Transitioning from Research to Production
ONNX and Model Interchange
PyTorch Architecture and Evolution
Journey to PyTorch 1.0
PyTorch Beyond Deep Learning
Novel Uses of PyTorch
When Not to Use PyTorch
Incorporating New Techniques into PyTorch
Ongoing Research and Future Directions
Lessons Learned from Building PyTorch
Technical Challenges in PyTorch Development
Future Plans for PyTorch
Closing Thoughts and Contact Information