Summary
Building well designed and easy to use web applications requires a significant amount of knowledge and experience across a range of domains. This can act as an impediment to engineers who primarily work in so-called back-end technologies such as machine learning and systems administration. In this episode Adrien Treuille describes how the Streamlit framework empowers anyone who is comfortable writing Python scripts to create beautiful applications to share their work and make it accessible to their colleagues and customers. If you have ever struggled with hacking together a simple web application to make a useful script self-service then give this episode a listen and then go experiment with how Streamlit can level up your work.
Announcements
- Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
- When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, scalable shared block storage, node balancers, and a 40 Gbit/s public network, all controlled by a brand new API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they just launched dedicated CPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show!
- Having all of your logs and event data in one place makes your life easier when something breaks, unless that something is your Elasticsearch cluster because it’s storing too much data. CHAOSSEARCH frees you from having to worry about data retention, unexpected failures, and expanding operating costs. They give you a fully managed service to search and analyze all of your logs in S3, entirely under your control, all for half the cost of running your own Elasticsearch cluster or using a hosted platform. Try it out for yourself at pythonpodcast.com/chaossearch and don’t forget to thank them for supporting the show!
- You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For even more opportunities to meet, listen, and learn from your peers you don’t want to miss out on this year’s conference season. We have partnered with organizations such as O’Reilly Media, Dataversity, Corinium Global Intelligence, Alluxio, and Data Council. Upcoming events include the combined events of the Data Architecture Summit and Graphorum, the Data Orchestration Summit, and Data Council in NYC. Go to pythonpodcast.com/conferences to learn more about these and other events, and take advantage of our partner discounts to save money when you register today.
- Your host as usual is Tobias Macey and today I’m interviewing Adrien Treuille about Streamlit, an open source app framework built for machine learning and data science teams
Interview
- Introductions
- How did you get introduced to Python?
- Can you start by explaining what Streamlit is and its origin story?
- What are some of the types of applications that are commonly built by data teams and who are the typical consumers of those projects?
- What are some of the challenges or complications that are unique to this problem space?
- What are some of the complications or challenges that you have faced to integrate Streamlit with so many different machine learning frameworks?
- Can you describe the technical implementation of Streamlit and how it has evolved since you began working on it?
- How did you approach the design of the API and development workflow to tailor it for the needs and capabilities of machine learning engineers?
- If you were to start the project from scratch today what would you do differently?
- What is a typical workflow for someone working on a machine learning application and how does Streamlit fit in?
- What are some of the types of tools or processes that it replaces?
- What are some of the most interesting or unexpected ways that you have seen Streamlit used?
- What have you found to be some of the most challenging or unexpected aspects of building and evolving Streamlit?
- How do you see Python evolving in light of Streamlit and other work in the machine learning space?
- What do you have in store for the future of Streamlit or any adjacent products and services?
- How are you approaching the governance and sustainability of the Streamlit open source project?
Keep In Touch
Picks
- Tobias
- The Book Of Why by Judea Pearl
- Adrien
- No Self, No Problem by Anam Thubten
Closing Announcements
- Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management.
- Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
- If you’ve learned something or tried out a project from the show then tell us about it! Email hosts@podcastinit.com with your story.
- To help other people find the show please leave a review on iTunes and tell your friends and co-workers
- Join the community in the new Zulip chat workspace at pythonpodcast.com/chat
Links
- Streamlit
- Carnegie Mellon University
- Google X
- Zoox
- IBM
- Cornell University
- NumPy
- SciPy
- Machine Learning Engineer
- Jupyter
- DeckGL
- Matplotlib
- Plotly
- Seaborn
- Altair
- PyTorch
- TensorFlow
- Protocol Buffers
- Streamlit for teams
- Heroku
- EC2
- React JS
- Awesome Streamlit
- Flask
- Plotly Dash
- Voila
- NeurIPS
The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA
Hello, and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you're ready to launch your next app or want to try a project you hear about on the show, you'll need somewhere to deploy it. So take a look at our friends over at Linode. With 200 gigabit private networking, scalable shared block storage, node balancers, and a 40 gigabit public network, all controlled by a brand new API, you've got everything you need to scale up. And for your tasks that need fast computations, such as training machine learning models, they just launched dedicated CPU instances. And they also have a new object storage service to make storing data for your apps even easier.
Go to pythonpodcast.com/linode, that's L-I-N-O-D-E, today to get a $20 credit and launch a new server in under a minute. And don't forget to thank them for their continued support of this show. Having all of your logs and event data in one place makes your life easier when something breaks, unless that something is your Elasticsearch cluster because it's storing too much data. ChaosSearch frees you from having to worry about data retention, unexpected failures, and expanding operating costs. They give you a fully managed service to search and analyze all of your logs from S3 entirely under your control, all for half the cost of running your own Elasticsearch cluster or using a hosted platform.
Try it out for yourself at pythonpodcast.com/chaossearch, and don't forget to thank them for supporting the show. And you listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For even more opportunities to meet, listen, and learn from your peers, you don't want to miss out on this year's conference season. We have partnered with organizations such as O'Reilly Media, Corinium Global Intelligence, Alluxio, and Data Council. Upcoming events include the Data Orchestration Summit and Data Council in New York City.
Go to pythonpodcast.com/conferences to learn more about these and other events, and take advantage of our partner discounts to save money when you register today.
[00:02:12] Unknown:
Your host as usual is Tobias Macey. And today, I'm interviewing Adrien Treuille about Streamlit, an open source app framework built for machine learning and data science teams. And so, Adrien, can you start by introducing yourself?
[00:02:23] Unknown:
Hi. Yeah. So I'm Adrien. I started Streamlit about a year ago with some friends of mine, and we launched it 5 weeks ago. It's been really exciting. And before that, I was a professor at Carnegie Mellon University, I ran a pretty big AI project at Google X, and I was vice president of simulation at Zoox, which is a unicorn self-driving car startup.
[00:02:48] Unknown:
And do you remember how you first got introduced to Python?
[00:02:50] Unknown:
Yes, I do. I remember very clearly. The very first time I heard of Python, I was in my first year of computer science grad school. It was 2001, and I thought Java was so cool because it was garbage collected and stuff, and it was so much better than C++. Anyway, I'm in this class, and someone mentioned this language called Python, and none of us had heard of it. Someone said, what's Python? And I remember this grad student said, really derisively, oh, it's this weird language where white space is important. He obviously thought that was just the stupidest idea in the world, and I guess I sort of agreed with him for no reason. But then the next summer I went to IBM in Cambridge, and a friend of mine who was super smart, this amazing Cornell undergrad, said, oh, you should check out this Python thing, I'm doing all my work in it. So I started playing with it, and I fell in love. And Python really is the language... I feel like I've been there, not from the start, but from the early days, and NumPy, SciPy, all those things, I learned about them and loved them along the way.
[00:04:11] Unknown:
And at this point now, you have decided to start a business and a project based on Python, using it for your core product. So I'm wondering if you can talk a bit about what the Streamlit product is, and some of the origin story and your inspiration for creating it.
[00:04:29] Unknown:
Yeah. So what Streamlit is is an app framework, really for the whole Python language. But we started with ML engineers and data scientists, because that's really what our background was, and we had a lot of experience in numerical computing. One of the things that I noticed at Carnegie Mellon, at Zoox, and at Google is that if you're in a Python dev team, you actually spend a lot of time building web tools. In the self-driving car project, we had internal web tools to search the entire image data set, a huge number of images. We would run models in real time on the images. We would run simulations. We'd run comparisons between multiple simulations, and we had a scenario search engine. These were really the tools that were the lifeblood of the project, and they kept everyone aligned together.
They also extended into the ops group, the people who were driving cars around and needed to know what their schedules were, and to the executives. And the observation was that it was really, really difficult to build these tools. Either they were super ad hoc, like Jupyter Notebooks, and so they weren't really usable by the group, or, if they became important, we would call in the tools team, a group of engineers who had a specialty in React and Vue and server architecture and stuff. They would sort of bless a tool and make it this really beautiful, polished thing, which was actually amazing, but then they'd say, well, we'll get back to you in 2 months because now we're working on the next tool. And so the ML engineers were, in many ways, disintermediated from the rest of the company by this barrier.
So that was really a lot of the original thinking behind Streamlit. And once we released it, I guess 5 weeks ago now, it was really encouraging to see the community immediately responding and saying, yes, this is super real, and it's awesome that this exists now. So that's been very validating.
[00:06:44] Unknown:
And I like what you were saying too about the fact that these applications were valuable to people outside of the data team specifically, and more broadly in software engineering, where the capabilities of what we can build with software are valuable to virtually everyone. But because of the high barrier to entry that we've created, with needing to know so many different levels of the stack to build something effective, it prevents a lot of people from even exploring that space and wanting to build their own tools. And so it's nice to see that Streamlit is another entry to be able to provide that capability to anyone who has some facility with programming but doesn't necessarily want to know and understand the entire space of building a beautifully designed web application.
[00:07:39] Unknown:
Yeah. I think that's exactly the spirit in which we designed the software. It's really true that data scientists and ML engineers, and it also applies to data engineers and DevOps, are often doing this hard and amazing work, and yet because of the way these different tech stacks are built, they're sort of disintermediated from their customers. And so Streamlit is very simple in some ways, but it's kind of like a little superpower that turns any Python program into an interactive app that can then really project the programmer's power throughout the organization. And we're seeing a lot of uses outside of just the ML group as well, so that's really exciting.
[00:08:24] Unknown:
So for somebody who is building an app with Streamlit, what are some of the types of widgets that you support and the types of applications that you've seen built commonly by data teams and some of the typical consumers of those applications?
[00:08:39] Unknown:
So, well, first of all, I have to give a shout out to Jupyter because they really led the way for having this amazing widget support in the Python community. And so now a lot of great JavaScript libraries like deck.gl, which is Uber's amazing geographic visualization library, have Python bindings, and the reason is because of Jupyter, basically. Streamlit is a different use case from Jupyter. We actually use both side by side. Jupyter is really for interactive exploration and disseminating ideas. I mean, it has many use cases actually. Streamlit's really for app building.
But it turns out that we were really rapidly able to assimilate almost all of the major visualization libraries into Streamlit. So, you know, deck.gl, Matplotlib, Plotly, Seaborn. Let's see. I'm missing a whole bunch here. Altair, which is an amazing library. And then we have basically the standard widgets, so various kinds of inputs: sliders, text input, date input, those kinds of things. Those are sort of the basic atoms in the periodic table of Streamlit. And then the real innovation is in the ability to mix and match those almost instantly, without having to define a complex declarative web layout with divs and spans and all these HTML and CSS things. You just write it as an ordinary Python script. So that allowed us to see a whole bunch of applications. I'm happy to share with you some of the ones that we've seen if you're interested.
[00:10:24] Unknown:
Yeah. We can dig a bit more into some of the interesting examples later. In the meantime, I'd be interested to dig into some of the challenges and complications that you have run into that you feel are unique to this problem space of building an easy to use application framework for people who don't necessarily have a lot of front end experience or the time and inclination to dig deep into that area?
[00:10:49] Unknown:
That, to me, is kind of the central challenge of all of Streamlit, and it's really the animating question that drove us to build it. I think the key question is, how could we make building web tools as easy as writing Python scripts? The basic idea here is that, in a very logical sense, a web tool or an app on a phone is usually described as a declarative set of widgets, which are then wired together reactively in order to create a UI experience of some kind. With Streamlit, our starting point is actually a Python script, so something that executes from top to bottom. What we wanted to do was let you weave GUI code into that logic without actually subverting or inverting that logic at all, and come out with an app that you can use.
And I think it's perhaps a slightly subversive thing to do, but I think it's a very, very Pythonic thing to do, actually, and we always strove for there to be one easy way to do things, or whatever that Python koan is. And the response has been really extraordinary. I mean, tens of thousands of apps have been created in our 5 week lifespan, and one of the really amazing things has just been to see dozens of tweets coming out per day of people saying, look at this app I made in 50 lines of code or 70 lines of code, or, I just turned my Python script into an app that's deployed on Heroku overnight.
The Streamlit hype is real. And that's just been so cool and exciting, and it really completely exceeded our expectations by a large, large margin. So, yeah, I feel like people resonated with this approach of looking at app development from the perch of a Python programmer.
[00:12:46] Unknown:
And the fact that you originally conceived this as targeting machine learning engineers and data scientists is exhibited by the fact that you have some strong integration with a number of different machine learning libraries. And I'm curious what you have seen as far as any challenges of being able to cleanly represent those bindings, given the fact that there are so many different libraries that you're working with that might have conflicting views of what a typical workflow might be or what the necessary bindings are for being able to wire it up to a front end component, for instance?
[00:13:21] Unknown:
Yeah. That's a great question. So, indeed, our first class citizens include a lot of the basic things that you'll come across in machine learning and data science. So data frames, NumPy arrays. We do a lot of stuff with PyTorch, TensorFlow. Those were really our guiding examples as we were building Streamlit, so we feel most comfortable endorsing its use in those use cases. And to your question of what was hard and easy, on the one hand, it was really easy, and on the other hand, there were some tricky things. The easiness just comes from starting with Python itself. I mean, Python has become, par excellence, the glue language of all of these different ideas. And so people come to us and they say, well, is Streamlit compatible with Spark? Is it compatible with TensorFlow? And we're like, it's compatible with anything that Python's compatible with, because it's pure Python, and that's just an amazing superpower. I mean, other app frameworks that we see coming out in the startup space, for example, are SaaS platforms, and then every single integration they have to write themselves and wrap in the language. We just present pure, unadulterated Python to our users and let them do whatever they want, which is just so exciting.
[00:14:39] Unknown:
So digging deeper into Streamlit itself, can you describe a bit about how it's implemented and some of the ways that it's evolved since you first began implementing and iterating on it?
[00:14:52] Unknown:
So, in terms of how Streamlit's implemented, the basic idea is that instead of saying python and your Python file, you say streamlit run and your Python file. What that does is we take your Python file, we import all of the imported libraries, and then we run it in what we call a ScriptRunner. What this ScriptRunner does, first of all, is allow your script to connect live to a web browser via a WebSocket connection. So you can transfer information back and forth to the web browser using only Python calls, and all of the details of this are completely hidden from the user. And the other thing that it allows us to do is rerun your script really efficiently in case anything changes on the web browser. So it gives you kind of an interactive view into a static Python script, which is really the Streamlit magic.
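The rerun idea described in this answer can be sketched in plain Python. This is only an illustration under stated assumptions, not Streamlit's implementation: `widget_state`, `text_input`, `script`, and `on_widget_change` are hypothetical stand-ins, and the real ScriptRunner exchanges messages with a browser over a WebSocket rather than reading a dictionary. The sketch shows only that the script runs top to bottom and that a widget change triggers another full run with the new value.

```python
# Hypothetical stand-in for values the browser would send back; not Streamlit's API.
widget_state = {}


def text_input(label, default):
    """Return the widget's current value, or its default on the first run."""
    return widget_state.get(label, default)


def script():
    """An ordinary top-to-bottom script; returns the text it would display."""
    name = text_input("name", "world")
    return f"Hello, {name}!"


def on_widget_change(label, value):
    """Record the new value, then rerun the whole script from the top."""
    widget_state[label] = value
    return script()


first_render = script()                          # initial run uses the default
second_render = on_widget_change("name", "Ada")  # the user types a new value
print(first_render)
print(second_render)
```

The script itself contains no callbacks; interactivity comes entirely from running it again with different widget values.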
And that was the core idea from the start. I mean, Streamlit actually was a solo programming project and not a company in the early days, and that was the core idea that we were working on. What happened was, really early on, a bunch of engineers, first from Uber and then Stitch Fix and some other great companies, started using it and gave us feedback. And so, in a sense, since we had real users, even though there were only 2 or 3 in the beginning, it was sort of a crowdsourced development process. We regularly met with them, and they showed us what they were doing, and they told us what they wanted. And, really, most of the best ideas in Streamlit came from our users. Now that we've launched, it's actually just grown exponentially. At this point, tens of thousands of people have used Streamlit, just in the first 5 weeks, and I think one of the things that's very much a concern for us and the company is how we can keep this cadence of listening to the community and keep this development approach, but scale it up so that people feel like their voices are being heard and so that information from multiple sources can be assimilated scalably. So it was sort of crowdsourced from the start, and now we're struggling to keep up with the community and scale our processes.
[00:17:20] Unknown:
One of the challenges of building a tool like this is identifying what the user facing API should look like to make sure that it is approachable and usable, but at the same time sufficiently expressive for people to be able to build the types of applications that they want to without having to dig too deep into the guts of it.
[00:17:40] Unknown:
Yeah. That's something that we've really approached with, I guess I would say, a huge amount of care and actually a lot of work. We've literally used every competing app framework that we could find on the web, and we wrote detailed notes on how they all worked and what we liked and what we didn't like. And then we worked really, really closely with the community on details such as the ordering of arguments to functions, and also on just making things work right the first time as you would expect, which is, of course, the hardest thing to do. It's really easy to say, oh, well, that's user error, or, if they want to have it work this other way, then we'll just add another 16 arguments to this function and let them configure everything. But the cost of that is paid in the complexity of the API, and we recognize that our users' brain cycles are really extremely valuable.
And so we worked really hard and, for that matter, broke compatibility a few times along the way in order to get the API right. And that's still an ongoing process. I think that Streamlit is really in its MVP phase right now. Just having this flood of people come in and tell us this didn't work or that didn't work has really sharpened our picture of where Streamlit needs more attention and what parts were working. And so our road map over the next 6 to 8 months is really, really clear, in terms of features that we'd like to bring out and ways that we want to empower the community to build richer, faster, more beautiful apps quickly. And I guess I would say to the community of users, first of all, thank you for telling us what's broken and what's wrong. It's so valuable.
And, also, please help us. We've started seeing pull requests trickle in and users teaching one another tricks about Streamlit that we didn't even realize, and that's just been so cool. It's necessary to keep the community growing. So, yeah.
[00:20:00] Unknown:
The choice of Python seems fairly natural given the initial community that you were targeting because of the fact that it's so widely used in the data space, both for data engineers and for data scientists. But from the perspective of the project itself and the way that you have engineered it, if you were to start it over today knowing what you do now, what is it that you think you would do differently, either in terms of the overall system design or in the early efforts of building and promoting it?
[00:20:30] Unknown:
Yeah. Technically, Streamlit is language agnostic, actually, in the sense that the underlying data layer that intercedes between the browser portion and the server portion is actually written in Google protobuf, so it's sort of a language agnostic layer. That said, we are Pythonistas. That's our background, and it was certainly informed by our experiences as Python programmers. The notion of opening it up to other languages, I think, makes sense and is very exciting. And I think the Streamlit model totally works in other languages, but we are really committed to Python right now.
And also it's just such a great language. It is really great to be able to write just a small library, in some sense, and then have it be superpowered by the insane reach and compatibility of Python, which is almost unmatched in the language world. In terms of what we would have done differently, if you look at early Streamlit, it looks totally different than it does now. So, in a sense, I think we have taken the opportunity along the way of changing things when we were wrong, basically. I'm actually sort of happy with where it is now, but that's because we really rewrote it along the way a few times.
And we rewrote even the APIs when we realized they were confusing and stuff, so we weren't afraid of that. Now, of course, the big problem is that we don't want to do that to our user community, because Streamlit is really being used in production now. So I think that we are going to have to be more careful as we add new features and exercise additional judiciousness, because I think it's an important value to maintain backwards compatibility, or at least be very cognizant of the cost of breaking changes.
[00:22:54] Unknown:
And for somebody who is building on top of Streamlit, can you discuss a bit more detail about the overall workflow of what's involved in designing their script to be compatible with what Streamlit is expecting and some of the model as far as how you would go about deploying it for use by other people?
[00:23:11] Unknown:
So I think the most important point is that we don't require you to start out by writing a Streamlit app. Many of our use cases are people who already have existing Python scripts for training a model or for running a model on some data set, and so what we try to do is let the user instrument their script graphically. So, for example, anytime you have a variable in Python, say x equals 3, in Streamlit you can simply remove that 3 and say x equals st.slider. Now that x is a slider that can be changed, and all of the downstream computation will be executed properly as a result of that.
And that doesn't apply just to numbers. You can have all kinds of different inputs, and you can actually even get into various kinds of control flow with buttons and checkboxes and stuff. In a funny way, because the flow of a Streamlit app follows the logical flow of a Python program, the UI follows that flow also. And so a very funny and sort of mysterious aspect of Streamlit, and I'd sort of love to write a blog post about it if I can manage to crystallize this idea, is how Streamlit GUIs tend to be logical from the user's perspective without a huge amount of effort or design required. Now, of course, you can also just clean slate a Streamlit app, and we do that all the time. For example, all of our dashboards for looking at downloads and GitHub stars and the telemetry on Streamlit itself, we have written ourselves. That's a first class Streamlit app that we wrote that lets us look at these things and understand them. So you could really do it both ways. And, certainly, intentionally writing a Streamlit app makes it possible to also think a little bit about how to write things quickly.
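The claim that the UI follows the script's control flow can be illustrated with a small stand-in, written in plain Python so it runs without Streamlit installed. Everything here is an assumption made for illustration: `render` and the `state` dictionary are hypothetical, the strings stand in for rendered elements, and the two `state.get` calls play the roles that a slider and a checkbox would play in a real app.

```python
def render(state):
    """Run the 'script' once and return the list of elements it emitted.

    `state` is a hypothetical dict of widget values, standing in for what
    the browser would send back.
    """
    page = []

    x = state.get("x", 3)  # plays the role of replacing `x = 3` with a slider
    page.append(f"slider x={x}")

    show_details = state.get("details", False)  # plays the role of a checkbox
    page.append(f"checkbox details={show_details}")

    if show_details:
        # Elements created inside a branch appear only when the branch runs.
        squares = [i * i for i in range(1, x + 1)]
        page.append(f"table of squares up to {x}: {squares}")

    page.append(f"result: {x * 2}")
    return page


print(render({}))
print(render({"x": 2, "details": True}))
```

Because the page is built in the same order the statements execute, the layout mirrors the program without a separate layout definition.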
We have some caching technology that allows you to save computation and reuse it across runs, so there's lots of neat stuff you can do there. Then the final part of the story arc is deployment. Right now, we don't have a solution to that. We're actually working on a solution which we're calling Streamlit for Teams, and that's something that is designed to be a sort of enterprise version of Streamlit. But at the same time, the community has now written probably 1 or 2 dozen articles about how to deploy Streamlit on EC2, on Heroku, and so that's great too. We really encourage that, and we've been reading those articles carefully, so thank you guys for writing them. We want to make sure that there are also great open source solutions for deploying Streamlit.
[00:26:17] Unknown:
And one of the things that I'm working to understand is, because of the fact that you're building on top of scripts that have a logical progression of run from start to finish, and in my experience of building web apps they're generally run in some sort of daemonized process, I'm curious how that affects the way that you run the Streamlit application and how you make sure that it's always available for user input. And then also some of the challenges as far as trying to make some of these Streamlit applications multi user capable, where you might have more than one person interacting with it at a given time?
[00:26:59] Unknown:
So that is really the tech of Streamlit: it is exactly solving that challenge. The actual coding that we do is not the API design and all that stuff. The API is just designed to be very simple. There's actually very few function calls in Streamlit. There's probably a couple dozen. But underneath the hood, a great deal is happening to enable exactly what you're describing. And so that's one of the reasons why we don't just Python run your script, we Streamlit run it, because, in fact, Streamlit creates a server, a multi threaded server.
Every time your script is run, it's in a separate thread that's isolated from all the others. We preempt threads when events come in. We have our own sort of queuing system on top of the WebSocket layer, which allows events to go from Python to the web browser and back, and then we do a great deal of caching and deduping to make everything fast. So, for example, if the user changes an input, like in that example before where x equals 3 becomes a slider, we recognize that the graphical elements above that point in your script haven't changed. It looks very much like React in some ways. We have hashing, caching, and deduping happening at almost every layer of Streamlit to make your app as performant as possible. And that is a little bit of deep magic. At times, it doesn't quite do what the user wants, or it requires a little bit of sophistication to understand why this might be fast or slow. And so that's actually something that we're thinking a lot about right now, and I think we have some really neat features coming out in the next few months, which are going to simplify it even more and have it work right the first time, every time.
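The deduping described here, recognizing that elements earlier in the script have not changed, can be sketched as a hash comparison between two runs. The transcript does not say how Streamlit does this internally, so the following is a generic technique under stated assumptions: `element_hash` and `diff_render` are hypothetical helpers, and strings stand in for rendered elements.

```python
import hashlib


def element_hash(element):
    """Hash an element's representation so two runs can be compared cheaply."""
    return hashlib.sha256(repr(element).encode("utf-8")).hexdigest()


def diff_render(previous_hashes, elements):
    """Return (messages_to_send, new_hashes); only changed positions are sent."""
    messages = []
    new_hashes = []
    for index, element in enumerate(elements):
        digest = element_hash(element)
        new_hashes.append(digest)
        is_new = index >= len(previous_hashes)
        if is_new or previous_hashes[index] != digest:
            messages.append((index, element))
    return messages, new_hashes


run_1 = ["title: demo", "slider x=3", "chart for x=3"]
run_2 = ["title: demo", "slider x=5", "chart for x=5"]

first_messages, hashes = diff_render([], run_1)
second_messages, hashes = diff_render(hashes, run_2)
print(first_messages)
print(second_messages)
```

On the second run only the slider and the chart are resent; the unchanged title above them is skipped.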
[00:28:47] Unknown:
Yeah. Particularly in the slider case, I can imagine that there's a fair bit of difficulty in terms of debouncing the signal, so that if somebody's playing with the slider a bunch and they haven't really settled on what they want the actual input to be, that you're not just constantly sending those signals back and forth between the app and causing it to be preempted so many times.
[00:29:05] Unknown:
Yeah. That's a great point. And I also didn't address the multi-user aspect of what you're saying, but indeed, we have isolated what we call sessions, so that 2 users don't see one another even though they're running on the same Python process. So there's a lot of really interesting stuff going on under the hood. And, if I do say so, I think some really wonderful engineers, most of us from great companies, came and worked on Streamlit and contributed and helped to form the company. We all love the code too, and it's very well tested and commented. And so, if someone's really interested in how this all works, we would love to answer questions on the forums and encourage you to actually read the code. It's all on GitHub. It's all open source with a permissive license, so people are welcome to poke around and see how we're doing this.
But, yeah, you're totally right. There are questions of not overflooding the event queue, because that would not be a good thing.
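One way to picture the two ideas in this exchange, keeping sessions separate and not flooding the event queue, is a queue that remembers only the latest pending value for each widget in each session. This is a hypothetical sketch under those assumptions, not how Streamlit's queue is written; `CoalescingQueue` and the session and widget identifiers are invented for illustration.

```python
from collections import OrderedDict


class CoalescingQueue:
    """Hypothetical queue that keeps only the latest value per (session, widget)."""

    def __init__(self):
        self._pending = OrderedDict()

    def put(self, session_id, widget_id, value):
        key = (session_id, widget_id)
        # Drop any older pending event for the same widget, then append the new one.
        self._pending.pop(key, None)
        self._pending[key] = value

    def drain(self):
        """Return the pending events and clear the queue."""
        events = [(s, w, v) for (s, w), v in self._pending.items()]
        self._pending.clear()
        return events


queue = CoalescingQueue()
for value in (1, 2, 3, 4, 5):      # one user drags a slider quickly
    queue.put("session-a", "x", value)
queue.put("session-b", "x", 10)    # a second user moves the same slider in their own session

events = queue.drain()             # one event per session, each with the latest value
```

Because the key includes the session, rapid changes from one user collapse into a single rerun without touching the other user's pending input.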
[00:30:09] Unknown:
That would break the illusion. And then in terms of more advanced uses: for somebody who's writing a simple script and wants a simple application, it's fairly straightforward how they would go about that. But for a case like yours, with your dashboard for all these different metrics that you're using to measure the success of Streamlit, I'm wondering if you have any capacity for composing multiple apps that somebody has built with Streamlit into a single overall experience.
[00:30:40] Unknown:
Yeah. So totally. And, in fact, there's a really cool app that someone created called Awesome Streamlit. I think it's awesome-streamlit.org, and it's sort of a meta app in the sense that other people can commit apps to it, and then you can run them and see different examples of little code snippets and how they execute in Streamlit. It's really cool. So that's an example. And, in fact, the creator of Awesome Streamlit, Mark, has been on our forums and has really been pushing the limits of how you do multi-page apps in Streamlit, applications with lots and lots of files, and so we've actually been learning a lot from his experience and improving Streamlit.
So it is absolutely possible to create those kinds of complex apps, and moreover, as we gain more experience in building them, we are adding even more sugar to make it a really, really fun experience.
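A common way to compose several small apps into one, sketched here in plain Python, is a registry that maps page titles to functions and a dispatcher that runs whichever page the user picked. The page functions, the `PAGES` registry, and `render` are hypothetical names, and the sketch assumes the selection would come from a widget such as a sidebar select box; it does not describe how Awesome Streamlit is actually built.

```python
def home_page():
    """Hypothetical page: returns the text it would display."""
    return "Welcome to the gallery"


def metrics_page():
    """Hypothetical page: returns the text it would display."""
    return "Metrics dashboard"


# Hypothetical registry: page title -> function that renders that page.
PAGES = {
    "Home": home_page,
    "Metrics": metrics_page,
}


def render(selection, pages=PAGES):
    """Run the page chosen by the user, falling back to the first registered page."""
    page = pages.get(selection, next(iter(pages.values())))
    return page()


current = render("Metrics")
```

Each page can live in its own file and be imported into the registry, which keeps a multi-file application organized around one entry script.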
[00:31:37] Unknown:
For people who are first coming to Streamlit, I'm curious what types of feedback you've seen about the tools or processes that they had been using previously and have been able to replace with Streamlit, and what other systems or frameworks you consider to be in the same space, which you can either use alongside Streamlit or which Streamlit might replace or supplant?
[00:32:06] Unknown:
I'm sorry. Can you can you repeat the first part of the question? I apologize.
[00:32:11] Unknown:
Sure. Just curious what you have seen in terms of feedback of people who are coming new to Streamlit, who had existing workflows or processes, what types of technologies or workflows they are replacing with Streamlit?
[00:32:28] Unknown:
Yeah. So I think that there are a bunch of adjacent technologies that overlap one another in the same way that a Jupyter notebook and an Excel spreadsheet overlap. You could do the same things in both, but they have distinct centers of gravity. Similarly, you could do interactive data exploration in Streamlit, but I would probably recommend Jupyter for that. You could also write an app in Jupyter, but we think Streamlit is a better experience for that kind of thing. So there are all these kinds of overlapping things. But in my experience, the thing that we actually had in mind the most was Flask. We really saw a lot of ML engineers especially saying, you know, I just trained this model on this data set, and now I've created a Flask endpoint that you can go to and type in all kinds of URL parameters, and then I'm going to barf out a bunch of HTML that tells you about whatever it is you're interested in. And that's the tool that the whole team uses, and they all think it's amazing. Right?
And so we were really thinking about how to make that better. Actually, Flask is an amazing framework, and we're actually thinking of using it in Streamlit. But in that use case, there was a major impedance mismatch, and so we were thinking about how you interleave a neural net and all the machine learning code into an interactive app in those kinds of use cases. In terms of other adjacent technologies, Plotly Dash is really cool. It's much more customizable visually than Streamlit and has a different kind of event model than we do. And then if you're coming from the R world, Shiny is really cool.
And, let's see. I think there's Panel, there's Voilà from Jupyter, and there are a bunch of other things. So I think there's a growing agreement that there is really a use case here that's been underappreciated, but I think Streamlit also has a unique place in that firmament.
[00:34:36] Unknown:
In terms of some of the uses of Streamlit, you mentioned that there have been a number of interesting or innovative ways that people have leveraged it. And I'm curious what some of the most notable are that you think are worth calling out, or some of the lessons that you've learned as a result of seeing people build with Streamlit in ways that you didn't necessarily think were possible or plausible?
[00:34:58] Unknown:
Yeah. As I mentioned, we really started in our own minds with internal tooling for ML and DS teams, and we've just seen this explosion of cool apps being posted. In fact, a better answer might be just to go to Twitter and search for Streamlit and see what people are putting up there. We've seen people build explainer demos to help show off their models, you know, a cool NLP model or something. We've seen people show off just their GitHub repos: here's a useful repo to do x, y, z, and if you want to see how it works, just run this Streamlit app and all of a sudden you'll be able to play really easily with my code. And, let's see, we've seen people create dashboards for marketing teams. What's been really interesting for us to see is, for example, one company we're working with where the researchers are building a recommendation engine for the sales team, and doing it in Streamlit allows them to create this app directly for the sales team, with no other app-building team in between. And so that means that not only is time to market much faster, but the iteration cycle on making changes to the app is much shorter.
Tools for operations teams to view data as it comes off a self-driving car. Annotation tools. Somebody created an app which lets you see all of the speakers at NeurIPS, the AI conference. We've seen demos of people's AI research. So, yeah, there's a lot of cool stuff out there.
[00:36:35] Unknown:
And in your own experience of building and evolving Streamlit, what have you found to be some of the most challenging or unexpected aspects of the technical implementation or lessons that you've learned in the process?
[00:36:47] Unknown:
Well, I mean, man, we understand Python way better than we ever thought we were going to. Actually, to your earlier question about the different ML frameworks, probably the most challenging thing has been TensorFlow and PyTorch, because each in its own way is doing some deep, deep magic in Python, and they are kind of subverting the language to their own ends. So, typically in TensorFlow, a variable x isn't really a variable in the Pythonic sense. It's really a pointer to a graph of computation, which will be executed at a later time. And that has been tricky to weave into Streamlit because, in a sense, like those libraries, we also do some very deep magic and subvert some of the naive assumptions about how a Python program might work.
So that's been a really interesting challenge, and I think one of the goals has been to not just brute-force those problems, but solve them elegantly. Again, the idea was always that you wouldn't write a Streamlit app with the idea of writing an app. You would write it with the idea that you'd already done something else and then you wanted to make it interactive in some way, and we wanted to make that a super fast process. So that's been really fun. And, of course, there's Python 2 and 3: we still have a bunch of people who use Python 2, and doing complex Python programming as a library builder is a very intricate thing because you have to support both simultaneously.
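The remark that a variable can be a pointer to a graph of computation rather than a value can be shown with a toy example. This is not TensorFlow's API; `Node`, `constant`, and `evaluate` are hypothetical names in a minimal sketch of deferred execution.

```python
class Node:
    """Hypothetical graph node: records an operation instead of computing it."""

    def __init__(self, op, inputs=(), value=None):
        self.op = op
        self.inputs = inputs
        self.value = value

    def __add__(self, other):
        return Node("add", (self, other))

    def __mul__(self, other):
        return Node("mul", (self, other))


def constant(value):
    """Wrap a plain Python number as a leaf of the graph."""
    return Node("const", value=value)


def evaluate(node):
    """Walk the graph and compute the result only when asked."""
    if node.op == "const":
        return node.value
    left, right = (evaluate(n) for n in node.inputs)
    return left + right if node.op == "add" else left * right


x = constant(3)
y = x * x + constant(1)   # builds a graph; nothing is computed yet
result = evaluate(y)      # the computation happens here
```

Here `y` is a description of work to do, not a number, which is why tooling that inspects or hashes ordinary Python values has to treat such objects specially.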
[00:38:27] Unknown:
In light of your work on digging into some of the guts of Python and making it do things that are potentially unexpected or out of the bounds of normalcy and your experience of seeing the same being done in things like PyTorch and TensorFlow. I'm curious what you see as being some of the future evolutionary paths of the language and the overall ecosystem for people who are working, you know, with Streamlit or in the machine learning space or just broadly within Python?
[00:38:56] Unknown:
We've really been trying to create really simple APIs that often do complicated things, and at times we find ourselves running up against the limits of the language itself, I must say, just the syntactic limits of the language. Sometimes it's difficult to express some things in Python. As a very simple example, there's not really a notion of blocks. The closest things are function decorators, if statements, or with statements, and the intersection, the sort of Venn diagram, of those 3 constructs leaves a lot of room, actually, for interesting things that you might want to do naturally. So we think really, really hard about how to create syntactic constructs that will make sense in the Python programmer's mind and that also allow us to do these interesting things that we want. And so, as Streamlit has evolved, we've really been thinking about things like multi-line anonymous functions or more generalized syntactic structures, and also various preprocessing steps on a Python script that we could do, to make the language a very natural and beautiful fit for these use cases, of course while trying to maintain that incredible simplicity and understandability that's at the core of Python. So that's a big challenge, and I think it's one that Python has historically been exceptional at, and so I think we want to tread very delicately there. But certainly, we're running up against some interesting logical, or sort of operational, and also syntactic limitations of the language. And so I'm excited to see how those things evolve over time and, hopefully, to play a role in them if we can.
[00:40:58] Unknown:
And for Streamlit itself, what do you have in store for the future of the project and any of the adjacent products and services that you're looking to build to tie into its ecosystem?
[00:41:10] Unknown:
Oh, yeah. So exciting. First of all, if you are a Streamlit user: a lot of people have had problems with caching, which is one of our big features. We're working on a lot of improvements there, so we've heard you, and we're doing some cool stuff. Then we have a bunch of features which we think are really going to expand the range of possible Streamlit apps. We're working on horizontal layout and other kinds of layout primitives, and also on better handling for app state, which will make Streamlit more useful for annotation problems, for example.
There's a bunch of really cool features that the community has asked for that are smaller, but we really get why they're important to people, and we're trying to make sure that we save enough space in our engineering cycles to develop those and get them out. I'd also encourage people who are excited about adding this or that feature to Streamlit to reach out. We really would love community help. And then the 2 big ones coming down the pike are a plug-in system, which will basically allow people to write their own React components and wire those into Streamlit seamlessly, and auto-deploy, Streamlit for Teams, which is the enterprise solution for Streamlit and also something that we hope to figure out a way to give for free to the community as well. So that's something that we're working on a lot, both technically and from a business perspective.
We want to keep the lights on, but we also want to get Streamlit out to as many people as possible. So that's kind of the roadmap right now, and each one of those, I think, in its own way adds a new dimension to Streamlit that I'm just so excited about. Right now, it's a square, but the next feature will make it a cube, and the next feature will make it a tesseract, and so on, and that's something I'm just super looking forward to. So I think we're just at the beginning of the journey, and I can't wait to keep going.
[00:43:22] Unknown:
And in terms of the governance and sustainability of the project, I'm curious how you are approaching that given that it is open source for the actual code, but you're also trying to build and maintain a business around it, and just how you see those dividing lines, and how you're trying to make sure that you're keeping the needs of the community at the forefront.
[00:43:43] Unknown:
Yeah. As I mentioned, crowdsourced input from the community has been the driving force of our evolution from the start, even when we were in closed beta. And we recognize now, with the huge amount of interest and the rapid growth in the community, that scaling our community involvement is going to be a major challenge, and in many ways I think that we are the bottleneck right now. The number of questions and GitHub issues coming in is larger than our ability to handle them, quite frankly, so we are really interested in how we can have the community play a role in understanding the issues that people are having, helping one another, triaging bugs, and actually contributing code.
So our attitude towards this is that we really are looking forward to working with the community, building a structure of community governance, and having the community have a sense of ownership over the future of the product. That's something that we are thinking a lot about right now, and so in that spirit I really would encourage people who have thoughts about great community governance models in other open source projects to reach out, write something down in the forums, and let us know and let the other community members react to it. We are really paying attention, and our goal is to share ownership and share Streamlit with the world. That's ultimately the goal here. Before this was a company, this was just a cool project, and I really think it would be so much fun to get as many people involved as possible.
[00:45:46] Unknown:
Are there any other aspects of the Streamlit project itself or the ways that it's being used or your goals for it that we didn't discuss yet that you'd like to cover before we close out the show? No. No. I think we got a lot. And for anybody who wants to follow up with you or get in touch and keep up to date with the work that you're doing, I'll have you add your preferred contact information to the show notes. And with that, I'll move us into the picks. This week, I'm going to choose The Book of Why by Judea Pearl, which I've been reading recently. I'm still only partway into it, but so far, it's been quite interesting. It discusses his views on some of the issues around how to systematically represent causation and not just correlation, and some of the ways that that's important in our current age of computing and artificial intelligence. And he's actually one of the Turing Award winners, so definitely somebody who has a lot of thoughts and context on the matter. So I definitely recommend checking that out. And so with that, I'll pass it to you, Adrien. Do you have any picks this week?
[00:46:49] Unknown:
Yeah. I think one of the most touching books that I've read recently is called No Self, No Problem by Anam Thubten. There are a couple of books with that name, believe it or not. Anam Thubten is the Tibetan Buddhist monk who wrote a book with that name, and the thing about this book that's so special is that it's not at all religious in some ways. It certainly doesn't ask you to believe anything, and yet it's written with this exquisite precision about the world through the eyes of a so-called enlightened being. It just so unapologetically and beautifully states that there's this way of looking at the world that's accessible to everyone. And when you read it, I just feel like, of course, that's true. And I was really just touched to the bottom of my heart.
[00:47:58] Unknown:
Well, thank you very much for taking the time today to join me and discuss your experiences building the Streamlit application. Definitely a very interesting tool and one that I am excited to start playing around with. So thank you for all of your efforts on that front, and I hope you enjoy the rest of your day. Yeah. Thank you. That would be great. Thank you.
[00:48:19] Unknown:
Thank you for listening. Don't forget to check out our other show, the Data Engineering Podcast, at dataengineeringpodcast.com for the latest on modern data management. And visit the site at pythonpodcast.com to subscribe to the show, sign up for the mailing list, and read the show notes. And if you've learned something or tried out a project from the show, then tell us about it. Email host@podcastinit.com with your story.
[00:48:42] Unknown:
To help other people find the show, please leave a review on iTunes and tell your friends and coworkers.
Introduction and Sponsor Messages
Interview with Adrien Treuille: Introduction and Background
Streamlit: Origin and Purpose
Streamlit Features and Use Cases
Challenges and API Design
Building and Deploying Streamlit Apps
User Feedback and Adjacent Technologies
Technical Challenges and Lessons Learned
Future of Streamlit and Community Involvement
Closing Remarks and Picks