Summary
Starting a new project is always exciting and full of possibility, until you have to set up all of the repetitive boilerplate. Fortunately there are useful project templates that eliminate that drudgery. PyScaffold goes above and beyond simple template repositories, and gives you a toolkit for different application types that are packed with best practices to make your life easier. In this episode Florian Wilhelm shares the story behind PyScaffold, how the templates are designed to reduce friction when getting a new project off the ground, and how you can extend it to suit your needs. Stop wasting time with boring boilerplate and get straight to the fun part with PyScaffold!
Announcements
- Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great!
- When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. And now you can launch a managed MySQL, Postgres, or Mongo database cluster in minutes to keep your critical data safe with automated backups and failover. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show!
- Your host as usual is Tobias Macey and today I’m interviewing Florian Wilhelm about PyScaffold, a Python project template generator with batteries included
Interview
- Introductions
- How did you get introduced to Python?
- Can you describe what PyScaffold is and the story behind it?
- What is the main goal of the project?
- There are a huge number of templates and starter projects available (both in Python and other languages). What are the aspects of PyScaffold that might encourage someone to adopt it?
- What are the different types/categories of applications that you are focused on supporting with the scaffolding?
- For each category, what is your selection process for which dependencies to include?
- How do you approach the work of keeping the various components up to date with community "best practices"?
- Can you describe how PyScaffold is implemented?
- How have the design and goals of the project changed since you first started it?
- What is the user experience for someone bootstrapping a project with PyScaffold?
- How can you adapt an existing project into the structure of a PyScaffold template?
- Are there any facilities for updating a project started with PyScaffold to include patches/changes in the source template?
- What are the most interesting, innovative, or unexpected ways that you have seen PyScaffold used?
- What are the most interesting, unexpected, or challenging lessons that you have learned while working on PyScaffold?
- When is PyScaffold the wrong choice?
- What do you have planned for the future of PyScaffold?
Keep In Touch
- Website
- FlorianWilhelm on GitHub
- @florianwilhelm on Twitter
Picks
- Tobias
- Daredevil TV series
- Florian
Closing Announcements
- Thank you for listening! Don’t forget to check out our other shows. The Data Engineering Podcast covers the latest on modern data management. The Machine Learning Podcast helps you go from idea to production with machine learning.
- Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
- If you’ve learned something or tried out a project from the show then tell us about it! Email hosts@podcastinit.com with your story.
- To help other people find the show please leave a review on iTunes and tell your friends and co-workers
Links
- PyScaffold
- Innovex
- SAP
- Cookiecutter
- Pytest
- Sphinx
- pre-commit
- Black
- Flake8
- Poetry
- Setuptools
- mkdocs
- ReStructured Text
- Markdown
- Setuptools-SCM
- Hatch
- Flit
- Versioneer
- Gource git visualization
- MyPy Compiler
- Rust Cargo
The intro and outro music is from Requiem for a Fish by The Freak Fandango Orchestra / CC BY-SA
Hello, and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you're ready to launch your next app or want to try a project you hear about on the show, you'll need somewhere to deploy it. So check out our friends over at Linode. With their managed Kubernetes platform, it's easy to get started with the next generation of deployment and scaling powered by the battle tested Linode platform, including simple pricing, node balancers, 40 gigabit networking, and dedicated CPU and GPU instances. And now you can launch a managed MySQL, Postgres, or Mongo database cluster in minutes to keep your critical data safe with automated backups and failover.
Go to pythonpodcast.com/linode today to get a $100 credit to try out their new database service, and don't forget to thank them for their continued support of this show. Your host, as usual, is Tobias Macey. And today, I'm interviewing Florian Wilhelm about PyScaffold, a Python project template generator with batteries included. So, Florian, can you start by introducing yourself?
[00:01:08] Unknown:
Yeah. Sure. Thanks for having me. So, yeah, my name is Florian. I'm head of data science at Innovex. We are an IT project house. I got a PhD in numerical mathematics, and then I started in the field of data science. And throughout the whole time, even also in my studies, I was using Python. And, yeah, it's cool to be here at this podcast.
[00:01:35] Unknown:
And do you remember how you first got started working with Python?
[00:01:38] Unknown:
Yeah. Actually, it was even during my school times, if I remember correctly. I was doing an internship at SAP, which is, like, one of the largest software companies in Germany. And there, I had to learn Perl for scripting. It was Perl 5.4, I guess. And back then, I was quite excited about Perl, but one of the colleagues told me that I should definitely also look into Python. Yeah. So he convinced me to take a look into Python, and I pretty much directly loved it and preferred it to Perl because it was way more readable and easier. And this is what got me started.
[00:02:21] Unknown:
One of my favorite jokes about Perl is that most programming languages are write once, read many, and Perl is write once, read never.
[00:02:29] Unknown:
Yeah. Yeah. Yeah. It's exactly what they say about Perl. Yeah.
[00:02:32] Unknown:
And so in terms of the PyScaffold project, can you describe a bit about what it is and some of the motivation that you had for building it and some of the overall goals that you had when you started it?
[00:02:45] Unknown:
Yeah. Sure. So when I was doing my PhD in numerical mathematics, we also started to use Python as a kind of glue code for all our numerics code in C++ and Fortran. And in this research group, I was the guy who was really excited about Python and told everyone about Python, and people started using it. But then everyone was more like scripting, and I wanted to kind of help everyone to write proper packages so that we could share our software components. And the first thing I did, I was writing a huge wiki page. I think it was in Redmine or whatever kind of wiki there was at the time. So I was writing, like, a whole documentation about, yeah, have this setup.py, and you should put this in there. And don't forget to add this, and this is how you should do it.
And when I later joined the first company after my PhD, I was also kind of taking this wiki with me, all the knowledge I had gathered. And also in the first company, at Blue Yonder, we started to use Python for C++ code. And then I realized after a while, okay, no one is actually reading my documentation anymore because it has grown to, like, 20, 30 pages. And I thought, okay, why not just create some project generator for helping people to apply all the best practices, everything from the docs, really easily? Because, yeah, the actual barrier for them was reading the docs, and having some kind of project generator that bootstraps your code so you can easily start using it is way easier to just use.
And this is basically when it started; it was back in 2014. Blue Yonder at the time was the company I was working for where I had this idea. I pitched it to my boss. He gave me some time to work on this and also allowed it to be open sourced, which I'm really grateful for. And, yeah, this is how it all got started. And then, I mean, now it's 8 years ago, it started to grow and constantly evolve. And, yeah, the rest is history.
[00:05:12] Unknown:
The context of the fact that you started it 8 years ago is interesting because if you were to go today and look for, I just need a template for X project, there are all kinds of different options out there. There are different Cookiecutter templates. Some of them are one-off. There are some kind of generalized options. And I'm wondering, at the time that you started the project, what the kind of available landscape looked like for somebody who just wanted to say, I just want a template that gives me a Python package that has all of the basic setup that I need so that I don't have to read through, you know, reams of documentation to figure out which configuration parameters to put where and what the, you know, the package architecture needs to look like?
[00:05:52] Unknown:
Yeah. So back at the time, I think it was also about the time when Cookiecutter started, actually. And they also have a Python template. But at the time, I was not really convinced, so I had different ideas about how to tackle this problem. And yep. So, basically, the main idea was this: Cookiecutter you can basically use for everything. Right? So you can roll your own templates. It's really easy to configure your own templates, but you still have to have the kind of knowledge. Right? Or if you look on the web page now, you have, like, tons of different templates, and somehow you need to decide. So what is the right template for me and what are the pros and cons of each template? And still, you have to kind of read a little bit in the documentation what it is really for and how you should use it.
And the idea with PyScaffold is more or less that you have something really opinionated. That's kind of a best practice approach that comes with everything that you need to get started developing your own package really easily. And then there are, like, extensions that kind of deviate, that change the basis a little bit, but it's not a completely new template. It's a different approach. So there's this core template, and you can alter it a little bit, which is in contrast to Cookiecutter, where you can have, like, completely different templates that are completely independent of each other. And I like the PyScaffold approach more. So the mission, the goal, always was to have, like, this one go-to tool for beginners of packaging that you can just, you know, you install PyScaffold, you run putup my_project, and you already have a really nice scaffold there that comes with everything you need.
You can easily build your package. You can upload it to PyPI. You can run tox for your unit tests using pytest. You can build your documentation using Sphinx. You have pre-commit, for instance, configured to have, like, Black and Flake8, and so on. And everything, like, is included. And, also, you have this nice documentation that we built over the years. So the web page explains what everything is and what everything does, and then you can kind of start step by step, like, using all those best practices. And I think this helps people a lot because if they weren't using PyScaffold, they would have to make a lot of decisions.
So what they should use, what kind of build system. I mean, nowadays, you can also use Poetry instead of setuptools. You can use PDM, then there's Hatch. So there are many different tools, and also for documentation there are different tools, and then how to configure those. I think you can easily spend more than one week, like, just configuring your package and your whole development setup. But in the end, what you wanna do is to develop, to work on your project, your idea. And so PyScaffold helps you to get really fast to the point where you can start just hacking away on your idea, and everything else is already covered with sane defaults in PyScaffold. So this is our main goal.
[00:09:23] Unknown:
As you mentioned, there are a large and growing number of options for any, you know, particular use case or any aspect of the kind of environment setup or the way that you structure your projects. And before we get too much into the details of which choices you've made where, I'm interested in discussing kind of the categories or types of applications that you're focused on supporting with PyScaffold and some of the ways that those areas of focus developed both from your own usage and some of the evolution of the project over time?
[00:09:57] Unknown:
One main focus: you can build a library with PyScaffold, and also some real Python application where you have console scripts and so on. So this is all supported. Then for specific applications, maybe thinking about, like, more data science or data oriented applications, we basically developed those extensions whenever we needed them. Since I'm in the field of data science, I wrote the dsproject extension for data science projects, and everyone can just contribute extensions whenever they want. But, yeah, we are completely open in this regard to whoever wants to contribute a new extension for some really specific use case. For instance, I saw last time I checked that someone had, like, an extension for Visual Studio Code together with a container configuration because that person seems to, yeah, develop with the help of Visual Studio Code and some containerized setup.
And, basically, everything's possible with the help of the extension system. And I think at that point, a huge shout out to Anderson Bravalheri, because he's right now the core maintainer of PyScaffold, and he introduced the extension system back then. So he changed a lot to make this possible, to have extensions easily available in PyScaffold.
[00:11:40] Unknown:
I'm definitely interested in digging into the extension system a bit. But before we get there, as far as the kind of decision making aspect of saying for this type of project, so for a Django project, these are the, you know, different basic plugins that you want to use. This is how we're going to default the project structure. You know, we're going to use Poetry or PDM or, you know, just curious if you can talk through what your process was for deciding on what constituted best practices, both across all of the templates that you use in PyScaffold and for kind of each specific use case for a given project.
[00:12:21] Unknown:
Yeah. So, of course, everything's kind of opinionated. So we had huge discussions with several people from the Python community, or sometimes it was also just, like, trying out different tools. And so PyScaffold right now has been using setuptools because it's, like, the big default way of building Python packages. And since setuptools has also evolved a lot, it's still used a lot and has all the features that are needed. So we looked at different tools, for instance, Poetry. But on the other hand, Poetry itself also comes with a kind of minimal project setup. And it also directly includes managing your whole virtual environments, and we wanted to have those separated so that you, for instance, can also use Anaconda, which is maybe a little bit more preferred in the field of data science for managing isolated environments.
And yeah. So, basically, we evaluated different ways and looked, of course, at a lot of other open source tools, how they are doing it, and this then helped us to kind of extract what we think are the best practices from different projects. And, of course, this evolves over time. For instance, when we started, like, in 2014, it was completely normal that you would write a setup.py. Right? Because at first, Python was like, yeah, we can configure everything with the power of Python itself. And then later on, the whole community realized, okay, but being declarative is actually much better. So we should give up some of the huge power of Python and go over to some configuration file. And at first, this was a setup.cfg.
And, yeah, we welcomed this and implemented this too in PyScaffold, and we are always keeping up with what setuptools does so that our users always have the current and most modern way of setting up setuptools. And now, I think since a few months, it's also possible to completely configure setuptools with the help of pyproject.toml. This is also gonna be, like, the next step, what we're gonna do. Then for documentation, I think, yeah, it was pretty clear from the start that Sphinx is the one go-to tool.
Right now, we are also evaluating MkDocs, since it seems like the community is growing more on the MkDocs side. When we started, most of the docs were written in reStructuredText. But nowadays, Markdown is much more preferred. So we first started this as an extension so that you could say, okay, I'd rather write it in Markdown. And now we are thinking about maybe just completely switching over so that your default project is directly all in Markdown. But these are just ideas because we kind of want a way to have it standardized. And for unit tests, I think, in the beginning, before pytest became really big, there were also a lot of other unit testing tools that we first checked out.
And, yeah, this is how we kind of look at what happens in the community. We also have, for instance, Anderson. He's, like, reading all those PEPs before going to sleep, it seems. So he knows all the PEPs and what we should implement next and so on. And then there's also the guy from setuptools-scm, who developed this tool. Setuptools-scm allows you to do semantic versioning and kind of bridges the gap between Git and the version strings of your Python package. So you can just tag a certain commit, and this version is then also used directly in the metadata of your project.
Yeah, this allows you to avoid any kind of conflict because you want this to be always consistent, right? What you have in Git and what kind of packages you are building. And he's also giving a lot of really valuable suggestions to PyScaffold. For instance, he suggested that we change to the src layout so that you don't have the actual Python package at the root of your project, but rather have a src directory with the package as a subdirectory of it, because this has a lot of advantages. And so this is basically how those best practices come about: we listen to different ideas. We discuss them, and then we look, okay, who else is doing this? And is this already kind of mainstream?
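As an illustration of the declarative approach described here, the following is a minimal sketch that reads package metadata from a setup.cfg-style text using only the standard library. The project name, version, and dependency are hypothetical, and this is not how setuptools itself processes the file; it only shows that declarative metadata can be read without executing any project code, which a setup.py cannot offer.

```python
import configparser

# Hypothetical setup.cfg content; all names and values are for illustration only.
SETUP_CFG = """
[metadata]
name = my_project
version = 0.1.0
description = Example package

[options]
install_requires =
    requests
"""


def read_metadata(text: str) -> dict:
    """Return the [metadata] section of a setup.cfg-style string as a dict."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return dict(parser["metadata"])


metadata = read_metadata(SETUP_CFG)
print(metadata["name"], metadata["version"])
```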
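The idea of deriving the package version from a Git tag can be sketched as follows. This is a simplified illustration and not the actual setuptools-scm implementation: it assumes tags of the form v1.2.3 and input in the style of `git describe --tags --long`, and the "next patch release as a dev build" scheme is only loosely modeled on what such tools typically produce.

```python
import re

# Matches strings such as "v1.2.3" or "v1.2.3-4-gabc1234"
# (tag, number of commits since the tag, abbreviated commit hash).
DESCRIBE_RE = re.compile(r"^v?(\d+)\.(\d+)\.(\d+)(?:-(\d+)-g([0-9a-f]+))?$")


def version_from_describe(describe: str) -> str:
    """Turn a git-describe-style string into a package version string."""
    match = DESCRIBE_RE.match(describe.strip())
    if match is None:
        raise ValueError(f"unrecognized describe output: {describe!r}")
    major, minor, patch, distance, node = match.groups()
    if distance is None or int(distance) == 0:
        # The commit is exactly the tagged one: use the tag as the version.
        return f"{major}.{minor}.{patch}"
    # Commits after the tag: guess the next patch release and mark it as a dev build.
    return f"{major}.{minor}.{int(patch) + 1}.dev{distance}+g{node}"


print(version_from_describe("v1.2.3"))
print(version_from_describe("v1.2.3-4-gabc1234"))
```

Because the version is computed from the repository state, there is no separate version string in the source tree that could drift away from the tag.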
And then we include it into PyScaffold.
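One of the advantages of the src layout mentioned above can be demonstrated with a small experiment. The sketch below builds two throwaway projects in temporary directories (the package name is hypothetical) and checks whether the package can be found when only the project root is on the import path, that is, without installing it. With a flat layout it can be, so tests may silently run against the working directory; with a src layout it cannot, which forces tests to run against the installed package.

```python
import importlib
import importlib.util
import sys
import tempfile
from pathlib import Path


def importable_from_root(layout: str) -> bool:
    """Create a throwaway project and report whether its package is found
    with only the project root on sys.path (i.e. without installing it)."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        pkg_parent = root / "src" if layout == "src" else root
        pkg_dir = pkg_parent / "demo_pkg_layout"  # hypothetical package name
        pkg_dir.mkdir(parents=True)
        (pkg_dir / "__init__.py").write_text("")
        sys.path.insert(0, str(root))
        try:
            importlib.invalidate_caches()
            # find_spec only locates the package; it does not import it.
            return importlib.util.find_spec("demo_pkg_layout") is not None
        finally:
            sys.path.remove(str(root))


print("flat layout importable from root:", importable_from_root("flat"))
print("src layout importable from root:", importable_from_root("src"))
```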
[00:17:38] Unknown:
The kind of determination of what is mainstream is also interesting because it also depends a bit on who you're looking to for kind of signals where some kind of areas of the community are focused on, you know, everything should be in Poetry, and we use pyproject.toml for everything. Other people are looking to some of the newer PEPs where I think PDM is one of the package managers that uses the directory based dependency management so that you can have sort of like an npm style where you just install your packages into the project directory. You don't have to worry about managing your virtual environment.
And I'm interested in your general kind of philosophy and approach for how you determine what your opinions are about how you want things to be done and how you balance that against the tendency for a project as it goes on to just add more and more configuration flags so that by the end of it, you might as well just do all the work yourself from scratch rather than rely on the, you know, so called opinionated approach because the opinions become diluted because there are so many options.
[00:18:46] Unknown:
Yeah. If we were actually now to switch setuptools for something else, then we would maybe over some time have a kind of, yeah, an extra flag to go back to setuptools. But then we kind of would drop setuptools if this ever happens. So we are still in discussion, and setuptools for me is still a good way to build your Python packages. But the idea is not to just add more and more and more, as you said, but to kind of replace them after a while, like, one thing with the other, because you don't need two build tools, actually. So you have to decide on a single one anyway. And we wanna be a tool that you can just run, you run putup my_project, and then someone makes the decision for you and gives you the default one where we think it's a good choice for you. And, I mean, right now, there are all those different things, build systems and also environment managers like PDM, Hatch, Poetry.
There are so many. And right now, it's really hard to say who's gonna win in the end. And this is where we just kind of take our time and wait. And, also, actually, right now, PyScaffold doesn't really come with a way of managing your virtual environment. So it's completely agnostic, which I think is good. So you can do whatever you want. You can even use pipenv if you wanna use this, or you can use Anaconda, whatever you want, because we are concentrating on this single thing, which is giving you a nice Python project setup. So it's not about managing your environments.
There, you can still choose whatever you want. Maybe Poetry, of course, would be a little bit difficult because, I mean, we use setuptools. So you would have to replace some things then, yeah. But in the end, after we have done the switch to pyproject.toml, it should hopefully be only, like, one line in the beginning of pyproject.toml, where you define your build system. And then you could even use Poetry. So, actually, if you wanna develop a Python library or application, you should not care that much about the build system in the best case because this should be completely transparent for you. You should just know the command for building the wheel and not have to decide, okay, which one should I rather go for. There's also Flit.
There are so many different tools, and some implement those PEPs. There's also Pyflow or something that only comes with this new way of the __pypackages__ directory, so this way of not having isolated environments at all. And yeah, so this is still something where we kind of say, okay, we are agnostic. Who knows if we are later gonna adopt something or not? So I would rather actually stick to the UNIX philosophy, which is do one thing and do it really well. And a project setup should, in the best case, have nothing to do with how you manage your virtual environment. Right? It should be kind of a to
[00:22:21] Unknown:
it. Absolutely. You should just build the wheel, not reinvent it.
[00:22:25] Unknown:
Yeah. Absolutely.
[00:22:28] Unknown:
And then in terms of the actual implementation of PyScaffold, I'm wondering if you can talk to some of the ways that the project is designed and some of the choices that you've had to make in terms of how it is managed and structured and the evolution that it has gone through as the overall ecosystem and the adoption of best practices has changed?
[00:22:50] Unknown:
Yes. So from an implementation perspective, we always try to keep things as simple as possible. Right? So we also have nice documentation about this for contributors, for beginners who want to contribute to PyScaffold. And, yeah, basically, we have a module for the command line interface, then we have a module for our internal APIs, so how to create a project. And then the base principle is the structure of your project, which is a kind of nested dictionary with the files and the content of the files. And this is put into a pipeline called the action pipeline. Each step of this action pipeline can be: create a file, or check the consistency of flags, or make sure that Git is there or initialized.
Each step is then applied to this structure. And in the end, the structure at the end of the pipeline is then kind of deployed on your hard drive. So this is the basic idea. And this also gives the flexibility. So if your extension introduces another flag, then you have access to that flag. And, basically, you define a new action in this pipeline. You can say it should be run after another step in this pipeline, and this gives you the whole flexibility to change the project structure at that point however you want. And this is basically the whole magic. So I think it's kind of really simple to understand what's going on.
And, yeah, throughout the time, a lot changed. So when we started, we wanted to have this versioning feature in there. I think it was way before setuptools-scm was actually invented, so we were using Versioneer for that. And, also, we wanted to have declarative configuration options. So there was a time when we used this Python Build Reasonableness, so it's from OpenStack, PBR. So we were using this, and we were vendorizing those packages. And, yeah, later on, we decided to rather use, like, real dependencies for this. So the reason for vendorizing it was that we were, at that time, a kind of development requirement.
So we tried not to have that many dependencies. So we would just package everything into PyScaffold. So this changed in version 4. So there were a lot of kind of architecture changes throughout the time. And I think the biggest one was the change to have an extension system that Anderson introduced. And he also helped to restructure and reorganize the source code a lot, which helped us to maintain this over so many years. Because, I mean, since I started it, 8 years is already kind of long for a Python project. Right? So much happened even to the Python language. There was also the switch from Python 2 to Python 3. And in the beginning, we wanted to support this. So we also had six in the templates to help people develop a package that runs on Python 2 and 3 at the same time.
This then also got dropped after a while because nowadays, you basically don't care about Python 2 anymore. Right? So this is how we evolved it. And for version 4, we also took the time to just see what happened over those 8 years. So I don't know if you know Gource, if it's pronounced like this. This is a visualization tool for Git where you just put in your whole Git repo, and then it creates some nice visualization. We put the PyScaffold repo in this tool, and I think it's a 4 minute video where you can easily see how much has changed over those 8 years.
Really, a lot has changed over those 8 years. And sometimes it was a real challenge to keep up with all the changes and all the PEPs. And sometimes it's also hard to say no to people for kind of suggesting new features and saying, okay, please, could we also have this and that? And then you think about, okay, it's not only implementing it. It's also maintaining it over many years, and removing something from a package is much harder than actually adding something. So you also always have to keep this kind of balance between not adding too many features because you have to maintain everything.
On the other hand, you also wanna be, like, welcoming to new contributors and also allow them to, yeah, implement their ideas and help improve PyScaffold.
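The pipeline idea described above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not PyScaffold's actual API: the function names, the `register_after` helper, and the option keys are all hypothetical. It only shows the technique of passing a nested dictionary through a list of actions and writing the final result to disk.

```python
import tempfile
from pathlib import Path


# An action takes the project structure and the options and returns both,
# possibly changed. All names below are hypothetical.
def define_structure(struct, opts):
    name = opts["name"]
    files = {"README.md": f"# {name}\n", "src": {name: {"__init__.py": ""}}}
    return {**struct, **files}, opts


def add_license(struct, opts):  # an action that an extension might contribute
    return {**struct, "LICENSE.txt": opts.get("license", "MIT")}, opts


def register_after(pipeline, new_action, after):
    """Return a new pipeline with new_action placed right after `after`."""
    index = pipeline.index(after) + 1
    return pipeline[:index] + [new_action] + pipeline[index:]


def run(pipeline, opts):
    """Apply every action in order, starting from an empty structure."""
    struct = {}
    for action in pipeline:
        struct, opts = action(struct, opts)
    return struct


def write_structure(struct, root: Path):
    """Write the nested dict to disk: dicts become folders, strings become files."""
    for name, content in struct.items():
        path = root / name
        if isinstance(content, dict):
            path.mkdir(parents=True, exist_ok=True)
            write_structure(content, path)
        else:
            path.write_text(content)


pipeline = register_after([define_structure], add_license, after=define_structure)
structure = run(pipeline, {"name": "my_project"})

with tempfile.TemporaryDirectory() as tmp:
    write_structure(structure, Path(tmp))
    created = sorted(p.relative_to(tmp).as_posix() for p in Path(tmp).rglob("*"))
print(created)
```

Keeping the structure in memory until the very end means an extension can add, change, or remove files before anything touches the disk.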
[00:27:58] Unknown:
Another aspect of the work that you're doing is that by being opinionated, you have to make your own determination about what constitutes best practices. And as you mentioned, that means staying up to date with the different conversations that are happening, the PEPs that are being adopted, the changes to the Python language and ecosystem. And in your work of doing that, I'm curious what you see as being the kind of noisiest or most volatile elements of what constitutes best practices and what are the pieces that are largely stable or widely agreed on. So I know that Black, for instance, is generally accepted as a good choice. Obviously, there are people who have their own opinions who dissent, but it's fairly uncontroversial versus the question of kind of environment management that you were discussing earlier of Poetry versus pipenv versus PDM, et cetera.
I'm wondering what you see as kind of the, you know, most stable versus most volatile elements of that ecosystem.
[00:28:57] Unknown:
So definitely something like Plaque and Flake 8. And, also, I would say, pre commit. So also having this end of line fixer and white space and everything that helps you kind of have a certain standardized style in your Python project. There is, I think, a lot of agreement that those tools make sense. Also, for unit tests, pytest is still the go to tool, I would say, and also talks for running tests. For a long time with regard to documentation, it used to be strings. But, yeah, as we mentioned before, people also use MK Docs. And what I like about MK Docs that it also has a declarative way of configuration.
And Sphinx, on the other hand, has this const dot py where you also use Python for configuration. So then more and more people are adapting type hinting. So this is still, like, not 100% used, but there's a lot of agreement that people really like it. And, I mean, if you used Python docstrings to document your source code, you would anyway write if it's a list of int or a list of characters or whatever, a list of strings. And, yeah, having this now directly in the parameters of your function and the return value, I mean, it's it's much easier to read, and you have many, many benefits for instance, like static type checking. And there's also tools that even compile your Python codes, like the mypy compiler and so on. So there's a lot of things going on in this regard. So, yeah, mypy is definitely something I would say or it doesn't have to be, MyPy, I mean, also other tools.
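The contrast between documenting types in a docstring and writing them as type hints can be shown with a small, hypothetical example. Both functions behave identically at runtime; only the second exposes its types in a form that a static checker such as mypy can verify. The `list[float]` syntax assumes Python 3.9 or newer.

```python
def mean_docstring(values):
    """Return the arithmetic mean.

    :param values: list of float
    :return: float
    """
    return sum(values) / len(values)


def mean_typed(values: list[float]) -> float:
    """Return the arithmetic mean."""
    return sum(values) / len(values)


print(mean_typed([1.0, 2.0, 3.0]))
# The annotations are available to tools at runtime and to static checkers.
print(mean_typed.__annotations__)
```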
But type hinting, I would say, is yeah. Many people agree with it, and this is secure to stay. We also wanna support this more in high scaffold. Then when it comes to the build system, this is, like, really hard to say what's gonna happen there in the future. So I actually hope that setup tools is gonna be the winner because, I mean, it also does this 1 thing, and it does it also well. It has a kind of legacy code base. So this is maybe why people have their problems with it. And sometimes yeah. It has some really rough edges, so to say. So I had a lot of fun trying to understand what's, going on in setup tools, but I think it's improving a lot. So the Python package authority, the pipe PyPA I mean, they also have FLIT, which is like a light wide weight version of setup tool, so to say. But it also is missing a lot of features, like you can only have pure Python packages, so no compiled code.
You don't have an extension system. For instance, what I explained before, this having semantic versioning directly from Git in your metadata of your Python package is not possible with Flit, which I think is a is a downside. So I'm not a huge fan of it, I must say, but there are many others. I haven't really evaluated so much PDM right now. And this new PEP with the local environments, with the Py packages. But then again, there's also Poetry and Hatch. Hatch is is definitely cool, I would say, because it already comes with a lot of features. But this is, I would say, the part where there's the most fluctuations, where it's not really sure what it's gonna happen. And I think, on 1 hand, it's kind of nice that people are inventing new stuff. But on the other hand, it's also a huge distraction because in the best way, you just you wanna have this 1 tool that you just use, and it does whatever you want. And you develop, like, other languages like Rust, for instance, come with cargo, where you have this 1 go tool how to handle everything.
And it's actually a pity that Python had over, like, almost decades now so much pain with handling your dependency, with installing things, and so on. And, yeah, with respect to other things like, I would say, the the source layout of a Python package is also, yeah, many benefits of it so that you, for instance, cannot accidentally test something completely different because you import the package directly, not whatever you installed. So this, I think, is also kind of yeah. It makes sense. And most people agree that you should set up your project like this. Also, that you have should have a license file or, file called contributing and a readme.markdownor.rst.
So the the over and and that your docs folders should be named docs, and you have a test folder. So certain things are already kind of, yeah, pretty defined. And I don't think that the overall, like, structure from a directory perspective is gonna change a lot. But, of course, you had some tools, yeah, some tools, especially the built systems. I think they they built surely changed a lot. And, yeah, let's see who's gonna win in the end.
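As a small illustration of the point about type hints, here is a hedged sketch. The function name `average_length` is made up for this example and does not come from PyScaffold. It shows the same type information once written in a docstring and once as annotations that a checker such as mypy could verify:

```python
def average_length_documented(words):
    """Return the mean length of the given words.

    :param words: list of str
    :return: float
    """
    # The types live only in the docstring, so no tool can check them.
    return sum(len(word) for word in words) / len(words)


def average_length(words: list[str]) -> float:
    """Return the mean length of the given words."""
    # The types are part of the signature, so a static checker can use them.
    return sum(len(word) for word in words) / len(words)
```

Both functions behave the same at runtime; the annotated version only adds information that tools and readers can pick up without reading the docstring.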
[00:34:55] Unknown:
In terms of the extension system that you mentioned earlier, I'm wondering if you can talk to some of the ways that that enables people to use PyScaffold as a utility and build their own templates, and maybe some details on sort of how the templates are implemented as compared to maybe a Cookiecutter template.
[00:35:16] Unknown:
The templates of PyScaffold are just Python packages themselves, so you can just pip install them. And, yeah, they're using the Python hooking system, so the entry points basically, so that PyScaffold can discover those extensions. And they are discovered, and then you define your flags and your actions. And then you can also define where in the pipeline your action is gonna be placed. So it's really not that hard to write your own extension. And we also wrote an extension to write PyScaffold extensions. So you can just pip install pyscaffoldext-custom-extension, and then you have a special flag. And then you get a project setup that already includes all the necessary subfolders and has some stubs, not with pseudocode, but with, like, all the boilerplate code that you need to write your own PyScaffold extension.
And, of course, also the documentation, I think, at that point is really great. So it's not that hard to write your own extension. What Cookiecutter does is clone some GitHub repo, which uses Jinja or Jinja2 to define the overall structure, and then there's some substitutions going on. So I think this is, like, really different to how we do it. So I must say that, of course, it's easier. So if you are already using Jinja a lot, then it's easier to write your own Cookiecutter template. But on the other hand, I mean, for a Cookiecutter template, you would then need to decide first, okay, which of the many templates out there you wanna kind of clone, and then do your changes.
And in case of PyScaffold, you always know that you have this core setup that is also gonna evolve over time. And then your extension just defines how you differ from this core definition of your project. And so this also means that maintaining your extension is way easier with PyScaffold. And it's a different thinking in this regard, that you only define the changes and not, like, I copy everything and modify in place.
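The idea of a core pipeline of actions, with an extension that only describes its difference from the core, can be sketched roughly as follows. This is not PyScaffold's actual API: every name here (`define_core_structure`, `add_notebooks_folder`, `insert_after`, `run`) is hypothetical and only illustrates the "define the changes, don't copy everything" approach described above.

```python
from typing import Callable

Struct = dict[str, str]  # maps a file path to its content
Opts = dict[str, str]    # project options such as the project name
Action = Callable[[Struct, Opts], tuple[Struct, Opts]]


def define_core_structure(struct: Struct, opts: Opts) -> tuple[Struct, Opts]:
    """Core action: files that every generated project receives."""
    core = {
        "README.rst": f"{opts['name']}\n",
        f"src/{opts['name']}/__init__.py": "",
        "tests/conftest.py": "",
    }
    return {**struct, **core}, opts


def add_license(struct: Struct, opts: Opts) -> tuple[Struct, Opts]:
    """Core action: add a placeholder license file."""
    return {**struct, "LICENSE.txt": "MIT\n"}, opts


def add_notebooks_folder(struct: Struct, opts: Opts) -> tuple[Struct, Opts]:
    """Extension action: the only thing this extension changes."""
    return {**struct, "notebooks/.gitkeep": ""}, opts


def insert_after(pipeline: list[Action], anchor: Action, new: Action) -> list[Action]:
    """Return a new pipeline with `new` placed right after `anchor`."""
    index = pipeline.index(anchor) + 1
    return pipeline[:index] + [new] + pipeline[index:]


def run(pipeline: list[Action], opts: Opts) -> Struct:
    """Apply every action in order and return the resulting file structure."""
    struct: Struct = {}
    for action in pipeline:
        struct, opts = action(struct, opts)
    return struct


CORE_PIPELINE: list[Action] = [define_core_structure, add_license]
```

An extension would then call something like `insert_after(CORE_PIPELINE, define_core_structure, add_notebooks_folder)` and leave the rest of the core untouched, so changes to the core actions are picked up without editing the extension.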
[00:38:04] Unknown:
For somebody who's getting started, you mentioned that you have this putup command to generate the scaffolds. For somebody who maybe already has an existing project and wants to migrate to adopt some of the opinions that you have encoded in PyScaffold, or for somebody who started a project in PyScaffold, but the underlying template has evolved, what are the options for being able to incorporate those changes into that existing project?
[00:38:32] Unknown:
We have a migration documentation about this. What you can basically do, it's, again, really simple. So just make sure that your current project already uses Git. I mean, this is, for instance, also something which is here to stay. So everyone uses Git. Right? There's no doubt about it anymore. So almost everyone agrees on this. This is also a requirement for PyScaffold. So let's assume that in your current project, you're using Git. Just make sure that you have a clean repository, everything's checked in. And then you just go to the parent directory and type putup --force and the name of your project.
And then PyScaffold is kind of overwriting your current project setup with its own files. And then you can just use git mergetool and git diff to kind of change whatever you need to change. So it depends: if you already have a kind of similar project setup, of course, there won't be that many changes. Yeah. If it's completely different, for instance, if you don't have the src layout, there are some steps you can do before, but everything is written in the documentation. Of course, a process like this cannot really be automated because we never know what your project might look like. But I would say the process shouldn't cost you more than, let's say, 2 or 3 hours depending on how large your project is.
And, yeah, it's kind of easy.
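The migration steps described above (clean Git repository, then `putup --force` from the parent directory) could be wrapped in a small helper like the following. This is only a sketch under assumptions: the helper names `is_clean`, `putup_force_command`, and `migrate` are invented here, and the manual review with `git diff` afterwards is still up to you.

```python
import subprocess
from pathlib import Path


def is_clean(porcelain_output: str) -> bool:
    """True if `git status --porcelain` reported no pending changes."""
    return porcelain_output.strip() == ""


def putup_force_command(project_dir: Path) -> list[str]:
    """Build the command that is run from the parent directory."""
    return ["putup", "--force", project_dir.resolve().name]


def migrate(project_dir: Path) -> None:
    """Refuse to run on a dirty repository, then let putup overwrite files."""
    project_dir = project_dir.resolve()
    status = subprocess.run(
        ["git", "status", "--porcelain"],
        cwd=project_dir,
        capture_output=True,
        text=True,
        check=True,
    )
    if not is_clean(status.stdout):
        raise RuntimeError("Commit or stash your changes before migrating.")
    # putup is invoked in the parent directory with the project name.
    subprocess.run(putup_force_command(project_dir), cwd=project_dir.parent, check=True)
```

Calling `migrate(Path("my_project"))` would then leave the overwritten files as uncommitted changes, ready to be compared against the last commit.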
[00:40:12] Unknown:
In your experience of building PyScaffold and using it and kind of evolving the project and now handing it off to a new maintainer, I'm wondering what your overall approach to community engagement has been, as far as do you have any maybe voting process for deciding what are the right opinions to have, or sort of who to listen to, and what your approach has been for kind of bringing on your new maintainer and handing over the reins, because that's one of the kind of least formalized or least understood approaches in open source, as far as, like, how do you, as somebody who just starts a project because it's a problem that you want to solve, then say, okay, this is how I am going to manage the kind of ownership and life cycle of this project once I have moved on to other things?
[00:41:05] Unknown:
That's a hard question. So I also, yeah, I'm really open to all kinds of ideas and suggestions how to do this. Our current approach, and I think it worked quite well, is, first of all, always be welcoming to new contributors and trust new people always. So you should not be too hesitant. And, yeah, listen to their ideas, kind of then also challenge their ideas, but in a nice way, and discuss new ideas first before you get some huge pull requests that change, like, 80% of your code base. This can be formalized. So we have a lot of documentation about, as I said before, what our code looks like, also how to do your first pull request. So, really, like, from the beginning, like, you click the clone button in GitHub and so on.
So everything is included, and this helps new people making their first pull request. And we also had pull requests just fixing typos, which I really liked. And I find it really important because, of course, with writing open source code, in most cases nobody is paying for it. Right? So whenever someone wants to contribute, they are kind of giving you their spare time in most cases. Right? And I think this should always be appreciated. So if someone takes the time to fix our typos in PyScaffold, we are really grateful for this. And I think it's important to show this in a community, that you write, hey, thanks a lot.
We also mention those as authors, as contributors to PyScaffold, and so on. So this is really important, to have this, like, welcoming way for new contributors and always also explaining things when there are questions, of course. Also important is that when someone tries to make a first pull request, not to just think, okay, I can now implement whatever they ask for myself in maybe 2 hours because I know the code base, but rather to say, okay, you could implement it like this, and why don't you make a pull request for it, so that people kind of get motivated. Okay, there's someone helping, and I write a small, maybe new unit test or whatever. And they fix a little problem, and then they have their first pull request. And this gets many people motivated then to go on and to maybe write a huge feature for PyScaffold and so on.
And then there's always, like, trust. For instance, I trust Anderson, like, 100%. And I've known him now for many years. And, of course, if he has an opinion on something, this opinion is, like, really valuable. And this is also why I said, like, okay, I wanna also go on to other projects. So still, I want to contribute. But since you have been doing, like, the most work, and the greatest features in the last years you have implemented, you should be the core maintainer of PyScaffold. And this is, like, the usual kind of, if you call it, ring of trust, where you have certain people, you've known them for many years, and they have done an awesome job. And, of course, you trust them more in contrast to someone coming in and saying, okay, I wanna completely change the goal of PyScaffold.
Still, in this case, one has to, like, really also, again, appreciate that they have cool ideas, but, in a nice way, say no. And this, like, learning to say no in an open source project for me was the hardest thing because you have motivated people, and they say, okay, let's add this, and here's the pull request, and I've already put, like, maybe days of work into my pull request, so please accept it. And then saying, no, this changes the whole project into a direction that we kind of don't wanna go, it's the hardest part, but I think it's also something you have to do as an open source maintainer because, otherwise, you're kind of losing the goal of your project. So yeah. But I think there's a lot to learn still for us and for me how to do this. And as you said, I think maybe there's no best way how to work with the open source community.
Yeah. But, again, some best practices, I assume.
[00:45:54] Unknown:
In your work of building and using the project and engaging with the community, what are some of the most interesting or innovative or unexpected ways that you've seen PyScaffold used?
[00:46:03] Unknown:
I really like seeing that people are using the custom extensions, building their own, like, really specific extensions for their certain needs, like, something like on the b project and this Visual Studio Code. And someone also added an interactive shell for PyScaffold. So beforehand, everything was just that you define the configuration flags for PyScaffold. So one person, he built this interactive way, which I found really surprising because I actually never needed it. But we realized then, due to him and also that other people were using this project that he built, that we should have this as a core feature. Yeah. But other than that, yeah, I think, yeah, most people are using PyScaffold as it is intended, and we are always happy to hear when people are using it.
And yeah. So if you're using PyScaffold so to the listeners, yeah, let us know about it. Just use the GitHub discussions and tell us about it. We are really happy to hear what kind of projects you are using PyScaffold for, yeah, if it helps you or how it can be improved.
[00:47:28] Unknown:
In your work of building the project and growing it and now adding a new maintainer, what are some of the most interesting or challenging lessons that you learned in the process?
[00:47:40] Unknown:
For me, this was my first open source project. All the community aspects that we discussed before, this was really challenging. And we have a few core contributors, but I would love to see a lot more people contributing to PyScaffold and building this up. So this was really challenging. Also, keeping up with all the PEPs, of course, and that Python itself is really changing so rapidly for the whole language. I rather thought it's gonna be, like, maybe in C++ or so, that you have, like, a new standard every 4 or 5 years or whatever so that things are not changing so quickly. But Python has such a huge speed. Also, the whole ecosystem is changing.
And this is definitely yeah. It has been really surprising to me. It's still a huge challenge to keep up with this.
[00:48:40] Unknown:
For people who are looking for ways to be able to quickly scaffold or bootstrap a project, or they're looking for an example of best practices for how to approach their application structure, what are the cases where PyScaffold is the wrong choice?
[00:48:56] Unknown:
I think it's the wrong choice if you wanna have an application that deeply integrates into your operating system, because then you would maybe rather go for some operating-system-specific packaging solution like a Debian package or Flatpak, Snap, whatever. But it's also the case for other Python packaging systems. And I think if you work, for instance, for a company, or for some reason the context you're working in demands a really specific setup, and you're kind of deviating too much from what PyScaffold is giving you, then it really might make sense to use, for instance, Cookiecutter instead and roll your own template.
And, yeah, this can be really a case where your own template, that you then also, of course, have to maintain, makes more sense. But if you kind of just wanna get going, then PyScaffold is the better choice.
[00:50:06] Unknown:
In terms of the near to medium term, what are some of the things that you and your co-maintainer have planned for, you know, new capabilities or new areas of use or just some of the projects that you're excited to dig into?
[00:50:20] Unknown:
In the next version, we definitely wanna go to, yeah, wanna adopt pyproject.toml completely because this is now possible with the new setuptools version. So setup.cfg is completely gonna disappear, and all the metadata of your project, everything's gonna be nicely in pyproject.toml. Also, we heard from the community that people rather want to have, like, one file with all the configuration in it instead of tons of small files, like a file only with the flake8 definitions and one only with your coverage settings and so on. And most of those tools now integrate with pyproject.toml, and we also wanna integrate this. So this is definitely the near term goal for PyScaffold, and also to have an update path for this next version.
We are evaluating right now, for instance, MkDocs, if this could be something, a switch to MkDocs. So this is rather an idea I'm having right now, but, yeah, there still is gonna be a lot of discussion. Yeah. Anderson is the core maintainer. I'm gonna discuss it with him, so it's not yet decided. Then there's more integration of mypy, to make it easier for people to then start with static type checking. So this would be something that's, like, midterm. And, also, we've really seen that people like Markdown much more than reStructuredText, and you can have it already in PyScaffold with the Markdown plugin, which is, for instance, also installed by default if you use the data science project plugin or extension.
I think this should be, like, already in the default core setup so that you have Markdown by default. So this is also rather near term. And then long term, yeah, I really don't know, with so much pace in Python's development. Let's see.
[00:52:36] Unknown:
Are there any other aspects of the PyScaffold project or the overall space of kind of managing the structure and best practices around the development of an application in Python that we didn't discuss yet that you'd like to cover before we close out the show?
[00:52:57] Unknown:
Yeah. Just like to mention also that if you want to contribute to PyScaffold and if you have new ideas or if you wanna evaluate, especially, tools like Hatch, PDM, and so on, and maybe have also a sister project or something that maybe works together with PyScaffold, or make suggestions how to use virtual environments and isolated environments, or wanna help us with adding documentation about this, I mean, this would be really valuable. Other than that, I think this whole Python packaging and structuring your project is really a huge rabbit hole. So if you start looking into this, you will find a lot of EuroPython and PyCon talks going back to 2000-something where people discuss the different ways of doing things. So it's actually better the less you care about this and the more you concentrate on your actual work or whatever you wanna do.
So I think I have no more suggestion that you should look into something. There are a few blog posts I, yeah, could maybe add to the references of this podcast. I think one important thing that I see, especially in the field of data science: people are using a lot of Jupyter and JupyterLab, and this brings its own challenges of how to integrate this in a nice way into larger projects. So people start with one notebook, and then they realize, okay, the code is growing and growing. And at this point, how to migrate this into a proper Python project? With the help of PyScaffold, yeah, it helps a lot. And we've also written one blog post about this.
This is definitely worth a read, I would say, for people in the field of data science because there, the, yeah, like, the software craftsmanship, so to say, yeah, there's still a lot of potential. Let's say it this way.
[00:55:11] Unknown:
Well, for anybody who wants to get in touch with you and follow along with the work that you're doing and contribute to PyScaffold, I'll have you add your preferred contact information to the show notes. And with that, I'll move us into the picks. This week, I'm going to choose the Daredevil TV series I just started watching. Originally, it was aired on Netflix, and now it's on Disney+. But definitely a very well put together show, a lot of, you know, interesting themes and interesting characters, so definitely recommend giving that a shot if you're into any of the superhero shows or genre. And with that, I'll pass it to you, Florian. Do you have any picks this week?
[00:55:46] Unknown:
Yes. So we started watching The Peripheral on Amazon Prime. It's from the creators of Westworld, which was already, like, a really cool TV show. I think there's season 4 also running right now. I haven't checked it out yet. But, yeah, it's the second episode we have seen, and it's really interesting, really cool show, kind of sci-fi, but also a little bit mystery. You really have to think what's going on. So if you are, yeah, a fan of Westworld, where you also don't know how things are, and it's a little bit mysterious. So it's a really cool show. So I'm gonna continue.
[00:56:28] Unknown:
Thank you again for taking the time today to join me and share the work that you've been doing on PyScaffold. It's definitely a very interesting project, and it's always great to have people contributing to helping to cement and share their thoughts on what constitutes best practices for how to structure your Python projects. I appreciate all the time and energy that you've been putting into helping save time and energy for everybody else who wants to just build something. So thank you again for your time, and I hope you enjoy the rest of your day. Yeah. Thank you. You too. And thanks for having me. It was really a pleasure. Thanks.
[00:57:04] Unknown:
Thank you for listening. Don't forget to check out our other shows, the Data Engineering Podcast, which covers the latest on modern data management, and the Machine Learning Podcast, which helps you go from idea to production with machine learning. Visit the site at pythonpodcast.com to subscribe to the show, sign up for the mailing list, and read the show notes. And if you learned something or tried out a project from the show, then tell us about it. Email hosts@pythonpodcast.com with your story. And to help other people find the show, please leave a review on Apple Podcasts and tell your friends and coworkers.
Introduction and Sponsor Message
Interview with Florian Wilhelm
Florian's Background and Introduction to Python
Motivation and Goals for PyScaffold
Evolution of Project Templates and Best Practices
Categories and Types of Applications Supported by PyScaffold
Decision-Making Process for Best Practices
Balancing Opinions and Configuration Options
Implementation and Evolution of PyScaffold
Stable vs. Volatile Elements in Python Ecosystem
Extension System and Custom Templates
Migrating Existing Projects to PyScaffold
Community Engagement and Project Maintenance
Interesting Uses and Contributions to PyScaffold
Challenges and Lessons Learned
When PyScaffold is the Wrong Choice
Future Plans for PyScaffold
Closing Remarks and Picks