Visit our site to listen to past episodes, support the show, join our community, and sign up for our mailing list.
Summary
Writing tests is important for the stability of our projects and our confidence when making changes. One issue that we must all contend with when crafting these tests is whether or not we are properly exercising all of the edge cases. Property-based testing is a method that attempts to find all of those edge cases by generating randomized inputs to your functions until a failing combination is found. This approach has been popularized by libraries such as QuickCheck in Haskell, but now Python has an offering in this space in the form of Hypothesis. This week, the creator and maintainer of Hypothesis, David MacIver, joins us to tell us about his work on it and how it works to improve our confidence in the stability of our code.
Brief Introduction
- Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
- Subscribe on iTunes, Stitcher, TuneIn or RSS
- Follow us on Twitter or Google+
- Give us feedback! Leave a review on iTunes, Tweet to us, send us an email or leave us a message on Google+
- Join our community! Visit discourse.pythonpodcast.com for your opportunity to find out about upcoming guests, suggest questions, and propose show ideas.
- I would like to thank everyone who has donated to the show. Your contributions help us make the show sustainable. For details on how to support the show you can visit our site at pythonpodcast.com
- Linode is sponsoring us this week. Check them out at linode.com/podcastinit and get a $20 credit to try out their fast and reliable Linux virtual servers for your next project
- Open Data Science Conference on May 21st-22nd in Boston. Use the discount code in the show notes for 20% off your tickets.
- Your hosts as usual are Tobias Macey and Chris Patti
- Today we are interviewing David MacIver about the Hypothesis project, which is an advanced QuickCheck implementation for Python.
Interview with David MacIver
- Introductions
- How did you get introduced to Python? – Chris
- Can you provide some background on what Quickcheck is and what inspired you to write an implementation in Python? – Tobias
- Are there any ways in which Hypothesis improves on the original design of Quickcheck? – Tobias
- Can you walk us through the execution of a simple Hypothesis test to give our listeners a better sense for what Hypothesis does? – Chris
- Have you had trouble getting people to use Hypothesis? How has adoption been? – David
- What does this sort of testing get you that conventional testing doesn’t? – David
- Why do you think this sort of testing hasn’t caught on in the Python world before? – David
- Are there any facilities of the Python language that make your job easier? Are there aspects of the language that make this style of testing more difficult? – Tobias
- What are some of the design challenges that you have been presented with while working on Hypothesis and how did you overcome them? – Tobias
- Given that testing is an important part of the development process for ensuring the reliability and correctness of the system under test, how do you make sure that Hypothesis doesn’t introduce uncertainty into this step? – Tobias
- Given the sophisticated nature of the internals of Hypothesis, do you find it difficult to attract contributors to the project? – Tobias
- A few months ago you went through some public burnout with regards to open source and Hypothesis in particular, but circumstances have brought you back to it with a more focused plan for making it sustainable. Can you provide some background and detail about your experiences and reasoning? – Tobias
- What’s next for Hypothesis? – Chris
Keep In Touch
Picks
- Tobias
- Chris
- David
The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA
Hello, and welcome to Podcast.__init__, the podcast about Python and the people who make it great. You can subscribe to our show on iTunes, Stitcher, TuneIn Radio, or add our RSS feed to your pod catcher of choice. You can also follow us on Twitter or Google Plus, and please give us feedback. You can leave a review on iTunes to help other people find the show, send us a tweet or an email, leave us a message on Google Plus, or comment on our show notes. And you can also join our community. Visit discourse.pythonpodcast.com for your opportunity to find out about upcoming guests, suggest questions, propose show ideas, and follow up with past guests.
I would like to thank everyone who has donated to the show. Your contributions help us make the show sustainable. For details on how to support the show, you can visit our site at pythonpodcast.com. Linode is sponsoring us this week. Check them out at linode.com/podcastinit and get a $20 credit to try out their fast and reliable Linux virtual servers for your next project. I'd also like to announce that the Open Data Science Conference is happening in Boston on May 21st and 22nd. If you look at our show notes, you can find a discount code to get 20% off your tickets. Your hosts as usual are Tobias Macey and Chris Patti. And today, we're interviewing David MacIver about the Hypothesis project, which is an advanced QuickCheck implementation for Python.
David, could you introduce yourself, please?
[00:01:26] Unknown:
So, hi. Yes. I'm David MacIver. I wrote Hypothesis, and that's probably most of what Python people know me for, as I was relatively unknown in the community beforehand. But previously, I tended to do sort of back-end data engineering stuff.
[00:01:40] Unknown:
I was just gonna ask, what kinds of data engineering work did you do? Is it mainly big data plumbing kind of thing or sort of database management? Just out of curiosity.
[00:01:52] Unknown:
It was a mix of things. So usually what happened is I got hired to do something interesting like recommendations or so on, and then discovered that the data pipeline for the system I was supposed to be working on was such a mess that I ended up being the one to fix it instead. Isn't that always the way? Unfortunately,
[00:02:07] Unknown:
yeah. So how did you get introduced to Python?
[00:02:10] Unknown:
So what actually happened was I got introduced to Python simply because I got a job writing it. I'd previously been writing Ruby and previously a bunch of other languages before that. But when I changed jobs, the company I was moving to was using Python. And so I figured I should probably learn it.
[00:02:30] Unknown:
So can you provide some background on what QuickCheck is and what inspired you to write an implementation in
[00:02:35] Unknown:
Python? Okay. So what QuickCheck is, at a high level, it's basically randomized testing. It comes from a programming language called Haskell. And in the original version, they thought about it less as a form of testing and more as a way of doing formal methods that was a lot less hard than doing formal methods. But the idea caught on, and it's been ported to a variety of languages with varying degrees of success. And particularly in the Hypothesis incarnation, it's really much more like conventional unit testing with some additional magic, rather than its formal methods incarnation. The story of why I prototyped it in Python is a little funny, in that originally I did it to learn Python.
It was basically just a little prototype that I wrote back in 2013, and I needed Python projects. I'd had good luck with QuickCheck in its ScalaCheck implementation beforehand, and there didn't seem to be anything very satisfying for Python. So I thought I would just have a play around and see how hard it would be. And the original prototype wasn't very good, but the problem is that neither were any of the other things similar to it for Python. And so, for a while, I didn't really work on it. And then I sort of got out of the job I was in, in 2014. And at the beginning of 2015, I had some time between jobs. I realized people were using this Hypothesis thing that I had written two years ago, because there wasn't anything better. And it kind of annoyed me that there wasn't anything better. So I decided to take some time and just improve it and try and make it a more useful project. And it sort of got away from me.
[00:04:11] Unknown:
So are there ways in which Hypothesis improves on the original design of QuickCheck?
[00:04:16] Unknown:
One of the ways is simply that it's not written in Haskell, which makes it a lot more usable for a much wider variety of tasks. Like, in some sense, the objective of Hypothesis is to make this sort of testing mainstream. But along the way, I had to figure out a lot of useful functionality that isn't present, or is sort of differently present, in the original QuickCheck. Probably the biggest feature of Hypothesis over QuickCheck is that it has an example database. So what happens is that when you have a failing example, Hypothesis will save it in its database so that when you rerun the test, it will rerun with the previous failing example rather than having to generate a new one.
QuickCheck has some functionality for that in more recent versions, but Hypothesis is a lot more advanced. There are also a bunch of things that it does to make more things work out of the box and to improve sort of data quality and the final example quality compared to the original QuickCheck. I also think it's a better design from the point of view of the user interface of the library and how people interact with it, but that may be a matter of taste.
[00:05:31] Unknown:
And are there significant differences between it and, you mentioned ScalaCheck as another implementation.
[00:05:39] Unknown:
I'm just curious what your impression is in terms of... So there's quite a lot of difference between it and ScalaCheck. Sorry, I didn't mean to interrupt you there. So Hypothesis isn't very closely based on ScalaCheck. And one of the big differences between Hypothesis and either ScalaCheck or QuickCheck is that Python is dynamically typed, and Haskell and Scala are statically typed. And in Haskell and Scala, the way you generate data is very closely tied to the type system. So you don't necessarily have custom generators that you're composing in the way that you do in Hypothesis. You instead say, I want a list of integers. Give me a list of integers.
That's probably the major difference from the outside. The way it interacts with test frameworks and test runners is all very Python specific, so it's obviously quite different from Scala or Haskell. And there's a bunch of additional functionality in terms of how you compose data and how you build things up that is quite different. Personally, when I was writing it, I thought they were quite similar, but I've since been having some QuickCheck users trying to use Hypothesis and getting confused. So I think there's probably more of a difference than I realized at the time. Do you think that there's value in dissociating
[00:06:55] Unknown:
Hypothesis from the term QuickCheck and using the more generic term of property-based testing?
[00:07:00] Unknown:
I do often do that. Yeah. Particularly from an implementation point of view, it's actually very different from QuickCheck. But the problem is that when people are looking explicitly for this sort of thing, they inevitably Google for Python QuickCheck. So
[00:07:13] Unknown:
it's hard to dissociate from it too much. Right. So can you walk us through the execution of a simple Hypothesis test to give our listeners a better sense of what Hypothesis does?
[00:07:23] Unknown:
Sure. So you've got a Hypothesis-based test, and it's got this given decorator that specifies how to generate some values for some arguments. And what will happen when the test runs is that it will call your test function multiple times with arguments provided from Hypothesis's data generation. So the first thing it does is it looks in the database and says, do you have any saved examples for this test? If it does, then it pulls those out of the database and tries them. And if any of the previously saved examples fail, then it will skip to the next phase.
But if they all pass, then what happens is it moves on to a generation phase, where it's essentially picking randomly generated examples from a distribution that's been provided by the strategy implementation and trying to find one of these that causes the test to fail. And if neither of those two steps has caused a test failure, then it stops. But if either of them has caused a test failure, then it takes that failure and tries to prune it down. Because the problem with randomly generated examples is that they're often huge and messy, and they contain lots of extraneous detail that doesn't actually matter for failing the test. Like, you might have a 1,000-element list, but all that actually matters is that it's non-empty and contains one non-zero number, or something like that.
So it just sort of basically hacks bits off until it can't find anything to remove from the example that would still cause the test to fail. And at that point, it then says, okay, here's the smallest example. First of all, I'll save that in my database so that when you rerun the test, we'll get that one again. And then it reruns the test one final time to sort of let the test failure bubble up and basically look like it had just picked exactly the right example
[00:09:14] Unknown:
from the very beginning. Does that make sense? That does. And that's really neat. It almost seems like, in addition to the randomized property-testing aspect of the tool, it sort of performs some of the processes that a good QA tester would perform when ruggedizing a test suite anyway, in terms of finding the most minimal failure case for a given test. It's very neat that it has that kind of intelligence built in. It sounds like it could, in that sense, be a much more effective testing tool for developers than standard tests.
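(Editor's note: to make the walkthrough above concrete, here is a minimal sketch of the kind of test being described. It is not from the episode; the buggy mean function is invented for illustration, and exact behavior can vary between Hypothesis versions.)

```python
from hypothesis import given
from hypothesis import strategies as st

def mean(numbers):
    # Deliberately buggy: blows up on the empty list.
    return sum(numbers) / len(numbers)

@given(st.lists(st.integers()))
def test_mean_is_bounded_by_its_inputs(numbers):
    # Hypothesis calls this test many times with generated lists. When a
    # failing input is found it is shrunk (here, down to the empty list)
    # and saved in the example database, so the next run replays it first.
    result = mean(numbers)
    assert min(numbers) <= result <= max(numbers)
```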
[00:09:51] Unknown:
That's definitely the goal, to some extent. So the phrase I sometimes use is thinking with the machine, where basically what happens is that Hypothesis gets to be a sort of little external brain for your testing process. Hypothesis is not itself very smart, but it is very fast. And sort of the combination of an experienced human tester plus Hypothesis doing a lot of the heavy lifting for you ends up being like a much larger and much more experienced QA department, simply because there's sort of a force multiplier between the two. I haven't talked to many people who work directly in QA about using Hypothesis, but it's something I'm hoping to do more, because I think that there's probably quite a lot of value to them as well as to normal developers.
[00:10:35] Unknown:
Definitely. As a matter of fact, just before we did the podcast, I was talking to one of my coworkers who's in test automation here about Hypothesis. So, hopefully, we can use this episode as a way to shine more light on it and get it to be used by a wider audience of people.
[00:10:55] Unknown:
That would be excellent.
[00:10:57] Unknown:
So have you had any trouble getting people to use Hypothesis? And I'm wondering how the adoption curve has been.
[00:11:03] Unknown:
It's actually mostly been much better than I would have expected. It's very weird having a project which most people just say really nice things about. There have been some people who just seem very uncertain how to get started, and I'm trying to sort of improve documentation and writing around that to give people easy entry points into Hypothesis. But generally speaking, when people do get started, they sort of rave enthusiastically about it, which is really gratifying. I mean, it's still early days. I think one of the consequences of the fact that I've done so much full-time work on Hypothesis is that a lot of the Python community sort of sees it as having come out of left field, where I basically just dropped this completed project on them. So it's not got as many users as you might expect for sort of the level of readiness it currently has, but we've got Mercurial using it, PyPI is using it. Those are probably the two big names, but there's a whole pile of little projects using it at this point. I know a bunch of companies are using it. So pretty good for the most part. It definitely hasn't hit the sort of saturation that some of the bigger tools like py.test or coverage have, of course. But I'm hopeful that it'll get there at some point in the next couple of years.
[00:12:16] Unknown:
And what does the integration story look like for using Hypothesis along with some of those other tools, like, that you mentioned?
[00:12:23] Unknown:
So, mostly really good. A decision I sort of made originally because I was lazy, but which turned out to be an amazingly good idea, is that Hypothesis is not a test runner. It doesn't have any testing framework built into it. You can just use it with whatever test framework you like. So it works basically out of the box with py.test, with nose, with unittest if you really must. And, as far as I know, with basically every test runner. The only thing I know of that has some problems is that it doesn't play very well with asynchronous tests. We've got some people using it with asyncio, and it does more or less work. But with Twisted's trial test runner, it's a bit problematic.
And the combination of PyPy plus coverage plus Hypothesis is one that I generally recommend people avoid, simply because every pair of things in this triple has slight problems with each other. I mean, PyPy and Hypothesis don't really help each other. It's more that it's not as fast as you would expect given how much faster PyPy usually is. But PyPy and coverage, and Hypothesis and coverage, both have a bit of a struggle sometimes. The other problem that I think people sometimes have is that letting Hypothesis generate your coverage for you, without some explicit examples to seed it, can be a bit of a problem because it makes your coverage random.
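(Editor's note: a small sketch of the framework-agnosticism David describes, using Hypothesis to decorate an ordinary unittest test method. The property tested here is invented for illustration.)

```python
import unittest

from hypothesis import given
from hypothesis import strategies as st

class TestReversal(unittest.TestCase):
    @given(st.lists(st.integers()))
    def test_reversing_twice_is_a_no_op(self, values):
        # Hypothesis supplies the generated argument; unittest runs the test.
        self.assertEqual(list(reversed(list(reversed(values)))), values)

if __name__ == "__main__":
    unittest.main()
```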
[00:13:38] Unknown:
Yeah, I can see how that would be the case, given the fact that the input data is somewhat randomized, so it would potentially pick some arbitrary code paths to execute. That's interesting.
[00:13:49] Unknown:
Normally, it's pretty good at getting 100% of the coverage reachable by the test. But if you're running enough tests, then you're gonna be unlucky on one of them, basically.
[00:13:59] Unknown:
What does this sort of testing get you that conventional testing doesn't?
[00:14:03] Unknown:
So the major thing is that it is a huge effort saving, because you can do this sort of testing with semi-conventional testing, like the pytest parametrize stuff. But the problem is that you then have to write all the examples out by hand. And this is both tedious, which means you're not gonna do it to nearly the degree that you should, and also error prone, in that it's very easy to be forgetful. And having the computer basically take care of that for you means that you can concentrate on testing some much more high-level things and basically have it figure out all the edge cases that you forget.
So, typically, people's experience using Hypothesis is that it won't let them get away with things, because it will try every edge case, even ones they forgot when writing both the tests and the original code. So, basically, it's just an automated way of being much more thorough than you would necessarily otherwise have been during your testing.
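(Editor's note: a rough sketch of the contrast David draws between hand-enumerated examples and generated ones. The sorted() property is invented for illustration.)

```python
import pytest
from hypothesis import given
from hypothesis import strategies as st

# Conventional parametrized test: you enumerate the cases by hand,
# which is tedious and easy to leave incomplete.
@pytest.mark.parametrize("values", [[], [1], [3, 1, 2], [2, 2, 2]])
def test_sorting_is_idempotent_handpicked(values):
    assert sorted(sorted(values)) == sorted(values)

# Property-based version: Hypothesis generates the lists, including the
# edge cases forgotten when writing both the test and the original code.
@given(st.lists(st.integers()))
def test_sorting_is_idempotent_generated(values):
    assert sorted(sorted(values)) == sorted(values)
```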
[00:15:04] Unknown:
Interesting.
[00:15:05] Unknown:
Does it constrain the ways in which you would generally write a unit test in order to be compatible with how Hypothesis executes? Just thinking in terms of, you know, for instance, the setting up of mocks and so forth.
[00:15:18] Unknown:
Mostly, it doesn't constrain much. You do have to use Hypothesis's own setup and teardown hooks if you want setup and teardown to work, simply because you need to do it before and after each example rather than before and after the entire test function. The other thing that can sometimes happen is that Hypothesis isn't very forgiving if your tests are slow, because you're running your tests many times, so it tends to exacerbate that sort of problem. It will tend to encourage you to write small, fast tests, but that's not a bad thing in and of itself. Although it does cause some problems for Django users, simply because
[00:15:55] Unknown:
the Django test runner is relatively slow and has to do a lot of database setup and teardown. Sorry, did that answer the question? Yes, it absolutely did. I bet you that's also problematic for developers who tend to sort of have lots of database dependencies and who tend to blur the lines between
[00:16:13] Unknown:
unit testing and integration testing. I bet Hypothesis is a tough pill to swallow for them. It can be. Then again, to some degree, I'm also one of those developers, in that I don't necessarily believe there is a hard and fast line between unit testing and integration testing. But it's not like you can't do that under Hypothesis. It's just that what tends to happen is that once you get a failing integration test, you end up wanting to write a failing unit test to isolate that case so you can work with it more easily, which, again, is no bad thing.
[00:16:46] Unknown:
So why do you think this sort of testing hasn't caught on in the Python world before?
[00:16:51] Unknown:
So the major reason is that it was really hard to write. It turns out that a lot of the design of the original QuickCheck has very Haskell-shaped assumptions built through it. And so Python isn't statically typed, Python isn't immutable, and both of these things turned out to make it quite hard to get it to work well. And it does all work well out of the box now, but that basically required a lot of work on the part of designing it. And even without those, sort of the baseline level of difficulty of implementing a good QuickCheck is really high. And so what you saw prior to Hypothesis, or not even prior to Hypothesis, prior to my reboot of Hypothesis, is that there were about 8 or 9 different abandoned projects on PyPI which were basically doing a similar thing.
And, basically, a combination of luck and timing meant that I had enough time to basically plunge into the problem and brute force my way through it. So the result was that Hypothesis managed to overcome the hump, basically by virtue of a couple of months of solid work. And once you've got the basic tooling working, and it works well enough that people can use it, at that point people can get excited about it. Whereas previously, if they had tried any of the things that were lying around, most likely they would have gone, this is a really nice idea, but it sort of makes my life harder in practice. So, for example, the shrinking tends to be the part at which most projects attempting to do this fail, because it turns out that making that work well is really quite difficult. And so you would see most of the previous attempts at it sort of got to the generation point.
And then when a test failed, the output was almost incomprehensible, so people didn't really like to use it.
[00:18:32] Unknown:
That's really interesting. And it's also really interesting that you sort of picked a problem space that many people had attempted, but that no one had really succeeded at in a truly workable way. And congratulations for choosing to climb a fairly tall mountain and actually getting there safely.
[00:18:48] Unknown:
Thank you. I'm not necessarily sure I would say it was a good decision, but it worked out pretty well in the end. Like I said, it's not necessarily that I have some unique ability to implement good QuickChecks. It's simply that this sort of thing takes time, and most people don't have the ability to take 3 months to basically scratch an itch.
[00:19:12] Unknown:
That is all too true. So are there any facilities of the Python language that make your job easier? And are there aspects of the language that make this style of testing more difficult? And as a corollary to that, I'm curious if the typing module in Python 3.5 would provide any hooks for you to be able to tie into, to potentially automatically determine the type signatures for the QuickCheck implementation. Because I know that as part of the test setup, you need to tell it sort of what general types are accepted in order to get it to generate an appropriate set of data.
[00:19:47] Unknown:
So in terms of features of Python that have made my job easier, not many, to be honest. Decorators are very nice, and I use them extensively. And that's probably the big thing that let me make Hypothesis quite so test-framework agnostic, in that it means that I can just expose functions that do what I want and mirror the original functions. And most other attempts to do this in other languages don't have quite that level of test-framework agnosticism. Features that make my life harder, I can definitely list more of those. But the big one is just simply how complicated Python's argument-passing conventions are. If you think they're simple, then that's probably because you have never tried to emulate them. And sort of the differences between positional args and keyword arguments and the ability to mix them, and how these are then used by the different testing frameworks, caused me a lot of problems.
And at this point, most of the solutions are just nicely packaged up in one or two functions inside of Hypothesis, so it doesn't give me any grief now. But every time I need to implement new code that has to deal with this in a different way, I get sad all over again. There was a third part to your question.
[00:21:00] Unknown:
I was curious if the typing module and type annotations in Python 3.5 would provide any way for you to sort of automatically generate the signature that Hypothesis uses for determining what types of inputs to generate.
[00:21:17] Unknown:
Right. So people keep asking me that, and it's totally understandable because of the original QuickCheck's relationship to typing. But I actually don't think this is a good idea, because if you look at Hypothesis strategies, they are much more fine-grained than types. And in almost all tests that people write, I find they want to do something a little bit custom, because you don't really just want a list of strings. You want a list of strings with at least one element, where the individual strings can only fit in 3 bytes of UTF-8, or something like that. And you want to possibly filter some stuff out. And Hypothesis strategies give you a huge variety of knobs to twiddle and ways to compose them that aren't really present in any sort of non-dependently-typed type system. The other big problem is that I'm not a huge fan of the typing module, to be honest. I think that it's got quite a ways to go before it's a useful piece of software. And I'm a little worried about its inclusion in Python 3.5, because I think you're gonna need to change a bunch of things before it becomes useful.
But at some point, I'm sure I will do something to integrate along those lines. It's just not quite there yet.
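(Editor's note: a sketch of the fine-grained strategy David describes, a non-empty list of strings whose UTF-8 encodings fit in three bytes, written against the hypothesis.strategies API as the editor understands it; check the documentation for your installed version.)

```python
from hypothesis import strategies as st

def fits_in_three_utf8_bytes(s):
    try:
        return len(s.encode("utf-8")) <= 3
    except UnicodeEncodeError:  # e.g. lone surrogates that cannot be encoded
        return False

# Compose and constrain the built-in strategies rather than deriving them
# from a type: non-empty strings, filtered, inside a non-empty list.
short_strings = st.text(min_size=1).filter(fits_in_three_utf8_bytes)
short_string_lists = st.lists(short_strings, min_size=1)
```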
[00:22:30] Unknown:
Sure. And what are some of the design challenges that you've been presented with while working on Hypothesis, and I'm wondering how you overcame them?
[00:22:37] Unknown:
So the major design challenges have been simply trying to figure out how to deal with the extreme flexibility of Python. So, for example, one of the things that Hypothesis relies on is being able to replay tests, and you have to deal with the fact that a function could do something arbitrarily different the second time it's called, or it could have mutated its arguments the first time, or it could basically be asserting that you never call it with the same argument twice. And so there's a lot of sort of defensive things that have to happen basically every time that Hypothesis calls out to someone else's function. There's also been a lot of work in terms of supporting the different implementations of Python.
And the major way I've overcome that is simply by using Travis, which has been really good for letting me run an absurd number of CI jobs and cover a number of different Python versions and, not an absurd number of operating systems, but two operating systems, plus AppVeyor for Windows. So I'd like to say that there were sort of deep, interesting theoretical challenges that I had to overcome. But the reality is that those are the easy part, at least for me, because that's the part that I find interesting and enjoy working on. And what sort of really surprised me is just the sheer amount of detail work and sort of boring grunt work that one has to do in order to make all these sorts of things into usable production software.
[00:24:13] Unknown:
And given that testing is an important part of the development process for ensuring the reliability and correctness of the system under test, I'm wondering how you make sure that Hypothesis doesn't introduce uncertainty into that step.
[00:24:25] Unknown:
So Hypothesis does introduce uncertainty into that step, but it only does so in one direction, in that basically the Hypothesis principle is no false positives. Every time that your test fails with Hypothesis, it's a real failure. It's not sort of the bad kind of randomness in tests, where a test can fail flakily; every failure is a real failure. Then you've got the example database, which I mentioned earlier, which means that every failure is not just real, but it's also replayable. So a bug won't go away just because you've had bad luck with the random number generator this time. You actually have to fix the bug. And there is also support for explicitly feeding Hypothesis handpicked examples that you want it to use each time, to make that more reliable.
And you can also put Hypothesis in deterministic mode if you really want, in which case it just fixes the seed and reruns the same tests each time. So with all of these, basically, Hypothesis will find everything that you would have found through the manual testing process. And the only uncertainty that really remains is basically how much additional stuff it will find. So you can't guarantee that on every single run, Hypothesis will find every single bug that Hypothesis could find. And I know some people have had Hypothesis happily running for months, and then suddenly it said, hey, these particular two floating point numbers no longer work. And, of course, they've not worked all along. It's just that it didn't find them before now. And to some extent, that's a real problem, and it does introduce uncertainty into the testing process. But it doesn't introduce more uncertainty than having users does.
And Hypothesis just essentially, in this case, becomes another user of your system, who every now and then just drops you a line saying, hey, so I found this bug.
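(Editor's note: a hedged sketch of the two mechanisms just mentioned, hand-picked examples via the example decorator and deterministic mode via the derandomize setting; names follow Hypothesis's documented API, but verify against your installed version.)

```python
from hypothesis import example, given, settings
from hypothesis import strategies as st

@settings(derandomize=True)  # fix the seed so every run generates the same data
@given(st.floats(allow_nan=False, allow_infinity=False))
@example(0.0)                # hand-picked examples are always tried as well
@example(-0.0)
def test_rounding_moves_by_at_most_half(x):
    assert abs(round(x) - x) <= 0.5
```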
[00:26:15] Unknown:
And is there a parameter that you can tune to be able to increase the number of trials that QuickCheck will run through?
[00:26:25] Unknown:
Hypothesis has so many parameters, it's a bit silly. But, yes, one of them does do that. By default, it runs 200 examples and will time out your test if it takes more than a minute to run. The reality is that a minute is a ridiculous overestimate of how long most of these tests take to run. It's simply there to have some value that you probably won't be hitting, but that stops it getting into an infinite loop. I think it's more common for people to turn down the numbers rather than turn them up, but I typically run it with a thousand rather than 200, simply because I do like a bit more thorough testing. And I think people who are using the Django integration typically turn it down to 50, simply because of the aforementioned speed of the Django test runner.
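(Editor's note: a small sketch of turning that knob; max_examples is the relevant setting, and the 200-example / one-minute defaults quoted here were those mentioned in the episode. Verify against your installed version's documentation.)

```python
import zlib

from hypothesis import given, settings
from hypothesis import strategies as st

@settings(max_examples=1000)  # default was 200; raise it for more thorough runs
@given(st.binary())
def test_decompress_inverts_compress(payload):
    assert zlib.decompress(zlib.compress(payload)) == payload
```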
[00:27:11] Unknown:
So given the sophisticated nature of the internals of Hypothesis, do you find it difficult to attract contributors to the project?
[00:27:18] Unknown:
To some degree. One of the nice things about Hypothesis is that it's relatively well factored. So it's much easier for people to contribute new strategies to the strategy library without really understanding how the sort of core kernel of generation and shrinking works, because things are built on top of other things. There are sort of lots of little internal libraries that you can more or less work on independently if you find a problem with them. So there have been quite a few contributions around that sort of periphery. The internals, yeah, no one has worked on them other than me. And I think, to some extent, that's by design, in that right now a lot of the Hypothesis internals are almost a research project. Like, they're very much production software, but they're production software where I'm still figuring out a lot of the theory. I'm still figuring out improvements.
So it's quite hard for other people to come into that, more because of the fact that it will move out from under them rather than because of the level of sophistication. And for the moment, I'm okay with that. Once it stabilizes a bit more, I will be writing up a lot of stuff about how everything works and trying to make it more approachable to people. And it's also actually not that large. So if I were to stop changing things and basically get hit by a bus tomorrow, then someone else could easily figure out how it works. It's just that right now, no one has a very good reason to, because I'm likely to change the answer to how it works next month. No one has so far.
[00:28:49] Unknown:
A few months ago, you went through some public burnout with regards to open source and Hypothesis in particular, but circumstances have brought you back to it with a more focused plan for making it sustainable. I'm wondering if you are
[00:29:01] Unknown:
comfortable with providing some background and detail about your experiences and reasoning behind all of that. So the reality is I'm still a bit burned out, and I'm sort of working against my own best interests. Part of the problem is that shortly after the public burnout, I had this really great idea that I just couldn't resist implementing. But fortunately, also around this time, I got a few contracts related to Hypothesis, doing some training, doing some custom development. A lot of the material on using Hypothesis was done by me, paid for by a client, which was nice. So I've managed to make enough money this year that I'm not feeling like a complete sucker. But if you do know anyone who wants more Hypothesis training or contracting, that will always help, because I wouldn't really regard Hypothesis as sustainable right now. And this is sort of a problem. It's not just me who has this problem. I know that most Python projects struggle quite a lot with this. I don't think anyone has ever paid Ned to work on coverage.
I know that the PyPy project has some good commercial customers, but I think that they could use a lot more, and they could use a lot more funding given how amazing PyPy is. And this is sort of a problem across open source, where a lot of the things we build on, we basically go, hey, this works, great, let me build my stuff on it. And then money doesn't really flow back from the people using it to the people who wrote it in the first place. And even though I'm doing business on top of Hypothesis, to some extent that's still very true, because the stuff that is most useful to the Python community at large, and also the stuff I most want to be working on, is really improving Hypothesis functionality, improving Hypothesis internals, and generally making it a more useful tool. And that's sort of the one bit that no one is paying for.
People are paying me to teach them how to use it better. They're paying me to use Hypothesis to test things for them. But making Hypothesis itself work better is just not something that anyone so far has proven very interested in investing in. And in the long run, I don't know how sustainable that is, but for the moment, I'm interested enough in it that I'll take what I can get in terms of cash flow and see how things go.
[00:31:23] Unknown:
Yeah. The money-in-open-source discussion is definitely one that has been going on for a while and has not yet reached any useful conclusion. I know that I'm probably going to mispronounce her name, but Nadia Eghbal has produced a series of blog posts and also done an interview with the folks at The Changelog on this topic, trying to bring it a little more out into the open in the general conversation, you know, even bringing it outside of people who work in open source day to day. So it's definitely one that's interesting and one worth keeping an eye on. Also, a sort of outlier from the general trend is the Jupyter project, which recently received, I think, $6,000,000 in funding. So that's definitely an example where someone at least saw the utility in providing some money to the project, but it certainly could do with being a more general trend.
[00:32:17] Unknown:
I do get the impression that the scientific Python community is a little bit more on the ball than a lot of the rest of the Python community, and there is more funding for tooling on that side of the ecosystem. I mean, there's NumFOCUS, for example, who do a lot of NumPy- and SciPy-related funding. I don't know how much money they actually have provided to them by commercial customers, but it at least exists. And, presumably, I know Jupyter is really heavily used in the scientific community. So that's sort of my guess as to where a lot of this comes from, but I may just be making that up.
[00:32:55] Unknown:
There are some glimmers of light at the end of the tunnel, though, right? We've had a couple of guests on here. Jessica McKellar from the Python Software Foundation, as just a for instance; they support various open source projects that are important to the Python community, so you might wanna talk to those folks, or those folks might wanna talk to you, if any of our listeners are Python Software Foundation folks with their hands on the purse strings. And, also, we just had a really interesting conversation around this with the gentleman behind the Read the Docs project. Tobias, do you remember his name? Eric Holscher.
[00:33:28] Unknown:
Eric Holscher. Thank you so much. I was gonna say it also came up briefly in our conversation with Maciej Fijalkowski discussing RPython and the PyPy project.
[00:33:38] Unknown:
Absolutely. With regards to the conversation with Eric, he brought up the idea of open source projects as libraries, not in the technical sense, but in the sense that everybody contributes to the public library because it's considered to be an important resource for the community. In the same way, all the companies that use open source software should be contributing back to those projects, because those libraries, those open source projects, are enabling the company to get higher velocity, and they're essentially getting that value for free. So if you have any corporations or big companies that are using Hypothesis, it might be a reasonable thing to sort of say, hey, do you like the library? Would you like to see future development happen? Then would you be willing to fund it, or something like that?
[00:34:28] Unknown:
So, taking those in order: the Python Software Foundation, first of all, their funding work is really good, and I'm not saying anything against them in this regard. But what tends to happen, as I understand it, and maybe I've got the wrong end of the stick, is that what they will typically do is offer you grants for a specific thing. So if there were a particular piece of functionality that I wanted to add to Hypothesis, then I could apply for a grant to work on that piece of functionality. And they don't really fund sort of ongoing work on projects like this, which is a perfectly reasonable decision, but it sort of means that it's hard to use them as a route to sustainability. In terms of corporate funding, the big problem that I tend to see, and this isn't just a problem with open source funding, this is also a problem I have with trying to sell into corporates, is that the people who are most invested in these things aren't the people with their hands on the purse strings. So it's relatively hard as a developer working for a corporate to go, I have this great open source library I'm using, I really want to support it, let me give them money, because the developer doesn't have any access to the budget. And so what you often see is that the closer you get to the money, the further you get away from being able to see why you might want to fund this tool you're using downstream.
And that's sort of part of why I've tried to go down the training route, because it tends to be much clearer to people with access to budget that they want to spend their training budget than it is that they want to throw money at a guy who, as far as they know, isn't doing anything for
[00:36:05] Unknown:
them. So what's next for Hypothesis?
[00:36:07] Unknown:
There are a couple of different directions I can go with Hypothesis right now. One of the big things, which I maybe shouldn't admit on a Python podcast, is that I'm looking into how to port Hypothesis to other languages. So in the last couple of months, I've been doing some work on a project that I originally named Conjecture. In the end, it sort of has been rolled back into Hypothesis, and it is trying to make the internals less complicated in a way that makes them much easier to make work in a new language. So I'm thinking about this sort of tool in the context of Java, and I've got a lot of people asking me about GAR, which is actually one of the more annoying languages to port it to, for reasons.
And so at some point, I'm gonna start looking into that, at least partly because other languages are often a lot better about paying for tooling than the Python community necessarily is. So it's much easier, for example, to sell proprietary tools in the C++ or Java world. And I'd still have an open source core for all of these things, but it really gives me a few more options on the sustainability front. The other thing is that recently I just suddenly had this brainwave. Usually it's the result of reading someone else's work, and I've been doing a whole bunch of reading about formal language theory and induction of regular languages. And I've suddenly been going, oh my god, I've got so many amazing ideas for how to improve Hypothesis.
So that's sort of the fun direction, which is very impractical from a business point of view. But I'm certainly going to end up spending a bunch of time on it, because I'm my own worst enemy as far as money is concerned. And related to that, there's been a long-running discussion, I've probably been promising it for about 6 months now, about how to improve Hypothesis by using coverage information to try and get it to target interesting corners of your code based on what it can track. Sort of the research directions I'm thinking about right now should have a bearing on that and on how to make it work better. So if you're not a Python user, then the most exciting part of what's next is obviously the other-language stuff. If you're a Python user, it's basically getting better data and being able to find magically difficult corners of your code that you wouldn't expect Hypothesis to be able to find.
[00:38:20] Unknown:
Is there anything that our listeners can do to help the project? Like, where could you use contributions effectively?
[00:38:26] Unknown:
So the big thing that is very easy for people to get started on, and I do get those contributions: documentation improvements are always great, because the Hypothesis documentation is no better than adequate. In places it's even pretty good, but it's a relatively complicated project which is quite unfamiliar to people. So writing about Hypothesis, whether in the context of the documentation or just on their own blogs or whatever, is always welcome and always gratefully received. Writing new sources of data generation is always useful, and it's very easy to do this in your own project, for the bits that you want to generate for your own stuff, and then try and contribute them back upstream. I'm generally very welcoming of people adding stuff to the hypothesis.strategies module. The other thing is simply getting more projects tested using it. If you have a project that's not using Hypothesis and you would like it to be better tested, just start using Hypothesis for it, both simply because I like it when there's more quality software out there, and testing it with Hypothesis will almost certainly find bugs in it. But also, the more people use it, the better feedback I get and sort of the better idea I have about where the library needs to improve.
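(Editor's note: a sketch of the kind of reusable data generator you might write for your own project and then offer upstream; the Point type is invented for illustration, and st.builds is used here as the editor understands the documented API.)

```python
from collections import namedtuple

from hypothesis import strategies as st

Point = namedtuple("Point", ["x", "y"])

# Build a domain object out of the built-in strategies; once it works for
# your own tests, a strategy like this is the sort of thing that can be
# contributed back to the hypothesis.strategies module.
points = st.builds(
    Point,
    x=st.integers(min_value=-1000, max_value=1000),
    y=st.integers(min_value=-1000, max_value=1000),
)
```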
[00:39:38] Unknown:
So before we move on, is there anything else that we didn't ask you that you think we should have, or anything that you wanna bring up? Nothing comes to mind. Okay. So for anybody who wants to follow you and keep up to date with what you're up to, what would be the best way for them to do that? So I'm available on Twitter as DRMacIver,
[00:39:55] Unknown:
and I have my blog at drmaciver.com. There is also a TinyLetter, which I'm very much overdue to update this month, which is at tinyletter.com/DRMacIver, you guessed it. That one's more specifically Hypothesis focused.
[00:40:10] Unknown:
Great. So with that, I will take us into the picks. And for my pick today, I'm going to choose Typeform, which is a form-building service that lets you put together some very nice looking surveys. And if you are on one of their paid plans, then it also supports logic branching, so that depending on the answers that somebody gives to one question, you can put them down a path of a few different sets of other questions, and just a lot of really great functionality. It works great on mobile. It has a good way to view the analytics associated with it, to see how many people viewed it, what kinds of devices they were on, what their answers were. And it also has Zapier integrations to be able to pipe that data out to other services.
So I've been using it for a couple of different surveys that I've sent out, so I'll put those in the show notes as well. Specifically, I put together a survey for listeners of this podcast. So if you have any feedback that you wanna give, you can fill that out. And also, I put together a survey for trying to do some market research on continuous integration and people's experience with that, to get some ideas for a project that I've
[00:41:23] Unknown:
started working on. So if anybody wants to fill that out, that'd be great as well. So I'll put all that in the show notes. And with that, I'll pass it to you, Chris. Thanks, Tobias. I just wanted to say, the CI form, I just filled it out today, and I was very, very impressed. I was actually thinking as I was filling it out, well, I wonder what he's using, because this site is really, really slick. I love the UX. It's great stuff. So my first pick is a kind of brainless wonder of a game that I've been enjoying lately, a video game for iOS devices called Seashine. It's a perfect "my brain is fried, I'm commuting, I just wanna enjoyably pass a little bit of time" kind of thing. Basically, you're a jellyfish, and you're swimming around in the ocean trying not to get killed or eaten, and it uses the touch interface of the tablet beautifully. You swim by sort of making little swishes with your finger. It's gorgeous. The graphics are great. The sound design is really good. It's just an awful lot of fun, and it's free. I'm sure at some point in time they'll try to monetize with something or other, but I haven't hit it so far. My next pick is actually something that we had on the show quite a number of episodes back, but I've been really getting into it lately: CheckiO. It's a website, basically a Python problem-solving site where you get practice problems in Python, and they've gamified the whole thing, and you can publish your solutions and have them reviewed by other people. I've lately been trying to improve my algorithmic problem-solving skills, and CheckiO has really just made it a real pleasure and given me the incentive to want to go further and push harder in terms of, you know, getting to the next level or whatever the case may be. And the interface is really nice, and it's also super easy to use an IDE to work on your code. They make it really nice.
My last pick is from a former coworker of both Tobias and mine, Mike Kudermarsh, who currently works for Product Hunt in San Francisco. He has written an excellent series of articles called the Junior Developer series, which is a kind of bland-sounding name, but a really phenomenal series of articles containing lots of really interesting, off-the-beaten-path tips for people starting out in the industry. Not the usual, okay, learn this, don't learn that cruft. It's more like: this is how you work effectively with your coworkers, this is how you can work in a way that will have you be liked by your teammates, this is how to generate good pull requests. All of the kinds of tips that are not covered elsewhere in the standard "here's how to get started doing dev" kind of articles that are so rampant around the intertubes these days.
I've heard some criticize it for having too many emojis, but if you disregard the article series just over Mike's love of the emoji, then I think you're doing yourself a disservice. Push forward and read it anyway. It's excellent stuff. And that's it for me. David, what do you have for us for picks?
[00:44:35] Unknown:
So I've got three picks. The first is a recent discovery, and the other two are sort of long-running favorites for me. The first is Make It Stick by Peter Brown, which is a really good book about learning theory, and in particular about sort of what the science supports in terms of how we learn, with practical advice for how to actually apply it in your day-to-day life. So this is the sort of thing I personally love reading about, but it's also got some really quite useful tips about how to make it stick. And I would recommend checking that out. For the two classic favorites from me: there's a service I use called Beeminder, and I frequently have to tell people I'm not affiliated with them, and not even on commission, unfortunately.
But basically, it is a great way of sort of building good habits and essentially committing to yourself to do things that you would otherwise put off for days or a week. And the final thing is simply an author recommendation. It's sort of a running joke that whenever anyone asks me for fiction recommendations, I listen to them very carefully, and I go, but the author you really want to read is Lois McMaster Bujold, and in particular her Vorkosigan books, which are relatively light space opera, particularly the original ones. But in the later ones, she has figured out basically that they will let her keep publishing these regardless of what she writes, as long as it's in the universe, and she's started telling a much wider variety of stories in this universe, and some of them have been really good.
[00:46:16] Unknown:
All right. Well, we really appreciate you taking the time out of your day to join us and tell us more about Hypothesis and how it came to be and how people can take advantage of it as well as your experiences in building it. So I appreciate that, and I hope you enjoy the rest of your day. Yes. And you. Thank you very much for having me. Thanks. Bye bye.
Introduction and Announcements
Interview with David MacIver
David's Background and Introduction to Python
QuickCheck and Hypothesis
Hypothesis vs QuickCheck
Property-Based Testing
Executing a Hypothesis Test
Hypothesis as a QA Tool
Adoption and Community Feedback
Integration with Other Tools
Effort Saving with Hypothesis
Constraints and Compatibility
Challenges in Python
Design Challenges
Ensuring Reliability
Contributors and Community
Burnout and Sustainability
Future of Hypothesis
How to Help the Project
Closing Remarks and Picks