SymPy With Aaron Meurer

Hello, and welcome to Podcast.__init__, the podcast about Python and the people who make it great. You can subscribe to our show on iTunes, Stitcher, TuneIn Radio, or add our RSS feed to your podcatcher of choice.

You can also follow us on Twitter or Google+, and please give us feedback. You can leave a review on iTunes so that other people can find the show, send us a tweet or an email, leave us a message on Google+ or in our show notes, or you can join our Discourse forum at discourse.pythonpodcast.com.

I would like to thank everyone who has donated to the show. Your contributions help us make the show sustainable.

For details on how to support the show, you can visit our site at pythonpodcast.com.

Linode is sponsoring us this week. You can check them out at linode.com/podcastinit and get a $20 credit to try out their fast and reliable Linux virtual servers for your next project.

I would also like to thank Hired, a job marketplace for developers and designers, for sponsoring this episode of Podcast.__init__. Use the link hired.com/podcastinit and double your signing bonus to $4,000.

We are recording today on January 18, 2016, and your hosts, as usual, are Tobias Macey and Chris Patti.

Today, we are interviewing Aaron Meurer about SymPy.

Aaron, could you please introduce yourself?

Hi. So, my name is Aaron Meurer. I am a research scientist at the University of South Carolina. I work with Anthony Scopatz, who was a previous guest on this show. I work at his lab, ERGS, and I'm presently working primarily on SymPy, which is what we're gonna be talking about today.

So Aaron, how did you get introduced to Python?

Yeah. So back in my freshman year of college, I was originally a computer science major, but our computer science courses were taught in Java, which I immediately didn't like. But there was a free course offered at my university about a language called Python, and I figured it's good to have some language diversity. So I went, and I immediately fell in love with it.

So then I learned some more. I did some Project Euler problems, learned some more. And then that summer, I learned about Google Summer of Code, which is a program that Google runs every summer where they pay college students to write code for open source projects. And through that, I was able to discover SymPy, which is basically the perfect project for me because I'm a mathematician and I also love programming and I love Python. And SymPy is a computer algebra system in Python, so that's, like, all three right there.

And, yeah, I did a project for that in Google Summer of Code, and I've been working on it ever since. I got hooked.

Yeah. It's amusing how often I hear Project Euler come up as an intro to various programming languages for people wanting to learn something new. I think it's cropped up a couple of times on our show, and I've also heard it a few times on the Ruby Rogues show and various other places. I've done a few of them myself as well, and they're definitely a great test bed for trying out new algorithms or new programming languages because the problem itself is decently constrained, so you don't have to worry about getting too deep in the weeds.

Yeah. Although they are pretty hard problems there. Yeah.

There are certain languages that I would maybe be interested in learning, but I don't know if I'd wanna jump in with Project Euler because they're pretty hard. Certainly. I don't think I would wanna use Haskell as my intro to Project Euler.

Or Bash.

Certainly.

So what is SymPy, and what kinds of problems does it aim to solve?

Right. So as I said, SymPy is a computer algebra system, so something like Mathematica or Maple. What it means is that SymPy is doing mathematics symbolically. So unlike something like SciPy, where you have a bunch of mathematical functions and you pass in a number and you get out a number, with SymPy, you actually are doing things symbolically like you would do in a math class or on a whiteboard. So for example, you can take symbolic derivatives, do symbolic integrals. You can solve equations and get a symbolic solution.

So SymPy has a whole range of different modules for different kinds of mathematics, and all sorts of people use it for different things depending on what they're doing. I personally like to think of it just as a more advanced calculator. A calculator, as you usually think of it, just does numerical things, you know, like addition and subtraction of numbers, but a symbolic calculator lets you symbolically calculate things. So if you wanna, you know, solve a quadratic equation or a more complicated equation, you get a symbolic solution.
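As a rough illustration of the "symbolic calculator" idea described above, here is a minimal sketch assuming SymPy is installed. The particular expressions are arbitrary choices for the example, not ones mentioned in the conversation.

```python
from sympy import symbols, diff, integrate, solve, sin, cos, sqrt

x = symbols("x")

# Symbolic derivative of sin(x)**2 with respect to x
derivative = diff(sin(x)**2, x)

# Symbolic (indefinite) integral of cos(x)
antiderivative = integrate(cos(x), x)

# Exact symbolic roots of the quadratic x**2 - 2 = 0
roots = solve(x**2 - 2, x)

print(derivative)
print(antiderivative)
print(roots)
```

The results stay as exact expressions (for instance, square roots are not rounded to decimals), which is the difference from a purely numerical library.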

Very cool.

And how did the SymPy project get started?

Yeah. So a guy named Ondřej Čertík, back in 2005, was a physics graduate student, and I guess he wanted to do some things that require a symbolic system. He's a theoretical physicist, so there's a lot of symbolic calculations that are involved in that, which I personally don't know much about because I'm not a physicist. But he also wanted to use something that was open source, and I guess he wasn't really happy with the open source solutions at the time, especially since, I guess, he also liked Python, and Python at that time didn't really have anything else.

So he started working on it, and some other people used it and they improved it, and it ended up growing into a full featured computer algebra system over the years.

So how did you get started with the SymPy project?

Right. So back in 2009, I did a Google Summer of Code project, as I mentioned earlier. I kinda discovered it, I think, just through looking through the Google Summer of Code projects, looking at the Python projects that had to do with mathematics. And so my project was to implement ODE solvers. So you would input a symbolic ordinary differential equation, and it would give you a symbolic solution. So for example, you could enter something like the second derivative of f of x equals f of x, and it would spit out the solution to that, which is, or I guess, is that the one that I want? I think that's a sine and cosine. It's been a while since I've actually done that math.

I kinda got hooked on it at that point. You know, the open source philosophy, I really agree with that philosophy. And just Python and math, and the SymPy community is also a really great community to work with. And I found that working with open source is a great way to learn just about computer science and whatever the open source project happens to be working in.
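For the ODE example mentioned above, a small sketch (assuming SymPy is installed) shows what the solver returns. For f''(x) = f(x) the general solution is a combination of exponentials; the sine and cosine solution belongs to the closely related equation f''(x) = -f(x), so both are solved here.

```python
from sympy import Function, Eq, dsolve, checkodesol, symbols, sin, exp

x = symbols("x")
f = Function("f")

# f''(x) = f(x): the general solution is built from exp(x) and exp(-x)
growth = Eq(f(x).diff(x, 2), f(x))
growth_solution = dsolve(growth, f(x))

# f''(x) = -f(x): the general solution is built from sin(x) and cos(x)
oscillator = Eq(f(x).diff(x, 2), -f(x))
oscillator_solution = dsolve(oscillator, f(x))

print(growth_solution)
print(oscillator_solution)

# checkodesol substitutes a solution back into its equation;
# the first element of the result is True when the solution satisfies it
print(checkodesol(growth, growth_solution))
print(checkodesol(oscillator, oscillator_solution))
```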

Yeah. I I can totally understand why you could, you know, become fascinated by the project. I mean, when I first encountered it, I was kinda blown away. It's like this magical mathematical

blackboard that, you know, you can sketch out your equation and have it actually get solved and have the results displayed in a beautiful way. It's, it's it's really kind of impressive.

Yeah. When you first see it, it's kinda like, wow, how does this work? And then you dig into it, and some of it, you know, you see how it works and you're like, oh, yeah, okay, that makes sense. And there are some parts where there's actually some deep mathematics going on, and it takes a little bit of understanding to know what it's actually doing.

So are there any limits to the complexity of the equation SymPy can model and solve?

Oh, absolutely. It's extremely easy to make a symbolic expression that is arbitrarily complex. There are a few problems here, theoretically even. There's something called zero-equivalence testing, which is basically the problem of, given some symbolic expression, you wanna know, is it identically equal to 0 or not? So for example, say you have the expression sine squared of x plus cosine squared of x minus 1. We know that due to a trig identity, that expression is actually identically equal to 0 because sine squared plus cosine squared is 1, and then I subtracted the 1.

And so there's a theorem called Richardson's theorem that says that as long as you have enough functions that you're dealing with, which is actually not very many functions (basically elementary functions, I think absolute value, and I think you have to include pi), that problem is intractable in that it's undecidable. So it's not NP-complete. It's actually undecidable. It's impossible to solve the problem in generality, meaning there's no algorithm that can do it. That's not to say there aren't heuristic algorithms that can solve it for some problems. But what that means for SymPy is there's always gonna be cases where it's not gonna work. And a lot of the algorithms do some sort of pattern matching heuristics, where if your input expression sort of looks like something that it knows how to solve, or if it can manipulate it into something that it knows how to solve, then it can solve it. But in other cases, you can easily construct some other thing that doesn't work.

So I would say, yes, the complexity is actually a big problem because you can make symbolic expressions that are arbitrarily complex without making them particularly large. So it's not like you need to have some huge expression for it to crash. You can actually have some very small expression, and that'll be sufficiently difficult.
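The zero-equivalence example above can be tried directly. This is a minimal sketch assuming SymPy is installed; it shows a case where the heuristics succeed, and says nothing about the undecidable general problem.

```python
from sympy import symbols, sin, cos, simplify

x = symbols("x")
expr = sin(x)**2 + cos(x)**2 - 1

# == compares the structure of the two expressions, not their mathematical
# value, so this prints False even though expr is identically zero
print(expr == 0)

# simplify applies heuristics, including trigonometric identities
print(simplify(expr))

# equals() tries harder (simplification plus numerical checks)
# to decide whether two expressions are mathematically equal
print(expr.equals(0))
```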

And how does SymPy compare to similar projects in other languages? I know you mentioned Mathematica and Maple. I don't know if there are any others that compare favorably or unfavorably to SymPy.

Right. So Mathematica and Maple are sort of the big proprietary computer algebra systems. And, yeah, they're actually quite powerful. Mathematica, I would say, is sort of the gold standard as far as computer algebra systems. It's extremely capable. There's a lot of people working on it. But, you know, it also costs a lot of money to use.

There are several open source computer algebra systems as well that compete with SymPy. For example, there's Maxima, there's Axiom. A lot of these are actually quite old in their roots. Some of them date back to the seventies, when people were first starting to actually invent computer algebra.

Yeah. I was gonna actually say I know for a fact that I've seen references to Maxima going all the way back to McCarthy and LISP 1.5 on the DEC 10 and DEC PDP series computers, way back when, in the olden days of prehistory.

Sure. Well, I mean, McCarthy's original Lisp paper shows how to use Lisp to construct symbolic expressions and then how to symbolically differentiate them. I'm not sure if that's actually the beginning of computer algebra or if there's stuff before that. I don't know the history that well.

But, yeah, as far as how it compares, some of these older systems do have some algorithms which are more advanced than SymPy. But the thing about SymPy is that the things that it can do are actually quite broad. So there are some areas in SymPy where it's very advanced, and there's some areas where it's less advanced. I'd say that one of the main advantages of SymPy is that, well, it is open source. So unlike Maple and Mathematica, you don't have to pay for it.

It's written completely in Python, so that makes it a lot easier for people to learn the language around the computer algebra system. Most other computer algebra systems sort of invent their own language. Like, Maple has its Maple language, Mathematica's got the Wolfram Language. But even, like, Axiom and Maxima have their own little Lispy languages that they've invented, which means that if you wanna actually do programmatic stuff, you have to learn this language. And so the idea behind SymPy is that we're not really language designers. We're mathematicians

constructs are. So we just

we use Python as our language. Python tends to be a pretty good language.

Yeah. I think knowing your strengths and avoiding

the crevasses

is is an important,

maxim for good design, to be sure.

I mean, there are plenty of other advantages too. Like, you know, SymPy lives in the Python ecosystem. So it's easy to start with SymPy and end with some other library in Python, like, say, matplotlib or SciPy or something else that you wanna do that SymPy doesn't necessarily do. And because Python has this huge scientific ecosystem and it sort of acts as this glue language, it's really easy to integrate symbolic stuff with SymPy into other scientific workflows.

Very cool. Well, I can only imagine that things like Jupyter Notebook really sort of leverage that to to its fullest. Right? Like you use something like the notebook and you can sort of use SymPy along with other things in in concert to

sort of solve your problems in really, efficient ways.

Yeah.

So how does SymPy render results using such beautiful mathematical symbols when the inputs are simple ASCII?

Yeah. So one of the features of SymPy, it's actually one of my favorite features, is the printing system. SymPy has several different ways, once you have an expression, of actually displaying it to you, depending on what sorts of capabilities you have. If you are in the Jupyter Notebook, you can display the expressions using the MathJax LaTeX renderer. So it's basically gonna show you this really nice mathematical representation.

But if you're using SymPy in a terminal, we have these pretty printers that use all these fancy Unicode symbols to basically print expressions in a way that is quite readable. I don't know if everyone knows, but Unicode has a lot of these symbols for drawing shapes and things like that, and we can use those to do things like actually draw the square root symbol on the screen or draw an integral sign on the screen. And then you stack the different expressions on top of each other so that it's a perfectly readable 2D expression.
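The printers described above can also be called directly. The following is a small sketch assuming SymPy is installed; the integral is an arbitrary example chosen because it stacks several pieces vertically.

```python
from sympy import Integral, sqrt, symbols, pretty, latex

x = symbols("x")
expr = Integral(sqrt(1 / x), x)

# 2D rendering built from Unicode drawing characters
unicode_art = pretty(expr, use_unicode=True)

# Fallback 2D rendering restricted to ASCII characters
ascii_art = pretty(expr, use_unicode=False)

# LaTeX source, which a renderer such as MathJax can typeset
latex_source = latex(expr)

print(unicode_art)
print(ascii_art)
print(latex_source)
```

Both pretty-printed forms span several lines of text, which is what makes the terminal output read like a 2D formula.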

That's kind of one of the aspects of SymPy that was totally amazing to me: I would sort of, you know, cobble together my little equations (my mathematical abilities are very meager, to say the least), and SymPy would just produce this gorgeous rendering that just kinda blew me away. I felt like I was looking at what my math teacher might have laid out in the test. And for it to be able to just do that dynamically on the fly, I guess I knew in an abstract way that Unicode had some shape rendering capabilities, but to see them used this way is totally awesome.

Yeah. It blew me away the first time I used it too, especially when you look at most other computer algebra systems. If you use their terminal interface, their printing system is sort of rudimentary at best. You'll see things that only use ASCII symbols, for example, which SymPy has as well. There's a version that only uses ASCII if you're in a limited environment that doesn't support Unicode. But I think the Unicode printing really makes a difference because you can draw these large two-dimensional parentheses or whatever, and they actually look nice because they're connected.

Yeah. My experience with using Mathematica when I was in school definitely

left a lot to be desired in the representation of the formulas that I was working with and

occasionally actually led to some confusion as to what it was actually trying to show me. And so having the visual characteristics of something like SymPy would have been very beneficial in those contexts.

Yeah. Definitely.

So what are some of the challenges in creating documentation for a project like SymPy that is accessible to nonexperts while still having the

basically

say in our docs that we're gonna assume that you already know the mathematics behind,

whatever

our docs are talking about

beyond just basic definitions.

So for example, if you read the SymPy tutorial, in the introduction it says that the tutorial assumes that the reader has a knowledge of basic, introductory calculus. Basically, we don't have the resources to try to document both the mathematics and SymPy itself. And there are plenty of great resources out there to learn the mathematics. For example, there's Wikipedia. There's MathWorld. There's the Wolfram Functions Site. These are all great resources.

And so I imagine this can be difficult for newcomers. There's actually a book that I reviewed recently on my blog called Doing Math with Python, which uses SymPy. It takes an approach where it's only using, like, high school level mathematics. And it serves as a decent introduction to SymPy as well as other mathematical aspects of Python. So if somebody doesn't necessarily know the calculus or, you know, whatever the advanced math is, that might be a more gentle introduction to SymPy.

And which fields of academia and business seem to be most heavily represented in the users of SymPy?

So as far as academia, there seems to be a pretty good following from physicists. Ondřej, the initial creator of SymPy, is a physicist. A lot of contributors have been physicists. I guess physics requires enough complicated math that it's nice to have something like SymPy to do some calculations.

But aside from that, I see a lot of people use it to do what's known as code generation, which means that they'll use SymPy to generate the equations for whatever it is they're trying to model. And then SymPy has the ability to take those equations and generate C code or Fortran code that computes those equations. And so that's a lot nicer than writing out the C code by hand because you might have a really huge equation that you don't wanna try to write out by hand. It's much better to have SymPy compute it for you, and you can do things like compute symbolic derivatives inside of SymPy and then generate code for that. This is actually part of what I'm working on as my day job, improving this code generation stuff that's in SymPy.
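A rough sketch of the code generation workflow just described, assuming SymPy is installed. The model expression is made up for illustration; ccode and fcode are SymPy's C and Fortran code printers.

```python
from sympy import symbols, sin, exp, diff, ccode, fcode

x, t = symbols("x t")

# A made-up model expression, standing in for "whatever it is
# they're trying to model"
model = exp(-t) * sin(x * t)

# Take the derivative symbolically instead of by hand
rate = diff(model, t)

# Emit source code for the derivative as a C expression and
# as a free-form Fortran expression
c_source = ccode(rate)
fortran_source = fcode(rate, source_format="free")

print(c_source)
print(fortran_source)
```

The generated strings are expressions only; wrapping them in functions, compiling, and linking are separate steps not shown here.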

As far as business, there's less use in business compared to, you know, your standard data science or scientific libraries. But I have heard of it being used here and there in, like, say, the finance sector.

I imagine that the code generation capabilities would be pretty difficult to

create

appropriate unit or integration tests. So I imagine that must be something that you spend a fair amount of time on to make sure that the code that it generates is

correct and provably correct based on the inputs?

Yeah. Well, so we do have tests, and basically if you run the tests, they're not gonna run unless you have a compiler on your system. So we do have some tests for them. And also, it's written in a modular way, so we can test most of it without actually compiling the code, just by testing what code is generated. But, yeah, that is probably one of the more challenging things as far as testing in SymPy.

So what are some of the uses of SymPy in education outside of the obvious, like students checking their homework?

Well, there are teachers who use SymPy to help them teach their class. I've noticed that it's actually pretty common, especially outside of the US, for instructors to use open source software in their class. And SymPy is a pretty common thing that I'll hear teachers trying to use because it's a really nice open source computer algebra system.

So how does SymPy integrate with the Jupyter Notebook?

Yeah. So as I mentioned earlier, you can set it up so that your expressions will all automatically display with fancy printing. The way that you do that is you import init_printing (that's init, underscore, printing) from SymPy, and then you just call that function at the top of your notebook.

And, actually, you can call that function in any environment, not just in the notebook. What that's gonna do is set up the pretty printing for whatever the best printing is that's possible in the environment that you're in. So in the notebook, that's gonna set up the MathJax printing. In the terminal, it's gonna set up the Unicode printing. If you're using the IPython Qt console, that's gonna set it up so that it can actually call out to the LaTeX program, if you have LaTeX installed, and generate math that way.
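The setup described above amounts to two lines at the top of a session. This is a minimal sketch assuming SymPy is installed; the expression is arbitrary, and pprint is used so the example also shows output when run as a plain script rather than interactively.

```python
from sympy import init_printing, symbols, sqrt, pprint

# Pick the best available printer for the current environment
# (MathJax in a notebook, Unicode or ASCII art in a terminal)
init_printing()

x = symbols("x")
expr = sqrt(x) / (x + 1)

# In an interactive session, just evaluating expr would display it;
# pprint forces the 2D text rendering in a script
pprint(expr)
```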

And is that leveraging the, import hooks that are part of the the Python language?

No. So as far as Jupyter goes, Jupyter has a whole printing system set up where different objects can sort of register different ways to display themselves. So a SymPy object can register itself as saying, I know how to display myself as math. And then the Jupyter Notebook knows that it can render math using MathJax, and so it'll print that using MathJax.

And then it's similar in the Qt console. It displays the LaTeX math as an image.
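One way an object can advertise a math representation to Jupyter is by defining a method named _repr_latex_, which IPython's rich display system looks for. The sketch below uses a hypothetical toy class named Half (not part of SymPy) and only the standard library; how SymPy wires up its own objects may differ in detail.

```python
class Half:
    """Hypothetical toy object that knows how to display itself as math."""

    def __repr__(self):
        # Plain-text fallback used by ordinary Python consoles
        return "1/2"

    def _repr_latex_(self):
        # Rich representation: LaTeX source wrapped in $...$ so that a
        # front end such as the notebook can hand it to MathJax
        return r"$\frac{1}{2}$"


half = Half()
print(repr(half))
print(half._repr_latex_())
```

In a notebook, making half the last expression in a cell would show the typeset fraction, while a plain terminal would fall back to the text form.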

And is SymPy generally used more as an interactive mathematics environment or use or as a library that's integrated within a larger application?

Well, both. It's actually one of the nice advantages of SymPy that it is usable as a library. A lot of the other computer algebra systems are not necessarily designed with that in mind, and so it can be difficult to integrate them with other things. But since SymPy is basically just a Python library, it's really easy to just import it in your other library code. And we also try to make it so that SymPy objects are extensible. So for example, you can take one of the SymPy functions and subclass it if you wanna change its behavior in some way.

But on the other hand, we also try to make it easy to use it as a calculator, basically. And that init_printing in the Jupyter Notebook is a big part of that because that makes it so that you have this nice printing. There's also another one called init_session, and that's basically just gonna also run from sympy import * for you. So it kinda acts more like a traditional computer algebra system where all of the names are sort of there for you already. It's also gonna define some symbol names for you because in a traditional computer algebra system, if you just type a variable name that isn't defined, it's just gonna use that as a symbol.

But since SymPy is in Python, we're not really breaking the language. So undefined variables are still undefined. You have to define symbols. So you have to write x = Symbol('x') if you want to define x to be a symbol.
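The point about undefined names can be seen in a few lines. This is a small sketch assuming SymPy is installed; the names x, a, b, and y are arbitrary.

```python
from sympy import Symbol, symbols

# An undefined name is an ordinary Python error, not an implicit symbol
try:
    y + 1  # y was never defined
except NameError as error:
    message = str(error)

# Symbols have to be created explicitly before use
x = Symbol("x")
a, b = symbols("a b")  # shorthand for creating several at once

expr = x + a * b

print(message)
print(expr)
```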

What were the challenges moving SymPy from Python 2 to Python 3?

I'd say that the worst thing there was that when we did it, 2to3 was still the recommended way of doing the transition. So that's how we ended up doing it, with 2to3.

And that ended up kinda being a mess because if you wanted to run the tests in Python 3, you'd have to run 2to3 on the code base. The workflow makes it so that you're still working in Python 2, and it's not really possible to work in Python 3 because that's just generated code. And the 2to3 script takes, you know, a few minutes to run if it hasn't been run yet. It's a pain.

But then we eventually moved to a single code base, which is what pretty much everyone is doing these days. And I'm glad that the Python core team has changed their recommendation to doing that, because that's much better. That actually lets you use Python 3 as your active development environment, and there's no compilation step or anything like that to compile your code from one language to another. It's just a single code base. And the two languages are similar enough (well, they're not really two languages, they're one language), the two versions of the language are similar enough that it's actually very simple to use a single code base for both. You just have to set up a compatibility file with some things that you have to import when you use them, and it's really not that big of a deal.

It is a little bit challenging because we still support Python 2.6, which doesn't have a lot of the features. We also support Python 3.2, which doesn't have the unicode literals. But I believe we used to support 2.5, which was a lot harder to do a single code base with because it didn't even have the from __future__ import print_function. But if you're not supporting that anymore, which you shouldn't be (you shouldn't even be supporting 2.6 anymore, really), then there's no reason to not do a single code base.
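A compatibility file of the kind mentioned above might look roughly like the following. This is a hypothetical sketch using only the standard library, not SymPy's actual compatibility module; the names PY3, string_types, integer_types, and is_string are made up for the example.

```python
"""compat.py: hypothetical helpers shared by a single 2/3 code base."""
import sys

# Other modules in such a code base would typically begin with
#     from __future__ import print_function, division
# so that print and / behave the same on both versions.

PY3 = sys.version_info[0] >= 3

if PY3:
    string_types = (str,)
    integer_types = (int,)
else:
    # These names only exist on Python 2, and this branch only runs there
    string_types = (basestring,)  # noqa: F821
    integer_types = (int, long)   # noqa: F821


def is_string(value):
    """Return True for text on either major version."""
    return isinstance(value, string_types)
```

The rest of the code base then imports from this one module instead of sprinkling version checks everywhere.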

It's tough because I think it's very easy for either of us to say you shouldn't be supporting 2.6 anymore. But there are a lot of people out there with environments where they may be moving to 3 or even 2.7 for their future deployments, but they already have fleets of servers out there in the wild that are kinda locked into maintenance mode, where installing a whole new Python version just isn't practical.

I believe was it Brett Cannon, 1 of the core developers,

has basically come out and said that you should stop supporting

Python 2.6 for free

as a library developer.

And so I think SymPy is going to drop support after the next version.

Sure. And it shouldn't be very difficult to upgrade from 2.6 to 2.7 because there aren't any backwards incompatible changes, unlike updating Python 2 to 3, which can be a lot more difficult.

And to your point also, my situation at work actually is that we have 2.6 out there, but it's not clear how much development we're going to be doing on those servers. And if major new development ever is required, I, as a developer, am going to exert some upward pressure on my management to say, hey, we really gotta upgrade the platform. You can't live in the past like this. Otherwise, we're gonna be working in legacy versions of everything and not being able to import anything new, and that's just not productive.

So I think it's actually a tricky balance to strike, and I do totally understand that at some point you just have to cut the cord and say, okay, we're not doing it. It's 2.7 or bust. I think that makes a lot of sense.

Well, there's a lot of other libraries that aren't supporting it anymore. And I believe core Python is no longer supporting it for security releases, and that alone should be a reason for you to upgrade.

Yeah. That generally provides some decent impetus when the security releases are no longer available for a language or platform. That's generally when people finally start kicking it into gear and actually migrating upwards.

So I'm wondering if there are any

features of Python 3 that you would like to be able to use but can't because of the fact you're supporting both Python 2 and Python 3?

Oh, yeah. Well, actually, most of it is stuff that I can't even use because we still have to support 2.6. Like, we can't use dictionary comprehensions or set comprehensions or any of that stuff. Ever since I've been working on SymPy, we've had to support several versions older than the latest version. So I'm sort of used to not being able to use the newest features of Python until many years after they get introduced into the language.

As far as Python 3 itself, I'm trying to think. I haven't really thought about what sorts of things we could do if we supported Python 3 only. I guess we could do some cool stuff with function annotations.

What are the other cool features in Python 3? They just recently released the matrix multiplication operator. Actually, I don't know if we've added that to SymPy. SymPy does have symbolic matrices. I mean, that can be added to SymPy. You just have to define the __matmul__ method on Matrix. I should check if that's been implemented or not.
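To show what supporting the @ operator involves, here is a sketch with a hypothetical toy class named Matrix2 (standard library only, not SymPy's Matrix). Since Python 3.5, the expression a @ b calls a.__matmul__(b).

```python
class Matrix2:
    """Hypothetical minimal matrix type, just enough to support @."""

    def __init__(self, rows):
        self.rows = [list(row) for row in rows]

    def __matmul__(self, other):
        # Row-by-column products; Python calls this method for self @ other
        inner = len(other.rows)
        width = len(other.rows[0])
        return Matrix2(
            [
                [sum(row[k] * other.rows[k][j] for k in range(inner))
                 for j in range(width)]
                for row in self.rows
            ]
        )


left = Matrix2([[1, 2], [3, 4]])
swap = Matrix2([[0, 1], [1, 0]])

# Multiplying by the swap matrix on the right exchanges the columns
product = left @ swap
print(product.rows)
```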

And so anybody who's using 3.5 will be able to use that. Actually, 3.5 has some cool new features relating to star unpacking, where you can basically use the star operator to unpack things almost anywhere. So you can write stuff that's a lot terser than it would be otherwise because you can do, like, two different star unpackings in the same function call, or you can write an open square bracket and then a star and then a variable name and then a close square bracket, and that'll create a list out of whatever that variable was. I can use those in my personal projects, but it's gonna be many, many years, probably at least 2020, when CPython stops supporting Python 2.7, before SymPy is not gonna be supporting Python 2 anymore.
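The star unpacking forms described above look like this in Python 3.5 and later. The function and variable names are arbitrary choices for the example.

```python
def total(*values):
    """Add up any number of positional arguments."""
    return sum(values)


first = (1, 2)
second = [3, 4]

# Two star unpackings in the same function call
both = total(*first, *second)

# [*name] builds a list out of any iterable
as_list = [*first]

# Unpackings can also be mixed inside list and dict displays
merged = [*first, *second, 5]
options = {**{"color": "red"}, **{"size": 3}}

print(both, as_list, merged, options)
```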

Yeah. That will be an interesting year to see how many people,

run around with their hair on fire because they finally realized that they have to upgrade or they won't be able to, get any more security updates on the Python 2 branch

at all. With that being said, I do I do wanna highly recommend anybody who

is looking into SymPy to use Python 3.

One of the big changes in Python 3 that you don't hear about much these days is the change of the division operator to do floating point division instead of integer division. So, for example, if you do 1 divided by 2 in Python 2, that's gonna give you 0. In Python 3, it's gonna give you 0.5.

And that tends to catch a lot of people who are, say, new to SymPy because they'll try doing something like x to the one half power, and they'll just write that out. And then x will just sort of disappear because that ends up just being x to the 0, which just turns into a 1. Or they'll try to add one half to an expression and it just goes away.

And so this is one of the gotchas with SymPy: if you wanna divide two integers, you have to wrap that with SymPy types so that you get a rational number. So you'd have to write, like, Rational(1, 2) to get the rational number one half. But if you're using Python 3, or if you use from __future__ import division in Python 2, you'll at least get a floating point number instead of what's basically the wrong answer that just makes things disappear without you noticing, which is what happens in Python 2.
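The division gotcha can be reproduced on Python 3 by using //, which behaves the way / did for integers on Python 2. This is a sketch assuming SymPy is installed and Python 3 is in use.

```python
from sympy import Rational, Symbol, sqrt

x = Symbol("x")

floor_half = 1 // 2           # 0, what 1/2 gave on Python 2
float_half = 1 / 2            # 0.5 on Python 3
exact_half = Rational(1, 2)   # the exact rational number one half

vanished = x ** floor_half    # x**0, which collapses to 1
inexact = x ** float_half     # x**0.5, with a floating point exponent
exact = x ** exact_half       # the same thing as sqrt(x)

print(vanished, inexact, exact)
print(x + floor_half)         # the "one half" silently disappears
```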

Yeah. That is definitely a tricky bug to find when you run into it. And I actually was having a similar situation when I was doing division between 2 integer numbers in SQL and wondering why I wasn't getting the expected results, and then remembering that I had to explicitly cast 1 of the operands to a float to be able to actually get a floating point result. So, yes. Definitely, anytime you are wondering why it doesn't work and you're doing any division, then double check that.

I think anybody who's doing any kind of greenfield development these days, I think most people know you should just be using Python 3. There's really no reason not to. The old excuses of, well, this or that module isn't supported, have kind of gone away, because those few modules that really just haven't been updated have been replaced, pretty much, by honestly better, more modern modules that do exactly the same thing. And the few sort of really big outliers, like Twisted and

Yeah. I mean, I get it when you've got a legacy code base to work with and you don't wanna deal with porting it. But if you're, say, a new user to Python, there's no reason to not start with Python 3. And I'd say the thing that irks me the most is when I see people writing guides for new users, or teaching courses for new users, that are based on Python 2, because it's much better to learn Python 3, and then if that person needs to go and work with Python 2, they can go learn how Python 2 is different from Python 3.

I'd be really kinda shocked to see any such new guides being written that target Python 2. I mean, I think once Django updated all their getting started documentation to Python 3, I think a lot of other people fell in at that point.

Yeah. There are definitely fewer and fewer these days, but you'll still see them, or even just, maybe not a guide to Python, but a tutorial of something that's using Python, and the author happens to use Python 2.

So were there any performance bottlenecks you needed to overcome in creating SymPy?

Unfortunately, Ondřej couldn't make it. But, yeah, performance is a problem because, like I said earlier, it's really easy to create an expression that's arbitrarily complex.

So, for example, one of the functions in SymPy is factor, and what that does is it takes a polynomial and factors it into a product of irreducible polynomials. So for example, if you do factor of x squared minus 1, that'll spit out (x + 1) times (x - 1).

And so if you look at the runtime complexity of factor, it's based on the degree of the polynomial. So x squared minus 1 is a second degree polynomial, but it's really easy to create, like, a million degree polynomial, and it's not very large when you type it out because you're just typing x to the 1,000,000. So that's about, I guess, 10 characters right there. So it's really easy to create some million degree polynomial that's gonna take forever to factor, because the complexity of it has to do with the degree of the polynomial.

And so that's just a simple example to show that in a lot of cases the algorithmic complexity of a function might not have to do with the actual size of the input, but maybe with one of the numbers that appears in the input.
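To make the factor example concrete, here is a small sketch using SymPy's factor function. The huge polynomial is only constructed and printed, not factored, and the names small and big are invented for this illustration.

```python
from sympy import factor, symbols

x = symbols("x")

# The second degree example from the conversation.
small = x**2 - 1
factored = factor(small)
print(factored)  # product of the two linear factors

# A polynomial of degree one million is tiny to write down, even though
# factoring it could take an impractically long time, so it is only built here.
big = x**1000000 - 1
print(len(str(big)))  # number of characters needed to print it
```

The point of the second half is that the printed size of the input stays small while the degree, which drives the cost of factoring, is enormous.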

And so another issue is that it's very easy to write a symbolic algorithm in a naive way that ends up being extremely inefficient, like maybe exponentially slow when it could actually be run in linear time or something like that. So, for example, if you just naively try to write an algorithm to expand the product of two polynomials, or actually, I guess a better example would be to expand the power of a polynomial, so, say, x plus 1 to the 3rd power. If you try to write that out naively, you get something that's very inefficient. But the correct algorithm requires knowing how these multinomial coefficients work, and something like that.

So it's not uncommon for somebody to put an algorithm in SymPy that ends up being inefficient just because it's not as algorithmically well designed as it could be.
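The coefficient idea mentioned above can be sketched for the (x + 1) to the 3rd power example. This is only an illustration of the binomial shortcut, not a description of how SymPy's expand is implemented internally; the name from_coefficients is made up.

```python
from sympy import binomial, expand, symbols

x = symbols("x")

# Let SymPy expand the power directly.
expanded = expand((x + 1) ** 3)

# Build the same result from binomial coefficients instead of multiplying
# (x + 1) by itself term by term.
n = 3
from_coefficients = sum(binomial(n, k) * x**k for k in range(n + 1))

print(expanded)
print(expanded == from_coefficients)
```

For a sum of more than two terms raised to a power, the same idea generalizes from binomial to multinomial coefficients.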

And I guess the final problem is that Python as a language can be slow, and a lot of the things that make it faster don't really work very well for SymPy. So for example, if you try running SymPy through PyPy, it can actually end up being slower than just CPython, because a lot of the dynamic stuff that SymPy is doing doesn't really work with PyPy's JIT. And a lot of these other things, like Numba, for example, they're not gonna work at all with SymPy because those are designed around numerical algorithms and not this object oriented stuff that SymPy is using.

And so SymPy takes huge advantage of the dynamic nature of Python, and it can be slow. Ondřej Čertík is actually working on a project called SymEngine, which is maybe not a rewrite, but it's a new core for a symbolic engine written in C++. We'll eventually be able to swap out the core of SymPy with SymEngine so that the fundamental operations inside of SymPy can be faster.

Interesting. It definitely sounds like one of those problem domains where it's like, one step, two step, okay, it's reasonably shallow, it's all good. And then the third step, it's like, whoa, it's a bottomless chasm. It sounds like the difficulty from one expression to another can be very, very uneven, computationally speaking.

Oh, absolutely. Because, I mean, you're basically trying to compute on all of mathematics. So if you think about the hardest math that you had to do in high school or college, SymPy is trying to do stuff like that.

So, forgive me if you've already answered most of this in the previous question, but perhaps not, so I'm gonna ask it anyway. What are some of the interesting design or implementation challenges you found creating and maintaining SymPy?

I would say that it's difficult, or interesting, even just how you design the core system of SymPy. Once you have the basic symbolics, you know, I can create some symbolic expression and I have some ways of manipulating it, once you have that, doing these higher algorithms, like here's how I solve functions or equalities, or here's how I simplify an expression, those tend to be easier. But actually getting down to how you design it, you know, what does the type system look like? What are the ways that you manipulate an expression? There are several different approaches to doing that.

And so a SymPy expression is basically a tree. So if you think of something like x squared plus 1, the top level expression of that is an addition, so that might be represented as Add(x**2, 1). And then the x squared is a power, so that would be, like, Pow(x, 2). And so you have this tree where the leaf nodes are usually symbols or numbers, and then the higher nodes are the functions that contain those. And so most symbolic algorithms end up being these tree manipulation algorithms, and there are different approaches to doing tree manipulation.
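The tree structure described above can be inspected through the func and args attributes that SymPy expressions expose. The helper tree_lines below is a hypothetical function written for this sketch, not part of SymPy.

```python
from sympy import Add, Pow, symbols

x = symbols("x")
expr = x**2 + 1


def tree_lines(node, depth=0):
    """Return an indented outline of a SymPy expression tree."""
    # Leaves (symbols, numbers) have no args, so show their printed form;
    # interior nodes are labeled with their class name, such as Add or Pow.
    label = type(node).__name__ if node.args else str(node)
    lines = ["  " * depth + label]
    for child in node.args:
        lines.extend(tree_lines(child, depth + 1))
    return lines


print(expr.func is Add)             # the top-level node is an addition
print("\n".join(tree_lines(expr)))  # Add at the root, Pow and leaves below
```

A recursive walk like this is the basic shape of many tree manipulation algorithms: visit a node, then recurse into its args.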

And are there any new features or major updates to SymPy that you have planned? I know you mentioned the SymEngine, but I'm wondering if there's anything else.

So SymEngine is a separate project from SymPy. We do have a release planned, hopefully pretty soon. It's actually a 1.0 release, because we've been releasing versions that are less than 1.0 for too long. SymPy is a full featured computer algebra system at this point, and it's time for the version numbers to reflect that. As far as the new changes in this release, most of them are gonna be coming from our Google Summer of Code projects from this past summer. I'm sure I'm gonna leave some stuff out because we had several of them, but just to highlight some of the ones that I remember.

The solvers module, which is basically the module that deals with solving equations. So, like, solving x squared minus 1 equals 0. The solutions there are plus or minus 1. So the solvers module is being rewritten to use sets. One of the issues with the current solvers module is that it returns just this list of solutions, but the problem with a Python list of solutions is you can't really represent things like infinitely many solutions with that, and it's also not really a mathematical object.

And so SymPy has a sets module, which lets you represent mathematical sets. So for example, there's an object that represents the set of integers, there's another object that represents the set of real numbers, and you can do things like unions and complements and stuff like that. And so the new solveset module will return a set when you solve. And so that lets you do things like, say, solve sin(x) equals 0. The solution to that is n times pi where n is an integer, so that's an infinite number of solutions. And so this can actually represent that as the set of n times pi where n is an integer. And so this lets us represent a lot more possible solutions.
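Here is a brief sketch of the two equations mentioned above, using the solveset function as it exists in released versions of SymPy. Restricting the domain to the reals is a choice made for this example, and the exact printed form of the infinite set may vary between versions.

```python
from sympy import FiniteSet, S, sin, solveset, symbols

x = symbols("x")

# A polynomial equation with finitely many real solutions.
finite = solveset(x**2 - 1, x, domain=S.Reals)
print(finite)

# sin(x) = 0 has infinitely many real solutions; the result is a set object
# describing integer multiples of pi rather than a Python list.
periodic = solveset(sin(x), x, domain=S.Reals)
print(periodic)
print(isinstance(periodic, FiniteSet))
```

Because the results are set objects, they can be combined with the union, intersection, and complement operations from the sets module.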

The other nice thing about the solveset module is that it's able to give you some guarantees about whether or not it's found all the solutions. One of the issues with SymPy's current solve module is that if you pass it a function and it returns, say, an empty list, there's no way to know whether that means that there are no solutions or that SymPy wasn't able to find any solutions, which can be problematic. So, for example, if you wanna write a little function that finds the maxima and minima of a function, from calculus we know that we can just take the derivative and solve where it equals 0. But the problem is you need to know everywhere where it equals 0. Otherwise, you might miss one of the maximal points, and the one that you find might not actually be the global maximum. And so this solveset module makes it easier to implement stuff like that.

Another thing that's continuously improving, and there are gonna be a lot of improvements in the next version, is our assumption system, which deals with making assumptions on expressions. So for example, I wanna say I have a symbol x and I wanna assume that it's positive. And this is important because there are a lot of mathematical operations which are only valid under certain conditions. So a classic example is if you take the square root of x squared, that's x squared and then the square root of that. That expression is only equal to x if x is positive. So, for example, if x is negative, then it doesn't equal x. It equals negative x. And so if you wanna simplify the square root of x squared down to x, you're gonna have to assume somewhere that x is positive. And so our assumption system lets you do that. And there are a lot of other examples throughout SymPy where it needs to check some assumptions on some expression in order to perform an operation, because it's only valid for that domain.
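The square root example can be tried directly by attaching assumptions to symbols when they are created. This sketch uses the keyword-argument style of assumptions on Symbol; the symbol names are chosen only for illustration.

```python
from sympy import Abs, Symbol, sqrt

x = Symbol("x")                  # no assumptions at all
r = Symbol("r", real=True)       # real, sign unknown
p = Symbol("p", positive=True)   # assumed positive
n = Symbol("n", negative=True)   # assumed negative

print(sqrt(x**2))  # left unsimplified, since nothing is known about x
print(sqrt(r**2))  # absolute value of r
print(sqrt(p**2))  # simplifies to p
print(sqrt(n**2))  # simplifies to -n
```

Only the symbols with a known sign collapse to a bare symbol, which matches the point that the simplification is valid only on a restricted domain.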

And so the SymPy assumption system has sort of a colorful history. There's an old system and a new system. I don't wanna get into too many details because that's gonna be a lot longer than what I wanna say, but we're working on replacing the old system with the new system.

And how is the evolution of SymPy managed from a feature perspective?

And have there been any occasions in recent memory where a pull request had to be rejected because it didn't fit with the vision for the project?

So the range of things that fit into SymPy is very broad. Basically, anything that can be mathematically, symbolically represented is a good fit for SymPy. So if you go to GitHub, github.com/sympy/sympy, and you look in the sympy module at all the submodules, you can see there's a huge range of modules. There's a logic module, there's matrices, there's number theory, there's physics.

But as far as things that actually don't fit, it's anything that's not a symbolic thing. So for example, numeric stuff, we try to leave that to other libraries which do that much better, like SciPy or NumPy. And another example would be graph theory. There are plenty of good graph theory modules.

So, yeah, I would say that most things that would be rejected don't actually make it to pull requests, so I don't have any examples of pull requests. But we do get a lot of people who wanna implement stuff that is basically not symbolic. So it's not that they're bad ideas, but they are a better fit for other libraries. And the great thing about Python is that it's easy to use SymPy and these other libraries together, so it's not really necessary for them to go into SymPy for them to be usable by users of SymPy.

And which of the features of SymPy do you find yourself using most often?

Well, so back when I was in college, I used it to help me do my homework, which, I guess, varied depending on what specific class I was in. I majored in mathematics, so I took a lot of different mathematics classes. These days, I mean, I do development in it, but as far as just using it, I mainly use it as just a calculator. You know, if I see something interesting online about mathematics, I'll plug it in there. I like to peruse, I guess it's Math Stack Exchange, and look at some of the questions there, and sometimes I'll plug stuff in there. Or if I see something cool, you know, oh, I wonder if SymPy can do that.

Like, recently, apparently on Twitter, if you use the poll feature, where you can create a poll where people can vote on things, you can actually squeeze more than 140 characters into there. And so I figured out the new character limit was, like, I think, 157, if you include the poll answers. And so there's this question of, what's the largest prime number that could fit in a tweet? And it's really easy to compute that with SymPy because there's just a function called prevprime, and you just do prevprime of 10 to the 140, and that'll give you the largest 140 character prime in base 10.
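The tweet-sized prime computation described above looks roughly like this. The sketch uses the 140 character figure from the conversation, and the variable name largest is made up.

```python
from sympy import isprime, prevprime

# prevprime(n) returns the largest prime strictly less than n,
# so this is the largest prime with at most 140 decimal digits.
largest = prevprime(10**140)

print(largest)
print(len(str(largest)))  # number of decimal digits
print(isprime(largest))
```

Changing the exponent gives the corresponding prime for any other character limit.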

And so I recently computed the largest 157, or whatever it was, character prime and tweeted that out as a poll. Somebody voted on one of the poll responses, which is just, you know, a bunch of nines.

So before we move to the picks, is there anything that we didn't ask that you think we should have or anything else that you'd like to bring up?

I guess I would mention that we are very welcoming to new contributors for SymPy. So if you're interested in this sort of stuff, and maybe you're new to open source, come check us out. I guess if you go to our website, it'll point you to where to go. But we have a Gitter channel where you can go to get started. And there are some pages on our wiki on how to get started with SymPy development. And we have some issues in our issue tracker that are tagged as easier to fix issues for people who are new. And, yeah, we try to be as welcoming as possible to new contributors because that's ultimately how most SymPy developers started out.

Great. So with that, I will move us to the picks.

And my first pick this week is going to be the podcast Functional Geekery. It's one that I've been listening to for a little while, and the host has on people from various languages in the functional space. So, you know, one week he might have somebody talking about their work within Erlang, and the next week it might be somebody talking about Clojure, and then followed by somebody talking about Haskell. So it's just an interesting way to learn more about functional programming and some of the work that people are doing on that side of the fence. So I've been appreciating that.

My next pick is a music pick that I stumbled across while I was listening to Spotify. It's called Nekrogoblikon. It's melodic death metal, and it's actually pretty enjoyable. I've been listening to that a bit lately, and in particular, their latest album called Heavy Meta. So for anybody who listens to metal at all, it's definitely worth at least trying it out.

And my last pick this week is going to be Marble Fun Run. So, you know, if you've ever seen the marble tracks that you can snap together in various ways and that have different shaped bits. So, you know, some of them are curvy, some of them have little water wheels that'll spin as the marbles go down. I recently picked up a set of those for my kids. Just a good way to pass the time and experiment with different flows of the marbles. And with that, I will pass it to you, Chris.

Cool beans. My first pick is a new podcast called Surprisingly Awesome. And each week, they showcase something that seems kinda mundane but that actually has some interesting characteristics about it. Like, one week recently they showcased Tubthumping by Chumbawamba, which, for any of us who were around during that time period, just got overplayed to death. But did you realize that they were actually a hard edged punk anarcho-collective before they became a pop band? So it's that kind of thing. It's very fun.

My next pick is a documentary that Avdi Grimm had recommended, and I'm a huge fan of his and follow most of what he writes about. So I thought, you know, I should give this a shot. And I have to say, it's really interesting. I'm not sure I entirely agree with everything that the author has to say, but it's incredibly thought provoking and visually beautiful, with all sorts of really great footage. It's called All Watched Over by Machines of Loving Grace. It's an older documentary, 2011, from the BBC. And the basic premise that the author makes is that at the dawn of the computing era, some of the pioneers in the industry had thought that, with computing power on the increase, we would be able to create this sort of worldwide utopia. We'd be able to do away with government, and that just hasn't panned out. And he has really interesting interviews, showing, you know, footage of, like, Ayn Rand. It's really interesting, thought provoking stuff. I could go on and on about it, but I won't. I think you should watch it and let me know what you think.

My next pick is a band, a Japanese pop band. Because I watched the documentary, and the theme song from it stuck in my head and just burned its way into my brain. And I said, I gotta hunt this down and figure out who these folks are. It is a Japanese pop band called Pizzicato Five. And, as I said, the theme song for that documentary is made by them, called Baby Love Child. It's just a really interesting, kind of quirky Japanese pop sound. I really like it.

My last pick is a beer. I encountered this last night, actually. It's called Mayflower Hoppy Brown Ale. I've also seen it as their Cooper series brown ale. It's really kind of interesting because most of the time when you see brown ales, they're very sort of malty and don't have much of a hop signature. But this one has some really pronounced hops, and so it's really kind of an interesting neither fish nor fowl. It's not like the average brown ale, because it's so hoppy. It's also not an IPA, because it does have the usual sort of malts you find in a brown ale. It's really tasty stuff, and something a little unusual to tickle your palate if you're a beer fan and in need of a change.

Aaron, what picks do you have for us?

Yeah. So I don't have nearly as many as you guys do, but my first pick is gonna be a website called Fermat's Library, and this is for anybody who is interested in mathematics, or, I guess, there's also some physics here. So the idea of this is that they pick a paper each week and present it in this way where people can comment on it in the margins, sort of like how Fermat famously did with his theorem that he never actually proved. And so there are some interesting papers here. I think this is just a great way to see an interesting paper every once in a while, and they tend to be very short papers as well. One here is the Bitcoin paper by Satoshi Nakamoto. There's a paper here on a simple proof that pi is irrational. There's a paper here that's, what was it? The shortest paper ever published in a serious math journal, by John Conway and Alexander Soifer. You should look at what that is. And so if you're interested in mathematics as I am, I think that's a great thing.

My next pick, I don't know how self promotional these picks are allowed to be, but

As much as you want. Go for it.

This is just a little thing that I wrote that I was reminded of today, which is called catimg. So I guess this is sort of two picks. One is, if you use OS X and use a terminal at all, I highly recommend that you use iTerm2 instead of the default terminal that comes with OS X, because it is just amazing, and it has just a huge amount of features. It's easily the best terminal emulator on any operating system. And so one of the features is that you can actually display images in the terminal by doing some special escape sequences.

And so what I've done is I created a little program that goes to Imgur and downloads an image of a cat and displays it in your terminal. You just type catimg, that's c-a-t-i-m-g, and it'll download a cat from Imgur and display it in your terminal. And so at the top of my bash profile, I have catimg, and then it displays a fortune from the fortune program. And so whenever I open a new terminal tab, I'm greeted with a nice cat. You can install that with pip, pip install catimg, or conda install catimg from my conda channel.

So for anybody who wants to follow what you're up to and keep in touch with you, what would be the best way for them to do that?

Well, for me specifically, probably the best way would be my Twitter, which is asmeurer. It's a-s-m-e-u-r-e-r. I'm fairly active on Twitter. Now for SymPy, I would recommend the SymPy mailing list and the Gitter channel. And if you Google either of those, you should find those quite easily.

Alright. Well, we appreciate you taking the time to join us this evening, and I definitely enjoyed learning some more about SymPy and computer algebra systems in general.

Very cool. Thank you very much for taking the time to talk to us. SymPy is a really neat project, and I've only sort of scratched the surface with it. Mathematics is one of those things where I feel like I'm just at the very outlying edge of my own capabilities with it. And SymPy seems like a really great exploration tool to broaden my mathematical horizons. Thank you.

Well, you shouldn't feel bad because there are areas in SymPy where I have no idea what's going on because I don't know the mathematics behind it. So I don't know if there's any one person who actually is capable of understanding everything in there.

And if they do, they should step forward.

Alright. Well, thank you for having me. Good night. Have a good night. Good night.

The Python Podcast.init

Summary

Brief Introduction

Interview with Aaron Meurer

Picks

Keep In Touch

Links
