Summary
Building a machine learning model is a process that requires a lot of iteration and trial and error. For certain classes of problems, a large portion of the searching and tuning can be automated, which allows data scientists to focus their time on more complex or valuable projects and opens the door for non-specialists to experiment with machine learning. Frustrated with some of the awkward or difficult-to-use tools for AutoML, Angela Lin and Jeremy Shih helped to create the EvalML framework. In this episode they share the use cases for automated machine learning, how they have designed the EvalML project to be approachable, and how you can use it for building and training your own models.
Announcements
- Hello and welcome to Podcast.__init__, the podcast about Python’s role in data and science.
- When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show!
- Your host as usual is Tobias Macey and today I’m interviewing Angela Lin and Jeremy Shih about EvalML, an AutoML library which builds, optimizes, and evaluates machine learning pipelines
Interview
- Introductions
- How did you get introduced to Python?
- Can you describe what EvalML is and the story behind it?
- What do we mean by the term AutoML?
- What are the kinds of problems that are best suited to applications of automated ML?
- What does the landscape for AutoML tools look like?
- What was missing in the available offerings that motivated you and your team to create EvalML?
- Who is the target audience for EvalML?
- How is the EvalML project implemented?
- How has the project changed or evolved since you first began working on it?
- What is the workflow for building a model with EvalML?
- Can you describe the preprocessing steps that are necessary and the input formats that it is expecting?
- What are the supported algorithms/model architectures?
- How does EvalML explore the search space for an optimal model?
- What decision functions does it employ to determine an appropriate stopping point?
- What is involved in operationalizing an AutoML pipeline?
- What are some challenges or edge cases that you see users of EvalML run into?
- What are the most interesting, innovative, or unexpected ways that you have seen EvalML used?
- What are the most interesting, unexpected, or challenging lessons that you have learned while working on EvalML?
- When is EvalML the wrong choice?
- When is auto ML the wrong approach?
- What do you have planned for the future of EvalML?
Keep In Touch
- Angela
- angela97lin on GitHub
- Jeremy
- jeremyliweishih on GitHub
Picks
- Tobias
- Angela
- Sarma mediterranean restaurant
- Jeremy
- Crucial Conversations by Kerry Patterson, Joseph Grenny, Ron McMillan, and Al Switzler (affiliate link)
Closing Announcements
- Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management.
- Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
- If you’ve learned something or tried out a project from the show then tell us about it! Email hosts@podcastinit.com with your story.
- To help other people find the show please leave a review on iTunes and tell your friends and co-workers
- Join the community in the new Zulip chat workspace at pythonpodcast.com/chat
Links
- EvalML
- FeatureLabs
- Alteryx
- Scheme
- NetLogo
- Flask
- AutoML
- Woodwork
- FeatureTools
- Compose
- Random Forest
- XGBoost
- Prophet
- Greykite
- Shap
The intro and outro music is from Requiem for a Fish by The Freak Fandango Orchestra / CC BY-SA
Hello, and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you're ready to launch your next app or want to try a project you hear about on the show, you'll need somewhere to deploy it. So take a look at our friends over at Linode. With the launch of their managed Kubernetes platform, it's easy to get started with the next generation of deployment and scaling powered by the battle tested Linode platform, including simple pricing, node balancers, 40 gigabit networking, dedicated CPU and GPU instances, and worldwide data centers.
Go to pythonpodcast.com/linode, that's L-I-N-O-D-E, today and get a $100 credit to try out a Kubernetes cluster of your own. And don't forget to thank them for their continued support of this show. Your host as usual is Tobias Macey. And today, I'm interviewing Angela Lin and Jeremy Shih about EvalML, an AutoML library which builds, optimizes, and evaluates machine learning pipelines. So, Angela, can you start by introducing yourself?
[00:01:10] Unknown:
Sure thing. Hi. I'm Angela. I graduated from my undergrad at MIT around 2 years ago, and I joined the team working on EvalML at Feature Labs as an intern. And then after Feature Labs was acquired by Alteryx, I joined them full time as a software engineer, and I've been there since. And Jeremy, how about yourself? Hi, everyone. I'm Jeremy.
[00:01:33] Unknown:
I pretty much have the same timeline as Angela, but I finished my master's at Tufts University in 2019 as well, and right after, I joined Feature Labs, and I was basically there for the inception of the EvalML project, and I've been working on it since, and have since joined Alteryx as well. And going back to you, Angela, do you remember how you first got introduced to Python?
[00:01:52] Unknown:
Yeah. I was very lucky in that I attended a high school that had a pretty, like, well thought out CS curriculum. So we started the 1st semester with more functional programming, so Scheme and NetLogo. And then the 2nd semester was in Python, where we were able to just, like, code up small, simple algorithms,
[00:02:13] Unknown:
small, like, Flask web apps, that kind of stuff. And Jeremy, do you remember how you first got introduced to Python? Yeah. So I started a little later, probably like sophomore year of college where I took, like, this web engineering course, and that's when we were introduced to Django and Flask. So I kinda got my start there, and then as I progressed down more of the machine learning track at Tufts, I was, like, naturally introduced to more and more Python things. Can you start out a bit by describing
[00:02:39] Unknown:
what the EvalML project is and some of the story behind how it got started and what the reason was for creating it in the first place? Sure. I can talk a little about that. So EvalML, as you mentioned, is an automated machine learning library.
[00:02:53] Unknown:
And so maybe to talk a little bit about what that means, we can talk about what the workflow for machine learning usually is. I think, traditionally, with machine learning, it refers to, like, being able to select data and then models and then the model parameters, and that's quite difficult to do by hand. I think, usually, it requires a lot of brute force or specific knowledge, whether about data science or also just the field. It's what I think my manager likes to refer to as "trial by grad student." Basically, just a lot of, like, brute force trial and error. And so AutoML and EvalML aim to simplify that process.
So the user, instead, only has to select the data that they care about that describes the system of interest, and then AutoML is supposed to help you select both the type of model and also the parameters for your model. And a little bit about, like, how that came to be. So I can speak personally, at least. Like, I came from a background where I had used machine learning in some of my courses, but I wasn't very familiar with it. And in the courses where I used machine learning, it was exactly what I described, where I had to use specific models that maybe my professors had talked about or we went over in class. I didn't really understand how to tune the models or exactly how to, like, get better performance except try different parameters.
So there was a lot of frustration on my part where I was just trying a lot of different things and, like, seeing the scores go up and down and not really, like, understanding how to approach it well and spending a lot of time on that. So I think because we saw a lot of frustrations out there, not just in modeling, but in the data science process in general, EvalML was created through that process.
[00:04:48] Unknown:
Yeah. Like Angela said, I think, basically, we really wanted to work more on AutoML just to cover, like, the pain points of data scientists out there. Me personally, I worked 1 summer as a data science intern, and I just kind of got to see that, like Angela said, it was like a very iterative process where there was a lot of trying new things and going back and trying more things, so you kind of really needed some infrastructure around you to kind of facilitate that process, and not many companies out there actually have that kind of infrastructure or can invest in that kind of infrastructure. So when I was talking to Max, Max Kanter, who was a co-founder at Feature Labs, that was something that really inspired me to join, so that's kinda, like, why we started the project as well. To mention Feature Labs, in house, what we already had was Feature Tools, which is a library
[00:05:34] Unknown:
used for automated feature engineering, but that's only 1 step of the pipeline, right? And since then, because we want to build, like, quality tools for data scientists, we've kind of expanded out our repertoire to include different steps of the pipeline. So now we have, in the open source world, Compose, which helps with automatic data labeling; Woodwork, which helps with data typing; Feature Tools, which is for automated feature engineering; and now EvalML, which helps with the last step of that process, automated machine learning.
[00:06:10] Unknown:
In terms of AutoML, my understanding is that it's basically just enumerating and exploring a particular search space for the machine learning outcome that you're trying to solve for, determining the appropriate set of inputs and models and the tuning parameters thereof, and then giving you the output of this is the best result. This is the model that was used. Here are a couple of other options. Wondering if you can just describe some of the process of determining that search space and how you're able to constrain it so that you don't end up wasting a lot of time and compute resources going down potential dead ends or sort of determining early in the cycle which paths might end up being suboptimal for the problem at hand. So I think I'd kind of like to
[00:06:57] Unknown:
describe the whole AutoML process kind of through the lens of a data scientist. So at the start of any problem, a data scientist would probably, let's just say for like a customer churn problem, try to formulate a hypothesis and figure out what kind of data they want. So at the beginning, there's this problem formulation phase, and next you kind of build features or grab data to try to see if the data that you grab can be converted into good models. And the model search there is kind of, I guess, on a higher level, what you described about how AutoML works, right? So maybe a traditional data scientist has good domain knowledge on a customer churn problem, and they kind of know what model is out there or what kind of data out there is good for something like that. So maybe for some specific models, they can take like a Random Forest model or like an XGBoost model and kind of build out the grid search of possible parameters for these models and kind of just work off that. So there it's kind of like an iterative process, and for us, we try to do the same. At the very beginning of our project, we kind of just had a very simple AutoML algorithm, and it was kind of exactly as you described, right? Given these data and the potential models out there, kind of iterate through these models, and for each of these models, just try to figure out which hyperparameters or what configurations for these models are the best for this specific problem.
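To make concrete what that hand-rolled loop looks like before it is automated, here is a rough, generic sketch in plain scikit-learn (not EvalML code); X_train and y_train are placeholder variables:

```python
# A hand-rolled version of the "trial by grad student" loop that AutoML automates:
# try a few model families, grid search each one's hyperparameters, keep the best.
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

candidates = [
    (RandomForestClassifier(), {"n_estimators": [100, 300], "max_depth": [None, 10]}),
    (LogisticRegression(max_iter=1000), {"C": [0.1, 1.0, 10.0]}),
]

best_score, best_model = float("-inf"), None
for estimator, param_grid in candidates:
    search = GridSearchCV(estimator, param_grid, scoring="f1", cv=5)
    search.fit(X_train, y_train)  # placeholder training data
    if search.best_score_ > best_score:
        best_score, best_model = search.best_score_, search.best_estimator_

print(best_model, best_score)
```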
Recently, we've been doing a lot more development on the AutoML algorithm, as we call it, and that's something I've been working on. And basically, it's just adding more and more heuristics or more automation into the AutoML process. So an example of that would be to try to see if adding on feature engineering or feature selection is, like, a possible avenue for us to optimize on, and if it's not, we can move forward. What else? Like, try to see at what steps we should do ensembling, and stuff like that. But like I said, like, at the very beginning, our AutoML algorithm was fairly simple, but we're still in the works of trying to improve that. In terms of the
[00:09:04] Unknown:
audience for the EvalML project, who is sort of your target persona for the EvalML toolkit and the types of problems that it is well suited for being applied to? I would say, at the end of the day, I would want
[00:09:19] Unknown:
anyone who has, like, any knowledge in Python and wants to use machine learning but maybe doesn't have as much knowledge about machine learning; that's who I want the persona of EvalML to be. I think it's not quite there yet. Right now, we still, I think, gear more towards a data science persona simply because we're not as opinionated as we want to be about, like, the choices that the algorithm makes. So at this point, we still require the user to know at least some knowledge about tweaking specific things or understanding, like, the problem type. But at the end of the day, I think we are trying to get to a place where we are more opinionated, and we do a lot more of the automation so that, again, all the user has to do is provide us with data. We'll tell you if the data is faulty or things that you can even clean up with your data, and then you can just call our library, build the model that you need, and then go off and solve your business objective. Another interesting thing to dig into is the
[00:10:26] Unknown:
overall space of AutoML as a problem domain and some of the tools and systems that are available for people who are looking to take advantage of the capabilities of machine learning without necessarily either having all of the domain expertise or understanding of how to actually build and apply these models or who are looking to iterate faster than if they were to do all that exploration on their own. And so I'm wondering if you can just talk a bit about sort of what are the available systems and tools in the ecosystem, whether, you know, in Python or outside, and what was missing in the available offerings that motivated you to create the EvalML project and sort of what it brings to the space that is unique? I think in general, there's, like, a couple of good tools out there in the Python open source community
[00:11:19] Unknown:
for automated machine learning. A couple examples of them would be H2O AutoML, auto-sklearn, I think Databricks has their own library, Auto-WEKA is another 1, and I think in general, all these packages do CASH optimization, which is combined algorithm selection and hyperparameter optimization. At the base level, that's what we all do, and it's kind of like what you described earlier on about what AutoML is like in general, basically taking the process of getting data and then figuring out what the best models are out there and trying to figure out the best parameters for those models. It kind of felt like we could improve on the user experience for AutoML, and that's kind of what we focused on to begin with. Like Angela explained earlier on, at the very beginning, we wanted a lot of customizability or a lot of user input in our AutoML library. An example of that was adding custom objectives to EvalML, and custom objectives, on a high level, is basically taking the real objective of your business and trying to optimize your models on that instead of these more obscure loss functions or these optimization objectives like you see in machine learning. So it's kind of like transforming the machine learning objective into something that you can see have more of an impact on your business. An example of that would be, for customer churn, maybe a certain gain or loss, depending on if your customer were to come back or not. Yeah, so in general, we just wanted to focus a lot on the user side of things.
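As a rough illustration of that custom objective idea, here is a hedged sketch of a churn gain/loss objective. It assumes EvalML's BinaryClassificationObjective base class and objective_function hook as documented around this time; the class name, dollar amounts, and extra class attributes are made up for the example:

```python
import numpy as np
from evalml.objectives import BinaryClassificationObjective

class ChurnValue(BinaryClassificationObjective):
    """Hypothetical objective: average dollars gained or lost per retention decision."""
    name = "Churn Value"
    greater_is_better = True
    score_needs_proba = False
    perfect_score = 50.0            # assumed: every churner caught, no wasted offers
    is_bounded_like_percentage = False
    expected_range = [-10.0, 50.0]

    def objective_function(self, y_true, y_predicted, X=None, **kwargs):
        y_true = np.asarray(y_true)
        y_predicted = np.asarray(y_predicted)
        # +$50 for each churner we correctly target, -$10 for each retention offer
        # wasted on a customer who would have stayed anyway (made-up numbers).
        saved = ((y_predicted == 1) & (y_true == 1)).sum()
        wasted = ((y_predicted == 1) & (y_true == 0)).sum()
        return (50.0 * saved - 10.0 * wasted) / len(y_true)
```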
And now that our package has progressed a little more, like Angela said, we kind of are trying to focus more on giving more opinions, or having more of an opinionated approach to AutoML. We can also, not move necessarily, but, like, support both data scientists that can take advantage of all these things that we allow them to configure, but also give it to users who have less domain knowledge or less machine learning expertise, and also figure out their problems as well. I think for me, for sure, the first thing that comes to mind is being very user centric, and I think that's pretty evident in that
[00:13:26] Unknown:
for our landing page or for when someone wants to use EvalML, there's 1 method that they really need to think about, which is our search method, and that runs the AutoML process for you. There and done. So I think that makes it a very, like, easy access for beginner users because there's just 1 method that you really need to focus on. But then if you are a more expert user, then, sure, you can dive more into the nitty gritty and change things up as you will. I think another important thing is that we're 1 of the open source libraries that combines other popular libraries via a unified API. So Jeremy had mentioned auto-sklearn and some other ones out there. We use models from popular libraries, such as scikit-learn or LightGBM, CatBoost, XGBoost, and we provide all those models under our API, which I think is pretty useful if people just wanna try out different things and not have to learn how each of those libraries work individually because they do tend to have, like, slightly different APIs.
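For readers following along, a minimal sketch of that single entry point, assuming the AutoMLSearch API as documented around the time of this episode (the demo dataset and objective names are just illustrative):

```python
import evalml
from evalml.automl import AutoMLSearch

# Any tabular X / y will do; the bundled demo dataset keeps the sketch self-contained.
X, y = evalml.demos.load_breast_cancer()
X_train, X_test, y_train, y_test = evalml.preprocessing.split_data(
    X, y, problem_type="binary"
)

automl = AutoMLSearch(X_train=X_train, y_train=y_train, problem_type="binary")
automl.search()                 # the one method most users need to think about

print(automl.rankings.head())   # leaderboard of the pipelines that were tried
best = automl.best_pipeline     # typically already fitted on the training data
print(best.score(X_test, y_test, objectives=["f1", "log loss binary"]))
```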
Something else that I was able to work on a while ago, kind of expanding beyond the AutoML model building process, is something that we call data checks, and data checks are basically heuristics that you can apply onto your data to tell you if there's something wrong or something that potentially would cause a model to not perform as well. So some simple examples might be if you have a lot of NaN values in your data. Usually, a lot of ML models out there, they don't really know what to do with NaN values or they'll just error out. So we built a whole collection of, and we're continuing to build more and more, data checks, which check for different types of errors or potential issues with your problems, such as NaN values, which is probably the 1 that comes to mind right now.
But in building the data checks, we hope that we're able to give users clear feedback about how to update their problem configuration and their data so that they get
[00:15:39] Unknown:
better model performance at the end of the day. Digging into the project itself, can you talk through some of the software design that has gone into it and some of the sort of libraries that you're able to use from the ecosystem and just some of the overall design considerations that have gone into how you've built the project and how you have aimed to make it accessible to end users?
[00:16:02] Unknown:
So I think the way that our project is architected is probably similar to a way that a user might want to interface with it top down. So at the very top, we have what I mentioned before, which is the AutoML search object or the search method. That's a primary interface that a user would interface with if all they cared about was creating or using AutoML. And so that breaks down into all of our other, like, smaller components. So we have objectives, which are what metrics the model should be optimizing for. AutoML search creates pipelines, which represent a series of operations that should be applied to the data. And each of these operations are either a data transformation or an actual ML modeling algorithm.
And then we have component graphs, which fall underneath our pipelines, and they simply encode the transformations that are applied to data. But unlike pipelines, they don't necessarily have to be a linear sequence, so that you're able to have different input pathways for different types of features, or you're able to combine different, like, pathways into 1 final algorithm, basically. And kind of going even 1 step below that, what our component graphs are comprised of are components, which are exactly what we talked about, the data transformations or the modeling algorithms. And they're the lowest building blocks of EvalML, and they just represent, like, a fundamental operation that should be applied to the data. So I guess going back to the user side of things, again, if a user just wants to come into our library and run AutoML, they can just go from the top: AutoML search, call search, done. If they care more about what pipelines are created, then they can go and create their own pipelines. They can create their own component graphs or components. And so there's, like, different levels of control that they could have depending on their expertise.
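For that pipeline level of control, a hedged sketch of assembling a pipeline directly from named components; it assumes the instantiable pipeline API (string component names plus a parameters dict) that EvalML exposed around this time, so exact constructor details may differ by version:

```python
from evalml.pipelines import BinaryClassificationPipeline

# A linear component graph built by name; parameters are keyed by component name.
pipeline = BinaryClassificationPipeline(
    component_graph=["Imputer", "One Hot Encoder", "Random Forest Classifier"],
    parameters={"Random Forest Classifier": {"n_estimators": 200}},
)
pipeline.fit(X_train, y_train)   # training data from the earlier sketch
print(pipeline.score(X_test, y_test, objectives=["auc"]))
```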
[00:18:06] Unknown:
Yeah. The sort of incremental reveal of complexity is always a useful attribute of different software projects where if all you care about is I just need to get something done fast, there's a way to do that. But then as you spend more time with it and want to, you know, pull back the covers more and gain more control, there is a way to do that without having to hack apart the project and, you know, pull out the guts to put it back together the way you want it to.
[00:18:30] Unknown:
Right. I think in some ways, that's kind of representative of how we've, like, progressed as well, where we had AutoML, and it had pipelines which were hard coded at first because we didn't have this abstraction of components really baked in. And then we realized, well, for our more complex problems, we might not always want to use the same components or, like, fundamental operations. Right? So that's how we broke it down to the level of components, and now we're able to, on our side as well, be able to combine these components in different ways for a smarter algorithm.
[00:19:06] Unknown:
In terms of the overall design and evolution of the project, you mentioned that you have added in some of these features that make it a bit more flexible. But what are some of the assumptions that you had early on about the way that the project was going to be used or the feature set that you wanted it to be able to have that have been sort of changed or updated as you worked through it and as you had other people using it and giving you feedback?
[00:19:32] Unknown:
So I think at the beginning of the project, like Angela said, a lot of what we were doing was just building out the abstractions, and I guess I kind of referred to it as, like, the platform for everything, so kind of just building out the pipelines and components. I think that's actually the very first big project we both worked on together back in October of 2019, and this was just very near the inception of the project. And still, like, EvalML is still quite a young project, just going on basically 2 years old. So like you said, we've been evolving a lot. Now, a big part of that is expanding our team. Like, when we started, there was just the 2 of us and Max who was leading us, and then eventually, we had a couple more interns join on board and more and more team members, and now we have around 10 full time team members, and basically that has just allowed us to expand our output by a lot. But I don't think, in general, we really deviated far from our original goals. I think maybe certain things came in when we felt that it was necessary to help with the AutoML process. Like, for example, with a lot of our model understanding tools, we were missing this final piece at the end of the AutoML process that allowed users to gain more from using EvalML. So we put in a lot of investment into model understanding tools, things like partial dependence, and just kind of things so that people could look at the pipelines or models that we output from AutoML and see how it relates to their data beforehand, so I think that's 1 thing that was added. Yeah, Angela, do you remember anything else that we kind of added as we went along?
[00:21:05] Unknown:
Sure. I think that's a really good call out, and to what you talked about before, Tobias, about AutoML libraries just being libraries that automate the building process of pipelines, I think we've tried to expand beyond that so that that's not all that we do. Jeremy mentioned, like, model understanding. So once you get your pipelines, like, how can you actually understand or interpret the model? There's also, like I mentioned before, the data checks. So, like, even before you build your models or run AutoML, like, can you better understand your data or, like, clean that up?
I think another big push for us that's still in beta, but something that we've been working towards is time series. And that, I think, as Jeremy mentioned, has just been, like, because we've been able to expand our team, we've been able to kind of dive into
[00:21:58] Unknown:
other realms and kind of broaden our scope a lot more. The overall space of time series is 1 that's definitely been seeing a lot of attention lately, where there's the Prophet library that's been gaining a lot of attention coming from Facebook. LinkedIn recently released Greykite. I know that Zillow has a project for being able to do some anomaly detection on time series data. And so I'm curious what you've seen as some of the interesting problems to solve in that domain and how you can apply AutoML techniques to it and some of the complexities that that brings to the problem when you are dealing with this dimension of time in the dataset.
[00:22:37] Unknown:
Well, I think it's funny that you mentioned Prophet because 1 of our team members had just merged in Prophet integration so that now you can use the Prophet library within EvalML. And, honestly, I can't say I've been too involved with the time series work. I've, like, listened to some discussions, and I think at the end of the day, like, not a lot of AutoML libraries out there have time series, because working with classification or regression problems versus time series problems can be, like, a whole different set of challenges. I think 1 of the biggest things that I think about is, like, for time series, because, well, time is important, you can't simply just do CV or, like, cross validation splits of the data, because you can't shuffle the data around without messing up the ordering.
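To make that cross validation point concrete: with time series data the folds have to respect chronological order instead of being shuffled, which is what scikit-learn's TimeSeriesSplit does (a generic illustration, not EvalML's internal splitter):

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit

series = np.arange(12)  # 12 ordered observations standing in for a time series
for train_idx, test_idx in TimeSeriesSplit(n_splits=3).split(series):
    # every training window ends before its test window begins; nothing is shuffled
    print("train:", train_idx, "test:", test_idx)
```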
[00:23:25] Unknown:
Yeah. I guess just to add a little more, like, it's just a little more than just expanding EvalML to accept time series as a problem type. It's just that time series is so different from traditional, like, classification or just regression problems that, like, we were thinking about how we might potentially need to change our user interface or our user experience to deal with such problems. And another thing that Angela alluded to was how we build our features for time series problems. Right? Like, it's not as simple as just taking everything in and then just running it through a model. We kind of need to go through the process of maybe building windows or building time gaps or creating time buckets, or stuff like that. There's many, many different varieties of things you can do to time series problems, and I think it's still very much a work in progress for us to figure that out, and how we can potentially leverage Feature Tools to do some of that for us. So I think right now, that's like a big thing for us, and we're still trying to figure it out. You've mentioned a few of these other libraries and components that you have to help support the EvalML
[00:24:28] Unknown:
project in terms of the other stages of the overall process. I'm wondering if you can just give a bit of an overview about the capabilities that those tools provide and some of the benefits that you've been able to lean on of having this componentized approach to the overall end to end process of building and deploying or building and training and using these machine learning models?
[00:24:49] Unknown:
Sure. So within EvalML, I think the 2 biggest ones that we lean on and are trying to integrate with are Woodwork, which we use for our data typing. I think it builds on top of pandas, which is a very popular library for holding and storing data. So this was an iterative process where, originally, we just worked with pandas and NumPy, but we realized over time that that wasn't enough, because with pandas, you can store a lot of different types of data under the same, like, physical data type, but it was very difficult for us to understand, like, how to parse that data and use it for modeling. So to give you an example, like, let's say you might have string data, and in some cases, that string data might refer to a categorical type or, like, different categories.
But in other cases, it might be, like, natural language, just text. And so how can you get a model to understand how exactly to use that data? I think we ran into limitations with pandas there, and we decided, well, if we build our own library, which enables us to differentiate that through what are called logical types, which refer to how the data should be used, then we're able to use that in EvalML and handle those cases separately, even if they might both be string physical types at the end of the day. So that's 1 library that we integrate closely with, and the other 1 is Feature Tools, which we use for automated feature engineering. Jeremy, I don't know if you wanna talk about your ongoing work with that. So in the past,
[00:26:28] Unknown:
I guess, like, our integration with feature tools more so, we recommended users to run feature engineering as a step before running AutoML, and I guess that kind of required users to have more expertise or knowledge on the whole feature engineering process in general. But a current project of mine right now, it's kind of related to what I talked about earlier about the AutoML algorithm, is trying to add feature engineering as a step into our AutoML algorithm, and also utilize or leverage more feature selection to build better models.
So I guess that's 1 thing I've been working on recently, and it's still TBD when we're gonna be done with that, but that is, like, my main focus right now. And I guess to add a little more, like, these are all Alteryx open source tools. So the only other 1 that Angela mentioned was Compose. Right? So that's, like, even earlier in the, I guess, like, the data science life cycle or pipeline, where label creation is at the very beginning. So I think these are the couple tools that Alteryx open source has, but within EvalML, we still try to integrate as many useful machine learning libraries out there as we can, and, like, a big 1 that we integrate closely with is scikit-learn, and not only do we draw upon their capabilities for some of the machine learning algorithms, but we try to actively support, kind of, like, the scikit-learn API, and that kind of makes it easier for new users to come to EvalML and use our components or our pipelines and stuff like that, so I think that was a main focus of ours at the very beginning of the project. I guess some other libraries include CatBoost or XGBoost or more specific
[00:28:02] Unknown:
machine learning algorithms out there. Going back to the overall search space problem and finding the optimal stopping point in particular, I'm wondering what you have found to be some of the useful heuristics for understanding when you have either reached a sort of local maximum for the problem, and this is a good point to stop the search and say this is the answer,
[00:28:26] Unknown:
or knowing when to sort of prune certain branches of the overall search space to say early on, this is not going to be a fruitful pursuit. We're going to, you know, drop this model or this set of parameters because it is leading to a local minimum. So I guess to begin with, like, what we have within EvalML isn't that sophisticated yet, so a lot of what we have kind of draws upon user configuration to end our search. So an example of that would be a certain amount of time has passed, or if a user expects a certain amount of batches out of our AutoML search. I think that was what we started with, so just basically, yeah, user defined endpoints.
We've kind of expanded on that by adding early stopping. I guess this is, like, fairly simple early stopping, where if the scores stop improving to a certain degree, we'll end the search. So we have that early stopping parameter and, like, an associated tolerance parameter that helps configure that. So I think that's what we have right now. And I guess, like, more of what you alluded to is, like, what we want to move into, and I guess that will come with the iterations of our AutoML algorithm. So right now I'm working on this algorithm that takes in, I guess, like, more machine learning heuristics, but is less about stopping.
But as we move forward, I'm sure that's something that we'll try to figure out eventually.
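The user-configured endpoints and simple early stopping described here map onto AutoMLSearch constructor arguments. A hedged sketch, with parameter names (max_time, max_batches, patience, tolerance) taken from the EvalML docs of this period; they may have shifted in later releases:

```python
from evalml.automl import AutoMLSearch

automl = AutoMLSearch(
    X_train=X_train, y_train=y_train, problem_type="binary",  # data as before
    max_time=600,     # stop after roughly 10 minutes of searching
    max_batches=5,    # or after 5 batches of pipelines, whichever comes first
    patience=3,       # simple early stopping: 3 iterations without improvement
    tolerance=0.01,   # where "improvement" means at least a 1% better score
)
automl.search()
```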
[00:29:45] Unknown:
In terms of the problem of dealing with AutoML and working with all of the different types of data that people are going to bring to the problem, what are some of the challenges or edge cases that you have run into yourself and that you have seen users of the library run up against and some of the ways that you either try to smooth over those edges or provide sort of early warning signs to the users that they're about to, you know, end up in a place that they don't want to be? Yeah. I think that's exactly what data checks tries to warn users about.
[00:30:20] Unknown:
And so, basically, with data checks, again, the user provides the data to the data checks, and we have a whole bunch of different heuristics about, like, you have too many NaN values, or your distribution of data. Like, you might have class imbalance, and that might not perform well. I think that's kind of how we try to have users be aware of issues before they run AutoML search. We have also considered and talked about creating data check actions. So first, sure, you might have these data checks that warn you or tell you of, like, errors that might happen, but taking that a step further and saying, well, how can we, as an AutoML library, automatically update and transform your data so that those issues that you ran into are no longer issues?
So, for example, with the simplest 1, like, always being NaN, having NaN values, how can we automatically impute your features such that
[00:31:22] Unknown:
to the end algorithm, you don't see those NaN values and you don't run into errors. To add a little onto that, even with something like NaN, there's, like, all sorts of different types of NaNs out there, and we always have to try to figure out, like, just NaN in general. I think within the Python ecosystem, pandas uses their own NaN, there's a NumPy NaN, so that's something that came up, I guess, like, as edge cases for us, and, like, even more so than that, we needed to figure out, like, how to handle NaNs for all these different data types as well, right? Like, let's just say even for string data types, we had to figure out how should we handle NaNs for what we call natural language or text data, or how should we handle NaNs for categorical data, and there's some intricacies on not only how we should handle them so that models run, but so that models run better as well. So I think for us, it's definitely an iterative process where users come to us with, like, situations that we haven't seen before, and we're always constantly trying to improve in that regard. Yeah. I think we see the
[00:32:25] Unknown:
issues that are in the datasets that users give us, and then we take that back to the drawing board and we say, alright, we see that this user ran into this error because their data was malformed in some way. How can we generalize that more so that next time users have that kind of issue, we can warn them beforehand with a data check.
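A hedged sketch of running those checks by hand before searching, assuming the DefaultDataChecks helper and its validate() method as documented around this time (the exact shape of the returned warnings has changed between versions):

```python
from evalml.data_checks import DefaultDataChecks

# Assumed constructor arguments: the problem type and the objective being optimized.
checks = DefaultDataChecks(problem_type="binary", objective="log loss binary")
results = checks.validate(X_train, y_train)

# The results flag issues such as highly null columns, class imbalance, or ID-like
# columns so they can be addressed before AutoML search is run.
print(results)
```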
[00:32:48] Unknown:
In terms of the capabilities of the EvalML project, what are some of the either overlooked or underutilized features that you think are worth calling out?
[00:32:59] Unknown:
I know I talk a lot about the data checks, but, well, I guess I care a lot about the data checks, and I think that is something that I really wish was used more. We've played around with having the data checks run as part of the AutoML search process. Because right now, the way that it works is users have to go out of their way to call these data checks, and we provide a default set of data checks that they can use, which, like, will cover some general cases that we've seen. But it still requires a user to manually do that, and I think that's something that could help users with a lot of the issues that might come up later in the modeling process that they could have avoided if they
[00:33:48] Unknown:
had, like, basically sanity checked their data. Yeah. No. I definitely agree with that. I don't know if I can give a specific number, but I guess, like, I would say maybe, like, 80% of your, I don't know, like, data science journey or, like, data science process is done before any modeling, or before even thinking of modeling, and a lot of it is, like Angela was saying, about cleaning your data or augmenting your data such that not only does it actually work with machine learning models, but it also performs better as well. So, yeah, I definitely agree that data checks is something people should leverage more, but I guess, like, another point to call out is some of our model understanding tools, so, shout out to what we did for partial dependence earlier, and I think we're in the works of adding more model understanding tools. But I guess, like, in general, it's just kind of overlooked, because people kind of see that, like, oh, once you've got these models and they perform well, they kind of think it's the end of the process, but there's so much more out there, right? Like, you can take these models and kind of see how your earlier features related to your end result through partial dependence, or see how important these features are, and you can, I guess, drive more analysis than just pure prediction out there, right? We don't actively support image data, but an example of that would be, like, taking, like, a neural network and then seeing what specific features are associated with what, then you can go back to, like, the beginning of the process. Let's just say if you're working with breast cancer data or something like that, then you can focus on certain things without even utilizing machine learning at the end of the day. Yeah. And that goes to your point about the explainability
[00:35:18] Unknown:
problem and how that is an important aspect of the overall process and the importance that it has. And then your point of not having support for image data right now, I guess, brings up the question of sort of what are the types of datasets and the types of sort of industry or problem domains that EvalML is currently focused on supporting and some of the upcoming capabilities that you're looking to add to it? So at the end of the day, what EvalML,
[00:36:05] Unknown:
I guess, expects is tabular data. And I don't think that's limited to any specific industry per se. As long as you can give us data in a tabular form, then, hopefully, we'll be able to run EvalML with that. I will say that EvalML currently powers Alteryx Machine Learning, which is, like, an enterprise solution. And, like, with Designer and Alteryx, like, that's already being used in a lot of different industries. So, again, I don't think it should be limited in terms of, like, where it should be used. I think anywhere where
[00:36:39] Unknown:
machine learning applies, then hopefully EvalML applies as well. And in terms of the uses of the project and some of the experiences that you've had building it and working with the community, what are some of the most interesting or unexpected or innovative ways that you've seen it used? So EvalML is still fairly new, especially in the open source world. I don't think it's gotten a lot of traction yet,
[00:37:03] Unknown:
but we did come across a blog post by someone. I think his name was Mike Casales, who had wanted to use EvalML as part of the process of confirming or denying whether or not South Florida real estate prices were taking a hit from sea level rise or urban climate risks. So he asked the question of, like, is the difference in property valuation versus similar properties nearby high? And if so, can it be explained by a metric called the flood risk factor? And he actually used not only EvalML, but also Feature Tools. So he used Feature Tools to clean up some data, and then he used EvalML to generate models. And then later in the pipeline, he used SHAP to, like, better understand the model that was created by EvalML.
And so there, I think it was a pretty cool use case, especially in that, I mean, this might seem, like, counterintuitive in some sense, but EvalML got maybe, like, 1 or 2 sentences, which was, "and then I used the EvalML package to, like, generate a model," period. But I thought that was actually pretty indicative of how I imagine EvalML to be used, because most of the blog post was talking about all the different ways that he needed to clean up the data. And once he got the data into the 2 d tabular form, he ran EvalML's search method, got the model, and then he was able to do a whole lot of model understanding and, like, more insight driven work using SHAP and other products.
So the takeaway, I think, he got from doing that was that, yeah, the flood factor, flood factor info, I think it was. I don't remember the exact feature, but 1 of the features that he was looking at was 1 of the most important features that, like, popped up at the top. So it was pretty important in predicting the differences in real estate valuation.
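For readers who want that same kind of post-hoc inspection without leaving EvalML, here is a loose sketch using the model understanding helpers mentioned earlier in the conversation; the function names are assumed from the evalml.model_understanding module of this era, and the feature name is purely hypothetical:

```python
from evalml.model_understanding import (
    calculate_permutation_importance,
    partial_dependence,
)

# "best" is a fitted pipeline, e.g. the best_pipeline from the earlier AutoML sketch.
importance = calculate_permutation_importance(best, X_test, y_test, "auc")
print(importance.head())        # which inputs matter most under the chosen objective

# How the prediction moves as one (hypothetical) feature varies.
pd_results = partial_dependence(best, X_test, "flood_factor")
print(pd_results.head())
```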
[00:39:00] Unknown:
In terms of your own experience of working on the evalML project and using it for your own problems and helping the community to take advantage of its capabilities, what are some of the most interesting or unexpected or challenging lessons that you've learned in the process?
[00:39:15] Unknown:
Well, I think the biggest 1 for me is how we should serve users from all sorts of backgrounds. Like you said, we kind of targeted, I guess, people with more data science or machine learning knowledge, and tried to give as much information or as much customizability to our users as we could, right? But that definitely created a lot of confusion for people that don't necessarily have that kind of knowledge or that kind of expertise, so I guess there's 2 routes you could go with that. 1 is to try to upskill your users by providing a lot of education on that, or the latter, which is kind of what we took, was to try to give more opinionated advice on certain things. So, like, for example, for the data checks, we started out with just having data checks and basically telling users where they've gone wrong or where they could have improved, but we're trying to move to having data check actions, where, basically, we're saying, just given your data, we'll fix it up, and we'll pass that into AutoML.
And I think the blog post we talked about, like, the real estate blog post, was, like, a very good example of what we want to encompass within EvalML. As I mentioned, the guy that wrote the blog post used SHAP, and that's, like, something we've actually added to EvalML as part of our model understanding tools. So I guess it's more and more trying to figure out what our users need in the whole data science or machine learning process
[00:40:36] Unknown:
and trying to address those needs within our library as well. So I guess it's, like, always a continuous learning process for us as well. I guess for me, the challenge, the ongoing challenge, and this is maybe, like, less about EvalML and more, again, personal: I didn't come from, like, a machine learning background. It was never really a focus for me. I came in as, I guess, what the EvalML user should be, which is someone who wanted to use machine learning, but wasn't very well versed in it. So I always take, like, that perspective when building the library, and I'd like to say that, like, I offer a different, maybe, perspective from other people on the team because of that. But that also means, as someone who's helping build and maintain the library, with some of the machine learning concepts, trying to boil down something that might seem foreign to someone who has, like, no understanding of machine learning, how do you boil that down into something that they understand and therefore can use, is always a challenge that I think everyone on the team is trying to face.
[00:41:40] Unknown:
For people who are interested in exploring EvalML and the overall space of AutoML, what are some of the cases where either AutoML as a general approach or EvalML
[00:41:53] Unknown:
specifically are the wrong choice? I think it's a wrong choice when machine learning is a wrong choice. I know machine learning is, like, a buzzword, and everyone wants to use machine learning. I've definitely been in companies where they've told me, like, I don't know how I want to use machine learning, but we want to use machine learning. Here's some data. Go. And I was like, well, I don't know what problem you want me to solve. So I think that's pretty key, like, understanding exactly what problem you're trying to solve, like, what objective it is that you're trying to solve for, is very important. And another thing, I guess, is if you don't have the data that, like, correctly represents or accurately represents the system that you're trying to develop, then I'm not sure that machine learning or EvalML can help you there, because, and it's a little cliche as well, but it's, like, if your data is garbage, then it doesn't matter what the machine learning model predicts. Right? Like, that will also be garbage too, or it'll give you insights about garbage. Those are 2 things. I guess, like, on EvalML specifically, there's a lot we're trying to tackle, but there's also a lot we're not trying to tackle, especially with, like, deep learning or architecture search.
Jeremy mentioned before, like, we don't handle image, video, audio, or higher dimensional data. I don't think we have plans to right now. Maybe I'll bite my words in a few years, but
[00:43:23] Unknown:
yeah. Well, for anybody who wants to get in touch with either of you and follow along with the work that you're doing, I'll have you each add your preferred contact information to the show notes. And so with that, I'll move us into the picks. This week, I'm going to choose a band that I just came across on Spotify that is interesting and amusing and, you know, fun to listen to, called Gloryhammer. It's a sort of power metal band that is a throwback to the sort of metal ballads of the eighties, and each album is sort of its own epic story played out as metal.
It's just hilarious and entertaining, so definitely worth taking a look at if you're looking for something to keep you occupied for a little while. So with that, I'll pass it to you, Angela. Do you have any picks this week? My pick has to be a restaurant that I've been craving in Cambridge,
[00:44:12] Unknown:
and that's Sarma. It's a Mediterranean restaurant, and I think it's really, really good. Alright. And Jeremy, how about you? For me, I've been reading this book called
[00:44:21] Unknown:
Crucial Conversations. I know it's not as cool as your topics, but this book called Crucial Conversations by Stephen Covey, and I feel like it tries to provide a framework for people to tackle exactly what it says, like, crucial conversations, which are difficult conversations where people have different opinions or have different interests, and I think it applies to, you know, not only just, like, the workplace, but, like, you know, just individual relationships or stuff like that. So I've been enjoying the book a lot recently.
[00:44:50] Unknown:
Alright. Well, thank you both very much for taking the time today to join me and share the work that you've been doing on EvalML and some of the related tooling. It's definitely a very interesting problem and an interesting project. So I appreciate the time and energy you've both put into that, and I hope you enjoy the rest of your day. Thank you. Thank you for setting this up. Yeah. Thank you for having us, Tobias. Thank you for listening. Don't forget to check out our other show, the Data Engineering Podcast at dataengineeringpodcast.com for the latest on modern data management.
And visit the site at pythonpodcast.com to subscribe to the show, sign up for the mailing list, and read the show notes. And if you've learned something or tried out a project from the show, then tell us about it. Email hosts@podcastinit.com with your story. To help other people find the show, please leave a review on iTunes and tell your friends and coworkers.
Introduction and Sponsor Message
Interview with Angela Lin and Jeremy Shih
Overview of EvalML and AutoML
Determining the Search Space in AutoML
Target Audience for EvalML
AutoML Tools and Ecosystem
Software Design and Architecture of EvalML
Evolution and Assumptions of EvalML
Challenges in Time Series Data
Supporting Tools and Libraries
Heuristics for Optimal Stopping in AutoML
Handling Different Types of Data
Underutilized Features of EvalML
Supported Data Types and Domains
Interesting Use Cases of EvalML
Lessons Learned from Building EvalML
When Not to Use AutoML or EvalML
Contact Information and Picks