Summary
Housing is something that we all have experience with, but many don’t understand the complexities of the market. This week Travis Jungroth talks about how HouseCanary uses data to make the business of real estate more transparent.
Brief Introduction
- Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
- I would like to thank everyone who has donated to the show. Your contributions help us make the show sustainable.
- When you’re ready to launch your next project you’ll need somewhere to deploy it. Check out Linode at linode.com/podcastinit and get a $20 credit to try out their fast and reliable Linux virtual servers for running your awesome app.
- You’ll want to make sure that your users don’t have to put up with bugs, so you should use Rollbar for tracking and aggregating your application errors to find and fix the bugs in your application before your users notice they exist. Use the link rollbar.com/podcastinit to get 90 days and 300,000 errors for free on their bootstrap plan.
- Visit our site to subscribe to our show, sign up for our newsletter, read the show notes, and get in touch.
- To help other people find the show you can leave a review on iTunes or Google Play Music, and tell your friends and co-workers.
- Join our community! Visit discourse.pythonpodcast.com for your opportunity to find out about upcoming guests, suggest questions, and propose show ideas.
- Your host as usual is Tobias Macey and today I’m interviewing Travis Jungroth about HouseCanary, a company that is using Python and machine learning to help you make real estate decisions.
Interview with Travis Jungroth
- Introductions
- How did you get introduced to Python?
- What is HouseCanary and what problem is it trying to solve?
- Who are your customers?
- Is it possible to get data and predictions at the neighborhood level for individual homebuyers to use in their purchasing decisions?
- What do you use for your data sources and how do you validate their accuracy?
- What are some of the sources of bias that are present in your data and what strategies are you using to account for them?
- Can you describe where Python is leveraged in your environment?
- What are some of the biggest software design and architecture challenges that you are facing while you continue to grow?
- What are the areas where Python isn’t the right choice and which languages are used in its place?
- What are the biggest predictors of future value for residential real estate?
- Can your system be used to identify risks associated with the housing market, similar to those seen in the bubble that triggered the 2008 economic failure?
- What are some of the most interesting details that you have discovered about real estate and housing markets while working with HouseCanary?
Keep In Touch
Picks
- Tobias
- Travis
Links
The intro and outro music is from Requiem for a Fish by The Freak Fandango Orchestra / CC BY-SA
Hello, and welcome to Podcast.__init__, the podcast about Python and the people who make it great. I would like to thank everyone who has donated to the show. Your contributions help us make the show sustainable. When you're ready to launch your next project, you'll need somewhere to deploy it, so you should check out Linode at linode.com/podcastinit and get a $20 credit to try out their fast and reliable Linux virtual servers for running your awesome app. You'll also want to make sure that your users don't have to put up with bugs, so you should use Rollbar for tracking and aggregating your application errors to find and fix those bugs in your application before your users notice they exist. Use the link rollbar.com/podcastinit to get 90 days and 300,000 errors tracked for free on their bootstrap plan. You can visit our site to subscribe to our show, sign up for our newsletter, read the show notes, and get in touch. And to help other people find the show, you can leave a review on iTunes or Google Play Music, and tell your friends and coworkers.
You can also join the community at discourse.pythonpodcast.com to find out about upcoming guests, suggest questions, propose show ideas, and communicate with other listeners. Your host as usual is Tobias Macey. And today, I'm interviewing Travis Jungroth about HouseCanary, a company that is using Python and machine learning to help you make real estate decisions. So, Travis, could you introduce yourself?
[00:01:18] Unknown:
Yeah. I'm a software engineer at HouseCanary. I'm a back end software engineer, so I'm working on the API for one of our web products.
[00:01:27] Unknown:
And how did you first get introduced to Python?
[00:01:30] Unknown:
Well, I was working in PHP, making WordPress websites actually. And I always heard about Python, and it seemed like this great, like, next language for me. And my first start was in Learn X in Y Minutes, a website that just gives really fast introductions to languages. And I started there and just kept going.
[00:01:49] Unknown:
It's interesting hearing all the different ways that people get introduced to Python, whether it's their first language or they come by it by way of a different language and the different learning materials that people take advantage of because there are just so many different avenues into software development.
[00:02:03] Unknown:
Yeah. I mean, I had a super weird path. I didn't go to college and get a CS degree, which would be great for people who did, but I kinda came in from a second career and learned from books. The first book I did was Al Sweigart's book, Hacking Secret Ciphers with Python, I think, meant for kids or adults. I was already doing software, but I still liked it a lot.
[00:02:25] Unknown:
Yeah. He's, he's an interesting guy. He's definitely got some interesting content out there. We actually interviewed him. I don't remember what episode number it was, but it was a fun conversation.
[00:02:35] Unknown:
Yeah. He's super nice. I've actually met him. Well, I mean, last I saw him was at PyBay. He gave a really great talk there about Automate the Boring Stuff.
[00:02:43] Unknown:
So can you start by telling us what HouseCanary is and what problems it's trying to solve?
[00:02:49] Unknown:
Yeah. At our core, we're a real estate data analytics company. So we have all sorts of information for residential real estate in the United States: valuations and forecasting, how much is the home worth, how much it's going to be worth, and property details for all of these homes as well. We serve it up in a few different products. So we have a product called Pro, which would be for looking over a large area, say the Bay Area, or even down to, like, the multi block level. What's going on in that neighborhood? Is it going up? Is it going down? Is it risky? We do the same thing for individual homes, and then the product I work on is called Appraiser, and it's for a professional home appraiser. It's a tool for them to do their appraisals faster and more accurately. So a lot of data spread over different ways, and then an API for the data on top of all of that.
[00:03:36] Unknown:
Given the sort of range of products, what are the target customers for each of those products?
[00:03:42] Unknown:
Yeah. You know, they're each aimed at a pretty different segment. So the value report is often used by, say, a company that owns a large number of homes, and they want information on each of those homes, perhaps every month. Or for a consumer, if you've narrowed down the homes that you're considering buying to a few, and you wanna run a value report on each of those, look at the nearby comparable sales. So that works well for a prospective home buyer also. Investors, lenders, people looking at sort of making those real estate decisions. Then Pro, over a larger area, could be used by developers and, again, investors. When you're investing, you're not so tied to a specific neighborhood, you're looking at which neighborhood to invest in. And then Appraiser is used exclusively by professional home appraisers.
So that's someone who's licensed and they're using the tool as part of the whole real estate transaction, like, when you get a mortgage or a home is sold, they're gonna use our tool as well.
[00:04:38] Unknown:
And what are some of the kinds of information that the appraiser tool provides to them that makes it easier for them to increase the accuracy of what they would typically be able to do without it?
[00:04:49] Unknown:
Yeah. So there's a few hundred fields on an appraisal form, and we have data that we can prepopulate for most of those, and it'll come from multiple sources as well. And we'll show what the piece of data is and what source it comes from. Because relatively often, they can be in conflict, and we'll leave it up to the appraiser to make the ultimate decision. So we might say that one source says the house is 2,000 square feet, and another one says it's 2,100 square feet. The appraiser will choose between those and can also use it as a reference. But there's a huge amount of information about the neighborhood, all the property details you can imagine, how many fireplaces does it have, what's the kitchen like, what's the quality of all those sorts of things.
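Editor's sketch: the idea of showing each value alongside its source, and surfacing conflicts for the appraiser to resolve, could look roughly like the following. This is a minimal illustration, not HouseCanary's code; the field names and source labels (`county_record`, `listing_service`) are assumptions made up for the example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class SourcedValue:
    field: str      # appraisal form field name (hypothetical)
    value: object   # value reported by the source
    source: str     # label for where the value came from (hypothetical)


def group_by_field(observations):
    """Collect every reported value for each field, keeping its source."""
    grouped = defaultdict(list)
    for obs in observations:
        grouped[obs.field].append(obs)
    return dict(grouped)


def conflicting_fields(grouped):
    """Return only the fields whose sources disagree with each other."""
    return {
        field: candidates
        for field, candidates in grouped.items()
        if len({c.value for c in candidates}) > 1
    }


observations = [
    SourcedValue("square_feet", 2000, "county_record"),
    SourcedValue("square_feet", 2100, "listing_service"),
    SourcedValue("fireplaces", 1, "county_record"),
    SourcedValue("fireplaces", 1, "listing_service"),
]

conflicts = conflicting_fields(group_by_field(observations))
for field, candidates in conflicts.items():
    # Show every candidate with its source and leave the choice to the human.
    options = ", ".join(f"{c.value} ({c.source})" for c in candidates)
    print(f"{field}: {options}")
```

The sketch deliberately does not pick a winner: it only lists the competing values, which matches the description of leaving the final decision to the appraiser.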
[00:05:33] Unknown:
And where do you get all the data to feed into that tool and the other ones that you provide? And how do you ensure that the data that you are providing is accurate?
[00:05:42] Unknown:
Well, it's a huge task. You know, we have a data engineering team, and that's really most of what they do. It's just getting data to the product teams and to the research team. So we pay for it from other aggregators. For example, if we're getting information at the county level, we don't go to that specific county. Sometimes they're very difficult to work with directly, so we go with aggregators for that. We also have the listing services. When a real estate agent lists a home for sale, there are services for that. Some of those we even work with directly. So really just a combination of public and private, whether we're going somewhere directly or we're working with a different data provider.
[00:06:20] Unknown:
And when an appraiser is using your tool on-site and they correct any of the information that they're being presented with, does that get fed back into the tool and potentially into some of the, predictive algorithms that are being run against that data?
[00:06:33] Unknown:
There is a whole feedback loop when you do an appraisal, though. It's a little bit tricky because the data actually kinda belongs to that appraiser, so to speak. But a feature somewhere on the road map is, say, referencing past appraisals. And we are storing that information. We largely have to for regulatory reasons, but we'll continue to use the past appraisals in better and better ways.
[00:06:56] Unknown:
And is there any way or any plan for a point where the appraisers can sign off on releasing the data that they enter if they wanted to share it back to your tool to improve the effectiveness for their use later on down the road?
[00:07:18] Unknown:
You know, I'm not sure about that. Like, exactly how the details would work. It's very complicated because a lender actually initiates a request for an appraisal with what's called an AMC, and this is an organization that's between the lender and the appraiser. So there's actually a lot of different parties involved in any given appraisal. We do have all that data. We just can't, say, suggest from one appraiser to another, oh, this appraiser said it was this, because part of the value of the appraisers is how they are independent of each other.
[00:07:42] Unknown:
Yeah. Definitely seems like a potentially rather complicated industry to be dealing with because of the number of regulations, particularly in the light of the, recent economic impact that the housing market has had and also just the general importance of the housing market from an economic perspective, you know, at a national level?
[00:08:02] Unknown:
Well, the AMCs I was talking about, being between the appraiser and the lenders, that was born totally out of that crisis. Because you can see there's a really obvious conflict of interest here when you have a lender that, in a lot of ways, wants the loan to go through, and they are hiring the person to tell you how much the house is worth. And it's pretty obvious that there's a little bit of a problem there. So at least putting this one organization in between that contracts it out has hopefully made it more objective. And then we think that the work that we're doing will take it even the next step further, because we can show all the data behind how we came to that final valuation on that home.
[00:08:43] Unknown:
And does your tool provide any sort of, I guess, alerting if the value entered by the appraiser is outside of a certain standard deviation of what the predicted price would be?
[00:08:55] Unknown:
Yeah. We have a few different ways of working with that. We ultimately leave the decisions to the appraiser. Well, in a lot of cases, most of what we do is, say, suggest. You know, they're the professional, they're the human. There is a certain level of validation that we do all the time, you know, field validation, kind of just sanity checks. This field should always be less than that, should always be greater than this. You know, your year probably should be a 4 digit number, not more than a 1,000 year old home. And a lot of those checks go a long way because humans just commonly make those little mistakes. We also have a tool we call audit where we're comparing fields from one to another. So if you said that one home had a better kitchen than the other, the home with a better kitchen should be worth more, not less. So hundreds of checks, done just like that.
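Editor's sketch: the two kinds of checks described here, single-form sanity checks and cross-home "audit" comparisons, might be expressed along these lines. The field names (`year_built`, `bedrooms`, `total_rooms`, `kitchen_quality`, `kitchen_value`) and the specific rules are assumptions chosen for illustration, not the actual checks in the product.

```python
from datetime import date


def field_errors(form, current_year=None):
    """Run simple sanity checks on a single appraisal form (hypothetical fields)."""
    current_year = current_year or date.today().year
    errors = []

    year_built = form.get("year_built")
    if year_built is not None:
        # Not in the future and not more than 1,000 years old.
        if not (current_year - 1000 <= year_built <= current_year):
            errors.append("year_built is outside a plausible range")

    bedrooms = form.get("bedrooms")
    total_rooms = form.get("total_rooms")
    if bedrooms is not None and total_rooms is not None and bedrooms > total_rooms:
        # One field should never exceed another.
        errors.append("bedrooms exceeds total_rooms")

    return errors


def audit_pair(home_a, home_b, quality_key, value_key):
    """Return True when the better-rated feature was given the lower value.

    Assumes a higher quality number means a better feature.
    """
    quality_diff = home_a[quality_key] - home_b[quality_key]
    value_diff = home_a[value_key] - home_b[value_key]
    return quality_diff * value_diff < 0


typo_form = {"year_built": 19850, "bedrooms": 3, "total_rooms": 7}
print(field_errors(typo_form, current_year=2017))

better_kitchen = {"kitchen_quality": 4, "kitchen_value": 10_000}
worse_kitchen = {"kitchen_quality": 2, "kitchen_value": 15_000}
print(audit_pair(better_kitchen, worse_kitchen, "kitchen_quality", "kitchen_value"))
```

Both functions only report problems; in keeping with the interview, what to do about a flagged value would still be the appraiser's call.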
[00:09:47] Unknown:
And with all of the data that you're dealing with, and particularly with volumes of data, one of the insidious problems is that that data can potentially contain different sources of bias that are not necessarily obvious at first blush. And I'm wondering what strategies you use to account for that and any mitigation factors you've put in place?
[00:10:07] Unknown:
Well, a big part of that is just the more different sources that we have, the more we can compare them against each other, looking for a sort of consensus. We also, with any piece of information, have what we consider the authority on that piece of information. So if it, say, comes to the record about the parcel, we're going to trust the information coming from the county versus the information coming from a real estate agent. And we do see some of those biases sometimes. For example, information that's coming from real estate agents may tend to be more positive. For example, gross living area is a very common, important number on home value. And it's square footage, how much livable square footage. But it gets kinda gray sometimes. You know, do you count the basement? Do you count this? And we have an idea of who tends to be more accurate for those pieces of information.
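Editor's sketch: one simple way to encode "the authority on that piece of information" is a per-field precedence list of sources. The only ordering taken from the interview is county over real estate agent for parcel records; the field name, the source labels, and everything else below are assumptions for illustration.

```python
# Per-field source precedence, most trusted first (hypothetical labels).
AUTHORITY = {
    "lot_size_sqft": ["county_record", "real_estate_agent"],
}


def resolve(field, reported):
    """Pick the value from the most trusted source that reported one.

    `reported` maps a source label to the value it gave for `field`.
    Returns (value, source), or (None, None) if no ranked source reported.
    """
    for source in AUTHORITY.get(field, []):
        if source in reported:
            return reported[source], source
    return None, None


reported = {"real_estate_agent": 5200, "county_record": 5000}
print(resolve("lot_size_sqft", reported))
```

A fixed ranking like this is the simplest possible form; the consensus approach mentioned in the same answer, comparing many sources against each other, would need more than a lookup table.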
[00:10:55] Unknown:
What are some of the common areas where the different tools intersect, and what are some of the architectural and engineering challenges associated with making sure that the disparate tools are able to share as much as possible without being sort of bespoke, individual applications that are difficult to integrate.
[00:11:14] Unknown:
Yeah. We do use a a service oriented architecture in in a lot of ways. So the the data engineering team that we have handles things coming in from all those external sources, feeding it to the the research team, and then also feeding it to the product teams. We have a central repository of real estate information that is used across teams. So that's actually a big part of what I did over the last year was hooking up appraiser to use that, central CMS of information that just gets used across everybody.
[00:11:45] Unknown:
And what have you found to be the biggest predictors of future value for residential real estate?
[00:11:46] Unknown:
Well, the funny thing is that there are the big national trends, but it's a lot like the weather, especially, you know, our office is in San Francisco, and it's all about microclimates. So what's going on in your city and what's going on in your neighborhood has a much bigger effect than what's going on nationally. So we can see things, for example, areas that are very sensitive to oil prices because they're very, like, oil based. If the price of oil drops, you're gonna see a lag in homes dropping, even a little bit later, as it takes time for, say, people to move away or people to lose their jobs, things like that. So there's the big national trends. There's also what's happening in each individual area. Things that are the same mostly are risks that come up and then, say, crash. So if an area becomes insanely unaffordable, unaffordable being how much the average home costs versus the average income, that's a big problem.
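Editor's sketch: the unaffordability measure described here, average home cost versus average income, can be written as a simple ratio. The area names, dollar figures, and the threshold below are invented for illustration only and are not HouseCanary data or cutoffs.

```python
def price_to_income(average_home_price, average_income):
    """Ratio of average home price to average income; higher is less affordable."""
    if average_income <= 0:
        raise ValueError("average_income must be positive")
    return average_home_price / average_income


def flag_unaffordable(areas, threshold):
    """Return the names of areas whose ratio exceeds a chosen threshold.

    `areas` maps an area name to (average_home_price, average_income).
    """
    return sorted(
        name
        for name, (price, income) in areas.items()
        if price_to_income(price, income) > threshold
    )


# Made-up numbers purely for illustration.
areas = {
    "Area A": (900_000, 90_000),
    "Area B": (200_000, 60_000),
}
print(flag_unaffordable(areas, threshold=6))
```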
[00:12:44] Unknown:
And can the data and the different algorithms that you're using to process it be used to identify risks associated with the housing markets so that you could potentially foresee something on the scale of the 2008 economic failure that was caused by the massive...
[00:13:02] Unknown:
That's definitely within the range of what we're dealing with. I mean, we don't have something that intense on the horizon that we're seeing right now. Things are generally, across the nation, pretty positive. Although usually when I'm looking, and a lot of times it's just for fun, I'm looking at things more locally. But that would fall right into it as well, and one of the reasons I feel okay saying that is, of course, we back test everything. So looking back on how we would have reacted with those models, and things like how unaffordable it was, what was going on with loans, and things like that. Of course, it's easy to see in retrospect. There are some of the same things going on right now, so it is a little bit nerve wracking. We've seen affordability come up in general to where it was at the pre-2008 crash levels. So that's a little bit of an issue.
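Editor's sketch: "back testing" in general means replaying history, forecasting each period using only the data available before it, and scoring the forecasts against what actually happened. The walk-forward loop below shows that idea with a deliberately naive placeholder model; the model, the error metric, and the price series are assumptions, not the company's method.

```python
def naive_forecast(history):
    """Placeholder model: assume the most recent change repeats."""
    if len(history) < 2:
        return history[-1]
    return history[-1] + (history[-1] - history[-2])


def backtest(series, forecast, min_history=2):
    """Walk forward through `series` and return the mean absolute percentage error.

    At each step the model sees only values that came before the one it predicts.
    """
    if len(series) <= min_history:
        raise ValueError("series is too short to backtest")
    errors = []
    for t in range(min_history, len(series)):
        predicted = forecast(series[:t])
        actual = series[t]
        errors.append(abs(predicted - actual) / actual)
    return sum(errors) / len(errors)


# Made-up index values: a steady climb followed by a reversal.
steady = [100.0, 102.0, 104.0, 106.0]
reversal = [100.0, 110.0, 100.0]
print(backtest(steady, naive_forecast))
print(backtest(reversal, naive_forecast))
```

The second series shows why the exercise is useful: a model that extrapolates the recent trend looks perfect in a steady market and misses badly at a turning point.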
[00:13:49] Unknown:
Shifting gears a little bit, can you dig a little bit into where Python is used in your software and architectural environment?
[00:13:56] Unknown:
Yeah. I'm really proud to say that we use Python in most of the ways that you can, or at least the ways that are very common. So I work in Django and use Django REST framework a lot, and that's on the API that's powering a web app and an iOS app as well. The other products are done in similar ways. People are also using Flask, especially for things that are more like microservices. So Python for the web back end. We use it for machine learning, especially the exploratory stuff, where you are, say, testing out, you know, data science, you're testing out your hypothesis there. The DevOps team uses it for all the, you know, deployment scripts and things like that. The data engineering team uses it a lot as well.
[00:14:36] Unknown:
And what are some of the most popular tools across the different departments? You mentioned that you use Django, which is probably one of the best known web frameworks around, but what are some of the tools that I guess are a little bit less known but are popular that cut across the different departments?
[00:14:51] Unknown:
Yeah. I'm thinking across teams. I don't wanna say we're siloed, because we work very collaboratively across the teams, but people tend to, you know, make their products with whatever they think is the best fit. So we don't have a bunch of standards or, you know, interesting packages I can think of that we use across teams. Everyone comes in with their own different flavors sometimes, which is a lot of fun. You learn a lot when you get to work with other people. But what I actually can say, and it's not gonna be less well known, would be things like Pandas and NumPy. Your kind of classic data science Python stack is used by quite a few people.
[00:15:24] Unknown:
Yeah. It's pretty remarkable how many different use cases just those two tools can facilitate. In my last position, I found myself pulling in pandas for being able to do in memory aggregations of data that would have just been too difficult and cumbersome to do in the back end data store that I was using. But by using the back end as a sort of intermediate aggregate and then pulling it into memory and doing the more complex computations before feeding it back out to the API, I was able to take advantage of where pandas was strong and then let the back end do some of the heavy lifting.
[00:15:58] Unknown:
Yeah. I need to step up my pandas game. I mean, I wrote a script that I was kinda proud of doing some, like, data analysis, pulling something out of a CSV, and I showed it to a friend of mine. And she's, like, oh, I'll show you how to do this in, like, 10 lines of Pandas. It was it was a lot more than that.
[00:16:12] Unknown:
Yeah. Pandas as a tool has a pretty large surface area of its API. And when I was using it, I probably used, like, a fraction of 1% of its overall capabilities.
[00:16:22] Unknown:
Yeah. It's like Django is. I mean, you can just keep going forever. And a lot of times, I've had it before where I wanna do something for the first time, and it's, like, already built into Django. And if it's not, it's probably in an extension.
[00:16:34] Unknown:
Right. Yeah. That's one of the beauties of the Python ecosystem, that it is so vast that you can generally just kind of do a quick search and say, oh, it's already been done. But it is still fun and entertaining sometimes to purposefully reinvent the wheel just to see how you approach a problem and see how it differs from something that's already available.
[00:16:58] Unknown:
You know, I've definitely done that where, you know, I wanted some addition to one of our products, you know, adding a feature, I would say, you know, a back end feature, and looking at what other people have done and deciding ultimately not to go with that. But if you look at how 3 other people have attacked a problem on GitHub, you start so much further ahead. I like that a lot, even if I don't actually use their code at all.
[00:17:20] Unknown:
Right. And also one thing to be said for reinventing the wheel, or some portion thereof, is that when you do install an external package, then you ultimately own that code even if you don't necessarily fully understand what it's doing. And if you're only using one fraction of its overall capabilities, then it might actually be better if you just replicate that functionality in your own code so that you don't have to then pull in all of those other unknowns.
[00:17:38] Unknown:
Yeah. I mean, the mental overhead is real. I read an interesting article, I can't place it that well, where they identified that actual problem: when a bug came up, because they didn't understand the code they were pulling in 100%, it was actually really easy to blame the bug on that external dependency. And so it's, oh, it's that, when it wasn't that at all. So if you don't trust the code, you probably don't wanna pull it in either.
[00:18:04] Unknown:
Right. So what are some of the biggest software design and architecture challenges that you guys are facing while you continue to grow?
[00:18:11] Unknown:
Scale. I mean, definitely a big problem. You know, a lot of our products are just starting to get traction, but when you are offering a service nationally, you need to have information on all those homes from the very start. So we do a valuation on 100,000,000 homes across the United States every month, which is not a huge number, but each of those homes has, you know, property details on it and property sources. So it starts multiplying itself out very quickly.
[00:18:38] Unknown:
Yeah. And particularly as you start to increase the feature space of each of those homes to be able to niche down and give more detailed information that is potentially more relevant depending on the perspective that you're viewing it at, then that can also have a combinatorial effect in terms of the overall complexity.
[00:18:56] Unknown:
Yeah. Especially going from one to another, trying to reference between the homes, for example, finding comparable sales. So these are sales of homes in the area that are comparable, you know, similar number of bedrooms, bathrooms, similar size, and things like that. Doing a comp search is definitely expensive. It's definitely difficult.
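Editor's sketch: a brute-force comp search shows why this gets expensive, since every subject home has to be compared against every candidate sale. The similarity rules, thresholds, and dictionary keys below are assumptions for illustration; a production system would presumably use spatial indexing rather than a linear scan.

```python
import math


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in miles."""
    earth_radius_miles = 3958.8
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * earth_radius_miles * math.asin(math.sqrt(a))


def find_comps(subject, sales, max_miles=1.0, max_sqft_diff=0.2, max_room_diff=1):
    """Return nearby, similar sales sorted by distance (hypothetical criteria)."""
    comps = []
    for sale in sales:  # linear scan: cost grows with every sale considered
        distance = haversine_miles(subject["lat"], subject["lon"], sale["lat"], sale["lon"])
        if distance > max_miles:
            continue
        if abs(sale["bedrooms"] - subject["bedrooms"]) > max_room_diff:
            continue
        if abs(sale["bathrooms"] - subject["bathrooms"]) > max_room_diff:
            continue
        if abs(sale["sqft"] - subject["sqft"]) > max_sqft_diff * subject["sqft"]:
            continue
        comps.append((distance, sale))
    comps.sort(key=lambda pair: pair[0])
    return [sale for _, sale in comps]


subject = {"lat": 37.770, "lon": -122.420, "bedrooms": 3, "bathrooms": 2, "sqft": 1500}
sales = [
    {"id": "near_similar", "lat": 37.771, "lon": -122.421, "bedrooms": 3, "bathrooms": 2, "sqft": 1550},
    {"id": "near_too_big", "lat": 37.771, "lon": -122.421, "bedrooms": 6, "bathrooms": 4, "sqft": 4000},
    {"id": "far_similar", "lat": 40.710, "lon": -74.000, "bedrooms": 3, "bathrooms": 2, "sqft": 1500},
]
print([sale["id"] for sale in find_comps(subject, sales)])
```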
[00:19:15] Unknown:
And I imagine one of the other difficult things is, because of the fact that there are so many different potential features in the dataset, making sure that you have at least some commonality between all of them, because I'm sure that it's a bit of a sparse matrix where you have some available information along different features for one property whereas for the other one, you're not able to get that information because of the data sources available between different counties or states, things like that?
[00:19:41] Unknown:
Well, there are a few things that work in our favor there. Because of the way that our data tends to work, we tend to have similar data points within similar areas. Now, of course, there's like border problems and things like that, but 2 homes next to each other at least are going to be listed in a lot of the same services. And the other thing is that most of the value of your home is captured by just a few simple attributes. Location, like everybody talks about, how big is the home, how big is the land, and some really simple things are gonna cover most of the value. When people look at homes, they think about a lot of small details, but those aren't nearly as important as the big details and the multiplication that goes along with those.
[00:20:22] Unknown:
So what are some of the areas in your infrastructure where Python ended up not being the right choice? And which languages did you use in its place?
[00:20:29] Unknown:
The biggest place that we could be using Python but we're not is actually in our real core real estate valuation algorithms, and that's done in R. We have a research team. They're actually over in San Antonio, Texas, and that's led by one of our cofounders, Chris Stroud. And I've kinda talked to him about it. I'm like, why don't you guys use Python? And a lot of them are coming from a strong academic statistics background, so they're already very skilled and familiar with R. Another thing that I've heard is that it actually has an even better ecosystem for statistics related things. So when they wanna try some statistics algorithm, it's already done in R 5 different ways. They can grab that and pull that in instantly.
[00:21:12] Unknown:
Yeah. I've definitely gotten the impression. I haven't done much work in R myself, but I've gotten the impression that, like you said, R is sort of the powerhouse when it comes to just doing pure statistics, and that most people end up converting from R to Python when they need to operationalize those algorithms. So they do their exploratory analysis in R, and then when they need to put it into production, they might rewrite it or at least rewrite some portion of it in Python or potentially a different language if Python isn't their particular preference.
[00:21:40] Unknown:
We've considered some of that, you know, going one way or the other. R, for example, being not as good for real time valuations and things like that. Like, you know, a valuation generated on the spot. We can do it once a month, and that works great because real estate moves slow. It's not the stock market. It's very illiquid. Things change hands very slowly. But to get more frequent than that or to be able to do it on demand, we've considered which way to go with that, and Python could be an answer to that.
[00:22:08] Unknown:
And have you done much work with the Arrow library from Wes McKinney, and I forget the other gentleman's name, for being able to interoperate with data frames coming from R and going to Python or vice versa?
[00:22:17] Unknown:
As far as I know, we haven't, but I'm not super involved in that team. So they could be using it in production right now for all I know.
[00:22:24] Unknown:
So what are some of the...
[00:22:30] Unknown:
One that's really interesting to me, just because it's not exactly what you'd expect, is in most cases, 2 homes otherwise being the same, same square footage, same land, if a home has an additional bedroom, we've actually seen a negative correlation with price. And it makes sense when you really think about it because you're essentially just taking the same home and splitting it up into smaller pieces. You know, if you built an additional bedroom and added the square footage, that's gonna increase it a lot. In a more rural situation, having an additional bedroom will hurt the price. In a very high urban density area like San Francisco, it actually has a positive effect on the price just because you can cram more people into one home.
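Editor's sketch: the key phrase in this answer is "otherwise being the same", meaning the bedroom effect only shows up once square footage and land are held constant. One simple way to do that is to compare homes only within groups that share the same size and lot. The data below is invented to show the mechanics and is not a result from HouseCanary; the dictionary keys are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def bedroom_price_effect(homes):
    """Average price change per extra bedroom among homes with identical size and lot.

    Returns None when no two homes can be compared.
    """
    groups = defaultdict(list)
    for home in homes:
        groups[(home["sqft"], home["lot_sqft"])].append(home)

    per_bedroom_diffs = []
    for group in groups.values():
        for a in group:
            for b in group:
                extra_bedrooms = b["bedrooms"] - a["bedrooms"]
                if extra_bedrooms > 0:
                    per_bedroom_diffs.append((b["price"] - a["price"]) / extra_bedrooms)
    return mean(per_bedroom_diffs) if per_bedroom_diffs else None


# Toy numbers invented for illustration only.
homes = [
    {"sqft": 1500, "lot_sqft": 5000, "bedrooms": 3, "price": 300_000},
    {"sqft": 1500, "lot_sqft": 5000, "bedrooms": 4, "price": 290_000},
    {"sqft": 2000, "lot_sqft": 6000, "bedrooms": 3, "price": 400_000},
    {"sqft": 2000, "lot_sqft": 6000, "bedrooms": 4, "price": 394_000},
]
print(bedroom_price_effect(homes))
```

Exact matching on size and lot is a crude form of control; with real data one would more likely bucket similar homes or fit a regression that includes square footage as a feature.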
[00:23:09] Unknown:
Yeah. It's definitely interesting how different geographical regions, in terms of the population density and the sort of general character of the metropolitan area, can have such drastic differences in terms of what the expectations of living situations are. Because for instance, if you go from where I grew up in Vermont to where I'm living now in Boston, the overall general housing prices are drastically different and the sizes available on a price per square foot basis are so much less. And then you go from, for instance, Boston to downtown Manhattan, and it's compounded yet again because just the physical land area is at such a premium because people have been building there for so long that there's nowhere to go except up.
[00:23:58] Unknown:
Well, I mean, I live in downtown San Francisco, and I'm basically in a dorm room. I have about 70 other roommates in just a huge 3 story building, but with my own little private bedroom and otherwise shared spaces. You're never gonna see that in a rural area of the United States. You're gonna see it almost nowhere in the United States, but it's a result of incredibly high real estate prices.
[00:24:16] Unknown:
Yeah. I was reading an article last night while doing some research for this interview about the correlation between politically red versus politically blue states and the general trends in housing prices associated with that. So I'm wondering, I guess, one, what you thought about that piece, and two, any other interesting bits of research that have gone on or been initiated by the folks at HouseCanary?
[00:24:38] Unknown:
Well, I actually did part of the data analysis for that piece. So I'm very proud that you, you know, that you bring it up. And that's been a really cool thing at HouseCanary: you know, I'm like a web back end engineer, but I get to just pick up something like that and do some work on it. So that was a hugely interesting piece. We have more coming down the pipe just like it. What was really interesting to me was looking at the states that have had very high, quick increases in unaffordability since 2008, and 3 out of 4 of those states were oil states. I believe Texas, North Dakota, and another very oil driven state as well. So it's interesting to see a bunch of states move together, and they all have something in common.
[00:25:23] Unknown:
And what are some of the broader goals or more altruistic focuses that HouseCanary is trying to realize in conjunction with the business goals of the company? You know, being able to generate revenue, but what are some of the other, tertiary goals?
[00:25:42] Unknown:
Well, I mean, it's not saving the environment, you know, it's not a huge altruistic thing, but people having better tools to make their real estate decisions is definitely gonna help people remove some of the risks. It will help people get, say, mortgages in some cases that they wouldn't be able to get otherwise, or risks they wouldn't necessarily be willing to take on, but we can do that with faster and more accurate valuations. People also, say, not buying homes that aren't a good deal. The more accurate we can make the valuations on homes, and the tighter we can make that spread, the more efficient the market is going to be. It's gonna help people a lot.
The appraiser industry is also having an issue right now with a lack of people going into the industry. Now, our tool is absolutely not going to get rid of professional home appraisers. It's used by professional home appraisers. So until we have sentient robots walking around doing these appraisals, they'll definitely be part of the process. But the idea is that one appraiser could become much more efficient. So instead of an appraisal taking 8 hours, you could get that down to 4. You know, just as an idea: a reduction in the amount of time, and being more accurate as well.
[00:26:49] Unknown:
So at what part of the process would an individual who's looking to become a home buyer start using HouseCanary? Would it be after they've already narrowed down their search and they have a couple of specific properties that they want to look at, or would they start using HouseCanary even before then, when they're just beginning the search and trying to determine what areas they should even be focusing on?
[00:27:07] Unknown:
Right now, for someone who's a potential home buyer, I think our best match is the value report. And that's gonna be just like you said: once they've narrowed it down to a few properties. The value report is $5. It's going to be an accurate estimate of the value of the home, and give you a lot of information about that home. We'll tell you how confident we are in our score too. You know, some houses are harder to value than others. So if we're not confident, we say so. I look at homes in Beverly Hills a lot, just because they're difficult. They're all so different. They have so many different features. So we might give a lower confidence, versus a cookie-cutter home, which will come in with a high confidence. Generally, you know, we're going pretty high there. But then you're gonna have a PDF that has a bunch of comparable sales. It's gonna have a bunch of property information. You can go to the negotiating table with that, and you're gonna get that for $5. So that's where your normal home buyer is gonna match with HouseCanary.
[00:27:59] Unknown:
And what's the turnaround time for somebody wanting to purchase a report for a specific home? Do they just transact the payment and get the PDF immediately, or is there a lag time between the request and receiving the results?
[00:28:14] Unknown:
It's instant. I mean, a few, yeah, it's on the seconds timeline. It does a lot of number crunching, so we're actually trying to get that faster. But, yeah, it's self-serve and done in real time.
[00:28:27] Unknown:
So on the value report for a home, does that also include a forecast of the house's value over the next certain number of years, so people can determine how sound an investment it is to actually purchase a given property?
[00:28:36] Unknown:
We do forecast for up to 3 years, and then, you know, with a band and kind of an estimate there as well. So that would definitely be valuable for someone looking at buying a home.
[00:28:47] Unknown:
And for developers who are looking to build out real estate or purchase real estate on a broader scale, what are some of the most useful aspects of the Pro product that they would be leveraging, that they wouldn't be able to find as easily or as accurately elsewhere?
[00:29:02] Unknown:
Yeah. It's interesting to think about and compare. I mean, with Pro, the strategy there is just lots of data, and filtering it and making it accessible, because we're not sure, for any given developer, what's going to be most important to them. But we do identify risk in neighborhoods, and we'll do a heat map of an area, so you can see if one area has more risk than another. In this case, it's risk of decline: what does our estimate say is the percentage chance that this property is going to be worth less in 12 months than it is right now? So that'd be a big part as well. Demographic information is a huge thing there. And then all sorts of price indices. So, you know, the index and forecast, the velocity: is it moving up or down? Supply is a really interesting thing. That would be: if people kept buying homes, but they stopped selling them, how many months would it be until we ran out of homes? And it's really interesting to see that number go up and down, and to see areas with high supply and low supply.
[00:30:00] Unknown:
Yeah. It's definitely an interesting thing to think about because, you know, it's not really something that you would generally consider because whenever you're looking around, you always assume that there's going to be some property available for purchase. But if everybody just decided to hold on to what they had and not actually offer it up, what kind of an effect that would have on the economy both regionally and depending on how wide scale it was, you know, at a, you know, county or state or national level.
[00:30:22] Unknown:
Yeah. And, you know, there's the flip side of that too, which is, if there's 8 months of supply sitting on the market, I don't know if adding a 200-home development is necessarily a good idea. You're just gonna make that supply even longer and bigger. Where you see the complete lack of supply is somewhere like San Francisco, you know, because something comes up on the market and usually gets picked up pretty quickly.
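The months-of-supply figure discussed above can be illustrated with a small sketch. It uses the common definition (homes currently for sale divided by homes sold per month); whether HouseCanary computes it exactly this way is an assumption, and the function name and all of the numbers are made up for illustration.

```python
def months_of_supply(active_listings: int, sales_per_month: float) -> float:
    """Months until inventory runs out if buying continues and no new homes are listed.

    Common definition; not necessarily HouseCanary's exact formula.
    """
    if sales_per_month <= 0:
        raise ValueError("sales_per_month must be positive")
    return active_listings / sales_per_month


# Hypothetical market: 400 homes for sale, 50 selling each month.
current = months_of_supply(400, 50)

# Same market after a hypothetical 200-home development is listed.
after_development = months_of_supply(400 + 200, 50)

print(f"current: {current:.1f} months, after development: {after_development:.1f} months")
```

In this made-up market, supply goes from 8 months to 12 once the 200 new homes are listed, which is the kind of shift a developer would want to see before committing to a project.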
[00:30:44] Unknown:
Yeah. And going back a bit to your comment about the different regulations that you have to deal with, I guess, what are some of the incidental complexities associated with having to make sure that you are properly managing compliance with the different regulations that are present in the housing industry and the various associations that are related to that?
[00:31:03] Unknown:
You know, I don't want to underplay it too much, but with where we are, you know, because we're not lending money, because of the role that we take with a lot of our products, we definitely don't have it as difficult as a lot of other companies. Even more of our effort goes into making sure that appraisers are following those regulations in the case of the appraiser product, so that they are filling out the form to the correct specification: all the validation we do on the form to make sure that it's filled out properly. We do have some basic regulatory requirements, like keeping the appraiser data for years and years, but it's just kinda sitting in cold storage at that point. So that's definitely something that I'm involved in. Maybe there are more regulations that we have to deal with that I'm unaware of, but coding those regulations is something that I do pretty often.
So are there any other topics or questions that you think we should cover?
Can definitely talk about that. We are hiring at HouseCanary. We're always looking for skilled people in Python, whether it's on one of our applications or on the data engineering team, especially people who can really rock SQL and Python. That's a great combination.
[00:32:08] Unknown:
And what's your stance on local versus remote employment?
It's all local right now. So all of the Python jobs are in our San Francisco office, which has a great view of the bay. So it's got that going for it.
Sure. Well, I definitely wish you the best of luck in filling your open positions. It definitely seems like an interesting company with an interesting product behind it, and I'm sure that you're gonna end up continuing to grow. Seems like you guys have been pretty successful so far, so I'm happy to see that. It definitely seems like you're leveraging some pretty interesting technologies in a particular portion of the economy that is in great need of better transparency.
[00:32:45] Unknown:
Yeah. Thanks. And it's funny, because sometimes we're doing these, like, high-level strategy sessions about what we're gonna do with the product, and real estate in a lot of ways is just so far behind that I often suggest, let's just do what people started doing 5 years ago. You know, the fact that we have this really easy-to-use RESTful API is, like, wild to people. It's just, you know, JSON and REST. But in the real estate world, that's very advanced. You know, Appraiser being a cross-platform product, iOS and desktop and responsive and all that, is apparently a big deal. Not a lot of other people are doing the full-blown modern web with real estate information.
[00:33:19] Unknown:
Yeah. It's really easy to get sort of stuck in the echo chamber of what the current trends are in technology and what's interesting, what's new, and cease to realize the fact that there are still huge portions of the economy in various industries that are still using technology from the, you know, late eighties early nineties because they've just been chugging along and they don't even necessarily know that there is a better way to do it or people who are running their businesses largely on the back of Excel, which, while it's a great tool and very useful in its particular context, there are people who are trying to stretch it beyond what it should really be applied to.
[00:33:55] Unknown:
Well, people still fax stuff. I mean, we're hooking up WebSockets to one of our products and, you know, looking at all sorts of fancy things, and all that stuff is still going on.
[00:34:08] Unknown:
Yeah. I was actually just speaking with a coworker about that the other day. I don't remember exactly what he was trying to do, but somebody he was dealing with said that the only way they would accept the information he was trying to give them was via fax. And he said, I don't even know how to send a fax anymore. Like, where am I gonna find one? How do I do that? Fortunately, there are services that let you upload a PDF and send a fax via the Internet, but it is kind of funny to stop and think about that for a little while.
Well, what's really great is that there are then services where you can receive a fax from a phone number. So sometimes people are doing PDF to fax back to PDF, and I just think that's really awesome.
Yep. That's definitely true. Alright. So for anybody who wants to keep in touch with what you're up to and follow what HouseCanary is doing, or even reach out to you directly, what would be the best way for them to do that?
[00:34:57] Unknown:
Our website, housecanary.com, which, you know, has the contact page. You'll have the products, the careers page. We're also on Twitter as HouseCanary. I'm all over the Internet as Travis Jungroth. I'm the only one. So I'm on Twitter, travis, j-u-n-g-r-o-t-h, GitHub too. So you can contact me through any of those, if anyone would like to reach out to me personally or reach out about HouseCanary as well.
[00:35:11] Unknown:
Great. So with that, I will move us on into the picks. For my picks today, I'm going to choose a book that I just recently finished called Railsea by China Miéville. I picked the author a while ago. I don't remember which episode number. But this time, I'm gonna pick his specific book, Railsea, because it was a very well done and very well presented novel that takes a lot of interesting aspects from a few different literary pieces.
So it captures some of the essence of Moby Dick, in that there's a captain who's hunting after this creature, where they have this interesting relationship and interplay because of their shared history. There are elements of the Odyssey incorporated in it. One of the subplots and sub-themes of it is the effects of industrial and corporate greed and corporate competition, and what kind of an impact that has on society and the environment. It's really well presented, has a lot of really interesting literary devices that he puts into play, and he just has one of the best commands of language of any author that I've read. So I definitely recommend people check that out. For a lot of those same reasons, I also recommend checking out Kraken by China Miéville as well, which is another one of his books that I finished a little while ago. It's an interesting take on sort of an occult thriller slash mystery kind of thing, where he makes a lot of interesting use of symbolism and various literary devices to keep the story moving, and he keeps you guessing all the way up to the end as to what's actually happening. So I certainly highly recommend that as well. And with that, I will pass it on to you. What do you have for picks, Travis?
[00:36:58] Unknown:
Well, my first pick will be pretty straightforward, and it is a Python package called DDT, short for data-driven tests. And it's a package for writing a whole bunch of tests really fast. The main thing that you'll use is a decorator. So if you're using unittest test cases, you'd write your normal test class, and then you'd write a test method, but one that also takes an argument, and decorate it. This way you can pass it a whole bunch of arguments and see that they all come out the way that you expect. So I use it to write dozens of tests really, really fast. The benefit of doing it that way over a for loop is that the tests will fail individually. If you just do a for loop instead, then the first one that fails is gonna fail that whole test case. So, yeah, that's DDT, data-driven tests. Love it.
The other thing would be a book as well, a little bit different and definitely not an obscure book: On Writing Well by William Zinsser, and it's about writing nonfiction.
And it is one of the best coding books that I have ever read. It's not about coding at all. It's entirely about writing things like travel articles and restaurant reviews and biographies, but the structure of learning to get your point across really well, terseness, things like that, I really found that it affected the way that I coded, and I'm working my way through it a second time now. So I'm a huge fan of that book. It probably helps you write more Pythonic code. I mean, if there's any programming language that's really close to English, Python is definitely one of those. I'm sure there's one better, but it's one of the things I love about Python.
[00:38:29] Unknown:
Yeah. It definitely sounds like an interesting book, and I can certainly see how it could be leveraged to feed into, you know, your coding style and making sure that it's easy to understand and comprehend, because oftentimes it is easy to forget the fact that we're writing code more for humans than we are for machines. Because if we were just writing it for machines, then we'd all still be using binary or assembly.
[00:38:44] Unknown:
Yeah. Yeah. I mean, I'm very much in that camp where I write code that's meant to be read by other people. And then, oh, by the way, it also, like, runs.
[00:38:55] Unknown:
Yeah. Well, I really appreciate you taking the time out of your day to tell us about the work that you guys are doing at HouseCanary and some of the internals of how it runs. It's been pretty interesting, and the product is one that I may very well end up using myself in the future. So I appreciate your time, and thank you.
Excellent. Thank you. Yeah, so happy to get to talk to you. It's been great.
Introduction and Guest Introduction
Travis Jungroth's Background and Introduction to Python
Learning Python and Career Path
HouseCanary Overview and Products
Target Customers and Use Cases
Appraiser Tool and Data Accuracy
Data Sources and Feedback Loop
Regulations and Economic Impact
Handling Data Bias and Accuracy
Service-Oriented Architecture and Data Integration
Predictors of Future Real Estate Value
Python in HouseCanary's Software Environment
Software Design and Architecture Challenges
Handling Sparse Data and Feature Space
Using R for Core Valuation Algorithms
Interesting Data Insights
Political and Economic Factors in Real Estate
Altruistic Goals and Market Efficiency
Using HouseCanary as a Home Buyer
Pro Product for Developers
Regulatory Compliance and Challenges
Hiring and Employment at HouseCanary
Technological Advancements in Real Estate
Contact Information and Closing Remarks