Demystifying home pricing models with Lee Kennedy
Lee Kennedy worked for American Savings Bank in the 1990s when the first automated valuation models for pricing homes came into existence. Since 2005, he has run AVMetrics, a company that tests and compares new home-pricing models that come onto the market.
Kennedy’s expertise is relied on by lenders, appraisers, real estate brokerages, and, recently, iBuyers. He regularly appears as an expert witness in civil court cases.
So, few are better prepared to discuss not just the swirl of news surrounding pricing models, including the demise of Zillow’s iBuying program and appraisal bias, but how, exactly, an AVM works.
For this week’s episode of Houses in Motion, which is part of HousingWire Daily, Kennedy spoke with HousingWire senior real estate reporter Matthew Blake about the capabilities and limitations of AVMs and the data that feeds valuation decisions. One issue is the data held by Fannie Mae and Freddie Mac.
In an edited excerpt here, Kennedy pleads for the government-sponsored enterprises to make more data available to appraisers:
Matthew Blake: What do you have in mind when you talk about the democratizing of data? What kind of data might Fannie Mae have that might be valuable for you or valuable for someone building an automated valuation model?
Lee Kennedy: So, think of an appraiser as being a data cleaner of sorts. They take raw data from multiple sources, including their own files, and they validate and verify that data. Not only their own data but public record data, MLS data, agent data, parties to the transaction, those data points. They clean that all up and they put it in a report.
So, now you have very clean data points that have been validated and verified. It would make your job a lot easier [to be able to decide] what to rely upon to make adjustments. So, it’s going to have a huge impact, not only for AVMs but for appraisers if they allow this data to come back into the public domain.
HousingWire Daily examines the most compelling articles reported across HW Media. Each afternoon, we provide our listeners with a deeper look into the stories coming across our newsrooms that are helping Move Markets Forward. Hosted by the HW team and produced by Elissa Branch. If you have an inquiry relating to podcasts, you can reach our team at firstname.lastname@example.org.
Below is the transcription of the interview. These transcriptions, powered by Speechpad, have been lightly edited and may contain small errors from reproduction:
Matthew Blake: Hello, and welcome to “Houses in Motion,” part of “HousingWire Daily.” And welcome to 2022. I am Matthew Blake, real estate reporter with HousingWire. Each week, I will try to interview a guest who has something important and interesting to say about the biggest issues in real estate. For this episode, I spoke with Lee Kennedy about the necessity of automated valuation models, or AVMs.
AVMs are the algorithms that lenders, government-sponsored enterprises, real estate brokers, iBuyers, bridge lenders, and other real estate actors use to determine what a house is worth. Lee Kennedy is the founder and owner of AVMetrics, which tests the merits of AVMs. He is routinely called on as an expert witness regarding home valuation models. And with nearly three decades working on home pricing models, he has authority on the news of the day, like Zillow announcing a wind down in their iBuying due to a faulty price forecasting model. I hope you find this conversation accessible and enjoyable. Please email me thoughts at email@example.com. That’s M B-L-A-K-E @housingwire.com.
Hello, and welcome to “Houses in Motion.” I am Matthew Blake, real estate reporter with HousingWire. And I am here with Lee Kennedy, the founder and owner of AVMetrics. Lee, thank you very much for coming on.
Lee Kennedy: My pleasure.
Matthew Blake: So, Lee, why don’t you tell us about yourself and what you do, and what AVMetrics does?
Lee Kennedy: Yeah. So, I started testing AVMs back in the late ’90s, early 2000s while at American Savings Bank, which was merged with Washington Mutual Bank. I ran a business and analytics unit for them in the risk department under the appraisal department, I was a district-level manager. And when we started going transactional with our loan origination systems into the appraisals, we were using AVMs at that time, but we weren’t really testing them. We were using human interaction with the AVMs as part of our evaluation process.
So this new transactional system meant we had to make decisions computer-to-computer. So I developed testing of AVMs on a national basis. About the same time, BofA was doing it, Countrywide was doing it, Wells Fargo was doing it. And I knew the gentlemen at all those places that were developing AVM testing as well. So I’ve been doing it, like I said, since the late ’90s into the early 2000s.
And my background before that was, I was a nuclear engineer onboard a fast-attack submarine. Came out of that in the ’70s, went into manufacturing and industrial engineering. And my wife was always a real estate broker and appraiser, and we decided we would start a small cottage business, and I learned to be an appraiser. And that was in the mid-’80s. And then went to work for the lending institutions in the mid-’90s. So that’s kind of how I got to where I’m at now. I spun AVMetrics as a standalone business when I left Washington Mutual in 2000. So I’ve been doing testing since, let’s call it 1999, of AVMs.
Matthew Blake: And what does AVMetrics do exactly?
Lee Kennedy: Yeah. So, we stand as like the independent third-party in between the developers and sellers of AVMs and the end-users of AVMs. Our business model is such that the vendors, the modelers participate at no cost. And so, we test, I think right now we’re testing somewhere between 30 and 35 commercial or lender-grade models. We’ve tested probably close to 80 or 90, but some of those were development models, some of those were version control models. Some of those models have been retired, they don’t exist anymore.
We test, basically, on a [inaudible 00:04:34] and we roll that up into a [inaudible 00:04:37] report to get enough statistical observations to be able to do what we do. And then we sell those reports to major lending institutions, rating agencies, REITs, securitizers, the end-users of the AVMs themselves.
Matthew Blake: So, what is…? I mean, we hear a lot about automated valuation models in the news with Zillow winding down its iBuying program, saying that its price forecasting model didn’t work. We also hear a lot about AVMs from companies like HouseCanary, that say they have a superior alternative to the human appraiser. What is, in kind of the most basic language, say like a Martian descending on to earth, what is an AVM? And what should people know who maybe haven’t heard of an AVM, what it does?
Lee Kennedy: Yeah. So AVMs really started back in the ’70s and ’80s as a Computer Assisted Mass Appraisal, CAMA, we used to call it. Basically for assessment districts, where they were valuing the same properties over and over and over again. They started to develop in the mid-’90s for the U.S. residential housing stock on the mortgage side of the business. Basically for portfolio analytics, right? Get an idea of where your portfolio was at. More of a directional piece on the equity. Is the equity going up or reducing? Then it got into risk assessment, da, da, da.
Now, it’s developed with new techniques of machine learning and coupled with artificial intelligence into some very powerful valuation tools, hence the claims of, like HouseCanary, and Quantarium, and some of the others, that there are circumstances where these models do a very good job of valuing individual residential real estate properties.
Matthew Blake: And what are the inputs that go into the AVM, and how are those weighted?
Lee Kennedy: Yeah, that’s a great question. So, the inputs into the models are much like an appraiser would use. Think of an appraisal emulation type of product. You’ve got qualitative and quantitative data that goes in. The qualitative being what anybody would think about buying a house. What size is it? How many bedrooms? How many baths? Does it have a pool? How is it located within the neighborhood? What are the external conditions? What is the condition of the property? All those types. Some of that’s qualitative more than quantitative when you start getting into the condition of the properties, though it’s factually based on, you know, whether it has granite countertops, or what the remodel status is.
So, those are the normal things that you would think of that are data inputs. We’re seeing a lot more of the nuanced data inputs. You’ve seen it on Zillow with like the walkability score, things along those lines. Microeconomic data, right? School districts. And these are things that appraisers take into consideration in their reports as well.
I think the biggest data change that’s happened in the last, probably 5 to 10 years, is the regionalization and the nationalization of multiple listing services. So there’s a lot of qualitative data, if you know how to get at it through the photos, and the narrative comments on the property, as to location and condition. And then the government-sponsored enterprises, the GSEs, Fannie and Freddie. They have these UCDP data ports where there’s a lot of appraisal data, right? Appraisal strip data. And, again, if you do your data mining well, there’s a wealth of information in that. Those have been the two major changes in data.
Matthew Blake: And how does that work? As someone who does not have a mathematics degree, like, how do you…? Like something qualitative like proximity to nature, local schools, even number of bedrooms, which is, you know, a very basic input, do you then assign a numerical value to all of these? And can you? Is there like a pretty reliable, precise formula for making these qualitative factors quantitative and getting to a home’s value?
Lee Kennedy: Again, great question. So, a lot of it is stuff that’s straightforward on the quantitative portions of the data, you know, you can do regression analysis on that. These are, you know, pretty stock variables in an equation. It gets a little nuanced when you start getting into conditional deals. So, if you’re building a model, there are variables that you don’t have, right? That you need to proxy for, right? And in the past, condition and location have been two of the major stumbling blocks for these models because that information was hard to get. It’s getting easier to get now and easier to quantify, right?
If you don’t have a variable and you have to proxy for it, that’s an omitted variable, or can create an omitted variable bias in the model itself. So, that part is getting better. There’s another thing that, just think of it as the form adjustment process in a Fannie or Freddie appraisal form, where, you know, it really starts with the most important adjustments at the top and works down to the least impactful adjustments towards the bottom of the form. And that’s the same way if you look at the weighting process, how much weight needs to be given to each of these variables in the equation.
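The omitted variable problem Kennedy mentions can be shown with a small sketch. Everything in it is an assumption made for illustration: the five homes, the 1-to-5 condition score, and the pricing rule of $100 per square foot plus $20,000 per condition point. It is not how any particular AVM is built. It only shows that when condition is left out and larger homes also tend to be in better condition, a regression on square footage alone overstates the value of a square foot.

```python
# Illustrative only: made-up homes, not real market data.
sqft = [1000, 1500, 2000, 2500, 3000]
condition = [1, 3, 2, 4, 5]  # hypothetical 1-5 condition score

# Assumed "true" pricing rule: $100 per sq ft plus $20,000 per condition point.
price = [100 * s + 20_000 * c for s, c in zip(sqft, condition)]


def simple_slope(x, y):
    """Ordinary least squares slope of y on a single predictor x."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    var = sum((xi - mean_x) ** 2 for xi in x)
    return cov / var


# Regress price on square footage alone, with condition left out of the model.
slope_without_condition = simple_slope(sqft, price)

print(f"Estimated $/sq ft with condition omitted: {slope_without_condition:.2f}")
print("True $/sq ft in the assumed pricing rule: 100.00")
```

With these made-up numbers the estimate comes out well above the $100 written into the pricing rule, because square footage absorbs part of the effect of the missing condition variable.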
And appraisers and models both have trouble with, the statistical word is multicollinearity, and appraisers call it double adjusting, right? So, how much weight are you gonna put on, you know, the bedroom, bathroom count versus the square footage, right? Because you can adjust it either way. So, those are some of the nuances in model building, and, of course, the output of the model for those qualitative and quantitative factors.
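What Kennedy calls double adjusting can be checked numerically. The sketch below uses invented figures for six homes and assumes a model with only two predictors, square footage and bedroom count. It computes their correlation and the variance inflation factor (VIF), a standard diagnostic that for two predictors reduces to 1 / (1 - r²). The idea that a VIF above roughly 10 signals trouble is a common rule of thumb, not something stated in the interview.

```python
import math

# Illustrative only: made-up homes where bigger houses have more bedrooms.
sqft = [900, 1200, 1500, 1800, 2400, 3000]
bedrooms = [2, 2, 3, 3, 4, 5]


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    var_x = sum((xi - mean_x) ** 2 for xi in x)
    var_y = sum((yi - mean_y) ** 2 for yi in y)
    return cov / math.sqrt(var_x * var_y)


r = pearson_r(sqft, bedrooms)

# With exactly two predictors, each one's VIF is 1 / (1 - r^2).
vif = 1 / (1 - r ** 2)

print(f"Correlation between sq ft and bedrooms: {r:.3f}")
print(f"Variance inflation factor: {vif:.1f}")
```

When the two inputs move together this closely, the data cannot say much about how to split the weight between them, which is the same dilemma an appraiser faces when adjusting for both bedroom count and square footage.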
Matthew Blake: I wanted to go back to a couple of the developments you mentioned. Because a lot of our audience is real estate agents, and you mentioned the regionalization and nationalization of the MLSs. I, sometimes, with, you know, varying degrees of success, have tried to track the mergers of various multiple listing services over the past few years. Some of these MLSs are getting like kind of bigger, like the Bright MLS, I think, in the Northeast, is getting bigger. So, what kind of data are you now seeing in the MLSs that maybe, like, we didn’t see 10 years ago?
Lee Kennedy: It’s a couple of different folds, depending upon the specific MLS board, because there was a lot of liability involved for the MLS in the square footage of the home, for instance. So, they’ve kinda got records on that and asterisk that out, so they don’t wanna rely upon a measurement that they’re going to make or homeowner’s representation of that. Which, if you are from a modeler’s aspect, you gotta look at what I call field criticality.
You may have three or four different square footages for the same property, right? Which is the most correct one, right? That you’re going to use? So, you get into little nuances like that as well. But the general availability of data is there, right? There’s hundreds of data points in these MLS broker loads. And which ones are you going to use in your modeling, right? So, there’s a lot of data that you’re not going to use, or you’re going to combine and proxy for something like the condition.
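One simple way to handle several conflicting square footages is a source ranking, sketched below. The source names and their order are hypothetical. Kennedy does not say which source he trusts most, and a modeler would presumably set the ranking from testing rather than by assumption.

```python
from typing import Dict, Optional, Tuple

# Hypothetical trust ranking: earlier sources win when they have a usable value.
SOURCE_PRIORITY = ["appraisal", "public_record", "mls", "owner_reported"]


def pick_square_footage(
    candidates: Dict[str, Optional[int]],
) -> Optional[Tuple[str, int]]:
    """Return (source, value) from the highest-ranked source with a usable value."""
    for source in SOURCE_PRIORITY:
        value = candidates.get(source)
        # Skip missing or masked values (for example, a field the MLS asterisked out).
        if value is not None and value > 0:
            return source, value
    return None


# The same property reported three different ways, with the MLS field masked.
reported = {"mls": None, "public_record": 1850, "owner_reported": 2000}
chosen = pick_square_footage(reported)
print(chosen)
```

Here the public record figure is used because no appraisal value is present and the MLS value is masked. Changing the order of `SOURCE_PRIORITY` changes which number feeds the model.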
Matthew Blake: And I think another thing that I wanted to get back to was what you were talking about with the GSEs and their appraisal data. Maybe remind me again how that is affecting automated valuation models, but then also there’s been kind of a review of, like, home appraisals right now in our country. Like, there’s reports of bias by appraisers, people are trying to understand how appraisers actually do their jobs. So, like, what kind of data do the GSEs use that, you know, A, can shed some light on that question, but also, B, like, sort of inform automated valuation models?
Lee Kennedy: Yeah. So, both GSEs have their own internal AVM. With Freddie Mac, it’s Home Value Explorer, which is commercially available as well. Fannie Mae’s is not a commercially available model, they just use it as part of their automated underwriting system, internally. But the Uniform Collateral Data Port, the UCDP, on the new [inaudible 00:13:37] allows access to all those individual data points in an appraisal. And both those GSEs utilize that information as part of their automated underwriting system, and as part of their valuation modeling system.
Now, there’s been a push for democratization of that data, so that it’s more widely available to the builders of AVMs and other valuation practitioners out there. I don’t personally know how far that’s gone, I know there’s a general call for it. I think, with the bias studies, the appraisal bias studies, I think that’s a push in the right direction to democratize that data, make that more available so appraisers can make better decisions and have a more universal data set available to them on these properties.
Matthew Blake: What do you have in mind when you talk about, like, the democratizing of the data? Like, what kind of data might Fannie Mae have that might be valuable for you or valuable for someone building an AVM?
Lee Kennedy: So, think of the appraiser as being a data cleaner of sorts. So, they take raw data from multiple sources, including their own files, and they validate and verify that data, not only their own data, but public record data, MLS data, agent data, you know, parties to the transaction, those data points. They clean that all up and they put it in a report. So now you have very clean data points that have been validated, verified. So, that’s what appraisers do as part of their business model, right? In order to get the facts straight. So, they have to do that every time, right? For each and every appraisal.
If you made that data available, then somebody has already done that for your comparable property or for the subject that you’re working on. You know that that’s good, clean data and somebody has already done the validation and verification. Makes your job a lot easier, right? To know that another professional has done that. For the models themselves, now you’ve got a good standardized clean data source to rely upon for those models to make their adjustments. You don’t have the data noise that you would normally have if you’re doing your own data standardization and cleaning. So, it’s going to have a huge impact, I think, not only for AVMs, but for appraisers, if they allow this data to come back into the public domain.
Matthew Blake: Why are, to ask a deliberately provocative question, why are human appraisers still needed with all the developments of AVMs that you’ve described, or are they still needed?
Lee Kennedy: Great question. So, again, you keep asking good questions.
Matthew Blake: Thank you.
Lee Kennedy: There’s about 110 million residential properties in the U.S. And the models work really well when there is a lot of data, right? So, in higher density areas, the urban areas, where there’s a lot of transactions going on, a lot of velocity in the market, turnover, change. They don’t work so well when there’s not a lot of transactions in the marketplace or very sparse data points, right? So, where there’s a lot of data, where there’s a lot of transactions, the models do really well. But at any given time, I would guesstimate that the models can only accurately produce values on, somewhere around 50% to 60% of the U.S. residential housing stock because of those confines with data, and with the market movement itself, and the complexities of the properties.
Where the properties are very uniform, right? The comparable selection database and data points are a lot… One side of the street has an ocean view and the other side does not, right? Those are complex problems for…or, you know, it’s a non-tract environment where every house is different, right? The models don’t do so well there. So, the human appraiser involved in this is really, you know, the fuzzy logic that kind of puts it all together. They say art and science, well, a well-trained appraiser is 90% science and 10% art or less, right?
Matthew Blake: Interesting.
Lee Kennedy: Yeah. To use that. I think that’s where some of the studies are showing there is repeatable bias with certain appraisal factions or appraiser factions, or in certain areas. But there’s counter studies too that say, you know, that is not so much an appraiser bias issue as it’s other variables and factors involved in the appraisal processes as well.
Matthew Blake: So, in terms of, I mean, AVMetrics, you guys look at different AVMs. And so, like, we report on companies like HouseCanary, CoreLogic, some of these power buyers, Orchard, Ribbon, that each kind of have their own in-house AVM. To you, I mean, is there much of a difference between these different AVMs, like, can one company really get a competitive advantage because they have a stronger AVM?
Lee Kennedy: It’s almost a timeline. When I first started doing this, the models were fairly simple. Like we discussed, they did a pretty good job of non-complex properties in, you know, doing that kind of stuff. Hedonic regression, you know, type of modeling. Now you have the newer generation of models, I would say in the mid-2000s, was all about neural nets, and fuzzy logic, and such. And now, the newer models, here we talk HouseCanary, Quantarium, these guys, it’s all about machine learning.
The computing power has, you know, doubled, tripled, quadrupled, you know, year, to year, to year, and the data storage capacity costs have gone way down and data accessibility has gone way up. So you have all these factors kind of converging. And the newer modeling techniques are taking advantage of that. I’ve seen probably six models retired in the last three years, that were the older type of models, and replaced with new models. So, I think the advantage is, can you build a model based on the higher computer capacities, the machine learning, and the data availability that we have today? It’s gonna give you a competitive advantage.
But AVMetrics, we kind of stand on the fence between the builders of the models and the end-users of the models as a neutral position. So, we test all of these models, we do outcome analysis and sensitivity analysis. And what I can tell you is there is no one model that can value the entire U.S. residential housing stock, right? You need a bevy of models to be able to do that.
Because the same model that does a single-family detached dwelling in Los Angeles between $250,000 and $500,000, isn’t the same model that’s gonna do a mid-rise condo in Los Angeles County over $500,000. The models are each tweaked to do a good job with certain types of properties, certain geographies, certain price tiers. And that’s how we test. So, we built what we call a model preference table. So you go to the table, you give it an address, you look at it, it says, “For this property in this area, in this price range, in this property type, here is your best model, or your best two or three models.” And that’s kinda how the industry uses the models as well.
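The model preference table Kennedy describes can be pictured as a lookup keyed on geography, property type, and price tier. The sketch below is an assumption about its shape, not AVMetrics’ actual table: the model names are placeholders, the tier boundaries are borrowed from the two examples in his answer, and it takes a county directly where his description starts from an address.

```python
from typing import Dict, List, Tuple


def price_tier(estimated_value: float) -> str:
    """Bucket a value into tiers based on the examples in the interview."""
    if estimated_value < 250_000:
        return "under_250k"
    if estimated_value <= 500_000:
        return "250k_to_500k"
    return "over_500k"


# Hypothetical table: (county, property type, price tier) -> models ranked best first.
PREFERENCE_TABLE: Dict[Tuple[str, str, str], List[str]] = {
    ("Los Angeles", "single_family_detached", "250k_to_500k"): ["model_a", "model_c"],
    ("Los Angeles", "mid_rise_condo", "over_500k"): ["model_b", "model_a", "model_d"],
}


def best_models(county: str, property_type: str, estimated_value: float) -> List[str]:
    """Return the ranked models for this segment, or an empty list if untested."""
    key = (county, property_type, price_tier(estimated_value))
    return PREFERENCE_TABLE.get(key, [])


print(best_models("Los Angeles", "single_family_detached", 400_000))
print(best_models("Los Angeles", "mid_rise_condo", 750_000))
```

An empty result stands for a segment with no tested model, which is one way to represent the properties that still need a human appraiser.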
Matthew Blake: Let’s talk about iBuying for a bit, because, I mean, first of all, obviously, it’s still of interest. I mean, we’ve been talking about, like, models to value homes, and with iBuying, it’s like models to value homes, but also models to value what a home’s value will be in like three to six months. So how is that kind of short-term price forecasting different, or is it different from an AVM?
Lee Kennedy: I mean, we deal with them as well on the testing side, because they wanna know, you know, what the most accurate model is for whatever types, groups, and geographies of properties that they’re interested in making offers on. But an AVM, much like an appraisal, is a point set in time, that is what it’s worth today at 9 a.m. And it doesn’t have a lot of forecasting ability to it. Now, there is a group of appraisals out there that do that, and these are corporate relocation or relo appraisals.
And they’re set up much like an iBuyer would, they’re valuing the pricing of the property, right? What would the property sell for in a forecasted date and time? Usually 60, 90, or 120 days, depending upon the market level. What’s it gonna be worth then so we can make an offer today? Because we need to get the employee out and moved, and take the property into inventory, or get it on the market, not take it into inventory, and understand. So that’s a forecasting type of appraisal.
I’m sure that the models can do forecasting. I know Veros, you know, they do…their model is built around a forecasting methodology. But it’s not very often that somebody wants to know what a property’s going to be worth 30, 60, or 90 days from now, they wanna know what it’s worth today, while it’s going through the contract price, or they’re gonna lend upon it today, right? Advanced credit on it today. So, those are probably…in Zillow’s case, probably two different models, the one that you see and I see, you know, saying, this is what the property is worth today, and then a forecasting piece that says, if we’re going to be involved, you know, we’re gonna take this property into inventory, what’s it gonna be worth in 30, 60, or 90 days? Depending upon days on-market transactions in that particular marketplace.
Matthew Blake: What are the different inputs that would go into a model trying to figure out, like, what a home is worth in 60 days?
Lee Kennedy: Housing activity reports, what’s your current absorption rate in that marketplace, right? How long are the properties on-market? What are the days on-market? What are the list to sales price ratios of those properties? Other microdemographic data, what’s going on in that particular marketplace? What’s the competition from new construction properties? What’s the employment outlook look like? Did Kodak just shut down their factory and move, which is gonna affect your amount of properties in the marketplace at any given time? Those are all, you know, forecasting-type information or data points that you would probably take a look at in a forecasting model. And I’m just scratching the surface for guys that actually build those, so.
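Three of the inputs Kennedy lists can be computed from basic listing counts and prices. The sketch below uses common definitions, which are assumptions here because the interview does not define the terms: absorption rate as monthly sales divided by active listings, months of supply as its inverse, and list-to-sale ratio as sale price divided by list price. All figures are invented.

```python
from statistics import median
from typing import List


def absorption_rate(sold_last_month: int, active_listings: int) -> float:
    """Share of the active inventory that sold in the month."""
    return sold_last_month / active_listings


def months_of_supply(sold_last_month: int, active_listings: int) -> float:
    """How long current inventory would last at last month's sales pace."""
    return active_listings / sold_last_month


def list_to_sale_ratio(list_price: float, sale_price: float) -> float:
    """Above 1.0 means the home sold over its asking price."""
    return sale_price / list_price


# Illustrative market snapshot: 120 active listings, 40 sales last month.
rate = absorption_rate(sold_last_month=40, active_listings=120)
supply = months_of_supply(sold_last_month=40, active_listings=120)

# Illustrative closed sales as (list price, sale price, days on market).
closed_sales = [(400_000, 392_000, 35), (500_000, 510_000, 12), (300_000, 300_000, 21)]
ratios: List[float] = [list_to_sale_ratio(lp, sp) for lp, sp, _ in closed_sales]
median_days_on_market = median(days for _, _, days in closed_sales)

print(f"Absorption rate: {rate:.2f}, months of supply: {supply:.1f}")
print(f"Median list-to-sale ratio: {median(ratios):.2f}")
print(f"Median days on market: {median_days_on_market}")
```

A forecasting model would combine indicators like these with the employment and new-construction signals he mentions. This sketch stops at computing the inputs and does not attempt the forecast.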
Matthew Blake: So, one last question for you, where do you see AVMs heading? Is there going to be an increasing reliance on them? Is there sort of a fork in the road that you see coming up?
Lee Kennedy: Yeah, I think we’ve hit the fork in the road. I think the models, the newer type of models are doing really, really well. You’ve got the adoption by the GSEs for their appraisal waiver programs is huge, right? I mean, I haven’t looked at the latest percentages, but it’s almost like the majority of what they’re doing right now is without a traditional appraisal process. It’s through their property inspection waiver programs.
That usually trickles, because we had the same thing back in the early 2000s up through ’04, ’05. There were property inspection waiver programs with the GSEs that worked really well. You don’t hear much about it because only the properties that were easy to value went to the PIW programs, right? Only the borrowers who were extremely well qualified went to the PIW program. So, there weren’t a lot of losses in those PIW programs.
But I think, with Fannie and Freddie leading the way by utilizing these models as part of their credit extension processes with the lending institutions, I think you’ll see it trickle down to lending institutions. You’ve got the graying of the appraisal profession. A lot of them look like me, and you don’t have a lot of new people coming into the appraisal profession, right? So, you’ve gotta look at that. And then, of course, you know what I mean, the elephant in the room is appraisal bias itself, right? Is that really happening? And at what level is that happening? And are the models a potential band-aid or are they a cure as part of that bias process?
So, I think it’s kind of a nexus point where you’re gonna see the adoption and use of these models. Plus, the millennial generation or whatever it is, I know my kids, I don’t think they’ve ever been inside of a bank, right? Everything’s done electronically. People are buying houses. I just bought a house a year and a half ago without seeing it. I knew the area well enough and I could do everything over the internet. So, I think you’re gonna see a lot more of that. And that involved the use of valuation models too, pricing models to see what was going on in those areas. So, I think you’re gonna see this growing. But there’s always gonna be a need for appraisers because of the complex properties, unique properties, things that are modeled…that are just too complex for the level of the models to get to, yeah.
Matthew Blake: Lee Kennedy, AVMetrics. Thank you so much for appearing on “Houses in Motion.”
Lee Kennedy: Oh, not at all. Always a pleasure, sir. Thank you.
Matthew Blake: Thank you.