If you haven't waded through the default industry waters before, you might be surprised to see the vehemence with which different data companies argue the validity of their numbers. RealtyTrac, Foreclosures.com, Economy.com, and many -- many -- more. All will BRISTLE at even the slightest hint their numbers are wrong. Yet someone's are. Why? Because nobody has the same numbers, often not even close. The LA Times waded into these murky waters this past weekend:
Zandi takes issue not only with RealtyTrac for numbers he says are too high but also with DataQuick Information Systems, a La Jolla, Calif.-based research company frequently cited in The Times, for numbers he says are too low. DataQuick and RealtyTrac draw their numbers directly from filings in county recorders' offices. After four years of boom, the market in California last year definitely turned queasy. But RealtyTrac's numbers show a full-fledged crisis, with 142,429 foreclosure filings -- one for every 86 households in the state, the company said in a February news release. DataQuick reported less than a tenth of that total: 12,672 foreclosures. "The RealtyTrac data is overstated, but no way there were only 13,000 foreclosures," Zandi said. His own data, based on a random sample of 5% of the consumer credit files assembled by data collection firm Equifax Inc., show 56,747 first-mortgage loan defaults in California last year. Zandi acknowledges that the actual number of foreclosures is probably a little less than this figure, because some defaulting owners manage to save themselves, but says he stands by it as a more accurate representation of reality in the state than anyone else's numbers. John Karevoll, chief analyst for DataQuick, said Zandi, like RealtyTrac, was miscounting. "You tell Mark Zandi we will go toe to toe with them, address by address, foreclosure by foreclosure. My numbers are right. I know they're right," Karevoll said.
When I was the lead editor of this publication and led the development of its daily news Web site, I decided to go with RealtyTrac's stats because they counted each stage of foreclosure, and because they were willing to set up an XML file for us to connect to quickly. I personally think the stats RealtyTrac provides are very much correct, but they can be misinterpreted when the uninitiated attempt to aggregate across NOD (notice of default), auction and REO (real estate owned) numbers. Adding the three will NOT give you accurate numbers for "total foreclosures" -- whatever that means -- because of what's known as data duration: the same property can be counted more than once. A property can run through multiple stages, and even back between stages, depending on what takes place during a given 30-day reporting period. The problem is even more rampant in quarterly data, where a single property has more time to change stages. The RealtyTrac data is a great way to get a read on trends in the foreclosure process, but trying to use it to get some magical number of "total foreclosures" is something only a journalist outside of the trade would attempt to do. And then, of course, wonder why the numbers don't match up. I'm not a RealtyTrac honk. Their numbers have problems like nearly everyone else's in this market -- primarily selection bias, because they only report whatever is in their proprietary database, which may or may not represent reality -- but let's at least frame the discussion around some understanding of what the numbers are actually good for before getting into a pissing contest over whose numbers are "the best."
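To make the double-counting point concrete, here is a minimal sketch in Python. Everything in it is invented for illustration: the property IDs, the stage labels and the one-month filing log are assumptions, not RealtyTrac's actual data or file format. It simply compares the sum of per-stage filing counts with the number of distinct properties behind them.

```python
from collections import Counter

# Hypothetical one-month filing log as (property_id, stage) pairs.
# The data is made up for illustration; it is not real RealtyTrac output.
filings = [
    ("A", "NOD"),      # property A gets a notice of default...
    ("A", "AUCTION"),  # ...goes to auction...
    ("A", "REO"),      # ...and ends up bank-owned, all in one period
    ("B", "NOD"),
    ("B", "AUCTION"),
    ("B", "NOD"),      # B cured the default, then defaulted again
    ("C", "NOD"),
]


def stage_counts(rows):
    """Count filings per stage, the way stage-level stats are reported."""
    return Counter(stage for _, stage in rows)


def naive_total(rows):
    """Add up every stage count: the 'total foreclosures' mistake."""
    return sum(stage_counts(rows).values())


def distinct_properties(rows):
    """Count each property once, no matter how many stages it passed through."""
    return len({property_id for property_id, _ in rows})


print(stage_counts(filings))         # filings per stage
print(naive_total(filings))          # 7 filings in total
print(distinct_properties(filings))  # but only 3 properties
```

In this toy log, seven filings describe just three properties. The per-stage counts are still useful for reading trends at each step of the process, which is the point above; the trouble only starts when they are summed and presented as a count of homes.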