The democratization of data

"Indicizing" masses of industry information

These are simultaneously the best and worst of times for data-driven decision-making. The mortgage industry is so awash in data that making sense of it all can be a full-time job – and quite often is.

In the best-case scenarios, some of the industry’s most highly educated and well-paid hires spend inordinate amounts of valuable time sifting through terabytes of data to facilitate decision-making. In other situations, the volume of data is simply too burdensome, or – for some institutions – the cost of making these hires is too high to justify. In either scenario, extremely important, strategic business decisions may be made based on little more than headlines and intuition.

The fact is, it doesn’t need to be this way. Of course, for certain use cases there is no substitute for direct application and analysis of massive data sets. But for others, there are perfectly suitable proxies.

The HPI model

Consider home prices. So many business decisions are made based on what home prices are doing in a given geography or price point, and more importantly, on what they might be expected to do, given historical performance.

This makes perfect sense. Collateral is the foundation upon which the entire mortgage industry rests. Every loan in a portfolio, every asset backing a security, and every decision a lender or servicer makes all boils down in some fundamental way to the value of the properties behind the mortgages.

Automated valuation models (AVMs) can provide accurate and up-to-date values on nearly any property. However, using them in an ongoing fashion for every asset in a portfolio can be cost-prohibitive for some. This is particularly true when the goal is keeping tabs on total portfolio value in order to react nimbly to changing market conditions.

Enter the home price index (HPI). HPIs have long served as valid analytic proxies for gauging the value of real property in a given geography. By observing an origination value and legitimate arm’s-length transactions, and then marking that value forward with an analytic that leverages nearly every U.S. residential real estate transaction, HPIs provide an accurate approach to valuation at the portfolio level.
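The mark-forward step described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not any vendor’s actual methodology: it assumes a hypothetical index series keyed by ZIP code and month, and scales each loan’s origination value by the ratio of the current index level to the level at origination. All names and figures are invented for illustration.

```python
from dataclasses import dataclass

# Hypothetical index levels keyed by (ZIP code, "YYYY-MM"); values are invented.
HPI = {
    ("30301", "2015-01"): 100.0,
    ("30301", "2018-06"): 125.0,
    ("90210", "2016-03"): 200.0,
    ("90210", "2018-06"): 230.0,
}

@dataclass
class Loan:
    zip_code: str
    orig_month: str    # month the origination value was observed
    orig_value: float  # property value at origination

def mark_to_market(loan, as_of, hpi=HPI):
    """Scale the origination value by the index ratio between as_of and origination."""
    base = hpi[(loan.zip_code, loan.orig_month)]
    current = hpi[(loan.zip_code, as_of)]
    return loan.orig_value * current / base

portfolio = [
    Loan("30301", "2015-01", 200_000.0),
    Loan("90210", "2016-03", 1_000_000.0),
]

# Estimated total collateral value as of June 2018.
portfolio_value = sum(mark_to_market(loan, "2018-06") for loan in portfolio)
```

The point of the sketch is cost: once the index series exists, revaluing the whole portfolio is a lookup and a multiplication per loan rather than a fresh valuation per property.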

That’s why today, many industry participants – lenders, investors, secondary market players, etc. – use HPIs as a completely acceptable alternative to AVMs for marking portfolios to market. But HPIs have continued to evolve as well. While historically HPIs have given broad views of major markets, there has been a push toward greater granularity in the years since the Great Recession began in 2008.

The Black Knight HPI, for example, provides detailed home price information on a monthly basis at the ZIP-code level for approximately 18,000 ZIP codes and at a variety of property price tiers. The greater the granularity and the timelier the data, the more detailed the series can become, and, in some cases, exponentially so.

Of course, as granularity increases, so does the potential for introducing volatility into the index itself, making it critical to achieve the right balance between the two. The beauty of an HPI is the way it takes an incredible amount of data and presents it in a very simple, easy-to-digest-and-process format. Again, certain use cases will always demand the direct manipulation of raw source data – but there are a number of uses for which an index is a perfectly suitable proxy.

The index approach

The logical progression seems clear: we need to take a page from the proven model of home price indices and put it to work in other data-heavy aspects of the mortgage industry. Consider the possibilities. Imagine an index that can match the granularity of an HPI like Black Knight’s, but – instead of looking at home prices – is focused on other critical factors: loan originations or mortgage performance, for example.

Currently, the information necessary to produce such an index is already available in multiple forms. Individual institutions have their own internal data on both originations and loan performance; public records data can be culled for information on new originations; and robust loan-level mortgage performance databases have existed for quite some time.

However, these huge datasets can be incredibly unwieldy. Many organizations struggle with normalizing this data in a way that makes it useful. Others find themselves up against the limits of their own capacity – both human and computational – in working with the sheer mass of information. Add in the confirmation bias that often comes from analyzing an institution’s own internal data and excluding third-party sources, and you get a sense of the difficulty.

To be able to take those many terabytes of individual loan- and property-level data and convert them to an index format, while at the same time retaining the granularity and accuracy afforded by something like the Black Knight HPI, would be nothing less than game-changing. So many of the key factors influencing business decisions in this industry could be much more efficiently and cost-effectively determined.

Imagine a scenario where, in addition to an HPI, an organization also had an originations index – a valid, reliable measurement that could show how originations were performing in a given geography or among a given cohort. Then, add to that a performance index that clearly and accurately shows prepayment and default information, again by geography or cohort. And then, perhaps pull in real estate listing data as well, and add a listing index to the mix to show the level at which houses are being listed.
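To make the idea of an originations index concrete, here is a minimal Python sketch. It assumes hypothetical loan-level records reduced to a state and an origination month, counts originations per state and month, and expresses each count relative to a chosen base month set to 100. The data, the function name and the choice of a simple count-based index are all assumptions made for illustration; a production index would need normalization and smoothing that are not shown here.

```python
from collections import Counter

# Hypothetical loan-level records: (state, origination month). Invented data.
originations = [
    ("GA", "2018-01"), ("GA", "2018-01"),
    ("GA", "2018-02"), ("GA", "2018-02"), ("GA", "2018-02"),
    ("CA", "2018-01"), ("CA", "2018-01"), ("CA", "2018-01"), ("CA", "2018-01"),
    ("CA", "2018-02"), ("CA", "2018-02"),
]

def build_index(records, base_month):
    """Return {state: {month: index level}} with the base month set to 100."""
    counts = Counter(records)  # originations per (state, month)
    index = {}
    for (state, month), n in counts.items():
        base = counts[(state, base_month)]  # Counter returns 0 for a missing key
        if base == 0:
            continue  # no base-period activity, so the index is undefined here
        index.setdefault(state, {})[month] = 100.0 * n / base
    return index

orig_index = build_index(originations, "2018-01")

# Benchmarking one geography against another is then a dictionary lookup.
ga_feb = orig_index["GA"]["2018-02"]
ca_feb = orig_index["CA"]["2018-02"]
```

Once the raw records are collapsed this way, comparing Georgia with California for a given month no longer requires re-querying the loan-level data.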

Decisions are currently being made based on assumptions around home prices, origination volumes, prepayment and default activity, and changes in listing prices – not only at the national level but also in specific cohorts or geographies that interest or impact a given business. Likewise, every day, risk-based pricing decisions are being made based on past performance, and models are being developed in order to make those decisions.

All of these very specific factors can be measured via indices, but so many organizations are instead piecing together different reports from various sources, monitoring items in the news, trying to make sense of market trends from what’s reported, and then making decisions based on it all.

Intuition is no substitute

Even if relying on news reports and various data points produces somewhat accurate results, it does not produce time-sensitive information, nor does it provide the same credibility or assurances as true data-driven decision-making.

For those who don’t feel secure relying on such an approach, there are alternatives, though they can be very costly in terms of both money and time. Analysts must first assemble and double-check massive queries on huge data sets. Then, assuming adequate competence on the part of the person doing the work and the proper computational power, they must wait for the results – all to see what’s going on in a given geography or cohort.

On the other hand, having multiple index series actually requires much less iteration to determine very specific pieces of information. Once you’ve looked at the level of originations in Georgia last month, for example, within seconds you can see California and Florida for benchmarking.

The most basic value of any index is the way in which its historical data can be employed to forecast future movements. Home prices give us a good idea of where we are and how we got here, which allows us to infer, going forward, what the world might look like tomorrow.

Each of these individual indices has significant value on its own, but they really begin to shine when taken together as a suite, with one layered upon another.

Putting an HPI and listings together, for example, provides something of a proxy for a bid-ask spread. There is an implied equilibrium and an implied spread, but the two are not identical. If, for example, the listing index were accelerating while the HPI declines, it would become apparent that an inflection point is likely approaching.
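The layering of a listing index on an HPI can be illustrated with a short Python sketch. It assumes two hypothetical monthly series for a single geography, computes the spread between them, and flags months in which the listing index rose while the HPI fell. That rising-versus-falling test is a simplified stand-in for the "accelerating while declining" signal described above, and the series values and function name are invented for illustration.

```python
# Hypothetical monthly index levels for one geography; values are invented.
hpi = [100.0, 101.0, 102.0, 101.5, 101.0]
listings = [104.0, 105.0, 106.5, 108.0, 110.0]

def divergence_months(ask, bid):
    """Return positions where the ask-side index rose and the bid-side index fell
    relative to the prior month."""
    flagged = []
    for t in range(1, len(ask)):
        if ask[t] > ask[t - 1] and bid[t] < bid[t - 1]:
            flagged.append(t)
    return flagged

# Proxy for a bid-ask spread: listing index minus HPI, month by month.
spread = [a - b for a, b in zip(listings, hpi)]

# Months where the two series moved in opposite directions.
flags = divergence_months(listings, hpi)
```

In this invented data, the spread widens in the last two months as the series move apart, which is the kind of pattern an analyst might treat as a prompt for a closer look rather than as a conclusion in itself.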

Let’s take this a step further. An origination index gives a lender a clear, ZIP-level picture of the current state of refinance originations in Missouri in light of current market conditions. Combining those findings with the HPI also shows where home values are trending up or down. Layering performance index findings on top provides risk-based metrics to help inform decision-making. The information gleaned can help with making intelligent choices about where to target marketing efforts, both for portfolio retention and for potential market expansion.

Change the dynamic

Across the board – in mortgage banking, secondary markets, capital markets, hedge funds and financial services more broadly – vast amounts of resources, in terms of personnel, time and capital, have been leveraged to answer some very essential questions. As the pervasiveness of data has grown, this has become – paradoxically – more, not less, difficult.

Giving these personnel access to key indices can greatly accelerate this analysis, while providing deeper, more valuable insight. In short, analysts who are already dealing with this data would be far more efficient than they are today.

The end result is nothing less than the democratization of data. Taken all together, it is a way to make decisions based upon the collective intelligence of extensive amounts of raw data – using trusted analytics to make sense of it all. This is not a goal we should be looking for somewhere down the road; this capability is available today. The powerful simplicity of mortgage and housing market indices is poised to make a significant impact on the quality of decision-making in our industry.

