Harnessing the power of alternative data

When Facebook meets FICO

Underwriting is an incredibly important and complex piece of the lending industry. The secret sauce of underwriting, or the underwriting methodology, varies from lender to lender and is often a closely guarded secret. Lenders devote a tremendous amount of time and money to hiring professionals and engaging vendors who can provide the right expertise in guiding them to achieve the perfect recipe. 

Increasingly important in underwriting is the use of “alternative data” – a variety of information about a prospective borrower that, while not directly related to credit, has proven to hold great predictive value. Using alternative data means sorting through large volumes of information to find patterns. Accordingly, lenders and data providers are putting the latest advances in artificial intelligence and machine learning to work to get the most out of this data.

The Consumer Financial Protection Bureau recently indicated its consent to the use of alternative data in loan underwriting, thus making a stronger case that the use of such data points is here to stay. Furthermore, it is likely that the use of alternative data will increase as the computer algorithms behind it grow smarter and more powerful.

Defining alternative data

The use of alternative data has been most prevalent in the smaller credit agencies, as opposed to Experian, Equifax, and TransUnion (the Big Three), although TransUnion has emerged as a leader when compared to its competitors. The smaller agencies, such as Clarity Services, CoreLogic and Factor Trust, focus primarily on the subprime consumer credit market, which serves individuals whose files are often thin or nonexistent. For these individuals, traditional credit reports would not provide a useful picture of a prospective borrower’s creditworthiness.

Some products offered by these smaller agencies are still primarily focused on credit data, and others provide specific kinds of alternative data. For example, CoreLogic’s SafeRent offers a product that tracks housing rental history. Milliman’s Intelliscript product focuses on prescription drug history. LexisNexis, among multiple data products, provides data on consumers’ loss history in personal property and auto insurance through its Comprehensive Loss Underwriting Exchange, or C.L.U.E., report.

ID Analytics, which offers reports used for identity verification purposes, is itself representative of the utility of alternative data. The company on its website describes its “ID Network” as “a unique cross-industry repository of near real-time consumer information.” This set of data incorporates disparate sources to, in the company’s example, provide insight into “[whether] an individual with no traditional credit history is low risk because of a great payment record on wireless phones and utilities.” 
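The idea in the company’s example can be illustrated with a minimal sketch. It is not ID Analytics’ method: the `Applicant` fields, the `risk_view` function, and the thresholds (`min_records`, `low_risk_rate`) are all hypothetical and chosen only to show how non-credit payment records might stand in when a credit file is missing.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Applicant:
    # None models a thin or nonexistent traditional credit file.
    credit_score: Optional[int]
    # One boolean per billing cycle: True means the bill was paid on time.
    utility_payments_on_time: Sequence[bool]
    phone_payments_on_time: Sequence[bool]


def risk_view(applicant: Applicant,
              min_records: int = 12,
              low_risk_rate: float = 0.95) -> str:
    """Return a coarse label; thresholds are illustrative assumptions."""
    if applicant.credit_score is not None:
        return "use traditional score"
    # Pool the alternative payment histories into one record.
    history = list(applicant.utility_payments_on_time) + list(
        applicant.phone_payments_on_time
    )
    if len(history) < min_records:
        return "insufficient data"
    on_time_rate = sum(history) / len(history)
    return "low risk" if on_time_rate >= low_risk_rate else "needs review"


# Hypothetical applicant with no credit file but a clean payment record.
thin_file = Applicant(None, [True] * 12, [True] * 12)
print(risk_view(thin_file))
```

A production model would weigh far more signals than a single on-time rate; the sketch only shows the fallback from credit data to alternative data.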

SafeRent, Intelliscript, and C.L.U.E. have a clear role in their respective industries: a landlord can obviously find great utility in a collection of data that indicates the rental history of a prospective tenant. But ID Analytics demonstrates the power of assembling multiple disparate sources and, through data analytics, putting them together to create a greater whole that reveals additional insights – an approach increasingly favored by lenders in both the consumer and business lending spaces.

Machine learning and artificial intelligence

As alternative data provides many new sources of data for a lender or an underwriter to examine, more computing power is needed to sift through and understand it all. This need is filled by machine learning, a subfield within the greater study of artificial intelligence. AI, according to Stanford computer science professor John McCarthy, describes a computer that is able to perceive its environment and perform tasks that achieve goals in that environment. Machine learning focuses on creating computer algorithms that are able to detect patterns and use them to make predictions.

The ultimate ramifications of machine learning on society are unknowable at this point, but the impact is already being seen in the world of lending and credit underwriting. The increase in processing power and reduction in storage costs have created an environment where these algorithms can ingest and process greater amounts of data and, through the learning process, refine their processing efficiency over time.

Since the sets of data lenders use are vast and varied, machine learning is an integral part of turning all those data points into useful underwriting insights. The algorithm, forming new connections between this data and examining it at a deeper level than previously possible, is able to uncover groups of worthy borrowers that were ignored in the past.


Lenders’ use of alternative data in conjunction with traditional credit data

On September 14, 2017, the CFPB granted its first no-action letter to Upstart, a personal lender that offers unsecured loans ranging from $1,000 to $50,000, with three- to five-year terms and for varied purposes including student loans and credit card refinancing. Upstart sought this no-action letter with regard to its underwriting model. Upstart’s description of the methodology behind this model in its request document to the CFPB reads as a prototypical example of the process and benefits of alternative data’s use in the lending space:

By relying exclusively on the credit report and traditional modeling techniques, lenders ignore some of the most predictive information about potential borrowers… In Upstart’s view… traditional credit scores are simply one good predictor of loan repayment, and it believes that underwriters should use other variables as well. 

…By complementing (not replacing) traditional underwriting signals with others that are correlated with financial capacity as well as propensity to repay a loan, Upstart’s underwriting properly understands and quantifies risk associated with all borrowers— those with credit history, and those without. In the three and a half years since launching the loan products on the Upstart platform, our model has demonstrated strong performance and has improved across model versions.

Upstart’s use of alternative data includes examining borrowers’ educational background, job history and field of employment. This favorable evaluation by the CFPB comes at a time when many other lenders are integrating alternative data into their own underwriting. The CFPB’s decision to issue the no-action letter was received favorably by the industry, though some were disappointed that the letter did not provide more explicit guidance to the industry at large. 

Kabbage is one such lender. In addition to traditional credit data, it uses sources such as information about a borrower’s product shipments and social media. Kabbage has been successful in licensing its underwriting platform – which integrates the multiple, disparate sources of information into a holistic picture of a prospective borrower – to large banks such as ING and Santander.

Tala (formerly InVenture) makes small loans to borrowers in Kenya, Tanzania and the Philippines using the data found on a borrower’s mobile phone. A user’s mobile phone can provide enough information for identity verification, evaluation of more traditional credit data points like debt-to-income ratio, as well as alternative information such as other applications the borrower uses, location data and the borrower’s social network. Tala is a prime example of the utility of alternative data in evaluating customers’ creditworthiness outside of the traditional credit report context.

Many of the borrowers who obtain loans through Tala would undoubtedly have a thin or nonexistent file as viewed by a traditional lender using credit data alone. Tala’s inclusion of a variety of data into its underwriting has allowed it to make loans to a population that would likely be ignored by a traditional credit data-focused lender, with impressively high repayment rates. 

Kabbage and Tala use alternative data to achieve somewhat different ends. In Kabbage’s case this is a more holistic picture of applicants who nevertheless belong to a class already served by the lending industry, but who can benefit from the extra assurance provided to the lender by Kabbage’s underwriting model. Tala seeks to serve an entirely new borrower population, one likely to be ignored by an underwriting model focused solely on credit. Both, however, integrate their alternative data points with traditional credit inquiries, creating a new whole greater than the sum of its parts.

Market research indicates that many other companies seek to integrate alternative data for similar reasons. Factor Trust’s 2017 study of lenders and financial service providers found that the top three reasons lenders were looking into alternative data were expanding the universe of eligible borrowers, re-evaluating previously rejected borrowers, and making risk-based pricing more accurate. Factor Trust noted that these goals are best achieved by combining traditional with alternative data.   

The major credit analytic firm FICO appears to agree. In an August 29, 2017 blog post about the use of alternative data, “Using Alternative Data in Credit Risk Modeling,” FICO stated its findings that alternative data sources add predictive value when combined with traditional credit data. FICO’s research found that alternative data on its own is less valuable than traditional credit data for underwriting purposes, but the model created when alternative data was properly combined with traditional data was more predictive than traditional data alone. 


Risks of alternative data

As with all new technologies, the risks of alternative data must be kept in mind when considering its transformative power. 

One particularly salient risk, the one that inspired Upstart to seek its no-action letter from the CFPB, is possible discrimination. The heavy presence of machine learning in the world of alternative data and credit decisions means possibly transferring certain decisions from human to machine responsibility, and the unsettling possibility exists that the algorithms may focus on certain data points – or combine them in an unforeseen manner – ultimately resulting in disparate impact.
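One simple way to monitor for this kind of outcome is to compare approval rates across groups. The sketch below is illustrative only: the group labels and sample decisions are made up, and the 0.8 cutoff is a screening heuristic borrowed from the “four-fifths” rule of thumb used in employment contexts, not a legal standard under fair lending law. A low ratio is a prompt for closer review, not proof of discrimination.

```python
from collections import defaultdict


def approval_rates(decisions):
    """decisions: iterable of (group_label, approved) pairs."""
    totals = defaultdict(int)
    approved = defaultdict(int)
    for group, ok in decisions:
        totals[group] += 1
        if ok:
            approved[group] += 1
    return {group: approved[group] / totals[group] for group in totals}


def adverse_impact_ratios(decisions, reference_group):
    """Each group's approval rate divided by the reference group's rate."""
    rates = approval_rates(decisions)
    reference_rate = rates[reference_group]
    if reference_rate == 0:
        raise ValueError("reference group has no approvals")
    return {group: rate / reference_rate for group, rate in rates.items()}


# Hypothetical model decisions: group A approved 8 of 10, group B 5 of 10.
sample = (
    [("A", True)] * 8 + [("A", False)] * 2
    + [("B", True)] * 5 + [("B", False)] * 5
)
ratios = adverse_impact_ratios(sample, reference_group="A")

# Flag groups whose ratio falls below the assumed 0.8 screening threshold.
flagged = [group for group, ratio in ratios.items() if ratio < 0.8]
print(ratios, flagged)
```

Because a model’s inputs and applicant pool change over time, a check like this would need to be rerun regularly rather than performed once at launch.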

Upstart, in its request document, limited the scope of its request to compliance with the Equal Credit Opportunity Act and its implementing regulation, Regulation B. Upstart cited uncertainty about the evolution of its underwriting model and changes to its pool of applicants over time as the reason for its request.

General privacy concerns will also be important for companies to mitigate as alternative data grows in popularity and use. Even when a company’s actions comply with the Fair Credit Reporting and Gramm-Leach-Bliley Acts, consumers may still face a psychological hurdle as more and more of what they do becomes a factor in their credit decisions. Calls for a European-style privacy regime of “opt in” rather than “opt out” may become more commonplace as more consumers become aware of the increasing number of data points that define them.

This is particularly the case for social media. Kabbage’s use of social media includes the option for borrowers to link their business’s social media accounts to their Kabbage account. Though linking an account is a predictor of a lower likelihood of default, Kabbage does not consider other social media data points like followers, posts or content.

The August 2017 FICO blog noted that data from one’s social network profile holds little value because it is not as strongly connected to credit as other pieces of alternative data, and also because it is modifiable by the user. Nonetheless, as the algorithms get stronger and faster, valuable patterns may emerge from this now underappreciated data, and the privacy and regulatory issues of incorporating this data would need to be addressed.

Facebook itself has made what could be categorized as attempts to enter the world of alternative lending data. The Wall Street Journal reported that in 2014, Facebook executives met with several lenders and data providers, seeking to determine whether Facebook data could be of use to these firms in underwriting. 

Following that, in 2015, Facebook acquired a patent for assessing creditworthiness based upon the credit ratings of connections in the applicant’s social network. The regulatory hurdle is the most cited reason for Facebook’s continuing failure to move forward in this area.  The Federal Trade Commission, for example, has issued guidance stating that a company like Facebook providing data used by other firms to underwrite or make credit decisions would turn that company into a credit reporting agency pursuant to the Fair Credit Reporting Act. It is likely that Facebook simply does not want this regulatory headache.

It is interesting to consider what similar firms have done in other jurisdictions where the regulatory obligations are different. Baidu, China’s large search engine, has a financial services arm that, among other operations, makes unsecured consumer loans. Some of China’s other large tech firms – like Alibaba and Tencent – also have moved into the consumer financial services space. The Wall Street Journal attributes this move to the amount of data they collect and their ability to analyze this data in new ways to underwrite loans, such as analyzing a user’s search history or the online videos they watch. 

While it may be difficult for American consumers to imagine obtaining a loan from Facebook or Google, the Chinese companies’ solution may translate to a similar American firm in possession of a great deal of customer data. If Facebook is hesitant to incur the regulatory obligations of providing data to other companies, those obligations might be lighter if it made the loans itself and kept the data in-house.

FCRA obligations are largely limited to either providing one’s own data to another entity (which makes the provider a credit reporting agency under the law) or using reports provided by another entity (which creates the obligation to provide consumers with adverse action notices, for example). The FTC, in a 2016 report on “Big Data,” even stated clearly that the FCRA does not apply when a company is using its own data about its customers for purposes of making decisions (such as credit decisions) about them.

Assuming the credit utility of their data becomes apparent, it is not difficult to imagine Facebook or Google forming or acquiring a lending arm and leveraging their vast stores of customer data to compete with other lenders in the space.

Moving forward

While the regulatory risks are rightfully on the minds of legal and compliance professionals in the lending and underwriting industries, the use of alternative data has already proven useful to a variety of firms and there is no reason to expect that it will stop anytime soon. Incorporating data outside of the traditional credit data points has allowed lenders to better understand their current universe of borrowers and to make loans to them in a more educated manner.

More importantly, users of alternative data can widen the group of eligible consumers for their products, particularly consumers who have traditionally been underserved. As concepts like “underbanked” and “unbanked” gain more and more recognition, lenders will increasingly turn to alternative data to capture this population.

The recent high-profile data breach at Equifax, and the related skepticism about the traditional power and role of the Big Three, may even be setting the stage for more widespread adoption and acceptance of alternative data. It is clear that, for lenders, examination of credit data alone will not be enough to compete.
