What is the difference between alternative and behavioral data, and how widely are they used in fintech and other verticals today? On this week’s State of Identity podcast, host Cameron D’Ambrosi and Michele Tucci, Chief Strategy Officer & MD of Americas at credolab, discuss how alternative data with AI & ML algorithms can promote greater financial inclusion and improve lenders’ profitability by better understanding their customers.
Cameron D'Ambrosi, Senior Principal at Liminal
Michele Tucci, Chief Strategy Officer & MD Americas at credolab
Cameron D’Ambrosi [00:00:04] Welcome to State of Identity. I’m your host, Cameron D’Ambrosi. Joining me this week is Michele Tucci, chief strategy officer and managing director of the Americas at credolab. Michele, welcome to the podcast.
Michele Tucci [00:00:17] Hello, Cameron. Thank you for having me.
Cameron D’Ambrosi [00:00:20] It is my pleasure. You know, I think this conversation should be a fantastic one. I think you guys are, you know, at credolab, sitting right at a really interesting intersectionality of some of the trends we’re seeing really deeply impacting the identity landscape. But before we get into all that, I think it would help to lay down a little baseline, as it were. How did you come to get into the digital identity space? Would you mind giving us just a quick hit on your background and some of that experience you brought to the team at credolab before you joined?
Michele Tucci [00:00:56] Yeah, it’s been a long, long ride, and it started with Capital One some 20 years ago, I would say, in Italy. Then it went on to companies like Mastercard and some mobile payments startups, one of which was not very successful, but that’s also part of the learning curve. I then landed at a private equity fund that invested about $30 million to find, scout or develop new technologies that would leverage smartphone metadata or telco data to see how we can improve the assessment of an identity, or verify income, or even verify the creditworthiness of people. And then I joined credolab about five years ago. So I’ve been with them for five of the seven years of credolab’s life, and it’s been quite a journey, quite a ride. When I joined the company, we were analyzing very limited data points, about 10,000. Now we analyze about 70,000 data points. We were turning them into about a thousand features. Today we have 10 million features. So the space is really exciting, and I believe we are at the forefront of biometrics, behavioral biometrics applied to risk and fraud.
Cameron D’Ambrosi [00:02:37] Fantastic. And, you know, credolab specifically, why I think this episode is so interesting is that, to me, this is an application of the type of, you know, identity graphing or probabilistic applications of identity data to unlock new use cases. That, again, is really at the forefront of many of the trends we’re seeing shaping the landscape at a 15,000 foot level. You know, what’s your elevator pitch for the credolab platform? And then we can start kind of peeling back some of those onion layers and getting to the heart of some of those more detailed questions I have for you.
Michele Tucci [00:03:17] Yeah, I mean, think about the way you use your smartphone. Your life is in your smartphone, and we have developed a platform that captures, processes and analyzes your digital footprint, the way you use your smartphone, and compares it with the way delinquent customers use it, or risky customers or fraudulent customers. And so we take data from everyday usage and turn it into insights that any financial institution can use to improve the way they run their own assessments, their own verifications, and the way they onboard customers. We also have a solution for web, where we look at the way people type, their keystroke patterns and cadence, how they hesitate before submitting an application for a loan, how many times they change their income field, for instance. So we look at a number of behavioral patterns and then we identify those that are predictive for a particular outcome.
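To make the web signals mentioned here concrete (typing cadence, hesitation before submitting, number of edits to the income field), the following is a minimal sketch of how such features could be derived from a form event log. It is not credolab's implementation. The event schema (`t` in milliseconds, `type`, `field`), the function name `extract_features` and the feature names are all assumptions made for illustration.

```python
from statistics import mean


def extract_features(events):
    """Derive simple behavioral features from a hypothetical form event log.

    Each event is a dict with 't' (timestamp in ms), 'type' (for example
    'keydown', 'field_change' or 'submit') and an optional 'field' name.
    This schema is an assumption, not a real SDK format.
    """
    events = sorted(events, key=lambda e: e["t"])

    # Typing cadence: average gap between consecutive key presses.
    key_times = [e["t"] for e in events if e["type"] == "keydown"]
    gaps = [later - earlier for earlier, later in zip(key_times, key_times[1:])]

    # How many times the applicant changed the income field.
    income_changes = sum(
        1
        for e in events
        if e["type"] == "field_change" and e.get("field") == "income"
    )

    # Hesitation: time between the last interaction and the first submit.
    hesitation = None
    submit_times = [e["t"] for e in events if e["type"] == "submit"]
    if submit_times:
        first_submit = submit_times[0]
        earlier = [
            e["t"]
            for e in events
            if e["type"] != "submit" and e["t"] < first_submit
        ]
        if earlier:
            hesitation = first_submit - max(earlier)

    return {
        "mean_key_gap_ms": mean(gaps) if gaps else None,
        "income_field_changes": income_changes,
        "hesitation_before_submit_ms": hesitation,
    }


if __name__ == "__main__":
    sample_events = [  # toy data, purely illustrative
        {"t": 0, "type": "keydown", "field": "name"},
        {"t": 100, "type": "keydown", "field": "name"},
        {"t": 300, "type": "keydown", "field": "name"},
        {"t": 400, "type": "field_change", "field": "income"},
        {"t": 900, "type": "field_change", "field": "income"},
        {"t": 2900, "type": "submit"},
    ]
    print(extract_features(sample_events))
```

Which of these features, if any, is predictive would have to be established against a lender's own outcome data, as the discussion goes on to describe.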
Cameron D’Ambrosi [00:04:32] So a lot of follow up questions, obviously, and excited to keep diving deeper, but I guess to set that baseline. So how are the ways in which your clients are consuming the platform? It sounds like you offer kind of an SDK that can run at an app level as well as an API that can be consumed for web applications. Is that a fair assessment?
Michele Tucci [00:04:56] Yeah. That is spot on.
Cameron D’Ambrosi [00:04:58] Perfect. And when we talk about credit in some ways, right, we had these traditional paradigms, which were, okay, you have the big credit bureaus and you have the traditional credit header data, and we’re all, I think, especially among this audience, well aware of some of the shortcomings and challenges of using that data when you’re trying to onboard, you know, different types of customers that might not be well represented in those data sets. We’ve seen the, you know, the burgeoning of what we’d call, you know, these so-called alt credit platforms, which is a term that I think to some degree has irked me. I think, you know, as far as I’m concerned, it’s all credit data. It’s just, you know, breaking down exactly how you are making those assessments is maybe the more interesting lens to view it through, as well as, you know, from a compliance perspective, whether it is Fair Credit Reporting Act, or FCRA, compliant or not. How do you view the current landscape and, you know, kind of your competitive peers? It sounds like, you know, you guys really think that you can draw an accurate bead on people and really allow these institutions to unlock more value from the data that they might already have on hand in many ways.
Michele Tucci [00:06:14] Yeah. So we don’t see credit bureaus as competitors, actually. We encourage our own clients to ingest the credit bureau data whenever it is available. And when we look at it from a US point of view, then yes, the main credit bureaus, they have resale codes, they have had data for a long time. They can deliver value, and they do, however, not for everybody. And so even myself, as a white collar immigrant into the U.S., I didn’t have a credit bureau score until just a few weeks ago. So initially I had to resort to an alternative lender to have a credit card that traditional lenders wouldn’t issue to me, regardless of my income or perhaps even education. So we help unlock value for banks and traditional lenders alike, buy now, pay later, for instance, or also credit builder type of apps, whenever they are going after customers that are thin files by definition. So these could also be not necessarily risky customers, but just people that have not been part of the mainstream credit ecosystem long enough to generate a solid, thick file. So imagine millennials or young professionals, they are thin files by default, or people who just don’t like credit as a payment method and prefer debit, so they don’t generate data that is then consumed by the credit bureaus to generate their own credit reports. So we come in by complementing what the credit bureaus have, by supplementing what the lenders ingest in terms of data, to make a more rounded assessment of that individual. So we look at behavioral data that usually has very little correlation with traditional data, so credit bureau data or open banking kind of data. In the U.S. you have Plaid, for instance. So the correlation between transactional data out of your bank account and the behavioral data that credolab processes is usually below 10%, which means that when you combine these data together into the general model, you’re looking at assessing the same customer from different points of view.
And so one example I like to give is, Cameron, imagine that you and I have the same salary, and we bank with the same bank, and we even have the same credit score. So from a traditional way of looking at our applications, the bank will be making the same conclusion. However, that conclusion is based on the affordability check, right? Can I repay the loan amount I’m asking for? Do I have enough savings to repay the loan amount? But from a behavioral point of view, Cameron wants to repay, and instead Michele is not willing to repay. So this is the additional layer that behavioral data allows you to add to the assessment. So it’s about discriminating between people that, despite having the means, may not be willing to repay, and people that, even though they may not have enough data in the traditional sense of the word, are still good-behavior, good repayers.
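The "below 10%" remark can be read as saying that a behavioral score and a transactional score are nearly uncorrelated, so each carries information the other lacks. The sketch below shows how such a check could be run with a plain Pearson correlation. Reading "below 10%" as an absolute correlation under 0.10 is an assumption, and the function names and the threshold parameter are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def is_complementary(xs, ys, threshold=0.10):
    """True when the two scores are only weakly correlated.

    The 0.10 default mirrors the "below 10%" figure from the conversation,
    interpreted here (as an assumption) as |r| < 0.10.
    """
    return abs(pearson(xs, ys)) < threshold


if __name__ == "__main__":
    # Toy scores for four applicants, purely illustrative.
    transactional = [1.0, 2.0, 3.0, 4.0]
    behavioral = [1.0, -1.0, -1.0, 1.0]
    print(pearson(transactional, behavioral))
    print(is_complementary(transactional, behavioral))
```

A low correlation alone does not show that the second signal is predictive. It only shows that, if it is predictive, it is not simply repeating what the first signal already says.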
Cameron D’Ambrosi [00:10:12] That’s fantastic. You know, there is that all-important regulatory lens when we start talking about, you know, how you can leverage these data insights as a lender, as a financial institution. Do you guys offer Fair Credit Reporting Act compliant products, or is this intended to be used in a non-FCRA-compliant way by your customer platforms?
Michele Tucci [00:10:38] You know, that’s a good question, and it is one that we receive every time we talk to a new prospect. credolab, as a company, is a technology provider. We are not a CRA, a credit reporting agency, and we are not a furnisher of data. So the FCRA doesn’t apply to credolab. However, we do recognize that the FCRA does apply to a bank, to a lender, and therefore for decisioning purposes they need to be using data that can be explained to the user, especially in the case of an adverse action notice. So imagine some of the data points we collect are related to selfies. Can we reject the customer by saying you have taken too many selfies and that’s why you are denied credit? Obviously not. However, from a pure analytical, statistical standpoint, logistic regression tells us that that behavior is predictive for risk. Why? Because it’s common to all defaulters. So we focus on top of the funnel verifications. We have a 100% hit rate, so with our data, every lender can score any incoming application, and we look at early defaults. So the score, although in other countries it’s used for risk assessment and for decisioning, in the US, because of the FCRA, we don’t recommend using for decisioning, but more for early fraud detection. So we focus on detecting first party fraud, people that are applying for a particular loan or credit card, but who are highly likely not to repay, not even the very first time. So these are so-called nonstarters, people who apply, go through the process, are approved, but then disappear. They get the money and they don’t pay back. So this is a very particular type of fraud, almost at the intersection between credit risk and fraud, right? If you are able to filter out bad actors, then of course your risk profile will improve and your probability of default will also improve. 
So that’s where we focus in the US: on top of the funnel verification for 100% of the incoming customers through mobile or web, aimed at detecting early fraud.
Cameron D’Ambrosi [00:13:32] That’s fantastic. So talk a little bit more about how, you know, your customers are adding value to those existing data sets. You know, I think that’s a really interesting and compelling point you made, which is that you are not, you know, in and of yourselves furnishing data points. You’re almost, right, adding a computational toolkit to these platforms to allow them to make value out of the data sets that, you know, basically they already have. It’s almost, you know, found spoils, if you will. They’re unlocking value from things that basically they’d be getting no value from previously.
Michele Tucci [00:14:12] So imagine we are processing data that is privacy consented, that is already available in the smartphone of that particular applicant, and we receive the consent to process it for fraud detection purposes. So then we process this data, we put it through our proprietary feature engine and a modeling pipeline, and then the output is a score. Now the score is optimized by using each individual lender’s outcome data. So we look at behaviors that are predictive for that particular outcome that is peculiar to each individual client. And in this sense, also, there is no way for an individual user to claim that they have been, how do you say, rejected for wrong reasons because the CRA had reported wrong data. There is no way that an end user can correct the digital footprint they have generated by themselves. So this is also one of the reasons why we are not a furnisher of data, but we do provide, as you very well put it, a statistical insight that helps our clients improve their ability to detect and reduce false positives and false negatives.
Cameron D’Ambrosi [00:15:50] And, you know, from that data collection standpoint, I think this is an area of intense interest for platforms similar to yours, which is, you know, ongoing access to these data elements and constriction, I guess, shall we say, of that ecosystem, whether that’s due to regulatory constraints or tech platforms, just deciding, you know, whether it’s Apple or whether it’s Google or whether it’s Microsoft, you know, narrowing access to device level data and to kind of, you know, browser level signals. How are you guys coping with some of those restrictions and what are your thoughts for the future of how you’re going to continue to work alongside, you know, those hardware and software vendor platforms to make sure that you have access to those data bits that you need to kind of make this thing fly.
Michele Tucci [00:16:40] Yeah. So, well, first of all, we don’t process personal data. So whenever we access data on a mobile or a website, we don’t know whether that is Michele or Cameron. We just know that one device came through the door of that particular bank, and we are assessing that device. Also, we don’t collect data in an ongoing manner. The data is collected only upon loan application, for instance. It depends on the use case, but let’s say onboarding for a particular unsecured lending product, that’s when we collect data. So one time only. The SDK doesn’t run in the background. It’s not checking every minute or every day what you are doing. It does so only when the SDK is triggered by the client of credolab. So it is the bank that decides when to trigger the SDK, one time only, not persistently. So it’s not an ongoing kind of process that tracks whatever you do online. Not at all. Also, we don’t process cookies, which are going away. We don’t process IP address, or device IMEI, or mobile advertising ID. If you can think of other digital identifiers, be reassured that we don’t process any of those. So our datasets really are a list of binary information. Have we seen this on the mobile, yes or no? How fast was the client typing? How many UI interactions were done in the app or online? So this is the kind of data that we collect. There is nothing personal about our data sets at all.
Cameron D’Ambrosi [00:18:40] The breadth of applications for these types of probabilistic capabilities that you have, I think, is another area I’d love to unpack. And, you know, if I may ask you about kind of the vision in the future for the platform. You know, we at Liminal have really been big proponents of, you know, banks, well, basically any platform that’s dealing with identity, thinking about identity in a holistic way, you know, across the lifecycle, the notion that how you are thinking about your customers in terms of who you’re marketing to and how you’re marketing to them should feed into how you’re onboarding them, how you’re risk scoring them, how you’re facilitating those ongoing customer interactions after the person has become a customer. You know, how are your customers talking about post-onboarding or pre-onboarding use cases for your platform and really, you know, again, putting identity kind of at the center of their stack, as opposed to maybe previously, where it’s lived on the periphery?
Michele Tucci [00:19:39] Yeah. So the main use case really is onboarding. The other one is portfolio management. So once the customer has been approved, how to optimize the way that a credit line increase is allocated, for instance, or how to detect signals, behavioral signals, that could be used to prevent churn or prevent customers from becoming delinquent, for instance, six months after the application. So there are a number of behavioral signals that can be used to optimize portfolio management. But I have to say the main use case still today is for onboarding purposes. So we are an embedded scoring technology. We are part of the frontend mobile app or website. And perhaps interestingly for you, historically, we are a Singapore headquartered company, although today we have offices in Miami and London as well, and clients in about 41 countries. So we are truly a global organization. I have colleagues in 17 countries, not 17 U.S. states. So on what we do, we came up with a mobile first solution for onboarding. We’ve been working on our mobile SDK for seven years, and I believe today, without false modesty, that we are the only one that truly understands digital footprints on Android and iOS. We have even added a behavioral module to the Digital Footprint Module of our Android and iOS SDK, where I believe today we are the only technology provider in the space that uses UI interactions, in-app gestures and keystroke patterns to detect behavioral patterns that are correlated with risk of fraud. So all through an Android and iOS SDK. Now, perhaps I digressed a bit. My point was that we started on mobile and have gradually expanded. In 2020 we launched our web solution. Coming to America, we found a market where originations are still today largely done through websites, through online, perhaps mobile responsive websites, but still websites.
So we believe that U.S. lenders and banks are missing out on a lot of data that they could be crunching in a privacy consented way, even in an ECOA, or Equal Credit Opportunity Act, compliant way, simply because they are still originating loans and cards through websites. We encourage them to use mobile simply because it allows access to a lot more data that is also privacy consented and that is also quite predictive. So we know in the US perhaps buy now, pay later players are focusing on apps more than banks, and credit builder kind of solution providers are also originating customers through mobile apps, and these are the ones that have understood earlier than others the power of alternative data to improve the way they verify identities, the way they detect fraud, and also the way they mitigate risk.
Cameron D’Ambrosi [00:23:32] So for folks and this is something I like to call shameless plug time for our audience members who are listening and are intrigued to learn more about the platform, learn more about the capabilities, or get in touch with you to explore what is the best place for them to go? How should they go about doing that?
Michele Tucci [00:23:50] So our website is credolab.com. They can reach out to me as well to try and connect, and we can do a demo, we can walk them through best practices that we are importing into the U.S. from other markets, global markets. And we can also consult them on the way to optimize onboarding so that they even optimize that waterfall of verifications. So I mentioned this idea of verifications a few times because it’s a real pain point. In most cases, people are rejected because of lack of data. And when you look at the onboarding process from a back-end kind of view, not from a user’s point of view, then you see that you will likely be assessed based on your email verification, your income verification, your ID verification, and then a soft pull and perhaps at some point a hard credit bureau pull as well. So the allocation of these verifications along the onboarding journey typically changes depending on the availability of data, but also depending on the cost of those verifications. So let me give you an example. Are you going to do the KYC for all your incoming applications? Most likely not, for two reasons. One is friction, and the other one is cost. KYC verifications are expensive and they quickly add up in terms of unit economics. So when you have a dataset that allows you to filter out obviously fraudulent customers based on the statistical evidence that we have, for instance, at credolab, then you can also save money down the funnel by avoiding verifications that the delinquent customers don’t need to go through, because they should be stopped early on in the process. And then perhaps you can add friction to the applications that are maybe borderline, so they need more data. And we encourage using as many data sources as possible, as long as they add value in terms of predictive power, and also the cost is right. 
So my point is that if finally there is a source of data that is available for every incoming application through mobile and web, let us not look at it only from a “how much does the credolab score cost” point of view. The bigger business case is how much am I going to save if I don’t call other providers for clients that are likely to be rejected down the funnel anyway? And also, when you combine the behavioral assessment with the affordability checks and with credit bureau checks, then you truly have a 360 degree assessment of that particular individual.
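The business case described here, paying for a cheap early screen so that fewer applications reach the expensive checks, can be written out as simple arithmetic. The sketch below is only an illustration: the function name `waterfall_cost`, the step costs and the pass rates are hypothetical numbers chosen for the example, not figures from the conversation.

```python
def waterfall_cost(n_applications, step_costs, pass_rates):
    """Total spend on a sequence of verifications run as a waterfall.

    Each step is paid for every application that survived all earlier
    steps. step_costs[i] is the per-application cost of step i and
    pass_rates[i] is the fraction of applications that pass it.
    """
    if len(step_costs) != len(pass_rates):
        raise ValueError("step_costs and pass_rates must have the same length")
    total = 0.0
    remaining = float(n_applications)
    for cost, rate in zip(step_costs, pass_rates):
        total += remaining * cost
        remaining *= rate
    return total


if __name__ == "__main__":
    applications = 1000

    # Hypothetical baseline: KYC check, then a credit bureau pull.
    baseline = waterfall_cost(applications, [2.00, 1.00], [0.90, 0.80])

    # Same funnel with a hypothetical cheap behavioral screen in front
    # that stops 20% of applications before any paid verification.
    with_screen = waterfall_cost(
        applications, [0.10, 2.00, 1.00], [0.80, 0.90, 0.80]
    )

    print(f"baseline spend:    {baseline:.2f}")
    print(f"with early screen: {with_screen:.2f}")
    print(f"saving:            {baseline - with_screen:.2f}")
```

Whether the screen pays for itself depends on its price, how many applications it stops, and how often it stops applicants who would have been good customers, which this sketch does not model.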
Cameron D’Ambrosi [00:27:19] Amazing. Well, thank you so much for your time again. You know, I really think that you guys are right over the target in terms of a lot of the trends we see around the use of probabilistic data and its tremendous value and, you know, critical importance in supplementing, you know, those deterministic data sets that, again, can’t really touch all of the types of customers that platforms need to be able to reach, especially as, you know, we’re moving into an era where young people are going to have increasing buying power, and the folks who the traditional credit bureaus were strongest in being able to assess, you know, not to be macabre about it, continue to die off and be replaced with younger consumers for whom these technologies can have so much more impact. Thank you again for your time. Really appreciate it. We’ll include those links to your website in the show notes below, for anyone who missed shameless plug hour and needs a little help finding their way.
Michele Tucci [00:28:20] That sounds like a plan. Thank you, Cameron.