The Pathways of Data Integration

Episode 275

05/12/2022

Where is the biggest AI bottleneck and what is the next foundational shift in AI? On this week’s State of Identity podcast, host Cameron D’Ambrosi welcomes Dr. Eric Daimler, CEO & Co-Founder of Conexus AI, to dive into data integration and consolidation. They break down the limitations of AI and look at regulatory headwinds around the development and deployment of AI technologies.

Host:

Cameron D'Ambrosi, Managing Director at Liminal

Guest:

Dr. Eric Daimler, Chair, CEO, & Co-Founder at Conexus AI

Cameron D’Ambrosi [00:00:07] Welcome everyone to State of Identity. Joining me this week is Dr. Eric Daimler, chair, CEO and co-founder at Conexus AI. Eric, or I should say Dr. Daimler, welcome to State of Identity.

Eric Daimler [00:00:21] Eric. Eric’s just fine, but thank you. It’s good to be here.

Cameron D’Ambrosi [00:00:24] Really excited for the conversation. I think, you know, your deep subject matter expertise in artificial intelligence is so, so relevant to so many of the trends that we touch on across the breadth of topics here on State of Identity. So really excited for our chat. But before we dive deep into the weeds, we’d love to hear just a little bit about your background. You know, you have, according to my cheat sheet here, over 20 years of experience in the field. I always love to ask my guests, you know, about their career journey, some of those formative early experiences that led them down the path, culminating in their appearance here on the podcast. So would you mind walking us through a little bit of those 20 years of experience and how you came to co-found Conexus?

Eric Daimler [00:01:14] Sure. I like you saying it’s an appearance, even though there’s just an audio track. You know, if anybody knows my name, to the extent that they would, it’s often because of the time that I spent during the Obama administration as an AI authority in the group colloquially known as the science advisory group to the president. I was really fortunate to work with a really great group of dedicated Americans just fulfilling on what we saw as the future of AI. Speaking humbly on behalf of the President, coordinating the efforts of the rest of the executive branch, State, Defense, Energy, Transportation, around what the future of AI would hold for the United States and its allies. I hope to do that again someday. That was not necessarily a culmination of a career, but it certainly added yet another perspective to my set of experiences around AI, where I started as a researcher (University of Washington in Seattle, Stanford, Carnegie Mellon, where I was on the faculty), went on to being a venture capitalist on Sand Hill Road, and did the company thing as an entrepreneur a number of times with some good exits. So I’ve experienced AI from a number of different perspectives. I’ve been fortunate enough to have got a good start by choosing good parents and, you know, engaging in the discussions around engineering really early in my life. I hope to have that continue and to continue to contribute in any way I can be most useful in the years to come.

Cameron D’Ambrosi [00:03:06] And double tapping on Conexus, you know, the, I guess, critical question or critical need that you saw that needed to be addressed or answered in the market. You know, can you speak to what that founding story was like and what the impetus for developing the Conexus platform was?

Eric Daimler [00:03:27] There’s a lot to say about the value that Conexus provides, because Conexus really is defining a new epoch, we’ll say, in the foundation of IT infrastructure. It was clear, from my perspective working in the U.S. federal government, that everybody’s gotten the memo about data being the new oil. We collect a lot, big data and all that, which is really becoming somewhat of an old term. What’s less appreciated is that the number of data relationships is also increasing exponentially or, more precisely, quadratically. The result is that data relationships are the real new currency. We have an example with Conexus client Uber, where they grew up paying attention to their business. They grew up then by jurisdiction, by city. They then wanted to answer ordinary business questions, supply and demand sort of questions, and found that to be cumbersome. They looked at all the commercial solutions available to integrate this data. They realized that the solution was not in computer science, but instead in math, at a level deeper than the computer science based solutions. At the time, they found this branch of mathematics called categorical algebra, or category theory. And in looking for the leaders in categorical algebra, even though they’re in the backyard of Stanford, they came to us. We happened to be about 40 miles north of them. We then worked with Uber to consolidate these 300,000 databases so that they could ask these ordinary business questions: hey, how does New York’s driver supply look for this upcoming week, given that there’s going to be a big event? Or how does rider demand look in Washington, D.C.? And to do that without these statistical comparisons that introduce some degree of friction, inaccuracy and time. We worked with them over a number of months and, to have them tell it, they save over $10 million a year with the alacrity with which they’re able to answer these basic business questions and respect the privacy lattice of driver’s licenses or license plates. That’s what Conexus provides: this type of composability of data and data models that’s unavailable in any other solution.
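
The quadratic growth mentioned here can be made concrete with a small sketch. It assumes, purely for illustration, that every pair of databases is a potential relationship, so n databases give n(n-1)/2 pairs; the function name `pairwise_relationships` is hypothetical and nothing below describes how Conexus or Uber actually counted.

```python
def pairwise_relationships(n: int) -> int:
    """Number of distinct unordered pairs among n data sources.

    Illustrative assumption: every pair of sources is a potential
    relationship, so the count is n choose 2.
    """
    return n * (n - 1) // 2


# The number of sources grows linearly; the number of pairs grows quadratically.
for n in (10, 1_000, 300_000):
    print(f"{n:>7} sources -> {pairwise_relationships(n):>14,} potential pairs")
```

Under that assumption, the 300,000 databases cited in the conversation would have roughly 45 billion potential pairwise relationships, which is why integrating pair by pair becomes impractical.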

 

Cameron D’Ambrosi [00:05:53] That’s super, super interesting. And, you know, I know you brought up, and maybe this is a bit of a sidebar, but I always think it’s a fun topic, this notion of data as the new oil. I think it’s a very apt analogy in some ways, maybe less apt in others. But I like to play with it in the sense that, you know, beyond how I think folks have interpreted this phrase, which is that it’s this resource and people can extract it from their platforms and monetize it, and that in some ways it’s a lifeblood of the modern economy in the same way that petro fuels have been, I think you can carry that out in terms of some of the negative externalities that abuse and over-collection of data create. You know, similar to owning a property that has a giant lake of oil on it. Well, that’s all well and good. But if it starts leaking out, you are now on the hook for the cost of cleaning up when it spills and, you know, ends up poisoning a bunch of people or creates these negative externalities. And it sounds like you really envision Conexus helping in some ways to mitigate the toxicity or the negative side effects that might result from enterprises over-collecting data. And to your point, these kinds of inferences or second-level data that you can create about relationships between existing data points can help them maximize the utility they’re getting from the data they’re already collecting, without maybe having to go too far and start veering into the territory of: we’ve collected too much data, and now this has ceased to be an opportunity and an advantage for us and is now firmly in the liabilities portion of the balance sheet.

Eric Daimler [00:07:40] You know, it’s really a great lead-in to the foundational shifts that we’re all experiencing. The way that Conexus looks at it isn’t just from the data, but from the data models. And those data models are becoming so sophisticated, so complex that, to your point, companies and governments cannot keep up with them. They don’t even realize where the fragility lies. You know, if we think back to the modularity that existed to power the Industrial Revolution, we had many of these parts exposed where we could see the modularity, we could see transparency in the modularity, in what we constructed on an assembly line. We don’t have that in a digital environment where we now have trillions of interactions. One of Conexus’s customers has 1.4 trillion transactions coming through their system in health care per year. You have to abstract your thinking to be able to work with relationships at that number. You know, I’m not talking Visa transactions or equity trades, where we have systems to accommodate those inside of standards. We don’t have standards in complex engineered systems, either deploying power or developing a jet engine. Those systems become so sophisticated that the employees, the participants of those projects, can’t themselves understand where the fragility lies. And it’s in that that a lot of anxiety is emerging, because lives are often at stake, if not tens or even hundreds of millions of dollars, from some of the stories from some of Conexus’s clients. I think this is a composability that is emerging from these sophisticated systems, one that Conexus powers at a deeper level, this level of math in categorical algebra. You know, this breakthrough in math allowed for us to be able to even interpret quantum computers. You know, quantum computers and the physics behind that are very sexy to talk about. And, you know, that’s a Hollywood screenplay probably, and certainly much more of a Hollywood screenplay than the math that powers it. But I’ll tell you, we would not be able to interpret a quantum compiler without categorical algebra and formal methods. We just wouldn’t be able to understand it. You know, if you didn’t have categorical algebra, quantum mechanics would demand that you’re using imaginary numbers. It gets to be too much. And so the world we’re living in is shifting from an epoch of logic that powered our computing revolution to one of composability, which requires this categorical algebra to be able to analyze for integrity.

Cameron D’Ambrosi [00:10:41] So where would you say that the biggest AI bottleneck is currently, you know, building on some of these topics you’ve already unpacked? You know, with these new frameworks and approaches to composability, has that eliminated these bottlenecks, or is it just kind of pushing them farther up or downstream, depending on your perspective?

Eric Daimler [00:11:05] I’d say the bottleneck is upstream now that there’s so much data collected. It’s somewhere between the data scientist and the data engineer, where there’s a lot of data, but it’s dark data. It’s data that’s collected by the organization, but it’s too hard to bring into one repository. We have marketing terms, data lakes, data lakehouses, data warehouses, you know, and those are partial solutions. You know, one of our clients says, boy, those other companies, they sell me this proposition that’s a little bit like throwing all my books into a big library and saying, see, it’s integrated. And then a data warehouse would be, all right, let me sort the books by height, or re-sort the books by weight. Like, okay, that’s semi-structured, but it’s not structured in a way that’s useful. What Conexus provides is a way of integrating the models, and that’s really what you want. That’s the example that we’ve been talking about. In the board level discussions I have, and I sit on a couple of AI boards, some of our clients have been told to give, and are giving, pitches to their boards about the future of AI providing a universal view, you know, an enterprise-wide view of their own customers’ interactions, but they’re unable to fulfill on this because bringing the data and the data models together in an enterprise-wide way isn’t available to them. That, I think, is going to be the big bottleneck and where a lot of companies and investors are going to find some disappointments.

Cameron D’Ambrosi [00:12:48] So, you know, I think I have understood how companies are positioning themselves to take advantage of Conexus to solve their challenges. But maybe it would be helpful to our listeners to put a bit finer of a point on that. You know, is it safe to say that you feel the most impact you are having is helping them make sense of the data that they do have, as well as driving insights around the relationships and opportunities to exploit existing data points that they already have? Or are you also coming in and advising them on, you know, new data that they should be creating themselves? Or can you talk about, you know, if I want to work with you in a more practical nuts and bolts sense, where I bring you in and where you can have the most impact?

Eric Daimler [00:13:42] Sure. I’m going to give you an example from a couple of Conexus’s clients. You know, one is a big hospital network that we work with in New York. And in one of these hospital networks, they have different definitions of diabetes. You’d think I can just look in a dictionary and find a definition. You know, I’m not a physician, but there’s a definition of this. But you can quickly understand that inside of their universe the context matters. The context matters a lot. One part of the group might be researchers, where the definition of diabetes will be yes or no. Another might be in a clinical setting, where diabetes will be represented as: how are we treating it? And a third might be in a billing setting, which is: well, Eric used to have it and now he doesn’t, or Eric’s at risk of it but doesn’t have it yet. These are different contexts in which the same term, the same tag, you might say, in a knowledge graph is diabetes, but it is represented differently in each. The traditional way of addressing these issues, and every company has them, is reaching consensus or forcing consensus. And some companies are able to have the king come down and say, this is what labels should be designated as within our organization. And if you’re in a low-consequence context, such as digital advertising, that actually can work pretty well. But even companies like Uber that have a king that can dictate what those labels are called will still have this difficulty of bringing together 300,000 databases into one. What Conexus does with our customers is provide this consolidated view against which you can query, one that looks for logical contradictions without demanding a consensus. That’s the revolution that’s powered by the math. That’s similar to how smart contracts work. Some of the cryptocurrency on the blockchain demands a consensus, and that’s one reason it’s so energy intensive and has such a difficult time scaling. It demands consensus. That’s the problem with demanding consensus: it doesn’t scale well. But when you don’t demand consensus, then you have a foundationally different scaling proposition where, instead of scaling factorially, we scale linearly with cost. So the proposition that Uber would have had in the traditional method would have cost something on the order of $2 trillion, you know, clearly infeasible. But with us, it costs just a linear amount per database.
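
The diabetes example can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Conexus’s method and not categorical algebra tooling: the context names, record fields, sample values, and the helper `find_contradictions` are all hypothetical. Each context keeps its own representation and supplies one translation into a shared yes/no view, so the work grows with the number of contexts rather than the number of context pairs, and disagreements are reported instead of being forced into a single agreed label.

```python
from typing import Callable, Dict, List, Tuple

Record = Dict[str, object]

# Hypothetical records for one patient, each in its own context's vocabulary.
records: Dict[str, Record] = {
    "research": {"diabetes": True},                 # yes / no
    "clinical": {"diabetes_treatment": "insulin"},  # how is it being treated?
    "billing": {"diabetes_status": "at_risk"},      # current / former / at_risk
}

# One translation per context into a shared "currently has diabetes" view.
# The contexts never have to agree on a single label among themselves.
to_shared: Dict[str, Callable[[Record], bool]] = {
    "research": lambda r: bool(r["diabetes"]),
    "clinical": lambda r: r["diabetes_treatment"] is not None,
    "billing": lambda r: r["diabetes_status"] == "current",
}


def find_contradictions(
    recs: Dict[str, Record],
    translations: Dict[str, Callable[[Record], bool]],
) -> List[Tuple[str, str]]:
    """Return the pairs of contexts whose shared views disagree."""
    views = {ctx: translations[ctx](rec) for ctx, rec in recs.items()}
    contexts = sorted(views)
    return [
        (a, b)
        for i, a in enumerate(contexts)
        for b in contexts[i + 1:]
        if views[a] != views[b]
    ]


print(find_contradictions(records, to_shared))
```

With the sample data, billing disagrees with both the clinical and research views, which is the kind of logical contradiction a consolidated query would surface for a person to resolve.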

 

Cameron D’Ambrosi [00:16:32] So what’s next in terms of, you know, where you feel there are foundational shifts to be had in AI? You, I think, are pretty well-positioned to make some prognostications as to the future of data and its intersection with artificial intelligence. Where do you see things going?

Eric Daimler [00:16:53] I think there’s a couple of different revolutions coming. You know, the largest epochal change is going to be the one from modularity during the Industrial Revolution, to logic during the computer revolution, to composability in this new epoch. That can only be analyzed with foundational math such as categorical algebra, formal methods, and adjacent math such as type theory, which powers the ability to create these modules of expertise drawn from subject matter experts that can then be reconfigured and redeployed in different contexts. You start then with a logical model. You make the implicit explicit. That’s going to be part of the skills for everybody to develop in the future as we all create this large digital infrastructure together. You know, another one will be around MLOps and AIOps, which are becoming more sophisticated, where these custom deployments are going to become more automated and a little more routine. So those together will allow a degree of data model sophistication to be digitized and become more part of a streamlined process. I think many people out in the general population would be a little concerned if they actually realized how much of our world is powered by people just exchanging Excel spreadsheets. And, you know, that is inevitably going to change. So the faster that people can embrace what it means to transfer their implicit knowledge, which is often just in Excel models, to something that is composable with their team and doesn’t require consensus, the faster we’re going to be making that transition. Those are the bottlenecks that are going to be mitigated, I think, over the next five to 10 to 20 years.

Cameron D’Ambrosi [00:18:55] And where do you think regulators have a role to play in all this? Obviously, you know, I think AI is a bit of a, what’s the word for this, political football, right? In some ways, I think it’s a mirror that you can hold up to yourself. And whatever your views are, it can manifest as either the solution to those challenges or as a reflection of the worst things that you want to see in a piece of technology. You know, you can have people who make the argument, well, AI, if trained correctly, can be fundamentally unbiased and therefore is the solution to some of the human biases that inject themselves into some of these systems. Conversely, you have, I think, some very good examples of AI being used and in some cases magnifying those existing biases based on what datasets you are feeding into these models. Government, you know, I don’t want to be completely cynical here, but it’s very hard for regulators in any field to kind of be out in front of technological developments, and I think AI is a field where certainly regulation has not been leading the way. You know, what are you seeing in terms of the broad strokes of how regulators are thinking about these challenges? And do you expect there to be regulatory headwinds around the development and deployment of additional AI technologies?

Eric Daimler [00:20:17] Yeah, I really love this question because it gets to the heart of what I want to communicate to your listeners and to a broader audience, which is that we all need to be engaged in the conversation. If we don’t like the regulation, we need to help regulators. These are often, if not the politicians, the staff, you know, well-meaning people that want to engage in the conversation themselves and, to the extent that regulation is required, put together and deploy intelligent regulation. The ways in which I would suggest that will tend to be pretty modest, because we can first just take what the low-hanging fruit is for regulation and separate out what we can do from what we can theoretically do. You know, this issue about data bias, that’s always there. You know, I have a little model ship given to me by my father-in-law in my home. It’s a model. It has a representation based on the biases of the person that made the model ship, my father-in-law. It’s not a 1-to-1 representation. It’s not the ship. It’s a model of the ship. This gets reflected in the data. It’s going to be reflected in data models. It even gets reflected in Conexus’s experience about whether data exists or not. Conexus has customers that will deploy ESG reporting, which we find to be easier in Western countries than in developing countries. Well, that’s a bias in itself. It will then suggest a result that may or may not be true, because there’s bias in the presence or absence of data. There’s a lot of different things that we as a society need to work through to see how we want to represent these different biases and correct for them, if we do want to correct for them. The easy, low-hanging fruit can start with such things as separating out the data where there’s biases from the data model where there’s biases, so that those two can then be treated differently. An easy way to do that is to have a degree of oversight, whether it’s an internal board, whether it’s an external board, whether it’s forced government regulation, which is in some cases going to be inevitable. In some ways, separating those out can make this much easier for data models. For example, we have this thing called zero knowledge proofs, and we use this in credit all the time. We don’t actually know exactly how those FICO scores are calculated. But we know approximately that you input my income and my age and my historical ability to pay off whatever credit I have had, and it outputs a number that seems reasonable-ish. That’s a type of zero knowledge trust, where some companies that think that they need to keep their data model private have a mechanism based on precedent to keep it that way. Should we do that? Another way to do that would be to declare the business model and then have some people knowledgeable in the art observe whether or not that model did what it was intended to do. So maybe it stays private or private-ish. But the other is oversight of that data model. And that could happen in domains such as hiring, where just disclosing that there’s an automated program really doesn’t do anything for anybody. What I get really concerned about is regulation, like we see with GDPR, where it could be well-meaning, but the implementation is so far off of reality that people ignore it. So the latest example is this demand from an addendum to GDPR that requires pseudonymization. Well, if you try to do pseudonymization, or rather prove its existence, I think you can’t actually do it without these sort of proof assistants that are available in category theory. You know, automated math, I think, will be something that’s ignored until some company other than Google gets a $1 billion fine and then says, whoa, whoa, I need to pay attention to that requirement for pseudonymization, for proving the existence of pseudonymization of data. So that’s where I think we can go with regulation, and what I think the danger is. There’s a lot more to say here. But for us to have AI fulfill on its vision of a utopia, the degree to which it fulfills this utopia or goes to a Hollywood narrative of dystopia is the degree to which we are engaged and help policymakers construct the most intelligent regulation.

Cameron D’Ambrosi [00:25:03] So I know I asked you for some future predictions on the regulatory front. Broadening that out, you know, I love to ask my guests for their crystal ball predictions, and I like to keep it pretty open-ended in terms of things that they, you know, believe we’ll see happen or things that they hope to see happen. What are your thoughts there? Like, where is this market headed, and what are you hopeful to see in terms of the deployment of artificial intelligence to solve some of these meaty challenges, you know, across industries?

Eric Daimler [00:25:36] Well, I hope that we’re going to be developing a robust system of circuit breakers. You know, we have this in other parts of our world, and it’s beginning to emerge in automated vehicles where, for instance, I need to touch the steering wheel in my car every so often to let it know that I’m still there and not sleeping in the backseat. That’s a type of circuit breaker that I think deserves to be explored in other deployments of automation. You know, just because we have automation that can be linked doesn’t necessarily mean that we always will be well served by linking automation without some human oversight: hey, is that what I intended? Just as systems organizers, you know, we have 30 million or so programmers in the world, with accompanying management and project managers. We should not just be insulated among ourselves, but instead look to broader society about how we want these values to be represented in these automated systems. I expect that to come into existence. I also think that data lineage will expand into data model lineage: where did I get that information? And some degree of ground truth will be able to be established through the network of linkages themselves. This is enabled by a new type of deterministic API that’s also powered by categorical algebra. That lineage, or provenance, I think, is something that’s going to be demanded more and more, starting with the financial context and moving into engineered systems. And around the math that we’ve been talking about in this conversation, I really think that, you know, maybe more math is better. But if I were to choose, I think that geometry, trigonometry and even calculus are going to become less important. A little bit like Latin, which still exists but is just less and less relevant. And they’ll be replaced more and more by probability, statistics and categorical algebra.

Cameron D’Ambrosi [00:27:47] I love it. You heard it here first, kids: rip those trig textbooks up, get rid of them. We don’t need them anymore. You know, in that same vein, I do think it’s funny. You know, I’m of the age when a common refrain in math class was, you know, you better learn this because it’s not as if you’re going to carry a calculator around in your pocket every day, which definitely gives me a chuckle every time I pull my iPhone out. You know, it just goes to show you, again, the shift in the skill sets that are going to be necessary, you know, for maintaining relevance in the future and in the environment which we’re all going to inhabit. So before we go, we are coming up on time. But I did want to give you a chance for, as I call it, a shameless plug, Dr. Daimler. For folks who are interested in either getting in touch with you or, more importantly perhaps, learning more about the solution and how they might deploy artificial intelligence to help solve some of their data challenges, what’s the best place for them to go? How should they reach out?

Eric Daimler [00:28:53] Sure. Conexus.com is our firm. You can certainly reach out to me, Eric Daimler, at all the usual places. I can pitch my book that’ll be coming out around the coming composability in 2023. But I actually can pitch my wife’s book, which is coming out on corporate culture next month. You can preorder it on Amazon: ReCulturing. It was one of the top business articles forwarded from Harvard Business Review and now is a book.

Cameron D’Ambrosi [00:29:23] Amazing. Be sure to check that out. Eric. I’m sorry, Eric. Dr. Daimler, it is fun to say, not going to lie. Thank you so much for your time. I greatly appreciate it. And this was super, super illuminating. Definitely taught me many things that I didn’t know and, I think more critically, didn’t know that I didn’t know, which, you know, those kinds of blind spots I think are the most dangerous. So thank you. Thank you so much.

Eric Daimler [00:29:49] Thanks. This was fun.
