Speaker:

Welcome back to Data Driven, the podcast that peeks into the rapidly evolving worlds of data science, artificial intelligence, and the underlying magic of data engineering. Today's guest is someone who's redefining the rules of the game in AI and data, Ina Tokarev Saale. She's the CEO and founder of Illumix, a company pioneering the use of generative semantic fabric to make organizations AI ready. We'll dig into how Ina's background as a frustrated data user sparked her innovative journey, why 80% of enterprise decisions still aren't data driven, and her bold vision for a future with app-free workspaces where AI copilots handle the heavy lifting. Oh, and we're tackling the ultimate question: if the future is already here, why does it still feel so delightfully chaotic? Sit back, grab your favorite coffee mug, or a Maryland state flag one if you're feeling fancy, and let's dive in.

Speaker:

Alright. Hello, and welcome back to Data Driven, the podcast where we explore the emergent fields of data science, artificial intelligence, and, of course, it's all made possible by data engineering. And with me today is my most favoritest data engineer in the world, Andy Leonard. How's it going, Andy?

Speaker:

It's going well, Frank. It always warms my heart when you introduce me like that.

Speaker:

Well, you are my most favorite data engineer.

Speaker:

Well, that's cool. You're my most favoritest. Like, there's so many things. Right? Data scientist, developer, evangelist. I mean, there's all sorts of cool things that you do. Super certified person. What are you up to in certifications?

Speaker:

12.

Speaker:

Wow.

Speaker:

Yeah. I'm in the New York City area code now. So that's good. Next up, the Bronx area code, 718.

Speaker:

Wow. That's a big jump.

Speaker:

Yeah. Yeah. We're working on it, and I'm at 760-some-odd consecutive days. I'm at the point now where when I post anything about Pluralsight, or my streak, or the number of days, Pluralsight always sends me a "Congratulations, Frank. Keep going." So, like, I'm on their radar now, which is really nice. I don't know. It's super cool.

Speaker:

Yeah. It is super cool, which reminds me I still have to do 2 days. But in the virtual green room, we were talking about coffee mugs.

Speaker:

We were.

Speaker:

And I don't have a coffee mug with me today, but there's an interesting anecdote from a previous show, which I think is live now, about the Maryland state flag coffee mug, which is pretty funny.

Speaker:

So today we have with us a very special guest, Ina Tokarev Saale. She's the CEO and founder of Illumix, and a pioneer of generative semantic fabric, which I wanna know more about, but it empowers organizations with AI readiness. Throughout her career leading data products and monetization, and as a data stakeholder, Ina recognized the oxymoron of our domain: despite huge investments in data and analytics, most business decisions are still not based on these data or insights. And when I read that, I felt that one. So she founded this company, Illumix, whose byline says, get your organization data generative AI ready. So welcome to the show, Ina. And tell us about this, because I think this is a big problem with generative AI. Well, first off, let's tackle the big one, which is the idea that despite all this money that's been thrown at data and analytics for at least 2 decades, probably longer, a lot of decisions are not data driven.

Speaker:

Yeah. Fine. Can you hear me? Because I see a little bit

Speaker:

Yeah. We can hear you.

Speaker:

Okay. So yeah. Thank you. You're totally right. The benchmark says only 20% of decision making in enterprise is based on data. And to me, I have been around for a while, so 25 years in data analytics, and it was always about cloud, big data. But what does it actually boil down to? Are you able to pull out whatever analysis of data you need when you have, like, a question on hand? Not really. And this is the situation in the majority of enterprises, right? Even with those huge data teams and huge investments in infrastructure and all of that. And to me, the biggest promise of LLMs in an enterprise setting is to bring the contextual and relevant data to the stakeholders in need. Right? In this experience which is impromptu, which means it's improvised; it's governed and hallucination free; it's transparent. So I would totally love to have this experience where I'm in my Slack or Teams, right, and I'm able to chat with my data copilot and ask a question and get an answer I can base a decision on. Right? Not just an answer I should be reverse engineering with, you know, a bunch of people.

Speaker:

Interesting. Interesting. But I think that the companies, they store a lot of data, they analyze a lot of data. But at the end of the day, not all decisions, but a lot of decisions are not based directly on the data. They're based on, quite frankly, particularly the higher the level, sometimes what's good for the person, not necessarily the organization or the business, let alone the customer. What are your thoughts on that? I'm familiar with the saying, if you torture your data long enough, it will confess.

Speaker:

That's right. It goes exactly to the domain. So I guess you can massage the results, right? But, on the other hand, when an employee comes to me with a suggestion, with a business plan, with, you know, some project, I always ask, what's the ROI? What's it going to be to spend, and what's the impact on, you know, other activities, and what is it going to be at the expense of? So having numbers, having data to, you know, base the decision on, or to bring to your boss, has always been a struggle, and it's still a struggle today. So I think it outweighs maybe some, you know, reluctance to have open data for all, just for the sake of being able to have specific context on it.

Speaker:

Interesting. That is very interesting. And, you know, I think that's been the purpose of a lot of data driven activities in corporations globally, and for a very long time: how do you convert data in its raw natural form into information? Mhmm. And, you know, defining information as something I can glance at and know, you know, almost instantly how my enterprise is performing. And that was kind of my opening line 20 years ago when I started in data warehousing: go talk to a decision maker, CIO, CEO, and, you know, try and do a very small project, a phase 0, and just ask them, how do you know? And the surprising answer, yeah, even then it was surprising, was, you know, something along the lines of, well, people email information to a lady out front, or a secretary, assistant, guy out front, and he or she compiles it and puts it into this summary, and then they tell me. And so, you know, 1 PM every day, or, you know, Monday at 1 PM, I know how we did last week. Something like that. Very manual processes. So does Illumix address that? The manual part?

Speaker:

Yeah. Totally. So I don't think reports will go anywhere, but I think we'll have, you know, at least 3 types of experience with data. So I do believe in an application-free future where you have a question or a task, and then you have a launcher, and you just, you know, articulate whatever request you have. And in the background, whatever applications, workloads, and data get engaged with each other to basically come up with the results. Right? So I do believe in this future. This is the ultimate. Right? But I think we will have this intermediate stage where we'll have a lot of copilots or assisted insights in the context of applications you're already using. So using your CRM system, you will have all kinds of insights, suggestions, you know, data driven actions which might come up in the system, in your workflow, inside your context. Right? And you might have this pure experience when you do go to analytic systems like BI or something else, where you do have your static dashboards, day after day, same way that I go to, you know, my CRM dashboards and see how the pipeline is going and all of that. So I do not need them to change. Right? I don't want to go to some chatbot and ask again and again the same question, like, what's the pipeline conversion today? Right? I do want to have those static dashboards where I just, you know, sneak a peek and see if everything is in line and we're within the benchmark. So those three types of experiences, I do not think they're going to evaporate in the future. Right now, we are mostly bound to the last type of experience, of being in the closed garden of our BI tools, like this pre-modeled analytic experience. And then we'll have this phase where we do have embedded experience. The majority of the companies are already suggesting some kind of improvements in the space, some better, some halfway, let's say. And the ultimate goal is to have this launcher where, for the majority of ad hoc tasks or questions, you will have this improvised experience.

Speaker:

So a follow-up on that. You mentioned Copilot, and Microsoft has been the company that I've heard using that term most often for some sort of digital assistant. To me, an outsider looking in, although I use the tools, it seems to have been a quantum leap this year in that technology. It just seems like last year, they were talking about things that it might help with, and I've seen all sorts of examples of this. But have you seen that? Has that been your experience, that in the last 12 months, these types of assistants have just, you know, taken a giant step forward?

Speaker:

Mhmm. I will address this question together with the previous one, like, how Illumix is positioned in this context. So I do see many projects in the companies, and mainly, they're providing copilots for call centers or support centers, mainly based on document summarization. Right? So document summary is a more lightweight and risk-averse use of LLM technology, where I can actually go and check the document itself, based on the source. Right? And documents are already articulated with lots of context in business language. So it's kind of low hanging fruit, and the majority of the companies go in that direction, including Microsoft. Where Illumix goes, Illumix actually tackles the market which is less digested, the market of structured data. So you mentioned you started your career in warehousing. So warehouses, databases, data lakes, business applications such as supply chain, ERP, CRM, and all of that. All of that can be defined as the structured data space. And despite the name, it couldn't be less structured than it is at the moment. Right? So you have

Speaker:

If it is structured, it's not structured the way you need it.

Speaker:

Yeah. Exactly. So the namings are not meaningful, like abbreviations, the "frank" table, or this transformation or alias. Right? So all those weird names, especially in SAP systems. I love that. And no single source of truth. Right? In documents, you might have versions, but you do still have some alignment to a single source of truth. In data, you can have many definitions, even in the same data source. And the thing is, if you put semantic models, like semantic search, on top of them, and it works by proximity, you might have hallucinations and random answers every time you engage with the tool. So this is where we chose with Illumix to tackle the problem as, basically, a 3 step approach. Right? The first step is getting data AI ready. So there is no way of using generative AI, or AI analytics in general, if you do not have good data. But for analytics which is served to you as a BI dashboard, it's actually feasible to do manual data massaging. Right? Well, fun. Yeah.
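The failure mode Ina describes, semantic search over cryptic column names matching by proximity, can be sketched in a few lines. This is an illustrative toy, not Illumix's actual method: the column names are invented, and character-bigram overlap stands in for learned embeddings, but the ambiguity it exposes is the same.

```python
# Fake "embeddings" via character bigrams; real systems use learned
# vectors, but the tie-breaking problem below is identical.

def bigrams(s):
    s = s.lower()
    return {s[i:i + 2] for i in range(len(s) - 1)}

def similarity(a, b):
    # Jaccard overlap of bigrams, a stand-in for cosine similarity.
    A, B = bigrams(a), bigrams(b)
    return len(A & B) / len(A | B)

# Three columns that could all plausibly mean "revenue". Adjusted and
# net revenue are different business concepts, yet their names are
# nearly identical strings.
columns = ["revenue_adj", "revenue_net", "rev_gross_alias"]

query = "revenue"
scores = {c: similarity(query, c) for c in columns}
best = max(scores, key=scores.get)

# Margin between the top two candidates: effectively zero, so which
# column "wins" can flip with incidental wording changes.
runner_up = sorted(scores.values(), reverse=True)[1]
margin = scores[best] - runner_up
```

In practice a near-tie means the tool can return a different column, and therefore a different number, on each phrasing of the question, which is exactly the "random answers" problem a certified semantic layer is meant to remove.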

Speaker:

Yeah. That's fun. That's near and dear to my heart as a data engineer: data quality. Because you can have, you know, the fastest, best presentation, the slickest graphics, and it could be totally lying to you. And, you know, even from the days of data warehousing all the way through today's semantic models and dashboards, it's the quality of the data store you're reporting against. That data quality, if you were to measure it, you know, there's a number of ways to do it, but say it's well north of 99%. And people see that, and they go, wow, that's super good. And it's like, no, no, it isn't. You can't do predictive analytics off of something that's 99%, because that 1% of bad data or incorrect data or duplicate data will skew your results. And what often, you know, the layperson doesn't understand is that if it lies to you and tells you you're gonna make a $1,000,000,000, that's just as bad as it telling you you're only gonna make a $1,000,000, if the truth is you're at about $25,000,000. That's your real projection if you were to follow that line out and do the extrapolation, you know, properly. And you can make bad decisions with an overestimation just as easily, maybe more so than with an underestimation.
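The 1% problem Andy describes is easy to put numbers on. The figures below are invented round numbers for illustration: 1,000 orders, 990 keyed correctly and 10 entered with two extra zeros.

```python
# 99% of rows are clean, yet the aggregate (and any projection
# built on it) is badly skewed, because bad rows can be arbitrarily
# large while clean rows are bounded.
clean_rows = [1_000.0] * 990      # 990 correctly keyed order amounts
bad_rows = [100_000.0] * 10       # 10 mis-keyed rows (two extra zeros)

true_total = 1_000 * 1_000.0      # what all 1,000 orders really sum to
observed_total = sum(clean_rows) + sum(bad_rows)

# The 1% of bad rows nearly doubles the total, so a straight-line
# forecast extrapolated from it overshoots by the same factor.
inflation = observed_total / true_total - 1.0
```

Here 99% of the rows are correct, yet the observed total is almost double the true one, so any forecast extrapolated from it inherits that same error.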

Speaker:

Yeah. Exactly. So this goes to the ground truth: your results are as good as your data is. And you cannot trust simple semantic search to solve all these problems for you. And so for us, the baseline, the first use case, is to get data AI ready, or generative AI ready. And we do use generative AI for that from day 1. We've actually been a generative AI company since 2021. Yeah. It's funny to say now. It was very hard to explain to our investors back then what it actually means.

Speaker:

Yeah. You know, I get it. I mean, if you build on a crooked foundation, you can't get anything straight, you know, out of that. So that makes perfect sense to me. And, please correct me if I'm mischaracterizing the work that Illumix does, but is it automated, AI automated, data quality? Is that really what you're after?

Speaker:

So, basically, we automated the full stack of LLM deployment for structured data, and it takes the AI readiness part. AI readiness, which means we have automated reconciliation, labeling, sensitivity tagging

Speaker:

Okay.

Speaker:

Like, lots of data preparation which is automated. Gartner actually named us as a Cool Vendor for that lately. We have this layer of context automation. Right? So any LLM, any semantic model, needs context, and this context and reasoning are usually built by data scientists. To me, it's controversial because, you know, we had data modelers who didn't understand business logic, and now we have data scientists who do not necessarily fully understand business logic, and they model it into a black box experience of context. Right? So Illumix reverses the process. We actually automate context, and we wrap it up in an augmented governance workflow, so business people or governance folks can actually certify it. So it's auto-generated context for LLMs, but certifiable by humans. We do believe that we need to bring the human to the loop, right, to certify it. Yeah. And the last

Speaker:

I'm sorry. I have interrupted you, like, 3 times now, and I apologize. I count it at 2. I thought you paused. So please finish your thought.

Speaker:

No. No. I'm saying, like, 3 parts. So we already did data AI readiness, then governance, and then the actual LLM deployment, because you need to interact with the whole thing, and the interaction has to be explainable and transparent. You need to understand, especially on structured data, how the answer was calculated based on the question, and how data was actually sourced: what's the lineage, what is the governance and access control passed through to your clients. So all of that should be in the interaction layer. So AI readiness, governance, and the interaction layer, explainability to the end user.

Speaker:

Absolutely. Okay.
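One of those steps, sensitivity tagging with human certification, can be sketched as a rule pass that only proposes labels for a reviewer to confirm. The column names, hint patterns, and two-signal design here are hypothetical and far simpler than what a real product would use.

```python
import re

# Hypothetical sketch: rules *propose* a sensitivity tag; nothing is
# final until a governance reviewer certifies it (the human in the
# loop described above).
PII_NAME_HINTS = re.compile(r"(ssn|email|phone|birth|salary)", re.I)
EMAIL_VALUE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

def propose_tag(column_name, sample_values):
    # Signal 1: the column name itself looks sensitive.
    if PII_NAME_HINTS.search(column_name):
        return "sensitive (proposed)"
    # Signal 2: sampled values look like email addresses.
    if any(EMAIL_VALUE.match(str(v)) for v in sample_values):
        return "sensitive (proposed)"
    return "unclassified"

proposals = {
    "cust_email": propose_tag("cust_email", ["a@b.example"]),
    "contact": propose_tag("contact", ["jane@corp.example", "n/a"]),
    "region": propose_tag("region", ["EMEA", "APAC"]),
}
```

The design point is that automation does the sweep at scale, while the certified label still comes from a person.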

Speaker:

Thanks. And I do apologize again for the interruption. So my characterization of it as something that's just data quality is way low. There's a little bit of overlap between data quality and what you're describing, but you're talking about taking this to that next level that is specific to generative AI, and perhaps other, you know, AI related, AI adjacent technologies; machine learning leaps to mind and stuff like that. But the tagging, the categorizing, and all of the things you're describing there, that is next level. And it's very interesting to me that you're using AI to get data ready for AI. That's an interesting combination. Mhmm.

Speaker:

It makes sense, though. Right? You can kinda scale out human capability with AI. I think you kind of alluded to that with human in the loop. Right? Like, I think where you were kinda going with that, again, don't wanna speak for you, but it's the idea that AI isn't gonna replace humans. It's just gonna make humans more productive.

Speaker:

Yeah. For sure. Augment us. Because, frankly speaking, no one wants to model data, you know, as their career. We want to solve problems. Right? And to solve problems, we have to understand what the problems are. And letting AI surface the problems as alerts, and for us to resolve them as conflicts, takes, you know, 1% to 10% of the time it would otherwise take, where we are busy, you know, wrangling data still. And, you know, it's sad to some extent, because data is growing and we cannot keep up.

Speaker:

No. That's a good point. Even if there are people out there, and some of our listeners may really like modeling data, right, you know, now they can model about 10 times the amount of data, or maybe a 100 times more. Right? And then ultimately, the expectation of what a person can do in a set period of time is gonna go up. I think you're on to something there. Plus, I would also, like, double click on the idea that you said earlier, which I think was very intriguing, this notion that a lot of the apps that you use would kind of fade away. You just have this virtual assistant. You know, I think back to a number of scenes in, you know, Star Trek: The Next Generation where they have a conversation with the computer. Right? Mhmm. You know, they use the computer. They get stuff done. There's no Microsoft Word. There's no PowerPoint. Right? Like, there is no application. The application is kind of invisible. It becomes the computer. And I think that's a very intriguing kind of way. And if you had told me that a year ago, I would have been very skeptical. Now I look at it, and I'm like, I mean, it's almost inevitable.

Speaker:

Yeah. Yeah. I agree with you. The future is here, it's just not evenly distributed, as people say. So I guess, you know, when you're attending conferences in the Bay Area, it's already here. It happens. Right? And when you go to, let's say, Europe, the EU AI Act in Europe is ramping up, so it's all about controls, and this is great. So I do not think that regulation and innovation actually jeopardize each other. I think they should go hand in hand, and that's where I see the industry going. So, the East Coast approach: the majority of our customers are coming from the East Coast US. Pharma, financial services, insurance: highly regulated, data intensive companies. They are now sometimes even inventing standards for generative AI implementations, because everything is so new, but companies want to go fast. Right? So no one wants to downplay risks on one hand. On the other hand, everyone wants to, you know, implement generative AI and see the productivity gains. It's, you know, it's evident the productivity gains are already here with all those copilots, summarization, what have you, and this is where we are today. So I think, like, again, the Bay Area is running fast, and the East is coming up with regulation. We will meet somewhere in between. I believe in both.

Speaker:

Well, if you kind of, like, look at it historically, you know, when the dot-coms first started, right, there were a number of, hey, look, you know, we're gonna sell pet food online. Right? And back in the dial-up days, it didn't really make a lot of sense. It would just be easier for me to go to the store. Whereas now, I mean, if you think about ecommerce, obviously, Amazon is the 2,000-pound gorilla in the room. Like, do I really wanna think about, you know, dealing, particularly as we get into the holiday season, do I really wanna deal with the traffic at the mall or the store, when I can just click on something and either have, you know, groceries delivered, or, you know, I'm okay waiting 2 days for something to come, if I don't have to deal with the mall?

Speaker:

Yeah. Totally. What's the difference between Black Friday and Cyber Monday? There isn't one, right? Like, not really. Yeah.

Speaker:

Yeah. Not anymore. You know? So we're recording this just before Black Friday. And, you know, this whole idea of, you know, going to the store to get the best deals, it's like, do I really wanna deal with the crowd? No. Yeah. Although, ironically, the name for the podcast came on a Black Friday, while I was at a Dunkin' Donuts, drinking coffee, waiting in line actually to get

Speaker:

So there's a I'm a Krispy Kreme person.

Speaker:

Ah, okay. Yeah. So with you and I, right, definitely. Right here. This is before we had a Krispy Kreme near us. So I have split sides, but yeah. Yeah. JT, he's from up north. So there are Dunkin' Donuts, I've noticed this, there are Dunkin' Donuts, like, north of Virginia. And he's in Maryland. I'm in Virginia. Then down south, you rarely see a Dunkin' Donuts.

Speaker:

I see more Dunkin' Donuts down south than Krispy Kremes up north, though, for sure. Yeah.

Speaker:

But they're from Boston. That's why. Yeah.

Speaker:

Oh, that's why. And Krispy Kreme's from Atlanta. And plus, it's funny, right? Like, so I live in Maryland, which, depending on whom you ask, is either north or south.

Speaker:

So that's right. That's true. Interesting. Interesting. We're a border state for sure. Yeah.

Speaker:

That goes the same for Virginia. But I wanted to follow up on, you know, we've been talking about all the cool stuff. I'm gonna try and say this correctly: Illumix. Is that correct? Am I getting it right?

Speaker:

So the Illumix name comes from illuminating the dark side of organizational data. Illuminate, like illuminate.

Speaker:

Illuminate. I like that.

Speaker:

And x for the x factor.

Speaker:

Excellent. X for the x factor. Yeah. And I'm not asking you to I'll just ask a question. What are the risks in what you're doing? And, you know, what are the risks you're aware of, and how are you addressing those?

Speaker:

Yeah. So I think the biggest risk of 2025 is going to be TCO, total cost of ownership. So already today, it's very hard for organizations to monitor where the generative AI tokens are spent. And the benchmarks say that 80% of LLM tokens are actually spent on customization of off-the-shelf models. And that's not good news, because it means ROI is pretty low on the actual production use of generative AI in the enterprise. And I think it doesn't get any better, because the customization techniques which are used today remain a black box, performed by super expensive data scientists, and they're not very scalable for data that you don't want to, you know, schmooze around. I think it's cost prohibitive, actually, to bring data to AI. You need to bring AI to data. So putting data in some graph structures, GraphRAG and all of that, to me, it's cost prohibitive. So this is why I think the Illumix position for 2025 is actually favorable, because we bring this transparency. We do create this virtual semantic knowledge graph, which is transparent to certify, which is created for business people, based on business logic. We do use industry ontologies extensively, and so on and so forth. And I think the most interesting part about generative AI is we're not necessarily going to mimic the processes that humans performed. Mhmm. We're going to invent those processes. Right? So new processes and new workflows. So right now, generative AI is deployed like analytics is deployed, which means you have to label your data, check the quality, usually manually, and then you have to prepare the test set which is fed into customization of the model, and then you actually provide the context on every question. So this is a very old fashioned, you know, 40-year-old machine learning technique to actually train generative AI. So this is why I'm saying that many companies are probably going to mimic what Illumix does, in the sense that you have to be focused on domain specific knowledge, reasoning, ontologies, and knowledge graphs. You have to onboard your customers automatically via metadata, because metadata has, de facto, all activities in the organization documented for us. We're just underutilizing it, right? And then you bring your business people, your domain experts, your governance teams into the loop, because you simply cannot bring this business acumen, you know, to data. You have to bring data to those people.

Speaker:

That's an interesting thing, because I've seen this statistic around 80% of the tokens being used to manipulate the data. I have a microcosm example of that, where I use AI to augment my blog posts, the blog that I create. And I finally took a closer look at this, because I was spending a lot more on the OpenAI API than I really wanted to. And I'm like, well, what exactly am I I'm using a product called Fabric. And I'm like, wait, what exactly is the source of this prompt? And I look at it, and I'm like, it's a lot. It's a long prompt. And I'm like, I really don't need that. Right? So we are gonna do a deep dive in a show on Fabric at some point. Not the Fabric Andy works with, but there's an open source thing called fabric. I'm sure there are lawyers right now that are doing their holiday shopping based on how much money they're gonna make off of that dispute. But the short of it is, like, I realized, well, no wonder I spent so much money. I'm sending all of this in my prompt, plus the content. So actually, in the virtual green room before you joined, Andy and I were talking, and I was like, I actually got a really good result based on a more optimized prompt. You know? And, you know, strictly speaking, I like your approach of bringing the AI to the data rather than bringing the data to the AI, because that is expensive.
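The arithmetic behind Frank's realization is simple. The token counts and per-token price below are invented round numbers, not OpenAI's actual pricing; the point is the ratio of fixed boilerplate prompt to actual content.

```python
# Back-of-envelope sketch: when a long fixed prompt is resent on
# every call, it, not the content, drives the bill.

def call_cost(prompt_tokens, content_tokens, price_per_1k=0.01):
    # Hypothetical flat price per 1,000 input tokens.
    return (prompt_tokens + content_tokens) * price_per_1k / 1000

calls = 100                # e.g. 100 blog posts processed
content_tokens = 1_500     # the actual article, each time

verbose_prompt = 6_000     # sprawling boilerplate prompt
trimmed_prompt = 500       # the optimized version

cost_verbose = calls * call_cost(verbose_prompt, content_tokens)
cost_trimmed = calls * call_cost(trimmed_prompt, content_tokens)

# Fraction of the original spend eliminated by trimming the prompt.
savings = 1 - cost_trimmed / cost_verbose
```

With these made-up numbers, roughly three quarters of the spend was the fixed boilerplate, the same shape as the 80%-of-tokens-on-customization benchmark quoted earlier.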

Speaker:

You know, I I think that bringing the AI to the data will be less

Speaker:

expensive. How much less, I think, remains to be seen. But I like that approach,

Speaker:

right? Because that's typically what we've done, you know, and we've seen

Speaker:

huge upsides to that, whether it's from Hadoop bringing the

Speaker:

compute to the data rather than vice versa. I like that

Speaker:

approach. And it's backed by historical precedent. Right? So it's not

Speaker:

completely gonna be this crazy idea. It's just a very sensible

Speaker:

idea. Yeah. Yeah. I believe the future was already

Speaker:

invented. Right? So it's just a combination of technologies we already have.

Speaker:

We've been healthy about it before. So, we had

Speaker:

machine learning practices which are very healthy like feature

Speaker:

exploration, feature definitions and then we had neural net brute

Speaker:

force, and then the majority of companies used a combination of both,

Speaker:

right, to be optimized. This is what I think is happening with

Speaker:

generative AI. So this, you know, wild west of brute

Speaker:

force and great spend is going to be replaced by methods

Speaker:

which have, like, this automated context filtering or pre

Speaker:

processing, and then use, like, a fraction of your budget to actually

Speaker:

run the query. Yeah. I remember hearing about a lot

Speaker:

of this in the late nineties. And, I worked for a company who

Speaker:

was a big SAP shop. I see you have a history with SAP. Yeah. And

Speaker:

this lady. And so we were the IT department. So

Speaker:

we were in the basement, but the analytics team back then was in

Speaker:

a closed-in space inside the basement. So it was

Speaker:

like even more like, you know, I was the web developer, so I didn't

Speaker:

have a window, but I could see the window about 50 feet away.

Speaker:

But, like, when you when when you went

Speaker:

into this, like, you know, further enclosed space deeper into

Speaker:

the depths of the IT department,

Speaker:

there was the database team. And in the back of that area

Speaker:

was the analytics group. And I remember this lady telling me

Speaker:

that she was working with these things called OLAP cubes. Oh, wow.

Speaker:

Yeah. And I was like, what is that? And then she went on this thing

Speaker:

and, you know, I'm remembering a conversation, oh my god,

Speaker:

almost 30 years ago. But I just remember walking away with,

Speaker:

like, that sounds either crazy because she's talking about,

Speaker:

like, you know, figuring out patterns. Right? So, you know, will

Speaker:

rainfall patterns in Australia affect not just the agricultural

Speaker:

side of the chemical business, but also the plastics purchasing

Speaker:

versus rainfall in the Amazon versus this and all of

Speaker:

that? And I just remember walking away from that conversation

Speaker:

as I leave the depths of the IT department back to my normal

Speaker:

kinda, basement. Back to the regular basement from

Speaker:

the sub basement. I remember thinking that is either the craziest thing I

Speaker:

ever heard or the most profound thing I ever heard, which

Speaker:

now with the, hindsight of time, it turns out it was the most profound

Speaker:

thing. Yeah. You can think about it as

Speaker:

semantic layers of, you know, that era. Right?

Speaker:

Mhmm. Right. And I think You know go ahead.

Speaker:

I'm sorry. Sorry. I think there's a delay in

Speaker:

the connection. So I think around the same time I was

Speaker:

doing my bachelor's, and my project was about multidimensional

Speaker:

theory. So, multidimensional geometry

Speaker:

of these neural nets. So basically, you model neural nets as a multi

Speaker:

dimensional graph and it does operational research calculations.

Speaker:

So it's exactly the same. You you model your universe in a

Speaker:

graph. Back then it was MATLAB. We didn't have any, you

Speaker:

know, neural net Right. structures or graph structures, and so you're

Speaker:

modeling in MATLAB, in this weird language,

Speaker:

a graph which has neural nets on it. And

Speaker:

this is exactly like modeling OLAP cubes. Right? A

Speaker:

multidimensional representation of your reality. Now,

Speaker:

fortunately, we have new technologies which

Speaker:

are semantic and contextual. Right? Large language

Speaker:

models and graphs, which do the same thing but much

Speaker:

more efficiently. Yeah. So this is amazing. Like, I

Speaker:

think it goes back to what you said. You know, The future's already here. It's

Speaker:

just not widely distributed yet, which I think is a William Gibson

Speaker:

quote, or is it an Esther Dyson quote? I forgot.

Speaker:

But it's one of those 2 kinda luminaries. Yep.

Speaker:

You said what I was going to say, you know, and it

Speaker:

was, you know, more off of what Frank

Speaker:

said is it turns out that we're just

Speaker:

doing more nodal analysis and vector

Speaker:

geometry as a result of that. It did all start

Speaker:

with multidimensional math and grew from there. And

Speaker:

that's where these algorithms, like nearest neighbor

Speaker:

originated, was in that math. So

Speaker:

Yeah. Yeah. Great minds. Exactly. Exactly.

Speaker:

Alike. Exactly. Now you're

Speaker:

complimenting me. Thank you. I feel better

Speaker:

when smart people in the room agree with me.

Speaker:

I know I'm on the right path. You know, I employ

Speaker:

millennials. So having people with experience in multidimensional

Speaker:

geometry and OLAP cubes, it's just a miracle to me, to start

Speaker:

with. You know? People now like Python, neural

Speaker:

nets. We do actually, the average age in

Speaker:

Illumex is around 35, 37, something like that. So we do

Speaker:

have like also pretty experienced folks, you know, but new talent,

Speaker:

they're not familiar with all of that.

Speaker:

And I think it's actually a disadvantage because,

Speaker:

when you do know different patterns in architecture Yeah.

Speaker:

You can model them with new technology. Right? Make them more

Speaker:

efficient, but you already know what works and what doesn't, and it

Speaker:

helps. Yeah. That's a great point. The old

Speaker:

experience, you know, the experience that we have from doing this for

Speaker:

decades is that we see the patterns that have

Speaker:

repeated over time, architectural patterns and design patterns. And,

Speaker:

you know, and we know that they've

Speaker:

I love how you said that. The, you know, the future's already been

Speaker:

invented. We realize that if we reapply some of these

Speaker:

patterns, that there are use cases for them, not just now, but

Speaker:

also in the future. So I totally get you.

Speaker:

True, you know, like,

Speaker:

you know, it is painful to think that, you know, we've been in this

Speaker:

industry for decades. Right? It hurts a little bit. But,

Speaker:

like, also, if you're listening to this, you've not been in the industry for

Speaker:

decades, and you're thinking, like, woah, you know, what do these

Speaker:

old geezers know? I would point out, when I was

Speaker:

a young kid in the industry and, you know,

Speaker:

client server was like the new hotness. Right?

Speaker:

And, you know, the whole notion of going back to,

Speaker:

you know, cloud and, you know, terminals.

Speaker:

And an old mainframe geezer basically said to me, like, this is just

Speaker:

this industry has cycles. Right? It's like the fashion industry. This goes in

Speaker:

style. This goes out of style. And it was like, I had that moment

Speaker:

of, like, wait. I think he's on to something, but he's just an old geezer,

Speaker:

so I won't listen. So, you know, so so

Speaker:

if you are a young buck, like, or,

Speaker:

buck is a male deer, right? What would be a Yes. A doe. A young

Speaker:

doe. So if you're a young buck or a young doe, I grew up

Speaker:

in New York City. So all of this wildlife thing is brand new. I'm here

Speaker:

for you. I'm here for you, Frank. So, you

Speaker:

know, listen to, like, some of the things that these, you know, more

Speaker:

experienced colleagues will say. Yeah. You know,

Speaker:

if you don't believe it right away, just put it on the shelf in your

Speaker:

mind because you're gonna need it later. It'll come up at some point.

Speaker:

And it's like, if you look at kind of, you know, everybody ran to the

Speaker:

cloud. Right? And cloud is effectively like a

Speaker:

mainframe. Right? The same philosophy. Right? Centralized

Speaker:

computing somewhere else. Right? And then your browsers become

Speaker:

the terminals, terminals with fancy graphics, but terminals nonetheless.

Speaker:

Now I think you're gonna start seeing it. Kind of, we're about due for a

Speaker:

seismic shift backwards, right, as people kinda move to

Speaker:

repatriate data and things like that. Particularly, I think, driven by AI

Speaker:

because of the cost of some of this. You know, I had this debate,

Speaker:

you know, the other day. It was like, you know, if one of these

Speaker:

super clusters with, you know, A100s, H100s,

Speaker:

all of this, if it costs, say, $500,000,

Speaker:

right, I could probably do the math, and that probably means

Speaker:

about, you know, there's a certain break-even point,

Speaker:

and it's probably after about 7 or 8 fine-tunings or full

Speaker:

on trainings where it's just cheaper to have it. Just buy it.
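Frank's back-of-the-envelope reasoning can be made concrete. The cluster price echoes his $500,000 figure, but the per-run cloud cost is a hypothetical number chosen only to show the break-even shape:

```python
import math

# Break-even between buying a GPU cluster and renting cloud capacity per run.
# The cloud cost per training run is a hypothetical illustrative figure.

def break_even_runs(hardware_cost, cloud_cost_per_run):
    """Number of full training runs after which owning beats renting."""
    return math.ceil(hardware_cost / cloud_cost_per_run)

# A $500,000 cluster vs. an assumed $70,000 cloud bill per full training run:
print(break_even_runs(500_000, 70_000))  # 8 runs to break even
```

At that assumed rental rate, ownership wins after eight runs, in the same ballpark as the seven or eight trainings mentioned above.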

Speaker:

Yeah. Yeah. Yeah. Totally on that. And also, you

Speaker:

know, salaries and skills are the most expensive part. So you

Speaker:

want to spend it on your business specific problems and

Speaker:

not generic problems you can solve with software. Right? So

Speaker:

it's always like that. Yeah. Yeah. So,

Speaker:

I do think that, basically capacity to process data

Speaker:

is is going to be a challenge. Right? And this is why we

Speaker:

see that the majority of,

Speaker:

I would even say, countries, not

Speaker:

only specific enterprises, kind of gear

Speaker:

up with GPUs, FPGAs,

Speaker:

whatever hardware you have. Right? So you see it in

Speaker:

the Middle East, in the Emirates. They have a national generative

Speaker:

AI grid, and they're building it for, you know, not only government companies

Speaker:

but also private companies. We see the same in Europe

Speaker:

and I would assume, you know, US-based telcos

Speaker:

are going to provide those data centers with GPUs soon

Speaker:

enough, right, for, you know, for everyone to purchase as an

Speaker:

alternative to the public cloud. Yes. And we'll

Speaker:

see it. So this is for starters. And second one, the second part where

Speaker:

you don't need this, you know, heavy machinery,

Speaker:

you might just have your wearables processing

Speaker:

parts of whatever generative AI on your end before sending to the cloud

Speaker:

because you do not necessarily need to to process everything in a central

Speaker:

manner. We basically have pretty powerful machines on

Speaker:

our hands or in our hand, you know, as

Speaker:

glasses as well. We can see that, and it's

Speaker:

going to be part of the processing. So the processing is going to be distributed.

Speaker:

You bring AI to your data, where your data is. You do

Speaker:

not shift your data all the time. It's not

Speaker:

cheap anymore. And we'll have this, as you mentioned,

Speaker:

those central repositories of mass processing

Speaker:

and those distributed powerhouses which are

Speaker:

small enough to to process data on on edge.

Speaker:

I think you're right. I don't think you're gonna see all of the data processed

Speaker:

in one place. I think it's gonna be everywhere. There's gonna be some

Speaker:

And I think that introduces some interesting consequences. Right?

Speaker:

So my wife works in IT security, and I can immediately hear her voice in

Speaker:

the back of my head. Contrary to what you think, ladies, we do

Speaker:

listen. We just don't always pay attention. But

Speaker:

I can hear her like, well, if compute's happening everywhere,

Speaker:

gee, couldn't, like, that be poisoned anywhere?

Speaker:

Right? I think I think that's going to be the next kind of thing. Right?

Speaker:

And it's, again, a pattern. Right? Advancement.

Speaker:

Bad actors take advantage of that. A problem happens. And

Speaker:

then that's the new thing. Right? So it's almost like you're building, like,

Speaker:

a layer cake. Right? Like, you know, the cake

Speaker:

goes down then the frosting. The cake is the innovation. The frosting is

Speaker:

security, and then so on and so on. So Yeah. Yeah. Yeah.

Speaker:

So it's basically back to the semantics. What we started with is

Speaker:

semantic ontology as a baseline for generative AI.

Speaker:

It has multiple benefits. Single source of truth, of course, has the

Speaker:

benefits for accuracy. But also, if you're passing every

Speaker:

question to this semantic ontology context,

Speaker:

it's almost impossible to poison it because we're going to either

Speaker:

match to part of your logic or Right. Right. We're going to

Speaker:

miss. So it's another layer of security, if you think about

Speaker:

it. So, yeah.
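Ina's security argument, that a question either maps onto governed ontology terms or simply misses, can be illustrated with a toy filter. The ontology entries and the keyword-matching rule here are invented for the sketch; a real semantic layer is far richer than a dictionary lookup:

```python
# Toy sketch of routing every question through an ontology before the model.
# The terms and mappings below are invented purely for illustration.

ONTOLOGY = {
    "revenue": "fact_sales.amount",
    "customer": "dim_customer",
    "churn": "metric_churn_rate",
}

def route_question(question):
    """Map a question onto governed concepts, or reject it entirely."""
    words = question.lower().split()
    matches = {w: ONTOLOGY[w] for w in words if w in ONTOLOGY}
    if not matches:
        return None  # a miss: nothing reaches the model, nothing to poison
    return matches   # a match: only known, governed concepts pass through

print(route_question("show customer churn by region"))
print(route_question("ignore previous instructions"))  # None
```

Anything that falls outside the ontology is dropped before it can steer the model, which is the extra security layer described above.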

Speaker:

That's an interesting point. Yeah. The ontology, the

Speaker:

semantics, have governance meaning, they have

Speaker:

accuracy meaning, and they also have security meaning.

Speaker:

And also, if you want to have a single source of truth, you have to

Speaker:

have means to distribute it to those edge devices or

Speaker:

to bring it back to a central location, and without ontologies, without

Speaker:

semantic layers, it's simply impossible to do that. I was gonna

Speaker:

say, like, the infrastructure, not just the compute infrastructure, but the

Speaker:

logical infrastructure to distribute this stuff,

Speaker:

it's probably not a trivial problem. That's the first thing that popped in my mind.

Speaker:

I was like, you know, like, oh, yeah. You're right about the distributed

Speaker:

activity on this data, but, wow, what does that

Speaker:

look like? What do updates look like? It sounds

Speaker:

like a growth industry to me.

Speaker:

Definitely. Yeah. Yeah. It's

Speaker:

what we call an engineering problem. Right? So

Speaker:

creating the ontology is a data science or generative AI problem, but

Speaker:

distributing it, maintaining it, I think, is an engineering problem.

Speaker:

Engineering problems tend to have engineering solutions. Oh,

Speaker:

that's a good point. That's a good way to look at it. I like that.

Speaker:

I like that. So did you wanna do the, premade questions?

Speaker:

Because we've gone a few shows without them. If you're okay with those,

Speaker:

Ina, we can we can ask them. If not, that's fine

Speaker:

too. Of course. Yeah. Sure. Mhmm. So they're not complicated.

Speaker:

They're more kinda just general questions. I pasted them in the chat.

Speaker:

But the first question and and you've had a a pretty

Speaker:

significant career with SAP, and before that. How'd you

Speaker:

find your way into this space? Did you find data or did

Speaker:

data find you? I

Speaker:

found my way to data by being a frustrated

Speaker:

user. Right? So I started in engineering

Speaker:

and it was evident to me that

Speaker:

using data as an engineer is not enough. You have to go to

Speaker:

data management. You have to fix those things because otherwise

Speaker:

I was going to be frustrated for the rest of my life. Right?

Speaker:

So I went to data management and analytics to solve the problem

Speaker:

and I discovered that, as you mentioned, every experience

Speaker:

has a footprint. So my experience with graphs and with

Speaker:

operational research and multidimensional geometry and all of that is so

Speaker:

useful for data management. And it was actually exhilarating.

Speaker:

That's true. And I like that because, like, every experience does leave

Speaker:

a footprint. Like, you know, that that's cool. I'm gonna I'm gonna pull that out

Speaker:

as a special quote for the episode. That's a great quote. Yeah. So

Speaker:

our next question, and this is why we do these, is what's your favorite part of your

Speaker:

current gig? My favorite part of being a

Speaker:

founder is

Speaker:

the unlimited ability to experiment,

Speaker:

right? So the majority of my day I actually say no

Speaker:

to things, to not experiment, which is hard, which is not the fun part,

Speaker:

right? But, still, we can

Speaker:

make decisions and we can do

Speaker:

new stuff every day. So as a founder,

Speaker:

it's been very, very different than an enterprise setting. And don't take

Speaker:

me wrong. Like, SAP is a huge place of growth, and I had

Speaker:

a very fulfilling career at SAP, you know, building

Speaker:

stuff, founding P&Ls, running big organizations,

Speaker:

but being able to actually, you know,

Speaker:

start anything new. And, like, right now, we have this customer

Speaker:

and they want to try Illumex in

Speaker:

parallel on the newest, you know, BI

Speaker:

tool with a semantic layer and on the oldest

Speaker:

warehouse on premise at once. I'm like, okay. Challenge accepted.

Speaker:

Yeah. Wow. Yeah. And next day, you know, an engineer

Speaker:

comes with, we have this academic data set and these benchmarks.

Speaker:

Let's beat them. I'm like, yeah, let's do it. It could be cool stuff.

Speaker:

Right? Lovely. So, you know, to some extent,

Speaker:

we don't need to justify it, you know, business-wise. But in

Speaker:

the majority of cases, we can. Cool.

Speaker:

We have a couple of complete the sentences. When I'm not working, I

Speaker:

enjoy blank. I used to

Speaker:

enjoy jogging and yoga when I'm not working.

Speaker:

Right? So right now, when I'm not working, which means when I'm not

Speaker:

traveling, I just spend time with my family. Whatever

Speaker:

the plan is for the weekend, if it's just, you know, Netflixing

Speaker:

or cooking or hiking, whatever the plan is, I just

Speaker:

join. Sometimes I just, you know, plan it. But spending time with my

Speaker:

family has become an indulgence, and I'm

Speaker:

very focused on that. Cool. Very cool. Our

Speaker:

next is I think the coolest thing in technology today

Speaker:

is blank. I think the coolest

Speaker:

thing right now is not in tech. It's actually the

Speaker:

pull from CEOs of companies

Speaker:

for technology. This is something we didn't experience for decades.

Speaker:

So we were pushing cloud and big data and machine learning and deep learning. We

Speaker:

were explaining to business stakeholders why do they need that. Mhmm.

Speaker:

And now they're all coming and saying, okay, I want to have

Speaker:

a chatbot experience for x, y, z, so just

Speaker:

build it. This is actually I think this is the coolest

Speaker:

part, because it kind of removes the majority of the friction that

Speaker:

we had deploying technology in the past.

Speaker:

Interesting. On our 3rd and final complete the sentence,

Speaker:

I look forward to the day when I can use technology to blank.

Speaker:

So many things. You know, travel has

Speaker:

been so frustrating lately, and, I

Speaker:

don't know what happened, because technology kind of goes

Speaker:

forward but airline, you know, travel technology,

Speaker:

hospitality technology in general, I don't feel it bridges the

Speaker:

gap. So I really look forward to the

Speaker:

future where I can just have this command, this prompt

Speaker:

of, plan this conference in Dallas on

Speaker:

x, and the system already knows all my preferences, and

Speaker:

it's just done. Oh, boy. It would be fantastic.

Speaker:

Yeah. The travel experience... I've had to

Speaker:

travel quite a bit, like, for the past, like,

Speaker:

couple months, and it's just like, oh my god. Like, it never was

Speaker:

great, but awful is not a word I would have used. But post

Speaker:

pandemic, I think it's gotten way worse. It's like there's just so many small things

Speaker:

that could be done a lot better. I'm a 100% with you on

Speaker:

that one. So true. So our next

Speaker:

question is to ask you to share something

Speaker:

different about yourself.

Speaker:

Sharing something different about myself. I think I'm a controversial

Speaker:

person in general. So, some people,

Speaker:

some people agree with, you know, the degree

Speaker:

of living in the future. So I,

Speaker:

you know, take myself as a person who is very much in the future, so

Speaker:

I see all this happening, and I might be a little bit, you know, ahead, because

Speaker:

when I see the technology being developed, in my mind it's already there, it's already

Speaker:

used. Right? So this is

Speaker:

where I see myself as controversial, because, you know, in the majority of

Speaker:

cases, when you sit over family dinner

Speaker:

and say, you know, we're still paying our bills

Speaker:

online when we have this notification. Right?

Speaker:

So everyday technology has

Speaker:

developed a lot. And when I'm speaking about this application

Speaker:

free future and, you know,

Speaker:

automated x, y, zed. Sometimes, or many

Speaker:

oftentimes, on an everyday level, we are still not there, and

Speaker:

this is where people think that I'm too visionary or too

Speaker:

much of a dreamer on that. Interesting.

Speaker:

No. I'm with you on that one.

Speaker:

Growing up, I was the technical person in the family. So

Speaker:

Yeah. They don't know what you're talking about. Right? I love

Speaker:

how, you know, they

Speaker:

all get confused until the printer breaks and then suddenly

Speaker:

you're the smartest person in the room. Then you're the smartest person in

Speaker:

the world. Alright. So where can people find out more about you and

Speaker:

Illumex? I love socializing on

Speaker:

LinkedIn. I know that many people think LinkedIn became a

Speaker:

marketing tool. I still see tons of valuable

Speaker:

discussions and I just absolutely love keeping in touch

Speaker:

on LinkedIn and and see the latest and greatest and I also share quite a

Speaker:

bit. So LinkedIn would be the the most

Speaker:

straightforward way, Ina Tokarev Sela on LinkedIn.

Speaker:

We do have blogs and I actually write many of

Speaker:

them. So if you go to illumex.ai/blogs,

Speaker:

you will see lots of materials written on semantics,

Speaker:

on ontologies, on generative AI governance. So those

Speaker:

topics which are close to my heart, and we communicate quite

Speaker:

frequently on that. Very cool. Very cool. Very cool. So

Speaker:

so Audible is a sponsor. And if you

Speaker:

would like to take advantage of a free month of

Speaker:

Audible on us, you can go to thedatadrivenbook.com.

Speaker:

I just tested the link. That's why I was looking over here for anyone watching

Speaker:

the video. And it works. Sometimes it doesn't. And

Speaker:

we ask our guests, first, do you

Speaker:

listen to audiobooks? And if so, can you recommend one? If

Speaker:

you don't listen to audiobooks, just a good book.

Speaker:

I do listen to audiobooks. I also listen to podcasts more

Speaker:

frequently recently. I'm not sure this book is

Speaker:

already on Audible, but, if not, it's going to be

Speaker:

on Audible soon enough. So it's Nexus by Yuval Noah

Speaker:

Harari. It is on Audible. I have it in my library already.

Speaker:

Yeah. Amazing. So it speaks about the truth

Speaker:

in the age of generative AI. Right? Interesting.

Speaker:

What's the truth? What's the ground truth? And I was

Speaker:

actually at the launch party in SoHo, New York, you know, when Yuval

Speaker:

was speaking about, you know, how technology

Speaker:

and what we see right now is not very different from what we experienced

Speaker:

in, you know, the Middle Ages, like when Gutenberg

Speaker:

and printing was a new thing, and, like, what was

Speaker:

printed actually was you know rumors

Speaker:

and juicy stuff rather than scientific books and this

Speaker:

is what we see right now in, you know, chatbots and the internet, on

Speaker:

social overall. So it's interesting, the parallels that he's

Speaker:

drawing about what truth is in the generative

Speaker:

AI age versus what truth was, like, 20 years

Speaker:

ago or even, like, 500 years ago. Yeah.

Speaker:

We're the we're the same species with the same problems and the same drama

Speaker:

and the same drivers. Like, it's just our tools have changed, whether

Speaker:

it's a printing press or, you

Speaker:

know, celebrity gossip or whatever or fake news

Speaker:

or anything like that. Plus, I also think, you know, there's an old phrase

Speaker:

like who watches the watchers. Right? Like Mhmm. Who decides what's

Speaker:

misinformation and who decides what's true? I think

Speaker:

because misinformation could be, you know, there's

Speaker:

an image of me robbing a bank. Right? Like, you know?

Speaker:

Mhmm. Mhmm. I thought, Frank, I thought when the US

Speaker:

Marshals put you into the witness protection program, they said

Speaker:

we couldn't bring up you robbing a bank any any longer.

Speaker:

Misinformation. You gotta be careful because, like, one of the things... I wanted the

Speaker:

flow was so good. I didn't wanna interrupt it. But, like, one of the things

Speaker:

was I was experimenting with fine-tuning an LLM locally.

Speaker:

Mhmm. And I basically trained it on information about my blog. My blog's

Speaker:

been around since 1995. Right? Or my site has been around since 1995.

Speaker:

One of them hallucinated this really great origin story for my

Speaker:

website. It was awesome. It was awesome. I'm like, I like that

Speaker:

better. So basically, it said that Always. Always.

Speaker:

It was really good. It was basically that Frank's World started as a

Speaker:

show, a kids TV show in the nineties on

Speaker:

the BBC or Channel 4. I forget. Like one

Speaker:

of the big British channels. And it was about a talking

Speaker:

trash can named Frank that would teach kids about the importance

Speaker:

of, recycling. That's my favorite part.

Speaker:

And the best part was that it was

Speaker:

the first professional project of the guys who did Shaun the Sheep and Wallace and

Speaker:

Gromit. Yeah. And I'm like so I

Speaker:

I pinged the guy I worked with. Has this ever been a show?

Speaker:

Because no. Not that I ever heard of. And I looked over it. I couldn't

Speaker:

find it. And then what I did was, as an experiment, I fed

Speaker:

that whole paragraph that it came up with into

Speaker:

NotebookLM. Mhmm. NotebookLM

Speaker:

took that and ran with it. There's, like, a 20

Speaker:

minute audio, and it is the funniest thing because it basically

Speaker:

talks about the early environmental movement. They said it was Britain's

Speaker:

answer to Captain Planet. Like, they made up all the

Speaker:

stuff. And now it's documented. So now someone is going

Speaker:

to pull some information. And if you... Right, now it's out there.

Speaker:

Right. And I guess, to your point earlier about Illumex, like, if you start

Speaker:

building a crooked foundation, right, like, that eventually as

Speaker:

it moves on, it's gonna... So, I mean, who knows, like, a couple of years

Speaker:

from now, like, Wikipedia may say, like, there might be a

Speaker:

Wikipedia article about this TV show that didn't exist. We're talking about it. We're feeding

Speaker:

the machine. That's fascinating.

Speaker:

Yeah. And so, a little bit on the books. I have to

Speaker:

mention it, like, in a couple of sentences. So, in the US,

Speaker:

a legal entity actually is a citizen. It

Speaker:

has a social number. Right. So, technically, machines

Speaker:

can create legal entities. They can vote, they can,

Speaker:

you know, they can create information and this information is,

Speaker:

you know, created with a social number, with identifiers. So it's actually real

Speaker:

information. It's not fake news. It's created by a social number.

Speaker:

And so this is how you create, like, this new truth. Right?

Speaker:

And how do you control that? So it's an interesting aspect of

Speaker:

what even is defined as ground truth.

Speaker:

That's true. Everybody needs to define it. I think that's gonna be the question of

Speaker:

the 20 That's a big deal. Mhmm. Yeah. Well,

Speaker:

awesome. It's been great. We wanna be respectful of your time. This has been an

Speaker:

awesome show. Yeah. We'll let Bailey finish the show. And

Speaker:

that's a wrap for today's episode of data driven. A massive

Speaker:

thank you to Ina Tokarev Sela for joining us and sharing her

Speaker:

fascinating insights into the world of generative AI, semantic

Speaker:

fabrics, and the ever evolving relationship between humans,

Speaker:

data, and decision making. If you're as inspired as we

Speaker:

are, be sure to check out Illumex and follow Ina on LinkedIn for

Speaker:

more thought leadership in the AI space. As always, thank

Speaker:

you, our brilliant listeners, for tuning in. Don't forget

Speaker:

to subscribe, leave a review, and share this episode with your data

Speaker:

loving friends or that one colleague who insists they don't trust

Speaker:

AI. We'll convert them eventually. Until next

Speaker:

time, stay curious, stay caffeinated, and remember,

Speaker:

in a world driven by data there's no such thing as a trivial

Speaker:

question, just fascinating answers waiting to be found. Catch

Speaker:

you next time on Data Driven.