Speaker:

Hello, listeners. And welcome back to another thrilling episode

Speaker:

of data driven. In today's episode, we delve deep into the

Speaker:

fascinating and, let's be honest, slightly terrifying world of

Speaker:

generative AI and security risks. Joining us is

Speaker:

Niamh Braun, co founder and CEO of Noma Security,

Speaker:

who's on the front lines of keeping your AI driven project safe from

Speaker:

digital mischief. So grab a cuppa and let's get data

Speaker:

driven. Well, hello, and welcome back to Data Driven, the podcast where we explore

Speaker:

the emergent fields of AI, data science, and, of course, data

Speaker:

engineering. Speaking of data engineering, my favoritest data

Speaker:

engineer in the world can't make it, today. But we

Speaker:

have an exciting, conversation queued up with Niv Braun,

Speaker:

who is the cofounder and CEO of Noma. Noma

Speaker:

is a security firm that focuses on effectively

Speaker:

he'll describe it more eloquently than I can, but effectively thinks about

Speaker:

security in the context of data and AI across the

Speaker:

entire life cycle. Welcome to the show, Niv. Hey,

Speaker:

Frank. Happy to hear you, bro. Yeah. It's good to have

Speaker:

you. And and security is one of those things where I've been thinking about more

Speaker:

lately. Right? So my background was a software engineer and,

Speaker:

you know, software engineers historically have not thought of

Speaker:

security. Then I made the transition into data engineering and data

Speaker:

science, and, traditionally, security is not really at top

Speaker:

of mind, for them either. Now I

Speaker:

kinda look at this, and I kinda look at the landscape that we're in where

Speaker:

enterprises are deploying LLMs,

Speaker:

generative AI solutions, on top of the predictive AI solutions,

Speaker:

fast and furiously, and not thinking about

Speaker:

security ramifications. So what are your what's your take on

Speaker:

that? 100% agree. I think that, it's

Speaker:

even like the the the the current, like, timing is even more fascinating

Speaker:

than the than just, like, a new technology. Because exactly like you said, like,

Speaker:

Frank, like, we all like the data practitioners. We all know that, like, security is

Speaker:

not, like, our top priority. And by the way, like, by, like, like, this is,

Speaker:

like, how it should be. Like, we are focusing on the business and, like, drive,

Speaker:

like, drive, like, the business forward. And this is why we're, like, this is

Speaker:

what we're paid for. The problem is that

Speaker:

because we're not, like, in this kind of, like, mindset, we also, like, like

Speaker:

any technologies in the company, also, like, create some risk. What we see right

Speaker:

now is the LLM drive, which is pretty cool, is that for the

Speaker:

first time, the security teams started to put

Speaker:

the focus and, like, the spotlight on the data and AI teams. Because until

Speaker:

now, let's be honest, they were focusing only on the software developers and

Speaker:

their SDLC and the CICD and all these areas. Like, we were,

Speaker:

like, you know, like, in the shadow. And we were, like, able, like, to act

Speaker:

like exactly like, like, like, completely freely as we wanted.

Speaker:

But now when, like, the security team start, like, to put the spotlight on the

Speaker:

data and AI teams, what they understand is that it's not

Speaker:

only this new kind of LLM threats, but also all

Speaker:

the basic principles of security are not implemented

Speaker:

in the data engineers and the data science teams. Nobody, like, scans all the

Speaker:

code in our notebooks, for example, unlike the software developers that, like, all

Speaker:

their code is being scanned. Nobody helps us to

Speaker:

find configurations in our data pipelines or our

Speaker:

MLOps tools or our AI platforms, like Databricks, for example.

Speaker:

Like, nobody, like, provide us this ability to to find it easily,

Speaker:

unlike, again, the software developers that they receive all this coverage

Speaker:

and everything. Like, on the moment that they have, like, the smallest misconfigurations

Speaker:

in their SCM or their their CICD, they

Speaker:

will immediately, like, receive, like, a notification, like,

Speaker:

helping them exactly, like, how to secure it. And also eventually,

Speaker:

like, in the run time, in the runtime, in software life cycle, in

Speaker:

classic like software application, we also have a lot of API security and web

Speaker:

application firewalls tools that help us to protect the application in the

Speaker:

runtime. But now specifically in LLM, this is, like, very

Speaker:

related also, like, to what you said. Like, there are new kind of adversarial attacks,

Speaker:

all the prompt injection and model jailbreak and stuff like that.

Speaker:

And, again, nobody, like, else would like to protect it, like, in real time. And

Speaker:

I think that this is, like, one of, like, the main shift that we see

Speaker:

today in this area. We understand that the spotlight

Speaker:

moved to the data and AI teams, but we need to make sure that we

Speaker:

do, like, both. Like, we start with, like, a new kind, like,

Speaker:

trendy, like, risk that we want to make sure that we are protected from.

Speaker:

But also that for the first time, after a lot of years, we're

Speaker:

starting also, like, to implement the basic security measurements

Speaker:

needed in our area. But the most important thing, of course,

Speaker:

is to continue and, like, do it without slowing us down. Like, we need to

Speaker:

make sure that, like, everything, like, all the different, like, security measurements that

Speaker:

we take still provide us the ability to move fast, to enable

Speaker:

the data sent the data science and the data engineering teams to

Speaker:

continue and, like, innovate, but in a secure way.

Speaker:

You know, that's a good point because I never thought about scanning a notebook for

Speaker:

errors. Right? Shame on me. Right? Like for code

Speaker:

security I mean, not errors, but, you know, security vulnerabilities. That's not something

Speaker:

that I have seen done in practice. I mean, the the

Speaker:

closest I've seen where security has been an issue for

Speaker:

anyone in this space is,

Speaker:

basically using protected, you know, Python

Speaker:

libraries, right, or or Python library repos, right, where they're those

Speaker:

are scanned by, I forget the name of the 3rd party that'll do it where

Speaker:

you just basically say you point your Python instance to there. Yeah. Because

Speaker:

I also think that Internal Artifactory. Yes, exactly. So

Speaker:

like, what, because I often

Speaker:

wonder, you know, people just like to install.

Speaker:

God only knows what's in there. I can tell that, like, it already, like, happens.

Speaker:

Like, I don't know if you heard, but for example, like, like,

Speaker:

like, pretty recently, PyTorch, for example. Right.

Speaker:

PyTorch that we all know was compromised. We all know and love. We're most people

Speaker:

love. It was compromised. Like, specific version of PyTorch, a

Speaker:

malicious actor succeeded to to put some

Speaker:

code inside that basically,

Speaker:

collected all the the the secrets and token that you have in the

Speaker:

environment and sent it to DNS. Now we all

Speaker:

know, like, how much like like, how many downloads, like, PyTorch have.

Speaker:

And most times, where PyTorch is downloaded to through, like, to

Speaker:

all these different, like, notebooks, wherever they be, JupyterOps,

Speaker:

SageMaker, Databricks, like, we all use them.

Speaker:

And it I can tell that, like, it caused us to a lot of, like,

Speaker:

problem. I can tell, like, like, like, firsthand, like, we saw, like, a lot

Speaker:

of organizations that were compromised because of this attack.

Speaker:

And it happens all the time. And by the way, if you mentioned, for example,

Speaker:

like, if you already, like, touched the point of, of open source,

Speaker:

now you have also Hugging Face, which is completely different area. Now it's

Speaker:

not only Open Source packages. It's all these different Open Source

Speaker:

Hugging Face models and Hugging Face datasets. And there,

Speaker:

all these internal artifact are completely useless because they don't even

Speaker:

scan these models. It's completely different technology, completely different, like,

Speaker:

heuristics in order to find it. And, therefore, you start to

Speaker:

see kind of, like, trends for for the attackers. They started to

Speaker:

upload a lot of backdoored and a lot of malicious models

Speaker:

into Hugging Face. I can tell you, like, we personally, we already,

Speaker:

like, detected, I think, almost, like, 100, back

Speaker:

or the malicious models, on Hugging Face because it's a wild

Speaker:

west. Right. Because how do you because these these models, first off,

Speaker:

they're physically large files. Right? So that there's that's a factor.

Speaker:

Right? I don't know how Hugging Face makes money. I'd be

Speaker:

curious to have someone on the show talk about that. But, you know,

Speaker:

they're doing the service. And, how would you even scan? I

Speaker:

mean, that's a good question. Right? What types of vulnerabilities have you sent have you

Speaker:

found so far? And how does one even scan, like, a safe

Speaker:

tensor or g file? Like, how do you what's what's

Speaker:

that look like? Right? Obviously, I'm pretty sure, you know, McAfee

Speaker:

antivirus doesn't have a thing for that. But, like Exactly.

Speaker:

But, how do you even do that? I'm just curious. Yeah. So this is, like,

Speaker:

exactly, like, the problem. Like, it's even, like, in in in the models, like, it's

Speaker:

even, like, a a more, like, the the risk there, like, is

Speaker:

more, like, clearer because as you know, a lot of time, like,

Speaker:

these models in hanging face are even, like, in pickle. And, like, pickle is, like,

Speaker:

by design, like, insecure, like, file. And so

Speaker:

binary dump, right, of, like, the memory space. Yeah. Like, in the deserialization

Speaker:

process, like, basically, you can, like, put, like, any kind of, like, malicious,

Speaker:

action that you'd like, that, like, the attacker can. So we see,

Speaker:

like, different attacks. Like, most of the attacks come today, like, from pickle files.

Speaker:

Some also, like, not even, like, in the deserialization process, but also, like, in the

Speaker:

model code itself. For example, like, if you ask

Speaker:

for a specific example, like, share something that we

Speaker:

detected, like, recently. We found, like, a very,

Speaker:

let's say, a popular, open source, LLA model that we all

Speaker:

know. But we know that, like, a it has a lot of, like, different

Speaker:

versions. And one of the version was actually a docker

Speaker:

that took the original model, wrapped it up with few

Speaker:

lines of code in the model, which what they did is that every

Speaker:

input to the model and every output from the model

Speaker:

was also sent to the attacker, which basically

Speaker:

just received full visibility and observability to all the

Speaker:

runtime application and production. So, like, all the organizations that,

Speaker:

like, use this model. And performance wise, the

Speaker:

data scientist, of course, they cannot, like, detect it because performance

Speaker:

wise, it worked perfectly because it took the original model. So nothing to be

Speaker:

suspicious about. If we want the data

Speaker:

scientist, every new open source model that they like, like

Speaker:

in Hugging Face, they'll start, like, to open, like, these files and the binaries and,

Speaker:

like, to start, like, to looking, like, in their own hands, they're manually

Speaker:

for, like, a for a for risk. First, like, of course,

Speaker:

like, we understand that this is not their expertise and, like, it it

Speaker:

like, we want to be secured, but, like, like, even, like, worse,

Speaker:

we just spend all their time on security. And I think that

Speaker:

this is, like, the worst stuff. Actually, it's not the worst. I think that, like,

Speaker:

the worst, and this is also, like, something that, like, I saw recently in several

Speaker:

organizations is just, like, to block everything. Organizations

Speaker:

that, like, understand, okay, Hugging Face model, it's, like, true, like, a secure, like,

Speaker:

in secure area. Let's block it. Let's say, like, to

Speaker:

all the data scientists in the organization, you're disallowed to use HAG interface model. I

Speaker:

think this is, like, the worst. That seems like a mistake because

Speaker:

because the people are gonna find a way. Well, 1, where you can't stop the

Speaker:

signal. Right? That was a line from, a movie.

Speaker:

They can't, kudos if people know who that what movie that is.

Speaker:

But, you know, if you block Huggy Face, people are gonna find a way

Speaker:

around that. They're gonna put it on a thumb drive at

Speaker:

home and then bring it in. So percent. This is, by the way, also, like,

Speaker:

what you see, like, with this kind of, like, internal Artifactory. You see that, like,

Speaker:

once you get to you you create for the r and d or create for

Speaker:

the developers or for the data scientists, you create some level of, like,

Speaker:

friction. They will just find a way out to, like, bypass

Speaker:

it and to to lower this, this friction.

Speaker:

Right. So so couple of questions.

Speaker:

One, I've seen, improper naming

Speaker:

Not improper naming, but but basically using, names,

Speaker:

like, that's looks similar to what should be. Yeah. Type will split. Type

Speaker:

type splitting. That's it. I've seen that, which is kind of, I guess,

Speaker:

kind of, you know, dollar store approach. But also,

Speaker:

how does how does it if you wanted to look through these model files, as

Speaker:

far as I know, they're just I just looked at them. I just see binary

Speaker:

stuff. Like, how would you look for malicious code in there? Because I think you're

Speaker:

right. That's not a skill set the average AI engineer or data scientist

Speaker:

would have. Yeah. So, basically, like, you need, like, to manually kind of, like,

Speaker:

parsing it because, like, you have, of course, like, the the binary file, but most

Speaker:

times, it's not only, like, the binary file. You label for, like, the the code

Speaker:

file that run, like, run the model, and you label for, like, the, in

Speaker:

case it's, like, pick a, like, the deserialization process, that you can, like,

Speaker:

parse and then, like, to see, like, the code there. But then you

Speaker:

need also, like, you know, like, you have, like, 2 phase. 1st, you need to

Speaker:

to parse it, you know, like, to see, like, the code, but then you need

Speaker:

also, like, to be able to read code and to understand which

Speaker:

one is valid and which one is malicious, which is also, like, completely, like, you

Speaker:

know, like, you need expertise in this area. If you see bash

Speaker:

commands, is it okay or not? Do you see access to the

Speaker:

Internet? Okay or not? Like, you you need, like, to have, like,

Speaker:

some, like, detectors in there that, that know how to do it, like, build

Speaker:

by by expert or something. So how would you even detect

Speaker:

that if you found it? Like, how was this found? Was this just somebody looking

Speaker:

in network packets? Or, like, what how was it discovered? I'm

Speaker:

just curious. Yeah. This specifically was, like, by our

Speaker:

security research team. Okay. Yeah. That's like, looks a

Speaker:

lot if, a lot like all the time, like, you know, all these different kind

Speaker:

of, like, open source and third party models in order to to help

Speaker:

our users to make sure that, like, everything that they use

Speaker:

is is valid. And again, most importantly, without slowing

Speaker:

them down. They can just, like, download and, like, run, like, with everything that they

Speaker:

that they want. And in case, we see something that is,

Speaker:

that is suspicious, we know how to detect it and to to help them to

Speaker:

to secure it. Interesting. Interesting.

Speaker:

Because I know a lot of people, you know, they they've been downloading

Speaker:

these models from Hugging Face. And just taking it on

Speaker:

faith, and I've heard that these things don't call

Speaker:

out to the Internet. Mhmm. And I fell into that. And then

Speaker:

I kinda had this moment of paranoia where I'm like, how do I know?

Speaker:

I mean, the only way I'm a I'm just a humble data scientist. Right? Like,

Speaker:

so the only way I would think about it would be to have a firewall

Speaker:

rule that would block network traffic going up for that box.

Speaker:

And I'm sure there's probably workarounds to that too. I mean, are these

Speaker:

attacks are these attacks that sophisticated yet?

Speaker:

Yeah. Yeah. And, like, also, like, most times you don't, like, the data

Speaker:

science, like, they don't want, like, to permanently, like, to close, like, the Internet, like,

Speaker:

the outbound because also, like, the application needs it. And also, like, the, you

Speaker:

know, like, the the in order, like, to download, like, the dependencies and the models

Speaker:

you needed. So most times, like, just, like, to block the Internet, it doesn't solve

Speaker:

everything. It was, like, more, like, in the past that everything was, like, network based

Speaker:

only. Today, when you have, like, also, like, the applicative layer here, so

Speaker:

it's, like, a bit more sophisticated.

Speaker:

But yeah. Wow. So

Speaker:

the safe tensor format, as I understand it, what you

Speaker:

know, you basically digitally sign or somebody

Speaker:

digitally signs the contents of it. Is that is

Speaker:

that a correct understanding? Yeah. So it's end up like a

Speaker:

like, in general, first thing, of course, that, like, a safe denture is, like,

Speaker:

much more secure. Okay. I already like by design, and as long

Speaker:

as we as the industry will go, like, more and more, like, towards

Speaker:

this road, because today, like, we still see, like, tons of light pickles.

Speaker:

But as long as we progress, like, all as an industry, we'll already,

Speaker:

like, be, like, in a bit better situation. It's not

Speaker:

perfect, of course. We still see some issues. And, of course, organizations still

Speaker:

need, like, to have some security measurements and processes

Speaker:

to make sure that, like, they're aware of what,

Speaker:

like, Hang in Face are using. But I think that it's already, like,

Speaker:

going to be a bit better. I can tell you something that, actually,

Speaker:

like, recently one of our one of our partners told me,

Speaker:

which was pretty cool, very similar to what you said that you

Speaker:

start, like, to feel a lot of concerns about this area.

Speaker:

VP data science of a very big like,

Speaker:

Fortune Fortune 500, like, very big, like, corporate. And you kind

Speaker:

of, like, the the head of, like, the older data science, like, groups here. And

Speaker:

they told me, you know, Niv, I I already know

Speaker:

what I'm going to be fired about, like, in a in the next, like,

Speaker:

24 months, and it's going to be about that. I know for sure, like, we're

Speaker:

using, like, so much, like, Agiface models. I know for sure that I'm this is,

Speaker:

like, the reason that I'm going to be fired, like, one day. Because today, like,

Speaker:

we're using it, like, freely. We are also, like, very creative. We're not, like,

Speaker:

only using, like, the most popular LAMA model, but, like, we're to,

Speaker:

like, take advantage of this great advantage of the platform, which is, like,

Speaker:

the amount and the diversity of the model that you have there. But I have

Speaker:

no no doubt that we create so many risks that we're just,

Speaker:

like, not exposed yet, that I'm going to to pay with it,

Speaker:

like, with my head. So it it's

Speaker:

it's pretty cool because it's not it's not always that you see,

Speaker:

r and d and business owners that are so concerned

Speaker:

about security even before the security team arrived

Speaker:

to them. But they're already aware of this risk. And it's something that

Speaker:

we start, like, to see more and more because, you know, it's just like it's

Speaker:

it's too obvious. Like, the the the window is open and everybody see it.

Speaker:

Yeah. I I would suppose that's in in a in a very kinda strange way

Speaker:

that's bit progress, right, where people think about security beforehand.

Speaker:

Like, even if they don't know I mean, I think this this this VP,

Speaker:

you know, is pretty spot on. Like, what concerns me about the widespread

Speaker:

adoption of these models and particularly Hugging Face, so there are no knock on Hugging

Speaker:

Face. I think whatever you get your models Mhmm.

Speaker:

I mean, we just don't know. And these things are just complicated.

Speaker:

Right? I mean, they are by design complicated with 1,000,000,000,000 of

Speaker:

parameters. In some cases, I guess, 1,000,000,000,000. But also, you

Speaker:

know, they have this ability to even

Speaker:

even if everything worked out well, even even assuming everything is

Speaker:

fine, right, in terms of the operationalization of these things,

Speaker:

There's still the chance that the model itself and its

Speaker:

training was poisoned. So, like,

Speaker:

I I mean, like, there's just so many because when I my wife works in

Speaker:

IT security, and I was all excited. It was about a year and a half

Speaker:

ago. I I was talking to her about LLMs and stuff like that

Speaker:

and chat GPT and and and those types of things.

Speaker:

And I was like, oh, well, you take all this data and you train a

Speaker:

model and you you distill down this graph and this and this. And then she's

Speaker:

like, that sounds like a big attack surface to me. Yeah.

Speaker:

And I was like like, data poisoning in the classic one and data

Speaker:

poisoning can be, like, in in in 2 levels or, like like, someone

Speaker:

like poisoning your data or exactly what you say,

Speaker:

somebody just, like, this way, like,

Speaker:

create backdoor in, in third party models and open source

Speaker:

models that then, like, everybody downloads. Right.

Speaker:

Right. And we wouldn't know, like, what's

Speaker:

the I mean, the defense against that seems very

Speaker:

intricate. Not impossible, but very delicate and intricate.

Speaker:

So in in in classic application security, there is a

Speaker:

great practice called SBOM. SBOM is a software

Speaker:

billing of material. Basically, it means that, you get, like, in

Speaker:

specific format, kind of like visibility to all the different

Speaker:

software components that build your application. One of the things that

Speaker:

now we're also, like, part of the building is a

Speaker:

official framework of OWASP, the nonprofit organization

Speaker:

around security of AI and machine learning. And

Speaker:

what you have there is for the first time you have like double layer

Speaker:

of visibility. The first one is just like to understand

Speaker:

what models I'm even using in the organization. Everything, like

Speaker:

what models like, include in my application. It can be open

Speaker:

source models. It can be self developed models. Also, by the way, not only not

Speaker:

only LLM, of course, also like vision, NLP, like everything else.

Speaker:

And also third party models that are embedded as part of the application, they

Speaker:

are not open no. They are not open source. For example,

Speaker:

if software engineer add API call as part of the application

Speaker:

to OpenAI, in this way, they embed

Speaker:

LLM as part of the application. This is also like one of, like, the models

Speaker:

that you are using, but you you you want to know this is all my

Speaker:

AI and model inventory that I'm using as Spyro as part of the application.

Speaker:

And in addition to that, you have even the deeper context there, which

Speaker:

is also like what you referred to. It's not only this is

Speaker:

the list of the model that I'm using, but for each one, you want to

Speaker:

understand on what dataset it was trained, what data

Speaker:

maybe also like it has access to in case it's in production, let's say, with

Speaker:

RAG architecture. You want to understand, like, the deep context

Speaker:

of all these, like, models, what I'm using, but also, like, what

Speaker:

happens, like, in this specific, like, model. Sometimes

Speaker:

it's, as you said, for to to understand what data was trained on a

Speaker:

model before, like, I'm starting, like, to use it by 3rd party, a

Speaker:

lot of time is even, like, internally in the organization.

Speaker:

Because once we start to train a lot of models,

Speaker:

we want to make sure that we don't violate

Speaker:

any policy that we have in the organization, either it's for compliance or

Speaker:

security. For example, one of the things that, like, we are like, I keep, like,

Speaker:

hearing a lot of time from, from security and legal and privacy

Speaker:

teams is that, look, we instruct all the

Speaker:

organization not to train any sensitive

Speaker:

data, PII, PCI, PHI, any other sensitive

Speaker:

information on our models. But except instructing

Speaker:

it and speak about it, nobody knows if it

Speaker:

happens. And we don't provide also our data

Speaker:

teams tools that will help them to

Speaker:

detect it in case it, like, it happens, like like, not in purpose. For

Speaker:

example, I can tell you, like, one of the thing that we saw very

Speaker:

recently. Big organization, a huge Fintech company,

Speaker:

that data scientist unintentionally trained all the

Speaker:

transaction of the application on one of the models. Now it's

Speaker:

a, like, crazy big violation there of, like, compliance and

Speaker:

security. The data scientist did this unintentionally. They

Speaker:

truly, like, didn't know it. If they had something that, like, would help them, like,

Speaker:

the basic visibility that you mentioned before, it will truly, like, help them to

Speaker:

start, like, to continue, like, innovate and just, like, in case something like bad happens,

Speaker:

to be alerted in that. And so I see that, like, the the data training

Speaker:

is also, like, very, very important point also internally and not

Speaker:

only the external data train on the external models that we're embedding and

Speaker:

downloading. So you mentioned, OWASP. So just

Speaker:

for the benefit of folks who may not know, because most of our listeners are

Speaker:

either data engineers, data scientists. What is OWASP? And what is the

Speaker:

I think it's with the OWASP 10? Yeah. So

Speaker:

OWASP in general, it's a amazing organization that,

Speaker:

is like a nonprofit one that helps basically,

Speaker:

we combine a lot of people together, gather together in order to make

Speaker:

sure that all our industry is much more secured with a lot of

Speaker:

different security initiatives in a lot of different aspects, mainly of like product

Speaker:

security, but not only. Product security is like application

Speaker:

security. It's building security.

Speaker:

Specifically in OASP, you have several different types of

Speaker:

projects. So for example, one type of project is the OSP10,

Speaker:

top ten, that basically takes different areas

Speaker:

and define the top ten risks in this specific area. So it can be top

Speaker:

ten for API, top ten for

Speaker:

CICD. And now there is also like top ten for LLM.

Speaker:

Addition framework, like, there are a lot of like different tools. Specifically,

Speaker:

if someone wants to understand a bit more about like the wide

Speaker:

landscape and the risk around AI and machine learning,

Speaker:

the framework that I would like recommend on, highly recommend on, is

Speaker:

amazing and very comprehensive called the OWASP AI

Speaker:

Exchange. A group of people, again, gathered together,

Speaker:

that covered not only LLM, but all the basic

Speaker:

principles and risk in data pipelines and MLOps

Speaker:

and start from the building and up to the runtime and start from the

Speaker:

classic machine learning and up to Gen AI, very comprehensive,

Speaker:

very also practical, which is very important and

Speaker:

speaks in both language, on both languages. On one hand,

Speaker:

of course, security, but on the other, also like very oriented

Speaker:

for data and machine learning and AI practitioners.

Speaker:

Interesting. Interesting.

Speaker:

What what do you see

Speaker:

well, here's what I mean, I'll have a lot of questions, but one of them

Speaker:

is, do you think the 0 what do you think the

Speaker:

0 trust approach is a good starting point? I don't think

Speaker:

it's the answer here like it is kinda everywhere else. But do you think that,

Speaker:

that type of philosophy of don't trust anything?

Speaker:

Right? Kind of like, I mean, is that because you you mentioned this

Speaker:

early when I talked about network firewalls, right, where the old approach of thing

Speaker:

is just pull the plug or set up rules. And that used

Speaker:

to work, but there's plenty of other ways around it, Both I think

Speaker:

kind of low skill, mid skill, and certainly high skill

Speaker:

ways around that. What do you you mean then 0

Speaker:

trust is meant to address that. What are your thoughts on like I

Speaker:

mean, is that the pro is that the mindset that either

Speaker:

security folks in this space would have to take on? Like, it's more

Speaker:

if they well, they probably already have. Right? Yeah. I think you're,

Speaker:

like, I think you're actually, like, the the you you you perfectly

Speaker:

defined it because I believe that 0 Trust is exactly like you say, it's kind

Speaker:

of like a, like, kind of like a mindset. It's not like a very

Speaker:

accurate, like, technical approach, but it's kind of

Speaker:

like more like a a philosophy with some level of implementation.

Speaker:

I believe that, like, the right mindset and, like, the right framework to look

Speaker:

on a on a security for AI and, like, all the building

Speaker:

and also, like, the runtime is basically to take all the

Speaker:

different principles that we are all already aware

Speaker:

of. Like we are all, like I'm saying, like the security industry,

Speaker:

we are all already aware of on classic software development,

Speaker:

building and runtime, and to implement it on the

Speaker:

data and AI lifecycle. For example, if we mentioned, like,

Speaker:

code scanning, so code scanning the notebooks, we mentioned open source,

Speaker:

so checking all the all the Ag interface models. But it's not only that.

Speaker:

For example, one of the things that, like, we see, a lot of attacks that

Speaker:

we, like, we had recently in the security area are around the

Speaker:

CICD. A few years ago, there was a big attack called

Speaker:

SolarWinds, that basically, yeah, so you know it

Speaker:

perfectly, just for the audience that, like, are not familiar with the specific details

Speaker:

in, like, very high level attacker that exploited and

Speaker:

misconfigurations in CICD tools. And this is

Speaker:

basically how they succeeded, like, to start, like, this whole huge attack and

Speaker:

breach. Now one of the things that, like, it taught us all as an industry

Speaker:

is that until now we were focusing on, like, securing only

Speaker:

our code. Now we understand that the code is not enough. We need to make

Speaker:

sure that the building tools are also well configured. So

Speaker:

we start, like, to see a lot of, like, tools that help us to make

Speaker:

sure that we don't have misconfigurations in the CICD and the SCMs and all

Speaker:

these different kind of tools. But when we are going to our domain, when

Speaker:

we go to the data and AI teams, as we know, we just use different

Speaker:

stack. We use all these data pipelines and model

Speaker:

registries and MLOps tools and platforms like Databricks and Domino

Speaker:

and Snowflake and stuff like that. The configuration, as we know, is

Speaker:

not like neverwhere. Most time, it's even wider. This is why it's

Speaker:

not managed by DevOps. It's managed by us, by the data teams. It's managed by

Speaker:

MLOps teams, by data infra, by data platform. And we're doing a

Speaker:

lot like, a great job in order to optimize all the configuration for the

Speaker:

product. We're not security experts. We don't want to be

Speaker:

security experts and, like, start, like, to spend a lot of time in that. But

Speaker:

nobody else just like to very easily find all these different kind of misconfigurations.

Speaker:

And this is also a threat and, like, attack vector that we

Speaker:

started, like, to see a lot in the field today. I can tell you that,

Speaker:

like, we see tons of attacks around

Speaker:

different misconfigurations in tools like Airflows and Databricks

Speaker:

and stuff like that. And I think this is also like a very, very important,

Speaker:

like, mindset, like, to be in. And in addition to that, of course, we have

Speaker:

all the all the runtime and all the adversarial attacks there.

Speaker:

There are specifically, if I mentioned in the

Speaker:

OSPI exchange, so OSPI exchange covers everything.

Speaker:

The OSPI 10LLM specifically is more

Speaker:

covering this LLM, like,

Speaker:

specific risk. And then you have, like, all the adversarial attacks, like prompt injection

Speaker:

and model jailbreak and model dn out of service, model dn out of wallet,

Speaker:

etcetera. So basically, the mindset should

Speaker:

be we already know security very well. We already have, like, these

Speaker:

principles. Until now, we just haven't

Speaker:

implemented them on the data and AI teams,

Speaker:

tools, and technology. And this is exactly what we start, like, to

Speaker:

what we, like, need, like, to start to do. And this is what we see

Speaker:

also that, like, you know, like, now we have no reason. Like, we all see,

Speaker:

like, these different kind of attacks. So we start to see that all the organizations

Speaker:

were, like, starting to to already, like, walk the walk.

Speaker:

Wow. Yeah. I I often wonder too, like, what you

Speaker:

mentioned the pipelines being a vulnerability or an

Speaker:

attack surface. Right? Like, or a potential vulnerability.

Speaker:

I often wonder now, like, when, you know, we're looking at agentic

Speaker:

AI, right, where these things aren't just LLMs,

Speaker:

right, producing text or going through these materials.

Speaker:

We're giving them, you know, abilities,

Speaker:

right, to influence pipelines, right, to to or to

Speaker:

whatever. Right? Like, that just seems to me like a giant

Speaker:

security risk. I mean, telling someone you know, there's there's multiple ways to

Speaker:

break an LOM. Right? Like, obviously, there's the the the $1 Chevy

Speaker:

Tahoe. Right? Where the guy did that. Right? Pretty low tech

Speaker:

approach, pretty brute force ish.

Speaker:

But I often wonder, like, well, what

Speaker:

what sorts of things are agentic systems gonna open up?

Speaker:

Like, what does that look like? I think that this is exactly like where we

Speaker:

we will start, like, to see, like, the very big LLM,

Speaker:

breaches, that we'll have. I believe that, by the

Speaker:

way, my belief is that the the how does the

Speaker:

attack start will still be, like, in a lot of cases,

Speaker:

very similar to what we see today. But the impact of the

Speaker:

attack will be much, much, much, much, much higher because now like the

Speaker:

model cannot only like, promise you a

Speaker:

$1 a car, but you can throw, like, I already like

Speaker:

send the order, can send the car to you, can like book your

Speaker:

hotel, can do like everything there, can share with you, like, the data

Speaker:

of maybe, like, other customers in the application because it is,

Speaker:

like, a RAG architecture, and it is also, like, different, like, tools

Speaker:

that provide him the ability to maybe even, like, write different codes

Speaker:

to the application. And then it might also like start like different types

Speaker:

of remote code execution. As long as we are going to

Speaker:

provide to these NLMs more privilege, more access,

Speaker:

more tools, more abilities, the impact of the risk

Speaker:

that they will be able, like, to cause will be much higher. I still

Speaker:

believe again that that pack vectors are going to start from more or less,

Speaker:

like, the same areas, like prompt injection and model jailbreak,

Speaker:

but they they eventually, like, the outcome of these attacks will be much

Speaker:

higher. I could see that. Because we're giving them

Speaker:

actuators, so to speak. Right? Like we're not we're we're

Speaker:

giving them agency. Right? Like where they could actually do real damage as

Speaker:

opposed to because one thing in saying you're gonna give somebody

Speaker:

a $1 Chevy Tahoe. It's quite another to actually place the order,

Speaker:

sign off on the invoice, and then ship it. Right? Yep. And what

Speaker:

if you'll do, like I don't know. Like, you'll you'll start, like, to see it

Speaker:

also, like, in banks and in investments. They will start, like, to transfer

Speaker:

your money. They will start, like, to invest, like, to buy stock. They will like,

Speaker:

the the the the amount of, like, potential impact here is, like, a

Speaker:

crazy high. I believe, by the way, that eventually, this is going to be one

Speaker:

of the things that, like, we'll see also, like, slow down the adoption, not

Speaker:

less than the than the technology or, like, finding, like, the

Speaker:

right use case. Yeah. No. I could see

Speaker:

that. I I just think that we're just setting, as an industry.

Speaker:

We're setting ourselves up for a huge exploit that we

Speaker:

haven't figured out is already there yet.

Speaker:

And so so what what

Speaker:

can AI engineers, data scientists,

Speaker:

data engineers do today to make things

Speaker:

better? I know we can't fix it because we don't know what's we really don't

Speaker:

know what's broken. I think that's one of the frustrating and kind of fun things

Speaker:

about security work is, like, it's not that there's no vulnerabilities.

Speaker:

You haven't discovered any vulnerabilities yet. Right? There are no unknown there are

Speaker:

always un there are always unknown unknowns.

Speaker:

But if you have an unknown unknown or a known thing,

Speaker:

you can you can say that you pretty much figured that out. But there's this

Speaker:

whole aspect, which I don't think data scientists

Speaker:

fully appreciate. I think they can understand the concept of the unknown

Speaker:

unknowns. But in terms of the consequences of it, I don't

Speaker:

think I think it's gonna take 1 major solar wind style

Speaker:

issue or CrowdStrike style issue to make people conscious

Speaker:

of of that. But how do

Speaker:

we how do we prepare ourselves? Right? You can't

Speaker:

stop the hurricane, but you can board up your windows. Right? Like, you

Speaker:

know, how do you Yeah. I and I totally

Speaker:

agree that, like, what's going through, like, to to shake every everybody

Speaker:

will be, like, the the first SolarWinds or, like, the 4 log 4

Speaker:

j attack that we see, like, in these areas. I think that,

Speaker:

like, I think that you broke it very well

Speaker:

and that we need to relate to both categories.

Speaker:

1st is, like, the known,

Speaker:

which already, like, exist. Like, we know that, like, you know, like,

Speaker:

we see that as scientists. Like, we are not a scientist.

Speaker:

And we see that one of the the things that, like, we see

Speaker:

in in in our code in compared to software developers

Speaker:

is that we don't give a

Speaker:

tip on, like, everything, around security.

Speaker:

Like, you'll see, like, tons of exposed secrets in plain

Speaker:

text. You'll see tons of, like, test and, like, the sensitive data

Speaker:

just like playing. And, like, it's state, like, exposed, like, in the notebooks.

Speaker:

You'll see that we download, like, any dependencies without, like, like,

Speaker:

even, like, think about it. Even so that, like, yeah, it looks like maybe, like,

Speaker:

a bit suspicious and stuff like that. So it's it's far

Speaker:

from from the basic. Let's make sure that, like, what we know that is not

Speaker:

best practice, just, like, start, like, to implement it. And

Speaker:

then regarding the unknown unknown, so, of

Speaker:

course, like, you don't know how to handle it. I think that, like, as you

Speaker:

as you said, you can start to prepare yourself. How do how do you

Speaker:

prepare yourself in security? It's basically to be very

Speaker:

organized and to to make sure that you have, like, the right visibility and

Speaker:

governance. As long as you have, for example, like, you know how to build,

Speaker:

like, your your AI or the machine learning bomb. You know all the

Speaker:

different, like, models that are built or embedded as part of the application,

Speaker:

and you have, like, the right lineage, which one

Speaker:

was trained on which dataset, etcetera.

Speaker:

Once, for example, that now let's say we'll continue with the

Speaker:

examples of of Hugging Face. Like, a new Hugging Face

Speaker:

model is is is now, like, published as a like, someone,

Speaker:

like, found that it's, like, malicious. You because you prepared

Speaker:

yourself and you have, like, the right visibility, you are able to go

Speaker:

and very easily search exactly, like, if you use it and

Speaker:

where you use it in all your organization. And this is also

Speaker:

because you prepare yourself. This is exactly what happened, like, in Log 4

Speaker:

j. In Log 4 j, it was like a dependency that

Speaker:

found as a critical vulnerable. And a lot of

Speaker:

organization, what they spent, like, most of the time is to try to understand

Speaker:

where they even use this Log4j. And they seem that, like, if you prepare

Speaker:

yourself, you are like, if you are organizing everything, you'll already

Speaker:

be very, very, like, ready for the for the

Speaker:

attack of, like, the unknown unknown. And, of course, everything

Speaker:

in addition to to, you know, like, learning and, like, educating

Speaker:

yourself. If you start, like, to understand, you'll go

Speaker:

to, I don't know, Databricks, for example. A lot of people use Databricks. You'll

Speaker:

go and, like, start, like, to see what are, like, the best practices of how

Speaker:

to, like, configure your Databricks environments and what are, like, the best practices

Speaker:

there. It's something that you can, like, find very easily, like, in the Internet. You

Speaker:

don't need, like, to to do it, like, from scratch.

Speaker:

But I'll say that, like, you you know, like, it's still, like, when we are

Speaker:

aware of that, it's not still, like, the the top of our mind as the

Speaker:

data practitioner to start looking, like, in our free time for this

Speaker:

kind of concept. Right. I mean, that's a good point.

Speaker:

Right? The fundamentals are still fundamental. Right?

Speaker:

You know, making sure, you know, you track what

Speaker:

your dependencies are. Right? So that way, if there's a breach in a hugging face

Speaker:

model, like you said, you'll know right away whether or not it

Speaker:

impacts you or not. Also too, I think you're

Speaker:

right. This isn't top of mind for AI practitioners. Right?

Speaker:

Even when I code, like, an app, my met

Speaker:

my thought process are very different than when I'm in a notebook.

Speaker:

Mhmm. It's just different wiring.

Speaker:

Yep. And by the way, it's kind of like, it's kind of

Speaker:

like a paradox because most times on the notebooks,

Speaker:

we are connected to much more sensitive information than on our

Speaker:

ID. Right. No. Exactly. So

Speaker:

it's kind of it's like the worst, one of the worst case

Speaker:

scenarios. Right? And and you're right. Like, people wanna work with real

Speaker:

data, and they they just assume that if they're on a system that's

Speaker:

secured and internal, they

Speaker:

they, they don't have to worry about such things,

Speaker:

which I think you're right. Like, with these systems that have access to

Speaker:

sensitive data, these pipelines, I mean, it's one of those

Speaker:

things where we need to start thinking about this. And what would you do

Speaker:

you think that there's a, like, a career path for, like, an AI security engineer?

Speaker:

Right? So it's not just a security engineer, like, in a traditional

Speaker:

sense. Right? But also a someone who specializes

Speaker:

in AI related issues. You think that's a growth industries? I

Speaker:

have, like, no doubt that we are going to like to see more. Like, we

Speaker:

already see these kind of practitioners in the field. I have no

Speaker:

doubt that it's going, to be more and more frequent. And in

Speaker:

addition to that, I believe that, like, even in the future, it's it's going to

Speaker:

be even, like, several different, like, roles. For example, one of the

Speaker:

things that, like, a lot of people that we work also, like, very closely with

Speaker:

are AI red teaming. Right. It's not even,

Speaker:

like, just like a AI security engineer, like, general one. Specifically around,

Speaker:

like, credit teaming because all these kinds of adversarial

Speaker:

attacks on models are very different, requires

Speaker:

different techniques, different tactics. And the red teamers are the

Speaker:

ones that, like, to, like, learning all these different

Speaker:

types of adversarial attacks and how to, like, check your model,

Speaker:

in your organization. And by the way, specifically in this

Speaker:

area, I do feel that it's kind of, like, top priority and

Speaker:

like top of mind also for the data science

Speaker:

team. Like you do see that on LLMs,

Speaker:

once they are deployed into production, the data

Speaker:

scientists, they are kind of like understand that there are a lot of risk there

Speaker:

and they are starting, like, to take also, like, responsibility even completely, like, regard

Speaker:

regardless of the security team to make sure that, like, we we

Speaker:

we reduce some of the risk there. Now the risk is not only

Speaker:

security. The first thing is security, like, to try and, like, make sure

Speaker:

that you are secured from all these different adversarial attacks or that you know how

Speaker:

to detect sensitive data leakage, for example, as part of the response and stuff

Speaker:

like that. In addition to that, it's also a lot of time

Speaker:

like safety risks. You want to make sure that once you deploy LLM into

Speaker:

production, your model doesn't give any financial advice to your

Speaker:

customers, doesn't give any health advice in case it's not your business.

Speaker:

So you then have, like, these kinds of responsibility, or example, like in the

Speaker:

Chevy example that you gave, that you just, like, you don't just, release

Speaker:

free cars or flights or books or a tail off, like, anything

Speaker:

like that. So I think that because the

Speaker:

the the the amount of potential risks are

Speaker:

so high on the run time. In this area, I

Speaker:

believe that, like, the data scientists already understood that this is, like,

Speaker:

under their responsibility. They see it also as part of,

Speaker:

like, being a professional data scientist. If I

Speaker:

deploy this model, it has, like, a lot of, like, accuracy, but,

Speaker:

like, it creates all these different kinds of risk.

Speaker:

I would define myself as not a super professional data

Speaker:

scientist, unlike on the supply chain, unlike in the

Speaker:

notebooks that if I code a code that is not secure, I wouldn't say that,

Speaker:

like, it's not professional. I would say that, like, it's okay. You're just, like, focusing

Speaker:

on the business. So I do believe that we start, like, to seeing this shift

Speaker:

also, like, in the mindset of the data scientist because of the risk of

Speaker:

the Gen AI, but now it's also, like, like, a move

Speaker:

to to all the the development and the building practices

Speaker:

that we have. Yeah. And I think data

Speaker:

scientists are acutely aware that LLMs

Speaker:

are just taking they mean, we talk we we call it hallucinating when

Speaker:

they get things wrong. But realistically, they're

Speaker:

always hallucinating to a very real degree. Right? It's just they

Speaker:

happen to be correct. And what these things are doing

Speaker:

under the hood is they are looking for patterns of words.

Speaker:

Sometimes those patterns of words are wrong, obviously wrong.

Speaker:

And sometimes they may give out sensitive information

Speaker:

inadvertently. So I can talk at least at least there's some common sense out there

Speaker:

when they when they do realize these things are higher risk than I think

Speaker:

we've been led to believe. Yeah. Actually, I love this this finish. They are,

Speaker:

like, hallucinating, like, all this time. Sometimes they really find it

Speaker:

as wrong. Like, they do the same thing as always. Right.

Speaker:

Right. The they don't know they're hallucinating because they're just operating normally.

Speaker:

And so when they go in a different direction and I've noticed

Speaker:

that, you know, kinda like a little bit of, you you know, off by a

Speaker:

little bit, and then then then it generates an off by a little bit, off

Speaker:

by a little bit. I ran an experiment with a hallucination, and

Speaker:

I read it through I ran it through a bunch of models and each one

Speaker:

of them didn't do any fact checking, which I mean, realistically,

Speaker:

I wouldn't expect that. Right? In the future, I think that'll be kind of table

Speaker:

stakes. But, you know, it would just go through. So

Speaker:

I took a hallucination, fed it through notebook l m, which then

Speaker:

create even more hallucinations. Right? So it took this little

Speaker:

genesis of something that was wrong and then made it even crazier wrong,

Speaker:

which I think is an interesting kinda statement and and and

Speaker:

also is a risk. Right? Like hallucination on top, compounding

Speaker:

other hallucinations. And I don't think we've really seen that yet because we've

Speaker:

only really seen for the most part, I've only seen one kind

Speaker:

of model in production. But if you have these models that will kinda work together

Speaker:

as agents or, you know, whether they're agents

Speaker:

that do things or agents that it's different LLM discrete LLMs that talk

Speaker:

to one another. They can get things wrong and make things worse. I mean, I

Speaker:

haven't I think it's too soon to tell either way, honestly. Yeah. But, like, the

Speaker:

like, theoretically, like, it makes a lot of sense. I think in general, like, we

Speaker:

don't see, like, a lot like, we hear a lot about Gen AI.

Speaker:

I think that, like, the level of adoption and the amount

Speaker:

of business use cases that, like, businesses

Speaker:

found are not that high yet. I think that, like, the

Speaker:

most of the usage today is done by, like, consumers, like,

Speaker:

like, directly, like, from, from the foundation model providers, like OpenAI and stuff

Speaker:

like that for day to day, like, jobs, like, you know,

Speaker:

like, reviewing mails and stuff like that.

Speaker:

The the big businesses are still trying to find these

Speaker:

different, like, use cases. I do believe that the that the

Speaker:

agents are going, like, to open a lot of different use cases

Speaker:

around it. Right. Right. I could I could see that. And

Speaker:

I think I think it's just too soon to make a statement

Speaker:

either way. But I think grounding yourself in the fundamentals

Speaker:

is probably always a good idea. Mhmm.

Speaker:

And probably a good a good

Speaker:

approach. So so tell me about NOMA. What is is it NOMA? I

Speaker:

I don't wanna make sure I pronounce it. NOMA. Okay. NOMA. Security. What does

Speaker:

NOMA do? Is it security firms that focus on this space? You

Speaker:

mentioned red teaming. Is that is that a sir service you offer?

Speaker:

Yeah. So NOMA basically is an like, our name is Nomo

Speaker:

Security. The domain is Nomo dot security. So it's Oh, okay. Sorry about

Speaker:

that. No. No. We're good. So, so, yeah,

Speaker:

what we do is, like, secure the entire data in the AI life cycle.

Speaker:

Basically means that we truly, like, cover it end to end. Like, we enable, like,

Speaker:

the data teams and the machine learning and the AI teams, to continue and

Speaker:

innovate while we are securing them without

Speaker:

slowing down. And this is like the the like, we are built from, like,

Speaker:

data practitioner, like, the company. So this is, like, our main focus,

Speaker:

meaning that we start, like, from the building phase. So if we

Speaker:

said, like, notebooks and hugging face models and all these different stuff and the

Speaker:

misconfigurations are on all the different stack and all the envelopes

Speaker:

tools and AI platforms and data pipelines and stuff like that. So we are

Speaker:

connected seamlessly on the background, and,

Speaker:

basically assist the the data teams to to work securely,

Speaker:

without changing changing anything in the workflows.

Speaker:

And then also, like, we provide, as you said, the red teaming.

Speaker:

Before you're deploying the model into production, you want to

Speaker:

understand what is the level of, of

Speaker:

robustness and security that the like, that your model has. And

Speaker:

what we do is we had, like, a big research team that,

Speaker:

like, builds, simulated, thousands of different

Speaker:

attacks. And then we dynamically start to run all these attacks against

Speaker:

your models, showing you exactly, like, what kind of, like, tactics

Speaker:

and techniques your model is vulnerable to, and exactly

Speaker:

also how to mitigate and improve it to be more

Speaker:

robust. And then the 3rd part is also the runtime.

Speaker:

We are mapping, we're scanning all the prompts and all the

Speaker:

responses in real time, making sure that you don't

Speaker:

have any risk on both sides. The security, we are detecting all these

Speaker:

different kind of, like, a host and a little, like, adversarial tax prompt

Speaker:

injection, model jailbreak, etcetera. We check also the responses for

Speaker:

sensitive data leakage and stuff like that. But in addition, also the

Speaker:

safety. We see a lot of organizations that the data scientists, as we

Speaker:

said, they understand the risk of deploying

Speaker:

models into production. And this is why not even, like, the security, but more like

Speaker:

the the Chevy example and, like, the the health advice and stuff like that.

Speaker:

So they built for their own, model

Speaker:

guardrails in order to make sure that they are, like, controlling what

Speaker:

are, like, the topics that the model is be able like, is allowed or

Speaker:

disallowed to communicate about. And what we do is basically to save

Speaker:

them also like this time. We also provide them, like, all this

Speaker:

runtime protection already, like, as a service. You can define exactly what kind

Speaker:

of, like, detectors and in native language, what kind of, like, policies you want

Speaker:

to make sure that are enforced. And then we also, like, protect it in the

Speaker:

run time. So, basically, we just, like, cover you, like, end to end, start from

Speaker:

the building and up to the run time. It starts from the classic data engineering

Speaker:

pipelines and machine learning and up to gen AI. Interesting. Interesting.

Speaker:

It sounds like something I think is totally, I think, a

Speaker:

needed needed service and and skill

Speaker:

set. Because you're right. Like, I mean, there's just so many risks

Speaker:

here, and the hype around Gen

Speaker:

AI is so over the top.

Speaker:

It is gonna be revolutionary, but

Speaker:

maybe not in the way you think. Right? And I always call back to the

Speaker:

early days of the dotcom. Right? Where it was pets.com. There was,

Speaker:

you know, this.com, that, you know, like all these crazy things. But the

Speaker:

real quote unquote winner of, you know,

Speaker:

.com was some guy in Seattle selling books.

Speaker:

Mhmm. Right? No one no one I mean, selling books. Like, really?

Speaker:

Like, not, you know, and it's

Speaker:

it's interesting to see how I think I

Speaker:

think that the the obvious use case for chat for for

Speaker:

LLMs thus far has been chatbots. Right? Customer

Speaker:

service type things. I think that's really only the

Speaker:

the the the the the surface of it. I think for me, what

Speaker:

I've seen is most impactful is the ability for natural language

Speaker:

understanding and their ability to understand what's happening in a in

Speaker:

a block of text. And I think

Speaker:

that that has enormous potential. I

Speaker:

agree. A lot of risks too. Right? Because what if, you know, what if

Speaker:

I I mean, to your point. Right? You wanna make sure these things stay on

Speaker:

topic. Right? Like, I don't if I'm talking to a

Speaker:

financial services chatbot and I say, hey, I have

Speaker:

my my leg kinda hurts. Right?

Speaker:

It's, you know, the risk of moving into health care, like, it's just kind

Speaker:

of, I don't how mature are those guardrails? Because I've

Speaker:

not really seen a good implementation of

Speaker:

it yet. Yeah. So, you know, like, I

Speaker:

don't want to to give ourself, like, a compliment, but,

Speaker:

we Oh, you guys are pretty good at it? Yeah. Like, we're pretty good. Like,

Speaker:

we were, like, you know, like, with fortune 5 100, with fortune 1 100.

Speaker:

Not in vain. But, yeah, I believe that in general, specifically, like, when we speak

Speaker:

more, like, on the guardrail side, I see that the most important thing is

Speaker:

to make sure that it's, it's building the

Speaker:

right architecture to be very flexible and easily

Speaker:

configure for the organization because eventually, like, you know, like, each

Speaker:

organization is completely different needs, completely different

Speaker:

context to the calls, like, in their customers, internally to their employees.

Speaker:

So everything should should be, like, very easily configured, but very flexible.

Speaker:

Interesting. Interesting. I wanna I I could talk for

Speaker:

another hour or 2 with you because this is this is a fascinating space.

Speaker:

Where can folks find out more about Noma and you? I you think it's Noma

Speaker:

dot security? Yeah. Noma dot security. Can't believe that's now

Speaker:

a top load pain, but,

Speaker:

and, any any, NOMA dot

Speaker:

security, you're on LinkedIn, and, anything

Speaker:

else you you'd like the folks to find out more?

Speaker:

No. I had, like, a great time speaking with you, Frank. Great.

Speaker:

Likewise. And for the listeners out there, if you're a little bit

Speaker:

scared and a little bit paranoid about generative AI and LLMs,

Speaker:

then I think we had a good conversation. Because I think we need a little

Speaker:

bit of that fear in the back of our heads to guide us and

Speaker:

maybe think about security issues. A

Speaker:

little bit of thought ahead of time will probably save you a lot of problems

Speaker:

later. And want to lose some. That's

Speaker:

that's all I got, and we'll let the nice British AI,

Speaker:

Bailey finish the show. Well, that wraps up another

Speaker:

eye opening episode of data driven. A big thank you to Niamh

Speaker:

Braun for sharing his expertise on the critical intersection of AI,

Speaker:

security, and innovation. If today's conversation didn't make

Speaker:

you double check your data pipelines or rethink your Hugging Face

Speaker:

downloads, well, you're braver than I am. As always,

Speaker:

I'm Bailey, your semi sentient MC, reminding you that while

Speaker:

AI might be clever, it's never too clever for a security breach.

Speaker:

Until next time, stay curious, stay secure, and

Speaker:

stay data driven. Cheerio.