Speaker: 00:00:00

You've found The Backup Wrap-Up, your go-to podcast for all things

Speaker: 00:00:04

backup recovery and cyber recovery.

Speaker: 00:00:07

In this episode, we take a look at the use of artificial intelligence in backup.

Speaker: 00:00:12

Can AI make your backup environment actually better?

Speaker: 00:00:16

Prasanna Malaiyandi and I discuss AI and how it can help from

Speaker: 00:00:21

possibly everything from scheduling backups to detecting ransomware.

Speaker: 00:00:25

We talk about using it for deduplication, for capacity planning,

Speaker: 00:00:30

and even helping you to write better disaster recovery plans.

Speaker: 00:00:33

It's time to talk about AI and backups.

Speaker: 00:00:36

Hope you enjoy it.

Speaker: 00:00:37

By the way, if you don't know who I am, I'm W. Curtis Preston, AKA Mr.

Speaker: 00:00:41

Backup, and I've been passionate about backup and recovery for over 30 years.

Speaker: 00:00:46

Ever since I had to tell my boss that we had no backups of that really

Speaker: 00:00:50

important database that we had just lost.

Speaker: 00:00:52

I don't want that to happen to you, and that's why I do this podcast.

Speaker: 00:00:56

On this podcast, we turn unappreciated backup admins into Cyber Recovery Heroes.

Speaker: 00:01:02

This is The Backup Wrap-Up.

Speaker: 00:01:20

Welcome to the show.

Speaker: 00:01:22

Hi, I'm W. Curtis Preston, AKA Mr. Backup, and I have with me a guy who apparently

Speaker: 00:01:27

doesn't know how to hold a coffee cup.

Speaker: 00:01:29

Prasanna Malaiyandi, how's it going?

Speaker: 00:01:31

Prasanna

Speaker: 00:01:31

I am good, Curtis.

Speaker: 00:01:32

So I think we need to clarify a

Speaker: 00:01:36

are you defending yourself?

Speaker: 00:01:37

Are you gonna try to defend your weirdness?

Speaker: 00:01:39

I think we have to talk about multiple things.

Speaker: 00:01:42

First.

Speaker: 00:01:42

In India, they don't typically use like a mug.

Speaker: 00:01:46

They use like a stainless steel cup, right?

Speaker: 00:01:49

So if

Speaker: 00:01:50

you're drinking hot beverages, you can only hold it from like the very,

Speaker: 00:01:52

you saw it when we went to the Indian

Speaker: 00:01:53

restaurant in San Diego,

Speaker: 00:01:55

yeah, yeah.

Speaker: 00:01:56

you have to hold it from the very top, otherwise you'll burn your hand.

Speaker: 00:01:59

Right.

Speaker: 00:02:00

And then most mugs, it just feels weird.

Speaker: 00:02:03

Like I got, I got chunky fingers, like sausages, right?

Speaker: 00:02:06

And so like putting it inside the mug, like the handle part of the mug.

Speaker: 00:02:10

I feel like, especially if it's like a curve, not like a straight, I feel like

Speaker: 00:02:14

there's not enough stability there.

Speaker: 00:02:17

That's fascinating.

Speaker: 00:02:18

So for people watching the video, who, by the

Speaker: 00:02:20

way, we do publish a video on YouTube if you want to see our

Speaker: 00:02:23

glorious faces and our expressions.

Speaker: 00:02:25

But yeah, so when I hold a mug, I don't hold it like this through the

Speaker: 00:02:29

handle.

Speaker: 00:02:29

I basically grab it either from the top or I hold it like on the side.

Speaker: 00:02:36

And then of course, the pinky's kind

Speaker: 00:02:38

The pinky,

Speaker: 00:02:39

the bottom.

Speaker: 00:02:39

But what's weird though is the pinky supporting the bottom thing.

Speaker: 00:02:43

I know you've complained to me many times, but that's also how I hold my phone.

Speaker: 00:02:50

you end up covering your microphone.

Speaker: 00:02:51

always hold the phone and

Speaker: 00:02:52

then my pinky's kind of on the bottom, and so it always blocks the microphone.

Speaker: 00:02:58

So Curtis is always

Speaker: 00:02:59

like, were you underwater?

Speaker: 00:03:00

Did you swallow your phone?

Speaker: 00:03:01

What's going on?

Speaker: 00:03:03

So regarding your defense from, you know, how they hold, do things in India.

Speaker: 00:03:09

What part of India were you born in?

Speaker: 00:03:11

Uh, just remind

Speaker: 00:03:12

Yeah, I was, uh, born in not India, but,

Speaker: 00:03:19

but at home.

Speaker: 00:03:20

Right.

Speaker: 00:03:21

But yeah,

Speaker: 00:03:21

you were, you were raised by people born in

Speaker: 00:03:23

India, and so you were, you were taught,

Speaker: 00:03:26

yeah.

Speaker: 00:03:27

And so actually I prefer, so even drinking water.

Speaker: 00:03:30

I don't drink from a glass cup.

Speaker: 00:03:32

I drink from a stainless steel cup.

Speaker: 00:03:34

Right.

Speaker: 00:03:34

Which is, if you haven't spent any time around, you know,

Speaker: 00:03:37

Indians, you wouldn't know that.

Speaker: 00:03:39

It's just that you use a lot, you use stainless steel for cups, for plates,

Speaker: 00:03:45

right.

Speaker: 00:03:45

As Curtis knows, when I'm loading the dishwasher and

Speaker: 00:03:48

he's like, what is that racket?

Speaker: 00:03:50

what is happening over there?

Speaker: 00:03:53

Because everything's so noisy.

Speaker: 00:03:55

They last longer and you don't have to worry about them breaking.

Speaker: 00:03:59

That's, you know, I can't, I can't complain.

Speaker: 00:04:01

Yeah.

Speaker: 00:04:03

Uh, but yeah, I don't get the whole knot, you know?

Speaker: 00:04:06

Here I am with four fingers in my mug.

Speaker: 00:04:09

I'm just saying.

Speaker: 00:04:10

Okay, so now what if that mug was smaller and the handle was curved,

Speaker: 00:04:14

Well, then that's like a, that's like a girly mug and then,

Speaker: 00:04:18

then you use two fingers like

Speaker: 00:04:20

this.

Speaker: 00:04:21

feel like it gives you enough stability?

Speaker: 00:04:23

And yet I've never dropped a mug.

Speaker: 00:04:25

I'm

Speaker: 00:04:25

just saying.

Speaker: 00:04:26

It's not from dropping the mug.

Speaker: 00:04:27

It's from like when you, yeah.

Speaker: 00:04:29

See when you're drinking it, it just feels like it's a little like,

Speaker: 00:04:33

Yeah.

Speaker: 00:04:33

Um,

Speaker: 00:04:34

all over you.

Speaker: 00:04:34

I just think you don't know how to hold a mug, but.

Speaker: 00:04:38

Our listeners are probably like, what are these people talking about?

Speaker: 00:04:41

By the way, this is a new format starting in the new year.

Speaker: 00:04:44

We are now gonna just be talking about coffee and all the crazy

Speaker: 00:04:46

things that Prasanna does.

Speaker: 00:04:49

Yeah, absolutely.

Speaker: 00:04:51

Um, or maybe we might actually talk about some stuff.

Speaker: 00:04:56

So I thought, um, you know, we've been seeing, uh, AI on the news a lot,

Speaker: 00:05:03

right?

Speaker: 00:05:03

AI? I've never heard about it.

Speaker: 00:05:05

Yeah, I've never, never heard of it.

Speaker: 00:05:07

Yeah.

Speaker: 00:05:08

So artificial intelligence, and if, if you've been following the backup

Speaker: 00:05:13

industry much, you probably saw a few announcements from your, uh, backup

Speaker: 00:05:22

company or maybe backup companies you're interested in about the use of AI

Speaker: 00:05:28

within backup.

Speaker: 00:05:29

And so I thought we'd talk about that a little bit,

Speaker: 00:05:31

um, in this episode, and

Speaker: 00:05:33

whether or not it has a use, right?

Speaker: 00:05:37

And, just to clarify, I think when a lot of these backup vendors launched AI,

Speaker: 00:05:43

they were using AI for like the, not for the core product, right?

Speaker: 00:05:49

So they were using AI for their support agent, or to help answer questions, right?

Speaker: 00:05:54

Which I think we all understand, we all know about, but I think in this

Speaker: 00:05:58

episode, I think we should focus on like the core part of backup.

Speaker: 00:06:03

Yeah.

Speaker: 00:06:03

So, so let's talk a just a little bit about, you know,

Speaker: 00:06:06

what we mean when we say AI.

Speaker: 00:06:08

There are different categories of AI, and then also there's machine learning, which

Speaker: 00:06:12

is very closely, and honestly, I, I, I,

Speaker: 00:06:16

you know, I think I could describe the difference between machine

Speaker: 00:06:19

learning and AI, but then there's something that, that

Speaker: 00:06:22

Changes, you know, that, that messes me up when we talk about that.

Speaker: 00:06:26

Um, I'll just, for those of you that actually really know what AI

Speaker: 00:06:29

is and machine learning is, you're gonna be offended by something

Speaker: 00:06:32

I say during this episode.

Speaker: 00:06:34

I, I'll just tell you that.

Speaker: 00:06:35

But we're gonna use the terms almost interchangeably, but they're not.

Speaker: 00:06:38

Uh, but I do want to distinguish between

Speaker: 00:06:41

what is referred to as generative AI, right?

Speaker: 00:06:45

Which is a, you know, a large language model that is

Speaker: 00:06:49

going to create things there.

Speaker: 00:06:52

It's not ex nihilo, right?

Speaker: 00:06:53

It's not from, it's not from nothing.

Speaker: 00:06:55

It's it, it has to, it has to have been trained on a large data set.

Speaker: 00:06:59

But, those are the kinds of things that they're using,

Speaker: 00:07:01

like you talked about there.

Speaker: 00:07:03

for support

Speaker: 00:07:04

models, right?

Speaker: 00:07:05

And, And,

Speaker: 00:07:05

just as examples of large language models, you might've heard about

Speaker: 00:07:09

Meta's Llama, Llama 3, Llama 4, there's ChatGPT, or OpenAI's.

Speaker: 00:07:15

What is it?

Speaker: 00:07:16

OPT?

Speaker: 00:07:19

What,

Speaker: 00:07:20

the, the, actual model.

Speaker: 00:07:21

the

Speaker: 00:07:21

underlying model.

Speaker: 00:07:22

Oh, okay.

Speaker: 00:07:23

I would just, I would've just said ChatGPT.

Speaker: 00:07:25

'cause everybody knows what ChatGPT

Speaker: 00:07:27

is, right?

Speaker: 00:07:27

I mean, you've got Copilot, you've got, you've

Speaker: 00:07:29

got, Yeah, you, so you've got Claude from Anthropic.

Speaker: 00:07:33

Um, there are a lot of people, you know, who confuse the company with the product.

Speaker: 00:07:39

But, um, these are the, these are the ones that are grabbing

Speaker: 00:07:42

all the headlines, right?

Speaker: 00:07:43

They're also, they're also writing large bodies of texts.

Speaker: 00:07:47

They're helping people to write books.

Speaker: 00:07:49

They're helping people to do art.

Speaker: 00:07:51

That, and there's a lot of, um.

Speaker: 00:07:54

A lot of legal discussions around that, around the use of things like

Speaker: 00:07:59

the books that I've written as, um, you know, feeding into that and, um,

Speaker: 00:08:05

the, we're not talking about that,

Speaker: 00:08:07

right?

Speaker: 00:08:08

Um, we're not gonna talk about, Hey, um, ChatGPT.

Speaker: 00:08:13

My restore didn't work.

Speaker: 00:08:14

Can you recreate all my documents?

Speaker: 00:08:17

Um, it's not,

Speaker: 00:08:18

there's not gonna be anything like that, at least not yet.

Speaker: 00:08:21

Um, the, um, we're gonna talk about how AI can be used to basically

Speaker: 00:08:28

enhance the core functionality.

Speaker: 00:08:31

I mean, you said this in way fewer words a few minutes ago,

Speaker: 00:08:34

but, uh, basically how it could be used to make backups better.

Speaker: 00:08:40

And I think a good chunk of this is really, like you said, more

Speaker: 00:08:44

around machine learning models,

Speaker: 00:08:47

right,

Speaker: 00:08:47

right,

Speaker: 00:08:48

large language models.

Speaker: 00:08:50

right.

Speaker: 00:08:50

So the, the first section we will just talk about how potentially just talk about

Speaker: 00:08:58

this is just sort of thoughts out loud.

Speaker: 00:08:59

I know that we have a lot of vendors that listen to the podcast.

Speaker: 00:09:02

We are.

Speaker: 00:09:03

Technically aimed at the, the people who actually use backup and

Speaker: 00:09:07

recovery, but I know a lot of vendors use the podcast, so feel free to

Speaker: 00:09:11

take this episode and run with it and

Speaker: 00:09:13

do stuff.

Speaker: 00:09:14

So I, I guess the first question would be, do we think that, uh, machine learning

Speaker: 00:09:20

can be used to help just to improve the efficiency of the backup process itself?

Speaker: 00:09:26

What do you think about

Speaker: 00:09:27

Oh, a thousand percent.

Speaker: 00:09:28

A billion percent, Curtis.

Speaker: 00:09:30

So I've never actually had to implement a backup system.

Speaker: 00:09:33

But you've done

Speaker: 00:09:34

tons of this, right?

Speaker: 00:09:35

And how do you go about just planning your backup, right?

Speaker: 00:09:41

How to back up an infrastructure, right?

Speaker: 00:09:42

It's like, just walk us through that, right?

Speaker: 00:09:45

And how many spreadsheets and all the rest that you have in

Speaker: 00:09:49

order to try to optimize these.

Speaker: 00:09:51

Yeah, I, I think about that a lot.

Speaker: 00:09:53

And, and, and, and, and the answer is gonna depend greatly on the

Speaker: 00:09:58

product that you're using, right?

Speaker: 00:09:59

You know, I, I can think of.

Speaker: 00:10:01

The traditional way is that you're going to create some kind of schedule, some

Speaker: 00:10:06

kind of, uh, automatic backup schedule.

Speaker: 00:10:09

Um, and you're going to do a, again, traditionally we'll

Speaker: 00:10:13

do three categories here.

Speaker: 00:10:14

Traditionally you've got some full backups and you're gonna do some

Speaker: 00:10:17

full backups every once in a while.

Speaker: 00:10:19

Um, and I was always a proponent if you had to do full backups, I was

Speaker: 00:10:23

always a proponent of doing those

Speaker: 00:10:25

no

Speaker: 00:10:27

more often than once a month.

Speaker: 00:10:30

Um, back in the days of tape, it was once a week because

Speaker: 00:10:34

it, was

Speaker: 00:10:35

complicated the restore process.

Speaker: 00:10:36

Yeah.

Speaker: 00:10:37

But, um, you know, doing it no more often than once a month, but depending on your

Speaker: 00:10:42

backup product, you might be able to, to

Speaker: 00:10:44

spread that out even over like three months.

Speaker: 00:10:46

And then you also want to schedule, if your backup product

Speaker: 00:10:50

is capable of doing it, you wanna schedule a cumulative incremental.

Speaker: 00:10:55

A differential, some products call it.

Speaker: 00:10:57

Um, and then of course the daily incremental.

Speaker: 00:11:00

Right.

Speaker: 00:11:01

So spreading

Speaker: 00:11:02

that all

Speaker: 00:11:03

for one application you're talking about,

Speaker: 00:11:05

E exactly.

Speaker: 00:11:06

You're doing this per application, per server.

Speaker: 00:11:10

Um, and, and you're trying to load balance things out because if you've

Speaker: 00:11:17

properly designed your system, it's probably not capable of doing a full

Speaker: 00:11:21

backup of your environment in one night.

Speaker: 00:11:24

Right.

Speaker: 00:11:24

Um, because that would just be really expensive, and then the rest of the

Speaker: 00:11:28

time it would go completely unused.

Speaker: 00:11:30

Right?

Speaker: 00:11:31

Um, so you, you buy it so that it's you, you size it so that it's big

Speaker: 00:11:35

enough to do a full backup over time.

Speaker: 00:11:38

And, um, you're right that, that, that scheduling that out is problematic, right?

Speaker: 00:11:45

Um, and you, you definitely could use, um, uh, AI

Speaker: 00:11:49

or ML to, to do that.
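
The load balancing described here, spreading full backups across nights so that no single night has to move the whole environment, can be sketched without any machine learning at all. The following is a minimal illustration, not how any particular backup product schedules. It uses a greedy heuristic (largest server first, onto the least-loaded night), and the server names and sizes are hypothetical.

```python
import heapq


def spread_full_backups(server_sizes_gb, nights):
    """Assign each server's full backup to a night, balancing total GB per night.

    Greedy heuristic: place the largest servers first, each on the night
    that currently has the least data scheduled.
    """
    # Min-heap of (scheduled_gb, night_index) so the lightest night pops first.
    heap = [(0, night) for night in range(nights)]
    heapq.heapify(heap)
    schedule = {night: [] for night in range(nights)}

    # Sort by size descending, then by name so the result is deterministic.
    for server, size in sorted(server_sizes_gb.items(), key=lambda kv: (-kv[1], kv[0])):
        load, night = heapq.heappop(heap)
        schedule[night].append(server)
        heapq.heappush(heap, (load + size, night))
    return schedule


# Hypothetical server names and sizes, for illustration only.
sizes = {"db01": 900, "db02": 700, "fs01": 400, "fs02": 300, "app01": 200, "app02": 100}
plan = spread_full_backups(sizes, nights=3)
loads = {night: sum(sizes[s] for s in servers) for night, servers in plan.items()}
print(plan)
print(loads)
```

A learned model would earn its keep on the inputs to a function like this: predicting how large each full will be and how long each server really takes, which is the part that tends to be guessed.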

Speaker: 00:11:51

And even for the scheduling aspect.

Speaker: 00:11:53

So we talked about the applications, and then you were talking about sort

Speaker: 00:11:56

of that infrastructure piece, which is shared and you now have to worry

Speaker: 00:11:59

about it across all of these things.

Speaker: 00:12:02

And I'm sure you had these bonkers spreadsheets that you

Speaker: 00:12:04

were creating, trying to do this.

Speaker: 00:12:06

Did it stretch all the way to the moon and back, by the way?

Speaker: 00:12:11

Well, you know me for, it wasn't even a spreadsheet, it was just, uh, it, it was a

Speaker: 00:12:15

script.

Speaker: 00:12:16

Right.

Speaker: 00:12:16

I would, I would just script all this nonsense.

Speaker: 00:12:18

Right?

Speaker: 00:12:19

Um, but it, but it, the bigger the environment, the more

Speaker: 00:12:23

that doing it programmatically made sense, right?

Speaker: 00:12:26

Um, and, and by the way, even if you have a more modern backup tool

Speaker: 00:12:30

that does incremental forever, there are many applications that

Speaker: 00:12:35

won't, that won't let you do

Speaker: 00:12:36

that.

Speaker: 00:12:37

Right?

Speaker: 00:12:37

I think of like database backups still need to be done every, you know, a full

Speaker: 00:12:41

backup every so often, and you have to schedule these out,

Speaker: 00:12:44

And that's the

Speaker: 00:12:44

second category.

Speaker: 00:12:45

'cause I know you talked about three categories.

Speaker: 00:12:48

Yeah.

Speaker: 00:12:48

Oh yeah.

Speaker: 00:12:49

Oh, well the three categories were, yes.

Speaker: 00:12:51

Uh, thank you.

Speaker: 00:12:53

I'm glad I have you here sometimes, you know.

Speaker: 00:12:55

Yeah.

Speaker: 00:12:55

So you have the, the, the old school full and incremental,

Speaker: 00:12:58

which old school is still current

Speaker: 00:13:00

school.

Speaker: 00:13:00

If we're talking about regular apps, then there's the forever incremental type.

Speaker: 00:13:05

Um, and you don't, you, you do have to worry about scheduling those,

Speaker: 00:13:09

but generally you just sort of tell 'em all to start at once and then

Speaker: 00:13:12

they queue and then it is not, it's, it's a lot simpler to do those.

Speaker: 00:13:17

But then the final category are ones that actually, um, and I

Speaker: 00:13:22

think the one that probably stands out the most here would be Rubrik,

Speaker: 00:13:27

right?

Speaker: 00:13:27

Rubrik doesn't let you schedule, um, that

Speaker: 00:13:30

stuff.

Speaker: 00:13:31

You tell it what your RTO

Speaker: 00:13:33

is and your RPO, and it just does the backups.

Speaker: 00:13:36

I mean, in fact, there are people that complain that you cannot, at least

Speaker: 00:13:40

last time I checked, you could not do.

Speaker: 00:13:42

a manually scheduled backup if you wanted to tell it when to do stuff.

Speaker: 00:13:47

Um, I, I think this is probably the first use of some sort of machine learning

Speaker: 00:13:53

or artificial intelligence that I can think of with regards to scheduling.
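
To make the idea of objective-driven scheduling concrete, here is a minimal sketch of ordering backups by RPO instead of by a fixed calendar. It is not Rubrik's (or anyone's) actual algorithm. The field names, workload names, and the assumption that `last_backup` is the point in time the last backup captured are all hypothetical.

```python
from datetime import datetime, timedelta


def latest_start(workload):
    """Latest moment the next backup can start and still finish inside the RPO."""
    return workload["last_backup"] + workload["rpo"] - workload["est_duration"]


def order_by_urgency(workloads, now):
    """Sort workloads by latest safe start time; flag those already at or past it."""
    ordered = sorted(workloads, key=latest_start)
    return [(w["name"], latest_start(w) <= now) for w in ordered]


# Hypothetical workloads, for illustration only.
now = datetime(2025, 1, 1, 12, 0)
workloads = [
    {"name": "wiki", "rpo": timedelta(hours=24),
     "last_backup": datetime(2025, 1, 1, 6, 0), "est_duration": timedelta(minutes=30)},
    {"name": "sql-prod", "rpo": timedelta(hours=1),
     "last_backup": datetime(2025, 1, 1, 11, 30), "est_duration": timedelta(minutes=20)},
    {"name": "file-share", "rpo": timedelta(hours=24),
     "last_backup": datetime(2024, 12, 31, 13, 0), "est_duration": timedelta(hours=2)},
]

queue = order_by_urgency(workloads, now)
for name, must_start_now in queue:
    print(name, "start now" if must_start_now else "can wait")
```

The admin states only the objective; the order falls out of it. Estimated duration is exactly the kind of input a model could learn from past runs and keep adjusting as the environment changes.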

Speaker: 00:13:56

Which, which I was also gonna chime in.

Speaker: 00:13:58

So the first two methods you talked about, right?

Speaker: 00:14:01

You're kind of statically doing this upfront, setting the schedules and

Speaker: 00:14:06

hoping that forever that it will be good,

Speaker: 00:14:09

Right.

Speaker: 00:14:09

You'll always be able to meet it, but say that there's an additional load or a

Speaker: 00:14:13

server goes down or something else, right.

Speaker: 00:14:15

There's no way to fine tune and adjust that,

Speaker: 00:14:18

Well, well, I, Well, there, I mean, there is, but there's

Speaker: 00:14:21

no way to automatically fine

Speaker: 00:14:23

tune and Yeah.

Speaker: 00:14:24

Yeah.

Speaker: 00:14:24

Right.

Speaker: 00:14:24

And so you're just like, okay, maybe it'll fail a couple times

Speaker: 00:14:28

and then I'll adjust the policies and then I'll be fine, but Right.

Speaker: 00:14:31

Versus something like an SLA based, which I, I actually have

Speaker: 00:14:35

looked at Rubrik's in the past,

Speaker: 00:14:36

and I find that very enticing because really in the end, you

Speaker: 00:14:41

care about what your RPO and RTO,

Speaker: 00:14:44

Yeah.

Speaker: 00:14:44

No one cares if you can back up.

Speaker: 00:14:45

They only care if you can restore.

Speaker: 00:14:46

the problem though is it's such a big paradigm shift for a lot of backup admins

Speaker: 00:14:53

that it's very difficult to understand because it's like when people move

Speaker: 00:14:57

from on-premises to the cloud and they were concerned because they're like,

Speaker: 00:15:00

I can't touch and feel my equipment.

Speaker: 00:15:02

Right.

Speaker: 00:15:03

It's not something I could actually do.

Speaker: 00:15:04

I think that's also the same challenges you get when you move

Speaker: 00:15:07

from sort of, uh, schedule-based backups to sort of SLA based backups.

Speaker: 00:15:12

Yeah, I, I liked, I liked the idea a lot.

Speaker: 00:15:15

I still, again, you know, if I was running Rubrik,

Speaker: 00:15:20

I would give people the ability to do a manual backup if they

Speaker: 00:15:22

wanted to.

Speaker: 00:15:23

But, but I do really like the idea of SLA driven backups,

Speaker: 00:15:27

because I like the idea of SLAs.

Speaker: 00:15:29

You know, we've talked about SLAs on here, and I like the idea of

Speaker: 00:15:32

knowing the backups were being done often enough to meet my SLAs.

Speaker: 00:15:36

I really liked that idea.

Speaker: 00:15:38

The one thing I think that is useful with these sort of approaches is

Speaker: 00:15:43

we've talked about the fact that like your environment doesn't stay static.

Speaker: 00:15:47

Right.

Speaker: 00:15:48

So as you're adding new workloads, as things are changing, you don't

Speaker: 00:15:53

want to have to go recompute your entire spreadsheet or your

Speaker: 00:15:56

script every single time.

Speaker: 00:15:58

So it's nice to have sort of these models that can automatically help fine tune and

Speaker: 00:16:03

optimize so you're not wasting your time because it's more than likely that you're

Speaker: 00:16:07

not gonna get it right the first time if you manually try to reset some of these

Speaker: 00:16:10

things.

Speaker: 00:16:11

And so having this automatic thing that constantly is

Speaker: 00:16:14

adjusting just seems amazing.

Speaker: 00:16:18

Yeah, it does.

Speaker: 00:16:18

And I, and outside of Rubrik, I'm not aware of any tools that do that.

Speaker: 00:16:23

Uh, but I, I think that this could certainly be a way where

Speaker: 00:16:25

they could use AI to do that.

Speaker: 00:16:27

Um, the.

Speaker: 00:16:30

And I, and I was thinking about, again, going back to it, it's been a

Speaker: 00:16:33

while since I've had to do this in a production environment, but the, the

Speaker: 00:16:36

the first thing that you have to find out is how big is everything, right?

Speaker: 00:16:40

How big is, is everything from a database perspective and

Speaker: 00:16:43

how, how long does it take?

Speaker: 00:16:45

'cause there's all these different, and that's the thing that nobody knows.

Speaker: 00:16:48

Right.

Speaker: 00:16:48

How big is your, how big is your data center?

Speaker: 00:16:50

And they're like, I don't know.

Speaker: 00:16:51

I don't know.

Speaker: 00:16:52

And so like, you have to do a full backup first

Speaker: 00:16:54

before you have any idea.

Speaker: 00:16:55

And not every server backs up at the same speed and all these different things.

Speaker: 00:16:59

So yeah, it it is a

Speaker: 00:17:00

complicated

Speaker: 00:17:01

and you may not be able to back up everything at the same

Speaker: 00:17:03

time because there might be

Speaker: 00:17:04

different hours, right?

Speaker: 00:17:05

That

Speaker: 00:17:06

a server is sort of offline or has less load that you can actually do it.

Speaker: 00:17:12

Yeah, so having some sort of AI or ML, um, figure that out sounds amazing.

Speaker: 00:17:17

Right?

Speaker: 00:17:18

Another area where I think that this could help is very, very closely related, and

Speaker: 00:17:23

that is, and, and some backup products do have this and that is making sure

Speaker: 00:17:29

that everything in my data center

Speaker: 00:17:31

is backed up in some

Speaker: 00:17:34

way, right?

Speaker: 00:17:35

Usually where you see this is an integration with like, um, uh,

Speaker: 00:17:40

VMware or, uh, AWS, et cetera, right?

Speaker: 00:17:44

Um, basically just connect to my entire, uh, you know, control

Speaker: 00:17:49

panel and then just look and make sure that everything is connected

Speaker: 00:17:53

to some type of policy to back it

Speaker: 00:17:56

up.

Speaker: 00:17:56

I, I think.

Speaker: 00:17:57

a default policy if anything is created, so at least everything

Speaker: 00:18:00

is protected, even though

Speaker: 00:18:02

it may not be protected with the right thing, but at least it's

Speaker: 00:18:04

being protected and you don't have to worry about these gaps.
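
The coverage check described here comes down to comparing an inventory against policy assignments and giving anything left over a default. A minimal sketch follows. In practice the inventory would come from the hypervisor or cloud provider's API; here it is a hard-coded set, and every asset and policy name is hypothetical.

```python
def assign_default_policy(inventory, policy_by_asset, default_policy="default"):
    """Return (updated assignments, sorted list of assets that had no policy)."""
    unprotected = sorted(asset for asset in inventory if asset not in policy_by_asset)
    updated = dict(policy_by_asset)  # leave the caller's mapping untouched
    for asset in unprotected:
        updated[asset] = default_policy
    return updated, unprotected


# Hypothetical inventory and assignments, for illustration only.
inventory = {"vm-web01", "vm-web02", "vm-sql01", "vm-test07"}
assignments = {"vm-web01": "gold", "vm-sql01": "sql-daily"}

updated, gaps = assign_default_policy(inventory, assignments)
print(gaps)     # assets that were not covered by any policy
print(updated)  # every asset now maps to some policy
```

Reporting the gaps as well as filling them matters: the default policy protects the asset, but someone still has to decide whether it is the right policy.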

Speaker: 00:18:07

Exactly.

Speaker: 00:18:07

Exactly.

Speaker: 00:18:08

Um, and I, I think you do see this in a lot of backup products.

Speaker: 00:18:15

Usually again, it's with integration

Speaker: 00:18:17

with, uh, big things like VMware, Hyper-V, AWS, um,

Speaker: 00:18:22

you know, et cetera.

Speaker: 00:18:24

you need the companies, those vendors, to actually provide the APIs to be

Speaker: 00:18:28

able to do these sort of queries, and I think that's where there's kind

Speaker: 00:18:31

of a little bit of a tension there,

Speaker: 00:18:33

Yeah.

Speaker: 00:18:35

Yeah.

Speaker: 00:18:35

I mean, theoretically you could scour the data center, right?

Speaker: 00:18:39

Uh, looking for new computers.

Speaker: 00:18:41

Again, I, I know I mentioned this before, but you know, back

Speaker: 00:18:44

in the day we did that, right?

Speaker: 00:18:47

And back in the day we did that with Visio.

Speaker: 00:18:49

Um, the, there used to be a very

Speaker: 00:18:52

expensive version of Visio that would just literally crawl your data center.

Speaker: 00:18:56

And it used, uh, some very interesting technology.

Speaker: 00:19:00

Um, I forgot the, the name of this, but like, Nmap

Speaker: 00:19:04

does this, where it, what it does is it sends a malformed packet.

Speaker: 00:19:09

It finds an IP address, it sends a malformed packet to that IP address

Speaker: 00:19:12

to see how it responds, and different things respond in different ways.

Speaker: 00:19:16

And that's how it, that's how it, um,

Speaker: 00:19:18

That

Speaker: 00:19:19

is crazy that they built that.

Speaker: 00:19:21

Yeah.

Speaker: 00:19:22

Yeah.

Speaker: 00:19:22

Um, and so you, you could theoretically do that, but a agreed, it's much easier

Speaker: 00:19:27

if you just have, everything's gonna be in VMware or AWS and then just talk to AWS.

Speaker: 00:19:32

Now again, going to VMware and AWS, there can be multiple virtual data centers.

Speaker: 00:19:37

There can be

Speaker: 00:19:37

multiple AWS accounts.

Speaker: 00:19:39

So you, you, you want to make sure that, that you have some way to, to

Speaker: 00:19:42

do that.

Speaker: 00:19:43

And I, and I do like that idea.

Speaker: 00:19:45

Shadow IT.

Speaker: 00:19:47

Yeah, shadow IT bad,

Speaker: 00:19:48

especially when it comes to backup.

Speaker: 00:19:49

Right.

Speaker: 00:19:50

Um, again, I'll tell a story from back in the day was the time that someone came to

Speaker: 00:19:56

me and they had, they were DBAs and they, they gave me a directory of a database.

Speaker: 00:20:01

They wanted me to restore.

Speaker: 00:20:02

And it was temp, um, /tmp on an HP box.

Speaker: 00:20:08

And for those that don't know, /tmp on an HP box, specifically HP-UX, was in RAM.

Speaker: 00:20:16

So when you rebooted it, temp went away.

Speaker: 00:20:19

And this, um,

Speaker: 00:20:21

this

Speaker: 00:20:21

it source code,

Speaker: 00:20:23

what I.

Speaker: 00:20:23

it?

Speaker: 00:20:23

Source code

Speaker: 00:20:24

It was source code.

Speaker: 00:20:25

Yeah.

Speaker: 00:20:26

And they were developing for months, like an entire team of

Speaker: 00:20:29

developers developing source code of this new application in temp.

Speaker: 00:20:34

And then we rebooted the server and they, and they came to me

Speaker: 00:20:39

and asked me to restore it.

Speaker: 00:20:41

And I was like, dude, we don't back up temp. I don't know

Speaker: 00:20:44

what you're talking about.

Speaker: 00:20:45

Like, and they're like, dude, this is really important,

Speaker: 00:20:47

like heads are gonna roll.

Speaker: 00:20:48

And I'm like, yeah, not mine.

Speaker: 00:20:50

Like everybody knows we don't back up temp.

Speaker: 00:20:53

Except for you, apparently.

Speaker: 00:20:55

Uh, so it's, I'm just, you know, it's really bad when you have

Speaker: 00:20:59

a functioning system and then it's not being backed up again.

Speaker: 00:21:02

Another story we used to have, um, we had a, a naming convention.

Speaker: 00:21:07

Ours was very boring.

Speaker: 00:21:09

Um, it, it was HPDBSVA, right?

Speaker: 00:21:13

HP database server A, and there was HPFS01, et

Speaker: 00:21:16

cetera, right?

Speaker: 00:21:18

And I remember, and I had this form that you had to fill out.

Speaker: 00:21:22

This was an actual piece of paper.

Speaker: 00:21:24

We did

Speaker: 00:21:24

not have web pages.

Speaker: 00:21:26

Right?

Speaker: 00:21:27

You had this form that you fill out and, and you had to, and, and it, it said

Speaker: 00:21:31

on there, simply filling out this form is not, does not meet the requirement.

Speaker: 00:21:35

You do not consider your system backed up until you have a signed form back from me.

Speaker: 00:21:40

Right?

Speaker: 00:21:41

And then one day somebody handed me a form and it said like.

Speaker: 00:21:44

They wanted, like, me to back up HPDBSVM, right?

Speaker: 00:21:50

And I go, M that's interesting.

Speaker: 00:21:53

The last server I remember hearing about was H. So that means there's an I, a

Speaker: 00:21:58

J, a K, and an L out there somewhere.

Speaker: 00:22:01

hasn't been backed up.

Speaker: 00:22:02

That hasn't been backed up.
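
The reasoning in that story (the last known server ended in H, a new one ends in M, so I through L must exist somewhere) is simple enough to automate. A minimal sketch follows, assuming server names are a fixed prefix plus a single uppercase letter. The `HPDBSV` prefix is taken from the story; the rest is hypothetical.

```python
import string


def missing_suffixes(known_names, prefix):
    """List names whose single-letter suffix falls in a gap between known servers."""
    suffixes = {name[len(prefix):] for name in known_names if name.startswith(prefix)}
    # Keep only single uppercase letters; ignore anything else that shares the prefix.
    present = {s for s in suffixes if len(s) == 1 and s in string.ascii_uppercase}
    if not present:
        return []
    first = string.ascii_uppercase.index(min(present))
    last = string.ascii_uppercase.index(max(present))
    return [prefix + c for c in string.ascii_uppercase[first:last + 1] if c not in present]


# Servers A through H are in the backup system, and a form just arrived for M.
known = ["HPDBSV" + c for c in "ABCDEFGH"] + ["HPDBSVM"]
print(missing_suffixes(known, "HPDBSV"))
# → ['HPDBSVI', 'HPDBSVJ', 'HPDBSVK', 'HPDBSVL']
```

This only flags names worth investigating. It cannot tell whether those servers actually exist, which is where real discovery against the infrastructure comes in.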

Speaker: 00:22:04

Yeah.

Speaker: 00:22:05

Um, so this idea of automatically

Speaker: 00:22:07

detecting servers and applications sounds like a great

Speaker: 00:22:10

idea.

Speaker: 00:22:11

And also not just VMs, but also detect, it would be really

Speaker: 00:22:15

nice if it detected the type of

Speaker: 00:22:16

VM and said, this appears to be a SQL instance.

Speaker: 00:22:20

We should back it up with the default SQL

Speaker: 00:22:21

policy.

Speaker: 00:22:22

That would be great.

Speaker: 00:22:24

So in addition to making things more efficient, um, there are some

Speaker: 00:22:28

other things we could do, uh, with AI that also would be interesting.

Speaker: 00:22:33

Uh, what do

Speaker: 00:22:34

you think is the, the first one?

Speaker: 00:22:35

No.

Speaker: 00:22:35

So I think one of the ones, and we've talked about it so much, so often,

Speaker: 00:22:39

and vendors are starting to do this, it's around anomaly detection and

Speaker: 00:22:44

it could be used in various fashions.

Speaker: 00:22:46

So one thing is like, Hey, by the way, this server, all of a sudden it's backing

Speaker: 00:22:52

up 10 times what it normally does.

Speaker: 00:22:55

Maybe this might indicate like a malware or ransomware on the system.
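
The "10 times what it normally does" check can be written as a plain threshold against a baseline. This is a minimal sketch, not a vendor's detection logic: it uses the median of recent backup sizes as the baseline and the factor of 10 from the conversation, and the sample numbers are made up. A real product would likely learn a per-server baseline that accounts for growth and weekly patterns.

```python
from statistics import median


def is_size_anomaly(history_gb, latest_gb, factor=10.0):
    """Flag the latest backup if it is at least `factor` times the historical median."""
    if not history_gb:
        return False  # no baseline yet, so nothing to compare against
    baseline = median(history_gb)
    return baseline > 0 and latest_gb >= factor * baseline


# Hypothetical nightly incremental sizes in GB.
history = [42, 40, 45, 41, 43, 44, 39]
print(is_size_anomaly(history, 46))    # False
print(is_size_anomaly(history, 430))   # True
```

The median is used instead of the mean so that one earlier spike does not raise the baseline and hide the next one.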

Speaker: 00:23:00

Right?

Speaker: 00:23:00

Um.

Speaker: 00:23:01

Or Hey, I've noticed that there's a bunch of data that's starting

Speaker: 00:23:05

to look like, based on entropy,

Speaker: 00:23:06

that it's been encrypted, that doesn't look normal.

Speaker: 00:23:09

Okay, maybe I should go investigate it, right?
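
The entropy signal mentioned here is Shannon entropy measured over a file's bytes: ordinary text scores low, while encrypted data looks close to uniformly random and scores near the maximum of 8 bits per byte. A minimal sketch follows, using `os.urandom` as a stand-in for encrypted content. Compressed files also score high, so entropy on its own is a hint to investigate, not proof of ransomware.

```python
import math
import os
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Entropy in bits per byte: 0.0 for constant data, up to 8.0 for uniform bytes."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


text = b"The quick brown fox jumps over the lazy dog. " * 200
random_like = os.urandom(len(text))  # stand-in for an encrypted file

print(round(shannon_entropy(text), 2))         # low: English text
print(round(shannon_entropy(random_like), 2))  # close to 8
```

A backup system sees every changed file anyway, so it is in a good position to notice when a server's files drift from the first kind of score to the second.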

Speaker: 00:23:12

So, or it could even be security things like, Hey, you're logging

Speaker: 00:23:16

in from a different place than normal as a backup admin.

Speaker: 00:23:19

Is this the right thing or not?

Speaker: 00:23:22

Yeah.

Speaker: 00:23:22

And also very closely related to the stuff you said before was, uh,

Speaker: 00:23:28

are files where the file type based on the first few bytes of the file,

Speaker: 00:23:35

does not match the extension of the

Speaker: 00:23:37

file.

Speaker: 00:23:37

So it says it's a .doc, but the first few bytes of the file

Speaker: 00:23:42

show that it's an application, for

Speaker: 00:23:43

Sorry, one

Speaker: 00:23:44

Yeah, that's an interesting use case around, uh, the first few bytes because

Speaker: 00:23:50

that could detect things that are being encrypted or other things that don't

Speaker: 00:23:56

make sense, or potentially even malware.

Speaker: 00:23:58

Right.
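
The extension-versus-first-bytes check could look something like this. It is a sketch with a deliberately tiny signature table; the table layout and function names are assumptions for illustration, and a real tool would use a much larger signature database.

```python
import os

# A few well-known file signatures ("magic bytes") and a label for each.
MAGIC = {
    b"%PDF-": "pdf",
    b"PK\x03\x04": "zip",                         # also .docx/.xlsx containers
    b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1": "ole",   # legacy .doc/.xls
    b"MZ": "exe",                                 # Windows executable
    b"\x7fELF": "elf",                            # Linux executable
    b"\x89PNG\r\n\x1a\n": "png",
}

# Which content types are acceptable for each extension we know about.
EXPECTED = {
    ".pdf": {"pdf"}, ".doc": {"ole"}, ".docx": {"zip"},
    ".zip": {"zip"}, ".exe": {"exe"}, ".png": {"png"},
}


def sniff(header: bytes):
    """Return the content label for a file header, or None if unrecognized."""
    for magic, kind in MAGIC.items():
        if header.startswith(magic):
            return kind
    return None


def extension_mismatch(filename: str, header: bytes) -> bool:
    """True when a known extension does not match what the header shows."""
    expected = EXPECTED.get(os.path.splitext(filename)[1].lower())
    if expected is None:
        return False  # unknown extension: nothing to compare against
    return sniff(header) not in expected


print(extension_mismatch("report.doc", b"MZ\x90\x00"))  # executable posing as .doc
```

Note that an unrecognized header on a known extension also counts as a mismatch here, which is what would catch a `.doc` whose contents have been encrypted.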

Speaker: 00:23:59

Yeah, it, uh, it's something we do, you know, my, uh, employer is S2|DATA,

Speaker: 00:24:03

and we do a lot of restores of old stuff, um, where we're pulling

Speaker: 00:24:09

data off of tape, often for, um, e-discovery purposes and lawsuit

Speaker: 00:24:16

purposes and, um, investigation purposes.

Speaker: 00:24:19

And one of the things that we do as we're pulling data, 'cause we

Speaker: 00:24:22

use a, a, a proprietary tool that we've written to restore data off

Speaker: 00:24:27

of most backups rather than use the built-in tool for a lot of reasons.

Speaker: 00:24:33

Um, and this is one of them is that we check the file type against the file

Speaker: 00:24:37

contents and, uh, it can, it can also indicate.

Speaker: 00:24:42

Um, uh, subterfuge,

Speaker: 00:24:44

right?

Speaker: 00:24:45

Um, it can indicate somebody trying to hide something.

Speaker: 00:24:48

Um, but yeah, so anomaly detection, I think is a really big one.

Speaker: 00:24:51

Uh, right.

Speaker: 00:24:52

Definitely that. This is a, this is a, it looks like you've got ransomware, right?

Speaker: 00:24:58

You need

Speaker: 00:24:59

to solve that.

Speaker: 00:25:00

That was probably the, the first big use of AI that I

Speaker: 00:25:02

remember, uh, in, in the backup world.

Speaker: 00:25:05

And I, I, I will say that if

Speaker: 00:25:08

the way that you know that you have ransomware is that your backup

Speaker: 00:25:11

product told you, something is wrong. But, uh, but it, but it can

Speaker: 00:25:16

happen.

Speaker: 00:25:16

Right.

Speaker: 00:25:17

Um, another one that I'll talk, uh, that I'd bring up is, is data classification.

Speaker: 00:25:23

Again, I think that.

Speaker: 00:25:26

This is, this is probably a very simple one, but the

Speaker: 00:25:29

idea of like, looking at all the different data types and helping you to

Speaker: 00:25:33

understand what is in your environment.

Speaker: 00:25:35

This is not that new.

Speaker: 00:25:37

Um, but perhaps the AI use case could be helping you to identify trends,

Speaker: 00:25:43

um, and, and where the data's moving, where it's being created, where

Speaker: 00:25:47

it's being changed, uh, et cetera.

Speaker: 00:25:50

Um, and, and then, which is very closely related to my

Speaker: 00:25:53

other idea, which is predictive

Speaker: 00:25:54

analytics.

Speaker: 00:25:56

Right.

Speaker: 00:25:56

Um, again, going back to, uh, you know, back in the day,

Speaker: 00:26:01

one of the things I remember being the hardest to do is capacity prediction.

Speaker: 00:26:08

You

Speaker: 00:26:08

know, predicting whether or not I have enough capacity to

Speaker: 00:26:12

do my backups for the next six

Speaker: 00:26:13

and you know what makes it even harder?

Speaker: 00:26:15

What's that?

Speaker: 00:26:17

Dedupe. Dedupe makes it way harder.

Speaker: 00:26:21

And you know what? AI, right?

Speaker: 00:26:24

AI/ML could, could be used because it's smarter than I am.

Speaker: 00:26:30

Smarter than you are.

Speaker: 00:26:31

It could actually understand the trends

Speaker: 00:26:34

as to... now what, what, let's talk about that. Not every,

Speaker: 00:26:39

everybody might not understand.

Speaker: 00:26:41

why dedupe makes capacity,

Speaker: 00:26:44

Sure.

Speaker: 00:26:45

uh, management so

Speaker: 00:26:47

So let's talk about the, before we get to dedupe, let's talk about like

Speaker: 00:26:50

traditional storage or tape, right?

Speaker: 00:26:53

If you're doing a full backup, you know how big your database is, therefore,

Speaker: 00:26:56

you know, okay, my full backup is gonna take this much space and

Speaker: 00:27:00

you know, with compression, maybe it's gonna be 2x, or half the space, right?

Speaker: 00:27:04

And then, you know, okay, my daily change rate is say 5%, and based on the

Speaker: 00:27:08

total size, I know what that's gonna be.

Speaker: 00:27:10

And so

Speaker: 00:27:10

if I'm doing weekly fulls, daily incrementals, I know how much

Speaker: 00:27:14

storage I'm gonna need for a week.
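
The arithmetic being described is simple enough to write down. A sketch, assuming one weekly full plus six daily incrementals, the 5% daily change rate mentioned above, and 2:1 compression; the function and parameter names are illustrative only.

```python
def weekly_storage_tb(full_tb, daily_change=0.05, incrementals=6, compression=2.0):
    """Storage for one week: a full backup plus daily incrementals, compressed."""
    raw_tb = full_tb + incrementals * full_tb * daily_change
    return raw_tb / compression


# A 10 TB database: 10 TB full + 6 x 0.5 TB incrementals = 13 TB raw,
# which is about 6.5 TB on tape or disk at 2:1 compression.
print(weekly_storage_tb(10.0))
```

Without dedupe, the same number also tells you how much space comes back when that week's backups expire, which is the point made next.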

Speaker: 00:27:16

Yeah.

Speaker: 00:27:16

And, and just as, and just as important, you also know how

Speaker: 00:27:20

much storage, when you delete

Speaker: 00:27:23

the, you know, the older backups.

Speaker: 00:27:26

Yeah.

Speaker: 00:27:26

You know how much storage will be freed up, which is just as, if not even more,

Speaker: 00:27:29

important.

Speaker: 00:27:30

Now the problem with deduplication is they talk about these great rates like

Speaker: 00:27:34

40x, 30x, 20x, take your pick, right?

Speaker: 00:27:38

And that's all great.

Speaker: 00:27:39

If, like, a lot of your data is very similar. But it's hard

Speaker: 00:27:44

to tell whether your data is similar or not until you actually start doing it.

Speaker: 00:27:48

So if you're trying to buy storage for, say, three years

Speaker: 00:27:51

ahead of time, as a capacity plan,

Speaker: 00:27:53

it becomes really difficult.

Speaker: 00:27:54

And so you guess, right?

Speaker: 00:27:56

You'll take a stab and maybe you look at some of your data and you're like,

Speaker: 00:27:58

Hey, these kind of look the same, but you don't know if that's right or not

Speaker: 00:28:02

until you actually start backing it up.

Speaker: 00:28:04

And like you said, Curtis, if you go delete your backup, you may not

Speaker: 00:28:09

actually free up that space because it's been de-duplicated against something

Speaker: 00:28:12

else that you're still preserving.

Speaker: 00:28:14

right,

Speaker: 00:28:15

Say I go delete my backup from six months ago for one application.

Speaker: 00:28:19

Another application might have, uh, common blocks with that data or with that other

Speaker: 00:28:24

application.

Speaker: 00:28:25

And so even though I deleted the first application's backup,

Speaker: 00:28:28

it's not gonna free up space.
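
Why deleting a deduplicated backup may free almost nothing can be shown with simple reference counting. This sketch assumes each backup is just a list of chunk fingerprints; the names and data layout are illustrative, not any product's on-disk format.

```python
from collections import Counter


def reclaimable_chunks(backups, victim):
    """Return the chunks freed by deleting `victim`: those no other backup uses."""
    refs = Counter()
    for chunks in backups.values():
        refs.update(set(chunks))  # count each chunk once per backup
    return {c for c in set(backups[victim]) if refs[c] == 1}


backups = {
    "app1-six-months-ago": ["a", "b", "c"],
    "app2-last-night": ["b", "c", "d"],   # shares blocks b and c with app1
}
# Deleting app1's backup only frees chunk "a"; "b" and "c" are still referenced.
print(reclaimable_chunks(backups, "app1-six-months-ago"))
```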

Speaker: 00:28:29

And so you end up with this problem and this challenge.

Speaker: 00:28:33

And that's one of the things, the hardest things about deduplication.

Speaker: 00:28:37

Having worked at a company that did deduplication, customers

Speaker: 00:28:41

always struggled with it,

Speaker: 00:28:43

Yeah,

Speaker: 00:28:44

And some of the

Speaker: 00:28:44

things we would do is we would be like, Hey, let's scan your

Speaker: 00:28:47

application and just understand what sort of dedupe rates you may get.

Speaker: 00:28:52

And even that's a guess, because maybe you move an application from one storage

Speaker: 00:28:55

appliance to a different appliance and now your dedupe rates are different.

Speaker: 00:29:00

Yeah.

Speaker: 00:29:01

And, and, and again, the

Speaker: 00:29:02

one of the most frustrating things could be if you, you start...

Speaker: 00:29:06

You're running outta capacity, right?

Speaker: 00:29:09

And so you say, listen, I know we said we wanted to keep backups for

Speaker: 00:29:13

three years, but we're running outta capacity and so we're gonna start

Speaker: 00:29:16

deleting three years minus a month.

Speaker: 00:29:19

And you do that and you get

Speaker: 00:29:20

back 0.1% of your... it can be very difficult.

Speaker: 00:29:26

Um,

Speaker: 00:29:27

And the fact that to free up that space takes time.

Speaker: 00:29:30

Because typically with a lot of these systems, there's a background process

Speaker: 00:29:33

typically called garbage collection,

Speaker: 00:29:35

which goes and now needs to free up all this data and that does take time to run.

Speaker: 00:29:40

Yeah, it is, it is a two stage process where you, you, you, um, flag that

Speaker: 00:29:46

block for deletion and then another

Speaker: 00:29:49

process that runs typically when backups aren't running.

Speaker: 00:29:52

Um, and you, you probably have to force the garbage collection process.

Speaker: 00:29:57

Um, so go, go ahead.
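
The two-stage process described here (flag now, reclaim later) might be sketched like this. The `ChunkStore` class is a toy stand-in invented for illustration; real systems track references on disk and run collection as a scheduled background job.

```python
class ChunkStore:
    """Toy chunk store with two-stage deletion: flag first, reclaim later."""

    def __init__(self):
        self.chunks = {}      # fingerprint -> chunk bytes
        self.flagged = set()  # fingerprints marked for deletion

    def flag(self, fingerprint):
        """Stage 1: cheap, runs at delete time. No space is freed yet."""
        if fingerprint in self.chunks:
            self.flagged.add(fingerprint)

    def collect_garbage(self):
        """Stage 2: run later (e.g. outside the backup window) to free space."""
        freed = 0
        for fingerprint in self.flagged:
            freed += len(self.chunks.pop(fingerprint))
        self.flagged.clear()
        return freed


store = ChunkStore()
store.chunks = {"x": b"1234", "y": b"56"}
store.flag("x")
print(len(store.chunks))        # still 2: flagging alone frees nothing
print(store.collect_garbage())  # bytes actually reclaimed
```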

Speaker: 00:29:59

so I was just thinking as we were talking about the first time

Speaker: 00:30:03

that I heard about AI in storage,

Speaker: 00:30:07

and I think the first company that I can recall, and I'm sure there

Speaker: 00:30:11

were others, was actually Nimble

Speaker: 00:30:14

Storage. And Nimble,

Speaker: 00:30:16

what they did with their first product when they built it, so

Speaker: 00:30:19

they provided primary storage.

Speaker: 00:30:21

And their first product, they basically were like, Hey, we are optimized for SQL.

Speaker: 00:30:27

We are optimized for VMware.

Speaker: 00:30:29

We are optimized for these different... and I was like, oh, that's pretty awesome.

Speaker: 00:30:32

They're doing it dynamically.

Speaker: 00:30:33

But I think at the time it was kind of a static thing where you

Speaker: 00:30:36

would say, Hey, I have VMware.

Speaker: 00:30:38

I'm writing into this data store.

Speaker: 00:30:41

And it would optimize its, and it would basically pick different

Speaker: 00:30:45

block sizes for deduplication

Speaker: 00:30:47

Right, right, right.

Speaker: 00:30:49

Yeah.

Speaker: 00:30:49

That's interesting.

Speaker: 00:30:50

The, the, the, I, I, I think, going back to the thing

Speaker: 00:30:56

we were talking about, of, like,

Speaker: 00:30:59

Using AI to basically help me understand when do I need to order more storage?

Speaker: 00:31:05

It can, to the best of its ability.

Speaker: 00:31:07

It can actually look at all of the dedupe rates, right?

Speaker: 00:31:11

At all of the at, at what?

Speaker: 00:31:13

It could look at the dedupe rate of each individual backup, right?

Speaker: 00:31:17

You, you gave, you told me it's a backup this much and this is

Speaker: 00:31:19

how much, and so we can actually

Speaker: 00:31:20

run all those calculations and I can actually figure out.

Speaker: 00:31:24

Well, in six months... if everything stays the

Speaker: 00:31:26

same in six months, you're gonna be

Speaker: 00:31:29

outta storage.

Speaker: 00:31:30

Which many vendors actually do.

Speaker: 00:31:32

Yeah.
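
The "if everything stays the same, you're out of storage in six months" projection can be approximated with a straight-line fit over post-dedupe usage. This is a sketch under that linear-growth assumption; the function name and daily sampling are illustrative, and vendors presumably use richer models.

```python
def days_until_full(used_tb, capacity_tb):
    """Fit a line to daily usage samples and project when capacity is reached.

    Returns None when there is too little data or usage is not growing.
    """
    n = len(used_tb)
    if n < 2:
        return None
    mean_x = (n - 1) / 2
    mean_y = sum(used_tb) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(used_tb))
    variance = sum((x - mean_x) ** 2 for x in range(n))
    slope = covariance / variance  # TB of growth per day
    if slope <= 0:
        return None
    return (capacity_tb - used_tb[-1]) / slope


# Growing 1 TB per day with 96 TB of headroom left on a 200 TB appliance.
print(days_until_full([100, 101, 102, 103, 104], 200))
```

Because the input is actual consumed space, changes in dedupe rate are already reflected in the trend rather than estimated separately.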

Speaker: 00:31:32

Yeah.

Speaker: 00:31:33

Um, so the, the, um,

Speaker: 00:31:37

Because I think storage capacity is a little easier

Speaker: 00:31:40

to predict, because like you said, you're not really changing things, right.

Speaker: 00:31:44

You know what your policy is.

Speaker: 00:31:45

You know what data's coming in, you know how long it's, you're keeping it,

Speaker: 00:31:49

you know what your deduplication rates are, you know how much it's filling up.

Speaker: 00:31:53

So I think it's a little easier than what we had talked about previously

Speaker: 00:31:57

where it's like, okay, now let me plan out my entire backup infrastructure

Speaker: 00:32:00

and start scheduling that.

Speaker: 00:32:01

Yeah.

Speaker: 00:32:02

Speaking of dedupe, can AI help dedupe itself?

Speaker: 00:32:07

Do you think that?

Speaker: 00:32:08

It can.

Speaker: 00:32:10

So I think my biggest

Speaker: 00:32:12

challenge would be that to run AI requires compute

Speaker: 00:32:19

and usually with backup,

Speaker: 00:32:20

you want to go as fast as you can,

Speaker: 00:32:23

Mm-hmm.

Speaker: 00:32:23

right?

Speaker: 00:32:24

And so I think there's that tension

Speaker: 00:32:26

that exists between running as fast as you can versus introducing

Speaker: 00:32:31

something in the pipeline that could potentially slow things down.

Speaker: 00:32:35

And you'd have to also ask at what cost, right?

Speaker: 00:32:39

Like, are you going to be saving, say, 70% additional versus traditional

Speaker: 00:32:44

algorithms, or is it gonna be much less

Speaker: 00:32:48

Yeah, I think dedupe in, um, in the backup world, there, there, there

Speaker: 00:32:57

have been two main ways to do dedupe, which has been, there has been

Speaker: 00:33:02

something that isn't really dedupe, but

Speaker: 00:33:05

there were products that called themselves dedupe products that did this.

Speaker: 00:33:09

Uh, and that would be block level, um,

Speaker: 00:33:12

incremental, essentially.

Speaker: 00:33:14

Right?

Speaker: 00:33:14

Not

Speaker: 00:33:14

actually de-duping things against each other, but just.

Speaker: 00:33:17

Using technology to lower the additional new data that's

Speaker: 00:33:22

backed up from each workload.

Speaker: 00:33:24

But then the traditional dedupe, the way it works for those that don't know

Speaker: 00:33:27

this, is that you slice it up, you slice everything up into what are

Speaker: 00:33:31

typically called shards or chunks.

Speaker: 00:33:33

You run some type of algorithm on it that gives you some type of thing.

Speaker: 00:33:37

Like, like

Speaker: 00:33:38

A fingerprint.

Speaker: 00:33:39

the original SHA-2,

Speaker: 00:33:41

SHA-256.

Speaker: 00:33:42

And again, here the, the better the algorithm, um, the better the dedupe,

Speaker: 00:33:47

but the better the algorithm, the more compute it takes going back to

Speaker: 00:33:50

your trade off thing.

Speaker: 00:33:52

And so, um, that's the way basically every chunk it's run through, you come

Speaker: 00:33:57

up with this alphanumeric string, that alphanumeric string is compared

Speaker: 00:34:01

with every other alphanumeric string.

Speaker: 00:34:03

Um, and then that's how you identify redundant data.

Speaker: 00:34:07

And one of the challenges you have with that method is that, uh, the data slides,

Speaker: 00:34:11

um, and so if you don't slice the data at exactly the same spot, it, it's duplicate

Speaker: 00:34:16

data, but you don't, don't identify it.
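
The slice, fingerprint, and compare steps, and the "data slides" problem, can be demonstrated in a few lines. This is a sketch assuming fixed-size chunks and SHA-256 fingerprints; the 8-byte chunk size is unrealistically small and chosen only to keep the example readable.

```python
import hashlib


def fixed_chunks(data: bytes, size: int = 8):
    """Slice data into fixed-size chunks (the last one may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def fingerprints(chunks):
    """Fingerprint each chunk; identical chunks yield identical strings."""
    return [hashlib.sha256(c).hexdigest() for c in chunks]


def shared_chunks(old: bytes, new: bytes, size: int = 8) -> int:
    """Count chunks of `new` whose fingerprint already exists in `old`."""
    known = set(fingerprints(fixed_chunks(old, size)))
    return sum(1 for fp in fingerprints(fixed_chunks(new, size)) if fp in known)


original = bytes(range(64))
shifted = b"X" + original  # one byte inserted at the front

print(shared_chunks(original, original))  # every chunk is recognized
print(shared_chunks(original, shifted))   # the one-byte shift breaks every match
```

The second result is the sliding problem in miniature: the data is almost entirely duplicate, but every chunk boundary moved, so no fingerprint matches.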

Speaker: 00:34:19

The, there is a completely different way which, um, you

Speaker: 00:34:24

look at the way VAST does things.

Speaker: 00:34:26

They do something completely different, right?

Speaker: 00:34:28

So they, they have an algorithm and, and I, I'm guessing they

Speaker: 00:34:31

use AI or ML to, to do this.

Speaker: 00:34:34

They have an algorithm that, um, basically identifies data that

Speaker: 00:34:41

is probably redundant, right?

Speaker: 00:34:43

Um, that, that, so they, they've got two different ways to do de-dupe and I, so

Speaker: 00:34:48

there are potentially, again, potentially.

Speaker: 00:34:53

AI or ML could be used to identify a new way to identify duplicate

Speaker: 00:35:00

data that is maybe, maybe

Speaker: 00:35:02

more efficient from a compute and storage.

Speaker: 00:35:06

Like even if it was just more efficient from a compute standpoint,

Speaker: 00:35:09

but got the but got the same amount of dedupe, that would still

Speaker: 00:35:13

be great.

Speaker: 00:35:14

Um, but

Speaker: 00:35:16

potentially this is something

Speaker: 00:35:18

that I think, uh, AI could

Speaker: 00:35:19

and the one thing I did also want to comment on Curtis is, uh, going back to

Speaker: 00:35:24

your comment about, okay, if the data shifts, then now you have to make sure

Speaker: 00:35:27

that you're doing the right blocks, right?

Speaker: 00:35:30

Uh, this is where companies though have done sort of, uh, what you're

Speaker: 00:35:34

talking about is called fixed block.

Speaker: 00:35:35

Fixed block deduplication,

Speaker: 00:35:37

right?

Speaker: 00:35:38

There are

Speaker: 00:35:38

many vendors out there though, who do variable size.

Speaker: 00:35:41

Variable block, uh, deduplication, which allows it to vary such that if

Speaker: 00:35:46

you do get an offset right, because of some data change, it's still able to

Speaker: 00:35:51

dedupe everything else after that because

Speaker: 00:35:53

of how it's actually computing the chunks, the segments, right?

Speaker: 00:35:58

Each of

Speaker: 00:35:58

the blocks.
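
Variable-block dedupe is usually done with content-defined chunking: a chunk ends wherever a hash of the last few bytes hits a chosen pattern, so boundaries follow the data rather than fixed offsets. The sketch below is a simplified illustration. It recomputes a CRC32 over a small window at every position instead of using a true rolling hash, and the window and divisor values are arbitrary assumptions.

```python
import hashlib
import random
import zlib


def cdc_chunks(data: bytes, window: int = 16, divisor: int = 64):
    """Cut a chunk wherever the hash of the trailing `window` bytes hits 0."""
    chunks, start = [], 0
    for i in range(window - 1, len(data)):
        # The boundary decision depends only on nearby content, not on offset.
        if zlib.crc32(data[i - window + 1:i + 1]) % divisor == 0:
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])  # trailing partial chunk
    return chunks


def shared_chunks(old_chunks, new_chunks) -> int:
    """Count chunks in `new_chunks` whose fingerprint exists in `old_chunks`."""
    known = {hashlib.sha256(c).digest() for c in old_chunks}
    return sum(1 for c in new_chunks if hashlib.sha256(c).digest() in known)


rng = random.Random(0)
original = bytes(rng.getrandbits(8) for _ in range(4096))
shifted = b"X" + original  # one byte inserted at the front

base = cdc_chunks(original)
print(len(base), shared_chunks(base, cdc_chunks(shifted)))
```

With a byte inserted at the front, only the first chunk changes; every later boundary lands on the same content, so the remaining chunks still deduplicate.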

Speaker: 00:35:59

Yep.

Speaker: 00:36:00

Um, so, uh, so that, that's certainly an area where, where AI could potentially

Speaker: 00:36:06

help the, um, the next, do you think it could help with recovery testing?

Speaker: 00:36:13

Oh yeah, I would.

Speaker: 00:36:15

So one thing I foresee is, like, most people probably don't

Speaker: 00:36:20

know how to write a DR plan,

Speaker: 00:36:22

Mm-hmm.

Speaker: 00:36:23

Mm-hmm.

Speaker: 00:36:24

right.

Speaker: 00:36:25

Um, I wonder if you took AI, like even, and I'm going back to the first

Speaker: 00:36:31

set, right, the large language models,

Speaker: 00:36:33

Yep.

Speaker: 00:36:34

So the thing we said we

Speaker: 00:36:35

weren't talking about, I think we're gonna talk about it here.

Speaker: 00:36:38

Yeah.

Speaker: 00:36:38

I think at least to start with, it's like, Hey, here's all my data.

Speaker: 00:36:42

Here's my applications.

Speaker: 00:36:43

Help me build a DR test plan.

Speaker: 00:36:46

Yeah,

Speaker: 00:36:47

I like that idea.

Speaker: 00:36:48

And

Speaker: 00:36:49

see what it pops out because, and it may not be perfect, and don't just

Speaker: 00:36:52

blindly trust what it provides, but use it as a starting point, right?

Speaker: 00:36:55

And then go use that.

Speaker: 00:36:56

Because I think a lot of people struggle with, where do I even start?

Speaker: 00:37:00

Yeah.

Speaker: 00:37:01

And you could also, um, you could use it like a chaos monkey,

Speaker: 00:37:06

right?

Speaker: 00:37:06

You could use it.

Speaker: 00:37:07

Help me come up with some interesting scenarios.

Speaker: 00:37:09

To just make the, the idea, you know, one of the things that we talked about in

Speaker: 00:37:14

terms of, uh, cyber testing, uh, was, um,

Speaker: 00:37:18

you know, when we had Mike on, the idea of, like, doing this and, and

Speaker: 00:37:21

making it, making it fun, making it a game, uh, I like that idea a

Speaker: 00:37:25

lot and I think maybe AI could help

Speaker: 00:37:27

there.

Speaker: 00:37:27

Um,

Speaker: 00:37:28

if, if it helps you do recovery testing more often, um, and, uh,

Speaker: 00:37:33

helps you identify potential, uh, uh, plot, I was gonna say plot

Speaker: 00:37:37

holes, uh, potential, potential holes in your program, uh, then that, then that

Speaker: 00:37:43

I think could be, um, very

Speaker: 00:37:45

helpful.

Speaker: 00:37:45

And Curtis, since you threw out a term, Chaos Monkey is a tool that was released

Speaker: 00:37:50

by Netflix, and literally what it is used for is to just test IT resiliency.

Speaker: 00:37:56

So it'll go randomly kill services, kill locations, kill

Speaker: 00:37:59

network connections, just to see:

Speaker: 00:38:02

Is streaming interrupted? Are, uh, end users having any sort of

Speaker: 00:38:06

issues? And it's able to do this at a scale and in an automated fashion

Speaker: 00:38:11

versus someone like trying to think about all the combinations,

Speaker: 00:38:13

permutations, and scenarios, because they're probably gonna miss things.

Speaker: 00:38:17

And so Netflix designed this thing to actually go out and

Speaker: 00:38:20

test their infrastructure.
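
For recovery testing, the same idea could be borrowed in a far gentler form: pick a random system and failure type for the next drill, so tests cover cases nobody would think to choose. A sketch only; the system names and failure modes are made up, and this selects a scenario to rehearse rather than breaking anything.

```python
import random


def pick_drill(systems, failure_modes, seed=None):
    """Randomly choose which system and failure to rehearse in the next drill."""
    rng = random.Random(seed)  # pass a seed to make the choice repeatable
    return {"system": rng.choice(systems), "failure": rng.choice(failure_modes)}


systems = ["sql-prod-01", "file-server", "email", "vm-cluster"]
failures = ["ransomware encryption", "accidental deletion", "site loss"]
print(pick_drill(systems, failures))
```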

Speaker: 00:38:22

It is pretty impressive.

Speaker: 00:38:24

Uh, you know, their infrastructure in general is pretty impressive.

Speaker: 00:38:27

It's not flawless.

Speaker: 00:38:28

Um, I did, I did watch part of the, uh.

Speaker: 00:38:32

The Tyson fight a little while ago, and that was on Netflix

Speaker: 00:38:36

and it was not good, right?

Speaker: 00:38:39

That wasn't so much a resiliency thing as it was...

Speaker: 00:38:42

They just, again, they could have used perhaps a little bit better

Speaker: 00:38:45

AI to predict the, what kind of load they were gonna have.

Speaker: 00:38:49

But yeah.

Speaker: 00:38:50

But the idea of predicting crazy things that will happen, uh, Netflix

Speaker: 00:38:56

is pretty darn resilient, uh, when it comes to their infrastructure,

Speaker: 00:38:59

Yep.

Speaker: 00:39:00

yeah, I, I like that idea a lot.

Speaker: 00:39:02

Um, and, and I think, I think this is something that could be, that,

Speaker: 00:39:05

that, that, again, an, uh, uh, an LLM could actually help with, right?

Speaker: 00:39:09

So, like I said, the thing that we said we weren't gonna talk about,

Speaker: 00:39:11

we could talk about it, right?

Speaker: 00:39:13

Um, and for those, if you've never used a ChatGPT or a Claude,

Speaker: 00:39:17

uh, I think it's very useful

Speaker: 00:39:19

here, right?

Speaker: 00:39:20

You, you could say, Hey, I, I'm this kind of company.

Speaker: 00:39:23

This is the type of company, you know, and I understand the,

Speaker: 00:39:26

the privacy concerns of what you

Speaker: 00:39:27

share with a ChatGPT or a Claude.

Speaker: 00:39:29

Uh, there, there are, by the way, there are on-prem versions that

Speaker: 00:39:32

you can run, uh, of these LLMs too, so that you can keep the

Speaker: 00:39:36

data to yourself.

Speaker: 00:39:38

But the, you have a conversation with it.

Speaker: 00:39:41

Here's the type of company I am, here's the type of computing environment I have.

Speaker: 00:39:45

What do you th... what could go

Speaker: 00:39:46

wrong?

Speaker: 00:39:47

Um, you know what, what could I build a, a DR scenario

Speaker: 00:39:51

around?

Speaker: 00:39:51

Any final thoughts?

Speaker: 00:39:52

Can you think of, uh, any other areas where we could use AI in, in backup?

Speaker: 00:39:59

Not so much.

Speaker: 00:39:59

I think the one thing I do wanna call out though is AI is here to stay.

Speaker: 00:40:05

ML is here to stay.

Speaker: 00:40:06

Don't be afraid of it.

Speaker: 00:40:08

Use it,

Speaker: 00:40:09

right, in the right ways, and don't be afraid and just start thinking about it.

Speaker: 00:40:14

Uh, the one other thing I will call out is as companies are starting

Speaker: 00:40:19

to dig into AI and ML for their own applications, production applications

Speaker: 00:40:24

and other things, as a backup admin, you need to start thinking

Speaker: 00:40:28

about how do I protect this, right?

Speaker: 00:40:30

How do I back it up?

Speaker: 00:40:31

How would I potentially restore it?

Speaker: 00:40:32

Because there's a lot of data, and training these models

Speaker: 00:40:36

is really, really expensive.

Speaker: 00:40:39

Mm.

Speaker: 00:40:39

And so you wanna make sure you have mechanisms to protect the models

Speaker: 00:40:44

that emerge from all of this training so you can restore them if needed.

Speaker: 00:40:50

So use backup to, to make AI more resilient while AI makes backup more

Speaker: 00:40:55

resilient.

Speaker: 00:40:56

I like that.

Speaker: 00:40:57

We'll call that a symbiosis.

Speaker: 00:40:59

I like that a lot.

Speaker: 00:41:01

Uh, my one final thought is that potentially you could use, again,

Speaker: 00:41:05

going back to the thing we said we weren't gonna talk about.

Speaker: 00:41:08

You could use LLMs to help select vendors, right?

Speaker: 00:41:12

You could say, Hey, here are all my requirements and here's all the

Speaker: 00:41:15

documents that they, they gave me this 57-page response to my 10-page RFI.

Speaker: 00:41:21

Can you help me make sense of it?

Speaker: 00:41:23

Um, and, uh, you, you could use that again, trust but

Speaker: 00:41:26

verify when using an LLM for

Speaker: 00:41:28

sure.

Speaker: 00:41:30

All right, well, thanks again, Prasanna, uh, for a good chat.

Speaker: 00:41:34

Thank you, Curtis.

Speaker: 00:41:35

And I am not gonna change how I hold a coffee mug.

Speaker: 00:41:38

I'm sorry.

Speaker: 00:41:40

I, I would expect no less.

Speaker: 00:41:42

And thanks to our listeners, uh, we'd be nothing without you.

Speaker: 00:41:45

That is a wrap.