Speaker:

Listeners of this podcast have heard why snapshots aren't backup.

Speaker:

You've also learned why replication isn't backup either.

Speaker:

But what if I make a snapshot on one array and replicate it to another array?

Speaker:

Is that a backup?

Speaker:

A lot of people would say yes.

Speaker:

We'll also need things like reporting and cataloging, of course, but I would

Speaker:

argue that snap and replicate, also known as near CDP, is one of the most

Speaker:

efficient ways we have of protecting data.

Speaker:

Hi, I'm W. Curtis Preston, AKA Mr.

Speaker:

Backup, and each episode of this podcast dives deep into a backup-related topic.

Speaker:

We turn unappreciated backup admins into cyber recovery heroes.

Speaker:

This is The Backup Wrap-Up.

Speaker:

Welcome to the show.

Speaker:

Thanks for listening today.

Speaker:

I am your host, W. Curtis Preston, and I have with me the guy that

Speaker:

made me wait for him today.

Speaker:

Prasanna Malaiyandi

Speaker:

I am good, Curtis.

Speaker:

I am so sorry for making you wait.

Speaker:

And why did I wait again?

Speaker:

So you could finish the last few minutes of a series that you have already

Speaker:

Well, it's, and it's not even the entire series, it is just

Speaker:

the episode that I was on.

Speaker:

Now, granted, I'm near the end of the show, so it is sort of getting

Speaker:

to the cliffhanger phases, but I was enjoying my lunch while watching

Speaker:

the last bits of a show, and

Speaker:

Yeah.

Speaker:

then Curtis, and I was like, Curtis, I need five minutes.

Speaker:

And then I was like, no, I need seven minutes because

Speaker:

there's still six minutes left.

Speaker:

Well.

Speaker:

It is time for the news of the week.

Speaker:

The first news of the week falls into one of my favorite news categories.

Speaker:

Do you know what category that is?

Speaker:

Things that you should be backing up that people don't

Speaker:

realize until the data's gone.

Speaker:

Yeah, I was just gonna say, I told you so.

Speaker:

So what do you wanna do?

Speaker:

You wanna jump right in on our first news

Speaker:

yeah.

Speaker:

So this is actually something that I ran into on Reddit, which

Speaker:

I for some reason get sysadmin

Speaker:

Subreddit, uh, articles in my feed and it was like, Hey, did anyone notice that

Speaker:

they have data missing from Google Drive?

Speaker:

And I was like, oh man, what is this?

Speaker:

And then it slowly started picking up.

Speaker:

I think The Register carried it.

Speaker:

BleepingComputer and other folks as well, but.

Speaker:

Basically what happened is people realized all of a sudden that some

Speaker:

of their data was gone and that they didn't have any of their files and

Speaker:

other changes since May of this year.

Speaker:

Since May of this year, and this is being recorded at the end of November,

Speaker:

so that is a long amount of time

Speaker:

And Google has officially responded.

Speaker:

And what they're saying, as I was looking at these instructions,

Speaker:

and it basically said like, um, don't mess around with your drive right now.

Speaker:

Right.

Speaker:

So they're saying don't disconnect.

Speaker:

Don't do any structural changes to your drive.

Speaker:

And to me what that says is, there was a suggestion that maybe

Speaker:

somebody did a rollback of some snapshot.

Speaker:

Is that

Speaker:

Yeah, and I think we should be clear, this isn't just general

Speaker:

Google Drive, so if you just access it through the web portal, right?

Speaker:

All of that stuff still works fine.

Speaker:

That has all your data.

Speaker:

This only specifically affected customers who were using Google

Speaker:

Drive for desktop to connect.

Speaker:

So that I, I gotta

Speaker:

Oh, is that no longer

Speaker:

I.

Speaker:

I'm seeing comments from the, that was what we thought a few days ago, but if

Speaker:

you follow some comments, and again, we're coming at this from,

Speaker:

you know, from the outside and we're not personally being impacted, but

Speaker:

there are people who are saying that they've never used the desktop version

Speaker:

and they're experiencing the problem.

Speaker:

Those people may be wrong, it's just a couple of people.

Speaker:

Um, but they're saying that they're experiencing the problem, but.

Speaker:

So I, whoa.

Speaker:

Which is, yeah, that's news, because I was worried about

Speaker:

that, so I went and looked.

Speaker:

And Curtis, I know we use OBS as a backup for recording this podcast

Speaker:

and we upload that to Google Drive.

Speaker:

And I did go and check to make sure, because we just uploaded one a couple

Speaker:

of weeks ago, and that still exists.

Speaker:

So I don't know if whatever we had shared it with, it looks like

Speaker:

not everyone is affected, but.

Speaker:

right.

Speaker:

It looks like there is some random set of folks who, somehow, for

Speaker:

some reason are affected by having some of their data gone.

Speaker:

Yeah, so it looks like, uh, I don't remember exactly where I read it,

Speaker:

but the idea is that someone basically rolled the entire drive, or

Speaker:

a section of this entire drive back to, uh, essentially the end of April and.

Speaker:

So, and that's consistent with what they're saying of like, don't do,

Speaker:

don't do any work in this right now and don't do any structural changes.

Speaker:

'cause what I think they're going to try to do is to basically undo that

Speaker:

action, which would then put all of those same customers back to what

Speaker:

they had before all of this happened.

Speaker:

And if you're making any changes in there right now, those changes

Speaker:

will be undone by that change.

Speaker:

What's a little disconcerting is that, you know,

Speaker:

we talk a lot about how companies respond and Google isn't doing those things.

Speaker:

There's

Speaker:

very, very little information out there about this.

Speaker:

Yeah, there's just this one page that basically said, Hey, don't do anything.

Speaker:

I haven't seen any updated stories. I checked before

Speaker:

we recorded this, and I'm a little concerned about that.

Speaker:

I'm also surprised that

Speaker:

they gave service engineering or whoever else the capability to even roll back

Speaker:

production to a snapshot that far back without checks and balances in place.

Speaker:

Now, I don't know what happened, right?

Speaker:

This is all just assumption, but it's a little scary that someone

Speaker:

had that sort of capability.

Speaker:

It's a lot scary.

Speaker:

Uh, imagine if you're a company that is using this, you know, you've

Speaker:

got all sorts of stuff stored in there and you just rolled it back.

Speaker:

Uh, yeah, I hope that there's an update on this.

Speaker:

I hope we'll know more than we know right now, but, uh, if you're

Speaker:

a Google user, it's time to just double check what's going on in your

Speaker:

Yeah.

Speaker:

And this is one of the other reasons why you should back up

Speaker:

your data, even if you use Google

Speaker:

Yes, yes.

Speaker:

This is why I put it in the "I told you so" category. You know,

Speaker:

the cloud is a magical place, but it's not magic.

Speaker:

So, um, let's take a look at this other story.

Speaker:

And it has to do with a ransomware attack on a hospital chain in

Speaker:

Nashville, Tennessee, and they've got.

Speaker:

30 hospitals and 200 care sites around the country.

Speaker:

Oklahoma, Texas, New Jersey, New Mexico, Idaho, and Kansas.

Speaker:

And they were forced to divert patients from a number of ERs, and one of

Speaker:

the other things was that people weren't able to book appointments

Speaker:

at, um, you know, their usual doctor because the patient portal was down.

Speaker:

I just wanna say.

Speaker:

It wasn't that long ago.

Speaker:

Do you remember when the ransomware groups, they specifically

Speaker:

didn't target healthcare?

Speaker:

Um, because, you know, people can

Speaker:

die, but that is

Speaker:

clearly gone.

Speaker:

And remember there was the case in Germany, I think, where a patient died

Speaker:

because an ER was closed and they had to reroute them to a different one.

Speaker:

And that's, I think when it came out where ransomware actors were like,

Speaker:

yeah, maybe we will avoid hospitals.

Speaker:

Yeah, but clearly not here.

Speaker:

Right.

Speaker:

They targeted this group, and the only update that I've seen is that

Speaker:

they are starting to restore some services.

Speaker:

The company said that they did notify law enforcement.

Speaker:

It did say that they contracted a cybersecurity firm.

Speaker:

These are all good things.

Speaker:

These are the things that we like to hear.

Speaker:

These are what people should be doing.

Speaker:

Um, but we don't yet know if they were able to restore services

Speaker:

or if they paid the ransom.

Speaker:

Uh, we don't, you know, we don't know yet.

Speaker:

And I think the other big thing is these are your medical records, right?

Speaker:

And another unintended consequence is some of these particular

Speaker:

facilities provide specialized care for certain types of, uh, ailments.

Speaker:

And if they're

Speaker:

providing that specialized care and then they're down,

Speaker:

it's not like they can just divert that care to some other place.

Speaker:

So yeah, this is a real mess.

Speaker:

Um, you know, the thing I think we can take away from this is,

Speaker:

again, what I've already said.

Speaker:

I like that they contacted law enforcement.

Speaker:

I like that they contracted with a cybersecurity professional.

Speaker:

The key there is that you want to start having those conversations

Speaker:

now. You want to identify a cybersecurity firm that you can

Speaker:

contract with, that you can work with.

Speaker:

One of the ways to do this is to talk to a cybersecurity,

Speaker:

uh, like if you have cybersecurity insurance, to talk to them, uh, and see

Speaker:

if they can put you in touch with somebody now so that you can prepare, uh, you know.

Speaker:

Rather than, um, going to Google and saying "cybersecurity firm" in

Speaker:

Yeah.

Speaker:

the middle of your, uh, ransomware

Speaker:

attack.

Speaker:

So that's, uh, that's hopefully what we, yeah.

Speaker:

A little bit too late.

Speaker:

All right, well, that is the news of the week,

Speaker:

All right.

Speaker:

This week on The Backup Wrap-Up, we have another backup to basics

Speaker:

topic, and I wanted to talk this week about near CDP, which is

Speaker:

near continuous data protection.

Speaker:

And I, I think in order to do that we have to sort of back up a little bit and

Speaker:

talk about the things that we've talked about that have led up to this point.

Speaker:

Uh, these are modern backup and recovery methods.

Speaker:

Basically things that have been birthed in the last 20 years,

Speaker:

basically in the 21st century.

Speaker:

Before we talk about near CDP, I think we need to talk about the

Speaker:

things that have led up to this point.

Speaker:

And, uh, we're gonna talk about replication, snapshots, and what

Speaker:

we call continuous data protection.

Speaker:

So let's talk about replication first.

Speaker:

Do you wanna take that on?

Speaker:

Yeah, so replication is basically you're taking data in one system and

Speaker:

replicating it to the other system.

Speaker:

So the second system is an exact copy of the first system.

Speaker:

And in the case of synchronous, there's no data loss, right?

Speaker:

So your RPO is zero.

Speaker:

And yeah, so it is in sync.

Speaker:

It's basically a mirror.

Speaker:

And that also means you don't have multiple versions on that secondary side.

Speaker:

So if you have a logical corruption or you have a user error on the primary,

Speaker:

it's just gonna replicate it blindly to the other side, and that's what you get.

Speaker:

Exactly.

Speaker:

It makes your stupidity just more effective, is the way I

Speaker:

like to say it, and it doesn't matter whether you're, I mean, I suppose

Speaker:

maybe if you had asynchronous replication, maybe

Speaker:

there's a big enough buffer that maybe you could stop a disaster if

Speaker:

you did something really stupid.

Speaker:

But you'd really have to be on the ball, I would think,

Speaker:

uh, to, to do that in general.

Speaker:

The, the, the replication.

Speaker:

Replication will be great for DR when your site blows up, but

Speaker:

it will be really worthless if you're the one that blew it up,

Speaker:

Yep.

Speaker:

If, if you had dropped a table or, or you got a ransomware attack,

Speaker:

which I think we can all agree is.

Speaker:

Uh, a big deal, right, right now,

Speaker:

Oh yeah.

Speaker:

Uh, and by the way, each of these that we're summarizing have their

Speaker:

own episodes back before this episode.

Speaker:

So if you're not familiar with these topics,

Speaker:

this is just a review of them.

Speaker:

They, they each have their own episodes.

Speaker:

Yeah.

Speaker:

So I think next in the list of topics, I think we talked about CDP next.

Speaker:

So do you wanna talk about continuous.

Speaker:

Yeah.

Speaker:

So basically continuous data protection is replication with a back button, right?

Speaker:

It works very similarly to replication, except that the way it

Speaker:

stores the data on the other end, it does it in such a way that you can bring

Speaker:

The, you know, the destination back from the bad thing, right?

Speaker:

The great thing about replication is that it's incremental, right?

Speaker:

That it's block level and, you know, it can keep up with, you

Speaker:

know, relatively speaking, real time of what's going on in your production site.

Speaker:

The bad thing about replication is the exact same thing, right?

Speaker:

So, so CDP gives you the ability to go back in time.

Speaker:

If you did something stupid, like drop a table, get a ransomware attack, have

Speaker:

some sort of logical corruption, it gives you the ability to go back in time.
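
To make the "replication with a back button" idea concrete, here is a minimal Python sketch. It is only an illustration under simplifying assumptions: the `Mirror` and `Journal` classes, the integer timestamps, and the block names are all made up for this example and do not describe how any particular CDP product stores its data.

```python
from dataclasses import dataclass, field


@dataclass
class Mirror:
    """Plain replication: the target only ever holds the latest state."""
    blocks: dict = field(default_factory=dict)

    def write(self, ts, block, data):
        self.blocks[block] = data  # overwrites in place; no history is kept


@dataclass
class Journal:
    """CDP-style target: every write is kept along with its timestamp."""
    entries: list = field(default_factory=list)

    def write(self, ts, block, data):
        self.entries.append((ts, block, data))

    def state_at(self, ts):
        """Replay writes up to and including ts to rebuild that point in time."""
        state = {}
        for when, block, data in sorted(self.entries, key=lambda e: e[0]):
            if when > ts:
                break
            state[block] = data
        return state


mirror, journal = Mirror(), Journal()
writes = [
    (1, "table", "good rows"),
    (2, "index", "good index"),
    (3, "table", "DROPPED"),  # the mistake, faithfully replicated
]
for w in writes:
    mirror.write(*w)
    journal.write(*w)

mirror_now = mirror.blocks              # the mirror only has the damaged state
before_mistake = journal.state_at(2)    # the journal can go back to time 2
```

The journal also shows where the cost comes from: it has to keep every single write for as long as you want to be able to rewind.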

Speaker:

It has a couple of different ways that it does that.

Speaker:

Um.

Speaker:

amazing.

Speaker:

Curtis, why isn't everyone using

Speaker:

Yeah, if we were having this conversation, say, 20 years

Speaker:

ago, everybody was gonna do CDP.

Speaker:

Uh, the only problem is it's, it is expensive af right?

Speaker:

Uh, not just the cost of the software itself, it's also the cost of all of the

Speaker:

IO and all of the storage required to store essentially every single change

Speaker:

From, you know, during the entire recovery continuum that you are, uh,

Speaker:

trying to be able to support. And so there are very few

Speaker:

actual, I think, true CDP products.

Speaker:

There are some that are very specific, like Zerto, I think,

Speaker:

uh, would be a CDP product.

Speaker:

There's, uh, EMC RecoverPoint.

Speaker:

I know that a couple of the other products that I tracked have now been

Speaker:

acquired by other companies and they're just a product in their portfolio.

Speaker:

Um, but basically the problem is it's just too dang expensive, especially

Speaker:

if we're gonna use it for everything.

Speaker:

Right.

Speaker:

Um, and then we have snapshots.

Speaker:

Now you used to work at a company that

Speaker:

did a snapshot or two.

Speaker:

And by snapshots we mean storage snapshots, not the ones up in

Speaker:

AWS, which are entirely different.

Speaker:

Yeah.

Speaker:

So storage snapshots let you take a virtual copy of a particular volume,

Speaker:

file system, et cetera, and keep it there so you can quickly go back to it.

Speaker:

If you need to restore really rapidly, it's all stored locally, which is great.

Speaker:

And some companies, actually, I would probably say most companies these

Speaker:

days allow users to browse snapshots.

Speaker:

So they don't need to call up the IT help desk and be like, Hey,

Speaker:

can you restore this file for me that I accidentally deleted?

Speaker:

It's already there in the system.

Speaker:

They can manually browse it, they can pull the data out themselves,

Speaker:

self service, it's awesome.

Speaker:

Saves the backup team a bunch of time having to do restores.

Speaker:

The downside of snapshots though, is it's on the local system.

Speaker:

And when we talk about backups and the purpose of backups, you wanna

Speaker:

make sure you have a copy that's independent from that primary copy.

Speaker:

When you have a snapshot, if something happens to that system, if someone deletes

Speaker:

that volume, then that snapshot is gone and you've lost your quote unquote backup.

Speaker:

So a snapshot is not a backup.

Speaker:

And I will caveat that with what Curtis said earlier.

Speaker:

Snapshots have changed their names based on what the vendor decides to implement.

Speaker:

So an EBS snapshot isn't really the same as what I've just been talking about.

Speaker:

It is completely different.

Speaker:

They actually make a copy into AWS S3 that is independent from the production,

Speaker:

and therefore it doesn't follow what we've been calling snapshots,

Speaker:

even though AWS calls it a snapshot.

Speaker:

Yeah, exactly, and, and I think they're not the only cloud vendor to do that.

Speaker:

I also know, for example, our previous employer calls their backups.

Speaker:

They call them snapshots, which I didn't like when I worked there, and

Speaker:

I still don't like it.

Speaker:

But yeah, so when we're talking about snapshots here, we're talking about

Speaker:

storage snapshots, like what you would see in a NetApp or, uh, and there

Speaker:

are different kinds of snapshots.

Speaker:

There's copy-on-write.

Speaker:

There's redirect-on-write.

Speaker:

And again, there is a whole separate

Speaker:

episode just on that topic.

Speaker:

So my memory is that I coined the term near CDP back in the day.

Speaker:

They just, they just called it snapshots and replication.

Speaker:

And as you may recall, CDP was all the rage.

Speaker:

And I remember thinking that.

Speaker:

CDP was very expensive.

Speaker:

And because of that, very few people are going to use it.

Speaker:

They might use it for their, like, tier one applications, but they're not

Speaker:

gonna use it for regular everyday data.

Speaker:

And what was more common back in that time was that most people would use

Speaker:

NetApps for that type of data.

Speaker:

I mean, at that time, NetApp was kind of, you know, ruling the roost

Speaker:

of the NAS world, right?

Speaker:

Network-attached storage, and they were very big on snapshots and

Speaker:

then replicated snapshots, and they

Speaker:

and replicate.

Speaker:

I.

Speaker:

Um, you know, you could do multiple tiers of that.

Speaker:

They were happy and you could, you could

Speaker:

Yep.

Speaker:

Replicate the data all around the world.

Speaker:

exactly.

Speaker:

Exactly.

Speaker:

And I liked the term near continuous.

Speaker:

And I remember, um, one of the folks that I interfaced with was, um, uh,

Speaker:

Storagezilla, which is, uh, Mark Twomey.

Speaker:

Uh, he lives over there in Cork.

Speaker:

And, uh, that was my attempt to do a Cork accent for anyone who listens there.

Speaker:

And I remember he just really hated my term.

Speaker:

Like, he's like, continuous is a binary term, you know, like, like immutable.

Speaker:

It's a binary term.

Speaker:

It's either continuous or it's not.

Speaker:

You can't be near continuous.

Speaker:

Like, it's like saying you're near pregnant, right?

Speaker:

Pregnant is a binary term.

Speaker:

And I'm like, yes, but we do use the word like nearly dead.

Speaker:

Right?

Speaker:

There are times when we do put the word near next to a binary term, and I

Speaker:

just felt that this was a world that was much closer to continuous than it

Speaker:

was to what we thought of as backup.

Speaker:

Backup at that time, and honestly, even today.

Speaker:

I think, I don't know.

Speaker:

This is one of those, like, I don't know for a fact, but I'm

Speaker:

pretty darn sure that most people still just back up every night.

Speaker:

Yep.

Speaker:

Right.

Speaker:

It's like if your RPO is 24 hours or less, you are probably doing some

Speaker:

form of, I'm just gonna use quote unquote replication, which is all the

Speaker:

stuff we just talked about, right?

Speaker:

Which includes async, sync, CDP, near CDP, which, by the way, I also don't like

Speaker:

the term near CDP, but that's just me.

Speaker:

Well, you just have to get over it 'cause you're on, you're

Speaker:

on the podcast now, buddy.

Speaker:

Yeah, but, and then everything beyond 24 hours is probably backup.

Speaker:

And I know as technologies change and everyone was like, Hey, database backups.

Speaker:

I wanna do it more frequently than every 24 hours.

Speaker:

Let me do log backups and all the rest of that stuff.

Speaker:

That's when backup sort of started reducing the RPO

Speaker:

Right?

Speaker:

and started moving down into that near CDP space.

Speaker:

And again, if you're not familiar with the terms RTO and RPO, you

Speaker:

really should be: recovery time objective, recovery point objective.

Speaker:

It literally drives all backup design, right?

Speaker:

Recovery time objective is how long have, you know, us and

Speaker:

the, what do you call 'em?

Speaker:

The, um, sorry, the stakeholders.

Speaker:

What have we and the stakeholders agreed is an acceptable time

Speaker:

for the recovery to take, right?

Speaker:

We have to be able to bring the data back in four hours, right?

Speaker:

And then RPO is how much data we've agreed

Speaker:

that we are allowed to lose.

Speaker:

By a measurement of time: not, you know, "we could lose 10 gigabytes of data."

Speaker:

It's "we could lose one hour or four hours or 24 hours' worth of data."

Speaker:

That's what RPO is, and those two things drive backup design.
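
As a rough illustration of RPO being measured in time rather than gigabytes, the following Python sketch checks a protection schedule against an agreed RPO. It assumes a deliberately simple model: the worst-case loss is the interval between recovery points plus any delay before the copy lands on the other side. The function names and the example numbers are hypothetical.

```python
from datetime import timedelta


def worst_case_data_loss(interval, replication_lag=timedelta(0)):
    """Worst case under this simple model: the failure hits just before the
    next recovery point would have been taken and copied off the primary."""
    return interval + replication_lag


def meets_rpo(interval, rpo, replication_lag=timedelta(0)):
    """True if the schedule can never lose more time than the agreed RPO."""
    return worst_case_data_loss(interval, replication_lag) <= rpo


rpo = timedelta(hours=4)             # what we and the stakeholders agreed on

nightly_backup = timedelta(hours=24)
hourly_snapshot = timedelta(hours=1)

nightly_ok = meets_rpo(nightly_backup, rpo)
hourly_ok = meets_rpo(hourly_snapshot, rpo, replication_lag=timedelta(minutes=10))
```

Under these assumed numbers, a nightly backup cannot satisfy a four-hour RPO, while an hourly snapshot that takes ten minutes to replicate can.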

Speaker:

Yeah, and I would say that it's also useful beyond backup design.

Speaker:

I think anytime you're talking about.

Speaker:

Data protection, disaster recovery, backup, all of these things

Speaker:

always take into consideration.

Speaker:

RTO and RPO.

Speaker:

Yeah.

Speaker:

Uh, no one cares about backup window anymore.

Speaker:

It used to be that that drove a lot of backup, uh, design.

Speaker:

But, uh, you know, luckily, I think we've tackled the backup

Speaker:

window problem, so you would probably call what we're about to talk about

Speaker:

snapshots and replication instead

Speaker:

Snap and replicate.

Speaker:

And actually when we went back to the replication issue or replication

Speaker:

episode, I would actually call async replication snap and replicate.

Speaker:

But that's because of how I entered the storage space and

Speaker:

the technology with NetApp.

Speaker:

So when I think of async replication, I

Speaker:

think of snap and replicate.

Speaker:

Interesting.

Speaker:

Um, obviously it doesn't have to be snap and replicate.

Speaker:

Async replication could just have a buffer.

Speaker:

Right, right.

Speaker:

And a lag is just a snapshot.

Speaker:

Right.

Speaker:

So every six hours I do that.

Speaker:

That's my lag.

Speaker:

Gotcha, gotcha.

Speaker:

Yeah, I, I would say that.

Speaker:

Snap and replicate would be a way to do async replication for sure.

Speaker:

And I think, more commonly, people would probably

Speaker:

just use this term, uh, snap and replicate, and I'm fine with that.

Speaker:

Uh, this is one where I'm not going to battle for the term.

Speaker:

I do like the term near CDP because I think it's a lot closer to continuous.

Speaker:

What's that?

Speaker:

Is it

Speaker:

not, it's not trademarked.

Speaker:

Feel free to use it.

Speaker:

There are some systems where you don't make the snapshot on the primary system.

Speaker:

You replicate the data, and then you make the snapshot over there.

Speaker:

My problem with that is that when you're making the snapshot, you often have to

Speaker:

interface with the thing that's writing the data to the snapshot, right?

Speaker:

So you want to put Oracle in backup mode.

Speaker:

Take a VSS snapshot, take a VMware snapshot, whatever it is, do the,

Speaker:

do the thing that you need to do to get the data to be consistent.

Speaker:

Then we take a snapshot, then we replicate that snapshot.
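
The ordering described here (make the application consistent, snapshot on the primary, then replicate) can be sketched in Python as follows. Everything in it is a stand-in: `quiesced`, `take_snapshot`, and `replicate` are hypothetical placeholders for whatever your database and storage array actually provide, and they only record the order of operations.

```python
from contextlib import contextmanager

log = []  # records the order of operations for illustration


@contextmanager
def quiesced(app):
    """Hypothetical stand-in for putting an application into a consistent
    state (for example, a database's backup mode) and releasing it after."""
    log.append(f"quiesce {app}")
    try:
        yield
    finally:
        # Always resume the application, even if the snapshot fails.
        log.append(f"resume {app}")


def take_snapshot(volume):
    """Hypothetical stand-in for the array's snapshot command."""
    log.append(f"snapshot {volume}")
    return f"{volume}@snap1"


def replicate(snapshot, target):
    """Hypothetical stand-in for sending the snapshot to another array."""
    log.append(f"replicate {snapshot} -> {target}")


def snap_and_replicate(app, volume, target):
    # The application is only held for the instant the snapshot is taken;
    # replication happens afterward, from the already-consistent snapshot.
    with quiesced(app):
        snap = take_snapshot(volume)
    replicate(snap, target)
    return snap


result = snap_and_replicate("oracle", "vol_db", "dr_array")
```

Keeping the replication step outside the quiesced section is a design choice in this sketch: the application is released as soon as the snapshot exists, so the copy to the other array does not extend the time the application spends in backup mode.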

Speaker:

I don't like replicating

Speaker:

You know how people deal with that?

Speaker:

I'm laughing at it because I've actually worked with groups and

Speaker:

products that actually do that.

Speaker:

Right,

Speaker:

so, uh, one way you can solve what you're asking Curtis, is when you

Speaker:

take your snapshot, you are quiescing the application and you issue the

Speaker:

snapshot command to the target.

Speaker:

But, but the problem with that is that we need to make sure that the bits are

Speaker:

By quiescing.

Speaker:

Yep.

Speaker:

I find, I find that very.

Speaker:

I find that messy.

Speaker:

I don't like it.

Speaker:

I'm

Speaker:

It's not clean.

Speaker:

Yeah, it's not

Speaker:

Yeah, it's not as clean.

Speaker:

I, I like clean.

Speaker:

So when we're talking about, you know, near CDP or snapshots and replication,

Speaker:

the really nice thing about it is that you can take essentially as many

Speaker:

snapshots as you'd like to take within the limits of your storage system.

Speaker:

I don't know, do you know what ONTAP is up to these days?

Speaker:

I am guessing probably a thousand.

Speaker:

Yeah, that's a lot of snapshots,

Speaker:

Yeah.

Speaker:

You could take a snapshot every minute for the first hour.

Speaker:

You could take a snapshot every hour after that, et cetera, et cetera. You can

Speaker:

take the snapshots as much as you want.
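
A tiered schedule like the one just described (every minute for the first hour, every hour after that) can be expressed as a small retention rule. The Python sketch below is only an assumption-laden illustration: the `keep` function, the one-day horizon, and ages counted in whole minutes are invented for the example and are not any array's actual policy syntax.

```python
def keep(age_minutes):
    """Retention rule: keep every per-minute snapshot from the last hour,
    and only the on-the-hour snapshots beyond that."""
    if age_minutes <= 60:
        return True
    return age_minutes % 60 == 0


# Pretend one snapshot was taken every minute over the last 24 hours.
ages = range(0, 24 * 60 + 1)
kept = [age for age in ages if keep(age)]
kept_count = len(kept)
```

Counting the result is a quick way to check that a proposed schedule stays within whatever snapshot limit your storage system imposes.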

Speaker:

And then basically what you're doing is you're replicating the changes that are

Speaker:

contained within that snapshot, right?

Speaker:

Um, and

Speaker:

it's much more efficient because the storage array itself is

Speaker:

keeping track, computing those differences, and sending 'em out.

Speaker:

So it's much, much faster at doing that than reading the data out, figuring

Speaker:

out the differences and sending it.
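
Here is a small Python sketch of why array-side change tracking is cheaper than reading everything back out and comparing. It assumes a toy `Volume` class that remembers which blocks were written since the last snapshot; real arrays track changes in their own metadata, so treat the names and structure here as hypothetical.

```python
class Volume:
    """Toy volume that tracks which blocks changed since the last snapshot."""

    def __init__(self):
        self.blocks = {}
        self.dirty = set()  # block IDs written since the last snapshot

    def write(self, block, data):
        self.blocks[block] = data
        self.dirty.add(block)

    def snapshot_delta(self):
        """Return only the blocks changed since the last snapshot,
        then reset the tracking for the next interval."""
        delta = {b: self.blocks[b] for b in self.dirty}
        self.dirty.clear()
        return delta


source = Volume()
target = {}

# Initial state: 1,000 blocks, all sent once as the baseline transfer.
for b in range(1000):
    source.write(b, "v1")
target.update(source.snapshot_delta())

# Between snapshots only two blocks change, so only two blocks are sent.
source.write(7, "v2")
source.write(42, "v2")
delta = source.snapshot_delta()
target.update(delta)
```

The second transfer moves two blocks instead of a thousand, and nothing had to be read and compared to find them.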

Speaker:

Yeah.

Speaker:

The challenge, I think, is that it is a storage level solution,

Speaker:

which means that you need to do the interfacing up to the application.

Speaker:

Sometimes the storage vendor can help you with that.

Speaker:

Sometimes you're on your own.

Speaker:

I've been in both scenarios.

Speaker:

But today though, Curtis, I wanna say most backup vendors

Speaker:

integrate with most storage vendors.

Speaker:

Or, and it may not be a hundred percent, but if you're picking like the major

Speaker:

common ones, I'm guessing that most backup vendors have API integration

Speaker:

with the storage vendors' APIs in order to be able to trigger that snapshot.

Speaker:

Yes, you can.

Speaker:

So the question is, do they both interface with the application and with

Speaker:

the storage snapshot at the same time?

Speaker:

All I'm saying is you need to look into that, right?

Speaker:

If you're taking a snapshot, you need to do your best to make sure

Speaker:

that that snapshot is application consistent, um, versus the

Speaker:

alternative, which is crash consistent.

Speaker:

Right.

Speaker:

And by the way, lemme just

Speaker:

talk about that term for a minute.

Speaker:

So if you are not

Speaker:

making a snapshot in partnership with an application,

Speaker:

you're creating what's called a crash consistent snapshot.

Speaker:

It's called that because it is as consistent as a crash.

Speaker:

Essentially, it's like you flip the power switch off

Speaker:

on an operational storage array and you get what you get.

Speaker:

Yes.

Speaker:

Nothing is moving, but.

Speaker:

Stuff was moving.

Speaker:

So your mileage will vary.

Speaker:

Well, nothing that was committed to disk or committed by the storage array.

Speaker:

Any writes that were committed by the storage array have been preserved.

Speaker:

Anything that was in flight may not have been committed.

Speaker:

And as an application, you might have to do some recovery steps once

Speaker:

a storage array comes back because you don't know what the state is

Speaker:

Yeah,

Speaker:

because some of those in-flight writes might have been

Speaker:

committed, some may not have.

Speaker:

And there are those who say, look, you know it works 99% of the time.

Speaker:

You just take more snapshots.

Speaker:

And if this snapshot doesn't work, maybe the previous snapshot will.

Speaker:

I just try to avoid crash-consistent snapshots whenever I can.

Speaker:

Right.

Speaker:

I would.

Speaker:

I, I agree.

Speaker:

For the most part, but there are cases where you could use a crash

Speaker:

consistent snapshot on a more frequent basis and do like an application

Speaker:

consistent snapshot, say once a day.

Speaker:

So even though, so you can potentially

Speaker:

have that as your backup of your

Speaker:

Yeah.

Speaker:

Yes.

Speaker:

As a backup of your backup.

Speaker:

Right.

Speaker:

And that

Speaker:

why?

Speaker:

Why would you do that?

Speaker:

I'm guessing the answer to that question would be,

Speaker:

perhaps if doing an application consistent snapshot has an impact on

Speaker:

the performance of the application.

Speaker:

Uh, I know in the case of Oracle, for example, when you put it in backup

Speaker:

mode, it changes how it stores the redo logs, which can

Speaker:

have a minor impact on performance.

Speaker:

And so perhaps you only do that once a day when nobody's using the

Speaker:

database and then you do the crash

Speaker:

consistent snapshots more often than that. I don't have a problem with that.

Speaker:

Yeah.

Speaker:

Yeah.

Speaker:

But just relying on, yeah, and then you just take that snapshot

Speaker:

and then replicate it off.

Speaker:

Right, and the, the beautiful thing I think of snapshots and replication

Speaker:

or near CDP, is that what you have?

Speaker:

I'm glad that you find that term so amusing, what you have at the um,

Speaker:

I think that's why I can say that I coined this term 'cause nobody

Speaker:

else seems to want to use it, so I must have coined it and I love it.

Speaker:

Um, so the, um, and it's in at least two books, two that I wrote.

Speaker:

I don't know if it's in anywhere else, but,

Speaker:

uh, I don't care what you'd call it.

Speaker:

We're just talking about snapshots and replication.

Speaker:

Just don't,

Speaker:

the 15 years that I worked,

Speaker:

don't call near CDP "CDP," because NetApp definitely tried that one.

Speaker:

Right?

Speaker:

It is not continuous.

Speaker:

The, the reason I was laughing is yeah, the 15 years that I worked

Speaker:

in the storage industry, I'd never come across near CDP ever in the

Speaker:

way that you're talking about it.

Speaker:

Yeah.

Speaker:

Yeah, yeah.

Speaker:

And I'm fine.

Speaker:

I'm fine with that.

Speaker:

So again, I'm still taking credit for coining it, even if nobody uses it but me.

Speaker:

It's not like the 3-2-1 rule or anything like that.

Speaker:

Um,

Speaker:

One other thing I wanted to mention about snap and replicate that I don't

Speaker:

think you covered yet is there are some vendors where, when you're doing snap

Speaker:

and replicate, you may not always have to have the same snapshot retention on

Speaker:

your source array and your target array.

Speaker:

You might, for instance, decide I'm only gonna keep 30 days worth of

Speaker:

snapshots on my production system.

Speaker:

And on my secondary system, I'm gonna keep 90 days worth of backup, uh,

Speaker:

worth of snapshots so I can go back.

Speaker:

Some systems allow you to set different retentions for snapshots on both sides.

Speaker:

Some may not.

Speaker:

So you should also, once again, look at your vendor, see what's possible.

Speaker:

But I know for some folks, instead of having to go beyond that 30 days and

Speaker:

say, okay, now I have to go to my backup infrastructure and pull data off of

Speaker:

it, they might be able to say, okay.

Speaker:

If it's not in production because it's beyond the 30 days, let me go

Speaker:

check my secondary storage system.

Speaker:

Okay?

Speaker:

I have 90 days worth of snapshots.

Speaker:

Can I restore the data from there?

Speaker:

Right.

Speaker:

Yeah.

Speaker:

I, I, I love that idea, right?

Speaker:

'cause it, one of the, one of the nice things about this idea is that you

Speaker:

could have, maybe have a more expensive primary storage array and you can have

Speaker:

a less expensive storage array that's based on SATA, for example, as your

Speaker:

backup system.

Speaker:

And another thing, by the way, that you can do with a near CDP setup is that

Speaker:

you can use that secondary site to give... I'll coin a new term: near CDP plus.

Speaker:

So, so near CDP plus is snap, replicate, then back up, right?

Speaker:

Use that snapshot that's on that

Speaker:

target and then back that up with some other method that isn't... 'cause one

Speaker:

of the downsides that some people pick on, uh, snapshot and replication is that

Speaker:

your entire, basically storage and backup infrastructure are all within one vendor.

Speaker:

And the, the worry is about this idea of a rolling bug that somehow

Speaker:

takes out all of ONTAP one day.

Speaker:

And it takes some, it takes everybody's primary, uh, and their

Speaker:

secondaries along with it, so.

Speaker:

The other issue also with that just snap and replicate is if you, say, have a

Speaker:

backup proxy, so you're backing up your NAS system, you're using a proxy, which

Speaker:

is basically a backup client to mount that snapshot and copy the data off.

Speaker:

One of the challenges you have is when you mount it

Speaker:

to the storage array,

Speaker:

that backup client looks no different than any other production client,

Speaker:

and so when it ends up reading the data, it could cause performance impact

Speaker:

because it has to read the entire file system on the source to figure out

Speaker:

what's different and move the data off.

Speaker:

This, of course, isn't integrating with the native snapshot storage APIs that the

Speaker:

storage vendor provides, but is actually just reading it like a normal file system.

Speaker:

When you do snap and replicate, you can actually mount the

Speaker:

snapshot on the target system

Speaker:

and do your backup off of that, and therefore you're not affecting your

Speaker:

production application because you're not impacting the IO on that system.

Speaker:

Or you could use our friend Steven's favorite thing, the NDMP,

Speaker:

Yep.

Speaker:

You could use NDMP

Speaker:

the Network Data Management Protocol.

Speaker:

Which was, which was another solution.

Speaker:

This is, like, this is technically off topic at this point, but there was this

Speaker:

other way to back up, uh, NAS systems.

Speaker:

Well, it's still around.

Speaker:

Is that you can back up essentially to tape.

Speaker:

NDMP is generally meant to go to tape, uh, or to virtual tape.

Speaker:

And, uh, it was meant to solve the issue that you mentioned because

Speaker:

it would recognize it as a backup process and then deprioritize it.

Speaker:

Uh, nice.

Speaker:

It,

Speaker:

I.

Speaker:

Yep.

Speaker:

There's another use case I wanna talk about with snap and replicate, and it's

Speaker:

not necessarily backup related, but there are many companies who have a distributed

Speaker:

environment and they need performance.

Speaker:

And so what they sometimes do is they will snap and replicate to multiple

Speaker:

systems as kind of a fan-out, and then they would have

Speaker:

clients read from those target systems because they're consistent at some

Speaker:

point, and use that as, uh, read

Speaker:

optimization rather than all these systems trying to hit a single production system.

Speaker:

And these secondary systems could be in the same building.

Speaker:

They could be spread across the world.

Speaker:

So you're now sort of doing read load balancing and you're leveraging

Speaker:

the snap and replicate technology in order to move a copy of the

Speaker:

data close to the clients.

Speaker:

Yeah, that, uh, by the way, that's, we, we, I don't think we really mentioned

Speaker:

this before, but that's one of the best things here, is that that secondary

Speaker:

target, and maybe even a tertiary target could be very far away because

Speaker:

you're doing asynchronous replication, so you shouldn't be impacting the

Speaker:

performance of the, of the primary array.

Speaker:

Uh, at least not much anyway.

Speaker:

Um, but that, that's, we can put that generally speaking as far

Speaker:

as we want to from the primary.

Speaker:

Yep.

Speaker:

So I'd say the final thing that we would say about snapshots and

Speaker:

replication is that which we've already sort of alluded to, and that

Speaker:

is that your backup vendor may support this as just another way to back up

Speaker:

production data.

Speaker:

Right.

Speaker:

Most of the popular NAS vendors, especially NAS, uh, are gonna

Speaker:

have something like this.

Speaker:

And then, uh, the more popular they are as a NAS product, the greater

Speaker:

the possibility that they will integrate with a, a backup app.

Speaker:

Right.

Speaker:

So, um, this is just another way to back up, especially, your on-prem storage,

Speaker:

although some of these vendors are now starting to offer actually for quite some

Speaker:

time now, are offering cloud versions of these typically on-prem products.

Speaker:

Um, so anything, can you think of anything else that we should talk about?

Speaker:

Prasanna?

Speaker:

I think that covers it all quite a

Speaker:

it's just a, yeah, it's, it's, it's a great way, I think to have a very tight

Speaker:

RPO, a very, a really tight RTO, right?

Speaker:

The RTO is really small.

Speaker:

'cause basically you just start using the snapshot that, that you,

Speaker:

that, that there's no restore.

Speaker:

You can start using like the replicated snapshot immediately while you're

Speaker:

restoring the primary snapshot.

Speaker:

Right?

Speaker:

That, you know, that's sort of the beautiful thing of, that.

Speaker:

You might get

Speaker:

reduced performance.

Speaker:

Um, but so, so the RTO, it can, you can meet a really tight RTO, you

Speaker:

could do snapshots very frequently.

Speaker:

So you can also meet a, uh, a really tight RPO, um,

Speaker:

I did have one thing to add since you were just talking about it.

Speaker:

So one thing we didn't talk about, which I think is

Speaker:

super awesome about snapshots, is we mentioned previously that snapshots

Speaker:

are read-only, which is great if you wanna pull some piece of data out

Speaker:

of it or something else like that.

Speaker:

But if you have applications where you need to actually do some recovery process,

Speaker:

you can actually take a snapshot, which is read-only, and most storage vendors allow

Speaker:

you to clone it into a read-write volume that you can then mount and connect to

Speaker:

your... and do your recovery process.

Speaker:

Again, without occupying the full amount of space, because it's all

Speaker:

based on the snapshot, spins up a copy, allows you to do the recovery process.

Speaker:

It's read-write.

Speaker:

You could do all your testing, your restore verification, which

Speaker:

we always talk about on the podcast is go restore your backups.

Speaker:

And once you're done with that and you validate, you can quickly toss

Speaker:

it away, and then you're good to go.

Speaker:

So that's another benefit of sort of snap and replicate, is you can

Speaker:

do all this verification on your secondary system without once

Speaker:

again impacting your production.

Speaker:

Right.

Speaker:

There are a lot of advantages to the snap and replicate style of,

Speaker:

you know, I'm calling it backup.

Speaker:

Right.

Speaker:

And, uh, this is one of them, is that, you know, basically that the

Speaker:

replicated copy stays in native format, and that leads, that

Speaker:

leads to all sorts of possibilities.

Speaker:

One of which I think probably the best of which is, is all of this you, you

Speaker:

can do automated recovery testing.

Speaker:

Right.

Speaker:

Automated cloning and then, uh, tested recovery.

Speaker:

And that way you're, you're validating the actual snapshot that

Speaker:

you would like to use for recovery.

Speaker:

So yeah, I, it's, it's a really great way, it's a really great way that I think

Speaker:

maybe not enough people take advantage of.

Speaker:

So hopefully, um.

Speaker:

You know, you've learned a thing or two.

Speaker:

And, uh, with that, I wanna say thank you for, uh, joining

Speaker:

us and of course, Prasanna.

Speaker:

This was one where you really shined, I think.

Speaker:

'cause you, you know, your,

Speaker:

This is, this is what I lived and breathed.

Speaker:

Yeah.

Speaker:

Yeah, yeah, exactly.

Speaker:

Exactly.

Speaker:

So, uh, uh, great, great having you on again today.

Speaker:

And I promise I won't harp on the near CDP term as much.

Speaker:

It's gonna take off.

Speaker:

Uh, we'll see.

Speaker:

We'll see.

Speaker:

Maybe I'll, maybe I'll do it in Spanish and then, uh, it'll be, it'll be better.

Speaker:

Uh, and, uh, so thanks to the listeners.

Speaker:

Thanks for, thanks for listening because, uh, that's really the

Speaker:

only reason that we do this.

Speaker:

That's a wrap.