Speaker:

You found The Backup Wrap-Up, your go-to podcast for all things

Speaker:

backup recovery and cyber recovery.

Speaker:

In this episode, we talk about some hard earned disaster recovery lessons

Speaker:

from major events like 9/11.

Speaker:

We talk about what we learned about DR that day, and we talk about

Speaker:

those critical human elements of DR.

Speaker:

That people often forget, like where your recovery team is going to

Speaker:

sleep when the hotels are all gone.

Speaker:

Disasters happen and they're never convenient.

Speaker:

Whether it's terrorists, hurricanes, or ransomware, you

Speaker:

need to think through what you do.

Speaker:

If you're completely isolated from the world, the time to

Speaker:

learn these lessons is now.

Speaker:

So I hope you enjoy this episode.

Speaker:

By the way, if you don't know who I am, I'm W. Curtis Preston, AKA Mr. Backup,

Speaker:

and I've been passionate about backup and disaster recovery for over 30 years.

Speaker:

Ever since.

Speaker:

I had to tell my boss that there were no backups of the

Speaker:

big database that we just lost.

Speaker:

I don't want that to happen to you, and that's why I do this podcast.

Speaker:

On this podcast, we turn unappreciated backup admins into Cyber Recovery Heroes.

Speaker:

This is The Backup Wrap-Up.

Speaker:

Welcome to the show.

Speaker:

Hi, I am W. Curtis Preston, AKA Mr. Backup, and I have with me a guy who

Speaker:

is as lazy as I am when it comes to how to move rugs around the house.

Speaker:

Prasanna Malaiyandi, how's it going?

Speaker:

Prasanna.

Speaker:

I am doing well, Curtis, by the way, isn't it amazing how quiet a room gets

Speaker:

when you put something on the floor?

Speaker:

Yeah, I wonder if anybody, uh, notices the difference

Speaker:

in sound because it, you know, and it's been, they probably just got

Speaker:

so used to the other sound, right?

Speaker:

Because I went with LVP, you know, upstairs and, um, and then in

Speaker:

my office and it just got so echoey and I put the stuff on this wall.

Speaker:

And these over here are acoustic panels.

Speaker:

Um, you know, behind me there are acoustic panels up on the ceiling

Speaker:

and it still wasn't the same.

Speaker:

And then I found, you know, this rug is actually from my

Speaker:

living room because we bought a nicer rug and, uh, a nicer, bigger rug.

Speaker:

And so we put that one down there and then I moved here.

Speaker:

But when I moved it in, I did what I thought was like

Speaker:

a really weird way to do it.

Speaker:

I didn't actually move the furniture.

Speaker:

I was like lifting it up and trying to do it without having to move everything out.

Speaker:

And you said, well, that's exactly how I did it.

Speaker:

exactly.

Speaker:

Yeah.

Speaker:

And I thought it was just me, but apparently

Speaker:

it wasn't.

Speaker:

because at least with the rug, well, it depends on how your desk is too.

Speaker:

But like you can at least unfurl part of it and just kind of like shove it down.

Speaker:

Like you just lift

Speaker:

up the front part and

Speaker:

then you lift up the back part.

Speaker:

So,

Speaker:

yeah.

Speaker:

I, I think different people might go, no, no, we're gonna,

Speaker:

we're gonna move everything out.

Speaker:

We're gonna put the rug down and we'll move everything back in.

Speaker:

And they might think of that as easier, but I was like, I

Speaker:

don't wanna move everything.

Speaker:

well, it's more than moving.

Speaker:

It's setting everything back up again.

Speaker:

Yeah, I, I looked into it.

Speaker:

I could have potentially moved the desk without disassembling everything.

Speaker:

It was just, you know, but, um, yeah, so I, it was just nice to hear

Speaker:

when I was talking to you for once.

Speaker:

You weren't like, 'cause so many times I tell you about something that I'm

Speaker:

doing and then you're like, that's the dumbest thing I've ever heard.

Speaker:

Why would you do it that way?

Speaker:

And you're like, oh no, that's, uh, that's how I do it.

Speaker:

Yeah.

Speaker:

Especially if you're space limited.

Speaker:

Like for

Speaker:

What's that?

Speaker:

especially if you are space limited

Speaker:

Yes, space limited is true.

Speaker:

I was just looking, so I have my desk right here and then

Speaker:

like that way, that way I

Speaker:

have the door, but there's no way I could fit my desk outside of

Speaker:

the door with it fully assembled.

Speaker:

Oh yeah.

Speaker:

I'd have to at least take the top off, which is just a giant chore.

Speaker:

So.

Speaker:

Oh yeah.

Speaker:

Well, in my case, I don't have that excuse 'cause I have two French doors, so I

Speaker:

wouldn't have had that excuse, but yeah.

Speaker:

But it still would've been annoying.

Speaker:

And so we figured it out, and now I have better sound.

Speaker:

It's a beautiful thing.

Speaker:

And, uh, so today we're gonna talk about DR, disaster recovery, and

Speaker:

lessons learned, and especially, um, some lessons learned from at least one

Speaker:

major event, uh, one major disaster.

Speaker:

And, you know, I, I don't want to, um, you know, the, these, these events,

Speaker:

especially this event, disasters are, are always difficult on, on the people.

Speaker:

Uh, and I don't want to in any way make light of those disasters

Speaker:

or I, I don't know, whatever, whatever I'm trying to say.

Speaker:

Right.

Speaker:

So proper respect to the people.

Speaker:

'cause some of the, well, especially the one, the main one that we're

Speaker:

gonna talk about, people died in these disasters and, um, you know,

Speaker:

with respect to those people.

Speaker:

Having said that.

Speaker:

We can learn from the things that happened, uh, at that time.

Speaker:

Right?

Speaker:

And so let's talk about, and and the disaster.

Speaker:

The, the main disaster I'm talking about at this point would be what

Speaker:

we all call 9/11, right?

Speaker:

So September 11th, 2001.

Speaker:

Uh, I lived through it.

Speaker:

You lived through it.

Speaker:

Uh, so how old would you have been in 2001?

Speaker:

I would rather not say

Speaker:

I was very young.

Speaker:

You were very young.

Speaker:

I, I was actually in college.

Speaker:

okay.

Speaker:

I was here, um, and um, I was in this house and my kids were young.

Speaker:

My oldest would've been, uh, seven, and my younger one would've been four.

Speaker:

And what I remember was she might've been in, she might've been five,

Speaker:

she might've been in preschool or might've been in kindergarten.

Speaker:

What I remember, and for those that don't know, I live in North County San Diego,

Speaker:

which means I live just south of Camp Pendleton, which is, uh, you know, one

Speaker:

of the biggest, uh, Marine Corps bases.

Speaker:

The, the idea was that we were under attack because there were multiple events

Speaker:

that were happening simultaneously.

Speaker:

Right.

Speaker:

Multiple planes hit the, the trade center, there was the plane that hit the Pentagon.

Speaker:

There was the plane that went down in Pennsylvania.

Speaker:

There was this feeling that, like we were being attacked

Speaker:

as a country, which we were, and living near a military base.

Speaker:

I don't know what I thought I was going to accomplish by keeping my kids home

Speaker:

from school, but that's what I did.

Speaker:

Yeah.

Speaker:

I was just like, I'm

Speaker:

gonna keep my kids, like, I don't even know what

Speaker:

I was thinking, but like, you know, thinking back on it now I'm thinking

Speaker:

that probably I was thinking that, um.

Speaker:

If the world's gonna end, I want my kids near me.

Speaker:

I

Speaker:

mean, it's kind of morbid.

Speaker:

But that's,

Speaker:

I, I just remember that I didn't, you know, that, that I kept my kids near me.

Speaker:

Um, and I remember that I knew multiple people that worked in

Speaker:

the World Trade Center.

Speaker:

None of them were hurt that day

Speaker:

Mm.

Speaker:

and for various reasons.

Speaker:

One, uh, and I know one person that was in the World Trade Center that day.

Speaker:

I knew other people that worked in the World Trade Center, but

Speaker:

for one reason or another, chose not to go into work that day.

Speaker:

Hmm.

Speaker:

I know another person that was supposed to be on Flight 11, which

Speaker:

was from Boston to New York, which was the flight that ended up going.

Speaker:

I don't know if that's the one that went into the Pentagon or

Speaker:

if that's the one that crashed in Pennsylvania, but I knew a person that

Speaker:

was supposed to be on that flight.

Speaker:

That person did not get on that flight.

Speaker:

And, uh, and then the other person that, that I knew was the person in

Speaker:

the, in the trade center, it was, uh, Michael Hingson that was, uh, the, the

Speaker:

blind person that made it, that made it down thanks to his seeing eye dog.

Speaker:

Um, and he, you know, he became

Speaker:

somewhat famous

Speaker:

as a result.

Speaker:

He actually went on to become a motivational speaker.

Speaker:

And, um, and, uh, just to, just to lighten things up, I, I'll talk about,

Speaker:

uh, how I first met Michael, and that was, and I, it seems like I've told

Speaker:

this story relatively recently, but I was at my very first trade show.

Speaker:

This would've been in the early nineties, and it was at, it was in

Speaker:

New York, um, at the Javits Center

Speaker:

in Manhattan, and it was Unix Expo, and I saw, uh, these hot-swappable

Speaker:

disk drives, you know, where you

Speaker:

could push a button and pull the drives out.

Speaker:

And I just thought that was the coolest thing I'd ever seen.

Speaker:

'cause at the, at the time that, that was unlike anything I'd ever

Speaker:

seen. That was really, really new and really, really cool.

Speaker:

Now we just take it for granted.

Speaker:

But back then it was really cool, and he was the one that was

Speaker:

standing in the booth demonstrating these, uh, disk drives, and

Speaker:

literally his shtick was,

Speaker:

It's so easy a blind man can do it.

Speaker:

Right?

Speaker:

And um, and we were like, this is amazing.

Speaker:

We're like, you guys are the only ones with it.

Speaker:

And he goes, yes, we are the only ones with this product.

Speaker:

And then we walked through the trade center or the trade show and we saw

Speaker:

several other vendors with the product.

Speaker:

And one of us said to the other.

Speaker:

Well, in his defense, he can't see the other vendors.

Speaker:

So true.

Speaker:

Um, do you, do you remember, do you have memories of that day?

Speaker:

Of course you do.

Speaker:

Oh yeah, so I remember, so I went to school, I was in Pittsburgh actually.

Speaker:

Mm-hmm.

Speaker:

And so we had a lot of friends, or I had a lot of friends whose families were

Speaker:

living in the city in New York City.

Speaker:

Right.

Speaker:

And so I remember everyone just was very, very concerned.

Speaker:

Um, and like you had mentioned, there was sort of the plane flying towards,

Speaker:

uh, over Pennsylvania.

Speaker:

And so everyone in Pittsburgh was keeping a close eye being like, Hey,

Speaker:

where is that plane actually going?

Speaker:

Because

Speaker:

it was supposed to fly over Pittsburgh,

Speaker:

right.

Speaker:

right?

Speaker:

So I remember everyone sort of being worried, concerned because they had family

Speaker:

and relatives and they weren't able, like all the phone lines were shut down, right?

Speaker:

So.

Speaker:

No one was able to figure out like what was going on.

Speaker:

I remember going with a bunch of other folks down to the common area

Speaker:

and just kind of like just being like shell-shocked as we're watching the news.

Speaker:

Yeah, I, yeah, absolutely.

Speaker:

Um, there's a, there's a great line in the beginning of a movie that I like.

Speaker:

It's become problematic now.

Speaker:

Um, maybe it was always problematic, but there's a movie called Love Actually,

Speaker:

and in the beginning, which is, uh, voiceover from, um, Hugh, Hugh Grant.

Speaker:

And he was saying, on 9/11,

Speaker:

there were a lot of phone calls that were made from the planes,

Speaker:

and he said, to my knowledge, none of them were messages of hate.

Speaker:

They were

Speaker:

messages of love.

Speaker:

Hmm.

Speaker:

I just got a little bit of a lump there.

Speaker:

Anyway, so, okay.

Speaker:

So, um, when we think about that event, there is, in my world, we

Speaker:

immediately started talking about it.

Speaker:

We saw the things that happened to companies, and we're gonna talk

Speaker:

about, I'm gonna talk about sort of two, two different kinds of companies.

Speaker:

One of them.

Speaker:

So first off, let's talk about, uh, the, the difference between

Speaker:

a cold site and a hot site.

Speaker:

Do you, so when we talk about disaster recovery, there was this

Speaker:

idea that we, that we're gonna have another site, like ready to go

Speaker:

Yep.

Speaker:

and we talk about the cold site and the hot site.

Speaker:

Do you want to talk about what that means?

Speaker:

Yeah, so a hot site basically is you have a site, right?

Speaker:

A disaster recovery site that is fully operational, has all the

Speaker:

equipment, has everything replicated to it, and basically once you push

Speaker:

the big red button, right?

Speaker:

Everything sort of fails over.

Speaker:

It's available, operational, all ready to go, sort of ready

Speaker:

to serve traffic and take over, usually within minutes to hours of a

Speaker:

failure.

Speaker:

right.

Speaker:

And then a cold site is very much the opposite of that, right?

Speaker:

It's a site that's sort of ready to start a restore.

Speaker:

Uh, like I suppose there'd be a, there'd be a no site,

Speaker:

Yeah.

Speaker:

right?

Speaker:

That's where, uh, a bad thing happened.

Speaker:

And we're gonna go find some hardware to restore

Speaker:

to. Uh, with a cold site, the

Speaker:

implication there is that you have some hardware ready to go, but

Speaker:

you haven't restored anything.

Speaker:

A warm site is somewhere in between those two things,

Speaker:

Have you heard of this term that some people call pilot,

Speaker:

a pilot light?

Speaker:

Yeah.

Speaker:

talk to.

Speaker:

Talk to me

Speaker:

So it's basically not quite cold, but not quite warm or hot, right?

Speaker:

So

Speaker:

it's a little bit better than cold, but not quite to the extent.

Speaker:

And a lot of the trade off comes from not necessarily having to

Speaker:

eat all of the costs upfront.

Speaker:

So

Speaker:

as an example, if your pilot site is, say, in the cloud, you might have your

Speaker:

data available, but not necessarily your compute and everything else ready to go.

Speaker:

Yeah, I think I would still call that a warm site, but

Speaker:

I mean, there is this concept, it comes from the concept of a pilot

Speaker:

light.

Speaker:

Right.

Speaker:

Which, for those of you that don't know, when you have an

Speaker:

old-school gas appliance,

Speaker:

nowadays we have electronic ignition, but there used to be this, inside,

Speaker:

if you had a gas water heater or a gas furnace, there would be this

Speaker:

little flame that would burn all the

Speaker:

time.

Speaker:

And that would, that's called your pilot light.

Speaker:

And if the pilot light goes out, then you're gonna have to relight it

Speaker:

because otherwise your, your heat won't work.

Speaker:

Um, these days we use

Speaker:

electronic ignition,

Speaker:

um, because a pilot light just wastes a lot of,

Speaker:

yep.

Speaker:

It's always running and you're always consuming gas.

Speaker:

Yeah.

Speaker:

Um, so there, so the, the best from a DR perspective, right?

Speaker:

The, the, the, the Cadillac, if you will, is, um.

Speaker:

I don't know if that's the right term anymore because nobody buys Cadillacs, the

Speaker:

Ferrari, the Rolls Royce.

Speaker:

Does anybody buy, do they still make Rolls

Speaker:

Oh yeah.

Speaker:

Okay.

Speaker:

Um, is the hot site

Speaker:

that it's, it's, it's ready to go.

Speaker:

Number one.

Speaker:

It's ready to go when you need it, and, and two, it's kept and

Speaker:

it's ready to go within a few

Speaker:

minutes or seconds, right?

Speaker:

It's kept as up to date as possible, as much as technology

Speaker:

would allow you to do so.

Speaker:

And one of the challenges with a hot site is, is latency.

Speaker:

And so you might want to put the hot site

Speaker:

as close

Speaker:

as you can to, uh, the site that you're protecting.

Speaker:

And what happened on 9/11?

Speaker:

So there were many companies that, for their hot site, uh, had

Speaker:

their main data center in one tower

Speaker:

and had their hot site in the other tower,

Speaker:

which makes perfect sense.

Speaker:

As long as nothing would take out both towers

Speaker:

Yep.

Speaker:

and.

Speaker:

So unfortunately, as we know, you know, both towers, I mean, that,

Speaker:

that's what I woke up to by the way.

Speaker:

I, I

Speaker:

woke up to my wife saying both of the, 'cause I'm on the west coast, so both

Speaker:

of the towers had already collapsed

Speaker:

as you know, as I was waking up.

Speaker:

And so people lost, or companies lost, their

Speaker:

primary site and their hot site in the same moment.

Speaker:

And so one of the things we learned from 9/11 is to make sure that,

Speaker:

if you're doing some sort of hot site or warm site, you put that site, I'm

Speaker:

gonna say, nowhere near, um, the site that you're protecting.

Speaker:

Now let's talk about that.

Speaker:

Um, there were actually some attempts, um, I lived through

Speaker:

this because I was working

Speaker:

for companies at the time, and that is, there were attempts at regulation to say,

Speaker:

if you're a bank or whatever, if you're in financial trading,

Speaker:

you need to put a copy of your data,

Speaker:

uh, that's hot, over 200 miles away.

Speaker:

Yep.

Speaker:

That was an attempt at regulation.

Speaker:

What's the problem with that?

Speaker:

Um, 200 miles away, depending on the type of disaster, isn't far enough.

Speaker:

And so that's

Speaker:

not the problem.

Speaker:

But the second is latency.

Speaker:

yeah, that's the problem, right?

Speaker:

It's just simply not feasible because the round-trip time, the 200-mile

Speaker:

round-trip time, uh, was just far too long, and you couldn't keep the data up to date.

Speaker:

depending on the

Speaker:

Based on the technology at the time,

Speaker:

Yeah.

Speaker:

Well, and I think it depends.

Speaker:

So I do so.

Speaker:

Many years after 9/11, I was working at a storage company and one of the

Speaker:

things that they did was they also like talking to financial customers, right?

Speaker:

Is many of 'em had what they would call dark fiber,

Speaker:

Yep.

Speaker:

right?

Speaker:

Where they would basically run fiber optics, a fiber network, between

Speaker:

their two sites, and it would.

Speaker:

You're right.

Speaker:

It wouldn't completely eliminate the latency, but it would definitely

Speaker:

help versus say routing it over a public network of any type.

Speaker:

Yeah, we definitely, you would.

Speaker:

I, I think that there was an assumption of dark fiber, uh, at that point, but even

Speaker:

200 miles on a straight piece of glass, the speed of

Speaker:

light is not instantaneous.

Speaker:

It's whatever it is.

Speaker:

186,000 miles a second or whatever,

Speaker:

six hundred or something like

Speaker:

something like that.

Speaker:

Right.

Speaker:

Um, the, the round trip time is gonna be measured in milliseconds.

Speaker:

It's going to significantly increase

Speaker:

latency, especially if we start talking about synchronous transfer of data.
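
As a rough back-of-the-envelope check on that round trip, here is a minimal sketch. It is not from the episode: it assumes light in glass travels at about two-thirds of its vacuum speed (roughly 186,000 miles per second), it ignores switching, protocol, and storage commit time, and the function names are made up for illustration.

```python
# Rough propagation delay over fiber; all figures are approximations.
SPEED_OF_LIGHT_MILES_PER_S = 186_282   # speed of light in vacuum
FIBER_VELOCITY_FACTOR = 0.67           # assumption: light in glass moves at ~2/3 of that


def round_trip_ms(distance_miles: float) -> float:
    """Best-case round-trip propagation time in milliseconds."""
    one_way_s = distance_miles / (SPEED_OF_LIGHT_MILES_PER_S * FIBER_VELOCITY_FACTOR)
    return 2 * one_way_s * 1000


def max_serial_sync_writes_per_s(distance_miles: float) -> float:
    """Upper bound on writes per second if each write waits for the previous ack."""
    return 1000 / round_trip_ms(distance_miles)
```

Under these assumptions, 200 miles costs a little over 3 milliseconds per round trip, so a writer that waits for each synchronous acknowledgment before sending the next write tops out at a few hundred writes per second.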

Speaker:

Right.

Speaker:

Now let's talk about synchronous versus async.

Speaker:

Synchronous versus asynchronous transfer of data.

Speaker:

You want to give that a shot?

Speaker:

Yeah, so synchronous is,

Speaker:

basically, a write comes into your production site.

Speaker:

It gets forwarded over to your DR site.

Speaker:

The write gets, now there are different flavors, but typically the write gets

Speaker:

committed on the DR site, acknowledged back to the primary site, and then the

Speaker:

primary site acknowledges the client. So you have to add up

Speaker:

basically the latency of going over, the writes on both sides, and coming back before

Speaker:

the client is acknowledged, in which case you're guaranteed that that write has hit

Speaker:

both sites and so the client can move on.
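
The synchronous write path described here can be sketched roughly as follows. This is a toy model under stated assumptions, not any vendor's implementation: the Site class, the millisecond figures, and the function name are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Site:
    """A toy storage site that records writes and reports its commit time."""
    name: str
    commit_ms: float                      # assumed time to commit one write locally
    log: list = field(default_factory=list)

    def commit(self, data: bytes) -> float:
        self.log.append(data)
        return self.commit_ms


def synchronous_write(primary: Site, dr: Site, data: bytes, link_rtt_ms: float) -> float:
    """Return the latency the client sees before its write is acknowledged."""
    latency = primary.commit(data)        # the write lands on the production site
    latency += link_rtt_ms                # forward to DR, then wait for its ack to come back
    latency += dr.commit(data)            # DR commits before it acknowledges
    return latency                        # only now does the primary acknowledge the client
```

The point of the model is that the client's wait is the sum of both commits plus the link round trip, which is why distance matters so much for synchronous replication.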

Speaker:

Which is great as long as it doesn't take too long.

Speaker:

Yes.

Speaker:

So that's synchronous, and asynchronous is, uh, different than that.

Speaker:

And we, we've had some different, between you and I, we've had some

Speaker:

different understandings of different kinds of asynchronous, but go ahead.

Speaker:

Yeah, so for me, asynchronous, well, what I would call semi-synchronous, but

Speaker:

that's a different case, is where, uh, you accept the write on the production,

Speaker:

you forward it over to the secondary or DR site, but while it's in the

Speaker:

process of being committed on the other side, you can acknowledge the client.

Speaker:

So there is a lag.

Speaker:

Um, you could decide how long that lag is, depending on technology and the vendor.

Speaker:

Some allow you to specify at an IO level so you can say, I want

Speaker:

10 transactions outstanding.

Speaker:

Others allow you to do it in terms of seconds.

Speaker:

So I'm allowing up to 10 seconds or 30 seconds, um,

Speaker:

Before you start, before you start kicking back a performance issue to the client.

Speaker:

Before

Speaker:

you start Yeah.

Speaker:

Putting back pressure.
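
A minimal sketch of that bounded-lag idea, assuming the limit is expressed as a number of outstanding writes. The class, its method names, and the default of 10 are hypothetical, chosen only to mirror the "10 transactions outstanding" example.

```python
from collections import deque


class AsyncReplicator:
    """Acknowledge writes immediately, but only while DR is not too far behind."""

    def __init__(self, max_outstanding: int = 10):
        self.max_outstanding = max_outstanding   # hypothetical tunable
        self.pending = deque()                   # accepted locally, not yet committed at DR

    def write(self, data: bytes) -> bool:
        """Return True if the client is acknowledged right away, False if it must wait."""
        if len(self.pending) >= self.max_outstanding:
            return False                         # back pressure: hold the acknowledgment
        self.pending.append(data)
        return True

    def dr_committed(self) -> None:
        """Call when the DR site confirms the oldest outstanding write."""
        if self.pending:
            self.pending.popleft()
```

A time-based limit would work the same way, except the check would compare the age of the oldest pending write against a number of seconds.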

Speaker:

Now

Speaker:

interestingly, there is a mode, which I'm not sure if you're aware of,

Speaker:

uh, that some financial companies requested, which is called domino mode.

Speaker:

Talk to me.

Speaker:

So it's a form of synchronous replication. Because in

Speaker:

synchronous replication, typically the client writes to the primary, and it sends it over.

Speaker:

If the write fails on the secondary, it'll still accept it on the primary

Speaker:

and acknowledge back to the client, right?

Speaker:

So you're not guaranteed that it'll stop writes if it can't write to

Speaker:

both sides at the same time. There's a mode called domino mode where, if

Speaker:

it can't write to both sides, it will not acknowledge the client.

Speaker:

So that's why it's called domino.

Speaker:

One takes out the other.
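
The difference between the two behaviors can be sketched like this. It is an illustration only, with made-up names and plain lists standing in for the two sites; real arrays expose this as a replication setting rather than a function argument.

```python
class ReplicationError(Exception):
    """Raised when a write cannot be acknowledged to the client."""


def replicated_write(primary: list, dr: list, data: bytes,
                     dr_reachable: bool, domino_mode: bool) -> str:
    """Apply one write and report how it was acknowledged."""
    if dr_reachable:
        primary.append(data)
        dr.append(data)
        return "ack: on both sites"
    if domino_mode:
        # Domino mode: refuse the write rather than let the two sites diverge.
        raise ReplicationError("DR site unreachable; write not acknowledged")
    # Typical behavior described above: keep serving writes from the primary alone.
    primary.append(data)
    return "ack: primary only"
```

The trade-off is availability versus a guarantee: without domino mode the application keeps running but the DR copy falls behind; with it, a DR outage stops production writes too.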

Speaker:

I, I would think that,

Speaker:

this is one of those, this is one of those things where, you know, uh, this

Speaker:

is reality versus the idea, right?

Speaker:

To me,

Speaker:

the domino mode that you described, that's synchronous.

Speaker:

Anything other than that is not synchronous, right?

Speaker:

And anything other than the domino mode that you described,

Speaker:

I would call asynchronous, right?

Speaker:

So it's either synchronous, it's sort of like immutable and not immutable, right?

Speaker:

It's either synchronous or it's not synchronous.

Speaker:

And if it's synchronous, then it shouldn't acknowledge the write to the

Speaker:

client until both writes have been done.

Speaker:

And if one of them fails, then they're not done.

Speaker:

yeah.

Speaker:

So at least from most of the vendors I've seen,

Speaker:

they've never implemented it that way.

Speaker:

Yeah.

Speaker:

That's interesting.

Speaker:

Well, they're wrong.

Speaker:

So, um, uh, yeah, so that was, so that was a lesson we thought we

Speaker:

learned at the time, that we need to make sure we put it far enough away.

Speaker:

But then they were like, it's gotta be synchronous and

Speaker:

it's gotta be 200 miles away.

Speaker:

They're like, eh, it's not gonna work.

Speaker:

Right.

Speaker:

Um, nowadays, you know, you hinted at it earlier, nowadays we would do this with

Speaker:

the cloud and we can put it actually, because, you know, you, you did say that

Speaker:

your first problem was that it wasn't far enough, and that's probably true, right?

Speaker:

Because especially when we start talking about certain areas

Speaker:

like Southern Florida, right?

Speaker:

200 miles isn't far enough.

Speaker:

Um, and um, so with the cloud, you can put it.

Speaker:

Pretty much anywhere.

Speaker:

Now, if we're going to do that, if, especially if we're gonna use

Speaker:

public networks, we're pretty much going to have to use asynchronous

Speaker:

of some sort, right?

Speaker:

Uh, so we're gonna send the data and put it another place we're going to,

Speaker:

you know, like you said, you can have a buffer, you can have a certain amount

Speaker:

of time that it's allowed to get behind before it, uh, like you said, put, what

Speaker:

do you mean when you say back pressure?

Speaker:

So this is where you start to, um, elongate the time

Speaker:

before acknowledging a write.

Speaker:

So

Speaker:

to the client, because typically your client will sort of throttle itself

Speaker:

because at some point your latencies are gonna get into the seconds

Speaker:

and they'll be like, no, no, no.

Speaker:

Something's going on.

Speaker:

I'll slow down.

Speaker:

right.

Speaker:

Because otherwise you're just gonna start dropping the writes.

Speaker:

And.

Speaker:

And, and this is a, a configuration choice on the part of the customer

Speaker:

where they can say, I don't want to ever put back pressure.

Speaker:

I wanna, uh, you know, that, that the data protection is less important

Speaker:

than actually getting the job done.

Speaker:

And then other clients would say, if I don't back it up, I don't

Speaker:

wanna write it in the first place.

Speaker:

Right.

Speaker:

Um, I, I, obviously, I tend to be more towards the latter than the former.

Speaker:

Yeah.

Speaker:

Uh, most of the companies I've seen, or vendors,

Speaker:

Yeah.

Speaker:

They're not in line with you.

Speaker:

Meaning that they would just go ahead and do it anyway?

Speaker:

Yeah, because most customers Right.

Speaker:

Unless you

Speaker:

have very, very strict regulations.

Speaker:

Right.

Speaker:

They're like, best effort

Speaker:

Yeah.

Speaker:

I, I think it's

Speaker:

because they will

Speaker:

that I care about data protection more than the average

Speaker:

person,

Speaker:

Because the thing is, at some point it will catch back up.

Speaker:

Hopefully,

Speaker:

Hopefully yes.

Speaker:

Depending on what the problem was.

Speaker:

right.

Speaker:

Um, and, and again, as long as we're okay with the potential,

Speaker:

right, um, uh, I, I would think that there should still be some number.

Speaker:

That number might be measured in hours. If we're hours behind updating

Speaker:

our other copy, something

Speaker:

might be drastically wrong that we need to look at.
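
One simple way to act on that is a lag check that raises a flag once the DR copy falls too far behind. The sketch below is an assumption-laden illustration: the function name and the one-hour default are hypothetical, and how you learn the time of the last replicated write depends entirely on your product.

```python
from datetime import datetime, timedelta


def replication_lag_alert(last_replicated_at: datetime, now: datetime,
                          max_lag: timedelta = timedelta(hours=1)) -> bool:
    """Return True when the DR copy is further behind than max_lag."""
    return (now - last_replicated_at) > max_lag
```

Passing the current time in as an argument, rather than reading the clock inside the function, keeps the check easy to test.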

Speaker:

Yep.

Speaker:

And the other thing to also mention is with a lot of these

Speaker:

high-end,

Speaker:

I normally refer to it as tier one, storage systems,

Speaker:

Yeah,

Speaker:

because these are tier one applications with very strict requirements.

Speaker:

Um, usually also they provide the ability to do automatic failover.

Speaker:

So it's kind of, think of it like high availability plus clustering.

Speaker:

right.

Speaker:

So if you take a look at a lot of the tier one storage, right, they might have two

Speaker:

storage systems in both locations with the drives that are all interconnected.

Speaker:

So in case one unit fails, the other unit can take over the disks of the other side.

Speaker:

Um, the clients are also connected to both sides, so they don't

Speaker:

have to worry about failing over.

Speaker:

Um, I can't remember what it was called.

Speaker:

It's like the optimized and non-optimized connectivity for Fibre Channel,

Speaker:

which allows it to have a preferred path and a non-preferred path.

Speaker:

So you still

Speaker:

have connectivity, so your clients will automatically fail over, so you

Speaker:

don't have to do anything, and so your writes can still continue to happen.

Speaker:

Yeah.

Speaker:

So that's a big thing, is like, you know, we, we, we learned that

Speaker:

we should have it farther away.

Speaker:

We learned that maybe we shouldn't have it too far away, but

Speaker:

now, with the cloud, we can potentially have it pretty much anywhere, but we

Speaker:

definitely have to rely on some sort of asynchronous, uh, uh, communication.

Speaker:

When I think about our friends that came on the show to

Speaker:

talk about their experiences with disaster,

Speaker:

I think there's some, that was a major, that was a hurricane

Speaker:

that took out an island

Speaker:

and they had multiple data centers on that island.

Speaker:

One was more in the high ground than the other.

Speaker:

So one was flooded, the other was not.

Speaker:

And they were gonna recover from one data center to the other.

Speaker:

And that's, and they had everything they needed there.

Speaker:

They did.

Speaker:

They needed personnel.

Speaker:

They had to fly people to the island.

Speaker:

But other things happened that we can also learn from.

Speaker:

Do you, what do you remember from those?

Speaker:

So from, so there were two things I remember.

Speaker:

One was sort of.

Speaker:

The people process stuff, which I think we rarely think about.

Speaker:

And then the other was the technology piece.

Speaker:

So from a people process perspective, it was, do you have the right people in

Speaker:

country who have the expertise and the

Speaker:

skills to

Speaker:

recover?

Speaker:

Where are they going to sleep?

Speaker:

Where are they going to get food?

Speaker:

Right?

Speaker:

All the things that you kind of take for granted, right?

Speaker:

They had none of that.

Speaker:

Like how do they communicate

Speaker:

Yeah.

Speaker:

As I recall, he turned a, um, a conference room into a

Speaker:

hotel room

Speaker:

Right.

Speaker:

And slept on a cot and ate rice and beans for two

Speaker:

weeks.

Speaker:

And he was lucky to

Speaker:

get rice and beans right?

Speaker:

Because there were a lot of people who didn't even get that.

Speaker:

Yeah.

Speaker:

Um.

Speaker:

Yeah, that's, that is definitely a, a lesson learned is that, you

Speaker:

know, make sure to take the human element into your DR design.

Speaker:

Right.

Speaker:

Um, and the one that, the one that stands out to me was

Speaker:

the reliance on the mainland.

Speaker:

Yeah.

Speaker:

Right.

Speaker:

That when you have a true disaster, whether it's on an island or just.

Speaker:

You know, wherever you might not be able to get connectivity to the rest

Speaker:

of your computing infrastructure.

Speaker:

And in this case, their authentication and authorization, their IAM system

Speaker:

relied on Active Directory,

Speaker:

all of which was on the mainland.

Speaker:

And, um, the lesson there is to just make sure that

Speaker:

you take that into consideration, right?

Speaker:

Just realize that in a real disaster, you may be very

Speaker:

isolated from the rest of the world, and you need to take

Speaker:

that into consideration in your DR

Speaker:

design.
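
One way to make that isolation check concrete is a simple dependency inventory. The Python sketch below is only an illustration under stated assumptions: the site names and service names are hypothetical, and it assumes you can list the site where each recovery dependency runs. It flags anything the recovery would need that does not live at the recovery site.

```python
def remote_dependencies(dependencies: dict[str, str], recovery_site: str) -> list[str]:
    """Return the services a recovery needs that are NOT hosted at the recovery site."""
    return sorted(
        service for service, site in dependencies.items() if site != recovery_site
    )


# Hypothetical inventory: service name -> site where it runs.
inventory = {
    "active-directory": "mainland",
    "backup-server": "island-dc2",
    "dns": "mainland",
    "hypervisor-management": "island-dc2",
}

# Anything listed here becomes unreachable if the recovery site is cut off.
print(remote_dependencies(inventory, "island-dc2"))
```

Anything the function returns is a candidate for a local replica or a documented workaround, since it would be unavailable if the site lost connectivity.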

Speaker:

Yeah.

Speaker:

And, but I think this becomes so difficult, right, Curtis, because

Speaker:

like how many scenarios are you gonna play in your head and how much time

Speaker:

are you gonna focus on some of these?

Speaker:

Now granted, a hurricane on a tropical island is probably

Speaker:

a high likelihood event,

Speaker:

Yes.

Speaker:

Right.

Speaker:

Well, I,

Speaker:

of that, but.

Speaker:

Well, I think that there's two

Speaker:

things that you cannot count on, right.

Speaker:

Um, and I would just say one thing that covers a number of

Speaker:

things, and that is utilities, right?

Speaker:

You cannot count on the internet.

Speaker:

You cannot count on power.

Speaker:

You cannot count on, um, you know, uh, water,

Speaker:

right?

Speaker:

You can't count on those three things.

Speaker:

And so I'm just saying, you don't need

Speaker:

to think of all of the reasons that one or more of those might not be

Speaker:

Yeah.

Speaker:

available.

Speaker:

You just need to plan for them not being available.

Speaker:

If you don't have internet, if you don't have whatever it is you,

Speaker:

however you communicate between your sites, you are going to be isolated.

Speaker:

If you don't have power, you're gonna need to supply your own power,

Speaker:

right?

Speaker:

Um, you can

Speaker:

think through these things, and these are all things that you can say either.

Speaker:

I'm just saying, have that discussion.

Speaker:

And say, you know what, if we don't have power, we're not gonna do anything.

Speaker:

We're not

Speaker:

gonna spend $15 billion on power generators

Speaker:

Yeah,

Speaker:

in case we, you know, we might. Or you might be subject

Speaker:

to regulations, or you might

Speaker:

have, uh, financial reasons why downtime is enough,

Speaker:

or the cost of downtime is enough, that you're gonna pay for generators.

Speaker:

Um, do you remember how they got internet over there?

Speaker:

It was satellite.

Speaker:

Yeah.

Speaker:

satellite.

Speaker:

I like this idea of making sure that you think about what would

Speaker:

you do if the utilities that you're normally counting on are not

Speaker:

available?

Speaker:

If you get completely isolated, what would you do?

Speaker:

There are a number of reasons why that might end up being the case.

Speaker:

Or the people that you normally depend on are not available.

Speaker:

Yes.

Speaker:

This is why we have this little thing called documentation.

Speaker:

Yeah.

Speaker:

Um,

Speaker:

And,

Speaker:

and

Speaker:

go ahead.

Speaker:

and hopefully I know when we've had Mike, Dr. Mike on the podcast, right?

Speaker:

Um, he's talked about sort of doing tabletop exercises, right?

Speaker:

So have you done a tabletop exercise, which is like, Hey, what happens if the

Speaker:

main IT person, Susie, is unavailable?

Speaker:

Right, right.

Speaker:

And, and that's what you do.

Speaker:

You work through those various scenarios.

Speaker:

Hiring somebody to come in as an outsider is the best way to do that.

Speaker:

When we start talking about ransomware, um, Dr. Mike Saylor's company would

Speaker:

be a great resource for that.

Speaker:

It's good to use a very negative person.

Speaker:

Think the most negative person in your environment, the most

Speaker:

pessimistic person.

Speaker:

And, uh, no

Speaker:

pessimist thinks they're negative.

Speaker:

They think they're realists.

Speaker:

will realize

Speaker:

Um, yeah.

Speaker:

Sort of the final thought that I'm thinking is, and I remember

Speaker:

doing a disaster recovery of my own, and that is, I did it where

Speaker:

I had not yet tested the throughput of the

Speaker:

backup system that we were using, and the throughput of the backup system we

Speaker:

were using, it turned out to be crap.

Speaker:

Right?

Speaker:

And as a result, uh, that was a really hard day.

Speaker:

And in Curtis land, right?

Speaker:

This is

Speaker:

very early in my career, I learned it.

Speaker:

What's that?

Speaker:

Yes, it was the compression thing.

Speaker:

Yeah.

Speaker:

So when you make configuration changes to your

Speaker:

backup system or your DR system, make sure that you test that.

Speaker:

Just realize that in general, restore speed is slower than backup speed.

Speaker:

It just is.

Speaker:

You know, back in the day with tape, it

Speaker:

was because we were doing multiplexing.

Speaker:

Nowadays, with dedupe,

Speaker:

it's because of dedupe.

Speaker:

Um, and so just sort of plan for that.

Speaker:

But don't just assume how much slower it is. Test it and see how much

Speaker:

slower it is, and make sure you figure that into the disaster recovery plan.
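
As a rough illustration of figuring a measured restore speed into the plan, here is a minimal Python sketch. The function names, the data size, the throughput figures, and the 8-hour recovery time objective are all made-up assumptions for the example; the restore rate is meant to come from an actual restore test, not from the backup rate.

```python
def restore_hours(data_tb: float, restore_mb_per_s: float) -> float:
    """Hours needed to restore data_tb terabytes at a measured restore rate."""
    if restore_mb_per_s <= 0:
        raise ValueError("restore rate must be positive")
    total_mb = data_tb * 1_000_000  # decimal units: 1 TB = 1,000,000 MB
    return total_mb / restore_mb_per_s / 3600


def meets_rto(data_tb: float, restore_mb_per_s: float, rto_hours: float) -> bool:
    """True if the estimated restore time fits inside the recovery time objective."""
    return restore_hours(data_tb, restore_mb_per_s) <= rto_hours


# Hypothetical numbers: 10 TB to restore, 8-hour RTO.
# Backups ran at 500 MB/s, but the tested restore only reached 200 MB/s.
print(round(restore_hours(10, 500), 1))  # estimate if you wrongly assume backup speed
print(round(restore_hours(10, 200), 1))  # estimate using the tested restore speed
print(meets_rto(10, 200, rto_hours=8))
```

With these invented numbers, planning on the backup speed would suggest the restore fits the objective, while the tested restore speed shows it does not, which is the gap a restore test is meant to expose.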

Speaker:

So, um, with that, I think that's enough for now in terms of lessons learned

Speaker:

from just, you know, the pains of others.

Speaker:

Um.

Speaker:

Just think of the scenarios that you might be subject to.

Speaker:

Um, make sure you've got at least one copy of your data

Speaker:

that's far away from everything.

Speaker:

Um, and the way to probably do that today is cloud. Tape can still play a role.

Speaker:

In fact, in our previous episode we talk about why tape is

Speaker:

still not dead in backup.

Speaker:

Uh, it certainly is on life support, but there

Speaker:

is, you know,

Speaker:

there is a use for tape in backup, and it's disaster recovery.

Speaker:

So, um, especially when we start talking about disaster

Speaker:

recovery from a ransomware attack.

Speaker:

So, well, thanks for chatting again, my friend.

Speaker:

No, it was good.

Speaker:

Uh, one thing I thought you were gonna mention, but you did not:

Speaker:

the 3-2-1 rule.

Speaker:

1 rule.

Speaker:

You were just

Speaker:

You know what's funny is we were so close.

Speaker:

We were so close.

Speaker:

Yeah.

Speaker:

Yeah.

Speaker:

The whole 3-2-1 rule with, you know, three copies of your data on two

Speaker:

different media, one of which is offsite.
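
The rule as stated here can be written down as a quick check. This is a minimal Python sketch, not a tool mentioned in the episode: the `DataCopy` type, the media labels, and the example copies are hypothetical, and it assumes the list holds every copy you count toward the three.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataCopy:
    name: str
    media: str     # e.g. "disk", "tape", "cloud" (labels are illustrative)
    offsite: bool  # True if stored away from the primary site


def satisfies_3_2_1(copies: list[DataCopy]) -> bool:
    """At least three copies, on at least two media types, at least one offsite."""
    media_types = {c.media for c in copies}
    return (
        len(copies) >= 3
        and len(media_types) >= 2
        and any(c.offsite for c in copies)
    )


# Hypothetical layout: two disk copies onsite plus one cloud copy offsite.
good = [
    DataCopy("primary", "disk", offsite=False),
    DataCopy("local backup", "disk", offsite=False),
    DataCopy("cloud copy", "cloud", offsite=True),
]
print(satisfies_3_2_1(good))
```

A layout with three onsite disk copies would fail both the media and the offsite conditions, which is the kind of gap the rule is meant to catch.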

Speaker:

That's a core design concept for anything

Speaker:

in, uh, backup and DR. And, yeah, you're right.

Speaker:

We didn't mention it, but now we have, so you

Speaker:

have corrected our oversight.

Speaker:

Thank you very much.

Speaker:

And thank you to our listeners.

Speaker:

We'd be nothing without you.

Speaker:

That is a wrap.

Speaker:

The Backup Wrap-Up is written, recorded, and produced by me, W. Curtis Preston.

Speaker:

If you need backup or DR consulting, content generation, or expert witness

Speaker:

work, check out backupcentral.com.

Speaker:

You can also find links to my O'Reilly books on the same website.

Speaker:

Remember, this is an independent podcast and any opinions that

Speaker:

you hear are those of the speaker and not necessarily an employer.

Speaker:

Thanks for listening.