You found The Backup Wrap-up, your go-to podcast for all things
Speaker:backup, recovery, and cyber recovery.
Speaker:In our final episode about DR testing, the rubber meets the road.
Speaker:Last time we talked about getting ready for your DR test, and this time we're
Speaker:talking about actually running the test.
Speaker:We'll cover what you need to do during the test, like coordinating between
Speaker:teams, documenting what goes wrong, because something always goes wrong,
Speaker:and making sure that you've got backup communication methods ready.
Speaker:By the way, if you don't know who I am, I'm W. Curtis Preston, AKA Mr.
Speaker:Backup, and I've been passionate about backup and recovery for over 30 years,
Speaker:ever since I had to tell my boss that
Speaker:we had no backups of that really important production database that we had just lost.
Speaker:I don't want that to happen to you, and that's why I do this.
Speaker:On this podcast, we turn unappreciated backup admins into Cyber Recovery Heroes.
Speaker:This is The Backup Wrap-up.
Speaker:Welcome to the show.
Speaker:Hi, I am W. Curtis Preston, AKA Mr.
Speaker:Backup, and if you could just take a couple of seconds to
Speaker:either like or subscribe or,
Speaker:uh, follow the channel so that you can, uh, always get our great content.
Speaker:That would be awesome.
Speaker:I am once again joined by a guy who has finally put some of his car
Speaker:knowledge to use, Prasanna Malaiyandi.
Speaker:I'm doing well, Curtis, and yes, I am finally putting some of that car knowledge
Speaker:to use. Uh, for viewers or listeners who may not be aware,
Speaker:I tend to watch a lot of car YouTube stuff. Um, a lot of it
Speaker:tends to be around fabrication, engine rebuilding, drag racing.
Speaker:It's a really odd mix, but a lot of it is just YouTube knowledge.
Speaker:And so I finally decided to try something different, and I've been taking auto
Speaker:shop classes at my local community college, which has been amazing.
Speaker:And so as part of it, you actually have a hands-on lab section where you get to
Speaker:actually work on cars like your own car.
Speaker:And right now it's all basic stuff, right?
Speaker:So oil changes, under-hood inspections, inspecting cooling systems.
Speaker:But we actually get to do things like charging tests, uh, compression tests,
Speaker:leak-down tests, replacing spark plugs.
Speaker:So excited.
Speaker:I'm actually using these hands for things.
Speaker:And you did a, you did an oil change yesterday on your wife's car.
Speaker:I did do an oil change on my wife's car.
Speaker:Yep.
Speaker:How filthy was the oil in your wife's car?
Speaker:Yeah, it looked almost brand new.
Speaker:Um, it didn't have many miles since the last oil change.
Speaker:I'd probably say five or six hundred miles, but it was a sacrifice since I needed to
Speaker:actually do an oil change for the class.
Speaker:So.
Speaker:Right, right. Now, this isn't the first time you've done an oil change, right?
Speaker:No.
Speaker:Okay.
Speaker:I've done one in the past, but this is the first time I've done it on a lift,
Speaker:which oh my God, is so much nicer.
Speaker:Oh, you brought, you brought her car into the class.
Speaker:class.
Speaker:And we put it
Speaker:I see, I see.
Speaker:and I.
Speaker:Yeah.
Speaker:Yeah.
Speaker:Everything's nicer on a lift.
Speaker:Absolutely.
Speaker:When you're not like struggling underneath the car, trying not to drop the hot oil
Speaker:on you, and you're actually able to get a large, like container underneath, like the
Speaker:drum was probably like three feet wide.
Speaker:Right, right.
Speaker:Yeah.
Speaker:You had one of those that you can wheel around, right?
Speaker:Yep.
Speaker:So significantly easier.
Speaker:And then I was just thinking, I was like, do I have room in
Speaker:my garage for a two-post lift?
Speaker:Even a short one, but no.
Speaker:Trust me, I have thought about it back when I was doing a
Speaker:lot more work on my cars.
Speaker:I definitely looked into it and I was like, okay, I don't, I
Speaker:can't spend that kind of money.
Speaker:So let's talk about something that we actually know a little bit
Speaker:about. Uh, so two weeks ago, for those of you that follow
Speaker:the show, uh, two weeks ago
Speaker:we did DR testing part one, and then,
Speaker:um, we aired, uh, speaking of DR
Speaker:testing, a great episode from 2021, which was the best DR testing story ever, right?
Speaker:Yep.
Speaker:Oh yeah.
Speaker:The scariest, I would say.
Speaker:Yeah.
Speaker:Where a guy, for reasons that he goes into in the show, essentially purposefully
Speaker:destroys his production environment, not just for DR testing, but
Speaker:as a matter of how everything happened. He ends up testing his DR system
Speaker:and it, it does work, but oh my God.
Speaker:There was, there was a, there was a quote in there that said something like, he had
Speaker:a long weekend that lasted like five days.
Speaker:Yeah.
Speaker:Yeah.
Speaker:So, yeah.
Speaker:things it's like, and well, and the other challenge is he was up in Alaska,
Speaker:Yeah,
Speaker:If he needed to get parts or other things like
Speaker:right.
Speaker:luck.
Speaker:Yeah, exactly
Speaker:if I'm about to do like a house repair or something else like that,
Speaker:it's like, you know not to do it on a Saturday or a Sunday or a Friday,
Speaker:right.
Speaker:if you have to call someone or you need to pick up something
Speaker:and you don't do it at night.
Speaker:Yeah, definitely.
Speaker:Definitely don't do it at night.
Speaker:Right.
Speaker:Yeah.
Speaker:Um, yeah, so that's a great episode if you didn't listen to that episode.
Speaker:That is a great episode.
Speaker:Um, and, uh, uh, yeah, listen to that.
Speaker:So this one, the, the.
Speaker:Two weeks ago, we talked essentially about getting ready to do the DR test,
Speaker:preparing for it, setting the scope for it, agreeing on what's going to be a
Speaker:success, and then this week we're gonna talk about actually executing the DR test.
Speaker:And again, this is a DR test.
Speaker:What would you say is the purpose of a DR test, Prasanna?
Speaker:To make sure that, in the case of an actual disaster,
Speaker:you're able to recover as agreed upon, whatever your agreement was.
Speaker:Yeah, I, I think that's sort of the general, yeah.
Speaker:Obviously that's the purpose of a test in general, right?
Speaker:It's to
Speaker:test whether or not you could do it when you need it.
Speaker:But since most tests fail, I'm going to say that the other purpose, and
Speaker:perhaps the bigger purpose, is to fix the parts of the DR system that
Speaker:you discover are broken in some way.
Speaker:Right?
Speaker:Um, and, uh, so probably one of the biggest
Speaker:outcomes of a DR test is to feed back into the DR plan, right?
Speaker:Yeah.
Speaker:just in terms of what fails, I know sometimes people are like,
Speaker:oh, it's just thinking about like, I can't restore the data.
Speaker:But a lot of times what really fails is the dependencies that you didn't consider.
Speaker:Right.
Speaker:Did you make sure you're able to fail over and recover your Active
Speaker:Directory in your DR site before you can bring your applications online?
Speaker:You, you know, um, I'm glad you brought that up because I aired
Speaker:another classic episode about the actual disaster recovery on an island.
Speaker:And, uh, again, well, it's with the islands, right?
Speaker:Because Alaska was Kodiak Island.
Speaker:Um, but this was in a Caribbean island.
Speaker:And they do an actual deal, you know, an actual recovery because there
Speaker:was a hurricane that took it out.
Speaker:And one of those dependencies that you talked about was the lack of internet
Speaker:Yeah.
Speaker:and, uh, lack, the lack of power, the lack of internet.
Speaker:These are all things that we come to expect on a normal everyday basis, which
Speaker:In
Speaker:an actual disaster is, is not,
Speaker:Yep.
Speaker:not that right.
Speaker:Yeah.
Speaker:And we also had that other episode.
Speaker:Do you remember maybe, I don't know if you want to air that or not.
Speaker:The derecho one.
Speaker:That's right, the one that talked about the derecho.
Speaker:I'm gonna have to, I have to go find that one.
Speaker:'cause it's not titled the Derecho episode.
Speaker:It was, um.
Speaker:I'll have to find it. If I can find that, I'll rebroadcast it in keeping
Speaker:with the, the disaster recovery theme.
Speaker:I'll, I'll definitely see if I can find that when I br
Speaker:'cause that was also very good.
Speaker:I didn't even know what a derecho was.
Speaker:Derecho is a land hurricane.
Speaker:Uh, a hurricane that forms over land.
Speaker:I don't know why it's called derecho, but that is what it is.
Speaker:Right.
Speaker:Yep.
Speaker:Uh, to me that just means Right, you know, to the right in Spanish.
Speaker:But I.
Speaker:You know, it is what it is.
Speaker:Um, so, uh, so, so we talk about if we're executing the DR test.
Speaker:Uh, we, we, you know, we, we, we've, we've agreed on what we're gonna test.
Speaker:We've agreed on what the success criteria is.
Speaker:It's time to actually start walking through the, the test
Speaker:we're, we're going to have.
Speaker:And, and also we created a, an environment that we're going to test in.
Speaker:We're not doing what our friend from Alaska did.
Speaker:I, I was just thinking, are you just gonna go around like the TV shows
Speaker:when they get hit with an attack and they're just like, unplug,
Speaker:unplug the cables, unplug the cables.
Speaker:Yeah, don't do that.
Speaker:We, we have some sort of test environment.
Speaker:Generally speaking, these days it's generally gonna be the
Speaker:cloud, and we're going to start executing the, the, um, this test.
Speaker:Can you think of, uh, and, and one of the things, again, this is more of
Speaker:a setup thing, but one of the things you wanna make sure is to allocate
Speaker:enough time, uh, for this, you know,
Speaker:for this process to unfold in its natural, um, evolution.
Speaker:I would say time.
Speaker:And then also make sure you have the resources right.
Speaker:And I'm not, I don't mean compute resources, but people because.
Speaker:Right?
Speaker:Make sure that people are available, right?
Speaker:yeah.
Speaker:Um.
Speaker:Don't do this at, like, uh, quarter end, because people may
Speaker:be firefighting other things or.
Speaker:Yeah.
Speaker:Yeah.
Speaker:The company that I, the, the bank, we did it on a weekend.
Speaker:Um, but it was a dedicated, you know, a, a dedicated weekend where we're
Speaker:going to do the DR test, and we did that because again, you're, you're
Speaker:making all these resources available for the DR test, which means they're not
Speaker:available to do their day job, and their day job would happen during the week.
Speaker:So we chose to do it on the weekend and.
Speaker:I'd say the bigger, the bigger you're going, the bigger te the bigger,
Speaker:this isn't coming out in English.
Speaker:Uh, the bigger the test, the bigger the need to prepare and to, to have, um,
Speaker:you know, to make sure you have those resources and to not do it when the
Speaker:normal production stuff is going on.
Speaker:It requires buy-in from the business, communication, right?
Speaker:All these
Speaker:Yeah.
Speaker:right?
Speaker:Yeah.
Speaker:Make sure you.
Speaker:Make sure you communicate to all the powers that be, that you are doing
Speaker:a DR test, especially if you're gonna do any kind of failover.
Speaker:Um,
Speaker:it too, right?
Speaker:Because you want this to be done on an ongoing basis.
Speaker:right,
Speaker:to convince 'em upfront, Hey, here's why it's valuable, such that when you go back
Speaker:and after the results, right, you're like, Hey, we now need to do another DR test.
Speaker:Maybe six months down the line,
Speaker:right.
Speaker:they've already bought in.
Speaker:Another thing as, as we're going through the DR test, we're documenting
Speaker:what went right, what went wrong, especially what went wrong.
Speaker:Right.
Speaker:Um, go ahead.
Speaker:so
Speaker:this is an interesting thing, 'cause when we had Mike on the podcast, right, and
Speaker:he was talking about sort of doing these tabletop exercises, right?
Speaker:I think it's important the person documenting kind of needs to
Speaker:take an objective perspective.
Speaker:Mm-hmm.
Speaker:Right, because you may be showing some biases or the person documenting
Speaker:may not want to document certain things, or may just sort of dismiss
Speaker:it as, Hey, this isn't important,
Speaker:Right,
Speaker:Versus actually capturing what happened throughout the process.
Speaker:right.
Speaker:Agreed.
Speaker:Um, the next thing is, and, and we covered this, uh, in the previous
Speaker:episode, but once you've, you know, we talked about testing little parts
Speaker:of the infrastructure, but once we grow, once we've tested this piece
Speaker:and this piece and this piece,
Speaker:I do think it's important to test, you know, you look at the scenario: what
Speaker:would this scenario do to our company?
Speaker:Right?
Speaker:The scenario is a disaster.
Speaker:The scenario is a fire, a
Speaker:terrorist action, um, and it's gonna take out all of this infrastructure.
Speaker:What would that do to us?
Speaker:So for example, you might not need to test your ability to recover from a SaaS outage
Speaker:if you have a data center and your data center goes out, right?
Speaker:It's gonna be scenario dependent
Speaker:what you're gonna test, but, um, you might wanna ask, what would be the impact to
Speaker:our business and our ability to use the different parts of our infrastructure?
Speaker:And so speaking of dependencies, if we don't have internet of any kind, it
Speaker:is, it is kind of a SaaS outage, right?
Speaker:Right.
Speaker:Um, so, uh, we're gonna, we want to test as many of those parts
Speaker:of our infrastructure that are going to be impacted by the
Speaker:scenario that we're testing, right.
Speaker:Yeah, and sometimes it's a little bit about.
Speaker:consequences or identifying gaps.
Speaker:It's like when you're writing code, right?
Speaker:You normally do unit tests,
Speaker:Um.
Speaker:but then when you actually test the end-to-end functionality, you're
Speaker:like, oh, I didn't realize that this interacts with this other thing
Speaker:this way, and things don't work.
Speaker:That's why we also do end-to-end testing in addition to unit tests.
Speaker:Yeah.
Speaker:And, and again, this is why, way back in the beginning, I
Speaker:was saying that the purpose of the DR test is to identify these gaps, right?
Speaker:The, yeah.
Speaker:I mean, we can have, um,
Speaker:you know, we can have that perfect test that goes well, and that's great and
Speaker:everybody feels better, but it's just as valuable to find the DR test that
Speaker:had a big hole or a small hole, and, um, you know,
Speaker:to document that and address that.
Speaker:And this is why we do it on a regular basis.
Speaker:I have a question for you, Curtis.
Speaker:Yeah.
Speaker:Do you think DR
Speaker:testing...
Speaker:So most organizations have a risk management team,
Speaker:Mm-hmm.
Speaker:right?
Speaker:Which usually has a lot of this information in terms of, okay,
Speaker:what are the business risks and everything else like that.
Speaker:But they're also probably the ones who are coordinating across the business
Speaker:in order to say, okay, let's do a test.
Speaker:Mm-hmm.
Speaker:Right? Whereas the infrastructure DR testing that we're talking about here
Speaker:is probably one portion of that overall
Speaker:Mm-hmm.
Speaker:Mm-hmm.
Speaker:Do you think that's fair?
Speaker:Yeah, I think that's fair.
Speaker:And you know, this is, we're going to,
Speaker:I think that if we're doing a, a real DR test,
Speaker:this is a business test as much as it is a technology test, right?
Speaker:Yeah.
Speaker:There is that overlap between business continuity planning
Speaker:and disaster recovery planning.
Speaker:And maybe for a DR test, we're not concerned so much with, um,
Speaker:uh, like if, if it's just a DR test, we're not concerned with, let's say,
Speaker:uh, buildings and people, places for people to, to work and things like that.
Speaker:We're concerned more with getting the technology back up and running.
Speaker:But I, I'm glad you brought that up.
Speaker:That is a, a separate aspect that does need to be taken into account.
Speaker:Well, and the benefit with this is if there's already a team that is looking
Speaker:at that business continuity aspect,
Speaker:Mm-hmm.
Speaker:You may not need to convince the business as much, right?
Speaker:In order to be
Speaker:Right.
Speaker:right, you should partner with people who already, like that is their job,
Speaker:Agreed.
Speaker:Agreed.
Speaker:them.
Speaker:Agreed.
Speaker:We talked about documenting things that we discover here.
Speaker:I, I think that we should be maintaining like a log of, you
Speaker:know, all of the tests and the things that we've learned from them.
Speaker:Because again, that may be helpful for, uh, you know, for
Speaker:future generations of tests.
Speaker:You know, it's important to have a DR
Speaker:runbook and, you know, one of
Speaker:the purposes of the test is to update that runbook.
Speaker:So let's just talk about that.
Speaker:Um, the thing about having a DR
Speaker:runbook: I do believe in having an electronic copy of the DR
Speaker:runbook, uh, but also have the ability to easily update
Speaker:a paper copy of that runbook.
Speaker:So the way to do that is to have some sort of documentation system
Speaker:online that you can easily update.
Speaker:Um, and then if you want to have a paper copy, and you do want to have a paper copy,
Speaker:then, um, the best way to have that is a loose-leaf-type notebook system, right,
Speaker:where you can update pages of it, where you don't have to update the entire book.
Speaker:I have a comment about the electronic copy.
Speaker:Sure.
Speaker:I would recommend also keeping a copy out of your normal corporate infrastructure.
Speaker:Agreed.
Speaker:Right, right.
Speaker:in case, say you get hit with ransomware and you no longer have access to that
Speaker:infrastructure, or someone deletes your account that hosted that data, right?
Speaker:So make sure it's something completely disconnected as well.
Speaker:A copy just in case.
Speaker:And I go back to think about the Pixar story, right, where they just happened to
Speaker:be lucky with Toy Story 2 and have a copy offsite, offline, to save the movie.
Speaker:Exactly.
Speaker:Um, yeah, I, I, I think obviously we, we have to keep security in mind.
Speaker:We have to make sure that what, wherever that system, wherever that other
Speaker:copy is, it's protected by security.
Speaker:But the whole point of it is to have it outside the normal security.
Speaker:So, uh, there, there's a, there's a, um, a balance that you need to have there.
Speaker:Right.
Speaker:Um, what about communications during the DR tests?
Speaker:Um, we need to keep everyone
Speaker:abreast of what's going on.
Speaker:You wanna talk about that a little bit?
Speaker:Yeah, so you wanna make sure people aren't working in silos, because during a
Speaker:DR test things are gonna be chaotic.
Speaker:But since this is more of a controlled environment, you want to establish
Speaker:those patterns and say, this is a normal way that we communicate.
Speaker:It might be via phones, it might be emails.
Speaker:You might jump into a video conference, right?
Speaker:Whatever it is that you use, make sure that you have all the right
Speaker:stakeholders in that session.
Speaker:Right,
Speaker:in order.
Speaker:So, so then everyone knows what's going on.
Speaker:The other thing though, uh, to mention is make sure you also have
Speaker:alternate methods, just like what we talked about with the runbook itself.
Speaker:Make
Speaker:right.
Speaker:in case your voice-over-IP phones are down in your corporate environment
Speaker:or your chat, Slack, is down, or whatever else you're using,
Speaker:Right.
Speaker:Make sure you have an alternate mechanism to get in touch with people.
Speaker:Yeah.
Speaker:That's a real challenge.
Speaker:Um, I mean, it, it is
Speaker:Smoke
Speaker:to have communication during it.
Speaker:What'd you say?
Speaker:signals.
Speaker:Smoke signals. It's, it's a real challenge because we depend so much on technology,
Speaker:and I would say that, um,
Speaker:Again, if it's an outage, generally the outage is for you
Speaker:and not for everything else.
Speaker:So for example, if you're relying on Zoom, um, as your mechanism, Zoom will
Speaker:probably be up when you have your outage.
Speaker:You just need to make sure that everybody can get to Zoom.
Speaker:So, um, for example, your challenge there
Speaker:will be if you are using, um,
Speaker:you know, a third-party authentication mechanism to get into Zoom
Speaker:and then you don't have access to that. That could be a problem.
Speaker:So these are the things you wanna make sure, you wanna be able to make sure that
Speaker:you can communicate during the outage.
Speaker:Um, and I can definitely think of a, you know, of a multi-headed zoom call where
Speaker:everybody's just sort of keeping everybody abreast of what's going on, right.
Speaker:Um, and we wanna make sure that the stakeholders are aware of everything
Speaker:that's going on, as well as the people
Speaker:that are executing the test.
Speaker:Um, and then what about, um, I, I think, by the way, the Zoom call,
Speaker:or something like a Zoom call, is the best way to have
Speaker:coordination between the teams if there are multiple teams.
Speaker:You don't necessarily have to have everybody who's
Speaker:doing something with the DR,
Speaker:uh,
Speaker:to be on the Zoom call, but the purpose of the Zoom call, I
Speaker:think is probably to keep, keep all of the different teams aware
Speaker:of what the other teams are doing.
Speaker:Right?
Speaker:It's almost like a war room, if you will.
Speaker:Right?
Speaker:exactly.
Speaker:Again, the bigger the test, the bigger
Speaker:the need is to have, uh, some type of communication like this.
Speaker:Right?
Speaker:Uh, and then you've also got escalation procedures.
Speaker:What happens if something doesn't go right?
Speaker:Who do we call?
Speaker:Um, yeah.
Speaker:You could throw a monkey wrench in things and be like, someone who
Speaker:normally is part of the DR test,
Speaker:Right?
Speaker:Or would be responsible for something.
Speaker:You could be like, that person is home sick with the flu
Speaker:and cannot be in the office.
Speaker:Now what do you do?
Speaker:Yeah.
Speaker:Um, yeah.
Speaker:If your DR
Speaker:test says, you know, call Steve,
Speaker:um, you know, the more you have something like that,
Speaker:the bigger that kind of thing is gonna be a problem, right?
Speaker:you bring this up.
Speaker:So I was reading The Register this morning
Speaker:Mm-hmm.
Speaker:there was a call-in or a write-in from a, a reader,
Speaker:Mm-hmm.
Speaker:and they were saying that they had worked at a company
Speaker:in IT and managed a bunch of infrastructure, and they had built
Speaker:this system to automate all of their, uh, software deployment stuff.
Speaker:Mm-hmm.
Speaker:Um, and then they had quit the company, but no one knew how to
Speaker:operate it, and he had left his number. It was in the closet, this machine.
Speaker:He had left his number.
Speaker:It said, do not reboot, call Steve or whatever his name was.
Speaker:And he got the call and this was like 20 years later,
Speaker:Wow.
Speaker:he got a call and he was like, I don't remember the password.
Speaker:I'm sorry.
Speaker:You gotta figure it out on your own.
Speaker:Wow.
Speaker:That's crazy.
Speaker:That's just crazy.
Speaker:So call Steve.
Speaker:That's funny.
Speaker:Um, yeah, don't, don't be like that.
Speaker:Um, so, uh, let's just say we get to the end of the test, right?
Speaker:We've successfully recovered all of the, all of the aspects
Speaker:if we're doing a full DR test.
Speaker:What needs to happen is a full sort of end-to-end functional test of the
Speaker:different parts of the business to make sure not just that a
Speaker:system was recovered or a database was recovered, but that the application and the
Speaker:system around that application are able to function.
Speaker:And again, this is why we go into things like phone systems, right?
Speaker:Yeah.
Speaker:Um, you know, if the application that we're recovering is our customer
Speaker:call center, um, but we don't have phones, uh, great. Uh, you know, all of that
Speaker:stuff has to work.
Speaker:And you've got to do the functional end-to-end test to make sure that all
Speaker:the parts that you are pretending
Speaker:were damaged are now fully functional.
Speaker:I agree with that, but I think it's also one of the things, you have to
Speaker:be careful not to boil the ocean.
Speaker:Yes.
Speaker:Yeah, yeah.
Speaker:Well, again, this is about,
Speaker:Yeah.
Speaker:what's that?
Speaker:a balance.
Speaker:Well, what I'm saying is, whatever it is, this is, I, I think what you're
Speaker:talking about is, is more about scope,
Speaker:Yes.
Speaker:Because
Speaker:even if we just agreed to test this one part of the application, you
Speaker:need to do a functional test of whatever it is that you recovered,
Speaker:even if it's just a small part of the environment.
Speaker:That's all I'm saying.
Speaker:Yeah.
Speaker:Right.
Speaker:That we often focus a little bit too much on the
Speaker:recovery, the restore, and we say, okay, the application's restored,
Speaker:I can walk away.
Speaker:No. The application's restored
Speaker:when people can do
Speaker:whatever it is that application was supposed to do.
Speaker:I was thinking more about, be careful about thinking
Speaker:about all the failure scenarios.
Speaker:Like I was saying, the person gets sick with the flu, right.
Speaker:Oh, yeah, yeah, yeah.
Speaker:Yeah.
Speaker:Be careful about going down that rabbit hole, because you will never come back
Speaker:out, because it might be, what if the butterfly flaps its wings halfway around
Speaker:the world and causes X, Y, Z, right?
Speaker:So,
Speaker:The butterfly will die.
Speaker:right.
Speaker:So don't get overwhelmed by these scenarios is
Speaker:Yeah.
Speaker:And, and speaking of not being overwhelmed, when we get to the,
Speaker:uh, the post-game analysis, right?
Speaker:Let's measure against the success criteria that we agreed to.
Speaker:Um, we, we look at the things that didn't work and the
Speaker:bottlenecks and things like that.
Speaker:The key, again here is to better the world, not to prove that
Speaker:you were the best or whatever.
Speaker:Um, I know it can be really difficult.
Speaker:Say that again.
Speaker:you were the worst.
Speaker:You were the worst.
Speaker:Yeah.
Speaker:Um, you know, we're looking for things that we can improve.
Speaker:We're looking for procedures that we can update based on, you know,
Speaker:the lessons that we learned.
Speaker:Um, any other post-game analysis
Speaker:that you can think of?
Speaker:I would also say.
Speaker:If this is your first time doing this, I think it's also good
Speaker:to say what things went well.
Speaker:I think a lot of times we tend to focus on the negatives,
Speaker:Right.
Speaker:right?
Speaker:But if this is your first time, like this is really hard.
Speaker:This is a hard thing to do.
Speaker:Yeah.
Speaker:And you should acknowledge that and realize if you got through, like I
Speaker:know Curtis, you've always talked about the bank and your DR tests, right?
Speaker:And how I don't think you guys ever completed a hundred
Speaker:No.
Speaker:right?
Speaker:No.
Speaker:Yeah, yeah.
Speaker:So don't be too hard on yourself.
Speaker:Congratulate yourself first off on doing the test in the first place,
Speaker:and second, making it to the end of the test, even if everyone is dead.
Speaker:Um, you know, and then, and then, and then, you know, yeah,
Speaker:don't be too hard on yourself.
Speaker:Right.
Speaker:Uh, because these things rarely go well, uh,
Speaker:unless it's, like, fully automated.
Speaker:And, you know, I will say, the more you can
Speaker:automate things, the better.
Speaker:Right?
Speaker:Yeah.
Speaker:So you ran the tests,
Speaker:Mm-hmm.
Speaker:things that went well, things that went wrong.
Speaker:I think the next step after that is
Speaker:identifying how you close the gaps,
Speaker:Right.
Speaker:And coming up with a plan, because you don't want to
Speaker:just let these things linger,
Speaker:Right.
Speaker:create a plan.
Speaker:Identify what are the most critical elements that you want to address first
Speaker:Mm-hmm.
Speaker:timeframes, and make sure you get buy-in across the board to fix those things.
Speaker:Yeah.
Speaker:Agreed.
Speaker:Right.
Speaker:Um, you, you, you have a, you have an action item list and who's responsible
Speaker:for addressing the different things, and then of course, what's the next thing?
Speaker:You do it again.
Speaker:Right?
Speaker:Um.
Speaker:When,
Speaker:Uh, soon.
Speaker:Right.
Speaker:Um, I would say I'm a fan of more frequent, smaller tests
Speaker:than like an annual huge test.
Speaker:Right.
Speaker:Um, I think the more often we do that, the more we get into the
Speaker:mindset of thinking about the things that can go wrong, because a lot
Speaker:of things are, you know, um,
Speaker:the same in different disciplines across the, uh,
Speaker:the organization, right?
Speaker:So the more often we test, the more often we get to a recovery mindset and
Speaker:we start including those things in the system design from the very beginning.
Speaker:Yeah.
Speaker:Right?
Speaker:Um, again, that's the other purpose.
Speaker:I would add that to
Speaker:my original question: that's the other purpose of a DR test, is
Speaker:to get people to a DR mindset,
Speaker:Yeah.
Speaker:um, to a recovery mindset of saying, um, we need to design the infrastructure and
Speaker:the processes around the infrastructure so that they are easy to recover.
Speaker:Right.
Speaker:Yep.
Speaker:Or at least even think about it to start with rather than, oh yeah, this failed.
Speaker:Now what?
Speaker:What were you gonna do with our DR?
Speaker:Yeah.
Speaker:And, and, and lemme just give you a, a, a silly but simple example of what happens
Speaker:when you don't have a recovery mindset.
Speaker:So I go back to the bank, right?
Speaker:I have so many good stories from the days of the bank, right?
Speaker:And when we bought a T 1000, which was, uh, an HP server, it
Speaker:was a really big server and it had, um, it was a huge server.
Speaker:It had a hundred gigabytes of data.
Speaker:Ginormous, wait.
Speaker:Let me go grab my flash drive.
Speaker:It was a huge server for the time, and it came with a two gigabyte tape drive.
Speaker:Right.
Speaker:I think with compression it was like a four gigabyte tape drive.
Speaker:That was the system design.
Speaker:And there, there were no changes.
Speaker:No, we, we, we added 30%.
Speaker:With one server, we added 30% to the capacity of the
Speaker:data center with one server.
Speaker:There wasn't a single discussion about what we should do from a
Speaker:backup and recovery perspective.
Speaker:That's what happens when you don't have a recovery mindset,
Speaker:Yeah.
Speaker:right?
Speaker:Is that you, you do things, you add things to the system without any thought
Speaker:to what they would, you know, how that would impact the recovery system.
Speaker:So that's why we want to have a recovery mindset.
Speaker:Yep.
Speaker:Okey dokey.
Speaker:I think we covered everything.
Speaker:Yeah.
Speaker:I think so, yeah, everything you could possibly want to know about
Speaker:DR in, uh, four episodes, with the two, the two, maybe five.
Speaker:We'll see.
Speaker:We'll see if I can find that other episode.
Speaker:Thanks, Prasanna, for, uh, you know, once again, uh, you know, great team.
Speaker:Woo hoo.
Speaker:Go team.
Speaker:Go, team, go.
Speaker:And, uh, I want to thank our listeners once again.
Speaker:We'd be nothing without you.
Speaker:That is a wrap.