W. Curtis Preston:

Hi, and welcome to Backup Central's

W. Curtis Preston:

Restore It All podcast.

W. Curtis Preston:

I'm your host, W. Curtis Preston, AKA Mr. Backup.

W. Curtis Preston:

And I have with me my continuing advisor on my consumer backup project,

W. Curtis Preston:

Prasanna Malaiyandi. How's it going,

W. Curtis Preston:

Prasanna?

Prasanna Malaiyandi:

I'm good, Curtis.

Prasanna Malaiyandi:

You know what?

Prasanna Malaiyandi:

We have the expert, the backup anorak Daniel Rosehill coming

Prasanna Malaiyandi:

next week on the podcast, so

W. Curtis Preston:

yeah.

Prasanna Malaiyandi:

you can definitely pick his brains.

W. Curtis Preston:

Yeah.

W. Curtis Preston:

You know, it's been interesting cuz, and it was funny, it was on the

W. Curtis Preston:

podcast where I really, I really, I had a moment where I was like,

W. Curtis Preston:

I'm not really backing up my photos.

W. Curtis Preston:

Right?

W. Curtis Preston:

Like, I'm, I'm backing up, you know, I, I use iCloud, but as, as we have

W. Curtis Preston:

discussed, iCloud is not a backup.

W. Curtis Preston:

iCloud is a sync.

W. Curtis Preston:

Right.

W. Curtis Preston:

And if something catastrophic, if you, if I ever got hacked, um, and somebody

W. Curtis Preston:

got a hold of my, my iCloud password or my iPhone and then just decided to

W. Curtis Preston:

massively delete everything, I, if I caught it soon enough, I would be okay.

W. Curtis Preston:

Cuz I do have like a deleted items thing.

W. Curtis Preston:

Right.

W. Curtis Preston:

And as we discussed, I have, well, I don't think we discussed it on the pod,

W. Curtis Preston:

but as part of this project I found out I have 11,000 photos in, in iCloud.

W. Curtis Preston:

So,

Prasanna Malaiyandi:

isn't as much as a lot of other people, you know.

Prasanna Malaiyandi:

I'm sure there are

W. Curtis Preston:

know, I am not,

Prasanna Malaiyandi:

you.

W. Curtis Preston:

yeah.

W. Curtis Preston:

As, as I, as you and I were talking earlier, I'm not, you

W. Curtis Preston:

know, on one end, I'm not Cecil B. DeMille and I'm not, you know,

W. Curtis Preston:

photographing and filming everything.

W. Curtis Preston:

On the other hand, I'm not Prasanna because you use your phone camera,

W. Curtis Preston:

like you use your, uh, Tesla,

Prasanna Malaiyandi:

Yeah, pretty much.

Prasanna Malaiyandi:

Yeah.

Prasanna Malaiyandi:

Almost never

W. Curtis Preston:

Okay.

W. Curtis Preston:

So you've had your Tesla, how long now

Prasanna Malaiyandi:

four years,

W. Curtis Preston:

and how many miles do you have on it?

Prasanna Malaiyandi:

I think like 11,600, 700, something like that.

Prasanna Malaiyandi:

And it works great because my maintenance cost has been zero.

Prasanna Malaiyandi:

My electricity cost is really minimal versus gas,

W. Curtis Preston:

Cars are incredibly reliable when you don't use them.

W. Curtis Preston:

Um, anyway, yeah,

Prasanna Malaiyandi:

Gas-powered cars, even if you don't use them,

Prasanna Malaiyandi:

you still gotta change the oil.

Prasanna Malaiyandi:

You still gotta do everything else, you know?

Prasanna Malaiyandi:

So,

W. Curtis Preston:

I'm not, we're not doing a, we're not

W. Curtis Preston:

doing an e-car thing.

W. Curtis Preston:

But anyway, but yeah, we've, we've been having some fun with, uh, with this

W. Curtis Preston:

project of figuring out the various ways.

W. Curtis Preston:

Right.

W. Curtis Preston:

Uh,

Prasanna Malaiyandi:

and, and I think, yeah.

Prasanna Malaiyandi:

I was just gonna say, I think you should mention to the listeners what you're

Prasanna Malaiyandi:

current, what you are currently trying to do for backing up your iCloud photos.

W. Curtis Preston:

my current, uh, uh, uh, I don't know what,

W. Curtis Preston:

I don't know what, what, what, I don't know these different methods.

W. Curtis Preston:

My current method that I am trying is Google Photos, and it turns

W. Curtis Preston:

out Google Photos, it, it's the only one that I've found so far.

W. Curtis Preston:

Uh, well, the only one that I've.

W. Curtis Preston:

Well, there's maybe one other, which is iDrive, but Google Photos, cuz

W. Curtis Preston:

the problem is that on an iPhone you can turn on optimized storage.

W. Curtis Preston:

And so I have like, I don't know, somewhere between sixty

W. Curtis Preston:

and a hundred gigabytes,

W. Curtis Preston:

we're not quite sure, of photos up in, uh, iCloud, but I only have four

W. Curtis Preston:

and a half gigabytes on my phone because it's, I'm using the optimized storage.

W. Curtis Preston:

But apparently Google Cloud photo, Google Photos pulls down a high-res,

W. Curtis Preston:

whatever, the original version from

W. Curtis Preston:

iCloud and then backs

Prasanna Malaiyandi:

that, that's our theory.

Prasanna Malaiyandi:

That's our theory.

W. Curtis Preston:

That's the theory.

W. Curtis Preston:

Well, it's, I, that's what it says in the documentation.

W. Curtis Preston:

We shall see what we shall see.

W. Curtis Preston:

Um, and then we will report on the results here and then,

W. Curtis Preston:

and, and I'll blog about it.

W. Curtis Preston:

On Backup Central, uh, because iCloud is not a backup.

W. Curtis Preston:

The number of articles that I read that told me to use iCloud to

W. Curtis Preston:

back up my iPhone pissed me off.

W. Curtis Preston:

Right?

W. Curtis Preston:

Like, it, it was like 95% of the articles that I found on how to back up, uh,

W. Curtis Preston:

my photos basically said, oh, iCloud.

W. Curtis Preston:

I'm like, ah.

Prasanna Malaiyandi:

Because.

Prasanna Malaiyandi:

Because for most consumers, right, they're probably not going to do what

Prasanna Malaiyandi:

you're about to do, and they don't care.

Prasanna Malaiyandi:

And so turn on iCloud.

Prasanna Malaiyandi:

At least you have something else other than whatever's on your phone,

W. Curtis Preston:

Yeah.

W. Curtis Preston:

So we're gonna, we're gonna have an answer for the three people in the world,

W. Curtis Preston:

all of which are probably already on this recording, the three people in the

W. Curtis Preston:

world that actually care about having an actual backup of their, of their photos.

W. Curtis Preston:

Anyway, all right.

W. Curtis Preston:

Well, we're gonna bring back a, a longtime friend and a

W. Curtis Preston:

returning guest to our podcast.

W. Curtis Preston:

Uh, he is one of the few people in this industry that, um, make me feel young.

W. Curtis Preston:

Uh, we welcome.

W. Curtis Preston:

We welcome.

W. Curtis Preston:

And he's also the, uh, the technologist extraordinary and

W. Curtis Preston:

plenipotentiary at Vast Data.

W. Curtis Preston:

Welcome to the podcast, Howard Marks.

W. Curtis Preston:

How's it going, Howard?

Howard Marks:

I'm really happy to be here cuz you know you guys went on your

Howard Marks:

little podcast and you said something about using flash for backup being stupid

W. Curtis Preston:

Yeah.

W. Curtis Preston:

Yeah.

W. Curtis Preston:

I might have

Prasanna Malaiyandi:

But, but, but, but wait, I wanna clarify,

Prasanna Malaiyandi:

Howard, that was from Curtis.

Prasanna Malaiyandi:

I was the one who was like, yes.

W. Curtis Preston:

Oh, he's

Prasanna Malaiyandi:

What about Howard?

Prasanna Malaiyandi:

Oh,

W. Curtis Preston:

throwing me right under the

Prasanna Malaiyandi:

throw you under the,

W. Curtis Preston:

So, we'll, we'll, we'll

Howard Marks:

you don't have to explain that to me.

Howard Marks:

I've known Curtis 35 years.

W. Curtis Preston:

that's that.

W. Curtis Preston:

We will, we will, uh, we will, we'll get to that topic.

W. Curtis Preston:

I will give you a chance to defend your, your, your honor.

W. Curtis Preston:

Um, why don't we start with an update.

W. Curtis Preston:

It's been a while since we've had you on the pod.

W. Curtis Preston:

Why don't we start with an update on, uh, how much more vast Vast

W. Curtis Preston:

Data is, uh, since we had you on.

Howard Marks:

Well, you know, from the financial side, um, we announced at

Howard Marks:

the beginning of this year that we've hit a hundred million a year in annual

Howard Marks:

recurring revenue cuz we've organized ourselves as a software company even

Howard Marks:

though the experience customers have is, look, appliances in my data center,

Howard Marks:

call Vast when something goes wrong.

Howard Marks:

Um, we arrange for customers to buy the hardware so that we are a software

Howard Marks:

company, makes life easier for us.

Howard Marks:

Um, the other big thing is that our friends at HPE

Howard Marks:

just made an announcement of a product of theirs called GreenLake Files

Howard Marks:

that will be powered by our software.

Howard Marks:

So before today, if you wanted a scale-out, expandable, low-cost all-flash file

Howard Marks:

and object system, we would facilitate your buying hardware from the OEMs that we

Howard Marks:

deal with, and we'd sell you the software and you'd have a system that was running.

Howard Marks:

Now you can buy that from HPE as part of GreenLake, and that includes management

Howard Marks:

through the GreenLake Cloud front end.

Howard Marks:

So you can manage the GreenLake for files along with GreenLake for Block

Howard Marks:

and the Compute and all the other servers that are part of GreenLake.

Howard Marks:

So they've taken our software, married it to their control plane

Howard Marks:

and run it on their hardware.

Prasanna Malaiyandi:

And is for, sorry, for those who may

Prasanna Malaiyandi:

not be familiar with GreenLake.

Prasanna Malaiyandi:

GreenLake is more of a, I don't know, a managed or a hosted environment done by

Howard Marks:

It, it, it, it's as-a-service-y.

Howard Marks:

So there, there are both consumption and CapEx models the way I understand it.

Howard Marks:

But you know, you don't log into the block array and create a LUN.

Howard Marks:

You go to the cloud website and you create a LUN and their

Howard Marks:

control plane does that for you.

Howard Marks:

And so it's got more controls and you don't have to keep the detail.

Prasanna Malaiyandi:

Yeah.

W. Curtis Preston:

So then you're, you're probably paying

W. Curtis Preston:

for what you provision then,

Howard Marks:

Uh, you can do that or, you know, either way.

Howard Marks:

And, you know, that's, that's, you know, kind of an HPE finance question.

Howard Marks:

But my understanding is they do it either way.

Prasanna Malaiyandi:

And I'm sure for Vast Data, right.

Prasanna Malaiyandi:

That's a huge win, being part of that offering.

Howard Marks:

Well, it's, you know, first of all, it just gets hundreds and hundreds

Howard Marks:

of boots on the ground out selling our software and our whole concept of you

Howard Marks:

can do all flash for as cheap or cheaper than other guys can do spinning disk.

Howard Marks:

And why do you want spinning disks as opposed to flash,

Howard Marks:

other than that they're cheaper?

Howard Marks:

You know, I can't think of another advantage.

Howard Marks:

Um, and so we, you know, for most workloads have narrowed

Howard Marks:

that down or reversed it, and so, all-flash cheaper than disk.

Howard Marks:

What a great idea.

Howard Marks:

Um, so we've got, you know, a, all those HPE sales guys going out there selling it

Howard Marks:

as an HPE product, you know, it's not like Qumulo or Scality, where HPE was reselling

Howard Marks:

those products to run on HPE servers.

Prasanna Malaiyandi:

Mm-hmm.

Howard Marks:

Um, and you know, who you called for support was,

Howard Marks:

well, is this a server problem or is this a software problem?

Howard Marks:

It's HPE GreenLake for files.

Howard Marks:

HPE takes the support calls.

Howard Marks:

It's a full HPE product.

Howard Marks:

Um, it's our software underneath it.

W. Curtis Preston:

Yeah, it's interesting.

W. Curtis Preston:

So you know, you talked about the, you, you know, you said you, you.

W. Curtis Preston:

You, you have ARR.

W. Curtis Preston:

So basically your customers are paying an annual fee to you based

W. Curtis Preston:

on the size of their storage

Howard Marks:

we, so, so we make, we make our money on a, what

Howard Marks:

we call a Gemini subscription.

Howard Marks:

That is, you know, in capacity units, we sell it at a hundred terabytes.

Howard Marks:

HPE can sell it in different ways, um, and it's per year.

Howard Marks:

And we guarantee that we'll write that agreement for any piece

Howard Marks:

of hardware for 10 years at the

W. Curtis Preston:

right, right.

W. Curtis Preston:

right.

W. Curtis Preston:

I remember

Howard Marks:

Because, you know, it's not spinning disks.

Howard Marks:

They don't start failing a lot more often in year five and six.

Howard Marks:

And so if you want to keep it for 10 years, keep it for 10 years.

Howard Marks:

If you decide you want to replace some of your hardware in year five

Howard Marks:

or six, because the new denser or faster hardware is more attractive to

Howard Marks:

you, uh, but you bought seven years of support, we'll transfer it on the,

Howard Marks:

you know, terabyte per terabyte basis.

W. Curtis Preston:

Right.

W. Curtis Preston:

Gotcha.

W. Curtis Preston:

Um, yeah, that's a pretty good deal for you.

W. Curtis Preston:

And by the way, I'll, I'll, um, I, I was gonna, uh, compare it to something

W. Curtis Preston:

else, but, but it, it reminded me of our, uh, disclaimer. Prasanna

W. Curtis Preston:

and I work for different companies.

W. Curtis Preston:

I work for myself, he works for Zoom.

W. Curtis Preston:

And, uh, these are our opinions, not theirs.

W. Curtis Preston:

And, uh, be sure to rate us at, uh, your favorite podcast or give

W. Curtis Preston:

us all the stars and comments.

W. Curtis Preston:

It helps other people find us.

W. Curtis Preston:

If you think we're amazing, then maybe other people will do so as well.

W. Curtis Preston:

Uh, reach out to me, uh, @wcpreston on Twitter, or wcurtispreston

W. Curtis Preston:

at gmail, and, you know, to be part of the conversation.

W. Curtis Preston:

And we'll see.

W. Curtis Preston:

Um, you know, we'll get you on.

W. Curtis Preston:

So your arrangement with HPE reminds me of our arrangement with Dell.

W. Curtis Preston:

Basically it's the whole boots on the ground thing.

W. Curtis Preston:

Uh, you get to put your product in front of a whole, you know, giant number of

W. Curtis Preston:

other people and it's great for you.

W. Curtis Preston:

It's good for them.

W. Curtis Preston:

Their customers get the benefit of your, uh, technology with, with the company

W. Curtis Preston:

that they already, you know, know and

Howard Marks:

And you know, and, and we all know that there are loyal HPE

Howard Marks:

customers who you know now it's a lot more likely they'll buy this product cuz

Howard Marks:

it's got that stamp of approval on it,

Howard Marks:

all of which works for us.

W. Curtis Preston:

Yeah.

W. Curtis Preston:

Yeah.

W. Curtis Preston:

Well, congratulations on hitting a hundred million.

W. Curtis Preston:

Um, wish you the best of luck on your way to the, you know,

W. Curtis Preston:

doubling that and tripling that.

W. Curtis Preston:

Um, last time you were on, we talked about, we, we alluded to, I think,

W. Curtis Preston:

a little bit about how you do dedupe or dedupe-like stuff that's a little

W. Curtis Preston:

different than the rest of the world.

W. Curtis Preston:

And, and, and that it's better, you know, these are the, you know, the,

W. Curtis Preston:

the, you're saying it's better.

W. Curtis Preston:

So I, I want to give you a chance to talk about that, and then

Howard Marks:

we, we guarantee it's, we guarantee it's better because I'm

Howard Marks:

a vendor and without a guarantee you shouldn't believe anything I say.

W. Curtis Preston:

Okay.

W. Curtis Preston:

All right.

W. Curtis Preston:

That sounds good.

W. Curtis Preston:

So how so how, so first off, how is it

W. Curtis Preston:

better, and then why?

Howard Marks:

It's better cause it reduces data further.

Howard Marks:

Um, and the why is how it works.

Howard Marks:

So, you know, at the beginning it's really pretty simple.

Howard Marks:

We do variable chunk deduplication with a variation of the Rocksoft method.

Howard Marks:

So if there are insertions, we re-sync relatively quickly and the

Howard Marks:

deduplication gets more effective.
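
A minimal sketch of the variable-chunk idea described here, not Vast's actual code: chunk boundaries are chosen from the content itself, so after an insertion the boundaries line up again a little further along and the unchanged chunks still dedupe. The names `split_chunks` and `shared_chunks` and the constants `WINDOW`, `MIN_CHUNK`, and `MASK` are assumptions made up for this illustration, and hashing every window with BLAKE2 stands in for the rolling hash a real system would use.

```python
import hashlib

WINDOW = 16      # bytes of trailing context that decide a boundary (illustrative value)
MIN_CHUNK = 16   # never cut a chunk shorter than this (illustrative value)
MASK = 0x3F      # cut when the low 6 bits of the window hash are zero

def split_chunks(data: bytes) -> list:
    """Cut data wherever the hash of the last WINDOW bytes matches the mask."""
    chunks, start = [], 0
    for end in range(1, len(data) + 1):
        if end - start < MIN_CHUNK:
            continue
        window = data[end - WINDOW:end]
        digest = hashlib.blake2b(window, digest_size=4).digest()
        if int.from_bytes(digest, "big") & MASK == 0:
            chunks.append(data[start:end])
            start = end
    if start < len(data):
        chunks.append(data[start:])  # whatever is left becomes the final chunk
    return chunks

def shared_chunks(a: bytes, b: bytes) -> int:
    """Count distinct chunks of a that also appear in b."""
    return len(set(split_chunks(a)) & set(split_chunks(b)))
```

With fixed-size chunks, inserting a few bytes at the front would shift every later chunk and nothing would match; here most chunks after the insertion are expected to be identical again.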

W. Curtis Preston:

Mm-hmm.

Howard Marks:

Um, we do Zstandard compression on the data.

Howard Marks:

We then throw some data-specific compression algorithms at the data for

Howard Marks:

things like, oh look, it's numeric data.

Howard Marks:

Well that means it's only gonna vary within this range.

Howard Marks:

We'll store deltas.

Howard Marks:

And so whichever of those compression methods reduces this block of data most.

W. Curtis Preston:

Mm-hmm.

Howard Marks:

we run.

Howard Marks:

Um, because we're doing so the, the data path is writes go to storage class memory.

W. Curtis Preston:

Mm-hmm.

Howard Marks:

then get ACKed, and then all of this data reduction happens

Howard Marks:

as we migrate from that write buffer to the capacity flash layer.

Howard Marks:

And since it's after the ACK, as long as we're draining that buffer fast

Howard Marks:

enough, how long it takes to move any piece is irrelevant.

Howard Marks:

And so we have time to go, ah, let's try five different compression algorithms.

Howard Marks:

Use whichever one works best.
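
A rough sketch of the "try several, keep the smallest" step, under stated assumptions: Python's standard-library codecs (`zlib`, `bz2`, `lzma`) stand in for Zstandard and for whatever algorithms Vast actually runs, and `delta_then_deflate` is a made-up example of the "it's numeric data, store deltas" trick. The names `CODECS` and `smallest_encoding` are hypothetical.

```python
import bz2
import lzma
import struct
import zlib

def delta_then_deflate(block: bytes) -> bytes:
    """Read the block as little-endian uint32 values, keep successive differences, then deflate."""
    count = len(block) // 4
    values = struct.unpack(f"<{count}I", block[:count * 4])
    # First delta is the first value itself; the rest are differences modulo 2**32.
    deltas = [(cur - prev) & 0xFFFFFFFF for prev, cur in zip((0,) + values, values)]
    return zlib.compress(struct.pack(f"<{count}I", *deltas) + block[count * 4:])

# Stand-in candidates; the real set of algorithms is not named in the conversation.
CODECS = {
    "zlib": zlib.compress,
    "bz2": bz2.compress,
    "lzma": lzma.compress,
    "delta+zlib": delta_then_deflate,
}

def smallest_encoding(block: bytes):
    """Run every codec on the block and return (codec_name, encoded_bytes) for the smallest result."""
    encoded = {name: codec(block) for name, codec in CODECS.items()}
    best = min(encoded, key=lambda name: len(encoded[name]))
    return best, encoded[best]
```

A real system would also have to record which codec won alongside the block so it can be decoded later. Because this runs after the write has been acknowledged, the extra passes cost compute but not client latency.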

W. Curtis Preston:

Interesting.

W. Curtis Preston:

Yeah.

Prasanna Malaiyandi:

do you do,

W. Curtis Preston:

go ahead.

W. Curtis Preston:

Go ahead.

Prasanna Malaiyandi:

do you?

Prasanna Malaiyandi:

And that's actually very interesting how you can, like you said, by

Prasanna Malaiyandi:

storing it in memory, right.

Prasanna Malaiyandi:

You're not impacting client latencies at all.

Prasanna Malaiyandi:

Right.

Prasanna Malaiyandi:

For them it's like, hey, write

Prasanna Malaiyandi:

went through.

Prasanna Malaiyandi:

And then you have this time to do the parallel, uh, computation.

Howard Marks:

Yeah, just, just, except

Howard Marks:

it's storage class memory, so it's an SSD, so it's persistent and

Howard Marks:

there's no batteries and protection and, you know, a panic when power goes

Prasanna Malaiyandi:

Yeah.

Prasanna Malaiyandi:

Now, when you are running these algorithms, like I know AI and ML

Prasanna Malaiyandi:

is all the hot topic everywhere you look these days, right?

Prasanna Malaiyandi:

Are you guys doing anything around that in terms of trying to smartly detect

Prasanna Malaiyandi:

which compression algorithms based on

Howard Marks:

We we're, we're not doing that in the data path right now.

Howard Marks:

You know, frankly, the running the five doesn't use that much

Howard Marks:

compute that it's worth it.

Howard Marks:

Um, we're using AI in our cloud platform, so if you have multiple

Howard Marks:

clusters, there's a cloud site you can go to and see one dashboard.

Howard Marks:

Um, and we're using it for the capacity projections.

Howard Marks:

So it's like, oh look, here's how much capacity you're gonna need

Howard Marks:

six months from now while you're filling out your budget request.

Howard Marks:

Let me tell you what you're gonna, there's AI behind that so that it

Howard Marks:

smooths things like, oh look, every three months they do a cleanup.

Howard Marks:

And so let me factor that the AI is good enough to factor that kind of thing

Howard Marks:

in, but not in the data path.

Howard Marks:

But let's get back to the data path.

Prasanna Malaiyandi:

Yep.

W. Curtis Preston:

yeah, your, um, your comment when you, you know, there

W. Curtis Preston:

was a comment you were like, as long as we're clearing the buffer quick enough.

W. Curtis Preston:

Um, and, and I would agree with you, um, you know, how, how do you ensure that

W. Curtis Preston:

that happens, I guess is, is one question.

Howard Marks:

Well, first of all, it becomes a parallelism issue.

Howard Marks:

So we have a large number of compute nodes, all of which are

Howard Marks:

draining this buffer in parallel.

Howard Marks:

And so when the buffer hits a high water mark, more threads

Howard Marks:

to destage it get spawned and allocated across the parallel system.

Howard Marks:

Now, if there's a huge influx of writes, and you know, we're talking

Howard Marks:

tens of gigabytes per second for hours on the smallest system,

W. Curtis Preston:

Mm-hmm.

Howard Marks:

Um, then we'll start introducing latency into the

Howard Marks:

writes and apply back pressure.
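
One way to picture the policy described here, as a sketch only: below a high water mark the buffer drains with a baseline number of workers, above it more destage workers are added, and only near full does write latency get introduced as back pressure. The function name `destage_plan` and every threshold and count in it are assumptions invented for illustration, not Vast's numbers.

```python
def destage_plan(fill, high_water=0.6, critical=0.9,
                 base_workers=4, max_workers=32, max_delay_ms=50.0):
    """Return (destage_workers, added_write_delay_ms) for a buffer fill fraction in [0, 1]."""
    if not 0.0 <= fill <= 1.0:
        raise ValueError("fill must be between 0 and 1")
    if fill < high_water:
        # Normal operation: baseline drain rate, no back pressure.
        return base_workers, 0.0
    if fill < critical:
        # Past the high water mark: scale workers up, still no added latency.
        frac = (fill - high_water) / (critical - high_water)
        return base_workers + round(frac * (max_workers - base_workers)), 0.0
    # Nearly full: all workers busy, start slowing incoming writes.
    frac = (fill - critical) / (1.0 - critical)
    return max_workers, frac * max_delay_ms
```

The design choice worth noting is the ordering: parallelism is spent first, and added latency is the last resort.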

W. Curtis Preston:

Okay.

W. Curtis Preston:

Okay, that makes

Howard Marks:

But, but you know, that's, you know, literally,

Howard Marks:

you know, it, it don't

Howard Marks:

ha it, you know, the mechanism is there just in case,

W. Curtis Preston:

right.

Howard Marks:

happen.

Prasanna Malaiyandi:

And when you, and because you have more than more

Prasanna Malaiyandi:

capacity, I guess, more throughput at the capacity level than at the storage

Prasanna Malaiyandi:

media level, is that why you can just increase the number of parallel

Prasanna Malaiyandi:

threads and you don't have to worry about the backend being a bottleneck?

Howard Marks:

so in, in, so our, our building block, we call a DBox

Howard Marks:

or a data box, and it's got some

Howard Marks:

SCM SSDs.

Howard Marks:

We started with Optane.

Howard Marks:

We now mostly use Kioxia FL6, and then a larger number of capacity SSDs.

Howard Marks:

And they, you know, it's whatever the cheapest we can get or the cheapest that

Howard Marks:

our OEMs use is. Um, the PCIe lanes

Howard Marks:

feeding the small number of SCM SSDs are generally the bottleneck.

Prasanna Malaiyandi:

Okay.

Howard Marks:

And so we can parallelize reading data out of SCM. The

Howard Marks:

writing to the capacity,

Howard Marks:

we have a lot more capacity SSDs, so there's plenty of bandwidth to write

Prasanna Malaiyandi:

Gotcha.

W. Curtis Preston:

So what, why'd you stop using Optane?

Howard Marks:

Um, well first we decided just to get a second

Howard Marks:

source because it's a good idea.

Howard Marks:

Um, and then I, Intel

W. Curtis Preston:

turned out to be a really good

Howard Marks:

of, then, then Intel decided to get out of the business.

Howard Marks:

Um,

Howard Marks:

and, you know, we have supply agreements with Intel.

Howard Marks:

They still have a warehouse full of wafers.

Howard Marks:

Um, but it, you know, the, the performance advantage wasn't worth the complexity.

Howard Marks:

So we've chunked into these variable-sized, 32K-average blocks and we dedupe them.

Howard Marks:

But in addition to running a strong hash

Howard Marks:

to validate identical, we run a series of weaker hashes against the same

Howard Marks:

data blocks, and these weaker hashes are designed to generate the same

Howard Marks:

hash value for inputs across a narrow range of cryptographic distance.

Howard Marks:

So if two blocks have a sm, so cryptographic distance is the

Howard Marks:

number of bits you have to flip to turn block A into block B.

Howard Marks:

If block A is within X bits of block B, this hash will

Howard Marks:

generate the same hash value
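
The distance defined here, the number of bits you have to flip to turn one block into another, is what is commonly called Hamming distance. A small sketch of that measurement, assuming two blocks of equal length (the conversation does not say how blocks of different sizes are compared); `bit_distance` is a name made up for this example.

```python
def bit_distance(a: bytes, b: bytes) -> int:
    """Count the bit positions where two equal-length blocks differ."""
    if len(a) != len(b):
        raise ValueError("blocks must be the same length")
    # XOR leaves a 1 bit wherever the inputs differ; count those bits per byte.
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))
```

A similarity hash in the sense described would give the same value for two blocks whenever this count is small; how such a hash is built is not covered in the conversation.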

W. Curtis Preston:

Okay.

Howard Marks:

from a data reduction point of view.

Howard Marks:

If two blocks generate the same hash value and are a small cryptographic

Howard Marks:

distance apart, they have long common strings between them

Howard Marks:

and will therefore compress with the same compression dictionary.

Howard Marks:

So the first block that generates one of these similarity hashes, we just

Howard Marks:

compress and store. When the second through Nth block generates the same hash,

Howard Marks:

we recall the first one and we use the dictionary from the first

Howard Marks:

one to compress the second one

Prasanna Malaiyandi:

So you get better compression

Howard Marks:

we can store it compressed without the overhead of

Howard Marks:

storing the dictionary a second time.
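
A hedged sketch of why compressing a similar block against an earlier one pays off, using zlib's preset-dictionary support from the Python standard library. Here the first block itself is handed to zlib as the dictionary, which is a stand-in for reusing the first block's compression dictionary as described; the pairing of blocks is taken as given, since the similarity hash is not specified. The function names are hypothetical.

```python
import zlib

def compress_alone(block: bytes) -> bytes:
    """Compress a block with no outside context."""
    return zlib.compress(block, 9)

def compress_with_reference(block: bytes, reference: bytes) -> bytes:
    """Compress a block using an earlier, similar block as the preset dictionary."""
    compressor = zlib.compressobj(level=9, zdict=reference)
    return compressor.compress(block) + compressor.flush()

def decompress_with_reference(payload: bytes, reference: bytes) -> bytes:
    """Reverse compress_with_reference; the same reference block must be supplied."""
    decompressor = zlib.decompressobj(zdict=reference)
    return decompressor.decompress(payload) + decompressor.flush()
```

When the second block differs from the reference in only a few places, the output is little more than pointers into the reference plus the differing bytes. The trade-off is that reading the second block back requires fetching the first one too.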

Prasanna Malaiyandi:

yeah.

W. Curtis Preston:

Yeah.

W. Curtis Preston:

Yeah.

Howard Marks:

and it becomes essentially the difference,

Prasanna Malaiyandi:

Yeah.

Prasanna Malaiyandi:

So instead of storing a bigger block, it's like just very, very small Deltas

Prasanna Malaiyandi:

because they are cryptographically

Howard Marks:

right?

W. Curtis Preston:

that,

W. Curtis Preston:

that's

Howard Marks:

similar.

Howard Marks:

The mathematicians would say it's a limited cryptographic distance.

Prasanna Malaiyandi:

That's unique.

Prasanna Malaiyandi:

I've never heard of someone doing that.

Prasanna Malaiyandi:

Have you, Curtis?

W. Curtis Preston:

just this guy that we had on the podcast a little

Prasanna Malaiyandi:

Yeah.

W. Curtis Preston:

that looks a lot like Howard.

Howard Marks:

It, it, you know, nobody else is doing it now.

Howard Marks:

Um,

W. Curtis Preston:

Is it, are you patenting it or,

Howard Marks:

there are patents around it.

Howard Marks:

I don't, I haven't looked to see exactly

W. Curtis Preston:

gotcha.

W. Curtis Preston:

Gotcha.

Howard Marks:

to.

W. Curtis Preston:

Yeah.

W. Curtis Preston:

That

Howard Marks:

Um,

Howard Marks:

cause reading patent applications makes my brain hurt.

W. Curtis Preston:

I see, I thought that you would, let's

W. Curtis Preston:

say you got two chunks, right?

W. Curtis Preston:

And you run the really weak, but much faster, I'm assuming, uh, hashing

W. Curtis Preston:

algorithm, and that you would say these two blocks definitely aren't the same.

W. Curtis Preston:

And so let's not do anything else other than com.

W. Curtis Preston:

They're not, they're nowhere.

W. Curtis Preston:

They're, they're, they're cryptographic distance.

W. Curtis Preston:

I think you said so far apart, there's no point in running the

W. Curtis Preston:

stronger dedupe, uh, thing on it.

W. Curtis Preston:

Um, that's

W. Curtis Preston:

where I

Howard Marks:

it, it turns out, it turns out even with a weak hash, the

Howard Marks:

number of identical hashes that are not identical data is so small that

Howard Marks:

the cost of testing is ignorable.

Prasanna Malaiyandi:

Yeah.

Prasanna Malaiyandi:

And especially if it's in flash, like probably

Howard Marks:

the,

Prasanna Malaiyandi:

is

Howard Marks:

compare is so rare.

Howard Marks:

It doesn't matter that it's expensive.

Prasanna Malaiyandi:

Yeah.

Prasanna Malaiyandi:

And the fact that you're not doing this in line, right.

Prasanna Malaiyandi:

So it's all been de

Prasanna Malaiyandi:

Or

Howard Marks:

it, it's not trench.

Howard Marks:

It's in line, but it's not.

Prasanna Malaiyandi:

client.

Howard Marks:

ACK,

Prasanna Malaiyandi:

Yeah, yeah.

Prasanna Malaiyandi:

Exactly.

Howard Marks:

it's, it's, it's post-ACK, so it doesn't have any impact on latency.

Howard Marks:

But you know, the, the SCM is a one-way write buffer.

Howard Marks:

We write new data into it, it gets demoted to the capacity flash layer

Howard Marks:

and there's so much bandwidth in the capacity flash layer that reads from

Howard Marks:

there are actually faster than from the SCM.

Howard Marks:

So there's no reason ever to promote it back.

W. Curtis Preston:

Right,

Howard Marks:

Um, but the other thing is we keep all the metadata in that SCM.

Howard Marks:

So as you expand the system, you add another enclosure that's got more SCM

Howard Marks:

and more capacity, the dedupe hash table and the similarity hash tables all grow

Howard Marks:

with it.

Howard Marks:

So it's one data reduction realm regardless of how big a cluster is.

Howard Marks:

We don't have to store that dedupe table in memory.

Prasanna Malaiyandi:

Yep.

Howard Marks:

And so, you know, the whole "well, flash would be great for backup,

Howard Marks:

except I can't afford it" is, well,

Howard Marks:

if you've got three or four conventional PBBAs,

Prasanna Malaiyandi:

Mm-hmm.

Howard Marks:

you know, first of all, the vendors of PBBAs charge

Howard Marks:

you a lot for that disk storage.

Howard Marks:

You know, they, that's a high margin product.

Prasanna Malaiyandi:

Yeah.

Howard Marks:

as soon as you have two of them, you have two deduplication realms.

W. Curtis Preston:

Right,

Howard Marks:

And we might talk about data deduping 10 to one.

Howard Marks:

That doesn't mean all your data dedupes 10 to one.

W. Curtis Preston:

right.

Howard Marks:

50% of your data at least is unique.

Prasanna Malaiyandi:

Yep,

Howard Marks:

Some of your data dedupes a hundred or a thousand to one, and most

Howard Marks:

of the benefit you get is from that data that dedupes a hundred or a thousand to one.

Howard Marks:

Well, when you got two boxes, it's not a hundred or a thousand, it's 50 to 500.
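
The arithmetic behind "50 to 500" can be written down directly, under the simplifying assumption that copies of a chunk are spread across the boxes so that each box that sees the chunk has to store it once. `effective_ratio` is a name invented for this sketch.

```python
def effective_ratio(copies: int, realms: int) -> float:
    """Dedupe ratio for a chunk seen `copies` times when backups are spread over `realms` boxes."""
    # Each separate dedupe realm keeps its own single copy of the chunk.
    stored = min(copies, realms)
    return copies / stored
```

So `effective_ratio(100, 1)` is 100.0 but `effective_ratio(100, 2)` is 50.0, and `effective_ratio(1000, 2)` is 500.0; with four boxes the thousand-to-one data drops to 250.0.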

Prasanna Malaiyandi:

yep.

Prasanna Malaiyandi:

And every time you add a new box, you're, you lose some of that benefit as well.

Prasanna Malaiyandi:

Right.

Prasanna Malaiyandi:

So,

W. Curtis Preston:

you can dedupe all our episodes down to

W. Curtis Preston:

like four or five comments.

W. Curtis Preston:

Right?

W. Curtis Preston:

Like back up, back up, all the things.

Howard Marks:

Well that

W. Curtis Preston:

3, 2, 1.

W. Curtis Preston:

Just 3, 2, 1.

Howard Marks:

that requires the next version of, uh, AI deduplication

Howard Marks:

that can take out the idle banter.

W. Curtis Preston:

Exactly, exactly.

W. Curtis Preston:

Our episodes will be like five minutes long.

Prasanna Malaiyandi:

So just to summarize or just to close on that, so

Prasanna Malaiyandi:

we talked about the how you guys do it.

Prasanna Malaiyandi:

So because of all these technologies that you're leveraging or mechanisms,

Prasanna Malaiyandi:

right, that's how you're able to offer that guarantee, right?

Prasanna Malaiyandi:

That's better than anyone else.

Howard Marks:

Yeah, we, you know, we use Zstandard.

Howard Marks:

It's a slightly newer compression algorithm than anybody else

Howard Marks:

does, cuz we started a little bit later than everybody else.

Howard Marks:

So we got to pick the latest one.

Howard Marks:

Um, and then we have those, you know, the additional, well, oh, they're numbers,

Howard Marks:

let's just store the differences, tricks.

Howard Marks:

And then we do deduplication on variable block, which is

Howard Marks:

as well as anybody does it.

Howard Marks:

And then we throw in similarity as, oh, here's another unique

Howard Marks:

trick nobody else does.

Howard Marks:

And so the combination is, we are confident that as long as you're send,

Howard Marks:

you know, we guarantee as long as you're sending us unencrypted data,

Howard Marks:

that we'll reduce it better than the other guy, whoever the other guy is.

Howard Marks:

And if we don't, we'll provide the capacity so that you

Howard Marks:

didn't pay any more money.

Howard Marks:

Cuz

Prasanna Malaiyandi:

Yeah.

Prasanna Malaiyandi:

Yeah.

Prasanna Malaiyandi:

No, that's a great guarantee for end users and customers, especially

Prasanna Malaiyandi:

with budgets these days, right?

Prasanna Malaiyandi:

It's like, Hey, I bought this system.

Prasanna Malaiyandi:

It doesn't quite meet my expectations.

Prasanna Malaiyandi:

I can't go back to my boss and ask for more money.

Prasanna Malaiyandi:

So

Howard Marks:

Well, you know, the other side of that is, you

Howard Marks:

know, just, it's really a very simple scale out architecture.

Howard Marks:

So you don't buy today what you think you're gonna need in three years.

Howard Marks:

You buy today what you think you're gonna need in a year, and then you

Howard Marks:

can buy more when you need it later.

Howard Marks:

Or if, as some of our customers have found out, much to the chagrin of

Howard Marks:

our sales guys, their data reduces better than they expected and they

Howard Marks:

don't need any more in the next year.

Howard Marks:

Well then you're just ahead of the game.

Prasanna Malaiyandi:

So, and I know maybe you could talk in generalities,

Prasanna Malaiyandi:

but sort of like if I was a customer who had one of the competition's PBBAs, right.

Prasanna Malaiyandi:

And I now use VAST, right.

Prasanna Malaiyandi:

I buy a VAST system, sort of like, is there, like what is the savings that

Prasanna Malaiyandi:

I normally see in terms of storage?

Prasanna Malaiyandi:

Like if I had like a hundred terabyte PBBA, actual

Howard Marks:

if, if you, you know, a hundred terabytes is small for us.

Prasanna Malaiyandi:

okay.

Prasanna Malaiyandi:

Or say

Prasanna Malaiyandi:

a

Howard Marks:

So if you had, if you had a petabyte PBBA, um, then you

Howard Marks:

know, you're probably storing four or five petabytes of logical data on

Howard Marks:

it.

Howard Marks:

Um, and you bought a, you know, a petabyte of usable from us and you'd

Howard Marks:

probably store 25 or 30% more on it.

Prasanna Malaiyandi:

Okay.

Howard Marks:

But that petabyte PBBA is as big as you can buy that PBBA;

Howard Marks:

there isn't a two-petabyte PBBA.

Prasanna Malaiyandi:

Yep.

Howard Marks:

And the real difference is at restore time

Prasanna Malaiyandi:

Hmm.

Howard Marks:

because PBBAs are scaled

Howard Marks:

for backup speed, not restore speed.

Howard Marks:

They don't even have restore speed on the spec sheet anymore.

Howard Marks:

And backups are not sequential operations nearly as much as

Howard Marks:

you think they used to be.

Howard Marks:

And you

Howard Marks:

know, when

Howard Marks:

Curtis, when, when, Curtis, changed block tracking, incremental forever,

W. Curtis Preston:

Right?

Howard Marks:

of those things make the backup and the restore much more random.

Prasanna Malaiyandi:

yep,

Howard Marks:

And so if you're backing up to a disk-based PBBA,

Howard Marks:

your restore speed is like a fourth or a fifth of your backup speed.

Prasanna Malaiyandi:

yep.

Howard Marks:

If you're backing up to a VAST, your restore speed

Howard Marks:

is five times your backup speed.

Howard Marks:

Cause we're, cuz we are designed to serve

Howard Marks:

primary storage applications where reads happen much more frequently

Howard Marks:

than writes cuz the reads come from all the capacity SSDs, the

Howard Marks:

writes have to go to the SCM.
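
To put Howard's ratios into hours, here is a rough back-of-the-envelope sketch. The 500 TB restore size and the 10 TB/hour backup rate are hypothetical figures chosen for illustration, and the sketch assumes both systems back up at the same speed, which the conversation does not claim. Only the ratios (restore at about one fifth of backup speed on a disk-based PBBA, about five times backup speed on VAST) come from the discussion.

```python
def restore_hours(data_tb, backup_tb_per_hour, restore_to_backup_ratio):
    """Hours to restore `data_tb` when restore speed is a multiple of backup speed."""
    restore_rate = backup_tb_per_hour * restore_to_backup_ratio
    return data_tb / restore_rate


DATA_TB = 500.0            # hypothetical amount of data to bring back
BACKUP_TB_PER_HOUR = 10.0  # hypothetical backup ingest rate, same for both systems

disk_pbba_hours = restore_hours(DATA_TB, BACKUP_TB_PER_HOUR, 1 / 5)
all_flash_hours = restore_hours(DATA_TB, BACKUP_TB_PER_HOUR, 5)

print(f"disk-based PBBA: {disk_pbba_hours:.0f} hours")
print(f"all-flash:       {all_flash_hours:.0f} hours")
```

Under these assumed numbers the gap is roughly ten days versus ten hours, which is why the ratio matters more for a mass restore than for a single-file restore.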

Howard Marks:

Um, and so what, where that really starts to get important is when, when we start

Howard Marks:

talking about ransomware attack, cuz 10 years ago Curtis and I used to teach

Howard Marks:

seminars and we'd go, yeah, 90, 95% of your restores are, you know, the file

Howard Marks:

somebody screwed up.

Howard Marks:

And you know, if it's on a PBBA, it'll be restored in a couple of minutes.

Howard Marks:

And if it was on

Howard Marks:

tape, you'd go find the tape and then a couple of minutes.

Howard Marks:

And so, but you don't know you've been ransomware attacked till

Howard Marks:

thousands or hundreds of thousands of files have been encrypted.

Prasanna Malaiyandi:

Yep.

Howard Marks:

And so now you have to like use something like instant recovery to

Howard Marks:

check back, you know, is this backup good?

Howard Marks:

You gotta do three or four quick looks without restoring, which

Howard Marks:

is a great feature, but you know, requires a relatively high speed

Howard Marks:

backend to work relatively well.

Howard Marks:

And then you're gonna find, okay, this is my last known good point.

Howard Marks:

And then you have to restore and you are gonna have to restore a lot

Howard Marks:

of data and restore speed starts to become really important then.

Prasanna Malaiyandi:

Mm-hmm.

Howard Marks:

And then the kicker is, and the lawyers and the insurance company

Howard Marks:

won't let you use the, the system that was infected for another couple of

Howard Marks:

weeks cuz it's evidence or we have to get somebody in to clean it and certify

Howard Marks:

that it's clean. Well, if, you know, you can run a VMware NFS datastore

Howard Marks:

on VAST, you can just restore to VAST.

Howard Marks:

Now it's a bad idea to run your primary and your backup on the same

Howard Marks:

system for more than a day or two,

W. Curtis Preston:

Right,

Howard Marks:

but compared to not running your primary and just, you

Howard Marks:

know, if your choice is backup only or primary and backup, and if this one

Howard Marks:

system dies, I'm really in trouble.

Howard Marks:

Not that hard of a choice for me.

Howard Marks:

I want my users back up and running.

Howard Marks:

As soon as the lawyers let me get back to the old system, or my

Howard Marks:

VAR gives me a new system, or I have someplace else to Storage vMotion

Howard Marks:

to, I'm getting that stuff off there right away.

Howard Marks:

But that might mean I'm up a week earlier and a week earlier is a lot of time.

W. Curtis Preston:

my objection to flash for backup has

W. Curtis Preston:

been for two primary reasons.

W. Curtis Preston:

One is, it's expensive AF, right? Second,

W. Curtis Preston:

do I really need it?

W. Curtis Preston:

Right?

W. Curtis Preston:

Like, because there's, there are a lot of things that we can buy in life, right?

W. Curtis Preston:

Uh, like I, I need to move fertilizer.

W. Curtis Preston:

I can totally borrow Prasanna's, uh, Tesla and it will do it, right?

W. Curtis Preston:

But, but is that what I should be using for that?

W. Curtis Preston:

Do I need, do I need a Tesla to move fertilizer or will

W. Curtis Preston:

my Prius do

Howard Marks:

need Prasanna's

Howard Marks:

Tesla to move fertilizer if you ever want Prasanna to speak to you again.

W. Curtis Preston:

no, that's, that's true.

W. Curtis Preston:

By the way, the Prius has been used to move fertilizer just for the record.

W. Curtis Preston:

Um, but so, so that's the thing.

W. Curtis Preston:

It's like, there, there are a lot of things, like, this goes back

W. Curtis Preston:

to the CDP, the CDP argument that I made back in the day.

W. Curtis Preston:

It was the same thing, the same two arguments.

W. Curtis Preston:

One was CDP was too damn expensive, right?

W. Curtis Preston:

And then the other was, does anybody actually need

W. Curtis Preston:

the, the functionality that CDP provided?

W. Curtis Preston:

And the answer is yes.

W. Curtis Preston:

0.1% of the population needed what CDP provided.

W. Curtis Preston:

And that's why you don't really see CDP as a choice very, very often these days.

W. Curtis Preston:

Right? There, there's one or two companies that do it now, um,

W. Curtis Preston:

and all the other products have died.

W. Curtis Preston:

So those are my two arguments.

W. Curtis Preston:

It's, I, I already know what your argument to the second one is gonna

W. Curtis Preston:

be because you just gave it, I think.

W. Curtis Preston:

Um, so

W. Curtis Preston:

why

Prasanna Malaiyandi:

about for cost?

Howard Marks:

Well,

W. Curtis Preston:

talk about costs?

Howard Marks:

so, for cost, it depends what flash systems you're talking about.

W. Curtis Preston:

Mm-hmm.

Howard Marks:

Um, I will give you that most all-flash systems are

Howard Marks:

designed to be as fast as possible for a small amount of data, because that's

Howard Marks:

what you need to run the Oracle databases that make companies work.

Howard Marks:

And so if you, you know, it's a block, you know, it's block storage to be low

Howard Marks:

latency to support OLTP and therefore expensive because that's, you know, if

Howard Marks:

that system goes down, you count by the second how much money you're losing.

Howard Marks:

And so you have always bought expensive storage for that.

Howard Marks:

Um,

W. Curtis Preston:

sort of the, sort of the true normal sort of, if I, if

W. Curtis Preston:

I can use this word, pure flash array.

Howard Marks:

Yes.

W. Curtis Preston:

the, that type is designed for that, right?

W. Curtis Preston:

Um, that's technically pure with a small p, but it works the other way as well.

W. Curtis Preston:

Um,

Howard Marks:

talking either way, you know?

Howard Marks:

Yeah.

Howard Marks:

You know, I could name half a dozen other products, but

W. Curtis Preston:

And they're just too expensive.

Howard Marks:

of it, you know, it's, we're gonna design a system based on

Howard Marks:

having a, a pyramidal tiered system.

Howard Marks:

this is the one at the top.

Prasanna Malaiyandi:

Yeah.

Howard Marks:

And if you assume you're gonna build a tiered system, then you

Howard Marks:

want the one at the top to be as fast as possible, and you kind of

Howard Marks:

don't care how much it costs because

Howard Marks:

you'll just put stuff that doesn't deserve it on the next tier.

Prasanna Malaiyandi:

Yep.

Howard Marks:

Philosophically our idea was we're gonna make something that

Howard Marks:

delivers performance for everything but the very, very top tier.

W. Curtis Preston:

Mm-hmm.

Howard Marks:

And goes down in cost to where, well, if you use enough

Howard Marks:

of it, you don't need those tiers.

Howard Marks:

You don't need the complexity.

Howard Marks:

Right.

Howard Marks:

So, you know, part of our story is as you consolidate workloads, you have

Howard Marks:

workloads that need performance, and you have workloads that need capacity.

Howard Marks:

When you add capacity, performance comes with because in,

Howard Marks:

you know, spindles, how many SSDs?

Howard Marks:

Yeah.

Howard Marks:

A hundred SSDs is so much performance.

Howard Marks:

200 SSDs is twice that much performance.

Howard Marks:

So if you take the applications that need capacity, and you put them on the

Howard Marks:

same system as the applications that need performance but don't need capacity.

Howard Marks:

The performance that the capacity creates is used by the applications

Howard Marks:

that need the performance and the cost of the performance is brought down

Howard Marks:

because you've used that much capacity and you get in a virtuous cycle.

W. Curtis Preston:

I think I followed that.

Prasanna Malaiyandi:

yeah, it, it, it's basically

Prasanna Malaiyandi:

by

Howard Marks:

if you, if

Prasanna Malaiyandi:

that's

Prasanna Malaiyandi:

common.

Howard Marks:

if you, you're paying 10x for 10% and 1x for

Howard Marks:

90%, then you're paying a hundred.

Howard Marks:

If you have one tier that costs a hundred,

W. Curtis Preston:

Mm-hmm.

Howard Marks:

why have two tiers?
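
The tiering arithmetic can be made concrete with a small sketch. It takes the spoken ratios at face value (a top tier at 10x the unit price holding 10% of the capacity, a bottom tier at 1x holding the other 90%) and computes the blended price per unit of capacity. The function name and the break-even framing are illustrative assumptions, not VAST pricing.

```python
def blended_unit_price(tiers):
    """Average price per unit of capacity across tiers.

    `tiers` is a list of (fraction_of_capacity, relative_price_per_unit) pairs.
    """
    total_fraction = sum(fraction for fraction, _ in tiers)
    if abs(total_fraction - 1.0) > 1e-9:
        raise ValueError("tier fractions must add up to 1")
    return sum(fraction * price for fraction, price in tiers)


# Ratios as spoken: 10x for 10% of capacity, 1x for the remaining 90%.
two_tier = blended_unit_price([(0.10, 10.0), (0.90, 1.0)])
print(f"blended two-tier price: {two_tier:.1f}x per unit of capacity")


def single_tier_is_cheaper(single_tier_price, tiers):
    """True when one flat tier costs no more than the blended tiered design."""
    return single_tier_price <= blended_unit_price(tiers)
```

Under these ratios the blend works out to about 1.9x, so a single tier priced at or below that figure costs no more than the two-tier design and removes the work of deciding what lives where.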

Prasanna Malaiyandi:

Yeah.

Howard Marks:

And when you use capacity, that capacity comes with performance.

W. Curtis Preston:

Mm-hmm.

Howard Marks:

And that means that performance is available

Howard Marks:

to other applications that didn't need the capacity.

Howard Marks:

So you don't need to have separate systems, you just have

W. Curtis Preston:

So, so if I could, if I could try to put this

W. Curtis Preston:

in, in, in just different words, but it'll say the same thing.

W. Curtis Preston:

If I've got a hundred QLC disks, right.

W. Curtis Preston:

Um, and, and these are how big

Howard Marks:

15 or 30 terabytes.

W. Curtis Preston:

the each, each one, right?

Howard Marks:

Each one

W. Curtis Preston:

So if I've got a hundred, I've got one and a half

W. Curtis Preston:

tera-, one and a half petabytes.

W. Curtis Preston:

Did I

W. Curtis Preston:

do that

W. Curtis Preston:

right of raw?

W. Curtis Preston:

Yeah.

W. Curtis Preston:

Okay.

W. Curtis Preston:

All right.

W. Curtis Preston:

So I've got one and a half petabytes of raw capacity.

W. Curtis Preston:

And what you're saying is we can just take a slice off the top, if you will.

W. Curtis Preston:

You know, we used to call short stroking the discs.

W. Curtis Preston:

The, obviously you don't need to short stroke a, a flash, but you're

W. Curtis Preston:

basically saying, we're just gonna take a slice off the top, uh, of these

W. Curtis Preston:

150 discs and we're gonna get this massive performance slice, uh, for,

W. Curtis Preston:

for the 10% that need that performance.

W. Curtis Preston:

And then the rest we'll just put wherever we need to put it.

W. Curtis Preston:

Is that, does that sound about

W. Curtis Preston:

right?

Howard Marks:

I'm saying all those SSDs create one pool of

Howard Marks:

performance and one pool of capacity,

W. Curtis Preston:

Right.

Howard Marks:

and a workload can draw from either one as much as it needs.

W. Curtis Preston:

Gotcha.

Prasanna Malaiyandi:

Is it separate capacity and performance pools that you

Prasanna Malaiyandi:

then assign to Applic?

Prasanna Malaiyandi:

Okay.

Prasanna Malaiyandi:

It's just one

Prasanna Malaiyandi:

pool that includes both

Howard Marks:

that, and now, you know, and you can use QoS, you can say,

Howard Marks:

okay, this workload gets a hundred thousand IOPS, or, or 50 gigabytes

Howard Marks:

per second, and this other one gets

Howard Marks:

different.

Prasanna Malaiyandi:

yeah,

W. Curtis Preston:

yeah,

W. Curtis Preston:

And

W. Curtis Preston:

you'll just use

Howard Marks:

And so you

W. Curtis Preston:

you need to

Howard Marks:

performance, right?

Howard Marks:

But you know, if you've got, um, you know, your backups and you've got the developers

Howard Marks:

who wanna run live copies of the database, well, run it all on one system.

Howard Marks:

It's, it's an all-flash system.

Howard Marks:

It's fast enough to run the database.

Prasanna Malaiyandi:

It's almost as if you're saying you've built a

Prasanna Malaiyandi:

system that works for all workloads except that 1% or whatever, that's

Prasanna Malaiyandi:

like that very, very, very high end.

Prasanna Malaiyandi:

And you're saying you have one common architecture that allows it

Prasanna Malaiyandi:

to deal with, regardless of if your workload is capacity focused and not

Prasanna Malaiyandi:

very performance, it doesn't need a lot of performance or it's high

Prasanna Malaiyandi:

performance and maybe a little capacity.

Prasanna Malaiyandi:

It's all a

Howard Marks:

and and it doesn't matter whether your definition of

Howard Marks:

performance is bandwidth or IOPS.

Howard Marks:

You know, it's like all but that very lowest.

Howard Marks:

You know, we, you know, we're an all-flash system. Lightly loaded,

Howard Marks:

we deliver one millisecond latency.

Prasanna Malaiyandi:

yeah,

Howard Marks:

You know, some systems can deliver half that

Howard Marks:

and some rare applications care.

Howard Marks:

But you know, between that and the ten, ten cents a gigabyte, well,

Howard Marks:

there are 20 terabyte hard drives and Supermicro servers and you

Howard Marks:

know, they don't do any IOPS, but you can write to 'em pretty fast.

Howard Marks:

You know,

Prasanna Malaiyandi:

Yeah.

Howard Marks:

in between we can cover.

W. Curtis Preston:

So we're dancing around.

W. Curtis Preston:

You're saying why you could be cheaper, but let me,

W. Curtis Preston:

let me just put a, lemme just put it right, you know, sort of, I'm

W. Curtis Preston:

assuming that you get into competitive bids with PBBAs on a regular basis.

Howard Marks:

Yes, sir.

W. Curtis Preston:

Okay.

W. Curtis Preston:

How do you do there?

Howard Marks:

They're easy.

Howard Marks:

Those are very high profit margin products for

Howard Marks:

the

W. Curtis Preston:

so you're, so you're, saying you can come in

W. Curtis Preston:

less expensive than the effective price of the typical PBBA, even

W. Curtis Preston:

though you're using all this flash.

Howard Marks:

Yes, sir.

W. Curtis Preston:

Okay.

W. Curtis Preston:

Because that, that's the short answer.

W. Curtis Preston:

I like the long answer.

W. Curtis Preston:

That's, I like the long answer.

W. Curtis Preston:

You and I

W. Curtis Preston:

live in long, right.

W. Curtis Preston:

Um, yeah, but in the end, it doesn't matter if it's still more expensive.

Howard Marks:

yeah, the, you know, the long answer is, you know, we use the

Howard Marks:

cheapest flash we can get because we designed the system to treat flash well

Howard Marks:

and understand how to minimize wear.

Howard Marks:

We, our erasure codes have 3% overhead at, at large scale, so we're not wasting

Howard Marks:

space on RAID. We reduce data better than anybody else does.

Howard Marks:

So you know, we're getting as much capacity in there.

Howard Marks:

Um, and then when you start saying, okay, it's 30 terabyte SSDs, so you get a lot

Howard Marks:

of capacity in a little bit of space and a little bit of power, and the power

Howard Marks:

and rack space start to add up as costs.

Howard Marks:

Um, especially when you start looking at the fact that the leading PBBAs are

Howard Marks:

still using eight terabyte hard drives because that rehydration tax of turning,

W. Curtis Preston:

Hmm.

Howard Marks:

making everything random, well the bigger the hard

Howard Marks:

drive gets, the worse it is, cause

Howard Marks:

one hard drive is a hundred IOPS.

Prasanna Malaiyandi:

Yep.

Howard Marks:

Doesn't matter whether it's a one terabyte hard

Howard Marks:

drive or a 20 terabyte hard drive.

Howard Marks:

And so they're just reaching the, the world,

Howard Marks:

the land of diminishing returns on IO density.

Howard Marks:

They can't go any lower.

Howard Marks:

And now the sheet metal and the power supplies and the SAS

Howard Marks:

expanders are becoming a larger and larger percentage of their COGS.

Howard Marks:

And they mark 'em up a lot cuz there's a lot of IP in there in terms of

Howard Marks:

software and they have to make a margin.

Howard Marks:

Um, and so we just don't have most of those problems.
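
The IO density point is simple arithmetic: if a hard drive delivers roughly a hundred IOPS whatever its capacity, the IOPS available per terabyte fall as the drives grow. The sketch below uses the figure of 100 IOPS per drive from the conversation and treats it as a flat assumption; the helper name is made up for illustration.

```python
HDD_IOPS = 100  # rough per-drive figure quoted in the conversation, assumed flat


def iops_per_tb(drive_capacity_tb, drive_iops=HDD_IOPS):
    """IO density: random IOPS available per terabyte stored on one drive."""
    return drive_iops / drive_capacity_tb


for capacity_tb in (1, 8, 20):
    density = iops_per_tb(capacity_tb)
    print(f"{capacity_tb:>2} TB drive: {density:.1f} IOPS per TB")
```

Going from 8 TB to 20 TB drives cuts the IOPS behind each terabyte from 12.5 to 5, which is the pressure that keeps a dedupe appliance with random-read restores on smaller drives.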

Prasanna Malaiyandi:

Yep.

W. Curtis Preston:

you're, you're also marking up due to your dedupe, right?

W. Curtis Preston:

I mean, you

Howard Marks:

Yeah.

Howard Marks:

Or you know, some of it, you know, some small portion of the difference is, you

Howard Marks:

know, compared to the guys, those, the best PBBAs, we still do a little bit better.

Howard Marks:

But when we go into a customer who says, no, no, no, price this as

Howard Marks:

if you de-dupe the same, we're still coming in with a lower selling price.

Prasanna Malaiyandi:

Yeah, I, I'm not surprised about that.

Prasanna Malaiyandi:

The other thing, Howard, I wanted to bring up, I know you

Prasanna Malaiyandi:

mentioned sort of disk drives and the hundred-IOPS limit, right?

Prasanna Malaiyandi:

That each of them typically have.

Prasanna Malaiyandi:

The other thing that I've also seen is as the drives get larger and larger, anytime

Prasanna Malaiyandi:

you have to do a RAID rebuild, right?

Prasanna Malaiyandi:

And you're talking like a 20 terabyte drive and it just takes longer and longer,

Prasanna Malaiyandi:

and now there's a potential for failure,

Howard Marks:

Yeah.

Howard Marks:

Well,

Prasanna Malaiyandi:

becomes a lot worse.

Howard Marks:

you know, I do a lot of, you know, resilience calculations

Howard Marks:

and people just don't realize how big a factor rebuild time is

Howard Marks:

in the probability of data loss.

Howard Marks:

Uh, we had one customer share with us.

Howard Marks:

They ran, you know, the leading

Howard Marks:

scale-out system before us and the average for their rebuilds was 53 days.

Prasanna Malaiyandi:

Oh

Prasanna Malaiyandi:

wow.

W. Curtis Preston:

That's two months.

Howard Marks:

yeah, that's two months during which time your data is exposed

Howard Marks:

and you know, could be slightly exposed if you are already running

Howard Marks:

N plus three, could be really exposed if you're running N plus

Howard Marks:

one, like some vendors recommend.

Howard Marks:

So it all depends.
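
How much rebuild time matters can be sketched with a simple independent-failure model. Everything numeric here is a hypothetical assumption for illustration: a 2% annual failure rate per drive, 20 surviving drives in the protection group, and failures that are independent of one another. Only the 53-day rebuild and the N+1 versus N+3 comparison come from the conversation. Real resilience calculations account for more than this.

```python
from math import comb


def p_drive_fails(annual_failure_rate, days):
    """Chance a single drive fails within `days`, given an annual failure rate."""
    return 1 - (1 - annual_failure_rate) ** (days / 365)


def p_data_loss(surviving_drives, extra_failures_to_lose_data,
                annual_failure_rate, rebuild_days):
    """Chance that enough additional drives fail during the rebuild to lose data."""
    p = p_drive_fails(annual_failure_rate, rebuild_days)
    n, k = surviving_drives, extra_failures_to_lose_data
    # Probability that fewer than k of the n surviving drives fail in the window.
    p_fewer = sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k))
    return 1 - p_fewer


AFR = 0.02        # hypothetical annual failure rate
SURVIVORS = 20    # hypothetical number of drives left in the group

# With one drive already failed: N+1 loses data on 1 more failure, N+3 on 3 more.
n_plus_1_slow = p_data_loss(SURVIVORS, 1, AFR, rebuild_days=53)
n_plus_3_slow = p_data_loss(SURVIVORS, 3, AFR, rebuild_days=53)
n_plus_1_fast = p_data_loss(SURVIVORS, 1, AFR, rebuild_days=1)

print(f"N+1, 53-day rebuild: {n_plus_1_slow:.3%}")
print(f"N+3, 53-day rebuild: {n_plus_3_slow:.5%}")
print(f"N+1, 1-day rebuild:  {n_plus_1_fast:.3%}")
```

Under these assumptions the exposure during a 53-day rebuild is dozens of times larger than during a one-day rebuild, and the gap between N+1 and N+3 is several orders of magnitude.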

W. Curtis Preston:

right.

W. Curtis Preston:

Okay.

W. Curtis Preston:

So, so I think, I think, you know, you, you've definitely

W. Curtis Preston:

covered the cost argument.

W. Curtis Preston:

Um, the, and, and it's, I think if we just back up, you've

W. Curtis Preston:

already covered the why, right?

W. Curtis Preston:

The why.

W. Curtis Preston:

would, why

W. Curtis Preston:

is today's restore different?

Howard Marks:

Ransomware.

W. Curtis Preston:

Yeah.

Howard Marks:

Restores.

Howard Marks:

The restore's bigger.

Howard Marks:

And the restore location is less well known

W. Curtis Preston:

What do you mean by that?

Howard Marks:

you may not be able to restore back to the infected

Howard Marks:

system cuz it's still evidence,

W. Curtis Preston:

okay.

W. Curtis Preston:

Understood.

W. Curtis Preston:

Okay.

Howard Marks:

right?

Howard Marks:

You need someplace to restore to.

Howard Marks:

And you know, having it where the primary and the backup are deduped to each other

Howard Marks:

probably gives you that in a pinch.

Prasanna Malaiyandi:

Yeah.

Prasanna Malaiyandi:

Have you

Prasanna Malaiyandi:

seen.

Howard Marks:

emphasize in a pinch, cuz

W. Curtis Preston:

Yeah,

Howard Marks:

you know, we've, we've all been trained, no, no bad idea.

Howard Marks:

Don't mix the strip, don't cross the streams.

Howard Marks:

Um, but, but that's the, if you have a backup on the primary, you

Howard Marks:

don't, you don't have a backup.

Howard Marks:

But if you have to choose between backup and primary, I'd rather have primary.

Prasanna Malaiyandi:

Have you seen customers actually do this

Prasanna Malaiyandi:

in the field with VAST systems?

Howard Marks:

Oh, we have several customers doing really

Howard Marks:

large scale backup to VAST.

Howard Marks:

Um, we had one customer who was kind of shocked cuz they were

Howard Marks:

doing encryption in NetBackup.

Howard Marks:

And so they expected us to not reduce data at all, uh, but they were doing encrypted

Howard Marks:

NetBackup backups of Oracle dumps of the same database over and over again,

Howard Marks:

encrypted with the same encryption key.

Howard Marks:

And we started seeing about 20% reduction just because even when you encrypt it, if

Howard Marks:

you're backing up the same data, it looks

Howard Marks:

the same encrypted as it

W. Curtis Preston:

right.

W. Curtis Preston:

There's like sort of two questions in my head here.

W. Curtis Preston:

One is, and, and they're, they're very much related.

W. Curtis Preston:

One is the, the whole backup container problem, right?

W. Curtis Preston:

Meaning that you get the NetBackup container and the

W. Curtis Preston:

Arcserve container and the Backup Exec container, you know, and they

W. Curtis Preston:

all store backup data differently.

W. Curtis Preston:

And you have that issue.

W. Curtis Preston:

And then you, but you, there was something that you alluded

W. Curtis Preston:

to that I found interesting.

W. Curtis Preston:

You said commonality between the backup and the primary, but the

W. Curtis Preston:

backup is in some weirdo format.

W. Curtis Preston:

So are you able to get backup or commonality between the

W. Curtis Preston:

backup and the primary?

Howard Marks:

Now in that case, it's

Howard Marks:

more likely we'll see commonality between multiple primaries.

Howard Marks:

You know, it's more like you

Howard Marks:

restored 17 Windows VMs

W. Curtis Preston:

Does the way that you're doing dedupe make the

W. Curtis Preston:

format problem any less problematic?

W. Curtis Preston:

Right.

Howard Marks:

Only in that we reduce them all as opposed to if you were relying

Howard Marks:

on the data movers to do reduction.

Howard Marks:

So, you know, kind of the most common case is the storage guys like backup,

Howard Marks:

you know, Commvault or NetBackup or Veritas or Veeam or whatever they use.

Howard Marks:

And the Oracle DBAs don't trust them and insist on doing, doing dumps.

W. Curtis Preston:

Right.

Howard Marks:

And so, you know, if you're doing both to, you know, they

Howard Marks:

just give the Oracle DBAs, okay, dump to this NFS mount point on the VAST.

Howard Marks:

Well then we'll reduce all of those dumps as well as anybody could reduce

Howard Marks:

all of those dumps, and your data mover

Howard Marks:

will do data reduction at multiple stages to manage the network traffic.

Howard Marks:

And then we'll do the final dedupe at the end cuz we're finer grained.

Howard Marks:

And the similarity works really well for things that are deduped coarse

Howard Marks:

grain, cuz the edges all look similar.

Howard Marks:

And so when

Howard Marks:

we, when we run, you know, we have a probe you can get as a VM that

Howard Marks:

scans your data and reports back, this is how much it would reduce.

Howard Marks:

And this is how much of that comes from each of these techniques.

Howard Marks:

And so when we run, when we do that with data from a data mover,

Howard Marks:

deduper, those are usually, you know, 128K or bigger blocks because

Howard Marks:

they have limited memory available.

Howard Marks:

And so we see more similarity cuz we're finding those pieces finer
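
Why a finer-grained pass still finds savings after a coarse-grained deduper has run can be shown with a simplified sketch. It uses fixed-size blocks and SHA-256 fingerprints, which is an assumption made to keep the example short: Howard describes variable-block dedupe plus similarity reduction, neither of which is modelled here. The data, block sizes other than 128K, and function name are hypothetical.

```python
import hashlib
import random


def new_bytes_after_dedupe(existing, incoming, block_size):
    """Bytes of `incoming` that must be stored if `existing` is already in the store."""
    seen = {
        hashlib.sha256(existing[i:i + block_size]).digest()
        for i in range(0, len(existing), block_size)
    }
    stored = 0
    for i in range(0, len(incoming), block_size):
        block = incoming[i:i + block_size]
        digest = hashlib.sha256(block).digest()
        if digest not in seen:
            seen.add(digest)
            stored += len(block)
    return stored


KIB = 1024
SIZE = 1024 * KIB

# Hypothetical "yesterday's dump": 1 MiB of pseudo-random bytes.
base = random.Random(0).getrandbits(8 * SIZE).to_bytes(SIZE, "little")

# Hypothetical "today's dump": the same data with one byte flipped every 256 KiB.
changed = bytearray(base)
for offset in range(0, SIZE, 256 * KIB):
    changed[offset] ^= 0xFF
changed = bytes(changed)

coarse = new_bytes_after_dedupe(base, changed, 128 * KIB)
fine = new_bytes_after_dedupe(base, changed, 4 * KIB)
print(f"128 KiB blocks: {coarse // KIB} KiB of new data")
print(f"  4 KiB blocks: {fine // KIB} KiB of new data")
```

Four one-byte changes cost 512 KiB at 128 KiB granularity but only 16 KiB at 4 KiB granularity, so data that a data mover has already deduped in big blocks can still shrink a lot on a finer-grained target.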

W. Curtis Preston:

just to, just to make sure I understood correctly.

W. Curtis Preston:

So the, one of the questions that I didn't really ask was, you know,

W. Curtis Preston:

when you buy a, you know, pick your favorite PBBA, they tend to support

W. Curtis Preston:

these five backup products, and if you buy a different backup

W. Curtis Preston:

product, well, they're like, well, we don't understand that format yet.

W. Curtis Preston:

And so then they have to go and do some development work to figure

W. Curtis Preston:

out how to crack that container.

W. Curtis Preston:

Do you not have that problem or have you done that

Howard Marks:

We, we have not optimized for any of these backup applications,

W. Curtis Preston:

And yet you

W. Curtis Preston:

still get better dedupe than the other guys.

Howard Marks:

Our, our general case data reduction against all of these reduced

Howard Marks:

data types still gets better reduction.

Howard Marks:

You know, we are not, you know, scanning for the timestamps in Oracle RMAN

Howard Marks:

dumps and, you know, that level stuff.

W. Curtis Preston:

Right.

Howard Marks:

Not to

Prasanna Malaiyandi:

agnostic, right?

Howard Marks:

to say we won't in the future, but our, you know, our data

Howard Marks:

reduction was written for primary storage.

Howard Marks:

It just so happens that being an

Howard Marks:

NFS or an S3 target for a backup data mover is a simple case of primary storage,

Prasanna Malaiyandi:

Yeah.

Howard Marks:

it just works.

W. Curtis Preston:

Right.

W. Curtis Preston:

Right.

W. Curtis Preston:

Hmm.

W. Curtis Preston:

What do you think Prasanna,

Prasanna Malaiyandi:

in the future?

Prasanna Malaiyandi:

So I, I had no complaints to start with.

Prasanna Malaiyandi:

Uh, the one question

W. Curtis Preston:

I lost this argument?

W. Curtis Preston:

I think I might

Prasanna Malaiyandi:

I think, I think you lost this one.

Prasanna Malaiyandi:

Uh, Howard, the one last question I had was, I know some of these backup vendors

Prasanna Malaiyandi:

support the ability to do source-side deduplication by integrating with

Prasanna Malaiyandi:

the purpose-built backup appliances.

Prasanna Malaiyandi:

Does VAST support that?

Prasanna Malaiyandi:

Are you guys planning to support that?

Prasanna Malaiyandi:

I know you're looking, you just

Prasanna Malaiyandi:

previously said, right, that you're

Howard Marks:

don't, we don't, um, I've never been really comfortable with the

Howard Marks:

use of client side CPU for that cuz client side CPU is valuable for other things.

Howard Marks:

Um, I think, you know, doing a pass at some level in the data mover.

Prasanna Malaiyandi:

Mm-hmm.

Howard Marks:

It's like, okay, we'll we'll de-dupe at the media server at some

Howard Marks:

large grain so that we're not transferring 50 copies of Windows over the network.

Howard Marks:

Um, is a perfectly reasonable thing to do cuz it's a network

Howard Marks:

bandwidth management technique.

Howard Marks:

Um, things like DD Boost are, you know, let's offload this from the,

Howard Marks:

the PBBA to the client and we'd just rather do the work ourselves.

Howard Marks:

Um, and in our architecture, since you can just add more servers at the front end

Howard Marks:

and you just have to buy the servers, we don't even charge for that software.

Howard Marks:

If you need more compute

Howard Marks:

to do more

Prasanna Malaiyandi:

you just scale out.

Howard Marks:

to do more dedupe, you just add a few more servers as opposed to stealing 5%

Howard Marks:

of the cycles of all of your VMware hosts, which means you now have to not just

Howard Marks:

buy servers, but you have to buy another VMware host, another VMware license.

Howard Marks:

All the other stuff you put on a VMware host starts to add up.

Prasanna Malaiyandi:

Yeah.

Prasanna Malaiyandi:

Gotcha.

W. Curtis Preston:

Well since, well, since you stepped into my neighborhood

W. Curtis Preston:

now, Howard, I will have to say that source-side dedupe, done correctly,

W. Curtis Preston:

speeds up the backup and reduces the CPU utilization on the client.

W. Curtis Preston:

But I, I want, I can't speak to the, to the implementations

W. Curtis Preston:

you were talking about.

W. Curtis Preston:

Um, I can only speak to the one that I am obviously very familiar with.

W. Curtis Preston:

Um, cuz there, you know, there, that, that is the oft-discussed

W. Curtis Preston:

thing of like, well there is a,

Howard Marks:

It's

W. Curtis Preston:

know, there's a

W. Curtis Preston:

pen.

Howard Marks:

it's also a different case because of the assumed

Howard Marks:

bandwidth at all the stages.

W. Curtis Preston:

right.

Howard Marks:

You know, I'm, I'm kind of assuming that there's

Howard Marks:

a lot of bandwidth for short

Howard Marks:

distances in the data center

W. Curtis Preston:

Well, all right.

W. Curtis Preston:

I, I concede this battle, Howard, I lay down my sword.

W. Curtis Preston:

Um,

Howard Marks:

Okay.

Howard Marks:

I

W. Curtis Preston:

you know, I mean, you, what's that

W. Curtis Preston:

You expect

W. Curtis Preston:

to what?

Howard Marks:

in the mail,

W. Curtis Preston:

Um, yeah, I'll send you, I'll send you something.

W. Curtis Preston:

Um, alright, well, uh, Howard, it's been great.

W. Curtis Preston:

Uh, glad to hear the update and glad to, you know, I, I remember we did,

W. Curtis Preston:

now that I heard you describe it, I, I think we did cover it in the last one,

W. Curtis Preston:

but I think you went deeper this time and that, that's good to hear this idea

Howard Marks:

probably.

W. Curtis Preston:

you can, that you have the, that you have the, the bandwidth

W. Curtis Preston:

to, to, to, to how many different ways did you say you try each block

W. Curtis Preston:

for

Howard Marks:

there's five compression algorithms and, and then there's a strong

Howard Marks:

hash and a number of similarity hashes.

Howard Marks:

I can't remember offhand what they are,

W. Curtis Preston:

Gotcha.

W. Curtis Preston:

I thought I heard you say 15 total ways.

W. Curtis Preston:

I thought I

W. Curtis Preston:

heard you say

W. Curtis Preston:

that.

Howard Marks:

It's on that order.

W. Curtis Preston:

Gotcha.

W. Curtis Preston:

So the fact that you can take each chunk and try 15 different ways to

W. Curtis Preston:

compress it and pick the one that works the best is pretty damn cool.

W. Curtis Preston:

Um, and, um, you

W. Curtis Preston:

know, it's just on

Howard Marks:

of it's just cuz we can parallelize it so well

Prasanna Malaiyandi:

Yeah.

W. Curtis Preston:

that too, right?

W. Curtis Preston:

Uh, the

W. Curtis Preston:

fact that, you can, you know, that's a.

Howard Marks:

one takes.

Howard Marks:

We're doing a lot.

Prasanna Malaiyandi:

Yeah,

W. Curtis Preston:

it's that, that beauty of the scale out architecture, right?

W. Curtis Preston:

That you can just pass that out, um, like that.
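
[Editor's note: The idea discussed above, trying several compression methods on each chunk in parallel and keeping whichever output is smallest, can be sketched roughly as follows. This is only an illustration under stated assumptions: the codecs are Python standard-library stand-ins, not the five algorithms Howard refers to, and the names CODECS and best_compression are hypothetical. The strong hash and similarity hashes he mentions are not modeled here.]

```python
import bz2
import lzma
import zlib
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, Callable, Tuple

# Hypothetical candidate codecs. These are stand-ins chosen from the
# standard library, not the algorithms used by any particular product.
CODECS: Dict[str, Callable[[bytes], bytes]] = {
    "zlib-1": lambda b: zlib.compress(b, 1),
    "zlib-9": lambda b: zlib.compress(b, 9),
    "bz2": lambda b: bz2.compress(b),
    "lzma": lambda b: lzma.compress(b),
    "raw": lambda b: b,  # store the chunk as-is if nothing else helps
}


def best_compression(chunk: bytes) -> Tuple[str, bytes]:
    """Run every codec on one chunk concurrently; return the smallest result."""
    with ThreadPoolExecutor(max_workers=len(CODECS)) as pool:
        futures = {name: pool.submit(fn, chunk) for name, fn in CODECS.items()}
        results = {name: fut.result() for name, fut in futures.items()}
    # Pick the (name, output) pair with the fewest output bytes.
    return min(results.items(), key=lambda item: len(item[1]))


if __name__ == "__main__":
    sample = b"backup " * 2000
    name, out = best_compression(sample)
    print(name, len(sample), "->", len(out))
```

[Because each attempt is independent of the others, the work spreads naturally across threads here, and in a scale-out system it could be spread across nodes in the same way.]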

W. Curtis Preston:

All right, well, thanks for coming back, especially to, it's really funny

W. Curtis Preston:

how we had you, you're like, you're listening and you're like, Hey, you said

W. Curtis Preston:

mean things about the way I do things.

W. Curtis Preston:

I will, I accept your challenge.

W. Curtis Preston:

And I'm like, all right, come on back.

W. Curtis Preston:

Uh, happy to do that.

W. Curtis Preston:

And we'll do that with other people.

W. Curtis Preston:

By the way, if you're out there listening and you're like, the thing that Curtis or

W. Curtis Preston:

Prasanna said is wrong, we will be happy to have you on and have us prove to you

W. Curtis Preston:

why you're wrong or, or in this case,

W. Curtis Preston:

In this case, uh, right.

W. Curtis Preston:

Yeah, we concede.

W. Curtis Preston:

Well, I mean, I mean, I think my concerns are certainly valid and

W. Curtis Preston:

there are certainly vendors out there that are like, yes, we can

W. Curtis Preston:

certainly sell you this appliance for the purposes of backup, because

W. Curtis Preston:

recovery speed is really important.

W. Curtis Preston:

I'm like, but it costs five times the cost of this thing over here.

W. Curtis Preston:

I don't, like, how much better could it possibly be? Anyway,

W. Curtis Preston:

so that's, that's where those arguments tend to come from, so.

W. Curtis Preston:

Alright, well thanks.

W. Curtis Preston:

Thanks Howard for

Howard Marks:

and given, given the marketplace, they're

Howard Marks:

not completely unreasonable.

Howard Marks:

Uh, you know, we, we just do things sufficiently different that,

Howard Marks:

you know, if you think restore speed's important, then we do,

W. Curtis Preston:

right, right.

Prasanna Malaiyandi:

Yep.

W. Curtis Preston:

Yeah.

W. Curtis Preston:

Yeah.

W. Curtis Preston:

Ransomware.

W. Curtis Preston:

Ransomware.

W. Curtis Preston:

All right.

W. Curtis Preston:

Once again, ransomware, you know, I don't know.

W. Curtis Preston:

What do you, what do you call it?

W. Curtis Preston:

Uh, trumps all, um, although I don't enjoy that word as much

W. Curtis Preston:

as I used to for some reason.

W. Curtis Preston:

Um, anyways, thanks for, thanks for your questions for,

Prasanna Malaiyandi:

Uh, I try.

Prasanna Malaiyandi:

I try.

Prasanna Malaiyandi:

And Howard, great to have you back on the podcast.

Prasanna Malaiyandi:

Hopefully you'll come again.

Howard Marks:

Always a pleasure.

Howard Marks:

As long as I keep winning, I'll keep coming back.

W. Curtis Preston:

And thank you to our listeners.

W. Curtis Preston:

Uh, you know, we're nothing without you.

W. Curtis Preston:

Remember to subscribe so that you can restore it all.