curtis:I'm pretty sure we've said smoking hole more times
curtis:than we ever have on this podcast.
curtis:Just for the record.
curtis:Just saying
curtis:It's getting a
curtis:lot of play today.
curtis:Hi and welcome to Backup Central's Restore it All podcast.
curtis:I'm your host, W. Curtis Preston, AKA Mr. Backup,
curtis:and I have with me my Bhangra dance consultant, Prasanna
curtis:Malaiyandi. How's it going, Prasanna?
Prasanna:I'm good Curtis, but I have to warn you.
Prasanna:I am not a dancer at all.
Prasanna:So I'm probably the wrong person to be seeking advice about dancing from.
curtis:But you said that you knew about Bhangra dancing and that you
curtis:could advise me on these things.
Prasanna:I told you that it's like an Indian dance style, if you will.
Prasanna:And you had asked a question of, have I seen it because I bet I've
Prasanna:seen a bunch of Bollywood movies.
curtis:You expanded my horizons. I bought my tickets; my wife and I will be going
curtis:to see the show. It's called Bhangin' It!
curtis:It's "bangin'," but spelled with an H, so it's trying to
curtis:do an homage to Bhangra.
curtis:So it's a new musical at the La Jolla Playhouse, which is a very nice
curtis:playhouse that I've actually never been to.
curtis:I've lived here 20 something years.
curtis:I've never watched a show there, but a lot of big Broadway
curtis:shows actually start out there.
curtis:I've never started.
curtis:I've always watched the Broadway shows
Prasanna:Like
Prasanna:gone to Broadway.
curtis:This is the kind of show that could possibly hit big on Broadway.
curtis:And so we'll see it and we'll see if it's any good and
curtis:I'll
Prasanna:waiting for
curtis:with my review.
Prasanna:Yes.
Prasanna:I think our listeners will be curious.
Prasanna:And by the way, for those in San Diego, when is it running?
Prasanna:Do you know how long?
curtis:It's running.
curtis:It's running until April.
Prasanna:Okay.
Prasanna:So that
Prasanna:was
Prasanna:folks in San Diego.
curtis:Yeah.
curtis:Yeah, depending on when this goes live, if it goes live less than a month from
curtis:now, then you have two days left to go see it because it runs until April
curtis:17th at the La Jolla Playhouse, by which time all the tickets will be
curtis:gone and you won't be able to see it.
curtis:Sorry, I don't know what to tell you. But we have a longtime
curtis:friend on the podcast here today.
curtis:Prasanna.
curtis:I'm excited to bring him on, and not just because he's one of those
curtis:people that make me feel young.
curtis:He's been in the IT industry for an awfully long time, which makes me feel like
curtis:a young whippersnapper sometimes.
curtis:He is now the technologist extraordinary and plenipotentiary at Vast Data.
curtis:Welcome to the podcast, Howard Marks.
Howard:Thank you.
Howard:It's very nice to be here.
Howard:I was always more of a Fosse guy, so I don't know much about Indian dance.
Prasanna:Curtis didn't either before he met me.
Prasanna:So it's fine.
curtis:Yeah, my knowledge of Indian dance basically includes the
curtis:reference to it in, what was that movie?
Prasanna:Millionaire.
curtis:Bride and Prejudice.
Prasanna:Oh
curtis:There's a
Prasanna:yeah.
Prasanna:I th I think
curtis:it's a Pride and Prejudice
Prasanna:yeah.
curtis:knock-off done with, what's her name?
Prasanna:Aishwarya Rai.
curtis:I think.
curtis:Remember, in the movie she gives two dance moves.
curtis:It was petting the dog and screwing in the light bulb.
curtis:I don't know if you remember that.
curtis:She says that.
curtis:That's literally the extent of my knowledge of Indian dance.
curtis:That, and the fact that I've watched a bunch of Bollywood movies, but
curtis:that's all thanks to Prasanna.
Prasanna:Yeah.
curtis:So you never know what you're going to get when you're listening to the
curtis:Backup Central Restore it All podcast.
curtis:Speaking of which, let me throw out our usual disclaimer, Prasanna
curtis:and I work for different companies.
curtis:Prasanna works for Zoom.
curtis:I work for Druva.
curtis:This is not a podcast of either company and the opinions that you hear are
curtis:ours. And be sure to rate this podcast at ratethispodcast.com/restore, or just
curtis:go to your favorite podcatcher, such as Apple Podcasts, and just scroll down
curtis:to the bottom and give us some stars.
curtis:And if you really want to make my day, actually put some words there.
curtis:Yeah, absolutely.
curtis:And if you are interested in the things that we're interested in, like
curtis:backups and storage and resilience and ransomware recovery and cyber
curtis:warfare and all of these things.
curtis:Then just send me a note @wcpreston on Twitter, or wcurtispreston@gmail, and
curtis:I'll be happy to get you on the podcast.
Prasanna:friendly.
Prasanna:We ask questions.
curtis:We even, apparently... although in the last episode I said, unless your
curtis:name was Stuart, and apparently Stuart has now reached out to you, Prasanna.
Prasanna:Yes, he has.
Prasanna:He
curtis:and,
Howard:So even
curtis:and
Howard:name
curtis:Even Stuart can get on this podcast.
curtis:So if we're going to let, you know, a mouse on the podcast, then surely we can let
curtis:you. His name is Stuart Liddle, for those of you that didn't get that reference,
curtis:anyway.
Howard:to make me feel honored here.
curtis:We literally let anybody in the door,
curtis:including guys who always wear Hawaiian shirts.
Howard:They're comfortable.
Howard:They come in my size and at this point I'm just known for them.
Howard:I have been known to tell people I'm going to meet at the
Howard:Starbucks at some conference.
Howard:Just look for Santa Claus in an Aloha shirt.
Howard:That will be me.
curtis:much.
curtis:It pretty much
Prasanna:that's.
Howard:Yeah.
Howard:You know how many 350 pound guys with a gray beard are there walking around the
Howard:average tech show, wearing an Aloha shirt?
Howard:Two
curtis:I'm going to, yeah.
curtis:Two, yeah.
curtis:At most.
curtis:Absolutely.
curtis:And one of them is going to be you.
Howard:Yeah.
curtis:so how long have you been at Vast Data?
Howard:I've been at Vast Data three years and 15 days.
curtis:Wow.
Prasanna:And the company is fairly new as well.
Howard:I joined Vast Data the day before we came out of stealth.
Howard:My, my first official act at Vast Data was a briefing for Chris Mellor followed
Howard:the next day by Storage Field Day.
curtis:Wow.
Howard:Nothing like starting off running
Howard:Now, I joined Vast from being an independent analyst.
Howard:So there were a couple of weeks there where I was getting brought up to speed
Howard:and such before my official start date.
Howard:But yeah,
curtis:And for those that aren't familiar with Vast
curtis:Data, why don't you give us, you know, the elevator
Howard:sure.
curtis:and
Howard:The really short form on Vast Data is that we make very large scale all
Howard:flash file and object storage systems.
Howard:And when I say very large scale our average selling price for
Howard:our cluster is well on the north side of a million dollars.
Howard:It's multiple petabytes.
Howard:Today we're just introducing a new storage enclosure that brings
Howard:our building block down from 675 terabytes per HA enclosure to 338.
Howard:So we're taking it down by a factor of two.
Howard:We're going from a two U to a one U enclosure.
Howard:We'll talk about that in a little bit, but the innovative thing
Howard:about Vast is the architecture.
Howard:If you talk about a large scale system like we build, traditionally that's been
Howard:done with a scale out, shared nothing model where you have a lot of x86 servers.
Howard:Each of those x86 servers owns some set of media and they communicate
Howard:on a backend network and software makes it look like one big system.
Howard:But those systems start to break down at really large scale.
Howard:And so we've come up with a new model.
Howard:We call it DASE, the shared everything architecture. Instead of having a field of
Howard:peer nodes, each of which owns some media, we disaggregated the media into these HA
Howard:enclosures that I was just talking about.
Howard:So no single point of failure, 400 gig connections to an NVMe fabric, and
Howard:that's typically a hundred gig Ethernet.
Howard:Some of our HPC customers like to run InfiniBand so we
Howard:can do InfiniBand as well.
Howard:All those enclosures do is hold data.
Howard:There's no services there.
Howard:All of the services, everything that you would think of as the controller function
Howard:of the system runs in stateless Docker containers in the front end servers.
Howard:So when a user makes a request to a protocol server, to one of
Howard:those front end servers, it could be NFS, could be SMB, could be S3.
Howard:That server looks in the metadata that's stored in storage class memory
Howard:in the enclosures, finds the data the user's requesting in
Howard:QLC flash in those same enclosures, retrieves it over the NVMe over Fabrics
Howard:fabric and delivers it to the user.
Howard:So there's none of the traffic from node to node required to reassemble
Howard:data; everything's north-south across that NVMe over Fabrics connection.
Howard:And since the metadata is in storage class memory, it's fast enough to
Howard:directly access by all of the front end servers that they can just share it.
Howard:They don't have to cache it.
Howard:And by not having the cache, we don't have all the complexities
Howard:of keeping the cache coherent.
Prasanna:I was just going to ask about that, Howard.
Prasanna:So it looks like you're disaggregating the actual storage
Prasanna:and metadata from all the front end processing, which allows, I
Prasanna:would assume, the front end to scale independently of the backend.
Howard:So each of those front end protocol servers, mounts all of the
Howard:SSDs in the cluster at boot time.
Howard:And then it looks at all of those SSDs, and those are the SCM
Howard:SSDs that hold the metadata and the QLC SSDs that hold the data.
Howard:So everybody has access to everything.
Howard:And instead of sending messages back and forth between the front end servers,
Howard:they simply write to a single source of truth in the shared metadata, so
Howard:that you can place a lock on the metadata or update the metadata.
Howard:But you never have to tell everybody else you updated it because if they want
Howard:to know what the state is, they'll go look in the one place where it's true.
Prasanna:Yeah.
Prasanna:And because everything is stateless in the front end, you don't have to worry
Prasanna:about that necessarily to everyone
Howard:Right,
Prasanna:that backend
Howard:right.
curtis:So the backend has both SSDs and QLC.
Howard:It has SCM, storage class memory, SSDs, and that can be
Howard:Optane or... and it has low end QLC SSDs.
Prasanna:So
curtis:And the storage class memory is what's
curtis:holding the metadata and the
curtis:QLC is what's holding the data.
Howard:Primarily.
Howard:It's also used as a write buffer.
curtis:Okay.
curtis:Okay.
Howard:So writes come into the storage class memory and get mirrored to two
Howard:different SCM SSDs and then get ACK'd.
Howard:And then the migration from SCM to QLC happens after the ACK.
Howard:So we have more time to do things like compress more fully.
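The write path Howard describes (mirror to two SCM SSDs, acknowledge, then migrate to QLC later) can be sketched in Python. This is an illustration only, not Vast's implementation: the Enclosure class, the three-device layout, the crc32 placement rule, and the use of zlib for compression are all assumptions made for the example.

```python
import zlib
from dataclasses import dataclass, field


@dataclass
class Enclosure:
    # Hypothetical stand-in for one enclosure: a few SCM SSDs and a QLC pool.
    scm: list = field(default_factory=lambda: [{}, {}, {}])
    qlc: dict = field(default_factory=dict)


def write(enc: Enclosure, key: str, data: bytes) -> str:
    # Mirror the incoming write to two different SCM SSDs, then acknowledge.
    first = zlib.crc32(key.encode()) % len(enc.scm)
    second = (first + 1) % len(enc.scm)
    enc.scm[first][key] = data
    enc.scm[second][key] = data
    return "ACK"


def migrate(enc: Enclosure) -> None:
    # Runs after the ACK, off the latency path, so it can afford the
    # slowest and most thorough compression level.
    pending = {}
    for device in enc.scm:
        pending.update(device)
        device.clear()
    for key, data in pending.items():
        enc.qlc[key] = zlib.compress(data, 9)
```

The point of the split is that the caller only waits for the two SCM writes; the expensive reduction work happens later.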
curtis:This is a very different game than.
curtis:This idea of all of the front end nodes, being able to mount the entire
Howard:Yes.
curtis:the back end.
Howard:Yeah.
Howard:We eliminate the whole concept of ownership and all the
Howard:complexity that that creates.
Howard:And now I'm going to blow your mind because when I say the metadata is in
Howard:the SCM, I don't mean just the element store metadata, the metadata for our
Howard:merged file system object store, but also the data reduction metadata.
Howard:And so when you add another enclosure to the cluster, you add more SCM, which
Howard:means you add more room for that metadata.
Howard:So regardless of the size of cluster, the cluster is one data reduction realm
Howard:across tens or hundreds of petabytes.
Prasanna:Because everything looks like one cluster, if you will, or one system.
Howard:right.
Howard:And, we don't have to hold the data deduplication hash
Howard:table in memory any place.
Howard:It's all in SCM where it's fast enough we don't need that.
Howard:So we don't have the limitations of how big a deduplication realm can be
Howard:that most deduplication systems have.
curtis:right.
curtis:They typically top out around a petabyte or so, and then you
curtis:can't get any bigger than that.
curtis:I don't know where to start on my questions!
Howard:So from the backup point of view, we're discovering that
Howard:customers are starting to demand higher restore speeds. Traditionally,
Howard:all a customer worried about when they were picking the storage for their
Howard:backups was, is it fast enough that I can make my backup within the window?
Howard:And so we got systems like Data Domain and other disk based deduplicating systems,
Howard:where there was a big write-read asymmetry, where you could write data faster to
Howard:them than you could read data from them.
Howard:Because reading data caused the system to rehydrate, which turned
Howard:sequential IO into random IO.
Howard:And they had disks on the backend.
Howard:And as disk drives have gotten bigger, this has gotten worse
Howard:because a 20 terabyte disk drive today delivers exactly the same
Howard:number of IOPS that a one terabyte disk drive delivered 10 years ago.
Howard:So now each terabyte of data gets a 20th as many IOPS.
Howard:And so you discover, yes, it takes me eight hours to back this up.
Howard:It takes me 82 hours to restore it
Howard:and
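The IOPS arithmetic in that answer can be checked with a few lines of Python. The figure of 150 IOPS per drive is an assumed placeholder, since the transcript gives no actual number; only the ratio matters.

```python
def iops_per_terabyte(drive_iops: float, capacity_tb: float) -> float:
    # IOPS density: how many random I/Os per second each stored terabyte gets.
    return drive_iops / capacity_tb


DRIVE_IOPS = 150.0  # assumption: same per-drive IOPS then and now

old_density = iops_per_terabyte(DRIVE_IOPS, 1)   # 1 TB drive
new_density = iops_per_terabyte(DRIVE_IOPS, 20)  # 20 TB drive
ratio = new_density / old_density

print(f"IOPS per TB: {old_density} then, {new_density} now")
```

Whatever per-drive number is plugged in, the ratio stays at one twentieth, which is why random-read restores from large disks get slower as drives grow.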
curtis:Yeah.
curtis:Dedupe has never been very friendly for large restores, especially
curtis:if you want to do a live mount. Forget it, right?
curtis:Directly from a Data Domain?
curtis:It's possible in the same way it's possible that...
Howard:But that's so you can bring up the Oracle or the SQL Server VM
Howard:so that the IT guys can access the passwords database, not so that everybody
Howard:can start running ERP on it again.
Prasanna:Yeah.
Prasanna:Don't use it as production.
Prasanna:That's a bad thing.
Howard:Right.
curtis:right.
Howard:And we're discovering that people's requirements are getting tighter.
Howard:You start thinking about software as a service providers where, you know, if you
Howard:run some industry specific accounting as a service for a thousand
Howard:customers, that's a thousand databases.
Howard:And when something goes wrong, you want to restore those databases
Howard:as fast as you can, because your customers are going to be standing
Howard:over your shoulder, yelling at you.
Howard:And the last thing that's kicked a couple of our potential customers over
Howard:the edge is the ransomware threat.
Howard:Because the size of the restore grows so much with ransomware.
Howard:You start off with, I need to protect my data against ransomware,
Howard:and use various methods to do that.
Howard:And so we have indestructible snapshots.
Howard:So you can say snapshot this folder at 6:00 AM when the backup window
Howard:closes and retain it for 30 days.
Howard:And even if the administrator wants to delete it he can't.
Prasanna:So I
Howard:but
Prasanna:about that.
Prasanna:So I did read a small blurb about that.
Prasanna:So
Prasanna:what prevents... is that locked down forever?
Prasanna:Like, an admin can't delete it no matter what? Or is it just that there
Prasanna:are additional safeguards in place to make sure that someone doesn't
Prasanna:compromise the admin password?
Howard:Anyone who ever talked to any customer of EMC Centera knows that if you
Howard:build a system where you literally can't delete data someone will get themselves in
Howard:trouble and fill it a hundred percent up with junk, and it will be a bad situation.
Howard:So you have to provide some mechanism for overriding this because customers
Howard:will paint themselves into corners.
Howard:As I said, our average selling price is well over a million dollars.
Howard:We don't have small customers who we only know third hand through VARs.
Howard:We are in relatively intimate contact with every one of our customers.
Howard:And so we don't have a fixed policy that says, if you jump through these
Howard:hoops, then we will let you delete the undeletable snapshots. We and the
Howard:customer agree what the hoops are.
Howard:Yeah, multifactor authentication; it must be three of the five people on this list.
Howard:They have to know the passphrase and the proper response to the passphrase.
Howard:And if they respond with this other response to the passphrase, then for
Howard:the next 24 hours, do not give anybody the secret. As complicated as you want.
Howard:As long as we can write it down, those are the rules.
Howard:And then once you've jumped through the hoops, we give you a time limited
Howard:token that allows you to delete snapshots for a short period of time.
Howard:And that token is a one-time pad.
Howard:So that you can't re... it's not good for
Prasanna:Yeah.
Howard:an hour whenever you use it.
Howard:It is good from the time when we issue it for some limited period of time.
Howard:And then you have to know the next one.
Howard:And it's just, it was the best solution we could come up with.
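The "three of the five people on this list" rule and the time-limited token can be sketched as follows. This is a hypothetical illustration of the policy as described in the conversation, not Vast's mechanism: the DeleteToken class, the issue_token function, and all parameter names are invented for the example, and the passphrase exchange and any cryptography are deliberately left out.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class DeleteToken:
    issued_at: float    # seconds, on whatever clock the issuer uses
    ttl_seconds: float  # how long the token stays usable

    def is_valid(self, now: float) -> bool:
        # The window starts at issue time, not at first use.
        return self.issued_at <= now < self.issued_at + self.ttl_seconds


def issue_token(approvers: Iterable[str], authorized: set,
                quorum: int, now: float,
                ttl_seconds: float) -> Optional[DeleteToken]:
    # Count only distinct approvers who are on the agreed list.
    distinct = set(approvers) & authorized
    if len(distinct) < quorum:
        return None
    return DeleteToken(issued_at=now, ttl_seconds=ttl_seconds)
```

Because validity is measured from issue time, a token that sits unused simply expires, which matches the "not good for an hour whenever you use it" remark.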
Prasanna:And this probably helps in cases where someone
Prasanna:attacks a company and they get access to a storage system.
Prasanna:They start deleting backups or what have you, and it gives you
Prasanna:that extra layer of protection.
Howard:I've seen ransomware... you know, we think of ransomware as being on the
Howard:order of the viruses we've dealt with.
Howard:But the ransomware reports I see are much more frequently, this ransomware
Howard:opened a door and then someone physically hacked for a long period of time.
Howard:And they took over some workstation that eventually some
Howard:administrator logged into, and they have an administrator password.
Howard:And if we're just worried about the
Howard:script kiddies, I can protect against the script kiddies in
Howard:building my backup infrastructure and architecture and those permissions.
Howard:But we're talking about more sophisticated attacks than that.
Howard:And frankly we talk about it as ransomware, but it's also
Howard:rogue administrator protection.
Howard:Then it's also just the guy who is disgruntled and decides on his
Howard:way out the door he's going to make life difficult for his employer.
Howard:You're protected against that too.
curtis:Yeah.
curtis:Yeah.
curtis:And sometimes a rogue administrator is a true rogue administrator, meaning
curtis:it's someone masquerading as an administrator as well.
curtis:That hacker that you talked about.
curtis:So let me ask, call it a difficult question, call it
curtis:whatever you want to call it.
curtis:But when I hear about boxes where you're not supposed to be able to
curtis:delete data, but then there is this other way where you can delete data,
curtis:I have to ask the question: doesn't that suggest
curtis:that there is a... I'm assuming this is a Unix-based OS and that
curtis:there is a root account,
Howard:We run in containers under Linux.
curtis:So there is a root account, and
curtis:if someone did some sort of just the right attack against that box...
curtis:And again, you've already mentioned that
curtis:these are sophisticated attacks.
curtis:If someone did a privilege escalation attack against
curtis:the core OS, and now they've gained access to a privileged account, couldn't they do whatever they
Howard:if someone
curtis:want.
Howard:had administrative access to the management network, because the
Howard:ports that face users as storage
Howard:ports can't be logged into
curtis:Okay.
curtis:they're
curtis:cause they're back.
curtis:Cause they're backend,
Howard:So if you want to log into
Howard:Linux as root on one of our appliances,
Howard:then the management network has to be compromised.
Howard:And we start saying, are you looking for protection against destruction?
Howard:Because if your data center is compromised, everything can be destroyed,
Howard:but that's not really the level of attack that we're concerned about.
Howard:We're not talking about, someone walked into the data center because we
Howard:hadn't disabled their key card and left 20 pounds of thermite in the middle of
Howard:the floor. Who would do such a thing?
Howard:I've done that on video. I was being paid.
Howard:So, you know, it is a vulnerability, but it's the
Howard:generalest of the vulnerabilities.
Howard:You're pointing out that if I have sufficient
Howard:access, I can destroy anything.
curtis:But it sounds like you have protected from the rogue
curtis:administrator, the stupid administrator,
curtis:and someone gaining access to those.
curtis:But let me just ask you to clarify something from your previous answer. When you said
curtis:that means the management network has been compromised, what do you mean by that?
Howard:So you manage the system through different Ethernet ports
Howard:than you access the system through.
Howard:And so if there's a vulnerability where a user could log
Howard:into the appliance as the Linux root user, that Linux root user can only
Howard:log in on the management physical Ethernet port on the appliance, not
Howard:on the gigabit NVMe over Fabrics port.
curtis:Gotcha.
curtis:Okay.
Howard:So network security should keep that from being an internet
Howard:connected network and open to attack.
curtis:Gotcha.
curtis:Gotcha.
curtis:Makes sense.
curtis:Okay.
Prasanna:I had a
Prasanna:question.
Prasanna:So Howard, before we dive more into the data protection side, one thing that
Prasanna:was curious to me was you mentioned that Vast supports file and object.
Prasanna:Could you talk about some of the use cases that you see
Prasanna:your customers using Vast Data for?
Prasanna:And then I think maybe some of the protection stuff will
Prasanna:probably come alongside that.
Howard:Sure.
Howard:The majority of our customers use us for primary storage.
Howard:And that includes one of the biggest travel sites who uses us for their
Howard:big data analytics and are using the S3 Presto connectors to store
Howard:all of their analytic data on us.
Howard:So that we're much faster than a disk based object store, obviously.
Howard:And they can do that processing faster.
Howard:We have a lot of hedge funds who do time series analysis of trade
Howard:data against large databases to try and predict the market.
Howard:We have a lot of life sciences customers who are doing things like
Howard:molecular modeling and cryo-electron microscopy, where one microscope generates
Howard:many terabytes of data a day because they have very high resolution images.
Howard:And we have a major motion picture studio who makes movies.
Prasanna:And so it looks like they are using both sort of the file and the object
Prasanna:interfaces for a lot of these use cases.
Prasanna:So specifically around data protection and backup.
Prasanna:A lot of times you hear vendors or customers say, object
Prasanna:store doesn't need to be backed up.
Howard:This is a subject that personally I find myself on the fence about. Part
Howard:of me goes, I've built a huge amount of resiliency into this single system.
Howard:And for availability, I may need to have it in another
Howard:location, but for durability, assuming that the whole data center doesn't end
Howard:up being a smoking hole in the ground, I could get away without backing this up.
Howard:I remain firmly on the fence there.
Howard:But
curtis:assuming you have the second copy somewhere,
curtis:right?
Howard:I may decide that it is data that, if the whole data
Howard:center goes away, I don't need.
curtis:Okay.
curtis:Yeah.
curtis:Agreed.
curtis:Yeah, if we have that data, I would argue, why did we make
curtis:it in the first place? But,
Howard:The risk of that is small enough that I'm
Howard:going to go, once every thousand years this is going to cost me a million
Howard:dollars, but it's going to cost me a million dollars a year to protect.
Howard:So I'm going to take that risk.
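The trade-off in that answer is a simple expected-loss comparison. A minimal Python sketch using the round numbers from the conversation (a million-dollar loss once every thousand years against a million dollars a year to protect); the function and variable names are made up for the example.

```python
def expected_annual_loss(loss_if_it_happens: float,
                         years_between_events: float) -> float:
    # Average yearly cost of an event that happens once per interval.
    return loss_if_it_happens / years_between_events


annual_risk = expected_annual_loss(1_000_000, 1_000)
annual_protection_cost = 1_000_000

# Accept the risk when protecting costs more than the expected loss.
accept_risk = annual_protection_cost > annual_risk
print(f"expected loss per year: ${annual_risk:,.0f}; accept risk: {accept_risk}")
```

With these numbers the expected loss is about a thousand dollars a year, three orders of magnitude below the cost of protection.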
curtis:okay.
curtis:So I will agree that such data classes exist.
curtis:I don't run into them much, but I will agree
Howard:yeah.
Howard:And then we get to, okay, so this is the object store that does
Howard:deep dispersal coding, and they have three locations and I can lose one.
Howard:So do I need to back that up?
Howard:That starts getting really close to, now I need to back it up because there could be
Howard:a bug in the software that loses my data.
Howard:'Cause that's the only thing that could cause that. It's like,
Howard:I'm protected against one of my three data centers being a smoking hole.
Howard:But again, I could see you going, I want to be safe, and I can
Howard:see you going, it's not worth it.
curtis:And.
Howard:Now for us, most of our users use us for primary storage.
Howard:And for someone like that, with big data analytics data, they may not back it
Howard:up because it's regeneratable, and it's not actually in the form
Howard:it's in on the object store, but it's extracts from other things and they
Howard:can run the ETL again and it would be really annoying, but it is replaceable.
Howard:And then for other use cases, this is primary data.
Howard:I gotta protect it.
Howard:And so we can do snapshots to an S3 compatible object store
Howard:and back ourselves up that way.
Howard:Or you can back us up the usual ways.
curtis:And could you use one of the ones that are like
curtis:Glacier Deep Archive, where I hope I don't ever have to use this,
curtis:I know it's going to cost me a crap ton of money, but it'll save me a lot of money
curtis:in the meantime? Can you use that kind of storage?
Howard:Reading data out of that kind of storage
Howard:requires a few manual steps.
Howard:If you just use S3 Standard then data in those snapshots is available
Howard:in a .Remote folder, like the .Snapshots folder in the file system.
Howard:So users can do self-service restore, but
Howard:that feature means the object has to be immediately readable.
Howard:And so if it went to
Howard:Glacier, then
Howard:it would be like your NetBackup
Prasanna:Okay.
Howard:"this backup isn't in the catalog anymore."
Howard:So I've got to put those files someplace where I can catalog it, and then I've got
Howard:to catalog it, and then I can restore it.
Howard:So if you
curtis:So it's possible.
curtis:It
curtis:doesn't sound like it's very... it's the smoking hole copy, right?
Howard:It is annoying.
Howard:But if you're protecting against the smoking hole,
Howard:then you know, you may be willing to put up with the annoyance.
curtis:I'm pretty sure we've said smoking hole more times
curtis:than we ever have on this podcast.
curtis:Just for the record.
curtis:Just saying
curtis:It's getting a lot of play today.
Howard:I spent way too long as a disaster recovery planner.
curtis:Yeah.
curtis:Yeah.
curtis:So the majority of your customers use you for primary storage, but clearly
curtis:you're trying to expand your TAM,
Howard:Well, we deliver all flash at a substantially lower
Howard:price than anybody else does.
Howard:We start with using the cheapest QLC flash.
Howard:We have a file system designed to treat that flash properly.
Howard:So we never do small writes that would cause a lot of write amplification.
Howard:We do very wide erasure code stripes.
Howard:So we've got under 3% overhead, and then we do guaranteed better data reduction
Howard:than anybody else in the business.
Howard:And so that combination means that on an effective byte basis, from whatever backup
Howard:data mover you're planning on using, we're going to be cheaper than a Data Domain.
Howard:When you start saying that you have more than a petabyte of data
Howard:and you need multiple Data Domains,
Howard:and each one of those is going to be a separate deduplication realm,
Howard:then the gap starts to grow substantially.
Howard:So for these very large customers who have five or 10 or 20
Howard:petabytes of data across a bunch of Data Domains, simply the fact that we're
Howard:one reduction realm makes us much more efficient than that can be.
Howard:It's one system to manage.
Howard:It's one namespace. It's one 20 petabyte or 50 petabyte system.
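Why one deduplication realm beats several can be shown with a toy Python example. It assumes simple whole-block hashing with SHA-256 and made-up block contents; real systems use their own chunking and similarity methods, so this only illustrates the effect of the realm boundary.

```python
import hashlib


def stored_blocks(realms: list) -> int:
    # Each realm dedupes only against its own hash table, so a block that
    # appears in two realms is stored once in each of them.
    total = 0
    for blocks in realms:
        total += len({hashlib.sha256(b).digest() for b in blocks})
    return total


# Hypothetical backup streams that share some content.
blocks_a = [b"os-image", b"db-page-1", b"db-page-2"]
blocks_b = [b"os-image", b"db-page-1", b"db-page-3"]

separate = stored_blocks([blocks_a, blocks_b])  # two appliances, two realms
single = stored_blocks([blocks_a + blocks_b])   # one cluster, one realm
print(f"separate realms store {separate} blocks; one realm stores {single}")
```

The shared blocks are stored twice when the streams land in different realms and once when they land in the same one, and that gap grows with the number of realms.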
curtis:So you're saying, so let me just make sure I understood
curtis:what you said there correctly.
curtis:You're saying, regardless of the size of the system, you should
curtis:be price competitive with a Data Domain, but then the bigger you get,
curtis:the better you look.
Howard:under about 500, any pricing experiments under about 500 terabytes,
curtis:Okay.
curtis:Okay.
Howard:in the large end of the business, but yes.
curtis:Right. That is interesting though, that sort of...
curtis:into that end of the business.
curtis:And there's another large all flash competitor that's
curtis:doing very well, but they have a very different architecture. I'm referring,
curtis:of course, to the orange company.
curtis:And
Howard:Yeah, but there,
curtis:than you.
Howard:If you're talking about FlashBlade, that's really a shared nothing
Howard:architecture. Instead of being pizza box servers, they're blade servers, and each
Howard:blade has flash modules built in. And they don't scale nearly as large.
curtis:So it sounds like you've built an
curtis:architecture based on several new pieces of technology that simply
curtis:weren't available, say, five years ago,
Howard:Yeah.
Howard:We are the storage system designed from a clean slate around the 2016 toolbox.
Howard:So QLC flash,
Howard:SCM, NVMe over Fabrics. Other people shoehorn one or two of those technologies
Howard:into an existing architecture, but we built the whole architecture
Howard:around having those technologies.
Howard:Yeah, putting all of the metadata in SCM with no cache meant it had to be in SCM.
Howard:And it meant the connection between the compute server and that SCM had to be
Howard:fast enough that we weren't going, if we cached this, it would be a lot faster.
Howard:So that meant it had to be NVMe over fabrics.
Howard:And then the QLC flash gives us the cost.
Howard:But really, if you look at any storage system, it's by definition built
Howard:with the parts that the industry is making when they sat down to design it.
curtis:Yeah.
Howard:And when x86 processors, when Nehalem came along and the
Howard:memory bandwidth and the number of PCIe lanes on processors got big enough,
Howard:all of a sudden we stopped seeing FPGAs and ASICs in storage systems, and we started
Howard:seeing software defined storage, 'cause what was available for the designers
Howard:changed. And NVMe over Fabrics has been used by most of the storage
Howard:vendors for that last mile connection, going, well, it's going to be faster
Howard:than Fibre Channel or iSCSI for the user machine to access the storage.
Howard:But it hasn't been as effectively used for the server that is the logical
Howard:controller to access the media on the back end. And the way we use it, we broke the
Howard:traditional limitation that a drive had to be owned by one or two controllers.
Howard:'Cause a drive, a SAS drive or an NVMe drive, has one or two ports.
Prasanna:Yeah.
Howard:We connect that NVMe SSD to what we call a fabric module, which
Howard:is an NVMe over fabrics router.
Howard:And in fact, in the new box, it's going to be a pair of Nvidia Bluefield cards,
Howard:and the Bluefield card routes NVMe over fabrics requests from the ethernet network
Howard:to the SSDs and routes the responses back.
Howard:But that's all it does.
Howard:We don't need x86 servers in the enclosure.
Howard:We can do it on the ARM cores and the offloads in the Bluefields.
Prasanna:and these are the DPUs, correct?
Howard:Yes.
Howard:Yeah.
Howard:The Bluefield is the DPU. It's the Nvidia Mellanox version of that.
Howard:And so it has some ARM cores and NVMe over fabrics and RDMA and
Howard:other built-in offloads in the chip.
Howard:And so we leverage that to do the routing of requests from the front
Howard:end servers, where all the work gets done, to the SSDs and get that
Howard:clean, fast, more cost-effective channel.
curtis:Let me go back in time when you did that first presentation that
curtis:you did to the Storage Field Day folks,
Howard:Yep.
curtis:how did that go over with, with those folks?
Howard:It went over pretty well.
Howard:There was a little being from Missouri and,
Howard:you know, we should show you,
curtis:Cause you were brand new
curtis:at that point.
Howard:We were brand new.
Howard:And now we're going, okay, look, we've sold a couple of exabytes of storage.
Howard:Now, our go-to-market model's a little different: we sell software.
Howard:We arrange for customers to buy the pre-approved hardware at cost.
Howard:And the
Howard:software licenses are,
curtis:a little interesting.
Howard:and the software licenses are transferable.
Howard:So you license a petabyte of software.
Howard:And you upgrade the hardware when you feel like you want to upgrade the hardware,
Howard:cause you want the denser, faster one that is always coming. But we'll write
Howard:the support contract for 10 years for any appliance from install date.
Howard:So
Prasanna:That's very different
Howard:Well, with a typical
Howard:vendor, you would buy an appliance, and it would come with an OEM software license.
Howard:They would write five years of support.
Howard:And in year six they would encourage you very strongly to rebuy.
Prasanna:yep.
Howard:And then when you rebuy, you have to buy another appliance. The
Howard:software license isn't transferable.
Howard:So you have to buy another software license.
Howard:So with us, you go to a VAR. We're a hundred
Howard:percent channel, so you go to a VAR.
Howard:Your VAR goes to Avnet and says, I want this hardware for Vast.
Howard:Now $1.2 million average selling price.
Howard:One of our sales guys is involved.
Howard:We're writing the high touch sale.
Howard:It's not somebody went on a website someplace.
Howard:But essentially the VAR writes two POs, one to Avnet for the hardware and one
Howard:to us for the software. Actually, he writes one PO to Avnet, Avnet cuts us a PO for the software,
Howard:and that's a capacity subscription.
Howard:So if you bought a 675 terabyte enclosure and an appliance that's got
Howard:four servers that provide the front end, which is our usual entry point,
Howard:you could license a hundred terabytes for a year.
Howard:Multiples of a hundred terabytes for multiples of a year.
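[Editor's note] The licensing arithmetic Howard describes can be sketched roughly as follows. This is a minimal illustration, not Vast's actual pricing logic: the function name and the round-up-to-the-next-block rule are assumptions; only the 100 terabyte block size and the 675 terabyte enclosure come from the conversation.

```python
import math

LICENSE_BLOCK_TB = 100  # from the conversation: capacity is licensed in 100 TB multiples


def licensed_capacity_tb(needed_tb):
    """Round a capacity need up to whole 100 TB blocks (rounding up is an assumption)."""
    if needed_tb <= 0:
        raise ValueError("needed_tb must be positive")
    return math.ceil(needed_tb / LICENSE_BLOCK_TB) * LICENSE_BLOCK_TB


enclosure_tb = 675                            # entry enclosure size quoted in the conversation
first_year_tb = licensed_capacity_tb(100)     # smallest license mentioned
unlicensed_tb = enclosure_tb - first_year_tb  # installed but not yet licensed

print(f"licensed: {first_year_tb} TB, installed but unlicensed: {unlicensed_tb} TB")
```

The point of the sketch is that installed hardware capacity and licensed capacity are tracked separately, so a customer can grow the license without touching the hardware.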
curtis:And so that, I think that addresses the question that I had.
curtis:Cause I listened to the Chris Evans podcasts that you guys did.
Howard:Yeah.
curtis:and there was this talk of the 10 year. And again, I'm gonna just
Howard:Perfect.
curtis:acknowledge that I live in a SaaS world where we preach against
curtis:large capacity licensing and capital purchases and all of that stuff.
curtis:So when I heard 10 year purchase,
curtis:I was like, what?
curtis:I've got to decide now how much I need for 10 years? But that doesn't
curtis:sound like what you're talking about.
Howard:No, no, no.
Howard:You buy the hardware.
curtis:Right.
Howard:We will write a support contract and software license.
Howard:One agreement.
Howard:For that hardware for up to 10 years from install date at the same rate.
Howard:So if you want to keep it for 10 years, you keep it for 10 years
Howard:Bought
curtis:I could buy a smaller one and then add capacity.
Howard:Oh yeah.
Howard:Our ARR is three.
Howard:Lots of people buy small and add capacity.
Howard:We had a 300% NRR.
Prasanna:I think you meant NRR,
Prasanna:right?
curtis:Thanks for explaining.
curtis:Yeah.
curtis:NRR,
curtis:you said ARR.
curtis:That's why you
curtis:had me confused there for a minute.
Howard:Yeah
curtis:I was like, an annual recurring revenue of three?
curtis:You meant net retention rate, you're saying?
curtis:yeah.
curtis:So you're saying 300%: your customers start out at X and they end up
curtis:with three X very regularly.
curtis:Okay.
Howard:You know, and you can do that.
curtis:it just grows as they need it to grow.
Howard:Yeah.
Howard:And you can do it in the hardware, so if you want to start really small, then
Howard:you can buy hardware and license it
Prasanna:oh, interesting.
Howard:So you can buy a 600 terabyte box and a hundred terabyte software
Howard:license, and the 600 terabyte box you bought at what would be our cost
Howard:if we were still selling hardware. We negotiate the cost with Intel
Howard:and Kioxia and those vendors.
Prasanna:so you used to sell hardware and then you
Prasanna:of,
Howard:We started off in an appliance model.
curtis:Why would I do that?
curtis:Is that just like an ease-of-large-capital-purchase thing?
Howard:Yeah.
curtis:why
curtis:would I buy a bigger box
Howard:We had a university that had this much money in this year's budget.
curtis:Oh, okay.
Howard:They said, we won't put more than a hundred terabytes on it before the next budget
Howard:comes around. When we renew, we'll renew it as a 400 terabyte license.
Prasanna:and I think this is where at the beginning, you said Howard, that you're
Prasanna:looking at releasing a smaller unit.
Howard:Yeah.
Howard:So the new box is 1U.
Howard:It uses the EDSFF E1.L, the ruler form factor SSDs.
Howard:So we have 22 15-terabyte SSDs for 338 raw, about 300 usable.
Howard:And that's half the physical size, half the capacity, because what we
Howard:have now holds 56 SSDs in 2U.
Prasanna:Gotcha.
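[Editor's note] The capacity figures quoted here appear to line up with simple arithmetic. The sketch below is only a sanity check: the 15.36 terabyte nominal drive size is an assumption (a common size for drives sold as "15 terabyte"), while the drive count of 22 and the 675 terabyte figure for the current enclosure come from the conversation.

```python
SSD_COUNT_NEW = 22          # from the conversation
SSD_NOMINAL_TB = 15.36      # assumption: typical nominal size of a "15 terabyte" SSD
CURRENT_ENCLOSURE_TB = 675  # figure quoted earlier in the conversation

raw_new_tb = SSD_COUNT_NEW * SSD_NOMINAL_TB
ratio_to_current = raw_new_tb / CURRENT_ENCLOSURE_TB

print(f"new box raw capacity: {raw_new_tb:.0f} TB")
print(f"fraction of current enclosure: {ratio_to_current:.2f}")
```

Under that assumption the new box comes out at roughly half the raw capacity of the current enclosure, which matches the "half the capacity" remark.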
Howard:Yeah, in the new one. The fabric modules are those NVMe routers. Today,
Howard:each one has to be a dual Xeon
Howard:so we have enough PCIe
Howard:lanes, and the processors hardly do anything.
Howard:So there's cost there
Howard:we don't need with the Bluefield
Howard:thing.
Prasanna:That's exciting.
curtis:right.
curtis:So let's focus for a little bit on backup.
curtis:Historically, when I heard the
curtis:idea of using flash for backup, I was like, that sounds ridiculous,
curtis:for cost reasons: too expensive. I'm hearing you on that. So I
curtis:would put it this way: in this upcoming world, in this current world,
curtis:in a world where we have large nation states invading other nation states
curtis:and then large ransomware organizations in those countries, we had this, was
curtis:our last th they're talking about.
curtis:So we're talking about being retaliated against because of this other country.
curtis:It's crazy.
curtis:So you have this need, more than ever before, for large recoveries.
curtis:And I do believe strongly that there's really only one of two
curtis:ways to be really successful in any sort of ransomware situation.
curtis:And it's basically about fighting the laws of physics. Either you
curtis:have to have already restored it.
curtis:So you already have a hot standby ready to go to switch over to or you're
curtis:doing live mount directly from your backup and live mount directly from
curtis:your backup is only going to happen if you either aren't deduplicating
curtis:the way Data Domain does, or
Howard:Right.
curtis:have flash, as far
curtis:as I can tell.
Howard:Even if you're not deduplicating, when you start talking
Howard:about big hard drives, the IO density just
Howard:isn't there. It's better
curtis:Somewhere between you and Data Domain, I would put ExaGrid,
curtis:because ExaGrid has that front end.
curtis:It's not deduplicated. Now, they're
curtis:nowhere near the size of you.
Howard:Right, no.
Howard:And they have some flash cache.
Howard:And if you look at guys who do integrated appliances where the
Howard:software and the target are one thing, those are typically hybrids.
Howard:And, so they'll do an instant recover for one or two VMs pretty well.
Howard:Cause there's enough flash for that.
Howard:But when you start going, I need the database server behind my ERP instant
Howard:recovered, or I need all 50 of these VMs instant recovered, then you
Howard:just don't have enough flash and you're going to get hard drive performance.
curtis:And so
curtis:it sounds like you've replaced the hard drives with QLC.
Howard:right,
curtis:Help me, because I don't live in this world. QLC, from
curtis:a cost perspective, versus regular flash?
Howard:It's not just QLC.
Howard:So QLC means quad-level cell. It holds four bits per cell.
curtis:okay?
Howard:The more bits you hold, the closer the voltage levels
Howard:that represent the different values are, and the more sensitive the cells
Howard:become to a few electrons escaping.
Howard:If you have SLC, it's like a light switch: it's on or off.
Howard:It doesn't matter if a few electrons escape, you can still
Howard:tell whether it's on or off.
Howard:With QLC,
Howard:you've got 16 values.
Howard:The difference between value 13 and value 14 might only be a handful of electrons.
Howard:So QLC has less endurance.
Howard:Cause every time you erase it, the insulating layers wear down
Howard:a little and a few more electrons have opportunities to escape.
Howard:And it's slower to write because you have to adjust the voltage level just right
Howard:to be one of those 16 voltage levels.
Howard:And that takes a little bit longer.
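[Editor's note] The relationship between bits per cell and voltage levels is just powers of two. The sketch below is an idealized illustration: real cells do not space their levels evenly, and the "gap" figure is a made-up normalized measure, not a device specification.

```python
CELL_TYPES = {"SLC": 1, "MLC": 2, "TLC": 3, "QLC": 4}  # bits stored per cell


def voltage_levels(bits_per_cell):
    """Number of distinct charge levels a cell must distinguish."""
    return 2 ** bits_per_cell


for name, bits in CELL_TYPES.items():
    levels = voltage_levels(bits)
    # Idealized assumption: levels evenly spaced across one normalized voltage window.
    gap = 1 / (levels - 1)
    print(f"{name}: {bits} bit(s), {levels} levels, gap = {gap:.3f} of the window")
```

With one bit there are two levels and the whole window separates them; with four bits there are 16 levels squeezed into the same window, which is why a few escaping electrons matter more.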
Howard:Now the slower to write, we don't really care about because
Howard:we acknowledge the writes while it's still in the SCM.
Howard:So as long as we are flushing that data out of the SCM, in bandwidth terms
Howard:fast enough, latency is unimportant.
Howard:And for the endurance, we specifically do a lot of things in our
Howard:software to manage endurance.
Howard:So we write very large writes so that the SSD doesn't have to garbage collect
Howard:internally to accommodate small writes.
Howard:We erase very large erases so that we delete all of the data in an erase block
Howard:in the flash so that the SSD doesn't have to garbage collect internally.
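[Editor's note] The idea of acknowledging writes from a fast persistent buffer and sending only large writes to flash can be modeled with a toy class. This is a sketch under stated assumptions, not Vast's implementation: the class name, the 1 MiB flush size, and the 4 KiB write size are all hypothetical, and the buffer here is ordinary memory standing in for SCM.

```python
class WriteBuffer:
    """Toy model: acknowledge writes from a fast buffer, flush to flash in large chunks."""

    def __init__(self, flush_bytes):
        self.flush_bytes = flush_bytes   # size of each large write sent to flash
        self.pending_bytes = 0           # data acknowledged but not yet flushed
        self.flash_writes = []           # sizes of the writes the SSD actually sees

    def write(self, data):
        # The caller is acknowledged as soon as the data is in the buffer,
        # so flash write latency is not on the caller's path.
        self.pending_bytes += len(data)
        while self.pending_bytes >= self.flush_bytes:
            self.flash_writes.append(self.flush_bytes)
            self.pending_bytes -= self.flush_bytes
        return "ack"


buf = WriteBuffer(flush_bytes=1024 * 1024)   # 1 MiB is an arbitrary illustrative size
for _ in range(512):
    buf.write(b"x" * 4096)                   # 512 small 4 KiB writes

print(len(buf.flash_writes), "large flash writes for 512 small writes")
```

In this model the SSD never sees a small write, which is the property that lets the drive avoid internal garbage collection for small updates.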
Howard:And that means not only can we use QLC, but we can use dirt cheap QLC
Howard:SSDs that don't have a DRAM buffer in them to protect the QLC from wear.
Howard:If you have a DRAM buffer, then you can aggregate multiple small
Howard:writes, but now if power fails, it's DRAM, you lose the data.
Howard:So you need a power fail protection circuit, and you need big capacitors
Howard:to power the power fail protection
Howard:circuit so that you can dump the DRAM into flash,
Howard:and it all starts to add up.
Howard:So the SSDs we buy, the other customers are hyperscalers.
Howard:They put them in servers.
Howard:They only need one port. They're writing long tail data.
Howard:It's not like they're overwriting this stuff all the time.
Howard:It's just that too many people are looking at that drunken frat
Howard:boy picture on Facebook for it to be on disk, so it's on flash.
curtis:A.
Howard:We're leveraging all of that so that we can literally
Howard:use that lowest cost flash.
Howard:And do the 10 year support because the 10 year support includes if the
Howard:SSD wears out, we'll replace it.
Prasanna:cause normally QLC isn't rated for that long.
Prasanna:I believe.
Prasanna:Right.
Prasanna:SLC is years
Howard:SLC is the very high endurance flash, but the typical
Howard:flash that you see for volume use today is TLC triple level cell.
Howard:So it's three bits instead of four bits.
Howard:So QLC is 30% cheaper to make because it holds more bits per cell.
Howard:And QLC has substantially less endurance.
Howard:So when you start looking at enterprise SSDs on Newegg,
Howard:the 0.1 drive write per day SSD is slightly better than the ones we use.
Howard:And the three drive write per day SSD, you notice, has less capacity because
Howard:it's got the same amount of flash.
Howard:It's just more over-provisioned so they can wear level across more of it.
Howard:And the three drive write per day SSD probably has a DRAM cache
Howard:and all this stuff to protect it.
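[Editor's note] Drive writes per day (DWPD) translates into a total write budget with simple multiplication. The sketch below is illustrative only: the 15 terabyte capacity and the 10 year period are hypothetical inputs chosen to match numbers mentioned in the conversation, and real endurance ratings are usually stated over a specific warranty period that may differ.

```python
def lifetime_writes_tb(capacity_tb, dwpd, years):
    """Total data a drive is rated to absorb: capacity x drive-writes-per-day x days."""
    return capacity_tb * dwpd * 365 * years


capacity_tb = 15   # hypothetical drive size
years = 10         # hypothetical period, matching the 10 year support term discussed

low_endurance = lifetime_writes_tb(capacity_tb, 0.1, years)
high_endurance = lifetime_writes_tb(capacity_tb, 3, years)

print(f"0.1 DWPD: {low_endurance:,.0f} TB written over {years} years")
print(f"3 DWPD:   {high_endurance:,.0f} TB written over {years} years")
```

The thirtyfold gap between the two budgets is the reason software that avoids unnecessary flash writes matters so much on low-endurance drives.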
Prasanna:Yeah
Howard:And that's what most enterprise storage systems need because how
Howard:they put the data in the drive dates back to when it was a disk drive.
Howard:And you were trying to keep data logically adjacent, not try and manage
Howard:the write pool inside the drive.
Prasanna:yeah,
Howard:The requirements were different.
curtis:Yeah.
curtis:Interesting.
curtis:Yeah.
curtis:So again, going back to
curtis:the fact that you built this from scratch with that toolbox
curtis:from 2016, and you were like, we need to manage wear leveling,
Howard:And look, our founder Renen Hallak was the chief engineer at XtremIO.
Howard:And when he got tired of working for Michael Dell, he got to talk to XtremIO
Howard:customers and find out what they wanted.
Howard:And nobody said we want faster. XtremIO was already all flash.
Howard:They were still adjusting to all flash.
Howard:And it was plenty fast, but everybody wanted to be able to use
Howard:that all flash for more things.
Howard:And so our whole system is designed to provide very high, random read
Howard:performance, across large amounts of flash at an affordable price.
curtis:Got it.
Howard:And so our performance asymmetry is exactly
Howard:the opposite of Data Domain's.
curtis:wait, explain what you just said.
Howard:Our performance asymmetry is exactly the opposite of Data Domain's.
Howard:They don't publish restore speeds anymore.
Howard:Haven't for years. We publish read speeds and write speeds, and reads
Howard:are eight times faster than writes.
Prasanna:That doesn't mean your writes are slow either.
Prasanna:Just for
Howard:No. Our smallest system does five gigabytes per second of writes.
Howard:Yeah.
Howard:Or your story system probably doesn't keep up with that, but that's the SLOs.
Howard:But what that means is if you scale a system the traditional way, and
Howard:you say, I need to move this many terabytes over this many hours, so you
Howard:have to scale it by write performance,
Howard:your backups are going to be much faster than your restores.
Howard:Excuse me,
Howard:your restores are much
Howard:faster than your
Howard:backups.
Prasanna:Yeah,
Howard:Yeah
Howard:we read much faster than we write.
Howard:And so if you size for backup speed, your restore
Howard:speed's going to be
curtis:yeah.
Howard:nice.
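[Editor's note] The sizing argument can be worked through with the two figures quoted in the conversation: five gigabytes per second of writes on the smallest system, and reads about eight times faster than writes. The 100 terabyte dataset is hypothetical, and the sketch assumes the storage runs at its full rate, ignoring the network, the backup software, and the systems being protected.

```python
WRITE_GB_PER_S = 5       # smallest system's write rate, from the conversation
READ_TO_WRITE_RATIO = 8  # reads about eight times faster than writes, from the conversation


def hours_to_move(terabytes, gb_per_second):
    """Hours to move a dataset at a sustained rate (1 TB = 1000 GB)."""
    return terabytes * 1000 / gb_per_second / 3600


dataset_tb = 100  # hypothetical dataset size
backup_hours = hours_to_move(dataset_tb, WRITE_GB_PER_S)
restore_hours = hours_to_move(dataset_tb, WRITE_GB_PER_S * READ_TO_WRITE_RATIO)

print(f"backup:  {backup_hours:.1f} h")
print(f"restore: {restore_hours:.1f} h")
```

Sizing for the backup window therefore leaves restore time at roughly one eighth of it under these assumptions.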
curtis:All right.
curtis:Consider me impressed, Howard.
curtis:you know, I,
Prasanna:do by the way
curtis:I
Howard:I I've
curtis:I, I,
Howard:time.
Howard:I've impressed him once.
Howard:This makes twice.
Howard:I'm really, I'm happy with that,
curtis:yeah it sounds like you're, clearly you've been
curtis:in the business a long time.
curtis:You've seen those companies that have really interesting technology
curtis:and nobody's buying anything.
curtis:You're not that you,
Howard:but
curtis:the really interesting technology, but you're also actually selling it,
curtis:right?
Howard:I decided it was time to get a job.
Howard:And I talked to the folks at Vast, who were still in stealth.
Howard:And I said to myself, look, Howard, you're a storyteller.
Howard:And this is a really good story.
Howard:And it doesn't matter whether it succeeds or not.
Howard:You're going to have a good story to tell.
Howard:And lo and behold, it's one of those cases where it was a good
Howard:story and the market requirement fit.
Howard:And
curtis:don't have to create the need.
Howard:We are selling. We have, for the past couple of years,
Howard:done comparisons with all the storage companies that have gone public.
Howard:Yeah.
Howard:We're growing faster than all of them put together.
curtis:all right Howard thanks a lot for coming on.
curtis:We might have to have you back.
curtis:Cause I know we've just begun to scratch the surface, but it
curtis:sounds like you got a good gig over there.
curtis:I'm glad.
curtis:Both of us could be
curtis:employed.
Howard:Ed
curtis:well.
Howard:For the people who have known us a long time,
Howard:it really must be shocking that you and I have both had the same job for multiple years, but
Howard:I'm still having fun at Vast.
Howard:And there's lots of interesting stuff still to come.
Howard:Having taken a fresh eye to the market.
Howard:We got all sorts of good stuff coming.
curtis:Cool.
curtis:All right.
curtis:I wish you the best.
curtis:And thanks Prasanna.
curtis:This is one of those cases where your background was very helpful.
curtis:I think,
Prasanna:Oh, I try.
Prasanna:I try,
Prasanna:Yeah Yeah.
Prasanna:Having spent a bunch of time building storage arrays.
Prasanna:It helps, but
Prasanna:no, it's still interesting problems though, and, yeah.
Prasanna:Thank you, Howard, for sharing some of the details and indulging in my questions.
Prasanna:So.