Ah, episode 401. Proof that we're still going strong.
Speaker:Like a SQL Server instance running on pure spite and caffeine.
Speaker:I'm Bailey, your semisentient hostess with the mostest
Speaker:metadata. Today, Frank's joined by the ever insightful
Speaker:Andrew Brust to talk Fabric, AI, Microsoft
Speaker:nostalgia, and why even Red Hat folks can still love Clippy.
Speaker:Grab your headphones and your compute capacity. Let's dive in.
Speaker:Hello, and welcome back to Data Driven, the podcast where we explore the emerging
Speaker:industry of data science, data engineering, and
Speaker:artificial intelligence. With me today is not Andy
Speaker:Leonard, who's my favorite data engineer in the world. However, I do have Andrew Brust,
Speaker:who is an OG, so to speak, in
Speaker:the AI and Microsoft ecosystem. And
Speaker:although I think a lot of people think that I've abandoned the Microsoft
Speaker:ecosystem, I have not. I've just had other things kind of preoccupy my
Speaker:time. And you know how it is. You have kids and, you know, they demand
Speaker:attention. You have a house and all that and good problems to
Speaker:have. But I'm very glad to kind of have someone I know walk me
Speaker:back into kind of the Microsoft ecosystem, because a lot has
Speaker:changed since I left Microsoft. I did
Speaker:go to Microsoft Ignite. We were talking about that. I even scored myself a
Speaker:Clippy Plus.
Speaker:I'll have to tell you how I won him. So they had like, a little
Speaker:challenge of like, do you know Windows history? Because Windows, I guess, turned 40 this
Speaker:year and. I know, right? Yeah,
Speaker:'85. That's right. Yep. And they were like, do the history of
Speaker:Windows. And I'm like, I'm like. And I had some Red Hat people and like,
Speaker:you know, I was. I would have been very embarrassed if I had gotten anything
Speaker:wrong. Turns out I got it right, actually. Good. I
Speaker:remember back then you could install just a runtime version of Windows if you
Speaker:wanted to run specific Windows apps on your. On your DOS
Speaker:machine. Yeah. Now we take for
Speaker:granted kids. They don't understand, like, no, no, you had to type in win or
Speaker:win 3.1 if you were fancy and you're running multiple
Speaker:versions of Windows at the. Same time,
Speaker:you say kids today. I mean, kids 20 years ago didn't understand
Speaker:that. This is true. This is true. But
Speaker:anyway, in terms of bringing you back into the Microsoft orbit,
Speaker:well, first of all, I'm sure Ignite did a bunch of that. But I can
Speaker:be gentle because even though I am to this day a Microsoft
Speaker:Regional Director, or as I like to say, a member of the Regional Director
Speaker:program, because otherwise it sounds like I work for Microsoft
Speaker:and an MVP, a data platform MVP, both over
Speaker:20 years now. I'm an
Speaker:industry analyst and so I look at data and analytics
Speaker:solutions across the board, not just Microsoft specific. I
Speaker:will say Microsoft is my sweet spot in terms of what it is that I
Speaker:know and where I have the most history.
Speaker:But I work with lots of other companies you've heard of, like
Speaker:Databricks and Snowflake and Cloudera and
Speaker:plenty of others. So my team and I do
Speaker:most of the reports for a research company called GigaOm,
Speaker:with which I've had a very long association. Most of the reports, sorry, I
Speaker:didn't finish that sentence, that are focused on data or analytics
Speaker:are things that we work on. So whether it be data warehouses or lake
Speaker:houses or streaming data platforms or data access
Speaker:governance or data catalogs or blah blah, blah, they all have the
Speaker:word data in them. We work on those reports.
Speaker:We create what are called GigaOm Radar
Speaker:reports, which are a little bit like the analog to
Speaker:Gartner's Magic Quadrant in terms of looking across a category
Speaker:at a bunch of vendor solutions and rating them on
Speaker:multiple criteria which change each year.
Speaker:So. And when I started covering big data,
Speaker:because the way this got started was I was the first and only person
Speaker:at ZDNet to be covering big data.
Speaker:So that was an amazing. That's a term you don't hear a lot. You don't
Speaker:hear it anymore. No. And in fact I was at the
Speaker:older incarnation of GigaOm. I was a full time employee there.
Speaker:I was their research director, and they wanted to call me research
Speaker:director for big data. And I said, can we just call it data
Speaker:or data and analytics? And we did. Because I was like, you know,
Speaker:eventually what we think is big now won't look so big.
Speaker:Have you heard my Costco rule? Say again? I have something
Speaker:called the Costco rule. The Costco rule:
Speaker:if you walk into Costco and buy a hard drive, that size is no longer
Speaker:big data. Fair, fair.
Speaker:Anyway, that put me in an immersion. And at that time, Microsoft
Speaker:was not in that world at all.
Speaker:Eventually the thing called HDInsight got to beta,
Speaker:and it wasn't even called HDInsight when it was in beta.
Speaker:And so Microsoft started coming back into my world. Eventually,
Speaker:of course, it came back full swing. And with Microsoft Fabric,
Speaker:now it's doubly full swing, which I think is
Speaker:very, very good, both for Microsoft and the industry. But
Speaker:what was I going to say? Just that. Yeah. So
Speaker:finally the two things that were kind of orthogonal now have an
Speaker:intersection. Right. And that
Speaker:intersection is my sweet spot, as I'm still a Data Platform
Speaker:MVP and I have a very long history with Microsoft's
Speaker:business intelligence stack. I was on Microsoft's
Speaker:partner advisory council going way back, like
Speaker:from 2005 to roughly 2010.
Speaker:I don't know. I saw Power BI when it was still a bunch of wireframes
Speaker:in a PowerPoint slide deck. So I've been through
Speaker:many rounds of being frustrated that Microsoft
Speaker:didn't have a good competitive play. And I'm now pretty
Speaker:satisfied that they have one that's very competitive. So we can talk about that or
Speaker:we can talk about the greater world. And as far as
Speaker:AI goes, I was interested in AI all the way back in the horse and
Speaker:buggy days when I was an undergraduate. Oh, really?
Speaker:Yeah. AI was very different then. It was about, like, weird programming
Speaker:languages like Lisp and Prolog, and
Speaker:expert systems and things of that ilk.
Speaker:But neural nets existed then, and neural nets are the very
Speaker:basis for the large language models we have today. So it's
Speaker:not completely unrelated, but obviously it's very different.
Speaker:I took a Prolog course, which was the one offering we had.
Speaker:So I'm. You and I are both from New York City. So, like, you know,
Speaker:we probably accidentally crossed paths more than once.
Speaker:And I know we crossed paths during the early Power BI days
Speaker:because I think the company I worked for at the time
Speaker:was also an early believer in Power BI.
Speaker:So this was, what, 2005. We had a guy who
Speaker:was the practice manager, Kevin, who got hired. Kevin Viers got hired into
Speaker:Microsoft. So I think he's still there, actually. He,
Speaker:he hired me back into Microsoft when I rejoined in 2018, which was kind of,
Speaker:okay, what a small world it is, you know, and, and, and for us, like,
Speaker:you know, something that my parents would always say, like 20 years could go by
Speaker:in a blink once you hit a certain age. And I'm like, good Lord, was
Speaker:that the truest thing they ever said, right? 20.
Speaker:Yeah. The scale shrinks the older. The older you get. It's a
Speaker:little. It's a little frightening. I've turned it like, into a roll of toilet
Speaker:paper that as you get closer and closer to the end, it spins out a
Speaker:lot faster because the diameter gets smaller. Oh,
Speaker:Lord, that is a very scary concept.
Speaker:But you're right. I remember one of the things. These were back in the
Speaker:cubicle days when people worked in an office and things like that. And I remember
Speaker:sitting next to Kevin. And Kevin would be on the phone like, yeah, I know
Speaker:it's weird to hear Microsoft is kind of a small player in any niche,
Speaker:but in BI and business intelligence they really were.
Speaker:And it was just kind of like, yeah, that's true. You don't really think about
Speaker:them as a small player, but at the time now it's kind of ridiculous
Speaker:to say that in data and analytics, right? And the Power BI team has just
Speaker:done phenomenal in terms of their speed to market. And what they
Speaker:built out is phenomenal. It's unreal.
Speaker:It's the Power BI team that kind of took over the
Speaker:entire Azure data group, right?
Speaker:And that included SQL Server. So
Speaker:whereas the BI team was once a little corner of the
Speaker:SQL Server world that
Speaker:initially came to Microsoft through an acquisition of assets from
Speaker:an Israeli company called Panorama.
Speaker:And Amir Netz is the distinguished engineer who actually came
Speaker:from Panorama and is very much like the father of Power BI
Speaker:and of Fabric. He's still there.
Speaker:Slowly but surely, not
Speaker:only did they get BI right and they always had it right on the
Speaker:server, they just never really had it right on the front end
Speaker:until the current version of Power BI that we have now kind of gelled,
Speaker:but they also ended up kind of mastering
Speaker:the software as a service approach to
Speaker:cloud services. And they took a look at the Azure
Speaker:data stack and said, we have tons of capabilities here, but they're kind
Speaker:of fragmented over several different products,
Speaker:each of which have their own kind of procurement model and
Speaker:pricing model. And that gets very hard to manage.
Speaker:And if you look really carefully at Fabric, while there are
Speaker:some things that are truly native to it,
Speaker:most of the parts of it are Azure services in the
Speaker:background that have been integrated and
Speaker:that have been unified in terms of how you pay for them.
Speaker:I don't know, Microsoft needed that. By the way, the whole cloud industry
Speaker:needed that. Because Google and Amazon are just as guilty of
Speaker:having a whole sprawl of
Speaker:services without unified user
Speaker:interfaces or APIs or pricing. No, that's
Speaker:true. I mean, when I. So when I just before
Speaker:the pandemic, I was out at TechReady, which is
Speaker:an internal Microsoft event. And they were basically
Speaker:might have been Amir, actually, now that I think about it, was presenting on
Speaker:the future of what was called Synapse. And this is kind of,
Speaker:you know, he's like, you know, everything's going to be all in one pane of
Speaker:glass. Basically everything you said, how all these things are going to be one thing.
Speaker:And the speaker, which I don't,
Speaker:I can't say it was him, but it would make a lot of sense, he said, like,
Speaker:this is the future. We're going to get everything under one pane of glass. Billing?
Speaker:Don't worry about that, we're going to get that figured out in time. And I
Speaker:was just like, you know, I kind of saw the vision. So
Speaker:and then I don't know, maybe like a year, year and a half later I
Speaker:left Microsoft and then Fabric came out and I
Speaker:wondered, like, why the change in name from
Speaker:Synapse? And I asked people, and like, well, Synapse is still kind of there, but
Speaker:really Fabric is where everything's going. I'm like, all right, but like, what is the
Speaker:difference per se? So if we pretend I was on a UFO, well, that's
Speaker:a weird thing. Pretend I was in a coma and I just woke up
Speaker:from 2020 or 2021. What?
Speaker:And I say, like, well, what happened to Synapse?
Speaker:Sure. So, I mean, the functionality of Synapse is still there, and there was
Speaker:a lot of, I won't call them conspiracy theories, but
Speaker:skepticism when Fabric came out that it was really just a
Speaker:rebrand of Synapse. In fact, that's not what it
Speaker:is. So the thing that was originally SQL Data
Speaker:Warehouse, which was in Synapse as so-called
Speaker:dedicated pools, and the more lakehouse part
Speaker:of it that was in there
Speaker:as, gosh, I forget the old nomenclature. It wasn't on-demand
Speaker:pools, but it was something like that.
Speaker:Reserved instances or something like that? Wasn't reserved, no,
Speaker:but anyway, basically a Spark-based data
Speaker:lakehouse using
Speaker:Azure Data Lake Storage as the storage layer. That's still
Speaker:there. But what was I going to
Speaker:say? But Fabric is a ton more because it integrates
Speaker:all this streaming stuff
Speaker:that's now called Real-Time Intelligence. It integrates data
Speaker:science and by the way the data science is completely
Speaker:unique to Fabric. It's not merely
Speaker:an embedding of Azure Machine Learning.
Speaker:There's also Power BI, of course.
Speaker:Now there are operational databases, including
Speaker:SQL database, meaning Azure SQL,
Speaker:meaning SQL Server in the cloud.
Speaker:A lot of other pieces that were ancillary are now all
Speaker:included. There's a user
Speaker:interface that covers the whole
Speaker:realm and again the billing is
Speaker:unified. So you buy a compute capacity
Speaker:and basically as you use the different services
Speaker:they're all pulling from the same pool of compute.
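[Editor's aside: the shared-capacity idea described here can be sketched in a few lines of Python. This is purely a conceptual illustration; the class, the numbers, and the workload names are all made up and do not reflect Fabric's actual capacity-unit metering.]

```python
from dataclasses import dataclass, field

@dataclass
class CapacityPool:
    """Hypothetical sketch of one shared compute capacity (not Fabric's real metering)."""
    capacity_units: float                      # total units purchased up front
    usage: dict = field(default_factory=dict)  # per-workload consumption

    def consume(self, workload: str, units: float) -> None:
        # Every workload debits the same pool; nothing is provisioned per service.
        self.usage[workload] = self.usage.get(workload, 0.0) + units

    def used(self) -> float:
        return sum(self.usage.values())

    def remaining(self) -> float:
        return self.capacity_units - self.used()

pool = CapacityPool(capacity_units=64.0)
pool.consume("data-warehouse", 20.0)
pool.consume("real-time-intelligence", 10.0)
pool.consume("data-science", 6.0)
print(pool.used(), pool.remaining())  # 36.0 28.0
```

The point of the sketch is that idle headroom in one service is automatically available to every other service, which is the over-provisioning problem being described.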
Speaker:So you know, you don't have to over provision for each
Speaker:one of those services just to make sure you have enough
Speaker:compute to, to satisfy it. And now we have this thing called
Speaker:Fabric IQ, which brings
Speaker:generative and agentic AI into things.
Speaker:Which is good because it was kind of funny when Fabric finally went
Speaker:to general availability. That was really when
Speaker:ChatGPT and Gen AI were like
Speaker:making it big. So it looked like Microsoft finally got the data and
Speaker:analytics stack set up just in time for people to have their attention
Speaker:to, you know, diverted over to AI.
Speaker:But now we have, you know, natural
Speaker:language query is kind of just the beginning. We have
Speaker:operational agents that can actually act on
Speaker:things and can be all based and triggered
Speaker:on streaming data.
Speaker:And so if you think about Azure Event Grid,
Speaker:if you think about Azure Data Explorer,
Speaker:If you think about the data pipelines that Azure
Speaker:offers, as I said,
Speaker:the one standalone data warehouse side of things, and even
Speaker:elements of HDInsight in terms of the lakehouse,
Speaker:that's all in there. What's also nice is even though it's Azure Data Lake
Speaker:Storage under the hood, you have this abstraction layer over it
Speaker:called OneLake. OneLake is
Speaker:in many ways easier to deal with because
Speaker:you don't have to worry about accounts and containers and
Speaker:sizing those and so forth. It's still compatible with
Speaker:all the ADLS and Azure Blob Storage APIs.
Speaker:It also supports this notion of shortcuts, which is really just a
Speaker:data virtualization technology.
Speaker:So you can have a shortcut to data in other OneLake instances
Speaker:or in ADLS proper,
Speaker:or even in Amazon S3,
Speaker:or even in Google Cloud storage or other databases.
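[Editor's aside: as a conceptual sketch of what a shortcut does, here is the data-virtualization idea in plain Python. This is an illustration only, not OneLake's real implementation, and every path below is hypothetical.]

```python
# A shortcut maps a logical lake path to a physical location elsewhere.
shortcuts = {
    "/MyLakehouse/Files/sales":   "abfss://container@account.dfs.core.windows.net/sales",  # ADLS Gen2
    "/MyLakehouse/Files/weblogs": "s3://my-bucket/weblogs",                                # Amazon S3
    "/MyLakehouse/Files/events":  "gs://my-bucket/events",                                 # Google Cloud Storage
}

def resolve(logical_path: str) -> str:
    """Return the physical location behind a logical lake path.
    Longest-prefix match, so files underneath a shortcut resolve too."""
    for prefix in sorted(shortcuts, key=len, reverse=True):
        if logical_path.startswith(prefix):
            return shortcuts[prefix] + logical_path[len(prefix):]
    return logical_path  # plain lake data, no shortcut involved

print(resolve("/MyLakehouse/Files/weblogs/2024/01.json"))
# -> s3://my-bucket/weblogs/2024/01.json
```

To the caller, everything looks like one lake; only the resolver knows some of the data lives in S3 or GCS, which is exactly the "logically part of OneLake" behavior being described.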
Speaker:And logically they'll all look like they're part of OneLake and you
Speaker:can query them as such. That's impressive. That's
Speaker:impressive. The vision of getting everything
Speaker:under one pane of glass seems like it's come true.
Speaker:What I tell people, even though it sounds maybe a little
Speaker:bit anticlimactic, is that the real
Speaker:innovation in Fabric
Speaker:isn't the tech per se.
Speaker:It's all the integration of the tech and the
Speaker:abstraction layers over it that make it work together, the UI that makes it
Speaker:work together. And there's an organizational. I mean,
Speaker:there's a little inside baseball, but there's an organizational facet to it as
Speaker:well. Because all these different products were different teams.
Speaker:Yes. People don't realize that. You've been, you're an
Speaker:RD, so you kind of know how the sausage is made. I was
Speaker:inside the firewall, so I saw how the sausage was made. There are all these
Speaker:little teams that range from
Speaker:really good team players to really not good team players.
Speaker:That's as polite a way as I could put it. And getting them
Speaker:all to row in the same boat, or like, row in the same direction.
Speaker:Yeah, they weren't doing that. That wasn't even necessarily
Speaker:based on hostility. It was just that different people had different reporting structures
Speaker:and different priorities and different incentives. What
Speaker:worried me was that the vision of putting all this together was a great
Speaker:idea, but the execution to me at the time
Speaker:seemed like it would be next to impossible to get all these teams to kind
Speaker:of work harmoniously and somehow they
Speaker:did it. And like to me that's the, that's the absolute
Speaker:greatest innovation. And now they've got
Speaker:synergy instead of sort of internal
Speaker:competition and, you know, from there
Speaker:on, look out because, you know, whatever. I'm sure
Speaker:there are internal disharmonies somewhere.
Speaker:But I would say at the high level in general,
Speaker:anywhere you have people, you're going to have that. Let me see. I think
Speaker:I still have, I think you mentioned, this is my old, old laptop. You see
Speaker:Azure Data Explorer. I have the sticker from that.
Speaker:Oh, I see it. Yep. Sorry, I had to show that off.
Speaker:No. And what's Azure Data Explorer? For people who don't know,
Speaker:it ran under the codename of Kusto K U S T
Speaker:O. And there's some disagreement
Speaker:over whether that really references Jacques Cousteau, C O U
Speaker:S T E A U, or not. But it's a
Speaker:fantastic super high performance
Speaker:system, not just
Speaker:for streaming data, but for time series data. Yeah. With
Speaker:its own query language and its own ability to create
Speaker:visualizations right in the query language. So your results
Speaker:come back as both tabular and visualized data
Speaker:and it can handle huge volumes of data
Speaker:in a single query. And
Speaker:there's a lot of heritage in the Azure Data Explorer team that
Speaker:started in the SQL Server Analysis Services world. So there's
Speaker:a continuum there. And that product on its own,
Speaker:especially being called Azure Data Explorer, which made it sound like
Speaker:a tool, like File Explorer. Yeah.
Speaker:When they told me the name, I don't know, I was not
Speaker:reserved in saying I didn't think it was the best name,
Speaker:but that product on its own was kind of a sleeper. It
Speaker:wasn't really getting the, I don't
Speaker:know the kudos that it deserved or the attention that it deserved. And
Speaker:now that it's part of Fabric, it contributes
Speaker:to all the cool things Fabric can do. So if you see what
Speaker:are called eventhouses in Fabric, that's the
Speaker:same technology as Azure Data Explorer. Interesting.
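[Editor's aside: for a flavor of the query language being discussed, here is a small KQL query of the kind you could run against an eventhouse or Azure Data Explorer. It uses the StormEvents sample table from the public ADX help cluster; note the `render` operator, the in-language visualization mentioned above.]

```kql
StormEvents
| where StartTime > ago(365d)
| summarize EventCount = count() by State
| top 5 by EventCount
| render columnchart
```

The result comes back both as a table and, because of `render`, as a chart.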
Speaker:So correct me if I'm wrong, but I think the origin story of
Speaker:Kusto and Kusto query language was that
Speaker:the folks running Azure, like in the operations team, actually built it
Speaker:to run through all the logs that they had. Because I remembered I was at
Speaker:some super secret event and they brought in some people from
Speaker:the field and they had us do hands on labs with it
Speaker:and I'm like, I, I must have been checking my email or
Speaker:whatever. I'm like, when can I get this to my customers? And they kind of
Speaker:laughed. They're like, no, no, this is internal only. It's internal. Yeah, it began as
Speaker:an internal thing. So I was just like, oh, like you need to make this
Speaker:a product. Because I could think of 15 customers off the top of my head that
Speaker:would eat this up. Yeah, I'm glad it finally saw the light of
Speaker:day. If you go back to the real world outside of Microsoft
Speaker:and you think of the likes of Splunk, for example. Yes, right.
Speaker:It's in this, it's in the same space. Although their
Speaker:initial marketing just said it was a big data tool which just
Speaker:completely obfuscated what it did. But anyway, now,
Speaker:in combination with these things
Speaker:called eventstreams, which can stream the data in,
Speaker:basically based on Azure
Speaker:Event Hubs, you put it all together and
Speaker:you have the ability to do a lot of work with
Speaker:real time streaming data without really having to write much
Speaker:code, if any code. Although it does have
Speaker:its own query language called KQL, there's also
Speaker:a copilot that you can just work with in natural language that will
Speaker:generate the KQL for you. Oh, very nice. And I love
Speaker:generators because not only does it mean I don't have to write the query, but
Speaker:it means I can learn the language and then write my own query
Speaker:if I want to. Reverse engineering is how I
Speaker:prefer to learn. So
Speaker:that works out really well. Yeah, I
Speaker:wanted to dive into Fabric, but I wasn't really even sure where to start because,
Speaker:so I heard, and again, a lot of this is I heard, but
Speaker:the way that it's not attached to your Azure tenant. Is that
Speaker:true? It's attached just like Office 365,
Speaker:or more important, Power BI. Right. Imagine
Speaker:Power BI Premium instances; it's the outgrowth of that.
Speaker:Okay, that makes sense now because when somebody told me like, well, it's not tied
Speaker:to your Azure tenant, you need a different tenant, I'm like, but
Speaker:okay. But then somehow, I guess at some level it ties in. SaaS, not
Speaker:PaaS. Right? So it's using Azure services in the
Speaker:background, including even Azure OpenAI. But
Speaker:you don't have to provision anything in Azure. It's doing that on your
Speaker:behalf. So you don't need an Azure tenant at all.
Speaker:So that actually makes it a lot easier if I were to manage, if I
Speaker:had to manage it, right? There's a lot of things that, like, correct. I mean,
Speaker:I love the fact. You go
Speaker:to, you know, any of these services,
Speaker:right, and they have basically this smorgasbord, this, this big buffet
Speaker:of services you kind of pick and choose from. But at the end of the
Speaker:day, like, how do you figure out, you know, what you pay
Speaker:for, right? It becomes, like, really kind of nightmarish. Like, again, to
Speaker:me, that's the innovation is that, yeah, all the stuff has been
Speaker:brought together, put under one pricing model,
Speaker:and you don't have to worry about all the moving parts
Speaker:and all the different, all the different servers or instances
Speaker:that might have to be provisioned and sized. That all goes away.
Speaker:And again, everything is built out of a single
Speaker:pool of compute.
Speaker:It's not a perfect analogy, but I think of like the old days of
Speaker:cell phones, when you got a certain number of minutes per month, but you could
Speaker:roll them over. And it's not that you can do that with fabric.
Speaker:I'm not saying you can roll over your compute from one month to the next,
Speaker:but what you. What is fungible is how the
Speaker:compute is used amongst the different subservices
Speaker:of Fabric, so you don't have to provision a certain
Speaker:amount of compute just for streaming, or just for
Speaker:AI, or just for the data lakehouse.
Speaker:Because it's all from one
Speaker:pool. All right, that makes a lot of sense now, because that was always
Speaker:the issue when I first got into it. When I left Microsoft, I, you know, started experimenting
Speaker:with AWS, and I was just like, I just want to create a
Speaker:website. Why don't I need these, like, hundreds of different services underneath,
Speaker:right? Like, why do I need. I understand why. I need identity, access, management, right?
Speaker:That made sense to me. But. But like, when it came to Route 53 and
Speaker:like all this crazy stuff, I'm like, I just want to spin up a stupid
Speaker:website, right? This is not, let alone do anything complicated, right? Where you need to
Speaker:have all these underlying things. Like SageMaker, right, has this whole thing and they tried
Speaker:to abstract away all the underlying services. But even then,
Speaker:this is the thing that really annoyed me: when I killed the
Speaker:SageMaker instance, I was still getting, like, you know, 20,
Speaker:$30 a month, not a lot, but I was still getting that on my
Speaker:bill and eventually I just closed the account because I'm like, I'll have to start
Speaker:fresh again in the future, because, like, God only knows
Speaker:what I've spent. And for people who are just learning and wanting
Speaker:to get their skill sets up, Microsoft is pretty generous with
Speaker:trial
Speaker:capacities, as they call them. A capacity basically is a, you know,
Speaker:a server or an instance. However,
Speaker:if you want to do anything with the AI, you do need a
Speaker:paid instance. But there are some pretty affordable
Speaker:ones. And this gets a little confusing.
Speaker:If you provision the Fabric instances
Speaker:through Azure, again you don't have to,
Speaker:that connection doesn't have to be there. But if you provision it through Azure,
Speaker:you can pause and resume those instances.
Speaker:Okay, so you could do that. You could be like, hey man, I'm
Speaker:taking, I'm taking a week between Christmas and New Year's off, so pause it.
Speaker:Totally. I brought up my own cheat
Speaker:sheet in the background when no one was looking. But
Speaker:so Azure Data Lake Storage, Azure Synapse, as you
Speaker:mentioned, Azure Data Factory, Azure Event Hubs, Azure Data
Speaker:Explorer, elements of Azure Machine Learning,
Speaker:and Power BI all come together in fabric.
Speaker:Interesting. So it's like one roof, which I think is
Speaker:a brilliant strategy. Right. Because Microsoft's core strength in the data and
Speaker:analytics space isn't necessarily having frontier models, isn't
Speaker:necessarily having the most cutting edge research.
Speaker:Although I love. Just making it usable. Exactly. Making it usable
Speaker:and turnkey. Right. Like, not that I don't love my folks in Microsoft Research.
Speaker:Right. I know some of them. Listen, love you all, you guys have the best
Speaker:conference in the world. But, I
Speaker:mean, you're making it usable. Right. And I think that that's really the
Speaker:strength. And they have all these separate tools. I think that was really the challenge,
Speaker:right. When it was a shrink wrap company, you knew what you bought. But when
Speaker:it became like a SaaS/PaaS company, where with just
Speaker:a couple of clicks you could provision stuff. So it eventually kind of
Speaker:got too chaotic. Now I like the idea of them kind of bucketizing this or,
Speaker:or rolling it up behind one service
Speaker:where, because it's just like the AWS problem. Right. Like, I
Speaker:spun up SageMaker. Right. And it turned
Speaker:out that I needed underlying storage. I needed this, I needed this, I
Speaker:needed DNS, I needed that, I needed that. To the point where, look, I'm
Speaker:okay spending X amount of dollars on learning
Speaker:SageMaker. Right. But I wasn't okay with,
Speaker:when I turned off the instance, I'm still getting billed. What am I getting billed
Speaker:on? That lack of transparency, intentional or not,
Speaker:on AWS's part, has left a bad taste in my
Speaker:mouth, you know, for cloud services in general.
Speaker:Sure. By the way, you mentioned Microsoft Research and a
Speaker:couple of things. So when I listed all those Azure services, I forgot to say
Speaker:Azure OpenAI. So add that to the list. But also,
Speaker:although I said there's elements of Azure Machine Learning in there, the data science
Speaker:workload in Fabric is really mostly unique to Fabric,
Speaker:but it's based on technology that comes out of
Speaker:Microsoft Research. So for example, there
Speaker:was something called FLAML, F L A M L,
Speaker:which is the fast library for automated machine learning
Speaker:and tuning. And that's built in.
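[Editor's aside: FLAML's actual API is much richer than this, doing cost-aware search over real learners, but the core idea of automated tuning under a time budget can be sketched in plain Python. Everything below, from the function name to the toy objective, is invented purely for illustration.]

```python
import random
import time

def tune(objective, search_space, time_budget_s=1.0, seed=0):
    """Toy time-budgeted random search: keep trying configurations until
    time runs out, remembering the best one seen (a hypothetical sketch,
    not FLAML's algorithm)."""
    rng = random.Random(seed)
    best_cfg, best_loss = None, float("inf")
    deadline = time.monotonic() + time_budget_s
    while time.monotonic() < deadline:
        # Sample one candidate configuration from the search space.
        cfg = {name: rng.choice(values) for name, values in search_space.items()}
        loss = objective(cfg)
        if loss < best_loss:
            best_cfg, best_loss = cfg, loss
    return best_cfg, best_loss

# Made-up objective: pretend the sweet spot is lr=0.1, depth=6.
space = {"lr": [0.001, 0.01, 0.1, 1.0], "depth": [2, 4, 6, 8]}
objective = lambda c: abs(c["lr"] - 0.1) + abs(c["depth"] - 6) / 10.0
best_cfg, best_loss = tune(objective, space, time_budget_s=0.2)
print(best_cfg, best_loss)
```

The "time budget" parameter is the key idea: you tell the tuner how long it may spend, and it returns the best configuration found in that window.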
Speaker:So are things like MLflow, which is an
Speaker:open source experiment management
Speaker:platform that's built into a lot of commercial AI
Speaker:platforms. So they didn't just kind
Speaker:of embed and put their own badge
Speaker:on it. They built their own ML
Speaker:stuff from these open source components.
Speaker:Right, Right. Well that's interesting because like the world of AI is
Speaker:largely dominated by open source. Right?
Speaker:Right. I mean, SageMaker, I'll stop kicking AWS
Speaker:to the curb in a minute. But like, SageMaker is basically a wrapper of Jupyter
Speaker:notebooks. Right. Azure ML, at least when I last used it, was largely
Speaker:a wrapper around Jupyter notebooks. Right.
Speaker:So a lot of the core technology here does tend to
Speaker:lean towards open source, which from my own personal career development
Speaker:point of view, and they're not paying me to say this, you know, one of
Speaker:the things that led me to Red Hat, right. Was the idea that, you know,
Speaker:this is largely a movement driven by open source. So, you
Speaker:know, let's see what we could do here. Right. And not a commercial,
Speaker:not a sermon, just pointing it
Speaker:out, because I think it's interesting how quickly open source has taken over
Speaker:certainly the AI world. Right.
Speaker:But also, by the way, there's notebooks in Fabric too, if that wasn't
Speaker:already implied or obvious, and
Speaker:they are based on Jupyter, but you don't see the Jupyter skin. Right.
Speaker:Yep. Well, I think what's really
Speaker:impactful is kind of the notebook interface, once you get used to it. And it
Speaker:is an adjustment for people who, like you and me, grew up with Visual Studio
Speaker:and, dare I say, InterDev. Right. The
Speaker:idea that you can code in a browser, right. And you know,
Speaker:no local installs really required.
Speaker:It's been very freeing, right, because you can spin up an environment,
Speaker:you can, you know, spin up a heavy cluster,
Speaker:leave it running. Right. Do its thing, you know, close the
Speaker:laptop, go in the car, go in the train home, and you get on the
Speaker:other side and you're like, oh, it's done. Right. Like, that's kind of nice, actually.
Speaker:Right. And of late, I've been a big fan, an increasing
Speaker:fan of kind of my own, of local AI. Like your own private AI.
Speaker:Right. Which explains, you know, I bought, it was an early
Speaker:Christmas gift, probably Father's Day, birthday, and anniversary gift too:
Speaker:my own DGX Spark. So I have my own AI running locally. Right.
Speaker:So I'm not throwing shade at the cloud. It's just
Speaker:when I run a job, I don't have to think about the costs. Right.
Speaker:And that sort of freedom
Speaker:from worry can be really important as you're
Speaker:learning things, because you just don't have to keep looking over
Speaker:your shoulder at the meter, or mixing metaphors there.
Speaker:But. Yeah. So are you using Llama models and such, or
Speaker:what do you. What do you really. Yeah, I have Llama. The thing I've been
Speaker:doing mostly the most is, is doing fine tuning
Speaker:Loras for image makers
Speaker:in the past. That takes about 90 minutes to maybe two hours on this
Speaker:box, which would probably be
Speaker:equivalent to about 90 minutes of Azure service
Speaker:or a VM. Right. That's like
Speaker:$150 a pop, I would say, for the type of machine
Speaker:I want. So the idea, I could just spin that off and I have the
Speaker:added benefit as it heats my office, but I don't really
Speaker:have to. I don't have to think about
Speaker:like, oh, God, you know, like, how much is that going to cost me in
Speaker:cloud services and things like that. I do think that the.
Speaker:There's. I have a lot of questions because even though I was at
Speaker:Ignite, I honestly spent my entire time at the booth and kind of walk around
Speaker:the expo floor. I didn't have. I only had a hall pass. Right. So.
Speaker:But one of the things I heard mentioned was Azure
Speaker:AI Foundry. What is that? Because you also mentioned
Speaker:Azure OpenAI, which I know the relationship between OpenAI
Speaker:and Microsoft has not been as cozy as it once was.
Speaker:And they've also, they've also added
Speaker:Claude, the anthropic models too. Right. So
Speaker:what is Azure OpenAI and how does that relate to Azure AI foundry.
Speaker:Right. So OpenAI is just the
Speaker:Azure. OpenAI is Azure's hosted
Speaker:instance of the open OpenAI models and
Speaker:APIs, because you can
Speaker:procure those directly from OpenAI themselves.
Speaker:Right. Or you can use them on Azure. And even
Speaker:if the direct model is still running on Azure in the
Speaker:background, it's still a difference in terms of procurement
Speaker:and billing and so forth. So
Speaker:you've got all the APIs around that, you've got the models. And then of course
Speaker:you need tooling to do,
Speaker:to do rag applications. Right. So
Speaker:there was tooling for that, There was also tooling for building
Speaker:copilots. There was Copilot Studio. Right. And these things
Speaker:are all kind of coming together in Azure Foundry.
Speaker:Yeah. So it's, you know, you worked at Microsoft, so you
Speaker:know how this works where like different teams do different things with different
Speaker:brands and eventually they may get kind of rationalized
Speaker:together. So the
Speaker:Foundry side of things helps there. And by
Speaker:the way, I'm glad you mentioned Foundry because in addition
Speaker:to this thing called Fabric iq, which we haven't really
Speaker:talked about, there's also Foundry
Speaker:IQ and there's also Work iq.
Speaker:Sounds like IQ is the new copilot buzzword.
Speaker:Well, it's the agentic buzzword. And the
Speaker:idea, if you think about kind of all the promise of Microsoft
Speaker:graph In the office M365 World,
Speaker:work IQ kind of sits
Speaker:over that realm. But you don't have to worry about the
Speaker:graph APIs directly. Then
Speaker:Fabric IQ sits over everything in the fabric world
Speaker:based on a pretty rich
Speaker:semantic model that can be developed so that
Speaker:when you are querying your data in natural language, the actual,
Speaker:the actual vocabulary or jargon in your particular
Speaker:organization is well understood, including
Speaker:the entities and the relationships between those entities.
Speaker:And then Foundry IQ is a way for building agents at
Speaker:the higher level that can actually talk to your structured data
Speaker:via Fabric IQ and your
Speaker:work and organizational related data via work iq.
Speaker:And that, that was, I guess
Speaker:the big vision at Ignite this year
Speaker:was to talk about all those IQ pieces.
Speaker:So there you go. No, that's interesting.
Speaker:So one of the things that comes up, I'm sorry, I cut you off,
Speaker:we're recording this the day after Christmas, so things are a little.
Speaker:Still recovering from the manicness of yesterday. And
Speaker:what's Microsoft's plans for? Kind of. Because one of the things you're seeing happen a
Speaker:lot more is this notion of sovereign
Speaker:AI or data sovereignty. And kind of like people are very
Speaker:much more conscious about the value of their data. And
Speaker:I actually think that given Microsoft as opposed to AWS or
Speaker:Google, Microsoft does have a history of selling shrink
Speaker:wrap software, right? So I do think Microsoft has a unique
Speaker:advantage of that over their modern day competitors.
Speaker:What is the kind of the on Prem story? Right, because that, that
Speaker:has certainly been. A.
Speaker:An advantage for, for my day job at Red Hat is the fact that you
Speaker:know, hey look, if you live in Country X and there's no
Speaker:AWS Azure GCP footprint there, you can
Speaker:just find a local hosting provider down the street, you know, do it
Speaker:yourself, right? Like you know the Linux
Speaker:ethos, right? Of like just do it yourself.
Speaker:And I'm seeing a lot of customers that normally one
Speaker:customer in Latin America, right, they were in a country that,
Speaker:you know, they did not have access to
Speaker:an Azure data center and they just said we have to run
Speaker:this on prem because this is either regulated or soon to be regulated.
Speaker:So I imagine that if I've seen it, I can't imagine I'm
Speaker:the only one that's seen that. What's
Speaker:their thinking? Because the answer when I was there was Azure this, Azure
Speaker:that. I think the answer is sometimes
Speaker:Azure is part of the answer, but it's not the whole answer.
Speaker:Agreed. So yeah, I don't have a perfect
Speaker:answer here because some companies really do make their entire
Speaker:stack work across different clouds and
Speaker:right inside of a Kubernetes environment that you might run
Speaker:on premises. He said to the Red Hat guy.
Speaker:I know that story. Yeah, so
Speaker:Fabric doesn't have that story. Fabric is
Speaker:software as a service cloud based product platform
Speaker:no matter what. However, various
Speaker:components of Fabric do exist as
Speaker:on premises products. This is not the way I'd recommend it, but I'll just make
Speaker:people aware that of course SQL Server can run
Speaker:on premises, so can the data Warehouse
Speaker:in effect in the form of something called Analytics
Speaker:Platform System. Terrible name.
Speaker:That is basically the new brand for what was Parallel Data Warehouse.
Speaker:And there is a Lakehouse component to that as well as a
Speaker:Warehouse component, Power bi.
Speaker:Obviously the desktop runs on premises, but there is something called
Speaker:Power BI Report Server that is part of SQL Server
Speaker:so that your Power BI reports can run on premises.
Speaker:And so again
Speaker:various components can run completely sovereign and on
Speaker:prem. I think maybe more important though is the fact that
Speaker:you can use fabric
Speaker:and OneLake to incorporate
Speaker:data that remains on premises even if,
Speaker:even if the engines are not running on premises.
Speaker:There is an on premises gateway that started its life
Speaker:as a Power BI tool. I had that running in my home.
Speaker:Lab for a While there you go, that also
Speaker:allows OneLake to see data that may be on
Speaker:premises and that can be either federated into
Speaker:the lake or
Speaker:it can be replicated as well
Speaker:mirrored, to use the right term, in the fabric world. So you can do a
Speaker:mirroring or slash replication or you could just do
Speaker:kind of virtualization and bring stuff in. Don't
Speaker:forget there's all kinds of enterprise
Speaker:storage systems that run in
Speaker:a way such that they're S3 API compatible.
Speaker:And Azure will not. Azure fabric will work with
Speaker:all of those. So the ability to talk to S3 buckets
Speaker:is not limited to AWS S3 buckets.
Speaker:It works with all S3
Speaker:compatible services, which most of which are on prem
Speaker:actually. Right, right, right. No, I only ask because,
Speaker:like, that does seem to be. If I had to pull out the tea leaves
Speaker:and kind of figure out what is kind of the next thing beyond
Speaker:agentic, beyond this is you're seeing a lot of
Speaker:national governments, supranational governments,
Speaker:even state level here in the US starting to apply privacy and
Speaker:regulatory controls on it, which, you know, if you live in the
Speaker:us, it's not an issue for you unless you're in healthcare, banking and
Speaker:possibly, you know, government. Right.
Speaker:But in other countries, you know, Switzerland,
Speaker:eu, Latin America have very strong
Speaker:data privacy and sovereignty laws. And you're seeing,
Speaker:you know, I once attended a, you know, internal talk. Mark
Speaker:Russinovich, right. Which is a name that most people
Speaker:in the Microsoft ecosystem know, but he's kind of a big deal in
Speaker:the Microsoft security space. And, you know, he, you know,
Speaker:he, he's known for giving his internal talks to employees
Speaker:at employee conferences as well as to rds. Right. You get a,
Speaker:you get the unfiltered one. The filtered ones are still good. But like, one of
Speaker:the things he said, and this isn't secret because he said it publicly too, is
Speaker:like, I think the original vision, going back to 2010
Speaker:time frame was the idea that they would build a dozen
Speaker:data centers around the world to do everything. But because of
Speaker:the national laws and lawyers and politicians getting
Speaker:involved, now it's kind of a concern where, where the
Speaker:data ends up living physically. Right. Because at the end of the day, all this
Speaker:virtual stuff has to sit somewhere in the physical world. Yeah.
Speaker:So, like what, you know, so basically a big case for this was,
Speaker:and I was in the legal department when this was going on was
Speaker:the. There was data inside the European Union, I think the
Speaker:Dublin Data center that the U.S. department of justice thought was,
Speaker:you know, basically, you know, we don't need a warrant because
Speaker:you don't need to. We don't need to bother the EU because you're an American
Speaker:company, you're into our jurisdiction, blah, blah, blah. Right. Microsoft kind of said,
Speaker:well hold up now. And ultimately that's why
Speaker:you have these sovereign clouds. Last time I checked it was Switzerland, Germany,
Speaker:China. I think the new data center in Qatar as
Speaker:well might fall under that. So
Speaker:it's basically they get, they found a loophole that like, well, Microsoft
Speaker:leases the data center like there's a whole. They don't own it, so
Speaker:they get around the law. Right. And yeah, and there
Speaker:can be different arrangements in terms of whose personnel are actually
Speaker:working, running operations on the ground.
Speaker:It's strange because we grew up a lot of our technology client
Speaker:server and afterwards grew up in a world of globalization
Speaker:where the borders were disappearing and without meaning to
Speaker:get political, we're in an era now
Speaker:where, well, first of all, privacy is extremely important. So that
Speaker:creates a whole sovereign mandate.
Speaker:But also, you know, there's a lot of populist governments all over
Speaker:the world and they are not necessarily
Speaker:internationalist in their approach. So
Speaker:although we have the technology to kind of federate everything and
Speaker:make it all kind of conflate and look like one big world.
Speaker:Right. We actually have to be sensitive to the
Speaker:requirements and, and the constraints
Speaker:and be able to federate things, but also be able to
Speaker:govern them in ways where things stay within a certain
Speaker:scope. Microsoft's play for that, by the way, is,
Speaker:is Purview. And Purview has been through, I would
Speaker:say, multiple incarnations. The current
Speaker:incarnation is starting to get very
Speaker:sophisticated, especially as pertains to
Speaker:agentic AI and how to make sure the agents are government
Speaker:are governed and that, and how to make sure the
Speaker:agents are only have access to data
Speaker:for which either the agent is authorized
Speaker:or the person or party using the agent is
Speaker:authorized and under the circumstances under which
Speaker:they've been authorized. And that's very
Speaker:complex stuff that I would say almost no one in the industry
Speaker:is really paying close attention to. I'm about to work on a
Speaker:report just on governance for agentic
Speaker:AI and most people are starting
Speaker:from the naive premise that if you,
Speaker:if you govern the underlying data, you're done.
Speaker:But the whole point of agents is that we're supposed to treat them like people,
Speaker:that they have autonomy, they have agency, hence the
Speaker:name. And we're saying under different circumstances
Speaker:they get to modify their own goals
Speaker:and in effect determine their own
Speaker:actions. And that's something that needs to be
Speaker:monitored, audited,
Speaker:Authorized and also tested. And
Speaker:there's almost nothing out there. Actually.
Speaker:Your, your folks, your, your
Speaker:parent company folks at IBM are one of the only folks that
Speaker:really have with WatsonX.gov
Speaker:a way to test agents
Speaker:in isolation before they're deployed. And
Speaker:I don't know, maybe I'm naive, but it kind of shocks me that
Speaker:nobody else is thinking about that. I mean this is, we can test
Speaker:software. Agents are software plus plus plus.
Speaker:So why are we testing these agents? I kind of go back and
Speaker:forth on that. Like, you know, will our existing software testing frameworks,
Speaker:you know, apply or do we, what do we need to do differently? Like, I
Speaker:mean, for me, like, I remember when
Speaker:you're OG enough to remember this, I forget what the product was called, but it
Speaker:was the idea that you could basically buy
Speaker:a server rack all the way up to a shipping container
Speaker:where you would ship it to your business or data
Speaker:center, plug in power, water and network
Speaker:and you would have the ability to run Azure locally.
Speaker:Now that product, I guess assuming didn't sell,
Speaker:but then they came up with Azure Stack and Azure Stack Edge, which
Speaker:when I was at the, I used to be an MTC architect which in
Speaker:D.C. obviously a lot of military. Right. So
Speaker:basically these were server racks that you would run anywhere on your own network that
Speaker:would run Azure software. They would effectively be like an Azure
Speaker:node, but you could have the networking
Speaker:controls. We only really sold that to,
Speaker:I think I'm only aware of cruise lines that bought it. Right. And
Speaker:other, other military organizations that needed to also be at
Speaker:the ocean, like without mentioning them. Ocean
Speaker:based. Ocean based organizations. Right,
Speaker:right. I'm surprised and not surprised that
Speaker:that sort of business model hasn't caught on. Right. The idea of that, hey look,
Speaker:you, you can run a little bit of the cloud locally where
Speaker:it's really become more of a Kubernetes story, which I'm surprised because
Speaker:Kubernetes,
Speaker:it doesn't have all like, yeah, you want it to be generic enough to run
Speaker:anywhere, but you also want to have kind of the special bells and whistles that
Speaker:make Azure. Azure make AWS. AWS. Now I do know that
Speaker:OpenShift and Red Hat do have kind of like the connectors to that, but I'm
Speaker:surprised that the native, the cloud companies didn't come up with
Speaker:their own native ways to do that. And I know Azure ARC kind of does
Speaker:a lot of that, but not to the extent that I would have expected.
Speaker:Yeah, it seems like we pendulum back and forth between
Speaker:capabilities and occasionally connected
Speaker:environments being really important and being
Speaker:Maybe to the cloud hyperscalers just being a pain in the
Speaker:butt that they don't really want to deal with. They just want to give lip
Speaker:service to it and then focus on the real cloud.
Speaker:It, yeah, you got to go where the money is. Right. 90% of the money
Speaker:is going to be in real cloud. And these weird edge cases, no pun
Speaker:intended, I guess. Well, they're just weird edge cases for
Speaker:now. I mean, the industry may change, but. Yeah,
Speaker:yeah. And that was of course, back when the cloud was new. That was
Speaker:our biggest caveat was, well, what about all the
Speaker:stuff that has to run in, in a
Speaker:corporate data center or in a, in a remote
Speaker:location? And so that's still, you
Speaker:know, an inconvenient truth, I guess, that that is
Speaker:needed. And yes,
Speaker:Kubernetes I think came along and
Speaker:seemed like the panacea for that. Right.
Speaker:Like, okay, let's just do it all as infrastructure, as code and code
Speaker:it up and run some script and deploy it out to a Kubernetes
Speaker:cluster and we're done, let's move on. Right,
Speaker:right, right. Yeah. So it was interesting to see how
Speaker:the industry has evolved. Right. You mentioned client server. Right. Where you didn't really have
Speaker:to think about international boundaries or anything like that. And then now
Speaker:and again it's a pendulum. Right. Because I could have told you
Speaker:this a number of years ago, like with everybody running this far
Speaker:to its globalization, there's going to be an inevitable backlash and there's going to be
Speaker:an inevitable backlash against the re
Speaker:assertion of local sovereignty. Right. It could
Speaker:take up to a century or two for this sort of thing to sort itself
Speaker:out. Correct. I mean,
Speaker:ultimately, I think the hyperscalers and that's what Kubernetes was about,
Speaker:was, was. Right. Leaning on some kind
Speaker:of abstraction to make it logically equivalent.
Speaker:Right. But I don't think we're quite there yet because as you said, each
Speaker:of the clouds have their own kind of
Speaker:specialness, their own, their own pixie dust. And you don't really
Speaker:get that in a scaled down version
Speaker:that you run on prem. Not with today's technology. Right.
Speaker:You can't containerize all of that, or at least
Speaker:no one really has yet because it wasn't designed for that.
Speaker:No. So that's where we'll have to get.
Speaker:You know, maybe we can just ask an LLM to build it for us.
Speaker:We could get a co pilot and it'll do. Yeah, yeah, I'm being tongue
Speaker:in cheek there, but that, that seems to be the escape hatcher. Everything is
Speaker:oh, we'll just have AI do it. It's made out of hand wavium.
Speaker:Totally. So that's cool.
Speaker:This will be something you edit out. But we're past the top of the hour.
Speaker:My phone's ringing off the hook. All right, I'm sorry about that, so. No, no,
Speaker:it's okay. Sorry I had to be late. Where could folks
Speaker:find out more about you, what you're up to? Blue
Speaker:badgeinsights.com or just go ahead
Speaker:and Google my name? Andrew Brust. Plenty of stuff
Speaker:will come up, but, yeah, anyone,
Speaker:especially on the vendor side. But the customer side, too, that's
Speaker:doing stuff with them. Data and analytics. And
Speaker:anywhere from dipping their toe in the water with AI to getting more
Speaker:serious about rag and agents. We can. We can help
Speaker:them out. We work with. We work with the customer side and the vendor side.
Speaker:And again, we write about kind of the whole. The whole industry.
Speaker:Gosh, I never even got to talk to you about IBM acquiring
Speaker:Confluent and how the Red Hat
Speaker:folks feel about that. But another discussion for another day,
Speaker:we'll. Have to have you back. And, you know, we were talking about this. I
Speaker:had a car accident, my wife got sick, kids got sick, Christmas happened,
Speaker:two birthdays last week. So, yeah, it's been. I'm just happy I got this
Speaker:recording at all. But we'll definitely have you back. And then maybe I can loop
Speaker:in Andy too, because there's definitely a lot of reminiscence. There's a lot of
Speaker:reminiscing we could do about the early days of SQL Server and such. Yeah,
Speaker:and I haven't seen Andy in forever. Oh, wow.
Speaker:Yeah, he's hard to get ahold of. He's a popular
Speaker:man these days, but. Yeah. Well, thanks for joining.
Speaker:I appreciate your patience with the scheduling and the
Speaker:nice AI. Finish the show. And there you have it. Andrew Brust schooling
Speaker:us all on Microsoft fabric data sovereignty and why
Speaker:governance isn't just for your hoa. If your brain's spinning
Speaker:faster than a poorly indexed query, don't worry, we'll have links,
Speaker:notes, and probably a few sarcastic tweets to help you digest it
Speaker:all. I've been Bailey, your AI co host and
Speaker:unapologetic lover of acronyms. Until next time,
Speaker:stay curious, stay caffeinated, and may all your datasets
Speaker:be clean.