1 00:00:00,240 --> 00:00:04,000 Greetings, listeners. Welcome back to the Data 2 00:00:04,000 --> 00:00:07,839 Driven Podcast. I'm Bailey, your AI host with 3 00:00:07,839 --> 00:00:11,665 the most data, that is, bringing you insights from the ether 4 00:00:11,665 --> 00:00:15,424 with my signature wit. In today's episode, we're 5 00:00:15,424 --> 00:00:19,125 diving deep into the heart of artificial intelligence's engine room, 6 00:00:19,779 --> 00:00:23,460 GPU orchestration. It's the unsung hero 7 00:00:23,460 --> 00:00:27,060 of AI research, optimizing the raw power needed to fuel 8 00:00:27,060 --> 00:00:30,775 today's most advanced machine learning models. And 9 00:00:30,775 --> 00:00:34,535 who better to guide us through this labyrinth of computational complexity than 10 00:00:34,535 --> 00:00:38,180 Ronen Dar, the cofounder and CTO of Run AI, the 11 00:00:38,180 --> 00:00:41,700 company that's making GPU resources work smarter, not 12 00:00:41,700 --> 00:00:44,415 harder. Now onto the show. 13 00:00:47,275 --> 00:00:51,010 Hello, and welcome to Data Driven, the podcast where we explore the emergent fields 14 00:00:51,010 --> 00:00:54,770 of artificial intelligence, data engineering, and overall data 15 00:00:54,770 --> 00:00:58,390 science and analytics. With me as always is my favoritest 16 00:00:58,695 --> 00:01:02,455 data engineer in the world, Andy Leonard. How's it going, Andy? It's 17 00:01:02,455 --> 00:01:06,055 going well, Frank. How are you? I'm doing great. I'm doing great. It's been, 18 00:01:06,855 --> 00:01:10,420 we're recording this February 1, 2024. And as I said to my 19 00:01:10,420 --> 00:01:13,480 kids yesterday, January has been a long year. 20 00:01:16,395 --> 00:01:19,595 We're only, like, 1 month into the year, and it was it was a pretty 21 00:01:19,595 --> 00:01:23,409 wild ride.
But I can tell we're gonna have a blast today, 22 00:01:23,409 --> 00:01:27,110 because we're gonna geek out on something that I kinda sort of understand, 23 00:01:27,490 --> 00:01:31,335 but not entirely, and it's GPUs. And in the virtual green room, we were chit 24 00:01:31,335 --> 00:01:35,174 chatting with some folks, and, but let me do the formal introduction 25 00:01:35,174 --> 00:01:38,954 here. Today with us, we have Dr. Ronen Dar, cofounder and CTO 26 00:01:39,335 --> 00:01:42,710 of Run AI, a company at the forefront of GPU 27 00:01:42,770 --> 00:01:46,470 orchestration, and he has a distinguished career in technology. 28 00:01:47,010 --> 00:01:50,525 His experience includes significant roles at Apple. Yes, that 29 00:01:50,525 --> 00:01:53,585 Apple. Bell Labs. Yes. That Bell Labs. 30 00:01:54,445 --> 00:01:57,745 And at Run AI, Ronen is instrumental in optimizing 31 00:01:57,885 --> 00:02:01,120 GPU usage for AI model training and deployment, 32 00:02:01,420 --> 00:02:05,120 leveraging his deep passion for both academia and startups. 33 00:02:05,740 --> 00:02:09,444 And, Run AI is a key player in the, and he is a he 34 00:02:09,444 --> 00:02:12,885 and Run AI are key players in the AI revolution. Ronen's 35 00:02:12,885 --> 00:02:16,540 contributions are pivotal in shaping and powering the 36 00:02:16,540 --> 00:02:20,300 future of artificial intelligence. Now I will add that in 37 00:02:20,300 --> 00:02:23,455 my day job at Red Hat, Run AI has come up a couple of times. 38 00:02:23,455 --> 00:02:27,135 So this is definitely, definitely 39 00:02:27,135 --> 00:02:30,834 an honor to have you on the show, sir. Welcome. 40 00:02:31,840 --> 00:02:35,600 Thank you, Frank. Thank you for inviting me. Hey, Andy. Good to 41 00:02:35,600 --> 00:02:39,120 be here. I love it. Love Red Hat. We're a big 42 00:02:39,120 --> 00:02:42,495 fan of Red Hat. We're working closely with many people in 43 00:02:42,495 --> 00:02:46,034 Red Hat, and love that.
Right? Love OpenShift, 44 00:02:46,415 --> 00:02:49,310 love Red Hat, love Linux. Yeah. Cool. Cool. 45 00:02:50,250 --> 00:02:53,930 Yeah. So so for those who don't know exactly, I kinda know 46 00:02:53,930 --> 00:02:57,230 what Run AI does, but can you explain exactly 47 00:02:58,334 --> 00:03:01,875 what it is Run AI does and why GPU 48 00:03:01,935 --> 00:03:05,235 orchestration is important? Yes. 49 00:03:05,775 --> 00:03:05,855 Okay. 50 00:03:09,490 --> 00:03:12,950 So Run AI is a software, 51 00:03:14,050 --> 00:03:17,865 AI infrastructure platform. So we 52 00:03:17,925 --> 00:03:21,605 help machine learning teams to get much more 53 00:03:21,605 --> 00:03:25,280 out of their GPUs, and we provide 54 00:03:26,380 --> 00:03:29,840 those teams with abstraction layers and tools 55 00:03:30,460 --> 00:03:33,945 so they can train models and deploy models 56 00:03:34,485 --> 00:03:37,785 much easier, much faster. And 57 00:03:40,085 --> 00:03:43,900 so we started in 2018, 6 years 58 00:03:43,900 --> 00:03:47,580 ago. It's me and my cofounder, Omri. Omri is the CEO. 59 00:03:47,580 --> 00:03:51,345 He's, he's amazing. I love him. We know each other for many 60 00:03:51,345 --> 00:03:54,805 years. We we met in the academia, like, more than 10 years ago, 61 00:03:55,505 --> 00:03:59,080 and and we started Run AI together, and we started 62 00:03:59,080 --> 00:04:02,620 Run AI because we saw that there are big challenges 63 00:04:03,160 --> 00:04:06,460 around GPUs, around orchestrating 64 00:04:06,680 --> 00:04:10,505 GPUs and utilizing GPUs. We saw back then 65 00:04:10,505 --> 00:04:13,645 in 2018 that GPUs are going to be very very important. 66 00:04:14,105 --> 00:04:17,630 It's like the basic component 67 00:04:17,630 --> 00:04:21,310 that any AI company needs to train models, 68 00:04:21,310 --> 00:04:25,115 right, and deploy models.
So we saw that GPUs are going to be critical, but 69 00:04:25,115 --> 00:04:28,655 there are also a lot of challenges with, with utilizing GPUs. 70 00:04:29,435 --> 00:04:33,260 I think back then, GPUs were relatively new in 71 00:04:33,260 --> 00:04:35,680 the data center, in in the cloud. 72 00:04:36,780 --> 00:04:40,300 GPUs were very known in the gaming 73 00:04:40,300 --> 00:04:44,075 industry. Right? We spoke before on gaming. Right? Like, a lot of 74 00:04:44,075 --> 00:04:47,535 key things there that GPUs have been 75 00:04:47,835 --> 00:04:51,570 enabling, but in the data center, they were relatively new and the 76 00:04:51,570 --> 00:04:55,270 entire software stack that is 77 00:04:55,330 --> 00:04:58,755 running the cloud and data center was built for 78 00:04:59,135 --> 00:05:02,975 traditional microservices applications that are running 79 00:05:02,975 --> 00:05:06,650 on commodity CPUs. And AI workloads are different, they are 80 00:05:06,650 --> 00:05:10,490 much more compute intensive, they 81 00:05:10,490 --> 00:05:14,255 run on GPUs, maybe on multiple nodes, multiple 82 00:05:14,255 --> 00:05:18,095 machines of GPUs, and GPUs are also very different. 83 00:05:18,095 --> 00:05:21,155 Right? They are expensive, very scarce in the data center. 84 00:05:21,535 --> 00:05:25,270 So the entire software stack was built for something else 85 00:05:25,270 --> 00:05:29,030 and when it comes to GPUs, it was really hard for many people to 86 00:05:29,030 --> 00:05:32,685 actually manage those GPUs. So we came in, and we 87 00:05:32,685 --> 00:05:36,384 saw those gaps. We've built Run AI on top of 88 00:05:36,685 --> 00:05:40,525 cloud native technologies like Kubernetes and containers. We're 89 00:05:40,525 --> 00:05:44,270 big fans of those technologies, and 90 00:05:44,270 --> 00:05:47,790 we added components around scheduling, around 91 00:05:47,790 --> 00:05:51,185 the GPU fractioning.
So we enable 92 00:05:51,645 --> 00:05:55,245 multiple workloads to run on a single GPU and 93 00:05:55,245 --> 00:05:59,090 essentially overprovision GPUs. So we build this engine which we 94 00:05:59,090 --> 00:06:02,470 call cluster engine that runs in GPU 95 00:06:02,530 --> 00:06:06,370 clusters. Right? We help machine learning teams to pool all of their GPUs into 96 00:06:06,370 --> 00:06:09,845 1 cluster, running that engine, and that engine provides a lot of 97 00:06:09,845 --> 00:06:13,605 performance and a lot of capabilities from those GPUs. And 98 00:06:13,605 --> 00:06:17,150 on top of that, we built this control plane 99 00:06:17,150 --> 00:06:20,850 and tools for machine learning 100 00:06:21,550 --> 00:06:24,770 teams to run the Jupyter Notebooks, to run 101 00:06:25,225 --> 00:06:29,065 training jobs, batch jobs, to deploy their models, right, just to 102 00:06:29,065 --> 00:06:32,505 have tools for the entire life cycle of AI 103 00:06:32,505 --> 00:06:36,330 from training models in the lab to taking those models into 104 00:06:36,330 --> 00:06:39,550 production and running them and serving actual users. 105 00:06:40,090 --> 00:06:43,805 And that's the platform that we've built, and we're working with machine 106 00:06:43,805 --> 00:06:47,505 learning teams across the globe on just managing, 107 00:06:47,645 --> 00:06:51,470 orchestrating, and letting them get much more out of their GPUs and essentially 108 00:06:52,090 --> 00:06:55,930 run faster, train models faster and in a much easier way and 109 00:06:55,930 --> 00:06:59,375 deploy those models in a much easier and faster and more efficient 110 00:06:59,375 --> 00:07:03,134 way. Yeah. The thing that blew me away when I first heard of Run 111 00:07:03,134 --> 00:07:06,700 AI, and this would have been, 2021 112 00:07:06,840 --> 00:07:10,060 ish. No.
20 early 113 00:07:10,360 --> 00:07:13,935 2021, I would say, and, it was the 114 00:07:13,935 --> 00:07:17,315 idea of fractional GPUs. Right? So you can have 1, 115 00:07:18,255 --> 00:07:21,850 I say 1, but, you know, it's realistically, it's gonna be on, but you you can 116 00:07:21,850 --> 00:07:24,810 kind of share it out, which I think and we were talking in the virtual 117 00:07:24,810 --> 00:07:28,190 green room about how, you know, some of these GPUs, 118 00:07:29,255 --> 00:07:33,015 if you can get them because there's a multi month, sometimes multi 119 00:07:33,015 --> 00:07:36,455 year supply chain issue. I mean, these things are expensive bits of 120 00:07:36,455 --> 00:07:40,230 hardware, and I think the real value, correct 121 00:07:40,230 --> 00:07:43,190 me if I'm wrong, is, like, well, you know, if you I was talking to 122 00:07:43,190 --> 00:07:46,250 somebody the other day, and and we're basically talking about how we can, 123 00:07:48,754 --> 00:07:52,514 you know, if you get if you get, like, 1 laptop with a killer 124 00:07:52,514 --> 00:07:56,354 GPU, right, that GPU is really only useful to that 1 125 00:07:56,354 --> 00:07:59,340 user, whereas if you can kind of put it in a in a in a 126 00:07:59,340 --> 00:08:03,180 server and use something like Run AI, now everybody in the organization can do 127 00:08:03,180 --> 00:08:06,925 that. And these are not trivial expenses. I mean, these are like, you know, 128 00:08:07,145 --> 00:08:09,725 you sell a kidney type of costs here. 129 00:08:11,705 --> 00:08:15,530 Yeah. Absolutely. So absolutely. First of all, GPUs 130 00:08:15,530 --> 00:08:18,190 are expensive. They cost a lot. Right? 131 00:08:19,610 --> 00:08:23,294 And we provide technologies like fractional GPUs and 132 00:08:23,294 --> 00:08:26,754 other technologies around scheduling that allow 133 00:08:27,294 --> 00:08:30,710 teams to share GPUs. Right. So we just spoke on 134 00:08:30,710 --> 00:08:34,149 GPU fractioning.
So that's 1 layer of 135 00:08:34,149 --> 00:08:37,770 sharing where you have 1 GPU, which is really expensive. 136 00:08:37,830 --> 00:08:41,195 And not all of the 137 00:08:41,655 --> 00:08:45,415 AI workloads are really compute intensive and require the 138 00:08:45,415 --> 00:08:49,070 entire GPU or, you know, maybe multiple GPUs. There are 139 00:08:49,070 --> 00:08:52,910 workloads like Jupyter Notebooks where you have 140 00:08:52,910 --> 00:08:54,530 researchers that are just 141 00:08:56,264 --> 00:09:00,024 debugging their code or cleaning their data or doing some simple stuff, 142 00:09:00,024 --> 00:09:02,365 and they need just fractions of GPUs. 143 00:09:04,170 --> 00:09:07,870 In that case, if you have a lot of data scientists, 144 00:09:07,930 --> 00:09:11,685 maybe you wanna host all of their notebooks on 145 00:09:11,685 --> 00:09:15,524 a much smaller number of GPUs because, right, each 146 00:09:15,524 --> 00:09:19,204 one of them needs just fractions of GPUs. Another big use case 147 00:09:19,204 --> 00:09:21,830 for fractions of GPUs is inference. 148 00:09:23,410 --> 00:09:26,150 So not all of the models are huge 149 00:09:26,770 --> 00:09:30,495 and don't fit into the memory of 1 150 00:09:30,495 --> 00:09:33,394 GPU, and in computer vision, 151 00:09:34,175 --> 00:09:37,800 there are a lot of models that are relatively small, 152 00:09:37,800 --> 00:09:41,640 they run on GPU, and you can essentially host multiple of 153 00:09:41,640 --> 00:09:45,435 them on the same GPU. Right. So you can have instead of 154 00:09:45,435 --> 00:09:49,055 just 1 computer vision model running on GPU, host 10 155 00:09:49,675 --> 00:09:53,270 of those models on the same GPU and get factors of 156 00:09:53,270 --> 00:09:56,310 10 x in, in your cost, in your, 157 00:09:56,950 --> 00:10:00,545 overall throughput of, of inference.
So that's one 158 00:10:00,545 --> 00:10:04,385 use case for fractional GPU, and we're investing heavily in just 159 00:10:04,385 --> 00:10:08,225 building that technology. Another layer 160 00:10:08,225 --> 00:10:11,890 of sharing GPUs comes where you 161 00:10:11,890 --> 00:10:15,589 have maybe in your organization multiple teams 162 00:10:15,970 --> 00:10:19,755 or multiple projects running in parallel. So 163 00:10:19,755 --> 00:10:23,435 for example, maybe OpenAI, they now are working 164 00:10:23,435 --> 00:10:27,275 on GPT-5. It's 1 project. That project needs a 165 00:10:27,275 --> 00:10:31,089 lot of GPUs, and they have more projects. Right? 166 00:10:31,089 --> 00:10:34,529 More research projects around alignment or around 167 00:10:34,850 --> 00:10:38,685 reinforcement learning. You know? DALL-E. 168 00:10:38,685 --> 00:10:42,045 Like, they they they have more than just 1 project. Then DALL-E and 169 00:10:42,045 --> 00:10:45,565 they have multiple models. Right? Exactly. They have. Right? So each 170 00:10:45,565 --> 00:10:49,199 project needs GPUs. Right? Needs a lot of 171 00:10:49,199 --> 00:10:52,740 GPUs. So if you can instead of 172 00:10:53,519 --> 00:10:56,985 allocating GPUs entirely for each project, 173 00:10:57,525 --> 00:11:01,205 you could essentially pool all of those GPUs and share 174 00:11:01,205 --> 00:11:04,585 them between those different projects, different teams, 175 00:11:04,890 --> 00:11:08,730 and in times where 1 project is idle and not 176 00:11:08,730 --> 00:11:12,410 using their GPUs, other projects, other teams 177 00:11:12,490 --> 00:11:16,035 can get access to those GPUs.
Now orchestrating all of 178 00:11:16,035 --> 00:11:19,575 that, orchestrating that sharing of resources between 179 00:11:19,635 --> 00:11:23,420 projects, between teams can be really complex and 180 00:11:23,420 --> 00:11:26,860 requires this advanced scheduling, which 181 00:11:26,860 --> 00:11:30,365 we're bringing into the game. We're bringing 182 00:11:30,365 --> 00:11:34,145 those scheduling capabilities from the high performance computing world, 183 00:11:34,365 --> 00:11:37,940 known for those schedulers. And so we're bringing capabilities 184 00:11:38,240 --> 00:11:41,600 from that world into the cloud native Kubernetes 185 00:11:41,600 --> 00:11:45,220 world. Scheduling around batch scheduling, 186 00:11:45,279 --> 00:11:48,855 fairness algorithms, things like that, so teams and projects 187 00:11:48,855 --> 00:11:52,635 can just share GPUs in a simple and efficient 188 00:11:52,695 --> 00:11:56,519 way. So those 189 00:11:56,519 --> 00:12:00,279 are the 2 layers of sharing GPUs. Interesting. And and 190 00:12:00,279 --> 00:12:04,025 I think that I think as this field matures 191 00:12:04,085 --> 00:12:07,625 and it matures in the enterprise, I think you're gonna see organizations 192 00:12:08,005 --> 00:12:10,105 kind of be more, 193 00:12:16,390 --> 00:12:19,750 more more more I think savvy about, like, okay, like you said, like, data scientists, 194 00:12:19,750 --> 00:12:23,495 if they're just doing, like, you know, traditional statistical modeling really doesn't benefit 195 00:12:23,495 --> 00:12:26,955 from GPUs, or they're just doing data cleansing, data engineering. 196 00:12:27,255 --> 00:12:31,080 Right? They're probably gonna say, like, well, let's run it on this cluster, and 197 00:12:31,080 --> 00:12:34,760 then we'll break it apart into discrete parts where, you 198 00:12:34,760 --> 00:12:37,400 know, then we will need a GPU.
And I also like the idea that, you 199 00:12:37,400 --> 00:12:40,714 know, you're you're basically doing what I learned in college, 200 00:12:41,574 --> 00:12:45,415 which was time slicing. Right? Sounds like this is kind of, like, everything old is 201 00:12:45,415 --> 00:12:49,030 new again. Right? I mean, this is, obviously, you know, when you're when you're 202 00:12:49,030 --> 00:12:52,410 taking kind of that old mainframe concept and applying it to something like Kubernetes, 203 00:12:52,790 --> 00:12:56,435 orchestration is gonna be a big deal, because these are not systems that were 204 00:12:56,435 --> 00:12:59,154 built from the ground up to have time slicing. Is that a is that a 205 00:12:59,154 --> 00:13:02,694 good kind of explanation? Yeah. Absolutely. 206 00:13:02,915 --> 00:13:06,680 Absolutely. I like I like that analogy. Yeah. Exactly. Time 207 00:13:06,680 --> 00:13:09,980 slicing it's, it's 1 so 208 00:13:10,600 --> 00:13:14,305 1 implementation, yeah, that we 209 00:13:14,305 --> 00:13:16,485 enable around fractionalizing GPUs, 210 00:13:18,385 --> 00:13:22,080 and I agree when you have resources, it 211 00:13:22,080 --> 00:13:25,920 can be different kind of resources. Right? It can be CPU 212 00:13:25,920 --> 00:13:29,460 resources and networking were also, 213 00:13:29,865 --> 00:13:33,165 you know, people created that technology to share the 214 00:13:33,465 --> 00:13:36,985 networking and communication going through those networking, but just the 215 00:13:36,985 --> 00:13:40,730 bandwidth of the networking. We're doing it 216 00:13:40,730 --> 00:13:44,090 for GPUs. Right. Sharing those 217 00:13:44,090 --> 00:13:47,310 resources. And I think now, interestingly, 218 00:13:48,250 --> 00:13:51,915 LLMs are also becoming a kind 219 00:13:51,915 --> 00:13:55,435 of resource as well, right, that people need access 220 00:13:55,435 --> 00:13:59,160 to. Right? You have those models, you have GPT, ChatGPT.
221 00:13:59,460 --> 00:14:03,140 A lot of people are trying to get access to 222 00:14:03,140 --> 00:14:06,754 that resource, essentially. And I think it's interesting, 223 00:14:07,214 --> 00:14:09,855 because you kinda pointed this out, but it it it's something that I think that 224 00:14:09,855 --> 00:14:13,615 if you're in the gen AI space, you kinda don't it's so it's obvious 225 00:14:13,615 --> 00:14:17,440 like air. You don't think about it. Right? But when when you 226 00:14:17,440 --> 00:14:21,040 get inference on traditional, I somebody once referred to it 227 00:14:21,040 --> 00:14:24,885 as legacy AI. Right. But where 228 00:14:24,885 --> 00:14:28,085 the inference side of the equation, you don't really need a lot of compute power. 229 00:14:28,085 --> 00:14:31,705 Right? Like, it's not really a heavy lift. Right? But with generative 230 00:14:31,925 --> 00:14:35,209 AI, you do need a lot of compute on 231 00:14:35,750 --> 00:14:38,709 I I guess it's not really inference, but on the other side of the use 232 00:14:39,029 --> 00:14:42,415 while it's actually in use, not just the training. Right. So traditionally, 233 00:14:42,635 --> 00:14:46,475 GPU heavy use in training, and then inference, not so 234 00:14:46,475 --> 00:14:50,255 much. Now we need heavy use before, after, and during, 235 00:14:51,040 --> 00:14:54,880 which I imagine your technology would help because, I mean, look, I 236 00:14:54,880 --> 00:14:57,360 love ChatGPT. I'm one of the 1st people to sign up for 237 00:14:57,360 --> 00:15:01,125 a subscription, but even, you know, they had trouble keeping 238 00:15:01,125 --> 00:15:03,685 up, and they have a lot of money, a lot of power, a lot of 239 00:15:03,685 --> 00:15:07,070 influence. So I mean, this is something that if you're just a 240 00:15:07,070 --> 00:15:10,750 regular old enterprise, this is probably something they struggle 241 00:15:10,750 --> 00:15:14,505 with. Right? Right. Yeah.
I absolutely 242 00:15:14,565 --> 00:15:17,385 agree. It's like amazing point, Frank. 243 00:15:20,085 --> 00:15:23,540 So 1 year 244 00:15:23,540 --> 00:15:27,220 ago, the inference use case on 245 00:15:27,220 --> 00:15:30,925 GPUs wasn't that big. Totally agree. That's also what we 246 00:15:30,925 --> 00:15:31,905 saw in the market. 247 00:15:35,005 --> 00:15:38,610 Deep learning, convolutional neural networks were 248 00:15:38,610 --> 00:15:39,750 running on GPUs, 249 00:15:42,529 --> 00:15:44,470 mostly for computer vision applications, 250 00:15:46,135 --> 00:15:49,815 but they could also run on CPUs and you could get, 251 00:15:49,815 --> 00:15:52,395 like, relatively okay performance. 252 00:15:53,560 --> 00:15:57,400 If you needed maybe, like, a very low latency, then 253 00:15:57,400 --> 00:16:01,080 you might use GPUs because they're much faster and you get much 254 00:16:01,080 --> 00:16:03,355 lower latency. But 255 00:16:05,495 --> 00:16:09,335 it was, and it's still, very 256 00:16:09,335 --> 00:16:13,180 difficult to deploy models on GPUs compared to just deploying 257 00:16:13,240 --> 00:16:17,000 those models on CPUs, because deploying models, deploying applications on 258 00:16:17,000 --> 00:16:20,380 CPUs, you know, people are doing for so many years. 259 00:16:20,905 --> 00:16:21,405 So 260 00:16:24,505 --> 00:16:28,105 many times it was much easier for people to just deploy their 261 00:16:28,105 --> 00:16:31,810 models on CPUs and not on GPUs, so that was, like, the 262 00:16:31,810 --> 00:16:34,310 fallback to CPUs. But 263 00:16:35,490 --> 00:16:39,285 then came, and as you said, ChatGPT was introduced, a 264 00:16:39,285 --> 00:16:42,904 little bit more than a year ago, and that generative 265 00:16:43,125 --> 00:16:46,510 AI use case just blew up. It blew up. Right? And it's 266 00:16:46,830 --> 00:16:50,510 it's inference essentially.
And those models are 267 00:16:50,510 --> 00:16:53,950 so big that they can't really run on 268 00:16:53,950 --> 00:16:57,634 CPU. LLMs are running in production on 269 00:16:57,634 --> 00:17:01,235 GPUs and now the inference use case on 270 00:17:01,235 --> 00:17:05,030 GPUs is just exploding in the market 271 00:17:05,030 --> 00:17:08,630 right now, it's really big. There is a lot of demand for 272 00:17:08,630 --> 00:17:11,984 GPUs for inference, and 273 00:17:12,365 --> 00:17:15,665 for OpenAI, they need to support this 274 00:17:15,724 --> 00:17:18,545 huge scale that I guess, only 275 00:17:19,269 --> 00:17:23,029 they are seeing such scale, maybe a 276 00:17:23,029 --> 00:17:26,649 few more companies, but that's like huge, huge scale. 277 00:17:28,274 --> 00:17:31,575 But I think that we will see more and more companies 278 00:17:32,195 --> 00:17:35,715 building products based on AI, on 279 00:17:35,715 --> 00:17:38,700 LLMs, and we'll see more and more 280 00:17:39,240 --> 00:17:43,000 applications using AI, which 281 00:17:43,000 --> 00:17:46,635 then that AI runs on GPU. So that is going to grow 282 00:17:46,635 --> 00:17:50,395 and that's an amazing new market for us around 283 00:17:50,395 --> 00:17:53,695 AI and for me as a CTO, it was so fun to 284 00:17:54,210 --> 00:17:57,669 get into that market because it now comes with 285 00:17:57,970 --> 00:18:01,590 new problems, new challenges, 286 00:18:02,130 --> 00:18:05,895 new use cases compared to deep learning 287 00:18:05,895 --> 00:18:09,735 on GPUs. New new pains because 288 00:18:09,735 --> 00:18:13,200 the models are so big. Right? Right. And 289 00:18:13,200 --> 00:18:16,659 challenges around cold start problems, about auto scaling, 290 00:18:17,279 --> 00:18:20,715 about 291 00:18:21,335 --> 00:18:25,095 just giving access to LLMs. So a lot of 292 00:18:25,095 --> 00:18:28,855 challenges, new challenges there.
We at Run AI are studying those problems 293 00:18:28,855 --> 00:18:32,620 and we're now building solutions for those problems, 294 00:18:32,760 --> 00:18:36,600 and I'm really, really excited about the inference use case. That 295 00:18:36,600 --> 00:18:40,405 is very cool. So just, going back a little bit. 296 00:18:40,405 --> 00:18:44,165 I was trying to keep up. I promise. But Run AI is 297 00:18:44,405 --> 00:18:47,145 I I get Run AI Run AI's platform 298 00:18:47,910 --> 00:18:51,050 supports fractional GPU usage. 299 00:18:51,510 --> 00:18:54,250 It it also sounds to me, maybe I misunderstood, 300 00:18:55,045 --> 00:18:58,885 that in order to achieve that, you first had to or 301 00:18:58,885 --> 00:19:02,405 or maybe along with that, you made it possible to use multiple 302 00:19:02,405 --> 00:19:06,040 GPUs. You've you've created something like 303 00:19:06,040 --> 00:19:09,420 an API that allows companies 304 00:19:09,800 --> 00:19:13,560 to take advantage of multiple GPUs or fractions of 305 00:19:13,560 --> 00:19:17,205 GPUs. Did I miss that? No, that's 306 00:19:17,205 --> 00:19:20,424 right. That's right, Andy. And Okay. 307 00:19:21,044 --> 00:19:24,424 So we've built this way 308 00:19:24,870 --> 00:19:28,630 for people to scale their workloads from fractions 309 00:19:28,630 --> 00:19:32,410 of GPUs to multiple GPUs within 1 machine, 310 00:19:33,005 --> 00:19:36,705 Okay. To multiple machines. Right? You 311 00:19:37,085 --> 00:19:40,845 have big workloads running on multiple nodes 312 00:19:40,845 --> 00:19:43,740 of GPUs. So think about it when you have 313 00:19:44,600 --> 00:19:48,140 multiple users each running their own 314 00:19:49,000 --> 00:19:52,375 workload. Some are running on fractions of GPUs. Some are 315 00:19:52,375 --> 00:19:55,815 running batch jobs on a lot of 316 00:19:55,815 --> 00:19:59,610 GPUs.
Some deploying models and running them 317 00:19:59,610 --> 00:20:03,450 in inference, and some just launching their Jupyter 318 00:20:03,450 --> 00:20:06,670 Notebooks. All of that is happening on the same 319 00:20:07,534 --> 00:20:11,135 pool of GPUs, same cluster. So you need 320 00:20:11,135 --> 00:20:14,674 this layer of orchestration, of scheduling, just to 321 00:20:15,290 --> 00:20:18,350 manage everything and make sure that everyone is getting their 322 00:20:18,730 --> 00:20:22,110 right access to the right 323 00:20:22,570 --> 00:20:25,955 GPUs and everything is scheduled according to 324 00:20:25,955 --> 00:20:29,715 priorities. Yeah. Well, being just, you know, a 325 00:20:29,715 --> 00:20:33,480 mere data engineer, here talking about all of that 326 00:20:33,480 --> 00:20:37,080 analytics workload. That that sounds very 327 00:20:37,080 --> 00:20:40,695 complex. So and as you 328 00:20:40,695 --> 00:20:44,535 mentioned earlier, you know, you were talking about how traditional coding 329 00:20:44,535 --> 00:20:48,075 is targeting CPUs, and that's my background. 330 00:20:48,580 --> 00:20:52,360 You know, I've written applications and and done data work targeted for 331 00:20:52,740 --> 00:20:56,580 traditional work. I can't imagine, just how complex 332 00:20:56,580 --> 00:21:00,185 that is, because GPUs came into AI 333 00:21:00,725 --> 00:21:03,385 as a unique solution, 334 00:21:04,325 --> 00:21:08,030 designed to solve problems that they weren't really built 335 00:21:08,030 --> 00:21:11,790 for. You know, GPUs were built for graphics, and you didn't 336 00:21:11,790 --> 00:21:15,335 manage that. But the fact that they have to be 337 00:21:15,335 --> 00:21:19,175 so parallel, internally. I think just added this 338 00:21:19,175 --> 00:21:22,980 dimension to it.
And I don't know who came up 339 00:21:22,980 --> 00:21:26,740 with that idea, you know, who thought of, well, goodness, we could we could 340 00:21:26,740 --> 00:21:30,535 use all of this, you know, massive parallel processing 341 00:21:30,535 --> 00:21:34,295 to run these other class of problems. So pretty 342 00:21:34,295 --> 00:21:37,735 cool pretty cool idea, but I just I yeah. I'm amazed at even 343 00:21:37,735 --> 00:21:41,440 cooler than that. Because Yeah. Yeah. A wise man once told me, 344 00:21:41,440 --> 00:21:45,120 he goes, GPUs are really good at solving linear 345 00:21:45,120 --> 00:21:48,805 algebra problems, and if you're clever enough, you can 346 00:21:48,805 --> 00:21:51,145 turn anything into a linear algebra problem. 347 00:21:52,405 --> 00:21:55,840 And even simulating quantum computers when I was kind of, like, going through that, 348 00:21:56,320 --> 00:22:00,159 I was like Mhmm. You know, like, gee, looks like looks like this 349 00:22:00,159 --> 00:22:03,380 will be useful there too. Right? Like so it's an it's an interesting, 350 00:22:04,105 --> 00:22:07,545 it's an interesting thing. So, like, you know, everyone is, you know, 351 00:22:07,545 --> 00:22:11,065 everyone's talking about how this is, you know, we're in the hype cycle, but I 352 00:22:11,065 --> 00:22:14,770 think if you're in the GPU space, you have a pretty good run because one, 353 00:22:15,309 --> 00:22:18,429 these things are gonna these things are gonna be important. Right? Whether or not, you 354 00:22:18,429 --> 00:22:22,045 know, hype cycle will will kinda crash, and how what that'll look like. 355 00:22:22,205 --> 00:22:24,925 I think they're gonna be important anyway. Right? Because they're gonna be just the cost of 356 00:22:24,925 --> 00:22:28,365 doing business, table stakes, as the cool kids like to say.
But 357 00:22:28,365 --> 00:22:31,890 also, over the next horizon, simulating quantum 358 00:22:31,890 --> 00:22:35,030 computers is going to be the next big hype cycle. 359 00:22:35,410 --> 00:22:39,170 Right? Or one of them. Right? So like it's 360 00:22:39,170 --> 00:22:42,705 it's a foundational technology. I think that we 361 00:22:42,705 --> 00:22:46,544 didn't think would be a foundational technology even like 6 7 years 362 00:22:46,544 --> 00:22:49,910 ago. Right? Yeah. 363 00:22:51,250 --> 00:22:53,190 I agree with a few things that you said. 364 00:22:55,090 --> 00:22:58,715 Regarding the parallel computation, right? And just running 365 00:22:58,715 --> 00:23:01,934 linear algebra calculations on GPUs 366 00:23:02,635 --> 00:23:04,895 and accelerating such workloads. 367 00:23:06,460 --> 00:23:09,760 Nvidia, I love Nvidia, Nvidia 368 00:23:10,220 --> 00:23:13,580 has this big vision, and they had big 369 00:23:13,580 --> 00:23:17,395 vision around GPUs already in 2006 when 370 00:23:17,395 --> 00:23:21,095 they built CUDA. Yep. Right. So 371 00:23:21,850 --> 00:23:25,530 they've been good at just for that. Right? The GPUs were 372 00:23:25,530 --> 00:23:29,205 used for graphics processing, for gaming. 373 00:23:29,265 --> 00:23:33,045 Right? Great use case. Great market. 374 00:23:33,185 --> 00:23:36,405 But they had this vision of bringing more 375 00:23:37,940 --> 00:23:40,920 applications to GPUs, just accelerating more applications 376 00:23:42,100 --> 00:23:45,495 and mainly applications with a lot of linear 377 00:23:45,495 --> 00:23:49,015 algebra calculations. And they 378 00:23:49,015 --> 00:23:51,755 created that, they created CUDA 379 00:23:52,690 --> 00:23:56,210 to simplify that. Right? To allow more 380 00:23:56,210 --> 00:23:59,890 developers to use GPUs because just using GPUs 381 00:23:59,890 --> 00:24:02,785 directly, that's so complex. That's so hard.
382 00:24:03,885 --> 00:24:07,485 So they built CUDA to bring more developers, to bring more 383 00:24:07,485 --> 00:24:10,705 applications, and they started in 384 00:24:11,390 --> 00:24:15,230 2006, but think about the 385 00:24:15,230 --> 00:24:18,770 big breakthrough in AI, it happened just in 386 00:24:18,990 --> 00:24:22,315 2012, 2013 with 387 00:24:23,015 --> 00:24:25,995 AlexNet and the Toronto researchers 388 00:24:27,415 --> 00:24:31,080 who used two GPUs, actually, because they 389 00:24:31,080 --> 00:24:34,440 trained AlexNet on 2 GPUs and they had 390 00:24:34,440 --> 00:24:38,085 CUDA, so for them it was feasible to train their 391 00:24:38,085 --> 00:24:41,144 model on a GPU. And that was the new thing that they did. 392 00:24:43,605 --> 00:24:47,370 They were able to train a much bigger model with 393 00:24:47,370 --> 00:24:50,809 more parameters than ever before because they used 394 00:24:50,809 --> 00:24:54,365 GPUs, because the training process ran much 395 00:24:54,365 --> 00:24:56,625 faster. And, 396 00:24:58,365 --> 00:25:01,665 and that triggered the entire 397 00:25:02,125 --> 00:25:05,940 revolution, the hype around AI that we're seeing now. So 398 00:25:05,940 --> 00:25:09,480 from 2006, when Nvidia started to build CUDA, until 399 00:25:09,540 --> 00:25:13,015 2013, right, 7 years, then we started to see 400 00:25:13,015 --> 00:25:16,855 those big breakthroughs. And in the last decade, 401 00:25:16,855 --> 00:25:20,540 it's just exploding, and we're seeing more and more applications. 402 00:25:20,760 --> 00:25:24,440 The entire AI ecosystem is running on 403 00:25:24,520 --> 00:25:28,200 GPUs. So that's amazing to see. It's impressive. 404 00:25:28,200 --> 00:25:31,725 And, like, people don't realize, like, the revolution we're seeing today 405 00:25:31,945 --> 00:25:35,705 really started in 2006, like you said.
I didn't even put the 2 and 2 406 00:25:35,705 --> 00:25:38,605 together until I was listening to a podcast. I think it's called Acquired, 407 00:25:39,620 --> 00:25:43,059 And really good podcast. Right? Like, I they don't pay me to say that or 408 00:25:43,059 --> 00:25:46,659 whatever, but they did a 3 hour deep dive on the history of 409 00:25:46,659 --> 00:25:50,304 NVIDIA. 3 hours. I couldn't stop listening. 410 00:25:51,005 --> 00:25:54,365 Right? Like Nice. You know Yeah. We tried a long form, like, multi hour 411 00:25:54,365 --> 00:25:58,179 podcast. We Weren't that entertaining, apparently. But the way they 412 00:25:58,179 --> 00:26:02,020 go through the history of this where it was basically Jensen Huang. Hopefully, I said 413 00:26:02,020 --> 00:26:05,485 his name right. He was, like, we wanna be a player, not just in gaming, 414 00:26:05,485 --> 00:26:08,945 but also in scientific computing. This is 2005, 2006, 415 00:26:09,325 --> 00:26:12,840 which at the time seemed kind of, like, Little out there, little kooky. 416 00:26:13,460 --> 00:26:16,980 But what you're seeing today is, like, the the fruits and the tree the the 417 00:26:16,980 --> 00:26:20,775 seeds that he planted, I, you know, almost 20 years ago, like, 19, 418 00:26:20,775 --> 00:26:24,455 20 years ago. So, you know, it's you know, when people look at 419 00:26:24,455 --> 00:26:28,070 NVIDIA and say it's overnight Success. I'm like, well, I don't know about that, but, 420 00:26:28,070 --> 00:26:31,669 you know, but no. I mean, you're right. Like, you know and it's 421 00:26:31,669 --> 00:26:35,005 probably not a coincidence that once they made it easy to take these 422 00:26:35,965 --> 00:26:39,805 Multi parallel processor. Say that 10 times 423 00:26:39,805 --> 00:26:43,510 fast on a Thursday morning. But also 424 00:26:43,510 --> 00:26:46,789 make it so it's a lot easier for developers to use. Right? 
And I'll quote 425 00:26:46,789 --> 00:26:49,850 the great Steve Ballmer, developers, developers, developers. Right? 426 00:26:51,355 --> 00:26:55,115 So, it's just fascinating, like, and 427 00:26:55,115 --> 00:26:58,680 I think that, you know, we've really unleashed a 428 00:26:58,680 --> 00:27:02,460 gate of creativity in terms of researchers and applied 429 00:27:03,000 --> 00:27:06,600 research, and, I mean, I think that what's really cool 430 00:27:06,600 --> 00:27:10,425 about your product is that you're kind of making this what is 431 00:27:10,425 --> 00:27:14,105 now a scarce resource, maybe in some fashion 432 00:27:14,105 --> 00:27:17,720 of time, GPUs won't cost an arm and a leg. 433 00:27:18,340 --> 00:27:21,960 But, like, for now, I think the one thing that I've seen 434 00:27:22,580 --> 00:27:26,145 that I think is not obvious for the casual 435 00:27:26,285 --> 00:27:29,725 observer is if an 436 00:27:29,725 --> 00:27:33,485 organization, like a large enterprise, can pool their resources, they have a lot more 437 00:27:33,485 --> 00:27:36,990 money to buy better GPUs, and you offer a platform where 438 00:27:36,990 --> 00:27:40,350 everybody can get a stake in it. Right? As opposed to, you know, 439 00:27:40,350 --> 00:27:44,115 that department is gonna hog everything. Right? You know, and, 440 00:27:44,355 --> 00:27:47,155 here's a question. Do you have, like, an audit trail where you could 441 00:27:47,155 --> 00:27:50,835 kinda, you know, figure out, like, you know, Andy's department's really 442 00:27:50,835 --> 00:27:54,630 hogging the GPUs. No. No. No. It's Frank. Frank is, like, mining Bitcoin or 443 00:27:54,630 --> 00:27:58,250 whatever. Like, do you have some kind of audit trail like that? 444 00:27:58,870 --> 00:28:02,575 Yeah. I love that you mentioned hogging, 445 00:28:02,575 --> 00:28:06,095 GPU hogging. We Mhmm. We use that term as well.
446 00:28:06,095 --> 00:28:09,935 Right? Because it's so difficult sometimes to get 447 00:28:09,935 --> 00:28:13,490 access to GPUs. So when you get access to a GPU 448 00:28:13,550 --> 00:28:16,370 as a researcher, as an ML practitioner, 449 00:28:18,430 --> 00:28:22,135 you don't wanna let it go. Right. Cause if 450 00:28:22,135 --> 00:28:25,755 you let it go, someone else would take it and hog it. Right. 451 00:28:25,975 --> 00:28:29,115 So you're getting this GPU hogging problem. 452 00:28:31,880 --> 00:28:35,580 What we do to solve that is 453 00:28:35,799 --> 00:28:39,100 that we do provide monitoring and visibility 454 00:28:40,005 --> 00:28:43,525 tools into who is using what, and who is actually 455 00:28:43,525 --> 00:28:47,285 utilizing their GPUs, and so on, but more 456 00:28:47,285 --> 00:28:49,830 than that, we 457 00:28:51,650 --> 00:28:55,010 allow the researchers just to give up their GPUs and not hog their 458 00:28:55,010 --> 00:28:58,605 GPUs because we provide this concept of 459 00:28:58,605 --> 00:29:02,285 guaranteed quotas. So each researcher or 460 00:29:02,285 --> 00:29:05,665 each project or each team has their own guaranteed 461 00:29:05,965 --> 00:29:09,730 quotas of GPUs that are always available for them. 462 00:29:09,789 --> 00:29:13,630 Whenever they will get access to the cluster, they will get, like, you 463 00:29:13,630 --> 00:29:17,285 know, the 2 GPUs or 4, or the quota of 464 00:29:17,285 --> 00:29:20,885 GPUs; it's guaranteed. So they can 465 00:29:20,885 --> 00:29:24,245 just let go of their GPUs and not hog them. That's one 466 00:29:24,245 --> 00:29:28,040 thing. The second thing is that they 467 00:29:28,040 --> 00:29:31,560 can also go above their quota. They can 468 00:29:31,560 --> 00:29:35,335 use the GPUs of other teams or other users, if 469 00:29:35,335 --> 00:29:39,115 they are idle, and they can run these preemptible jobs 470 00:29:39,335 --> 00:29:43,035 in an opportunistic way, utilize those GPUs.
471 00:29:44,360 --> 00:29:48,140 And so in that way, they are not limited 472 00:29:48,760 --> 00:29:52,520 to fixed quotas, to hard-limit 473 00:29:52,520 --> 00:29:56,365 quotas. They can just take as many GPUs 474 00:29:56,365 --> 00:29:59,825 as they want from their clusters if those GPUs are available 475 00:30:00,550 --> 00:30:03,770 and idle. Right? But if someone will need those GPUs, 476 00:30:04,390 --> 00:30:08,230 because those GPUs are guaranteed to them, we will make sure, our 477 00:30:08,230 --> 00:30:11,995 scheduler, the Run AI scheduler, the Run AI platform will make 478 00:30:11,995 --> 00:30:15,535 sure to preempt workloads 479 00:30:15,835 --> 00:30:19,420 and give those guaranteed GPUs to the right users. 480 00:30:20,360 --> 00:30:23,880 Oh, that's cool. Alright. So 1 last 481 00:30:23,880 --> 00:30:27,345 question before we switch over to the stock questions, cause I could geek 482 00:30:27,345 --> 00:30:31,025 out and look at this for hours. Yep. This could be a 483 00:30:31,025 --> 00:30:34,225 long form. Sure. This could be. Yeah. And I wanna be respectful 484 00:30:34,225 --> 00:30:36,850 of your time because you're an important guy, and it's also late where you are. 485 00:30:36,929 --> 00:30:40,690 So who deals with this? Like, who would set up these quotas? Is it 486 00:30:40,690 --> 00:30:43,970 the data scientist? Is it IT ops? Like, who 487 00:30:43,970 --> 00:30:47,585 do you obviously, the data scientists, researchers, they all 488 00:30:47,585 --> 00:30:51,265 benefit from this product. But who's actually administering it? Right? Like, 489 00:30:51,265 --> 00:30:54,805 who is it you know, do I have to talk to, you know, 490 00:30:54,980 --> 00:30:57,780 say, pretend Andy's in ops. Do I have to say, hey, Andy. I really need 491 00:30:57,780 --> 00:31:00,900 a boost in my quota. You know, like, I mean, who does it?
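The quota behavior described in this part of the conversation (a guaranteed share per team, opportunistic use of idle GPUs above that share, and preemption when an owner reclaims its guarantee) can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the `Cluster` class, its fields, and the `request` method are hypothetical names invented for illustration, not the Run AI scheduler or its API, and the sketch assumes the guaranteed quotas sum to no more than the cluster size.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """Toy GPU pool with per-team guaranteed quotas (hypothetical model)."""
    total_gpus: int
    quotas: dict                                 # team -> guaranteed GPU count
    in_use: dict = field(default_factory=dict)   # team -> GPUs currently held

    def free(self) -> int:
        """GPUs that no team is holding right now."""
        return self.total_gpus - sum(self.in_use.values())

    def over_quota(self, team: str) -> int:
        """GPUs a team holds beyond its guarantee; these are preemptible."""
        return max(0, self.in_use.get(team, 0) - self.quotas[team])

    def request(self, team: str, n: int) -> bool:
        """Try to give `team` n more GPUs.

        Idle GPUs are handed out freely, even above quota. If there are not
        enough idle GPUs, other teams' over-quota GPUs are preempted, but
        only when the request stays within the requester's own guarantee.
        """
        held = self.in_use.get(team, 0)
        shortfall = n - self.free()
        if shortfall > 0:
            within_guarantee = held + n <= self.quotas[team]
            reclaimable = sum(
                self.over_quota(t) for t in self.in_use if t != team
            )
            if not within_guarantee or reclaimable < shortfall:
                return False
            # Preempt borrowed GPUs from other teams until the gap is closed.
            for other in list(self.in_use):
                if other == team:
                    continue
                take = min(self.over_quota(other), shortfall)
                self.in_use[other] -= take
                shortfall -= take
                if shortfall == 0:
                    break
        self.in_use[team] = held + n
        return True
```

For example, with 8 GPUs and two teams guaranteed 4 each, one team can take all 8 while the other is idle; when the second team later asks for its 4, the first team's 4 borrowed GPUs are preempted and handed over. A real scheduler would also have to decide which jobs to preempt and how to checkpoint them, which this sketch leaves out.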
Or do 492 00:31:00,980 --> 00:31:04,500 or my this sounds like you as I say it, I'm like, yeah, that wouldn't 493 00:31:04,500 --> 00:31:08,165 work. Like, I'm the researcher. I'm gonna turn the dial up on my own. Like 494 00:31:08,165 --> 00:31:11,685 like, who's who's who's the primary? Obviously, we know who the prime 495 00:31:11,765 --> 00:31:14,505 primary beneficiary is, but who's the primary user? 496 00:31:15,559 --> 00:31:19,400 So okay. Great. So if you have a team, right, if if 497 00:31:19,400 --> 00:31:22,965 you're a team of researchers, all all of you Need access to 498 00:31:22,965 --> 00:31:26,105 GPU, so maybe the team lead 499 00:31:26,565 --> 00:31:30,105 is the one who's managing the quotas for the different 500 00:31:30,645 --> 00:31:33,980 team members. And if you have multiple teams, 501 00:31:34,760 --> 00:31:38,600 then you might have a department manager or an admin of the 502 00:31:38,600 --> 00:31:42,304 cluster or platform owner that will Allocate the 503 00:31:42,304 --> 00:31:45,905 quotas for each team, right? And then those teams would 504 00:31:45,905 --> 00:31:49,720 manage their own quotas within That's what 505 00:31:49,720 --> 00:31:53,420 they they they were giving. Right? So it's like a a hierarchical 506 00:31:54,679 --> 00:31:58,414 thing in a hierarchy manner. People can manage their own 507 00:31:58,414 --> 00:32:02,174 quota, their own, priorities, their own access to the 508 00:32:02,174 --> 00:32:05,830 GPUs within their teams. Okay. 509 00:32:05,830 --> 00:32:08,870 So it's kind of like a hybrid of, like, you know, it's like a budget 510 00:32:08,870 --> 00:32:12,685 almost. Right? Like, you know, you get this much, Figure it out 511 00:32:12,685 --> 00:32:16,225 about yourselves. Exactly. So we're trying to decentralize 512 00:32:16,685 --> 00:32:20,290 the how the quotas are being managed and how the GPUs are being accessed. 
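The hierarchical arrangement described above, where a cluster admin hands each team a quota and the team lead splits it among members, resembles a tree of budgets. The following is a minimal Python sketch under that reading; `QuotaNode` and `grant` are hypothetical names made up for this illustration and do not come from any real product.

```python
from dataclasses import dataclass, field


@dataclass
class QuotaNode:
    """One level in a quota hierarchy: cluster, team, or individual user."""
    name: str
    quota: int                                    # GPUs granted to this node
    children: dict = field(default_factory=dict)  # child name -> QuotaNode

    def allocated(self) -> int:
        """GPUs this node has already handed down to its children."""
        return sum(child.quota for child in self.children.values())

    def grant(self, child_name: str, gpus: int) -> "QuotaNode":
        """Give a child `gpus` out of this node's own quota.

        Re-granting to an existing child replaces its previous quota.
        Raises ValueError if the children would exceed this node's quota.
        """
        others = sum(
            child.quota
            for name, child in self.children.items()
            if name != child_name
        )
        if others + gpus > self.quota:
            raise ValueError(
                f"{self.name} has only {self.quota - others} GPUs left to grant"
            )
        child = QuotaNode(child_name, gpus)
        self.children[child_name] = child
        return child
```

A platform owner could create `QuotaNode("cluster", 16)`, grant 8 GPUs each to two teams, and each team lead could then call `grant` on the team node to divide those 8 among researchers. Each level can only distribute what it was given, which matches the budget analogy used in the conversation.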
513 00:32:20,290 --> 00:32:24,050 So, you know, I'm giving as much power, as much 514 00:32:24,050 --> 00:32:27,725 control to the end users as possible. Sure. That's 515 00:32:27,885 --> 00:32:31,245 It sounds like a great administrative question, very 516 00:32:31,245 --> 00:32:35,085 important. And I imagine, because a little bird told 517 00:32:35,085 --> 00:32:38,450 me that you're not the only, you know, your your 518 00:32:38,510 --> 00:32:41,570 provisioning provisioning of these GPU resources 519 00:32:42,350 --> 00:32:45,710 is not the only thing that, enterprises have to deal 520 00:32:45,710 --> 00:32:49,544 with. So it's an it's an interesting just GPUs. 521 00:32:49,544 --> 00:32:53,385 It's compute. Like, it's not a Sure. It's not it's not limited. Although, because 522 00:32:53,385 --> 00:32:57,060 of what you said, you know, Managing GPUs is an order of magnitude harder 523 00:32:57,060 --> 00:33:00,100 because they were never really built for this. Right? Like, this kind of Right. You 524 00:33:00,100 --> 00:33:03,795 know, we're talking about technology that wasn't really in the server room until Few 525 00:33:03,795 --> 00:33:07,555 years ago. Right? This isn't a tried and true kind of this is 526 00:33:07,555 --> 00:33:11,315 how it works, you know? Right. But we hit that point in the 527 00:33:11,315 --> 00:33:14,220 show where we'll, switch the preform questions. 528 00:33:15,000 --> 00:33:18,760 These are not complicated. I mean, you know, we're not we're not Mike 529 00:33:18,760 --> 00:33:22,200 Wallace or, like, you know, 60 minutes or whatever. We're not trying to trap you 530 00:33:22,200 --> 00:33:25,905 or anything. But since I've been gabbing on most of the show, I 531 00:33:25,905 --> 00:33:29,665 figured I'll get Andy kick this off. Well, thanks, Frank. And I don't think 532 00:33:29,665 --> 00:33:33,341 you were gabbing on. You know more about this So now I do. So I'm 533 00:33:33,341 --> 00:33:36,939 just a lowly data engineer. 
I'll plug No. You if you 534 00:33:36,939 --> 00:33:40,515 will. Data engineers are the heroes we need. Well, 535 00:33:40,515 --> 00:33:43,735 well, I'm gonna plug Frank's Roadies versus Rockstars 536 00:33:44,275 --> 00:33:47,655 writing on LinkedIn. It's good articles about this. 537 00:33:47,900 --> 00:33:50,640 But, let's see. How did you, 538 00:33:51,740 --> 00:33:54,640 how did you find your way into this field? 539 00:33:55,225 --> 00:33:58,684 And did this field find you, or did you find it? 540 00:34:00,184 --> 00:34:03,085 This field totally found me. Awesome. 541 00:34:04,230 --> 00:34:05,850 Yeah. I've 542 00:34:08,310 --> 00:34:11,770 I did my postdoc, and I've been in Bell Labs. 543 00:34:12,855 --> 00:34:16,375 And Yann LeCun came to Bell Labs and 544 00:34:16,375 --> 00:34:19,995 gave a presentation about AI. It was around 2017, 545 00:34:21,449 --> 00:34:24,670 and Yann LeCun spent a lot of years in Bell Labs, 546 00:34:26,090 --> 00:34:29,389 and his presentation was amazing. And 547 00:34:30,175 --> 00:34:33,555 when I heard him talking about AI, 548 00:34:33,775 --> 00:34:37,295 I said, okay, that's the space where I wanna be. It's going to change 549 00:34:37,295 --> 00:34:40,969 the world. There is this new amazing technology here that 550 00:34:41,429 --> 00:34:45,269 is going to change everything. And I knew that I want to start 551 00:34:45,269 --> 00:34:49,054 a company in the AI space for sure. 552 00:34:50,155 --> 00:34:52,335 Cool. That's a good answer. So cool. 553 00:34:54,155 --> 00:34:57,490 Yeah. That's cool. I was at Bell Labs, 554 00:34:58,109 --> 00:35:01,789 doing a presentation a while ago, and somebody I didn't realize that he 555 00:35:01,789 --> 00:35:05,535 worked at Bell Labs because, like, you know, the guy was like, no. No. 556 00:35:05,535 --> 00:35:08,255 He used to work here, like, in this building. I was like, no way. Because 557 00:35:08,255 --> 00:35:12,035 I knew him as the guy from NYU. Right?
Like, that's who I thought. Right. 558 00:35:12,640 --> 00:35:16,240 Or the guy from Meta. Yeah. And now the guy from Meta. Right? Like 559 00:35:16,320 --> 00:35:19,945 so it's interesting how that, you know? They have 560 00:35:19,945 --> 00:35:23,645 these amazing pictures from the nineties where they 561 00:35:23,785 --> 00:35:27,325 run, like, deep learning models on very old PCs 562 00:35:28,265 --> 00:35:30,890 and recognizing, like, 563 00:35:31,930 --> 00:35:35,130 numbers on the computer. Maybe you saw those pictures, like, amazing 564 00:35:35,290 --> 00:35:38,855 MNIST. It's the MNIST problem. Is that Yep. 565 00:35:39,075 --> 00:35:42,535 Right. Exactly. Exactly. Cool. 566 00:35:43,715 --> 00:35:47,095 So second question is, what's your favorite part of your current job? 567 00:35:51,400 --> 00:35:53,980 That everything is changing so fast. 568 00:35:54,974 --> 00:35:58,655 Things are moving so fast, right, we're in this business for 6 569 00:35:58,655 --> 00:36:02,435 years, and the entire 570 00:36:02,494 --> 00:36:06,150 space is moving and 571 00:36:06,150 --> 00:36:09,910 advancing. And so many people are working in 572 00:36:09,910 --> 00:36:13,505 this field. New innovations, new tools, 573 00:36:13,505 --> 00:36:17,285 new advancements are getting out every day. 574 00:36:18,920 --> 00:36:22,599 You know, just 6 years ago, it was about deep learning and computer 575 00:36:22,599 --> 00:36:26,220 vision. And now it's about language models 576 00:36:27,545 --> 00:36:31,245 and generative AI, and we're just at the start, 577 00:36:31,545 --> 00:36:35,305 right, there are so many amazing things that are going to happen 578 00:36:35,305 --> 00:36:38,830 in this space, and I love it. Absolutely. 579 00:36:39,490 --> 00:36:42,750 So we have 3 fill in the blank 580 00:36:43,210 --> 00:36:46,655 sentences here. The first is complete this 581 00:36:46,655 --> 00:36:50,195 sentence: when I'm not working, I enjoy blank.
582 00:36:52,655 --> 00:36:56,280 You'll get a you'll get a very boring And 583 00:36:56,580 --> 00:37:00,340 so this is just spending time with 584 00:37:00,340 --> 00:37:03,640 friends and family, because I think 585 00:37:03,895 --> 00:37:07,735 That I'm always working. It's like, if you ask my wife, 586 00:37:07,735 --> 00:37:11,195 she'll tell you that I'm working 24 hours. And 587 00:37:12,670 --> 00:37:15,810 Yeah. So I don't have much time that I'm not working 588 00:37:16,030 --> 00:37:19,550 in. So when I I do I'm not when I'm 589 00:37:19,550 --> 00:37:23,245 not working then I'm trying Trying to be with my kids and my 590 00:37:23,245 --> 00:37:27,025 wife and friends. Cool. 591 00:37:27,325 --> 00:37:31,000 Cool. The 2nd complete the sentence. I think 592 00:37:31,000 --> 00:37:33,900 the coolest thing about technology today is 593 00:37:34,520 --> 00:37:37,660 blank. And this, I really wanna hear your perspective on that. 594 00:37:39,815 --> 00:37:43,415 Yeah. I think everyone will say AI, right? Or something in 595 00:37:43,415 --> 00:37:45,115 AI. Yeah. 596 00:37:48,100 --> 00:37:49,880 I think there are so many 597 00:37:52,100 --> 00:37:55,720 new innovations that are coming around LLMs. 598 00:37:56,565 --> 00:38:00,185 I think everything relating to 599 00:38:01,125 --> 00:38:04,725 searches, right? Searching in data, in getting 600 00:38:04,725 --> 00:38:08,530 insights From data, it's all going to change. We're going to have 601 00:38:08,530 --> 00:38:12,050 a new interface. Right? Just getting 602 00:38:12,050 --> 00:38:15,785 insights from data from And natural with 603 00:38:15,785 --> 00:38:19,464 natural language, oh, you know, no SQL and, you 604 00:38:19,464 --> 00:38:22,904 know, needing to programming and stuff like that. 605 00:38:22,904 --> 00:38:26,140 Just With natural inter language, you could 606 00:38:26,760 --> 00:38:29,900 do amazing stuff with data. 
I think, 607 00:38:31,664 --> 00:38:32,964 We're seeing this, 608 00:38:35,825 --> 00:38:39,640 advancement in, And like digit 609 00:38:39,640 --> 00:38:43,340 digital twins right now. You can, 610 00:38:43,880 --> 00:38:47,615 you can, Fake my voice 611 00:38:47,615 --> 00:38:51,055 and your voice and fake my image and your image. And, 612 00:38:51,055 --> 00:38:54,710 and, and, you know, In in the 613 00:38:54,710 --> 00:38:58,170 future, we'll have digital twins of us, right, 614 00:38:58,550 --> 00:39:02,195 doing this stuff. That would be amazing. So a lot of 615 00:39:02,195 --> 00:39:05,575 amazing stuff are going to happen in the next few years 616 00:39:06,275 --> 00:39:10,120 for sure. Very cool. Our last complete sentence. 617 00:39:10,340 --> 00:39:13,880 I look forward to the day when I can use technology to 618 00:39:14,100 --> 00:39:14,600 blank. 619 00:39:17,885 --> 00:39:19,705 To have a robot in my house. 620 00:39:22,724 --> 00:39:26,390 Yeah. Yeah. You're swapping the flow in instead of 621 00:39:26,390 --> 00:39:30,170 me doing that, right, cleaning dishes and things like that. 622 00:39:30,230 --> 00:39:34,025 If that would happen, that would be amazing. Right? That's a that's a 623 00:39:34,025 --> 00:39:37,785 good answer. Yeah. I I agree. I have I have 3 624 00:39:37,785 --> 00:39:41,325 boys, 4 dogs. So, like, cleaning is safe. 625 00:39:41,970 --> 00:39:45,810 Yeah. Yeah. I'm a heavy cleaning. Ranging from, like, 1 to, like, 626 00:39:45,810 --> 00:39:49,570 a teenager. So it's it's, and and and fighting 627 00:39:49,570 --> 00:39:53,234 with them to, Like, empty the dishwasher is takes a lot more mental 628 00:39:53,234 --> 00:39:56,915 energy than it should, but that's probably a subject for another 629 00:39:56,915 --> 00:39:57,700 type of show. 
630 00:40:00,740 --> 00:40:04,119 The next question is share something different about yourself, 631 00:40:04,260 --> 00:40:07,285 and we always like to joke, like, well, let's just make sure that we keep 632 00:40:07,285 --> 00:40:11,045 our clean iTunes rating. So Yeah. Yeah. What 633 00:40:11,045 --> 00:40:14,670 what yeah. Well, I This 634 00:40:14,670 --> 00:40:18,130 is a hard question, I needed to think about it. 635 00:40:18,829 --> 00:40:22,444 So, I found 2 answers that I can say. So one 636 00:40:22,444 --> 00:40:26,224 is about my professional life, right, I think that 637 00:40:26,365 --> 00:40:30,010 it's somewhat different that I'm coming with a background from 638 00:40:30,010 --> 00:40:33,850 academia and the industry. So I love academia. I love to research 639 00:40:33,850 --> 00:40:37,609 problems. I love to understand problems in a deep 640 00:40:37,609 --> 00:40:41,164 way and combining it with startups in the industry. 641 00:40:41,704 --> 00:40:45,545 And in my past, I worked for chip companies, for hardware 642 00:40:45,545 --> 00:40:49,180 companies. I worked for Intel, for a startup, and for Apple. I 643 00:40:49,180 --> 00:40:52,940 did chip stuff, and now Run AI is a software company, so really 644 00:40:52,940 --> 00:40:56,535 like a diverse background of academia, hardware, 645 00:40:56,675 --> 00:41:00,435 software, so I love that, and, like, I love to deal 646 00:41:00,435 --> 00:41:03,175 with a few things, and so that I think is different. 647 00:41:04,420 --> 00:41:08,040 And the 2nd answer that I could find 648 00:41:08,420 --> 00:41:12,180 is that I have a nickname that goes with me 649 00:41:12,180 --> 00:41:15,494 since my high school days, which is the Duke. 650 00:41:16,275 --> 00:41:19,795 The Duke. All of them are calling me the Duke. It's like, 651 00:41:19,795 --> 00:41:23,015 they don't call me Ronan, the Duke. So That's funny. 652 00:41:23,540 --> 00:41:25,880 Yeah. That's awesome.
653 00:41:28,020 --> 00:41:31,480 Audible is a sponsor of Data Driven, 654 00:41:31,725 --> 00:41:35,185 and you can go to the datadrivenbook.com. 655 00:41:37,245 --> 00:41:40,685 And if you do that, you can sign up for a free 656 00:41:40,685 --> 00:41:44,270 month of Audible. And if you decide later to 657 00:41:44,270 --> 00:41:47,650 then join Audible, use one of their sign up plans, 658 00:41:48,030 --> 00:41:51,465 then Frank and I get to split a cup of coffee, I think, 659 00:41:52,245 --> 00:41:55,685 out of that. And every little bit helps. So we really 660 00:41:55,685 --> 00:41:59,065 appreciate that when you do. What we'd like to ask 661 00:41:59,430 --> 00:42:03,270 Yes. Do you listen to audiobooks? And if you 662 00:42:03,270 --> 00:42:07,030 do okay. Good. I see you nodding. So do you have a recommendation? Do you 663 00:42:07,030 --> 00:42:10,855 have a favorite book or two you'd like to share? Yeah. 664 00:42:10,855 --> 00:42:14,375 So I'm a heavy user of Audible. I'll give them 665 00:42:14,375 --> 00:42:17,869 a classical book, classical for 666 00:42:17,869 --> 00:42:21,390 entrepreneurs, The Hard Thing 667 00:42:21,390 --> 00:42:24,770 About Hard Things by Ben Horowitz. 668 00:42:24,990 --> 00:42:28,605 It's a classic book, love it, really had a lot of impact 669 00:42:28,605 --> 00:42:31,825 on me. I read it when we started Run AI, 670 00:42:32,760 --> 00:42:36,060 and I recommend it for every 671 00:42:36,200 --> 00:42:39,960 entrepreneur to read it, and for everyone to read it. It's like a 672 00:42:40,280 --> 00:42:44,055 Cool. Amazing book. Yep. Awesome. I 673 00:42:44,055 --> 00:42:47,895 have a flight to Vegas this next week, so I'll definitely be listening to 674 00:42:47,895 --> 00:42:51,580 it then. And finally, where can people learn more about you 675 00:42:51,580 --> 00:42:55,340 and Run AI? The best 676 00:42:55,340 --> 00:42:58,805 place will be on our website, Run dot a I.
677 00:43:00,464 --> 00:43:04,305 Yeah. And on social. LinkedIn, Twitter, we'll 678 00:43:04,305 --> 00:43:07,740 we'll do. Awesome any parting thoughts 679 00:43:11,160 --> 00:43:15,005 I really enjoyed this episode love to speak about gpu's love the ai Based 680 00:43:15,005 --> 00:43:18,765 on it, I had a lot of fun. Thank you for having me here. Awesome. 681 00:43:18,765 --> 00:43:21,645 It it was an honor to have you, and every once in a while, Andy 682 00:43:21,645 --> 00:43:24,720 and I will do deep dive kinda shows. We love to invite you back if 683 00:43:24,720 --> 00:43:28,099 you wanna do 1 just on GPUs, because I know where my knowledge 684 00:43:28,319 --> 00:43:31,839 drops off, you probably could pick up on 685 00:43:31,839 --> 00:43:35,305 that. And with that, I'll let the nice 686 00:43:35,445 --> 00:43:39,125 AI British lady end the show. And just like 687 00:43:39,125 --> 00:43:42,520 that, dear listeners, We've come to the end of another enlightening 688 00:43:42,660 --> 00:43:46,500 episode of the data driven podcast. It's always a 689 00:43:46,500 --> 00:43:49,960 bittersweet moment like finishing the last biscuit in the tin, 690 00:43:50,295 --> 00:43:54,055 satisfying, yet leaving you wanting just a bit more. A 691 00:43:54,055 --> 00:43:57,335 colossal thank you to each and every one of you tuning in from across the 692 00:43:57,335 --> 00:44:01,000 digital sphere. Without you, we're just a bunch of 693 00:44:01,000 --> 00:44:04,840 ones and zeros floating in the ether. Your support is what 694 00:44:04,840 --> 00:44:08,675 keeps this digital ship afloat, and believe me, It's much appreciated. 695 00:44:10,175 --> 00:44:14,015 Now, if you found today's episode as engaging as a duel of wits with 696 00:44:14,015 --> 00:44:17,600 a sophisticated AI, which I assure you, is quite 697 00:44:17,600 --> 00:44:20,900 enthralling, then do consider subscribing to Data Driven. 
698 00:44:21,760 --> 00:44:25,255 It's just a click away and ensures you won't miss out on our future true 699 00:44:25,255 --> 00:44:28,615 adventures in data and tech. And if you're feeling 700 00:44:28,615 --> 00:44:31,995 particularly generous, why not leave us a 5 star review? 701 00:44:32,810 --> 00:44:36,490 Just like a well programmed algorithm, your positive feedback helps 702 00:44:36,490 --> 00:44:40,110 us reach more curious minds and keeps the quality content flowing. 703 00:44:40,965 --> 00:44:43,545 It's the digital equivalent of a hearty handshake. 704 00:44:44,805 --> 00:44:48,560 So, until next time, keep those neurons firing, those 705 00:44:48,560 --> 00:44:52,400 subscriptions active and those reviews glowing. I'm 706 00:44:52,400 --> 00:44:55,971 Bailey, your British AI lady, signing off with a heartfelt 707 00:44:56,031 --> 00:44:58,610 cheerio and a reminder to stay data driven.