1
00:00:00,240 --> 00:00:04,000
Greetings, listeners. Welcome back to the Data

2
00:00:04,000 --> 00:00:07,839
Driven Podcast. I'm Bailey, your AI host with

3
00:00:07,839 --> 00:00:11,665
the most data, that is, bringing you insights from the ether

4
00:00:11,665 --> 00:00:15,424
with my signature wit. In today's episode, we're

5
00:00:15,424 --> 00:00:19,125
diving deep into the heart of artificial intelligence's engine room,

6
00:00:19,779 --> 00:00:23,460
GPU orchestration. It's the unsung hero

7
00:00:23,460 --> 00:00:27,060
of AI research, optimizing the raw power needed to fuel

8
00:00:27,060 --> 00:00:30,775
today's most advanced machine learning models. And

9
00:00:30,775 --> 00:00:34,535
who better to guide us through this labyrinth of computational complexity than

10
00:00:34,535 --> 00:00:38,180
Ronen Dar, the cofounder and CTO of Run AI, the

11
00:00:38,180 --> 00:00:41,700
company that's making GPU resources work smarter, not

12
00:00:41,700 --> 00:00:44,415
harder. Now onto the show.

13
00:00:47,275 --> 00:00:51,010
Hello, and welcome to Data Driven, the podcast where we explore the emergent fields

14
00:00:51,010 --> 00:00:54,770
of artificial intelligence, data engineering, and overall data

15
00:00:54,770 --> 00:00:58,390
science and analytics. With me as always is my favoritest

16
00:00:58,695 --> 00:01:02,455
data engineer in the world, Andy Leonard. How's it going, Andy? It's

17
00:01:02,455 --> 00:01:06,055
going well, Frank. How are you? I'm doing great. I'm doing great. It's been,

18
00:01:06,855 --> 00:01:10,420
we're recording this February 1, 2024. And as I said to my

19
00:01:10,420 --> 00:01:13,480
kids yesterday, January has been a long year.

20
00:01:16,395 --> 00:01:19,595
We're only, like, 1 month into the year, and it was it was a pretty

21
00:01:19,595 --> 00:01:23,409
wild ride. But I can tell we're gonna have a blast today,

22
00:01:23,409 --> 00:01:27,110
because we're gonna geek out on something that I kinda sort of understand,

23
00:01:27,490 --> 00:01:31,335
but not entirely, and it's GPUs. And in the virtual green room, we were chit

24
00:01:31,335 --> 00:01:35,174
chatting with some folks, and, but let me do the formal introduction

25
00:01:35,174 --> 00:01:38,954
here. Today with us, we have Dr. Ronen Dar, cofounder and CTO

26
00:01:39,335 --> 00:01:42,710
of Run AI, a company at the forefront of GPU

27
00:01:42,770 --> 00:01:46,470
orchestration, and he has a distinguished career in technology.

28
00:01:47,010 --> 00:01:50,525
His experience includes significant roles at Apple. Yes, that

29
00:01:50,525 --> 00:01:53,585
Apple. Bell Labs. Yes. That Bell Labs.

30
00:01:54,445 --> 00:01:57,745
And at Run AI, Ronen is instrumental in optimizing

31
00:01:57,885 --> 00:02:01,120
GPU usage for AI model training and deployment,

32
00:02:01,420 --> 00:02:05,120
leveraging his deep passion for both academia and startups.

33
00:02:05,740 --> 00:02:09,444
And, Run AI is a key player in the, and he is a he

34
00:02:09,444 --> 00:02:12,885
and Run AI are key players in the AI revolution. Ronen's

35
00:02:12,885 --> 00:02:16,540
contributions are pivotal in shaping and powering the

36
00:02:16,540 --> 00:02:20,300
future of artificial intelligence. Now I will add that in

37
00:02:20,300 --> 00:02:23,455
my day job at Red Hat, Run AI has come up a couple of times.

38
00:02:23,455 --> 00:02:27,135
So this is definitely, definitely

39
00:02:27,135 --> 00:02:30,834
an honor to have you on the show, sir. Welcome.

40
00:02:31,840 --> 00:02:35,600
Thank you, Frank. Thank you for inviting me. Hey, Andy. Good to

41
00:02:35,600 --> 00:02:39,120
be here. I love it. Love Red Hat. We're a big

42
00:02:39,120 --> 00:02:42,495
fan of Red Hat. We're working closely with many people in

43
00:02:42,495 --> 00:02:46,034
Red Hat, and love that. Right? Love OpenShift,

44
00:02:46,415 --> 00:02:49,310
love Red Hat, love Linux. Yeah. Cool. Cool.

45
00:02:50,250 --> 00:02:53,930
Yeah. So so for those who don't know exactly, I kinda know

46
00:02:53,930 --> 00:02:57,230
what Run AI does, but can you explain exactly

47
00:02:58,334 --> 00:03:01,875
what it is Run AI does and why GPU

48
00:03:01,935 --> 00:03:05,235
orchestration is important. Yes.

49
00:03:05,775 --> 00:03:05,855
Okay.

50
00:03:09,490 --> 00:03:12,950
So Run AI is a software,

51
00:03:14,050 --> 00:03:17,865
AI infrastructure platform. So we

52
00:03:17,925 --> 00:03:21,605
help machine learning teams to get much more

53
00:03:21,605 --> 00:03:25,280
out of their GPUs, and we provide

54
00:03:26,380 --> 00:03:29,840
those teams with abstraction layers and tools

55
00:03:30,460 --> 00:03:33,945
so they can train models and deploy models

56
00:03:34,485 --> 00:03:37,785
much easier, much faster. And

57
00:03:40,085 --> 00:03:43,900
so we started in 2018, six years

58
00:03:43,900 --> 00:03:47,580
ago. It's me and my cofounder, Omri. Omri is the CEO.

59
00:03:47,580 --> 00:03:51,345
He's, he's amazing. I love him. We know each other for many

60
00:03:51,345 --> 00:03:54,805
years. We met in academia, like, more than 10 years ago,

61
00:03:55,505 --> 00:03:59,080
and we started Run AI together, and we started

62
00:03:59,080 --> 00:04:02,620
Run AI because we saw that there are big challenges

63
00:04:03,160 --> 00:04:06,460
around GPUs, around orchestrating

64
00:04:06,680 --> 00:04:10,505
GPUs and utilizing GPUs. We saw back then,

65
00:04:10,505 --> 00:04:13,645
in 2018, that GPUs are going to be very, very important.

66
00:04:14,105 --> 00:04:17,630
It's like the basic component

67
00:04:17,630 --> 00:04:21,310
that any AI company needs to train models,

68
00:04:21,310 --> 00:04:25,115
right, and deploy models. So we saw that GPUs are going to be critical, but

69
00:04:25,115 --> 00:04:28,655
there are also a lot of challenges with, with utilizing GPUs.

70
00:04:29,435 --> 00:04:33,260
I think back then, GPUs were relatively new in

71
00:04:33,260 --> 00:04:35,680
the data center, in the cloud.

72
00:04:36,780 --> 00:04:40,300
GPUs were very well known in the gaming

73
00:04:40,300 --> 00:04:44,075
industry. Right? We spoke before on gaming. Right? Like, a lot of

74
00:04:44,075 --> 00:04:47,535
key things there that GPUs have been

75
00:04:47,835 --> 00:04:51,570
enabling. But in the data center, they were relatively new, and the

76
00:04:51,570 --> 00:04:55,270
entire software stack that

77
00:04:55,330 --> 00:04:58,755
is running the cloud and data centers was built for

78
00:04:59,135 --> 00:05:02,975
traditional microservices applications that are running

79
00:05:02,975 --> 00:05:06,650
on commodity CPUs. And AI workloads are different. They are

80
00:05:06,650 --> 00:05:10,490
much more compute intensive, they

81
00:05:10,490 --> 00:05:14,255
run on GPUs, maybe on multiple nodes, multiple

82
00:05:14,255 --> 00:05:18,095
machines of GPUs, and GPUs are also very different.

83
00:05:18,095 --> 00:05:21,155
Right? They are expensive, very scarce in the data center.

84
00:05:21,535 --> 00:05:25,270
So the entire software stack was built for something else,

85
00:05:25,270 --> 00:05:29,030
and when it comes to GPUs, it was really hard for many people to

86
00:05:29,030 --> 00:05:32,685
actually manage those GPUs. So we came in, and we

87
00:05:32,685 --> 00:05:36,384
saw those gaps. We've built Run AI on top of

88
00:05:36,685 --> 00:05:40,525
cloud native technologies like Kubernetes and containers. We're

89
00:05:40,525 --> 00:05:44,270
big fans of those technologies, and

90
00:05:44,270 --> 00:05:47,790
we added components around scheduling, around

91
00:05:47,790 --> 00:05:51,185
the GPU fractioning. So we enable

92
00:05:51,645 --> 00:05:55,245
multiple workloads to run on a single GPU and

93
00:05:55,245 --> 00:05:59,090
essentially overprovision GPUs. So we built this engine, which we

94
00:05:59,090 --> 00:06:02,470
call cluster engine, that runs in GPU

95
00:06:02,530 --> 00:06:06,370
clusters. Right? We help machine learning teams to pool all of their GPUs into

96
00:06:06,370 --> 00:06:09,845
one cluster running that engine, and that engine provides a lot of

97
00:06:09,845 --> 00:06:13,605
performance and a lot of capabilities from those GPUs. And

98
00:06:13,605 --> 00:06:17,150
on top of that, we built this control plane and

99
00:06:17,150 --> 00:06:20,850
tools for machine learning

100
00:06:21,550 --> 00:06:24,770
teams to run the Jupyter Notebooks, to run

101
00:06:25,225 --> 00:06:29,065
training jobs, batch jobs, to deploy their models, right, just to

102
00:06:29,065 --> 00:06:32,505
have tools for the entire life cycle of AI

103
00:06:32,505 --> 00:06:36,330
from training models in the lab to taking those models into

104
00:06:36,330 --> 00:06:39,550
production and running them and serving actual users.

105
00:06:40,090 --> 00:06:43,805
And that's the platform that we've built, and we're working with machine

106
00:06:43,805 --> 00:06:47,505
learning teams across the globe on just managing,

107
00:06:47,645 --> 00:06:51,470
orchestrating, and letting them get much more out of their GPUs and essentially

108
00:06:52,090 --> 00:06:55,930
run faster, train models faster and in a much easier way, and

109
00:06:55,930 --> 00:06:59,375
deploy those models in a much easier and faster and more efficient

110
00:06:59,375 --> 00:07:03,134
way. Yeah. The thing that blew me away when I first heard of Run

111
00:07:03,134 --> 00:07:06,700
AI, and this would have been, 2021

112
00:07:06,840 --> 00:07:10,060
ish. No. 20 early

113
00:07:10,360 --> 00:07:13,935
2021, I would say, and it was the

114
00:07:13,935 --> 00:07:17,315
idea of fractional GPUs. Right? So you can have one,

115
00:07:18,255 --> 00:07:21,850
I say one, but, you know, it's realistically, it's gonna be on, but you can

116
00:07:21,850 --> 00:07:24,810
kind of share it out, which I think and we were talking in the virtual

117
00:07:24,810 --> 00:07:28,190
green room about how, you know, some of these GPUs,

118
00:07:29,255 --> 00:07:33,015
if you can get them, because there's a multi month, sometimes multi

119
00:07:33,015 --> 00:07:36,455
year supply chain issue. I mean, these things are expensive bits of

120
00:07:36,455 --> 00:07:40,230
hardware, and I think the real value, correct

121
00:07:40,230 --> 00:07:43,190
me if I'm wrong, is, like, well, you know, if you I was talking to

122
00:07:43,190 --> 00:07:46,250
somebody the other day, and and we're basically talking about how we can,

123
00:07:48,754 --> 00:07:52,514
you know, if you get if you get, like, 1 laptop with a killer

124
00:07:52,514 --> 00:07:56,354
GPU, right, that GPU is really only useful to that 1

125
00:07:56,354 --> 00:07:59,340
user, whereas if you can kind of put it in a

126
00:07:59,340 --> 00:08:03,180
server and use something like RunAI, now everybody in the organization can do

127
00:08:03,180 --> 00:08:06,925
that. And these are not trivial expenses. I mean, these are like, you know,

128
00:08:07,145 --> 00:08:09,725
you sell a kidney type of costs here.

129
00:08:11,705 --> 00:08:15,530
Yeah. Absolutely. So Absolutely. First of all, GPUs

130
00:08:15,530 --> 00:08:18,190
are expensive. They cost a lot. Right?

131
00:08:19,610 --> 00:08:23,294
And we provide technologies like fractional GPUs and

132
00:08:23,294 --> 00:08:26,754
other technologies around scheduling that allow

133
00:08:27,294 --> 00:08:30,710
teams to share GPUs. Right. So we just spoke on

134
00:08:30,710 --> 00:08:34,149
GPU fractioning. So that's one layer of

135
00:08:34,149 --> 00:08:37,770
sharing where you have 1 GPU, which is really expensive.

136
00:08:37,830 --> 00:08:41,195
And not all of the workloads are

137
00:08:41,655 --> 00:08:45,415
AI workloads are really compute intensive and require the

138
00:08:45,415 --> 00:08:49,070
entire GPU or, you know, maybe multiple GPUs. There are

139
00:08:49,070 --> 00:08:52,910
workloads like Jupyter Notebooks where you have

140
00:08:52,910 --> 00:08:54,530
researchers that just

141
00:08:56,264 --> 00:09:00,024
debugging their code or cleaning their data or doing some simple stuff,

142
00:09:00,024 --> 00:09:02,365
and they need just fractions of GPUs.

143
00:09:04,170 --> 00:09:07,870
In that case, if you have, a lot of data scientists,

144
00:09:07,930 --> 00:09:11,685
maybe you wanna host all of their notebooks on

145
00:09:11,685 --> 00:09:15,524
a much smaller number of GPUs because, right, each

146
00:09:15,524 --> 00:09:19,204
one of them, it's just fractions of GPUs. Another big use case

147
00:09:19,204 --> 00:09:21,830
for fractions of GPUs is inference.

148
00:09:23,410 --> 00:09:26,150
So not all of the models are huge

149
00:09:26,770 --> 00:09:30,495
and don't fit into the memory of one

150
00:09:30,495 --> 00:09:33,394
GPU, and in computer vision,

151
00:09:34,175 --> 00:09:37,800
there are a lot of models that are relatively small,

152
00:09:37,800 --> 00:09:41,640
they run on GPU, and you can essentially host multiple of

153
00:09:41,640 --> 00:09:45,435
them on the same GPU. Right. So you can have instead of

154
00:09:45,435 --> 00:09:49,055
just 1 computer vision model running on GPU, host 10

155
00:09:49,675 --> 00:09:53,270
of those models on the same GPU and get factors of

156
00:09:53,270 --> 00:09:56,310
10x in your cost, in your

157
00:09:56,950 --> 00:10:00,545
overall throughput of inference. So that's one

158
00:10:00,545 --> 00:10:04,385
use case for fractional GPU, and we're investing heavily just

159
00:10:04,385 --> 00:10:08,225
building that technology. Another layer

160
00:10:08,225 --> 00:10:11,890
of sharing GPUs comes where you

161
00:10:11,890 --> 00:10:15,589
have maybe in your organization multiple teams

162
00:10:15,970 --> 00:10:19,755
or multiple projects running in parallel. So

163
00:10:19,755 --> 00:10:23,435
for example, maybe OpenAI, they now are working

164
00:10:23,435 --> 00:10:27,275
on GPT-5. It's one project. That project needs a

165
00:10:27,275 --> 00:10:31,089
lot of GPUs. And they have more projects. Right?

166
00:10:31,089 --> 00:10:34,529
More research projects around alignment or around

167
00:10:34,850 --> 00:10:38,685
reinforcement learning. You know? DALL-E.

168
00:10:38,685 --> 00:10:42,045
Like, they have more than just one project. Then DALL-E, and

169
00:10:42,045 --> 00:10:45,565
they have multiple models. Right? Exactly. They have. Right? So each

170
00:10:45,565 --> 00:10:49,199
project needs GPUs. Right? Needs a lot of

171
00:10:49,199 --> 00:10:52,740
GPUs. So if you can instead of

172
00:10:53,519 --> 00:10:56,985
allocating GPUs entirely for each project,

173
00:10:57,525 --> 00:11:01,205
you could essentially pool all of those GPUs and share

174
00:11:01,205 --> 00:11:04,585
them between those different projects, different teams,

175
00:11:04,890 --> 00:11:08,730
and in times where one project is idle and not

176
00:11:08,730 --> 00:11:12,410
using their GPUs, other projects, other teams can share

177
00:11:12,490 --> 00:11:16,035
can get access to those GPUs. Now orchestrating all of

178
00:11:16,035 --> 00:11:19,575
that, orchestrating that sharing of resources between

179
00:11:19,635 --> 00:11:23,420
projects, between teams, can be really complex and

180
00:11:23,420 --> 00:11:26,860
requires this advanced scheduling, which

181
00:11:26,860 --> 00:11:30,365
which we're bringing into the game. We're bringing

182
00:11:30,365 --> 00:11:34,145
those scheduling capabilities from the high performance computing world

183
00:11:34,365 --> 00:11:37,940
known for those schedulers. And so we're bringing capabilities

184
00:11:38,240 --> 00:11:41,600
from that world into the cloud native Kubernetes

185
00:11:41,600 --> 00:11:45,220
world. Scheduling around batch scheduling,

186
00:11:45,279 --> 00:11:48,855
fairness algorithms, things like that, so teams and projects

187
00:11:48,855 --> 00:11:52,635
can just share GPUs in a simple and efficient

188
00:11:52,695 --> 00:11:56,519
way. So those

189
00:11:56,519 --> 00:12:00,279
are the two layers of sharing GPUs. Interesting. And

190
00:12:00,279 --> 00:12:04,025
I think that as this field matures

191
00:12:04,085 --> 00:12:07,625
and it matures in the enterprise, I think you're gonna see organizations

192
00:12:08,005 --> 00:12:10,105
kind of be more,

193
00:12:16,390 --> 00:12:19,750
more savvy, I think, about, like, okay, like you said, like, data scientists,

194
00:12:19,750 --> 00:12:23,495
if they're just doing, like, you know, traditional statistical modeling really doesn't benefit

195
00:12:23,495 --> 00:12:26,955
from GPUs, or they're just doing data cleansing, data engineering.

196
00:12:27,255 --> 00:12:31,080
Right? They're probably gonna say, like, well, let's run it on this cluster, and

197
00:12:31,080 --> 00:12:34,760
then we'll break it apart into discrete parts where, you

198
00:12:34,760 --> 00:12:37,400
know, then we will need a GPU. And I also like the idea that, you

199
00:12:37,400 --> 00:12:40,714
know, you're basically doing what I learned in college,

200
00:12:41,574 --> 00:12:45,415
which was time slicing. Right? Sounds like this is kind of, like, everything old is

201
00:12:45,415 --> 00:12:49,030
new again. Right? I mean, this is, obviously, you know, when you're

202
00:12:49,030 --> 00:12:52,410
taking kind of that old mainframe concept and applying it to something like Kubernetes,

203
00:12:52,790 --> 00:12:56,435
orchestration is gonna be a big deal, because these are systems that were not

204
00:12:56,435 --> 00:12:59,154
built from the ground up to have time slicing. Is that a

205
00:12:59,154 --> 00:13:02,694
good kind of explanation? Yeah. Absolutely.

206
00:13:02,915 --> 00:13:06,680
Absolutely. I like I like that analogy. Yeah. Exactly. Time

207
00:13:06,680 --> 00:13:09,980
slicing, it's, it's one, so

208
00:13:10,600 --> 00:13:14,305
one implementation, yeah, that we

209
00:13:14,305 --> 00:13:16,485
enable around fractionalizing GPUs,

210
00:13:18,385 --> 00:13:22,080
and I agree, when you have resources, it

211
00:13:22,080 --> 00:13:25,920
can be different kind of resources. Right? It can be CPU

212
00:13:25,920 --> 00:13:29,460
resources and networking were also,

213
00:13:29,865 --> 00:13:33,165
you know, as people created that technology to share the

214
00:13:33,465 --> 00:13:36,985
networking and communication going through those networking, but just the

215
00:13:36,985 --> 00:13:40,730
bandwidth of the networking. We're doing it

216
00:13:40,730 --> 00:13:44,090
for GPUs. Right. Sharing those

217
00:13:44,090 --> 00:13:47,310
resources. And I think now, interestingly,

218
00:13:48,250 --> 00:13:51,915
LLMs are also becoming a kind

219
00:13:51,915 --> 00:13:55,435
of resource as well, right, that people need access

220
00:13:55,435 --> 00:13:59,160
to. Right? You have those models, you have GPT, ChatGPT.

221
00:13:59,460 --> 00:14:03,140
A lot of people are trying to get access to

222
00:14:03,140 --> 00:14:06,754
that resource, essentially. And I think it's interesting,

223
00:14:07,214 --> 00:14:09,855
because you kinda pointed this out, but it it it's something that I think that

224
00:14:09,855 --> 00:14:13,615
if you're in the gen AI space, you kinda don't... it's so obvious,

225
00:14:13,615 --> 00:14:17,440
like air. You don't think about it. Right? But when you

226
00:14:17,440 --> 00:14:21,040
get inference on traditional, somebody once referred to it

227
00:14:21,040 --> 00:14:24,885
as legacy AI. Right. But where

228
00:14:24,885 --> 00:14:28,085
the inference side of the equation, you don't really need a lot of compute power.

229
00:14:28,085 --> 00:14:31,705
Right? Like, it's not really a heavy lift. Right? But with generative

230
00:14:31,925 --> 00:14:35,209
AI, you do need a lot of compute on

231
00:14:35,750 --> 00:14:38,709
I guess it's not really inference, but on the other side of the use

232
00:14:39,029 --> 00:14:42,415
while it's actually in use, not just the training. Right. So traditionally,

233
00:14:42,635 --> 00:14:46,475
GPU heavy use in training, and then inference, not so

234
00:14:46,475 --> 00:14:50,255
much. Now we need heavy use before, after, and during,

235
00:14:51,040 --> 00:14:54,880
which I imagine your technology would help because, I mean, look, I love chat I

236
00:14:54,880 --> 00:14:57,360
love ChatGPT. I'm one of the first people to sign up for

237
00:14:57,360 --> 00:15:01,125
a subscription, but even, you know, they had trouble keeping

238
00:15:01,125 --> 00:15:03,685
up, and they have a lot of money, a lot of power, a lot of

239
00:15:03,685 --> 00:15:07,070
influence. So I mean, this is something that if you're just a

240
00:15:07,070 --> 00:15:10,750
regular old enterprise, this is probably something they struggle

241
00:15:10,750 --> 00:15:14,505
with. Right? Right. Yeah. I absolutely

242
00:15:14,565 --> 00:15:17,385
agree. It's like amazing point, Frank.

243
00:15:20,085 --> 00:15:23,540
So 1 year

244
00:15:23,540 --> 00:15:27,220
ago, the inference use case on

245
00:15:27,220 --> 00:15:30,925
GPUs wasn't that big. Totally agree. That's also what we

246
00:15:30,925 --> 00:15:31,905
saw in the market.

247
00:15:35,005 --> 00:15:38,610
Deep learning convolutional neural networks were

248
00:15:38,610 --> 00:15:39,750
running on GPUs,

249
00:15:42,529 --> 00:15:44,470
mostly for computer vision applications,

250
00:15:46,135 --> 00:15:49,815
but they could also run on CPUs, and you could get,

251
00:15:49,815 --> 00:15:52,395
like, relatively okay performance.

252
00:15:53,560 --> 00:15:57,400
If you needed maybe, like, a very low latency, then

253
00:15:57,400 --> 00:16:01,080
you might use GPUs because they're much faster and you get much

254
00:16:01,080 --> 00:16:03,355
lower latency. But

255
00:16:05,495 --> 00:16:09,335
it was, it was all, and it's still very

256
00:16:09,335 --> 00:16:13,180
difficult to deploy models on GPUs compared to just deploying

257
00:16:13,240 --> 00:16:17,000
those models on CPUs, because deploying models, deploying applications on

258
00:16:17,000 --> 00:16:20,380
CPUs, you know, people have been doing for so many years.

259
00:16:20,905 --> 00:16:21,405
So

260
00:16:24,505 --> 00:16:28,105
many times it was much easier for people to just deploy their

261
00:16:28,105 --> 00:16:31,810
models on CPUs and not on GPUs, so that was, like, the

262
00:16:31,810 --> 00:16:34,310
fallback to CPUs. But

263
00:16:35,490 --> 00:16:39,285
then came, and as you said, ChatGPT was introduced a

264
00:16:39,285 --> 00:16:42,904
little bit more than a year ago, and that generative

265
00:16:43,125 --> 00:16:46,510
AI use case just blew up. It blew up. Right? And it's

266
00:16:46,830 --> 00:16:50,510
it's inference essentially. And those models are

267
00:16:50,510 --> 00:16:53,950
so big that they can't really run on

268
00:16:53,950 --> 00:16:57,634
CPU. LLMs are running in production on

269
00:16:57,634 --> 00:17:01,235
GPUs, and now the inference use case on

270
00:17:01,235 --> 00:17:05,030
GPUs is just exploding. In the market

271
00:17:05,030 --> 00:17:08,630
right now, it's really big. There is a lot of demand for

272
00:17:08,630 --> 00:17:11,984
GPUs for inference. And

273
00:17:12,365 --> 00:17:15,665
for OpenAI, they need to support this

274
00:17:15,724 --> 00:17:18,545
huge scale that I guess, just

275
00:17:19,269 --> 00:17:23,029
they are seeing such scale, maybe a

276
00:17:23,029 --> 00:17:26,649
few more companies, but that's like huge, huge scale.

277
00:17:28,274 --> 00:17:31,575
But I think that we will see more and more companies

278
00:17:32,195 --> 00:17:35,715
building products based on AI, on

279
00:17:35,715 --> 00:17:38,700
LLMs, and we'll see more and more

280
00:17:39,240 --> 00:17:43,000
applications using AI, which

281
00:17:43,000 --> 00:17:46,635
then that AI runs on GPUs. So that is going to grow,

282
00:17:46,635 --> 00:17:50,395
and that's an amazing new market for us around

283
00:17:50,395 --> 00:17:53,695
AI and for me as a CTO, it was so fun to

284
00:17:54,210 --> 00:17:57,669
get into that market because it now comes with

285
00:17:57,970 --> 00:18:01,590
new problems, new challenges,

286
00:18:02,130 --> 00:18:05,895
new use cases compared to deep learning

287
00:18:05,895 --> 00:18:09,735
on GPUs. New pains because

288
00:18:09,735 --> 00:18:13,200
the models are so big. Right? Right. And

289
00:18:13,200 --> 00:18:16,659
challenges around cold start problems, about auto scaling,

290
00:18:17,279 --> 00:18:20,715
about

291
00:18:21,335 --> 00:18:25,095
just, giving access to LLMs. So a lot of

292
00:18:25,095 --> 00:18:28,855
challenges, new challenges there. We at Run AI are studying those problems,

293
00:18:28,855 --> 00:18:32,620
and we're now building solutions for those problems,

294
00:18:32,760 --> 00:18:36,600
and I'm really, really excited about the inference use case. That

295
00:18:36,600 --> 00:18:40,405
is very cool. So just, going back a little bit.

296
00:18:40,405 --> 00:18:44,165
I was trying to keep up. I promise. But Run AI is

297
00:18:44,405 --> 00:18:47,145
I get Run AI's platform

298
00:18:47,910 --> 00:18:51,050
supports fractional GPU usage.

299
00:18:51,510 --> 00:18:54,250
It it also sounds to me, maybe I misunderstood,

300
00:18:55,045 --> 00:18:58,885
that in order to achieve that, you first had to, or

301
00:18:58,885 --> 00:19:02,405
or maybe along with that, you made it possible to use multiple

302
00:19:02,405 --> 00:19:06,040
GPUs. You've created something like

303
00:19:06,040 --> 00:19:09,420
an API that allows companies

304
00:19:09,800 --> 00:19:13,560
to take advantage of multiple GPUs or fractions of

305
00:19:13,560 --> 00:19:17,205
GPUs. Did I miss that? No, that's

306
00:19:17,205 --> 00:19:20,424
right. That's right, Andy. And Okay.

307
00:19:21,044 --> 00:19:24,424
So we've built this way

308
00:19:24,870 --> 00:19:28,630
for people to scale their workloads from fractions

309
00:19:28,630 --> 00:19:32,410
of GPUs to multiple GPUs within 1 machine,

310
00:19:33,005 --> 00:19:36,705
Okay. To multiple machines. Right? You

311
00:19:37,085 --> 00:19:40,845
have big workloads running on multiple nodes

312
00:19:40,845 --> 00:19:43,740
of GPUs. So think about it when you have

313
00:19:44,600 --> 00:19:48,140
multiple users each running their own

314
00:19:49,000 --> 00:19:52,375
workload. Some are running on fractions of GPUs. Some are

315
00:19:52,375 --> 00:19:55,815
running batch jobs on a lot of

316
00:19:55,815 --> 00:19:59,610
GPUs. Some are deploying models and running them

317
00:19:59,610 --> 00:20:03,450
in inference, and some are just launching their Jupyter

318
00:20:03,450 --> 00:20:06,670
Notebooks. All of that is happening on the same

319
00:20:07,534 --> 00:20:11,135
pool of GPUs, same cluster. So you need

320
00:20:11,135 --> 00:20:14,674
this layer of orchestration, of scheduling, just to

321
00:20:15,290 --> 00:20:18,350
manage everything and make sure that everyone is getting the

322
00:20:18,730 --> 00:20:22,110
right access to the right

323
00:20:22,570 --> 00:20:25,955
GPUs, and everything is scheduled according to

324
00:20:25,955 --> 00:20:29,715
priorities. Yeah. Well, being just, you know, a

325
00:20:29,715 --> 00:20:33,480
mere data engineer, here talking about all of that

326
00:20:33,480 --> 00:20:37,080
analytics workload. That that sounds very

327
00:20:37,080 --> 00:20:40,695
complex. So and as you

328
00:20:40,695 --> 00:20:44,535
mentioned earlier, you know, you were talking about how traditional coding

329
00:20:44,535 --> 00:20:48,075
is targeting CPUs, and that's my background.

330
00:20:48,580 --> 00:20:52,360
You know, I've written applications and done data work targeted for

331
00:20:52,740 --> 00:20:56,580
traditional work. I can't imagine just how complex

332
00:20:56,580 --> 00:21:00,185
that is, because GPUs came into AI

333
00:21:00,725 --> 00:21:03,385
as a unique solution,

334
00:21:04,325 --> 00:21:08,030
designed to solve problems that they weren't really built

335
00:21:08,030 --> 00:21:11,790
for. You know, GPUs were built for graphics, and you didn't

336
00:21:11,790 --> 00:21:15,335
manage that. But the fact that they have to be

337
00:21:15,335 --> 00:21:19,175
so parallel internally, I think, just added this

338
00:21:19,175 --> 00:21:22,980
dimension to it. And I don't know who came up

339
00:21:22,980 --> 00:21:26,740
with that idea, you know, who thought of, well, goodness, we could

340
00:21:26,740 --> 00:21:30,535
use all of this, you know, massively parallel processing

341
00:21:30,535 --> 00:21:34,295
to run these other classes of problems. So pretty

342
00:21:34,295 --> 00:21:37,735
cool pretty cool idea, but I just I yeah. I'm amazed at even

343
00:21:37,735 --> 00:21:41,440
cooler than that. Because Yeah. Yeah. A wise man once told me,

344
00:21:41,440 --> 00:21:45,120
he goes, GPUs are really good at solving linear

345
00:21:45,120 --> 00:21:48,805
algebra problems, and if you're clever enough, you can

346
00:21:48,805 --> 00:21:51,145
turn anything into a linear algebra problem.

347
00:21:52,405 --> 00:21:55,840
And even simulating quantum computers when I was kind of, like, going through that,

348
00:21:56,320 --> 00:22:00,159
I was like Mhmm. You know, like, gee, looks like looks like this

349
00:22:00,159 --> 00:22:03,380
will be useful there too. Right? Like so it's an it's an interesting,

350
00:22:04,105 --> 00:22:07,545
It's an interesting thing. So, like, you know, everyone is, you know,

351
00:22:07,545 --> 00:22:11,065
everyone's talking about how this is, you know, we're in the hype cycle, but I

352
00:22:11,065 --> 00:22:14,770
think if you're in the GPU space, you have a pretty good run because, one,

353
00:22:15,309 --> 00:22:18,429
these things are gonna these things are gonna be important. Right? Whether or not, you

354
00:22:18,429 --> 00:22:22,045
know, hype cycle will will kinda crash, and how what that'll look like.

355
00:22:22,205 --> 00:22:24,925
I think they're gonna be important anyway. Right? Because they're gonna be just the cost of

356
00:22:24,925 --> 00:22:28,365
doing business, table stakes, as the cool kids like to say. But

357
00:22:28,365 --> 00:22:31,890
also, over the next horizon, simulating quantum

358
00:22:31,890 --> 00:22:35,030
computers is going to be the next big hype cycle.

359
00:22:35,410 --> 00:22:39,170
Right? Or one of them. Right? So like it's

360
00:22:39,170 --> 00:22:42,705
it's a foundational technology. I think that we

361
00:22:42,705 --> 00:22:46,544
didn't think would be a foundational technology even, like, six, seven years

362
00:22:46,544 --> 00:22:49,910
ago. Right? Yeah.

363
00:22:51,250 --> 00:22:53,190
I agree with a few things that you said.

364
00:22:55,090 --> 00:22:58,715
Regarding the parallel computation, right? And just running

365
00:22:58,715 --> 00:23:01,934
linear algebra calculations on GPUs

366
00:23:02,635 --> 00:23:04,895
and accelerating such workloads.

367
00:23:06,460 --> 00:23:09,760
In Nvidia, I love Nvidia, Nvidia

368
00:23:10,220 --> 00:23:13,580
has this big vision, and they had big

369
00:23:13,580 --> 00:23:17,395
vision around GPUs already in 2006, when

370
00:23:17,395 --> 00:23:21,095
they built CUDA. Yep. Right. So

371
00:23:21,850 --> 00:23:25,530
they built it just for that. Right? The GPUs were

372
00:23:25,530 --> 00:23:29,205
used for graphics processing, for gaming.

373
00:23:29,265 --> 00:23:33,045
Right? Great use case. Great market.

374
00:23:33,185 --> 00:23:36,405
But they had this vision of bringing more

375
00:23:37,940 --> 00:23:40,920
applications to GPUs, just accelerating more applications

376
00:23:42,100 --> 00:23:45,495
and mainly applications with a lot of linear

377
00:23:45,495 --> 00:23:49,015
algebra calculations. And they

378
00:23:49,015 --> 00:23:51,755
created that, they created CUDA

379
00:23:52,690 --> 00:23:56,210
to simplify that. Right? To allow more

380
00:23:56,210 --> 00:23:59,890
developers to use GPUs because just using GPUs

381
00:23:59,890 --> 00:24:02,785
directly, that's so complex. That's so hard.

382
00:24:03,885 --> 00:24:07,485
So they built CUDA to bring more developers, to bring more

383
00:24:07,485 --> 00:24:10,705
applications, and they started in

384
00:24:11,390 --> 00:24:15,230
2006, but think about the

385
00:24:15,230 --> 00:24:18,770
big breakthrough in AI, it happened just in

386
00:24:18,990 --> 00:24:22,315
2012, 2013 with

387
00:24:23,015 --> 00:24:25,995
AlexNet and the Toronto researchers

388
00:24:27,415 --> 00:24:31,080
who used two GPUs, actually, because they

389
00:24:31,080 --> 00:24:34,440
trained AlexNet on 2 GPUs and they had

390
00:24:34,440 --> 00:24:38,085
CUDA, so for them it was feasible to train their

391
00:24:38,085 --> 00:24:41,144
model on a GPU. And that was the new thing that they did.

392
00:24:43,605 --> 00:24:47,370
They were able to train a much bigger model with

393
00:24:47,370 --> 00:24:50,809
more parameters than ever before because they used

394
00:24:50,809 --> 00:24:54,365
GPUs, because the training process ran much

395
00:24:54,365 --> 00:24:56,625
faster. And,

396
00:24:58,365 --> 00:25:01,665
and, and that triggered the entire

397
00:25:02,125 --> 00:25:05,940
revolution, the hype around AI that we're seeing now. So

398
00:25:05,940 --> 00:25:09,480
from 2006, when Nvidia started to build CUDA, until

399
00:25:09,540 --> 00:25:13,015
2013, right, 7 years, then we started to see

400
00:25:13,015 --> 00:25:16,855
those big breakthroughs. And in the last decade,

401
00:25:16,855 --> 00:25:20,540
it's just exploding, and we're seeing more and more applications.

402
00:25:20,760 --> 00:25:24,440
The entire AI ecosystem is running

403
00:25:24,520 --> 00:25:28,200
on GPUs. So that's amazing to see. It's impressive.

404
00:25:28,200 --> 00:25:31,725
And, like, people don't realize, like, the revolution we're seeing today

405
00:25:31,945 --> 00:25:35,705
really started in 2006, like you said. I didn't even put the 2 and 2

406
00:25:35,705 --> 00:25:38,605
together until I was listening to a podcast. I think it's called Acquired,

407
00:25:39,620 --> 00:25:43,059
a really good podcast. Right? Like, they don't pay me to say that or

408
00:25:43,059 --> 00:25:46,659
whatever, but they did a 3 hour deep dive on the history of

409
00:25:46,659 --> 00:25:50,304
NVIDIA. 3 hours. I couldn't stop listening.

410
00:25:51,005 --> 00:25:54,365
Right? Like Nice. You know Yeah. We tried a long form, like, multi hour

411
00:25:54,365 --> 00:25:58,179
podcast. We weren't that entertaining, apparently. But the way they

412
00:25:58,179 --> 00:26:02,020
go through the history of this where it was basically Jensen Huang. Hopefully, I said

413
00:26:02,020 --> 00:26:05,485
his name right. He was, like, we wanna be a player, not just in gaming,

414
00:26:05,485 --> 00:26:08,945
but also in scientific computing. This is 2005, 2006,

415
00:26:09,325 --> 00:26:12,840
which at the time seemed kind of, like, a little out there, a little kooky.

416
00:26:13,460 --> 00:26:16,980
But what you're seeing today is, like, the fruits of the tree, the

417
00:26:16,980 --> 00:26:20,775
seeds that he planted, you know, almost 20 years ago, like, 19,

418
00:26:20,775 --> 00:26:24,455
20 years ago. So, you know, it's you know, when people look at

419
00:26:24,455 --> 00:26:28,070
NVIDIA and say it's an overnight success, I'm like, well, I don't know about that, but,

420
00:26:28,070 --> 00:26:31,669
you know, but no. I mean, you're right. Like, you know and it's

421
00:26:31,669 --> 00:26:35,005
probably not a coincidence that once they made it easy to take these

422
00:26:35,965 --> 00:26:39,805
multi-parallel processors. Say that 10 times

423
00:26:39,805 --> 00:26:43,510
fast on a Thursday morning. But also

424
00:26:43,510 --> 00:26:46,789
make it so it's a lot easier for developers to use. Right? And I'll quote

425
00:26:46,789 --> 00:26:49,850
the great Steve Ballmer, developers, developers, developers. Right?

426
00:26:51,355 --> 00:26:55,115
So, it's it's, it's just fascinating, like and

427
00:26:55,115 --> 00:26:58,680
and I think that, you know, we've really unleashed a

428
00:26:58,680 --> 00:27:02,460
gate of creativity in terms of researchers and applied,

429
00:27:03,000 --> 00:27:06,600
research, and, I mean and I think that what's really cool

430
00:27:06,600 --> 00:27:10,425
about your product is that you're kind of making this, what is

431
00:27:10,425 --> 00:27:14,105
now a scarce resource, maybe in some fashion

432
00:27:14,105 --> 00:27:17,720
of time, GPUs won't cost an arm and a leg.

433
00:27:18,340 --> 00:27:21,960
But, like, for now, I think I think the one thing that I've seen

434
00:27:22,580 --> 00:27:26,145
that I think is not obvious for the casual

435
00:27:26,285 --> 00:27:29,725
observer is if you can if an

436
00:27:29,725 --> 00:27:33,485
organization, like a large enterprise, can pool their resources, they have a lot more

437
00:27:33,485 --> 00:27:36,990
money to buy better GPUs, and you offer a platform where

438
00:27:36,990 --> 00:27:40,350
everybody can get a stake in it. Right? As opposed to, you know you know,

439
00:27:40,350 --> 00:27:44,115
that department is gonna hog everything. Right? You know, and,

440
00:27:44,355 --> 00:27:47,155
here's a question. Do you do you have, like, an audit trail where you could

441
00:27:47,155 --> 00:27:50,835
kinda, you know, figure out, like, you know, Andy's department's really

442
00:27:50,835 --> 00:27:54,630
hogging the GPUs. No. No. No. It's Frank. Frank is like mining Bitcoin or

443
00:27:54,630 --> 00:27:58,250
whatever. Like, do you do you have some kind of, audit trail like that?

444
00:27:58,870 --> 00:28:02,575
Yeah. I love that you mentioned hogging.

445
00:28:02,575 --> 00:28:06,095
GPU hogging. Mhmm. We use that term as well.

446
00:28:06,095 --> 00:28:09,935
Right? Because it it's so difficult sometimes to get

447
00:28:09,935 --> 00:28:13,490
access to GPUs. So when you get access to GPU

448
00:28:13,550 --> 00:28:16,370
as a researcher, as an ML practitioner,

449
00:28:18,430 --> 00:28:22,135
you don't wanna let it go. Right? 'Cause if

450
00:28:22,135 --> 00:28:25,755
you let it go, someone else would take it and hog it. Right?

451
00:28:25,975 --> 00:28:29,115
So you're getting this GPU hogging problem.

452
00:28:31,880 --> 00:28:35,580
What we do to solve that is

453
00:28:35,799 --> 00:28:39,100
that we do provide monitoring and visibility

454
00:28:40,005 --> 00:28:43,525
tools into who is using what, and who is actually

455
00:28:43,525 --> 00:28:47,285
utilizing their GPUs, and so on, but more

456
00:28:47,285 --> 00:28:49,830
than that, we

457
00:28:51,650 --> 00:28:55,010
allow the researchers just to give up their GPUs and not hog the

458
00:28:55,010 --> 00:28:58,605
GPUs, because we provide this concept of

459
00:28:58,605 --> 00:29:02,285
guaranteed quotas. So each researcher or

460
00:29:02,285 --> 00:29:05,665
each project or each team has their own guaranteed

461
00:29:05,965 --> 00:29:09,730
quotas of GPUs that are always available for them

462
00:29:09,789 --> 00:29:13,630
whenever they get access to the cluster, they will get, like, you

463
00:29:13,630 --> 00:29:17,285
know, the 2 GPUs or 4, the quota of

464
00:29:17,285 --> 00:29:20,885
GPUs that's guaranteed. So they can

465
00:29:20,885 --> 00:29:24,245
just let go of their GPUs and not hog them. That's one

466
00:29:24,245 --> 00:29:28,040
thing. The second thing is that they

467
00:29:28,040 --> 00:29:31,560
can also go above their quota. They can

468
00:29:31,560 --> 00:29:35,335
use the GPUs of other teams or other users, if

469
00:29:35,335 --> 00:29:39,115
they are idle, and they can run these preemptible jobs

470
00:29:39,335 --> 00:29:43,035
in an opportunistic way and utilize those GPUs.

471
00:29:44,360 --> 00:29:48,140
And so in that way, they are not limited

472
00:29:48,760 --> 00:29:52,520
to fixed quotas, to hard-limit

473
00:29:52,520 --> 00:29:56,365
quotas. They can just take as many GPUs

474
00:29:56,365 --> 00:29:59,825
as they want from their clusters if those GPUs are available

475
00:30:00,550 --> 00:30:03,770
and idle. Right? But if someone needs those GPUs,

476
00:30:04,390 --> 00:30:08,230
because those GPUs are guaranteed to them, we will make sure, our

477
00:30:08,230 --> 00:30:11,995
scheduler, the Run AI scheduler, the Run AI platform, will make

478
00:30:11,995 --> 00:30:15,535
sure to preempt workloads

479
00:30:15,835 --> 00:30:19,420
and give those guaranteed GPUs to the right users.
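
The quota behavior described above (a guaranteed quota per team, opportunistic over-quota use of idle GPUs, and preemption when the owner comes back) can be sketched in a few lines of Python. This is a minimal illustration, not the Run AI scheduler: the names `Job`, `QuotaScheduler`, and `submit`, the newest-first preemption order, and the choice to treat a job as either fully within quota or fully over quota are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    team: str
    gpus: int
    preemptible: bool = False  # set by the scheduler for over-quota jobs


class QuotaScheduler:
    """Hypothetical scheduler: guaranteed quotas plus opportunistic borrowing."""

    def __init__(self, total_gpus, quotas):
        # Guarantees are only meaningful if they fit in the cluster together.
        if sum(quotas.values()) > total_gpus:
            raise ValueError("guaranteed quotas exceed cluster size")
        self.total_gpus = total_gpus
        self.quotas = dict(quotas)
        self.running = []

    def free_gpus(self):
        return self.total_gpus - sum(j.gpus for j in self.running)

    def guaranteed_in_use(self, team):
        return sum(j.gpus for j in self.running
                   if j.team == team and not j.preemptible)

    def submit(self, job):
        """Start the job; return the list of jobs preempted to make room."""
        within_quota = (self.guaranteed_in_use(job.team) + job.gpus
                        <= self.quotas.get(job.team, 0))
        preempted = []
        if within_quota:
            # Guaranteed request: reclaim GPUs from over-quota jobs if needed.
            # Because quotas sum to at most the cluster size, a preemptible
            # job always exists whenever there is not enough free capacity.
            while self.free_gpus() < job.gpus:
                victim = next(j for j in reversed(self.running) if j.preemptible)
                self.running.remove(victim)
                preempted.append(victim)
        else:
            # Over-quota request: allowed only on idle GPUs, and preemptible.
            if self.free_gpus() < job.gpus:
                raise RuntimeError("no idle GPUs for an over-quota job")
            job.preemptible = True
        self.running.append(job)
        return preempted


sched = QuotaScheduler(total_gpus=8, quotas={"vision": 4, "nlp": 4})
sched.submit(Job("train-a", "vision", 4))           # within quota: guaranteed
sched.submit(Job("sweep-b", "vision", 4))           # over quota: borrows idle GPUs
preempted = sched.submit(Job("train-c", "nlp", 2))  # nlp reclaims its guarantee
print([j.name for j in preempted])                  # ['sweep-b']
```

In this sketch a team never needs to hold on to idle GPUs, because a request within its quota always succeeds, at the cost of evicting whoever was borrowing them.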

480
00:30:20,360 --> 00:30:23,880
Oh, that's cool. Alright. So 1 last

481
00:30:23,880 --> 00:30:27,345
question before we switch over to the stock questions, 'cause I could geek

482
00:30:27,345 --> 00:30:31,025
out and look at this for hours. Yep. This could be a

483
00:30:31,025 --> 00:30:34,225
long form. Sure. This could be. Yeah. And that's and I I wanna be respectful

484
00:30:34,225 --> 00:30:36,850
of your time because you're an important guy, and it's also late where you are.

485
00:30:36,929 --> 00:30:40,690
So who deals with this? Like, who would set up these quotas? Is it

486
00:30:40,690 --> 00:30:43,970
the is it the is it the data scientist? Is it IT ops? Like, who

487
00:30:43,970 --> 00:30:47,585
do you obviously, the data scientists, researchers, they all

488
00:30:47,585 --> 00:30:51,265
benefit from this product. But who's actually administering it? Right? Like,

489
00:30:51,265 --> 00:30:54,805
who is it you know, do I have to talk to, you know,

490
00:30:54,980 --> 00:30:57,780
say, pretend Andy's in ops. Do I have to say, hey, Andy. I really need

491
00:30:57,780 --> 00:31:00,900
a boost in my quota. You know, like, I mean, who does it? Or do

492
00:31:00,980 --> 00:31:04,500
or my this sounds like you as I say it, I'm like, yeah, that wouldn't

493
00:31:04,500 --> 00:31:08,165
work. Like, I'm the researcher. I'm gonna turn the dial up on my own. Like

494
00:31:08,165 --> 00:31:11,685
like, who's who's who's the primary? Obviously, we know who the prime

495
00:31:11,765 --> 00:31:14,505
primary beneficiary is, but who's the primary user?

496
00:31:15,559 --> 00:31:19,400
So okay. Great. So if you have a team, right, if if

497
00:31:19,400 --> 00:31:22,965
you're a team of researchers, all of you need access to

498
00:31:22,965 --> 00:31:26,105
GPU, so maybe the team lead

499
00:31:26,565 --> 00:31:30,105
is the one who's managing the quotas for the different

500
00:31:30,645 --> 00:31:33,980
team members. And if you have multiple teams,

501
00:31:34,760 --> 00:31:38,600
then you might have a department manager or an admin of the

502
00:31:38,600 --> 00:31:42,304
cluster or platform owner that will allocate the

503
00:31:42,304 --> 00:31:45,905
quotas for each team, right? And then those teams would

504
00:31:45,905 --> 00:31:49,720
manage their own quotas within what

505
00:31:49,720 --> 00:31:53,420
they were given. Right? So it's like a hierarchical

506
00:31:54,679 --> 00:31:58,414
thing; in a hierarchical manner, people can manage their own

507
00:31:58,414 --> 00:32:02,174
quota, their own priorities, their own access to the

508
00:32:02,174 --> 00:32:05,830
GPUs within their teams. Okay.

509
00:32:05,830 --> 00:32:08,870
So it's kind of like a hybrid of, like, you know, it's like a budget

510
00:32:08,870 --> 00:32:12,685
almost. Right? Like, you know, you get this much, figure it out

511
00:32:12,685 --> 00:32:16,225
amongst yourselves. Exactly. So we're trying to decentralize

512
00:32:16,685 --> 00:32:20,290
how the quotas are being managed and how the GPUs are being accessed.

513
00:32:20,290 --> 00:32:24,050
So, you know, giving as much power, as much

514
00:32:24,050 --> 00:32:27,725
control to the end users as possible. Sure. That's

515
00:32:27,885 --> 00:32:31,245
It sounds like a great administrative question, very

516
00:32:31,245 --> 00:32:35,085
important. And I imagine, because a little bird told

517
00:32:35,085 --> 00:32:38,450
me that you're not the only, you know, your

518
00:32:38,510 --> 00:32:41,570
provisioning of these GPU resources

519
00:32:42,350 --> 00:32:45,710
is not the only thing that enterprises have to deal

520
00:32:45,710 --> 00:32:49,544
with. So it's interesting. It's not just GPUs.

521
00:32:49,544 --> 00:32:53,385
It's compute. Like, it's not a Sure. It's not it's not limited. Although, because

522
00:32:53,385 --> 00:32:57,060
of what you said, you know, managing GPUs is an order of magnitude harder

523
00:32:57,060 --> 00:33:00,100
because they were never really built for this. Right? Like, this kind of Right. You

524
00:33:00,100 --> 00:33:03,795
know, we're talking about technology that wasn't really in the server room until a few

525
00:33:03,795 --> 00:33:07,555
years ago. Right? This isn't a tried and true kind of this is

526
00:33:07,555 --> 00:33:11,315
how it works, you know? Right. But we hit that point in the

527
00:33:11,315 --> 00:33:14,220
show where we'll switch to the pre-formed questions.

528
00:33:15,000 --> 00:33:18,760
These are not complicated. I mean, you know, we're not we're not Mike

529
00:33:18,760 --> 00:33:22,200
Wallace or, like, you know, 60 Minutes or whatever. We're not trying to trap you

530
00:33:22,200 --> 00:33:25,905
or anything. But since I've been gabbing on most of the show, I

531
00:33:25,905 --> 00:33:29,665
figured I'll let Andy kick this off. Well, thanks, Frank. And I don't think

532
00:33:29,665 --> 00:33:33,341
you were gabbing on. You know more about this than I do. I'm

533
00:33:33,341 --> 00:33:36,939
just a lowly data engineer. I'll plug No. You if you

534
00:33:36,939 --> 00:33:40,515
will. Data engineers are the heroes we need. Well

535
00:33:40,515 --> 00:33:43,735
well, I'm gonna plug Frank's Roadies versus Rockstars

536
00:33:44,275 --> 00:33:47,655
writing on LinkedIn. They're good articles about this.

537
00:33:47,900 --> 00:33:50,640
But, let's see. How did you,

538
00:33:51,740 --> 00:33:54,640
how did you find your way into this field?

539
00:33:55,225 --> 00:33:58,684
And did this field find you, or did you find it?

540
00:34:00,184 --> 00:34:03,085
This field totally found me. Awesome.

541
00:34:04,230 --> 00:34:05,850
Yeah. I I've

542
00:34:08,310 --> 00:34:11,770
I did my postdoc, and I was at Bell Labs.

543
00:34:12,855 --> 00:34:16,375
And Yann LeCun came to Bell Labs and

544
00:34:16,375 --> 00:34:19,995
gave a presentation about AI. It was around 2017,

545
00:34:21,449 --> 00:34:24,670
and Yann LeCun spent a lot of years at Bell Labs,

546
00:34:26,090 --> 00:34:29,389
and his presentation was amazing. And

547
00:34:30,175 --> 00:34:33,555
When I heard him talking about AI,

548
00:34:33,775 --> 00:34:37,295
I I said, okay, that's the space where I wanna be. It's going to change

549
00:34:37,295 --> 00:34:40,969
the world. There is this new amazing technology here that

550
00:34:41,429 --> 00:34:45,269
is going to change everything. And I knew that I want to start

551
00:34:45,269 --> 00:34:49,054
a company in the AI space for sure.

552
00:34:50,155 --> 00:34:52,335
Cool. That's a good answer. So cool.

553
00:34:54,155 --> 00:34:57,490
Yeah. That's cool. I was at Bell Labs,

554
00:34:58,109 --> 00:35:01,789
doing a presentation a while ago, and somebody I didn't realize that he

555
00:35:01,789 --> 00:35:05,535
worked at Bell Labs because, like, you know, the guy was like, no. No.

556
00:35:05,535 --> 00:35:08,255
He used to work here, like, in this building. I was like, no way. Because

557
00:35:08,255 --> 00:35:12,035
I knew him as the guy from NYU. Right? Like, that's who I thought. Right.

558
00:35:12,640 --> 00:35:16,240
Or the guy from Meta. Yeah. And now the guy from Meta. Right? Like

559
00:35:16,320 --> 00:35:19,945
so it's interesting how that how that you know? They have

560
00:35:19,945 --> 00:35:23,645
these amazing pictures from the nineties where they

561
00:35:23,785 --> 00:35:27,325
run, like, deep learning models on very old PCs

562
00:35:28,265 --> 00:35:30,890
and recognizing, like,

563
00:35:31,930 --> 00:35:35,130
numbers on the computer. Maybe you saw those pictures like amazing

564
00:35:35,290 --> 00:35:38,855
MNIST. It's the MNIST problem. Is that Yep.

565
00:35:39,075 --> 00:35:42,535
Right. Exactly. Exactly. Cool.

566
00:35:43,715 --> 00:35:47,095
So second question is, what's your favorite part of your current job?

567
00:35:51,400 --> 00:35:53,980
That everything is changing so fast.

568
00:35:54,974 --> 00:35:58,655
Things are moving so fast. Right? We've been in this business for 6

569
00:35:58,655 --> 00:36:02,435
years, and the entire

570
00:36:02,494 --> 00:36:06,150
space is moving and

571
00:36:06,150 --> 00:36:09,910
advancing. And so many people are working in

572
00:36:09,910 --> 00:36:13,505
this field. New innovations, new tools,

573
00:36:13,505 --> 00:36:17,285
new advancements are coming out every day.

574
00:36:18,920 --> 00:36:22,599
You know, just 6 years ago, it was about deep learning and computer

575
00:36:22,599 --> 00:36:26,220
vision. And now it's about language models

576
00:36:27,545 --> 00:36:31,245
and generative AI, and we're just at the start,

577
00:36:31,545 --> 00:36:35,305
right, there are so many amazing things that are going to happen

578
00:36:35,305 --> 00:36:38,830
in this space, and I love it. Absolutely.

579
00:36:39,490 --> 00:36:42,750
So we have 3 fill-in-the-blank

580
00:36:43,210 --> 00:36:46,655
sentences here. The first is, complete this

581
00:36:46,655 --> 00:36:50,195
sentence: when I'm not working, I enjoy blank.

582
00:36:52,655 --> 00:36:56,280
You'll get a very boring answer,

583
00:36:56,580 --> 00:37:00,340
so this is just spending time with

584
00:37:00,340 --> 00:37:03,640
friends and family, because I think

585
00:37:03,895 --> 00:37:07,735
that I'm always working. It's like, if you ask my wife,

586
00:37:07,735 --> 00:37:11,195
she'll tell you that I'm working 24 hours. And

587
00:37:12,670 --> 00:37:15,810
Yeah. So I don't have much time that I'm not working

588
00:37:16,030 --> 00:37:19,550
in. So when I'm

589
00:37:19,550 --> 00:37:23,245
not working, then I'm trying to be with my kids and my

590
00:37:23,245 --> 00:37:27,025
wife and friends. Cool.

591
00:37:27,325 --> 00:37:31,000
Cool. The 2nd complete the sentence. I think

592
00:37:31,000 --> 00:37:33,900
the coolest thing about technology today is

593
00:37:34,520 --> 00:37:37,660
blank. And this, I really wanna hear your perspective on that.

594
00:37:39,815 --> 00:37:43,415
Yeah. I think everyone will say AI, right? Or something in

595
00:37:43,415 --> 00:37:45,115
AI. Yeah.

596
00:37:48,100 --> 00:37:49,880
I think there are so many

597
00:37:52,100 --> 00:37:55,720
new innovations that are coming around LLMs.

598
00:37:56,565 --> 00:38:00,185
I think everything relating to

599
00:38:01,125 --> 00:38:04,725
searches, right? Searching in data, in getting

600
00:38:04,725 --> 00:38:08,530
insights from data, it's all going to change. We're going to have

601
00:38:08,530 --> 00:38:12,050
a new interface. Right? Just getting

602
00:38:12,050 --> 00:38:15,785
insights from data with

603
00:38:15,785 --> 00:38:19,464
natural language, you know, no SQL and, you

604
00:38:19,464 --> 00:38:22,904
know, no need for programming and stuff like that.

605
00:38:22,904 --> 00:38:26,140
Just with natural language, you could

606
00:38:26,760 --> 00:38:29,900
do amazing stuff with data. I think,

607
00:38:31,664 --> 00:38:32,964
we're seeing this

608
00:38:35,825 --> 00:38:39,640
advancement in, like,

609
00:38:39,640 --> 00:38:43,340
digital twins right now. You can,

610
00:38:43,880 --> 00:38:47,615
you can fake my voice

611
00:38:47,615 --> 00:38:51,055
and your voice and fake my image and your image. And,

612
00:38:51,055 --> 00:38:54,710
and, you know, in the

613
00:38:54,710 --> 00:38:58,170
future, we'll have digital twins of us, right,

614
00:38:58,550 --> 00:39:02,195
doing this stuff. That would be amazing. So a lot of

615
00:39:02,195 --> 00:39:05,575
amazing stuff is going to happen in the next few years

616
00:39:06,275 --> 00:39:10,120
for sure. Very cool. Our last complete sentence.

617
00:39:10,340 --> 00:39:13,880
I look forward to the day when I can use technology to

618
00:39:14,100 --> 00:39:14,600
blank.

619
00:39:17,885 --> 00:39:19,705
To have a robot in my house.

620
00:39:22,724 --> 00:39:26,390
Yeah. Yeah. Sweeping the floor instead of

621
00:39:26,390 --> 00:39:30,170
me doing that, right, cleaning dishes and things like that.

622
00:39:30,230 --> 00:39:34,025
If that would happen, that would be amazing. Right? That's a that's a

623
00:39:34,025 --> 00:39:37,785
good answer. Yeah. I I agree. I have I have 3

624
00:39:37,785 --> 00:39:41,325
boys, 4 dogs. So, like, cleaning is safe.

625
00:39:41,970 --> 00:39:45,810
Yeah. Yeah. I'm a heavy cleaning. Ranging from, like, 1 to, like,

626
00:39:45,810 --> 00:39:49,570
a teenager. So it's it's, and and and fighting

627
00:39:49,570 --> 00:39:53,234
with them to, like, empty the dishwasher takes a lot more mental

628
00:39:53,234 --> 00:39:56,915
energy than it should, but that's probably a subject for another

629
00:39:56,915 --> 00:39:57,700
type of show.

630
00:40:00,740 --> 00:40:04,119
The next question is share something different about yourself,

631
00:40:04,260 --> 00:40:07,285
and we always like to joke, like, well, let's just make sure that we keep

632
00:40:07,285 --> 00:40:11,045
our clean iTunes rating. So Yeah. Yeah. What

633
00:40:11,045 --> 00:40:14,670
what yeah. Well, I I This

634
00:40:14,670 --> 00:40:18,130
is a hard question, I needed to think about it.

635
00:40:18,829 --> 00:40:22,444
So, I found 2 answers that I can say. So one

636
00:40:22,444 --> 00:40:26,224
is about my professional life, right, I think that

637
00:40:26,365 --> 00:40:30,010
it's somewhat different that I'm coming with a background from

638
00:40:30,010 --> 00:40:33,850
academia and industry. So I love academia. I love to research

639
00:40:33,850 --> 00:40:37,609
problems. I love to understand problems in in a deep

640
00:40:37,609 --> 00:40:41,164
way and combining it with startups in the industry.

641
00:40:41,704 --> 00:40:45,545
And in my past, I worked for chip companies, for hardware

642
00:40:45,545 --> 00:40:49,180
companies. I worked for Intel, for a startup, and for Apple. I

643
00:40:49,180 --> 00:40:52,940
did chip stuff, and now Run AI is a software company, so really

644
00:40:52,940 --> 00:40:56,535
like a diverse background of academia, hardware,

645
00:40:56,675 --> 00:41:00,435
software, so I love that, and, like, I love to deal

646
00:41:00,435 --> 00:41:03,175
with a few things, and so that I think is different.

647
00:41:04,420 --> 00:41:08,040
And the 2nd answer that I could find

648
00:41:08,420 --> 00:41:12,180
is, that I have a nickname that goes with me

649
00:41:12,180 --> 00:41:15,494
since my high school days, which is the Duke.

650
00:41:16,275 --> 00:41:19,795
The Duke. All of them all of them are calling me the Duke. It's like,

651
00:41:19,795 --> 00:41:23,015
they don't call me Ronan, the the Duke. So That's funny.

652
00:41:23,540 --> 00:41:25,880
Yeah. That's awesome.

653
00:41:28,020 --> 00:41:31,480
Audible is a sponsor of Data Driven,

654
00:41:31,725 --> 00:41:35,185
and you can go to the datadrivenbook.com.

655
00:41:37,245 --> 00:41:40,685
And if you, if you do that, you can sign up for a free

656
00:41:40,685 --> 00:41:44,270
month of Audible. And if you decide later to

657
00:41:44,270 --> 00:41:47,650
then join Audible, use one of their sign-up plans,

658
00:41:48,030 --> 00:41:51,465
then Frank and I get to split a cup of coffee, I think,

659
00:41:52,245 --> 00:41:55,685
out of that. And, every little bit helps. So we really

660
00:41:55,685 --> 00:41:59,065
appreciate that when you do. What we'd like to ask

661
00:41:59,430 --> 00:42:03,270
is, do you listen to audiobooks? And if you

662
00:42:03,270 --> 00:42:07,030
do okay. Good. I see you nodding. So do you have a recommendation? Do you

663
00:42:07,030 --> 00:42:10,855
have a favorite book or two you'd like to share? Yeah.

664
00:42:10,855 --> 00:42:14,375
So I'm a heavy user of Audible. I'll give them

665
00:42:14,375 --> 00:42:17,869
the, a classic book, classic for

666
00:42:17,869 --> 00:42:21,390
entrepreneurs, The Hard Thing

667
00:42:21,390 --> 00:42:24,770
About Hard Things by Ben Horowitz,

668
00:42:24,990 --> 00:42:28,605
it's a classic book, love it, really had a lot of impact

669
00:42:28,605 --> 00:42:31,825
on me, I read it when we started Run AI,

670
00:42:32,760 --> 00:42:36,060
and I recommend it for every

671
00:42:36,200 --> 00:42:39,960
entrepreneur to read it and for everyone to read it. It's like a

672
00:42:40,280 --> 00:42:44,055
Cool. Amazing book. Yep. Awesome. I

673
00:42:44,055 --> 00:42:47,895
have a flight to Vegas this next week, so I'll definitely be listening to

674
00:42:47,895 --> 00:42:51,580
it then. And finally, where can people learn more about you

675
00:42:51,580 --> 00:42:55,340
and Run AI? The best

676
00:42:55,340 --> 00:42:58,805
place will be on our website, run.ai.

677
00:43:00,464 --> 00:43:04,305
Yeah. And on social. LinkedIn, Twitter, we'll

678
00:43:04,305 --> 00:43:07,740
we'll do. Awesome. Any parting thoughts?

679
00:43:11,160 --> 00:43:15,005
I really enjoyed this episode. Love to speak about GPUs, love AI. Based

680
00:43:15,005 --> 00:43:18,765
on it, I had a lot of fun. Thank you for having me here. Awesome.

681
00:43:18,765 --> 00:43:21,645
It it was an honor to have you, and every once in a while, Andy

682
00:43:21,645 --> 00:43:24,720
and I will do deep dive kinda shows. We'd love to invite you back if

683
00:43:24,720 --> 00:43:28,099
you wanna do 1 just on GPUs, because I know where my knowledge

684
00:43:28,319 --> 00:43:31,839
drops off, you probably could pick up on

685
00:43:31,839 --> 00:43:35,305
that. And with that, I'll let the nice

686
00:43:35,445 --> 00:43:39,125
AI British lady end the show. And just like

687
00:43:39,125 --> 00:43:42,520
that, dear listeners, we've come to the end of another enlightening

688
00:43:42,660 --> 00:43:46,500
episode of the Data Driven podcast. It's always a

689
00:43:46,500 --> 00:43:49,960
bittersweet moment like finishing the last biscuit in the tin,

690
00:43:50,295 --> 00:43:54,055
satisfying, yet leaving you wanting just a bit more. A

691
00:43:54,055 --> 00:43:57,335
colossal thank you to each and every one of you tuning in from across the

692
00:43:57,335 --> 00:44:01,000
digital sphere. Without you, we're just a bunch of

693
00:44:01,000 --> 00:44:04,840
ones and zeros floating in the ether. Your support is what

694
00:44:04,840 --> 00:44:08,675
keeps this digital ship afloat, and believe me, it's much appreciated.

695
00:44:10,175 --> 00:44:14,015
Now, if you found today's episode as engaging as a duel of wits with

696
00:44:14,015 --> 00:44:17,600
a sophisticated AI, which I assure you, is quite

697
00:44:17,600 --> 00:44:20,900
enthralling, then do consider subscribing to Data Driven.

698
00:44:21,760 --> 00:44:25,255
It's just a click away and ensures you won't miss out on our future true

699
00:44:25,255 --> 00:44:28,615
adventures in data and tech. And if you're feeling

700
00:44:28,615 --> 00:44:31,995
particularly generous, why not leave us a 5 star review?

701
00:44:32,810 --> 00:44:36,490
Just like a well programmed algorithm, your positive feedback helps

702
00:44:36,490 --> 00:44:40,110
us reach more curious minds and keeps the quality content flowing.

703
00:44:40,965 --> 00:44:43,545
It's the digital equivalent of a hearty handshake.

704
00:44:44,805 --> 00:44:48,560
So, until next time, keep those neurons firing, those

705
00:44:48,560 --> 00:44:52,400
subscriptions active and those reviews glowing. I'm

706
00:44:52,400 --> 00:44:55,971
Bailey, your British AI lady, signing off with a heartfelt

707
00:44:56,031 --> 00:44:58,610
cheerio and a reminder to stay data driven.