Hello and welcome to this episode of Haverin About.
2
00:00:15,885 --> 00:00:22,441
I am so delighted today to introduce you to another one of my favorite reads, Maria
Sukhareva.
3
00:00:22,441 --> 00:00:27,683
I wanted to talk with Maria because she goes by the brand AI Realist.
4
00:00:27,823 --> 00:00:34,148
And that is really why I enjoy her writing, because she knows so much at the deep
5
00:00:34,148 --> 00:00:38,969
technical subject matter level, and yet she makes it very accessible.
6
00:00:38,969 --> 00:00:41,771
She's able to cross the boundaries in her writing.
7
00:00:41,771 --> 00:00:43,752
So I enjoy that very much, Maria.
8
00:00:43,752 --> 00:00:45,062
So welcome to you.
9
00:00:45,062 --> 00:00:47,664
Welcome to our viewers and listeners.
10
00:00:47,664 --> 00:00:54,168
And maybe Maria, I could start by asking you to just introduce yourself and explain a
little bit about your background.
11
00:00:54,312 --> 00:00:57,163
Thank you very much for such a nice introduction.
12
00:00:57,163 --> 00:01:08,194
I'm actually even a bit embarrassed now, hearing so many nice words about me
and about the AI Realist. That's really what makes my writing and my work rewarding.
13
00:01:08,194 --> 00:01:11,405
Yes, so, to introduce myself.
14
00:01:11,405 --> 00:01:15,736
Many years ago in 2003, I started as a linguist.
15
00:01:15,736 --> 00:01:18,438
I was studying linguistics at a university.
16
00:01:18,438 --> 00:01:25,740
And while I really liked linguistics, my other major was translation and translation is
really not my cup of tea.
17
00:01:25,740 --> 00:01:40,546
I think it's a very complex job, a very difficult job, but it's also a job that puts you
a bit in the shadow, and it's not exactly the type of activity that I would
18
00:01:40,546 --> 00:01:42,569
see myself doing for years.
19
00:01:42,569 --> 00:01:50,900
I always actually liked computers and I liked math and artificial intelligence even back
then.
20
00:01:50,900 --> 00:01:54,260
Artificial intelligence existed for many, many, many decades before.
21
00:01:54,260 --> 00:01:58,660
I mean, it was a slightly different type of artificial intelligence back then.
22
00:01:58,660 --> 00:02:05,260
A lot of these ontologies, neuro-symbolic kind of stuff, were around back then too.
23
00:02:06,361 --> 00:02:09,132
Neuro-linguistic programming, and yes.
24
00:02:09,163 --> 00:02:14,736
Yeah, that's where people frequently get confused: when I say I do NLP, they're like,
are you doing neuro-linguistic programming?
25
00:02:14,736 --> 00:02:16,427
No, it's like natural language processing.
26
00:02:16,427 --> 00:02:17,128
It's a different thing.
27
00:02:17,128 --> 00:02:25,893
Yeah, so then I got a scholarship from the German government to study at a
German university in Saarland.
28
00:02:25,893 --> 00:02:33,445
And I moved to Germany and started studying computational linguistics, which, yep.
29
00:02:33,445 --> 00:02:34,178
Where did you grow up?
30
00:02:34,178 --> 00:02:37,341
Where were you born, and where did you grow up in the first place?
31
00:02:37,341 --> 00:02:40,934
I grew up in the polar north of Russia.
32
00:02:40,934 --> 00:02:44,377
So on the coast of the Arctic Ocean, the city is called Arkhangelsk.
33
00:02:44,377 --> 00:02:53,203
It was built in the 16th century by the Tsar as a place of punishment and suffering.
34
00:02:53,203 --> 00:03:00,090
So basically he was sending criminals there, and people he just didn't like, to live there
and suffer.
35
00:03:00,090 --> 00:03:13,275
And this city eventually grew into an industrial center of the North because there are
many resources, things that an industrial country could actually profit from.
36
00:03:13,275 --> 00:03:19,637
So it became a fairly large polar city with many things happening there.
37
00:03:19,637 --> 00:03:22,490
But of course, it has extreme weather conditions.
38
00:03:22,490 --> 00:03:25,581
Minus 40, minus 30 in winter.
39
00:03:25,581 --> 00:03:36,542
You have 52 days of vacation a year, paid transportation every two years, and many other
benefits just for living there.
40
00:03:36,542 --> 00:03:41,931
That's the type of living we are used to.
41
00:03:41,931 --> 00:03:43,565
visiting your family in St.
42
00:03:43,565 --> 00:03:44,908
Paul, Minnesota.
43
00:03:44,939 --> 00:03:47,561
Yes, that's absolutely right.
44
00:03:47,561 --> 00:03:48,532
I don't think St.
45
00:03:48,532 --> 00:03:51,163
Paul is even that cold.
46
00:03:52,595 --> 00:03:53,755
Yeah.
47
00:03:53,755 --> 00:03:57,598
So basically, then I did computational linguistics.
48
00:03:57,598 --> 00:03:59,820
I graduated from this master's program.
49
00:03:59,820 --> 00:04:01,962
Then I started working in research.
50
00:04:01,962 --> 00:04:04,844
So basically, I started my PhD in Frankfurt.
51
00:04:04,844 --> 00:04:08,507
Unfortunately, never finished it, though it was very, very close.
52
00:04:08,507 --> 00:04:10,339
I had lots of publications, but...
53
00:04:10,339 --> 00:04:14,244
life went a different way, and I switched to industry.
54
00:04:14,244 --> 00:04:25,309
During my research career, I was working with machine translation and historical
languages, which many people actually found very confusing back then because, what machine
55
00:04:25,309 --> 00:04:29,391
translation can you have for Middle Low German and Middle High German?
56
00:04:29,391 --> 00:04:32,042
I was working back then a lot
57
00:04:32,188 --> 00:04:35,461
with historical texts and historical texts are what?
58
00:04:35,461 --> 00:04:38,564
Well, there was only one text basically that existed everywhere.
59
00:04:38,564 --> 00:04:39,184
It was the Bible.
60
00:04:39,184 --> 00:04:46,000
Yeah, I literally had a huge corpus of Bibles, like a huge collection of Bibles.
61
00:04:46,000 --> 00:04:55,389
And I was doing machine translation, translating from Middle Low German to High
German and so on, between different Bible versions.
62
00:04:55,389 --> 00:04:57,691
It might sound irrelevant,
63
00:04:57,691 --> 00:05:08,942
but in fact machine translation and all these approaches that we had, well, they
basically were the foundation of the theory behind large language models.
64
00:05:08,942 --> 00:05:10,784
We were doing language modeling.
65
00:05:10,784 --> 00:05:12,885
I mean, language modeling comes from machine translation.
66
00:05:12,885 --> 00:05:15,608
So we started with n-gram language models.
67
00:05:15,608 --> 00:05:19,051
Then we moved towards neural language models.
68
00:05:19,051 --> 00:05:23,536
First, there were recurrent neural language models like LSTMs and so on.
69
00:05:23,536 --> 00:05:25,397
That's where the attention mechanism appeared.
70
00:05:25,397 --> 00:05:27,670
So, you know, like the one that was famous.
71
00:05:27,670 --> 00:05:37,227
Then at some point someone, well, the famous paper, "Attention Is All You
Need," realized that attention might actually be all you need.
72
00:05:37,227 --> 00:05:41,481
And what you don't need is the recurrence that was coming from those LSTMs.
73
00:05:41,481 --> 00:05:47,285
And they just kept the attention and that's what transformer models were.
74
00:05:47,605 --> 00:05:48,057
And...
75
00:05:48,057 --> 00:05:51,370
Yeah, and then we had the Transformers.
76
00:05:51,370 --> 00:05:53,792
They lived for a while.
77
00:05:53,792 --> 00:06:00,777
By that time I had already moved to BMW, where I was doing machine translation for the automotive
domain.
78
00:06:00,777 --> 00:06:07,102
So I was training these models because back then there were not many off-the-shelf
domain-specific machine translators.
79
00:06:07,102 --> 00:06:11,837
So I had a DGX station with 16 GPUs under my desk.
80
00:06:11,837 --> 00:06:15,108
Nobody really wanted to use it back then except for me.
81
00:06:15,108 --> 00:06:18,228
So I was lucky to have it all.
82
00:06:18,948 --> 00:06:19,228
Yes.
83
00:06:19,228 --> 00:06:24,979
It was kind of funny at BMW because of the department I was in. NLP was very small.
84
00:06:24,979 --> 00:06:31,290
I mean, NLP, like, now everyone is an AI expert and everyone knows everything about large
language models, and back then...
85
00:06:31,290 --> 00:06:37,921
And when I started in 2018, the NLP group had just been founded at BMW and there were four people.
86
00:06:37,921 --> 00:06:40,013
And four people is not enough to have a department.
87
00:06:40,013 --> 00:06:42,175
So they didn't know where to put us.
88
00:06:42,175 --> 00:06:48,521
So they put us into blockchain because, whatever, I guess blockchain was massive back then.
89
00:06:48,521 --> 00:06:52,375
It was like 22 people or something doing blockchain.
90
00:06:52,375 --> 00:06:58,700
And that's why we got the DGX station with Nvidia GPUs, because blockchain, obviously, means GPUs.
91
00:06:58,700 --> 00:07:03,884
So they bought the stations at BMW, and the blockchain team actually didn't do
anything with them.
92
00:07:03,884 --> 00:07:05,978
Like they didn't, they didn't have a project.
93
00:07:05,978 --> 00:07:08,129
So I was like, then I'll take them.
94
00:07:08,129 --> 00:07:15,382
Yeah, I had it on my business cards, and maybe I still have them lying around somewhere:
"Blockchain and NLP."
95
00:07:15,382 --> 00:07:20,827
And people, when they would see this card, they were like, oh, you're doing blockchain,
because everyone knew blockchain, nobody knew NLP.
96
00:07:20,827 --> 00:07:24,429
And I was like, I have no idea about blockchain, like, not a clue.
97
00:07:24,429 --> 00:07:28,209
And I really didn't know much about like what's happening in blockchain.
98
00:07:28,209 --> 00:07:31,640
I'm like, just like, don't ask me, but it says blockchain.
99
00:07:31,640 --> 00:07:34,211
Well, it's the other one, the NLP part.
100
00:07:34,211 --> 00:07:49,131
Blockchain was one of those massively hyped technologies that was going to be able to solve
everything from commerce to international financial transactions to security across
101
00:07:49,131 --> 00:07:50,422
networks, etc.
102
00:07:50,422 --> 00:07:54,404
And it was the biggest puff of smoke
103
00:07:54,535 --> 00:07:57,096
I've seen in my career. Because you're correct.
104
00:07:57,096 --> 00:08:05,121
It was one of those things whereby it was so hot that everybody was talking about it, but
nobody understood what you could actually do with it.
105
00:08:05,121 --> 00:08:11,323
What were the problems it was actually going to solve in practice, as opposed to
theoretically going to solve?
106
00:08:11,323 --> 00:08:22,541
So I have a lot of sympathy for you. But yeah, it sounds as though it enabled you to at least get
access to the technology that would allow you to start growing your
107
00:08:22,541 --> 00:08:24,147
activity and work.
108
00:08:36,141 --> 00:08:39,865
Yeah, that was absolutely fun to have these boxes under your table.
109
00:08:39,865 --> 00:08:43,081
It was a luxury because now I don't have this.
110
00:08:43,081 --> 00:08:49,057
Now we're all on the cloud. For comparison, I could run training for two weeks there.
111
00:08:49,057 --> 00:08:52,950
And it would train a massive machine translation model there.
112
00:08:52,950 --> 00:08:54,641
And basically it would cost nothing.
113
00:08:54,641 --> 00:08:59,795
I mean, maybe some electricity, but for a corporation like BMW, it wouldn't matter.
114
00:08:59,795 --> 00:09:04,299
And obviously nobody would be counting how much electricity I burned in the
building.
115
00:09:04,299 --> 00:09:10,631
But now, running training like this for two weeks would cost thousands on the cloud.
116
00:09:10,631 --> 00:09:20,622
And I would simply not be able to do it, because I would actually have to explain why I
burned 10,000 euros on the cloud training a model.
117
00:09:20,622 --> 00:09:31,282
This was really a luxury back then that is not easily accessible nowadays because
we are heavily cloud...
118
00:09:31,282 --> 00:09:43,290
A lot of people... you know, when you look at companies here in the US, there are a
number of those companies, whether it's buying capacity from Google, whether it's Nvidia
119
00:09:43,290 --> 00:09:51,334
selling their highest-end processors, or some of the specialty organizations that
are out there.
120
00:09:51,334 --> 00:09:52,005
I know that
121
00:09:52,005 --> 00:09:58,449
my organization has partnered with a number of them and you're correct, it's incredibly
expensive.
122
00:09:58,449 --> 00:10:01,952
And when you look at the amount of profit that Nvidia is making, it's interesting.
123
00:10:01,952 --> 00:10:12,469
And when you then look at the interest there is in developing the competitor chips, that
raw processing capability is the engine room that is making a lot of these hardware
124
00:10:12,469 --> 00:10:15,731
manufacturers... These chip manufacturers were all going bust
125
00:10:15,731 --> 00:10:20,053
trying to develop CPUs for servers and computers.
126
00:10:20,053 --> 00:10:27,936
And then suddenly, with GPUs, it's become a license to print money for a lot of these hardware
organizations.
127
00:10:27,936 --> 00:10:32,549
So yeah, I think it's fascinating the way that that has changed from there.
128
00:10:32,549 --> 00:10:40,774
But you were working at BMW, and clearly they didn't have a clue what to do with you or your
work.
129
00:10:40,774 --> 00:10:43,366
So what did you do next from there?
130
00:10:43,366 --> 00:10:47,718
Yeah, so, well, at BMW, I was doing machine translation and some chatbots.
131
00:10:47,718 --> 00:10:49,880
Back then chatbots were also popular.
132
00:10:49,880 --> 00:10:51,442
And then I moved to Siemens.
133
00:10:51,442 --> 00:10:59,169
And at Siemens, they had also decided to start NLP projects, NLP initiatives.
134
00:10:59,169 --> 00:11:05,225
So I also started with the usual NLP use cases, like sentiment analysis,
135
00:11:05,225 --> 00:11:14,398
you know, the usual suspects like classification, whatever we were doing back
then with fine-tuning transformers like BERT and things like this.
136
00:11:14,398 --> 00:11:21,540
I think chatbots somehow avoided me back then, but, so, yeah.
137
00:11:21,540 --> 00:11:30,322
And actually nobody really cared about me much back then when I started, because
it was just, like, NLP, something something.
138
00:11:30,322 --> 00:11:41,555
So I was in this dark corner of NLP doing some classification for financial domain, some
sentiment analysis and so on.
139
00:11:41,555 --> 00:11:46,822
And then suddenly, ChatGPT happened and everyone decided that they needed a chatbot.
140
00:11:46,822 --> 00:11:52,998
And it didn't happen overnight that we had an avalanche of AI experts.
141
00:11:52,998 --> 00:11:56,662
So the first two, three months, they were actually like, so who here
142
00:11:56,662 --> 00:12:01,085
has actually heard of this before? And there were not many people.
143
00:12:01,238 --> 00:12:12,585
So it was interesting to come from this dark corner of NLP into the spotlight of
generative AI. So at some point, yeah, we, the NLP people, got popular.
144
00:12:12,585 --> 00:12:14,258
ah
145
00:12:14,258 --> 00:12:18,732
And the NLP people who come from that background, were immediate.
146
00:12:18,732 --> 00:12:23,075
I mean, the story of language models, it wasn't new for us.
147
00:12:23,075 --> 00:12:24,226
We knew the models before.
148
00:12:24,226 --> 00:12:30,872
We knew GPT-3, there was Eleuther, there was GPT-2, and there were a bunch of these generative
approaches before.
149
00:12:30,872 --> 00:12:39,839
And these open-domain chatbots, the neural chatbots, kind of behave in a similar manner
to machine translation, because that's basically what they do, right?
150
00:12:39,839 --> 00:12:43,286
They get an input and then they need to generate
151
00:12:43,286 --> 00:12:44,557
some kind of an output.
152
00:12:44,557 --> 00:12:52,257
That's a similar idea behind machine translation, only that it doesn't translate but
continues generating.
153
00:12:55,117 --> 00:12:58,477
We knew this and we knew all the problems with this.
154
00:12:59,077 --> 00:13:05,967
Immediately, we knew how to make this thing tell you something mean, swear at
you, how to hack it.
155
00:13:05,967 --> 00:13:09,359
And so on. It was quite obvious.
156
00:13:09,359 --> 00:13:13,783
I was very skeptical in the beginning that it was going to go somewhere.
157
00:13:13,783 --> 00:13:17,405
And so when people started coming to me, I was like, yeah, that's good.
158
00:13:17,405 --> 00:13:23,566
Let's do this, this and this, but this, this, this will not be possible with these
models.
159
00:13:23,566 --> 00:13:28,948
But very quickly there was an avalanche of kind of...
160
00:13:28,948 --> 00:13:29,479
uh
161
00:13:29,479 --> 00:13:36,358
AI experts who probably consulted the bot first, before becoming AI experts, for their
education in the field.
162
00:13:38,765 --> 00:13:42,186
There are an enormous number of those people out there.
163
00:13:43,167 --> 00:13:50,711
I find, because I write about AI, but I write about it from an applied perspective, in terms
of, you know, treating it like any other product.
164
00:13:50,711 --> 00:13:58,416
And I've shared with you, I'm one of those enthusiastic skeptics, you know, similar to
you, but you've got a technical background.
165
00:13:58,416 --> 00:14:01,429
I've got a business, commercial, and healthcare background.
166
00:14:01,429 --> 00:14:03,622
And I find it very funny
167
00:14:03,622 --> 00:14:11,045
to see all of these people, when you look back at their careers, and how they became instant
AI experts 18 months ago.
168
00:14:23,697 --> 00:14:27,789
And that's the wonderful thing I love about you is this has been your life.
169
00:14:27,789 --> 00:14:35,894
I mean, the through path from your linguistics background into where you are today is
extremely clear.
170
00:14:35,894 --> 00:14:41,540
And the thing I love about your background is that through line makes complete
171
00:14:41,540 --> 00:14:49,637
intellectual sense as to why you understand those fundamental underpinnings of what these
transformers are doing.
172
00:14:49,637 --> 00:14:50,367
Because you're right.
173
00:14:50,367 --> 00:14:52,609
I mean, they're basically taking words.
174
00:14:52,609 --> 00:14:58,784
We dump a bunch of words into a context window and they're trying to sort of like look for
patterns, look for associations.
175
00:14:58,784 --> 00:15:02,499
And then, on the basis of that, they predict something that you might want.
176
00:15:02,499 --> 00:15:07,885
And obviously the problem with that is it's, I don't know if you've ever heard this old
177
00:15:07,885 --> 00:15:16,793
joke that if you gave 10,000 monkeys 10,000 typewriters and gave them long enough, you
would eventually get the works of Shakespeare. Yeah.
178
00:15:16,793 --> 00:15:17,573
Yeah.
179
00:15:17,573 --> 00:15:25,210
And so I feel like all we've done is we've applied this massive computing power to
allow that to occur.
180
00:15:25,210 --> 00:15:30,083
So you get this predictive output with lots and lots of loops, you know, in there.
181
00:15:30,083 --> 00:15:35,449
You know, my son calls them Markov loops, because he studies more technical subjects than I
do.
182
00:15:35,449 --> 00:15:36,811
And so I think that that
183
00:15:36,811 --> 00:15:48,538
linguistic background that you've got and that translation background that you've got in
terms of the fundamentals around translating from those various versions of Bibles is an
184
00:15:48,538 --> 00:16:00,525
amazing background compared to some of these IT people who have become AI experts only
because it's the latest flavor of technology and so therefore they've had to study really
185
00:16:00,525 --> 00:16:03,467
hard, very fast to try and keep up.
186
00:16:03,467 --> 00:16:07,087
So yeah, sorry, that was me flattering you even more.
187
00:16:07,087 --> 00:16:09,987
And I apologize for making you blush.
188
00:16:13,024 --> 00:16:13,844
That's okay.
189
00:16:13,844 --> 00:16:23,364
Well, let's pivot a little bit from your background because I know that you're now doing
some very serious strategy work in your work for Siemens.
190
00:16:23,364 --> 00:16:26,724
And we won't touch on that because you're not here to represent Siemens.
191
00:16:26,724 --> 00:16:29,343
We're talking about you and the AI Realist.
192
00:16:29,343 --> 00:16:32,046
Oh, I need to tell you something.
193
00:16:33,089 --> 00:16:39,407
Yes, I'm a Siemens employee, but all views expressed in this podcast are solely my own.
194
00:16:39,407 --> 00:16:44,579
Yeah, this is...
195
00:16:44,579 --> 00:16:46,260
to you as the AI Realist.
196
00:16:46,260 --> 00:16:48,554
And that's what we're talking about.
197
00:16:48,554 --> 00:16:53,210
that I'm supposed to say every time I'm doing something like this.
198
00:16:53,210 --> 00:16:54,222
Absolutely.
199
00:16:54,222 --> 00:17:00,685
All views of the contributor are her own and personal.
200
00:17:01,606 --> 00:17:02,206
Wonderful.
201
00:17:02,206 --> 00:17:07,019
Well, let's talk a little bit about what you see changing in AI right now.
202
00:17:06,992 --> 00:17:09,975
And before we do that, I'll do a little insert here.
203
00:17:09,975 --> 00:17:14,530
This is Maria's publication, the AI Realist on Substack.
204
00:17:14,530 --> 00:17:19,436
And I can strongly recommend that you go and look that up.
205
00:17:19,436 --> 00:17:24,890
Maria, do you mind if I put your LinkedIn connection as well for anybody who wants to look
it up, since it's public?
206
00:17:24,890 --> 00:17:36,320
And so I'll add in Maria's LinkedIn profile so that if you want to learn a little bit more
about what she's doing either professionally or more importantly, go subscribe to the AI
207
00:17:36,320 --> 00:17:36,901
Realist.
208
00:17:36,901 --> 00:17:38,953
It will be such a great read for you.
209
00:17:38,953 --> 00:17:42,537
Some of Maria's work is very technical, but some of it is...
210
00:17:42,537 --> 00:17:48,772
very insightful about just what's going on and how do you cut through the marketing fluff
and hype.
211
00:17:48,772 --> 00:17:52,535
So with that said, Maria, now I've done a small advert for you.
212
00:17:52,535 --> 00:18:07,606
What is it that is changing right now in AI, both at a commercial and industrial
level and also in what you write about? You write a lot about what's happening for the
213
00:18:07,606 --> 00:18:09,839
general utility user
214
00:18:09,839 --> 00:18:13,843
of AI tools out there in the world.