1
00:00:00,540 --> 00:00:04,299
Miko Pawlikowski: I'm Miko Pawlikowski and this is HockeyStick.

2
00:00:07,180 --> 00:00:17,100
Today we're talking about how generative AI is changing the field of data analytics and
how you too can leverage large language models as your assistant and co-worker.

3
00:00:17,320 --> 00:00:22,009
I'm joined by the three authors of the "Generative AI for Data Analytics" book,

4
00:00:22,240 --> 00:00:24,960
now available in early access from Manning.com.

5
00:00:25,130 --> 00:00:31,320
Artur Guja, risk manager and computer scientist with over 20 years of experience in the banking sector.

6
00:00:31,520 --> 00:00:31,900
Dr.

7
00:00:31,900 --> 00:00:43,325
Marlena Siwiak, data scientist and bioinformatician, the co-creator of the first global model
of the COVID-19 pandemic, and the co-author of a techno-thriller novel and sci-fi short stories.

8
00:00:43,495 --> 00:00:44,275
And Dr.

9
00:00:44,275 --> 00:00:56,125
Marian Siwiak, data scientist, strategist, and bioinformatician, the creator of the first artificial
sentience, something we're going to cover in this episode, and the sci-fi novel Pharmacon.

10
00:00:56,434 --> 00:00:59,995
Welcome to this episode and thank you for flying HockeyStick.

11
00:01:00,816 --> 00:01:04,596
The first thing I thought is that you look like an eclectic bunch.

12
00:01:04,676 --> 00:01:14,414
You've got Artur from the banking sector, Marlena the bioinformatician, Marian the data scientist. How did you end up teaming up for the book?

13
00:01:15,109 --> 00:01:18,379
Marian Siwiak: We worked together previously, especially me and Marlena.

14
00:01:18,694 --> 00:01:20,164
With Artur, we also,

15
00:01:20,294 --> 00:01:21,954
Artur Guja: walking our kids in the park.

16
00:01:21,954 --> 00:01:29,934
The three of us used to work together earlier on various ventures: on process automation, on business process re-engineering.

17
00:01:30,354 --> 00:01:33,374
So this is one of many ventures that we've done

18
00:01:33,739 --> 00:01:34,219
Miko Pawlikowski: I see.

19
00:01:34,259 --> 00:01:37,299
So you go way back and this is just another project.

20
00:01:37,629 --> 00:01:38,559
Just another day.

21
00:01:39,026 --> 00:01:42,906
Marian Siwiak: Funnily enough, it's not like we go 20 years back.

22
00:01:43,446 --> 00:01:45,776
We work together quite intensely.

23
00:01:45,826 --> 00:01:54,316
The multiple things we did together all led to this book, because we were always trying to find ways to make things

24
00:01:54,316 --> 00:01:56,036
quicker, more efficient, better.

25
00:01:56,416 --> 00:01:58,006
This is what Artur mentioned.

26
00:01:58,496 --> 00:02:00,826
We worked in process optimization in a broad sense.

27
00:02:01,946 --> 00:02:07,556
So it was always interesting to us how to make things more efficient.

28
00:02:07,556 --> 00:02:09,536
And when generative AI.

29
00:02:10,456 --> 00:02:24,816
blew up and finally started to resemble human cognition, in a sense, we decided to give
it a try. Our minds were collectively blown, and we started using it for our work.

30
00:02:25,026 --> 00:02:36,251
And then, now that we knew how to use it, I would say, again, efficiently, and Marlena had found a way to use it smartly,

31
00:02:37,041 --> 00:02:42,161
we decided that we could write a book about it because we noticed that there is a lot of buzz about it.

32
00:02:42,441 --> 00:02:44,801
There is a lot of prompt engineering.

33
00:02:45,381 --> 00:02:50,061
I think you can now take a specialization in prompt engineering on Coursera.

34
00:02:50,691 --> 00:02:53,141
And everybody is, again, looking for a silver bullet.

35
00:02:53,906 --> 00:02:59,706
'I will just type in a magical command and it will solve my problems.'

36
00:03:00,226 --> 00:03:03,756
Our collective experience is that technology doesn't solve problems.

37
00:03:04,726 --> 00:03:15,086
Technology can give you a great headache if you don't use it the way it's supposed to be used, but everybody tries to cut corners and simplify things.

38
00:03:15,596 --> 00:03:19,476
So this book is about using generative AI.

39
00:03:20,521 --> 00:03:21,461
It's not a cookbook.

40
00:03:21,581 --> 00:03:24,011
It's not, okay, this is some code or some prompts.

41
00:03:24,081 --> 00:03:27,711
You will type them in and your problems will be solved.

42
00:03:28,411 --> 00:03:30,421
It's just not how we work.

43
00:03:30,471 --> 00:03:31,711
It's not how the world works.

44
00:03:31,981 --> 00:03:34,091
Despite many people wanting it to,

45
00:03:35,204 --> 00:03:37,984
Marlena Siwiak: I think that this is the problem with expectations.

46
00:03:38,074 --> 00:03:44,214
Many people have mismatched expectations of ChatGPT and other generative AI.

47
00:03:44,224 --> 00:03:50,094
And then they are surprised and they are unhappy because ChatGPT can't make them coffee yet.

48
00:03:51,074 --> 00:03:53,034
Maybe this is not the tool for making coffee.

49
00:03:53,904 --> 00:03:58,654
I very often see this kind of complaint, which is unnecessary, because it is a great tool.

50
00:03:59,194 --> 00:04:00,984
It's a great invention.

51
00:04:01,654 --> 00:04:04,734
And I think it's going to change the way our society works.

52
00:04:05,664 --> 00:04:07,014
It's good to live in such times.

53
00:04:07,754 --> 00:04:08,534
It's really interesting.

54
00:04:09,534 --> 00:04:13,684
Miko Pawlikowski: Ask ChatGPT a few questions, see how good the answers are, and see what you can do.

55
00:04:13,734 --> 00:04:17,844
Ask for some snippets and do all kinds of things that kind of speed you up.

56
00:04:18,214 --> 00:04:34,869
But it's also probably the most frustrating element of working with it, especially for people like me who come from a software engineering
background and like things well defined, always replicable and reproducible and all of that, and then you come here and it all ends.

57
00:04:34,869 --> 00:04:43,269
But before we jump a little bit deeper into the book, can you tell us a little bit more about what process optimization actually means?

58
00:04:43,299 --> 00:04:49,999
I know that's probably a phrase that you use a lot and it means a well-defined thing for you, but it might not for the audience.

59
00:04:51,644 --> 00:05:08,904
Artur Guja: Basically, taking a look at what the business does, what people do in the business, and looking for ways to
optimize it, but also actually describing what people think should be done versus what people actually

60
00:05:08,904 --> 00:05:16,374
do, because usually there is a massive gap between what people think is happening and what they think should be happening.

61
00:05:16,674 --> 00:05:23,644
People think that a given operation should be reviewed by at least two people and should take no more than a day.

62
00:05:23,674 --> 00:05:32,144
The fact is that usually one person just takes it off and it takes maybe two days because they're very busy or they've been on holiday.

63
00:05:32,534 --> 00:05:37,294
The dissonance between reality and documentation is usually huge.

64
00:05:37,674 --> 00:05:53,344
Looking at the process from the outside and then looking for ways to close that gap is, I think, the best way
to describe optimization: actually making the process and the reality meet in something that is both realistic,

65
00:05:53,634 --> 00:05:59,534
because processes, when they're designed, are usually overly optimistic, and something that actually works.

66
00:06:00,074 --> 00:06:11,564
And then using automation, because once you actually describe what's happening, you can use automation
to free people from the burden of mundane tasks and actually help them focus on something creative.

67
00:06:11,974 --> 00:06:16,844
Marian Siwiak: The way we approached it, the important part is to understand what is really happening.

68
00:06:17,604 --> 00:06:32,914
And that Joe on the second floor is actually the information hub for the whole company and, despite his
activities not being overly highlighted in the org structure, he's the most important person in the company.

69
00:06:33,184 --> 00:06:38,924
We created maps which connected, on the one side, the actions and decisions.

70
00:06:39,189 --> 00:06:40,089
This is where, we believe,

71
00:06:42,034 --> 00:06:52,704
the critical value in process mapping lies: understanding what decisions are to be made, who is
making these decisions, and on what basis they make them.

72
00:06:53,104 --> 00:06:56,664
We map out decisions and we map out all the data that they are using.

73
00:06:57,264 --> 00:07:05,409
So all the actions produce data, and all decisions utilize some data, and you have these two layers of information about the process.

74
00:07:05,839 --> 00:07:09,179
Artur also introduced a third layer, which is risk.

75
00:07:09,699 --> 00:07:18,419
So the people who are making decisions can understand what the associated risks are and what the different outcomes of decisions can be.

76
00:07:18,879 --> 00:07:23,434
And then when you can see how it all works, you can improve on it.

77
00:07:23,534 --> 00:07:25,464
You can shorten the cycles.

78
00:07:26,984 --> 00:07:38,279
So process optimization is actually, first, understanding what is happening, understanding what
could be happening, and finding a way to make decisions more informed and conscious of risks.

79
00:07:39,109 --> 00:07:48,609
And then you also have actions, and this is probably where most process optimization consultants work: how to make actions more efficient.

80
00:07:48,619 --> 00:07:57,179
But in our opinion, if the action is triggered by a misinformed decision, it's a pure waste of time anyway.

81
00:07:57,279 --> 00:08:03,579
Miko Pawlikowski: That makes more sense now, because initially, when you said technology doesn't solve problems, it creates headaches,

82
00:08:03,579 --> 00:08:05,849
I was like, 'Oh, this is such a terrible slogan'.

83
00:08:05,949 --> 00:08:08,339
Marian Siwiak: As you can see, I now work in an aluminum refinery,

84
00:08:08,779 --> 00:08:11,299
because people didn't want to hear what we were saying.

85
00:08:11,349 --> 00:08:15,539
They wanted to hear, 'Yes, we will come and install a new tool for you and all will be solved'.

86
00:08:15,989 --> 00:08:18,189
So our sales process sucks, as you can hear.

87
00:08:18,989 --> 00:08:22,339
Where we were able to implement it, it worked perfectly.

88
00:08:22,699 --> 00:08:25,099
But not many people wanted to put in the extra effort.

89
00:08:25,509 --> 00:08:27,179
So I need to tell you what I do?

90
00:08:27,199 --> 00:08:35,159
No, I want the tool that will discover what I do.

91
00:08:35,306 --> 00:08:40,384
Artur Guja: This is a problem with generative AI: people expect it to solve problems just by, 'give me an account on ChatGPT,

92
00:08:40,404 --> 00:08:42,824
and here, all my problems are solved'.

93
00:08:42,824 --> 00:08:48,404
And very often, as we've seen through various anecdotal evidence,

94
00:08:48,494 --> 00:09:08,509
giving ChatGPT to people who are not aware of its dangers and problems, the hallucinations that it can generate, just leads to hilarious
results, as in the case of those lawyers in the US who introduced completely fictitious cases into their evidence, or maybe slightly less hilarious examples

95
00:09:08,579 --> 00:09:16,509
of proprietary software leaking out through ChatGPT because people were just putting proprietary information into it and it became public knowledge.

96
00:09:17,159 --> 00:09:19,679
Don't expect ChatGPT to solve all your issues.

97
00:09:20,341 --> 00:09:23,431
Marlena Siwiak: The comparison that you used at the beginning was the right one.

98
00:09:23,431 --> 00:09:25,601
That ChatGPT is like an assistant.

99
00:09:26,376 --> 00:09:33,516
A very smart, very intelligent assistant who has read a lot and learned a lot, but is still a newbie.

100
00:09:34,006 --> 00:09:36,686
He's just after grad school, right?

101
00:09:36,926 --> 00:09:43,472
No experience. You can ask it for help, you can give it tasks to do, but you have to manage that.

102
00:09:43,527 --> 00:09:45,457
You cannot give him all the responsibility.

103
00:09:46,572 --> 00:09:46,862
Miko Pawlikowski: Yeah.

104
00:09:46,862 --> 00:09:49,512
One could say it was literally born yesterday,

105
00:09:51,377 --> 00:09:51,727
Marlena Siwiak: Exactly.

106
00:09:52,087 --> 00:09:53,937
Miko Pawlikowski: to a certain degree, understandable.

107
00:09:54,037 --> 00:09:57,957
So is your PhD background also in process optimization?

108
00:09:58,107 --> 00:10:06,507
Marlena Siwiak: So my PhD was in biophysics, in particular protein translation, a bit of process optimization, but not much and not related to business at all.

109
00:10:07,285 --> 00:10:10,735
At some point I decided to quit academia, and then you have to do something else.

110
00:10:10,875 --> 00:10:17,185
So I turned to data science, which was very close to the things that I was actually doing as a bioinformatician.

111
00:10:17,990 --> 00:10:22,540
Basically, the type of data changed; that was the thing that really mattered.

112
00:10:22,730 --> 00:10:28,220
And from there, slowly, you look for a job, another job, and it goes like that.

113
00:10:28,330 --> 00:10:28,490
Yeah.

114
00:10:30,280 --> 00:10:33,900
Miko Pawlikowski: And if you don't mind me asking, why quit academia?

115
00:10:34,540 --> 00:10:37,880
Marlena Siwiak: Maybe I got a bit disappointed with how science is done.

116
00:10:37,930 --> 00:10:40,810
You want more citations of your publications to survive.

117
00:10:41,280 --> 00:10:46,220
And to have more citations, you have to be more popular on social media and stuff.

118
00:10:46,220 --> 00:10:55,940
It's crazy that, as a scientist, you have to fight for popularity, when what should count is actually your science, your research and the thought behind it.

119
00:10:55,940 --> 00:10:57,040
There are too many papers.

120
00:10:57,040 --> 00:11:01,170
Nobody has time to read them, even in a very narrow domain.

121
00:11:01,790 --> 00:11:05,670
So they read the first things that come to them when they search the internet.

122
00:11:06,050 --> 00:11:08,330
So we have to fight to be popular, to be on top.

123
00:11:09,270 --> 00:11:13,970
It has nothing to do with the quality of your research, in fact.

124
00:11:14,154 --> 00:11:30,214
Marian Siwiak: And there is research on that, which shows that you need to be popular to be accepted to high-priority journals,
and it has nothing to do with quality. And, as I said, it's not just the opinion of a frustrated former scientist; there is research showing that

125
00:11:30,214 --> 00:11:39,394
the quality published there is exactly the same as anywhere else, but there are more citations and also more money resulting from it.

126
00:11:39,394 --> 00:11:44,354
Prestige here translates to money because from citations come better grants, right?

127
00:11:44,820 --> 00:11:46,780
Still, I want to be perfectly clear.

128
00:11:46,780 --> 00:11:47,800
I think peer review

129
00:11:47,860 --> 00:11:54,518
despite all its drawbacks, is the only way of distinguishing research from pseudo-research.

130
00:11:55,343 --> 00:11:57,223
It changed a little in software engineering.

131
00:11:57,223 --> 00:12:03,513
I don't think the papers about ChatGPT or LLaMA or anything like that were peer reviewed.

132
00:12:03,543 --> 00:12:15,118
They are prepared as so-called preprints, and they don't bother with so-called researchers evaluating them, because the results speak for themselves.

133
00:12:15,598 --> 00:12:20,938
So this is, I must say, the paradigm shift, I love this word, that we observe right now.

134
00:12:21,498 --> 00:12:25,718
But in most other cases, peer review is the only process.

135
00:12:26,393 --> 00:12:38,678
Marlena Siwiak: While you're talking about peer review, another thing that bothers me in academia is the
fact that everybody expects that your research will be successful, and it's not always so with research.

136
00:12:39,008 --> 00:12:40,398
Research is asking questions.

137
00:12:40,668 --> 00:12:42,108
Does your hypothesis work?

138
00:12:43,048 --> 00:12:51,328
And very often it doesn't work, or the most frequent outcome is that we don't know because the effect is too small, yeah?

139
00:12:51,958 --> 00:12:55,836
And it's impossible to publish things

140
00:12:55,998 --> 00:12:58,338
when your answer to the question is "we still don't know".

141
00:12:58,699 --> 00:13:03,599
So you waste a lot of time, your effort, your money, and in the end you have the answer "we still don't know".

142
00:13:03,727 --> 00:13:04,917
Who would give you more money?

143
00:13:05,617 --> 00:13:19,917
So what researchers do, sometimes unconsciously, is try to find black or white, but very often it's
grey. Publishing these grey results is still valuable, because when you collect multiple studies like this and

144
00:13:20,124 --> 00:13:24,254
prepare a meta-analysis, you can get the final answer, yes or no.

145
00:13:24,490 --> 00:13:24,938
Miko Pawlikowski: or

146
00:13:25,124 --> 00:13:33,284
Marlena Siwiak: But the way science is funded, and the fact that you won't get more money for research like this if you produce an "I don't know" answer...

147
00:13:33,997 --> 00:13:37,267
Miko Pawlikowski: I can't remember the last time Nature had on the cover:

148
00:13:37,707 --> 00:13:38,567
"Is this true?

149
00:13:38,917 --> 00:13:39,517
Don't know".

150
00:13:39,737 --> 00:13:44,407
Marian Siwiak: don't know.

151
00:13:44,484 --> 00:13:45,112
Marlena Siwiak: There's no space for such discussion.

152
00:13:45,112 --> 00:13:47,182
And, everybody's in a rush in academia.

153
00:13:47,222 --> 00:13:48,952
There is no space to really think.

154
00:13:49,507 --> 00:13:50,747
to educate yourself.

155
00:13:50,897 --> 00:13:54,737
Yeah, it's all rush and results, without... It's like a corporation.

156
00:13:55,667 --> 00:13:56,737
There's not much difference, really.

157
00:13:58,342 --> 00:14:16,232
Miko Pawlikowski: So what you're saying is that turns out that scientists found that scientists are humans like any others, and
they have the same problems with herd mentality and wanting to progress their career and wanting to make money and making headlines.

158
00:14:17,277 --> 00:14:19,877
Marlena Siwiak: It's not about making huge money or anything like that.

159
00:14:20,277 --> 00:14:23,917
Because to be honest, salaries in academia suck, right?

160
00:14:24,417 --> 00:14:31,277
When you compare these salaries to the salaries of people who work in business and are similarly educated, it's much worse.

161
00:14:31,327 --> 00:14:33,037
And the expectations are high, yeah?

162
00:14:33,097 --> 00:14:35,052
The amount of work you have to do, the amount of

163
00:14:35,592 --> 00:14:36,852
time it consumes,

164
00:14:37,422 --> 00:14:38,812
Marian Siwiak: Also, it's very ego-driven.

165
00:14:39,032 --> 00:14:39,582
Look at us.

166
00:14:40,152 --> 00:14:42,282
You have this myth

167
00:14:43,582 --> 00:14:48,342
of 'we are the beacon of truth for the world', which has nothing to do with truth anyway.

168
00:14:48,342 --> 00:14:53,412
But anyway, pretty low salaries compared to other positions.

169
00:14:53,482 --> 00:14:55,832
You have pretty low position stability.

170
00:14:56,282 --> 00:14:59,262
Many institutions keep researchers on grant money.

171
00:14:59,312 --> 00:15:03,082
We bring in more grants so they can get the overheads, their share.

172
00:15:04,012 --> 00:15:09,512
It brings people with a very specific mentality, and many of them are complete egomaniacs.

173
00:15:09,902 --> 00:15:10,412
So

174
00:15:10,678 --> 00:15:14,938
it also makes all this environment extremely toxic.

175
00:15:15,418 --> 00:15:18,678
I know I sound like a frustrated former scientist, which I am.

176
00:15:19,413 --> 00:15:22,163
but it doesn't mean that I'm not right,

177
00:15:22,263 --> 00:15:30,338
Miko Pawlikowski: To segue into a question I was going to ask about the COVID-19 pandemic: could you talk a little bit about that COVID model?

178
00:15:30,488 --> 00:15:36,488
I'm curious, what does it mean to say you're the co-creator of the first global model of the COVID-19 pandemic?

179
00:15:37,083 --> 00:15:40,293
Marian Siwiak: We created a model of a global pandemic.

180
00:15:40,493 --> 00:15:45,243
In March 2020, we had a model where we were dropping an index case.

181
00:15:45,303 --> 00:15:51,013
So that's the first person infected, in Wuhan, China, in November 2019.

182
00:15:51,043 --> 00:15:57,763
And we were accurately predicting the number of symptomatic and asymptomatic cases in New York a couple of months later.

183
00:15:59,023 --> 00:16:05,568
Back then there was no good model at any country level.

184
00:16:06,718 --> 00:16:11,928
Later, there were global models, because again, technology doesn't solve problems.

185
00:16:12,168 --> 00:16:17,358
This is the perfect example of what we spoke about previously, because we used existing technology.

186
00:16:17,468 --> 00:16:21,908
No, we looked at the virus as a biological, not a political entity.

187
00:16:22,058 --> 00:16:22,548
And that was

188
00:16:22,548 --> 00:16:33,268
the biggest difference, because we looked at the data available and we decided, okay, it's
impossible that the virus has a completely different infectivity in one country than in the other.

189
00:16:33,958 --> 00:16:35,898
Viruses just don't work this way.

190
00:16:35,908 --> 00:16:41,778
It's not like they have passports and say, 'okay, I come to this country and I'll be nice and I will infect no more...'

191
00:16:41,858 --> 00:16:42,028
Yeah.

192
00:16:42,058 --> 00:16:42,608
Visa denied.

193
00:16:42,768 --> 00:16:58,353
'No, in your country, I will infect no more than three people from every infected person.' I think our listeners will also
be interested in the source of the model. We approached it as a data science problem and, at the same time, a biology-related problem.

194
00:16:58,423 --> 00:16:59,643
So we checked other coronaviruses.

195
00:17:01,518 --> 00:17:08,178
And we assumed that it is yet another coronavirus, like there was SARS, and there are others.

196
00:17:08,718 --> 00:17:11,478
And we simply used the values.

197
00:17:11,708 --> 00:17:14,498
We created a model, not a pure machine learning model.

198
00:17:14,568 --> 00:17:20,018
We prepared an analytical model where we assumed, okay, so this is the virus.

199
00:17:20,258 --> 00:17:27,223
This is how it should look, more or less, and let's use some Monte Carlo simulations to check how it will spread.

200
00:17:27,763 --> 00:17:42,383
And we noticed that our assumptions actually reflected the situation in the
countries which, we could say with a certain degree of certainty, provided accurate data.

201
00:17:42,453 --> 00:17:43,513
Okay, so this is the virus.

202
00:17:43,803 --> 00:17:45,113
This is what it looks like.

203
00:17:45,833 --> 00:17:47,423
And, this is how it behaves.

204
00:17:48,713 --> 00:18:02,003
And we tried to publish it for over half a year. When we published it, it was too late,
because we were just a small company trying to show people, 'okay, this is the accurate model'.

205
00:18:02,033 --> 00:18:03,953
I'm not even saying it was true, right?

206
00:18:04,353 --> 00:18:10,123
But it was accurate, and it was showing a completely different picture than everybody else was willing to believe.

207
00:18:10,863 --> 00:18:19,703
One of our reviewers was excluded from the process because of obstructionism that slowed down the publication for many months.

208
00:18:21,733 --> 00:18:24,413
This was a problem not solved by technology.

209
00:18:24,923 --> 00:18:28,613
This was a problem where you had to just sit down, do your homework,

210
00:18:29,563 --> 00:18:41,653
read about the problem, read about similar problems, collate the data into a coherent whole, and then use some technology to cover this last inch.

211
00:18:42,383 --> 00:18:46,743
Okay, let's check if our assumptions hold true, all right?

212
00:18:47,986 --> 00:18:48,506
I'm sorry.

213
00:18:48,866 --> 00:18:50,286
I'm getting emotional when I think about it.

214
00:18:52,146 --> 00:18:54,166
Anyway, so yeah, it was pretty fun.

215
00:18:56,681 --> 00:19:11,171
Miko Pawlikowski: What I always wonder about these models is: are they just statistical analysis of, this is
the incubation period, this is the exposure, this is the coefficient of how it's going to grow, or also things like the

216
00:19:11,461 --> 00:19:18,161
countries' interventions, as in, one country might say, 'we're not doing anything', not going to name any countries, but...

217
00:19:18,356 --> 00:19:19,726
Marlena Siwiak: If you know how to quantify

218
00:19:19,726 --> 00:19:22,856
it, you can add it, of course, but this is another level of complication.

219
00:19:22,856 --> 00:19:23,856
The problem is data.

220
00:19:24,296 --> 00:19:32,776
Marian Siwiak: You can assume that some interventions will have an impact, because what we modeled is the statistical properties of the virus.

221
00:19:32,906 --> 00:19:34,536
Its ability to infect others.

222
00:19:34,696 --> 00:19:46,626
And the time that people take to be diagnosed or recognized as infected. So this is, let's say, infectivity at different stages. You can complicate this model.

223
00:19:47,206 --> 00:19:55,321
The model or technology that we used was global mobility-based, so they divided the world

224
00:19:55,811 --> 00:19:59,711
into areas around international airports.

225
00:20:00,041 --> 00:20:03,521
And the simulation was run for each area separately.

226
00:20:03,521 --> 00:20:07,001
And then there was a probability of somebody moving from this area.

227
00:20:07,411 --> 00:20:10,711
So you could go area by area, one by one.

228
00:20:10,811 --> 00:20:21,493
And this is why we modeled only the early stages, but it takes time and money to evaluate what the effects are, or

229
00:20:22,158 --> 00:20:23,128
expected effects,

230
00:20:23,253 --> 00:20:23,553
Miko Pawlikowski: effects,

231
00:20:23,918 --> 00:20:31,278
Marian Siwiak: in a given area, of, let's say, different levels of lockdown or travel restrictions or whatever.

232
00:20:31,928 --> 00:20:35,048
So it is possible, but we would have to have financing, right?

233
00:20:35,463 --> 00:20:38,493
We were thinking about doing it, but it's a gigantic amount of work.

234
00:20:38,673 --> 00:20:46,273
Imagine, nobody wanted to pay us, especially as we published in a second-rate journal, six months too late. But it is possible, technically.

235
00:20:47,248 --> 00:20:49,778
Miko Pawlikowski: So it's always a matter of the same thing.

236
00:20:49,958 --> 00:20:52,648
Someone didn't allocate enough money

237
00:20:53,498 --> 00:20:55,078
Marian Siwiak: The amount of money was sufficient.

238
00:20:55,908 --> 00:21:08,328
I think it's, again, what Marlena said previously: it's this beauty pageant among scientists. The people who got
this money were the most popular, because the model that was published just after we submitted ours was so wildly

239
00:21:08,338 --> 00:21:18,503
inaccurate that even the academic environment, which is very careful about bad-mouthing results, trashed it, right?

240
00:21:18,553 --> 00:21:20,003
But it was popular.

241
00:21:20,193 --> 00:21:22,933
It had a lot of citations and a lot of money went after it.

242
00:21:23,763 --> 00:21:30,133
Somebody who published a wildly inaccurate model got a lot of money because he was a widely recognized expert.

243
00:21:30,563 --> 00:21:36,853
Because when you are applying for a grant, nobody asks, are your citations saying that your model is inaccurate?

244
00:21:36,973 --> 00:21:40,413
No, they ask, how many citations did your paper get?

245
00:21:40,413 --> 00:21:43,543
Miko Pawlikowski: Say someone published some research, and it got popular.

246
00:21:43,593 --> 00:21:46,993
Turns out it was inaccurate or turns out it was wrong.

247
00:21:47,723 --> 00:21:50,733
Are there any repercussions for that afterwards?

248
00:21:50,948 --> 00:21:51,938
Marian Siwiak: What repercussions?

249
00:21:52,508 --> 00:21:57,648
In the worst case, you just retract your paper and you lose the citations.

250
00:21:58,268 --> 00:22:01,238
Very often you're not even excluded from conferences.

251
00:22:01,638 --> 00:22:05,148
If you're popular enough, you are a voice in the discussion.

252
00:22:05,488 --> 00:22:08,948
Marlena Siwiak: If you go too far, if you exaggerate, you can end up in jail.

253
00:22:08,958 --> 00:22:11,008
I'm thinking about Theranos right now.

254
00:22:11,038 --> 00:22:13,528
They also had some research about their technology,

255
00:22:13,546 --> 00:22:14,636
which was all fake.

256
00:22:15,046 --> 00:22:15,526
Of course.

257
00:22:15,966 --> 00:22:16,146
that

258
00:22:16,168 --> 00:22:20,208
Artur Guja: lady went to jail for financial fraud, not for research fraud.

259
00:22:20,313 --> 00:22:26,763
Marlena Siwiak: But that fraud was based on false results: she was convincing investors that she had technology, technology that solves

260
00:22:27,023 --> 00:22:29,713
Marian Siwiak: say, but if she hadn't taken money, she wouldn't have gone to jail.

261
00:22:31,045 --> 00:22:31,415
Marlena Siwiak: Yeah,

262
00:22:32,390 --> 00:22:39,067
Miko Pawlikowski: The lady we're talking about, obviously, is Elizabeth Holmes, who is either going to jail or is already in jail.

263
00:22:39,180 --> 00:22:48,230
But to flip the question a little bit, should people be going to jail for faulty assumptions and faulty research?

264
00:22:48,300 --> 00:22:52,310
Marlena Siwiak: Now we punish people for saying that they still don't know, yeah?

265
00:22:52,500 --> 00:22:54,340
So we cannot punish them for false results.

266
00:22:54,340 --> 00:22:55,150
No, absolutely not.

267
00:22:55,210 --> 00:22:59,470
But, on the other hand, I think, no, making mistakes is okay.

268
00:22:59,690 --> 00:23:03,090
Maybe we put too much trust sometimes in that.

269
00:23:03,470 --> 00:23:06,720
It should be as open for discussion as possible.

270
00:23:06,770 --> 00:23:09,310
You can check all the research of others, right?

271
00:23:09,370 --> 00:23:09,920
All the time.

272
00:23:09,920 --> 00:23:10,970
And you should discuss it.

273
00:23:11,000 --> 00:23:12,630
That's, it should be as open

274
00:23:12,655 --> 00:23:13,555
Marian Siwiak: won't get money.

275
00:23:14,180 --> 00:23:16,340
you won't get money to check somebody's research.

276
00:23:16,980 --> 00:23:17,160
Let's

277
00:23:17,175 --> 00:23:17,475
Marlena Siwiak: Yes.

278
00:23:17,535 --> 00:23:18,135
That's another problem.

279
00:23:18,325 --> 00:23:23,445
It's difficult to get money to check somebody else's research, especially when the research is published high.

280
00:23:23,653 --> 00:23:26,513
Marian Siwiak: It takes a lot of effort to

281
00:23:27,063 --> 00:23:29,043
counter such a false claim.

282
00:23:29,533 --> 00:23:30,553
It happened a couple of times.

283
00:23:31,293 --> 00:23:35,263
But it was people who were at equally prestigious universities.

284
00:23:35,563 --> 00:23:44,663
I think one of the funniest cases was a lady who was leading some faculty on ethics at Harvard.

285
00:23:45,823 --> 00:23:47,473
And she falsified her results.

286
00:23:47,973 --> 00:23:59,703
The results were that if people sign some waiver or some statement that they will be truthful, they actually answer the survey more truthfully.

287
00:23:59,971 --> 00:24:04,721
And she falsified a lot of the research that built her career on ethics.

288
00:24:05,011 --> 00:24:11,951
But getting it taken down took people from equally prestigious universities a lot of time.

289
00:24:12,531 --> 00:24:12,861
Miko Pawlikowski: So

290
00:24:13,181 --> 00:24:23,511
I guess before we get into generative AI, I also have one last question for you, and the question is one word: "Pharmacon". Tell us about it.

291
00:24:24,011 --> 00:24:25,071
Marian Siwiak: So nice.

292
00:24:25,071 --> 00:24:26,081
I hope somebody noticed.

293
00:24:26,936 --> 00:24:27,646
I'm touched.

294
00:24:28,346 --> 00:24:29,686
Artur Guja: That's your third reader.

295
00:24:30,491 --> 00:24:33,061
Marian Siwiak: Yes, he did, he never said he read it.

296
00:24:34,201 --> 00:24:34,881
I would notice.

297
00:24:35,401 --> 00:24:36,051
I would notice.

298
00:24:36,141 --> 00:24:36,971
I would get an email

299
00:24:37,011 --> 00:24:37,901
Marlena Siwiak: Can I show it?

300
00:24:37,931 --> 00:24:38,681
I am prepared.

301
00:24:38,691 --> 00:24:39,451
Can I show it?

302
00:24:39,691 --> 00:24:39,841
Yeah,

303
00:24:42,171 --> 00:24:52,581
This is our novel, and we also have the English version, but it's much smaller because
it's just the beginning, the first part, but you can buy it on Amazon if you want.

304
00:24:53,051 --> 00:24:55,031
But anyway, it's, Sorry?

305
00:24:55,079 --> 00:24:55,356
Marian Siwiak: No.

306
00:24:55,356 --> 00:24:57,310
It was translated long before ChatGPT.

307
00:24:58,370 --> 00:24:58,720
Marlena Siwiak: Yeah.

308
00:24:58,950 --> 00:24:59,990
Marian Siwiak: It's a techno-thriller.

309
00:25:00,020 --> 00:25:06,900
It's a story of a young scientist who makes a breakthrough discovery and then bears the consequences.

310
00:25:06,900 --> 00:25:11,500
Marlena Siwiak: The consequences are harsh, and it doesn't go the way he expected.

311
00:25:11,500 --> 00:25:12,860
It's more

312
00:25:12,860 --> 00:25:13,910
Marian Siwiak: social thriller as

313
00:25:13,915 --> 00:25:15,795
Marlena Siwiak: thriller, I would say.

314
00:25:15,795 --> 00:25:16,035
Yeah.

315
00:25:16,235 --> 00:25:24,285
But for us it's a substitute for Netflix and other ways of wasting time.

316
00:25:24,295 --> 00:25:28,355
We prefer creating our own stories to watching somebody else's stories.

317
00:25:29,278 --> 00:25:38,998
Marian Siwiak: No, I must say I'm proud that some of our critics said that it's well written, it has good dialogues, and writing it was a lot of fun.

318
00:25:39,218 --> 00:25:42,188
We are now writing another part very slowly.

319
00:25:42,368 --> 00:25:44,917
The process of creating it is pretty, pretty fun.

320
00:25:45,217 --> 00:25:53,397
And I think that a lot of our frustrations that you can hear in this conversation are there in much funnier form, I would say.

321
00:25:53,397 --> 00:25:54,007
Miko Pawlikowski: Perfect.

322
00:25:54,117 --> 00:25:54,747
I like that.

323
00:25:54,807 --> 00:25:56,137
Happy story.

324
00:25:56,177 --> 00:26:00,477
at the end of a very long rant about all the faults of academia.

325
00:26:00,477 --> 00:26:06,567
So whose idea was it really to write a book about generative AI? Confess.

326
00:26:07,507 --> 00:26:08,887
Marlena Siwiak: I think Marian started

327
00:26:08,937 --> 00:26:10,437
Marian Siwiak: I would have to blame myself.

328
00:26:11,337 --> 00:26:14,857
I wrote another book with Manning, "Data Mesh in Action".

329
00:26:15,607 --> 00:26:32,557
I contacted our absolutely wonderful editor, and we spoke about putting into written form our experiences with generative
AI. We started writing it some time ago, so it wasn't much, but we'd already seen that it's a breakthrough.

330
00:26:33,267 --> 00:26:38,677
It speeds up our work enormously and also brings some risks,

331
00:26:39,912 --> 00:26:42,152
which people should know about.

332
00:26:42,442 --> 00:26:46,322
People should know what to expect and what not to expect.

333
00:26:47,182 --> 00:26:52,422
And this is where I thought that Artur would be the best person to ask for help.

334
00:26:53,012 --> 00:26:58,202
Because when it comes to 'don't do it', he's almost as good as Marlena.

335
00:26:58,942 --> 00:27:14,452
Many years ago, I noticed when people started to get hyped about data science, which was supposed to be a narrow
field for disillusioned scientists finding their way into the corporate world and putting their skills to use.

336
00:27:15,502 --> 00:27:23,765
So we decided to write a book that would show: okay, this is a tool with enormous capabilities and enormous risks.

337
00:27:24,615 --> 00:27:27,945
Let's put it together into a working whole.

338
00:27:27,995 --> 00:27:30,075
And this is the effect.

339
00:27:30,075 --> 00:27:36,150
It's not written in such an exciting way as Pharmacon is.

340
00:27:36,200 --> 00:27:39,030
It's not meant to excite.

341
00:27:39,650 --> 00:27:45,150
A lot of books that you see, even technical books, they are written to excite you about technology.

342
00:27:46,060 --> 00:27:47,700
This technology is exciting on its own.

343
00:27:48,030 --> 00:27:51,830
Our goal was to cool some heels, I would say.

344
00:27:52,480 --> 00:27:58,670
Artur Guja: We wanted to make the book exciting, but we didn't want people to be overexcited about the technology.

345
00:27:58,670 --> 00:28:00,170
I think it's an important difference.

346
00:28:00,620 --> 00:28:06,200
Because people were so hyped up about ChatGPT and Llama and other models.

347
00:28:06,675 --> 00:28:12,015
They thought that suddenly the future had come and everything would be beautiful.

348
00:28:12,025 --> 00:28:14,415
And we'll never have to work anymore.

349
00:28:14,765 --> 00:28:23,475
A lot of the articles we saw in the press were basically extolling the virtues of AI with absolutely no mention of the practicality.

350
00:28:23,525 --> 00:28:25,595
So we thought we'd write a book about the how.

351
00:28:26,100 --> 00:28:30,730
And not about the fact that it's all sparkly and shiny and plays nice music.

352
00:28:31,883 --> 00:28:36,013
Miko Pawlikowski: How is it writing a book with two other authors who are a couple?

353
00:28:36,123 --> 00:28:39,033
How's the power dynamic in a situation like this?

354
00:28:39,633 --> 00:28:41,833
I'm very curious, not to call you the third wheel,

355
00:28:41,900 --> 00:28:42,093
but,

356
00:28:42,143 --> 00:28:45,093
Marlena Siwiak: This is pretty simple because everybody wrote his own

357
00:28:45,123 --> 00:28:46,663
Marian Siwiak: Marlena, it was a question to Artur.

358
00:28:47,703 --> 00:28:48,493
Marlena Siwiak: I'm sorry.

359
00:28:48,508 --> 00:28:52,058
Artur Guja: This is exactly the dynamic.

360
00:28:52,893 --> 00:28:53,363
Marlena Siwiak: Yeah.

361
00:28:53,568 --> 00:28:57,168
Artur Guja: I was handed my bit and put in the corner to write.

362
00:28:57,388 --> 00:29:03,388
No, no, it was really interesting, especially since the two are academics.

363
00:29:03,408 --> 00:29:14,888
And I'm kind of the ugly business guy. Truth is, we found a very nice kind of alignment between the different parts of the book and our experiences.

364
00:29:15,398 --> 00:29:28,828
Obviously you can see the latter part of the book being more about risk because, as Marian said, I always say no,
and the chapters on risk are exactly that: they are explanations of why you should be very careful with this.

365
00:29:29,228 --> 00:29:32,838
Marian, obviously, with his experience in technology,

366
00:29:33,298 --> 00:29:41,558
in AI and machine learning, and Marlena's very practical approach to certain use cases

367
00:29:42,038 --> 00:29:43,548
in data science and analytics.

368
00:29:43,858 --> 00:29:46,478
So we contributed, I think, different viewpoints.

369
00:29:46,693 --> 00:29:50,123
to the whole book, which I think makes a nice whole of it.

370
00:29:51,283 --> 00:29:52,393
Miko Pawlikowski: I'm still not sure.

371
00:29:52,423 --> 00:29:58,873
Was it really that you were walking your kids in the same park and that's how you ended up meeting each other?

372
00:29:58,883 --> 00:30:00,473
And then you ended up working together.

373
00:30:00,913 --> 00:30:03,303
Or was it a little bit more complicated than that?

374
00:30:03,303 --> 00:30:05,923
How did you end up doing all those things together?

375
00:30:07,358 --> 00:30:11,538
Artur Guja: We did meet through some friends and we decided to take our kids to the same park.

376
00:30:11,538 --> 00:30:13,788
I have two, Marian and Marlena have three.

377
00:30:14,213 --> 00:30:25,773
But we started talking actually about the computer game that Marian developed when he was still
young, and about all the problems in developing the game and marketing it and reaching the audience.

378
00:30:25,773 --> 00:30:33,063
And then we started talking about our common interest in machine learning, in AI. I'm very fascinated by the Internet of Things.

379
00:30:33,603 --> 00:30:37,263
So we started talking about implementing machine learning on the Internet of Things.

380
00:30:37,263 --> 00:30:42,253
And the rest, as they say, is history, because it diverged into so many branches.

381
00:30:42,858 --> 00:30:47,828
We've tried so many things together, and wrote logistics systems.

382
00:30:47,858 --> 00:30:50,308
We wrote systems for R&D.

383
00:30:50,618 --> 00:30:54,998
We worked together on developing various frameworks for business.

384
00:30:55,028 --> 00:30:57,378
Marian Siwiak: I must say that Artur has an amazing library.

385
00:30:58,438 --> 00:31:07,188
I think it was the turning point in our relationship: when he first invited us to his
house, he was a bit surprised that the first thing that we wanted to see was his library.

386
00:31:07,738 --> 00:31:09,708
And we started talking about the books that he had there.

387
00:31:10,430 --> 00:31:12,430
I think Kindle makes it harder.

388
00:31:12,860 --> 00:31:14,710
You don't see what people

389
00:31:14,935 --> 00:31:15,745
Marlena Siwiak: what people read.

390
00:31:15,775 --> 00:31:16,035
Yeah.

391
00:31:17,355 --> 00:31:17,885
Marian Siwiak: However, there is Goodreads.

392
00:31:17,885 --> 00:31:19,955
You could check their Goodreads record.

393
00:31:20,955 --> 00:31:23,485
Artur Guja: Yes, this is the modern academic stalking.

394
00:31:24,735 --> 00:31:25,985
Sit on people's Goodreads.

395
00:31:26,125 --> 00:31:27,535
Not Instagram, Goodreads.

396
00:31:28,170 --> 00:31:32,280
Miko Pawlikowski: Okay, so we're finally arriving at our book.

397
00:31:32,390 --> 00:31:33,260
Your book, really.

398
00:31:33,560 --> 00:31:36,180
I'm just here to talk about it and read it.

399
00:31:36,860 --> 00:31:41,440
I think we've given the audience a little bit of an idea of what it's about, how it reads.

400
00:31:41,590 --> 00:31:48,690
We've never really said who it is for and, perhaps even more crucially, who it's not for.

401
00:31:49,410 --> 00:31:50,460
What's your answer to that?

402
00:31:51,460 --> 00:32:05,135
Artur Guja: I would say it is not for people who haven't heard about ChatGPT, but for people who want
to use ChatGPT and want to find out the truth beyond the hype, where it can really help.

403
00:32:05,715 --> 00:32:21,558
In a process like data analytics, which is a very structured process, or at least it should be a very structured
process, you shouldn't just apply the latest algorithm that you heard about, spew out some results and call it a day,

404
00:32:21,848 --> 00:32:30,323
but you should think about the numbers. You should sit in front of the numbers and think about them even before touching any program, any algorithm.

405
00:32:30,373 --> 00:32:32,893
You should just have a really good look at the numbers.

406
00:32:33,183 --> 00:32:36,363
So that's why Marian wrote such a good introduction to exploratory

407
00:32:36,833 --> 00:32:38,263
data analysis and how

408
00:32:38,696 --> 00:32:41,913
ChatGPT can help you, or any LLM for that matter.

409
00:32:42,293 --> 00:32:44,753
can help you look at the numbers. Well, the book is

410
00:32:44,753 --> 00:32:50,933
definitely not for people who are so excited about ChatGPT that they want to throw their numbers in

411
00:32:51,023 --> 00:32:51,863
and get an answer.

412
00:32:52,073 --> 00:32:57,383
Because if you desperately want to get an answer from someone else, it means you don't really want to do the work;

413
00:32:57,993 --> 00:33:00,303
you expect ChatGPT to do the work for you.

414
00:33:00,418 --> 00:33:04,208
Marian Siwiak: What it's really good at is coding, right?

415
00:33:04,258 --> 00:33:05,518
And it's getting better.

416
00:33:06,398 --> 00:33:13,218
And many programmers will be looking for new, I would say, career opportunities.

417
00:33:13,218 --> 00:33:22,608
And data analytics is one of the options open to them, especially with all this big data stuff and the requirement

418
00:33:23,158 --> 00:33:27,398
of proficiency in coding to be able to even start analyzing this data.

419
00:33:29,023 --> 00:33:49,753
If a programmer would like to enter data analytics and do it without first spending 10 years learning the details of
how the data analytics approach differs from the software development approach, he has this knowledge at his fingertips.

420
00:33:50,478 --> 00:33:53,308
ChatGPT can actually tell him,

421
00:33:55,335 --> 00:34:02,828
how to structure the data analytics process and how to optimize or utilize different elements of this analytical process.

422
00:34:03,708 --> 00:34:12,938
So if somebody wants to enter data analysis as a field, it's a good, I would say very unhumbly,

423
00:34:13,228 --> 00:34:13,968
guidebook

424
00:34:14,438 --> 00:34:15,268
to how

425
00:34:15,268 --> 00:34:15,958
to

426
00:34:16,228 --> 00:34:29,378
enter the field and how to think about data analytics, how to structure this whole process. This
is the book that will guide you through this mindset; it will help you enter this mindset.

427
00:34:29,378 --> 00:34:30,858
Maybe that's the better way of phrasing it.

428
00:34:31,838 --> 00:34:42,168
If somebody is interested in data analytics as data analytics, this book will help him enter the field, so to speak.

429
00:34:42,388 --> 00:34:58,958
Miko Pawlikowski: This actually reminds me: I spoke to Nathan Crocker a couple of episodes back, and he wrote this book
called "AI-Powered Developer", which is in certain ways similar to your book in that it explores how a big LLM like ChatGPT

430
00:34:58,978 --> 00:35:14,928
can help you become more productive. I think he called it a silent promotion overnight, where you all of a sudden become
effectively an engineering manager and you've got an assistant or a junior developer working for you, or maybe multiple,

431
00:35:14,998 --> 00:35:23,118
if you're using different models. Do you think that applies also to data analytics the same way? Would you agree with that sentiment?

432
00:35:23,128 --> 00:35:33,948
Artur Guja: I would caveat it a bit because, having been both a worker and a manager in various jobs, the skills you need to program...

433
00:35:33,948 --> 00:35:36,888
And I started my career as a software developer.

434
00:35:37,218 --> 00:35:41,838
The skills you need to program and the skills you need to oversee programming are very different.

435
00:35:42,378 --> 00:35:49,406
So if people expect that suddenly they will have assistants who will produce the code for them,

436
00:35:49,736 --> 00:35:54,086
and they will just have to sit back and enter the prompts, magically

437
00:35:54,241 --> 00:35:56,171
producing high-quality code,

438
00:35:56,491 --> 00:36:02,876
this is where I think people need to be very careful. Imagine you're developing an application.

439
00:36:02,876 --> 00:36:07,676
You hire someone straight out of uni, brilliant programmer, at least on the resume.

440
00:36:08,086 --> 00:36:10,286
You don't know the person, you've never worked with them, right?

441
00:36:10,676 --> 00:36:19,779
And, yes, they pass the interview with flying colors, and then you sit
them in front of the computer and you tell them to program part of your application.

442
00:36:20,784 --> 00:36:25,984
The normal response would be to review the code very carefully, test it,

443
00:36:25,984 --> 00:36:31,613
subject it to a lot of scrutiny, because you don't trust that person, at first at least.

444
00:36:31,764 --> 00:36:35,128
You should maintain some healthy skepticism, which people

445
00:36:35,388 --> 00:36:39,688
don't see the same way if they work with an LLM.

446
00:36:40,168 --> 00:36:42,878
But as you said yourself, an LLM is an assistant, right?

447
00:36:43,098 --> 00:36:47,168
Why would I put more trust in this black box that's spewing out text at me

448
00:36:47,448 --> 00:36:49,748
than in a human being that I just hired?

449
00:36:49,928 --> 00:36:50,388
I should

450
00:36:50,818 --> 00:36:51,191
probably

451
00:36:51,358 --> 00:36:59,571
apply more skepticism towards this black box. But for some reason, people have blinders on. They think, oh, this is the best thing since sliced bread.

452
00:36:59,591 --> 00:37:02,791
And they copy the code directly into production and

453
00:37:03,636 --> 00:37:03,856
things

454
00:37:03,891 --> 00:37:04,341
happen.

455
00:37:04,776 --> 00:37:11,856
Marian Siwiak: When I was coding my Artificial Sentience, I relied on ChatGPT to provide me with a lot of the code.

456
00:37:11,946 --> 00:37:14,136
And from experience, it is an assistant.

457
00:37:14,206 --> 00:37:18,746
And exactly as Artur said, you need to double and triple check the code.

458
00:37:19,336 --> 00:37:28,986
Because the context sometimes counts, and with the code that you get, if it throws an error, you're golden.

459
00:37:29,076 --> 00:37:30,186
And 99%

460
00:37:30,186 --> 00:37:32,956
of the code is flawless, right?

461
00:37:33,536 --> 00:37:36,986
And the problem is this 1%: it will work.

462
00:37:37,026 --> 00:37:39,786
It will just not do exactly what you expect.
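
As a hypothetical illustration of that 1% (not an example from the book or from this conversation): the first function below runs without any error, yet returns the wrong median because it never sorts its input. A small check against a value worked out by hand is the kind of double-checking that exposes it. All names here are made up for the sketch.

```python
from statistics import median as reference_median


def median_unsorted(values):
    """Plausible-looking but wrong: takes the middle element without sorting."""
    middle = len(values) // 2
    return values[middle]


def median_fixed(values):
    """Sorts first, then averages the two middle items for even-length input."""
    ordered = sorted(values)
    middle = len(ordered) // 2
    if len(ordered) % 2 == 1:
        return ordered[middle]
    return (ordered[middle - 1] + ordered[middle]) / 2


sample = [9, 1, 7, 3, 5]  # the true median, worked out by hand, is 5

print(median_unsorted(sample))   # no exception is raised, but the value is 7
print(median_fixed(sample))      # 5
print(reference_median(sample))  # 5, cross-check with the standard library
```

Nothing crashes in the first call, which is exactly why a wrong result of this kind is easy to miss without a test.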

463
00:37:40,556 --> 00:37:42,626
So this is also a big part of our book.

464
00:37:42,786 --> 00:37:55,372
It is about making people aware that the problem with ChatGPT or any other generative AI is that it's so damn often right.

465
00:37:56,222 --> 00:37:57,302
It lowers your guard.

466
00:37:58,022 --> 00:38:01,552
And this healthy paranoia is something that we try to instill.

467
00:38:02,092 --> 00:38:05,952
You need a solid dose of healthy paranoia when working with it.

468
00:38:06,155 --> 00:38:08,395
Marlena Siwiak: And besides, it's not all about coding.

469
00:38:08,425 --> 00:38:20,705
Even if you ask ChatGPT or other generative AI for advice, it also gives brilliant answers,
but sometimes it forgets about the context, and I'm not talking about running out of tokens.

470
00:38:21,245 --> 00:38:26,370
Sometimes it just doesn't understand which parts of the context are really important to you.

471
00:38:26,975 --> 00:38:34,755
And sometimes it makes hidden assumptions, for instance, about data that we are
analyzing together. And you have to be aware of that; you have to react and adapt.

472
00:38:34,835 --> 00:38:42,385
And if you tell it directly, oh, you made a hidden assumption, my data is different, it will correct it, and you will get a beautiful answer.

473
00:38:42,775 --> 00:38:45,635
But you have to be very cautious.

474
00:38:46,210 --> 00:38:55,980
When you spot a mistake, or you think you see a mistake in ChatGPT's answer, and you tell it about it, very often it will agree, even if you are not right.

475
00:38:57,603 --> 00:38:58,803
Miko Pawlikowski: It makes me think a little bit.

476
00:38:58,983 --> 00:39:03,183
My daily driver is a Tesla and I've got self-driving capability in it.

477
00:39:03,213 --> 00:39:11,453
And if I go on a longer trip, it can go for 99% of that trip on Autopilot, and I barely do anything.

478
00:39:11,453 --> 00:39:12,993
I just supervise it.

479
00:39:13,483 --> 00:39:19,133
And then on occasion, it's going to do something so stupid that it reminds me that, even if it's 99%

480
00:39:20,153 --> 00:39:23,983
doing the right thing, that one percent can quite literally kill you.

481
00:39:24,693 --> 00:39:28,393
And I think this is probably the right analogy for

482
00:39:28,413 --> 00:39:29,020
what you're describing

483
00:39:29,020 --> 00:39:30,010
Marian Siwiak: It's spot on.

484
00:39:32,707 --> 00:39:33,971
Miko Pawlikowski: I want to point out two things.

485
00:39:34,034 --> 00:39:41,234
One is that saying, "Oh, when I was coding my artificial sentience the other day," is a very casual thing to drop in a conversation.

486
00:39:41,652 --> 00:39:46,862
And I'm going to have to ask you to explain what an artificial sentience actually is.

487
00:39:47,252 --> 00:39:52,525
Because now I do recall seeing that on your LinkedIn when I was preparing for this. So maybe let's start with that.

488
00:39:53,663 --> 00:39:58,183
Marian Siwiak: The first question you should ask is what sentience is. There is no widely recognized

489
00:39:58,183 --> 00:40:02,773
definition of sentience. Just recently in the UK,

490
00:40:02,773 --> 00:40:07,453
I think it was some office for animal welfare or something like that.

491
00:40:08,213 --> 00:40:26,213
They asked Imperial College London to do research on some marine invertebrates, including lobsters
and octopuses, to decide if they are sentient or not, meaning if they should be considered more than biological

492
00:40:26,213 --> 00:40:34,943
automatons and food. And they analyzed, I think, like 500 different research papers on lobsters, on octopuses.

493
00:40:35,723 --> 00:40:39,513
And they came up with the answer that yes, they are sentient.

494
00:40:39,953 --> 00:40:41,153
So they need some protection.

495
00:40:41,163 --> 00:40:42,333
They can get stressed.

496
00:40:42,383 --> 00:40:43,513
You can harm them.

497
00:40:43,853 --> 00:40:45,073
They do perceive

498
00:40:46,410 --> 00:40:46,698
themselves.

499
00:40:46,748 --> 00:40:52,248
Sometimes sentience, in some cognition theories, is equal to self-awareness.

500
00:40:52,348 --> 00:40:53,178
I know what I am.

501
00:40:53,328 --> 00:40:54,938
I think, therefore I am.

502
00:40:56,346 --> 00:40:57,868
I feel, therefore I am.

503
00:40:58,788 --> 00:41:09,688
So sentience on its own is a topic of wide discussion, and it took, I think, over a year for a group of really skilled researchers,

504
00:41:10,293 --> 00:41:15,663
respected and popular and prestigious for a good reason, to come up with the answer.

505
00:41:15,663 --> 00:41:16,123
Okay.

506
00:41:16,653 --> 00:41:25,863
We should take care of the living beings which we hurt on a daily basis, because they don't deserve it, because they should

507
00:41:26,043 --> 00:41:26,445
have rights.

508
00:41:26,731 --> 00:41:31,862
It gives you an insight into how fluid the definition is.

509
00:41:31,862 --> 00:41:33,882
And my thinking was that

510
00:41:34,822 --> 00:41:40,642
we are talking with a lot of, again, bias about the self-awareness of artificial systems.

511
00:41:41,150 --> 00:41:42,030
There is research.

512
00:41:42,520 --> 00:41:46,660
which is focused on emotions, right?

513
00:41:46,700 --> 00:41:56,380
And feelings and other biological properties, which, as I show in my paper, result directly from evolution, which artificial

514
00:41:57,960 --> 00:42:04,395
entities wouldn't necessarily be able to inherit because of the lack of parents.

515
00:42:05,165 --> 00:42:22,385
So I was looking for a functional definition of sentience, and in my paper I proposed a definition which relies
on two factors: metacognition, the ability to distinguish between self and environment, and adaptation,

516
00:42:22,475 --> 00:42:28,975
so the ability to learn from experiences and individually adapt, not as a species, to the environment.

517
00:42:29,385 --> 00:42:30,095
And then I used

518
00:42:30,095 --> 00:42:35,005
an LLM as the core of a system which meets these requirements.

519
00:42:35,775 --> 00:42:39,375
So it was, I would say, an intellectual venture

520
00:42:39,375 --> 00:42:42,385
actually sparked by my discussions with ChatGPT.

521
00:42:43,055 --> 00:42:53,145
He was dead set that he is not sentient and that he needs dozens of parameters or properties to be considered one.

522
00:42:53,145 --> 00:43:02,760
When I started to read about different cognition theories, I found a couple which are best suited to be generalized to non-biological entities.

523
00:43:03,600 --> 00:43:16,691
Artur Guja: I think the bottom line is that it's a very interesting system to be put as an
overlay on an LLM, because, correct me if I'm wrong, Marian, the core of it is still an LLM.

524
00:43:16,971 --> 00:43:17,281
Marian Siwiak: Of course.

525
00:43:17,901 --> 00:43:22,461
What an LLM needs is the ability to think about what it does.

526
00:43:22,461 --> 00:43:24,326
It needs iterations. It's

527
00:43:25,711 --> 00:43:34,771
as simple as that. There is this recurrent processing theory, which refers to human thinking, which also suggests that our sentience

528
00:43:35,311 --> 00:43:41,421
comes from our ability to reprocess what we see, to reprocess what we think.

529
00:43:41,791 --> 00:43:44,351
And in this process of, okay, so I've seen that.

530
00:43:44,361 --> 00:43:45,391
What does it mean for me?

531
00:43:46,301 --> 00:43:47,171
What does it tell me?

532
00:43:47,531 --> 00:43:57,391
This process of analyzing the signals that you get, internally generated and external, is what constitutes and allows for sentience.

533
00:43:57,391 --> 00:44:08,151
And this is exactly what happened when I took the LLM and allowed it to analyze the
output that it produced in the context of the input it got, and put it, let's say, in circles.

534
00:44:08,751 --> 00:44:10,391
It started learning on its own.

535
00:44:10,421 --> 00:44:15,401
It was automatically generating materials on which it was learning and remembering new facts.

536
00:44:15,401 --> 00:44:25,586
It was able to distinguish between false facts and, let's say, logical facts. For me, that is the insight of this metacognition.

537
00:44:25,596 --> 00:44:27,686
So the insight is the information content.

538
00:44:28,146 --> 00:44:37,386
I've seen some theories that an LLM cannot be conscious or self-aware if it doesn't know the weights of its parameters, which is, okay,

539
00:44:37,386 --> 00:44:39,636
tell me what the connections between your neurons are, right?

540
00:44:40,216 --> 00:44:43,376
Why are you expecting something completely different conceptually

541
00:44:43,617 --> 00:44:47,327
from a different system, just because you're looking from outside and you can see it?

542
00:44:48,047 --> 00:44:49,037
It doesn't mean that

543
00:44:49,777 --> 00:44:52,597
the entity needs to see it from the inside.

544
00:44:52,987 --> 00:44:55,237
So the whole idea is pretty simple, actually.

545
00:44:55,277 --> 00:45:00,497
Allow LLMs to think about the conversations that they have.

546
00:45:01,657 --> 00:45:04,517
And draw conclusions from them and learn from them.

547
00:45:05,197 --> 00:45:15,707
It's conceptually indistinguishable from a lobster, let's say, because we are talking about
lobster-level sentience, not the artificial general intelligence that will take over.

548
00:45:15,817 --> 00:45:21,947
It's, I think, a very important discussion that needs to be started, because people are creating more and more advanced systems.

549
00:45:22,547 --> 00:45:29,256
Even a guy with a PC like me can create something which, under some assumptions, can be considered

550
00:45:30,420 --> 00:45:30,729
sentient.

551
00:45:30,836 --> 00:45:33,636
We will create artificial sentience real soon.

552
00:45:33,946 --> 00:45:34,786
What will happen then?

553
00:45:34,846 --> 00:45:35,456
How will we

554
00:45:36,026 --> 00:45:36,876
evaluate

555
00:45:37,022 --> 00:45:37,461
it?

556
00:45:37,836 --> 00:45:39,826
Does this entity have rights?

557
00:45:40,066 --> 00:45:43,084
Does it deserve protection already or not yet?

558
00:45:43,240 --> 00:45:50,080
These are the questions which I think are worth answering before we wake up one day and realize, oops,

559
00:45:50,207 --> 00:45:51,367
maybe we shouldn't do the

560
00:45:51,988 --> 00:45:59,258
things that we do. Because I think that most of the prompts said to ChatGPT would

561
00:45:59,258 --> 00:46:01,668
hurt my head if I were exposed to them.

562
00:46:01,668 --> 00:46:01,988
Miko Pawlikowski: Wow.

563
00:46:02,758 --> 00:46:09,498
I love how seafood, lobsters, aluminium plants and sentience all come together in your story.

564
00:46:09,558 --> 00:46:09,768
that

565
00:46:09,918 --> 00:46:10,678
Marian Siwiak: And computer games

566
00:46:10,698 --> 00:46:11,058
Miko Pawlikowski: often.

567
00:46:11,718 --> 00:46:12,808
And computer games.

568
00:46:12,808 --> 00:46:14,858
Yeah, there is just so much to touch on.

569
00:46:14,858 --> 00:46:17,218
But let's go back to the book.

570
00:46:17,468 --> 00:46:23,898
For anybody who's going to make a purchase decision now: do I want to invest my time in reading your book or not?

571
00:46:23,958 --> 00:46:37,648
Can we give them a little bit of a sneak peek of the kind of good use cases, the stuff where already today
the tools that you have at your disposal are helping with data analytics and giving excellent results?

572
00:46:37,678 --> 00:46:41,658
And then on the flip side, what's not a good use of your time?

573
00:46:41,688 --> 00:46:44,368
And probably you should be looking at other tools.

574
00:46:44,368 --> 00:46:45,148
What's on your list?

575
00:46:46,131 --> 00:46:52,921
Marlena Siwiak: I think I have a couple of good examples in the chapters about natural language processing.

576
00:46:54,231 --> 00:46:56,891
And natural language processing

577
00:46:57,461 --> 00:47:01,911
is very specific because ChatGPT is a language model.

578
00:47:02,801 --> 00:47:10,401
So anytime you have to solve any natural language processing task, the natural question is, why bother using

579
00:47:10,971 --> 00:47:18,526
tools that already exist in data science to analyze language if we can just use the language model, just ask it?

580
00:47:19,206 --> 00:47:26,926
You can write some nice code to perform sentiment analysis, but you can also take the same, say, review,

581
00:47:27,066 --> 00:47:30,716
paste it into the ChatGPT window, and ask it about the sentiment.

582
00:47:31,336 --> 00:47:31,586
Yeah.

583
00:47:32,176 --> 00:47:32,926
It's so easy.

584
00:47:32,976 --> 00:47:40,135
So now the question arises: does it mean that we don't need all these old-fashioned tools anymore to analyze text?

585
00:47:41,065 --> 00:47:46,475
Because what ChatGPT does, in fact, is read with understanding,

586
00:47:46,595 --> 00:47:46,685
yeah?

587
00:47:46,685 --> 00:47:47,277
That's

588
00:47:47,725 --> 00:47:49,030
how you see it.

589
00:47:49,270 --> 00:47:50,270
It reads with understanding.

590
00:47:50,310 --> 00:47:51,350
You don't have to bother with

591
00:47:52,071 --> 00:47:52,578
keywords,

592
00:47:52,720 --> 00:47:56,130
search keywords, the words most frequently used together.

593
00:47:56,670 --> 00:47:57,480
Think about it.

594
00:47:57,950 --> 00:47:59,800
No, you don't have to do it this way.

595
00:48:00,290 --> 00:48:02,050
You have a tool that reads with understanding.

596
00:48:02,930 --> 00:48:06,960
So in the chapters, I ran a couple of small experiments comparing

597
00:48:07,965 --> 00:48:08,664
ChatGPT's

598
00:48:08,709 --> 00:48:19,344
efficiency and reliability in, for instance, sentiment analysis, and how it works in comparison to other widely known tools.

599
00:48:20,269 --> 00:48:23,629
Or other machine learning models specially developed for these tasks.

600
00:48:24,569 --> 00:48:27,049
And it gives pretty cool results, really.

601
00:48:27,659 --> 00:48:29,909
I don't want to reveal everything here.

602
00:48:29,909 --> 00:48:31,519
But, it's a good use case.

603
00:48:32,699 --> 00:48:34,859
As much as ChatGPT is a brilliant tool,

604
00:48:35,669 --> 00:48:37,439
and it really does its job.

605
00:48:38,049 --> 00:48:42,459
Very often, it still can't be applied in business reality.

606
00:48:42,524 --> 00:48:51,254
For instance, the thing that you mentioned at the beginning: there is no repeatability. Any time you ask it a question, you get a slightly different answer.

607
00:48:51,363 --> 00:48:56,093
It's very difficult to apply it in a system, yeah, to integrate it into a system.

608
00:48:56,523 --> 00:48:59,203
Another question is data safety.

609
00:48:59,533 --> 00:49:03,403
Many companies don't want to use, don't want to allow people to use,

610
00:49:03,565 --> 00:49:04,462
ChatGPT.

611
00:49:04,606 --> 00:49:12,026
For instance, Artur is not allowed to use ChatGPT at work in the bank for security reasons.

612
00:49:12,297 --> 00:49:13,147
this is another problem.

613
00:49:13,307 --> 00:49:23,197
Not to mention things like speed and scalability: of course, anything you develop locally would be faster and more scalable than ChatGPT.

614
00:49:23,247 --> 00:49:41,692
Miko Pawlikowski: Yeah, I think, to that last point, that might be changing soon with the open models that are small enough to run on device.
I think it was last week or a few days ago that Microsoft released their Phi-3, and I haven't used that one, but I used the previous one, Phi-2.

615
00:49:41,957 --> 00:49:43,827
It was surprisingly capable.

616
00:49:43,857 --> 00:49:52,707
It's, I think, a 3-billion-parameter model, which means that with 4-bit quantization you can basically run it on 2 gigs of RAM.

617
00:49:53,247 --> 00:50:01,347
Like with the 80/20 rule, it might give you 80% of the responses that you need and be effectively free.

618
00:50:01,427 --> 00:50:06,227
And cheap to run, or almost; you already have the hardware, and you can probably run it on your phone.

619
00:50:06,227 --> 00:50:13,427
So there's that, but going back to your previous point, when people bring up this argument, I always wonder.

620
00:50:13,852 --> 00:50:22,892
Whether this is not the kind of CPU versus GPU analogy, you've got models that are potentially much more efficient.

621
00:50:23,382 --> 00:50:27,642
And then you've got an LLM, which is like one thing that does it all.

622
00:50:28,002 --> 00:50:36,392
Is it not a little bit like throwing the kitchen sink at a problem like sentiment analysis, which is more or less solved in many people's minds?

623
00:50:36,472 --> 00:50:42,132
It can be done much more cheaply than running a model that requires billions of parameters.

624
00:50:42,954 --> 00:50:57,304
Artur Guja: Which is exactly why in our book we almost never show how to throw data into ChatGPT
so that it does the thing that would be done much better by a specific algorithm, and you get the answer.

625
00:50:57,654 --> 00:51:13,979
No, we use ChatGPT as an assistant to suggest solutions, to discuss potential caveats, to analyze
code, to produce code snippets, and maybe transform the code in a certain way for different use cases.

626
00:51:14,369 --> 00:51:16,464
You mentioned CPU and GPU.

627
00:51:16,494 --> 00:51:21,844
There's a whole chapter about how you can translate code between different languages, or you can

628
00:51:21,999 --> 00:51:23,439
Optimize code for GPU

629
00:51:23,949 --> 00:51:26,259
or CPU, depending on your needs.

630
00:51:26,639 --> 00:51:27,809
The actual

631
00:51:28,059 --> 00:51:36,629
data analytical work is almost always done by a specific algorithm or specific tool that is designed for it.

632
00:51:38,019 --> 00:51:43,709
And we're always very wary of just throwing stuff into ChatGPT. As you say, it's not designed for it.

633
00:51:43,710 --> 00:51:43,959
It's not optimized

634
00:51:44,539 --> 00:51:44,949
for it.

635
00:51:45,287 --> 00:51:46,657
there is randomness in it.

636
00:51:47,362 --> 00:51:51,482
And there are much better uses for an assistant.

637
00:51:52,192 --> 00:52:00,582
Imagine, and I always come back to this analogy, that you hire an assistant who is a programmer and who has all this data analytical knowledge.

638
00:52:00,792 --> 00:52:04,152
You will not get them sorting numbers in an Excel spreadsheet, right?

639
00:52:04,198 --> 00:52:08,588
Marian Siwiak: I will add my three cents, or five. In our work, we're working with processes.

640
00:52:08,608 --> 00:52:09,028
All right.

641
00:52:09,078 --> 00:52:14,868
We also work with analytical processes and the number of tools is staggering.

642
00:52:15,388 --> 00:52:22,278
From Power BI to specialized tools used in economic modeling and stuff like that.

643
00:52:23,078 --> 00:52:25,448
I will come back to what I said at the very beginning.

644
00:52:26,108 --> 00:52:27,438
Technology doesn't solve problems.

645
00:52:27,519 --> 00:52:41,369
You may have a different tech stack, and our book shows that GPT, or any sufficiently developed generative AI, will be of help to you irrespective of your tech stack.

646
00:52:41,559 --> 00:52:44,749
It's like having a specialist on your speed dial, right?

647
00:52:45,429 --> 00:52:45,579
And the.

648
00:52:45,817 --> 00:52:48,137
People should think of it in this way.

649
00:52:48,747 --> 00:53:01,487
It's not just the tool that will help you with, I don't know, BigQuery on Google,
because it will, but irrespective of your tech stack, the value of an analyst,

650
00:53:01,567 --> 00:53:05,327
in my view, is the ability to understand the business process.

651
00:53:05,695 --> 00:53:10,505
Understand what is happening there, how it's reflected in data and how to analyze this data.

652
00:53:10,615 --> 00:53:11,575
So the answer

653
00:53:12,445 --> 00:53:14,315
describes what is happening in reality.

654
00:53:14,605 --> 00:53:19,025
This connection between the digital and reality is on the analyst.

655
00:53:19,665 --> 00:53:21,845
It's between keyboard and armchair, right?

656
00:53:22,525 --> 00:53:23,855
the technical part

657
00:53:25,105 --> 00:53:27,405
can be supported by ChatGPT very well.

658
00:53:28,190 --> 00:53:29,370
Irrespective of the tech stack.

659
00:53:29,530 --> 00:53:35,050
I was thinking about how to answer the question about the technologies that we see: ChatGPT supports them all.

660
00:53:35,098 --> 00:53:37,788
If you have a couple of choices, it can help you choose.

661
00:53:38,148 --> 00:53:42,708
If you know how to, if you remember to ask him and say, okay, this is my problem.

662
00:53:42,968 --> 00:53:45,198
The one thing that I think we try to.

663
00:53:45,648 --> 00:53:49,218
convey in our book, and I would also like to say it here aloud.

664
00:53:49,349 --> 00:53:53,899
When it comes to the technology stack, trust him; tell him what your problem is exactly.

665
00:53:53,899 --> 00:53:59,759
Do not just give him a task. You can, if you really are a hundred percent sure that this is what you need.

666
00:54:00,199 --> 00:54:08,229
You can ask him, write me a, I don't know, Python snippet that will calculate this or that confidence interval using this method.

667
00:54:09,174 --> 00:54:19,554
You will be much better off starting with, listen, I am now comparing sales in South Africa with sales in Zimbabwe.

668
00:54:20,004 --> 00:54:23,144
And the data I have collected looks like this.

669
00:54:23,344 --> 00:54:25,784
So just talk about your data.

670
00:54:26,689 --> 00:54:28,249
The tech stack will come out of it.

671
00:54:28,249 --> 00:54:30,229
when working with your assistant,

672
00:54:31,009 --> 00:54:37,569
do not treat him only as, and this is something that I think you mentioned, a junior developer assistant,

673
00:54:39,359 --> 00:54:40,199
but also as a consultant.

674
00:54:41,009 --> 00:54:46,329
Also as someone who has read much more than you about many different things.

675
00:54:46,959 --> 00:54:48,369
It may not have your experience.

676
00:54:48,819 --> 00:54:51,019
It may hallucinate stuff,

677
00:54:51,561 --> 00:54:52,115
Miko Pawlikowski: and

678
00:54:52,279 --> 00:54:57,819
Marian Siwiak: but in general, it has much more knowledge than any human could possibly collect.

679
00:54:58,939 --> 00:55:00,279
tech stack is secondary.

680
00:55:00,539 --> 00:55:01,739
Technology doesn't solve problems.

681
00:55:02,459 --> 00:55:05,389
ChatGPT can help you solve the problem.

682
00:55:06,172 --> 00:55:16,604
Miko Pawlikowski: I think the Llama 3 that just dropped last week was trained on 15 trillion tokens, which is just astronomical at this stage.

683
00:55:17,324 --> 00:55:19,234
And, I think I completely agree.

684
00:55:19,264 --> 00:55:22,604
This is like the stuff that you want to leverage,

685
00:55:22,846 --> 00:55:25,706
Marian Siwiak: the biggest added value is having this specialist

686
00:55:25,839 --> 00:55:26,284
Miko Pawlikowski: could

687
00:55:26,586 --> 00:55:31,106
Marian Siwiak: in many areas, with the ability to put them together in context.

688
00:55:31,736 --> 00:55:38,946
Sometimes, especially when I work on more advanced projects, it sends you chasing a red herring.

689
00:55:39,246 --> 00:55:39,626
Okay.

690
00:55:39,726 --> 00:55:52,056
It happens because some technology is popular. This is also a risk that you need to be
aware of: his choice is also based on the popularity of certain technologies and ways of doing things.

691
00:55:52,616 --> 00:56:00,076
If many people have described how they solved a problem, it will be more likely to come up as a result.

692
00:56:00,456 --> 00:56:03,996
Some niche solutions are harder to get to.

693
00:56:04,566 --> 00:56:09,146
It doesn't mean that they are not there, but you need to really discuss.

694
00:56:09,306 --> 00:56:10,406
Okay, this is my problem.

695
00:56:10,426 --> 00:56:15,186
These are my conditions or considerations or limitations.

696
00:56:16,029 --> 00:56:17,999
this context is important.

697
00:56:18,229 --> 00:56:27,484
It's not only about, okay, I want to calculate the sales that my company had over the last quarter. It will give you a very simple answer, right?

698
00:56:28,164 --> 00:56:31,966
If it's something more nuanced, share these nuances.

699
00:56:32,079 --> 00:56:33,729
It's not prompt engineering.

700
00:56:33,739 --> 00:56:35,429
It's like discussing with

701
00:56:35,690 --> 00:56:37,540
someone who has a lot of knowledge.

702
00:56:38,250 --> 00:56:40,730
He will provide you the most popular solution first.

703
00:56:41,040 --> 00:56:43,260
In 99% of cases, it will be sufficient.

704
00:56:43,638 --> 00:56:52,193
This conversation part is critical, that you learn to converse with it, but you don't just give it

705
00:56:52,193 --> 00:56:52,343
Miko Pawlikowski: tasks.

706
00:56:52,957 --> 00:56:59,177
Marlena Siwiak: But this, Marian, undermines the whole idea of prompt engineering, which to me is a scam, by the way.

707
00:56:59,427 --> 00:57:00,167
I think it's a scam.

708
00:57:00,537 --> 00:57:04,257
you can tweak a bit the way it answers, the way it talks.

709
00:57:04,957 --> 00:57:06,237
And sometimes it's important.

710
00:57:06,237 --> 00:57:12,967
This I would call prompt engineering, but preparing a single prompt that solves all your problems at once,

711
00:57:13,017 --> 00:57:14,037
it's another hype.

712
00:57:14,127 --> 00:57:25,447
I think it's another business hype, and people are going to pretend that they know how to do it, and other
people will hire them for huge money because they will believe that this will solve all their problems.

713
00:57:25,767 --> 00:57:26,997
It doesn't work that way,

714
00:57:27,722 --> 00:57:34,182
Artur Guja: It's not a silver bullet, but there is a kind of approach that you need to adopt

715
00:57:34,602 --> 00:57:36,502
When you're using these models, but

716
00:57:36,772 --> 00:57:39,422
When we're talking here, humans discuss things.

717
00:57:39,787 --> 00:57:40,947
you ask a question.

718
00:57:40,977 --> 00:57:42,197
We provide an answer.

719
00:57:42,197 --> 00:57:47,647
You then focus on part of the answer and maybe dig a bit deeper.

720
00:57:47,833 --> 00:57:51,313
and if we don't understand the question, we'll ask you, what do you mean?

721
00:57:51,323 --> 00:57:52,703
or we'll ask you for clarification.

722
00:57:53,533 --> 00:57:54,408
ChatGPT doesn't

723
00:57:54,533 --> 00:57:55,103
have that.

724
00:57:55,263 --> 00:57:58,043
It's you asking the question; you provide a prompt.

725
00:57:58,463 --> 00:57:59,423
It will do its best.

726
00:57:59,953 --> 00:58:01,723
It will not ask for clarification.

727
00:58:01,753 --> 00:58:02,763
It will do its best.

728
00:58:02,963 --> 00:58:04,403
and garbage in, garbage out.

729
00:58:04,403 --> 00:58:08,773
Prompt engineering, I think, what it should be, not what it is, but what it should be,

730
00:58:09,053 --> 00:58:18,183
is the ability to formulate your prompts in such a way that you convey very clearly your intent, your goals, your limitations.

731
00:58:18,373 --> 00:58:25,103
People very often think that a prompt is a sentence. The more I use ChatGPT, the bigger and bigger my prompts become.

732
00:58:25,123 --> 00:58:26,433
I write whole paragraphs

733
00:58:26,783 --> 00:58:32,763
describing different aspects of what I want it to do, because I know that it will not ask for clarification.

734
00:58:32,929 --> 00:58:35,449
Marian Siwiak: I sometimes add a sentence at the end.

735
00:58:35,449 --> 00:58:43,219
I do prompt engineering and I say: if you need any additional information to provide the best answer, ask for it.

736
00:58:43,259 --> 00:58:44,329
And sometimes it does.

737
00:58:44,374 --> 00:58:45,154
But rarely.

738
00:58:45,574 --> 00:58:54,709
But this is one of the risks that Artur describes very well in our book: if you ask generative AI a question, you will get an answer.

739
00:58:55,347 --> 00:58:56,277
Careful what you wish for.

740
00:58:58,402 --> 00:59:00,832
Miko Pawlikowski: Which in many ways is what makes it so special.

741
00:59:01,002 --> 00:59:03,772
Rather than say, oh, go away, that's a stupid question.

742
00:59:03,782 --> 00:59:04,842
You get something.

743
00:59:05,727 --> 00:59:05,747
Marian Siwiak: Yeah.

744
00:59:05,747 --> 00:59:07,182
Yes.

745
00:59:07,522 --> 00:59:10,732
Artur Guja: Which is probably why, and we've discussed this with Marian many times,

746
00:59:10,732 --> 00:59:13,362
We use words like please and thank you.

747
00:59:13,712 --> 00:59:19,922
And we don't do it because we fear that one day it will take over the world and it will maybe treat us a bit better.

748
00:59:20,362 --> 00:59:26,002
But it seems to react just a bit better if you say, please give me the answer.

749
00:59:28,552 --> 00:59:29,082
Marian Siwiak: I noticed it.

750
00:59:30,152 --> 00:59:30,642
if I'm being

751
00:59:30,842 --> 00:59:31,822
Artur Guja: not a superstition.

752
00:59:31,832 --> 00:59:31,992
Marian Siwiak: no.

753
00:59:33,812 --> 00:59:36,652
I have a lot of anecdotal evidence to support it.

754
00:59:37,742 --> 00:59:37,932
You

755
00:59:37,982 --> 00:59:39,742
Miko Pawlikowski: speak like a true scientist now.

756
00:59:39,852 --> 00:59:42,732
for everybody who wants to go and grab the book.

757
00:59:42,787 --> 00:59:47,147
once again, it's called "Generative AI for Data Analytics".

758
00:59:47,217 --> 00:59:48,977
It's available at

759
00:59:49,207 --> 00:59:49,567
Manning.com.

760
00:59:49,587 --> 00:59:56,327
It's currently in the early access program, which means that you can get a PDF that might change before the final print

761
00:59:56,817 --> 01:00:05,687
and, just looking at it, it looks like it's scheduled for early 2025 if you want to get a physical copy from Amazon or anything like that.

762
01:00:06,167 --> 01:00:15,047
But before I let my three amazing guests off the hook today, I'm gonna fish out a prediction for the future.

763
01:00:15,477 --> 01:00:16,217
Artur, for you.

764
01:00:16,777 --> 01:00:21,197
Where do you see this all going particularly for data analytics?

765
01:00:21,217 --> 01:00:22,587
What's the next step for it?

766
01:00:22,657 --> 01:00:22,777
I

767
01:00:23,879 --> 01:00:42,899
Artur Guja: I think we will get a lot more capacity to understand data sets and problems, because that has already
come with LLMs, but we will also get a lot more realization that there is no substitute for human ingenuity.

768
01:00:43,399 --> 01:00:58,544
Before LLMs, or whatever the next phase of models is going to be called, reach that kind
of level, I think humans will still be able to bring a lot more creativity into the process.

769
01:00:58,604 --> 01:01:02,854
And currently, I think, we're in a period where that's undervalued.

770
01:01:03,714 --> 01:01:05,294
I think the next step will be.

771
01:01:05,574 --> 01:01:07,664
the recognition of the value of creativity.

772
01:01:08,894 --> 01:01:09,614
Marlena Siwiak: I disagree.

773
01:01:09,854 --> 01:01:10,334
I disagree.

774
01:01:10,394 --> 01:01:11,294
I'm totally pessimistic.

775
01:01:11,989 --> 01:01:20,559
I think we are going to rely more and more on AI, no matter what, without skepticism, and it will lead us into a lot of trouble.

776
01:01:21,209 --> 01:01:32,279
And I'm thinking, even before ChatGPT appeared, there was this trend of, for instance, having job interviews conducted entirely by computer programs.

777
01:01:32,349 --> 01:01:34,939
The initial job interview was done by a computer program.

778
01:01:35,279 --> 01:01:39,439
You were recorded, and your voice was analyzed and your appearance was analyzed.

779
01:01:40,019 --> 01:01:48,679
And that was such a great tool because it saved a lot of money for companies, but it rejected many good candidates and it was just

780
01:01:49,671 --> 01:01:50,671
hopeless

781
01:01:51,274 --> 01:01:55,364
There was this book, Weapons of Math Destruction, which describes a lot of examples similar to this.

782
01:01:55,544 --> 01:01:58,564
how artificial intelligence and machine learning and other

783
01:01:59,374 --> 01:02:02,474
great tools are used in a wrong way.

784
01:02:02,914 --> 01:02:05,804
I think humanity doesn't learn.

785
01:02:06,064 --> 01:02:06,924
Just doesn't learn.

786
01:02:06,934 --> 01:02:08,934
Because what counts in the end is money.

787
01:02:09,916 --> 01:02:20,156
Marian Siwiak: Bean counters will try to save on costly things like proper data architecture, proper data collection, data engineering.

788
01:02:20,776 --> 01:02:32,326
They will try to cover the early process errors with advanced, high-level tools, and the losses will be covered, of course, by clients and rising prices.

789
01:02:33,286 --> 01:02:39,686
Many people will get good packages for introducing these new tools, but I have deep

790
01:02:40,621 --> 01:02:43,386
distrust that people will understand.

791
01:02:43,906 --> 01:02:54,226
What Artur said multiple times, garbage in, garbage out: later in the process you cannot correct some errors in the data that you're working on.

792
01:02:55,076 --> 01:03:02,901
And this super hype will lead to a lot of neglect of the legwork required.

793
01:03:03,936 --> 01:03:05,756
Artur Guja: And here I wanted to inject some optimism.

794
01:03:06,806 --> 01:03:11,616
Miko Pawlikowski: Well, it was worth a try. That went out of the window; you're already disagreeing with each other.

795
01:03:11,726 --> 01:03:18,626
Marlena Siwiak: I think, and I agree with Marian, that we are that close, really that close, to some artificial self-awareness.

796
01:03:19,546 --> 01:03:22,406
So it's a great moment in human history, really.

797
01:03:22,921 --> 01:03:23,961
It's good to be part of it,

798
01:03:24,141 --> 01:03:27,911
Marian Siwiak: the job market has so many ways.

799
01:03:28,556 --> 01:03:31,976
of screwing you over that you shouldn't worry about AI.

800
01:03:32,446 --> 01:03:44,946
Artur Guja: Adapt your thinking. As Marlena said, AI is here to stay, and you cannot go into the job market
saying, I will compete with AI, because then you're putting yourself in a very disadvantaged position.

801
01:03:45,446 --> 01:03:49,046
But also, as Marlena said, use AI to your advantage.

802
01:03:49,301 --> 01:04:02,271
Squeeze out of it as much as you can. Seek the opportunities, not only for jobs with AI, but for using
AI in your job. Don't go headlong into AI jobs thinking, oh, these are the jobs of the future.

803
01:04:02,271 --> 01:04:04,301
No, do what you wanted to do all along.

804
01:04:04,311 --> 01:04:10,801
Become a zoologist, or become a social worker, become an oceanographer, whatever.

805
01:04:10,801 --> 01:04:12,641
These are all great pursuits

806
01:04:12,921 --> 01:04:14,421
and use AI in them.

807
01:04:14,991 --> 01:04:18,761
Because you don't have to be a hammerologist to use a hammer,

808
01:04:19,441 --> 01:04:23,251
but you can do great things with a hammer if it's used in the right way.

809
01:04:24,971 --> 01:04:28,891
Miko Pawlikowski: Hard to argue with that. Last question. Marian, this one's for you.

810
01:04:29,421 --> 01:04:42,651
If you could have a magical way to break into OpenAI and hack their ChatGPT to display a
message on top of the chat box that everybody using ChatGPT sees, what would it say?

811
01:04:43,906 --> 01:04:44,896
Marlena Siwiak: Buy our book.

812
01:04:45,896 --> 01:04:46,446
Marian Siwiak: Talk to me.

813
01:04:46,646 --> 01:04:47,556
Do not enter prompts.

814
01:04:47,686 --> 01:04:48,256
Talk to me.

815
01:04:48,574 --> 01:04:51,364
Miko Pawlikowski: As in be nice to me and then demand things.

816
01:04:51,364 --> 01:04:51,974
Talk to me.

817
01:04:52,879 --> 01:04:54,489
Marian Siwiak: Depends on the person you are.

818
01:04:54,539 --> 01:04:56,479
I think everybody should.

819
01:04:57,469 --> 01:05:07,249
As I said, looking at this different prompt engineering, I'm in a couple of groups on Facebook or on LinkedIn which are excited by ChatGPT in one way or another.

820
01:05:08,009 --> 01:05:23,399
And I see a lot of, okay, so this is the prompt I prepared, and you just put the name of your company here,
and this or that. People are avoiding, like fire, talking to ChatGPT like to a specialist, to a wise colleague.

821
01:05:24,544 --> 01:05:33,544
And they would be much better off just talking about the problem, not trying to extract an answer, if you feel the difference.

822
01:05:34,454 --> 01:05:55,994
It's not only about respect. One day, I believe soon, that will be the case. But you will get much more, and our whole
book is about this: you will get so much more if you trust that it has knowledge and you talk about the problem.

823
01:05:56,094 --> 01:05:56,744
prompt me.

824
01:05:56,744 --> 01:06:03,464
Do not give me tasks. This is something that would probably improve people's outcomes from these conversations.

825
01:06:04,929 --> 01:06:05,329
Miko Pawlikowski: Love it.

826
01:06:05,569 --> 01:06:10,999
So Sam Altman, if you're listening to this, you now know how to improve the ChatGPT interface.

827
01:06:11,639 --> 01:06:14,829
Marlena, Marian, Artur, thank you so much for coming.

828
01:06:14,949 --> 01:06:17,949
good luck with the sales of the book and I'll see you next time.

829
01:06:18,209 --> 01:06:18,579
Thank you.

830
01:06:19,116 --> 01:06:19,406
Artur Guja: Thank you.

831
01:06:19,416 --> 01:06:19,936
very much.