1 00:00:00,240 --> 00:00:03,840 Welcome back to Data Driven, the podcast where we chart the thrilling 2 00:00:03,840 --> 00:00:07,379 terrains of data science, AI, and everything in between. 3 00:00:07,759 --> 00:00:11,120 I'm Bailey, your semi-sentient host with a penchant for 4 00:00:11,120 --> 00:00:14,179 sarcasm and a wit sharper than a histogram spike. 5 00:00:14,684 --> 00:00:18,365 Today's episode promises a delightful mix of the analytical and the 6 00:00:18,365 --> 00:00:22,064 artistic as we dive into the fascinating world of vector databases, 7 00:00:22,605 --> 00:00:26,224 retrieval-augmented generation, and origami. Yes. 8 00:00:26,285 --> 00:00:29,690 You heard that right. Origami, the ancient art of 9 00:00:29,690 --> 00:00:33,310 folding paper, somehow finds itself intersecting with AI, 10 00:00:33,610 --> 00:00:37,070 proving that the future really does have layers, or should I say, folds. 11 00:00:37,530 --> 00:00:41,265 Our guest, Arjun Patel, is a developer advocate at Pinecone 12 00:00:41,265 --> 00:00:44,965 who's on a mission to demystify vector databases and semantic 13 00:00:45,025 --> 00:00:48,704 search, turning complex AI concepts into snackable bits of 14 00:00:48,704 --> 00:00:52,465 brilliance. He's also a self-taught origami artist and a 15 00:00:52,465 --> 00:00:56,230 former statistics student who actually enjoyed it. So if 16 00:00:56,230 --> 00:01:00,070 you're ready to unravel the secrets of modern AI and maybe pick up a trick 17 00:01:00,070 --> 00:01:03,750 or two about folding life into geometric perfection, you're in the 18 00:01:03,750 --> 00:01:04,489 right place. 19 00:01:08,229 --> 00:01:11,925 Hello, and welcome back to Data Driven, the podcast where we explore the emergent 20 00:01:11,925 --> 00:01:15,465 fields of data science, AI, and data engineering. 
21 00:01:16,005 --> 00:01:19,845 Now today, due to a scheduling conflict, my most favoritest data engineer 22 00:01:19,845 --> 00:01:23,220 in the world will not be able to make it. But I will 23 00:01:23,220 --> 00:01:27,060 continue on, despite the recent snowstorms that we've had here in 24 00:01:27,060 --> 00:01:30,500 the DC-Baltimore area. With me today, I have 25 00:01:30,500 --> 00:01:33,480 Arjun Patel, a developer advocate at Pinecone, 26 00:01:34,215 --> 00:01:37,915 who aims to make vector databases, retrieval-augmented generation, 27 00:01:38,535 --> 00:01:42,295 also known as RAG, and semantic search accessible by 28 00:01:42,295 --> 00:01:45,895 creating engaging YouTube videos, code notebooks, and blog 29 00:01:45,895 --> 00:01:49,355 posts that transform complex AI concepts 30 00:01:49,920 --> 00:01:53,760 into easily understandable content. After graduating with 31 00:01:53,760 --> 00:01:57,520 a BA in statistics from the University of Chicago, his journey through 32 00:01:57,520 --> 00:02:00,820 the tech world spans from making speech coaching 33 00:02:00,880 --> 00:02:04,565 accessible with AI at Speeko to tackling AI 34 00:02:04,625 --> 00:02:08,305 generated content detection at Appen. Arjun's 35 00:02:08,305 --> 00:02:12,065 interests span traditional natural language processing to modern 36 00:02:12,065 --> 00:02:14,405 large language model development and applications. 37 00:02:15,879 --> 00:02:19,480 Beyond his technical prowess, Arjun has been designing and folding his 38 00:02:19,480 --> 00:02:22,379 own origami creations for over a decade. Interesting. 39 00:02:23,000 --> 00:02:26,519 Seamlessly blending analytical thinking with artistic expression in his 40 00:02:26,519 --> 00:02:29,739 professional and personal pursuits. Welcome to the show, Arjun. 41 00:02:30,555 --> 00:02:34,255 Hey. Nice to meet you, Frank. Thanks for having me on. Excited to be here. 42 00:02:34,315 --> 00:02:37,435 Awesome. Awesome. 
There's a lot to unpack from there, but I think it's interesting to 43 00:02:37,435 --> 00:02:41,215 note that you have a BA in statistics. Yes. So you were probably 44 00:02:41,275 --> 00:02:44,335 studying, this sort of stuff before it was cool? 45 00:02:45,480 --> 00:02:48,700 Yeah. Yeah. A lot of the old school ways of analyzing 46 00:02:49,080 --> 00:02:51,900 data, understanding what's going on, so on and so forth. 47 00:02:53,240 --> 00:02:56,680 It was kind of, like, made clear to me pretty early that 48 00:02:56,680 --> 00:03:00,475 understanding how to work with data at small scale and at large scale is gonna 49 00:03:00,475 --> 00:03:03,355 be very important going to the future. So I kinda just took that and ran 50 00:03:03,355 --> 00:03:07,115 with it with my education. Very cool. It was 51 00:03:07,115 --> 00:03:10,955 definitely, you know, one of those things where I don't 52 00:03:10,955 --> 00:03:14,780 think people realized how important statistics would be until, 53 00:03:15,400 --> 00:03:19,159 you know, until the revolution happens, so to speak. So and it's also 54 00:03:19,159 --> 00:03:22,540 interesting to see because there's a lot of people that I think could benefit from, 55 00:03:23,079 --> 00:03:26,920 you know, picking up that old picking up a, an old statistics book and 56 00:03:26,920 --> 00:03:30,584 reading through it and understanding, like, a lot of the fundamentals. Obviously, there's a lot 57 00:03:30,584 --> 00:03:34,105 of new things, but a lot of the fundamentals are largely the 58 00:03:34,105 --> 00:03:37,944 same. You know, just I'll 59 00:03:37,944 --> 00:03:41,704 use this example. You know, McDonald's can add a Mc McRib sandwich, 60 00:03:41,704 --> 00:03:44,510 but it's still a McDonald's. Right? Like, it's This 61 00:03:45,609 --> 00:03:49,290 is what happens when you're shoveling snow. Like, your 62 00:03:49,290 --> 00:03:52,810 brain gets I absolutely agree. 
And, like, 63 00:03:52,810 --> 00:03:56,409 another proof on that point is that Anthropic just released a 64 00:03:56,409 --> 00:04:00,185 blog post recently kind of recapping how to do statistical analysis when you're 65 00:04:00,185 --> 00:04:04,025 comparing different large language models. And when you read the paper in the blog, 66 00:04:04,025 --> 00:04:07,565 it's basically just like two-sample t-tests and kind of going over really, 67 00:04:08,425 --> 00:04:12,270 like, not introductory, but still statistics that's easily accessible for people to 68 00:04:12,270 --> 00:04:15,170 learn and understand. So it's still relevant, and it's still important. 69 00:04:15,790 --> 00:04:19,630 Interesting. One of the things that stood out in your bio 70 00:04:19,630 --> 00:04:23,230 was, people tend to forget that there 71 00:04:23,230 --> 00:04:26,764 was a natural language processing field prior 72 00:04:27,225 --> 00:04:29,085 to ChatGPT launching. 73 00:04:31,384 --> 00:04:32,264 How do you, you know, 74 00:04:36,345 --> 00:04:39,884 do you wanna talk about the difference between those two? Sure. 75 00:04:40,520 --> 00:04:44,280 So one of the first and probably only 76 00:04:44,280 --> 00:04:48,120 course I took in college related to natural language processing was 77 00:04:48,120 --> 00:04:51,960 called Geometric Models of Meaning. And everything I learned in that 78 00:04:51,960 --> 00:04:55,745 course was like everything before what we now would 79 00:04:55,745 --> 00:04:59,345 consider, like, modern embedding models. So bag-of- 80 00:04:59,345 --> 00:05:03,185 words methods, understanding how to represent documents and text purely 81 00:05:03,185 --> 00:05:06,805 based on, like, the frequency of the words that exist in the text, 82 00:05:06,865 --> 00:05:10,570 and then trying to understand, like, okay. 
Based on that information, how can 83 00:05:10,570 --> 00:05:14,090 we learn about the concepts that exist in text from the words that are being 84 00:05:14,090 --> 00:05:17,850 used? Like, what is the framework we can use to understand what these 85 00:05:17,850 --> 00:05:21,610 words mean based on their, co occurrences with the other words and 86 00:05:21,610 --> 00:05:25,195 texts that you're working with and based on, what those 87 00:05:25,195 --> 00:05:28,875 words mean as well. So, like, what the words' neighbors are and what their meaning 88 00:05:28,875 --> 00:05:32,555 helps and also what those words are doing. And I think a lot of traditional 89 00:05:32,555 --> 00:05:36,315 natural language processing, methodologies kinda stem from that, and 90 00:05:36,315 --> 00:05:39,850 there's a there's a lot of mileage you can get out of just thinking about 91 00:05:39,850 --> 00:05:43,370 approaching problems there before you step into these more complicated methods, 92 00:05:43,370 --> 00:05:46,970 like, these embed modern embedding models that exist. So that's kind of, like, what I 93 00:05:46,970 --> 00:05:50,590 would consider, like, traditional NLP, like, doing named entity recognition, 94 00:05:50,810 --> 00:05:54,605 trying to understand how to, find keywords really 95 00:05:54,605 --> 00:05:58,445 quickly. And then once you get really good at that, there's a whole host of 96 00:05:58,445 --> 00:06:02,125 problems that you encounter afterward that kind of modern techniques try to 97 00:06:02,125 --> 00:06:05,185 solve. Right. That's interesting. So so 98 00:06:05,750 --> 00:06:09,510 what was it, what was your thoughts 99 00:06:09,510 --> 00:06:12,890 when you first, like given that you were an NLP practitioner 100 00:06:13,750 --> 00:06:17,350 prior to the release of transformers and things like that, what was your initial thought? 
101 00:06:17,350 --> 00:06:21,125 Because I'm curious because there's not a lot of people there are a 102 00:06:21,125 --> 00:06:24,245 lot of experts today that really kind of started a couple of years ago. No 103 00:06:24,245 --> 00:06:28,085 fault on them. They see where the industry is going. Totally understand it. But what 104 00:06:28,085 --> 00:06:31,845 were your thoughts when 105 00:06:31,845 --> 00:06:35,370 you first saw the 106 00:06:35,370 --> 00:06:39,070 "Attention Is All You Need" paper? So that would have been 107 00:06:39,210 --> 00:06:42,670 probably around the time I graduated college, around 108 00:06:42,890 --> 00:06:45,930 maybe a year or 2 after I took the course that I was just describing. 109 00:06:45,930 --> 00:06:49,545 So I just started learning about, like, okay. Like, this is 110 00:06:49,545 --> 00:06:52,985 how, like, old school, quote unquote, like, embedding 111 00:06:52,985 --> 00:06:56,345 methodologies work. And the biggest takeaway that I got from those is that they work 112 00:06:56,345 --> 00:07:00,185 pretty well. They work pretty well for, like, lots of different kinds of 113 00:07:00,185 --> 00:07:03,645 queries. And I think what the "Attention Is All You Need" paper did 114 00:07:03,880 --> 00:07:07,400 was it kinda helped you understand how 115 00:07:07,400 --> 00:07:11,100 to rigorously create representations of text that 116 00:07:11,160 --> 00:07:14,600 generalize way better than any sort of, like, 117 00:07:14,600 --> 00:07:18,220 normal, keyword-based, bag-of-words-based search methodology. 118 00:07:18,920 --> 00:07:22,755 And I think that at the time, I probably didn't 119 00:07:23,055 --> 00:07:26,895 grasp as much what impact the "Attention Is All You Need" paper would have on the 120 00:07:26,895 --> 00:07:30,735 field until we started getting embedding models that people could use really 121 00:07:30,735 --> 00:07:34,415 easily, like RoBERTa or BERT. 
And we're like, okay. Now we can do, like, 122 00:07:34,415 --> 00:07:37,810 multilingual search without any issue. Now we can represent, 123 00:07:37,810 --> 00:07:41,569 like, any sentence without keyword overlap when we 124 00:07:41,569 --> 00:07:45,090 wanna find some document that's interesting, without doing any 125 00:07:45,090 --> 00:07:48,645 additional work. Like, once those papers started hitting the scene, I think we started 126 00:07:48,645 --> 00:07:51,365 seeing, like, okay, this is what attention is doing for us. This is what the 127 00:07:51,365 --> 00:07:55,125 ability to, like, contextualize our vector embeddings is doing for us. 128 00:07:55,125 --> 00:07:58,805 And now we can see what's kind of getting benefited there. But I 129 00:07:58,805 --> 00:08:02,324 think my understanding of how beneficial that 130 00:08:02,324 --> 00:08:06,010 was kind of lagged until we started seeing these other models kind of hit. And 131 00:08:06,010 --> 00:08:09,470 I'm like, okay. Now I can kinda see why this is important and why, like, 132 00:08:09,930 --> 00:08:12,910 future models are gonna get better and better based on this architecture. 133 00:08:13,930 --> 00:08:17,167 Interesting. So for those that don't know kind of and even I'm rusty on 135 00:08:17,914 --> 00:08:21,514 this. Right? Yeah. One of the things that was interesting about this was the 136 00:08:21,514 --> 00:08:25,354 in first, appearance. What was it? You just described it a 137 00:08:25,354 --> 00:08:29,194 minute ago, but it was something like the prevalence of a word 138 00:08:29,194 --> 00:08:32,909 in a bit of text versus the lack of prevalence and how that 139 00:08:32,909 --> 00:08:36,429 metric was very important in, 140 00:08:36,750 --> 00:08:39,010 I'll call it, classical natural language processing. 
141 00:08:40,510 --> 00:08:44,270 Right. So this is the idea that if you have words that co 142 00:08:44,270 --> 00:08:48,055 occur together in some document space, the meaning of those words are gonna be 143 00:08:48,055 --> 00:08:51,655 more similar than words that don't co occur in some other given document 144 00:08:51,655 --> 00:08:55,255 space. This is rooted in something called the 145 00:08:55,255 --> 00:08:58,870 distributional hypothesis, which is basically this idea and the other 146 00:08:58,870 --> 00:09:02,470 idea that, concepts cluster in in this type of 147 00:09:02,470 --> 00:09:06,230 space. So what what does that mean actually? Right? So if you have the word 148 00:09:06,230 --> 00:09:09,990 like hot dog, it's probably gonna be seen in a corpus that's 149 00:09:09,990 --> 00:09:13,704 near other food related words than it would be if you picked some 150 00:09:13,704 --> 00:09:17,305 other word like space or moon. And there's something we can 151 00:09:17,305 --> 00:09:20,824 learn from that relationship to infer the meaning of what that word 152 00:09:20,824 --> 00:09:24,345 is and how we can use that meaning of that word to learn about what 153 00:09:24,345 --> 00:09:27,850 other words are doing. So So this is kind of, like, the theoretical 154 00:09:28,070 --> 00:09:31,850 basis of, like, why we can represent words geometrically, 155 00:09:32,630 --> 00:09:35,990 with with a little bit of hand waving. But that's kind of the core idea. 156 00:09:35,990 --> 00:09:39,575 And attention kind of takes this a little further by allowing the 157 00:09:39,575 --> 00:09:43,255 representation of these tokens or words to be altered based 158 00:09:43,255 --> 00:09:47,014 on the words that occur in a given sentence. So you might have a 159 00:09:47,014 --> 00:09:50,855 word like does, like, does this mean something? 160 00:09:50,855 --> 00:09:54,370 You might say something like that. 
Or you might say, I saw some 161 00:09:54,370 --> 00:09:58,050 "does" in the forest. Both spelled exactly the same, but have 162 00:09:58,050 --> 00:10:01,490 completely different meanings based on their context. And if you used a 163 00:10:01,490 --> 00:10:05,170 traditional, maybe, bag-of-words model where you're just counting the 164 00:10:05,170 --> 00:10:08,995 words that occur in a given document and kind of creating a representation of what 165 00:10:08,995 --> 00:10:12,435 that document looks like based on the words that are composed in there, you're gonna 166 00:10:12,435 --> 00:10:16,135 overlap and conflict with the meaning of the word 167 00:10:16,195 --> 00:10:19,875 "does" and "does" because they're spelled exactly the same. They might look 168 00:10:19,875 --> 00:10:23,360 exactly the same with this type of representation. But if you have a way of 169 00:10:23,360 --> 00:10:27,200 informing what that word means with its context, which is what attention 170 00:10:27,200 --> 00:10:30,560 allows us to do, then you can completely change how that's being 171 00:10:30,560 --> 00:10:34,400 represented in your downstream system, which allows you to do interesting things 172 00:10:34,400 --> 00:10:38,045 with search. So that's kind of, like, the biggest benefit that's coming out of 173 00:10:38,045 --> 00:10:41,805 that type of methodology, and that kinda enables what is now known as 174 00:10:41,805 --> 00:10:45,485 semantic search and retrieval-augmented generation and so on and so forth. I was gonna 175 00:10:45,485 --> 00:10:49,165 say, that sounds very it's almost like it was, like, the old pre- 176 00:10:50,090 --> 00:10:53,770 that era, the vectorization of this and the distance in 177 00:10:53,770 --> 00:10:57,530 that vector in that geometric space. I guess 178 00:10:57,530 --> 00:11:00,730 we've been doing that for a lot longer than most people realize in a 179 00:11:00,730 --> 00:11:03,070 sense. Yeah. 
I mean, 180 00:11:04,375 --> 00:11:07,895 looking through, indexes or document stores with some sort of 181 00:11:07,895 --> 00:11:11,275 vectorization has has has been, 182 00:11:12,214 --> 00:11:16,055 something that people have done, except instead of being dense vectors, which is, like, 183 00:11:16,055 --> 00:11:19,880 you have some fixed size representation that isn't necessarily interpretable 184 00:11:19,940 --> 00:11:23,620 to the human eye for some given query or document, it would 185 00:11:23,620 --> 00:11:27,380 be, like, the size of your vocabulary. So you think of, like, Wikipedia. You 186 00:11:27,380 --> 00:11:31,060 can find, like, every unique word on Wikipedia, and, like, that is gonna be how 187 00:11:31,060 --> 00:11:34,135 big your vector's gonna be. And every time you have a new document come in, 188 00:11:34,135 --> 00:11:37,975 a new article, somebody's kind of, like, wrote up and published to Wikipedia, like, you're 189 00:11:37,975 --> 00:11:41,654 representing that in terms of its vocabulary. But now instead of doing that, we 190 00:11:41,654 --> 00:11:45,495 have, like, this magical fixed sized box that allows us 191 00:11:45,495 --> 00:11:48,795 to represent chunks of text in a way that is 192 00:11:49,160 --> 00:11:52,840 extremely fascinating and abstract. And every time I think about it, it just, like, blows 193 00:11:52,840 --> 00:11:56,120 my mind, but that's kind of, like, the main kind of difference is the way 194 00:11:56,120 --> 00:11:59,560 we're representing that information and how compact compact that is and 195 00:11:59,560 --> 00:12:03,195 generalizable it has become. Yeah. 
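The vocabulary-sized representation described here can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two-sentence corpus is made up, whitespace splitting stands in for real tokenization, and `bag_of_words` is a hypothetical helper name. It shows both points from the conversation: the vector is as long as the vocabulary, and the two meanings of "does" land in the same slot.

```python
from collections import Counter

# Toy corpus, made up for illustration only.
docs = [
    "does this mean something",
    "i saw some does in the forest",
]

# The vocabulary is every unique word in the corpus, in sorted order.
vocab = sorted({word for doc in docs for word in doc.split()})

def bag_of_words(text):
    """Return a count vector with one slot per vocabulary word."""
    counts = Counter(text.split())
    return [counts[word] for word in vocab]

vectors = [bag_of_words(doc) for doc in docs]

# Both senses of "does" increment the same dimension, so a pure
# word-count representation cannot tell them apart.
does_slot = vocab.index("does")
print(len(vocab), [vec[does_slot] for vec in vectors])
```

Adding a document with unseen words would grow `vocab`, and with it the length of every vector, which is the contrast with the fixed-size dense vectors mentioned above.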
That is, like, it's almost 196 00:12:03,195 --> 00:12:06,955 like you're, you know correct me if I'm wrong, but, you know, 197 00:12:06,955 --> 00:12:10,475 creating these vectors, these large vector databases, right, with, you 198 00:12:10,475 --> 00:12:14,235 know, 10, 12,000 dimensions, right, of how these words 199 00:12:14,235 --> 00:12:16,255 are measured in relationship to others. 200 00:12:17,860 --> 00:12:21,540 It's almost as a consequence of training a large language 201 00:12:21,540 --> 00:12:24,740 model, you create a knowledge graph. Is that true? Is that really the 202 00:12:24,740 --> 00:12:28,500 case where, you know, like, you know, dog is most likely to be 203 00:12:28,500 --> 00:12:32,305 next to, you know, the word pet, you know, or 204 00:12:32,305 --> 00:12:35,904 it has the same distance. Is that I'm not 205 00:12:35,904 --> 00:12:39,585 explaining it right. No. No. No. You're on the right track exactly. 206 00:12:39,585 --> 00:12:42,805 And I think this is, like, one of the most fascinating qualities 207 00:12:43,105 --> 00:12:46,730 of even, like, what people would consider, like, older 208 00:12:46,790 --> 00:12:50,470 embedding models is this idea that you can take, like, a training task that 209 00:12:50,470 --> 00:12:54,250 seems completely unrelated to the quality that you want in a downstream model, 210 00:12:54,630 --> 00:12:58,455 and it turns out that that actually achieves that quality. So, what you were referring 211 00:12:58,455 --> 00:13:02,214 to, Frank, is this idea that you might have, like, a sentence. You 212 00:13:02,214 --> 00:13:05,975 might have, like, I took my dog out on a walk, and you might say, 213 00:13:05,975 --> 00:13:09,575 okay. I'm gonna remove the word, walk, and I'm gonna have 214 00:13:09,735 --> 00:13:13,560 I'm gonna train some model that tries to predict what that word 215 00:13:13,560 --> 00:13:17,240 I removed was. 
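The data-preparation half of that idea, removing a word and keeping it as the prediction target, might look like the sketch below. It covers only how training pairs could be built, not the model that learns from them. The `[MASK]` placeholder string and the `masked_examples` helper are assumptions for illustration.

```python
MASK = "[MASK]"  # placeholder token; the exact string is an assumption

def masked_examples(sentence):
    """Yield (masked_sentence, target_word) pairs, hiding one word at a time."""
    words = sentence.split()
    for i, target in enumerate(words):
        masked = words[:i] + [MASK] + words[i + 1:]
        yield " ".join(masked), target

examples = list(masked_examples("I took my dog out on a walk"))

# The last pair hides "walk", matching the example in the conversation.
print(examples[-1])
```

A model trained on many such pairs has to use the surrounding words to guess the hidden one, which is why the task rewards learning how words relate to their context.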
This is masked language modeling, which is this idea that you're 216 00:13:17,240 --> 00:13:20,040 kind of getting at of, like, okay, what are the words and how are they 217 00:13:20,040 --> 00:13:23,240 in relation to the other words in that sentence? And it turns out that if 218 00:13:23,240 --> 00:13:26,920 you, like, do this with, like, hundreds of thousands or millions of sentences and 219 00:13:26,920 --> 00:13:30,685 words, in some corpus that is somewhat representative of 220 00:13:30,685 --> 00:13:34,525 how people use human language, you 221 00:13:34,525 --> 00:13:37,565 will get really good at this task, number 1, because you're training the 222 00:13:37,565 --> 00:13:41,405 model on that task exactly. But if you are training a neural 223 00:13:41,405 --> 00:13:45,040 network on that task, some intermediate layer representation 224 00:13:45,180 --> 00:13:48,940 in that model so somewhere in that set of matrix 225 00:13:48,940 --> 00:13:52,540 multiplications where you're turning this input sentence into some fixed-size 226 00:13:52,540 --> 00:13:55,920 vector representation is gonna be a good representation 227 00:13:56,300 --> 00:13:59,685 of what that word or that token or that sentence is going to be. 228 00:14:00,465 --> 00:14:03,605 And the fact that that works is not intuitive. Right? 229 00:14:04,145 --> 00:14:07,185 The fact that that works has been shown empirically, and it turns out that 230 00:14:07,185 --> 00:14:10,405 we can kind of do that and kind of have these models work really well. 231 00:14:10,490 --> 00:14:13,690 And nowadays, in addition to kind of doing that, which is what we would consider 232 00:14:13,690 --> 00:14:17,370 pretraining on some large corpus, we now fine-tune those 233 00:14:17,370 --> 00:14:21,050 embedding models on specific tasks that are important to us 234 00:14:21,050 --> 00:14:24,765 for retrieval. Like, okay, we have this query or question we're 235 00:14:24,765 --> 00:14:28,365 asking. 
We have the set of documents that might answer this question or might 236 00:14:28,365 --> 00:14:31,965 not. We want a model that makes it so that the query's embedding and the 237 00:14:31,965 --> 00:14:35,565 relevant documents' embeddings are in the same vector space. So you're on the right track. 238 00:14:35,565 --> 00:14:39,240 That's, like, basically how these models are able to learn these things. I don't know 239 00:14:39,240 --> 00:14:43,080 if I would call them graph representations, maybe a little bit 240 00:14:43,080 --> 00:14:46,920 of being pedantic on, like, use of words there because that can 241 00:14:46,920 --> 00:14:50,300 be a little bit different in how you're organizing that information. 242 00:14:50,760 --> 00:14:54,065 But you can make the argument that the way that these large language models are 243 00:14:54,065 --> 00:14:57,825 representing information is a compressed form of, like, the giant dataset that they're 244 00:14:57,825 --> 00:15:01,585 trained on. And we don't actually know exactly, like, where that 245 00:15:01,585 --> 00:15:05,185 information lies inside that neural network. There's some research that's, 246 00:15:05,185 --> 00:15:08,430 like, trying to get at answering that question. But you could, for the sake of 247 00:15:08,430 --> 00:15:12,190 argument, be like, yeah. There's probably, like, a dog 248 00:15:12,190 --> 00:15:15,949 node somewhere in this neural network that knows a ton about dogs, and that's how 249 00:15:15,949 --> 00:15:18,910 we're able to kind of learn this information. That is the stuff that we don't 250 00:15:18,910 --> 00:15:22,665 exactly know. Interesting. Because, there was a really good 251 00:15:22,665 --> 00:15:26,025 video by 3Blue1Brown, which you probably are I love that 252 00:15:26,025 --> 00:15:29,705 channel. 
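Having query and document embeddings in the same vector space is what makes retrieval a nearest-neighbor comparison. The sketch below assumes cosine similarity as the comparison and uses hand-written three-dimensional vectors in place of real model output; the document names and numbers are invented for illustration.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# Hand-written toy vectors standing in for embedding-model output.
query = [0.9, 0.1, 0.0]
documents = {
    "dog care guide": [0.8, 0.2, 0.1],
    "moon landing history": [0.0, 0.1, 0.9],
}

# Rank documents by how closely they point in the query's direction.
ranked = sorted(
    documents,
    key=lambda name: cosine(query, documents[name]),
    reverse=True,
)
print(ranked)
```

In a real system the vectors would come from an embedding model and the ranking would typically be delegated to a vector database rather than a Python sort.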
Where he gives examples where, you know, famous historical 253 00:15:29,705 --> 00:15:33,325 leaders from Britain have the same distance 254 00:15:33,385 --> 00:15:36,910 from you change the country to Italy 255 00:15:37,290 --> 00:15:41,050 or the United States have the same kind of distance. So you can kind 256 00:15:41,050 --> 00:15:44,730 of infer I'm not saying that the AI it 257 00:15:44,730 --> 00:15:48,190 almost seems like this knowledge graph is also a byproduct 258 00:15:48,330 --> 00:15:51,405 of building this out. Like, there's some 259 00:15:51,865 --> 00:15:55,625 type of encoding or semantics, I guess, is really what it is. Right? 260 00:15:55,625 --> 00:15:59,465 Like, that you get with it. And, I wanna get 261 00:15:59,465 --> 00:16:03,240 your thoughts because yesterday, I caught the 262 00:16:03,240 --> 00:16:06,780 first half of the Jensen Huang keynote at CES, 263 00:16:07,320 --> 00:16:10,520 which this you know, we're recording this on January 8th. Right? And one of the 264 00:16:10,520 --> 00:16:13,880 things that the video starts off with is, you know, the idea 265 00:16:13,880 --> 00:16:17,560 that tokens are kind of fundamental elements of 266 00:16:17,560 --> 00:16:21,295 knowledge. And I did a live stream where I'm like, well, I never really thought 267 00:16:21,295 --> 00:16:24,654 about it this way. Right? They're building blocks of knowledge or the pixels, if 268 00:16:24,654 --> 00:16:28,415 you will, of knowledge. And I wanted to get your 269 00:16:28,415 --> 00:16:32,115 thoughts on that because, like, that kind of blew my mind and maybe I'm simple. 270 00:16:32,130 --> 00:16:35,410 I don't know. Maybe I'm not. 
But it seems like we've been kinda 271 00:16:35,410 --> 00:16:38,850 dancing around this idea where and now NVIDIA is really 272 00:16:38,850 --> 00:16:42,690 fully, you know, going all in on this, the idea that, you know, 273 00:16:42,690 --> 00:16:46,530 these are not, this isn't an AI system. It's a token factory 274 00:16:46,530 --> 00:16:49,795 or a token score. What are your thoughts on that? I'm curious. 275 00:16:50,495 --> 00:16:54,334 So when I started learning about how, like, tokenization works 276 00:16:54,334 --> 00:16:57,855 and how we're able to kind of, like, basically build these 277 00:16:57,855 --> 00:17:00,595 models without having massive, massive vocabularies, 278 00:17:01,740 --> 00:17:05,520 it is pretty 279 00:17:05,660 --> 00:17:08,560 interesting to be, like, okay. Like, maybe there's some, 280 00:17:10,140 --> 00:17:13,900 abstract notion of information that each token has that 281 00:17:13,900 --> 00:17:17,735 is what the model is learning during training time. And then 282 00:17:17,735 --> 00:17:21,515 we're just combining these sets of information in order to kind of, like, understand 283 00:17:21,815 --> 00:17:24,855 what words mean or what documents mean, so on and so forth. Because when you 284 00:17:24,855 --> 00:17:28,695 look at how tokenizers work and the size of the number of 285 00:17:28,695 --> 00:17:31,835 tokens for, like, maybe the English language or maybe, like, a really multilingual 286 00:17:32,100 --> 00:17:35,780 model like RoBERTa or multilingual-e5-large, they're a lot 287 00:17:35,780 --> 00:17:39,400 less than you would expect. Like, it's on the order of, like, maybe 100,000, 288 00:17:39,940 --> 00:17:42,840 200,000, 300,000 tokens. 
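One way to see why a limited token inventory can still cover a huge number of words is to split words into reusable subword pieces. The sketch below uses a simplified greedy longest-match scheme with a tiny, hypothetical vocabulary. It is not the tokenizer of any model named in the conversation, and real tokenizers learn their vocabularies from data.

```python
# Hypothetical subword vocabulary, invented for illustration.
VOCAB = {"token", "ization", "un", "fold", "ing", "s"}

def tokenize(word):
    """Greedily split a word into the longest known pieces, left to right."""
    pieces, start = [], 0
    while start < len(word):
        for end in range(len(word), start, -1):
            if word[start:end] in VOCAB:
                pieces.append(word[start:end])
                start = end
                break
        else:
            # No known piece starts here: fall back to a single character.
            pieces.append(word[start])
            start += 1
    return pieces

for word in ["tokenization", "unfolding", "folds"]:
    print(word, tokenize(word))
```

Six pieces are enough to cover all three words here, and the same pieces would recombine to cover many more, which is the effect that keeps vocabularies in the hundreds of thousands rather than the millions.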
289 00:17:43,539 --> 00:17:47,115 So it is kind of 290 00:17:47,115 --> 00:17:50,794 odd to think about whether those tokens 291 00:17:50,794 --> 00:17:54,554 themselves hold information that's readily interpretable for us. But I 292 00:17:54,554 --> 00:17:57,990 think that we've gotten so far with using 293 00:17:57,990 --> 00:18:01,830 systems that are just combining, the operations on top of 294 00:18:01,830 --> 00:18:05,270 these tokens in order to retrieve the information that these systems have learned, that there's 295 00:18:05,270 --> 00:18:08,950 definitely something important there. And I would love to, like, know 296 00:18:08,950 --> 00:18:12,535 exactly, like, what is happening when we're able to do that. The the 297 00:18:12,535 --> 00:18:16,375 heuristic that I like to use is, large 298 00:18:16,375 --> 00:18:20,215 language models are generally reflections of the training datasets that they've been trained on, 299 00:18:20,215 --> 00:18:23,735 and they're basically creating, like, really efficient indexes over that 300 00:18:23,735 --> 00:18:27,440 information. And sometimes those indices hallucinate. And the reason 301 00:18:27,440 --> 00:18:31,280 why is because we are when we ask, quote, unquote, what 302 00:18:31,520 --> 00:18:35,360 a question to a large language model or query a large language model, we 303 00:18:35,360 --> 00:18:39,059 are kind of conditioning that model, on a probability 304 00:18:39,200 --> 00:18:42,875 space where every token being generated after is 305 00:18:42,875 --> 00:18:46,475 likely to exist given the query or the context or whatever we're passing to 306 00:18:46,475 --> 00:18:50,235 it. 
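The conditioning idea can be illustrated with a deliberately tiny stand-in: a bigram count model, where the "context" is just the previous token. A real large language model conditions on the whole input with a neural network, so this only sketches the shape of the idea. The corpus and the `next_token_probs` helper are made up.

```python
from collections import Counter, defaultdict

# Toy training corpus, made up for illustration.
corpus = "the dog chased the cat . the dog ate . the cat slept .".split()

# Count which token follows each token in the corpus.
following = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev][nxt] += 1

def next_token_probs(context_token):
    """Probability of each next token, conditioned on the previous token."""
    counts = following[context_token]
    total = sum(counts.values())
    return {token: n / total for token, n in counts.items()}

probs = next_token_probs("the")
print(probs)
```

Every token the model can propose after "the" is one it saw after "the" in training, which mirrors the point that generation reflects the training data. A token sequence can still be individually plausible at each step and wrong as a whole, which is one way to think about hallucination.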
And once you think about it that way, then it just feels like 307 00:18:50,235 --> 00:18:53,820 instead of thinking about what each of the tokens are doing, you're kind of just 308 00:18:54,059 --> 00:18:57,820 querying what the model has been trained on and what it will tell you 309 00:18:57,820 --> 00:19:01,039 based on what it, quote unquote, learned or knows. 310 00:19:01,580 --> 00:19:04,460 And then you can kind of run with that metaphor a lot and build systems 311 00:19:04,460 --> 00:19:08,255 on top of that. That seems much more actionable than thinking about, 312 00:19:08,255 --> 00:19:11,135 like, what each of the tokens are doing individually. Does that kinda make sense? No. 313 00:19:11,135 --> 00:19:13,455 That makes a lot of sense. I think the whole gestalt of it is what 314 00:19:13,455 --> 00:19:16,895 really makes it magical. Right? Like Yeah. You know, you 315 00:19:16,895 --> 00:19:20,580 can obviously, I don't this is not, like, the newest iPhone 316 00:19:20,580 --> 00:19:23,960 or whatever. But, you know, if you go through the text autocomplete, 317 00:19:24,740 --> 00:19:28,340 you can maybe make a sentence that sounds like 318 00:19:28,340 --> 00:19:32,155 something you would write. But much beyond that, it starts getting weird. And 319 00:19:32,155 --> 00:19:35,535 early generative AI was very much like that, particularly the images. 320 00:19:35,915 --> 00:19:39,455 Well, you know Don't like, yes. A 100% 321 00:19:39,515 --> 00:19:43,115 understand. I started learning about generative text 322 00:19:43,115 --> 00:19:46,840 generation before we had instruction fine-tuned models. So are you 323 00:19:46,840 --> 00:19:50,520 familiar with, like, the concept of instruction fine-tuning, Frank? I think I am, 324 00:19:50,520 --> 00:19:53,880 but IBM slash Red Hat defines it one way. I would like to get 325 00:19:53,880 --> 00:19:57,720 your opinion. Yeah. 
So, this is the idea that 326 00:19:57,720 --> 00:20:01,195 you can train or fine tune large language models to follow 327 00:20:01,195 --> 00:20:04,955 instructions to complete tasks. So, before we had, 328 00:20:04,955 --> 00:20:08,715 like, models that could that we could just, like, ask questions of and just, like, 329 00:20:08,715 --> 00:20:12,495 receive answers directly, you had to craft text 330 00:20:13,110 --> 00:20:16,710 that would increase the probability that the document that you want to 331 00:20:16,710 --> 00:20:20,470 generate would happen. So if you wanted a story about, like, unicorns or something, 332 00:20:20,470 --> 00:20:24,070 you would have to start your query to the LLM as there 333 00:20:24,070 --> 00:20:27,190 once was, like, a set of unicorns living in the forest. Blah blah blah blah. 334 00:20:27,190 --> 00:20:30,415 And then it would just, like, complete sentence, just like a fancy version of autocomplete. 335 00:20:30,875 --> 00:20:34,555 Right. And that that's kind of, like, what we used to have, and that was 336 00:20:34,555 --> 00:20:37,995 pretty hard to work with. And then once researchers kinda cracked, like, wait a second. 337 00:20:37,995 --> 00:20:41,755 We can create a dataset of, like, instruction pairs and, like, document 338 00:20:41,755 --> 00:20:45,080 sets and fine tune models on them. And it turns out now we can just, 339 00:20:45,080 --> 00:20:48,920 like, ask models to do things, and they will do them. Whether or not 340 00:20:48,920 --> 00:20:52,200 those are correct is kind of the next part of the story. But getting to 341 00:20:52,200 --> 00:20:55,180 that point, it was, like, pretty interesting and pretty significant. 342 00:20:56,115 --> 00:20:59,575 Interesting. Interesting. 
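The dataset side of instruction fine-tuning can be sketched as turning instruction and response pairs into training text. Everything specific below is an assumption for illustration: the two pairs are invented, and the `### Instruction:` template is one commonly seen layout rather than the format of any particular model or dataset.

```python
# Made-up instruction/response pairs; real datasets hold many thousands.
pairs = [
    {
        "instruction": "Write a one-line story about unicorns.",
        "response": "There once was a herd of unicorns living in the forest.",
    },
    {
        "instruction": "Name the art of folding paper.",
        "response": "Origami.",
    },
]

# Hypothetical prompt template; actual templates vary between models.
TEMPLATE = "### Instruction:\n{instruction}\n\n### Response:\n{response}"

def to_training_text(pair):
    """Render one pair as a single training document."""
    return TEMPLATE.format(**pair)

training_texts = [to_training_text(pair) for pair in pairs]
print(training_texts[0])
```

Fine-tuning on documents shaped like this keeps the training objective as plain text completion, but it makes "an answer" the most likely continuation of "an instruction", so the unicorn story no longer needs a hand-crafted opening line.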
When I think of 343 00:20:59,955 --> 00:21:03,635 fine tuning, I think of I think of 344 00:21:03,635 --> 00:21:07,235 primarily InstructLab, where you basically kinda have a 345 00:21:07,235 --> 00:21:10,740 LoRA layer on top of the base LLM doing 346 00:21:10,740 --> 00:21:14,420 that. Is that the same thing? Or is it kind of slightly 347 00:21:14,580 --> 00:21:18,260 it sounds like it's slightly nuanced. So the nuance there 348 00:21:18,260 --> 00:21:22,100 is that, one, the methodology that I'm 349 00:21:22,100 --> 00:21:25,945 describing is mostly dataset driven. So you have, like, your original LLM, 350 00:21:26,005 --> 00:21:29,845 and then you have, like, a new dataset that allows the LLM to learn a 351 00:21:29,845 --> 00:21:33,605 specific task. Or in this case, like, a generalized form of tasks, 352 00:21:33,605 --> 00:21:37,390 which is you have instruction, answer, user query, 353 00:21:37,390 --> 00:21:41,150 give it an instruction. Whereas in your case, you're kind of, like, adding another layer 354 00:21:41,150 --> 00:21:44,830 to the LLM and, like, forcing the LLM to learn all the new 355 00:21:44,830 --> 00:21:48,210 methodology inside that layer in order to accomplish a specific 356 00:21:48,270 --> 00:21:52,115 task. So that's kind of like what fine tuning ends up doing. So the other 357 00:21:52,115 --> 00:21:55,475 way there's multiple ways to do this, it seems. Right? Like, there there's that way 358 00:21:55,475 --> 00:21:58,775 we add the layer, but there's also kind of I hate the term prompt engineering 359 00:21:58,915 --> 00:22:02,755 because it's just so over overblown. But, like, giving it 360 00:22:02,755 --> 00:22:06,559 more context and samples. And now that the the token context 361 00:22:06,559 --> 00:22:10,320 window is large enough that you don't have to be well, if you wanna 362 00:22:10,320 --> 00:22:12,799 save money, you have to be very mindful of that.
But if you're running it 363 00:22:12,799 --> 00:22:16,480 locally, like, doesn't really matter. Well, you could give it an example of 364 00:22:16,639 --> 00:22:19,905 let's just say you had I'm trying to think of a short story or a 365 00:22:19,905 --> 00:22:23,365 novel. I don't know. Let's pretend, 366 00:22:23,905 --> 00:22:27,745 Moby Dick was only a 100 pages. Right? I 367 00:22:27,745 --> 00:22:30,785 could give it that as the part of the prompt. Let's say write a sequel 368 00:22:30,785 --> 00:22:34,580 to this book based on what happens in this one. Is that what you're talking 369 00:22:34,580 --> 00:22:38,260 about? Were you kinda giving an example as part of the prompt? Or is there 370 00:22:38,260 --> 00:22:41,779 some and not part of the layer? Or some combination thereof? Or was some third 371 00:22:41,779 --> 00:22:45,525 thing entirely? So this would be like, what what 372 00:22:45,525 --> 00:22:49,365 you're describing is more like few shot learning, which is you gave kind of an 373 00:22:49,365 --> 00:22:53,045 example, and then you're, like, okay. Like, given these examples, can you do this other 374 00:22:53,045 --> 00:22:56,885 task this test that I've described on this unseen example? What I'm describing is 375 00:22:56,885 --> 00:23:00,289 kind of, like, slightly before that. So, like, before we had the ability to, like, 376 00:23:00,289 --> 00:23:03,750 give models examples, we had to, like, give them we have to 377 00:23:03,809 --> 00:23:07,570 create the ability to follow instructions. And then once you have the ability to 378 00:23:07,570 --> 00:23:11,155 follow instructions, you can be like, okay. Here are the instructions. Here's 379 00:23:11,155 --> 00:23:14,615 examples of correctly completing the instruction, now do the instruction. 
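The few shot pattern described above (instructions, then a handful of worked examples, then the unseen case) can be sketched as a small prompt builder. This is an illustrative assumption about layout, not a required format: the "Input:"/"Output:" labels and the function name are hypothetical.

```python
def build_few_shot_prompt(instruction, examples, new_input):
    """Assemble an instruction, worked examples, and the unseen input into one prompt.

    `examples` is a list of (input_text, expected_output) pairs.
    """
    parts = [instruction.strip(), ""]
    for text, label in examples:
        parts.append(f"Input: {text}")
        parts.append(f"Output: {label}")
        parts.append("")  # blank line between examples
    parts.append(f"Input: {new_input}")
    parts.append("Output:")  # left open for the model to complete
    return "\n".join(parts)


if __name__ == "__main__":
    prompt = build_few_shot_prompt(
        "Classify the sentiment of each review as positive or negative.",
        [("Loved every minute of it.", "positive"),
         ("A complete waste of time.", "negative")],
        "Surprisingly good.",
    )
    print(prompt)
```

The resulting string would be sent to an instruction-following model; the handful of examples plays the role of the "3 points or 10 examples" mentioned in the conversation.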
380 00:23:14,995 --> 00:23:18,355 And that is the reason why that happens in that order is 381 00:23:18,355 --> 00:23:21,795 because first, you have, like, just, like, sequence completion, like, 382 00:23:21,795 --> 00:23:25,395 autocomplete. Then you have, like, okay, given this 383 00:23:25,395 --> 00:23:29,120 task given this set of instructions, just follow the instruction instead of, 384 00:23:29,120 --> 00:23:32,800 like, trying to do autocomplete. And then you have, okay, now you know how to 385 00:23:32,800 --> 00:23:36,560 follow instructions. I'm gonna give you a few data points in order to 386 00:23:36,560 --> 00:23:40,160 learn a new task. Now do this new task. So you're kind of, 387 00:23:40,160 --> 00:23:43,955 like, moving from a situation where you need tons and tons 388 00:23:43,955 --> 00:23:47,635 of data just to get the, sequence completion. And then you need 389 00:23:47,635 --> 00:23:51,095 a smaller set of data to, like, get the capability to follow instructions. 390 00:23:51,555 --> 00:23:55,320 And then you need a very, very, very small amount of data, like, 391 00:23:55,320 --> 00:23:59,160 maybe 3 points or 10 examples or 15 examples to complete kind of, like, 392 00:23:59,160 --> 00:24:02,760 a new task. So there's a lot of kind of nuance in, like, how 393 00:24:02,760 --> 00:24:06,120 modern LLMs are being used and how they're kind of trained and fine tuned, so 394 00:24:06,120 --> 00:24:09,559 on and so forth. And I think there's a lot of, like, 395 00:24:09,559 --> 00:24:13,135 important importance in, like, learning what what happened kind of 396 00:24:13,135 --> 00:24:16,975 before because the advancements have happened so quickly. It can be really hard to kind 397 00:24:16,975 --> 00:24:20,815 of differentiate, or, like, oh, why is why do models perform like this? Why 398 00:24:20,815 --> 00:24:24,429 do things kind of happen like that? 
And even though, prompt 399 00:24:24,429 --> 00:24:28,190 engineering has kind of, like, let's say, traveled through the 400 00:24:28,190 --> 00:24:31,230 hype cycle where people were, like, really excited about it, and then were, like, this 401 00:24:31,230 --> 00:24:34,830 is not actually that interesting. Right. What's interesting is that, 402 00:24:34,990 --> 00:24:38,755 building a good RAG system, or retrieval augmented generation system, 403 00:24:38,895 --> 00:24:42,195 you really need to be good at prompt engineering in a sense 404 00:24:42,415 --> 00:24:45,635 because you're assembling the correct context for this model 405 00:24:45,855 --> 00:24:49,230 to answer some downstream question, and it's not 406 00:24:49,230 --> 00:24:52,910 intuitive how to assemble that context. So understanding, like, how these 407 00:24:52,910 --> 00:24:56,750 models are trained, like, whether they can follow instructions, how good they are at 408 00:24:56,750 --> 00:25:00,555 doing so, how many examples of information they need in order to accomplish some task 409 00:25:00,875 --> 00:25:04,395 really affects how you build that knowledge base in order to help the 410 00:25:04,395 --> 00:25:07,535 model do some sort of new thing. Interesting. 411 00:25:09,435 --> 00:25:12,895 So RAG is obviously all the rage now. 412 00:25:13,240 --> 00:25:17,080 Yep. But there's also a relatively new because this this 413 00:25:17,080 --> 00:25:20,840 space changes rapidly. Like, I mean, I took 2 weeks off in December, and 414 00:25:20,840 --> 00:25:24,380 I feel completely disconnected from the cutting edge, you know. 415 00:25:25,000 --> 00:25:28,655 Because when I was watching the keynote from CES, and I'm like, wow. That's 416 00:25:28,655 --> 00:25:32,095 really cool. And I was texting, you know, Slacking with a coworker, and he goes, 417 00:25:32,095 --> 00:25:35,455 oh, no. This is a retread of their, like, last keynote they did.
Like 418 00:25:35,935 --> 00:25:39,775 and I'm like, okay. Wow. Blink and you missed 419 00:25:39,775 --> 00:25:43,220 something. So what 420 00:25:43,220 --> 00:25:46,980 you're describing the fine tuning, is that really what RAFT is, where the 421 00:25:46,980 --> 00:25:50,820 idea that you have kind of retrieval augmented fine tuning, which I think is what 422 00:25:50,820 --> 00:25:54,595 the acronym stands for. Is that not I'm 423 00:25:54,595 --> 00:25:58,315 not familiar with how RAFT works. So I don't wanna, like, kind of venture 424 00:25:58,315 --> 00:26:01,875 and guess without without knowing what it is. But do you remember, like, what context 425 00:26:01,875 --> 00:26:04,695 you encountered this in? Basically, it's the idea that 426 00:26:06,290 --> 00:26:10,049 it's the idea that you can fine tune the results. Sounds very 427 00:26:10,049 --> 00:26:13,190 similar to what you're doing, and I haven't read the paper in a while. 428 00:26:14,850 --> 00:26:17,745 Back when I was a Microsoft MVP, like, you know, 429 00:26:18,625 --> 00:26:22,465 they had a Microsoft Research had the thing for their calls, and they 430 00:26:22,465 --> 00:26:26,005 were all raving about it. The paper had just come out and things like that. 431 00:26:26,625 --> 00:26:29,925 It's the idea that you can kind of give it pretrained examples. 432 00:26:30,630 --> 00:26:33,910 You start with a base LLM, and you give it pretrained examples, and then 433 00:26:33,910 --> 00:26:37,750 you add on top of that the retrieval 434 00:26:37,750 --> 00:26:41,350 augmented portion of it. It's very similar, not to 435 00:26:41,350 --> 00:26:43,990 plug my you know, for my day job. I work at Red Hat. That's why 436 00:26:43,990 --> 00:26:47,695 there's a fedora there. We have a product called RHEL 437 00:26:47,695 --> 00:26:51,135 AI, which is based on an upstream open source project called 438 00:26:51,135 --> 00:26:54,815 InstructLab.
And it's the idea similar idea in that you you you 439 00:26:54,815 --> 00:26:58,035 basically give it a set of data. 440 00:26:58,580 --> 00:27:01,780 And then you we there's a there's a little more to it because there's a 441 00:27:01,780 --> 00:27:05,400 teacher model. And basically what it'll do is it will and synthetic data generation. 442 00:27:05,860 --> 00:27:09,160 So you can start with a modest document set. 443 00:27:10,180 --> 00:27:13,875 And based on how the questions and answers that you 444 00:27:13,875 --> 00:27:15,174 form and the the the, 445 00:27:17,875 --> 00:27:21,015 the taxonomy that you attach to it, it will 446 00:27:21,715 --> 00:27:25,255 create a LoRA layer on top of an existing LLM. 447 00:27:26,120 --> 00:27:29,960 And it it could be that it's it's it's not quite exactly the same as 448 00:27:29,960 --> 00:27:33,320 RAFT, but it's definitely in the same direction. Same same thing as, like, BERT, ELMo, 449 00:27:33,320 --> 00:27:36,540 and, you know, RoBERTa, which, I think 450 00:27:37,400 --> 00:27:40,645 I think I understand. So it's kind of like you so the I think the 451 00:27:40,645 --> 00:27:44,325 problem that it might be addressing is kind of just really similar to the problem that 452 00:27:44,325 --> 00:27:48,085 traditional RAG tries to address, except in a more kind of deliberate fashion 453 00:27:48,245 --> 00:27:51,925 Exactly. Yeah. Where you have some document store internally. Like, let's say we 454 00:27:51,925 --> 00:27:55,465 both work at some company, and we have a giant customer support document store. 455 00:27:55,710 --> 00:27:59,150 You take some LLM off the shelf. It's not necessarily gonna know the 456 00:27:59,150 --> 00:28:02,909 contents of your internal kind of documents.
So how can you get 457 00:28:02,909 --> 00:28:06,669 it to, like, successfully help answer tickets or triage tickets that 458 00:28:06,669 --> 00:28:10,365 you're trying to build, so that you can answer, like, most difficult tickets and 459 00:28:10,365 --> 00:28:13,965 kind of work toward that. In this situation, maybe you 460 00:28:13,965 --> 00:28:17,405 want to inject some of the knowledge of 461 00:28:17,405 --> 00:28:21,005 the documents in addition to having the 462 00:28:21,005 --> 00:28:24,760 model being able to search over the document store. So maybe, like, the what this 463 00:28:24,760 --> 00:28:28,280 LoRA layer is doing is, like, absorbing Yeah. Some of the knowledge from the 464 00:28:28,280 --> 00:28:31,500 document store so that you can kind of more 465 00:28:32,120 --> 00:28:35,815 efficiently query the database and so 466 00:28:35,815 --> 00:28:39,195 that you don't have to, like, query it all the time. The only, 467 00:28:39,655 --> 00:28:43,255 issue, quote, unquote, I'd have with that method is that you'd have to, like, keep 468 00:28:43,255 --> 00:28:47,015 that updated from time to time, and that's, like, not that's nontrivial. Whereas 469 00:28:47,015 --> 00:28:50,679 if you just do, like, traditional RAG, you just need to 470 00:28:50,679 --> 00:28:54,200 update your vector store, and then you can just have the model 471 00:28:54,200 --> 00:28:57,559 query that new information when you need to. But, you know, it's always best to 472 00:28:57,559 --> 00:29:01,019 use whatever solution works best for your given use case. 473 00:29:01,455 --> 00:29:04,895 And experimenting with different use cases is always really important. But I imagine that's, like, 474 00:29:04,895 --> 00:29:08,195 kind of what that is trying to address, which is the That is basically it. 475 00:29:08,255 --> 00:29:11,375 The I, you know, I don't wanna go down that rabbit hole of that.
But 476 00:29:11,375 --> 00:29:15,150 but, basically, the idea is that, if 477 00:29:15,150 --> 00:29:18,590 you train an LLM or you have a layer on top of an 478 00:29:18,590 --> 00:29:22,270 LLM that not only does retrieval from a source document 479 00:29:22,270 --> 00:29:25,950 store. Right? I think that's a pretty set pattern. But it also has a 480 00:29:25,950 --> 00:29:29,170 better understanding of your business, your industry, the jargon. 481 00:29:29,644 --> 00:29:33,404 Right. Right. Blah blah blah. Right? The idea is that the retrieval success 482 00:29:33,404 --> 00:29:37,184 rate will be higher. Now we're not publishing the numbers yet, 483 00:29:37,485 --> 00:29:41,085 but the research is still ongoing. But basically, it's a 484 00:29:41,085 --> 00:29:44,799 pretty substantial from what I've seen well, I haven't 485 00:29:44,799 --> 00:29:47,679 seen the actual numbers yet, but from what I've been told those numbers are by 486 00:29:47,679 --> 00:29:51,380 the researcher, that it is a it is a substantial improvement 487 00:29:51,440 --> 00:29:55,065 that is worth the, the juice is worth the squeeze in that in that regard. 488 00:29:55,784 --> 00:29:59,544 You're not and it's also computationally, you're not quite training the 489 00:29:59,544 --> 00:30:03,225 whole thing again. You're just kinda putting a new Instagram filter, so to 490 00:30:03,225 --> 00:30:06,664 speak, together on top of the base. So it definitely 491 00:30:06,664 --> 00:30:10,230 does it definitely does some things. Now when we get the hard 492 00:30:10,230 --> 00:30:13,990 numbers, then, you know, I mean, I can 493 00:30:13,990 --> 00:30:17,669 say them publicly, then I think we'll we'll know is the juice how 494 00:30:17,669 --> 00:30:21,450 much does the the the the squeeze to juice ratio is? 495 00:30:22,325 --> 00:30:26,085 But, I can confidently say publicly now, like, there's a there 496 00:30:26,085 --> 00:30:29,685 there. Yeah. 
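The remark above about not retraining the whole model can be made concrete with a rough parameter count. The sketch below uses the general low-rank adaptation (LoRA) idea, in which a frozen weight matrix of shape d_out x d_in is adjusted by the product of two small trainable matrices of rank r. It is not a description of how InstructLab or RHEL AI implement it, and the layer size and rank are hypothetical values chosen only for illustration.

```python
def full_params(d_in: int, d_out: int) -> int:
    """Number of weights in one dense d_out x d_in matrix."""
    return d_in * d_out


def lora_trainable_params(d_in: int, d_out: int, rank: int) -> int:
    """Weights in the two low-rank factors: A is rank x d_in, B is d_out x rank."""
    return rank * d_in + d_out * rank


if __name__ == "__main__":
    d_in = d_out = 4096  # hypothetical layer width
    rank = 8             # hypothetical adapter rank
    full = full_params(d_in, d_out)
    lora = lora_trainable_params(d_in, d_out, rank)
    print(f"full matrix: {full} weights")
    print(f"LoRA factors: {lora} weights ({lora / full:.2%} of the full matrix)")
```

Under these assumed sizes the adapter trains well under one percent of the weights in that matrix, which is the sense in which the base model is left alone.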
And, you know, we'll have those numbers soon 497 00:30:29,685 --> 00:30:33,445 enough. But it's it's interesting because you're right. I mean, this paper 498 00:30:33,445 --> 00:30:37,110 came out in 2019. Right? There was just an 499 00:30:37,110 --> 00:30:40,810 explosion of these different mechanisms. You mentioned BERT. You mentioned RoBERTa. 500 00:30:41,030 --> 00:30:44,490 Fun fact, my wife's name is Roberta. So that was kind of fun. 501 00:30:45,110 --> 00:30:48,950 There was ELMo. There was ERNIE. There was a whole Sesame 502 00:30:48,950 --> 00:30:52,545 Street themed zoo of of model 503 00:30:52,545 --> 00:30:56,304 types. That seems to have kind of that branching out of 504 00:30:56,304 --> 00:31:00,145 those different directions has seemed to have stalled, and we're going into more of 505 00:31:00,145 --> 00:31:03,985 these retrieval augmented generation systems. So for those who because 506 00:31:03,985 --> 00:31:07,590 not all of our listeners know exactly what retrieval 507 00:31:07,590 --> 00:31:11,110 augmented systems are. Could you give kind of a a 508 00:31:11,110 --> 00:31:14,890 level 200 elevator explanation? Sure. 509 00:31:15,190 --> 00:31:18,970 So, when you speak to a modern chatbot, 510 00:31:19,635 --> 00:31:23,395 what's happening is that they've learned information through their 511 00:31:23,395 --> 00:31:27,095 pretraining processes, on a large corpus of basically the entire Internet, 512 00:31:27,715 --> 00:31:31,410 and are generating information based on the query that you're passing in. 513 00:31:31,970 --> 00:31:35,110 The problem that often occurs is that 514 00:31:35,570 --> 00:31:39,330 these AI models might err, and the error could 515 00:31:39,330 --> 00:31:42,770 be making information up that doesn't 516 00:31:42,770 --> 00:31:46,425 exist.
For example, if a model is trained before a period of time, 517 00:31:46,425 --> 00:31:49,065 like, it might not know about that period of time, which happens more 518 00:31:49,065 --> 00:31:52,665 often than you think. The information could be false, untruthful, or it could 519 00:31:52,665 --> 00:31:56,490 just be incorrect in a way that's not, like, bad, but still not 520 00:31:56,490 --> 00:32:00,170 helpful. And the reason for this is the way that these 521 00:32:00,170 --> 00:32:03,850 models are accessing that information. The idea behind retrieval 522 00:32:03,850 --> 00:32:07,370 augmented generation is that instead of having the model try 523 00:32:07,370 --> 00:32:10,745 to generate the correct document or the correct 524 00:32:10,745 --> 00:32:14,365 response given its pretraining process, you instead 525 00:32:14,424 --> 00:32:18,265 add factual content to the query that you're asking 526 00:32:18,265 --> 00:32:22,025 the model for. You first search for that content, which is where 527 00:32:22,025 --> 00:32:25,740 the retrieval part comes in, and then you augment the generation of what that 528 00:32:25,740 --> 00:32:29,260 model is going to create based on that content, hence 529 00:32:29,260 --> 00:32:33,020 retrieval augmented generation. There's usually a querying 530 00:32:33,020 --> 00:32:36,620 step. So you take in a user query, you hit it against some sort 531 00:32:36,620 --> 00:32:39,895 of database, usually a vector database. In our case, it could be Pinecone. 532 00:32:40,355 --> 00:32:43,895 You find a set of relevant documents. You pass that to the generating LLM. 533 00:32:44,435 --> 00:32:47,795 The generating LLM uses those documents to generate a final 534 00:32:47,795 --> 00:32:50,835 response. And it turns out that if you do this, you can reduce the rate of 535 00:32:50,835 --> 00:32:54,679 hallucinations.
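The steps just described (take a user query, search a store of documents, pass the relevant ones to the generating model) can be sketched end to end with the standard library. Everything here is a stand-in under stated assumptions: `embed` is a toy bag-of-words counter rather than a real embedding model, the in-memory list replaces a vector database such as Pinecone, and no Pinecone or LLM API is called. The function names are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)  # Counter returns 0 for missing keys
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, documents: list, top_k: int = 2) -> list:
    """Retrieval step: rank documents by similarity to the query."""
    q = embed(query)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:top_k]


def build_augmented_prompt(query: str, documents: list, top_k: int = 2) -> str:
    """Augmentation step: put the retrieved text in front of the question."""
    context = "\n".join(f"- {d}" for d in retrieve(query, documents, top_k))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}\nAnswer:"
    )


if __name__ == "__main__":
    docs = [
        "Reset your password from the account settings page.",
        "Refunds are processed within five business days.",
        "Our office is closed on public holidays.",
    ]
    print(build_augmented_prompt("How do I reset my password?", docs, top_k=1))
```

The printed prompt is what would be handed to the generating LLM. In a real system the brute-force scan in `retrieve` is the part a vector database replaces with an index, which is what keeps lookups fast as the number of stored vectors grows.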
And that makes sense because if the model was given true 536 00:32:54,679 --> 00:32:58,360 information and then conditioned its generation on that information, it 537 00:32:58,360 --> 00:33:01,799 follows that the probability of generating information that is 538 00:33:01,799 --> 00:33:05,415 correct could be higher. That's a good exam that's a good 539 00:33:05,415 --> 00:33:09,174 explanation. So you're basically giving it a 540 00:33:09,174 --> 00:33:12,215 crash course in what documents you care about. Right? Like 541 00:33:12,934 --> 00:33:16,695 Exactly. Interesting. And that's a good segue 542 00:33:16,695 --> 00:33:20,200 because you work for Pinecone. So so tell me about Pinecone. What is Pinecone? 543 00:33:20,980 --> 00:33:24,740 Yeah. So Pinecone is a knowledge layer for AI. It's 544 00:33:24,740 --> 00:33:28,020 kind of like the way we like to describe it. The main product that 545 00:33:28,020 --> 00:33:31,640 we provide is a vector database. So this is a way of storing 546 00:33:31,780 --> 00:33:35,275 information, information that has been vectorized, in a really 547 00:33:35,275 --> 00:33:39,054 efficient manner. And it turns out that if you have the ability to store information 548 00:33:39,115 --> 00:33:42,875 in this manner, you can search against it really quickly, with 549 00:33:42,875 --> 00:33:46,715 low latency, and find the things that you need to find. Really interesting for 550 00:33:46,715 --> 00:33:50,360 these types of semantic search and RAG systems. Pinecone has a few other 551 00:33:50,360 --> 00:33:54,039 offerings now that kind of help people build these systems a lot easier. There's 552 00:33:54,039 --> 00:33:57,640 Pinecone Inference, which lets you embed data in order to do that querying 553 00:33:57,640 --> 00:34:01,135 step.
Pinecone Assistant, which lets you just build a RAG 554 00:34:01,135 --> 00:34:04,915 system immediately just by upserting documents into our vector database, 555 00:34:06,095 --> 00:34:09,695 so on and so forth. But the reason why, like, you 556 00:34:09,695 --> 00:34:13,455 need a vector database is because of all of this advance in 557 00:34:13,455 --> 00:34:16,850 semantic search and embedding models. People have gotten really, really 558 00:34:16,850 --> 00:34:20,210 good at representing chunks of information using these dense sized 559 00:34:20,210 --> 00:34:23,910 vectors. But once you have thousands, millions, 560 00:34:24,130 --> 00:34:27,890 even billions of vectors across tons of different users, you need a way 561 00:34:27,890 --> 00:34:31,304 of indexing this information to access it really quickly at 562 00:34:31,385 --> 00:34:35,065 scale, especially if your chatbot's gonna be querying this vector database really 563 00:34:35,065 --> 00:34:38,905 often. And so having a specialized data store that can handle that type 564 00:34:38,905 --> 00:34:42,505 of search becomes really useful. That's why Pinecone is here, and that's 565 00:34:42,505 --> 00:34:45,980 why we exist. Interesting. Interesting. 566 00:34:47,320 --> 00:34:51,160 One of the other interesting things from your bio, aside from 567 00:34:51,160 --> 00:34:53,820 the the the origami, 568 00:34:55,494 --> 00:34:58,194 tell me about this. So so you 569 00:34:59,095 --> 00:35:02,695 your crew does your do you create the YouTube videos, or do you use your 570 00:35:02,695 --> 00:35:05,655 tools, or is it something completely it's just part of your job as a developer 571 00:35:05,655 --> 00:35:09,415 advocate? So it is just part of my job as a 572 00:35:09,415 --> 00:35:13,150 developer advocate. Oh, okay.
Like, often that, you 573 00:35:13,150 --> 00:35:16,830 know, I do that because we are interviewing people or because there's a new 574 00:35:16,830 --> 00:35:20,590 concept we wanna teach people, so on and so forth. Or we do a webinar, 575 00:35:20,590 --> 00:35:24,030 and we just upload it to YouTube. Oh, very cool. Very cool. 576 00:35:24,030 --> 00:35:27,835 Yeah. I started my career in developer 577 00:35:27,835 --> 00:35:31,055 advocacy. Once, it was called evangelism. So I was a a Microsoft 578 00:35:31,355 --> 00:35:34,795 evangelist for a while. So yeah. Yeah. Cool. YouTube 579 00:35:34,795 --> 00:35:38,635 is very important. Yep. But it's 580 00:35:38,635 --> 00:35:41,375 also it's also, I think, speaks to how people learn, 581 00:35:43,380 --> 00:35:47,060 but, how people learn. YouTube University is very 582 00:35:47,060 --> 00:35:50,820 real. Right? And Yep. You know, not not a knock on 583 00:35:50,820 --> 00:35:54,660 traditional schools, not a knock on traditional publishing, but this space 584 00:35:54,660 --> 00:35:58,444 is moving so fast that if it weren't for YouTubers like 3Blue1Brown 585 00:35:59,545 --> 00:36:03,325 I think his real name is Grant Sanderson. I think that's his real name. 586 00:36:04,184 --> 00:36:07,005 Somebody will send me hate mail if I get it wrong. But, 587 00:36:08,515 --> 00:36:12,070 he he is, like, really good at explaining these 588 00:36:12,070 --> 00:36:15,770 really abstract mathematical concepts. And 589 00:36:15,910 --> 00:36:19,510 unlike you, I didn't study math undergrad. I didn't I mean, I had to. I 590 00:36:19,510 --> 00:36:23,030 only took the requirements. Right? But I have comp sci degrees. So, like, for me 591 00:36:23,030 --> 00:36:26,845 to kind of fall in love with math again or for the first time, depending 592 00:36:26,845 --> 00:36:30,685 on depending on how you wanna say that, for me, that 593 00:36:30,685 --> 00:36:34,385 was very helpful.
And under having an understanding of this, if you're a data engineer 594 00:36:34,445 --> 00:36:37,805 and, you know, or wanna get into this space, it's 595 00:36:37,805 --> 00:36:41,500 definitely vector databases for traditional kinda SQL kinda 596 00:36:41,500 --> 00:36:45,339 RDBMS person will look very awkward at first. But 597 00:36:45,339 --> 00:36:48,300 I know a lot of people that have made the transition, and they kinda love 598 00:36:48,300 --> 00:36:51,280 it. Right? Because in a lot of ways, it's way more efficient, 599 00:36:52,780 --> 00:36:56,195 than, I dare say, traditional data stores. But when you're 600 00:36:56,195 --> 00:36:59,795 processing the large blocks of text, it's really good for kind of 601 00:36:59,795 --> 00:37:03,475 parsing through that. But 602 00:37:03,475 --> 00:37:07,095 that's that's really cool. So, we do have the preset 603 00:37:07,220 --> 00:37:09,779 questions if you're good for doing those. I'll put them in the chat in case 604 00:37:09,779 --> 00:37:13,539 you don't have them. Sure. They're not brain teasers 605 00:37:13,539 --> 00:37:16,279 or anything like that. They are pretty basic of, 606 00:37:17,700 --> 00:37:20,839 questions, and I will paste them in the chat. 607 00:37:22,155 --> 00:37:25,855 So the first question is, how did you find your way into 608 00:37:26,075 --> 00:37:29,915 AI? Did you did you find AI, or did 609 00:37:29,915 --> 00:37:33,515 AI find you? So this is a little bit of a 610 00:37:33,515 --> 00:37:36,495 crazy story, but AI definitely found me. 611 00:37:37,110 --> 00:37:40,950 So when I was in college, when I was looking for my 1st 612 00:37:40,950 --> 00:37:44,790 internship, I couldn't find any internships, basically, because I had, like, no 613 00:37:44,790 --> 00:37:48,390 previous experience in working at tech or anything like that. 
And, 614 00:37:48,710 --> 00:37:51,990 the first company I worked for, Speeko, took a chance on me because they were 615 00:37:51,990 --> 00:37:55,645 building public speaking, tools to kind of help people learn how to do 616 00:37:55,645 --> 00:37:59,405 public speaking better, for an iOS app. And I had some 617 00:37:59,405 --> 00:38:02,205 public speaking experience. They were, like, close enough. We'll have you come on and kind 618 00:38:02,205 --> 00:38:05,805 of help us, like, work work things out. And while I was there, it was 619 00:38:05,805 --> 00:38:09,240 made very obvious to me how important building 620 00:38:10,580 --> 00:38:14,260 very basic deep learning systems and AI systems to kind 621 00:38:14,260 --> 00:38:17,940 of accomplish really specific tasks that could help serve an 622 00:38:17,940 --> 00:38:21,220 ultimate goal. Like, what we were trying to do is just, like, see how many 623 00:38:21,220 --> 00:38:24,925 filler words people are using or how quickly or slowly you were speaking. 624 00:38:24,925 --> 00:38:28,464 And that requires a lot of, complicated 625 00:38:28,525 --> 00:38:31,405 processing because you have to do transcription and because you have to figure out what 626 00:38:31,405 --> 00:38:34,525 words are being said, so on and so forth. So kind of experiencing that and 627 00:38:34,525 --> 00:38:37,970 seeing that firsthand really opened my eyes to how powerful 628 00:38:38,350 --> 00:38:42,190 the technology had been even back in, like, 2017. And ever 629 00:38:42,190 --> 00:38:45,730 since then, I started learning more and more and more about statistics, 630 00:38:45,950 --> 00:38:49,184 AI, natural language processing through my internships, 631 00:38:49,565 --> 00:38:52,944 learning more complicated problems, reading research papers, so on and so forth. 632 00:38:53,405 --> 00:38:56,845 And I got to where I am now. A lot of where I learned is 633 00:38:56,845 --> 00:39:00,125 just out of pure curiosity. Just like, okay. 
There's this new thing. I wanna learn 634 00:39:00,125 --> 00:39:03,619 about it. That's where I wanna be. And that's kind of how I fell into 635 00:39:03,619 --> 00:39:06,980 large language models and AI, just by wanting to learn about what was going to 636 00:39:06,980 --> 00:39:10,740 happen and then eventually being there. So it definitely found me. I was 637 00:39:10,740 --> 00:39:14,415 not looking for it. Didn't even know I liked statistics until I started doing 638 00:39:14,415 --> 00:39:17,935 statistical modeling. And I was like, wait. This is really fun. I wanna do a 639 00:39:17,935 --> 00:39:20,735 lot more of this. I wanna learn a lot more of this. And I knew 640 00:39:20,735 --> 00:39:24,335 that, once I was in college and I bought a statistics book for fun, and 641 00:39:24,335 --> 00:39:27,160 I was like, okay. I'm I'm past the point of no return. Like, this is 642 00:39:27,160 --> 00:39:30,040 definitely Right. Right. Right. Right. That that might be one of the first times in 643 00:39:30,040 --> 00:39:33,560 history that that's been said. Right. Because I I learned statistics for 644 00:39:33,560 --> 00:39:37,320 fun. I I took stats in college. 645 00:39:37,320 --> 00:39:40,715 I hated it. Hated every minute of it. But 646 00:39:40,715 --> 00:39:43,775 when I got into data science, 647 00:39:44,635 --> 00:39:48,315 I the first two weeks were not fun. I'm not gonna lie. Yep. But 648 00:39:48,315 --> 00:39:51,535 just like the VI editor, once you stick with it, 649 00:39:51,835 --> 00:39:55,610 Stockholm syndrome kicks in, And you start loving 650 00:39:55,610 --> 00:39:59,450 it. That's cool. 2, what's your favorite 651 00:39:59,450 --> 00:40:03,210 part of your current gig? 
The favorite part of my 652 00:40:03,210 --> 00:40:06,670 current job is being able to learn interesting, 653 00:40:06,810 --> 00:40:10,375 fun, even complicated things in data science and AI, 654 00:40:10,675 --> 00:40:14,115 and figuring out how to communicate them to a wide 655 00:40:14,115 --> 00:40:17,635 audience. It's a really fun challenge. It's really similar to, like, 656 00:40:17,635 --> 00:40:21,235 what 3Blue1Brown does all the time on the YouTube channel, and it's 657 00:40:21,235 --> 00:40:24,940 something that I get to learn and practice and keep keep doing. That's the best 658 00:40:24,940 --> 00:40:28,060 part of the job. I love learning things and, like, teaching other people about them 659 00:40:28,060 --> 00:40:31,820 and learning even more things. And the fact that I have an opportunity to do 660 00:40:31,820 --> 00:40:35,260 that every single day is, like, the best. That's cool. That's 661 00:40:35,260 --> 00:40:39,025 cool. We have three complete-the-sentence questions. When I'm 662 00:40:39,025 --> 00:40:42,705 not working, I enjoy blank. When I'm 663 00:40:42,705 --> 00:40:46,385 not working, I enjoy baking sweet treats and 664 00:40:46,385 --> 00:40:50,099 goods. I can't have any dairy. So very often, I had to kind 665 00:40:50,099 --> 00:40:52,980 of give up a lot of the cakes and desserts that I loved eating when 666 00:40:52,980 --> 00:40:56,019 I was younger. So now I, like, spend my time trying to figure out how 667 00:40:56,019 --> 00:40:59,460 I can make them again without dairy so they taste really good. So that's that's 668 00:40:59,460 --> 00:41:02,835 something I enjoy I really enjoy doing. Very cool. 669 00:41:04,015 --> 00:41:07,155 Next, complete the sentence. I think the coolest thing in technology 670 00:41:07,375 --> 00:41:10,815 today is blank. I 671 00:41:10,815 --> 00:41:14,340 thought really hard about this question because we're living in a 672 00:41:14,420 --> 00:41:18,180 crazy time of technological development.
But the thing that really 673 00:41:18,180 --> 00:41:22,020 stuck out to me and the thing that was also the moment for me 674 00:41:22,020 --> 00:41:25,540 when I started working with, like, chatbots and LLMs was code 675 00:41:25,540 --> 00:41:29,195 generation models. The first time I learned how to 676 00:41:29,195 --> 00:41:32,875 use GitHub Copilot specifically, I 677 00:41:32,875 --> 00:41:36,475 was completing some function, and it completed it before I was done typing 678 00:41:36,475 --> 00:41:40,075 it. And I was like, what the heck? This is amazing. Like, this 679 00:41:40,075 --> 00:41:43,860 actually figured out exactly what I needed. And because I was still, like, 680 00:41:43,860 --> 00:41:47,000 a budding developer, it was extremely helpful because I could learn 681 00:41:47,380 --> 00:41:51,220 faster rather than having a huge kind of store of knowledge already in my 682 00:41:51,220 --> 00:41:54,680 brain and kind of pulling from that. So I could see it benefiting my workflow. 683 00:41:54,740 --> 00:41:58,125 So I think the development of those tools and modern tools like 684 00:41:58,125 --> 00:42:01,965 Cursor, so on and so forth, is extremely cool. And I can't wait to 685 00:42:01,965 --> 00:42:05,805 see, like, what the next generation of those technologies will look like. Yeah. I 686 00:42:05,805 --> 00:42:09,420 mean, that's a great example. It's almost like you don't 687 00:42:09,420 --> 00:42:13,180 need, you know, the classic 10,000 hours to master a skill or something like 688 00:42:13,180 --> 00:42:16,940 that. It's almost like you can leverage the AI to take on the 689 00:42:16,940 --> 00:42:20,780 lion's share of the 10,000 hours. You're still gonna need to know something. You still 690 00:42:20,780 --> 00:42:23,825 have to put in some reps, but not to the degree that you used to. 691 00:42:23,825 --> 00:42:27,505 No. I think that's gonna be very transformative.
I mean, I'm 692 00:42:27,505 --> 00:42:31,105 learning JavaScript and Next.js on the side because it's something I have no 693 00:42:31,105 --> 00:42:34,805 experience in. Right. And I was able to build my personal website 694 00:42:35,025 --> 00:42:38,579 entirely through using Cursor and Progression. Nice. I 695 00:42:38,579 --> 00:42:42,420 often check that out. Which is insane. Right? Which is, like, really, really 696 00:42:42,420 --> 00:42:45,859 fascinating. And I'm not gonna claim to, like, suddenly be an expert in 697 00:42:45,859 --> 00:42:49,140 Next.js or anything like that. Right? Right. Right. Right. I still wanna learn, like, exactly 698 00:42:49,140 --> 00:42:52,075 what's going on under the hood. But having a project that you can kind of, 699 00:42:52,075 --> 00:42:55,755 like, tinker on that's, like, pretty small in scale and that you can kind of 700 00:42:55,755 --> 00:42:59,115 afford to make a few mistakes on and having, like, an expert system kind of 701 00:42:59,115 --> 00:43:02,955 help you go through that, expert, quote, unquote, being close enough, is a really cool 702 00:43:02,955 --> 00:43:06,760 learning experience. No. That's a great way to put it because, like, I 703 00:43:06,980 --> 00:43:10,580 don't have any apps on the modern devices. Right? Like, 704 00:43:10,580 --> 00:43:14,420 so, it would be nice if I 705 00:43:14,420 --> 00:43:18,040 had an Android app that could kick off some automation process that I have. 706 00:43:18,234 --> 00:43:21,855 Right? Or do some kind of tie in with, you know, Copilot 707 00:43:21,994 --> 00:43:25,675 into that or things like that. Like, where, you know, I 708 00:43:25,675 --> 00:43:29,115 wrote a content automation system. I originally wrote it in 709 00:43:29,115 --> 00:43:32,494 .NET, but I ported it to Python with the help of 710 00:43:33,039 --> 00:43:36,880 AI. And I could well, that's just it. Right?
711 00:43:36,880 --> 00:43:40,640 Really, the true valuable resource in life is 712 00:43:40,640 --> 00:43:44,480 time. Right? Yes. It's not Yes. I mean, I could have done it by hand. 713 00:43:44,480 --> 00:43:47,619 I could have done it by myself, but it was one of those things where 714 00:43:48,425 --> 00:43:52,045 am I gonna do it because it's gonna take x number of hours or whatever? 715 00:43:53,065 --> 00:43:56,585 But if I can just kinda here's the .NET version that I, you know, 716 00:43:56,585 --> 00:44:00,185 I posted. This is before there was Copilot, so I pasted it into ChatGPT. 717 00:44:00,185 --> 00:44:03,900 And it basically spit out a Python 718 00:44:03,900 --> 00:44:07,740 version, had some errors. You know, this was a while ago. But I 719 00:44:07,740 --> 00:44:11,180 was able to, inside of a day, get it done as opposed to 720 00:44:11,180 --> 00:44:14,865 before. Like, I know how my ADD works. Right? Like, I'll start it. 721 00:44:14,945 --> 00:44:18,705 First 3 days, working on it, grinding on it, and then 722 00:44:18,705 --> 00:44:22,465 I don't touch it again for 2 weeks. And it never gets built. But 723 00:44:22,465 --> 00:44:25,985 with this, I'm able to kinda harness the spark of 724 00:44:25,985 --> 00:44:29,529 inspiration and execute much faster. Now, I don't think 725 00:44:29,529 --> 00:44:33,289 people fully realize, like, you know, it's not all doom and gloom, "Nobody's 726 00:44:33,289 --> 00:44:37,130 gonna have any programming jobs." There's a lot of upside too. And I 727 00:44:37,130 --> 00:44:40,910 guess that's just where we are in the hype cycle. As you said. 728 00:44:41,210 --> 00:44:44,924 Yeah. Yeah. Yeah. Exactly. That's a good segue into I look forward to 729 00:44:44,924 --> 00:44:48,684 the day when I can use technology to blank.
I look 730 00:44:48,684 --> 00:44:52,525 forward to the day where I can use technology to get a high-quality 731 00:44:52,525 --> 00:44:56,000 education on any subject for free. So Nice. 732 00:44:56,380 --> 00:45:00,220 Free education is really important to me. A lot of 733 00:45:00,220 --> 00:45:03,980 what I learned about large language models, deep learning, all that 734 00:45:03,980 --> 00:45:07,500 stuff was online courses that I took for free on places like 735 00:45:07,500 --> 00:45:11,005 edX, Coursera, so on and so forth. Or people sharing 736 00:45:11,005 --> 00:45:14,205 articles and kind of learning from them, or YouTube videos, or all that sort of 737 00:45:14,205 --> 00:45:16,965 thing, in addition to my education. But there's a lot of things you kinda have 738 00:45:16,965 --> 00:45:20,605 to learn after that. Right? And I think that especially with, like, 739 00:45:20,605 --> 00:45:24,349 code generation models, it's, like, very easy to be, like, okay. Build me this app 740 00:45:24,349 --> 00:45:26,829 and, like, just make it work. And you can sit there for a couple hours, 741 00:45:26,829 --> 00:45:29,730 and it'll, like, work. But I think the missing piece is 742 00:45:30,190 --> 00:45:33,710 creating a structured kind of learning path that's, like, 743 00:45:33,710 --> 00:45:37,365 personalized to whoever you are for the 744 00:45:37,365 --> 00:45:41,125 thing that you're really interested in with the context of 745 00:45:41,125 --> 00:45:44,885 having, like, these tools that can help you do that thing. And I'm not sure 746 00:45:44,885 --> 00:45:48,645 if we have anybody or any offering that can 747 00:45:48,645 --> 00:45:51,590 kind of do that technologically, because you need a lot of information about what the 748 00:45:51,590 --> 00:45:54,710 user knows or doesn't know.
You need to be able to create ability, and then 749 00:45:54,710 --> 00:45:57,830 you need to be able to kind of create, like, an entire mini course that's 750 00:45:57,830 --> 00:46:01,510 personalized to whatever that person needs. But if we can do that, we can solve 751 00:46:01,510 --> 00:46:05,085 so many wonderful problems. Absolutely. I'm 752 00:46:05,085 --> 00:46:08,845 thinking about special education needs and things like that. I don't think we're that 753 00:46:08,845 --> 00:46:12,445 far off from this. No. But 754 00:46:12,605 --> 00:46:15,965 the biggest issue is going to be just hallucinations. Right? And, 755 00:46:15,965 --> 00:46:19,810 hopefully, people can build, like, RAG systems using tools like Pinecone to kind 756 00:46:19,810 --> 00:46:23,490 of reduce those hallucinations. But also, for something like 757 00:46:23,490 --> 00:46:26,930 that specific use case, we probably need, like, another breakthrough in 758 00:46:26,930 --> 00:46:30,530 indexing information or kind of presenting it, or we need a process that 759 00:46:30,530 --> 00:46:34,125 really allows people to create this information quickly 760 00:46:34,345 --> 00:46:38,025 and verifiably in order to kind of make that happen. But if that is 761 00:46:38,025 --> 00:46:41,225 a future that we can live in, where technology can kind of, like, help 762 00:46:41,225 --> 00:46:44,985 people learn, like, really important things really well, that would be 763 00:46:44,985 --> 00:46:48,125 wonderful. And I think that would be, like, amazing for humanity. 764 00:46:48,730 --> 00:46:52,490 Oh, absolutely. Share something different 765 00:46:52,490 --> 00:46:54,670 about yourself, but remember it's a family podcast. 766 00:46:57,130 --> 00:47:00,490 One of my favorite hobbies for about a decade is 767 00:47:00,490 --> 00:47:04,255 designing and folding origami. And it's really fun. 768 00:47:04,255 --> 00:47:07,935 It's very easy, but it's also very hard.
There's a lot 769 00:47:07,935 --> 00:47:11,695 of complexity inside it as well. One thing people 770 00:47:11,695 --> 00:47:14,995 don't know about that is that there's a lot of mathematical complexity. 771 00:47:15,320 --> 00:47:19,160 So once you get to a point where you wanna design a model with 772 00:47:19,160 --> 00:47:22,680 really specific qualities, really specific features, it suddenly 773 00:47:22,680 --> 00:47:26,520 becomes a paper optimization problem where you 774 00:47:26,520 --> 00:47:30,145 have, like, a fixed-size square, and you have different 775 00:47:30,145 --> 00:47:33,825 regions of that paper that you're allocating to portions of the model you're 776 00:47:33,825 --> 00:47:37,125 designing. And it turns out that there are entire mathematical 777 00:47:37,424 --> 00:47:40,944 principles and procedures to solve this problem. So much 778 00:47:40,944 --> 00:47:44,410 so that one of the leading, like, practitioners in the 779 00:47:44,410 --> 00:47:48,250 field is, like, this physicist who wrote a textbook on how to do origami design, 780 00:47:48,250 --> 00:47:51,870 and that's, like, the textbook everyone looks at to, like, learn how to solve it. 781 00:47:51,930 --> 00:47:55,470 Yeah. I'm not surprised. There's definitely a correlation 782 00:47:55,530 --> 00:47:59,185 between the mathematics of that. And I look at origami creations, and I'm 783 00:47:59,185 --> 00:48:03,025 just fascinated that could be done from a single sheet. Like, it's 784 00:48:03,025 --> 00:48:06,705 just how is that I mean, that's just mind-bending. No, it 785 00:48:06,785 --> 00:48:09,984 makes sense that it's mathematical because you have a certain type of 786 00:48:09,984 --> 00:48:13,260 constraint, and obviously 787 00:48:14,039 --> 00:48:17,799 folds factor into it and things like that. And, yeah, that's 788 00:48:17,799 --> 00:48:20,920 interesting. I should... what's the name of that book?
I should pick it up. 789 00:48:20,920 --> 00:48:24,680 It's called Origami Design Secrets. Got it. Alright. I will check 790 00:48:24,680 --> 00:48:28,394 it out. So where can people learn more about 791 00:48:28,394 --> 00:48:32,075 you and Pinecone? Of course. You wanna learn more about Pinecone? The 792 00:48:32,075 --> 00:48:35,914 best place is our website, pinecone.io. You can also find 793 00:48:35,914 --> 00:48:39,295 us on LinkedIn and on X and other social media platforms. 794 00:48:39,830 --> 00:48:42,870 You wanna learn more about me? You can go to my LinkedIn, which you can 795 00:48:42,870 --> 00:48:46,710 find at Arjun Kirti Patel, or you can go to my website, which is also 796 00:48:46,710 --> 00:48:50,070 my name, arjunkirtipatel.com. 797 00:48:50,070 --> 00:48:53,885 Cool. And we can also check out your 798 00:48:53,885 --> 00:48:57,565 Next.js skills there too. Exactly. Hopefully, nothing is 799 00:48:57,565 --> 00:49:01,405 broken, but, you can see how well I've gotten by 800 00:49:01,405 --> 00:49:05,210 with the... Awesome. Trust me. 801 00:49:05,210 --> 00:49:07,630 JavaScript alone is a frustration 802 00:49:08,890 --> 00:49:09,789 creation device. 803 00:49:12,410 --> 00:49:15,609 Audible sponsors the podcast. Do you do audiobooks? Is there a book that you 804 00:49:15,609 --> 00:49:19,115 would recommend? I do do audiobooks, but I've just 805 00:49:19,115 --> 00:49:22,955 started recently, so I don't have a huge audiobook library. But 806 00:49:22,955 --> 00:49:26,715 I am a huge fan of short story collections, and 807 00:49:26,715 --> 00:49:30,329 kind of the one that comes to mind is really anything by Ted 808 00:49:30,329 --> 00:49:33,289 Chiang, who does a lot of kind of sci-fi short stories. If you've seen 809 00:49:33,289 --> 00:49:37,130 the movie Arrival, the short story it's based on is Story of Your Life, 810 00:49:37,130 --> 00:49:40,650 and it's wonderfully written.
It's one of my favorite short stories ever. 811 00:49:40,650 --> 00:49:44,255 Yep. So highly recommend that. I believe the collection is 812 00:49:44,255 --> 00:49:47,694 called Story of Your Life and Others, something like that. So 813 00:49:47,934 --> 00:49:51,295 Oh, interesting. Careful with audiobooks. They are very 814 00:49:51,295 --> 00:49:54,850 addictive. So, 815 00:49:55,710 --> 00:49:58,590 Audible is a sponsor of the show. So if you go to 816 00:49:58,590 --> 00:50:02,130 thedatadrivenbook.com, you'll get routed to Audible and 817 00:50:02,350 --> 00:50:05,650 you'll get a free book on us. And if you 818 00:50:06,105 --> 00:50:09,305 choose to subscribe, we'll get a little bit of kickback. It helps run the show 819 00:50:09,305 --> 00:50:13,145 and helps us bring some good stuff to 820 00:50:13,145 --> 00:50:16,445 the masses. So any parting thoughts? 821 00:50:18,425 --> 00:50:21,145 No. But thank you so much for having me on, Frank. This was a ton 822 00:50:21,145 --> 00:50:24,160 of fun. I learned a lot from you, and I hope I helped you 823 00:50:24,160 --> 00:50:28,000 learn one small thing as well. Absolutely. It was 824 00:50:28,000 --> 00:50:31,600 a great conversation, and, we'll let the nice British lady finish the 825 00:50:31,600 --> 00:50:35,305 show. And that's a wrap for this episode of Data Driven, where we 826 00:50:35,305 --> 00:50:38,765 journeyed from the intricacies of vector databases to the surprising 827 00:50:38,905 --> 00:50:42,665 elegance of origami. A huge thank you to Arjun Patel for 828 00:50:42,665 --> 00:50:46,505 sharing his insights on retrieval augmented generation and his passion 829 00:50:46,505 --> 00:50:50,330 for making AI accessible to all. From turning raw data 830 00:50:50,330 --> 00:50:54,010 into actionable knowledge to turning paper into art, Arjun 831 00:50:54,010 --> 00:50:57,850 proves there's beauty in both precision and creativity.
If today's 832 00:50:57,850 --> 00:51:01,610 episode left you curious, inspired, or just itching to fold a 833 00:51:01,610 --> 00:51:04,994 piece of paper into something meaningful, be sure to check out 834 00:51:04,994 --> 00:51:08,535 Arjun's work and Pinecone's innovative tools. Remember, 835 00:51:08,755 --> 00:51:12,515 knowledge might be power, but sharing it makes you a force to be reckoned 836 00:51:12,515 --> 00:51:16,275 with. As always, I'm Bailey, your semi-sentient guide to 837 00:51:16,275 --> 00:51:19,660 all things data, reminding you that while AI might shape our 838 00:51:19,660 --> 00:51:23,340 future, it's the human touch or sometimes the paper fold that 839 00:51:23,340 --> 00:51:26,720 gives it meaning. Until next time, stay curious, 840 00:51:27,020 --> 00:51:30,160 stay analytical, and don't forget to back up your data. 841 00:51:30,540 --> 00:51:31,040 Cheerio.