Speaker: 00:00:00

Welcome to Impact Quantum, the show that peeks under the hood of

Speaker: 00:00:03

quantum computing to reveal what's emerging and why it

Speaker: 00:00:07

matters. Today's episode is an absolute masterclass

Speaker: 00:00:11

in quantum inspired data science, with a guest who

Speaker: 00:00:15

quite frankly makes the rest of us feel like we're still stuck figuring

Speaker: 00:00:18

out long division. Joining Frank and Candice is Dr.

Speaker: 00:00:23

Marvin Weinstein, emeritus at Stanford University,

Speaker: 00:00:27

bona fide particle physicist and co creator of

Speaker: 00:00:30

dynamic quantum clustering, a method that

Speaker: 00:00:34

sounds like science fiction, but delivers real world,

Speaker: 00:00:38

potentially life saving insight. Marvin takes us on

Speaker: 00:00:41

a thrilling journey through brain cancer research, data

Speaker: 00:00:44

agnosticism, and how a physicist wandered into

Speaker: 00:00:48

biology and found patterns that even seasoned

Speaker: 00:00:51

researchers had missed. This isn't quantum computing per

Speaker: 00:00:55

se, it's quantum mechanics inspired analysis applied with

Speaker: 00:00:59

surgical precision minus the surgical gloves.

Speaker: 00:01:03

Whether you're a curious technologist or just here for the

Speaker: 00:01:06

intellectual thrill ride, this one is for you. And

Speaker: 00:01:10

no, you don't need a PhD to follow along. Just

Speaker: 00:01:13

curiosity and perhaps a cup of strong tea. This episode

Speaker: 00:01:17

is rated 5 Schrodingers. So buckle up and let's get

Speaker: 00:01:21

into it.

Speaker: 00:01:33

Hello and welcome back to Impact Quantum, the podcast where we explore the

Speaker: 00:01:37

emergent fields of quantum computing and

Speaker: 00:01:41

the upcoming ecosystem that is going to spread around it.

Speaker: 00:01:44

So you don't need to be a quantum physicist, but you do need to be

Speaker: 00:01:47

curious and curious about quantum computing. And with me today,

Speaker: 00:01:51

as always, is the most quantum curious person I know, Candace Kahuli.

Speaker: 00:01:55

How's it going, Candace? It's great. Thank you so much for asking. I'm really

Speaker: 00:01:59

excited about today. Yeah. So I think today you actually have

Speaker: 00:02:02

an honest to goodness physicist here on

Speaker: 00:02:07

as a guest. Absolutely. Amongst many things that

Speaker: 00:02:10

he's, that he's doing, I can say that he is a particle physicist at Stanford

Speaker: 00:02:14

University as well as,

Speaker: 00:02:18

as well as the CSO co founder at

Speaker: 00:02:22

Quantum Insights Incorporated. He's got a lot, a

Speaker: 00:02:25

lot of experience and a lot of great knowledge to share

Speaker: 00:02:29

to our audience. I think everyone's going to find him as fascinating as I do.

Speaker: 00:02:33

Cool, I hope.

Speaker: 00:02:37

But I am a genuine quantum mechanic. That's right. There you

Speaker: 00:02:40

go. So please welcome everybody, Marvin

Speaker: 00:02:44

Weinstein to the show. How's it going? It's

Speaker: 00:02:48

going well. As I was telling Candice, you got me in a very

Speaker: 00:02:52

excited state today, so I hope I'm coherent.

Speaker: 00:02:55

Awesome. Yeah. In the virtual green room, you had said you kind of

Speaker: 00:02:58

uncovered something very interesting. So we can start there if you

Speaker: 00:03:02

like. Well, yeah, I mean,

Speaker: 00:03:07

basically the thing we were talking about

Speaker: 00:03:10

during the previous interview was a tool that was developed, that was

Speaker: 00:03:14

what the company was founded for, to apply to

Speaker: 00:03:18

various problems. And it's called dynamic quantum clustering.

Speaker: 00:03:22

And it differs from other clustering

Speaker: 00:03:26

algorithms, other data mining tools, in that it

Speaker: 00:03:30

is completely unbiased. You can take a first look at data with

Speaker: 00:03:34

making no assumptions about if there is anything to be found in the data,

Speaker: 00:03:38

cleaning the data or

Speaker: 00:03:42

labeling it in any manner, shape or form. You just look at the raw

Speaker: 00:03:46

data. So

Speaker: 00:03:50

for personal history reasons, I mean, Candace was telling me about somebody

Speaker: 00:03:53

she knew whose partner had

Speaker: 00:03:57

died or father had died of a brain tumor. But

Speaker: 00:04:01

my first wife also died of glioblastoma.

Speaker: 00:04:06

So when Quantum Insights decided

Speaker: 00:04:10

to close its doors, I was sitting with all of this data from the Cancer

Speaker: 00:04:13

Genome Atlas, including all of its glioma

Speaker: 00:04:17

data. That means all low grade gliomas and

Speaker: 00:04:20

glioblastoma data. So I had RNA sequencing data

Speaker: 00:04:24

for all of those tumors. And basically

Speaker: 00:04:30

I decided first thing I want to do with it is take a look

Speaker: 00:04:33

at it and see if there's anything to see in that

Speaker: 00:04:37

data that I mean people have looked, this data set's old,

Speaker: 00:04:41

so it's been around for a long time, has been heavily studied.

Speaker: 00:04:45

People were totally sure that everything that there

Speaker: 00:04:49

was to be extracted from that data set had been extracted from

Speaker: 00:04:52

that data set. And so basically

Speaker: 00:04:58

I said, well, nobody's looked with our tool.

Speaker: 00:05:01

And so what did I do? Well, the first thing was, as

Speaker: 00:05:05

I promised you, I simply loaded up the data. I did

Speaker: 00:05:09

restrict the gene expression from the 60,000 genes

Speaker: 00:05:13

that it comes with down to what everybody believes is

Speaker: 00:05:17

the 20,000 so called protein coding genes.

Speaker: 00:05:21

Not all of them code for proteins, but they're the list

Speaker: 00:05:25

that various tools

Speaker: 00:05:28

restrict to. So I wanted to stay within what other people were doing.

Speaker: 00:05:32

So I looked at those 20,000 genes. Well, that's a lot of data. I mean

Speaker: 00:05:36

that's a lot of noise. It's actually not a lot of data. It's only 600.

Speaker: 00:05:40

I mean that's always a misconception. People say biologists have huge data

Speaker: 00:05:43

sets. They really don't. I mean, for example, all of the

Speaker: 00:05:47

cancer data is 692

Speaker: 00:05:50

tumors, brain cancers.

Speaker: 00:05:54

That's not a big data set. There's only 692 pieces of

Speaker: 00:05:57

information. I don't care that there's 20,000 genes

Speaker: 00:06:02

because all you're seeing is the effect of 692

Speaker: 00:06:06

combinations of the expression levels for those genes. The whole

Speaker: 00:06:09

data set can be reproduced from those 692 pieces

Speaker: 00:06:13

of information. So

Speaker: 00:06:17

not like a physics data set which has Millions of

Speaker: 00:06:20

samples and stuff to look at. This is a biology data

Speaker: 00:06:24

set and typically restricted to a specific disease.

Speaker: 00:06:27

It's not huge. What it is, is

Speaker: 00:06:31

complicated and really hard to see what's going

Speaker: 00:06:34

on because there's so much noise.

Speaker: 00:06:38

So first thing I did, as I said, was restricted

Speaker: 00:06:42

the raw data. It's a matrix after all, rows

Speaker: 00:06:45

and columns, okay? Every row is the expression level

Speaker: 00:06:50

for 20,000 genes. I'm rounding the numbers off. You don't

Speaker: 00:06:53

want the 20,312 all the time.

Speaker: 00:06:59

So it's that. And there are

Speaker: 00:07:02

692 rows in all. You feed that into DQC, it's

Speaker: 00:07:05

made to ingest that quickly and you do, you just

Speaker: 00:07:09

simply run the first analysis and surprise. The first thing you see

Speaker: 00:07:13

is there's a whopping big signal.

Speaker: 00:07:16

In fact that data, raw, unprocessed,

Speaker: 00:07:20

unlabeled, untreated in any way, no

Speaker: 00:07:24

training set, separates into two clusters. One

Speaker: 00:07:27

very large cluster which is mostly the lower grade

Speaker: 00:07:31

gliomas, and another cluster which is

Speaker: 00:07:35

almost all of the

Speaker: 00:07:37

glioblastomas. Well, that's pretty

Speaker: 00:07:41

cool. There's already a signal. It's not the best classification

Speaker: 00:07:45

in the world. Maybe it's very good, it's competitive.

Speaker: 00:07:49

But DQC has a standard trick which is

Speaker: 00:07:53

you can pick out a smaller number of

Speaker: 00:07:57

genes to look at, in this case a smaller number of features in the

Speaker: 00:08:01

fancy language which

Speaker: 00:08:05

give the same information. And so first run at that

Speaker: 00:08:08

produced 544 genes and exactly

Speaker: 00:08:12

the same picture. So I didn't have to look at 20,000, I

Speaker: 00:08:16

had to look at 544 which were doing most of the heavy

Speaker: 00:08:19

lifting, produce the same two clusters,

Speaker: 00:08:23

same, not wonderful, but pretty good classification

Speaker: 00:08:27

scheme. Then there's another DQC based trick

Speaker: 00:08:30

which is using the information in the two clusters, now

Speaker: 00:08:35

I can order the genes that I'm looking at, the

Speaker: 00:08:38

544, in order of their

Speaker: 00:08:41

importance to the signal.

Speaker: 00:08:45

Then I look at the first 10 genes, the first 20 genes, the first

Speaker: 00:08:49

30 genes, and I did those analyses over and over. Each time I did

Speaker: 00:08:53

it starting from 10, I got a pretty good

Speaker: 00:08:56

classification. 20 made it better, 30 made it better

Speaker: 00:09:00

until I got up to 90 genes and then at

Speaker: 00:09:03

100, 110, 120, everything

Speaker: 00:09:07

stopped getting better and started to get worse. Interesting. So the

Speaker: 00:09:10

cutoff was, I wanted to look at the 90 gene signal

Speaker: 00:09:14

because the cleanest information was going to be in the 90 gene

Speaker: 00:09:18

signal. Did that and sure enough I find

Speaker: 00:09:21

four clusters. So what are the four clusters? Three of

Speaker: 00:09:25

those clusters are all low grade gliomas.

Speaker: 00:09:30

100% low grade gliomas. They

Speaker: 00:09:33

capture all of the low grade gliomas

Speaker: 00:09:37

except for four tumors. The

Speaker: 00:09:41

fourth cluster is all of the glioblastomas and

Speaker: 00:09:44

those four that were not captured. Now remember,

Speaker: 00:09:48

there were 692 tumors, I was missing four.

Speaker: 00:09:52

So when you look at them plotted in the space,

Speaker: 00:09:57

let's call it PCA space. You like that word? And it is the

Speaker: 00:10:00

PCA space for the tumor expressions. Those four

Speaker: 00:10:04

lie right next to the gliomas, whereas all the other data lie far away from

Speaker: 00:10:08

the gliomas. Just for folks that may not know

Speaker: 00:10:12

what PCA is, is it Principal Component

Speaker: 00:10:15

Analysis? Principal component, yeah. PCA

Speaker: 00:10:18

is a way of rotating the data so that

Speaker: 00:10:23

the dimension of the data in which

Speaker: 00:10:26

the data is most spread out is the first dimension. The dimension

Speaker: 00:10:30

in which the data is next most spread out is the second.

Speaker: 00:10:34

It tends, if you're lucky in low dimensions

Speaker: 00:10:37

to show you what you need to see in order to

Speaker: 00:10:41

try to do clustering. Because most clustering algorithms

Speaker: 00:10:45

deteriorate rapidly as the dimension of the data goes

Speaker: 00:10:49

up. So they like to do a hard dimensional reduction, they

Speaker: 00:10:52

call it to two or three PCA directions

Speaker: 00:10:56

and then try to cluster based on what they see there.
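
The rotation described here can be sketched in two dimensions using only the Python standard library. The function name `first_principal_axis` and the toy points are assumptions made for illustration; real expression data has thousands of dimensions and would normally be handled with a linear algebra library.

```python
import math


def first_principal_axis(points):
    """Return (unit direction, variance along it) for a list of 2-D points."""
    n = len(points)
    mean_x = sum(p[0] for p in points) / n
    mean_y = sum(p[1] for p in points) / n
    # Entries of the 2x2 covariance matrix.
    sxx = sum((p[0] - mean_x) ** 2 for p in points) / n
    syy = sum((p[1] - mean_y) ** 2 for p in points) / n
    sxy = sum((p[0] - mean_x) * (p[1] - mean_y) for p in points) / n
    # Closed-form angle of the largest-variance axis of a symmetric 2x2 matrix.
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    c, s = math.cos(theta), math.sin(theta)
    variance = sxx * c * c + 2.0 * sxy * c * s + syy * s * s
    return (c, s), variance


# Toy points scattered closely around the line y = x (illustrative values).
toy_points = [(0.0, 0.1), (1.0, 0.9), (2.0, 2.1), (3.0, 2.9)]
direction, variance = first_principal_axis(toy_points)
print("first PCA direction:", direction)
print("variance along it:  ", variance)
```

The first direction comes out close to the diagonal, which is where these points are most spread out; the second PCA direction is simply the perpendicular one.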

Speaker: 00:11:00

There are algorithms which work in higher dimension, but,

Speaker: 00:11:04

but still there are things they struggle with.

Speaker: 00:11:09

DQC doesn't really care. It doesn't start with a hard dimensional

Speaker: 00:11:13

reduction. It simply works

Speaker: 00:11:17

with, with what is showing the most information. If it's 6,

Speaker: 00:11:21

if it's 10, if it's 16, if it's 50, that's fine, I

Speaker: 00:11:25

don't care. I'll, I'll work in that. The only impact

Speaker: 00:11:28

and price I pay is the time it takes to run the algorithm.

Speaker: 00:11:32

But, and so the usual trick is you work in the lowest number

Speaker: 00:11:36

of dimensions that appear to be noise free,

Speaker: 00:11:40

which you can tell by looking at the spectrum that you see in PCA,

Speaker: 00:11:45

and then work your way up to twice that number of

Speaker: 00:11:48

dimensions and look again and if you see the same information,

Speaker: 00:11:52

well, it's quicker to run everything in the lower dimension, but

Speaker: 00:11:56

you don't stop if you see a change. So

Speaker: 00:12:00

at any rate, granted, I got these four clusters.

Speaker: 00:12:04

Now you notice the only misclassification out of

Speaker: 00:12:08

692 tumors is four tumors.

Speaker: 00:12:13

So considerably less than 1%

Speaker: 00:12:16

failure. Of doing close to like

Speaker: 00:12:20

1/7 of 1%, would you say? Yeah, yeah, yeah.

Speaker: 00:12:24

So I mean that's, that's, I'm trying. To quote the worst possible

Speaker: 00:12:27

statistic that I can imagine, but less than 1%. We can all

Speaker: 00:12:31

agree on. So that would put it at over 99%. If I tell you

Speaker: 00:12:34

from this analysis you have a low grade glioma, I'm

Speaker: 00:12:38

100% accurate. Right. If I tell you you have a

Speaker: 00:12:42

glioblastoma, I might be as much as

Speaker: 00:12:45

2 1/2% inaccurate on just glioma question

Speaker: 00:12:52

that's better than world class. Let's say what's current state

Speaker: 00:12:56

of the art is closer to like 80/20. I've been around trying to

Speaker: 00:13:00

say how do we compete? And I haven't succeeded yet. My

Speaker: 00:13:03

collaborators, one bioinformaticist at

Speaker: 00:13:07

Wisconsin and a cancer doc at Stanford,

Speaker: 00:13:12

are going to have to help me with that. In searching the literature, what

Speaker: 00:13:16

I find are statements like most schemes for

Speaker: 00:13:20

doing this, unsupervised from

Speaker: 00:13:23

the raw data and then moving on from an internal

Speaker: 00:13:27

analysis. Still working, starting with the raw data,

Speaker: 00:13:31

what they call the area under the curve. So the likelihood you're right

Speaker: 00:13:35

is 70 to 80%.

Speaker: 00:13:38

Interesting. So we're not talking anything like the same. There

Speaker: 00:13:42

are some special biomarkers. If they're found

Speaker: 00:13:46

on a glioblastoma, then people are pretty sure

Speaker: 00:13:50

it's a glioblastoma at maybe the 1% level.

Speaker: 00:13:54

Okay, but separating

Speaker: 00:13:57

glioblastoma from low grade gliomas, blind,

Speaker: 00:14:02

they're nowhere near that good.
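
As a rough check on the error rate quoted in this exchange, the sketch below recomputes the overall misassignment rate from the two figures given in the conversation: 692 tumors in total, four of them low grade gliomas that landed in the glioblastoma cluster. The variable names are illustrative only.

```python
total_tumors = 692      # tumors in the data set, as stated in the conversation
misplaced_tumors = 4    # low grade gliomas that fell into the glioblastoma cluster

error_rate = misplaced_tumors / total_tumors
accuracy = 1.0 - error_rate

print(f"error rate: {error_rate:.2%}")
print(f"accuracy:   {accuracy:.2%}")
```

The rate works out to a little under 0.6%, roughly four sevenths of one percent, which is consistent with the "less than 1%" figure.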

Speaker: 00:14:07

So at any rate,

Speaker: 00:14:10

that's what I found. I now have the world's best classifier.

Speaker: 00:14:14

In 90 genes, I plot the gene expression levels

Speaker: 00:14:18

for each one of those clusters. And for most of the genes,

Speaker: 00:14:22

I see the genes either fall into the category, the

Speaker: 00:14:25

expectation for the expression of that gene

Speaker: 00:14:29

either goes systematically up through

Speaker: 00:14:33

the four clusters, going from the lowest grade glioma to the

Speaker: 00:14:37

glioblastoma, or systematically goes down. That's

Speaker: 00:14:41

what you want to see. Those genes are involved in what's happening,

Speaker: 00:14:46

but there's still 544 genes.

Speaker: 00:14:49

And I can't see the forest for the trees.

Speaker: 00:14:53

Interesting. So does this inform treatment options?

Speaker: 00:14:57

Well, that's the problem. Treatment options, or at

Speaker: 00:15:01

least my understanding. So remember, I have to be

Speaker: 00:15:04

very upfront. I'm not a biologist. Right. Everything

Speaker: 00:15:08

you will hear me talk about, I learned by looking at this data. I have

Speaker: 00:15:12

no formal training in biology whatsoever,

Speaker: 00:15:16

so you're dealing with a novice. It reminds me of the

Speaker: 00:15:19

original Star Trek show where Bones would always be like, I'm a doctor, not an

Speaker: 00:15:23

engineer. Like you're like, I'm a physicist, not an engineer. I mean a doctor, you

Speaker: 00:15:26

know. So what, what initially inspired you to take

Speaker: 00:15:30

all of your quantum mechanics knowledge and

Speaker: 00:15:34

apply it to biological data. Yeah, so remember I'm the

Speaker: 00:15:38

co inventor of this algorithm. The other inventor is David

Speaker: 00:15:41

Horn, Tel Aviv University, a frequent visitor to

Speaker: 00:15:45

SLAC. He collaborated on many physics

Speaker: 00:15:49

papers. And SLAC is not the

Speaker: 00:15:52

messaging app. It's Stanford Linear Accelerator. Is that right?

Speaker: 00:15:56

No case Stanford. It used to be called the Stanford Linear Accelerator

Speaker: 00:15:59

Center. So I'll tell you out of school the story which

Speaker: 00:16:03

reflects wonderfully on the DOE. At some point the

Speaker: 00:16:06

DOE wanted to put its name on everything

Speaker: 00:16:10

and trademark it.

Speaker: 00:16:14

Well, SLAC said, because Stanford said, you can't trademark

Speaker: 00:16:18

the Stanford Linear Accelerator Center. It's us, right?

Speaker: 00:16:22

We run the place. So DOE made

Speaker: 00:16:26

SLAC change its name to SLAC S L A C

Speaker: 00:16:30

and call it the SLAC National Accelerator Laboratory. So I guess

Speaker: 00:16:34

as an abbreviation we're now SNAL, not SLAC.

Speaker: 00:16:39

SLAC sounds better. I grew up with it as SLAC for

Speaker: 00:16:42

42 years. To hell with the DOE. I don't intend to

Speaker: 00:16:46

listen to what they want. But it is now officially the SLAC

Speaker: 00:16:50

National Accelerator Laboratory. Right. So at any rate,

Speaker: 00:16:53

David Horn came into my office and life went as normal. He said, oh, I

Speaker: 00:16:57

have something interesting to show you. Because he kind of had left high energy

Speaker: 00:17:01

physics about eight years earlier and was looking

Speaker: 00:17:04

into data mining. And he said there's this cool idea

Speaker: 00:17:08

that grows out of

Speaker: 00:17:12

something done by somebody called Emanuel Parzen, called the Parzen

Speaker: 00:17:15

estimator. And I figured out I should think about it as a

Speaker: 00:17:19

quantum potential. I already was very

Speaker: 00:17:22

suspicious. It sounded like a very strange idea.

Speaker: 00:17:26

And so we did our usual thing. We stood at the blackboard and

Speaker: 00:17:30

yelled at one another for three or four hours. And

Speaker: 00:17:34

then we came to a meeting of the minds and said, this really isn't a

Speaker: 00:17:36

stupid idea. It's kind of cute. And

Speaker: 00:17:41

you know, David said, well, he showed me some simple problems

Speaker: 00:17:45

having to do with classifying crabs. It's the standard

Speaker: 00:17:49

old problem that people did

Speaker: 00:17:53

and seemed to be very interesting.

Speaker: 00:17:57

He said, but the problem is in order to understand who's

Speaker: 00:18:00

so the what, the idea behind it is very simple. You take

Speaker: 00:18:04

all the data, you create a function. The properties of this

Speaker: 00:18:08

function are wherever there's more data

Speaker: 00:18:11

than there is in the surroundings, there should be a peak. And

Speaker: 00:18:15

wherever there's less data, there should be a valley. The problem is,

Speaker: 00:18:19

of course, the way you create that function is very sensitive to a parameter that

Speaker: 00:18:23

you introduce. Okay, I don't want to get too

Speaker: 00:18:27

messy in this. It's all published so it can be

Speaker: 00:18:30

read and the sensitivity is hard to

Speaker: 00:18:34

deal with. So what we finally

Speaker: 00:18:38

understood was if we treated this. So this

Speaker: 00:18:41

is just a professional deformation.

Speaker: 00:18:45

Because we're particle physicists and quantum mechanics,

Speaker: 00:18:49

we think of everything as having something to do with particle physics.

Speaker: 00:18:53

Quantum mechanics. This problem has nothing to do with particle

Speaker: 00:18:56

physics or quantum mechanics, except we want to get rid of the sensitivity of

Speaker: 00:19:00

that function. And we said, if you think of this as the solution to

Speaker: 00:19:04

a problem in quantum mechanics, that problem has a

Speaker: 00:19:08

term having to do with the particles moving around and another

Speaker: 00:19:11

one having to do with the landscape it finds itself in.

Speaker: 00:19:15

That's called a potential function. It turns out

Speaker: 00:19:20

that potential function always has sharper

Speaker: 00:19:24

features, more pronounced dips

Speaker: 00:19:27

than the solution has peaks,

Speaker: 00:19:31

and turns out to be much less sensitive to the parameter that goes into

Speaker: 00:19:35

building that function. So, literally, by

Speaker: 00:19:38

saying, what problem is this picture

Speaker: 00:19:42

the solution to, which turns out to be

Speaker: 00:19:45

trivial to solve, what potential function,

Speaker: 00:19:49

you get a sharper picture. And the sensitivity to the parameter used

Speaker: 00:19:53

to build what's called the kernel function, for that potential function,

Speaker: 00:19:58

goes down by a factor of 10. So you have a pretty

Speaker: 00:20:01

unique answer. It's easy to arrive at. You don't have to be careful about

Speaker: 00:20:05

picking your parameters. The problem is if

Speaker: 00:20:09

you think of this as things living. So we have these

Speaker: 00:20:13

valleys now where the bulk of the heavy

Speaker: 00:20:17

concentration of data is, or we have stream

Speaker: 00:20:20

beds, but the data is up along the

Speaker: 00:20:24

walls as well as being down in the

Speaker: 00:20:27

valley. So the question is, which data belongs to which

Speaker: 00:20:31

valley, which stream bed, et cetera.

Speaker: 00:20:35

And so you want to move the points down the sides of the valley

Speaker: 00:20:39

and have them collect in whatever structure is at the bottom.
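
The construction described above can be sketched in one dimension. This is a minimal illustration, assuming a Gaussian kernel of width `sigma` for the Parzen estimator and the standard relation obtained by asking which potential V makes the estimator psi a solution of the Schrodinger equation, V = E + (sigma^2 / 2) * psi'' / psi. The function names, the toy data and the choice to drop the constant E are assumptions for illustration, not details taken from the conversation.

```python
import math


def parzen_psi(x, data, sigma):
    """Parzen estimator: a sum of Gaussians of width sigma centered on the data."""
    return sum(math.exp(-((x - xi) ** 2) / (2.0 * sigma ** 2)) for xi in data)


def quantum_potential(x, data, sigma):
    """Potential for which parzen_psi is a solution, up to the constant E (1-D).

    Evaluate only near the data: far away every Gaussian underflows to zero.
    """
    weights = [math.exp(-((x - xi) ** 2) / (2.0 * sigma ** 2)) for xi in data]
    psi = sum(weights)
    weighted_sq = sum(w * (x - xi) ** 2 for w, xi in zip(weights, data))
    # (sigma^2 / 2) * psi'' / psi in one dimension.
    return weighted_sq / (2.0 * sigma ** 2 * psi) - 0.5


# Two toy clumps of data (illustrative values only).
toy_data = [-2.1, -2.0, -1.9, 1.9, 2.0, 2.1]
sigma = 0.5

for x in (-2.0, 0.0, 2.0):
    print(x, parzen_psi(x, toy_data, sigma), quantum_potential(x, toy_data, sigma))
```

At the clump centers the estimator has peaks and the potential has valleys, while between the clumps the estimator is small and the potential is high, which matches the picture of data sitting in valleys and along their walls.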

Speaker: 00:20:44

Well, people try that. In fact, my

Speaker: 00:20:47

colleague had been trying it. And as the dimension goes up,

Speaker: 00:20:51

for reasons we understand, that

Speaker: 00:20:54

surface becomes rippled just due to noise.

Speaker: 00:20:59

And so basically, if you try to just move things

Speaker: 00:21:02

down using ordinary calculus, what's called gradient descent,

Speaker: 00:21:06

you're just moving points in the direction of the slope. They get

Speaker: 00:21:09

stuck in the ripples. Oh, I see. Because

Speaker: 00:21:13

it can't find the global minimum. It can't find the important

Speaker: 00:21:17

minimum. Right. If things move according to quantum

Speaker: 00:21:20

mechanics, all bets are off. It's a much nicer

Speaker: 00:21:24

story. And it's the uncertainty principle, which made the

Speaker: 00:21:27

solution wider to begin with. So we're going to

Speaker: 00:21:31

exploit the uncertainty principle. If I move points

Speaker: 00:21:34

according to the laws of quantum mechanics. The first thing is, unlike

Speaker: 00:21:38

gradient descent, the quantum wave function extends out to

Speaker: 00:21:42

where the valley starts going up again.

Speaker: 00:21:47

So points automatically start to slow down as they

Speaker: 00:21:50

reach the minimum. And they don't overshoot and

Speaker: 00:21:54

rattle around, they just stop because they see now

Speaker: 00:21:57

equal influence from both walls and therefore no

Speaker: 00:22:02

force. Also they don't see ripples because

Speaker: 00:22:05

the uncertainty principle allows for quantum tunneling

Speaker: 00:22:09

and they simply go through those tiny ripples or ride above them.
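
The ripple problem described above is easy to reproduce on a toy surface. The sketch below runs plain gradient descent on a one-dimensional bowl with small ripples added, and then repeats the run on a Gaussian-blurred version of the same surface. The blur is only a crude stand-in for a point that is spread out over a region and so does not feel tiny ripples; it is not DQC's quantum evolution, and every name and constant here is an assumption chosen for illustration.

```python
import math

RIPPLE_AMP = 0.05    # height of the ripples (illustrative)
RIPPLE_FREQ = 40.0   # how tightly the ripples are packed (illustrative)


def slope(x, blur=0.0):
    """Derivative of x**2 + RIPPLE_AMP * sin(RIPPLE_FREQ * x).

    With blur > 0 the surface is first averaged with a Gaussian of that width,
    which damps the ripple term and leaves the slope of the bowl unchanged.
    """
    damping = math.exp(-0.5 * (RIPPLE_FREQ * blur) ** 2)
    return 2.0 * x + RIPPLE_AMP * RIPPLE_FREQ * damping * math.cos(RIPPLE_FREQ * x)


def descend(x, blur=0.0, rate=0.001, steps=20000):
    """Plain gradient descent: repeatedly step against the local slope."""
    for _ in range(steps):
        x -= rate * slope(x, blur)
    return x


stuck = descend(2.0)              # raw rippled surface
smooth = descend(2.0, blur=0.2)   # blurred surface

print("gradient descent on the rippled surface stops at", stuck)
print("descent on the blurred surface stops at", smooth)
```

On the raw surface the point halts in a ripple partway down the wall, well short of the bottom of the bowl at zero; on the blurred surface it reaches the bottom.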

Speaker: 00:22:13

So as a way of making the data move

Speaker: 00:22:16

and find the minima in the function in any number of

Speaker: 00:22:20

dimensions and as a way of speeding up the

Speaker: 00:22:23

analysis, because quantum evolution is done by matrix

Speaker: 00:22:27

multiplication, so it's enormously

Speaker: 00:22:29

parallelizable. Didn't say that very well.

Speaker: 00:22:33

Parallelizable, you get a very quick algorithm

Speaker: 00:22:37

that is using physics principles. But to solve a non physics

Speaker: 00:22:41

problem, just getting the points efficiently down to the bottom.

Speaker: 00:22:45

If there is a riverbed that tells

Speaker: 00:22:48

you something about the data, says there's some one parameter

Speaker: 00:22:52

thing, some regression on the data that you can do to

Speaker: 00:22:56

something that's very extended. It's a huge discovery.

Speaker: 00:23:00

It's much better than finding simple clusters.

Speaker: 00:23:03

But that's what this does. So DQC has

Speaker: 00:23:08

advantages. One, it doesn't require training sets.

Speaker: 00:23:12

So it's great for biology data because having annotated

Speaker: 00:23:16

training sets that are really good are hard to come by.

Speaker: 00:23:20

So this is interesting and what does DQC stand for?

Speaker: 00:23:24

Dynamic quantum clustering. Meaning we're using quantum mechanics

Speaker: 00:23:28

to find the minimum. Now do you need a quantum computer to do this

Speaker: 00:23:31

or is this just an algorithm? Interesting. I told you I'm here under

Speaker: 00:23:35

false pretenses. You asked me here to talk about quantum

Speaker: 00:23:39

computing. And I told you I don't do quantum computing. I'm talking about using

Speaker: 00:23:42

quantum mechanics to run on an ordinary

Speaker: 00:23:46

computer. Could it run on a quantum computer? Yes, if

Speaker: 00:23:49

they were really as fast and as good as they say they're going to be,

Speaker: 00:23:53

would even be better because it can handle bigger. I'm

Speaker: 00:23:56

focusing on biology, by the way. This algorithm is data

Speaker: 00:24:00

agnostic, right? It's not talking about

Speaker: 00:24:04

biology per se, it doesn't care. It just says that

Speaker: 00:24:08

there's something interesting. Data is not distributed with

Speaker: 00:24:12

equal density every place. Things that are more like one

Speaker: 00:24:15

another tend to be located in a more dense region.

Speaker: 00:24:19

Okay, so and this has been applied to many things. It's been

Speaker: 00:24:23

applied to

Speaker: 00:24:27

finding radioactive sources in the city of Chicago hidden

Speaker: 00:24:30

in a building. Okay, there's a paper that I wrote on

Speaker: 00:24:34

that. It's been applied to, I guess

Speaker: 00:24:38

there's no paper on this, but it was a problem I did for somebody

Speaker: 00:24:42

finding

Speaker: 00:24:46

tanks in the desert that have been camouflaged, painted,

Speaker: 00:24:50

same thing using the data from a

Speaker: 00:24:53

multispectral hyperspectral camera.

Speaker: 00:24:57

So it doesn't care what the data is. It's data agnostic

Speaker: 00:25:01

it's feature agnostic. It is

Speaker: 00:25:04

unsupervised completely. That doesn't mean that you don't use the results

Speaker: 00:25:08

of a previous analysis to now supervise the next analysis

Speaker: 00:25:12

based on what you learned. You do do that.

Speaker: 00:25:17

But at any rate, that was it. So what's now

Speaker: 00:25:20

going on is, as I said, we have the world's best

Speaker: 00:25:24

classifier. But I don't know how to tell you what the

Speaker: 00:25:28

best drug for your tumor, the one that's most likely

Speaker: 00:25:32

to work on, the biology that's happening now, should

Speaker: 00:25:35

be. And that's why I need to go find a biologist and they're

Speaker: 00:25:39

not so great at doing it either. So witness how

Speaker: 00:25:43

many people go through many, many failed drugs. Yeah,

Speaker: 00:25:47

well, precision medicine is definitely, you know, one

Speaker: 00:25:50

of, one of the, you know, one of the biggest outcomes of using

Speaker: 00:25:54

this type of, this type of clustering that

Speaker: 00:25:58

we can, we can create. I mean there's so many, there's, there's just, there's

Speaker: 00:26:02

so much out there that needs this type of, you know,

Speaker: 00:26:05

this type of. Still in its infancy. It's got a place to go to

Speaker: 00:26:09

be precision medicine. Where do you. Oh, go ahead.

Speaker: 00:26:13

Oh, please don't let me. What you have to say. I was going to say

Speaker: 00:26:16

based on dynamic clustering, quantum clustering,

Speaker: 00:26:20

you know, where do you see it evolving in the.

Speaker: 00:26:24

So I'll finish telling you this story about why I'm excited because

Speaker: 00:26:28

I think it's evolving to a really. I, I've

Speaker: 00:26:32

seen something today I never thought I would

Speaker: 00:26:35

see. So last night it showed up at 10 o'clock

Speaker: 00:26:39

in the evening and I'm still digesting what I saw.

Speaker: 00:26:45

What I show you, you should take with a grain of salt. But there's no

Speaker: 00:26:48

question, there's zero chance that I'm wrong in terms

Speaker: 00:26:52

of what you'll see. Okay, so the way

Speaker: 00:26:56

docs like to look at the problem or cancer

Speaker: 00:27:00

researchers is they talk about so called

Speaker: 00:27:03

biological pathways. Biological

Speaker: 00:27:06

pathways are sets of genes

Speaker: 00:27:11

which carry out some process. In the end, all processes

Speaker: 00:27:15

are making proteins, but we're not looking at the proteins being

Speaker: 00:27:19

made, but we know these sets of genes are

Speaker: 00:27:22

functioning together to produce an interesting

Speaker: 00:27:26

output. So if I can

Speaker: 00:27:30

take the information I have and find

Speaker: 00:27:34

a way of saying, oh, so in fact what I'm

Speaker: 00:27:37

seeing is actually predicted by the following

Speaker: 00:27:41

set of genes. And I can assign

Speaker: 00:27:44

meaningful coordinates to each tumor

Speaker: 00:27:48

based on where they are and what that set of

Speaker: 00:27:52

genes is doing together. I mean, biospace,

Speaker: 00:27:56

a point in that space depending on how many

Speaker: 00:28:00

things still I'm producing. One axis in biospace

Speaker: 00:28:04

and it's representing a process which is

Speaker: 00:28:08

happening in the patient where a bunch of genes are

Speaker: 00:28:11

telling me something, not one. And

Speaker: 00:28:15

that bunch of genes I can look at and ask what are their

Speaker: 00:28:19

properties? What are their common properties?

Speaker: 00:28:23

So I will share something with

Speaker: 00:28:27

you. So at any rate, did that.

Speaker: 00:28:30

Okay. Went to biospace using DQC

Speaker: 00:28:33

methods again. Remember I told you I had four clusters. So

Speaker: 00:28:37

there are six pairs of clusters which

Speaker: 00:28:41

differ in how the genes are being expressed in those clusters.

Speaker: 00:28:45

So I can find the most, the list of the most important ones between

Speaker: 00:28:50

1 and 2, 1 and 3, 1 and 4, 2 and 3,

Speaker: 00:28:54

2 and 4, 3 and 4. So

Speaker: 00:28:58

six possible axes

Speaker: 00:29:01

in biospace, the sets of genes that are most important.

Speaker: 00:29:06

And then using those axes which go from minus something

Speaker: 00:29:10

to plus something, I can assign a coordinate to every one of the

Speaker: 00:29:13

tumors. So I have points in a six dimensional space.
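
The counting here (four clusters, six pairs, six axes) can be sketched in a few lines. This is only an illustration: the transcript does not say how the "most important" genes are chosen, so the standardized mean difference below, the function name `top_genes_per_pair`, and the random toy matrix are all assumptions.

```python
from itertools import combinations

import numpy as np


def top_genes_per_pair(expr, labels, n_top=3):
    """expr: (n_tumors, n_genes) expression matrix; labels: cluster id per tumor.

    Returns {(a, b): indices of the n_top genes whose mean expression
    differs most between clusters a and b}. The ranking rule is a
    stand-in, not the method used in the work being discussed.
    """
    clusters = sorted(set(labels.tolist()))
    scale = expr.std(axis=0) + 1e-12  # avoid division by zero
    axes = {}
    for a, b in combinations(clusters, 2):
        diff = expr[labels == a].mean(axis=0) - expr[labels == b].mean(axis=0)
        axes[(a, b)] = np.argsort(-np.abs(diff / scale))[:n_top]
    return axes


# Toy data: 20 tumors in 4 clusters, 50 genes, random values.
rng = np.random.default_rng(0)
labels = np.repeat([1, 2, 3, 4], 5)
expr = rng.normal(size=(20, 50))
axes = top_genes_per_pair(expr, labels)
print(list(axes))  # [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)]
```

Each of the six keys would correspond to one candidate axis in biospace, with its gene list as the value.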

Speaker: 00:29:18

Okay. The way that's done,

Speaker: 00:29:22

it's done in a way such that zero on

Speaker: 00:29:26

that axis means that for that set of genes,

Speaker: 00:29:30

that point is consistent with what the value for

Speaker: 00:29:34

all of the genes in that thing. The average value of those genes

Speaker: 00:29:38

is. Plus means you are moving

Speaker: 00:29:41

x standard deviations away from

Speaker: 00:29:46

being at the average expression. So I don't need to know what the

Speaker: 00:29:50

normal expression of a gene is. That's always one of the

Speaker: 00:29:53

problems. You rarely have data for normal

Speaker: 00:29:57

cells of the same type as the tumor.

Speaker: 00:30:02

And so you don't know where to set your zeros. Here I'm doing it by

Speaker: 00:30:05

the average and I'm saying how far from a standard deviation am

Speaker: 00:30:09

I out one way and how far out am I the other way?
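
A minimal sketch of that zero-at-the-average convention, assuming each axis coordinate is the mean per-gene z-score over the axis's gene set. The function name `biospace_coordinate` and the tiny matrix are hypothetical; the actual construction may weight or sign the genes differently.

```python
import numpy as np


def biospace_coordinate(expr, gene_idx):
    """Coordinate of every tumor along one axis.

    expr: (n_tumors, n_genes); gene_idx: genes that define the axis.
    Each gene is centred on its cohort average and scaled by its cohort
    standard deviation, so 0 means "at the average" and +/-x means
    "x standard deviations away". No normal-tissue reference is needed.
    Assumes no gene in the set is constant across the cohort.
    """
    sub = expr[:, gene_idx]
    z = (sub - sub.mean(axis=0)) / sub.std(axis=0)
    return z.mean(axis=1)


# Three tumors, two genes on very different scales.
expr = np.array([[1.0, 10.0],
                 [2.0, 20.0],
                 [3.0, 30.0]])
coords = biospace_coordinate(expr, [0, 1])
```

Because every gene is standardized against the cohort itself, genes measured on very different scales contribute equally to the coordinate.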

Speaker: 00:30:13

And so you plot the same tumors.

Speaker: 00:30:18

Now I have to remember what I do. I go to share

Speaker: 00:30:22

share the screen. So you plot the same set of

Speaker: 00:30:25

tumors. Now you see my background. Yes. And I am going to

Speaker: 00:30:29

switch over to the computer in my basement and show you a fun thing.

Speaker: 00:30:35

So the axes you see here are

Speaker: 00:30:38

DQC's plotting of

Speaker: 00:30:43

the cancers in a six dimensional biospace.

Speaker: 00:30:48

But I want you to see, blues are glioblastomas,

Speaker: 00:30:54

reds are the lowest grade gliomas,

Speaker: 00:30:57

magentas are the next lowest grade gliomas.

Speaker: 00:31:02

And the golds

Speaker: 00:31:05

are closest to the glioblastomas.

Speaker: 00:31:09

Interesting. Now I told you this is an animation. We're going to start the points.

Speaker: 00:31:12

This is how DQC works. Okay. So we're moving the points

Speaker: 00:31:16

downhill.
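
The "moving the points downhill" picture can be sketched as follows. This is a simplified stand-in, not DQC itself: as far as I know, the published method evolves a Gaussian wave packet for each point under a Schrodinger equation, whereas this sketch simply does gradient descent on the same kind of potential (the one for which a sum of Gaussians centred on the data is an eigenstate). The names `quantum_potential` and `roll_downhill`, the width `sigma`, the step size, and the toy blobs are all assumptions.

```python
import numpy as np


def quantum_potential(x, data, sigma):
    """V(x) up to an additive constant, for psi = sum of Gaussians on `data`."""
    diff = x[:, None, :] - data[None, :, :]   # shape (m, n, d)
    r2 = np.sum(diff ** 2, axis=2)            # squared distances, (m, n)
    w = np.exp(-r2 / (2.0 * sigma ** 2))      # Gaussian weights
    return np.sum(r2 * w, axis=1) / (2.0 * sigma ** 2 * np.sum(w, axis=1))


def roll_downhill(data, sigma=1.0, step=0.1, n_steps=200, h=1e-4):
    """Move a copy of every point down the potential built from the data."""
    pos = data.astype(float).copy()
    dim = pos.shape[1]
    for _ in range(n_steps):
        grad = np.zeros_like(pos)
        for k in range(dim):
            e = np.zeros(dim)
            e[k] = h
            # Central finite difference along coordinate k.
            grad[:, k] = (quantum_potential(pos + e, data, sigma)
                          - quantum_potential(pos - e, data, sigma)) / (2.0 * h)
        pos -= step * grad
    return pos


# Two synthetic blobs in 2D; each should collapse toward its own minimum.
rng = np.random.default_rng(0)
blob_a = rng.normal(loc=0.0, scale=0.3, size=(20, 2))
blob_b = rng.normal(loc=5.0, scale=0.3, size=(20, 2))
points = np.vstack([blob_a, blob_b])
final = roll_downhill(points)
```

Saving `pos` at every step and plotting the frames in sequence would give an animation of the kind being shown on screen.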

Speaker: 00:31:20

You like that? Yeah. So it's all. What's

Speaker: 00:31:24

happening, they're all converging into one like a

Speaker: 00:31:27

regression, right? Right. There's a one dimensional

Speaker: 00:31:31

shape. The healthiest tumors, they're not

Speaker: 00:31:34

healthy, but they're the healthiest. They're not the least awful.

Speaker: 00:31:38

Yeah, the least awful. So if I look at for this.

Speaker: 00:31:42

We already saw that when we analyzed the. So the colors

Speaker: 00:31:46

here are the clusters that I discovered

Speaker: 00:31:50

in RNA sequencing space in what we call

Speaker: 00:31:54

gene space.

Speaker: 00:31:57

They've just been arranged in a line from best to

Speaker: 00:32:01

worst. So blue is the worst. The

Speaker: 00:32:05

glioblastomas over here. Okay. Okay. So

Speaker: 00:32:09

for those listening, don't worry, we're gonna link in the show notes to a video

Speaker: 00:32:13

representation of this. Interesting. At

Speaker: 00:32:16

any rate, this is what you see.

Speaker: 00:32:20

So the. This is the plot in bio space. Now that's very

Speaker: 00:32:24

interesting because these have the dimensions of the bio coordinates

Speaker: 00:32:28

and those coordinates have a meaning.

Speaker: 00:32:33

Okay. In fact, I'll tell you what the meaning is. And this is based

Speaker: 00:32:37

upon a set of data of patients.

Speaker: 00:32:41

Yes, this is 692 patients.

Speaker: 00:32:46

That data was submitted to the cancer genome project. Okay. Oh, so this

Speaker: 00:32:50

is open source data that you're pulling. This is absolutely open source.

Speaker: 00:32:54

What my company did when we existed, because we had various

Speaker: 00:32:58

projects going and things like this, we downloaded all of

Speaker: 00:33:01

that data for the RNA sequencing data and as much

Speaker: 00:33:05

as we could find about each of those tumors, which was

Speaker: 00:33:09

not a hell of a lot. But there's something, It's a good

Speaker: 00:33:13

database. As I said, it's been studied for years and years and years.

Speaker: 00:33:17

So this is results obtained by starting from no information

Speaker: 00:33:21

and just relooking at the brain cancer data

Speaker: 00:33:25

and saying, people have been studying this forever. Did they ever

Speaker: 00:33:29

find anything like this? And the answer is no.

Speaker: 00:33:32

This has never been discovered. This has never been discussed. So

Speaker: 00:33:36

using traditional analytical sources,

Speaker: 00:33:40

you could not. Whatever. You could not get at this information

Speaker: 00:33:44

without doing the. The dynamic

Speaker: 00:33:47

because you make a lot of assumptions about what you're supposed to look at. You

Speaker: 00:33:51

make a lot of assumptions about how you filter the data.

Speaker: 00:33:54

You end up throwing the baby out with the bathwater.

Speaker: 00:34:00

Go ahead. No, you also said you didn't do any cleanup of the data. Like

Speaker: 00:34:03

that's just the wrong. No. Well, I mean, they've cleaned it up obviously. Obviously at

Speaker: 00:34:07

some level. But we're not doing the post. Whatever they

Speaker: 00:34:10

did cleanup that people normally do where they filter out

Speaker: 00:34:14

genes, where they have this gene should be expressed at

Speaker: 00:34:18

least at this level. All genes that aren't expressed at that

Speaker: 00:34:22

level we're throwing out of the data set. Okay.

Speaker: 00:34:25

If I see a difference between two

Speaker: 00:34:29

clusters and the genes are expressed differently in the two clusters,

Speaker: 00:34:33

but what they call the fold value isn't big enough.

Speaker: 00:34:37

I'm throwing it out of the data. Well, you can imagine

Speaker: 00:34:40

if there's hidden information in the data and you're busy throwing things

Speaker: 00:34:44

away, the chance you throw the baby out with the bathwater

Speaker: 00:34:48

is very high. Exactly. And

Speaker: 00:34:51

that's exactly what this shows. The benefit of going in

Speaker: 00:34:54

unbiased, unfiltered,

Speaker: 00:34:58

completely agnostic. Look to see if there's a signal first.
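
A toy illustration of how a fold filter can discard a real signal. Everything here is synthetic and invented for the sketch (the shift of 0.3, the threshold of 1.0 on log2 fold change, the 40-gene set); it is not the TCGA data and not the filter any particular pipeline uses.

```python
import numpy as np

rng = np.random.default_rng(1)
n_per_cluster, n_genes, n_in_set = 30, 200, 40
shift = 0.3       # small coordinated log2 shift (invented)
threshold = 1.0   # hypothetical filter: keep genes with |log2 fold change| >= 1

# Log2 expression for two clusters; the first 40 genes move together in B.
cluster_a = rng.normal(0.0, 0.5, size=(n_per_cluster, n_genes))
cluster_b = rng.normal(0.0, 0.5, size=(n_per_cluster, n_genes))
cluster_b[:, :n_in_set] += shift

# Gene-by-gene filter: no single gene changes enough to survive.
log2_fc = cluster_b.mean(axis=0) - cluster_a.mean(axis=0)
kept = np.flatnonzero(np.abs(log2_fc) >= threshold)

# Set-level view: averaging the 40 genes per tumor exposes the shift.
set_score_a = cluster_a[:, :n_in_set].mean(axis=1)
set_score_b = cluster_b[:, :n_in_set].mean(axis=1)
gap = set_score_b.mean() - set_score_a.mean()

print(f"genes surviving the filter: {kept.size}")
print(f"gap between clusters in the set score: {gap:.2f}")
```

No individual gene clears the threshold, yet the group of genes acting together separates the two clusters, which is the sense in which filtering first can throw the signal away.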

Speaker: 00:35:02

And then when you see the signal, which I did. So stage one is,

Speaker: 00:35:06

wow, there's a signal. Stage two, what is making the

Speaker: 00:35:09

signal? DQC is built for solving those

Speaker: 00:35:13

problems. Right. So basically, and

Speaker: 00:35:16

that's where it differs from AI, okay, AI needs training

Speaker: 00:35:20

sets for the most part. There are

Speaker: 00:35:24

versions of AI now that claim not to, which

Speaker: 00:35:27

are real. They make up data in order to train

Speaker: 00:35:31

the data. There aren't enough training sets.

Speaker: 00:35:35

So what you do instead is you make up artificial data and then try

Speaker: 00:35:38

to teach it to reconstruct the real data.

Speaker: 00:35:43

Okay, by picking the parameters in the artificial data

Speaker: 00:35:48

and then you try to classify existing data.

Speaker: 00:35:52

But it's a different story here.

Speaker: 00:35:56

Everything is understood. The algorithm is totally

Speaker: 00:36:00

prescriptive. I know exactly what's going on.

Speaker: 00:36:04

There's no mystery. Once I find something

Speaker: 00:36:07

and we ended up, I just showed you with this concept of

Speaker: 00:36:11

biospace, which is what

Speaker: 00:36:15

people in literature, it turns out that's where the idea came from

Speaker: 00:36:19

to look at it this way, what people were

Speaker: 00:36:23

talking about as latent coordinates in the data.

Speaker: 00:36:27

So there are people doing AI that say, oh, I'm going to keep feeding

Speaker: 00:36:31

AI from this and AI is going to reduce my problem to

Speaker: 00:36:35

some low dimensional manifold and I'll call that a latent

Speaker: 00:36:38

coordinate picture. But then I'm faced with the problem. I don't really know

Speaker: 00:36:42

what the coordinates mean. I am busy trying to interpret

Speaker: 00:36:46

them and I certainly don't know how to exploit them.

Speaker: 00:36:50

Different here. Right. So we started with no training data.

Speaker: 00:36:56

Am I looking at here? Shouldn't be showing

Speaker: 00:37:00

you this, but my, my collaborators say I can show it to you.

Speaker: 00:37:06

So here are the axes. So what do

Speaker: 00:37:10

you know from these axes? Well, the

Speaker: 00:37:13

genes in this axis have, as I

Speaker: 00:37:17

say, tumor associated fibroblast activation,

Speaker: 00:37:21

their immune checkpoint genes

Speaker: 00:37:24

signaling, chemokine-driven inflammation, the pathways that are

Speaker: 00:37:28

being recruited for this or that. Basically

Speaker: 00:37:32

here it says if you want to

Speaker: 00:37:36

change overexpression or under expression, you want to look at

Speaker: 00:37:39

the drugs which do the following thing.

Speaker: 00:37:43

There's one such description for every one of the six axes.

Speaker: 00:37:48

They have a meaning. And so if I simply look at the

Speaker: 00:37:51

coordinates in biospace and see along which of these

Speaker: 00:37:55

axes the biggest signal lies,

Speaker: 00:38:00

that's the first set of drugs you try on the tumor.
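
As a sketch of that triage step: rank the six axes by how far the tumor sits from zero, and read the axis descriptions in that order. The coordinates and axis labels below are invented, and the mapping from an axis to candidate drugs is deliberately left out, since that is the part that needs vetting by biologists and clinicians.

```python
import numpy as np


def rank_axes(coords, axis_names):
    """Order axes from strongest to weakest signal (largest |coordinate| first)."""
    order = np.argsort(-np.abs(coords))
    return [(axis_names[i], float(coords[i])) for i in order]


# Hypothetical biospace coordinates for one tumor, in standard deviations.
axis_names = ["1v2", "1v3", "1v4", "2v3", "2v4", "3v4"]
tumor = np.array([0.4, -2.1, 1.3, 0.1, -0.7, 0.9])

ranking = rank_axes(tumor, axis_names)
for name, value in ranking:
    print(f"axis {name}: {value:+.1f} SD")
```

The first entry is the axis to look at first; if that line of treatment fails, the next entry is the next candidate.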

Speaker: 00:38:04

So by looking in biospace at how the tumor

Speaker: 00:38:07

evolves in biospace,

Speaker: 00:38:12

that's what this is, right? The evolution of the tumor

Speaker: 00:38:16

in biospace. Every one of these points, after all, is a

Speaker: 00:38:19

snapshot in time of the tumor at that point.

Speaker: 00:38:23

What this suggests is it's a continuous

Speaker: 00:38:26

evolution to glioblastoma through these

Speaker: 00:38:30

biological processes. And as they change.

Speaker: 00:38:34

So you're seeing the. So basically, what

Speaker: 00:38:38

have I learned? God is showing me, or biology is

Speaker: 00:38:42

showing me how the tumors evolved in

Speaker: 00:38:46

time.

Speaker: 00:38:49

Interesting. I don't know. That doesn't. So do they all start out

Speaker: 00:38:53

as like you showed that image again, but

Speaker: 00:38:57

the one where they're all on the same plane, the one that we're all

Speaker: 00:39:01

on the same plane. This is the

Speaker: 00:39:04

snapshot in survival time for the patient, because that's what

Speaker: 00:39:08

is changing along this curve. We already saw that. Oh, I see. So

Speaker: 00:39:12

this had different survival times. So these are all

Speaker: 00:39:15

tumors. I don't know where healthy is.

Speaker: 00:39:19

Okay. So not everybody starts out, for example,

Speaker: 00:39:23

you know, in the red. And then basically

Speaker: 00:39:26

they probably do. Okay.

Speaker: 00:39:29

Glioblastomas have to be. If you look at them in terms of their gene

Speaker: 00:39:33

expression patterns, they're a mess. Okay.

Speaker: 00:39:37

They've undergone many mutations to get where they are. And the more mutations,

Speaker: 00:39:41

that's the different colors, basically, and they change. Okay. So

Speaker: 00:39:44

everyone starts out maybe with the red, but not everybody

Speaker: 00:39:48

goes all the way to the blue. Purple. Right. And they probably. Everybody probably

Speaker: 00:39:52

starts out to the left of the red. Right. Because these

Speaker: 00:39:55

tumors probably form at the single cell or small number of

Speaker: 00:39:59

cell levels. Okay. And take 10 years to grow.

Speaker: 00:40:02

Okay. To first show up and be seen. Okay. So

Speaker: 00:40:07

it's not. We don't have examples of the earliest

Speaker: 00:40:10

version. That's the beauty of what. What's blowing me away. Yeah.

Speaker: 00:40:14

Don't need to know any of this. I don't

Speaker: 00:40:17

have to know. I only need the gene expression pattern

Speaker: 00:40:22

and I only needed the information about survival time to

Speaker: 00:40:26

interpret the axis. Everything else came after

Speaker: 00:40:30

I found the axes when I had to interrogate

Speaker: 00:40:34

pathway databases to find out what they do.

Speaker: 00:40:38

And truth be told, I asked an AI to give me the

Speaker: 00:40:42

information about that because it's a pain in the ass to

Speaker: 00:40:45

go through those things yourself. So we could use. And I just

Speaker: 00:40:49

wanted to know what I might see. This is not to be taken

Speaker: 00:40:53

seriously. Okay. Because my, my

Speaker: 00:40:56

biologist and my doctor friend are going to have to do the job

Speaker: 00:41:00

of vetting what these interpretations. I only trust

Speaker: 00:41:03

AIs a little

Speaker: 00:41:07

bit. It's sort of fun to do that. Okay.

Speaker: 00:41:11

But what I wanted to give you was a feeling

Speaker: 00:41:16

for the difference between biospace information

Speaker: 00:41:20

and simple single gene information. Okay.

Speaker: 00:41:24

And it's awesome what the difference is. And

Speaker: 00:41:28

it's awesome that there's a progression in

Speaker: 00:41:32

biological processes that lead you to

Speaker: 00:41:35

glioblastoma. I can't tell

Speaker: 00:41:39

you this actually represents evolution,

Speaker: 00:41:43

but if it looks like evolution and it smells like

Speaker: 00:41:46

evolution and it quacks like evolution, it's

Speaker: 00:41:50

evolution, okay? I mean that's just my feeling.

Speaker: 00:41:53

Now I've already given you all the I don't know any biology,

Speaker: 00:41:58

do know a lot of physics, do know how DQC works.

Speaker: 00:42:02

Okay. I know that better than anybody. But this

Speaker: 00:42:06

business that you can take the information that you learned in

Speaker: 00:42:09

the genetic right, in the single gene

Speaker: 00:42:13

basis and convert it to biological

Speaker: 00:42:16

process basis and learn entirely new things

Speaker: 00:42:20

more suited to advising doctors who are treating

Speaker: 00:42:24

cancer patients. Because I can take a new

Speaker: 00:42:28

tumor stuff and put it on that plot, see where it

Speaker: 00:42:31

is, see what its axis definition is

Speaker: 00:42:35

and see what the likely best drug is to start with. And

Speaker: 00:42:39

then if that doesn't work, drop down to the next most likely. The next

Speaker: 00:42:42

most likely. So

Speaker: 00:42:47

basically that sort of we can stop sharing actually

Speaker: 00:42:50

now, which he says I can stop sharing.

Speaker: 00:42:55

Okay, great. So you know why I'm

Speaker: 00:42:59

in this befuddled state at the moment? Because I am still

Speaker: 00:43:02

absorbing what this is telling me. I certainly never expected

Speaker: 00:43:08

when I thought of trying that because people talked about these latent

Speaker: 00:43:11

variables and hidden dimension, hidden coordinates

Speaker: 00:43:16

and describe ways that might work. I didn't see any

Speaker: 00:43:20

examples actually worked out. This is the

Speaker: 00:43:24

story from beginning to end

Speaker: 00:43:27

genetic coordinates to discovery

Speaker: 00:43:31

to the world's best classifier to changing that into

Speaker: 00:43:35

bio coordinates discovered from the genetic side.

Speaker: 00:43:41

The treatment options, a tool for helping doctors treat,

Speaker: 00:43:46

for suggesting to cancer researchers new experiments to

Speaker: 00:43:50

do to verify what they're seeing on this.

Speaker: 00:43:54

Lots of suggested. I alone with no knowledge can think

Speaker: 00:43:58

of 10 things people should explore based on this. And

Speaker: 00:44:01

drug companies want to know what the next set of things

Speaker: 00:44:05

to target should be for a given disease.

Speaker: 00:44:10

Wow. I think that's pretty cool. That is

Speaker: 00:44:13

impressive. So it really is. You know,

Speaker: 00:44:17

DQC is telling me the data is whispering

Speaker: 00:44:21

to you. I'm the tool

Speaker: 00:44:24

that'll teach you how to listen.

Speaker: 00:44:28

That's the way I feel about it. Since it's my baby, it's grown

Speaker: 00:44:32

up, I really think it's grown up and

Speaker: 00:44:36

I'm very impressed with where it got. So you're getting me in my

Speaker: 00:44:40

very biased statement for it. Oh, we can tell it's super, super humble. But

Speaker: 00:44:44

no, it's really. It's really exciting. But also to see where it can

Speaker: 00:44:48

be taken from there, you know, like, this is just the beginning.

Speaker: 00:44:51

The. There's so many scratching the surface. First place, those

Speaker: 00:44:55

axes could be improved because there's more than one

Speaker: 00:44:59

set of genes that give similar information

Speaker: 00:45:04

how to exploit it, how to do the bench experiments.

Speaker: 00:45:07

That's not me. I don't know that stuff. And I'm

Speaker: 00:45:11

83. I'm not ready to start learning how to be a bench

Speaker: 00:45:14

biologist. Okay. But

Speaker: 00:45:19

it's. It. It's just so cool. I mean, you know, it's

Speaker: 00:45:23

like you've seen the underbelly of what's happening in the biology.

Speaker: 00:45:27

At any rate, I don't know if you agree with me, but I think it's

Speaker: 00:45:29

really cool. No, that is really cool.

Speaker: 00:45:35

There's a lot to take in. I'm sorry about

Speaker: 00:45:38

that. No, no, I mean, you know, we have a scale system for these

Speaker: 00:45:42

shows, right? Like five. Five. What is the five Schrodinger.

Speaker: 00:45:46

Schrodingers, yeah. So we have, like from zero to five Schrodingers. This is definitely gonna

Speaker: 00:45:49

be a good five Schrodinger show. Like, and I was able to follow on because

Speaker: 00:45:52

I was a data scientist before this. So, like, when you

Speaker: 00:45:56

said PCA, like, I knew what you were referring to at least. I.

Speaker: 00:46:01

But like, so, like, it was like, this show is really geared towards the

Speaker: 00:46:04

quantum curious. Some of which will be data scientists, some of these will be

Speaker: 00:46:08

marketers, some of those will be, you know, kind of traditional software engineers,

Speaker: 00:46:12

et cetera, et cetera. Marketers. Right. Because it's our thesis that

Speaker: 00:46:17

when the quantum computing ecosystem comes around, and

Speaker: 00:46:20

indeed, I think what you've proven today is you don't really need

Speaker: 00:46:24

quantum computing to take advantage of the

Speaker: 00:46:28

innovations in quantum science. Right. Like. Right.

Speaker: 00:46:33

I think that was an assumption I think Candace and I had. I don't want

Speaker: 00:46:35

to speak for Candace, but I know I certainly did. But I know that there

Speaker: 00:46:39

is a field called quantum inspired algorithms, which is probably.

Speaker: 00:46:43

That's sort of what this falls. Yeah,

Speaker: 00:46:46

but it's just exciting

Speaker: 00:46:50

that innovation like this can come about in such a way that

Speaker: 00:46:55

it's going to improve people's lives. What you've discovered is

Speaker: 00:46:59

I'm not a biologist or a doctor, but I would imagine that a doctor or

Speaker: 00:47:03

pharmaceutical researcher would look at that and say, oh, you know what this means? This

Speaker: 00:47:07

means xyz. I hope so. I mean, I mean

Speaker: 00:47:12

I'm pretty much at the limit of what I can do even

Speaker: 00:47:15

with collaborators on our own. The, the point

Speaker: 00:47:19

we're writing the paper now. This is, I've already blown

Speaker: 00:47:23

my collaborators out of the water because this was discovered last night and they

Speaker: 00:47:27

don't know about it yet. I have their permission to talk

Speaker: 00:47:30

about it though, so that's cool. It's,

Speaker: 00:47:35

you know, so I, I, I'm glad you liked it and the five

Speaker: 00:47:38

Schrodinger level because I'm only here because you guys refuse

Speaker: 00:47:42

statement. I don't know anything about quantum. Well, I, I do know something about quantum

Speaker: 00:47:45

computing but I am not a quantum computer person and

Speaker: 00:47:49

so I didn't belong on your show but you kept refusing to let me off.

Speaker: 00:47:55

But I mean, I think it's important that people think about like this is not,

Speaker: 00:47:59

I think one of the things that obviously you're, you're, you're a great

Speaker: 00:48:03

presenter and great teacher of these very

Speaker: 00:48:07

complicated topics but you've also something to figure it out. Plus I also think it's

Speaker: 00:48:10

important for people to realize that quantum physics and research in that

Speaker: 00:48:14

space is already improving people's lives or at

Speaker: 00:48:18

least already showing fruits of that. And

Speaker: 00:48:22

I think that your research kind of shows that. It's like, you know, you don't

Speaker: 00:48:25

have gen, you don't have the billionaires facing off over, you know,

Speaker: 00:48:29

Jensen saying it's going to take 20 years, Bill Gates saying it's going to take

Speaker: 00:48:32

less. Right. I mean this is pretty basement

Speaker: 00:48:36

and the data is free. So I think the other lesson here

Speaker: 00:48:40

is we have a wealth of data that's under explored

Speaker: 00:48:44

because looking at it in an unbiased fashion hasn't been done.

Speaker: 00:48:48

Right. So I have lots more diseases I want to look at

Speaker: 00:48:52

and I have all this TCGA data for

Speaker: 00:48:55

pancreatic cancer and various other

Speaker: 00:48:59

cancer and so

Speaker: 00:49:04

it's sort of fun, right? I like how you kind

Speaker: 00:49:07

of mix, you know, I know you say you weren't appropriate, but I think you

Speaker: 00:49:10

were totally appropriate for the show and you've got the physics

Speaker: 00:49:14

background when you're talking about quantum clustering,

Speaker: 00:49:18

why it's affecting the biological,

Speaker: 00:49:23

giving us biological data that we're able to move forward with

Speaker: 00:49:26

potentially for precision medicine. I love the

Speaker: 00:49:30

bridges that are being created all over

Speaker: 00:49:34

the place here that you're not just kind of stuck in one thing thinking you

Speaker: 00:49:37

can only do one thing because you have a certain amount of knowledge but how

Speaker: 00:49:41

you've bridged that to bring in all of this

Speaker: 00:49:45

biological data information, I think it's

Speaker: 00:49:48

fantastic. I'm very happy that you came and you joined us today.

Speaker: 00:49:52

I learned. I'm glad I didn't bore you and I hope I didn't get too

Speaker: 00:49:56

far into the weeds, which my wife accuses me of doing all the time.

Speaker: 00:50:01

Mine too.

Speaker: 00:50:04

Where can people find out more about you and what you're up to?

Speaker: 00:50:09

Me and what I'm up to? Well, I'm on LinkedIn. People contact

Speaker: 00:50:12

me through LinkedIn all the time.

Speaker: 00:50:17

I have a long history and you know, if you go look at

Speaker: 00:50:21

the archives, the physics archives. Physrev. Physrev A.

Speaker: 00:50:25

Physrev B. I, I mean my, my past history is a little

Speaker: 00:50:28

eclectic, even in physics, which I attribute to having a

Speaker: 00:50:32

short attention span. But I started in

Speaker: 00:50:36

particle physics, in phenomenology, which means looking at data,

Speaker: 00:50:39

trying to understand what it's telling me. I moved into

Speaker: 00:50:43

pure abstract particle physics

Speaker: 00:50:46

and then I went into what's called lattice field theory and lattice gauge

Speaker: 00:50:50

theory, which is trying to learn stuff from

Speaker: 00:50:55

how to say this. Didn't expect to talk about this. So,

Speaker: 00:50:59

so let's talk about how we do physics, which is another

Speaker: 00:51:03

totally off topic thing. And you may be running out of time. I don't know.

Speaker: 00:51:06

You tell me when I have to shut up.

Speaker: 00:51:10

But the, the, the story is

Speaker: 00:51:14

physicists are smart, but there are very few problems we know how to solve

Speaker: 00:51:17

exactly. Only a handful.

Speaker: 00:51:21

Everything else is done by a process we call perturbation theory.

Speaker: 00:51:25

Mathematicians also call it perturbation theory. You say, well, this

Speaker: 00:51:29

problem that I know how to solve exactly kind of looks

Speaker: 00:51:33

a little bit like this other problem, but with some

Speaker: 00:51:36

modifications. So let me add the modifications to the problem

Speaker: 00:51:40

and try to calculate corrections to the answer

Speaker: 00:51:44

based on the modifications. So I have the original problem

Speaker: 00:51:47

set and forces involved and the changes in those

Speaker: 00:51:51

forces a little bit. And then I calculate

Speaker: 00:51:55

perturbatively what's happening. People do it in

Speaker: 00:51:59

celestial physics all the time. I have this

Speaker: 00:52:02

planet moving around the sun in an elliptical orbit. Oh well, but

Speaker: 00:52:06

there's the moon. So how does that affect the orbit?

Speaker: 00:52:09

Well, I can't solve that problem. That's already a three body problem.

Speaker: 00:52:13

And there's no exact solution to the three body problem by the time

Speaker: 00:52:17

it's also got Mars and Jupiter and

Speaker: 00:52:20

Saturn and Pluto and Mercury in the problem.

Speaker: 00:52:25

I can't plot orbits. But people do it all the time.

Speaker: 00:52:28

NASA plots orbits. How do they do it? They calculate

Speaker: 00:52:32

the original orbits and they start calculating the effects of Mars

Speaker: 00:52:36

and this and that on that orbit, because we know what those

Speaker: 00:52:39

forces are if Mars is on its orbit. And through

Speaker: 00:52:44

successive corrections, successive iterations, you're

Speaker: 00:52:47

able to make the small perturbations in the orbit that get the answer

Speaker: 00:52:51

right for you and eventually lets you send something to the moon

Speaker: 00:52:55

and not miss. Okay,
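
A worked miniature of the same idea, using algebra instead of orbits. The exactly solvable problem is x^2 - 1 = 0, with root x = 1. The modified problem is x^2 - eps*x - 1 = 0. Writing x = 1 + a*eps + b*eps^2 and matching powers of eps gives a = 1/2 and b = 1/8. The function names and the choice eps = 0.1 are just for illustration.

```python
import math


def exact_root(eps):
    """Positive root of x^2 - eps*x - 1 = 0 (available here only because the toy is simple)."""
    return (eps + math.sqrt(eps ** 2 + 4.0)) / 2.0


def perturbative_root(eps, order):
    """Start from the solvable problem (root 1) and add corrections up to `order` (0, 1 or 2)."""
    terms = [1.0, eps / 2.0, eps ** 2 / 8.0]
    return sum(terms[: order + 1])


eps = 0.1
errors = [abs(exact_root(eps) - perturbative_root(eps, k)) for k in range(3)]
for k, err in enumerate(errors):
    print(f"order {k}: error {err:.2e}")
```

Each successive correction shrinks the error, which works only because eps is small. When the modification is not small, the series stops helping.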

Speaker: 00:52:59

so perturbation theory is, is what we use. But what is perturbation

Speaker: 00:53:02

theory based on? I have a solution, I know how to get

Speaker: 00:53:06

exactly. And I know how to make small corrections

Speaker: 00:53:10

to that solution. And then I can describe all kinds of

Speaker: 00:53:14

crap. So, for example,

Speaker: 00:53:18

condensed matter physics talks about matter.

Speaker: 00:53:22

So I ask you, has anybody ever proved that the table you're

Speaker: 00:53:26

sitting at exists?

Speaker: 00:53:31

Is there such a thing as a table made out of wood? In fact,

Speaker: 00:53:34

is there such a thing as wood? The answer is no.

Speaker: 00:53:40

Use wood to build houses. I use engineering

Speaker: 00:53:44

principles to calculate the stress and load on a beam.

Speaker: 00:53:48

How the hell do I do that if I don't know wood exists?

Speaker: 00:53:52

I describe wood, I assume it exists, I

Speaker: 00:53:56

characterize it in terms of a bunch of properties,

Speaker: 00:53:59

and then I can, based on that, make small correction

Speaker: 00:54:03

calculations again to see how the wood behaves

Speaker: 00:54:06

when I stand on it. But I have to start from the

Speaker: 00:54:10

assumption it exists and that there are properties

Speaker: 00:54:13

I can measure for it and make prediction based on that.

Speaker: 00:54:17

But the first principles thing that wood exists, no way.

Speaker: 00:54:21

Nobody solved that problem. Okay? So I was

Speaker: 00:54:25

very interested in that because that's sort of a first principles problem,

Speaker: 00:54:28

right? It's very philosophical, isn't it? It's where the, a

Speaker: 00:54:32

hard science like physics kind of meets up against.

Speaker: 00:54:35

Oh, we meet up against soft stuff all the time and

Speaker: 00:54:39

we fail to solve the problem. But that's okay.

Speaker: 00:54:43

It, it's. At any rate,

Speaker: 00:54:49

I was always interested, always after many years in

Speaker: 00:54:53

phenomenology, I and papers

Speaker: 00:54:56

published in phenomenology and things like that,

Speaker: 00:55:00

getting into field theory and, and

Speaker: 00:55:04

trying to understand from first principles how to solve hard problems

Speaker: 00:55:09

that, like quantum chromodynamics.

Speaker: 00:55:13

That intrigued me because we're kind of using up this

Speaker: 00:55:16

perturbation theory paradigm, okay? It's very

Speaker: 00:55:20

useful, it's very good. But we're already running into lots of

Speaker: 00:55:23

problems where it doesn't work. We don't know a problem that's

Speaker: 00:55:27

approximately like the problem we want to solve.

Speaker: 00:55:30

So how do you solve it? So I got involved in that. I got involved

Speaker: 00:55:34

in what's called lattice field theory. And then I said, but how am I going

Speaker: 00:55:37

to know I'm right? Because

Speaker: 00:55:41

I could be wrong in pushing my answer in the one

Speaker: 00:55:45

known direction. There got to be other problems,

Speaker: 00:55:49

but there's only one quantum chromodynamics. It's the one we

Speaker: 00:55:53

live with, it's the one we're made of.

Speaker: 00:55:57

So I don't know if I'm cheating or not, but there's

Speaker: 00:56:00

lots of condensed matter problems and they all have different

Speaker: 00:56:04

answers and many of them are strong coupling problems and you

Speaker: 00:56:08

can't treat them perturbatively. So take the same methods

Speaker: 00:56:12

and change your field and go look at condensed matter and see if you can

Speaker: 00:56:15

develop techniques to do that. Then did that for

Speaker: 00:56:19

a long time and then developed some methods and decided,

Speaker: 00:56:23

oh, David Horn came into my office and I said,

Speaker: 00:56:27

oh, this looks interesting. So I can't stay

Speaker: 00:56:30

in one area now, to me it makes sense why I'm changing to other

Speaker: 00:56:34

people. It looks like I have no attention span. So that's

Speaker: 00:56:37

okay because I do this for me. And

Speaker: 00:56:42

so as long as I see the thread, I'm happy. But that's how I'm here.

Speaker: 00:56:46

I'm now in biology, quote. But we're

Speaker: 00:56:49

glad. We're glad that you're here. Glad that we got to learn

Speaker: 00:56:53

a bunch of stuff today. I think it's going to be really

Speaker: 00:56:57

exciting to unpack it and to

Speaker: 00:57:01

have you back because you are just a. Few

Speaker: 00:57:04

guys, but I'm going to bore you. So. No, I don't feel bored. I

Speaker: 00:57:08

mean, I'm more fascinated. I'm confused about some things, but,

Speaker: 00:57:12

like, I'm also fascinated, too, and we want to be respectful of your

Speaker: 00:57:16

time and. But we'd love to have you back on the show.

Speaker: 00:57:19

I'm sitting in my office. I have

Speaker: 00:57:23

Nothing on until 5 o'clock this evening. Awesome. We'll

Speaker: 00:57:27

definitely have you come back then because again,

Speaker: 00:57:31

it's just really great information. It's important, it's exciting. I think it's very

Speaker: 00:57:34

exciting. So, unfortunately, we have a little limitation,

Speaker: 00:57:38

so. Yeah, but definitely. And so folks can

Speaker: 00:57:42

reach out to you on LinkedIn and engage with you directly, if you're cool with

Speaker: 00:57:46

that and let your AI. I don't promise to

Speaker: 00:57:49

answer everybody, and if they're a crackpot,

Speaker: 00:57:53

I don't promise to be polite. There you go. That's fair.

Speaker: 00:57:57

I'm liking that. I like that. Let our

Speaker: 00:58:01

AI finish the show. And that wraps this quantum

Speaker: 00:58:04

odyssey on Impact Quantum. A massive thank you to Dr.

Speaker: 00:58:09

Marvin Weinstein for taking us deep into the fractal jungle of

Speaker: 00:58:12

biology, data science, and quantum mechanics with

Speaker: 00:58:16

only his brain, DQC and a suspiciously

Speaker: 00:58:20

underutilized basement server farm. From classifying

Speaker: 00:58:24

glioblastomas with 99% accuracy to uncovering

Speaker: 00:58:28

bio coordinates that could revolutionize precision

Speaker: 00:58:31

medicine. Marvin reminded us that sometimes the biggest

Speaker: 00:58:35

scientific breakthroughs don't require a billion dollar

Speaker: 00:58:39

lab, just a stubborn physicist, open source data,

Speaker: 00:58:43

and the audacity to ask what if? If you enjoyed

Speaker: 00:58:46

this episode, and really, how could you not? Be sure to

Speaker: 00:58:50

subscribe, share and let your fellow Quantum Curious friends

Speaker: 00:58:54

know. And as always, check the show notes for links to

Speaker: 00:58:57

Marvin's work, ways to connect, and possibly a

Speaker: 00:59:01

diagram that will make your head spin just a little less.

Speaker: 00:59:04

Until next time, stay curious, stay entangled,

Speaker: 00:59:08

and remember, just because you can't observe the Quantum doesn't mean it's

Speaker: 00:59:12

not observing you. Cheers.