Hello and welcome to BJGP Interviews.
Speaker A: I'm Nada Khan and I'm one of the associate editors of the journal.
Speaker A: Thanks for taking the time today to listen to this podcast.
Speaker A: Today we're speaking to Professor Martijn Schut, who is a professor in translational AI and laboratory medicine, and Professor Henk van Weert, GP and emeritus professor in general practice, who are both based at Amsterdam University Medical Center.
Speaker A: We're here to discuss their paper, which is titled "Artificial intelligence for early detection of lung cancer in GPs' clinical notes".
Speaker A: So, yeah, it's great to see you both here today.
Speaker A: And Martin, I'll come to you first.
Speaker A: I suppose we know that it's important to try and diagnose cancer early, but could you talk us through the potential for artificial intelligence here in terms of identifying cancer earlier based on patient records?
Speaker B: Yeah, that's a very interesting question, because the potential goes hand in hand with the huge amount of interest in AI.
Speaker B: And I think there are great opportunities.
Speaker B: There are also great challenges.
Speaker B: But talking about the opportunities, especially in the context of the article that we wrote, they are on the data side.
Speaker B: On the data side, the digitalization of electronic health records gives great opportunities.
Speaker B: A lot more is digitalized, and that means that, in our case, we also have access to free text, and with the advent of large language models and other new developments in AI, we also have better ways of making use of those data.
Speaker B: So those two combined create a really interesting formula for big opportunities for AI in general practice and in healthcare in general.
Speaker A: And you mentioned access to free text records.
Speaker A: So, what GPs are typing into the records.
Speaker A: But before we get into the study, can you just briefly describe what natural language processing is and how it can be used in free text records?
Speaker B: So we know that a lot of clinical risk scores work with features of patients, such as their age and their gender or sex.
Speaker B: But of course, a lot of information is also written up in an unstructured way.
Speaker B: And in our case that is text.
Speaker B: But we can also think of images and audio, and we can access that kind of data in different ways, of which natural language processing is one.
Speaker B: And it means that we give AI access to this text through, for example, advanced models like we now have, like ChatGPT uses.
Speaker B: But that's only one extreme of the spectrum that we can talk about, because you could also imagine that we simply look through the text for keywords, and then, if certain keywords are mentioned, you include that in the information that is available to your model.
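The simple end of that spectrum, looking through the text for keywords, can be sketched in a few lines of Python. This is a minimal illustration and not the method used in the paper: the keyword list, the function name `keyword_features` and the example note are all hypothetical.

```python
import re

# Hypothetical keyword list, chosen for illustration only.
KEYWORDS = ["cough", "weight loss", "haemoptysis", "smoker"]


def keyword_features(note, keywords=KEYWORDS):
    """Return a 0/1 flag per keyword, matching whole words case-insensitively."""
    text = note.lower()
    return {
        kw: int(re.search(r"\b" + re.escape(kw) + r"\b", text) is not None)
        for kw in keywords
    }


# Made-up free-text note, not taken from any real record.
note = "Pt reports persistent cough for 6 weeks. Ex-smoker. No weight loss."
features = keyword_features(note)
print(features)
```

Note that the flag for "weight loss" is set even though the note says "No weight loss". Plain keyword matching ignores negation, which is one reason more advanced language models are attractive for this kind of text.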
Speaker A: And Henk, I don't know if you want to comment on what we know already about clinical scoring systems for early diagnosis of cancer.
Speaker C: The problem with what we already know is that we know things because they have been coded in the past.
Speaker C: If you look at the ways to access data, the only way to access data was by using codes.
Speaker C: And the big jump forward is made by using not only codes, but also text, because codes will always be replicating themselves.
Speaker C: By which I mean that a GP who makes notes of what he has been speaking about with patients cannot code all the things that he will write down.
Speaker C: So codes will always form a very exquisite extraction of the content of a consultation and will never present us with new information, because codes only exist when the information was already there.
Speaker C: Otherwise there will be no codes.
Speaker C: So implicitly there is a replication of what we know when we have to code things.
Speaker A: Yeah, absolutely.
Speaker A: And I work with a colleague called Sarah Price who's done some research around coding and she's shown in her research that clinical coding can be biased depending on the outcome.
Speaker A: So people who have bladder cancer, they're more likely to have codes for hematuria or blood in the urine.
Speaker A: So, yeah, there could be a discrepancy in how clinicians code things rather than write it in the free text.
Speaker C: Yeah, because in the past there has been some marvelous research done, for example by Willie Hamilton, and Julia Hippisley-Cox is well known, but they had to use codes.
Speaker C: So there was never a jump forward.
Speaker C: And I think that now, with the aid of natural language, we can make a jump forward.
Speaker A: And the methods that you used here are quite complex, but I'll try to summarize them briefly.
Speaker A: So essentially you analyzed the electronic health records of over half a million Dutch patients and used these natural language processing techniques and machine learning to look back in the records of people diagnosed with cancer.
Speaker A: And then you looked to see what data in those records could be used to predict lung cancer.
Speaker A: But is there anything you want to add to that, just for a lay audience?
Speaker A: Martin?
Speaker B: Yeah, one nuance, a small correction on that, is that we don't only look at the patients with cancer, but we look at the cases and the controls.
Speaker B: We look at both because the AI needs to be able to distinguish the cases from the controls.
Speaker B: I think that's one important distinction, because in healthcare, fortunately, we always have to deal with low prevalences.
Speaker B: We don't have too many patients with the disease compared to the healthy patients.
Speaker B: That is part of the complexity of these kinds of models.
Speaker B: I think that is also important to realize when you develop these kinds of models.
Speaker C: May I add something?
Speaker A: Yes, please.
Speaker C: Because if you look at the scientific side of it, if you develop a prediction model for a cancer, for example, then you have to do that with a logistic regression method.
Speaker C: And logistic regressions can contain many variables, but not as many as you can use with the new large language models.
Speaker C: So that's one point: you can analyze many more variables.
Speaker C: And the second point is that you can analyze those variables in connection to each other.
Speaker C: That is a great advantage compared to the past.
Speaker C: So if you look at the model that we used for this research, I think we used two layers of 100 variables in different relations to each other.
Speaker C: So that gives you 100 times 100 possibilities.
Speaker A: Talk us through what you developed here.
Speaker A: Maybe Martin, you can try to explain.
Speaker B: Yeah, can I start?
Speaker B: So we picked up a signal.
Speaker B: We developed prediction models taking in all of these, as you said, over half a million patients, all the clinical notes, the consultations that they had, and put them in a prediction model.
Speaker B: We pick up a signal, and we can make a prediction model that performs well.
Speaker B: So that's one.
Speaker B: But the second step is that ideally we would also like to get some information from that model.
Speaker B: Like, what do you use to predict, what contributes to a prediction of lung cancer?
Speaker B: And then we come to the nature of the complex methods that we use, which is that they are black boxes.
Speaker B: We are not able to open them up and see what is in them.
Speaker B: And that is actually, I'd say, the plan going forward: we would like to peek into those boxes to see what triggers these predictions for lung cancer, which can then be used as clinical knowledge, independent of the algorithm or the model that we developed.
Speaker A: And the model that you developed actually performed quite well in terms of the sensitivity of the model in terms of distinguishing which patients should be referred for potential lung cancer symptoms.
Speaker B: Correct.
Speaker B: I'm going to hand that off to Henk.
Speaker B: Maybe just to say in between that, when I mentioned predictive performance, I'm talking about the C statistic or the area under the curve, which is the first, how do you say, performance criterion.
Speaker B: If that doesn't go well, then we should try other things.
Speaker B: But that performed well.
Speaker B: And then we translate those indeed into the clinically relevant specificity and sensitivity.
Speaker B: And that's where Henk played a big role.
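For readers unfamiliar with the C statistic, one common way to read it is as the fraction of case-control pairs in which the case received the higher predicted risk, with ties counted as half. The sketch below computes it that way in plain Python. The function name `c_statistic` and the scores and labels are made up for illustration and are not results from the study.

```python
def c_statistic(scores, labels):
    """Fraction of (case, control) pairs where the case scores higher; ties count 0.5."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    concordant = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                concordant += 1.0
            elif case_score == control_score:
                concordant += 0.5
    return concordant / (len(cases) * len(controls))


# Made-up predicted risks and outcomes (1 = lung cancer, 0 = control).
scores = [0.9, 0.4, 0.35, 0.8, 0.1]
labels = [1, 0, 1, 0, 0]
auc = c_statistic(scores, labels)
print(round(auc, 3))
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of cases from controls. The nested loop is fine for a toy example but would be slow on hundreds of thousands of records, where a rank-based calculation is normally used.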
Speaker A: Yeah, go ahead, Henk.
Speaker C: Yeah.
Speaker C: First I'd like to say something about the content of what we found, because we did a small exercise to discover what was inside the black box.
Speaker C: But for that we need much more money, to do a good project to come up with that.
Speaker C: But we found some predictors which were quite astonishing.
Speaker C: There are two things I always give as examples. The first is that when a GP starts to prescribe incontinence material to a man, then he has a risk of lung cancer, which you can of course explain, because if you have lung cancer you start coughing, and when you start coughing there is a small chance that you wet yourself.
Speaker C: And the other thing we found is that the number of slashes in the file was related to the risk of lung cancer.
Speaker C: And it was quite a big question for us what that would mean.
Speaker C: And in the end we came up with the explanation that there is a connection between lung cancer and cardiovascular diseases.
Speaker C: And that connection is, of course, smoking, and GPs always use a slash to note blood pressures.
Speaker C: So if you have a lot of slashes in your file, you have a lot of blood pressures noted.
Speaker C: And if you have a lot of blood pressures noted, then you probably have high blood pressure, which is related to lung cancer.
Speaker C: Those are two small explanations of what you find inside the black box that we used.
Speaker C: And if you see what's in it, you can always think of an explanation, which is the funny thing, of course.
Speaker A: Yeah.
Speaker A: So do you think models like this could help clinicians target investigations like chest X-rays or CTs in people who might be at risk of lung cancer?
Speaker C: Of course.
Speaker C: And the reason we did this is that you can of course use a model like this for a number of applications.
Speaker C: If you use it in a diagnostic way, you will have other concerns about your sensitivity and specificity than when you use it in, for example, a screening way.
Speaker C: If you look at screening, the number of positives will be much lower than when you use it in a diagnostic sense.
Speaker C: So it is the way you want to use this algorithm that determines which thresholds you will use.
Speaker C: We worked out the 3% threshold because that is the referral threshold that was defined by NICE a few years ago.
Speaker C: And if you want to have 3%, then you need to investigate 33 people to find one with lung cancer.
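The figure of 33 follows from simple arithmetic, assuming the 3% is read as the probability that a flagged patient actually has lung cancer (the positive predictive value). The number of people to investigate per cancer found is then the reciprocal of that probability. The function name below is hypothetical.

```python
def number_needed_to_investigate(ppv):
    """Expected number of flagged patients investigated per cancer found."""
    if not 0 < ppv <= 1:
        raise ValueError("ppv must be in the interval (0, 1]")
    return 1.0 / ppv


# Assumption: the 3% referral threshold is interpreted as a positive predictive value.
nni = number_needed_to_investigate(0.03)
print(round(nni, 1))
```

The result is about 33.3, which rounds to the 33 people mentioned in the conversation.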
Speaker A: Yeah.
Speaker A: I'm also thinking about the potential practical application of something like this in a practice.
Speaker A: So if you were bringing this sort of tool to a general practice, would you be able to then suggest what thresholds they would be interested in, or what the availability was of certain tools like chest X-ray, or how do you think that this could be applied in practice?
Speaker A: Martin or Henk?
Speaker C: Yeah.
Speaker C: For example, if I had to make the choice now, I would go for the 3%, because that is the threshold advised by NICE.
Speaker A: Martin, do you want to add anything to that?
Speaker B: Yeah.
Speaker B: Talking about thresholds, it is important to realize that these models are not fixed, in the sense that you can configure them with a different threshold depending on the invasiveness of a follow-up action, the costs of a follow-up action, and the severity of the disease.
Speaker B: So that extends this to other diagnoses, to other conditions.
Speaker B: But it's important to realize that these models are moldable, so you can still use one model in different situations.
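The point that one model can be configured with different thresholds can be illustrated with a small sketch: the predicted risks stay the same, and only the cut-off for flagging a patient changes. The function name, risks and outcomes below are made up for illustration and are not data from the study.

```python
def sensitivity_specificity(scores, labels, threshold):
    """Flag patients with score >= threshold and return (sensitivity, specificity)."""
    tp = fn = tn = fp = 0
    for score, label in zip(scores, labels):
        flagged = score >= threshold
        if label == 1 and flagged:
            tp += 1
        elif label == 1:
            fn += 1
        elif flagged:
            fp += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one case and one control")
    return tp / (tp + fn), tn / (tn + fp)


# Made-up predicted risks and outcomes (1 = lung cancer, 0 = control).
scores = [0.02, 0.05, 0.10, 0.01, 0.04, 0.30]
labels = [0, 1, 1, 0, 0, 1]

low = sensitivity_specificity(scores, labels, threshold=0.03)
high = sensitivity_specificity(scores, labels, threshold=0.08)
print(low, high)
```

In this toy data the lower threshold catches every case at the cost of one false alarm, while the higher threshold raises no false alarms but misses one case. That trade-off is what would be tuned to the invasiveness and cost of the follow-up action.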
Speaker A: And just in terms of applying something like this, how do you imagine it might work at a practice level, at that GP's level?
Speaker A: So might it suggest an alert or something if a patient was above a certain threshold to trigger an investigation?
Speaker A: Or how do you envisage this being used in practice?
Speaker B: It could very well manifest as a flagging system.
Speaker B: But bringing a model from theory or from research into practice involves a number of steps, which in this case still need to be done.
Speaker B: So we took data from three big cities in the Netherlands, on which we externally validated the models that we used.
Speaker B: We developed the model in one city and then externally validated it in the other two.
Speaker B: So one big step is external validation, but then there is also the clinical uptake, setting the thresholds, the technological infrastructure in different GP systems, and connections to other systems.
Speaker B: And when you do the updating, that's another big challenge, as is the step of maintaining the model afterwards, because it's not something that we set and then it's fixed in time.
Speaker B: Of course we have to be aware of the fact that these models need to be maintained, and we have problems of drift: the setting might change, the application might differ, and how things are registered might change, all of which has implications for how useful this model remains in practice.
Speaker A: And one thing I wanted to touch on is that you mentioned that these sorts of models will use hundreds of different variables.
Speaker A: And I think the way that a lot of GPs practice when they're thinking about cancer is they're thinking about maybe five to 10 alarm symptoms or red flag symptoms that they're attuned to.
Speaker A: So when their patient presents with that, they are already thinking, right, I need to be doing something, maybe making a referral or ordering more tests.
Speaker A: But in this sort of model, because there could potentially be hundreds of variables, it's more that the system is learning or, as Martin says, flagging which patients might need anything further alongside the clinician's intuition or concern about a patient's symptoms as well.
Speaker A: So it's in addition to the clinical intuition and thought processes as well.
Speaker C: Of course, this will be very disruptive in a GP's mind, because he will have to refer patients who are not, in his mind, at risk.
Speaker C: And that's not what we used to do.
Speaker C: I mean, the GP is somebody who calculates the risks for patients, and if the risks are low in his mind, he will not refer.
Speaker C: And if you don't know how a risk is made up, then of course the mind of a GP will be in trouble.
Speaker C: Because one thing you have to say: if you speed up the process of diagnosing cancer by four weeks, what we have seen until now is that if you speed up surgery by four weeks, there will be a 6% decrease in mortality, which is a huge gain.
Speaker C: So I think that in the end GPs will be prepared to accept that the system might be better than themselves, because that's the step you have to accept.
Speaker A: It's really fascinating work and obviously, as Martin has mentioned, there's a lot more work to be done for these AI-driven and natural language processing-driven models.
Speaker A: But it's very exciting and I can already see the application potentially for lots of different cancers and not just lung cancer.
Speaker A: So is that where you're heading now with this?
Speaker C: Of course, this project is almost 10 years old now, so at the start we saw the potential, I mean, not only for cancer, but also for many other diseases.
Speaker B: So in addition to that, indeed, as Henk also just mentioned, there's lots of variety in different words.
Speaker B: Of course, let it be said, the different languages are also a challenge.
Speaker B: If you look at texts, that is something we have to tackle technically for the different models and approaches that we have, but also clinically, in that these words have different meanings.
Speaker B: And then also, as you say, yes, this was for lung cancer, we did similar work for other cancers, and there are still a lot of cancers that this could also be applied to, or other conditions.
Speaker B: I think one word of warning there as well is that the opportunities also go together with the question of where we are going.
Speaker B: I'd say, where is the specialist going to end up? In a jungle of clinical risk scores, especially in general practice, which might be too overwhelming, and we also have to think about how to deal with those dynamics, because already now we have information overload.
Speaker B: There are a lot of scientific articles about all kinds of conditions.
Speaker B: We might generate the same with clinical prediction models, and that is something we are working on, along with a more holistic approach to clinical reasoning and how you organize that in a healthcare setting.
Speaker A: Really fascinating work and yeah, I'm looking forward to seeing what else comes out from your long research project.
Speaker A: I think this has been a really interesting discussion around this area and potentially brings up more questions sometimes than answers, but I think it's a fascinating piece of work and I think shows the potential for these different techniques.
Speaker A: I just wanted to say thank you both very much for your time.
Speaker B: Thank you.
Speaker A: And thank you all very much for your time here and for listening to this BJGP podcast.
Speaker A: Henk and Martin's original research article can be found on bjgp.org and the show notes and podcast audio can be found at bjgplife.com.
Speaker A: And just as a note, we've done this podcast today with Martin and Henk, and next week's podcast is going to be with Dr. Steve Bradley, also on lung cancer.
Speaker A: And this all ties in with our June issue of the BJGP, which is themed on cancer.
Speaker A: So do go back and take a look at that issue if you're interested in this area.
Speaker A: Thanks again and bye.