Merging Minds and Machines: the Future of Drug Discovery

Featuring John Marioni, Senior Vice President and Head of Computational Sciences.

Computational approaches have revolutionized how we interpret data. With the advent of genomic sequencing, scientists can derive significant conclusions from sequence data. For example, through transcriptomics – the study of genes and their expression in different cells – researchers have made groundbreaking discoveries in fields like developmental and disease biology. More recently, computational approaches have expanded with the introduction of generative artificial intelligence (AI) and its ability to generate new insights from massive datasets, diverse in scope, which can be validated in the lab. In this episode, co-host Maria Wilson sits down with John Marioni, Senior Vice President and Head of Computational Sciences, to discuss how computational methods can complement other research techniques to expedite drug discovery, enhance clinical trials, and prevent biases, ensuring the development of medicines that can treat people of diverse backgrounds.








Transcript of Two Scientists Walk Into A Bar: “Merging Minds and Machines: The Future of Drug Discovery” with John Marioni

Maria: I’m Maria Wilson.

Danielle: And I’m Danielle Mandikian.

Maria: And we are scientists. We. Love. Science.

Danielle: Yeah, we do. So, when we aren’t doing it, the next best thing is to talk about science! What’s really awesome is we’re surrounded by some of the most brilliant minds in research!

Maria: So there’s always someone interesting to talk to. But there’s never much time to just chat at work. That’s why we are so excited to be hosting this podcast. We are going to step away from the labs today to talk to other scientists about the cool stuff they are thinking about, working on and imagining…

Danielle: … as well as how some of these discoveries just might lead to new medicines. So, grab your favorite drink, get ready to unlock your science brain and join us for Two Scientists Walk into a Bar…

Maria: The show for scientists, science geeks and the people who love them!



Maria: So we wanted to ask you, what's the difference between AI and machine learning?

Employee responses:

[Laughs] That’s perfect for her!

[Laughs] That’s the perfect question!

What is the difference between artificial intelligence and machine learning? I bet they're related.

AI is a category of machine learning, right?

I think machine learning is a category of AI.

I have to say, I have no idea.

I would really say I don't, but my wife does all that stuff for me, so…

I also don't know the exact definition between the two, but I would just assume that machine learning is a subtype of AI.

Machine learning is very focused on a particular, maybe, industry or discipline or function, whereas AI is more diverse and broader in scope.

Machine learning – a lot of it is based on the usage and the learnings that the application gets from the usage, right, that is able to define that. AI is really very broad. And, it does take learnings from usage, you know, but it also incorporates other external factors.

AI uses TensorFlow. And machine learning uses more standard mathematical algorithms.

Maria: Good answers!



Maria: Hi, everyone. Welcome to the show! I'm so excited about today's topic, which is on the use of artificial intelligence and machine learning in biology and drug discovery. And I'm talking today with John Marioni, who is a good friend and expert in this area of science. Welcome to the bar, John.

John: Thank you.

Maria: It’s so great to have you here today. I have heard that your area of expertise is transcriptomics. Um…what is that?

John: So, our body is made up of billions and billions of cells, and each cell contains a copy of our genome. And although there are small variations in the genome between cells, effectively it's the same. So, the way that each cell is able to distinguish itself and do different things is by expressing different sets of genes. And so, we have about 20,000 genes in each cell in the human body, and different cell types – so, for example, cell types within the liver, within the kidney – will express different combinations of these genes. And by doing that, they're able to have different functions. So, some are going to be the so-called stem cells that will make other cells. Some are more terminal cells that you'd see in your body, things like fibroblasts or skin cells. And so, it's this different combination of genes that really makes each cell different, and makes us humans and makes other organisms what they are. So, transcriptomics, in essence, is the study of that collection of genes and how they are expressed in different cells. And the technology has moved from being able to do this at an average level, where you're able to study maybe the average expression across hundreds of thousands of cells, to being able to do this at single-cell resolution. You could think about this as moving from the ability to drink a fruit smoothie and having to guess the different pieces of fruit in it, to being able to sample individually and directly each piece of fruit and read out what it is, one by one. And that allows you to tease apart the easy things, like an orange from an apple, but also different varieties of apples. So, you're able to then tell apart those subtle differences between those different varieties of apples, between those subtly different cell types. And that can be really important from a disease and drug perspective, because you want to target your drugs at the particular cell types that are contributing to the disease. And knowing the set of genes that are expressed is so important for being able to find the right targets and then to design the appropriate drugs to change the lives of patients right across the world.

Maria: So, you're talking about single-cell transcriptomics, right?

John: That's right.

Maria: Which was a huge breakthrough...But to be able to understand in every single individual cell what set of genes is being expressed is still to me kind of mind boggling. And also, it's a huge amount of data. And I know that's also where your expertise has really come in, is managing that data.

John: Absolutely. What I've done from a scientific perspective is, in essence, trying to organize that dataset so you've got this big matrix of cells by genes, and for each cell, you know the sets of genes that are expressed. And one of the key challenges is organizing those cells. So, you can group them together, understand how they are associated with one another. So, are they similar? Can you identify a distinct cell type from that? It’s rather like a librarian. You've got a big pile of books, and you want to index them by whether they're not only fiction or non-fiction, but historical fiction, science fiction, et cetera. And in essence, that's what we're trying to do. We're trying to take all of these different pieces of information, these transcriptomes, understand how similar or how different they are from one another, and then interpret those phenotypes. And that could be collecting them as properties as groups, but also other things, like whether one particular set of cells might be signaling to another – so, you'd get a ligand expressed in one population and a receptor in the other, and you could understand something about whether those cell types are signaling and talking to one another. And you could do all of that inference computationally. And so, really the computational analyses drive the interpretation of the data.
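
A minimal sketch of the cells-by-genes idea John describes: rows are cells, columns are genes, and cells with similar expression profiles are grouped together. The count values, the `pearson` and `group_cells` helpers, the greedy grouping rule and the 0.8 correlation threshold are all illustrative assumptions, not the method John's team uses; real analyses rely on dedicated single-cell toolkits and far larger matrices.

```python
import math

def pearson(a, b):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def group_cells(matrix, threshold=0.8):
    """Greedy grouping: a cell joins the first group whose founder it resembles."""
    # Log-transform counts so a few highly expressed genes do not dominate.
    logged = [[math.log1p(v) for v in row] for row in matrix]
    founders = []  # index of the first cell placed in each group
    labels = []
    for i, profile in enumerate(logged):
        for group, founder in enumerate(founders):
            if pearson(profile, logged[founder]) >= threshold:
                labels.append(group)
                break
        else:
            # No existing group is similar enough, so this cell starts a new one.
            founders.append(i)
            labels.append(len(founders) - 1)
    return labels

# Hypothetical counts: 6 cells (rows) x 4 genes (columns).
# Cells 0-2 mainly express genes 0-1; cells 3-5 mainly express genes 2-3.
counts = [
    [50, 40, 0, 1],
    [45, 38, 1, 0],
    [55, 42, 0, 0],
    [0, 1, 60, 30],
    [1, 0, 58, 33],
    [0, 0, 62, 29],
]
labels = group_cells(counts)
print(labels)
```

With these toy counts the first three cells fall into one group and the last three into another, which is the librarian's indexing step in miniature.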

Maria: Yeah, that sounds amazing. I worked on the developmental biology of the pancreas... that is almost a cell that it's going to be – it's still in its earlier stages of differentiation? So, I know you mapped the human embryonic transcriptome earlier in your career. I think that's just fascinating. Human developmental biology – how we turn from a single cell into a multicellular organism. Tell me about what you learned.

John: Developmental biology is just beautiful.

Maria: Yeah.

John: It's kind of a miracle that we exist at all, given all the things that can go wrong, all the things that have to be coordinated. And yeah, absolutely, it's all about cells gradually changing, making decisions – sometimes pretty rapidly, right? I mean, if you think about the mouse, it goes from 650 cells to 40, 50 thousand cells over a 48-hour window. So, there's many, many duplications. And not only are the cells dividing, but they're also beginning to commit to different lineages. And so, this sort of notion of a cell type in development kind of doesn't make so much conceptual sense, because you're along a gradient. You're not static. You're not – you don't just jump from A to B. There's a transition as the cells move along that. And so, one of the really interesting questions is how can you map out those gradients computationally so you can order the cells – basically from earlier to later? As you go through that, then you kind of work out the set of genes that change in their expression along that trajectory. And that ordering tells you something about how the organ is going to form. Of course, it only tells you one part though, because understanding how those cells are located in space within the embryo is also critically important, right, because they're getting different signals – you could put a different cell in a different location, and it'll do something different because it depends upon its environment. So, putting all of these pieces together is what's going to allow you to come to a big – ultimately understand what's going on. And as we kind of move in the lab here, and other places, using organoids it kind of gives a way of thinking about that in the human biology context. But it's still very early to really understand some of these absolutely critical processes. It's interesting to think how it's going to evolve as we move forward, though.
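
One way to picture the "order the cells from earlier to later" step is the sketch below. It assumes we already know which cell is the earliest (the `root`), ranks every other cell by how far its expression profile sits from that root, and then asks which genes rise or fall along the ordering. The helper names, the toy counts and the distance-from-root rule are assumptions made for illustration; published trajectory methods are considerably more sophisticated.

```python
import math

def pearson(a, b):
    """Pearson correlation; returns 0.0 when either input barely varies."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a < 1e-12 or var_b < 1e-12:
        return 0.0
    return cov / math.sqrt(var_a * var_b)

def order_cells(matrix, root):
    """Order cell indices by expression distance from a chosen earliest cell."""
    logged = [[math.log1p(v) for v in row] for row in matrix]
    distance = [math.dist(row, logged[root]) for row in logged]
    return sorted(range(len(matrix)), key=lambda i: distance[i])

def gene_trend(matrix, order, gene):
    """Correlation between a gene's log expression and position in the ordering."""
    expression = [math.log1p(matrix[cell][gene]) for cell in order]
    return pearson(list(range(len(order))), expression)

# Hypothetical counts: 5 cells x 3 genes, listed in scrambled order.
# Gene 0 fades over development, gene 1 switches on, gene 2 stays flat.
counts = [
    [20, 20, 5],
    [40, 0, 5],   # assumed earliest cell
    [0, 40, 5],
    [30, 10, 5],
    [10, 30, 5],
]
order = order_cells(counts, root=1)
trends = [gene_trend(counts, order, g) for g in range(3)]
print(order)
print([round(t, 2) for t in trends])
```

A negative trend marks a gene that switches off along the trajectory and a positive trend marks one that switches on. As John notes, an ordering like this captures only one part of the picture, because it ignores where the cells sit in space.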



Wellington: Hey, Maria!

Maria: Hey, Wellington!

Karen: Hey, Maria!

Maria: Hey, Karen. Welcome to the team!

Karen: Thanks for having me.

Wellington: So, question – what is an organoid?

Maria: Oh, so an organoid is essentially, it's more than just a cultured cell. People are familiar with cultured cells – you have a monolayer of cells in a dish. An organoid is when you take mixtures of multiple different types of cultured cells that might belong to an organ and culture them together under conditions where they form little mini organs. So, you've got some cell-cell communication and more biological function going on than with just a single cell type cultured as a monolayer.

Karen: So, Maria, what types of organoids are scientists using in the lab these days?

Maria: Yeah, I think one example I can think of that I've come across is people making organoids of the liver, because, you know, the liver is very important for drug processing and metabolism. So, we often want to make sure drugs are not toxic to the liver. And using liver organoids is a way that that can be done.



Maria: What type of things do you think we can learn that will apply to human biology and diseases from this type of work?

John: From the developmental stuff specifically?

Maria: Yeah.

John: So, there are a few things that I think are super helpful. One, from a more methodological perspective first, compared to disease, development is relatively stereotypical. It's more stereotypical than a disease setting where you're not quite sure what to look at. So, from a methods development perspective, computationally, many of the methods that we use and apply in a human or in a disease context were first tested and validated using developmental biology models, because you know what's going on better – you have more reproducibility, so you can really validate and test the models. So, from a methods perspective, that's a really important thing. It's not just computational – also, experimental methods are very often validated using developmental biology. So, developmental biology is fundamentally important from that perspective. But also – in biology, you tend to reuse the same pathways over and over and over again. And so, how signaling works in a developmental context will often mimic – not entirely, but it'll have definite overlaps with – what's going to happen in a disease context. So, if you're able to really map out what's going on in development, you can then think about – well, I see a similar set of genes, modules, pathways expressed in this disease context, so my understanding of the developmental system is going to really help me here. And then a third way – I mentioned organoids before, this sort of technology where you grow organs or different systems in vitro – in order to grow those organoids, you need to know what set of signals the cells are exposed to, how the organs actually develop in vivo. And then you can, in vitro, think about, well what combinations of signaling do they receive at particular times, and can I use that to better, in vitro, mimic the type of organoid and the type of organ that you actually have in an in vivo setting? So, it kind of all comes full-circle, this development from a methods perspective – developmental biology from a methods perspective, from an experimental, computational. And then applying that to understand both normal biology to develop better models, and then to understand mechanisms that are active in disease. So, I see it really all coming together pretty nicely.

Maria: So, I understand what you mean by talking about experimental methods – treating cells with different agents, taking cells from different stages of development – but could you tell me a little bit more about the computational methods that you use?

John: Yeah. I mean, there are a variety of different approaches that are helpful and important for this question, or these types of questions. I think it's important to say that everything is very collaborative. So, when you're thinking about how you apply a method, or even how you develop a method, you've got to think about the biological question that you're trying to address. Because one of the things that is not helpful is developing a method in a vacuum. So, one of the first things you have to do in any setting – especially true when you're working in biology – is to understand, what is the question? And that means working with your partners in the lab, it means working with colleagues to understand the question. And then, once you've understood that question, then there's a real toolkit of different approaches that you could take. And sometimes it could be very simple. You don't always need a complicated statistical method to answer a particular question. It might be that a very simple thing is going to work really well. Sometimes it's more complicated and you've got to develop a novel approach or take a more computationally expensive approach to get to it. But I think that there are settings, and we can begin to see that now, where some of these datasets are just so large, and it's so hard to maybe interrogate them and to glean all the information you want from them, that you begin to use some of these approaches that are going to try and find patterns that the human eye might not discern, or that might not fit a hypothesis that you've had beforehand. So, sort of broadly, think about things like machine learning as ways of trying to extract correlations, relationships between data points that you might not have been able to look at before. That's a really powerful way of digging into the data and then generating new hypotheses that you can then work on in the lab to validate. I would say in order to really be super powerful, you need to have that tight connection between the analysis and the lab so that you can quickly validate – you can work together to understand what the right question is. So, there are a variety of tools, but that intersection, that interface between all parts of the process – from the data generator to the data analyst – it's absolutely critical to have that working together, and that allows you to develop the right method for the right question.

Maria: So, it sounds as if you're telling me that algorithms and computation are not going to solve everything for you without human beings who can understand the –

John: Absolutely.

Maria: – the context and the biology.

John: It's absolutely the case. There's sort of the sense at the moment that elements of artificial intelligence are going to take over.

Maria: Mm-hmm.

John: I don’t think that that’s the right way to think about it at all. It’s more, how does it complement –

Maria: Yeah.

John: – what's going on in other settings, and how can it speed up things? And maybe sometimes it will throw out something that we've never thought of before – just because we haven't been able to join the dots in a particular way, we have a particular perspective – and it's able to detect an association between things that we just weren't able to do. But I see it as more complementing what we do and how we approach scientific questions. So, it's still very much going to be a partnership between the wet lab scientist, between the dry lab scientist, and using these models – experimental and computational. And working closely together is going to speed up how quickly we can understand different biological questions, how quickly we can come to new targets. We already see this happening in the molecule design space, where the universe of chemical molecules is something that you could begin to sample, and it can suggest new structures that maybe you haven't – you wouldn't have predicted in the lab. And then you can quickly validate them, test them, see if it's working or not. Biology is harder, I would say there. Because, sort of, the rules and the data are still – there's less data, and we have a bit of a less understanding of the rules, the fundamental rules of biology. It's not like physics –

Maria: Yep.

John: – or chemistry where you have those fundamental rules that you can actually build simulations off. But the amount of data that we have in the biological sciences is beginning to allow us to understand how would – can we predict how drug X and mutational background Y in person Z will impact the phenotype? And then that's going to motivate a whole series of additional experiments.

Maria: I do like this hypothesis-neutral idea, though, that you would just ask – I guess, ask the data, "what's going on in here?" without saying, "does Gene X influence Gene Y?"

John: You've got to work out how to encode things we know.

Maria: Yeah.

John: And that's where in some systems, especially in, say, physics, where we have natural laws – there's laws of motion, there's laws of light – and you can encode that. You can kind of hard code those constraints. Biology, we don't really understand how a cell works.

Maria: Right.

John: Even accurately predicting transcription, right? If I change enhancer X, what will it do to the expression of gene Y? I mean, sometimes we have a good idea, but we don't really have a hard and fast way of predicting that. So, kind of that understanding of the laws of what's driving it is not there. So, it's kind of working out how to incorporate that useful prior information and the insights that we have together with the ML – that's going to be very powerful. And also, there's a lot of work at the moment on explainable machine learning, so that not only do you get an ability to tease apart the source of variation of the data, but you understand what the model is doing at the same time. So you kind of can understand the different features and what they mean. And that's a really important thing to take advantage of these sorts of approaches comprehensively. But yeah, it is super exciting, and the potential for complementing the work that we do at the moment is really tremendous. I'm really excited. Hopefully, you could probably tell that.

Maria: Yeah, that's wonderful.

John: And the next five, 10 years are going to be amazing in this field, I think the potential is really broad.

Maria: So, we've talked about cells and organs and how you can use computation and data to make predictions there. What about clinical trials? I mean, I remember even 10, 15 years ago, people were starting to try and do virtual patients. And I remember seeing some quite impressive data way back on a class of drugs called the CETP inhibitors, where they, you know – a lot of stuff on like lipid metabolism. And they had really done some quite good modeling that showed that a drug would fail. And that drug ultimately did fail. But I'm wondering what you think about how that's going to impact our work and society in general?

John: Absolutely. The sort of the digital twin concept.

Maria: Digital twin. Okay, that's a nice term.

John: It’s something that's absolutely going to impact what we do, so we can begin to predict or simulate how someone would respond to a drug at a particular dose. And not only that, but how someone with a particular genetic background, for example, would respond. So, can we stratify our trials better and predict upfront – using human genetics, other approaches – how people will respond so that we can target those drugs, we can design those trials so that we are going to be able to find the set of patients that are going to show the biggest response to the drugs. And I think that we then mirror that modeling, that initial modeling, with something called adaptive experimental design, where you kind of update the design of the trial on the hoof, as you go. That's going to be really important. So, it's still in terms of – from a regulatory perspective, it's a complicated area. But the ability to cleverly update your trial design and use these adaptive experimental approaches – bringing in the digital twins into that, bringing in other insights that you get from biomarkers, that you get from other data types that are generated as part of the trial – will allow us to design better trials and to target the drugs at the right cohort.

Maria: Yeah.

John: Because the two are equally important. You need to kind of mirror that particular response, and then update, maybe the trial design, because maybe it's women between the ages of 50 and 70 that are showing the best response, and maybe you don't know that until you start the trial.

Maria: Right.

John: But if you begin to detect that signal strongly, can you then quickly update the design of the trial to focus on the right – in inverted commas – “cohort” that's looking to show the best response?

Maria: Interesting. What type of input of data are you using to build digital twins?

John: So, there's a variety of types of data. So, anything from X-rays to the standard clinical readouts. Almost anything you can get your hands on is going to be powerful for helping to –

Maria: Could you put genomic data in there, or is that too much data?

John: I'm not sure you could ever, if you ask a data scientist –

Maria: Yeah.

John: – "can you ever have too much data?" –

Maria: [Laughs]

John: – the answer is almost always going to be “no.” It might become overwhelming, but no. That data is going to be useful, as well.

Maria: Yeah.

John: It might not be everything that you put in, because you have to understand that – how much weight you give to different types of data. Some data might be more useful than others. And also, data that are at different scales can impact how the model fits. So, if you have one dataset that's got 10,000 entries in it – or 10,000 entries and then another that's got five – you've got to be a bit careful about how you fit the model. There's quite a bit of fine-tuning involved, so it's not just – there's not a ready-baked way of doing this where you just plug in the data and it's all going to work. So, it really is a horses-for-courses kind of – I just realized that we're probably using a lot of British vernacular here in this conversation.

Maria: [Laughs]

John: Talking about "on the hoof" and "horses-for-courses.”

Maria: [Laughs]

John: May not translate very well across the Atlantic, but oh well.

Maria: That's okay.

John: What can you do? [Laughs]

Maria: [Laughs]

John: We'll do our best. But, the ability to understand this, and how to weight the different types of data, is going to be really important to actually build meaningful models, otherwise the models you build may not be as useful as you would like.
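
To make the weighting point concrete, here is a small sketch of one option among many: give every sample a weight so that each data source contributes a chosen share to the fit, regardless of how many entries it has. The source names, the `sample_weights` helper and the equal-share default are hypothetical and are not taken from the conversation.

```python
def sample_weights(source_sizes, source_importance=None):
    """Per-sample weight for each source so that each source's total
    contribution matches its chosen share, independent of its size."""
    names = list(source_sizes)
    if source_importance is None:
        # Assumption: with no other information, every source gets an equal share.
        source_importance = {name: 1.0 for name in names}
    total_importance = sum(source_importance[name] for name in names)
    return {
        name: source_importance[name] / total_importance / source_sizes[name]
        for name in names
    }

# Hypothetical sources echoing John's example: 10,000 entries versus five.
sizes = {"clinical_readouts": 10_000, "imaging": 5}
weights = sample_weights(sizes)

for name, size in sizes.items():
    # Each source's total weight (per-sample weight x size) comes out equal.
    print(name, weights[name], weights[name] * size)
```

Without a scheme like this, the five-entry dataset would be almost invisible next to the 10,000-entry one. Whether equal shares are actually appropriate is a modeling judgment, which is the fine-tuning John refers to.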



Karen: Maria?

Maria: Hi, Karen.

Karen: So, my Ph.D. was in biomedical sciences, and I specifically was doing some neuroscience focused research. I'm not really so sure about this concept of digital twins. Could you talk a little bit more about that?

Maria: So, it's something that's relatively new to me. But I think what we're talking about here is the idea that, for example, when you’re generating a clinical trial, we use real humans obviously in clinical trials. But you could imagine creating digital people based upon parameters and things that are relevant that you're measuring in the trial to create a sort of digital version of a patient. And then you could test multiple different hypotheses. What does this drug do if everybody's female? What does this drug do if everybody carries this particular genetic ancestry? What does this drug look like if, you know, everybody is only sleeping three hours a night? You know, you can think of all of the different types of parameters that you could – if you have the data to feed in – you could generate digital twin patients for clinical studies.

Wellington: So, are you saying that it's not quite a twin of like me, but like a twin of something like in the lab?

Maria: I think it's a twin of a system, right? So, a digital twin is a twin of a system. A person is a system. You are a physiological system. And so, you can build a twin and the accuracy and the utility of that twin is going to depend upon the inputs you have to inform it. That's my understanding at least. And again, I'm not an expert in this either.

Wellington: Will it be able to do yard work?

Maria: I don't think it's going to be able to do yard work or answer your emails. I don't think so. But you know, we could ask!



Maria: I was thinking about a previous conversation we've had on the podcast about genomics and diversity and inclusion, right?

John: Mm-hmm.

Maria: Because I think a lot of the genomic data that is out there is still from certain ethnic populations, and it is not truly global yet. But I wanted to get your opinion on it.

John: It's absolutely a problem. The scientific community is very biased toward samples from individuals of Western European descent. That's unfortunately just the reality. As a field, though, we are very aware of this. So, projects such as the Human Cell Atlas, for example – so, this effort to try and understand the different cell types that make up the human body, that I've been involved with for quite a few years – a really important part of that is the interactions with colleagues from right across the world to make sure that we have a diverse representation of individuals in that reference atlas. Because without that, you're just going to make mistakes.

Maria: Right.

John: And you're not going to serve people right across the world. And that's what we want to do. So, we need to be able to do that effectively. But it is a problem that I think everybody in this field recognizes, that much of the data that's been generated is very biased.

Maria: Yeah.

John: And we need to overcome those limitations or we're not going to – we're not going to succeed, so it's absolutely critical.

Maria: Right. Because you really want these models – if we're going to start having models to enhance our clinical development plans, clinical decision-making, one of the things we're really trying to do is increase representation in clinical development, in understanding how medicines work in people of different backgrounds. If we can start to use machine learning and, you know, not actual clinical experiments but modeled experiments to help with that, that would be really powerful.

John: Being able to predict which cohort or which – with a particular genetic background – will respond.

Maria: Or, will have a safety signal, for example, right?

John: Absolutely, yeah.

Maria: All of the different aspects.

John: We know that that varies very much. So, all of these sorts of models, in silico, can give us hints of where drugs will or will not be effective, and that's going to be absolutely vital. And that's just the one axis of diversity, right? There's the diversity, genetic. But there's also the diversity of thinking and thought that we have here in terms of our science. We want to have diverse sets of opinions, diverse viewpoints. Because that's just going to help us do better science. It's not only the right thing to do from an ethical perspective, but practically…

Maria: Exactly.

John: Diverse viewpoints will help us actually develop better drugs. So, it's not only right, but actually it's essential for us to move forward successfully – the field to move forward. So, it's really, really important on so many levels.

Maria: There's been big, big advances of technology over the years that have led to unexpected, perhaps, ideas and understandings that were not necessarily what the technologies were designed for. Wasn't there a guy named Robert Hooke who used one of the very early microscopes to ask interesting questions of biology?

John: And actually he… [laughs] coined the term "cells" because he was looking at cork, and he saw little structures in that cork, and he thought they kind of resembled the cells that monks slept in. And so, the term "cell" came from that ability to use the microscope to look at that piece of cork and to see the kind of patterns that resembled a monk's cell. But he could only see that from the microscope. So, this was a tremendously revolutionary technology that allowed you to go from characterizing a liver to being able to see sub-parts of a liver. And more recently, PCR would be another one that's had a fundamental effect on everything that's done in the wet lab. I'm sure when PCR was in its infancy, there was some uncertainty about how effectively to use it, but now it's everywhere. We've just got to work out how to use these models effectively to really accelerate what we do.

Maria: I like that analogy. I did not know that the word "cell" came from that observation.

John: Yeah.



Wellington: Maria, how would you answer that question?

Maria: I think for me it has to be the more recent discovery of the CRISPR-Cas9 pathway and how that is being used to really precisely look at the role of a specific gene in any cell type in almost any organism these days. The power of that technology – I'm a little envious that I was not a graduate student when that technology was available, because it was hard work to make genetic manipulations back when I was in grad school. Now, the world is so open for understanding how genes work through that technology.



Maria: Going back to the cell atlas work, now you can zoom in and know exactly really what's going on in an individual cell from the level of all of the –

John: Exactly. And you don't just have to do the pathology experiment where you visually see that it's got this shape.

Maria: Yeah.

John: You can, at the molecular level, interrogate a huge number of individual cells. So it's another axis of information that just tells us more about how our bodies function.

Maria: Yeah. It's amazing.

John: It's just really amazing. It is.

Maria: Yeah.

John: When you take a step back, you go, "Wow, we could do all this. It's so cool."

Maria: It is so cool. And I think the large language models, which are really just coming – we're just figuring out what those mean, right? I personally find them a little scary, because you ask ChatGPT a question, and it gives you this beautiful, confident, human-sounding answer.

John: [Laughs]

Maria: So, you're more likely to trust what it says, I think, somehow just psychologically, right? It gives you a false, perhaps, sense of confidence in its truth. Whereas we're used to the fact that if you get a page of hits on Google, well yes, you're going to treat those with skepticism and look into what comes forward. But you ask ChatGPT – and you know, I like to play around with it – "My dog's making this funny honking sound when he breathes. What's wrong with him?" And ChatGPT will give you an answer to that, which is…

John: He's a trumpet.

Maria: [Laughs]

John: Is your dog a trumpet?

Maria: Is he playing the trumpet? Yes. [Laughs]

John: [Laughs] No, absolutely. And there's work going on at the moment – a lot of research in the field – to kind of give a score of confidence in the output that's generated, so you can know how much to trust –

Maria: Okay.

John: – some of the output. So, that is there. There's also this feature in ChatGPT where, in essence, you can tune how imaginative it is going to be, so its – how much it hallucinates, technically.

Maria: Okay.

John: And you can make it hallucinate a lot, or you can choose to have it hallucinate less. And that, in some sense, is how creative you want it to be.

Maria: I see.

John: But there's indeed a lot of work, because it can give you very convincing-sounding screeds of text that are totally false.

Maria: That's right.

John: And that is slightly alarming. So, we work in the field to give some measure of confidence in that and to cross-reference what it generates.

Maria: Yeah.

John: That is absolutely ongoing in the field, and many companies are working in that space.

Maria: And I guess we as the users will get more attuned to it and get used to it.

John: Yeah, absolutely. And ultimately, we are reading it. And we're not robots. If it tells – very confidently tells you, "we're going to have to do this experiment" – you don't have to.

Maria: That's right. Yes.

John: I mean, you can go, "Nah, it's just nonsense."

Maria: Yes.

John: [Laughs] So it's, again, where the partnership aspect comes into this. Models can give you ideas, hypotheses. But we can decide whether we follow up or how we follow up on them.

Maria: If you had a dream of what you would like to see this technology achieve, what would it be?

John: That's a very good question. I think in general we're on the cusp of something really exciting with the amount of data that's being generated, coupled with the computational tools, in conjunction with the amazing experimental and domain expertise that we have here and across the whole field. I think the ability to really – as I said earlier – understand, simulate or predict what will happen if you give drug X in genetic background Y to individual Z, at each – whatever it is – we are not so far away from being able to do that. And the ability to do that, and to see how that can really translate and help us design drugs more effectively and change patients' lives – I think that is so exciting and inspiring. So, I don't know if I would have a specific thing I would want to target. It's more just, can we really build these models that understand, or that can simulate the human body? That can simulate molecule space? That can simulate clinical trials? I think we're really close. And the ability to do that, in conjunction with the – and it's going to take a variety of computational approaches, not just machine learning, but a whole variety of things in order to do this. That's so exciting.

Maria: It is exciting! It's been wonderful talking about this intersection between the maths and the stats and the biology. Thank you for coming to the bar. I really enjoyed the conversation.

John: Thank you so much.



Karen: That was a great conversation.

Wellington: Yeah.

Maria: Yeah, he's doing really interesting work. I think the power of applying machine learning and artificial intelligence to biological systems is – we're seeing just the beginning of it, and it's going to be a really exciting decade or more of discoveries that we're going to see.

Wellington: I'm still kind of stuck on this concept of digital twins. And all joking aside, how is it going to be most helpful?

Maria: I think it's going to be helpful to refine hypotheses, because you can't do multiple clinical trials in hundreds of thousands of people. But you always have more questions than you can answer. And I think if you have a solid sort of digital model to model clinical trials and clinical questions, you can run a lot of simulations and scenarios and then have a better idea that the study you actually do in actual people is really the best study and the right study to answer your question.

Karen: Does that mean that maybe digital twins could be used to help determine which people would be best for a clinical trial?

Maria: Maybe. Yeah, that's an interesting thought. But we'll have to see what the future holds!

And that's our show. Thanks so much for listening. If you haven't already, rate our podcast, wherever you listen – it will help new people to find us. And be sure to subscribe. If you have a question about the show, you can contact us at [email protected]. That's G-E-N-E dot com.

And now for me, it's back to wrestling with data!



The name Two Scientists Walk Into A Bar is under license and used with permission from the Fleet Science Center.