Study design & clinical metabolomics

In this episode, Julie Courraud and Alice talk about the challenges and pitfalls in clinical metabolomics and what it takes to become a good data interpreter and a good scientist.

Julie Courraud

Researcher in clinical metabolomics, National and Kapodistrian University of Athens, Greece

Current affiliation – TrAMS group Athens
Previous affiliation – Statens Serum Institut, Copenhagen

Discussed paper by Courraud et al.
Studying Autism Using Untargeted Metabolomics in Newborn Screening Samples

Episode Transcript

Alice: Hello Julie, and welcome to this podcast.

Julie: Hi, thank you for inviting me. I’m very happy to share whatever I can share.

Alice: Thanks. I’m going to start with a short bio about you, and then please fill in any missing information or correct me if I say anything wrong.

Julie: Sure.

Alice: You have a background in pharmacy and experience working in clinical metabolomics. You worked for several years at the Danish center for newborn screening, where you used metabolomics for biomarker discovery in newborn screening, and also to study the impact of prenatal and perinatal events on health.
This work is done primarily with mass spectrometry-based methods applied to dried blood spots that are routinely collected from newborns in Denmark, as they are in many other countries in the world, to screen for various metabolic diseases. Is there any big part that you would like to point to?

Julie: Well, I think it is good to know that I used to work in analytical chemistry for many years in toxicology as a resident in the hospital. So I have the clinician perspective on what a reliable analysis is and what patients need. In that regard, it’s a good background in terms of where do I want my research to go.

Alice: So you have that point of view of the analytical chemists, but also almost the contact with the patients. Do you work with patients?

Julie: I don’t anymore, but as a resident, I was analyzing blood and urine and all kinds of stuff to report to different clinical services. This is when you realize the impact of what you’re doing on someone’s life, and this is not trivial at all. And also, I think, you get a sense of what is important in the analytical part of things, but also of how a test is used.

Alice: I think it’s a unique perspective as well, to have the technical aspect of really knowing how the measurements are done for metabolomics, for example, and also to know how it’s applied in real life when you work in the clinic or are in contact with patients. There you can really see that you can have a quick impact on people’s lives, if you can make a better diagnosis, for example, because you have better tools.

Julie: And I’ve also seen the impact of bad errors, and that can be really dramatic. One thing I enjoyed doing was discovering both basic science, in my PhD, and clinical research. I was working in a clinical research unit and a cancer institute, so I was also very close to actual clinical trials, to how newer treatments are implemented, et cetera, and to how everyday analytical chemistry can support so many applications in a way.

Alice: And metabolomics – Is it a big part of it? In my understanding newborn screening is probably one of the main applications at the moment of metabolomics in the clinics.

Julie: Yeah. I’m not aware of all the applications obviously, but newborn screening is one happening daily.

Alice: Can you maybe explain a bit what that is? I learned about it as I got deeper into metabolomics, but I’m sure there are some people who don’t know exactly what that is.

Julie: Yes. Every baby born in Denmark gets [tested]. We collect a dried blood spot, two to three days after birth – we take the heel of the baby and we get a bit of blood and put it on the filter paper and these filter papers are sent to us and we check for some rare inborn errors of metabolism and also all kinds of diseases that are detectable at birth.
We screen eighteen diseases now as far as I remember. The number is increasing.

Alice: It’s different in every country, right?

Julie: Exactly. I think in France, where I was born, they only screen for five diseases, and of course there are lots of consequences and politics behind it. One of the principles, in Denmark at least, is that you should not screen for a disease that you cannot cure or treat.

Alice: That’s an interesting point of view.

Julie: And it’s always a debate, because what do you mean by treatment or cure? The last study I was working on is about genetic disorders. Of course you cannot change the genetics of people, but there are supportive therapies that can help people improve their quality of life. So, where do you draw the line?

Alice: I think it is also important, let’s say, if you have a strong phenotype and a strong manifestation of a disease that you don’t know you have. I guess it makes a huge difference for the patients to at least put a name on what’s happening to them.

Julie: That’s exactly the example of my last study, which I’m writing up right now. It was also the poster I had at the Metabolomics Society conference. So we can talk about this later.

Alice: We’ll come back to this, yes.

Julie: Yes. So newborn screening is exactly for that. And in many countries it’s based on mass spectrometry. This is how the metabolomics unit started where I am at the moment [since the interview, Julie changed her affiliation]. They wanted to build on the experience they had with dried blood spot analysis using mass spectrometry to develop more untargeted types of analysis, because of course with newborn screening we target a certain number of compounds; we don’t do untargeted, obviously.
The idea was – okay, can we learn more from these samples? And also in Denmark, these samples have been stored in the biobank for the last 30 something years. They have this huge gold mine of samples that can be retrieved for research and so there is really big potential.

I have 3 different approaches:
One is biomarker discovery. The point is: how can we add a new disease to the screening that would really make a difference for patients? That means finding the right set of biomarkers to detect that disease.
We also have more mechanistic explorations: What is this dried blood spot? What are those samples telling us about this disease? Dried blood spots are usually collected right after birth, but in some other studies we collected them later in life, just because it’s a bit less invasive than actual plasma collection. So we sometimes use them for adults as well.

Alice: Is there a stability question, as well, that you want to study similar types of samples? Because if you collect a large liquid sample maybe the way of preserving it and the way it’s stable is different from a small blood sample on a piece of paper.

Julie: There are so many methodological questions behind dried blood spots versus plasma. Plasma is much more defined and studied, so it’s easier in many ways, but it is just a different type of sample, so you cannot really expect to see the same thing.

Alice: Yes it is very different.

Julie: And for dried blood spots, there is not too much data, and that’s the third approach of my work: methodological studies where we want to see what these samples can offer. Are they indeed stable through time? Can we see any type of metabolite on them, or are some classes of metabolites not reliable, et cetera? We have a lot of data on that, which we want to process and publish soon.

Alice: I guess.

Julie: That will hopefully come out in the next year. Those are the three aspects we use metabolomics for. We have to restrict ourselves because it is so easy to get lost with all the possible applications.

Alice: Okay. Let’s talk about one of your most recent papers. It’s an application of untargeted metabolomics to study autism in newborn screening samples, to kind of paraphrase the title. It was published in the Journal of Molecular Neuroscience in 2021. Do you want to give a quick overview of what the paper is about, so people don’t have to go read the abstract first?

Julie: Yes, It was my first paper as first author in metabolomics, even though I have been doing research for many years, metabolomics is a challenging field. You need to know a lot about it before you can publish something that makes sense, I would say. And this paper was a trial study for me and I learned so much from it because I made all the mistakes possible. When I say mistakes I mainly mean in the design. I’m very happy with the processes in considering the data in the end, but, this really stresses the importance of having a proper design if you want to do anything.

Alice: Did you get to do the design by yourself, or did you start your work with samples that were produced by other people? Because that happens very often.

Julie: My manager at the time was the one making the design. And I didn’t object to it because I wasn’t knowledgeable enough. Both of us thought this would be a good trial, and we both learned from the fact that it was not optimal. We were aware of batch effects, for instance, but instead of accepting them and doing some correction afterwards, we just went with one [single] batch, thinking that should be enough. But of course that means very few samples. When you do untargeted metabolomics on 74 samples, it’s just too few anyway. So that was the first mistake: the sample size.

Alice: Do you think you would have been better off having several batches and more replicates or more conditions?

Julie: Yes. At the same time, our purpose was mainly to test our methods, and I don’t think it would have been realistic to try to both answer methodological questions, as in what’s the best design for autism research, and at the same time address the research question itself, as in, can we see biomarkers for autism. I think this is a mistake that we always tend to make, because we want to answer too many questions at the same time.

Alice: Sometimes these are the constraints you have in research: you don’t have the space or the money to do the preparation study and then your full-blown study.

Julie: I think in our case, even for the methodological questions, we could have answered them much better. We thought it would be simpler, but it turned out to be a real pain.
Garbage in, garbage out: if you don’t design your study properly, you will suffer in the processing, so you need to spend the time on the design and really try to think of what confounders you will have. That was the second lesson we had from this paper: all these confounders of dried blood spots, because it’s not common to do untargeted metabolomics on dried blood spots. In short (it is explained in the paper): most important is the age at sampling, whether you collected the dried blood spot two days after birth or 10 days after birth, et cetera. That’s extremely important if you want to work on dried blood spots. The gestational age of the baby matters too, but that’s something that was known already, so that was not a surprise. These are things you should take into account in your study design.

Alice: I’m surprised that gender is so small compared to other confounders. Isn’t that unusual?

Julie: No, it’s a repeated finding. In the few studies I have processed, gender has never significantly influenced the metabolome, because at birth boys and girls are more or less the same.

Alice: It’s really interesting.

Julie: Of course, in dried bloodspots for adults there are differences but at birth it doesn’t matter so much. At least, not that we could see so far.
The seasonal effect is very interesting. I think.

Alice: Really?

Julie: I was playing around with my data and I thought, okay, the month of birth is actually very important. And of course it makes sense that whether you’re born in winter or in summer makes a huge difference, especially in countries like Denmark. Maybe in some other countries it’s not so visible, but there is always a yearly cycle.

Alice: Maybe simply because of light and things like this. Would you expect it to be different in different countries?

Julie: Yeah, I would say so. I am prone to allergies in the spring and summer, for example. In some countries you will have infections [in certain seasons]. You can also have sun exposure, obviously, but air pollution in cities also changes with the seasons, and so does diet, obviously.

Alice: That would be the whole study of its own just to figure that out.

Julie: We are exploring those things as well. We will have more publications talking about this in the future.

Alice: Okay.

Julie: I think that even though our design did not allow us to find any strong signal about autism, we learned a lot about dried blood spots. I also wanted to see if my method was appropriate for studying autism. You mentioned one of the tables of the article.

Alice: I really liked it. Can I explain how I see it? It’s an unusual table because it’s not actually the results of the paper, right? It’s more the results of the literature search.

Julie: Yes.

Alice: Based on what was interesting in the results. So it is a set of publications for analytes that were found to be relevant in autism in at least three publications. Right?

Julie: Yes.

Alice: So then you have the whole output of the literature search at the fingertips of the person who is reading the paper, as you usually would in a review article, but in a much nicer way than it’s usually presented. Normally you put that in the text and it’s all a bit messed up, and you have a list here from this paper and another from that paper. When I personally read papers, I like this kind of table because it makes the work of other people much easier to take in, digest and reuse. That is a lot of work to put together.

Julie: Yes. And I think we all do this work in our research. Every time I go through that type of work, I’m like, how can I share this? Because it’s sad that someone else has to do it again. Reviews are also a way to do this, but very often reviews are not comprehensive on that, and I’m not blaming them at all. It’s just the way it is, right? The reviewers for that specific paper asked me to remove a lot of references, because I had even more. And that’s a bit of my weakness as a researcher: I like to share all those searches I have made in the literature to see, okay, have I really gone around this topic? Because I couldn’t see specific biomarkers that were really strongly connected to autism, I thought, what about the other studies? Can I at least reproduce anything?

Alice: This is a classical thing to do at least if you don’t have anything shiny to show from the results to check out how you compare with the others. Right?

Julie: Yeah. And in general, negative results are still results. I’m sad that they are always more complicated to defend, but those many metabolites that had been reported in the literature as potentially involved in autism, I could not see them in my samples.
Of course, this could be because of the type of sample, as we said, the age at which they were collected, the method, or indeed the sample size. If I’m to use that method for a bigger study, I am going to check these findings again and look at what has been reported and how I fit into that picture. That was the point of this table.

Alice: Looking back, do you see places where you could have saved time? Besides the project planning, maybe in the execution of the analysis or the preparation of the paper.

Julie: I saved time by having good acquisition. Then you don’t need to try to clean the data afterwards.

Alice: Yeah.

Julie: That’s something, of course. It was time-consuming because I didn’t get the lucky, easy, strong finding that you can just exploit and be happy with.
Even if you try to do your pre-processing, peak integration and everything as thoroughly as possible, there’s always something that pops up in your statistics that doesn’t make sense. A noise feature will be there randomly; it just ended up being stronger in cases than in controls, et cetera. You need to do the tedious work of double-checking, and skipping that opens the door to wrong conclusions.

Alice: There’s another tedious task I’m thinking of now: how to identify analytes. I know in other data sets you had to tell MetaboAnalyst, for example, which identifiers the 400 or 500 metabolites you were looking at had, so that MetaboAnalyst would recognize them and then use them for pathway analysis or whatever. This is also very tedious and, as we say in French, ungrateful work.

Julie: I think everyone is just afraid of that part of data processing and interpretation, which is metabolite annotation. Even though things are progressing very fast, there is always a big risk of misidentification. All these unknowns [are tedious], especially if you work with clinicians who don’t know what to do with the unknowns. And I fully understand, because you cannot go further with that.
If you’re more in an explorative type of study where you want to see what your method covers, then maybe it’s not so important to identify everything. But when you start trying to use the data for biological interpretation, such as mechanisms or which pathways are involved, you have more tools out there, but they’re all based on the idea that you know exactly which metabolites you’re talking about.

Alice: Which sometimes even the method doesn’t allow, because I know in my mass spec we often measure some signals of things and then you know, it’s that structure more or less, but it can still correspond to many different metabolites that would be identified differently.

Julie: I think there is a margin of error on the side of annotation, but also on the side of those databases for pathway analysis. Those are also based on someone’s research, right? So they are also full of mistakes. I’m sorry to say that, but necessarily they are.

Alice: And the pathways often don’t cover everything, but the pathways that are the map behind also contain misinformation and they often don’t take in all the analytes we would have to give them. There is also a whole bunch of analytes that will automatically fall out because they don’t map to anything yet.

Julie: Yes. I think that it’s really important that when you use those tools, you know their limitations and you’re aware of those limitations and what your method is actually giving you.
Typically I was using the MetaboAnalyst pathway analysis and metabolite enrichment analysis.
But depending on the database you choose or exactly how you name your metabolites, you get quite different results. So you have to be careful when reporting your results: I got this result because I had this list of metabolites, and it is not necessarily the truth. It’s just the truth from the angle you had in your study.

Alice: I think every interpretation is a kind of special juxtaposition of a data set; the tools you use to look at it and how you identified it, because there’s also a translation moment there. And then you look at it from a certain angle. That is the one of what you know, and of the question you’re trying to answer. So it’s all different things that kind of narrow down the question to something that is actually very biased in many different ways that add up to what you get at the end. In that context, the bias is necessary, I think. Because you have to reduce to something, but you have to remain aware of the bias that you have, and then you can go on and think and interpret and make hypothesis or even draw conclusions as long as you don’t forget that you’ve made these limitations before.

Julie: I think that the one thing that researchers should accept is that sometimes you have to give up on some analysis because it’s just not reliable. Even though you can always run a script and you can always click around and generate results if you have to.

Alice: Are there hard limits that should not be crossed that you can think of?

Julie: Say a quarter of your metabolites are mapped in databases. Do you believe what you see? Is it reliable? Maybe you will be thrilled because of a particular pathway: “I’ve seen that in a paper. That makes sense. Let’s just stick with that.” There is this confirmation bias that comes again and again, and you have to remember that this might not be true. And even if it is true, is it the actual finding that people are interested in? Should people really spend time on that finding? Are we going to move forward with that? Or is it just good because then in the paper you can say that you confirmed that finding?

I think that in metabolomics this is very important especially because we usually design studies without a specific hypothesis in the beginning – it is very easy to get into this kind of dilemma. Okay. I found this, but should I report it or not? And if I don’t report it, is it considered, hiding something or not? If I had one way to express how I consider a study successful. I would say that, it is about going further to the next step in a concrete manner. Of course it depends on your research question but did I learn something concrete that I can apply and am I ready to take that next step and do it better next time?

Alice: Maybe a good example of this is the poster that you presented at the Metabolomics Society conference. Do you want to quickly explain what the poster was about and its main finding?

Julie: Yes. It’s a different story than my first paper. I cannot really say that it was a success story because it’s just one (single) study and I have no clue if it’s going to go anywhere.

Alice: I’m sure a lot of people would be happy to get that kind of output [as in this study]!

Julie: I was a bit surprised myself that no one had found it before, but, let’s go back to the research question.

Alice: It’s about a different disease, right?

Julie: Yeah, it’s about 22q11 deletion syndrome, which is the most common deletion syndrome. People are born missing a big chunk of their chromosome 22. They’re missing many genes, and the thing is, it’s not necessarily lethal. You might actually die, because there are things like congenital heart defects, etc. But a lot of people survive, and they have various phenotypes, and many different organs are involved. Because a lot of the manifestations are quite unspecific, it takes several years before people get the proper diagnosis. And that’s why we’re fighting here, because we think that these people should be diagnosed at birth, but [currently] they have to go through many years of seeing different experts, and sometimes they never get diagnosed. They have kids who carry on that deletion, and I think it’s sad that they weren’t informed, because it’s possible to do so. And of course, genetic analysis of everyone that is born is not yet possible. And also maybe not “wishable”. I mean, that’s another debate.

Alice: But newborn screening is happening. So I guess that’s the main idea.

Julie: Exactly. Since we are collecting those dried blood spots, anyway.
I mean, maybe there is something in those dried blood spots that could help us detect the deletion. In the literature, a lot of people are really focused on the genetic aspects of this, like whether the deletion is big or short, or which genes are involved, but doing just metabolomics on the blood to see what happens in the blood has not been done much. I could not find anyone who had reported the tyrosine level, for instance. It is at lower levels in cases compared to controls at birth. And proline is at higher levels in cases than in controls. That has been shown in the past, because proline dehydrogenase is a gene that is deleted. So that made sense. High proline was kind of a finding that I could expect, but not tyrosine. And the good part about these two amino acids is that they are both part of newborn screening in many, many places.
So it’s not anything new. You don’t ask the labs to develop a new method. You just ask them to take that data, make a ratio and [it would be done]. We need to do more studies to see which value should be used for that ratio.

Alice: And I guess it could be used as a pointer, even if the significance is not enough to really discriminate from other diseases that at least you might be tempted to then, do a genetic test.

Julie: For every screening, we always have a second analysis to confirm the diagnosis. And in the case of the 22q11 deletion syndrome, there we would need the genetic analysis for sure to confirm. The idea is can we detect some people that probably have this deletion syndrome and catch them early.

Alice: How did the idea come to make a ratio of those two metabolites?

Julie: I’m not aware of other ratios used in clinics, but I know that some ratios are already used in newborn screening, so it was already something that I was thinking of, and I was very happy to see that the MetaboAnalyst biomarker discovery type of analysis computes the ratios for you. So you don’t need to bother doing it yourself. It computes the ratios and treats them as variables like the other metabolites, and it selects the 20 best ratios. This is how this one came about, actually.

Alice: Ok. Interesting.

Julie: I think it made sense, because most of the metabolites that are significantly different in cases compared to controls have lower levels, but proline is one of the few that have higher levels. Of course, if you make a ratio, then you amplify the effect, so ratios are very helpful for that. And it’s very simple. I’m very happy to think that this can be implemented in many places, if they want to, obviously. Again, this deletion syndrome cannot be cured, so some states will not want to screen for it, but that’s another discussion.

Alice: So let’s move towards more general topics about data interpretation; is there something that, in your experience, you find particularly challenging for, the interpretation itself?

Julie: If I were to list some things that I think people should be aware of, the first is the lack of interpretation. If you stop at the descriptive analysis of your data, I think it’s a bit sad, because if you don’t do that job, then who’s going to do it? It’s a bit sad that many papers don’t bother going further than that.

Alice: Why do you think that is?

Julie: Obviously it’s a lot of work and you really need a lot of knowledge! One person cannot be an expert in everything. So if you are doing your PhD or whatever, you may not have the time to go further and you just need to get that paper out. I’m not blaming anyone for doing this; it just doesn’t satisfy me personally.

Alice: I know that feeling.

Julie: It is also very easy when you do descriptive analysis to fall again into this confirmation bias. Well, this other study found this as well. And so that must be right. And this is it. I think that it’s only when you try to go further that you see if it makes sense and if your work is actually going to be useful to the community.

Alice: How do you go further? What are the tools that you use? Recently I posted a poll on LinkedIn to ask people how they do metabolomics interpretation. And one of the options I gave them was pen, paper and PubMed.

Julie: Yeah. That’s the painful way.

Julie: It depends on your approach. If you’re doing biomarker discovery, you might have some findings where you see biomarkers that can help detect whatever you want to detect. Then there is the interpretation: does that make sense in terms of a mechanism? Sometimes you don’t know. And then of course there’s not much you can do except really dig into the literature and all the tools and databases (like pathway analysis). If you really want to go deep into why these biomarkers have changed, you can go there, but again with the limitations we have mentioned. Sometimes it is not necessary to know why the markers are like this. If your goal is just to find a biomarker, maybe you just need to check whether people can measure it and whether it is stable in your samples. Then you go more for the methodological or logistical kind of questions.

Alice: If you study a disease that’s not well-studied then you can always have the kind of exit strategy to say, well, there isn’t much information about mechanisms yet, so I’m putting this out there and then maybe one day someone else can put it together with something else to figure out. There are cases like this.

Julie: Yeah. And I think that sharing raw data is also the way forward for that, because it happened to me typically for the 22q11 deletion syndrome that there were one or two studies with metabolite measurements, but they will not share the full list of metabolites they had measured. They reported only the positive results and that is too bad.

Alice: Or the studies with entire datasets that are all expressed in relative value. So let’s say relative to control, and then it’s all like hundreds of rows of things. But what does it mean? How should I know what it is like in the controls?

Julie: Yeah. Exactly that. If I go back to what can go wrong in an interpretation, I would list the obvious ones: bad design, bad acquisition, bad processing, and bad statistics. Those are the big pitfalls. And it’s not always easy to know whether your method is adapted, so you definitely need to talk to other people and read a lot, or watch webinars and courses, and try to get as many opinions as you can, just to form your own opinion about what is adapted for your type of data. I think you cannot necessarily figure out alone what’s best unless you have experience or routine. As a beginner, it’s just impossible. The risk with a bad design is that you might or might not see something; it’s just totally unreliable. And it’s the same for bad statistics, et cetera. I mean, you always need to be aware of your biases and confounders, which are sample-specific and matrix-specific. You have to understand what you are looking at. Is it blood from adults? Is it blood from newborns? Are these people taking medications? What’s their diet? This and that.

Alice: Would you say that you should always keep in mind why you’re doing this experiment as well? Because I know from my experience that there are also pitfalls that just fall on the road as you go. And I think this applies to any type of experiment.

Julie: I think that’s a big mistake that my unit has made in the past. Initially I was supposed to just acquire data and deliver it to other people who were supposed to analyze it. But these people were not metabolomics experts, and they were a bit too optimistic, as in “we’ll figure things out”, but you cannot just figure things out out of nowhere. It takes a lot of time and knowledge about what mass spectrometry is, in my case: what do you see in terms of adducts? Where do they come from? Can you pull them together? What is this noise? I’m not even talking about annotation, but what is intensity, what is peak integration?

Alice: And I guess in this context it really helps to have people you can work with who are either more experienced or at least in a similar situation. Did you have this kind of environment, or did you have to actively look for help on the outside?

Julie: Initially I was alone, so I was really looking for help, and this is also how I ended up creating the Danish clinical metabolomics network, because I needed to gather people.

Alice: This is what networks are for.

Julie: Yeah, exactly. I thought: I am alone, I cannot get through this, and I need people to talk to each other. And I’ve learned a lot from those people. Also, of course, at some point we recruited a bioinformatician in my unit, and she had a lot of experience with her own workflow. So we adopted that workflow as the default one, and it had a lot of good innovations. I couldn’t have done this by myself, even though I think it’s more and more feasible to find your way with videos and tutorials online. But you need an understanding, as you say, of what you’re looking for: what is your goal, and what is the data you’re working with? So it doesn’t come in a day, right?

Alice: Absolutely, it really takes time.

Julie: And I think that we cannot avoid pilot studies. People don’t want pilot studies; they want good, nice, well-designed studies, but you cannot design something if you don’t know what your samples are bringing to you, what your confounders are, et cetera. So you need to go through that. Skipping steps never helps, if you ask me. I don’t think that sample size can always compensate for bad design. And not everyone can afford a big sample size anyway.

Alice: Of course.

Julie: To go back to these pitfalls: of course, it’s very difficult to know if your statistics and your processing are reliable, but I would not trust only the reviewers of your paper. I would try to have someone look at it beforehand.

Alice: Yes.

Julie: Because reviewers also have limited time and their own vision of things.

Alice: Have you also been disappointed in your reviewers? Cause I have.

Julie: Yeah. Oh yes. I’ve had reviewers who made pretty good points and were very constructive, and I’ve had all kinds of reviewers, of course. But you have to remember that reviewers are human beings, right? So the day they print your paper, they might not be in the right mood.

Alice: Of course.

Julie: And if they haven’t understood something, maybe it’s just that you have not been clear enough, because you have your nose in your paper, right? I think that a very big part of interpretation is what you report and how you report that interpretation. How do you translate what you have found into something that people can understand and use? So the next step of interpretation is reporting. It’s not trivial at all either, and it depends on your audience, right?

Alice: Yeah. And a scientific paper is a sort of storytelling, I would say, or would you put it differently? Because, for example, when you explained the story behind your poster just now, about the proline/tyrosine ratio and how the proline dehydrogenase is missing along with those missing genes, for me that’s the beginning of a story. That’s how you help make the information stick in other people’s brains, because the facts are kind of following. How do you do this reporting in scientific publications?

Julie: It really depends on your goal. I think it’s much easier, in cases such as my last study where we indeed found something and it made sense.

Alice: That helps.

Julie: It was just beautiful to write that paper because the literature made sense and my results made sense, and I had this feeling that I was bringing something. It’s just sad that the data had been sitting for three years before anyone looked at it, because it was so difficult in the end for someone else to figure out what to do with it. So again, I think we should have saved time there.

Alice: Yeah, but I think there’s really also a goldmine in studies that are sitting on someone’s hard drive but it is surely a lot of work.

Julie: It is a lot of work, and storytelling is sometimes very easy. You need to go into the literature and take from it. I usually take so many notes, and of course very little of that gets into the paper (only the essence of it). And I’m trying to keep that red line, indeed. The story is much more difficult when you do exploratory studies, because then you have so many things to see, but what do I want to present? So you have to make a decision and choose some examples, you show those, and people will take what they want. How do you choose those examples then? I guess it’s pretty personal.

Alice: There’s a big subjective part in the discussion of papers.

Julie: Yeah. And again, one of the big pitfalls is jumping to conclusions just because the literature you found in your first search reported a certain result. I think that the skill of interpretation cannot be developed or expanded if you don’t have the skill for criticizing literature, doing a proper literature search, and doing this detective work of actually reading the papers you cite. I’m sorry, but it’s so easy to cite based on a statement in the abstract, and if you don’t read the paper, you propagate findings that might not be true. I think you can’t avoid it sometimes, but some signals are just amplified in the literature in a non-constructive manner.

Alice: Very good point.

Julie: This is also part of the tedious work. As a romantic of science, I think it’s our duty to check these findings and see: Can I actually cite that paper? Does it make sense to cite that example for my data? Is that going to bring something to others?

Alice: Would you say then that this critical thinking would be the most important skill for you?

Julie: Having a bit of marketing skill is definitely important in science, but the more you know, the more you know that you don’t know, right? As a pharmacist I’ve been educated on various diseases and various organs, et cetera. And we cannot have a comprehensive view of what’s happening in a human body, and this is why it’s so important to collaborate with other experts. Never think that your vision is the only vision! Having a critical mind, for me, goes together with opening yourself to others’ opinions and chasing them. When you go to conferences, it’s very easy to just listen and never interact, but I have learned so much from just going there and saying, “Oh, I found this, what do you think about this?” And then sometimes you get into extremely interesting conversations that you really don’t expect.

Alice: Yeah. But it all begins as in any human interaction. I think it all begins with this moment where you make yourself vulnerable by saying: “This is my work and this is what I found. Do you say it’s good or do you say it’s bad?” We can always discuss whatever your answer is, but you’re exposing yourself. And this is why it’s of course much more comfortable to sit at the back of the audience and say nothing, but you don’t get as much as if you ask a question, or actually present your poster to people, or talk at the conference.

Julie: Yeah. And I think that I’m a very social person, so I’ve enjoyed that type of interaction. I understand it’s not the same for everyone, but if it’s not your strong suit, ask a close collaborator who knows about the study to do it for you. Either way, you have to get out there with that story and challenge it! Is it robust? One statistical analysis is not the truth, and that’s something I tried to address in my last study. I was trying to see if the biomarkers I saw were ones that I could find through three, four, five different types of tools and analyses. You can of course take one and report it, but you have to challenge that, right?
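To make the idea of challenging a finding across several analyses concrete, here is a minimal sketch in Python. It is not the workflow used in the study discussed; the method names and metabolite labels are made up for illustration. It simply keeps the candidates that at least a chosen number of independent analyses agree on.

```python
from collections import Counter


def consensus_candidates(results, min_methods):
    """Return features flagged by at least `min_methods` analyses, sorted by name."""
    counts = Counter()
    for flagged in results.values():
        counts.update(set(flagged))  # each analysis votes once per feature
    return sorted(f for f, n in counts.items() if n >= min_methods)


# Hypothetical output of four analyses (method names and features are invented).
results = {
    "t_test": ["m1", "m2", "m3"],
    "rank_test": ["m1", "m3", "m7"],
    "random_forest": ["m1", "m3", "m4"],
    "pls_da": ["m1", "m2", "m9"],
}

robust = consensus_candidates(results, min_methods=3)
print(robust)
```

A candidate that only one tool reports is not necessarily wrong, but it probably deserves more scrutiny than one that several different approaches agree on.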

Alice: How did that work out?

Julie: It worked well actually. And this is why I was happy with the findings.

Alice: Which tools do you actually use in your work?

Julie: Classical ones. I’m not a bioinformatician. I’m not making my own R packages and stuff, but I get inspired by papers and webinars, and I read. Very recently I wrote to Julien Boccard, who taught a workshop at Metabolomics 2021, because he was showing a very nice plot of how the different factors in his study were impacting the most interesting biomarkers or metabolites. You have to think of your samples as having so many dimensions, right? Let’s say you have a case-control study. You have people with diabetes and people without diabetes, but these people also have different ages, different comorbidities, different genders, different this and that, and it is very important to look at all these factors and not ignore them. Just because you’re interested in diabetes doesn’t mean you should ignore them. There are now different types of statistical analysis that you can use to see: is my diabetic condition actually the most dominant one in the story?

Alice: Or am I studying something else?

Julie: Sometimes you just have your nose on your diabetes and you forget that maybe obesity is triggering that. Of course, the more factors you add, the more you open the door to random findings as well, depending on your sample size. So you have to be careful with that. And of course it’s a lot of fun to put every factor you can in your table. But there again, you have to draw a line between what’s relevant and what’s not.

Alice: This is also advice I give if you have different types of data: keep all the factors that you know in your data set as long as possible, and don’t just get rid of the columns you think you don’t need. (“Because nobody cares if you smoke or not.”) But we might care once we reduce the groups to this and that, and then realize that all the smokers are on the left and all the non-smokers are on the right.
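The smoker example can be checked with a simple cross-tabulation of a metadata column against the study groups. The sketch below is only an illustration under assumed names: the sample records, the `group` and `smoker` keys, and both helper functions are hypothetical.

```python
from collections import Counter


def crosstab(samples, factor, group="group"):
    """Count samples for each (group, factor level) pair."""
    return Counter((s[group], s[factor]) for s in samples)


def fully_confounded(samples, factor, group="group"):
    """True if every level of `factor` occurs in exactly one group."""
    groups_per_level = {}
    for s in samples:
        groups_per_level.setdefault(s[factor], set()).add(s[group])
    return all(len(g) == 1 for g in groups_per_level.values())


# Hypothetical metadata table: smoking status was kept as a column.
samples = [
    {"id": "s1", "group": "case", "smoker": "yes"},
    {"id": "s2", "group": "case", "smoker": "yes"},
    {"id": "s3", "group": "control", "smoker": "no"},
    {"id": "s4", "group": "control", "smoker": "no"},
]

table = crosstab(samples, "smoker")
print(table)
print(fully_confounded(samples, "smoker"))
```

If the column had been dropped early, there would be no way to notice that the case and control groups are really a comparison of smokers against non-smokers.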

Julie: Exactly. Yeah, and this is also how you go about handling outliers. In terms of the tools you were asking about, I usually do the PERMANOVA, and we always share the scripts when we publish. It’s interesting because it shows all the factors that you’re testing, which I also call metadata variables. So for age and sex and all these, you can see to what extent they impact the metabolome, and whether the effect is significant or not. So you get two types of information here: the extent and the significance. And then, the thing that Julien was showing the other day, which I said was very interesting, is this: once you already have some candidate biomarkers, let’s say the 20 most interesting metabolites in your whole data set, how do you know why they are what they are?
What influences them the most? He had this plot where you could test the importance of each of the factors, so the metadata variables, for each of these biomarkers, and the plot adds the effects on top of one another. I mean, it’s very visual. It’s very easy to see: here is a big biomarker that has a super strong effect, but it’s actually mainly driven by age, or mainly driven by sex. It was very interesting, and it goes together with a PERMANOVA, and I’m quite excited to use it in my next study. It was actually published some years ago, but somehow I never came across it before now.
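The two pieces of information mentioned above, the extent and the significance of a factor, correspond to the R² and the permutation p-value of a PERMANOVA. The following is a minimal one-factor sketch in plain Python, not the scripts shared with the publications; the data, the function names and the choice of a one-dimensional absolute-difference distance are all assumptions made for illustration. A real analysis would normally rely on an established implementation and handle several metadata variables.

```python
import random


def _ss_within(d, labels):
    """Sum over groups of (squared within-group distances) / group size."""
    members = {}
    for i, g in enumerate(labels):
        members.setdefault(g, []).append(i)
    total = 0.0
    for idx in members.values():
        ss = sum(d[i][j] ** 2 for a, i in enumerate(idx) for j in idx[a + 1:])
        total += ss / len(idx)
    return total


def permanova(d, labels, n_perm=999, seed=0):
    """One-factor PERMANOVA on a square distance matrix `d`.

    Returns (pseudo_F, r_squared, p_value).
    """
    n = len(labels)
    k = len(set(labels))
    ss_total = sum(d[i][j] ** 2 for i in range(n) for j in range(i + 1, n)) / n

    def stats(lab):
        ss_w = _ss_within(d, lab)
        ss_a = ss_total - ss_w  # variation explained by the factor
        return (ss_a / (k - 1)) / (ss_w / (n - k)), ss_a / ss_total

    f_obs, r2 = stats(labels)
    rng = random.Random(seed)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break the link between samples and labels
        if stats(shuffled)[0] >= f_obs:
            hits += 1
    return f_obs, r2, (hits + 1) / (n_perm + 1)


# Hypothetical data: one metabolite intensity per sample, grouped by sex.
values = [1.0, 1.2, 0.9, 1.1, 3.0, 3.2, 2.9, 3.1]
sex = ["F"] * 4 + ["M"] * 4
dist = [[abs(a - b) for b in values] for a in values]

pseudo_f, r2, p = permanova(dist, sex)
print(f"R2 = {r2:.3f}, p = {p:.3f}")
```

Here R² says how much of the total variation the factor accounts for, while the p-value says how often a random relabeling of the samples would produce an effect at least as strong.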

Alice: Do you feel it helps when the visuals are good for a new analysis? It does help, doesn’t it?

Julie: This is fundamental. You cannot have a good interpretation and be unable to report it properly. That is not constructive, and you fail again. Even if you have the right interpretation, if you cannot report your results in a way that people can use or understand themselves, then it doesn’t work. I think it’s a lot about writing publications and making the correct plots, and it doesn’t come naturally. I was terrible when writing my first paper, but I think I’m much better now. I hope so anyway.

Alice: It takes practice for sure.

Julie: It takes practice, and it also takes reading again and then showing your results to others.

Alice: Showing it to others? ’Cause you can write 20 papers, but if you never show them to anyone, you’re probably not learning as much as if you show them first to the people you work with, then to collaborators, then to reviewers, and present them at conferences. It really makes a big difference.

Julie: The outsider’s eye is important, and I think that means exposing yourself. And for this red line, you have to put yourself in the shoes of someone who has no clue what you’re talking about.

Alice: And level with the audience.

Julie: That’s not easy but that’s something you have to do!

Alice: That’s the story I’m talking about. It’s the red line that helps you take the reader along, including the study design and all the steps, to the conclusion that you have. And of course, the better the conclusion, the more exciting it is and the easier it is to keep them interested as well. Now, as we are approaching the end of our episode today, would you like to share with the listeners what your process is for interpreting metabolomics data?

Julie: I have a usual flow that I follow when I interpret data. I think it sometimes helps you not to forget the different parts of things. So the first is of course the technical quality of the data. And I think that we should not overlook this, because if you overlook it, you will definitely interpret something, but you might just report false findings, and that would be sad. Be very aware of what a good acquisition is. What is good data? Ask for help if you don’t know how to assess this. There might be some post-acquisition corrections you can do. There are many such things, and I’ve done some, so it’s totally possible. That doesn’t mean you can always rely on the data afterwards, but some things can be corrected.
A second part would be to check roughly whether your studied condition (taking a case-control study as an example) is significantly visible or not. If it’s not, then can you explain why? Do you have strong confounders? Is there too much noise? Can you go back to the technical quality step, improve it, and try again? Or do you have to improve your design and make a new study?
In the end, even if nothing is possible, you will always learn something from that. It’s always positive, even if it is disappointing. In the case of a positive or significant effect of your studied condition, you then have to look at the strength of that effect in comparison to other factors, as we said, and you can learn from that too, to improve the methods and your design for the next study.
My next step is to look at metabolites. Which of those actually have an impact? Which are interesting? And depending on your goal, you then have different approaches:

If it’s for biomarker discovery, explaining them is of course interesting, but if you cannot explain them, it’s not the end of the world. What you need for a good biomarker is something that is specific, sensitive, and robust. What happens if you apply that analysis in a real-life setting with different types of samples, et cetera? Can you acquire it with various methods, or is it difficult to acquire, et cetera? So these are the questions I would address in terms of biomarker discovery.
For mechanistic exploration: if I have a study more on mechanistic exploration, I look at patterns and at what the literature says, but be careful with confirmation bias again. And if you do pathway analysis or cross-omics, you have to understand the tools you’re using. Collaboration is the key here! And remember, the method you’re using has limitations.
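For the biomarker-discovery case above, “specific” and “sensitive” have simple numerical definitions. The sketch below computes both for a single candidate biomarker with a fixed cutoff; the intensity values, the cutoff and the function name are hypothetical and only meant to show the arithmetic.

```python
def sensitivity_specificity(is_case, flagged):
    """Return (sensitivity, specificity) from true status and test result."""
    pairs = list(zip(is_case, flagged))
    tp = sum(1 for t, p in pairs if t and p)          # cases correctly flagged
    fn = sum(1 for t, p in pairs if t and not p)      # cases missed
    tn = sum(1 for t, p in pairs if not t and not p)  # controls correctly cleared
    fp = sum(1 for t, p in pairs if not t and p)      # controls wrongly flagged
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical biomarker levels for four cases followed by four controls.
levels = [5.1, 6.3, 4.8, 7.0, 2.1, 3.0, 5.5, 1.9]
is_case = [True, True, True, True, False, False, False, False]
cutoff = 4.5  # assumed decision threshold

flagged = [x > cutoff for x in levels]
sens, spec = sensitivity_specificity(is_case, flagged)
print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```

Robustness is the part this does not capture: the same numbers would need to hold up with other sample types, other acquisition methods and other cohorts.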

Then, the last step is: How do you report these findings? And I don’t have an actual guideline for that.

Alice: This is the whole reason for this podcast, because there is no guideline. I think it really helps when we discuss this with you and with other people, to get at least a feeling for how people do it. That’s also so that the people who are a bit shy at conferences and don’t dare to ask questions get this input and these ideas, and maybe go and ask next time in the conference room.
What do you find the most promising applications of metabolomics?

Julie: I think the main idea for me in terms of applications of metabolomics is the question: “Am I going to make a difference for the patients?” And if yes, in a realistic timeframe? Is it going to be in 30 years or is it going to be in five years? And I think that it is very tempting with metabolomics. There is so much potential that you easily get lost in exploration, and the clinician in me is always trying to keep track of the idea: “Are we going to get to this application?”

Alice: Do you feel it’s like a positive pressure for you, that this presence of the patients helps you to work better in that field?

Julie: It gives a lot of meaning. I know the application of biochemical analysis in the real world and I know what we don’t have. I mean, I don’t know exactly what we don’t have, but I know we don’t have a lot of things and we’re missing a lot. And I also know that when you get a result, even personally, in my family or for myself, when I get a number on a paper saying, “Oh, you have that level of this and this means that”, I’m always thinking, “Really? Who knows?” Of course I want to trust that most labs are doing a good job, but you have to keep in mind that everything that you go through as a patient yourself has been developed based on research, and researchers are human beings. And, of course, things are usually not implemented in the clinic if they have not been reproduced, confirmed, and double-checked X times.

Alice: There’s the methodology that can be faulty, and there’s also the entire story behind it that can be faulty, because it is based on the belief that things are one way when they are actually something else. And this we sometimes find out much later.

Julie: Sometimes it takes decades before you find out that this biomarker [for example] was actually not specific enough. And often this has triggered so much damage and you cannot know that of course, beforehand, but that is something to keep in mind!

Alice: Yeah. As long as it remains a positive pressure to do good work and not a daunting fear that you might make a mistake, right?

Julie: Exactly. I think you have to find a good balance between the wish to have research applied in the clinic as fast as possible and the need for the method and rationale to be reliable as well. Like any researcher, we have that pressure of publishing, of applying for grants, and of paying for our own salaries. In my case, it’s a short-term contract.

So you have to keep your integrity through all these pressures. Remember why you’re doing it, right? This is my pure why! I am described as a romantic of science.

Alice: Self-described, or do other people describe you that way?

Julie: It is my companion who says that. I want to believe that most people are like me in the way that they really want to do a good job. There are also some scientists who would maybe not be so careful, I would say. Not necessarily because they are consciously doing it. It’s just a lack of knowledge. So it’s very difficult.

Alice: Julie, thank you very much for taking part in this podcast and for speaking with me!
If people want to contact you, they can contact you directly.

Julie: Yes, great. Through my LinkedIn, because I might not stay forever where I am, so LinkedIn is a bit more sustainable. Thank you so much again for letting me speak.

Alice: I’ve found it very, very interesting.

The biocrates podcast