Wednesday, May 19, 2021

Drug Resistance and Transcriptomics

Our database lists more than 100 studies in which drug-resistant cells were compared against drug-sensitive cells. Most commonly, a sensitive cell line is passaged in the presence of low drug levels until a resistant strain emerges, whereupon a transcriptomic comparison can be made. In other cases, tumor cells from resistant patients may be compared with cells from sensitive patients.

Given this plethora, we decided to gather all these studies and see if any particular genes emerged that were commonly upregulated or downregulated in the case of drug resistance. The result is a bit more complex than we had hoped. We looked at 95 studies, excluding those involving “radioresistance.” The gene that was most commonly altered in these studies was OAS1, appearing 21 times out of 190 opportunities (each of the 95 studies has an upregulated and a downregulated portion). That seems nice…a “big name” gene popping up at a frequency that, without crunching the numbers, appears to be significant. The problem is this: OAS1 appeared in both the resistance-upregulated and resistance-downregulated datasets (13 times up and 8 times down, to be specific). Therefore, we can’t make the blanket statement that OAS1 is upregulated in cases of drug resistance, nor can we surmise that suppressing the innate immune response (OAS1 is a big player there, after all) might overcome drug resistance. OAS1 is not unique in this respect.
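To put a rough number on the 13-up versus 8-down split, here is a minimal sketch of a two-sided sign test. It is not something we ran for the analysis above; it assumes each OAS1 appearance is an independent coin flip between "up" and "down," which is a simplification, and the function name is made up for illustration.

```python
from math import comb

def two_sided_sign_test(k_up: int, k_down: int) -> float:
    """Two-sided binomial p-value for an up/down split under a 50/50 null.

    Assumption: every appearance is an independent, equally likely
    "up" or "down" call. Real studies are unlikely to be that tidy.
    """
    n = k_up + k_down
    k = max(k_up, k_down)
    # Probability of a split at least this lopsided in one direction...
    tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    # ...doubled for the two-sided test, capped at 1.
    return min(1.0, 2 * tail)

p = two_sided_sign_test(13, 8)
print(f"OAS1, 13 up vs 8 down: p = {p:.2f}")
```

Under those assumptions the p-value comes out near 0.38, so the split is not distinguishable from a coin toss, which fits the caution above about blanket statements.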

That doesn’t mean that generating lists of genes commonly up- and down-regulated in drug resistance is fruitless or incapable of spurring insight. We’ve given these two lists the database IDs 129091122 and 129092122. In both cases, a gene had to occur at least 7 times (out of 95) to make the list, yielding lists of 163 and 119 genes. 26 genes were found in both lists:

AREG, C1orf24, CA12, CD24, CEACAM6, CXCR4, DUSP6, EMP1, FAM129A, FSTL1, GPNMB, HLA-DRA, HLA-DRB1,
IFI27, IL1A, KYNU, LCN2, OAS1, PEG10, SERPINB2, SERPINE2, SOCS2, STC2, TSPAN8, UCHL1, VCAN
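For anyone who wants to try this sort of tally on their own data, the counting can be sketched in a few lines. This is a toy illustration, not our actual pipeline: the gene names and studies are placeholders, the function name is hypothetical, and the threshold is dropped from 7 to 2 so the tiny example produces output.

```python
from collections import Counter

def recurrent_genes(gene_sets, min_studies):
    """Return genes that appear in at least `min_studies` of the given sets."""
    # set(genes) guards against a gene being counted twice within one study.
    counts = Counter(g for genes in gene_sets for g in set(genes))
    return {g for g, n in counts.items() if n >= min_studies}

# Placeholder data: one set of upregulated and one set of downregulated
# genes per study (three hypothetical studies).
up_studies = [{"GENE_A", "GENE_B", "GENE_C"}, {"GENE_A", "GENE_B"}, {"GENE_A", "GENE_D"}]
down_studies = [{"GENE_A", "GENE_E"}, {"GENE_A", "GENE_E", "GENE_B"}, {"GENE_E"}]

common_up = recurrent_genes(up_studies, min_studies=2)
common_down = recurrent_genes(down_studies, min_studies=2)

# Genes that clear the threshold in BOTH directions, like the 26 listed above.
in_both = common_up & common_down
print(sorted(in_both))
```

In this toy case only GENE_A clears the bar in both directions; it plays the role that OAS1 and its 25 companions play in the real lists.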

Ignoring the fact that multiple genes are found in both lists, what broad categories of genes intersect with these lists? In the case of upregulation, the innate immune response does indeed seem relevant (e.g., genes upregulated early, as opposed to late, in HCMV infection intersect with a P-value of 10^-76). Erlotinib and neo-adjuvant therapy seem to do a fine job of upregulating common drug-resistance genes; if borne out in the lab/clinic, this would have obvious implications for cancer cocktail approaches. On the downregulation side, the metastatic (or not) nature of underlying tissues seems to be relevant. Specifically, transcripts downregulated in aggressively metastatic tissue overlap with transcripts that are downregulated in the case of drug resistance. For far deeper details, just plug either of the above database IDs into our “Fisher” app.

How about creating lists of, say, upregulated genes that weren’t found at all in the downregulated category? We tried that, but were met with discouragement. We found that RAB25 was the single best example of a transcript that was downregulated (in 8 studies), but never upregulated. Searching for validation of this characteristic of RAB25 in specific studies, the first study we stumbled upon was this: “RAB25 confers resistance to chemotherapy by altering mitochondrial apoptosis signaling in ovarian cancer cells.” There, it seems, RAB25 upregulation, not downregulation, correlates with resistance. Hmmmm.

Apparently, the same transcript may be upregulated in one resistance study and downregulated in the next. My background in this field (a couple of distant lectures and/or presentations) informed me that a handful of transporters are the main culprits in drug resistance, and I had hoped that this phenomenon would be obvious once multiple studies were pooled. This is not the case.

One might think that the above conundrum could be resolved by examining particular drugs. That is, certain critical transcripts would always be upregulated in resistance to a particular drug. Cisplatin resistance is the most commonly studied sort of resistance. Here, of 5 genes found in 4 out of 6 of the cisplatin-resistance studies we have on hand, 3 are both upregulated and downregulated, depending on the study: QPCT, SAA1, and MMP1 (TGFB2 is up in all four cases, and ANO1 is always down).

To attempt to clarify matters, we clustered the 190 datasets. Specifically, we performed Fisher’s exact test for each dataset against our entire database, generating millions of P-values. These P-values were the raw data for clustering (Cluster 3.0, k-means). 10 clusters were generated. The top genes found in each cluster are now found in our database with IDs 129137122, 129138122, 129139122, 129140122, 129141122, 129142122, 129143122, 129144122, 129145122, and 129146122.
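For readers unfamiliar with the approach, here is a minimal sketch of how a gene-list overlap turns into a P-value and how a row of such values forms the profile that gets clustered. It uses only the Python standard library. The one-sided (enrichment) form of the test, the universe size, and all gene names are assumptions made for illustration; this is not the code behind our “Fisher” app, and the k-means step itself (done in Cluster 3.0) is not reproduced here.

```python
from math import comb, log10

def fisher_overlap_p(overlap, size_a, size_b, universe):
    """One-sided Fisher's exact test (hypergeometric tail).

    Probability of seeing at least `overlap` shared genes between two
    lists of sizes `size_a` and `size_b` drawn from `universe` genes.
    """
    max_overlap = min(size_a, size_b)
    total = comb(universe, size_b)
    tail = sum(
        comb(size_a, k) * comb(universe - size_a, size_b - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total

def enrichment_profile(query, reference_sets, universe_size):
    """-log10(P) of the query list against every reference list."""
    profile = []
    for ref in reference_sets:
        p = fisher_overlap_p(len(query & ref), len(query), len(ref), universe_size)
        # Floor the P-value so an underflow to 0.0 cannot break log10.
        profile.append(-log10(max(p, 1e-300)))
    return profile

# Toy example: a 10-gene universe, one query list, two reference lists.
universe_size = 10
query = {"G1", "G2", "G3", "G4"}
reference_sets = [{"G1", "G2", "G3", "G9"}, {"G5", "G6", "G7", "G8"}]
profile = enrichment_profile(query, reference_sets, universe_size)
print(profile)
```

Each dataset gets one such profile, and the stack of profiles is what a k-means routine would then partition.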

Though the clusters did not nicely segregate according to drugs or cell types, as one might desire, some clarity was gained. Bearing in mind that both up- and down-regulated transcripts can be found in a single cluster, Cluster 0 transcripts tend to be upregulated on innate immune stimulation (e.g. via interferons). Cluster 3 transcripts tend to be upregulated in the case of metastasis and are enriched for cell-surface markers, while cluster 5 and 8 genes tend to be downregulated in metastatic cells. Cluster 6 genes have a strong tendency to be upregulated in cancer versus adjacent tissue and, rather bizarrely, downregulated on resveratrol treatment (P = 10^-56). Other clusters are more nuanced. Thus, it would appear that investigators might wish to place special relevance on the status of cells with regard to innate immunity and metastasis when considering approaches that might mitigate drug resistance. Depending on the cluster, we see hints that particular drugs could, to some extent, reverse drug resistance: bromodomain inhibitors, noggin, losartan, gefitinib, etc. Other drugs, of course, could enhance drug resistance.

It is sometimes difficult to trust the output of clustering programs, so I performed an eyeball version of clustering in Excel. Give different colors to different significance levels (below, green indicates P < 10^-15), and then sort a column. Gather all columns where colors (indicating significance) percolated to the top. That’ll be cluster 1. Then move on to a column that doesn’t fall into cluster 1. Repeat. Believe it or not, this crude method matched up quite nicely with the software I used. To me, this sort of correspondence between the mathematical perfection of the clustering software and the childish simplicity of matching columns that have the same colors indicates that maybe we shouldn’t spend an excess amount of time/energy debating the merits of, say, “Euclidean distance” vs. “City block distance.” Below is a sliver of the result:


Again, for the fine details, just visit WhatIsMyGene, plug in database IDs (or your own datasets), and have fun.


Note 2/22/2022: We've added quite a few more studies involving resistance to our database. At this point, it's fairly obvious that genes upregulated in cells resistant to one sort of treatment may actually be downregulated on resistance to another treatment. This observation dampens hopes for across-the-board approaches to drug resistance. On the positive side, it may mean that resistance could be dealt with via drug cocktails; i.e. two drugs that trigger opposing resistance patterns could be combined in a treatment. At some point in the future, we'll re-cluster our resistance results. We'll be a bit more rigorous about finding an optimal number of clusters, look a bit deeper into commonalities in these clusters, and perhaps examine cases where 2 "resistance-complementary" drugs might be applied to particular maladies.

Note 9/17/2022: A quote from "Comparative proteomic analysis identifies key metabolic regulators of gemcitabine resistance in pancreatic cancer": "Surprisingly, a number of proteins that were downregulated in MIA-GR8 cells have been reported to promote drug resistance in other cancer types." It's nice to validate our view above, but it's also disappointing to see that many researchers may still be stuck in a one-dimensional view of drug resistance.


whatismygene.com 


Saturday, May 15, 2021

Did Covid-19 Emerge from the Wuhan Institute of Virology?

I’m going to address this topic with a minimum of drama. Go away if you’re a conspiracy buff. Stick around if you’re interested in a frank, somewhat introspective take on this question from a dude (albeit a low-impact dude) who has actually tinkered with viruses. For "safety", I'll spell out my #1 point right here: scientists have knee-jerk responses too. For even more safety, let me also spell out the following at the start: I still find it unlikely that the virus emerged from the Wuhan Institute of Virology.

There are two widely disseminated documents from credentialed authors providing arguments against and for the notion that Covid-19 was lab-generated. On the “against” side, we have a Nature article from March of 2020. On the “for” side, we have Nicholas Wade’s take.

I recall reading the Nature article last year, shaking my head at some of the refutations within, and then moving on to other topics. Wade’s article reminded me of my early skepticism. I’ll affirm two of Wade’s points:

1) The Nature article argues that the absence of a “previously used virus backbone”* within Covid-19 provides evidence that there was no lab-manipulation. Let me say: this is malarkey and, at best, an embarrassment for Nature. I’ve generated “backbone” free viruses myself (on dengue, to be specific). You insert the viral sequence into a plasmid, perform in vitro transcription, and infect cells with the resulting RNA. If you designed the plasmid correctly, there should be no evidence of “backbone.” Even if you erred, the virus may quickly shirk garbagy, non-optimal sequences upon multiple passaging (it can be frustrating to insert “loss of function” mutations into a virus, as the virus might dispense with them surprisingly quickly, if they don’t kill the virus from the very beginning).

In case anyone wishes to nitpick: yes, the 30kb length of coronaviruses makes ordinary plasmid insertion tricky, if not impossible. But there are plenty of methods to generate these long viruses without evidence of a backbone.

It’s hard to believe that the esteemed authors of the Nature article weren’t aware of these viral basics. Why did they choose to offer this lame argument?

2) The argument is made that the spike protein’s interaction with the ACE2 receptor is not optimal; therefore, Covid-19 could not be the product of manipulation.

Again, this is absurd. To accept it, you have to assume that any and all lab experiments involving Covid-19 would involve insertion of the theoretically optimal (for ACE2 binding) spike protein sequence. Here’s an example of an experiment that I would consider interesting: perform some sort of guided evolution to generate a myriad of spike protein sequences, and test them ALL for both ACE2 affinity and infectivity**. Take the “winners” of this process, insert them into the virus, and write a paper. That’s just one of a near infinite number of experiments you could perform.

Let’s imagine that Dr. Evil is indeed behind the Covid-19 pandemic. He, like any competent virologist, would not automatically assume that the virus that best binds ACE2 has the highest potential to wipe out the human race. It wouldn’t surprise me at all to find that such a virus would be severely handicapped, refusing to let go of ACE2 at any step, and unable to perform its various pleiotropic functions.

Again, it’s odd that virologists would even attempt to pass this argument off in a Nature article.

There’s further lameness in the Nature article. For example: some of the mutations in Covid-19 haven’t been mentioned in the literature as yet. The idea, I guess, is that any lab-generated mutations would already have been described. I won’t even bother refuting that.

I have to question at least one of Wade’s other arguments, however. This regards the appearance of a furin cleavage site within the virus. This is supposed to be some sort of smoking gun for lab experimentation. The site is only 4 amino acids long. It’s not easy to estimate the probability that nature would come up with this mutation. Bear in mind that coronaviruses are the absolute champions of a process called “RNA recombination.” Without going into detail, the furin site doesn’t have to emerge via a step-wise series of mutations…it could enter in one fell swoop. Again, if there’s any “garbage” RNA left over from recombination, it could be eliminated quickly via evolution, including further recombination. If a paper attempts to address the furin cleavage site appearance from a probabilistic perspective, be skeptical about the underlying assumptions about what viruses do and don’t do.

On the other hand, everybody in the virology world inserts furin cleavage sites in their viruses and “replicons.” It’s something we do.

So, where do I stand? The most dramatic thing I can say without feeling guilty is this: we’re far from eliminating the possibility of a lab-generated Covid-19. Nothing I’ve seen convinces me that the virus couldn’t have emerged from the lab in Wuhan. Certainly not the Nature commentary.

I’ll take Wade at his word when he says that the Wuhan lab is China’s #1 coronavirus research facility. Rather odd, no? The counter-argument, I guess, might be that Wuhan is an optimal location to study coronaviruses, because that region of China is coronavirus heaven. I don't know.

To be clear, there’s a huge difference between a lab accident and intentional release. I don’t see any reason to assume the latter. How has China emerged from this mess? With an economy that’s not any stronger than anyone else’s, and the clunkiest vaccines on the market. Infections have been minimized in China, but the emergence of variants threatens that. India surprised everyone with a minimum of infections and deaths…last year.

Returning to the question of why Nature published its lame refutation, let me offer a bit of introspection. I don’t want a lab accident to be the cause of Covid-19 and I feel compelled to argue against the possibility. Just as a big-time developer tires of apparently nit-picky regulations covering endangered insect species, virologists don’t want further restrictions on their activities. We feel like we know what we’re doing. I suspect that the authors of the Nature article feel the same.

Finally, if you point a gun at my head and inquire as to the most probable source of Covid-19, I'd have to lean strongly on the side of natural origin. If you've read the above and have concluded I'd think otherwise, sorry to disappoint. There are plenty of arguments to support the natural origin of Covid-19; most of them, unfortunately, are not very accessible to layfolk. Here's the one that I find most difficult to refute: the 97% similarity between Covid-19 and its closest relative, RaTG13, means that Covid-19 diverged from RaTG13 no later than the early 1980s, and probably earlier. Thus, Dr. Evil (or Dr. Carelessness) would need to introduce about 1,000 mutations into RaTG13 over the years. Whether by site-directed mutagenesis, lab passaging, or directed evolution, that's a figure that is nearly unimaginable to virologists. Note also that these sequence differences are spread all over the viral genome; there's no sign that, for example, the spike protein was singled out for special treatment.
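The "about 1,000" figure is back-of-the-envelope arithmetic, sketched below. It assumes the rounded 30kb genome length mentioned earlier in this post and the rounded 97% similarity, and it treats every differing position as a single substitution (no insertions or deletions), so read the result as an order of magnitude rather than a precise count.

```python
# Rounded inputs taken from the text; both are approximations.
GENOME_LENGTH_NT = 30_000   # "30kb" coronavirus genome
PERCENT_IDENTITY = 97       # similarity between the two viruses

# Positions that differ, assuming one substitution per differing position.
differences = GENOME_LENGTH_NT * (100 - PERCENT_IDENTITY) // 100
print(f"Roughly {differences} nucleotide differences")
```

That works out to roughly 900 differences, which is where the "about 1,000 mutations" comes from.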

Given the above, if you're dead-set on blaming the Wuhan Institute, the only remotely possible scenario that I see would be the following: WIV scientists gathered the Covid-19 virus, or something very closely related, and brought it into the lab, whereupon it escaped with few or no mutations. Given that folks have not identified any virus with higher Covid-19 similarity than RaTG13, one might surmise that such a virus may have been collected outside of China. There are indeed studies wherein WIV scientists gathered viruses outside of China (e.g. Africa). Now, if Covid-19 has a natural origin in China, what's more likely: it spread in the chaotic environment of a wet-market, or it spread in the controlled environment of a virology institute? In the case of import from outside of China, one could accuse the WIV of carelessly handling a virus to which the local population may have little immunity. All very speculative, with no evidence at all at this point.

*I note that some metavirology folks use the term "backbone" to refer to the conserved portions of a viral sequence. However, that's not the case in the Nature paper, which points to a paper on Coronavirus construction methods, not broad sequence comparisons, following the term "backbone".

**In fact, a bit of Googling shows that the folks at Wuhan are familiar with SELEX, a method that lets evolution, as opposed to "rational design", determine an experimental outcome. Check out, for example, "A SELEX-Screened Aptamer of Human Hepatitis B Virus RNA Encapsidation Signal Suppresses Viral Replication." To be clear, this particular paper optimizes RNA, not a protein.




A Preprint

It has been a while since we posted. That's largely because of the effort put into generating a paper. Check it out on BioRxiv. This is...