Wednesday, June 8, 2022

Transcripts that are and are not Perturbed in Cancer

We have nearly 2500 transcript sets derived from about 1200 studies involving cancer in human patients. Given this plethora, we’re equipped to ask questions like, “Which genes are rarely perturbed in cancer?” So we asked. The underlying idea is that if a transcript is rarely upregulated or downregulated in cancer, perturbing that transcript might have anti-cancer effects.

Specifically, we calculated the frequencies of all transcripts in our database in studies with human cells (the larger set). We then repeated the operation, this time with the additional requirement that the cells must be derived from cancer tissue (NOT cell lines!). We can then use the binomial distribution to calculate the probability that a particular gene would be over- or under-represented by chance in the cancer set versus the larger set.
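
As a rough illustration of the binomial calculation described above, here is a minimal Python sketch. It is not the code behind the analysis. It assumes that each of a gene's appearances in the larger set lands in the cancer set independently, with probability equal to the cancer set's share of the larger set. The function names and the example numbers (200 appearances, 2 of them in cancer, a cancer share of 0.08) are hypothetical.

```python
from math import exp, lgamma, log


def log_binom_pmf(k, n, p):
    """Log of P(X = k) for X ~ Binomial(n, p), with 0 < p < 1."""
    log_comb = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
    return log_comb + k * log(p) + (n - k) * log(1 - p)


def lower_tail(k, n, p):
    """P(X <= k): chance of seeing k or fewer cancer appearances (under-representation)."""
    return sum(exp(log_binom_pmf(i, n, p)) for i in range(k + 1))


def upper_tail(k, n, p):
    """P(X >= k): chance of seeing k or more cancer appearances (over-representation)."""
    return sum(exp(log_binom_pmf(i, n, p)) for i in range(k, n + 1))


# Hypothetical example: a gene seen 200 times in the larger set,
# only 2 of them in cancer tissue, with cancer making up 8% of the larger set.
n_total = 200
k_cancer = 2
cancer_fraction = 0.08
print(lower_tail(k_cancer, n_total, cancer_fraction))
```

A small lower-tail value flags a transcript as rarely perturbed in cancer; a small upper-tail value flags it as commonly perturbed.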

Jumping right into it, here are the first 25 transcripts that are rarely perturbed:

GADD45A

CHAC1

IFIT2

DDIT3

HBEGF

OASL

MAFF

TXNIP

ID2

INSIG1

IFIT3

PPP1R15A

MX2

ARRDC4

DDX58

NFKBIA

IFIT5

IFIT1

ATF3

DUSP5

ETS1

HMOX1

HERC5

CDKN1A

You may notice that the list is strongly overloaded with genes involved in the innate immune response. All transcripts are significant at P ≤ 10^-60. The champion, GADD45A, appeared 632 times in the larger set, but only 6 times in the cancer set. Cancer hates the innate immune response; one big problem, of course, is inducing the response specifically in cancer cells.

Genes that were never perturbed (not even once) include:

ULK1

MYD88

LGALS9B

CPOX

PPP3CB

GRWD1

PPM1B

DEFA1

ZFYVE26

KIFAP3

PCTP

DCLRE1C

LPPR2

KIAA0355

C9orf47

All of these transcripts were significant at P ≤ 10^-30. Though these significances can’t compete with those in the first, more inclusive list above, bear in mind that these are relatively rare transcripts, meaning that it’s difficult to derive extreme significances in these cases. ULK1 appeared 176 times in the larger set, but not once in the cancer set. The protein, a kinase, is involved in autophagy.

We can also ask, “Which genes are most commonly perturbed in cancer?” This is a bit more mundane, but here are the first 25 genes in the list:

AGR3

ADH1B

DPT

COL10A1

MMP11

PIGR

MYH11

SFRP4

MMP12

ABCA8

C7

OGN

CDH3

SFRP2

COMP

ESR1

FDCSP

NAP1

ASPN

CXCL17

GABRP

THBS2

COL11A1

CYP2B7P

CXCL14

ADH1C

AGR3 appeared 93 times in the cancer set, out of just 239 appearances in the larger set (adjusting for the difference in size between the sets, you could say AGR3 appeared 1182 times in the cancer set). All of the above transcripts were significant at P ≤ 10^-79. These transcripts commonly appear in “cancer vs adjacent tissue” studies; nothing surprising there. It may be a surprise, however, that the perturbation leans fairly strongly toward downregulation in cancer rather than upregulation.
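
The size adjustment in the AGR3 example can be unpacked with a little arithmetic. The sketch below uses only the three counts quoted above (93, 239 and 1182). Reading 1182/93 as the ratio of the two set sizes is an assumption, as are the variable names; this is not a description of the actual pipeline.

```python
# Counts quoted in the text for AGR3.
agr3_cancer = 93           # appearances in the cancer set
agr3_larger = 239          # appearances in the larger set
agr3_cancer_scaled = 1182  # cancer count after adjusting for set size

# Assumption: the scaled count is the raw count multiplied by
# (size of larger set / size of cancer set).
size_ratio = agr3_cancer_scaled / agr3_cancer  # about 12.7
cancer_fraction = 1 / size_ratio               # about 0.079

# If AGR3 had no preference for cancer, this many of its 239
# appearances would be expected to fall in the cancer set.
expected_in_cancer = agr3_larger * cancer_fraction  # about 18.8

# Observed versus expected appearances in cancer.
fold_over_expected = agr3_cancer / expected_in_cancer  # about 4.9

print(size_ratio, cancer_fraction, expected_in_cancer, fold_over_expected)
```

Under that reading, AGR3 turns up in cancer roughly five times more often than its overall frequency would predict.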

We can, of course, further divide the sets. For example, we can restrict the two datasets (the cancer set and the larger set) to transcripts that are upregulated, as opposed to downregulated. Here are transcripts that are rarely upregulated in cancer:

DDIT3

CREBRF

DDX58

CDKN1A

GADD45A

IFIT2

HBEGF

CHAC1

TXNIP

ATF3

MAFF

IFIT5

HERC5

ID2

YPEL3

OASL

GABARAPL1

HMOX1

PPP1R15A

TRIM21

BCL6

MX2

NFKBIA

PARP9

IFIT3

Not surprisingly, the list strongly overlaps with the initial “rarely perturbed in cancer” list. The same applies to the list of transcripts that are rarely downregulated in cancer, which we won’t bother listing here. We will, however, note that when we plug these rarely downregulated transcripts into our Fisher app and select "Dominant Tissue" in the "Cell Type" box, we find that the list is strongly enriched (P = 10^-12) for epithelial cell types, meaning that some degree of specificity can be obtained by targeting some of these genes for downregulation. In various studies, these rarely-down-in-cancer/epithelial-dominant transcripts can be downregulated by STAT1 knockdown, resveratrol treatment, KLHL23 knockdown, carboplatin treatment, TOP1 knockdown, and much more.

Transcripts that are rarely upregulated are also rarely downregulated (P = 10^-25). The list of 33 transcripts found at the intersection of these two sets includes innate immune response factors (e.g. IFIT1/2/3, OASL, MX2). That’s a bit of an odd result; why would cancer not downregulate the transcripts it works so hard to avoid upregulating? We should note a possible weakness in our approach: the large list of perturbed transcripts against which the cancer set was compared was heavy with cell lines (while, of course, our “cancer tissue” data is not derived from cell lines). The larger set also contains infection studies; infections almost inevitably result in an innate immune response. There may be other weaknesses that we’re not aware of. Nevertheless, we note that our “commonly up/down-regulated in cancer” lists overlap strongly with our “commonly up/down-regulated in cancer vs adjacent tissue” lists (with P-values less than 10^-110 for both up and down). The “cancer vs adjacent” lists, of course, do not suffer from any of the aforementioned weaknesses*.
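
The overlap P-values quoted above come from comparing gene lists. One standard way to score such an overlap is a one-sided hypergeometric (Fisher-type) test, sketched below. This is an assumption about the general approach, not a description of how the Fisher app works internally. The function name and every number in the example (universe size, list sizes, overlap) are hypothetical.

```python
from math import comb


def overlap_p_value(universe, size_a, size_b, overlap):
    """P(X >= overlap), where X is the number of list-A genes found in a
    random draw of size_b genes from a universe containing size_a list-A genes."""
    total = comb(universe, size_b)
    max_overlap = min(size_a, size_b)
    tail = sum(
        comb(size_a, i) * comb(universe - size_a, size_b - i)
        for i in range(overlap, max_overlap + 1)
    )
    # Exact integer arithmetic up to the final division.
    return tail / total


# Hypothetical example: two lists of 300 and 250 genes drawn from a
# universe of 20000 genes, sharing 33 genes.
print(overlap_p_value(20000, 300, 250, 33))
```

The choice of universe matters: counting only transcripts that appear in both source datasets, rather than every known gene, gives a more conservative P-value.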

The database IDs for the aforementioned lists are as follows:

transcripts rarely perturbed in human cancer:  143176203

transcripts commonly perturbed in human cancer: 143177203

transcripts rarely upregulated in human cancer:  143178203

transcripts commonly upregulated in human cancer:  143179203

transcripts rarely downregulated in human cancer:  143180203

transcripts commonly downregulated in human cancer:  143181203


*So…why not look for rarely perturbed genes in “cancer vs adjacent” studies? We can do that in the future, but here we wanted the statistical power that comes from tinkering with huge datasets.

whatismygene.com 


