We have nearly 2500 transcript sets derived from about 1200 studies involving cancer in human patients. With that much data, we’re equipped to ask questions like, “Which genes are rarely perturbed in cancer?” So we did. The underlying idea is that if a transcript is rarely upregulated or downregulated in cancer, perturbing that transcript may have anti-cancer effects.
Specifically, we counted how often each transcript in our database appears in studies with human cells. We then repeated the count with the additional requirement that the cells be derived from cancer tissue (NOT cell lines!). We can then use the binomial distribution to calculate the probability that a particular gene would be over- or under-represented in the cancer set, relative to the larger set, by chance alone.
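To make the binomial step concrete, here is a minimal sketch of how such a tail probability could be computed. It is not our production code. The counts are the GADD45A figures quoted in this post (632 appearances in the larger set, 6 in the cancer set); the function names and, in particular, the value of `assumed_cancer_share` are hypothetical, chosen only for illustration, so the printed number should not be expected to match the P-values reported here.

```python
from math import exp, lgamma, log


def log_binom_pmf(k, n, p):
    """Natural log of P(X = k) for X ~ Binomial(n, p), with 0 < p < 1."""
    log_choose = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
    return log_choose + k * log(p) + (n - k) * log(1 - p)


def lower_tail(k, n, p):
    """P(X <= k): chance of k or fewer cancer appearances (under-representation)."""
    return sum(exp(log_binom_pmf(i, n, p)) for i in range(k + 1))


def upper_tail(k, n, p):
    """P(X >= k): chance of k or more cancer appearances (over-representation)."""
    return sum(exp(log_binom_pmf(i, n, p)) for i in range(k, n + 1))


# Counts for GADD45A as quoted in this post.
total_appearances = 632
cancer_appearances = 6

# ASSUMPTION: hypothetical probability that any single appearance of a
# transcript falls in the cancer set. The real value depends on the
# relative sizes of the two sets and is not stated in this post.
assumed_cancer_share = 0.10

p_under = lower_tail(cancer_appearances, total_appearances, assumed_cancer_share)
print(f"P(X <= {cancer_appearances}) = {p_under:.3g}")
```

For a transcript that never appears in the cancer set, the lower tail reduces to (1 - p)^n, which is why transcripts with few appearances overall cannot reach extreme significance.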
Jumping right into it, here are the first 25 transcripts that are rarely perturbed:
GADD45A
CHAC1
IFIT2
DDIT3
HBEGF
OASL
MAFF
TXNIP
ID2
INSIG1
IFIT3
PPP1R15A
MX2
ARRDC4
DDX58
NFKBIA
IFIT5
IFIT1
ATF3
DUSP5
ETS1
HMOX1
HERC5
CDKN1A
You may notice that the list is heavily loaded with genes involved in the innate immune response. All transcripts are significant at a level no larger than P = 10^-60. The champion, GADD45A, appeared 632 times in the larger set but only 6 times in the cancer set. Cancer hates the innate immune response; one big problem, of course, is inducing the response specifically in cancer cells.
Genes that were never perturbed (not even once) include:
ULK1
MYD88
LGALS9B
CPOX
PPP3CB
GRWD1
PPM1B
DEFA1
ZFYVE26
KIFAP3
PCTP
DCLRE1C
LPPR2
KIAA0355
C9orf47
All of these transcripts were at least as significant as P = 10^-30. Though these significances can’t compete with those in the first, more inclusive list above, bear in mind that these are relatively rare transcripts, which makes extreme significances hard to reach. ULK1 appeared 176 times in the larger set, but not once in the cancer set. The protein, a kinase, is involved in autophagy.
We can also ask, “Which genes are most commonly perturbed in cancer?” This is a bit more mundane, but here are the first 25 genes in the list:
AGR3
ADH1B
DPT
COL10A1
MMP11
PIGR
MYH11
SFRP4
MMP12
ABCA8
C7
OGN
CDH3
SFRP2
COMP
ESR1
FDCSP
NAP1
ASPN
CXCL17
GABRP
THBS2
COL11A1
CYP2B7P
CXCL14
ADH1C
AGR3 appeared 93 times in the cancer set, against only 239 times in the larger set (adjusting for the size of the sets, you could say AGR3 appeared 1182 times in the cancer set). All of the above transcripts were significant at a level no greater than P = 10^-79. The transcripts commonly appear in “cancer vs adjacent tissue” studies; nothing surprising there. It may be a surprise, however, that the perturbation leans fairly strongly toward downregulation in cancer rather than upregulation.
We can, of course, further divide the sets. For example, we can restrict the two datasets (cancer and non-cancer) to transcripts that are upregulated, as opposed to downregulated. Here are transcripts that are rarely upregulated:
DDIT3
CREBRF
DDX58
CDKN1A
GADD45A
IFIT2
HBEGF
CHAC1
TXNIP
ATF3
MAFF
IFIT5
HERC5
ID2
YPEL3
OASL
GABARAPL1
HMOX1
PPP1R15A
TRIM21
BCL6
MX2
NFKBIA
PARP9
IFIT3
Not surprisingly, the list strongly overlaps with the initial “rarely perturbed in cancer” list. The same applies to the list of transcripts that are rarely downregulated in cancer, which we won’t bother listing here. We will, however, note that when we plug these rarely downregulated transcripts into our Fisher app and select "Dominant Tissue" in the "Cell Type" box, the list is strongly enriched (P = 10^-12) for epithelial cell types, meaning that some degree of specificity can be obtained by targeting some of these genes for downregulation. In various studies, these rarely-down-in-cancer/epithelial-dominant transcripts can be downregulated by STAT1 knockdown, resveratrol treatment, KLHL23 knockdown, carboplatin treatment, TOP1 knockdown, and much more.
Transcripts that are rarely upregulated are also rarely downregulated (P = 10^-25). The list of 33 transcripts found at the intersection of these two sets includes innate immune response factors (e.g. IFIT1/2/3, OASL, MX2). That’s a bit of an odd result; why would cancer not downregulate the transcripts it works so hard to avoid upregulating? We should note a possible weakness in our approach: the large list of perturbed transcripts against which the cancer set was compared was heavy with cell lines (while, of course, our “cancer tissue” data is not derived from cell lines). The larger set also contains infection studies; infections almost inevitably result in an innate immune response. There may be other weaknesses that we’re not aware of. Nevertheless, we note that our “commonly up/down-regulated in cancer” lists overlap strongly with our “commonly up/down-regulated in cancer vs adjacent tissue” lists (with P-values less than 10^-110 for both up and down). The “cancer vs adjacent” lists, of course, do not suffer from any of the aforementioned weaknesses*.
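For readers curious how a list-overlap P-value of this kind can be obtained, here is a minimal sketch of a one-sided hypergeometric test, which is equivalent to a one-sided Fisher exact test on the 2x2 overlap table. It is an illustration only, not a description of our app's internals. The overlap of 33 comes from the text above; the universe size and the two list sizes are hypothetical placeholders, so the printed value is not meant to reproduce the P-values quoted in this post.

```python
from math import exp, lgamma


def log_comb(n, k):
    """Natural log of the binomial coefficient C(n, k), for 0 <= k <= n."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def overlap_p_value(universe, size_a, size_b, overlap):
    """P(shared items >= overlap) if list B were drawn at random.

    universe: number of transcripts that could appear in either list
    size_a, size_b: lengths of the two lists
    overlap: observed number of transcripts present in both lists
    """
    smallest = max(0, size_a + size_b - universe)  # overlap forced by list sizes
    largest = min(size_a, size_b)                  # overlap cannot exceed either list
    log_total = log_comb(universe, size_b)
    return sum(
        exp(log_comb(size_a, k)
            + log_comb(universe - size_a, size_b - k)
            - log_total)
        for k in range(max(overlap, smallest), largest + 1)
    )


# ASSUMPTION: hypothetical universe and list sizes, for illustration only.
# Only the overlap of 33 is taken from the text.
p_overlap = overlap_p_value(universe=20000, size_a=500, size_b=500, overlap=33)
print(f"P(overlap >= 33) = {p_overlap:.3g}")
```

The smaller the two lists are relative to the universe, the more surprising a given overlap becomes, which is why the same number of shared transcripts can yield very different P-values.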
The database IDs for the aforementioned lists are as follows:
transcripts rarely perturbed in human cancer: 143176203
transcripts commonly perturbed in human cancer: 143177203
transcripts rarely upregulated in human cancer: 143178203
transcripts commonly upregulated in human cancer: 143179203
transcripts rarely downregulated in human cancer: 143180203
transcripts commonly downregulated in human cancer: 143181203
*So…why not look for rarely perturbed genes in “cancer vs adjacent” studies? We can do that in the future, but here we wanted the statistical power that comes from tinkering with huge datasets.