Wednesday, February 17, 2021

Underrated Genes

Nature magazine recently published a study of...biological studies. A number of questions were asked, one of them being, “which genes are most represented in the literature?” Not surprisingly, TP53 is the champion, with 9,232 publications. It’s a good read.

A question not addressed is, “What are the most under-represented genes in the literature?” Of course, it’s trivial to find genes that have no mentions at all. What we can do, however, is use our own database and ask, “Which genes have the largest disparity between inclusions in our database and inclusions in the literature?” The exercise is simple on its face, but there are a number of technicalities that make it a bit tricky. If we were writing an academic paper, we’d have to do 100X the work we’re putting into this post. Basically, though, our procedure works like this:

1. Download a list of genes ranked according to literature mentions.
2. Convert these gene IDs into the format used in the database.
3. Generate a frequency table of all genes in our database.
4. Compare the frequencies in our database against the frequencies in the literature.

The list of genes according to literature mentions is found here: ftp://ftp.ncbi.nih.gov/gene/GeneRIF/generifs_basic.gz
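For readers who want to try something similar, here is a rough sketch of the counting and comparison steps. It is not the code we ran. It assumes the GeneRIF file is tab-separated with the tax ID, Gene ID, and a comma-separated PubMed ID list in the first three columns (check the header line of your own download), and it scores the disparity as a log ratio with a pseudocount of 1, which is simply one reasonable choice. The names count_literature_mentions, rank_underrated, and id_to_symbol are made up for this illustration.

```python
import gzip
import math

HUMAN_TAX_ID = "9606"  # NCBI taxonomy ID for Homo sapiens


def count_literature_mentions(path, tax_id=HUMAN_TAX_ID, id_to_symbol=None):
    """Count distinct PubMed IDs per gene in a generifs_basic.gz-style file.

    Assumed column order: tax ID, Gene ID, comma-separated PubMed IDs, ...
    `id_to_symbol` is an optional, hypothetical Gene ID -> symbol mapping.
    """
    pmids_by_gene = {}
    with gzip.open(path, "rt", encoding="utf-8", errors="replace") as handle:
        for line in handle:
            fields = line.rstrip("\n").split("\t")
            # Skip the header line, malformed rows, and other species.
            if line.startswith("#") or len(fields) < 3 or fields[0] != tax_id:
                continue
            gene = fields[1]
            if id_to_symbol:
                gene = id_to_symbol.get(gene, gene)
            pmids_by_gene.setdefault(gene, set()).update(fields[2].split(","))
    return {gene: len(pmids) for gene, pmids in pmids_by_gene.items()}


def rank_underrated(db_counts, lit_counts, top=100):
    """Rank genes by log2((database + 1) / (literature + 1)), largest first."""
    scores = {
        gene: math.log2((db_n + 1) / (lit_counts.get(gene, 0) + 1))
        for gene, db_n in db_counts.items()
    }
    return sorted(scores, key=scores.get, reverse=True)[:top]
```

Both dictionaries passed to rank_underrated must use the same kind of identifier, which is why the ID conversion step comes before the comparison.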

With the understanding that there are a number of ways in which results can be skewed, here’s a list of the most underrated players in the genomic universe:

RTP4

VSIG2

CLIC6

FAM198B

HIST1H2BD

MT1L

MOXD1

CENPK

ANKRD37

CMBL

PLBD1

TUBA1C

ARHGAP11A

TMEM154

HIST1H2BI

NMES1

TMEM140

PKIA

ADGRL2

KBTBD11

NT5DC2

C15orf15

RSL24D1

RPL27A

FAM49A

PGM5

RGL1

CLMN

EVI2A

TFEC

RPL18A

RPL21

SRM

CALML4

OLFML2A

RPS8

ENDOD1

KDELR3

RPS11

GNG4

TMEM56

SH3BGRL2

CIART

ENPP5

GBP6

RSRP1

COX6A2

GPRIN3

GPRC5C

TMEM71

NRIP3

MFAP3L

CPNE2

ABLIM3

SMIM14

HIST1H2BM

SLC46A3

EVI2B

PCP4L1

TRNP1

GBP4

SLC16A14

RBP7

SLFN13

FAM84A

RAPGEF5

TM6SF1

NSG2

VAT1L

EPPK1

RPL27

DNAJA4

PGAM2

TTC39C

TRANK1

GBP7

N4BP2L2

MEGF6

CDH19

FIBIN

TINAGL1

CCDC3

LONRF2

DDX60L

MXRA7

GPR137B

CENPV

GNG12

CCDC85A

GRAMD3

FAM105A

STRBP

ZNF608

KIAA1551

LRRC2

UAP1L1

MEGF9

EPB41L4A

PLEKHA4

METTL7B


RTP4 is the champion, with few mentions in the literature but more than 700 appearances in our database. Googling RTP4, it seems that there’s no dearth of studies on this gene, but we’re sticking with the above NIH list of literature mentions. Next on the list is VSIG2. A Google search does seem to indicate that nobody cares about this sad gene. It’s hard to even get a clue as to its function.* Nevertheless, it appears 699 times in the database; perturb a cell and there’s a decent chance you’ll alter VSIG2 expression.

We ran a Fisher analysis of an extended, 500-ID list of undervalued genes against our entire database. As might be expected, there’s no massive enrichment for any particular group. There does seem to be a tendency for genes with short transcripts and genes that are depleted in P-bodies to be represented on the list (unadjusted log(P-values) of -7.5 and -5.8). Eyeballing the list, a number of ribosomal proteins can be seen. Perhaps folks view the ribosome as a big unified glob, and don’t care to tinker with its individual components.
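For anyone curious what a Fisher test on two gene lists boils down to, here is a minimal sketch using only the Python standard library. It computes the one-sided P-value as the upper tail of the hypergeometric distribution and reports it as log10, which is an assumption, since we don't spell out the log base above. The function name overlap_log10_p and the universe_size parameter are made up for this illustration, and this is not necessarily how our Fisher app does the calculation.

```python
import math


def overlap_log10_p(list_a, list_b, universe_size):
    """log10 of the one-sided Fisher exact P-value for the overlap of two lists.

    Assumes both lists are drawn from one universe of `universe_size` genes.
    """
    a, b = set(list_a), set(list_b)
    k = len(a & b)  # observed overlap
    n_a, n_b = len(a), len(b)
    # Number of ways to draw n_a genes that hit list b at least k times.
    tail = sum(
        math.comb(n_b, i) * math.comb(universe_size - n_b, n_a - i)
        for i in range(k, min(n_a, n_b) + 1)
    )
    total = math.comb(universe_size, n_a)
    # Subtract logs of exact integers so tiny P-values do not underflow.
    return math.log10(tail) - math.log10(total)
```

The result is always zero or negative, and a more negative value means a less likely overlap. The value is unadjusted, so testing one list against thousands of others calls for a multiple-testing correction on top.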

The opposite task, that of generating a list of “overrated” genes, is even trickier, and we won’t bother with it here. In the end, genes like TP53 would dominate the list and, given TP53’s role in cancer, labeling it “overrated” or “overstudied” would hardly be fair.

 

*Let’s say you want to know about VSIG2’s function. You can use our tools. First, you enter VSIG2 into our Coregulation app. You’ll get a list of coregulated genes. Take that list and enter it into the Fisher app. To spare you this minimal trouble, the swarm of genes with which VSIG2 is coexpressed looks to be hugely involved in the cell cycle, altered by a large array of common drugs (e.g. glucosamine), and also relevant to viral infections. Using the coregulation tool alone, individual genes that are strongly coexpressed with VSIG2 include TRIB3, CHAC1, ASNS, and many more. You can also note that CA9, FAM111B, NREP, and more have a fairly strong tendency to be expressed in the opposite direction to VSIG2 (i.e. when VSIG2 is up, CA9 tends to be down).


whatismygene.com 


Tuesday, February 9, 2021

Alzheimer's According to Various Brain Tissues

Below is a table of all individual studies that contributed to our "canonically altered in the Alzheimer's brain" lists. The numbers show log(P-values) for the intersections between the canonical upregulated and downregulated lists against individual studies. If this were an academic paper, a referee might (justifiably) complain about our method; technically, one shouldn't intersect a compendium of studies with studies from which the compendium is derived and go on to derive P-values. A better approach would be to remove all contributions of a particular study from the canonical lists, and then perform the Fisher test. This exercise would have to be repeated for every study, which is a lot of work. Practically speaking, however, we entered such a large volume of studies into these canonical lists that the P-values won't be exaggerated to any great extent.
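The "better approach" is easy enough to sketch, even if we didn't run it. The snippet below rebuilds a canonical list with one study held out, so that each study can be tested against a list it didn't contribute to. How a gene qualifies as canonical is an assumption here (it must appear in at least min_studies studies), and canonical_list and leave_one_out are names made up for this illustration.

```python
from collections import Counter


def canonical_list(studies, min_studies=2, exclude=None):
    """Genes reported in at least `min_studies` studies, optionally skipping one.

    `studies` maps a study name to its list of genes (one direction only).
    """
    counts = Counter()
    for name, genes in studies.items():
        if name != exclude:
            counts.update(set(genes))  # count each gene once per study
    return {gene for gene, n in counts.items() if n >= min_studies}


def leave_one_out(studies, min_studies=2):
    """Pair each study's genes with a canonical list built without that study."""
    return {
        name: (set(genes), canonical_list(studies, min_studies, exclude=name))
        for name, genes in studies.items()
    }
```

Each pair can then go into a Fisher test. The upregulated and downregulated lists would be handled separately.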

I've broken down the studies according to brain regions. Hopefully, my inadequacy in basic neuroanatomy won't be too obvious (note that I bundled everything with the terms "cortex" and "cortical" into a single group). The goal is to ascertain whether Alzheimer's, as defined by the "canonical" lists, is particularly potent in particular regions. One could also ask if Alzheimer's trends reverse in particular tissues of Alzheimer's patients.


| region | study | canonical up, log(P) | canonical down, log(P) |
| --- | --- | --- | --- |
| all | downregulated (MS) in Alzheimer's brain (vs controls)(fc/p: s: Large-scale proteomic analysis of Alzheimer’s disease brain) | 0.91 | -3.98 |
| all | downregulated in Alzheimer's brain (all regions incl)(fc/p: GSE5281: anatomically and functionally distinct regions of the normal aged) | 0.21 | -25.4 |
| all | downregulated in Alzheimer's brain (all regions incl)(fc/p: GSE118553: asymptomatic and symptomatic alzheimer brains) | 0.21 | -18.17 |
| all | downregulated in Alzheimer's disease in three brain tissues (GSE131617) | -2.64 | -10.51 |
| all | upregulated (MS) in Alzheimer's brain (vs controls)(fc/p: s: Large-scale proteomic analysis of Alzheimer’s disease brain) | -1.45 | -0.75 |
| all | upregulated in Alzheimer's brain (all regions incl)(fc/p: GSE5281: anatomically and functionally distinct regions of the normal aged) | -16.78 | 0.45 |
| all | upregulated in Alzheimer's brain (all regions incl)(fc/p: GSE118553: asymptomatic and symptomatic alzheimer brains) | -20.18 | 0.45 |
| all | upregulated in Alzheimer's disease in three brain tissues (GSE131617) | -2.85 | 0.48 |
| astrocytes | downregulated in alzheimer's astrocytes (with c9orf72 mutations) vs. healthy astrocytes (raw data fc/p: GSE142730) | -0.51 | -6.73 |
| astrocytes | downregulated in astrocytes of alzheimer's patients (single nucleus sequencing) (s3: endothelial cells and neuroprotective glia in Alzheimer’s) | 0.52 | -3.65 |
| astrocytes | downregulated in astrocytes of alzheimer's patients at braak stage III or greater (GSE29652: astrocyte transcriptome in the aging brain) | 0.22 | -20.52 |
| astrocytes | upregulated in alzheimer's astrocytes (with c9orf72 mutations) vs. healthy astrocytes (raw data fc/p: GSE142730) | -1.11 | 0.49 |
| astrocytes | upregulated in astrocytes of alzheimer's patients (single nucleus sequencing) (s3: endothelial cells and neuroprotective glia in Alzheimer’s) | -11.29 | 0.59 |
| astrocytes | upregulated in astrocytes of alzheimer's patients at braak stage III or greater (GSE29652: astrocyte transcriptome in the aging brain) | -2.86 | 0.48 |
| cerebellum | downregulated in cerebellum of "asymptomatic alzheimers" vs normal (GSE118553) | -1.13 | -4.9 |
| cerebellum | upregulated in cerebellum of "asymptomatic alzheimers" vs normal (GSE118553) | -7.57 | 0.43 |
| choroid plexus | downregulated in choroid plexus of alzheimers patients (GSE61196) | 0.19 | -1.91 |
| choroid plexus | upregulated in choroid plexus of alzheimers patients (GSE61196) | -1.21 | -0.43 |
| cortex | downregulated (MS) in frontal cortex of Alzheimer's patients (s4: fc/p: proteomic analysis of the frontal cortex in Alzheimer’s) | 0.32 | -4.58 |
| cortex | downregulated in alzheimer's cortex (s3: fc/p: link between amyloidosis and neuroinflammation) | 0.22 | -12.37 |
| cortex | downregulated in cortex of Alzheimer's patients (fc/p: GSE15222: brain transcript expression in Alzheimer disease) | 0.26 | -43.23 |
| cortex | downregulated in dorsolateral prefrontal cortex of alzheimer's patients (raw data w/ttest: GSE53697: ELAV-like protein binding to coding and non-coding) | -0.74 | 0.22 |
| cortex | downregulated in entorhinal cortex of alzheimer's patients (fc/p: GSE48350: normal brain aging are sexually dimorphic) | 0.22 | -1.14 |
| cortex | downregulated in entorhinal cortex of alzheimer's patients (GSE26972: loss of hnRNP-A/B in Alzheimer's disease) | 0.12 | -53.23 |
| cortex | downregulated in frontal cortex of Alzheimer patients (GSE36980: Altered expression of diabetes-related genes in Alzheimer's disease brains) | 0.2 | -48.11 |
| cortex | downregulated in frontal cortex synaptoneurosome of Alzheimer's patients (fc/p: GSE12685: synaptoneurosomes identifies neuroplasticity genes overexpressed) | -0.45 | 0.5 |
| cortex | downregulated in neocortex of alzheimer's patients (GSE37263: gene expression in the neocortex of Alzheimer's) | 0.24 | -96.02 |
| cortex | downregulated in neocortex of alzheimer's patients (GSE37264: alternative splicing in Alzheimer's disease) | 0.14 | -66.08 |
| cortex | downregulated in parietal cortex of Alzheimer's patients (GSE16759: miRNA and mRNA expression in Alzheimer's disease) | -1.21 | -18.78 |
| cortex | downregulated in prefrontal cortex of alzheimer's patients (fc/p: GSE33000: human prefrontal cortex underlies two neurodegenerative) | 0.19 | -29.5 |
| cortex | downregulated in temporal and frontal cortex of Alzheimer's patients (GSE139384: Pathomechanism of Kii ALS/PDC) | -3.62 | 0.69 |
| cortex | downregulated in temporal cortex of alzheimers patients (vs. vascular dementia patients)(GSE122063) | 0.17 | -32.84 |
| cortex | upregulated (MS) in frontal cortex of Alzheimer's patients (s4: fc/p: proteomic analysis of the frontal cortex in Alzheimer’s) | -2.19 | 0.77 |
| cortex | upregulated in alzheimers (frontal pole and occipital cortex)(GSE84422: regional vulnerability to Alzheimer's disease) | -26.42 | 0.41 |
| cortex | upregulated in alzheimer's cortex (s3: fc/p: link between amyloidosis and neuroinflammation) | -4.97 | 0.49 |
| cortex | upregulated in cortex of Alzheimer's patients (fc/p: GSE15222: brain transcript expression in Alzheimer disease) | -13.67 | 0.59 |
| cortex | upregulated in entorhinal cortex of alzheimer's patients (fc/p: GSE48350: normal brain aging are sexually dimorphic) | -13.4 | -0.7 |
| cortex | upregulated in entorhinal cortex of alzheimer's patients (GSE26972: loss of hnRNP-A/B in Alzheimer's disease) | -16.26 | 0.19 |
| cortex | upregulated in frontal cortex of Alzheimer patients (GSE36980: Altered expression of diabetes-related genes in Alzheimer's disease brains) | -25.66 | 0.28 |
| cortex | upregulated in frontal cortex synaptoneurosome of Alzheimer's patients (fc/p: GSE12685: synaptoneurosomes identifies neuroplasticity genes overexpressed) | 0.31 | -4.64 |
| cortex | upregulated in neocortex of alzheimer's patients (GSE37263: gene expression in the neocortex of Alzheimer's) | -30.18 | 0.44 |
| cortex | upregulated in neocortex of alzheimer's patients (GSE37264: alternative splicing in Alzheimer's disease) | -15.5 | -0.45 |
| cortex | upregulated in parietal cortex of Alzheimer's patients (GSE16759: miRNA and mRNA expression in Alzheimer's disease) | -9.24 | -1.83 |
| cortex | upregulated in prefrontal cortex of alzheimer's patients (fc/p: GSE33000: human prefrontal cortex underlies two neurodegenerative) | -19.04 | 0.42 |
| cortex | upregulated in temporal and frontal cortex of Alzheimer's patients (GSE139384: Pathomechanism of Kii ALS/PDC) | 0.78 | -6.02 |
| cortex | upregulated in temporal cortex of alzheimers patients (vs. vascular dementia patients)(GSE122063) | -2.48 | 0.3 |
| endothelial | downregulated in prefrontal cortical endothelial cells of alzheimer's patients (single nucleus sequencing) (s3: endothelial cells and neuroprotective glia in Alzheimer’s) | 0.09 | -4.14 |
| endothelial | upregulated in prefrontal cortical endothelial cells of alzheimer's patients (single nucleus sequencing) (s3: endothelial cells and neuroprotective glia in Alzheimer’s) | -3.67 | 1.25 |
| gray matter | downregulated in incipient Alzheimers gray matter (Microarray analyses of laser-captured hippocampus) | -3.24 | -0.8 |
| gray matter | downregulated in severe Alzheimers gray matter (GSE28146: Microarray analyses of laser-captured hippocampus) | 0.16 | -5.83 |
| gray matter | upregulated in incipient Alzheimers gray matter (Microarray analyses of laser-captured hippocampus) | 0.17 | 0.39 |
| gray matter | upregulated in severe Alzheimers gray matter (GSE28146: Microarray analyses of laser-captured hippocampus) | -0.57 | 0.38 |
| hippocampus | downregulated in hippocampus of alzheimer's patients (fc/p: GSE29378: cell type changes in Alzheimer's disease) | 0.21 | -4.02 |
| hippocampus | downregulated in hippocampus of alzheimer's patients (raw data fc/p: GSE67333: Alzheimer's Disease Reveal Neurovascular Defects) | 0.41 | -4.44 |
| hippocampus | downregulated in hippocampus of alzheimer's patients (raw data w/ttest (p<.01): GSE113524: autism/intellectual disability somatic mutations in Alzheimer's brains) | 0.06 | 0.13 |
| hippocampus | downregulated in hippocampus of severe Alzheimer's patients (Incipient Alzheimer's disease: microarray correlation analyses) | 0.19 | -34.13 |
| hippocampus | upregulated in hippocampus of alzheimer's patients (fc/p: GSE29378: cell type changes in Alzheimer's disease) | -37.91 | -0.74 |
| hippocampus | upregulated in hippocampus of alzheimer's patients (raw data fc/p: GSE67333: Alzheimer's Disease Reveal Neurovascular Defects) | -0.43 | 0.59 |
| hippocampus | upregulated in hippocampus of alzheimer's patients (raw data w/ttest (p<.01): GSE113524: autism/intellectual disability somatic mutations in Alzheimer's brains) | 0.21 | 0.46 |
| hippocampus | upregulated in hippocampus of severe Alzheimer's patients (Incipient Alzheimer's disease: microarray correlation analyses) | -4.23 | 0.41 |
| microglia | downregulated in prefrontal cortical microglia of alzheimer's patients (single nucleus sequencing) (s3: endothelial cells and neuroprotective glia in Alzheimer’s) | 0.46 | 0.64 |
| microglia | upregulated in prefrontal cortical microglia of alzheimer's patients (single nucleus sequencing) (s3: endothelial cells and neuroprotective glia in Alzheimer’s) | -0.46 | 0.63 |
| neuron | downregulated in excitatory neurons of alzheimer's patients (S2: fc/p: Single-cell transcriptomic analysis of Alzheimer's disease) | -0.48 | -21.72 |
| neuron | downregulated in prefrontal cortical excitatory neurons of alzheimer's patients (single nucleus sequencing) (s3: endothelial cells and neuroprotective glia in Alzheimer’s) | -0.43 | -8.22 |
| neuron | downregulated in prefrontal cortical inhibitory neurons of alzheimer's patients (single nucleus sequencing) (s3: endothelial cells and neuroprotective glia in Alzheimer’s) | 0.04 | -0.78 |
| neuron | upregulated in excitatory neurons of alzheimer's patients (S2: fc/p: Single-cell transcriptomic analysis of Alzheimer's disease) | -0.48 | -1.47 |
| neuron | upregulated in prefrontal cortical excitatory neurons of alzheimer's patients (single nucleus sequencing) (s3: endothelial cells and neuroprotective glia in Alzheimer’s) | 0.45 | 1.27 |
| neuron | upregulated in prefrontal cortical inhibitory neurons of alzheimer's patients (single nucleus sequencing) (s3: endothelial cells and neuroprotective glia in Alzheimer’s) | 0.04 | 0.09 |
| nscs | downregulated in neural progenitor cells from sporadic alzheimer's patients (GSE117586: iPSC Models of Alzheimer's Disease) | -0.51 | -12.37 |
| nscs | upregulated in neural progenitor cells from sporadic alzheimer's patients (GSE117586: iPSC Models of Alzheimer's Disease) | 0.22 | -5.74 |
| olfactory | downregulated in olfactory of advanced alzheimers patients (vs. control and initial disease)(GSE93885: olfactory bulb transcriptome during Alzheimer's disease evolution) | -0.55 | -8.59 |
| olfactory | upregulated in olfactory of advanced alzheimers patients (vs. control and initial disease)(GSE93885: olfactory bulb transcriptome during Alzheimer's disease evolution) | -5.57 | -0.43 |
| oligodendrocyte | downregulated in prefrontal cortical oligodendrocytes of alzheimer's patients (single nucleus sequencing) (s3: endothelial cells and neuroprotective glia in Alzheimer’s) | 0.39 | -0.75 |
| oligodendrocyte | upregulated in prefrontal cortical oligodendrocytes of alzheimer's patients (single nucleus sequencing) (s3: endothelial cells and neuroprotective glia in Alzheimer’s) | -1.38 | 0.54 |
| posterior cingulate | downregulated in posterior cingulate of early onset alzheimer's patients (fc/p: GSE39420: sporadic and monogenic early-onset Alzheimer's disease) | 0.2 | -40.22 |
| posterior cingulate | upregulated in posterior cingulate of early onset alzheimer's patients (fc/p: GSE39420: sporadic and monogenic early-onset Alzheimer's disease) | -20.14 | 0.45 |
| temporal gyrus | downregulated in middle temporal gyrus of alzheimers patients (GSE132903: Alzheimer's Disease Middle Temporal Gyrus: Importance) | 0.2 | -123.42 |
| temporal gyrus | downregulated in temporal gyrus of alzheimer patients (GSE109887: Alzheimer's disease-associated (hydroxy)methylomic) | 0.2 | -122.84 |
| temporal gyrus | upregulated in middle temporal gyrus of alzheimers patients (GSE132903: Alzheimer's Disease Middle Temporal Gyrus: Importance) | -38.49 | 0.49 |
| temporal gyrus | upregulated in temporal gyrus of alzheimer patients (GSE109887: Alzheimer's disease-associated (hydroxy)methylomic) | -43.17 | 0.4 |
| temporal lobe | downregulated in lateral temporal lobe of Alzheimer's patients (vs healthy elderly)(raw data w/ttest (p<.001): GSE104704: landscape of normal aging in Alzheimer's disease) | 0.22 | -20.52 |
| temporal lobe | upregulated in lateral temporal lobe of Alzheimer's patients (vs healthy elderly)(raw data w/ttest (p<.001): GSE104704: landscape of normal aging in Alzheimer's disease) | -3.87 | 0.49 |



Initially, I included the 5 groups, both up- and down-regulated (10 more columns), from the previously mentioned Alzheimer's clustering study. It generated a confusing, difficult-to-post mess, so I simplified. In general, these 10 groups intersected the individual studies as might be expected, with groups B1 and B2 often reversing the trends seen above.

To answer the initial question: the up/down-regulation patterns in various tissues strongly tended to conform to up/down-regulation in our canonical lists. In other words, these canonical fingerprints don't strongly depend on tissues. Particularly strong intersections were seen between the broad canonical lists and two temporal gyrus studies. The posterior cingulate and general cortex also showed nice P-values.

Red text signifies cases where the direction (+ or -) of the canonical list and that of the individual study are not the same (e.g. intersection of the canonically upregulated list with transcripts downregulated in a particular study generates an interesting P-value). There are only four of these cases. Most interesting, in my opinion, is the case in which transcripts upregulated in Alzheimer's neural stem cells significantly overlap canonically downregulated Alzheimer's transcripts. Is it possible that the levels of particular transcripts (and their accompanying proteins) in NSCs generate a reverse signature in surrounding tissues?
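If you'd rather flag these reversed cases programmatically than by eye, something like the sketch below would do. It assumes each table row is a (study, up log(P), down log(P)) tuple whose study label starts with "upregulated" or "downregulated". The cutoff for an "interesting" P-value isn't defined in this post, so the threshold is left for the caller to supply, and direction_conflicts is a name made up for this illustration.

```python
def direction_conflicts(rows, threshold):
    """Return studies that overlap the opposite-direction canonical list.

    `rows` holds (study, up_log_p, down_log_p) tuples. A row is flagged when
    the log(P) against the opposite canonical list is at or below `threshold`.
    """
    flagged = []
    for study, up_log_p, down_log_p in rows:
        direction = study.split()[0].lower()
        if direction == "upregulated":
            opposite = down_log_p  # up in the study vs. canonical down list
        elif direction == "downregulated":
            opposite = up_log_p  # down in the study vs. canonical up list
        else:
            continue  # label does not state a direction
        if opposite <= threshold:
            flagged.append(study)
    return flagged
```

With an arbitrary threshold of -3, the neural progenitor row from the table (up 0.22, down -5.74) is flagged, while the middle temporal gyrus rows are not.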

A number of tissues do not strongly overlap with the canonical lists: oligodendrocytes, the choroid plexus, and microglia in particular. One should not naively conclude that such tissues are irrelevant to Alzheimer's...such tissues might be especially relevant.

In general, we can't claim that this exercise was particularly insightful. The main point is to show that the Alzheimer's signature is evident in multiple regions of the Alzheimer's brain.



A Preprint

It has been a while since we posted. That's largely because of the effort put into generating a paper. Check it out on BioRxiv. This is...