Thursday, August 8, 2024

Genes beginning with "LOC"

Our database contains more than 27,000 genes that begin with the "LOC" designation (meaning "locus"). In total, they make 390,000 appearances in the database. Most of these genes are poorly characterized; one indication is that all but about 2,000 lack ENSG identifiers. Nevertheless, a couple of these LOCs appear more than 1,000 times in the database, and 869 appear at least 100 times. Most, but not all, of these are non-coding.
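For readers who want to run this kind of tally on their own gene lists, here is a minimal sketch. It assumes the data can be flattened into a plain list of gene symbols, one entry per appearance; the function name `loc_summary` and the toy data are hypothetical and do not reflect how our database is actually stored or queried.

```python
from collections import Counter


def loc_summary(gene_hits, thresholds=(100, 1000)):
    """Tally appearances of LOC-prefixed symbols in an iterable of gene symbols.

    gene_hits is assumed to hold one entry per appearance of a gene in a
    study result (a hypothetical flat layout, chosen only for illustration).
    """
    # Count only symbols that start with the "LOC" designation.
    counts = Counter(g for g in gene_hits if g.startswith("LOC"))
    summary = {
        "distinct_locs": len(counts),
        "total_appearances": sum(counts.values()),
    }
    # For each threshold, count how many LOCs appear at least that often.
    for t in thresholds:
        summary[f"at_least_{t}"] = sum(1 for c in counts.values() if c >= t)
    return counts, summary


# Toy data, not real database content.
hits = ["LOC1"] * 3 + ["LOC2"] + ["H19"] * 5
counts, summary = loc_summary(hits, thresholds=(2,))
print(summary)
print(counts.most_common(1))
```

`Counter.most_common` gives the ranked list directly, which is how a "most common LOCs" table like the one further down could be produced from such a flat list.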

Before proceeding, we should note that "poorly characterized" can also mean "unsure about their existence as separate species." LOC102724852, noted below, is associated with chromosome 11, but apparently hasn't been pinpointed to a location. It is also co-expressed with other chromosome 11 genes, which is a bit odd.

Using the "Cell Type" app on our website, let's plug in some of the most common LOCs in our database and get a feeling for what they do.

LOC102724852: Appears 1013 times in the database. Found significantly more often in female tissue than male, despite being found on chromosome 11. Perhaps amazingly, 15 studies in our database list this gene as the top ranking perturbation. Using the co-expression tool, we also see that it is very commonly found in association with H19 (H19 Imprinted Maternally Expressed Transcript) and, to a lesser extent, mir675, both of which are also found on chromosome 11. Hmmmm.

LOC112268238: Appears 920 times in the database. Again, more common in female studies. Significantly associated with results involving bromodomain targeting (drugs, knockout, etc.). Also associated with degron experiments, which seems odd until you realize that degron experiments often target bromodomain proteins. Histones are hugely overrepresented among its co-expressed genes. BRD2, a bromodomain gene, is also strongly associated.

LOC112268430: Appears 897 times in the database, but isn't strongly associated with any of our key words.

LOC107986126: 830 times. Slight association with leukemia.

LOC112268313: 810 times. Again, associated with degron experiments.

LOC100044068: 786 times. A mouse gene, oddly associated with knockout experiments (log(p) = -22). Also associated with the brain, particularly the hippocampus.

LOC105374985: 735 times. Associated with prostate studies.

LOC100419583: 717 times. Strongly associated with innate immune response keywords (ifn, cytokine, virus, infection, etc.).

LOC112267876: 689 times. Associated with stem cell studies.

LOC107984316: 685 times. No strong associations.

LOC112268267: 668 times. Associated with studies involving IFN-gamma.

LOC101928841: 661 times. Associated with studies involving the HeLa cell line.

LOC101929185: 660 times. Strongly associated with HEK293 studies.

LOC112268284: 658 times. Associated with studies involving fungi (i.e. fungal infections).

LOC112268155: 654 times. Commonly found in MCF7 studies (female breast line), but also LNCaP (prostate).

LOC107986762: 643 times. Weak association with macrophage studies.

LOC105369370: 643 times. Associated with carcinoma.

LOC107987206: 636 times. No significant associations.

LOC105378936: 597 times. Slight association with leukemia.

LOC112268109: 535 times. Associated with studies involving HUVECs.

LOC112268426: 534 times. Associated with endothelial cells.

LOC112268447: 504 times. Strong association with fibroblast studies.

Just for the fun of it, we lumped together the top 250 of these LOCs and ran the list through our Fisher app. Somehow, a large number of these LOCs turned up in a list of genes upregulated in "glioblastoma tissue after g207 inoculation" (unadjusted log(p) = -16). They are also "downregulated in high-grade T1 micropapillary bladder cancer w/micropapillarity = 1 vs 0", and "upregulated in caco2 line on 12h vs 7h SARS-CoV-2 infection." The most common keyword associated with the list was "line", meaning the LOCs are overrepresented in cell line experiments (particularly HEK293) relative to in vivo studies. The bias toward female studies is also retained. However, this bias may reflect a disproportionate number of female cell lines, as it disappears when cell lines are eliminated from consideration. In fact, when only in vivo tissue is examined, the association with the keyword "disease" is surprisingly significant (log(p) = -44). Other keywords of interest include "resistance" (as in drug resistance) and "virus".
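For anyone curious how a list-overlap p-value of this kind can be computed, the sketch below shows a generic one-sided Fisher's exact test, written as the upper tail of the hypergeometric distribution. It is an illustration only and not necessarily what the Fisher app does internally. The function name `overlap_log10_p`, the universe size, and the toy numbers are all assumptions, as is the choice of base-10 logs.

```python
from math import comb, log10


def overlap_log10_p(universe_size, list_a_size, list_b_size, overlap):
    """log10 of the one-sided Fisher's exact p-value for a gene-list overlap.

    Gives the probability of seeing at least `overlap` shared genes when
    list_b_size genes are drawn at random from a universe of universe_size
    genes, of which list_a_size belong to list A (hypergeometric upper tail).
    """
    max_overlap = min(list_a_size, list_b_size)
    if not 0 <= overlap <= max_overlap:
        raise ValueError("overlap must be between 0 and the smaller list size")
    # Number of ways to draw list B with exactly k genes from list A,
    # summed over every k at least as large as the observed overlap.
    tail = sum(
        comb(list_a_size, k) * comb(universe_size - list_a_size, list_b_size - k)
        for k in range(overlap, max_overlap + 1)
    )
    if tail == 0:
        raise ValueError("overlap is impossible for these list sizes")
    total = comb(universe_size, list_b_size)
    # Exact integers are used throughout; taking logs separately avoids
    # floating-point underflow when the p-value is extremely small.
    return log10(tail) - log10(total)


# Toy numbers, not taken from the database: a 250-gene query list, a
# 400-gene study list, 30 shared genes, and an assumed 20,000-gene universe.
print(overlap_log10_p(20000, 250, 400, 30))
```

The result depends heavily on the universe size, so the same overlap can look more or less significant depending on which genes are counted as eligible.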



whatismygene.com 
