WhatIsMyGene

Wednesday, September 22, 2021

A New Tool

We’ve added a new tool to WhatIsMyGene.com called “Cell Types.” The idea is fairly simple. You enter a gene name, submit, and the output will be a list of keywords associated with your gene. The keywords primarily relate to cell type. Behind the scenes, a binomial probability calculation compares the frequency of each keyword over the complete database against its frequency in the data in which your gene appears. A strongly positive “binomial” output represents a strong positive correlation, and a strongly negative number indicates a negative correlation. If you choose filters, both the gene-specific data and the larger database will be filtered.

It was difficult to get this tool up and running. I won’t bore you with the details of programming. But let me know if it crashes on you.

It’s possible to get zero or minimal output by improper filter selection. For example, our “cell type” data is largely composed of genes that are not “upregulated” or “downregulated” on some perturbation…that’s not the nature of the typical clustering result based on single-cell sequencing. So if you select “Cell Type” (in the “experiment” box, NOT the new tool we're talking about), and “upregulated” (as opposed to “Any”), you may not receive any output. It’s also possible, of course, to enter rare genes and get zero output, or very un-insightful output.

In keeping with our previous discussion of the “perturbome”, please note that the output you’ll receive is probably not relevant to abundance. Most (not all) of the lists in our database are not abundance lists. Rather, they are tagged as “upregulated” or “downregulated” under particular conditions. There’s little or no correlation between a list of genes that are abundant in liver and a list of genes that are commonly perturbed in liver.

Plugging in some well-known cell-type markers, the tool works quite nicely! Below, we take a common marker for lymphocytes, CD8, run it through the tool, and use a few lines of R to generate a graph. See below for the code we used. Bear in mind that the standard 0.05 cutoff for significance would be found at binomial values of +/- 1.3. We output 100 keywords, so an adjusted cutoff would be +/- 3.3. In the graph below, we tossed the tissues ranked 25th through 75th (the not-so-interesting ones).
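For anyone who wants to check where those two cutoffs come from, here is a minimal sketch. It assumes the “binomial” values are on a -log10 scale (which is what the 1.3 figure implies) and that the adjustment is a simple Bonferroni-style division by the 100 keywords. The scores are random placeholders, not output from the tool.

```r
# Convert P-value cutoffs to the log scale used by the tool's output
# (assumed here to be -log10, since -log10(0.05) is about 1.3).
alpha <- 0.05
n_keywords <- 100                          # the tool outputs 100 keywords

cutoff_raw <- -log10(alpha)                # unadjusted cutoff, about 1.3
cutoff_adj <- -log10(alpha / n_keywords)   # Bonferroni-style cutoff, about 3.3

# Hypothetical scores for 100 keywords, sorted from highest to lowest.
set.seed(1)
scores <- sort(rnorm(n_keywords, sd = 3), decreasing = TRUE)

# Drop ranks 25 through 75 (the uninteresting middle) and keep the two tails.
keep <- scores[-(25:75)]
```

The same negative-index trick works on the rows of the table you download from the tool, provided the table is already sorted by score.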

Not surprisingly, CD8 is primarily perturbed in tissues with keywords like “blood” or “lymphocytes” or “spleen.” Perhaps more interesting is the fact that one rarely sees this perturbation in tissues labeled “adherent” or “epithelial.” Stem cells just miss the adjusted cutoff for absence of CD8 perturbations.

We plugged in ACE2, a protein everyone knows to be expressed in lungs. However, judging from output from the cell type tool, it’s not commonly perturbed in lungs, which may offer one explanation why lung tissue is a handy-dandy, dependable target for Covid-19.¹ More commonly, ACE2 is altered in the colon and intestine (log(P)< -10 and -7). It’s particularly difficult to tweak in the case of blood and the brain (both with log(P)<-4). The rarity of tweaked ACE2 in the blood and brain may be because it’s not there to begin with. However, we know that ACE2 is found in the lungs…it’s simply difficult to alter its expression. To probe further, one could use filters to see if drugs or knockdowns (or whatever) alter these probabilities.

Actually, a quick peek at ACE2 expression (genecards.org) shows that the transcripts are indeed commonly found in blood and brain. Quite interestingly, however, ACE2 protein is rare in blood and brain, while ACE2 protein is common in the kidney, as well as heart and ovary (which also ranked high as tissues in which ACE2 is commonly perturbed). The pattern is broken with the colon, however, where the protein is rare. Nevertheless, we wonder if there’s a relationship between perturbability and protein levels that differs from the perturbability/transcript-levels relationship.

The above ACE2 results have implications for anyone who wishes to decrease lung ACE2 expression via some treatment. Another practical implication would be in the choice of cell lines for experiments. If you want to perform a knockdown of some transcript, you’ll obviously want to choose a cell type in which the transcript is expressed. However, it might also be prudent to choose a cell line in which the transcript can be perturbed!

We had a lot of fun entering our favorite genes into the tool. Guess the tissue in which APP (the Alzheimer’s amyloid gene) is most difficult to perturb! Compare the perturbability of PD-1 and PD-L1 over tissues. Compare the HLA-I and HLA-II perturbomes.

One of my favorite genes is DDX6. I’ve oft-noted how the genes it regulates overlap with the genes another helicase, DHX9, regulates. It seemed a bit redundant. But the Cell-Type tool makes it fairly obvious that this regulation happens in very different cell types. DHX9 loves to do its job in epithelial cells and DDX6 hates it!

One idiosyncrasy is the following: cell lines are either male or female. Huh7, for example, is male. Whenever possible, we’ve labeled cell lines with a “male” or “female” keyword. You may thus find that your gene is strongly enriched with the “male” designation. You may wish to ignore this, as it may reflect the fact that the cell lines that represent certain tissues are overwhelmingly male or female, not a broad tendency for a gene to be perturbed, for example, only in males.

A few other keywords bear explanation. “3d” refers to organoids. “Cancer_tissue” refers to in-vivo cancer tissue, not cell culture (after all, the majority of cell culture lines are generated from cancers). “Resistance” relates to studies where resistance to a treatment (e.g. cisplatin) was examined. Such studies can be in-vitro (performing cell culture until resistant strains emerge) or in-vivo (e.g. from studies of patients who respond, vs don’t respond, to particular therapies).

If you don’t want to examine cell line data at all, one trick is to exclude the keyword “ line” (include the space) in all studies. We’re currently retroactively labeling all cell line studies (there are a lot, of course) with this keyword…the trick won’t work optimally until we’re finished with this task. This trick applies to many of our tools, actually. Another way to de-emphasize cell line data is to examine only mouse data, not human data. This is because, with the exception of blood, muscle, and cancer, it’s difficult to access human in-vivo tissue; researchers use mice for those.

One might imagine a sort of “inverse” cell-type tool. Here, you’d select from a list of keywords and the output would be the genes that are most enriched (or depleted) for the keyword. I’m guessing this task would be computationally expensive…you’d need to “stack” all the genes in the database into a frequency table, then stack all the keyword-relevant genes into another frequency table, merge the tables, and then perform something like a hundred thousand binomial calculations. All this stacking would have to be performed on the fly (as opposed to using a one-time table that resides on the hard drive), because the user might apply filters to the database. However, we may embark on this little exercise in the future on our local machine, and report on the outcome. For now, the big task is to increase/refine/improve the keywords in our database.
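To make the “stacking” idea concrete, here is a rough sketch of the inverse calculation on a toy table. Everything in it is hypothetical: the `occurrences` data frame, its column names, and the choice of a one-sided binomial tail on a -log10 scale are assumptions for illustration, not a description of how our database is actually laid out.

```r
# Hypothetical toy data: one row per (study, gene) occurrence, flagged by
# whether the study carries the keyword of interest.
occurrences <- data.frame(
  gene = c("A", "A", "A", "B", "B", "C", "C", "C", "C", "A"),
  has_keyword = c(TRUE, TRUE, TRUE, FALSE, TRUE,
                  FALSE, FALSE, FALSE, FALSE, FALSE)
)

# "Stack" all genes into one frequency table, keyword-tagged genes into another.
all_counts <- as.data.frame(table(gene = occurrences$gene),
                            responseName = "n_all", stringsAsFactors = FALSE)
kw_counts <- as.data.frame(table(gene = occurrences$gene[occurrences$has_keyword]),
                           responseName = "n_kw", stringsAsFactors = FALSE)

# Merge the tables; genes never seen with the keyword get a count of zero.
merged <- merge(all_counts, kw_counts, by = "gene", all.x = TRUE)
merged$n_kw[is.na(merged$n_kw)] <- 0

# Share of all occurrences that carry the keyword (the binomial "probability").
p_kw <- sum(merged$n_kw) / sum(merged$n_all)

# Vectorized binomial tails: one calculation per gene, no explicit loop.
p_enriched <- pbinom(merged$n_kw - 1, merged$n_all, p_kw, lower.tail = FALSE)
p_depleted <- pbinom(merged$n_kw, merged$n_all, p_kw)

# Signed score: positive for enrichment, negative for depletion.
merged$score <- ifelse(p_enriched < p_depleted,
                       -log10(p_enriched), log10(p_depleted))
merged <- merged[order(-merged$score), ]
```

Since `pbinom` is vectorized, the hundred thousand binomial calculations themselves are cheap; the expensive part would be rebuilding the two frequency tables every time a filter changes.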

***

Initially, we considered outputting results in graphical format, as opposed to a table. In the end, we decided to stick with tables. You can generate graphics based on the table output in any way you like, rather than being stuck with a limited set of color schemes, labels, graphics formats, etc. If you’re familiar with R, the code below might be useful.

library(ggplot2)

tissue_data <- read.csv("D:/your_table.csv")
tissue_data$stacked_tissues <- factor(tissue_data$stacked_tissues, levels = tissue_data$stacked_tissues)
tissue_data$fill <- ifelse(abs(tissue_data$binomials) > 3.3, "red",
                    ifelse(abs(tissue_data$binomials) > 1.3, "purple", "gray"))
g <- ggplot(tissue_data, aes(x = binomials, y = stacked_tissues, fill = fill))
g + geom_col() + ylab(NULL) + scale_fill_identity()

#The “factor” line prevents the table from being sorted alphabetically.

1) Gotta be careful with this kind of logic, of course. If the virus has the capacity to alter the expression of a target (such as ACE2), the perturbability of the target might work to the benefit of the virus, not to the detriment.

whatismygene.com

Thursday, September 16, 2021

Why Standard Gene Enrichment Tools Can Fail to Produce Insight

Why isn’t the gene enrichment tool you’re using spurring further insight and hypothesis? Or, worse yet, is it possible the gene enrichment tool you’re using is spurring unjustified insight and hypothesis? Without mentioning specific tools, here are some potential causes:

*With some standard tools, relatively few studies may be combined to create a “canonical response” list of genes, with a majority of relevant studies being ignored or excluded. Why? I note, for example, that relatively few studies are embodied in a particular “response to hypoxia” list, whereas there are certainly more than a hundred deep transcriptomic/proteomic studies on the subject.

*Conversely, it’s possible that the response to hypoxia, for example, is strongly dependent on cell type. In this case, a researcher may fail to recognize that a particular drug indeed induced a hypoxia response if he/she compares results against a “canonical regulation” list. It might have been better to examine a single previous study that more closely mirrored the researcher’s own study setup.

*“Low quality” genes may be excluded from canonical lists. That is good, of course, and one assumes that there are reasonably stringent criteria for such exclusion. If, however, a “low quality” gene (for example, a probe that lacks a well-described transcript) appears again and again in studies of, say, hypoxia, perhaps it’s more relevant than had been thought.

*Researchers are not immune to fads in their own fields. I’ve seen it myself…RNA-seq is performed and the 25th most significantly altered transcript is selected for functional studies, not the first.¹ Why? It could be because the 25th gene rings a bell for the researcher. Or…it fits his/her preconceptions as to what should be altered in the experiment. Or…it’s a “hot” gene that would be more likely to draw in grant money. Or…it’s easy to study, as antibodies against the protein are already in the lab freezer. Or…the 25th gene is the subject of previous studies, making it easier to formulate a hypothesis for its involvement in a process. What other factors unrelated to biological significance cause researchers to mention one entity versus another in papers? The point here is that the folks who screen studies for genes that can be incorporated into lists will be victims of these biases.

*To what extent are human screeners subject to their own biases? Do they examine supplemental data?²

*A single study may contribute an excess of data to a transcriptomic database. You could examine the effects of viral infection on a cell line at 1, 2, 3, 4….72 hours and compare the transcriptomic results against controls for each timepoint. Such studies could inflate the size of a database to an impressive degree. However, does insight follow? Does one really expect that the result at 16 hours is going to be interestingly different from the result at 18 hours? Inclusion of multiple highly similar studies will also confound large-scale co-expression analysis (e.g. gene ABC could be lumped together with gene XYZ 72 times, even though the two genes aren’t associated in other studies, in other cell types, under other infection conditions).

*Rare entities may be excluded from canonical lists. Consider two transcripts. Transcript ABC is upregulated in hypoxia in 6 out of 10 studies. ABC is abundant and also tends to be altered in numerous non-hypoxia studies. That is, if you perturb a cell, there’s a good chance you’re perturbing ABC. Transcript XYZ, which may not even be represented by probes in some microarrays, is upregulated in 2 out of 10 hypoxia studies. It’s never mentioned in the body of hypoxia papers, and it’s rarely seen in non-hypoxia studies. Shall we exclude XYZ from our list of transcripts altered in hypoxia?

*Some enrichment tools do not incorporate estimates of the “background” of an experiment. Even if a background is incorporated, shall we assume that all gene ontology lists share the same background? As we’ve noted previously, some of these lists are heavily overloaded with extremely abundant proteins/transcripts. In these cases, it would appear that the genes that compose these lists are more likely to be drawn from a pool of 2,000, as opposed to 20,000, possible genes. In other cases, a gene ontology list does not over-represent abundant entities, meaning that a background of 20,000 might be appropriate for comparison against your own list of genes.³,⁴

*You add a drug to cell culture and perform transcriptomics against controls. Performing “pathway analysis” on your list of up- and down-regulated transcripts could certainly prove insightful. However, is that all you wish to do with respect to enrichment analysis? Bear in mind that your significantly altered transcripts may be more likely to be bundled in “modules” than in groups of genes found in particular pathways. In other words, your transcripts may contain a large dose of genes downregulated in autophagy, a moderate dose of mitochondrial process, a smattering of genes upregulated in antiviral response, and a heavy dose of genes upregulated in an esoteric process that isn’t even represented in popular gene ontology lists. If there are other studies that match up with your results, will you know?

*Try entering a standard gene-enrichment list (GO, Reactome, whatever), into our Fisher app. Despite the fact that a majority of the lists in our database are derived from individual studies, not mere copies of gene-enrichment lists that other folks have created, you'll probably find that the output is dominated by other gene-enrichment lists (be sure to set the "regulation" filter to "Any"). Basically: GO lists (and the like) best overlap with other GO lists, not data generated from studies involving specific tests versus controls.

The solution to the above concerns is not necessarily tricky. All you need is a database of results from specific studies, as opposed to (or in addition to), compiled lists of genes. To maximize the chance that your own results will strongly align with results from another specific study, the database should be large. This large database should contain a roughly randomized set of studies versus, say, a strong focus on cancer. Inclusion of multiple results from a single study should be avoided. Rare and/or uncharacterized genes should not be eliminated without very good cause.

The above describes our database fairly well. Have we fully eliminated all the above concerns? No. In addition to specific studies, we do offer some compiled lists, described in some of our previous blog posts. On some occasions, we do include multiple results from one study. However, we take steps to make sure that such studies do not confound results from our co-expression app.

1) Yes, I’ve got a particular study in mind. In fact, the single most significantly altered transcript was not even mentioned in this study.

2) Plenty of biologists believe that confirmation of a protein alteration requires a Western blot. The mass-spec community scoffs at this, believing that blots are vastly inferior to MS and antibody studies are a waste of time if MS is performed properly. I side with the MS folks. In any case, though, where do the screeners draw their particular lines? Even if they’re consistently following particular criteria, can we assume the criteria are reasonable?

3) If this bit seems difficult to understand, my apologies. It might help to bear in mind that Fisher’s exact test, or similar tests, require a “background” figure. Strictly speaking, this should be the intersection of ALL identified entities in study A with ALL identified entities in study B, regardless of metrics like significance and fold-change. This is not so difficult if you’re comparing results from two studies that used, say, the same brand of microarray. But what if Study A is generated by compiling multiple studies, or if study A is generated by humans who screen papers for genes involved in various processes? What is the sum of all identified (not simply "mentioned") transcripts/proteins in study A? This gets tricky. Things get particularly tricky if the process of compilation results in an excess of highly abundant entities. And we certainly do see cases where abundant entities are strongly over-represented.

4) If you've tinkered with Fisher's exact test, you know that small/moderate errors in the background figure don't necessarily make much difference. However, some potential errors go way beyond the "small/moderate" level. In gene enrichment analysis, the output often consists of a list of enriched groups ranked from most to least significant. Here, one naturally pays most attention to the top ranked groups. In the case of a significantly tweaked background, however, perhaps the top-ranked study should really belong at the 20th position.
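To see how much the background figure can matter, here is a small sketch using R's built-in `fisher.test`. The list sizes and the overlap are made-up numbers, and the one-sided (“greater”) alternative is an assumption chosen for illustration; only the 20,000 versus 2,000 backgrounds come from the discussion above.

```r
# Hypothetical numbers: two gene lists and the genes they share.
list_a <- 200    # genes in your list
list_b <- 300    # genes in the comparison list
overlap <- 40    # genes found in both

fisher_p <- function(overlap, list_a, list_b, background) {
  # 2 x 2 contingency table: in A / not in A versus in B / not in B.
  tab <- matrix(
    c(overlap,          list_a - overlap,
      list_b - overlap, background - list_a - list_b + overlap),
    nrow = 2, byrow = TRUE
  )
  # One-sided test for over-representation (an assumption for this sketch).
  fisher.test(tab, alternative = "greater")$p.value
}

p_20000 <- fisher_p(overlap, list_a, list_b, background = 20000)
p_2000  <- fisher_p(overlap, list_a, list_b, background = 2000)
```

With a background of 20,000 you would expect about 3 shared genes by chance, so an overlap of 40 looks spectacular. With a background of 2,000 you would expect about 30, and the same overlap is only mildly interesting.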


Saturday, August 14, 2021

Abundant Transcripts

As long as we’re in heatmap mode, we thought we’d throw some fairly “conventional” data into R’s “heatmap.2” function. This time, it’s simply the most abundant transcripts found in 61 different tissues. All the data, of course, is found in our database; if your data strongly overlaps with a particular tissue type and you run it through our apps, you’ll certainly know. You can find the underlying data (and a lot more) here: https://www.proteinatlas.org/about/download .

The presence of abundance data in our database is also useful in another respect. If your data is biased with abundant transcripts/proteins, as opposed to a more typical mix of abundant and rare entities, and you apply our Fisher app to the data, the app’s output will be overloaded with abundance studies. This is a bit of a warning. It’s possible that you simply need to adjust the “background” for your data (e.g. a typical proteomic study contains around 4,000 proteins…our default background of 20,000 is not appropriate in this case, and you’ll need to tweak it). There may be other, more problematic reasons for the bias in your data. Alternatively, there may be a silly mistake…your data was sorted according to abundance. We’ve noted problems in our own data entries this way (and then fixed them!). We also note issues in external datasets. Our favorite is probably the GO “ESTABLISHMENT OF PROTEIN LOCALIZATION TO ENDOPLASMIC RETICULUM” group, which is a wonderful proxy for the most abundant proteins in human tissue. Of course, a final possibility is this: your data is legitimately tweaked toward or against abundance. One phenomenon we note is that cancer tissues often lose, to some extent, the underlying tissue identity; stomach cancer tissue, for example, will become less stomach-y, and you may note this via a cancer-related decrease in the most abundant stomach entities.

One final technical point…if you’re using our co-expression app, these abundance lists will not be examined unless you manually tweak the “regulation” feature to “ANY.” In other words, only lists involving up/down-regulated entities will be examined unless you overrule the “regulation” feature.

Here’s the heatmap, with “values” being –log(P-values), as generated by Fisher’s exact test, applied to all combinations of cell types:

It shouldn’t be surprising that some extreme P-values were generated. The basic components for metabolism and structure (etc.) are both plentiful and don’t vary much from cell to cell.

The image probably aligns with your own ideas about similarities between various cell types. I was surprised, however, by how cleanly the cell types clustered into various groups (see the dendrogram on top). The left-most columns contain cell types that don’t seem to overlap with other cell types with great significance: testis, liver, parathyroid, and placenta, in particular, followed by cerebellum, granulocytes, skeletal muscle, thymus, heart, and intestines. Next, there’s a square of red/orange/yellow color. That’s all brain tissue: basal ganglia, pons/medulla, spinal cord, olfactory gland, hypothalamus, amygdala, midbrain, cerebral cortex, corpus callosum, thalamus, hippocampus. Perhaps it’s interesting that the cerebellum was not found in this group. Next is a grouping of cell types that don’t overlap any other types with extreme significance, with a few exceptions (total PBMCs/monocytes, duodenum/colon, monocytes/dendritic cells). The next strong red/orange patch consists of gall bladder, vagina, skeletal muscle, cervix, prostate, fallopian tube, endometrium, and bladder. The next red/orange patch contains T-cells, NK cells, B-cells, PBMCs, and the spleen. Next, the appendix, lymph nodes, and tonsils group together strongly.

Examining the underlying data, the weakest overlap belongs to the cerebellum/liver pairing, with a P-value that doesn't even reach 0.05.

Oddly, the midbrain and the amygdala match up with the rectum fairly significantly!

*Note to self and anybody who 1) doesn’t think my heatmap is utter garbage and 2) would like to do something similar. Here’s the code that generates the colors:

library(gplots)  # heatmap.2 lives in the gplots package

col <- c("navy", "blue", "dodgerblue", "lightskyblue", "palegreen", "yellow", "orange", "red")

breaks <- c(0, 2, 12, 25, 50, 100, 150, 200, 325)

heatmap.2(blah, blah, col = col, breaks = breaks, blah, blah)

I like this approach because it’s easy. Just make sure you’ve got one more break than colors (above there are 9 breaks and 8 colors). Of course, if you must make a gradient, you can’t use this easy method. In any case, here’s a nice color “cheatsheet”: https://www.nceas.ucsb.edu/sites/default/files/2020-04/colorPaletteCheatsheet.pdf . The bottom of the sheet contains names for something like 600 colors that you can plug in as above.
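If you want to check which color a given value will land on before drawing anything, base R's `cut` can mimic the binning. This is only a sketch of the idea: the `values` vector is made up, and the exact handling of values that fall right on a break may differ from what `heatmap.2` does internally.

```r
col <- c("navy", "blue", "dodgerblue", "lightskyblue",
         "palegreen", "yellow", "orange", "red")
breaks <- c(0, 2, 12, 25, 50, 100, 150, 200, 325)

# n colors need n + 1 breaks: each color fills one interval between breaks.
stopifnot(length(breaks) == length(col) + 1)

# Hypothetical -log(P) values, binned into the intervals defined by breaks.
values <- c(1.5, 7, 30, 120, 310)
bins <- cut(values, breaks = breaks, include.lowest = TRUE, labels = FALSE)

# Look up the color assigned to each value.
assigned <- col[bins]
```

Any value outside the range of `breaks` comes back as `NA` here, which is a handy reminder to make the last break at least as large as your most extreme value.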


Wednesday, August 11, 2021

The Perturbome

When focusing on cell types, we could make a list of the most abundant transcripts in particular cell types. We could also focus on the proteome. We could even ask, “what are the common transcripts/proteins that are rarely seen in a particular cell type?” We could search for cell type markers that are rarely or never seen in other cell types, even if these markers are not particularly abundant in the cell type of interest. Our database is chock-full of the above sorts of lists.

There’s another sort of list we are able to prepare, largely because the sheer size of our database affords the opportunity. The largest portion of the database falls into the category of “perturbation studies” wherein cells are perturbed via drug, knockout, heat, whatever. We can thus ask the question, “what transcripts/proteins are most commonly perturbed in a particular cell type?” We can also ask which entities are least frequently perturbed in a particular cell type. This is not a question of abundance or of “uniqueness” to a particular cell type. Rather, we’re focusing on the entities that fluctuate when you tweak a particular sort of cell.

Pulling data from about 10,000 studies, we’ve constructed these lists for 20 different cell types: brain, liver, skin, muscle, lymphocytes, stem, kidney, breast, colon, prostate, heart, lung, intestines, glands, pancreas, dendritic, ovaries, adipose, fibroblasts, epithelial. Why not other cell types in our database? That’s primarily because the above 20 designations are the most common in our database; we required at least 100 studies for each cell type. We could have also included “blood” as a category, but we chose to break it down further into two common subtypes: lymphocytes and dendritic cells. Some other choices were somewhat arbitrary (we have a lot of macrophage studies…why didn’t we include them?). Note also that some of these cell types can overlap…skin and liver are different tissues, but skin can contain stem cells, and breast cells can be epithelial. For this initial stab at the “perturbome”, this isn’t a problem.

With the 20 cell types, we generated 40 lists, as each cell type contains entities that are frequently perturbed, as well as entities that are rarely/never perturbed; two lists per cell type. Entities that are “rarely perturbed” most likely are simply never expressed in the particular cell type, though it is possible that they are indeed expressed, but it’s difficult to tweak them; we don’t discriminate between these two cases.

If you’re interested, the dirty details are as follows: We first generated a list of all genes found in the above studies. We then simply counted their occurrences in the above 20 cell types. We then used the binomial distribution to calculate how significantly a particular gene may be over/under-represented in a particular cell type. The “probability” input for the binomial distribution (which is .5 if you’re talking about coin flips) is calculated by dividing the total genes perturbed in a tissue (e.g. brain) by the total genes perturbed over all 20 tissues. Liver, for example, constitutes 7% (.07) of all genes in our database’s perturbome. Thus, if you know that gene ABC is found 100 times in our liver studies, and 500 times over all studies, you’re equipped to perform a probability calculation. In the final step, we simply rank genes according to these probabilities, making sure to discriminate between significance generated from an excess, versus depletion, of a particular gene.
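Here is the liver example from the paragraph above worked through in R. The three input numbers come straight from the text; the choice of tail probabilities and the signed -log10 score are assumptions about one reasonable way to do the final ranking, not necessarily the exact code we ran.

```r
p_liver <- 0.07   # liver's share of all perturbed genes (from the text)
n_total <- 500    # times gene ABC appears over all studies
n_liver <- 100    # times gene ABC appears in liver studies

# Count expected in liver if ABC had no tissue preference.
expected <- n_total * p_liver

# Probability of seeing at least this many liver hits by chance (excess)...
p_excess <- pbinom(n_liver - 1, n_total, p_liver, lower.tail = FALSE)
# ...and of seeing this many or fewer (depletion).
p_deplete <- pbinom(n_liver, n_total, p_liver)

# Signed score: positive for an excess, negative for a depletion.
score <- if (p_excess < p_deplete) -log10(p_excess) else log10(p_deplete)
```

Gene ABC shows up in liver roughly three times as often as the 35 hits you would expect, so the score comes out strongly positive. Keeping the sign is what lets a single ranking separate the “most frequently perturbed” list from the “least frequently perturbed” one.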

So…what did we find? First, what is the most commonly perturbed gene over all tissue types? The answer is EGR1, perturbed 1581 times, followed by SERPINA3, IFIT1, GDF15, and FOS. What is the most common gene which was never perturbed in a particular tissue? That distinction belongs to LUM (lumican), which was never perturbed in dendritic cells, despite being altered a total of 758 times over the other 19 tissues. Perhaps dendritic cells are adamant that they not be confused with other cell types that express lumican, which is largely an extracellular protein.

Of note, GAPDH, commonly used as a housekeeping control gene, was only perturbed 358 times. Actin-B was seen 610 times. Our gene lists primarily reflect perturbations, not abundance.

Below is one of the more obscure gene tables you’ll ever stumble across. It details the top genes that were never perturbed in particular cell types. The table is ordered by the count over all other tissues; thus, the cell types at the bottom of the table express a large array of transcripts/proteins.

CELL TYPE	GENE	COUNT OVER OTHER TISSUES
dendritic	lum	758
adipose	hopx	603
intestines	nav2	591
ovary	lam1	481
heart	jdp1	469
prostate	pck1	456
gland	hpp1	435
pancreas	cd38	422
muscle	slc6a14	405
colon	kcnk2	339
fibroblast	c1orf116	329
kidney	scca1	312
skin	sizn	230
brain	ugt2b15	205
lung	gpr37l1	183
breast	miat	175
stem	cyp2c9	164
liver	blcap	149
epithelial	ces1g	121

Another question: what are the genes that were uniquely perturbed in particular tissues? The champion is probably GM1818, a mouse gene that was perturbed 21 times in the brain, but never elsewhere. For human genes, we have FAM90A7P, which was perturbed 16 times in the brain, and never elsewhere. The brain, in fact, seems to have the largest number of uniquely perturbed genes by a large margin; the first case of a non-brain gene that was uniquely perturbed was the mouse gene AI132709 (liver), which was tweaked in 8 studies in our database…201 brain-unique genes are tweaked at least as frequently. The lncRNA Lnc-CHSY1-3 was uniquely expressed in lymphocytes, albeit with a mere 4 occurrences.

The special status of the brain is also seen in the heatmap below. We took our 40 perturbation lists and performed Fisher’s exact test on all combinations of lists, for a total of 760 P-values.

If the image is too small, you could click on it to get a bigger view. The first row is labeled “LO_COL”, which means “genes that were least frequently perturbed in the colon.” Hopefully the other 39 labels are self-explanatory. The color key shows the –log(P-values). Combinations with very significant P-values tend to make sense…highly perturbed genes in “breast” and “gland” overlap with extreme significance, as do non-perturbed genes in the lymphocyte/dendritic categories, and perturbed genes in the colon/intestine. There are, however, some very interesting overlaps that might not be so intuitively obvious. For example:

1) genes that are rarely perturbed in the brain are rarely perturbed in stem cells.

2) genes highly perturbed in glands are rarely perturbed in the brain.

3) genes highly tweaked in the breast are rarely tweaked in stem cells.

4) genes that are rarely perturbed in the brain are also rarely perturbed in muscle and lymphocytes.

5) looking at the “high_BR” (highly perturbed in the brain) group, the best matching “highly perturbed” cell type would be “stem”, with a –log(P-value) of about 4. This is a bit of a cheat, since stem cells and brain cells are not exclusive (i.e. some brain cells are stem cells). In truth, then, highly perturbed genes in brain cells do not overlap with the highly perturbed genes of “pure” cell types with any significance.

6) unlike the brain, the rarely perturbed genes in some tissues don’t overlap rarely perturbed genes in other tissues with great significance. For example, the rarely perturbed genes in the pancreas don’t overlap with rarely perturbed genes in other tissues with any amazing significance; the best match, in fact, would be to intestines, with a -log(P) of 7.
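For anyone who wants to build a similar all-against-all matrix from their own lists, here is a rough sketch. The three toy lists, the gene names, the background of 50, and the one-sided test are all hypothetical stand-ins; the real heatmap used our 40 perturbation lists and a much larger background.

```r
# Hypothetical toy gene lists standing in for the perturbation lists.
lists <- list(
  high_A = c("g1", "g2", "g3", "g4", "g5"),
  high_B = c("g3", "g4", "g5", "g6", "g7"),
  low_A  = c("g8", "g9", "g10", "g11", "g12")
)
background <- 50   # assumed size of the gene universe for this toy example

neg_log_p <- function(a, b, background) {
  k <- length(intersect(a, b))   # genes shared by the two lists
  tab <- matrix(c(k,             length(a) - k,
                  length(b) - k, background - length(a) - length(b) + k),
                nrow = 2, byrow = TRUE)
  -log10(fisher.test(tab, alternative = "greater")$p.value)
}

# Every unordered pair of lists, one column per pair.
pairs <- combn(names(lists), 2)

# Symmetric matrix of -log10(P) values, ready to hand to a heatmap function.
scores <- matrix(0, length(lists), length(lists),
                 dimnames = list(names(lists), names(lists)))
for (i in seq_len(ncol(pairs))) {
  a <- pairs[1, i]
  b <- pairs[2, i]
  scores[a, b] <- scores[b, a] <- neg_log_p(lists[[a]], lists[[b]], background)
}
```

The diagonal is left at zero here, since comparing a list against itself isn't informative.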

You can tinker with the data yourself at whatismygene.com. The table below gives you the dbase IDs that allow you to perform operations with our various apps.

DBASE ID	CELLS
132346123	most frequently perturbed in the brain
132346124	least frequently perturbed in brain
132346125	most frequently perturbed in the liver
132346126	least frequently perturbed in the liver
132346127	most frequently perturbed in skin
132346128	least frequently perturbed in skin
132346129	most frequently perturbed in muscle
132346130	least frequently perturbed in muscle
132346131	most frequently perturbed in lymphocytes
132346132	least frequently perturbed in lymphocytes
132346133	most frequently perturbed in stem cells
132346134	least frequently perturbed in stem cells
132346135	most frequently perturbed in the kidney
132346136	least frequently perturbed in the kidney
132346137	most frequently perturbed in the breast
132346138	least frequently perturbed in the breast
132346139	most frequently perturbed in the colon
132346140	least frequently perturbed in the colon
132346141	most frequently perturbed in the prostate
132346142	least frequently perturbed in the prostate
132346143	most frequently perturbed in the heart
132346144	least frequently perturbed in the heart
132346145	most frequently perturbed in the lung
132346146	least frequently perturbed in the lung
132346147	most frequently perturbed in the intestines
132346148	least frequently perturbed in the intestines
132346149	most frequently perturbed in glands
132346150	least frequently perturbed in glands
132346151	most frequently perturbed in the pancreas
132346152	least frequently perturbed in the pancreas
132346153	most frequently perturbed in dendritic cells
132346154	least frequently perturbed in dendritic cells
132346155	most frequently perturbed in ovaries
132346156	least frequently perturbed in ovaries
132346157	most frequently perturbed in adipose tissue
132346158	least frequently perturbed in adipose tissue
132346159	most frequently perturbed in fibroblasts
132346160	least frequently perturbed in fibroblasts
132346161	most frequently perturbed in epithelial cells
132346162	least frequently perturbed in epithelial cells

We’re not finished with our dissection of the perturbome. We’ll resume the discussion in a couple weeks.

