WhatIsMyGene: A Better Website

We've got a more professional interface. Needless to say, AI (Gemini) was largely responsible for the improvement. The improvements are not limited to cosmetics: you should receive noticeably faster outputs from several of the tools (particularly "Relevant Studies", which doesn't require a lot of behind-the-scenes calculations). Despite the assistance, the rewrite still took about three weeks of work and 6,000 lines of code. The old site actually required fewer lines of code. This is not a testament to my own efficient coding, but rather to Gemini's insistence on bullet-proofing and commenting everything.

There are also several new features. The coolest is this: the "Third Study" tool outputs a nice Venn diagram that you could use for your paper. Graphics generated from bioinformatic sites are rarely, if ever, publication ready, but it's easy enough to right-click on the image and edit it in Photoshop or Illustrator. The sizes of the circles and intersecting regions correspond to the number of genes within, though the correspondence breaks down to some extent when there are only a few genes in an intersection. Click on an output study, go to the "Venn Diagram" tab, and you'll get something like this:

In case you don't know what the tool is supposed to achieve, here's the idea: the user enters genes from two sets that already have a significant intersection. The tool finds studies that intersect strongly with the "central" set, but not with the second set. This is apparent in the above diagram, where the central set and the "study" set share 67 genes, but the second set and the study set share only 11. Taking both user-entered and database-associated backgrounds into account, the algorithm determined that the log(P) for the central/study intersection was much more significant than that for the second-set/study intersection. Basically, the tool helps you answer the question, "What ELSE is happening in my gene set?" You may find that your gene set is enriched for, say, cell-cycle genes, but you'd also like to know if a second theme is lurking, possibly overwhelmed by the cell-cycle signal. This tool will help.
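For readers who like to see the logic spelled out, here is a minimal sketch of the kind of comparison described above. It is not the site's actual code. It assumes a one-sided hypergeometric test and a single background size shared by all sets, whereas the real tool references both user-entered and database-associated backgrounds. The function names `overlap_log10_p` and `third_study_score` are hypothetical.

```python
from math import comb, log10

def overlap_log10_p(set_a, set_b, background_size):
    """log10 of the one-sided hypergeometric P-value for an overlap at
    least as large as the one observed between set_a and set_b.

    Assumes both sets are drawn from one background of `background_size`
    genes (a simplification made for this sketch).
    """
    a, b = len(set_a), len(set_b)
    k = len(set_a & set_b)
    total = comb(background_size, b)
    # Sum the probability mass for overlaps of size k or larger.
    # Exact integer arithmetic avoids floating-point underflow.
    tail = sum(comb(a, i) * comb(background_size - a, b - i)
               for i in range(k, min(a, b) + 1))
    return log10(tail) - log10(total)

def third_study_score(central, second, study, background_size):
    """Hypothetical ranking score: positive when the study overlaps the
    central set more significantly than it overlaps the second set."""
    lp_central = overlap_log10_p(central, study, background_size)
    lp_second = overlap_log10_p(second, study, background_size)
    return lp_second - lp_central
```

A study that shares many genes with the central set and few with the second set gets a large positive score, which is the pattern the Venn diagram is meant to show.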

Here's another new feature: the "Tissue Specificity" box for the "Fisher" tool. Let's say you have a study that compares breast cancer tissue to healthy tissue. You derive a list of DEGs that are upregulated in breast cancer. You could use our Fisher tool to find knockouts, drugs, etc., that tend to downregulate these genes. However, you might suspect that these treatments could cause unpleasant systemic effects. You'd prefer to target genes that are breast-specific. The "Tissue Specificity" option allows you to do that. Under the hood, the tool looks up a table of breast-specific genes and then filters the database for studies in which these genes were specifically targeted. Though not related to tissue specificity, we've also included a list of genes that can be targeted by existing drugs. More lists could be added.
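The filtering step can be pictured roughly as follows. This is a sketch under assumptions, not the site's implementation: the `Study` record, its field names, and `filter_by_gene_list` are hypothetical, and the gene symbols are placeholders rather than real tissue-specific genes.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Study:
    study_id: str
    perturbed_gene: str      # gene knocked out or targeted by the treatment
    down_genes: frozenset    # genes the perturbation downregulates

def filter_by_gene_list(studies, allowed_genes):
    """Keep only studies whose perturbed gene appears in allowed_genes.

    allowed_genes could be a tissue-specific table or a list of genes
    with existing drugs; the filter itself is the same either way.
    """
    allowed = {g.upper() for g in allowed_genes}
    return [s for s in studies if s.perturbed_gene.upper() in allowed]

# Placeholder data for illustration only.
database = [
    Study("study_1", "GENE_A", frozenset({"X1", "X2"})),
    Study("study_2", "GENE_B", frozenset({"X2", "X3"})),
    Study("study_3", "gene_c", frozenset({"X1"})),
]
tissue_specific = ["GENE_A", "GENE_C"]
candidates = filter_by_gene_list(database, tissue_specific)
```

The surviving studies would then go through the usual Fisher comparison against the user's DEG list, so only perturbations of tissue-specific genes can appear in the output.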

Another feature is "Select Database", seen in the sidebar for several tools. Currently, there are only two choices: our "standard" database and a second database ("Reduced p10"). For the second, we've simply taken the standard database and removed the top 10% of most commonly perturbed genes. It's computationally expensive to do this on the fly, hence a separate, precomputed database. The revised database is an attempt to address the fact that many input studies converge on relatively few database studies; commonly perturbed genes are removed in order to allow the small-time talents to shine. The idea is far from optimized. Removing a mere 10% of genes was probably too conservative, since outputs don't seem to be tremendously different for either choice of database. We've got ideas for other database alterations as well.
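As a rough illustration of how such a reduced database might be built offline, consider the sketch below. It assumes that "most commonly perturbed" means "appears in the most studies" and that the database can be represented as a mapping from study ID to a set of genes. Both are assumptions made for this example, and `reduce_database` is a hypothetical name.

```python
from collections import Counter

def reduce_database(studies, fraction=0.10):
    """Remove the most frequently perturbed genes from every study.

    studies:  dict mapping study id -> set of perturbed genes
    fraction: share of distinct genes to drop, ranked by how many
              studies they appear in (rounded to the nearest whole gene)
    Returns (reduced_studies, dropped_genes).
    """
    # Count, for each gene, the number of studies that list it.
    counts = Counter(g for genes in studies.values() for g in genes)
    n_drop = round(len(counts) * fraction)
    dropped = {g for g, _ in counts.most_common(n_drop)}
    reduced = {sid: genes - dropped for sid, genes in studies.items()}
    return reduced, dropped
```

Because the gene ranking depends on the whole database, it makes sense to run a step like this once and store the result rather than repeat it for every query. Raising `fraction` is the obvious knob to turn if 10% proves too conservative.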

whatismygene.com

Saturday, March 28, 2026
