The following examples are demonstrated in a FAANGMine tutorial that is available on YouTube.
A. List Tool. In this example, the FAANGMine List Tool is used to upload gene lists and perform gene set enrichment analysis. The data is from: Xie S, Yang X, Wang D, et al. Thyroid transcriptome analysis reveals different adaptive responses to cold environmental conditions between two chicken breeds. PLoS One. 2018;13(1):e0191096. Published 2018 Jan 10. doi:10.1371/journal.pone.0191096. PMCID: PMC5761956.
In this example, you will upload two gene lists. Right-click the file names below (or Control-click on a Mac) to download the txt files.
chicken_background_genes.txt – this is a list of genes that were expressed in the experiment. This list was created using Supplemental File S1 of the publication.
chicken_de_genes.txt – this is a list of genes differentially expressed in thyroid between Bashang Long-tail (BS) and Rhode Island Red (RIR) chickens in a warm environment, from Supplemental File S2 of the publication.
You can choose whether to upload the txt files themselves or to open the files in Excel and copy gene ids from a spreadsheet.
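Before uploading, it can be useful to check the two files locally. The following is a minimal sketch, not part of FAANGMine, that assumes each txt file holds one gene id per line (the file layout is not stated here). The function names read_gene_ids and missing_from_background are hypothetical.

```python
from pathlib import Path


def read_gene_ids(path):
    """Return the unique, non-empty gene ids in a text file, in file order.

    Assumption: one id per line. Blank lines and surrounding
    whitespace are ignored.
    """
    seen = set()
    ids = []
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        gene_id = line.strip()
        if gene_id and gene_id not in seen:
            seen.add(gene_id)
            ids.append(gene_id)
    return ids


def missing_from_background(de_ids, background_ids):
    """Return the differentially expressed ids absent from the background list."""
    background = set(background_ids)
    return [gene_id for gene_id in de_ids if gene_id not in background]


if __name__ == "__main__":
    background_path = Path("chicken_background_genes.txt")
    de_path = Path("chicken_de_genes.txt")
    # Only report if both downloaded files are in the current directory.
    if background_path.exists() and de_path.exists():
        background = read_gene_ids(background_path)
        de = read_gene_ids(de_path)
        print(f"{len(background)} background genes, {len(de)} DE genes")
        print(f"{len(missing_from_background(de, background))} DE genes not in background")
```

A differentially expressed gene that is missing from the background list would be worth a second look, since the background is meant to contain every gene expressed in the experiment.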
1. Be sure you are logged in to FAANGMine.
2. Click the List Tool tab in the navigation bar. If the Upload menu is not showing, click “Upload” in the gray bar just below the navigation bar.
3. Select “Gene” as Data Type and “G. gallus” as Organism.
4. Paste the genes from chicken_background_genes.txt into the text entry box, or click “Choose File” to upload the txt file.
5. Click “Create List”.
6. After the database lookup completes, the number of ids entered and the number of ids found are reported. Notice that some ids were not found: this experiment used an older chicken genome assembly, and some of its gene ids are not in the newest assembly.
7. Enter a name for the list, then click “Save a List of ….. Genes”.
8. After the list has been saved, click “View” in the gray bar below the navigation bar and notice that the new list appears in your list of lists.
9. Click “Upload” in the gray bar below the navigation bar, and repeat steps 1 through 7 with genes from the chicken_de_genes.txt file, using a new name when saving the list.
10. While still on the List Analysis page, scroll down until you see the gene set enrichment widgets. Wait a few moments for the enrichment analyses to complete.
11. The Pathway enrichment will likely finish first. KEGG uses only RefSeq ids, but your gene list is from Ensembl, so there will be no KEGG enrichment. Use the Data Set pulldown menu to change the data set to Reactome, and wait for the new analysis to complete. Then change the background gene list by clicking “Change” under “Background Population” and selecting your background gene list from the pulldown menu. If desired, you can also change the test correction and the maximum p-value. Once the analysis is complete, you can download the results by clicking “Download”.
12. For the GO enrichment widget, use the methods described in step 11 to change the background gene list and download the results. You can use the Ontology pulldown menu to change the ontology from “Biological Process” to “Molecular Function” or “Cellular Component”.
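To see why the choice of background population matters, the sketch below computes an enrichment p-value by hand. It assumes the widgets use a hypergeometric test followed by a multiple-test correction. Benjamini-Hochberg is shown as one common correction; the corrections FAANGMine actually offers are not listed here. The function names and all counts are illustrative and are not taken from the tutorial data.

```python
from math import comb


def hypergeom_pvalue(k, n, K, N):
    """Probability of seeing at least k annotated genes in a list of n genes.

    N is the background population size and K is the number of background
    genes that carry the annotation (sampling without replacement).
    """
    upper = min(n, K)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / comb(N, n)


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotonic.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


if __name__ == "__main__":
    # Illustrative counts only: 12 of 200 list genes carry some annotation.
    list_size, hits = 200, 12
    for label, annotated, population in [
        ("whole-genome background", 150, 20000),
        ("expressed-genes background", 120, 9000),
    ]:
        p = hypergeom_pvalue(hits, list_size, annotated, population)
        print(f"{label}: p = {p:.3g}")
```

The same list and the same number of hits give different p-values under the two backgrounds, which is the reason for switching the Background Population to the list of genes expressed in the experiment.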
B. Regions Search. In this example, the FAANGMine Regions Search Tool is used to search for genes within QTL regions for the trait “Corpus Luteum Number” in the pig genome. These QTL regions were downloaded from FAANGMine, and were originally from AnimalQTLdb Release 36 (Hu ZL, Park CA, Reecy JM. Building a livestock genetic and genomic information knowledgebase through integrative developments of Animal QTLdb and CorrDB. Nucleic Acids Res. 2019;47(D1):D701‐D710. doi:10.1093/nar/gky1084. PMCID: PMC6323967).
pig_corpus_luteum_number_qtl.txt – this is a tab-delimited file with chromosome id, start coordinate and end coordinate. Right-click the file name (or Control-click on a Mac) to download the txt file.
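If you prepare your own regions file in this layout, a quick local check can catch malformed lines before you upload. The sketch below is not part of FAANGMine. It assumes three tab-separated fields per line (chromosome id, start, end), no header row, and start not greater than end. The function name parse_regions is hypothetical.

```python
def parse_regions(text):
    """Parse 'chromosome<TAB>start<TAB>end' lines into (chrom, start, end) tuples.

    Blank lines are skipped. A ValueError is raised for a line that does not
    have exactly three fields, has non-integer coordinates, or has start > end.
    """
    regions = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue
        fields = line.split("\t")
        if len(fields) != 3:
            raise ValueError(
                f"line {line_no}: expected 3 tab-separated fields, got {len(fields)}"
            )
        chrom = fields[0].strip()
        start, end = int(fields[1]), int(fields[2])
        if start > end:
            raise ValueError(f"line {line_no}: start {start} is greater than end {end}")
        regions.append((chrom, start, end))
    return regions


if __name__ == "__main__":
    # Made-up coordinates, used only to show the expected layout.
    example = "1\t1000\t2000\n7\t500\t900\n"
    for chrom, start, end in parse_regions(example):
        print(chrom, start, end, end - start + 1)
```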
1. Be sure you are logged in to FAANGMine.
2. Click the Regions tab in the navigation bar.
3. In the Regions Search upload menu, select the organism “S. scrofa”. The correct assembly will be selected automatically.
4. Click the box next to “Select Feature Types” to deselect all the feature types, then select the box next to “Gene”.
5. Either paste the coordinates from an Excel spreadsheet into the text box or click “Choose” to upload the txt file.
6. For this example, we will not extend the regions or select a strand-specific search (numbers 5 and 6 in the upload menu).
7. Click “Search”.
8. The search may take a few moments. Once it completes, you will see an intermediate results page that lets you download results for individual regions or for all regions together, in various formats. Notice that both Ensembl and RefSeq genes appear in the results.
9. To save the list of all the genes, click “Go” next to the menu to the right of “Create list by feature type”, above the search result table. The pulldown menu offers only “Gene” as a choice, because the search was limited to genes.
10. Saving the list brings you to the List Analysis page, as already seen in Example A. The Regions Search tool does not let you name the list; it assigns a name like “all_regions_Gene_list_1”.
11. Next you will prepare lists for gene set enrichment. Because the regions search returned genes from two different gene sets, you need to create a separate list for each gene set. Click the histogram icon in the Gene Source column to enable filtering by gene set. This step is slow and may take several minutes. We are working to improve the response time.
12. Eventually a box will appear showing the number of genes in each gene set. First click the box next to “RefSeq”, then click the arrow next to “Filter” and select “Restrict table to matching rows”. Then click “Save as a List”, above and to the right of the table, and select “Gene (1485 Genes)”. This opens a box where you can name the new list and save it by clicking “Create List”.
13. Once the RefSeq genes have been saved, use the “Undo” button to undo the previous filtering step. Repeat the filtering (steps 11 and 12), but this time restrict the rows to Ensembl genes.
14. Now your gene lists are ready for enrichment. The enrichment results shown on the current page are incorrect, because they are based on both gene sets together, even after the table has been restricted to one gene set. To rerun the enrichment analysis on the individual gene sets, click “View” in the gray bar below the navigation bar to view your list of lists. Clicking either of your new lists (RefSeq or Ensembl) triggers new enrichment computations based on that list. Use the instructions from Example A to change the background gene list in the enrichment widgets to either “S. scrofa RefSeq All Genes” or “S. scrofa Ensembl95 All Genes”, depending on which list you selected. Repeat this step for the other gene set.
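The separation of the two gene sets can also be done outside the browser, on results downloaded from the intermediate results page. The sketch below assumes a tab-separated export with a header row. The column names “Gene ID” and “Gene Source” and the gene ids in the example are placeholders, so adjust them to match the file you actually download. The function name split_by_source is hypothetical.

```python
import csv
import io
from collections import defaultdict


def split_by_source(tsv_text, id_column="Gene ID", source_column="Gene Source"):
    """Group unique gene ids by their gene source, keeping file order.

    Assumption: tsv_text is tab-separated with a header row containing
    id_column and source_column (both names are placeholders).
    """
    groups = defaultdict(list)
    seen = set()
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    for row in reader:
        key = (row[source_column], row[id_column])
        if key not in seen:
            seen.add(key)
            groups[row[source_column]].append(row[id_column])
    return dict(groups)


if __name__ == "__main__":
    # Placeholder rows, not real search results.
    example = (
        "Gene ID\tGene Source\n"
        "geneA\tRefSeq\n"
        "geneB\tEnsembl\n"
        "geneC\tRefSeq\n"
    )
    for source, ids in split_by_source(example).items():
        print(source, len(ids), ids)
```

Each group can then be uploaded as its own list with the List Tool, as in Example A, and analyzed against the matching background gene list.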
Please contact us if you have questions about these examples.