Enrichment Map g:Profiler Tutorial

Contents

Enrichment Map g:Profiler Tutorial
Outline
Instructions

Outline

This quick tutorial guides you through generating an Enrichment Map for an analysis performed with g:Profiler (functional profiling of gene lists from large-scale experiments).

To run this tutorial:

You need to have Cytoscape installed: version 2.6.3 is the minimum, but the latest version of Cytoscape (e.g. version 3.1) is preferable.
For Cytoscape 2.8: install the Enrichment Map plugin from the Cytoscape plugin manager. If you install it manually (e.g. because you need a new version that is not yet in the plugin manager), it must be placed in the Cytoscape-[Version#]/plugins folder.
For Cytoscape 3: install the Enrichment Map App from the Cytoscape App manager.

Instructions

Step 1: Generate g:Profiler output files

You need to download the test data: gProfilerTutorial.zip
Description of the tutorial files contained in the gProfilerTutorial folder:
- 12hr_topgenes.txt : List of top genes expressed in Estrogen dataset at 12hr - Official Gene Symbol.
- 24hr_topgenes.txt : List of top genes expressed in Estrogen dataset at 24hr - Official Gene Symbol.
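
Because this tutorial queries g:Profiler with official gene symbols, it can help to sanity-check a gene list before using it. The following is a minimal Python sketch, assuming one identifier per line (the file layout is not stated on this page). The function name `clean_gene_list` and the two patterns are illustrative heuristics introduced here: they only flag entries that look like Ensembl or purely numeric identifiers, and they do not validate symbols against HUGO.

```python
import re

# Heuristic patterns (assumptions): identifiers that are probably NOT gene symbols.
ENSEMBL_ID = re.compile(r"^ENS[A-Z]*[GTP]\d{11}$")  # e.g. Ensembl gene/transcript/protein IDs
NUMERIC_ID = re.compile(r"^\d+$")                   # e.g. Entrez-style numeric IDs


def clean_gene_list(lines):
    """Strip whitespace, drop blanks and duplicates, and separate suspicious entries.

    Returns (symbols, suspicious), both in first-seen order.
    """
    symbols, suspicious, seen = [], [], set()
    for line in lines:
        token = line.strip()
        if not token or token in seen:
            continue
        seen.add(token)
        if ENSEMBL_ID.match(token) or NUMERIC_ID.match(token):
            suspicious.append(token)
        else:
            symbols.append(token)
    return symbols, suspicious


# Toy input, not taken from the tutorial files.
example = ["ESR1", "GREB1 ", "", "ESR1", "ENSG00000091831", "2099"]
symbols, suspicious = clean_gene_list(example)
print("\n".join(symbols))
print("check manually:", suspicious)
```

To apply the same check to a tutorial file, pass an open file handle (for example for 12hr_topgenes.txt) instead of the toy list.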

Go to g:Profiler website - http://biit.cs.ut.ee/gprofiler/
Copy all genes from the tutorial file 12hr_topgenes.txt and paste them into the Query box. Warning: make sure that your list contains only official gene symbols (HUGO).
In Options, check Significant only and No electronic GO annotations
Set the Output type to Generic EnrichmentMap
Show advanced options
Set Min and Max size of functional category to 3 and 500 respectively.
Select 2 for Size of Q&T
On the right panel, choose the Gene Ontology Biological process and Reactome
Set Significance threshold to Benjamini-Hochberg FDR
Click on g:Profile! to run the analysis
1. [Note] - If some identifiers in your query have multiple mappings in g:Profiler, they are excluded by default. If this happens, a warning with a link appears above the g:Profiler results.
2. Click on the link in that warning to manually map each gene to its correct annotation.
3. Click on Resubmit query to update your results with the specified mappings.
4. If the identifier discrepancy warning is ignored, the number of genes g:Profiler attributes to a particular gene set may differ from the number associated with it in the Enrichment Map.
Download the g:Profiler data as gmt: name. Note: you will have to unzip the downloaded folder.
Download the result file: Download data in Generic Enrichment Map (GEM) format

Note: repeat these steps for the 24hr time point using the file 24hr_topgenes.txt
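
Before loading the two downloads into Cytoscape, it can be useful to confirm that every gene set in the result file is also present in the GMT file. The sketch below is a minimal Python illustration under stated assumptions: the GMT file is tab-separated with the gene-set ID, a description, then member genes; the generic enrichment file is tab-separated with a header line followed by rows of ID, description, p-value, FDR, phenotype, and a comma-separated gene list. The function names and the toy rows are hypothetical and are not taken from the tutorial data.

```python
def read_gmt(lines):
    """Map gene-set ID -> set of member genes from GMT-formatted lines."""
    gene_sets = {}
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue  # blank line or a set without genes
        set_id, genes = fields[0], fields[2:]
        gene_sets[set_id] = {g for g in genes if g}
    return gene_sets


def read_enrichments(lines):
    """Parse generic enrichment rows; the first line is assumed to be a header."""
    rows = []
    for line in list(lines)[1:]:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 4:
            continue  # skip blank or truncated rows
        genes = fields[5].split(",") if len(fields) > 5 and fields[5] else []
        rows.append({
            "id": fields[0],
            "description": fields[1],
            "p_value": float(fields[2]),
            "fdr": float(fields[3]),
            "genes": genes,
        })
    return rows


def missing_from_gmt(rows, gene_sets):
    """Return IDs of enriched gene sets that have no entry in the GMT file."""
    return [row["id"] for row in rows if row["id"] not in gene_sets]


# Toy data (hypothetical IDs and genes), standing in for the downloaded files.
gmt_lines = [
    "SET_A\tToy set A\tGENE1\tGENE2\tGENE3\n",
    "SET_B\tToy set B\tGENE3\tGENE4\n",
]
gem_lines = [
    "GO.ID\tDescription\tp.Val\tFDR\tPhenotype\tGenes\n",
    "SET_A\tToy set A\t0.001\t0.01\t1\tGENE1,GENE3\n",
    "SET_C\tToy set C\t0.02\t0.04\t1\tGENE4\n",
]
gene_sets = read_gmt(gmt_lines)
rows = read_enrichments(gem_lines)
print(missing_from_gmt(rows, gene_sets))
```

Both readers accept any iterable of lines, so an open file handle can be passed in place of the toy lists.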

Screenshot: gProfiler EM Panel

Link to a step by step tutorial: gProfiler_step_by_step.pdf

Step 2: Generate Enrichment Map with g:Profiler Output

g:Profiler output files: gProfiler_EM.zip

1. Open Cytoscape

2. In the menu bar, open the Apps menu and select EnrichmentMap --> Create Enrichment Map

3. Make sure the Analysis Type is set to generic(ex:gProfiler)

4. Select the following files by clicking on the respective (...) button and choosing the file in the dialog:
- GMT: hsapiens.pathways.NAME.gmt
- Dataset 1 / Enrichments: gProfiler_results_12hr.txt

5. Tune Parameters
- P-value cut-off 1
- Q-value cut-off 1
- Overlap Coefficient cut-off 0.5

6. Click on the Build button at the bottom of the panel to create the Enrichment Map

7. In the menu bar, go to View and activate Show Graphics Details

8. In the control panel, go to Style, click on Label and select EM1_GS_DESCR in the Column dropdown. This labels nodes with names rather than GO IDs. The value to select may be EM2_GS_DESCR or similar if you have more than one Enrichment Map open in Cytoscape.
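
The Overlap Coefficient cut-off set in the Tune Parameters step can be illustrated with a small Python sketch. It uses the standard definition of the overlap coefficient, the size of the intersection divided by the size of the smaller set. Treating the cut-off as inclusive (`>=`) and computing it on complete gene sets are assumptions made here for illustration, not a description of how the Enrichment Map app is implemented. The gene-set names and genes are toy values.

```python
from itertools import combinations


def overlap_coefficient(a, b):
    """|A intersect B| / min(|A|, |B|); defined as 0.0 if either set is empty."""
    a, b = set(a), set(b)
    if not a or not b:
        return 0.0
    return len(a & b) / min(len(a), len(b))


def edges_above_cutoff(gene_sets, cutoff=0.5):
    """Return (set_a, set_b, shared_gene_count, coefficient) for pairs at or above the cut-off."""
    edges = []
    for name_a, name_b in combinations(sorted(gene_sets), 2):
        score = overlap_coefficient(gene_sets[name_a], gene_sets[name_b])
        if score >= cutoff:  # inclusive comparison is an assumption
            shared = len(set(gene_sets[name_a]) & set(gene_sets[name_b]))
            edges.append((name_a, name_b, shared, score))
    return edges


# Toy gene sets (hypothetical names and genes).
toy_sets = {
    "SET_A": {"G1", "G2", "G3", "G4"},
    "SET_B": {"G3", "G4"},
    "SET_C": {"G4", "G5", "G6", "G7"},
}
for edge in edges_above_cutoff(toy_sets, cutoff=0.5):
    print(edge)
```

In this toy case SET_A and SET_C share only one of four genes (coefficient 0.25), so no edge is drawn between them at a cut-off of 0.5, while SET_B is fully contained in SET_A (coefficient 1.0).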

Step 3: Examining Results

gProfiler EM Result
Legend:

Node size corresponds to the number of genes in dataset 1 within the geneset
Colour of the node corresponds to the significance of the geneset for dataset 1.
Edge size corresponds to the number of genes that overlap between two connected genesets.

Page history: revision 46 as of 2017-07-27 13:14:30 (editor: RuthIsserlin, size: 5178); revision 21 as of 2013-06-07 20:22:34 (editor: VeroniqueVoisin, size: 5160).