Revision 1 as of 2011-08-24 13:38:30
= Enrichment Map Genesets =
== Summary ==
Enrichment Map Genesets are a collection of gene set files in GMT format (compatible with GSEA), rebuilt monthly from their original source locations. Each gene set is available with the following identifier types:
- Entrez gene ids
- UniProt accessions
- Gene symbols
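
Since every file is distributed as GMT, a short reader can help when checking a download. The sketch below assumes the usual GMT layout (one gene set per line, tab-separated: name, description, then member ids). The function name `parse_gmt` and the sample gene set names are hypothetical and are not part of the Enrichment Map tooling.

```python
import io


def parse_gmt(handle):
    """Read GMT lines into {gene_set_name: (description, [member ids])}."""
    gene_sets = {}
    for line in handle:
        line = line.rstrip("\r\n")
        if not line.strip():
            continue  # skip blank lines
        fields = line.split("\t")
        if len(fields) < 2:
            raise ValueError("GMT line needs at least a name and a description")
        name, description = fields[0], fields[1]
        # Everything after the second column is a member id; drop empty cells.
        members = [f for f in fields[2:] if f]
        gene_sets[name] = (description, members)
    return gene_sets


# Illustrative two-line GMT snippet (set names and descriptions are made up).
example = (
    "PATHWAY_A\tillustrative description\t1017\t1019\t595\n"
    "PATHWAY_B\tanother description\t7157\t4193\n"
)
sets = parse_gmt(io.StringIO(example))
```

The same function can be pointed at an open file, for example `parse_gmt(open(path))`, where `path` is whichever GMT file was downloaded.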
== Sources ==
|| Source || File Origin || File Type || ID extracted ||
|| KEGG || static (July 2011) || gmt || Gene symbol ||
|| IOB || static (July 2011) || biopax || Entrez gene ||
|| MSigDB - c2 || static (needs to be updated manually) || gmt || Entrez gene ||
|| NetPath || website (scripted grab of files numbered 1-25) || biopax || Entrez gene ||
|| HumanCyc || scripted grab of zipped release from password-protected website || biopax || UniProt ||
|| NCI || scripted grab from Pathway Commons || gmt || Entrez gene ||
|| Biocarta || MSigDB - c2 || gmt || Entrez gene ||
|| Reactome || scripted grab of zipped release from website || biopax || UniProt ||
|| GO || scripted grab from FTP site || GAF || UniProt ||
|| Specialty GMTs || grab from MSigDB || gmt || Entrez gene ||
== File Structure ==
< > denotes a directory.

- <Release> - named according to the date the sets were updated
  - <Species>
    - <Identifier> - one of Entrez gene, UniProt, or Gene symbol
      - <GO>
        - BP = biological process
        - MF = molecular function
        - CC = cellular component
        - All = BP + MF + CC
        - no_GO_IEA - the file excludes GO annotations with the evidence codes 'IEA' (inferred from electronic annotation), 'ND' (no biological data available), and 'RCA' (inferred from reviewed computational analysis)
        - with_GO_IEA - the file includes GO annotations with the evidence codes 'IEA' (inferred from electronic annotation), 'ND' (no biological data available), and 'RCA' (inferred from reviewed computational analysis)
      - <Pathways>
      - <miRs>
      - <TF>
      - <Disease phenotypes>

Each <Identifier> directory also contains amalgamated gene set files:
- AllPathways - contains all pathway sources in the Pathways directory
- GOPathways - contains all GO (MF, BP, CC) and all pathway sources in the Pathways directory
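
The difference between the no_GO_IEA and with_GO_IEA files comes down to a filter on the GO evidence code. The sketch below illustrates that filter on GAF-formatted lines, where the evidence code is the seventh tab-separated column. It is not the script used to build these files; the function name `filter_gaf_lines`, the `include_iea` flag, and the sample annotation lines are all hypothetical.

```python
# Evidence codes dropped from the no_GO_IEA files, as listed above.
EXCLUDED_CODES = {"IEA", "ND", "RCA"}


def filter_gaf_lines(lines, include_iea=False):
    """Return (object id, GO id, evidence code) tuples from GAF lines.

    With include_iea=False, annotations whose evidence code is IEA, ND,
    or RCA are skipped, mirroring the no_GO_IEA description.
    """
    kept = []
    for line in lines:
        if line.startswith("!"):
            continue  # GAF header and comment lines start with '!'
        cols = line.rstrip("\r\n").split("\t")
        if len(cols) < 7:
            continue  # not a complete annotation row
        object_id, go_id, evidence = cols[1], cols[4], cols[6]
        if not include_iea and evidence in EXCLUDED_CODES:
            continue
        kept.append((object_id, go_id, evidence))
    return kept


# Made-up annotation rows, truncated to the first seven GAF columns.
sample = [
    "!gaf-version: 2.0",
    "UniProtKB\tP00001\tGENE1\t\tGO:0000001\tPMID:0\tIDA",
    "UniProtKB\tP00002\tGENE2\t\tGO:0000002\tPMID:0\tIEA",
    "UniProtKB\tP00003\tGENE3\t\tGO:0000003\tPMID:0\tRCA",
]
no_iea = filter_gaf_lines(sample)
with_iea = filter_gaf_lines(sample, include_iea=True)
```

Here `no_iea` keeps only the IDA row, while `with_iea` keeps all three annotation rows.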