Enrichment Map Genesets

Summary

Sources

| Source | File Origin | File Type | ID extracted | Frequency source is updated | Number of pathways | Notes |
|---|---|---|---|---|---|---|
| KEGG | KEGG ftp site (July 2011) | gmt | symbol | static as of July 1, 2011 | 236 | Not available in biopax; available as a flat file, translated into gmt files |
| Msigdb - c2 | static (needs to be updated manually) | gmt | Entrez gene | sporadically | Biocarta - 217; Other - 47 | Only Other and Biocarta are needed, as all other sources are currently covered |
| NCI | NCI | biopax | Entrez gene | sporadically | 219 pathways | |
| {X} Biocarta | NCI | biopax | Entrez gene | static | 386 pathways | Biopax 3 file is a complete mess; currently getting Biocarta from Msigdb |
| IOB | directly from IOB - static (July 2011) | biopax | Entrez gene | sporadically | 35 pathways: 10 are the same as CellMap, 1 is the same as NetPath | The biopax pathways need to be fixed so that the species info is correct, but the information is still extractable |
| NetPath | www.netpath.org/browse (scripted grab of files numbered 1-25) | biopax | Entrez gene | static | 25 pathways: 12 are cancer pathways (10 are CellMap), 13 are immunity pathways | The biopax pathways need to be fixed so that the species info is correct, but the information is still extractable |
| HumanCyc | scripted grab of zipped release from password-protected website | biopax | Uniprot | updated periodically | 249 pathways | available in biopax level 2 and level 3 |
| Reactome | scripted grab of zipped release from website | biopax | Uniprot | updated release | 1117 pathways (release 37) | No way of getting the release version from the biopax file |
| GO | scripted grab from EBI ftp site (human) | GAF | Uniprot | released once a month | 13,034 without GO IEA; 15,181 with GO IEA | source is direct from original curator of annotations |
| msigdb - c3: specialty GMTs (miRs, transcription factors) | grab from Msigdb | gmt | Entrez gene | sporadically | 221 miRs; 616 TFs | |
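
The two GO counts above differ in whether electronically inferred annotations (evidence code IEA) are kept. A minimal sketch of such a filter, assuming a standard tab-delimited GAF file in which the evidence code is column 7 and comment lines start with `!`. The function name `drop_iea` is hypothetical and not part of the original pipeline:

```shell
# drop_iea FILE...: print GAF annotation lines whose evidence code is not IEA.
# Assumes tab-delimited GAF with the evidence code in column 7.
# Header/comment lines (starting with "!") are dropped as well.
drop_iea() {
  awk -F'\t' '!/^!/ && $7 != "IEA"' "$@"
}
```

Comparing the line counts of the filtered and unfiltered file gives a quick sense of how much of the annotation is IEA-only.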

| Source | File Origin | File Type | ID extracted | Frequency source is updated | Number of pathways | Notes |
|---|---|---|---|---|---|---|
| Reactome | scripted grab of zipped release from website | biopax | Uniprot | updated release | 946 pathways (release 37) | No way of getting the release version from the biopax file |
| GO | scripted grab from MGI ftp site (mouse) | GAF | MGI | released once a month | 14,563 without GO IEA; 15,041 with GO IEA | source is direct from original curator of annotations |
| KEGG | translated from Human using Homologene | gmt | Entrez gene | static as of July 1, 2011 | 236 | Not available in mouse-specific format |
| Msigdb - c2 | translated from Human using Homologene | gmt | Entrez gene | sporadically | total 880: Kegg - 186; Reactome - 430; Biocarta - 217; Other - 47 | Only Other and Biocarta are needed, as all other sources are currently covered |
| NCI | translated from Human using Homologene | gmt | Entrez gene | sporadically | 219 pathways | |
| IOB | translated from Human using Homologene | gmt | Entrez gene | sporadically | 35 pathways: 10 are the same as CellMap, 1 is the same as NetPath | The biopax pathways need to be fixed so that the species info is correct, but the information is still extractable |
| NetPath | translated from Human using Homologene | gmt | Entrez gene | static | 25 pathways: 12 are cancer pathways (10 are CellMap), 13 are immunity pathways | The biopax pathways need to be fixed so that the species info is correct, but the information is still extractable |
| HumanCyc | translated from Human using Homologene | gmt | Entrez gene | updated periodically | 249 pathways | Available as MouseCyc in biopax, but parsing it yielded only a fraction of the pathways present in human, so the human files are converted instead |
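
Several of the sources above are described as translated from Human using Homologene. The following is only a sketch of what such a translation step could look like, not the script actually used. It assumes a two-column, tab-delimited map file (human Entrez gene ID, mouse Entrez gene ID) has already been derived from Homologene. The function name `translate_gmt` and the map file layout are hypothetical:

```shell
# translate_gmt MAP GMT: rewrite the gene IDs of a GMT file through an ID map.
# MAP is assumed to be tab-delimited: human_id <TAB> mouse_id (hypothetical layout).
# Genes with no entry in the map are dropped; if a human ID appears more than
# once in the map, the last entry wins.
translate_gmt() {
  awk -F'\t' -v OFS='\t' '
    NR == FNR { map[$1] = $2; next }          # first file: load the ID map
    {
      out = $1 OFS $2                          # keep set name and description
      for (i = 3; i <= NF; i++)
        if ($i in map) out = out OFS map[$i]   # translate genes that have a homolog
      print out
    }' "$1" "$2"
}
```

A call such as `translate_gmt human_to_mouse.tsv Human_NetPath_Entrezgene.gmt` would then write the translated gene sets to standard output; the map file name here is illustrative.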

File Structure

< > denotes directory

Creating customized Genesets

  1. Download the gene set files you want to include in your customized set (for example, Human_IOB_Entrezgene.gmt and Human_NetPath_Entrezgene.gmt).

  2. Concatenate them into a single gmt file:

   cat Human_IOB_Entrezgene.gmt Human_NetPath_Entrezgene.gmt > MyCustomizedSet.gmt
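
Plain `cat` keeps every line, so a gene set name that occurs in more than one input file ends up in the combined file twice. A minimal sketch of a merge that keeps only the first occurrence of each name, assuming the usual tab-delimited gmt layout with the gene set name in column 1. The function name `merge_gmt` is hypothetical:

```shell
# merge_gmt FILE...: concatenate GMT files, skipping blank lines and keeping
# only the first line seen for each gene set name (column 1, tab-delimited).
merge_gmt() {
  awk -F'\t' 'NF && !seen[$1]++' "$@"
}
```

It would be used in place of `cat`, for example `merge_gmt Human_IOB_Entrezgene.gmt Human_NetPath_Entrezgene.gmt > MyCustomizedSet.gmt`. Note that this compares names only; two sets with the same name but different genes are not reconciled, the later one is simply dropped.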

GeneSets (last edited 2011-09-01 19:51:40 by RuthIsserlin)
