DW Resource Loading: Entrez:

gene2accession, 
gene2go, 
gene2pubmed, 
gene2refseq,
gene2sts,
gene2unigene,
gene_history,
gene_info,
mim2gene,
gene_refseq_uniprotkb_collab,
interactions.

Identifier Mapping Tables:

1. Six tab-delimited files (can be imported into Excel or a similar package, as well as parsed). Each file corresponds to one of the species of interest, except for A. thaliana. Files will be named after their species.

2. Each file will have the following columns (example values from human):

Ensembl Gene ID: ENSG00000198888
Ensembl Gene Symbol: ND1
Entrez Gene ID: 4535
Ensembl Description: etc...

3. The first row in each file will be the header row.

4. In general, when no data is available for a cell, the term 'NA' (or a similar standard term) will be used instead. When a cell holds more than one entry, the entries will be separated by a ';'.

5. As a general rule for any importing system, it's better to treat IDs as alphanumeric rather than numeric, since they can be either, depending on the source database referenced.
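
The conventions above (tab-delimited, header row first, 'NA' for missing data, ';' between multiple entries, IDs kept as text) could be handled by an importer along the following lines. This is only a minimal sketch in Python, not the loader actually used for the warehouse: the function name parse_mapping, the exact 'NA' token, and the second sample row are assumptions made for illustration.

```python
import csv
import io

NA_TOKEN = "NA"    # assumed placeholder; the files may use a similar standard term
MULTI_SEP = ";"    # separator for multiple entries in one cell


def parse_mapping(handle):
    """Parse a tab-delimited mapping file whose first row is the header row.

    Every cell becomes a list of strings: empty for 'NA' or blank cells,
    one item for a single entry, several items for ';'-separated entries.
    IDs are never converted to int, so alphanumeric IDs survive unchanged.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    rows = []
    for record in reader:
        parsed = {}
        for column, cell in record.items():
            if column is None:
                # Extra cells beyond the header row are ignored in this sketch.
                continue
            cell = (cell or "").strip()
            if cell == "" or cell == NA_TOKEN:
                parsed[column] = []
            else:
                parsed[column] = [v.strip() for v in cell.split(MULTI_SEP) if v.strip()]
        rows.append(parsed)
    return rows


# First data row uses the human example from this page;
# the second row is made up to show 'NA' and ';' handling.
sample = (
    "Ensembl Gene ID\tEnsembl Gene Symbol\tEntrez Gene ID\n"
    "ENSG00000198888\tND1\t4535\n"
    "ENSG_HYPOTHETICAL\tNA\t101;102\n"
)

rows = parse_mapping(io.StringIO(sample))
for row in rows:
    print(row)
```

For a real file, pass an open file object instead of the io.StringIO sample, for example open(path, newline="", encoding="utf-8"), where path is whatever name the species file was given.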


GeneMania/GeneManiaDataWarehouse (last edited 2008-10-20 17:07:33 by GaryBader)
