This page details how to visualize a biological network (such as a gene-gene interaction network) with associated chemical information.
Sample Network based on synthetic lethal data for KRAS.
“KRAS_SL_genes_from_publications.txt” is a tab-delimited file that describes a synthetic lethal interaction network between KRAS and 116 genes, derived from two publications. Download here: KRAS_SL_genes_from_publications.txt
Mutations in the KRAS gene, a member of the Ras family of small GTPases, are frequently found in pancreatic, thyroid, colon, lung and liver cancers and are correlated with poor prognosis. In this network, nodes represent genes and edges represent published synthetic lethal interactions. Genes that are synthetically lethal with KRAS and have a known inhibitor available could represent accessible therapeutic targets for the treatment of tumours with a KRAS mutation.
The file contains five columns:
Screened Entrez Gene: The Entrez Gene ID for KRAS.
SL Entrez Gene: The Entrez Gene ID for the gene that was reported to be synthetically lethal with the KRAS gene.
Screened Gene: The gene symbol for KRAS.
SL Gene: The gene symbol for the gene that was reported to be synthetically lethal with the KRAS gene.
Pubmed Link: A link to the publication that reported this interaction.
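As an illustration of how the five columns map onto a network, the sketch below reads a tab-delimited table with this layout and builds an edge list. It assumes the file starts with a header row that uses the column names exactly as listed above, and that a gene pair reported in both publications appears on two rows; neither has been checked against the real file. The rows in `SAMPLE` are invented placeholders (only the Entrez Gene ID 3845 for KRAS is real), and `read_sl_edges` is a hypothetical helper name.

```python
import csv
import io

# Invented stand-in for KRAS_SL_genes_from_publications.txt.
# Header names are assumed to match the column names described above.
SAMPLE = (
    "Screened Entrez Gene\tSL Entrez Gene\tScreened Gene\tSL Gene\tPubmed Link\n"
    "3845\t1001\tKRAS\tGENE_A\thttps://example.org/publication-1\n"
    "3845\t1002\tKRAS\tGENE_B\thttps://example.org/publication-2\n"
    "3845\t1001\tKRAS\tGENE_A\thttps://example.org/publication-2\n"
)


def read_sl_edges(handle):
    """Return {(screened gene, SL gene): [publication links]} from a TSV handle."""
    edges = {}
    for row in csv.DictReader(handle, delimiter="\t"):
        key = (row["Screened Gene"], row["SL Gene"])
        # Collect every publication that supports the same edge.
        edges.setdefault(key, []).append(row["Pubmed Link"])
    return edges


edges = read_sl_edges(io.StringIO(SAMPLE))
# Every gene that appears at either end of an edge becomes a node.
nodes = sorted({gene for edge in edges for gene in edge})

for (source, target), links in sorted(edges.items()):
    print(source, "-", target, "supported by", len(links), "publication(s)")
```

To try this on the downloaded file, pass `open("KRAS_SL_genes_from_publications.txt", newline="")` to `read_sl_edges` in place of the `io.StringIO` object.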
Annotation File Based on Chembl v6 Data
“Chembl_targets.txt” is a tab-delimited file describing 2104 proteins from a variety of species that are reported in the Chembl database (v6) to have at least one published, high-confidence, potent interaction with a compound. Download here: Chembl_targets.txt
It contains 10 columns:
Uniprot ID: The Uniprot ID for the protein, as listed in Chembl.
Entrez Gene ID: The Entrez Gene ID for this protein based on the ID mapping service provided by the Uniprot website.
Protein name from Chembl: A common name for this protein as listed by Chembl.
Number of different compounds reported in Chembl: The number of distinct compounds that have a value of <1 µM associated with this protein at a Chembl confidence level of 5 or greater. Because commercial supplies of particular compounds tend to vary over time, this number could be used as an estimate of how easy it will be to source a compound to inhibit this protein.
Number of publications reporting a compound protein interaction: Based on the same criteria as above, this is the number of distinct publications that report at least one compound with an interaction with the protein. This number could be used as a proxy for how often novel compounds associated with this protein are identified.
Smiles: The SMILES string in this column represents the chemical structure of an example compound that has been annotated with this protein. The example compound was chosen because Chembl reports that it has the highest number of distinct publications associating it with this protein.
InChIKey: The InChIKey is an alternative chemical representation of the example compound that is optimized for finding chemical information with text-based search tools, such as internet searches and database queries.
#Publications reporting interaction: This column contains the number of publications in Chembl in which the example compound has been associated with this protein. It could be used as a measure of how reliable this particular compound interaction is, on the grounds that the more often an interaction is tested and reported, the more reliable it is.
Chembl link: A link to the Chembl webpage for this protein. The webpage contains a wide variety of additional information regarding the reported interaction of this protein with different compounds.
Compound: A flag to identify which proteins/genes have been annotated in Chembl. This is useful for visualizing proteins/genes that have an associated compound.
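One way to combine the two files is to join them on the Entrez Gene ID, so that each gene in the network carries a flag for whether a compound is known. The sketch below is a minimal example under several assumptions: the header names match the column names listed above, the rows are invented placeholders, and `load_annotation` and `annotate` are hypothetical helper names. Because the encoding of the Compound column is not described here, the sketch treats presence in the annotation file as the flag. It also keeps only one row per Entrez Gene ID, which is a simplification if several proteins map to the same gene.

```python
import csv
import io

COUNT_COLUMN = "Number of different compounds reported in Chembl"

# Invented excerpt of Chembl_targets.txt, reduced to two of its ten columns.
ANNOTATION = (
    "Entrez Gene ID\t" + COUNT_COLUMN + "\n"
    "1001\t12\n"
    "1003\t40\n"
)

# Invented synthetic lethal partners of KRAS: (Entrez Gene ID, gene symbol).
SL_GENES = [("1001", "GENE_A"), ("1002", "GENE_B"), ("1003", "GENE_C")]


def load_annotation(handle):
    """Index annotation rows by Entrez Gene ID (a later row replaces an earlier one)."""
    reader = csv.DictReader(handle, delimiter="\t")
    return {row["Entrez Gene ID"]: row for row in reader}


def annotate(sl_genes, annotation):
    """Attach compound information to each gene, most compounds first."""
    annotated = []
    for entrez_id, symbol in sl_genes:
        row = annotation.get(entrez_id)
        annotated.append({
            "symbol": symbol,
            "entrez_id": entrez_id,
            # Presence in the annotation file is used as the compound flag.
            "has_compound": row is not None,
            "n_compounds": int(row[COUNT_COLUMN]) if row else 0,
        })
    return sorted(annotated, key=lambda node: node["n_compounds"], reverse=True)


ranked = annotate(SL_GENES, load_annotation(io.StringIO(ANNOTATION)))
for node in ranked:
    print(node["symbol"], node["has_compound"], node["n_compounds"])
```

Sorting by the number of distinct compounds follows the suggestion above that this count can serve as a rough estimate of how easy a compound will be to source. The resulting per-gene values could then be loaded as node attributes in a network viewer.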