Diff for "DomainSpecificityPredictionProject" - Bader Lab @ The University of Toronto

Differences between revisions 77 and 104 (spanning 27 versions)

Peptide Recognition Domain Interaction Prediction


Background

The human genome contains approximately 26,000 protein-coding genes, which through alternative splicing can direct the synthesis of thousands of different proteins. The majority of these proteins interact with other proteins to coordinate a variety of cellular processes including DNA replication, cell cycle control, and signal transduction. The ability to accurately detect these interactions enables the assembly of protein interaction networks which can be used to better understand and study the biochemistry of the cell.

Computational PPI Prediction

Computational methods to predict protein-protein interactions (PPIs) have been developed and can be used to support or prioritize experiments. Such methods range from physics-based to statistics-based approaches, but they all face challenges. For physics-based prediction methods, the structures of the proteins are often unavailable, or protein flexibility is not taken into consideration. Sequence-based methods such as position weight matrices (PWMs) can only represent short binding motifs and often do not account for interdependencies between residues and positions. In general, the computational prediction of PPIs is considered an extremely difficult problem that is not fully addressed by any existing method.
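The limitation of position weight matrices (PWMs) noted above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the four training peptides are invented (not experimental data), and the pseudocount and uniform background frequency are arbitrary choices. In the toy set, T at position 2 always co-occurs with V at position 4, and S with L, yet the PWM scores each column independently and so cannot represent that pairing.

```python
import math

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"

# Hypothetical aligned binding peptides (illustrative only, not real data).
# Position 2 and position 4 co-vary: T pairs with V, S pairs with L.
TRAINING_PEPTIDES = ["ETAV", "ETAV", "ESAL", "ESAL"]


def build_pwm(peptides, pseudocount=1.0, background=1.0 / 20):
    """Build a log-odds PWM: one dict of residue scores per position."""
    length = len(peptides[0])
    pwm = []
    for pos in range(length):
        # Start every residue at the pseudocount so unseen residues get a finite score.
        counts = {aa: pseudocount for aa in ALPHABET}
        for peptide in peptides:
            counts[peptide[pos]] += 1
        total = sum(counts.values())
        pwm.append(
            {aa: math.log2((counts[aa] / total) / background) for aa in ALPHABET}
        )
    return pwm


def score(pwm, peptide):
    """Sum per-position log-odds; each position is scored independently."""
    return sum(column[aa] for column, aa in zip(pwm, peptide))


pwm = build_pwm(TRAINING_PEPTIDES)
observed_score = score(pwm, "ETAV")    # combination present in the training set
unobserved_score = score(pwm, "ETAL")  # T with L never occurs in the training set
unrelated_score = score(pwm, "WWWW")   # residues never seen at any position
```

Because the columns are independent, the never-observed combination ETAL receives the same score as the observed ETAV. Capturing the T/V and S/L pairing would need a model that considers positions jointly.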

Many PPIs are mediated by peptide recognition domains (PRDs), which are evolutionarily conserved modular interaction domains often found combined in different ways to form larger proteins. Proteins containing PRDs are used by the cell for numerous processes such as the co-localization of proteins, regulation of signaling processes, or recognition of protein post-translational modifications. Interactions usually occur through the recognition of short linear sequences in the target protein, such as proline-rich or C-terminal motifs. Because of their simpler binding sites and straightforward modes of target recognition, peptide-PRD interactions are easier to predict computationally than PPIs in general.

Computational Prediction of PDZ Domain Interactions

The PSD95/DlgA/Zo-1 (PDZ) domain is an ideal model for studying the computational prediction of peptide-PRD interactions, since PDZ domains have important biological roles, are well studied, and have one of the simplest binding sites among PRDs. PDZ domains are found in bacteria, yeast, plants, and metazoans, with 250 found in humans. They often interact with ion channels, adhesion molecules, and neurotransmitter receptors in signaling and scaffolding proteins. Their biological roles include maintaining cell polarity, facilitating signal coupling, and regulating synaptic development. Their importance is underscored by the association of PDZ domain mutations in different proteins with various diseases.

Sequence Based Prediction

Recently, two high-throughput experiments have been performed to study different PDZ domains. These have enabled the development of computational predictors of PDZ domain interactions. My current project focuses on using support vector machines, a machine learning method, to computationally predict PDZ domain interactions directly from a given proteome. [Read More]
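As a rough sketch of what scanning a proteome could involve, the snippet below extracts the C-terminal tail of each protein and converts it into a numeric vector of the kind a support vector machine could take as input. This is not the project's actual pipeline, which the page does not describe. The five-residue window, the one-hot encoding, the function names, and the toy sequences are all assumptions made for illustration, and the classifier itself is omitted.

```python
ALPHABET = "ACDEFGHIKLMNPQRSTVWY"


def c_terminal_peptide(sequence, length=5):
    """Return the last `length` residues, or None if the protein is too short.

    The window of 5 residues is an illustrative assumption.
    """
    if len(sequence) < length:
        return None
    return sequence[-length:]


def one_hot(peptide):
    """Flatten a peptide into a 0/1 vector with 20 entries per position.

    Residues outside the 20-letter alphabet yield an all-zero block.
    """
    vector = []
    for residue in peptide:
        vector.extend(1 if residue == aa else 0 for aa in ALPHABET)
    return vector


# Hypothetical toy "proteome": identifiers and sequences are invented.
proteome = {
    "protA": "MKLSAGQRETSV",
    "protB": "MSTNPEEDLLWKYA",
    "protC": "MAV",  # shorter than the window, so it is skipped
}

features = {}
for name, sequence in proteome.items():
    peptide = c_terminal_peptide(sequence)
    if peptide is not None:
        features[name] = one_hot(peptide)
```

Each entry in `features` is a fixed-length vector (here 5 x 20 = 100 values) that a trained classifier could score for a given PDZ domain. A real predictor might also encode the domain sequence, which this sketch leaves out.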

Sequence and Structure Based Prediction

Work in progress [Read More]

Team

Shirley Hui
Gary Bader

CategoryProject

-  Revision 77 as of 2009-07-07 00:48:23
  Size: 5292
  Editor: localhost
  Comment: converted to 1.6 markup
+  Revision 104 as of 2010-10-13 14:38:17
  Size: 8739
  Editor: ShirleyHui
  Comment:
-Deletions are marked like this.
+Additions are marked like this.
 Line 1:
-#acl BaderLabGroup:read,write,revert,delete All:
+#acl All:read

== Peptide Recognition Domain Interaction Prediction ==
-Line 6:
+Line 8:
-== Goals ==
 * Computationally predict specificity of peptide recognition domain from the primary amino acid sequences
 * Analyze PDZ, WW and then SH3 domains
+=== Background ===
The human genome contains approximately 26,000 protein-coding genes, which through alternative splicing can direct the synthesis of thousands of different proteins. The majority of these proteins interact with other proteins to coordinate a variety of cellular processes including DNA replication, cell cycle control, and signal transduction. The ability to accurately detect these interactions enables the assembly of protein interaction networks which can be used to better understand and study the biochemistry of the cell.
-Line 10:
+Line 11:
-== Background ==
 * [[/PDZ|PDZ Domains]]
 * [[/MachineLearning|Machine Learning]]
+=== Computation PPI Prediction ===
Computational methods to predict (protein protein interactions) PPIs have been developed and can be used to support or prioritize experiments.  Such methods fall into a range of categories from physics to statistics-based method, however they all face several challenges.  For physics-based prediction methods, the structures of the proteins are often unavailable or protein flexibility is not taken into consideration.  Sequence based methods like PWMs can only represent short binding motifs and often do not account for interdependencies between residues and positions. In general, the computational prediction of PPIs is considered an extremely difficult problem that is not fully addressed by any existing method.
 Line 14:
-== Strategy ==
 * [[/Strategy|Strategy]]
+Many PPIs are mediated by peptide recognition domains (PRDs), which are evolutionary conserved modular interaction domains often found combined in different ways to form larger proteins.  Proteins containing PRDs are used by the cell for numerous processes such as the co-localization of proteins, regulation of signaling processes or recognition of protein post-translational modifications. Interactions usually occur through the recognition of short linear sequences in the target protein such as proline-rich or C terminal motifs.  Because of their simpler binding sites and straightforward modes of target recognition, it is easier to computationally predict peptide-PRD interactions than it is to predict PPIs more generally.
-Line 17:
+Line 16:
-== Ideas ==
 * [[/Ideas|Ideas]]
+=== Computational Prediction of PDZ Domain Interactions ===
The PSD95/DlgA/Zo-1 (PDZ) domain is an ideal model for studying the computational prediction of peptide-PRD interactions since they are have important biological roles, are well studied and one of the simplest binding sites among PRDs. PDZ domains are found in bacteria, yeast, plants, and metazoans with 250 found in humans. They often interact with ion channels, adhesion molecules, and neurotransmitter receptors in signaling and scaffolding proteins.  The biological roles include maintaining cell polarity, facilitating signal coupling, and regulating synaptic development. Their importance is emphasized, as mutations of the PDZ domain in different proteins have been associated with various diseases.
-Line 20:
+Line 19:
-== Data ==
 * [[/PDZData|PDZ Data]]
+==== Sequence Based Prediction ====
Recently, two high through put experiments have been performed to study different PDZ domains.  This has enabled the development of computational predictors of PDZ domain interactions.  My current project focuses on using a machine learning method called support vector machines to computationally predict PDZ domain interactions directly from a given proteome. [[Data/PDZProteomeScanning|[Read More]]]
-Line 23:
+Line 22:
-== Experiments ==
 * [[/Experiments|Experiments and Results]]
+==== Sequence and Structure Based Prediction ====
Work in progress [[/Strategy|[Read More]]]
-Line 26:
+Line 25:
-== Status ==
 * [[/Log|Status]]
+## == Goals ==
## * Computationally predict specificity of peptide recognition domain from the primary amino acid sequences
## * Analyze PDZ, WW and then SH3 domains

## == Background ==
## * [[/PDZ|PDZ Domains]]
## * [[/MachineLearning|Machine Learning]]

## == Strategy ==
## * [[/Strategy|Strategy]]

## == Ideas ==
## * [[/Ideas|Ideas]]

## == Data ==
## * [[/PDZData|PDZ Data]]

## == Experiments ==
## * [[/Experiments|Experiments and Results]]

## == Status ==
##  * [[/Log|Status]]
-Line 79:
+Line 98:
-== Committee Meetings ==
 * [[/Meeting|Notes]]
+## == Committee Meetings ==
## * [[/Meeting|Notes]]
-Line 82:
+Line 101:
-== Tools/Resources ==
 * [[/ToolsResources|Tools and Resources]]
+## == Tools/Resources ==
## * [[/ToolsResources|Tools and Resources]]
-Line 85:
+Line 104:
-== Reading Notes ==
 * [[/../ShirleyHui/MBCReadings|Molecular Biology of the Cell]]
 * [[/../ShirleyHui/PPIReadings|Protein-protein Interaction Detection]]
 * Support Vector Machines
+## == Reading Notes ==
## * [[/../ShirleyHui/MBCReadings|Molecular Biology of the Cell]]
## * [[/../ShirleyHui/PPIReadings|Protein-protein Interaction Detection]]
## * Support Vector Machines
-Line 90:
+Line 109:
-== Related Literature ==
 * [[http://www.connotea.org/rss/user/s2hui?download=view|Literature List on Connotea]]
 * [[http://www.baderlab.org/DomainSpecificityPredictionProject/Reading|Molecular Biology of the Cell]]
+## == Related Literature ==
##  * [[http://www.connotea.org/rss/user/s2hui?download=view|Literature List on Connotea]]
## * [[http://www.baderlab.org/DomainSpecificityPredictionProject/Reading|Molecular Biology of the Cell]]