== Tools/Resources ==
=== Domains ===
 * [wiki:/PDZ PDZ Domain]
=== Databases ===
 * [http://www.ensembl.org/ Ensembl]
  * Software system that produces and maintains automatic annotation on selected eukaryotic genomes.
 * [http://www.ebi.ac.uk/interpro/ InterPro]
  * Database of protein families, domains and functional sites in which identifiable features found in known proteins can be applied to unknown protein sequences.
 * [http://www.biomart.org/ BioMart]
  * Query-oriented data management system that simplifies the creation and maintenance of advanced query interfaces backed by a relational database. It is particularly suited to 'data mining'-style searches of complex descriptive (e.g. biological) data.
=== Sequence Alignment ===
==== Multiple ====
===== Hierarchical Methods =====
 * [http://www.compbio.dundee.ac.uk/Software/Amps/amps.html/ AMPS] 1990
  * Calculates Z-scores through pairwise sequence comparison with randomization
  * Generates alignments without having to generate trees
 * [http://www.ebi.ac.uk/clustalw/ ClustalW] 1997
  * Uses a series of different pair-score matrices
  * Biases location of gaps based on secondary structure mask
  * Allows for realigning to refine the alignment
  * Can infer phylogeny
  * Problems:
   * Time required to complete the initial all-against-all comparison that creates the guide tree
 * [http://www.drive5.com/muscle/ MUSCLE] 2004
  * MUltiple Sequence Comparison by Log-Expectation
  * Uses a quick hashing comparison based on identical matches
 * [http://www.biophys.kyoto-u.ac.jp/~katoh/programs/align/mafft/ MAFFT] 2005
  * Calculates the guide tree faster by using a fast Fourier transform method on amino acid properties to identify regions of similarity
  * Uses these regions to guide dynamic programming alignment of the sequences
===== Non Hierarchical Methods =====
 * [http://www.ncbi.nlm.nih.gov/BLAST/ PSI-BLAST] 1997
  * Searches a database with a single sequence
  * High-scoring sequences are built into a multiple alignment, which is used to derive a search profile for the subsequent search of the database
  * Repeats until no new sequences are added to the profile or a specified number of iterations has been performed
 * [http://tcoffee.vital-it.ch/cgi-bin/Tcoffee/tcoffee_cgi/index.cgi T-Coffee] 2000
  * Builds a library of pairwise alignments for the sequences of interest
  * Uses the library to inform a hierarchical method that finds a multiple alignment preserving consistency between the pairwise alignments
  * Can align sequences of varying lengths
 * [http://baboon.math.berkeley.edu/amap/ AMAP] 2007
  * Multiple sequence alignment by sequence annealing
===== Probabilistic Methods =====
 * [http://probcons.stanford.edu/ Probcons] 2005
 * [http://probalign.njit.edu/probalign/login ProbAlign] 2006
  * Estimates amino acid posterior probabilities using a partition function of the alignments.
  * Computes the maximum expected accuracy alignment after applying the probability consistency transformation of Probcons.
  * Improvements are best seen with datasets of long, variable-length sequences.
=== Viewers ===
 * [http://www.jalview.org/ JalView]
  * Multiple alignment viewer/editor written in Java
== Background Literature ==
 * [http://www.connotea.org/rss/user/s2hui?download=view Literature List on Connotea]
=== Textbook ===
 * [http://www.baderlab.org/DomainSpecificityPredictionProject/Reading Molecular Biology of the Cell]
=== Other ===
 * http://proteinkeys.org
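The MUSCLE entry under Sequence Alignment mentions a quick hashing comparison based on identical matches. As a rough illustration of that idea only, the sketch below scores two sequences by the fraction of short identical substrings (k-mers) they share. The function names, the choice of k = 3, and the normalization by the shorter sequence are assumptions made for this example; this is not MUSCLE's actual implementation.

```python
from collections import Counter


def kmer_counts(seq, k=3):
    """Count every overlapping k-mer (length-k substring) in a sequence."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def kmer_similarity(a, b, k=3):
    """Fraction of k-mers shared by two sequences, relative to the shorter one.

    Hypothetical helper for illustration; returns a value in [0, 1].
    """
    counts_a, counts_b = kmer_counts(a, k), kmer_counts(b, k)
    # Multiset intersection: each k-mer counts min(count in a, count in b) times.
    shared = sum((counts_a & counts_b).values())
    # Number of k-mers in the shorter sequence (zero or less if it is too short).
    shortest = min(len(a), len(b)) - k + 1
    return shared / shortest if shortest > 0 else 0.0
```

Because no alignment is computed, a score like this can be evaluated for every pair of sequences cheaply, which is why k-mer comparisons are attractive as a first pass before building a guide tree.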
Goals
- Computationally predict the specificity of peptide recognition domains from their primary amino acid sequences
- Analyze PDZ, WW and then SH3 domains
Background
- [wiki:/PDZ PDZ Domains]
- [wiki:/MachineLearning Machine Learning]
Strategy
- [wiki:/Strategy Strategy]
Ideas
- [wiki:/Ideas Ideas]
Data
- [wiki:/PDZData PDZ Data]
Experiments
- [wiki:/Experiments Experiments and Results]
Status
- [wiki:/Log Status]
Tools/Resources
- [wiki:/ToolsResources Tools and Resources]
Reading Notes
- [wiki:/../ShirleyHui/MBCReadings Molecular Biology of the Cell]
- [wiki:/../ShirleyHui/PPIReadings Protein-protein Interaction Detection]
- Support Vector Machines
Related Literature
- [http://www.connotea.org/rss/user/s2hui?download=view Literature List on Connotea]
- [http://www.baderlab.org/DomainSpecificityPredictionProject/Reading Molecular Biology of the Cell]
Team
- Shirley Hui
- Gary Bader