#acl All:read

== Structure based proteome scanning prediction of PDZ domain peptide interactions ==
Shirley Hui, Xiang Xing, and Gary D. Bader

Website: URL

== Background ==
PDZ domains are peptide recognition domains that are involved in important biological processes and bind their targets by recognizing simple linear motifs. The recent availability of high throughput PDZ domain peptide interaction data has prompted the development of sequence based predictors of PDZ domain peptide interactions. However, the performance of these predictors depends on how similar in sequence a given domain is to the training domains. Domain structure features, on the other hand, are known to play a role in determining PDZ domain binding specificity and can also be used for training. When used for proteome scanning, a predictor trained on such features may be able to predict more novel interactions and increase the coverage of PDZ domain mediated protein-protein interactions (PPIs) that can currently be predicted.

== Results ==
We developed a structure based predictor of PDZ domain peptide interactions. For training, we use domain structure features that are known to facilitate protein folding, stability, and protein interactions. We also computationally generate additional negative interactions for training and show that this reduces the number of potential false positives returned by the predictor. Through multiple cross validation strategies and a series of blind tests, we estimate that the predictor has improved generalization performance and show that it can correctly predict interactions in different organisms. Through proteome scanning in human, we show that the structure based predictions correspond to known PDZ domain peptide interactions and known protein-protein interactions in curated databases. We also show that a large number of validated hits are novel, representing a 53% increase over the number of PDZ domain mediated PPIs that could be predicted before.
A functional enrichment analysis shows that the biological process terms associated with these hits are also novel.

== SVM Predictions ==
SVM predictions were validated using known interactions from PDZBase, a domain peptide interaction database, and known protein-protein interactions (PPIs) from iRefIndex. iRefIndex is a PPI database that consolidates PPIs from different databases, including BIND, BioGRID, CORUM, DIP, HPRD, !IntAct and MINT.

The following are SVM proteome scanning predictions for 175 human, 7 fly and 6 worm PDZ domains.

 * [[attachment:HumanPredictions.zip|Human 175 (zip)]]
 * [[attachment:FlyPredictions.zip|Fly 7 (zip)]]
 * [[attachment:WormPredictions.zip|Worm 6 (zip)]]

The format of the output files is:

 * The indicator is one of the following symbols:
  * * = validated by PDZBase, corresponds to an iRefIndex PPI (human only), or validated by protein microarray experiments (fly and worm only)
  * X = false positive as determined by protein microarray experiments (fly and worm only)
  * empty = no experiment or other evidence to validate or support this prediction
 * The sequence of length five of the predicted binder
 * A real number computed by the SVM to decide whether a given sequence should be predicted as a binder. All values are greater than zero because the files contain only predicted binders.
 * A code that is non empty only if the indicator is non empty, and is one of the following:
  * PB = found in PDZBase
  * IR = corresponds to a PPI in iRefIndex (transcript index1, ..., transcript index n)
  * PM = found in protein microarray experiment
  * - = not found in any of the above sources
 * Ensembl TRS ids corresponding to the predicted binder

== Supplementary ==
 * Supplementary Doc Link
 * Cytoscape BiNGO
  * [[attachment:BiNGOEnrichmentFiles.zip|BiNGO Enrichment Files (zip)]]
   * BiNGO enrichment files created by the Cytoscape BiNGO Plugin. Cytoscape v2.8.1 and the BiNGO Plugin v1.44 were used.
  * [[attachment:HumanBingoEnrichmentSummary.txt|BiNGO Enrichment Summary (Human) (txt)]]
   * Summary of human BiNGO enrichments for individual domains.
 * Cytoscape Enrichment Map
  * [[attachment:SummaryMapSession.cys|Summary Enrichment Map (cys)]]
   * Cytoscape session file for the summary enrichment maps for human PDZ domains created for this project. Cytoscape v2.8.1 and the Enrichment Map Plugin v1.2 were used.
 * [[attachment:PDZSVMStructData.zip|Data Files (zip)]]
  * Domain Structure
   * PDB files for experimentally determined and homology modelled structures for human, mouse, worm and fly
  * Proteomes
   * Ensembl proteome files for Human, Worm and Fly
  * Experiment Interaction files (in peptide file format)
   * Fly files from Chen
   * Human files from Tonikian
   * Mouse files from Stiffler
   * Worm files from Chen
  * Negative Interaction files (in raw format)
   * Human files generated by SVM
   * Mouse files generated by SVM
  * Curated Interaction files (flat files)
   * PDZBase for Human (Worm and Fly included, but not used)
   * iRefIndex interactions for Human
  * Phage codon bias files
  * !ProteomeScan Files
   * Files required to run the !ProteomeScan software

== Source Code ==
 * [[attachment:PDZSVMStruct_LICENSE.txt|GNU LGPL License (txt)]]
 * [[attachment:PDZSVMStruct_1.0_src.zip|Java Code (zip)]]
 * [[attachment:PDZSVMStructDep.zip|Dependency Jars (zip)]]
  * jfreechart 1.0.12 (and dependencies)
  * weka 3.9.1
  * auc calculator (Davis & Goadrich, 2006)
  * !BioJava 1.5
  * iText 2.1.3
  * jmatio
  * Bingo 2.3
  * Cytoscape 2.6.3
  * Cytoscape-task 2.6.3
  * BRAIN 1.0.5 (pdzsvm)
  * libSVM 2.8.9 (pdzsvmstruct)

=== Team ===
 * Shirley Hui
 * Xiang Xing
 * Gary Bader

----
CategoryHomepage