#acl All:read

= SH3 Interactome Conserves General Function Over Specific Form =

This data is supplementary material for:

''SH3 Interactome Conserves General Function Over Specific Form''

Xiaofeng Xin1*†, David Gfeller1*‡, Jackie Cheng2*, Raffi Tonikian1*, Lin Sun3¶, Ailan Guo4, Lianet Lopez1, Alevtina Pavlenco1, Adenrele Akintobi3, Yingnan Zhang5, Jean-Francois Rual6,7, Bridget Currell8, Somasekar Seshagiri8, Tong Hao6,7, Xinping Yang6,7, Yun A. Shen6,7, Kourosh Salehi-Ashtiani6,7, Jingjing Li1, Aaron T. Cheng2, Dryden Bouamalay2, Adrien Lugari9, David E. Hill6,7, Mark L. Grimes10, David G. Drubin2, Barth D. Grant3, Marc Vidal6,7, Charles Boone1§, Sachdev S. Sidhu1§, Gary D. Bader1,11§

== Supplementary Information ==
 * [[attachment:Xin_Supp.pdf|Supplementary Information]]

== Supplementary Tables ==
 * [[attachment:XinTableS1.xls|Table S1]] - Summary of C. elegans SH3 domains analyzed in phage display screen
 * [[attachment:XinTableS2.xls|Table S2]] - Summary of baits screened as prey as well in Y2H
 * [[attachment:XinTableS3.xls|Table S3]] - Summary of Y2H interactions
 * [[attachment:XinTableS4.xls|Table S4]] - Analysis of overlap with other datasets
 * [[attachment:XinTableS5.xls|Table S5]] - Analysis of Gene Ontology similarity
 * [[attachment:XinTableS6.xls|Table S6]] - Summary of interactions found in iRefWeb
 * [[attachment:XinTableS7.xls|Table S7]] - GO enrichment of the worm SH3 network
 * [[attachment:XinTableS8.xls|Table S8]] - GO enrichment of the yeast SH3 network
 * [[attachment:XinTableS9.xls|Table S9]] - Summary of observed rewiring events
 * [[attachment:XinTableS10.xls|Table S10]] - List of manually curated worm endocytosis proteins
 * [[attachment:XinTableS11.xls|Table S11]] - List of yeast endocytosis proteins
 * [[attachment:XinTableS12.xls|Table S12]] - List of predicted worm endocytosis proteins, together with human orthologs
 * [[attachment:XinTableS13.xls|Table S13]] - Summary of sub-categories among endocytosis proteins
 * [[attachment:XinTableS14.xls|Table S14]] - Summary of cloned human orthologs
 * [[attachment:XinTableS15.xls|Table S15]] - Summary of overlap between endocytosis proteins
 * [[attachment:XinTableS16.xls|Table S16]] - Summary of competitive and coincident interactions
 * [[attachment:XinTableS17.xls|Table S17]] - List of phage peptides
 * [[attachment:XinTableS18.xls|Table S18]] - Summary of Position Weight Matrices used to model binding specificity
 * [[attachment:TonikianTableS19.xls|Table S19]] - Summary of yeast-to-worm orthologs for proteins in the worm or yeast SH3 networks

== Data ==
 * Phage display peptide data in BRAIN format (see below)
  * [[attachment:WormSH3.zip]]
 * Protein-protein interaction data in PSI-MI 2.5 format
  * [[attachment:WormSH3PPIs.zip]]
 * Protein-protein interaction data has been submitted to the [[http://www.ebi.ac.uk/intact/|IntAct database]].

== File Format ==
Each zip file contains a set of ''peptide files'' as shown in the example below. A peptide file describes a protein containing a specific domain, and provides known peptide ligands of this domain obtained by an experimental technique.

These files can be directly loaded into the [[http://www.baderlab.org/Software/LOLA|LOLA software]], which can be used to visualize peptide sets as sequence logos and to create Logo Trees, or the [[http://baderlab.org/Software/BRAIN|BRAIN software]], which is a Cytoscape plugin that can be used to search protein sequence databases for proteins that contain sequence patterns matching a set of peptides.

The peptide file consists of a '''Header Section''' that describes the protein and domain sequence, and a '''Peptide Section''' that lists and describes the peptide ligands.

The peptides have been aligned with MUSCLE (Nucl. Acids Res. (2004) 32 (5): 1792-1797) and all multiple alignments have been manually curated.
For some domains (STAM-1 = C34G6.7 and HUM-1 = F29D10.4), the interacting peptides have been split into two groups corresponding to Class I (RxxPxxP) and Class II (PxxPxR) SH3 ligands.

Example:

{{{
Gene Name	B0303.7
Accession	Refseq:B0303.7
Organism	C. elegans
NCBITaxonomyID	9606
Domain Number	1
Domain Type	SH3
Interpro ID	IPR001452
Technique	Phage Display High Valency
Domain sequence	PSYSAPAISTPYGIAKFDYAPTQSDEMGLRIGDTVLISKKVDAEWFYGENQNQRTFGIVPSSYLDIKIPLKEAFTAL
PeptideName	Peptide	CloneFrequency
1	GSEVPPVPPRPV	1
2	SLDPPPVPPRPV	1
3	ASARPPVPPRPL	1
4	SNLDHWQPGEPV	1
5	DMRAPPVPPRPD	1
6	XXXAPLPPPRPD	1
7	FQDVRIAPQVPA	1
8	APVAPEPPPRPH	1
9	SXIVPTVPPVPD	1
10	GQTAPPVPARPA	1
11	EGVAPPPPPRPV	1
12	FTTAPPVPPRPL	1
13	LGTAPPVPARPI	1
14	SEILPTIPPRPD	1
15	SGTPPPVPTRPD	1
16	VTEAPQVPPRPF	1
17	GETAPPVPPRPL	1
18	LPEPPPPPPRPV	1
19	LLDTPAVPTRPC	1
20	YGAAPPVPPRPV	1
21	QPKAPPVPVRPS	1
22	LLDTPAVPTRPC	1
23	GEYAPPVPPRPD	1
24	DITAPPPPPRPY	1
25	VRETPPVPARPA	1
26	SLDPPPVPPRPV	1
27	LDRVPPPPARPS	1
28	XGEAPPAPPRPG	1
29	VDAAPPVPQRPA	1
30	GATAPPVPPRPN	1
31	GDLPPPVPPRPS	1
}}}

==== Header Section ====
Describes the protein, domain, and experiment. Required fields are indicated with a '''*'''.

'''NOTE:''' This section is in a 2-column format. Field names must be separated from their values with a single TAB character. Multiple TABs, or spaces, are not accepted.

 * '''Gene Name:*''' An identifier that represents the gene or protein sequence. Not required to be unique.
 * '''Accession:*''' A space-separated list of database accession identifiers for the protein or corresponding gene.
 * '''Organism:''' Description of the taxon of the protein.
 * '''NCBITaxonomyID:*''' Taxon identifier from NCBI's Taxonomy repository.
 * '''Domain Number:*''' A number that represents the position of the domain sequence within the protein. For proteins containing multiple instances of the domain, this number helps distinguish the position of these instances. Set to "0" if instance information is not known.
 * '''Domain Type:*''' The formal name of the domain, e.g. WW, PDZ, SH3.
 * '''Interpro ID:''' The Interpro database identifier for the domain.
 * '''Technique:''' The experimental method used to identify potential ligands of the protein.
 * '''Domain Sequence:*''' The amino-acid sequence of the domain region.
 * '''Domain Range:''' The amino-acid position range for the domain region within the protein.
 * '''Comment:''' Notes, additional information, personal comments pertaining to this file.

==== Peptide Section ====
Describes the experimentally determined peptide ligands. The peptide sequences must be in '''multiple alignment format'''. The sequences should contain '''no gaps''', and should be padded with the '''X''' symbol on both sides, where required, such that all sequences have identical length.

'''NOTE:''' This section is in a 5-column format. Column headers and values must be separated with a single TAB character. Multiple TABs or spaces are not accepted. Required fields are indicated with a '''*'''.

 * '''!PeptideName:*''' A ''unique'' numerical symbol assigned to each peptide ligand. To omit a peptide, set to a non-numeric value (e.g. "A"). '''Values in this column must be unique.'''
 * '''Peptide:*''' The peptide ligand sequence.
 * '''!CloneFrequency:''' Applies only to phage display data: the observed frequency of the peptide in the cloning step.
 * '''!QuantData:''' A number that relatively or absolutely quantifies the protein-ligand interaction, e.g. the optical density (OD) from a protein chip experiment.
 * '''!ExternalIdentifier:''' A database identifier for the peptide.
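
The two sections described above can be read with a few lines of code. The following is a minimal sketch in Python, not the parser used by BRAIN or LOLA: the names `PeptideFile` and `parse_peptide_file` are hypothetical, and the sketch assumes that the Peptide Section starts at the row whose first cell is `PeptideName`, that blank lines may be ignored, and that rows with a non-numeric peptide name are skipped as omitted peptides.

```python
from dataclasses import dataclass, field


@dataclass
class PeptideFile:
    """Parsed peptide file (hypothetical container, not part of BRAIN or LOLA)."""
    header: dict = field(default_factory=dict)    # field name -> value
    columns: list = field(default_factory=list)   # Peptide Section column names
    peptides: list = field(default_factory=list)  # one dict per kept peptide row


def parse_peptide_file(text):
    """Parse the TAB-separated Header and Peptide sections of one file."""
    result = PeptideFile()
    in_peptides = False
    seen_names = set()
    for line in text.splitlines():
        if not line.strip():
            continue  # assumption: blank lines carry no meaning
        cells = line.split("\t")
        if not in_peptides:
            if cells[0] == "PeptideName":
                # The column-header row marks the start of the Peptide Section.
                result.columns = cells
                in_peptides = True
            elif len(cells) == 2:
                result.header[cells[0]] = cells[1]
            else:
                raise ValueError(f"expected 'name<TAB>value', got {line!r}")
            continue
        if len(cells) < 2:
            raise ValueError(f"peptide row needs at least 2 columns: {line!r}")
        row = dict(zip(result.columns, cells))
        name = row["PeptideName"]
        if not name.isdigit():
            continue  # a non-numeric name omits the peptide
        if name in seen_names:
            raise ValueError(f"duplicate PeptideName: {name}")
        seen_names.add(name)
        result.peptides.append(row)
    # Aligned peptides are X-padded, so all kept sequences share one length.
    lengths = {len(p["Peptide"]) for p in result.peptides}
    if len(lengths) > 1:
        raise ValueError(f"peptides differ in length: {sorted(lengths)}")
    return result


# Small made-up input reusing rows from the example above.
example = "\n".join([
    "Gene Name\tB0303.7",
    "Domain Type\tSH3",
    "PeptideName\tPeptide\tCloneFrequency",
    "1\tGSEVPPVPPRPV\t1",
    "A\tSLDPPPVPPRPV\t1",   # omitted: name is not numeric
    "6\tXXXAPLPPPRPD\t1",
])
parsed = parse_peptide_file(example)
print(parsed.header["Gene Name"], len(parsed.peptides))  # prints "B0303.7 2"
```

All values are kept as strings, so a caller that needs the clone frequency or quantitative data as numbers has to convert them explicitly.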
== Author Information ==
1 The Donnelly Centre and the Department of Molecular Genetics, University of Toronto, Toronto, Ontario M5S 3E1, Canada

2 Department of Molecular and Cell Biology, University of California Berkeley, Berkeley, CA 94720, USA

3 Department of Molecular Biology and Biochemistry, Rutgers University, Piscataway, NJ 08854, USA

4 Cell Signaling Technology, Danvers, MA 01923, USA

5 Department of Early Discovery Biochemistry, Genentech, South San Francisco, CA 94080, USA

6 Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA 02215, USA

7 Department of Genetics, Harvard Medical School, Boston, MA 02115, USA

8 Department of Molecular Biology, Genentech, South San Francisco, CA 94080, USA

9 IMR Laboratory, UPR 3243, Institut de Microbiologie de la Méditerranée, CNRS and Aix-Marseille Université, 31 Chemin Joseph Aiguier, 13402 Marseille Cedex 20, France

10 Division of Biological Sciences, Center for Structural and Functional Neuroscience, The University of Montana, Missoula, MT 59812, USA

11 Department of Computer Science, University of Toronto, Toronto, ON M5S 2E4, Canada

* These authors contributed equally to this work.

† Present address: Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA

‡ Present address: Swiss Institute of Bioinformatics, Molecular Modelling, Génopode, CH-1015 Lausanne, Switzerland

¶ Present address: Department of Physiology and Biophysics, Boston University School of Medicine, Boston, MA 02118, USA

§ To whom correspondence should be addressed. E-mail: sachdev.sidhu@utoronto.ca (S.S.S.); charlie.boone@utoronto.ca (C.B.); gary.bader@utoronto.ca (G.D.B.)