The Bader Lab is involved in a number of collaborative open-source bioinformatics projects designed to make biological pathway data easy to visualize and analyze.
Biological Network Analysis and Visualization Software
Cytoscape
Cytoscape is a bioinformatics software platform for visualizing molecular interaction networks and integrating these interactions with gene expression profiles and other state data. Additional features are available as plugins. Plugins are available for network and molecular profiling analyses, new layouts, additional file format support and connection with databases. Anyone may develop plugins using the open Cytoscape Java software architecture, and community plugin development is encouraged.
Cytoscape was originally developed at the Institute for Systems Biology and is now a collaborative effort involving many different academic and commercial groups.
Fun link: Controlling Cytoscape using the Nintendo Wiimote
NetMatch
NetMatch is a Cytoscape plugin that finds user-defined network motifs in any Cytoscape network. Node and edge attributes of any type and paths of unknown length can be specified in the search.
Released by: Ferro, Giugno, Pulvirenti group, University of Catania; Bader group, University of Toronto; and Shasha group, New York University.
Available from the Bader Lab mirror of the NetMatch homepage
MCODE
MCODE is a Cytoscape plugin that finds clusters (highly interconnected regions) in a network. Clusters mean different things in different types of networks. For instance, clusters in a protein-protein interaction network are often protein complexes and parts of pathways, while clusters in a protein similarity network represent protein families.
Available from the MCODE homepage
BRAIN
The Biologically Relevant Analysis of Interaction Networks (BRAIN) is a set of algorithms for predicting and analyzing protein domain-peptide ligand interactions based on experimentally known binding evidence (e.g. from protein chip or phage display experiments). BRAIN can be accessed as a Cytoscape plugin which reads peptide binding profiles and generates interactions displayed as a Cytoscape network.
Available from the BRAIN homepage
LOLA
LOLA (LOgos Look Amazing) is a tool for generating sequence logos using Position Weight Matrix based protein profiles. LOLA allows you to generate custom sequence logos by setting parameters such as logo height, trim percentage, and residue colour scheme. You can also generate "logo trees" based on clustered protein profiles to analyze binding motif classes. Sequence logos and logo trees can be saved in various formats including PDF, PNG, and JPEG.
Available from the LOLA homepage
DoMo-Pred Yeast
DoMo-Pred Yeast is a novel method for predicting physiologically relevant SH3 domain-peptide mediated protein-protein interactions in S. cerevisiae using phage display data. This method uses position weight matrix models of protein linear motif preference for individual SH3 domains to scan the proteome for potential hits and then filters these hits using a range of evidence sources related to sequence-based and cellular constraints on protein interactions. By combining different peptide and protein features using multiple Bayesian models, we are able to predict high-confidence interactions with an overall accuracy of 0.97.
Available from the DoMo-Pred Yeast homepage
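The scanning step described above can be illustrated with a minimal sketch. This is not the DoMo-Pred implementation: the matrix values, the four-letter toy alphabet, the uniform background, the example sequence and the function names `score_window` and `scan_sequence` are all hypothetical, and the Bayesian filtering of hits is left out.

```python
import math

# Hypothetical position weight matrix: one dict of residue probabilities per
# motif position. Real SH3 models cover all 20 amino acids; these toy values
# are made up for illustration.
PWM = [
    {"P": 0.7, "A": 0.1, "K": 0.1, "L": 0.1},
    {"P": 0.1, "A": 0.1, "K": 0.1, "L": 0.7},
    {"P": 0.7, "A": 0.1, "K": 0.1, "L": 0.1},
]
BACKGROUND = 0.25  # assumed uniform background over the 4-letter toy alphabet


def score_window(window):
    """Log-odds score of one window against the PWM."""
    return sum(math.log2(col[res] / BACKGROUND) for col, res in zip(PWM, window))


def scan_sequence(seq, threshold=0.0):
    """Return (start, window, score) for every window scoring above threshold."""
    width = len(PWM)
    hits = []
    for start in range(len(seq) - width + 1):
        window = seq[start:start + width]
        score = score_window(window)
        if score > threshold:
            hits.append((start, window, score))
    return hits


if __name__ == "__main__":
    for start, window, score in scan_sequence("AKPLPAKA"):
        print(start, window, round(score, 2))
```

In a pipeline of the kind described, hits like these would only be candidates; the additional evidence sources would then decide which ones to keep.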
DoMo-Pred Human
DoMo-Pred Human is a novel method for predicting SH3 domain-peptide mediated protein-protein interactions in humans using phage display data. This method builds upon our previously published work, which combined multiple binding-site and full-length protein features using naive Bayesian models to predict peptide recognition module (PRM) mediated interactions. This work also introduces a novel algorithm for predicting protein interactions using network topology. We have also extended the semi-supervised training regime of the multinomial naive Bayesian classifier, originally developed for text classification, to Gaussian naive Bayesian models for PPI prediction.
Available from the DoMo-Pred Human homepage
Enrichment Map
Enrichment Map is a Cytoscape plugin for functional enrichment visualization. Enrichment results have to be generated outside Enrichment Map, using any of the available methods. Gene-sets, such as pathways and Gene Ontology terms, are organized into a network (i.e. the "enrichment map"). In this way, mutually overlapping gene-sets cluster together, making interpretation easier. Enrichment Map also enables the comparison of two different enrichment results in the same map.
Available from Enrichment Map homepage
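The idea of organizing gene-sets into a network by their mutual overlap can be sketched as follows. This is an illustration only, not the plugin's code: the use of the Jaccard coefficient, the cutoff value, the toy gene-sets and the function names are assumptions made for the example.

```python
from itertools import combinations


def jaccard(a, b):
    """Size of the intersection divided by size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def build_overlap_edges(gene_sets, cutoff=0.25):
    """Return (set_a, set_b, similarity) edges for pairs at or above cutoff."""
    edges = []
    for (name_a, genes_a), (name_b, genes_b) in combinations(sorted(gene_sets.items()), 2):
        sim = jaccard(genes_a, genes_b)
        if sim >= cutoff:
            edges.append((name_a, name_b, sim))
    return edges


# Toy gene-sets; names and members are illustrative only.
GENE_SETS = {
    "cell cycle": {"CDK1", "CCNB1", "CDC20", "PLK1"},
    "mitosis": {"CDK1", "CCNB1", "PLK1", "AURKA"},
    "glycolysis": {"HK1", "PFKM", "PKM"},
}

if __name__ == "__main__":
    for name_a, name_b, sim in build_overlap_edges(GENE_SETS):
        print(name_a, "--", name_b, round(sim, 2))
```

With these toy sets, only "cell cycle" and "mitosis" share enough genes to be linked, which is the kind of clustering of related gene-sets that the map relies on.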
WordCloud
WordCloud is a Cytoscape app that generates a visual summary of a network. It displays string attributes associated with nodes in the network as a tag cloud, where more frequent words are displayed using a larger font size. Word co-occurrence in a phrase can be visualized by arranging words in clusters or as a network.
Available from WordCloud homepage
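A minimal sketch of the frequency-to-font-size idea is shown below. It does not reproduce the app's actual scaling: the linear mapping, the size range, the tokenization rule and the function names are assumptions chosen for illustration.

```python
from collections import Counter
import re


def word_counts(labels):
    """Count lowercase words across all node label strings."""
    words = []
    for label in labels:
        words.extend(re.findall(r"[a-z0-9]+", label.lower()))
    return Counter(words)


def font_sizes(counts, min_size=10, max_size=40):
    """Map each word to a font size that grows linearly with its count."""
    if not counts:
        return {}
    low, high = min(counts.values()), max(counts.values())
    span = high - low
    sizes = {}
    for word, n in counts.items():
        # When every word has the same count, give all of them the largest size.
        fraction = (n - low) / span if span else 1.0
        sizes[word] = min_size + fraction * (max_size - min_size)
    return sizes


if __name__ == "__main__":
    labels = ["DNA repair", "DNA replication", "DNA damage repair"]
    for word, size in sorted(font_sizes(word_counts(labels)).items()):
        print(word, size)
```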
AutoAnnotate
AutoAnnotate is a Cytoscape app that finds node clusters and visually annotates them with labels and groups. These visualizations provide a concise visual summary which is helpful for network analysis and interpretation.
Available from AutoAnnotate homepage
SIREN
The Signing of Regulatory Networks (SIREN) algorithm can infer the regulatory type (positive or negative regulation) of interactions in a known gene regulatory network given corresponding genome-wide gene expression data.
http://apps.cytoscape.org/apps/siren
SocialNetworkApp
The Social Network Cytoscape app creates a visual summary of how individuals are connected. Biological networks can be visualized and analyzed using Cytoscape. Often researchers want to go beyond the network of proteins or genes and also look at the inter-connectedness between colleagues and institutions. Who tends to publish together? What institutions are most collaborative? Are there inter-disciplinary connections in my institution? The app addresses these questions by building co-publication networks where the nodes represent authors, edges represent co-authorship and edge thickness represents how frequently co-authors collaborate.
Available from SocialNetworkApp homepage
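The co-publication network described above can be sketched in a few lines. The author names, the input format (one author list per paper) and the function name `copublication_edges` are hypothetical; the app itself builds its networks from publication data rather than from hand-written lists.

```python
from collections import Counter
from itertools import combinations


def copublication_edges(papers):
    """Count how often each pair of authors appears on the same paper.

    Nodes are authors, each key is a co-authorship edge, and the count is
    the weight that would drive edge thickness.
    """
    weights = Counter()
    for authors in papers:
        # Sort and de-duplicate so each pair has a single canonical key.
        for a, b in combinations(sorted(set(authors)), 2):
            weights[(a, b)] += 1
    return weights


if __name__ == "__main__":
    papers = [["Ada", "Ben", "Cy"], ["Ada", "Ben"], ["Ben", "Dee"]]
    for (a, b), n in sorted(copublication_edges(papers).items()):
        print(a, "--", b, "papers together:", n)
```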
Biological Pathway and Network Database Software
GeneMANIA
GeneMANIA helps you predict the function of your favourite genes and gene sets.
GeneMANIA finds other genes that are related to a set of input genes, using a very large set of functional association data. Association data include protein and genetic interactions, pathways, co-expression, co-localization and protein domain similarity. You can use GeneMANIA to find new members of a pathway or complex, find additional genes you may have missed in your screen or find new genes with a specific function, such as protein kinases. Your question is defined by the set of genes you input.
This work is done in collaboration with Quaid Morris' lab at the University of Toronto.
cPath: Pathway Commons data integration and web services
cPath is an open source database and web service application for collecting, storing, browsing and querying biological pathway data. cPath makes it easy to aggregate custom pathway data sets available in standard exchange formats from multiple databases, present pathway data to biologists, and export pathway data via a web service to third-party software for visualization and analysis. cPath is software only, and does not include new pathway information. Key features include: a built-in identity and reference service for identifying and annotating interactors; built-in support for PSI and BioPAX standard pathway exchange formats; and a web service interface for searching and retrieving biological network data sets. The cPath and related software is freely available under the LGPL or MIT open source license for academic and commercial use.
https://github.com/PathwayCommons/
http://www.pathwaycommons.org/ (Pathways and molecular interactions)
PC2 (cPath2): http://www.pathwaycommons.org/pc2/
This work is done in collaboration with the Sander group of the Computational Biology Center at Memorial Sloan-Kettering Cancer Center in New York City, then at the Dana-Farber Cancer Institute and Harvard Medical School, and the Demir Lab at Oregon Health & Science University.
netDx
netDx is a patient classifier algorithm that can integrate several types of patient data into a single model. It does this by converting each type of data into a view of patient similarity; i.e. by converting the data into a graph in which more similar patients are tightly linked, while less similar patients are not so tightly linked.
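Converting one data type into a view of patient similarity can be sketched as below. This is a simplified illustration, not the netDx algorithm: the choice of Pearson correlation as the similarity measure, the cutoff, the patient identifiers, the toy profiles and the function names are all assumptions, and the classification step is not shown.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equal-length numeric profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant profile carries no correlation signal
    return cov / math.sqrt(var_x * var_y)


def similarity_network(profiles, cutoff=0.5):
    """Link pairs of patients whose profiles correlate at or above cutoff."""
    edges = {}
    for (p, x), (q, y) in combinations(sorted(profiles.items()), 2):
        sim = pearson(x, y)
        if sim >= cutoff:
            edges[(p, q)] = sim
    return edges


if __name__ == "__main__":
    # Hypothetical measurements for three patients on one data type.
    profiles = {"P1": [1, 2, 3, 4], "P2": [2, 4, 6, 8], "P3": [4, 3, 2, 1]}
    print(similarity_network(profiles))
```

Each data type would yield its own network of this kind, and more similar patients end up more tightly linked than dissimilar ones.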
Pathway and Network Data
Pathway Commons
Pathway Commons is a convenient point of access to biological pathway information collected from public pathway databases, which you can browse or search. Pathways include biochemical reactions, complex assembly, transport and catalysis events, and physical interactions involving proteins, DNA, RNA, small molecules and complexes.
http://www.pathwaycommons.org/
This work is done in collaboration with the Sander group of the Computational Biology Center at Memorial Sloan-Kettering Cancer Center in New York City.
The Cancer Cell Map
The Cancer Cell Map contains selected cancer-related signaling pathways. Biologists can browse and search the pathways and view gene expression data on any pathway. Computational biologists can download all pathways in BioPAX format for global analysis. Software developers can build software on top of the Cancer Cell Map using the web service API. The cPath software can be downloaded and installed to create a local mirror.
This work is done in collaboration with the Sander group of the Computational Biology Center at Memorial Sloan-Kettering Cancer Center in New York City.
Pathway Data Exchange Standards
BioPAX
BioPAX (Biological Pathway Exchange) is a collaborative effort to create a data exchange format for biological pathway data. BioPAX covers metabolic pathways, molecular interactions and protein post-translational modifications. Future versions will expand support for signaling pathways, gene regulatory networks and genetic interactions.
PSI-MI
PSI-MI (The Proteomics Standards Initiative Molecular Interactions) format allows exchange of molecular interaction data, focusing on protein-protein interactions.
http://psidev.sourceforge.net/
Both of these projects are large international collaborative efforts involving many different academic and commercial groups.
Community Web Sites
Pathguide
Pathguide, the Pathway Resource List, contains information about hundreds of online biological pathway resources. Databases that are free and those supporting BioPAX, CellML, PSI-MI or SBML standards are highlighted.
DV-IMPACT
Database of disease variants and their impacts on domain-peptide protein interaction networks.
HuRI - Human Reference Interactome
Pairwise combinations of human protein-coding genes are tested systematically using high-throughput yeast two-hybrid screens to detect protein-protein interactions. The quality of these interactions is further validated in multiple orthogonal assays. Currently 64006 PPIs involving 9094 proteins have been identified using this framework.
In addition to systematically identifying PPIs experimentally, this web portal also includes PPIs of comparable high quality extracted from literature. This subset of literature-curated PPIs currently comprises 13441 PPIs involving 6047 proteins.
http://www.interactome-atlas.org/
YeRI - Yeast Reference Interactome
Pairwise combinations of yeast protein-coding genes were tested systematically using high throughput yeast two-hybrid (Y2H) screens to detect protein-protein interactions. All four systematic maps were generated using different assay versions of the canonical yeast two-hybrid system, described in table 1. For maps generated at CCSB (Yu et al and YI-II), the precision of the dataset was tested by validating some of the interactions in multiple orthogonal assays (Venkatesan et al). In total, 4,556 high-quality protein-protein interactions (PPIs) involving 2,552 proteins have been identified by the four maps.
In addition to systematically identifying PPIs experimentally, the yeast interactome map also contains PPIs of comparable high quality from literature curation databases. This subset of literature-curated PPIs currently comprises 5,589 PPIs.
http://yeast.interactome-atlas.org
MIMP
Mutation IMpact on Phosphorylation
- Predicting loss and gain of phosphorylation - MIMP characterizes genetic variants such as cancer mutations that specifically alter kinase-binding sites in proteins. As these residues are changed in disease mutations, alterations in kinase binding specificity potentially lead to rewiring of kinase-substrate interaction networks.
- Comprehensive kinase-substrate data - MIMP makes use of phosphorylation data for a wide range of kinases to comprehensively predict kinase-rewiring mutations. Kinase specificity models are refined to remove sequences not matching the general motif and improve prediction power.
POW
POW is a website that allows users to predict PDZ domain-peptide interactions for human, mouse, worm and fly PDZ domains. Predictions are made using a support vector machine (SVM) that was trained using experimentally determined PDZ interaction data from protein microarray and phage display experiments for mouse and human [1,2]. Two types of predictors are available. The first is sequence-based (trained using domain and peptide sequence features), while the other is structure-based (trained using domain structure and peptide sequence features).
PRM-DB
The database of Peptide Recognition Modules.
RShiny apps
Mouse Retinal Stem Cells - Data portal for the paper A microfluidic platform enables comprehensive gene expression profiling of mouse retinal stem cells by Coles, Labib, et al. Lab on a Chip, 2021. This is an RShiny app based on scClustViz for exploring the single-cell transcriptomes of cells isolated from the mouse retina and enriched for stem cells using the published microfluidics platform.
Aging Mouse Brain - Data portal for the paper Single-cell transcriptomic profiling of the aging mouse brain by Ximerakis, Lipnick, et al. Nat Neurosci, 2019. This data portal includes RShiny apps based on both scClustViz for exploring scRNAseq data and CCInx for predicting interactions between cell types.
hPSC-derived LSECs - Data portal for the paper Generation of functional liver sinusoidal endothelial cells from human pluripotent stem-cell-derived venous angioblasts by Gage et al. Cell Stem Cell, 2020. This data portal includes RShiny apps based on scClustViz for exploring scRNAseq data of hPSC-derived LSECs generated from the maturation of venous angioblasts in NSG mice, as well as a comparison between these cells and the cells of the human liver atlas presented below.
Human Liver Atlas - Data portal for the paper Single cell RNA sequencing of human liver reveals distinct intrahepatic macrophage populations by MacParland et al. Nat Commun, 2018. This is an RShiny app based on scClustViz for exploring the scRNAseq data presented in the paper.
Mouse Cerebral Cortex - Data portal for the paper Developmental emergence of adult neural stem cells as revealed by single cell transcriptional profiling by Yuzwa, Borrett, et al. Cell Rep, 2017. This data portal contains time-series data presented as a set of RShiny apps based on scClustViz. Timepoints: E11.5 — E13.5 — E15.5 — E17.5