National Centers of Systems Biology

Software & Databases

Center for Complex Biological Systems (UC Irvine)

GPU Codes for 3D Model of Epidermal Development

We provide CUDA code for NVIDIA GPUs that implements a 3D model of epidermal development. The model includes a GPU implementation of the subcellular element method for 3D spatial cells, an intracellular gene network for each cell represented by a set of ODEs, cell-cell neighbor communication through Notch signaling as part of each cell’s internal gene network, and the cell behaviors of growth and division.

A Gene Network Inference Tool

We provide our Objective-C (Cocoa for Mac, GNUstep for Linux/Windows) optimization framework for learning linear gene regulatory networks from various types of gene expression data. The optimization enforces network sparsity through L1 regularization and can incorporate existing network information. The framework can handle wild-type, perturbation, gene knockout, and heterozygous knockdown gene expression data.

Fast Numerical Algorithms for Stiff Reaction-Diffusion Equations

We provide Matlab and C codes based on a novel and efficient algorithm for Reaction-Diffusion equations that model spatial dynamics of complex biological systems. The numerical method used in the code is designed for effective treatment of stiff reactions in spatial systems.

Accelerated Stochastic Simulation Algorithm for Reaction Networks

This is an exact method for stochastic simulation of chemical reaction networks (Exact R-Leap) to accelerate the Stochastic Simulation Algorithm (SSA).
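
For orientation, the baseline that such methods accelerate can be sketched as follows. This is a minimal sketch of the standard direct-method SSA, not of Exact R-Leap itself; the function name, its arguments, and the decay example are illustrative assumptions and are not taken from the distributed code.

```python
import random


def gillespie_direct(x0, stoich, propensity, t_end, seed=0):
    """Direct-method SSA (illustrative baseline, not Exact R-Leap).

    x0         -- initial molecule counts, one entry per species
    stoich     -- one state-change vector per reaction
    propensity -- function mapping the state to a list of reaction propensities
    t_end      -- simulation stops once the next event would pass this time
    """
    rng = random.Random(seed)
    t, x = 0.0, list(x0)
    trajectory = [(t, tuple(x))]
    while True:
        a = propensity(x)
        a_total = sum(a)
        if a_total <= 0.0:
            break  # no reaction can fire any more
        # Waiting time to the next event is exponential with rate a_total.
        t += rng.expovariate(a_total)
        if t > t_end:
            break
        # Pick which reaction fires, with probability proportional to its propensity.
        j = rng.choices(range(len(a)), weights=a)[0]
        x = [xi + d for xi, d in zip(x, stoich[j])]
        trajectory.append((t, tuple(x)))
    return trajectory


# Hypothetical example: first-order decay A -> 0 with rate constant 0.5.
decay = gillespie_direct([50], [[-1]], lambda x: [0.5 * x[0]], t_end=1000.0, seed=1)
```

Because the direct method simulates one reaction event at a time, its cost grows with the number of events, which is the motivation for leaping over many events per step.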

Dynamical Grammar Simulator

Plenum is simulation software written in Mathematica for dynamical grammar models. Dynamical grammars are an elegant language for representing complex processes that include stochastic events and continuous dynamics.

Scientific Inference Systems Tools

This is a collection of software and packages for modeling and image analysis on systems inferences.

The Sigmoid Project (http://www.sigmoid.org)

The SIGMOID project is intended to produce a database of cellular signaling pathways and models thereof, to marshal the major forms of data and knowledge required as input to cellular modeling software and also to organize the outputs.

Center for Genome Dynamics (The Jackson Laboratory)

CGDSNP - SNP Database

CGDSNP is a high-quality single nucleotide polymorphism (SNP) database produced by the Center for Genome Dynamics, with more than 8 million SNPs from 74 strains of laboratory mice, drawn from several sources.

GBrowse

Generic Genome Browser (GBrowse) has been employed to connect datasets produced by projects within the Center for Genome Dynamics to locations within the mouse genome.

Genome Interval Overlap Calculator

Genome Interval Overlap Calculator is a general purpose application for calculating the interval-by-interval overlap between two input files. One use of this Center for Genome Dynamics application is to use the output from the MouseIBS tool to “filter out” regions of the genome that are identical between two strains. This could be applied to QTLs, gene lists etc.
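
The interval-by-interval overlap described above can be sketched as follows. This is a minimal sketch, assuming each input file has already been parsed into (chromosome, start, end) tuples with inclusive base-pair coordinates; the function name, the tuple layout, and the example values are hypothetical and do not describe the application's actual file formats.

```python
def interval_overlaps(a, b):
    """Return (a_interval, b_interval, overlap_bp) for every overlapping pair.

    Intervals are (chrom, start, end) tuples with inclusive coordinates
    (an assumption made for this sketch).
    """
    # Group the second list by chromosome so only same-chromosome pairs are compared.
    by_chrom = {}
    for interval in b:
        by_chrom.setdefault(interval[0], []).append(interval)

    results = []
    for chrom, start, end in a:
        for other in by_chrom.get(chrom, []):
            lo = max(start, other[1])
            hi = min(end, other[2])
            if lo <= hi:  # the two intervals share at least one base
                results.append(((chrom, start, end), other, hi - lo + 1))
    return results


# Hypothetical example: QTL intervals checked against identical-by-state regions.
qtls = [("chr1", 100, 200), ("chr2", 50, 80)]
ibs_regions = [("chr1", 150, 300), ("chr1", 10, 99), ("chr2", 60, 70)]
overlaps = interval_overlaps(qtls, ibs_regions)
```

The pairwise comparison within each chromosome is quadratic; for large inputs, sorting both lists and sweeping them together would be the usual refinement.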

Mouse Map Converter

Mouse Map Converter is a simple web interface developed by the Center for Genome Dynamics for converting mouse genome coordinates between MIT Markers, Base Pair Positions and Centimorgan Positions. The conversion process is based on the newly calculated mouse map described in “A New Standard Genetic Map for the Laboratory Mouse” by Cox et al.

Mouse Strain Comparison

Mouse Strain Comparison allows you to build strain comparison expressions which are evaluated against the selected strain genomes. This Center for Genome Dynamics tool can be used to determine where strains are Identical By State or where strains with a high phenotype differ from strains with a low phenotype etc.

UNC Computational Genetics Tools

Tools developed by the Center for Genome Dynamics systems genetics research team at UNC-Chapel Hill for the analysis and visualization of systems biology data.

Center for Modular Biology (Harvard University)

Cancer Module Map data mining website

Web access to our comprehensive modular analysis of transcriptional responses in human cancer, including the full cancer expression compendium, detailed clinical annotations, and all the significant modules. Various searching, browsing and visualization capabilities are provided. Segal, E., Friedman, N., Koller, D. & Regev, A. A module map showing conditional activity of expression modules in cancer. Nature Genetics 36, 1090-1098 (2004). [ PDF ]

Cichlid ESTs, gene index, and genomics database

1) Astatotilapia burtoni ESTs: GenBank accession numbers CN468542 - CN472211 and DY625779 - DY632420

2) TIGR Gene Index for Astatotilapia burtoni (in collaboration with John Quackenbush’s group): A gene index for A. burtoni (release 1.0). Release 2.0 is scheduled for early June, 2006.

3) Cichlid EST and microarray database (beta version): Cichlid genomics site. A comprehensive database resource that connects all sequenced clones to data, including pertinent information on physical location, array coordinates, BLAST results, and functional annotation.

Fungal Orthogroups Repository

We provide the orthogroup assignments for all predicted protein-coding genes across 13 Ascomycete fungal genomes. Wapinski, I., Pfeffer, A., Friedman, N. & Regev, A. Natural history and evolutionary principles of gene duplication in fungi. Nature 449, 54-61 (2007).

GeneXPress

A comprehensive tool for visualization, integration and analysis of genomic data, from a modular perspective. We have recently introduced a new tool, Genomica (http://genomica.weizmann.ac.il, now under separate funding), a redesign of GeneXPress to accommodate additional modular analysis. To facilitate Genomica’s use by any genomics researcher, we developed an extensive online tutorial and basic analysis packages for several organisms. Further resources for analysis were developed as an online repository (GeneSets, below). Segal, E., Kaushal, A., Yelensky, R., Pham, T., Regev, A., Koller, D. & Friedman, N. GeneXPress: a visualization and statistical analysis tool for gene expression and sequence data. Proc. 12th Intl Conf. on Intelligent Systems for Molecular Biology (2004). [ PDF ]

Mutation mapping method

Software for using microarray data on DNA hybridization to detect single feature polymorphisms (SFP) and use them to map genetic traits in crosses between different strain backgrounds.

SERV: Sequence-based Estimation of Repeat Variability

An algorithm to predict the variability of tandem repeats developed by the Center for Modular Biology.

Center for Quantitative Biology (Princeton University)

bioPIXIE

bioPIXIE is a novel system for biological data integration and visualization. It allows you to discover interaction networks and pathways in which your gene(s) of interest participate.

ChARM: Chromosomal Aberration Region Miner

Chromosomal aberration detection tool. [ download ]

FIRE

FIRE is a motif discovery and characterization program based on mutual information. [ download ]

GOLEM

GOLEM is a useful tool that allows the viewer to navigate and explore a local portion of the Gene Ontology (GO) hierarchy. Users can also load annotations for various organisms into the ontology in order to search for particular genes, to limit the display to GO terms relevant to a particular organism, or to quickly search for GO terms enriched in a set of query genes.

GRIFn

GRIFn is a novel system for interactive evaluation of functional genomic data and methods. It allows you to upload your own data, view evaluations in multiple contexts, and compare it with other published high throughput data.

Generic Gene Ontology (GO) Term Finder

This generic (“multi-organism”) GO Term Finder web tool finds significant GO terms shared among a list of genes from your organism of choice, helping you discover what these genes may have in common. The implementation of this Generic GO Term Finder depends on the GO-TermFinder software written by Gavin Sherlock and Shuai Weng at Stanford University, made publicly available through the GMOD project.
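
Term-enrichment tools of this kind commonly score each GO term with a hypergeometric test. The following is a minimal sketch of that calculation only; the function name and the example numbers are hypothetical, and whether this matches the web tool's exact statistics or its multiple-testing correction is an assumption.

```python
from math import comb


def hypergeom_tail(k, n, K, N):
    """P(X >= k) when drawing n genes without replacement from N genes,
    of which K are annotated to the GO term of interest.

    k -- genes in the query list annotated to the term
    n -- size of the query list
    K -- genes in the background annotated to the term
    N -- size of the background gene set
    """
    upper = min(n, K)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / comb(N, n)


# Hypothetical example: 2 of 3 query genes carry a term held by 4 of 10 background genes.
p_value = hypergeom_tail(k=2, n=3, K=4, N=10)
```

A small tail probability indicates that the term is shared by more genes in the list than would be expected by chance; in practice the p-values for all tested terms would also need correction for multiple testing.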

Generic Gene Ontology (GO) Term Mapper

This generic (“multi-organism”) GO Term Mapper web tool maps the granular GO annotations for genes in a list to a set of GO slim terms, allowing you to bin your genes into broad categories. The implementation of this Generic GO Term Mapper uses the map2slim.pl script written by Chris Mungall at the Berkeley Drosophila Genome Project, and some of the modules included in the GO-TermFinder distribution written by Gavin Sherlock and Shuai Weng at Stanford University, made publicly available through the GMOD project.
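
The idea of binning granular annotations into slim terms can be sketched as a walk up the ontology graph. This is a simplified sketch, not a port of map2slim.pl: the function names, the term identifiers, and the rule of stopping at the first slim term on each path are assumptions made for illustration.

```python
def map_to_slim(term, parents, slim_terms):
    """Return the nearest slim terms reachable by walking up from `term`.

    parents    -- dict mapping each term to a list of its parent terms
    slim_terms -- set of terms that make up the slim
    """
    found, seen, stack = set(), set(), [term]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        if current in slim_terms:
            found.add(current)  # stop climbing once a slim term is reached
            continue
        stack.extend(parents.get(current, ()))
    return found


def bin_genes(annotations, parents, slim_terms):
    """Group genes by slim term; `annotations` maps gene -> list of granular terms."""
    bins = {}
    for gene, terms in annotations.items():
        for term in terms:
            for slim in map_to_slim(term, parents, slim_terms):
                bins.setdefault(slim, set()).add(gene)
    return bins


# Hypothetical ontology fragment and annotations.
parents = {"leaf": ["mid1", "mid2"], "mid1": ["slimA"], "mid2": ["slimB"],
           "slimA": ["root"], "slimB": ["root"]}
slim = {"slimA", "slimB", "root"}
bins = bin_genes({"gene1": ["leaf"], "gene2": ["mid1"]}, parents, slim)
```

Because GO is a directed acyclic graph rather than a tree, one granular term can land in several slim bins, which is why the mapping returns a set.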

Inquiry Bioinformatics Suite

The Inquiry Bioinformatics Suite provides commonly used bioinformatics tools such as the EMBOSS software suite, BLAST, and HMMer.

MEFIT : a Microarray Experiment Functional Integration Technology

MEFIT is a Microarray Experiment Functional Integration Technology. Given any amount of microarray data, it predicts the probability of pairwise functional relationship for any gene pair within individual biological functions.

Nearest Neighbor Networks (NNN)

Nearest Neighbor Networks (NNN) is a graph-based algorithm used to cluster genes with similar microarray expression profiles. The NNN clustering method is an alternative to classical techniques such as hierarchical and K-means clustering. NNN generates clusters of functionally related genes with high precision, and the clusters generally represent a broader selection of biological processes than those produced by other methods; NNN performs best on data sets with many conditions and on data sets that are modular (i.e. contain several grouped subsets of conditions). The NNN algorithm is described in Huttenhower et al. 2007 (http://www.biomedcentral.com/1471-2105/8/250) and was developed in the Troyanskaya and Coller labs; the web tool was implemented by Juan Alvarez in the Bioinformatics group at Princeton. [ download ]

SiteSifter

SiteSifter finds highly conserved DNA motifs embedded within coding regions. Each instance of a motif is scored based on the chance that its constituent codons are conserved over and above that required for amino acid conservation.

growthrate.princeton.edu

Provides an analysis of yeast genes and their growth rate correlations.

P-POD : Princeton Protein Orthology Database

P-POD displays families of predicted orthologs from P. falciparum, H. sapiens, D. melanogaster, M. musculus, A. thaliana, C. elegans, D. rerio, and S. cerevisiae with an emphasis on providing information about disease-related genes and experimental confirmation of orthology from the literature.

Princeton University Microarray Database (PUMAdb)

The Princeton University MicroArray database (PUMAdb) stores raw and normalized data from microarray experiments, as well as their corresponding image files. In addition, PUMAdb provides interfaces for data retrieval, analysis and visualization. Princeton researchers and their collaborators should register for a database account. [ download ] [ publications ]

Sleipnir Library for Computational Functional Genomics

Sleipnir is a C++ library enabling efficient analysis, integration, mining, and machine learning over genomic data. This includes a particular focus on microarrays, since they make up the bulk of available data for many organisms, but Sleipnir can also integrate a wide variety of other data types, from pairwise physical interactions to sequence similarity or shared transcription factor binding sites.

SPELL : Serial Pattern of Expression Levels Locator

SPELL (Serial Pattern of Expression Levels Locator) is a query-driven search engine for large gene expression microarray compendia. Given a small set of query genes, SPELL identifies which datasets are most informative for these genes, then within those datasets additional genes are identified with expression profiles most similar to the query set.

Virus Infection Project

The Virus Infection Project (VIP) is a web tool that provides a way to look at information about transcripts during CMV infections.

Yeast Functional Genomics Database (YFGdb)

The goal of YFGdb is to collect and freely disseminate all available yeast functional genomics data, along with requisite analysis tools, to the yeast community and the biomedical research community at large. YFGdb contains data sets from microarray as well as many other genomics/proteomics studies including large-scale interaction and phenotype experiments. YFGdb has been implemented using the Generic Model Organism Database Construction Set as part of the GMOD project.

Center for Systems Biology (Seattle, WA)

Gaggle

Database and software integration framework - The Gaggle is a framework for exchanging data between independently developed software tools and databases to enable interactive exploration of systems biology data. PMID: 18021453

Nitin Baliga

Peptide Atlas

Information resource - Atlas of human, mouse, and yeast peptides from a large set of tandem mass spectrometry data. Results are processed through the Trans-Proteomic Pipeline. PMCID: PMC2373374

Eric Deutsch

Trans-Proteomic Pipeline

Data analysis software - A suite of software tools for the analysis of tandem mass spectrometry data sets. The tools encompass most of the steps in a proteomic data analysis workflow in a single, integrated software system. Specifically, the TPP supports all steps from spectrometer output file conversion to protein-level statistical validation, including quantification by stable isotope ratios. PMID: 18212004

Eric Deutsch

NRC-1: Peptide Atlas

Data analysis software - A peptide atlas covering 62.7% of the predicted proteome of the extremely halophilic archaeon Halobacterium salinarum NRC-1, compiled from approximately 636,000 tandem mass spectra from 497 mass spectrometry runs in 88 experiments. PMCID: PMC2643335

Nitin Baliga

Cytoscape

Data analysis - An open source bioinformatics software platform for visualizing molecular interaction networks and integrating these interactions with gene expression profiles and other state data. PMCID: PMC2478690

Ilya Shmulevich

Addama

Data management - An adaptable data management system designed to support the mining and analysis of biological experiment data that is commonly used in systems biology (e.g. ChIP-chip, gene expression, proteomics, imaging, and flow cytometry). PMID: 19265554

Ilya Shmulevich

Simcluster

Data analysis - Designed to perform clustering analysis for data on the simplex space. Simcluster is available as a stand-alone command-line C package and as a user-friendly online tool. PMCID: PMC2147035

Ilya Shmulevich

ProbCD

Data analysis - An open source analysis framework and software tools to address the issue of uncertainty in categorical data analysis. In enrichment analysis, ProbCD can accommodate: (i) the stochastic nature of high-throughput experimental techniques, and (ii) probabilistic gene annotation. PMCID: PMC2169266

Ilya Shmulevich

ProbTF

Data analysis - A flexible, probabilistic framework to predict transcription factor binding from multiple data sources. PMCID: PMC2268002

Ilya Shmulevich

Duke Center for Systems Biology

Imaging software

The CellTracer software for automated dynamic image analysis in single cell fluorescent microscopy studies.

Gene circuit simulation software

The Dynetica software for graphical construction of kinetic models and automatic generation of the implied differential equations, time course simulations using deterministic or stochastic algorithms, and sensitivity analyses.

Statistical analysis software relevant for systems biological applications

The BFRM software for Bayesian Factor Regression Models: modelling and analysis of sparse latent factor models used in applications in pathway analysis and predictive modelling with large-scale gene expression data sets, with links to studies in gene expression analysis.

The GGM software for Bayesian analysis and search of graphical/network structures in the framework of Gaussian graphical models, with links to applications in gene expression studies.

The PRIORITY software package for de novo motif discovery based on Bayesian classifiers for identification of transcription factor binding sites.

General software and visualisation tools that may be of interest for systems biological application

The GraphExplore software for dynamically displaying, exploring and modifying large graphs and networks, with a range of visualisation and interactive analysis facilities.

General statistical model search and analysis software applications

Additional Bayesian analysis software from Alex Hartemink’s group, including software for sparse multinomial logistic regression model search, and for Bayesian network modelling.

Additional Bayesian analysis software from Mike West’s group, including shotgun stochastic search (SSS) for linear and nonlinear regression models with many variables, time series analysis with flexible dynamic models, classification and prediction tree modelling, and others.

Software tools and links related to genomics and systems biology applications in other Duke groups

Integrative Cancer Biology Program software site.

Center for Applied Genomics & Technology software site.

Systems Biology Center New York (SBCNY)

ChEA

The ChEA (ChIP-X Enrichment Analysis) database contains manually extracted datasets of transcription-factor/target-gene interactions from over 100 experiments, such as ChIP-chip, ChIP-seq, and ChIP-PET, applied to mammalian cells. We use the database as the prior-knowledge gene-list library for gene-list enrichment analysis of mRNA expression data. The system is delivered as web-based interactive software: users input lists of mammalian genes, and the program computes over-representation of transcription factor targets from the ChEA database. PMID: 20709693 (Contact: Avi Ma’ayan)

GATE

GATE (Grid Analysis of Time-series Expression) is a computational software platform for integrated visualization and analysis of expression time-series. Given a high-dimensional time-series dataset, GATE employs a clustering algorithm which creates movies of expression dynamics by assigning individual genes/proteins to hexagons on a hexagonal array and dynamically coloring each hexagon according to the expression level of the molecular species with which it is associated. Additionally, in order to infer potential regulatory control mechanisms from patterns of time-series correlations, GATE allows interactive interrogation of the movies with a wide variety of background knowledge datasets. PMID: 19892805 (Contact: Avi Ma’ayan)

Lists2Networks

Lists2Networks is a web-based, client-server software application that allows users to upload and analyze lists of mammalian gene-sets. Within their workspace users can examine the overlap among the lists they upload, manipulate lists with different set operations, expand lists using existing mammalian networks of protein-protein interactions, co-expression correlations, or background knowledge annotation correlations, and apply simple gene-set enrichment analyses on many gene lists at once against a plethora of background datasets. PMID: 20152038 (Contact: Avi Ma’ayan)

KEA

KEA (Kinase Enrichment Analysis) is a web-based tool with an underlying database providing users with the ability to link lists of mammalian proteins/genes with the kinases that phosphorylate them. The system draws from several available kinase-substrate databases to compute kinase enrichment probability based on the distribution of kinase-substrate proportions in the background kinase-substrate database compared with kinases found to be associated with an input list of genes/proteins. PMID: 19176546 (Contact: Avi Ma’ayan)

SNAVI

Studies of cellular signaling indicate that signal transduction pathways combine to form large networks of interactions. Viewing protein-protein and ligand-protein interactions as graphs (networks), where biomolecules are represented as nodes and their interactions are represented as links, is a promising approach for integrating experimental results from different sources to achieve a systematic understanding of the molecular mechanisms driving cell phenotype. The emergence of large-scale signaling networks provides an opportunity for topological statistical analysis. However, visualization of such networks represents a challenge. SNAVI (Signaling Networks Analysis and Visualization) is a Windows-based desktop application that implements standard network analysis methods to compute clustering, connectivity distribution, and network motif detection, and provides means to visualize networks and network motifs. SNAVI is capable of generating linked web pages from network datasets loaded in text format. SNAVI can also create networks from lists of gene or protein names. SNAVI is a useful tool for analyzing, visualizing and sharing cell signaling data. SNAVI is free, open source software. PMID: 19154595 (Contact: Avi Ma’ayan)

AVIS

AVIS AJAX Viewer is a visualization tool for viewing and sharing intracellular signaling, gene regulation and protein interaction networks. AVIS is implemented as an AJAX-enabled syndicated Google gadget. It allows any webpage to render an image from a text file representation of signaling, gene regulatory or protein interaction networks. PMID: 17855420 (Contact: Avi Ma’ayan)

Genes2Networks

Genes2Networks is a tool that can be used to place lists of mammalian genes in the context of background mammalian signalome and interactome networks. The input to the program is a list of human Entrez Gene gene symbols and background networks in SIG format, while the output includes: (a) all identified interactions for the genes/proteins, (b) a subnetwork connecting the genes/proteins using intermediate components that are used to connect the genes, (c) ranking of the specificity of intermediate components to interact with the list of genes/proteins, and (d) a clustering analysis of the genes/proteins from the seed list based on their distance from one another in network space. PMID: 17916244 (Contact: Avi Ma’ayan)

MOOSE

MOOSE (Multiscale Object Oriented Simulation Environment) is designed to handle large complex simulations, especially in biology. MOOSE spans the range from single molecules to subcellular networks, from single cells to neuronal networks, and to still larger systems. It is backwards-compatible with GENESIS, and forward-compatible with Python and XML-based model definition standards like SBML and MorphML. PMID: 19129924 (Contact: Upi Bhalla)

Presynaptome Database

The Presynaptome Database was created to visualize and share, with the neuroscience and molecular biology research communities, information about proteins and interactions identified to be present in presynaptic nerve terminals of mammalian neurons. The web site features a network of protein-protein interactions manually extracted from neuroscience research literature. The interactions in this network are identified to be exclusively from presynaptic nerve terminals of mammalian neurons. PMID: 19562802 (Contacts: Avi Ma’ayan and Lakshmi A. Devi)

iScMiD

iScMiD (Integrated Stem-Cell Molecular Interactions Database) is an initial database for disseminating and displaying gene regulatory networks in stem cells. It currently contains interactions from 12 recent publications profiling stem-cell related transcription factors using various high-throughput ChIP profiling methods. The resultant integrated network has 50,250 entries. The file contains four columns: transcription factor (TF), target gene, PubMed ID (pmid), and organism (mouse/human). Please be aware that this network is likely to contain many false positives and should be used for hypothesis generation only. PMID: 19738627 (Contact: Avi Ma’ayan)

DOQCS

DOQCS (Database of Quantitative Cellular Signaling) is a repository of models of signaling pathways. It includes reaction schemes, concentrations, rate constants, as well as annotations on the models. The database provides a range of search, navigation, and comparison functions. PMID: 12584128 (Contact: Upi Bhalla)

Sig2BioPAX

Sig2BioPAX is a command-line Java program that can be used to convert structured text files describing molecular interactions into the BioPAX Level 3 standard format. (Contact: Avi Ma’ayan)