AnVil Informatics, Inc. is an in silico drug discovery company employing proprietary data exploration technology. Our highly experienced team of scientists, informaticists, data visualization specialists, and data miners is expert at navigating large, high-dimensional data sets. Here we present an example of one such exploration, in which AnVil uncovered knowledge that had not previously been identified.

Golub & Slonim: New Breakthroughs From the Re-examination of Previous Results

Background

Golub & Slonim et al. (1999) produced gene expression profiles from 72 acute leukemia patient samples using Affymetrix GeneChip™ Hu6800 microarrays. Samples from four sites, collected over a 20-year period, were assayed and evaluated using Affymetrix software. While the lack of replicate testing and sample quality control may have increased the noise level in the resulting data set, the data were sufficient for the authors to classify cancer types and discover cancer sub-types.

Read the original paper (PDF format) by Golub & Slonim et al. for more details.

The Golub & Slonim et al. data set continues to be actively studied, with dozens of published analyses in print and on the Internet. Of the groups that have published these analyses, including the original authors, only one other than AnVil has presented a gene predictor set diagnostic of clinical treatment outcome.

What sets AnVil Informatics' approach apart is not only the visual means by which we conduct our analyses, but also the "big picture" view we create of the problem and the rigor with which we verify our answers. Traditional analysis and visualization methods cannot examine the entire data set as a whole because of limitations in existing software tools. Assumptions made when reducing data dimensions may also remove or obscure data relationships and candidate genes, resulting in weaker classifiers and predictors and the loss of valuable targets. AnVil Informatics identifies clustering potentials and outlier samples early in order to avoid the costly mistakes that can slow or misdirect the drug discovery process.

Using AnVil's Approach to Analyze the Data Set

AnVil Informatics' approach preserves the value and meaning inherent in the full data set by creating global data and meta-statistical overviews that reveal major data patterns and identify aberrant samples that may bias results. This approach departs from the now-traditional analysis methods exemplified by Golub & Slonim et al.'s analysis of microarray data. We first take a high-level overview of the data set, examining statistical metadata and working in high-dimensional space to assess the full data set. Visualizations are used not only to portray the results of analyses, but also as interactive tools for exploring, manipulating, and analyzing the data and for generating subsequent results.
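As a rough illustration of what a per-sample statistical overview could look like, the sketch below summarizes each sample by its mean expression value and flags samples whose summary lies far from the rest. It is not AnVil's proprietary method: the function names, the choice of the mean as the summary statistic, the robust z-score based on the median absolute deviation, and the 3.5 cutoff are all assumptions made for this example.

```python
import statistics


def robust_z_scores(values):
    """Return robust z-scores based on the median and the median
    absolute deviation (MAD). Hypothetical helper for illustration."""
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    if mad == 0:
        # No spread to scale by; report no deviation for any value.
        return [0.0 for _ in values]
    # 0.6745 rescales the MAD so scores are comparable to standard z-scores
    # when the data are roughly normal.
    return [0.6745 * (v - med) / mad for v in values]


def flag_aberrant_samples(expression, threshold=3.5):
    """expression maps a sample id to its list of expression values
    (one value per gene). Returns the ids whose mean expression has a
    robust z-score larger in magnitude than `threshold` (assumed cutoff)."""
    sample_ids = list(expression)
    means = [statistics.mean(expression[s]) for s in sample_ids]
    scores = robust_z_scores(means)
    return [s for s, z in zip(sample_ids, scores) if abs(z) > threshold]


if __name__ == "__main__":
    # Toy data invented for this example; not taken from the leukemia data set.
    toy = {
        "s1": [10, 12, 11, 13],
        "s2": [11, 13, 12, 14],
        "s3": [9, 11, 10, 12],
        "s4": [10, 13, 12, 13],
        "s5": [11, 12, 10, 11],
        "s6": [100, 120, 110, 130],
    }
    print(flag_aberrant_samples(toy))
```

A check of this kind runs on the unfiltered data, so suspect samples can be set aside for closer inspection before any classifier is built.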

This presentation demonstrates how high-dimensional visualization of massive microarray data sets can reveal valuable clustering relationships, even before filtering, thresholding, and other pre-processing.

Read the Golub & Slonim Exploration Report

Results

Using the data of Golub & Slonim et al., AnVil Informatics' analysis identified:

Several suspect patient samples that were subsequently shown to be misclassified in a 3-gene predictor set that otherwise classified B- and T-cell ALL and AML based on the influence of a B-cell-associated gene. This biologically important result may have been discarded by others whose methods do not evaluate data quality along with classifier prediction.

A 76-gene predictor for remission success and failure, derived from analysis of limited chemotherapy treatment outcome data. Until recently this finding was unique among the dozens of presented analyses of the Golub & Slonim et al. data set. While a survival prognosis predictor has been generated for lymphoma [Alizadeh et al. (2000)], a gene predictor set for the success or failure of chemotherapy treatment is a significant contribution.
 
 

Copyright © 2001-2002 AnVil Informatics, Inc. All rights reserved.