VERAandSAM.README
Institute for Systems Biology
mjohnson@systemsbiology.org
October, 2001


OVERVIEW

VERA and SAM are a pair of programs that use microarray data to determine whether a given gene is expressed at a different level in one cell population than in another.

This README includes a brief description of VERA and SAM followed by notes on usage.

VERA takes data from replicate microarray experiments, and describes the overall variability in the data in terms of five parameters, called error-model parameters. Error-model parameters are fitted to the data by starting from an initial guess, and optimizing them in iterated steps until they have converged.

The parameters are:

 
sigma_epsilon_x : Standard deviation of multiplicative error in X (the 1st dye)
sigma_epsilon_y : Standard deviation of multiplicative error in Y (the 2nd dye)
rho_epsilon     : Correlation between multiplicative errors
sigma_delta_x   : Standard deviation of additive error in X (the 1st dye)
sigma_delta_y   : Standard deviation of additive error in Y (the 2nd dye)
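
To make the roles of the five parameters concrete, the following sketch draws simulated replicate measurements for one gene. The model form used here, x = mu_x * (1 + epsilon_x) + delta_x (and likewise for y), normally distributed errors, and independent additive errors are all assumptions made for illustration; this README does not state them, and the function name simulate_pair is hypothetical and not part of VERA.

```python
import math
import random

def simulate_pair(mu_x, mu_y, sigma_eps_x, sigma_eps_y, rho_eps,
                  sigma_delta_x, sigma_delta_y, rng):
    """Draw one (x, y) measurement for a gene with true means mu_x, mu_y.

    Assumed model form (an illustration, not taken from this README):
        x = mu_x * (1 + eps_x) + delta_x
        y = mu_y * (1 + eps_y) + delta_y
    """
    # Two independent standard normals, combined so that the
    # multiplicative errors have correlation rho_eps.
    z1 = rng.gauss(0.0, 1.0)
    z2 = rng.gauss(0.0, 1.0)
    eps_x = sigma_eps_x * z1
    eps_y = sigma_eps_y * (rho_eps * z1 + math.sqrt(1.0 - rho_eps ** 2) * z2)
    # Additive errors are drawn independently (an assumption).
    delta_x = rng.gauss(0.0, sigma_delta_x)
    delta_y = rng.gauss(0.0, sigma_delta_y)
    return mu_x * (1.0 + eps_x) + delta_x, mu_y * (1.0 + eps_y) + delta_y

# Four simulated replicates of a gene with equal true expression in both dyes.
rng = random.Random(0)
replicates = [simulate_pair(1000.0, 1000.0, 0.3, 0.4, 0.8, 200.0, 100.0, rng)
              for _ in range(4)]
```

Under this assumed form, the multiplicative terms dominate for bright spots and the additive terms dominate for dim ones.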

SAM gives a value, lambda, for each gene on an array, which describes how likely it is that the gene is expressed differently in the two cell populations. A large value of lambda means that the gene is almost certainly differentially expressed, while a value close to 0 indicates that there is no evidence for differential expression. A threshold value for differential expression, lambda_c, should be determined from control experiments. In the reference publication below, lambda_c was fixed at the value at which 0.1% of the genes were called differentially expressed in a control experiment with identical conditions for both cell populations (lambda_c = 23.8).
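
The threshold rule above can be sketched as follows: given the lambda values from a control experiment, pick the cutoff that only a chosen fraction of control genes exceed. The function names and the exact tie-handling are assumptions for illustration; they are not part of SAM.

```python
def lambda_threshold(control_lambdas, fraction=0.001):
    """Return a cutoff that at most `fraction` of the control genes exceed."""
    ranked = sorted(control_lambdas, reverse=True)
    # Number of control genes allowed to lie above the cutoff.
    n_allowed = int(fraction * len(ranked))
    return ranked[n_allowed]

def differentially_expressed(lambdas, lambda_c):
    """Flag each gene whose lambda is strictly above the cutoff."""
    return [lam > lambda_c for lam in lambdas]
```

With the default fraction of 0.001, a control experiment needs on the order of a thousand genes or more before any gene is allowed above the cutoff.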

Reference:
T. E. Ideker, V. Thorsson, A. F. Siegel, and L. E. Hood. Testing for differentially-expressed genes by maximum-likelihood analysis of microarray data. Journal of Computational Biology 7, 805-817 (2000).


RUNNING VERA

Chain of Commands:


Input file format: File providing the (X,Y) expression levels for each gene and replicate experiment, e.g. as produced by the mergeReps script. Please see the file-format specification for details.

Output file format: The ErrorModel output file lists the five error-model parameters, for example:

 
beta
0.3 0.4 0.8 200 100
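
A file of this shape could be read with a few lines of Python. The sketch below assumes that the five numbers appear in the same order as the parameter list in the OVERVIEW; that ordering is not stated explicitly in this README and should be checked against the file-format specification. The function name parse_error_model is hypothetical.

```python
# Assumed column order, copied from the parameter list in the OVERVIEW.
PARAM_NAMES = ("sigma_epsilon_x", "sigma_epsilon_y", "rho_epsilon",
               "sigma_delta_x", "sigma_delta_y")

def parse_error_model(text):
    """Parse the contents of an ErrorModel file into a name -> value dict."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if len(lines) < 2 or lines[0] != "beta":
        raise ValueError("expected a 'beta' header followed by a value line")
    values = [float(tok) for tok in lines[1].split()]
    if len(values) != len(PARAM_NAMES):
        raise ValueError("expected exactly five error-model parameters")
    return dict(zip(PARAM_NAMES, values))

params = parse_error_model("beta\n0.3 0.4 0.8 200 100\n")
```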
 

VERA Options:
VERA options can be accessed by clicking "Options..." on the main dialog.

 
  Option #1-    Use if you would like to generate a file
                showing how the parameters converge. The
                file name and path are also displayed.

  Option #2-    Optimization ceases when all changes
                after an iteration step are less than <number>.

  Option #3-    Use if you would like to specify your own
                initial choices for parameter optimization.
                Upon checking the box, the file name will
                be requested.


  Option #4-    Display details of optimization (use for
                debugging only). The output file has the
                extension ".VERAdebug".
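
The stopping rule of Option #2, together with the convergence record of Option #1 and the user-supplied starting point of Option #3, can be sketched as a generic loop. This is only an illustration: the step function stands in for VERA's actual parameter update, which is not described in this README, and comparing absolute (rather than relative) changes against the tolerance is an assumption.

```python
def optimize(initial, step, tolerance, max_iter=1000):
    """Iterate `step` until every parameter changes by less than `tolerance`.

    initial   : starting parameter values (compare Option #3)
    step      : placeholder update function, NOT VERA's real update
    tolerance : the <number> of Option #2 (absolute change, an assumption)
    Returns the final parameters and the list of all iterates (compare Option #1).
    """
    params = list(initial)
    trace = [tuple(params)]
    for _ in range(max_iter):
        updated = list(step(params))
        largest_change = max(abs(new - old) for new, old in zip(updated, params))
        params = updated
        trace.append(tuple(params))
        if largest_change < tolerance:
            break
    return params, trace

# Toy update with a fixed point at 2.0, used only to exercise the loop.
final, trace = optimize([0.0], lambda p: [0.5 * v + 1.0 for v in p], 1e-6)
```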


RUNNING SAM


Chain of Commands:


Input file format: File providing the (X,Y) expression levels for each gene and replicate experiment, e.g. as produced by the mergeReps script. Please see the file-format specification for details.

Output file format: <mergedFileSignificance> contains all of the information in <mergedFile>, but with five additional columns appended:

 
<mu_X> <mu_Y> <lambda> <muRatio> <T>
 
<mu_X>         mean intensity for first dye
<mu_Y>         mean intensity for second dye
<lambda>        likelihood of differential expression, i.e. that <mu_X> differs from <mu_Y>
<muRatio>       log_10( mu_X / mu_Y )   (unless ratio was tempered, see below)
<T>             'T' if <muRatio> was tempered, '-' if not

The column <muRatio> displays a "tempered" alternative to the ratio if the mean intensity for either dye falls below a threshold given by the background.
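
The five appended columns can be pulled off the end of each row as sketched below. The sketch assumes whitespace-delimited fields, and the sample row is made up for illustration; consult the file-format specification for the real layout of the leading <mergedFile> columns. The function names are hypothetical.

```python
import math

def parse_significance_row(line):
    """Split one row into the original merged fields and the five SAM columns."""
    fields = line.split()
    mu_x, mu_y, lam, mu_ratio, flag = fields[-5:]
    return {
        "merged_fields": fields[:-5],   # columns carried over from <mergedFile>
        "mu_X": float(mu_x),
        "mu_Y": float(mu_y),
        "lambda": float(lam),
        "muRatio": float(mu_ratio),
        "tempered": flag == "T",        # 'T' if tempered, '-' if not
    }

def untempered_ratio(mu_x, mu_y):
    """log_10(mu_X / mu_Y), the value of <muRatio> when it is not tempered."""
    return math.log10(mu_x / mu_y)

# Made-up example row: gene name, two raw values, then the five SAM columns.
row = parse_significance_row("geneA 1200 300 1000.0 250.0 31.5 0.60206 -")
```

For rows flagged 'T', the reported <muRatio> should not be expected to match untempered_ratio.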

SAM options:
SAM options can be accessed by clicking "Options..." on the main dialog.

 
  Option #1         Select if the SAM input is different from 
                    the VERA output.

  Option #2         Display details of optimization (use for
                    debugging only).