VERAandSAM.README

Institute for Systems Biology

thorsson@systemsbiology.org

February, 2001

VERA and SAM are a pair of programs that provide a method to determine whether any given gene is expressed at a different level in one cell population than in another according to microarray data.

This README includes a brief description of VERA and SAM followed by notes on usage.

VERA takes data from replicate microarray experiments, and describes the overall variability in the data in terms of five parameters, called error-model parameters. Error-model parameters are fitted to the data by starting from an initial guess, and optimizing them in iterated steps until they have converged.

The parameters are:

    sigma_epsilon_x : Standard deviation of multiplicative error in X (the 1st dye)
    sigma_epsilon_y : Standard deviation of multiplicative error in Y (the 2nd dye)
    rho_epsilon     : Correlation between multiplicative errors
    sigma_delta_x   : Standard deviation of additive error in X (the 1st dye)
    sigma_delta_y   : Standard deviation of additive error in Y (the 2nd dye)
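To make the roles of the five parameters concrete, the following Python sketch simulates one pair of dye intensities for a single gene. It is an illustration only and is not taken from the VERA source. The model form (a multiplicative error that scales with the true mean, plus an additive error that does not), the normal distributions, the independence of the two additive errors, and the function name simulate_replicate are all assumptions made for this example.

```python
import math
import random


def simulate_replicate(mu_x, mu_y, sigma_eps_x, sigma_eps_y, rho_eps,
                       sigma_delta_x, sigma_delta_y, rng):
    """Draw one (x, y) intensity pair for a gene with true means mu_x, mu_y.

    Assumed model form (not stated in this README):
        x = mu_x + mu_x * eps_x + delta_x
        y = mu_y + mu_y * eps_y + delta_y
    """
    # Two independent standard normal draws, combined so that the
    # multiplicative errors have correlation rho_eps.
    z1 = rng.gauss(0.0, 1.0)
    z2 = rng.gauss(0.0, 1.0)
    eps_x = sigma_eps_x * z1
    eps_y = sigma_eps_y * (rho_eps * z1 + math.sqrt(1.0 - rho_eps ** 2) * z2)

    # Additive errors, drawn independently here (an assumption).
    delta_x = rng.gauss(0.0, sigma_delta_x)
    delta_y = rng.gauss(0.0, sigma_delta_y)

    x = mu_x + mu_x * eps_x + delta_x
    y = mu_y + mu_y * eps_y + delta_y
    return x, y


if __name__ == "__main__":
    rng = random.Random(1)
    # Illustrative parameter values only.
    print(simulate_replicate(1000.0, 2000.0, 0.3, 0.4, 0.8, 200.0, 100.0, rng))
```

With all five parameters set to 0 the function returns the true means unchanged, which is a quick way to check that each parameter only adds variability around mu_x and mu_y.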

SAM gives a value, lambda, for each gene on an array, which describes how likely it is that the gene is expressed differently in the two cell populations. A large value of lambda means that the gene is almost certainly differentially expressed, while a value close to 0 indicates that there is no evidence for differential expression. A threshold value for differential expression, lambda_c, should be determined from control experiments. In the reference publication below, lambda_c was fixed at the value at which 0.1% of the genes were called differentially expressed in a control experiment with identical conditions for both cell populations (lambda_c = 23.8).
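One way to derive a threshold from a control experiment, as described above, is to pick the lambda value that only a chosen fraction of control genes exceed. The Python sketch below illustrates that idea. The function names and the handling of ties are assumptions for this example, and it is not the procedure used to obtain the published value of 23.8.

```python
import math


def threshold_from_control(control_lambdas, fraction=0.001):
    """Return lambda_c such that at most `fraction` of the control genes
    have lambda strictly greater than lambda_c."""
    if not control_lambdas:
        raise ValueError("need at least one control lambda value")
    ranked = sorted(control_lambdas, reverse=True)
    # Number of control genes allowed above the threshold.
    # The small offset guards against floating-point round-off.
    k = math.floor(fraction * len(ranked) + 1e-9)
    k = min(k, len(ranked) - 1)
    return ranked[k]


def differentially_expressed(lambdas, lambda_c):
    """Flag each gene whose lambda exceeds the threshold lambda_c."""
    return [lam > lambda_c for lam in lambdas]


if __name__ == "__main__":
    # Made-up control values: 1000 genes with lambda = 1, 2, ..., 1000.
    control = list(range(1, 1001))
    lambda_c = threshold_from_control(control, fraction=0.001)
    print(lambda_c, sum(differentially_expressed(control, lambda_c)))
```

The threshold is then applied unchanged to the experiment of interest: genes with lambda above lambda_c are reported as differentially expressed.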

T. E. Ideker, V. Thorsson, A. F. Siegel, and L. E. Hood. Testing for differentially-expressed genes by maximum-likelihood analysis of microarray data.

VERA [OPTIONS] <mergedFile> <ErrorModel>

beta 0.3 0.4 0.8 200 100

-evol Use if you would like to generate a file showing how the parameters converge.

-init <ErrorModel> Use if you would like to specify your own initial choices for parameter optimization

-crit <number> Optimization ceases when all changes after an iteration step are less than <number>

-iter Display details of optimization (Use for debugging only)
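The stopping rule described for -crit can be pictured with the short Python sketch below: parameters are updated repeatedly, and the loop ends once every parameter changes by less than the criterion in a single step. This is a generic illustration, not VERA's optimizer. The update function `step`, the use of absolute (rather than relative) changes, the iteration cap, and the recorded `history` (loosely analogous to what -evol writes out) are all assumptions.

```python
def iterate_until_converged(params, step, crit, max_iter=1000):
    """Apply `step` repeatedly until all parameter changes are below `crit`.

    Returns the final parameters and the list of parameter values visited.
    """
    history = [list(params)]
    for _ in range(max_iter):
        new_params = step(params)
        # Largest absolute change of any single parameter in this step.
        largest_change = max(abs(n - o) for n, o in zip(new_params, params))
        params = new_params
        history.append(list(params))
        if largest_change < crit:
            return params, history
    raise RuntimeError("no convergence within max_iter steps")


if __name__ == "__main__":
    # Toy update rule: move each parameter halfway toward a fixed target.
    target = [0.3, 0.4]

    def toy_step(p):
        return [(v + t) / 2.0 for v, t in zip(p, target)]

    final, history = iterate_until_converged([1.0, 1.0], toy_step, crit=1e-6)
    print(final, len(history))
```

A smaller criterion gives parameters closer to their converged values at the cost of more iteration steps.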

SAM [OPTIONS] <mergedFile> <ErrorModel> <mergedFileSignificance>

<mu_X> <mu_Y> <lambda> <muRatio> <T>

    <mu_X>     mean intensity for first dye
    <mu_Y>     mean intensity for second dye
    <lambda>   likelihood of differential expression, i.e. that <mu_X> differs from <mu_Y>
    <muRatio>  log_10( mu_X / mu_Y ) (unless ratio was tempered, see below)
    <T>        'T' if <muRatio> was tempered, '-' if not

The column <muRatio> displays a "tempered" alternative to the ratio if the mean intensity of a dye falls below a threshold given by the background.
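As a small worked example of how the five values for one gene might be read together, the Python sketch below computes the untempered log ratio and compares lambda with a threshold. The function names and the returned dictionary are hypothetical, the threshold must come from your own control experiment, and the tempering rule itself is not reproduced because it is not specified here.

```python
import math


def untempered_mu_ratio(mu_x, mu_y):
    """log_10(mu_X / mu_Y), the <muRatio> value when no tempering applies."""
    if mu_x <= 0 or mu_y <= 0:
        raise ValueError("mean intensities must be positive to take a log ratio")
    return math.log10(mu_x / mu_y)


def describe_gene(mu_x, mu_y, lam, mu_ratio, flag, lambda_c):
    """Summarize one gene from its five SAM values and a threshold lambda_c."""
    tempered = (flag == "T")
    return {
        # Differential expression is judged by lambda, not by the ratio.
        "significant": lam > lambda_c,
        # When tempered, mu_ratio is not the plain log_10 ratio.
        "tempered": tempered,
        "mu_ratio": mu_ratio,
        "higher_in": "X" if mu_ratio > 0 else "Y" if mu_ratio < 0 else "neither",
    }


if __name__ == "__main__":
    ratio = untempered_mu_ratio(1000.0, 100.0)
    # lambda_c = 23.8 is the value quoted from the reference publication.
    print(describe_gene(1000.0, 100.0, 40.0, ratio, "-", lambda_c=23.8))
```

A ratio of 1 on this scale corresponds to a tenfold higher mean intensity for the first dye.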

-iter Display details of optimization (Use for debugging only)