preprocess.README
Institute for Systems Biology
(c) Trey Ideker, October 2000
preprocess [OPTIONS] <dappleFile> <geneKey> <processedOutput>
num_rows_per_slide 144
Each of the following rows lists the mapping between a spot position and a
X X intensity below background threshold (given in output header)
Y Y intensity below background threshold (given in output header)
A Abnormally high local background in X
B Abnormally high local background in Y
N No spot found by Dapple at this microarray location (row, col)
S Spot is saturated in X or Y intensity (intensity exceeds the value
set with the -sat option)
- No flag set
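When reading preprocess output in a script, it can help to keep the flag codes in a lookup table. The following is a minimal Python sketch, not part of preprocess itself; the names FLAG_MEANINGS, describe_flag and is_clean are hypothetical and chosen only for illustration.

```python
# Flag codes and their meanings, copied from the list above.
FLAG_MEANINGS = {
    "X": "X intensity below background threshold",
    "Y": "Y intensity below background threshold",
    "A": "Abnormally high local background in X",
    "B": "Abnormally high local background in Y",
    "N": "No spot found by Dapple at this microarray location",
    "S": "Spot is saturated in X or Y intensity",
    "-": "No flag set",
}


def describe_flag(flag):
    """Return the documented meaning of a one-character flag code."""
    return FLAG_MEANINGS.get(flag, "Unknown flag: " + flag)


def is_clean(flag):
    """True when the spot carries no flag at all."""
    return flag == "-"
```

A caller might, for example, keep only spots for which is_clean(flag) is true before further analysis. Whether flagged spots should be dropped or merely annotated depends on the experiment.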
-base <num> Output the log intensity ratio using base <num>. To obtain the
natural logarithm, specify 'e'. The default is base 10.
-norm {median, mean, none} Specifies method for normalizing intensities
between the two dyes X and Y. 'Median' (the default) scales
X and Y intensities by fixed factors Mx and My so that the
median X intensity is equal to the median Y intensity, over
the top 50% brightest spots on the microarray (as sorted by X+Y
intensity). 'Mean' uses the mean instead of the median.
-sat <num> Specify a saturating intensity for the microarray scanner.
Intensities above this number are flagged with 'S'.
-scale <num> During median normalization, forces median(x)=median(y)=<num>.
Without this option (by default), median normalization sets
median(x)=median(y)=average(median(x),median(y)). This option
is useful if multiple replicate microarrays are to be analyzed
because it ensures that all of them have the same scale.
-debug Creates <processedOutput>.debug, containing diagnostic information.
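The arithmetic behind -norm median, -scale and -base can be sketched in Python as follows. This is an illustration of the description above, not the code used by preprocess. The function names are hypothetical, and the handling of odd spot counts and ties in the "top 50%" selection is an assumption, since the text does not specify it.

```python
import math
from statistics import median


def median_normalize(x, y, scale=None):
    """Scale x and y by fixed factors Mx and My so that their medians
    agree over the top 50% brightest spots (ranked by X+Y intensity).

    Assumption: for an odd number of spots the top half is rounded down.
    """
    # Rank spot indices by total intensity, brightest first.
    order = sorted(range(len(x)), key=lambda i: x[i] + y[i], reverse=True)
    top = order[: max(1, len(order) // 2)]

    med_x = median(x[i] for i in top)
    med_y = median(y[i] for i in top)

    # With -scale <num> both medians are forced to <num>; otherwise
    # they are set to the average of the two medians.
    target = scale if scale is not None else (med_x + med_y) / 2.0
    mx = target / med_x
    my = target / med_y
    return [v * mx for v in x], [v * my for v in y], mx, my


def log_ratio(x, y, base=10.0):
    """Log of the X/Y ratio in the given base (pass math.e for -base e)."""
    return math.log(x / y, base)
```

Passing the same scale value for every replicate array puts all of them on a common intensity scale, which is the use case described for -scale.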
# Output file contains a header containing general
# information about the preprocessing run.  Each line of
# the header starts with the '#' character.  Information
# about the total number of genes, distributions of
# x and y, and average background intensity comes first...
#
# ...followed by information pertaining to the
# normalization process...
#
# ...then by a histogram of normalized log ratios
#
#
# Normalized, background-subtracted data starts after the
# header, one line per gene.  For example...
#
# GENE    GENE       RATIO    LOG                       -SLIDE-
# NAME    DESCRIPT    X/Y    RATIO   X INT  Y INT FLAG  ROW COL
#-------- --------- -------- ------ ------ ------ ----- --- ---
YNL080C   YNL080C     0.1893 -0.723   2017  10655   -   141   1
YEL055C   POL5        0.3165 -0.500   1001   3818   X   141   0
YDL081C   RPP1A       0.5217 -0.283  33393  64009   S    58   0
...
YNL330C   RPD3        0.5625 -0.250   5142   9140   B     9   3
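
The data lines of such an output file can be read back with a short script. The sketch below is in Python and assumes the columns are whitespace-separated in the order shown in the example. It counts the seven numeric and flag columns from the right, so a gene description containing spaces would still parse. That layout is inferred from the example, not confirmed by the preprocess source. Spot and parse_output are hypothetical names.

```python
from typing import NamedTuple


class Spot(NamedTuple):
    name: str
    description: str
    ratio: float
    log_ratio: float
    x: float
    y: float
    flag: str
    row: int
    col: int


def parse_output(lines):
    """Parse data lines, skipping '#' header lines and short lines."""
    spots = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # blank line or header line
        fields = line.split()
        if len(fields) < 9:
            continue  # e.g. an elision marker or a malformed line
        # The last seven fields are fixed; anything between the gene
        # name and those fields is treated as the description.
        ratio, log_r, x, y = (float(f) for f in fields[-7:-3])
        spots.append(Spot(
            name=fields[0],
            description=" ".join(fields[1:-7]),
            ratio=ratio,
            log_ratio=log_r,
            x=x,
            y=y,
            flag=fields[-3],
            row=int(fields[-2]),
            col=int(fields[-1]),
        ))
    return spots
```

Typical use would be parse_output(open(path)) followed by filtering on the flag field.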