preprocess.README
Institute for Systems Biology
(c) Trey Ideker, October 2000
preprocess [OPTIONS] <dappleFile> <geneKey> <processedOutput>
num_rows_per_slide 144
Each of the following rows lists the mapping between a spot position and a
X   X intensity below background threshold (given in output header)
Y   Y intensity below background threshold (given in output header)
A   Abnormally high local background in X
B   Abnormally high local background in Y
N   No spot found by Dapple at this microarray location (row, col)
S   Spot is saturated in X or Y intensity (intensity is above number set with -sat option)
-   No flag set
-base <num> Output the log intensity ratio using base <num>. To obtain the natural logarithm, specify 'e'. The default is base 10.
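The effect of -base can be illustrated with a short Python sketch. This is not the code used by preprocess; the function name log_ratio is hypothetical, and only the 'e' convention and the base-10 default are taken from the option description above.

```python
import math

def log_ratio(x, y, base="10"):
    """Return log_base(x / y). base is a string; 'e' selects the natural log."""
    ratio = x / y
    if base == "e":
        return math.log(ratio)
    # Change of base: log_b(r) = ln(r) / ln(b)
    return math.log(ratio) / math.log(float(base))

# Intensities from the YNL080C row of the sample output (X=2017, Y=10655).
print(round(log_ratio(2017, 10655), 3))   # -0.723
```

With base 10 this reproduces the LOG RATIO column of the sample output for that row.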
-norm {median, mean, none} Specifies method for normalizing intensities between the two dyes X and Y. 'Median' (the default) scales X and Y intensities by fixed factors Mx and My so that the median X intensity is equal to the median Y intensity, over the top 50% brightest spots on the microarray (as sorted by X+Y intensity). 'Mean' uses the mean instead of the median.
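The normalization described above might be sketched in Python as follows. This is an illustration under stated assumptions, not the actual preprocess implementation: the function name normalize is hypothetical, 'none' is assumed to leave intensities unscaled, an odd spot count is rounded down when taking the top 50%, and the common target is the average of the two centers (the default stated under -scale).

```python
from statistics import mean, median

def normalize(xs, ys, method="median"):
    """Scale X and Y so their centers agree over the brightest half of spots."""
    if method == "none":
        return list(xs), list(ys)      # assumption: 'none' means no scaling
    center = median if method == "median" else mean
    # Rank spots by total intensity X+Y and keep the top 50%.
    order = sorted(range(len(xs)), key=lambda i: xs[i] + ys[i], reverse=True)
    top = order[: max(1, len(order) // 2)]
    cx = center([xs[i] for i in top])
    cy = center([ys[i] for i in top])
    target = (cx + cy) / 2.0           # default target when -scale is not given
    mx, my = target / cx, target / cy  # the fixed factors Mx and My
    return [x * mx for x in xs], [y * my for y in ys]

# Hypothetical intensities in which Y is uniformly half as bright as X.
xs = [100, 200, 400, 800]
ys = [50, 100, 200, 400]
nx, ny = normalize(xs, ys)
print(nx, ny)
```

In this toy case the two channels differ only by a constant factor, so after normalization they coincide.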
-sat <num> Specify a saturating intensity for the microarray scanner. Intensities above this number are flagged with 'S'.
-scale <num> During median normalization, forces median(x)=median(y)=<num>. Without this option (by default), median normalization sets median(x)=median(y)=average(median(x),median(y)). This option is useful if multiple replicate microarrays are to be analyzed, because it ensures that all of them have the same scale.
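To see why a fixed scale helps with replicates, consider this minimal Python sketch. The function scale_factors and the replicate medians are hypothetical and chosen only for illustration; the two target rules are the ones given in the -scale description above.

```python
def scale_factors(median_x, median_y, scale=None):
    """Return (Mx, My) so that both medians land on a common target."""
    if scale is not None:
        target = scale                          # -scale <num> given
    else:
        target = (median_x + median_y) / 2.0    # default: average of medians
    return target / median_x, target / median_y

# Two hypothetical replicate slides with different overall brightness,
# given as (median X, median Y).
replicates = {"slide_a": (600.0, 300.0), "slide_b": (2000.0, 1000.0)}
for name, (med_x, med_y) in replicates.items():
    mx, my = scale_factors(med_x, med_y, scale=1000.0)
    print(name, med_x * mx, med_y * my)
```

Without the scale argument, slide_a would be centered on 450 and slide_b on 1500, so their intensities would not be directly comparable; with a shared target both end up on the same scale.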
-debug Creates <processedOutput>.debug, containing diagnostic information.
# Output file contains a header containing general
# information about the preprocessing run. Each line of
# the header starts with the '#' character. Information
# about the total number of genes, distributions of
# x and y, and average background intensity comes first...
#
# ...followed by information pertaining to the
# normalization process...
#
# ...then by a histogram of normalized log ratios
#
#
# Normalized, background-subtracted data starts after the
# header, one line per gene. For example...
#
#  GENE     GENE     RATIO    LOG                        -SLIDE-
#  NAME   DESCRIPT    X/Y    RATIO   X INT  Y INT FLAG  ROW COL
#-------- --------- -------- ------ ------ ------ ----- --- ---
YNL080C   YNL080C     0.1893 -0.723   2017  10655   -   141   1
YEL055C   POL5        0.3165 -0.500   1001   3818   X   141   0
YDL081C   RPP1A       0.5217 -0.283  33393  64009   S    58   0
...
YNL330C   RPD3        0.5625 -0.250   5142   9140   B     9   3
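
An output file in this layout could be read with a few lines of Python. The sketch below is one possible reader, not part of preprocess: the function name parse_preprocess_output and the dictionary keys are hypothetical, and it assumes fields are separated by whitespace and that the last seven fields of each data line are the ratio, log ratio, X intensity, Y intensity, flag, row and column.

```python
def parse_preprocess_output(lines):
    """Yield one dict per gene line, skipping '#' header lines and blanks."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) < 9:
            continue  # not a complete data row
        # Take the numeric columns from the right, so a description that
        # happens to contain spaces does not shift them.
        ratio, log_ratio, x, y, flag, row, col = fields[-7:]
        yield {
            "name": fields[0],
            "description": " ".join(fields[1:-7]),
            "ratio": float(ratio),
            "log_ratio": float(log_ratio),
            "x": float(x),
            "y": float(y),
            "flag": flag,
            "row": int(row),
            "col": int(col),
        }

sample = [
    "# header line",
    "YNL080C YNL080C 0.1893 -0.723 2017 10655 - 141 1",
    "YDL081C RPP1A 0.5217 -0.283 33393 64009 S 58 0",
]
unflagged = [g["name"] for g in parse_preprocess_output(sample) if g["flag"] == "-"]
print(unflagged)   # ['YNL080C']
```

Filtering on the flag column, as in the last lines, is a simple way to keep only spots that passed every quality check.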