The raw data from a microarray experiment is a series of scanned images.
- Images must be converted into quantitative data
- The steps that preprocess and transform the image into a format suitable for analysis fall under the heading of "image analysis."
Fall 2016
A digital image is a grid of values \(f(x,y)\); each \(f(x,y)\) represents the brightness of a small picture element, called a pixel, at location \((x,y)\).
The number of pixels contained in a digital image is called its resolution.
2^7=128 | 2^6=64 | 2^5=32 | 2^4=16 | 2^3=8 | 2^2=4 | 2^1=2 | 2^0=1 |
---|---|---|---|---|---|---|---|
0 | 1 | 0 | 1 | 0 | 1 | 1 | 1 |
In decimal, this bit pattern is \(128 \cdot 0 + 64 \cdot 1 + 32 \cdot 0 + 16 \cdot 1 + 8 \cdot 0 + 4 \cdot 1 + 2 \cdot 1 + 1 \cdot 1 = 87\).
A color depth of 16 bits/pixel (common for microarray scanners) means the intensity value of each pixel is an integer between \(0\) and \(2^{16}-1 = 65,535\).
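The binary arithmetic above can be checked with a few lines of Python. This is a minimal sketch; the helper name `max_intensity` is made up for illustration and does not come from any scanner software.

```python
# Bits from the table above, most significant bit (2^7) first.
bits = [0, 1, 0, 1, 0, 1, 1, 1]

# Weight each bit by its power of two: 2^7 for the first bit, 2^0 for the last.
value = sum(bit * 2 ** (len(bits) - 1 - i) for i, bit in enumerate(bits))


def max_intensity(bits_per_pixel):
    """Largest integer a pixel can store at the given color depth."""
    return 2 ** bits_per_pixel - 1


print(value)              # 87
print(max_intensity(8))   # 255
print(max_intensity(16))  # 65535
```

An 8-bit pixel can therefore take 256 distinct values and a 16-bit pixel 65,536, which is why 16-bit scans resolve much finer intensity differences.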
These steps use specialized software and can involve varying degrees of human intervention.
http://image.bio.methods.free.fr/ImageJ/?Protein-Array-Analyzer-for-ImageJ.html&artpage=5-6
Custom spotted arrays are manufactured by a robotic system that uses several print tips (pins, pinheads) to deposit the cDNA fragments on each of the spots.
Typically, each of the n print tips deposits its spots in a regularly sized sub-grid, so that the entire microarray is composed of n matrices with the same number of rows and columns.
Ideally, the spots are of the same size, the same shape and are equally spaced throughout the array.
The ultimate goal of any image analysis technique should be the automation of the image analysis process.
Although the layout of the cDNA array is known and can be used for addressing, the known model must be matched to the scanned image.
Therefore, most software packages include both automatic and manual procedures for addressing.
Microarray Layout Parameters | Value |
---|---|
Array Rows | 4 |
Array Columns | 4 |
Rows | 21 |
Columns | 21 |
Array Row Spacing | 9000 |
Array Column Spacing | 9000 |
Spot Row Spacing | 425 |
Spot Column Spacing | 425 |
Spot Diameter | 300 |
Spots per Array | 441 |
Total Spots | 7056 |
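As a rough illustration of how a known layout supports addressing, the nominal spot centers can be generated from the parameters in the table. This is a sketch under stated assumptions: the `Layout` class and `spot_centers` function are hypothetical names, the first spot is placed at the origin, and distances are in whatever unit the table uses (the source does not state it). In practice these nominal positions still have to be matched to the scanned image.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layout:
    # Values copied from the layout table; units are not stated in the source.
    array_rows: int = 4
    array_cols: int = 4
    rows: int = 21
    cols: int = 21
    array_row_spacing: int = 9000
    array_col_spacing: int = 9000
    spot_row_spacing: int = 425
    spot_col_spacing: int = 425


def spot_centers(layout, origin=(0, 0)):
    """Yield (array_row, array_col, row, col, y, x) for every nominal spot."""
    y0, x0 = origin  # assumed position of the first spot of the first sub-grid
    for ar in range(layout.array_rows):
        for ac in range(layout.array_cols):
            for r in range(layout.rows):
                for c in range(layout.cols):
                    y = y0 + ar * layout.array_row_spacing + r * layout.spot_row_spacing
                    x = x0 + ac * layout.array_col_spacing + c * layout.spot_col_spacing
                    yield ar, ac, r, c, y, x


layout = Layout()
centers = list(spot_centers(layout))
print(layout.rows * layout.cols)  # 441 spots per array
print(len(centers))               # 7056 spots in total
print(centers[-1])                # (3, 3, 20, 20, 35500, 35500)
```

The two counts reproduce the "Spots per Array" and "Total Spots" rows of the table, which is a quick consistency check on the layout parameters.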
Once the address of the spots has been identified, the pixels must be classified as signal versus background, a process called segmentation.
Background is the part of the measured signal intensity that is presumed to be due to non-specific binding of target to the probe.
It should be removed from the signal intensity measurement in order to accurately quantitate the amount of target RNA present in the sample.
Sources include uneven hybridization, autofluorescence, and non-specific binding, so measurements outside the spot are not at 0 intensity.
Spatial-based segmentation:
Intensity-based segmentation
Chen Y, Dougherty ER, Bittner ML. "Ratio-based decisions and the quantitative analysis of cDNA microarray images." J Biomed Opt. 1997. PMID: 23014960
Spot intensity: a summary statistic computed from the intensities of all pixels in the spot area; the background intensity is summarized in the same way.
There is still no consensus on which statistic to use.
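To make segmentation and background correction concrete, here is a minimal sketch of one simple spatial approach: a fixed circle around the spot center, followed by subtracting a background summary from a foreground summary. The patch values are invented for illustration, the function name `segment_fixed_circle` is hypothetical, and the median is only one of several statistics that could be used.

```python
from statistics import median


def segment_fixed_circle(patch, center, radius):
    """Split pixels into (foreground, background) by distance from the center."""
    cy, cx = center
    foreground, background = [], []
    for y, row in enumerate(patch):
        for x, value in enumerate(row):
            inside = (y - cy) ** 2 + (x - cx) ** 2 <= radius ** 2
            (foreground if inside else background).append(value)
    return foreground, background


# Hypothetical 5 x 5 patch around one spot (made-up intensities).
patch = [
    [100, 110, 105, 100, 95],
    [105, 120, 900, 115, 100],
    [110, 950, 1000, 940, 105],
    [100, 125, 910, 120, 110],
    [95, 105, 100, 110, 100],
]

fg, bg = segment_fixed_circle(patch, center=(2, 2), radius=1)
signal = median(fg) - median(bg)  # background-corrected spot intensity

print(len(fg), len(bg))  # 5 20
print(signal)            # 835.0
```

Swapping `median` for `mean` changes the result when a few pixels are unusually bright or dark, which is one reason the choice of statistic is still debated.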
On the HG-U133A GeneChip, each PM and MM probe is scanned as a 6 x 6 matrix of pixels:
(X,Y) | Y=2433 | Y=2434 | Y=2435 | Y=2436 | Y=2437 | Y=2438 |
---|---|---|---|---|---|---|
X=2366 | 164 | 209 | 225 | 215 | 200 | 145 |
X=2365 | 294 | 438 | 511 | 562 | 432 | 238 |
X=2364 | 259 | 433 | 542 | 514 | 530 | 275 |
X=2363 | 374 | 597 | 595 | 621 | 672 | 358 |
X=2362 | 319 | 542 | 555 | 518 | 594 | 286 |
X=2361 | 267 | 372 | 369 | 356 | 378 | 190 |
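The 6 x 6 pixel matrix can be reduced to a single probe intensity in more than one way. The sketch below compares the mean and median over all 36 pixels with the same statistics over the inner 4 x 4 block. Dropping the one-pixel border is an assumption made here for illustration (the edge pixels in the table are visibly dimmer), not a rule stated in the source.

```python
from statistics import mean, median

# Pixel intensities copied from the table, rows X=2366 down to X=2361.
pixels = [
    [164, 209, 225, 215, 200, 145],
    [294, 438, 511, 562, 432, 238],
    [259, 433, 542, 514, 530, 275],
    [374, 597, 595, 621, 672, 358],
    [319, 542, 555, 518, 594, 286],
    [267, 372, 369, 356, 378, 190],
]

all_pixels = [v for row in pixels for v in row]

# Assumption: discard the outer one-pixel border and keep the inner 4 x 4 block.
inner = [v for row in pixels[1:-1] for v in row[1:-1]]

print(len(all_pixels), len(inner))  # 36 16
print(mean(inner))                  # 541
print(median(inner))                # 542.0
print(round(mean(all_pixels), 1), median(all_pixels))
```

For this probe the inner-block mean and median nearly coincide, while including the border pulls the mean down considerably.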