RMAExpress
Written by Ben Bolstad
email bmb@bmbolstad.com
|
Special note
This website supersedes the previous RMAExpress webpage at UC Berkeley. New and future versions of RMAExpress will appear only here.
|
What is RMAExpress?
RMAExpress is a standalone GUI program for Windows (and Linux) that computes gene expression summary values for Affymetrix GeneChip® data using the Robust Multichip Average expression summary. It does not require R, nor does it depend on any component of the Bioconductor project.
|
What is RMA?
RMA is the Robust Multichip Average. It consists of three steps: a background adjustment, quantile normalization (see the Bolstad et al. reference) and finally summarization. Some published references for the RMA methodology are:
| Bolstad, B.M., Irizarry, R.A., Astrand, M., and Speed, T.P. (2003), A Comparison of Normalization Methods for High Density Oligonucleotide Array Data Based on Bias and Variance. Bioinformatics 19(2):185-193 [Supplemental information] |
| Rafael A. Irizarry, Benjamin M. Bolstad, Francois Collin, Leslie M. Cope, Bridget Hobbs and Terence P. Speed (2003), Summaries of Affymetrix GeneChip probe level data. Nucleic Acids Research 31(4):e15
|
| Irizarry, RA, Hobbs, B, Collin, F, Beazer-Barclay, YD, Antonellis, KJ, Scherf, U, Speed, TP (2002) Exploration, Normalization, and Summaries of High Density Oligonucleotide Array Probe Level Data. Accepted for publication in Biostatistics. [Abstract, PDF, PS, Complementary Color Figures-PDF, Software]
|
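The quantile normalization step can be illustrated with a minimal Python sketch. This is not the RMAExpress source code: the function name `quantile_normalize` and the toy data are hypothetical, ties are broken by position (a simplification), and the background adjustment and summarization steps are not shown.

```python
from statistics import mean


def quantile_normalize(columns):
    """Quantile-normalize a list of equal-length columns (one per array).

    Each value is replaced by the mean of the values that share its rank
    across all arrays. Ties are broken by position, which is a
    simplification made for this sketch.
    """
    n_rows = len(columns[0])
    if any(len(col) != n_rows for col in columns):
        raise ValueError("all arrays must have the same number of probes")

    # Sort each array, then average across arrays at each rank.
    sorted_cols = [sorted(col) for col in columns]
    rank_means = [mean(col[i] for col in sorted_cols) for i in range(n_rows)]

    normalized = []
    for col in columns:
        # order[k] is the index of the k-th smallest value in this array.
        order = sorted(range(n_rows), key=lambda i: col[i])
        out = [0.0] * n_rows
        for rank, idx in enumerate(order):
            out[idx] = rank_means[rank]
        normalized.append(out)
    return normalized


# Two toy "arrays" with three probe intensities each (made-up numbers).
arrays = [[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]]
normalized = quantile_normalize(arrays)
print(normalized)
```

After normalization every array has the same set of values (here 1.5, 3.5 and 5.5), arranged according to each array's original ranks.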
|
What do I need?
You will need the appropriate CDF and CEL files for your dataset.
|
Can I use affy/Bioconductor instead?
Of course. In principle you will get the same results from both, provided you use consistent settings in affy/Bioconductor and RMAExpress. Some people prefer the power and flexibility of R, while others like the point-and-click simplicity of a GUI. RMAExpress caters to the second group. Since RMAExpress outputs the computed expression values to a text file, you may load the expression measures into R and use features of Bioconductor to analyze your gene expression values. You can also open the results file in any other application that supports importing plain text files.
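As an illustration of importing the results file into another environment, here is a small Python sketch. It assumes the file is tab-delimited, with a header row of array names and the probeset identifier in the first column; check your own output file before relying on this layout. The function name `read_expression_table` and the sample values are hypothetical.

```python
import csv
import io


def read_expression_table(handle):
    """Read a tab-delimited expression table from an open text handle.

    Assumed layout (verify against your own file): the first row holds
    the array names, and the first column holds the probeset identifier.
    Returns (array_names, {probeset_id: [values...]}).
    """
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    array_names = header[1:]
    values = {}
    for row in reader:
        if not row:
            continue  # skip blank lines
        values[row[0]] = [float(x) for x in row[1:]]
    return array_names, values


# A made-up two-array example standing in for a real results file.
sample = (
    "Probesets\tchipA.CEL\tchipB.CEL\n"
    "1007_s_at\t8.25\t8.5\n"
    "1053_at\t6.0\t6.125\n"
)
array_names, values = read_expression_table(io.StringIO(sample))
print(array_names, values)
```

For a real file, replace the `io.StringIO(sample)` argument with a handle from `open(path, newline="")`.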
|
Will I get the same results as I would using affy/Bioconductor?
Yes. As of version 1.2.13 of the affy package, both rma() and expresso() in affy/Bioconductor give the same RMA expression values. The results from RMAExpress should be consistent with these.
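If you want to check this on your own data, one simple approach is to compute the largest absolute difference between the two sets of values. The sketch below is only an illustration: the function name `max_abs_difference`, the tolerance and the numbers are hypothetical, and it assumes both tables list the arrays in the same order.

```python
def max_abs_difference(a, b):
    """Largest absolute difference over the probesets present in both tables.

    `a` and `b` map probeset identifiers to lists of expression values,
    with the arrays in the same order in both tables (an assumption).
    """
    shared = a.keys() & b.keys()
    if not shared:
        raise ValueError("the two tables have no probesets in common")
    return max(abs(x - y) for key in shared for x, y in zip(a[key], b[key]))


# Made-up values standing in for RMAExpress and affy results.
from_rmaexpress = {"1007_s_at": [8.25, 8.5], "1053_at": [6.0, 6.125]}
from_affy = {"1007_s_at": [8.25, 8.5009765625], "1053_at": [6.0, 6.125]}

worst = max_abs_difference(from_rmaexpress, from_affy)
print(worst)
```

A small tolerance (for example 0.001 on the log2 scale) is a reasonable threshold for deciding whether two runs agree.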
|
What are the machine requirements?
Exact requirements are not known at this time. A good rule of thumb is that the more RAM you have, the better. At this point the program has been tested on Windows 2000, Windows XP and Linux. Users have reported successfully processing 250 chips (this refers to versions prior to 0.4a3). With 0.4a7, one user has reported processing 650 arrays. If you have processed more arrays in one batch please let me know at bmb@bmbolstad.com.
Another user has reported processing 1772 Affy HG-U133 Plus 2 arrays in one batch using RMAExpressConsole version 0.5 on a Linux system.
|
Can I do any quality assessment?
Yes. Store the residuals when you compute the expression values. Then you may examine chip pseudo-images of the residuals. Note that high positive residuals are colored increasingly red and low negative residuals are colored increasingly blue.
To better interpret these images and get a feel for what is typical, you may visit the PLM Image Gallery, where images for a number of different datasets are shown.
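The red/blue convention can be sketched in Python as follows. This is not the color scale RMAExpress actually uses: the linear ramp, the white midpoint at zero, the clipping `limit` and the function name `residual_to_rgb` are all assumptions made for illustration.

```python
def residual_to_rgb(residual, limit=1.0):
    """Map a residual to an (R, G, B) tuple.

    Positive residuals become increasingly red and negative residuals
    increasingly blue. Zero maps to white, and residuals beyond +/- limit
    are clipped. The linear ramp and white midpoint are assumptions.
    """
    scaled = max(-1.0, min(1.0, residual / limit))
    # The other two channels fade from 255 (white) to 0 (saturated color).
    fade = int(round(255 * (1.0 - abs(scaled))))
    if scaled >= 0:
        return (255, fade, fade)
    return (fade, fade, 255)


no_residual = residual_to_rgb(0.0)
strong_positive = residual_to_rgb(1.0)
strong_negative = residual_to_rgb(-2.0)
print(no_residual, strong_positive, strong_negative)
```

Applying such a mapping to every probe position on the chip gives a pseudo-image in which spatial artifacts show up as red or blue patches.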
|
How do I download and install it?
Click here for the current release Windows version. Use the installer to install the program. The current release version number is 0.5 (released Feb 26, 2007). A pre-built Linux version is not currently available, but you may build it from the source code. You can download pre-release versions from the following table. Release versions will be more stable; development versions may have features that are incomplete or that will be removed or altered before the next release version.
| Version | Date |
| 0.1 Release | Jun 11, 2003 |
| 0.2 Release | Jan 11, 2004 |
| 0.3 Release | Dec 16, 2004 |
| 0.4 Release | Nov 10, 2005 |
| 0.4.1 Release | Jan 30, 2006 |
| 0.5 Release | Feb 26, 2007 |
| 1.0 beta 1 | Mar 25, 2007 |
| 1.0 beta 2 | Jun 17, 2007 |
| 1.0 beta 3 | Aug 24, 2007 |
| 1.0 beta 4 | Oct 28, 2007 |
| 1.0 beta 5 | Jan 20, 2008 |
| 1.0 beta 6 | Feb 2, 2008 |
| 1.0 beta 7 | Feb 16, 2008 |
| 1.0 beta 8 | Feb 29, 2008 |
| 1.0 beta 9 | Mar 10, 2008 |
| 1.0 beta 10 | Mar 20, 2008 |
|
Screenshots
|
Is there a mailing list?
Yes. You may subscribe by sending email to rmaexpress_help-request@freelists.org with 'subscribe' in the Subject field, or unsubscribe by sending email to the same address with 'unsubscribe' in the Subject field. Once you are a member you may send messages to the list by sending an email to rmaexpress_help@freelists.org. The archive is available here.
|
I just want expression estimates. What do I do?
In most cases you do not need to use the RMADataConv program. Just read your data directly into RMAExpress using the "Read Unprocessed files" option. It will ask you for the locations of your CDF and CEL files, then read the data. Next select "Compute RMA measure" to compute the expression measure. A dialog box will ask you to select specific options (just leave the defaults); press OK and it will compute the RMA expression summaries. Then choose "Write results to file" to output the expression summaries to a text file.
|
Is there a user manual or other documentation?
As of version 0.4 alpha 4 a Users Guide has been introduced. It is included in the installation. You may also download the latest release version of the RMAExpress Users Guide separately. The user manual for the current development version is here.
|
Can I get the source code?
Yes. The source code is available under the GNU General Public License Version 2. To understand the GPL read the GPL FAQ.
You can download the source code for the current release version, RMAExpress_0.5_src.tar.gz. Note that to build it you will need the wxWidgets library. The code should build on both Windows and Linux. I will not support pre-built binaries except those available from this website.
The source code for the current development version RMAExpress_1.0beta10_src.tar.gz is also available.
You can find instructions for building RMAExpress here. Note that most people will not want to compile the source code themselves, but instead just use the pre-built binaries.
|
Brief History
| Version | Date | Description |
| 0.1 beta 1 | Apr 25, 2003 | First Public version |
| 0.1 beta 2 | Apr 30, 2003 | Fixes/Optimizations to the CDF input routines |
| 0.1 beta 3 | May 20, 2003 | A few warning messages added. A small memory leak eliminated |
| 0.1 beta 4 | Jun 04, 2003 | A check that memory was properly allocated in normalization routine |
| 0.1 Release | Jun 11, 2003 | No changes from 0.1 beta 4, only a bump in version number |
| 0.2 alpha 1 | Jul 22, 2003 | A processed data format is introduced. This will speed up reloading data sets. |
| 0.2 alpha 2 | Aug 14, 2003 | You can add additional CEL files after you have already loaded some in |
| 0.2 alpha 3 | Sep 12, 2003 | A batch file convertor |
| 0.2 alpha 4 | Sep 18, 2003 | Fixes some problems with cdf filepaths (in convertor) on Windows |
| 0.2 alpha 5 | Oct 9, 2003 | Faster CEL file parser |
| 0.2 alpha 6 | Oct 19, 2003 | Preliminary support for the new binary cel file format. |
| 0.2 beta 1 | Oct 31, 2003 | Show menu. Low memory Overhead normalization step. |
| 0.2 beta 2 | Nov 16, 2003 | Critical fix for binary cel file support (previous versions will give incorrect results) |
| 0.2 Release | Jan 11, 2004 | No changes from 0.2 beta 2. Only bump in version number |
| 0.3 alpha 1 | Jan 27, 2004 | It is now possible to store and visualize RMA residuals. |
| 0.3 alpha 2 | Feb 29, 2004 | The RMA residual images may now be saved. |
| 0.3 alpha 3 | Jun 27, 2004 | RMAExpressConsole application introduced. |
| 0.3 alpha 4 | Jul 7, 2004 | Support for chips with PM only probesets |
| 0.3 alpha 5 | Oct 13, 2004 | Minor bug fixes, deals better with sense transcript arrays, output in either log2 scale (traditional) or natural scale |
| 0.3 alpha 6 | Oct 19, 2004 | Minor bug fixes |
| 0.3 beta 1 | Nov 9, 2004 | Fixes to deal with soybean chips |
| 0.3 Release | Dec 14, 2004 | No changes from 0.3 beta 1, Only a bump in version number |
| 0.4 alpha 1 | Feb 19, 2005 | Preliminary support for binary (xda) format cdf files. |
| 0.4 alpha 2 | Mar 25, 2005 | Fix a minor bug in background correction routine that on rare occasions causes slight difference in expression measures than those computed using R/BioConductor (usually difference is in 3rd decimal place). Some changes/additional progress bars. |
| 0.4 alpha 3 | Apr 1, 2005 | Experimental support for dealing with extremely large datasets (200 or more arrays) |
| 0.4 alpha 4 | Jun 5, 2005 | Max arrays in buffer now 150, Users Guide introduced |
| 0.4 alpha 5 | Jul 11, 2005 | Added the ability to do a "signs" image. Fixed the incorrect color labeling (red is positive, blue is negative). |
| 0.4 alpha 6 | Aug 23, 2005 | Fix console application so that filename for output is fully pathable. Bug fix for "Write process files" with PM only chips |
| 0.4 alpha 7 | Aug 30, 2005 | Fixes for console application. |
| 0.4 beta 1 | Oct 30, 2005 | Console application can now produce images. Fix signs images so unused areas show up as white. |
| 0.4 Release | Nov 10, 2005 | Preserve some user controllable options when application quits. |
| 0.4.1 Release | Jan 30, 2006 | Fixes for residual images dialog box for large chips. |
| 0.5 alpha 1 | Mar 30, 2006 | Preliminary experimental support for exon arrays. |
| 0.5 alpha 2 | Apr 4, 2006 | Improved support for exon arrays. An export function. |
| 0.5 alpha 3 | May 1, 2006 | Console application can produce binary output. |
| 0.5 alpha 4 | May 5, 2006 | Bug fixes for the console application |
| 0.5 alpha 5 | Aug 3, 2006 | Fix problem on Windows versions. The program would crash when large numbers of binary format cel files were trying to be read in. |
| 0.5 alpha 6 | Aug 31, 2006 | Fix problem with modal dialog not being closed and program locking up when mismatch in CDF filenames. |
| 0.5 alpha 7 | Sep 17, 2006 | Fix source code so compiles properly on Unicode builds of wxWidgets. Rebuild windows binary using standard procedure (0.5.6 was built using an alternative method). |
| 0.5 Release | Feb 26, 2007 | Minimal changes. Fixes some of the console output to the screen. |
| 1.0 beta 1 | Mar 25, 2007 | First version including the PLM based NUSE and RLE statistics |
| 1.0 beta 2 | Jun 17, 2007 | Improve plot placement when printed |
| 1.0 beta 3 | Aug 24, 2007 | Additional QC plots. Source code is now being built against wxWidgets 2.8.x |
| 1.0 beta 4 | Oct 28, 2007 | Support for reading AGCC format CEL files. Significant changes to CEL file parsing source code. Significant changes to source code to help improve portability |
| 1.0 beta 5 | Jan 20, 2008 | PGF/CLF file format parsing in the data convertor |
| 1.0 beta 6 | Feb 2, 2008 | Fix crash on reading non RME format cel files affecting XP, Windows 2000. Windows builds made using Visual C++ 2008 Express Edition |
| 1.0 beta 7 | Feb 16, 2008 | Add support for PS files to data convertor. Improved console application log to screen. Small speed improvements in background correction and quantile normalization. Minimum arrays in buffer has been reduced to 1 (previously was 5). |
| 1.0 beta 8 | Feb 29, 2008 | Fix BufferedMatrix indexing bug causing crash with extremely large datasets. |
| 1.0 beta 9 | Mar 10, 2008 | Better corruption checking of CEL files |
| 1.0 beta 10 | Mar 20, 2008 | Support for MPS files in data convertor |
|
Questions/Comments/Problems/Feature Requests?
Send me an email at bmb@bmbolstad.com.
|