RMAExpress

Written by Ben Bolstad
email bmb@bmbolstad.com

Special note

This website supersedes the previous RMAExpress webpage at UC Berkeley. New and future versions of RMAExpress will appear only here.

What is RMAExpress?

RMAExpress is a standalone GUI program for Windows (and Linux) that computes gene expression summary values for Affymetrix GeneChip® data using the Robust Multichip Average expression summary. It does not require R, nor does it depend on any component of the Bioconductor project.

What is RMA?

RMA is the Robust Multichip Average. It consists of three steps: a background adjustment, quantile normalization (see the Bolstad et al reference) and finally summarization. Some references (currently published) for the RMA methodology are:
Bolstad, B.M., Irizarry, R.A., Astrand, M., and Speed, T.P. (2003). A Comparison of Normalization Methods for High Density Oligonucleotide Array Data Based on Bias and Variance. Bioinformatics 19(2):185-193.
Irizarry, R.A., Bolstad, B.M., Collin, F., Cope, L.M., Hobbs, B., and Speed, T.P. (2003). Summaries of Affymetrix GeneChip probe level data. Nucleic Acids Research 31(4):e15.
Irizarry, R.A., Hobbs, B., Collin, F., Beazer-Barclay, Y.D., Antonellis, K.J., Scherf, U., and Speed, T.P. (2002). Exploration, Normalization, and Summaries of High Density Oligonucleotide Array Probe Level Data. Accepted for publication in Biostatistics.
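
As a rough illustration of the middle step, quantile normalization, here is a minimal sketch in plain Python. It is not the RMAExpress source code; the function name quantile_normalize and the list-of-lists layout (one inner list per array) are assumptions made for this example, and ties are broken by position rather than handled specially.

```python
def quantile_normalize(arrays):
    """Give every array the same empirical distribution of intensities.

    `arrays` is a list of equal-length lists, one per chip (a hypothetical
    layout chosen for this sketch).
    """
    n_arrays = len(arrays)
    n_probes = len(arrays[0])
    if any(len(a) != n_probes for a in arrays):
        raise ValueError("all arrays must have the same number of probes")

    # Sort each array independently.
    sorted_arrays = [sorted(a) for a in arrays]

    # Reference distribution: the mean across arrays at each rank.
    reference = [
        sum(a[rank] for a in sorted_arrays) / n_arrays
        for rank in range(n_probes)
    ]

    # Replace every intensity by the reference value at its rank.
    # Simplification: tied values get consecutive ranks in input order.
    normalized = []
    for a in arrays:
        order = sorted(range(n_probes), key=lambda i: a[i])
        out = [0.0] * n_probes
        for rank, index in enumerate(order):
            out[index] = reference[rank]
        normalized.append(out)
    return normalized


if __name__ == "__main__":
    # Made-up intensities for three small "arrays".
    chips = [[5.0, 2.0, 3.0, 4.0], [4.0, 1.0, 6.0, 2.0], [3.0, 4.0, 6.0, 8.0]]
    for row in quantile_normalize(chips):
        print(row)
```

After normalization every array contains the same set of values, and within each array the original ordering of the probes is preserved.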

What do I need?

You will need the appropriate CDF and CEL files for your dataset.

Can I use affy/Bioconductor instead?

Of course. In principle you will get the same results from both, provided you use consistent settings in affy/Bioconductor and RMAExpress. Some people prefer the power and flexibility of R; others like the point-and-click simplicity of a GUI. RMAExpress caters to the second group. Since RMAExpress writes the computed expression values to a text file, you may load the expression measures into R and use features of Bioconductor to analyze them. You can also open the results file in any other application that supports importing plain text files.
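
For readers who want to work with the text output outside R, here is a minimal Python sketch. The exact layout of the results file is not described on this page, so the sketch assumes a tab-delimited table with a header row of array names and one row per probeset; the function name read_expression_table and the sample values are made up for illustration.

```python
import csv
import io


def read_expression_table(handle, delimiter="\t"):
    """Read a delimited expression table into (array_names, values).

    Assumed layout (not confirmed by this page): first row is a header,
    first column is the probeset identifier, remaining columns are numeric.
    """
    reader = csv.reader(handle, delimiter=delimiter)
    header = next(reader)
    array_names = header[1:]
    values = {}
    for row in reader:
        if not row:
            continue  # skip blank lines
        values[row[0]] = [float(x) for x in row[1:]]
    return array_names, values


# Made-up sample standing in for a results file.
sample = (
    "Probesets\tchip1.CEL\tchip2.CEL\n"
    "1007_s_at\t10.25\t10.31\n"
    "1053_at\t7.80\t7.65\n"
)
array_names, values = read_expression_table(io.StringIO(sample))
print(array_names)
print(values["1053_at"])
```

With a real file, pass an open file handle (for example open(path, newline="")) instead of the in-memory sample, and adjust the delimiter if your file differs.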

Will I get the same results as I would using affy/Bioconductor?

Yes. As of version 1.2.13 (of the affy package), both rma() and expresso() in affy/Bioconductor will give the same values for RMA gene expressions. The results from RMAExpress should be consistent.

What are the machine requirements?

This information is unknown at this time. A good rule of thumb is the more RAM you have the better. At this point the program has been tested using Windows 2000, Windows XP and Linux. I have had successful user reports for processing 250 chips (this refers to versions prior to 0.4a3). Using 0.4a7 I have had one user report of 650 arrays. If you have processed more arrays in one batch please let me know at bmb@bmbolstad.com.

Another user has reported processing 1772 Affy HG-U133 Plus 2 arrays in one batch using RMAExpressConsole version 0.5 on a Linux system.

Can I do any quality assessment?

Yes. Store the residuals when you compute the expression values; you may then examine chip pseudo-images of the residuals. Note that high positive residuals are colored increasingly red and low negative residuals are colored increasingly blue.

To better interpret these images and gain a better feel for what is typical you may visit the PLM Image Gallery where images for a number of different datasets are shown.
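
The coloring convention described above (positive residuals toward red, negative residuals toward blue) can be sketched as a simple color map. This is only an illustration, not the mapping RMAExpress actually uses: the linear scaling, the white midpoint, the function name residual_to_rgb and the max_abs cutoff are all assumptions.

```python
def residual_to_rgb(residual, max_abs=1.0):
    """Map a residual to an (R, G, B) tuple with components in 0..255.

    Assumed scheme: zero is white, values at or beyond +max_abs are pure
    red, values at or beyond -max_abs are pure blue, linear in between.
    """
    if max_abs <= 0:
        raise ValueError("max_abs must be positive")
    strength = min(abs(residual) / max_abs, 1.0)
    fade = round(255 * (1.0 - strength))
    if residual > 0:
        return (255, fade, fade)   # increasingly red
    if residual < 0:
        return (fade, fade, 255)   # increasingly blue
    return (255, 255, 255)         # no residual: white


if __name__ == "__main__":
    for r in (-2.0, -0.25, 0.0, 0.25, 2.0):
        print(r, residual_to_rgb(r))
```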

How do I download and install it?

Click here for the current release Windows version. Use the installer to install the program. The current release version number is 0.5 (released Feb 26, 2007). A pre-built Linux version is not currently available, but you may build it from the source code. You can download pre-release versions from the following table. The release versions will be more stable; the development versions may have features that are incomplete or that will be removed or altered before the next release version.

Version        Date
0.1 Release    Jun 11, 2003
0.2 Release    Jan 11, 2004
0.3 Release    Dec 16, 2004
0.4 Release    Nov 10, 2005
0.4.1 Release  Jan 30, 2006
0.5 Release    Feb 26, 2007
1.0 beta 1     Mar 25, 2007
1.0 beta 2     Jun 17, 2007
1.0 beta 3     Aug 24, 2007
1.0 beta 4     Oct 28, 2007
1.0 beta 5     Jan 20, 2008
1.0 beta 6     Feb 2, 2008
1.0 beta 7     Feb 16, 2008
1.0 beta 8     Feb 29, 2008
1.0 beta 9     Mar 10, 2008
1.0 beta 10    Mar 20, 2008

Screenshots

Screenshot of RMAExpress

Is there a mailing list?

Yes. You may subscribe by sending email to rmaexpress_help-request@freelists.org with 'subscribe' in the Subject field, or unsubscribe by sending email to the same address with 'unsubscribe' in the Subject field. Once you are a member you may send messages to the list by sending an email to rmaexpress_help@freelists.org. The archive is available here.

I just want expression estimates. What do I do?

In most cases you do not need to use the RMADataConv program. Just read your data directly into RMAExpress using the "Read Unprocessed files" option. It will ask you for the locations of your CDF and CEL files and then read the data. Next, select "Compute RMA measure" to compute the expression measure. A dialog box will ask you to select specific options (just leave the defaults); press OK and it will compute RMA expression summaries. Then choose "Write results to file" to output the expression summaries to a text file.

Is there a user manual or other documentation?

As of version 0.4 alpha 4, a Users Guide is available. It is included in the installation. You may also download the latest release version of the RMAExpress Users Guide separately. The user manual for the current development version is here.

Can I get the source code?

Yes. The source code is available under the GNU General Public License, version 2. To understand the GPL, read the GPL FAQ.

You can download the source code for the current release version, RMAExpress_0.5_src.tar.gz. Note that to build it you will need the wxWidgets library. The code should build on both Windows and Linux. I will not support pre-built binaries except those available from this website.

The source code for the current development version RMAExpress_1.0beta10_src.tar.gz is also available.

You can find instructions for building RMAExpress here. Note that most people will not want to compile the source code themselves, but instead just use the pre-built binaries.

Brief History

Version        Date          Description
0.1 beta 1     Apr 25, 2003  First Public version
0.1 beta 2     Apr 30, 2003  Fixes/Optimizations to the CDF input routines
0.1 beta 3     May 20, 2003  A few warning messages added. A small memory leak eliminated
0.1 beta 4     Jun 04, 2003  A check that memory was properly allocated in normalization routine
0.1 Release    Jun 11, 2003  No changes from 0.1 beta 4, only a bump in version number
0.2 alpha 1    Jul 22, 2003  A processed data format is introduced. This will speed up reloading data sets.
0.2 alpha 2    Aug 14, 2003  You can add additional CEL files after you have already loaded some in
0.2 alpha 3    Sep 12, 2003  A batch file convertor
0.2 alpha 4    Sep 18, 2003  Fixes some problems with cdf filepaths (in convertor) on Windows
0.2 alpha 5    Oct 9, 2003   Faster CEL file parser
0.2 alpha 6    Oct 19, 2003  Preliminary support for the new binary cel file format.
0.2 beta 1     Oct 31, 2003  Show menu. Low memory Overhead normalization step.
0.2 beta 2     Nov 16, 2003  Critical fix for binary cel file support (previous versions will give incorrect results)
0.2 Release    Jan 11, 2004  No changes from 0.2 beta 2. Only bump in version number
0.3 alpha 1    Jan 27, 2004  It is now possible to store and visualize RMA residuals.
0.3 alpha 2    Feb 29, 2004  The RMA residual images may now be saved.
0.3 alpha 3    Jun 27, 2004  RMAExpressConsole application introduced.
0.3 alpha 4    Jul 7, 2004   Support for chips with PM only probesets
0.3 alpha 5    Oct 13, 2004  Minor bug fixes, deals better with sense transcript arrays, output in either log2 scale (traditional) or natural scale
0.3 alpha 6    Oct 19, 2004  Minor bug fixes
0.3 beta 1     Nov 9, 2004   Fixes to deal with soybean chips
0.3 Release    Dec 14, 2004  No changes from 0.3 beta 1, Only a bump in version number
0.4 alpha 1    Feb 19, 2005  Preliminary support for binary (xda) format cdf files.
0.4 alpha 2    Mar 25, 2005  Fix a minor bug in background correction routine that on rare occasions causes slight difference in expression measures than those computed using R/BioConductor (usually difference is in 3rd decimal place). Some changes/additional progress bars.
0.4 alpha 3    Apr 1, 2005   Experimental support for dealing with extremely large datasets (200 or more arrays)
0.4 alpha 4    Jun 5, 2005   Max arrays in buffer now 150, Users Guide introduced
0.4 alpha 5    Jul 11, 2005  Added the ability to do a "signs" image. Fixed the incorrect color labeling (red is positive, blue is negative).
0.4 alpha 6    Aug 23, 2005  Fix console application so that filename for output is fully pathable. Bug fix for "Write process files" with PM only chips
0.4 alpha 7    Aug 30, 2005  Fixes for console application.
0.4 beta 1     Oct 30, 2005  Console application can now produce images. Fix signs images so unused areas show up as white.
0.4 Release    Nov 10, 2005  Preserve some user controllable options when application quits.
0.4.1 Release  Jan 30, 2006  Fixes for residual images dialog box for large chips.
0.5 alpha 1    Mar 30, 2006  Preliminary experimental support for exon arrays.
0.5 alpha 2    Apr 4, 2006   Improved support for exon arrays. An export function.
0.5 alpha 3    May 1, 2006   Console application can produce binary output.
0.5 alpha 4    May 5, 2006   Bug fixes for the console application
0.5 alpha 5    Aug 3, 2006   Fix problem on Windows versions. The program would crash when large numbers of binary format cel files were trying to be read in.
0.5 alpha 6    Aug 31, 2006  Fix problem with modal dialog not being closed and program locking up when mismatch in CDF filenames.
0.5 alpha 7    Sep 17, 2006  Fix source code so compiles properly on Unicode builds of wxWidgets. Rebuild windows binary using standard procedure (0.5.6 was built using an alternative method).
0.5 Release    Feb 26, 2007  Minimal changes. Fixes some of the console output to the screen.
1.0 beta 1     Mar 25, 2007  First version including the PLM based NUSE and RLE statistics
1.0 beta 2     Jun 17, 2007  Improve plot placement when printed
1.0 beta 3     Aug 24, 2007  Additional QC plots. Source code is now being built against wxWidgets 2.8.x
1.0 beta 4     Oct 28, 2007  Support for reading AGCC format CEL files. Significant changes to CEL file parsing source code. Significant changes to source code to help improve portability
1.0 beta 5     Jan 20, 2008  PGF/CLF file format parsing in the data convertor
1.0 beta 6     Feb 2, 2008   Fix crash on reading non RME format cel files affecting XP, Windows 2000. Windows builds made using Visual C++ 2008 Express Edition
1.0 beta 7     Feb 16, 2008  Add support for PS files to data convertor. Improved console application log to screen. Small speed improvements in background correction and quantile normalization. Minimum arrays in buffer has been reduced to 1 (previously was 5).
1.0 beta 8     Feb 29, 2008  Fix BufferedMatrix indexing bug causing crash with extremely large datasets.
1.0 beta 9     Mar 10, 2008  Better corruption checking of CEL files
1.0 beta 10    Mar 20, 2008  Support for MPS files in data convertor

Questions/Comments/Problems/Feature Requests?

Send me an email at bmb@bmbolstad.com.