Daniele Merico - HowTo Directory
Affymetrix Microarray Analysis
Importing raw data and generating standard gene expression metrics (signals, calls, etc.)
Importing Affymetrix CEL files, calculating MAS5 calls and signals
CEL files are the almost-raw files generated by Affymetrix software after chip image processing;
the "fun" usually starts from the CEL files onwards; here are the simplest things you can do with CEL files.
Importing Affymetrix CEL files, bothering about the R exprSet object, calculating MAS5 calls and signals
If the experimental design is quite complex, or you are using a function that requires an expression set (exprSet),
then, sorry, but you probably need to read this part instead of the previous one.
Data exploration by dimensionality reduction techniques
- How to perform dimensionality reduction on a data matrix (e.g. an expression matrix)
Computing Differential Expression
2-class methods
These methods require a dichotomous classification of the samples (e.g. case vs control), and reproducibility of samples belonging to the same class.
PLGEM
Features:
- statistic used: corrected signal-to-noise, every gene treated as an independent entity; signal-to-noise is corrected according to an error model for the global estimation of variability;
- error model requires: linear relation between signal mean and standard deviation on a logarithmic scale (i.e. a power law);
- significance: estimated by randomly permuting the data (by column), and computing the statistic;
- recommended when: the number of replicates is uneven between case and control, with one of the two having very few, or just one replicate;
- proteomics: successfully applied to tandem mass-spec proteomics data, where the signal was generated as abundance-normalized peptide counts (NSAF)
- Pubmed.ID: 15606915 (main)
- Pubmed.ID: 18029349 (proteomic application)
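The error-model requirement listed for PLGEM can be illustrated with a small sketch. PLGEM stands for power-law global error model, so the fit below regresses ln(standard deviation) on ln(mean) across genes by ordinary least squares. This is not the code of the PLGEM package; the function name `fit_power_law`, the one-list-per-gene layout, and the toy replicate values are all made up for illustration.

```python
import math
import statistics

def fit_power_law(rows):
    """Fit ln(sd) = slope * ln(mean) + intercept across genes (least squares).

    rows: one list of replicate measurements per gene (hypothetical layout).
    """
    xs, ys = [], []
    for values in rows:
        mean = statistics.mean(values)
        sd = statistics.stdev(values)
        if mean > 0 and sd > 0:  # logarithms are undefined otherwise
            xs.append(math.log(mean))
            ys.append(math.log(sd))
    x_bar = statistics.mean(xs)
    y_bar = statistics.mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept

# Toy data (invented): each gene's replicates spread in proportion to its mean
toy = [[m * f for f in (0.9, 1.0, 1.1)] for m in (10.0, 100.0, 1000.0, 10000.0)]
slope, intercept = fit_power_law(toy)
print(slope, intercept)
```

If the points in the ln(mean) vs ln(sd) plot do not fall roughly on a straight line, the requirement is probably not met for that data set.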
SAM
Features:
- statistic used: corrected signal-to-noise, every gene treated as an independent entity;
- significance: estimated by randomly permuting the data (by column), and computing the statistic;
- recommended when: the number of replicates is 3 or more, and even between case and control;
- proteomics: unknown
- Pubmed.ID: 11309499 (main)
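Both methods above estimate significance by permuting the data by column and recomputing the statistic. A minimal sketch of that idea for a single gene follows. It uses the plain, uncorrected signal-to-noise ratio (difference of means over the sum of standard deviations), which is neither the PLGEM nor the SAM statistic, and it needs at least two replicates per class. The function names and the toy values are assumptions made for illustration.

```python
import random
import statistics

def signal_to_noise(case, control):
    """Plain signal-to-noise: difference of means over the sum of standard deviations."""
    return (statistics.mean(case) - statistics.mean(control)) / (
        statistics.stdev(case) + statistics.stdev(control)
    )

def permutation_p_value(case, control, n_perm=2000, seed=0):
    """Empirical two-sided p-value obtained by shuffling the sample (column) labels."""
    rng = random.Random(seed)
    observed = abs(signal_to_noise(case, control))
    pooled = list(case) + list(control)
    n_case = len(case)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # reassign samples to the two classes at random
        stat = abs(signal_to_noise(pooled[:n_case], pooled[n_case:]))
        if stat >= observed:
            hits += 1
    # add-one correction keeps the estimate away from exactly zero
    return (hits + 1) / (n_perm + 1)

# Toy expression values for one gene (invented)
case = [8.1, 8.4, 8.3, 8.6]
control = [5.0, 5.2, 4.9, 5.1]
p = permutation_p_value(case, control)
print(p)
```

With few replicates the number of distinct label assignments is small (70 for 4 vs 4), which limits how small the p-value of a single gene can get; this sketch ignores that and any multiple-testing correction.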
General Computational Techniques
Computational Techniques for multi-dimensional data:
A few tips on distances (especially for binary strings)
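As a quick illustration of why the choice of distance matters for binary strings, here is a sketch of two common options, Hamming and Jaccard. These are not necessarily the distances discussed in the linked tips; the function names and the example strings are assumptions.

```python
def hamming(a, b):
    """Number of positions where two equal-length binary strings differ."""
    if len(a) != len(b):
        raise ValueError("strings must have the same length")
    return sum(x != y for x, y in zip(a, b))

def jaccard_distance(a, b):
    """1 - (positions where both are 1) / (positions where at least one is 1)."""
    if len(a) != len(b):
        raise ValueError("strings must have the same length")
    both = sum(x == "1" and y == "1" for x, y in zip(a, b))
    either = sum(x == "1" or y == "1" for x, y in zip(a, b))
    # two all-zero strings share no 1s; treat them as identical here
    return 0.0 if either == 0 else 1.0 - both / either

u = "1100000000"
v = "1010000000"
print(hamming(u, v))
print(jaccard_distance(u, v))
```

Hamming treats shared 0s as agreement, so long sparse strings look close to each other; Jaccard ignores shared 0s and only looks at positions where at least one string has a 1.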
Tuning Visualization in R
My stuff:
For a more general reference:
A graphical description of the main graphical parameters for R graphs
and a broader how-to for R graphics
System: R & the Mac
Where is R installed in the Mac?
As a former Windows user, I spent an hour trying to answer the following question: what is the f. location of R executables on the Mac? (i.e. where the hell are R files installed?) (where, of course, "f." stands for funny). The answer is quite straightforward if, instead of wasting time looking for them all around your Mac, you just read the R Mac OS X FAQ, under the chapter "uninstalling R". On my system (Mac OS X 10.5.1), the funny location of R files is:
- Rgui:
- other R files: /Library/Frameworks/R.framework
- for arcane reasons, the R plugin for Eclipse requires as folder of R executables: /Library/Frameworks/R.framework/Versions/.../Resources (where "..." is the version currently in use)
The Eclipse Plug-in for Mac
- Eclipse can be used as a programming environment for R, and it can also be connected to Subversion (thus killing two birds with one stone)