Daniele Merico - HowTo Directory
Affymetrix Microarray Analysis
Importing raw data and generating standard gene expression metrics (signals, calls, etc.)
- [:DanieleMerico/HowtoDirectory/AffyCelCalSig: Importing Affymetrix CEL files, calculating MAS5 calls and signals]
  CEL files are the almost-raw files generated by Affymetrix software after chip image processing; the "fun" usually starts from the CEL files onwards. Here are the simplest things you can do with CEL files.
- [:DanieleMerico/HowtoDirectory/ExprSet: Importing Affymetrix CEL files, bothering about the R exprSet object, calculating MAS5 calls and signals]
  If the experimental design is quite complex, or you are using a function that requires an expression set (exprSet), then, sorry, but you probably need to read this part instead of the previous one.
Computing Differential Expression
- 2-class methods
  These methods require a dichotomous classification of the samples (e.g. case vs control), and reproducibility of samples belonging to the same class.
  - [:DanieleMerico/HowtoDirectory/PLGEM: PLGEM]
    Features:
    - statistic used: corrected signal-to-noise, every gene treated as an independent entity; signal-to-noise is corrected according to an error model for the global estimation of variability;
    - error model requires: linear relation between signal mean and standard deviation on a log-log scale (a power law);
    - significance: estimated by randomly permuting the data (by column) and recomputing the statistic;
    - recommended when: the number of replicates is uneven between case and control, with one of the two having very few replicates, or just one;
    - proteomics: successfully applied to tandem mass-spec proteomics data, where the signal was generated as abundance-normalized peptide counts (NSAF).
    References:
    - Pubmed.ID: 15606915 (main)
    - Pubmed.ID: 18029349 (proteomic application)
  - SAM
    Features:
    - statistic used: corrected signal-to-noise, every gene treated as an independent entity;
    - significance: estimated by randomly permuting the data (by column) and recomputing the statistic;
    - recommended when: the number of replicates is 3 or more, and even between case and control;
    - proteomics: unknown.
    References:
    - Pubmed.ID: 11309499 (main)
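
Both methods listed above estimate significance by permuting the data by column and recomputing a signal-to-noise statistic for every gene. As a rough illustration of that idea only, here is a minimal Python sketch using a plain, uncorrected signal-to-noise ratio. It does not reproduce the PLGEM error model or the SAM correction, and the names `signal_to_noise`, `permutation_p_values` and the small constant `s0` are assumptions made for this example.

```python
import random
import statistics


def signal_to_noise(case, control, s0=1e-6):
    """Difference of class means divided by the sum of class standard deviations.

    s0 is a hypothetical small constant that keeps the denominator above zero;
    it is NOT the correction used by PLGEM or SAM.
    """
    sd_case = statistics.stdev(case) if len(case) > 1 else 0.0
    sd_control = statistics.stdev(control) if len(control) > 1 else 0.0
    return (statistics.mean(case) - statistics.mean(control)) / (sd_case + sd_control + s0)


def permutation_p_values(matrix, is_case, n_perm=1000, s0=1e-6, seed=0):
    """Per-gene permutation p-values for the signal-to-noise statistic.

    matrix:  list of genes (rows), each a list of values, one per sample (column)
    is_case: list of booleans, one per column, True for case samples
    """
    rng = random.Random(seed)

    def stats_for(labels):
        # Each gene is treated as an independent entity.
        result = []
        for row in matrix:
            case = [v for v, flag in zip(row, labels) if flag]
            control = [v for v, flag in zip(row, labels) if not flag]
            result.append(signal_to_noise(case, control, s0))
        return result

    observed = stats_for(is_case)
    exceed = [0] * len(matrix)
    labels = list(is_case)
    for _ in range(n_perm):
        # Shuffling the class labels is equivalent to permuting the columns.
        rng.shuffle(labels)
        for i, value in enumerate(stats_for(labels)):
            if abs(value) >= abs(observed[i]):
                exceed[i] += 1
    # Add-one convention so that no p-value is exactly zero.
    p_values = [(count + 1) / (n_perm + 1) for count in exceed]
    return observed, p_values


if __name__ == "__main__":
    # Made-up values, for illustration only: 3 control columns, then 3 case columns.
    expression = [
        [10.1, 9.8, 10.3, 14.9, 15.2, 15.0],  # gene with a clear shift
        [7.0, 7.4, 6.9, 7.1, 7.3, 6.8],       # gene with no shift
    ]
    is_case = [False, False, False, True, True, True]
    observed, p_values = permutation_p_values(expression, is_case, n_perm=200)
    for stat, p in zip(observed, p_values):
        print(f"statistic={stat:.2f}  p={p:.3f}")
```

With only a few replicates per class the number of distinct column permutations is small, so the smallest attainable p-value is limited; this is one reason the recommendations above depend on the number of replicates.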
 
 
General Computational Techniques
Computational Techniques for multi-dimensional data:
- [:DanieleMerico/HowtoDirectory/Distances: A few tips on distances] (especially for binary strings)
Tuning Visualization in R
- [:DanieleMerico/HowtoDirectory/Boxplots: Hacks for boxplots tuning]
