collapse_ExpressionMatrix.py

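This tool can process a gene expression matrix (in GCT or TXT format) or a ranked list (RNK format) and either replace the identifiers based on a chip annotation file (e.g. AffyID -> Gene Symbol), or collapse the expression values or rank scores for genes that have more than one probe set.

These two steps can be run individually or both at the same time.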

Download

Requirements

Supported Operating Systems:

GUI Mode

collapse_ExpressionMatrix.py now has a Tk-based Graphical User Interface (GUI). To use the GUI, start the program without any arguments. This can be done:

After starting the GUI:

  1. Use the first Browse button to select an expression matrix or ranked gene list as the input file.
  2. Use the second Browse button to choose the name and location of the output file (the program suggests the name of the input file with "_collapsed" inserted before the extension).
  3. Tick the check boxes to choose whether the identifiers should be converted, the file should be collapsed, or both.
  4. If identifiers are to be converted, choose a matching chip file.
  5. Start the conversion by clicking the Run button.
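
The output name suggested in step 2 can be sketched in a few lines of Python. This is only an illustration of the naming rule described above; the function name suggest_output_name is hypothetical and is not taken from the program's source.

```python
import os


def suggest_output_name(input_path, suffix="_collapsed"):
    """Insert `suffix` between the base name and the extension.

    Hypothetical helper: it mirrors the naming rule described for the GUI,
    not the program's actual implementation.
    """
    root, ext = os.path.splitext(input_path)
    return root + suffix + ext


print(suggest_output_name("expression.gct"))   # expression_collapsed.gct
print(suggest_output_name("/data/ranks.rnk"))  # /data/ranks_collapsed.rnk
```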

Command Line Mode

If you are familiar with command line tools under Unix/Linux, collapse_ExpressionMatrix.py -h gives you all the information you need (if not, see below):

$ collapse_ExpressionMatrix.py -h
Usage: collapse_ExpressionMatrix_SVN.py [options] -i input.gct -o output.gct [-c platform.chip] [--collapse]

This tool can process a gene expression matrix (in GCT or TXT format) or
ranked list (RNK format) and either replace the Identifier based on a Chip
Annotation file (e.g. AffyID -> Gene Symbol), or collapse the expression
values or rank-scores for Genes from more than one probe set. Both can be done
in one step by using both '-c platform.chip' and '--collapse' at the same
time. If a ranked list is to be collapsed, an additional expression matrix can
be supplied by the -e/-x parameters and will be filtered to contain the same
probe-sets as selected from the RNK file. If however the file supplied by -i
is not recognized as a RNK file, these options have no effect.  For detailed
descriptions of the file formats, please refer to:
http://www.broadinstitute.org/cancer/software/gsea/wiki/index.php/Data_formats
Call without any parameters to select the files and options with a GUI
(Graphical User Interface)

Options:
  --version             show program's version number and exit
  -h, --help            show this help message and exit
  -i FILE, --input=FILE
                        input expression table or ranked list
  -o FILE, --output=FILE
                        output expression table or ranked list
  -c FILE, --chip=FILE  Chip File This implies that the Identifiers are to be
                        replaced.
  -e FILE, --ei=FILE    (optional) additional input Expression-table, to be
                        restricted to the same probe-sets as the RNK file
  -x FILE, --xo=FILE    (optional) corresponding output file for -i/--ei
                        option
  --collapse            Collapse multiple probe sets for the same gene symbol
                        (max_probe)
  --no-collapse         Don't collapse multiple probesets [default]
  --null                suppress Gene with Symbol NULL
  -g, --gui             Open a Window to choose the files and options.
  -q, --quiet           be quiet
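
To make the two operations more concrete, here is a minimal Python sketch of what identifier replacement and collapsing could look like for a ranked list. It rests on several assumptions: the function names, the in-memory data layout and the example identifiers are all hypothetical, unmapped identifiers are simply kept, and "max_probe" is interpreted as "keep the probe set with the highest score per gene symbol". The actual program may select probe sets differently (for example by absolute value).

```python
def replace_identifiers(ranked, chip):
    """Attach a gene symbol to every (probe, score) pair.

    `chip` maps probe-set IDs to gene symbols. Probes missing from the
    chip mapping keep their original ID (an assumption of this sketch).
    """
    return [(probe, chip.get(probe, probe), score) for probe, score in ranked]


def collapse_max_probe(annotated, drop_null=False):
    """Keep, for each gene symbol, the probe set with the highest score.

    With drop_null=True, entries whose symbol is "NULL" are skipped,
    loosely mirroring the --null option.
    """
    best = {}
    for probe, symbol, score in annotated:
        if drop_null and symbol == "NULL":
            continue
        if symbol not in best or score > best[symbol][1]:
            best[symbol] = (probe, score)
    return best


def filter_expression(matrix, selected_probes):
    """Restrict an expression matrix to the probe sets chosen above."""
    return {p: values for p, values in matrix.items() if p in selected_probes}


# Made-up example data; the IDs and symbols are placeholders.
chip = {"p1": "GENE_A", "p2": "GENE_A", "p3": "GENE_B", "p4": "NULL"}
ranked = [("p1", 1.5), ("p2", 2.5), ("p3", -0.7), ("p4", 0.1)]
matrix = {"p1": [1.0, 2.0], "p2": [3.0, 4.0], "p3": [5.0, 6.0], "p4": [7.0, 8.0]}

collapsed = collapse_max_probe(replace_identifiers(ranked, chip), drop_null=True)
selected = {probe for probe, _ in collapsed.values()}
filtered = filter_expression(matrix, selected)

print(collapsed)  # {'GENE_A': ('p2', 2.5), 'GENE_B': ('p3', -0.7)}
print(filtered)   # {'p2': [3.0, 4.0], 'p3': [5.0, 6.0]}
```

The last step corresponds to the idea behind the -e/-x options: once probe sets have been selected from the ranked list, an accompanying expression matrix is reduced to the same probe sets.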

On MacOS and Linux you need to make the program executable. Therefore:
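
One way to set the executable bit is sketched below in Python; it has roughly the effect of running chmod a+x on the file. The helper name make_executable is hypothetical, and the script path in the usage lines assumes the program sits in the current directory.

```python
import os
import stat


def make_executable(path):
    """Add execute permission for user, group and others to `path`."""
    mode = os.stat(path).st_mode
    os.chmod(path, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)


# Assumed location of the script; adjust to wherever it was saved.
script = "collapse_ExpressionMatrix.py"
if os.path.exists(script):
    make_executable(script)
```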

On Windows:

Software/EnrichmentMap/CollapseExpressionMatrix (last edited 2010-04-19 23:58:03 by OliverStueker)
