## Please edit system and help pages ONLY in the moinmaster wiki! For more
## information, please see MoinMaster:MoinPagesEditorGroup.
##master-page:Unknown-Page
##master-date:Unknown-Date
#acl DanieleMerico:admin,read,write,delete,revert All:read
#format wiki
#language en
== Network-based classification of breast cancer metastasis ==
=== General Idea ===
The traditional approach to studying transcriptional markers is based on comparing large cohorts of control and disease patients.
Although these studies often yield a list of discriminant markers, meta-analyses across different studies, especially in the case of cancer, have revealed critical shortcomings in reproducibility and cross-predictivity.
The partial failure of these studies (or their margin for improvement) is due to the inherent variability of the transcriptional states associated with complex diseases (e.g. cancer).
This paper proposes replacing single genes with subnetworks from the physical-interaction network. Shifting the focus from genes to subnetworks makes it possible to neutralize individual-gene variability and instead take into account the global activity of a whole functional unit. The subnetwork-based approach performs better than traditional list-of-genes approaches, and also outperforms approaches based on gene sets (derived from Gene Ontology or other sources). In addition, the retrieved subnetworks include genes that are not transcriptionally regulated yet are important for the disease; however, the authors acknowledge that their method's performance in retrieving such transcriptionally invisible modulators is still sub-optimal.
=== Techniques (Successfully) Adopted ===
* subnetworks are searched for with a greedy seeding approach: starting from a seed, the subnetwork is extended in the direction that increases the global scoring function
* the scoring function of a subnetwork is the mutual information between the subnetwork activity levels across the samples and the breast-metastasis/control classification
* the global subnetwork activity is the average of the z-score-normalized subnetwork-gene levels
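
The three steps above can be sketched in a few lines of Python. This is a minimal illustration and not the authors' implementation: the function names, the equal-width binning used to discretize the activity before computing mutual information, the default `n_bins`, and the `min_gain` stopping rule are all assumptions made for the example.

```python
import math
from collections import Counter


def zscore(values):
    """Z-score normalize one gene's expression levels across samples."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    if std == 0:
        return [0.0] * n  # a constant gene carries no signal
    return [(v - mean) / std for v in values]


def subnetwork_activity(expr, genes):
    """Per-sample average of the z-scored levels of the subnetwork genes.

    expr maps gene name -> list of expression levels (one per sample).
    """
    z_rows = [zscore(expr[g]) for g in genes]
    return [sum(col) / len(z_rows) for col in zip(*z_rows)]


def discretize(activity, n_bins):
    """Equal-width binning (an assumption; any discretization could be used)."""
    lo, hi = min(activity), max(activity)
    if hi == lo:
        return [0] * len(activity)
    width = (hi - lo) / n_bins
    return [min(int((a - lo) / width), n_bins - 1) for a in activity]


def mutual_information(xs, ys):
    """Mutual information (in bits) between two discrete sequences."""
    n = len(xs)
    pxy, px, py = Counter(zip(xs, ys)), Counter(xs), Counter(ys)
    return sum((c / n) * math.log2(c * n / (px[x] * py[y]))
               for (x, y), c in pxy.items())


def score(expr, genes, labels, n_bins=3):
    """Score a subnetwork: MI between its binned activity and the class labels."""
    return mutual_information(
        discretize(subnetwork_activity(expr, genes), n_bins), labels)


def greedy_subnetwork(seed, neighbors, expr, labels, n_bins=3, min_gain=1e-9):
    """Grow a subnetwork from a seed, adding the neighbor that most improves the score.

    neighbors maps gene name -> iterable of interacting genes.
    Stops when no neighbor improves the score by at least min_gain.
    """
    members = [seed]
    best = score(expr, members, labels, n_bins)
    while True:
        frontier = sorted(
            {n for g in members for n in neighbors.get(g, ())} - set(members))
        if not frontier:
            break
        new_score, gene = max(
            (score(expr, members + [g], labels, n_bins), g) for g in frontier)
        if new_score - best < min_gain:
            break
        members.append(gene)
        best = new_score
    return members, best
```

In this sketch `expr` and `neighbors` are plain dictionaries standing in for the expression matrix and the physical-interaction network; a real analysis would also need a significance test for the resulting scores, which is left out here.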
=== My Critical Review & Suggested Improvements ===
* using mutual information is a smart idea, as traditional statistical tests applied to dichotomous classifications assume reproducibility,
which does not hold for human patients (who often have several phenotypic stratifications)
* looking in the retrieved subnetworks for known cancer genes
that are relevant at a non-transcriptional level (e.g. reported as mutated in OMIM, or regulated at the post-transcriptional level)
is a smart cross-validation idea
* averaging the normalized levels of subnetwork genes leads to shortcomings, as it biases the search toward subnetworks whose genes share the same transcriptional trend (e.g. all up-regulated or all down-regulated),
whereas it is well accepted that functionally related genes may be either correlated or anti-correlated from a transcriptional standpoint
* it could be interesting to test a reverse approach
* define a metric of differential expression for genes across the two classes (without biasing toward reproducibility, as standard methods do)
* define different thresholds for gene selection
* check whether the enriched subnetworks change substantially, or whether they are basically extensions of the same seeds
* an even more extreme idea would be to do something analogous to genetic markers:
* instead of using a flat list of subnetworks (subnetwork activity = feature),
* exploit binary combinations of subnetworks (joint activity of two subnetworks = feature)<<BR>>
(e.g. when subnetwork A is in state X -AND- subnetwork B is in state Y, then the sample is disease)<<BR>>
(cf. [[DanieleMerico/MemorandaDirectory/MarkersAndContext| The search for Genetic Markers and the Context-dependence problem]])<<BR>>
this idea should be further explored and discussed
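
Two points from the list above can be illustrated with toy data: plain averaging cancels out anti-correlated genes, and the joint state of two subnetworks can be informative even when neither is informative alone. The sketch below is only an illustration under stated assumptions: the function names are hypothetical, the subnetwork states are assumed to be already discretized, and the data are made up for the example.

```python
import math
from collections import Counter
from itertools import combinations


def mean_activity(z_rows):
    """Per-sample average of z-scored gene levels (one row per gene)."""
    return [sum(col) / len(z_rows) for col in zip(*z_rows)]


def mutual_information(xs, ys):
    """Mutual information (in bits) between two discrete sequences."""
    n = len(xs)
    pxy, px, py = Counter(zip(xs, ys)), Counter(xs), Counter(ys)
    return sum((c / n) * math.log2(c * n / (px[x] * py[y]))
               for (x, y), c in pxy.items())


def rank_pairs(states, labels):
    """Score every pair of subnetworks by the MI of their joint state with the labels.

    states maps subnetwork name -> list of discrete states (one per sample).
    Returns (score, name_a, name_b) tuples, best first.
    """
    scored = []
    for a, b in combinations(sorted(states), 2):
        joint = list(zip(states[a], states[b]))  # joint state = pair of states
        scored.append((mutual_information(joint, labels), a, b))
    return sorted(scored, reverse=True)


# Anti-correlated genes: each separates the classes, their average does not.
up = [-1.0, -1.0, 1.0, 1.0]
down = [1.0, 1.0, -1.0, -1.0]
flat = mean_activity([up, down])  # every sample averages to zero

# Toy subnetwork states; the label depends on S1 and S2 only jointly.
states = {"S1": [0, 0, 1, 1], "S2": [0, 1, 0, 1], "S3": [0, 0, 0, 0]}
labels = [0, 1, 1, 0]
ranking = rank_pairs(states, labels)  # the (S1, S2) pair ranks first
```

Note that the number of candidate pairs grows quadratically with the number of subnetworks, so a real application would need some pre-filtering and a correction for multiple testing.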