Tutorial

Data

The data are derived from Swings et al. They consist of 16 populations evolved in parallel (identified in the file as HT1 through HT16) that were subjected to increasing ethanol concentrations. During adaptation the cells underwent several selective sweeps (i.e. sudden increases in fitness despite the higher ethanol concentrations). Populations were sequenced at two time points (right before and right after a selective sweep). The goal is to identify the mutations that cause this adaptation towards increased ethanol tolerance and are therefore responsible for the observed sweep. For each mutation, the mutation data file gives its functional impact score and its frequency increase from the first to the second time point. These values are used to assign a relevance score to each observed mutation.

Mutations that increase more in frequency during the sweep, or that have a higher functional impact, receive a higher relevance score and hence contribute more to the results. To indicate that we explicitly want to use the frequency increase and the functional impact scores, we switch on the corresponding sliders (see screenshot).
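As a rough illustration of how the two per-mutation values could be combined, the sketch below computes a toy relevance score in Python. The formula (a product of the two values, each clipped to the range 0 to 1), the function name `relevance_score`, the two flags that mimic the sliders, and the example numbers are all assumptions made for illustration. The tool's actual scoring function is not described in this tutorial and may differ.

```python
def relevance_score(functional_impact, freq_increase,
                    use_impact=True, use_freq=True):
    """Toy relevance score in [0, 1]. NOT the tool's actual formula.

    Assumes both inputs are already scaled to [0, 1]; values outside
    that range are clipped.
    """
    def clip(x):
        return max(0.0, min(1.0, x))

    score = 1.0
    if use_impact:   # mimics the "functional impact" slider
        score *= clip(functional_impact)
    if use_freq:     # mimics the "frequency increase" slider
        score *= clip(freq_increase)
    return score


# Hypothetical mutations with the same impact but different frequency increase.
strong = relevance_score(0.9, 0.8)
weak = relevance_score(0.9, 0.1)
print(strong > weak)  # True
```

With both flags switched off, every mutation gets the same score of 1.0 in this sketch, which corresponds to ignoring the additional information altogether.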

Important:

The "load example" button loads only a slice of the whole dataset, which makes the analysis faster and lets you tweak the parameters. The complete dataset can be downloaded from the example-data option in the side menu.

The experiment contains populations with a mutator phenotype, i.e. some populations have an aberrantly high mutation frequency. Because these populations carry so many mutations, they are less informative for pinpointing the correct driver mutations. We therefore want mutations occurring in such frequently mutated populations to have a lower impact on the analysis, and hence on the distinction between drivers and passengers. To downweight the impact of mutated genes occurring in mutator populations, we switch on the slider "additional hub correction".
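The idea of downweighting mutator populations can be sketched as follows. This is a minimal illustration, assuming a simple weight of one divided by the number of mutations in a population; the function names, gene names, and weighting scheme are hypothetical and do not reproduce the correction the tool actually applies.

```python
from collections import Counter


def population_weights(mutations):
    """Map each population to 1 / (its number of mutations).

    `mutations` is a list of (population_id, gene) pairs.
    Assumed weighting scheme, for illustration only.
    """
    counts = Counter(pop for pop, _ in mutations)
    return {pop: 1.0 / n for pop, n in counts.items()}


def gene_scores(mutations):
    """Sum the population weights of all mutations hitting each gene."""
    weights = population_weights(mutations)
    scores = {}
    for pop, gene in mutations:
        scores[gene] = scores.get(gene, 0.0) + weights[pop]
    return scores


# Hypothetical data: HT1 carries 2 mutations, HT2 is a mutator with 50.
mutations = [("HT1", "geneA"), ("HT1", "geneB")]
mutations += [("HT2", f"gene{i}") for i in range(50)]

scores = gene_scores(mutations)
print(scores["geneA"] > scores["gene0"])  # True
```

In this sketch a gene mutated in the sparsely mutated population HT1 contributes far more than any single gene from the mutator population HT2, which is the behaviour the correction is meant to achieve.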

Important:

Note that when you choose to use the additional information (frequency increase, functional impact scores), this information must be provided in the input file. Otherwise the analysis will be performed with default parameters and an error will be reported on the processing page.
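Before uploading, it can help to check locally that the file contains the columns you intend to use. The sketch below does this for a tab-delimited file with a header row. The delimiter, the column names (`population`, `gene`, `functional_impact`, `freq_increase`), and the function name `missing_columns` are assumptions; adapt them to the actual layout of your input file.

```python
import csv
import io


def missing_columns(file_obj, required, delimiter="\t"):
    """Return the required column names that are absent from the header row."""
    reader = csv.reader(file_obj, delimiter=delimiter)
    header = next(reader, [])  # empty list if the file has no rows
    return [col for col in required if col not in header]


# Hypothetical file content lacking the frequency-increase column.
example = io.StringIO("population\tgene\tfunctional_impact\nHT1\tgeneA\t0.9\n")
required = ["population", "gene", "functional_impact", "freq_increase"]

print(missing_columns(example, required))  # ['freq_increase']
```

For a file on disk, pass an open file handle instead of the `io.StringIO` object, for example `open(path, newline="")`.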

Data Processing

After uploading the files, the processing starts and the progress can be followed in real-time. The top of the page indicates the parameters used.

Additionally, you can follow the progress of the analysis in the window located below. When the analysis is complete, you will have the option to visualize the inferred subnetwork or to download the results.

Generating the subnetworks can take a while. If the process does not finish, disable the ‘correction for mutator phenotypes’: the correction might be too stringent, preventing the algorithm from finding an optimum. You can also try higher cost parameters, which make the search more stringent and impose more constraints on finding an optimum.