Understanding the Results Folder

Important: The results of IAMBEE can be downloaded or sent by email:
  1. When you submit your data, a field for an email address appears. If you fill in this field, a link to download the results is sent automatically when the analysis is complete.
  2. When the analysis is complete, an option appears to download the complete results folder or to send it to an email address of your choice.

Folder Structure

The results folder has the following structure.

For users, the most important folder is the ‘resulting_networks’ folder (see below). To view the visualization HTML file in this folder, extract the folder before opening the file.

The opt folder

This folder contains all the paths and networks resulting from the analysis with the selected parameters.

Path Files

This group of files (HT1_paths, HT2_paths, etc.) contains the results of the path-finding step for each population (condition). The paths define the set of edges in the topology-weighted interaction network that connect mutated genes from one population to any of the mutated genes occurring in the other populations.

Each path is assigned a weight reflecting the degree of belief that the path contributes to the adaptive phenotype. The path weight is derived from the network topology and the relevance scores of the start and end nodes. For more information on the path weights, see the extended help file.

The file paths.txt collects all paths detected in all populations.

In the example below, the path runs from b1341 (start), which occurs in HT1, to b1296 (end). The path is defined as a set of connected edges, each with its own weight. The weight of this path is 0.2641678655493167.

>path(b1341,all,HT1)
b1341 edge_pp(b1341,b2370,Directed)(0.9444507398423881)
b2370 b2370 edge_pp(b2370,b2219,Directed)(0.8892726820276319)
b2219 b2219 edge_pp(b2219,b2220,Directed)(0.9046505351008906)
b2220 b2220 edge_pp(b2220,b1296,Directed)(0.9180894228446748)
b1296 (0.2641678655493167) Downstream
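
A path block like the one above can be read programmatically. The following is a minimal sketch in Python. It assumes that every block follows exactly the layout of this single example: a ">path(...)" header, one "edge_pp(source,target,direction)(weight)" entry per edge, and a closing line with the end node and the path weight. The function name parse_path_block and the dictionary keys are hypothetical and not part of IAMBEE; real path files may deviate from this layout.

```python
import re

# Patterns inferred from the single example block above (an assumption).
HEADER_RE = re.compile(r"^>path\(([^,]+),([^,]+),([^)]+)\)")
EDGE_RE = re.compile(r"edge_pp\(([^,]+),([^,]+),([^)]+)\)\(([0-9.eE+-]+)\)")
END_RE = re.compile(r"^(\S+)\s+\(([0-9.eE+-]+)\)\s*(\S*)")

EXAMPLE = """\
>path(b1341,all,HT1)
b1341 edge_pp(b1341,b2370,Directed)(0.9444507398423881)
b2370 b2370 edge_pp(b2370,b2219,Directed)(0.8892726820276319)
b2219 b2219 edge_pp(b2219,b2220,Directed)(0.9046505351008906)
b2220 b2220 edge_pp(b2220,b1296,Directed)(0.9180894228446748)
b1296 (0.2641678655493167) Downstream
"""


def parse_path_block(text):
    """Parse one path block into a dictionary (hypothetical helper)."""
    lines = [ln.strip() for ln in text.strip().splitlines() if ln.strip()]
    header = HEADER_RE.match(lines[0])
    if header is None:
        raise ValueError("block does not start with a >path(...) header")
    start, second_field, population = header.groups()

    edges = []
    end_node, path_weight, label = None, None, ""
    for line in lines[1:]:
        edge = EDGE_RE.search(line)
        if edge:
            source, target, kind, weight = edge.groups()
            edges.append((source, target, kind, float(weight)))
            continue
        end = END_RE.match(line)
        if end:
            end_node = end.group(1)
            path_weight = float(end.group(2))
            label = end.group(3)

    return {
        "start": start,
        "second_field": second_field,  # kept verbatim; its meaning is not assumed here
        "population": population,
        "edges": edges,
        "end": end_node,
        "path_weight": path_weight,
        "label": label,
    }


path = parse_path_block(EXAMPLE)
print(path["start"], "->", path["end"], "weight:", path["path_weight"])
for source, target, kind, weight in path["edges"]:
    print(f"  {source} -> {target} ({kind}, {weight})")
```

The path weight is read from the closing line and is not recomputed from the edge weights, because the help text states that it also depends on the relevance scores of the start and end nodes.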

Edge Cost Folders

The optimization strategy, or sub-network inference step, takes the paths found in the path-finding step as input and searches for a sub-network that connects as many mutations as possible from different populations using the fewest edges. The latter constraint is imposed by the cost parameter.

By default, a sweep is performed over the cost parameter. For each tested cost parameter, the stochastic algorithm is run multiple times (as set by the number of repeats parameter). Hence, for each cost parameter there is a separate folder that contains, for each repeated run of the algorithm, a network file with the selected nodes and edges (sub-network).

For each cost parameter, the network with the highest score is selected (best.result.network). Each cost parameter folder also contains a results_summary folder with an overview of the summary statistics of the networks obtained in the repeated runs at that cost (‘networkMetrics’) and a comparison of these networks in terms of score, genes and edges (‘solution stability’).

Resulting Networks Folder

The folder ‘resulting_networks’ contains the best network obtained during the sweep (‘highestScoringSubnetwork’). A file with the merged network is also given (resultingSubnetwork.weightednetwork). This file merges the subnetworks obtained at the different edge costs and assigns each edge a weight indicating the cost at which that edge was recovered.

The folder also contains the file rankedMutations.txt, in which nodes are prioritized based on the maximum cost at which they were first recovered. Low numbers indicate high ranks (more reliable nodes).

rankedMutations.txt is a tab-delimited file in which each gene is followed by its rank. The number of ranks corresponds to the number of times the cost parameter was varied during the sweep, so the smaller the steps during the sweep, the higher the resolution of the ranking. Note that the folder for a particular cost is sometimes empty, meaning that no network could be selected at that cost (the cost was too stringent).
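
Because rankedMutations.txt is a plain tab-delimited file, it can be loaded with a few lines of Python. The sketch below assumes two columns per line (gene, then a numeric rank); whether the file has a header line is not documented here, so lines without a numeric rank are skipped. The function name read_ranked_mutations and the sample ranks are made up for illustration and are not IAMBEE output.

```python
import csv
import io


def read_ranked_mutations(handle):
    """Return (gene, rank) pairs sorted from lowest to highest rank number."""
    rows = []
    for record in csv.reader(handle, delimiter="\t"):
        if len(record) < 2:
            continue  # blank or incomplete line
        gene, rank_text = record[0].strip(), record[1].strip()
        try:
            rank = float(rank_text)
        except ValueError:
            continue  # non-numeric rank, e.g. a header line if one is present
        rows.append((gene, rank))
    # Low numbers indicate high ranks, so sort ascending.
    return sorted(rows, key=lambda pair: pair[1])


# Illustrative data only: gene IDs borrowed from the path example, ranks invented.
sample = io.StringIO("b2370\t3\nb1341\t1\nb1296\t2\n")
for gene, rank in read_ranked_mutations(sample):
    print(gene, rank)
```

For a real results folder, pass an open file instead of the in-memory sample, for example open("rankedMutations.txt", newline="").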

In the sub-folder d3js_visualization, networks are provided in a format that can be viewed as HTML. The visualization opens in any web browser and does not require an internet connection. Hold CTRL (or CMD) and scroll with the mouse to zoom in and out; if you do not see a network at first, try zooming out.

Other formats (.sif, .xgmml) that allow offline visualization in other platforms such as Cytoscape are also provided.
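
The .sif file can also be inspected without Cytoscape. The sketch below follows the general Simple Interaction Format convention (source node, interaction type, one or more target nodes, separated by tabs or by whitespace); it has not been checked against an actual IAMBEE export. The function name read_sif and the interaction label "pp" in the sample are assumptions used for illustration only.

```python
def read_sif(lines):
    """Return a list of (source, interaction, target) edges from SIF lines."""
    edges = []
    for line in lines:
        line = line.rstrip("\r\n")
        if not line.strip():
            continue
        # SIF uses tabs as the delimiter if any are present, otherwise whitespace.
        fields = line.split("\t") if "\t" in line else line.split()
        if len(fields) < 3:
            continue  # a line with only a node name declares an isolated node
        source, interaction = fields[0], fields[1]
        for target in fields[2:]:
            edges.append((source, interaction, target))
    return edges


# Illustrative lines only, not taken from a real results folder.
sample = ["b1341\tpp\tb2370\n", "b2370 pp b2219 b2220\n", "b1296\n"]
for edge in read_sif(sample):
    print(edge)
```

For a real file, pass an open file handle, since iterating over it yields one line at a time.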