MotifRanking Guidelines

Go to Evaluation of the output if you need quick help on running MotifRanking optimally.
If you have not used MotifRanking before, go through the following steps:
    Introduction
    MotifRanking Algorithm
    Output to the user
    Evaluation of the output
Optionally, also read the detailed discussion of MotifRanking's performance in a case study.
    Case study: The value of MotifRanking's metrics demonstrated on real benchmark datasets.



Introduction

MotifRanking is designed to extract the most likely true motif(s) from a list of multiple motif detection solutions in PWM format. MotifRanking can be used on any list of motifs (PWMs), irrespective of the tool that was used to predict these motifs. MotifRanking prioritizes motifs by their motif score and by their motif detection frequency, i.e. the number of times each motif occurs in the list of predicted motifs (count).

In what follows, we give a comprehensive description of MotifRanking and provide guidelines on selecting program parameter settings and evaluating the output. If you want to use this application, go back to the applications webpage or set up the MotifRanking command line in case of standalone use. Note: all program parameters described below have default settings in MotifRanking unless stated otherwise.



MotifRanking Algorithm

The input for MotifRanking is a file (parameter -i) containing a list of detected motifs in PWM format (reported by MotifSampler or any other motif finder). MotifRanking computes two independent metrics to prioritize the detected motifs: the motif score and the motif detection count.

The motif score is used to sort the detected motifs in descending order. With MotifRanking's default parameter settings, the motif score is the '#Score =' value given in the PWM format of the supplied input file. If the input file was generated with MotifSampler, '#Score =' is the Log-Likelihood score (LL), which we found most suitable for extracting the most likely true motif because it balances information on the conservation and the total number of instances of the motif (Fig.1).

Fig.1 : Formula of Log-Likelihood score (we refer to MotifSampler Guidelines - Algorithm for symbols).

Alternatively, you can provide your own motif score in the #Score field, or MotifRanking can internally compute another score from the motif's PWM description (Fig.2): the Consensus score (CS) or the Information Content score (IC). CS measures the conservation of the motif and can be used in absolute terms: a perfectly conserved motif has a score of 2, while a motif with a uniform nucleotide distribution has a score of 0. IC takes into account not only how well the motif is conserved, but also how much it differs from the single nucleotide background distribution. To calculate IC, you need to supply a genome-specific background model Bm (format), from which the algorithm uses the zero-order single nucleotide frequencies (#snf). The score is maximal (equal to 2) if the motif is well conserved and differs considerably from the background distribution.

Fig.2 : Formula for Consensus Score (CS) and Information Content (IC) score.
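
Because the formulas of Fig.2 are not reproduced in this text, the following Python sketch uses commonly used definitions that match the limits stated above (CS = 2 for a perfectly conserved motif, CS = 0 for a uniform one). The exact formulas, the function names and the PWM layout (a list of columns in A, C, G, T order) are assumptions for illustration, not the MotifRanking implementation.

```python
import math

def consensus_score(pwm):
    """Assumed definition: CS = 2 + (1/W) * sum_j sum_b p_bj * log2(p_bj)."""
    total = 0.0
    for column in pwm:
        for p in column:
            if p > 0:  # treat 0 * log2(0) as 0
                total += p * math.log2(p)
    return 2.0 + total / len(pwm)

def information_content(pwm, snf):
    """Assumed definition: IC = (1/W) * sum_j sum_b p_bj * log2(p_bj / snf_b)."""
    total = 0.0
    for column in pwm:
        for p, background in zip(column, snf):
            if p > 0:
                total += p * math.log2(p / background)
    return total / len(pwm)

# Toy PWMs of width 2 (hypothetical data).
perfect = [[1.0, 0.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]]
uniform = [[0.25, 0.25, 0.25, 0.25], [0.25, 0.25, 0.25, 0.25]]
uniform_snf = [0.25, 0.25, 0.25, 0.25]

print(consensus_score(perfect))                   # 2.0
print(consensus_score(uniform))                   # 0.0
print(information_content(perfect, uniform_snf))  # 2.0
```

Under these assumed definitions, IC reduces to CS when the background frequencies are uniform, which is why the two scores share the same maximum of 2 in that case.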

The motif with the highest motif score in the sorted list of detected motifs is compared with each lower-ranked motif in this list using a matrix comparison strategy. Two motifs are judged similar if the Kullback-Leibler (KL) distance between their PWMs is below a given similarity threshold (parameter -t). For more details on KL and the threshold, we refer to MotifComparison Guidelines. As different alignments between two PWMs may give a different KL distance, all possible alignments (Fig.3) that respect a minimally required overlap (parameter -x) and a maximally allowed shift (parameter -s) are evaluated, and the best alignment (i.e. the one with the lowest KL distance) is retained to judge the similarity of the PWMs at hand.

Fig.3 (extracted from MotifComparison Guidelines) : There are multiple possible alignments when comparing the similarity between two PWMs.
The rectangles represent (green:) a particular motif (of length w) for which the motif detection count is being computed and
(blue:) a motif (with equal or lower motif score) that was not yet found similar to a detected motif,
visualized in 3 shifted alignments : maximal shift (s) to the left, zero shift, and maximal shift to the right.
The grey area visualizes the overlap (x) between the query and database motif in a particular alignment.

In default mode, -s allows a shift of only 1 nucleotide and -x requires an overlap of at least 6 nucleotides. The two parameters -s and -x are introduced mainly to handle comparisons between motifs of different length: allowing sufficient shifts accommodates the comparison of longer motifs (for which many shifts might be needed to find the optimal alignment), while the minimal overlap constrains the allowed shifts for shorter motifs. When all detected motifs have the same length, only one parameter is needed to restrict the admitted alignments, e.g. the minimum overlap constraint (-x) is typically set to half of the (fixed) motif length and the maximal shift (-s) to an unrestrictive high value (e.g. equal to the motif length).
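
The alignment search described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the distance is taken to be a symmetric KL distance averaged over the overlapping columns (the exact formula is given in MotifComparison Guidelines and may differ), PWMs are lists of columns without zero entries, and the function names are hypothetical.

```python
import math

def column_kl(p, q):
    """Symmetric KL distance between two columns (assumed form; no zero entries)."""
    return 0.5 * sum(a * math.log(a / b) + b * math.log(b / a) for a, b in zip(p, q))

def best_alignment(pwm_a, pwm_b, max_shift=1, min_overlap=6):
    """Return (distance, shift) of the best admissible alignment, or None if there is none."""
    best = None
    for shift in range(-max_shift, max_shift + 1):
        # Column i of pwm_a is paired with column i - shift of pwm_b.
        start = max(0, shift)
        end = min(len(pwm_a), len(pwm_b) + shift)
        overlap = end - start
        if overlap < min_overlap:
            continue  # alignment violates the minimal overlap (-x)
        distance = sum(column_kl(pwm_a[i], pwm_b[i - shift]) for i in range(start, end)) / overlap
        if best is None or distance < best[0]:
            best = (distance, shift)
    return best

def strong(base_index):
    """Hypothetical column with most of its weight on one nucleotide."""
    return [0.97 if b == base_index else 0.01 for b in range(4)]

pwm_a = [strong(i % 4) for i in range(8)]
pwm_b = pwm_a[1:] + [strong(0)]  # same pattern, displaced by one position

print(best_alignment(pwm_a, pwm_b, max_shift=1, min_overlap=6))  # (0.0, 1)
print(best_alignment(pwm_a, pwm_b, max_shift=1, min_overlap=9))  # None
```

Two motifs would then be judged similar when an admissible alignment exists and its distance is below the threshold set by -t. The second call shows how an overlap requirement that no shift can satisfy leaves no admissible alignment at all.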

All motifs that are found similar to the highest scoring motif are grouped and correspond to the same particular motif, which is represented by the motif with the highest motif score in this group. The number of similar motifs in the group equals the motif detection count (N) of the respective motif and is the number of times this particular motif (PWM) has been detected by MotifSampler. Finally, all motifs of the group are removed from the original list of sorted motifs, leaving a reduced list of unassessed motifs from which the next motif is extracted.

Fig.4 : MotifRanking sorts motifs by their motif score and removes redundancy by grouping similar motifs that represent the same motif.
The green square represents the PWM description of a motif detected in one MotifSampler run.

The selection of the highest scoring motif, the grouping of similar motifs and the removal of this group from the list of sorted motifs are repeated until every motif has been assigned to a group of similar motifs. The output of MotifRanking consists of the selected representative of each group, sorted by motif score, together with the size (N) of the group it represents. The return ratio (RR) is a measure for the significance of a detected motif and is obtained by dividing the count N by the total number of motifs present in the input file. Note that the denominator in RR does not necessarily equal the number of initiated motif detection runs (set by MotifSampler's parameter -r), as some runs in MotifSampler may be aborted and report no motif.
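
The ranking loop described in this section can be sketched in Python as below. It is a simplified illustration, not the MotifRanking source: the dictionary layout, the function name and the `is_similar` callback are assumptions, and the toy example replaces the KL-based PWM comparison with a plain label comparison.

```python
def rank_motifs(motifs, is_similar):
    """Greedily group motifs around the highest scoring unassigned motif.

    motifs: list of dicts with keys "id", "score" and "pwm" (assumed layout).
    is_similar: callable deciding whether two PWMs represent the same motif.
    """
    remaining = sorted(motifs, key=lambda m: m["score"], reverse=True)
    total = len(remaining)  # denominator of the return ratio
    ranked = []
    while remaining:
        representative = remaining[0]
        group, rest = [representative], []
        for motif in remaining[1:]:
            if is_similar(representative["pwm"], motif["pwm"]):
                group.append(motif)
            else:
                rest.append(motif)
        ranked.append({
            "id": representative["id"],
            "score": representative["score"],
            "count": len(group),                 # N
            "return_ratio": len(group) / total,  # RR
            "members": [m["id"] for m in group],
        })
        remaining = rest  # the whole group leaves the list
    return ranked

# Toy input: the "pwm" field is just a label standing in for a real PWM.
toy = [
    {"id": "m1", "score": 120.0, "pwm": "A"},
    {"id": "m2", "score": 95.0, "pwm": "B"},
    {"id": "m3", "score": 110.0, "pwm": "A"},
    {"id": "m4", "score": 90.0, "pwm": "A"},
    {"id": "m5", "score": 60.0, "pwm": "B"},
]
for entry in rank_motifs(toy, lambda a, b: a == b):
    print(entry["id"], entry["count"], entry["return_ratio"])
# m1 3 0.6
# m2 2 0.4
```

Note that every motif is compared only with the current representative, so group membership depends on the score ordering as well as on the similarity threshold.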



Output to the user

The distinct motifs retrieved (at most 5 by default, set by parameter -r) are reported in descending motif score order.
MotifRanking reports each motif in two files:

1) a text file (-O) that reports the distinct motifs by their motif identifier, motif score, motif detection count (in the field '=> Total') and return ratio RR. The file also lists the identifiers of all similar motif predictions. (EvgA example)

2) a matrix file (-o) that reports the non-redundant sorted motifs in PWM format. (EvgA example)
MotifRanking copies the PWMs of the distinct motifs from the supplied input file into this output file. The values in the PWMs in this output file will nevertheless differ slightly from the values in the original PWMs, because a small pseudocount (0.0001) is always applied when a PWM is loaded by a MotifSuite application (to avoid zero values in a PWM that may confound PWM comparisons or the computation of PWM scores).
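
To illustrate why a pseudocount changes the reported values, the Python sketch below adds 0.0001 to every entry of a PWM column and renormalizes the column. How MotifSuite applies the pseudocount internally is not described here, so the add-then-renormalize scheme and the function name are assumptions.

```python
PSEUDOCOUNT = 0.0001  # value stated in the text above

def apply_pseudocount(column, pseudocount=PSEUDOCOUNT):
    """Add the pseudocount to each entry and renormalize (assumed scheme)."""
    padded = [p + pseudocount for p in column]
    total = sum(padded)
    return [p / total for p in padded]

original = [1.0, 0.0, 0.0, 0.0]  # hypothetical perfectly conserved column
loaded = apply_pseudocount(original)
print(loaded)  # no entry is zero any more; the first entry is slightly below 1
```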



Evaluation of the output

- At all times, make sure you have reasonable entries for both the shift (-s) and overlap (-x) parameters, as these may significantly influence the computation of the motif detection count, as demonstrated in Fig.5.

Fig.5 : Example of increasing the minimal required overlap (-x) without adjusting the maximal allowed shift (-s).

- The fraction of times the same motif was detected by a stochastic motif finder (return ratio RR) is a measure for the significance of a motif: motifs with a high RR are statistically more relevant and are less likely to have been detected by chance. From our experience, highly significant motifs have a return ratio above 50%, and motifs with a return ratio below 10% are likely to be spurious. A high RR also means that the motif signal is strong relative to the noise in the dataset, which lowers the chance that the motif is obscured by false instances that do not truly belong to it. So the higher the RR, the better this motif (PWM and instances representation) describes the biologically true motif signal in the dataset.

- An assessment of significance is only possible if the number of motifs supplied to MotifRanking is sufficiently high (compare it to testing whether a die is unbiased by throwing it a hundred times: if the die falls off the table 90 times, you cannot draw conclusions from the 10 remaining outcomes). With default parameter settings in MotifSampler, 100 initiated motif detection runs in theory predict 100 motifs, and when these are analyzed with MotifRanking, the return ratio RR expressed as a percentage equals the absolute count N of a detected motif (e.g. RR = 20% and N = 20 on 100 motifs). If many runs in MotifSampler were aborted (e.g. see the impact of MotifSampler's parameter -p on run abortion), the percentage RR can be considerably higher than the absolute count N, giving a falsely high confidence in this motif (for example, when 40 non-aborted runs report the same motif N = 20 times, RR = 50%). Based on our experience, a motif should be detected at least 10 times (absolute count N) before it is considered a significant solution, and at least 30 times to be on the safe side.

- The motif score is used in a relative manner to sort the motif predictions by their likelihood of representing a biologically true motif. Note that the absolute value of the Log-Likelihood score, which is used by default, depends on the properties of both the motif and the dataset and cannot be compared between different datasets.

- When evaluating the importance of multiple motifs reported in the same output file of MotifRanking, the motif score and the motif detection count (or the return ratio RR, which differs from N only by a denominator that is the same for every motif) should be used in a complementary way. Motifs with a comparable count and score are equally valid. However, a high scoring motif with a low (but still sufficiently high) count can also be considered as important as a lower scoring motif with a higher count.

- In case of doubt about whether different motifs in the same output file of MotifRanking are truly dissimilar (e.g. same order of LL, same order of consensus score CS, apparently shifted consensus description), we recommend comparing the motifs more extensively, e.g. by comparing the annotation of their instances or by running MotifComparison with more relaxed settings for the maximal shift and minimal overlap. Running MotifComparison in default metric mode (KL) executes the same similarity computations as the grouping in MotifRanking, but the computed similarity score and the most likely shift between the two motifs being compared are then available for your own interpretation. Running MotifComparison with another similarity metric (p-BLiC) gives an independent evaluation of the (non-)similarity of the motifs reported by MotifRanking. The p-BLiC method in MotifComparison is in many cases the most suitable one for finding the truly different motif signals present in the multiple motif predictions (read more in MotifComparison Guidelines).

- If the motifs reported by MotifRanking all have a low count (N<10), there is not enough statistical evidence to support these motifs as candidates for further biological assessment. Before concluding that no motif is present in the dataset, we recommend repeating MotifSampler with revised parameter settings to further explore the regulatory motif solution space, e.g. run with a higher number of Gibbs sampling runs (parameter -r) and/or change some influential parameters such as the background model (parameter -b), the motif width (parameter -w) and the prior on the number of instances per sequence (parameter -p).
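
The rules of thumb for the count N and the return ratio RR given in the list above can be combined into a small check, sketched here in Python. The thresholds are the ones quoted in the text (N of at least 10, preferably 30; RR above 50% or below 10%); the function name and the category labels are hypothetical and only serve the illustration.

```python
def assess_motif(count, total_motifs):
    """Return (RR, support by count, return ratio level) for one reported motif."""
    if total_motifs <= 0:
        raise ValueError("total_motifs must be positive")
    rr = count / total_motifs
    # Absolute count N: fewer than 10 detections is not enough evidence.
    if count < 10:
        support = "insufficient"
    elif count < 30:
        support = "minimal"
    else:
        support = "safe"
    # Return ratio: above 50% is highly significant, below 10% likely spurious.
    if rr > 0.5:
        level = "high"
    elif rr < 0.1:
        level = "low"
    else:
        level = "intermediate"
    return rr, support, level

print(assess_motif(20, 100))  # (0.2, 'minimal', 'intermediate')
print(assess_motif(20, 40))   # (0.5, 'minimal', 'intermediate')
print(assess_motif(5, 8))     # (0.625, 'insufficient', 'high')
```

The last call shows the pitfall discussed above: when most runs were aborted, a motif can reach a high RR while its absolute count is still too low to support any conclusion.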



Feedback

Contact us if you have comments, questions or suggestions, or simply want to react to the contents of this guideline. Thank you.