PHMS and NOMS

PhyloMotifSampler (PHMS) and NOrthoMotifSampler (NOMS) are complementary tools for probabilistic de novo motif detection in sets of orthologous regulatory DNA sequences from multiple related species. The basic idea is that selective pressure causes functional elements to evolve more slowly than non-functional sequences. Detection uses a stochastic optimization strategy (a Gibbs sampling approach) that searches for all possible sets of short DNA segments in orthologous regulatory regions that are evolutionarily better conserved than the surrounding nucleotides (also called the non-functional background). Because the motif detector reports a list of multiple solutions, the output of PHMS or NOMS must be passed to MotifRanking and/or FuzzyClustering to extract the significant motifs.

PHMS applies to datasets from species related by a star topology and detects motifs whose consensus is conserved in each of the species involved (a slight mutation, proportional to the phylogenetic distance, is allowed).
NOMS extends this to orthologous sequences from more distantly related species (modelled by a rooted tree topology) and can detect motifs whose consensus is mutated or even replaced in the related species.

To choose between PHMS and NOMS, to run the tools and to evaluate their output, please consult our guidelines (includes a link to a case study).
Stand-alone executable: download.

The speed with which results are generated depends on the server load.

Last PHMS software revision : .

Last NOMS software revision : .

Questions & suggestions: contact us.

Publications:

If you like our software, please cite the following publication: (in process)



Run PHMS or NOMS:

The input format, the parameters and the output format are largely the same for PHMS and NOMS. Differences and extra requirements for each application are indicated at the appropriate fields below. The choice between running PHMS and NOMS is made at the start of this section.
To run the chosen application, please fill in the required input in the blank fields. In the output section, a (randomized) file name has been generated. You can overwrite this automatically generated file name with a more meaningful description if desired (do not use spaces, dots, colons,... in this name). The program parameters have been set to default values. Please check whether these settings apply to your case (see our PHMS/NOMS Guidelines) and overwrite them where needed. Pressing Submit will start the selected software on our server. A URL pointing to the results will be sent by email.
The illustrative examples are the result of running NOrthoMotifSampler on orthologous sequences from yeast (Saccharomyces species) containing known binding sites for the Urs1h transcription factor.
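
The naming rule for the output file can be checked before submitting. The following is a minimal Python sketch, not part of PHMS or NOMS. The page forbids spaces, dots and colons but does not give the complete list of forbidden characters, so the sketch assumes that only letters, digits, underscores and hyphens are safe. The function names are hypothetical.

```python
import re

# Assumption: only letters, digits, underscores and hyphens are treated as safe.
# The page rules out spaces, dots and colons but does not list every
# forbidden character, so this whitelist is deliberately conservative.
SAFE_NAME = re.compile(r"[A-Za-z0-9_-]+")


def is_safe_output_name(name: str) -> bool:
    """Return True if the name contains only the characters assumed safe."""
    return SAFE_NAME.fullmatch(name) is not None


def sanitize_output_name(name: str) -> str:
    """Replace each run of other characters with a single underscore."""
    return re.sub(r"[^A-Za-z0-9_-]+", "_", name).strip("_")
```

For example, sanitize_output_name("urs1h run:1.out") gives "urs1h_run_1_out", which then passes is_safe_output_name.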

  CHOOSE YOUR PROGRAM:
    Use PHMS for the detection of conserved or slightly mutated motifs with one common motif consensus.
    Use NOMS if you expect a mutated or replaced motif consensus in the different related species.



  Input:
  Your email address:   we will mail you the URL with the results.
  -f <filename>:   file with orthologous DNA sequences grouped per target gene,
    in Fasta format (Urs1h example).
  -q <filename> (optional):   file with position-specific prior (PSP) scores grouped per target gene,
    in PSP format (Urs1h example).
  -b <filename>:  file listing all involved species with the names of the respective background model files (listing format, example).
  <bgfile1, bgfile2,...>:  upload all your own genome-specific background model files
    (separated by commas; format of one file, example). Any file listed in -b that is not uploaded here will be looked up in our database.
  -c <filename>:   file describing the star (PHMS/NOMS) or rooted (NOMS) phylogenetic tree with branch lengths, in Newick tree format (star example, rooted example).
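
To illustrate the difference between the two tree shapes accepted by -c, here is a minimal Python sketch that writes a star tree and a rooted tree in standard Newick notation. The species labels (spA, spB, spC) and all branch lengths are made-up placeholders, and the helper names are hypothetical. Whether PHMS and NOMS require additional conventions beyond plain Newick is not stated here, so please compare with the star and rooted examples linked above.

```python
from typing import Dict, List, Optional


def star_newick(branch_lengths: Dict[str, float]) -> str:
    """Star topology: every species hangs directly off one central node."""
    leaves = ",".join(f"{name}:{length}" for name, length in branch_lengths.items())
    return f"({leaves});"


def join_newick(children: List[str], length: Optional[float] = None) -> str:
    """Group subtrees under one inner node, optionally with a branch length."""
    node = "(" + ",".join(children) + ")"
    return node if length is None else f"{node}:{length}"


# Placeholder species and branch lengths, for illustration only.
star_tree = star_newick({"spA": 0.1, "spB": 0.2, "spC": 0.3})

# Rooted topology: spA and spB share an inner node, spC joins at the root.
inner = join_newick(["spA:0.1", "spB:0.2"], 0.05)
rooted_tree = join_newick([inner, "spC:0.3"]) + ";"
```

Here star_tree is "(spA:0.1,spB:0.2,spC:0.3);" and rooted_tree is "((spA:0.1,spB:0.2):0.05,spC:0.3);". The extra pair of parentheses in the rooted tree is what introduces the inner node.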

  Output:
  -Z <0|1>:   ONLY NOMS: select whether you also want species-specific motif results. Default <0> (=no).
  -O <0|1>:   ONLY NOMS: select whether you also want inner node and root motif results. Default <0> (=no).
  -o <filename>:   file with solutions in annotated instances format (Urs1h example)
  -m <filename>:   file with solutions in PWM format (Urs1h example)


  Parameters:
  -r <value>:   number of times the algorithm is run with the same parameter settings on the same input sequence dataset. Default <100>.
  -s <0|1>:   strands to analyze. <1> analyzes both strands (i.e. the input sequences and their reverse complements); <0> analyzes the input sequences only. Default <1>.
  -w <value>:   length of the motif. Default <8>.
  -n <value>:   number of different motifs to search for. Default <1>.
  -x <value>:   maximal allowed overlap between different motifs (only used if -n > 1). Default <1>.
  -M <value>:   maximum number of instances of a motif to search for in any sequence. Default <2>.
  -p <prior>:   sets prior information on the number of motif instances to search for per sequence. The default is tuned towards mainly 1 instance per sequence (but 0 or 2 instances are also possible). See 5 types prior for more options on this parameter.
  -Q <value>:   weight of the PSP impact during motif search (Q < 1 gives the PSP only a minor impact). Default <10>.
  -k <value>:   ONLY NOMS: weight of the prior evolution counts relative to the evolution inferred from the data (a high k favours the prior counts, k=1 the data-inferred evolution). Default <1000>.
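
For users of the stand-alone executable, the options above can be collected into one argument list. The Python sketch below is only an illustration under several assumptions: the program name used as the first argument is a placeholder (the real executable name is not given on this page), the five file options not marked optional are treated as required, and -p and -q are left out of the defaults because no literal default value is listed for them. The defaults themselves are copied from the lists above.

```python
from typing import Dict, List, Optional

# Defaults as listed on this page. -p and -q are omitted because the page
# gives no literal default value for them.
SHARED_DEFAULTS = {"-r": 100, "-s": 1, "-w": 8, "-n": 1, "-x": 1, "-M": 2, "-Q": 10}
NOMS_ONLY_DEFAULTS = {"-Z": 0, "-O": 0, "-k": 1000}
# Assumption: every file option not marked optional is required.
REQUIRED_FILES = ("-f", "-b", "-c", "-o", "-m")


def build_args(program: str, files: Dict[str, str],
               overrides: Optional[Dict[str, object]] = None) -> List[str]:
    """Return an argument list; `program` is a placeholder executable name."""
    if program not in ("PHMS", "NOMS"):
        raise ValueError("program must be 'PHMS' or 'NOMS'")
    missing = [flag for flag in REQUIRED_FILES if flag not in files]
    if missing:
        raise ValueError(f"missing required file options: {missing}")

    options: Dict[str, object] = dict(SHARED_DEFAULTS)
    if program == "NOMS":
        options.update(NOMS_ONLY_DEFAULTS)
    for flag, value in (overrides or {}).items():
        if flag in NOMS_ONLY_DEFAULTS and program != "NOMS":
            raise ValueError(f"{flag} applies to NOMS only")
        options[flag] = value

    # The three switches documented as <0|1> accept no other value.
    for flag in ("-s", "-Z", "-O"):
        if flag in options and options[flag] not in (0, 1):
            raise ValueError(f"{flag} must be 0 or 1")

    args = [program]
    for flag, value in {**files, **options}.items():
        args += [flag, str(value)]
    return args
```

A call such as build_args("NOMS", {"-f": "seqs.fasta", "-b": "species.txt", "-c": "tree.nwk", "-o": "instances.txt", "-m": "pwm.txt"}, {"-w": 10}) returns a list that starts with the placeholder program name and carries -w 10 in place of the default motif length. All file names in this call are hypothetical.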


! Proceed with MotifRanking (or FuzzyClustering) to prioritize your PHMS or NOMS output (why? step2/3).