Table 2: Details of the chosen set of clusters for each data set (benchmark and additional).

(a) Name of the data set; (b) similarity parameter and (c) inflation value used to generate this set of clusters; (d) number of clusters in the chosen set; and (e) number of subsequences in the largest cluster.

Gene (a)   P (b)   I (c)   # clusters (d)   # elements in largest cluster (e)
cfos       0       4       12               5
hoxb2      -10     4       4                6
pax6       0       4       20               6
scl        -10     4       11               4
EGR3       0       4       11               8
GSH1       -10     4       12               4
HIV-EP1    0       4       13               6
HOXB5      0       4       1                4
MEIS2      0       4       24               6
PCHD8      0       4       14               4