Query Driven Biclustering
QDB (Query Driven Biclustering) is a Bayesian query-driven biclustering framework for microarray data in which the prior distributions allow introducing knowledge from a set of seed genes (query) to guide the pattern search. The algorithm has been described and validated in the following paper:
Dhollander T, Sheng Q, Lemmens K, De Moor B, Marchal K, Moreau Y. Query-driven module discovery in microarray data. Bioinformatics, 23(19):2573-80 (2007).
The software for query-driven biclustering (QDB) is freely available for ACADEMIC USE ONLY under the license. Registration is required to download the software. For commercial usage, please contact us.
QDB was implemented in R (version 2.4.1). The code consists of a collection of R scripts. To run the software successfully, perform the following steps:
setwd("/home/usr/QDB_v1.1/")
'params' section of the script. Default parameter settings are currently used. For a detailed description of the different parameters and their default settings, we refer to the manuscript.
'read data sources' section of the script. Expression data is assumed to be a matrix with the genes in the rows and, in the columns, the different conditions under which gene expression was measured. The expression data file must be a tab-delimited text file with the identifiers of the experimental conditions in the first row, the gene locus tags in the first column, and the expression matrix in the remainder of the file.
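As a quick check that a file follows the layout described above, it can be read into R before running QDB. The following is a minimal sketch, not the code used inside the QDB scripts: the file, the header label `locus_tag` and the values are made up for illustration, and it assumes the header row also contains a label for the locus tag column.

```r
# Build a tiny example file in the described layout (all values are made up).
expr_file <- tempfile(fileext = ".txt")
writeLines(c(
  "locus_tag\tcond1\tcond2\tcond3",   # first row: condition identifiers
  "b0001\t0.12\t-1.30\t0.45",         # first column: gene locus tags
  "b0002\t1.05\t0.20\t-0.66",
  "b0003\t-0.40\t0.98\t1.21"
), expr_file)

# Read the tab-delimited file; the first column becomes the row names.
expr <- as.matrix(read.delim(expr_file, row.names = 1, check.names = FALSE))

dim(expr)        # number of genes x number of conditions
rownames(expr)   # gene locus tags
colnames(expr)   # condition identifiers
```

If the result is not a numeric matrix with one row per gene, the file most likely deviates from the expected format (for example, a different delimiter or non-numeric entries).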
'seed genes' section of the script. Seed genes can be specified in two different ways: either by their locus tags (note that these should correspond to the row names in the expression data file) or by their indices in the expression data file (i.e. the row number).
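The two ways of specifying seed genes are interchangeable as long as every locus tag occurs among the row names of the expression matrix. The sketch below shows one way to verify this and to convert tags into row indices; the variable names (`expr`, `seed_tags`, `seed_idx`) and the toy matrix are hypothetical and are not taken from the QDB scripts.

```r
# Toy expression matrix with locus tags as row names (values are made up).
expr <- matrix(
  seq_len(12), nrow = 4,
  dimnames = list(c("b0001", "b0002", "b0003", "b0004"),
                  c("cond1", "cond2", "cond3"))
)

# Seed genes given by locus tag.
seed_tags <- c("b0002", "b0004")

# Convert locus tags to row indices; match() returns NA for unknown tags.
seed_idx <- match(seed_tags, rownames(expr))
if (anyNA(seed_idx)) {
  stop("Seed genes not found in the expression data: ",
       paste(seed_tags[is.na(seed_idx)], collapse = ", "))
}

seed_idx                   # row numbers of the seed genes
rownames(expr)[seed_idx]   # back to locus tags, as a sanity check
```

Checking for unmatched tags up front avoids silently running the search with an incomplete query.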
'save the result' section of the script. Two different output files can be specified. The "/tempresult.RData" file contains the output of the whole QDB run: the gene scores, condition scores and log-likelihood scores for all iterations of the algorithm. This data can be loaded into R using the load command. The "tempresult.bcl" file is a text file that contains a selection of the biclustering results produced by the resolution sweep approach, chosen based on the Akaike Information Criterion described in the manuscript.
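Because the names of the objects stored in the .RData file are not listed here, it can help to load the file into a separate environment and inspect what it contains. The sketch below demonstrates this on a stand-in file; the object `gene_scores` and its values are hypothetical and only serve to make the example runnable.

```r
# Create a stand-in result file (the object and its values are hypothetical).
demo_file <- file.path(tempdir(), "tempresult.RData")
gene_scores <- c(b0001 = 0.9, b0002 = 0.1)
save(gene_scores, file = demo_file)

# Load into a fresh environment so existing workspace objects are not overwritten.
res <- new.env()
loaded <- load(demo_file, envir = res)

loaded                 # names of the objects that were restored
str(res$gene_scores)   # inspect one of them
```

For an actual QDB run, `demo_file` would be replaced by the path of the .RData file specified in the script.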
source("/main files/qdb_main_CM.R")
kathleen.marchal<at>biw.kuleuven.be