Durham University

Publication details

Jackson, Samuel E., Einbeck, Jochen, Kasim, Adetayo & Talloen, Willem (2016). The correlation threshold as a strategy for gene filtering, with application to irritable bowel syndrome and breast cancer microarray data. Reinvention: an International Journal of Undergraduate Research 9(2).

It is well established in the literature that certain disease-associated gene signatures can be used to predict the classification of samples or cell lines into diagnostic groups (for example, healthy and diseased). Standard techniques for selecting significant genes may lead to many highly correlated genes being chosen, which can be a problem when the number of genes that can be selected is limited. This article therefore investigates methods for selecting genes that apply a correlation threshold. The methods are applied to two high-dimensional microarray datasets: one to aid prediction of the presence or absence of irritable bowel syndrome, and one to predict whether the oestrogen-receptor class of a given breast cancer cell line is positive or negative. Our results suggest that the effectiveness of the correlation threshold as a gene selection parameter depends on the particular microarray dataset and classification problem. While the correlation threshold may be beneficial in some specific scenarios where the number of required genes is restrictively small, it may also have no effect, or even a detrimental effect, on classification accuracy.
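
The idea of a correlation threshold can be illustrated with a minimal sketch. The abstract does not spell out the article's procedure, so everything below is an assumption made for illustration: genes are ranked by some precomputed significance score, then added greedily, skipping any gene whose absolute Pearson correlation with an already selected gene exceeds the threshold. The names `pearson`, `select_genes`, `expression`, and `scores`, and the toy data, are hypothetical and not taken from the article.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length lists of numbers."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # convention chosen here for constant profiles
    return sxy / sqrt(sxx * syy)


def select_genes(expression, scores, n_genes, threshold):
    """Greedy selection under a correlation threshold (illustrative only).

    expression: dict mapping gene name -> expression values across samples
    scores:     dict mapping gene name -> significance score (higher is better)
    n_genes:    maximum number of genes to select
    threshold:  largest absolute correlation allowed between selected genes
    """
    ranked = sorted(scores, key=scores.get, reverse=True)
    selected = []
    for gene in ranked:
        if len(selected) == n_genes:
            break
        # Keep the gene only if it is not too correlated with any gene
        # that has already been selected.
        if all(
            abs(pearson(expression[gene], expression[kept])) <= threshold
            for kept in selected
        ):
            selected.append(gene)
    return selected


# Toy data (made up): g2 is almost a rescaled copy of g1.
expression = {
    "g1": [1.0, 2.0, 3.0, 4.0],
    "g2": [2.0, 4.1, 6.0, 8.2],
    "g3": [4.0, 1.0, 3.0, 2.0],
}
scores = {"g1": 0.9, "g2": 0.8, "g3": 0.5}

chosen = select_genes(expression, scores, n_genes=2, threshold=0.8)
unfiltered = select_genes(expression, scores, n_genes=2, threshold=1.0)
```

With the threshold at 0.8 the sketch skips `g2`, which is nearly redundant with `g1`, and picks the lower-scoring `g3` instead; with the threshold at 1.0 no gene is excluded and the two top-scoring genes are kept. Whether swapping in a less significant but less correlated gene helps classification is exactly the dataset-dependent question the abstract raises.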