Jackson, Samuel E. and Einbeck, Jochen and Kasim, Adetayo and Talloen, Willem (2016) 'The correlation threshold as a strategy for gene filtering, with application to irritable bowel syndrome and breast cancer microarray data', Reinvention: an international journal of undergraduate research, 9 (2).
It is well established in the literature that certain disease-associated gene signatures can be used to predict the classification of samples or cell lines into diagnostic groups, for example healthy and diseased. Standard techniques for selecting significant genes may lead to many highly correlated genes being chosen, which may be an issue if the number of genes we can select is limited. This article therefore investigates methods for selecting genes subject to a correlation threshold. The methods are applied to two high-dimensional microarray datasets: one to aid the prediction of the presence or absence of Irritable Bowel Syndrome, and one to predict whether the oestrogen-receptor class of a given breast cancer cell line is positive or negative. Our results suggest that the effectiveness of the correlation threshold as a gene selection parameter depends on the particular microarray dataset and classification problem. While the correlation threshold may be beneficial in some specific scenarios where the number of required genes is restrictively small, it may also have no effect, or even a detrimental effect, on classification accuracy.
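To make the idea of a correlation threshold concrete, the following is a minimal sketch of one way such a filter could work; it is not taken from the article and may differ from the authors' procedure. It assumes genes are ranked by some per-gene score (for example, a test statistic), that correlation is measured as absolute Pearson correlation across samples, and that selection is greedy. The names `select_genes`, `scores`, `n_genes` and `threshold` are hypothetical.

```python
import numpy as np

def select_genes(X, scores, n_genes, threshold):
    """Greedy gene filter with a correlation threshold (illustrative only).

    X         : array of shape (samples, genes) with expression values
    scores    : one score per gene, higher meaning more significant (assumed)
    n_genes   : maximum number of genes to select
    threshold : a candidate is skipped if its absolute Pearson correlation
                with any already selected gene is at or above this value
    """
    corr = np.abs(np.corrcoef(X, rowvar=False))  # gene-by-gene |correlation|
    order = np.argsort(-np.asarray(scores))      # best-scoring genes first
    selected = []
    for g in order:
        if all(corr[g, s] < threshold for s in selected):
            selected.append(int(g))
        if len(selected) == n_genes:
            break
    return selected

# Toy data: gene 1 is almost a copy of gene 0, gene 2 is weakly related.
base = np.arange(10.0)
alt = np.array([1.0, -1.0] * 5)
X = np.column_stack([base, base + 0.01 * alt, alt])
scores = [3.0, 2.0, 1.0]

print(select_genes(X, scores, n_genes=2, threshold=0.9))  # filter active
print(select_genes(X, scores, n_genes=2, threshold=1.1))  # filter effectively off
```

With the filter active, the near-duplicate gene is passed over in favour of a less correlated one; with a threshold above 1 the selection reduces to taking the top-scoring genes. Whether the filtered set classifies better is exactly the dataset-dependent question the abstract raises.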
|Full text:||(VoR) Version of Record|
First Live Deposit - 03 November 2016
|Publisher Web site:||http://www2.warwick.ac.uk/fac/cross_fac/iatl/reinvention/issues/volume9issue2/jackson/|
|Record Created:||03 Nov 2016 11:20|
|Last Modified:||02 Dec 2016 12:29|