Bicluster analysis of biomedical data based on multi-objective evolutionary optimization

Golchin, Maryam (2018) Bicluster analysis of biomedical data based on multi-objective evolutionary optimization. PhD thesis, Griffith University.

Full text not available from this repository
View at Publisher Website: https://doi.org/10.25904/1912/2189
 


Abstract

Knowledge discovery is the process of finding hidden knowledge in large volumes of data, and it involves data mining. Data mining unveils interesting relationships among data, and the results can help to make valuable predictions or recommendations in various applications. Recently, biclustering has become a common method in data mining and pattern recognition. Biclustering is an unsupervised machine learning method that can uncover and extract accurate and useful information from high-dimensional sparse data. Biclustering has found many useful applications for visualization and exploratory analysis in various fields such as knowledge discovery, data mining, pattern classification, information retrieval, collaborative filtering, and especially in gene expression data analysis tasks such as functional annotation, tissue classification, and motif identification.

Previous studies have shown that finding biclusters in data is inherently intractable and computationally complex. In general, the challenges of biclustering include the high dimensionality of the data, noise, the different types of bicluster patterns, and the fact that biclusters can overlap. Although there are several studies on biclustering, our review of the methods proposed in the literature found that these challenges are not addressed properly. Most of the methods proposed in the literature can only detect a limited set of bicluster patterns under restrictive assumptions about the data. Moreover, many methods detect biclusters sequentially, i.e., the method replaces the detected bicluster with the background and then detects the next bicluster, which prevents the detection of overlapping biclusters.
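The effect of sequential detection on overlapping biclusters can be illustrated with a toy example. The sketch below is not taken from the thesis: the 6x6 matrix, the two constant biclusters, and the helper names `mask_bicluster` and `is_constant` are assumptions made purely for illustration. It shows that once the first bicluster is overwritten with background values, the cell it shares with a second bicluster is lost, so the second bicluster no longer shows its pattern.

```python
import random


def mask_bicluster(matrix, rows, cols, low=0.0, high=1.0, seed=0):
    """Return a copy of matrix with the bicluster cells replaced by
    random background values drawn from [low, high] (hypothetical helper)."""
    rng = random.Random(seed)
    masked = [row[:] for row in matrix]
    for r in rows:
        for c in cols:
            masked[r][c] = rng.uniform(low, high)
    return masked


def is_constant(matrix, rows, cols):
    """True if every cell of the submatrix holds the same value."""
    values = {matrix[r][c] for r in rows for c in cols}
    return len(values) == 1


# Toy data (assumed): a zero background with two constant biclusters
# of value 5.0 that share the single cell (2, 2).
size = 6
matrix = [[0.0] * size for _ in range(size)]
rows_a, cols_a = [0, 1, 2], [0, 1, 2]
rows_b, cols_b = [2, 3, 4], [2, 3, 4]
for rows, cols in ((rows_a, cols_a), (rows_b, cols_b)):
    for r in rows:
        for c in cols:
            matrix[r][c] = 5.0

# Sequential step: bicluster A is "detected" and replaced by background.
masked = mask_bicluster(matrix, rows_a, cols_a)

before = is_constant(matrix, rows_b, cols_b)  # pattern of B before masking
after = is_constant(masked, rows_b, cols_b)   # pattern of B after masking
print("B is constant before masking A:", before)
print("B is constant after masking A:", after)
```

Because the background values are drawn from [0, 1], the shared cell can no longer equal 5.0 after masking, which is why a sequential method would miss the full extent of the second bicluster in this toy setting.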

Given these limitations, there is a need for innovative methods to extract valuable information from the data and to reach a deeper understanding of the outcomes. Therefore, in this study, we first propose a method (PBD-SPEA) that uses a new dynamic encoding scheme to detect multiple overlapping biclusters concurrently. However, its implementation is complex because there are several heuristic search procedures in different steps of the method, and it is not able to detect all types of bicluster patterns. Thus, a second method (LBDP) is proposed based on geometrical biclustering. In this method, we search for hyperplanes in the data using an evolutionary algorithm. Applying this idea, we are able to detect all types of bicluster patterns concurrently.
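The geometric idea behind hyperplane-based biclustering can be sketched in two dimensions. The example below is not the LBDP method and does not use an evolutionary algorithm; it only illustrates, under stated assumptions, why common bicluster patterns correspond to linear structures. If the rows of a bicluster are plotted as points using two of its columns, an additive (shifting) pattern falls on a line with slope 1, and a multiplicative (scaling) pattern falls on a line through the origin. The function names `fit_line` and `classify_pair`, the tolerance, and the pattern labels are hypothetical choices for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of ys = slope * xs + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def classify_pair(xs, ys, tol=1e-9):
    """Label the relation between two columns over a set of rows
    (labels are illustrative, not the thesis's terminology)."""
    mean_x = sum(xs) / len(xs)
    if all(abs(x - mean_x) <= tol for x in xs):
        return "degenerate: first column is constant"
    slope, intercept = fit_line(xs, ys)
    residual = max(abs(y - (slope * x + intercept)) for x, y in zip(xs, ys))
    if residual > tol:
        return "no linear pattern"
    if abs(slope) <= tol:
        return "constant column"
    unit_slope = abs(slope - 1.0) <= tol
    zero_intercept = abs(intercept) <= tol
    if unit_slope and zero_intercept:
        return "constant rows"
    if unit_slope:
        return "additive"
    if zero_intercept:
        return "multiplicative"
    return "general linear"


# Toy column values over four rows (assumed data).
xs = [1.0, 2.0, 4.0, 7.0]
print(classify_pair(xs, [x + 3.0 for x in xs]))   # shifted copy of xs
print(classify_pair(xs, [2.0 * x for x in xs]))   # scaled copy of xs
print(classify_pair(xs, [4.0, 5.0, 9.0, 10.0]))   # perturbed values
```

With more than two columns the same reasoning extends from lines to hyperplanes, which is what makes a single geometric search able to cover several pattern types at once. Real data would need a noise-tolerant fit rather than the exact tolerance used here.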

We defined several scenarios on both synthetic and real data to test the performance of the proposed methods. Although our work initially targeted biomedical data (gene expression data), we also tested the generality of the algorithms on non-medical data, such as image data and social networking data. In all scenarios, our methods achieved reliable results compared to several state-of-the-art methods.

Item ID: 76980
Item Type: Thesis (PhD)
Keywords: Bicluster analysis, Biomedical data, Multi-objective evolutionary optimization, Gene expression data, Image data, Social networking
Copyright Information: Copyright © 2018 Maryam Golchin.
Additional Information:

This thesis is openly accessible from the link to Griffith University's institutional repository above.

Date Deposited: 01 Mar 2023 02:01
FoR Codes: 46 INFORMATION AND COMPUTING SCIENCES > 4613 Theory of computation > 461305 Data structures and algorithms @ 100%
SEO Codes: 28 EXPANDING KNOWLEDGE > 2801 Expanding knowledge > 280115 Expanding knowledge in the information and computing sciences @ 100%
Downloads: Total: 2