Applying semantic technologies and artificial intelligence to eco-informatic modelling of coral reef systems
Myers, Trina Sharlene (2009) Applying semantic technologies and artificial intelligence to eco-informatic modelling of coral reef systems. PhD thesis, James Cook University.
A “data deluge” is overwhelming many areas of research. Massive amounts of scientific data are being produced that cannot be effectively processed. Remote environmental monitoring (including sensor networks) is being rapidly developed and adopted for collecting real-time data across widely distributed locations. As the volume of raw data increases, it is envisaged that bottlenecks will develop in the data analysis phase of research workflows, because data processing and synthesis procedures still generally involve manual manipulation.
Despite the exponential growth in data and the consequent challenges in data management, e-Research communities are exploring solutions to the “data deluge”. E-Research is the amalgamation of research techniques, data and people with Information Communication Technologies (ICT) to enhance research capabilities. Recent research efforts in the Semantic Web and Knowledge Representation (KR) domains focus on the development of automated data synthesis technologies. Semantic technologies are a key component of these solutions. They add contextual information to data through ontologies, so that the computer can apply logic systems to perform automated inference. An ontology explicitly describes concepts in “computer-understandable” terms, which allows for automated reasoning and intelligent decision-making by the machine. Automated data analysis and knowledge discovery are desirable because manual data processing and synthesis require human intervention, which will become increasingly difficult to sustain as the data deluge grows.
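To make the idea of automated inference over an ontology concrete, the following is a minimal sketch in plain Python, not the reasoning stack used in the thesis. It assumes a toy set of subject-predicate-object triples and a single rule (class membership propagates up a subclass hierarchy). The class names, the instance name `colony_17` and the function `infer_types` are hypothetical and chosen only for illustration.

```python
# Illustrative sketch only: names and hierarchy are hypothetical, not from the thesis.
SUBCLASS_OF = "subClassOf"
TYPE = "type"


def infer_types(triples):
    """Return the input triples plus every type fact implied by subClassOf links."""
    facts = set(triples)
    changed = True
    while changed:
        changed = False
        new_facts = set()
        for (subject, predicate, obj) in facts:
            if predicate != TYPE:
                continue
            # If subject is a member of obj, it is also a member of obj's parents.
            for (child, link, parent) in facts:
                if link == SUBCLASS_OF and child == obj:
                    new_facts.add((subject, TYPE, parent))
        if not new_facts <= facts:
            facts |= new_facts
            changed = True
    return facts


# A tiny hand-written "ontology" and one observed instance.
triples = {
    ("Acropora", SUBCLASS_OF, "HardCoral"),
    ("HardCoral", SUBCLASS_OF, "Coral"),
    ("Coral", SUBCLASS_OF, "ReefOrganism"),
    ("colony_17", TYPE, "Acropora"),
}

inferred = infer_types(triples)
```

Only one fact about `colony_17` was stated, yet the machine derives that it is also a hard coral, a coral and a reef organism. That is the kind of contextual knowledge an ontology is meant to supply without manual intervention.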
This dissertation introduces the Semantic Reef project, an eco-informatics software architecture designed to alleviate data management problems within marine research. The intention was to develop an automated data processing, problem-solving and knowledge discovery system within the scope of e-Research, which will assist in developing our understanding and management of coral reef ecosystems. The Semantic Reef project employs e-Research approaches including semantic technologies and scientific workflows, which together create a platform designed to evaluate complex hypothesis queries and/or provide alerting for unusual events (e.g., coral spawning or bleaching).
The Semantic Reef project was built as a KR platform, so researchers can combine disjoint data from different sources into a single Knowledge Base (KB) and pose questions of the data. Scientific workflows access and retrieve remote sensor data and/or data available via the Web to populate the KB. The KB consists of a hierarchy of reusable and usable ontologies that together generically model a coral reef ecosystem in a “computer-understandable” form. The ontologies range from informal through to formal and, when coupled to datasets, allow inferences to be derived from the data so that the KB can be “asked” questions for semantic correlation, synthesis and analysis. The ontology design leverages the scalable and autonomic characteristics of semantic technologies, such as modularity, reuse and the ability to reveal latent connections in data through complex logic systems.
The overall goal of the Semantic Reef project was to enable marine researchers to pose hypotheses about environmental data gathered from in situ observations, and to explore phenomena such as climate change effects on an ecosystem as a whole rather than on one component at a time. In marine research, the number of questions posed about climate change effects has increased explosively; for example, questions about the origins of phenomena such as coral bleaching in coral reef ecosystems. Answering these questions requires assessing the cumulative combination of ecological factors and stressors that contribute to the tipping point at which a healthy coral becomes a stressed coral through bleaching. The marine biology domain has an urgent need for more efficient investigation of disparate data streams and data sources. The Semantic Reef project, which incorporates new hypothesis-driven research tools and problem-solving methods, is designed as a proof of concept to address this need.
The Semantic Reef system has the capacity to pose hypotheses and automate inferences from the available data. The system’s design supports flexibility in hypothesis design because the researcher is not required to predetermine the exact hypothesis prior to gathering data for import to the KB. Rather, the questions can be as flexible as the researcher requires, and they may evolve as new data become available or as ideas grow and/or epiphanies emerge. Then, once phenomena in the data are revealed through semantic inference, in situ observations can be performed to confirm or negate the theory. The Semantic Reef tool offers marine researchers this flexibility in hypothesis modelling to theorise about a range of scientific conundrums, such as the cumulative causal factors that contribute to coral bleaching.
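The separation between loading data and posing a hypothesis can be illustrated with a short plain-Python sketch. It is not the Semantic Reef implementation. It assumes a KB reduced to a list of observation records and hypotheses expressed as interchangeable predicate functions. The site names, readings, field names and thresholds are all invented for illustration and carry no scientific weight.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    """One hypothetical record in the knowledge base."""
    site: str
    sea_temp_c: float   # sea surface temperature, degrees Celsius
    light_par: float    # light level, arbitrary illustrative units


# Data is imported first, with no hypothesis fixed in advance (values are invented).
kb = [
    Observation("site_a", 31.2, 1800.0),
    Observation("site_b", 29.0, 1900.0),
    Observation("site_c", 31.5, 900.0),
]


def pose(hypothesis, observations):
    """Return the sorted site names for which the hypothesis holds."""
    return sorted(o.site for o in observations if hypothesis(o))


# A first, simple hypothesis: heat alone marks a bleaching candidate.
def heat_only(o):
    return o.sea_temp_c > 30.5  # illustrative threshold


# A revised hypothesis posed later against the same data: heat combined with high light.
def heat_and_light(o):
    return o.sea_temp_c > 30.5 and o.light_par > 1500.0  # illustrative thresholds


candidates_v1 = pose(heat_only, kb)
candidates_v2 = pose(heat_and_light, kb)
```

Because the hypothesis is just an argument to `pose`, it can be refined without re-collecting or re-importing data. The sites it flags would then be candidates for in situ confirmation.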
This study is the first known example of Semantic Web technologies and scientific workflows being combined to integrate data, with the purpose of posing observational hypotheses or inferring alerts in the coral reef domain. As a proof of concept, the Semantic Reef system offers a different approach to the development and execution of observational hypotheses on coral reefs. The system offers adaptability when posing hypotheses and questions of the data, specifically in scenarios where the hypothesis is not apparent prior to data collection efforts. The Semantic Reef system cannot overcome the data deluge, but it offers a unique approach to the discovery of new phenomena that, through automation, can alleviate the problems associated with the data analysis phase.