Searching heterogeneous and distributed databases: a case study from the maritime archaeology community

Hardy, Dianna Lynn (2008) Searching heterogeneous and distributed databases: a case study from the maritime archaeology community. Masters (Research) thesis, James Cook University.



Abstract

Much of the data from archaeological investigations currently reside in databases with dissimilar file formats and structures. In addition, data from individual excavations and other research are frequently placed in separate databases that are maintained and accessed solely by the group responsible for the project. Due to the differing file formats, lack of access via a cohesive network and issues regarding ownership and use of data, maritime archaeologists have found it difficult to query such databases in order to perform cross-site analyses. This thesis seeks to provide a framework for federating maritime archaeological databases in order to make such queries and cross-site analyses possible. During this research two important questions emerged: (1) Are there tools available to federate these databases? and (2) How can the search results be appropriately targeted when searching across such a variety of data sources?

This research began by developing a case study centred on databases provided by three maritime heritage organizations in Australia. An informal analysis of feedback from these contributors and others in the maritime archaeological community informed the preliminary design of a prototype system. One of the key issues identified by the community was a lack of funding for new tools. Therefore, the decision was made to use only "open-source" software which is available at no cost. The initial prototype system developed here employed the application 'Storage Resource Broker' (SRB). This software acts as a broker by providing access to distributed sources of data via a search engine that queries the combined resources. The holders of the individual data can set access permissions so that users only see the data to which they have been granted access.
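The brokering idea described above can be illustrated with a minimal sketch. This is not SRB's API or the thesis's implementation: the `Resource` and `Broker` classes, their fields, and the sample records are all hypothetical, chosen only to show a single search running over several registered sources while skipping any source the user has not been granted access to.

```python
from dataclasses import dataclass, field


@dataclass
class Resource:
    """A hypothetical data source registered with the broker."""
    name: str
    records: list                      # list of dict rows
    allowed_users: set = field(default_factory=set)


class Broker:
    """Hypothetical broker: one search over all sources a user may read."""

    def __init__(self):
        self.resources = []

    def register(self, resource):
        self.resources.append(resource)

    def search(self, user, term):
        term = term.lower()
        hits = []
        for res in self.resources:
            # The data holder controls access: skip sources not shared with this user.
            if user not in res.allowed_users:
                continue
            for rec in res.records:
                # Case-insensitive substring match against every value in the row.
                if any(term in str(value).lower() for value in rec.values()):
                    hits.append((res.name, rec))
        return hits


# Illustrative data only; names and records are invented for the sketch.
broker = Broker()
broker.register(Resource("org_1", [{"find": "anchor"}], {"alice", "bob"}))
broker.register(Resource("org_2", [{"find": "Anchor stock"}], {"alice"}))

alice_hits = broker.search("alice", "anchor")  # matches in both sources
bob_hits = broker.search("bob", "anchor")      # org_2 is hidden from bob
```

In this sketch the permission check happens before any record is read, so a user never sees results from a source that has not been shared with them.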

As the research progressed, another key issue was identified: although open source tools capable of integrating distributed data sets are currently available, they are difficult to use and require a significant level of time, technical ability and planning to implement fully. A related issue is the difficulty of combining data sets which may have little data in common. To overcome these issues it was necessary to develop a separate application that works in concert with SRB and requires little technical ability to deposit databases. The prototype system allows a data depositor to provide a schema or description of the data itself, and to use the functionality built into the system to create a mapping between columns of data which contain similar information. Integral to the prototype is an embedded metadata catalogue (MCAT) that lists semantic metadata for each resource, which allows the system to return better search results.
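The column-mapping idea can be sketched as follows. This is an assumption-laden illustration, not the prototype's actual code: the function `federated_search`, the shared field names (`ship`, `object`), and the two sample data sets are hypothetical. Each depositor declares which local column corresponds to which shared field, and a search on a shared field is translated into each data set's own column names.

```python
def federated_search(datasets, mappings, shared_field, term):
    """Search several data sets through depositor-supplied column mappings.

    datasets: data set name -> list of row dicts
    mappings: data set name -> {local column name: shared field name}
    """
    term = term.lower()
    results = []
    for name, rows in datasets.items():
        # Local columns that the depositor mapped onto the requested shared field.
        local_cols = [
            col for col, shared in mappings.get(name, {}).items()
            if shared == shared_field
        ]
        for row in rows:
            if any(term in str(row.get(col, "")).lower() for col in local_cols):
                results.append((name, row))
    return results


# Illustrative data only: two data sets with different column names.
datasets = {
    "site_db_1": [
        {"vessel": "Wreck A", "find": "anchor"},
        {"vessel": "Wreck B", "find": "bottle"},
    ],
    "site_db_2": [
        {"ship_name": "wreck a", "artefact": "cannon"},
    ],
}
mappings = {
    "site_db_1": {"vessel": "ship", "find": "object"},
    "site_db_2": {"ship_name": "ship", "artefact": "object"},
}

hits = federated_search(datasets, mappings, "ship", "wreck a")
```

Here `hits` contains one row from each data set, even though one stores the ship under `vessel` and the other under `ship_name`. A data set with no mapping for the requested shared field simply contributes no results.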

The final results of the research show that while it is possible to integrate maritime archaeological datasets, in order to implement a data sharing strategy, data standards for archaeological resources must be established. In addition, tools geared toward the average user must be established for creating ontologies and handling other semantic issues.

Item ID: 1849
Item Type: Thesis (Masters (Research))
Keywords: semantic web, archaeology, maritime archaeology, access, ownership, cross-site analyses, federated databases, heterogeneous databases, heritage, open-source databases, search engines, data sets, embedded metadata, semantics, ontologies, datasets, digital libraries, e-research, middleware, metadata harvesting, multimedia, web architecture, world wide web, Internet, data sharing, ArcheoView, search results, Australia, metadatabases, semantic integration, semantic networks, information retrieval
Additional Information:

For other works for this author, see also Dianna Madden.

Date Deposited: 18 Mar 2008
FoR Codes: 08 INFORMATION AND COMPUTING SCIENCES > 0806 Information Systems > 080608 Information Systems Development Methodologies @ 34%
08 INFORMATION AND COMPUTING SCIENCES > 0806 Information Systems > 080610 Information Systems Organisation @ 33%
21 HISTORY AND ARCHAEOLOGY > 2101 Archaeology > 210110 Maritime Archaeology @ 33%
SEO Codes: 89 INFORMATION AND COMMUNICATION SERVICES > 8903 Information Services > 890301 Electronic Information Storage and Retrieval Services @ 33%
95 CULTURAL UNDERSTANDING > 9503 Heritage > 950304 Conserving Intangible Cultural Heritage @ 33%
97 EXPANDING KNOWLEDGE > 970121 Expanding Knowledge in History and Archaeology @ 34%
Downloads: Total: 1648
Last 12 Months: 27