(C. Maria Keet)

Marijke Keet


Summary "Ontology development and integration for the biosciences"


As ontologies become more popular, their integration will become more important, which in turn influences ontology development, just as any goal shapes the product. Several methodologies for, and difficulties of, ontology integration are known, but they are not based on one sound theory of integration. A clarification of integration is necessary and overdue, and it will also shed light on which types of ontologies are preferable for integration. Within biology, there are few ontologies for ecology, yet the discipline has a tradition of modularised modelling. Hence there is considerable potential for fruitful interdisciplinary research, and one can start ontologising ecological semantics with a tabula rasa; when developing an ontology, this means taking into account factors that will have a positive effect on the success of (future) integration of these ontologies. The aims of this research are twofold. The first is to provide insight into ecological knowledge to make it usable for sound and formal ontology development, by exploiting extant ecological models for bottom-up ontology development and by formalising an integrative ecological concept (the niche) top-down. The second is to structure ontology integration in anticipation of further research and of ontology integration software that will provide guidance for semi-automatic selection of integration methodology and procedures, to ease the processes involved. Such an increase in 'cross-fertilisation' between informatics and ecology will push the boundaries of computing science and improve the effective and efficient reuse of knowledge in ecological research.

Prerequisites for ontology integration, and the factors that influence it, include the types of ontologies, the universe of discourse (UoD), difficulties inherent in biological knowledge, diverging goals, and the consequences of a science versus an engineering perspective. Heterogeneity and sources of mismatches are exacerbated by methodological differences and by inconsistent design decisions during model construction and during the development phases from informal to formal ontologies. Different types of ontologies can be identified according to their level of formalism and type of content, where the potential for semantic interoperability increases with increasing formalisation. Benefits can be gained from unambiguously specifying the distinctions (fiat boundaries) between the different kinds of ontologies. A good understanding of 'what you have' before starting any kind of integration is of prime importance.

Subsequent research into ontology integration revealed that, although ontologists demand that domain experts reach consensus, no such consensus exists for 'ontology integration' and its related concepts such as merging, matching and so forth. Terms, definitions and practices found in a representative sample of the extant literature totalled 24 terms subsumed by 'integration' and 48 definitions and methodologies; these were structured and informally categorised. More input is required from the research community for it to evolve towards a specified shared conceptualisation of integration. In addition, anticipated consequences of integrating different kinds of ontologies of the same, similar, and orthogonal subject domains were formulated. Existing ontology integration applications each provide a partially automated solution to a specific aspect of ontology integration within their chosen implementation language. Compared to automating the heuristics of integration on the semantic level, automation on the system and syntactic levels is relatively straightforward and has been achieved; semi-automation of semantic integration is still a hot research topic. However, for certain goals, especially those involving the biosciences, manual integration may be preferred in order to investigate and elucidate scientific theories, thus functioning as part of the scientific enterprise. The many challenges facing ontology integration are discussed more comprehensively than in previous publications combined. These comprise domain-independent engineering challenges, such as mismatches (e.g. relationship scope, aggregation, versioning), semantics and structure, and the reliability of existing ontologies, as well as philosophical challenges. Overall, 'ontology integration' is more clearly defined and covers a wider scope, but the categorisation is not yet specified in the detail required to create decision support that advises on the optimum integration strategy for a given input and formulated goal, and that explains why. The identified challenges (especially the engineering challenges), the kinds of ontologies and the diverging goals need to be included in these to-be-developed integration heuristics.

Two experiments were conducted: bottom-up ontology development focussed on representing 'flow', and a deconstruction of the integrative concept of the ecological niche. With ecological modelling software such as STELLA, guided bottom-up development of ontologies is possible, aided by the formalised correspondences between STELLA elements and elements of an ontology. Flow was represented as subtypes of perdurants and linked to the bearers of the processes via properties. As such, STELLA serves as an intermediate model that is widely used by ecologists and is translatable to a representation usable by computer scientists. The methodology of extended semantic representations of equations proved useful only in conjunction with a taxonomy, because simple taxonomies on their own lack expressiveness. The semantics of the (biogeochemical) MicrobialLoop model can be fully represented in a formal ontology, created with Protégé and OWL DL, that does justice to the semantics of the 'flow' in the corresponding STELLA model. In addition, the more comprehensive semantics of the ontologies not only give them a higher level of reusability within the UoD, but will also facilitate future ontology integration, because both the PilotPollution and MicrobialLoop ontologies were developed with the same ontological categorisations.
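To make the idea of translating a STELLA model into ontology elements more concrete, the following is a minimal sketch and not the correspondences formalised in the report. It assumes that a STELLA stock maps to an endurant (an assumption; the summary only states that flows are subtypes of perdurants), and the class names `Stock` and `Flow`, the property names `hasSource` and `hasTarget`, and the example elements `Bacteria`, `Grazers` and `Grazing` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stock:
    """A STELLA stock; assumed here to correspond to an endurant."""
    name: str


@dataclass(frozen=True)
class Flow:
    """A STELLA flow between two stocks; represented as a perdurant."""
    name: str
    source: Stock
    target: Stock


def to_triples(flows):
    """Translate flows into sorted (subject, predicate, object) triples."""
    triples = set()
    for flow in flows:
        # The flow itself becomes a subtype of perdurant.
        triples.add((flow.name, "subClassOf", "Perdurant"))
        # The bearers of the process are assumed to be endurants.
        for stock in (flow.source, flow.target):
            triples.add((stock.name, "subClassOf", "Endurant"))
        # Hypothetical properties linking the process to its bearers.
        triples.add((flow.name, "hasSource", flow.source.name))
        triples.add((flow.name, "hasTarget", flow.target.name))
    return sorted(triples)


# Hypothetical model fragment, for illustration only.
bacteria = Stock("Bacteria")
grazers = Stock("Grazers")
grazing = Flow("Grazing", bacteria, grazers)

triples = to_triples([grazing])
for triple in triples:
    print(triple)
```

A plain list of triples keeps the sketch independent of any particular ontology toolkit; in practice the same statements would be expressed in OWL DL, for instance through Protégé as in the report.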

The ecological niche and its underlying structure were formalised in FOL, and most concepts within the niche could be classified using the DOLCE ontological categories. The distinction between the meta-level concept of niche, the abstract definition of a niche with its hypervolume in a multidimensional space, and its realisation(s), together with its underlying structure, provides clarity about the multi-interpretable use of 'niche' in the ecological literature. It is also comprehensive enough to accommodate alternative niche theories, such as neutrality, and extensions like niche construction. A more comprehensive abstract model was devised to separate compound elements, and the requirement for a physical realisation was abandoned in favour of allowing conjectural niches. The applicability of the underlying abstraction of the model was illustrated with a detailed examination of the ecological niche concept, and two brief outlines were provided to demonstrate the potential of the extended abstraction of the niche structure outside the domain of biology. The notions of flow and integrative concepts do influence ontology development, although goals (specified or implicit) may have a larger impact. To represent flow in an ontology, one needs richer, hence more formal, modelling languages to capture the semantics accurately. The niche, being an integrative concept with alternative hypotheses and points of view, affected development particularly with respect to decisions on what to include and what to leave out, and on choosing a formal representation that specifies the theory precisely enough to elicit these subtle differences.
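The separation between an abstract niche (a hypervolume in a multidimensional space) and its optional realisation(s) can be illustrated with a small sketch. This is not the FOL formalisation from the report: it simplifies the hypervolume to an axis-aligned box with one tolerance interval per dimension, and the class name `Niche`, the dimension names and all numeric values are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Niche:
    """Abstract niche: one (low, high) tolerance interval per dimension.

    Simplifying assumption: the hypervolume is an axis-aligned box.
    """
    dimensions: Dict[str, Tuple[float, float]]
    # Realisations are optional, so a conjectural niche is allowed.
    realisations: List[str] = field(default_factory=list)

    def contains(self, conditions: Dict[str, float]) -> bool:
        """True if the conditions lie inside every dimension's interval."""
        return all(
            name in conditions and low <= conditions[name] <= high
            for name, (low, high) in self.dimensions.items()
        )

    @property
    def is_conjectural(self) -> bool:
        """A niche without any physical realisation is conjectural."""
        return not self.realisations


# Hypothetical two-dimensional niche, for illustration only.
niche = Niche(dimensions={"temperature_c": (10.0, 25.0), "ph": (6.0, 8.0)})

inside = niche.contains({"temperature_c": 18.0, "ph": 7.0})
outside = niche.contains({"temperature_c": 30.0, "ph": 7.0})
print(inside, outside, niche.is_conjectural)
```

Keeping `realisations` separate from `dimensions` mirrors the distinction drawn above between the abstract definition and its realisation(s); a richer model would replace the box with an arbitrary region of the multidimensional space.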

Several of the engineering challenges, and most of the envisaged challenges identified from the philosophical analysis of developing ontologies for the biosciences, were encountered and observed during the creation of the PilotPollution and MicrobialLoop ontologies and during the niche specification activities. From the perspective of ontology research, the potentially most interesting result may be the formalisation of the abstraction of the niche structure: through its prospect of reuse outside its original subject domain, it may be considered a candidate for extension of existing ontological categories. One cannot conclude whether bottom-up or top-down ontology development is superior. However, if one desires to pursue the bottom-up approach, incorporating foundational ontological aspects and the highest feasible level of formalisation copes better with moving targets and changing goals, because the semantics are captured more consistently and precisely.

Technical report available on request



For comments:
send an email to keet at inf dot unibz dot it.

This page was created on 2 December 2004