Crowd-annotation and LoD-based semantic indexing of content in multi-disciplinary web repositories to improve search results

Khan, Arshad and Tiropanis, Thanassis and Martin, David (2017) Crowd-annotation and LoD-based semantic indexing of content in multi-disciplinary web repositories to improve search results. In: ACSW '17 Proceedings of the Australasian Computer Science Week Multiconference, January 30 - February 03, 2017, Geelong, Australia.

PDF
a53-khan.pdf
Available under License Creative Commons Attribution.

Official URL: http://dl.acm.org/citation.cfm?id=3014867

Abstract

Searching for relevant information in multi-disciplinary web
repositories is a topic of increasing interest within the
computer science research community. To date, methods and
techniques to extract useful and relevant information from
online repositories of research data have largely been based on
static full text indexing which entails a ‘produce once and use
forever’ kind of strategy. That strategy is fast becoming
insufficient due to increasing data volume, concept
obsolescence, and complexity and heterogeneity of content types
in web repositories. We propose that by automatic semantic
annotation of content in web repositories (using Linked Open
Data, or LoD, sources), without using domain-specific ontologies,
we can sustain the performance of searching by retrieving highly
relevant search results. Secondly, we claim that by expert
crowd-annotation of content on top of automatic semantic
annotation, we can enrich the semantic index over time to
augment the contextual value of content in web repositories so
that it remains findable despite changes in language,
terminology and scientific concepts. We deployed a custom-built
annotation, indexing and searching environment in a web
repository website that has been used by expert annotators to
annotate webpages using free text and vocabulary terms. We
present our findings based on the annotation and tagging data on
top of LoD-based annotations and the overall modus operandi.
We also analyze and demonstrate that by adding expert
annotations to the existing semantic index, we can improve the
relationship between queries and documents, as measured using
Cosine Similarity Measures (CSM).
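
As a rough illustration of the cosine similarity idea mentioned in the abstract, the sketch below compares a query against a page before and after expert tags are added to its indexed text. It is not the authors' implementation: it assumes plain term-frequency vectors with whitespace tokenization, and the query, page text and tags are invented examples.

```python
import math
from collections import Counter


def term_vector(text):
    """Build a lowercased bag-of-words term-frequency vector."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical data: none of these strings come from the paper.
query = term_vector("longitudinal survey attrition")
page_text = "methods for handling dropout in panel studies"
expert_tags = "longitudinal survey attrition"

# Similarity against the page text alone, then against text plus tags.
before = cosine_similarity(query, term_vector(page_text))
after = cosine_similarity(query, term_vector(page_text + " " + expert_tags))

print(f"before annotation: {before:.3f}")
print(f"after annotation:  {after:.3f}")
```

In this toy case the page shares no terms with the query until the tags are added, so the score rises from zero. A real index would typically weight terms (for example with TF-IDF) rather than use raw counts.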

Item Type:	Conference or Workshop Item (Paper)
Subjects:	2. Data Collection > 2.6 Observation; 2. Data Collection > 2.11 Online Data Collection; 7. ICT and Software > 7.3 Technology; 7. ICT and Software > 7.4 ICT and Software (other)
Depositing User:	Mr. Arshad Khan
Date Deposited:	10 Feb 2017 14:43
Last Modified:	14 Jul 2021 14:02
URI:	https://eprints.ncrm.ac.uk/id/eprint/4004
