For this special issue we invited researchers to explore the possibilities and limitations of Semantic Search. In recent years we have witnessed an impressive advancement of research regarding the theory of semantic technologies as well as the (experimental) development of semantic tools, especially in the W3C context and in the context of the constantly growing Linked Data and DBpedia community (Bizer et al., 2009). This special issue focuses on the question of how semantic technologies, such as information annotation and extraction techniques, resource description (RDF) languages and ontologies, or reasoning engines, are used in Semantic Search.

Far from being exhaustive, the five contributions are a very small selection from a young and vibrant field but, we hope, reflect the diversity of current directions of study in Semantic Search. Research in the five papers ranges across:

1. incorporating Semantic Web schemas and languages such as the Resource Description Framework (RDF) and the SPARQL Protocol and RDF Query Language (SPARQL) to improve typical information retrieval (IR) tasks such as reducing query ambiguity and redundant search results;

2. approaches to overcoming the challenge that diverse or even contradictory Knowledge Organization poses for successful IR tasks;

3. improving the accessibility and enhancing the usability of search results via exploiting the (semantic) structure of the data to supply the user with a convincing search interface and visualization of search results;

4. approaches to better determine the user intent of a search query; and

5. determining inherent sentiments of a statement or document.

Providing the user with pertinent and relevant information in order to satisfy complex information needs and supplying the user with tools to overview, navigate and filter the provided information successfully are the two main functions of Semantic Search that are discussed in this special issue.

The first paper is "A model for ranking entity attributes using DBpedia". An advantage for users searching the Semantic Web and Linked Data via aggregated search engines is that they can access and navigate the Web of Data without having to perform separate search tasks, identify the exact type of entity in advance, or select the specific source of Linked Data to query. The paper proposes an Attribute Importance Model for reducing query ambiguity and redundant attributes. Via SPARQL queries, a range of possible entities underlying a given query is retrieved from DBpedia as a means to help users improve the quality of their search results. The novelty of the approach is that it mainly relies on the attributes of the DBpedia data instead of using the concept hierarchies of the ontology.

Search is augmented by presenting entity type-based query suggestions, computing path-based similarity between query and result based on the DBpedia taxonomy, clustering aggregated attributes, and ranking attributes based on their importance to a given query.
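To make the idea of path-based similarity over a taxonomy more concrete, the following is a minimal sketch, not the formula used in the paper, which this editorial does not reproduce. It assumes a common textbook measure (the inverse of one plus the number of edges on the shortest path through the lowest common ancestor) and a small hypothetical class hierarchy standing in for the DBpedia taxonomy; the names `PARENT`, `ancestors`, and `path_similarity` are illustrative only.

```python
# Hypothetical toy hierarchy (child -> parent); not actual DBpedia data.
PARENT = {
    "Scientist": "Person",
    "Artist": "Person",
    "Person": "Agent",
    "Organisation": "Agent",
    "Agent": "Thing",
    "Place": "Thing",
}


def ancestors(node):
    """Return the chain from node up to the root, starting with node itself."""
    chain = [node]
    while node in PARENT:
        node = PARENT[node]
        chain.append(node)
    return chain


def path_similarity(a, b):
    """Assumed measure: 1 / (1 + edges on the shortest path between a and b)."""
    steps_from_b = {n: i for i, n in enumerate(ancestors(b))}
    for steps_from_a, n in enumerate(ancestors(a)):
        if n in steps_from_b:
            # n is the lowest common ancestor of a and b.
            return 1.0 / (1 + steps_from_a + steps_from_b[n])
    return 0.0  # no shared ancestor in this hierarchy


if __name__ == "__main__":
    for pair in [("Scientist", "Scientist"), ("Scientist", "Artist"), ("Scientist", "Place")]:
        print(pair, round(path_similarity(*pair), 3))
```

Under this assumed measure, two types that share a close ancestor (such as Scientist and Artist under Person) score higher than types that only meet at the root.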

The second paper "Towards maximal unification of semantically diverse ontologies for controversial domains" suggests a solution for one of the main challenges of knowledge modelling, the variability and even contrariness of domain ontologies due to varying points of view of their authors. In a promising conceptual study, the authors suggest an approach to ontology mapping that is not mainly based on matching individual entities but on matching the ontological relations. To this end, a set of unification rules for ontological relations based on ontological reference rules, and lexical and textual entailment is devised.

The effectiveness of the approach has so far only been tested on a small data set but it is to be hoped that the paper inspires further modification and testing of the suggested relational approach in the context of Semantic Search.

The third paper "User-centered design and evaluation of overview components for semantic data exploration" addresses one of the most important requirements for making Semantic Search suitable for everyday use, especially for users without a deeper understanding of underlying Semantic Web formats like RDF or SPARQL. As traditional Information Architecture (IA) techniques do not always scale to large data sets, it can become very difficult for the user to gain an overview of the large heterogeneous data sets typical of the Semantic Web. In the approach presented, established IA components of information presentation like navigation bars, site maps, site indexes, and tree maps are automatically generated from semantic data. Evaluations with end-users showed that the automatically generated navigation features improve the usability and accessibility of the provided semantic tools and can be easily followed by lay-users without requiring knowledge about the underlying semantic technologies.

The fourth paper "Collaborative search using an implicitly formed academic network" deals with the personalization and verifiability of search results based on user preferences. The proposed Collaborative Search System attempts to achieve collaboration by implicitly identifying and reflecting search behaviour, taking into consideration the participant's research presence, the search environment, and the usual web search behaviour of collaborators in the Digital Bibliography & Library Project (DBLP). The individual relevancy of a search result is computed from a Collaborative Hit Matrix that stores the hit count and hit list of page results for queries and is updated after every search session to enhance the collaborative search among the researchers. The approach is new and innovative insofar as the personalization is not restricted to considering the search behaviour of an individual but takes into consideration the special needs of a group of researchers who are members of a research community.
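The description of the Collaborative Hit Matrix can be illustrated with a small sketch. It covers only what the summary above states: hit counts and hit lists are stored per query and updated after each search session. The class and method names, the lower-casing of queries, and the re-ranking by raw hit count are assumptions made for illustration and are not taken from the paper.

```python
from collections import Counter, defaultdict


class CollaborativeHitMatrix:
    """Hypothetical store of per-query page hit counts shared by a group."""

    def __init__(self):
        # query -> Counter mapping page -> number of hits across all sessions
        self._hits = defaultdict(Counter)

    def record_session(self, query, visited_pages):
        """Update the matrix after a search session (assumed update rule)."""
        self._hits[query.lower()].update(visited_pages)

    def hit_list(self, query):
        """Pages previously hit for this query, most frequently hit first."""
        counts = self._hits.get(query.lower(), Counter())
        return [page for page, _ in counts.most_common()]

    def rerank(self, query, results):
        """Move pages the group hit more often towards the top.

        sorted() is stable, so pages with equal counts keep their input order.
        """
        counts = self._hits.get(query.lower(), Counter())
        return sorted(results, key=lambda page: -counts[page])


if __name__ == "__main__":
    matrix = CollaborativeHitMatrix()
    matrix.record_session("semantic search", ["page-a", "page-b"])
    matrix.record_session("Semantic Search", ["page-b"])
    print(matrix.rerank("semantic search", ["page-c", "page-a", "page-b"]))
```

A fuller treatment would also weight hits by the factors the paper names, such as research presence and search environment; those are left out here because the editorial does not describe how they enter the computation.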

The last paper "Sentiment search: an emerging trend in social media monitoring systems" illustrates a special use case and extension of the common understanding of semantic search. It proposes a retrieval framework designed to determine the sentiment associated with a search topic and the retrieval results. Tested on a relatively small test set, the embedded linguistic rules achieved better results than data mining techniques. Whereas the discussed use case, identifying opinions of users when they refer to brands and products, belongs to the area of social media monitoring, the proposed approach, which combines the usage of domain ontologies with specific linguistic rules to handle sentiment terms in textual data, could easily be adopted in other search scenarios.
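As a rough illustration of what a linguistic rule for handling sentiment terms might look like, the sketch below applies a single negation rule on top of a tiny polarity lexicon. Both the lexicon and the rule are hypothetical and far simpler than a real system; the paper's actual rules and its use of domain ontologies are not described in this editorial and are not modelled here.

```python
# Hypothetical polarity lexicon and negators, for illustration only.
POLARITY = {"good": 1, "great": 1, "reliable": 1, "bad": -1, "slow": -1, "broken": -1}
NEGATORS = {"not", "never", "no"}


def sentence_polarity(sentence):
    """Sum term polarities, flipping the next sentiment term after a negator."""
    score = 0
    negate = False
    for token in sentence.lower().split():
        word = token.strip(".,!?;:")
        if word in NEGATORS:
            negate = True
            continue
        if word in POLARITY:
            score += -POLARITY[word] if negate else POLARITY[word]
            negate = False  # a negator applies to one sentiment term only
    return score


if __name__ == "__main__":
    for text in ["The camera is not good.", "Great battery but slow autofocus."]:
        print(text, sentence_polarity(text))
```

Even this one rule shows why rule-based handling can matter: a pure word-count approach would score "not good" as positive.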

Semantic technologies exert an increasing influence on existing Web Search approaches. They are employed to meet typical IR requirements such as sense disambiguation, relevance ranking, and determining the user intent, for example by exploiting user preferences and search behaviour. The increasing supply of semantically enriched structured data as well as the development of techniques to retrieve these data changes the way information can be retrieved, experienced, and modified by the user. Rather than competing directly with the prevailing statistical and probabilistic approaches that are based on an unstructured "bag of words" in IR, semantic technologies in the narrower sense interact with natural language processing (NLP) and with statistical and probabilistic approaches, so that search results are improved by exploiting semantic data and information structures alongside other techniques.

Although beyond the scope of this issue, the potential of semantic technologies goes far beyond search enhancement. Initially the scientific communities appeared to take the lead in using semantic technologies with projects such as the Gene Ontology, while early contributions such as the FOAF (Friend of a Friend) ontology, created by Dan Brickley and Libby Miller, aimed to open up the Semantic Web to a wider constituency. Semantic technologies have now entered the commercial world in force, with applications in areas ranging from legal discovery to customer relations management.

Linked Data has been embraced by the cultural and heritage communities as particularly appropriate for publishing bibliographic and reference information. National Archives and National Libraries are using semantic techniques to open up their catalogues to the public and to enable collaborative projects like Europeana, while the BBC is amongst the organizations that have pioneered the use of semantic technologies online, encouraging the building of dynamic content strategies for web sites such as the Olympics web site. Such potentially culturally significant uses of ontologies demonstrate the need for thinking about their preservation and maintenance over time. For example, in an area of particular interest to the Library and Archival communities, Victoria Cowan at the BBC Archives has proposed a method for treating ontologies as archival artefacts and applying the Open Archival Information System (OAIS) reference model to their management, noting that the existing literature typically focuses on technical rather than socio-cultural aspects of information systems, a tendency that was also reflected in the topics of articles submitted for this special issue.

Whilst the innovation and expansion of the applications of semantics through cultural ontologies such as the CIDOC Conceptual Reference Model, which includes ways of referencing imaginary, literary, and fictional worlds, is well received in the community, philosophical consideration of what we mean by reality remains a specialized area of thought. Experimentation in the arts, exploring semantic techniques that can enhance creative expression through automating aspects of theatrical performance, or working with narrative and structure, as proposed by Paul Rissen of the BBC Mythology Engine project, indicates that the creative future of semantics is bright.

It is clear that one single journal issue can only present a small selection of the range and diversity of work in this exciting field, but we hope merely to signal the importance and potential of semantic technologies for the future and to encourage investigation by as diverse a range of researchers as possible.

Fran Alexander and Dr Ulrike Spree


