Search results
1 – 10 of 50Ioana Barbantan, Mihaela Porumb, Camelia Lemnaru and Rodica Potolea
Improving healthcare services by developing assistive technologies includes both the health aid devices and the analysis of the data collected by them. The acquired data modeled…
Abstract
Purpose
Improving healthcare services by developing assistive technologies includes both the health aid devices and the analysis of the data collected by them. The acquired data modeled as a knowledge base give more insight into each patient’s health status and needs. Therefore, the ultimate goal of a health-care system is obtaining recommendations provided by an assistive decision support system using such knowledge base, benefiting the patients, the physicians and the healthcare industry. This paper aims to define the knowledge flow for a medical assistive decision support system by structuring raw medical data and leveraging the knowledge contained in the data proposing solutions for efficient data search, medical investigation or diagnosis and medication prediction and relationship identification.
Design/methodology/approach
The solution this paper proposes for implementing a medical assistive decision support system can analyze any type of unstructured medical documents which are processed by applying Natural Language Processing (NLP) tasks followed by semantic analysis, leading to the medical concept identification, thus imposing a structure on the input documents. The structured information is filtered and classified such that custom decisions regarding patients’ health status can be made. The current research focuses on identifying the relationships between medical concepts as defined by the REMed (Relation Extraction from Medical documents) solution that aims at finding the patterns that lead to the classification of concept pairs into concept-to-concept relations.
Findings
This paper proposed the REMed solution expressed as a multi-class classification problem tackled using the support vector machine classifier. Experimentally, this paper determined the most appropriate setup for the multi-class classification problem which is a combination of lexical, context, syntactic and grammatical features, as each feature category is good at representing particular relations, but not all. The best results we obtained are expressed as F1-measure of 74.9 per cent which is 1.4 per cent better than the results reported by similar systems.
Research limitations/implications
The difficulty to discriminate between TrIP and TrAP relations revolves around the hierarchical relationship between the two classes as TrIP is a particular type (an instance) of TrAP. The intuition behind this behavior was that the classifier cannot discern the correct relations because of the bias toward the majority classes. The analysis was conducted by using only sentences from electronic health record that contain at least two medical concepts. This limitation was introduced by the availability of the annotated data with reported results, as relations were defined at sentence level.
Originality/value
The originality of the proposed solution lies in the methodology to extract valuable information from the medical records via semantic searches; concept-to-concept relation identification; and recommendations for diagnosis, treatment and further investigations. The REMed solution introduces a learning-based approach for the automatic discovery of relations between medical concepts. We propose an original list of features: lexical – 3, context – 6, grammatical – 4 and syntactic – 4. The similarity feature introduced in this paper has a significant influence on the classification, and, to the best of the authors’ knowledge, it has not been used as feature in similar solutions.
Details
Keywords
Mulki Indana Zulfa, Rudy Hartanto and Adhistya Erna Permanasari
Internet users and Web-based applications continue to grow every day. The response time on a Web application really determines the convenience of its users. Caching Web content is…
Abstract
Purpose
Internet users and Web-based applications continue to grow every day. The response time on a Web application really determines the convenience of its users. Caching Web content is one strategy that can be used to speed up response time. This strategy is divided into three main techniques, namely, Web caching, Web prefetching and application-level caching. The purpose of this paper is to put forward a literature review of caching strategy research that can be used in Web-based applications.
Design/methodology/approach
The methods used in this paper were as follows: determined the review method, conducted a review process, pros and cons analysis and explained conclusions. The review method is carried out by searching literature from leading journals and conferences. The first search process starts by determining keywords related to caching strategies. To limit the latest literature in accordance with current developments in website technology, search results are limited to the past 10 years, in English only and related to computer science only.
Findings
Note in advance that Web caching and Web prefetching are slightly overlapping techniques because they have the same goal of reducing latency on the user’s side. But actually, the two techniques are motivated by different basic mechanisms. Web caching uses the basic mechanism of cache replacement or the algorithm to change cache objects in memory when the cache capacity is full, whereas Web prefetching uses the basic mechanism of predicting cache objects that can be accessed in the future. This paper also contributes practical guidelines for choosing the appropriate caching strategy for Web-based applications.
Originality/value
This paper conducts a state-of-the art review of caching strategies that can be used in Web applications. Exclusively, this paper presents taxonomy, pros and cons of selected research and discusses data sets that are often used in caching strategy research. This paper also provides another contribution, namely, practical instructions for Web developers to decide the caching strategy.
Details
Keywords
The recent report for the Commission of the European Communities on current multilingual activities in the field of scientific and technical information and the 1977 conference on…
Abstract
The recent report for the Commission of the European Communities on current multilingual activities in the field of scientific and technical information and the 1977 conference on the same theme both included substantial sections on operational and experimental machine translation systems, and in its Plan of action the Commission announced its intention to introduce an operational machine translation system into its departments and to support research projects on machine translation. This revival of interest in machine translation may well have surprised many who have tended in recent years to dismiss it as one of the ‘great failures’ of scientific research. What has changed? What grounds are there now for optimism about machine translation? Or is it still a ‘utopian dream’ ? The aim of this review is to give a general picture of present activities which may help readers to reach their own conclusions. After a sketch of the historical background and general aims (section I), it describes operational and experimental machine translation systems of recent years (section II), it continues with descriptions of interactive (man‐machine) systems and machine‐assisted translation (section III), (and it concludes with a general survey of present problems and future possibilities section IV).
Xiaoming Zhang, Mingming Meng, Xiaoling Sun and Yu Bai
With the advent of the era of Big Data, the scale of knowledge graph (KG) in various domains is growing rapidly, which holds huge amount of knowledge surely benefiting the…
Abstract
Purpose
With the advent of the era of Big Data, the scale of knowledge graph (KG) in various domains is growing rapidly, which holds huge amount of knowledge surely benefiting the question answering (QA) research. However, the KG, which is always constituted of entities and relations, is structurally inconsistent with the natural language query. Thus, the QA system based on KG is still faced with difficulties. The purpose of this paper is to propose a method to answer the domain-specific questions based on KG, providing conveniences for the information query over domain KG.
Design/methodology/approach
The authors propose a method FactQA to answer the factual questions about specific domain. A series of logical rules are designed to transform the factual questions into the triples, in order to solve the structural inconsistency between the user’s question and the domain knowledge. Then, the query expansion strategies and filtering strategies are proposed from two levels (i.e. words and triples in the question). For matching the question with domain knowledge, not only the similarity values between the words in the question and the resources in the domain knowledge but also the tag information of these words is considered. And the tag information is obtained by parsing the question using Stanford CoreNLP. In this paper, the KG in metallic materials domain is used to illustrate the FactQA method.
Findings
The designed logical rules have time stability for transforming the factual questions into the triples. Additionally, after filtering the synonym expansion results of the words in the question, the expansion quality of the triple representation of the question is improved. The tag information of the words in the question is considered in the process of data matching, which could help to filter out the wrong matches.
Originality/value
Although the FactQA is proposed for domain-specific QA, it can also be applied to any other domain besides metallic materials domain. For a question that cannot be answered, FactQA would generate a new related question to answer, providing as much as possible the user with the information they probably need. The FactQA could facilitate the user’s information query based on the emerging KG.
Details
Keywords
Kamal Hamaz and Fouzia Benchikha
With the development of systems and applications, the number of users interacting with databases has increased considerably. The relational database model is still considered as…
Abstract
Purpose
With the development of systems and applications, the number of users interacting with databases has increased considerably. The relational database model is still considered as the most used model for data storage and manipulation. However, it does not offer any semantic support for the stored data which can facilitate data access for the users. Indeed, a large number of users are intimidated when retrieving data because they are non-technical or have little technical knowledge. To overcome this problem, researchers are continuously developing new techniques for Natural Language Interfaces to Databases (NLIDB). Nowadays, the usage of existing NLIDBs is not widespread due to their deficiencies in understanding natural language (NL) queries. In this sense, the purpose of this paper is to propose a novel method for an intelligent understanding of NL queries using semantically enriched database sources.
Design/methodology/approach
First a reverse engineering process is applied to extract relational database hidden semantics. In the second step, the extracted semantics are enriched further using a domain ontology. After this, all semantics are stored in the same relational database. The phase of processing NL queries uses the stored semantics to generate a semantic tree.
Findings
The evaluation part of the work shows the advantages of using a semantically enriched database source to understand NL queries. Additionally, enriching a relational database has given more flexibility to understand contextual and synonymous words that may be used in a NL query.
Originality/value
Existing NLIDBs are not yet a standard option for interfacing a relational database due to their lack for understanding NL queries. Indeed, the techniques used in the literature have their limits. This paper handles those limits by identifying the NL elements by their semantic nature in order to generate a semantic tree. This last is a key solution towards an intelligent understanding of NL queries to relational databases.
Details
Keywords
Victor Diogho Heuer de Carvalho and Ana Paula Cabral Seixas Costa
This article presents two Brazilian Portuguese corpora collected from different media concerning public security issues in a specific location. The primary motivation is…
Abstract
Purpose
This article presents two Brazilian Portuguese corpora collected from different media concerning public security issues in a specific location. The primary motivation is supporting analyses, so security authorities can make appropriate decisions about their actions.
Design/methodology/approach
The corpora were obtained through web scraping from a newspaper's website and tweets from a Brazilian metropolitan region. Natural language processing was applied considering: text cleaning, lemmatization, summarization, part-of-speech and dependencies parsing, named entities recognition, and topic modeling.
Findings
Several results were obtained based on the methodology used, highlighting some: an example of a summarization using an automated process; dependency parsing; the most common topics in each corpus; the forty named entities and the most common slogans were extracted, highlighting those linked to public security.
Research limitations/implications
Some critical tasks were identified for the research perspective, related to the applied methodology: the treatment of noise from obtaining news on their source websites, passing through textual elements quite present in social network posts such as abbreviations, emojis/emoticons, and even writing errors; the treatment of subjectivity, to eliminate noise from irony and sarcasm; the search for authentic news of issues within the target domain. All these tasks aim to improve the process to enable interested authorities to perform accurate analyses.
Practical implications
The corpora dedicated to the public security domain enable several analyses, such as mining public opinion on security actions in a given location; understanding criminals' behaviors reported in the news or even on social networks and drawing their attitudes timeline; detecting movements that may cause damage to public property and people welfare through texts from social networks; extracting the history and repercussions of police actions, crossing news with records on social networks; among many other possibilities.
Originality/value
The work on behalf of the corpora reported in this text represents one of the first initiatives to create textual bases in Portuguese, dedicated to Brazil's specific public security domain.
Details
Keywords
The purpose of this paper is to address the knowledge acquisition bottleneck problem in natural language processing by introducing a new rule‐based approach for the automatic…
Abstract
Purpose
The purpose of this paper is to address the knowledge acquisition bottleneck problem in natural language processing by introducing a new rule‐based approach for the automatic acquisition of linguistic knowledge.
Design/methodology/approach
The author has developed a new machine translation methodology that only requires a bilingual lexicon and a parallel corpus of surface sentences aligned at the sentence level to learn new transfer rules.
Findings
A first prototype of a web‐based Japanese‐English translation system called Japanese‐English translation using corpus‐based acquisition of transfer (JETCAT) has been implemented in SWI‐Prolog, and a Greasemonkey user script to analyze Japanese web pages and translate sentences via Ajax. In addition, linguistic information is displayed at the character, word, and sentence level to provide a useful tool for web‐based language learning. An important feature is customization; the user can simply correct translation results leading to an incremental update of the knowledge base.
Research limitations/implications
This paper focuses on the technical aspects and user interface issues of JETCAT. The author is planning to use JETCAT in a classroom setting to gather first experiences and will then evaluate a real‐world deployment; also work has started on extending JETCAT to include collaborative features.
Practical implications
The research has a high practical impact on academic language education. It also could have implications for the translation industry by superseding certain translation tasks and, on the other hand, adding value and quality to others.
Originality/value
The paper presents an extended version of the paper receiving the Emerald Web Information Systems Best Paper Award at iiWAS2010.
Details
Keywords
Joseph Fong, San Kuen Cheung, Herbert Shiu and Chi Chung Cheung
XML Schema Definition (XSD) is in the logical level of XML model and is used in most web applications. At present, there is no standard format for the conceptual level of XML…
Abstract
XML Schema Definition (XSD) is in the logical level of XML model and is used in most web applications. At present, there is no standard format for the conceptual level of XML model. Therefore, we introduce an XML Tree Model as an XML conceptual schema for representing and confirming the data semantics according to the user requirements in a diagram. The XML Tree Model consists of nodes representing all elements within the XSD. We apply reverse engineering from an XSD to an XML Tree Model to assist end users in applying an XML database for information highway on the Internet. The data semantics recovered for visualization include root element, weak elements, participation, cardinality, aggregation, generalization, categorization, and n‐ary association, and which can be derived by analyzing the structural constraints of XSD based on its key features such as key, keyref, minOccurs, maxOccurs, Choice, Sequence and extension. We use the Eclipse user interface for generating a graphical view for XML conceptual schema.
Details
Keywords
The paper describes research designed to improve automatic pre‐coordinate term indexing by applying powerful general‐purpose language analysis techniques to identify term sources…
Abstract
The paper describes research designed to improve automatic pre‐coordinate term indexing by applying powerful general‐purpose language analysis techniques to identify term sources in requests, and to generate variant expressions of the concepts involved for document text searching.
Jeevan Jacob and Koshy Varghese
The building design processes today are complex, involving many disciplines and issues like collaboration, concurrency and collocation. Several studies have focused on…
Abstract
Purpose
The building design processes today are complex, involving many disciplines and issues like collaboration, concurrency and collocation. Several studies have focused on understanding and modeling formal information exchange in these processes. Few past studies have also identified the importance of informal information exchanges in the design process and proposed passive solutions for facilitating this exchange. The purpose of this paper is to term the informal information as ad hoc information and explores if components of ad hoc information exchanges can be actively managed.
Design/methodology/approach
An MDM-based framework integrating product, process and people dependencies is proposed and a prototype platform to implement this framework is developed. The demonstration on the usage of this platform to identify information paths during collaboration and hence manage ad hoc information exchanges is presented through an example problem.
Findings
Based on the effectiveness of the prototype platform in identifying information paths for design queries, it is concluded that the proposed framework is useful for actively managing some components of ad hoc information exchange.
Originality/value
This research enables the design manager/participants to make a more informed decision on requesting and releasing design information.
Details