Search results

1 – 10 of over 83000
Article
Publication date: 1 February 1993

BRIAN VICKERY and ALINA VICKERY

Abstract

There is a huge amount of information and data stored in publicly available online databases that consist of large text files accessed by Boolean search techniques. It is widely held that less use is made of these databases than could or should be the case, and that one reason for this is that potential users find it difficult to identify which databases to search, to use the various command languages of the hosts and to construct the Boolean search statements required. This reasoning has stimulated a considerable amount of exploration and development work on the construction of search interfaces, to aid the inexperienced user to gain effective access to these databases. The aim of our paper is to review aspects of the design of such interfaces: to indicate the requirements that must be met if maximum aid is to be offered to the inexperienced searcher; to spell out the knowledge that must be incorporated in an interface if such aid is to be given; to describe some of the solutions that have been implemented in experimental and operational interfaces; and to discuss some of the problems encountered. The paper closes with an extensive bibliography of references relevant to online search aids, going well beyond the items explicitly mentioned in the text. An index to software appears after the bibliography at the end of the paper.

Details

Journal of Documentation, vol. 49 no. 2
Type: Research Article
ISSN: 0022-0418

Article
Publication date: 1 March 1998

Robert Gaizauskas and Yorick Wilks

Abstract

In this paper we give a synoptic view of the growth of the text processing technology of information extraction (IE) whose function is to extract information about a pre‐specified set of entities, relations or events from natural language texts and to record this information in structured representations called templates. Here we describe the nature of the IE task, review the history of the area from its origins in AI work in the 1960s and 70s till the present, discuss the techniques being used to carry out the task, describe application areas where IE systems are or are about to be at work, and conclude with a discussion of the challenges facing the area. What emerges is a picture of an exciting new text processing technology with a host of new applications, both on its own and in conjunction with other technologies, such as information retrieval, machine translation and data mining.

Details

Journal of Documentation, vol. 54 no. 1
Type: Research Article
ISSN: 0022-0418

Article
Publication date: 1 February 1991

JOHN F. FARROW

Abstract

Classification, indexing and abstracting can all be regarded as summarisations of the content of a document. A model of text comprehension by indexers (including classifiers and abstractors) is presented, based on task descriptions which indicate that the comprehension of text for indexing differs from normal fluent reading in respect of: operational time constraints, which lead to text being scanned rapidly for perceptual cues to aid gist comprehension; comprehension being task oriented rather than learning oriented, and being followed immediately by the production of an abstract, index, or classification; and the automaticity of processing of text by experienced indexers working within a restricted range of text types. The evidence for the interplay of perceptual and conceptual processing of text under conditions of rapid scanning is reviewed. The allocation of mental resources to text processing is discussed, and a cognitive process model of abstracting, indexing and classification is described.

Details

Journal of Documentation, vol. 47 no. 2
Type: Research Article
ISSN: 0022-0418

Article
Publication date: 1 April 1980

JOHN WHITEHEAD

Abstract

The ‘Office of the Future’, ‘Office Technology’, ‘Word Processing’, ‘Electronic Mail’, ‘Electronic Communications’, ‘Convergence’, ‘Information Management’. These are all terms included in the current list of buzz words used to describe current activities in the office technology area. Open the pages of almost any journal or periodical today and you will probably find an article or some reference to one or more of the above subjects. Long, detailed and highly technical theses are appearing on new techniques to automate and revolutionize the office environment. Facts and figures are quoted ad nauseam on the high current cost of writing a letter, filing letters, memos, reports and documents, trying to communicate with someone by telephone or other telecommunication means and, most significant of all, the high cost of people undertaking these never‐ending tasks. The high level of investment in factories and plants and the ever‐increasing fight to improve productivity by automating the dull, routine jobs are usually quoted and compared with the extremely low investment in improving and automating the equally tedious routine jobs in the office environment; the investment in the factory is quoted as being ten times greater per employee than in the office. This, however, is changing rapidly and investment on a large scale is already taking place in many areas as present‐day inflation bites hard, forcing many companies and organizations to take a much closer look at their office operations.

Details

Journal of Documentation, vol. 36 no. 4
Type: Research Article
ISSN: 0022-0418

Article
Publication date: 16 August 2021

Nael Alqtati, Jonathan A.J. Wilson and Varuna De Silva

Abstract

Purpose

This paper aims to equip professionals and researchers in the fields of advertising, branding, public relations, marketing communications, social media analytics and marketing with a simple, effective and dynamic means of evaluating consumer behavioural sentiments and engagement through Arabic language and script, in vivo.

Design/methodology/approach

Using quantitative and qualitative situational linguistic analyses of Classical Arabic, found in Quranic and other religious texts; Modern Standard Arabic, which is commonly used in formal Arabic channels; and dialectal Arabic, which varies hugely from one Arab country to another, this study analyses rich marketing and consumer messages (tweets) as a basis for developing an Arabic language social media methodological tool.

Findings

Despite the popularity of Arabic language communication on social media platforms across geographies, currently, comprehensive language processing toolkits for analysing Arabic social media conversations have limitations and require further development. Furthermore, due to its unique morphology, developing text understanding capabilities specific to the Arabic language poses challenges.

Practical implications

This study demonstrates the application and effectiveness of the proposed methodology on a random sample of Twitter data from Arabic-speaking regions. Furthermore, as Arabic is the language of Islam, the study is of particular importance to Islamic and Muslim geographies, markets and marketing.

Social implications

The findings suggest that the proposed methodology has a wider potential beyond the data set and health-care sector analysed, and therefore, can be applied to further markets, social media platforms and consumer segments.

Originality/value

To remedy these gaps, this study presents a new methodology and analytical approach to investigating Arabic language social media conversations, which brings together a multidisciplinary knowledge of technology, data science and marketing communications.

Article
Publication date: 31 October 2023

Hong Zhou, Binwei Gao, Shilong Tang, Bing Li and Shuyu Wang

Abstract

Purpose

The number of construction dispute cases has maintained a high growth trend in recent years. The effective exploration and management of construction contract risk can directly promote the overall performance of the project life cycle. Missing clauses may cause a contract to fail to match standard contracts. If a contract modified by the owner omits key clauses, potential disputes may lead to contractors paying substantial compensation. To date, the identification of missing clauses in construction project contracts has relied heavily on manual review, which is inefficient and highly restricted by personnel experience. The existing intelligent means only work for contract query and storage. It is urgent to raise the level of intelligence for contract clause management. Therefore, this paper aims to propose an intelligent method to detect missing clauses in construction project contracts based on Natural Language Processing (NLP) and deep learning technology.

Design/methodology/approach

A complete classification scheme of contract clauses is designed based on NLP. First, construction contract texts are pre-processed and converted from unstructured natural language into structured digital vector form. Following the initial categorization, a multi-label classification of long-text construction contract clauses is designed to make a preliminary identification of whether clause labels are missing. After the multi-label missing-clause detection, the authors implement a clause similarity algorithm that integrates the MatchPyramid model, which draws on ideas from image detection, with BERT to identify missing substantial content in the contract clauses.

Findings

A total of 1,322 construction project contracts were tested. Results showed that the accuracy of multi-label classification could reach 93%, the accuracy of similarity matching could reach 83%, and the recall rate and mean F1 of both could exceed 0.7. The experimental results verify, to some extent, the feasibility of intelligently detecting contract risk through the NLP-based method.

Originality/value

NLP is adept at recognizing textual content and has shown promising results in some contract processing applications. However, most approaches that apply it to risk detection in construction contract clauses are rule-based and encounter challenges when handling intricate and lengthy engineering contracts. This paper introduces an NLP technique based on deep learning which reduces manual intervention and can autonomously identify and tag types of contractual deficiencies, aligning with the evolving complexities anticipated in future construction contracts. Moreover, this method achieves the recognition of extended contract clause texts. Finally, the approach is versatile: users need only adjust parameters, such as segmentation based on language categories, to detect omissions in contract clauses in diverse languages.

Details

Engineering, Construction and Architectural Management, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 0969-9988

Article
Publication date: 14 November 2023

Jihye Park, Min Zhang, Seunghyun Yoo and Hannah Gloria Kwon

Abstract

Purpose

This study investigates the effects of vertical direction and rotation of English loan brand names in East Asian languages (Chinese and Korean) on processing fluency, perceived product quality and purchase intention.

Design/methodology/approach

Four experiments were conducted in China and Korea, employing a 2 (vertical direction: downward vs upward) X 3 (rotation: 0°/marquee vs 90° clockwise vs 90° counterclockwise) between-subjects factorial design.

Findings

The findings showed that when the English loan Chinese brand name was displayed downward, the marquee format was preferred, while counterclockwise rotation was favored when displayed upward. In Korean, clockwise rotation was preferred for downward presentation, while counterclockwise rotation was favored for upward presentation. The effects on purchase intention were mediated by processing fluency and perceived product quality.

Practical implications

This research provides practical implications for global manufacturers and retailers, offering guidance on presenting brand names in East Asian languages and optimizing product packaging designs. For Chinese consumers, the marquee format is recommended for downward-oriented brand names, while counterclockwise rotation is effective for upward orientation. For Korean consumers, clockwise rotation is favored for downward presentation and counterclockwise rotation is preferred for upward presentation. Understanding linguistic habits allows the tailoring of brand presentations, enhancing brand perception and consumer responses.

Originality/value

This study contributes to understanding the role of cultural and linguistic influences on consumer information processing and product perception in vertical presentations of brand names.

Details

Asia Pacific Journal of Marketing and Logistics, vol. 36 no. 5
Type: Research Article
ISSN: 1355-5855

Article
Publication date: 6 February 2024

Somayeh Tamjid, Fatemeh Nooshinfard, Molouk Sadat Hosseini Beheshti, Nadjla Hariri and Fahimeh Babalhavaeji

Abstract

Purpose

The purpose of this study is to develop a domain independent, cost-effective, time-saving and semi-automated ontology generation framework that could extract taxonomic concepts from unstructured text corpus. In the human disease domain, ontologies are found to be extremely useful for managing the diversity of technical expressions in favour of information retrieval objectives. The boundaries of these domains are expanding so fast that it is essential to continuously develop new ontologies or upgrade available ones.

Design/methodology/approach

This paper proposes a semi-automated approach that extracts entities/relations via text mining of scientific publications. Code named text mining-based ontology (TmbOnt) is generated to assist a user in capturing, processing and establishing ontology elements. This code takes a collection of unstructured text files as input and projects them into high-valued entities or relations as output. As the approach is semi-automated, a user supervises the process, filters meaningful predecessor/successor phrases and finalizes the required ontology-taxonomy. To verify the practical capabilities of the scheme, a case study was performed to derive a glaucoma ontology-taxonomy. For this purpose, text files containing 10,000 records were collected from PubMed.

Findings

The proposed approach processed over 3.8 million tokenized terms of those records and yielded the resultant glaucoma ontology-taxonomy. Compared with two famous disease ontologies, TmbOnt-driven taxonomy demonstrated a 60%–100% coverage ratio against famous medical thesauruses and ontology taxonomies, such as Human Disease Ontology, Medical Subject Headings and National Cancer Institute Thesaurus, with an average of 70% additional terms recommended for ontology development.

Originality/value

According to the literature, the proposed scheme demonstrated novel capability in expanding the ontology-taxonomy structure with a semi-automated text mining approach, aiming for future fully-automated approaches.

Details

The Electronic Library, vol. 42 no. 2
Type: Research Article
ISSN: 0264-0473

Article
Publication date: 1 January 1993

Ankie Visschedijk and Forbes Gibb

Abstract

This article reviews some of the more unconventional text retrieval systems, emphasising those which have been commercialised. These sophisticated systems improve on conventional retrieval by using either innovative software or hardware to increase retrieval speed or functionality, precision or recall. The software systems reviewed are: AIDA, CLARIT, Metamorph, SIMPR, STATUS/IQ, TCS, TINA and TOPIC. The hardware systems reviewed are: CAFS‐ISP, the Connection Machine, GESCAN, HSTS, MPP, TEXTRACT, TRW‐FDF and URSA.

Details

Online and CD-Rom Review, vol. 17 no. 1
Type: Research Article
ISSN: 1353-2642

Article
Publication date: 1 March 1990

Forbes Gibb and Godfrey Smart

Abstract

SIMPR (Structured Information Management: Processing and Retrieval) is an ESPRIT II Project aiming to achieve technological advances in information management. This new technology is instantiated in the SIMPR software system. SIMPR will process documents by indexing them and classifying their subjects, before storing them in an electronic information base from which they can then be retrieved using simple natural language search requests. Building this system has required initiatives in automatic indexing, in language analysis, in subject classification and in machine learning. These initiatives are discussed in this paper, in the context of the strategy and achievements to date of the SIMPR Project.

Details

Online Review, vol. 14 no. 3
Type: Research Article
ISSN: 0309-314X
