Search results

1 – 10 of 90
Article
Publication date: 5 March 2024

Yuchen Yang

Abstract

Purpose

Recent archiving and curatorial practices have taken advantage of advances in digital technologies, creating immersive and interactive experiences to emphasize the plurality of memory materials, encourage personalized sense-making and extract, manage and share the ever-growing surrounding knowledge. Audiovisual (AV) content, despite its growing importance and popularity, is less explored in this respect than texts and images. This paper examines the trend of datafication in AV archives and answers the critical question, “What to extract from AV materials and why?”.

Design/methodology/approach

This study is rooted in a comprehensive state-of-the-art review of digital methods and curatorial practices in AV archives. The thinking model for mapping AV archive data to purposes builds on pre-existing models for understanding multimedia content and on metadata standards.

Findings

The thinking model connects AV content descriptors (the data perspective) with purposes (the curatorial perspective) and provides a theoretical map of how information extracted from AV archives should be fused and embedded for memory institutions. The model is constructed around three broad dimensions of audiovisual content: the archival; the affective and aesthetic; and the social and historical.

Originality/value

This paper contributes uniquely to the intersection of computational archives, audiovisual content and public sense-making experiences. It provides updates and insights for working towards datafied AV archives and for meeting the growing sense-making needs that draw on AV archives.

Details

Journal of Documentation, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 0022-0418

Article
Publication date: 18 May 2023

Rongen Yan, Depeng Dang, Hu Gao, Yan Wu and Wenhui Yu

Abstract

Purpose

Question answering (QA) systems answer questions that people pose in natural language. Because users are subjective, they phrase the same query in different ways, which increases the difficulty of text retrieval. Therefore, the purpose of this paper is to explore a new query rewriting method for QA that integrates multiple related questions (RQs) to form an optimal question. Moreover, it is important to generate a new dataset of original queries (OQs), each paired with multiple RQs.

Design/methodology/approach

This study collects a new dataset, SQuAD_extend, by crawling a QA community and uses a word graph to model the collected OQs. Next, beam search finds the best path through the graph to obtain the best question. To represent the features of the question more deeply, the pretrained BERT model is used to model the sentences.
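
A minimal sketch of the word-graph idea described here, assuming a simple bigram-count graph over the related questions and a plain beam search; function names such as build_word_graph are illustrative, not the authors' implementation:

```python
# Merge related questions into a directed word graph whose edge weights count
# how often one word follows another; beam search then extracts a high-weight
# path as the rewritten question. A sketch, not the paper's exact algorithm.
from collections import defaultdict

START, END = "<s>", "</s>"

def build_word_graph(questions):
    graph = defaultdict(lambda: defaultdict(int))
    for q in questions:
        tokens = [START] + q.lower().split() + [END]
        for prev, nxt in zip(tokens, tokens[1:]):
            graph[prev][nxt] += 1
    return graph

def beam_search(graph, beam_width=3, max_len=20):
    beams = [([START], 0.0)]
    finished = []
    for _ in range(max_len):
        candidates = []
        for path, score in beams:
            for nxt, weight in graph[path[-1]].items():
                new = (path + [nxt], score + weight)
                (finished if nxt == END else candidates).append(new)
        if not candidates:
            break
        beams = sorted(candidates, key=lambda b: b[1], reverse=True)[:beam_width]
    best = max(finished + beams, key=lambda b: b[1])
    return " ".join(w for w in best[0] if w not in (START, END))

related = ["how do i install python on windows",
           "how to install python on windows 10",
           "how can i install python"]
print(beam_search(build_word_graph(related)))  # a merged, rewritten question
```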

Findings

The experimental results show three outstanding findings. (1) The quality of the answers is better after adding the RQs of the OQs. (2) The word graph used to model the question and choose the optimal path is conducive to finding the best question. (3) BERT can deeply characterize the semantics of the exact question.

Originality/value

The proposed method uses a word graph to construct multiple questions and selects the optimal path for rewriting the question, and the quality of the answers is better than the baseline. In practice, the research results can help guide users to clarify their query intentions and ultimately obtain the best answer.

Details

Data Technologies and Applications, vol. 58 no. 1
Type: Research Article
ISSN: 2514-9288

Article
Publication date: 25 April 2024

Abdul-Manan Sadick, Argaw Gurmu and Chathuri Gunarathna

Abstract

Purpose

Developing a reliable cost estimate at the early stage of construction projects is challenging due to inadequate project information. Most of the information during this stage is qualitative, posing additional challenges to achieving accurate cost estimates. Additionally, there is a lack of tools that use qualitative project information and forecast the budgets required for project completion. This research, therefore, aims to develop a model for setting project budgets (excluding land) during the pre-conceptual stage of residential buildings, where project information is mainly qualitative.

Design/methodology/approach

Due to the qualitative nature of project information at the pre-conception stage, a natural language processing model, DistilBERT (Distilled Bidirectional Encoder Representations from Transformers), was trained to predict the cost range of residential buildings at the pre-conception stage. The training and evaluation data included 63,899 building permit activity records (2021–2022) from the Victorian State Building Authority, Australia. The input data comprised the project description of each record, which included project location and basic material types (floor, frame, roofing, and external wall).
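
A hedged sketch of this classification setup using the Hugging Face transformers API; the label names and the toy project description are illustrative stand-ins, since the actual training data are Victorian building permit records:

```python
# DistilBERT fine-tuned for three cost classes, as described above. This shows
# only the model/tokenizer wiring and a single prediction; training on the
# permit records is omitted. Not the authors' released code.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

LABELS = ["$100k-$300k", "$300k-$500k", "$500k-$1.2m"]  # assumed label order
tok = AutoTokenizer.from_pretrained("distilbert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=len(LABELS))

desc = ("New double-storey dwelling in Geelong; timber frame, "
        "concrete floor, metal roofing, brick veneer walls")
inputs = tok(desc, return_tensors="pt", truncation=True)
with torch.no_grad():
    logits = model(**inputs).logits
print(LABELS[logits.argmax(-1).item()])  # untrained weights: arbitrary class
```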

Findings

This research designed a novel tool for predicting the project budget based on preliminary project information. The model achieved 79% accuracy in classifying residential buildings into three cost classes ($100,000-$300,000, $300,000-$500,000 and $500,000-$1,200,000), with F1-scores of 0.85, 0.73 and 0.74, respectively. Additionally, the results show that the model learnt the contextual relationship between qualitative data, such as project location, and cost.

Research limitations/implications

The current model was developed using data from Victoria state in Australia; hence, it would not return relevant outcomes for other contexts. However, future studies can adopt the methods to develop similar models for their context.

Originality/value

This research is the first to leverage a deep learning model, DistilBERT, for cost estimation at the pre-conception stage using basic project information like location and material types. Therefore, the model would contribute to overcoming data limitations for cost estimation at the pre-conception stage. Residential building stakeholders, like clients, designers, and estimators, can use the model to forecast the project budget at the pre-conception stage to facilitate decision-making.

Details

Engineering, Construction and Architectural Management, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 0969-9988

Article
Publication date: 6 February 2024

Lin Xue and Feng Zhang

Abstract

Purpose

With the increasing number of Web services, correct and efficient classification of Web services is crucial to improve the efficiency of service discovery. However, existing Web service classification approaches ignore the class overlap in Web services, resulting in poor accuracy of classification in practice. This paper aims to provide an approach to address this issue.

Design/methodology/approach

This paper proposes a label confusion and prior correction-based Web service classification approach. First, functional semantic representations of Web service descriptions are obtained with BERT. Then, the model's ability to recognize and classify overlapping instances is enhanced using label confusion learning techniques. Finally, the predictions are corrected based on the label prior distribution to further improve service classification effectiveness.
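
A minimal sketch of the final prior-correction step under one common reading of it, namely rescaling the predicted distribution by the label priors and renormalizing; the paper's exact correction rule may differ:

```python
# Correcting a predicted class distribution with the label prior distribution.
# Dividing by the priors counteracts class imbalance by shifting probability
# mass toward under-represented labels. Illustrative, not the authors' rule.
import numpy as np

def prior_correct(probs, priors):
    """probs: model output distribution; priors: label frequencies."""
    corrected = probs / priors          # down-weight over-represented labels
    return corrected / corrected.sum()  # renormalize to a distribution

probs  = np.array([0.50, 0.30, 0.20])  # model's predicted distribution
priors = np.array([0.70, 0.20, 0.10])  # class frequencies in training data
print(prior_correct(probs, priors))    # rare classes gain probability mass
```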

Findings

Experiments on the ProgrammableWeb data set show that the proposed model improves the Macro-F1 value by 4.3%, 3.2% and 1% over ServeNet-BERT, BERT-DPCNN and CARL-NET, respectively.

Originality/value

This paper proposes a Web service classification approach that addresses overlapping categories of Web services and improves the accuracy of Web service classification.

Details

International Journal of Web Information Systems, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 1744-0084

Article
Publication date: 29 August 2023

Hei-Chia Wang, Martinus Maslim and Hung-Yu Liu

Abstract

Purpose

Clickbait is a deceptive headline designed to boost ad revenue without presenting closely relevant content. Clickbait has numerous negative repercussions, such as leaving viewers feeling tricked and unhappy, causing long-term confusion and even attracting cybercriminals. Automatic detection algorithms for clickbait have been developed to address this issue. However, existing detection technologies are limited by using only one semantic representation for the same term and by the scarcity of Chinese datasets. This study aims to overcome these limitations of automated clickbait detection for Chinese data.

Design/methodology/approach

This study combines news headlines and news content to train a model that captures the probable relationship between clickbait headlines and the underlying articles. In addition, part-of-speech elements are used to generate the most appropriate semantic representation for clickbait detection, improving clickbait detection performance.
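
A minimal sketch of the headline-content pairing idea, assuming a standard BERT sentence-pair classifier; the model choice, labels and English toy example are illustrative (the paper's dataset is Chinese), and this is not the CA-CD architecture itself:

```python
# Encode headline and article body as a BERT sentence pair so a classifier
# can judge whether the body actually supports the headline. A sketch of the
# general technique only; fine-tuning on labelled pairs is omitted.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2)  # assumed: 0 = legitimate, 1 = clickbait

headline = "You won't believe what happened next"
body = "The article describes an ordinary city council meeting."
inputs = tok(headline, body, return_tensors="pt",
             truncation=True, max_length=512)
with torch.no_grad():
    logits = model(**inputs).logits
print("clickbait" if logits.argmax(-1).item() == 1 else "legitimate")
```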

Findings

This research successfully compiled a dataset containing up to 20,896 Chinese clickbait news articles. This collection contains news headlines, articles, categories and supplementary metadata. The suggested context-aware clickbait detection (CA-CD) model outperforms existing clickbait detection approaches on many criteria, demonstrating the proposed strategy's efficacy.

Originality/value

The originality of this study resides in the newly compiled Chinese clickbait dataset and contextual semantic representation-based clickbait detection approach employing transfer learning. This method can modify the semantic representation of each word based on context and assist the model in more precisely interpreting the original meaning of news articles.

Details

Data Technologies and Applications, vol. 58 no. 2
Type: Research Article
ISSN: 2514-9288

Book part
Publication date: 28 March 2024

Aline Maia

Abstract

This chapter presents a methodological discussion about ethnographic practice from a feminist perspective that contributes to the field of communication studies methodology and theory. The ethnography engages Black (Afro-Brasileiro and African-American) and economically disadvantaged youth from Rio de Janeiro (Brazil) and New Orleans (USA) regarding their strategies of social and media visibility. This multi-sited ethnography proposes to improve the objectives of ethnography through theoretical flexibility, liberation from a priori assumptions, greater representation of the voices of community members, disavowal of the imperatives of positivist work, and abiding respect for the “other.”

Details

Geo Spaces of Communication Research
Type: Book
ISBN: 978-1-80071-606-3

Article
Publication date: 15 December 2023

Yuhong Peng, Jianwei Ding and Yueyan Zhang

Abstract

Purpose

This study examines the relationship between streamers' product descriptions, customer comments and online sales and focuses on the moderating effect of streamer–viewer relationship strength.

Design/methodology/approach

Between June 2021 and April 2022, structured data on 965 livestreaming sessions and unstructured text data totalling 42,956,147 characters were collected from two major livestreaming platforms. Text analysis and regression analysis methods were employed for the data analysis.
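
A minimal sketch of the regression form that the inverted U-shaped finding below implies, namely adding a squared comment-length term; the variable names and synthetic data are illustrative, not the paper's specification:

```python
# Testing an inverted U-shape: regress sales on comment length and its square.
# A negative, significant coefficient on the squared term indicates the
# relationship rises and then falls. Synthetic data stand in for the real set.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 500
length = rng.uniform(0, 100, n)              # comment length
volume = rng.poisson(30, n)                  # comment volume
sales = 2.0 * length - 0.02 * length**2 + 0.5 * volume + rng.normal(0, 5, n)

X = sm.add_constant(np.column_stack([length, length**2, volume]))
fit = sm.OLS(sales, X).fit()
print(fit.params)  # negative length**2 coefficient => inverted U-shape
```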

Findings

First, the authors' analysis reveals an inverted U-shaped relationship between comment length and product sales. Notably, comment volume and comment emotion positively influence product sales. Furthermore, the semantic richness, emotion and readability of streamers' product descriptions also positively influence product sales. Second, the authors find that the strength of the streamer–viewer relationship weakens the positive effects of comment volume and comment emotion without moderating the inverted U-shaped effect of comment length. Finally, the strength of the streamer–viewer relationship also diminishes the positive effects of the emotion, semantics and readability of streamers' product descriptions on product sales.

Originality/value

This study is the first to concurrently examine the direct and interactive effects of user-generated content (UGC) and marketer-generated content (MGC) on consumer purchase behaviors in livestreaming e-commerce, offering a novel perspective on individual decision-making and cue utilization in the social retail context.

Details

Marketing Intelligence & Planning, vol. 42 no. 1
Type: Research Article
ISSN: 0263-4503

Article
Publication date: 5 May 2023

Ying Yu and Jing Ma

Abstract

Purpose

Tender documents, an essential data source for internet-based logistics tendering platforms, incorporate massive amounts of fine-grained data, ranging from information on the tenderee to shipping locations and shipping items. Automated information extraction in this area is, however, under-researched, making the extraction process time- and effort-consuming. For Chinese logistics tender entities in particular, existing named entity recognition (NER) solutions are mostly unsuitable, as the entities involve domain-specific terminologies and possess different semantic features.

Design/methodology/approach

To tackle this problem, a novel lattice long short-term memory (LSTM) model, combining a variant contextual feature representation and a conditional random field (CRF) layer, is proposed in this paper for identifying valuable entities in logistics tender documents. Instead of traditional word embeddings, the proposed model uses the pretrained Bidirectional Encoder Representations from Transformers (BERT) model as input to augment the contextual feature representation. Subsequently, with the lattice LSTM model, character and word information is used effectively to avoid segmentation errors.
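
A hedged sketch of the tagging backbone described here: BERT embeddings feed a bidirectional LSTM whose emissions are decoded by a CRF layer. The character-word lattice mechanism of the lattice LSTM is omitted for brevity, and the tag set is an assumption; requires the pytorch-crf package:

```python
# BERT -> BiLSTM -> CRF sequence tagger, a simplified stand-in for the paper's
# BERT + Lattice-LSTM + CRF model. Training and the lattice are omitted.
import torch
import torch.nn as nn
from torchcrf import CRF
from transformers import AutoTokenizer, AutoModel

class BertLstmCrf(nn.Module):
    def __init__(self, num_tags, hidden=256):
        super().__init__()
        self.bert = AutoModel.from_pretrained("bert-base-chinese")
        self.lstm = nn.LSTM(self.bert.config.hidden_size, hidden // 2,
                            bidirectional=True, batch_first=True)
        self.emit = nn.Linear(hidden, num_tags)
        self.crf = CRF(num_tags, batch_first=True)

    def forward(self, input_ids, attention_mask):
        hidden_states = self.bert(
            input_ids, attention_mask=attention_mask).last_hidden_state
        reps, _ = self.lstm(hidden_states)
        return self.emit(reps)  # per-token emission scores for the CRF

tok = AutoTokenizer.from_pretrained("bert-base-chinese")
model = BertLstmCrf(num_tags=7)  # assumed BIO tags: tenderee/location/item
enc = tok("某物流公司招标运输钢材至上海", return_tensors="pt")
with torch.no_grad():
    emissions = model(enc["input_ids"], enc["attention_mask"])
print(model.crf.decode(emissions))  # best tag sequence per token
```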

Findings

The proposed model is verified on a Chinese logistics tender named entity corpus. Moreover, the results suggest that the proposed model outperforms other mainstream NER models on the logistics tender corpus. The proposed model underpins the automatic extraction of logistics tender information, enabling logistics companies to perceive ever-changing market trends and make far-sighted logistics decisions.

Originality/value

(1) A practical model for logistics tender NER is proposed in the manuscript. By employing BERT and fine-tuning it on the downstream task with a small amount of data, the experimental results show that the model performs better than other existing models. To the best of the authors' knowledge, this is the first study to extract named entities from Chinese logistics tender documents. (2) A real logistics tender corpus for practical use is constructed, and a program for online processing of real logistics tender documents is developed in this work. The authors believe that the model will help logistics companies convert unstructured documents into structured data and further perceive ever-changing market trends to make far-sighted logistics decisions.

Details

Data Technologies and Applications, vol. 58 no. 1
Type: Research Article
ISSN: 2514-9288

Article
Publication date: 8 April 2024

Matthew Peebles, Shen Hin Lim, Mike Duke, Benjamin Mcguinness and Chi Kit Au

Abstract

Purpose

Time of flight (ToF) imaging is a promising emerging technology for crop identification. This paper aims to present a localization system for identifying and localizing asparagus in the field based on point clouds from ToF imaging. Because the point cloud carries no semantics, it contains the geometric information of objects other than asparagus spears, such as stones and weeds. An approach is therefore required for extracting the spear information so that a robotic system can be used for harvesting.

Design/methodology/approach

A real-time convolutional neural network (CNN)-based method is used to filter the point cloud generated by a ToF camera, allowing subsequent processing to operate on smaller and more information-dense data sets and reducing processing time. The segmented point cloud is then split into clusters of points, each representing an individual spear. Geometric filters are developed to eliminate the non-asparagus points in each cluster so that each spear can be modelled and localized. The spear information can then be used for harvesting decisions.
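
A minimal sketch of the clustering and geometric-filtering steps, assuming DBSCAN for clustering and simple height and verticality thresholds; all thresholds and helper names are illustrative, not the authors' filters:

```python
# Cluster CNN-filtered points into candidate spears, then reject clusters
# that are too short or too spread out horizontally to be upright spears.
# Units assumed to be metres; thresholds are illustrative guesses.
import numpy as np
from sklearn.cluster import DBSCAN

def spear_candidates(points, eps=0.02, min_pts=30,
                     min_height=0.05, max_tilt=0.4):
    labels = DBSCAN(eps=eps, min_samples=min_pts).fit_predict(points)
    spears = []
    for lab in set(labels) - {-1}:                 # -1 marks DBSCAN noise
        cluster = points[labels == lab]
        height = np.ptp(cluster[:, 2])             # vertical extent (z axis)
        spread = cluster[:, :2].std(axis=0).max()  # horizontal spread
        if height >= min_height and spread / height <= max_tilt:
            spears.append(cluster)
    return spears

cloud = np.random.rand(1000, 3) * [0.5, 0.5, 0.2]  # stand-in for ToF points
print(len(spear_candidates(cloud)))
```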

Findings

The localization system is integrated into a robotic harvesting prototype. Several field trials have been conducted with satisfactory performance. Identifying a spear in the point cloud is the key to successful localization; segmenting and clustering points into individual spears are the two major failure modes targeted for future improvement.

Originality/value

Most crop localization in agricultural robotic applications using ToF imaging technology is implemented in very controlled environments, such as greenhouses, where the target crop and the robotic system are stationary during the localization process. The novel proposed method for asparagus localization has been tested in outdoor farms and integrated with a robotic harvesting platform. Asparagus detection and localization are achieved in real time on a continuously moving robotic platform in a cluttered and unstructured environment.

Details

Industrial Robot: the international journal of robotics research and application, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 0143-991X

Details

Big Data Analytics for the Prediction of Tourist Preferences Worldwide
Type: Book
ISBN: 978-1-83549-339-7
