Search results
1 – 10 of 90
Abstract
Purpose
Recent archiving and curatorial practices have taken advantage of advances in digital technologies, creating immersive and interactive experiences to emphasize the plurality of memory materials, encourage personalized sense-making and extract, manage and share the ever-growing surrounding knowledge. Audiovisual (AV) content, despite its growing importance and popularity, remains less explored in this respect than texts and images. This paper examines the trend of datafication in AV archives and answers the critical question, “What to extract from AV materials and why?”.
Design/methodology/approach
This study roots in a comprehensive state-of-the-art review of digital methods and curatorial practices in AV archives. The thinking model for mapping AV archive data to purposes is based on pre-existing models for understanding multimedia content and metadata standards.
Findings
The thinking model connects AV content descriptors (data perspective) and purposes (curatorial perspective) and provides a theoretical map of how information extracted from AV archives should be fused and embedded for memory institutions. The model is constructed by looking into three broad dimensions of audiovisual content: archival; affective and aesthetic; and social and historical.
Originality/value
This paper contributes uniquely to the intersection of computational archives, audiovisual content and public sense-making experiences. It provides updates and insights for working towards datafied AV archives and for meeting the growing demand for sense-making with AV materials.
Rongen Yan, Depeng Dang, Hu Gao, Yan Wu and Wenhui Yu
Abstract
Purpose
Question answering (QA) systems answer questions posed in natural language. Because users phrase the same information need in different ways, text retrieval becomes more difficult. The purpose of this paper is therefore to explore a new query rewriting method for QA that integrates multiple related questions (RQs) to form an optimal question. Moreover, it is important to generate a new dataset pairing each original query (OQ) with multiple RQs.
Design/methodology/approach
This study collects a new dataset, SQuAD_extend, by crawling a QA community and models the collected OQs with a word graph. Beam search then finds the best path through the graph to obtain the best question. To represent question features in depth, the pretrained BERT model is used to encode sentences.
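The word-graph and beam-search steps described above can be sketched as follows. The graph construction, bigram edge weights and path scoring here are illustrative assumptions, not the paper's exact formulation (which additionally uses BERT features):

```python
# A minimal sketch of merging related questions into a word graph and
# beam-searching the highest-weight path. Edge weights simply count how
# often one word follows another across the questions; the real model's
# scoring is assumed to be richer.

def build_word_graph(questions):
    """Merge questions into a directed word graph with bigram counts."""
    graph = {}
    for q in questions:
        words = ["<s>"] + q.split() + ["</s>"]
        for a, b in zip(words, words[1:]):
            graph.setdefault(a, {})
            graph[a][b] = graph[a].get(b, 0) + 1
    return graph

def beam_search(graph, beam_width=3, max_len=20):
    """Return the highest-weight word path from <s> to </s>."""
    beams = [(0, ["<s>"])]                      # (score, path)
    completed = []
    for _ in range(max_len):
        candidates = []
        for score, path in beams:
            for nxt, weight in graph.get(path[-1], {}).items():
                new = (score + weight, path + [nxt])
                (completed if nxt == "</s>" else candidates).append(new)
        beams = sorted(candidates, reverse=True)[:beam_width]
        if not beams:
            break
    best = max(completed, default=(0, []))
    return " ".join(best[1][1:-1])               # strip <s> and </s>
```

With three related phrasings of the same question, the search favours the wording shared by the majority, which is the intuition behind fusing RQs into one optimal question.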
Findings
The experimental results show three notable findings: (1) the quality of the answers is better after the RQs of the OQs are added; (2) using a word graph to model the question and select the optimal path helps find the best question; and (3) BERT captures the semantics of the question in depth.
Originality/value
The proposed method can use word-graph to construct multiple questions and select the optimal path for rewriting the question, and the quality of answers is better than the baseline. In practice, the research results can help guide users to clarify their query intentions and finally achieve the best answer.
Abdul-Manan Sadick, Argaw Gurmu and Chathuri Gunarathna
Abstract
Purpose
Developing a reliable cost estimate at the early stage of construction projects is challenging due to inadequate project information. Most of the information during this stage is qualitative, posing additional challenges to achieving accurate cost estimates. Additionally, there is a lack of tools that use qualitative project information and forecast the budgets required for project completion. This research, therefore, aims to develop a model for setting project budgets (excluding land) during the pre-conceptual stage of residential buildings, where project information is mainly qualitative.
Design/methodology/approach
Due to the qualitative nature of project information at the pre-conception stage, a natural language processing model, DistilBERT (Distilled Bidirectional Encoder Representations from Transformers), was trained to predict the cost range of residential buildings at the pre-conception stage. The training and evaluation data included 63,899 building permit activity records (2021–2022) from the Victorian State Building Authority, Australia. The input data comprised the project description of each record, which included project location and basic material types (floor, frame, roofing, and external wall).
Findings
This research designed a novel tool for predicting the project budget based on preliminary project information. The model achieved 79% accuracy in classifying residential buildings into three cost classes ($100,000-$300,000, $300,000-$500,000, $500,000-$1,200,000) with F1-scores of 0.85, 0.73, and 0.74, respectively. Additionally, the results show that the model learnt the contextual relationship between qualitative data, such as project location, and cost.
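As an illustration, the three cost classes above can be expressed as a simple bucketing function that would label permit records for training. The boundary handling (lower bound inclusive, upper bound exclusive) is an assumption, since the abstract does not specify it:

```python
# Hypothetical labelling helper for the three cost classes reported in
# the findings. The exact boundary convention is an assumption.

COST_CLASSES = [
    (100_000, 300_000, "$100,000-$300,000"),
    (300_000, 500_000, "$300,000-$500,000"),
    (500_000, 1_200_000, "$500,000-$1,200,000"),
]

def cost_class(cost):
    """Return the label of the cost range containing `cost`, or None."""
    for lo, hi, label in COST_CLASSES:
        if lo <= cost < hi:
            return label
    return None
```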
Research limitations/implications
The current model was developed using data from Victoria state in Australia; hence, it would not return relevant outcomes for other contexts. However, future studies can adopt the methods to develop similar models for their context.
Originality/value
This research is the first to leverage a deep learning model, DistilBERT, for cost estimation at the pre-conception stage using basic project information like location and material types. Therefore, the model would contribute to overcoming data limitations for cost estimation at the pre-conception stage. Residential building stakeholders, like clients, designers, and estimators, can use the model to forecast the project budget at the pre-conception stage to facilitate decision-making.
Lin Xue and Feng Zhang
Abstract
Purpose
With the increasing number of Web services, correct and efficient classification of Web services is crucial to improve the efficiency of service discovery. However, existing Web service classification approaches ignore the class overlap in Web services, resulting in poor accuracy of classification in practice. This paper aims to provide an approach to address this issue.
Design/methodology/approach
This paper proposes a label confusion and prior correction-based Web service classification approach. First, functional semantic representations of Web service descriptions are obtained with BERT. Then, label confusion learning enhances the model's ability to recognize and classify overlapping instances. Finally, the predictions are corrected using the label prior distribution to further improve service classification effectiveness.
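One common form of label-prior correction is to rescale the predicted class probabilities by the training-set prior and renormalize; whether this matches the paper's exact correction step is an assumption:

```python
# Sketch of prior-based correction of classifier outputs. Dividing by
# the training prior down-weights classes the model favours only
# because they dominate the training data. Whether the paper divides
# or multiplies by the prior is an assumption made for illustration.

def prior_correct(probs, prior):
    """Rescale predicted class probabilities by the label prior
    and renormalize so they sum to 1."""
    adjusted = [p / q for p, q in zip(probs, prior)]
    total = sum(adjusted)
    return [a / total for a in adjusted]
```

For example, a model that is indifferent between two classes but was trained on an 80/20 split will, after this correction, lean towards the rarer class.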
Findings
Experiments based on the ProgrammableWeb data set show that the proposed model achieves improvements of 4.3%, 3.2% and 1% in Macro-F1 over ServeNet-BERT, BERT-DPCNN and CARL-NET, respectively.
Originality/value
This paper proposes a Web service classification approach for the overlapping categories of Web services and improves the accuracy of Web service classification.
Hei-Chia Wang, Martinus Maslim and Hung-Yu Liu
Abstract
Purpose
A clickbait is a deceptive headline designed to boost ad revenue without presenting closely relevant content. Clickbait has numerous negative repercussions, such as making viewers feel tricked and unhappy, causing long-term confusion, and even attracting cyber criminals. Automatic detection algorithms have been developed to address this issue, but existing technologies are limited by using a single semantic representation for the same term and by the scarcity of Chinese datasets. This study aims to overcome these limitations of automated clickbait detection for Chinese data.
Design/methodology/approach
This study combines news headlines and news content to train the model to capture the probable relationship between clickbait headlines and article content. In addition, part-of-speech elements are used to generate the most appropriate semantic representation for clickbait detection, improving detection performance.
Findings
This research successfully compiled a dataset containing up to 20,896 Chinese clickbait news articles. This collection contains news headlines, articles, categories and supplementary metadata. The suggested context-aware clickbait detection (CA-CD) model outperforms existing clickbait detection approaches on many criteria, demonstrating the proposed strategy's efficacy.
Originality/value
The originality of this study resides in the newly compiled Chinese clickbait dataset and contextual semantic representation-based clickbait detection approach employing transfer learning. This method can modify the semantic representation of each word based on context and assist the model in more precisely interpreting the original meaning of news articles.
Abstract
This chapter presents a methodological discussion about ethnographic practice from a feminist perspective that contributes to the field of communication studies methodology and theory. The ethnography engages Black (Afro-Brasileiro and African-American) and economically disadvantaged youth from Rio de Janeiro (Brazil) and New Orleans (USA) regarding their strategies of social and media visibility. This multi-sited ethnography proposes to improve the objectives of ethnography through theoretical flexibility, liberation from a priori assumptions, greater representation of the voices of community members, disavowal of the imperatives of positivist work, and abiding respect for the “other.”
Yuhong Peng, Jianwei Ding and Yueyan Zhang
Abstract
Purpose
This study examines the relationship between streamers' product descriptions, customer comments and online sales and focuses on the moderating effect of streamer–viewer relationship strength.
Design/methodology/approach
Between June 2021 and April 2022, the structured data of 965 livestreaming and unstructured text data of 42,956,147 characters from two major live-streaming platforms were collected for the study. Text analysis and regression analysis methods were employed for data analysis.
Findings
First, the authors' analysis reveals an inverted U-shaped relationship between comment length and product sales. Notably, comment volume and comment emotion positively influence product sales. Furthermore, the semantic richness, emotion and readability of streamers' product descriptions also positively influence product sales. Second, the authors find that the strength of the streamer–viewer relationship weakens the positive effects of comment volume and comment emotion without moderating the inverted U-shaped effect of comment length. Finally, the strength of the streamer–viewer relationship also diminishes the positive effects of the emotion, semantics and readability of streamers' product descriptions on product sales.
Originality/value
This study is the first to concurrently examine the direct and interactive effects of user-generated content (UGC) and marketer-generated content (MGC) on consumer purchase behaviors in livestreaming e-commerce, offering a novel perspective on individual decision-making and cue utilization in the social retail context.
Abstract
Purpose
Tender documents, an essential data source for internet-based logistics tendering platforms, incorporate massive fine-grained data, ranging from information on the tenderee to shipping locations and shipping items. Automated information extraction in this area is, however, under-researched, making the extraction process time- and effort-consuming. For Chinese logistics tender entities in particular, existing named entity recognition (NER) solutions are mostly unsuitable, as these entities involve domain-specific terminologies and possess different semantic features.
Design/methodology/approach
To tackle this problem, a novel lattice long short-term memory (LSTM) model, combining a variant contextual feature representation and a conditional random field (CRF) layer, is proposed in this paper for identifying valuable entities in logistics tender documents. Instead of traditional word embeddings, the proposed model uses the pretrained Bidirectional Encoder Representations from Transformers (BERT) model as input to augment the contextual feature representation. Subsequently, with the Lattice-LSTM model, character- and word-level information is effectively utilized to avoid segmentation errors.
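The CRF layer's decoding step can be sketched with a minimal Viterbi decoder over hypothetical emission and transition scores; in the actual model these scores would come from the BERT-augmented Lattice-LSTM, and the tag set here is an illustrative BIO scheme:

```python
# A minimal Viterbi decoder of the kind a CRF layer uses to pick the
# best tag sequence. Emissions and transitions are plain dicts here;
# in the paper's model they are learned scores.

def viterbi(emissions, transitions, tags):
    """Return the highest-scoring tag sequence.

    emissions[t][tag]       : score of `tag` at position t
    transitions[(a, b)]     : score of moving from tag a to tag b
                              (missing pairs are treated as forbidden)
    """
    best = [{tag: (emissions[0][tag], None) for tag in tags}]
    for t in range(1, len(emissions)):
        row = {}
        for tag in tags:
            prev, score = max(
                ((p, best[t - 1][p][0]
                  + transitions.get((p, tag), -1e9)
                  + emissions[t][tag]) for p in tags),
                key=lambda x: x[1])
            row[tag] = (score, prev)
        best.append(row)
    # backtrack from the best final tag
    tag = max(tags, key=lambda g: best[-1][g][0])
    path = [tag]
    for t in range(len(best) - 1, 0, -1):
        tag = best[t][tag][1]
        path.append(tag)
    return path[::-1]
```

Because transition scores can forbid invalid moves (e.g. entering I without a preceding B), the decoder enforces well-formed entity spans, which is the main benefit of a CRF layer over per-token classification.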
Findings
The proposed model is then verified on the Chinese logistics tender named entity corpus. The results suggest that the proposed model outperforms other mainstream NER models on this corpus. The proposed model underpins the automatic extraction of logistics tender information, enabling logistics companies to perceive ever-changing market trends and make far-sighted logistics decisions.
Originality/value
(1) A practical model for logistics tender NER is proposed in the manuscript. By fine-tuning BERT on the downstream task with a small amount of data, the experimental results show that the model performs better than other existing models. This is the first study, to the best of the authors' knowledge, to extract named entities from Chinese logistics tender documents. (2) A real logistics tender corpus for practical use is constructed, and a program for online processing of real logistics tender documents is developed in this work. The authors believe that the model will help logistics companies convert unstructured documents to structured data and further perceive ever-changing market trends to make far-sighted logistics decisions.
Matthew Peebles, Shen Hin Lim, Mike Duke, Benjamin Mcguinness and Chi Kit Au
Abstract
Purpose
Time of flight (ToF) imaging is a promising emerging technology for crop identification. This paper aims to present a localization system for identifying and localizing asparagus in the field based on point clouds from ToF imaging. Since the point cloud carries no semantics, it contains geometric information from objects other than asparagus spears, such as stones and weeds. An approach is therefore required to extract the spear information so that a robotic system can be used for harvesting.
Design/methodology/approach
A real-time convolutional neural network (CNN)-based method is used for filtering the point cloud generated by a ToF camera, allowing subsequent processing methods to operate over smaller and more information-dense data sets, resulting in reduced processing time. The segmented point cloud can then be split into clusters of points representing each individual spear. Geometric filters are developed to eliminate the non-asparagus points in each cluster so that each spear can be modelled and localized. The spear information can then be used for harvesting decisions.
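One plausible geometric filter of the kind described treats a spear as a thin, roughly vertical cylinder and keeps only the points near the cluster's vertical axis. The median-based axis estimate and the radius threshold below are illustrative assumptions, not the paper's exact filters:

```python
# Hypothetical geometric filter for a candidate spear cluster: keep
# (x, y, z) points within `max_radius` metres of a vertical axis
# through the cluster. The median is used to estimate the axis so a
# few stray ground/weed points cannot drag it off the spear.

def _median(vals):
    s = sorted(vals)
    return s[len(s) // 2]

def filter_spear_points(points, max_radius=0.02):
    """Return the points lying within max_radius of the cluster's
    median vertical axis."""
    cx = _median([p[0] for p in points])
    cy = _median([p[1] for p in points])
    return [p for p in points
            if ((p[0] - cx) ** 2 + (p[1] - cy) ** 2) ** 0.5 <= max_radius]
```

The surviving points can then be fitted with a line or cylinder to model the spear and derive a grasp point for the harvester.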
Findings
The localization system is integrated into a robotic harvesting prototype. Several field trials have been conducted with satisfactory performance. Identifying a spear from the point cloud is the key to successful localization; segmentation and the clustering of points into individual spears are the two main failure modes targeted for future improvement.
Originality/value
Most crop localizations in agricultural robotic applications using ToF imaging technology are implemented in a very controlled environment, such as a greenhouse. The target crop and the robotic system are stationary during the localization process. The novel proposed method for asparagus localization has been tested in outdoor farms and integrated with a robotic harvesting platform. Asparagus detection and localization are achieved in real time on a continuously moving robotic platform in a cluttered and unstructured environment.
N. Padmaja, Rajalakshmi Subramaniam and Sanjay Mohapatra