Search results
1 – 10 of over 1000
Abstract
Purpose
This paper presents a survey of research into interactive robotic systems for the purpose of identifying state-of-the-art capabilities as well as the extant gaps in this emerging field. Communication is multimodal: multimodality combines several modes, chosen for their rhetorical and communicative potential. The author seeks to define the available automation capabilities in multimodal communication that will support a proposed Interactive Robot System (IRS), an AI-mounted robotic platform intended to advance the speed and quality of military operational and tactical decision making.
Design/methodology/approach
This review begins by presenting key developments in the robotic interaction field, with the objective of identifying the essential technological developments that set conditions for robotic platforms to function autonomously. After surveying the key aspects of Human Robot Interaction (HRI), Unmanned Autonomous Systems (UAS), visualization, Virtual Environments (VE) and prediction, the paper describes the gaps in these application areas that will require extension and integration to enable prototyping of the IRS. A brief examination of other work in HRI-related fields concludes with a recapitulation of the IRS challenge that will set conditions for future success.
Findings
Using insights from a balanced cross-section of government, academic and commercial sources that contribute to HRI, a multimodal IRS for military communication is introduced. A multimodal IRS (MIRS) for military communication has yet to be deployed.
Research limitations/implications
A multimodal robotic interface for the MIRS is an interdisciplinary endeavour. It is not realistic that any one person can command all of the expert and related knowledge and skills needed to design and develop such a multimodal interactive robotic interface. In this brief preliminary survey, the author has discussed extant AI, robotics, NLP, CV, VDM and VE applications that are directly related to multimodal interaction. Each mode of this multimodal communication is an active research area. Multimodal human/military robot communication is the ultimate goal of this research.
Practical implications
A multimodal autonomous robot for military communication using speech, images, gestures, VST and VE has yet to be deployed. Autonomous multimodal communication is expected to open wider possibilities for all armed forces. Given the density of the land domain, the army is well positioned to exploit the opportunities for human–machine teaming (HMT) exposure. Naval and air forces will adopt platform-specific suites for specially selected operators to integrate with and leverage this emerging technology. A flexible communications capability that readily adapts to virtual training will greatly enhance planning and mission rehearsals.
Social implications
A multimodal communication system based on interaction, perception, cognition and visualization is still missing. Options to communicate, express and convey information in an HMT setting, with multiple options, suggestions and recommendations, will certainly enhance military communication, strength, engagement, security, cognition and perception, as well as the ability to act confidently for a successful mission.
Originality/value
The objective is to develop a multimodal autonomous interactive robot for military communications. This survey reports the state of the art: what exists and what is missing, what can be done and the possibilities for extension that would support the military in maintaining effective communication using multimodalities. There is separate ongoing progress in areas such as machine-enabled speech, image recognition, tracking, visualization for situational awareness and virtual environments. At this time, there is no integrated approach to multimodal human robot interaction that provides flexible and agile communication. The report briefly introduces a research proposal for a multimodal interactive robot in military communication.
Abstract
Purpose
This paper aims to investigate adolescent English as a foreign language (EFL) learners’ digitally mediated multimodal compositions across different genres of writing.
Design/methodology/approach
Three Korean high school students participated in the study and created multiple multimodal texts over the course of one academic semester. These texts and other materials were the basis for this study’s qualitative case studies. Multiple sources of data (e.g. class observations, demographic surveys, interviews, field notes and students’ artifacts) were collected. Drawing upon the inductive approach, a coding-oriented analysis was used for the collected data. In addition, a multimodal text analysis was conducted for the students’ multimodal texts and their storyboards.
Findings
The study participants’ perceptions of multimodal composing practices seemed to be positively reshaped as a result of their creating multimodal texts. Some participants created multimodal products in phases (e.g. selecting or changing a topic, constructing a storyboard and editing). Notably, although the students’ composing processes shared a seemingly fixed, linear flow from print-based writing to other modalities, those processes proved in practice to be flexible, recursive and/or circular.
Originality/value
This study contributes to the understanding of adolescent English language learners’ multimodal composing practices in the EFL context, which has been underexplored in the literature. It also presents the students’ perspectives on these practices. In short, it provides theoretical and methodological grounds for future L2 literacy researchers to conduct empirical studies on multimodal composing practices.
Abstract
In job advertisements, companies present claims about their organizational identity. My study explores how employers use multimodality in visuals and verbal text to construct organizational identity claims and address potential future employees. Drawing on a multimodal analysis of job advertisements used by German fashion companies between 1968 and 2013, I identify three types of job advertisements and analyze their content and latent meanings. I find three specific relationships between identity claims’ verbal and visual dimensions that also influence viewers’ attraction to, perception of the legitimacy of, and identification with organizations. My study contributes to research on multimodality and on organizational identity claims.
Aman Dua, Rishika Chhabra and Deepankar Sinha
Abstract
Purpose
The first purpose is to assess the quality of containerized multimodal export, and the second is to develop and demonstrate the design of a service network with a quality-based approach.
Design/methodology/approach
The article uses a structural equation model to measure the quality of multimodal transportation for containerized exports, finalizing the model with an alternative approach. An evolutionary algorithm is then used to design a service network based on quality.
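The abstract does not give the evolutionary algorithm's details, but the general idea can be sketched. In the toy below, candidate routes through a small hypothetical network of numbered nodes are scored by a quality fitness function (the product of per-leg quality scores), and a simple evolutionary loop keeps the fitter half of the population and refills it by random replacement. All node numbers, quality values and the fitness form are illustrative assumptions, not the article's actual model.

```python
import random

random.seed(0)

# Hypothetical quality scores for legs between numbered nodes
# (0 = origin, 4 = destination vessel-loading port).
QUALITY = {(0, 1): 0.9, (0, 2): 0.6, (1, 3): 0.7, (2, 3): 0.8,
           (1, 4): 0.5, (3, 4): 0.9, (2, 4): 0.4}

# Candidate service routes (paths from origin to destination).
ROUTES = [[0, 1, 3, 4], [0, 2, 3, 4], [0, 1, 4], [0, 2, 4]]

def fitness(route):
    """Product of leg quality scores: higher means a better-quality service."""
    score = 1.0
    for leg in zip(route, route[1:]):
        score *= QUALITY[leg]
    return score

def evolve(pop, generations=30):
    """Toy evolutionary loop: keep the fitter half, refill by random routes."""
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        survivors = pop[: max(1, len(pop) // 2)]
        pop = survivors + [random.choice(ROUTES)
                           for _ in range(len(pop) - len(survivors))]
    return max(pop, key=fitness)

best = evolve(list(ROUTES))
print(best, round(fitness(best), 3))
```

Because the fittest route always survives selection, the loop converges on the highest-quality path regardless of the random refills.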
Findings
The study identified factors affecting the quality of multimodal transportation and, contrary to one hypothesis, found that the construct "variation in cost, time, shape and quantity" did not affect the quality of multimodal transportation for containerized exports. The model without the variation construct was finalized by exploring causality.
Research limitations/implications
The scope of this research extended only to container loading onto the vessel, and quality was assessed for containerized cargo only. The second research purpose is limited by the assumed values of the fitness function and the limited number of nodes in the service-network design demonstration.
Practical implications
This research provided a tool to measure the quality of multimodal transportation for containerized exports and demonstrated the field application of the model developed in service network design. This approach included all factors applicable across the container movement. The integrated approach of the article provided an organized method to design a service network for containerized exports.
Originality/value
This work provided the tool to assess the quality of multimodal transportation for containerized exports and developed an approach to design a service network of multimodal transportation based on quality. This approach has considered the factors of multimodal transportation comprehensively in contrast to the optimization approaches based on operation research techniques.
Abstract
Purpose
Multimodal writing portfolios were introduced and integrated into an undergraduate course and a graduate course in a research-oriented university in northwest Taiwan. This study aims to examine the influence of multimodal writing portfolios on novice researchers' academic writing.
Design/methodology/approach
Comparative case studies involve collecting data from several cases and analyzing the similarities, differences and patterns across cases (Merriam, 2009). To address this underdeveloped area of research, a comparative case study method was employed to understand undergraduate and graduate students' multimodal writing portfolios in academic writing in two courses in Taiwan.
Findings
First, multimodal writing portfolios enabled novice researchers to become more familiar with the structure of an academic paper, and they performed better in intrapersonal and linguistic aspects. Second, novice researchers held positive attitudes toward multimodal writing portfolios because they regarded the process of making them as preparation for their future academic writing. Finally, participants highly valued the class PowerPoint slides, weekly writing tasks and the instructor's modeling as effective facilitation for making multimodal writing portfolios.
Research limitations/implications
Few studies focus on multimodal writing portfolios (e.g. Silver, 2019). The present case study explores the integration of a multimodal writing portfolio into one undergraduate and one graduate course to examine learners' attitudes and performance in academic writing.
Practical implications
Novice researchers can learn to compose multimodal academic texts for the academic writing community.
Social implications
Suggestions on effective integration of multimodal writing portfolios into academic writing instruction were provided based on the research findings.
Originality/value
The findings of the study provide the field of L2 writing with insights into the pedagogical development of multilingual writing portfolios and help educators to be better prepared for teaching novice researchers to comprehend and compose multimodal texts and enter the academic writing community. The framework in Figure 1 and suggestions on course designs for academic writing can inform educators on the integration of multimodality in academic discourse. Moreover, this study moves beyond general writing courses at the tertiary level and could contribute to L2 writers' deeper understanding of how multimodal writing portfolios can be constructed.
Emily Hellmich, Jill Castek, Blaine E. Smith, Rachel Floyd and Wen Wen
Abstract
Purpose
Multimodal composing is often romanticized as a flexible approach suitable for all learners. There is a lack of research that critically examines students’ perspectives and the constraints of multimodal composing across academic contexts. This study aims to address this need by exploring high school learners’ perspectives and experiences enacting multimodal learning in an L2 classroom. More specifically, this study presents key tensions between students’ experiences of multimodal composing and teacher/researchers’ use of multimodal composition in an L2 classroom setting.
Design/methodology/approach
The paper focuses on two multimodal composing projects developed within a design-based implementation research approach and implemented in a high school French class. Multiple data sources were used: observations; interviews; written reflections; and multimodal compositions. Data were analyzed using the critical incident technique (CIT). A critical incident is one that is unplanned and that stimulates reflection on teaching and learning. Methodologically, CIT was enacted through iterative coding to identify critical incidents and collaborative analysis.
Findings
Using illustrative examples from multiple data sources, this study discusses four tensions between students’ experiences of multimodal composing and teacher/researchers’ use of multimodal composition in a classroom setting: the primary audience of student projects, the media leveraged in student projects, expectations of learning in school and the role of a public viewing of student work.
Originality/value
This paper problematizes basic assumptions and benefits of multimodal composing and offers ideas on how to re-center multimodal composing on student voices.
Abstract
Purpose
The purpose of this paper is to provide an overview of navigational assistive technologies with various sensor modalities and alternative perception approaches for visually impaired people. It also examines the input and output of each technology, and provides a comparison between systems.
Design/methodology/approach
The contributing authors, along with their students, thoroughly read and reviewed the referenced papers under the guidance of domain experts and users, evaluating each paper/technology against a set of metrics adapted from universal and systems design.
Findings
After analyzing 13 multimodal assistive technologies, the authors found that the most popular sensors are optical, infrared, and ultrasonic. Similarly, the most popular actuators are audio and haptic. Furthermore, most systems use a combination of these sensors and actuators. Some systems are niche, while others strive to be universal.
Research limitations/implications
This paper serves as a starting point for further research in benchmarking multimodal assistive technologies for the visually impaired and to eventually cultivate better assistive technologies for all.
Social implications
According to the World Health Organization (2012), there are 39 million blind people. This paper provides insight into what kinds of assistive technologies are available to visually impaired people, whether on the market or in research labs.
Originality/value
This paper provides a comparison across diverse visual assistive technologies. This is valuable to those who are developing assistive technologies and want to know what is available, along with its pros and cons, and to those studying human–computer interfaces.
Juan Yang, Zhenkun Li and Xu Du
Abstract
Purpose
Although numerous signal modalities are available for emotion recognition, audio and visual modalities are the most common and predominant forms for human beings to express their emotional states in daily communication. Therefore, achieving automatic and accurate audiovisual emotion recognition is of great importance for developing an engaging and empathetic human–computer interaction environment. However, two major challenges exist in the field of audiovisual emotion recognition: (1) how to effectively capture representations of each single modality and eliminate redundant features and (2) how to efficiently integrate information from these two modalities to generate discriminative representations.
Design/methodology/approach
A novel key-frame extraction-based attention fusion network (KE-AFN) is proposed for audiovisual emotion recognition. KE-AFN attempts to integrate key-frame extraction with multimodal interaction and fusion to enhance audiovisual representations and reduce redundant computation, filling the research gaps of existing approaches. Specifically, the local maximum–based content analysis is designed to extract key-frames from videos for the purpose of eliminating data redundancy. Two modules, including “Multi-head Attention-based Intra-modality Interaction Module” and “Multi-head Attention-based Cross-modality Interaction Module”, are proposed to mine and capture intra- and cross-modality interactions for further reducing data redundancy and producing more powerful multimodal representations.
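The abstract names local-maximum-based content analysis as the key-frame selection step but gives no algorithmic detail. The sketch below is a simplified, hypothetical stand-in: given a per-frame content-change signal (e.g. an inter-frame histogram difference), it keeps frames that are local maxima and greedily enforces a minimum spacing between kept peaks. The scores, function name and `min_gap` parameter are illustrative assumptions, not KE-AFN's actual procedure.

```python
def extract_key_frames(content_scores, min_gap=2):
    """Pick local maxima of a per-frame content-change signal as key-frames.

    A frame is a candidate when its score exceeds both neighbours; candidates
    closer than ``min_gap`` frames to an already-kept stronger peak are dropped.
    """
    scores = list(content_scores)
    # Candidate peaks: strictly greater than both neighbours.
    peaks = [i for i in range(1, len(scores) - 1)
             if scores[i] > scores[i - 1] and scores[i] > scores[i + 1]]
    # Greedily keep the strongest peaks, enforcing the minimum spacing.
    kept = []
    for i in sorted(peaks, key=lambda i: scores[i], reverse=True):
        if all(abs(i - j) >= min_gap for j in kept):
            kept.append(i)
    return sorted(kept)

# Toy per-frame scores: three change bursts yield three key-frames.
print(extract_key_frames([0.1, 0.9, 0.2, 0.3, 0.8, 0.1, 0.4, 0.95, 0.2]))
```

Discarding near-duplicate peaks is what reduces the redundant computation the abstract refers to: only the kept frames would be passed on to the attention-based fusion modules.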
Findings
Extensive experiments on two benchmark datasets (i.e. RAVDESS and CMU-MOSEI) demonstrate the effectiveness and rationality of KE-AFN. Specifically, (1) KE-AFN is superior to state-of-the-art baselines for audiovisual emotion recognition. (2) Exploring the supplementary and complementary information of different modalities can provide more emotional clues for better emotion recognition. (3) The proposed key-frame extraction strategy can enhance the performance by more than 2.79 per cent on accuracy. (4) Both exploring intra- and cross-modality interactions and employing attention-based audiovisual fusion can lead to better prediction performance.
Originality/value
The proposed KE-AFN can support the development of an engaging and empathetic human–computer interaction environment.
Yanmin Zhou, Zheng Yan, Ye Yang, Zhipeng Wang, Ping Lu, Philip F. Yuan and Bin He
Abstract
Purpose
Vision, audition, olfaction, touch and taste are five important senses that humans use to interact with the real world. As robots face increasingly complex environments, a sensing system with various types of sensors is essential for intelligent robots. To mimic human-like abilities, sensors with perception capabilities similar to humans' are indispensable. However, most research has concentrated only on single-modal sensors and their robotic applications.
Design/methodology/approach
This study presents a systematic review of five bioinspired senses, including a brief introduction to multimodal sensing applications, and predicts current trends and future directions of this field, which may offer continuing insights.
Findings
This review shows that bioinspired sensors can enable robots to better understand the environment, and multiple sensor combinations can support the robot’s ability to behave intelligently.
Originality/value
The review starts with a brief survey of the biological sensing mechanisms of the five senses, followed by their bioinspired electronic counterparts. Their applications in robots are then reviewed as another emphasis, covering the main application scopes of localization and navigation, object identification, dexterous manipulation, compliant interaction and so on. Finally, the trends, difficulties and challenges of this research are discussed to help guide future research on intelligent robot sensors.
Abstract
Purpose
This paper aims to give an overview of a dialogue manager and recent experiments with multimodal human‐robot dialogues.
Design/methodology/approach
The paper identifies requirements and solutions in the design of a human‐robot interface. The paper presents essential techniques for a humanoid robot in a household environment and describes their application to representative interaction scenarios that are based on standard situations for a humanoid robot in a household environment. The presented dialogue manager has been developed within the German collaborative research center SFB‐588 on “Humanoid Robots – Learning and Cooperating Multimodal Robots”. The dialogue system is embedded in a multimodal perceptual system of the humanoid robot developed within this project. The implementation of the dialogue manager is geared to requirements found in the explored scenarios. The algorithms include multimodal fusion, reinforcement learning, knowledge acquisition and tight coupling of dialogue manager and speech recognition.
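The abstract lists multimodal fusion among the dialogue manager's algorithms without specifying how it works. One common late-fusion scheme, sketched below purely for illustration, combines per-modality intent hypotheses (e.g. from speech recognition and gesture recognition) into a single ranking by a weighted sum of confidences. The intent names, scores and weights are all hypothetical; this is not the SFB-588 dialogue manager's actual fusion algorithm.

```python
from collections import defaultdict

def fuse_modalities(hypotheses, weights):
    """Late fusion: combine per-modality intent scores into one decision.

    ``hypotheses`` maps a modality name to {intent: confidence}; ``weights``
    gives each modality's trust. Returns the highest-scoring intent.
    """
    fused = defaultdict(float)
    for modality, intents in hypotheses.items():
        for intent, conf in intents.items():
            fused[intent] += weights.get(modality, 0.0) * conf
    return max(fused, key=fused.get)

# Hypothetical household scenario: speech and gesture partially disagree.
speech = {"bring_cup": 0.6, "bring_plate": 0.4}      # ASR + understanding
gesture = {"bring_cup": 0.2, "point_at_table": 0.8}  # gesture recognizer
print(fuse_modalities({"speech": speech, "gesture": gesture},
                      {"speech": 0.7, "gesture": 0.3}))
```

Weighting lets the dialogue manager trust the more reliable channel: here speech dominates, so the fused decision is the spoken intent even though the gesture recognizer favoured a different one.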
Findings
Within the presented scenarios, several algorithms have been implemented and show improvements in the interactions. Results are reported for scenarios that model typical household situations.
Research limitations/implications
Additional scenarios need to be explored especially in real‐world (out of the lab) experiments.
Practical implications
The paper includes implications for the development of humanoid robots and human‐robot interaction.
Originality/value
This paper explores human‐robot interaction scenarios and describes solutions for dialogue systems.