Search results

1 – 10 of over 9000
Open Access
Article
Publication date: 9 October 2023

Aya Khaled Youssef Sayed Mohamed, Dagmar Auer, Daniel Hofer and Josef Küng

Abstract

Purpose

Data protection requirements have increased heavily due to rising awareness of data security, legal requirements and technological developments. Today, NoSQL databases are increasingly used in security-critical domains. Current survey works on databases and data security consider authorization and access control only in a very general way and do not address most of today's sophisticated requirements. Accordingly, the purpose of this paper is to discuss authorization and access control for relational and NoSQL database models in detail with respect to requirements and the current state of the art.
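
For readers unfamiliar with the mechanisms being surveyed, the following minimal Python sketch illustrates a combined role- and attribute-based authorization check of the general kind such requirements cover; all classes, rules and identifiers are illustrative assumptions, not the paper's model.

```python
# Minimal sketch (not from the paper): a combined role-based (RBAC) and
# attribute-based (ABAC) access decision. All names are illustrative.
from dataclasses import dataclass, field

@dataclass
class Subject:
    user_id: str
    roles: set = field(default_factory=set)
    clearance: int = 0

@dataclass
class Resource:
    resource_id: str
    owner: str
    sensitivity: int = 0

def is_authorized(subject: Subject, action: str, resource: Resource) -> bool:
    """Combine role-based and attribute-based rules into one decision."""
    if "admin" in subject.roles:                          # RBAC rule
        return True
    if action == "read":
        return subject.clearance >= resource.sensitivity  # ABAC rule
    if action == "write":
        return subject.user_id == resource.owner          # ownership rule
    return False                                          # deny by default

alice = Subject(user_id="alice", roles={"analyst"}, clearance=2)
doc = Resource(resource_id="r1", owner="bob", sensitivity=1)
print(is_authorized(alice, "read", doc))   # True
print(is_authorized(alice, "write", doc))  # False
```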

Design/methodology/approach

This paper follows a systematic literature review approach to study authorization and access control for different database models. Starting with a review of existing surveys on authorization and access control in databases, the study continues with the identification and definition of advanced authorization and access control requirements that are generally applicable to any database model. This paper then discusses and compares current database models based on these requirements.

Findings

As no survey work so far considers requirements for authorization and access control across different database models, the authors define their own. Furthermore, the authors discuss the current state of the art for the relational, key-value, column-oriented, document-based and graph database models in comparison to the defined requirements.

Originality/value

This paper focuses on authorization and access control for various database models, not concrete products. This paper identifies today’s sophisticated – yet general – requirements from the literature and compares them with research results and access control features of current products for the relational and NoSQL database models.

Details

International Journal of Web Information Systems, vol. 20 no. 1
Type: Research Article
ISSN: 1744-0084

Open Access
Article
Publication date: 22 May 2023

Edmund Baffoe-Twum, Eric Asa and Bright Awuku

Abstract

Background: Geostatistics focuses on spatial or spatiotemporal datasets. It was initially developed to generate probability-distribution predictions of ore grade in the mining industry; however, it has since been applied successfully in diverse scientific disciplines. Its techniques include univariate and multivariate methods as well as simulations. The kriging geostatistical methods (simple, ordinary and universal kriging) are not multivariate models in the usual statistical sense; nevertheless, they utilize random-function models that include unlimited random variables while modeling one attribute. The coKriging technique, by contrast, is a multivariate estimation method that simultaneously models two or more attributes defined over the same domain as a coregionalization.
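
To make the kriging family concrete, here is a minimal ordinary-kriging sketch in Python; the spherical variogram parameters and the AADT sample values are illustrative assumptions, not the study's data or its coKriging implementation.

```python
# Minimal ordinary-kriging sketch (illustrative, not the paper's CK model).
# Uses a spherical semivariogram with assumed sill/range parameters.
import numpy as np

def spherical_gamma(h, sill=1.0, rng=500.0, nugget=0.0):
    """Spherical semivariogram model gamma(h)."""
    h = np.asarray(h, dtype=float)
    g = nugget + sill * (1.5 * h / rng - 0.5 * (h / rng) ** 3)
    return np.where(h < rng, g, nugget + sill)

def ordinary_krige(coords, values, target):
    """Predict the value at `target` from sampled (coords, values)."""
    n = len(values)
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = spherical_gamma(d)    # variogram matrix between samples
    A[n, n] = 0.0                     # Lagrange-multiplier row/column
    b = np.ones(n + 1)
    b[:n] = spherical_gamma(np.linalg.norm(coords - target, axis=1))
    w = np.linalg.solve(A, b)         # kriging weights + multiplier
    return w[:n] @ values             # unbiased weighted average

coords = np.array([[0.0, 0.0], [100.0, 0.0], [0.0, 100.0], [120.0, 130.0]])
aadt = np.array([5200.0, 4800.0, 5100.0, 4300.0])   # hypothetical AADT samples
print(ordinary_krige(coords, aadt, np.array([60.0, 60.0])))
```

CoKriging extends this system with cross-variograms between the primary attribute (AADT) and the secondary attribute (population), which is how the data integration described above draws strength from both variables.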

Objective: This study investigates the impact of populations on traffic volumes as a variable. The additional variable determines the strength or accuracy obtained when data integration is adopted. In addition, this is to help improve the estimation of annual average daily traffic (AADT).

Methods, procedures, process: The investigation adopts the coKriging (CK) technique, with AADT data from 2009 to 2016 from Montana, Minnesota, and Washington as the primary attribute and population as a controlling factor (second variable). CK was implemented for this study after a review of the literature and completed work, and a comparison with other geostatistical methods.

Results, observations, and conclusions: The investigation employed two variables. The data integration methods employed in CK yield more reliable models because their strength is drawn from multiple variables. The cross-validation results of the model types explored with the CK technique successfully evaluate the interpolation technique's performance and help select optimal models for each state. The results from the Montana and Minnesota models accurately represent the states' traffic and population density. The Washington model had a few exceptions; however, the secondary attribute helped yield an accurate interpretation. Consequently, the impact of tourism, shopping, recreation centers, and possible transiting patterns throughout the state is worth exploring.

Details

Emerald Open Research, vol. 1 no. 5
Type: Research Article
ISSN: 2631-3952

Open Access
Article
Publication date: 5 October 2023

Babitha Philip and Hamad AlJassmi

Abstract

Purpose

To proactively draw efficient maintenance plans, road agencies should be able to forecast main road distress parameters, such as cracking, rutting, deflection and International Roughness Index (IRI). Nonetheless, the behavior of those parameters throughout pavement life cycles is associated with high uncertainty, resulting from various interrelated factors that fluctuate over time. This study aims to propose the use of dynamic Bayesian belief networks for the development of time-series prediction models to probabilistically forecast road distress parameters.

Design/methodology/approach

While the Bayesian belief network (BBN) has the merit of capturing uncertainty associated with variables in a domain, dynamic BBNs in particular are deemed ideal for forecasting road distress over time due to their Markovian and invariant transition-probability properties. Four dynamic BBN models are developed to represent rutting, deflection, cracking and IRI, using pavement data collected from 32 major road sections in the United Arab Emirates between 2013 and 2019. Those models are based on several factors affecting pavement deterioration, which are classified into three categories: traffic factors, environmental factors and road-specific factors.
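
The Markovian core of such a dynamic BBN can be sketched in a few lines: an invariant transition matrix propagates a distress-state distribution from one time slice to the next. The states and probabilities below are assumed for illustration, not the paper's calibrated model.

```python
# Minimal sketch of the Markov-chain core of a dynamic Bayesian network:
# an invariant transition matrix propagates a distress-state distribution
# forward in time. States and probabilities are illustrative assumptions.
import numpy as np

states = ["good", "fair", "poor"]        # hypothetical rutting states
transition = np.array([                  # P(state_t+1 | state_t), rows sum to 1
    [0.85, 0.12, 0.03],
    [0.00, 0.80, 0.20],
    [0.00, 0.00, 1.00],
])

belief = np.array([1.0, 0.0, 0.0])       # road section starts in "good"
for year in range(1, 6):
    belief = belief @ transition         # one time-slice update
    print(year, dict(zip(states, belief.round(3))))
```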

Findings

The four developed performance prediction models achieved an overall precision and reliability rate of over 80%.

Originality/value

The proposed approach provides flexibility to illustrate road conditions under various scenarios, which is beneficial for pavement maintainers in obtaining a realistic representation of expected future road conditions, where maintenance efforts could be prioritized and optimized.

Details

Construction Innovation, vol. 24 no. 1
Type: Research Article
ISSN: 1471-4175

Open Access
Article
Publication date: 22 November 2023

En-Ze Rui, Guang-Zhi Zeng, Yi-Qing Ni, Zheng-Wei Chen and Shuo Hao

Abstract

Purpose

Current methods for flow field reconstruction mainly rely on data-driven algorithms which require an immense amount of experimental or field-measured data. Physics-informed neural network (PINN), which was proposed to encode physical laws into neural networks, is a less data-demanding approach for flow field reconstruction. However, when the fluid physics is complex, it is tricky to obtain accurate solutions under the PINN framework. This study aims to propose a physics-based data-driven approach for time-averaged flow field reconstruction which can overcome the hurdles of the above methods.

Design/methodology/approach

A multifidelity strategy leveraging PINN and a nonlinear information fusion (NIF) algorithm is proposed. Plentiful low-fidelity data are generated from the predictions of a PINN that is constructed purely using the Reynolds-averaged Navier–Stokes equations, while sparse high-fidelity data are obtained by field or experimental measurements. The NIF algorithm is performed to elicit a multifidelity model, which blends the nonlinear cross-correlation information between the low- and high-fidelity data.
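
One common way to realize this kind of nonlinear fusion is to train a Gaussian process on inputs augmented with the low-fidelity prediction, so the high-fidelity response is learned as g(x, f_L(x)). The sketch below uses toy 1-D functions in place of the PINN output and the sparse measurements; it is an assumption-laden illustration, not the authors' NIF algorithm.

```python
# Minimal sketch of nonlinear multifidelity fusion: a Gaussian process
# learns the high-fidelity response as a nonlinear function of the input
# and the low-fidelity prediction, g(x, f_L(x)). Toy functions stand in
# for PINN output and sparse measurements (assumptions).
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

f_low = lambda x: np.sin(8 * x)                      # plentiful "PINN" surrogate
f_high = lambda x: 1.2 * np.sin(8 * x) + 0.3 * x     # sparse "measurements"

x_hi = np.array([[0.05], [0.3], [0.55], [0.8], [0.95]])  # sparse HF samples
X_aug = np.hstack([x_hi, f_low(x_hi)])               # augment input with f_L(x)
gp = GaussianProcessRegressor(kernel=RBF(length_scale=0.5), alpha=1e-6)
gp.fit(X_aug, f_high(x_hi).ravel())

x_test = np.linspace(0, 1, 5).reshape(-1, 1)
pred = gp.predict(np.hstack([x_test, f_low(x_test)]))
print(np.c_[x_test.ravel(), pred])                   # fused prediction
```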

Findings

Two experimental cases are used to verify the capability and efficacy of the proposed strategy through comparison with other widely used strategies. It is revealed that the missing flow information within the whole computational domain can be favorably recovered by the proposed multifidelity strategy with the use of sparse measurement/experimental data. The elicited multifidelity model inherits the underlying physics inherent in the low-fidelity PINN predictions and rectifies the low-fidelity predictions over the whole computational domain. The proposed strategy is far superior to the other contrastive strategies in terms of reconstruction accuracy.

Originality/value

In this study, a physics-informed data-driven strategy for time-averaged flow field reconstruction is proposed which extends the applicability of the PINN framework. In addition, embedding physical laws when training the multifidelity model leads to less data demand for model development compared to purely data-driven methods for flow field reconstruction.

Details

International Journal of Numerical Methods for Heat & Fluid Flow, vol. 34 no. 1
Type: Research Article
ISSN: 0961-5539

Article
Publication date: 8 September 2023

Xiancheng Ou, Yuting Chen, Siwei Zhou and Jiandong Shi

Abstract

Purpose

With the continuous growth of online education, the quality of online educational videos has become an increasingly prominent issue, leaving students in online learning facing confusing or conflicting knowledge. The existing mechanisms for controlling the quality of online educational videos suffer from subjectivity and low timeliness. An important aspect of monitoring the quality of online educational videos is the analysis of metadata features and log data. With the development of artificial intelligence technology, deep learning techniques with strong predictive capabilities can provide new methods for predicting the quality of online educational videos, effectively overcoming the shortcomings of existing methods. The purpose of this study is to find a deep neural network that can model the dynamic and static features of the video itself, as well as the relationships between videos, to achieve dynamic monitoring of the quality of online educational videos.

Design/methodology/approach

The quality of a video cannot be directly measured. According to previous research, the authors use engagement to represent the level of video quality. Engagement is the normalized participation time, which represents the degree to which learners tend to participate in the video. Based on existing public data sets, this study designs an online educational video engagement prediction model based on dynamic graph neural networks (DGNNs). The model is trained based on the video’s static features and dynamic features generated after its release by constructing dynamic graph data. The model includes a spatiotemporal feature extraction layer composed of DGNNs, which can effectively extract the time and space features contained in the video's dynamic graph data. The trained model is used to predict the engagement level of learners with the video on day T after its release, thereby achieving dynamic monitoring of video quality.
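
The dynamic-graph construction step described above can be sketched as follows: videos become nodes, and an edge is added wherever the cosine similarity of two feature vectors exceeds a threshold. The feature values and the threshold are illustrative assumptions.

```python
# Minimal sketch of dynamic-graph construction: videos are nodes, and an
# edge connects two videos when the cosine similarity of their static
# feature vectors exceeds a threshold. Values are illustrative assumptions.
import numpy as np

def cosine_adjacency(features, threshold=0.9):
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    sim = (features @ features.T) / (norms @ norms.T)
    adj = (sim >= threshold).astype(int)
    np.fill_diagonal(adj, 0)          # no self-loops
    return adj

video_feats = np.array([
    [0.9, 0.1, 0.3],                  # hypothetical static features per video
    [0.8, 0.2, 0.3],
    [0.1, 0.9, 0.7],
])
print(cosine_adjacency(video_feats))  # videos 0 and 1 end up connected
```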

Findings

Models with spatiotemporal feature extraction layers consisting of four types of DGNNs can accurately predict the engagement level of online educational videos. Of these, the model using the temporal graph convolutional neural network has the smallest prediction error. In dynamic graph construction, using cosine similarity and Euclidean distance functions with reasonable threshold settings can construct a structurally appropriate dynamic graph. In the training of this model, the amount of historical time series data used will affect the model’s predictive performance. The more historical time series data used, the smaller the prediction error of the trained model.

Research limitations/implications

A limitation of this study is that not all video data in the data set was used to construct the dynamic graph due to memory constraints. In addition, the DGNNs used in the spatiotemporal feature extraction layer are relatively conventional.

Originality/value

In this study, the authors propose an online educational video engagement prediction model based on DGNNs, which can achieve the dynamic monitoring of video quality. The model can be applied as part of a video quality monitoring mechanism for various online educational resource platforms.

Details

International Journal of Web Information Systems, vol. 19 no. 5/6
Type: Research Article
ISSN: 1744-0084

Content available
Article
Publication date: 1 August 2023

Elham Mahamedi, Martin Wonders, Nima Gerami Seresht, Wai Lok Woo and Mohamad Kassem

Abstract

Purpose

The purpose of this paper is to propose a novel data-driven approach for predicting energy performance of buildings that can address the scarcity of quality data, and consider the dynamic nature of building systems.

Design/methodology/approach

This paper proposes a reinforcing machine learning (ML) approach based on transfer learning (TL) to address these challenges. The proposed approach dynamically incorporates the data captured by the building management systems into the model to improve its accuracy.
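
A minimal sketch of the reinforcing idea, under assumed data and network sizes: a model pre-trained on data from other sources (the transferred knowledge) is updated incrementally as new building management system readings arrive.

```python
# Minimal sketch of the reinforcing idea: a model pre-trained on source
# data (transfer) is refined incrementally with fresh building-management-
# system readings. Synthetic data and sizes are assumptions.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
X_src = rng.normal(size=(500, 4))                 # source-building features
y_src = X_src @ [0.5, -0.2, 0.1, 0.3] + rng.normal(0, 0.1, 500)

model = MLPRegressor(hidden_layer_sizes=(16,), max_iter=500, random_state=0)
model.fit(X_src, y_src)                           # transferred base model

for week in range(4):                             # reinforcing updates
    X_new = rng.normal(size=(20, 4))              # fresh BMS readings
    y_new = X_new @ [0.6, -0.1, 0.1, 0.25] + rng.normal(0, 0.1, 20)
    model.partial_fit(X_new, y_new)               # incremental refinement
    print(week, round(model.score(X_new, y_new), 3))
```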

Findings

It was shown that the proposed approach could improve the accuracy of the energy performance prediction by 19 percentage points in mean absolute percentage error compared to the conventional (non-reinforcing) TL approach.

Research limitations/implications

The case study results confirm the practicality of the proposed approach and show that it outperforms the standard ML approach (with no transferred knowledge) when little data is available.

Originality/value

This approach contributes to the body of knowledge by addressing the limited data availability in the building sector using TL; and accounting for the dynamics of buildings’ energy performance by the reinforcing architecture. The proposed approach is implemented in a case study project based in London, UK.

Details

Construction Innovation, vol. 24 no. 1
Type: Research Article
ISSN: 1471-4175

Open Access
Article
Publication date: 8 February 2023

Edoardo Ramalli and Barbara Pernici

Abstract

Purpose

Experiments are the backbone of the development process of data-driven predictive models for scientific applications. The quality of the experiments directly impacts the model performance. Uncertainty inherently affects experimental measurements and is often missing from the available data sets due to its estimation cost. For similar reasons, experiments are few compared to other data sources. Discarding experiments based on the missing uncertainty values would preclude the development of predictive models. Data profiling techniques are fundamental to assess data quality, but some data quality dimensions are challenging to evaluate without knowing the uncertainty. In this context, this paper aims to predict the missing uncertainty of the experiments.

Design/methodology/approach

This work presents a methodology to forecast the experiments’ missing uncertainty, given a data set and its ontological description. The approach is based on knowledge graph embeddings and leverages the task of link prediction over a knowledge graph representation of the experiments database. The validity of the methodology is first tested in multiple conditions using synthetic data and then applied to a large data set of experiments in the chemical kinetic domain as a case study.
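
As an illustration of link prediction over such a knowledge graph, the sketch below scores candidate tails for a missing (experiment, hasUncertainty, ?) triple with a TransE-style distance; the relation name and the random embeddings are assumptions, not the paper's trained model.

```python
# Minimal sketch of link prediction with a TransE-style score: a missing
# (experiment, hasUncertainty, ?) triple is completed by ranking candidate
# tail entities with s(h, r, t) = -||h + r - t||. Embeddings here are
# random stand-ins, not trained vectors.
import numpy as np

rng = np.random.default_rng(1)
dim = 8
experiment = rng.normal(size=dim)        # head entity embedding
has_uncertainty = rng.normal(size=dim)   # relation embedding (assumed name)
candidates = {f"u_{v}": rng.normal(size=dim) for v in ("low", "mid", "high")}

def transe_score(h, r, t):
    return -np.linalg.norm(h + r - t)    # higher (less negative) is better

ranked = sorted(candidates,
                key=lambda k: transe_score(experiment, has_uncertainty,
                                           candidates[k]),
                reverse=True)
print(ranked)                            # best candidate tail first
```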

Findings

The analysis results of different test case scenarios suggest that knowledge graph embedding can be used to predict the missing uncertainty of the experiments when there is a hidden relationship between the experiment metadata and the uncertainty values. The link prediction task is also resilient to random noise in the relationship. The knowledge graph embedding outperforms the baseline results when the uncertainty depends on multiple metadata attributes.

Originality/value

The employment of knowledge graph embedding to predict the missing experimental uncertainty is a novel alternative to the current and more costly techniques in the literature. Such contribution permits a better data quality profiling of scientific repositories and improves the development process of data-driven models based on scientific experiments.

Article
Publication date: 24 October 2022

Priyanka Chawla, Rutuja Hasurkar, Chaithanya Reddy Bogadi, Naga Sindhu Korlapati, Rajasree Rajendran, Sindu Ravichandran, Sai Chaitanya Tolem and Jerry Zeyu Gao

Abstract

Purpose

The study aims to propose an intelligent real-time traffic model to address the traffic congestion problem. The proposed model assists the urban population in their everyday lives by assessing the probability of road accidents and providing accurate traffic information predictions. It also helps reduce overall carbon dioxide emissions in the environment and increases overall transportation quality.

Design/methodology/approach

This study offered a real-time traffic model based on the analysis of numerous sensor data. Real-time traffic prediction systems can identify and visualize current traffic conditions on a particular lane. The proposed model incorporated data from road sensors as well as a variety of other sources. It is difficult to capture and process large amounts of sensor data in real time. Sensor data is therefore consumed by streaming analytics platforms built on big data technologies and then processed using a range of deep learning and machine learning techniques.
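
A minimal sketch of the windowing step such streaming platforms perform: raw sensor readings are grouped into fixed time windows and reduced to per-lane features that a downstream model can ingest. The reading format is an assumption.

```python
# Minimal sketch of stream windowing: raw sensor readings are consumed in
# fixed tumbling windows and reduced to per-lane features for a downstream
# ML model. The reading format is an illustrative assumption.
from collections import defaultdict
from statistics import mean

readings = [                      # (timestamp_s, lane, speed_mph) samples
    (0, "L1", 62), (10, "L1", 58), (25, "L2", 45),
    (70, "L1", 30), (80, "L2", 28), (95, "L1", 33),
]

WINDOW = 60                       # one-minute tumbling windows
windows = defaultdict(list)
for ts, lane, speed in readings:
    windows[(ts // WINDOW, lane)].append(speed)

features = {k: mean(v) for k, v in windows.items()}
print(features)                   # {(window, lane): mean speed, ...}
```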

Findings

The study provided in this paper would fill a gap in the data analytics sector by delivering a more accurate and trustworthy model that uses internet of things sensor data and other data sources. Organizations such as transit agencies and public safety departments can also incorporate this method into their platforms to support strategic decision making.

Research limitations/implications

A notable limitation of the model is that its predictions for the period after January 2020 are not particularly accurate. This, however, is not a flaw in the model; rather, it reflects the disruption caused by the Covid-19 global pandemic, which impacted the traffic scenario and resulted in erratic data for the period after February 2020. Once circumstances return to normal, the authors are confident in their model's ability to produce accurate forecasts.

Practical implications

To help users choose when to go, this study intended to pinpoint the causes of traffic congestion on the highways in the Bay Area as well as forecast real-time traffic speeds. To determine the best attributes that influence traffic speed in this study, the authors obtained data from the Caltrans performance measurement system (PeMS), reviewed it and used multiple models. The authors developed a model that can forecast traffic speed while accounting for outside variables like weather and incident data, with decent accuracy and generalizability. To assist users in determining traffic congestion at a certain location on a specific day, the forecast method uses a graphical user interface. This user interface has been designed to be readily expanded in the future as the project's scope and usefulness increase. The authors' Web-based traffic speed prediction platform is useful for both municipal planners and individual travellers. The authors were able to get excellent results by using five years of data (2015–2019) to train the models and forecast outcomes for 2020 data. The authors' algorithm produced highly accurate predictions when tested using data from January 2020. The benefits of this model include accurate traffic speed forecasts for California's four main freeways (Freeway 101, I-680, 880 and 280) for a specific place on a certain date. The scalable model performs better than the vast majority of earlier models created by other scholars in the field. If this programme were extended across the entire state of California, the government would benefit from better planning and execution of new transportation projects.

Social implications

To estimate traffic congestion, the proposed model takes into account a variety of data sources, including weather and incident data. According to traffic congestion statistics, “bottlenecks” account for 40% of traffic congestion, “traffic incidents” account for 25% and “work zones” account for 10% (Traffic Congestion Statistics). As a result, incident data must be considered for analysis. The study uses traffic, weather and event data from the previous five years to estimate traffic congestion in any given area. As a result, the results predicted by the proposed model would be more accurate, and commuters who need to schedule ahead of time for work would benefit greatly.

Originality/value

The proposed work allows the user to choose the optimum time and mode of transportation for them. The underlying idea behind this model is that if a car spends more time on the road, it will cause traffic congestion. The proposed system encourages users to arrive at their location in a short period of time. Congestion is an indicator that public transportation needs to be expanded. The optimum route is compared to other kinds of public transit using this methodology (Greenfield, 2014). If the commute time is comparable to that of private car transportation during peak hours, consumers should take public transportation.

Details

World Journal of Engineering, vol. 21 no. 1
Type: Research Article
ISSN: 1708-5284

Article
Publication date: 26 May 2022

Ismail Abiodun Sulaimon, Hafiz Alaka, Razak Olu-Ajayi, Mubashir Ahmad, Saheed Ajayi and Abdul Hye

Abstract

Purpose

Road traffic emissions are generally believed to contribute immensely to air pollution, but the effect of road traffic data sets on air quality (AQ) predictions has not been fully investigated. This paper aims to investigate the effect traffic data sets have on the performance of machine learning (ML) predictive models in AQ prediction.

Design/methodology/approach

To achieve this, the authors have set up an experiment in which the control data set comprises only the AQ data set and the meteorological (Met) data set, while the experimental data set comprises the AQ data set, the Met data set and the traffic data set. Several ML models (such as the extra trees regressor, eXtreme gradient boosting regressor, random forest regressor, K-neighbors regressor and two others) were trained, tested and compared on these combinations of data sets to predict the concentrations of PM2.5, PM10, NO2 and O3 in the atmosphere at various times of the day.
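
The control/experimental comparison can be sketched as follows: the same regressor is trained with and without traffic features, and the errors are compared. Synthetic data stands in for the AQ, Met and traffic sets, so only the experimental design, not the reported numbers, is reproduced.

```python
# Minimal sketch of the control/experimental design: the same regressor is
# trained with and without traffic features and the errors are compared.
# Synthetic data stands in for the AQ, Met and traffic data sets.
import numpy as np
from sklearn.ensemble import ExtraTreesRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(2)
met = rng.normal(size=(800, 3))                    # meteorological features
traffic = rng.normal(size=(800, 2))                # traffic features
pm25 = met @ [0.4, -0.3, 0.2] + traffic @ [0.6, 0.5] + rng.normal(0, 0.2, 800)

for name, X in [("control (AQ+Met)", met),
                ("experimental (+traffic)", np.hstack([met, traffic]))]:
    X_tr, X_te, y_tr, y_te = train_test_split(X, pm25, random_state=0)
    model = ExtraTreesRegressor(n_estimators=100, random_state=0).fit(X_tr, y_tr)
    print(name, round(mean_absolute_error(y_te, model.predict(X_te)), 3))
```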

Findings

The results showed that, although the traffic data set generally improved the performance of all the ML algorithms considered in this study by at least 20%, with an error reduction of at least 18.97%, the individual algorithms reacted to it differently.

Research limitations/implications

This research is limited in terms of the study area, and the result cannot be generalized outside of the UK as some of the inherent conditions may not be similar elsewhere. Additionally, only the ML algorithms commonly used in literature are considered in this research, therefore, leaving out a few other ML algorithms.

Practical implications

This study reinforces the belief that the traffic data set has a significant effect on improving the performance of air pollution ML prediction models. Hence, there is an indication that ML algorithms behave differently when a traffic data set is included in the training data for an AQ prediction model. This implies that developers and researchers in AQ prediction need to identify the ML algorithms that best serve their purposes before implementation.

Originality/value

The results of this study will enable researchers to focus on the algorithms that benefit most from traffic data sets in AQ prediction.

Details

Journal of Engineering, Design and Technology, vol. 22 no. 3
Type: Research Article
ISSN: 1726-0531

Open Access
Article
Publication date: 11 March 2022

Edmund Baffoe-Twum, Eric Asa and Bright Awuku

Abstract

Background: The annual average daily traffic (AADT) data from road segments are critical for roadway projects, especially in decision-making processes about operations, travel demand, safety-performance evaluation, and maintenance. Regular updates help to determine traffic patterns for decision-making. Unfortunately, having permanent recorders on all road segments, especially low-volume roads, is virtually impossible. Consequently, insufficient AADT information is acquired for planning and new developments. A growing number of statistical, mathematical, and machine-learning algorithms have helped estimate AADT data values accurately, to some extent, at both sampled and unsampled locations on low-volume roadways. In some cases, roads with no representative AADT data are assigned information from roadways with similar traffic patterns.

Methods: This study adopted an integrative approach with a combined systematic literature review (SLR) and meta-analysis (MA) to identify and to evaluate the performance, the sources of error, and possible advantages and disadvantages of the techniques utilized most for estimating AADT data. As a result, an SLR of various peer-reviewed articles and reports was completed to answer four research questions.

Results: The study showed that the techniques utilized most frequently to estimate AADT data on low-volume roadways were regression, artificial neural-network techniques, travel-demand models, the traditional factor approach, and spatial interpolation techniques. The performance of these AADT data-estimating methods was subjected to meta-analysis. Three analyses were completed, based on R-squared, root mean square error, and mean absolute percentage error. The meta-analysis results indicated a mixed summary effect: (1) all studies were equal; (2) all studies were not comparable. However, the integrated qualitative and quantitative approach indicated that spatial-interpolation (kriging) methods outperformed the others.
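
For reference, a fixed-effect meta-analysis pools per-study effect sizes with inverse-variance weights, as in the minimal sketch below; the effect values and variances are illustrative, not the study's extracted data.

```python
# Minimal sketch of a fixed-effect meta-analysis step: per-study effect
# sizes are pooled with inverse-variance weights. Effect values and
# variances are illustrative assumptions.
import numpy as np

effects = np.array([0.82, 0.76, 0.91, 0.68])     # e.g. per-study R-squared
variances = np.array([0.004, 0.009, 0.002, 0.012])

weights = 1.0 / variances                        # inverse-variance weights
summary = np.sum(weights * effects) / np.sum(weights)
se = np.sqrt(1.0 / np.sum(weights))              # standard error of summary
print(f"summary effect = {summary:.3f} +/- {1.96 * se:.3f} (95% CI)")
```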

Conclusions: Spatial-interpolation methods may be selected over others by practitioners at all levels to generate accurate AADT data for decision making. In addition, the resulting cross-validation statistics provide performance measures comparable to those of the other methods.

Details

Emerald Open Research, vol. 1 no. 5
Type: Research Article
ISSN: 2631-3952
