Search results
Sami J. Habib and Paulvanna N. Marimuthu
Abstract
Purpose
Energy constraint is always a serious issue in wireless sensor networks, as the energy possessed by the sensors is limited and non-renewable. Data aggregation at intermediate base stations increases the lifespan of the sensors, whereby the sensors' data are aggregated before being communicated to the central server. This paper proposes a query-based aggregation scheme within a Monte Carlo simulator to explore the best and worst possible query orders for aggregating the sensors' data at the base stations. The proposed query-based aggregation model can help the network administrator envisage the query orders that best improve the performance of the base stations under uncertain query ordering. Furthermore, the paper examines the feasibility of engaging simultaneous transmissions at the base station and derives a best-fit mathematical model to study the behavior of data aggregation under uncertain querying order.
Design/methodology/approach
The paper considers small and medium-sized wireless sensor networks composed of randomly deployed sensors in a square arena. It formulates query-based data aggregation as an uncertain-ordering problem within a Monte Carlo simulator, generating several thousand uncertain orders to schedule the responses of M sensors at the base station within a specified time interval. For each selected time interval, the model finds the best possible querying order to aggregate the data with reduced idle time and improved throughput. Furthermore, it extends the model to multiple sensing parameters and multiple aggregating channels, enabling the administrator to plan the capacity of the WSN according to specific time intervals known in advance.
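As a rough illustration of the uncertain-ordering idea (not the authors' simulator: the sensor response durations, the back-to-back scheduling, and the served-sensor count used as the figure of merit are all invented for this sketch), a Monte Carlo search over random query orders might look like:

```python
import random

def simulate_orders(durations, interval, trials=1000, seed=42):
    """Monte Carlo search over random query orders.

    For each random permutation of sensor queries, schedule the
    responses back-to-back and count how many sensors can be served
    inside the aggregation interval; return the best- and worst-case
    counts seen across all sampled orders.
    """
    rng = random.Random(seed)
    sensors = list(range(len(durations)))
    best, worst = 0, len(durations)
    for _ in range(trials):
        rng.shuffle(sensors)          # one uncertain query order
        elapsed, served = 0, 0
        for s in sensors:
            if elapsed + durations[s] > interval:
                break                 # this response would overrun the interval
            elapsed += durations[s]
            served += 1
        best = max(best, served)
        worst = min(worst, served)
    return best, worst
```

With enough trials the sampled extremes approximate the true best and worst orders, which is the gap the abstract's best-case/worst-case comparison measures.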
Findings
The experimental results within the Monte Carlo simulator demonstrate that the query-based aggregation scheme achieves a better trade-off between maximizing aggregation efficiency and reducing the average idle time experienced by individual sensors. The query-based aggregation model was tested on a WSN of 25 sensors with a single sensing parameter transmitting data to a base station; the simulation results show continuous improvement in best-case performance from 56 percent to 96 percent over time intervals of 80 to 200 time units. The query aggregation was further extended to analyze the behavior of a WSN with 50 sensors sensing two environmental parameters and a base station equipped with multiple channels, which demonstrated a shorter aggregation time interval than a single channel. The analysis of the average waiting time of individual sensors across the generated uncertain querying orders shows that the best-case scenario within a specified time interval gained 10 percent to 20 percent over the worst-case scenario, reducing the total transmission time by around 50 percent.
Practical implications
The proposed query‐based data aggregation model can be utilized to predict the non‐deterministic real‐time behavior of the wireless sensor network in response to the flooded queries by the base station.
Originality/value
This paper employs a novel framework to analyze all possible ordering of sensor responses to be aggregated at the base station within the stipulated aggregating time interval.
Abstract
Develops and tests a general model for understanding the influence of query‐based decision aids (QBDA) on consumer decision making in the electronic commerce environment. The results show that the use of well‐designed query‐based decision aids leads to increased satisfaction with the decision process and increased confidence in judgements. The number of stages of phased narrowing of the consideration set was higher in the case of subjects who had access to the query‐based decision aids. The mediating variables through which this influence occurs are size of the consideration set, similarity among the alternatives in the consideration set, cognitive decision effort, and perceived cost savings. The size of the consideration set and the similarity among the alternatives in the consideration set were higher in the case of subjects who had access to the query‐based decision aid. Subjects who had access to the query‐based decision aid perceived an increased cost savings and a lower cognitive decision effort associated with the purchase decision. This research is done in the context of consumers searching for information on the World Wide Web prior to the purchase of cars.
Farnoush Bayatmakou, Azadeh Mohebi and Abbas Ahmadi
Abstract
Purpose
Query-based summarization approaches might not be able to provide summaries compatible with the user’s information need, as they mostly rely on a limited source of information, usually represented as a single query by the user. This issue becomes even more challenging when dealing with scientific documents, as they contain more specific subject-related terms, while the user may not be able to express his/her specific information need in a query with limited terms. This study aims to propose an interactive multi-document text summarization approach that generates an eligible summary that is more compatible with the user’s information need. This approach allows the user to interactively specify the composition of a multi-document summary.
Design/methodology/approach
This approach exploits the user's opinion in two stages. The initial query is refined with user-selected keywords/keyphrases and complete sentences extracted from the set of retrieved documents. This is followed by a novel method for sentence expansion using a genetic algorithm, and ranking of the final set of sentences using the maximal marginal relevance method. For implementation, the Web of Science data set in the artificial intelligence (AI) category is used.
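The maximal marginal relevance step at the end of that pipeline can be sketched as below. The similarity inputs are assumed precomputed (e.g. cosine similarities between sentence vectors), and the λ trade-off value is illustrative, not the paper's setting:

```python
def mmr_rank(query_sim, pair_sim, k, lam=0.5):
    """Maximal marginal relevance: greedily pick k sentences that
    balance relevance to the query (query_sim[i]) against redundancy
    with sentences already selected (pair_sim[i][j])."""
    candidates = set(range(len(query_sim)))
    selected = []
    while candidates and len(selected) < k:
        def score(i):
            # redundancy = similarity to the closest already-chosen sentence
            redundancy = max((pair_sim[i][j] for j in selected), default=0.0)
            return lam * query_sim[i] - (1 - lam) * redundancy
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected
```

For example, if sentences 0 and 1 are highly relevant but near-duplicates of each other, MMR picks sentence 0 and then skips 1 in favor of a less redundant sentence, which is exactly the anti-redundancy behavior the Findings section evaluates.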
Findings
The proposed approach receives feedback from the user in terms of favorable keywords and sentences, and this feedback ultimately improves the final summary. To assess the performance of the proposed system, 45 users who were graduate students in the field of AI were asked to fill out a questionnaire. The quality of the final summary was also evaluated from the user's perspective and in terms of information redundancy. The evaluation shows that the proposed approach leads to higher degrees of user satisfaction than versions with no interaction or only one step of interaction.
Originality/value
The interactive summarization approach goes beyond the initial user query by incorporating the user's preferred keywords/keyphrases and sentences through a systematic interaction. Through these interactions, the system forms a clearer idea of the information the user is looking for and consequently adjusts the final result to the ultimate information need. Such interaction allows the summarization system to achieve a comprehensive understanding of the user's information needs while expanding context-based knowledge and guiding the user along his/her information journey.
Abstract
Purpose
Engines have been built that execute queries against XML data. The aim of this paper is to describe a novel technique that can be used to improve the speed of execution of the queries based on semantics of the data in the XML document.
Design/methodology/approach
The paper formally introduces algorithms for optimizing XML queries, implements the algorithms and, through experimentation, demonstrates the improvement in speed.
Findings
Three possible semantic query optimizations based on the values of elements were introduced; the experiments demonstrate that two of the three optimizations improve query performance but the third does not. A hypothesis is offered as to why this is the case.
Research limitations/implications
A limitation is obviously the query engine and how it works. Future work includes executing the experiments on a different engine and comparing results, building a system to automatically generate the characteristics that are necessary to do the optimization, describing the best way to represent and maintain the characteristics once they are found, and comparing the results of optimizations based on content with optimizations based on structure.
Practical implications
The optimizations could be incorporated into new query engines.
Originality/value
Novel algorithms for query optimization have been developed and proven to work. They are of value to people who are building database systems for XML data.
Sergei O. Kuznetsov, Alexey Masyutin and Aleksandr Ageev
Abstract
Purpose
The purpose of this study is to show that closure-based classification and regression models provide both high accuracy and interpretability.
Design/methodology/approach
Pattern structures allow one to approach the knowledge extraction problem in the case of partially ordered descriptions. They provide a way to apply techniques based on closed descriptions to non-binary data. To provide scalability of the approach, the authors introduced a lazy (query-based) classification algorithm.
Findings
The experiments support the hypothesis that closure-based classification and regression allow one to both achieve higher accuracy in scoring models as compared to results obtained with classical banking models and retain interpretability of model results, whereas black-box methods grant better accuracy for the cost of losing interpretability.
Originality/value
This is an original research showing the advantage of closure-based classification and regression models in the banking sphere.
Hengqin Wu, Geoffrey Shen, Xue Lin, Minglei Li, Boyu Zhang and Clyde Zhengdao Li
Abstract
Purpose
This study proposes an approach to solve the fundamental problem in using query-based methods (i.e. searching engines and patent retrieval tools) to screen patents of information and communication technology in construction (ICTC). The fundamental problem is that ICTC incorporates various techniques and thus cannot be simply represented by man-made queries. To investigate this concern, this study develops a binary classifier by utilizing deep learning and natural language processing (NLP) techniques to automatically identify whether a patent is relevant to ICTC, thus accurately screening a corpus of ICTC patents.
Design/methodology/approach
This study employs NLP techniques to convert the textual data of patents into numerical vectors. Then, a supervised deep learning model is developed to learn the relations between the input vectors and outputs.
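A minimal sketch of the first step, converting patent text into numerical vectors, is shown below using a plain TF-IDF bag-of-words; the paper's actual NLP pipeline (with lemmatization and POS tagging) is more elaborate, and the example documents are invented:

```python
import math
from collections import Counter

def vectorize(docs):
    """Convert document strings into TF-IDF vectors over a shared
    vocabulary -- the kind of fixed-length numerical input a
    downstream supervised classifier expects."""
    tokenized = [d.lower().split() for d in docs]
    vocab = sorted({w for toks in tokenized for w in toks})
    index = {w: i for i, w in enumerate(vocab)}
    n = len(docs)
    # document frequency: in how many documents each term appears
    df = Counter(w for toks in tokenized for w in set(toks))
    vectors = []
    for toks in tokenized:
        vec = [0.0] * len(vocab)
        for w, count in Counter(toks).items():
            vec[index[w]] = count * math.log(n / df[w])  # tf * idf
        vectors.append(vec)
    return vocab, vectors
```

Terms appearing in every document (e.g. "patent") get zero weight, while discriminative terms keep positive weight, which is what lets the classifier separate ICTC from non-ICTC patents.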
Findings
The validation results indicate that: (1) the proposed approach performs better in screening ICTC patents than traditional machine learning methods; and (2) besides the United States Patent and Trademark Office (USPTO), which provides structured and well-written patents, the approach can also accurately screen patents from the Derwent Innovations Index (DIX), in which patents are written in different genres.
Practical implications
This study contributes a specific collection for ICTC patents, which is not provided by the patent offices.
Social implications
The proposed approach contributes an alternative manner in gathering a corpus of patents for domains like ICTC that neither exists as a searchable classification in patent offices, nor is accurately represented by man-made queries.
Originality/value
A deep learning model with two layers of neurons is developed to learn the non-linear relations between the input features and outputs, providing better performance than traditional machine learning models. The study uses advanced NLP techniques, lemmatization and part-of-speech (POS) tagging, to process the textual data of ICTC patents, and contributes a specific collection of ICTC patents that is not provided by the patent offices.
Abstract
Purpose
The purpose of this paper is to introduce a summarization method to enhance current web-search approaches by offering a summary of each clustered set of web-search results addressing the same topic, which should allow the user to quickly identify the information covered in the clustered search results. Web search engines, such as Google, Bing and Yahoo!, rank the set of documents S retrieved in response to a user query and represent each document D in S using a title and a snippet, which serves as an abstract of D. Snippets, however, are not as useful as they are designed to be, i.e. for assisting users in quickly identifying results of interest: they are inadequate in providing distinct information and in capturing the main contents of the corresponding documents. Moreover, when the intended information need specified in a search query is ambiguous, it is very difficult, if not impossible, for a search engine to identify precisely the set of documents that satisfy the user's intended request without requiring additional information. Furthermore, a document title is not always a good indicator of the content of the corresponding document either.
Design/methodology/approach
The authors propose a query-based summarizer, called QSum, to address the problems of Web search engines that use titles and abstracts to capture the contents of retrieved documents. QSum generates a concise yet comprehensive summary for each cluster of documents retrieved in response to a user query, saving the user's time and effort in searching for specific information of interest by skipping the step of browsing the retrieved documents one by one.
Findings
Experimental results show that QSum is effective and efficient in creating a high-quality summary for each cluster to enhance Web search.
Originality/value
The proposed query-based summarizer, QSum, is unique based on its searching approach. QSum is also a significant contribution to the Web search community, as it handles the ambiguous problem of a search query by creating summaries in response to different interpretations of the search which offer a “road map” to assist users to quickly identify information of interest.
Majdi A. Maabreh, Mohammed N. Al‐Kabi and Izzat M. Alsmadi
Abstract
Purpose
This study is an attempt to develop an automatic identification method for Arabic web queries and divide them into several query types using data mining. In addition, it seeks to evaluate the impact of the academic environment on using the internet.
Design/methodology/approach
The web log files were collected from one of the servers of a higher-education institute over a one-month period. A special program was designed and implemented to extract web search queries from these files and to automatically classify the Arabic queries into three types (i.e. navigational, transactional and informational) based on predefined specifications for each type.
Findings
The results indicate that students are slowly and gradually using the internet for more relevant academic purposes. Tests showed that it is possible to automatically classify Arabic queries based on query terms, with 80.6 per cent to 80.2 per cent accuracy for the two phases of the test respectively. In their future strategies, Jordanian universities should apply methods to encourage university students to use the internet for academic purposes. Web search engines in general and Arabic search engines in particular may benefit from the proposed classification method in order to improve the effectiveness and relevancy of their results in accordance with users' needs.
Originality/value
Studying internet web logs has been the subject of many papers. However, the particular domain, and the specific focuses on this research are what can distinguish it from the others.
Azadeh Mohebi, Mehri Sedighi and Zahra Zargaran
Abstract
Purpose
The purpose of this paper is to introduce an approach for retrieving a set of scientific articles in the field of Information Technology (IT) from a scientific database such as Web of Science (WoS), in order to apply scientometric indices and compare the field with others.
Design/methodology/approach
The authors propose a statistical classification-based approach for extracting IT-related articles. First, a probabilistic model of the subject IT is built using keyphrase extraction techniques. Then, IT-related articles are retrieved from all Iranian papers in WoS based on a Bayesian classification scheme: using the probabilistic IT model, an IT membership probability is assigned to each article in the database, and the articles with the highest probabilities are retrieved.
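The membership scoring can be sketched as a naive-Bayes-style combination of keyphrase probabilities. The model dictionary, the uniform prior, and the smoothing value for unseen terms are all illustrative assumptions, not the paper's fitted model:

```python
import math

def it_membership(article_terms, it_model, prior=0.5, eps=1e-6):
    """Combine per-keyphrase probabilities from the IT model into a
    single membership probability for one article.

    it_model maps a keyphrase to the (assumed) probability that a
    document containing it is IT-related; terms absent from the model
    get a small smoothing probability eps, pushing the score down.
    """
    log_odds = math.log(prior / (1 - prior))
    for term in article_terms:
        p = it_model.get(term, eps)
        log_odds += math.log(p / (1 - p))
    return 1 / (1 + math.exp(-log_odds))   # back to a probability
```

Ranking all articles by this score and keeping the highest-probability ones mirrors the retrieval step described above.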
Findings
The authors extracted a set of 1,497 IT keyphrases through the keyphrase extraction process for the probabilistic model. They evaluated the proposed retrieval approach against two alternatives: a query-based approach, in which articles are retrieved from WoS using queries composed of a limited set of IT keywords, and a research area-based approach, which retrieves articles using WoS categorizations and research areas. The evaluation and comparison results show that the proposed approach generates more accurate results while retrieving more articles related to IT.
Research limitations/implications
Although this research is limited to the IT subject, it can be generalized to any subject. However, for multidisciplinary topics such as IT, special attention should be given to the keyphrase extraction phase. In this research a bigram model is used; however, it can be extended to trigrams as well.
Originality/value
This paper introduces an integrated approach for retrieving IT-related documents from a collection of scientific documents. The approach has two main phases: building a model to represent the topic IT, and retrieving documents based on that model. The model is based on a set of keyphrases extracted from a collection of IT articles; however, the extraction technique does not rely on Term Frequency-Inverse Document Frequency, since almost all of the articles in the collection share a set of the same keyphrases. In addition, a probabilistic membership score is defined to retrieve the IT articles from a collection of scientific articles.
Martin Dillon and James Desper
Abstract
A technique is described for the automatic reformulation of boolean queries. Based on patron relevance judgements of an initial retrieval, prevalence measures are derived for terms appearing in the retrieved set of documents, reflecting each term's distribution among the relevant and non-relevant documents. These measures are then used to guide the construction of a boolean query for a subsequent retrieval. To illustrate the technique, a series of tests of its application to a small database in an experimental environment is described. Results compare favourably with feedback as employed in a SMART-type system. More extensive testing is suggested to validate the technique.
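The prevalence idea can be sketched as below. Representing each retrieved document as a set of terms and scoring a term by the difference between its prevalence in the relevant and non-relevant subsets is an assumption made for illustration, not Dillon and Desper's exact measure:

```python
def term_prevalence(docs, relevant_flags):
    """For each term in an initial retrieval, compute its prevalence
    (fraction of documents containing it) in the relevant and
    non-relevant subsets; the difference suggests which terms should
    enter the reformulated boolean query.

    docs: list of term sets; relevant_flags: parallel list of patron
    relevance judgements (True = relevant).
    """
    rel = [d for d, r in zip(docs, relevant_flags) if r]
    non = [d for d, r in zip(docs, relevant_flags) if not r]

    def frac(term, subset):
        return sum(term in d for d in subset) / len(subset) if subset else 0.0

    terms = set().union(*docs)
    return {t: frac(t, rel) - frac(t, non) for t in sorted(terms)}
```

Terms with scores near +1 concentrate in the relevant set (candidates for AND clauses), while scores near -1 mark terms to exclude in the next retrieval.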