Search results
1–10 of 272
Abstract
Purpose
The purpose of this paper is to examine the use of cloud storage in digital preservation by analyzing pricing and data retrieval models. The author recommends strategies to minimize the costs and believes cloud storage is worthy of serious consideration.
Design/methodology/approach
Few articles have been published to show the uses of cloud storage in libraries. The cost is the main concern. An overview of cloud storage pricing shows a price drop roughly once every one to one-and-a-half years. The author emphasizes the data transfer-out costs and demonstrates a case study. Comparisons and analysis of S3 and Glacier are conducted to show the differences in retrieval and costs.
Findings
Cloud storage solutions like Glacier can be very attractive for long-term digital preservation if data can be kept within the provider's data zone and data transfer out can be minimized.
Practical implications
Institutions can benefit from cloud storage by understanding the cost models and data retrieval models. Multiple strategies are suggested to minimize the costs.
Originality/value
The paper is intended to bridge a gap in the literature on uses of cloud storage. Cloud storage pricing charts, especially for data transfer-out pricing, are presented to show the price drops over the past eight years. The costs of storing and retrieving data in Amazon S3 and Glacier are analyzed in detail. Comparisons of S3 and Glacier show that Glacier has unique advantages over other cloud storage solutions. Finally, strategies are suggested to minimize the costs of using cloud storage. The analysis shows that cloud storage can be very useful in digital preservation.
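The storage-versus-retrieval trade-off the paper analyzes reduces to simple arithmetic: Glacier's lower storage price wins whenever transfer-out stays small. A minimal sketch of that comparison, with illustrative per-GB prices that are assumptions for this example, not current AWS rates:

```python
def monthly_cost(storage_gb, transfer_out_gb, storage_price, transfer_price):
    # Simplified model: storage charge plus data transfer-out charge.
    # Ignores request fees and retrieval tiers for brevity.
    return storage_gb * storage_price + transfer_out_gb * transfer_price

# Hypothetical prices in USD/GB-month (storage) and USD/GB (transfer out).
s3 = monthly_cost(10_000, 500, storage_price=0.023, transfer_price=0.09)
glacier = monthly_cost(10_000, 500, storage_price=0.004, transfer_price=0.09)

print(s3)       # 275.0
print(glacier)  # 85.0
```

Because the transfer-out rate is the same in both cases, the gap between the two figures is driven entirely by the storage price, which is why the paper's strategies focus on keeping data in the provider's data zone and minimizing transfer out.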
Abstract
Purpose
The purpose of this article is to provide an overview of current uses of cloud computing (CC) services in libraries, address a gap identified in integrating cloud storage in IaaS level, and show how to use EC2 tools for easy backup and resource monitoring.
Design/methodology/approach
The article begins with a literature review of CC uses in libraries, organized at the SaaS, PaaS, and IaaS levels. The author presents his experience of integrating the cloud storage services S3 and GCS, and also shows how to use virtual machine EC2 tools for backup and resource monitoring.
Findings
The article describes a case study of integrating cloud storage using S3 and GCS. S3 can be integrated with any program, whether it runs on the cloud or locally, while GCS is only suitable for applications running on GAE. The limitations of the current GCS approach make it hard to use as stand-alone cloud storage. The author also discusses virtual machines using EC2 and its related tools for backup, storage expansion, and monitoring; these services make system administration easier compared with the traditional approach.
Research limitations/implications
The article presents current CC uses in libraries at the SaaS, PaaS, and IaaS levels. CC services are changing quickly. For example, Google has stated that its APIs are experimental. Readers should be aware of this.
Practical implications
The author shows his experience of integrating cloud storage services. Readers can understand the similarities and differences between S3 and GCS. In addition, readers can learn the advantages and concerns associated with implementing cloud computing. Readers are encouraged to consider questions such as content, skills, costs, and security.
Originality/value
There are many uses of CC services in libraries. However, gaps are identified: in IaaS cloud storage, a few libraries used Amazon S3 and Microsoft Azure, but none explored using Google Cloud Storage (GCS); none provided implementation details, difficulties, and comparisons of S3 and GCS; and a few articles have briefly discussed implementations on Amazon EC2, but have not provided specific details about upgrade and backup. This article addresses those gaps.
Maryam AlJame and Imtiaz Ahmad
Abstract
The evolution of technologies has unleashed a wealth of challenges by generating massive amounts of data. Recently, biological data has increased exponentially, which has introduced several computational challenges. DNA short-read alignment is an important problem in bioinformatics. The exponential growth in the number of short reads has increased the need for an ideal platform to accelerate the alignment process. Apache Spark is a cluster-computing framework that provides data parallelism and fault tolerance. In this article, we propose a Spark-based algorithm, called Spark-DNAligning, to accelerate the DNA short-read alignment problem. Spark-DNAligning exploits Apache Spark's performance optimizations such as broadcast variables, join after partitioning, caching, and in-memory computation. Spark-DNAligning is evaluated in terms of performance by comparing it with the SparkBWA tool and a MapReduce-based algorithm called CloudBurst. All experiments are conducted on Amazon Web Services (AWS). Results demonstrate that Spark-DNAligning outperforms both tools, providing speedups in the range of 101–702 when aligning gigabytes of short reads to the human genome. Empirical evaluation reveals that Apache Spark offers promising solutions to the DNA short-read alignment problem.
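The per-read work that such tools distribute can be illustrated with a seed-and-verify sketch in plain Python. The function names and the exact-match simplification are assumptions of this example; Spark-DNAligning itself partitions this kind of work across executors, shipping the reference index as a broadcast variable and caching it in memory:

```python
from collections import defaultdict

def build_kmer_index(reference, k):
    # Map every k-mer of the reference sequence to its start positions.
    index = defaultdict(list)
    for i in range(len(reference) - k + 1):
        index[reference[i:i + k]].append(i)
    return index

def align_read(read, reference, index, k):
    # Seed with the read's first k-mer, then verify the full read at
    # each candidate position (exact matching only, for brevity).
    hits = []
    for pos in index.get(read[:k], []):
        if reference[pos:pos + len(read)] == read:
            hits.append(pos)
    return hits

reference = "ACGTACGTGACCA"
index = build_kmer_index(reference, k=4)
print(align_read("ACGTG", reference, index, 4))  # [4]
```

In a Spark setting the reads would form the distributed dataset and the index the broadcast value, so each executor aligns its partition of reads without reshuffling the reference.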
Huber Flores, Satish Narayana Srirama and Carlos Paniagua
Abstract
Purpose
Cloud computing becomes mobile when a mobile device tries to access the shared pool of computing resources provided by the cloud, on demand. Mobile applications may enrich their functionality by delegating heavy tasks to the cloud, as remote processing and storage have become possible by adding asynchronous behavior to the communication. However, developing mobile cloud applications involves working with services and APIs from different cloud vendors, which mostly are not interoperable across clouds. Moreover, by adding asynchronicity, mobile applications must rely on push mechanisms, which are considered only moderately reliable and thus not recommended in scenarios that require high scalability and quality of service (QoS). To counter these problems, the purpose of this paper is to design a middleware framework, Mobile Cloud Middleware (MCM), which handles the interoperability issues and eases the use of process-intensive services from smartphones by extending the concept of the mobile host.
Design/methodology/approach
MCM is developed as an intermediary between the mobile and the cloud, which hides the complexity of dealing with multiple cloud services from mobiles. Several applications are presented to show the benefits of mobiles going cloud-aware. Moreover, to verify the scalability of MCM, load tests are performed on the hybrid cloud resources using well-known load balancing mechanisms like HAProxy and Tsung.
Findings
The study found that it is possible to handle hybrid cloud services from mobiles by using MCM. The analysis demonstrated that MCM shows reasonable performance levels of interaction with the user, thus validating the proof of concept. Moreover, MCM decreases the effort involved in developing mobile cloud applications and helps keep soft real-time responses by using its asynchronous approach.
Originality/value
MCM fosters the utilization of different types of cloud services rather than the traditional mobile cloud services based on data synchronization. By offloading heavy tasks to the cloud, the framework extends the processing power and storage capabilities of constrained smartphones. The applications mentioned in the paper add value as success stories for the mobile cloud computing domain in general.
Alexander Döschl, Max-Emanuel Keller and Peter Mandl
Abstract
Purpose
This paper aims to evaluate different approaches for the parallelization of compute-intensive tasks. The study compares a Java multi-threaded algorithm, distributed computing solutions with MapReduce (Apache Hadoop) and resilient distributed data set (RDD) (Apache Spark) paradigms and a graphics processing unit (GPU) approach with Numba for compute unified device architecture (CUDA).
Design/methodology/approach
The paper uses a simple but computationally intensive puzzle as a case study for its experiments. To find all solutions using brute-force search, 15! permutations had to be computed and tested against the solution rules. The experimental application comprises a Java multi-threaded algorithm, distributed computing solutions with the MapReduce (Apache Hadoop) and RDD (Apache Spark) paradigms, and a GPU approach with Numba for CUDA. The implementations were benchmarked on Amazon EC2 instances for performance and scalability measurements.
Findings
The comparison of the solutions with Apache Hadoop and Apache Spark under Amazon EMR showed that the processing time measured in CPU minutes with Spark was up to 30% lower, while the performance of Spark especially benefits from an increasing number of tasks. With the CUDA implementation, more than 16 times faster execution is achievable for the same price compared to the Spark solution. Apart from the multi-threaded implementation, the processing times of all solutions scale approximately linearly. Finally, several application suggestions for the different parallelization approaches are derived from the insights of this study.
Originality/value
There are numerous studies that have examined the performance of parallelization approaches. Most of these studies deal with processing large amounts of data or mathematical problems. This work, in contrast, compares these technologies on their ability to implement computationally intensive distributed algorithms.
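A brute-force search over permutations parallelizes naturally because the permutation space can be partitioned, for example by fixing the first element, and each partition searched independently before summing the counts. A minimal single-machine sketch of that decomposition, using counting derangements as a stand-in rule (the paper's actual puzzle rules are not reproduced here):

```python
from itertools import permutations

def count_in_partition(n, first):
    # Count derangements of 0..n-1 whose first element is `first`.
    # Each call is independent, so partitions can run on separate
    # workers (threads, Hadoop/Spark tasks, or GPU blocks).
    if first == 0:
        return 0  # a derangement never keeps an element in place
    rest = [x for x in range(n) if x != first]
    total = 0
    for perm in permutations(rest):
        candidate = (first,) + perm
        if all(candidate[i] != i for i in range(n)):
            total += 1
    return total

n = 8
counts = [count_in_partition(n, f) for f in range(n)]
print(sum(counts))  # 14833 derangements of 8 elements
```

The list comprehension is the serial stand-in for the distributed map step; replacing it with a Spark `parallelize(...).map(...).sum()` or a pool of threads changes only where the partitions run, which is exactly the comparison the study benchmarks.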
Abstract
Purpose
This paper aims to explore the educational potential of “cloud computing” (CC), and how it could be exploited in enhancing engagement among educational researchers and educators to better understand and improve their practice, in increasing the quality of their students' learning outcomes, and, thus, in advancing the scholarship of teaching and learning (SoTL) in a higher education context.
Design/methodology/approach
Adoption of the ideals of SoTL is considered an important approach for salvaging a higher education landscape around the world that is currently in a state of flux and evolution as a result of rapid advances in information and communications technology and the changing needs of digital natives. The study is based on ideas conceptualised from reading several editorials and articles on server virtualisation technology and cloud computing in several journals, with eSchool News the most important among them. The paper identifies two cloud computing tools and their salient features, and describes how cloud computing can be used to achieve the ideals of SoTL.
Findings
The study reports that the cloud, as a ubiquitous computing tool and a powerful platform, can enable educators to practise the ideals of SoTL. Two of the most useful free cloud computing applications are Google Apps for Education, a free online suite of tools that includes Gmail for e-mail and Google Docs for documents, spreadsheets, and presentations, and Microsoft's cloud service (Live@edu), which includes SkyDrive. Using the cloud approach, everybody can work on the same document at the same time, correcting and improving it dynamically in a collaborative manner.
Practical implications
Cloud computing has a significant place in higher education in that the appropriate use of cloud computing tools can enhance engagement among students, educators, and researchers in a cost-effective manner. There are security concerns, but they do not overshadow the benefits.
Originality/value
The paper provides insights into the possibility of using cloud computing delivery for originating a new instructional paradigm that makes a shift possible from the traditional practice of teaching as a private affair to a peer‐reviewed transparent process, and makes it known how student learning can be improved generally, not only in one's own classroom but also beyond it.
Abstract
Purpose
The purpose of this paper is to produce figures showing the carbon footprint of the knowledge industry – from creation to distribution and use of knowledge, and to provide comparative figures for digital distribution and access.
Design/methodology/approach
An extensive literature search and environmental scan was conducted to produce data relating to the CO2 emissions from various industries and activities such as book and journal production, photocopying activities, information technology and the internet. Other sources such as the International Energy Agency (IEA), Carbon Monitoring for Action (CARMA), Copyright Licensing Agency, UK (CLA), Copyright Agency Limited, Australia (CAL), etc., have been used to generate emission figures for production and distribution of print knowledge products versus digital distribution and access.
Findings
The current practices for production and distribution of printed knowledge products generate an enormous amount of CO2. It is estimated that the book industries in the UK and USA alone produce about 1.8 million tonnes and about 11.27 million tonnes of CO2, respectively. CO2 emissions for the worldwide journal publishing industry are estimated at about 12 million tonnes. It is shown that the production and distribution costs of digital knowledge products are negligible compared with the environmental costs of producing and distributing printed knowledge products.
Practical implications
Given the astounding emission figures for production and distribution of printed knowledge products, and the associated activities for access and distribution of these products, for example, emissions from photocopying activities permitted within the provisions of statutory licenses provided by agencies like CLA, CAL, etc., it is proposed that a digital distribution and access model is the way forward, and that such a system will be environmentally sustainable.
Originality/value
It is expected that the findings of this study will pave the way for further research and this paper will be extremely helpful for design and development of the future knowledge distribution and access systems.
Tyler O. Walters and Katherine Skinner
Abstract
Purpose
This paper aims to examine the emerging field of digital preservation and its economics. It seeks to consider in detail the cooperative model and the path it provides toward sustainability as well as how it fosters participation by cultural memory organizations and their administrators, who are concerned about what digital preservation will ultimately cost and who will pay.
Design/methodology/approach
The authors cast light on the decisions that administrators of cultural memory organizations are making on a daily basis, namely, to preserve or not to preserve their digital collections. They assert that either way, a decision is being made, costs are incurred, and consequences are being levied. The authors begin by exploring the costs incurred by cultural memory organizations if they do not quickly establish digital preservation programs for their digital assets. They then turn to the digital preservation field's preliminary findings regarding the costs of preserving digital assets and who should ideally subsidize this investment.
Findings
The authors describe one economically sustainable digital preservation model in practice, the MetaArchive Cooperative, a distributed digital preservation network that has been in operation since 2004. The MetaArchive has built its economic sustainability model and has experienced successes with it for over five years.
Originality/value
There are very few studies or articles in the literature that review studies on the economics of digital preservation and apply them to digital preservation initiatives in action. This article provides that application and further articulates why cultural memory organizations should invest themselves and learn how to provide for the preservation of their own digital collections.
Abstract
Purpose
The purpose of this paper is to examine three different, but related, distributed computing technologies in the context of public‐funded e‐science research, and to present the author's viewpoint on future directions.
Design/methodology/approach
The paper takes a critical look at the state‐of‐the‐art with regard to three enabling technologies for e‐science. It forms a set of arguments to support views on the evolution of these technologies in support of the e‐science applications of the future.
Findings
Although grid computing has been embraced in public-funded higher education institutions and research centres as an enabler for projects pertaining to e-science, the adoption of desktop grids is low. With the advent of cloud computing and its promise of on-demand provisioning of computing resources, it is expected that the conventional form of grid computing will gradually move towards cloud-based computing. However, cloud computing also brings with it the "pay-per-use" economic model, and this may act as a stimulus for organisations engaged in e-science to harvest existing underutilised computation capacity through the deployment of organisation-wide desktop grid infrastructures. Conventional grid computing will continue to support future e-science applications, although its growth may remain stagnant.
Originality/value
The paper argues that there will be a gradual shift in the underlying distributed computing technologies that support e‐science applications of the future. While cloud computing and desktop grid computing will gain in prominence, the growth of traditional cluster‐based grid computing may remain dormant.
Srinimalan Balakrishnan Selvakumaran and Daniel Mark Hall
Abstract
Purpose
The purpose of this paper is to investigate the feasibility of an end-to-end simplified and automated reconstruction pipeline for digital building assets using the design science research approach. Current methods to create digital assets by capturing the state of existing buildings can provide high accuracy but are time-consuming, expensive and difficult.
Design/methodology/approach
Using design science research, this research identifies the need for a crowdsourced and cloud-based approach to reconstruct digital building assets. The research then develops and tests a fully functional smartphone application prototype. The proposed end-to-end smartphone workflow begins with data capture and ends with user applications.
Findings
The resulting implementation can achieve a realistic three-dimensional (3D) model characterized by different typologies, minimal trade-off in accuracy and low processing costs. By crowdsourcing the images, the proposed approach can reduce costs for asset reconstruction by an estimated 93% compared to manual modeling and 80% compared to locally processed reconstruction algorithms.
Practical implications
The resulting implementation achieves "good enough" reconstruction of as-is 3D models with minimal trade-offs in accuracy compared with automated approaches and 15× cost savings compared with a manual approach. Potential facility management use cases include issue and information tracking, 3D mark-up, and multi-model configurators.
Originality/value
Through user engagement, development, testing and validation, this work demonstrates the feasibility and impact of a novel crowdsourced and cloud-based approach for the reconstruction of digital building assets.