Crowdsourcing for search engines: perspectives and challenges

Mohammad Moradi (Young Researchers and Elite Club, Qazvin Branch, Islamic Azad University, Qazvin, Iran)

International Journal of Crowd Science

ISSN: 2398-7294

Article publication date: 16 April 2019

Issue publication date: 12 June 2019


Abstract

Purpose

As a relatively new computing paradigm, crowdsourcing has gained enormous attention over the past decade. Its compliance with Web 2.0 principles also opens up unprecedented opportunities to empower related services and mechanisms by leveraging human intelligence and problem-solving abilities. With respect to the pivotal role of search engines in the Web and the information community, this paper aims to investigate the advantages and challenges of incorporating people – as intelligent agents – into search engines’ workflow.

Design/methodology/approach

To emphasize the role of the human in computational processes, some specific and related areas are studied. Then, through studying the current trends in the field of crowd-powered search engines and analyzing the actual needs and requirements, the perspectives and challenges are discussed.

Findings

As the research on this topic is still in its infancy, it is believed that this study can be considered a roadmap for future works in the field. In this regard, the current status and development trends are delineated by providing a general overview of the literature. Moreover, several recommendations for extending the applicability and efficiency of the next generation of crowd-powered search engines are presented. In fact, becoming aware of the different aspects and challenges of constructing search engines of this kind can shed light on how to develop working systems with respect to essential considerations.

Originality/value

The present study aimed to portray the big picture of crowd-powered search engines and the possible challenges and issues. As one of the early works that provides a comprehensive report on different aspects of the topic, it can be regarded as a reference point.


Citation

Moradi, M. (2019), "Crowdsourcing for search engines: perspectives and challenges", International Journal of Crowd Science, Vol. 3 No. 1, pp. 49-62. https://doi.org/10.1108/IJCS-12-2018-0026

Publisher


Emerald Publishing Limited

Copyright © 2019, Mohammad Moradi.

License

Published in International Journal of Crowd Science. Published by Emerald Publishing Limited. This article is published under the Creative Commons Attribution (CC BY 4.0) licence. Anyone may reproduce, distribute, translate and create derivative works of this article (for both commercial and non-commercial purposes), subject to full attribution to the original publication and authors. The full terms of this licence may be seen at http://creativecommons.org/licences/by/4.0/legalcode


1. Introduction

The days of relying only on machines for computing tasks and problem solving are gone. In fact, the introduction of crowdsourcing (Howe, 2006) as a mold-breaking computing paradigm has changed the playground drastically. Despite many years of research and development, machines have not been able to handle all computational problems completely and independently, especially when it comes to cognitive- and intelligence-intensive tasks (Zadeh, 2008; Del Prado, 2015; Whitney, 2017). Therefore, putting humans in the loop as collaborators, cooperators and even coordinators (rather than just users or supervisors) (Folds, 2016) can be considered the silver bullet for tackling a wide variety of problems in different domains (Kamar et al., 2012; Weyer et al., 2015; Ofli et al., 2016; Holzinger et al., 2016). Narrowing the view down to a specific niche, there is interesting common ground between crowdsourcing and the Web 2.0 (O'Reilly, 2007) concepts in their perspectives on humans’ roles. According to the Web 2.0 manifesto (O'Reilly, 2007), Web users should evolve from mere consumers into active producers. In this regard, notable efforts such as Wikipedia – a Web 2.0 iconic example of collaborative participation of users and a successful best practice of crowdsourcing-based knowledge acquisition – can be inspirational and motivating. Such an example also highlights the invaluable applications of human-centered, intelligence-oriented participation in Web-related workflows. Given the principal role of search engines in the Web and the information society, it is worth studying the possible perspectives, and also the challenges, of incorporating humans (i.e. Web users) into information retrieval, validation and evaluation processes. Such activities, which can affect search engines and some of their related processes, are discussed in this paper with the following structure: some background and related works are introduced in Section 2. The rationale behind human-powered search engines is investigated in Section 3. Applications and perspectives of leveraging crowdsourcing for search engines and the related challenges are studied in Sections 4 and 5, respectively. Moreover, a concise literature review is conducted in Section 6, and some suggestions for future works are presented in Section 7.

2. Related works

Since its introduction, crowdsourcing (Howe, 2006; Brabham, 2008; Ikediego et al., 2018) has provided many unprecedented opportunities to facilitate traditional workflows and processes in a wide variety of (mostly) technology-related domains. In this regard, one can see numerous example scenarios in a broad range of application areas, from robotics (Breazeal et al., 2013; Jain et al., 2015; Moradi et al., 2016; Almosalami et al., 2018) and machine learning (Simpson et al., 2015; Wallace et al., 2017) to knowledge management (Callaghan, 2016; Dimitrova and Scarso, 2017) and much more. Following this working idea, information retrieval researchers have pursued the applicability of leveraging people’s potential for improving related tasks. More specifically, taking advantage of collective human intelligence for corpus annotation (Krishna et al., 2017; Tayyub et al., 2017), query interpretation (Ciceri et al., 2016; Chen et al., 2016) and other database-related processes (Liptchinsky et al., 2015; Trushkowsky et al., 2015; Zhao et al., 2015) has gained momentum. The following works are worth mentioning: as a notable work in this context, Franklin et al. (2011) proposed CrowdDB. The system leverages human input to process and answer queries for which machines could not provide appropriate results. Additionally, they introduced CrowdSQL as an extension of SQL to support the underlying idea. To take advantage of Amazon’s Mechanical Turk (AMT) for more complicated tasks and processes, namely, database-related ones, Marcus et al. (2011) introduced a new query system, Qurk. Designing an algorithm for human-driven filtering of data items based on some attributes is the theme of the research reported in Parameswaran et al. (2012). As another inspirational work, the benefits of crowdsourcing-based relevance assessment for XML retrieval are reported by Alonso et al. (2010).

On the other hand, search engines play an integral role in the dissemination of information on the Web. However, despite the remarkable advancements in the field (Deng and Feng, 2011; Hariri, 2013; Lewandowski, 2015), there are still several essential issues with search engines in identifying users’ intentions (Jansen et al., 2007; Ruotsalo et al., 2015) and providing them with the most appropriate answers (Thelwall, 2008; Uyar, 2009).

Specifically, among the major challenges search engines face, understanding humans (their intentions and exact needs) and providing them with human-level responses are of high importance. Due to the intrinsic weakness of (current) machines in dealing with cognitive and intelligence-intensive tasks, such as the interpretation of natural languages, one cannot expect perfect and flawless search engines with clear-cut results. As a result, optimal, ideal search engines seem not to be in sight, at least for now. To fill this meaningful gap between what users (searchers) want from search engines and what search engines can bring to users, leveraging humans’ intelligence and cognitive/problem-solving abilities can be considered a game changer (Figure 1). Therefore, the major motivations of this study are of two types:

  1. investigation of reasons, benefits and nuts and bolts of typical crowd-powered search engines, i.e. theoretical motivations; and

  2. studying best practices, current solutions, practical implications and challenges of relying on crowds’ power for evolving traditional search engines, i.e. practical motivations.

In this regard, this paper aims to provide a reference point that reflects the current status of the field and draws a road map for future works.

3. Why is crowdsourcing needed for search?

Although machines can easily outperform humans in (most) computational tasks, when it comes to cognitive and intelligence-intensive problems, including natural language processing, they exhibit several critical shortcomings (Poirier, 2017). To deal with such issues, leveraging humans’ potential and abilities has opened a new window toward taking advantage of man-machine cooperation. Over the years, many research studies have been conducted to benefit from such a hybrid strategy (Dounias, 2015; Kamar, 2016; Dellermann et al., 2018). Among them, some of the most human-centric application areas, including information search and relevance assessment, greatly depend on human intervention to provide reliable and accurate results. As everyone experiences in daily Web browsing, even highly sophisticated search engines with state-of-the-art algorithms and procedures are not as strong and accurate as users may expect, specifically in interpreting search queries and, consequently, retrieving relevant results. In other words, demystifying users’ intentions behind search terms, finding the most relevant matches and ranking the results so that they best conform to users’ goals and preferences cannot be achieved by relying only on machines’ intelligence and capabilities. In this regard, so-called crowd(human)-powered search engines (Parameswaran et al., 2014) have gained momentum. The main rationale behind such search systems is involving individuals and leveraging their cognitive intelligence (and searching expertise) for the sake of providing users (i.e. the initial searchers) with what they could not find by themselves. As a real-world example, Digle[1] – a people-powered search engine – crowdsources search queries to its large body of participants (search workers/searchers). To obtain more accurate and relevant results, users are asked to provide some additional information on their requests, such as the related category. Question-and-answer sites and forums have provided similar facilities for years; crowdsourcing-based search engines, however, are in charge of generalizing the concept and presenting their users with specific, to-the-point, relevant and humanized information.

4. How does crowdsourcing help Web search?

Humans’ power – depending on the context and application – can be leveraged in many different ways, from providing training data for the machine (Kairam and Heer, 2016; Chang et al., 2017) to collaborating with an algorithm to provide more precise outputs (Fan et al., 2014; Sarma et al., 2014), e.g. in the form of a quality controller or supervisor. When it comes to Web search, crowdsourcing is mainly related to improving the underlying processes or providing users (searchers) with some assistance in finding more relevant answers. In addition to analyzing logs and query submission patterns (Park et al., 2015; Zahedi et al., 2017) to find out users’ requirements (indirect crowdsourcing or crowd analysis), game-based methods (Law et al., 2009a; Law et al., 2009b; Bennett et al., 2009; Ma et al., 2009), as a tacit approach for facilitating the search process, are at the center of attention. From a general point of view, humans’ role in the search process can be categorized into the four major classes as follows.

4.1 Crowd-searching

In this category, it is supposed that the user could not find what (s)he is looking for. This may be caused by a lack of adequate searching abilities, having no knowledge of the target topic and so on. In such a case, the aim is to crowdsource the problem (i.e. the keywords to be searched) and return the most relevant crowd-searched results to the user. To obtain more accurate answers, users should be asked to provide as much additional information as possible on what they want to find. Digle, DataSift (Parameswaran et al., 2014) and CrowdSearcher (Bozzon et al., 2012a) are major solutions that provide users with crowd-searched findings on desired topics. Following this idea, the people’s power has been leveraged in previous inspirational studies (Jeong et al., 2013; Spirin, 2014) for answering Twitter questions and finding design examples, respectively. The results obtained in this approach may be used for further managerial processes, such as query interpretation.
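
To make the workflow concrete, the following is a minimal Python sketch of crowd-searching under illustrative assumptions: the SearchRequest structure, the notion of a worker as a callable and the de-duplication step are made up here for clarity and do not describe the internals of Digle, DataSift or CrowdSearcher.

    # Minimal sketch of a crowd-searching workflow. A "worker" is modelled as any
    # callable that, given a request, returns candidate URLs; in a real system this
    # would be a remote task assignment on a crowdsourcing platform.
    from dataclasses import dataclass, field


    @dataclass
    class SearchRequest:
        query: str                                    # what the user typed
        context: dict = field(default_factory=dict)   # extra hints (category, language, ...)


    def crowd_search(request, workers, answers_per_worker=3):
        """Fan a request out to crowd searchers and merge what they return."""
        seen, merged = set(), []
        for worker in workers:
            for url in worker(request)[:answers_per_worker]:
                if url not in seen:                   # de-duplicate across searchers
                    seen.add(url)
                    merged.append(url)
        return merged


    # Toy usage: two simulated searchers answering the same request.
    if __name__ == "__main__":
        req = SearchRequest("budget mirrorless camera for astrophotography",
                            context={"category": "photography"})
        w1 = lambda r: ["https://example.org/astro-guide", "https://example.org/cameras"]
        w2 = lambda r: ["https://example.org/cameras", "https://example.org/forums/astro"]
        print(crowd_search(req, [w1, w2]))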

4.2 Crowd-clarifying

One of the main issues search engines and information retrieval systems deal with is demystifying and clarifying submitted queries. As this problem is mainly related to natural language processing (a hard AI problem), machines have difficulty handling it. Unfamiliarity with the search (target) language, entering long and ambiguous search terms, typos and semantic errors are among the reasons that imply the need for crowdsourcing-based clarification of search queries. In fact, human intelligence is the best means to uncover a human’s intention behind a specific query. Unlike crowd-searching (Bozzon et al., 2012b), this approach is not necessarily online or (semi-)real-time. The human workforce, for this purpose, is used to interpret the query, break it down into several essential meaningful parts, suggest additional choices for expanding the search terms, and find similar terms (Kim et al., 2013) and more appropriate alternatives to replace the input search term(s) with. These steps improve the query-result matching and retrieval processes.
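
A sketch of the aggregation side of crowd-clarifying follows, assuming a simple setting in which several workers each propose rewritten or expanded queries and only suggestions proposed by at least two workers are kept; the threshold and data layout are illustrative assumptions, not part of any cited system.

    # Aggregate crowd-proposed clarifications of an ambiguous query by agreement count.
    from collections import Counter


    def aggregate_clarifications(original_query, worker_suggestions, min_agreement=2):
        """Keep rewrites/expansions that at least `min_agreement` workers proposed."""
        votes = Counter(s.strip().lower()
                        for suggestions in worker_suggestions
                        for s in suggestions)
        accepted = [term for term, count in votes.most_common() if count >= min_agreement]
        return {"original": original_query, "clarified_candidates": accepted}


    if __name__ == "__main__":
        answers = [
            ["jaguar car price", "jaguar animal habitat"],
            ["jaguar car price", "jaguar xf review"],
            ["jaguar animal habitat", "jaguar car price"],
        ]
        print(aggregate_clarifications("jaguar", answers))
        # 'jaguar car price' and 'jaguar animal habitat' survive; 'jaguar xf review' does not.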

4.3 Crowd-sifting

Conceptually similar to crowd-searching, crowd-sifting is an umbrella term for a set of activities devoted to the preparation of intermediate information. Data labeling and classification are important tasks in this category. In doing so, the information that could be matched with the respective queries is filtered and organized to achieve higher performance (Milne et al., 2008). From another point of view, the information retrieved through the automatic search process should be validated by the people in order to be calibrated and normalized (Yan et al., 2010). Such supervisory tasks are considered a pre-processing step for answer generation. Due to the need for recruiting a relatively large body of participants and performing precise computation and supervisory routines, this is a costly and time-consuming approach.
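
The sketch below illustrates crowd-sifting as a pre-processing filter over machine-retrieved candidates, assuming (purely for illustration) that each validator casts a boolean "usable or not" vote and that a simple majority decides.

    # Keep a machine-retrieved item only if a majority of crowd validators approve it.
    def sift(candidates, votes, threshold=0.5):
        """`votes` maps item id -> list of booleans from crowd validators."""
        kept = []
        for item in candidates:
            ballots = votes.get(item["id"], [])
            if ballots and sum(ballots) / len(ballots) > threshold:
                kept.append(item)
        return kept


    if __name__ == "__main__":
        candidates = [{"id": "d1", "url": "https://example.org/a"},
                      {"id": "d2", "url": "https://example.org/b"}]
        votes = {"d1": [True, True, False], "d2": [False, False, True]}
        print([c["id"] for c in sift(candidates, votes)])   # -> ['d1']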

4.4 Crowd-rating

Regardless of how answers are produced, there are two critical post-processing steps. The first is evaluating the relevance of candidate answer sets to the submitted query, which is a determining process for the final answer generation (Alonso et al., 2008; Lewandowski, 2015). This easily crowdsourcable process can also be performed tacitly by analyzing users’ feedback and measuring their satisfaction. The second, ranking the results (Kim et al., 2013), plays an important role in helping to find the most relevant items. These processes are interrelated and interdependent, and through them the ultimate results are populated and organized in a user-friendly manner.
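
As an illustration of crowd-rating, the sketch below averages graded relevance judgments per result and re-ranks accordingly; the 0-3 grading scale and the use of a plain mean are assumptions made for the example, not a prescription from the literature.

    # Re-rank results by the mean of crowd-assigned relevance grades (assumed 0-3 scale).
    from statistics import mean


    def rerank_by_crowd_rating(results, judgments):
        """`judgments` maps result id -> list of integer grades from crowd raters."""
        scored = [(mean(judgments.get(r["id"], [0])), r) for r in results]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [r for _, r in scored]


    if __name__ == "__main__":
        results = [{"id": "r1", "title": "Loosely related page"},
                   {"id": "r2", "title": "Highly relevant page"}]
        judgments = {"r1": [1, 0, 1], "r2": [3, 2, 3]}
        print([r["id"] for r in rerank_by_crowd_rating(results, judgments)])  # -> ['r2', 'r1']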

According to the aforementioned procedures, the human as the crowdworker can play two major roles (Figure 2):

  1. Search assistant: In this role, referred to as crowd-searcher, humans’ participation is leveraged to directly assist Web searchers. They are not involved in the background supervisory processes; only their searching skills and abilities are drawn upon.

  2. System collaborator: The last three categories delineated previously in this section take advantage of humans’ intelligence and knowledge for query analysis, relevance assessment and similar supervisory workflows. Therefore, the people in such contexts serve as collaborators and/or experts who take part in the decision-making process.

5. Challenges

Despite the several remarkable benefits of crowdsourcing for facilitating the Web search process at different levels, relying on humans’ power raises some essential challenges. Underestimating these issues and their consequences can greatly affect the efficiency of the process. These challenges fall into two broad classes: human-related and technical ones.

5.1 Human-related challenges

No one could improve the Web search process better than (expert) users and, on the other hand, no one could undermine or degrade it as much as they can through their behavior and operations. With this fact in mind, there are some influential factors that should be considered.

5.1.1 Motivation and incentives.

Although crowdsourcing, in some cases, is built on the shoulders of volunteers, when it comes to critical and serious applications that should be performed in near real-time, a volunteer-only approach is not effective. In this regard, recruiting active and responsible (and possibly expert) participants is a must-have requirement that imposes remarkable costs.

5.1.2 Challenging tasks.

As mentioned earlier, a common type of task in the context of Web search is the interpretation of long, ambiguous and complicated search terms. Due to some intrinsic issues in such cases, e.g. obscure submissions by users in a language other than their own, crowdworkers may be uninterested in demystifying those inputs. In other words, highly prolonged and erroneous inputs – which are prevalent in search engines – may affect the crowdsourcing process. To cope with such issues, applying a preliminary machine interpretation or increasing the payment for complicated submissions (tasks) are among the working solutions.

5.1.3 Integrity and scalability.

Machine interpretation of search terms and finding relevant answers depend only on the underlying algorithms. Replacing them with a human-driven strategy can be influenced by several, if not many, variables. For this reason, similar answers for similar search terms are unlikely in the absence of further supervisory (integration) steps. On the other hand, adversarial intentions can affect the answer seeking process (Harris, 2011; Difallah et al., 2012). Therefore, there is a need for quality control processes (Daniel et al., 2018); otherwise, the reliability of human-powered information searching may be questioned. Further, scalability is another challenging issue in this context, specifically when dealing with a large number of users. In such cases, recruiting and managing many active crowd-searchers to guarantee the efficiency of the system will impose remarkable costs and technical considerations.
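
One common family of quality-control measures surveyed by Daniel et al. (2018) is screening workers against "gold" tasks whose answers are known in advance. The sketch below illustrates that idea with made-up data structures and an arbitrary 0.7 accuracy cut-off.

    # Screen workers by their accuracy on gold tasks with known answers.
    def trusted_workers(responses, gold, min_accuracy=0.7):
        """`responses` maps worker id -> {task id: answer}; `gold` maps task id -> answer."""
        trusted = set()
        for worker, answers in responses.items():
            graded = [(t, a) for t, a in answers.items() if t in gold]
            if graded:
                accuracy = sum(a == gold[t] for t, a in graded) / len(graded)
                if accuracy >= min_accuracy:
                    trusted.add(worker)
        return trusted


    if __name__ == "__main__":
        gold = {"g1": "relevant", "g2": "not_relevant"}
        responses = {"w1": {"g1": "relevant", "g2": "not_relevant", "t9": "relevant"},
                     "w2": {"g1": "not_relevant", "g2": "not_relevant"}}
        print(trusted_workers(responses, gold))   # -> {'w1'}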

5.2 Technical challenges

In addition to the usual technical complexities of search engines, some further considerations are needed to manage crowdsourcing-related issues, including the following.

5.2.1 Response time.

An essential advantage (and performance measure) of search engines is reducing retrieval time. Currently, most search engines retrieve the initial answers in less than a few seconds. Such a feature is one of the most important superiorities of traditional search engines over human-powered ones. Assigning search tasks to the crowd, finding relevant results by the people, and the validation, integration and retrieval of the most relevant answers are time-consuming processes that not only exceed near real-time performance but also impose a remarkable, annoying delay. Although it has been shown that in some cases users prefer a slow search process in order to acquire more accurate results (Teevan et al., 2014), this is not the case for general purposes.
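
A common way to bound the delay is a machine-first, crowd-fallback design: machine results are returned immediately, and crowd answers are merged in only if they arrive before a deadline. The sketch below illustrates this with hypothetical machine_search and crowd_search callables and an arbitrary time budget; it is not a description of any cited system.

    # Machine-first retrieval with a bounded crowd fallback.
    import concurrent.futures


    def answer(query, machine_search, crowd_search, crowd_budget_s=30.0):
        machine_results = machine_search(query)           # fast path, always returned
        pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
        future = pool.submit(crowd_search, query)         # slow path, bounded by a deadline
        try:
            crowd_results = future.result(timeout=crowd_budget_s)
        except concurrent.futures.TimeoutError:
            crowd_results = []                            # crowd missed the deadline
        finally:
            pool.shutdown(wait=False)                     # do not block on late crowd answers
        return machine_results + [r for r in crowd_results if r not in machine_results]


    # Toy usage with instant stand-ins for both retrieval paths.
    if __name__ == "__main__":
        fast = lambda q: ["https://example.org/machine-1"]
        slow = lambda q: ["https://example.org/crowd-1"]
        print(answer("ambiguous query", fast, slow, crowd_budget_s=1.0))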

5.2.2 Managerial overheads.

Managing crowd-searched answers is a complex and sophisticated process. In fact, machine-driven validation and relevance evaluation processes may be subject to some inconsistencies. In this regard, some human-oriented supervisory processes may be needed. Such an approach, in turn, raises the need for repetitive human-mediated evaluation to reach an acceptable assurance level. Further, there are several essential implementation considerations that should be taken into account to make the system feasible and efficient enough.

5.2.3 Crowdsourcing platform.

Due to its features and capabilities, AMT is the first choice of researchers and practitioners for conducting crowdsourcing projects. Nonetheless, its basic facilities may not completely support unusual tasks and procedures. Dealing with such issues, some researchers have proposed solutions (such as additional frameworks and interfaces on top of AMT) to handle the case (Marcus et al., 2011), while others have introduced their own case-specific crowdsourcing systems. As a real-world example, Digle demonstrates an appropriate and working instance. As there is no one size that fits all, there should be a match between the type of tasks and the crowdsourcing platform’s capabilities. Clearly, because Digle provides (or at least aims to provide) near real-time answers, it is not a rational choice (for them) to use Mechanical Turk or similar systems. On the other hand, for background tasks such as relevance assessment and evaluation – as done in Blanco et al. (2011) – adopting standard third-party services is acceptable.

5.2.4 Economical side effects.

Web-based commerce greatly relies on search engine optimization (SEO) techniques and strategies. Although the underlying methods by which search engines rank Web pages are not publicly revealed, over the years SEO experts have become aware of the nuts and bolts of the workflow. Therefore, if crowd-powered search engines gain momentum as a key player in the search engines’ playground, the current (accepted, well-studied and documented) rules will change in ways that are hard to understand. In addition to disorganizing SEO approaches, targeted activities could affect search-based commerce in an adversarial and destructive manner.

6. Literature review

To investigate the current advancements in the field, a brief literature review is conducted in this section. In this regard, first and foremost, crowd-powered search engines are introduced. Then, works that adhere to game-based methods for improving the search process are reviewed. For more information on general issues in the field, the research works conducted earlier (Sushmita et al., 2009; Kazai et al., 2011; Kazai, 2011; Harris and Srinivasan, 2013) are recommended.

6.1 Crowd-powered search engines

As one of the most interesting contributions in the field, Parameswaran et al. (2014) proposed a crowd-powered search toolkit entitled DataSift. The most important feature of this tool is its capability to connect to any data set. A query submitted to DataSift is processed in a dual approach: forwarding it to the crowd and analyzing it by means of a keyword processing subsystem. Finally, the user is provided with a list of ranked results. To improve the quality of the Twitter question asking process, the authors of Jeong et al. (2013) introduced an embedded, crowd-powered search system, MSR Answers. The system provides a novel facility to obtain answers from the crowd instead of relying only on the friends’ circle. It is claimed that the crowd-generated answers are of comparable quality to what friends can provide.

A crowdsourcing-based image search system for mobile phones, CrowdSearch, is proposed in Yan et al. (2010). In this work, the search process is performed automatically; then a real-time crowd-mediated validation process is applied to the generated results.

To fill the remarkable gap between automated search engines and humans’ information seeking behaviors, CrowdSearcher was introduced (Bozzon et al., 2012a). The main contribution of this study is providing purely humanized answers by leveraging humans’ interaction and cognitive intelligence. In another similar study, Bozzon et al. (2012b) proposed a model-driven approach to take advantage of humans’ interaction and opinions for question answering.

6.2 Game-based approaches

Search War, a competitive game for improving Web search, was introduced in Law et al. (2009b). The users, in addition to collecting data, take part in a relevance evaluation process for a specific search query and a Web page.

Ma et al. (2009) proposed three human computation games for improving Web search. The underlying idea of the first one, named Page Hunt, is to show the user a random Web page and ask them to suggest the most relevant search query for it. The suggested query is then checked in a real search engine, and the results are returned to the user for evaluation. This game, incidentally, could be used for search engine optimization purposes. The second game, called Page Race, is a competitive one with the aim of specifying the query (search phrase) that best matches the given Web page. As a collaborative game, the third one, Page Match, is intended to examine humans’ efforts to match similar Web pages based on their selected queries. In this game, players win points when they both agree on a decision, i.e. that the Web pages are the same or different.

The major contribution of Intentions, a human computation game proposed by Law et al. (2009a), is collecting relevant human-generated data for interpreting the intentions behind search queries. Unlike Page Hunt, the gameplay of Intentions is reversed: users are given an intention and are asked to suggest some search queries that best match it.

Borrowing the underlying idea from the ESP game, Picture This, a social collaborative game, was proposed in Bennett et al. (2009) to collect data for image search purposes. In the game, participants are presented with a sequence of queries and several images. They are awarded credit when they agree on an image for a specific query.

As an educational game, Koru (Milne et al., 2008) was developed to trace how users evolve queries, how they can improve their searching skills and what their intentions are behind issued queries.

7. Future works

In addition to the general perspectives discussed in the paper, in this section, several specific suggestions for future works in the field are presented.

7.1 Leveraging collective machine intelligence

The idea of leveraging collective machine intelligence and performance has gained momentum in recent years (Yampolskiy et al., 2012; Halmes, 2013). As an equivalent concept to crowdsourcing in the context of intelligent agents, such an idea can be used to provide Web searchers with more precise and comprehensive answers. Specifically, each search engine follows its own approach to query interpretation, information retrieval and other similar procedures. Therefore, one can expect to obtain (partially) different answers when issuing the same search query to different search engines. In this regard, taking advantage of different search engines and information retrieval systems to provide the user with the most relevant answers can be an interesting and workable strategy.
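
As a minimal illustration of this strategy, the sketch below merges the ranked lists returned by several engines using reciprocal rank fusion, a standard rank-fusion heuristic chosen here only for illustration; the paper does not prescribe a particular fusion method.

    # Merge ranked result lists from several engines with reciprocal rank fusion.
    from collections import defaultdict


    def reciprocal_rank_fusion(rankings, k=60):
        """`rankings` is a list of ranked URL lists, one per engine."""
        scores = defaultdict(float)
        for ranking in rankings:
            for rank, url in enumerate(ranking, start=1):
                scores[url] += 1.0 / (k + rank)       # higher ranks contribute more
        return sorted(scores, key=scores.get, reverse=True)


    if __name__ == "__main__":
        engine_a = ["https://a.example/1", "https://a.example/2", "https://shared.example"]
        engine_b = ["https://shared.example", "https://b.example/1"]
        print(reciprocal_rank_fusion([engine_a, engine_b]))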

7.2 Location-based crowdsourcing

As location and temporal information (features) can affect the search and retrieval processes (Zhang et al., 2017; Ermagun et al., 2017), there is a strong need to incorporate such factors into the related workflows. When it comes to crowd-powered search engines, the key to considering location-related features is to adhere to location-based crowdsourcing. For example, to provide a user with the (possibly) most relevant answers, it would be better to employ crowdworkers who are in the same geographical location from which the initial query was issued.
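
A minimal sketch of this idea follows, assuming each registered crowdworker exposes a coordinate pair and that tasks are routed to workers within a fixed radius of the query's origin; the haversine distance and the 50 km radius are illustrative choices.

    # Route a crowd-search task to workers near the location where the query was issued.
    from math import radians, sin, cos, asin, sqrt


    def haversine_km(lat1, lon1, lat2, lon2):
        lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
        a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
        return 2 * 6371.0 * asin(sqrt(a))             # Earth radius ~6371 km


    def nearby_workers(query_location, workers, max_km=50.0):
        """`workers` is a list of dicts with 'id', 'lat' and 'lon' fields."""
        lat, lon = query_location
        return [w["id"] for w in workers
                if haversine_km(lat, lon, w["lat"], w["lon"]) <= max_km]


    if __name__ == "__main__":
        workers = [{"id": "w1", "lat": 35.70, "lon": 51.40},
                   {"id": "w2", "lat": 48.85, "lon": 2.35}]
        print(nearby_workers((35.75, 51.42), workers))   # -> ['w1']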

7.3 Mining crowdsourced data

In recent years, researchers have paid remarkable attention to discovering knowledge from crowdsourced data (Rahman et al., 2015; Gao et al., 2016). In fact, mining crowdsourced data can be regarded as delving into humans’ intelligence. In the context of search engines, analyzing crowd-selected and crowd-searched keywords is a powerful means to gain insight into common search patterns. Moreover, discovering ranking and evaluation patterns can be used to construct an expert system that automates the process and provides users with precise recommendations.
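
As a toy illustration of mining crowd-searched keywords, the sketch below counts which terms crowd searchers most often add when reformulating an original query; the log format is an assumption made up for the example.

    # Count the terms that crowd searchers most frequently add to an original query.
    from collections import Counter, defaultdict


    def common_reformulations(crowd_log, top_n=3):
        """`crowd_log` is a list of (original_query, crowd_searched_query) pairs."""
        patterns = defaultdict(Counter)
        for original, reformulated in crowd_log:
            added_terms = set(reformulated.lower().split()) - set(original.lower().split())
            patterns[original.lower()].update(added_terms)
        return {q: [t for t, _ in c.most_common(top_n)] for q, c in patterns.items()}


    if __name__ == "__main__":
        log = [("python", "python programming tutorial"),
               ("python", "python programming language"),
               ("python", "ball python care")]
        print(common_reformulations(log))   # e.g. {'python': ['programming', ...]}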

7.4 Rethinking the incentive mechanism

One of the most important drawbacks of crowd-powered search engines is their intrinsic delay. To overcome this shortcoming, it is necessary to recruit a very large body of active participants. For this reason, a working strategy is to keep them active through (social and viral) games (Zeng et al., 2017). Also, establishing competitive environments and mechanisms can increase the rate of participation and the accuracy. Looking back at the best practices for attracting mass human participation, such as Google’s reCAPTCHA (Von Ahn et al., 2008), can also be inspirational.

8. Conclusion

The main contribution of this paper is studying the effects of leveraging humans’ problem solving and information seeking abilities in the context of Web search engines. As current search engines, despite their advantages and capabilities, cannot provide human-level answers in some cases, it seems (and has been partially proven) that incorporating humans into the process can be the silver bullet to overcome the current deficiencies of traditional approaches. In this regard, in addition to providing an overview of the relevant literature, some important perspectives and challenges of the field were studied.

Figures

Figure 1. Relatedness gap between what users want and what search engines provide

Figure 2. Different roles of the human in the crowd-powered search engines

Note

References

Almosalami, A., Jones, A., Tipparach, S., Leier, K. and Peterson, R. (2018), “Beachbot: crowdsourcing garbage collection with amphibious robot network”, Proceedings of the ACM/IEEE International Conference on Human-Robot Interaction, ACM, pp. 333-334.

Alonso, O., Rose, D.E. and Stewart, B. (2008), “Crowdsourcing for relevance evaluation”, ACM SIGIR Forum, Vol. 42 No. 2, pp. 9-15.

Alonso, O., Schenkel, R. and Theobald, M. (2010), “Crowdsourcing assessments for XML ranked retrieval”, Proceedings of European Conference on Information Retrieval, Springer, Berlin, Heidelberg, pp. 602-606.

Bennett, P.N., Chickering, D.M. and Mityagin, A. (2009), “Picture this: preferences for image search”, Proceedings of the ACM SIGKDD Workshop on Human Computation, ACM, pp. 25-26.

Blanco, R., Halpin, H., Herzig, D.M., Mika, P., Pound, J., Thompson, H.S. and Tran Duc, T. (2011), “Repeatable and reliable search system evaluation using crowdsourcing”, Proceedings of the 34th International ACM SIGIR Conference on Research and Development in Information Retrieval, ACM, pp. 923-932.

Bozzon, A., Brambilla, M. and Ceri, S. (2012a), “Answering search queries with crowdsearcher”, Proceedings of the 21st International Conference on World Wide Web, ACM, pp. 1009-1018.

Bozzon, A., Brambilla, M. and Mauri, A. (2012b), “A Model-Driven approach for crowdsourcing search”, Proceedings of CrowdSearch 2012 Workshop at WWW, pp. 31-35.

Brabham, D.D. (2008), “Crowdsourcing as a model for problem solving: an introduction and cases”, Convergence, Vol. 14 No. 1, pp. 75-90.

Breazeal, C., DePalma, N., Orkin, J., Chernova, S. and Jung, M. (2013), “Crowdsourcing human-robot interaction: new methods and system evaluation in a public environment”, Journal of Human-Robot Interaction, Vol. 2 No. 1, pp. 82-111.

Callaghan, C.W. (2016), “A new paradigm of knowledge management: crowdsourcing as emergent research and development”, Southern African Business Review, Vol. 20 No. 1, pp. 1-28.

Chang, J.C., Amershi, S. and Kamar, E. (2017), “Revolt: collaborative crowdsourcing for labeling machine learning datasets”, Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems, ACM, pp. 2334-2346.

Chen, W., Zhao, Z., Wang, X. and Ng, W. (2016), “Crowdsourced query processing on microblogs”, Proceedings of International Conference on Database Systems for Advanced Applications, Springer, Cham, pp. 18-32.

Ciceri, E., Fraternali, P., Martinenghi, D. and Tagliasacchi, M. (2016), “Crowdsourcing for top-k query processing over uncertain data”, Proceedings of IEEE 32nd International Conference on Data Engineering (ICDE), IEEE, pp. 1452-1453.

Daniel, F., Kucherbaev, P., Cappiello, C., Benatallah, B. and Allahbakhsh, M. (2018), “Quality control in crowdsourcing: a survey of quality attributes, assessment techniques, and assurance actions”, ACM Computing Surveys, Vol. 51 No. 1, Article No. 7.

Del Prado, G.M. (2015), “Robots are terrible at these 3 uniquely human skills”, Business Insider, available at: www.businessinsider.com/things-humans-can-do-better-than-machines-2015-10/ (accessed 20 April 2018).

Dellermann, D., Lipusch, N., Ebel, P. and Leimeister, J.M. (2018), “Design principles for a hybrid intelligence decision support system for business model validation”, Electronic Markets, available at: https://doi.org/10.1007/s12525-018-0309-2

Deng, T. and Feng, L. (2011), “A survey on information re-finding techniques”, International Journal of Web Information Systems, Vol. 7 No. 4, pp. 313-332.

Difallah, D.E., Demartini, G. and Cudré-Mauroux, P. (2012), “Mechanical cheat: spamming schemes and adversarial techniques on crowdsourcing platforms”, Proceedings of The First International Workshop on Crowdsourcing Web search (CrowdSearch), pp. 26-30.

Dimitrova, S. and Scarso, E. (2017), “The impact of crowdsourcing on the evolution of knowledge management: insights from a case study”, Knowledge and Process Management, Vol. 24 No. 4, pp. 287-295.

Dounias, G. (2015), “Hybrid computational intelligence”, Encyclopedia of Information Science and Technology, 3rd Edition, IGI Global, pp. 154-162.

Ermagun, A., Fan, Y., Wolfson, J., Adomavicius, G. and Das, K. (2017), “Real-time trip purpose prediction using online location-based search and discovery services”, Transportation Research Part C: Emerging Technologies, Vol. 77, pp. 96-112.

Fan, J., Lu, M., Ooi, B.C., Tan, W.C. and Zhang, M. (2014), “A hybrid machine-crowdsourcing system for matching web tables”, Proceedings of IEEE 30th International Conference on Data Engineering, IEEE, pp. 976-987.

Folds, D.J. (2016), “Human executive control of autonomous systems: a conceptual framework”, Proceedings of IEEE International Symposium on Systems Engineering (ISSE), pp. 1-5.

Franklin, M.J., Kossmann, D., Kraska, T., Ramesh, S. and Xin, R. (2011), “CrowdDB: answering queries with crowdsourcing”, Proceedings of the 2011 ACM SIGMOD International Conference on Management of Data, pp. 61-72.

Gao, J., Li, Q., Zhao, B., Fan, W. and Han, J. (2016), “Mining reliable information from passively and actively crowdsourced data”, Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 2121-2122.

Halmes, M. (2013), “Measurements of collective machine intelligence”, arXiv preprint arXiv:1306.6649.

Hariri, N. (2013), “Do natural language search engines really understand what users want? a comparative study on three natural language search engines and google”, Online Information Review, Vol. 37 No. 2, pp. 287-303.

Harris, C.G. (2011), “Dirty deeds done dirt cheap: a darker side to crowdsourcing”, Proceedings of IEEE Third International Conference on Privacy, Security, Risk and Trust (PASSAT) and IEEE Third International Conference on Social Computing (SocialCom), IEEE, pp. 1314-1317.

Harris, C.G. and Srinivasan, P. (2013), “Comparing crowd-based, game-based, and machine-based approaches in initial query and query refinement tasks”, Proceedings of European Conference on Information Retrieval, Springer, Berlin, Heidelberg, pp. 495-506.

Holzinger, A., Plass, M., Holzinger, K., Crişan, G.C., Pintea, C.M. and Palade, V. (2016), “Toward interactive machine learning (iML): applying ant colony algorithms to solve the traveling salesman problem with the human-in-the-loop approach”, Proceedings of International Conference on Availability, Reliability, and Security, Springer, pp. 81-95.

Howe, J. (2006), “The rise of crowdsourcing”, Wired Magazine, Vol. 14 No. 6, pp. 1-4.

Ikediego, H.O., Ilkan, M., Abubakar, A.M. and Victor Bekun, F. (2018), “Crowd-sourcing (who, why and what)”, International Journal of Crowd Science, Vol. 2 No. 1, pp. 27-41.

Jain, A., Das, D., Gupta, J.K. and Saxena, A. (2015), “Planit: a crowdsourcing approach for learning to plan paths from large scale preference feedback”, Proceedings of IEEE International Conference on Robotics and Automation (ICRA), pp. 877-884.

Jansen, B.J., Booth, D.L. and Spink, A. (2007), “Determining the user intent of web search engine queries”, Proceedings of the 16th International Conference on World Wide Web, ACM, pp. 1149-1150.

Jeong, J.W., Morris, M.R., Teevan, J. and Liebling, D. (2013), “A crowd-powered socially embedded search engine”, Proceedings of Seventh International AAAI Conference on Weblogs and Social Media, ICWSM, AAAI.

Kairam, S. and Heer, J. (2016), “Parting crowds: characterizing divergent interpretations in crowdsourced annotation tasks”, Proceedings of the 19th ACM Conference on Computer-Supported Cooperative Work and Social Computing, pp. 1637-1648.

Kamar, E. (2016), “Directions in hybrid intelligence: complementing Ai systems with human intelligence”, Proceedings of IJCAI, pp. 4070-4073.

Kamar, E., Hacker, S. and Horvitz, E. (2012), “Combining human and machine intelligence in large-scale crowdsourcing”, Proceedings of the 11th International Conference on Autonomous Agents and Multiagent Systems, pp. 467-474.

Kazai, G. (2011), “In search of quality in crowdsourcing for search engine evaluation”, Proceedings of European Conference on Information Retrieval, Springer, Berlin, Heidelberg, pp. 165-176.

Kazai, G., Kamps, J., Koolen, M. and Milic-Frayling, N. (2011), “Crowdsourcing for book search evaluation: impact of hit design on comparative system ranking”, Proceedings of the 34th International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 205-214.

Kim, Y., Collins-Thompson, K. and Teevan, J. (2013), “Crowdsourcing for robustness in web search”, Proceedings of TREC.

Krishna, R., Zhu, Y., Groth, O., Johnson, J., Hata, K., Kravitz, J., Chen, S., Kalantidis, Y., Li, L.J., Shamma, D.A. and Bernstein, M.S. (2017), “Visual genome: connecting language and vision using crowdsourced dense image annotations”, International Journal of Computer Vision, Vol. 123 No. 1, pp. 32-73.

Law, E., Mityagin, A. and Chickering, M. (2009a), “Intentions: a game for classifying search query intent”, Proceedings of CHI’09 Extended Abstracts on Human Factors in Computing Systems, ACM, pp. 3805-3810.

Law, E., von Ahn, L. and Mitchell, T. (2009b), “Search war: a game for improving web search”, Proceedings of the ACM sigkdd workshop on human computation, ACM, p. 31.

Lewandowski, D. (2015), “Evaluating the retrieval effectiveness of web search engines using a representative query sample”, Journal of the Association for Information Science and Technology, Vol. 66 No. 9, pp. 1763-1775.

Liptchinsky, V., Satzger, B., Schulte, S. and Dustdar, S. (2015), “Crowdstore: a crowdsourcing graph database”, Proceedings of International Conference on Collaborative Computing: Networking, Applications and Worksharing, Springer, pp. 72-81.

Ma, H., Chandrasekar, R., Quirk, C. and Gupta, A. (2009), “Improving search engines using human computation games”, Proceedings of the 18th ACM Conference on Information and Knowledge Management, pp. 275-284.

Marcus, A., Wu, E., Karger, D.R., Madden, S. and Miller, R.C. (2011), “Crowdsourced databases: query processing with people”, Proceedings of CIDR.

Milne, D., Nichols, D.M. and Witten, I.H. (2008), “A competitive environment for exploratory query expansion”, Proceedings of the 8th ACM/IEEE-CS Joint Conference on Digital Libraries, ACM, pp. 197-200.

Moradi, M., Ardestani, M.A. and Moradi, M. (2016), “Learning decision making for soccer robots: a crowdsourcing-based approach”, Proceedings of Artificial Intelligence and Robotics (IRANOPEN), IEEE, pp. 25-29.

Ofli, F., Meier, P., Imran, M., Castillo, C., Tuia, D., Rey, N., Briant, J., Millet, P., Reinhard, F., Parkan, M. and Joost, S. (2016), “Combining human computing and machine learning to make sense of big (aerial) data for disaster response”, Big Data, Vol. 4 No. 1, pp. 47-59.

O'Reilly, T. (2007), “What is web 2.0: design patterns and business models for the next generation of software”, International Journal of Digital Economics, Vol. 1, pp. 17-37.

Parameswaran, A., Teh, M.H., Garcia-Molina, H. and Widom, J. (2014), “Datasift: a crowd-powered search toolkit”, Proceedings of International Conference on Management of Data, pp. 885-888.

Parameswaran, A.G., Garcia-Molina, H., Park, H., Polyzotis, N., Ramesh, A. and Widom, J. (2012), “Crowdscreen: algorithms for filtering data with humans”, Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data, pp. 361-372.

Park, S., Cho, K. and Choi, K. (2015), “Information seeking behavior of shopping site users: a log analysis of popshoes, a korean shopping search engine”, Journal of the Korean Society for Information Management, Vol. 32 No. 4, pp. 289-305.

Poirier, P. (2017), “Four human strengths and AI weaknesses”, available at: https://medium.com/eruditeai/four-human-strengths-and-ai-weaknesses-a0fc1d38d538/ (accessed 20 July 2018).

Rahman, S.S., Easton, J.M. and Roberts, C. (2015), “Mining open and crowdsourced data to improve situational awareness for railway”, Proceedings of the 2015 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, pp. 1240-1243.

Ruotsalo, T., Jacucci, G., Myllymäki, P. and Kaski, S. (2015), “Interactive intent modeling: information discovery beyond search”, Communications of the ACM, Vol. 58 No. 1, pp. 86-92.

Sarma, A.D., Parameswaran, A., Garcia-Molina, H. and Halevy, A. (2014), “Crowd-powered find algorithms”, Proceedings of IEEE 30th International Conference on Data Engineering (ICDE), pp. 964-975.

Simpson, E.D., Venanzi, M., Reece, S., Kohli, P., Guiver, J., Roberts, S.J. and Jennings, N.R. (2015), “Language understanding in the wild: combining crowdsourcing and machine learning”, Proceedings of the 24th International Conference on World Wide Web, pp. 992-1002.

Spirin, N. (2014), “Searching for design examples with crowdsourcing”, Proceedings of the 23rd International Conference on World Wide Web, ACM, pp. 381-382.

Sushmita, S., Joho, H., Lalmas, M. and Jose, J.M. (2009), “Understanding domain relevance in web search”, Proceedings of WWW 2 Workshop on Web Search Result Summarization and Presentation, Madrid.

Tayyub, J., Hawasly, M., Hogg, D.C. and Cohn, A.G. (2017), “CLAD: a complex and long activities dataset with rich crowdsourced annotations”, arXiv preprint arXiv:1709.03456.

Teevan, J., Collins-Thompson, K., White, R.W. and Dumais, S. (2014), “Slow search”, Communications of the ACM, Vol. 57 No. 8, pp. 36-38.

Thelwall, M. (2008), “Quantitative comparisons of search engine results”, Journal of the American Society for Information Science and Technology, Vol. 59 No. 11, pp. 1702-1710.

Trushkowsky, B., Kraska, T., Franklin, M.J., Sarkar, P. and Ramachandran, V. (2015), “Crowdsourcing enumeration queries: estimators and interfaces”, IEEE Transactions on Knowledge and Data Engineering, Vol. 27 No. 7, pp. 1796-1809.

Uyar, A. (2009), “Investigation of the accuracy of search engine hit counts”, Journal of Information Science, Vol. 35 No. 4, pp. 469-480.

Von Ahn, L., Maurer, B., McMillen, C., Abraham, D. and Blum, M. (2008), “Recaptcha: human-based character recognition via web security measures”, Science, Vol. 321 No. 5895, pp. 1465-1468.

Wallace, B.C., Noel-Storr, A., Marshall, I.J., Cohen, A.M., Smalheiser, N.R. and Thomas, J. (2017), “Identifying reports of randomized controlled trials (RCTs) via a hybrid machine learning and crowdsourcing approach”, Journal of the American Medical Informatics Association, Vol. 24 No. 6, pp. 1165-1168.

Weyer, J., Fink, R.D. and Adelt, F. (2015), “Human-machine cooperation in smart cars: an empirical investigation of the loss-of-control thesis”, Safety Science, Vol. 72, pp. 199-208.

Whitney, L. (2017), “Are computers already smarter than humans?”, Time Magazine, available at: http://time.com/4960778/computers-smarter-than-humans/ (accessed 23 March 2018).

Yampolskiy, R.V., Ashby, L. and Hassan, L. (2012), “Wisdom of artificial crowds – a metaheuristic algorithm for optimization”, Journal of Intelligent Learning Systems and Applications, Vol. 4 No. 2, pp. 98-107.

Yan, T., Kumar, V. and Ganesan, D. (2010), “Crowdsearch: exploiting crowds for accurate real-time image search on mobile phones”, Proceedings of the 8th International Conference on Mobile Systems, Applications, and Services, ACM, pp. 77-90.

Zadeh, L.A. (2008), “Toward human level machine intelligence-is it achievable? the need for a paradigm shift”, IEEE Computational Intelligence Magazine, Vol. 3 No. 3.

Zahedi, M.S. et al. (2017), “How questions are posed to a search engine? An empirical analysis of question queries in a large scale persian search engine log”, Proceedings of the 3rd International Conference on Web Research (ICWR), IEEE, pp. 84-89.

Zeng, Z., Tang, J. and Wang, T. (2017), “Motivation mechanism of gamification in crowdsourcing projects”, International Journal of Crowd Science, Vol. 1 No. 1, pp. 71-82.

Zhang, J., Wang, S. and Huang, Q. (2017), “Location-based parallel tag completion for geo-tagged social image retrieval”, ACM Transactions on Intelligent Systems and Technology (TIST), Vol. 8 No. 3, Article No. 38.

Zhao, Z., Wei, F., Zhou, M., Chen, W. and Ng, W. (2015), “Crowd-selection query processing in crowdsourcing databases: a task-driven approach”, Proceedings of EDBT, pp. 397-408.

Corresponding author

Mohammad Moradi can be contacted at: moradi.c85@gmail.com
