Crowdsourcing for search engines: perspectives and challenges

Purpose – As a relatively new computing paradigm, crowdsourcing has gained enormous attention in the recent decade. Its compliance with the Web 2.0 principles also puts forward unprecedented opportunities to empower the related services and mechanisms by leveraging humans' intelligence and problem-solving abilities. With respect to the pivotal role of search engines in the Web and information community, this paper aims to investigate the advantages and challenges of incorporating people – as intelligent agents – into search engines' workflow.
Design/methodology/approach – To emphasize the role of the human in computational processes, some specific and related areas are studied. Then, through studying the current trends in the field of crowd-powered search engines and analyzing the actual needs and requirements, the perspectives and challenges are discussed.
Findings – As the research on this topic is still in its infancy, it is believed that this study can be considered as a roadmap for future works in the field. In this regard, current status and development trends are delineated through providing a general overview of the literature. Moreover, several recommendations for extending the applicability and efficiency of the next generation of crowd-powered search engines are presented. In fact, becoming aware of different aspects and challenges of constructing search engines of this kind can shed light on the way of developing working systems with respect to essential considerations.
Originality/value – The present study aimed to portray the big picture of crowd-powered search engines and possible challenges and issues. As one of the early works that provided a comprehensive report on different aspects of the topic, it can be regarded as a reference point.


Introduction
The days of relying only on machines for computing tasks and problem solving are gone. The introduction of crowdsourcing (Howe, 2006) as a mold-breaking computing paradigm has changed the playing field drastically. Despite many years of research and development, machines cannot handle all computational problems completely and independently, especially when it comes to cognitive- and intelligence-intensive tasks (Zadeh, 2008; Del Prado, 2015; Whitney, 2017). Therefore, putting humans in the loop as collaborators, cooperators and even coordinators (rather than just users or supervisors) (Folds, 2016) can be considered a silver bullet for tackling a wide variety of problems in different domains (Kamar et al., 2012; Weyer et al., 2015; Ofli et al., 2016; Holzinger et al., 2016). Narrowing the view to a specific niche, there is interesting common ground between crowdsourcing and the Web 2.0 (O'Reilly, 2007) concepts in their perspectives on humans' roles. According to the Web 2.0 manifesto (O'Reilly, 2007), Web users should evolve from mere consumers into active producers. In this regard, notable efforts such as Wikipedia (an iconic Web 2.0 example of collaborative user participation and a successful best practice of crowdsourcing-based knowledge acquisition) can be inspirational and motivating. Such an example also highlights the invaluable applications of human-centered, intelligence-oriented participation in Web-related workflows. Given the principal role of search engines in the Web and information society, it is worth studying the possible perspectives and challenges of incorporating humans (i.e. Web users) into information retrieval, validation and evaluation processes.
The activities that can affect search engines and some of the related processes are discussed in this paper in the following structure: some background and related works are introduced in Section 2. The rationale behind human-powered search engines is investigated in Section 3. Applications and perspectives of leveraging crowdsourcing for search engines and the related challenges are studied in Sections 4 and 5, respectively. Moreover, a concise literature review is conducted in Section 6, and some suggestions for future works are presented in Section 7.

Related works
Since its introduction, crowdsourcing (Howe, 2006; Brabham, 2008; Ikediego et al., 2018) has provided many unprecedented opportunities to facilitate traditional workflows and processes in a wide variety of (mostly) technology-related domains. In this regard, one can see numerous example scenarios in a broad range of application areas, from robotics (Breazeal et al., 2013; Jain et al., 2015; Moradi et al., 2016; Almosalami et al., 2018) and machine learning (Simpson et al., 2015; Wallace et al., 2017) to knowledge management (Callaghan, 2016; Dimitrova and Scarso, 2017) and many more. Following this idea, information retrieval researchers have explored how people's potential can be leveraged to improve related tasks. More specifically, taking advantage of collective human intelligence for corpus annotation (Krishna et al., 2017; Tayyub et al., 2017), query interpretation (Ciceri et al., 2016; Chen et al., 2016) and other database-related processes (Liptchinsky et al., 2015; Trushkowsky et al., 2015; Zhao et al., 2015) has gained momentum. The following works are worth mentioning. As a notable work in this context, Franklin et al. (2011) proposed CrowdDB. The system leverages human input to process and answer queries for which machines cannot provide appropriate results. Additionally, they introduced CrowdSQL as an extension of SQL to support the underlying idea. To take advantage of Amazon's Mechanical Turk (AMT) for more complicated tasks and processes, namely database-related ones, Marcus et al. (2011) introduced a new query system, Qurk. Designing an algorithm for human-driven filtering of data items based on some attributes is the theme of the research reported in Parameswaran et al. (2012). As another inspirational work, the benefits of crowdsourcing-based relevance assessment for XML retrieval are reported by Alonso et al. (2010).
On the other side, search engines play an integral role in the dissemination of information on the Web. However, despite the remarkable advancements in the field (Deng and Feng, 2011; Hariri, 2013; Lewandowski, 2015), there are still several essential issues with search engines in identifying users' intentions (Jansen et al., 2007; Ruotsalo et al., 2015) and providing them with the most appropriate answers (Thelwall, 2008; Uyar, 2009). Specifically, among the major challenges search engines face, understanding humans (their intentions and exact needs) and providing them with human-level responses are of high importance. Due to the intrinsic weakness of (current) machines in dealing with cognitive and intelligence-intensive tasks, such as interpretation of natural languages, one cannot expect perfect and flawless search engines with clear-cut results. As a result, an optimal, ideal search engine does not seem to be in sight, at least at this time. To fill this gap between what users (searchers) want from search engines and what search engines can bring to users, leveraging humans' intelligence and cognitive/problem-solving abilities can be considered a game changer (Figure 1). Therefore, the major motivations of this study are of two types: (1) investigation of the reasons, benefits and nuts and bolts of typical crowd-powered search engines, i.e. theoretical motivations; and (2) studying best practices, current solutions, practical implications and challenges of relying on crowds' power for evolving traditional search engines, i.e. practical motivations.
In this regard, this paper aimed to provide a reference point to reflect current status of the field and draw a road map for future works.

Why crowdsourcing is needed for search?
Although machines can easily outperform humans in most computational tasks, when it comes to cognitive and intelligence-intensive problems, including natural language processing, they exhibit several critical shortcomings (Poirier, 2017). To deal with such issues, leveraging humans' potential and abilities has opened a new window toward taking advantage of man-machine cooperation. Over the years, many research studies have been conducted to benefit from such a hybrid strategy (Dounias, 2015; Kamar, 2016; Dellermann et al., 2018). Among them, some of the most human-centric application areas, including information search and relevance assessment, greatly depend on human intervention to provide reliable and accurate results. As everyone experiences in daily Web browsing, even highly sophisticated search engines with state-of-the-art algorithms and procedures are not as strong and accurate as users may expect, specifically in interpreting search queries and consequently retrieving relevant results. In other words, demystifying users' intentions behind search terms, finding the most relevant matches and ranking the results so that they best conform to users' goals and preferences cannot be achieved by relying only on machines' intelligence and capabilities. In this regard, so-called crowd (human)-powered search engines have gained momentum. The main rationale behind such search systems is involving individuals to leverage their cognitive intelligence (and searching expertise) for the sake of providing users (i.e. initial searchers) with what they could not find by themselves. As a real-world example, Digle[1], a people-powered search engine, crowdsources search queries to its large body of participants (search workers/searchers). To present more accurate and relevant results, users are asked to provide some additional information on their own requests, including the related category, etc. Although question-and-answer sites and forums have provided similar facilities for years, crowdsourcing-based search engines aim to generalize the concept and present their users with specific, to-the-point, relevant and humanized information.

How does crowdsourcing help Web search?
Humans' power, depending on the context and application, can be leveraged in many different ways, from providing training data for the machine (Kairam and Heer, 2016; Chang et al., 2017) to collaborating with an algorithm to provide more precise outputs (Fan et al., 2014; Sarma et al., 2014), e.g. in the form of a quality controller or supervisor. When it comes to Web search, crowdsourcing is mainly related to improving underlying processes or providing users (searchers) with some assistance in finding more relevant answers. In addition to analyzing logs and query submission patterns (Park et al., 2015; Zahedi et al., 2017) to find out users' requirements (indirect crowdsourcing or crowd analysis), game-based methods (Law et al., 2009a; Law et al., 2009b; Bennett et al., 2009; Ma et al., 2009), as a tacit approach for facilitating the search process, are at the center of attention. From a general point of view, humans' role in the search process can be categorized into four major classes as follows.

Crowd-searching
In this category, it is supposed that the user could not find what (s)he is looking for. This may be because of a lack of adequate searching abilities, having no knowledge of the target topic and so on. In such a case, the aim is to crowdsource the problem (i.e. the keywords to be searched) and return the most relevant crowd-searched results to the user. To obtain more accurate answers, users should be asked to provide as much additional information as possible on what they want to find. Digle, DataSift and CrowdSearcher (Bozzon et al., 2012a) are major solutions that provide users with crowd-searched findings on desired topics. Following this idea, people's power has been leveraged in previous inspirational studies (Jeong et al., 2013; Spirin, 2014) for answering Twitter questions and finding design examples, respectively. The results obtained in this approach may be used for further managerial processes, such as query interpretation.

Crowd-clarifying
One of the main issues search engines and information retrieval systems deal with is demystifying and clarifying submitted queries. As this problem is mainly related to natural language processing (a hard AI problem), machines have difficulty handling it.
Unfamiliarity with the search (target) language, long and ambiguous search terms, typos and semantic errors are among the reasons that call for crowdsourcing-based clarification of search queries. In fact, human intelligence is the best means to uncover the human intention behind a specific query. Unlike crowd-searching (Bozzon et al., 2012b), this approach is not necessarily online or (semi-)real-time. The human workforce, for this purpose, is used to interpret the query, break it down into several essential meaningful parts, suggest additional choices for expanding search terms and find similar terms (Kim et al., 2013) and more appropriate alternatives to replace the input search term(s). These steps improve the query-result matching and retrieval processes.

Crowd-sifting
Conceptually similar to crowd-searching, crowd-sifting is an umbrella term for a set of activities devoted to the preparation of intermediate information. Data labeling and classification are important tasks in this category. In doing so, the information that could be matched with the respective queries is filtered and organized to achieve higher performance (Milne et al., 2008). From another point of view, the information retrieved through the automatic searching process should be validated by people in order to be calibrated and normalized (Yan et al., 2010). Such supervisory tasks are considered a pre-processing step for answer generation. Because it requires recruiting a relatively large body of participants and performing precise computation and supervisory routines, this approach is costly and time-consuming.
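The crowd validation of automatically retrieved items described above can be illustrated with a minimal sketch. It is not the method of any cited system: the function name, the simple majority rule and the `min_votes` and `threshold` parameters are assumptions chosen only for illustration.

```python
def validate_by_majority(candidates, votes, min_votes=3, threshold=0.5):
    """Keep automatically retrieved items that most crowdworkers judged valid.

    candidates: list of item ids produced by the automatic search.
    votes: dict mapping item id -> list of booleans (one per worker).
    min_votes and threshold are hypothetical tuning parameters.
    """
    kept = []
    for item in candidates:
        item_votes = votes.get(item, [])
        if len(item_votes) < min_votes:
            continue  # too few judgments so far; hold the item back
        share_yes = sum(item_votes) / len(item_votes)
        if share_yes > threshold:
            kept.append(item)
    return kept


# Hypothetical data: three retrieved items with worker judgments.
candidates = ["img1", "img2", "img3"]
votes = {
    "img1": [True, True, False],
    "img2": [False, False, True],
    "img3": [True],
}
validated = validate_by_majority(candidates, votes)
```

Requiring a minimum number of votes per item is one reason the approach is costly: every item must be shown to several workers before it can be released.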

Crowd-rating
Regardless of how answers are produced, there are two critical post-processing steps. The first is evaluating the relevance of candidate answer sets to the submitted query, which is a determining process for the final answer generation (Alonso et al., 2008; Lewandowski, 2015). This easily crowdsourceable process can also be performed tacitly by analyzing users' feedback and measuring their satisfaction. Second, ranking the results (Kim et al., 2013) plays an important role in helping to find the most relevant items. These processes are inter-related and dependent, and through them the ultimate results are populated and organized in a user-friendly manner. According to the aforementioned procedures, the human as the crowdworker can play two major roles (Figure 2):
(1) Search assistant: In this role, referred to as crowd-searcher, the human's participation is leveraged to directly assist Web searchers. Therefore, they are not involved in the background supervisory processes, and only their searching skills and abilities are used.
(2) System collaborator: The last three categories delineated previously in this section take advantage of humans' intelligence and knowledge for query analysis, relevance assessment and similar supervisory workflows. Therefore, the people in such contexts serve as collaborators and/or experts who take part in the decision-making process.
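The two crowd-rating steps of this section (relevance evaluation followed by ranking) can be sketched as follows. This is only an illustration under stated assumptions: the 0 to 3 graded relevance scale, the use of the mean as the aggregate and the example URLs are hypothetical and not taken from the cited works.

```python
from statistics import mean


def rank_by_crowd_rating(ratings):
    """Order candidate answers by their mean crowd relevance score.

    ratings: dict mapping a result URL -> list of graded relevance
    scores given by crowdworkers (assumed scale: 0 = irrelevant, 3 = ideal).
    Ties are broken alphabetically so the output is deterministic.
    """
    scored = {url: mean(scores) for url, scores in ratings.items() if scores}
    return sorted(scored, key=lambda url: (-scored[url], url))


# Hypothetical judgments from three workers per result.
ratings = {
    "a.example": [3, 2, 3],
    "b.example": [1, 0, 1],
    "c.example": [2, 2, 3],
}
ranking = rank_by_crowd_rating(ratings)
```

The mean is the simplest possible aggregate; a median or a reliability-weighted score would be a reasonable alternative when some workers are noisy.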

Challenges
Despite several remarkable benefits of crowdsourcing for facilitating the Web search process at different levels, relying on humans' power raises some essential challenges. Underestimating these issues and their consequences can greatly affect the efficiency of the process. These challenges fall into two broad classes: human-related and technical ones.

Human-related challenges
No one can improve the Web search process better than (expert) users, and on the other side, no one else can undermine it as much as they can through their behavior and operations. Given this fact, there are some influential factors that should be considered.
5.1.1 Motivation and incentives. Although crowdsourcing, in some cases, rests on the shoulders of volunteers, relying on volunteers is not an effective approach for critical and serious applications that should be performed in near real-time. In this regard, recruiting active and responsible (and possibly expert) participants is a must, and it imposes remarkable costs.
5.1.2 Challenging tasks. As mentioned earlier, a common type of task in the context of Web search is the interpretation of long, ambiguous and complicated search terms. Due to some intrinsic issues in such cases, e.g. obscure submissions by users in a language other than their own, crowdworkers may be unwilling to demystify those inputs. In other words, highly prolonged and erroneous inputs, which are prevalent in search engines, may affect the crowdsourcing process. To cope with such issues, applying a preliminary machine interpretation or increasing the payment for complicated submissions (tasks) are workable solutions.
5.1.3 Integrity and scalability. Machine interpretation of search terms and finding relevant answers depend only on the underlying algorithms. Replacing it with a human-driven strategy introduces many additional variables. For this reason, one cannot expect similar answers for similar search terms without some further supervisory (integration) steps. On the other side, adversarial intentions can affect the answer-seeking process (Harris, 2011; Difallah et al., 2012). Therefore, some quality control processes are needed (Daniel et al., 2018); otherwise, the reliability of human-powered information searching may be questioned. Further, scalability is another challenging issue in this context, specifically when it comes to dealing with a large number of users. In such cases, recruiting and managing many active crowd-searchers to guarantee the efficiency of the system will impose remarkable costs and technical considerations.
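One common family of quality control processes checks workers against tasks with known answers and then weights their votes accordingly. The sketch below illustrates that general idea only; it is not the procedure of any cited work, and the function names, the gold-task setup and the `min_accuracy` cut-off are assumptions made for the example.

```python
from collections import defaultdict


def worker_accuracy(answers, gold):
    """Estimate each worker's accuracy on tasks whose true label is known.

    answers: dict worker -> dict task -> label.
    gold: dict task -> true label (hypothetical control tasks).
    """
    accuracy = {}
    for worker, labels in answers.items():
        checked = [task for task in labels if task in gold]
        if checked:
            hits = sum(labels[task] == gold[task] for task in checked)
            accuracy[worker] = hits / len(checked)
    return accuracy


def weighted_vote(answers, accuracy, task, min_accuracy=0.6):
    """Pick the label with the largest accuracy-weighted support.

    Workers below min_accuracy (an assumed cut-off) are ignored, which
    limits the influence of careless or adversarial participants.
    """
    totals = defaultdict(float)
    for worker, labels in answers.items():
        weight = accuracy.get(worker, 0.0)
        if task in labels and weight >= min_accuracy:
            totals[labels[task]] += weight
    return max(totals, key=totals.get) if totals else None


# Hypothetical data: g1 and g2 are control tasks, q1 is a real task.
gold = {"g1": "relevant", "g2": "irrelevant"}
answers = {
    "w1": {"g1": "relevant", "g2": "irrelevant", "q1": "relevant"},
    "w2": {"g1": "relevant", "g2": "relevant", "q1": "irrelevant"},
    "w3": {"g1": "irrelevant", "g2": "relevant", "q1": "irrelevant"},
}
accuracy = worker_accuracy(answers, gold)
decision = weighted_vote(answers, accuracy, "q1")
```

In this toy data a plain majority would choose "irrelevant" for `q1`, while the weighted vote follows the only worker who passed the control tasks.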

Technical challenges
In addition to the usual technical complexities of search engines, managing crowdsourcing-related issues requires some further considerations, including the following.
5.2.1 Response time. An essential advantage (and performance measure) of search engines is reducing the retrieval time. Currently, most search engines retrieve the initial answers in less than a few seconds. This feature is one of the most important advantages of traditional search engines over human-powered ones. Assigning search tasks to the crowd, finding relevant results by the people, validation, integration and retrieval of the most relevant answers are time-consuming processes that not only rule out near real-time performance but also impose a remarkable and annoying delay. Although it has been reported that in some cases users prefer a slow search process in order to acquire more accurate results (Teevan et al., 2014), this is not the case for general purposes.
5.2.2 Managerial overheads. Managing crowd-searched answers is a complex and sophisticated process. In fact, machine-driven validation and relevance evaluation processes may be subject to some inconsistencies. In this regard, some human-oriented supervisory processes may be needed. Such an approach, in turn, can create the need for repetitive human-mediated evaluation to reach an acceptable assurance level. Further, there are several essential implementation considerations that should be taken into account to make the system feasible and efficient enough.
5.2.3 Crowdsourcing platform. Due to its features and capabilities, AMT is the first choice of researchers and practitioners for conducting crowdsourcing projects. Nonetheless, its basic facilities may not completely support unusual tasks and procedures. To deal with such issues, some researchers proposed solutions (such as additional frameworks and interfaces on top of AMT) to handle the case (Marcus et al., 2011), while others introduced their own case-specific crowdsourcing systems. As a real-world example, Digle is a working instance. As there is no one-size-fits-all platform, there should be a match between the type of tasks and the crowdsourcing platform's capabilities. Clearly, because Digle provides (or at least aimed to provide) near real-time answers, it is not a rational choice (for them) to use Mechanical Turk or similar systems. On the other side, for background tasks such as relevance assessment and evaluation, as done in Blanco et al. (2011), adopting standard third-party services is acceptable.
5.2.4 Economic side effects. Web-based commerce greatly relies on search engine optimization (SEO) techniques and strategies. Although the underlying methods by which search engines rank Web pages are not publicly revealed, over the years SEO experts have become aware of the nuts and bolts of the workflow. Therefore, if crowd-powered search engines gain momentum as a key player in the search engine landscape, the current (accepted, well-studied and documented) rules will change in ways that are hard to understand. In addition to disrupting SEO approaches, targeted activities can affect search-based commerce in an adversarial and destructive manner.

Literature review
To investigate the current advancements in the field, a brief literature review is conducted in this section. First, crowd-powered search engines are introduced. Then, the works that follow game-based methods for improving the search process are reviewed. For more information on general issues in the field, the research works conducted earlier (Sushmita et al., 2009; Kazai et al., 2011; Kazai, 2011; Harris and Srinivasan, 2013) are recommended.

Crowd-powered search engines
As one of the most interesting contributions in the field, Parameswaran et al. (2014) proposed a crowd-powered search toolkit, entitled DataSift. The most important feature of this tool is its capability to connect to any data set. A query submitted to DataSift is processed in a dual approach: it is forwarded to the crowd and analyzed by means of a keyword processing subsystem. Finally, the user is provided with a list of ranked results. To improve the quality of the Twitter question-asking process, Jeong et al. (2013) introduced an embedded, crowd-powered search system, MSR Answers. The system provides a novel facility to obtain answers from the crowd instead of relying only on the circle of friends. It is claimed that the crowd-generated answers are of similar quality to those that friends can provide.
A crowdsourcing-based image search system for mobile phones, CrowdSearch, is proposed in Yan et al. (2010). In this work, the search process will be performed automatically; then a real-time crowd-mediated validation process will be applied on the generated results.
To fill the remarkable gap between automated search engines and humans' information-seeking behaviors, CrowdSearcher is introduced (Bozzon et al., 2012a). The main contribution of this study is to provide pure humanized answers by leveraging humans' interaction and cognitive intelligence. In another similar study, Bozzon et al. (2012b) proposed a model-driven approach to take advantage of humans' interaction and opinions for question answering.

Game-based approaches
Search War, a competitive game for improving Web search, was introduced in Law et al. (2009b). The users, in addition to collecting data, take part in a relevance evaluation process for a specific search query and a Web page. Ma et al. (2009) proposed three human computation games for improving Web search. The underlying idea of the first one, named Page Hunt, is to show the user a random Web page and ask him or her to suggest the most relevant search query for it. The suggested query is checked in a real search engine, and the results are returned to the user for evaluation. This game could also be used for search engine optimization purposes. The second game, called Page Race, is a competitive one with the aim of specifying the query (search phrase) that best matches the given Web page. As a collaborative game, the third one, Page Match, is intended to examine humans' efforts to match similar Web pages based on their selected queries. In this game, players win points when both agree on a decision, i.e. that the Web pages are the same or different.
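The agreement rule of Page Match described above (both players score only when their decisions coincide) can be sketched in a few lines. The point value, the round format and the function name are hypothetical; the original game's actual scoring details are not given in this review.

```python
def play_rounds(rounds, points=10):
    """Score a two-player agreement game and collect the agreed decisions.

    rounds: list of (pair_id, decision_a, decision_b) tuples, where each
    decision is "same" or "different" for a pair of Web pages.
    points is an assumed reward per agreed round.
    Returns the shared score and the list of agreed (pair_id, decision).
    """
    score = 0
    agreed = []
    for pair_id, decision_a, decision_b in rounds:
        if decision_a == decision_b:
            score += points
            agreed.append((pair_id, decision_a))  # usable as crowd data
    return score, agreed


# Hypothetical session with three page pairs.
rounds = [
    ("p1", "same", "same"),
    ("p2", "same", "different"),
    ("p3", "different", "different"),
]
score, agreed = play_rounds(rounds)
```

Only the agreed decisions are kept as data, since agreement between independent players is what makes a judgment trustworthy in this kind of game.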
The major contribution of Intentions, a human computation game proposed by Law et al. (2009a), is to collect relevant human-generated data for interpreting the intentions behind search queries. Unlike Page Hunt, the gameplay of Intentions runs in reverse: users are given an intention and asked to suggest search queries that best match it.
Borrowing the underlying idea from the ESP game, Picture This was proposed in Bennett et al. (2009) as a social collaborative game to collect data for image search purposes. In the game, participants are presented with a sequence of queries and several images. They are awarded credit when they agree on an image for a specific query.
As an educational game, Koru (Milne et al., 2008) was developed to trace how users evolve queries, how they can improve their searching skills and what their intentions are for the issued queries.

Future works
In addition to the general perspectives discussed in the paper, in this section, several specific suggestions for future works in the field are presented.

Leveraging collective machine intelligence
The idea of leveraging collective machine intelligence and performance has gained momentum in recent years (Yampolskiy et al., 2012; Halmes, 2013). As an equivalent concept for crowdsourcing in the context of intelligent agents, such an idea can be used to provide Web searchers with more precise and comprehensive answers. Specifically, every search engine follows its own approach to query interpretation, information retrieval and other similar procedures. Therefore, one can expect (partially) different answers when issuing the same search query in different search engines. In this regard, taking advantage of different search engines and information retrieval systems to provide the user with the most relevant answers can be an interesting and workable strategy.
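One way to combine the answers of several search engines is rank fusion. The sketch below uses reciprocal rank fusion, a well-known metasearch technique that is named here as an illustration and is not proposed in this paper; the engine result lists and the URLs are hypothetical, and `k=60` is a conventional smoothing constant.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked result lists into one.

    rankings: list of ranked lists of document ids, one per search engine.
    Each document receives 1 / (k + rank) from every list it appears in,
    so items ranked highly by many engines rise to the top.
    Ties are broken alphabetically so the output is deterministic.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda doc: (-scores[doc], doc))


# Hypothetical top-3 results from three engines for the same query.
engine_a = ["u1", "u2", "u3"]
engine_b = ["u2", "u1", "u4"]
engine_c = ["u2", "u3", "u1"]
fused = reciprocal_rank_fusion([engine_a, engine_b, engine_c])
```

Because the method needs only rank positions, it works even when the engines expose no comparable relevance scores.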

Location-based crowdsourcing
As location and temporal information (features) can affect the search and retrieval processes (Zhang et al., 2017; Ermagun et al., 2017), there is a strong need to incorporate such factors in the related workflows. When it comes to crowd-powered search engines, the key to considering location-related features is to adopt location-based crowdsourcing. For example, to provide a user with (possibly) the most relevant answers, it would be better to employ crowdworkers who are in the same geographical location from which the initial query was issued.
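Selecting crowdworkers near the location of a query can be sketched with a great-circle distance filter. The haversine formula is standard; everything else (the worker registry, the 50 km radius and the function names) is an assumption made for illustration.

```python
from math import asin, cos, radians, sin, sqrt


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two (lat, lon) points."""
    earth_radius_km = 6371.0
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = (sin(dlat / 2) ** 2
         + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2)
    return 2 * earth_radius_km * asin(sqrt(a))


def nearby_workers(workers, query_lat, query_lon, radius_km=50.0):
    """Return ids of workers within radius_km of the query, nearest first.

    workers: dict worker id -> (lat, lon); radius_km is an assumed cut-off.
    """
    distances = [
        (haversine_km(query_lat, query_lon, lat, lon), worker_id)
        for worker_id, (lat, lon) in workers.items()
    ]
    return [wid for dist, wid in sorted(distances) if dist <= radius_km]


# Hypothetical worker registry; the query is issued from central Paris.
workers = {
    "w_paris": (48.8566, 2.3522),
    "w_versailles": (48.8049, 2.1204),
    "w_london": (51.5074, -0.1278),
}
selected = nearby_workers(workers, 48.8566, 2.3522)
```

A fixed radius is the simplest policy; a real deployment would probably widen it when too few active workers are in range.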

Mining crowdsourced data
In recent years, researchers have paid considerable attention to discovering knowledge from crowdsourced data (Rahman et al., 2015; Gao et al., 2016). In fact, mining crowdsourced data can be regarded as delving into humans' intelligence. In the context of search engines, analyzing crowd-selected and crowd-searched keywords is a powerful means of gaining insight into common search patterns. Moreover, discovering the ranking and evaluation patterns can be used for constructing an expert system to automate the process and provide users with precise recommendations.
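As a very small illustration of mining crowd-searched keywords, the sketch below counts which pairs of terms frequently appear together in crowd-refined queries. It is only a starting point under stated assumptions: the whitespace tokenization, the `min_count` threshold and the sample queries are hypothetical.

```python
from collections import Counter
from itertools import combinations


def frequent_term_pairs(queries, min_count=2):
    """Find term pairs that co-occur in at least min_count queries.

    queries: list of query strings as rewritten or chosen by crowdworkers.
    Terms are lowercased and split on whitespace (a simplifying assumption).
    """
    pairs = Counter()
    for query in queries:
        terms = sorted(set(query.lower().split()))
        pairs.update(combinations(terms, 2))  # each pair once per query
    return {pair: n for pair, n in pairs.items() if n >= min_count}


# Hypothetical crowd-refined queries.
queries = [
    "cheap flights rome",
    "rome flights june",
    "cheap hotels rome",
]
patterns = frequent_term_pairs(queries)
```

Pairs that recur across many crowd-refined queries are candidates for automatic query expansion rules.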

Rethinking the incentive mechanism
One of the most important drawbacks of crowd-powered search engines is the intrinsic delay. To overcome this shortcoming, a very large body of active participants must be recruited. For this reason, a workable strategy is to keep them active through (social and viral) games (Zeng et al., 2017). Also, establishing competitive environments and mechanisms can increase the rate of participation and accuracy. Looking back at the best practices for attracting mass human participation, such as Google's reCAPTCHA (Von Ahn et al., 2008), can also be inspirational.

Conclusion
The main contribution of this paper was studying the effects of leveraging humans' problem-solving and information-seeking abilities in the context of Web search engines. As current search engines, despite their advantages and capabilities, cannot provide human-level answers in some cases, it seems (and has been partially shown) that incorporating humans in the process can be the silver bullet to overcome the current deficiencies of traditional approaches. In this regard, in addition to providing an overview of the respective literature, some important perspectives and challenges of the field were studied.