The purpose of this paper is to evaluate the question-answering (QA) quality of Google search.
Given the large variety and complexity of Google answer boxes in search result pages, existing evaluation criteria for both search engines and QA systems proved unsuitable. This study therefore developed a criteria system for evaluating Google QA quality by coding and analyzing search results for questions drawn from a representative question set. Using the newly developed criteria system, the study then evaluated Google's overall QA quality as well as QA quality across four target types and six question types. ANOVA and Tukey tests were used to compare QA quality among the different target types and question types.
It was found that Google provided significantly higher-quality answers to person-related questions than to thing-related, event-related and organization-related questions. Google also provided significantly higher-quality answers to where-questions than to who-, what- and how-questions. The more specific a question was, the higher the QA quality tended to be.
Suggestions are presented for both search engine users and designers to help enhance user experience and QA quality.
The newly developed evaluation criteria system, particularly suited to analyzing search engine QA quality, expands and enriches the assessment metrics available for both search engines and QA systems.
This work is supported by the National Natural Science Foundation of China under Grant Nos 71420107026, 71874130 and 71403190. It is partly supported by the Ministry of Education of the People's Republic of China under Grant No. 18YJC870026, the China Association for Science and Technology, the Fundamental Research Funds for the Central Universities and the Research Fund for Academic Team of Young Scholars at Wuhan University (Whu2016013).
Copyright © 2018, Emerald Publishing Limited