- Pradeep Kumar Roy1,
- Zishan Ahmad1,
- Jyoti Prakash Singh1,
- Mohammad Abdallah Ali Alryalat2,
- Nripendra P. Rana3 &
- Yogesh K. Dwivedi3
Abstract
Community Question Answering (CQA) sites have become a very popular place to ask questions and give answers to a large community of users on the Internet. Stack Exchange is one of the popular CQA sites, where a large amount of content is posted every day in the form of questions, answers and comments. Answers on Stack Exchange are listed by recent activity, time of posting, or votes received from peer users, under three tabs called active, oldest and votes, respectively. The votes tab is the default setting on the site and is also the tab users prefer, because answers under it have been voted as good answers by other users. The problem with vote-based sorting is that new answers that have yet to receive any votes are placed at the bottom of the votes tab. A new answer may be of sufficiently high quality to be placed at the top, but having few or no votes because it was posted later keeps it at the bottom. We introduce a new tab called the promising answers tab, where answers are listed by their usefulness, which our proposed system calculates using classification and regression models. Several textual features of answers and the reputation of users are used as features to predict the usefulness of the answers. The results are validated with good values of precision, recall, F1-score, area under the receiver operating characteristic curve (AUC) and root mean squared error. We also compare the top ten answers predicted by our system to the actual top ten answers based on votes and find that they are in high agreement.
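The difference between vote-based ordering and usefulness-based ordering, and the top-k comparison mentioned in the abstract, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `Answer` fields, the `usefulness` function with its hand-picked weights, and the sample data are hypothetical and stand in for the trained classification and regression models, which the abstract does not specify in detail.

```python
from dataclasses import dataclass


@dataclass
class Answer:
    answer_id: int
    votes: int              # votes received so far
    word_count: int         # hypothetical textual feature
    author_reputation: int  # hypothetical user feature


def usefulness(answer: Answer) -> float:
    """Toy stand-in for a trained model; the weights and caps are made up."""
    length_score = min(answer.word_count, 300) / 300
    reputation_score = min(answer.author_reputation, 10_000) / 10_000
    return 0.5 * length_score + 0.5 * reputation_score


def rank_by_votes(answers):
    # Vote-based ordering: unvoted answers sink to the bottom.
    return sorted(answers, key=lambda a: a.votes, reverse=True)


def rank_by_usefulness(answers):
    # Usefulness-based ordering: votes are not consulted at all.
    return sorted(answers, key=usefulness, reverse=True)


def top_k_agreement(ranking_a, ranking_b, k):
    """Fraction of the top-k answers shared by two rankings."""
    top_a = {a.answer_id for a in ranking_a[:k]}
    top_b = {a.answer_id for a in ranking_b[:k]}
    return len(top_a & top_b) / k


# Invented sample data; answer 3 is new and has no votes yet.
answers = [
    Answer(1, votes=40, word_count=250, author_reputation=8000),
    Answer(2, votes=12, word_count=120, author_reputation=1500),
    Answer(3, votes=0, word_count=280, author_reputation=9500),
    Answer(4, votes=3, word_count=30, author_reputation=200),
]

by_votes = rank_by_votes(answers)
by_usefulness = rank_by_usefulness(answers)
agreement = top_k_agreement(by_votes, by_usefulness, k=2)
```

In this invented data, the unvoted answer 3 is last under vote-based sorting but first under the toy usefulness score, which is the situation a promising answers tab is meant to address. Setting `k=10` on real data would correspond to the top-ten comparison described above.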
Notes
Activeness of a post is defined by the number of times it has been modified.
Author information
Authors and Affiliations
National Institute of Technology Patna, Ashok Rajpath, Patna, 800005, Bihar, India
Pradeep Kumar Roy, Zishan Ahmad & Jyoti Prakash Singh
Al-Balqa’ Applied University, Salt, Jordan
Mohammad Abdallah Ali Alryalat
Emerging Markets Research Centre (EMaRC), School of Management, Swansea University Bay Campus, Fabian Way, Swansea, SA1 8EN, UK
Nripendra P. Rana & Yogesh K. Dwivedi
Corresponding author
Correspondence to Yogesh K. Dwivedi.
About this article
Cite this article
Roy, P.K., Ahmad, Z., Singh, J.P. et al. Finding and Ranking High-Quality Answers in Community Question Answering Sites. Glob J Flex Syst Manag 19, 53–68 (2018). https://doi.org/10.1007/s40171-017-0172-6