Transactions on Data Analysis in Social Science

Quantitative Data Analysis of TripAdvisor Reviews for Hotels in Tehran

Document Type: Original Article

Authors
1 M.Sc. Student, Department of Industrial Engineering, Sharif University of Technology, Tehran, Iran
2 Assistant Professor, Department of Industrial Engineering, Sharif University of Technology, Tehran, Iran
3 Ph.D. in Industrial Engineering, Department of Industrial Engineering, Amirkabir University of Technology, Tehran, Iran
Abstract
This study investigates hotel rating prediction using machine learning techniques, focusing on hotels in Tehran, the capital and largest city of Iran. Data were collected from TripAdvisor.com, the world’s largest online travel and tourism platform. A total of 64 Tehran-based hotels with official TripAdvisor pages were identified, yielding 4,736 user reviews compiled from the earliest available entries. The primary aim of this research is to predict the ratings that new users may assign to Tehran’s hotels based on both user profile characteristics and hotel attributes. To achieve this objective, eight supervised machine learning models were implemented using the R programming language: K-Nearest Neighbors (KNN), Naïve Bayes Classifier, Decision Tree, Logistic Regression, Artificial Neural Network (ANN), Support Vector Machine (SVM), Random Forest (RF), and Gradient Boosting Machine (GBM). The models were comparatively analyzed to evaluate their predictive performance and to identify the most accurate approach. The findings of this study contribute to the growing field of predictive analytics in online review systems, demonstrating the potential of data-driven methods to enhance customer satisfaction, support managerial decision-making, and improve service quality within the hospitality and tourism industry.
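The comparative evaluation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the study used R and real TripAdvisor data, whereas this sketch uses standard-library Python and synthetic rows. The two features (hotel star class and a scaled reviewer activity score), the function names, and the majority-class baseline are all assumptions introduced for illustration; only KNN is taken from the list of models in the abstract.

```python
import random
from collections import Counter


def knn_predict(train, x, k=3):
    """Predict a rating by majority vote among the k nearest training rows."""
    nearest = sorted(
        train,
        key=lambda row: sum((a - b) ** 2 for a, b in zip(row[0], x)),
    )[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def majority_predict(train, x):
    """Baseline (not one of the study's models): always predict the most common rating."""
    return Counter(label for _, label in train).most_common(1)[0][0]


def accuracy(predict, train, test):
    """Share of held-out rows whose predicted rating equals the true rating."""
    hits = sum(predict(train, x) == y for x, y in test)
    return hits / len(test)


def make_toy_reviews(n, seed=0):
    """Generate synthetic (features, rating) rows; NOT the TripAdvisor data."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        stars = rng.randint(1, 5)        # hypothetical hotel attribute
        activity = rng.random()          # hypothetical user profile attribute in [0, 1)
        rating = min(5, max(1, stars + rng.choice([-1, 0, 0, 1])))
        rows.append(((stars, activity), rating))
    return rows


data = make_toy_reviews(300)
train, test = data[:240], data[240:]   # simple hold-out split

results = {
    "knn": accuracy(lambda tr, x: knn_predict(tr, x, k=5), train, test),
    "majority_baseline": accuracy(majority_predict, train, test),
}
best = max(results, key=results.get)
print(results, "best:", best)
```

Each model is scored on the same held-out rows, and the highest accuracy selects the preferred approach. The same loop would extend to the other models named in the abstract by adding further entries to `results`.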
Keywords

 
Volume 1, Issue 4
Autumn 2019
Pages 171-185

  • Received 11 June 2019
  • Revised 06 August 2019
  • Accepted 09 November 2019