
Multi-Dimensional Keyword Search and Analysis of Hotel Review Data Using Multi-Dimensional Text Cubes  

Kim, Namsoo (Dept. of Computer Science, Kangwon National University)
Lee, Suan (Dept. of Computer Science, Kangwon National University)
Jo, Sunhwa (Dept. of Computer Science, Kangwon National University)
Kim, Jinho (Dept. of Computer Science, Kangwon National University)
Abstract
With the growth of the WWW, unstructured data such as text is attracting more and more user interest. Because this data is created by WWW users and expresses their subjective opinions, appropriate analysis can yield very useful information such as users' personal tastes or perspectives. In this paper, we provide efficient and varied analysis of unstructured text documents by taking advantage of OLAP (On-Line Analytical Processing) multidimensional cube technology. OLAP cubes have been widely used for multidimensional analysis of structured data such as simple alphabetic and numeric data, but they have not been used for unstructured data consisting of long texts. To provide multidimensional analysis of unstructured text data, the Text Cube model has recently been proposed. It incorporates term frequency and inverted index, which play key roles in information retrieval, as measures for searching and analyzing text databases. The primary goal of this paper is to apply this text cube model to a real data set from an Internet site that shares hotel information and to provide multidimensional analysis of users' hotel reviews written as text. To achieve this goal, we first build text cubes for the hotel review data. Using these text cubes, we design and implement a system that provides multidimensional keyword search features for searching and analyzing review texts along various dimensions. This system can help users easily obtain valuable summary information reflecting guests' subjective opinions. Furthermore, this paper evaluates the proposed system through various experiments, which demonstrate its effectiveness.
Keywords
Multi-dimensional Text Databases; Text Cubes; On-Line Analytical Processing (OLAP); Users' review analysis; Keyword search
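The abstract describes cube cells that carry term frequency and an inverted index as measures. A minimal sketch of that idea follows, under stated assumptions: the two dimensions (city and star class), the sample reviews, and the names `build_text_cube` and `keyword_search` are hypothetical illustrations, not the paper's schema, data, or implementation. The sketch also materializes every cell, which is a simplification.

```python
from collections import Counter, defaultdict
from itertools import product

# Hypothetical toy reviews: (doc_id, city, star_class, text). Not data from the paper.
REVIEWS = [
    (1, "Seoul", "5-star", "clean room friendly staff"),
    (2, "Seoul", "3-star", "noisy room slow staff"),
    (3, "Busan", "5-star", "clean beach view"),
    (4, "Busan", "3-star", "clean room small bathroom"),
]

ALL = "*"  # marks a rolled-up (aggregated) dimension


def build_text_cube(reviews):
    """Return {cell: (term_frequency, inverted_index)} for every cell."""
    tf = defaultdict(Counter)
    inv = defaultdict(lambda: defaultdict(set))
    for doc_id, city, stars, text in reviews:
        terms = text.lower().split()
        # Each document contributes to its base cell and to every roll-up of it.
        for cell in product((city, ALL), (stars, ALL)):
            tf[cell].update(terms)
            for term in terms:
                inv[cell][term].add(doc_id)
    return {
        cell: (tf[cell], {t: sorted(ids) for t, ids in inv[cell].items()})
        for cell in tf
    }


def keyword_search(cube, keyword, city=ALL, stars=ALL):
    """Look up a keyword's frequency and matching document ids in one cell."""
    term_freq, inverted = cube.get((city, stars), (Counter(), {}))
    return term_freq[keyword], inverted.get(keyword, [])


cube = build_text_cube(REVIEWS)
```

Calling `keyword_search(cube, "clean", stars="5-star")` restricts the search to one slice of the cube, while leaving both dimensions at `ALL` searches the fully aggregated cell. Precomputing roll-ups trades memory for lookup speed; a real system would likely materialize only some cells.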