• Title/Summary/Keyword: Digital Indicator


Product Evaluation Criteria Extraction through Online Review Analysis: Using LDA and k-Nearest Neighbor Approach (온라인 리뷰 분석을 통한 상품 평가 기준 추출: LDA 및 k-최근접 이웃 접근법을 활용하여)

  • Lee, Ji Hyeon;Jung, Sang Hyung;Kim, Jun Ho;Min, Eun Joo;Yeo, Un Yeong;Kim, Jong Woo
    • Journal of Intelligence and Information Systems / v.26 no.1 / pp.97-117 / 2020
  • Product evaluation criteria are indicators describing the attributes or values of products, which enable users and manufacturers to measure and understand the products. When companies analyze their products or compare them with those of competitors, appropriate criteria must be selected for objective evaluation. The criteria should reflect the product features that consumers considered when they purchased, used, and evaluated the products. However, current evaluation criteria do not reflect how consumer opinions differ from product to product. Previous studies tried to use online reviews from e-commerce sites, which reflect consumer opinions, to extract the features and topics of products and use them as evaluation criteria. However, these approaches still produce criteria that are irrelevant to the products because extracted or improper words are not refined. To overcome this limitation, this research proposes the LDA-k-NN model, which extracts possible criteria words from online reviews using LDA and refines them with k-nearest neighbor. The proposed approach starts with a preparation phase consisting of six steps. First, review data are collected from e-commerce websites. Most e-commerce websites classify their items into high-level, middle-level, and low-level categories. Review data for the preparation phase are gathered from each middle-level category and later merged to represent a single high-level category. Next, nouns, adjectives, adverbs, and verbs are extracted from the reviews using part-of-speech information from a morpheme analysis module. After preprocessing, LDA produces the words for each topic in the reviews, and only the nouns among the topic words are chosen as potential criteria words. Then, the words are tagged according to their suitability as criteria for each middle-level category. Next, every tagged word is vectorized with a pre-trained word embedding model. Finally, a k-nearest neighbor case-based approach is used to classify each word with tags.
After the preparation phase is set up, the criteria extraction phase is conducted on low-level categories. This phase starts with crawling reviews in the corresponding low-level category. The same preprocessing as in the preparation phase is conducted using the morpheme analysis module and LDA. Possible criteria words are extracted by taking nouns from the data and are vectorized with the pre-trained word embedding model. Finally, evaluation criteria are extracted by refining the possible criteria words using the k-nearest neighbor approach and the reference proportion of each word in the word set. To evaluate the performance of the proposed model, an experiment was conducted with reviews from '11st', one of the biggest e-commerce companies in Korea. Review data were taken from the 'Electronics/Digital' section, one of the high-level categories in 11st. For performance evaluation, the suggested model was compared with three alternatives: the actual criteria of 11st, a model that extracts nouns with the morpheme analysis module and refines them by word frequency, and a model that extracts nouns from LDA topics and refines them by word frequency. The evaluation task was to predict the evaluation criteria of 10 low-level categories with the suggested model and the three models above. Criteria words extracted by each model were combined into a single word set, which was used for survey questionnaires. In the survey, respondents chose every item they considered an appropriate criterion for each category. Each model received a score whenever a chosen word had been extracted by that model. The suggested model had higher scores than the other models in 8 out of 10 low-level categories. Paired t-tests on the scores of each model confirmed that the suggested model performs better in 26 tests out of 30. In addition, the suggested model was the best model in terms of accuracy.
This research proposes an evaluation criteria extraction method that combines topic extraction using LDA with refinement by the k-nearest neighbor approach. This method overcomes the limits of previous dictionary-based models and frequency-based refinement models. This study can contribute to improving review analysis for deriving business insights in the e-commerce market.
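
The refinement step described in this abstract (classifying candidate nouns as criteria or not with k-nearest neighbor over word vectors) can be illustrated with a minimal sketch. This is not the authors' implementation: the 3-dimensional vectors, the word list, the tag names `criterion` / `not_criterion`, and the choice of cosine similarity are all assumptions made for illustration. The LDA topic extraction and the "reference proportion" filter are omitted; the candidate nouns are assumed to have been extracted already.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def knn_classify(vector, training, k=3):
    """Return the majority tag among the k training vectors most similar to `vector`."""
    ranked = sorted(training,
                    key=lambda item: cosine_similarity(vector, item[0]),
                    reverse=True)
    votes = Counter(tag for _, tag in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy embeddings; a real pipeline would look these up
# in a pre-trained word embedding model.
embeddings = {
    "battery":  [0.90, 0.10, 0.00],
    "screen":   [0.80, 0.20, 0.10],
    "price":    [0.70, 0.30, 0.00],
    "delivery": [0.10, 0.90, 0.20],
    "box":      [0.00, 0.80, 0.30],
    "seller":   [0.20, 0.90, 0.10],
    "weight":   [0.85, 0.15, 0.05],
    "courier":  [0.10, 0.85, 0.25],
}

# Words tagged during the (hypothetical) preparation phase.
tags = {
    "battery": "criterion", "screen": "criterion", "price": "criterion",
    "delivery": "not_criterion", "box": "not_criterion", "seller": "not_criterion",
}
training = [(embeddings[word], tag) for word, tag in tags.items()]

# Candidate nouns assumed to come from LDA topics of a low-level category.
candidates = ["weight", "courier"]
criteria = [w for w in candidates
            if knn_classify(embeddings[w], training, k=3) == "criterion"]
print(criteria)
```

In this toy setup the candidate close to the tagged product attributes is kept and the one close to the shipping-related words is discarded; with real embeddings the value of `k` and the similarity measure would need to be tuned.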

A Study on Public Nuisance in Seoul, Pusan and Daegu Cities Part I. Survey on Air Pollution and Noise Level (공해(公害)에 관(關)한 조사연구(調査硏究) 제일편(第一編) : 서울, 부산(釜山), 대구(大邱) 지역(地域)의 대기오염(大氣汚染) 및 소음(騷音)에 관(關)한 비교조사(比較調査) 연구(硏究))

  • Cha, Chul-Hwan;Shin, Young-Soo;Lee, Young-Il;Cho, Kwang-Soo;Choo, Chong-Yoo;Kim, Kyo-Sung;Choi, Dug-Il
    • Journal of Preventive Medicine and Public Health / v.4 no.1 / pp.41-64 / 1971
  • From July 1 to the end of November 1970, a survey of air pollution and noise levels was conducted in Seoul, Pusan and Taegu, the three largest cities in Korea. Each city was divided into 4-6 areas: the industrial area, the semi-industrial area, the commercial area, the residential area, the park area and the downtown area. Thirty-eight sites were selected from each area.
    A. Method of Measurement: Dustfall was measured by the Deposit Gauge Method, sulfur oxides by the $PbO_2$ cylinder method, suspended particles by the Digital Dust Indicator, sulfur dioxide ($SO_2$) and carbon monoxide (CO) by the MSA & Kitakawa Detector, and noise levels by the Rion Sound Survey meter.
    B. Results:
    1. The mean value of dustfall in the 3 cities was $30.42\;ton/km^2/month$, ranging from 8.69 to 95.44.
    2. The mean values of dustfall by city were $33.17\;ton/km^2/month$ in Seoul, 32.11 in Pusan and 25.97 in Taegu.
    3. The mean values of dustfall by area decreased in the order of semi-industrial area, downtown area, industrial area, commercial area, residential area, and park area.
    4. The mean values of dustfall in Seoul by area were, in decreasing order, $52.32\;ton/km^2/month$ in the downtown area, 50.54 in the semi-industrial area, 40.37 in the industrial area, 24.19 in the commercial area, 16.25 in the park area and 15.39 in the residential area.
    5. The mean values of dustfall in Pusan by area were $48.27\;ton/km^2/month$ in the semi-industrial area, 36.68 in the industrial area, 25.31 in the commercial area, and 18.19 in the residential area.
    6. The mean values of dustfall in Taegu by area were $36.46\;ton/km^2/month$ in the downtown area, 33.52 in the industrial area, 20.37 in the commercial area and 13.55 in the residential area.
    7. The mean value of sulfur oxides in the 3 cities was $1.52\;mg\;SO_3/day/100\;cm^2\;PbO_2$, ranging from 0.32 to 4.72.
    8. The mean values of sulfur oxides by city were $1.89\;mg\;SO_3/day/100\;cm^2\;PbO_2$ in Pusan, 1.64 in Seoul and 1.21 in Taegu.
    9. The mean values of sulfur oxides by area in the 3 cities were $2.16\;mg\;SO_3/day/100\;cm^2\;PbO_2$ in the industrial area, 1.69 in the semi-industrial area, 1.50 in the commercial area, 1.48 in the downtown area, 1.32 in the residential area and 0.94 in the park area.
    10. The monthly mean values of sulfur oxides increased steadily from July, reaching a peak in November.
    11. The mean value of suspended particles was $2.89\;mg/m^3$, ranging from 1.15 to 5.27.
    12. The mean values of suspended particles by city were $3.14\;mg/m^3$ in Seoul, 2.79 in Taegu and 2.25 in Pusan.
    13. The mean noise level in the 3 cities was 71.3 phon, ranging from 49 to 99 phon.
    14. The mean noise levels by city were 73 phon in Seoul, 72 in Pusan, and 69 in Taegu.
    15. The mean noise levels by area in the 3 cities decreased in the order of the downtown area, commercial area, industrial area and semi-industrial area, park area and residential area.
    16. The comparison of noise levels by area in the 3 cities indicated that the highest level was detected in the downtown area in Seoul and Taegu and in the industrial area in Pusan.
    17. The daily average concentration of sulfur dioxide ($SO_2$) in the 3 cities was 0.081 ppm, ranging from 0.004 to 0.196.
    18. The daily average concentrations of sulfur dioxide by city were 0.092 ppm in Seoul, 0.089 in Pusan and 0.062 in Taegu.
    19. The weekly average concentration of carbon monoxide (CO) was 27.59 ppm.
    20. The daily average concentrations of carbon monoxide by city were 33.37 ppm in Seoul, 25.76 in Pusan and 23.65 in Taegu.
    21. The concentrations of $SO_2$ and CO reach a peak from 6 p.m. to 8 p.m.
    22. The daily average concentration of CO in the downtown area was about 3 times that in the industrial area, probably due to heavy traffic emissions.
    23. As for the daily variation of $SO_2$ and CO, concentrations remained relatively high on weekdays in the industrial area and during the first part of the week in the downtown area.
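
As a quick arithmetic cross-check of the figures reported in this abstract, the sketch below compares each reported 3-city mean with the unweighted average of the three city means. The abstract does not state how the overall means were computed, so the unweighted average and the 1% relative tolerance (`rel_tol`) are assumptions, and the dictionary keys are illustrative labels only.

```python
# Values copied from the abstract: (reported 3-city mean, [Seoul, Pusan, Taegu] means).
reported = {
    "dustfall":            (30.42, [33.17, 32.11, 25.97]),   # ton/km^2/month
    "sulfur_oxides":       (1.52,  [1.64, 1.89, 1.21]),      # mg SO3/day/100 cm^2 PbO2
    "suspended_particles": (2.89,  [3.14, 2.25, 2.79]),      # mg/m^3
    "noise":               (71.3,  [73, 72, 69]),            # phon
    "so2":                 (0.081, [0.092, 0.089, 0.062]),   # ppm
    "co":                  (27.59, [33.37, 25.76, 23.65]),   # ppm
}


def unweighted_mean(values):
    """Plain arithmetic mean, giving every city equal weight."""
    return sum(values) / len(values)


def cross_check(table, rel_tol=0.01):
    """Map each pollutant to (unweighted city mean, agrees with reported mean?)."""
    checks = {}
    for name, (overall, by_city) in table.items():
        mean = unweighted_mean(by_city)
        agrees = abs(mean - overall) <= rel_tol * overall
        checks[name] = (round(mean, 3), agrees)
    return checks


checks = cross_check(reported)
for name, (mean, agrees) in checks.items():
    print(f"{name}: unweighted city mean = {mean}, agrees with reported mean: {agrees}")
```

Under these assumptions, the dustfall, noise, $SO_2$ and CO means agree with an unweighted average of the city values, while the sulfur oxides and suspended particles means do not. The abstract gives no explanation for the difference, so it should not be read as an error in the survey.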
