• Title/Summary/Keyword: Text comparing

Product Feature Extraction and Rating Distribution Using User Reviews (사용자 리뷰를 이용한 상품 특징 추출 및 평점 분배)

  • Son, Soobin; Chun, Jonghoon
    • The Journal of Society for e-Business Studies / v.22 no.1 / pp.65-87 / 2017
  • We propose a method that analyzes user reviews and ratings of products in an online shopping mall and automatically extracts product features to determine the characteristics of a product. By judging whether a rating was given for a specific feature of a product, our method distributes the score to each feature. Conventional methods force users to waste time reading an overwhelming number of reviews and ratings to decide whether to buy a product. Moreover, it is difficult to grasp the merits and demerits of a product because of the way reviews and ratings are provided: they are structured so that it is impossible to tell which rating was given to which characteristic of the product. To resolve this problem, we propose a method that automatically extracts product features from user reviews and distributes the score to the appropriate characteristics by calculating the rating of each feature from the overall rating. The proposed method collects product reviews and ratings, conducts morphological analysis, and extracts the features and emotional words of the products. In addition, a method for determining the polarity of each sentence in which a feature appears is used to assign a weight value to each feature. The results of the experiment and of questionnaires comparing the proposed method with existing methods show its usefulness. We also validate the results by comparing them with analyses conducted by product review experts.

The way to make training data for deep learning model to recognize keywords in product catalog image at E-commerce (온라인 쇼핑몰에서 상품 설명 이미지 내의 키워드 인식을 위한 딥러닝 훈련 데이터 자동 생성 방안)

  • Kim, Kitae; Oh, Wonseok; Lim, Geunwon; Cha, Eunwoo; Shin, Minyoung; Kim, Jongwoo
    • Journal of Intelligence and Information Systems / v.24 no.1 / pp.1-23 / 2018
  • Since the start of the 21st century, various high-quality services have emerged with the growth of the Internet and information and communication technologies (ICT). In particular, the E-commerce industry, in which Amazon and eBay stand out, is growing explosively. As E-commerce grows, customers can easily find what they want to buy while comparing various products, because more products are registered at online shopping malls. However, this growth has created a problem: with so many products registered, it has become difficult for customers to find what they really need. When customers search with a general keyword, too many products are returned. Conversely, few products are found when customers type in product details, because concrete product attributes are rarely registered. In this situation, automatically recognizing text in images can be a solution. Because the bulk of product details are written in catalogs in image format, most product information cannot be found by text input in current text-based search systems. If the information in images can be converted to text, customers can search for products by product details, which makes shopping more convenient. Various existing OCR (Optical Character Recognition) programs can recognize text in images, but they are hard to apply to catalogs because they have trouble recognizing text in certain circumstances, for example when the text is not large enough or the fonts are inconsistent. Therefore, this research proposes a way to recognize keywords in catalogs with deep learning, which has been the state of the art in image recognition since the 2010s.
The Single Shot Multibox Detector (SSD), a model well regarded for its object-detection performance, can be used with a structure redesigned to account for the differences between text and objects. However, the SSD model needs a large amount of labeled training data, because deep learning models of this kind are trained by supervised learning. To collect such data, one could manually label the location and class of text in catalogs, but manual collection causes many problems. Some keywords would be missed because humans make mistakes while labeling. Collecting the data also becomes too time-consuming given the scale required, or too costly if many workers are hired to shorten the time. Furthermore, if specific keywords need to be trained, finding images that contain those words is also difficult. To solve this data issue, this research developed a program that creates training data automatically. The program generates catalog-like images containing various keywords and pictures and saves the location information of the keywords at the same time. With this program, data can be collected efficiently and the performance of the SSD model also improves. The SSD model achieved a recognition rate of 81.99% with 20,000 data items created by the program. Moreover, this research tested the SSD model on different variants of the data to analyze which features of the data influence text-recognition performance. The results show that the number of labeled keywords, the addition of overlapping keyword labels, the existence of unlabeled keywords, the spacing among keywords, and the differences among background images are related to the performance of the SSD model. These findings can guide performance improvements of the SSD model or other deep-learning-based text recognizers through high-quality data.
The SSD model redesigned to recognize text in images and the program developed for creating training data are expected to contribute to the improvement of search systems in E-commerce. Suppliers can spend less time registering keywords for products, and customers can search for products by the product details written in the catalog.
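
The labeling side of such an automatic training-data generator might be sketched as follows. This is only an illustration under stated assumptions, not the authors' program: the names `Box`, `overlaps`, and `make_sample` are hypothetical, text size is approximated by a fixed per-character width, and the actual rendering of keywords and background pictures into an image is omitted.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """One labeled keyword: the text and its bounding box in pixels."""
    keyword: str
    x: int
    y: int
    w: int
    h: int


def overlaps(a, b):
    """Return True if the two axis-aligned boxes intersect."""
    return (a.x < b.x + b.w and b.x < a.x + a.w
            and a.y < b.y + b.h and b.y < a.y + a.h)


def make_sample(keywords, width=300, height=300, char_w=12, line_h=20,
                max_tries=100, rng=None):
    """Place keywords at random non-overlapping positions on a canvas.

    Returns the list of Box labels that a detector would be trained on.
    char_w and line_h are assumed glyph sizes; a real generator would
    measure the rendered text instead.
    """
    rng = rng or random.Random()
    placed = []
    for kw in keywords:
        w, h = char_w * len(kw), line_h
        if w > width or h > height:
            continue  # keyword cannot fit on the canvas at all
        for _ in range(max_tries):
            cand = Box(kw, rng.randint(0, width - w),
                       rng.randint(0, height - h), w, h)
            if not any(overlaps(cand, p) for p in placed):
                placed.append(cand)
                break  # keyword placed; move on to the next one
    return placed
```

Because the generator chooses the positions itself, every label is known exactly, which is the property the abstract relies on to avoid manual labeling mistakes.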

Effect and Safety of Oxygen Chamber Therapy on Cold Hypersensitivity: A Randomized, Controlled Trial (냉증에 대한 산소챔버의 임상 효능 및 안전성 연구)

  • Ha, Hun-Yong; Yoon, Dal-Hwan; Go, Ho-Yeon; Han, Yong-Dae; Kim, Nam-Sik; Nam, Eun-Young; Kim, Hyung-Jun
    • The Journal of Korean Obstetrics and Gynecology / v.26 no.4 / pp.123-139 / 2013
  • Purpose: Cold hypersensitivity is regarded as being associated with blood circulation. This study aims to evaluate the effects and safety of oxygen chamber therapy on cold hypersensitivity by comparing temperature and Visual Analogue Scale (VAS) scores. Methods: 42 outpatients who visited ○○ University Oriental Hospital from July 11th, 2013 to August 28th, 2013 were analyzed. Patients were measured with a thermometer, and those with a thermal difference greater than 0.3°C between the upper arm and the palm and with a cold hypersensitivity VAS score above 4 were diagnosed with cold hypersensitivity. The 42 outpatients diagnosed with cold hypersensitivity were divided into two groups: an experimental group of 21 patients and a control group of 21 patients. The experimental group received oxygen chamber therapy 10 times over 4 weeks. The effects of oxygen chamber therapy on cold hypersensitivity were then analyzed with a t-test using SPSS for Windows version 21. Results: After oxygen chamber therapy, the experimental group showed considerable improvement in cold hypersensitivity, reflected in decreases in the thermal difference and in the VAS score of cold hypersensitivity. Ear deafness and hand numbness were reported as adverse effects in the experimental group, but there were no serious adverse effects. Conclusions: This clinical trial showed that oxygen chamber therapy could be effective and safe in reducing cold hypersensitivity.

The Study on Korean Prosody Generation using Artificial Neural Networks (인공 신경망의 한국어 운율 발생에 관한 연구)

  • Min Kyung-Joong; Lim Un-Cheon
    • Proceedings of the Acoustical Society of Korea Conference / spring / pp.337-340 / 2004
  • Accurately reproduced prosody is one of the key factors that affect the naturalness of speech synthesized by a TTS system. In general, prosody rules have been gathered either from linguistic knowledge or by analyzing the prosodic information of natural speech. However, such rules cannot be perfect, and some of them may be incorrect. We therefore propose artificial neural networks (ANNs) that can be trained to learn the prosody of natural speech and generate it. In the learning phase, the ANNs learn the pitch and energy contours of the center phoneme: a string of phonemes from a sentence is applied to the ANNs, the output pattern is compared with the target pattern, and the weights are adjusted to minimize the mean square error between them. In the test phase, the estimation rates were computed. We found that the ANNs could generate the prosody of a sentence.
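
The training principle described above (compare the output with the target, then adjust the weights to reduce the mean square error) can be illustrated with a single linear unit trained by the delta rule. This is a minimal sketch and not the authors' network: `train_lms` and `mse` are hypothetical names, and the inputs stand in for whatever phoneme features a real system would use.

```python
def train_lms(samples, targets, lr=0.1, epochs=500):
    """Fit weights and bias of one linear unit by per-sample gradient steps."""
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        for x, t in zip(samples, targets):
            y = sum(wi * xi for wi, xi in zip(w, x)) + b  # unit output
            err = t - y                                   # target minus output
            # Move each weight in the direction that reduces the squared error.
            w = [wi + lr * err * xi for wi, xi in zip(w, x)]
            b += lr * err
    return w, b


def mse(samples, targets, w, b):
    """Mean square error of the unit on the given data."""
    total = 0.0
    for x, t in zip(samples, targets):
        y = sum(wi * xi for wi, xi in zip(w, x)) + b
        total += (t - y) ** 2
    return total / len(samples)
```

A multi-layer network applies the same error-driven adjustment to every layer through backpropagation.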

The Basic Study for Building the Depression Prescription Guideline of Gamiguibi-Tang - The Evaluation of Reliability and Validity of the Depression Pattern-Identification Questionnaire - (가미귀비탕(加味歸脾湯)의 우울증 투약지침 개발을 위한 기초연구 - 우울증 변증 설문지의 신뢰도 타당도 평가 -)

  • Koo, Byung-Soo; Lee, Sang-Jae; Han, Chang-Ho; Kim, Ho-Jun; Park, Se-Hwan
    • Journal of Oriental Neuropsychiatry / v.20 no.4 / pp.1-13 / 2009
  • Objectives: Depression falls into the category of Wuljeung, and Gamiguibi-Tang (Jiaweiguipitang) is the standard prescription for Wuljeung. This study develops a questionnaire for building guidelines on administering Gamiguibi-Tang for depression and evaluates the reliability and validity of the questionnaire. Methods: By extracting text related to depression and Gamiguibi-Tang from a total of 9 Korean medicine books and consulting experts, the study selected 80 items and converted them into a questionnaire. Neuropsychiatry professors and medical specialists were surveyed three times using the Delphi method, and 21 final questionnaire items were selected. Using this questionnaire, a total of 216 samples were collected and tested for reliability and validity. Results: None of the 21 items reduced the total Cronbach's alpha coefficient, and all satisfied test-retest reliability. Factor analysis extracted 5 factors: mental state, sleep, accompaniment, fatigue, and weakness. Finally, in a comparison of a depression group with a normal control group, the two groups showed significant differences in the score of each of the 21 items, in the summed scores of the items of factors 1 to 5, and in the sum of the scores of all 21 items. Conclusions: The questionnaire for the updated depression prescription guideline of Gamiguibi-Tang satisfied both reliability and validity. It can later help make the application of Gamiguibi-Tang to the treatment of depression more objective.
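
The reliability check mentioned in the results, whether removing any item lowers the total Cronbach's alpha, follows from the standard formula α = k/(k-1) · (1 - Σ item variances / variance of total scores). The sketch below is illustrative only; `cronbach_alpha` and `alpha_if_deleted` are hypothetical names, and the study itself presumably used a statistics package.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha; rows holds one list of item scores per respondent.

    Assumes at least two items, at least two respondents, and a nonzero
    variance of the total scores.
    """
    k = len(rows[0])
    item_vars = [variance([r[i] for r in rows]) for i in range(k)]
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def alpha_if_deleted(rows):
    """Alpha recomputed with each item left out in turn (needs >= 3 items)."""
    k = len(rows[0])
    return [cronbach_alpha([r[:i] + r[i + 1:] for r in rows])
            for i in range(k)]
```

An item whose removal raises alpha above the full-scale value is a candidate for dropping; the abstract reports that none of the 21 final items lowered the total coefficient.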

Different Pathology between General and palms-and-soles hyperhidrosis in Korean Medicine and Medicine (자한(自汗)과 수족한(手足汗)에 대한 한의학 및 의학적 고찰)

  • Lee, Wook Jin; Kim, Byoung Soo
    • The Journal of Korean Medicine / v.41 no.1 / pp.11-20 / 2020
  • Objectives: We noticed that hyperhidrosis can be differentiated by whether it is topical or systemic in both Korean medicine (KM) and Modern medicine (MM). By comparing topical and systemic sweating, we aim to identify similarities between KM and MM regarding the stimuli that cause sweating. Methods: The research was conducted by reviewing textbooks, articles, and books. Results: Hyperhidrosis is differentiated by whether it is topical or systemic in both KM and MM. First, systemic sweating (SS) is affected by body temperature. In KM, Heat and Cold (plus yang deficiency) can make a person sweat systemically. In MM, heat is also mentioned as a stimulus. Second, topical sweating (TS) can occur in emotionally stressful situations, especially on the palms and soles. In KM, this phenomenon is explained by the heart spirit (心神) and by disease transmitted by the pericardium meridian (手厥陰心包經 是動病). In MM, hyperhidrosis on the palms and soles is anatomically generated by adrenergic sympathetic nerves, which are involved in stress. Third, sweating on the palms and soles can also be generated by the internal organs. In KM, hyperhidrosis on the palms and soles is explained as an illness of the stomach meridian (足陽明胃經). Seventy percent of the parasympathetic nerve is the vagus nerve, which is distributed in the internal organs, usually the gastrointestinal tract. In this respect, the stomach and the parasympathetic nerve seem to be involved in hyperhidrosis on the palms and soles. Conclusion: Hyperhidrosis is differentiated similarly, by whether it is topical or systemic, in both KM and MM. While the perspectives of KM and MM are each preserved, one can be useful to the other by supplementing its weak points.

Comparison of Topic Modeling Methods for Analyzing Research Trends of Archives Management in Korea: focused on LDA and HDP (국내 기록관리학 연구동향 분석을 위한 토픽모델링 기법 비교 - LDA와 HDP를 중심으로 -)

  • Park, JunHyeong; Oh, Hyo-Jung
    • Journal of Korean Library and Information Science Society / v.48 no.4 / pp.235-258 / 2017
  • The purpose of this study is to analyze research trends in archives management in Korea by comparing LDA (Latent Dirichlet Allocation) topic modeling, the best-known method in text mining, with HDP (Hierarchical Dirichlet Process) topic modeling, which was developed from LDA topic modeling. First, we collected 1,027 articles related to archives management published from 1997 to 2016 in two journals on archives management and four journals on library and information science in Korea, and performed several preprocessing steps. We then conducted LDA and HDP topic modeling. For a more in-depth comparative analysis, we used LDAvis as a topic modeling visualization tool. In the results, LDA topic modeling was influenced by keywords that were frequent across all topics, whereas HDP topic modeling showed specific keywords that made it easy to identify the characteristics of each topic.

VOC Summarization and Classification based on Sentence Understanding (구문 의미 이해 기반의 VOC 요약 및 분류)

  • Kim, Moonjong; Lee, Jaean; Han, Kyouyeol; Ahn, Youngmin
    • KIISE Transactions on Computing Practices / v.22 no.1 / pp.50-55 / 2016
  • To understand customers' opinions or demands regarding a company's products or services, it is important to consider VOC (Voice of Customer) data; however, it is difficult to understand context from VOC data because of segmented and duplicate sentences and a variety of dialog contexts. In this article, POS (part of speech) and morphemes were selected as language resources because of their semantic importance in documents. Based on these, we defined an LSP (Lexico-Semantic-Pattern) to understand the structure and semantics of sentences and extracted summaries from key sentences; the LSP was also used to connect segmented sentences and remove contextual repetition. We further defined LSPs by category and classified the documents according to the categories of the main sentences matched by the LSPs. In the experiment, we classified the VOC documents and created summaries, then compared the results with those of previous methodologies.

Remote Versioning on the CoSpace Client for the CoSlide Collaborative System (CoSlide 협업시스템을 지원하는 CoSpace 클라이언트의 원격 버전 관리)

  • Park, Jong-Moon; Lee, Myung-Joon
    • Journal of the Korea Institute of Information and Communication Engineering / v.14 no.1 / pp.233-241 / 2010
  • CoSlide is a collaborative system that extends the Jakarta Slide WebDAV server. The CoSlide server provides group workspaces for collaborators. CoSpace is a client that supports various collaborative authoring activities on the CoSlide server through the WebDAV protocol. CoSpace provides graphical user interfaces that support effective interaction among collaborators and manage their shared resources. However, during collaboration, simultaneous modifications to the content of shared resources might cause conflicts among the revisions made by the collaborators, leading to serious problems in project progress. In this paper, we describe an extension of the CoSpace client that solves this problem. The extended CoSpace client supports a remote version management facility through which collaborators can remotely manage the versions of the associated server resources. In addition, to identify changes in text files such as program source code, the extended client provides a facility for comparing two versions and displaying the differences visually. It also provides version management of a whole workspace and removal of all unnecessary versions of designated resources.
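
Comparing two versions of a text file and listing their differences, as the extended client is described as doing, can be sketched with Python's standard `difflib` module. This is an assumption-laden illustration rather than the CoSpace implementation: `diff_versions` and the labels `v1` and `v2` are hypothetical, and fetching the versions from a WebDAV server is left out.

```python
import difflib


def diff_versions(old_text, new_text, old_label="v1", new_label="v2"):
    """Return unified-diff lines describing how new_text differs from old_text."""
    old_lines = old_text.splitlines()
    new_lines = new_text.splitlines()
    # lineterm="" keeps the header lines free of trailing newlines,
    # matching the newline-free content lines from splitlines().
    return list(difflib.unified_diff(old_lines, new_lines,
                                     fromfile=old_label, tofile=new_label,
                                     lineterm=""))
```

Lines starting with `-` exist only in the older version and lines starting with `+` only in the newer one; a graphical client could color these instead of printing them.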

Recognition of Various Printed Hangul Images by using the Boundary Tracing Technique (경계선 기울기 방법을 이용한 다양한 인쇄체 한글의 인식)

  • Baek, Seung-Bok; Kang, Soon-Dae; Sohn, Young-Sun
    • Journal of the Korean Institute of Intelligent Systems / v.13 no.1 / pp.1-5 / 2003
  • In this paper, we implemented a system that converts character images of the printed Korean alphabet (Hangul), captured with a black-and-white CCD camera, into editable text documents. We extracted the contour information of each character, which reflects its structural features, by using the boundary tracing technique, which is robust to noise in character recognition. Using the contour information, we recognized the horizontal and vertical vowels of the character image and classified the character into one of six patterns. The character is then divided into consonant and vowel units. The vowels are recognized by using the maximum length projection. The separated consonants are recognized by comparing the input pattern with a standard pattern that holds the phase information of changes in the boundary line. In the implemented system, the recognized characters are entered into a word editor as editable KS Hangul completion-type codes.