Search | Korea Science

Sentiment Analysis of Korean Reviews Using CNN: Focusing on Morpheme Embedding (CNN을 적용한 한국어 상품평 감성분석: 형태소 임베딩을 중심으로)

Park, Hyun-jung;Song, Min-chae;Shin, Kyung-shik
- Journal of Intelligence and Information Systems / v.24 no.2 / pp.59-83 / 2018
With the increasing importance of sentiment analysis for grasping the needs of customers and the public, various types of deep learning models have been actively applied to English texts. In deep-learning-based sentiment analysis of English texts, the natural language sentences in the training and test datasets are usually converted into sequences of word vectors before being fed into the models. Here, word vectors generally refer to vector representations of the words obtained by splitting a sentence on space characters. There are several ways to derive word vectors, one of which is Word2Vec, used to produce the 300-dimensional Google word vectors from about 100 billion words of Google News data. These vectors have been widely used in sentiment analysis studies of reviews from various fields such as restaurants, movies, laptops, and cameras. Unlike in English, morphemes play an essential role in sentiment analysis and sentence structure analysis in Korean, a typical agglutinative language with well-developed postpositions and endings. A morpheme can be defined as the smallest meaningful unit of a language, and a word consists of one or more morphemes. For example, the word '예쁘고' consists of the morphemes '예쁘' (adjective) and '고' (connective ending). Given the significance of Korean morphemes, it seems reasonable to adopt the morpheme as the basic unit in Korean sentiment analysis. Therefore, in this study, we use 'morpheme vectors' as the input to a deep learning model rather than the 'word vectors' mainly used for English text. A morpheme vector is a vector representation of a morpheme and can be derived by applying an existing word vector derivation mechanism to sentences divided into their constituent morphemes. This raises several questions. What is the desirable range of POS (Part-Of-Speech) tags when deriving morpheme vectors to improve the classification accuracy of a deep learning model?
Is it proper to apply a typical word vector model, which relies primarily on the form of words, to Korean, which has a high homonym ratio? Will text preprocessing such as correcting spelling or spacing errors affect the classification accuracy, especially when drawing morpheme vectors from Korean product reviews with many grammatical mistakes and variations? We seek empirical answers to these fundamental issues, which are likely to be encountered first when applying deep learning models to Korean texts. As a starting point, we summarize these issues as three central research questions. First, which is more effective as the initial input of a deep learning model: morpheme vectors from grammatically correct texts of a domain other than the analysis target, or morpheme vectors from considerably ungrammatical texts of the same domain? Second, what is an appropriate morpheme vector derivation method for Korean with regard to the range of POS tags, homonyms, text preprocessing, and minimum frequency? Third, can we achieve a satisfactory level of classification accuracy when applying deep learning to Korean sentiment analysis? To address these questions, we generate various types of morpheme vectors and then compare the classification accuracy of a non-static CNN (Convolutional Neural Network) model that takes the morpheme vectors as input. As training and test datasets, 17,260 cosmetics product reviews from Naver Shopping are used. To derive morpheme vectors, we use data from the same domain as the target and data from another domain: about 2 million cosmetics product reviews from Naver Shopping and 520,000 items of Naver News data, arguably corresponding to Google's News data. The six primary sets of morpheme vectors constructed in this study differ in terms of the following three criteria.
First, they come from two types of data source: Naver News, of high grammatical correctness, and Naver Shopping's cosmetics product reviews, of low grammatical correctness. Second, they differ in the degree of data preprocessing, namely, sentence splitting only or sentence splitting followed by spelling and spacing corrections. Third, they vary in the form of input fed into the word vector model: whether the morphemes are entered by themselves or with their POS tags attached. The morpheme vectors further vary depending on the range of POS tags considered, the minimum frequency of the morphemes included, and the random initialization range. All morpheme vectors are derived through the CBOW (Continuous Bag-Of-Words) model with a context window of 5 and a vector dimension of 300. The results suggest that using same-domain text even with a lower degree of grammatical correctness, performing spelling and spacing corrections in addition to sentence splitting, and incorporating morphemes of all POS tags, including the incomprehensible category, lead to better classification accuracy. POS tag attachment, which was devised to address the high proportion of homonyms in Korean, and the minimum frequency threshold for a morpheme to be included do not seem to have any definite influence on the classification accuracy.
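To make the input format concrete, the following is a minimal Python sketch of how morpheme tokens (with or without POS tags attached) could be turned into CBOW training pairs with a context window of 5 and a minimum-frequency filter. It is an illustration only, not the authors' pipeline: the function names `tag_morphemes` and `cbow_pairs`, the sample review, and the tag labels `VA`, `EC`, and `EF` are assumptions, and a real setup would obtain the (morpheme, POS) pairs from a Korean morphological analyzer and pass the tokens to a Word2Vec implementation.

```python
from collections import Counter

def tag_morphemes(morphemes, attach_pos=True):
    """Turn (morpheme, POS) pairs into tokens, optionally with the POS tag attached."""
    return [f"{m}/{pos}" if attach_pos else m for m, pos in morphemes]

def cbow_pairs(sentences, window=5, min_count=1):
    """Yield (context_tokens, target_token) pairs as a CBOW model would consume them."""
    # Count token frequencies so that rare morphemes can be dropped.
    counts = Counter(tok for sent in sentences for tok in sent)
    for sent in sentences:
        kept = [tok for tok in sent if counts[tok] >= min_count]
        for i, target in enumerate(kept):
            # Up to `window` tokens on each side of the target form the context.
            context = kept[max(0, i - window):i] + kept[i + 1:i + 1 + window]
            if context:
                yield context, target

# Hypothetical pre-analyzed review; the tags are assumed labels for illustration.
review = [("예쁘", "VA"), ("고", "EC"), ("좋", "VA"), ("아요", "EF")]
tokens = tag_morphemes(review)            # morphemes with POS tags attached
plain = tag_morphemes(review, False)      # morphemes only
pairs = list(cbow_pairs([tokens], window=5))
```

Attaching the tag makes two homonymous morphemes with different POS tags distinct vocabulary entries, which is the intent of the POS tag attachment described above.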
https://doi.org/10.13088/jiis.2018.24.2.059

A Study on the Improvement of Flexible Working Hours (탄력적 근로시간제 개선에 대한 연구)

Kwon, Yong-man
- Journal of Venture Innovation / v.5 no.3 / pp.57-70 / 2022
In modern industrial capitalism, the relationship between the provision of work and the receipt of wages has become an important principle governing society. Under the labor contract, workers are directly compensated with wages for entrusting the right to dispose of their labor to the employer, and human life should be guaranteed and reproduced with proper rest. The establishment of labor relations under free contracts poses a problem for the protection of workers; accordingly, maximum working hours are set as a minimum right for workers, and a standard for minimum rest is set and granted. The reduction of working hours is very important for workers' quality of life, but it is also an important issue for efficient corporate activity. As of 2020, Korea's annual working hours stood at 1,908, and Korea ranked third lowest among the 37 OECD countries in the happiness index surveyed by the Sustainable Development Solutions Network (SDSN), an agency under the United Nations. Accordingly, the need to reduce working hours has been recognized, and maximum working hours per week have been limited to 52 since 2018. In this situation, various working-hour arrangements are legally exempted as a way to maintain companies' value-added creation and meet the diverse needs of workers, and Korea's Labor Standards Act regulates flexible working hours within three months, flexible working hours exceeding three months, selective working hours, and extended working hours. However, the discussion on applying the flexible working hours system as revised in 2021 and the recently discussed expansion of the settlement unit period show that the flexible working hours system has problems that need to be improved. Therefore, this paper examines the problems of the flexible working hours system and measures to improve it.
The flexible working hours system is a system under which exceeding the legal working hours on a specific day or week, according to a predetermined standard, does not violate working-hour limits and does not require the payment of additional wages for the overtime work. It is mainly useful as a form of shift work in manufacturing, sales and service, continuous businesses, and electricity, gas, water, and transportation services with long-term operations. It is also used as a way to shorten working hours, for example by expanding holidays through short working days. However, if the settlement unit period is expanded, workers are disadvantaged because they will not receive the additional wages they could otherwise receive. Therefore, this paper proposes the following. First, if the settlement unit period is expanded as currently under discussion, additional wages should be paid for the period expanded beyond the current standard. Second, the application of the flexible working hours system to individual workers should be improved so that there is sufficient consultation with individual workers when a written agreement is made with the worker representative. Third, the allowable time for extended work during the settlement unit period should be clarified. Fourth, daily working hours should be limited or continuous rest should be applied. In addition, since the written agreement of the worker representative is an important issue in applying the flexible working hours system, it is necessary to secure the representativeness of the worker representative.
https://doi.org/10.22788/5.3.4

A Technique of Forecasting Market Share of Transportation Modes after Introducing New Lines of Urban Rail Transit with Observed Mode Share Data (관측 교통수단 분담률 자료를 활용한 도시철도 신설 후 수단분담률 예측분석 기법)

Seo, Dong-Jeong;Kim, Ik-Ki;Lee, Tae-Hoon
- Journal of Korean Society of Transportation / v.30 no.1 / pp.7-18 / 2012
This study suggests a method of forecasting the market share of each mode after the introduction of new urban rail transit lines. The study incorporates the observed market share of currently operating urban rail transit into the forecasting process in order to improve the accuracy of the predicted market share of each mode. For a more realistic forecasting model, we categorized O/D pairs according to trip distance, access time, and number of transfers. The analysis of travelers' mode choice behavior with observed data showed that the longer the trip distance, the higher the share of urban rail tends to be, and that the fewer the transfers and the shorter the access time, the higher the share of urban rail also tends to be. Then, an incremental logit model was used to estimate mode choice probabilities for O/D pairs along rail transit lines, utilizing the observed market shares of each mode and differences in transit service level. As the next step, the market share of rail transit after the introduction of new rail transit lines was forecast using the incremental logit model with the initial share values calculated in the previous analysis step. The forecast also reflected changes in the level of service for automobiles on highways due to changes in highway systems and changes in mode shares after the introduction of the new rail transit lines. The proposed method is expected to reproduce mode choice behavior for rail transit more realistically and to be more theoretically sound than the typical existing methods in Korea, which use SP data with an incremental logit model or an additive logit model.
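The incremental (pivot-point) logit step can be sketched as follows. This is the textbook form of the model, in which observed shares are pivoted by changes in utility, P'_i = P_i exp(ΔU_i) / Σ_j P_j exp(ΔU_j); it is not the authors' calibrated model. The mode names, the observed shares, and the utility change are hypothetical values chosen for illustration.

```python
import math

def incremental_logit(base_shares, delta_utilities):
    """Pivot observed shares by utility changes.

    P'_i = P_i * exp(dU_i) / sum_j P_j * exp(dU_j)
    Modes missing from delta_utilities are treated as unchanged (dU = 0).
    """
    weights = {
        mode: share * math.exp(delta_utilities.get(mode, 0.0))
        for mode, share in base_shares.items()
    }
    total = sum(weights.values())
    return {mode: w / total for mode, w in weights.items()}

# Hypothetical observed shares for one O/D pair category.
observed = {"auto": 0.50, "bus": 0.30, "rail": 0.20}
# Hypothetical utility gain for rail from an improved service level.
delta_u = {"rail": 0.4}

forecast = incremental_logit(observed, delta_u)
```

Because the model pivots on observed shares, it needs only the change in utility for each mode rather than a full set of utility functions, and the relative shares of modes whose utility does not change are preserved.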
https://doi.org/10.7470/jkst.2012.30.1.007

Image Quality Assessment Model of Natural Scene Based on Normal Distribution Analysis (일반 장면의 정규분포 분석을 기반으로 한 화질 측정 모형)

Park, Hyung-Ju;Har, Dong-Hwan
- Science of Emotion and Sensibility / v.16 no.3 / pp.373-386 / 2013
In this research, we specify image consumers' preferred image quality ranges based on objective image quality evaluation factors and propose a method that measures preference for natural image scenes. Specifically, following a No-Reference approach, we select dynamic range, color, and contrast as the image quality measurement factors. For sample images, we chose 200 preferred landscapes, each with over 30 recommendations from image consumers on an internet photo gallery. From the scores of the three objective factors, a final expected score representing image quality preference is computed on a 100-point scale. In the main test, the actual image sample shows a dynamic range of 10 stops, LAB mean values of L: 54.7, A: 2.96, B: -15.84, and an RSC contrast of 376.9. Against the normal distribution of the 200 image samples, the z values are 0.21 for dynamic range, L: 0.15, A: 0.38, B: 0.13 for the LAB mean values, and 0.08 for RSC contrast. Using the standard normal distribution table, we convert the z values to percentages: 8.32% for dynamic range, L: 5.96%, A: 14.8%, B: 5.17% for the LAB mean values, and 3.19% for RSC contrast. We then convert the percentages into scores out of 100: 91.68 for dynamic range, 91.36 for the LAB mean value, and 96.81 for RSC contrast. We therefore conclude that the sample image's total mean score is 94.99 based on the three objective image quality factors. With the proposed image quality assessment model, we can measure the preference value of natural scenes. We can also specify the preferred image quality representation ranges and measure the expected image quality preference.
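The z-value conversion can be checked with a short Python sketch. It assumes that the percentage is the area of the standard normal distribution between the mean and z (the value a standard normal table lists), and that each factor score is 100 minus that percentage, with the three LAB percentages averaged first. This reading reproduces the per-factor figures reported above; the function name `central_area_percent` is illustrative. The abstract does not state how the three factor scores are combined into the total of 94.99, so that step is not reproduced here.

```python
import math

def central_area_percent(z):
    """Percentage of the standard normal distribution lying between 0 and z.

    Phi(z) - 0.5 equals 0.5 * erf(|z| / sqrt(2)); rounded to two decimals
    to mimic a lookup in a printed table.
    """
    return round(50.0 * math.erf(abs(z) / math.sqrt(2.0)), 2)

# z values reported for the sample image.
dynamic_range_pct = central_area_percent(0.21)
lab_pcts = [central_area_percent(z) for z in (0.15, 0.38, 0.13)]
contrast_pct = central_area_percent(0.08)

# Scores out of 100: the smaller the deviation from the mean, the higher the score.
dynamic_range_score = round(100.0 - dynamic_range_pct, 2)
lab_score = round(100.0 - sum(lab_pcts) / len(lab_pcts), 2)
contrast_score = round(100.0 - contrast_pct, 2)
```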

A Study on Methods for the Visualization of Stage Space through Stage Lighting (무대조명을 통한 무용 예술의 무대공간 시각화 방안 연구)

Lee, Jang-Weon;Yi, Chin-Woo
- Journal of the Korean Institute of Illuminating and Electrical Installation Engineers / v.23 no.4 / pp.16-28 / 2009
Stage art basically builds upon the essence of "seeing," and at the same time possesses relativity between showing and seeing. Stage lighting uses artificial light to address the essence of "seeing," which is the foundation of stage art, and in the modern age its role has grown into an important medium of visual expression in stage art, owing to lighting tools that developed rapidly after the discovery of electricity and to the development of optics. Lighting thus not only uses the medium of light in a field of stage art that gives mental and emotional inspiration to the audience, but also aesthetically expresses time and space. In other words, stage lighting is a complex function of light engineering (technology and science) and aesthetic sense (feeling and art). This study investigates methods for visualizing stage space through lighting, focusing mainly on dance. I have studied the basics of stage lighting, its relations with other fields of stage art, and the functions and characteristics of lighting. The results show that lighting can be used to maximize the visualization of dance, and they emphasize the artistic growth of lighting and its capacity for aesthetic expression. I came to the following conclusions. First, lighting uses the forms and directions of light that various tools can produce to visualize the space on stage, and can express to the fullest the image that the work seeks. Second, through the movement of light, lighting can serve as a visual representation of the configuration of space in dance works. Third, through the visual and spatial expression created by light, the work's dramatic catharsis can draw mental and emotional feelings from the audience. Fourth, lighting can be seen not as a supporting role but as an original visual design.
To conclude, for lighting to be freed from the simple function of "lighting up the stage," which most people regard as common knowledge, and to grow as an area of art in its own right, lighting designers must understand the intentions of the choreographer and the work. With creativity and artistry, they must treat light and color as an aesthetic language in order to heighten the effects of the work and allow lighting to take part as one element of its creation, so that lighting will be treated as a form of art.
https://doi.org/10.5207/JIEIE.2009.23.4.016

The 'Fantastic' in the René Laloux's movie (<죽은 시간들(Les Temps Morts), 르네 랄루(René Laloux) 작, 1964>의 환상성)

Han, Sang-Jung;Park, Sang-Chun
- Cartoon and Animation Studies / s.27 / pp.31-49 / 2012
This research aims to show the specificity of the 'fantastic' in the film <Les Temps Morts> (1964), directed by René Laloux (1929-2004), a director recognized all over the world. The film has a particular style that combines four forms of expression: live-action recording (film), frame-by-frame recording (animation), drawing, and photography. It is the strangest of all his films. Even if we can grasp the key meaning of the film, it leaves the audience with an uncertain and unclear feeling. If we consider 'the fantastic' as a hesitation between the real and the unreal, this film offers spectators fantastic feelings at diverse levels. To present the way the film shows us the fantastic, we divide it into 15 sequences according to visual and auditory elements and analyze the specificities of the fantastic at diverse levels. First, the drawing style of Roland Topor does not let us escape easily from the feeling of fantasy. The four representation formats (drawing, photo, animation, film) are integrated into one whole by auditory elements (music, narration). On the other hand, certain incomprehensible parts are not integrated into the whole and are left as they are. Laloux leads the audience toward reality near the end of the film, but he leaves uncertain sequences at the last moment. Through these, the audience again hesitates between the real and the unreal, and the fantastic is strengthened as a result. Finally, the fantastic of the film can be found at three levels: first, the fantastic drawing style of Roland Topor; second, the fantastic revealed through the overall composition and structure of the work; and third, the fantastic sentiment provided by leaving parts of the story incomprehensible to the audience.
https://doi.org/10.7230/KOSCAS.2012.27.031

Study On the Geographic Locations of Gugoks and Dongcheons in Seoul, Gyeonggi-Do and Gangwon-Do (서울시·경기도·강원도지역 구곡·동천 위치연구)

Kang, Kee-Rae;Lee, Hae-Ju;Kim, Hee-Chae;Lee, Hyun-Chae;Kim, Dong-Phil
- Journal of the Korean Institute of Traditional Landscape Architecture / v.35 no.3 / pp.67-75 / 2017
The culture of Gugok (九曲) and Dongcheon (洞天), which seeks to reach the ideological culmination of Confucianism, was widespread throughout the Joseon dynasty. This was an extension of the spirit of studying and honoring Zhu Xi (學朱子, 尊朱子); through it, Confucian scholars in Joseon expressed the will to follow the teachings of Zhu Xi (朱子) and comforted themselves that they were on the way to attaining the truth. As a realization of this will, scholars designated and managed various scenic sites as Gugoks, following the example of Zhu Xi's Mui Gugok (武夷九曲), and as Dongcheons, representations of utopia. These designations are found nationwide, with around sixty Gugok locations reported in academia so far. However, the actual number of Gugoks exceeds this figure, and the exact locations of many of them have not been identified. Therefore, the purpose of this study is to identify the locations of Gugoks and Dongcheons scattered around the Seoul, Gyeonggi, and Gangwon regions. For the coordinates of Gugoks and Dongcheons, this study referred to the literature, web searches, and books published by local cultural institutes. Based on the collected information, the researchers conducted field trips to investigate whether each recorded site exists as a real location and, if so, acquired its coordinates. This study also provides tables of Gugoks and Dongcheons that exist only in the imagination, existed before but are now lost, or are inaccessible. Eight locations in the Seoul, Gyeonggi, and Gangwon regions are understood to be Gugoks. Among them, Gogun Gugok and Okgye Gugok have relatively clear locations and records. Byeokgye Gugok and Suhoe Gugok, on the other hand, have many overlapping locations and titles, and the time of their establishment and their managers are unclear. As for Ui Gugok in Seoul, it is known to have been set by Hong Yangho, but only some of its locations are confirmed, others are in dispute, and many are damaged.
Thirty-eight locations in the Seoul, Gyeonggi, and Gangwon regions are understood to be Dongcheons. There are sixteen Dongcheons in the Seoul area. Among them, several, including Dohwa Dongcheon, Yangsan Dongcheon, and Ssangnyu Dongcheon, actually exist but are closed to access. There are thirteen Dongcheons in the Gyeonggi area. The exact location of Onsu Dongcheon cannot be confirmed because of development; Gwirae Dongcheon has historical records, but its actual existence cannot be confirmed. There are nine Dongcheons in the Gangwon area. The researchers judged that Hwaeum Dongcheon is a misspelled record of Hwaeumdong Jeongsaji (華陰洞精舍址), which is located upstream of Gogun Gugok.
https://doi.org/10.14700/KITLA.2017.35.3.067

A Study on Practices and Improvement Factors of Financial Disclosures in early stages of IFRS Adoption - An Integrative Approach of Korean Cases: Embracing Views of Reporting Entities and Users of Financial Statements (IFRS 공시 실태 개선방안에 대한 소고 - 보고기업, 정보이용자 요인을 고려한 통합적 접근 -)

Kim, Hee-Suk
- Asia-Pacific Journal of Business Venturing and Entrepreneurship / v.7 no.2 / pp.113-127 / 2012
From the end of the first quarter of 2012, Korean firms subject to mandatory adoption started releasing financial reports conforming to K-IFRS (Korean adopted International Financial Reporting Standards). Major characteristics of IFRS, such as its 'principles-based' nature, consolidated reporting, 'fair value' measurement, and increased pressure for non-financial disclosures, have resulted in brief and varied disclosure practices in the main body of each statement and a vast amount of note description requirements. Meanwhile, a host of previous studies on IFRS disclosures have taken regulatory and/or 'complete information' perspectives, mainly focusing on suggesting further enforcement of strengthened requirements and providing guidelines for specific treatments. As an extension of prior findings and suggestions, this study takes an integrative approach embracing the views of both the reporting entities and the users of financial information. In spite of all the state-driven efforts for faithful representation and comparability of corporate financial reports, an examination of disclosure practices for fiscal years 2010 and 2011 revealed numerous cases of insufficiency and discordance with mandatory norms and market expectations. As causes of these shortcomings, this study identified several factors on the corporate side and on the side of information users: some inherent aspects of IFRS, industry- and company-specific context, expenditures related to internalizing the IFRS system, a reduced time frame for presentation, and a lack of the clarity and detail needed to meet the qualities of information (understandability, comparability, etc.) commonly requested by the user group.
To improve current disclosure practices, a dual approach is suggested. First, to encourage and facilitate implementation, the study discusses (1) further segmentation and differentiation of mandates among companies, (2) redefining the scope and depth of note descriptions, (3) diversification and coordination of reporting periods, and (4) providing support for equipping disclosure systems and granting incentives for best practices. Second, as hard measures, it recommends (5) regularizing the active involvement of corporate and user group delegations in the establishment and amendment process of K-IFRS and (6) enforcing detailed and standardized disclosure on reporting entities.

Optimal supervised LSA method using selective feature dimension reduction (선택적 자질 차원 축소를 이용한 최적의 지도적 LSA 방법)

Kim, Jung-Ho;Kim, Myung-Kyu;Cha, Myung-Hoon;In, Joo-Ho;Chae, Soo-Hoan
- Science of Emotion and Sensibility / v.13 no.1 / pp.47-60 / 2010
Most research on classification has used kNN (k-Nearest Neighbor) and SVM (Support Vector Machine), which are known as learning-based models, and the Bayesian classifier and NNA (Neural Network Algorithm), which are known as statistics-based methods. However, these approaches face limitations of space and time when classifying the very large number of web pages on today's internet. Moreover, most classification studies use a uni-gram feature representation, which does not represent the real meaning of words well. Korean web page classification has an additional problem because Korean words often have multiple meanings (polysemy). For these reasons, LSA (Latent Semantic Analysis) has been proposed for classification in this environment (large data sets and polysemous words). LSA uses SVD (Singular Value Decomposition), which decomposes the original term-document matrix into three matrices and reduces their dimension. Through SVD, it is possible to create a new low-dimensional semantic space for representing vectors, which can make classification efficient and reveal the latent meaning of words or documents (or web pages). Although LSA is good at classification, it has some drawbacks. When SVD reduces the dimensions of the matrix and creates a new semantic space, it considers which dimensions represent vectors well, not which dimensions discriminate vectors well. This is why LSA does not improve classification performance as much as expected. In this paper, we propose a new LSA that selects the optimal dimensions to both discriminate and represent vectors well, minimizing these drawbacks and improving performance. The proposed method shows better and more stable performance than other LSA methods in a low-dimensional space. In addition, we obtain further improvement in classification by creating and selecting features, removing stopwords, and statistically weighting specific values.
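For orientation, the standard LSA step that the abstract builds on (truncated SVD of a term-document matrix) can be sketched with NumPy as below. This shows plain LSA, which keeps the k dimensions with the largest singular values; it does not implement the supervised dimension selection the paper proposes. The function name `lsa_project` and the toy count matrix are assumptions for illustration.

```python
import numpy as np

def lsa_project(term_doc, k):
    """Project documents into a k-dimensional latent space via truncated SVD.

    term_doc is a (terms x documents) matrix, decomposed as U * diag(s) * Vt.
    Plain LSA keeps the k largest singular values, i.e. the dimensions that
    best represent the matrix, regardless of class labels.
    """
    U, s, Vt = np.linalg.svd(term_doc, full_matrices=False)
    U_k, s_k, Vt_k = U[:, :k], s[:k], Vt[:k, :]
    doc_vectors = (np.diag(s_k) @ Vt_k).T  # one row per document
    return U_k, s_k, doc_vectors

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical toy counts: rows are terms, columns are documents.
# Documents 0-1 share one vocabulary, documents 2-3 another.
A = np.array([
    [2, 1, 0, 0],
    [1, 2, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
], dtype=float)

U_k, s_k, docs = lsa_project(A, k=2)
same_topic = cosine(docs[0], docs[1])
cross_topic = cosine(docs[0], docs[2])
```

In this toy case the two retained dimensions separate the two groups of documents, but nothing in the decomposition guarantees that the largest singular values are the most discriminative ones, which is the drawback the abstract describes.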

A Scenery Word of Pine Tree Extracted in Choi Myoung Hee's Novel 『Honbul』 (최명희의 소설 『혼불』에서 추출한 소나무의 경관언어)

Rho, Jae-Hyun;Kim, Hwa-Ok;Park, Yool-Jin
- Journal of the Korean Institute of Traditional Landscape Architecture / v.32 no.4 / pp.61-72 / 2014
By analyzing and interpreting the words, contexts, and expressive language used to depict the pine tree in the novel "Honbul" written by Choi, Myung-Hee, the symbolism of the pine and the folk language used for scenery can be summarized as follows. First, the scenery words illustrating the pine tree in "Honbul" clearly appear through diverse means and expressions. Namely, the forms of reference to the pine tree and the expressive means of the words portraying the uses of the pine are varied and subdivided. Second, the scenery words found in the vocabulary and contexts of "Honbul" carry various symbolic meanings. They not only describe the inherent image and symbolism of the pine but also, as intrinsic scenery words, serve to make the image of "Honbul" concrete within its narrative structure. Third, the scenery words used to express aesthetics emerge as synesthetic expressions through the linear beauty and texture of the pine as well as through the five senses. Fourth, on the basis of the inherent symbolism and image of the pine, the background landscape described in "Honbul" is regarded as a symbolic backdrop. Within the narrative structure of the novel, the pine tree serves as a mediator between heaven and earth, god and man, and the sacred and the secular. Fifth, the scenery words used to depict the pine tree are symbols that represent the spirit and emotions of the characters in the novel. Moreover, they are a tool for pursuing the personification of nature, the deification of objects, and the cosmos of space. They are also used as a device that makes definite the ideational image applied to express the background landscape of the novel.
As mentioned above, the expressions, vocabulary, and textures relating to the pine tree in "Honbul" are expected to be a starting point for understanding the landscape images and landscape language of the pine not only in Namwon, the setting of the novel, but also throughout Korea.
https://doi.org/10.14700/KITLA.2014.32.4.061
