• Title/Summary/Keyword: lexical analysis (어휘 분석)

Nonlinear Vector Alignment Methodology for Mapping Domain-Specific Terminology into General Space (전문어의 범용 공간 매핑을 위한 비선형 벡터 정렬 방법론)

  • Kim, Junwoo;Yoon, Byungho;Kim, Namgyu
    • Journal of Intelligence and Information Systems / v.28 no.2 / pp.127-146 / 2022
  • As word embedding has shown excellent performance across deep learning-based natural language processing tasks, research on advancing and applying word, sentence, and document embeddings is being actively conducted. Among these topics, cross-language transfer, which enables semantic exchange between different languages, is growing alongside the development of embedding models. Academic interest in vector alignment is growing with the expectation that it can be applied to various embedding-based analyses. In particular, vector alignment is expected to be applied to mapping between specialized and general domains. In other words, it should become possible to map the vocabulary of specialized fields such as R&D, medicine, and law into the space of a pre-trained language model learned from a huge volume of general-purpose documents, or to provide a clue for mapping vocabulary between different specialized fields. However, linear vector alignment, which has been the main focus of academic study, assumes statistical linearity and therefore tends to simplify the vector space. It essentially assumes that different vector spaces are geometrically similar, which causes inevitable distortion in the alignment process. To overcome this limitation, we propose a deep learning-based vector alignment methodology that effectively learns the nonlinearity of the data. The proposed methodology consists of sequential learning of a skip-connected autoencoder and a regression model to align the specialized word embedding expressed in each space to the general embedding space. Finally, through inference with the two trained models, the specialized vocabulary can be aligned in the general space.
To verify the performance of the proposed methodology, an experiment was performed on a total of 77,578 documents in the 'health care' field among national R&D tasks performed from 2011 to 2020. The results confirmed that the proposed methodology outperformed the existing linear vector alignment in terms of cosine similarity.
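As a point of reference for the linear baseline mentioned in this abstract, the following is a minimal sketch of linear vector alignment scored by mean cosine similarity. It is not the authors' method (which uses a skip-connected autoencoder and a regression model). The toy vectors, the function names, and the choice of plain gradient descent on a least-squares objective are all assumptions made for illustration.

```python
import math

def matvec(x, W):
    """Multiply row vector x (length d) by matrix W (d x k)."""
    return [sum(x[i] * W[i][j] for i in range(len(x))) for j in range(len(W[0]))]

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def fit_linear_map(X, Y, lr=0.1, steps=500):
    """Fit W minimizing the mean squared error of X @ W against Y by gradient descent."""
    n, d, k = len(X), len(X[0]), len(Y[0])
    W = [[0.0] * k for _ in range(d)]
    for _ in range(steps):
        grad = [[0.0] * k for _ in range(d)]
        for x, y in zip(X, Y):
            pred = matvec(x, W)
            for i in range(d):
                for j in range(k):
                    grad[i][j] += 2.0 * x[i] * (pred[j] - y[j]) / n
        for i in range(d):
            for j in range(k):
                W[i][j] -= lr * grad[i][j]
    return W

# Hypothetical paired embeddings: X plays the role of domain-specific vectors,
# Y the same words in the general space (here, X rotated by 90 degrees).
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
Y = [[0.0, 1.0], [-1.0, 0.0], [-1.0, 1.0]]

W = fit_linear_map(X, Y)
mean_cos = sum(cosine(matvec(x, W), y) for x, y in zip(X, Y)) / len(X)
print(round(mean_cos, 4))
```

Because the toy target space is an exact linear transform of the source space, the linear map recovers it almost perfectly. The abstract's argument is that real domain and general embedding spaces are not related this simply, which is where a nonlinear aligner would be expected to help.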

Exploring user experience factors through generational online review analysis of AI speakers (인공지능 스피커의 세대별 온라인 리뷰 분석을 통한 사용자 경험 요인 탐색)

  • Park, Jeongeun;Yang, Dong-Uk;Kim, Ha-Young
    • Journal of the Korea Convergence Society / v.12 no.7 / pp.193-205 / 2021
  • The AI speaker market is growing steadily. However, the satisfaction of actual users is only 42%. Therefore, in this paper, we collected reviews of the Amazon Echo Dot 3rd and 4th generation models to analyze what hinders the user experience through the topic changes and emotional changes across generations of AI speakers. Using topic modeling, we identified the topics that make up the reviews for each generation and how they changed, and examined how user sentiment on those topics changed by generation through deep learning-based sentiment analysis. Topic modeling yielded five topics for each generation. For the 3rd generation, the topic representing general features of the speaker acted as a positive factor for the product, while user convenience features acted as a negative factor. Conversely, for the 4th generation, general features were derived as negative and convenience features as positive. Methodologically, this analysis is significant in that it can present results that take into account not only lexical features but also the contextual features of the entire sentence.

A study on Korean tourism trends using social big data -Focusing on sentiment analysis- (소셜 빅데이터를 활용한 한국관광 트렌드에 관한연구 -감성분석을 중심으로-)

  • Youn-hee Choi;Kyoung-mi Yoo
    • The Journal of the Convergence on Culture Technology / v.10 no.3 / pp.97-109 / 2024
  • In the field of domestic tourism, trend analysis of tourism consumers, both international and domestic tourists, is essential not only for the Korean tourism market but also for local and governmental tourism policy makers. We explore keywords and sentiment on social media in order to establish a marketing strategy and revitalize the domestic tourism industry through communication and information from tourism consumers. This study utilized TEXTOM 6.0 to analyze recent trends in Korean tourism. Data was collected from September 31, 2022, to August 31, 2023, using 'Korean tourism' and 'domestic tourism' as keywords, targeting blogs, cafes, and news provided by Naver, Daum, and Google. Through text mining, 100 keywords and their TF-IDF values were extracted in order of frequency, and then CONCOR analysis and sentiment analysis were conducted. For Korean tourism keywords, words related to tourist destinations, travel companions and behaviors, tourism motivations and experiences, accommodation types, tourist information, and emotional connections ranked high. The CONCOR analysis produced five clusters: tourist destinations, tourist information, tourist activities/experiences, tourism motivation/content, and inbound tourism. Finally, the sentiment analysis showed a high proportion of positive documents and vocabulary. This study analyzes the rapidly changing trends of Korean tourism through text mining and is expected to provide meaningful data for promoting domestic tourism not only for Koreans but also for foreigners visiting Korea.
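The TF-IDF weighting mentioned in this abstract can be illustrated with a short sketch. The exact weighting scheme used by TEXTOM is not stated in the abstract, so this example assumes the common textbook form (term frequency times the log of inverse document frequency). The tokenized toy documents and the function name `tf_idf` are hypothetical.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {term: score} dict per document.

    tf  = term count / document length
    idf = log(number of documents / number of documents containing the term)
    """
    n_docs = len(docs)
    doc_freq = Counter()
    for doc in docs:
        doc_freq.update(set(doc))  # count each term once per document
    scores = []
    for doc in docs:
        counts = Counter(doc)
        scores.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return scores

# Hypothetical, already-tokenized posts about domestic travel.
docs = [
    ["jeju", "travel", "hotel"],
    ["busan", "travel", "beach"],
    ["seoul", "travel", "palace"],
]

scores = tf_idf(docs)
for doc_scores in scores:
    print(sorted(doc_scores.items(), key=lambda kv: -kv[1]))
```

A term such as "travel" that occurs in every document receives a score of zero under this formula, while terms specific to one document rank highest, which is why TF-IDF is often reported alongside raw frequency.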

Automatic Generation of Voice Web Pages Based on SALT (SALT 기반 음성 웹 페이지의 자동 생성)

  • Ko, You-Jung;Kim, Yoon-Joong
    • Journal of KIISE:Software and Applications / v.37 no.3 / pp.177-184 / 2010
  • With the introduction of voice browsers, voice dialog applications have become available in the Web environment. A voice dialog application consists of voice Web pages, which require the dialog scripts to be translated into SALT (Speech Application Language Tags). Current Web pages have been designed for visual use. They are, however, potentially capable of supporting voice dialog. This paper therefore proposes an automated voice Web generation method that finds the elements for voice dialog in HTML-based Web pages and converts them into SALT. The automatic generation system for voice Web pages consists of a lexical analyzer and a syntactic analyzer that convert a Web page described in HTML into a voice Web page described in HTML+SALT. The converted voice Web page is designed to handle not only the existing mouse and keyboard input but also voice dialog.

Korean Semantic Role Labeling Using Case Frame Dictionary and Subcategorization (격틀 사전과 하위 범주 정보를 이용한 한국어 의미역 결정)

  • Kim, Wan-Su;Ock, Cheol-Young
    • Journal of KIISE / v.43 no.12 / pp.1376-1384 / 2016
  • To process sentences as human beings do, computers require the capability to analyze and process all possibilities of human expression. Linguistic information processing thus forms the initial basis. When analyzing a sentence syntactically, it is necessary to divide the sentence into components, find obligatory arguments focusing on predicates, identify the sentence core, and understand the semantic relations between the arguments and predicates. In this study, we applied a case frame dictionary based on The Korean Standard Dictionary of The National Institute of the Korean Language; in addition, for semantic role labeling we used a CRF model with the subcategorization of predicates from the Korean Lexical Semantic Network (UWordMap) as features. Semantic roles were tagged automatically by the CRF model, which used information on words, predicates, the case frame dictionary, and hypernyms of words as features. This method demonstrated higher performance than the existing method, with an accuracy rate of 83.13% compared to 81.2%.

A Study on the Session Description Protocol Stack for VoIP (VoIP를 위한 Session Description Protocol 스택에 관한 연구)

  • Jung, Sung-Ok;Ko, Kwang-Man
    • Journal of the Institute of Electronics Engineers of Korea TC / v.38 no.3 / pp.19-27 / 2001
  • Accordingly, it is very important not only to develop the protocol stack but also to pursue international standardization of the standard protocol for VoIP. Compared to advanced countries that have already had some success in commercialization, Korea is much less involved in this technology and these efforts. In this regard, this paper focuses on developing a protocol stack comprising an encoder/decoder, a header file generator, a syntax analyzer, and so on, based on the protocol grammar of the Session Description Protocol supported by IETF RFC2327. To this end, we first describe the SDP BNF grammar based on the Augmented BNF of IETF RFC2327. We then produce the abstract syntax tree and a header file generator for encoding/decoding by applying a syntax-directed method to the SDP protocol grammar.
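To make the lexical step concrete: RFC 2327 describes a session as lines of the form `<type>=<value>`, where `<type>` is a single character, and each `m=` line opens a media section. The sketch below is not the stack described in the paper; it only splits a description into session-level and media-level fields. The function name `parse_sdp` and the sample addresses are assumptions for illustration.

```python
def parse_sdp(text):
    """Split an SDP description into session-level and media-level (type, value) pairs.

    Returns (session, media), where session is a list of pairs and media is a
    list of such lists, one per "m=" section.
    """
    session, media = [], []
    current = session
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        # Every SDP line is a one-character type, "=", then the value.
        if len(line) < 2 or line[1] != "=":
            raise ValueError("not an SDP line: %r" % line)
        kind, value = line[0], line[2:]
        if kind == "m":
            # A media line starts a new media section.
            current = []
            media.append(current)
        current.append((kind, value))
    return session, media

# Hypothetical session description using documentation addresses.
SAMPLE = """\
v=0
o=alice 2890844526 2890842807 IN IP4 192.0.2.1
s=Demo
c=IN IP4 192.0.2.1
t=0 0
m=audio 49170 RTP/AVP 0
a=rtpmap:0 PCMU/8000
"""

session, media = parse_sdp(SAMPLE)
print(session[0], len(media), media[0][0])
```

A fuller implementation along the lines the abstract describes would go on to parse each value against the grammar for its field type and build a syntax tree; this sketch stops at tokenizing the lines.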

Searching Thesaurus Construction with Word Association Test: A Pilot Study (단어연상검사법을 이용한 탐색 시소러스 구축에 관한 실험적 연구)

  • Han Seung-Hee
    • Journal of the Korean Society for Library and Information Science / v.40 no.3 / pp.289-304 / 2006
  • The purpose of this pilot study is to construct a searching thesaurus using a word association test in the library and information science field and to confirm its functionality as a searching aid through query expansion experiments. The test results were analyzed into four types of relationship between stimulus words and response words, and the terms of the association thesaurus were compared with the descriptors of an existing thesaurus. The test results show that the word association test is a fruitful method for identifying many related terms and, to some degree, narrower and equivalent terms for the stimulus terms. Furthermore, in the query expansion experiment, the performance of the association thesaurus was better than that of the existing thesaurus. This result demonstrates that a word association thesaurus can be applied to query expansion.

The Analysis of Sound Attributes on Sensibility Dimensions (소리의 청각적 속성에 따른 감성차원 분석)

  • Han Kwang-Hee;Lee Ju-Hwan
    • Science of Emotion and Sensibility / v.9 no.1 / pp.9-17 / 2006
  • As is commonly said, music is the 'language of emotions.' This is because sound is a rich modality for communicating human sensibility information. However, most research on auditory displays has focused on improving user performance measures such as performance time and accuracy. Recently, many researchers in auditory displays have acknowledged that individual preference and sensible satisfaction may be more important factors than performance data. On this ground, in the present study we constructed sound sensibility dimensions ('Pleasure', 'Complexity', and 'Activity'), systematically examined the attributes of sound on these dimensions, and analyzed their meanings. As a result, the sound sensibility dimensions depended on each sound attribute, and some sound attributes interacted with one another. Consequently, the results of the present study suggest useful possibilities for applying affective influence in the field of auditory displays, where sensibility information needs to be applied according to sound attributes.

Expanding the Scope of Identifying and Linking of Personal Information in Linked Data: Focusing on the Linked Data of National Library of Korea (링크드 데이터에서 인물 정보의 식별 및 연계 범위 확장에 관한 연구: 국립중앙도서관 링크드 데이터를 중심으로)

  • Lee, Sungsook;Park, Ziyoung;Lee, Hyewon
    • Journal of the Korean Society for information Management / v.34 no.3 / pp.7-21 / 2017
  • This study analyzed the methods for representing and linking personal information in the linked data of the National Library of Korea and provided suggestions for expanding the scope of identifying and linking personal information. The analysis showed that personal information as a subject has been dealt with as a concept, whereas personal information as a contributor has been linked with a vocabulary of personal names. In addition, the process of building the linked data did not ensure the inclusion of additional information beyond the existing authority data. Therefore, this study suggested that linking personal information as a subject with personal information as a contributor is essential for the quality of linked data. In addition, we proposed providing additional information related to the person in linked data in order to expand the scope of access points in information discovery.

A Method based on Ontology for detecting errors in the Software Design (온톨로지 기반의 소프트웨어 설계에러검출방법)

  • Seo, Jin-Won;Kim, Young-Tae;Kong, Heon-Tag;Lim, Jae-Hyun;Kim, Chi-Su
    • Journal of the Korea Academia-Industrial cooperation Society / v.10 no.10 / pp.2676-2683 / 2009
  • The objective of this thesis is to improve the quality of a software product by enhancing software design quality through a better error detection method. This thesis is based on a software design method called MOA (Methodology for Object to Agents), which uses an ontology-based ODES (A Method based on Ontology for Detecting Errors in the Software Design) model as a common information model. In this thesis, a new form of error detection method was defined. The method is applied during the transformation from a UML model to an ODES model, using an ODES model, an Inter-View Inconsistency Detection technique, and a combination of the ontological properties of a consistency framework and related rules. The transformation to the ODES model includes lexical analysis and semantic analysis of a software design, using multiple mapping tables in the algorithm that generates ODES model instances.