Search | Korea Science

KB-BERT: Training and Application of Korean Pre-trained Language Model in Financial Domain (KB-BERT: 금융 특화 한국어 사전학습 언어모델과 그 응용)

Kim, Donggyu;Lee, Dongwook;Park, Jangwon;Oh, Sungwoo;Kwon, Sungjun;Lee, Inyong;Choi, Dongwon
- Journal of Intelligence and Information Systems / v.28 no.2 / pp.191-206 / 2022
Using a pre-trained language model (PLM) has recently become the de facto approach to achieving state-of-the-art performance on various natural language tasks (called downstream tasks) such as sentiment analysis and question answering. However, like any other machine learning method, a PLM tends to depend on the data distribution seen during training and performs worse on unseen (out-of-distribution) domains. For this reason, there have been many efforts to develop domain-specific PLMs for fields such as the medical and legal industries. In this paper, we discuss the training of a finance-specific PLM for the Korean language and its applications. Our finance-specific PLM, KB-BERT, is trained on a carefully curated financial corpus that includes domain-specific documents such as financial reports. We provide extensive performance evaluation results on three natural language tasks: topic classification, sentiment analysis, and question answering. Compared with state-of-the-art Korean PLMs such as KoELECTRA and KLUE-RoBERTa, KB-BERT shows comparable performance on general datasets based on common corpora like Wikipedia and news articles. Moreover, KB-BERT outperforms the compared models on finance-domain datasets that require finance-specific knowledge to solve the given problems.
https://doi.org/10.13088/jiis.2022.28.2.191 (KSCI)

A Context-Aware Cooperative Query for u-Shopping Systems (u-쇼핑 시스템을 위한 상황인식적이고 협력적인 질의 시스템 개발)

Kwon, Ohbyung;Shin, Myung Keun
- Journal of Intelligence and Information Systems / v.12 no.4 / pp.61-72 / 2006
Ubiquitous computing technologies have become mature enough to be applied in practical ubiquitous services. In particular, in the u-shopping area, personalized recommender systems automatically collect context data related to nomadic users and then recommend products or shops in a flexible manner. However, legacy cooperative queries and context-aware queries have so far been unable to cope with dynamically changing situations and ambiguous query commands, respectively. Hence, the purpose of this paper is to propose a personalized context-aware cooperative query that supports a multi-level data abstraction hierarchy and a conceptual distance metric among node instances, while considering the user's context data. To show the feasibility of the proposed methodology, we have implemented a prototype system, CACO, for site search in a large-scale shopping mall.
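The idea of a conceptual distance over an abstraction hierarchy can be illustrated with a minimal sketch. The abstract does not give CACO's actual metric, so this example assumes a simple edge-count distance through the lowest common ancestor; the hierarchy, node names, and function names are all hypothetical.

```python
from typing import Dict, List, Optional

# Hypothetical abstraction hierarchy for a shopping mall: child -> parent.
PARENT: Dict[str, Optional[str]] = {
    "mall": None,
    "food": "mall",
    "fashion": "mall",
    "cafe": "food",
    "bakery": "food",
    "shoes": "fashion",
}


def ancestors(node: str) -> List[str]:
    """Return the path from `node` up to the root, inclusive."""
    path = [node]
    while PARENT[path[-1]] is not None:
        path.append(PARENT[path[-1]])
    return path


def distance(a: str, b: str) -> int:
    """Count edges between two nodes via their lowest common ancestor.

    This edge count is an assumed stand-in for the paper's metric.
    """
    up_a = ancestors(a)
    up_b = ancestors(b)
    for steps_a, node in enumerate(up_a):
        if node in up_b:
            return steps_a + up_b.index(node)
    raise ValueError("nodes are not in the same hierarchy")


def relax(target: str, candidates: List[str]) -> List[str]:
    """Order alternative answers by conceptual distance from the target."""
    return sorted(candidates, key=lambda c: distance(target, c))
```

Under these assumptions, a query for an unavailable "cafe" could be relaxed with `relax("cafe", ["shoes", "bakery"])`, which ranks the sibling category ahead of a category in a different branch.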

Handover Latency Improvement & Performance Analysis over Inter-LMA (Inter-LMA 이동시 Handover Latency 개선 방안 및 성능 분석)

Chang, Jae-Cheol;Park, Byung-Joo;Kim, Dae-Young
- Journal of the Institute of Electronics Engineers of Korea TC / v.46 no.8 / pp.34-42 / 2009
Mobile communication traffic is increasingly shifting from voice to data and internet services, e.g., wireless internet access and SMS/MMS. As a result, many data services are emerging over 3G, Mobile WiMAX (WiBro), LTE, etc. As the wireless internet market grows, MIPv6 is becoming more important, and many protocols derived from MIPv6 are being studied and developed, such as Fast MIPv6, Hierarchical MIPv6, and Proxy MIPv6. The significant factors in MIPv6 are handover latency and packet loss. PMIPv6 is efficient at reducing mobility-related messages and handover latency, but it considers only a single LMA domain. If a mobile node moves between LMAs, the handover delay affects real-time communications. To overcome this handover delay, we present the existing schemes and propose new enhanced ones, analyze their performance, and show the results.
KSCI

Context-based Incremental Preference Analysis Method in Ubiquitous Commerce (유비쿼터스 상거래 환경의 컨텍스트 기반 점진적 선호 분석 기법)

Ku Mi Sug;Hwang Jeong Hee;Choi Nam Kyu;Jung Doo Young;Ryu Keun Ho
- The KIPS Transactions:PartD / v.11D no.7 s.96 / pp.1417-1426 / 2004
As ubiquitous commerce emerges, personalization services are attracting growing interest, and recommendation methods that offer useful information to customers are becoming more important. However, most of them depend on a specific method and are restricted to e-commerce. To apply these recommendation methods to u-commerce, extended context modeling is needed first, together with a systematic connection of the methods so that they complement one another's strengths and weaknesses in each commercial transaction. Therefore, we propose a modeling technique for context information related to personal activity in commercial transactions and present an incremental preference analysis method using a preference tree that is closely connected to the recommendation method at each step. We also use an XML indexing technique to efficiently extract recommendation information from the preference tree.
https://doi.org/10.3745/KIPSTD.2004.11D.7.1417 (KSCI)

On a Progressive Legislative Approach to the Regulation of Unfair Trade Practices (불공정거래행위 규제에 대한 발전적 입법론에 대하여)

An, Byeong-Han
- Journal of Korea Fair Competition Federation / no.150 / pp.14-29 / 2010
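Although the Unfair Competition Prevention Act was enacted to establish a competitive order, in the sense of maintaining sound trade practices by preventing acts of unfair competition, in Korea its role has in practice gradually shifted, unlike at the time of enactment, toward that of an intellectual property protection statute covering the protection of trade secrets against industrial espionage and the protection of well-known trademarks and business marks. In particular, its function is growing ever stronger as it has come to protect a broader range of intellectual-property-related interests: not only regulating confusion as to the source of well-known trademarks, but also preventing the dilution of famous trademarks, and further reaching the preemptive registration of domain names, acts causing misunderstanding as to origin or quality, and product marks such as the well-known or famous designs and characters of others. By contrast, the regulation of acts of unfair competition themselves under the Unfair Competition Prevention Act has in practice not gone beyond disputes over protecting the distinctiveness or source-indicating function of the well-known or famous trademarks and product marks of others, so its role as 'competition law' has become relatively weak. Moreover, looking at the current provisions on unfair trade practices under the Fair Trade Act, including Article 23(1)(8), most acts of unfair competition under the Unfair Competition Prevention Act could, depending on interpretation, be subsumed within the scope of unfair trade practices under the Fair Trade Act. It is therefore necessary to consider the nature and role of the two statutes and the direction they should take, and such a discussion has value as a progressive legislative proposal. Of course, whether to follow the German legal system or the U.S. approach in regulating unfair trade practices (acts of unfair competition) is not a choice between logically incompatible alternatives. In Korea, however, when the "Monopoly Regulation and Fair Trade Act" was enacted in 1980, the overlap and conflict between the existing provisions on acts of unfair competition in the then Unfair Competition Prevention Act and unfair trade practices under the Fair Trade Act should have been examined; instead, the very existence of the Unfair Competition Prevention Act was in effect overlooked during the enactment of the Fair Trade Act, and that situation continues to this day. Although the overlap or conflict between the provisions of the two statutes has never been formally raised as an issue, from the standpoint of 'progressive legislation' the regulation of acts of unfair competition under the Unfair Competition Prevention Act should in future be subsumed as unfair trade practices within the Fair Trade Act framework. This would make it possible to expect an efficient and unified competition policy centered on the Fair Trade Commission, the specialized agency for competition policy that stands at the center of regulating unfair trade practices. In this process the Fair Trade Act must also change: beyond appropriately revising some of the provisions on unfair trade practices to accommodate the incorporated acts of unfair competition, legislative reform is needed so that the private right to seek injunctive or preventive relief recognized under the existing Unfair Competition Prevention Act is likewise carried over into the Fair Trade Act. Until now, whether to introduce a private right to injunctive relief under the Fair Trade Act has been examined as part of the discussion on revitalizing private remedies and private suits under that Act, entirely apart from the 'question of incorporating the Unfair Competition Prevention Act into the Fair Trade Act'. Going forward, this issue needs to be actively reexamined as part of a larger discussion that encompasses the incorporation of acts of unfair competition into the Fair Trade Act framework. In this way, the Unfair Competition Prevention Act, centered on the Korean Intellectual Property Office, would devote itself to regulating industrial espionage, protecting trade secrets, and protecting other intellectual property, while the Fair Trade Act, centered on the Fair Trade Commission, would take charge of more comprehensive and unified regulation of unfair trade practices (acts of unfair competition). I am confident that this 'selection and concentration' can be expected to raise each legal framework to a higher level.
Starting from this point of agreement, the more detailed next-stage issues should also be discussed together: the permissible scope of and requirements for private injunctive claims, institutional safeguards to prevent their abuse, and whether to allow collective actions, among others. In doing so, it would be appropriate to devise a regulatory framework suited to Korea's circumstances by reference to the U.S. Clayton Act and the legislation of nearby Japan. Through this, I believe we will ultimately see positive legislative change that brings us closer to the ideal of strengthening the competitive order through more active private enforcement of the Fair Trade Act.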

A study on the aspect-based sentiment analysis of multilingual customer reviews (다국어 사용자 후기에 대한 속성기반 감성분석 연구)

Sungyoung Ji;Siyoon Lee;Daewoo Choi;Kee-Hoon Kang
- The Korean Journal of Applied Statistics / v.36 no.6 / pp.515-528 / 2023
With the growth of the e-commerce market, consumers increasingly rely on user reviews to make purchasing decisions. Consequently, researchers are actively studying how to analyze these reviews effectively. Among the various methods of sentiment analysis, aspect-based sentiment analysis, which examines user reviews from multiple angles rather than relying solely on simple positive or negative sentiment, is gaining widespread attention. One methodology for aspect-based sentiment analysis uses transformer-based models, the latest natural language processing technology. In this paper, we conduct aspect-based sentiment analysis on multilingual user reviews, applying these models to two real datasets. Specifically, we use restaurant data from the SemEval 2016 public dataset and multilingual user review data from the cosmetic domain. We compare the performance of transformer-based models for aspect-based sentiment analysis and apply various methodologies to improve their performance. Models using multilingual data are expected to be highly useful in that they can analyze multiple languages in one model without building separate models for each language.
https://doi.org/10.5351/KJAS.2023.36.6.515

Sentiment Analysis of Korean Reviews Using CNN: Focusing on Morpheme Embedding (CNN을 적용한 한국어 상품평 감성분석: 형태소 임베딩을 중심으로)

Park, Hyun-jung;Song, Min-chae;Shin, Kyung-shik
- Journal of Intelligence and Information Systems / v.24 no.2 / pp.59-83 / 2018
With the increasing importance of sentiment analysis for grasping the needs of customers and the public, various types of deep learning models have been actively applied to English texts. In deep-learning-based sentiment analysis of English texts, natural language sentences in the training and test datasets are usually converted into sequences of word vectors before being fed into the models. Here, word vectors generally refer to vector representations of words obtained by splitting a sentence on space characters. There are several ways to derive word vectors, one of which is Word2Vec, used to produce the 300-dimensional Google word vectors from about 100 billion words of Google News data. These vectors have been widely used in sentiment analysis studies of reviews from various fields such as restaurants, movies, laptops, and cameras. Unlike in English, morphemes play an essential role in sentiment analysis and sentence structure analysis in Korean, a typical agglutinative language with well-developed postpositions and endings. A morpheme can be defined as the smallest meaningful unit of a language, and a word consists of one or more morphemes. For example, the word '예쁘고' consists of the morphemes '예쁘' (adjective) and '고' (connective ending). Given the significance of Korean morphemes, it seems reasonable to adopt the morpheme as the basic unit in Korean sentiment analysis. Therefore, in this study, we use 'morpheme vectors' as the input to a deep learning model rather than the 'word vectors' mainly used for English text. A morpheme vector is a vector representation of a morpheme and can be derived by applying an existing word vector derivation mechanism to sentences divided into their constituent morphemes. This raises several questions. What is the desirable range of POS (Part-Of-Speech) tags when deriving morpheme vectors to improve the classification accuracy of a deep learning model?
Is it proper to apply a typical word vector model, which primarily relies on the form of words, to Korean, which has a high homonym ratio? Will text preprocessing such as correcting spelling or spacing errors affect the classification accuracy, especially when deriving morpheme vectors from Korean product reviews with many grammatical mistakes and variations? We seek empirical answers to these fundamental issues, which are likely to be encountered first when applying deep learning models to Korean texts. As a starting point, we summarize these issues as three central research questions. First, which is more effective as the initial input of a deep learning model: morpheme vectors from grammatically correct texts of a domain other than the analysis target, or morpheme vectors from considerably ungrammatical texts of the same domain? Second, what is an appropriate morpheme vector derivation method for Korean with regard to the range of POS tags, homonyms, text preprocessing, and minimum frequency? Third, can we achieve a satisfactory level of classification accuracy when applying deep learning to Korean sentiment analysis? To address these research questions, we generate various types of morpheme vectors reflecting them and then compare the classification accuracy of a non-static CNN (Convolutional Neural Network) model that takes the morpheme vectors as input. As training and test datasets, 17,260 cosmetics product reviews from Naver Shopping are used. To derive morpheme vectors, we use data from the same domain as the target and data from another domain: about 2 million cosmetics product reviews from Naver Shopping and 520,000 items of Naver News data, arguably corresponding to Google's News data. The six primary sets of morpheme vectors constructed in this study differ in terms of the following three criteria.
First, they come from two types of data source: Naver News, with high grammatical correctness, and Naver Shopping's cosmetics product reviews, with low grammatical correctness. Second, they differ in the degree of data preprocessing, namely, sentence splitting only, or sentence splitting followed by additional spelling and spacing corrections. Third, they vary in the form of input fed into the word vector model: whether the morphemes are entered by themselves or with their POS tags attached. The morpheme vectors further vary depending on the range of POS tags considered, the minimum frequency of morphemes included, and the random initialization range. All morpheme vectors are derived through the CBOW (Continuous Bag-Of-Words) model with a context window of 5 and a vector dimension of 300. The results suggest that classification accuracy improves when using same-domain text even with a lower degree of grammatical correctness, performing spelling and spacing corrections in addition to sentence splitting, and incorporating morphemes of all POS tags, including the incomprehensible category. POS tag attachment, which was devised to handle the high proportion of homonyms in Korean, and the minimum frequency threshold for including a morpheme do not appear to have any definite influence on classification accuracy.
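The input construction described in this abstract can be sketched roughly as follows: morphemes are optionally joined with their POS tags so that homonyms become distinct tokens, and CBOW training examples pair each target token with up to 5 tokens of context on each side. The tagged sentence, the tag labels, and the function names are hypothetical. The abstract does not show the authors' analyzer or training code, and this sketch only builds (context, target) pairs; it does not train vectors.

```python
from typing import List, Tuple

# Hypothetical output of a morphological analyzer: (morpheme, POS tag) pairs.
Tagged = List[Tuple[str, str]]


def to_tokens(sentence: Tagged, attach_pos: bool) -> List[str]:
    """Return morphemes as tokens, optionally with the POS tag attached."""
    if attach_pos:
        return [f"{morph}/{tag}" for morph, tag in sentence]
    return [morph for morph, _ in sentence]


def cbow_pairs(tokens: List[str], window: int = 5) -> List[Tuple[List[str], str]]:
    """Build (context, target) pairs with up to `window` tokens on each side."""
    pairs = []
    for i, target in enumerate(tokens):
        left = tokens[max(0, i - window):i]
        right = tokens[i + 1:i + 1 + window]
        context = left + right
        if context:  # skip one-token sentences, which have no context
            pairs.append((context, target))
    return pairs


# Illustrative tags only (VA = adjective stem, EC = connective ending,
# EF = final ending); a real analyzer's tag set may differ.
sentence: Tagged = [("예쁘", "VA"), ("고", "EC"), ("좋", "VA"), ("아요", "EF")]
plain = to_tokens(sentence, attach_pos=False)
tagged = to_tokens(sentence, attach_pos=True)
pairs = cbow_pairs(tagged, window=5)
```

With POS tags attached, two morphemes that share a surface form but carry different tags become different vocabulary entries, which is the motivation the abstract gives for that variant.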
https://doi.org/10.13088/jiis.2018.24.2.059 (KSCI)

Search Result 47, Processing Time 0.023 seconds