• Title/Summary/Keyword: Language Models

Deep recurrent neural networks with word embeddings for Urdu named entity recognition

  • Khan, Wahab; Daud, Ali; Alotaibi, Fahd; Aljohani, Naif; Arafat, Sachi
    • ETRI Journal / v.42 no.1 / pp.90-100 / 2020
  • Named entity recognition (NER) remains an important task in natural language processing because it appears as a subtask or subproblem in information extraction and machine translation; for Urdu, it is a particularly difficult task. This paper proposes several deep recurrent neural network (DRNN) learning models with word embeddings. Experimental results demonstrate that they improve upon current state-of-the-art NER approaches for Urdu. The DRNN models evaluated include forward and bidirectional extensions of long short-term memory trained with backpropagation through time. The proposed models consider both language-dependent features, such as part-of-speech tags, and language-independent features, such as the "context windows" of words. The effectiveness of the DRNN models with word embeddings for NER in Urdu is demonstrated on three datasets. The results reveal that the proposed approach significantly outperforms previous conditional random field and artificial neural network approaches. The best F-measure values achieved on the three benchmark datasets using the proposed deep learning approaches are 81.1%, 79.94%, and 63.21%, respectively.
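
As an illustration of the bidirectional DRNN architecture this abstract describes, below is a minimal PyTorch sketch of a BiLSTM tagger over word embeddings. All dimensions, the vocabulary size, and the tag set are illustrative assumptions, not the authors' configuration.

```python
# A minimal BiLSTM sequence tagger in the spirit of the paper's bidirectional
# DRNN models. Sizes and the tag set are illustrative assumptions.
import torch
import torch.nn as nn

class BiLSTMTagger(nn.Module):
    def __init__(self, vocab_size, embed_dim, hidden_dim, num_tags):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        # bidirectional=True gives the forward + backward extension of LSTM
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True,
                            bidirectional=True)
        self.classifier = nn.Linear(2 * hidden_dim, num_tags)

    def forward(self, token_ids):
        embedded = self.embedding(token_ids)   # (batch, seq, embed_dim)
        outputs, _ = self.lstm(embedded)       # (batch, seq, 2*hidden_dim)
        return self.classifier(outputs)        # per-token tag scores

model = BiLSTMTagger(vocab_size=20000, embed_dim=100, hidden_dim=128,
                     num_tags=7)  # e.g., IOB tags for PERSON/LOCATION/ORG + O
scores = model(torch.randint(1, 20000, (2, 12)))  # two 12-token sentences
```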

Comparison of Sentiment Classification Performance of RNN and Transformer-Based Models on Korean Reviews (RNN과 트랜스포머 기반 모델들의 한국어 리뷰 감성분류 비교)

  • Jae-Hong Lee
    • The Journal of the Korea institute of electronic communication sciences / v.18 no.4 / pp.693-700 / 2023
  • Sentiment analysis, a branch of natural language processing that identifies and classifies subjective opinions and emotions in text documents as positive or negative, can support various promotions and services through customer-preference analysis. To this end, recent research has employed various machine learning and deep learning techniques. In this study, we identify an optimal language model by comparing the sentiment-classification accuracy of existing RNN-based models and recent Transformer-based language models on movie, product, and game reviews. In our experiments, LMKorBERT and GPT3 showed relatively good accuracy among the models pre-trained on a Korean corpus.
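
For readers unfamiliar with the Transformer-based setup being compared, here is a hedged sketch of scoring a Korean review with a pretrained encoder and a classification head; the checkpoint is an illustrative stand-in, not necessarily one of the models evaluated in the paper.

```python
# Hedged sketch: binary sentiment classification of a Korean review with a
# pretrained Transformer. The checkpoint name is an illustrative choice.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

checkpoint = "klue/bert-base"  # assumed stand-in Korean BERT checkpoint
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(
    checkpoint, num_labels=2)  # 0 = negative, 1 = positive

inputs = tokenizer("배송도 빠르고 품질이 좋아요", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
# Note: the classification head is untrained here; fine-tuning on labeled
# reviews is required before this prediction is meaningful.
print("positive" if logits.argmax(-1).item() == 1 else "negative")
```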

The Interaction Effect of Foreign Model Attractiveness and Foreign Language Usage (외국인 모델의 매력도와 외국어 사용의 상호작용 효과)

  • Lee, Ji-Hyun; Lee, Dong-Il
    • Journal of Global Scholars of Marketing Science / v.17 no.3 / pp.61-81 / 2007
  • Recently, the use of foreign models and foreign language in advertising has become a general trend in Korea, even though its effect is not well understood. Most previous research suggests rather the opposite, claiming that marketing communication is more effective when there is high congruity between the communication and the consumer's cultural values. However, the spread of global culture through new media such as the Internet and cable television means that congruity is no longer necessarily the best marketing strategy. In addition, highly attractive models are commonly used in advertising to increase its effect, yet recent studies show that targeted female audiences tend to compare themselves with highly attractive models and experience negative feelings. Bower (2001) demonstrated the difference between 'comparers' and 'noncomparers' when women face highly attractive models: a comparer, who intends to compare a highly attractive model (HAM) with herself, shows significantly more negative responses on model expertise, product argument, product evaluation, and buying intention. Therefore, a HAM is not always the best choice, and model attractiveness plays a role in how other cues are processed and in the resulting advertising effect. The purpose of this study is to investigate how the use of foreign language affects audience responses to advertising, with regard to model attractiveness. For the empirical study, virtual advertisements using foreign models (HAM, NAM) and brand names and slogans (Korean, English) were used as stimuli. The numbers of respondents were 75 ('HAM-Korean'), 75 ('NAM-Korean'), 66 ('HAM-English'), and 66 ('NAM-English'). To measure the effect of the marketing communication, attitude toward the medium (AM), attitude toward the product (AP), targetedness (TD), overall quality (OQ), and purchase intention (PI) were measured on 7-point Likert scales. The manipulation check confirmed the difference between the HAM attractiveness rating (m=3.27) and the NAM attractiveness rating (m=5.12); the mean difference was statistically significant (p<.05). As a result, all dependent measures varied significantly with model attractiveness, and overall quality (OQ) varied significantly with language. The interaction effect of model attractiveness and language was significant for attitude toward the product (AP) and purchase intention (PI). To interpret the differences, the means and standard deviations of the measures were compared. All measures were more positive when model attractiveness was high. For the language effect, OQ was rated more positively when English was used. Considering model attractiveness and language together, HAM-Korean was more positive for AP and PI, and NAM-English was more positive for AP and PI; in other words, the interaction effect of model attractiveness and language was confirmed. The use of foreign models and foreign language in advertising has been explained by the cultural match-up hypothesis (Leclerc et al. 1994), a culture-of-origin effect: using the language that matches the foreign model's culture can raise the OQ assessment. But this effect was moderated by model attractiveness. When model attractiveness was low, the use of English raised PI because of the foreign-language effect, which supports the cultural match-up hypothesis. When model attractiveness was high, the use of Korean raised AP and PI because the foreign-language effect was diluted. It is generally held that visual cues are processed before linguistic cues (Holbrook and Moore, 1981; Sholl et al., 1995). Therefore, when consumers faced a HAM, much of their perceptual capacity was already consumed in processing the visual cues, so their native language, Korean, connected strongly and positively with the advertising concept. Conversely, when consumers faced a NAM, less perceptual capacity was consumed than with a HAM, so English carried a cultural halo effect that worked more positively. Therefore, when foreign models are employed in advertising, the language must be selected carefully according to the level of model attractiveness.

Zero-anaphora resolution in Korean based on deep language representation model: BERT

  • Kim, Youngtae; Ra, Dongyul; Lim, Soojong
    • ETRI Journal / v.43 no.2 / pp.299-312 / 2021
  • High performance in zero-anaphora resolution (ZAR) is necessary for fully understanding texts in Korean, Japanese, Chinese, and various other languages. Deep-learning-based models are being employed for building ZAR systems, owing to the success of deep learning in recent years. However, the objective of building a high-quality ZAR system is far from being achieved even with these models. To enhance current ZAR techniques, we fine-tuned a pretrained bidirectional encoder representations from transformers (BERT) model. Notably, BERT is a general language representation model that enables systems to utilize deep bidirectional contextual information in natural language text; it extensively exploits the attention mechanism of the Transformer sequence-transduction model. In our model, classification is performed simultaneously for all words in the input word sequence to decide whether each word can be an antecedent. We seek end-to-end learning by disallowing any use of hand-crafted or dependency-parsing features. Experimental results show that, compared with other models, our approach significantly improves ZAR performance.
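
The abstract describes per-word antecedent classification over BERT; the following is a minimal sketch of that pattern using a token-classification head. The checkpoint and the two-label scheme are assumptions for illustration, not the authors' exact setup.

```python
# Hedged sketch of per-token antecedent classification over BERT: every token
# is scored as antecedent-candidate / not. Checkpoint and labels are assumed.
import torch
from transformers import AutoModelForTokenClassification, AutoTokenizer

checkpoint = "bert-base-multilingual-cased"  # assumed stand-in checkpoint
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForTokenClassification.from_pretrained(
    checkpoint, num_labels=2)  # 1 = antecedent candidate, 0 = not

inputs = tokenizer("철수는 밥을 먹었다. 그리고 학교에 갔다.",
                   return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits        # (1, seq_len, 2)
antecedent_mask = logits.argmax(-1)[0]     # simultaneous per-token decisions
# (untrained head: fine-tuning on ZAR-annotated data is required first)
```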

Iterative Feedback-based Personality Persona Generation for Diversifying Linguistic Patterns in Large Language Models (대규모 언어 모델의 언어 패턴 다양화를 위한 반복적 피드백 기반 성격 페르소나 생성법)

  • Taeho Hwang; Hoyun Song; Jisu Shin; Sukmin Cho; Jong C. Park
    • Annual Conference on Human and Language Technology / 2023.10a / pp.454-460 / 2023
  • With the development of large language models (LLMs), attention has focused on the biases that LLMs inherit from their large-scale training data. To help LLMs escape these tendencies and generate diverse linguistic patterns, recent studies have proposed assigning various personas to LLMs. Some have proposed assigning personas with diverse personality traits based on the five-factor theory of personality (Big 5), which describes human personality, and have shown that differences in personality between personas can elicit diverse patterns of language use. However, previous studies that attempted to generate personas with a target personality using only a limited number of inputs were limited in their ability to generate personas with finely differentiated personalities. In this study, we propose a methodology that generates personas with fine-grained personality differences by repeatedly providing feedback during the persona-assignment process. Through experiments and analysis, we confirm that personality personas formed with the proposed methodology can effectively produce diverse language patterns.
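
The abstract gives the loop only at a high level; the sketch below illustrates one plausible shape of the iterative feedback process, where a persona draft is repeatedly revised toward a target Big 5 profile. `query_llm` and `estimate_big5` are hypothetical stand-ins, not functions from the paper or any library.

```python
# Hedged sketch of iterative feedback: after each persona draft, a scorer
# compares the draft's Big 5 profile to the target, and the gap is fed back
# into the next revision prompt. All names here are illustrative.
TARGET = {"openness": 0.9, "conscientiousness": 0.3, "extraversion": 0.7,
          "agreeableness": 0.5, "neuroticism": 0.2}

def refine_persona(query_llm, estimate_big5, rounds=5, tol=0.1):
    persona = query_llm(f"Write a persona with Big 5 profile {TARGET}.")
    for _ in range(rounds):
        profile = estimate_big5(persona)          # measured Big 5 scores
        gaps = {k: TARGET[k] - profile[k] for k in TARGET}
        if all(abs(g) <= tol for g in gaps.values()):
            break                                 # close enough to the target
        feedback = ", ".join(f"{k}: adjust by {g:+.2f}"
                             for k, g in gaps.items() if abs(g) > tol)
        persona = query_llm(f"Revise this persona. Feedback: {feedback}\n"
                            f"Persona: {persona}")
    return persona
```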

Feature Configuration Validation using Semantic Web Technology (시맨틱 웹 기술을 이용한 특성 구성 검증)

  • Choi, Seung-Hoon
    • Journal of Internet Computing and Services / v.11 no.4 / pp.107-117 / 2010
  • Feature models, which represent the common and variable concepts among software products, and feature configurations, which are generated by selecting the features to be included in a target product, are essential components of the software product line methodology. Although research on the formal semantics and reasoning of feature models and feature configurations is in progress, research on feature model ontologies and feature configuration validation using semantic web technologies remains insufficient. This paper defines the formal semantics of feature models and proposes a feature configuration validation technique based on ontology and semantic web technologies. OWL (Web Ontology Language), a semantic web standard language, is used to represent the knowledge in the feature models and feature configurations. SWRL (Semantic Web Rule Language), a semantic web rule language, is used to define the rules for validating feature configurations. The approach provides formal semantics for feature models, automates the validation of feature configurations, and enables the application of various semantic web technologies, such as SQWRL.
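
The abstract names OWL and SWRL but shows no rules; as an informal illustration of the kinds of constraints such validation rules encode (mandatory, requires, excludes), here is a plain-Python analogue, not OWL/SWRL. The feature names and rules are invented for the example and are not from the paper.

```python
# Illustration only: a plain-Python analogue of the constraint types that
# SWRL-style validation rules encode. Feature names and rules are invented.
MANDATORY = {"core"}
REQUIRES = {"wifi_sync": {"network"}}      # feature -> features it needs
EXCLUDES = {("bluetooth", "low_power")}    # mutually exclusive pairs

def validate(config: set[str]) -> list[str]:
    errors = [f"missing mandatory feature: {f}"
              for f in MANDATORY - config]
    for feature, needed in REQUIRES.items():
        if feature in config and not needed <= config:
            errors.append(f"{feature} requires {needed - config}")
    errors += [f"{a} excludes {b}"
               for a, b in EXCLUDES if a in config and b in config]
    return errors

print(validate({"core", "wifi_sync", "bluetooth", "low_power"}))
# -> ["wifi_sync requires {'network'}", 'bluetooth excludes low_power']
```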

Analysis of Prompt Engineering Methodologies and Research Status to Improve Inference Capability of ChatGPT and Other Large Language Models (ChatGPT 및 거대언어모델의 추론 능력 향상을 위한 프롬프트 엔지니어링 방법론 및 연구 현황 분석)

  • Sangun Park; Juyoung Kang
    • Journal of Intelligence and Information Systems / v.29 no.4 / pp.287-308 / 2023
  • Since launching its service in November 2022, ChatGPT has rapidly gained users and is having a significant impact on all aspects of society, marking a major turning point in the history of artificial intelligence. In particular, the inference ability of large language models such as ChatGPT is improving at a rapid pace through prompt engineering techniques. This reasoning ability is an important factor for companies that want to adopt artificial intelligence into their workflows and for individuals looking to utilize it. In this paper, we begin with in-context learning, which enables inference in large language models, and then explain the concepts of prompt engineering, inference with in-context learning, and benchmark data. Moreover, we investigate the prompt engineering techniques that have rapidly improved the inference performance of large language models, and the relationships between these techniques.
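
Among the techniques such surveys typically cover, few-shot chain-of-thought prompting is a representative example. Below is a minimal sketch of how such a prompt is assembled; the exemplar and the `send_to_llm` placeholder are illustrative, not from the paper.

```python
# Hedged illustration of few-shot chain-of-thought prompting: exemplars that
# spell out intermediate reasoning steps are prepended to the question.
EXEMPLAR = (
    "Q: A shop sells pens at 500 won each. How much do 3 pens cost?\n"
    "A: Each pen costs 500 won. 3 pens cost 3 * 500 = 1500 won. "
    "The answer is 1500.\n"
)

def build_cot_prompt(question: str) -> str:
    return f"{EXEMPLAR}\nQ: {question}\nA: Let's think step by step."

def send_to_llm(prompt: str) -> str:  # placeholder; wire up a real client
    raise NotImplementedError

prompt = build_cot_prompt("If 4 apples cost 2000 won, how much do 7 cost?")
print(prompt)  # the exemplar's worked steps nudge the model to show reasoning
```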

Privacy-Preserving Language Model Fine-Tuning Using Offsite Tuning (프라이버시 보호를 위한 오프사이트 튜닝 기반 언어모델 미세 조정 방법론)

  • Jinmyung Jeong; Namgyu Kim
    • Journal of Intelligence and Information Systems / v.29 no.4 / pp.165-184 / 2023
  • Recently, deep learning analysis of unstructured text data using language models such as Google's BERT and OpenAI's GPT has shown remarkable results in various applications. Most language models learn generalized linguistic information from pre-training data and then update their weights for downstream tasks through a fine-tuning process. However, concerns have been raised that privacy may be violated in the process of using these language models: data privacy may be violated when the data owner provides large amounts of data to the model owner to fine-tune the language model, and conversely, when the model owner discloses the entire model to the data owner, the structure and weights of the model are exposed, which may violate the privacy of the model. The concept of offsite tuning has recently been proposed to fine-tune language models while protecting privacy in such situations, but that study is limited in that it does not provide a concrete way to apply the methodology to text classification models. In this study, we propose a concrete method for applying offsite tuning with an additional classifier to protect the privacy of both the model and the data when performing multi-class classification fine-tuning on Korean documents. To evaluate the proposed methodology, we conducted experiments on about 200,000 Korean documents from five major fields (ICT, electrical, electronic, mechanical, and medical) provided by AIHub and found that the proposed plug-in model outperforms both the zero-shot model and the offsite model in classification accuracy.
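
The abstract leaves the mechanics of offsite tuning implicit. As a toy illustration of the general idea, the sketch below fine-tunes only adapter layers around a frozen, lossy emulator of the network's middle; the layer sizes and emulator construction are assumptions, not the paper's plug-in design.

```python
# Hedged toy sketch of offsite tuning: the model owner ships trainable
# adapters plus a frozen emulator of the middle of the network; the data
# owner fine-tunes only the adapters and sends them back. All sizes are toys.
import torch
import torch.nn as nn

hidden = 64
adapter_in = nn.Linear(hidden, hidden)     # trainable, returned after tuning
adapter_out = nn.Linear(hidden, 5)         # classifier head, also trainable
emulator = nn.Sequential(                  # frozen stand-in for the middle
    nn.Linear(hidden, hidden), nn.ReLU(),  # (e.g., distilled/dropped layers)
    nn.Linear(hidden, hidden), nn.ReLU(),
)
for p in emulator.parameters():
    p.requires_grad = False                # model privacy: middle stays fixed

plug_in = nn.Sequential(adapter_in, emulator, adapter_out)
optimizer = torch.optim.Adam(
    [p for p in plug_in.parameters() if p.requires_grad], lr=1e-3)

x, y = torch.randn(8, hidden), torch.randint(0, 5, (8,))
loss = nn.functional.cross_entropy(plug_in(x), y)
loss.backward()                            # gradients reach only the adapters
optimizer.step()
```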

Large Language Models: A Guide for Radiologists

  • Sunkyu Kim; Choong-kun Lee; Seung-seob Kim
    • Korean Journal of Radiology / v.25 no.2 / pp.126-133 / 2024
  • Large language models (LLMs) have revolutionized the global technology landscape beyond natural language processing. Owing to their extensive pre-training on vast datasets, contemporary LLMs can handle tasks ranging from general functionalities to domain-specific areas, such as radiology, without additional fine-tuning. General-purpose chatbots based on LLMs can optimize the efficiency of radiologists in their professional work and research endeavors. Importantly, these LLMs are on a trajectory of rapid evolution, in which challenges such as "hallucination," high training cost, and efficiency issues are being addressed, along with the inclusion of multimodal inputs. In this review, we aim to offer conceptual knowledge and actionable guidance to radiologists interested in utilizing LLMs, through a succinct overview of the topic and a summary of radiology-specific aspects, from their origins to potential future directions.

Framework for evaluating code generation ability of large language models

  • Sangyeop Yeo; Yu-Seung Ma; Sang Cheol Kim; Hyungkook Jun; Taeho Kim
    • ETRI Journal / v.46 no.1 / pp.106-117 / 2024
  • Large language models (LLMs) have revolutionized various applications in natural language processing and have exhibited proficiency in generating programming code. We propose a framework for evaluating the code generation ability of LLMs and introduce a new metric, pass-ratio@n, which captures the granularity of accuracy according to the pass rate of test cases. The framework is intended to be fully automatic, handling the repetitive work involved in generating prompts, conducting inferences, and executing the generated code. A preliminary evaluation focusing on prompt detail, problem publication date, and difficulty level demonstrates the successful integration of our framework with the LeetCode coding platform and highlights the applicability of the pass-ratio@n metric.
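
The abstract does not give the formula for pass-ratio@n. A natural reading, assumed here purely for illustration rather than taken from the paper, is the mean fraction of test cases passed across the n generated solutions:

```python
# Hedged sketch of a pass-ratio@n computation under the assumption above:
# average the per-solution fraction of test cases passed over n solutions.
def pass_ratio_at_n(results: list[list[bool]]) -> float:
    """results[i][j] is True if generated solution i passes test case j."""
    per_solution = [sum(tests) / len(tests) for tests in results]
    return sum(per_solution) / len(per_solution)

# Three generated solutions against four test cases: all-or-nothing pass@n
# scoring would credit only the first, while pass-ratio@n also rewards the
# partially correct second solution.
runs = [[True, True, True, True],      # passes everything  -> 1.00
        [True, True, False, False],    # partially correct  -> 0.50
        [False, False, False, False]]  # fails everything   -> 0.00
print(pass_ratio_at_n(runs))  # 0.5
```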