• Title/Summary/Keyword: language training

Search Results: 685

Hesse's Multimedia Features and Inter-Media Crossing (헤세의 다매체적 특징과 상호매체 넘나들기)

  • Cho, Heeju; Chae, Yonsuk
    • Asia-pacific Journal of Multimedia Services Convergent with Art, Humanities, and Sociology / v.7 no.2 / pp.515-523 / 2017
  • In training settings where literature is used as a tool, excerpts of a text are used rather than the full text. Empirical guidelines are therefore needed on which part of a text should be used as a Memory-Hint, a passage that reminds the reader of a particular memory, and on how the text can be introduced effectively. For this study, Hesse's whole life and his literary characters were examined from a therapeutic perspective. First, the review of Hesse's life and the analysis of his characters showed that Hesse led a self-therapeutic life; he also lived a multimedia life in which he practiced writing, painting, playing musical instruments, meditation, walking, and more. Second, literature-therapy content based on Hesse's works was applied to schizophrenic patients. The media used in the clinical study were mostly extracted from Hesse's works. Beyond expressing emotional empathy with Hesse's texts, the patients began to show interest in others and to express empathy toward them. The way Hesse used multiple media throughout his lifetime offers valuable literary resources for improving the mental health of modern people and treating their pathological problems.

Graph-Based Word Sense Disambiguation Using Iterative Approach (반복적 기법을 사용한 그래프 기반 단어 모호성 해소)

  • Kang, Sangwoo
    • The Journal of Korean Institute of Next Generation Computing / v.13 no.2 / pp.102-110 / 2017
  • Current word sense disambiguation techniques employ various machine learning-based methods. Various approaches have been proposed to address this problem, including the knowledge-base approach, which determines the sense of an ambiguous word from knowledge-base information without a training corpus. Among unsupervised learning techniques that use a knowledge base, graph-based and similarity-based methods have been the main research directions. The graph-based method has the advantage of constructing a semantic graph that delineates all paths between the different senses an ambiguous word may have. However, unnecessary semantic paths may be introduced, increasing the risk of errors. To solve this problem and construct a fine-grained graph, this paper proposes a model that iteratively constructs the graph while eliminating unnecessary nodes and edges, i.e., senses and semantic paths. A hybrid similarity estimation model is then applied to estimate a more accurate sense within the constructed semantic graph. Because the proposed model uses BabelNet, a multilingual lexical knowledge base, it is not limited to a specific language.
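
A small, self-contained sketch of the iterative pruning idea is given below: build a weighted sense graph, repeatedly drop weak edges and the senses they leave isolated, then score the remaining candidate senses. The sense inventory, edge weights, and simple threshold rule are illustrative placeholders, not BabelNet data or the paper's hybrid similarity model.

```python
# Toy sketch of iterative graph pruning for word sense disambiguation.
# Nodes are candidate senses of context words; weighted edges are
# semantic-relatedness scores between senses (placeholder values).
import networkx as nx

G = nx.Graph()
G.add_weighted_edges_from([
    ("bank#finance", "money#currency", 0.9),
    ("bank#finance", "deposit#finance", 0.8),
    ("bank#river",   "water#liquid",   0.7),
    ("bank#river",   "money#currency", 0.1),   # weak, likely spurious path
    ("deposit#finance", "money#currency", 0.85),
])

def prune_iteratively(graph, threshold=0.3, max_iter=10):
    """Repeatedly drop weak edges and isolated senses until the graph is stable."""
    g = graph.copy()
    for _ in range(max_iter):
        weak = [(u, v) for u, v, w in g.edges(data="weight") if w < threshold]
        g.remove_edges_from(weak)
        isolated = list(nx.isolates(g))
        g.remove_nodes_from(isolated)
        if not weak and not isolated:
            break
    return g

pruned = prune_iteratively(G)
# Score each remaining candidate sense of the ambiguous word "bank" by the
# total weight of its incident edges and pick the best one.
scores = {n: sum(w for _, _, w in pruned.edges(n, data="weight"))
          for n in pruned.nodes if n.startswith("bank#")}
print(max(scores, key=scores.get))  # expected: "bank#finance"
```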

Semantic Pre-training Methodology for Improving Text Summarization Quality (텍스트 요약 품질 향상을 위한 의미적 사전학습 방법론)

  • Mingyu Jeon; Namgyu Kim
    • Smart Media Journal / v.12 no.5 / pp.17-27 / 2023
  • Automatic text summarization, which extracts only the information that is meaningful to users, has been studied steadily in recent years. In particular, research on text summarization using Transformer-based neural network models has been dominant. Among various approaches, the GSG method, which pre-trains a model through sentence-by-sentence masking, has received the most attention. However, traditional GSG selects the sentences to be masked based on the degree of token overlap rather than on sentence meaning. Therefore, to improve the quality of text summarization, this study proposes SbGSG (Semantic-based GSG), a methodology that selects the sentences to be masked by GSG while considering their meaning. Experiments using 370,000 news articles and 21,600 summaries and reports confirmed that the proposed SbGSG outperforms traditional GSG in terms of ROUGE and BERTScore.
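
As a rough illustration of semantic (rather than overlap-based) gap-sentence selection, the sketch below scores each sentence by its average embedding similarity to the rest of the document and masks the most central ones. The sentence-embedding model ("all-MiniLM-L6-v2"), the mask ratio, and the centrality rule are assumptions for illustration, not the Korean models or the exact selection criterion used for SbGSG.

```python
# Semantic gap-sentence selection sketch: mask the sentences whose embeddings
# are most similar to the rest of the document, then build (input, target) pairs.
import numpy as np
from sentence_transformers import SentenceTransformer
from sklearn.metrics.pairwise import cosine_similarity

model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed generic embedding model

def select_gap_sentences(sentences, mask_ratio=0.3):
    """Return indices of the semantically most central sentences."""
    emb = model.encode(sentences)            # (n_sent, dim)
    sim = cosine_similarity(emb)             # (n_sent, n_sent)
    np.fill_diagonal(sim, 0.0)               # ignore self-similarity
    centrality = sim.mean(axis=1)            # mean similarity to the other sentences
    n_mask = max(1, int(len(sentences) * mask_ratio))
    return sorted(np.argsort(centrality)[::-1][:n_mask].tolist())

doc = [
    "The central bank raised interest rates by 0.5 percentage points.",
    "Analysts expect further hikes later this year.",
    "The weather in the capital was unusually warm.",
    "Higher rates are intended to curb inflation.",
]
masked_idx = select_gap_sentences(doc)
inputs = [s if i not in masked_idx else "<mask_sent>" for i, s in enumerate(doc)]
targets = [doc[i] for i in masked_idx]       # the model learns to generate these
print(inputs, targets)
```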

A Study on the Employment Predictive Factors of Young University Graduates

  • Jun-Su Kim; Woo-Hong Cho
    • Journal of the Korea Society of Computer and Information / v.28 no.10 / pp.241-246 / 2023
  • The purpose of this study was to suggest directions for university employment support by analyzing employment success factors for young college graduates and comparing the determining factors in metropolitan and non-metropolitan areas. The factors were analyzed with binary logistic regression in the SPSS 25.0 statistical package, using the 2019 'College Graduate Occupational Movement Path Survey' data provided by the Korea Employment Information Service. Among the personal characteristics of college graduates in Seoul and the metropolitan area, age, parental assets, and language training experience were positive (+) factors for employment success, and in terms of college characteristics, 2-3 year college graduates were more likely to succeed in employment than 4-year college graduates or education college graduates. Among the personal characteristics of college graduates from non-metropolitan areas, age and parental assets were positive (+) factors in employment success, and 2-3 year college graduates were more likely to succeed in employment than 4-year college graduates.
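
For readers who want to reproduce this kind of analysis outside SPSS, below is a minimal sketch of a binary logistic regression on employment status using statsmodels. The data frame is randomly generated to mimic the variables named in the abstract (age, parental assets, language training, college type); it is not the survey data used in the paper, and the fitted coefficients are meaningless beyond illustrating the workflow.

```python
# Binary logistic regression sketch with synthetic placeholder data.
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 500
df = pd.DataFrame({
    "age": rng.integers(22, 35, n),
    "parental_assets": rng.normal(3.0, 1.0, n),   # e.g. log-scaled assets (placeholder)
    "language_training": rng.integers(0, 2, n),   # 1 = has language training experience
    "junior_college": rng.integers(0, 2, n),      # 1 = 2-3 year college graduate
})
# Synthetic outcome loosely following the reported direction of effects.
logit_p = (-6 + 0.15 * df["age"] + 0.4 * df["parental_assets"]
           + 0.5 * df["language_training"] + 0.3 * df["junior_college"])
df["employed"] = rng.binomial(1, 1 / (1 + np.exp(-logit_p)))

X = sm.add_constant(df[["age", "parental_assets", "language_training", "junior_college"]])
model = sm.Logit(df["employed"], X).fit(disp=False)
print(model.summary())   # positive coefficients correspond to (+) factors for employment
```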

Recognition of hand gestures with different prior postures using EMG signals (사전 자세에 따른 근전도 기반 손 제스처 인식)

  • Hyun-Tae Choi; Deok-Hwa Kim; Won-Du Chang
    • Journal of Internet of Things and Convergence / v.9 no.6 / pp.51-56 / 2023
  • Hand gesture recognition is an essential technology for people who have difficulty communicating through spoken language. Electromyogram (EMG) signals, which are often used for hand gesture recognition, vary with the posture held before the gesture, so recognition accuracy is expected to degrade when prior postures differ, yet studies on this subject are rare. In this study, we conducted experiments to confirm whether prior postures affect the accuracy of gesture recognition. Data were recorded from 20 subjects with different prior postures. We achieved average accuracies of 89.6% and 52.65% when the prior postures of the training and test data were the same and different, respectively. The accuracy increased when both prior postures were considered in training, which confirms the need to account for a variety of prior states in EMG-based hand gesture recognition.
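
The evaluation protocol described in the abstract (train on one prior posture, test on the same or a different one, then include both postures in training) can be sketched with synthetic features as follows. The random "EMG" feature vectors and the random-forest classifier are stand-ins, not the authors' features or model.

```python
# Sketch of matched vs. mismatched prior-posture evaluation with fake EMG features.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(1)

def make_dataset(posture_shift, n_per_gesture=100, n_gestures=4, n_channels=8):
    """Fake EMG feature vectors whose distribution depends on both the
    gesture being performed and the prior posture."""
    X = np.vstack([rng.normal(g + posture_shift, 1.0, size=(n_per_gesture, n_channels))
                   for g in range(n_gestures)])
    y = np.repeat(np.arange(n_gestures), n_per_gesture)
    return X, y

X_tr_a, y_tr_a = make_dataset(posture_shift=0.0)   # training data, prior posture A
X_te_a, y_te_a = make_dataset(posture_shift=0.0)   # test data, same prior posture
X_tr_b, y_tr_b = make_dataset(posture_shift=2.0)   # training data, prior posture B
X_te_b, y_te_b = make_dataset(posture_shift=2.0)   # test data, different prior posture

clf = RandomForestClassifier(random_state=0).fit(X_tr_a, y_tr_a)
print("same prior posture:      ", accuracy_score(y_te_a, clf.predict(X_te_a)))
print("different prior posture: ", accuracy_score(y_te_b, clf.predict(X_te_b)))

# Including both prior postures in training restores accuracy, as the abstract suggests.
clf_both = RandomForestClassifier(random_state=0).fit(
    np.vstack([X_tr_a, X_tr_b]), np.concatenate([y_tr_a, y_tr_b]))
print("both postures in training:", accuracy_score(y_te_b, clf_both.predict(X_te_b)))
```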

Multi-Emotion Regression Model for Recognizing Inherent Emotions in Speech Data (음성 데이터의 내재된 감정인식을 위한 다중 감정 회귀 모델)

  • Moung Ho Yi; Myung Jin Lim; Ju Hyun Shin
    • Smart Media Journal / v.12 no.9 / pp.81-88 / 2023
  • Recently, online communication has increased with the spread of non-face-to-face services during COVID-19. In non-face-to-face situations, the other person's opinions and emotions are recognized through modalities such as text, speech, and images, and research on multimodal emotion recognition that combines these modalities is actively underway. Among these, emotion recognition from speech data is attracting attention as a means of understanding emotions through acoustic and linguistic information, but most approaches recognize a single emotion from a single speech feature value. Because a variety of emotions coexist in a conversation in a complex manner, however, a method for recognizing multiple emotions is needed. Therefore, this paper proposes a multi-emotion regression model that preprocesses speech data, extracts feature vectors, and recognizes the complex emotions inherent in speech while taking the passage of time into account.
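
A minimal sketch of such a pipeline is shown below: extract an utterance-level speech feature vector and fit a regressor with one output per emotion. The MFCC features, the assumed emotion set, and the random waveforms and targets are placeholders; the paper's actual features and temporal modelling are not reproduced here. Note that mean-pooling the MFCCs discards timing, so a model closer to the paper would keep the frame sequence and use a temporal encoder.

```python
# Multi-emotion regression sketch with synthetic waveforms and labels.
import numpy as np
import librosa
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(2)
SR = 16000
EMOTIONS = ["happiness", "sadness", "anger", "neutral"]   # assumed label set

def extract_features(waveform, sr=SR):
    """Mean MFCC vector as a compact utterance-level speech feature."""
    mfcc = librosa.feature.mfcc(y=waveform, sr=sr, n_mfcc=20)
    return mfcc.mean(axis=1)

# Placeholder corpus: random 2-second waveforms with random emotion intensities.
waveforms = [rng.normal(0, 0.1, SR * 2).astype(np.float32) for _ in range(50)]
X = np.stack([extract_features(w) for w in waveforms])
y = rng.uniform(0, 1, size=(50, len(EMOTIONS)))           # one intensity per emotion

# A single regressor with multi-dimensional output models all emotions jointly.
reg = RandomForestRegressor(random_state=0).fit(X, y)
print(dict(zip(EMOTIONS, reg.predict(X[:1])[0].round(2))))
```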

A Study of Automatic Deep Learning Data Generation by Considering Private Information Protection (개인정보 보호를 고려한 딥러닝 데이터 자동 생성 방안 연구)

  • Sung-Bong Jang
    • The Journal of the Convergence on Culture Technology / v.10 no.1 / pp.435-441 / 2024
  • For large collections of data to be used as deep learning training data, sensitive personal information such as resident registration numbers and disease information must be altered or encrypted so that it is not exposed to attackers, and the data must be reconstructed to match the structure of the deep learning model being built. Currently, these tasks are performed manually by experts, which takes considerable time and money. To solve this problem, this paper proposes a technique that automatically performs the data processing needed to protect personal information during the deep learning process. In the proposed technique, privacy protection is performed through data generalization, and data reconstruction is performed using a circular queue. To verify its validity, the technique was implemented directly in the C language. The verification confirmed that data generalization was performed normally and that data reconstruction suitable for the deep learning model was carried out properly.
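
The two steps can be sketched in a few lines of Python (the paper's own implementation is in C): generalize the sensitive fields of each record, then pass the records through a fixed-capacity circular queue that emits them in the flat form a model expects. The field names, generalization rules, and output layout below are illustrative assumptions.

```python
# Sketch: data generalization followed by reconstruction through a circular queue.

class CircularQueue:
    """Fixed-capacity circular buffer, mirroring the kind of structure used in the paper."""
    def __init__(self, capacity):
        self.buf = [None] * capacity
        self.capacity = capacity
        self.head = self.tail = self.count = 0

    def enqueue(self, item):
        if self.count == self.capacity:
            raise OverflowError("queue full")
        self.buf[self.tail] = item
        self.tail = (self.tail + 1) % self.capacity
        self.count += 1

    def dequeue(self):
        if self.count == 0:
            raise IndexError("queue empty")
        item = self.buf[self.head]
        self.head = (self.head + 1) % self.capacity
        self.count -= 1
        return item

def generalize(record):
    """Replace direct identifiers and coarsen quasi-identifiers (illustrative rules)."""
    out = dict(record)
    out["resident_id"] = out["resident_id"][:2] + "****-*******"  # keep birth-year prefix only
    out["age"] = (out["age"] // 10) * 10                           # 37 -> 30s bucket
    out["disease"] = "withheld"                                    # suppress sensitive detail
    return out

# Generalize, buffer, and emit model-ready rows [age, height, weight].
queue = CircularQueue(capacity=4)
raw = [{"resident_id": "850101-1234567", "age": 30 + i, "disease": "diabetes",
        "height": 170 + i, "weight": 70 + i} for i in range(4)]
for rec in raw:
    queue.enqueue(generalize(rec))
while queue.count:
    r = queue.dequeue()
    print([r["age"], r["height"], r["weight"]])
```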

Sentiment Analysis of Korean Reviews Using CNN: Focusing on Morpheme Embedding (CNN을 적용한 한국어 상품평 감성분석: 형태소 임베딩을 중심으로)

  • Park, Hyun-jung; Song, Min-chae; Shin, Kyung-shik
    • Journal of Intelligence and Information Systems / v.24 no.2 / pp.59-83 / 2018
  • With the increasing importance of sentiment analysis for grasping the needs of customers and the public, various deep learning models have been actively applied to English texts. In deep learning-based sentiment analysis of English texts, the natural language sentences in the training and test datasets are usually converted into sequences of word vectors before being fed into the models. Here, word vectors generally refer to vector representations of words obtained by splitting a sentence on space characters. There are several ways to derive word vectors, one of which is Word2Vec, used to produce the 300-dimensional Google word vectors from about 100 billion words of Google News data; these have been widely used in sentiment analysis studies of reviews from fields such as restaurants, movies, laptops, and cameras. Unlike in English, the morpheme plays an essential role in sentiment analysis and sentence structure analysis in Korean, a typical agglutinative language with well-developed postpositions and endings. A morpheme is the smallest meaningful unit of a language, and a word consists of one or more morphemes; for example, the word '예쁘고' consists of the morphemes '예쁘' (adjective) and '고' (connective ending). Reflecting the significance of Korean morphemes, it seems reasonable to adopt the morpheme as the basic unit in Korean sentiment analysis. Therefore, in this study, we use 'morpheme vectors' as input to a deep learning model rather than the 'word vectors' mainly used for English text. A morpheme vector is a vector representation of a morpheme and can be derived by applying an existing word vector derivation mechanism to sentences divided into their constituent morphemes. This raises several questions. What is the desirable range of POS (Part-Of-Speech) tags when deriving morpheme vectors to improve the classification accuracy of a deep learning model? Is it appropriate to apply a typical word vector model, which relies primarily on the form of words, to Korean with its high homonym ratio? Will text preprocessing such as correcting spelling or spacing errors affect classification accuracy, especially when drawing morpheme vectors from Korean product reviews containing many grammatical mistakes and variations? We seek empirical answers to these fundamental issues, which are likely to be encountered first when applying deep learning models to Korean texts. As a starting point, we summarize these issues as three central research questions. First, which is more effective as the initial input of a deep learning model: morpheme vectors from grammatically correct texts of a domain other than the analysis target, or morpheme vectors from considerably ungrammatical texts of the same domain? Second, what is an appropriate morpheme vector derivation method for Korean regarding the range of POS tags, homonyms, text preprocessing, and minimum frequency? Third, can a satisfactory level of classification accuracy be achieved when applying deep learning to Korean sentiment analysis? To address these questions, we generate various types of morpheme vectors reflecting each question and compare classification accuracy using a non-static CNN (Convolutional Neural Network) model that takes the morpheme vectors as input. As training and test datasets, 17,260 cosmetics product reviews from Naver Shopping are used.
To derive morpheme vectors, we use data from the same domain as the target and data from another domain: about 2 million cosmetics product reviews from Naver Shopping and 520,000 Naver News articles, roughly corresponding to Google's News data. The six primary sets of morpheme vectors constructed in this study differ according to three criteria. First, they come from two data sources: Naver News, with high grammatical correctness, and Naver Shopping cosmetics reviews, with low grammatical correctness. Second, they differ in the degree of preprocessing, namely, sentence splitting only, or additional spelling and spacing corrections after sentence separation. Third, they vary in the form of input fed into the word vector model: whether the morphemes are entered on their own or with their POS tags attached. The morpheme vectors further vary in the range of POS tags considered, the minimum frequency of morphemes included, and the random initialization range. All morpheme vectors are derived with the CBOW (Continuous Bag-Of-Words) model with a context window of 5 and a vector dimension of 300. The results indicate that using same-domain text even with lower grammatical correctness, performing spelling and spacing corrections in addition to sentence splitting, and incorporating morphemes of all POS tags, including the incomprehensible category, lead to better classification accuracy. POS tag attachment, devised for the high proportion of homonyms in Korean, and the minimum frequency threshold for including a morpheme do not appear to have any definite influence on classification accuracy.
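
The morpheme-vector derivation step can be sketched with gensim's CBOW implementation using the settings reported above (context window 5, 300 dimensions). The morpheme-segmented reviews below are hand-made placeholders; in practice a Korean morphological analyzer would produce them, optionally with POS tags attached, before the resulting vectors are fed to the non-static CNN.

```python
# Deriving morpheme vectors with CBOW (sg=0) over morpheme-segmented sentences.
from gensim.models import Word2Vec

# Each review is a list of morphemes; strings like "예쁘/VA" would be used
# when experimenting with POS-tag-attached inputs.
morpheme_sentences = [
    ["이", "제품", "정말", "예쁘", "고", "좋", "아요"],
    ["배송", "이", "빠르", "고", "가격", "도", "저렴", "하", "네요"],
    ["색상", "이", "사진", "과", "다르", "고", "실망", "했", "어요"],
]

model = Word2Vec(
    sentences=morpheme_sentences,
    vector_size=300,   # embedding dimension used in the paper
    window=5,          # context window used in the paper
    min_count=1,       # the paper also varies this minimum-frequency threshold
    sg=0,              # sg=0 selects the CBOW architecture
)

vec = model.wv["예쁘"]                                   # 300-dimensional morpheme vector
print(vec.shape, model.wv.most_similar("예쁘", topn=2))
```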

Improving Bidirectional LSTM-CRF model Of Sequence Tagging by using Ontology knowledge based feature (온톨로지 지식 기반 특성치를 활용한 Bidirectional LSTM-CRF 모델의 시퀀스 태깅 성능 향상에 관한 연구)

  • Jin, Seunghee; Jang, Heewon; Kim, Wooju
    • Journal of Intelligence and Information Systems / v.24 no.1 / pp.253-266 / 2018
  • This paper proposes a methodology that applies sequence tagging to improve the performance of the NER (Named Entity Recognition) used in QA systems. To retrieve the correct answers stored in a database, the user's query must be converted into a database language such as SQL (Structured Query Language) so that the computer can interpret it; this requires identifying the class or data names of the database that appear in the query. The existing method, which looks up the words of the query in the database and recognizes entities by direct matching, cannot distinguish homophones or multi-word phrases because it does not consider the context of the user's query. When there are multiple search results, all of them are returned, so the query can have many interpretations and the time complexity of the computation becomes large. To overcome this, this study reflects the contextual meaning of the query using a Bidirectional LSTM-CRF. We also address the weakness of neural network models, which cannot identify untrained words, by using an ontology knowledge-based feature. Experiments were conducted on an ontology knowledge base in the music domain and the performance was evaluated. To evaluate the proposed L-Bidirectional LSTM-CRF accurately, we converted words included in the training queries into untrained words, testing whether words contained in the database but unseen during training were still identified correctly. As a result, the model recognized entities in context and recognized untrained words without retraining the L-Bidirectional LSTM-CRF model, and the overall entity recognition performance was confirmed to improve.
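
A skeleton of a BiLSTM-CRF tagger extended with an ontology-derived per-token feature might look as follows. The dimensions, tag set, and the way the ontology feature is built are assumptions for illustration, and the CRF layer comes from the pytorch-crf package rather than the authors' code.

```python
# BiLSTM-CRF skeleton with an ontology feature concatenated to each token embedding.
import torch
import torch.nn as nn
from torchcrf import CRF

class OntoBiLSTMCRF(nn.Module):
    def __init__(self, vocab_size, num_tags, emb_dim=100, onto_dim=8, hidden=128):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        # onto_dim-dimensional feature per token, e.g. one-hot over knowledge-base types.
        self.lstm = nn.LSTM(emb_dim + onto_dim, hidden // 2,
                            batch_first=True, bidirectional=True)
        self.fc = nn.Linear(hidden, num_tags)
        self.crf = CRF(num_tags, batch_first=True)

    def forward(self, tokens, onto_feats, tags=None, mask=None):
        x = torch.cat([self.emb(tokens), onto_feats], dim=-1)
        h, _ = self.lstm(x)
        emissions = self.fc(h)
        if tags is not None:                       # training: negative log-likelihood
            return -self.crf(emissions, tags, mask=mask, reduction="mean")
        return self.crf.decode(emissions, mask=mask)   # inference: best tag paths

# Toy usage with random ids; onto_feats would come from matching tokens against
# the knowledge base, which is what lets unseen words still be recognized.
model = OntoBiLSTMCRF(vocab_size=1000, num_tags=5)
tokens = torch.randint(0, 1000, (2, 7))
onto_feats = torch.zeros(2, 7, 8)
tags = torch.randint(0, 5, (2, 7))
mask = torch.ones(2, 7, dtype=torch.bool)
loss = model(tokens, onto_feats, tags, mask)
loss.backward()
print(float(loss), model(tokens, onto_feats, mask=mask))
```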

The Sociocultural Characteristics of Korean Ethnics in Central Asia (중앙아시아 한인의 사회문화적 특성과 과제)

  • 정성호
    • Korea Journal of Population Studies / v.20 no.2 / pp.161-180 / 1997
  • There are about 400,000 ethnic Koreans living in Central Asia. Most Koreans in Central Asia lead a stable middle-class life, mostly engaged in farm work. With the increase in their children's educational attainment, a number of Koreans are entering political and academic circles as well as the cultural world and the press. In recent years, however, the countries in this area covered by the study (Uzbekistan and Kazakstan) have advocated ethnic unity policies to stabilize politics and society and to carry out an efficient transformation from the former socialist economy to a market-oriented economy. In addition, they are trying to recover the culture and language of each nation, which were forgotten under Russia's assimilation policy. Koreans have difficulty adapting to this kind of change; in fact, many Koreans have lost their traditional culture and cannot speak their mother tongue, Korean. Although they more or less maintain a national consciousness, they politically recognize Uzbekistan or Kazakstan as their nation. They associated with North Korea unilaterally before the launch of the Perestroika policy, but after the Seoul Olympics held in 1988 there was a movement to learn about and understand South Korea, and investment by South Korean companies in Central Asia has increased. What, then, is an alternative idea for Korean community consciousness? It can be summarized as follows: 1) Increased aid to Korean education institutes: considering the last few decades of Russia's strong racial assimilation policy, which led most Koreans to lose their language and national culture, priority should go to Korean education. 2) Support for the local Korean press: though Korean newspapers are published and Korean broadcasting is on the air in Uzbekistan and Kazakstan, they suffer from a lack of qualified staff and poor finances, so positive support should be established for these Korean media outlets to recover their function and quickly expand their reach. 3) Research on the actual condition of the Korean community: it is essential to directly examine the local Korean community's regional distribution, population structure, the formation and operation of Korean groups, social and cultural understanding, ethnic consciousness, hopes regarding the motherland, and more. 4) More opportunities for motherland visits and education: to stir up national culture and national consciousness within the Korean community, it is necessary to continuously expand opportunities for motherland visits and educational training for local Koreans, especially the second and third generations.
