• Title/Summary/Keyword: 텍스트 연구

Text Big Data Analysis and Summary for Free Semester Operational Plan Document (자유학기제 운영계획서에 대한 텍스트 빅데이터 분석 및 요약)

  • Lee, Suan;Park, Beomjun;Kim, Minkyu;Shin, Hye Sook;Kim, Jinho
    • The Journal of Korean Association of Computer Education / v.22 no.3 / pp.135-146 / 2019
  • Big data analysis is actively used to collect and analyze information on topics across many fields of society. Interest in applying big data analysis technology to education is growing in Korea, because it helps to assess the effectiveness of education methods and policies and to apply the findings to policy formulation. In this paper, we propose an approach to using big data analysis technology in the education field. We focus on the free semester program, one of the current core education policies, and analyze the main points of interest and the differences in the free semester by analyzing and visualizing the text of the operation reports prepared by each school. We compare regional differences in key characteristics and interests based on the free semester operation reports from middle schools, particularly in the Seoul and Gangwon-do regions. We conclude that applying big data analysis technology according to the needs and requirements of the education field is of great significance.

William Faulkner's Sanctuary: The Original Text as a Matrix (윌리엄 포크너의 『성역: 오리지널 텍스트』: 매트릭스의 역할)

  • Jeong, Hyun-Sook
    • Journal of Convergence for Information Technology / v.9 no.8 / pp.233-242 / 2019
  • The purpose of this study is to compare the supposed "potboiler" Sanctuary with Sanctuary: The Original Text and to show that Horace Benbow in The Original Text is a more complicated and many-sided character, one who has long suppressed desire, an Oedipus complex, and a sense of guilt until he confronts the Temple-Popeye case. Since literary narration implies an unconscious process, Horace's incestuous love for his stepdaughter and his Oedipal relations reveal Faulkner's own psychology. In this sense, The Original Text serves as a matrix for many of Faulkner's major novels in terms of themes, characters, and the relationship between past and present. Among these novels are The Sound and the Fury, As I Lay Dying, and Flags in the Dust. In writing about his own world and creating Yoknapatawpha County, Faulkner tries to portray characters of artistic value through whom he could express the deep anxiety and turmoil of the 1920s. Starting with Horace Benbow, characters such as Quentin Compson, Darl Bundren, and young Bayard Sartoris can be read as doubles across his major works, conveying the author's profound despair in the context of the modern world.

A Study on the Music Therapy Management Model Based on Text Mining (텍스트 마이닝 기반의 음악치료 관리 모델에 관한 연구)

  • Park, Seong-Hyun;Kim, Jae-Woong;Kim, Dong-Hyun;Cho, Han-Jin
    • Journal of the Korea Convergence Society / v.10 no.8 / pp.15-20 / 2019
  • Music therapy has shown many benefits in the treatment of children with disabilities and of the mind. However, no specific treatment system for music therapy has yet been established. For a music therapist to provide accurate treatment, various music therapy cases and treatment history data must be analyzed. Although the client or patient should receive the most appropriate treatment, in practice several factors make this difficult. In this paper, we propose a music therapy knowledge management model that combines existing therapy data with text mining technology. With the proposed model, similar cases can be searched, and accurate and effective treatment can be provided to the patient or client based on specific and reliable patient-related data. This is expected to maximize the original purpose and effect of music therapy and to be useful for treating more patients.
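
The similar-case search mentioned in this abstract could be sketched as follows. This is only an illustration, not the authors' model: it assumes cases are short English text notes, uses plain bag-of-words cosine similarity, and the case IDs and notes in `CASES` are invented for the example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two term-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def most_similar(query, cases, top_k=2):
    """Return the top_k (case_id, similarity) pairs for a query note."""
    q = Counter(tokenize(query))
    scored = [
        (cosine(q, Counter(tokenize(text))), case_id)
        for case_id, text in cases.items()
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [(case_id, round(score, 3)) for score, case_id in scored[:top_k]]


# Hypothetical case notes, invented for illustration only.
CASES = {
    "case-1": "child with anxiety responded to rhythmic drumming sessions",
    "case-2": "adult with depression improved through guided singing",
    "case-3": "child with autism engaged by rhythmic drumming and singing",
}

print(most_similar("rhythmic drumming for an anxious child", CASES))
```

A real system would likely need Korean morphological analysis and a weighting scheme such as TF-IDF; raw counts are used here only to keep the sketch self-contained.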

Exploiting Korean Language Model to Improve Korean Voice Phishing Detection (한국어 언어 모델을 활용한 보이스피싱 탐지 기능 개선)

  • Boussougou, Milandu Keith Moussavou;Park, Dong-Joo
    • KIPS Transactions on Software and Data Engineering / v.11 no.10 / pp.437-446 / 2022
  • Text classification, a Natural Language Processing (NLP) task, combined with state-of-the-art (SOTA) Machine Learning (ML) and Deep Learning (DL) algorithms as the core engine, is widely used to detect and classify voice phishing call transcripts. While numerous studies on the classification of voice phishing call transcripts have been conducted and have demonstrated good performance, the increase in non-face-to-face financial transactions means there is still a need for improvement using the latest NLP technologies. This paper benchmarks the Korean voice phishing detection performance of the pre-trained Korean language model KoBERT against multiple other SOTA algorithms, based on the classification of transcripts from the labeled Korean voice phishing dataset KorCCVi. The experimental results show that the KoBERT model outperforms all other models on the test set, with a classification accuracy of 99.60%.

Hydrogen Fuel Cell Patent Analysis: Using Knowledge Persistence-based Main Path Analysis and Text Mining (수소연료전지 특허 동향 분석: 지식 지속성 기반 주경로 분석 및 텍스트 마이닝 방법 활용)

  • Sejun Yoon;Hyunseok Park
    • Knowledge Management Research / v.24 no.1 / pp.127-145 / 2023
  • This paper analyzes patent trends in the technological domain of hydrogen fuel cells, which can help address future energy and pollution problems. Patent analysis is used to establish technology roadmaps because it can reveal current technological capability and the direction of future technological development. However, previous patent analyses have relied on qualitative analysis and simple statistical analysis. Such analyses can be inaccurate because they do not reflect the current technological environment, in which technology develops through the recombination of technologies and the speed of technological development is rapidly increasing. Qualitative analysis therefore no longer satisfies present-day analysis requirements. This paper uses KP (Knowledge Persistence)-based main path analysis and text mining methods to reflect the current technological environment. As a result, we identified core patents, main technology developments, and promising technologies in the technological domain of the hydrogen fuel cell.

A Sentiment Analysis of Customer Reviews on the Connected Car using Text Mining: Focusing on the Comparison of UX Factors between Domestic-Overseas Brands (텍스트 마이닝을 활용한 커넥티드 카 고객 리뷰의 감성 분석: 국내-해외 브랜드간 UX 요인 비교를 중심으로)

  • Youjung Shin;Junho Choi;Sung Woo Kim
    • The Journal of the Convergence on Culture Technology / v.9 no.4 / pp.517-528 / 2023
  • The purpose of this study is to analyze and compare UX factors of the connectivity systems of domestic and overseas car brands. Using text mining analysis, UX factors of domestic and overseas brands were compared through a positive-negative sentiment index. After collecting 120,000 reviews on Hyundai Motor Group (Hyundai, Kia, Genesis) and 190,000 on Tesla, BMW, and Mercedes, pre-processing was performed. Keywords were classified into 11 UX factors in three dimensions: system connection, information, and service. For domestic brands, the sentiment index for 'safety' was the highest. For overseas brands, 'entertainment' was the most positive UX factor.
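
One way a positive-negative sentiment index per UX factor might be computed is sketched below. The abstract does not give the formula, so the definition used here, (positive - negative) / (positive + negative), is an assumption, as are the function name `sentiment_index` and the `"pos"`/`"neg"` labels.

```python
from collections import defaultdict


def sentiment_index(labeled_mentions):
    """Compute an assumed sentiment index per UX factor.

    labeled_mentions: iterable of (factor, label) pairs, where label is
    "pos" or "neg". The index (pos - neg) / (pos + neg) ranges from -1
    (all negative) to +1 (all positive). This formula is an assumption,
    not necessarily the one used in the study.
    """
    counts = defaultdict(lambda: [0, 0])  # factor -> [positive, negative]
    for factor, label in labeled_mentions:
        if label == "pos":
            counts[factor][0] += 1
        elif label == "neg":
            counts[factor][1] += 1
    return {
        factor: (pos - neg) / (pos + neg)
        for factor, (pos, neg) in counts.items()
        if pos + neg > 0
    }


# Hypothetical labeled keyword mentions, invented for illustration.
mentions = [
    ("safety", "pos"), ("safety", "pos"), ("safety", "neg"),
    ("entertainment", "neg"),
]
print(sentiment_index(mentions))
```

Normalizing by the number of mentions keeps factors with very different review volumes comparable, which matters when one brand group has 120,000 reviews and the other 190,000.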

Multi-type object detection-based de-identification technique for personal information protection (개인정보보호를 위한 다중 유형 객체 탐지 기반 비식별화 기법)

  • Ye-Seul Kil;Hyo-Jin Lee;Jung-Hwa Ryu;Il-Gu Lee
    • Convergence Security Journal / v.22 no.5 / pp.11-20 / 2022
  • As Internet and web technologies develop around mobile devices, image data increasingly contains various types of sensitive information, such as people, text, and spaces. In addition, as the use of SNS increases, the damage caused by the exposure and abuse of personal information online is growing. However, research on de-identification technology based on multi-type object detection for personal information protection is insufficient. Therefore, this paper proposes an artificial intelligence model that detects and de-identifies multiple types of objects by using existing single-type object detection models in parallel. Using CutMix, images in which person and text objects appear together were created and used as training data, and person and text objects, which have different characteristics, were detected and de-identified. The proposed model achieves a precision of 0.724 and an mAP@.5 of 0.745 when two objects are present at the same time. In addition, after de-identification, mAP@.5 was 0.224 for all objects, a decrease of 0.4 or more.

Analysis of Potential Construction Risk Types in Formal Documents Using Text Mining (텍스트 마이닝을 통한 건설공사 공문 잠재적 리스크 유형 분석)

  • Eom, Sae Ho;Cha, Gichun;Park, Sun Kyu;Park, Seunghee;Park, Jongho
    • KSCE Journal of Civil and Environmental Engineering Research / v.43 no.1 / pp.91-98 / 2023
  • Since risks occurring in construction projects can have a significant impact on schedules and costs, there have been many studies on this topic. However, risk analysis is often limited to only certain construction situations, and experience-dependent decision-making is therefore mainly performed. Data-based analyses have only been partially applied to safety and contract documents. Therefore, in this study, cluster analysis and a Word2Vec algorithm were applied to formal documents that contain important elements for contractors or clients. An initial classification of document content into six types was performed through cluster analysis, and 157 occurrence types were subdivided through application of the Word2Vec algorithm. The derived terms were re-classified into five categories and reviewed as to whether the terms could develop into potential construction risk factors. Identifying potential construction risk factors will be helpful as basic data for process management in the construction industry.

What has Korea told in the WTO? : An analysis on the Ministerial Conference Statements (WTO에서 한국은 무슨 말을 해왔나?: 각료회의 대표발언문 분석을 중심으로)

  • Jeong-meen Suh
    • Korea Trade Review / v.48 no.1 / pp.29-53 / 2023
  • This study analyzes the statements made by representatives of member countries at the WTO Ministerial Conference (MC), the highest decision-making body of the WTO, to examine the position and attitude that Korea has shown at the WTO during the last 27 years. After constructing a text dataset by extracting about 1,800 statement documents made by member countries from the WTO document database, text mining techniques are applied to identify the characteristics of Korea's statements compared to those of other member countries. Through formal characteristics such as the number of remarks and the length of speeches, basic attitudes such as the continuity and level of Korea's interest in the WTO are measured. In terms of substantive characteristics, the topics in Korea's statements are categorized through the LDA topic model, and Korea's keywords for each session are analyzed through comparison with statements by other member countries.
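
The comparative keyword step described in this abstract could be approximated as below. This is a sketch under assumptions, not the study's procedure: it ranks terms by a smoothed log ratio of relative frequencies between one country's statements and everyone else's. The function names, the smoothing constant `alpha`, and the sample sentences are all hypothetical.

```python
import math
import re
from collections import Counter


def term_counts(docs):
    """Count alphabetic tokens across a list of documents."""
    counts = Counter()
    for doc in docs:
        counts.update(re.findall(r"[a-z]+", doc.lower()))
    return counts


def distinctive_terms(target_docs, other_docs, top_k=3, alpha=1.0):
    """Rank terms by how much more frequent they are in target_docs.

    Score = log of the add-alpha smoothed relative frequency in the
    target minus the same quantity in the comparison set.
    """
    target = term_counts(target_docs)
    other = term_counts(other_docs)
    vocab = set(target) | set(other)
    target_total = sum(target.values()) + alpha * len(vocab)
    other_total = sum(other.values()) + alpha * len(vocab)
    scores = {
        w: math.log((target[w] + alpha) / target_total)
        - math.log((other[w] + alpha) / other_total)
        for w in vocab
    }
    # Sort by descending score; break ties alphabetically for stability.
    return sorted(scores, key=lambda w: (-scores[w], w))[:top_k]


# Hypothetical statement snippets, invented for illustration.
korea = ["agriculture agriculture trade"]
others = ["fisheries trade trade"]
print(distinctive_terms(korea, others, top_k=1))
```

Smoothing prevents terms that never appear in the comparison set from producing infinite scores; the study itself may have used a different comparison measure.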

Optimizing Input Parameters of Paralichthys olivaceus Disease Classification based on SHAP Analysis (SHAP 분석 기반의 넙치 질병 분류 입력 파라미터 최적화)

  • Kyung-Won Cho;Ran Baik
    • The Journal of the Korea Institute of Electronic Communication Sciences / v.18 no.6 / pp.1331-1336 / 2023
  • Text-based fish disease classification using machine learning suffers from an excessive number of model input parameters, yet the input parameters cannot be reduced arbitrarily because of performance problems. To solve this problem, this paper proposes a method of optimizing input parameters specialized for Paralichthys olivaceus disease classification using SHAP analysis techniques. The proposed method preprocesses disease information extracted from the halibut disease questionnaire, applying the SHAP analysis technique and evaluating a machine learning model using AutoML. Through this, the performance of the AutoML input parameters is evaluated and the optimal input parameter combination is derived. The proposed method is expected to maintain the existing performance while reducing the number of input parameters required, which will contribute to enhancing the efficiency and practicality of text-based Paralichthys olivaceus disease classification.
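
SHAP attributions are based on Shapley values, and the idea of ranking input parameters by contribution can be illustrated with an exact Shapley computation on a toy scoring function. This sketch does not use the `shap` library or the paper's AutoML pipeline; the feature names and the `toy_accuracy` function are hypothetical stand-ins.

```python
from itertools import permutations


def shapley_values(features, value):
    """Exact Shapley values by averaging marginal gains over all orderings.

    value: function mapping a frozenset of features to a score.
    Cost grows factorially, so this is only practical for a few features.
    """
    features = list(features)
    totals = {f: 0.0 for f in features}
    orderings = list(permutations(features))
    for order in orderings:
        present = set()
        for f in order:
            before = value(frozenset(present))
            present.add(f)
            totals[f] += value(frozenset(present)) - before
    return {f: totals[f] / len(orderings) for f in features}


def toy_accuracy(subset):
    """Hypothetical accuracy of a classifier given a feature subset."""
    score = 0.5
    if "lesion" in subset:
        score += 0.3
    if "fin_rot" in subset:
        score += 0.1
    if "lesion" in subset and "fin_rot" in subset:
        score += 0.05  # interaction bonus, shared between both features
    return score


features = ["lesion", "fin_rot", "water_temp"]
for name, contribution in shapley_values(features, toy_accuracy).items():
    print(f"{name}: {contribution:.3f}")
```

In this toy setup `water_temp` never changes the score, so its Shapley value is zero and it would be the first candidate to drop. Attributions near zero are the kind of signal one might use to shrink the input parameter set without losing performance.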