• Title/Summary/Keyword: KoNLPy

Search Result 24, Processing Time 0.041 seconds

Target and Swear Word Detection Using Sentence Analysis in Real-Time Chatting (실시간 채팅 환경에서 문장 분석을 이용한 대상자 및 비속어 검출)

  • Yeom, Choongseok;Jang, Junyoung;Jang, Yuhwan;Kim, Hyun-chul;Park, Heemin
    • Journal of the Semiconductor & Display Technology
    • /
    • v.20 no.1
    • /
    • pp.83-87
    • /
    • 2021
  • By the increase of internet usage, communicating online became an everyday thing. Thereby various people have experienced profanity by anonymous users. Nowadays lots of studies tried to solve this problem using artificial intelligence, but most of the solutions were for non-real time situations. In this paper, we propose a Telegram plugin that detects swear words using word2vec, and an algorithm to find the target of the sentence. We vectorized the input sentence to find connections with other similar words, then inputted the value to the pre-trained CNN (Convolutional Neural Network) model to detect any swears. For target recognition we proposed a sequential algorithm based on KoNLPY.

Design and Implementation of Typing Practice Application for Learning Using Web Contents (웹 콘텐츠를 활용한 학습용 타자 연습 어플리케이션의 설계와 구현)

  • Kim, Chaewon;Hwang, Soyoung
    • Journal of Korea Multimedia Society
    • /
    • v.24 no.12
    • /
    • pp.1663-1672
    • /
    • 2021
  • There are various typing practice applications. In addition, research cases on learning applications that support typing practice have been reported. These services are usually provided in a way that utilizes their own built-in text. Learners collect various contents through web services and use them a lot for learning. Therefore, this paper proposes a learning application to increase the learning effect by collecting vast amounts of web content and applying it to typing practice. The proposed application is implemented using Tkinter, a GUI module of Python. BeautifulSoup module of Python is used to extract information from the web. In order to process the extracted data, the NLTK module, which is an English data preprocessor, and the KoNLPy module, which is a Korean language processing module, are used. The operation of the proposed function is verified in the implementation and experimental results.

Development of a Recommendation System for Crowdfunding Using NLP in Short Text (단문 텍스트의 자연어 처리 기법을 통한 크라우드 펀딩 추천 시스템 개발)

  • Lee, Yeong-Ah;Lee, Sun-Myung;Lee, Ju-Yon;Lee, Ki Yong
    • Annual Conference of KIPS
    • /
    • 2021.11a
    • /
    • pp.466-469
    • /
    • 2021
  • 최근 자연어 처리에 대한 관심이 증가함에 따라 자연어 처리 기술을 활용한 다양한 추천 시스템이 등장하고 있다. 본 논문에서는 자연어 처리를 이용한 서비스를 개발한다. 본 논문에서 개발한 서비스는 KoNLPy 와 Word2Vec 을 이용하여 크라우드 펀딩 프로젝트 창작자 및 후원자에게 키워드 및 키워드와 유사한 단어가 제목에 포함되는 프로젝트를 추천해준다. 단문 텍스트로서 프로젝트 제목을 사용하여 데이터를 자연어 처리 한 후, 딥러닝 모델에 적용시켜 추출한 데이터를 기반으로 창작자와 후원자에게 추천해주는 방식이다. 따라서 본 서비스는 프로젝트 제목 정보를 통한 추천 시스템의 개발로, 나아가 영화, 도서와 같은 콘텐츠 추천 분야에도 적용할 수 있을 것으로 기대한다.

A Deep Learning-Based Smartphone Phishing Attacks Countermeasures (딥러닝 기반 스마트폰 피싱 공격 대응 방법)

  • Lee, Jae-Kyung;Seo, Jin-Beom;Cho, Young-Bok
    • Proceedings of the Korean Society of Computer Information Conference
    • /
    • 2022.07a
    • /
    • pp.321-322
    • /
    • 2022
  • 스마트폰 사용자가 늘어남에 따라 갖춰줘야 할 보안성이 취약하여, 다양한 바이러스 및 악성코드 위험에 노출되어 있다. 안드로이드는 운영체제 중 가장 많이 사용되는 운영체제로, 개방성이 높으며 수많은 악성 앱 및 바이러스가 마켓에 존재하여 위험에 쉽게 노출된다. 2년 넘게 이어진 코로나 바이러스(Covid-19)으로 인해 꾸준히 위험도가 높아진 피싱공격(Phshing attack)은 현재 최고의 스마트폰 보안 위협 Top10에 위치한다. 본 논문에서는 딥러닝 기반 자연어처리 기술을 통해 피싱 공격 대응 방법 제안 및 실험 결과를 도출하고, 또한 향후 제안 방법을 보완하여 피싱 공격 및 다양한 모바일 보안 위협에 대응할 수 있는 앱을 설계할 것이다.

  • PDF

Information Video Summarization and Keyword-based Video Tracking System (정보성 동영상 요약 및 키워드 기반 영상검색 시스템)

  • Gihun Kim;Mikyeong Moon
    • Proceedings of the Korean Society of Computer Information Conference
    • /
    • 2023.07a
    • /
    • pp.701-702
    • /
    • 2023
  • 비대면 교육이 증가함에 따라 강의, 특강과 같은 정보성 동영상의 수가 급격히 많아지고 있다. 이러한 정보성 동영상을 보아야 하는 학습자들은 자원과 시간을 효율적으로 활용할 수 있는 동영상 이해 및 학습 시스템이 필요하다. 본 논문에서는 GPT-3 모델과 KoNLPy 사용하여 동영상 요약을 수행하고 키워드 기반 해당 영상 프레임으로 바로 갈 수 있는 시스템의 개발내용에 대해 기술한다. 이를 통해 동영상 콘텐츠를 효과적으로 활용하여 학습자들의 학습 효율성을 향상시킬 수 있을 것으로 기대한다.

  • PDF

Exploratory Research on Automating the Analysis of Scientific Argumentation Using Machine Learning (머신 러닝을 활용한 과학 논변 구성 요소 코딩 자동화 가능성 탐색 연구)

  • Lee, Gyeong-Geon;Ha, Heesoo;Hong, Hun-Gi;Kim, Heui-Baik
    • Journal of The Korean Association For Science Education
    • /
    • v.38 no.2
    • /
    • pp.219-234
    • /
    • 2018
  • In this study, we explored the possibility of automating the process of analyzing elements of scientific argument in the context of a Korean classroom. To gather training data, we collected 990 sentences from science education journals that illustrate the results of coding elements of argumentation according to Toulmin's argumentation structure framework. We extracted 483 sentences as a test data set from the transcription of students' discourse in scientific argumentation activities. The words and morphemes of each argument were analyzed using the Python 'KoNLPy' package and the 'Kkma' module for Korean Natural Language Processing. After constructing the 'argument-morpheme:class' matrix for 1,473 sentences, five machine learning techniques were applied to generate predictive models relating each sentences to the element of argument with which it corresponded. The accuracy of the predictive models was investigated by comparing them with the results of pre-coding by researchers and confirming the degree of agreement. The predictive model generated by the k-nearest neighbor algorithm (KNN) demonstrated the highest degree of agreement [54.04% (${\kappa}=0.22$)] when machine learning was performed with the consideration of morpheme of each sentence. The predictive model generated by the KNN exhibited higher agreement [55.07% (${\kappa}=0.24$)] when the coding results of the previous sentence were added to the prediction process. In addition, the results indicated importance of considering context of discourse by reflecting the codes of previous sentences to the analysis. The results have significance in that, it showed the possibility of automating the analysis of students' argumentation activities in Korean language by applying machine learning.

Sell-sumer: The New Typology of Influencers and Sales Strategy in Social Media (셀슈머(Sell-sumer)로 진화한 인플루언서의 새로운 유형과 소셜미디어에서의 세일즈 전략)

  • Shin, Hajin;Kim, Sulim;Hong, Manny;Hwang, Bom Nym;Yang, Hee-Dong
    • Knowledge Management Research
    • /
    • v.22 no.4
    • /
    • pp.217-235
    • /
    • 2021
  • As 49% of the world's population uses social media platforms, communication and content sharing within social media are becoming more active than ever. In this environmental base, the one-person media market grew rapidly and formed public opinion, creating a new trend called sell-sumer. This study defined new types of influencers by product category by analyzing the subject concentration of the commercial/non-commercial keywords of influencers and the impact of the ratio of commercial postings on sales. It is hoped that influencers working within social media will be helpful to new sales strategies that are transformed into sell-sumers. The method of this study classifies influencers' commercial/non-commercial posts using Python, performs text mining using KoNLPy, and calculates similarity between FastText-based words. As a result, it has been confirmed that the higher the keyword theme concentration of the influencer's commercial posting, the higher the sales. In addition, it was confirmed through the cluster analysis that the influencer types for each product category were classified into four types and that there was a significant difference between groups according to sales. In other words, the implications of this study may suggest empirical solutions of social media sales strategies for influencers working on social media and marketers who want to use them as marketing tools.

Development of Special Documents Classification System using Deep Learning (딥러닝을 이용한 전문분야 문서 분류 시스템 개발)

  • Jin, Sang-Hyeon;Hwang, Sang-Ho;Kang, Won-Seok;Son, Chang-Sik
    • Annual Conference of KIPS
    • /
    • 2019.10a
    • /
    • pp.589-591
    • /
    • 2019
  • 본 논문에서는 고도장비의 운용 및 정비를 위한 교육훈련 시스템 개발을 위해 자연어 처리와 딥러닝 기술을 이용하여 항공정비와 관련된 전문분야의 문서 분류가 가능한 방법을 제안하고자 한다. 문서 분류 모델의 개발을 위해 항공정비 교범을 텍스트 파일로 변환하여 총 4917개의 문서를 생성하였으며, 정비사 개인별 정비능력 관리(IMQC)를 기준으로 12개의 범주로 구분하였다. 수집된 문서는 전문분야의 문서인 점을 고려하여 전문용어 사전을 추가하였으며, KoNLPy를 이용하여 전처리를 수행하였다. 전문분야의 문서는 범주에 상관없이 문서 내용의 유사도가 매우 높은 특징을 가지고 있어, 특정 범주내에서 중요한 정도를 잘 표현 할 수 있는 TF-ICF를 이용하여 특징 추출을 하였다. 이후 합성곱 신경망(CNN)을 이용하여 특징 맵을 생성한 후 완전 결합 계층을 통하여 분류하였으며, 테스트 문서 983건을 분류한 결과 평균 73.6%의 분류성능을 보여주었다.

Semantic Network Analysis about Comments on Internet Articles about Nurse Workplace Bullying (간호사 괴롭힘 관련 인터넷 포털 기사에 대한 댓글의 의미연결망 분석)

  • Kim, Chang Hee;Moon, Seong Mi
    • Journal of Korean Clinical Nursing Research
    • /
    • v.25 no.3
    • /
    • pp.209-220
    • /
    • 2019
  • Purpose: A significant amount of public opinion about nurse bullying is expressed on the internet. The purpose of this study was to analyze the linkage structures among words extracted from comments on internet articles related to nurse workplace bullying using semantic network analysis. Methods: From February 2018 to April 2019, comments made on news articles posted to the Daum and Naver web portal containing keywords such as "nurse", "Taeum", and "bullying" were collected using a web crawler written in Python. A morphological analysis performed with Open Korean Text in KoNLPy generated 54 major nodes. The frequencies, eigenvector centralities, and betweenness centralities of the 54 nodes were calculated and semantic networks were visualized using the UCINET and NetDraw programs. Convergence of iterated correlations (CONCOR) analysis was performed to identify structural equivalence. Results: This paper presents results about March 2018 and January 2019 because these months had highest number of articles. Of the 54 major nodes, "nurse", "hospital", "patient", and "physician" were the most frequent and had the highest eigenvector and betweenness centralities. The CONCOR analysis identified work environment, nurse, gender, and military clusters. Conclusion: This study structurally explored public opinion about nurse bullying through semantic network analysis. It is suggested that various studies on nursing phenomena will be conducted using social network analysis.

Early Detection Assistance System for Rare Diseases based on Patient's Symptom Information (환자 증상정보 기반 희귀질환 조기 발견 보조시스템)

  • Jae-Min Choi;Sun-Yong Kim
    • The Journal of the Korea institute of electronic communication sciences
    • /
    • v.18 no.2
    • /
    • pp.373-378
    • /
    • 2023
  • Untypical symptoms and lack of diagnostic records make it difficult for even medical specialists to detect rare diseases. Thus, it takes a lot of time and money from the onset of symptoms to an accurate diagnosis, which seriously results in physical, mental, and economic pressure on patients. In this paper, we propose and implement an early detection assistance system for rare diseases using web crawling and text mining, which can suggest the names of suspected rare diseases so that medical staffs can easily recall the disease names and make a final diagnosis of the rare diseases.