• Title/Summary/Keyword: Parts of Speech


Japanese Expressions that Include English Expressions

  • Murata, Masaki;Kanamaru, Toshiyuki;Nakamoto, Koichirou;Kotani, Katsunori;Isahara, Hitoshi
    • Proceedings of the Korean Society for Language and Information Conference / 2007.11a / pp.330-339 / 2007
  • We extracted English expressions that appear in Japanese sentences in newspaper articles and on the Internet. The results from the newspaper articles showed that the preposition "in" has been in regular use for more than ten years and is still regularly used now. The results from the Internet articles showed many kinds of English expressions from various parts of speech. We extracted some interesting expressions that included English prepositions and verb phrases. These were interesting because their word order differs from the normal order in Japanese expressions. Comparing the extracted English and katakana expressions, we found that expressions commonly used in Japanese are often written in the katakana syllabary, whereas expressions not so often used in Japanese, such as prepositions, are hardly ever written in katakana.
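The extraction step described in this abstract might be sketched as follows. This is a minimal illustration, not the authors' method: the regular expressions, the function name `extract_expressions`, and the sample sentences are all assumptions, and "English expression" is simplified to mean a run of ASCII letters.

```python
import re
from collections import Counter

# Assumption: an "English expression" is one or more ASCII-letter words
# joined by spaces, apostrophes, or hyphens.
ENGLISH_RUN = re.compile(r"[A-Za-z]+(?:[ '\-][A-Za-z]+)*")
# Katakana block, U+30A0 to U+30FF (includes the long-vowel mark).
KATAKANA_RUN = re.compile(r"[\u30A0-\u30FF]+")

def extract_expressions(sentences):
    """Count Latin-script and katakana runs found in Japanese sentences."""
    english = Counter()
    katakana = Counter()
    for sentence in sentences:
        english.update(m.lower() for m in ENGLISH_RUN.findall(sentence))
        katakana.update(KATAKANA_RUN.findall(sentence))
    return english, katakana

# Hypothetical sample sentences, invented for illustration only.
sample = [
    "コンサート in 東京が開催された。",
    "夏フェス in 大阪のチケット",
    "Let's go 京都キャンペーン",
]
english, katakana = extract_expressions(sample)
```

Comparing the two counters gives a rough view of which expressions appear in Latin script and which in katakana.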


A Study on Knowledge based Conference Management System Architecture (지식 기반 회의관리 시스템 아키텍처에 관한 연구)

  • Kim Chang-Su;Jung Hoe-Kyung;Lee Soo-Youn
    • Journal of the Korea Institute of Information and Communication Engineering / v.10 no.9 / pp.1691-1699 / 2006
  • This paper proposes a standard knowledge-based system architecture for managing conferences, with the aim of building an ontology for part of a conference rather than for all of its parts. It also proposes the possibility of developing this into a system that systematizes information transformed and processed through various recognition systems (video conferencing, speech recognition, motion recognition, and so on), turns it into knowledge, and analyzes it, after standards for objective evaluation have been prepared through simulation and analysis.

Classification of Korean Parts-of-Speech for Korean-English Machine Translation (한.영 기계번역을 위한 한국어 품사 분류)

  • 송재관;박찬곤
    • Proceedings of the Korean Information Science Society Conference / 1998.10c / pp.165-167 / 1998
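  • This paper classifies Korean parts of speech for Korean-English machine translation. The classification presented in standard Korean grammar applies three criteria (meaning, function, and form), and natural language processing is based on the same criteria. Applying several criteria to part-of-speech classification makes it difficult to understand grammatical structure and to classify parts of speech. In addition, part-of-speech mismatches make preprocessing necessary in Korean-English machine translation. To solve these problems, this paper classifies parts of speech by applying a single criterion. As a method, we tag a corpus according to standard Korean grammar, identify the problems, and then classify parts of speech according to the new criterion. The parts of speech classified in this paper play the same syntactic role in Korean sentences and match the dictionary parts of speech in English. The classification also removes ambiguity in part-of-speech classification, expresses the structure of Korean sentences clearly, and makes it possible to generate the target language by pattern matching in Korean-English machine translation.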


Korean Morphological Analysis Considering a Term with Multiple Parts of Speech ("의미적 한 단어" 유형 분석 및 형태소 분석 기법)

  • Hur, Yun-Young;Kwon, Hyuk-Chul
    • Annual Conference on Human and Language Technology / 1994.11a / pp.128-131 / 1994
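  • Korean documents in specialized fields, such as newspapers, current-affairs magazines, and documents on law, economics, and Korean literature, contain many "semantically single words": terms formed by combining Hangul, Chinese characters, English, and symbols such as punctuation marks that together express a single meaning. Such words lower the analysis rate and raise the misanalysis rate of morphological analyzers that do not take them into account. This paper classifies the types of "semantically single words", as well as their types according to the analysis process, and presents morphological analysis techniques suited to them. A morphological analyzer implemented with this type classification and the proposed analysis techniques showed a higher analysis rate and a lower misanalysis rate than existing morphological analyzers.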


Design of an Efficient VLSI Architecture and Verification using FPGA-implementation for HMM(Hidden Markov Model)-based Robust and Real-time Lip Reading (HMM(Hidden Markov Model) 기반의 견고한 실시간 립리딩을 위한 효율적인 VLSI 구조 설계 및 FPGA 구현을 이용한 검증)

  • Lee Chi-Geun;Kim Myung-Hun;Lee Sang-Seol;Jung Sung-Tae
    • Journal of the Korea Society of Computer and Information / v.11 no.2 s.40 / pp.159-167 / 2006
  • Lipreading has been suggested as one of the methods to improve the performance of speech recognition in noisy environments. However, existing methods have been developed and implemented only in software. This paper suggests a hardware design for real-time lipreading. For real-time processing and feasible implementation, we decompose the lipreading system into three parts: an image acquisition module, a feature vector extraction module, and a recognition module. The image acquisition module captures the input image using a CMOS image sensor. The feature vector extraction module extracts a feature vector from the input image using a parallel block matching algorithm, which is coded and simulated for an FPGA circuit. The recognition module uses an HMM-based recognition algorithm, which is coded and simulated using a DSP chip. The simulation results show that a real-time lipreading system can be implemented in hardware.
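For readers unfamiliar with block matching, the basic idea can be sketched in software as below. This is a sequential sketch under stated assumptions, not the paper's parallel FPGA design: the sum-of-absolute-differences cost, the exhaustive search, and the names `sad` and `best_match` are illustrative choices.

```python
def sad(frame, block, top, left):
    """Sum of absolute differences between block and the frame patch at (top, left)."""
    return sum(
        abs(frame[top + r][left + c] - block[r][c])
        for r in range(len(block))
        for c in range(len(block[0]))
    )

def best_match(frame, block):
    """Exhaustively search the frame; return (cost, top, left) of the best patch."""
    block_h, block_w = len(block), len(block[0])
    candidates = (
        (sad(frame, block, top, left), top, left)
        for top in range(len(frame) - block_h + 1)
        for left in range(len(frame[0]) - block_w + 1)
    )
    return min(candidates)

# Hypothetical 4x4 grayscale frame and 2x2 reference block.
frame = [
    [0, 0, 0, 0],
    [0, 0, 9, 8],
    [0, 0, 7, 6],
    [0, 0, 0, 0],
]
block = [[9, 8], [7, 6]]
match = best_match(frame, block)
```

Each candidate position is scored independently of the others, which is what makes this kind of search a natural fit for parallel hardware.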


COVID-19-related Korean Fake News Detection Using Occurrence Frequencies of Parts of Speech (품사별 출현 빈도를 활용한 코로나19 관련 한국어 가짜뉴스 탐지)

  • Jihyeok Kim;Hyunchul Ahn
    • Journal of Intelligence and Information Systems / v.29 no.2 / pp.267-283 / 2023
  • The COVID-19 pandemic, which began in December 2019 and continues to this day, has left the public needing information to help them cope with the pandemic. However, COVID-19-related fake news on social media seriously threatens the public's health. In particular, if fake news related to COVID-19 is massively spread with similar content, the time required for verification to determine whether it is genuine or fake will be prolonged, posing a severe threat to our society. In response, academics have been actively researching intelligent models that can quickly detect COVID-19-related fake news. Still, the data used in most of the existing studies are in English, and studies on Korean fake news detection are scarce. In this study, we collect data on COVID-19-related fake news written in Korean that is spread on social media and propose an intelligent fake news detection model using it. The proposed model utilizes the frequency information of parts of speech, one of the linguistic characteristics, to improve the prediction performance of the fake news detection model based on Doc2Vec, a document embedding technique mainly used in prior studies. The empirical analysis shows that the proposed model can more accurately identify Korean COVID-19-related fake news by increasing the recall and F1 score compared to the comparison model.
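One way to picture the part-of-speech frequency feature is the sketch below. It rests on several assumptions: the abstract does not say how the frequencies are combined with the Doc2Vec vector, so simple concatenation is used here as a stand-in, and the tag inventory, function names, and example tokens are hypothetical. The document vector is passed in as a plain list rather than computed by a real Doc2Vec model.

```python
from collections import Counter

# Hypothetical tag inventory; a real Korean tagger would define its own tags.
POS_TAGS = ["Noun", "Verb", "Adjective", "Adverb", "Josa"]

def pos_frequency_vector(tagged_tokens, tags=POS_TAGS):
    """Relative frequency of each tag among the tokens carrying a listed tag."""
    counts = Counter(tag for _, tag in tagged_tokens)
    total = sum(counts[tag] for tag in tags)
    if total == 0:
        return [0.0] * len(tags)
    return [counts[tag] / total for tag in tags]

def combine_features(doc_vector, tagged_tokens):
    """Assumed combination: append POS frequencies to the document embedding."""
    return list(doc_vector) + pos_frequency_vector(tagged_tokens)

# Hypothetical pre-tagged post and a made-up two-dimensional document vector.
tagged = [("백신", "Noun"), ("은", "Josa"), ("위험", "Noun"), ("하다", "Adjective")]
features = combine_features([0.1, -0.2], tagged)
```

The combined vector could then be passed to any downstream classifier.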

A Deep Learning Model for Disaster Alerts Classification

  • Park, Soonwook;Jun, Hyeyoon;Kim, Yoonsoo;Lee, Soowon
    • Journal of the Korea Society of Computer and Information / v.26 no.12 / pp.1-9 / 2021
  • Disaster alerts are text messages sent by the government to people in the affected area in the event of a disaster. As the number of disaster alerts has increased, more people are blocking them because they receive many unnecessary alerts. To solve this problem, this study proposes a deep learning model that automatically classifies disaster alerts by disaster type and allows each recipient to receive only the alerts they need. The proposed model embeds disaster alerts via KoBERT and classifies them by disaster type with an LSTM. We classified disaster alerts using three combinations of parts of speech ([Noun], [Noun + Adjective + Verb], and [All parts]) and four classification models (the proposed model, keyword classification, Word2Vec + 1D-CNN, and KoBERT + FFNN); the proposed model achieved the highest performance, with an accuracy of 0.988954.
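The three part-of-speech combinations named in this abstract can be illustrated with a small filtering helper. This is only a sketch of the input-selection step, assuming the text has already been tagged; the tag names, the dictionary `POS_COMBINATIONS`, and the example alert are hypothetical, and the KoBERT embedding and LSTM classifier are not reproduced.

```python
# None means "keep every token", matching the [All parts] setting.
POS_COMBINATIONS = {
    "noun": {"Noun"},
    "noun_adj_verb": {"Noun", "Adjective", "Verb"},
    "all": None,
}

def filter_by_pos(tagged_tokens, combination):
    """Keep only the tokens whose tag belongs to the chosen combination."""
    allowed = POS_COMBINATIONS[combination]
    return [tok for tok, tag in tagged_tokens if allowed is None or tag in allowed]

# Hypothetical pre-tagged alert: "heavy rain warning issued, refrain from going out".
alert = [
    ("호우", "Noun"), ("경보", "Noun"), ("발령", "Noun"), (",", "Punctuation"),
    ("외출", "Noun"), ("을", "Josa"), ("자제", "Noun"), ("하세요", "Verb"),
]
nouns_only = filter_by_pos(alert, "noun")
content_words = filter_by_pos(alert, "noun_adj_verb")
everything = filter_by_pos(alert, "all")
```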

The development of the anomia assessment battery based on the psycholinguistic processing (언어심리학을 기반으로 한 명칭성 실어증 평가도구 개발)

  • Jung, Jae-Bum;Pyun, Sung-Bom;Sohn, Hyo-Jung;Gee, Sung-Woo;Cho, Sung-Ho;Nam, Ki-Chun
    • Proceedings of the KSPS conference / 2007.05a / pp.158-162 / 2007
  • Anomia, or word-finding difficulty, is one of the most common features of aphasia. Previous studies support the view that picture naming consists of three stages, in order: the object recognition, semantic, and phonological output stages. Anomic patients show many different symptoms, which means that anomia can be subdivided into several symptom groups. Our anomia assessment battery consists of several parts: (1) a picture naming set, (2) a picture-word matching task, (3) a lexical decision task for mental lexicon damage, (4) a naming task for phonological lexicon damage, and (5) a semantic decision task. Pictures and words were selected on the basis of usage frequency, semantic category, and word length. We administered this battery to many anomic aphasics and subdivided the patients into several groups. We hope that our anomia evaluation set will be useful for evaluating anomic aphasics.


Study of Machine-Learning Classifier and Feature Set Selection for Intent Classification of Korean Tweets about Food Safety

  • Yeom, Ha-Neul;Hwang, Myunggwon;Hwang, Mi-Nyeong;Jung, Hanmin
    • Journal of Information Science Theory and Practice / v.2 no.3 / pp.29-39 / 2014
  • In recent years, several studies have proposed making use of the Twitter micro-blogging service to track various trends in online media and discussion. In this study, we specifically examine the use of Twitter to track discussions of food safety in the Korean language. Given the irregularity of keyword use in most tweets, we focus on selecting an optimal machine-learning classifier and feature set to classify the collected tweets. We build classifier models using the Naive Bayes, Naive Bayes Multinomial, Support Vector Machine, and Decision Tree algorithms, all of which show good performance. To select an optimum feature set, we construct a basic feature set as a standard for performance comparison, so that further test feature sets can be evaluated. Experiments show that precision and F-measure are best when using a Naive Bayes Multinomial classifier model with a test feature set defined by extracting the Substantive, Predicate, Modifier, and Interjection parts of speech.
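As a rough illustration of the best-performing setup, the sketch below implements a small multinomial Naive Bayes classifier with Laplace smoothing over token lists. It is a from-scratch teaching version, not the toolkit used in the study; the class name, the intent labels, and the example tokens are hypothetical, and the tokens are assumed to have already been reduced to the selected parts of speech.

```python
import math
from collections import Counter, defaultdict

class MultinomialNB:
    """Minimal multinomial Naive Bayes with add-alpha (Laplace) smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.token_counts = defaultdict(Counter)
        for tokens, label in zip(docs, labels):
            self.token_counts[label].update(tokens)
        self.vocab = {t for counts in self.token_counts.values() for t in counts}
        return self

    def predict(self, tokens):
        n_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_label in self.class_counts.items():
            counts = self.token_counts[label]
            denom = sum(counts.values()) + self.alpha * len(self.vocab)
            score = math.log(n_label / n_docs)  # log prior
            for t in tokens:
                if t in self.vocab:  # ignore tokens never seen in training
                    score += math.log((counts[t] + self.alpha) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label

# Hypothetical POS-filtered tweets and intent labels.
docs = [
    ["식중독", "발생", "신고"],
    ["식중독", "의심", "신고"],
    ["맛집", "추천", "좋다"],
    ["맛집", "방문", "좋다"],
]
labels = ["report", "report", "chatter", "chatter"]
model = MultinomialNB().fit(docs, labels)
```

Calling `model.predict(["식중독", "신고"])` then returns the label whose training tokens best explain the input.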

A Segmentation Algorithm of the Connected Word Speech by Statistical Method (統計的인 方法에 依한 連結音의 音素分割 알고리듬)

  • Cho, Jeong-Ho;Hong, Jae-Keun;Kim, Soo-Joong
    • Journal of the Korean Institute of Telematics and Electronics / v.26 no.4 / pp.151-163 / 1989
  • A statistical approach for the segmentation of speech signals is described in this paper. The main idea of the algorithm is the use of three AR models. Two fixed models are identified on the stationary parts of the signal before and after the spectral change. Changes are detected when the distance between these two models is high. Another model is located between the two fixed models and is used to estimate the time of the spectral change. This segmentation algorithm has been tested with connected words and compared to classical methods. The results showed that it can provide more accurate locations of segment boundaries and can reduce the amount of oversegmentation.
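The idea of comparing two AR models fitted on either side of a suspected change can be sketched as follows. This covers only the "distance between two fixed models" step and makes several assumptions: the coefficients are estimated with the Levinson-Durbin recursion on a biased autocorrelation, the Euclidean distance between coefficient vectors stands in for the paper's distance measure (which the abstract does not specify), and the sinusoidal test signals are invented.

```python
import math

def autocorrelation(x, max_lag):
    """Biased autocorrelation estimates r[0..max_lag]."""
    n = len(x)
    return [sum(x[i] * x[i + k] for i in range(n - k)) / n
            for k in range(max_lag + 1)]

def fit_ar(x, order):
    """Levinson-Durbin: coefficients a[1..order] of x[t] = sum(a[k] * x[t-k]) + e[t]."""
    r = autocorrelation(x, order)
    a = [0.0] * (order + 1)
    err = r[0]
    for m in range(1, order + 1):
        k_m = (r[m] - sum(a[k] * r[m - k] for k in range(1, m))) / err
        prev = a[:]
        a[m] = k_m
        for k in range(1, m):
            a[k] = prev[k] - k_m * prev[m - k]
        err *= 1.0 - k_m * k_m
    return a[1:]

def model_distance(a, b):
    """Stand-in distance: Euclidean distance between AR coefficient vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

# Hypothetical windows: two from the same stationary segment, one after a change.
before = [math.sin(0.9 * t) for t in range(200)]
same = [math.sin(0.9 * t) for t in range(200, 400)]
after = [math.sin(2.0 * t) for t in range(200)]

d_same = model_distance(fit_ar(before, 2), fit_ar(same, 2))
d_change = model_distance(fit_ar(before, 2), fit_ar(after, 2))
```

A change would be flagged when the distance exceeds some threshold; here `d_change` comes out much larger than `d_same`.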
