• Title/Summary/Keyword: Unknown Words

Search Result 69, Processing Time 0.024 seconds

Distributed Representation of Words with Semantic Hierarchical Information (의미적 계층정보를 반영한 단어의 분산 표현)

  • Kim, Minho;Choi, Sungki;Kwon, Hyuk-Chul
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2017.04a
    • /
    • pp.941-944
    • /
    • 2017
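  • In statistical language models based on deep learning, the most important task is the distributed representation of words. A distributed representation expresses the meaning of a word as a vector in a multidimensional space, and is also called a word embedding. Deep-learning-based statistical language models that use word embeddings are known to outperform traditional statistical language models. However, word embeddings cannot escape the data sparseness problem either. In particular, it is important to handle words that do not appear in the training data (unknown words). This paper proposes a word embedding method that uses the semantic hierarchical information of words in order to obtain high-quality Korean word embeddings. The method reuses a previously proposed word embedding approach as-is, but modifies the objective function in the training stage so that it is computed with the hyponyms and synonyms of the input word taken into account, thereby reflecting the semantic hierarchical information of words. In an analogy reasoning test, the word vectors generated by the proposed method achieved 47.90%, an increase of 5% over the existing method.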

A Study on the Detection of Arcing Faults in Transmission Lines and Development of Fault Distance Estimation Software using MATLAB (MATLAB을 이용한 송전선로의 아크사고 검출 및 고장거리 추정 소프트웨어 개발에 관한 연구)

  • Kim, Byeong-Cheon;Park, Nam-Ok;Kim, Dong-Su;Kim, Gil-Hwan
    • The Transactions of the Korean Institute of Electrical Engineers A
    • /
    • v.51 no.4
    • /
    • pp.163-168
    • /
    • 2002
  • This paper presents a new, very efficient numerical algorithm for arcing fault detection and fault distance estimation in transmission lines. It is based on the fundamental differential equations describing the transients on a transmission line before, during, and after the fault occurrence, and on the application of the "Least Error Squares Technique" for estimating the unknown model parameters. If the estimated arc voltage is near zero, the fault is without arc; in other words, it is a permanent fault. If the estimated arc voltage has a high value, the fault is identified as an arcing fault, i.e., a transient fault. In the case of permanent faults, fault distance estimation is necessary. This paper uses a model of the arcing fault in a transmission line, built from a ZnO arrestor and a resistance, to be implemented within EMTP. One purpose of this study is to build a structure for modeling the arcing fault detection and fault distance estimation algorithm using MATLAB programming. The algorithm has been implemented with a graphical user interface (GUI).
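
The least error squares step described above can be illustrated with a minimal Python sketch. The abstract does not give the line model, so the one used here is an assumption: the measured voltage is taken to be `v = d*(r*i + l*di/dt) + u_arc*sign(i)`, with `d` the fault distance and `u_arc` a square-wave arc voltage in phase with the current. The function names, the per-kilometre parameters, and the 100 V classification threshold are all hypothetical.

```python
import math

def sign(x):
    """Return -1, 0, or 1 according to the sign of x."""
    return (x > 0) - (x < 0)

def estimate_fault(v, i, dt, r_per_km, l_per_km):
    """Least-squares fit of the ASSUMED model
    v = d*(r*i + l*di/dt) + u_arc*sign(i); returns (d, u_arc)."""
    a11 = a12 = a22 = b1 = b2 = 0.0
    for k in range(1, len(i) - 1):
        di_dt = (i[k + 1] - i[k - 1]) / (2.0 * dt)  # central difference
        x1 = r_per_km * i[k] + l_per_km * di_dt     # regressor for distance
        x2 = sign(i[k])                             # regressor for arc voltage
        a11 += x1 * x1
        a12 += x1 * x2
        a22 += x2 * x2
        b1 += x1 * v[k]
        b2 += x2 * v[k]
    # Solve the 2x2 normal equations with Cramer's rule.
    det = a11 * a22 - a12 * a12
    distance_km = (b1 * a22 - a12 * b2) / det
    u_arc = (a11 * b2 - a12 * b1) / det
    return distance_km, u_arc

def classify_fault(u_arc, threshold=100.0):
    """Hypothetical threshold: near-zero arc voltage means a permanent fault."""
    return "arcing (transient)" if abs(u_arc) > threshold else "permanent"

# Synthetic data: 50 Hz current, fault at 30 km, 2 kV arc voltage (made up).
dt, w = 1e-4, 2 * math.pi * 50
i = [1000 * math.sin(w * k * dt) for k in range(400)]
di = [1000 * w * math.cos(w * k * dt) for k in range(400)]
v = [30 * (0.03 * a + 1e-3 * b) + 2000 * sign(a) for a, b in zip(i, di)]
distance_km, u_arc = estimate_fault(v, i, dt, r_per_km=0.03, l_per_km=1e-3)
print(round(distance_km, 1), round(u_arc), classify_fault(u_arc))
```

With only two unknowns the normal equations can be solved in closed form. A model with more parameters would normally use a general least-squares solver instead.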

The Convergence Characteristics of The Time-Averaged Distortion in Vector Quantization: Part II. Applications to Testing Trained Codebooks (벡터 양자화에서 시간 평균 왜곡치의 수렴 특성: II. 훈련된 부호책의 검사 기법)

  • Dong Sik Kim
    • Journal of the Korean Institute of Telematics and Electronics B
    • /
    • v.32B no.5
    • /
    • pp.747-755
    • /
    • 1995
  • When codebooks are designed by a clustering algorithm using training sets, a time-averaged distortion, called the inside-training-set distortion (ITSD), is usually calculated in each iteration of the algorithm, since the input probability function is unknown in general. The algorithm stops when the ITSD no longer decreases significantly. Then, in order to test the trained codebook, the outside-training-set distortion (OTSD) is calculated by a time-averaged approximation using the test set. Hence, codebooks that yield small values of the OTSD are regarded as good codebooks. In other words, the OTSD serves as a criterion for testing a trained codebook. However, such an argument is not always true if some conditions are not satisfied. Moreover, it is known that a large test set is generally required to obtain an approximation of the OTSD, and a large test set causes heavy computational complexity. In this paper, based on the analyses in [16], it is shown that a test set only as large as the codebook is sufficient when the codebook size is large. A simple method for testing trained codebooks is then presented. Experimental results on synthetic data and real images supporting the analysis are also provided and discussed.
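
The two time-averaged distortions named above can be computed as in the following minimal Python sketch. It assumes a squared-error distortion measure and nearest-codeword encoding, neither of which the abstract specifies. The codebook and sample values are toy data, and the function names are hypothetical.

```python
def squared_error(x, c):
    """Squared Euclidean distance between a sample and a codeword."""
    return sum((a - b) ** 2 for a, b in zip(x, c))

def average_distortion(codebook, samples):
    """Time-averaged distortion: mean distance of each sample to its
    nearest codeword."""
    total = sum(min(squared_error(x, c) for c in codebook) for x in samples)
    return total / len(samples)

# Toy data (made up for illustration only).
codebook = [(0.0, 0.0), (4.0, 4.0)]
training_set = [(0.0, 1.0), (1.0, 0.0), (4.0, 3.0), (3.0, 4.0)]
test_set = [(0.0, 2.0), (4.0, 2.0)]

itsd = average_distortion(codebook, training_set)  # inside-training-set distortion
otsd = average_distortion(codebook, test_set)      # outside-training-set distortion
print(itsd, otsd)
```

The same function serves for both quantities. Only the sample set changes, which is why the size of the test set directly drives the cost of evaluating the OTSD.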

Customizing an English-Korean Machine Translation System for Patent Translation

  • Choi, Sung-Kwon;Kim, Young-Gil
    • Proceedings of the Korean Society for Language and Information Conference
    • /
    • 2007.11a
    • /
    • pp.105-114
    • /
    • 2007
  • This paper addresses a method for customizing an English-to-Korean machine translation system from the general domain to the patent domain. The customizing method consists of the following steps: 1) linguistically studying the characteristics of patent documents, 2) extracting unknown words from a large collection of patent documents and constructing a large bilingual terminology, 3) extracting and constructing patent-specific translation patterns, 4) customizing the translation engine modules of the existing general MT system according to the linguistic study of the characteristics of patent documents, and 5) evaluating the accuracy of the translation modules and the translation quality. This research was performed under the auspices of the MIC (Ministry of Information and Communication) of the Korean government during 2005-2006. The translation accuracy of the customized English-Korean patent translation system is 82.43% on average in 5 patent fields (machinery, electronics, chemistry, medicine, and computer) according to the evaluation of 7 professional human translators. In 2006, the patent MT system started an online patent MT service at IPAC (International Patent Assistance Center) under MOCIE (Ministry of Commerce, Industry and Energy) in Korea. In 2007, KIPO (Korean Intellectual Property Office) is trying to launch an English-Korean patent MT service.

Attitude Determination Algorithm of LEO Satellites in the Sun-Acquisition Mode (태양획득 모드에서 저궤도 위성의 자세결정 알고리즘)

  • An, Hyo-Seong;Lee, Seon-Ho;Lee, Seung-U;Chae, Jang-Su
    • Journal of the Korean Society for Aeronautical & Space Sciences
    • /
    • v.30 no.1
    • /
    • pp.82-87
    • /
    • 2002
  • Attitude determination in LEO satellites such as KOMPSAT is one of the most important issues for Sun acquisition. In particular, in KOMPSAT, the roll axis direction can be determined, since the sun sensor gives information on the Euler angles for the pitch and yaw axes in Sun-Acquisition mode. In other words, the problem is to determine the directions of the two unknown axes given knowledge of one axis. This paper proposes a new, effective method for attitude determination of general LEO satellites when information on one axis is available, and demonstrates its usefulness through simulation.

A Time Delay-Based Gain Scheduled Control and Its Application to Electromagnetic Suspension System (시간 지연 이득 계획 제어와 자기 부상 시스템에의 응용)

  • Sung, Ho-Kyong;Jho, Jeong-Min;Cho, Heung-Jae;Kim, Dong-Sung
    • Proceedings of the KIEE Conference
    • /
    • 2005.04a
    • /
    • pp.221-225
    • /
    • 2005
  • This paper proposes a gain scheduled control technique using time delay for nonlinear systems with plant uncertainties and unexpected disturbances. The time delay-based gain scheduled control depends on a direct estimation of a function representing the effect of uncertainties. The information from the estimation is used to cancel the unknown dynamics and the unexpected disturbances simultaneously. The proposed estimation scheme, which has a finite convergence time, is formulated in order to estimate the unknown scheduling variable variation. In other words, the time delay-based gain scheduled control uses past observations of the system's response and the control input to directly modify the control actions, rather than adjusting the controller gains or identifying system parameters. It has a simple structure so as to minimize the computational burden. The benefits of the proposed scheme are demonstrated in the simulation of an electromagnetic suspension system with plant uncertainties and external disturbances, and the proposed controller is compared with a conventional state feedback controller.

A comparison of phonological error patterns in the single word and spontaneous speech of children with speech sound disorders (말소리장애 아동의 단어와 자발화 문맥의 음운오류패턴 비교)

  • Park, Kayeon;Kim, Soo-Jin
    • Phonetics and Speech Sciences
    • /
    • v.7 no.3
    • /
    • pp.165-173
    • /
    • 2015
  • This study aimed to compare the phonological error patterns and PCC (Percentage of Correct Consonants) derived from the single-word and spontaneous speech contexts of children with speech sound disorders of unknown origin (SSD). It reports the developmental and non-developmental phonological error patterns of the target children according to speech context. The subjects were 15 children with SSD aged 3 to 5 years. The study used 37 words from the APAC (Assessment of Phonology & Articulation for Children) in the single-word context and 100 eojeol in the spontaneous speech context. There was no difference in PCC between the single-word and spontaneous speech contexts. The developmental phonological error patterns that differed significantly between the two contexts were syllable deletion, word-medial onset deletion, liquid deletion, gliding, affrication, other fricative errors, tensing, and regressive assimilation. The non-developmental phonological error patterns that differed significantly were backing, addition of phoneme, and aspirating. In summary, PCC did not differ between the elicited single-word context and the spontaneous conversational context, but some phonological error patterns differed between the two contexts. The error patterns of the spontaneous speech context are the more important intervention targets for immediate generalization and for raising overall intelligibility.

Spiritual Care for Children with Cancer (소아암 환아의 영적 케어)

  • Sin, Min-Seon
    • Korean Journal of Hospice Care
    • /
    • v.5 no.2
    • /
    • pp.54-63
    • /
    • 2005
  • The purpose of this study is to examine the requirement for child life support specialists and fetal education for children with cancer. The research is composed of three chapters. In the first chapter, I present the purpose, scope, and definitions of this research. In the second chapter, I describe hospice care services for children with cancer and the kinds of pediatric cancer, as well as the general characteristics of children with cancer, their understanding of death, and dietary therapy. Lastly, I define and investigate spiritual care. In the third chapter, I conclude with some findings and final suggestions based on the results. Depending on their developmental stage, children with cancer have limited communication competence and depend more on their parents, which makes parents' decision making more difficult. Parents with a child who suffers from cancer need counseling in order to discover the meaning of life. Parents' psychological experience of caring for a child with pediatric cancer amounted to heartbreak, as the shadow of the child's death fell over them from time to time. In other words, parents with a child who suffers from cancer need comprehensive services such as hospice care and counselors, as well as widely experienced pediatricians and nurses. Child life support specialists can help them recover and improve their own potential strength in order to overcome their difficulties. Pastoral counseling can help them reduce their fear and anxiety about the unknown world and death. A systematically developed school-based counseling program would help children adjust to difficulties after a complete cure, because children adjust well to school when they have good peer relationships.

A Study on the Cause of Nausea and Vomiting of Pregnancy in Relation to Fetal Development (태아(胎兒)의 발달과정(發達過程)에서 찾아본 악조(惡阻)의 원인(原因)에 대한 고찰(考察))

  • Yoon, Eunkyung;Kim, Jong-hyun
    • Journal of Korean Medical classics
    • /
    • v.32 no.2
    • /
    • pp.147-163
    • /
    • 2019
  • Objectives : Morning sickness, or Nausea and Vomiting of Pregnancy (NVP), is a phenomenon frequently experienced by pregnant women, and its cause is still unknown. While the key trait of this symptom is its temporality, this is hardly considered in existing studies on the cause of NVP based on Korean Medical (KM) literature. We hope to remedy this. Methods : We looked for contents on fetal development in Korean Medical literature from the Siku Quanshu as well as other key literature of KM, and examined the results together with contents on NVP to find any correlation. Results : We found that the early stage, namely the third month, marks a significant change in the course of fetal development, in which the fetus's own Shen(神) is first developed by the work of the mother's Heart(心). In other words, the third month is when the mother's and the child's Shen first encounter each other. Conclusions : We hypothesized that NVP, whose symptoms are closely linked to the functions of the Heart, is likely to be related to this event. This is supported by the involvement of the Heart in both fetal development and NVP during the third month of pregnancy.

Part-of-speech Tagging for Hindi Corpus in Poor Resource Scenario

  • Modi, Deepa;Nain, Neeta;Nehra, Maninder
    • Journal of Multimedia Information System
    • /
    • v.5 no.3
    • /
    • pp.147-154
    • /
    • 2018
  • Natural language processing (NLP) is an emerging research area that studies how machines can be used to perceive and alter text written in natural languages. Different tasks can be performed on natural languages by analyzing them through various annotation tasks such as parsing, chunking, part-of-speech tagging, and lexical analysis. These annotation tasks depend on the morphological structure of the particular natural language. The focus of this work is part-of-speech tagging (POS tagging) for the Hindi language. Part-of-speech tagging, also known as grammatical tagging, is the process of assigning a grammatical category to each word of a given text. These grammatical categories can be noun, verb, time, date, number, etc. Hindi is the most widely used and official language of India. It is also among the top five most spoken languages of the world. For English and other languages, a diverse range of POS taggers is available, but these POS taggers cannot be applied to Hindi, as Hindi is one of the most morphologically rich languages. Furthermore, there is a significant difference between the morphological structures of these languages. Thus, in this work, a POS tagger system is presented for the Hindi language. A hybrid approach is presented for Hindi POS tagging, which combines probability-based and rule-based approaches. For tagging known words, a unigram model of the probability class is used, whereas for tagging unknown words, various lexical and contextual features are used. Various finite state automata are constructed to represent different rules, and regular expressions are then used to implement these rules. A tagset is also prepared for this task, which contains 29 standard part-of-speech tags. The tagset also includes two unique tags, i.e., a date tag and a time tag. These date and time tags support all possible formats. Regular expressions are used to implement all pattern-based tags such as time, date, number, and special symbols. The aim of the presented approach is to increase the correctness of automatic Hindi POS tagging while limiting the need for a large human-made corpus. This hybrid approach uses a probability-based model to increase automatic tagging and a rule-based model to limit the need for an already trained corpus. The approach is based on a very small labeled training set (around 9,000 words) and yields a best precision of 96.54% and an average precision of 95.08%. It also yields a best accuracy of 91.39% and an average accuracy of 88.15%.
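
The hybrid scheme described above, a unigram model for known words plus regular expressions for pattern-based tags, can be sketched roughly as follows in Python. This is an illustration under assumptions, not the authors' system. The tag names, the regular expressions, the order in which the rules are tried, the noun fallback for unknown words, and the romanized toy training data are all hypothetical, and the lexical and contextual features the paper uses for unknown words are not modeled.

```python
import re
from collections import Counter, defaultdict

# Hypothetical pattern-based tags; the paper's actual patterns are not given.
PATTERN_TAGS = [
    ("DATE", re.compile(r"\d{1,2}[/-]\d{1,2}[/-]\d{2,4}")),
    ("TIME", re.compile(r"\d{1,2}:\d{2}(:\d{2})?")),
    ("NUM", re.compile(r"\d+(\.\d+)?")),
]

def train_unigram(tagged_words):
    """Map each known word to its most frequent tag in the training data."""
    counts = defaultdict(Counter)
    for word, tag in tagged_words:
        counts[word][tag] += 1
    return {word: c.most_common(1)[0][0] for word, c in counts.items()}

def tag_tokens(tokens, unigram, default_tag="NN"):
    """Known words use the unigram model; other tokens try the regex rules,
    then fall back to default_tag (an assumed, simplistic fallback)."""
    tagged = []
    for token in tokens:
        if token in unigram:
            tagged.append((token, unigram[token]))
            continue
        for name, pattern in PATTERN_TAGS:
            if pattern.fullmatch(token):
                tagged.append((token, name))
                break
        else:
            tagged.append((token, default_tag))
    return tagged

# Toy romanized training data (made up for illustration only).
unigram = train_unigram([("ghar", "NN"), ("ghar", "NN"), ("ghar", "VB"), ("jana", "VB")])
result = tag_tokens(["ghar", "jana", "12/05/2018", "10:30", "42", "xyz"], unigram)
print(result)
```

Keeping the pattern rules in an ordered list means more specific patterns (date, time) are tried before the general number pattern, so a token such as `10:30` is never consumed by a looser rule.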