• Title/Summary/Keyword: Standard Korean Dictionary

Search Result 79, Processing Time 0.021 seconds

Feature Classification of Hanguel Patterns by Distance Transformation method (거리변환법에 의한 한글패턴의 특징분류)

  • Koh, Chan;Lee, Dai-Young
    • The Journal of Korean Institute of Communications and Information Sciences / v.14 no.6 / pp.650-662 / 1989
  • In this paper, a new algorithm for feature extraction and classification in Hanguel pattern recognition is proposed. Input patterns are classified into six basic formal patterns and divided into subregions of Hanguel phonemes, and the crook feature is extracted from the position information of each subregion. Hanguel patterns are defined using these crook feature points and stored in an indexed-sequence file. Hanguel patterns are recognized by retrieving two files: the feature indexed-sequence file and the standard dictionary file. This paper shows that the algorithm is very simple and makes the software system easy to construct. Experimental results present the output of feature extraction and the grouping of input patterns. The proposed algorithm extracts the crook feature using the distance transformation method within the rectangle enclosing each character, using relative position information. It achieves a recognition rate of 97%.
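
The abstract above relies on a distance transformation inside the rectangle that encloses each character. As a rough illustration only, the following sketch computes a two-pass city-block distance transform on a small binary grid. The function name `distance_transform`, the choice of the city-block metric, and the toy `pattern` are assumptions made for this example; the paper's own crook-feature extraction is not reproduced here.

```python
def distance_transform(grid):
    """City-block distance from each foreground cell (1) to the nearest background cell (0)."""
    rows, cols = len(grid), len(grid[0])
    far = rows + cols  # larger than any city-block distance inside the grid
    dist = [[0 if grid[r][c] == 0 else far for c in range(cols)] for r in range(rows)]

    # Forward pass: propagate distances from the top and left neighbours.
    for r in range(rows):
        for c in range(cols):
            if dist[r][c]:
                if r > 0:
                    dist[r][c] = min(dist[r][c], dist[r - 1][c] + 1)
                if c > 0:
                    dist[r][c] = min(dist[r][c], dist[r][c - 1] + 1)

    # Backward pass: propagate distances from the bottom and right neighbours.
    for r in range(rows - 1, -1, -1):
        for c in range(cols - 1, -1, -1):
            if dist[r][c]:
                if r < rows - 1:
                    dist[r][c] = min(dist[r][c], dist[r + 1][c] + 1)
                if c < cols - 1:
                    dist[r][c] = min(dist[r][c], dist[r][c + 1] + 1)
    return dist


# Hypothetical 5x5 pattern: a filled 3x3 block surrounded by background.
pattern = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
result = distance_transform(pattern)
for row in result:
    print(row)
```

Cells deep inside a stroke receive larger values than cells on its edge, which is the kind of positional information a feature extractor could build on.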

A Study on System for International Standard(IS) based Clinical Information Management (국제표준 기반의 임상정보 관리체계 구축에 관한 연구)

  • Choi, Yongjung
    • Proceedings of the Korean Society of Computer Information Conference / 2014.01a / pp.429-432 / 2014
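  • To strengthen the competitiveness of the domestic pharmaceutical industry, the system urgently needs improvement so that the review/approval period for new drugs can be shortened, giving companies the opportunity to secure a competitive position in the rapidly changing global pharmaceutical market. New drug approval requires a review of the safety and efficacy of clinical trial results. At present, however, pharmaceutical companies and contract research organizations (CROs) submit various types of clinical information data to the reviewing agency without following standards for the data information structure, such as Domain, Variable, and Parameter, which lengthens the review period and makes review work inefficient. This study therefore proposes a domestic clinical trial information management system (eCTD system) and a full life-cycle drug management system that conform to the CDISC standards, the global clinical data standards established by the international non-governmental organization CDISC (Clinical Data Interchange Standards Consortium). As expected effects, building an international-standard clinical information management infrastructure would create an environment for domestic new drug development and overseas expansion, providing an opportunity to gain an early foothold in the global market, and at the regulatory level it would increase the efficiency of drug approval and review work and establish a full life-cycle drug safety management system.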

Development of Online Fashion Thesaurus and Taxonomy for Text Mining (텍스트마이닝을 위한 패션 속성 분류체계 및 말뭉치 웹사전 구축)

  • Seyoon Jang;Ha Youn Kim;Songmee Kim;Woojin Choi;Jin Jeong;Yuri Lee
    • Journal of the Korean Society of Clothing and Textiles / v.46 no.6 / pp.1142-1160 / 2022
  • Text data plays a significant role in understanding and analyzing trends in consumer, business, and social sectors. Text analysis requires a corpus that reflects specific domain knowledge. However, in the field of fashion, professional corpora are insufficient. This study aims to develop a taxonomy and thesaurus that take the specialized nature of fashion products into account. To this end, about 100,000 fashion vocabulary terms were collected by crawling text data from WSGN, Pantone, and online platforms; text was subsequently extracted through preprocessing with Python. The taxonomy was composed of seven attributes of clothes: items, silhouettes, details, styles, colors, textiles, and patterns/prints. The corpus was completed by processing synonyms of terms from fashion books such as dictionaries. Finally, 10,294 vocabulary words, including 1,956 standard Korean words, were classified in the taxonomy. All data was then developed into a web dictionary system. Quantitative and qualitative performance tests of the results were conducted through expert reviews. The performance of the thesaurus was also verified by comparing text mining results against those obtained with a previously developed corpus. This study contributes to achieving a text data standard and enables meaningful results of text mining analysis in the fashion field.

Construction of Korean Verb Wordnet Using Preexisting Noun Wordnet and Monolingual Dictionary (명사 워드넷과 단일어 사전을 이용한 한국어 동사 워드넷 구축)

  • Lee, Ju-Ho;Bae, Hee-Suk;Kim, Eun-Hye;Kim, Hye-Kyong;Choi, Key-Sun
    • Annual Conference on Human and Language Technology / 2002.10e / pp.92-97 / 2002
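  • High-level natural language processing systems, such as semantics-based information retrieval, natural language question answering, automatic knowledge acquisition, and discourse processing, require a large-scale knowledge base for semantic processing. The most basic of these knowledge bases is a wordnet. With a wordnet, semantic similarity between senses can be computed and properties can be inherited, which makes it useful for handling senses with similar properties together. This paper presents a method for constructing a Korean verb wordnet from a basic vocabulary, using an existing noun wordnet and a monolingual dictionary. The verb wordnet built in the first phase of this work contains 4,717 senses for 1,757 verbs (5,235 senses in total when duplicates are counted). In addition, 571 senses belonging to 14 concepts in which senses were especially concentrated were reclassified into 53 sub-concepts, yielding a final verb wordnet composed of 767 hierarchical concepts.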
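
The abstract above notes that a wordnet lets one compute semantic similarity between senses through its concept hierarchy. A minimal sketch of one common path-based measure follows, assuming a tiny hand-made hierarchy. The `HYPERNYM` table, the English concept names, and the function names are hypothetical and are not taken from the paper's verb wordnet.

```python
# Hypothetical toy hierarchy: child concept -> parent concept.
HYPERNYM = {
    "walk": "move",
    "run": "move",
    "move": "act",
    "eat": "consume",
    "consume": "act",
}


def ancestors(concept):
    """Return the chain from a concept up to the root, starting with the concept itself."""
    chain = [concept]
    while chain[-1] in HYPERNYM:
        chain.append(HYPERNYM[chain[-1]])
    return chain


def path_similarity(a, b):
    """Return 1 / (1 + edges on the path joining a and b through their lowest shared ancestor)."""
    chain_a, chain_b = ancestors(a), ancestors(b)
    depth_in_b = {concept: i for i, concept in enumerate(chain_b)}
    for i, concept in enumerate(chain_a):
        if concept in depth_in_b:
            return 1.0 / (1 + i + depth_in_b[concept])
    return 0.0  # no shared ancestor


print(path_similarity("walk", "run"))
print(path_similarity("walk", "eat"))
```

Senses that share a nearby parent ("walk" and "run" under "move") score higher than senses that meet only at the root, which is how a hierarchy supports grouping senses with similar properties.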

A Study on the New Trends of EDI based Internet (인터넷을 기반으로 하는 EDI 신조류)

  • 조원길
    • The Journal of Information Technology / v.4 no.1 / pp.125-139 / 2001
  • EDI (Electronic Data Interchange) works by providing a collection of standard message formats and an element dictionary, giving businesses a simple way to exchange data via any electronic messaging service. Open-edi is electronic data interchange among autonomous parties using public standards and aiming towards interoperability over time, business sectors, information technology, and data types. The number of Internet services using XML/EDI has grown rapidly because it is easily extensible and exchangeable. To use such a service, the client does not have to install EDI software but needs only an Internet browser. Consequently, handling the trading process in an office has become much easier and faster. eXedi (eBusiness XML (extensible markup language) electronic data interchange) is a service that realizes B2B XML/EDI. eXedi can be used easily in small and medium-sized companies, and companies in any location can access it over an existing Internet connection. XML/EDI provides a standard framework for exchanging different types of data (for example, an invoice, a healthcare claim, or a project status) so that the information, whether it is in a transaction, exchanged via an Application Program Interface (API), web automation, database portal, catalog, workflow document, or message, can be searched, decoded, manipulated, and displayed consistently and correctly. This is done by first implementing EDI dictionaries and then extending the vocabulary via online repositories to include business language, rules, and objects.

Verifier-Based Multi-Party Password-Authenticated Key Exchange for Secure Content Transmission (그룹 사용자간 안전한 콘텐츠 전송을 위한 검증자를 이용한 패스워드 기반 다자간 키 교환 프로토콜)

  • Kwon, Jeong-Ok;Jeong, Ik-Rae;Choi, Jae-Tark;Lee, Dong-Hoon
    • Journal of Broadcast Engineering / v.13 no.2 / pp.251-260 / 2008
  • In this paper, we present two verifier-based multi-party PAKE (password-authenticated key exchange) protocols. The shared key can be used for secure content transmission. The suggested protocols are secure against server compromise attacks. Our first protocol is designed to provide forward secrecy and security against known-key attacks. The second protocol is designed to additionally provide key secrecy against the server, which means that even the server cannot learn the session keys of the users in a group. The suggested protocols have a constant number of rounds and are provably secure in the standard model. To the best of our knowledge, the proposed protocols are the first multi-party PAKE protocols in the literature that are secure against server compromise attacks.

Statistical Analysis of Korean Phonological Variations Using a Grapheme-to-phoneme System (발음열 자동 생성기를 이용한 한국어 음운 변화 현상의 통계적 분석)

  • 이경님;정민화
    • The Journal of the Acoustical Society of Korea / v.21 no.7 / pp.656-664 / 2002
  • We present a statistical analysis of Korean phonological variations using a Grapheme-to-Phoneme (GTP) system. The GTP system used for the experiments generates pronunciation variants by applying rules that model obligatory and optional phonemic changes and allophonic changes. These rules are derived from morphophonological analysis and the government's standard pronunciation rules. The GTP system is optimized for continuous speech recognition by generating phonetic transcriptions for training and constructing a pronunciation dictionary for recognition. In this paper, we describe Korean phonological variations by analyzing the statistics of phonemic change rule applications for the 60,000 sentences in the Samsung PBS Speech DB. Our results show that the most frequent obligatory phonemic variations are, in order, liaison, tensification, aspiration, and nasalization of obstruents, and that the most frequent optional phonemic variations are, in order, initial consonant h-deletion, insertion of a final consonant with the same place of articulation as the next consonant, and deletion of a final consonant with the same place of articulation as the next consonant. These statistics can be used to improve the performance of speech recognition systems.

Analysis of Terms on Panel Descriptions of the Domain for Astronomy at the Gwacheon National Science Museum (국립과천과학관의 천문영역 패널 설명의 용어 분석)

  • Yun, Hye-Ryun;Sohn, Jungjoo
    • Journal of Science Education / v.36 no.2 / pp.329-340 / 2012
  • The purpose of this study is to analyze the terms used in the panels accompanying the astronomy exhibits at the Gwacheon National Science Museum, and to determine whether the terms are appropriate and easily understandable. In total, 965 terms were collected from 52 panels (14 panels in the planetarium, 17 panels in the national history part, and 21 panels in the traditional science part). All terms were categorized into four types, (1) Standard/Scientific terms, (2) Non-Standard/Scientific terms, (3) Standard/Non-Scientific terms, and (4) Non-Standard/Non-Scientific casual words, based on the 'Dictionary of Standard Korean' and the 'Terminology of Astronomy'. A questionnaire survey was also administered to 24 in-service teachers at elementary, middle, and high schools to determine whether the level of the terms is appropriate for students. The results show that accurate scientific terms accounted for 68.5%, and that many students had difficulty understanding the scientific terms in the panels because of unfamiliarity. Therefore, to increase students' interest and understanding, it is proposed to minimize scientific terms and to replace them with casual terms related to everyday life.

Review of Fish Name on the Fishes of the Family Mugilidae in Korea and Resource Utilization (우리나라 숭어과 어류의 어명 및 자원 활용에 대한 고찰)

  • Ko, Eun Young;Park, Jong Oh;Lee, Kyoung Seon
    • Journal of Marine Life Science / v.4 no.2 / pp.96-105 / 2019
  • Fishes of the family Mugilidae are common euryhaline species that live in habitats ranging from coastal marine waters to freshwater areas. The taxonomy and nomenclature of the Mugilidae remain unresolved because of their morphological similarities. Among the Mugilidae, the species most commonly consumed in Korea are grey mullet (Mugil cephalus) and redlip mullet (Chelon haematocheilus). Both are generally called 'mullet' without distinction. Therefore, the aim of this study is to examine the scientific names and common names of mullet species used in Korea, drawing on domestic journals and old Korean documents. The scientific name of grey mullet is M. cephalus, and that of redlip mullet is C. haematocheilus, but the genus of redlip mullet is still given variously as Chelon, Mugil, or Liza. The standard names of the two mullets are not distinguished in the Korean dictionary, but they are clearly distinguished in Japanese, English, and Chinese dictionaries. In ancient Korean references, the mullet was called 'Chieo' or 'Sueo'. In most of the old literature, the distinction between grey mullet and redlip mullet is not clear. In Jasaneobo, however, grey mullet and redlip mullet were recorded separately, although the use of 'ga' differed from present usage. The Korean standard name of redlip mullet is 'gasungeo'; however, fishermen in Jeollado and Gyoungsangdo call it 'chamsungeo'. Considering the negative perception of the character 'ga', it is proposed to use 'cham (眞)' instead of 'ga (假)' to improve the economic value of redlip mullet.

Sensitivity Identification Method for New Words of Social Media based on Naive Bayes Classification (나이브 베이즈 기반 소셜 미디어 상의 신조어 감성 판별 기법)

  • Kim, Jeong In;Park, Sang Jin;Kim, Hyoung Ju;Choi, Jun Ho;Kim, Han Il;Kim, Pan Koo
    • Smart Media Journal / v.9 no.1 / pp.51-59 / 2020
  • From PC communication to the development of the Internet, new terms have been coined on social media; with the spread of smartphones a social media culture has formed, and newly coined words are becoming part of that culture. With social networking sites and smartphones serving as a bridge, the amount of data has increased in real time. New words can have many advantages, including allowing short sentences that work around the length limits of various messengers and reduce data. However, new words have no dictionary meaning, which limits and degrades algorithms such as those used in data mining. Therefore, in this paper, the opinion of a document is identified by collecting data through web crawling, extracting the new words contained in the text data, and establishing an emotional classification. The experiment proceeds in three steps. First, new words collected from social media are learned as positive or negative. Next, to derive and verify emotional values using standard documents, TF-IDF is used to score the sentiment of nouns and assign emotional values to the data. As with the new words, the classified emotional values are applied to verify that emotions are classified in standard-language documents. Finally, the newly coined words and the standard emotional values are combined to perform a comparative analysis of the technique.
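
The first step described above, learning whether a newly coined word signals positive or negative sentiment, can be sketched with a small multinomial Naive Bayes classifier. This is an illustration under stated assumptions: the training tokens, the labels, the Laplace smoothing, and the function names `train` and `classify` are all hypothetical, and the TF-IDF scoring step mentioned in the abstract is not shown.

```python
import math
from collections import Counter, defaultdict


def train(docs):
    """Count tokens per label from a list of (tokens, label) pairs."""
    token_counts = defaultdict(Counter)
    label_counts = Counter()
    vocab = set()
    for tokens, label in docs:
        label_counts[label] += 1
        token_counts[label].update(tokens)
        vocab.update(tokens)
    return token_counts, label_counts, vocab


def classify(tokens, model):
    """Return the label with the highest log posterior, using Laplace smoothing."""
    token_counts, label_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in label_counts:
        score = math.log(label_counts[label] / total_docs)  # log prior
        total_tokens = sum(token_counts[label].values())
        for token in tokens:
            count = token_counts[label][token]  # 0 for unseen tokens
            score += math.log((count + 1) / (total_tokens + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical training posts; "lit" and "cringe" stand in for newly coined words.
training_docs = [
    (["lit", "great"], "positive"),
    (["lit", "fun"], "positive"),
    (["cringe", "boring"], "negative"),
    (["cringe", "bad"], "negative"),
]
model = train(training_docs)
print(classify(["lit"], model))
print(classify(["cringe", "fun"], model))
```

Because the new words have no dictionary entry, their polarity here comes only from the labeled posts they appear in, which mirrors the motivation given in the abstract.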