• Title/Summary/Keyword: vocabulary data (어휘 데이터)

A Study on Facial Visualization System based on one's Personality applied with the Oriental Physiognomy (동양 관상학을 적용한 성격별 얼굴 설계 시스템에 관한 연구)

  • Kang, Seon-Hee;Kim, Hyo-D.;Lee, Kyung-Won
    • Proceedings of the HCI Society of Korea Conference (한국HCI학회 학술대회논문집) / 2008.02b / pp.346-357 / 2008
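  • Physiognomy is the study of methods for judging a person's fate, personality, lifespan, and the like from the face. The physiognomy discussed in this paper is Oriental physiognomy, specifically the discipline that predicts personality and fate from the partial features of the face and their overall harmony. This study concerns the construction of a personality-based face design system that applies Oriental physiognomy. First, to obtain a general classification of personality, the 161 personality words used in the MBTI were reduced to 39 representative words through cluster analysis. To represent the semantic distances among the extracted representative words, survey data were analyzed with multidimensional scaling, and the relationships among the personality words were mapped onto a two-dimensional space. Second, for face visualization, the elements that determine the morphological characteristics of a face were first classified into six groups (face shape, eyes, nose, mouth, forehead, and eyebrows). The personality associated with each of the 29 sub-elements of these six facial forms was then organized according to physiognomy, using the facial characteristics of Koreans as the reference, and encoded numerically. The facial element shapes for each representative personality word were combined into a single face according to this code, yielding 39 visualized faces. Finally, the personality-based face design system 'FACE' was built. By producing a system that renders a face shape matching a person's personality traits, this study can offer objective help to general users as well as animation character developers, and it shows the possibility of widening the range of application of traditional physiognomy.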

Study on Effective Extraction of New Coined Vocabulary from Political Domain Article and News Comment (정치 도메인에서 신조어휘의 효과적인 추출 및 의미 분석에 대한 연구)

  • Lee, Jihyun;Kim, Jaehong;Cho, Yesung;Lee, Mingu;Choi, Hyebong
    • The Journal of the Convergence on Culture Technology / v.7 no.2 / pp.149-156 / 2021
  • Text mining is a useful tool for discovering public opinion and perception regarding political issues from big data. Social media users very commonly express their opinions with newly coined words such as slang and emoji. However, those new words are not effectively captured by traditional text mining methods that process text data using a language dictionary. In this study, we propose effective methods to extract newly coined words that connote the political stance and opinion of users. With various text mining techniques, we attempt to discover the context and the political meaning of the new words.

Evaluation of Knowledge Graph for Interoperating Digital Records (디지털 기록의 상호운용을 위한 지식그래프의 평가)

  • Haram Park;Haklae Kim
    • Journal of Korean Society of Archives and Records Management / v.23 no.4 / pp.159-178 / 2023
  • A digital archive is an online platform for preserving and utilizing digital records worthy of continued preservation. However, there are no shared standards for functionality, metadata, or data technical principles across digital archives in Korea. These issues create challenges in linking distributed digital records. This study proposes a common vocabulary for digital archives to enhance the interoperability of digital records and evaluates the interoperability of the digital archive built with the common vocabulary. We collect and analyze data from the digital archive on the Korean financial crisis of 1997 to construct a knowledge graph and compare its interoperability with the knowledge graph built with RiC-O. The archive and the knowledge graph underwent evaluation using the FAIR data principles evaluation framework. The constructed knowledge graph links various objects in the archive and provides contextual information to aid in understanding the archive. The results demonstrate that a knowledge graph built with a common vocabulary significantly improves the linkage, search, and interoperability of digital records compared to a traditional archive.

Automatic Construction of Korean Two-level Lexicon using Lexical and Morphological Information (어휘 및 형태 정보를 이용한 한국어 Two-level 어휘사전 자동 구축)

  • Kim, Bogyum;Lee, Jae Sung
    • KIPS Transactions on Software and Data Engineering / v.2 no.12 / pp.865-872 / 2013
  • The two-level morphological analysis method is a rule-based approach to morphological analysis. It handles morphological transformations with rules and analyzes words using the morpheme connection information stored in a lexicon. The method is language-independent, and a Korean two-level system has also been developed. However, that system was of limited practical use because it relied on a very small, manually built lexicon, and it also suffered from an over-generation problem. In this paper, we propose a method for automatically constructing a Korean two-level lexicon for PC-KIMMO from a morpheme-tagged corpus. We also propose a method that addresses the over-generation problem using lexical information and sub-tags. In the experiment, the proposed method reduced over-generation by 68% compared with the previous method, and the f-measure increased from 39% to 65%.
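
    The f-measure reported in the abstract above combines precision and recall. As a minimal sketch (not the authors' evaluation code), the function below computes it from raw counts; the function name and the example counts are hypothetical. Over-generated analyses would count as false positives, which lowers precision and therefore the f-measure.

```python
def f_measure(true_pos: int, false_pos: int, false_neg: int, beta: float = 1.0) -> float:
    """Return the F-beta score computed from raw counts.

    Over-generated (spurious) analyses are counted as false positives.
    The counts passed in are illustrative inputs, not data from the paper.
    """
    predicted = true_pos + false_pos
    actual = true_pos + false_neg
    precision = true_pos / predicted if predicted else 0.0
    recall = true_pos / actual if actual else 0.0
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


if __name__ == "__main__":
    # Hypothetical counts: 8 correct analyses, 2 spurious, 2 missed.
    print(f_measure(8, 2, 2))
```

    With `beta=1.0` this is the harmonic mean of precision and recall, the most common form of the f-measure; the abstract does not say which weighting the authors used.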

Vocabulary Recognition Performance Improvement using a convergence of Bayesian Method for Parameter Estimation and Bhattacharyya Algorithm Model (모수 추정을 위한 베이시안 기법과 바타차랴 알고리즘을 융합한 어휘 인식 성능 향상)

  • Oh, Sang-Yeob
    • Journal of Digital Convergence / v.13 no.10 / pp.353-358 / 2015
  • A vocabulary recognition system built to recognize a standard vocabulary shows reduced recognition accuracy for words that fall outside the standard vocabulary or that resemble its entries. In such cases, one remedy is to reconstruct the system so that its vocabulary range is added to or extended. This paper proposes a Bhattacharyya algorithm configured on top of a speech recognition learning model that uses Bayesian methods, reflecting parameter estimation in the scalability of the model configuration. A corrected standard model is recognized from phoneme characteristics, using Bayesian methods to estimate the parameters of the phoneme data and the Bhattacharyya algorithm to handle similar models. Recognition performance is evaluated with the recognition model configured by the Bhattacharyya algorithm. Applying the proposed method yielded a recognition rate of 97.3% and a learning curve of 1.2 seconds.
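
    The abstract above does not spell out its algorithm, so the following is only a sketch of the standard Bhattacharyya measure that such a similarity step would plausibly build on, here for two discrete probability distributions. The function names and the sample distributions are assumptions for illustration and are not taken from the paper.

```python
import math
from typing import Sequence


def bhattacharyya_coefficient(p: Sequence[float], q: Sequence[float]) -> float:
    """Sum of sqrt(p_i * q_i): 1.0 for identical distributions, 0.0 for disjoint ones."""
    if len(p) != len(q):
        raise ValueError("distributions must have the same number of bins")
    return sum(math.sqrt(a * b) for a, b in zip(p, q))


def bhattacharyya_distance(p: Sequence[float], q: Sequence[float]) -> float:
    """Negative log of the coefficient; smaller values mean more similar models."""
    # Clamp to 1.0 so floating-point rounding cannot produce a negative distance.
    bc = min(bhattacharyya_coefficient(p, q), 1.0)
    if bc == 0.0:
        return math.inf
    return -math.log(bc)


if __name__ == "__main__":
    # Hypothetical feature histograms for two phoneme models.
    model_a = [0.5, 0.25, 0.25]
    model_b = [0.25, 0.5, 0.25]
    print(bhattacharyya_distance(model_a, model_b))
```

    A system of this kind could rank candidate models by this distance and treat the closest one as the "similar model"; whether the paper does exactly that is not stated in the abstract.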

Comparing the Usages of Vocabulary by Medias for Disaster Safety Terminology Construction (재난안전 용어사전 구축을 위한 미디어별 어휘 사용 양상 비교)

  • Lee, Jung-Eun;Kim, Tae-Young;Oh, Hyo-Jung
    • KIPS Transactions on Software and Data Engineering / v.7 no.6 / pp.229-238 / 2018
  • Rapid response to disasters and accidents can be achieved through the organic involvement of various disaster and safety control agencies. Defining disaster safety terminology is essential for communication between disaster safety agencies as well as for announcements to the public. To construct a dictionary of disaster safety terminology efficiently, it is also necessary to define the priority of the terms. To establish a direction for dictionary construction, this paper compares the usage of disaster safety terminology across three kinds of media: word dictionaries, new media, and social media. Based on the terminology resources collected from each medium, we visualized the distribution of terminology according to frequency weights and analyzed co-occurrence patterns. We also classified the terminology into four categories and proposed priorities for constructing a disaster safety dictionary.

Design of Ontology Object Model Generation System (온톨로지 객체 모델 생성 시스템 설계)

  • Park, Cheon-Shu;Lee, Mi-Kyoung;Sohn, Joo-Chan;Ham, Ho-Sang
    • Proceedings of the Korea Information Processing Society Conference / 2003.11b / pp.1297-1300 / 2003
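  • This paper presents a system for generating an ontology object model that can access, represent, and process web ontology data. With the rise of the Semantic Web, the ways of accessing data on the web have diversified according to the characteristics of the data. Accordingly, the system gathers knowledge scattered across the web, creates new ontologies suited to each domain, and uses layered vocabularies to handle ontologies expressed in different languages. It provides an ontology object model that can build and process web ontologies for handling knowledge in a Semantic Web environment, and it exchanges information with external applications through the ontology object model API. The model for representing web ontologies is divided into three layers: a frame-based upper ontology (frame-based ontology layer), a core ontology expressing common vocabulary that can also be used in other domains (generic ontology layer), and a functional ontology expressing vocabulary that depends on each ontology language (functional ontology layer). This layering removes duplicated representation and improves reusability, so that external applications such as ontology inference, merging, and authoring tools can easily access the ontology object model, and knowledge in an ontology can be represented and handled easily.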

Text Categorization Using Co-Trained Support Vector Machines (Co-Trained Support Vector Machines을 이용한 문서분류)

  • 박성배;장병탁
    • Proceedings of the Korean Information Science Society Conference / 2002.04b / pp.259-261 / 2002
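  • Most automatic text categorization systems consider only the distribution of the words used in a document and ignore syntactic information, another important source of information. This paper presents a method for classifying large collections of unstructured documents by using both syntactic and lexical information. To this end, it uses co-training, a partially supervised learning algorithm that requires two independent views of the training data. Because lexical information and syntactic information can each serve as an independent view of a document, combining these two kinds of information with unlabeled documents can improve classification performance. Experimental results on the Reuters-21578 collection and the TREC-7 filtering collection demonstrate the effectiveness of the proposed method.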

A Case Study of Untact Lecture on Albert Camus' La Peste using Big Data (빅데이터를 활용한 『페스트』(알베르 카뮈) 비대면 문학 강의 운영 사례 연구)

  • MIN, Jinyoung
    • The Journal of the Convergence on Culture Technology / v.7 no.4 / pp.59-65 / 2021
  • This is a case study on teaching Albert Camus' La Peste, which has regained popularity in the post-COVID era, with big data analysis tools in both major and elective classes. First, we asked students majoring in French to use big data analysis to compare vocabulary usage and the number of appearances of each character across about 400 pages of the original text. As a result, we were able to confirm a relationship between Camus' Absurdism and the vocabulary used in La Peste, and we noted the high frequency of characters who resist. Students in elective classes were asked to read a Korean translation and determine the frequency of vocabulary and of characters' appearances. Students related strongly to La Peste because of the parallels between COVID and the plague in the novel. We also received high levels of class satisfaction regarding the use of big data analysis tools. The students responded positively both to the choice of La Peste as the work of literature and to the use of big data, the main tool of the Fourth Industrial Revolution. We found that good results are achievable even in a non-contact environment, as long as the literature lecture does not rely on traditional methods but instead reflects current circumstances.

A Study on Recent Trends in Building Linked Data for Overseas Libraries: Focusing on Published Datasets, Reused Vocabulary, and Interlinked External Datasets (해외 도서관 링크드 데이터 구축의 최근 동향 연구 - 발행 데이터세트, 재사용 어휘집, 인터링킹 외부 데이터세트를 중심으로 -)

  • Sung-Sook Lee
    • Journal of the Korean Society for Library and Information Science / v.56 no.4 / pp.5-28 / 2022
  • This study analyzed linked data (LD) construction cases at overseas libraries, focusing on published datasets, reused vocabularies, and interlinked external datasets, and used the results to derive basic data for LD construction plans at domestic libraries. The analysis of 21 library cases showed that overseas libraries have established thorough authority LD and launched new services using published LD. To this end, overseas libraries collaborated with other libraries and cultural institutions within the region, within the country, and nationally under the leadership of the library, and published specialized datasets based on this cooperation. Overseas libraries used Schema.org to increase the visibility of published LD, and used BIBFRAME for finer-grained description, defining various entities and building LD on the defined entities. They used these entities to link related information, display results, support browsing, and offer bulk downloads. Overseas libraries paid attention to keeping interlinked external datasets continuously up to date, and they used external data directly to reinforce catalog information. Based on the implications derived, this study proposes points for domestic libraries to consider when publishing LD. The results can serve as basic data when domestic libraries plan LD services or upgrade existing ones.