• Title/Summary/Keyword: 단어 선정 (word selection)

Search Result 222, Processing Time 0.025 seconds

A Study of Research on Methods of Automated Biomedical Document Classification using Topic Modeling and Deep Learning (토픽모델링과 딥 러닝을 활용한 생의학 문헌 자동 분류 기법 연구)

  • Yuk, JeeHee;Song, Min
    • Journal of the Korean Society for Information Management / v.35 no.2 / pp.63-88 / 2018
  • This research evaluated how classification performance differs across feature selection methods (the LDA topic model and Doc2Vec, a word embedding approach based on deep learning), feature corpus sizes, and classification algorithms. To find the feature corpus that yields the highest classification performance, experiments were conducted with feature corpora composed from different locations within the document and with the size of the feature corpus adjusted. In the deep learning experiments, the study also evaluated the training frequency and the information specifically considered for context inference. The study constructed a biomedical document dataset, Disease-35083, which consists of biomedical scholarly documents provided by PMC and categorized by disease category. The study verifies which type and size of feature corpus produces the highest performance, and suggests feature corpora that can be extended to specific features because they are efficient in training time. Additionally, it compares deep learning with existing methods and suggests an appropriate method for each classification environment.

Extracting Technical Vocabulary List for Early Childhood Education Using EAP Specialized Corpus (EAP 전문 코퍼스를 활용한 유아교육 전문 어휘 추출)

  • Lee, Je-Young;Ahn, Jongki;Lee, Jee Eun
    • The Journal of the Korea Contents Association / v.17 no.1 / pp.475-484 / 2017
  • The aim of this research is to develop and evaluate a technical vocabulary list for early childhood education. The list was compiled from a corpus of 500,000 running words of written academic text drawn from 7 books about early childhood education. The coverage of GSL[1] and AWL[2] was 81.86% and 9.78% respectively, which means that academic texts on early childhood education are very similar to those in other disciplines. The technical vocabulary list for early childhood education (TV4ECE), extracted on the basis of frequency and range, contains 224 types. This word list can be used to teach early childhood education in English, especially to prepare students for reading English texts in the field.
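Extraction "in terms of frequency and range" can be sketched as follows. This is a minimal illustration, not the authors' procedure: the function names, the whitespace tokenizer, the thresholds `min_freq` and `min_range`, and the `general_words` set (standing in for a general-service list such as GSL) are all assumptions made for the example.

```python
from collections import Counter

def extract_word_list(texts, general_words, min_freq=2, min_range=2):
    """Return words that pass hypothetical frequency and range thresholds.

    frequency = total occurrences in the corpus
    range     = number of texts in which the word occurs
    """
    freq = Counter()
    text_range = Counter()
    for text in texts:
        tokens = text.lower().split()   # naive whitespace tokenizer (assumption)
        freq.update(tokens)
        text_range.update(set(tokens))  # count each word once per text
    return sorted(
        w for w in freq
        if w not in general_words
        and freq[w] >= min_freq
        and text_range[w] >= min_range
    )

def coverage(texts, word_list):
    """Share of running words in the corpus that belong to word_list."""
    tokens = [t for text in texts for t in text.lower().split()]
    return sum(t in word_list for t in tokens) / len(tokens)
```

For instance, calling `extract_word_list(texts, general_words)` on a handful of short texts returns only the non-general words that recur across at least two of them, and `coverage(texts, word_list)` gives the kind of percentage the abstract reports for GSL and AWL.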

A Content Analysis of Journal Articles Using the Language Network Analysis Methods (언어 네트워크 분석 방법을 활용한 학술논문의 내용분석)

  • Lee, Soo-Sang
    • Journal of the Korean Society for Information Management / v.31 no.4 / pp.49-68 / 2014
  • The purpose of this study is to perform a content analysis of research articles in Korea that use the language network analysis method, and to grasp the basic points of that method. Six analytical categories are used for the content analysis: types of language text, methods of keyword selection, methods of forming co-occurrence relations, methods of constructing networks, network analytic tools, and indexes. The content analysis revealed the following features. The major types of language text are research articles and interview texts. Keywords were selected from words extracted from the text content. Co-occurrence relations between keywords were formed using the co-occurrence count. The constructed networks are multiple-type networks rather than single-type ones. Network analytic tools such as NetMiner, UCINET/NetDraw, NodeXL, and Pajek are used. The major analytic indexes include density, centralities, and sub-networks. These features can form the basis of the language network analysis method.

Music Teacher Education for Multicultural Music Education: An Analysis of Multicultural Contents in Undergraduate Music Education Curriculum in USA (다문화 음악교육을 위한 음악 교사 교육의 방향: 미국 음악교사 교육과정의 다문화 교과목의 분석을 통하여)

  • Lee, Ka-Won
    • The Journal of the Korea Contents Association / v.13 no.7 / pp.473-483 / 2013
  • In the 21st century, world citizens have more chances than ever before to meet and interact with people from different cultural or ethnic backgrounds. It is therefore inevitable that multiculturalism has become an integral part of music education. Educators in every field should make conscious efforts to provide opportunities and experiences for students so that they can adequately cope with these culturally diverse encounters. This study aims to grasp the current state of multicultural education in teacher education programs in Korea and America. It also aims to identify the multicultural music courses currently offered to undergraduate music education majors at 10 selected American universities. The study is expected to suggest directions for multicultural and world music education in teacher education.

Study of Integrated Brand Communication in Clean Beauty Cosmetics (클린뷰티 화장품에 나타난 통합 브랜드 커뮤니케이션 연구)

  • Lee, Young-Hwa
    • Journal of the Korea Convergence Society / v.12 no.4 / pp.161-169 / 2021
  • 'Clean beauty' is attracting attention as interest in cosmetics without harmful ingredients grows while people wear masks in the age of COVID-19. This study therefore selected and analyzed clean beauty cosmetic brands distributed online and offline in Korea in 2020. It extracted 36 clean beauty brands and selected 20 suitable brands through expert analysis. To analyze clean beauty cosmetic brand communication, five components (naming, logo, color, package, and website) were derived and a survey was conducted. Respondents preferred words that come to mind when thinking of nature or health for naming; wordmarks in a simple form for the logo; greenish or yellowish tones for color; a simple form aligned at the center of the container body for the package; and images of plants, animals, and humans for the website. In sum, the harmonious use of natural, clean, and light images acted as a factor in preference for clean beauty cosmetic brands.

A Study on the Intellectual Structure of Domestic Open Access Area (국내 오픈액세스 분야의 지적구조 분석에 관한 연구)

  • Shin, Jueun;Kim, Seonghee
    • Journal of the Korean Society for Library and Information Science / v.55 no.2 / pp.147-178 / 2021
  • In this study, co-word analysis was conducted to investigate the intellectual structure of the domestic open access area. Through KCI and RISS, 124 research articles related to open access in Korea were selected for analysis, and a total of 1,157 keywords were extracted from the titles and abstracts. Network analysis was performed on the selected keywords. As a result, 3 domains and 20 clusters were extracted, and the intellectual relations among keywords in the open access area were visualized through PFnet. Centrality analysis of weighted networks was used to identify the core keywords in this area. Finally, 5 clusters from cluster analysis were displayed on a multidimensional scaling map, and the intellectual structure was proposed based on the correlation between keywords. The results allow the intellectual structure of the area to be identified visually and can be used as basic data for predicting the future direction of open access research in Korea.
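The core of a co-word analysis, counting how often keyword pairs appear together and ranking keywords on the resulting weighted network, can be sketched as follows. This is a simplified illustration under stated assumptions: the function names are hypothetical, and plain weighted degree is swapped in for the PFnet pruning and the centrality measures the study actually used.

```python
from collections import Counter
from itertools import combinations

def cooccurrence(keyword_sets):
    """Count how many documents contain each unordered keyword pair."""
    pairs = Counter()
    for keywords in keyword_sets:
        # sort so that each pair has one canonical (a, b) ordering
        for a, b in combinations(sorted(set(keywords)), 2):
            pairs[(a, b)] += 1
    return pairs

def weighted_degree(pairs):
    """Sum of co-occurrence weights on all edges touching each keyword."""
    degree = Counter()
    for (a, b), weight in pairs.items():
        degree[a] += weight
        degree[b] += weight
    return degree
```

Given one keyword list per article, `weighted_degree(cooccurrence(keyword_sets)).most_common(10)` lists candidate core keywords; a real analysis would go on to normalize the counts and apply clustering.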

Post-processing for Korean OCR Using Cohesive Feature between Syllables and Syntactic Lexical Feature (한국어의 음절 결합 특성 및 통사적 어휘 특성을 이용한 문자인식 후처리 시스템)

  • Hwang, Young-Sook;Park, Bong-Rae;Rim, Hae-Chang
    • Annual Conference on Human and Language Technology / 1997.10a / pp.175-182 / 1997
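  • In research on post-processing for Korean character recognition, unregistered words and non-contextual errors remain problems that have not yet been solved well. This paper proposes a method that uses probabilistic syllable combination information as the criterion for deciding whether a string is possible as a word, which addresses the unregistered-word problem that can arise when only morphological analysis is used, and that selects the best candidate eojeol (word phrase) among multiple candidates by using contextual combination information that takes into account word-final lexical items with syntactic functions. The proposed system consists of a module that generates one or more candidate eojeols by combining the candidate syllables output by the recognizer with learned confusable syllables, and a module that selects the best candidate eojeol using statistical language information. The experiments used syllable combination information extracted from a 10-million raw corpus and eojeol combination information extracted from a 170,000 tagged corpus. Applied to actual recognition results, the system improved the recognition rate from 94.1% to 97.4% at the character level and from 87.6% to 96.6% at the eojeol level. The correction rate and miscorrection rate were 56% and 0.6% at the character level, and 83.9% and 1.66% at the eojeol level. The system correctly recognized 87.5% of the unregistered words, which accounted for 3.4% of all test eojeols, and correctly fixed 91.6% of the non-contextual errors, which accounted for 20.3% of all errors.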
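Using syllable combination statistics to judge whether a candidate is plausible as a word can be sketched with a syllable bigram model. This is an illustrative sketch only: the function names, the add-one style smoothing, and the `<s>`/`</s>` boundary markers are assumptions for the example, not details taken from the paper, and input strings are assumed to be NFC-normalized so that each Hangul syllable is one character.

```python
import math
from collections import Counter

def train_bigrams(words):
    """Collect syllable unigram and bigram counts from a word list."""
    unigrams, bigrams = Counter(), Counter()
    for word in words:
        chars = ["<s>"] + list(word) + ["</s>"]
        unigrams.update(chars[:-1])             # contexts only
        bigrams.update(zip(chars, chars[1:]))   # adjacent syllable pairs
    return unigrams, bigrams

def log_score(word, unigrams, bigrams):
    """Smoothed log-likelihood of the syllable sequence (add-one style)."""
    vocab = len(unigrams) + 1
    chars = ["<s>"] + list(word) + ["</s>"]
    return sum(
        math.log((bigrams[(a, b)] + 1) / (unigrams[a] + vocab))
        for a, b in zip(chars, chars[1:])
    )

def best_candidate(candidates, unigrams, bigrams):
    """Pick the candidate whose syllable sequence scores highest."""
    return max(candidates, key=lambda c: log_score(c, unigrams, bigrams))
```

Because the score depends only on syllable transitions, a candidate that is absent from the dictionary can still be accepted if its syllable sequence is typical of Korean words, which is the intuition behind handling unregistered words this way.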

Methodology for semi-autonomous rule extraction based on Restricted Language Set and ontology (제한된 언어집합과 온톨로지를 활용한 반자동적인 규칙생성 방법 연구)

  • Son, Mi-Ae;Choe, Yun-Gyu
    • Proceedings of the Korea Intelligent Information System Society Conference / 2007.05a / pp.297-306 / 2007
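  • One of the methods used for rule acquisition, which is among the steps that are hard to automate when building intelligent information systems, is the Restricted Language Set technique. However, generating rules with this technique requires that information about the variables that make up the rules and their values be defined in advance, and in a highly fluid Web environment it is very difficult to define every foreseeable variable and value beforehand. To overcome this limitation, this study presents a rule generation methodology that uses the Restricted Language Set technique together with an ontology. A sentence that is the target of knowledge acquisition is parsed with a grammatical structure analyzer, and the parsed words are used to identify the variables and values that make up a rule. However, because natural-language sentences that contain rules are incomplete, variables are often unclear or missing entirely, which makes it difficult to generate rules in complete form. This problem was solved by building a domain ontology. This ontology resembles existing ontologies in that it contains the relationships among the concepts that make up a specific domain, but it differs in that it changes its structure based on how frequently concepts are used while rules are being completed, and as a result supports the generation of more accurate rules. The rule components identified through this process are made concrete using the Restricted Language Set technique. To illustrate the proposed methodology, a delivery-related Web page of an arbitrary Internet shopping mall was selected. The methodology is expected to contribute to improving the efficiency of the knowledge acquisition process in XRML.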

Technology Keyword Network and Cognitive Map Analysis: to prospect promising technology of UAV(Unmanned Aerial Vehicle) airframe industry (기술 키워드 네트워크와 인지지도 분석을 통한 무인항공기 비행체산업의 유망기술 도출 연구)

  • Joo, Seong-Hyeon;Ha, Sung-Ho;Park, Sang-Hyeon
    • Journal of Korea Society of Industrial Information Systems / v.21 no.5 / pp.55-72 / 2016
  • This study aims to provide a methodology for securing international technological competitiveness, marketable industries, and sustainable promising technologies in new growth engine industries such as the national unmanned aerial vehicle industry. Using patent data from the unmanned aerial vehicle industry, we apply social network analysis, sub-group analysis, and cognitive map analysis with tools such as KrKwic, Excel, and NetMiner. As a result, technologies such as 'pilot control tech' and 'identification of friend or foe tech' are identified as promising future technologies that merit concentrated investment.

Analysis of Author Image Based on Book Recommendation from Readers (독자 추천도서 정보를 이용한 작가 이미지 분석 연구)

  • Choi, Sanghee
    • Journal of the Korean Society for Information Management / v.34 no.4 / pp.153-171 / 2017
  • Many readers tend to read books by a specific author and to expand their reading areas based on that author. This study chose Edgar Allan Poe and analyzed the image of the author using the authors and books that other readers co-recommended. The frequencies of co-occurring authors and books were investigated, and the relations among authors and books were analyzed with network analysis methods. As a result, the genre images of Poe, related authors, and related books were identified. This study also suggests methods for identifying the image of an author, related author groups, and related books for libraries' reading programs and book curation.