• Title/Summary/Keyword: 텍스트 연구 (text research)

Search Result 3,492, Processing Time 0.03 seconds

The Construction and Common Use of Old Document DB in the Foreign Countries (해외 소장 고문헌의 DB구축과 공동활용 방안)

  • Kang, Soon-Ae
    • Journal of the Korean Society for Library and Information Science / v.42 no.3 / pp.61-79 / 2008
  • This paper investigates three aspects of the construction and common use of databases (DBs) of old documents held in foreign countries: i) the processing of old documents, ii) the problems and improvement of DB systems for old documents, and iii) the common use of old document DBs. The results are as follows. The National Library of Korea (NLK) copied old documents held in foreign countries from 1982 to 2006 and published a brief catalog. The Reogang Publishing company issued a four-volume catalog of old documents in Japan. The National Research Institute of Cultural Heritage (NRICH) investigated old books and published catalogs covering several organizations in Japan, America, France, and elsewhere. The National Institute of Korean History (NIKH) investigated old archives and published catalogs covering several organizations in Japan. For the DB systems, the paper describes the characteristics of the Korean Old and Rare Collection Information System (KORCIS) of the NLK, the Old Books Cultural Heritage in Overseas System of the NRICH, and the Korea History DB System and MF Catalog/Image System of the NIKH, examines their problems, and suggests some alternatives. For the common use of old document DBs, the KORMARC format and the description rules (draft) for archives should be revised to adopt a new standard such as the KS editions, and all the institutes involved should thoroughly follow the standards when creating bibliographic records and digitizing texts. Specialists in old documents need to be educated and trained. A government organization should be established to supervise all the procedures of developing technology for sharing digitized resources, using contents, and cooperating with related international organizations and institutes.

Development of a gridded crop growth simulation system for the DSSAT model using script languages (스크립트 언어를 사용한 DSSAT 모델 기반 격자형 작물 생육 모의 시스템 개발)

  • Yoo, Byoung Hyun;Kim, Kwang Soo;Ban, Ho-Young
    • Korean Journal of Agricultural and Forest Meteorology / v.20 no.3 / pp.243-251 / 2018
  • Gridded simulation of crop growth, which would be useful for stakeholders and policy makers, often requires specialized computation tasks to prepare weather input data and operate a given crop model. Here we developed an automated system for crop growth simulation over a region using the DSSAT (Decision Support System for Agrotechnology Transfer) model. The system consists of modules implemented in R and shell script languages. One module creates weather input files in a plain text format for each grid cell. Another module, written in R, handles GIS data processing and parallel computing. A third module, implemented as a shell script, launches the crop model automatically. As a case study, the automated system was used to determine the maximum soybean yield for a given set of management options in the state of Illinois in the US. The AgMERRA dataset, which is reanalysis data for agricultural models, was used to prepare weather input files for 1981-2005. It took 7.38 hours to create 1,859 weather input files for one year of soybean growth simulation in Illinois using a single CPU core. In contrast, the processing time decreased considerably, to 35 minutes, when 16 CPU cores were used. The automated system created a map, in a raster data format, of the maturity group and the planting date that resulted in the maximum yield. Our results indicate that the automated system for the DSSAT model would help spatial assessments of crop yield at a regional scale.
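The timing figures reported in this abstract (7.38 hours on one core, 35 minutes on 16 cores) imply a speedup and a parallel efficiency that can be worked out directly. The snippet below is only an arithmetic sketch; the function name `speedup_and_efficiency` is hypothetical and is not part of the system described in the paper.

```python
def speedup_and_efficiency(serial_minutes: float,
                           parallel_minutes: float,
                           cores: int) -> tuple[float, float]:
    """Return (speedup, parallel efficiency) for a parallel run."""
    speedup = serial_minutes / parallel_minutes
    efficiency = speedup / cores  # 1.0 would mean perfect linear scaling
    return speedup, efficiency


# Figures taken from the abstract: 7.38 hours serial, 35 minutes on 16 cores.
serial_minutes = 7.38 * 60
speedup, efficiency = speedup_and_efficiency(serial_minutes, 35.0, 16)
print(f"speedup: {speedup:.2f}x, parallel efficiency: {efficiency:.0%}")
# prints "speedup: 12.65x, parallel efficiency: 79%"
```

On these numbers the 16-core run is roughly 12.7 times faster than the single-core run, somewhat short of the ideal 16-fold speedup.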

Diachronic Network Analysis on Variable Factors for enhancing the Values of Apparel Industry in South Korea -Focused on Fashion Newspaper Articles- (한국 어패럴 산업의 가치 제고를 위한 변수 요인의 통시적 연결구조 분석 -패션 신문 기사를 중심으로-)

  • Kim, Jang-Hyeon;Lee, Ji-Yeon
    • Journal of the Korea Academia-Industrial cooperation Society / v.21 no.2 / pp.551-564 / 2020
  • The fashion industry in Korea has grown into a core sector that contributes to the development of national industries, but it has shown negative growth with the advent of a low-growth era. This study explores core texts related to variable factors affecting the apparel industry in Korea (and the environmental change factors of apparel companies) by using network analysis from a diachronic point of view. In addition, we discuss the implications for enhancing the value of the fashion industry in Korea, based on articles in fashion newspapers over five years. The conclusions of this study are as follows. First, regarding political and economic aspects, the government should minimize the damage caused by political influence by presenting new policies or by communicating about the practical aspects of geopolitical issues and changes linked to the fashion industry. Second, regarding socio-cultural aspects, it is necessary to reduce uncertainty about the future by establishing a strategic system through cooperation with institutions that can predict future directions. Third, regarding management changes in the apparel industry, apparel companies in Korea should recognize the importance of pursuing development for a better society through coexistence, rather than corporate profit.

A Study on Quantitative Evaluation Method for STT Engine Accuracy based on Korean Characteristics (한국어 특성 기반의 STT 엔진 정확도를 위한 정량적 평가방법 연구)

  • Min, So-Yeon;Lee, Kwang-Hyong;Lee, Dong-Seon;Ryu, Dong-Yeop
    • Journal of the Korea Academia-Industrial cooperation Society / v.21 no.7 / pp.699-707 / 2020
  • With the development of deep learning technology, voice processing technology is being applied in various areas, such as STT (Speech To Text), TTS (Text To Speech), chatbots, and intelligent personal assistants. In particular, STT is a voice-based service that converts human speech to text, so it can be applied to various IT-related services. Recently, many organizations, such as private enterprises and public institutions, have been attempting to introduce the technology. However, in contrast to general IT solutions that can be evaluated quantitatively, the standards and methods for evaluating the accuracy of an STT engine are ambiguous, and they do not consider the characteristics of the Korean language. Therefore, it is difficult to apply a quantitative evaluation standard. This study aims to provide a guide for evaluating STT engine conversion performance based on the characteristics of the Korean language, so that engine manufacturers can perform STT conversion based on those characteristics and the market can evaluate engines more accurately. In the experiment, a 35% more accurate evaluation could be performed compared to the existing methods.
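The abstract does not specify its evaluation metric. As a point of reference only, a common baseline for scoring STT output is the character error rate (CER), computed from the edit distance between a reference transcript and the engine's hypothesis. The sketch below is that generic baseline, not the method proposed in the paper; the function names are hypothetical, and ignoring spaces is an assumption made here because Korean word spacing varies between writers.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance (insertions, deletions, substitutions)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """CER = edit distance / reference length, with spaces ignored (assumption)."""
    ref = reference.replace(" ", "")
    hyp = hypothesis.replace(" ", "")
    if not ref:
        raise ValueError("reference transcript must not be empty")
    return edit_distance(ref, hyp) / len(ref)


# One of five syllable blocks differs, so the CER is 1/5.
print(character_error_rate("안녕하세요", "안녕하세여"))
```

Each precomposed Hangul syllable counts as one character here; a metric tuned to Korean might instead compare at the level of individual jamo or normalize numerals and loanwords first.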

Korean Abbreviation Generation using Sequence to Sequence Learning (Sequence-to-sequence 학습을 이용한 한국어 약어 생성)

  • Choi, Su Jeong;Park, Seong-Bae;Kim, Kweon-Yang
    • KIISE Transactions on Computing Practices / v.23 no.3 / pp.183-187 / 2017
  • Smartphone users prefer fast reading and texting, so they frequently use abbreviated forms of words and phrases. Abbreviations are now widely used, from chat terms to technical terms. Gathering abbreviations would therefore be helpful to many services, including information retrieval and recommendation systems. However, gathering abbreviations manually takes too much effort and cost, because new abbreviations are continuously created whenever new material, such as a TV program or a phenomenon, appears. Abbreviations therefore need to be generated automatically. Existing methods for generating Korean abbreviations use a rule-based approach. The rule-based approach is limited in that it cannot generate irregular abbreviations. Another problem is deciding the correct abbreviation among the candidate abbreviations generated by the rules. To address these limitations, we propose a method of generating Korean abbreviations automatically using sequence-to-sequence learning. Sequence-to-sequence learning can generate irregular abbreviations and avoids the problem of deciding the correct abbreviation among candidates. Accordingly, it is suitable for generating Korean abbreviations. To evaluate the proposed method, we use two types of datasets. The experimental results show that our method is effective for irregular abbreviations.

VRML Model Retrieval System Based on XML (XML 기반 VRML 모델 검색 시스템)

  • Im, Min-San;Gwun, O-Bong;Song, Ju-Whan
    • Proceedings of the Korean Information Science Society Conference / 2005.07a / pp.709-711 / 2005
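  • With advances in computer graphics, the number of 3D models is growing exponentially. Systems that search only text or 2D images have difficulty retrieving 3D models accurately. The need for 3D model retrieval systems has therefore emerged, and research is under way in many fields to develop 3D model descriptors and retrieval algorithms that improve accuracy and speed. The main goal of this paper is to convert VRML models into XML data and use that data for 3D model retrieval. Two retrieval methods are used: retrieval of basic shapes through classification of VRML nodes, and retrieval using the mass center generated during conversion to XML. That is, by building a 3D model database, the system takes advantage of classification through VRML nodes and support for a labeled 3D model database. Each 3D model is stored and processed as classified XML data with a generated key value (descriptor), so that as the number of targets and similarity comparisons grows, 3D models can be retrieved directly from the database, which improves retrieval speed and performance. For similarity comparison of more complex 3D models, the LFD (Light Field Descriptor)[6], the method rated most accurate in the Princeton Shape Benchmark (PSB)[1], is used. This method retrieves models using 2D images obtained from the 3D model, and it has the drawback of heavy computation, because it compares the fit of many 2D image view-points and the observed 2D images. We therefore propose obtaining view-points along the x, y, and z axes when capturing 2D images for 3D model retrieval, which reduces the number of 2D image view-points and greatly reduces the computation. was found to be the case, and respondents aged 40 and over were found to spend more on lunch. 4) By meal, preference for Korean food was highest at breakfast, particularly among those in their 40s and 50s. The most preferred lunch foods were Chinese and Japanese; the most preferred dinner menus in all age groups were Japanese food and snack foods (bunsik), and the rate of choosing Korean food was very low in all age groups. 5) In the survey of preferred Korean dishes by age group, doenjang-jjigae (soybean paste stew) ranked highest in all age groups; kimchi was preferred more by those aged 40 and over than by those in their 30s and, interestingly, more by those aged 30 and under than by those in their 30s. Preference for rice cakes (tteok) and porridge (juk) was also low in all age groups. Preference for pickled vegetables (jangajji) was low in all age groups and especially low among those aged 30 and under. Satisfaction with the taste of Korean food declined with age, but higher satisfaction with taste tended to go with higher satisfaction with portion size and price. Overall, preference for Korean food varied by age group depending on the meal and its purpose, yet it was broad regardless of gender or generation, which is consistent with studies of university students. With the spread of the five-day work week and Saturdays off for elementary, middle, and high school students, the travel and entertainment industries continue to grow, and dining out plays its part as an essential element of travel and leisure activities. This increase in leisure time has given single people more free time and, for members of families, has created a consumption trend of rest and recreation that strengthens family ties. In addition, dining out, as a way of having a meal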

Study on the meaning of Edi-curation in Trans-media era - Based on the comic(webtoon) and publishing content - (트랜스미디어 시대에서 에디큐레이션의 의미에 대한 연구 - 출판 및 만화 콘텐츠를 중심으로 -)

  • Park, Se-Hyeon
    • Cartoon and Animation Studies / s.44 / pp.235-261 / 2016
  • Media consumers in the Internet and digital media environment use the same content across a variety of platforms. Content of various genres is thus converted into new content through fusion, combination, transformation, differentiation, reproduction, and similar processes on the basis of digital media. This is referred to as trans-media. Creating successful content in the trans-media era requires Edi-curation, the act of editing, and adding meaning to, the curation work of curators. In that sense, this paper analyzes the definition and meaning of Edi-curation for publishing and comic (webtoon) content in the trans-media era. The Edi-curation process changes the roles of content consumers and producers in the digital media experience: consumers become producers and producers become consumers, namely prosumers or produsers. The diversification of digital platforms and devices and the emergence of one-person (or SNS) digital media also call for Edi-curation of publishing and comic (webtoon) content in a variety of ways. Depending on the intention of media producers (or consumers), content gives birth to new content through replication, montage, disassembly, dismantling, hypertext, compression, and reconstruction. Edi-curation is significant in that it affects how media producers work in the creative process as well as how media consumers read content. In publishing content, Edi-curation takes the form of breaking the logical structure of chapters or paragraphs, colloquial sentences, card news, and modified uses of video and media content. In comic (webtoon) content, it takes the form of breaking the cut (frame) and various modifications of speech bubbles, onomatopoeia, and mimetic words.

Comparative Analysis of 4-gram Word Clusters in South vs. North Korean High School English Textbooks (남북한 고등학교 영어교과서 4-gram 연어 비교 분석)

  • Kim, Jeong-ryeol
    • The Journal of the Korea Contents Association / v.20 no.7 / pp.274-281 / 2020
  • N-gram analysis offers a new look at n-word clusters in use, which differ from previously known idioms. It mechanically analyzes a corpus of English textbooks for frequently occurring sequences of n consecutive words using concordance software. This paper aims to extract and compare 4-gram word clusters between South Korean high school English textbooks and their North Korean counterparts. The classification criteria include the number of tokens and types in the two sets of textbooks across oral and written language. The criteria also use grammatical categories and functional categories to classify and compare the 4-gram word clusters. The grammatical categories include noun phrases, verb phrases, prepositional phrases, partial clauses, and others. The functional categories include deictic function, text organizers, stance, and others. The findings are as follows: South Korean high school English textbooks contain more tokens and types in both oral and written language. Verb phrase and partial clause 4-grams are the most frequently encountered grammatical categories in both South and North Korean high school English textbooks. Stance is the most dominant functional category in both South and North Korean English textbooks.
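The mechanical extraction step described in this abstract, counting every run of four consecutive words along with its tokens and types, can be sketched in a few lines. This is an illustrative sketch only: the paper used concordance software, the function name `ngram_counts` and the sample text are hypothetical, and the simple regular-expression tokenizer (which lets 4-grams cross sentence boundaries) is an assumption.

```python
import re
from collections import Counter


def ngram_counts(text: str, n: int = 4) -> Counter:
    """Count every run of n consecutive words in the text."""
    # Lowercase and keep alphabetic words (with apostrophes) as tokens.
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


sample = "I would like to go. I would like to stay. Would you like to go?"
counts = ngram_counts(sample)

n_tokens = sum(counts.values())  # total 4-gram occurrences
n_types = len(counts)            # distinct 4-grams
print(n_tokens, n_types)
print(counts.most_common(1))
```

In this sample the cluster "i would like to" occurs twice, so the text yields 12 tokens but only 11 types. Running the same count separately on two corpora gives the token and type figures that a comparison like the one in the abstract relies on.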

Semantic Dependency Link Topic Model for Biomedical Acronym Disambiguation (의미적 의존 링크 토픽 모델을 이용한 생물학 약어 중의성 해소)

  • Kim, Seonho;Yoon, Juntae;Seo, Jungyun
    • Journal of KIISE / v.41 no.9 / pp.652-665 / 2014
  • Many important terms in biomedical text are expressed as abbreviations or acronyms. We propose a semantic link topic model, based on the concepts of topic and dependency link, to disambiguate biomedical abbreviations and to cluster long-form variants of abbreviations that refer to the same sense. The model is a generative model inspired by the latent Dirichlet allocation (LDA) topic model, in which each document is viewed as a mixture of topics, with each topic characterized by a distribution over words. Thus, the words of a document are generated from the hidden topic structure of the document, and the topic structure is inferred from the observable word sequences of document collections. In this study, we allow two distinct word generation processes in order to incorporate semantic dependencies between words, particularly between expansions (long forms) of abbreviations and the words that co-occur with them in a sentence. Besides topic information, the semantic dependency between words is defined as a link, and a new random parameter for link presence is assigned to each word. As a result, the most probable expansions for the abbreviations in a given abstract are decided by the word-topic distribution, document-topic distribution, and word-link distribution estimated from the document collection through the semantic dependency link topic model. Abstracts retrieved from the MEDLINE Entrez interface by queries relating to 22 abbreviations and their 186 expansions were used as the data set. The link topic model correctly predicted expansions of abbreviations with an accuracy of 98.30%.
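The "mixture of topics" view that this abstract borrows from LDA can be written compactly. The formula below is the standard LDA word probability only; the symbols $K$ (number of topics), $z$ (topic assignment), $w$ (word), and $d$ (document) are notation assumed here, and the paper's additional link variable is not modeled.

```latex
p(w \mid d) \;=\; \sum_{k=1}^{K} p(w \mid z = k)\, p(z = k \mid d)
```

Here $p(w \mid z = k)$ is the word-topic distribution and $p(z = k \mid d)$ is the document-topic distribution, two of the three distributions the abstract names as deciding the most probable expansion.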

A Methodology of Measuring Degree of Contextual Subjective Well-Being Using Affective Predicates for Mental Health Aware Service (정신적 건강 서비스를 위한 감성구를 활용한 주관적 웰빙 지수 측정 방법론)

  • Kwon, Oh-Byung;Choi, Suk-Jae
    • Journal of Intelligence and Information Systems / v.17 no.3 / pp.1-23 / 2011
  • The contextual subjective well-being (SWB) of context-aware system users can be very helpful in recommending relevant mental health services, especially for those who struggle with mental illness due to metabolic syndrome or melancholia. Self-survey and auto-sensing methods have been suggested for monitoring users' SWB. However, self-survey methods are not appropriate for a context-aware service because they request personal data manually and hence obtrusively. Moreover, auto-sensing methods are still not accurate enough to be applied in mental health services. The purpose of this paper is therefore to propose a contextual SWB estimation method that estimates the user's mental health unobtrusively and accurately. The method is timely in that it acquires context data from the user's written responses, which expose their feelings at the moment. In particular, we developed a measuring method based on feeling verbs and degree adverbs that appear in chat and other text-based communications and show anger or negative feelings. Based on the proposed contextual SWB degree estimation method, we developed an idea for well-being life care recommendation. An experiment with actual drivers demonstrated that the proposed method accurately estimates the user's degree of negative feelings even though it does not require a self-survey.