• Title/Summary/Keyword: printed text


A Structural Analysis of Dictionary Text for the Construction of Lexical Data Base (어휘정보구축을 위한 사전텍스트의 구조분석 및 변환)

  • 최병진
    • Language and Information / v.6 no.2 / pp.33-55 / 2002
  • This research aims to transform the definition text of an English-English-Korean Dictionary (EEKD), which is encoded in EST files for publishing, into a structured format for a Lexical Data Base (LDB). Constructing an LDB is very time-consuming and expensive. To save time and effort in building new lexical information, the present study extracts useful linguistic information from an existing printed dictionary. This paper describes the process of extracting and structuring lexical information from a printed dictionary (EEKD) as a lexical resource. The extracted information is represented in XML format, which can be transformed into other representations for different application requirements.

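As a rough illustration of representing an extracted dictionary entry in XML, the sketch below converts one flat entry into an XML element. The pipe-separated input format, the element names (`entry`, `headword`, `pos`, `sense`, `definition`, `translation`) and the function name `entry_to_xml` are all hypothetical; the abstract does not describe the EST encoding or the XML schema actually used.

```python
import xml.etree.ElementTree as ET


def entry_to_xml(line):
    """Convert a hypothetical 'headword | pos | definition | translation' line to XML."""
    headword, pos, definition, translation = [f.strip() for f in line.split("|")]
    entry = ET.Element("entry")
    ET.SubElement(entry, "headword").text = headword
    ET.SubElement(entry, "pos").text = pos
    # Group the English definition and the Korean equivalent under one sense.
    sense = ET.SubElement(entry, "sense")
    ET.SubElement(sense, "definition", {"lang": "en"}).text = definition
    ET.SubElement(sense, "translation", {"lang": "ko"}).text = translation
    return entry


# Assumed sample entry, not taken from the EEKD.
sample = entry_to_xml("apple | n. | a round fruit with firm flesh | 사과")
xml_text = ET.tostring(sample, encoding="unicode")
```

Once the entry is held as an element tree, it can be serialized or transformed into another representation, which is the kind of flexibility the abstract attributes to the XML format.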

Full-text databases as a means for resource sharing (자원공유 수단으로서의 전문 데이터베이스)

  • 노진구
    • Journal of Korean Library and Information Science Society / v.24 / pp.45-79 / 1996
  • Rising publication costs and declining financial resources have resulted in renewed interest among librarians in resource sharing. Although the idea of sharing resources is not new, there is a sense of urgency not seen in the past. Driven by rising publication costs and static and often shrinking budgets, librarians are embracing resource sharing as an idea whose time may finally have come. Resource sharing in electronic environments is shifting the concept of the library from a warehouse of print-based collections to a point of access to needed information. Much of the library's material will be delivered in electronic form, or printed. In this new paradigm, libraries cannot be expected to support research from their own collections. These changes, along with improved communications, computerization of administrative functions, fax and digital delivery of articles, and advances in data storage technologies, are improving the procedures and means for delivering needed information to library users. In short, for resource sharing to be truly effective and efficient, automation and data communication are essential. This article examines the possibility of using full-text online databases as a supplement to interlibrary loan for document delivery. The findings of the study can be summarized as follows: First, the turn-around time and cost of getting a hard copy of a journal article from online full-text databases were comparable to those of other document delivery services. Second, the use of full-text online databases should be considered as a method for promoting interlibrary loan services, as it is more cost-effective and labour-saving. Third, for full-text databases to work as a document delivery system, the databases must contain as many periodicals as possible and be loaded on as many systems as possible. Fourth, to include many scholarly research journals in full-text databases, we need guidelines covering electronic document delivery and electronic reserves. Fifth, for a database to be truly full-text, more advanced information technologies are needed.


Improved Text Recognition using Analysis of Illumination Component in Color Images (컬러 영상의 조명성분 분석을 통한 문자인식 성능 향상)

  • Choi, Mi-Young;Kim, Gye-Young;Choi, Hyung-Il
    • Journal of the Korea Society of Computer and Information / v.12 no.3 / pp.131-136 / 2007
  • This paper proposes a new approach to eliminate the reflectance component for the detection of text in color images. Color images, printed by color printing technology, normally have an illumination component as well as a reflectance component. It is well known that a reflectance component usually obstructs the task of detecting and recognizing objects such as text in a scene, since it blurs the overall image. We have developed an approach that efficiently removes reflectance components while preserving illumination components. To determine the lighting environment, we classify an input image as Normal or Polarized using a histogram of the red component. Removing the reflection component caused by illumination changes reduces the light-induced blurring of text, which allows the text to be extracted. The experimental results show superior performance even when an image has a complex background. Text detection and recognition performance is influenced by changes in the illumination condition; our method is robust to images with different illumination conditions.


Text Region Detection Method Using Table Border Pseudo Label (표의 테두리 유사 라벨을 활용한 문자 영역 검출 방법)

  • Han, Jeong Hoon;Park, Se Jin;Moon, Young Shik
    • Journal of the Korea Institute of Information and Communication Engineering / v.24 no.10 / pp.1271-1279 / 2020
  • Text region detection is a technology that detects text areas in handwritten or printed documents. The detected text areas are digitized through a recognition step and used in various fields depending on the purpose. However, detection results in small text units are not suitable for industrial use. In addition, table borders in a document cause misdetections, which adversely affect the recognition step. To solve these issues, we propose a method for detecting text regions using the border information of tables. To utilize this border information, the proposed method adjusts the flow of two decoders. Experimentally, we show improved performance using the table border pseudo label based on weakly supervised learning.

Study on the circulated versions of Major Essentials of Huangdi's Internal Classic Plain Questions (黃帝內經素問大要), and its original publication: Chosun's version of Huangdi's Internal Classics Plain Questions (黃帝內經素問) (이규준 의서 『황제내경소문대요』의 유통본과 그 저본이 된 조선 간본 『황제내경소문』)

  • Oh, Chaekun
    • Journal of Korean Medical classics / v.26 no.4 / pp.203-221 / 2013
  • Objectives : The Major Essentials of Huangdi's Internal Classics Plain Questions (黃帝內經素問大要, MEHP) is one of the main works of the late-Chosun literate physician Lee Gyoojoon (李圭晙, 1885-1923), and is known as a logical proofreading of the Huangdi's Internal Classics Plain Questions (黃帝內經素問, HP). This study examines two aspects of the text: (1) the versions of the MEHP currently in circulation; (2) the edition of the HP that served as the MEHP's original script. Methods : The study is based on bibliographical analysis of the form and contents of the versions of the MEHP and HP. To compare sentences and phrases between prints, I used the 20 examples that Qian Chaochen (钱超尘) proposed in his preceding studies. Regarding Lee Wonse (李元世)'s proofreading of the MEHP in 1999, I drew on interviews with his students. Results : First, I found that three versions of the MEHP are in circulation: the woodblock-printed version, Lee Wonse's handwritten version, and Lee Wonse's proofread version; and I confirmed that Lee's proofread version should be regarded as the good version of the MEHP. I also found that another printed version of the MEHP may exist, which is considered to be the original draft of Lee's handwritten version. Second, I confirmed that the original script of the HP that Lee Gyoojoon used for the MEHP is not the Gu Congde (顧從德) printed version but the version printed by Chosun's bureau for military drill (訓練都監). Conclusion : This study provides strong evidence that Lee Gyoojoon's MEHP is a unique and original work of research completed within the traditional realm of Korean medicine, which possesses the universality of East Asian medicine represented by Huangdi's Internal Classics (黃帝內經).

The Logical Structure for Standardization of printed Dictionary (표준화를 위한 일반 사전의 논리 구조)

  • Choi, Byung-Jin;Lee, Jae-Sung;Lee, Woon-Jae;Choi, Key-Sun
    • Annual Conference on Human and Language Technology / 1996.10a / pp.415-423 / 1996
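  • With the advancement of computers, parts of the natural language processing field have recently been working to convert ordinary documents (human-readable text) into electronic documents (machine-readable text). A representative example of such research is the conversion of dictionaries into electronic form; abroad, research on this has progressed steadily for more than ten years and is bearing fruit. In contrast, Korea does not yet have an electronic dictionary comparable to these, let alone one suitable for standardization. This paper therefore examines a formalized logical structure for converting a general dictionary into an electronic dictionary.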


Studies on Foundation for Standard <Shanghan Lun> Text through Comparison of Sentences from 5 Types of Printed Book (<상한론(傷寒論)> 판본별 문장비교를 통한 표준 텍스트 연구;태양병(太陽病) 상편(上篇) (1-30조(條)))

  • Chi, Gyoo-Yong;Eom, Hyun-Sup
    • Journal of Physiology & Pathology in Korean Medicine / v.22 no.1 / pp.25-31 / 2008
  • This paper lays a foundation for a standard <Shanghan Lun> text for research on febrile disease, referring to <Jin-gui-yu-han-jing>, Kang-ping-ben, Tang-ben, Song-ben and, supplementarily, Gui-lin-gu-ben. Through comparative and analytical investigation of the 30 articles of 「Taiyang bing Volume 1」, 22 articles were newly compiled after removing doubtful sentences or combining articles. One article, "wind damage of Tai-yang gives rise to fever and chilling", was added from Tang-ben. Many articles were left unchanged or, in some cases, combined with relevant articles to make better sense.

Evaluation of the Readability of Teacher's Guide Book for Nutrition Education-Sugar, Na, Trans Fat (당, 나트륨, 트랜스 지방 교재의 교사용 지도서 지문의 난이도 평가)

  • Lee, Young-Mee;Kim, Jin-Ah
    • Korean Journal of Community Nutrition / v.15 no.5 / pp.648-655 / 2010
  • This study proposes a method for evaluating the quality of nutrition education materials by applying a readability test to printed materials. When teaching with nutrition education materials, it is important that the topics and word choices in the printed materials take the students' level of understanding into account. This study evaluated the readability of the text in teachers' guidebooks, devised as elementary school education material about sugar, sodium and trans-fat, and assessed the developed materials by analyzing the difficulty level of the text. We used "The Teacher's Guidebook for Cooking Activity", developed for elementary schools by the Ministry of Education, Science and Technology, as the readability evaluation standard. Compared with the average readability score of "The Teacher's Guidebook for Cooking Activity", $72.94{\pm}6.85$, the score of the "Sugar Guidebook" was $70.94{\pm}7.46$, that of the "Sodium Guidebook" was $68.76{\pm}14.50$, and that of the "Trans-fat Guidebook" was $58.87{\pm}10.79$. Across the subjects' careers and ages, "The Teacher's Guidebook for Cooking Activity" showed little deviation and was at the "intermediate" or "easy" level, whereas the "Sugar Guidebook", "Sodium Guidebook" and "Trans-fat Guidebook" were at the "intermediate" or "difficult" level (p < 0.05). Readability scores were especially low when the content of a subject was too specialized or scientific terms were used frequently, and these results were clearly seen in the "Sodium Guidebook" and "Trans-fat Guidebook". Readability evaluation with the Cloze test score can be used as an evaluation tool for nutrition education materials.

Fast Skew Detection of Document Images by Extraction of Center Points of Blank Lines (공백행의 중심점 추출에 의한 고속 문서 기울기 검출)

  • Jeong, Jae-Yeong;Kim, Mun-Hyeon
    • Journal of KIISE:Software and Applications / v.26 no.11 / pp.1342-1349 / 1999
  • In this paper, we propose a fast algorithm to estimate the skew angle of linearly skewed document images. It is based on the fact that a blank line of uniform thickness lies between two adjacent text lines and that the slope of this blank line reflects the skew of the document. First, we apply a simple morphological operation (dilation) to the image to separate blank lines from text lines, and we detect the center points of blank lines along vertical lines sampled at regular intervals. Then we calculate the slope between neighboring center points on the same blank line. The slopes calculated over the entire image are accumulated in a histogram to obtain their distribution. Finally, the peak of the histogram is detected and taken as the slope of the document image. In the experiments, we applied the algorithm to horizontally written documents in various formats, both hand-printed and machine-printed, to verify it.
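The steps in this abstract can be sketched roughly as follows. This is a minimal NumPy illustration, not the authors' implementation: the horizontal-only dilation, the sampling interval `step`, the dilation width, the histogram bin size `bin_deg`, the nearest-neighbor rule for pairing center points, and all function names are assumptions chosen for the example.

```python
import numpy as np


def dilate_horizontal(ink, width):
    """Smear ink sideways so the characters of a text line merge into one band."""
    out = ink.copy()
    for shift in range(1, width + 1):
        out[:, shift:] |= ink[:, :-shift]   # spread ink to the right
        out[:, :-shift] |= ink[:, shift:]   # spread ink to the left
    return out


def blank_run_centers(column):
    """Center rows of the blank runs that lie between two ink runs in one column."""
    rows = np.flatnonzero(column)
    return [(a + b) / 2.0 for a, b in zip(rows[:-1], rows[1:]) if b - a > 1]


def estimate_skew(ink, step=16, dilation=8, bin_deg=0.25):
    """Estimate the skew in degrees (positive: lines descend to the right)."""
    smeared = dilate_horizontal(ink.astype(bool), dilation)
    # Sample vertical lines every `step` pixels and collect blank-line centers.
    columns = [blank_run_centers(smeared[:, x]) for x in range(0, ink.shape[1], step)]
    angles = []
    for left, right in zip(columns[:-1], columns[1:]):
        if not right:
            continue
        for y0 in left:
            # Assumed pairing rule: the nearest center in the next sampled column
            # belongs to the same blank line if the implied slope is below 45 degrees.
            y1 = min(right, key=lambda y: abs(y - y0))
            if abs(y1 - y0) < step:
                angles.append(np.degrees(np.arctan2(y1 - y0, step)))
    if not angles:
        return 0.0
    # Histogram of angles; the most frequent bin is taken as the document skew.
    bins = np.round(np.array(angles) / bin_deg).astype(int)
    values, counts = np.unique(bins, return_counts=True)
    return float(values[np.argmax(counts)] * bin_deg)
```

Taking the histogram peak rather than the mean keeps a few mismatched center-point pairs, for example near the image border, from pulling the estimate away from the dominant slope.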

The Region Analysis of Document Images Based on One Dimensional Median Filter (1차원 메디안 필터 기반 문서영상 영역해석)

  • 박승호;장대근;황찬식
    • Journal of the Institute of Electronics Engineers of Korea SP / v.40 no.3 / pp.194-202 / 2003
  • Automatically converting printed images into electronic ones requires region analysis of document images and character recognition. Of these, region analysis segments a document image into detailed regions and classifies these regions into types such as text, picture and table. However, it is difficult to classify text and pictures exactly, because the size, density and complexity of pixel distribution of some of these regions are similar. Thus, misclassification in region analysis is the main reason that automatic conversion is difficult. In this paper, we propose a region analysis method that segments a document image into text and picture regions. The proposed method solves these problems by using a one-dimensional median filter in text and picture classification. The misclassification of boldface text and of picture regions such as graphs or tables, caused by median filtering, is solved by using a skin peeling filter and the maximal text length. The performance is therefore better than that of previous methods, including commercial software.
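One way a one-dimensional median filter can separate text from pictures is sketched below, under the assumption that thin character strokes vanish under a row-wise median while dense picture areas survive. The abstract does not give the actual decision rule, so the window size, the 0.5 threshold and the function names are hypothetical, and the skin peeling filter and maximal text length steps are not covered.

```python
import numpy as np


def median_filter_1d(row, window):
    """Sliding median over one row, with zero padding at both ends (odd window)."""
    half = window // 2
    padded = np.pad(row, half, mode="constant")
    windows = np.lib.stride_tricks.sliding_window_view(padded, window)
    return np.median(windows, axis=1)


def surviving_ink_ratio(region, window=9):
    """Fraction of ink pixels that remain after row-wise 1D median filtering."""
    region = region.astype(np.uint8)
    ink = int(region.sum())
    if ink == 0:
        return 0.0
    filtered = np.array([median_filter_1d(row, window) for row in region])
    return float((filtered > 0.5).sum()) / ink


def classify_region(region, window=9, threshold=0.5):
    """Assumed rule: regions whose ink mostly survives the filter are pictures."""
    return "picture" if surviving_ink_ratio(region, window) >= threshold else "text"
```

Under this rule, strokes narrower than about half the window are removed, which also suggests why boldface text could be mistaken for a picture without the extra steps the abstract mentions.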