• Title/Summary/Keyword: texts

Search Results: 1,726

Deduction of Acupoints Selecting Elements on Zhenjiuzishengjing using hierarchical clustering (계층적 군집분석(hierarchical clustering)을 통한 침구자생경(鍼灸資生經) 경혈 선택 요인 분석)

  • Oh, Junho
    • Journal of Haehwa Medicine
    • /
    • v.23 no.1
    • /
    • pp.115-124
    • /
    • 2014
  • Objectives : There are plenty of medical records of acupuncture & moxibustion in Traditional East Asian Medicine (TEAM). We performed this study to find the hidden criteria underlying the choice of proper acupoints in these records. Methods : "Zhenjiuzishengjing", an ancient TEAM book, was analysed using document clustering techniques. A corpus was made from this book, containing 196 texts, one derived from each symptom. Each text was converted to a vector representing the frequency of 349 acupoints. Distances between vectors were calculated by the weighted Euclidean distance method, and from these distances a hierarchical clustering of symptoms was built. Results : The clustering consisted of five large groups with a high correlation to body parts: head and face, chest, abdomen, upper extremity, lower extremity, and back. Conclusions : This suggests that the body part of a symptom is the most important criterion for selecting acupoints; several highly similar symptom vectors support this result. The other criterion is the cause and pathway of the illness, as some symptoms with a common cause and pathway were grouped together.
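  The pipeline described in this abstract (symptom texts encoded as acupoint-frequency vectors, weighted Euclidean distances, hierarchical clustering) can be sketched as follows. This is a minimal illustration with SciPy; the symptom names, acupoint frequencies, and weights are toy placeholders, not data from the Zhenjiuzishengjing study.

```python
# Minimal sketch of the clustering pipeline described in the abstract above.
# Symptom names, frequencies, and weights are illustrative placeholders only.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import pdist

# Each row: frequency of each acupoint in the texts for one symptom (toy data).
symptoms = ["headache", "cough", "abdominal pain", "knee pain"]
acupoint_freq = np.array([
    [3, 0, 1, 0],   # headache
    [0, 2, 0, 1],   # cough
    [1, 0, 4, 0],   # abdominal pain
    [0, 1, 0, 3],   # knee pain
], dtype=float)

# Weighted Euclidean distance: each acupoint dimension gets a weight
# (arbitrary here); Minkowski with p=2 is the weighted Euclidean metric.
weights = np.array([1.0, 0.8, 1.2, 1.0])
distances = pdist(acupoint_freq, metric="minkowski", p=2, w=weights)

# Agglomerative (hierarchical) clustering of the symptom vectors.
tree = linkage(distances, method="average")
labels = fcluster(tree, t=2, criterion="maxclust")
for symptom, label in zip(symptoms, labels):
    print(symptom, "-> cluster", label)
```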

Design of a Korean Question-Answering System for News Item Retrieval (우리말 신문기사 검색을 위한 질문응답시스템 구현에 관한 연구)

  • Chung, Young-Mee
    • Journal of the Korean Society for Information Management
    • /
    • v.4 no.1
    • /
    • pp.3-23
    • /
    • 1987
  • This paper describes a question-answering system that can automatically analyze input texts and questions in Korean natural language. The texts used for the research were newspaper articles in the specific domain of sports news. The system consists of a set of COBOL programs and an associated set of data files containing a lexicon, a case grammar, linguistic rules, and a database. The system employs two retrieval functions, fact retrieval and passage retrieval, so input questions can be answered either as sentences or as factual data.
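  For illustration only (the original system was implemented in COBOL with a case-grammar lexicon), the passage-retrieval half of such a question-answering pipeline can be sketched as simple keyword-overlap scoring; the articles and the question below are made up.

```python
# Illustrative sketch of passage retrieval by keyword overlap.
# Not the original COBOL system; the articles and the question are made up.
def tokenize(text):
    return set(text.lower().split())

articles = [
    "The home team won the final match 3 to 1.",
    "Heavy rain delayed the marathon by two hours.",
]

def retrieve_passage(question, passages):
    """Return the passage sharing the most keywords with the question."""
    q_tokens = tokenize(question)
    return max(passages, key=lambda p: len(q_tokens & tokenize(p)))

print(retrieve_passage("Who won the final match?", articles))
```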


A Study on Automatic Indexing of Korean Texts based on Statistical Criteria (통계적기법에 의한 한글자동색인의 연구)

  • Woo, Dong-Chin
    • Journal of the Korean Society for Information Management
    • /
    • v.4 no.1
    • /
    • pp.47-86
    • /
    • 1987
  • The purpose of this study is to present an effective automatic indexing method for Korean texts based on statistical criteria. Titles and abstracts of 299 documents randomly selected from ETRI's DOCUMENT database are used as the experimental data. The data are divided into four word groups, and each group is analyzed and evaluated by applying three automatic indexing methods: Transition Phenomena of Word Occurrence, Inverse Document Frequency Weighting, and Term Discrimination Weighting.
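  Of the three indexing methods named in the abstract, inverse document frequency weighting is the simplest to illustrate. The sketch below computes IDF weights over a toy document set and keeps the highest-weighted terms as candidate index terms; the documents and the cutoff are assumptions for illustration, not the ETRI data.

```python
# Minimal sketch of inverse document frequency (IDF) weighting for indexing.
# The documents and the 0.5 cutoff are toy assumptions, not the ETRI data set.
import math
from collections import Counter

documents = [
    "automatic indexing of korean texts",
    "statistical criteria for indexing",
    "retrieval of korean documents",
]
tokenized = [doc.split() for doc in documents]
n_docs = len(tokenized)

# Document frequency: number of documents containing each term.
doc_freq = Counter(term for doc in tokenized for term in set(doc))

# IDF weight: terms concentrated in few documents get higher weights.
idf = {term: math.log(n_docs / df) for term, df in doc_freq.items()}

# Keep terms above an arbitrary threshold as candidate index terms.
index_terms = sorted(term for term, weight in idf.items() if weight > 0.5)
print(index_terms)
```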


A Study on the Recognition of Mixed Documents Consisting of Texts and Graphic Images (텍스트와 그래픽으로 구성된 혼합문서 인식에 관한 연구)

  • 함영국;김인권;정홍규;박래홍;이창범;김상중;윤병남
    • Journal of the Korean Institute of Telematics and Electronics B
    • /
    • v.31B no.7
    • /
    • pp.76-90
    • /
    • 1994
  • In this paper, an efficient algorithm is proposed that recognizes mixed documents consisting of printed Korean/alphanumeric text and graphic images. In the preprocessing step, the input document is aligned, if necessary, by rotating it: the rotation angle is obtained using the Hough transform and the document is aligned horizontally. Graphic image parts are then separated from text parts by considering the chain codes of connected components, and individual characters are further separated using vertical and horizontal projections. In the recognition step, Korean and alphanumeric characters are classified, and each is recognized hierarchically using several features. In summary, an efficient recognition algorithm for mixed documents is proposed and its performance is demonstrated via computer simulations.
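  The skew-correction step described in the abstract (Hough transform to estimate the rotation angle, then horizontal alignment) can be sketched with OpenCV as below. This is a generic illustration, not the authors' implementation; "scan.png" is a placeholder file name, and the sign of the correction angle may need flipping for a given scanning convention.

```python
# Sketch of document skew estimation with the Hough transform (OpenCV).
# Generic illustration, not the paper's implementation; "scan.png" is a placeholder.
import cv2
import numpy as np

image = cv2.imread("scan.png", cv2.IMREAD_GRAYSCALE)
edges = cv2.Canny(image, 50, 150)

# Detect straight lines; each line is (rho, theta) in Hesse normal form.
lines = cv2.HoughLines(edges, 1, np.pi / 180, 200)

# Estimate the dominant skew: horizontal text lines have theta near pi/2.
angles = [theta - np.pi / 2 for ((rho, theta),) in lines]
skew_deg = np.degrees(np.median(angles))

# Rotate the page so text lines become horizontal (sign convention may vary).
h, w = image.shape
rotation = cv2.getRotationMatrix2D((w / 2, h / 2), skew_deg, 1.0)
aligned = cv2.warpAffine(image, rotation, (w, h), borderValue=255)
cv2.imwrite("scan_aligned.png", aligned)
```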


Features of an Error Correction Memory to Enhance Technical Texts Authoring in LELIE

  • SAINT-DIZIER, Patrick
    • International Journal of Knowledge Content Development & Technology
    • /
    • v.5 no.2
    • /
    • pp.75-101
    • /
    • 2015
  • In this paper, we investigate the notion of an error correction memory applied to technical texts. The main purpose is to introduce flexibility and context sensitivity into the detection and correction of errors related to Constrained Natural Language (CNL) principles. This is realized by enhancing error detection paired with relatively generic correction patterns and contextual correction recommendations. Patterns are induced from previous corrections made by technical writers for a given type of text. The impact of such an error correction memory is also investigated from the point of view of the technical writer's cognitive activity. The notion of error correction memory is developed within the framework of the LELIE project; an experiment is carried out on the case of fuzzy lexical items and negation, which are both major problems in technical writing. Language processing and knowledge representation aspects are developed together with evaluation directions.
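  For illustration only, the detection side of this task (flagging fuzzy lexical items and negation so a technical writer can review them) can be sketched as a word-list pass; the word lists below are small assumed samples, not the LELIE project's resources.

```python
# Toy sketch of flagging fuzzy lexical items and negation in technical text.
# The word lists are small assumed samples, not the LELIE resources.
import re

FUZZY_TERMS = {"approximately", "sufficient", "adequate", "regularly", "properly"}
NEGATIONS = {"not", "never", "no"}

def flag_controlled_language_issues(sentence):
    """Return (category, term) pairs for words a technical writer should review."""
    words = re.findall(r"[a-z']+", sentence.lower())
    issues = [("fuzzy", w) for w in words if w in FUZZY_TERMS]
    issues += [("negation", w) for w in words if w in NEGATIONS]
    return issues

print(flag_controlled_language_issues(
    "Do not overtighten; apply approximately three turns."))
```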

Modified Version of SVM for Text Categorization

  • Jo, Tae-Ho
    • International Journal of Fuzzy Logic and Intelligent Systems
    • /
    • v.8 no.1
    • /
    • pp.52-60
    • /
    • 2008
  • This research proposes a new strategy in which documents are encoded into string vectors for text categorization, together with a modified version of SVM adapted to string vectors. Traditionally, when SVM is used for pattern classification, raw data must be encoded into numerical vectors. This encoding may be difficult, depending on the application area; for example, in text categorization, encoding full texts given as raw data into numerical vectors leads to two main problems: huge dimensionality and sparse distribution. In this research, we encode full texts into string vectors and apply the modified version of SVM adapted to string vectors for text categorization.
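  The sparse, high-dimensional numerical encoding that this abstract argues against is the conventional bag-of-words baseline. A minimal scikit-learn sketch of that baseline (not the paper's string-vector SVM) is shown below, with made-up training texts and labels.

```python
# Conventional numerical-vector baseline for text categorization with an SVM.
# This illustrates the high-dimensional sparse encoding the paper contrasts
# against; it is not the paper's string-vector SVM. Texts and labels are made up.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC
from sklearn.pipeline import make_pipeline

train_texts = [
    "the striker scored twice in the final",
    "the goalkeeper saved a late penalty",
    "the central bank raised interest rates",
    "stock markets fell after the announcement",
]
train_labels = ["sports", "sports", "finance", "finance"]

# TfidfVectorizer yields a very sparse, very high-dimensional matrix on real corpora.
model = make_pipeline(TfidfVectorizer(), LinearSVC())
model.fit(train_texts, train_labels)

print(model.predict(["the midfielder missed a penalty"]))
```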

Self-rated ability to follow instructions for four mental states described in yoga texts

  • Ramachandra, Raghavendra Bhat;Telles, Shirley;Hongasandra, Nagendra Rama Rao
    • CELLMED
    • /
    • v.2 no.3
    • /
    • pp.28.1-28.4
    • /
    • 2012
  • There were no studies available measuring the ability to follow instructions for meditation. Hence, the present study was planned to assess the ability to follow instructions for the four mental states viz., cancalata (random thinking), ekagrata (non-meditative concentration), dharana (focused meditation) and dhyana (defocused meditation or effortless meditation) described in yoga texts. Sixty male volunteers with ages ranging from 18 to 31 years (group mean age ± S.D., 22.78 ± 2.73) participated in the study. They were assessed using a visual analog scale immediately after each of the four states on four different days. The results showed that following dharana, scores on the visual analog scale were significantly lower compared to those related to cancalata, ekagrata and dhyana. Hence, dharana is the most difficult of the four states.

Identification of Chinese Personal Names in Unrestricted Texts

  • Cheung, Lawrence;Tsou, Benjamin K.;Sun, Mao-Song
    • Proceedings of the Korean Society for Language and Information Conference
    • /
    • 2002.02a
    • /
    • pp.28-35
    • /
    • 2002
  • Automatic identification of Chinese personal names in unrestricted texts is a key task in Chinese word segmentation, and can affect other NLP tasks such as information retrieval if it is not properly addressed. This paper (1) demonstrates the problems of Chinese personal name identification in some applications, (2) analyzes the structure of Chinese personal names, and (3) presents the relevant processing strategies. The geographical differences of Chinese personal names between Beijing and Hong Kong are highlighted at the end, showing that variation in names across different Chinese communities constitutes a critical factor in designing Chinese personal name identification algorithms.
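  The structural fact this line of work exploits (a closed set of common surnames followed by a one- or two-character given name) can be illustrated with a toy candidate generator; the surname list and the example sentence below are tiny assumed samples, not the paper's resources, and a real system would rank or filter the over-generated candidates.

```python
# Toy sketch of surname-driven candidate generation for Chinese personal names.
# The surname list and the example sentence are assumptions for illustration only.
COMMON_SURNAMES = {"张", "王", "李", "陈", "孙"}

def name_candidates(text):
    """Yield substrings that look like surname + 1- or 2-character given name."""
    for i, ch in enumerate(text):
        if ch in COMMON_SURNAMES:
            for length in (2, 3):               # surname + 1 or 2 characters
                candidate = text[i:i + length]
                if len(candidate) == length:
                    yield candidate

print(list(name_candidates("记者王小明报道，李华出席了会议。")))
```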


Implementation of Korean TTS System based on Natural Language Processing (자연어 처리 기반 한국어 TTS 시스템 구현)

  • Kim Byeongchang;Lee Gary Geunbae
    • MALSORI
    • /
    • no.46
    • /
    • pp.51-64
    • /
    • 2003
  • In order to produce high-quality synthesized speech, it is very important to obtain an accurate grapheme-to-phoneme conversion and prosody model from texts using natural language processing. Robust preprocessing for non-Korean characters is also required. In this paper, we analyze Korean texts using a morphological analyzer, part-of-speech tagger and syntactic chunker. We present a new grapheme-to-phoneme conversion method for Korean, using a hybrid of a phonetic pattern dictionary and CCV (consonant vowel) LTS (letter-to-sound) rules, for unlimited-vocabulary Korean TTS. We constructed a prosody model using a probabilistic method and a decision-tree-based method: the probabilistic method alone usually suffers from performance degradation due to inherent data sparseness problems, so we adopted tree-based error correction to overcome these training data limitations.
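  The hybrid grapheme-to-phoneme idea described here (exception dictionary first, letter-to-sound rules as fallback) can be sketched as follows. The toy dictionary entries and the identity fallback are placeholders and do not reproduce the paper's phonetic pattern dictionary or CCV LTS rule set.

```python
# Sketch of hybrid grapheme-to-phoneme conversion: dictionary lookup first,
# rule-based fallback second. Entries and the fallback are toy placeholders,
# not the paper's phonetic pattern dictionary or CCV LTS rules.
EXCEPTION_DICT = {
    "있다": "읻따",     # irregular pronunciations looked up directly
    "값이": "갑씨",
}

def rule_based_lts(word):
    """Fallback letter-to-sound step; identity here as a stand-in for real rules."""
    return word

def grapheme_to_phoneme(word):
    return EXCEPTION_DICT.get(word, rule_based_lts(word))

for w in ["있다", "하늘"]:
    print(w, "->", grapheme_to_phoneme(w))
```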


Inverted Index based Modified Version of K-Means Algorithm for Text Clustering

  • Jo, Tae-Ho
    • Journal of Information Processing Systems
    • /
    • v.4 no.2
    • /
    • pp.67-76
    • /
    • 2008
  • This research proposes a new strategy in which documents are encoded into string vectors, together with a modified version of the k-means algorithm adapted to string vectors for text clustering. Traditionally, when the k-means algorithm is used, raw data should be encoded into numerical vectors. This encoding may be difficult, depending on the application area; for example, in text clustering, encoding full texts given as raw data into numerical vectors leads to two main problems: huge dimensionality and sparse distribution. In this research, we encode full texts into string vectors and modify the k-means algorithm to be adaptable to string vectors for text clustering.
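  As with the SVM entry above, the conventional numerical-vector baseline that this paper contrasts against can be sketched with scikit-learn. The sketch below is the standard TF-IDF plus k-means pipeline, not the paper's string-vector or inverted-index variant, and the documents are made up.

```python
# Conventional TF-IDF + k-means baseline for text clustering.
# This is the high-dimensional sparse encoding the paper contrasts against,
# not its string-vector / inverted-index variant. Documents are made up.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

documents = [
    "the team won the championship game",
    "players trained before the match",
    "the bank cut interest rates again",
    "inflation pushed bond yields higher",
]

vectors = TfidfVectorizer().fit_transform(documents)   # sparse document-term matrix
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(vectors)

for doc, label in zip(documents, labels):
    print(label, doc)
```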