• Title/Summary/Keyword: Text analysis


A Study on Automatic Binarization of Text Region Using a Stroke Filter (스트록 필터를 이용한 문자영역 이진화에 관한 연구)

  • Jung, Cheol-Kon;Kim, Jong-Kyu
    • The Journal of Korean Institute of Communications and Information Sciences / v.33 no.2C / pp.178-183 / 2008
  • Videotext provides important semantic clues for video content analysis. In this paper, we propose an automatic binarization method for text regions using a stroke filter. The proposed method consists of stroke filtering, text color polarity determination, and local region growing. Using the responses of dark and bright stroke filters, we determine the color polarity of a text region automatically. The method is robust against complex backgrounds because the stroke filter takes the stroke information of videotext into account. The effectiveness of our method is verified by experiments on a challenging database.
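
The polarity step described in this abstract can be illustrated with a minimal sketch. It assumes a heavily simplified horizontal "stroke filter" that compares each pixel with its two neighbors at a fixed offset; the function names, the `width` parameter, and the filter definition are illustrative assumptions, not the filter defined in the paper.

```python
def stroke_responses(image, width=1):
    """Return (bright_sum, dark_sum) for a simplified horizontal stroke filter.

    image: list of rows of grayscale values (0-255).
    width: assumed offset to the side pixels (hypothetical parameter).
    """
    bright = 0.0
    dark = 0.0
    for row in image:
        for x in range(width, len(row) - width):
            center = row[x]
            left, right = row[x - width], row[x + width]
            # Bright stroke response: center is brighter than both sides.
            bright += max(0, center - max(left, right))
            # Dark stroke response: center is darker than both sides.
            dark += max(0, min(left, right) - center)
    return bright, dark


def text_polarity(image, width=1):
    """Pick the polarity whose accumulated response is larger (ties go to dark)."""
    bright, dark = stroke_responses(image, width)
    return "bright" if bright > dark else "dark"


# A bright vertical stroke on a dark background, and its inverse.
bright_text = [[10, 10, 200, 10, 10]] * 3
dark_text = [[200, 200, 10, 200, 200]] * 3
print(text_polarity(bright_text))  # bright
print(text_polarity(dark_text))    # dark
```

Comparing accumulated responses over the whole region, rather than per pixel, is one simple way to make the decision stable on noisy backgrounds.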

Text Document Categorization using FP-Tree (FP-Tree를 이용한 문서 분류 방법)

  • Park, Yong-Ki;Kim, Hwang-Soo
    • Journal of KIISE: Software and Applications / v.34 no.11 / pp.984-990 / 2007
  • As the number of electronic documents increases explosively, automatic text categorization methods are needed to identify those of interest. Most methods use machine learning techniques based on a word set. This paper introduces a new method called FPTC (FP-Tree based Text Classifier). The FP-Tree is a data structure used in data mining. We present a method of storing text sentence patterns in the FP-Tree structure and classifying text using those patterns. In the experiments, we combine our algorithm with a Mutual Information and Entropy approach to improve performance. We also present an analysis of the algorithm via an ordinary differential categorization method.
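
For readers unfamiliar with the FP-Tree, the following is a minimal sketch of how word patterns can be stored in one. It shows only the generic data structure (shared prefixes with counts), not the FPTC classifier itself; `FPNode`, `build_fp_tree`, and `min_support` are names chosen here for illustration.

```python
from collections import Counter


class FPNode:
    """One node of the tree: an item, its count, and its child nodes."""

    def __init__(self, item=None):
        self.item = item
        self.count = 0
        self.children = {}


def build_fp_tree(transactions, min_support=1):
    """Insert each transaction (a list of words) into a prefix tree.

    Words are ordered by descending global frequency so that common
    words share prefixes near the root.
    """
    freq = Counter(item for t in transactions for item in set(t))
    root = FPNode()
    for t in transactions:
        items = [i for i in set(t) if freq[i] >= min_support]
        items.sort(key=lambda i: (-freq[i], i))  # frequency first, then name
        node = root
        for item in items:
            node = node.children.setdefault(item, FPNode(item))
            node.count += 1
    return root


sentences = [["text", "mining", "data"], ["text", "mining"], ["text", "image"]]
tree = build_fp_tree(sentences)
print(tree.children["text"].count)                     # 3
print(tree.children["text"].children["mining"].count)  # 2
```

A classifier built on such a tree would presumably keep one tree (or one set of counts) per category and score a new sentence by the patterns it matches; that part is not shown here.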

The Effects of Inferential Reading Strategy Program on Text Comprehension and Korean Language Academic Achievements of Vocational High School Students (추론적 읽기전략 프로그램이 전문계 고등학생의 텍스트 이해와 국어과 학업성취에 미치는 효과)

  • Kim, Seon-Kyung;Yune, So-Jung;Kim, Jung-Sub
    • Journal of Fisheries and Marine Sciences Education / v.23 no.1 / pp.1-12 / 2011
  • The purpose of this study was to examine the effects of an inferential reading strategy program on the text comprehension and Korean language academic achievement of vocational high school students. We developed an inferential reading strategy program, applied it in an educational setting, and examined its effects on learners' text comprehension ability and Korean language academic achievement. ANCOVA was used for data analysis with the SPSS ver. 12.0 statistical program. The main findings were as follows. First, the experimental group, which took part in the inferential reading strategy program, showed a statistically significant difference in text comprehension ability from the control group. Second, the experimental group showed a statistically significant difference in Korean language academic achievement from the control group. The study shows that the inferential reading strategy program had an effect on the text comprehension and Korean language academic achievement of vocational high school students.

Prediction of Physical Examination Demand Using Text Mining (텍스트 마이닝을 이용한 건강검진 수요 예측)

  • Park, Kyungbo;Kim, Mi Ryang
    • Journal of Information Technology Services / v.21 no.5 / pp.95-106 / 2022
  • Recently, physical examinations have become an important strategy for reducing costs for individuals and society. Pre-examination counseling is important for an effective physical examination. However, counseling is often incomplete because the demand for physical examinations is not predicted. In this study, therefore, the demand for physical examinations was predicted using text mining and stepwise regression. The analysis showed that the most recent text data had high explanatory power for the demand for physical examinations, and that larger amounts of data also had high explanatory power. In addition, a high frequency of the term "health food" was found to reduce the number of health examination customers, and the higher the frequency of the word "food", the lower the number of physical examination customers. However, when the term "wild ginseng" was widely exposed on Twitter, the number of physical examination customers visiting hospitals increased. In other words, customers consume efficiently by comparing the price of a health examination with the price of consumer goods. The proposed research framework can help predict demand in other industries.

A Study on the Food Culture of Literature in the late period of the Chosun Dynasty - Focused on Five Pansori texts into written form- (조선후기 문학에 나타난 음식문화 특성 - 판소리 다섯마당을 중심으로 -)

  • Kim, Mi-Hye;Chung, Hae-Kyung
    • Journal of the Korean Society of Food Culture / v.22 no.4 / pp.393-403 / 2007
  • This study analyzes the food materials, foods, and cooking tools that appear in novel literature, and examines food as a code of the common people's culture of the time through five of the twelve Pansori texts set down in written form. Many versions of Pansori exist, so this study selected early manuscript copies for analysis. The Simchong-ga text shows rice, Kimchi, and salted fish as the food of the common people. The Chunhyang-ga text reveals the characteristics of Jeolla-do area food, which used many food materials, and the acceptance of foreign crops in the late Chosun period. In the Hungbo-ga text, rice cake and meat appear as popular foods, and dog meat, rice cake, and scorched-rice tea stand out as special features. The Toebyol-ga text features many kinds of seafood and medicinal beverages, and the Chokpyok-ga text shows the peculiarity of drinking to raise spirits during a war. Moreover, the five Pansori texts mention items such as tableware covers, spoons and chopsticks, china tableware, a cauldron, a charcoal burner, a brass chafing dish, a table, a flail, and a mill.

Flatness Characteristics Analysis Technique of Attenuator Using Thermal Voltage Converter and AC Measurement Standard (열전압변환기와 교류측정표준을 사용한 감쇠기 평탄도 특성 분석 기법)

  • Cha, Yun-bae;Kim, Boo-il
    • Journal of the Korea Institute of Information and Communication Engineering / v.22 no.2 / pp.330-337 / 2018
  • This paper proposes a technique for analyzing the flatness characteristics of an attenuator from 10 Hz to 50 MHz, referenced to 1 kHz, using a Thermal Voltage Converter (TVC) and an AC measurement standard. In the proposed technique, the input voltage of the attenuator at each measuring frequency is held at the same level as at 1 kHz using the TVC, and the flatness characteristics of the attenuator are analyzed from the voltage variation indicated by the AC measurement standard. The analysis shows that the attenuator flatness can be measured with a maximum uncertainty of 866 µV/V from 10 dB to 70 dB, which is about 37% of the 2.31 mV/V obtained with the network measurement method. The improved attenuator flatness values can be applied to frequency flatness calibration from 2.2 V down to 2.2 mV in the low-voltage range of the AC measurement standard.
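
As a quick check, the two uncertainty figures quoted in this abstract can be put in the same unit and compared (a worked ratio only, not a calculation taken from the paper):

$$\frac{866\ \mu\text{V/V}}{2.31\ \text{mV/V}} = \frac{866}{2310} \approx 0.375$$

That is, the reported maximum uncertainty is roughly 37% of the network-measurement figure.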

Text Analytics for Classifying Types of Accident Occurrence Using Accident Report Documents (사고보고문서를 이용한 텍스트 기반 사고발생 유형 및 관계 분석)

  • Kim, Beom Soo;Chang, Seongrok;Suh, Yongyoon
    • Journal of the Korean Society of Safety / v.33 no.3 / pp.58-64 / 2018
  • Recently, a large number of accident report documents containing critical information about accidents have accumulated in almost all industries. Accordingly, the text data in accident report documents are considered useful for understanding accident processes. However, systematic approaches to analyzing accident report documents have been lacking. This paper therefore proposes a text analytics approach to extracting critical information on accident processes. Specifically, the major causes of accident occurrence are classified from the text of accident report documents using both text mining and latent Dirichlet allocation (LDA) algorithms. The text mining algorithm is used to build the document-term matrix, and the LDA algorithm is applied to extract the latent topics contained in a large number of accident report documents. We extract ten accident topics as accident types, together with the related keywords for each type. A cause-and-effect diagram is then drawn as a tool for navigating the processes of accident occurrence by structuring the causes extracted with LDA. Further, accident trends are identified to explore the patterns of accident occurrence for each type. Three patterns (increasing then decreasing, decreasing then increasing, and only increasing) are presented for the case of a chemical plant. The proposed approach helps safety managers systematically supervise the causes and processes of accidents through analysis of the text in accident report documents.
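
The document-term matrix mentioned in this abstract can be sketched in a few lines. This is a minimal illustration assuming whitespace tokenization and raw term counts; the function name and the sample reports are made up for the example, and the LDA step that would consume the matrix is not shown.

```python
def document_term_matrix(docs):
    """Return (vocabulary, matrix) where matrix[d][t] counts term t in document d."""
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({word for tokens in tokenized for word in tokens})
    index = {word: i for i, word in enumerate(vocab)}
    matrix = []
    for tokens in tokenized:
        row = [0] * len(vocab)
        for word in tokens:
            row[index[word]] += 1  # raw term frequency
        matrix.append(row)
    return vocab, matrix


# Hypothetical accident report snippets.
reports = ["leak valve leak", "fire valve"]
vocab, dtm = document_term_matrix(reports)
print(vocab)  # ['fire', 'leak', 'valve']
print(dtm)    # [[0, 2, 1], [1, 0, 1]]
```

A topic model such as LDA takes a matrix of this shape as input and returns, for each topic, a distribution over the vocabulary.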

Research on Methods for Processing Nonstandard Korean Words on Social Network Services (소셜네트워크서비스에 활용할 비표준어 한글 처리 방법 연구)

  • Lee, Jong-Hwa;Le, Hoanh Su;Lee, Hyun-Kyu
    • Journal of Korea Society of Industrial Information Systems / v.21 no.3 / pp.35-46 / 2016
  • Social network services (SNS), which help users build relationship networks and freely share particular interests or activities by posting comments, photos, videos, and so on in online communities such as blogs, have been widely adopted and developed as a social phenomenon. Several studies have explored patterns and valuable information in social network data via text mining techniques such as opinion mining and semantic analysis. Keyword-based approaches have been applied to improve the efficiency of text mining, but most researchers have pointed out limitations arising from the rules of Korean orthography. This research aims to construct a database of non-standard Korean words that are difficult to handle in data mining, such as abbreviations, slang, unusual expressions, and emoticons, in order to overcome the limitations of keyword-based text mining techniques. Based on a study of subjective opinions about specific topics on blogs, this research extracted non-standard words that were found useful in the text mining process.

Text Extraction In WWW Images (웹 영상에 포함된 문자 영역의 추출)

  • 김상현;심재창;김중수
    • Proceedings of the IEEK Conference / 2000.06d / pp.15-18 / 2000
  • In this paper, we propose a method for extracting text from Web images. Our approach is based on contrast detection and pixel component ratio analysis at the mouse position. The extracted data, passed through OCR, can be used for real-time dictionary lookup or language translation applications in a Web browser.


A Study on Semiotic Analysis of Popular Songs' lyric - Analysis of 'Kang San-ae's 2nd album' Lyrics as a Text - (대중가요 노랫말의 기호학적 분석 - Text로서 강산에 2집의 노랫말 분석을 중심으로 -)

  • Joung, Woo-Il;Cho, Tae-Seon
    • Proceedings of the KAIS Fall Conference / 2010.11b / pp.528-530 / 2010
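  • This paper analyzes music, and in particular the lyrics of popular songs, on the basis of semiotics as a sociological perspective, focusing on the song lyrics of Kang San-ae. The second album was released in 1994; in addition to the current of "critical" lyrics that reflected the social atmosphere of the time, the paper also analyzes the "emotional" and "conscious" structures of the lyrics.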
