• Title/Summary/Keyword: text

A MVC Framework for Visualizing Text Data (텍스트 데이터 시각화를 위한 MVC 프레임워크)

  • Choi, Kwang Sun;Jeong, Kyo Sung;Kim, Soo Dong
    • Journal of Intelligence and Information Systems / v.20 no.2 / pp.39-58 / 2014
  • As big data and its related technologies grow in importance across industry, visualizing the results of big data processing and analysis has drawn increasing attention. Visualization helps people understand analysis results effectively and clearly, and it also serves as the GUI (Graphical User Interface) through which people communicate with analysis systems. To ease development and maintenance, these GUI parts should be loosely coupled with the parts that process and analyze data. Implementing such a loosely coupled architecture calls for design patterns such as MVC (Model-View-Controller), which is designed to minimize coupling between the UI and data processing. Big data can be classified as structured or unstructured, and structured data is relatively easy to visualize compared with unstructured data. Even so, as the use and analysis of unstructured data has spread, developers usually build a visualization system for each individual project to overcome the limitations of traditional visualization systems designed for structured data. Text data, which makes up a large share of unstructured data, is even harder to visualize, because the technologies for analyzing it, such as linguistic analysis, text mining, and social network analysis, are complex and not standardized. This makes it difficult to reuse one project's visualization system in other projects. We assume the cause is a lack of commonality in visualization system design, with little consideration for extending a system to other systems. In this research, we suggest a common information model for visualizing text data and propose a comprehensive and reusable framework, TexVizu, for visualizing text data. First, we survey representative research in the text visualization area. 
We also identify common elements of text visualization and common patterns across its various cases. We then review and analyze these elements and patterns from three viewpoints (structural, interactive, and semantic) and design an integrated model of text data that represents the elements for visualization. The structural viewpoint identifies structural elements of text documents, such as title, author, and body. The interactive viewpoint identifies the types of relations and interactions between text documents, such as post, comment, and reply. The semantic viewpoint identifies semantic elements that are extracted by linguistic analysis of text data and represented as tags classifying entity types such as people, place or location, time, and event. Next, we extract and select common requirements for visualizing text data. The requirements fall into four types: structure information, content information, relation information, and trend information. Each type comprises the required visualization techniques, data, and goal (what to know). These are the common, key requirements for designing a framework that keeps a visualization system loosely coupled with data processing or analysis systems. Finally, we design a common text visualization framework, TexVizu, which is reusable and extensible across visualization projects by collaborating with various Text Data Loaders and Analytical Text Data Visualizers through common interfaces such as ITextDataLoader and IATDProvider. TexVizu also comprises the Analytical Text Data Model, Analytical Text Data Storage, and Analytical Text Data Controller. In this framework, external components are specifications of the interfaces required to collaborate with the framework. 
As an experiment, we apply the framework to two text visualization systems: a social opinion mining system and an online news analysis system.
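
The loose coupling described in this abstract can be pictured with a minimal Python sketch. Only the names ITextDataLoader, IATDProvider, and the Analytical Text Data Model, Storage, and Controller come from the abstract; every method name, field, and the in-memory loader below is a hypothetical illustration, and the assumption that the storage plays the IATDProvider role is ours, not the paper's.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class AnalyticalTextData:
    """Hypothetical model record covering the three viewpoints."""
    title: str                       # structural element
    author: str                      # structural element
    body: str                        # structural element
    reply_to: Optional[str] = None   # interactive element (post/reply relation)
    entities: Dict[str, List[str]] = field(default_factory=dict)  # semantic tags


class ITextDataLoader(ABC):
    """Interface name from the abstract; the method is an assumption."""

    @abstractmethod
    def load(self) -> List[AnalyticalTextData]:
        ...


class IATDProvider(ABC):
    """Interface name from the abstract; the method is an assumption."""

    @abstractmethod
    def provide(self, keyword: str) -> List[AnalyticalTextData]:
        ...


class AnalyticalTextDataStorage(IATDProvider):
    """Keeps loaded records and serves them to views (assumed role)."""

    def __init__(self) -> None:
        self._items: List[AnalyticalTextData] = []

    def save(self, items: List[AnalyticalTextData]) -> None:
        self._items.extend(items)

    def provide(self, keyword: str) -> List[AnalyticalTextData]:
        needle = keyword.lower()
        return [d for d in self._items if needle in d.body.lower()]


class AnalyticalTextDataController:
    """Moves data from any loader into storage; knows nothing about views."""

    def __init__(self, loader: ITextDataLoader, storage: AnalyticalTextDataStorage) -> None:
        self._loader = loader
        self._storage = storage

    def refresh(self) -> int:
        items = self._loader.load()
        self._storage.save(items)
        return len(items)


class InMemoryLoader(ITextDataLoader):
    """Hypothetical loader standing in for a project-specific data source."""

    def __init__(self, docs: List[AnalyticalTextData]) -> None:
        self._docs = docs

    def load(self) -> List[AnalyticalTextData]:
        return list(self._docs)


def title_list_view(provider: IATDProvider, keyword: str) -> List[str]:
    """A trivial 'view' that depends only on the provider interface."""
    return [d.title for d in provider.provide(keyword)]
```

Because the view function sees only the provider interface, a different loader or analysis back end could be swapped in without touching it, which is the kind of reuse the abstract argues for.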

Text Region Extraction from Videos using the Harris Corner Detector (해리스 코너 검출기를 이용한 비디오 자막 영역 추출)

  • Kim, Won-Jun;Kim, Chang-Ick
    • Journal of KIISE:Software and Applications / v.34 no.7 / pp.646-654 / 2007
  • In recent years, text inserted into TV content has been used increasingly to give viewers better visual understanding. In this paper, video text is defined as a superimposed text region located at the bottom of the video. Video text extraction is the first step in video information retrieval and video indexing. Most previous video text detection and extraction methods are based on text color, contrast between text and background, edges, character filters, and so on. However, video text extraction remains difficult because of low video resolution and complex backgrounds. To solve these problems, we propose a method that extracts text from videos using the Harris corner detector. The proposed algorithm consists of four steps: corner map generation using the Harris corner detector, extraction of text candidates based on corner density, text region determination using labeling, and post-processing. The algorithm is language independent and can be applied to text of various colors. Text region updates between frames are also exploited to reduce processing time. Experiments on diverse videos confirm the efficiency of the proposed method.
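
As a rough illustration of the first step only (corner map generation), the sketch below computes the standard Harris response R = det(M) - k * trace(M)^2 in plain Python. The 3x3 unweighted window, central-difference gradients, and k = 0.04 are common textbook choices assumed here; the abstract does not state the paper's actual parameters, and the later density, labeling, and post-processing steps are not shown.

```python
def harris_response(img, k=0.04):
    """Return the Harris corner response for a 2-D list of gray levels.

    Assumptions: central-difference gradients, a 3x3 box window, and
    k = 0.04. Border pixels are left at 0.0.
    """
    h, w = len(img), len(img[0])
    ix = [[0.0] * w for _ in range(h)]
    iy = [[0.0] * w for _ in range(h)]
    # Image gradients on interior pixels.
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            ix[y][x] = (img[y][x + 1] - img[y][x - 1]) / 2.0
            iy[y][x] = (img[y + 1][x] - img[y - 1][x]) / 2.0

    resp = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # Accumulate the structure tensor M over the 3x3 window.
            sxx = syy = sxy = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    gx = ix[y + dy][x + dx]
                    gy = iy[y + dy][x + dx]
                    sxx += gx * gx
                    syy += gy * gy
                    sxy += gx * gy
            det = sxx * syy - sxy * sxy
            trace = sxx + syy
            resp[y][x] = det - k * trace * trace
    return resp


# Toy image: a bright block whose top-left corner sits at (row 4, col 4).
toy = [[1.0 if (y >= 4 and x >= 4) else 0.0 for x in range(9)] for y in range(9)]
response = harris_response(toy)
```

On this toy image the response is positive at the block's corner, negative along its straight edges, and zero in flat areas, which is why thresholding the response yields a corner map whose density can then be examined for text candidates.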

Text Extraction from Complex Natural Images

  • Kumar, Manoj;Lee, Guee-Sang
    • International Journal of Contents / v.6 no.2 / pp.1-5 / 2010
  • The rapid growth in communication technology has led to the development of effective ways of sharing ideas and information in the form of speech and images. Understanding this information has become an important research issue and has drawn the attention of many researchers. Text in a digital image contains much important information about the scene. Detecting and extracting this text is a difficult task with many challenging issues. The main challenges in extracting text from natural scene images are variations in font size, text alignment, and font color, as well as illumination changes and reflections in the images. In this paper, we propose a connected-component-based method to automatically detect text regions in natural images. Since text regions in images consist mostly of repeated vertical strokes, we look for a pattern of closely packed vertical edges. Once a group of edges is found, the neighboring vertical edges are connected to each other. Connected regions whose geometric features lie outside the valid specifications are considered outliers and eliminated. The proposed method is more effective than existing methods for slanted or curved characters. Experimental results are given to validate our approach.

An Audio-Visual Teaching Aid (AVTA) with Scrolling Display and Speech to Text over the Internet

  • Davood Khalili;Chung, Wan-Young
    • Proceedings of the IEEK Conference / 2003.07c / pp.2649-2652 / 2003
  • This paper presents an Audio-Visual Teaching Aid (AVTA) for use in the classroom and over the Internet. The system, which was designed and tested, consists of a wireless microphone system, Text to Speech conversion software, a noise filtering circuit, and a computer. An IBM-compatible PC with a sound card and network interface card, a Web browser, and a voice and text messenger service were used to provide slightly delayed text and voice over the Internet for remote learning, while providing scrolling text from a real-time lecture in the classroom. The system was designed to aid Korean students who may have difficulty with listening comprehension while having fairly good reading ability. Its application is twofold: it helps students in a class to view and listen to a lecture, and it serves as a vehicle for remote access (audio and text) to a classroom lecture. The project provides a simple, low-cost solution for remote learning and gives a student access to the classroom in emergency situations when the student cannot attend. In addition, the system allows a student to capture a teacher's lecture in audio and text form without being present in class or taking many notes. The system will therefore help students in many ways.

A Study on Information Resource Evaluation for Text Categorization (문서범주화 효율성 제고를 위한 정보원 평가에 관한 연구)

  • Chung, Eun-Kyung
    • Journal of the Korean Society for Information Management / v.24 no.4 / pp.305-321 / 2007
  • The purpose of this study is to examine whether the information resources that human indexers consult during the indexing process are effective for text categorization. More specifically, information resources from bibliographic information as well as full-text information were explored using a typical data set of scientific journal articles. The experimental results showed that information resources such as citation, source title, and title were not significantly different from full text, whereas keyword was significantly different from full text. These findings indicate that the information resources consulted by human indexers are good candidates for text categorization in automatic subject term assignment.

A Comparison of Socio-linguistic Characteristics and Instructional Influences of Different Types of Informational Science Texts (정보적 과학 텍스트의 사회-언어학적 특징과 초등 과학 학습에 미치는 효과)

  • Lim, Hee-Jun;Kim, Hyun-Kyung
    • Journal of Korean Elementary Science Education / v.30 no.2 / pp.232-241 / 2011
  • The purpose of this study was to compare the socio-linguistic characteristics and instructional influences of two types of texts: narrative and expository. The socio-linguistic characteristics of the two types were analyzed in terms of content specialization, linguistic formality, and social-pedagogic relationships. Expository texts showed strong scientific classification, a medium level of linguistic formality, and a low level of social-pedagogic relationships. Narrative texts showed different characteristics. The instructional effects were investigated with 91 fifth-grade elementary students in three classes. The three classes were randomly assigned to three groups: an expository text group, a narrative text group, and a control group. The results showed that the science achievement scores of the narrative text group were higher than those of the other groups, while the affective domain test scores of the expository text group were higher than those of the other groups. Students' perceptions of informational science texts were generally positive for both types of texts.

Protein Structure Prediction of ᴅ-Psicose 3-Epimerase (DPE) from Clostridium hylemonae Using GalaxyTBM

  • Lee, Hyeon-Jin;Park, Ji-Hyeon;Choe, Yeon-Uk;Lee, Geun-U
    • Proceeding of EDISON Challenge / 2015.03a / pp.177-183 / 2015
  • ᴅ-Psicose 3-epimerase (DPE) is a C3 epimerase of ᴅ-fructose, an enzyme that converts ᴅ-fructose into ᴅ-psicose. ᴅ-Psicose is a sweetener used in place of sugar; it is reported to be calorie-free because it is not absorbed by the body, and it is a rare sugar produced in nature only by DPE. Accordingly, the need for mass production of ᴅ-psicose through DPE has come to the fore, and interest in this field is high. To study the mechanism of action related to this sugar, our team predicted the tertiary structure of the Clostridium hylemonae DPE (chDPE) protein, whose tertiary structure has not yet been determined. Using HHsearch, we selected the DPE of Agrobacterium tumefaciens and two other structures as templates for homology modeling. Next, we used PROMALS3D to perform a multiple sequence alignment of the templates and chDPE, and predicted the tertiary structure on that basis. To validate the predicted structure, we used ProSA and Ramachandran plot analysis; in the Ramachandran plot, 94.8% of the protein's residues were located in favoured regions. In ProSA, the Z-score was -9.3, within the range observed for structures determined by X-ray crystallography or nuclear magnetic resonance. Furthermore, we attempted docking to determine the binding modes of ᴅ-psicose and ᴅ-fructose in the predicted structure. Through this study we were able to predict the structure of chDPE, which is expected to help in understanding the function of this protein.

A Study of Hangul Text Steganography based on Genetic Algorithm (유전 알고리즘 기반 한글 텍스트 스테가노그래피의 연구)

  • Ji, Seon-Su
    • Journal of Korea Society of Industrial Information Systems / v.21 no.3 / pp.7-12 / 2016
  • In a hostile Internet environment, steganography focuses on hiding a secret message inside a cover medium to increase security, complementing encryption. This paper presents a text steganography technique that uses Hangul text. To enhance the security level, secret messages are first encrypted with the crossover operator of a genetic algorithm and then embedded into a cover text to form the stego text without changing its noticeable properties and structure. The experiments show that, when the capacity in the cover medium was maintained at 3.69%, the size of the stego text increased by up to 14%.

Reading Korean Classical Narrative in Digital Era (디지털 시대의 고전 서사 읽기)

  • Seo, Yu-kyung
    • Journal of Korean Classical Literature and Education / no.16 / pp.91-116 / 2008
  • This study examines how we read Korean classical narrative in the digital era and in the culture that digital technology has created. Reading Korean classical narrative in digital culture can be divided into four categories: first, reading in the digital era the old narrative texts that were created and enjoyed long ago; second, reading old narrative texts not as books but through the media environment built by digital technology; third, reading media texts that digital technology has extended and modified from the original old narrative texts; and fourth, reading digitally produced texts that draw on the original narrative text as an archetype. The study then examines the characteristics of the digital era and, through examples, how each of the four types of Korean classical narrative is read. This allows us to consider how Korean classical narrative is enjoyed and how its digitally diversified forms can be read. The study concludes that we must reflect on the original texts of Korean classical narrative and hold on to their classicality, and that more Korean classical narrative texts should be researched.

Detecting Local Text Reuse in the Texts of East Asian Traditional Medicine (한의학 고문헌 텍스트에서의 인용문 추정과 탐색)

  • Oh, Junho
    • Journal of Korean Medical Classics / v.34 no.1 / pp.37-45 / 2021
  • Objectives: The purpose of this paper was to examine quantitative methods for estimating and detecting local text reuse in the texts of East Asian Traditional Medicine. Methods: We introduce techniques that estimate the volume of local text reuse with n-grams and techniques that detect the reuse directly with the Smith-Waterman algorithm (SW algorithm). Based on these, the estimation and detection of local text reuse were carried out for 『Donguibogam』 and 『Huangdineijing·Suwen』. Results: Estimates with n-grams had more errors than detection with the SW algorithm. The SW algorithm directly detected strings suspected of local text reuse, giving more accurate results. Conclusions: Although n-grams do not accurately find local text reuse, their high speed makes them preferable for certain purposes, such as screening similar documents. The SW algorithm, on the other hand, is relatively good at finding similar phrases suspected of being local text reuse even when the strings do not match completely. However, because it consumes excessive time and computing resources, its benefits are limited to cases where precise results are required.
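
The contrast between the two approaches can be sketched in a few lines of Python. This is a generic illustration, not the paper's implementation: the character-level n-grams, the overlap ratio, and the scoring values (match 2, mismatch -1, gap -1) are assumptions chosen for the example.

```python
def ngrams(text, n=2):
    """Set of character n-grams in text."""
    return {text[i:i + n] for i in range(len(text) - n + 1)}


def ngram_overlap(a, b, n=2):
    """Fast estimate of reuse: shared n-grams over the smaller n-gram set."""
    ga, gb = ngrams(a, n), ngrams(b, n)
    if not ga or not gb:
        return 0.0
    return len(ga & gb) / min(len(ga), len(gb))


def smith_waterman(a, b, match=2, mismatch=-1, gap=-1):
    """Best local alignment of a and b: (score, aligned_a, aligned_b)."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)
    # Fill the dynamic-programming table; scores never drop below zero.
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(0, diag, score[i - 1][j] + gap, score[i][j - 1] + gap)
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)

    # Trace back from the best cell until a zero score is reached.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and score[i][j] > 0:
        diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
        if score[i][j] == diag:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))
```

The n-gram estimate needs only set operations, while the alignment fills a table proportional to the product of the two text lengths, which matches the speed-versus-precision trade-off the abstract describes.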