• Title/Summary/Keyword: Document Image Analysis

Extracting curved text lines using the chain composition and the expanded grouping method (체인 정합과 확장된 그룹핑 방법을 사용한 곡선형 텍스트 라인 추출)

  • Bai, Nguyen Noi;Yoon, Jin-Seon;Song, Young-Jun;Kim, Nam;Kim, Yong-Gi
    • The KIPS Transactions:PartB
    • /
    • v.14B no.6
    • /
    • pp.453-460
    • /
    • 2007
  • In this paper, we present a method to extract text lines from poorly structured documents. The text lines may have different orientations, considerably curved shapes, and possibly a few wide inter-word gaps within a line. Such text lines can be found in posters, address blocks, and artistic documents. Our method is based on traditional perceptual grouping, but we develop novel solutions to overcome the problems of insufficient seed points and varied orientations within a single line. We assume that text lines consist of connected components, where each connected component is a set of black pixels belonging to one letter or several touching letters. In our scheme, connected components closer than an iteratively incremented threshold are grouped together into a chain. Elongated chains are identified as the seed chains of lines. The seed chains are then extended to the left and to the right according to the local orientations, which are re-evaluated at each side of a chain as it is extended. By this process, all text lines are finally constructed. In our experiments, the proposed method extracts considerably curved text lines from logos and slogans well, achieving 98% accuracy for straight-line extraction and 94% for curved-line extraction.
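
A minimal Python sketch of the chain-grouping step described above, under simplifying assumptions: components are given as centroid points (a real system would obtain them from connected-component labeling of a binarized image), and the threshold schedule and minimum chain length are illustrative values, not the paper's parameters.

```python
# Sketch of chain grouping with an iteratively incremented distance threshold.
from itertools import combinations
import math

def build_chains(centroids, threshold):
    """Group components whose centroids are closer than `threshold` into chains (union-find)."""
    parent = list(range(len(centroids)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(centroids)), 2):
        if math.dist(centroids[i], centroids[j]) < threshold:
            parent[find(i)] = find(j)

    chains = {}
    for i in range(len(centroids)):
        chains.setdefault(find(i), []).append(centroids[i])
    return list(chains.values())

def seed_chains(centroids, start=10.0, step=5.0, max_threshold=40.0, min_len=4):
    """Grow the threshold until elongated chains appear; these become the seed chains."""
    t = start
    while t <= max_threshold:
        seeds = [c for c in build_chains(centroids, t) if len(c) >= min_len]
        if seeds:
            return seeds, t
        t += step
    return [], t

if __name__ == "__main__":
    # toy components roughly following a curved line, plus two isolated marks
    pts = [(10, 50), (22, 47), (34, 45), (46, 44), (58, 45), (70, 48), (200, 200), (5, 200)]
    seeds, used_t = seed_chains(pts)
    print(f"threshold={used_t}, seed chains={seeds}")
```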

A Study on Natural Language Document and Query Processor for Information Retrieval in Digital Library (디지털 도서관 환경에서의 정보 검색을 위한 자연어 문서 및 질의 처리기에 관한 연구)

  • 윤성희
    • Journal of the Korea Computer Industry Society
    • /
    • v.2 no.12
    • /
    • pp.1601-1608
    • /
    • 2001
  • A digital library is an important database system that needs an information retrieval engine for natural language documents and multimedia data. This paper describes experimental results for an information retrieval engine and browser based on natural language processing, covering lexical analysis, syntax processing, stemming, and keyword indexing for natural language text. The text-based search engine was tested on the experimental database 'Earth and Space Science', which contains many images and titles together with their descriptive text in natural language. Combined with a content-based image search engine, it is expected to form a multimedia information retrieval system for a digital library.
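
A rough sketch of the keyword-indexing stage (tokenization, stemming, inverted index) that such an engine needs; the paper's engine handles Korean morphology and syntax, whereas this toy uses a trivial English suffix stripper purely for illustration.

```python
# Toy inverted index with naive stemming; stands in for the paper's Korean NLP pipeline.
import re
from collections import defaultdict

def stem(token):
    """Very naive suffix stripper used only for illustration."""
    for suffix in ("ing", "ed", "es", "s"):
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[: -len(suffix)]
    return token

def build_index(documents):
    """Map each stemmed keyword to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for token in re.findall(r"[a-z]+", text.lower()):
            index[stem(token)].add(doc_id)
    return index

def search(index, query):
    """Return ids of documents containing every stemmed query keyword."""
    terms = [stem(t) for t in re.findall(r"[a-z]+", query.lower())]
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for t in terms[1:]:
        result &= index.get(t, set())
    return result

if __name__ == "__main__":
    docs = {1: "Images of planets orbiting distant stars", 2: "Descriptive text about earth science"}
    idx = build_index(docs)
    print(search(idx, "orbiting planets"))   # {1}
```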


A Secure Face Cryptography for Identity Document Based on Distance Measures

  • Arshad, Nasim;Moon, Kwang-Seok;Kim, Jong-Nam
    • Journal of Korea Multimedia Society
    • /
    • v.16 no.10
    • /
    • pp.1156-1162
    • /
    • 2013
  • Face verification has been widely studied during the past two decades. One of the challenges is the rising concern about the security and privacy of the template database. In this paper, we propose a secure face verification system which generates a unique, secure cryptographic key from a face template. The face images are processed to produce face templates or codes to be utilized for the encryption and decryption tasks. The resulting identity data are encrypted using the Advanced Encryption Standard (AES). Two distance metrics, the Hamming distance and the Euclidean distance, are used for the template matching identification process; template matching is a standard process in pattern recognition. The proposed system is tested on the ORL, Yale, and PKNU face databases, which contain 360, 135, and 54 training images respectively. We employ Principal Component Analysis (PCA) to determine the most discriminating features among face images. The experimental results show that the proposed distance measure is one of the most promising measures with respect to different characteristics of biometric systems. Using the proposed method, fewer images need to be extracted to achieve 100% cumulative recognition than with any other tested distance measure.
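
A small sketch of the template-matching step with the two distance measures named in the abstract; the NumPy vectors stand in for PCA-projected face templates, and the binarization threshold used to derive a binary code for the Hamming distance is an assumption, not the paper's setting.

```python
# Template matching with Euclidean and Hamming distances over feature vectors.
import numpy as np

def euclidean_distance(a, b):
    return float(np.linalg.norm(a - b))

def hamming_distance(a, b, threshold=0.5):
    """Binarize the feature vectors at `threshold`, then count differing bits."""
    return int(np.count_nonzero((a > threshold) != (b > threshold)))

def match(probe, gallery, metric):
    """Return the gallery id whose template is closest to the probe under `metric`."""
    return min(gallery, key=lambda gid: metric(probe, gallery[gid]))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    gallery = {f"id_{i}": rng.random(32) for i in range(3)}      # stand-in PCA templates
    probe = gallery["id_1"] + rng.normal(0, 0.05, 32)            # noisy capture of id_1
    print(match(probe, gallery, euclidean_distance))
    print(match(probe, gallery, hamming_distance))
```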

Improved Edge Detection Algorithm Using Ant Colony System (개미 군락 시스템을 이용한 개선된 에지 검색 알고리즘)

  • Kim In-Kyeom;Yun Min-Young
    • The KIPS Transactions:PartB
    • /
    • v.13B no.3 s.106
    • /
    • pp.315-322
    • /
    • 2006
  • The Ant Colony System (ACS) is easily applicable to the traveling salesman problem (TSP) and has demonstrated good performance on it. Recently, ACS has emerged as a useful tool for pattern recognition, feature extraction, and edge detection. Edge detection is widely utilized in document analysis, character recognition, and face recognition. However, conventional operator-based edge detection approaches require additional post-processing steps for these applications. In the present study, to overcome this shortcoming, we propose a new ACS-based edge detection algorithm. The experimental results indicate that the proposed algorithm performs excellently in terms of robustness and flexibility.
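
A compact, illustrative sketch of ACS-style edge detection: ants walk over the image, preferring pixels with high local intensity variation, and deposit pheromone there; thresholding the pheromone map yields candidate edges. The parameter values are placeholders, not those of the paper.

```python
# Pheromone-based edge detection sketch (ACS flavor) on a grayscale image.
import numpy as np

def local_variation(img):
    """Gradient-magnitude proxy used as the heuristic information eta."""
    gy, gx = np.gradient(img.astype(float))
    v = np.hypot(gx, gy)
    return v / (v.max() + 1e-9)

def acs_edges(img, n_ants=50, n_steps=200, alpha=1.0, beta=2.0, rho=0.05, seed=0):
    rng = np.random.default_rng(seed)
    h, w = img.shape
    eta = local_variation(img)                   # heuristic desirability
    tau = np.full((h, w), 1e-4)                  # pheromone field
    ants = np.column_stack([rng.integers(0, h, n_ants), rng.integers(0, w, n_ants)])
    moves = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]
    for _ in range(n_steps):
        for k in range(n_ants):
            r, c = ants[k]
            cand = [(r + dr, c + dc) for dr, dc in moves
                    if 0 <= r + dr < h and 0 <= c + dc < w]
            weights = np.array([tau[p] ** alpha * eta[p] ** beta + 1e-12 for p in cand])
            nr, nc = cand[rng.choice(len(cand), p=weights / weights.sum())]
            tau[nr, nc] += eta[nr, nc]           # local pheromone deposit
            ants[k] = (nr, nc)
        tau *= (1.0 - rho)                       # evaporation
    return tau > tau.mean() + tau.std()          # crude edge threshold

if __name__ == "__main__":
    img = np.zeros((32, 32)); img[:, 16:] = 255  # vertical step edge
    print(acs_edges(img).sum(), "pixels marked as edges")
```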

Deep Learning Research Trends Analysis with Ego Centered Topic Citation Analysis (자아 중심 주제 인용분석을 활용한 딥러닝 연구동향 분석)

  • Lee, Jae Yun
    • Journal of the Korean Society for Information Management
    • /
    • v.34 no.4
    • /
    • pp.7-32
    • /
    • 2017
  • Recently, deep learning has been spreading rapidly as an innovative machine learning technique across various domains. This study explores the research trends of deep learning via a modified ego-centered topic citation analysis. A few seed documents were selected from the documents retrieved from Web of Science with the keyword 'deep learning', and related documents were obtained through citation relations. Papers citing the seed documents were set as ego documents, reflecting current research in the field of deep learning. Earlier studies cited frequently in the ego documents were set as citation identity documents, which represent the specific themes in the field. For the ego documents, which are the result of current research activities, quantitative analyses including co-authorship network analysis were performed to identify major countries and research institutes. For the citation identity documents, co-citation analysis was conducted, and key literature and key research themes were identified by investigating the citation image keywords, i.e., the major keywords of the documents citing each citation identity document cluster. Finally, we proposed and measured a citation growth index that reflects the growth trend of citation influence on a specific topic, and showed the changes in the leading research themes in the field of deep learning.
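
A toy illustration of the co-citation counting that underlies the co-citation analysis mentioned above; the reference lists are invented examples, not data from the study.

```python
# Count how often pairs of cited documents appear together in ego documents' reference lists.
from itertools import combinations
from collections import Counter

def cocitation_counts(reference_lists):
    counts = Counter()
    for refs in reference_lists:
        for a, b in combinations(sorted(set(refs)), 2):
            counts[(a, b)] += 1
    return counts

if __name__ == "__main__":
    egos = [["LeCun2015", "Hinton2006", "Krizhevsky2012"],
            ["LeCun2015", "Krizhevsky2012"],
            ["Hinton2006", "LeCun2015"]]
    for pair, n in cocitation_counts(egos).most_common(3):
        print(pair, n)
```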

Reliability Verification of Evidence Analysis Tools for Digital Forensics (디지털 포렌식을 위한 증거 분석 도구의 신뢰성 검증)

  • Lee, Tae-Rim;Shin, Sang-Uk
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.21 no.3
    • /
    • pp.165-176
    • /
    • 2011
  • In this paper, we examine the reliability verification procedure for evidence analysis tools used in computer forensics and test well-known tools against their functional requirements using the verification items proposed by the standard document TIAK.KO-12.0112. We also carry out a performance evaluation based on the test results and suggest ways to improve the performance of evidence analysis tools. To achieve this, we first investigate the functions that the test subjects can perform, then set up a specific test plan and create evidence image files that contain the contents of the verification items, and finally verify and analyze the test results. In this process, we discover weaknesses in most of the analysis tools, such as the restoration of deleted and fragmented files, the identification of file formats widely used in Korea, and the processing of strings composed of Korean characters.
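
One way such a verification step could look in practice, as a hedged sketch: assuming a ground-truth manifest of files planted in the test evidence image, the files a tool reports as recovered are hashed and compared against the manifest. The file names and paths are hypothetical.

```python
# Score a forensic tool's output against a known manifest of planted evidence files.
import hashlib
from pathlib import Path

def sha256_of(path):
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()

def score_tool(manifest, recovered_dir):
    """manifest: {filename: expected sha256}; recovered_dir: folder of files the tool recovered."""
    recovered, missing, corrupted = [], [], []
    for name, expected in manifest.items():
        candidate = Path(recovered_dir) / name
        if not candidate.exists():
            missing.append(name)
        elif sha256_of(candidate) != expected:
            corrupted.append(name)
        else:
            recovered.append(name)
    return {"recovered": recovered, "missing": missing, "corrupted": corrupted}

# Hypothetical usage:
# report = score_tool({"deleted_doc.hwp": "ab12..."}, "tool_output/")
# print(report)
```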

A Study on Data Management Systems for Spatial Assessments of Road Visibilities at Night (야간도로 시인성에 대한 공간적 평가를 위한 자료관리체계 연구)

  • Woo, Hee Sook;Kwon, Kwang Seok;Kim, Byung Guk;Yoon, Chun Joo;Kim, Young Rok
    • Journal of Korean Society for Geospatial Information Science
    • /
    • v.22 no.4
    • /
    • pp.107-115
    • /
    • 2014
  • Road visibility influences safe driving because it determines how well drivers can recognize obstacles on the road. In this paper, we propose a mobile data acquisition and processing system for evaluating road visibility at night. Mobile images are converted efficiently and archived for spatial analysis of night-time road visibility. The following techniques are applied in the system: low-power computing units, an open image processing library, GPU-based acceleration, and document database techniques. RGB images are converted to the YUV color system so that the brightness component can be integrated with the spatial information. High-performance Android devices were used to collect brightness data on roads, and the prototype was evaluated to confirm whether such an acquisition and management system can determine the spatial distribution of brightness for spatial assessments of road visibility at night.
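
A short sketch of the RGB-to-YUV conversion mentioned above, using the standard BT.601 luma coefficients; the Y (brightness) channel is what gets paired with position data. The GPS pairing at the end is a hypothetical example of the archiving step.

```python
# RGB -> YUV conversion; Y is the brightness component used for road-visibility assessment.
import numpy as np

def rgb_to_yuv(rgb):
    """rgb: (..., 3) array with values in [0, 255]; returns Y, U, V arrays (BT.601)."""
    rgb = rgb.astype(float)
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    y = 0.299 * r + 0.587 * g + 0.114 * b
    u = 0.492 * (b - y)
    v = 0.877 * (r - y)
    return y, u, v

if __name__ == "__main__":
    frame = np.random.default_rng(0).integers(0, 256, (4, 4, 3))
    y, _, _ = rgb_to_yuv(frame)
    # pair the mean brightness with a (hypothetical) GPS position for archiving
    record = {"lat": 37.5665, "lon": 126.9780, "mean_luma": float(y.mean())}
    print(record)
```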

Study on the Evaluation Factors of Seafood Purchase for School Food Service (학교급식 수산물구매에 영향을 미치는 제품평가요인)

  • Jang, Young-Soo;Park, Jeong-A
    • The Journal of Fisheries Business Administration
    • /
    • v.40 no.2
    • /
    • pp.1-25
    • /
    • 2009
  • The major part of non-commercial food service is school food service, which has no objective quality standards. Each school applies different standards when buying seafood for school food service (SFS). The purpose of this research is to provide a more objective basis by examining whether the extrinsic cues of seafood, such as price, source origin, company image, and safety standards, or the intrinsic cues, such as fishy smell and firmness of the flesh, have any effect on seafood evaluation when school nutritionists purchase it. A questionnaire survey was distributed by e-mail or through direct school visits from October 30 to November 9, 2007, to 70 elementary school food service nutritionists in Busan; a total of 50 questionnaires were used as data in the statistical analysis with the SPSS package. The results are as follows. First, there is an interaction effect between the extrinsic and intrinsic cues of seafood for SFS: when school nutritionists value intrinsic cues such as fishy smell and firmness of the flesh, this also influences extrinsic cues such as price, source origin, a reliable distribution process, and HACCP application. Second, the extrinsic cues have no effect on perceived quality, since seafood for SFS is bought in bulk under pre-arranged contracts and most of it is pre-treated frozen seafood. Third, the intrinsic cues likewise have no effect on perceived quality. The extrinsic cues consist of five factors, namely 'openness about quality', 'source origin', 'company image', 'safety/standards', and 'price/package'; however, 'safety/standards' was the only factor affecting perceived quality, because in practice there are no standards or documents proving the quality of the seafood other than the safety standards. Last, perceived quality is an important factor for perceived value and purchase intention: there is a path through which willingness to buy is formed via perceived value once the school nutritionist recognizes perceived quality.


A Study on Text Pattern Analysis Applying Discrete Fourier Transform - Focusing on Sentence Plagiarism Detection - (이산 푸리에 변환을 적용한 텍스트 패턴 분석에 관한 연구 - 표절 문장 탐색 중심으로 -)

  • Lee, Jung-Song;Park, Soon-Cheol
    • Journal of Korea Society of Industrial Information Systems
    • /
    • v.22 no.2
    • /
    • pp.43-52
    • /
    • 2017
  • Pattern analysis is one of the most important techniques in the signal processing, image processing, and text mining fields. The Discrete Fourier Transform (DFT) is generally used to analyze the patterns of signals and images, and we reasoned that the DFT could also be used to analyze text patterns. In this paper, the DFT is adapted, for the first time, to sentence plagiarism detection, which detects whether text patterns of one document exist in other documents. We turn texts into signals by converting them to ASCII codes and apply the cross-correlation method to detect simple text plagiarism such as cut-and-paste and term relocation. WordNet is used to find similarities in order to detect plagiarism that relies on synonyms, translations, summarization, and so on. The 2013 corpus provided by PAN, one of the well-known workshops on text plagiarism, is used in our experiments. Our method ranks fourth among the eleven most outstanding plagiarism detection methods.
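
A sketch of the core signal idea: sentences become ASCII-code signals, and FFT-based cross-correlation locates where a fragment aligns best inside a longer document. This covers only the simple cut-and-paste case, and the mean-removal step is an illustrative choice rather than the paper's exact procedure.

```python
# Locate a text fragment inside a document via FFT-based cross-correlation of ASCII signals.
import numpy as np

def to_signal(text):
    """Turn a string into a 1-D signal of character code points."""
    return np.array([ord(ch) for ch in text], dtype=float)

def best_alignment(fragment, document):
    """Return (offset, score): where `fragment` correlates most strongly inside `document`."""
    a = to_signal(fragment); a -= a.mean()
    b = to_signal(document); b -= b.mean()
    n = len(a) + len(b) - 1
    corr = np.fft.irfft(np.fft.rfft(b, n) * np.conj(np.fft.rfft(a, n)), n)
    valid = corr[: len(b) - len(a) + 1]      # alignments fully inside the document
    return int(np.argmax(valid)), float(valid.max())

if __name__ == "__main__":
    doc = "the quick brown fox jumps over the lazy dog"
    frag = "brown fox jumps"
    offset, score = best_alignment(frag, doc)
    print(offset, repr(doc[offset:offset + len(frag)]))   # should recover the copied span
```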

Adaptive Data Mining Model using Fuzzy Performance Measures (퍼지 성능 측정자를 이용한 적응 데이터 마이닝 모델)

  • Rhee, Hyun-Sook
    • The KIPS Transactions:PartB
    • /
    • v.13B no.5 s.108
    • /
    • pp.541-546
    • /
    • 2006
  • Data mining is the process of finding hidden patterns inside a large data set. Cluster analysis has been a popular technique for data mining; it is a fundamental process of data analysis and has played an important role in solving many problems in pattern recognition and image processing. If fuzzy cluster analysis is to make a significant contribution to engineering applications, much more attention must be paid to the fundamental decision on the number of clusters in the data. This is related to the cluster validity problem, that is, how well the clustering has identified the structure present in the data. In this paper, we design an adaptive data mining model using fuzzy performance measures. It discovers clusters through an unsupervised neural network model based on a fuzzy objective function and evaluates the clustering results with a fuzzy performance measure. We also present experimental results on newsgroup data, which show that the proposed model can be used as a document classifier.
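
A sketch pairing a fuzzy objective function with a fuzzy performance measure: classic fuzzy c-means stands in here for the paper's unsupervised neural network model, and Bezdek's partition coefficient serves as one example of a fuzzy validity measure for choosing the number of clusters.

```python
# Fuzzy c-means clustering plus a partition-coefficient validity measure.
import numpy as np

def fuzzy_c_means(X, c, m=2.0, n_iter=100, seed=0):
    rng = np.random.default_rng(seed)
    U = rng.random((len(X), c))
    U /= U.sum(axis=1, keepdims=True)                       # fuzzy memberships
    for _ in range(n_iter):
        W = U ** m
        centers = (W.T @ X) / W.sum(axis=0)[:, None]        # weighted cluster centers
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-12
        U = 1.0 / (d ** (2 / (m - 1)))                      # membership update
        U /= U.sum(axis=1, keepdims=True)
    return U, centers

def partition_coefficient(U):
    """Closer to 1 means crisper, better-separated clusters."""
    return float((U ** 2).sum() / len(U))

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = np.vstack([rng.normal(0, 0.3, (30, 2)), rng.normal(3, 0.3, (30, 2))])
    # pick the cluster count with the best validity score
    for c in (2, 3, 4):
        U, _ = fuzzy_c_means(X, c)
        print(c, round(partition_coefficient(U), 3))
```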