• Title/Summary/Keyword: Text box

The Binarization of Text Regions in Natural Scene Images, based on Stroke Width Estimation (자연 영상에서 획 너비 추정 기반 텍스트 영역 이진화)

  • Zhang, Chengdong;Kim, Jung Hwan;Lee, Guee Sang
    • Smart Media Journal / v.1 no.4 / pp.27-34 / 2012
  • This paper presents a novel text binarization method that handles complex conditions such as shadows, non-uniform illumination caused by highlights or object projection, and cluttered backgrounds. To locate the target text region, a focus line is assumed to pass through it. Connected component analysis and stroke width estimation, both based on the location of the focus line, are then used to find the bounding box of the text region and the box of each connected component (CC). A series of classifications determines whether each CC is text or non-text. In addition, a modified K-means clustering method in the HCL color space is applied to reduce the color dimension. Finally, a binarization procedure based on the locations of the text components and seed color pixels generates the final result.
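
The abstract does not say how stroke width is estimated, so the following is only a minimal sketch of one common approach: take the most frequent run length of foreground pixels along the focus line. The function names, the 0/1 image encoding, and the choice of the mode as the estimator are all assumptions made for illustration, not the authors' method.

```python
from collections import Counter


def run_lengths(row, foreground=1):
    """Return the lengths of consecutive foreground runs in one pixel row."""
    runs, current = [], 0
    for pixel in row:
        if pixel == foreground:
            current += 1
        elif current:
            runs.append(current)
            current = 0
    if current:  # close a run that reaches the end of the row
        runs.append(current)
    return runs


def estimate_stroke_width(binary_image, focus_row):
    """Hypothetical estimator: the most common run length on the focus line."""
    runs = run_lengths(binary_image[focus_row])
    if not runs:
        return None  # the focus line crosses no foreground pixels
    return Counter(runs).most_common(1)[0][0]


# Toy 0/1 image; row 1 plays the role of the focus line.
image = [
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 1, 1, 0, 1, 1, 1],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
]
width = estimate_stroke_width(image, focus_row=1)
```

Connected components whose own run lengths differ strongly from this estimate could then be treated as non-text candidates, although the actual classification rules are not given in the abstract.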

Text Mining and Sentiment Analysis for Predicting Box Office Success

  • Kim, Yoosin;Kang, Mingon;Jeong, Seung Ryul
    • KSII Transactions on Internet and Information Systems (TIIS) / v.12 no.8 / pp.4090-4102 / 2018
  • With the emergence of online communication, text mining and sentiment analysis have frequently been applied to analyzing electronic word-of-mouth. This study aims to develop a domain-specific sentiment lexicon for predicting box office success in the Korean film market and to validate the feasibility of the lexicon. Natural language processing, a machine learning algorithm, and a lexicon-based sentiment classification method are employed. To create the movie-domain sentiment lexicon, 233,631 reviews of 147 movies with popularity ratings were collected using an XML crawling package in R. The Korean sentiment dictionary, which includes 706 negative words and 617 positive words, achieved 81.69% accuracy in sentiment classification. The results showed a stronger positive relationship between box office success and consumers' sentiment, as well as a significant positive effect in the linear regression of the prediction model. In addition, the results indicate that emotion in user-generated content can be a more accurate clue for predicting business success.
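
Lexicon-based sentiment classification, as named in the abstract, can be sketched roughly as follows. The tiny English word lists, the count-difference scoring rule, and the function names are placeholders assumed for illustration; the study's actual dictionary is Korean and far larger, and its scoring rule is not stated.

```python
def classify_sentiment(tokens, positive_words, negative_words):
    """Label a tokenized review by counting lexicon hits (assumed scoring rule)."""
    score = sum(t in positive_words for t in tokens) - sum(
        t in negative_words for t in tokens
    )
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def accuracy(predictions, labels):
    """Fraction of predictions that match the reference labels."""
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


# Hypothetical toy lexicon and reviews, not taken from the study.
positive = {"great", "moving", "fun"}
negative = {"boring", "weak"}
reviews = [
    ["great", "fun", "story"],
    ["boring", "weak", "plot"],
    ["great", "but", "boring"],
]
labels = ["positive", "negative", "negative"]

predictions = [classify_sentiment(r, positive, negative) for r in reviews]
score = accuracy(predictions, labels)
```

The third review shows a known limit of plain counting: one positive and one negative hit cancel out, so the review is labeled neutral and counted as an error.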

Normalized Term Frequency Weighting Method in Automatic Text Categorization (자동 문서분류에서의 정규화 용어빈도 가중치방법)

  • 김수진;박혁로
    • Proceedings of the IEEK Conference / 2003.11b / pp.255-258 / 2003
  • This paper defines a Normalized Term Frequency weighting method for automatic text categorization based on the Box-Cox transformation and applies it to automatic text categorization. The Box-Cox transformation is a statistical transformation that makes data approximately normally distributed. This paper applies it to propose a new term frequency weighting method. Because the Normalized Term Frequency transformation differs for each term, unlike existing term frequency weighting methods, it is more general than fixed weighting methods such as log or root. The validity of the method was demonstrated through experiments on 8,000 newspaper articles divided into 4 groups, which yielded high categorization accuracy in all cases.
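
The Box-Cox transformation mentioned above has a standard form: (x^λ - 1)/λ for λ ≠ 0 and ln x for λ = 0. The sketch below applies it to raw term frequencies. The per-term λ values are made up for illustration, since the abstract does not describe how the paper chooses them.

```python
import math


def box_cox(x, lam):
    """Standard Box-Cox transform; defined only for x > 0."""
    if x <= 0:
        raise ValueError("Box-Cox requires a positive input")
    if lam == 0:
        return math.log(x)  # limiting case as lambda approaches 0
    return (x ** lam - 1.0) / lam


# Hypothetical term frequencies and per-term lambdas (assumed values).
term_frequencies = {"market": 4, "goal": 1}
lambdas = {"market": 0.5, "goal": 0.0}

weights = {
    term: box_cox(tf, lambdas[term]) for term, tf in term_frequencies.items()
}
```

With λ = 0 the transform reduces to the log weighting and with λ = 0.5 to a shifted and scaled square root, which is consistent with the abstract's claim that the method generalizes fixed log or root weighting.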

Rotation-robust text localization technique using deep learning (딥러닝 기반의 회전에 강인한 텍스트 검출 기법)

  • Choi, In-Kyu;Kim, Jewoo;Song, Hyok;Yoo, Jisang
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2019.06a / pp.80-81 / 2019
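  • This paper proposes a technique for detecting arbitrarily oriented text in natural scene images. The basic framework for text detection is based on Faster R-CNN[1]. First, a Region Proposal Network (RPN) generates bounding boxes that contain text with different orientations. Next, for each bounding box generated by the RPN, feature maps pooled at three different sizes are extracted and merged. From the merged feature map, the text/non-text score, the axis-aligned bounding box coordinates, and the inclined bounding box coordinates are all predicted. Finally, Non-Maximum Suppression (NMS) is used to obtain the detection results. Training and testing were carried out on the COCO Text 2017 dataset[2], and subjective evaluation confirmed that the method yields regions rotated to fit inclined text.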
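
The final NMS step can be illustrated with a minimal greedy sketch. It handles only axis-aligned boxes given as (x1, y1, x2, y2); the paper also predicts inclined boxes, whose overlap computation is more involved and is not covered here. The 0.5 threshold and the function names are assumptions.

```python
def iou(a, b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: keep the best-scoring box, drop boxes that overlap it too much."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep  # indices of the surviving boxes, highest score first


# Toy detections: the first two boxes overlap heavily, the third is separate.
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores)
```

Here the second box overlaps the first with an IoU of 81/119, above the threshold, so only the first and third boxes survive.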

Design of CSS3 Polar-Coordinate Layout Module based on Fan Model (부채꼴 모델에 기반한 CSS3 극좌표계 서식 모듈의 설계)

  • Shim, Seung-Min;Lim, Soon-Bum
    • Journal of Korea Multimedia Society / v.22 no.2 / pp.299-310 / 2019
  • Most web documents are laid out in Cartesian coordinates, so vertical text arrangement has been studied extensively, while research on circular arrangement is still at an early stage. With the recent development of circular display devices, demand for circular text arrangement is increasing. We therefore propose an extended CSS3 specification for polar-coordinate layout that supports the circular placement of text. First, we define the fan model for text arrangement in polar coordinates, which corresponds to the box model in Cartesian coordinates. We then give new definitions of sentence direction, paragraph direction, and text orientation in polar coordinates. Based on these definitions, we developed an extended specification consisting of three parts: one for setting the fan model, one for setting directions, and one for setting typesetting properties. To verify the feasibility of the proposed specification in current web browsers, a preprocessor was developed and sample content was examined. To verify the efficiency of the proposed specification, we compared the code length of the sample content with that of an implementation using another JavaScript library, CssWarp.js.
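
The abstract does not list the properties of the proposed specification, so the sketch below only illustrates the underlying geometry: converting a polar position (radius, angle) to Cartesian coordinates and rotating each glyph so that its baseline is tangent to the circle. The function name, the fixed angular step per glyph, and the mathematical y-up convention are all assumptions.

```python
import math


def place_on_arc(text, radius, start_deg, step_deg, center=(0.0, 0.0)):
    """Return (char, x, y, rotation_deg) for glyphs spaced evenly along a circle.

    Angles grow counterclockwise with the y axis pointing up (assumed
    convention); screen coordinates with y pointing down would flip the
    sign of the y term.
    """
    placements = []
    for i, ch in enumerate(text):
        angle_deg = start_deg + i * step_deg
        theta = math.radians(angle_deg)
        x = center[0] + radius * math.cos(theta)
        y = center[1] + radius * math.sin(theta)
        # The tangent direction is perpendicular to the radius.
        placements.append((ch, x, y, angle_deg + 90))
    return placements


glyphs = place_on_arc("AB", radius=10, start_deg=0, step_deg=90)
```

A preprocessor of the kind the abstract mentions could emit one absolutely positioned, rotated element per glyph from such tuples, although how the authors' preprocessor actually works is not described.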

Content analysis on Elementary 『Science』 text-book in aspect of Information Literacy Education (정보활용교육 측면에서의 초등학교 『과학』 교과서 내용 분석)

  • 남태우;박현영
    • Proceedings of the Korean Society for Information Management Conference / 2003.08a / pp.333-340 / 2003
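  • This study aims to show that information literacy education can improve educational outcomes only when it is carried out across the entire curriculum as basic literacy education. To this end, it seeks to extract elements of information literacy education from the elementary school 『Science』 textbooks of the 7th National Curriculum.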
