• Title/Summary/Keyword: Novel Text


A Novel Text Sample Selection Model for Scene Text Detection via Bootstrap Learning

  • Kong, Jun;Sun, Jinhua;Jiang, Min;Hou, Jian
    • KSII Transactions on Internet and Information Systems (TIIS) / v.13 no.2 / pp.771-789 / 2019
  • Text detection has been a popular research topic in computer vision, yet prevalent text detection algorithms find it difficult to avoid dependence on datasets. To overcome this problem, we propose a novel unsupervised text detection algorithm inspired by bootstrap learning. First, text candidates in a novel superpixel form are generated by image segmentation to improve the text recall rate. Second, we propose a unique text sample selection model (TSSM) that extracts text samples from the current image and eliminates database dependency. Specifically, to improve sample precision, we combine maximally stable extremal regions (MSERs) with the saliency map to generate sample reference maps using a double-threshold scheme. Finally, a multiple kernel boosting method generates a strong text classifier by combining multiple single-kernel SVMs trained on the samples selected by TSSM. Experimental results demonstrate that our text detection method is robust to complex backgrounds and multilingual text and performs stably across different standard datasets.
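The double-threshold idea in the abstract above can be illustrated with a small sketch. The abstract does not say how the MSER and saliency maps are combined or how the thresholds are applied, so `combine_maps`, `label_samples`, the masking rule, and the threshold values are all assumptions made for illustration, not the authors' TSSM.

```python
def combine_maps(mser_mask, saliency):
    """Hypothetical combination: keep saliency only inside MSER regions."""
    return [
        [s if m else 0.0 for m, s in zip(mask_row, sal_row)]
        for mask_row, sal_row in zip(mser_mask, saliency)
    ]


def label_samples(reference_map, low, high):
    """Label each cell of a 2-D score map with a double threshold.

    Scores >= high become positive (text) samples, scores <= low become
    negative (background) samples, and scores in between stay unlabeled (0).
    The rule and the thresholds are assumptions, not taken from the paper.
    """
    if low >= high:
        raise ValueError("low must be smaller than high")
    return [
        [1 if score >= high else (-1 if score <= low else 0) for score in row]
        for row in reference_map
    ]


# Toy 2x3 inputs: a binary MSER mask and a saliency map in [0, 1].
mser_mask = [[1, 1, 0], [0, 1, 0]]
saliency = [[0.9, 0.5, 0.8], [0.1, 0.7, 0.05]]
reference = combine_maps(mser_mask, saliency)
labels = label_samples(reference, low=0.2, high=0.6)
print(labels)
```

Leaving the middle band unlabeled keeps ambiguous pixels out of both sample sets, which is one plausible way a double threshold could trade recall for sample precision.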

A Novel Statistical Feature Selection Approach for Text Categorization

  • Fattah, Mohamed Abdel
    • Journal of Information Processing Systems / v.13 no.5 / pp.1397-1409 / 2017
  • In text categorization, selecting distinctive text features is important because the feature space is high-dimensional. Reducing the feature space dimension decreases processing time and increases accuracy. In this study, we introduce a novel statistical feature selection approach for text categorization. The approach measures the term distribution across all documents in the collection, the term distribution within a given category, and the term distribution in a given class relative to other classes. The results show that the proposed method outperforms traditional feature selection methods.
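As a rough illustration of scoring terms by their distribution across categories, the sketch below ranks terms by how much more often they occur inside one category than outside it. The scoring rule and the names `term_scores` and `select_features` are assumptions chosen for simplicity; the abstract does not give the paper's actual statistic.

```python
from collections import defaultdict


def term_scores(docs):
    """Score terms by how unevenly they are spread across categories.

    docs is a list of (category, tokens) pairs. For each term and category,
    the score is the fraction of in-category documents containing the term
    minus the fraction of out-of-category documents containing it. A term
    keeps its best score over all categories. Illustrative stand-in only.
    """
    cat_sizes = defaultdict(int)
    doc_freq = defaultdict(lambda: defaultdict(int))  # term -> category -> count
    for category, tokens in docs:
        cat_sizes[category] += 1
        for term in set(tokens):
            doc_freq[term][category] += 1

    total_docs = len(docs)
    scores = {}
    for term, per_cat in doc_freq.items():
        term_total = sum(per_cat.values())
        best = float("-inf")
        for category, size in cat_sizes.items():
            inside = per_cat.get(category, 0) / size
            rest = total_docs - size
            outside = (term_total - per_cat.get(category, 0)) / rest if rest else 0.0
            best = max(best, inside - outside)
        scores[term] = best
    return scores


def select_features(docs, k):
    """Return the k highest-scoring terms (ties broken alphabetically)."""
    scores = term_scores(docs)
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]


docs = [
    ("sports", ["match", "goal", "team"]),
    ("sports", ["team", "win", "goal"]),
    ("finance", ["stock", "market", "team"]),
    ("finance", ["market", "price", "stock"]),
]
print(select_features(docs, 3))  # ['goal', 'market', 'stock']
```

In this toy corpus, "team" appears in both categories and therefore scores lower than terms confined to a single category.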

A Recognition Method for Korean Spatial Background in Historical Novels (한국어 역사 소설에서 공간적 배경 인식 기법)

  • Kim, Seo-Hee;Kim, Seung-Hoon
    • Journal of Information Technology Services / v.15 no.1 / pp.245-253 / 2016
  • The background of a novel is one of its most important elements, along with characters and events, and refers to the time, place, and situation in which the characters appear. The spatial background in particular helps convey the topic of a novel, so it can help readers choose a novel they want to read. In this paper, we target Korean historical novels. In English text, spatial background is easy to recognize because of capitalization and because of words that accompany spatial information, such as Bank, University, and City. In Korean text, however, spatial background is difficult to recognize because letter usage provides few such cues. Previous studies used machine learning or dictionaries and rules to recognize spatial information in text such as news and text messages. In this paper, we build nation dictionaries that draw on sources such as 'Korean history' and 'Google Maps.' We also propose a method for recognizing spatial background based on postposition patterns in Korean sentences, in contrast to previous work. We examine how postpositions are used with spatial backgrounds, a characteristic of Korean. To raise recognition accuracy, we further propose a method based on the results of morpheme analysis and on frequency within the novel text. The recognized spatial background can help readers grasp the atmosphere of a novel and understand the events of the scenes in which the characters appear.

This study revises Lee Hyo-seok's The Buckwheat Season, utilizing Novel Corpus, intermediate learners' level (소설텍스트의 난이도 조정 방안 연구 -이효석의 「메밀꽃 필 무렵」을 중심으로-)

  • Hwang, Hye ran
    • Journal of Korean language education / v.29 no.4 / pp.255-294 / 2018
  • The Buckwheat Season, regarded as the best of Lee Hyo-seok's works, is one of the short stories that represent Korean literature. However, its vivid literary expressions, such as lyrical and beautiful depictions, figurative expressions, and dialects, which show Korean beauty, instead cause learners difficulty and contribute to failures in reading comprehension. Thus, it is necessary to revise the text and present it in a form suited to the learners' language level. Methods of revising a literary text include revising linguistic elements, such as cryptic vocabulary or sentence structure, and revising the composition of the text, e.g. presenting characters or plot, or inserting illustrations. Methods of revising the language of the text can be divided into simplification and detailing. However, in the process of revision, much depends on the adapter's subjective perception rather than on objective criteria. This paper revised the text, utilizing by the Academy of Korean Studies, , and the by the National Institute of Korean Language to secure objectivity in revising the text.

A Convergent Study on the Narration of Novel through Text-mining (소설 내러티브의 변화: 텍스트마이닝 기반 장르별 내러티브 분석)

  • Park, Jungsik;Park, Mi Sun
    • English & American cultural studies / v.17 no.1 / pp.81-106 / 2017
  • Using recently emerging quantitative methods, this article provides a comparative study of diachronic changes in the narration of novels, history, and science from the early 18th century to the 20th century. To trace narrative changes across genres, the article discusses how text-mining methodology can be introduced into literary studies. As a pilot study, we compared the traces of narrative in three genres (novel, history, and science) using three major grammatical elements of narrative: pronouns, subordinating conjunctions, and action verbs in the past tense. The data-mining results show that the use of pronouns and action verbs increased in the novel toward the 20th century, while history and science developed less story-like writing styles.
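A minimal sketch of the kind of counting such a comparison rests on is shown below: the rate of pronouns and subordinating conjunctions per 1,000 tokens. The word lists, the tokenizer, and the function name `rate_per_thousand` are illustrative assumptions, not the authors' pipeline. Past-tense action verbs are left out because identifying them reliably needs a part-of-speech tagger.

```python
import re

# Small illustrative word lists (assumptions, far from exhaustive).
PRONOUNS = {"i", "he", "she", "they", "we", "you", "it", "him", "her", "them"}
SUBORDINATORS = {"because", "although", "while", "when", "if", "since", "unless"}


def rate_per_thousand(text, vocabulary):
    """Occurrences of vocabulary words per 1,000 tokens of text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    hits = sum(1 for token in tokens if token in vocabulary)
    return 1000.0 * hits / len(tokens)


novel = "She smiled when he came back because they had waited"
science = "The sample was heated and the mass was recorded again"

print(rate_per_thousand(novel, PRONOUNS))       # 300.0
print(rate_per_thousand(novel, SUBORDINATORS))  # 200.0
print(rate_per_thousand(science, PRONOUNS))     # 0.0
```

Normalizing by token count makes texts of different lengths, genres, and periods comparable on the same scale.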

A study on the aesthetic elements of Chinese translated Korean novel - Focused on the mode of narrations in "An old well" written by Jeong Heui Oh (우리말 소설의 중국어 번역에서 미적요소의 재현문제(2) - '화법'에서 본 오정희의 『옛 우물』(『老井』))

  • Choi, Eun Jeong
    • Cross-Cultural Studies / v.26 / pp.201-226 / 2012
  • This essay examines the issues of aesthetic elements that arise when Korean novels are translated into Chinese. The Korean and Chinese versions of the short story collection "An old well" by Jeong Heui Oh are compared and analyzed with a focus on narrative modes. "An old well" uses various narrative modes, each of which contributes to aesthetic effect and to the construction of meaning. In short, the leitmotif the writer intends can be grasped only by carefully interpreting the narrative modes of the original text. In the Chinese translation, however, the narrative modes of the Korean text have been simplified into direct narration. As a result, the aesthetic effects of the original are lost, and the Chinese text fails to deliver the meaning carried by the original narrative modes. Because of the text's uniqueness, translating a novel requires attention to both form and content. Accordingly, a close examination of the original text should be completed beforehand.

A Novel Video Image Text Detection Method

  • Zhou, Lin;Ping, Xijian;Gao, Haolin;Xu, Sen
    • KSII Transactions on Internet and Information Systems (TIIS) / v.6 no.3 / pp.941-953 / 2012
  • A novel and universal method for detecting text in video images is proposed, following a coarse-to-fine strategy. First, the spectral clustering (SC) method is adopted to coarsely detect text regions based on the stationary wavelet transform (SWT). To make full use of the available information, the SC method employs a multi-parameter kernel function that combines feature similarity information and spatial adjacency information. Second, 28-dimensional classification features are proposed, and a support vector machine (SVM) is used to separate text regions from non-text regions. Experimental results on video images show encouraging performance for the proposed algorithm and classification features.

A Novel Video Image Text Detection Method

  • Zhou, Lin;Ping, Xijian;Gao, Haolin;Xu, Sen
    • KSII Transactions on Internet and Information Systems (TIIS) / v.6 no.4 / pp.1140-1152 / 2012
  • A novel and universal method for detecting text in video images is proposed, following a coarse-to-fine strategy. First, the spectral clustering (SC) method is adopted to coarsely detect text regions based on the stationary wavelet transform (SWT). To make full use of the available information, the SC method employs a multi-parameter kernel function that combines feature similarity information and spatial adjacency information. Second, 28-dimensional classification features are proposed, and a support vector machine (SVM) is used to separate text regions from non-text regions. Experimental results on video images show encouraging performance for the proposed algorithm and classification features.

Novel Optimizer AdamW+ implementation in LSTM Model for DGA Detection

  • Awais Javed;Adnan Rashdi;Imran Rashid;Faisal Amir
    • International Journal of Computer Science & Network Security / v.23 no.11 / pp.133-141 / 2023
  • This work takes a deeper look at Adaptive Moment Estimation (Adam) and Adam with Weight Decay (AdamW) in a real-world text classification problem (DGA malware detection). AdamW was introduced as an improved optimizer that decouples weight decay from L2 regularization. This work introduces AdamW+, a novel AdamW variant that further simplifies the weight decay implementation in AdamW. LSTM models for DGA malware detection trained with Adam, AdamW, and AdamW+ are evaluated on various DGA families/groups as a multiclass text classification task. The proposed AdamW+ optimizer shows improvement in all standard performance metrics over Adam and AdamW. Analysis of the outcomes shows that the novel optimizer outperforms both Adam and AdamW on text-classification-based problems.
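The difference between Adam with L2 regularization and AdamW's decoupled weight decay can be sketched for a single scalar parameter as follows. The function `adam_step`, its defaults, and the toy values are illustrative assumptions. The AdamW+ simplification is not specified in the abstract, so it is not reproduced here.

```python
import math


def new_state():
    """Fresh optimizer state: step count and first/second moment estimates."""
    return {"t": 0, "m": 0.0, "v": 0.0}


def adam_step(param, grad, state, lr=1e-3, beta1=0.9, beta2=0.999,
              eps=1e-8, weight_decay=0.0, decoupled=True):
    """One scalar Adam/AdamW update.

    decoupled=False adds weight_decay * param to the gradient (Adam with L2
    regularization), so the decay is rescaled by the adaptive step size.
    decoupled=True shrinks the parameter directly, outside the adaptive
    update (AdamW).
    """
    if not decoupled:
        grad = grad + weight_decay * param
    state["t"] += 1
    state["m"] = beta1 * state["m"] + (1 - beta1) * grad
    state["v"] = beta2 * state["v"] + (1 - beta2) * grad * grad
    m_hat = state["m"] / (1 - beta1 ** state["t"])  # bias-corrected mean
    v_hat = state["v"] / (1 - beta2 ** state["t"])  # bias-corrected variance
    update = lr * m_hat / (math.sqrt(v_hat) + eps)
    if decoupled:
        param = param - lr * weight_decay * param
    return param - update


# Same parameter, gradient, and decay; only the decay placement differs.
w_l2 = adam_step(1.0, 0.5, new_state(), weight_decay=0.1, decoupled=False)
w_adamw = adam_step(1.0, 0.5, new_state(), weight_decay=0.1, decoupled=True)
print(w_l2, w_adamw)
```

On the first step the adaptive normalization cancels almost all of the L2 term's effect, while the decoupled decay still shrinks the weight by `lr * weight_decay * param`, so `w_adamw` ends up slightly smaller than `w_l2`.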

A Novel Character Segmentation Method for Text Images Captured by Cameras

  • Lue, Hsin-Te;Wen, Ming-Gang;Cheng, Hsu-Yung;Fan, Kuo-Chin;Lin, Chih-Wei;Yu, Chih-Chang
    • ETRI Journal / v.32 no.5 / pp.729-739 / 2010
  • Due to the rapid development of camera-equipped mobile devices, instant translation of any text seen in any context is possible. Mobile devices can serve as translation tools by recognizing the text in captured scenes. Images captured by cameras contain more external or unwanted effects, which need not be considered in traditional optical character recognition (OCR). In this paper, we segment a text image captured by a mobile device into individual characters to facilitate OCR kernel processing. Text detection and text line construction must be performed before character segmentation. A novel character segmentation method that integrates touched-character filters is applied to text images captured by cameras. In addition, periphery features are extracted from the segmented images of touched characters and fed to support vector machines to calculate confidence values. In our experiment, the accuracy rate of the proposed character segmentation system is 94.90%, which demonstrates the effectiveness of the proposed method.
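For context, a generic baseline for character segmentation is a vertical projection profile, sketched below. It is not the authors' method: it splits a binarized text line wherever a column contains no ink, and the function name `segment_columns` and the toy image are assumptions.

```python
def segment_columns(binary_image):
    """Split a binary text-line image into character column ranges.

    binary_image is a list of equal-length rows with 1 for ink and 0 for
    background. Returns (start, end) column pairs, end exclusive, one for
    each run of consecutive columns that contain ink.
    """
    if not binary_image:
        return []
    width = len(binary_image[0])
    # Vertical projection profile: amount of ink in each column.
    profile = [sum(row[x] for row in binary_image) for x in range(width)]
    segments, start = [], None
    for x, ink in enumerate(profile):
        if ink and start is None:
            start = x                      # a character run begins
        elif not ink and start is not None:
            segments.append((start, x))    # an empty column ends the run
            start = None
    if start is not None:
        segments.append((start, width))    # run reaches the right edge
    return segments


line = [
    [1, 1, 0, 0, 1, 0, 1, 1],
    [1, 0, 0, 0, 1, 0, 0, 1],
]
print(segment_columns(line))  # [(0, 2), (4, 5), (6, 8)]
```

Touching characters share ink columns, so this baseline merges them into one segment, which is the case the touched-character filters described in the abstract are meant to handle.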