
A Method for Recovering Text Regions in Video using Extended Block Matching and Region Compensation (확장적 블록 정합 방법과 영역 보상법을 이용한 비디오 문자 영역 복원 방법)

  • 전병태;배영래
    • Journal of KIISE:Software and Applications
    • /
    • v.29 no.11
    • /
    • pp.767-774
    • /
    • 2002
  • Conventional research on image restoration has focused on restoring images degraded during formation, storage, and communication, mainly in the signal processing field. Related work on recovering the original image information of caption regions includes a method based on the BMA (block matching algorithm). That method suffers from frequent incorrect matches and propagates the resulting errors, and it cannot recover the frames between two scene changes when scene changes occur more than twice. In this paper, we propose a method for recovering original images using an EBMA (Extended Block Matching Algorithm) and a region compensation method. To support the recovery, the method first extracts a priori knowledge such as scene changes, camera motion, and caption regions. It then decides the direction of recovery using the extracted caption information (the start and end frames of a caption) and the scene change information. Following that direction, recovery is performed in units of character components using the EBMA and the region compensation method; a baseline sketch of block matching appears below. Experimental results show that the EBMA achieves good recovery regardless of the speed of moving objects and the complexity of the background, and the region compensation method successfully recovers original images when there is no reference information about the original image.
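
A minimal sketch of the underlying block-matching step, assuming a SAD (sum of absolute differences) cost and a fixed search window; the authors' EBMA extends this with character-component units and region compensation, which are not reproduced here:

```python
# Hypothetical baseline block matching: find where a caption-occluded
# block best matches a reference frame. Search range and SAD cost are
# illustrative assumptions, not the paper's EBMA.
import numpy as np

def match_block(ref_frame: np.ndarray, block: np.ndarray,
                top_left: tuple, search: int = 8) -> tuple:
    """Return the position in ref_frame whose patch best matches `block`
    (minimum SAD) within +/- `search` pixels of `top_left`."""
    h, w = block.shape
    y0, x0 = top_left
    best_cost, best_pos = np.inf, (y0, x0)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y, x = y0 + dy, x0 + dx
            if y < 0 or x < 0 or y + h > ref_frame.shape[0] or x + w > ref_frame.shape[1]:
                continue  # candidate patch falls outside the frame
            cost = np.abs(ref_frame[y:y+h, x:x+w].astype(int)
                          - block.astype(int)).sum()
            if cost < best_cost:
                best_cost, best_pos = cost, (y, x)
    return best_pos  # the patch here can fill the occluded block
```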

The Task of the Translator: Walter Benjamin and Cultural Translation (번역자의 책무-발터 벤야민과 문화번역)

  • Yoon, Joewon
    • Journal of English Language & Literature
    • /
    • v.57 no.2
    • /
    • pp.217-235
    • /
    • 2011
  • Recognizing the significance of Walter Benjamin's "The Task of the Translator" in recent discourses of postcolonial cultural translation, this essay examines the creative postcolonialist appropriations of Benjamin's theory of translation and their political implications. In an effort to dismantle the imperialist political hierarchy between the West and the non-West, modernity and its "primitive" others, which has been the operative premise of traditional translation studies and anthropology, newly emergent discourses of cultural translation actively adopt Benjamin's notion of translation, which does not prioritize the original text's claim to authenticity. Benjamin theorizes each text, the translation as well as the original, as an incomplete representation of the pure language. Eschewing the formalistic views propounded by deconstructionist critics like Paul de Man, who tend to regard Benjamin's notion of the untranslatable purely in terms of the failure inherent in the language system per se, such postcolonialist critics as Tejaswini Niranjana, Rey Chow, and Homi Bhabha, each in his or her unique way, recuperate the significatory potential of historicity embedded in Benjamin's text. Their further appropriation of the concept of the "untranslatable" depends on a radically political turn that, instead of focusing on the failure of translation, salvages the historical as well as cultural potentiality that lies between disparate cultural entities: signifying differences, or disjunctures, that do not easily render themselves to existing systems of representation. It may therefore be concluded that the postcolonial discourses on cultural translation of Niranjana, Chow, and Bhabha, inspired by Benjamin, each translate the latter's theory into highly politicized understandings of translation, and this leads to an extensive rethinking of the act of translation itself to include all forms of cultural exchange and communicative activity between cultures. The disjunctures between these discourses and Benjamin's text, in that sense, enable them to form a sort of theoretical constellation, which aspires to an impossible yet necessary utopian ideal of critical thinking.

A Study on the Government Full-text Information Disclosure System through the Survey on the Government Officials' Perceptions (원문정보 공개제도에 대한 공무원들의 인식조사 연구)

  • Jang, Bo-Seong
    • Journal of Korean Library and Information Science Society
    • /
    • v.47 no.1
    • /
    • pp.339-360
    • /
    • 2016
  • This study analyzes the actual operating conditions of the government's full-text information disclosure system and public officials' perceptions of it. According to the results, the public servants' awareness of the full-text information disclosure system was high. With regard to the positive and negative functions of full-text disclosure, expectations for the positive function were high in terms of ensuring administrative transparency, while public servants worried about an increased burden of administrative duties. Among the factors hindering the development of full-text disclosure, clients' abuse and misuse of full-text information received the highest percentage of responses. To activate the full-text information disclosure system, measures for preventing such abuse and misuse need to be prepared.

Overlay Text Graphic Region Extraction for Video Quality Enhancement Application (비디오 품질 향상 응용을 위한 오버레이 텍스트 그래픽 영역 검출)

  • Lee, Sanghee;Park, Hansung;Ahn, Jungil;On, Youngsang;Jo, Kanghyun
    • Journal of Broadcast Engineering
    • /
    • v.18 no.4
    • /
    • pp.559-571
    • /
    • 2013
  • This paper presents several problems that arise when 2D video with superimposed overlay text is converted to 3D stereoscopic video. To resolve them, it proposes a scenario in which the original video is divided into two parts, one containing only the overlay text graphic region and the other the video with holes, which are then processed separately. This paper focuses on detecting and extracting the overlay text graphic region, the first step in the proposed scenario. To decide whether a frame contains overlay text, a corner density map based on the Harris corner detector is used (a rough sketch follows below). The overlay text region is then extracted using a hybrid of the color and motion information of the region. The experiments show the results of overlay text region detection and extraction on video sequences from several genres.
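
A rough sketch of the corner-density test, assuming OpenCV's Harris detector; the window size and both thresholds are illustrative assumptions, since the abstract does not give the paper's values:

```python
# Hypothetical sketch: flag frames whose local Harris-corner density
# suggests overlay text (text regions are corner-rich).
import cv2
import numpy as np

def corner_density_map(gray: np.ndarray, win: int = 16) -> np.ndarray:
    """Per-pixel density of Harris corner hits in a win x win window."""
    response = cv2.cornerHarris(np.float32(gray), 2, 3, 0.04)
    corners = (response > 0.01 * response.max()).astype(np.float32)
    # Averaging corner hits over a local window yields a density map.
    return cv2.boxFilter(corners, -1, (win, win), normalize=True)

frame = cv2.imread("frame.png", cv2.IMREAD_GRAYSCALE)  # assumed input
density = corner_density_map(frame)
has_overlay_text = density.max() > 0.2  # assumed decision threshold
```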

Subject-Balanced Intelligent Text Summarization Scheme (주제 균형 지능형 텍스트 요약 기법)

  • Yun, Yeoil;Ko, Eunjung;Kim, Namgyu
    • Journal of Intelligence and Information Systems
    • /
    • v.25 no.2
    • /
    • pp.141-166
    • /
    • 2019
  • Recently, channels such as social media and SNS create enormous amounts of data, and the portion of unstructured data represented as text has grown geometrically. Because it is difficult to examine all of this text, it is important to access it rapidly and grasp its key points, and many studies on text summarization have been proposed for handling and using tremendous amounts of text data. In particular, many recent methods use machine learning and artificial intelligence algorithms to generate summaries objectively and effectively, an approach called "automatic summarization". However, almost all text summarization methods proposed to date construct the summary around the most frequent content in the original documents. Such summaries tend to omit low-weight subjects that are mentioned less often, so bias occurs, information is lost, and it becomes hard to ascertain every subject the documents contain. To avoid this bias, one can summarize with attention to the balance between the topics of a document so that every subject can be ascertained, but an unbalanced distribution across subjects still remains. To retain subject balance in the summary, it is necessary to consider the proportion of every subject the documents originally have and to allocate the portions of subjects equally, so that even sentences of minor subjects are sufficiently included. In this study, we propose a "subject-balanced" text summarization method that secures balance across all subjects and minimizes the omission of low-frequency subjects. For subject-balanced summarization, we use two summary evaluation criteria, "completeness" and "succinctness": completeness means the summary should fully include the contents of the original documents, and succinctness means the summary should contain minimal internal duplication. The proposed method has three phases. The first phase constructs subject term dictionaries. Topic modeling is used to calculate topic-term weights, which indicate the degree to which each term is related to each topic; from these weights, highly related terms for each topic can be identified, and the subjects of the documents emerge from topics composed of terms with similar meanings. A few terms that represent each subject well, called "seed terms", are then selected. Because these terms alone are too few to explain each subject fully, sufficient terms similar to the seed terms are needed for a well-constructed subject dictionary. Word2Vec is used for this word expansion: after Word2Vec modeling, word vectors are created, and the similarity between terms is derived using cosine similarity. The higher the cosine similarity between two terms, the stronger their relationship is taken to be, so terms with high similarity to the seed terms of each subject are selected, and after filtering these expanded terms, the subject dictionary is finally constructed (a sketch of this expansion step appears after this abstract). The second phase allocates a subject to every sentence of the original documents. To grasp the contents of all sentences, a frequency analysis is first conducted with the terms composing the subject dictionaries. TF-IDF weights for each subject are then calculated, making it possible to measure how much each sentence explains each subject. Because a TF-IDF weight can grow without bound, the per-sentence TF-IDF weights of every subject are normalized to values between 0 and 1, and each sentence is allocated to the subject with the maximum TF-IDF weight, yielding a sentence group for each subject. The last phase is summary generation. Sen2Vec is used to compute the similarity between subject sentences and form a similarity matrix, and by repetitively selecting sentences it is possible to generate a summary that fully includes the contents of the original documents while minimizing internal duplication. For evaluation, 50,000 TripAdvisor reviews were used to construct the subject dictionaries and 23,087 reviews were used to generate summaries. A comparison between the proposed method's summary and a frequency-based summary verified that the proposed method better retains the balance of all the subjects the documents originally have.
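
A hedged sketch of the seed-term expansion step using gensim's Word2Vec; the toy corpus, seed terms, and the 0.6 similarity cutoff are illustrative assumptions, not values from the paper:

```python
# Hypothetical seed-term expansion: grow each subject dictionary with
# terms whose Word2Vec cosine similarity to a seed term is high.
from gensim.models import Word2Vec

tokenized_reviews = [["clean", "room", "friendly", "staff"],   # toy corpus
                     ["great", "location", "near", "beach"]]
model = Word2Vec(sentences=tokenized_reviews, vector_size=100,
                 window=5, min_count=1, epochs=20)

seed_terms = {"service": ["staff"], "location": ["beach"]}     # assumed seeds
subject_dict = {}
for subject, seeds in seed_terms.items():
    expanded = set(seeds)
    for seed in seeds:
        # most_similar ranks vocabulary terms by cosine similarity
        for term, sim in model.wv.most_similar(seed, topn=20):
            if sim >= 0.6:  # assumed similarity cutoff
                expanded.add(term)
    subject_dict[subject] = expanded
```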

Construction of Full-Text Database and Implementation of Service Environment for Electronic Theses and Dissertations (학위논문 전문데이터베이스 구축 및 서비스환경 구현)

  • Lee, Kyi-Ho;Kim, Jin-Suk;Yoon, Wha-Muk
    • The Transactions of the Korea Information Processing Society
    • /
    • v.7 no.1
    • /
    • pp.41-49
    • /
    • 2000
  • From the middle of the 1990s, most universities in Korea have required their students to submit not only the original printed theses but also Electronic Theses and Dissertations (ETD) for master's and doctoral degrees. The ETD submitted by students are usually produced with various word processors such as MS-Word, LaTeX, and HWP. Since there is not yet a standard ETD format that merges these different formats, it is difficult to construct an integrated database that provides full-text service. In this paper, we transform three different ETD formats into a unified one, construct a full-text database, and implement a full-text retrieval system for effective search in the Internet environment.


Stroke Width Based Skeletonization for Text Images

  • Nguyen, Minh Hieu;Kim, Soo-Hyung;Yang, Hyung Jeong;Lee, Guee Sang
    • Journal of Computing Science and Engineering
    • /
    • v.8 no.3
    • /
    • pp.149-156
    • /
    • 2014
  • Skeletonization is a morphological operation that transforms an original object into a subset called a 'skeleton'. Skeletonization has been intensively studied for decades and remains a challenging issue, especially for particular target objects. This paper proposes a novel approach to the skeletonization of text images based on stroke width detection. First, a preliminary skeleton is detected by using a Canny edge detector with a Tensor Voting framework. Second, the preliminary skeleton is smoothed, and junction points are connected by interpolation compensation. Experimental results show the validity of the proposed approach.
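
For contrast, a baseline morphological skeletonization in a few lines; this is the classic thinning operation that stroke-width-based methods improve upon, not the paper's approach, and the file name and binarization threshold are assumptions:

```python
# Baseline skeletonization of a text image via morphological thinning.
from skimage.io import imread
from skimage.morphology import skeletonize

text = imread("text.png", as_gray=True)  # assumed input image
binary = text < 0.5                      # dark strokes on a light page
skeleton = skeletonize(binary)           # boolean one-pixel-wide skeleton
```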

A Textual Bibliographic Analysis on the block books 《Xiangming Suanfa》 published in the Joseon Dynasty (조선(朝鮮) 간본(刊本) 《상명산법(詳明算法)》의 원문서지적(原文書誌的) 분석(分析))

  • Lee, Eunju
    • Journal for History of Mathematics
    • /
    • v.33 no.4
    • /
    • pp.181-222
    • /
    • 2020
  • 《Xiangming Suanfa》 is a mathematics text published in 1373. The preface and postscript of the block-book editions of 《Xiangming Suanfa》 printed in the Joseon Dynasty were published as they had been in the original Chinese text, and the editions appear to be based on the 《Xiangming Suanfa》 published by Mingjingtang. Five different block-book editions published in Korea, held in the Yonsei University Library, the Sanghuh Memorial Library of Konkuk University, and the National Library of Korea, were compared. Through recension and correction of the text, this paper is intended to help researchers use the work by providing a corrected, fine print of the text.

Quantized DCT Coefficient Category Address Encryption for JPEG Image

  • Li, Shanshan;Zhang, Yuanyuan
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.10 no.4
    • /
    • pp.1790-1806
    • /
    • 2016
  • Digital image encryption is widely used for image data security. The JPEG standard compresses images with great performance in reducing file size, so a scheme that encrypts an image in JPEG format should preserve the quality of the original image and its reduced size. This paper proposes a JPEG image encryption scheme based on inner-category scrambling of the quantized DC and non-zero AC coefficients. Instead of encrypting coefficient values, the address of each coefficient is encrypted to obtain the ciphertext address, and the 8*8 blocks are then shuffled. Chaotic iteration is employed to generate the chaotic sequences for address scrambling and block shuffling (a minimal sketch follows below). Simulation analysis shows that the proposed scheme is resistant to common attacks. Moreover, the proposed method keeps the file size of the encrypted image within an acceptable range compared with the plaintext image. To enlarge the possible ciphertext space and improve resistance to sophisticated attacks, several additional procedures are further developed; contrast experiments verify that these procedures refine the proposed scheme and achieve significant improvements.
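
A minimal sketch of the chaotic-iteration idea, assuming a logistic map whose sort order defines the 8*8 block permutation; the key values (x0, r) are placeholders, and the coefficient-address scrambling step is omitted:

```python
# Hypothetical chaotic block shuffling: a logistic map generates a
# sequence whose sort order is used as the permutation of 8x8 blocks.
import numpy as np

def chaotic_permutation(n: int, x0: float, r: float = 3.99) -> np.ndarray:
    x = np.empty(n)
    for i in range(n):
        x0 = r * x0 * (1.0 - x0)  # logistic map iteration
        x[i] = x0
    return np.argsort(x)          # sort order serves as the permutation

def shuffle_blocks(img: np.ndarray, x0: float = 0.3456) -> np.ndarray:
    """Shuffle the 8x8 blocks of img (dimensions must be multiples of 8)."""
    h, w = img.shape
    blocks = img.reshape(h // 8, 8, w // 8, 8).swapaxes(1, 2).reshape(-1, 8, 8)
    shuffled = blocks[chaotic_permutation(len(blocks), x0)]
    return shuffled.reshape(h // 8, w // 8, 8, 8).swapaxes(1, 2).reshape(h, w)
```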

PubMiner: Machine Learning-based Text Mining for Biomedical Information Analysis

  • Eom, Jae-Hong;Zhang, Byoung-Tak
    • Genomics & Informatics
    • /
    • v.2 no.2
    • /
    • pp.99-106
    • /
    • 2004
  • In this paper we introduce PubMiner, an intelligent machine-learning-based text mining system for mining biological information from the literature. PubMiner employs natural language processing and machine-learning-based data mining techniques to extract useful biological information, such as protein-protein interactions, from the massive literature. The system recognizes biological terms such as genes, proteins, and enzymes and extracts the interactions described in a document through natural language processing. The extracted interactions are further analyzed with a set of features of each entity, collected from related public databases, to infer additional interactions from the original ones. Both the inferred and the native interactions are provided to the user with links to the literature sources. The performance of entity and interaction extraction was tested with selected MEDLINE abstracts, and the evaluation of the inference proceeded using the protein interaction data of S. cerevisiae (baker's yeast) from MIPS and SGD. A toy sketch of dictionary-based interaction extraction appears below.
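
As a loose illustration only: a toy dictionary-and-co-occurrence extractor. PubMiner's actual NLP pipeline is far more sophisticated, and the entity list and interaction cue verbs below are invented placeholders:

```python
# Toy interaction extraction: tag entities from a dictionary, then pair
# entities that co-occur in a sentence containing an interaction verb.
import itertools
import re

ENTITIES = {"CDC28", "CLN2", "FUS3"}             # assumed gene/protein names
CUES = {"binds", "phosphorylates", "interacts"}  # assumed interaction verbs

def extract_interactions(abstract: str):
    pairs = []
    for sentence in re.split(r"(?<=[.!?])\s+", abstract):
        tokens = set(re.findall(r"[A-Za-z0-9]+", sentence))
        found = ENTITIES & tokens
        if len(found) >= 2 and CUES & {t.lower() for t in tokens}:
            pairs.extend(itertools.combinations(sorted(found), 2))
    return pairs

print(extract_interactions("CDC28 binds CLN2 in late G1. FUS3 is unrelated."))
# -> [('CDC28', 'CLN2')]
```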