• Title/Summary/Keyword: Caption

Design of Caption-processing ASIC for On Screen Display (On Screen Display용 자막처리 ASIC 설계)

  • Jeong, Geun-Yeong;U, Jong-Sik;Park, Jong-In;Park, Ju-Seong;Park, Jong-Seok
    • Journal of the Institute of Electronics Engineers of Korea SD / v.37 no.5 / pp.66-76 / 2000
  • This paper describes the design and implementation of a caption-processing ASIC (Application-Specific Integrated Circuit) for the OSD (On Screen Display) of a karaoke system. The OSD of conventional karaoke systems was implemented with a general-purpose DSP; this paper instead proposes a design that saves hardware resources. The ASIC receives graphic and caption commands and data from a host processor, then modifies the data to produce various graphic effects. The design was carried out with schematics and VHDL coding, and was verified by logic simulation and by FPGA emulation on the real system. The chip was fabricated in a 0.8 μm CMOS SOG process and worked properly in the karaoke system.

Using similarity based image caption to aid visual question answering (유사도 기반 이미지 캡션을 이용한 시각질의응답 연구)

  • Kang, Joonseo;Lim, Changwon
    • The Korean Journal of Applied Statistics / v.34 no.2 / pp.191-204 / 2021
  • Visual Question Answering (VQA) and image captioning are tasks that require understanding both the visual features of images and the linguistic features of text. Co-attention, which connects image and text, may therefore be the key to both tasks. In this paper, we propose a model that achieves high VQA performance by using image captions generated with a standard transformer model pretrained on the MSCOCO dataset. Since captions unrelated to the question can actually interfere with answering, only captions similar to the question were selected, based on their similarity to the question. In addition, since stopwords in a caption cannot help answering and may even interfere with it, the experiments were conducted after removing stopwords. Experiments were conducted on the VQA-v2 dataset to compare the proposed model with the deep modular co-attention network (MCAN) model, which performs well by using co-attention between images and text. As a result, the proposed model outperformed the MCAN model.
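
The caption-filtering step this abstract describes (keeping only captions similar to the question, after stopword removal) can be sketched roughly as follows. TF-IDF cosine similarity and sklearn's English stopword list are illustrative assumptions; the paper does not specify its exact similarity measure here.

```python
# A rough sketch of caption filtering for VQA: keep only the captions
# most similar to the question, after stopword removal. TF-IDF cosine
# similarity is an assumption, not necessarily the paper's metric.
from sklearn.feature_extraction.text import TfidfVectorizer, ENGLISH_STOP_WORDS
from sklearn.metrics.pairwise import cosine_similarity

def remove_stopwords(text):
    return " ".join(w for w in text.lower().split() if w not in ENGLISH_STOP_WORDS)

def select_captions(question, captions, top_k=3):
    docs = [remove_stopwords(question)] + [remove_stopwords(c) for c in captions]
    tfidf = TfidfVectorizer().fit_transform(docs)
    sims = cosine_similarity(tfidf[0], tfidf[1:]).ravel()  # question vs. each caption
    ranked = sorted(zip(sims, captions), reverse=True)
    return [c for _, c in ranked[:top_k]]

captions = [
    "a man riding a skateboard down a ramp",
    "a dog sleeping on a couch",
    "a skateboarder performing a trick at a park",
]
print(select_captions("what is the man riding?", captions, top_k=2))
```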

A Method for Recovering Text Regions in Video using Extended Block Matching and Region Compensation (확장적 블록 정합 방법과 영역 보상법을 이용한 비디오 문자 영역 복원 방법)

  • Chun, Byung-Tae;Bae, Young-Lae
    • Journal of KIISE:Software and Applications / v.29 no.11 / pp.767-774 / 2002
  • Conventional research on image restoration has focused on restoring images degraded during image formation, storage, and communication, mainly in the signal-processing field. Related research on recovering the original image content of caption regions includes a method using a BMA (Block Matching Algorithm). That method suffers from frequent incorrect matches and from propagating the resulting errors, and it cannot recover the frames between two scene changes when scene changes occur more than twice. In this paper, we propose a method for recovering the original images using an EBMA (Extended Block Matching Algorithm) and a region compensation method. For original-image recovery, the method first extracts a priori knowledge such as scene changes, camera motion, and caption regions. It then decides the direction of recovery using the extracted caption information (the start and end frames of a caption) and the scene-change information. Following that direction, recovery is performed in units of character components using the EBMA and the region compensation method. Experimental results show that the EBMA recovers well regardless of the speed of moving objects and the complexity of the video background, and that the region compensation method recovers the original image successfully even when there is no reference information about the original image.
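
A minimal sketch of the block-matching recovery idea underlying this line of work is shown below, assuming grayscale numpy frames and a binary caption mask. The paper's EBMA extensions (exploiting scene-change, camera-motion, and caption a priori knowledge, and working per character component) are omitted here.

```python
# A rough sketch: fill the caption-covered pixels of one block from the
# best-matching block in a reference frame, matching on uncovered pixels.
import numpy as np

def recover_block(cur, ref, mask, y, x, bs=16, search=8):
    """cur/ref: grayscale frames; mask: True where the caption covers pixels."""
    blk = cur[y:y+bs, x:x+bs]
    m = mask[y:y+bs, x:x+bs]
    best, best_cost = (y, x), np.inf
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            ry, rx = y + dy, x + dx
            if ry < 0 or rx < 0 or ry + bs > ref.shape[0] or rx + bs > ref.shape[1]:
                continue
            cand = ref[ry:ry+bs, rx:rx+bs]
            # SAD over the pixels NOT covered by the caption.
            cost = np.abs(blk[~m].astype(int) - cand[~m].astype(int)).sum()
            if cost < best_cost:
                best_cost, best = cost, (ry, rx)
    ry, rx = best
    blk[m] = ref[ry:ry+bs, rx:rx+bs][m]  # copy recovered pixels in place
```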

Analysis of the Reading Materials in Elementary School Science Textbooks developed under the 2009 Revised National Science Curriculum (2009 개정 과학교육과정에 따른 초등학교 과학 교과서의 읽기자료 분석)

  • Koh, Hanjoong;Seok, Jongim;Kang, Sukjin
    • Journal of Korean Elementary Science Education / v.36 no.2 / pp.129-142 / 2017
  • In this study, the characteristics of the reading materials in elementary school science textbooks developed under the 2009 revised National Science Curriculum were investigated. The criteria for classifying the reading materials were the type of topic, purpose, student activity, and presentation. The visual images in the reading materials were also analyzed in terms of type, role, caption type, and proximity type. The results indicated that the number of reading materials in the 2009 revised science textbooks decreased compared with the 2007 revised science textbooks, while the frequency of reading materials that expand concepts from the main text and/or require corresponding student inquiry increased. More visual images were used in the reading materials of the 2009 revised science textbooks. However, several limitations remained: most visual images were illustrations and/or pictures, many visual images were presented without a caption, and there were problems with the proximity of visual images to the text.

An analysis on visualization of advertisements for domestic real estate through Otto Kleppner's visualization model (Focused on the creative of advertising in newspaper) (Otto Kleppner의 시각화 모델을 통한 국내 부동산광고의 시각화 분석(신문광고 크리에이티브를 중심으로))

  • 박광래
    • Archives of design research / v.15 no.2 / pp.27-36 / 2002
  • Because each viewer interprets a descriptive illustration without a caption differently, the attention effect appears differently, and the acceptance level of an advertisement ultimately differs according to how the concept is visualized in the creative. The purpose of this research is to understand the reality and status of visualization in newspaper real estate ads through an analysis based on Otto Kleppner's visualization model, and to make the creative work of real estate advertising more reasonable and efficient by identifying problems and their solutions based on the survey. These activities aim at more efficient production of real estate ads, which occupy a relatively large portion of newspaper advertising.

A Research of Character Graphic Design on Larger Television Screens -Based on Analysis of its Visual Perception- (TV화면 대형화에 따른 문자그래픽 표현 연구 -시각인지도 분석 기반-)

  • Lee, Kook-Se;Moon, Nam-Mee
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.9 no.4 / pp.129-138 / 2009
  • Character graphic design, a major visual element of the TV screen, has become greatly important in helping viewers understand visual information and in enhancing the quality of a program. This research aims to figure out how to change and improve the attributes of TV captions and graphics, such as font, size, and caption speed, to suit bigger and higher-quality TV screens. Based on two Delphi surveys of graphics experts along with theoretical studies, this article analyzes how visual perception relates to the various visual elements of the TV screen, and proposes a better plan for visual effects across various media under OSMU (One Source Multi Use).

A Method for Recovering Image Data for Caption Regions and Replacing Caption Text (비디오 자막 영역 원영상 복원 후 자막 교환 방법)

  • Chun, Byung-Tae;Han, Kyu-Seo;Bae, Young-Lae
    • Proceedings of the Korea Information Processing Society Conference / 2001.10a / pp.743-746 / 2001
  • Among multimedia information, video data carries a large amount of information, so automated video processing technology is required. Captions are inserted into most videos to aid viewers' understanding and viewing convenience. In the editing process, there is often a need to replace the foreign-language captions embedded in imported broadcasts and films with new captions. Conventional methods fill a region large enough to cover the caption with a specific color and then insert the new caption. The problem with these methods is that they destroy the video image information over a large region, inconveniencing viewers and making caption replacement inefficient and unnatural. To overcome these problems, this paper proposes a method that recovers the original image in the caption region and then replaces the caption with another one. For original-image recovery, we propose a recovery method that uses video information and a BMA (Block Matching Algorithm); effective caption replacement is then achieved by inserting the new caption into the recovered region. Experimental results show that caption replacement based on original-image recovery is more natural and effective than conventional methods.
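
A minimal sketch of the final replacement step only, assuming the caption region has already been recovered (for example with block matching as sketched earlier). The OpenCV font, position, and colors are illustrative assumptions, not the paper's settings.

```python
# A rough sketch: draw a new caption over an already-recovered frame.
import cv2

def replace_caption(recovered_frame, new_text, org=(40, 460)):
    frame = recovered_frame.copy()
    cv2.putText(frame, new_text, org, cv2.FONT_HERSHEY_SIMPLEX,
                1.0, (255, 255, 255), 2, cv2.LINE_AA)  # white caption text
    return frame
```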

A Method for Character Segmentation using MST(Minimum Spanning Tree) (MST를 이용한 문자 영역 분할 방법)

  • Chun, Byung-Tae;Kim, Young-In
    • Journal of the Korea Society of Computer and Information / v.11 no.3 / pp.73-78 / 2006
  • Conventional caption extraction methods use inter-frame differences or color segmentation over the whole image. Because these methods depend heavily on heuristics, they require a priori knowledge of the captions to be extracted, and they are difficult to implement. In this paper, we propose a method that uses few heuristics and a simplified algorithm. We use the topographical features of characters to extract character points, and use an MST (Minimum Spanning Tree) to extract candidate caption regions. Character regions are then determined by testing several conditions and verifying the candidate regions. Experimental results show a candidate-region extraction rate of 100% and a character-region extraction rate of 98.2%, demonstrating that caption regions are extracted well even from complex images.
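
The MST grouping step can be sketched as follows, assuming character-candidate points (e.g., connected-component centroids) have already been extracted from the image. The edge-length cut threshold is an illustrative assumption; the paper derives its points from topographical character features and verifies candidates with further conditions.

```python
# A rough sketch: cluster character points into candidate caption regions
# by building an MST over the points and cutting long edges.
import numpy as np
from scipy.spatial.distance import cdist
from scipy.sparse.csgraph import minimum_spanning_tree, connected_components

def group_points(points, cut=30.0):
    d = cdist(points, points)                 # complete graph on the points
    mst = minimum_spanning_tree(d).toarray()
    mst[mst > cut] = 0                        # cut edges longer than the threshold
    n, labels = connected_components(mst, directed=False)
    return [points[labels == k] for k in range(n)]

pts = np.array([[10, 10], [22, 11], [34, 9],      # one caption line
                [200, 150], [214, 152]])          # another text cluster
for region in group_points(pts):
    print(region)
```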

Automated Story Generation with Image Captions and Recursive Calls (이미지 캡션 및 재귀호출을 통한 스토리 생성 방법)

  • Isle Jeon;Dongha Jo;Mikyeong Moon
    • Journal of the Institute of Convergence Signal Processing / v.24 no.1 / pp.42-50 / 2023
  • Technological development has brought digital innovation across the media industry, including production and editing techniques, and the OTT and streaming era has diversified how consumers view content. The convergence of big data and deep learning networks has enabled automatic text generation in formats such as news articles, novels, and scripts, but studies that reflect the author's intention and generate contextually smooth stories have been insufficient. In this paper, we describe the flow of pictures in a storyboard with an image caption generation technique, and automatically generate story-tailored scenarios with a language model. Using image captioning based on a CNN and an attention mechanism, we generate sentences describing the pictures on the storyboard, and feed the generated sentences into the natural language processing model KoGPT-2 to automatically generate scenarios that meet the planning intention. Through this approach, scenarios customized to the author's intention and story can be produced in large quantities, easing the burden of content creation, and artificial intelligence can participate in the overall process of digital content production to enable media intelligence.
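
A rough sketch of the caption-to-scenario chain, using public Hugging Face checkpoints as stand-ins: an off-the-shelf image-captioning pipeline in place of the authors' CNN+attention captioner, and skt/kogpt2-base-v2 in place of their KoGPT-2 setup. The model choices and prompt format are assumptions, not the authors' exact configuration.

```python
# A rough sketch: caption each storyboard image, then use the captions
# as a prompt for a GPT-2-family language model.
from transformers import pipeline

captioner = pipeline("image-to-text", model="nlpconnect/vit-gpt2-image-captioning")
generator = pipeline("text-generation", model="skt/kogpt2-base-v2")

def storyboard_to_scenario(image_paths, max_length=128):
    # 1) Describe each storyboard picture with a generated caption.
    captions = [captioner(p)[0]["generated_text"] for p in image_paths]
    # 2) Use the concatenated captions as the prompt for the language model.
    #    (The paper feeds Korean input to KoGPT-2; translating English
    #    captions would be needed in practice.)
    prompt = " ".join(captions)
    return generator(prompt, max_length=max_length)[0]["generated_text"]
```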