• Title/Summary/Keyword: video to images

Search Results: 1,354

A Study of Kernel Characteristics of CNN Deep Learning for Effective Fire Detection Based on Video (영상기반의 화재 검출에 효과적인 CNN 심층학습의 커널 특성에 대한 연구)

  • Son, Geum-Young;Park, Jang-Sik
    • The Journal of the Korea institute of electronic communication sciences
    • /
    • v.13 no.6
    • /
    • pp.1257-1262
    • /
    • 2018
  • In this paper, a deep learning method is proposed to detect fire effectively using surveillance camera video. Based on the AlexNet model, classification performance is compared according to the kernel size and stride of the convolution layer. The dataset for training and inference is divided into two classes, normal and fire: normal images include cloud and fog images, while fire images include smoke and flame images. Simulation results show that a larger kernel size and a smaller stride yield better performance.
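The kernel-size/stride trade-off described above can be illustrated with the standard convolution output-size formula; the input size, kernel sizes, and strides below are illustrative AlexNet-style values, not the exact configurations tested in the paper.

```python
def conv_output_size(input_size, kernel_size, stride, padding=0):
    """Spatial output size of a convolution layer: floor((n + 2p - k) / s) + 1."""
    return (input_size + 2 * padding - kernel_size) // stride + 1

# AlexNet's first layer uses 11x11 kernels with stride 4 on 227x227 inputs.
print(conv_output_size(227, kernel_size=11, stride=4))  # 55
# A smaller stride keeps more spatial detail (a larger feature map),
# consistent with the paper's finding that smaller strides help fire detection.
print(conv_output_size(227, kernel_size=11, stride=2))  # 109
```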

Zoom Lens Distortion Correction Of Video Sequence Using Nonlinear Zoom Lens Distortion Model (비선형 줌-렌즈 왜곡 모델을 이용한 비디오 영상에서의 줌-렌즈 왜곡 보정)

  • Kim, Dae-Hyun;Shin, Hyoung-Chul;Oh, Ju-Hyun;Nam, Seung-Jin;Sohn, Kwang-Hoon
    • Journal of Broadcast Engineering
    • /
    • v.14 no.3
    • /
    • pp.299-310
    • /
    • 2009
  • In this paper, we propose a new method to correct zoom lens distortion in video sequences captured with a zoom lens. First, we define a nonlinear zoom lens distortion model, parameterized by focal length and lens distortion, which exploits the property that the lens distortion parameter changes nonlinearly and monotonically as the focal length increases. We then choose sample images from the video sequence and estimate a focal length and a lens distortion parameter for each sample image. Using these estimates, we optimize the zoom lens distortion model. Once the model is obtained, the lens distortion parameters of the remaining images can be computed from their focal lengths. The proposed method was tested on many real images and videos; accurate distortion parameters were estimated from the zoom lens distortion model, and distorted images were corrected without visual artifacts.
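As a sketch of the idea, the monotonic relation between focal length and the distortion parameter can be captured by fitting a low-order polynomial to a few sampled estimates; the (focal length, k1) pairs and the 1/f parameterization below are hypothetical, not the paper's calibrated model.

```python
import numpy as np

# Hypothetical (focal length, radial distortion k1) pairs estimated from sample frames.
focals = np.array([10.0, 20.0, 40.0, 80.0])
k1s    = np.array([-0.30, -0.12, -0.05, -0.02])  # monotonically approaching 0

# Fit k1 as a low-order polynomial in 1/f, one common nonlinear zoom-distortion form.
coeffs = np.polyfit(1.0 / focals, k1s, deg=2)
model = np.poly1d(coeffs)

def k1_for_focal(f):
    """Predict the distortion parameter for any focal length from the fitted model."""
    return model(1.0 / f)

# Frames between the samples get interpolated distortion parameters.
print(round(float(k1_for_focal(30.0)), 4))
```

Once fitted, only the focal length of each frame is needed to undistort it, which is exactly the workflow the abstract describes.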

Toward a Key-frame Extraction Framework for Video Storyboard Surrogates Based on Users' EEG Signals (이용자 기반의 비디오 키프레임 자동 추출을 위한 뇌파측정기술(EEG) 적용)

  • Kim, Hyun-Hee;Kim, Yong-Ho
    • Journal of the Korean Society for Library and Information Science
    • /
    • v.49 no.1
    • /
    • pp.443-464
    • /
    • 2015
  • This study examined the feasibility of using EEG signals and the ERP P3b component for extracting video key-frames based on users' cognitive responses. EEG signals were collected from twenty participants. The research found that the average amplitude of the right parietal lobe is higher than that of the left parietal lobe when relevant images are shown to participants, with a significant difference between the average amplitudes of the two parietal lobes. On the other hand, the average amplitude of the left parietal lobe for non-relevant images is lower than that for relevant images, and there is no significant difference between the average amplitudes of the two parietal lobes for non-relevant images. Additionally, the latency of MGFP1 and channel coherence can also be used as criteria to extract key-frames.
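The amplitude criterion can be sketched as a simple decision rule; the channel values and the significance gap below are hypothetical placeholders for the statistical test used in the study.

```python
import statistics

def is_relevant_frame(left_parietal, right_parietal, min_gap=0.5):
    """Flag a frame as relevant when the right-parietal average amplitude
    exceeds the left-parietal average by a (hypothetical) significance gap."""
    left_mean = statistics.mean(left_parietal)
    right_mean = statistics.mean(right_parietal)
    return (right_mean - left_mean) > min_gap

# Hypothetical averaged ERP amplitudes (microvolts) per channel.
print(is_relevant_frame([2.0, 2.2, 1.9], [3.1, 3.4, 3.0]))  # True
print(is_relevant_frame([2.0, 2.1, 2.0], [2.1, 2.2, 2.0]))  # False
```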

Real-time Video Matting for Mobile Device (모바일 환경에서 실시간 영상 전경 추출 연구)

  • Yoon, Jong-Chul
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology
    • /
    • v.11 no.5
    • /
    • pp.487-492
    • /
    • 2018
  • Recently, various image processing applications have been ported to the mobile environment as image capture on mobile devices has become widespread. However, extracting the image foreground, one of the most important steps in image synthesis, is difficult because it requires complex computation. In this paper, we propose a video synthesis technique that divides images captured by mobile devices into foreground and background and composites them onto target images in real time. Considering the characteristics of mobile shooting, our system can automatically extract the foreground of input video that contains only weak camera motion. Using SIMD and GPGPU-based acceleration, SD-quality video can be processed on a mobile device in real time.
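A minimal sketch of foreground extraction for near-static video, assuming a per-pixel temporal-median background model (a simple stand-in for the paper's actual matting algorithm):

```python
import numpy as np

def extract_foreground(frames, threshold=25):
    """Model the background as the per-pixel temporal median over the clip,
    then mark pixels that deviate beyond a threshold as foreground."""
    stack = np.stack(frames).astype(np.int16)        # (T, H, W) grayscale
    background = np.median(stack, axis=0)
    masks = np.abs(stack - background) > threshold   # boolean (T, H, W)
    return masks

# Tiny synthetic example: a static background with a bright moving blob.
frames = [np.zeros((4, 4), dtype=np.uint8) for _ in range(5)]
for t, f in enumerate(frames[1:], start=1):
    f[t % 4, 0] = 200     # the "object" moves down the first column
masks = extract_foreground(frames)
print(int(masks[1].sum()))   # only the blob pixel is foreground in frame 1
```

The per-pixel independence of this scheme is what makes SIMD/GPGPU acceleration straightforward: every pixel's median and threshold test can run in parallel.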

MPEG Video Retrieval Using KD-Trees (KD-Trees 구조를 이용한 MPEG 비디오 검색)

  • Kim, Daeil;Hong, Jong-Sun;Jang, Hye-Kyoung;Kim, Young-Ho;Kang, Dae-Seong
    • Proceedings of the IEEK Conference
    • /
    • 2003.07e
    • /
    • pp.1855-1858
    • /
    • 2003
  • In this paper, we propose an image retrieval method that is more accurate and efficient than conventional ones. First of all, we perform shot detection and key frame extraction on the DC images constructed from the DCT DC coefficients of the compressed video stream, following video compression standards such as MPEG[1][2]. We obtain the principal axes by applying PCA (Principal Component Analysis) to the key frames to derive indexing information, and partition the feature domain accordingly. Video retrieval then uses this high-dimensional indexing information. We apply KD-Trees (K-Dimensional Trees)[3], which provide efficient retrieval over high-dimensional data sets, to the video retrieval method. The proposed method can represent the properties of images more efficiently and the properties of domains more accurately using KD-Trees.
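The retrieval step can be sketched with a minimal KD-tree over (hypothetical) PCA-reduced keyframe descriptors; this is a textbook implementation, not the paper's code.

```python
def build_kdtree(points, depth=0):
    """Build a minimal KD-tree over k-dimensional tuples."""
    if not points:
        return None
    axis = depth % len(points[0])
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2
    return {"point": points[mid], "axis": axis,
            "left": build_kdtree(points[:mid], depth + 1),
            "right": build_kdtree(points[mid + 1:], depth + 1)}

def nearest(node, query, best=None):
    """Recursive nearest-neighbour search with branch pruning."""
    if node is None:
        return best
    dist = sum((a - b) ** 2 for a, b in zip(node["point"], query))
    if best is None or dist < best[0]:
        best = (dist, node["point"])
    diff = query[node["axis"]] - node["point"][node["axis"]]
    near, far = (node["left"], node["right"]) if diff < 0 else (node["right"], node["left"])
    best = nearest(near, query, best)
    if diff ** 2 < best[0]:          # the far branch may still hold a closer point
        best = nearest(far, query, best)
    return best

# Hypothetical 2-D PCA projections of keyframe descriptors.
tree = build_kdtree([(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)])
print(nearest(tree, (9, 2))[1])  # (8, 1)
```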


Digital Video Camera Characterization Considering White Balance (White Balance를 고려한 디지털 비디오 카메라 Characterization)

  • 박종선;김대원;장수욱;김은수;송규익
    • Proceedings of the IEEK Conference
    • /
    • 2002.06d
    • /
    • pp.299-302
    • /
    • 2002
  • A digital video camera can be a useful tool to capture images for colorimetric use. However, the RGB signals generated by different digital video cameras are not equal for the same scene. A digital video camera used as a colorimeter is characterized with respect to the CIE standard colorimetric observer. One method of deriving a colorimetric characterization matrix between camera RGB output signals and CIE XYZ tristimulus values is polynomial modeling. In this paper, a 3×3 linear matrix and a 3×11 polynomial matrix are used to investigate the characterization performance of a professional digital video camera. Experimental results demonstrate that the proposed 3×3 linear matrix has a reasonable degree of accuracy for colorimetric use.
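Deriving the 3×3 characterization matrix can be sketched as a least-squares fit between camera RGB responses and measured XYZ values; the patch data and the matrix below are hypothetical.

```python
import numpy as np

# Hypothetical training data: camera RGB responses and measured CIE XYZ values
# for a set of colour patches (one patch per row).
rgb = np.array([[0.9, 0.1, 0.1],
                [0.1, 0.8, 0.2],
                [0.2, 0.1, 0.9],
                [0.5, 0.5, 0.5]])
true_M = np.array([[0.41, 0.36, 0.18],
                   [0.21, 0.72, 0.07],
                   [0.02, 0.12, 0.95]])
xyz = rgb @ true_M.T   # synthetic, noise-free "measurements"

# Least-squares fit of the 3x3 characterization matrix M so that XYZ ≈ M @ RGB.
M, *_ = np.linalg.lstsq(rgb, xyz, rcond=None)
M = M.T

print(np.allclose(M, true_M))  # recovers the matrix exactly on noise-free data
```

With real patch measurements the fit would not be exact, and the residual is precisely the accuracy figure the paper evaluates; the 3×11 variant simply augments each RGB row with polynomial terms before the same fit.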


Generation and Coding of Layered Depth Images for Multi-view Video Representation with Depth Information (깊이정보를 포함한 다시점 비디오로부터 계층적 깊이영상 생성 및 부호화 기법)

  • Yoon, Seung-Uk;Lee, Eun-Kyung;Kim, Sung-Yeol;Ho, Yo-Sung;Yun, Kug-Jin;Kim, Dae-Hee;Hur, Nam-Ho;Lee, Soo-In
    • Proceedings of the IEEK Conference
    • /
    • 2005.11a
    • /
    • pp.375-378
    • /
    • 2005
  • A multi-view video is a collection of multiple videos capturing the same scene from different viewpoints. Multi-view video can be used in various applications, including free-viewpoint TV and three-dimensional TV. Since the data size of multi-view video increases linearly with the number of cameras, it is necessary to compress multi-view video data for efficient storage and transmission. Multi-view video can be coded using the concept of the layered depth image (LDI). In this paper, we describe a procedure to generate an LDI from natural multi-view video and present a method to encode multi-view video using the LDI concept.
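The LDI concept can be sketched as a per-pixel list of depth-sorted (depth, color) layers; this toy example assumes samples are already warped into the reference view and uses hypothetical coordinates and a hypothetical duplicate-merging tolerance.

```python
from collections import defaultdict

def build_ldi(samples, depth_tol=0.5):
    """Layered depth image construction sketch: samples are (x, y, depth, color)
    tuples warped into the reference view. Each pixel keeps a depth-sorted list
    of layers; samples closer than depth_tol to an existing layer are merged."""
    ldi = defaultdict(list)   # (x, y) -> [(depth, color), ...]
    for x, y, depth, color in samples:
        layers = ldi[(x, y)]
        if any(abs(depth - d) < depth_tol for d, _ in layers):
            continue          # redundant sample of the same surface from another camera
        layers.append((depth, color))
        layers.sort()         # front-most layer first
    return ldi

# Two cameras see the same front surface; one also sees the occluded back wall.
samples = [(3, 3, 2.0, "red"), (3, 3, 2.1, "red"), (3, 3, 9.0, "gray")]
print(build_ldi(samples)[(3, 3)])  # [(2.0, 'red'), (9.0, 'gray')]
```

Storing only distinct layers per pixel is what removes the inter-view redundancy that makes plain multi-view video grow linearly with camera count.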


A Method for Surface Reconstruction and Synthesizing Intermediate Images for Multi-viewpoint 3-D Displays

  • Fujii, Mahito;Ito, Takayuki;Miyake, Sei
    • Proceedings of the Korean Society of Broadcast Engineers Conference
    • /
    • 1996.06b
    • /
    • pp.35-40
    • /
    • 1996
  • In this paper, a method for 3-D surface reconstruction with two real cameras is presented. The method, which combines the extraction of binocular disparity with its interpolation, can be applied to the synthesis of images from virtual viewpoints. The synthesized virtual images are as natural as the real images, even when observed as stereoscopic images. The method opens up many applications, such as synthesizing input images for multi-viewpoint 3-D displays and enhancing the depth impression in 2-D images. We have also developed a video-rate stereo machine able to obtain binocular disparity in 1/30 s with two cameras, and we show the performance of the machine.
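Disparity interpolation for intermediate-view synthesis can be sketched on a single scanline: each pixel is shifted by a fraction of its disparity toward the virtual viewpoint. The scanline, disparities, and hole handling below are hypothetical simplifications.

```python
def synthesize_scanline(left, disparity, alpha):
    """Forward-warp one scanline of the left image toward the right view.
    alpha in [0, 1] selects the virtual viewpoint (0 = left, 1 = right);
    each pixel moves by alpha * disparity. Unfilled positions stay None."""
    out = [None] * len(left)
    for x, (color, d) in enumerate(zip(left, disparity)):
        target = x - round(alpha * d)     # shift by the interpolated disparity
        if 0 <= target < len(out):
            out[target] = color
    return out

left      = ["a", "b", "c", "d", "e", "f"]
disparity = [0, 0, 2, 2, 0, 0]            # "c" and "d" lie on a near object
print(synthesize_scanline(left, disparity, alpha=0.5))
```

The near object shifts by one pixel at the midpoint view, occluding a background pixel and exposing a hole behind it; a full system would fill such holes from the other camera's image.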


Stochastic Non-linear Hashing for Near-Duplicate Video Retrieval using Deep Feature applicable to Large-scale Datasets

  • Byun, Sung-Woo;Lee, Seok-Pil
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.13 no.8
    • /
    • pp.4300-4314
    • /
    • 2019
  • With the development of video-related applications, media content has increased dramatically. There is a substantial amount of near-duplicate videos (NDVs) among Internet videos, so near-duplicate video retrieval (NDVR) is important for eliminating near-duplicates from web video searches. This paper proposes a novel NDVR system that supports large-scale retrieval with efficient and accurate performance. To this end, we extract keyframes from each video at regular intervals and then extract both commonly used features (LBP and HSV) and a new image feature from each keyframe. A recent study introduced this new image feature, which provides more robust information than existing features even under geometric changes and complex editing of images. We convert the vector set of extracted features to binary codes through a set of hash functions, making similarity comparison more efficient because similar videos are more likely to map into the same buckets. Lastly, we calculate similarity to search for NDVs. We examine the effectiveness of the NDVR system and compare it against previous NDVR systems using the public video collection CC_WEB_VIDEO. The proposed NDVR system's performance is very promising compared to previous NDVR systems.
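The bucketing idea can be sketched with random-hyperplane LSH, which maps similar feature vectors to similar binary codes; note this generic scheme is a stand-in for the paper's learned stochastic non-linear hash functions.

```python
import numpy as np

rng = np.random.default_rng(42)

def make_hasher(dim, n_bits):
    """Random-hyperplane LSH: one random normal vector per output bit;
    the bit is the sign of the projection onto that hyperplane."""
    planes = rng.normal(size=(n_bits, dim))
    def hash_code(v):
        return tuple((planes @ v > 0).astype(int))   # sign pattern = binary code
    return hash_code

hash_code = make_hasher(dim=8, n_bits=16)
base = rng.normal(size=8)
near = base + 0.01 * rng.normal(size=8)   # a near-duplicate feature vector
far  = rng.normal(size=8)                 # an unrelated feature vector

same = sum(a == b for a, b in zip(hash_code(base), hash_code(near)))
diff = sum(a == b for a, b in zip(hash_code(base), hash_code(far)))
print(same, diff)   # the near-duplicate's code agrees on far more bits
```

Because codes of near-duplicates agree on almost every bit, hashing into buckets keyed by the code turns an exhaustive comparison into a lookup, which is what makes the approach viable at large scale.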

Feasibility Study on Audio-Tactile Display via Spectral Modulation (스펙트럼 변조를 이용한 청각정보의 촉감재현 가능성 연구)

  • Kwak, Hyun-Koo;Kim, Whee-Kuk;Chung, Ju-No;Kang, Dae-Im;Park, Yon-Kyu;Koo, Min-Mo
    • Journal of the Korean Society for Precision Engineering
    • /
    • v.28 no.5
    • /
    • pp.638-647
    • /
    • 2011
  • Various approaches that directly use the vibrations of speakers have been suggested to effectively convey aural information, such as music, to the hearing-impaired or the deaf. In these approaches, however, humans cannot sense frequency information above the maximum perceivable vibro-tactile frequency (around 1 kHz). Therefore, in this study, an approach is proposed that compresses high-frequency audio information into the perceivable vibro-tactile frequency domain via spectral modulation and outputs the modulated signals through designated speakers. It is then shown, through simulations using the Short-Time Fourier Transform (STFT) with Hanning windows and through preliminary experiments on a vibro-tactile display testbed built and interfaced with a notebook PC, that the modulated signal of a natural sound composed of the sounds of a frog, a bird, and a water stream produces a noise-free signal suitable for vibro-tactile speakers without significant interfering disturbances. Lastly, for three different combinations of information provided to the subject, that is, i) video image only, ii) video image along with the proposed modulated vibro-tactile stimuli applied to the subject's forearm, and iii) video image along with full audio information, the effects on the subject's sense of reality and emotional response to given audio-video clips including various sounds and images are investigated and compared. The results of these experiments show that the proposed method of providing modulated vibro-tactile stimuli along with video images is highly feasible for transmitting a pseudo-aural sense to humans.
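The spectral-modulation idea can be sketched as squeezing the whole audible spectrum into the sub-1 kHz vibro-tactile band by remapping FFT bins; the sampling rate and linear mapping below are hypothetical, and a real system would apply this per Hanning-windowed STFT frame rather than to the whole signal at once.

```python
import numpy as np

def compress_spectrum(signal, sample_rate, max_tactile_hz=1000.0):
    """Map every frequency component f to f * (max_tactile_hz / nyquist),
    squeezing the spectrum into the perceivable vibro-tactile band,
    then resynthesize the waveform by inverse FFT."""
    spectrum = np.fft.rfft(signal)
    ratio = max_tactile_hz / (sample_rate / 2.0)
    squeezed = np.zeros_like(spectrum)
    for src_bin in range(len(spectrum)):
        squeezed[int(src_bin * ratio)] += spectrum[src_bin]  # crude bin remapping
    return np.fft.irfft(squeezed, n=len(signal))

# A 4 kHz tone (imperceptible to touch) becomes a 500 Hz vibration at 16 kHz sampling.
sr = 16000
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 4000 * t)
out = compress_spectrum(tone, sr)
peak_bin = int(np.argmax(np.abs(np.fft.rfft(out))))
print(peak_bin)   # dominant component is now at 500 Hz (1 Hz bin resolution)
```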