• Title/Summary/Keyword: scene detection

Search Result 519, Processing Time 0.026 seconds

2-Stage Detection and Classification Network for Kiosk User Analysis (디스플레이형 자판기 사용자 분석을 위한 이중 단계 검출 및 분류 망)

  • Seo, Ji-Won;Kim, Mi-Kyung
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.26 no.5
    • /
    • pp.668-674
    • /
    • 2022
  • Machine learning techniques that use visual data are highly applicable in industry and service fields such as scene recognition, fault detection, security, and user analysis. Among these, user analysis from CCTV video is one of the most practical uses of vision data. In addition, many studies on lightweight artificial neural networks have been published to improve usability in mobile and embedded environments. In this study, we propose a network combining object detection and classification for a mobile graphics processing unit. The network detects pedestrians and faces, then classifies age and gender from each detected face. The proposed network is built on MobileNet, YOLOv2, and skip connections. The detection and classification models are trained individually and combined into a 2-stage structure, and an attention mechanism is used to improve detection and classification performance. The proposed system is run and evaluated on an Nvidia Jetson Nano.
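The 2-stage structure described above can be outlined roughly as follows. This is a hypothetical sketch with stubbed models, not the authors' implementation: the real stage-1 detector and stage-2 classifier are MobileNet/YOLOv2-based networks with attention, whereas here they are placeholder functions.

```python
# Hypothetical sketch of a 2-stage detect-then-classify pipeline.
# detect_faces and classify_face are stubs standing in for the paper's
# MobileNet/YOLOv2-based detector and attention-augmented classifier.

def detect_faces(frame):
    """Stage 1: return face bounding boxes as (x, y, w, h). Stub detector
    that reports one centered box covering half the frame."""
    h, w = len(frame), len(frame[0])
    return [(w // 4, h // 4, w // 2, h // 2)]

def classify_face(crop):
    """Stage 2: return (age_group, gender) for one face crop. Stub
    classifier that thresholds the mean pixel intensity."""
    mean = sum(sum(row) for row in crop) / (len(crop) * len(crop[0]))
    age = "adult" if mean > 0.5 else "child"
    gender = "female" if mean > 0.75 else "male"
    return age, gender

def two_stage_analysis(frame):
    """Run detection, crop each detected face, then classify the crop:
    two individually trained models combined as a 2-stage structure."""
    results = []
    for (x, y, w, h) in detect_faces(frame):
        crop = [row[x:x + w] for row in frame[y:y + h]]
        results.append({"box": (x, y, w, h), "attrs": classify_face(crop)})
    return results
```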

Development of an Emergency Situation Detection Algorithm Using a Vehicle Dash Cam (차량 단말기 기반 돌발상황 검지 알고리즘 개발)

  • Sanghyun Lee;Jinyoung Kim;Jongmin Noh;Hwanpil Lee;Soomok Lee;Ilsoo Yun
    • The Journal of The Korea Institute of Intelligent Transport Systems
    • /
    • v.22 no.4
    • /
    • pp.97-113
    • /
    • 2023
  • Swift and appropriate responses to emergency situations, such as objects falling on the road, improve convenience for road users and effectively reduce secondary traffic accidents. In Korea, current intelligent transportation system (ITS)-based detection systems for emergency road situations rely mainly on loop detectors and CCTV cameras, which capture road data only within the detection range of the equipment. Therefore, a new detection method is needed to identify emergency situations in spatially shaded areas that existing ITS detection systems cannot reach. In this study, we propose a ResNet-based algorithm that detects and classifies emergency situations from vehicle camera footage. We collected front-view driving videos recorded on Korean highways, labeled each video by the type of emergency, and trained the proposed algorithm on the data.
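The ResNet family of models this paper builds on is defined by its residual blocks. A minimal numeric sketch of the skip connection, using toy fully connected layers rather than the paper's convolutional architecture, is:

```python
# Toy residual block: output = relu(F(x) + x), the core ResNet idea.
# The weights and layer shapes here are illustrative, not the paper's model.

def relu(v):
    return [max(0.0, x) for x in v]

def linear(v, w, b):
    """One fully connected layer: w is a list of rows, b a bias vector."""
    return [sum(wi * xi for wi, xi in zip(row, v)) + bi
            for row, bi in zip(w, b)]

def residual_block(x, w1, b1, w2, b2):
    """Two layers F(x), then the skip connection adds the input back in,
    which lets gradients flow even when F learns little."""
    h = relu(linear(x, w1, b1))
    y = linear(h, w2, b2)
    return relu([yi + xi for yi, xi in zip(y, x)])
```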

DNN Based Multi-spectrum Pedestrian Detection Method Using Color and Thermal Image (DNN 기반 컬러와 열 영상을 이용한 다중 스펙트럼 보행자 검출 기법)

  • Lee, Yongwoo;Shin, Jitae
    • Journal of Broadcast Engineering
    • /
    • v.23 no.3
    • /
    • pp.361-368
    • /
    • 2018
  • As autonomous driving research develops rapidly, pedestrian detection is also being actively investigated. However, most studies use color image datasets, in which pedestrians are relatively easy to detect. With color images, the scene must be sufficiently lit to capture the pedestrian, and conventional methods struggle to detect pedestrians otherwise. Therefore, in this paper, we propose a deep neural network (DNN)-based multi-spectrum pedestrian detection method using color and thermal images. Based on the single-shot multibox detector (SSD), we propose fusion network structures that employ color and thermal images simultaneously. In experiments on the KAIST dataset, the proposed SSD-H (SSD-halfway fusion) technique shows an 18.18% lower miss rate than the KAIST pedestrian detection baseline, and at least a 2.1% lower miss rate than the conventional halfway fusion method.
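Halfway fusion can be illustrated schematically: each modality passes through its own early layers, the mid-level feature maps are concatenated channel-wise, and a shared head consumes the fused map. The functions below are stubs standing in for SSD layers; the weights and the sum-based "head" are purely illustrative.

```python
# Schematic halfway fusion: per-modality feature extraction, channel-wise
# concatenation at a middle layer, then a shared head. Stubs only.

def features(image, weight):
    """Stub per-modality feature extractor (stands in for SSD's early
    convolutional layers); here just a scalar weighting."""
    return [weight * px for px in image]

def halfway_fusion(color, thermal):
    """Fuse mid-level color and thermal feature maps by concatenation,
    then let a shared head (stubbed as a mean score) consume the result."""
    f_color = features(color, 0.5)
    f_thermal = features(thermal, 2.0)
    fused = f_color + f_thermal          # channel-wise concatenation
    score = sum(fused) / len(fused)      # stub detection head
    return fused, score
```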

Fast Human Detection Method in Range Data using Adaptive UV-histogram and Template Matching (적응적 UV-histogram과 템플릿 매칭을 이용한 거리 영상에서의 고속 인간 검출 방법)

  • Yoon, Bumsik;Kim, Whoi-Yul
    • Journal of the Institute of Electronics and Information Engineers
    • /
    • v.51 no.9
    • /
    • pp.119-128
    • /
    • 2014
  • In this paper, a fast human detection method using an adaptive UV-histogram and template matching is proposed. The proposed method improves the detection rate in scenes with complex environments. The method first generates a U-histogram to extract human candidates and then adaptively generates a V-histogram for each labeled segment of the U-histogram, so it can correctly extract humans that the previous method missed. To improve detection accuracy, the candidates are matched against an omega-shaped template whose size is adapted to the focal length and distance. The method also rejects false positives by rematching the template against accumulated foreground images, making it robust to occlusion. Experimental results show that the proposed method outperforms Bae's method in complex environments, with about a 15% improvement in precision and 80% in recall, and runs 20 times faster than Xia's method.
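As a rough illustration of the U-histogram idea (the adaptive V-histogram stage and template matching are simplified away): each column of the disparity image votes into per-disparity bins, and a column where one disparity level piles up suggests a vertical object such as a standing person. The threshold below is a hypothetical parameter.

```python
# Sketch of a U-histogram over a small integer disparity map.
# Column candidates are picked where one disparity level dominates.

def u_histogram(disparity, levels):
    """U-histogram: for each image column, count how many pixels fall
    at each disparity level. Rows of the result index disparity."""
    cols = len(disparity[0])
    hist = [[0] * cols for _ in range(levels)]
    for row in disparity:
        for c, d in enumerate(row):
            hist[d][c] += 1
    return hist

def candidate_columns(hist, min_count):
    """Columns whose strongest disparity bin reaches min_count: likely
    vertical structures, i.e. human candidates in range data."""
    cols = len(hist[0])
    return [c for c in range(cols)
            if max(hist[d][c] for d in range(len(hist))) >= min_count]
```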

3D Reconstruction using a Moving Planar Mirror (움직이는 평면거울을 이용한 3차원 물체 복원)

  • 장경호;이동훈;정순기
    • Journal of KIISE:Software and Applications
    • /
    • v.31 no.11
    • /
    • pp.1543-1550
    • /
    • 2004
  • Modeling from images is a cost-effective means of obtaining 3D geometric models, which can be constructed with classical structure-from-motion (SfM) algorithms. However, it is difficult to reconstruct whole scenes with SfM, since real sites contain very complex shapes and colors. To overcome this difficulty, this paper proposes a new reconstruction method based on a moving planar mirror. We use the mirror's posture, rather than the scene itself, as the cue for reconstructing geometry; in effect, geometric cues are forcibly inserted into the scene. With this method, we can obtain geometric detail regardless of scene complexity. We first capture image sequences of the scene of interest through the moving mirror, then calibrate the camera from the mirror's posture. Since the calibration is still inaccurate due to detection error, the camera pose is refined using frame-to-frame correspondences of corner points, which are easily obtained from the initial camera posture. Finally, 3D information is computed from the set of calibrated image sequences. We validate our approach with experiments on several complex objects.
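The geometric core of mirror-based capture is that the mirror plane turns the real camera into a virtual one by reflection: every observed point is the mirror image of a real point. A minimal sketch of reflecting a 3D point across a mirror plane n·x = d (with n a unit normal) is:

```python
# Reflect a 3D point across the plane n . x = d, where n is a unit
# normal. This is the basic relation between a real point and its
# mirror image used in mirror-based reconstruction.

def reflect_point(p, n, d):
    """Signed distance of p from the plane is (p . n) - d; moving the
    point twice that distance along -n gives its reflection."""
    t = sum(pi * ni for pi, ni in zip(p, n)) - d
    return [pi - 2 * t * ni for pi, ni in zip(p, n)]
```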

Online Content Editing System to Edit Broadcasting and Personal Contents (방송 및 개인 콘텐츠 편집을 위한 온라인 콘텐츠 편집 시스템)

  • Yang, Chang Mo;Chung, Kwangsue
    • Journal of Broadcast Engineering
    • /
    • v.20 no.4
    • /
    • pp.619-631
    • /
    • 2015
  • In this paper, we propose a new online content editing system for broadcasting and personal contents. The proposed system consists of a content management server, which stores the contents, and a content editor, which edits them via a user interface. Unlike existing methods that edit downloaded contents, the proposed system edits contents stored on the content management server while the content editor plays them using streaming technology. However, editing whole contents while streaming them is not efficient. To resolve this problem, the proposed system performs scene detection on the contents, and the detected scene information is used for playing and editing. After editing is complete, the edited information is uploaded to the content management server in a metadata format. In the proposed system, both scene and edit information are represented only as metadata; no physical segmentation of the content according to the scene and edit information is performed. Implementation results show that the proposed system provides performance similar to existing methods based on downloading and editing contents.
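Scene detection of the kind this system relies on is commonly done by comparing color histograms of consecutive frames and declaring a cut where the difference spikes. The sketch below is a generic illustration, not the paper's detector; frames are flattened pixel lists and the threshold is a hypothetical parameter.

```python
# Generic histogram-difference scene-cut detector on grayscale frames
# given as flat lists of values in [0, 1).

def histogram(frame, bins=4):
    """Coarse intensity histogram of one frame."""
    h = [0] * bins
    for v in frame:
        h[min(int(v * bins), bins - 1)] += 1
    return h

def detect_scene_cuts(frames, threshold):
    """Return frame indices where successive histograms differ by more
    than threshold (L1 distance): candidate scene boundaries."""
    cuts = []
    prev = histogram(frames[0])
    for i in range(1, len(frames)):
        cur = histogram(frames[i])
        if sum(abs(a - b) for a, b in zip(prev, cur)) > threshold:
            cuts.append(i)
        prev = cur
    return cuts
```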

Abstraction Mechanism of Low-Level Video Features for Automatic Retrieval of Explosion Scenes (폭발장면 자동 검출을 위한 저급 수준 비디오 특징의 추상화)

  • Lee, Sang-Hyeok;Nang, Jong-Ho
    • Journal of KIISE:Software and Applications
    • /
    • v.28 no.5
    • /
    • pp.389-401
    • /
    • 2001
  • This paper proposes an abstraction mechanism for low-level digital video features for the automatic retrieval of explosion scenes from a digital video library. In the proposed mechanism, the regional dominant colors of the key frame and the motion energy of the shot are defined as the primary abstractions of a shot for explosion scene retrieval, because an explosion shot usually consists of frames with yellow-tone pixels and rapidly moving objects. The regional dominant colors of a shot are selected by dividing its key frame image into several regions and extracting each region's dominant color, and the motion energy of the shot is defined as the edge-image difference between the key frame and a neighboring frame. The edge image of the key frame makes retrieval more precise, because the flames usually veil all other objects in the shot, so the key frame's edge image becomes very simple in an explosion shot. The proposed automatic retrieval algorithm declares an explosion scene if the scene has a shot with a yellow regional dominant color and a motion energy several times higher than the average motion energy of the shots in that scene. The edge image of the key frame is also used to filter out false detections. Extensive experimental results show that the recall and precision of the proposed abstraction and detection algorithm are about 0.8, and that they are not sensitive to the thresholds. This abstraction mechanism could be used to summarize long action videos and to extract high-level semantic information from a digital video archive.
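The detection rule described above, a yellow regional dominant color combined with motion energy several times the scene average, can be sketched as follows. The region representation (lists of color labels) and the factor k are simplified assumptions, and the edge-image false-positive filter is omitted.

```python
# Simplified version of the explosion-shot rule: some key-frame region
# has "yellow" as its dominant color AND the shot's motion energy is
# k times the scene's average motion energy.

def dominant_color(region):
    """Most frequent color label in one key-frame region."""
    counts = {}
    for px in region:
        counts[px] = counts.get(px, 0) + 1
    return max(counts, key=counts.get)

def is_explosion_shot(regions, motion_energy, scene_avg_energy, k=3.0):
    """Apply both conditions of the (simplified) detection rule."""
    has_yellow = any(dominant_color(r) == "yellow" for r in regions)
    return has_yellow and motion_energy > k * scene_avg_energy
```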


Reproducing Summarized Video Contents based on Camera Framing and Focus

  • Hyung Lee;E-Jung Choi
    • Journal of the Korea Society of Computer and Information
    • /
    • v.28 no.10
    • /
    • pp.85-92
    • /
    • 2023
  • In this paper, we propose a method for automatically generating story-based abbreviated summaries of long-form dramas and movies. The basic premise is that, from the shooting stage, frames are composed with an illusion of depth following the golden ratio, and the object of interest is kept in focus to direct the viewer's attention. To extract the appropriate frames, we use elemental techniques from previous work on scene and shot detection and on identifying focus-related blur. After converting videos shared on YouTube into individual frames, we divided each frame into the entire frame and three partial regions for feature extraction, applied the Laplacian operator and the FFT to each region, and chose the FFT for its relative consistency and robustness. By comparing the values computed for the entire frame with those for the three regions, target frames were selected on the condition that relatively sharp regions could be identified. Based on the selected results, the final frames were extracted by incorporating an offline change-point detection method to ensure frame continuity within a shot, and an edit decision list was constructed to produce an abbreviated summary of 62.77% of the footage with an F1-score of 75.9%.
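The FFT-based sharpness idea can be illustrated on a 1D signal: an in-focus (fast-varying) region carries more high-frequency spectral energy than a blurred (flat) one. This toy version uses a direct DFT on a 1D list rather than the paper's 2D FFT on frame regions, so it is only a sketch of the principle.

```python
import cmath

# Toy frequency-domain sharpness measure: the share of spectral energy
# in the middle (high-frequency) bins of a direct DFT.

def dft(signal):
    """Naive O(n^2) discrete Fourier transform of a real 1D signal."""
    n = len(signal)
    return [sum(signal[t] * cmath.exp(-2j * cmath.pi * f * t / n)
                for t in range(n))
            for f in range(n)]

def sharpness(signal):
    """High-frequency energy share in [0, 1]; sharper signals score
    higher, flat (blurred) signals score near zero."""
    spec = [abs(c) for c in dft(signal)]
    total = sum(s * s for s in spec) or 1.0
    hi = sum(s * s for s in spec[len(spec) // 4: 3 * len(spec) // 4])
    return hi / total
```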

Detection of Pavement Region with Structural Patterns through Adaptive Multi-Seed Region Growing (적응적 다중 시드 영역 확장법을 이용한 구조적 패턴의 보도 영역 검출)

  • Weon, Sun-Hee;Joo, Sung-Il;Na, Hyeon-Suk;Choi, Hyung-Il
    • The KIPS Transactions:PartB
    • /
    • v.19B no.4
    • /
    • pp.209-220
    • /
    • 2012
  • In this paper, we propose an adaptive pavement region detection method that is robust to changes in the structural patterns of a natural scene. To segment out the pavement reliably, we take a two-step approach. We first detect the borderline of the pavement and separate out the candidate pavement region using VRays, straight lines starting from a vanishing point that split out the candidate region containing the pavement in a radial shape. Once the candidate region is found, we apply the adaptive multi-seed region growing (A-MSRG) method within it. A-MSRG segments out the pavement region very accurately by growing seed regions, where the number of seed regions is determined adaptively for the encountered situation. We demonstrate the effectiveness of our approach by comparing its false detection rate against those of seed region growing (SRG) and multi-seed region growing (MSRG).
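The seed region growing that A-MSRG builds on can be sketched in its basic single-seed form; the adaptive multi-seed selection and the VRay candidate step are omitted, and the tolerance is a hypothetical parameter.

```python
# Basic single-seed region growing on a 2D intensity grid: absorb
# 4-connected neighbors whose intensity is within tol of the seed.

def region_grow(image, seed, tol):
    """Return the set of (row, col) pixels grown from seed."""
    h, w = len(image), len(image[0])
    sy, sx = seed
    base = image[sy][sx]
    region, stack = set(), [seed]
    while stack:
        y, x = stack.pop()
        if (y, x) in region or not (0 <= y < h and 0 <= x < w):
            continue
        if abs(image[y][x] - base) > tol:
            continue  # intensity too different from the seed: stop here
        region.add((y, x))
        stack.extend([(y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)])
    return region
```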

Detection of Moving Objects in Crowded Scenes using Trajectory Clustering via Conditional Random Fields Framework (Conditional Random Fields 구조에서 궤적군집화를 이용한 혼잡 영상의 이동 객체 검출)

  • Kim, Hyeong-Ki;Lee, Gwang-Gook;Kim, Whoi-Yul
    • Journal of Korea Multimedia Society
    • /
    • v.13 no.8
    • /
    • pp.1128-1141
    • /
    • 2010
  • This paper proposes a method for detecting moving objects in crowded scenes using clustered trajectories. Unlike previous appearance-based approaches, the proposed method employs motion information only to isolate moving objects. Feature points are first extracted from input frames, and feature tracking then creates feature trajectories. Based on the assumption that feature points originating from the same object show similar motion as the object moves, the proposed method detects moving objects by clustering trajectories with similar motions. For this purpose, an energy function based on spatial proximity, motion coherence, and temporal continuity measures the similarity between two trajectories, and clustering is achieved by minimizing the energy function in a conditional random fields (CRF) framework. Unlike previous methods, which cannot separate falsely merged trajectories during clustering, the proposed method can rearrange falsely merged trajectories during iteration because the clustering is solved by energy minimization in CRFs. Experiments with three different crowded scenes show a detection rate of about 94% with a 7% false alarm rate.
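A pairwise energy of the kind described, combining spatial proximity and motion coherence between two trajectories, might look like the sketch below; the temporal-continuity term, the CRF formulation, and the weights are omitted or hypothetical, so this only illustrates the similarity measure, not the paper's minimization.

```python
# Toy pairwise trajectory energy: lower values mean the two trajectories
# are more likely to belong to the same object. Each trajectory is a
# list of (x, y) positions, one per frame, over the same frames.

def trajectory_energy(t1, t2, w_spatial=1.0, w_motion=1.0):
    """Spatial-proximity term (mean L1 distance between positions) plus
    motion-coherence term (mean L1 distance between per-frame velocities)."""
    spatial = sum(abs(a[0] - b[0]) + abs(a[1] - b[1])
                  for a, b in zip(t1, t2)) / len(t1)
    v1 = [(t1[i + 1][0] - t1[i][0], t1[i + 1][1] - t1[i][1])
          for i in range(len(t1) - 1)]
    v2 = [(t2[i + 1][0] - t2[i][0], t2[i + 1][1] - t2[i][1])
          for i in range(len(t2) - 1)]
    motion = sum(abs(a[0] - b[0]) + abs(a[1] - b[1])
                 for a, b in zip(v1, v2)) / len(v1)
    return w_spatial * spatial + w_motion * motion
```

Trajectories moving in parallel (same object) get a low energy; nearby trajectories moving in opposite directions get a high one, which is what lets the clustering separate them.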