Search | Korea Science

Multi-stage Transformer for Video Anomaly Detection

Viet-Tuan Le;Khuong G. T. Diep;Tae-Seok Kim;Yong-Guk Kim
- Proceedings of the Korea Information Processing Society Conference / 2023.11a / pp.648-651 / 2023
Video anomaly detection aims to detect abnormal events. Motivated by the power of transformers recently shown in vision tasks, we propose a novel transformer-based network for video anomaly detection. To capture long-range information in video, we employ a multi-scale transformer as an encoder. A convolutional decoder then predicts the future frame from the extracted multi-scale feature maps. The proposed method is evaluated on three benchmark datasets: UCSD Ped2, CUHK Avenue, and ShanghaiTech. The results show that the proposed method outperforms recent methods.
https://doi.org/10.3745/PKIPS.y2023m11a.648
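
Future-frame prediction methods usually flag a frame as anomalous when the predicted frame differs strongly from the frame that actually arrives. The abstract above does not say how the score is computed, so the following is only a minimal sketch under the assumption of a PSNR-based score that is min-max normalised per video; the function names and the toy frames are hypothetical.

```python
import math

def psnr(predicted, actual, max_value=1.0):
    """Peak signal-to-noise ratio between two equally sized frames (flat lists of pixels)."""
    mse = sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)
    mse = max(mse, 1e-10)  # avoid division by zero for a perfect prediction
    return 10.0 * math.log10(max_value ** 2 / mse)

def anomaly_scores(psnr_values):
    """Min-max normalise PSNR over one video and invert it: low PSNR gives a high score."""
    lo, hi = min(psnr_values), max(psnr_values)
    if hi == lo:
        return [0.0 for _ in psnr_values]
    return [1.0 - (v - lo) / (hi - lo) for v in psnr_values]

# Toy data (assumed): three 4-pixel frames; the third prediction is far off.
actual = [[0.5, 0.5, 0.5, 0.5]] * 3
predicted = [[0.5, 0.5, 0.5, 0.6], [0.5, 0.4, 0.5, 0.5], [0.9, 0.1, 0.9, 0.1]]
scores = anomaly_scores([psnr(p, a) for p, a in zip(predicted, actual)])
```

In this toy run the third frame receives the highest score because its prediction error is the largest.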

A Single Phase Multi-level Active Power Filter System using Instantaneous Reactive Power Harmonic Detection Method (순시 무효 전력 고조파 검출방법을 이용한 단상 멀티레벨 능동전력 필터)

Kim Soo-Hong;Kim Sung-Min;Lee Kang-Hee;Kim Yoon-Ho
- The Transactions of the Korean Institute of Power Electronics / v.10 no.3 / pp.296-301 / 2005
This paper proposes the use of the instantaneous reactive power method as a harmonic detection method for a single-phase active filter system. The method detects harmonic components through a d-q frame approach. The conventional d-q frame approach for a three-phase system is extended to the single-phase system. The proposed system uses a multi-level inverter for harmonic compensation, and the inverter is connected to the input side without transformers. The proposed algorithm is verified by simulation and experiment.
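
One common way to apply a d-q frame to a single-phase signal is to build a fictitious second phase by delaying the measured current a quarter period, rotate both into the synchronous frame, and keep only the DC part, which corresponds to the fundamental. The abstract does not give the authors' exact formulation, so the sketch below is an assumption-based illustration for one steady-state period; `detect_harmonics` and the test load are hypothetical.

```python
import math

def detect_harmonics(current, samples_per_period):
    """Split one steady-state period of a single-phase current into fundamental and harmonics.

    Assumes samples_per_period is divisible by 4 and the signal is periodic.
    """
    n = samples_per_period
    quarter = n // 4
    d_sum = q_sum = 0.0
    for k in range(n):
        theta = 2.0 * math.pi * k / n
        i_alpha = current[k]
        # Fictitious orthogonal phase: the same signal delayed by a quarter period.
        i_beta = current[(k - quarter) % n]
        # Park transform into the rotating d-q frame.
        d_sum += i_alpha * math.cos(theta) + i_beta * math.sin(theta)
        q_sum += -i_alpha * math.sin(theta) + i_beta * math.cos(theta)
    # Averaging over one period acts as the low-pass filter: only the DC part survives.
    d_dc, q_dc = d_sum / n, q_sum / n
    # Inverse Park transform of the DC part gives the fundamental component.
    fundamental = [
        d_dc * math.cos(2.0 * math.pi * k / n) - q_dc * math.sin(2.0 * math.pi * k / n)
        for k in range(n)
    ]
    harmonics = [i - f for i, f in zip(current, fundamental)]
    return fundamental, harmonics

# Assumed test load: 10 A fundamental plus a 2 A third harmonic.
n = 64
load = [10.0 * math.sin(2 * math.pi * k / n) + 2.0 * math.sin(3 * 2 * math.pi * k / n)
        for k in range(n)]
fundamental, harmonics = detect_harmonics(load, n)
```

The `harmonics` list is the reference an active filter would inject with opposite sign; here it recovers the third-harmonic term of the test load.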

A Robust Method for Text Detection in Video (비디오에서 문자 검출을 위한 강인한 방법)

Dinh, Viet-Cuong;Jeon, Seung-Su;Ryu, Han-Jin;Seol, Sang-Hun
- Proceedings of the Korean Information Science Society Conference / 2007.06c / pp.403-406 / 2007
This paper proposes an effective method for text detection in video. First, we apply an edge detection method to the video frame with a relatively low threshold to keep all possible text edge pixels. Second, a multi-frame integration method is applied to remove background pixels that are not stationary over a specific period. Finally, text regions are extracted using a coarse-to-fine projection method. Experimental results demonstrate the effectiveness of the proposed method.
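
The last two steps above can be sketched roughly as follows. This is not the authors' implementation: it assumes binary edge maps are already available, interprets multi-frame integration as a pixel-wise AND over a window of frames, and interprets coarse-to-fine projection as a row projection followed by a column projection inside each row band. All names and thresholds are hypothetical.

```python
def integrate_frames(edge_maps):
    """Keep only pixels that are edge pixels in every frame of the window."""
    rows, cols = len(edge_maps[0]), len(edge_maps[0][0])
    return [
        [int(all(frame[r][c] for frame in edge_maps)) for c in range(cols)]
        for r in range(rows)
    ]

def projection_runs(profile, min_count):
    """Return (start, end) pairs, end exclusive, where the profile reaches min_count."""
    runs, start = [], None
    for i, value in enumerate(profile):
        if value >= min_count and start is None:
            start = i
        elif value < min_count and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(profile)))
    return runs

def text_boxes(edge_maps, min_count=2):
    """Coarse pass over rows, then a fine pass over columns inside each row band."""
    stable = integrate_frames(edge_maps)
    row_profile = [sum(row) for row in stable]
    boxes = []
    for top, bottom in projection_runs(row_profile, min_count):
        col_profile = [sum(stable[r][c] for r in range(top, bottom))
                       for c in range(len(stable[0]))]
        for left, right in projection_runs(col_profile, 1):
            boxes.append((top, bottom, left, right))
    return boxes

# Toy window (assumed): static "text" at rows 1-2, columns 1-3, plus one moving background edge.
frame_a = [[0, 0, 0, 0, 0, 1], [0, 1, 1, 1, 0, 0], [0, 1, 1, 1, 0, 0], [0, 0, 0, 0, 0, 0]]
frame_b = [[0, 0, 0, 0, 0, 0], [0, 1, 1, 1, 0, 1], [0, 1, 1, 1, 0, 0], [0, 0, 0, 0, 0, 0]]
frame_c = [[0, 0, 0, 0, 0, 0], [0, 1, 1, 1, 0, 0], [0, 1, 1, 1, 0, 1], [0, 0, 0, 0, 0, 0]]
boxes = text_boxes([frame_a, frame_b, frame_c])
```

The moving background edge appears in a different position in each frame, so the AND step removes it and only the stationary block is boxed.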

Voiced-Unvoiced-Silence Detection Algorithm using Perceptron Neural Network (퍼셉트론 신경회로망을 사용한 유성음, 무성음, 묵음 구간의 검출 알고리즘)

Choi, Jae-Seung
- The Journal of the Korea Institute of Electronic Communication Sciences / v.6 no.2 / pp.237-242 / 2011
This paper proposes an algorithm that classifies each frame as voiced, unvoiced, or silence using a multi-layer perceptron neural network. First, the power spectrum and FFT (fast Fourier transform) coefficients of each frame are used as the input to the neural network, and the network is trained on these features. In the experiments, the performance of the proposed algorithm was evaluated by its detection rates on various speech signals degraded by white noise. The detection rates were 92% or more for such noisy speech when the training data and evaluation data were different.
https://doi.org/10.13067/JKIECS.2011.6.2.237
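
As a rough illustration of the per-frame input features mentioned in the abstract, the sketch below computes a power spectrum and coefficient magnitudes for one frame. It uses a naive DFT instead of an FFT to stay dependency-free, and the exact feature layout (power values followed by magnitudes) is an assumption, not the paper's specification.

```python
import cmath
import math

def dft(frame):
    """Naive DFT returning the non-negative frequency bins 0..N/2."""
    n = len(frame)
    return [
        sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n // 2 + 1)
    ]

def frame_features(frame):
    """Concatenate the power spectrum and the coefficient magnitudes of one frame."""
    coeffs = dft(frame)
    power = [abs(c) ** 2 / len(frame) for c in coeffs]
    magnitudes = [abs(c) for c in coeffs]
    return power + magnitudes

# Assumed test frame: 16 samples of a sinusoid that completes two cycles per frame.
frame = [math.sin(2 * math.pi * 2 * t / 16) for t in range(16)]
features = frame_features(frame)
power = features[:9]
peak_bin = max(range(len(power)), key=power.__getitem__)
```

A vector like `features` would be fed to the classifier once per frame; the network itself is omitted here because the abstract gives no details of its layers.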

Face Detection Using Multi-level Features for Privacy Protection in Large-scale Surveillance Video (대규모 비디오 감시 환경에서 프라이버시 보호를 위한 다중 레벨 특징 기반 얼굴검출 방법에 관한 연구)

Lee, Seung Ho;Moon, Jung Ik;Kim, Hyung-Il;Ro, Yong Man
- Journal of Korea Multimedia Society / v.18 no.11 / pp.1268-1280 / 2015
In video surveillance systems, the exposure of a person's face is a serious threat to personal privacy. To protect personal privacy in large amounts of video, an automatic face detection method is required to locate and mask the person's face. However, in real-world surveillance videos, the effectiveness of existing face detection methods can deteriorate due to large variations in facial appearance (e.g., facial pose, illumination, etc.) or degraded faces (e.g., occluded or low-resolution faces). This paper proposes a new face detection method based on multi-level facial features. In a video frame, different kinds of spatial features are independently extracted and analyzed so that they complement each other under the aforementioned challenges. Temporal domain analysis is also exploited to consolidate the proposed method. Experimental results show that, compared to competing methods, the proposed method achieves very high recall rates while maintaining acceptable precision rates.
https://doi.org/10.9717/kmms.2015.18.11.1268

Black Ice Detection Platform and Its Evaluation using Jetson Nano Devices based on Convolutional Neural Network (CNN)

Sun-Kyoung KANG;Yeonwoo LEE
- Korean Journal of Artificial Intelligence / v.11 no.4 / pp.1-8 / 2023
In this paper, we propose a black ice detection platform framework using convolutional neural networks (CNNs). To address the black ice problem, we introduce a real-time early warning platform based on a CNN architecture and, to improve detection accuracy, apply a multi-scale dilation convolution feature fusion (MsDC-FF) technique. We then establish a specialized experimental platform using a comprehensive dataset of thermal road black ice images for training and evaluation. Experimental results on the real-time black ice detection platform show that our proposed network model outperforms conventional image segmentation models. The proposed platform achieves real-time segmentation of road black ice areas by deploying a road black ice area segmentation network on Jetson Nano edge devices. Applying multi-scale dilated convolutions with different dilation rates in parallel yields faster segmentation because the model has fewer parameters. The proposed MsDC-FF Net(2) model had the fastest segmentation speed, at 5.53 frames per second (FPS). The platform thereby encourages safe driving for motorists and provides decision support for road surface management in the road traffic monitoring department.
https://doi.org/10.24225/kjai.2023.11.4.1
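
A dilated convolution widens the receptive field without adding weights, which is why running several dilation rates in parallel can stay lightweight. The sketch below shows the idea in one dimension only; it is not the MsDC-FF network, and the element-wise sum used to fuse the branches is an assumed stand-in for whatever fusion the paper uses.

```python
def dilated_conv1d(signal, kernel, dilation):
    """'Valid' 1-D dilated convolution in cross-correlation form."""
    span = (len(kernel) - 1) * dilation
    return [
        sum(kernel[j] * signal[i + j * dilation] for j in range(len(kernel)))
        for i in range(len(signal) - span)
    ]

def receptive_field(kernel_size, dilation):
    """Number of input samples covered by one output sample."""
    return (kernel_size - 1) * dilation + 1

def multi_scale_features(signal, kernel, dilations):
    """Apply the same kernel at several dilation rates and crop to a common length."""
    outputs = [dilated_conv1d(signal, kernel, d) for d in dilations]
    length = min(len(o) for o in outputs)
    return [o[:length] for o in outputs]

# Assumed toy input: a ramp signal and a 3-tap difference kernel.
signal = list(range(10))
kernel = [1, 0, -1]
branches = multi_scale_features(signal, kernel, [1, 2, 4])
fused = [sum(values) for values in zip(*branches)]  # hypothetical fusion by summation
fields = [receptive_field(len(kernel), d) for d in (1, 2, 4)]
```

Each branch uses the same three weights, yet the receptive field grows from 3 to 5 to 9 samples as the dilation rate increases. Cropping aligns the branches at their left edge, which is a simplification; a real network would normally pad so that outputs stay centred.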

Visual inspection algorithm of cold rolled strips by wavelet frame transform (Wavelet frame 변환을 이용한 냉연 시각검사 알고리듬)

Lee, Chang-Su;Choi, Jong-Ho
- Journal of Institute of Control, Robotics and Systems / v.4 no.3 / pp.372-377 / 1998
This paper deals with the detection, feature extraction, and classification of surface defects in cold rolled strips. Inspection systems are one of the most important fields in factory automation. Defects such as slipmark and dullmark can be effectively detected with a Gaussian matched filter because their shapes are similar to a Gaussian. We show that the proposed WF (Wavelet Frame) method can be regarded as a multiscale Gaussian matched filter that can be applied to the inspection of cold rolled strips. After a wavelet frame transform, entropies and moments are computed for each subband, which passes through both a local low-pass filter and a nonlinear operator. With these features as input, an MLP (Multi-Layer Perceptron) is used as a classifier. The proposed inspection method was applied to real images with defects and showed good performance. The role of each extracted feature is analyzed by the KLT (Karhunen-Loeve Transform).

Quasi-Orthogonal STBC with Iterative Decoding in Bit Interleaved Coded Modulation

Sung, Chang-Kyung;Kim, Ji-Hoon;Lee, In-Kyu
- The Journal of Korean Institute of Communications and Information Sciences / v.33 no.4A / pp.426-433 / 2008
In this paper, we present a method to improve the performance of the four transmit antenna quasi-orthogonal space-time block code (STBC) in coded systems. For the four transmit antenna case, the quasi-orthogonal STBC consists of two symbol groups that are orthogonal to each other, but the symbols within each group are not. In an uncoded system with matched filter detection, constellation rotation can improve the performance. However, in coded systems, its gain is absorbed by the coding gain, especially for lower-rate codes. We propose an iterative decoding method to improve the performance of quasi-orthogonal codes in coded systems. With conventional quasi-orthogonal STBC detection, the joint ML detection can be improved by iterative processing between the demapper and the decoder. Simulation results show that the performance improvement is about 2 dB at a 1% frame error rate.

A New Anchor Shot Detection System for News Video Indexing

Lee, Han-Sung;Im, Young-Hee;Park, Joo-Young;Park, Dai-Hee
- Journal of the Korean Institute of Intelligent Systems / v.18 no.1 / pp.133-138 / 2008
In this paper, we propose a novel anchor shot detection system, named MASD (Multi-phase Anchor Shot Detection), which is a core preprocessing step for news video analysis. The proposed system is composed of four modules that operate sequentially: 1) a skin color detection module for reducing the candidate face regions; 2) a face detection module for finding the key-frames that contain facial data; 3) a vector representation module for the key-frame images using non-negative matrix factorization; 4) a one-class SVM module for determining the anchor shots using support vector data description. In addition to the qualitative analysis, our experiments show that the proposed system achieves accuracy comparable to recently developed methods while detecting anchor shots faster.
https://doi.org/10.5391/JKIIS.2008.18.1.133

Speech detection from broadcast contents using multi-scale time-dilated convolutional neural networks (다중 스케일 시간 확장 합성곱 신경망을 이용한 방송 콘텐츠에서의 음성 검출)

Jang, Byeong-Yong;Kwon, Oh-Wook
- Phonetics and Speech Sciences / v.11 no.4 / pp.89-96 / 2019
In this paper, we propose a deep learning architecture that can effectively detect speech segments in broadcast content. We also propose a multi-scale time-dilated layer for learning the temporal changes of feature vectors. We implement several comparison models to verify the performance of the proposed model and calculate the frame-by-frame F-score, precision, and recall. Both the proposed model and the comparison models are trained with the same training data: 32 hours of Korean broadcast data composed of various genres (drama, news, documentary, and so on). Our proposed model shows the best performance on the Korean broadcast data, with an F-score of 91.7%. It also shows the highest performance on British and Spanish broadcast data, with F-scores of 87.9% and 92.6%, respectively. As a result, our proposed model can improve speech detection performance by learning the temporal changes of the feature vectors.
https://doi.org/10.13064/KSSS.2019.11.4.089
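
The frame-by-frame metrics named in the abstract follow the standard definitions. A minimal sketch, assuming one binary label per frame (1 for speech, 0 for non-speech); the function name and the example labels are hypothetical.

```python
def frame_scores(reference, predicted):
    """Frame-level precision, recall, and F-score for binary speech/non-speech labels."""
    tp = sum(1 for r, p in zip(reference, predicted) if r == 1 and p == 1)
    fp = sum(1 for r, p in zip(reference, predicted) if r == 0 and p == 1)
    fn = sum(1 for r, p in zip(reference, predicted) if r == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)

# Assumed example: eight frames with one missed speech frame and one false alarm.
reference = [1, 1, 1, 0, 0, 1, 0, 0]
predicted = [1, 1, 0, 0, 1, 1, 0, 0]
precision, recall, f_score = frame_scores(reference, predicted)
```

With three correct speech frames, one false alarm, and one miss, precision, recall, and F-score all come out to 0.75.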

