
Deep Window Detection in Street Scenes

  • Ma, Wenguang;Ma, Wei
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.14 no.2
    • /
    • pp.855-870
    • /
    • 2020
  • Windows are key components of building facades. Detecting windows, which is crucial to 3D semantic reconstruction and scene parsing, is a challenging task in computer vision. Early methods tried to solve window detection with hand-crafted features and traditional classifiers. However, these methods cannot handle the diversity of window instances in real scenes and suffer from heavy computational costs. Recently, convolutional neural network-based object detection algorithms have attracted much attention due to their good performance. Unfortunately, directly training them for the challenging task of window detection does not achieve satisfying results. In this paper, we propose an approach for window detection. It involves an improved Faster R-CNN architecture featuring a window region proposal network, an RoI feature fusion module, and a context enhancement module. In addition, a post-optimization process based on the regular distribution of windows is designed to refine the detection results of the improved deep architecture. Furthermore, we present a newly collected dataset, which is the largest for window detection in real street scenes to date. Experimental results on both existing datasets and the new dataset show that the proposed method has outstanding performance.
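
The post-optimization step exploits the fact that facade windows sit on a near-regular grid. A minimal sketch of that idea (not the paper's actual algorithm; the tolerance parameter and greedy grouping are assumptions for illustration) snaps detected boxes whose left or top edges nearly coincide onto shared row/column coordinates:

```python
import numpy as np

def snap_to_grid(boxes, tol=10.0):
    """Refine window detections using the regular facade layout:
    boxes whose left edges lie within `tol` pixels of each other are
    snapped to their shared mean x-coordinate, and likewise for top
    edges. `boxes` holds (x1, y1, x2, y2) rows."""
    boxes = np.asarray(boxes, dtype=float).copy()

    def align(coords):
        # Greedily group nearly-equal coordinates, replace each group
        # by its mean so the boxes line up in rows/columns.
        order = np.argsort(coords)
        aligned = coords.copy()
        group = [order[0]]
        for i in order[1:]:
            if coords[i] - coords[group[-1]] <= tol:
                group.append(i)
            else:
                aligned[group] = coords[group].mean()
                group = [i]
        aligned[group] = coords[group].mean()
        return aligned

    widths = boxes[:, 2] - boxes[:, 0]
    heights = boxes[:, 3] - boxes[:, 1]
    boxes[:, 0] = align(boxes[:, 0])   # align columns (left edges)
    boxes[:, 1] = align(boxes[:, 1])   # align rows (top edges)
    boxes[:, 2] = boxes[:, 0] + widths  # preserve each box's size
    boxes[:, 3] = boxes[:, 1] + heights
    return boxes
```

Two detections at x = 10 and x = 12 would both be moved to x = 11, while an isolated box far away is left untouched.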

Gesture-Based Emotion Recognition by 3D-CNN and LSTM with Keyframes Selection

  • Ly, Son Thai;Lee, Guee-Sang;Kim, Soo-Hyung;Yang, Hyung-Jeong
    • International Journal of Contents
    • /
    • v.15 no.4
    • /
    • pp.59-64
    • /
    • 2019
  • In recent years, emotion recognition has been an interesting and challenging topic. Compared to the facial expression and speech modalities, gesture-based emotion recognition has received little attention, with only a few efforts using traditional hand-crafted methods. These approaches incur major computational costs and offer little room for improvement, as most of the research community now builds on deep learning techniques. In this paper, we propose an end-to-end deep learning approach for classifying emotions from bodily gestures. In particular, informative keyframes are first extracted from raw videos as input to a 3D-CNN deep network. The 3D-CNN exploits the short-term spatiotemporal information of gesture features from the selected keyframes, and convolutional LSTM networks learn long-term features from the outputs of the 3D-CNN. The experimental results on the FABO dataset exceed those of most traditional methods and achieve state-of-the-art performance among deep learning-based techniques for gesture-based emotion recognition.
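
The keyframe-selection front end can be illustrated with a simple motion-energy heuristic. This is only a sketch of the general idea, assuming frames are grayscale arrays; the paper's actual selection criterion may differ:

```python
import numpy as np

def select_keyframes(frames, k):
    """Pick k informative keyframes from a clip by ranking frames on
    inter-frame motion energy (mean absolute difference from the
    previous frame). `frames` has shape (T, H, W)."""
    frames = np.asarray(frames, dtype=float)
    diffs = np.abs(frames[1:] - frames[:-1]).mean(axis=(1, 2))
    motion = np.concatenate([[0.0], diffs])  # first frame has no predecessor
    idx = np.argsort(motion)[-k:]            # k frames with the most motion
    return np.sort(idx)                      # keep temporal order for the 3D-CNN
```

The selected indices are returned in temporal order, since the downstream 3D-CNN consumes an ordered short clip.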

Micro-Expression Recognition Based on Optical Flow Features and Improved MobileNetV2

  • Xu, Wei;Zheng, Hao;Yang, Zhongxue;Yang, Yingjie
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.15 no.6
    • /
    • pp.1981-1995
    • /
    • 2021
  • When a person tries to conceal emotions, the real emotions manifest themselves in the form of micro-expressions. Facial micro-expression recognition remains extremely challenging in the field of pattern recognition, because it is difficult to devise a feature extraction method that copes with the small magnitude and short duration of micro-expressions. Most methods rely on hand-crafted features to extract subtle facial movements. In this study, we introduce a method that combines optical flow and deep learning. First, we extract the onset frame and the apex frame from each video sequence. Then, the motion features between these two frames are extracted using the optical flow method. Finally, the features are fed into an improved MobileNetV2 model, and an SVM is applied to classify the expressions. To evaluate the effectiveness of the method, we conduct experiments on the public spontaneous micro-expression database CASME II. Under leave-one-subject-out cross-validation, the recognition accuracy reaches 53.01% and the F-score reaches 0.5231. The results show that the proposed method can significantly improve micro-expression recognition performance.
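
The onset/apex extraction step can be sketched with a simple pixel-difference criterion: treat the first frame as the onset and pick the frame that deviates most from it as the apex. This is a hypothetical stand-in for the paper's spotting step; in the actual pipeline, optical flow (e.g. Farneback or TV-L1) would then be computed between these two frames:

```python
import numpy as np

def find_apex(frames):
    """Locate the apex frame of a micro-expression clip as the frame
    differing most from the onset (first) frame, measured by mean
    absolute pixel difference. `frames` has shape (T, H, W)."""
    frames = np.asarray(frames, dtype=float)
    onset = frames[0]
    dist = np.abs(frames - onset).mean(axis=(1, 2))
    return int(np.argmax(dist))
```

Restricting motion features to the onset-apex pair keeps the input compact, which matters for micro-expressions that last only a fraction of a second.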

Handwritten Indic Digit Recognition using Deep Hybrid Capsule Network

  • Mohammad Reduanul Haque;Rubaiya Hafiz;Mohammad Zahidul Islam;Mohammad Shorif Uddin
    • International Journal of Computer Science & Network Security
    • /
    • v.24 no.2
    • /
    • pp.89-94
    • /
    • 2024
  • The Indian subcontinent is a birthplace of multilingual people, where documents such as job application forms, passports, and number plates contain text written in different languages/scripts, sometimes with different Indic numerals on a single document page. For this reason, a generic recognizer capable of recognizing handwritten Indic digits written by diverse writers is needed. A lot of work has been done on non-Indic numerals, particularly Roman, but research on Indic digits is limited. Moreover, most studies focus only on MNIST or a single dataset, either because of time constraints or because the model is tailored to a specific task. In this work, a hybrid model is proposed to recognize all available Indic handwritten digit images using the existing benchmark datasets. The proposed method bridges the automatically learned features of a Capsule Network with the hand-crafted Bag of Features (BoF) extraction method. Along the way, we analyze (1) its successes and (2) whether the method performs well under more difficult conditions, i.e. noise, color, affine transformations, intra-class variation, and natural scenes. Experimental results show that the hybrid method gives better accuracy than the Capsule Network alone.
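
The hybrid bridging idea can be sketched as a BoF encoding concatenated with the learned capsule vector. This is a minimal illustration under assumed shapes (a fixed codebook and simple concatenation fusion), not the paper's exact architecture:

```python
import numpy as np

def bof_histogram(descriptors, codebook):
    """Hand-crafted Bag-of-Features encoding: assign each local
    descriptor to its nearest codeword and return the normalized
    histogram of codeword counts."""
    d = np.linalg.norm(descriptors[:, None, :] - codebook[None, :, :], axis=2)
    words = d.argmin(axis=1)                      # nearest codeword per descriptor
    hist = np.bincount(words, minlength=len(codebook)).astype(float)
    return hist / hist.sum()

def fuse(capsule_feats, bof_feats):
    """Hybrid feature: concatenate the learned capsule vector with the
    hand-crafted BoF histogram before the final classifier."""
    return np.concatenate([capsule_feats, bof_feats])
```

A classifier trained on the fused vector sees both learned and hand-crafted evidence, which is the essence of the proposed bridge.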

Performance Improvement of a Korean Prosodic Phrase Boundary Prediction Model using Efficient Feature Selection (효율적인 기계학습 자질 선별을 통한 한국어 운율구 경계 예측 모델의 성능 향상)

  • Kim, Min-Ho;Kwon, Hyuk-Chul
    • Journal of KIISE:Software and Applications
    • /
    • v.37 no.11
    • /
    • pp.837-844
    • /
    • 2010
  • Prediction of prosodic phrase boundaries is one of the most important natural language processing tasks. For the natural prediction of Korean prosodic phrase boundaries, we propose a statistical approach incorporating efficient learning features. These new features reflect the factors that affect the generation of prosodic phrase boundaries better than existing learning features do. Notably, learning features extracted according to a hand-crafted prosodic phrase boundary prediction rule yield higher accuracy. We developed a statistical model for Korean prosodic phrase boundaries based on the proposed features. The results were 86.63% accuracy for three levels (major break, minor break, no break) and 81.14% accuracy for six levels (major break with falling/rising tone, minor break with falling/rising/middle tone, no break).
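
The core idea, feeding the output of a hand-crafted boundary rule into the statistical model as one more learning feature, can be sketched as below. The rule shown (break after a comma or a long token) is a hypothetical placeholder, not the paper's actual Korean prosodic rule, and the other features are likewise illustrative:

```python
def boundary_features(words, i):
    """Feature dict for predicting a prosodic phrase boundary after
    words[i]. Ordinary learning features (token length, relative
    position) are mixed with a feature derived from a hand-crafted
    boundary rule, mirroring the idea of rule-informed features."""
    rule_says_break = words[i].endswith(',') or len(words[i]) >= 5
    return {
        'len': len(words[i]),                       # token length
        'rel_pos': i / max(len(words) - 1, 1),      # position in the sentence
        'rule_break': int(rule_says_break),         # hand-crafted rule output
    }
```

A statistical classifier (e.g. maximum entropy or CRF) trained over such dicts can then weight the rule feature against the purely data-driven ones.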