• Title/Summary/Keyword: recognition task

Few Samples Face Recognition Based on Generative Score Space

  • Wang, Bin;Wang, Cungang;Zhang, Qian;Huang, Jifeng
    • KSII Transactions on Internet and Information Systems (TIIS) / v.10 no.12 / pp.5464-5484 / 2016
  • Few-sample face recognition has become a highly challenging task because few labeled samples are available. Of the two popular paradigms in face image representation, sparse component analysis is highly robust, while the parts-based paradigm is particularly flexible. In this paper, we propose a probabilistic generative model that incorporates the strengths of the two paradigms for face representation. This model finds a common spatial partition for the given images and simultaneously learns a sparse component analysis model for each part of the partition. The two procedures are built into a probabilistic generative model. We then derive the score function (i.e., feature mapping) from the generative score space and define a similarity measure over it for few-sample face recognition. The model is data-driven and particularly good at representing face images. The derived generative score function and similarity measure encode information hidden in the data distribution. To validate the effectiveness of the proposed method, we perform few-sample face recognition on two face datasets. The results show its advantages.

Dual-Encoded Features from Both Spatial and Curvelet Domains for Image Smoke Recognition

  • Yuan, Feiniu;Tang, Tiantian;Xia, Xue;Shi, Jinting;Li, Shuying
    • KSII Transactions on Internet and Information Systems (TIIS) / v.13 no.4 / pp.2078-2093 / 2019
  • Visual smoke recognition is a challenging task due to large variations in the shape, texture, and color of smoke. To improve performance, we propose a novel smoke recognition method that combines dual-encoded features extracted from both the spatial and Curvelet domains. A Curvelet transform is used to filter an image and generate fifty sub-images of Curvelet coefficients. We then extract Local Binary Pattern (LBP) maps from these coefficient maps and aggregate the histograms of these LBP maps to produce a histogram map. Afterwards, we encode the histogram map again to generate Dual-encoded Local Binary Patterns (Dual-LBP). Histograms of Dual-LBPs from the Curvelet domain and Completed Local Binary Patterns (CLBP) from the spatial domain are concatenated to form the feature for smoke recognition. Finally, we adopt the Gaussian Kernel Optimization (GKO) algorithm to search for the optimal kernel parameters of a Support Vector Machine (SVM) to further improve classification accuracy. Experimental results demonstrate that our method can extract effective and reasonable features from smoke images and achieve good classification accuracy.
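
The LBP step named in the abstract above can be illustrated with a minimal sketch. It shows only the basic 3x3 LBP code and its histogram on a plain grey-level grid; it is not the authors' Dual-LBP or CLBP encoding, and the function names `lbp_code` and `lbp_histogram`, the neighbour ordering, and the `>=` threshold convention are assumptions made for illustration.

```python
def lbp_code(img, r, c):
    """Basic 8-neighbour LBP code of the interior pixel (r, c).

    img is a 2D list of grey levels. A neighbour contributes a 1 bit
    when its value is >= the centre value (an assumed convention).
    """
    center = img[r][c]
    # Neighbours visited clockwise, starting at the top-left pixel.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        if img[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(img):
    """256-bin histogram of LBP codes over all interior pixels."""
    hist = [0] * 256
    for r in range(1, len(img) - 1):
        for c in range(1, len(img[0]) - 1):
            hist[lbp_code(img, r, c)] += 1
    return hist
```

In the pipeline the abstract describes, a histogram like this would be computed per Curvelet coefficient map before the second encoding stage; that stage is not reproduced here.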

Credit Card Number Recognition for People with Visual Impairment (시력 취약 계층을 위한 신용 카드 번호 인식 연구)

  • Park, Dahoon;Kwon, Kon-Woo
    • Journal of IKEEE / v.25 no.1 / pp.25-31 / 2021
  • The conventional credit card number recognition system generally requires the card to be placed in a designated location before processing, which is not an ideal user experience, especially for people with visual impairment. To improve the user experience, this paper proposes a novel algorithm that automatically detects the location of a credit card number based on the fact that a group of sixteen digits has a fixed aspect ratio. The proposed algorithm first performs morphological operations to obtain multiple candidates for the credit card number with an aspect ratio greater than 4:1, then recognizes the card number by testing each candidate via OCR and BIN matching techniques. Implemented with OpenCV and Firebase ML, the proposed scheme achieves 77.75% accuracy in the credit card number recognition task.
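
The candidate-selection logic described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the paper's implementation: candidate regions are assumed to arrive as `(x, y, w, h)` boxes, the OCR output is assumed to be a plain string, BIN matching is assumed to mean comparing the first six digits against a known set, and the names `filter_candidates`, `is_plausible_card_number`, and `known_bins` are hypothetical.

```python
def filter_candidates(boxes, min_ratio=4.0):
    """Keep (x, y, w, h) boxes whose width:height ratio exceeds min_ratio."""
    return [b for b in boxes if b[3] > 0 and b[2] / b[3] > min_ratio]


def is_plausible_card_number(text, known_bins, bin_len=6):
    """Check an OCR string: sixteen digits and a known BIN prefix.

    known_bins is a hypothetical set of issuer prefixes; bin_len=6 is
    an assumption about how long those prefixes are.
    """
    digits = "".join(ch for ch in text if ch.isdigit())
    return len(digits) == 16 and digits[:bin_len] in known_bins
```

A caller would run OCR on each box that survives `filter_candidates` and accept the first string for which `is_plausible_card_number` returns `True`.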

Egocentric Vision for Human Activity Recognition Using Deep Learning

  • Malika Douache;Badra Nawal Benmoussat
    • Journal of Information Processing Systems / v.19 no.6 / pp.730-744 / 2023
  • This paper addresses the recognition of human activities using egocentric vision, particularly as captured by body-worn cameras, which could be helpful for video surveillance, automatic search, and video indexing. It could also help in assisting elderly and frail persons and improving their lives. Human activity recognition remains a problematic task because of the large variations involved, and here it is realized through an external device, similar to a robot, acting as a personal assistant. The inferred information is used both online to assist the person and offline to support the personal assistant. The main purpose of this paper is to provide an efficient and simple recognition method, robust to the various factors of variability in action execution, that uses only egocentric camera data with a convolutional neural network and deep learning. In terms of accuracy improvement, simulation results outperform the current state of the art by a significant margin of 61% when using egocentric camera data only, more than 44% when using egocentric camera and several stationary cameras data, and more than 12% when using both inertial measurement unit (IMU) and egocentric camera data.

Multi-modal Emotion Recognition using Semi-supervised Learning and Multiple Neural Networks in the Wild (준 지도학습과 여러 개의 딥 뉴럴 네트워크를 사용한 멀티 모달 기반 감정 인식 알고리즘)

  • Kim, Dae Ha;Song, Byung Cheol
    • Journal of Broadcast Engineering / v.23 no.3 / pp.351-360 / 2018
  • Human emotion recognition is a research topic that receives continuous attention in the computer vision and artificial intelligence domains. This paper proposes a method for classifying human emotions in the wild through multiple neural networks based on multi-modal signals consisting of image, landmark, and audio. The proposed method has the following features. First, the learning performance of the image-based network is greatly improved by employing both multi-task learning and semi-supervised learning using the spatio-temporal characteristics of videos. Second, a model for converting one-dimensional (1D) facial landmark information into two-dimensional (2D) images is newly proposed, and a CNN-LSTM network based on this model is proposed for better emotion recognition. Third, based on the observation that audio signals are often very effective for specific emotions, we propose an audio deep learning mechanism robust to the specific emotions. Finally, so-called emotion adaptive fusion is applied to enable synergy among the multiple networks. The proposed network improves emotion classification performance by appropriately integrating existing supervised learning and semi-supervised learning networks. In the fifth attempt on the given test set in the EmotiW2017 challenge, the proposed method achieved a classification accuracy of 57.12%.

Ginsenoside Rb1 ameliorates cisplatin-induced learning and memory impairments

  • Chen, Chen;Zhang, Haifeng;Xu, Hongliang;Zheng, Yake;Wu, Tianwen;Lian, Yajun
    • Journal of Ginseng Research / v.43 no.4 / pp.499-507 / 2019
  • Background: Ginsenoside Rb1 (Rb1), a dominant component of Panax ginseng root extract, exhibits neuroprotective functions in many neurological diseases. This study was intended to investigate whether Rb1 can attenuate cisplatin-induced memory impairments and to explore the potential mechanisms. Methods: Cisplatin was injected intraperitoneally at a dose of 5 mg/kg/wk, and Rb1 was administered in drinking water at a dose of 2 mg/kg/d to rats for 5 consecutive wk. The novel object recognition task and the Morris water maze were used to assess the memory of the rats. Nissl staining was used to examine neuron numbers in the hippocampus. The activities of superoxide dismutase, glutathione peroxidase, choline acetyltransferase, and acetylcholinesterase, and the levels of malondialdehyde, reactive oxygen species, acetylcholine, tumor necrosis factor-α, interleukin-1β, and interleukin-10 were measured by ELISA to assay oxidative stress, cholinergic function, and neuroinflammation in the hippocampus. Results: Rb1 administration effectively ameliorates the memory impairments caused by cisplatin in both the novel object recognition task and the Morris water maze task. Rb1 also attenuates the neuronal loss induced by cisplatin in the different regions (CA1, CA3, and dentate gyrus) of the hippocampus. Meanwhile, Rb1 is able to rescue cholinergic neuron function and inhibit oxidative stress and neuroinflammation in the cisplatin-induced rat brain. Conclusion: Rb1 rescues cisplatin-induced memory impairment by restoring the neuronal loss through reducing oxidative stress and neuroinflammation and recovering cholinergic neuron functions.

Evaluation of Face Recognition System based on Scenarios (얼굴인식 시스템의 시나리오 기반 평가 방법론)

  • Maeng, Doo-Lyel;Hong, Byung-Woo;Kim, Sung-Jo
    • Journal of Korea Multimedia Society / v.13 no.4 / pp.487-495 / 2010
  • As biometric systems become more widely used, an accurate and reliable method for evaluating their performance is required. Face recognition is one of the most widely used biometric techniques, which calls for a stable evaluation method so that the performance of face recognition systems can be standardized. However, evaluating such systems is considered a difficult task because a large number of factors affect their performance. It may be infeasible to take into account all the environmental factors related to the performance of face recognition systems, which naturally suggests evaluating overall performance based on scenarios. In this paper, we analyze the environmental factors related to the performance of general face recognition systems and propose an evaluation method that takes those factors into account. The proposed scenario-based method considers combinations of individual environmental factors instead of evaluating the performance of face recognition systems with respect to each factor separately. We also present examples of scenario-based evaluation of face recognition systems that take the overall environmental factors into account.

Ontology-based User Intention Recognition for Proactive Planning of Intelligent Robot Behavior (지능형로봇 행동의 능동적 계획수립을 위한 온톨로지 기반 사용자 의도인식)

  • Jeon, Ho-Cheol;Choi, Joong-Min
    • Journal of the Korean Institute of Intelligent Systems / v.21 no.1 / pp.86-99 / 2011
  • Intention recognition from user behavior is uncertain: the same behavior by the same user may be recognized as a different intention depending on the situation. Minimizing this uncertainty can improve the accuracy of user intention recognition. This paper proposes a novel ontology-based method for recognizing user intentions that minimizes the uncertainties that hinder precise recognition. The approach creates an ontology for user intention, builds a hierarchy and relationships among user intentions using RuleML as well as a Dynamic Bayesian Network, and improves the accuracy of user intention recognition by using the defined RuleML together with gathered sensor data such as temperature, humidity, vision, and audio. To evaluate the performance of the robot's proactive planning mechanism, we developed a simulator, carried out experiments to measure the accuracy of user intention recognition for all possible situations, and analyzed and described the results in detail. The experiments showed a relatively high level of accuracy in user intention recognition. On the other hand, they also showed that actions involving uncertainty get in the way of precise user intention recognition.

A Study on Real-Time Walking Action Control of Biped Robot with Twenty Six Joints Based on Voice Command (음성명령기반 26관절 보행로봇 실시간 작업동작제어에 관한 연구)

  • Jo, Sang Young;Kim, Min Sung;Yang, Jun Suk;Koo, Young Mok;Jung, Yang Geun;Han, Sung Hyun
    • Journal of Institute of Control, Robotics and Systems / v.22 no.4 / pp.293-300 / 2016
  • Voice recognition is one of the convenient methods for communication between humans and robots. This study proposes a speech recognition method using speech recognizers based on the Hidden Markov Model (HMM), combined with other techniques, to enhance biped robot control. In the past, Artificial Neural Networks (ANN) and Dynamic Time Warping (DTW) were used; however, they are currently less commonly applied to speech recognition systems. This research confirms that the HMM, an accepted high-performance technique, can be successfully employed to model speech signals, and that high recognition accuracy can be obtained by using HMMs. Apart from speech modeling techniques, multiple feature extraction methods have been studied to find speech stresses caused by emotions and the environment in order to improve speech recognition rates. The procedure consists of two parts: recognizing robot commands using multiple HMM recognizers, and sending the recognized commands to control a robot. In this paper, a practical voice recognition system that can recognize many task commands is proposed. The proposed system consists of a general-purpose microprocessor and a voice recognition processor that can recognize a limited number of voice patterns. Simulation and experiment illustrated the reliability of the voice recognition rates for application to the manufacturing process.
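
Recognition with multiple HMM recognizers, as mentioned above, is commonly done by scoring the observation sequence under one HMM per command and picking the best-scoring command. The sketch below shows that idea with the forward algorithm for a discrete HMM. It is an assumption about how such recognizers could be combined, not the system described in the paper; the function names, the `(start, trans, emit)` model layout, and the use of quantized integer symbols as observations are all hypothetical.

```python
def forward_likelihood(obs, start, trans, emit):
    """P(obs | model) for a discrete HMM via the forward algorithm.

    start[s]     : probability of starting in state s
    trans[p][s]  : probability of moving from state p to state s
    emit[s][o]   : probability of emitting symbol o in state s
    """
    n = len(start)
    alpha = [start[s] * emit[s][obs[0]] for s in range(n)]
    for o in obs[1:]:
        alpha = [
            sum(alpha[p] * trans[p][s] for p in range(n)) * emit[s][o]
            for s in range(n)
        ]
    return sum(alpha)


def recognize(obs, models):
    """Return the command whose HMM gives obs the highest likelihood.

    models maps a command name to a (start, trans, emit) tuple.
    """
    return max(models, key=lambda name: forward_likelihood(obs, *models[name]))
```

For long sequences a practical version would work with log probabilities or per-step scaling to avoid numerical underflow; that is omitted here to keep the sketch short.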

Image Super-Resolution for Improving Object Recognition Accuracy (객체 인식 정확도 개선을 위한 이미지 초해상도 기술)

  • Lee, Sung-Jin;Kim, Tae-Jun;Lee, Chung-Heon;Yoo, Seok Bong
    • Journal of the Korea Institute of Information and Communication Engineering / v.25 no.6 / pp.774-784 / 2021
  • Object detection and recognition is a very important task in the field of computer vision, and related research is actively being conducted. However, in practice, recognition accuracy is often degraded by the resolution mismatch between the training image data and the test image data. To solve this problem, we designed and developed an integrated object recognition and super-resolution framework, proposing an image super-resolution technique to improve object recognition accuracy. Specifically, we built 11,231 license plate training images ourselves through web crawling and artificial data generation, and trained the image super-resolution artificial neural network with an objective function defined to be robust to image flips. To verify the performance of the proposed algorithm, we ran the trained image super-resolution and recognition on 1,999 test images and confirmed that the proposed super-resolution technique improves the accuracy of character recognition.