• Title/Summary/Keyword: state recognition

Search Result 1,016, Processing Time 0.03 seconds

The development of food image detection and recognition model of Korean food for mobile dietary management

  • Park, Seon-Joo;Palvanov, Akmaljon;Lee, Chang-Ho;Jeong, Nanoom;Cho, Young-Im;Lee, Hae-Jeung
    • Nutrition Research and Practice
    • /
    • v.13 no.6
    • /
    • pp.521-528
    • /
    • 2019
  • BACKGROUND/OBJECTIVES: The aim of this study was to develop Korean food image detection and recognition model for use in mobile devices for accurate estimation of dietary intake. MATERIALS/METHODS: We collected food images by taking pictures or by searching web images and built an image dataset for use in training a complex recognition model for Korean food. Augmentation techniques were performed in order to increase the dataset size. The dataset for training contained more than 92,000 images categorized into 23 groups of Korean food. All images were down-sampled to a fixed resolution of $150{\times}150$ and then randomly divided into training and testing groups at a ratio of 3:1, resulting in 69,000 training images and 23,000 test images. We used a Deep Convolutional Neural Network (DCNN) for the complex recognition model and compared the results with those of other networks: AlexNet, GoogLeNet, Very Deep Convolutional Neural Network, VGG and ResNet, for large-scale image recognition. RESULTS: Our complex food recognition model, K-foodNet, had higher test accuracy (91.3%) and faster recognition time (0.4 ms) than those of the other networks. CONCLUSION: The results showed that K-foodNet achieved better performance in detecting and recognizing Korean food compared to other state-of-the-art models.

A Comprehensive Survey of Lightweight Neural Networks for Face Recognition (얼굴 인식을 위한 경량 인공 신경망 연구 조사)

  • Yongli Zhang;Jaekyung Yang
    • Journal of Korean Society of Industrial and Systems Engineering
    • /
    • v.46 no.1
    • /
    • pp.55-67
    • /
    • 2023
  • Lightweight face recognition models, as one of the most popular and long-standing topics in the field of computer vision, has achieved vigorous development and has been widely used in many real-world applications due to fewer number of parameters, lower floating-point operations, and smaller model size. However, few surveys reviewed lightweight models and reimplemented these lightweight models by using the same calculating resource and training dataset. In this survey article, we present a comprehensive review about the recent research advances on the end-to-end efficient lightweight face recognition models and reimplement several of the most popular models. To start with, we introduce the overview of face recognition with lightweight models. Then, based on the construction of models, we categorize the lightweight models into: (1) artificially designing lightweight FR models, (2) pruned models to face recognition, (3) efficient automatic neural network architecture design based on neural architecture searching, (4) Knowledge distillation and (5) low-rank decomposition. As an example, we also introduce the SqueezeFaceNet and EfficientFaceNet by pruning SqueezeNet and EfficientNet. Additionally, we reimplement and present a detailed performance comparison of different lightweight models on the nine different test benchmarks. At last, the challenges and future works are provided. There are three main contributions in our survey: firstly, the categorized lightweight models can be conveniently identified so that we can explore new lightweight models for face recognition; secondly, the comprehensive performance comparisons are carried out so that ones can choose models when a state-of-the-art end-to-end face recognition system is deployed on mobile devices; thirdly, the challenges and future trends are stated to inspire our future works.

Performance Comparison and Error Analysis of Korean Bio-medical Named Entity Recognition (한국어 생의학 개체명 인식 성능 비교와 오류 분석)

  • Jae-Hong Lee
    • The Journal of the Korea institute of electronic communication sciences
    • /
    • v.19 no.4
    • /
    • pp.701-708
    • /
    • 2024
  • The advent of transformer architectures in deep learning has been a major breakthrough in natural language processing research. Object name recognition is a branch of natural language processing and is an important research area for tasks such as information retrieval. It is also important in the biomedical field, but the lack of Korean biomedical corpora for training has limited the development of Korean clinical research using AI. In this study, we built a new biomedical corpus for Korean biomedical entity name recognition and selected language models pre-trained on a large Korean corpus for transfer learning. We compared the name recognition performance of the selected language models by F1-score and the recognition rate by tag, and analyzed the errors. In terms of recognition performance, KlueRoBERTa showed relatively good performance. The error analysis of the tagging process shows that the recognition performance of Disease is excellent, but Body and Treatment are relatively low. This is due to over-segmentation and under-segmentation that fails to properly categorize entity names based on context, and it will be necessary to build a more precise morphological analyzer and a rich lexicon to compensate for the incorrect tagging.

Face recognition rate comparison with distance change using embedded data in stereo images (스테레오 영상에서 임베디드 데이터를 이용한 거리에 따른 얼굴인식률 비교)

  • 박장한;남궁재찬
    • Journal of the Institute of Electronics Engineers of Korea CI
    • /
    • v.41 no.6
    • /
    • pp.81-89
    • /
    • 2004
  • In this paper, we compare face recognition rate by PCA algorithm using distance change and embedded data being input left side and right side image in stereo images. The proposed method detects face region from RGB color space to YCbCr color space. Also, The extracted face image's scale up/down according to distance change and extracts more robust face region. The proposed method through an experiment could establish standard distance (100cm) in distance about 30∼200cm, and get 99.05% (100cm) as an average recognition result by scale change. The definition of super state is specification region in normalized size (92${\times}$112), and the embedded data extracts the inner factor of defined super state, achieved face recognition through PCA algorithm. The orignal images can receive specification data in limited image's size (92${\times}$112) because embedded data to do learning not that do all learning, in image of 92${\times}$112 size averagely 99.05%, shows face recognition rate of test 1 99.05%, test 2 98.93%, test 3 98.54%, test 4 97.85%. Therefore, the proposed method through an experiment showed that if apply distance change rate could get high recognition rate, and the processing speed improved as well as reduce face information.

Gynecologic Patients' Recognition of Radiation Exposure in Gyeongbuk Area (경북 지역 산부인과 환자의 방사선피폭 인식)

  • Park, Jeong-Kyu
    • The Journal of the Korea Contents Association
    • /
    • v.8 no.8
    • /
    • pp.176-187
    • /
    • 2008
  • As the radiological medical instruments have been competitively developed in recent years, its utilization for the patient treatment has been expanded. The medical examination using the radiation has been gradually increased, so that it is recognized as a significant factor of increasing the radiation exposure. In this study, the recognition about the radiation exposure was analyzed for 555 gynecologic patients in 8 secondary and tertiary medical centers in Gyeongbuk from November 17 to April 19, 2007. The results are followed. There was a significant difference on the recognition for the radiation by age and education (p<0.05), There was the significant difference in the recognition about the radiological instruments by age and occupation (p<0.05), and there was the significant difference in the information identification & analysis by age and occupation (p<0.05). As the result of analyzing a correlation of the radiation's harmfulness, recognition, psychological state and exposure prevention, there was the correlation of 0.572 between the harmfulness and recognition, the correlation of 0.740 between the harmfulness and the psychological state, and the correlation of 0.477 between the harmfulness and the exposure prevention. It was statistically very significant (p<0.01). But, there was no significance with the radiological instrument and information identification (p>0.05). As the result of the study, it could be known that the mental threat factor was more included than the physical threat from the position of gynecologic patients who were sensitive to the radiation. Accordingly, radiological technologist who manages the radiation needs to let the patient or its guardians recognize the degree of physical harmfulness exactly.

Spontaneous Speech Emotion Recognition Based On Spectrogram With Convolutional Neural Network (CNN 기반 스펙트로그램을 이용한 자유발화 음성감정인식)

  • Guiyoung Son;Soonil Kwon
    • The Transactions of the Korea Information Processing Society
    • /
    • v.13 no.6
    • /
    • pp.284-290
    • /
    • 2024
  • Speech emotion recognition (SER) is a technique that is used to analyze the speaker's voice patterns, including vibration, intensity, and tone, to determine their emotional state. There has been an increase in interest in artificial intelligence (AI) techniques, which are now widely used in medicine, education, industry, and the military. Nevertheless, existing researchers have attained impressive results by utilizing acted-out speech from skilled actors in a controlled environment for various scenarios. In particular, there is a mismatch between acted and spontaneous speech since acted speech includes more explicit emotional expressions than spontaneous speech. For this reason, spontaneous speech-emotion recognition remains a challenging task. This paper aims to conduct emotion recognition and improve performance using spontaneous speech data. To this end, we implement deep learning-based speech emotion recognition using the VGG (Visual Geometry Group) after converting 1-dimensional audio signals into a 2-dimensional spectrogram image. The experimental evaluations are performed on the Korean spontaneous emotional speech database from AI-Hub, consisting of 7 emotions, i.e., joy, love, anger, fear, sadness, surprise, and neutral. As a result, we achieved an average accuracy of 83.5% and 73.0% for adults and young people using a time-frequency 2-dimension spectrogram, respectively. In conclusion, our findings demonstrated that the suggested framework outperformed current state-of-the-art techniques for spontaneous speech and showed a promising performance despite the difficulty in quantifying spontaneous speech emotional expression.

A Study on the Fingerprint Recognition Algorithm Using Enhancement Method of Fingerprint Ridge Structure

  • Jung, Yong-Hoon;Roh, Jeong-Serk;Rhee, Sang-Burm
    • 제어로봇시스템학회:학술대회논문집
    • /
    • 2003.10a
    • /
    • pp.1788-1793
    • /
    • 2003
  • The present of state is situation that is realized by necessity of maintenance of public security about great many information is real condition been increasing continually in knowledge info-age been situating in wide field of national defense, public peace, banking, politics, education etc. Also, loss or forgetfulness, and peculation by ID for individual information and number increase of password in Internet called that is sea of information is resulting various social problem. By alternative about these problem, including Biometrics, several authentication systems through sign(Signature), Smart Card, Watermarking technology are developed. Therefore, This paper shows that extract factor that efficiency can get into peculiar feature in physical features for good fingerprint recognition algorithm implementation with old study finding that take advantage of special quality of these fingerprint.

  • PDF

Phoneme Recognition Using Frequency State Neural Network (주파수 상태 신경 회로망을 이용한 음소 인식)

  • Lee, Jun-Mo;Hwang, Yeong-Soo;Kim, Seong-Jong;Shin, In-Chul
    • The Journal of the Acoustical Society of Korea
    • /
    • v.13 no.4
    • /
    • pp.12-19
    • /
    • 1994
  • This paper reports a new structure for phoneme recognition neural network. The proposed neural network is able to deal with the structure of the frequency bands as well as the temporal structure of phonemic features which used in the conventional TSNN. We trained this neural network using the phonetics (아, 이, 오, ㅅ, ㅊ, ㅍ, ㄱ, ㅇ, ㄹ, ㅁ) and the phoneme recognition of this neural network was a little better than those of conventional TDNN and TSNN using only temporal structure of phonemic features.

  • PDF

RECOGNIZING SIX EMOTIONAL STATES USING SPEECH SIGNALS

  • Kang, Bong-Seok;Han, Chul-Hee;Youn, Dae-Hee;Lee, Chungyong
    • Proceedings of the Korean Society for Emotion and Sensibility Conference
    • /
    • 2000.04a
    • /
    • pp.366-369
    • /
    • 2000
  • This paper examines three algorithms to recognize speaker's emotion using the speech signals. Target emotions are happiness, sadness, anger, fear, boredom and neutral state. MLB(Maximum-Likeligood Bayes), NN(Nearest Neighbor) and HMM (Hidden Markov Model) algorithms are used as the pattern matching techniques. In all cases, pitch and energy are used as the features. The feature vectors for MLB and NN are composed of pitch mean, pitch standard deviation, energy mean, energy standard deviation, etc. For HMM, vectors of delta pitch with delta-delta pitch and delta energy with delta-delta energy are used. We recorded a corpus of emotional speech data and performed the subjective evaluation for the data. The subjective recognition result was 56% and was compared with the classifiers' recognition rates. MLB, NN, and HMM classifiers achieved recognition rates of 68.9%, 69.3% and 89.1% respectively, for the speaker dependent, and context-independent classification.

  • PDF

A postprocessing method for korean optical character recognition using eojeol information (어절 정보를 이용한 한국어 문자 인식 후처리 기법)

  • 이영화;김규성;김영훈;이상조
    • Journal of the Korean Institute of Telematics and Electronics C
    • /
    • v.35C no.2
    • /
    • pp.65-70
    • /
    • 1998
  • In this paper, we will to check and to correct mis-recognized word using Eojeol information. First, we divided into 16 classes that constituents in a Eojeol after we analyzed Korean statement into Eojeol units. Eojeol-Constituent state diagram constructed these constitutents, find the Left-Right Connectivity Information. As analogized the speech of connectivity information, reduced the number of cadidate words and restricted case of morphological analysis for mis-recognition Eojeol. Then, we improved correction speed uisng heuristic information as the adjacency information for Eojeol each other. In the correction phase, construct Reverse-Order Word Dictionary. Using this, we can trace word dictionary regardless of mis-recongnition word position. Its results show that improvement of recognition rate from 97.03% to 98.02% and check rate, reduction of chadidata words and morpholgical analysis cases.

  • PDF