• Title/Summary/Keyword: recognition task


A Corpus Selection Based Approach to Language Modeling for Large Vocabulary Continuous Speech Recognition (대용량 연속 음성 인식 시스템에서의 코퍼스 선별 방법에 의한 언어모델 설계)

  • Oh, Yoo-Rhee;Yoon, Jae-Sam;Kim, Hong-Kook
    • Proceedings of the KSPS conference / 2005.11a / pp.103-106 / 2005
  • In this paper, we propose a language modeling approach to improve the performance of a large vocabulary continuous speech recognition system. The proposed approach is based on an active learning framework that helps select a text corpus from the large amount of text data required for language modeling. Perplexity is used as the measure for corpus selection in the active learning. In recognition experiments on a continuous Korean speech task, the speech recognition system employing the language model built with the proposed approach reduces the word error rate by about 6.6%, with less computational complexity, compared with a system using a language model constructed from randomly selected texts.

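
The perplexity-based selection described in the abstract above can be illustrated with a minimal sketch. The abstract does not give the language model, the selection direction, or any thresholds, so everything here is an assumption: an add-one smoothed unigram model stands in for the real language model, the candidates with the lowest perplexity are kept, and the names `train_unigram`, `perplexity`, and `select_by_perplexity` are hypothetical.

```python
import math
from collections import Counter


def train_unigram(sentences):
    """Return an add-one smoothed unigram probability function (an assumed stand-in model)."""
    counts = Counter(word for s in sentences for word in s.split())
    total = sum(counts.values())
    vocab = len(counts) + 1  # one extra slot for unseen words

    def prob(word):
        return (counts[word] + 1) / (total + vocab)

    return prob


def perplexity(prob, sentence):
    """Per-word perplexity of a non-empty sentence under the given model."""
    words = sentence.split()
    if not words:
        raise ValueError("sentence must contain at least one word")
    log_sum = sum(math.log(prob(w)) for w in words)
    return math.exp(-log_sum / len(words))


def select_by_perplexity(seed_sentences, candidate_pool, k):
    """Keep the k candidates with the lowest perplexity (assumed selection rule)."""
    prob = train_unigram(seed_sentences)
    return sorted(candidate_pool, key=lambda s: perplexity(prob, s))[:k]
```

In this sketch a small in-domain seed set decides which sentences from a larger pool are added to the training corpus; an active learning loop would retrain the model and repeat the selection.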

Study On Masked Face Detection And Recognition using transfer learning

  • Kwak, NaeJoung;Kim, DongJu
    • International Journal of Advanced Culture Technology / v.10 no.1 / pp.294-301 / 2022
  • COVID-19 is a crisis with numerous casualties. The World Health Organization (WHO) has declared the use of masks an essential safety measure during the COVID-19 pandemic. Therefore, whether or not a person is wearing a mask is an important issue when entering and exiting public places and institutions. However, masks make face recognition a very difficult task because parts of the face are hidden. As a result, face identification and identity verification in access systems have become difficult. In this paper, we propose a system that detects masked faces using transfer learning of Yolov5s and recognizes the user using transfer learning of Facenet. Transfer learning is performed while varying the learning rate, number of epochs, and batch size; the results are evaluated, and the best model is selected as the representative model. It has been confirmed that the proposed model performs well at masked face detection and masked face recognition.

Deep Learning based Human Recognition using Integration of GAN and Spatial Domain Techniques

  • Sharath, S;Rangaraju, HG
    • International Journal of Computer Science & Network Security / v.21 no.8 / pp.127-136 / 2021
  • Real-time human recognition is a challenging task, as the images are captured in an unconstrained environment with different poses, makeup, and styles. This limitation is addressed by generating several facial images with varied poses, makeup, and styles from a single reference image of a person using Generative Adversarial Networks (GAN). In this paper, we propose deep learning-based human recognition using integration of GAN and Spatial Domain Techniques. A novel concept of human recognition based on a face depiction approach is presented: several dissimilar face images are generated from a single reference face image using Domain Transfer Generative Adversarial Networks (DT-GAN), combined with feature extraction techniques such as Local Binary Pattern (LBP) and Histogram. The Euclidean Distance (ED) is used in the matching section to compare features and test the performance of the method. A database of millions of people with a single reference face image per person, instead of multiple reference face images, is created and saved on the centralized server, which helps to reduce memory load on the centralized server. The recognition accuracy is 100% for smaller datasets and slightly lower for larger datasets; the results are also compared with existing methods to show the superiority of the proposed method.
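
The feature extraction and matching steps named in the abstract above (LBP, histogram, Euclidean distance) can be sketched as follows. This is only an illustration under stated assumptions: it uses the basic 8-neighbor LBP on a grayscale image given as a 2D list, a normalized 256-bin histogram, and nearest-neighbor matching. The paper's actual LBP variant, image preprocessing, and the DT-GAN stage are not reproduced, and the function names are hypothetical.

```python
import math

# Clockwise 8-neighborhood offsets (row, col), starting at the top-left pixel.
_OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_histogram(img):
    """Normalized 256-bin histogram of basic LBP codes over the interior pixels."""
    hist = [0] * 256
    height, width = len(img), len(img[0])
    for r in range(1, height - 1):
        for c in range(1, width - 1):
            center = img[r][c]
            code = 0
            for bit, (dr, dc) in enumerate(_OFFSETS):
                # Set the bit when the neighbor is at least as bright as the center.
                if img[r + dr][c + dc] >= center:
                    code |= 1 << bit
            hist[code] += 1
    n = sum(hist)
    return [x / n for x in hist] if n else hist


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def match(probe_hist, gallery):
    """Return the gallery identity whose histogram is closest to the probe."""
    return min(gallery, key=lambda name: euclidean(probe_hist, gallery[name]))
```

Here `gallery` is assumed to be a dictionary mapping an identity label to its stored histogram, mirroring the single-reference-image-per-person database the abstract describes.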

Recognition of Characters Printed on PCB Components Using Deep Neural Networks (심층신경망을 이용한 PCB 부품의 인쇄문자 인식)

  • Cho, Tai-Hoon
    • Journal of the Semiconductor & Display Technology / v.20 no.3 / pp.6-10 / 2021
  • Recognition of characters printed or marked on PCB components from camera images is an important task in PCB component inspection systems. Previous optical character recognition (OCR) of PCB components typically consists of two stages: character segmentation and classification of each segmented character. However, character segmentation often fails due to corrupted characters, low image contrast, etc. Thus, OCR without character segmentation is desirable and is increasingly performed with deep neural networks. A typical segmentation-free implementation based on deep neural networks consists of a convolutional neural network followed by a recurrent neural network (RNN). However, one disadvantage of this approach is slow execution due to the RNN layers. LPRNet is a segmentation-free character recognition network with excellent accuracy proven in license plate recognition. LPRNet uses a wide convolution instead of an RNN, thus enabling fast inference. In this paper, LPRNet was adapted for recognizing characters printed on PCB components with fast execution and high accuracy. Initial training with synthetic images followed by fine-tuning on real text images yielded accurate recognition. The network can be further optimized for Intel CPUs using the OpenVINO toolkit. The optimized version of the network can run in real time, faster even than on a GPU.

A Task Scheduling Strategy in a Multi-core Processor for Visual Object Tracking Systems (시각물체 추적 시스템을 위한 멀티코어 프로세서 기반 태스크 스케줄링 방법)

  • Lee, Minchae;Jang, Chulhoon;Sunwoo, Myoungho
    • Transactions of the Korean Society of Automotive Engineers / v.24 no.2 / pp.127-136 / 2016
  • Camera-based object detection systems should satisfy recognition performance requirements as well as real-time constraints. Particularly in safety-critical systems such as Autonomous Emergency Braking (AEB), the real-time constraints significantly affect system performance. Recently, multi-core processors and system-on-chip technologies have been widely used to accelerate object detection algorithms by distributing computational loads. However, although the additional hardware improves real-time performance, it also increases the complexity of the system architecture. The increased complexity in turn causes difficulty in migrating existing algorithms and developing new ones. In this paper, to address both real-time performance and design complexity, a task scheduling strategy is proposed for visual object tracking systems. The real-time performance of the vision algorithm is increased by applying pipelining to task scheduling in a multi-core processor. Finally, the proposed task scheduling algorithm is applied to a crosswalk detection and tracking system to demonstrate the effectiveness of the proposed strategy.
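
The idea of pipelining the stages of a vision algorithm, as mentioned in the abstract above, can be sketched generically. This is not the paper's scheduling algorithm: it is a minimal producer-consumer pipeline in which each stage runs in its own thread and passes frames to the next stage through a queue. The function name `run_pipeline`, the `None` end-of-stream marker, and the stage functions are all assumptions for illustration, and Python threads only approximate the multi-core execution the paper targets.

```python
import queue
import threading


def run_pipeline(frames, stages):
    """Run each stage in its own thread; frames flow stage to stage through queues."""
    queues = [queue.Queue() for _ in range(len(stages) + 1)]

    def worker(stage_fn, q_in, q_out):
        while True:
            item = q_in.get()
            if item is None:  # end-of-stream marker (frames must not be None)
                q_out.put(None)
                return
            q_out.put(stage_fn(item))

    threads = [
        threading.Thread(target=worker, args=(fn, queues[i], queues[i + 1]))
        for i, fn in enumerate(stages)
    ]
    for t in threads:
        t.start()

    # Feed all frames, then signal the end of the stream.
    for frame in frames:
        queues[0].put(frame)
    queues[0].put(None)

    # Collect results in arrival order; one thread per stage preserves frame order.
    results = []
    while True:
        item = queues[-1].get()
        if item is None:
            break
        results.append(item)

    for t in threads:
        t.join()
    return results
```

With this structure, while one stage processes frame n, the preceding stage can already work on frame n+1, which is the throughput benefit pipelining is meant to provide.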

Deep Multi-task Network for Simultaneous Hazy Image Semantic Segmentation and Dehazing (안개영상의 의미론적 분할 및 안개제거를 위한 심층 멀티태스크 네트워크)

  • Song, Taeyong;Jang, Hyunsung;Ha, Namkoo;Yeon, Yoonmo;Kwon, Kuyong;Sohn, Kwanghoon
    • Journal of Korea Multimedia Society / v.22 no.9 / pp.1000-1010 / 2019
  • Image semantic segmentation and dehazing are key tasks in computer vision. In recent years, research on both tasks has achieved substantial improvements in performance with the development of Convolutional Neural Networks (CNN). However, most previous work on semantic segmentation assumes that images are captured in clear weather and shows degraded performance on hazy images with low contrast and faded color. Meanwhile, dehazing aims to recover a clear image from an observed hazy image, which is an ill-posed problem that can be alleviated with additional information about the image. In this work, we propose a deep multi-task network for simultaneous semantic segmentation and dehazing. The proposed network takes a single hazy image as input and predicts a dense semantic segmentation map and a clear image. The visual information refined during the dehazing process can help the recognition task of semantic segmentation. Conversely, semantic features obtained during the semantic segmentation process can provide cues for color priors of objects, which can help the dehazing process. Experimental results demonstrate the effectiveness of the proposed multi-task approach, showing improved performance compared to the separate networks.

A Study on Job and Task Satisfaction of Physiotherapist -Focusing on Employees in Orthopedic Manual Therapy Part- (물리치료사의 직업 및 직무만족도에 관한 연구 - 정형도수치료 직무 중심으로 -)

  • Park, Youn-Ki
    • The Journal of Korean Academy of Orthopedic Manual Physical Therapy / v.19 no.1 / pp.21-31 / 2013
  • Background: The purpose of this survey is to determine the job and task satisfaction of physiotherapists. These are important factors because they are directly connected to both morale and work efficiency. Methods: Data were collected from March 9th, 2013 to April 15th, 2013 using self-administered questionnaires. First, Cronbach's alpha coefficient was used to evaluate data reliability. Further data analysis used the mean and standard deviation to determine frequency and satisfaction for each characteristic. To determine the significance of job and task satisfaction, a t-test and an analysis of variance were performed. Also, regression analysis was used to examine the relation between physiotherapists' job satisfaction and task satisfaction in orthopaedic physical therapy. Result: This survey includes results from 197 physiotherapists who engage in orthopaedic physical therapy in major, medium, and small cities. The general characteristics of survey respondents include: 112 males (56.9%), 85 females (43.1%); 123 in their twenties (62.4%), 56 in their thirties (28.4%), and 18 over forty (9.1%); 156 had less than five years of work experience in orthopaedics, 25 had six to ten years, and 16 had more than eleven years. In the physiotherapist's job satisfaction survey (out of 5), males averaged 3.71 and females averaged 3.43. Individuals with less than five years in the career averaged 3.5, those with 6 to 10 years averaged 3.69, and those with over 11 years averaged 3.87; this showed a significant difference. Results for the sub-factors of job satisfaction were 3.81 for self-esteem and 3.21 for prospect of occupation. Results for task satisfaction in orthopaedic therapy showed a significant difference between 4.03 for males and 3.66 for females. For the sub-factors of task satisfaction, scores were 3.81 for vision, 4.29 for task adoption, and 3.57 for task recognition. Conclusion: Physiotherapists will be satisfied when their motivation to work and morale are increased by addressing concerns such as improving the education environment, the expert physiotherapist adoption issue, and medical law revision.

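
The reliability check mentioned in the Methods above relies on Cronbach's alpha, which for k questionnaire items is k/(k-1) multiplied by (1 minus the sum of the item variances divided by the variance of the total score). A minimal sketch of that standard formula follows; the function name `cronbach_alpha` and the row-per-respondent input layout are assumptions, and the survey's actual data are not reproduced.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    if len(scores) < 2:
        raise ValueError("need at least two respondents")
    k = len(scores[0])
    if k < 2:
        raise ValueError("need at least two items")
    # Sample variance of each item (column) across respondents.
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in scores])
    if total_var == 0:
        raise ValueError("total scores have zero variance")
    return k / (k - 1) * (1 - sum(item_vars) / total_var)
```

Values closer to 1 indicate that the items vary together, which is the sense in which the coefficient is used to judge questionnaire reliability.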

Continuous Speech Recognition Using N-gram Language Models Constructed by Iterative Learning (반복학습법에 의해 작성한 N-gram 언어모델을 이용한 연속음성인식에 관한 연구)

  • 오세진;황철준;김범국;정호열;정현열
    • The Journal of the Acoustical Society of Korea / v.19 no.6 / pp.62-70 / 2000
  • In conventional language models (LMs), probabilities are estimated by selecting highly frequent words from a large-size text database. However, when adopting LMs for a specific task, it is unnecessary to use this general method of constructing them from a large-size text, considering the various costs involved. In this paper, we propose a method for constructing LMs from a small-size text database for use in specific tasks. The proposed method efficiently increases the counts of low-frequency words by applying the same sentences iteratively, which also makes the word occurrence probabilities more robust. We carried out continuous speech recognition (CSR) experiments on 200 sentences uttered by 3 speakers, using LMs built by iterative learning (IL), in an air flight reservation task. The results indicate that CSR using IL-applied LMs shows a 20.4% increase in recognition accuracy compared with LMs without IL. The system using the IL method also shows, on average, 13.4% higher recognition accuracy than the previous one, which uses a context-free grammar (CFG), indicating the effectiveness of the method.


A novel method to aging state recognition of viscoelastic sandwich structures

  • Qu, Jinxiu;Zhang, Zhousuo;Luo, Xue;Li, Bing;Wen, Jinpeng
    • Steel and Composite Structures / v.21 no.6 / pp.1183-1210 / 2016
  • Viscoelastic sandwich structures (VSSs) are widely used in mechanical equipment, but in service they suffer from aging, which affects the overall performance of the equipment. Therefore, aging state recognition of VSSs is important for monitoring structural state and ensuring the reliability of equipment. However, non-stationary vibration response signals and weak state change characteristics make this task challenging. This paper proposes a novel method for this task based on the adaptive second generation wavelet packet transform (ASGWPT) and the multiwavelet support vector machine (MWSVM). To obtain feature parameters sensitive to different structural aging states, the ASGWPT, whose wavelet function can adaptively match the frequency spectrum characteristics of the inspected vibration response signal, is developed to process the vibration response signals for energy feature extraction. To improve the classification performance of the SVM, multiwavelet kernel functions are constructed based on the kernel method of SVM and multiwavelet theory, and the MWSVM is then developed to classify the different aging states. To demonstrate the effectiveness of the proposed method, different aging states of a VSS are created through hot oxygen accelerated aging of the viscoelastic material. The application results show that the proposed method can accurately and automatically recognize the different structural aging states and is a promising approach to aging state recognition of VSSs. Furthermore, the capability of the ASGWPT in processing the vibration response signals for feature extraction is validated by comparison with the conventional second generation wavelet packet transform, and the performance of the MWSVM in classifying the structural aging states is validated by comparison with the traditional wavelet support vector machine.

The neighborhood size and frequency effect in Korean words (한국어 단어재인에서 나타나는 이웃효과)

  • Kwon You-An;Cho Hye-Suk;Nam Ki-Chun
    • Proceedings of the KSPS conference / 2006.05a / pp.117-120 / 2006
  • This paper examined two hypotheses. First, if the first syllable of a word plays an important role in visual word recognition, it may be the unit of word neighborhood. Second, if the first syllable is the unit of lexical access, the neighborhood size effect and the neighborhood frequency effect should appear in a lexical decision task and a form-primed lexical decision task. We conducted two experiments. Experiment 1 showed that words with large neighborhoods produced an inhibitory effect in the lexical decision task (LDT). Experiment 2 showed an interaction between the neighborhood frequency effect and word form similarity in the form-primed LDT. We concluded that the first syllable in Korean words might be the unit of word neighborhood and play a central role in lexical access.
