• Title/Summary/Keyword: Recognition time reduction


An Efficient Deep Learning Based Image Recognition Service System Using AWS Lambda Serverless Computing Technology (AWS Lambda Serverless Computing 기술을 활용한 효율적인 딥러닝 기반 이미지 인식 서비스 시스템)

  • Lee, Hyunchul;Lee, Sungmin;Kim, Kangseok
    • KIPS Transactions on Software and Data Engineering
    • /
    • v.9 no.6
    • /
    • pp.177-186
    • /
    • 2020
  • Recent advances in deep learning technology have improved image recognition performance in the field of computer vision, and serverless computing is emerging as the next-generation cloud computing technology for event-based cloud application development and services. Attempts to apply deep learning and serverless computing technology to real-world image recognition services are increasing. Therefore, this paper describes how to develop an efficient deep learning based image recognition service system using serverless computing technology. The proposed system offers a method that can serve a large neural network model to users at low cost by using AWS Lambda Server based on serverless computing. We also show that a serverless computing system that uses a large neural network model can be built effectively by addressing the shortcomings of AWS Lambda Server, namely cold start time and capacity limitation. Through experiments, we confirmed that the proposed system, using AWS Lambda Serverless Computing technology, is efficient for serving large neural network models because it resolves the processing time and capacity limitations and also reduces cost.

Speech Recognition of the Korean Vowel 'ㅜ' Based on Time Domain Bulk Indicators (시간 영역 벌크 지표에 기반한 한국어 모음 'ㅜ'의 음성 인식)

  • Lee, Jae Won
    • KIISE Transactions on Computing Practices
    • /
    • v.22 no.11
    • /
    • pp.591-600
    • /
    • 2016
  • As computing technologies develop further, they are increasingly applied across everyday human environments. In addition, the rapidly increasing interest in IoT has led to the wide acceptance of speech recognition as a means of HCI. In this study, we present a novel method for recognizing the Korean vowel 'ㅜ' as part of a phoneme-based Korean speech recognition system. The proposed method analyzes bulk indicators calculated in the time domain rather than in the frequency domain, which reduces the computational cost. Four elementary algorithms for detecting typical waveform patterns of 'ㅜ' using bulk indicators are presented and combined to make final decisions. The experimental results show that the proposed method achieves 90.1% recognition accuracy and a recognition speed of 0.68 msec per syllable.

Development of an Efficient 3D Object Recognition Algorithm for Robotic Grasping in Cluttered Environments (혼재된 환경에서의 효율적 로봇 파지를 위한 3차원 물체 인식 알고리즘 개발)

  • Song, Dongwoon;Yi, Jae-Bong;Yi, Seung-Joon
    • The Journal of Korea Robotics Society
    • /
    • v.17 no.3
    • /
    • pp.255-263
    • /
    • 2022
  • 3D object detection pipelines often incorporate RGB-based object detection methods such as YOLO, which detects the object classes and bounding boxes from the RGB image. However, in complex environments where objects are heavily cluttered, bounding box approaches may show degraded performance due to overlapping bounding boxes. Mask-based methods such as Mask R-CNN can handle such situations better thanks to their detailed object masks, but they require much more time for data preparation than bounding box-based approaches. In this paper, we present a 3D object recognition pipeline which uses either the YOLO or Mask R-CNN real-time object detection algorithm, a K-nearest clustering algorithm, a mask reduction algorithm, and finally a Principal Component Analysis (PCA) algorithm to efficiently detect 3D poses of objects in a complex environment. Furthermore, we also present an improved YOLO-based 3D object detection algorithm that uses a prioritized heightmap clustering algorithm to handle overlapping bounding boxes. The suggested algorithms were used successfully at the Artificial-Intelligence Robot Challenge (ARC) 2021 competition with excellent results.

Improvement of EPC Class-1 Anticollision Algorithm for RFID Air-Interface Protocol (무선인식 프로토콜의 EPC 클래스-1 충돌방지 알고리즘 개선)

  • Kang, Bong-Soo;Lim, Jung-Hyun;Kim, Heung-Soo;Yang, Doo-Yeong
    • The Journal of the Korea Contents Association
    • /
    • v.7 no.4
    • /
    • pp.10-19
    • /
    • 2007
  • In this paper, the Class-1 air-interface protocols of EPCglobal applied to RFID systems in the UHF band are analyzed, and the standard anticollision algorithms are implemented. Improved anticollision algorithms for the Class-1 Generation-1 and Generation-2 protocols are also proposed, and the performance of the anticollision algorithms is compared. As a result, the reduction ratio of the total tag recognition time of the improved Generation-1 algorithm with respect to the standard algorithm is 54.5% for 100 tags and 63.4% for 1000 tags. The reduction ratio of the improved Generation-2 algorithm is 7.9% for 100 tags and 11.7% for 1000 tags. The total recognition times of the improved algorithms are shorter than those of the standard algorithms, and the reduction grows as the number of tags increases. Therefore, the improved anticollision algorithm proposed in this paper is an advanced method that improves tag recognition performance in RFID systems.
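
The abstract does not define the reduction ratio. One plausible reading, offered here only as an assumption, is the relative decrease in total tag recognition time, where the symbols T_std (standard algorithm) and T_imp (improved algorithm) are introduced purely for illustration:

```latex
R = \frac{T_{\mathrm{std}} - T_{\mathrm{imp}}}{T_{\mathrm{std}}} \times 100\%
```

Under that reading, R = 54.5% for 100 tags would mean the improved Generation-1 algorithm needs about 0.455 times the total recognition time of the standard algorithm.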

Design of Face Recognition algorithm Using PCA&LDA combined for Data Pre-Processing and Polynomial-based RBF Neural Networks (PCA와 LDA를 결합한 데이터 전 처리와 다항식 기반 RBFNNs을 이용한 얼굴 인식 알고리즘 설계)

  • Oh, Sung-Kwun;Yoo, Sung-Hoon
    • The Transactions of The Korean Institute of Electrical Engineers
    • /
    • v.61 no.5
    • /
    • pp.744-752
    • /
    • 2012
  • In this study, Polynomial-based Radial Basis Function Neural Networks (pRBFNNs) are proposed as the recognition part of an overall face recognition system that consists of two parts: the preprocessing part and the recognition part. The design methodology and procedure of the proposed pRBFNNs are presented to obtain a solution to high-dimensional pattern recognition problems. In the data preprocessing part, Principal Component Analysis (PCA), which is generally used in face recognition, is useful for expressing classes through dimensionality reduction, since it maintains the recognition rate while reducing the amount of data. However, because it relies on the whole face image, it cannot guarantee the detection rate under changes of viewpoint and of the whole image. To compensate for this defect, Linear Discriminant Analysis (LDA) is used to enhance the separation of different classes. In this paper, we combine the PCA and LDA algorithms and design optimized pRBFNNs for the recognition module. The proposed pRBFNNs architecture consists of three functional modules, the condition part, the conclusion part, and the inference part, expressed as fuzzy rules in 'If-then' format. In the condition part of the fuzzy rules, the input space is partitioned with Fuzzy C-Means clustering. In the conclusion part of the rules, the connection weights of the pRBFNNs are represented as two kinds of polynomials, constant and linear. The coefficients of the connection weights are identified with back-propagation using the gradient descent method. The output of the pRBFNNs model is obtained by the fuzzy inference method in the inference part of the fuzzy rules. The essential design parameters of the networks (including learning rate, momentum coefficient, and fuzzification coefficient) are optimized by means of Differential Evolution. The proposed pRBFNNs are applied to face image datasets (e.g., Yale and AT&T) and evaluated in terms of output performance and recognition rate.

A study on adaptive weighted median filter using edge information (에지정보를 이용한 적응적 가중메디안필터에 대한 연구)

  • Lee, Yong-Hwan;Park, Jang-Chun
    • The Transactions of the Korea Information Processing Society
    • /
    • v.6 no.10
    • /
    • pp.2830-2837
    • /
    • 1999
  • Image processing consists of image acquisition, preprocessing, region segmentation, and recognition. Because images are commonly corrupted by noise, many spatial noise reduction filters have been proposed, such as the mean filter, median filter, weighted median filter, Cheikh filter, and Kyu-cheol Lee filter. We propose a new edge detection algorithm that determines whether or not an edge is present. In non-edge areas, we selectively apply a weighted median filter based on the difference between the weighted median filter's value and the center pixel's value. As a result, we demonstrate better noise reduction performance by applying the adaptive weighted median filter, as well as improved processing time through the use of a simple algorithm.
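
The selective filtering step described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' algorithm: the 3x3 center-weighted kernel `WEIGHTS`, the `threshold` value, and the function names are assumptions, and the edge detector itself is not reproduced (the caller passes in a precomputed `edge_mask`).

```python
# Assumed center-weighted 3x3 kernel; the abstract does not give the weights.
WEIGHTS = [1, 1, 1,
           1, 3, 1,
           1, 1, 1]


def weighted_median(values, weights):
    """Return the value at which the cumulative weight first reaches half the total."""
    pairs = sorted(zip(values, weights))
    half = sum(weights) / 2.0
    running = 0
    for value, weight in pairs:
        running += weight
        if running >= half:
            return value
    return pairs[-1][0]


def adaptive_wm_filter(image, edge_mask, threshold=20):
    """Apply the weighted median only at non-edge pixels that look like noise.

    image      -- 2D list of gray levels
    edge_mask  -- 2D list of booleans, True where an edge was detected
    threshold  -- hypothetical cutoff on |weighted median - center pixel|
    """
    height, width = len(image), len(image[0])
    out = [row[:] for row in image]  # border pixels are copied unchanged
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            if edge_mask[y][x]:
                continue  # edge pixels are left untouched
            window = [image[y + dy][x + dx]
                      for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
            wm = weighted_median(window, WEIGHTS)
            # Replace the pixel only when it deviates strongly from the
            # weighted median, i.e. when it is likely an impulse.
            if abs(wm - image[y][x]) > threshold:
                out[y][x] = wm
    return out
```

Skipping pixels whose difference is small keeps the filter from blurring clean regions, which is one way to read the "selective" application mentioned in the abstract.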


PCA-SVM Based Vehicle Color Recognition (PCA-SVM 기법을 이용한 차량의 색상 인식)

  • Park, Sun-Mi;Kim, Ku-Jin
    • The KIPS Transactions:PartB
    • /
    • v.15B no.4
    • /
    • pp.285-292
    • /
    • 2008
  • Color histograms have been used as feature vectors to characterize the color features of given images, but their efficiency is limited because they generate high-dimensional feature vectors. In this paper, we present a method to reduce the dimension of the feature vectors by applying PCA (principal component analysis) to the color histogram of a given vehicle image. With the SVM (support vector machine) method, the dimension-reduced feature vectors are used to recognize the colors of vehicles. After reducing the dimension of the feature vector by a factor of 32, the successful recognition rate is reduced by only 1.42% compared to the case when we use the original feature vectors. Moreover, the computation time for the color recognition is reduced by a factor of 31, so we can recognize the colors efficiently.

Performance Improvement of Connected Digit Recognition with Channel Compensation Method for Telephone speech (채널보상기법을 사용한 전화 음성 연속숫자음의 인식 성능향상)

  • Kim Min Sung;Jung Sung Yun;Son Jong Mok;Bae Keun Sung
    • MALSORI
    • /
    • no.44
    • /
    • pp.73-82
    • /
    • 2002
  • Channel distortion degrades the performance of speech recognizers in telephone environments. It mainly results from the bandwidth limitation and variation of the transmission channel. Variation of channel characteristics is usually represented as a baseline shift in the cepstrum domain. Thus the undesirable effect of channel variation can be removed by subtracting the mean from the cepstrum. In this paper, to improve the recognition performance of Korean connected digit telephone speech, channel compensation methods such as CMN (Cepstral Mean Normalization), RTCN (Real Time Cepstral Normalization), MCMN (Modified CMN) and MRTCN (Modified RTCN) are applied to the static MFCC. MCMN and MRTCN are obtained from CMN and RTCN, respectively, using variance normalization in the cepstrum domain. Using the HTK v3.1 system, recognition experiments are performed on the Korean connected digit telephone speech database released by SITEC (Speech Information Technology & Industry Promotion Center). Experiments show that MRTCN gives the best result, with a recognition rate of 90.11% for connected digits. This corresponds to a performance improvement of 1.72% over MFCC alone, i.e., an error reduction rate of 14.82%.
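
The mean subtraction described in this abstract can be sketched as below. This is a minimal utterance-level illustration, not the paper's implementation: the function names are hypothetical, and `cmn_with_variance` is only an assumed reading of "CMN plus variance normalization" (the abstract does not give the exact MCMN formula, and the real-time variants RTCN/MRTCN are not covered).

```python
import math


def cmn(frames):
    """Cepstral mean normalization over one utterance.

    frames -- list of cepstral vectors (lists of floats), one per frame.
    Subtracts the per-coefficient mean, removing a constant channel offset.
    """
    n = len(frames)
    dims = len(frames[0])
    means = [sum(frame[d] for frame in frames) / n for d in range(dims)]
    return [[frame[d] - means[d] for d in range(dims)] for frame in frames]


def cmn_with_variance(frames, eps=1e-8):
    """Assumed reading of MCMN: CMN followed by per-coefficient variance scaling."""
    centered = cmn(frames)
    n = len(centered)
    dims = len(centered[0])
    stds = [math.sqrt(sum(frame[d] ** 2 for frame in centered) / n)
            for d in range(dims)]
    # eps guards against division by zero for constant coefficients.
    return [[frame[d] / (stds[d] + eps) for d in range(dims)]
            for frame in centered]
```

The reported figures are also mutually consistent: a 1.72-point gain to 90.11% implies a baseline of 88.39%, so the error rate falls from 11.61% to 9.89%, and 1.72 / 11.61 is approximately 14.8%.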


Vocal Tract Length Normalization for Speech Recognition (음성인식을 위한 성도 길이 정규화)

  • 지상문
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.7 no.7
    • /
    • pp.1380-1386
    • /
    • 2003
  • Speech recognition performance is degraded by the variation in vocal tract length among speakers. In this paper, we use a vocal tract length normalization method wherein the frequency axis of the short-time spectrum of a speaker's speech is scaled to minimize the effect of the speaker's vocal tract length on speech recognition performance. To normalize vocal tract length, we tried several frequency warping functions, such as linear and piece-wise linear functions. A variable-interval piece-wise linear warping function is proposed to effectively model the variation of the frequency axis scale due to the large variation of vocal tract length. Experimental results on TIDIGITS connected digits showed a dramatic reduction of the word error rate from 2.15% to 0.53% with the proposed vocal tract normalization.
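
For orientation, a conventional two-segment piece-wise linear warping function can be sketched as follows. This is not the variable-interval function proposed in the paper, whose details the abstract does not give; the function name, the `f_max` of 4000 Hz, and the `break_ratio` of 0.85 are assumptions chosen for illustration.

```python
def piecewise_linear_warp(freq, alpha, f_max=4000.0, break_ratio=0.85):
    """Warp a frequency (Hz) with a two-segment piece-wise linear function.

    alpha       -- speaker-specific warping factor (1.0 means no warping)
    f_max       -- assumed upper band edge; it is mapped onto itself
    break_ratio -- assumed position of the breakpoint relative to f_max
    """
    # Place the breakpoint so that alpha * f_break never exceeds
    # break_ratio * f_max, which keeps the second segment increasing.
    f_break = break_ratio * f_max * min(1.0, 1.0 / alpha)
    if freq <= f_break:
        return alpha * freq  # first segment: plain linear scaling
    # Second segment: straight line from (f_break, alpha * f_break)
    # to (f_max, f_max), so the band edge is preserved.
    slope = (f_max - alpha * f_break) / (f_max - f_break)
    return alpha * f_break + slope * (freq - f_break)
```

In a typical setup the warping factor would be chosen per speaker from a small grid of candidates; that search is outside this sketch.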

Image Processing Technique for Laser Beam Recognition in Shooting Simulation System (모의 사격 시스템에서 레이저 빔 인식을 위한 영상처리 기법)

  • Oh, Se-Chang;Han, Dong-Il
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.13 no.3
    • /
    • pp.594-601
    • /
    • 2009
  • Shooting simulation systems not only greatly reduce the expense and time of military exercises but also prevent accidents. In particular, shooting simulation systems that use a laser beam have the advantage of being very similar to shooting exercises that use real bullets. However, a real-time technique for laser beam recognition in a target image is necessary. The method proposed in this paper takes a difference image from two adjacent image frames. Thresholding is then applied to this difference image to discriminate the laser beam from the background. To decide the threshold value, the intensity distribution of background points is modeled assuming a normal distribution. Noise reduction and region segmentation are then applied to the binary image to find the position of the laser beam. The time complexity of this process depends on the size of the image multiplied by the size of the mask used in the noise reduction process. The experimental results showed that the accuracy of the system was 93.3%. Even in the inaccurate cases, the beam was always found in the resultant region.
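
The difference-and-threshold step described in this abstract can be sketched as follows. This is a simplified illustration under stated assumptions: the background statistics are estimated from the whole difference image (on the assumption that almost all pixels are background), the multiplier `k` is hypothetical, and a plain centroid stands in for the paper's noise reduction and region segmentation, which are not reproduced.

```python
import math


def threshold_difference(prev_frame, curr_frame, k=3.0):
    """Return a binary mask of pixels that changed more than mean + k * std.

    prev_frame, curr_frame -- 2D lists of gray levels with the same shape
    k                      -- hypothetical multiplier on the standard deviation
    """
    diff = [[abs(c - p) for p, c in zip(prev_row, curr_row)]
            for prev_row, curr_row in zip(prev_frame, curr_frame)]
    flat = [value for row in diff for value in row]
    mean = sum(flat) / len(flat)
    std = math.sqrt(sum((value - mean) ** 2 for value in flat) / len(flat))
    threshold = mean + k * std  # normal-distribution model of the background
    return [[1 if value > threshold else 0 for value in row] for row in diff]


def beam_centroid(mask):
    """Return the (x, y) centroid of the marked pixels, or None if there are none."""
    points = [(x, y) for y, row in enumerate(mask)
              for x, value in enumerate(row) if value]
    if not points:
        return None
    count = len(points)
    return (sum(x for x, _ in points) / count,
            sum(y for _, y in points) / count)
```

A usage example: calling `threshold_difference` on two consecutive frames and passing the result to `beam_centroid` yields an estimated beam position, or `None` when nothing exceeds the threshold.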