• Title/Summary/Keyword: Recognition time reduction


On Implementing a Robust Speech Recognition System Based on a Signal Bias Removal Algorithm (신호편의제거 알고리듬에 기초한 강인한 음성 인식시스템의 구현)

  • 임계종;계영철;구명완
    • The Journal of the Acoustical Society of Korea
    • /
    • v.19 no.1
    • /
    • pp.67-72
    • /
    • 2000
  • Building on the signal bias removal (SBR) algorithm for compensating corrupted speech, this paper presents a new algorithm that is independent of the environment, minimizes the amount of computation, and is readily applicable to conventional recognition systems. To this end, a multiple-bias algorithm and a partial codebook search algorithm are added to the conventional SBR algorithm. Simulation results show that combining the two proposed algorithms reduces computation time to 1/8 and improves the recognition rate from 77.58% for the conventional system to 81.32%.

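The bias-removal idea in the abstract above can be sketched as follows. This is a minimal illustration of conventional single-bias SBR in its general form, assuming the bias is estimated as the mean difference between each feature frame and its nearest codeword and then subtracted. It does not reproduce the paper's multiple-bias or partial codebook search extensions, and the function names, the toy codebook, and the iteration count are all hypothetical.

```python
def nearest(codebook, vector):
    """Return the codeword with the smallest squared Euclidean distance."""
    return min(codebook, key=lambda c: sum((a - b) ** 2 for a, b in zip(vector, c)))


def estimate_bias(frames, codebook, iterations=3):
    """Iteratively estimate an additive bias shared by all frames (assumed model)."""
    dim = len(frames[0])
    bias = [0.0] * dim
    for _ in range(iterations):
        total = [0.0] * dim
        for frame in frames:
            # Compensate with the current estimate before the codebook search.
            compensated = [a - b for a, b in zip(frame, bias)]
            codeword = nearest(codebook, compensated)
            for d in range(dim):
                total[d] += frame[d] - codeword[d]
        bias = [t / len(frames) for t in total]
    return bias


def remove_bias(frames, bias):
    """Subtract the estimated bias from every frame."""
    return [[a - b for a, b in zip(frame, bias)] for frame in frames]


# Toy data: two codewords, frames shifted by a known bias of (1, -1).
codebook = [[0.0, 0.0], [10.0, 10.0]]
frames = [[1.0, -1.0], [11.0, 9.0], [1.0, -1.0], [11.0, 9.0]]
bias = estimate_bias(frames, codebook)
cleaned = remove_bias(frames, bias)
```

The full codebook search inside `nearest` dominates the cost of each iteration, which is consistent with the abstract's motivation for restricting the search to part of the codebook.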

FGW-FER: Lightweight Facial Expression Recognition with Attention

  • Huy-Hoang Dinh;Hong-Quan Do;Trung-Tung Doan;Cuong Le;Ngo Xuan Bach;Tu Minh Phuong;Viet-Vu Vu
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.17 no.9
    • /
    • pp.2505-2528
    • /
    • 2023
  • The field of facial expression recognition (FER) has been actively researched to improve human-computer interaction. In recent years, deep learning techniques have gained popularity for addressing FER, with numerous studies proposing end-to-end frameworks that stack or widen significant convolutional neural network layers. While this has led to improved performance, it has also resulted in larger model sizes and longer inference times. To overcome this challenge, our work introduces a novel lightweight model architecture. The architecture incorporates three key factors: Depth-wise Separable Convolution, Residual Block, and Attention Modules. By doing so, we aim to strike a balance between model size, inference speed, and accuracy in FER tasks. Through extensive experimentation on popular benchmark FER datasets, our proposed method has demonstrated promising results. Notably, it stands out due to its substantial reduction in parameter count and faster inference time, while maintaining accuracy levels comparable to other lightweight models discussed in the existing literature.
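The parameter savings from depthwise separable convolution can be illustrated with a short calculation. The sketch below is not taken from the FGW-FER paper: the layer sizes (64 input channels, 128 output channels, 3×3 kernel) and the function names are assumptions chosen for illustration, and bias terms are ignored.

```python
def standard_conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in, c_out, k):
    """Weights in a depthwise k x k convolution followed by a 1 x 1 pointwise convolution."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


# Hypothetical layer sizes, chosen only for illustration.
c_in, c_out, k = 64, 128, 3
std = standard_conv_params(c_in, c_out, k)
sep = depthwise_separable_params(c_in, c_out, k)
ratio = sep / std  # equals 1/c_out + 1/k**2
print(f"standard: {std}, separable: {sep}, ratio: {ratio:.3f}")
```

For these sizes the separable layer needs roughly 12% of the weights of the standard layer, which is the kind of reduction a lightweight architecture relies on.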

Efficient Implementation of Candidate Region Extractor for Pedestrian Detection System with Stereo Camera based on GP-GPU (스테레오 영상 보행자 인식 시스템의 후보 영역 검출을 위한 GP-GPU 기반의 효율적 구현)

  • Jeong, Geun-Yong;Jeong, Jun-Hee;Lee, Hee-Chul;Jeon, Gwang-Gil;Cho, Joong-Hwee
    • IEMEK Journal of Embedded Systems and Applications
    • /
    • v.8 no.2
    • /
    • pp.121-128
    • /
    • 2013
  • Various research efforts have addressed pedestrian recognition in embedded imaging systems, but many suffer from heavy computational complexity. SVM classification has been widely used for pedestrian recognition, and reducing the number of candidate regions is crucial for a low-complexity scheme. In this paper, we propose a real-time HOG-based pedestrian detection system on a GPU, in which images are captured by a pair of cameras. To speed up the detection of pedestrians on the road, the proposed method reduces the number of detection windows with disparity-search and near-search algorithms and uses the GPU with the NVIDIA CUDA framework. The method achieves speedups of 20% or more compared with recent GPU implementations. The effectiveness of the algorithm is demonstrated in terms of processing time and detection performance.

A Fast Algorithm for Korean Text Extraction and Segmentation from Subway Signboard Images Utilizing Smartphone Sensors

  • Milevskiy, Igor;Ha, Jin-Young
    • Journal of Computing Science and Engineering
    • /
    • v.5 no.3
    • /
    • pp.161-166
    • /
    • 2011
  • We present a fast algorithm for Korean text extraction and segmentation from subway signboards using smartphone sensors in order to minimize computational time and memory usage. The algorithm can be used as the preprocessing steps for optical character recognition (OCR): binarization, text location, and segmentation. An image of a signboard captured by a smartphone camera while the phone is held at an arbitrary angle is rotated by the detected angle, as if the image had been taken with the smartphone held horizontally. Binarization is performed only once on the subset of connected components instead of the whole image area, resulting in a large reduction in computational time. Text location is guided by a user's marker line placed over the region of interest in the binarized image via the smartphone touch screen. Then, text segmentation uses the connected-component data obtained in the binarization step and cuts the string into individual images for the designated characters. The resulting data can be used as OCR input, thereby addressing the most difficult part of OCR for text areas in natural scene images. The experimental results showed that the binarization algorithm of our method is 3.5 and 3.7 times faster than the Niblack and Sauvola adaptive-thresholding algorithms, respectively. In addition, our method achieved better quality than other methods.

Comparison of the outcomes of nasal bone reduction using serial imaging

  • Lee, Cho Long;Yang, Ho Jik;Hwang, Young Joong
    • Archives of Craniofacial Surgery
    • /
    • v.22 no.4
    • /
    • pp.193-198
    • /
    • 2021
  • Background: Nasal bone fractures are frequently encountered in clinical practice. Although fracture reduction is simple and correction requires a short operative time, low patient satisfaction and relatively high complication rates remain issues for many surgeons. These challenges may result from inaccuracies in fracture recognition and assessment or inappropriate surgical planning. Findings from immediate postoperative computed tomography (CT) scans and those performed at 4 to 6 weeks postoperatively were compared to evaluate the accuracy and outcomes of nasal fracture reduction. Methods: This retrospective study included patients diagnosed with nasal bone fractures at our department who underwent closed reduction surgery. Patients who did not undergo additional CT scans were excluded from the study. Clinical examinations, patient records, and radiographic images were evaluated in 20 patients with nasal bone fractures. Results: CT findings from immediately after surgery and from the 1-month follow-up were compared in 20 patients. Satisfactory nasal projection and aesthetically acceptable results were observed in patients with accurate correction or mild overcorrection, while undercorrection was associated with unfavorable results. Conclusion: Closed reduction surgery for correcting nasal bone fractures usually provides acceptable outcomes with relatively few complications. If available, immediate postoperative CT scans are recommended to guide surgeons in the choice of whether to perform secondary adjustments if the initial results are unsatisfactory. Based on photogrammetric data, nasal bone reduction with accurate correction or mild overcorrection achieved acceptable and stable outcomes at 1 month postoperatively. Therefore, when upward dislocation is observed on postoperative CT, one can simply observe without a subsequent intervention.

Noise Reduction in Real-time Context Aware using Wearable Device (웨어러블 기기를 이용한 실시간 상황인식에서의 잡음제거)

  • Kim, Tae Ho;Suh, Dong Hyeok;Yoon, Shin Sook;Ryu, Keun Ho
    • Journal of Digital Contents Society
    • /
    • v.19 no.9
    • /
    • pp.1803-1810
    • /
    • 2018
  • Recently, much research related to the IoT (Internet of Things) has been actively conducted. To improve the context-awareness function of smart wearable devices using the IoT, we propose a noise reduction method for the event data of the sensor part. In this study, adopting a low-pass filter attenuates abnormally measured values, which benefits situation recognition using the sensor event data. We validated the attenuation of abnormal or excessive noise using event data detected and reported by 3-axis acceleration sensors on devices such as smartphones and smart watches. In addition, various pattern data necessary for real-time context awareness were obtained through noise pattern analysis.
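How a low-pass filter attenuates an abnormal accelerometer reading can be sketched as follows. The abstract does not specify the filter design, so this example assumes a simple first-order exponential smoothing filter; the smoothing factor `alpha` and the sample readings are hypothetical.

```python
def low_pass(samples, alpha=0.2):
    """First-order low-pass filter (exponential smoothing); an assumed design."""
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    filtered = []
    state = None
    for x in samples:
        # The first sample initializes the filter; later samples move the
        # state a fraction alpha of the way toward the new reading.
        state = x if state is None else state + alpha * (x - state)
        filtered.append(state)
    return filtered


# Hypothetical single-axis readings with one abnormal spike at index 3.
readings = [0.0, 0.1, 0.0, 5.0, 0.1, 0.0, 0.1]
smoothed = low_pass(readings)
print(max(readings), max(smoothed))
```

A smaller `alpha` suppresses spikes more strongly but also makes the output respond more slowly to genuine changes in motion, so the value is a trade-off for real-time use.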

Semantic Object Detection based on LiDAR Distance-based Clustering Techniques for Lightweight Embedded Processors (경량형 임베디드 프로세서를 위한 라이다 거리 기반 클러스터링 기법을 활용한 의미론적 물체 인식)

  • Jung, Dongkyu;Park, Daejin
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.26 no.10
    • /
    • pp.1453-1461
    • /
    • 2022
  • The accuracy of peripheral object recognition algorithms that use 3D data sensors such as LiDAR in autonomous vehicles has been increasing through many studies, but these algorithms require high-performance hardware and complex structures. Such an algorithm places a large load on the main processor of an autonomous vehicle, which must operate and manage many processors while driving. To reduce this load while exploiting the advantages of 3D sensor data, we propose 2D data-based recognition using the ROI generated by extracting physical properties from 3D sensor data. In an environment where the brightness of the base image was reduced by 50%, the proposed method showed 5.3% higher accuracy and 28.57% shorter execution time than the existing 2D-based model. On the base image, in exchange for 2.46% lower accuracy than the 3D-based model, it reduced execution time by 6.25%.

Hardware Implementation for Real-Time Speech Processing with Multiple Microphones

  • Seok, Cheong-Gyu;Choi, Jong-Suk;Kim, Mun-Sang;Park, Gwi-Tea
    • Institute of Control, Robotics and Systems: Conference Proceedings (제어로봇시스템학회:학술대회논문집)
    • /
    • 2005.06a
    • /
    • pp.215-220
    • /
    • 2005
  • Various speech processing systems are now being introduced in the field of robotics. However, real-time processing and high performance are required to properly implement a speech processing system for autonomous robots. Achieving these goals requires advanced hardware techniques as well as intelligent software algorithms. For example, nonlinear amplifier boards are needed that can adjust the compression ratio (CR) via computer programming. This paper also explains the need for noise reduction, double buffering on an EPLD (erasable programmable logic device), simultaneous multi-channel AD conversion, and distant sound localization. These ideas can be used to improve distant and omnidirectional speech recognition. The speech processing system, based on an embedded Linux system, is intended to be mounted on the new home service robot being developed at KIST (Korea Institute of Science and Technology).


Adaptive Keyframe and ROI selection for Real-time Video Stabilization (실시간 영상 안정화를 위한 키프레임과 관심영역 선정)

  • Bae, Ju-Han;Hwang, Young-Bae;Choi, Byung-Ho;Chon, Je-Youl
    • Proceedings of the Korean Society of Broadcast Engineers Conference
    • /
    • 2011.11a
    • /
    • pp.288-291
    • /
    • 2011
  • Video stabilization is an important image enhancement technique widely used in surveillance systems to improve recognition performance. Most previous methods calculate an inter-frame homography to estimate global motion. These methods are relatively slow and suffer from significant depth variations or multiple moving objects. In this paper, we propose a fast and practical approach to video stabilization that selects the most reliable key frame as the reference frame for the current frame. We use optical flow to estimate global motion within an adaptively selected region of interest in a static camera environment. The optimal global motion is found by probabilistic voting in the space of optical flow. Experiments show that our method can perform real-time video stabilization, as validated by the stabilized images and a remarkable reduction in the mean color difference between stabilized frames.


On the Morphological Fast Reconstructive Filter (형태론적 고속 복원성 여파기)

  • 박덕홍;김한균;정호열;오주환;김회진;나상신;선우명훈;정기훈;김용득
    • Journal of the Korean Institute of Telematics and Electronics B
    • /
    • v.31B no.12
    • /
    • pp.81-90
    • /
    • 1994
  • This paper proposes a morphological fast reconstructive filter (FRF) that uses up/down sampling techniques for reconstructive opening and closing, together with a parallel structure for fast multiresolution decomposition. Computer simulation shows that, compared with the conventional RF, the proposed FRF can reduce the processing time by up to a factor of 8 while maintaining similar performance in the reconstructed shapes. A further reduction in decomposition time is achieved by combining the parallelized algorithm with the FRF, which can be applied in areas such as defect detection, image segmentation, and pattern recognition.

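The reconstructive opening that the abstract above builds on can be sketched in one dimension. This is a minimal illustration of conventional opening by reconstruction (erosion followed by repeated geodesic dilation under the original signal), not the paper's FRF with up/down sampling; the function names, the flat structuring element, and the sample signal are assumptions.

```python
def erode(signal, radius):
    """Flat erosion: minimum over a window of the given radius."""
    n = len(signal)
    return [min(signal[max(0, i - radius):min(n, i + radius + 1)]) for i in range(n)]


def dilate(signal, radius):
    """Flat dilation: maximum over a window of the given radius."""
    n = len(signal)
    return [max(signal[max(0, i - radius):min(n, i + radius + 1)]) for i in range(n)]


def reconstruct(marker, mask):
    """Grow the marker by unit dilations, clipped to the mask, until stable."""
    current = list(marker)
    while True:
        grown = [min(d, m) for d, m in zip(dilate(current, 1), mask)]
        if grown == current:
            return current
        current = grown


def opening_by_reconstruction(signal, radius):
    """Remove structures narrower than the window and restore the rest exactly."""
    return reconstruct(erode(signal, radius), signal)


# Hypothetical signal: a one-sample spike and a five-sample plateau.
signal = [0, 0, 5, 0, 0, 3, 3, 3, 3, 3, 0, 0]
result = opening_by_reconstruction(signal, radius=1)
print(result)
```

The iteration in `reconstruct` repeats until nothing changes, which is the costly part of a conventional reconstructive filter and the kind of processing time the abstract reports reducing.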