• Title/Summary/Keyword: Image machine learning

An Ensemble Classifier Based Method to Select Optimal Image Features for License Plate Recognition (차량 번호판 인식을 위한 앙상블 학습기 기반의 최적 특징 선택 방법)

  • Jo, Jae-Ho;Kang, Dong-Joong
    • The Transactions of The Korean Institute of Electrical Engineers / v.65 no.1 / pp.142-149 / 2016
  • This paper proposes a method for detecting vehicle license plates (LPs) in indoor and outdoor parking lots. Many conventional methods detect LPs in restricted environments, but detection is difficult in natural, complex scenes with background clutter, because patterns resembling text or LPs are always present in complicated backgrounds. To verify LP text detection in natural images, we combine the MB-LGP feature with an ensemble machine learning algorithm in order to select a small number of optimal features from a huge pool. Feature selection is performed by an adaptive boosting algorithm, which achieves a low false-positive detection rate and short computing time when combined with a cascade approach. MSER is used to provide the initial text regions of the vehicle LP. In experiments with real images, the proposed method robustly extracts LPs in natural scenes as well as in controlled environments.
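The boosting-based feature selection described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes plain threshold stumps over generic numeric features (not MB-LGP), omits the cascade and MSER stages, and all function names and the toy data are hypothetical.

```python
import math

def stump_predict(row, feature, threshold, polarity):
    """Predict +1 or -1 from a single feature compared against a threshold."""
    return polarity if row[feature] >= threshold else -polarity

def best_stump(X, y, weights):
    """Exhaustively find the stump with the lowest weighted error."""
    best = None
    for feature in range(len(X[0])):
        for threshold in sorted({row[feature] for row in X}):
            for polarity in (1, -1):
                error = sum(
                    w for row, label, w in zip(X, y, weights)
                    if stump_predict(row, feature, threshold, polarity) != label
                )
                if best is None or error < best[0]:
                    best = (error, feature, threshold, polarity)
    return best

def adaboost_select(X, y, rounds):
    """Run AdaBoost; each round picks (selects) one feature via its stump."""
    n = len(X)
    weights = [1.0 / n] * n
    chosen = []
    for _ in range(rounds):
        error, feature, threshold, polarity = best_stump(X, y, weights)
        error = min(max(error, 1e-10), 1 - 1e-10)  # keep the log finite
        alpha = 0.5 * math.log((1 - error) / error)
        chosen.append((feature, threshold, polarity, alpha))
        # Increase the weight of misclassified samples, then renormalize.
        weights = [
            w * math.exp(-alpha * label * stump_predict(row, feature, threshold, polarity))
            for row, label, w in zip(X, y, weights)
        ]
        total = sum(weights)
        weights = [w / total for w in weights]
    return chosen

def predict(chosen, row):
    """Weighted vote of the selected stumps."""
    score = sum(a * stump_predict(row, f, t, p) for f, t, p, a in chosen)
    return 1 if score >= 0 else -1

# Toy data (hypothetical): only feature 1 separates the two classes.
X = [[0.2, 1.0, 5.0], [0.9, 2.0, 3.0], [0.4, 8.0, 4.0], [0.7, 9.0, 6.0]]
y = [-1, -1, 1, 1]
chosen = adaboost_select(X, y, rounds=2)
selected_features = sorted({f for f, _, _, _ in chosen})
```

The set of features picked across rounds is the "selected" subset; in a real detector the pool would contain many thousands of candidate features.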

An ANN-based gesture recognition algorithm for smart-home applications

  • Huu, Phat Nguyen;Minh, Quang Tran;The, Hoang Lai
    • KSII Transactions on Internet and Information Systems (TIIS) / v.14 no.5 / pp.1967-1983 / 2020
  • The goal of this paper is to analyze and build an algorithm that recognizes hand gestures for smart-home applications. The proposed algorithm combines image processing techniques with artificial neural network (ANN) approaches to help users interact with computers through common gestures. We use five types of gestures: Stop, Forward, Backward, Turn Left, and Turn Right. Users control devices through a camera connected to a computer. The algorithm analyzes the gestures and performs the appropriate action according to the user's request. The results show that the average accuracy of the proposed algorithm is 92.6 percent for images and more than 91 percent for video, both of which satisfy the performance requirements of real-world applications, specifically smart-home services. The processing time is approximately 0.098 seconds on 10 frames/sec datasets. However, the accuracy still depends on the number of training images (or videos) and their resolution.

Study for an Artificial Visual Machine for the Blind (맹인용인공시각보조장치에 관한 연구)

  • 홍승홍;이균하
    • Journal of the Korean Institute of Telematics and Electronics / v.15 no.5 / pp.19-24 / 1978
  • In this paper, the functional properties of the vibrotactile sense of the skin were studied by means of psychophysical experiments with respect to the frequency and waveform of mechanical vibration, the two-point threshold, and the contactor size of the stimulators. Furthermore, based on the experimental results, a small vibrotactile stimulator made of a piezoelectric reed vibrator array was proposed as an aid to help the blind recognize Korean letters. A tactile output image is presented by an 8-row $\times$ 1-column array of small vibrator reeds driven by a 200 Hz rectangular wave, with the array fitted on a forefinger. Under the control of a NOVA minicomputer, the bimorph reed array could represent any one of the 24 Korean vowel and consonant characters at the 8 positions from left to right on the array. An identification test of the Korean characters was carried out with the designed experimental system, without a learning effect. The average rate of correct responses was 90%.

Medical Image Data Standardization for Machine Learning and Its Application Software (기계학습을 위한 의료영상 데이터 표준화 및 응용 소프트웨어)

  • Kim, Ji-Eon;Han, SeongMin;Park, Minki;Kim, Seung-Jin;No, Si-Hyeong;Jun, Hong-Yong;Lee, Chung Sub;Kim, Tae-Hoon;Jeon, Chang-Won
    • Proceedings of the Korea Information Processing Society Conference / 2019.05a / pp.346-347 / 2019
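  • Medical images have become an important tool for diagnosing patients' diseases and deciding on treatment plans. Recently, artificial intelligence research using medical images has been actively conducted both in Korea and abroad. In particular, software is being developed that learns from large collections of medical images not only to diagnose diseases and conditions precisely but also to predict them. However, although medical images follow the DICOM standard, the use of tag information differs between medical devices and medical institutions, which makes it difficult to standardize medical image metadata. This paper proposes a method for standardizing such medical image data. It also presents the results of running ETL software that converts data into the proposed standardized form, and the results of generating machine learning training datasets according to given conditions. The proposed medical image standardization and ETL software are expected to serve as key functions of a platform that provides standardized datasets tailored to a variety of users.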
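As a rough illustration of the ETL idea in this abstract, the sketch below normalizes inconsistently filled metadata into one schema and then filters records by condition to build a dataset. It is an assumption-laden toy: records are plain dictionaries rather than parsed DICOM files, the output schema and the functions `standardize` and `build_dataset` are hypothetical, and the variations handled are invented examples, not the ones the paper addresses.

```python
def standardize(record):
    """Map one raw metadata record to a fixed, hypothetical target schema."""
    # Sex may arrive as "M", "male", " F " and so on; unknown values are flagged.
    sex_raw = str(record.get("PatientSex", "")).strip().upper()
    sex = {"M": "M", "MALE": "M", "F": "F", "FEMALE": "F"}.get(sex_raw, "UNKNOWN")

    # Accept dates such as "20190501" or "2019.05.01" by keeping only the digits.
    digits = "".join(ch for ch in str(record.get("StudyDate", "")) if ch.isdigit())
    study_date = f"{digits[:4]}-{digits[4:6]}-{digits[6:8]}" if len(digits) == 8 else None

    modality = str(record.get("Modality", "")).strip().upper() or "UNKNOWN"
    return {"sex": sex, "study_date": study_date, "modality": modality}

def build_dataset(records, **conditions):
    """Standardize every record, then keep those matching all conditions."""
    rows = [standardize(r) for r in records]
    return [row for row in rows if all(row.get(k) == v for k, v in conditions.items())]

# Hypothetical raw records with institution-specific conventions.
records = [
    {"PatientSex": "male", "StudyDate": "2019.05.01", "Modality": "ct"},
    {"PatientSex": "F", "StudyDate": "20190502", "Modality": "MR"},
    {"StudyDate": "bad", "Modality": "CT"},
]
ct_rows = build_dataset(records, modality="CT")
```

Keeping unparseable values as `None` or `"UNKNOWN"` instead of dropping the record lets a later step decide whether incomplete rows belong in a training set.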

Study on hole-filling technique of motion capture images using GANs (Generative Adversarial Networks) (GANs(Generative Adversarial Networks)를 활용한 모션캡처 이미지의 hole-filling 기법 연구)

  • Shin, Kwang-Seong;Shin, Seong-Yoon
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2019.05a / pp.160-161 / 2019
  • Methods for modeling a three-dimensional object include using a 3D scanner, a motion capture system, or a Kinect system. With these methods, some portions are not captured because of occlusion while the three-dimensional object is being created. To build a complete three-dimensional object, the occluded parts must be filled in artificially. Techniques exist that fill the uncaptured parts using various image processing methods. In this study, we propose using GANs, a recent trend in unsupervised machine learning, as a method for more natural hole-filling.

Credit Card Number Recognition for People with Visual Impairment (시력 취약 계층을 위한 신용 카드 번호 인식 연구)

  • Park, Dahoon;Kwon, Kon-Woo
    • Journal of IKEEE / v.25 no.1 / pp.25-31 / 2021
  • The conventional credit card number recognition system generally needs a card to be placed in a designated location before its processing, which is not an ideal user experience especially for people with visual impairment. To improve the user experience, this paper proposes a novel algorithm that can automatically detect the location of a credit card number based on the fact that a group of sixteen digits has a fixed aspect ratio. The proposed algorithm first performs morphological operations to obtain multiple candidates of the credit card number with >4:1 aspect ratio, then recognizes the card number by testing each candidate via OCR and BIN matching techniques. Implemented with OpenCV and Firebase ML, the proposed scheme achieves 77.75% accuracy in the credit card number recognition task.
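The candidate-then-verify flow in this abstract might look roughly like the sketch below. It only covers the aspect-ratio filter and a prefix check: the morphological operations and the OCR step are assumed to have already produced bounding boxes and text, and the function names, the six-digit prefix length, and the sample BIN table are all hypothetical.

```python
def is_candidate(box, min_ratio=4.0):
    """Keep boxes (x, y, width, height) wider than the given aspect ratio."""
    _, _, width, height = box
    return height > 0 and width / height > min_ratio

def match_card_number(ocr_text, known_bins):
    """Return the 16-digit number if its leading digits are a known BIN."""
    digits = "".join(ch for ch in ocr_text if ch.isdigit())
    if len(digits) == 16 and digits[:6] in known_bins:
        return digits
    return None

def recognize(candidates, known_bins):
    """Test each (box, ocr_text) candidate in turn; return the first match."""
    for box, text in candidates:
        if not is_candidate(box):
            continue
        number = match_card_number(text, known_bins)
        if number is not None:
            return number
    return None

# Hypothetical OCR output per candidate region, and a placeholder BIN table.
known_bins = {"411111"}
candidates = [
    ((0, 0, 100, 50), "4111 1111 1111 1111"),   # ratio 2:1, rejected by shape
    ((0, 0, 400, 40), "VALID THRU 12/25"),      # right shape, not 16 digits
    ((10, 200, 420, 40), "4111 1111 1111 1111"),
]
card_number = recognize(candidates, known_bins)
```

Filtering by shape first keeps the comparatively expensive OCR and lookup steps to a handful of regions.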

A Video Expression Recognition Method Based on Multi-mode Convolution Neural Network and Multiplicative Feature Fusion

  • Ren, Qun
    • Journal of Information Processing Systems / v.17 no.3 / pp.556-570 / 2021
  • Existing video expression recognition methods mainly focus on extracting spatial features from video expression images and tend to ignore the dynamic features of video sequences. To solve this problem, a multi-mode convolution neural network method is proposed to improve the performance of facial expression recognition in video. First, OpenFace 2.0 is used to detect face images in the video, and two deep convolution neural networks are used to extract spatiotemporal expression features. A spatial convolution neural network extracts the spatial information features of each static expression image, and a temporal convolution neural network extracts dynamic information features from the optical flow of multiple expression images. Then, the spatiotemporal features learned by the two deep convolution neural networks are fused by multiplication. Finally, the fused features are input into a support vector machine to classify the facial expression. Experimental results show that the recognition accuracy of the proposed method reaches 64.57% and 60.89% on the RML and BAUM-1s datasets, respectively, which is better than that of the other methods compared.
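The fusion step in this abstract can be sketched in a few lines. The sketch assumes that "fused by multiplication" means an element-wise product of two equal-length feature vectors, which the abstract does not spell out; the function name and the sample vectors are hypothetical, and the two networks and the support vector machine are omitted.

```python
def fuse_multiplicative(spatial, temporal):
    """Element-wise product of a spatial and a temporal feature vector."""
    if len(spatial) != len(temporal):
        raise ValueError("feature vectors must have the same length")
    return [s * t for s, t in zip(spatial, temporal)]

# Hypothetical features from the spatial and temporal networks.
spatial_features = [0.5, 2.0, 0.0]
temporal_features = [4.0, 0.5, 3.0]
fused = fuse_multiplicative(spatial_features, temporal_features)
```

Under this reading, a dimension stays large in the fused vector only when both streams respond strongly, and a zero in either stream suppresses it.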

Development of Wave Height Field Measurement System Using a Depth Camera (깊이카메라를 이용한 파고장 계측 시스템의 구축)

  • Kim, Hoyong;Jeon, Chanil;Seo, Jeonghwa
    • Journal of the Society of Naval Architects of Korea / v.58 no.6 / pp.382-390 / 2021
  • The present study suggests the application of a depth camera to wave height field measurement, focusing on the calibration procedure and test setup. An Azure Kinect system is used to measure the water surface elevation, with a field of view of 800 mm × 800 mm and a repetition rate of 30 Hz. In the optimal optical setup, the spatial resolution of the field of view is 288 × 320 pixels. So that the depth camera can detect the water surface, tracer particles that float on the water and reflect infrared light are added. The calibration consists of wave height scaling and correction of the barrel distortion. A polynomial regression model for image correction is established using machine learning. The depth camera measurements are compared with those of a capacitance-type wave height gauge and show good agreement.
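One way to picture a regression-based distortion correction is the least-squares fit below. It is a sketch under strong assumptions: the abstract does not give the form of its polynomial model, so a single-coefficient radial model r_true = r_d (1 + k r_d^2) is assumed here, and the function names and calibration radii are hypothetical.

```python
def fit_radial_coefficient(r_distorted, r_true):
    """Least-squares estimate of k in r_true = r_d * (1 + k * r_d**2)."""
    # Minimizing sum((r_true - r_d - k * r_d**3)**2) over k gives a closed form.
    numerator = sum((rt - rd) * rd ** 3 for rd, rt in zip(r_distorted, r_true))
    denominator = sum(rd ** 6 for rd in r_distorted)
    if denominator == 0:
        raise ValueError("need at least one non-zero radius")
    return numerator / denominator

def correct_radius(r_distorted, k):
    """Apply the fitted radial correction to one distorted radius."""
    return r_distorted * (1 + k * r_distorted ** 2)

# Hypothetical calibration targets: normalized radii as measured (distorted),
# paired with their known true radii generated from k = 0.1.
measured = [0.2, 0.5, 0.8, 1.0]
reference = [rd * (1 + 0.1 * rd ** 2) for rd in measured]
k_hat = fit_radial_coefficient(measured, reference)
```

With real calibration images the residuals would not vanish, and a higher-order polynomial or a two-dimensional model may fit better.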

Research on the Lesion Classification by Radiomics in Laryngoscopy Image (후두내시경 영상에서의 라디오믹스에 의한 병변 분류 연구)

  • Park, Jun Ha;Kim, Young Jae;Woo, Joo Hyun;Kim, Kwang Gi
    • Journal of Biomedical Engineering Research / v.43 no.5 / pp.353-360 / 2022
  • Laryngeal disease harms quality of life, and laryngoscopy is critical for identifying causative lesions. This study uses radiomics to extract and analyze quantitative features from lesions in laryngoscopy images, and fits and validates a classifier to find meaningful features. The region of interest is searched for lesions not classified by the YOLOv5 model, and features are extracted with radiomics. The extracted features are selected through combinations of three feature selectors and three estimator models. Using the selected features, two classification models, Random Forest and Gradient Boosting, were trained and verified, and meaningful features were found. The combination of SFS, LASSO, and RF shows the highest performance, with an accuracy of 0.90 and an AUROC of 0.96. Models using features selected by SFM or RIDGE performed worse than the others. Classification of larynx lesions through radiomics appears effective, but future work should use various feature selection methods and minimize data loss, such as the loss of color data.

Research on Airport Public Art Design Elements and Preferences Based on Big Data Sentiment Analysis (빅데이터 감성분석에 따른 공항 공공예술 디자인 요소 및 선호도 연구)

  • Zhang, Yun;Zou, ChangYun;Kim, CheeYong
    • Journal of Korea Multimedia Society / v.25 no.10 / pp.1499-1511 / 2022
  • In the context of globalization, circulation between cities has become more frequent. The airport is no longer just a place for boarding, disembarking, and transportation, but a public place that serves the communication function of the "aviation city". Public art in the airport space not only gives users a sense of spatial experience, but also becomes a unique carrier for shaping the image of a city and a country. The purpose of this paper is to study the emotional value that airport public art brings to users and, on this premise, to analyze the correlation between public art design elements and user preferences. The research methods are a machine learning method and SPSS 21.0. The users' emotional value is introduced into the big data evaluation, and the preferences and inclinations of airport users toward the various elements of public art are analyzed by questionnaire. The study identifies the preferences and the main contradictions of airport users across the four dimensions of public art design elements. Opinions and optimization methods are offered to provide reference data and theoretical support for public art design.