• Title/Summary/Keyword: Image machine learning

Development of Wave Height Field Measurement System Using a Depth Camera (깊이카메라를 이용한 파고장 계측 시스템의 구축)

  • Kim, Hoyong;Jeon, Chanil;Seo, Jeonghwa
    • Journal of the Society of Naval Architects of Korea / v.58 no.6 / pp.382-390 / 2021
  • The present study suggests the application of a depth camera for wave height field measurement, focusing on the calibration procedure and test setup. An Azure Kinect system is used to measure the water surface elevation, with a field of view of 800 mm × 800 mm and a repetition rate of 30 Hz. In the optimal optical setup, the spatial resolution of the field of view is 288 × 320 pixels. To detect the water surface with the depth camera, tracer particles that float on the water and reflect infrared light are added. The calibration consists of wave height scaling and correction of the barrel distortion. A polynomial regression model for image correction is established using machine learning. The depth camera measurements are compared with capacitance-type wave height gauge measurements and show good agreement.
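
The abstract does not give the form of the polynomial regression used for image correction. As a minimal sketch only, the following assumes a purely radial correction of the form r_true ≈ a1·r + a3·r³ fitted by least squares. The function names, the odd-polynomial form, and the optical-center parameters (cx, cy) are illustrative assumptions, not the authors' method.

```python
import math


def fit_radial_polynomial(r_measured, r_true):
    """Least-squares fit of r_true ~ a1*r + a3*r**3 (assumed model form).

    Solves the 2x2 normal equations directly, so no external library is needed.
    """
    s2 = sum(r ** 2 for r in r_measured)
    s4 = sum(r ** 4 for r in r_measured)
    s6 = sum(r ** 6 for r in r_measured)
    t1 = sum(r * t for r, t in zip(r_measured, r_true))
    t3 = sum(r ** 3 * t for r, t in zip(r_measured, r_true))
    det = s2 * s6 - s4 * s4
    if det == 0:
        raise ValueError("calibration radii do not determine both coefficients")
    a1 = (t1 * s6 - t3 * s4) / det
    a3 = (s2 * t3 - s4 * t1) / det
    return a1, a3


def correct_point(x, y, cx, cy, a1, a3):
    """Move a measured point radially about the assumed optical center (cx, cy)."""
    dx, dy = x - cx, y - cy
    r = math.hypot(dx, dy)
    if r == 0:
        return x, y
    scale = a1 + a3 * r ** 2  # equals (a1*r + a3*r**3) / r
    return cx + dx * scale, cy + dy * scale
```

In practice the calibration pairs would come from a target with known geometry, and a higher-order or two-dimensional polynomial might be needed.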

Research on the Lesion Classification by Radiomics in Laryngoscopy Image (후두내시경 영상에서의 라디오믹스에 의한 병변 분류 연구)

  • Park, Jun Ha;Kim, Young Jae;Woo, Joo Hyun;Kim, Kwang Gi
    • Journal of Biomedical Engineering Research / v.43 no.5 / pp.353-360 / 2022
  • Laryngeal disease harms quality of life, and laryngoscopy is critical in identifying causative lesions. This study uses radiomics to extract and analyze quantitative features from lesions in laryngoscopy images, and fits and validates a classifier to find meaningful features. Regions of interest are located for lesions not classified by the YOLOv5 model, and features are extracted with radiomics. The extracted features are selected through combinations of three feature selectors and three estimator models. Using the selected features, two classification models, Random Forest and Gradient Boosting, were trained and validated, and meaningful features were identified. The combination of SFS, LASSO, and RF shows the highest performance, with an accuracy of 0.90 and an AUROC of 0.96. Models using features selected by SFM or RIDGE performed worse than the others. Classification of laryngeal lesions through radiomics appears effective, but it should use various feature selection methods and minimize data loss, such as the loss of color data.

Research on Airport Public Art Design Elements and Preferences Based on Big Data Sentiment Analysis (빅데이터 감성분석에 따른 공항 공공예술 디자인 요소 및 선호도 연구)

  • Zhang, Yun;Zou, ChangYun;Kim, CheeYong
    • Journal of Korea Multimedia Society / v.25 no.10 / pp.1499-1511 / 2022
  • In the context of globalization, circulation between cities has become more frequent. The airport is no longer just a place for boarding, disembarking, and transportation, but a public place that serves the communication function of the "aviation city". The intervention of public art in the airport space not only gives users a sense of spatial experience, but also becomes a unique carrier for shaping the image of a city and country. The purpose of this paper is to study the emotional value that airport public art brings to users and, on this premise, to analyze the correlation between public art design elements and user preferences. The research methods are machine learning and SPSS 21.0. The user's emotional value is introduced into the big data evaluation, and the preferences of airport users for various elements of public art are analyzed by questionnaire. The research yields the preferences and main contradictions of airport users across the four dimensions of public art design elements. Opinions and optimization methods are offered to provide reference data and theoretical support for public art design.

Implementation of medical image labeling web application for machine learning (기계학습을 위한 의료영상 라벨링 웹 애플리케이션 구현)

  • Lee, Chung-sub;Lim, Dong-Wook;Kim, Ji-Eon;Noh, Si-Hyeong;Yu, Yeong-Ju;Kim, Tae-Hoon;Jeong, Chang-Won
    • Proceedings of the Korea Information Processing Society Conference / 2021.11a / pp.602-605 / 2021
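  • Recently, as artificial intelligence research has been actively pursued, open datasets provided in Korea and abroad are accelerating technology development. Datasets include labeling data as training data for supervised learning, so tools with a variety of labeling functions need to be developed. This paper describes a labeling web application for generating labeling data for medical images precisely and quickly. To this end, we implemented a semi-automatic labeling function using the Back Projection and GrabCut techniques and an automatic labeling function based on predictions from a machine learning model. We present the results of each labeling function through image labeling results for sarcopenia diagnosis, together with quantitative analysis results.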

Counterfactual image generation by disentangling data attributes with deep generative models

  • Jieon Lim;Weonyoung Joo
    • Communications for Statistical Applications and Methods / v.30 no.6 / pp.589-603 / 2023
  • Deep generative models aim to infer the underlying true data distribution, which has led to great success in generating fake-but-realistic data. From this perspective, data attributes can be a crucial factor in the data generation process, since non-existent counterfactual samples can be generated by altering certain factors. For example, we can generate new portrait images by flipping the gender attribute or altering the hair color attributes. This paper proposes counterfactual disentangled variational autoencoder generative adversarial networks (CDVAE-GAN), specialized for data attribute level counterfactual data generation. The structure of the proposed CDVAE-GAN consists of variational autoencoders and generative adversarial networks. Specifically, we adopt a Gaussian variational autoencoder to extract low-dimensional disentangled data features and auxiliary Bernoulli latent variables to model the data attributes separately. We also use a generative adversarial network to generate data with high fidelity. By combining the benefits of the variational autoencoder with the additional Bernoulli latent variables and the generative adversarial network, the proposed CDVAE-GAN can control the data attributes, which enables it to produce counterfactual data. Our experimental results on the CelebA dataset qualitatively show that the samples generated by CDVAE-GAN are realistic. The quantitative results also support that the proposed model can produce data with altered attributes that deceive other machine learning classifiers.

Fire Detection using Deep Convolutional Neural Networks for Assisting People with Visual Impairments in an Emergency Situation (시각 장애인을 위한 영상 기반 심층 합성곱 신경망을 이용한 화재 감지기)

  • Kong, Borasy;Won, Insu;Kwon, Jangwoo
    • 재활복지 / v.21 no.3 / pp.129-146 / 2017
  • In the event of an emergency, such as a fire in a building, visually impaired and blind people are prone to be exposed to a greater level of danger than sighted people, because they cannot become aware of it quickly. Current fire detection methods such as smoke detectors are slow and unreliable because they usually rely on chemical-sensor-based technology to detect fire particles. By using a vision sensor instead, fire can be detected much faster, as we show in our experiments. Previous studies have applied various image processing and machine learning techniques to detect fire, but they usually do not work well because these techniques require hand-crafted features that do not generalize to various scenarios. With recent advances in deep learning, this problem can be addressed with a deep-learning-based object detector that detects fire in images from security cameras. Deep-learning-based approaches learn features automatically, so they usually generalize well to various scenes. To ensure maximum capability, we applied recent computer vision technology, such as the YOLO detector, to this task. Considering the trade-off between recall and complexity, we introduced two convolutional neural networks with slightly different model complexity to detect fire at different recall rates. Both models can detect fire at 99% average precision, but one model has 76% recall at 30 FPS while the other has 61% recall at 50 FPS. We also compare the memory consumption of the two models and show their robustness by testing on various real-world scenarios.

The way to make training data for deep learning model to recognize keywords in product catalog image at E-commerce (온라인 쇼핑몰에서 상품 설명 이미지 내의 키워드 인식을 위한 딥러닝 훈련 데이터 자동 생성 방안)

  • Kim, Kitae;Oh, Wonseok;Lim, Geunwon;Cha, Eunwoo;Shin, Minyoung;Kim, Jongwoo
    • Journal of Intelligence and Information Systems / v.24 no.1 / pp.1-23 / 2018
  • Since the start of the 21st century, various high-quality services have emerged with the growth of the internet and information and communication technologies. In particular, the E-commerce industry, in which Amazon and eBay stand out, is growing explosively. As E-commerce grows, customers can easily find what they want to buy while comparing various products, because more products are registered at online shopping malls. However, this growth has also created a problem. With so many products registered, it has become difficult for customers to find what they really need in the flood of products. When customers search with a general keyword, too many products are returned. Conversely, few products are found when customers type in product details, because concrete product attributes are rarely registered. In this situation, automatically recognizing text in images can be a solution. Because the bulk of product details are written in catalogs in image format, most product information cannot be found with text input in the current text-based search system. If the information in images can be converted to text, customers can search for products by product details, which makes shopping more convenient. Various existing OCR (Optical Character Recognition) programs can recognize text in images, but they are hard to apply to catalogs because they fail in certain circumstances, such as when the text is not big enough or the fonts are not consistent. Therefore, this research suggests a way to recognize keywords in catalogs with deep learning, which has been the state of the art in image recognition since the 2010s.
The Single Shot Multibox Detector (SSD), a model well regarded for object detection performance, can be used with its structure re-designed to account for the differences between text and objects. However, the SSD model needs a lot of labeled training data, because deep learning algorithms of this kind are trained by supervised learning. To collect data, location and classification information for the text in catalogs could be labeled manually, but manual collection causes many problems. Some keywords would be missed because humans make mistakes while labeling training data. Collection also becomes too time-consuming given the scale of data needed, or costly if many workers are hired to shorten the time. Furthermore, if specific keywords need to be trained, finding images that contain those words would also be difficult. To solve the data issue, this research developed a program that creates training data automatically. The program generates catalog-like images containing various keywords and pictures and saves the location information of the keywords at the same time. With this program, not only can data be collected efficiently, but the performance of the SSD model also improves. The SSD model recorded a recognition rate of 81.99% with 20,000 data items created by the program. Moreover, this research tested the SSD model's performance under different data conditions to analyze which data features influence the performance of recognizing text in images. As a result, the number of labeled keywords, the addition of overlapped keyword labels, the existence of unlabeled keywords, the spacing among keywords, and the differences in background images were found to be related to the performance of the SSD model. This test can lead to performance improvements for the SSD model or other deep-learning-based text recognizers through high-quality data.
The SSD model re-designed to recognize text in images and the program developed for creating training data are expected to contribute to improving search systems in E-commerce. Suppliers can spend less time registering keywords for products, and customers can search for products using the product details written in the catalog.
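
The abstract describes a program that places keywords on catalog-like images and records their locations as labels, without giving implementation details. The sketch below illustrates only the labeling idea: it picks non-overlapping positions for keyword boxes on a blank canvas and returns the box coordinates. All names and parameters (canvas size, fixed character width, retry count) are hypothetical, and no actual image rendering is performed.

```python
import random


def boxes_overlap(a, b):
    """Return True if two (x, y, width, height) boxes intersect."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah


def generate_sample(keywords, canvas_w=300, canvas_h=300,
                    char_w=10, line_h=20, seed=0, max_tries=100):
    """Place each keyword at a random free position and record its box.

    Box width is approximated as char_w * len(keyword) (an assumption;
    a real generator would measure the rendered text).
    """
    rng = random.Random(seed)
    labels = []
    for word in keywords:
        w, h = char_w * len(word), line_h
        if w > canvas_w or h > canvas_h:
            continue  # keyword cannot fit on the canvas at all
        for _ in range(max_tries):
            box = (rng.randint(0, canvas_w - w), rng.randint(0, canvas_h - h), w, h)
            if not any(boxes_overlap(box, item["box"]) for item in labels):
                labels.append({"keyword": word, "box": box})
                break
    return labels
```

A real generator would also draw the text and background pictures onto the image. Allowing overlapping or unlabeled keywords and varying the spacing would make it possible to study the data factors the abstract lists.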

A Real Time Low-Cost Hand Gesture Control System for Interaction with Mechanical Device (기계 장치와의 상호작용을 위한 실시간 저비용 손동작 제어 시스템)

  • Hwang, Tae-Hoon;Kim, Jin-Heon
    • Journal of IKEEE / v.23 no.4 / pp.1423-1429 / 2019
  • Recently, the human machine interface (HMI), a system that supports efficient interaction, has become a hot topic. In this paper, we propose a new real-time, low-cost hand gesture control system as a vehicle interaction method. To reduce computation time, depth information was acquired using a time-of-flight (TOF) camera, because detecting hand regions with an RGB camera requires a large amount of computation. In addition, Fourier descriptors were used to reduce the size of the learning model. Since the Fourier descriptor uses only a small number of points in the whole image, the learning model can be made compact. To evaluate the performance of the proposed technique, we compared its speed on a desktop and on a Raspberry Pi 2. Experimental results show that the performance difference between the small embedded device and the desktop is not significant. In the gesture recognition experiment, a recognition rate of 95.16% was confirmed.
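
To illustrate why Fourier descriptors keep the model input small, the sketch below computes a common textbook variant: contour points are treated as complex numbers, a discrete Fourier transform is taken, and only the magnitudes of the first few coefficients are kept. It is not necessarily the exact formulation used in the paper. The function name and the choice of normalization are assumptions.

```python
import cmath
import math


def fourier_descriptor(contour, num_coeffs=8):
    """Return num_coeffs normalized DFT magnitudes of a closed contour.

    contour: list of (x, y) points sampled along the boundary.
    Skipping coefficient 0 removes translation, dividing by the first
    coefficient's magnitude removes scale, and using magnitudes removes
    rotation and starting-point dependence.
    """
    n = len(contour)
    if n <= num_coeffs:
        raise ValueError("contour needs more points than requested coefficients")
    z = [complex(x, y) for x, y in contour]
    mags = []
    for k in range(1, num_coeffs + 1):
        coeff = sum(z[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        mags.append(abs(coeff))
    if mags[0] == 0:
        raise ValueError("degenerate contour: first coefficient is zero")
    return [m / mags[0] for m in mags]
```

A hand contour with hundreds of boundary points is reduced here to a vector of `num_coeffs` numbers, which is what lets a small classifier run on an embedded board.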

Analyzing and Solving GuessWhat?! (GuessWhat?! 문제에 대한 분석과 파훼)

  • Lee, Sang-Woo;Han, Cheolho;Heo, Yujung;Kang, Wooyoung;Jun, Jaehyun;Zhang, Byoung-Tak
    • Journal of KIISE / v.45 no.1 / pp.30-35 / 2018
  • GuessWhat?! is a game in which two machine players, composed of questioner and answerer, ask and answer yes-no-N/A questions about the object hidden for the answerer in the image, and the questioner chooses the correct object. GuessWhat?! has received much attention in the field of deep learning and artificial intelligence as a testbed for cutting-edge research on the interplay of computer vision and dialogue systems. In this study, we discuss the objective function and characteristics of the GuessWhat?! game. In addition, we propose a simple solver for GuessWhat?! using a simple rule-based algorithm. Although a human needs four or five questions on average to solve this problem, the proposed method outperforms state-of-the-art deep learning methods using only two questions, and exceeds human performance using five questions.

Extracting Rules from Neural Networks with Continuous Attributes (연속형 속성을 갖는 인공 신경망의 규칙 추출)

  • Jagvaral, Batselem;Lee, Wan-Gon;Jeon, Myung-joong;Park, Hyun-Kyu;Park, Young-Tack
    • Journal of KIISE / v.45 no.1 / pp.22-29 / 2018
  • Over the decades, neural networks have been successfully used in numerous applications from speech recognition to image classification. However, these neural networks cannot explain their results and one needs to know how and why a specific conclusion was drawn. Most studies focus on extracting binary rules from neural networks, which is often impractical to do, since data sets used for machine learning applications contain continuous values. To fill the gap, this paper presents an algorithm to extract logic rules from a trained neural network for data with continuous attributes. It uses hyperplane-based linear classifiers to extract rules with numeric values from trained weights between input and hidden layers and then combines these classifiers with binary rules learned from hidden and output layers to form non-linear classification rules. Experiments with different datasets show that the proposed approach can accurately extract logical rules for data with nonlinear continuous attributes.
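
The two-stage idea in the abstract (numeric hyperplane rules from the input-to-hidden weights, combined with a binary rule over hidden units) can be sketched as follows. This is an illustrative toy under stated assumptions, not the paper's algorithm: each hidden unit is reduced to the test w·x + b > 0, and the binary rule over hidden units is supplied by hand rather than learned. All function names are hypothetical.

```python
def hyperplane_rule(weights, bias, names):
    """Render a hidden unit as a readable numeric rule string."""
    terms = " + ".join(f"{w:g}*{name}" for w, name in zip(weights, names))
    return f"{terms} + {bias:g} > 0"


def hidden_fires(weights, bias, x):
    """Evaluate the hyperplane test w.x + b > 0 for one input vector."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias > 0


def classify(x, hidden_units, binary_rule):
    """Apply a binary rule to the truth values of the hyperplane tests.

    hidden_units: list of (weights, bias) pairs.
    binary_rule: function mapping the list of booleans to a class decision.
    """
    fired = [hidden_fires(w, b, x) for w, b in hidden_units]
    return binary_rule(fired)
```

For example, with hidden units `([1.0, 1.0], -0.5)` and `([1.0, 1.0], -1.5)` and the binary rule "first unit fires and second does not", the combined rule is non-linear in the inputs even though each hyperplane test is linear.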