• Title/Summary/Keyword: image-processing

The Performance Improvement of U-Net Model for Landcover Semantic Segmentation through Data Augmentation (데이터 확장을 통한 토지피복분류 U-Net 모델의 성능 개선)

  • Baek, Won-Kyung;Lee, Moung-Jin;Jung, Hyung-Sup
    • Korean Journal of Remote Sensing / v.38 no.6_2 / pp.1663-1676 / 2022
  • Recently, a number of deep-learning-based land cover segmentation studies have been introduced. Some studies noted that land cover segmentation performance deteriorated due to insufficient training data. In this study, we verified the improvement of land cover segmentation performance through data augmentation. U-Net was used as the segmentation model, and the 2020 satellite-derived land cover dataset was used as the study data. The pixel accuracies were 0.905 and 0.923 for the U-Net models trained on the original and augmented data, respectively. The mean F1 scores of those models were 0.720 and 0.775, respectively, indicating better performance with data augmentation. In addition, the F1 scores for the building, road, paddy field, upland field, forest, and unclassified area classes were 0.770, 0.568, 0.433, 0.455, 0.964, and 0.830 for the U-Net trained on the original data. Data augmentation proved effective in that the F1 scores of every class improved, to 0.838, 0.660, 0.791, 0.530, 0.969, and 0.860, respectively. Although we applied data augmentation without considering class balance, the comparison between the two models shows that data augmentation can mitigate the biased segmentation performance caused by data imbalance. This study is expected to help demonstrate the importance and effectiveness of data augmentation in various image processing fields.
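
As a minimal sketch of the two metrics reported above, pixel accuracy and mean F1 can both be derived from a confusion matrix. The function names and the toy 3-class matrix below are illustrative assumptions, not the authors' code or data.

```python
def pixel_accuracy(cm):
    """Fraction of pixels whose predicted class equals the reference class."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total


def per_class_f1(cm):
    """F1 per class; cm[i][j] counts pixels of true class i predicted as class j."""
    n = len(cm)
    scores = []
    for k in range(n):
        tp = cm[k][k]
        fp = sum(cm[i][k] for i in range(n)) - tp  # predicted k, truly another class
        fn = sum(cm[k]) - tp                       # truly k, predicted another class
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return scores


def mean_f1(cm):
    """Unweighted (macro) average of the per-class F1 scores."""
    scores = per_class_f1(cm)
    return sum(scores) / len(scores)


# Hypothetical imbalanced example: class 1 is rare and often confused with class 0.
toy_cm = [
    [90, 5, 5],
    [10, 20, 0],
    [0, 0, 70],
]
print(pixel_accuracy(toy_cm), per_class_f1(toy_cm), mean_f1(toy_cm))
```

In this toy matrix the pixel accuracy is 0.9 while the F1 of the rare class is only about 0.73, which illustrates why per-class F1 is a useful complement to pixel accuracy when classes are imbalanced.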

Filtering-Based Method and Hardware Architecture for Drivable Area Detection in Road Environment Including Vegetation (초목을 포함한 도로 환경에서 주행 가능 영역 검출을 위한 필터링 기반 방법 및 하드웨어 구조)

  • Kim, Younghyeon;Ha, Jiseok;Choi, Cheol-Ho;Moon, Byungin
    • KIPS Transactions on Software and Data Engineering / v.11 no.1 / pp.51-58 / 2022
  • Drivable area detection, one of the main functions of advanced driver assistance systems, means detecting an area where a vehicle can drive safely. Because it is closely related to driver safety, it requires high accuracy and real-time operation. To satisfy these conditions, the V-disparity-based method is widely used; it detects a drivable area by calculating the road disparity value in each row of an image. However, the V-disparity-based method can falsely detect a non-road area as road when the disparity value is inaccurate or when the disparity value of an object equals that of the road. In road environments that include vegetation, such as highways and country roads, vegetation areas may be falsely detected as drivable because their disparity characteristics are similar to those of the road. Therefore, this paper proposes a drivable area detection method and hardware architecture that achieve high accuracy in road environments including vegetation by reducing the number of false detections caused by the V-disparity characteristics. When evaluated on 289 images from the KITTI road dataset, the proposed method shows an accuracy of 90.12% and a recall of 97.96%. In addition, when implemented on an FPGA platform, the proposed hardware architecture uses 8925 slice registers and 7066 slice LUTs.
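
The V-disparity idea described above can be sketched as a per-row histogram of disparity values. This is a simplified illustration under stated assumptions: integer disparities, a hypothetical toy disparity map, and a naive "most frequent value per row" road estimate in place of the line fitting that practical systems typically use. It is not the filtering method or hardware architecture proposed in the paper.

```python
def v_disparity(disparity_map, max_disparity):
    """Build a V-disparity histogram: one row of counts per image row."""
    hist = []
    for row in disparity_map:
        counts = [0] * (max_disparity + 1)
        for d in row:
            if 0 <= d <= max_disparity:  # ignore invalid disparities
                counts[d] += 1
        hist.append(counts)
    return hist


def road_disparity_per_row(hist):
    """Naive road estimate: the most frequent disparity in each image row."""
    return [max(range(len(counts)), key=counts.__getitem__) for counts in hist]


# Hypothetical 3x4 disparity map: road disparity grows toward the bottom rows,
# and the value 5 marks pixels of a nearer object.
toy_map = [
    [1, 1, 1, 1],
    [2, 2, 5, 2],
    [3, 3, 3, 5],
]
hist = v_disparity(toy_map, max_disparity=5)
print(road_disparity_per_row(hist))
```

A pixel whose disparity matches its row's road disparity would be labeled drivable under this scheme, which is exactly why an object or vegetation with road-like disparity can be misclassified.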

Distracted Driver Detection and Characteristic Area Localization by Combining CAM-Based Hierarchical and Horizontal Classification Models (CAM 기반의 계층적 및 수평적 분류 모델을 결합한 운전자 부주의 검출 및 특징 영역 지역화)

  • Go, Sooyeon;Choi, Yeongwoo
    • KIPS Transactions on Software and Data Engineering / v.10 no.11 / pp.439-448 / 2021
  • Driver negligence accounts for the largest proportion of traffic accident causes, and research on detecting it is ongoing. This paper proposes a method to accurately detect a distracted driver and localize the most characteristic parts of the driver. To detect driver distraction, the proposed method hierarchically constructs a CAM-based basic CNN model that classifies 10 classes, together with 4 subclass models for detailed classification of the classes that have confusing or common feature areas in the basic model. The classification result output by each model can be regarded as a new feature indicating the degree of matching with the CNN feature maps, and classification accuracy is improved by horizontally combining these results and learning from them. In addition, by combining the heat maps that reflect the classification results of the basic and detailed classification models, the characteristic areas of attention in the image are found. The proposed method obtained an accuracy of 95.14% in an experiment using the State Farm data set, which is 2.94 percentage points higher than 92.2%, the highest accuracy among existing results using this data set. The experiment also confirmed that the attention areas found were more meaningful and accurate than those found when only the basic model was used.
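
For readers unfamiliar with class activation mapping (CAM), the heat map for one class is the weighted sum of the last convolutional feature maps, using that class's classifier weights. The sketch below shows only this basic computation with hypothetical toy values; the hierarchical and horizontal combination of models described in the abstract is not reproduced here.

```python
def class_activation_map(feature_maps, class_weights):
    """Weighted sum of K feature maps (each H x W) using one class's K weights."""
    height = len(feature_maps[0])
    width = len(feature_maps[0][0])
    cam = [[0.0] * width for _ in range(height)]
    for fmap, weight in zip(feature_maps, class_weights):
        for y in range(height):
            for x in range(width):
                cam[y][x] += weight * fmap[y][x]
    return cam


def peak_location(cam):
    """Return (row, col) of the strongest activation, i.e. the most characteristic area."""
    _, y, x = max((v, y, x) for y, row in enumerate(cam) for x, v in enumerate(row))
    return y, x


# Hypothetical example: two 2x2 feature maps and the weights of one class.
toy_maps = [
    [[0, 1], [0, 0]],
    [[0, 0], [4, 0]],
]
toy_weights = [1.0, 0.5]
cam = class_activation_map(toy_maps, toy_weights)
print(cam, peak_location(cam))
```

In practice the resulting map is upsampled to the input image size and overlaid as a heat map.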

Development of CanSat System With 3D Rendering and Real-time Object Detection Functions (3D 렌더링 및 실시간 물체 검출 기능 탑재 캔위성 시스템 개발)

  • Kim, Youngjun;Park, Junsoo;Nam, Jaeyoung;Yoo, Seunghoon;Kim, Songhyon;Lee, Sanghyun;Lee, Younggun
    • Journal of the Korean Society for Aeronautical & Space Sciences / v.49 no.8 / pp.671-680 / 2021
  • This paper describes the design and production of reconnaissance hardware and software, and the verification of their functions after installation on the CanSat platform and ground stations. The reconnaissance mission consists of two main parts: terrain search, which renders the surrounding terrain in 3D using radar, GPS, and IMU sensors, and real-time detection of major objects through analysis of optical camera images. In addition, GUI software improved data analysis efficiency and enhanced the completeness of the CanSat system. Specifically, software was produced that can check terrain and object detection information in real time at the ground station, and mission failure was prevented through exception handling for abnormal packets and system initialization functions. Communication through LTE and an AWS server was used as the main channel, and ZigBee was used as the auxiliary channel. The completed CanSat underwent air-drop tests using a rocket launch method and a drone mount method. In the experiments, terrain search and object detection performed excellently, and all results were processed in real time and successfully displayed in the ground station software.

Evaluation of Debonding Defects in Railway Concrete Slabs Using Shear Wave Tomography (전단파 토모그래피를 활용한 철도 콘크리트 궤도 슬래브 층분리 결함 평가)

  • Lee, Jin-Wook;Kee, Seong-Hoon;Lee, Kang Seok
    • Journal of the Korea Institute for Structural Maintenance and Inspection / v.26 no.3 / pp.11-20 / 2022
  • The main purpose of this study is to investigate the applicability of shear wave tomography as a non-destructive testing method for evaluating debonding between the track concrete layer (TCL) and the hydraulically stabilized base course (HSB) of concrete slab tracks in the Korea high-speed railway system. A commercially available multi-channel shear wave measurement device (MIRA) is used to evaluate debonding defects in a full-scale mock-up test specimen that was designed and constructed according to the Rheda 200 system. A part of the mock-up specimen includes two artificial debonding defects with a length and a width of 400 mm and thicknesses of 5 mm and 10 mm, respectively. The tomography images obtained by the MIRA on the surface of the concrete specimen are effective for visualizing the debonding defects in the concrete. In this study, a simple image processing method is proposed to suppress the noisy signals reflected from the embedded items (reinforcing steel, precast sleeper, insert, etc.) in the TCL, which significantly improves the readability of debonding defects in shear wave tomography images. Results show that the debonding maps constructed in this study are effective for visualizing the spatial distribution and depths of the debonding defects in the railway concrete slab specimen.

Grasping a Target Object in Clutter with an Anthropomorphic Robot Hand via RGB-D Vision Intelligence, Target Path Planning and Deep Reinforcement Learning (RGB-D 환경인식 시각 지능, 목표 사물 경로 탐색 및 심층 강화학습에 기반한 사람형 로봇손의 목표 사물 파지)

  • Ryu, Ga Hyeon;Oh, Ji-Heon;Jeong, Jin Gyun;Jung, Hwanseok;Lee, Jin Hyuk;Lopez, Patricio Rivera;Kim, Tae-Seong
    • KIPS Transactions on Software and Data Engineering / v.11 no.9 / pp.363-370 / 2022
  • Grasping a target object among clutter objects without collision requires machine intelligence. This intelligence includes environment recognition, target and obstacle recognition, collision-free path planning, and the object grasping intelligence of robot hands. In this work, we implement such a system in simulation and hardware to grasp a target object without collision. We use an RGB-D image sensor to recognize the environment and objects. Various path-finding algorithms have been implemented and tested to find collision-free paths. Finally, for an anthropomorphic robot hand, object grasping intelligence is learned through deep reinforcement learning. In our simulation environment, grasping a target out of five clutter objects showed an average success rate of 78.8% and a collision rate of 34% without path planning, whereas our system combined with path planning showed an average success rate of 94% and an average collision rate of 20%. In our hardware environment, grasping a target out of three clutter objects showed an average success rate of 30% and a collision rate of 97% without path planning, whereas our system combined with path planning showed an average success rate of 90% and an average collision rate of 23%. Our results show that grasping a target object in clutter is feasible with vision intelligence, path planning, and deep RL.
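
The abstract does not name the path-finding algorithms that were tested. As one illustrative possibility only, the sketch below runs breadth-first search on a hypothetical 2D occupancy grid (0 = free, 1 = obstacle) to find a shortest collision-free path; the grid, function name, and 4-connected motion model are all assumptions.

```python
from collections import deque


def shortest_free_path(grid, start, goal):
    """Breadth-first search over 4-connected free cells; returns a list of cells or None."""
    rows, cols = len(grid), len(grid[0])
    parents = {start: None}  # also serves as the visited set
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:
            path = []
            while cell is not None:  # walk back to the start
                path.append(cell)
                cell = parents[cell]
            return path[::-1]
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and grid[nr][nc] == 0 and (nr, nc) not in parents:
                parents[(nr, nc)] = cell
                queue.append((nr, nc))
    return None  # goal unreachable without collision


# Hypothetical workspace: a wall of clutter separates the start from the target.
toy_grid = [
    [0, 0, 0],
    [1, 1, 0],
    [0, 0, 0],
]
print(shortest_free_path(toy_grid, (0, 0), (2, 0)))
```

A real manipulator would plan in 3D or in joint space, but the same principle applies: the planner only expands states that do not intersect recognized obstacles.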

The Design of Smart Factory System using AI Edge Device (AI 엣지 디바이스를 이용한 스마트 팩토리 시스템 설계)

  • Han, Seong-Il;Lee, Dae-Sik;Han, Ji-Hwan;Shin, Han Jae
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology / v.15 no.4 / pp.257-270 / 2022
  • In this paper, we design a smart factory risk improvement system and risk improvement method using AI edge devices. The system uses AI edge devices to collect and analyze data on workers' task performance in the smart factory and to prevent and promptly respond to risks, so it can reduce the risks that may arise during work while improving the defect rate when workers perform jobs. In particular, abnormal risk conditions can be set based on worker image information, worker biometric information, equipment operation information, and quality information of manufactured products, and risks can be mitigated so that work is performed efficiently and accurately. In addition, data collected from cameras and IoT sensors inside the smart factory are processed by the AI edge device rather than all being sent to the cloud, and only necessary data are transmitted to the cloud, so processing is fast and there are fewer security problems. Furthermore, because less data is transmitted to the cloud, the use of AI edge devices reduces data communication costs and the cost of acquiring data transmission bandwidth.

Sign Language Dataset Built from S. Korean Government Briefing on COVID-19 (대한민국 정부의 코로나 19 브리핑을 기반으로 구축된 수어 데이터셋 연구)

  • Sim, Hohyun;Sung, Horyeol;Lee, Seungjae;Cho, Hyeonjoong
    • KIPS Transactions on Software and Data Engineering / v.11 no.8 / pp.325-330 / 2022
  • This paper collects and evaluates a dataset for deep learning research on Korean sign language, including sign language recognition, translation, and segmentation. Deep learning research on sign language faces two difficulties. First, sign languages are difficult to recognize because they involve multiple modalities, including hand movements, hand directions, and facial expressions. Second, there is a lack of training data for deep learning research. Currently, the KETI dataset is the only known Korean sign language dataset for deep learning. Sign language datasets for deep learning research are classified into two categories: isolated sign language and continuous sign language. Although several foreign sign language datasets have been collected over time, they are also insufficient for deep learning research on sign language. Therefore, we attempted to collect a large-scale Korean sign language dataset and evaluate it using a baseline model named TSPNet, which achieves state-of-the-art performance in sign language translation. The collected dataset consists of a total of 11,402 images and texts. Our experiment with the baseline model on this dataset yields a BLEU-4 score of 3.63, which can serve as a baseline performance for Korean sign language datasets. We hope that our experience of collecting a Korean sign language dataset helps facilitate further research on Korean sign language.
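
To make the BLEU-4 figure above more concrete, the following is a simplified sentence-level sketch with a single reference and no smoothing, returning a value in [0, 1]. Published scores such as the one above are usually computed at corpus level and reported on a 0 to 100 scale, so this is only an illustration of the formula; the function names and the toy token sequences are assumptions.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count all contiguous n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu4(candidate, reference):
    """Geometric mean of clipped 1- to 4-gram precisions times a brevity penalty."""
    log_precisions = []
    for n in range(1, 5):
        cand = ngrams(candidate, n)
        ref = ngrams(reference, n)
        total = sum(cand.values())
        # Each candidate n-gram is credited at most as often as it occurs in the reference.
        clipped = sum(min(count, ref[gram]) for gram, count in cand.items())
        if total == 0 or clipped == 0:
            return 0.0  # without smoothing, any zero precision gives BLEU = 0
        log_precisions.append(math.log(clipped / total))
    if len(candidate) > len(reference):
        brevity_penalty = 1.0
    else:
        brevity_penalty = math.exp(1 - len(reference) / len(candidate))
    return brevity_penalty * math.exp(sum(log_precisions) / 4)


# Hypothetical token sequences differing only in the last word.
reference = "a b c d e".split()
candidate = "a b c d f".split()
print(bleu4(candidate, reference))
```

Because every n-gram order up to 4 must match, BLEU-4 drops quickly for long or loosely aligned outputs, which helps explain why low single-digit scores are common for hard translation tasks.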

Dental Surgery Simulation Using Haptic Feedback Device (햅틱 피드백 장치를 이용한 치과 수술 시뮬레이션)

  • Yoon Sang Yeun;Sung Su Kyung;Shin Byeong Seok
    • KIPS Transactions on Software and Data Engineering / v.12 no.6 / pp.275-284 / 2023
  • Virtual reality simulations are used for education and training in various fields, and have recently become especially widespread in the medical field. An education and training simulator consists of tactile/force feedback and image/sound output hardware that provides sensations similar to those a doctor experiences when treating a real patient with real surgical tools, and software that produces realistic images and tactile feedback. Existing simulators are complicated and expensive because they must use many types of hardware to simulate the various surgical instruments used during surgery. In this paper, we propose a dental surgery simulation system using a force feedback device and a morphable haptic controller. The haptic hardware determines whether the surgical tool collides with the surgical site and provides a sense of resistance and vibration. In particular, a haptic controller that can be deformed, for example by changing length or bending, can express the different sensations felt depending on the shape of each surgical tool. When the user manipulates the haptic feedback device, events such as device movement or button clicks are delivered to the simulation system, which produces interaction between the dental surgical tools and the intraoral models and returns haptic feedback to the device. Using these basic techniques, we provide a realistic training experience for impacted wisdom tooth extraction, a representative dental surgery, in a virtual environment represented by sophisticated three-dimensional models.

Deep Learning Braille Block Recognition Method for Embedded Devices (임베디드 기기를 위한 딥러닝 점자블록 인식 방법)

  • Hee-jin Kim;Jae-hyuk Yoon;Soon-kak Kwon
    • Journal of Korea Society of Industrial Information Systems / v.28 no.4 / pp.1-9 / 2023
  • In this paper, we propose a method for recognizing braille blocks in real time on embedded devices through deep learning. First, a deep learning model for braille block recognition is trained on a high-performance computer, and the trained model is then processed with a model lightweighting tool so that it can be deployed on an embedded device. To recognize the walking information of the braille blocks, an algorithm determines the path using the distance from the braille block in the image. After braille blocks, bollards, and crosswalks are detected by the YOLOv8 model in the video captured by the embedded device, the walking information is recognized through the braille block path discrimination algorithm. We apply the model lightweighting tool to YOLOv8 to detect braille blocks in real time. The precision of the YOLOv8 model weights is lowered from the original 32 bits to 8 bits, and the model is optimized with the TensorRT optimization engine. Comparing the lightweight model produced by the proposed method with the existing model, the path recognition accuracy is 99.05%, almost the same as the existing model, while the recognition time is reduced by 59%, allowing about 15 frames per second to be processed.
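
The reduction of weight precision from 32 bits to 8 bits can be illustrated with a simple symmetric per-tensor quantization scheme. This is a conceptual sketch only: the scheme, the function names, and the toy weights are assumptions, and it does not represent how TensorRT performs calibration or what the authors' tool does internally.

```python
def quantize_int8(weights):
    """Map float weights to integers in [-127, 127] using one shared scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float weights from the 8-bit integers."""
    return [q * scale for q in quantized]


# Hypothetical layer weights.
toy_weights = [0.5, -1.27, 0.01, 1.0]
q, scale = quantize_int8(toy_weights)
restored = dequantize(q, scale)
print(q, scale, restored)
```

Each restored weight differs from the original by at most half a quantization step, which is consistent with the abstract's observation that accuracy stays nearly unchanged while inference becomes faster.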