• Title/Summary/Keyword: Vision Processing


Visual Classification of Wood Knots Using k-Nearest Neighbor and Convolutional Neural Network (k-Nearest Neighbor와 Convolutional Neural Network에 의한 제재목 표면 옹이 종류의 화상 분류)

  • Kim, Hyunbin;Kim, Mingyu;Park, Yonggun;Yang, Sang-Yun;Chung, Hyunwoo;Kwon, Ohkyung;Yeo, Hwanmyeong
    • Journal of the Korean Wood Science and Technology / v.47 no.2 / pp.229-238 / 2019
  • Various wood defects arise during tree growth or wood processing. To use wood in practice, its quality must therefore be assessed objectively against the usage requirements by classifying its defects accurately. Manual visual grading and species classification, however, can vary with the grader's subjective judgment, so computer-vision-based image analysis is needed to evaluate wood quality objectively and to speed up wood production. In this study, SIFT+k-NN and CNN models were used to classify knots automatically, and their accuracy was analyzed. A total of 1,172 knot images of various shapes from five domestic conifers were used for training and validation. In the SIFT+k-NN model, SIFT extracted features from the knot images and k-NN performed the classification, reaching an accuracy of up to 60.53% at k = 17. The CNN model comprised 8 convolution layers and 3 hidden layers, and its maximum accuracy was 88.09% after 1,205 epochs, higher than that of the SIFT+k-NN model. Moreover, when the number of images differed greatly between knot types, the SIFT+k-NN model tended to be biased toward the knot type with more images, whereas the CNN model showed no drastic bias regardless of the difference. The CNN model therefore performed better in knot classification, and the authors conclude that CNN-based wood knot classification is accurate enough for practical application.
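
The k-NN classification step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: SIFT feature extraction is omitted, the 2-D feature vectors are made up, and the labels "sound" and "dead" are hypothetical knot types that the abstract does not name. The paper reports its best result at k = 17; the toy data here uses k = 3.

```python
from collections import Counter
import math


def knn_predict(train, query, k):
    """Return the majority label among the k training vectors nearest to query.

    train: list of (feature_vector, label) pairs; query: a feature vector.
    """
    # Sort training samples by Euclidean distance to the query and keep k.
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Made-up 2-D features standing in for SIFT-derived vectors (assumption).
train = [
    ((0.10, 0.20), "sound"), ((0.20, 0.10), "sound"), ((0.15, 0.25), "sound"),
    ((0.90, 0.80), "dead"), ((0.80, 0.90), "dead"), ((0.85, 0.75), "dead"),
]
prediction = knn_predict(train, (0.2, 0.2), k=3)
```

Because the vote is a simple count, a class with many more training images can outvote closer minority samples when k is large, which is one plausible reading of the bias the abstract reports for SIFT+k-NN.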

Design and Implementation of OpenCV-based Inventory Management System to build Small and Medium Enterprise Smart Factory (중소기업 스마트공장 구축을 위한 OpenCV 기반 재고관리 시스템의 설계 및 구현)

  • Jang, Su-Hwan;Jeong, Jopil
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.19 no.1 / pp.161-170 / 2019
  • Small and medium enterprise (SME) factories that mass-produce many product types handle a wide variety and a large volume of products, and they waste manpower and money on inventory management. In addition, they have no way to check inventory status in real time, so they suffer economic losses from excess inventory and stock shortages. There are many ways to build a real-time data collection environment, but most are too expensive for SMEs. SME smart factories therefore face a difficult reality, and appropriate countermeasures are hard to find. In this paper, we extend the existing inventory management method by extracting characters from labels carrying barcodes and QR codes, which are widely adopted in current product management, and we evaluate the effect. Technically, the system manages incoming and outgoing products through computer image processing: stock labels and barcodes are recognized and classified automatically using OpenCV preprocessing and the OCR (Optical Character Recognition) function of the Google Vision API, and barcodes are recognized with Zbar. We propose a method that manages inventory through real-time image recognition on a Raspberry Pi, without expensive equipment.

Detection of Zebra-crossing Areas Based on Deep Learning with Combination of SegNet and ResNet (SegNet과 ResNet을 조합한 딥러닝에 기반한 횡단보도 영역 검출)

  • Liang, Han;Seo, Suyoung
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography / v.39 no.3 / pp.141-148 / 2021
  • This paper presents a deep-learning method that combines SegNet and ResNet to detect zebra-crossings. For blind pedestrians to cross safely, a crossing system must know exactly where the zebra-crossings are. Deep-learning-based zebra-crossing detection can be a good solution to this problem. Robotic-vision-based assistive technologies have sprung up over the past few years, focusing on specific scene objects using monocular detectors. These traditional methods have achieved significant results, though with relatively long processing times, and have greatly enhanced zebra-crossing perception. However, running all detectors jointly incurs long latency and becomes computationally prohibitive on wearable embedded systems. In this paper, we propose a model for fast and stable segmentation of zebra-crossings from captured images. The model builds on a combination of SegNet and ResNet and consists of three steps. First, the input image is subsampled to extract image features, with a modified ResNet convolutional neural network serving as the new encoder. Second, the original SegNet up-sampling network restores the abstract features to the original image size. Finally, the method classifies all pixels and calculates the accuracy of each pixel. The experimental results demonstrate the efficiency of the modified semantic segmentation algorithm, which runs at a relatively high computing speed.

Filtering-Based Method and Hardware Architecture for Drivable Area Detection in Road Environment Including Vegetation (초목을 포함한 도로 환경에서 주행 가능 영역 검출을 위한 필터링 기반 방법 및 하드웨어 구조)

  • Kim, Younghyeon;Ha, Jiseok;Choi, Cheol-Ho;Moon, Byungin
    • KIPS Transactions on Software and Data Engineering / v.11 no.1 / pp.51-58 / 2022
  • Drivable area detection, one of the main functions of advanced driver assistance systems, means detecting the area where a vehicle can drive safely. Because it is closely related to driver safety, it requires high accuracy and real-time operation. To satisfy these conditions, the V-disparity-based method is widely used; it detects the drivable area by calculating the road disparity value in each row of an image. However, the V-disparity-based method can falsely detect a non-road area as road when the disparity value is inaccurate or when the disparity value of an object equals that of the road. In a road environment that includes vegetation, such as a highway or a country road, the vegetation area may be falsely detected as drivable because its disparity characteristics are similar to those of the road. Therefore, this paper proposes a drivable area detection method and hardware architecture that achieve high accuracy in road environments including vegetation by reducing the false detections caused by the V-disparity characteristic. Evaluated on 289 images from the KITTI road dataset, the proposed method shows an accuracy of 90.12% and a recall of 97.96%. In addition, when implemented on an FPGA platform, the proposed hardware architecture uses 8,925 slice registers and 7,066 slice LUTs.
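
The row-wise idea behind V-disparity can be sketched as follows. This is a simplified illustration, not the paper's filtering method or hardware design: it takes the most frequent disparity in each row as the road disparity, whereas published methods typically fit a road profile across rows. The disparity map, the `tolerance` parameter, and the function names are all hypothetical.

```python
from collections import Counter


def v_disparity(disparity_map, max_disparity):
    """Build one disparity histogram per image row (the V-disparity map)."""
    hist = []
    for row in disparity_map:
        counts = Counter(row)
        hist.append([counts.get(d, 0) for d in range(max_disparity + 1)])
    return hist


def road_mask(disparity_map, max_disparity, tolerance=1):
    """Mark pixels whose disparity is close to their row's dominant disparity."""
    mask = []
    histograms = v_disparity(disparity_map, max_disparity)
    for row, row_hist in zip(disparity_map, histograms):
        # Simplification: the most frequent disparity in the row is the road.
        road_d = max(range(len(row_hist)), key=row_hist.__getitem__)
        mask.append([abs(d - road_d) <= tolerance for d in row])
    return mask


# Made-up 3x5 disparity map: road disparity grows toward the bottom rows,
# and the value 6 on the right stands for an obstacle.
disparity = [
    [1, 1, 1, 6, 6],
    [2, 2, 2, 2, 6],
    [3, 3, 3, 3, 3],
]
mask = road_mask(disparity, max_disparity=7)
```

The sketch also shows the failure mode the abstract describes: any pixel whose disparity matches the row's road disparity is marked drivable, whether it belongs to the road or to vegetation.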

SAAnnot-C3Pap: Ground Truth Collection Technique of Playing Posture Using Semi Automatic Annotation Method (SAAnnot-C3Pap: 반자동 주석화 방법을 적용한 연주 자세의 그라운드 트루스 수집 기법)

  • Park, So-Hyun;Kim, Seo-Yeon;Park, Young-Ho
    • KIPS Transactions on Software and Data Engineering / v.11 no.10 / pp.409-418 / 2022
  • In this paper, we propose SAAnnot-C3Pap, a semi-automatic annotation method for obtaining ground truth of a player's posture. In the music domain, ground truth for two-dimensional joint positions has so far been obtained either with OpenPose, a two-dimensional posture estimation method, or by manual labeling. However, automatic annotation methods such as OpenPose are fast but produce inaccurate results. This paper therefore proposes SAAnnot-C3Pap, a semi-automatic annotation method that is a compromise between the two. The proposed approach consists of three main steps: extracting postures using OpenPose, correcting the erroneous parts of the extracted postures using Supervisely, and then analyzing and synchronizing the results of OpenPose and Supervisely. With the proposed method, it was possible to correct the incorrect 2D joint positions detected by OpenPose, solve the problem of detecting two or more people, and obtain ground truth for the playing posture. In the experiment, we compare and analyze the results of OpenPose and of the semi-automatic annotation method SAAnnot-C3Pap proposed in this paper. The comparison shows that the proposed method improves the posture information that OpenPose collected incorrectly.

A modified U-net for crack segmentation by Self-Attention-Self-Adaption neuron and random elastic deformation

  • Zhao, Jin;Hu, Fangqiao;Qiao, Weidong;Zhai, Weida;Xu, Yang;Bao, Yuequan;Li, Hui
    • Smart Structures and Systems / v.29 no.1 / pp.1-16 / 2022
  • Despite recent breakthroughs in deep learning and computer vision, the pixel-wise identification of tiny objects in high-resolution images with complex disturbances remains challenging. This study proposes a modified U-net for tiny crack segmentation in real-world steel-box-girder bridges. The modified U-net adopts the common U-net framework and uses a novel Self-Attention-Self-Adaption (SASA) neuron as its fundamental computing element. The Self-Attention module applies softmax and gate operations to obtain the attention vector, which enables the neuron to focus on the most significant receptive fields when processing large-scale feature maps. The Self-Adaption module consists of a multilayer perceptron subnet and achieves deeper feature extraction inside a single neuron. For data augmentation, a grid-based crack random elastic deformation (CRED) algorithm is designed to enrich the diversity and irregular shapes of distributed cracks. Grid-based uniform control nodes are first set on both the input images and the binary labels, random offsets are then applied to these control nodes, and bilinear interpolation is performed for the remaining pixels. The proposed SASA neuron and CRED algorithm are deployed together to train the modified U-net. A total of 200 raw images with a high resolution of 4928 × 3264 are collected, 160 for training and the remaining 40 for testing. Patches of 512 × 512 are generated as inputs from the original images by a sliding window with an overlap of 256. Results show that the average IoU between the recognized and ground-truth cracks reaches 0.409, which is 29.8% higher than that of the regular U-net. A five-fold cross-validation study verifies that the proposed method is robust to different training and test images. Ablation experiments further demonstrate the effectiveness of the proposed SASA neuron and CRED algorithm. The improvements in average IoU obtained by using the SASA and CRED modules individually add up to the improvement of the full model, indicating that the two modules contribute at different stages of training, the model and the data, respectively.
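
The patch generation and the IoU metric mentioned in this abstract can be sketched as below. This is an illustration under stated assumptions, not the authors' code: the abstract does not say how image borders are handled, so `window_origins` (a hypothetical helper) shifts the last window back to end exactly at the border, and masks are plain nested lists of 0/1 values.

```python
def window_origins(length, patch=512, stride=256):
    """Top-left offsets of patches along one axis (assumes length >= patch).

    Assumption: if the regular grid does not reach the border, one extra
    window is added that ends exactly at the border.
    """
    origins = list(range(0, length - patch + 1, stride))
    if origins[-1] + patch < length:
        origins.append(length - patch)
    return origins


def iou(pred, truth):
    """Intersection over union of two binary masks given as nested lists."""
    inter = union = 0
    for p_row, t_row in zip(pred, truth):
        for p, t in zip(p_row, t_row):
            inter += 1 if (p and t) else 0
            union += 1 if (p or t) else 0
    return inter / union if union else 1.0


xs = window_origins(4928)  # offsets along the image width
ys = window_origins(3264)  # offsets along the image height

pred = [[1, 1, 0], [0, 1, 0]]   # made-up predicted crack mask
truth = [[1, 0, 0], [0, 1, 1]]  # made-up ground-truth mask
score = iou(pred, truth)
```

Under this border assumption, a 4928 × 3264 image yields 19 × 12 = 228 patches; a different border policy would give a different count.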

Improving the Performance of Deep-Learning-Based Ground-Penetrating Radar Cavity Detection Model using Data Augmentation and Ensemble Techniques (데이터 증강 및 앙상블 기법을 이용한 딥러닝 기반 GPR 공동 탐지 모델 성능 향상 연구)

  • Yonguk Choi;Sangjin Seo;Hangilro Jang;Daeung Yoon
    • Geophysics and Geophysical Exploration / v.26 no.4 / pp.211-228 / 2023
  • Ground-penetrating radar (GPR) surveying, a nondestructive geophysical method, is commonly used to monitor embankments. GPR survey results can be complex, depending on the situation, and data processing and interpretation depend on expert experience, which can lead to false detections. The process is also time-intensive. Consequently, various studies have used deep learning methods to detect cavities in GPR survey data. Deep-learning-based approaches require abundant training data, but GPR field survey data are often scarce because of cost and other factors constraining field studies. Therefore, in this study, a deep-learning-based model was developed for cavity detection in embankment GPR surveys using data augmentation strategies. A dataset was constructed by collecting survey data over several years from the same embankment. A You Only Look Once (YOLO) model, commonly used in computer vision for object detection, was employed for this purpose. The optimal data augmentation approach was determined by comparing and analyzing various strategies. After initial model development, a stepwise process including box clustering, transfer learning, self-ensemble, and model ensemble techniques was applied to enhance the final model performance. The evaluation results demonstrate the model's effectiveness in detecting cavities in embankment GPR survey data.

Effective Multi-Modal Feature Fusion for 3D Semantic Segmentation with Multi-View Images (멀티-뷰 영상들을 활용하는 3차원 의미적 분할을 위한 효과적인 멀티-모달 특징 융합)

  • Hye-Lim Bae;Incheol Kim
    • KIPS Transactions on Software and Data Engineering / v.12 no.12 / pp.505-518 / 2023
  • 3D point cloud semantic segmentation is a computer vision task that divides a point cloud into different objects and regions by predicting the class label of each point. Existing 3D semantic segmentation models are limited in fusing multi-modal features sufficiently while preserving the characteristics of both the 2D visual features extracted from RGB images and the 3D geometric features extracted from the point cloud. Therefore, in this paper, we propose MMCA-Net, a novel 3D semantic segmentation model using 2D-3D multi-modal features. The proposed model effectively fuses the two heterogeneous feature types, 2D visual and 3D geometric, by using an intermediate fusion strategy and a multi-modal cross-attention-based fusion operation. The model also extracts context-rich 3D geometric features from an input point cloud of irregularly distributed points by adopting PTv2 as its 3D geometric encoder. We conducted both quantitative and qualitative experiments on the benchmark dataset ScanNetv2 to analyze the performance of the proposed model. In terms of mIoU, the proposed model showed a 9.2% improvement over the PTv2 model, which uses only 3D geometric features, and a 12.12% improvement over the MVPNet model, which uses 2D-3D multi-modal features. These results demonstrate the effectiveness and usefulness of the proposed model.
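
A cross-attention fusion of the kind named in this abstract can be sketched with plain scaled dot-product attention. This is a generic illustration, not MMCA-Net: the abstract does not state which modality supplies the queries or how the result is combined, so using point features as queries and concatenating the output are assumptions, learned projection matrices are omitted, and the feature values are made up.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product attention: each query attends over all keys."""
    scale = math.sqrt(len(keys[0]))
    out = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in keys]
        weights = softmax(scores)
        # Weighted sum of the value vectors, one output dimension at a time.
        out.append([sum(w * v[i] for w, v in zip(weights, values))
                    for i in range(len(values[0]))])
    return out


point_feats = [[1.0, 0.0], [0.0, 1.0]]  # hypothetical 3D geometric features
pixel_feats = [[4.0, 0.0], [0.0, 4.0]]  # hypothetical 2D visual features

# Assumption: 3D features act as queries, 2D features as keys and values.
attended = cross_attention(point_feats, pixel_feats, pixel_feats)
# Assumption: fusion by concatenating each point feature with its attended vector.
fused = [p + a for p, a in zip(point_feats, attended)]
```

Each point ends up with a 2D-feature summary weighted toward the pixels it matches best, which is the general mechanism by which cross attention lets one modality select information from the other.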

Analysis of the application of image quality assessment method for mobile tunnel scanning system (이동식 터널 스캐닝 시스템의 이미지 품질 평가 기법의 적용성 분석)

  • Chulhee Lee;Dongku Kim;Donggyou Kim
    • Journal of Korean Tunnelling and Underground Space Association / v.26 no.4 / pp.365-384 / 2024
  • Scanning technology is developing rapidly to enable automated inspection that is safer and more efficient than human-based inspection. Research on automatically detecting facility damage from images collected with computer vision technology is also increasing. The pixel size, quality, and quantity of images can affect the performance of deep learning or image processing for automatic damage detection. This basic study addresses the acquisition of high-quality raw image data and the camera performance of a mobile tunnel scanning system for deep-learning-based automatic damage detection, and proposes a method to evaluate image quality quantitatively. A test chart was attached to a panel device capable of simulating a moving speed of 40 km/h, and an indoor test was performed using the international standard ISO 12233 method. Existing image quality evaluation methods were applied to the images obtained in the indoor experiments. The shutter speed of the camera was found to be closely related to the motion blur in the image. The modulation transfer function (MTF), one of the image quality evaluation methods, can evaluate image quality objectively and was judged to be consistent with visual observation.
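
The link between shutter speed and motion blur, and the contrast ratio behind the MTF, can be made concrete with a small sketch. It is illustrative only and not the ISO 12233 procedure: the 1/1000 s exposure and the pixel intensities are hypothetical, and the MTF is computed here as the ratio of image modulation to target modulation at a single spatial frequency.

```python
def motion_blur_mm(speed_kmh, exposure_s):
    """Distance the scene moves relative to the camera during one exposure (mm)."""
    speed_mm_per_s = speed_kmh * 1_000_000 / 3600
    return speed_mm_per_s * exposure_s


def modulation(i_max, i_min):
    """Michelson contrast of a bar pattern from its brightest and darkest values."""
    return (i_max - i_min) / (i_max + i_min)


def mtf(image_max, image_min, target_max, target_min):
    """Ratio of the contrast in the captured image to that of the test target."""
    return modulation(image_max, image_min) / modulation(target_max, target_min)


# At the 40 km/h speed from the abstract with a hypothetical 1/1000 s exposure.
blur = motion_blur_mm(40, 1 / 1000)

# Hypothetical intensities: an ideal black/white target imaged as 200/50.
ratio = mtf(200, 50, 255, 0)
```

At 40 km/h the scene moves roughly 11 mm during a 1/1000 s exposure, so halving the exposure time halves the blur distance, consistent with the reported dependence of motion blur on shutter speed.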

A Study on the establishment of IoT management process in terms of business according to Paradigm Shift (패러다임 전환에 의한 기업 측면의 IoT 경영 프로세스 구축방안 연구)

  • Jeong, Min-Eui;Yu, Song-Jin
    • Journal of Intelligence and Information Systems / v.21 no.2 / pp.151-171 / 2015
  • This study examined the concept of the Internet of Things (IoT), its major issues, and IoT trends in the domestic and international markets. It also reviewed the advent of the IoT era, which has caused a 'paradigm shift', and proposed an appropriate response strategy from the enterprise perspective. Global competition has begun in the IoT market. For businesses to be competitive and responsive, the efforts of both the government and the companies themselves are needed. In particular, coping appropriately with a dynamic environment requires a faster and more efficient strategy. In other words, the study proposed a management strategy that can respond to the competitive IoT era at its tipping point through the vision of a paradigm shift. By comparing past management paradigms with the IoT management paradigm, we forecast the emergence of the following paradigms: I) knowledge- and learning-oriented management, II) technology- and innovation-oriented management, III) demand-driven management, and IV) global collaboration management. The knowledge- and learning-oriented management paradigm is expected to become a new management paradigm owing to the development of IT and information processing technology. With the rapid development of IT infrastructure and of data processing and storage, knowledge sharing and learning have become more important. The current hardware-oriented management paradigm will shift to a software-oriented one. In particular, the software and platform market, a key component of the IoT ecosystem, is expected to be led by technology- and innovation-oriented management. In 2011, Gartner announced the concept of "Demand-Driven Value Networks (DDVN)", which emphasizes the value of the network as a whole. The demand-driven management paradigm therefore creates demand through advanced processes rather than simply responding to demand. The global collaboration management paradigm creates value through fusion between technologies, between countries, and between industries. In particular, cooperation between enterprises with financial resources and brand power and venture companies with creative ideas and technical skills will generate positive synergies, building a win-win environment for large enterprises and small companies. To cope with the paradigm shift and establish a management strategy for enterprise processes, this study used the 'RTE cyclone model' proposed by Gartner. The RTE (Real Time Enterprise) concept consists of three stages: Lead, Operate, and Manage. The Lead stage uses capital to strengthen business competitiveness; its goal is to link external stimuli to strategy development, and it executes the company's business strategy for capital and investment activities and environmental changes. The Manage stage responds appropriately to threats and internalizes the goals of the enterprise. The Operate stage acts to increase the efficiency of services across the enterprise and integrates and simplifies processes using real-time data capture. The RTE concept has practical value for management strategy. Applying it, we propose an 'IoT-RTE Cyclone model' that emphasizes the agility of the enterprise and is based on real-time monitoring, analysis, and action through IT and IoT technology. The IoT-RTE Cyclone model can integrate the business processes of each sector of the enterprise and support the overall service, so it can be used as an effective response strategy for enterprises. In particular, the model responds to external events, and waste elements are removed as the process is repeated, making the operation of the process more efficient and agile. Because it supports the overall service of the enterprise, the model can serve as an effective response strategy in the rapidly changing IoT era. When the model leverages a collaborative system among enterprises, breakthrough cost savings are expected through competitiveness, global lead time, and minimized duplication.