• Title/Abstract/Keywords: computer vision systems

602 search results; processing time 0.022 s

Deformation estimation of truss bridges using two-stage optimization from cameras

  • Jau-Yu Chou;Chia-Ming Chang
    • Smart Structures and Systems / Vol. 31, No. 4 / pp.409-419 / 2023
  • Structural integrity can be assessed from the dynamic deformations of structures, and these deformations can be acquired with non-contact sensors such as video cameras. The Kanade-Lucas-Tomasi (KLT) algorithm is one of the commonly used methods for motion tracking. However, averaging over the extracted features would induce bias in the measurement. In addition, pixel-wise measurements can be converted to physical units through the camera intrinsics. Still, depth information is unavailable without prior knowledge of the spatial information. The assigned homogeneous coordinates would then mismatch the manually selected feature points, resulting in measurement errors during coordinate transformation. In this study, a two-stage optimization method for video-based measurements is proposed. The manually selected feature points are first optimized by minimizing their errors relative to the homogeneous coordinates. Then, the optimized points are used by the KLT algorithm to extract displacements through inverse projection. Two additional criteria are employed to eliminate outliers from KLT, resulting in more reliable displacement responses. The second-stage optimization subsequently fine-tunes the geometry of the selected coordinates. The optimization process also considers the number of interpolation points at different depths of an image to reduce the effect of out-of-plane motions. The proposed method is numerically investigated by using a truss bridge as a physics-based graphic model (PBGM) to extract high-accuracy displacements from recorded videos under various capturing angles and structural conditions.

Automatic assessment of post-earthquake buildings based on multi-task deep learning with auxiliary tasks

  • Zhihang Li;Huamei Zhu;Mengqi Huang;Pengxuan Ji;Hongyu Huang;Qianbing Zhang
    • Smart Structures and Systems / Vol. 31, No. 4 / pp.383-392 / 2023
  • Post-earthquake building condition assessment is crucial for subsequent rescue and remediation and can be automated by emerging computer vision and deep learning technologies. This study is based on an entry for the 2nd International Competition of Structural Health Monitoring (IC-SHM 2021). The task package includes five image segmentation objectives - defects (crack/spall/rebar exposure), structural component, and damage state. The structural component and damage state tasks are identified as the priority because they can inform actionable decisions. A multi-task Convolutional Neural Network (CNN) is proposed to conduct these two major tasks simultaneously. The remaining three sub-tasks (spall/crack/rebar exposure) are incorporated as auxiliary tasks. By simultaneously learning defect information (spall/crack/rebar exposure), the multi-task CNN model outperforms its single-task counterparts in recognizing structural components and estimating damage states. In particular, the pixel-level damage state estimation sees an mIoU (mean intersection over union) improvement from 0.5855 to 0.6374. For the defect detection tasks, rebar exposure is omitted due to the extremely biased sample distribution. The segmentation of cracks and spalls is automated by a single-task U-Net, with extra effort spent on resampling the provided data. The segmentation of small objects (spall and crack) benefits from the resampling method, with a substantial IoU increment of nearly 10%.
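
The mIoU metric quoted in this abstract can be illustrated with a minimal sketch. It assumes flattened per-pixel label lists and averages IoU only over classes that appear in the prediction or the ground truth; the function name `mean_iou` and the toy data are hypothetical and are not taken from the paper.

```python
def mean_iou(pred, target, num_classes):
    """Mean intersection over union, averaged over classes present in pred or target."""
    ious = []
    for c in range(num_classes):
        # Pixels labelled c in both maps (intersection) and in either map (union).
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:  # skip classes absent from both maps
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy flattened label maps (hypothetical data, not from the paper).
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
print(mean_iou(pred, target, num_classes=3))
```

Whether absent classes are skipped or counted as zero is a design choice that changes the reported number, so the convention used by a given benchmark should be checked before comparing scores.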

Ship Number Recognition Method Based on An improved CRNN Model

  • Wenqi Xu;Yuesheng Liu;Ziyang Zhong;Yang Chen;Jinfeng Xia;Yunjie Chen
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 17, No. 3 / pp.740-753 / 2023
  • Text recognition in natural scene images is a challenging problem in computer vision. The accurate identification of ship number characters can effectively improve the level of ship traffic management. However, due to the blurring caused by motion and to text occlusion, the accuracy of ship number recognition rarely meets practical requirements. To solve these problems, this paper proposes a dual-branch network based on the CRNN recognition network. The network couples image restoration and character recognition. A CycleGAN module is used for the blur restoration branch, and a Pix2pix module is used for the character occlusion branch. The two are coupled to reduce the impact of image blur and occlusion. The restored image is then fed into the text recognition branch to improve recognition accuracy. Extensive experiments show that the model is robust and easy to train. Experiments on CTW datasets and real ship maps illustrate that the method yields more accurate results.

Deep Local Multi-level Feature Aggregation Based High-speed Train Image Matching

  • Li, Jun;Li, Xiang;Wei, Yifei;Wang, Xiaojun
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 16, No. 5 / pp.1597-1610 / 2022
  • At present, high-speed train chassis detection mainly relies on computer vision technology: keypoints are first extracted from two related chassis images, the keypoints are then matched to find the pixel-level correspondence between the two images, and finally detection and other steps are performed. The quality and accuracy of image matching are very important for subsequent defect detection. Current traditional matching methods struggle to meet practical requirements for generalization to complex scenes involving weather, illumination, and seasonal changes. Therefore, it is of great significance to study high-speed train image matching methods based on deep learning. This paper establishes a high-speed train chassis image matching dataset, including random perspective changes and optical distortion, to simulate the changes in the actual working environment of the high-speed rail system as closely as possible. This work designs a convolutional neural network to intensively extract keypoints, so as to alleviate the problems of current methods. With multi-level features, the network restores low-level details, thereby improving the localization accuracy of keypoints, and it can also generate robust keypoint descriptors. Detailed experiments show a substantial improvement of the proposed network over traditional methods.

Updating BIM: Reflecting Thermographic Sensing in BIM-based Building Energy Analysis

  • Ham, Youngjib;Golparvar-Fard, Mani
    • International Conference Proceedings / The 6th International Conference on Construction Engineering and Project Management / pp.532-536 / 2015
  • This paper presents an automated computer vision-based system to update BIM data by leveraging multi-modal visual data collected from existing buildings under inspection. Currently, visual inspections are conducted for building envelopes or mechanical systems, and auditors analyze energy-related contextual information to examine whether their performance is maintained as expected by the design. By translating 3D surface thermal profiles into energy performance metrics such as actual R-values at point-level and by mapping such properties to the associated BIM elements using XML Document Object Model (DOM), the proposed method narrows the energy performance modeling gap between the architectural information in the as-designed BIM and the as-is building condition, which improves the reliability of building energy analysis. The experimental results on existing buildings show that (1) the point-level thermography-based thermal resistance measurement can be automatically matched with the associated BIM elements; and (2) their corresponding thermal properties are automatically updated in gbXML schema. This paper provides practitioners with insight into how multi-modal visual data can be used to improve the accuracy of building energy modeling for retrofit analysis. Open research challenges and lessons learned from real-world case studies are discussed in detail.

Two-Stage Deep Learning Based Algorithm for Cosmetic Object Recognition

  • 김종민;서대호
    • 산업경영시스템학회지 / Vol. 46, No. 4 / pp.101-106 / 2023
  • With the recent surge in YouTube usage, there has been a proliferation of user-generated videos in which individuals evaluate cosmetics. Consequently, many companies are increasingly using evaluation videos for their product marketing and market research. However, a notable drawback is that manually classifying these product review videos incurs significant cost and time. Therefore, this paper proposes a deep learning-based cosmetics search algorithm to automate this task. The algorithm consists of two networks: one for detecting candidates in images using shape features such as circles and rectangles, and another for filtering and categorizing these candidates. The reason for choosing a Two-Stage architecture over One-Stage is that, in videos containing background scenes, it is more robust to first detect cosmetic candidates before classifying them as specific objects. Although Two-Stage structures are generally known to outperform One-Stage structures in terms of model architecture, this study opts for Two-Stage to address issues related to the acquisition of training and validation data that arise when using One-Stage. Acquiring data for the algorithm that detects cosmetic candidates based on shape and the algorithm that classifies candidates into specific objects is cost-effective, ensuring the overall robustness of the algorithm.

Object-aware Depth Estimation for Developing Collision Avoidance System

  • 황규태;송지민;이상준
    • 대한임베디드공학회논문지 / Vol. 19, No. 2 / pp.91-99 / 2024
  • A collision avoidance system is important for improving the robustness and functional safety of autonomous vehicles. This paper proposes an object-level distance estimation method for developing a collision avoidance system, and applies it to golf carts used in country club environments. To improve the detection accuracy, we continually trained an object detection model on pseudo labels generated by a pre-trained detector. Moreover, we propose an object-aware depth estimation (OADE) method, which trains a depth model focusing on object regions. In the OADE algorithm, we generated dense depth information for object regions by utilizing detection results and sparse LiDAR points; this is referred to as object-aware LiDAR projection (OALP). By using the OALP maps, a depth estimation model was trained by backpropagating more gradients of the loss on object regions. Experiments were conducted on our custom dataset, which was collected over a travel distance of 22 km on 54 holes in three country clubs under various weather conditions. The precision and recall were improved from 70.5% and 49.1% to 95.3% and 92.1%, respectively, after the continual learning with pseudo labels. Moreover, the OADE algorithm reduces the absolute relative error from 4.76% to 4.27% for estimating distances to obstacles.
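
The evaluation metrics named in this abstract (precision, recall, and absolute relative error) can be sketched as follows. This is an illustrative sketch of the standard definitions only; the function names and the toy numbers are hypothetical and do not reproduce the paper's evaluation code or data.

```python
def abs_rel_error(pred_distances, true_distances):
    """Mean of |pred - true| / true over all samples (true distances must be > 0)."""
    pairs = list(zip(pred_distances, true_distances))
    return sum(abs(p - t) / t for p, t in pairs) / len(pairs)


def precision_recall(tp, fp, fn):
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical distances in metres, not taken from the paper.
predicted = [9.5, 21.0, 38.0]
measured = [10.0, 20.0, 40.0]
print(abs_rel_error(predicted, measured))      # each sample is off by 5%
print(precision_recall(tp=8, fp=2, fn=2))
```

Because the error is normalized by the true distance, a fixed absolute error counts more for nearby obstacles than for distant ones, which suits a collision avoidance setting.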

Leveraging Deep Learning and Farmland Fertility Algorithm for Automated Rice Pest Detection and Classification Model

  • Hussain. A;Balaji Srikaanth. P
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 18, No. 4 / pp.959-979 / 2024
  • Rice pest identification is essential in modern agriculture for the health of rice crops. As global rice consumption rises, yields and quality must be maintained. Various methodologies have been employed to identify pests, encompassing sensor-based technologies, deep learning, and remote sensing models. Visual inspection by professionals and farmers remains essential, but integrating technology such as satellites, IoT-based sensors, and drones enhances efficiency and accuracy. A computer vision system processes images to detect pests automatically and provides real-time data for proactive and targeted pest management. With this motivation, this research presents a novel farmland fertility algorithm with a deep learning-based automated rice pest detection and classification (FFADL-ARPDC) technique. The FFADL-ARPDC approach classifies rice pests from rice plant images. Before processing, FFADL-ARPDC removes noise and enhances contrast using bilateral filtering (BF). Additionally, rice crop images are processed using the NASNetLarge deep learning architecture to extract image features. The FFA is used for hyperparameter tuning to optimise the performance of NASNetLarge, which aids in enhancing classification performance. Using an Elman recurrent neural network (ERNN), the model categorises 14 types of pests. The FFADL-ARPDC approach is evaluated using a benchmark dataset available in a public repository. With an accuracy of 97.58, the FFADL-ARPDC model exceeds existing pest detection methods.

Distortion Removal and False Positive Filtering for Camera-based Object Position Estimation

  • 진실;송지민;최지호;진용식;정재진;이상준
    • 대한임베디드공학회논문지 / Vol. 19, No. 1 / pp.1-8 / 2024
  • Robotic arms have been widely utilized in various labor-intensive industries such as manufacturing, agriculture, and food services, contributing to increased productivity. In the development of industrial robotic arms, camera sensors have many advantages due to their cost-effectiveness and small size. However, estimating object positions is a challenging problem, and it critically affects the robustness of object manipulation functions. This paper proposes a method for estimating the 3D positions of objects and applies it to a pick-and-place task. A deep learning model is utilized to detect 2D bounding boxes in the image plane, and the pinhole camera model is employed to compute the object positions. To improve the robustness of measuring the 3D positions of objects, we analyze the effect of lens distortion and introduce a false positive filtering process. Experiments were conducted in a real-world scenario of moving medicine bottles with a camera-based manipulator. Experimental results demonstrated that the distortion removal and false positive filtering are effective in improving the position estimation precision and the manipulation success rate.
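
The step from a 2D bounding box to a 3D position via the pinhole camera model can be sketched as below. The sketch assumes that lens distortion has already been removed and that the depth of the object along the optical axis is known; the abstract does not say how depth is obtained, so this is an assumption. The function names, intrinsic parameters, and box coordinates are all hypothetical.

```python
def bbox_center(x_min, y_min, x_max, y_max):
    """Pixel coordinates (u, v) of the bounding-box center."""
    return (x_min + x_max) / 2.0, (y_min + y_max) / 2.0


def back_project(u, v, depth, fx, fy, cx, cy):
    """Pinhole model: X = (u - cx) * Z / fx, Y = (v - cy) * Z / fy, Z = depth.

    Assumes undistorted pixel coordinates and a known depth (an assumption here).
    """
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return x, y, depth


# Hypothetical intrinsics (pixels) and a hypothetical detection box.
fx, fy, cx, cy = 800.0, 800.0, 320.0, 240.0
u, v = bbox_center(400, 200, 480, 280)
position = back_project(u, v, depth=0.5, fx=fx, fy=fy, cx=cx, cy=cy)
print(position)  # camera-frame coordinates in the same unit as depth
```

The result is expressed in the camera frame; a manipulator would additionally need a camera-to-robot transform, which is outside this sketch.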

Efficient Recognition of Easily-confused Chinese Herbal Slices Images Using Enhanced ResNeSt

  • Qi Zhang;Jinfeng Ou;Huaying Zhou
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 18, No. 8 / pp.2103-2118 / 2024
  • Automated recognition of Chinese herbal slices (CHS) based on computer vision plays a critical role in the practical application of intelligent Chinese medicine. Due to the complexity and similarity of herbal images, identifying Chinese herbal slices is still a challenging task. In particular, easily-confused CHS have higher inter-class and intra-class complexity and similarity, and existing deep learning models are poorly suited to identifying them efficiently. To address these problems, a novel tiny easily-confused CHS dataset is first built, which includes six pairs of twelve categories with about 2395 samples. Furthermore, we propose a ResNeSt-CHS model that combines multilevel perception fusion (MPF) and perceptive sparse fusion (PSF) blocks for efficiently recognizing easily-confused CHS images. To verify the superiority of ResNeSt-CHS and the effectiveness of our dataset, experiments were conducted, validating that ResNeSt-CHS is optimal for easily-confused CHS recognition, with a 2.1% improvement over the original ResNeSt model. Additionally, the results indicate that ResNeSt-CHS achieves high accuracy even on a relatively small-scale dataset. This model has obtained state-of-the-art easily-confused CHS classification performance, with an accuracy of 90.8%, far beyond other models (EfficientNet, Transformer, ResNeSt, etc.) in terms of the evaluation criteria.