• Title/Summary/Keyword: computer vision systems

Search Result 600, Processing Time 0.032 seconds

Ensemble-based deep learning for autonomous bridge component and damage segmentation leveraging Nested Reg-UNet

  • Abhishek Subedi;Wen Tang;Tarutal Ghosh Mondal;Rih-Teng Wu;Mohammad R. Jahanshahi
    • Smart Structures and Systems
    • /
    • v.31 no.4
    • /
    • pp.335-349
    • /
    • 2023
  • Bridges constantly undergo deterioration and damage, the most common ones being concrete damage and exposed rebar. Periodic inspection of bridges to identify damages can aid in their quick remediation. Likewise, identifying components can provide context for damage assessment and help gauge a bridge's state of interaction with its surroundings. Current inspection techniques rely on manual site visits, which can be time-consuming and costly. More recently, robotic inspection assisted by autonomous data analytics based on Computer Vision (CV) and Artificial Intelligence (AI) has been viewed as a suitable alternative to manual inspection because of its efficiency and accuracy. To aid research in this avenue, this study performs a comparative assessment of different architectures, loss functions, and ensembling strategies for the autonomous segmentation of bridge components and damages. The experiments lead to several interesting discoveries. Nested Reg-UNet architecture is found to outperform five other state-of-the-art architectures in both damage and component segmentation tasks. The architecture is built by combining a Nested UNet style dense configuration with a pretrained RegNet encoder. In terms of the mean Intersection over Union (mIoU) metric, the Nested Reg-UNet architecture provides an improvement of 2.86% on the damage segmentation task and 1.66% on the component segmentation task compared to the state-of-the-art UNet architecture. Furthermore, it is demonstrated that incorporating the Lovasz-Softmax loss function to counter class imbalance can boost performance by 3.44% in the component segmentation task over the most employed alternative, weighted Cross Entropy (wCE). Finally, weighted softmax ensembling is found to be quite effective when used synchronously with the Nested Reg-UNet architecture by providing mIoU improvement of 0.74% in the component segmentation task and 1.14% in the damage segmentation task over a single-architecture baseline. Overall, the best mIoU of 92.50% for the component segmentation task and 84.19% for the damage segmentation task validate the feasibility of these techniques for autonomous bridge component and damage segmentation using RGB images.

Twin models for high-resolution visual inspections

  • Seyedomid Sajedi;Kareem A. Eltouny;Xiao Liang
    • Smart Structures and Systems
    • /
    • v.31 no.4
    • /
    • pp.351-363
    • /
    • 2023
  • Visual structural inspections are an inseparable part of post-earthquake damage assessments. With unmanned aerial vehicles (UAVs) establishing a new frontier in visual inspections, there are major computational challenges in processing the collected massive amounts of high-resolution visual data. We propose twin deep learning models that can provide accurate high-resolution structural components and damage segmentation masks efficiently. The traditional approach to cope with high memory computational demands is to either uniformly downsample the raw images at the price of losing fine local details or cropping smaller parts of the images leading to a loss of global contextual information. Therefore, our twin models comprising Trainable Resizing for high-resolution Segmentation Network (TRS-Net) and DmgFormer approaches the global and local semantics from different perspectives. TRS-Net is a compound, high-resolution segmentation architecture equipped with learnable downsampler and upsampler modules to minimize information loss for optimal performance and efficiency. DmgFormer utilizes a transformer backbone and a convolutional decoder head with skip connections on a grid of crops aiming for high precision learning without downsizing. An augmented inference technique is used to boost performance further and reduce the possible loss of context due to grid cropping. Comprehensive experiments have been performed on the 3D physics-based graphics models (PBGMs) synthetic environments in the QuakeCity dataset. The proposed framework is evaluated using several metrics on three segmentation tasks: component type, component damage state, and global damage (crack, rebar, spalling). The models were developed as part of the 2nd International Competition for Structural Health Monitoring.

Design and Implementation of a Concentration-based Review Support Tool for Real-time Online Class Participants (실시간 온라인 수업 수강자들의 집중력 기반 복습 지원 도구의 설계 및 구현)

  • Tae-Hwan Kim;Dae-Soo Cho;Seung-Min Park
    • The Journal of the Korea institute of electronic communication sciences
    • /
    • v.18 no.3
    • /
    • pp.521-526
    • /
    • 2023
  • Due to the recent pandemic, most educational systems are being conducted through online classes. Unlike face-to-face classes, it is even more difficult for learners to maintain concentration, and evaluating the learners' attitude toward the class is also challenging. In this paper, we proposed a real-time concentration-based review support system for learners in real-time video lectures that can be used in online classes. This system measured the learner's face, pupils, and user activity in real-time using the equipment used in the existing video system, and delivers real-time concentration measurement values to the instructor in various forms. At the same time, if the concentration measurement value falls below a certain level, the system alerted the learner and records the timestamp of the lecture. By using this system, instructors can evaluate the learners' participation in the class in real-time and help to improve their class abilities.

Dynamic characteristics monitoring of wind turbine blades based on improved YOLOv5 deep learning model

  • W.H. Zhao;W.R. Li;M.H. Yang;N. Hong;Y.F. Du
    • Smart Structures and Systems
    • /
    • v.31 no.5
    • /
    • pp.469-483
    • /
    • 2023
  • The dynamic characteristics of wind turbine blades are usually monitored by contact sensors with the disadvantages of high cost, difficult installation, easy damage to the structure, and difficult signal transmission. In view of the above problems, based on computer vision technology and the improved YOLOv5 (You Only Look Once v5) deep learning model, a non-contact dynamic characteristic monitoring method for wind turbine blade is proposed. First, the original YOLOv5l model of the CSP (Cross Stage Partial) structure is improved by introducing the CSP2_2 structure, which reduce the number of residual components to better the network training speed. On this basis, combined with the Deep sort algorithm, the accuracy of structural displacement monitoring is mended. Secondly, for the disadvantage that the deep learning sample dataset is difficult to collect, the blender software is used to model the wind turbine structure with conditions, illuminations and other practical engineering similar environments changed. In addition, incorporated with the image expansion technology, a modeling-based dataset augmentation method is proposed. Finally, the feasibility of the proposed algorithm is verified by experiments followed by the analytical procedure about the influence of YOLOv5 models, lighting conditions and angles on the recognition results. The results show that the improved YOLOv5 deep learning model not only perform well compared with many other YOLOv5 models, but also has high accuracy in vibration monitoring in different environments. The method can accurately identify the dynamic characteristics of wind turbine blades, and therefore can provide a reference for evaluating the condition of wind turbine blades.

Sparse Class Processing Strategy in Image-based Livestock Defect Detection (이미지 기반 축산물 불량 탐지에서의 희소 클래스 처리 전략)

  • Lee, Bumho;Cho, Yesung;Yi, Mun Yong
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.26 no.11
    • /
    • pp.1720-1728
    • /
    • 2022
  • The industrial 4.0 era has been opened with the development of artificial intelligence technology, and the realization of smart farms incorporating ICT technology is receiving great attention in the livestock industry. Among them, the quality management technology of livestock products and livestock operations incorporating computer vision-based artificial intelligence technology represent key technologies. However, the insufficient number of livestock image data for artificial intelligence model training and the severely unbalanced ratio of labels for recognizing a specific defective state are major obstacles to the related research and technology development. To overcome these problems, in this study, combining oversampling and adversarial case generation techniques is proposed as a method necessary to effectively utilizing small data labels for successful defect detection. In addition, experiments comparing performance and time cost of the applicable techniques were conducted. Through experiments, we confirm the validity of the proposed methods and draw utilization strategies from the study results.

Joint Reasoning of Real-time Visual Risk Zone Identification and Numeric Checking for Construction Safety Management

  • Ali, Ahmed Khairadeen;Khan, Numan;Lee, Do Yeop;Park, Chansik
    • International conference on construction engineering and project management
    • /
    • 2020.12a
    • /
    • pp.313-322
    • /
    • 2020
  • The recognition of the risk hazards is a vital step to effectively prevent accidents on a construction site. The advanced development in computer vision systems and the availability of the large visual database related to construction site made it possible to take quick action in the event of human error and disaster situations that may occur during management supervision. Therefore, it is necessary to analyze the risk factors that need to be managed at the construction site and review appropriate and effective technical methods for each risk factor. This research focuses on analyzing Occupational Safety and Health Agency (OSHA) related to risk zone identification rules that can be adopted by the image recognition technology and classify their risk factors depending on the effective technical method. Therefore, this research developed a pattern-oriented classification of OSHA rules that can employ a large scale of safety hazard recognition. This research uses joint reasoning of risk zone Identification and numeric input by utilizing a stereo camera integrated with an image detection algorithm such as (YOLOv3) and Pyramid Stereo Matching Network (PSMNet). The research result identifies risk zones and raises alarm if a target object enters this zone. It also determines numerical information of a target, which recognizes the length, spacing, and angle of the target. Applying image detection joint logic algorithms might leverage the speed and accuracy of hazard detection due to merging more than one factor to prevent accidents in the job site.

  • PDF

Dual-stream Co-enhanced Network for Unsupervised Video Object Segmentation

  • Hongliang Zhu;Hui Yin;Yanting Liu;Ning Chen
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.18 no.4
    • /
    • pp.938-958
    • /
    • 2024
  • Unsupervised Video Object Segmentation (UVOS) is a highly challenging problem in computer vision as the annotation of the target object in the testing video is unknown at all. The main difficulty is to effectively handle the complicated and changeable motion state of the target object and the confusion of similar background objects in video sequence. In this paper, we propose a novel deep Dual-stream Co-enhanced Network (DC-Net) for UVOS via bidirectional motion cues refinement and multi-level feature aggregation, which can fully take advantage of motion cues and effectively integrate different level features to produce high-quality segmentation mask. DC-Net is a dual-stream architecture where the two streams are co-enhanced by each other. One is a motion stream with a Motion-cues Refine Module (MRM), which learns from bidirectional optical flow images and produces fine-grained and complete distinctive motion saliency map, and the other is an appearance stream with a Multi-level Feature Aggregation Module (MFAM) and a Context Attention Module (CAM) which are designed to integrate the different level features effectively. Specifically, the motion saliency map obtained by the motion stream is fused with each stage of the decoder in the appearance stream to improve the segmentation, and in turn the segmentation loss in the appearance stream feeds back into the motion stream to enhance the motion refinement. Experimental results on three datasets (Davis2016, VideoSD, SegTrack-v2) demonstrate that DC-Net has achieved comparable results with some state-of-the-art methods.

A Collaborative Video Annotation and Browsing System using Linked Data (링크드 데이터를 이용한 협업적 비디오 어노테이션 및 브라우징 시스템)

  • Lee, Yeon-Ho;Oh, Kyeong-Jin;Sean, Vi-Sal;Jo, Geun-Sik
    • Journal of Intelligence and Information Systems
    • /
    • v.17 no.3
    • /
    • pp.203-219
    • /
    • 2011
  • Previously common users just want to watch the video contents without any specific requirements or purposes. However, in today's life while watching video user attempts to know and discover more about things that appear on the video. Therefore, the requirements for finding multimedia or browsing information of objects that users want, are spreading with the increasing use of multimedia such as videos which are not only available on the internet-capable devices such as computers but also on smart TV and smart phone. In order to meet the users. requirements, labor-intensive annotation of objects in video contents is inevitable. For this reason, many researchers have actively studied about methods of annotating the object that appear on the video. In keyword-based annotation related information of the object that appeared on the video content is immediately added and annotation data including all related information about the object must be individually managed. Users will have to directly input all related information to the object. Consequently, when a user browses for information that related to the object, user can only find and get limited resources that solely exists in annotated data. Also, in order to place annotation for objects user's huge workload is required. To cope with reducing user's workload and to minimize the work involved in annotation, in existing object-based annotation automatic annotation is being attempted using computer vision techniques like object detection, recognition and tracking. By using such computer vision techniques a wide variety of objects that appears on the video content must be all detected and recognized. But until now it is still a problem facing some difficulties which have to deal with automated annotation. To overcome these difficulties, we propose a system which consists of two modules. The first module is the annotation module that enables many annotators to collaboratively annotate the objects in the video content in order to access the semantic data using Linked Data. Annotation data managed by annotation server is represented using ontology so that the information can easily be shared and extended. Since annotation data does not include all the relevant information of the object, existing objects in Linked Data and objects that appear in the video content simply connect with each other to get all the related information of the object. In other words, annotation data which contains only URI and metadata like position, time and size are stored on the annotation sever. So when user needs other related information about the object, all of that information is retrieved from Linked Data through its relevant URI. The second module enables viewers to browse interesting information about the object using annotation data which is collaboratively generated by many users while watching video. With this system, through simple user interaction the query is automatically generated and all the related information is retrieved from Linked Data and finally all the additional information of the object is offered to the user. With this study, in the future of Semantic Web environment our proposed system is expected to establish a better video content service environment by offering users relevant information about the objects that appear on the screen of any internet-capable devices such as PC, smart TV or smart phone.

An Empirical Study on the Determinants of Supply Chain Management Systems Success from Vendor's Perspective (참여자관점에서 공급사슬관리 시스템의 성공에 영향을 미치는 요인에 관한 실증연구)

  • Kang, Sung-Bae;Moon, Tae-Soo;Chung, Yoon
    • Asia pacific journal of information systems
    • /
    • v.20 no.3
    • /
    • pp.139-166
    • /
    • 2010
  • The supply chain management (SCM) systems have emerged as strong managerial tools for manufacturing firms in enhancing competitive strength. Despite of large investments in the SCM systems, many companies are not fully realizing the promised benefits from the systems. A review of literature on adoption, implementation and success factor of IOS (inter-organization systems), EDI (electronic data interchange) systems, shows that this issue has been examined from multiple theoretic perspectives. And many researchers have attempted to identify the factors which influence the success of system implementation. However, the existing studies have two drawbacks in revealing the determinants of systems implementation success. First, previous researches raise questions as to the appropriateness of research subjects selected. Most SCM systems are operating in the form of private industrial networks, where the participants of the systems consist of two distinct groups: focus companies and vendors. The focus companies are the primary actors in developing and operating the systems, while vendors are passive participants which are connected to the system in order to supply raw materials and parts to the focus companies. Under the circumstance, there are three ways in selecting the research subjects; focus companies only, vendors only, or two parties grouped together. It is hard to find researches that use the focus companies exclusively as the subjects probably due to the insufficient sample size for statistic analysis. Most researches have been conducted using the data collected from both groups. We argue that the SCM success factors cannot be correctly indentified in this case. The focus companies and the vendors are in different positions in many areas regarding the system implementation: firm size, managerial resources, bargaining power, organizational maturity, and etc. There are no obvious reasons to believe that the success factors of the two groups are identical. Grouping the two groups also raises questions on measuring the system success. The benefits from utilizing the systems may not be commonly distributed to the two groups. One group's benefits might be realized at the expenses of the other group considering the situation where vendors participating in SCM systems are under continuous pressures from the focus companies with respect to prices, quality, and delivery time. Therefore, by combining the system outcomes of both groups we cannot measure the system benefits obtained by each group correctly. Second, the measures of system success adopted in the previous researches have shortcoming in measuring the SCM success. User satisfaction, system utilization, and user attitudes toward the systems are most commonly used success measures in the existing studies. These measures have been developed as proxy variables in the studies of decision support systems (DSS) where the contribution of the systems to the organization performance is very difficult to measure. Unlike the DSS, the SCM systems have more specific goals, such as cost saving, inventory reduction, quality improvement, rapid time, and higher customer service. We maintain that more specific measures can be developed instead of proxy variables in order to measure the system benefits correctly. The purpose of this study is to find the determinants of SCM systems success in the perspective of vendor companies. In developing the research model, we have focused on selecting the success factors appropriate for the vendors through reviewing past researches and on developing more accurate success measures. The variables can be classified into following: technological, organizational, and environmental factors on the basis of TOE (Technology-Organization-Environment) framework. The model consists of three independent variables (competition intensity, top management support, and information system maturity), one mediating variable (collaboration), one moderating variable (government support), and a dependent variable (system success). The systems success measures have been developed to reflect the operational benefits of the SCM systems; improvement in planning and analysis capabilities, faster throughput, cost reduction, task integration, and improved product and customer service. The model has been validated using the survey data collected from 122 vendors participating in the SCM systems in Korea. To test for mediation, one should estimate the hierarchical regression analysis on the collaboration. And moderating effect analysis should estimate the moderated multiple regression, examines the effect of the government support. The result shows that information system maturity and top management support are the most important determinants of SCM system success. Supply chain technologies that standardize data formats and enhance information sharing may be adopted by supply chain leader organization because of the influence of focal company in the private industrial networks in order to streamline transactions and improve inter-organization communication. Specially, the need to develop and sustain an information system maturity will provide the focus and purpose to successfully overcome information system obstacles and resistance to innovation diffusion within the supply chain network organization. The support of top management will help focus efforts toward the realization of inter-organizational benefits and lend credibility to functional managers responsible for its implementation. The active involvement, vision, and direction of high level executives provide the impetus needed to sustain the implementation of SCM. The quality of collaboration relationships also is positively related to outcome variable. Collaboration variable is found to have a mediation effect between on influencing factors and implementation success. Higher levels of inter-organizational collaboration behaviors such as shared planning and flexibility in coordinating activities were found to be strongly linked to the vendors trust in the supply chain network. Government support moderates the effect of the IS maturity, competitive intensity, top management support on collaboration and implementation success of SCM. In general, the vendor companies face substantially greater risks in SCM implementation than the larger companies do because of severe constraints on financial and human resources and limited education on SCM systems. Besides resources, Vendors generally lack computer experience and do not have sufficient internal SCM expertise. For these reasons, government supports may establish requirements for firms doing business with the government or provide incentives to adopt, implementation SCM or practices. Government support provides significant improvements in implementation success of SCM when IS maturity, competitive intensity, top management support and collaboration are low. The environmental characteristic of competition intensity has no direct effect on vendor perspective of SCM system success. But, vendors facing above average competition intensity will have a greater need for changing technology. This suggests that companies trying to implement SCM systems should set up compatible supply chain networks and a high-quality collaboration relationship for implementation and performance.

Contour Extraction Method using p-Snake with Prototype Energy (원형에너지가 추가된 p-Snake를 이용한 윤곽선 추출 기법)

  • Oh, Seung-Taek;Jun, Byung-Hwan
    • Journal of the Institute of Electronics and Information Engineers
    • /
    • v.51 no.4
    • /
    • pp.101-109
    • /
    • 2014
  • It is an essential element for the establishment of image processing related systems to find the exact contour from the image of an arbitrary object. In particular, if a vision system is established to inspect the products in the automated production process, it is very important to detect the contours for standardized shapes such lines and curves. In this paper, we propose a prototype adaptive dynamic contour model, p-Snake with improved contour extraction algorithms by adding the prototype energy. The proposed method is to find the initial contour by applying the existing Snake algorithm after Sobel operation is performed for prototype analysis. Next, the final contour of the object is detected by analyzing prototypes such as lines and circles, defining prototype energy and using it as an additional energy item in the existing Snake function on the basis of information on initial contour. We performed experiments on 340 images obtained by using an environment that duplicated the background of an industrial site. It was found that even if objects are not clearly distinguished from the background due to noise and lighting or the edges being insufficiently visible in the images, the contour can be extracted. In addition, in the case of similarity which is the measure representing how much it matches the prototype, the prototype similarity of contour extracted from the proposed p-ACM is superior to that of ACM by 9.85%.