• Title/Summary/Keyword: Multi-Vision

Aircraft Recognition from Remote Sensing Images Based on Machine Vision

  • Chen, Lu;Zhou, Liming;Liu, Jinming
    • Journal of Information Processing Systems
    • /
    • v.16 no.4
    • /
    • pp.795-808
    • /
    • 2020
  • The Yolov3 network scores poorly on evaluation indexes such as detection accuracy and recall rate when it detects aircraft in remote sensing images. In this paper, we propose a machine-vision-based method for aircraft detection in remote sensing images. To improve target detection, the Inception module was introduced into the Yolov3 network structure, and the data set was then cluster-analyzed using the k-means algorithm. To obtain the best aircraft detection model, we adjusted the network parameters of the pre-trained model and increased the resolution of the input image. Finally, our method adopted a multi-scale training model. Experiments on the remote sensing aircraft data of the RSOD-Dataset show that our method improves some evaluation indicators. The experiments also show that our method has good detection and recognition ability for other ground objects.
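The k-means step mentioned in this abstract can be sketched as follows. The abstract does not say what is clustered or which distance is used; this sketch assumes the common Yolo practice of clustering ground-truth box sizes (width, height) with the distance 1 - IoU to obtain anchor boxes. The function names and the box sizes are hypothetical and are not taken from the paper or from RSOD-Dataset.

```python
import random

def iou_wh(box, centroid):
    """IoU of two boxes given as (width, height), aligned at the same corner."""
    inter = min(box[0], centroid[0]) * min(box[1], centroid[1])
    union = box[0] * box[1] + centroid[0] * centroid[1] - inter
    return inter / union

def kmeans_anchors(boxes, k, iters=100, seed=0):
    """Cluster (width, height) pairs with distance 1 - IoU; return anchors sorted by area."""
    rng = random.Random(seed)
    centroids = rng.sample(boxes, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for box in boxes:
            # Highest IoU is the same as smallest (1 - IoU) distance.
            best = max(range(k), key=lambda i: iou_wh(box, centroids[i]))
            clusters[best].append(box)
        new_centroids = []
        for i, members in enumerate(clusters):
            if members:
                w = sum(b[0] for b in members) / len(members)
                h = sum(b[1] for b in members) / len(members)
                new_centroids.append((w, h))
            else:
                new_centroids.append(centroids[i])  # keep the centroid of an empty cluster
        if new_centroids == centroids:
            break  # assignments have stabilised
        centroids = new_centroids
    return sorted(centroids, key=lambda c: c[0] * c[1])

# Hypothetical box sizes in pixels, made up for illustration.
boxes = [(10, 12), (11, 13), (12, 11), (40, 42), (42, 38), (39, 41)]
anchors = kmeans_anchors(boxes, k=2)
```

Using IoU instead of Euclidean distance keeps large boxes from dominating the clustering, which is why it is the usual choice for anchor selection.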

The Ebb and Flow of Regional Integration Vision in Asia-Pacific: From a Lens of Leaders' Declarations over 30 Years

  • Jeongmeen Suh
    • East Asian Economic Review
    • /
    • v.27 no.4
    • /
    • pp.303-325
    • /
    • 2023
  • This paper examines how APEC has transformed itself into an international forum for the vision of regional integration. It aims to quantify the documentation produced by the international organization and to provide quantitative evidence that aligns with prior knowledge rather than relying solely on intuition. For this purpose, I use various text mining techniques to extract multi-dimensional features from the text of APEC Leaders' Declarations from 1993 to 2023. In terms of interest in and expectations for APEC as a forum, members have experienced two major peaks and troughs over the last three decades. The change points coincide with the Asian financial crisis of 1997 and the tensions between the United States and China since 2017. To explore further aspects of economic integration in the Asia-Pacific region, this study also considers how consistently APEC has served as an international forum for addressing issues, which members are active, and how members have clustered based on their views of APEC.

Infrared Target Recognition using Heterogeneous Features with Multi-kernel Transfer Learning

  • Wang, Xin;Zhang, Xin;Ning, Chen
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.14 no.9
    • /
    • pp.3762-3781
    • /
    • 2020
  • Infrared pedestrian target recognition is a vital problem of significant interest in computer vision. In this work, a novel infrared pedestrian target recognition method that uses heterogeneous features with multi-kernel transfer learning is proposed. First, to fully exploit the characteristics of infrared pedestrian targets, a novel multi-scale monogenic filtering-based completed local binary pattern descriptor, referred to as MSMF-CLBP, is designed to extract the texture information, and an improved histogram of oriented gradient-fisher vector descriptor, referred to as HOG-FV, is proposed to extract the shape information. Second, to enrich the semantic content of the feature expression, these two heterogeneous features are integrated to obtain a more complete representation of infrared pedestrian targets. Third, to overcome defects of state-of-the-art classifiers, such as poor generalization, scarcity of tagged infrared samples, and distributional and semantic deviations between the training and testing samples, an effective multi-kernel transfer learning classifier called MK-TrAdaBoost is designed. Experimental results show that the proposed method outperforms many state-of-the-art recognition approaches for infrared pedestrian targets.

Efficient Multi-scalable Network for Single Image Super Resolution

  • Alao, Honnang;Kim, Jin-Sung;Kim, Tae Sung;Lee, Kyujoong
    • Journal of Multimedia Information System
    • /
    • v.8 no.2
    • /
    • pp.101-110
    • /
    • 2021
  • In computer vision, single-image super resolution has been an area of research for a significant period. Traditional techniques involve interpolation-based methods such as nearest-neighbor, bilinear, and bicubic interpolation for image restoration. Although implementations of convolutional neural networks have provided outstanding results in recent years, efficiency and single-model multi-scalability remain challenges. Furthermore, previous works have not placed enough emphasis on real-number scalability. Interpolation-based techniques, however, have no limit in terms of scalability, as they are able to upscale images to any desired size. In this paper, we propose a convolutional neural network that possesses the advantages of the interpolation-based techniques and is also efficient, making it suitable for practical implementations. It consists of convolutional layers applied on the low-resolution space, post-up-sampling along the end hidden layers, and additional layers on the high-resolution space. Up-sampling is applied to a multi-channel feature map via bicubic interpolation using a single model. Experiments on architectural structure, layer reduction, and real-number scale training are executed, and the results show the model to be efficient among multi-scale learning (including scale multi-path-learning) based models.
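The abstract's point that interpolation can upscale to any size, including non-integer scale factors, can be illustrated with a short sketch. This is a simplified stand-in: it uses bilinear rather than the bicubic interpolation named in the abstract, it works on a single 2-D channel, and the function name `resize_bilinear` is hypothetical.

```python
def resize_bilinear(img, scale):
    """Resize a 2-D list of numbers by any positive real-valued scale factor."""
    h, w = len(img), len(img[0])
    out_h = max(1, round(h * scale))
    out_w = max(1, round(w * scale))

    def source_coord(index, out_size, in_size):
        # Map an output pixel centre back to input coordinates, clamped to the image.
        pos = (index + 0.5) * in_size / out_size - 0.5
        pos = min(max(pos, 0.0), in_size - 1.0)
        lo = int(pos)
        hi = min(lo + 1, in_size - 1)
        return lo, hi, pos - lo

    out = []
    for i in range(out_h):
        y0, y1, fy = source_coord(i, out_h, h)
        row = []
        for j in range(out_w):
            x0, x1, fx = source_coord(j, out_w, w)
            # Blend horizontally on the two neighbouring rows, then vertically.
            top = img[y0][x0] * (1 - fx) + img[y0][x1] * fx
            bottom = img[y1][x0] * (1 - fx) + img[y1][x1] * fx
            row.append(top * (1 - fy) + bottom * fy)
        out.append(row)
    return out

low_res = [[0.0, 1.0], [2.0, 3.0]]
up = resize_bilinear(low_res, 1.5)  # 2x2 input, 3x3 output
```

Because the output size is computed from the scale factor rather than fixed by a learned layer, the same routine serves any scale, which is the property the proposed network borrows by interpolating its feature maps.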

A Vision Based Guideline Interpretation Technique for AGV Navigation (AGV 운행을 위한 비전기반 유도선 해석 기술)

  • Byun, Sungmin;Kim, Minhwan
    • Journal of Korea Multimedia Society
    • /
    • v.15 no.11
    • /
    • pp.1319-1329
    • /
    • 2012
  • AGVs are increasingly utilized, and magnetically guided AGVs are the most widely used because such systems offer low cost and high speed. However, this type of AGV requires a high infrastructure building cost and offers poor flexibility when the navigation path layout changes. Thus it is hard to apply this type of AGV to a small-quantity batch production system or to a cooperative production system with many AGVs. In this paper, we propose a vision-based guideline interpretation technique that uses cheap, easily installable, and changeable color tapes (or paint) as a guideline, so that a vision-based AGV with color tapes is effectively applicable to such production systems. For easy setting and changing of the AGV navigation path, we suggest an automatic method for interpreting a complex guideline layout that includes multiple branches and joins of branches. We also suggest a trace direction decision method for stable navigation of AGVs. Through several real-time navigation tests with an industrial AGV equipped with the suggested technique, we confirmed that the technique is practically and stably applicable to real industrial fields.

Bayesian Sensor Fusion of Monocular Vision and Laser Structured Light Sensor for Robust Localization of a Mobile Robot (이동 로봇의 강인 위치 추정을 위한 단안 비젼 센서와 레이저 구조광 센서의 베이시안 센서융합)

  • Kim, Min-Young;Ahn, Sang-Tae;Cho, Hyung-Suck
    • Journal of Institute of Control, Robotics and Systems
    • /
    • v.16 no.4
    • /
    • pp.381-390
    • /
    • 2010
  • This paper describes a map-based localization procedure for mobile robots that uses a sensor fusion technique in structured environments. Combining various sensors with different characteristics and limited sensitivity is advantageous, because the sensors complement and cooperate with one another to obtain better information on the environment. In this paper, for robust self-localization of a mobile robot with a monocular camera and a laser structured light sensor, environment information acquired from the two sensors is combined and fused by a Bayesian sensor fusion technique based on the probabilistic reliability function of each sensor, predefined through experiments. For self-localization using the monocular vision, the robot utilizes image features consisting of vertical edge lines from input camera images, which are used as natural landmark points in the self-localization process. With the laser structured light sensor, in contrast, it utilizes geometrical features composed of corners and planes as natural landmark shapes, which are extracted from range data at a constant height from the navigation floor. Although either feature group alone is sometimes sufficient to localize mobile robots, all features from the two sensors are simultaneously used and fused in terms of information for reliable localization under various environment conditions. To verify the advantage of using multi-sensor fusion, a series of experiments is performed, and the experimental results are discussed in detail.
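The core idea of Bayesian fusion of two sensors can be sketched in one dimension. The abstract does not give the form of the reliability functions; this sketch assumes each sensor reports an independent Gaussian estimate of the same position coordinate, in which case the posterior is an inverse-variance weighted average. The function name and all numbers are hypothetical.

```python
def fuse_gaussian(mean_a, var_a, mean_b, var_b):
    """Fuse two independent Gaussian estimates of the same quantity."""
    precision = 1.0 / var_a + 1.0 / var_b  # precisions (inverse variances) add
    fused_var = 1.0 / precision
    fused_mean = fused_var * (mean_a / var_a + mean_b / var_b)
    return fused_mean, fused_var

# Hypothetical x-position estimates in metres: (mean, variance).
camera = (2.0, 0.4)   # monocular vision, assumed less reliable here
laser = (2.3, 0.1)    # laser structured light, assumed more reliable here
fused_mean, fused_var = fuse_gaussian(*camera, *laser)
```

The fused variance is always smaller than either input variance, and the fused mean lies closer to the more reliable sensor, which matches the abstract's claim that using both feature groups gives more reliable localization than either alone.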

Multiple Camera-Based Real-Time Long Queue Vision Algorithm for Public Safety and Efficiency

  • Tae-hoon Kim;Ji-young Na;Ji-won Yoon;Se-Hun Lee;Jun-ho Ahn
    • Journal of the Korea Society of Computer and Information
    • /
    • v.29 no.10
    • /
    • pp.47-57
    • /
    • 2024
  • This paper proposes a system to efficiently manage delays caused by unmanaged and congested queues in crowded environments. Such queues not only cause inconvenience but also pose safety risks. Existing systems, relying on single-camera feeds, are inadequate for complex scenarios requiring multiple cameras. To address this, we developed a multi-vision long queue detection system that integrates multiple vision algorithms to accurately detect various types of queues. The algorithm processes real-time video data from multiple cameras, stitching overlapping segments into a single panoramic image. By combining object detection, tracking, and position variation analysis, the system recognizes long queues in crowded environments. The algorithm was validated with 96% accuracy and a 92% F1-score across diverse settings.

A Multi-Level Accumulation-Based Rectification Method and Its Circuit Implementation

  • Son, Hyeon-Sik;Moon, Byungin
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.11 no.6
    • /
    • pp.3208-3229
    • /
    • 2017
  • Rectification is an essential procedure for simplifying the disparity extraction of stereo matching algorithms by removing vertical mismatches between left and right images. To support real-time stereo matching, studies have introduced several look-up table (LUT)-based and computational logic (CL)-based rectification approaches. However, to support high-resolution images, the LUT-based approach requires considerable memory resources, and the CL-based approach requires numerous hardware resources for its circuit implementation. Thus, this paper proposes a multi-level accumulation-based rectification method as a simple CL-based method, together with its circuit implementation. The proposed method, which includes distortion correction, reduces addition operations by 29% and removes multiplication operations by replacing the complex matrix computations and high-degree polynomial calculations of conventional rectification with simple multi-level accumulations. The proposed rectification circuit can rectify 1,280 × 720 stereo images at a frame rate of 135 fps at a clock frequency of 125 MHz. Because the circuit is fully pipelined, it continuously generates a pair of left and right rectified pixels every cycle after a 13-cycle latency plus the initial image buffering time. Experimental results show that the proposed method requires significantly fewer hardware resources than the conventional method, while the differences between the results of the proposed and conventional full rectifications are negligible.
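How a polynomial can be evaluated with additions only is easiest to see with the classic forward-difference method. This is a generic sketch of that technique, not the paper's circuit: the abstract does not specify how its accumulation levels are organised, and the function names and the example polynomial are hypothetical. A polynomial of degree n sampled at consecutive integer positions needs n additions per sample once the initial differences are known.

```python
def poly_eval(coeffs, x):
    """Horner evaluation; coeffs[i] multiplies x**i. Used only for set-up and checking."""
    result = 0
    for c in reversed(coeffs):
        result = result * x + c
    return result

def forward_difference_table(coeffs):
    """Return f(0) and its forward differences of order 1..n at x = 0."""
    n = len(coeffs) - 1
    values = [poly_eval(coeffs, x) for x in range(n + 1)]
    levels = []
    while values:
        levels.append(values[0])
        values = [b - a for a, b in zip(values, values[1:])]
    return levels

def accumulate(levels, count):
    """Yield f(0), f(1), ..., f(count - 1) using additions only."""
    acc = list(levels)
    for _ in range(count):
        yield acc[0]
        # Each level absorbs the level above it; the top level stays constant.
        for i in range(len(acc) - 1):
            acc[i] += acc[i + 1]

coeffs = [3, -2, 0, 1]  # hypothetical polynomial: x**3 - 2*x + 3
levels = forward_difference_table(coeffs)
samples = list(accumulate(levels, 10))
```

As a rough consistency check on the reported throughput, one pixel pair per cycle at 125 MHz over 1,280 × 720 = 921,600 pixels gives about 135.6 frames per second, ignoring latency and buffering.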

Robust Multi-Layer Hierarchical Model for Digit Character Recognition

  • Yang, Jie;Sun, Yadong;Zhang, Liangjun;Zhang, Qingnian
    • Journal of Electrical Engineering and Technology
    • /
    • v.10 no.2
    • /
    • pp.699-707
    • /
    • 2015
  • Although digit character recognition has improved significantly in recent years, it is still challenging to achieve satisfactory results if the data contain many distracting factors. This paper proposes a novel digit character recognition approach using a multi-layer hierarchical model, Hybrid Restricted Boltzmann Machines (HRBMs), which allows the learning architecture to be robust to distracting background factors. The insight behind the proposed model is that useful high-level features appear more frequently than distracting factors during learning, so the high-level features can be decomposed into hybrid hierarchical structures using only a small amount of label information. To extract robust and compact features, a stochastic 0-1 layer is employed, which enables the model's hidden nodes to independently capture the useful character features during training. Experiments on variations of the Modified National Institute of Standards and Technology (MNIST) dataset show that the proposed method improves the multi-layer hierarchical model. Finally, the paper demonstrates the proposed technique in a real-world application, where it is able to identify digit characters against various complex background images.
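A stochastic 0-1 layer of the kind used in restricted Boltzmann machines can be sketched as below. This shows only the standard RBM hidden-unit sampling step, in which each hidden node is switched on independently with a sigmoid probability; how the HRBM model arranges or trains such layers is not stated in the abstract. The function names, weights, and inputs are hypothetical.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def sample_hidden(visible, weights, hidden_bias, rng):
    """Return activation probabilities and sampled 0/1 states of the hidden units.

    weights[i][j] connects visible unit i to hidden unit j.
    """
    probs = []
    for j, bias in enumerate(hidden_bias):
        activation = bias + sum(v * weights[i][j] for i, v in enumerate(visible))
        probs.append(sigmoid(activation))
    # Each hidden unit is sampled independently given the visible layer.
    states = [1 if rng.random() < p else 0 for p in probs]
    return probs, states

rng = random.Random(0)
visible = [1, 0, 1]                                   # hypothetical binary input
weights = [[0.5, -1.0], [0.3, 0.8], [-0.5, 1.0]]      # hypothetical 3x2 weight matrix
hidden_bias = [0.0, 0.0]
probs, states = sample_hidden(visible, weights, hidden_bias, rng)
```

With these numbers both hidden activations sum to zero, so each unit is on with probability 0.5; the conditional independence of the hidden units given the visible layer is what lets each node capture a feature on its own.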

Multi-View Supporting VR/AR Visualization System for Supercomputing-based Engineering Analysis Services (슈퍼컴퓨팅 기반의 공학해석 서비스 제공을 위한 멀티 뷰 지원 VR/AR 가시화 시스템 개발)

  • Seo, Dong Woo;Lee, Jae Yeol;Lee, Sang Min;Kim, Jae Seong;Park, Hyung Wook
    • Korean Journal of Computational Design and Engineering
    • /
    • v.18 no.6
    • /
    • pp.428-438
    • /
    • 2013
  • The demand for high-performance visualization of engineering analysis of digital products is increasing, since current analysis problems are increasingly large and complex and require high-performance codes as well as high-performance computing systems. However, many companies or customers do not have all the necessary facilities or have difficulty accessing those computing resources. In this paper, we present a multi-view supporting VR/AR system for providing supercomputing-based engineering analysis services. The proposed system is designed to provide different views supporting VR/AR visualization services depending on the requirements of the customers. It provides sophisticated VR rendering that depends directly on a supercomputing resource, as well as remotely accessible AR visualization. By providing multi-view centric analysis services, the proposed system can be applied more easily to various customers who require different levels of high-performance computing resources. We show the scalability and vision of the proposed approach by demonstrating illustrative examples with different levels of complexity.