• Title/Summary/Keyword: Multi-Vision

Multi-resolution Pyramid based Image Identification (다중 해상도 피라미드 기반 영상 인식자)

  • Park, Je-Ho
    • Journal of the Semiconductor & Display Technology, v.19 no.1, pp.6-10, 2020
  • Unlike modern photography, early efforts to physically compose images in a manner similar to today's photographs were neither popular nor commercially successful. Images, once limited to use as artistic media or records, are now the subject of image analysis technology that automates the recognition and judgment humans perform through vision. In addition, image accuracy now exceeds human visual ability, enabling technology that recognizes and reports facts of which humans are unaware. On this foundation, the range of future applications of image data is hard to predict, and technology that targets large-scale image databases rather than a single image is also opening up new application possibilities. Various methodologies have been introduced to identify a particular image in a massive database. In this paper, we discuss methods for producing image identifiers based on a multi-resolution pyramid.
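
As a rough illustration of the idea named in this abstract, the sketch below builds a multi-resolution pyramid by repeated 2x2 averaging and derives an identifier from all levels. The abstract does not describe the paper's actual identifier construction, so the hashing scheme and the names `downsample`, `build_pyramid`, and `pyramid_identifier` are assumptions made purely for illustration.

```python
import hashlib


def downsample(img):
    """Halve each dimension by averaging non-overlapping 2x2 blocks."""
    h, w = len(img) // 2, len(img[0]) // 2
    return [
        [
            (img[2 * y][2 * x] + img[2 * y][2 * x + 1]
             + img[2 * y + 1][2 * x] + img[2 * y + 1][2 * x + 1]) / 4.0
            for x in range(w)
        ]
        for y in range(h)
    ]


def build_pyramid(img, min_size=2):
    """Return [full-res, half-res, ...] while both sides stay >= min_size."""
    levels = [img]
    while len(levels[-1]) // 2 >= min_size and len(levels[-1][0]) // 2 >= min_size:
        levels.append(downsample(levels[-1]))
    return levels


def pyramid_identifier(img):
    """Hypothetical identifier: SHA-256 over quantized pixels, coarsest level first."""
    digest = hashlib.sha256()
    for level in reversed(build_pyramid(img)):
        for row in level:
            digest.update(bytes(min(255, max(0, int(round(v)))) for v in row))
    return digest.hexdigest()


# Small synthetic 8x8 grayscale image (values in 0..255).
image = [[(x * 16 + y * 8) % 256 for x in range(8)] for y in range(8)]
sizes = [(len(level), len(level[0])) for level in build_pyramid(image)]
identifier = pyramid_identifier(image)
```

A cryptographic hash only matches exact duplicates; a practical identifier would more likely compare coarse levels first and tolerate small pixel differences.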

A Geometric Active Contour Model Using Multi Resolution Level Set Methods (다중 해상도 레벨 세트 방식을 이용한 기하 활성 모델)

  • Kim, Seong-Gon;Kim, Du-Yeong
    • The Transactions of the Korea Information Processing Society, v.6 no.10, pp.2809-2815, 1999
  • Level set and active contour (snake) models are extensively used for image segmentation and shape extraction in computer vision. Snakes rely on energy minimization, whereas the level set method relies on curve evolution to extract contours from image data. Each model has its own drawbacks. For instance, a snake performs poorly unless it is placed close to the desired shape boundary, and it has difficulty when the image contains multiple objects to be extracted. The level set method, in contrast, is free of the initial curve position problem and can handle the topology of multiple objects, but it requires much more computation time than the snake model. In this paper, we combine the strengths of the two models and also apply a multi-resolution algorithm to speed up the process without degrading shape extraction performance.

Research Trends on Deep Reinforcement Learning (심층 강화학습 기술 동향)

  • Jang, S.Y.;Yoon, H.J.;Park, N.S.;Yun, J.K.;Son, Y.S.
    • Electronics and Telecommunications Trends, v.34 no.4, pp.1-14, 2019
  • Recent trends in deep reinforcement learning (DRL) reveal considerable improvements to DRL algorithms in terms of performance, learning stability, and computational efficiency. DRL also vastly expands the scenarios that can be covered (e.g., partial observability; cooperation, competition, coexistence, and communication among multiple agents; multi-task; decentralized intelligence). These features have cultivated multi-agent reinforcement learning research. DRL is also expanding its applications from robotics, natural language processing, and computer vision into a wide array of fields such as finance, healthcare, chemistry, and even art. In this report, we briefly summarize various DRL techniques and research directions.

Scaling Up Face Masks Classification Using a Deep Neural Network and Classical Method Inspired Hybrid Technique

  • Kumar, Akhil;Kalia, Arvind;Verma, Kinshuk;Sharma, Akashdeep;Kaushal, Manisha;Kalia, Aayushi
    • KSII Transactions on Internet and Information Systems (TIIS), v.16 no.11, pp.3658-3679, 2022
  • Classifying whether persons in images are wearing face masks emerged as a new computer vision problem during the COVID-19 pandemic. To address this problem and scale up research in this domain, this paper proposes a hybrid technique that employs ResNet-101 and a multi-layer perceptron (MLP) classifier. The proposed technique is tested and validated on a self-created face mask classification dataset and a standard dataset. On the self-created dataset, the proposed technique achieved a classification accuracy of 97.3%. To support the proposed technique, six other state-of-the-art CNN feature extractors and six other classical machine learning classifiers were tested and compared with it. The proposed technique achieved better classification accuracy and 1-6% higher precision, recall, and F1 score than the other tested deep feature extractors and machine learning classifiers.

A Method of Multi-Scale Feature Compression for Object Tracking in VCM (VCM 의 객체추적을 위한 다중스케일 특징 압축 기법)

  • Yong-Uk Yoon;Gyu-Woong Han;Dong-Ha Kim;Jae-Gon Kim
    • Proceedings of the Korean Society of Broadcast Engineers Conference, 2022.11a, pp.10-13, 2022
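  • As the need has recently grown for video coding technology aimed at machines that perform intelligent analysis based on artificial intelligence, MPEG has begun standardization of VCM (Video Coding for Machines). Within VCM, various methods are being proposed for video/image compression or video/image feature compression for machines. This paper presents an efficient compression method for the multi-scale features extracted from a machine vision network for object tracking. The proposed method reduces the multi-scale features in dimension to a single-scale feature and compresses the resulting feature sequence using VVC (Versatile Video Coding), the latest video codec standard. The proposed method shows a BD-rate coding performance improvement of 89.65% over the anchor provided by VCM.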

Text-to-Face Generation Using Multi-Scale Gradients Conditional Generative Adversarial Networks (다중 스케일 그라디언트 조건부 적대적 생성 신경망을 활용한 문장 기반 영상 생성 기법)

  • Bui, Nguyen P.;Le, Duc-Tai;Choo, Hyunseung
    • Annual Conference of KIPS, 2021.11a, pp.764-767, 2021
  • While Generative Adversarial Networks (GANs) have seen huge success in image synthesis tasks, synthesizing high-quality images from text descriptions remains a challenging problem in computer vision. This paper proposes a method named Text-to-Face Generation Using Multi-Scale Gradients for Conditional Generative Adversarial Networks (T2F-MSGGANs), which combines GANs and a natural language processing model to create human faces that have the features found in the input text. The proposed method addresses two problems of GANs, mode collapse and training instability, by investigating how gradients at multiple scales can be used to generate high-resolution images. We show that T2F-MSGGANs converge stably and generate good-quality images.

A Micro-robotic Platform for Micro/nano Assembly: Development of a Compact Vision-based 3 DOF Absolute Position Sensor (마이크로/나노 핸들링을 위한 마이크로 로보틱 플랫폼: 비전 기반 3자유도 절대위치센서 개발)

  • Lee, Jae-Ha;Breguet, Jean Marc;Clavel, Reymond;Yang, Seung-Han
    • Journal of the Korean Society for Precision Engineering, v.27 no.1, pp.125-133, 2010
  • A versatile micro-robotic platform for micro/nano-scale assembly is in demand in a variety of application areas such as micro-biology and nanotechnology. In the near future, a flexible and compact platform could be used effectively in a scanning electron microscope chamber. We are developing a platform that consists of miniature mobile robots and a compact multi-degree-of-freedom positioning stage. This paper presents the design and implementation of a low-cost, compact multi-degree-of-freedom position sensor that is capable of measuring absolute translational and rotational displacement. The proposed sensor is implemented using a CMOS image sensor and a target with specific hole patterns. Statistical design of experiments was applied to find the optimal design of the target. Efficient algorithms for image processing and absolute position decoding are discussed. A simple calibration that eliminates the influence of inaccuracy in the fabricated target on measuring performance is also presented. The developed sensor was characterized using a laser interferometer. It can be concluded that the sensor system has submicron resolution and an accuracy of $\pm 4\,\mu\mathrm{m}$ over the full travel range. The proposed vision-based sensor is cost-effective and can be used as a compact feedback device for implementing a micro-robotic platform.

AUTOMATIC MULTITORCH WELDING SYSTEM WITH HIGH SPEED

  • Moon, H.S.;Kim, J.S.;Jung, M.Y.;Kweon, H.J.;Kim, H.S.;Youn, J.G.
    • Proceedings of the KWS Conference, 2002.10a, pp.320-323, 2002
  • This paper presents a new generation of welding system for pressure vessels and shipbuilding. Typical pressure vessel and shipbuilding weld joint preparations are either traditional V, butt, or fillet grooves or have narrow or semi-narrow gap profiles. The fillet and U grooves are prevalently used in heavy industries and shipbuilding to melt and join parts. Since the wall thickness can be 6" or greater, welds must be made in many layers, each layer containing several passes. As a result, the welding time for conventional processes such as SAW (Submerged Arc Welding) and FCAW (Flux Cored Arc Welding) can be many hours. Although SAW and FCAW are normally mechanized processes, the welding of pressure vessels and ship structures has until now usually been controlled by a full-time operator. The operator has typically been responsible for positioning each individual weld run, setting weld process parameters, maintaining flux and wire levels, removing slag, and so on. The aim of this work is to develop a high-speed multitorch welding system that increases production speed on the line and removes the need for the operator, so that the system can run automatically for the complete multi-torch, multi-layer weld. To achieve this, a laser vision sensor, a rotating torch, and an image processing algorithm have been developed. In addition, the multitorch welding system is applicable to fine-grained steel because of its high welding speed and lower heat input compared to a conventional welding process.

Development of an Integrated Traffic Object Detection Framework for Traffic Data Collection (교통 데이터 수집을 위한 객체 인식 통합 프레임워크 개발)

  • Yang, Inchul;Jeon, Woo Hoon;Lee, Joyoung;Park, Jihyun
    • The Journal of The Korea Institute of Intelligent Transport Systems, v.18 no.6, pp.191-201, 2019
  • A fast and accurate integrated traffic object detection framework was proposed and developed, harnessing a computer-vision-based deep-learning approach for automatic object detection, a multi-object tracking technology, and video pre-processing tools. The proposed method is capable of detecting traffic objects such as autos, buses, trucks, and vans in video recordings taken under various external conditions, such as video stability, weather, and video angle, and of counting the objects by tracking them in real time. Experimental scenarios covering various conditions that are likely to affect video quality show that the proposed method achieves outstanding performance, with 98% to 100% accuracy, except in cases of rain and snow.

EAR: Enhanced Augmented Reality System for Sports Entertainment Applications

  • Mahmood, Zahid;Ali, Tauseef;Muhammad, Nazeer;Bibi, Nargis;Shahzad, Imran;Azmat, Shoaib
    • KSII Transactions on Internet and Information Systems (TIIS), v.11 no.12, pp.6069-6091, 2017
  • Augmented Reality (AR) overlays virtual information on real-world data, such as displaying useful information on videos or images of a scene. This paper presents an Enhanced AR (EAR) system that displays useful statistical information about players on captured images of a sports game. We focus on the situation where the input image is degraded by strong sunlight. The proposed EAR system includes an image enhancement technique to improve the accuracy of subsequent player and face detection. The image enhancement is followed by player and face detection, face recognition, and display of players' statistics. First, an algorithm based on multi-scale retinex is proposed for image enhancement. Then, to detect players and faces, we use Haar features for feature extraction and adaptive boosting for classification. The player face recognition algorithm uses boosted linear discriminant analysis to select features and a nearest neighbor classifier for classification. The system can be adjusted to work with different types of sports where the input is an image and the desired output is a display of information near the recognized players. Simulations are carried out on 2096 different images that contain players in diverse conditions. The proposed EAR system demonstrates the great potential of computer-vision-based approaches for developing AR applications.
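
For readers unfamiliar with the multi-scale retinex mentioned in this abstract, the sketch below shows the classical textbook formulation: the average over several Gaussian scales of log(I) minus log(I blurred at that scale). It is not the paper's own algorithm, which the abstract only describes as "based on" multi-scale retinex. The function names, the scales in `sigmas`, and the `+ 1.0` offset that avoids log(0) are assumptions for illustration.

```python
import math


def gaussian_kernel(sigma):
    """Normalized 1-D Gaussian kernel with radius of about 3 sigma."""
    radius = max(1, int(3 * sigma))
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def blur(img, sigma):
    """Separable Gaussian blur; pixels beyond the border are clamped to the edge."""
    k = gaussian_kernel(sigma)
    r = len(k) // 2
    h, w = len(img), len(img[0])
    # Horizontal pass.
    rows = [[sum(k[j + r] * img[y][min(w - 1, max(0, x + j))]
                 for j in range(-r, r + 1))
             for x in range(w)] for y in range(h)]
    # Vertical pass.
    return [[sum(k[j + r] * rows[min(h - 1, max(0, y + j))][x]
                 for j in range(-r, r + 1))
             for x in range(w)] for y in range(h)]


def multi_scale_retinex(img, sigmas=(1.0, 2.0, 4.0)):
    """Average over scales of log(I + 1) - log(blur(I, sigma) + 1)."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for sigma in sigmas:
        smooth = blur(img, sigma)
        for y in range(h):
            for x in range(w):
                diff = math.log(img[y][x] + 1.0) - math.log(smooth[y][x] + 1.0)
                out[y][x] += diff / len(sigmas)
    return out


# A dark 5x5 patch with one bright pixel in the centre.
patch = [[10.0] * 5 for _ in range(5)]
patch[2][2] = 200.0
enhanced = multi_scale_retinex(patch)
flat = multi_scale_retinex([[100.0] * 4 for _ in range(4)])
```

A uniformly lit region maps to values near zero, while a pixel brighter than its surroundings maps to a positive value, which is why the technique helps normalize uneven illumination such as strong sunlight.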