• Title/Summary/Keyword: Computer vision technology

Development of Multi-functional Tele-operative Modular Robotic System For Watermelon Cultivation in Greenhouse

  • Hwang, H.;Kim, C. S.;Park, D. Y.
    • Journal of Biosystems Engineering
    • /
    • v.28 no.6
    • /
    • pp.517-524
    • /
    • 2003
  • Worldwide research and development efforts aim to automate various bio-production processes, and these efforts will expand with priority given to tasks that are highly labor-intensive, yield high value-added products, or take place in hostile environments. In bio-production, the limited versatility and robustness of automated systems have been major bottlenecks, along with economic efficiency. This paper introduces a new concept of automation based on tele-operation, which can overcome the inherent difficulties of automating bio-production processes. The operator (farmer), computer, and automatic machinery share roles, each contributing its greatest strengths, to accomplish given tasks successfully. Among the processes of greenhouse watermelon cultivation, pruning, watering, pesticide application, and harvesting with loading were chosen to realize the proposed concept, based on their labor intensiveness and functional similarities. The developed system comprises five major hardware modules: a wireless remote monitoring and task control module, a wireless remote image acquisition and data transmission module, a gantry system equipped with a 4-d.o.f. Cartesian robotic manipulator, exchangeable modular end-effectors, and a guided watermelon loading and storage module. The system was operated through a graphical user interface on a touch-screen monitor, with wireless data communication among operator, computer, and machine. The proposed system demonstrated a practical and feasible way to automate volatile bio-production processes.

Audio-Visual Fusion for Sound Source Localization and Improved Attention (음성-영상 융합 음원 방향 추정 및 사람 찾기 기술)

  • Lee, Byoung-Gi;Choi, Jong-Suk;Yoon, Sang-Suk;Choi, Mun-Taek;Kim, Mun-Sang;Kim, Dai-Jin
    • Transactions of the Korean Society of Mechanical Engineers A
    • /
    • v.35 no.7
    • /
    • pp.737-743
    • /
    • 2011
  • Service robots are equipped with various sensors such as vision cameras, sonar sensors, laser scanners, and microphones. Although each sensor has its own function, some can be made to work together to perform more complicated functions. Audio-visual fusion is a typical and powerful combination of audio and video sensors, because audio information is complementary to visual information and vice versa. Human beings also depend mainly on visual and auditory information in daily life. In this paper, we conduct two studies using audio-visual fusion: one on enhancing the performance of sound localization, and the other on improving robot attention through sound localization and face detection.

Lightweight Single Image Super-Resolution Convolution Neural Network in Portable Device

  • Wang, Jin;Wu, Yiming;He, Shiming;Sharma, Pradip Kumar;Yu, Xiaofeng;Alfarraj, Osama;Tolba, Amr
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.15 no.11
    • /
    • pp.4065-4083
    • /
    • 2021
  • Super-resolution can improve the clarity of low-resolution (LR) images, which can increase the accuracy of high-level computer vision tasks. Portable devices have limited computing power and storage, so large-scale neural network super-resolution methods are not suitable for them. Lightweight image processing methods, which reduce computational cost and the number of parameters, can improve processing speed on portable devices. We therefore propose the Enhanced Information Multiple Distillation Network (EIMDN) to achieve lower delay and cost. The EIMDN uses a feedback mechanism as its framework and refines low-level features through high-level features. Further, we replace the feature extraction convolution in the Information Multiple Distillation Block (IMDB) with the Ghost module, and propose the Enhanced Information Multiple Distillation Block (EIMDB) to reduce the amount of computation and the number of parameters. Finally, coordinate attention (CA) is applied at the end of the IMDB and EIMDB to enhance the extraction of important spatial and channel information. Experimental results show that the proposed method converges faster, with fewer parameters and less computation, than other lightweight super-resolution methods. It attains higher peak signal-to-noise ratio (PSNR) and structural similarity (SSIM), and the reconstruction of image texture and target contours is significantly improved.
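For readers unfamiliar with the PSNR figure quoted in super-resolution work, the following is a minimal sketch of the standard definition, PSNR = 10 log10(MAX² / MSE). It is not taken from the paper: the function name `psnr`, the nested-list image representation, and the default peak value of 255 (8-bit images) are assumptions made for illustration.

```python
import math


def psnr(reference, reconstructed, max_value=255.0):
    """Peak signal-to-noise ratio (dB) between two equally sized grayscale
    images given as nested lists of pixel values.

    max_value is the assumed peak pixel value (255 for 8-bit images).
    """
    # Flatten both images into one-dimensional pixel lists.
    ref = [p for row in reference for p in row]
    rec = [p for row in reconstructed for p in row]
    if not ref or len(ref) != len(rec):
        raise ValueError("images must be non-empty and the same size")

    # Mean squared error over all pixels.
    mse = sum((a - b) ** 2 for a, b in zip(ref, rec)) / len(ref)
    if mse == 0:
        return math.inf  # identical images: PSNR is unbounded

    return 10.0 * math.log10(max_value ** 2 / mse)
```

A higher value means the reconstruction is closer to the reference; identical images give an infinite PSNR.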

Motion Detection Model Based on PCNN

  • Yoshida, Minoru;Tanaka, Masaru;Kurita, Takio
    • Proceedings of the IEEK Conference
    • /
    • 2002.07a
    • /
    • pp.273-276
    • /
    • 2002
  • The Pulse-Coupled Neural Network (PCNN), which can explain the synchronous bursts of neurons in the cat visual cortex, is a fundamental model for biomimetic vision. The PCNN is a kind of pulse-coded neural network model. To gain a deep understanding of visual information processing, it is important to simulate the visual system with such a biologically plausible neural network model. In this paper, we construct a motion detection model based on the PCNN, using receptive field models of neurons in the lateral geniculate nucleus and the primary visual cortex. We then show that this model can effectively detect movement and the direction of motion.

Improvement of Strategy Algorithm for Soccer Robot (축구 로봇의 전략 알고리즘 개선)

  • 김재현;이대훈;이성민;최환도;김중완
    • Proceedings of the Korean Society of Machine Tool Engineers Conference
    • /
    • 2001.04a
    • /
    • pp.177-181
    • /
    • 2001
  • This paper presents a strategy algorithm for a soccer robot. We classify the strategy of the soccer robot simply as attack or defense. Our soccer robot uses DC motors. For image processing, we use the vision system made by the MIRO team of KAIST and the Soty team. The host computer is a Pentium III machine. An RF module is used for communication between each robot and the host computer. Fuzzy logic is applied to the path planning of our robot. We improve the strategy algorithm of the soccer robot, and here we explain the improvements to the strategy algorithm and the faults of our soccer robot system.

A Survey of Deep Learning in Agriculture: Techniques and Their Applications

  • Ren, Chengjuan;Kim, Dae-Kyoo;Jeong, Dongwon
    • Journal of Information Processing Systems
    • /
    • v.16 no.5
    • /
    • pp.1015-1033
    • /
    • 2020
  • With promising results and enormous capability, deep learning technology has attracted increasing attention in both theoretical research and applications for a variety of image processing and computer vision tasks. In this paper, we investigate 32 research contributions that apply deep learning techniques to the agriculture domain. Different types of deep neural network architectures in agriculture are surveyed, and the current state-of-the-art methods are summarized. The paper ends with a discussion of the advantages and disadvantages of deep learning and of future research topics. The survey shows that deep learning-based research achieves superior accuracy, exceeding that of current standard machine learning techniques.

Convolutional Neural Network Based on Accelerator-Aware Pruning for Object Detection in Single-Shot Multibox Detector (싱글숏 멀티박스 검출기에서 객체 검출을 위한 가속 회로 인지형 가지치기 기반 합성곱 신경망 기법)

  • Kang, Hyeong-Ju
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.24 no.1
    • /
    • pp.141-144
    • /
    • 2020
  • Convolutional neural networks (CNNs) show high performance in computer vision tasks including object detection, but they require a large amount of weight storage and computation. In this paper, a pruning scheme is applied to CNNs for object detection, which can remove a large number of weights with negligible performance degradation. Unlike previous schemes, the pruning scheme applied in this paper takes the underlying accelerator architecture into account. As a result, the pruned CNNs can be executed efficiently on an ASIC or FPGA accelerator. Even with this constrained pruning, the resulting CNN shows negligible degradation of detection performance: less than 1 percentage point of mAP on the VOC0712 test set. With the proposed scheme, CNNs can be applied to object detection efficiently.
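The abstract does not spell out the pruning constraint. As a hedged illustration only, one common hardware-friendly form keeps the same number of nonzero weights in every fixed-size group, so that parallel multipliers receive equal work. The sketch below assumes that form; the function name `prune_groups` and the parameters `group_size` and `keep` are hypothetical and are not taken from the paper.

```python
def prune_groups(weights, group_size, keep):
    """Group-wise magnitude pruning (illustrative assumption, not the
    paper's exact scheme).

    The flat weight list is split into consecutive groups of `group_size`;
    in each group only the `keep` largest-magnitude weights survive and the
    rest are set to zero, so every full group has the same nonzero count.
    """
    if not 0 <= keep <= group_size:
        raise ValueError("keep must be between 0 and group_size")

    pruned = []
    for start in range(0, len(weights), group_size):
        group = weights[start:start + group_size]
        # Indices of the group sorted by descending absolute value.
        order = sorted(range(len(group)),
                       key=lambda i: abs(group[i]), reverse=True)
        kept = set(order[:keep])
        pruned.extend(w if i in kept else 0.0 for i, w in enumerate(group))
    return pruned
```

For example, with `group_size=4` and `keep=2`, each group of four weights retains its two largest-magnitude entries, giving a fixed 50% sparsity per group rather than an unstructured global sparsity.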

Video Road Vehicle Detection and Tracking based on OpenCV

  • Hou, Wei;Wu, Zhenzhen;Jung, Hoekyung
    • Journal of information and communication convergence engineering
    • /
    • v.20 no.3
    • /
    • pp.226-233
    • /
    • 2022
  • Video surveillance is widely used in security surveillance, military navigation, intelligent transportation, etc. Its main research fields are pattern recognition, computer vision and artificial intelligence. This article uses OpenCV to detect and track vehicles, and monitors by establishing an adaptive model on a stationary background. Compared with traditional vehicle detection, it not only has the advantages of low price, convenient installation and maintenance, and wide monitoring range, but also can be used on the road. The intelligent analysis and processing of the scene image using CAMSHIFT tracking algorithm can collect all kinds of traffic flow parameters (including the number of vehicles in a period of time) and the specific position of vehicles at the same time, so as to solve the vehicle offset. It is reliable in operation and has high practical value.

SoftMax Computation in CNN Using Input Maximum Value (CNN에서 입력 최댓값을 이용한 SoftMax 연산 기법)

  • Kang, Hyeong-Ju
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.26 no.2
    • /
    • pp.325-328
    • /
    • 2022
  • Convolutional neural networks (CNNs) are widely used in computer vision tasks, but their computing power requirements call for specially designed circuits. Most of the computations in a CNN can be implemented efficiently in a digital circuit, but the SoftMax layer involves operations unsuitable for circuit implementation, namely exponential and logarithmic functions. This paper proposes a new method that integrates the exponential and logarithmic tables of conventional circuits into a single table. The proposed structure accesses a look-up table (LUT) using only a few maximum values, and the LUT holds the result value directly. The proposed method significantly reduces the space complexity of the SoftMax layer circuit implementation, while the resulting circuit remains comparable to the original baseline with only a small degradation in precision.
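To show where the exponential and logarithmic functions enter a SoftMax computation, here is a minimal software sketch of the standard max-shifted formulation, softmax(x)_i = exp(x_i − m − log Σ_j exp(x_j − m)) with m = max(x). This is the textbook identity, not the paper's LUT circuit; the function names are assumptions chosen for illustration.

```python
import math


def log_softmax_max_shifted(logits):
    """Log-softmax computed after subtracting the input maximum.

    Subtracting the maximum keeps every exponent argument <= 0, so exp()
    never overflows; one log() of the summed exponentials is then needed.
    """
    m = max(logits)
    log_sum = math.log(sum(math.exp(x - m) for x in logits))
    return [x - m - log_sum for x in logits]


def softmax_max_shifted(logits):
    """Softmax via the same maximum shift, returning probabilities."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]
```

Because the shift leaves the result mathematically unchanged, inputs such as `[1000.0, 1001.0]` produce valid probabilities even though `math.exp(1000.0)` alone would overflow.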

An Efficient Monocular Depth Prediction Network Using Coordinate Attention and Feature Fusion

  • Huihui, Xu;Fei, Li
    • Journal of Information Processing Systems
    • /
    • v.18 no.6
    • /
    • pp.794-802
    • /
    • 2022
  • The recovery of reasonable depth information from different scenes is a popular topic in the field of computer vision. To generate depth maps with better details, we present an efficacious monocular depth prediction framework with coordinate attention and feature fusion. Specifically, the proposed framework contains attention, multi-scale, and feature fusion modules. The attention module improves features based on coordinate attention to enhance the prediction, whereas the multi-scale module integrates useful low- and high-level contextual features at higher resolution. Moreover, we developed a feature fusion module that combines the heterogeneous features to generate high-quality depth outputs. We also designed a hybrid loss function that measures prediction errors in terms of depth and scale-invariant gradients, which contributes to preserving rich details. We conducted experiments on public RGBD datasets, and the evaluation results show that the proposed scheme can considerably enhance the accuracy of depth prediction, achieving 0.051 for log10 and 0.992 for δ < 1.25³ on the NYUv2 dataset.
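The two figures quoted at the end of the abstract are standard depth-estimation metrics. As a minimal sketch of their usual definitions (not code from the paper), the mean log10 error averages |log10(pred) − log10(gt)| over pixels, and the threshold accuracy δ is the fraction of pixels whose ratio max(pred/gt, gt/pred) falls below 1.25 raised to a power, conventionally 1, 2, or 3. The function names and the flat-list input format are assumptions for illustration.

```python
import math


def log10_error(pred, gt):
    """Mean absolute difference of log10 depths (lower is better).

    pred and gt are equal-length lists of positive depth values.
    """
    return sum(abs(math.log10(p) - math.log10(g))
               for p, g in zip(pred, gt)) / len(gt)


def threshold_accuracy(pred, gt, power=1):
    """Fraction of pixels with max(pred/gt, gt/pred) < 1.25 ** power
    (higher is better)."""
    threshold = 1.25 ** power
    hits = sum(1 for p, g in zip(pred, gt)
               if max(p / g, g / p) < threshold)
    return hits / len(gt)
```

With `power=3` the threshold is 1.25³ ≈ 1.953, the loosest of the three conventional settings, which is why accuracies at that setting are typically close to 1.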