• Title/Summary/Keyword: image-level fusion

Language Identification by Fusion of Gabor, MDLC, and Co-Occurrence Features (Gabor, MDLC, Co-Occurrence 특징의 융합에 의한 언어 인식)

  • Jang, Ick-Hoon;Kim, Ji-Hong
    • Journal of Korea Multimedia Society / v.17 no.3 / pp.277-286 / 2014
  • In this paper, we propose a texture feature-based language identification method that fuses Gabor, MDLC (multi-lag directional local correlation), and co-occurrence features. In the proposed method, Gabor magnitude images are first obtained from a test image by applying the Gabor transform followed by a magnitude operator. Moments of the Gabor magnitude images are then computed and vectorized. MDLC images are next obtained with the MDLC operator, and their moments are computed and vectorized. A GLCM (gray-level co-occurrence matrix) is then calculated from the test image, and the co-occurrence features computed from the GLCM are also vectorized. The three vectors of Gabor, MDLC, and co-occurrence features are fused into a single feature vector. For classification, the WPCA (whitened principal component analysis) classifier, which is commonly used in face identification, searches for the training feature vector most similar to the test feature vector. We evaluate the method by examining averaged identification rates on a test document image DB obtained by scanning documents in 15 languages. Experimental results show that the proposed method yields excellent language identification with a rather low feature dimension for the test DB.
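
The co-occurrence and fusion steps described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the choice of contrast and energy as the co-occurrence statistics, the single pixel offset, and the placeholder Gabor and MDLC values are all assumptions, and the WPCA classifier is not shown.

```python
def glcm(image, levels, dx=1, dy=0):
    """Normalized gray-level co-occurrence matrix for one pixel offset (dx, dy)."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    total = 0
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
                total += 1
    return [[v / total for v in row] for row in counts]


def cooccurrence_features(p):
    """Two classic GLCM statistics (an assumed choice): contrast and energy."""
    n = len(p)
    contrast = sum((i - j) ** 2 * p[i][j] for i in range(n) for j in range(n))
    energy = sum(p[i][j] ** 2 for i in range(n) for j in range(n))
    return [contrast, energy]


def fuse(*vectors):
    """Feature-level fusion by concatenating the per-descriptor vectors."""
    return [value for vector in vectors for value in vector]


image = [[0, 0, 1, 1],
         [0, 0, 1, 1]]          # toy 2-level image
cooc = cooccurrence_features(glcm(image, levels=2))
gabor_moments = [0.5, 0.1]      # placeholder values, not real Gabor output
mdlc_moments = [0.3, 0.2]       # placeholder values, not real MDLC output
fused = fuse(gabor_moments, mdlc_moments, cooc)
print(len(fused))
```

Concatenation keeps each descriptor's values intact, so the fused dimension is simply the sum of the three vector lengths.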

A Study on Visual Feedback Control of a Dual Arm Robot with Eight Joints

  • Lee, Woo-Song;Kim, Hong-Rae;Kim, Young-Tae;Jung, Dong-Yean;Han, Sung-Hyun
    • Institute of Control, Robotics and Systems (ICROS): Conference Proceedings / 2005.06a / pp.610-615 / 2005
  • Visual servoing is the fusion of results from many elemental areas, including high-speed image processing, kinematics, dynamics, control theory, and real-time computing. It has much in common with research into active vision and structure from motion, but is quite different from the often-described use of vision in hierarchical task-level robot control systems. In this paper, we present a new approach to visual feedback control using image-based visual servoing with stereo vision. To control the position and orientation of a robot with respect to an object, a new technique using binocular stereo vision is proposed. Stereo vision makes it possible to calculate an exact image Jacobian not only near the desired location but also at other locations. The suggested technique can guide a robot manipulator to the desired location without a priori knowledge such as the relative distance to the desired location or a model of the object, even if the initial positioning error is large. This paper describes a model of stereo vision and how to generate feedback commands. The performance of the proposed visual servoing system is illustrated by simulation and experimental results and is compared with that of a conventional method for a dual-arm robot made by Samsung Electronics Co., Ltd.
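
To illustrate why stereo depth helps with the image Jacobian, here is a heavily simplified sketch of image-based visual servoing for a single point feature. It assumes a rectified stereo pair, normalized image coordinates, and a camera that only translates in x and y; the function names, gain, and time step are hypothetical and are not taken from the paper.

```python
def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Depth of a point seen by a rectified stereo pair: Z = f * B / d."""
    return focal_px * baseline_m / disparity_px


def ibvs_velocity(feature, target, depth, gain):
    """Camera velocity (vx, vy) for one point feature, translation in x/y only.

    In normalized image coordinates the image Jacobian reduces to
    L = [[-1/Z, 0], [0, -1/Z]], so v = -gain * L^-1 * e = gain * Z * e.
    """
    ex, ey = feature[0] - target[0], feature[1] - target[1]
    return gain * depth * ex, gain * depth * ey


def simulate(feature, target, depth, gain=1.0, dt=0.1, steps=50):
    """Integrate the feature motion s_dot = L * v with explicit Euler steps."""
    x, y = feature
    for _ in range(steps):
        vx, vy = ibvs_velocity((x, y), target, depth, gain)
        x += dt * (-vx / depth)
        y += dt * (-vy / depth)
    return x, y


depth = depth_from_disparity(focal_px=500.0, baseline_m=0.1, disparity_px=25.0)
final = simulate(feature=(0.4, -0.2), target=(0.0, 0.0), depth=depth)
print(depth, final)
```

Because the depth comes from the measured disparity rather than from a guess at the goal pose, the Jacobian used here is valid wherever the feature currently is, and the feature error shrinks by a constant factor at each step.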

Multi-parametric MRIs based assessment of Hepatocellular Carcinoma Differentiation with Multi-scale ResNet

  • Jia, Xibin;Xiao, Yujie;Yang, Dawei;Yang, Zhenghan;Lu, Chen
    • KSII Transactions on Internet and Information Systems (TIIS) / v.13 no.10 / pp.5179-5196 / 2019
  • To explore an effective non-invasive medical imaging diagnostic approach for hepatocellular carcinoma (HCC), we propose a method that combines multi-parametric data fusion, transfer learning, and multi-scale deep feature extraction. First, to make full use of the complementary information in the different modalities, namely multi-parametric MRI images, and to enhance their contribution to lesion diagnosis, we propose a data-level fusion strategy. Second, with the fused data as input, a multi-scale residual neural network with SPP (Spatial Pyramid Pooling) is used to learn a discriminative feature representation. Third, to mitigate the impact of the lack of training samples, we pre-train the proposed multi-scale residual neural network on a natural image dataset and fine-tune it with the chosen multi-parametric MRI images as complementary data. Comparative experiments on a dataset of clinical cases show that the proposed approach, employing these strategies together, achieves the highest accuracy of 0.847±0.023 on the HCC differentiation classification problem. In discriminating HCC lesions from non-tumor areas, we achieve good performance, with accuracy, sensitivity, specificity, and AUC (area under the ROC curve) of 0.981±0.002, 0.981±0.002, 0.991±0.007, and 0.999±0.0008, respectively.
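
The two building blocks named in this abstract, data-level fusion and spatial pyramid pooling, can be sketched in plain Python as below. This assumes the MRI sequences are already co-registered and that fusion means stacking them as channels, which the abstract does not spell out; the function names and pyramid levels are illustrative only, and no network is trained here.

```python
import math


def stack_modalities(*modalities):
    """Data-level fusion: stack co-registered 2D images into per-pixel channel tuples."""
    rows, cols = len(modalities[0]), len(modalities[0][0])
    return [[tuple(m[r][c] for m in modalities) for c in range(cols)]
            for r in range(rows)]


def spatial_pyramid_pool(feature_map, levels=(1, 2)):
    """Max-pool a 2D map over an n x n grid per level; output length is sum(n * n)."""
    rows, cols = len(feature_map), len(feature_map[0])
    pooled = []
    for n in levels:
        for i in range(n):
            r0, r1 = (i * rows) // n, math.ceil((i + 1) * rows / n)
            for j in range(n):
                c0, c1 = (j * cols) // n, math.ceil((j + 1) * cols / n)
                pooled.append(max(feature_map[r][c]
                                  for r in range(r0, r1)
                                  for c in range(c0, c1)))
    return pooled


t1 = [[1, 2], [3, 4]]            # toy stand-ins for two MRI sequences
t2 = [[5, 6], [7, 8]]
fused = stack_modalities(t1, t2)

fmap = [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14, 15]]
pooled = spatial_pyramid_pool(fmap)
print(fused[0][0], pooled)
```

The pooled vector has the same length for any input size, which is what lets a multi-scale network feed differently sized feature maps into a fixed-size classifier layer.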

Image Quality Enhancement for Chest X-ray images (흉부 엑스레이 영상을 위한 화질 개선 알고리즘)

  • Park, So Yeon;Song, Byung Cheol
    • Journal of the Institute of Electronics and Information Engineers / v.52 no.10 / pp.97-107 / 2015
  • The initial X-ray images obtained from a digital X-ray machine have a wider data range and more uneven brightness than normal images. In chest X-ray images in particular, all parts, such as the ribs, spine, and tissue, need to be enhanced naturally. Conventional image quality enhancement algorithms cannot improve these X-ray images sufficiently because their characteristics differ from those of ordinary images. This paper proposes eliminating unnecessary background from an input image and expanding the histogram range of the image. Then, we adjust the weight of each frequency band of the image to improve contrast and sharpness. Finally, by jointly exploiting the advantages of global and local contrast enhancement methods, we obtain an improved X-ray image that is better suited to effective diagnosis than those produced by existing methods. Experimental results show quantitatively that the proposed algorithm provides better X-ray images than previous works in terms of discrete entropy and saturation.
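
The histogram range expansion step can be illustrated with a plain linear stretch. This is only a sketch under the assumption that "expanding the histogram range" means mapping the observed minimum and maximum onto the full output range; the paper's background removal and frequency-band weighting are not shown, and the function name is hypothetical.

```python
def stretch_histogram(image, out_max=255):
    """Linearly map the image's [min, max] intensity range onto [0, out_max]."""
    lo = min(min(row) for row in image)
    hi = max(max(row) for row in image)
    if hi == lo:
        # A flat image has no range to expand.
        return [[0 for _ in row] for row in image]
    return [[round((v - lo) * out_max / (hi - lo)) for v in row] for row in image]


narrow = [[100, 120],
          [200, 100]]            # toy image using only part of the 8-bit range
stretched = stretch_histogram(narrow)
print(stretched)
```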

Hot Spot Detection of Thermal Infrared Image of Photovoltaic Power Station Based on Multi-Task Fusion

  • Xu Han;Xianhao Wang;Chong Chen;Gong Li;Changhao Piao
    • Journal of Information Processing Systems / v.19 no.6 / pp.791-802 / 2023
  • Meeting the inspection requirements of large-scale photovoltaic (PV) power plants through manual inspection of PV panels is challenging. We present a hot spot detection and positioning method that detects hot spots in batches and locates their latitudes and longitudes. First, a network based on the YOLOv3 architecture is used to identify hot spots. The innovation is to modify the RU_1 unit in the YOLOv3 model for hot spot detection in the far field of view and to add a neural network residual unit for fusion. In addition, because of the misidentification problem in infrared images of solar PV panels, the DeepLab v3+ model is adopted to segment the PV panels and filter out misidentifications caused by bright spots on the ground. Finally, the latitude and longitude of each hot spot are calculated by a geometric positioning method using known information such as the drone's yaw angle, shooting height, and lens field of view. The experimental results indicate that the hot spot recognition accuracy is above 98%. With the drone kept 25 m off the ground, the hot spot positioning error is at the decimeter level.
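
The geometric positioning step might look roughly like the sketch below. It is not the paper's formula: it assumes a camera pointing straight down, a pinhole model, yaw measured clockwise from north, and a rough spherical-Earth conversion from metres to degrees. All names and the constant are hypothetical.

```python
import math

METERS_PER_DEG_LAT = 111_320.0  # rough spherical-Earth value (assumption)


def locate_hot_spot(px, py, width, height, hfov_deg, vfov_deg,
                    altitude_m, yaw_deg, drone_lat, drone_lon):
    """Estimate (lat, lon) of an image pixel for a camera pointing straight down."""
    # Ground footprint half-sizes from the flying height and the field of view.
    half_w = altitude_m * math.tan(math.radians(hfov_deg) / 2)
    half_h = altitude_m * math.tan(math.radians(vfov_deg) / 2)
    # Offsets in metres in the camera frame: right of and ahead of the image centre.
    right = (px - width / 2) / (width / 2) * half_w
    forward = (height / 2 - py) / (height / 2) * half_h
    # Rotate by yaw (clockwise from north) into north/east offsets.
    yaw = math.radians(yaw_deg)
    north = forward * math.cos(yaw) - right * math.sin(yaw)
    east = forward * math.sin(yaw) + right * math.cos(yaw)
    lat = drone_lat + north / METERS_PER_DEG_LAT
    lon = drone_lon + east / (METERS_PER_DEG_LAT * math.cos(math.radians(drone_lat)))
    return lat, lon


centre = locate_hot_spot(320, 240, 640, 480, 90, 90, 25.0, 0.0, 30.0, 120.0)
top = locate_hot_spot(320, 0, 640, 480, 90, 90, 25.0, 0.0, 30.0, 120.0)
print(centre, top)
```

With a 90 degree vertical field of view at 25 m, the top edge of the frame lies about 25 m ahead of the drone, so a pixel there shifts the latitude by roughly 25 m worth of degrees when the drone faces north.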

Radiologic Findings and Risk Factors of Adjacent Segment Degeneration after Anterior Cervical Discectomy and Fusion : A Retrospective Matched Cohort Study with 3-Year Follow-Up Using MRI

  • Ahn, Sang-Soak;So, Wan-Soo;Ku, Min-Geun;Kim, Sang-Hyeon;Kim, Dong-Won;Lee, Byung-Hun
    • Journal of Korean Neurosurgical Society / v.59 no.2 / pp.129-136 / 2016
  • Objective : The purpose of this study was to identify the radiologic findings and risk factors related to adjacent segment degeneration (ASD) after anterior cervical discectomy and fusion (ACDF) using 3-year follow-up radiography, computed tomography (CT), and magnetic resonance imaging (MRI). Methods : A retrospective matched comparative study was performed for 64 patients who underwent single-level ACDF with a cage and plate. Radiologic parameters, including upper segment range of motion (USROM), lower segment range of motion (LSROM), upper segment disc height (UDH), and lower segment disc height (LDH), clinical outcomes assessed with neck and arm visual analogue scale (VAS), and risk factors were analyzed. Results : Patients were categorized into the ASD (32 patients) and non-ASD (32 patients) groups. The decrease in UDH was significantly greater in the ASD group at each follow-up visit. At 36 months postoperatively, the change in USROM from the preoperative value was significantly greater in the ASD group than in the non-ASD group. Preoperative degeneration of other segments was significantly associated with an increased incidence of ASD at 36 months. However, neck and arm pain intensity did not differ significantly between groups at any postoperative follow-up visit. Conclusion : The main factor affecting ASD is preoperative degeneration of other segments, outside the adjacent segment. In addition, patients over the age of 50 are at higher risk of developing ASD. Although there was definite radiologic degeneration in the ASD group, no significant difference was observed between the ASD and non-ASD groups in terms of the incidence of symptomatic disease.

A Study on Automatic Extraction of Buildings Using LIDAR with Aerial Imagery (LIDAR 데이터와 항공사진을 이용한 건물의 자동추출에 관한 연구)

  • 이영진;조우석
    • Proceedings of the Korean Society of Surveying, Geodesy, Photogrammetry, and Cartography Conference / 2003.04a / pp.471-477 / 2003
  • This paper presents an algorithm that automatically extracts buildings from among the many different features on the earth's surface by fusing LIDAR data with panchromatic aerial images. The proposed algorithm consists of three stages: a point-level process, a polygon-level process, and a parameter-space-level process. In the first stage, we eliminate gross errors and apply a local maxima filter to detect building candidate points in the raw laser scanning data. A grouping procedure then segments the raw LIDAR data, and the segmented LIDAR data are polygonized by the encasing polygon algorithm developed in this research. In the second stage, we eliminate non-building polygons using several constraints such as area and circularity. In the last stage, all the polygons generated in the second stage are projected onto the aerial stereo images through collinearity condition equations. Finally, we fuse the projected encasing polygons with edges detected by image processing to refine the building segments. The experimental results showed that the RMSEs of building corners in X, Y, and Z were ±8.1 cm, ±24.7 cm, and ±35.9 cm, respectively.
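
The second-stage filtering by area and circularity can be sketched as follows. The abstract does not define circularity, so this assumes the common 4πA/P² measure; the thresholds and function names are hypothetical.

```python
import math


def polygon_area(points):
    """Shoelace formula; points are (x, y) vertices in order."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(points, points[1:] + points[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2


def polygon_perimeter(points):
    """Sum of edge lengths, closing the polygon back to its first vertex."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:] + points[:1]))


def circularity(points):
    """4*pi*A / P^2: 1 for a circle, smaller for elongated or ragged shapes."""
    return 4 * math.pi * polygon_area(points) / polygon_perimeter(points) ** 2


def keep_building_candidates(polygons, min_area, min_circularity):
    """Drop polygons that are too small or too elongated to be buildings."""
    return [p for p in polygons
            if polygon_area(p) >= min_area and circularity(p) >= min_circularity]


square = [(0, 0), (10, 0), (10, 10), (0, 10)]      # compact, building-like
sliver = [(0, 0), (100, 0), (100, 1), (0, 1)]      # same area, very elongated
speck = [(0, 0), (1, 0), (1, 1), (0, 1)]           # compact but tiny
kept = keep_building_candidates([square, sliver, speck],
                                min_area=50, min_circularity=0.5)
print(kept == [square])
```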

Using the fusion of spatial and temporal features for malicious video classification (공간과 시간적 특징 융합 기반 유해 비디오 분류에 관한 연구)

  • Jeon, Jae-Hyun;Kim, Se-Min;Han, Seung-Wan;Ro, Yong-Man
    • The KIPS Transactions:PartB / v.18B no.6 / pp.365-374 / 2011
  • Recently, malicious video classification and filtering techniques have been of practical interest, as one can easily access malicious multimedia content through the Internet, IPTV, online social networks, etc. Considerable research effort has gone into developing malicious video classification and filtering systems. However, malicious video classification and filtering is still far from mature in terms of reliable classification/filtering performance. In particular, most conventional approaches have been limited to using only spatial features (such as the ratio of skin regions and bag of visual words) for malicious image classification. Hence, previous approaches have been limited in achieving acceptable classification and filtering performance. To overcome this limitation, we propose a new malicious video classification framework that uses both spatial and temporal features, which are readily extracted from a sequence of video frames. In particular, we develop effective temporal features based on a motion periodicity feature and temporal correlation. In addition, to find the best data fusion approach for combining the spatial and temporal features, representative data fusion approaches are applied to the proposed framework. To demonstrate the effectiveness of our method, we collect 200 sexual intercourse videos and 200 non-sexual intercourse videos. Experimental results show that the proposed method increases accuracy by 3.75 percentage points (from 92.25% to 96%) for the classification of sexual intercourse videos. Further, based on our experimental results, the feature-level fusion approach (fusing spatial and temporal features) is found to achieve the best classification accuracy.
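
One plausible way to turn motion periodicity into a temporal feature is to look for the strongest peak in the autocorrelation of a per-frame motion signal. The abstract does not say how its periodicity feature is computed, so the sketch below is an assumption throughout: the per-frame signal, the function names, and the search range are all hypothetical.

```python
def autocorrelation(signal, lag):
    """Normalized autocorrelation of a mean-removed signal at the given lag."""
    mean = sum(signal) / len(signal)
    centered = [s - mean for s in signal]
    energy = sum(c * c for c in centered)
    if energy == 0:
        return 0.0
    return sum(a * b for a, b in zip(centered, centered[lag:])) / energy


def dominant_period(signal):
    """Lag (in frames) with the strongest autocorrelation, up to half the length."""
    lags = range(1, len(signal) // 2 + 1)
    return max(lags, key=lambda lag: autocorrelation(signal, lag))


# Toy per-frame motion magnitude that repeats every 4 frames.
motion = [0, 1, 0, -1] * 8
period = dominant_period(motion)
strength = autocorrelation(motion, period)
print(period, strength)
```

The pair (period, strength) could then be appended to the spatial feature vector, which is the concatenation form of feature-level fusion.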

Applicability of Satellite SAR Imagery for Estimating Reservoir Storage (저수지 저수량 추정을 위한 위성 SAR 자료의 활용성)

  • Jang, Min-Won;Lee, Hyeon-Jeong;Kim, Yi-Hyun;Hong, Suk-Young
    • Journal of The Korean Society of Agricultural Engineers / v.53 no.6 / pp.7-16 / 2011
  • This study discusses the applicability of satellite SAR (Synthetic Aperture Radar) imagery to reservoir monitoring and attempts to extract reservoir storage from multi-temporal C-band RADARSAT-1 SAR backscattering images of the Yedang and Goongpyeong agricultural reservoirs, acquired from May to October 2005. SAR technology has advanced as a complementary and alternative approach to optical remote sensing and in-situ measurement. Water bodies appear dark in SAR imagery because of low backscattering, and reservoir storage can be derived from the backscatter contrast together with the level-area-volume relationship of each reservoir. Threshold segmentation, applied after routine preprocessing of the SAR images such as speckle reduction and low-pass filtering, yielded a significant correlation between the SAR-derived reservoir storage and the observation record, in spite of considerable disagreement. The results revealed critical limitations in adopting SAR data for reservoir monitoring, including the inappropriate specifications of the SAR data, the unreliable rating curve of the reservoir, the lack of climatic information such as wind and precipitation, and interference from land cover inside and around the reservoir. Furthermore, better accuracy of SAR-based reservoir monitoring could be expected from alternatives such as multi-sensor image fusion and water level measurement with altimeters or interferometry.
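
The chain from backscatter image to storage estimate can be sketched as a threshold followed by a lookup in the area-volume curve. The threshold, pixel size, and rating-curve values below are invented for illustration and do not come from the study; speckle filtering is assumed to have been done already.

```python
def water_area_m2(backscatter_db, threshold_db, pixel_area_m2):
    """Count pixels darker than the threshold and convert to surface area."""
    water_pixels = sum(1 for row in backscatter_db for v in row if v < threshold_db)
    return water_pixels * pixel_area_m2


def storage_from_area(area_m2, rating_curve):
    """Linearly interpolate storage from (area, volume) pairs sorted by area."""
    if area_m2 <= rating_curve[0][0]:
        return rating_curve[0][1]
    for (a0, v0), (a1, v1) in zip(rating_curve, rating_curve[1:]):
        if a0 <= area_m2 <= a1:
            return v0 + (v1 - v0) * (area_m2 - a0) / (a1 - a0)
    return rating_curve[-1][1]  # clamp above the last tabulated point


scene = [[-22, -21, -8],
         [-20, -9, -7]]                      # toy backscatter values in dB
curve = [(0, 0), (200, 400), (400, 1200)]    # hypothetical (area m^2, volume m^3)
area = water_area_m2(scene, threshold_db=-15, pixel_area_m2=100)
storage = storage_from_area(area, curve)
print(area, storage)
```

Any error in the rating curve passes straight through the interpolation, which is consistent with the abstract listing an unreliable rating curve among the main limitations.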

Hand Raising Pose Detection in the Images of a Single Camera for Mobile Robot (주행 로봇을 위한 단일 카메라 영상에서 손든 자세 검출 알고리즘)

  • Kwon, Gi-Il
    • The Journal of Korea Robotics Society / v.10 no.4 / pp.223-229 / 2015
  • This paper proposes a novel method for detecting hand-raising poses in images acquired from a single camera attached to a mobile robot that navigates unknown dynamic environments. Because of unconstrained illumination, high variance in human appearance, and unpredictable backgrounds, detecting hand-raising gestures in an image acquired from a camera attached to a mobile robot is very challenging. The proposed method first detects faces to determine the region of interest (ROI), and within this ROI we detect hands using a HOG-based hand detector. Using the color distribution of the face region, we evaluate each candidate in the detected hand region. To deal with failures in face detection, we also use a HOG-based hand-raising pose detector. Unlike other hand-raising pose detection systems, we evaluate our algorithm on images acquired from the camera and on images obtained from the Internet that contain unknown backgrounds and unconstrained illumination. The variance in hand-raising poses in these images is very high. Our experimental results show that the proposed method robustly detects hand-raising poses against complex backgrounds and under unknown lighting conditions.