• Title/Summary/Keyword: Feature Scale Model

Modified Pyramid Scene Parsing Network with Deep Learning based Multi Scale Attention (딥러닝 기반의 Multi Scale Attention을 적용한 개선된 Pyramid Scene Parsing Network)

  • Kim, Jun-Hyeok;Lee, Sang-Hun;Han, Hyun-Ho
    • Journal of the Korea Convergence Society / v.12 no.11 / pp.45-51 / 2021
  • With the development of deep learning, semantic segmentation methods are being studied in various fields, but segmentation accuracy drops in fields that demand precision, such as medical image analysis. In this paper, we improve PSPNet, a deep learning based segmentation method, to minimize the loss of features during semantic segmentation. Conventional deep learning based segmentation methods lower the resolution and lose object features during feature extraction and compression. Because of these losses, the edges and internal information of objects are lost, reducing segmentation accuracy. To solve this, the proposed multi-scale attention is added to the conventional PSPNet to prevent the loss of object features: a feature refinement step applies attention to the conventional PPM module, and by suppressing unnecessary feature information, edge and texture information is better preserved. The proposed method was trained on the Cityscapes dataset and evaluated quantitatively with the MIoU segmentation index. In the experiments, segmentation accuracy improved by about 1.5% compared to the conventional PSPNet.
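
A minimal PyTorch sketch of the general idea in this abstract, attention-based refinement applied to the pyramid pooling branches of a PSPNet-style PPM. This is not the authors' implementation; the channel-attention form, bin sizes, and channel counts are illustrative assumptions.

```python
# Sketch of an attention-refined Pyramid Pooling Module (PPM). Branch sizes,
# channel counts, and the squeeze-and-excitation style channel attention are
# assumptions for illustration, not the paper's exact design.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ChannelAttention(nn.Module):
    """Re-weights channels to suppress uninformative feature maps."""
    def __init__(self, channels, reduction=8):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels), nn.Sigmoid())

    def forward(self, x):
        w = self.fc(x.mean(dim=(2, 3)))           # global average pool -> channel weights
        return x * w.unsqueeze(-1).unsqueeze(-1)  # scale each channel

class AttentionPPM(nn.Module):
    """PSPNet-style pyramid pooling with per-branch attention refinement."""
    def __init__(self, in_ch=2048, branch_ch=512, bins=(1, 2, 3, 6)):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Sequential(nn.AdaptiveAvgPool2d(b),
                          nn.Conv2d(in_ch, branch_ch, 1, bias=False),
                          nn.BatchNorm2d(branch_ch), nn.ReLU(inplace=True))
            for b in bins])
        self.attn = nn.ModuleList([ChannelAttention(branch_ch) for _ in bins])

    def forward(self, x):
        h, w = x.shape[2:]
        outs = [x]
        for branch, attn in zip(self.branches, self.attn):
            y = attn(branch(x))                                  # refine branch features
            outs.append(F.interpolate(y, (h, w), mode='bilinear',
                                      align_corners=False))
        return torch.cat(outs, dim=1)  # concatenated multi-scale context

if __name__ == "__main__":
    feat = torch.randn(1, 2048, 60, 60)   # backbone feature map
    print(AttentionPPM()(feat).shape)     # torch.Size([1, 4096, 60, 60])
```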

WLSD: A Perceptual Stimulus Model Based Shape Descriptor

  • Li, Jiatong;Zhao, Baojun;Tang, Linbo;Deng, Chenwei;Han, Lu;Wu, Jinghui
    • KSII Transactions on Internet and Information Systems (TIIS) / v.8 no.12 / pp.4513-4532 / 2014
  • Motivated by Weber's Law, this paper proposes an efficient and robust shape descriptor based on a perceptual stimulus model, called the Weber's Law Shape Descriptor (WLSD). It builds on the theory that human perception of a pattern depends not only on the change of stimulus intensity but also on the original stimulus intensity. Invariance to scale and rotation is an intrinsic property of WLSD. As a global shape descriptor, WLSD has far lower computational complexity while being as discriminative as state-of-the-art shape descriptors. Experimental results demonstrate the strong capability of the proposed method in shape retrieval.
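
The exact WLSD formulation is not reproduced in the abstract; the sketch below only illustrates the ratio-based Weber idea (perceived change proportional to delta-I over I) on a contour. The choice of centroid distance as the "stimulus" and the histogram binning are assumptions.

```python
# Illustrative Weber's-Law-style contour descriptor, NOT the published WLSD.
# Ratios of stimulus change to original stimulus make the descriptor scale
# invariant; the global histogram discards point ordering (rotation invariance).
import numpy as np

def weber_shape_descriptor(contour, n_bins=32, neighborhood=2):
    """contour: (N, 2) array of boundary points, ordered along the shape."""
    centroid = contour.mean(axis=0)
    dist = np.linalg.norm(contour - centroid, axis=1)        # stimulus I
    delta = np.zeros_like(dist)                               # change of stimulus
    for k in range(1, neighborhood + 1):
        delta += (np.roll(dist, k) - dist) + (np.roll(dist, -k) - dist)
    excitation = np.arctan(delta / (dist + 1e-8))             # bounded Weber ratio
    hist, _ = np.histogram(excitation, bins=n_bins, range=(-np.pi / 2, np.pi / 2))
    return hist / (hist.sum() + 1e-8)                         # normalized global descriptor

if __name__ == "__main__":
    t = np.linspace(0, 2 * np.pi, 200, endpoint=False)
    ellipse = np.c_[3 * np.cos(t), 2 * np.sin(t)]
    d1 = weber_shape_descriptor(ellipse)
    d2 = weber_shape_descriptor(5 * ellipse)                  # scaled copy
    print(np.abs(d1 - d2).max())                              # ~0: scale invariant
```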

A Three-scale Pedestrian Detection Method based on Refinement Module (Refinement Module 기반 Three-Scale 보행자 검출 기법)

  • Kyungmin Jung;Sooyong Park;Hyun Lee
    • IEMEK Journal of Embedded Systems and Applications / v.18 no.5 / pp.259-265 / 2023
  • Deep learning based pedestrian detection is used to detect pedestrians effectively in various situations, but it is hampered by problems such as camera performance, pedestrian appearance, height, and occlusion. Even for the same pedestrian, detection performance can differ with height, and pedestrian heights span various scales, from infants to adolescents and adults, so a model fitted to one group extracts features inaccurately for the others. This study therefore proposes a pedestrian detection method that refines the pedestrian region with a Refining Layer and Feature Concatenation to account for the various heights of pedestrians, finely adjusting the score and location values of the pedestrian region. Experiments on four types of test data demonstrate that the proposed model achieves 2-5% higher average precision (AP) than Faster R-CNN and DRPN.
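
A rough PyTorch sketch of the "refine score and location across scales" idea: three scale-specific feature maps are concatenated and a small head predicts a refined pedestrian score plus box offsets. Layer names and sizes are hypothetical, not the authors' architecture.

```python
# Hypothetical refinement head over three pedestrian scales (not the paper's code).
import torch
import torch.nn as nn
import torch.nn.functional as F

class RefiningHead(nn.Module):
    def __init__(self, in_ch=256, n_scales=3):
        super().__init__()
        self.fuse = nn.Conv2d(in_ch * n_scales, in_ch, 1)   # feature concatenation + fusion
        self.score = nn.Conv2d(in_ch, 1, 3, padding=1)      # refined pedestrian score
        self.delta = nn.Conv2d(in_ch, 4, 3, padding=1)      # refined box offsets (dx,dy,dw,dh)

    def forward(self, feats):
        # resize every scale to the finest resolution before concatenating
        h, w = feats[0].shape[2:]
        ups = [F.interpolate(f, (h, w), mode='bilinear', align_corners=False)
               for f in feats]
        x = F.relu(self.fuse(torch.cat(ups, dim=1)))
        return torch.sigmoid(self.score(x)), self.delta(x)

if __name__ == "__main__":
    feats = [torch.randn(1, 256, s, s) for s in (64, 32, 16)]  # three scales
    score, delta = RefiningHead()(feats)
    print(score.shape, delta.shape)  # (1, 1, 64, 64) (1, 4, 64, 64)
```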

Robust AAM-based Face Tracking with Occlusion Using SIFT Features (SIFT 특징을 이용하여 중첩상황에 강인한 AAM 기반 얼굴 추적)

  • Eom, Sung-Eun;Jang, Jun-Su
    • The KIPS Transactions:PartB / v.17B no.5 / pp.355-362 / 2010
  • Face tracking estimates the motion of a non-rigid face together with a rigid head in 3D, and plays an important role in higher-level tasks such as face, facial expression, and emotion recognition. In this paper, we propose an AAM-based face tracking algorithm. AAMs have been widely used to segment and track deformable objects, but many difficulties remain; in particular, they often diverge or converge to local minima when a target object is self-occluded, partially occluded, or completely occluded. To address this problem, we utilize the scale invariant feature transform (SIFT). SIFT handles self and partial occlusion effectively because it can find correspondences between feature points even under partial loss, and its good global matching performance lets the AAM continue tracking without re-initialization after complete occlusion. We also register and use SIFT features extracted from multi-view face images during tracking to track a face effectively across large pose changes. The proposed algorithm is validated by comparison with other algorithms under the above three kinds of occlusion.
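
A sketch, using OpenCV, of the kind of SIFT matching step that could re-anchor an AAM fit after occlusion. The AAM itself is not implemented here; the template image, ratio-test threshold, and homography step are illustrative assumptions rather than the paper's procedure.

```python
# SIFT matching between a registered face template and the current frame,
# followed by a robust global transform estimate (could re-initialize the AAM).
import cv2
import numpy as np

def match_face_region(template_gray, frame_gray, ratio=0.75):
    sift = cv2.SIFT_create()
    kp1, des1 = sift.detectAndCompute(template_gray, None)
    kp2, des2 = sift.detectAndCompute(frame_gray, None)
    if des1 is None or des2 is None:
        return None
    matcher = cv2.BFMatcher(cv2.NORM_L2)
    good = []
    for pair in matcher.knnMatch(des1, des2, k=2):
        if len(pair) == 2 and pair[0].distance < ratio * pair[1].distance:
            good.append(pair[0])                       # Lowe's ratio test
    if len(good) < 4:
        return None                                    # not enough correspondences
    src = np.float32([kp1[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
    dst = np.float32([kp2[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)
    # global transform from template face to current frame; robust to partial loss
    H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
    return H
```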

Panoramic Image Composition Algorithm through Scaling and Rotation Invariant Features (크기 및 회전 불변 특징점을 이용한 파노라마 영상 합성 알고리즘)

  • Kwon, Ki-Won;Lee, Hae-Yeoun;Oh, Duk-Hwan
    • The KIPS Transactions:PartB / v.17B no.5 / pp.333-344 / 2010
  • This paper addresses how to compose panoramic images from images of the same scene. With the spread of digital cameras, interest in generating panoramic images has grown. We propose a panoramic image generation method using scaling- and rotation-invariant features. First, feature points are extracted from the input images and matched with a RANSAC algorithm. Then, after a perspective model is estimated, the input image is registered with this model. Since the SURF feature extraction algorithm is adopted, the proposed method is robust against geometric distortions such as scaling and rotation, and computational cost is also improved. In the experiments, the SURF features of the proposed method are compared with features from the Harris corner detector and the SIFT algorithm. The proposed method is tested by generating panoramic images from 640×480 images; results show that it takes 0.4 seconds of computation on average and is more efficient than the other schemes.
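
A minimal OpenCV sketch of the pipeline this abstract describes: feature extraction, matching, RANSAC estimation of a perspective model, and registration of one image onto the other. SURF requires the non-free opencv-contrib build, so the fallback to SIFT and the canvas size are assumptions for illustration.

```python
# Two-image stitching sketch (assumed pipeline, not the paper's implementation).
import cv2
import numpy as np

def stitch_pair(img_left, img_right):
    try:
        detector = cv2.xfeatures2d.SURF_create(400)   # needs opencv-contrib non-free build
    except (AttributeError, cv2.error):
        detector = cv2.SIFT_create()                  # scale/rotation-invariant fallback
    g1 = cv2.cvtColor(img_left, cv2.COLOR_BGR2GRAY)
    g2 = cv2.cvtColor(img_right, cv2.COLOR_BGR2GRAY)
    kp1, des1 = detector.detectAndCompute(g1, None)
    kp2, des2 = detector.detectAndCompute(g2, None)
    matches = cv2.BFMatcher(cv2.NORM_L2, crossCheck=True).match(des1, des2)
    matches = sorted(matches, key=lambda m: m.distance)[:200]
    src = np.float32([kp2[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
    dst = np.float32([kp1[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
    H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 4.0)     # perspective model
    h, w = img_left.shape[:2]
    canvas = cv2.warpPerspective(img_right, H, (w * 2, h))   # register right image
    canvas[0:h, 0:w] = img_left                              # simple overlay composite
    return canvas
```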

Berg Balance Scale Score Classification Study Using Inertial Sensor (관성센서를 이용한 버그균형검사 점수 분류 연구)

  • Hong, Sangpyo;Kim, Yeon-wook;Cho, WooHyeong;Joa, Kyung-Lim;Jung, Han-Young;Kim, K.S.;Lee, S.M.
    • Journal of rehabilitation welfare engineering & assistive technology / v.11 no.1 / pp.53-62 / 2017
  • In this paper, we present the score classification accuracy achieved with machine learning for the BBS (Berg Balance Scale), the most commonly used balance evaluation tool. Data acquisition was performed with the Noraxon system, whose inertial sensors were attached to the body at 8 locations (left and right ankles, left and right upper buttocks, left and right wrists, back, and forehead). From the 3-axis accelerometer of each inertial sensor, STFT (Short Time Fourier Transform) and SAM (Signal Area Magnitude) feature vectors were extracted. The items of the BBS were then divided into static and dynamic movements according to their motion characteristics, and feature vectors were selected according to the sensor attachment positions that affect the score of each BBS item. The feature vectors selected for each BBS item were classified using a GMM (Gaussian Mixture Model). For 40 subjects, the accuracies of the 14 BBS items were 55.5%, 72.2%, 87.5%, 50%, 35.1%, 62.5%, 43.3%, 58.6%, 60.7%, 33.3%, 44.8%, 89.2%, 51.8%, and 85.1%, respectively.
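
A hedged sketch of the classification stage only: one Gaussian mixture per score class, with a test feature vector assigned to the class of highest likelihood. The STFT/SAM feature extraction from the accelerometers is replaced by placeholder random data, and all dimensions are assumptions.

```python
# Per-class GMM classifier (scikit-learn), illustrating the scoring scheme.
import numpy as np
from sklearn.mixture import GaussianMixture

class GMMClassifier:
    def __init__(self, n_components=2):
        self.n_components = n_components
        self.models = {}

    def fit(self, X, y):
        for label in np.unique(y):
            gmm = GaussianMixture(n_components=self.n_components,
                                  covariance_type='diag', random_state=0)
            gmm.fit(X[y == label])                   # one mixture per BBS score class
            self.models[label] = gmm
        return self

    def predict(self, X):
        labels = sorted(self.models)
        loglik = np.column_stack([self.models[l].score_samples(X) for l in labels])
        return np.array(labels)[loglik.argmax(axis=1)]  # maximum-likelihood class

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(120, 16))          # stand-in for STFT/SAM feature vectors
    y = rng.integers(0, 5, size=120)        # BBS item scores 0..4
    clf = GMMClassifier().fit(X, y)
    print((clf.predict(X) == y).mean())     # training accuracy on random data
```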

Android malicious code Classification using Deep Belief Network

  • Shiqi, Luo;Shengwei, Tian;Long, Yu;Jiong, Yu;Hua, Sun
    • KSII Transactions on Internet and Information Systems (TIIS) / v.12 no.1 / pp.454-475 / 2018
  • This paper presents a novel Android malware classification model designed to classify and categorize Android malicious code from the Drebin dataset. The number of malicious mobile applications targeting Android based smartphones has increased rapidly. In this paper, a Restricted Boltzmann Machine and a Deep Belief Network are used to classify malware into families of Android applications. A texture-fingerprint based approach is proposed to extract the features of malware content: a malware sample has a unique "image texture" in its feature spatial relations, so the method maps malicious or benign code to an uncompressed gray-scale texture image. By studying and extracting the implicit features of the API calls from a large number of training samples, the original dynamic activity feature sets are obtained. To improve the accuracy of the classification algorithm and its feature selection, the implicit features of the texture image and of the API calls in malicious code are combined to train the Restricted Boltzmann Machine with back propagation. In an evaluation with different malware and benign samples, the experimental results suggest that this method, which uses a Deep Belief Network to classify Android malware by texture images and API calls, detects more than 94% of the malware with few false alarms, clearly higher than shallow machine learning algorithms.
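
A short sketch of the "image texture" step described above: the raw bytes of a binary are mapped to an uncompressed gray-scale image whose texture can then be combined with API-call features for the RBM/DBN. The image width and file path are assumptions for illustration.

```python
# Byte-to-grayscale mapping sketch (assumed parameters, not the paper's exact pipeline).
import numpy as np

def bytes_to_grayscale(path, width=256):
    data = np.fromfile(path, dtype=np.uint8)       # each byte -> one pixel value (0-255)
    height = len(data) // width
    return data[:height * width].reshape(height, width)

if __name__ == "__main__":
    # hypothetical sample path; any binary file demonstrates the mapping
    img = bytes_to_grayscale("sample.apk")
    print(img.shape, img.dtype)                    # (H, 256) uint8 texture image
```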

Texture Segmentation Using Statistical Characteristics of SOM and Multiscale Bayesian Image Segmentation Technique (SOM의 통계적 특성과 다중 스케일 Bayesian 영상 분할 기법을 이용한 텍스쳐 분할)

  • Kim Tae-Hyung;Eom Il-Kyu;Kim Yoo-Shin
    • Journal of the Institute of Electronics Engineers of Korea SP / v.42 no.6 / pp.43-54 / 2005
  • This paper proposes a novel texture segmentation method combining Bayesian image segmentation with a SOM (Self-Organizing Feature Map). Multi-scale wavelet coefficients are used as the input of the SOM, and the likelihood and a posteriori probability of the observations are obtained from the trained SOMs. Texture segmentation is performed with the a posteriori probabilities from the trained SOMs and MAP (Maximum A Posteriori) classification, and the segmentation result is further improved with context information. The proposed method shows better performance than segmentation based on the HMT (Hidden Markov Tree) model. The texture segmentation results obtained with the SOM and the multi-scale Bayesian image segmentation technique HMTseg also outperform those of HMT combined with HMTseg.
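
A sketch of the front end described above, multi-scale wavelet coefficients fed to a SOM, using PyWavelets and the third-party `minisom` package. The per-pixel feature construction, SOM grid, and training length are assumptions, and the Bayesian/MAP fusion across scales is not reproduced here.

```python
# Wavelet-coefficient features + SOM training sketch (assumed setup).
import numpy as np
import pywt
from minisom import MiniSom

def wavelet_feature_vectors(image, wavelet='db2', level=2):
    """Per-pixel features: detail-coefficient magnitudes from each scale,
    upsampled back to the image grid and stacked."""
    coeffs = pywt.wavedec2(image, wavelet, level=level)
    h, w = image.shape
    planes = []
    for (cH, cV, cD) in coeffs[1:]:                      # skip the approximation band
        mag = np.sqrt(cH**2 + cV**2 + cD**2)
        ry = np.minimum((np.arange(h) * mag.shape[0]) // h, mag.shape[0] - 1)
        rx = np.minimum((np.arange(w) * mag.shape[1]) // w, mag.shape[1] - 1)
        planes.append(mag[np.ix_(ry, rx)].ravel())       # nearest-neighbour upsample
    return np.stack(planes, axis=1)                      # (h*w, level) feature vectors

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    texture = rng.normal(size=(128, 128))                # stand-in for a texture image
    X = wavelet_feature_vectors(texture)
    som = MiniSom(8, 8, X.shape[1], sigma=1.0, learning_rate=0.5, random_seed=0)
    som.train_random(X, 1000)
    # each node's hit statistics could then feed the likelihoods used in MAP labeling
    print(som.winner(X[0]))
```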

Efficient Multi-scalable Network for Single Image Super Resolution

  • Alao, Honnang;Kim, Jin-Sung;Kim, Tae Sung;Lee, Kyujoong
    • Journal of Multimedia Information System / v.8 no.2 / pp.101-110 / 2021
  • In computer vision, single-image super resolution has been an active research area for a long time. Traditional techniques rely on interpolation-based methods such as nearest-neighbor, bilinear, and bicubic interpolation for image restoration. Although convolutional neural networks have provided outstanding results in recent years, efficiency and single-model multi-scalability remain challenges, and previous works have not placed enough emphasis on real-number scalability. Interpolation-based techniques, by contrast, have no limit on scalability, as they can upscale images to any desired size. In this paper, we propose a convolutional neural network that keeps the advantages of interpolation-based techniques while remaining efficient, making it suitable for practical implementations. It consists of convolutional layers applied in the low-resolution space, post-up-sampling toward the final hidden layers, and additional layers in the high-resolution space; up-sampling is applied to a multi-channel feature map via bicubic interpolation, using a single model. Experiments on the architectural structure, layer reduction, and real-number scale training show the model to be efficient among multi-scale learning (including scale multi-path learning) based models.
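
A hedged PyTorch sketch of the idea in this abstract: convolutions in the low-resolution space, bicubic up-sampling applied to the multi-channel feature map so a single model handles arbitrary (including non-integer) scales, and a few layers in the high-resolution space. Depth, channel counts, and the residual connection are assumptions, not the published architecture.

```python
# Single-model, real-number-scale SR sketch (assumed layer sizes).
import torch
import torch.nn as nn
import torch.nn.functional as F

class AnyScaleSR(nn.Module):
    def __init__(self, channels=32, lr_layers=6):
        super().__init__()
        body = [nn.Conv2d(3, channels, 3, padding=1), nn.ReLU(inplace=True)]
        for _ in range(lr_layers):
            body += [nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(inplace=True)]
        self.lr_body = nn.Sequential(*body)                  # work in LR space (cheap)
        self.hr_body = nn.Sequential(                        # light refinement in HR space
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(channels, 3, 3, padding=1))

    def forward(self, x, scale):
        feat = self.lr_body(x)
        feat = F.interpolate(feat, scale_factor=scale, mode='bicubic',
                             align_corners=False)            # post-up-sampling of features
        base = F.interpolate(x, scale_factor=scale, mode='bicubic',
                             align_corners=False)            # interpolated image as a base
        return base + self.hr_body(feat)                     # predict the residual

if __name__ == "__main__":
    lr = torch.randn(1, 3, 48, 48)
    for s in (2.0, 3.5):                                     # real-number scales, one model
        print(AnyScaleSR()(lr, s).shape)
```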

Multi-parametric MRIs based assessment of Hepatocellular Carcinoma Differentiation with Multi-scale ResNet

  • Jia, Xibin;Xiao, Yujie;Yang, Dawei;Yang, Zhenghan;Lu, Chen
    • KSII Transactions on Internet and Information Systems (TIIS) / v.13 no.10 / pp.5179-5196 / 2019
  • To explore an effective non-invasive medical imaging diagnostic approach for hepatocellular carcinoma (HCC), we propose a method that combines multi-parametric data fusion, transfer learning, and multi-scale deep feature extraction. First, to make full use of the complementary information in the different modalities, viz. multi-parametric MRI images, and to enhance their contribution to lesion diagnosis, we propose a data-level fusion strategy. Second, with the fused data as input, a multi-scale residual neural network with SPP (Spatial Pyramid Pooling) is used to learn discriminative feature representations. Third, to mitigate the lack of training samples, we pre-train the proposed multi-scale residual network on a natural image dataset and fine-tune it with the chosen multi-parametric MRI images as complementary data. Comparative experiments on a dataset of clinical cases show that the proposed approach, by employing these multiple strategies, achieves the highest accuracy of 0.847±0.023 in classifying HCC differentiation. For discriminating HCC lesions from non-tumor areas, the accuracy, sensitivity, specificity and AUC (area under the ROC curve) are 0.981±0.002, 0.981±0.002, 0.991±0.007 and 0.999±0.0008, respectively.
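
An illustrative PyTorch sketch of the SPP head mentioned in this abstract, which turns a backbone feature map into a fixed-length vector regardless of input size before classification. The pyramid levels, channel count, and number of differentiation classes are assumptions, not the authors' exact model.

```python
# Spatial Pyramid Pooling head sketch (assumed levels and class count).
import torch
import torch.nn as nn
import torch.nn.functional as F

class SPPHead(nn.Module):
    def __init__(self, in_ch=512, levels=(1, 2, 4), n_classes=3):
        super().__init__()
        self.levels = levels
        pooled = in_ch * sum(l * l for l in levels)          # fixed-length feature size
        self.fc = nn.Linear(pooled, n_classes)

    def forward(self, feat):
        # pool the feature map to each pyramid level, flatten, and concatenate
        parts = [F.adaptive_max_pool2d(feat, l).flatten(1) for l in self.levels]
        return self.fc(torch.cat(parts, dim=1))

if __name__ == "__main__":
    # the same head works for different spatial sizes of the fused MRI features
    for hw in (12, 20):
        feat = torch.randn(2, 512, hw, hw)                   # backbone output
        print(SPPHead()(feat).shape)                         # torch.Size([2, 3])
```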