• Title/Abstract/Keywords: multi-scale features

185 results found; processing time 0.026 s

Real Scene Text Image Super-Resolution Based on Multi-Scale and Attention Fusion

  • Xinhua Lu;Haihai Wei;Li Ma;Qingji Xue;Yonghui Fu
    • Journal of Information Processing Systems / Vol. 19, No. 4 / pp. 427-438 / 2023
  • Many studies have shown that single image super-resolution (SISR) models trained on synthetic datasets are difficult to apply to real scene text image super-resolution (STISR) because of its more complex degradation. The most recent dataset for realistic STISR is TextZoom, but current methods trained on it have not considered the multi-scale features of text images. This paper proposes a multi-scale and attention fusion model for realistic STISR. A multi-scale learning mechanism is introduced to acquire sophisticated feature representations of text images, and spatial and channel attention are introduced to capture the local information and inter-channel interactions of text images. Finally, a multi-scale residual attention module is designed by fusing multi-scale learning with the attention mechanisms. Experiments on TextZoom show that the proposed model raises the average recognition accuracy of the scene text recognizer (ASTER) by 1.2% compared with the text super-resolution network.

Multi-scale crack detection using decomposition and composition

  • 김영로;정지영
    • 디지털산업정보학회논문지 / Vol. 9, No. 3 / pp. 13-20 / 2013
  • In this paper, we propose a multi-scale crack detection method that uses decomposition, composition, and shape properties. It is based on a morphology algorithm and crack features. A morphology operator extracts crack patterns and segments cracks from the background using opening and closing operations. Morphology-based segmentation is better than existing integration methods that use subtraction at detecting cracks of small width. However, morphology methods that use only one structuring element can detect only cracks of a fixed width. We therefore use decomposition and composition methods, with decimation as the decomposition step. After decomposition and the morphology operation, we obtain binary edge images. Our method then computes properties such as the number of pixels and the maximum length of each segmented region, and uses them to decide whether the region is a crack. Experimental results show that the proposed multi-scale crack detection method gives better results than existing detection methods.
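
The abstract's point that a single structuring element only finds cracks of one width, and that decimation lets the same element find wider ones, can be illustrated with a minimal one-dimensional sketch. This is not the authors' algorithm: it swaps in a black top-hat (closing minus input) as the crack response, uses naive decimation without low-pass filtering, and omits the shape-property test. All function names and parameters are hypothetical.

```python
def dilate(signal, size):
    """Grey-level dilation: sliding maximum over a window of `size` samples."""
    r, n = size // 2, len(signal)
    return [max(signal[max(0, i - r):min(n, i + r + 1)]) for i in range(n)]


def erode(signal, size):
    """Grey-level erosion: sliding minimum over a window of `size` samples."""
    r, n = size // 2, len(signal)
    return [min(signal[max(0, i - r):min(n, i + r + 1)]) for i in range(n)]


def black_hat(signal, size):
    """Closing minus input: large where a dark dip is narrower than the window."""
    closed = erode(dilate(signal, size), size)
    return [c - s for c, s in zip(closed, signal)]


def detect_multiscale(signal, size=3, levels=2, threshold=1):
    """Run the same fixed-size operator on successively decimated copies.

    Returns the sorted original indices flagged as crack at any level.
    """
    hits = set()
    current = list(signal)
    for level in range(levels):
        factor = 2 ** level  # how many original samples one coarse sample covers
        for i, value in enumerate(black_hat(current, size)):
            if value >= threshold:
                # Composition step: map the coarse hit back to original indices.
                hits.update(range(i * factor, min(len(signal), (i + 1) * factor)))
        current = current[::2]  # decomposition step: naive decimation by 2
    return sorted(hits)
```

On a bright row with a one-sample dip and a four-sample dip, a single level should flag only the narrow dip, while two levels should flag both, which is the fixed-width limitation the abstract describes.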

Sound event detection based on multi-channel multi-scale neural networks for home monitoring system used by the hard-of-hearing

  • 이기용;김형국
    • 한국음향학회지 / Vol. 39, No. 6 / pp. 600-605 / 2020
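  • This paper proposes a sound event detection method that uses a multi-channel multi-scale neural network for sound-sensing home monitoring for the hard-of-hearing. The proposed system selects the two channels with the highest signal quality from several wireless microphone sensors in the home. From those signals it extracts the time difference of arrival, the pitch range, and features obtained by applying a multi-scale convolutional neural network to the log-mel spectrogram, and feeds them to a classifier based on a bidirectional gated recurrent neural network, which further improves sound event detection performance. The detected sound event is converted to text together with the sensor position of the selected channel and provided to the hard-of-hearing user. Experimental results show that the proposed sound event detection method outperforms existing methods and can effectively deliver sound information to the hard-of-hearing.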
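
One of the features named in the abstract, the time difference of arrival between the two selected channels, is commonly estimated by finding the lag that maximizes the cross-correlation of the two signals. The sketch below shows that generic estimator only. Whether the authors use this exact estimator is not stated in the abstract, and the function name, the `max_lag` parameter, and the sign convention are assumptions.

```python
def cross_correlation_delay(ref, other, max_lag):
    """Return the lag (in samples) at which `other` best matches `ref`.

    A positive result means `other` is a delayed copy of `ref`,
    i.e. other[i + lag] is approximately ref[i].
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i, sample in enumerate(ref):
            j = i + lag
            if 0 <= j < len(other):
                score += sample * other[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def delay_seconds(lag_samples, sample_rate_hz):
    """Convert a lag in samples to seconds for a given sampling rate."""
    return lag_samples / sample_rate_hz
```

For a pulse that reaches the second microphone two samples later, `cross_correlation_delay` should return 2, and swapping the arguments should flip the sign.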

Face Detection Using Multi-level Features for Privacy Protection in Large-scale Surveillance Video

  • 이승호;문정익;김형일;노용만
    • 한국멀티미디어학회논문지 / Vol. 18, No. 11 / pp. 1268-1280 / 2015
  • In video surveillance systems, the exposure of a person's face is a serious threat to personal privacy. To protect personal privacy in large amounts of video, an automatic face detection method is required to locate and mask each person's face. However, in real-world surveillance videos, the effectiveness of existing face detection methods can deteriorate because of large variations in facial appearance (e.g., facial pose, illumination, etc.) or degraded faces (e.g., occluded faces, low-resolution faces, etc.). This paper proposes a new face detection method based on multi-level facial features. In each video frame, different kinds of spatial features are extracted and analyzed independently so that they complement each other under the aforementioned challenges. Temporal domain analysis is also exploited to consolidate the proposed method. Experimental results show that, compared to competing methods, the proposed method achieves very high recall rates while maintaining acceptable precision rates.

Review of Operational Multi-Scale Environment Model with Grid Adaptivity

  • Kang, Sung-Dae
    • Environmental Sciences Bulletin of The Korean Environmental Sciences Society / Vol. 10, No. S_1 / pp. 23-28 / 2001
  • A new numerical weather prediction and dispersion model, the Operational Multi-scale Environment model with Grid Adaptivity (OMEGA), which includes an embedded Atmospheric Dispersion Model (ADM), is introduced as a next-generation atmospheric simulation system for real-time hazard prediction, such as severe weather or the transport of hazardous releases. OMEGA is based on an unstructured grid that allows a continuously varying horizontal grid resolution, ranging from 100 km down to 1 km, and a vertical resolution from 20-30 meters in the boundary layer to 1 km in the free atmosphere. OMEGA is also naturally scale spanning and time. In particular, the unstructured grid cells in the horizontal dimension can increase the local resolution to better capture the topography or important physical features of the atmospheric circulation and cloud dynamics. This means that OMEGA can readily adapt its grid to a stationary surface, terrain features, or dynamic features in an evolving weather pattern. While adaptive numerical techniques have yet to be extensively applied in atmospheric models, the OMEGA model is the first to exploit the adaptive nature of an unstructured gridding technique for atmospheric simulation and real-time hazard prediction. The purpose of this paper is to provide a detailed description of the OMEGA model and the OMEGA system, and a detailed comparison of OMEGA forecast results with observed data.

Infrared Target Recognition using Heterogeneous Features with Multi-kernel Transfer Learning

  • Wang, Xin;Zhang, Xin;Ning, Chen
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 14, No. 9 / pp. 3762-3781 / 2020
  • Infrared pedestrian target recognition is a vital problem of significant interest in computer vision. In this work, a novel infrared pedestrian target recognition method that uses heterogeneous features with multi-kernel transfer learning is proposed. First, to fully exploit the characteristics of infrared pedestrian targets, a novel multi-scale monogenic filtering-based completed local binary pattern descriptor, referred to as MSMF-CLBP, is designed to extract texture information, and an improved histogram of oriented gradient-fisher vector descriptor, referred to as HOG-FV, is proposed to extract shape information. Second, to enrich the semantic content of the feature expression, these two heterogeneous features are integrated to obtain a more complete representation of infrared pedestrian targets. Third, to overcome the defects of state-of-the-art classifiers, such as poor generalization, scarcity of tagged infrared samples, and distributional and semantic deviations between the training and testing samples, an effective multi-kernel transfer learning classifier called MK-TrAdaBoost is designed. Experimental results show that the proposed method outperforms many state-of-the-art recognition approaches for infrared pedestrian targets.

Seafloor Classification Based on the Texture Analysis of Sonar Images Using the Gabor Wavelet

  • Sun, Ning;Shim, Tae-Bo
    • The Journal of the Acoustical Society of Korea / Vol. 27, No. 3E / pp. 77-83 / 2008
  • Orientation and scale are very significant factors in the formation of sonar image textures. However, most related methods ignore directional information and scale invariance, or attend to only one of them. To overcome this problem, we apply the Gabor wavelet, which combines the advantages of the Gabor filter and the traditional wavelet function, to extract the features of sonar images. The mother wavelet is designed with constrained parameters, and the optimal parameters are selected at each orientation with the help of bandwidth parameters based on the Fisher criterion. The Gabor wavelet is both multi-scale and multi-orientation. In our experiment, this method is more appropriate than a traditional wavelet or a single Gabor filter, as it discriminates textures better and improves the recognition rate effectively. Compared with other fusion methods, it also reduces complexity and improves computational efficiency.
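
The multi-scale, multi-orientation property mentioned in the abstract can be sketched as a small bank of real Gabor kernels, one per wavelength and orientation. This is a generic textbook construction under assumed parameters (`sigma` tied to the wavelength, aspect ratio `gamma`, kernel size). It does not reproduce the authors' constrained mother wavelet or their Fisher-criterion parameter selection, and all names are hypothetical.

```python
import math


def gabor_kernel(size, wavelength, theta, sigma, gamma=0.5):
    """Real Gabor kernel: Gaussian envelope times a cosine carrier along theta."""
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate coordinates so the carrier oscillates along direction theta.
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            envelope = math.exp(-(xr ** 2 + (gamma * yr) ** 2) / (2 * sigma ** 2))
            row.append(envelope * math.cos(2 * math.pi * xr / wavelength))
        kernel.append(row)
    return kernel


def gabor_bank(size=9, wavelengths=(4.0, 8.0), n_orientations=4):
    """Kernels keyed by (wavelength, orientation index); sigma is an assumed choice."""
    bank = {}
    for wavelength in wavelengths:
        sigma = 0.5 * wavelength  # assumption: envelope width scales with wavelength
        for k in range(n_orientations):
            theta = k * math.pi / n_orientations
            bank[(wavelength, k)] = gabor_kernel(size, wavelength, theta, sigma)
    return bank


def filter_response(patch, kernel):
    """Inner product of an image patch with a kernel of the same size."""
    return sum(p * k for prow, krow in zip(patch, kernel) for p, k in zip(prow, krow))
```

A patch of vertical stripes with a period of 4 pixels should respond far more strongly to the kernel with matching wavelength and orientation index 0 than to the kernel rotated by 90 degrees. Responses collected over the whole bank would then form a texture feature vector.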

Integration of Multi-scale CAM and Attention for Weakly Supervised Defects Localization on Surface Defective Apple

  • Nguyen Bui Ngoc Han;Ju Hwan Lee;Jin Young Kim
    • 스마트미디어저널 / Vol. 12, No. 9 / pp. 45-59 / 2023
  • Weakly supervised object localization (WSOL) is the task of localizing an object in an image using only image-level labels. Previous studies have followed the conventional class activation mapping (CAM) pipeline. However, we show that the current CAM approach has problems that prevent the original CAM from capturing complete defect features. This work uses a convolutional neural network (CNN) pretrained on image-level labels to generate class activation maps in a multi-scale manner to highlight discriminative regions. Additionally, a pretrained vision transformer (ViT) is used to produce multi-head attention maps as an auxiliary detector. By integrating the CNN-based CAMs and the attention maps, our approach localizes defective regions without requiring bounding-box or pixel-level supervision during training. We evaluate our approach on a dataset of apple images with only image-level labels of defect categories. Experiments demonstrate that the proposed method is comparable in performance to several object detection models and holds promise for improving localization.

Numerical Homogenization in Concrete Materials Using Multi-Resolution Analysis

  • 이인규;노영숙
    • 콘크리트학회논문집 / Vol. 17, No. 6 / pp. 939-946 / 2005
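  • This study applies multi-resolution analysis based on the wavelet transform to the stiffness characteristics and degradation of concrete, a heterogeneous material, and examines the applicability of the homogenization process and the evaluation of a macroscopic damage index at each scale of observation. Successive Haar wavelet transforms replicate the properties of the original stiffness matrix at successively reduced scales, representing the reduction from the fine scale to the macro scale and the corresponding reconstruction. This preserves the scale-wise spectral properties of the linear structural system, namely ellipticity, convexity, and positive definiteness, which confirms the validity of the solution at each scale. The average of the original stiffness obtained from the wavelet coefficients is correlated with the strain energy at the macro level, and it has the advantage that reduction to a lower level and reconstruction to a higher level can be performed freely. As examples of this multi-resolution analysis, one- and two-dimensional two-phase composites were analyzed by the finite element method to verify the existing theory and to examine the change of the minimum eigenvalue at each scale, the uniqueness of the solution of the original and reduced structural systems, and whether the local damage index is homogenized. This homogenization and reduction process provides a first step toward application to nonlinear structural systems with many degrees of freedom.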
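
The reduction and reconstruction between scales described in the abstract can be illustrated with one level of an unnormalized Haar transform on a one-dimensional list of element stiffness values. This is only a sketch of the wavelet mechanics under simplifying assumptions: it works on a property vector, not on the stiffness matrix the paper transforms, and plain pairwise averaging is not claimed to be the physically correct homogenized stiffness. Function names are hypothetical.

```python
def haar_reduce(values):
    """One Haar level: pairwise averages (coarse scale) and half-differences (details)."""
    if len(values) % 2:
        raise ValueError("length must be even for one Haar level")
    pairs = list(zip(values[0::2], values[1::2]))
    averages = [(a + b) / 2 for a, b in pairs]
    details = [(a - b) / 2 for a, b in pairs]
    return averages, details


def haar_restore(averages, details):
    """Inverse of haar_reduce: rebuild the finer scale from averages and details."""
    restored = []
    for avg, det in zip(averages, details):
        restored.extend([avg + det, avg - det])
    return restored
```

For a two-phase bar with element stiffnesses such as `[10.0, 30.0, 10.0, 30.0, 30.0, 30.0, 10.0, 10.0]`, `haar_reduce` halves the number of values while keeping their mean, and `haar_restore` recovers the fine-scale values from the stored details, so moving down and back up a level loses nothing.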