• Title/Summary/Keyword: Feature Maps


Dual Attention Based Image Pyramid Network for Object Detection

  • Dong, Xiang;Li, Feng;Bai, Huihui;Zhao, Yao
    • KSII Transactions on Internet and Information Systems (TIIS) / v.15 no.12 / pp.4439-4455 / 2021
  • Compared with two-stage object detection algorithms, one-stage algorithms provide a better trade-off between real-time performance and accuracy. However, these methods treat the intermediate features equally and therefore lack the flexibility to emphasize the information that matters for classification and localization. They also ignore the interaction of contextual information across scales, which is important for detecting medium and small objects. To tackle these problems, we propose an image pyramid network based on a dual attention mechanism (DAIPNet) for one-stage object detection. It builds an image pyramid to enrich the spatial information and uses dual attention mechanisms to emphasize informative multi-scale features. Our framework uses a pre-trained backbone as the standard detection network, and the designed image pyramid network (IPN) serves as an auxiliary network that provides complementary information. The dual attention mechanism consists of the adaptive feature fusion module (AFFM) and the progressive attention fusion module (PAFM). AFFM automatically attends to feature maps of differing importance from the backbone and the auxiliary network, while PAFM adaptively learns channel-attentive information during context transfer. Furthermore, in the IPN, we build an image pyramid to extract scale-wise features from images downsampled to different scales; these features are further fused at different states to enrich scale-wise information and learn more comprehensive feature representations. Experimental results are reported on the MS COCO dataset. With a 300 × 300 input, our proposed detector achieves 32.6% mAP on MS COCO test-dev, outperforming state-of-the-art methods.
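
The image pyramid that feeds the IPN can be illustrated with a minimal sketch. The abstract does not say how the images are downsampled, so 2 × 2 average pooling is an assumption here, and the names `downsample_2x` and `image_pyramid` are hypothetical.

```python
def downsample_2x(img):
    """Halve height and width by averaging each 2x2 block (assumed method)."""
    h, w = len(img) // 2, len(img[0]) // 2
    return [
        [
            (img[2 * y][2 * x] + img[2 * y][2 * x + 1]
             + img[2 * y + 1][2 * x] + img[2 * y + 1][2 * x + 1]) / 4.0
            for x in range(w)
        ]
        for y in range(h)
    ]


def image_pyramid(img, levels):
    """Return [original, 1/2 scale, 1/4 scale, ...] with `levels` entries."""
    pyramid = [img]
    for _ in range(levels - 1):
        img = downsample_2x(img)
        pyramid.append(img)
    return pyramid


if __name__ == "__main__":
    image = [[float(x + y) for x in range(8)] for y in range(8)]
    for level in image_pyramid(image, 3):
        print(len(level), "x", len(level[0]))
```

In a detector along these lines, each pyramid level would be passed through a small feature extractor before fusion with the backbone features; that part is not sketched because the abstract gives no layer details.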

α-feature map scaling for raw waveform speaker verification (α-특징 지도 스케일링을 이용한 원시파형 화자 인증)

  • Jung, Jee-weon;Shim, Hye-jin;Kim, Ju-ho;Yu, Ha-Jin
    • The Journal of the Acoustical Society of Korea / v.39 no.5 / pp.441-446 / 2020
  • In this paper, we propose the α-Feature Map Scaling (α-FMS) method, which extends the FMS method designed to enhance the discriminative power of feature maps of deep neural networks in Speaker Verification (SV) systems. The FMS derives a scale vector from a feature map and then adds it to the features, multiplies the features by it, or applies both operations sequentially. However, the FMS method uses an identical scale vector for both addition and multiplication, and in the case of addition it can only add a value between zero and one. To overcome these limitations, we propose α-FMS, which adds a trainable parameter α to the feature map element-wise and then multiplies the result by a scale vector. We compare two variants: one where α is a scalar and one where it is a vector. Both α-FMS methods are applied after each residual block of the deep neural network. The proposed systems using the α-FMS methods are trained using RawNet2 and tested on the VoxCeleb1 evaluation set. The results show equal error rates of 2.47% and 2.31% for the two α-FMS methods, respectively.
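
The α-FMS operation described above, adding α and then multiplying by a scale vector, can be sketched as follows. The abstract does not state how the scale vector is derived, so this sketch assumes averaging over time followed by a linear layer and a sigmoid. The names `scale_vector` and `alpha_fms` and the weight layout are hypothetical.

```python
import math


def scale_vector(feature_map, weights, bias):
    """One scale per channel: time-average, linear layer, sigmoid (assumed)."""
    pooled = [sum(channel) / len(channel) for channel in feature_map]
    scales = []
    for w_row, b in zip(weights, bias):
        z = sum(w * p for w, p in zip(w_row, pooled)) + b
        scales.append(1.0 / (1.0 + math.exp(-z)))
    return scales


def alpha_fms(feature_map, alpha, weights, bias):
    """Compute (x + alpha) * s for a channels-by-time feature map.

    `alpha` is either a scalar or a list with one value per channel,
    matching the two variants compared in the abstract.
    """
    n_channels = len(feature_map)
    if isinstance(alpha, (list, tuple)):
        alphas = list(alpha)
    else:
        alphas = [alpha] * n_channels
    scales = scale_vector(feature_map, weights, bias)
    return [
        [(x + a) * s for x in channel]
        for channel, a, s in zip(feature_map, alphas, scales)
    ]


if __name__ == "__main__":
    fmap = [[1.0, 3.0], [0.0, 0.0]]      # 2 channels, 2 time steps
    w = [[0.0, 0.0], [0.0, 0.0]]         # zero weights give scales of 0.5
    b = [0.0, 0.0]
    print(alpha_fms(fmap, 1.0, w, b))          # scalar alpha
    print(alpha_fms(fmap, [0.0, 2.0], w, b))   # vector alpha
```

In training, α and the linear-layer weights would be learned jointly with the network; here they are fixed numbers so that the arithmetic is easy to follow.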

Automatic Change Detection Based on Areal Feature Matching in Different Network Data-sets (이종의 도로망 데이터 셋에서 면 객체 매칭 기반 변화탐지)

  • Kim, Jiyoung;Huh, Yong;Yu, Kiyun;Kim, Jung Ok
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography / v.31 no.6_1 / pp.483-491 / 2013
  • With the development of car navigation systems and of mobile and positioning technologies, interest in location-based services, especially pedestrian navigation systems, is increasing. Updating digital maps is important because they are large datasets that require short update cycles. In this paper, we propose a change detection method for different network data-sets based on areal feature matching. Prior to change detection, we defined the types of updating between different network data-sets. Next, we transformed road lines into the areal features (blocks) that they enclose and calculated the shape similarity between blocks in the different data-sets. Blocks whose shape similarity exceeds 0.6 are selected as candidate block pairs. We then detected changed block pairs by bipartite graph clustering or by the properties of a concave polygon, according to the type of updating, and calculated the Fréchet distance between the segments within a block or forming it. Road segments of the KAIS map whose Fréchet distance exceeds 50 are extracted as updated road features. In the accuracy evaluation, the detection rate was high, at 0.965. We thus confirmed that the proposed method is applicable to change detection between different network data-sets.
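
The Fréchet-distance test on road segments can be illustrated with a short sketch. It uses the discrete Fréchet distance, which is computed on polyline vertices only and is an upper bound on the continuous distance; whether the authors used the discrete or the continuous form is not stated. The threshold of 50 comes from the abstract (its unit is not given there), and the names `discrete_frechet` and `needs_update` are hypothetical.

```python
import math


def discrete_frechet(p, q):
    """Discrete Frechet distance between two polylines given as (x, y) lists."""
    n, m = len(p), len(q)
    ca = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = math.dist(p[i], q[j])
            if i == 0 and j == 0:
                ca[i][j] = d
            elif i == 0:
                ca[i][j] = max(ca[0][j - 1], d)
            elif j == 0:
                ca[i][j] = max(ca[i - 1][0], d)
            else:
                best_previous = min(ca[i - 1][j], ca[i - 1][j - 1], ca[i][j - 1])
                ca[i][j] = max(best_previous, d)
    return ca[n - 1][m - 1]


def needs_update(segment_a, segment_b, threshold=50.0):
    """Flag a segment pair whose distance exceeds the threshold."""
    return discrete_frechet(segment_a, segment_b) > threshold


if __name__ == "__main__":
    road = [(0.0, 0.0), (10.0, 0.0)]
    near = [(0.0, 3.0), (10.0, 3.0)]
    far = [(0.0, 60.0), (10.0, 60.0)]
    print(needs_update(road, near), needs_update(road, far))
```

Because the discrete form depends on how densely the polylines are sampled, segments with few vertices may need to be densified before comparison.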

Development of Deep Learning Structure for Defective Pixel Detection of Next-Generation Smart LED Display Board using Imaging Device (영상장치를 이용한 차세대 스마트 LED 전광판의 불량픽셀 검출을 위한 딥러닝 구조 개발)

  • Sun-Gu Lee;Tae-Yoon Lee;Seung-Ho Lee
    • Journal of IKEEE / v.27 no.3 / pp.345-349 / 2023
  • In this paper, we propose a deep learning structure for detecting defective pixels on next-generation smart LED display boards using an imaging device. The technique combines imaging devices and deep learning to automatically detect defects in outdoor LED billboards, with the aim of supporting effective billboard management and resolving various errors and issues. The research process consists of three stages. First, the planarized image data of the billboard is calibrated to completely remove the background and then preprocessed to generate a training dataset. Second, the generated dataset is used to train an object recognition network composed of a Backbone and a Head. The Backbone employs CSP-Darknet to extract feature maps, and the Head performs object detection on the extracted feature maps. Throughout this process, the network is adjusted to align the Confidence score and the Intersection over Union (IoU) error, sustaining continuous learning. Third, the resulting model is used to automatically detect defective pixels on actual outdoor LED billboards. In accredited measurement experiments, the proposed method achieved 100% detection of defective pixels on real LED billboards. This confirms improved efficiency in managing and maintaining LED billboards. These findings are expected to bring about a major advance in the management of LED billboards.
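
The Intersection over Union (IoU) measure mentioned in the abstract has a standard definition for axis-aligned boxes, sketched below. The box format `(x_min, y_min, x_max, y_max)` and the function name `iou` are assumptions for illustration; the paper's own box encoding is not given.

```python
def iou(box_a, box_b):
    """Intersection over Union of two (x_min, y_min, x_max, y_max) boxes."""
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])
    # Width and height clamp to zero when the boxes do not overlap.
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


if __name__ == "__main__":
    predicted = (0.0, 0.0, 2.0, 2.0)
    ground_truth = (1.0, 1.0, 3.0, 3.0)
    print(iou(predicted, ground_truth))  # overlap 1, union 7
```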

Automatic detection and severity prediction of chronic kidney disease using machine learning classifiers (머신러닝 분류기를 사용한 만성콩팥병 자동 진단 및 중증도 예측 연구)

  • Jihyun Mun;Sunhee Kim;Myeong Ju Kim;Jiwon Ryu;Sejoong Kim;Minhwa Chung
    • Phonetics and Speech Sciences / v.14 no.4 / pp.45-56 / 2022
  • This paper proposes an optimal methodology for automatically diagnosing chronic kidney disease (CKD) and predicting its severity from patients' utterances. In patients with CKD, the voice changes because of the weakening of the respiratory and laryngeal muscles and vocal fold edema. Previous studies have phonetically analyzed the voices of patients with CKD, but none has attempted to classify them. In this paper, the utterances of patients with CKD were classified using a variety of utterance types (sustained vowel, sentence, general sentence), feature sets [handcrafted features, extended Geneva Minimalistic Acoustic Parameter Set (eGeMAPS), CNN-extracted features], and classifiers (SVM, XGBoost). A total of 1,523 utterances, amounting to 3 hours, 26 minutes, and 25 seconds, were used. F1-scores of 0.93 for automatic diagnosis of the disease, 0.89 for a 3-class problem, and 0.84 for a 5-class problem were achieved. The highest performance was obtained with the combination of general sentence utterances, the handcrafted feature set, and XGBoost. The results suggest that a general sentence utterance, which can reflect all of a speaker's speech characteristics, together with an appropriate feature set extracted from it, is adequate for the automatic classification of CKD patients' utterances.

Efficiency Evaluation of the Feature Extraction of Roads from Map Image using Morphological Operators (수리 형태학적 연산자를 이용한 지도 화상에서 도로 정보의 특징 추출에 대한 효율성 평가)

  • 남태희
    • Journal of the Korea Society of Computer and Information / v.4 no.2 / pp.19-26 / 1999
  • Geographic information systems (GIS) are needed in the image recognition field. This study recommends an efficient method for constructing a GIS by extracting road features from scanned ordinary or hand-made maps. Many algorithms have been presented for this kind of image information recognition, but they are limited by their complexity. To efficiently extract road information from scanned map images, morphological operations (erosion and dilation, opening and closing) are applied with a 3 × 3 directional structuring element. This method allows efficient evaluation of the road features extracted from the map image and from the character sets.
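
The four morphological operators named in the abstract can be sketched on a binary image as follows. The horizontal 3 × 3 structuring element is an assumed example of a "directional form"; the abstract does not list the elements actually used. Pixels outside the image are treated as background.

```python
# Assumed example of a directional 3x3 structuring element (horizontal).
HORIZONTAL = [[0, 0, 0],
              [1, 1, 1],
              [0, 0, 0]]

OFFSETS = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]


def erode(img, se):
    """Keep a pixel only if every pixel under the structuring element is set."""
    h, w = len(img), len(img[0])
    return [
        [
            int(all(
                0 <= y + dy < h and 0 <= x + dx < w and img[y + dy][x + dx]
                for dy, dx in OFFSETS if se[dy + 1][dx + 1]
            ))
            for x in range(w)
        ]
        for y in range(h)
    ]


def dilate(img, se):
    """Set a pixel if any set pixel reaches it through the structuring element."""
    h, w = len(img), len(img[0])
    return [
        [
            int(any(
                0 <= y - dy < h and 0 <= x - dx < w and img[y - dy][x - dx]
                for dy, dx in OFFSETS if se[dy + 1][dx + 1]
            ))
            for x in range(w)
        ]
        for y in range(h)
    ]


def opening(img, se):
    """Erosion followed by dilation: removes features smaller than the element."""
    return dilate(erode(img, se), se)


def closing(img, se):
    """Dilation followed by erosion: fills gaps smaller than the element."""
    return erode(dilate(img, se), se)


if __name__ == "__main__":
    scanned = [[0, 0, 1, 0, 0],
               [1, 1, 1, 1, 1],
               [0, 0, 0, 0, 0]]
    for row in opening(scanned, HORIZONTAL):
        print(row)
```

With the horizontal element, opening keeps the horizontal line in the example and drops the stray pixel above it, which is the kind of behavior that makes directional elements useful for separating road strokes from characters.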


Three-stream network with context convolution module for human-object interaction detection

  • Siadari, Thomhert S.;Han, Mikyong;Yoon, Hyunjin
    • ETRI Journal / v.42 no.2 / pp.230-238 / 2020
  • Human-object interaction (HOI) detection is a popular computer vision task that detects interactions between humans and objects. This task can be useful in many applications that require a deeper understanding of semantic scenes. Current HOI detection networks typically consist of a feature extractor followed by detection layers comprising small filters (e.g., 1 × 1 or 3 × 3). Although small filters can capture local spatial features with few parameters, their small receptive regions prevent them from capturing the larger context needed to recognize interactions between humans and distant objects. Hence, we propose a three-stream HOI detection network that employs a context convolution module (CCM) in each stream branch. The CCM captures larger contexts from input feature maps by combining large separable convolution layers with residual-based convolution layers, and it avoids increasing the number of parameters by using fewer large separable filters. We evaluate our HOI detection method on two benchmark datasets, V-COCO and HICO-DET, and demonstrate its state-of-the-art performance.
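
The parameter trade-off behind "fewer large separable filters" can be checked with simple arithmetic. The sketch assumes that "separable" means a k × 1 convolution followed by a 1 × k convolution, which the abstract does not confirm. The kernel size 15 and the channel counts are hypothetical, and biases are ignored.

```python
def full_conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution."""
    return k * k * c_in * c_out


def separable_conv_params(k, c_in, c_mid, c_out):
    """Weights in a k x 1 conv (c_in -> c_mid) followed by a 1 x k conv (c_mid -> c_out)."""
    return k * c_in * c_mid + k * c_mid * c_out


if __name__ == "__main__":
    small = full_conv_params(3, 256, 256)
    large_separable = separable_conv_params(15, 256, 64, 256)
    print(small, large_separable)
```

Under these assumed sizes, the 15-wide separable pair with 64 intermediate filters needs fewer weights than a single dense 3 × 3 layer while covering a 15 × 15 receptive region.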

A Study on Expressional Feature of <Maus> by Art Spigelman (아트 슈피겔만의 <<쥐>>에 관한 표현적 특성 연구)

  • Choi, Ji-Young;Kim, Chee-Yong
    • Journal of Digital Contents Society / v.9 no.3 / pp.413-422 / 2008
  • This study examines the expressional features of <Maus> by Art Spiegelman, who won the Pulitzer Prize in 1992. First, as background to the work, it investigates the period of the 1960s and 1970s, when the author spent his adolescence, together with the ideological background and the environment in which <Maus> was created. Second, regarding narrative qualities, it examines fable-like narration, documentary narration, doublespeak, and the frame-story structure. Third, regarding formal qualities, it analyzes the structure of <Maus> on the basis of Scott McCloud's comics theory. Finally, it examines the formal and structural completeness of <Maus> and analyzes Spiegelman's distinctive expressive techniques, including the photographs, informal illustrations, and maps that strongly convey realism.


Automatic Attention Object Extraction Using Feature Maps (특징 지도를 이용한 자동적인 중심 객체 추출)

  • Park Ki-Tae;Kim Jong-Hyeok;Moon Young-Shik
    • Proceedings of the Korean Information Science Society Conference / 2006.06b / pp.370-372 / 2006
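  • The method proposed in this paper extracts the attention object from an image using feature maps derived from edge and color information, together with a reference map that reduces the influence of the background. A feature map is an image constructed from the feature values of an image in order to detect regions that differ markedly from other regions. The reference map minimizes the influence of the background while indicating the areas where an object is most likely to be present. As feature values, the proposed method uses edges, which carry brightness-difference information, and the color components of the YCbCr and HSV color models. The feature maps are constructed from these values by finding the boundaries that arise from color differences within the image. This yields three feature maps: an edge map and two color maps. Next, a reference map is computed to reduce the influence of the image background. The reference map and the feature maps are then combined into a combination map. Polygonal object candidate regions are obtained from the combination map, and image segmentation is applied to these regions to extract the attention object. The experiments used images from the Corel DB, and the results show a precision of 84.3% and a recall of 81.3%.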


Stochastic Non-linear Hashing for Near-Duplicate Video Retrieval using Deep Feature applicable to Large-scale Datasets

  • Byun, Sung-Woo;Lee, Seok-Pil
    • KSII Transactions on Internet and Information Systems (TIIS) / v.13 no.8 / pp.4300-4314 / 2019
  • With the development of video-related applications, the volume of media content has increased dramatically. Internet videos include a substantial number of near-duplicate videos (NDVs), so near-duplicate video retrieval (NDVR) is important for eliminating near-duplicates from web video searches. This paper proposes a novel NDVR system that supports large-scale retrieval and delivers efficient and accurate retrieval performance. We extracted keyframes from each video at regular intervals and then extracted both commonly used features (LBP and HSV) and new image features from each keyframe. A recent study introduced a new image feature that provides more robust information than existing features even when images undergo geometric changes and complex editing. We convert the vector set consisting of the extracted features to binary code through a set of hash functions, which makes similarity comparison more efficient because similar videos are more likely to map into the same buckets. Lastly, we calculate similarity to search for NDVs. We examine the effectiveness of the NDVR system and compare it against previous NDVR systems using the public video collection CC_WEB_VIDEO. The proposed NDVR system's performance is very promising compared to previous NDVR systems.
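
The idea of mapping feature vectors to binary codes so that similar videos share a bucket can be sketched with random-hyperplane hashing. This is a generic locality-sensitive hashing scheme swapped in for illustration, not the stochastic non-linear hashing of the paper, whose details the abstract does not give. All function names and the toy feature vectors are hypothetical.

```python
import random


def make_hyperplanes(dim, n_bits, seed=0):
    """Draw n_bits random hyperplanes; each one acts as a hash function."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def binary_code(vec, planes):
    """One bit per hyperplane: 1 if the vector lies on its non-negative side."""
    return tuple(
        1 if sum(w * x for w, x in zip(plane, vec)) >= 0 else 0
        for plane in planes
    )


def hamming(code_a, code_b):
    """Number of differing bits between two codes of equal length."""
    return sum(a != b for a, b in zip(code_a, code_b))


def build_buckets(items, planes):
    """Group item names by binary code; items with equal codes share a bucket."""
    buckets = {}
    for name, vec in items.items():
        buckets.setdefault(binary_code(vec, planes), []).append(name)
    return buckets


if __name__ == "__main__":
    planes = make_hyperplanes(dim=3, n_bits=8)
    keyframes = {
        "video_a": [1.0, 2.0, 3.0],
        "video_a_brighter": [2.0, 4.0, 6.0],  # same direction, larger magnitude
        "video_b": [-3.0, 1.0, -2.0],
    }
    for code, names in build_buckets(keyframes, planes).items():
        print(code, names)
```

Vectors pointing in the same direction always receive the same code under this scheme, so a near-duplicate whose features are merely rescaled falls into the same bucket as the original, and candidates can be ranked afterwards by Hamming distance.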