• Title/Summary/Keyword: 다중모드 (multi-mode)

A Study on Performance Improvement of GVQA Model Using Transformer (트랜스포머를 이용한 GVQA 모델의 성능 개선에 관한 연구)

  • Park, Sung-Wook;Kim, Jun-Yeong;Park, Jun;Lee, Han-Sung;Jung, Se-Hoon;Sim, Cun-Bo
    • Proceedings of the Korea Information Processing Society Conference / 2021.11a / pp.749-752 / 2021
  • Reasoning is one of the hardest capabilities to implement in artificial intelligence (AI) today. Recently, AI models have been published for the visual question answering (VQA) task, which operates in a multi-modal setting that combines images and language. Shortly afterwards, the Grounded Visual Question Answering (GVQA) model, which improves on the VQA model, was also published. However, the GVQA model still falls short of perfect performance. To improve the GVQA model, this paper replaces the Visual Concept Classifier (VCC) with ViT-G (Vision Transformer-Giant)/14 and the Answer Cluster Predictor (ACP) with GPT (Generative Pretrained Transformer)-3. We expect these changes to contribute substantially to improved performance.

Multimodal Image Fusion with Human Pose for Illumination-Robust Detection of Human Abnormal Behaviors (조명을 위한 인간 자세와 다중 모드 이미지 융합 - 인간의 이상 행동에 대한 강력한 탐지)

  • Cuong H. Tran;Seong G. Kong
    • Proceedings of the Korea Information Processing Society Conference / 2023.11a / pp.637-640 / 2023
  • This paper presents multimodal image fusion with human pose for detecting abnormal human behaviors in low illumination conditions. Detecting human behaviors in low illumination is challenging because the objects of interest in the scene have limited visibility. Multimodal image fusion combines visual information in the visible spectrum with thermal radiation information in the long-wave infrared spectrum. We propose an abnormal event detection scheme based on the multimodal fused image and on human poses, using keypoints to characterize the action of the human body. Our method assumes that human behaviors are well correlated with body keypoints such as shoulders, elbows, wrists, and hips. In detail, we extract the keypoint coordinates of human targets in multimodal fused videos. The coordinate values are used as inputs to train a multilayer perceptron network that classifies human behaviors as normal or abnormal. Our experiment demonstrates a significant result on a multimodal imaging dataset. The proposed model can capture the complex distribution pattern of both normal and abnormal behaviors.
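The keypoint-to-classifier step described in this abstract can be sketched as follows. This is a minimal illustration only, assuming four normalized (x, y) keypoints and a single hidden layer; the function names, the pose values, and the placeholder weights are hypothetical and are not taken from the paper, which trains its network on fused video data.

```python
import math

def flatten_keypoints(keypoints):
    """Flatten [(x, y), ...] into one feature vector [x0, y0, x1, y1, ...]."""
    return [coord for point in keypoints for coord in point]

def mlp_abnormal_probability(features, w1, b1, w2, b2):
    """One hidden ReLU layer followed by a single sigmoid output unit."""
    hidden = [
        max(0.0, sum(w * x for w, x in zip(row, features)) + bias)
        for row, bias in zip(w1, b1)
    ]
    logit = sum(w * h for w, h in zip(w2, hidden)) + b2
    return 1.0 / (1.0 + math.exp(-logit))

# Hypothetical normalized (x, y) keypoints: shoulder, elbow, wrist, hip.
pose = [(0.50, 0.30), (0.55, 0.45), (0.60, 0.60), (0.50, 0.65)]
features = flatten_keypoints(pose)

# Placeholder weights for illustration only; a real model would learn these.
w1 = [[0.1] * 8, [-0.1] * 8]
b1 = [0.0, 0.0]
w2 = [1.0, -1.0]
b2 = 0.0

probability = mlp_abnormal_probability(features, w1, b1, w2, b2)
label = "abnormal" if probability >= 0.5 else "normal"
print(label, round(probability, 3))
```

With trained weights, the same forward pass would map each frame's keypoint vector to a normal/abnormal decision.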

Deinterlacing Method for improving Motion Estimator based on multi arithmetic Architecture (다중연산구조기반의 고밀도 성능향상을 위한 움직임추정의 디인터레이싱 방법)

  • Lee, Kang-Whan
    • Journal of the Institute of Electronics Engineers of Korea SP / v.44 no.1 / pp.49-55 / 2007
  • This paper proposes a de-interlacing algorithm that improves multi-resolution fast hierarchical motion estimation and is effective in terms of both performance and VLSI implementation, so that it can cover a large search area in field-based as well as frame-based image processing in SoC design. We simulated picture modes M=2 and M=3. Compared with the full-search block matching algorithm, the motion estimation performance of the proposed algorithm, measured in PSNR, changed by -0.7dB on average, which did not affect the subjective quality of the reconstructed images at all. To make the fast hierarchical motion estimation better suited to SoC design, we use a foreground and background search algorithm (FBSA) based on the dual arithmetic processor element (DAPE). This makes it possible to estimate motion displacement over a large search area using half the number of PEs needed by general operation methods. The proposed MHME architecture also improves the VLSI hardware design, because the FBSA structure with DAPE removes the local memory. The proposed FBSA, which uses bit array processing in the search area, allows the structure to be organized like a multiple processor array unit (MPAU).
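The abstract compares motion estimators by PSNR. As a reminder of how that figure is computed, here is a minimal sketch assuming 8-bit samples stored as flat lists; the function name and the sample values are hypothetical and not from the paper.

```python
import math

def psnr(reference, reconstructed, peak=255):
    """Peak signal-to-noise ratio in dB between two equally sized sample lists."""
    if len(reference) != len(reconstructed) or not reference:
        raise ValueError("inputs must be non-empty and the same length")
    mse = sum((r - t) ** 2 for r, t in zip(reference, reconstructed)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals
    return 10 * math.log10(peak ** 2 / mse)

# Hypothetical 8-bit samples from a reference block and a motion-compensated block.
reference_block = [52, 55, 61, 66, 70, 61, 64, 73]
predicted_block = [52, 54, 61, 67, 70, 60, 64, 73]
print(round(psnr(reference_block, predicted_block), 2), "dB")
```

A change of -0.7 dB, as reported above, is the difference between two such PSNR values, one for the proposed search and one for full search.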

Real-time Recognition and Tracking System of Multiple Moving Objects (다중 이동 객체의 실시간 인식 및 추적 시스템)

  • Park, Ho-Sik;Bae, Cheol-Soo
    • The Journal of Korean Institute of Communications and Information Sciences / v.36 no.7C / pp.421-427 / 2011
  • The importance of real-time object recognition and tracking has grown steadily with the rapid advancement of the computer vision applications industry. The mean-shift algorithm is widely used in robust real-time object tracking systems because it is easy to implement and computationally efficient. However, one of its major drawbacks is that it always converges to a local mode, so it performs poorly in cluttered environments. This paper proposes an optical flow-based algorithm suited to real-time recognition of multiple moving objects. In tests, the proposed method raised the similarity of multiple moving objects to as high as 0.96, up 13.4% over the mean-shift algorithm, while pixel errors decreased by more than 50% relative to the mean-shift algorithm. If data processing time in video surveillance systems can be reduced further through improved algorithms for faster moving-object recognition and tracking, much more efficient intelligent systems can be expected in this industry.

An Effective Routing of Zone Routing Protocol for Mobile Ad Hoc Networks (MANET을 위한 존 라우팅 프로토콜의 효율적인 경로 설정)

  • Chu, Seong-Eun;Kim, Jae-Nam;Kang, Dae-Wook
    • Proceedings of the Korea Information Processing Society Conference / 2002.11b / pp.1547-1550 / 2002
  • A MANET is a new wireless networking paradigm, distinct from conventional wireless networking, in which the network consists solely of mobile hosts and does not rely on existing wired infrastructure. Communication in an ad hoc network raises the problem of routing data from a source node to a destination node. Because every terminal in an ad hoc network can change position, route setup is difficult. When two nodes that are not adjacent need to exchange information, it cannot be sent directly and must pass through several intermediate nodes using multi-hop routing. Intermediate nodes must therefore act as packet routers, yet wireless communication itself has narrow bandwidth, limited channels, and a restricted transmission range. In addition, nodes that drop out because of mobility or power consumption change the network topology frequently, so the best route between nodes can change at any time, which causes many difficulties. To address these problems, this paper proposes a method for the route maintenance phase. Because nodes in an ad hoc network are mobile, a route shorter and more efficient than the one currently in use may appear, and intermediate nodes may move; in both cases the route is updated so that a seamless, optimal route is maintained. Specifically, in the IERP of ZRP, the proposed method uses an overhearing (promiscuous) mode to detect a route better than the one in use and switches to it, and it partially repairs the route when it breaks because an intermediate node has moved. By always maintaining an optimized route, the method improves adaptability to topology changes in ad hoc networks.
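The route-shortening idea in this abstract can be illustrated with a small sketch. It is not the paper's algorithm: it assumes each node already knows, for example from overheard traffic, which other nodes are within direct radio range, and it greedily skips intermediate hops. The function name, node labels, and neighbor table are hypothetical.

```python
def shorten_route(route, neighbors):
    """Greedily skip hops: from each node, jump to the farthest later node
    on the current route that is a direct neighbor."""
    if not route:
        return []
    shortened = [route[0]]
    i = 0
    while i < len(route) - 1:
        next_index = i + 1  # fall back to the existing next hop
        for j in range(len(route) - 1, i, -1):
            if route[j] in neighbors.get(route[i], set()):
                next_index = j
                break
        shortened.append(route[next_index])
        i = next_index
    return shortened

# Hypothetical topology: A has moved within range of C, so hop B is no longer needed.
current_route = ["S", "A", "B", "C", "D"]
neighbors = {
    "S": {"A"},
    "A": {"S", "B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C"},
}
print(shorten_route(current_route, neighbors))
```

A partial repair after a link break would work on the same data: only the segment around the departed node is recomputed, and the rest of the route is kept.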

A Crypto-processor Supporting Multiple Block Cipher Algorithms (다중 블록 암호 알고리듬을 지원하는 암호 프로세서)

  • Cho, Wook-Lae;Kim, Ki-Bbeum;Bae, Gi-Chur;Shin, Kyung-Wook
    • Journal of the Korea Institute of Information and Communication Engineering / v.20 no.11 / pp.2093-2099 / 2016
  • This paper describes the design of a crypto-processor that supports multiple block cipher algorithms: PRESENT, ARIA, and AES. The crypto-processor integrates three cores: PRmo (PRESENT with mode of operation), AR_AS (ARIA_AES), and AES-16b. The PRmo core, which implements the 64-bit block cipher PRESENT, supports 80-bit and 128-bit key lengths and four modes of operation: ECB, CBC, OFB, and CTR. The AR_AS core, which supports 128-bit and 256-bit key lengths, integrates the two 128-bit block ciphers ARIA and AES into a single data-path by using a resource sharing technique. The AES-16b core, which supports a 128-bit key length, implements AES with a reduced 16-bit data-path to minimize hardware. Each crypto-core contains its own on-the-fly key scheduler, so consecutive blocks of plaintext/ciphertext can be processed without reloading the key. The crypto-processor was verified by FPGA implementation. Implemented with a $0.18{\mu}m$ CMOS cell library, the crypto-processor occupies 54,500 gate equivalents (GEs) and can operate at a clock frequency of 55 MHz.

Comparison of Multi-angle TerraSAR-X Staring Mode Image Registration Method through Coarse to Fine Step (Coarse to Fine 단계를 통한 TerraSAR-X Staring Mode 다중 관측각 영상 정합기법 비교 분석)

  • Lee, Dongjun;Kim, Sang-Wan
    • Korean Journal of Remote Sensing / v.37 no.3 / pp.475-491 / 2021
  • With the recent increase in available high-resolution (< ~1 m) satellite SAR images, the demand for precise registration of SAR images is increasing in various fields, including change detection. Registration between high-resolution SAR images acquired at different look angles is difficult because of speckle noise and the geometric distortion inherent in SAR images. In this study, registration is performed in two stages, coarse and fine, using X-band SAR data acquired in the staring spotlight mode of TerraSAR-X. For the coarse registration, a method combining adaptive sampling and SAR-SIFT (Scale Invariant Feature Transform) is applied. For the fine registration stage, three rigid methods (NCC: Normalized Cross Correlation, Phase Congruency-NCC, MI: Mutual Information) and one non-rigid method (Gefolki: Geoscience extended Flow Optical Flow Lucas-Kanade Iterative) were applied for performance comparison. The results were compared using RMSE (Root Mean Square Error) and the FSIM (Feature Similarity) index, and all rigid models showed poor results in all image combinations. The rigid models were confirmed to have a large registration error in rugged terrain. The Gefolki algorithm gave the best RMSE, 1 to 3 pixels, and its FSIM index was 0.02 to 0.03 higher than that of the rigid methods. It was confirmed that mis-registration due to the terrain effect could be sufficiently reduced by the Gefolki algorithm.
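Two of the measures named in this abstract, NCC for matching and RMSE for evaluation, can be sketched in a few lines. This is a generic illustration assuming image patches stored as flat lists and check points stored as (x, y) pixel pairs; the function names and sample values are hypothetical and do not reproduce the study's processing chain.

```python
import math

def ncc(patch_a, patch_b):
    """Normalized cross correlation of two equally sized patches, in [-1, 1]."""
    mean_a = sum(patch_a) / len(patch_a)
    mean_b = sum(patch_b) / len(patch_b)
    numerator = sum((a - mean_a) * (b - mean_b) for a, b in zip(patch_a, patch_b))
    denominator = math.sqrt(
        sum((a - mean_a) ** 2 for a in patch_a) * sum((b - mean_b) ** 2 for b in patch_b)
    )
    return numerator / denominator if denominator else 0.0

def registration_rmse(reference_points, registered_points):
    """Root mean square distance, in pixels, between matched check points."""
    squared = [
        (xr - xs) ** 2 + (yr - ys) ** 2
        for (xr, yr), (xs, ys) in zip(reference_points, registered_points)
    ]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical patches: the second is a brightness-scaled copy of the first.
print(ncc([10, 20, 30, 40], [20, 40, 60, 80]))
# Hypothetical check points before and after registration.
print(registration_rmse([(0, 0), (10, 10)], [(1, 0), (10, 12)]))
```

Because NCC subtracts the mean and normalizes by the patch energy, it is insensitive to linear brightness changes, which is why it is a common rigid-matching similarity measure.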

Diagnosis of Valve Internal Leakage for Ship Piping System using Acoustic Emission Signal-based Machine Learning Approach (선박용 밸브의 내부 누설 진단을 위한 음향방출신호의 머신러닝 기법 적용 연구)

  • Lee, Jung-Hyung
    • Journal of the Korean Society of Marine Environment & Safety / v.28 no.1 / pp.184-192 / 2022
  • Valve internal leakage is caused by damage to the internal parts of the valve, resulting in accidents and shutdowns of the piping system. This study investigated the possibility of a real-time leak detection method using the acoustic emission (AE) signal generated from the piping system during the internal leakage of a butterfly valve. Datasets of raw time-domain AE signals were collected and postprocessed for each operation mode of the valve in a systematic manner to develop a data-driven model for the detection and classification of internal leakage, by applying machine learning algorithms. The aim of this study was to determine whether it is possible to treat leak detection as a classification problem by applying two classification algorithms: support vector machine (SVM) and convolutional neural network (CNN). The results showed different performances for the algorithms and datasets used. The SVM-based binary classification models, based on feature extraction of data, achieved an overall accuracy of 83% to 90%, while in the case of a multiple classification model, the accuracy was reduced to 66%. By contrast, the CNN-based classification model achieved an accuracy of 99.85%, which is superior to those of any other models based on the SVM algorithm. The results revealed that the SVM classification model requires effective feature extraction of the AE signals to improve the accuracy of multi-class classification. Moreover, the CNN-based classification can be a promising approach to detect both leakage and valve opening as long as the performance of the processor does not degrade.

Magnetic Properties of Electroless Co-Mn-P Alloy Deposits (무전해 Co-Mn-P 합금 도금층의 자기적 특성)

  • Yun, Seong-Ryeol;Han, Seung-Hui;Kim, Chang-Uk
    • Korean Journal of Materials Research / v.9 no.3 / pp.274-281 / 1999
  • Sputtering and electroless plating are the methods usually used to manufacture metal-alloy thin-film magnetic memory devices. Because electroless plating has many advantages over sputtering in mass production and product variety, it has been studied extensively in the United States of America and Japan, but rarely in Korea. The purpose of this research is therefore to deposit Co-Mn-P alloy thin films on Corning glass 2948 by electroless plating, using sodium hypophosphite as the reductant, and to analyze the deposition rate, alloy composition, microstructure, and magnetic characteristics at various pH values and temperatures. For the Co-P alloy thin film, the reductive deposition reaction occurred only under basic conditions, not under acidic conditions. The deposition rate increased with pH and temperature, and the optimum condition was found at pH 10 and $80^{\circ}C$. The magnetic characteristics were best at pH 9 and $70^{\circ}C$, giving a coercive force of 870 Oe and a squareness of 0.78. Under this condition, the P content was 2.54% and the film thickness was $0.216\mu\textrm{m}$. Regarding crystal orientation, the fcc phase of $\beta$-Co was not observed, whereas the (1010), (0002), and (1011) orientations of hcp $\alpha$-Co were observed. The dominant (1010) and (1011) orientations of the Co-P alloy confirmed the formation of longitudinal magnetization. For the Co-Mn-P alloy deposit, the coercive force was about 100 Oe higher than that of the Co-P alloy, but the squareness did not differ. As in the Co-P alloy, the (1010) and (1011) orientations of $\alpha$-Co were dominant, which likewise confirmed the formation of longitudinal magnetization.

Enhanced WMAN System based on Region and Time Partitioning D-TDD OFDM Architecture (영역/시간 세분화 D-TDD OFDM 구조에 기반한 새로운 WMAN 시스템 구조 설계)

  • Kim, Mee-Ran;Cheong, Hee-Jeong;Kim, Nak-Myeong
    • Journal of the Institute of Electronics Engineers of Korea TC / v.43 no.11 s.353 / pp.68-77 / 2006
  • In accommodating the asymmetric traffic for future wireless multimedia services, the dynamic time division duplexing (D-TDD) scheme is considered as one of the key solutions. With the D-TDD mode, however, the inter-BS and inter-MS interference is inevitable during the cross time slot (CTS) period, and this interference seriously degrades the system performance. To mitigate such interference, we propose a region and time partitioning D-TDD architecture for OFDM systems. Each time slot in the CTS period is split into several minislots, and then each cell is divided into as many regions as the number of minislots per time slot. We then assign the minislots only to the users in its predefined corresponding region. On top of such architecture which inherently separates the interfering entities farther from each other, we design a robust time slot allocation scheme so that the inter-cell interference can be minimized. By the computer simulation, it has been verified that the proposed scheme outperforms the conventional time slot allocation methods in both the outage probability and the bandwidth efficiency.
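The region-to-minislot mapping described in this abstract can be sketched as follows. The abstract does not say how the regions are shaped, so this sketch assumes concentric rings of equal width around the base station; the function names, the cell radius, and the user distances are hypothetical.

```python
def region_index(distance, cell_radius, num_minislots):
    """Map a user's distance from the base station to a ring-shaped region.
    Assumes equal-width concentric rings, one region per minislot."""
    if not 0 <= distance <= cell_radius:
        raise ValueError("user must be inside the cell")
    index = int(distance / cell_radius * num_minislots)
    return min(index, num_minislots - 1)  # the cell edge belongs to the outermost ring

def assign_minislots(user_distances, cell_radius, num_minislots):
    """Group users by the minislot reserved for their region."""
    assignment = {m: [] for m in range(num_minislots)}
    for user, distance in user_distances.items():
        assignment[region_index(distance, cell_radius, num_minislots)].append(user)
    return assignment

# Hypothetical cell of radius 1000 m with each CTS time slot split into 4 minislots.
users = {"ms1": 100.0, "ms2": 600.0, "ms3": 900.0}
print(assign_minislots(users, cell_radius=1000.0, num_minislots=4))
```

Restricting each minislot to one region keeps users that transmit in the same minislot at a predictable position within their cells, which is the property the abstract relies on to separate interfering entities.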