• Title/Abstract/Keyword: frame per second

Search Results: 130

A Survey on Vision Transformers for Object Detection Task (객체 탐지 과업에서의 트랜스포머 기반 모델의 특장점 분석 연구)

  • Ha, Jungmin;Lee, Hyunjong;Eom, Jungmin;Lee, Jaekoo
    • IEMEK Journal of Embedded Systems and Applications / v.17 no.6 / pp.319-327 / 2022
  • Transformers are among the most prominent deep learning models; they have achieved great success in natural language processing and have also shown good performance in computer vision. In this survey, we categorize transformer-based models for computer vision, particularly for the object detection task, and perform comprehensive comparative experiments to understand the characteristics of each model. We then evaluate the models, subdivided into standard transformers, transformers with key-point attention, and transformers that add coordinate-based attention, by comparing object detection accuracy and real-time performance. For the comparison, we use two metrics: frames per second (FPS) and mean average precision (mAP). Finally, through various experiments, we confirm the trends and relationships between detection accuracy and real-time performance across several transformer models.
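As a hedged illustration of one of the two metrics used in this survey, the sketch below times a stand-in detector to estimate frames per second; the `dummy_detector` callable, image shape, and warm-up count are placeholder assumptions, not part of the surveyed models.

```python
import time
import numpy as np

def measure_fps(detector, images, warmup=5):
    """Estimate frames per second of a detector over a list of images."""
    for img in images[:warmup]:          # warm-up runs are excluded from timing
        detector(img)
    start = time.perf_counter()
    for img in images[warmup:]:
        detector(img)
    elapsed = time.perf_counter() - start
    return (len(images) - warmup) / elapsed

# Placeholder detector: any callable taking an image and returning boxes.
dummy_detector = lambda img: []
frames = [np.zeros((640, 640, 3), dtype=np.uint8) for _ in range(105)]
print(f"{measure_fps(dummy_detector, frames):.1f} FPS")
```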

Deep Learning-based Gaze Direction Vector Estimation Network Integrated with Eye Landmark Localization (딥러닝 기반의 눈 랜드마크 위치 검출이 통합된 시선 방향 벡터 추정 네트워크)

  • Joo, Hee Young;Ko, Min Soo;Song, Hyok
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2021.06a / pp.180-182 / 2021
  • This paper proposes a gaze estimation network in which eye landmark localization and gaze direction vector estimation are integrated into a single deep learning network. The proposed network uses the Stacked Hourglass Network [1] as its backbone and consists of three main parts: a landmark detector, a feature map extractor, and a gaze direction estimator. The landmark detector estimates the coordinates of 50 eye landmark points, and the feature map extractor generates a feature map of the eye image for gaze direction estimation. The gaze direction estimator then combines these outputs to estimate the final gaze direction vector. The proposed network was trained on virtual synthetic eye images and landmark coordinates generated with the UnityEyes [2] dataset, and performance was evaluated on the MPIIGaze [3] dataset, which consists of real human eye images. In experiments, the gaze estimation error was 0.0396 MSE (mean square error), and the estimation speed of the network was 42 FPS (frames per second).
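A minimal PyTorch sketch of the three-part composition described above (landmark detector, feature map extractor, gaze direction estimator). The backbone, layer sizes, and module names here are illustrative assumptions, not the paper's Stacked Hourglass architecture.

```python
import torch
import torch.nn as nn

class GazeNet(nn.Module):
    """Illustrative three-part composition: landmarks + feature map -> gaze vector."""
    def __init__(self, num_landmarks=50, feat_dim=64):
        super().__init__()
        # Stand-in feature map extractor; the paper uses a Stacked Hourglass Network instead.
        self.backbone = nn.Sequential(nn.Conv2d(1, feat_dim, 3, padding=1), nn.ReLU(),
                                      nn.AdaptiveAvgPool2d((8, 8)))
        self.landmark_head = nn.Linear(feat_dim * 8 * 8, num_landmarks * 2)  # (x, y) per landmark
        self.gaze_head = nn.Linear(feat_dim * 8 * 8 + num_landmarks * 2, 3)  # 3-D gaze vector

    def forward(self, eye_image):
        feat = self.backbone(eye_image).flatten(1)
        landmarks = self.landmark_head(feat)
        # Combine feature map and landmark outputs to estimate the final gaze direction.
        gaze = self.gaze_head(torch.cat([feat, landmarks], dim=1))
        return landmarks, gaze

net = GazeNet()
lm, gaze = net(torch.randn(1, 1, 64, 96))   # dummy grayscale eye crop
print(lm.shape, gaze.shape)                  # torch.Size([1, 100]) torch.Size([1, 3])
```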


Object Tracking Using CAM shift with 8-way Search Window (CAM shift와 8방향 탐색 윈도우를 이용한 객체 추적)

  • Kim, Nam-Gon;Lee, Geum-Boon;Cho, Beom-Joon
    • Journal of the Korea Institute of Information and Communication Engineering / v.19 no.3 / pp.636-644 / 2015
  • This research suggests a method to improve object tracking performance by combining the CAM shift algorithm with an 8-way search window, and to reduce computation by reducing the number of frames used for tracking. CAM shift has drawbacks when tracking relies on a signature color and has difficulty following rapidly moving objects. To resolve this, a moving search window is added to CAM shift: when tracking is lost, an 8-way search starting from the last successfully tracked position re-acquires the object, making it possible to track a fast-moving object more accurately. Moreover, hardware advances have increased the number of frames produced per second and, with it, unnecessary computation; efficiency can therefore be enhanced by reducing the number of frames used for tracking.
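A hedged OpenCV-flavored sketch of the 8-way re-acquisition idea: when a CAM shift step loses the target, the 8 windows surrounding the last successful window are probed. The "still tracking" test, step size, and window handling are simplifying assumptions for illustration, not the paper's exact procedure.

```python
import cv2
import numpy as np

CRITERIA = (cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 10, 1)

def eight_way_windows(win, step):
    """Yield the 8 windows surrounding win = (x, y, w, h), shifted by `step` pixels."""
    x, y, w, h = win
    for dx in (-step, 0, step):
        for dy in (-step, 0, step):
            if dx or dy:
                yield (max(0, x + dx), max(0, y + dy), w, h)

def track_step(back_proj, last_win):
    """One CAM shift step; if the target is lost, probe the 8 neighboring windows."""
    _, win = cv2.CamShift(back_proj, last_win, CRITERIA)
    x, y, w, h = win
    if back_proj[y:y + h, x:x + w].sum() > 0:        # crude "still tracking" test (assumption)
        return win
    for cand in eight_way_windows(last_win, step=max(last_win[2], last_win[3])):
        cx, cy, cw, ch = cand
        if back_proj[cy:cy + ch, cx:cx + cw].sum() > 0:
            _, win = cv2.CamShift(back_proj, cand, CRITERIA)
            return win
    return last_win                                   # nothing found: keep the last good window

# Synthetic back projection with a bright blob the last window no longer covers.
bp = np.zeros((240, 320), dtype=np.uint8)
bp[100:140, 90:130] = 255
print(track_step(bp, last_win=(40, 60, 40, 40)))      # re-acquired near the blob
```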

Effective Scheduling Algorithm using Queue Separation and Packet Segmentation for Jumbo Packets (큐 분리 및 패킷 분할을 이용한 효율적인 점보패킷 스케쥴링 방법)

  • 윤빈영;고남석;김환우
    • The Journal of Korean Institute of Communications and Information Sciences / v.28 no.9A / pp.663-668 / 2003
  • With the advent of high-speed networking technology, computers connected to high-speed networks tend to consume more of their CPU cycles processing data, so one way to improve their performance is to reduce the CPU cycles spent on data processing. Since CPU consumption increases in proportion to the number of packets processed per second, reducing the packet rate by increasing the packet length is one solution. To meet this requirement, two types of jumbo packets, jumbograms and jumbo frames, have already been standardized or are under discussion. If jumbograms and general packets are interleaved and scheduled together in a router, the jumbograms may degrade the QoS of the general packets by increasing their transfer delay. They also tend to exhaust memory, since very long packets must be stored, which easily produces congestion in the router and results in packet loss. In this paper, we analyze the problems of processing jumbo packets and suggest a novel solution to overcome them.
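A minimal Python sketch of the general idea of separating jumbo packets into their own queue and segmenting them so they interleave with ordinary packets. The segment size, jumbo threshold, and simple alternation policy are assumptions for illustration, not the scheduling algorithm proposed in the paper.

```python
from collections import deque

SEGMENT_SIZE = 1500          # assumed MTU-sized segment for illustration
JUMBO_THRESHOLD = 9000       # assumed threshold separating jumbo packets

class JumboAwareScheduler:
    def __init__(self):
        self.normal_q = deque()
        self.jumbo_q = deque()

    def enqueue(self, packet: bytes):
        # Queue separation: jumbo packets go to their own queue, pre-segmented.
        if len(packet) >= JUMBO_THRESHOLD:
            for i in range(0, len(packet), SEGMENT_SIZE):
                self.jumbo_q.append(packet[i:i + SEGMENT_SIZE])
        else:
            self.normal_q.append(packet)

    def dequeue(self):
        # Alternate between queues so jumbo segments do not delay ordinary packets.
        if self.normal_q:
            yield self.normal_q.popleft()
        if self.jumbo_q:
            yield self.jumbo_q.popleft()

sched = JumboAwareScheduler()
sched.enqueue(b"x" * 64)                    # ordinary packet
sched.enqueue(b"y" * 20000)                 # jumbo packet, segmented on enqueue
print([len(p) for p in sched.dequeue()])    # e.g. [64, 1500]
```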

A QoS-Aware Energy Optimization Technique for Smartphone GPUs (QoS를 고려한 스마트폰 GPU의 에너지 최적화 기법)

  • Kim, Dohan;Song, Wook;Kim, HyungHoon;Kim, Jihong
    • Journal of KIISE / v.42 no.5 / pp.566-572 / 2015
  • We propose a novel energy optimization technique for smartphone GPUs that lowers the GPU frequency more aggressively, obtaining higher energy efficiency with a negligible impact on GPU performance. To meet the Quality of Service (QoS) specified by the smartphone application, the proposed technique selects the minimal acceptable GPU frequency based on the average frames per second (FPS) measured at each GPU frequency level. Experimental results on a smartphone development board show that the proposed technique reduces GPU energy consumption by up to 23% over the default DVFS algorithm with only a 0.45 frame drop.
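The selection rule can be sketched as below: given an average-FPS profile per frequency level, pick the lowest frequency that still meets the application's QoS target. The profile values and the target are placeholder assumptions, not measurements from the paper.

```python
def select_min_frequency(fps_by_freq, target_fps):
    """Return the lowest GPU frequency whose measured average FPS meets the QoS target."""
    for freq in sorted(fps_by_freq):                 # ascending frequency (MHz)
        if fps_by_freq[freq] >= target_fps:
            return freq
    return max(fps_by_freq)                          # fall back to the highest level

# Hypothetical per-level profile: average FPS measured at each GPU frequency level.
profile = {177: 41.2, 266: 52.7, 350: 58.9, 420: 60.1, 480: 60.2}
print(select_min_frequency(profile, target_fps=58))  # -> 350
```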

Deep Learning-based Gaze Direction Vector Estimation Network Integrated with Eye Landmark Localization (딥 러닝 기반의 눈 랜드마크 위치 검출이 통합된 시선 방향 벡터 추정 네트워크)

  • Joo, Heeyoung;Ko, Min-Soo;Song, Hyok
    • Journal of Broadcast Engineering / v.26 no.6 / pp.748-757 / 2021
  • In this paper, we propose a gaze estimation network in which eye landmark position detection and gaze direction vector estimation are integrated into one deep learning network. The proposed network uses the Stacked Hourglass Network as its backbone and is largely composed of three parts: a landmark detector, a feature map extractor, and a gaze direction estimator. The landmark detector estimates the coordinates of 50 eye landmarks, and the feature map extractor generates a feature map of the eye image for estimating the gaze direction. The gaze direction estimator then combines these outputs to estimate the final gaze direction vector. The proposed network was trained using virtual synthetic eye images and landmark coordinate data generated with the UnityEyes dataset, and the MPIIGaze dataset, which consists of real human eye images, was used for performance evaluation. In the experiments, the gaze estimation error was 3.9, and the estimation speed of the network was 42 FPS (frames per second).

Single Board Realtime 2-D IIR Filtering System (실시간 2차원 디지털 IIR 필터의 구현)

  • Jeong, Jae-Gil
    • The Journal of Engineering Research / v.2 no.1 / pp.39-47 / 1997
  • This paper presents a single-board digital signal processing system that can perform two-dimensional (2-D) digital infinite impulse response (IIR) filtering in real time. We have developed an architecture that provides not only the necessary computational power but also a balance between system input/output and computational requirements. The architecture achieves high system throughput by using highly parallel processing at both the system and processor levels. It significantly reduces system data communication requirements by taking advantage of a custom-designed processor and by providing each processor with its own input and output channels. After system initialization, almost 100 percent of the time is used for data processing, and data transfers occur concurrently with processing. Functional-level simulation reveals that the system throughput can reach one pixel per system cycle. With only a 10 MHz system clock, it can implement up to fourth-order 2-D IIR filters for video-rate data ($512\times512$ pixels per frame at 30 frames per second). If the system frequency is increased, the system can be used for preprocessing and postprocessing of HDTV video signals.
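The video-rate claim follows from a quick check: at one pixel per system cycle, a 10 MHz clock comfortably covers 512x512 pixels at 30 frames per second.

```python
pixels_per_frame = 512 * 512
frames_per_second = 30
required_throughput = pixels_per_frame * frames_per_second   # pixels per second
system_clock = 10_000_000                                     # 10 MHz, one pixel per cycle

print(required_throughput)                  # 7,864,320 pixels/s
print(required_throughput <= system_clock)  # True: 10 MHz suffices for video-rate filtering
```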


Implementation and verification of H.264 / AVC Intra Predictor for mobile environment (모바일 환경에서의 H.264 / AVC를 위한 인트라 예측기의 구현 및 검증)

  • Yun, Cheol-Hwan;Jeong, Yong-Jin
    • Journal of the Institute of Electronics Engineers of Korea SD / v.44 no.12 / pp.93-101 / 2007
  • Small area and low power are important requirements for multimedia processing hardware, especially in mobile environments. This paper presents a hardware architecture for an H.264/AVC intra prediction module aimed at small area and low power. A single arithmetic unit is shared and used sequentially for all mode decisions and computations needed to predict an image frame. As a result, we obtain a smaller area and a smaller memory size than other existing implementations. The proposed architecture was verified on the Altera Excalibur device; the hardware was described in Verilog-HDL and synthesized with Synopsys Design Compiler on the Samsung STD130 0.18 um CMOS standard cell library. The synthesis result was about 11.9K logic gates and 1,078 bytes of internal SRAM, and the maximum operating frequency was 107 MHz. It takes 879,617 clocks to process one QCIF frame, which means it can process 121.5 QCIF $(176\times144)$ frames per second, showing that it can be used for real-time H.264/AVC encoding in various multimedia applications.
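The reported frame rate follows directly from the cycle count: roughly the maximum clock frequency divided by the clocks needed per QCIF frame.

```python
clock_hz = 107_000_000        # maximum operating frequency from synthesis
clocks_per_frame = 879_617    # cycles needed to process one QCIF (176x144) frame

fps = clock_hz / clocks_per_frame
print(f"{fps:.1f} QCIF frames per second")   # ~121.6, consistent with the reported 121.5 FPS
```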

Development of Android App for Supporting Smooth Multimedia Streaming Service Using Frame Buffer (프레임 버퍼를 이용한 매끄러운 멀티미디어 스트리밍 서비스를 지원하는 안드로이드 앱 개발)

  • Seo, Sang-min;Kwon, Jonnho;Choi, Yoon-Ho
    • Journal of Internet Computing and Services / v.17 no.1 / pp.55-64 / 2016
  • Existing Android applications for streaming video in real time depend on the codec that implements the encoding function and on the version of the Android operating system. In addition, to stream video in real time, most applications must be connected to a separate desktop PC. To overcome these disadvantages, we propose a new application that records and streams video in real time. Specifically, the proposed application uses the Flash video file format, a common media file format supported by various versions of the Android operating system. Through experiments, we show that the proposed application can record the screen at more than 20 frames per second and stream it in real time while using existing video encoding methods.
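A minimal sketch of the frame-buffer idea: captured frames go into a bounded buffer so that capture jitter does not stall the streaming side. The buffer capacity, threading model, and frame type are illustrative assumptions, not the application's actual implementation.

```python
import threading
from collections import deque

class FrameBuffer:
    """Bounded frame buffer decoupling screen capture from streaming."""
    def __init__(self, capacity=64):
        self.frames = deque(maxlen=capacity)   # oldest frames are dropped when full
        self.lock = threading.Lock()

    def push(self, frame: bytes):
        with self.lock:
            self.frames.append(frame)

    def pop(self):
        with self.lock:
            return self.frames.popleft() if self.frames else None

buf = FrameBuffer()
buf.push(b"frame-0")           # a capture thread would call push() >= 20 times per second
print(buf.pop())               # a streaming thread drains frames and sends them out
```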

Experimental Investigation on the Gap Cavitation of Semi-spade Rudder (Semi-spade 타의 간극 캐비테이션에 대한 실험적 연구)

  • Paik, Bu-Geun;Kim, Kyung-Youl;Ahn, Jong-Woo;Kim, Yong-Soo;Kim, Sung-Pyo;Park, Je-Jun
    • Journal of the Society of Naval Architects of Korea / v.43 no.4 s.148 / pp.422-430 / 2006
  • The horn and movable parts around the gap of a conventional semi-spade rudder are visualized with a high-speed CCD camera at a frame rate of 4000 fps (frames per second) to study the unsteady cavity patterns on the rudder surface and in the gap. In addition, pressure measurements are conducted on the rudder surface and inside the gap to characterize the flow behavior. The rudder is tested without a propeller wake in the range $1.0 \leq \sigma_v \leq 1.6$ and at rudder deflection angles of $-8^{\circ} \leq \theta \leq 10^{\circ}$. The time-resolved cavity images show strong cavitation around the rudder gap at all deflection angles. As the deflection angle increases, the flow separated from the horn surface increases the strength of the cavitation. The accelerated flow along the horn decreases its pressure, and the flow separated from the horn increases the pressure abruptly. The pressure distribution inside the gap reveals flow moving from the pressure side to the suction side. At negative deflection angles, the turning area on the movable part initiates flow separation and cavitation on it.
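For general background (not the paper's own definition), the cavitation number $\sigma_v$ that parameterizes the test range above is conventionally defined with respect to the vapor pressure as $\sigma_v = \frac{p_\infty - p_v}{\frac{1}{2}\rho V^2}$, where $p_\infty$ is the ambient static pressure, $p_v$ the vapor pressure of water, $\rho$ the water density, and $V$ the inflow speed; lower $\sigma_v$ generally corresponds to stronger cavitation.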