• Title/Summary/Keyword: Direct Memory Access

Design of InfiniBand RDMA-based Network Structure of Apache Storm (InfiniBand RDMA 기반 Apache Storm의 네트워크 구조 설계)

  • Yang, Seokwoo;Son, Siwoon;Choi, Seong-Yun;Choi, Mi-Jung;Moon, Yang-Sae
    • Proceedings of the Korea Information Processing Society Conference / 2017.11a / pp.679-681 / 2017
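  • Apache Storm is a real-time distributed parallel processing framework for large-scale data streams, and it can run many processes and threads concurrently. However, because Storm provides this multi-process, multi-threaded environment, it issues many network system calls, and the resulting frequent context switches, buffer copies into the operating system, and buffer copies within the operating system can overload the CPU. The problem becomes more severe with IPoIB (IP over InfiniBand) communication on high-performance InfiniBand network hardware: the data sent and received are small relative to the bandwidth InfiniBand supports, so context switches and buffer copies occur even more often. This paper therefore proposes a design that applies InfiniBand's RDMA (Remote Direct Memory Access) to Storm in order to resolve the CPU overload problem.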

Generation of Displacement Signal for Realizing Road Profile using the Accelerometer (가속도계를 이용한 노면형상재현 변위신호 생성)

  • Kim, Jong-Tye;Kim, Cheol-Woo;Kim, Taek-Hyun
    • Journal of the Korean Society of Propulsion Engineers / v.14 no.2 / pp.39-45 / 2010
  • In recent years, evaluating the durability and reliability of vehicles, aircraft, and structures has become important. For vehicles in particular, durability and reliability are tested by driving tests after prototype vehicles are built. However, this approach requires considerable cost and effort to run the experiments and to respond to product defects. These problems can be addressed by a simulator that provides a realistic environment. In this paper, a four-axis hydraulic road simulator and the driving program that operates it are developed. The displacement road profile is generated from accelerometer signals. For verification, a real-vehicle experiment is carried out, and the road profile obtained from the experiment is verified on the four-axis road simulator.

Model Validation of a Fast Ethernet Controller for Performance Evaluation of Network Processors (네트워크 프로세서의 성능 예측을 위한 고속 이더넷 제어기의 상위 레벨 모델 검증)

  • Lee Myeong-jin
    • Journal of KIISE:Computing Practices and Letters / v.11 no.1 / pp.92-99 / 2005
  • In this paper, we present a high-level design methodology applied to a network system-on-a-chip (SoC) using SystemC. The main goal of our approach is to obtain optimal performance parameters for high network address translation (NAT) throughput. The Fast Ethernet media access controller (MAC) and its direct memory access (DMA) controller are modeled in SystemC at the transaction level. They are calibrated through cycle-based measurement of the operation of the actual Verilog register-transfer level (RTL) design. The NAT throughput of the model is within ±10% of the output of the real evaluation board. The model simulates more than 100 times faster than the RTL. The validated models are used for intensive architecture exploration to find the performance bottleneck in the NAT router.

Implementation and Performance Analysis of Pointer Swizzling Method for Effective Access to Complex Objects (복합 객체의 효율적인 접근을 위한 포인터 스위즐링 방법의 구현 및 성능 분석)

  • Min, Jun-Gi;Gang, Heum-Geun;Lee, Seong-Jin;Jeong, Jin-Wan
    • Journal of KIISE:Computing Practices and Letters / v.5 no.4 / pp.395-404 / 1999
  • Pointer swizzling methods consist of pointer swizzling and unswizzling. Pointer swizzling replaces an object identifier (OID) with the memory address of the corresponding object when the object is accessed; unswizzling restores a swizzled pointer to its original OID when the object is replaced or saved. In this research, we classify pointer swizzling techniques according to the system buffer structure and analyze their pros and cons. We also design and implement eager/lazy and direct/indirect swizzling and unswizzling modules on a dual-buffer structure, and evaluate the performance of each module with a limited object buffer size. The results show that lazy indirect swizzling generally performs well, because unused pointers are never swizzled and the unswizzling overhead is small.
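
The lazy indirect variant described in this abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the class names (`ObjectStore`, `Obj`, `Descriptor`, `Buffer`) and the dictionary-backed store are hypothetical, and the paper's dual-buffer structure is reduced here to a single in-memory table of descriptors.

```python
class Obj:
    """In-memory object; reference fields hold OIDs until they are swizzled."""
    def __init__(self, oid, fields):
        self.oid = oid
        self.fields = fields


class ObjectStore:
    """Hypothetical persistent store mapping OID -> raw record."""
    def __init__(self, records):
        self.records = records
        self.loads = 0  # how many objects have been faulted into memory

    def load(self, oid):
        self.loads += 1
        return Obj(oid, dict(self.records[oid]))


class Descriptor:
    """Indirection cell: swizzled pointers refer to this, not to the object."""
    def __init__(self, oid):
        self.oid = oid
        self.target = None  # resident object, or None when not in the buffer


class Buffer:
    def __init__(self, store):
        self.store = store
        self.descriptors = {}

    def deref(self, obj, field):
        """Follow a reference field (assumed to hold an OID or a Descriptor)."""
        ref = obj.fields[field]
        if not isinstance(ref, Descriptor):
            # Lazy: the OID is swizzled only when it is first followed.
            ref = self.descriptors.setdefault(ref, Descriptor(ref))
            obj.fields[field] = ref
        if ref.target is None:
            ref.target = self.store.load(ref.oid)  # object fault
        return ref.target

    def evict(self, oid):
        """Indirect: only the descriptor is cleared; referrers stay untouched."""
        self.descriptors[oid].target = None

    def unswizzle(self, obj):
        """Return the object's fields with swizzled pointers restored to OIDs."""
        return {k: (v.oid if isinstance(v, Descriptor) else v)
                for k, v in obj.fields.items()}


store = ObjectStore({1: {"next": 2, "unused": 3}, 2: {"next": 1}, 3: {}})
buf = Buffer(store)
root = store.load(1)
first = buf.deref(root, "next")   # swizzles root["next"] and faults in object 2
again = buf.deref(root, "next")   # already swizzled and resident: no new load
```

In this sketch the `unused` field of `root` is never followed, so it stays an OID and costs nothing to unswizzle, which mirrors the reason the abstract gives for the lazy indirect result.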

Data Cache System based on the Selective Bank Algorithm for Embedded System (내장형 시스템을 위한 선택적 뱅크 알고리즘을 이용한 데이터 캐쉬 시스템)

  • Jung, Bo-Sung;Lee, Jung-Hoon
    • The KIPS Transactions:PartA / v.16A no.2 / pp.69-78 / 2009
  • One of the most effective ways to improve cache performance is to exploit both the temporal and spatial locality arising from a program's execution characteristics. In this paper, we present a high-performance, low-power cache structure with a bank selection mechanism that enhances the exploitation of spatial and temporal locality. The proposed cache system consists of two parts: a main direct-mapped cache with a small block size, and a fully associative buffer whose large block size is a multiple of the small block size. The main direct-mapped cache is built as two banks for low power consumption and stores small blocks selected from the fully associative buffer by the proposed bank selection algorithm. Using the bank selection algorithm and three state bits, we selectively extend the lifetime of small blocks with high temporal locality by storing them in the main direct-mapped cache. This approach reduces conflict misses and cache pollution at the same time. According to the simulation results for MiBench applications, the average miss ratio improves by about 23% and 32% compared with victim and STAS caches of the same size, respectively. The average memory access time is reduced by about 14% and 18% compared with the victim and STAS caches, respectively. The energy consumption of the proposed cache is also around 10% lower than that of the other cache systems examined.

An Active Prefetch Filtering Schemes using Exclusive Prefetch Cache (선인출 전용 캐시를 이용한 적극적 선인출 필터링 기법)

  • Chon Young-Suk;Kim Suk-il;Jeon Joong-nam
    • The KIPS Transactions:PartA / v.12A no.1 s.91 / pp.41-52 / 2005
  • Memory reference instructions that miss in the cache are a critical factor limiting processor performance. Cache prefetching is an effective way to reduce memory access latency. However, excessively aggressive prefetching leads to cache pollution and can cancel out the benefit of prefetching. In this study, an active prefetch filtering scheme is introduced that consults a filtering table to decide dynamically whether to issue a prefetch, reducing the cache pollution caused by unnecessary prefetches. For precise filtering, an evicted address referencing scheme is proposed in which the filter directly compares the current prefetch address with previous unnecessary prefetch addresses stored in the filtering table. In addition, a small exclusive prefetch cache is introduced to increase the number of evictions of unnecessarily prefetched addresses and thereby improve the accuracy of dynamic filtering. The exclusive prefetch cache also prevents useful demand data from being pushed out by prefetched data, while the evicted address direct referencing scheme lets the prefetch cache keep most of the useful prefetched data despite its small size. Experimental results on commonly used general and multimedia benchmarks show that the average cache miss ratio decreases by 13.3% compared with conventional schemes, owing to the improved filtering accuracy.
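
The filtering idea in this abstract can be sketched in a few lines of Python. This is an illustrative model only, not the scheme evaluated in the paper: the class name `FilteredPrefetcher`, the FIFO replacement policy, and the table and cache sizes are all assumptions made for the example.

```python
from collections import OrderedDict, deque


class FilteredPrefetcher:
    """Toy model: an exclusive prefetch cache plus a table of wasted prefetches."""

    def __init__(self, cache_size=2, table_size=4):
        self.cache = OrderedDict()             # addr -> referenced flag, FIFO order
        self.cache_size = cache_size
        self.table = deque(maxlen=table_size)  # addresses evicted without being used

    def prefetch(self, addr):
        """Return True only if a new line is actually fetched."""
        if addr in self.table:
            return False   # filtered: this address was a useless prefetch before
        if addr in self.cache:
            return False   # already resident, nothing to fetch
        if len(self.cache) >= self.cache_size:
            victim, referenced = self.cache.popitem(last=False)  # evict oldest
            if not referenced:
                self.table.append(victim)  # remember the unnecessary prefetch
        self.cache[addr] = False
        return True

    def demand(self, addr):
        """Demand access; return True on a prefetch-cache hit."""
        if addr in self.cache:
            self.cache[addr] = True  # the prefetch turned out to be useful
            return True
        return False
```

For example, with `cache_size=2`, prefetching `0x100` and `0x140`, using only `0x140`, and then prefetching `0x180` evicts the unused `0x100` into the table, so a later `prefetch(0x100)` is filtered out.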

Space-Time Concatenated Convolutional and Differential Codes with Interference Suppression for DS-CDMA Systems (간섭 억제된 DS-CDMA 시스템에서의 시공간 직렬 연쇄 컨볼루션 차등 부호 기법)

  • Yang, Ha-Yeong;Sin, Min-Ho;Song, Hong-Yeop;Hong, Dae-Sik;Gang, Chang-Eon
    • Journal of the Institute of Electronics Engineers of Korea TC / v.39 no.1 / pp.1-10 / 2002
  • A space-time concatenated convolutional and differential coding scheme is employed in a multiuser direct-sequence code-division multiple-access (DS-CDMA) system. The system consists of single-user detectors (SUDs), which suppress multiple-access interference (MAI) without requiring other users' spreading codes, timing, or phase information. The space-time differential code, treated as a convolutional code of rate 1 and memory 1, does not sacrifice coding efficiency and has the smallest number of states. In addition, it provides a diversity gain through space-time processing with a simple decoding process. The iterative process exchanges information between the differential decoder and the convolutional decoder. Numerical results show that this space-time concatenated coding scheme provides better performance and more flexibility than conventional convolutional codes in DS-CDMA systems, even at similar complexity. Further study shows that the performance of this coding scheme applied to DS-CDMA systems with SUDs improves as the processing gain or the number of taps of the interference suppression filter increases, and degrades with higher near-far interfering power or additional near-far interfering users.

A Performance Evaluation of a RISC-Based Digital Signal Processor Architecture (RISC 기반 DSP 프로세서 아키텍쳐의 성능 평가)

  • Kang, Ji-Yang;Lee, Jong-Bok;Sung, Won-Yong
    • Journal of the Korean Institute of Telematics and Electronics C / v.36C no.2 / pp.1-13 / 1999
  • As the complexity of DSP (Digital Signal Processing) applications increases, the need for new architectures that support efficient high-level language compilers also grows. By combining several DSP-specific features, such as single-cycle MAC (Multiply-and-ACcumulate), direct memory access, automatic address generation, and hardware looping, with a RISC core that has many general-purpose registers and orthogonal instructions, a high-performance, compiler-friendly RISC-based DSP processor can be designed. In this study, we develop a code converter that exploits these DSP architectural features by post-processing compiler-generated assembly code, and evaluate the performance effect of each feature using seven DSP kernel benchmarks and a QCELP vocoder program. Finally, we compare its performance with that of several existing DSP processors, such as the TMS320C3x, TMS320C54x, and TMS320C5x.

Buying vs. Using: User Segmentation & UI Optimization through Mobile Phone Log Analysis (구매 vs. 사용 휴대폰 Log 분석을 통한 사용자 재분류 및 UI 최적화)

  • Jeon, Myoung-Hoon;Na, Dae-Yol;Ahn, Jung-Hee
    • Proceedings of the HCI Society of Korea Conference / 2008.02b / pp.460-464 / 2008
  • An accurate understanding of users' behavior is an essential prerequisite for improving and optimizing a system's user interface. Direct questions depend on users' ambiguous memory, and usability tests depend on the researchers' intentions rather than the users'. Furthermore, neither provides a natural context of use. In this paper, we describe work that examined users' behavior in their own environment through log analysis. Fifty users were recruited by consumer segmentation, and logging software was installed on their mobile phones. After two weeks, the logged data were gathered and analyzed. Complementary methods, such as a user diary and an interview, were also used. The analysis showed the frequency of menu and key access, usage time, data storage, and several usage patterns. It was also found that users could be segmented into new groups by their usage patterns. Improvements to the mobile phone user interface were proposed based on the results of this study.

Architecture of Software Testing Tool for Railway Signalling through Actual Use Interface Channel (실사용 인터페이스를 이용한 열차제어 소프트웨어 테스팅 도구의 구조)

  • Hwang, Jong-Gyu;Baek, Jong-Hyun;Jo, Hyun-Jeong;Lee, Kang-Mi
    • The Journal of Korean Institute of Communications and Information Sciences / v.39C no.9 / pp.880-886 / 2014
  • With recent developments in computing technology, many railway signalling functions increasingly depend on computer software, and railway signalling systems are becoming more flexible and intelligent. Meanwhile, software is likely to contain many errors, and the cost incurred by such errors has increased. In particular, a fatal software error during railway operation may result in loss of life, so software verification and validation have become more important. Software functional safety tools are needed to support these activities, but most commercial tools depend on direct access to the system's memory, which makes them difficult to apply. Owing to this difficulty and complexity, they are rarely used in validating railway signalling system software. In this study, a new testing tool was developed that performs functional testing of software through an external interface and is therefore easy to apply. The tool supports the development and analysis of test cases for black-box testing by analyzing the interface protocols actually in use, which improves user convenience.