• Title/Summary/Keyword: memory access time


Development and Analyses of Xen based Dynamic Binary Instrumentation using Intel VT (Intel VT 기술을 이용한 Xen 기반 동적 악성코드 분석 시스템 구현 및 평가)

  • Kim, Tae-Hyoung;Kim, In-Hyuk;Eom, Young-Ik;Kim, Won-Ho
    • Journal of KIISE:Computer Systems and Theory
    • /
    • v.37 no.5
    • /
    • pp.304-313
    • /
    • 2010
  • Several methods exist for malware analysis, but existing detection methods have difficulty detecting malware accurately. In particular, malware with strong anti-debugging facilities can detect an analyzer and disrupt its analysis. Furthermore, analyzing malware takes too much time. To resolve these problems with current analyzers, an improved analysis scheme is required. This paper proposes a dynamic binary instrumentation system that supports instruction analysis and memory access tracing. Additionally, by supporting API call tracing with DLL loading analysis, our system establishes a foundation for analyzing various executable codes. A full-virtualization environment is built on Xen using Intel's VT technology, and Windows XP can be used as a guest. We analyze representative malware using several functions of our system and show the improved accuracy and efficiency of its binary analysis capability.

Parallel Rabin Fingerprinting on GPGPU for Efficient Data Deduplication (효율적인 데이터 중복제거를 위한 GPGPU 병렬 라빈 핑거프린팅)

  • Ma, Jeonghyeon;Park, Sejin;Park, Chanik
    • Journal of KIISE
    • /
    • v.41 no.9
    • /
    • pp.611-616
    • /
    • 2014
  • Rabin fingerprinting, which is used for chunking, requires the largest amount of computation time in data deduplication. In this paper, therefore, we propose parallel Rabin fingerprinting on GPGPU for efficient data deduplication. For efficient parallelism, four issues are considered. First, when dividing the input data stream into data sections, we consider the data located near the boundaries between data sections so that the Rabin fingerprint can be calculated continuously. Second, we exploit the characteristics of Rabin fingerprinting for efficient operation. Third, we consider the chunk boundaries, which can change relative to sequential Rabin fingerprinting when parallel Rabin fingerprinting is adopted. Finally, we optimize GPGPU memory access. Parallel Rabin fingerprinting on GPGPU performs 16 times better than sequential Rabin fingerprinting on CPU and 5.3 times better than parallel Rabin fingerprinting on CPU. This throughput improvement can improve the overall performance of data deduplication.
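
The first issue in the abstract above (data near section boundaries) can be illustrated with a minimal sketch. It is not the authors' GPGPU implementation: it runs on the CPU, and it swaps in a Rabin-Karp style polynomial rolling hash modulo a prime in place of true Rabin fingerprinting over GF(2). The constants `WINDOW`, `BASE`, `MOD`, and `MASK` and all function names are assumptions chosen for illustration. The point it shows is that each section must read `WINDOW - 1` bytes before its own start in order to reproduce the sequential fingerprints.

```python
WINDOW = 16          # sliding-window size in bytes (assumed value)
BASE = 257           # polynomial base (assumed value)
MOD = (1 << 61) - 1  # prime modulus (assumed value)
MASK = (1 << 6) - 1  # a boundary is declared when the low 6 bits are zero (assumed)


def window_hashes(data: bytes, start: int, end: int):
    """Return (position, hash) for every window that ends at p in [start, end).

    The hash at p covers data[p - WINDOW + 1 : p + 1], so a section reads up
    to WINDOW - 1 bytes that belong to the previous section.
    """
    first = max(start, WINDOW - 1)       # earliest position with a full window
    if first >= end:
        return []
    top = pow(BASE, WINDOW - 1, MOD)     # weight of the byte leaving the window
    h = 0
    for b in data[first - WINDOW + 1 : first + 1]:
        h = (h * BASE + b) % MOD         # hash of the first full window
    out = [(first, h)]
    for p in range(first + 1, end):
        h = (h - data[p - WINDOW] * top) % MOD  # drop the oldest byte
        h = (h * BASE + data[p]) % MOD          # append the newest byte
        out.append((p, h))
    return out


def boundaries(data: bytes, start: int, end: int):
    """Positions in [start, end) whose window hash marks a chunk boundary."""
    return [p for p, h in window_hashes(data, start, end) if h & MASK == 0]


def sectioned_boundaries(data: bytes, section_size: int):
    """Process fixed-size sections independently, as parallel workers would."""
    result = []
    for s in range(0, len(data), section_size):
        result.extend(boundaries(data, s, min(s + section_size, len(data))))
    return result
```

Because every section rebuilds its first window from the overlapping bytes, `sectioned_boundaries(data, n)` returns the same positions as `boundaries(data, 0, len(data))` for any section size. Minimum and maximum chunk sizes, which cause the boundary changes mentioned as the third issue, are deliberately left out of this sketch.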

General Web Cache Implementation Using NIO (NIO를 이용한 범용 웹 캐시 구현)

  • Lee, Chul-Hui;Shin, Yong-Hyeon
    • Journal of Advanced Navigation Technology
    • /
    • v.20 no.1
    • /
    • pp.79-85
    • /
    • 2016
  • In the recent web environment, network traffic has increased rapidly because of mobile devices and social networks, such as smartphones and Facebook. In this paper, we improved the web response time of an existing system using the direct buffer of NIO and DMA. This addressed disadvantages of Java such as reduced CPU performance caused by blocking I/O and garbage collection of buffers. Keys for data that is moved frequently because of priority changes are placed in a hash map, which is easy to operate on, and a priority modification algorithm is applied. Large response data is split and stored in a fast direct buffer, which improves performance. The proposed NIO-based method showed considerably improved performance in many test situations involving cache hits and cache misses.

The Need of Cache Partitioning on Shared Cache of Integrated Graphics Processor between CPU and GPU (내장형 GPU 환경에서 CPU-GPU 간의 공유 캐시에서의 캐시 분할 방식의 필요성)

  • Sung, Hanul;Eom, Hyeonsang;Yeom, HeonYoung
    • KIISE Transactions on Computing Practices
    • /
    • v.20 no.9
    • /
    • pp.507-512
    • /
    • 2014
  • Recently, distributed computing has begun to use both the CPU (central processing unit) and the GPU (graphics processing unit) to improve performance and to overcome the dark silicon problem, in which not all transistors can be used because of power limitations. In an integrated graphics processor, the CPU and GPU share memory and the last-level cache (LLC). However, there are no LLC access rules between the CPU and GPU, so if GPU and CPU processes run at the same time, the performance of both degrades because of contention on the LLC. This paper gives evidence of the need for cache partitioning and describes a cache partitioning design that uses page coloring to allocate L3 cache space exclusively to the GPU process in order to guarantee its performance.
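
The page coloring idea mentioned in the abstract above can be sketched as follows. This is a generic illustration, not the design from the paper. It assumes a physically indexed, set-associative cache with no slice hashing, so that the color of a page is its physical frame number modulo the number of colors. The cache geometry used in the usage note and all function names are hypothetical.

```python
def num_colors(cache_bytes: int, ways: int, page_bytes: int) -> int:
    """Number of page colors: the size of one cache way divided by the page size."""
    return cache_bytes // (ways * page_bytes)


def page_color(phys_addr: int, page_bytes: int, colors: int) -> int:
    """Color of the physical page that contains phys_addr."""
    return (phys_addr // page_bytes) % colors


def split_frames(frames, colors: int, gpu_colors):
    """Partition frame numbers into (gpu_frames, cpu_frames) by color.

    Frames whose color is in gpu_colors are reserved for the GPU process,
    so CPU pages never map to the cache sets behind those colors.
    """
    gpu_frames, cpu_frames = [], []
    for frame in frames:
        target = gpu_frames if frame % colors in gpu_colors else cpu_frames
        target.append(frame)
    return gpu_frames, cpu_frames
```

With an assumed 8 MiB, 16-way cache and 4 KiB pages, `num_colors(8 * 2**20, 16, 4096)` gives 128 colors. Reserving colors 0 to 31 for the GPU process would then set aside a quarter of the cache sets for it.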

AE32000B: a Fully Synthesizable 32-Bit Embedded Microprocessor Core

  • Kim, Hyun-Gyu;Jung, Dae-Young;Jung, Hyun-Sup;Choi, Young-Min;Han, Jung-Su;Min, Byung-Gueon;Oh, Hyeong-Cheol
    • ETRI Journal
    • /
    • v.25 no.5
    • /
    • pp.337-344
    • /
    • 2003
  • In this paper, we introduce a fully synthesizable 32-bit embedded microprocessor core called the AE32000B. The AE32000B core is based on the extendable instruction set computer architecture, so it has high code density and a low memory access rate. In order to improve the performance of the core, we developed and adopted various design options, including the load extension register instruction (LERI) folding unit, a high performance multiply and accumulate (MAC) unit, various DSP units, and an efficient coprocessor interface. The instructions per cycle count of the Dhrystone 2.1 benchmark for the designed core is about 0.86. We verified the synthesizability and the area and time performances of our design using two CMOS standard cell libraries: a 0.35-$\mu$m library and a 0.18-$\mu$m library. With the 0.35-$\mu$m library, the core can be synthesized with about 47,000 gates and operate at 70 MHz or higher, while it can be synthesized with about 53,000 gates and operate at 120 MHz or higher with the 0.18-$\mu$m library.


Efficient Hardware Support: The Lock Mechanism without Retry (하드웨어 지원의 재시도 없는 잠금기법)

  • Kim Mee-Kyung;Hong Chul-Eui
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.10 no.9
    • /
    • pp.1582-1589
    • /
    • 2006
  • A lock mechanism is essential for synchronization in multiprocessor systems. The conventional queuing lock generates bus traffic twice: for the initial lock-read and for its retry. This paper proposes a new locking protocol, called the WPV (Waiting Processor Variable) lock mechanism, which requires only one lock-read bus command. The WPV mechanism accesses the shared data in the initial lock-read phase, which is held in the pipelined protocol until the shared data is transferred. The WPV mechanism also uses the cache state lock mechanism to reduce locking overhead and guarantees FIFO lock operations under multiple lock contentions. We also derive analytical models of the WPV lock mechanism and of the conventional memory and cache queuing lock mechanisms. Simulation results show that the WPV lock mechanism reduces access time by about 50% compared with the conventional queuing lock mechanism.

Buying vs. Using: User Segmentation & UI Optimization through Mobile Phone Log Analysis (구매 vs. 사용 휴대폰 Log 분석을 통한 사용자 재분류 및 UI 최적화)

  • Jeon, Myoung-Hoon;Na, Dae-Yol;Ahn, Jung-Hee
    • 한국HCI학회:학술대회논문집
    • /
    • 2008.02b
    • /
    • pp.460-464
    • /
    • 2008
  • To improve and optimize the user interfaces of a system, an accurate understanding of users' behavior is an essential prerequisite. Direct questions depend on users' ambiguous memory, and usability tests depend on the researchers' intentions rather than the users'. Furthermore, neither provides a natural context of use. In this paper, we describe work that examined users' behavior through log analysis in their own environment. Fifty users were recruited by consumer segmentation, and logging software was installed on their mobile phones. After two weeks, the logged data were gathered and analyzed. Complementary methods such as a user diary and an interview were also used. The analysis showed the frequency of menu and key access, usage time, data storage, and several usage patterns. It also found that users could be segmented into new groups by their usage patterns. Improvements to the mobile phone user interface were proposed based on the results of this study.


Study on resistive switching characteristics of AlN films (AlN 박막의 저항 변화 특성에 관한 연구)

  • Kim, Hee-Dong;An, Ho-Myoung;Seo, Yu-Jeong;Kim, Tae-Geun
    • Proceedings of the Korean Institute of Electrical and Electronic Material Engineers Conference
    • /
    • 2010.06a
    • /
    • pp.257-257
    • /
    • 2010
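  • Resistive switching memory has recently attracted attention as a next-generation nonvolatile memory device because its write access time is more than $10^5$ times faster than that of Flash memory, a conventional nonvolatile memory device, because it operates at low voltages of 2~5 V or less, like DRAM, and because its fabrication process is simple. However, reliability issues such as device endurance and retention remain to be solved. To address these problems, various resistive switching materials, such as perovskite oxides and binary oxides, are being studied. The metal oxide materials that are mainly studied at present, however, suffer from the formation of many oxygen defects caused by oxygen during fabrication and from surface contamination that easily occurs during fabrication. To solve these problems in the fabrication of conventional metal oxide thin films, this study introduces a nitride thin film as the resistive switching material. The aim is to realize a memory device with higher efficiency and excellent reproducibility that retains the advantages of existing resistive switching materials, namely a simple process and low-voltage, high-speed operation, while resolving the oxygen defects and surface contamination that arise during fabrication [1, 2]. For this study, a Metal/Insulator/Metal (MIM) resistive switching memory with a Pt/AlN/Pt structure was fabricated. To identify the optimal conditions for resistive switching, the film thickness was varied from 70 to 200 nm and the annealing temperature in an N2 gas ambient was varied from $200^{\circ}$C to $600^{\circ}$C. The resistive switching characteristics of the device were measured using a Keithley 4200-SCS. The optimal AlN thickness and annealing temperature were 130 nm and $500^{\circ}$C, and stable unipolar resistive switching characteristics were confirmed.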


Rapid Thermal Annealing Effect on the Magnetic Tunnel Junction with MgO Tunnel Barrier (MgO 절연막을 갖는 자기 터널 접합구조에서의 급속 열처리 효과)

  • Min, Kiljoon;Lee, Kyungil;Kim, Taewan;Jang, Joonyeon
    • Journal of the Korean Magnetics Society
    • /
    • v.25 no.2
    • /
    • pp.47-51
    • /
    • 2015
  • To achieve a high tunneling magnetoresistance (TMR) in sputtered magnetic tunnel junctions (MTJs) with an MgO barrier, the annealing process is indispensable. The structural and compositional changes caused by annealing greatly affect the spin-dependent transport properties of MTJs. Higher TMR could be obtained for MTJs annealed at higher temperatures. Diffusion of Ru, Mn and/or Ta in the MTJs may occur during the annealing process, which is known to be detrimental to the spin-dependent tunneling effect. The rapid thermal annealing (RTA) process was used to anneal MTJs with synthetic antiferromagnets. To suppress the diffusion of Mn, Ru and/or Ta in the MTJs, the process time and temperature of RTA were precisely controlled.

Performance Analyses of Instruction Fetch Models Considering Cache Miss and Branch Misprediction (캐쉬 미스와 분기예측 실패를 고려한 명령어 페치 모델의 성능분석)

  • Kim, Seon-Mo;Jeong, Jin-Ha;Choe, Sang-Bang
    • Journal of KIISE:Computer Systems and Theory
    • /
    • v.28 no.12
    • /
    • pp.685-697
    • /
    • 2001
  • Cache memories are small, fast memories that temporarily hold the contents of main memory likely to be referenced by processors, so as to reduce instruction and data access time. In this paper, we present analytical models of the instruction fetch process for four types of instruction cache structures that can be used in superscalar processors. In the models, we define various architectural parameters and take cache misses and branch mispredictions into consideration. To verify the correctness of the proposed models, we performed extensive simulations and compared the results with the analytical models. Simulation results showed that the proposed model can estimate the instruction fetch rate to within 10% error in most cases. Both the analytical model and the simulation show that an increase in cache misses reduces the instruction fetch rate more severely than an increase in branch mispredictions does. However, the analytical model can explain causes of performance degradation that cannot be uncovered by simulation alone. The model can also provide the exact relationship between cache misses and branch mispredictions for instruction fetch analysis.
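
The opening claim of the abstract above, that a cache reduces access time, can be made concrete with the standard textbook average access time formula: hit time plus miss rate times miss penalty. The sketch below is not the analytical fetch model proposed in the paper. The function name and the numbers in the usage note are assumptions chosen only to illustrate the arithmetic.

```python
def average_access_time(hit_time: float, miss_rate: float, miss_penalty: float) -> float:
    """Textbook average memory access time, in the same unit as the inputs.

    hit_time:     time to access the cache on a hit
    miss_rate:    fraction of accesses that miss, between 0 and 1
    miss_penalty: extra time needed to fetch from main memory on a miss
    """
    if not 0.0 <= miss_rate <= 1.0:
        raise ValueError("miss_rate must be between 0 and 1")
    return hit_time + miss_rate * miss_penalty
```

For example, with an assumed 1-cycle hit time, a 5% miss rate, and a 100-cycle miss penalty, the average is about 6 cycles per access, compared with 100 cycles if every access went to main memory.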
