• Title/Summary/Keyword: memory access time

Optical properties of Ag/$Ge_1Se_1Te_2$ material with secondary Ag layer adoption (두 번째 Ag 층을 적용한 Ag/$Ge_1Se_1Te_2$ 물질의 광학적 특성 연구)

  • Kim, Hyun-Koo;Han, Song-Lee;Kim, Jae-Hoon;Koo, Sang-Mo;Chung, Hong-Bay
    • Proceedings of the Korean Institute of Electrical and Electronic Material Engineers Conference / 2008.06a / pp.191-192 / 2008
  • For the phase-transition method, good recording sensitivity, low heat radiation, fast crystallization, and high resolution are essential. Retention time is also very important for phase transition. In our previous papers, we chose the $Ge_1Se_1Te_2$ composition in order to use Se, which has better optical sensitivity than conventional Sb. Ge-Se-Te and Ag/$Ge_1Se_1Te_2$ samples were fabricated and irradiated with a He-Ne laser and a DPSS laser to investigate reversible phase change induced by light. Because of the Ag ions, the sample with the inserted Ag layer performed better than the conventional one. This novel structure therefore suggests another possibility for phase-change random access memory.

A Reconfigurable Parallel Processor for Efficient Processing of Mobile Multimedia (모바일 멀티미디어의 효율적 처리를 위한 재구성형 병렬 프로세서의 구조)

  • Yoo, Se-Hoon;Kim, Ki-Chul;Yang, Yil-Suk;Roh, Tae-Moon
    • Journal of the Institute of Electronics Engineers of Korea SD / v.44 no.10 / pp.23-32 / 2007
  • This paper proposes a reconfigurable parallel processor architecture which can efficiently implement various multimedia applications, such as 3D graphics, H.264/H.263/MPEG-4, JPEG/JPEG2000, and MP3. The proposed architecture directly connects memories and processors so that memory access time and power consumption are reduced. It supports floating-point operations needed in the geometry stage of 3D graphics. It adopts partitioned SIMD to reduce hardware costs. Conditional execution of instructions is used for easy development of parallel algorithms.

Binary Search on Multiple Small Trees for IP Address Lookup (복수의 작은 트리에 대한 바이너리 검색을 이용한 IP 주소 검색 구조)

  • Lee Bo mi;Lim Hye sook;Kim Won jung
    • The Journal of Korean Institute of Communications and Information Sciences / v.29 no.12C / pp.1642-1651 / 2004
  • Advances in internet access technology require more internet bandwidth and high-speed packet processing. IP address lookup in routers is an essential operation that must be performed in real time for packets arriving at tens of millions of packets per second. In this paper, we propose a new architecture for efficient IP address lookup. The proposed scheme produces multiple balanced trees stored in a single SRAM and performs sequential binary searches on these trees. Performance evaluation results show that the proposed architecture requires 301.7 KByte of SRAM to store about 40,000 prefix samples, and that an address lookup takes 11.3 memory accesses on average.
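
The idea of running sequential binary searches over several small sorted structures can be illustrated with a minimal sketch. This is not the construction used in the paper: it assumes, purely for illustration, one sorted array per prefix length (standing in for one small balanced tree) and searches them from the longest prefix length to the shortest. The names `build_tables` and `lookup` are hypothetical, and the access counter only mimics the "memory accesses per lookup" metric quoted above.

```python
from ipaddress import ip_address, ip_network


def build_tables(prefixes):
    """Group (cidr, next_hop) pairs by prefix length.

    Each group is kept as a sorted list, which plays the role of one
    small balanced tree in this illustration.
    """
    tables = {}
    for cidr, hop in prefixes:
        net = ip_network(cidr)
        tables.setdefault(net.prefixlen, []).append(
            (int(net.network_address), hop)
        )
    for entries in tables.values():
        entries.sort()
    return tables


def lookup(tables, addr):
    """Return (next_hop, accesses) for the longest matching IPv4 prefix.

    Tables are searched one after another, longest prefix length first,
    so the first hit is the longest match. `accesses` counts how many
    table entries were read.
    """
    a = int(ip_address(addr))
    accesses = 0
    for plen in sorted(tables, reverse=True):
        entries = tables[plen]
        # Keep only the top `plen` bits of the 32-bit address.
        mask = ((1 << plen) - 1) << (32 - plen) if plen else 0
        key = a & mask
        lo, hi = 0, len(entries) - 1
        while lo <= hi:
            mid = (lo + hi) // 2
            accesses += 1
            value, hop = entries[mid]
            if value == key:
                return hop, accesses
            if value < key:
                lo = mid + 1
            else:
                hi = mid - 1
    return None, accesses


tables = build_tables([
    ("10.0.0.0/8", "A"),
    ("10.1.0.0/16", "B"),
    ("192.168.0.0/24", "C"),
])
hop, accesses = lookup(tables, "10.1.2.3")
print(hop, accesses)
```

In this sketch the cost of a lookup grows with the number of tables visited, which is why a real design would try to keep both the number of trees and their depth small.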

Implementation of a Performance Monitor using Object Oriented Concept (객체 지향 개념을 적용한 성능 모니터의 구현)

  • Kim, Yong-Soo;Lee, Keum-Suk
    • The Transactions of the Korea Information Processing Society / v.4 no.8 / pp.2038-2059 / 1997
  • The physical attributes of a computer, such as processor speed, memory size and access time, and I/O bandwidth, are fixed when the computer is delivered to the user. Under these conditions, the performance of a multi-process system in which multiple processes share various resources can be enhanced by monitoring and controlling the relationship between the user processes and the system resources. This paper applies object-oriented concepts to performance management and suggests a standard for that management. Resource managers and user processes are defined as managed objects, and a performance manager is defined as a managing object. A protocol between the user processes and the performance manager is designed, and the attributes and methods of the objects are also defined. Through this standardization, the user processes and the performance manager can be developed independently, and the system's performance can be managed collectively.

Improved Uniformity of Resistive Switching Characteristics in Ge0.5Se0.5-based ReRAM Device Using the Ag Nanocrystal (Ag Nanocrystal이 적용된 Ge0.5Se0.5-based ReRAM 소자의 Uniformity 특성 향상에 대한 연구)

  • Chung, Hong-Bay;Kim, Jang-Han;Nam, Ki-Hyun
    • Journal of the Korean Institute of Electrical and Electronic Material Engineers / v.27 no.8 / pp.491-496 / 2014
  • The resistive switching characteristics of resistive random access memory (ReRAM) based on amorphous $Ge_{0.5}Se_{0.5}$ thin films have been demonstrated using a Ti/Ag nanocrystals/$Ge_{0.5}Se_{0.5}$/Pt structure. Ag nanocrystals (Ag NCs) were spread on the amorphous $Ge_{0.5}Se_{0.5}$ thin film, where they served as the source of metal ions. As a result, compared with the conventional Ag/$Ge_{0.5}Se_{0.5}$/Pt structure, the Ti/Ag NCs/$Ge_{0.5}Se_{0.5}$/Pt ReRAM device exhibits highly uniform bipolar resistive switching (BRS) characteristics, such as the operating voltages and the resistance values. In addition, stable DC endurance (> 100 cycles) and excellent data retention (> $10^4$ sec) were observed in the Ti/Ag NCs/$Ge_{0.5}Se_{0.5}$/Pt ReRAM device.

A Development of LDA Topic Association Systems Based on Spark-Hadoop Framework

  • Park, Kiejin;Peng, Limei
    • Journal of Information Processing Systems / v.14 no.1 / pp.140-149 / 2018
  • Social data such as users' comments are unstructured in nature, and current technologies for analyzing such data are constrained by the available storage space and processing time when fast storage and processing are required. Moreover, it is difficult to analyze user features at high speed from a huge amount of dynamically generated social data. To solve this problem, we design and implement a topic association analysis system based on the latent Dirichlet allocation (LDA) model. LDA does not require a training process and thus can easily analyze social users' hourly interests in different topics. The proposed system is built on the Spark framework, which runs on top of a Hadoop cluster. It offers high-speed processing because access to the hard disk is minimized and all intermediate data are processed in main memory. In the performance evaluation, it takes about 5 hours to analyze the topics of about 1 TB of test social data (SNS comments). Moreover, by analyzing the associations among topics, we can track the hourly change in social users' interests in different topics.

Location Based Routing Service In Distributed Web Environment

  • Kim, Do-Hyun;Jang, Byung-Tae
    • Proceedings of the KSRS Conference / 2003.11a / pp.340-342 / 2003
  • Location-based services that rely on the positions of moving objects are gradually expanding into new business areas. Location here includes estimated future positions as well as present and past positions. Location-based routing is an active business application that makes efficient use of the position information of moving objects. This service includes the trajectory of past positions, real-time tracing of the present position of specific moving objects, and the shortest and optimized paths combined with map information. In this paper, we describe how location-based routing services are extended to a distributed web GIS environment. Web GIS service systems provide various GIS services for analyzing and displaying spatial data with a friendly user interface. That is, we propose an efficient architecture and technologies for providing location-based routing services in a distributed web GIS environment. The position of a moving object is acquired by GPS (Global Positioning System) and converted to real-world coordinates by map matching with geometric information. We consider a swapping method between main memory and storage to access a large number of moving objects. The results of the location-based routing services are wrapped in a web-style data format, and we design the schema based on GML. We design these services as components developed in an object-oriented computing environment, which provides interoperability, language independence, and an easy development environment as well as reusability.

A Design Of Physical Layer For OpenCable Copy Protection Module Using SystemC (SystemC를 이용한 OpenCableTM Copy Protection Module의 Physical Layer 설계)

  • Lee, Jung-Ho;Lee, Suk-Yun;Cho, Jun-Dong
    • Proceedings of the Korea Information Processing Society Conference / 2004.05a / pp.157-160 / 2004
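  • In this paper, the physical layer of CableCard, the conditional access module of OpenCable ($OpenCable^{TM}$), the US next-generation digital cable broadcasting standard, is designed using the TLM (Transaction Level Modeling) and RTL (Register-Transfer Level) modeling techniques of SystemC. The CableCard physical layer designed in this paper consists of a PCMCIA Interface, a Command Interface, and an MPEG-2 TS Interface. The PCMCIA interface, which operates for card initialization when power is applied to the CableCard, is designed as a 16-bit PC Card SRAM type that can operate with 2 MByte of memory and a 100 ns access time. After the PCMCIA card initialization is complete, two logical interfaces are defined to carry out the CableCard functions: one is the MPEG-2 TS interface, and the other is the Command Interface, which carries commands between the host (set-top box) and the module. The Command Interface consists of a 1 KByte Data Channel for communicating with the CPU of the set-top box and a 4 KByte Extended Channel for OOB (Out-Of-Band) communication, and it operates at up to 20 Mbits/s. The MPEG-2 TS interface is designed to operate at up to 100 Mbits/s. After running the designed code, the timing simulation was verified with SimVision from Cadence.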

Acceleration of computation speed for elastic wave simulation using a Graphic Processing Unit (그래픽 프로세서를 이용한 탄성파 수치모사의 계산속도 향상)

  • Nakata, Norimitsu;Tsuji, Takeshi;Matsuoka, Toshifumi
    • Geophysics and Geophysical Exploration / v.14 no.1 / pp.98-104 / 2011
  • Numerical simulation in exploration geophysics provides important insights into subsurface wave propagation phenomena. Although elastic wave simulations take longer to compute than acoustic simulations, an elastic simulator can construct more realistic wavefields including shear components. Therefore, it is suitable for exploration of the responses of elastic bodies. To overcome the long duration of the calculations, we use a Graphic Processing Unit (GPU) to accelerate the elastic wave simulation. Because a GPU has many processors and a wide memory bandwidth, we can use it in a parallelised computing architecture. The GPU board used in this study is an NVIDIA Tesla C1060, which has 240 processors and a 102 GB/s memory bandwidth. Despite the availability of a parallel computing architecture (CUDA), developed by NVIDIA, we must optimise the usage of the different types of memory on the GPU device, and the sequence of calculations, to obtain a significant speedup of the computation. In this study, we simulate two-dimensional (2D) and three-dimensional (3D) elastic wave propagation using the Finite-Difference Time-Domain (FDTD) method on GPUs. In the wave propagation simulation, we adopt the staggered-grid method, which is one of the conventional FD schemes, since this method can achieve sufficient accuracy for use in numerical modelling in geophysics. Our simulator optimises the usage of memory on the GPU device to reduce data access times, and uses faster memory as much as possible. This is a key factor in GPU computing. By using one GPU device and optimising its memory usage, we improved the computation time by more than 14 times in the 2D simulation, and over six times in the 3D simulation, compared with one CPU. Furthermore, by using three GPUs, we succeeded in accelerating the 3D simulation 10 times.
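
To make the staggered-grid idea concrete, the following is a minimal one-dimensional, CPU-only sketch in plain Python. It is not the authors' 2D/3D GPU simulator: the grid size, time step, material constants, and the function name `simulate_1d` are all assumptions chosen for illustration. Velocity is stored on integer grid points and stress halfway between them, and the two fields are updated alternately.

```python
import math


def simulate_1d(n=200, steps=100, dx=1.0, dt=0.5, rho=1.0, mu=1.0):
    """Advance a 1D velocity-stress system on a staggered grid.

    v[i] lives on grid point i; s[i] lives halfway between v[i] and v[i+1].
    All parameter values are illustrative assumptions.
    """
    c = math.sqrt(mu / rho)  # wave speed
    assert c * dt / dx <= 1.0, "CFL stability condition violated"

    # Initial condition: a Gaussian velocity pulse centred in the domain.
    v = [math.exp(-((i - n // 2) / 5.0) ** 2) for i in range(n)]
    s = [0.0] * (n - 1)

    for _ in range(steps):
        # Stress update from the velocity gradient.
        for i in range(n - 1):
            s[i] += dt * mu / dx * (v[i + 1] - v[i])
        # Velocity update from the stress gradient.
        # The two end points are never updated, so they keep their
        # (near-zero) initial values.
        for i in range(1, n - 1):
            v[i] += dt / (rho * dx) * (s[i] - s[i - 1])
    return v


v = simulate_1d()
print(max(v))
```

Each inner loop touches only neighbouring array elements, which is the access pattern that a GPU version would map onto threads and try to serve from the faster on-chip memory.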

Toward Optimal FPGA Implementation of Deep Convolutional Neural Networks for Handwritten Hangul Character Recognition

  • Park, Hanwool;Yoo, Yechan;Park, Yoonjin;Lee, Changdae;Lee, Hakkyung;Kim, Injung;Yi, Kang
    • Journal of Computing Science and Engineering / v.12 no.1 / pp.24-35 / 2018
  • The deep convolutional neural network (DCNN) is an advanced technology in image recognition. Because of its extreme computing resource requirements, a DCNN implemented in software alone cannot meet real-time requirements. Therefore, the need for DCNN accelerator hardware is increasing. In this paper, we present a field programmable gate array (FPGA)-based hardware accelerator design of a DCNN targeting a handwritten Hangul character recognition application. We also present design optimization techniques in the SDAccel environment for searching the optimal FPGA design space. The techniques we used include memory access optimization, computing unit parallelism, and data conversion. We achieved a recognition time of about 11.19 ms per character with a Xilinx FPGA accelerator. Our design optimization was performed with Xilinx HLS and the SDAccel environment, targeting the Kintex XCKU115 FPGA from Xilinx. Our design outperforms a CPU in terms of energy efficiency (the number of samples per unit energy) by 5.88 times, and a GPGPU by 5 times. We expect that the research results will offer an alternative to GPGPU solutions for real-time applications, especially in data centers or server farms where energy consumption is a critical problem.