• Title/Summary/Keyword: On-Chip Memory

Analysis on the Performance Impact of Partitioned LLC for Heterogeneous Multicore Processors (이종 멀티코어 프로세서에서 분할된 공유 LLC가 성능에 미치는 영향 분석)

  • Moon, Min Goo;Kim, Cheol Hong
    • The Journal of Korean Institute of Next Generation Computing / v.15 no.2 / pp.39-49 / 2019
  • CPU-GPU integrated heterogeneous multicore processors are now widely used to improve the performance of computing systems. They integrate CPUs and GPUs on a single chip, where both share the last-level cache (LLC). This sharing causes serious cache contention inside the processor and significant performance degradation. In this paper, we propose a partitioned LLC architecture to solve the cache contention problem in heterogeneous multicore processors, and we analyze the performance impact of varying the LLC size assigned to the CPUs and to the GPUs. According to our simulation results, CPU performance improves by up to 21% as the LLC size assigned to the CPU grows, whereas the GPU shows a negligible performance difference when its assigned LLC size increases. In other words, the GPU loses little performance when its LLC size decreases. Because the GPU's performance loss from a smaller LLC is much smaller than the CPU's gain from a larger one, applying a partitioned LLC to CPUs and GPUs is expected to improve the overall performance of heterogeneous multicore processors. In addition, a future memory management technique that maximizes the performance of each core could greatly improve the performance of heterogeneous multicore processors.
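
The partitioning idea in this abstract can be illustrated with a toy model. The sketch below is not the authors' simulator: it assumes a way-partitioned, LRU-managed set-associative cache in which each requester only looks up and allocates within its own quota of ways and no data is shared between CPU and GPU. The class name, the way counts, and the synthetic access traces are all hypothetical, and the numbers it produces do not reproduce the 21% figure.

```python
from collections import OrderedDict


class PartitionedLLC:
    """Toy set-associative cache whose ways are split between requesters."""

    def __init__(self, num_sets, ways_per_owner):
        # ways_per_owner is a hypothetical split, e.g. {"cpu": 6, "gpu": 2}.
        self.num_sets = num_sets
        self.ways = dict(ways_per_owner)
        # One LRU-ordered dict per (set, owner) partition.
        self.sets = [
            {owner: OrderedDict() for owner in self.ways}
            for _ in range(num_sets)
        ]
        self.hits = {owner: 0 for owner in self.ways}
        self.misses = {owner: 0 for owner in self.ways}

    def access(self, owner, block_addr):
        """Look up a block address; return True on a hit, False on a miss."""
        index = block_addr % self.num_sets
        tag = block_addr // self.num_sets
        partition = self.sets[index][owner]
        if tag in partition:
            partition.move_to_end(tag)  # mark as most recently used
            self.hits[owner] += 1
            return True
        self.misses[owner] += 1
        if self.ways[owner] == 0:
            return False  # owner has no ways, so nothing is cached
        if len(partition) == self.ways[owner]:
            partition.popitem(last=False)  # evict LRU block of this owner only
        partition[tag] = True
        return False

    def hit_rate(self, owner):
        total = self.hits[owner] + self.misses[owner]
        return self.hits[owner] / total if total else 0.0


def run(cpu_ways, gpu_ways):
    """Replay a synthetic trace and return (cpu_hit_rate, gpu_hit_rate)."""
    llc = PartitionedLLC(1, {"cpu": cpu_ways, "gpu": gpu_ways})
    for i in range(600):
        llc.access("cpu", i % 6)     # CPU reuses a 6-block working set
        llc.access("gpu", 1000 + i)  # GPU streams and never reuses a block
    return llc.hit_rate("cpu"), llc.hit_rate("gpu")


if __name__ == "__main__":
    print("even split (4/4):", run(4, 4))
    print("CPU-favoured split (6/2):", run(6, 2))
```

In this synthetic trace the CPU working set thrashes with four ways and fits with six, while the streaming GPU trace misses under either split. That mirrors the qualitative trend the abstract reports, but only because the traces were chosen to show it.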

The study of stereoscopic editing process with applying depth information (깊이정보를 활용한 입체 편집 프로세스 연구)

  • Baek, Kwang-Ho;Kim, Min-Seo;Han, Myung-Hee
    • Journal of Digital Contents Society / v.13 no.2 / pp.225-233 / 2012
  • 3D stereoscopic image content has been emerging as the blue chip of the next-generation content market since the . However, all commercially produced 3D content in Korea has failed at the box office. This is because the quality of Korean 3D content is much lower than that of overseas content and because the current 3D post-production process is based on 2D. Given these facts, the 3D editing process is closely tied to content quality. The current 3D editing processes of the production case of edit on a 2D-based system, then check the result on a 3D display system and modify it if there are any problems. To improve this situation, I propose making the 3D editing process more objective by adding to the current process a visualization of the depth data used in compositing work, such as the disparity map and the depth map. The proposed process has been used in the music drama , comparing with those of the film . The 3D values could be checked among cuts which have been changed a lot since those of , while the 3D value of drew an equal result in general. Because the current process relies on an artist's subjective sense of 3D, the result can vary with the artist's condition and state. Furthermore, the positive range cannot be predicted, so the sense of depth of a space may be distorted when cuts in the same space or a confined space show different 3D values. In contrast, objective 3D editing based on visualized depth data can keep the sense of depth consistent within the same space and across the whole work, which will enrich 3D content. It can also help solve problems such as depth distortion and visual fatigue.

Implementation of the AMBA AXI4 Bus interface for effective data transaction and optimized hardware design (효율적인 데이터 전송과 하드웨어 최적화를 위한 AMBA AXI4 BUS Interface 구현)

  • Kim, Hyeon-Wook;Kim, Geun-Jun;Jo, Gi-Ppeum;Kang, Bong-Soon
    • Journal of the Institute of Convergence Signal Processing / v.15 no.2 / pp.70-75 / 2014
  • Recently, demand for highly integrated, low-power, high-performance SoC designs has been increasing due to the multi-functionality and miniaturization of digital devices and the growing volume of service information. As systems evolve rapidly, hardware performance requirements have diversified, FPGA systems have been increasingly adopted for rapid verification, and SoC systems that combine an FPGA with an ARM core for control have become a common choice. Although the AXI bus is used in such systems in various ways, the interface is traditionally designed as an AXI slave. The slave structure has two problems: CPU resources are tied up because the CPU is continually involved in the data transfer and cannot do other work, and transmission efficiency drops because the AXI bus sits idle for longer. In this paper, an efficient AXI master interface is proposed to solve these problems. The simulation results show that the proposed system reduces the number of clock cycles consumed by an average of 51.99% and the slice count by 31%, and that the maximum operating frequency increases by about 140% to 107.84 MHz.
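
The reported percentages can be turned back into relative figures with simple arithmetic. The sketch below assumes that "increases by about 140%" means the new frequency is 2.4 times the baseline, and that a 51.99% reduction leaves 48.01% of the original clock cycles. The baseline values themselves are not given in the abstract, so the derived numbers are illustrative only, and the helper names are hypothetical.

```python
def baseline_from_increase(new_value, percent_increase):
    """Recover the baseline, assuming new = baseline * (1 + percent / 100)."""
    return new_value / (1.0 + percent_increase / 100.0)


def remaining_fraction(percent_reduction):
    """Fraction of the original quantity left after a percentage reduction."""
    return 1.0 - percent_reduction / 100.0


if __name__ == "__main__":
    # Implied baseline maximum frequency of the slave-style design (derived).
    f_old_mhz = baseline_from_increase(107.84, 140.0)
    # Share of clock cycles still needed by the master interface (derived).
    cycles_left = remaining_fraction(51.99)
    print(f"implied baseline frequency: {f_old_mhz:.2f} MHz")
    print(f"clock cycles remaining: {cycles_left:.4f} of the original")
```

Under these assumptions the baseline maximum frequency works out to roughly 45 MHz, and the master interface needs a little under half the clock cycles of the slave-style design.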

A Security SoC embedded with ECDSA Hardware Accelerator (ECDSA 하드웨어 가속기가 내장된 보안 SoC)

  • Jeong, Young-Su;Kim, Min-Ju;Shin, Kyung-Wook
    • Journal of the Korea Institute of Information and Communication Engineering / v.26 no.7 / pp.1071-1077 / 2022
  • A security SoC that can be used to implement elliptic curve cryptography (ECC) based public-key infrastructures was designed. The security SoC has an architecture in which a hardware accelerator for the elliptic curve digital signature algorithm (ECDSA) is interfaced with the Cortex-A53 CPU over the AXI4-Lite bus. The ECDSA hardware accelerator, which consists of a high-performance ECC processor, a SHA3 hash core, a true random number generator (TRNG), a modular multiplier, BRAM, and a control FSM, was designed to perform ECDSA signature generation and signature verification at high performance with minimal CPU control. The security SoC was implemented in the Zynq UltraScale+ MPSoC device for hardware-software co-verification, and the evaluation showed that ECDSA signature generation or signature verification can be performed about 1,000 times per second at a clock frequency of 150 MHz. The ECDSA hardware accelerator was implemented using 74,630 LUTs, 23,356 flip-flops, 32 kb of BRAM, and 36 DSP blocks.
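
To show what such an accelerator computes, here is a minimal software sketch of ECDSA signing and verification. It is not the authors' design. It assumes a textbook toy curve, y^2 = x^3 + 2x + 2 over GF(17) with generator (5, 1) of order 19, which is far too small to be secure, and it assumes SHA3-256 as the hash, since the abstract names neither the curve nor the digest size. The steps correspond loosely to the blocks listed above: hashing (SHA3 core), nonce generation (TRNG), modular inversion and multiplication (modular multiplier), and scalar multiplication (ECC processor).

```python
import hashlib
import secrets

# Toy curve parameters (assumption, for illustration only; not secure).
P, A, B = 17, 2, 2   # field prime and curve coefficients
G = (5, 1)           # generator point
N = 19               # prime order of G


def point_add(p1, p2):
    """Add two curve points; None represents the point at infinity."""
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    x1, y1 = p1
    x2, y2 = p2
    if x1 == x2 and (y1 + y2) % P == 0:
        return None  # p2 is the inverse of p1
    if p1 == p2:
        slope = (3 * x1 * x1 + A) * pow((2 * y1) % P, -1, P) % P
    else:
        slope = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (slope * slope - x1 - x2) % P
    y3 = (slope * (x1 - x3) - y1) % P
    return (x3, y3)


def scalar_mult(k, point):
    """Compute k * point with the double-and-add method."""
    result = None
    addend = point
    while k:
        if k % 2:
            result = point_add(result, addend)
        addend = point_add(addend, addend)
        k //= 2
    return result


def hash_to_int(message):
    """Hash the message with SHA3-256 and reduce it modulo the group order."""
    return int.from_bytes(hashlib.sha3_256(message).digest(), "big") % N


def sign(private_key, message):
    """Return an ECDSA signature (r, s) for the message."""
    z = hash_to_int(message)
    while True:
        k = secrets.randbelow(N - 1) + 1  # fresh nonce in [1, N-1]
        r = scalar_mult(k, G)[0] % N
        s = pow(k, -1, N) * (z + r * private_key) % N
        if r != 0 and s != 0:
            return r, s


def verify(public_key, message, signature):
    """Return True if the signature is valid for the message."""
    r, s = signature
    if r not in range(1, N) or s not in range(1, N):
        return False
    z = hash_to_int(message)
    w = pow(s, -1, N)
    point = point_add(scalar_mult(z * w % N, G),
                      scalar_mult(r * w % N, public_key))
    return point is not None and point[0] % N == r


if __name__ == "__main__":
    d = 7                    # hypothetical private key
    q = scalar_mult(d, G)    # matching public key
    sig = sign(d, b"hello")
    print("signature:", sig, "valid:", verify(q, b"hello", sig))
```

Because the group has only 19 elements, different messages often collide after reduction, so this sketch demonstrates the data flow rather than any security property. The modular inverse via `pow(x, -1, m)` requires Python 3.8 or later.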

Gene Expression Profiling of SH-SY5Y Human Neuroblastoma Cells Treated with Ginsenoside Rg1 and Rb1 (Ginsenoside Rg1 및 Rb1을 처리한 신경세포주(SH-SY5Y세포)의 유전자 발현양상)

  • Lee, Joon-Noh;Yang, Byung-Hwan;Choi, Seung-Hak;Kim, Seok-Hyun;Chai, Young-Gyu;Jung, Kyoung-Hwa;Lee, Jun-Seok;Choi, Kang-Ju;Kim, Young-Suk
    • Korean Journal of Biological Psychiatry / v.12 no.1 / pp.42-61 / 2005
  • Objectives: Ginsenosides Rg1 and Rb1, the major components of ginseng saponin, have neurotrophic and neuroprotective effects, including promotion of neuronal survival and proliferation, facilitation of learning and memory, and protection from ischemic injury and apoptosis. To investigate the molecular basis of the effects of ginsenosides on neurons, we analyzed the gene expression profiles of SH-SY5Y human neuroblastoma cells treated with ginsenoside Rg1 or Rb1. Methods: SH-SY5Y cells were cultured and treated in triplicate with ginsenoside Rg1 or Rb1 ($80{\mu}M$, $40{\mu}M$, $20{\mu}M$). The proliferation rates of SH-SY5Y cells were determined by MTT assay and microscopic examination. We used a high-density cDNA microarray chip containing 8K human genes to analyze the gene expression profiles in SH-SY5Y cells, and we applied the Significance Analysis of Microarrays (SAM) method to identify genes with statistically significant changes in expression. Results: Treatment of SH-SY5Y cells with $80{\mu}M$ ginsenoside Rg1 or Rb1 for 36 h produced maximal proliferation compared with the other concentrations or the control. In the microarray experiment, 96 genes were upregulated (${\geq}$3-fold) in Rg1-treated cells and 40 genes were upregulated (${\geq}$2-fold) in Rb1-treated cells. Treatment with ginsenoside Rg1 for 36 h induced the expression of genes associated with protein biosynthesis, regulation of transcription or translation, cell proliferation and growth, neurogenesis and differentiation, regulation of the cell cycle, energy transport, and others. Expression of genes associated with neurogenesis and neuronal differentiation, such as SCG10 and MLP, increased in ginsenoside Rg1-treated cells, but no such changes occurred in the Rb1 group. Conclusion: Our data provide novel insights into the gene mechanisms underlying a possible role for ginsenoside Rg1 or Rb1 in mediating neuronal proliferation or cell viability, which can elicit distinct patterns of gene expression in a neuronal cell line. Ginsenoside Rg1 has broader and stronger effects than ginsenoside Rb1 on gene expression and related cellular physiology. In addition, we suggest that the SCG10 gene, which is known to be expressed during neuronal differentiation in development and neuronal regeneration in adulthood, may play a role in enhancing activity-dependent synaptic plasticity or cytoskeletal regulation following ginsenoside Rg1 treatment. Further, ginsenoside Rg1 may have a role in the regeneration of injured neurons, the promotion of memory, and the prevention of aging or neuronal degeneration.

Switching and Leakage-Power Suppressed SRAM for Leakage-Dominant Deep-Submicron CMOS Technologies (초미세 CMOS 공정에서의 스위칭 및 누설전력 억제 SRAM 설계)

  • Choi Hoon-Dae;Min Kyeong-Sik
    • Journal of the Institute of Electronics Engineers of Korea SD / v.43 no.3 s.345 / pp.21-32 / 2006
  • This paper proposes a new SRAM circuit with row-by-row activation and low-swing write schemes to reduce both the switching power of active cells and the leakage power of sleep cells. By driving the source line of sleep cells with $V_{SSH}$, which is higher than $V_{SS}$, the leakage current can be reduced to 1/100 through the combined effects of reverse body bias, drain-induced barrier lowering (DIBL), and negative $V_{GS}$. Moreover, the bit line leakage that may introduce a fault during the read operation can be eliminated in the new SRAM. During the write operation, the voltage swing on the highly capacitive bit lines is reduced from the conventional $V_{DD}$-to-$V_{SS}$ to $V_{DD}$-to-$V_{SSH}$, greatly saving bit line switching power. Combining the row-by-row activation scheme with the low-swing write incurs no additional area penalty. SPICE simulation with the Berkeley Predictive Technology Model estimates that 93% of the leakage power and 43% of the switching power can be saved in a future leakage-dominant 70-nm process. A test chip was fabricated in a 0.35-${\mu}m$ CMOS process to verify the effectiveness and feasibility of the new SRAM; its measured switching power is 30% lower than that of the conventional SRAM at an I/O bit width of only 8. The stored data is confirmed to be retained without loss until the retention voltage is reduced to 1.1 V, which is mainly due to the metal shield. The switching power saving is expected to become more significant as the I/O bit width increases.
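
The effect of the low-swing write on bit line energy can be estimated with a first-order model. The sketch below assumes that the energy drawn from the supply to recharge one bit line is C · V_DD · ΔV, where ΔV is the bit line swing; this is a common approximation, not a formula taken from the paper. The voltages in the example are hypothetical, and the result is not expected to match the 43% and 30% figures above, which also reflect the row-by-row activation scheme and other circuit details.

```python
def bitline_write_energy(c_bitline, v_dd, v_low):
    """First-order energy drawn from V_DD to recharge a bit line from v_low."""
    return c_bitline * v_dd * (v_dd - v_low)


def swing_saving(v_dd, v_ss, v_ssh):
    """Fractional energy saving when the low level is raised from v_ss to v_ssh."""
    full = bitline_write_energy(1.0, v_dd, v_ss)      # conventional full swing
    reduced = bitline_write_energy(1.0, v_dd, v_ssh)  # low-swing write
    return 1.0 - reduced / full


if __name__ == "__main__":
    # Hypothetical voltages: V_DD = 1.0 V, V_SS = 0 V, V_SSH = 0.3 V.
    print(f"estimated saving: {swing_saving(1.0, 0.0, 0.3):.0%}")
```

In this model the saving reduces to (V_SSH - V_SS) / (V_DD - V_SS), so it depends only on how far the low level is raised relative to the full swing.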