Search | Korea Science

Design and Implementation of OpenSHMEM-Light using PCIe NTB (PCIe NTB를 활용한 OpenSHMEM-Light의 설계 및 구현)

Ju, Youngwoong;Choi, Min
- Proceedings of the Korea Information Processing Society Conference / 2016.10a / pp.58-61 / 2016
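PCI Express is a bus technology that connects processors and peripheral I/O devices, and it is widely used as an industry standard thanks to its high speed and low power consumption. PCI Express is also widely used, alongside InfiniBand and Ethernet, as a system interconnect technology for high-performance computers and computer clusters. The PGAS (partitioned global address space) programming model is often used to implement one-sided RDMA (remote direct memory access) in multi-host systems such as computer clusters. In this paper, to implement PCI Express based RDMA, we designed and implemented a PCI Express based OpenSHMEM API that preserves the existing features of OpenSHMEM, a PGAS programming model. The implemented OpenSHMEM API was tested with a matrix multiplication example on a system in which two PCs were connected using the NTB (non-transparent bridge) technology of PCI Express.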
https://doi.org/10.3745/PKIPS.y2016m10a.58

Research on Event Mechanism for Reducing Power Overheads in Cache Memory Synchronization (캐시 메모리 동기화 전력 감소를 위한 이벤트 메커니즘에 대한 연구)

Pak, Young-Jin;Jeong, Ha-Young;Lee, Yong-Surk
- Journal of the Institute of Electronics Engineers of Korea CI / v.48 no.3 / pp.69-75 / 2011
In this paper, we propose an anycast event-driven synchronization mechanism that reduces power overhead. The mechanism eliminates unnecessary polling operations in the SHI (Snoop Hit Invalidate) and SHR (Snoop Hit Read) states, which avoids wasting bandwidth and reduces the power spent on polling. It also lowers the transition power of state changes compared with the broadcast model. Simulation results show about 15.3% lower power than the spin-lock model and about 4.7% lower power than the broadcast model. Overall, the results indicate that the proposed synchronization mechanism can improve the power efficiency of multi-core systems by reducing power overhead.
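The contrast between spin-lock polling and event-driven waiting can be sketched in software. The following is only an analogy using Python threads, not the cache-coherence hardware mechanism of the paper; the function names and the `delay` parameter are hypothetical.

```python
import threading
import time


def spin_wait(flag, stats):
    # Spin-lock style: re-read the shared flag over and over until it changes.
    while not flag["ready"]:
        stats["polls"] += 1


def event_wait(event, stats):
    # Event-driven style: block once; the producer wakes the waiter.
    event.wait()
    stats["wakeups"] += 1


def run_demo(delay=0.01):
    flag = {"ready": False}
    event = threading.Event()
    spin_stats = {"polls": 0}
    event_stats = {"wakeups": 0}

    spinner = threading.Thread(target=spin_wait, args=(flag, spin_stats))
    sleeper = threading.Thread(target=event_wait, args=(event, event_stats))
    spinner.start()
    sleeper.start()

    time.sleep(delay)      # the "producer" is busy for a while
    flag["ready"] = True   # release the spinning waiter
    event.set()            # wake the blocked waiter

    spinner.join()
    sleeper.join()
    return spin_stats["polls"], event_stats["wakeups"]


polls, wakeups = run_demo()
print(f"polling reads: {polls}, event wakeups: {wakeups}")
```

The spinning waiter typically performs many reads of the flag while it waits, whereas the event-based waiter is woken exactly once; the exact poll count varies from run to run.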

An implementation of Escape and BTA modes for MIPI DSI bridge IC (MIPI DSI 브릿지 IC의 Escape/BTA 모드 구현)

Kim, Gyeong-hun;Seo, Chang-sue;Shin, Kyung-wook
- Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2014.10a / pp.288-290 / 2014
In this paper, the Escape and BTA (Bus Turn Around) modes of a master bridge IC supporting the MIPI (Mobile Industry Processor Interface) DSI (Display Serial Interface) standard are implemented. The MIPI DSI master bridge IC sends RGB data and various commands to a display module (slave) in order to test it. The Escape mode is designed to implement LPDT, ULPS, and trigger message transmissions. The BTA mode is designed to obtain various status information from the slave in the reverse direction. Functional simulation results show that the designed Escape and BTA modes work correctly under the various conditions defined in the MIPI DSI standard.

A Low Power SRAM using Supply Voltage Charge Recycling (공급전압 전하재활용을 이용한 저전력 SRAM)

Yang, Byung-Do;Lee, Yong-Kyu
- Journal of the Institute of Electronics Engineers of Korea SD / v.46 no.5 / pp.25-31 / 2009
A low-power SRAM using a supply voltage charge recycling (SVCR-SRAM) scheme is proposed. The SRAM cells are divided into two blocks with two different supply pairs: one block is powered between $V_{DD}$ and $V_{DD}/2$, and the other between $V_{DD}/2$ and GND. When N-bit cells are accessed, the charge used in the N/2-bit cells powered between $V_{DD}$ and $V_{DD}/2$ is recycled in the other N/2-bit cells powered between $V_{DD}/2$ and GND. The SVCR scheme is applied to the power-consuming parts (bit lines, data bus, word lines, and SRAM cells) to reduce dynamic power. The other parts of the SRAM use $V_{DD}$ and GND to achieve high speed. The SVCR-SRAM also reduces the leakage power of the SRAM cells through the body effect. A 64K-bit SRAM ($8K{\times}8$ bits) is implemented in a $0.18{\mu}m$ CMOS process. It saves 57.4% of write power and 27.6% of read power at $V_{DD}=1.8V$ and f=50MHz.

An 1.2V 10b 500MS/s Single-Channel Folding CMOS ADC (1.2V 10b 500MS/s 단일채널 폴딩 CMOS A/D 변환기)

Moon, Jun-Ho;Park, Sung-Hyun;Song, Min-Kyu
- Journal of the Institute of Electronics Engineers of Korea SD / v.48 no.1 / pp.14-21 / 2011
A 10b 500MS/s $0.13{\mu}m$ CMOS ADC is proposed for 4G wireless communication systems such as LTE-Advanced and SDR. The ADC employs a calibration-free single-channel folding architecture for low power consumption and a high conversion rate. To overcome the disadvantage of a high folding rate, a cascaded folding-interpolating technique is proposed for the fine 7b ADC. Further, a folding amplifier with a folded-cascode output stage is discussed for the folding bus block, to mitigate the bandwidth and voltage-gain limitations caused by parasitic capacitances. The chip was fabricated in a $0.13{\mu}m$ 1P6M CMOS technology, and the effective chip area is $1.5mm^2$. The measured INL and DNL are within 2.95LSB and 1.24LSB at 10b resolution, respectively. The SNDR is 54.8dB and the SFDR is 63.4dBc at an input frequency of 9.27MHz and a sampling frequency of 500MHz. The ADC consumes 150mW ($300{\mu}W/MS/s$) including peripheral circuits at 500MS/s with a 1.2V (1.5V) power supply.
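As a quick check of the reported figures, the sketch below recomputes the power per sample rate from the stated 150mW and 500MS/s, and converts the stated SNDR into an effective number of bits using the standard relation ENOB = (SNDR - 1.76) / 6.02. The ENOB value is not given in the abstract; it is derived here as an illustration, and the function names are hypothetical.

```python
def power_per_sample_rate_uw(power_mw, rate_msps):
    """Power normalized to sample rate, in microwatts per MS/s."""
    return power_mw * 1000.0 / rate_msps


def enob_bits(sndr_db):
    """Effective number of bits from SNDR (standard sine-wave formula)."""
    return (sndr_db - 1.76) / 6.02


fom = power_per_sample_rate_uw(150.0, 500.0)   # figures from the abstract
enob = enob_bits(54.8)                         # SNDR from the abstract

print(f"power per sample rate: {fom:.1f} uW/MS/s")
print(f"ENOB: {enob:.2f} bits")
```

The first value reproduces the $300{\mu}W/MS/s$ quoted in the abstract; the second suggests roughly 8.8 effective bits at the 9.27MHz input, under the assumption that the standard formula applies to this measurement.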

Design of Asynchronous System Bus Wrappers based on a Hybrid Ternary Data Encoding Scheme (하이브리드 터너리 데이터 인코딩 기반의 비동기식 시스템 버스 래퍼 설계)

Lim, Young-Il;Lee, Je-Hoon;Lee, Seung-Sook;Cho, Kyoung-Rok
- Journal of the Institute of Electronics Engineers of Korea SD / v.44 no.1 / pp.36-44 / 2007
This paper presents a hybrid ternary encoding scheme using 3-valued logic that is compatible with the delay-insensitive (DI) model. We designed an asynchronous wrapper that lets the hybrid ternary encoding scheme communicate with various asynchronous encoding schemes. The scheme reduces transmission lines and power consumption by about 50% compared with the conventional 1-of-4 and ternary encoding schemes. The proposed wrappers were designed and simulated in a $0.18{\mu}m$ standard CMOS technology. The asynchronous wrapper operated at over 2 GHz when communicating with a system bus. Moreover, the power dissipation of a system bus using the hybrid ternary encoding logic is 65%, 43%, and 36% lower than that of the dual-rail, 1-of-4, and ternary encoding schemes, respectively. The proposed data encoding scheme and wrapper circuit can be useful for high-speed, low-power asynchronous interfaces.
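For orientation, the two conventional binary DI codes that the abstract uses as baselines can be sketched as follows. This models only the well-known dual-rail and 1-of-4 codes; the hybrid ternary scheme is not specified in the abstract and is not modelled here. The helper names are hypothetical.

```python
def dual_rail(bits):
    """Dual-rail: two wires per bit; exactly one wire is asserted per bit."""
    # bit 1 -> (true_rail=1, false_rail=0), bit 0 -> (0, 1)
    return [(1, 0) if b else (0, 1) for b in bits]


def one_of_four(bits):
    """1-of-4: four wires per pair of bits; exactly one wire is asserted per pair."""
    if len(bits) % 2 != 0:
        raise ValueError("1-of-4 encodes bits in pairs")
    groups = []
    for i in range(0, len(bits), 2):
        value = bits[i] * 2 + bits[i + 1]
        groups.append(tuple(1 if w == value else 0 for w in range(4)))
    return groups


def wire_count(groups):
    return sum(len(g) for g in groups)


def asserted_wires(groups):
    # Each asserted wire costs a signal transition, a rough proxy for dynamic power.
    return sum(sum(g) for g in groups)


data = [1, 0, 1, 1, 0, 0, 1, 0]  # one byte, MSB first
dr = dual_rail(data)
o4 = one_of_four(data)

print("dual-rail:", wire_count(dr), "wires,", asserted_wires(dr), "asserted")
print("1-of-4:   ", wire_count(o4), "wires,", asserted_wires(o4), "asserted")
```

Both codes need two wires per data bit, but 1-of-4 asserts only one wire per two bits, which is why it is usually the lower-power baseline. A multi-valued code aims to cut the wire count itself.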

A Cost-effective Control Flow Checking using Loop Detection and Prediction (루프 검출 및 예측 방법을 적용한 비용 효율적인 실시간 분기 흐름 검사 기법)

Kim Gunbae;Ahn Jin-Ho;Kang Sungho
- Journal of the Institute of Electronics Engineers of Korea SD / v.42 no.12 / pp.91-102 / 2005
Concurrent error detection for processors has recently become important, but adding concurrent error detection capability imposes too much overhead on the system. In this paper, a new approach to this problem is proposed. A loop detection scheme is introduced to reduce the cost of repetitive loop iterations and memory accesses. To reduce the memory overhead, an offset is used to calculate the target address of a branching node. Performance evaluation shows that the new architecture has lower memory overhead and a lower frequency of memory access than previous works. In addition, the new architecture provides the same error coverage and requires a nearly constant memory size regardless of the size of the application program. Consequently, the proposed architecture can be used as a cost-effective method to detect control flow errors in commercial off-the-shelf products.

An Optimized PWM Switching Strategy for an Induction Motor Voltage Control (전압제어 유도 전동기를 위한 최적 PWM 스위칭 방법)

Han, Sang-Soo;Chu, Soon-Nam
- Journal of the Korea Institute of Information and Communication Engineering / v.13 no.5 / pp.922-930 / 2009
An optimized PWM switching strategy for induction motor voltage control is developed and demonstrated. Space vector modulation in a voltage source inverter offers improved DC-bus utilization and reduced commutation losses, and has therefore been recognized as the preferred PWM method, especially for digital implementation. The strategy switches between two active voltage vectors and one zero voltage vector using the proposed optimal PWM algorithm. The preferred switching sequence is defined as a function of the modulation index and the period of the carrier wave, and is selected using the inverter switching losses and the current ripple as criteria. For low- and medium-power applications, the experimental results indicate that good dynamic response and reduced harmonic distortion can be achieved by increasing the switching frequency.
https://doi.org/10.6109/JKIICE.2009.13.5.922
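The idea of switching between two active vectors and one zero vector within a carrier period can be illustrated with the textbook space vector modulation dwell-time formulas. This is a minimal sketch of conventional SVM in the linear range, not the optimized switching sequence proposed in the paper; the function name, parameter names, and example values are assumptions.

```python
import math


def svm_dwell_times(v_ref, angle_rad, v_dc, t_s):
    """Return (sector, t1, t2, t0) for conventional space vector modulation.

    v_ref: magnitude of the reference voltage vector
    angle_rad: angle of the reference vector
    v_dc: DC-bus voltage
    t_s: carrier (switching) period
    """
    m = math.sqrt(3.0) * v_ref / v_dc  # modulation index, 1.0 at the linear limit
    if not 0.0 <= m <= 1.0:
        raise ValueError("reference vector is outside the linear range")

    angle = angle_rad % (2.0 * math.pi)
    sector = min(int(angle // (math.pi / 3.0)), 5)  # 0..5
    alpha = angle - sector * math.pi / 3.0          # angle inside the sector

    t1 = t_s * m * math.sin(math.pi / 3.0 - alpha)  # first active vector
    t2 = t_s * m * math.sin(alpha)                  # second active vector
    t0 = t_s - t1 - t2                              # zero vector(s)
    return sector + 1, t1, t2, t0


# Hypothetical operating point: 300 V bus, 100 V reference at 30 degrees, 10 kHz carrier.
sector, t1, t2, t0 = svm_dwell_times(100.0, math.pi / 6.0, 300.0, 1e-4)
print(sector, t1, t2, t0)
```

At 30 degrees the reference lies midway between the two active vectors of sector 1, so both receive the same dwell time and the remainder of the period goes to the zero vector. How the zero-vector time is split and ordered is where switching sequences differ.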

Low-Complexity Deeply Embedded CPU and SoC Implementation (낮은 복잡도의 Deeply Embedded 중앙처리장치 및 시스템온칩 구현)

Park, Chester Sungchung;Park, Sungkyung
- Journal of the Korea Academia-Industrial cooperation Society / v.17 no.3 / pp.699-707 / 2016
This paper proposes a low-complexity central processing unit (CPU) that is suitable for deeply embedded systems, including Internet of things (IoT) applications. The core features a 16-bit instruction set architecture (ISA) that leads to high code density, as well as a multicycle architecture with a counter-based control unit and adder sharing that lead to a small hardware area. A co-processor, instruction cache, AMBA bus, internal SRAM, external memory, on-chip debugger (OCD), and peripheral I/Os are placed around the core to make a system-on-a-chip (SoC) platform. This platform is based on a modified Harvard architecture to facilitate memory access by reducing the number of access clock cycles. The SoC platform and CPU were simulated and verified at the C and the assembly levels, and FPGA prototyping with integrated logic analysis was carried out. The CPU was synthesized at the ASIC front-end gate netlist level using a $0.18{\mu}m$ digital CMOS technology with 1.8V supply, resulting in a gate count of merely 7700 at a 50MHz clock speed. The SoC platform was embedded in an FPGA on a miniature board and applied to deeply embedded IoT applications.
https://doi.org/10.5762/KAIS.2016.17.3.699

Design and Implementation of Initial OpenSHMEM Based on PCI Express (PCI Express 기반 OpenSHMEM 초기 설계 및 구현)

Joo, Young-Woong;Choi, Min
- KIPS Transactions on Computer and Communication Systems / v.6 no.3 / pp.105-112 / 2017
PCI Express is a bus technology that connects processors and peripheral I/O devices, and it is widely used as an industry standard because of its high speed and low power consumption. In addition, PCI Express is used alongside Ethernet and InfiniBand as a system interconnect technology in high-performance computing and computer clusters. The PGAS (partitioned global address space) programming model is often used to implement one-sided RDMA (remote direct memory access) in multi-host systems such as computer clusters. In this paper, we design and implement an OpenSHMEM API based on PCI Express that maintains the existing features of OpenSHMEM, in order to implement RDMA over PCI Express. We evaluate the implemented OpenSHMEM API with a matrix multiplication example on a system in which PCs are connected using the NTB (non-transparent bridge) technology of PCI Express. PCI Express interconnection networks are currently very expensive and not yet widely available to the general public. Nevertheless, we implemented and evaluated a PCI Express based interconnection network on the RDK evaluation board. In addition, we implemented the OpenSHMEM software stack, which has attracted great interest recently.
https://doi.org/10.3745/KTCCS.2017.6.3.105
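The one-sided communication pattern behind a PGAS matrix multiplication can be sketched with a toy, single-process model. This is not the OpenSHMEM API and not the authors' implementation: the `SymmetricHeap` class, its `alloc`/`put`/`get` methods, and the row-block partitioning are all hypothetical stand-ins for symmetric allocation and remote put/get.

```python
class SymmetricHeap:
    """Toy PGAS model: each PE (processing element) owns one memory partition."""

    def __init__(self, n_pes):
        self.n_pes = n_pes
        self.partitions = [dict() for _ in range(n_pes)]

    def alloc(self, name, size):
        # Symmetric allocation: the same named buffer exists on every PE.
        for part in self.partitions:
            part[name] = [0.0] * size

    def put(self, name, offset, values, target_pe):
        # One-sided write: only the initiator acts; the target runs no receive code.
        buf = self.partitions[target_pe][name]
        buf[offset:offset + len(values)] = values

    def get(self, name, offset, count, source_pe):
        # One-sided read from a remote partition.
        return list(self.partitions[source_pe][name][offset:offset + count])


def matmul_rows(a_rows, b, n):
    """Multiply a block of rows of A (flat, row-major) by the n x n matrix B."""
    out = []
    for r in range(len(a_rows) // n):
        for c in range(n):
            out.append(sum(a_rows[r * n + k] * b[k * n + c] for k in range(n)))
    return out


def distributed_matmul(a, b, n, n_pes=2):
    heap = SymmetricHeap(n_pes)
    heap.alloc("C", n * n)
    rows_per_pe = n // n_pes  # assumes n is divisible by n_pes
    for pe in range(n_pes):
        start = pe * rows_per_pe * n
        block = matmul_rows(a[start:start + rows_per_pe * n], b, n)
        # Each PE pushes its rows of the result into PE 0's copy of C.
        heap.put("C", start, block, target_pe=0)
    return heap.get("C", 0, n * n, source_pe=0)


a = [1.0, 2.0, 3.0, 4.0]  # [[1, 2], [3, 4]]
b = [5.0, 6.0, 7.0, 8.0]  # [[5, 6], [7, 8]]
result = distributed_matmul(a, b, n=2)
print(result)
```

In a real two-host NTB setup the `put` would become a write through the bridge's memory window into the peer's memory; here both partitions live in one process purely to show the control flow.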
