Search | Korea Science

Optimization of ARIA Block-Cipher Algorithm for Embedded Systems with 16-bits Processors

Lee, Wan Yeon;Choi, Yun-Seok
- International Journal of Internet, Broadcasting and Communication
- /
- v.8 no.1
- /
- pp.42-52
- /
- 2016
In this paper, we propose the 16-bits optimization design of the ARIA block-cipher algorithm for embedded systems with 16-bits processors. The proposed design adopts 16-bits XOR operations and rotated shift operations as many as possible. Also, the proposed design extends 8-bits array variables into 16-bits array variables for faster chained matrix multiplication. In evaluation experiments, our design is compared to the previous 32-bits optimized design and 8-bits optimized design. Our 16-bits optimized design yields about 20% faster execution speed and about 28% smaller footprint than 32-bits optimized code. Also, our design yields about 91% faster execution speed with larger footprint than 8-bits optimized code.
https://doi.org/10.7236/IJIBC.2016.8.1.42 인용 PDF

Implementation of a 3D Graphics Hardwired T&L Accelerator based on a SoC Platform for a Mobile System (SoC 플랫폼 기반 모바일용 3차원 그래픽 Hardwired T&L Accelerator 구현)

Lee, Kwang-Yeob;Koo, Yong-Seo
- Journal of the Institute of Electronics Engineers of Korea SD
- /
- v.44 no.9
- /
- pp.59-70
- /
- 2007
In this paper, we proposed an effective T&L(Transform & Lighting) Processor architecture for a real time 3D graphics acceleration SoC(System on a Chip) in a mobile system. We designed Floating point arithmetic IPs for a T&L processor. And we verified IPs using a SoC Platform. Designed T&L Processor consists of 24 bit floating point data format and 16 bit fixed point data format, and supports the pipeline keeping the balance between Transform process and Lighting process using a parallel computation of 3D graphics. The delay of pipeline processing only Transform operation is almost same as the delay processing both Transform operation and Lighting operation. Designed T&L Processor is implemented and verified using a SoC Platform. The T&L Processor operates at 80MHz frequency in Xilinx-Virtex4 FPGA. The processing speed is measured at the rate of 20M Vertexes/sec.
PDF KSCI

The Design of High Speed Processor for a Sequence Logic Control using FPGA (FPGA를 이용한 시퀀스 로직 제어용 고속 프로세서 설계)

Yang, Oh
- The Transactions of the Korean Institute of Electrical Engineers A
- /
- v.48 no.12
- /
- pp.1554-1563
- /
- 1999
This paper presents the design of high speed processor for a sequence logic control using field programmable gate array(FPGA). The sequence logic controller is widely used for automating a variety of industrial plants. The FPGA designed by VHDL consists of program and data memory interface block, input and output block, instruction fetch and decoder block, register and ALU block, program counter block, debug control block respectively. Dedicated clock inputs in the FPGA were used for high speed execution, and also the program memory was separated from the data memory for high speed execution of the sequence instructions at 40 MHz clock. Therefore it was possible that sequence instructions could be operated at the same time during the instruction fetch cycle. In order to reduce the instruction decoding time and the interface time of the data memory interface, an instruction code size was implemented by 16 bits or 32 bits respectively. And the real time debug operation was implemented for easy debugging the designed processor. This FPGA was synthesized by pASIC 2 SpDE and Synplify-Lite synthesis tool of Quick Logic company. The final simulation for worst cases was successfully performed under a Verilog HDL simulation environment. And the FPGA programmed for an 84 pin PLCC package was applied to sequence control system with inputs and outputs of 256 points. The designed processor for the sequence logic was compared with the control system using the DSP(TM320C32-40MHz) and conventional PLC system. The designed processor for the sequence logic showed good performance.
PDF

A Study on the Bit-slice Signal Processor for the Biological Signal Processing (생체 신호처리용 Bit-slice Signal Processor에 관한 연구)

Kim, Yeong-Ho;Kim, Dong-Rok;Min, Byeong-Gu
- Journal of Biomedical Engineering Research
- /
- v.6 no.2
- /
- pp.15-22
- /
- 1985
We have developed a microprogramir!able signal processor for real-time ultrasonic signal processing. Processing speed was increased by the parallelism in horizontal microprogram using 104bits microcode and the Pipelined architecture. Control unit of the signal processor was designed by microprogrammed architec- ture and writable control store (WCS) which was interfaced with host computer, APPLE- ll . This enables the processor to develop and simulate various digital signal processing algorithms. The performance of the processor was evaluated by the Fast Fourier Transform (FFT) program. The execution time to perform 16 bit 1024 points complex FF7, radix-2 DIT algorithm, was about 175 msec with IMHz master Clock. We can use this processor to Bevelop more efficient signal processing algorithms on the biological signal processing.
PDF

Computer Application to ECG Signal Processing

Okajima, Mitsuharu
- Journal of Biomedical Engineering Research
- /
- v.6 no.2
- /
- pp.13-14
- /
- 1985
We have developed a microprogramir!able signal processor for real-time ultrasonic signal processing. Processing speed was increased by the parallelism in horizontal microprogram using 104bits microcode and the Pipelined architecture. Control unit of the signal processor was designed by microprogrammed architec- ture and writable control store (WCS) which was interfaced with host computer, APPLE- ll . This enables the processor to develop and simulate various digital signal processing algorithms. The performance of the processor was evaluated by the Fast Fourier Transform (FFT) program. The execution time to perform 16 bit 1024 points complex FF7, radix-2 DIT algorithm, was about 175 msec with IMHz master Clock. We can use this processor to Bevelop more efficient signal processing algorithms on the biological signal processing.
PDF

Implementation of DCT using Bit Slice Signal Processor (BIT SLICE SIGNAL PROCESSOR를 이용한 DCT의 구현)

Kim, Dong-L.;Go, Seok-B.;Paek, Seung-K.;Lee, Tae-S.;Min, Byong-G.
- Proceedings of the KIEE Conference
- /
- 1987.07b
- /
- pp.1449-1453
- /
- 1987
A microprogrammable Bit Slice Sinal Processor for image processing is implemented. Processing speed is increased by the parallelism in horizontal microprogram using 120bits microcode, pipelined architecture, 2 bank memory switching that interfaces with the Host through DMA, a variable clock control, overflow checking H/W,look-up table method and cache memory. With this processor, a DCT algorithm which uses 2-D FFT is performed. The execution time for $512{\times}512{\times}8$ image is 12 sec when 16 bit operation is runned, and the recovered image has acceptable quality with MSE 0.276%.
PDF

Design of An Application Specific Instruction-set Processor for Embedded DSP Applications (내장형 신호처리를 위한 응용분야 전용 프로세서의 설계)

Lee, Sung-Won;Choi, Hoon;Park, In-Cheol
- Proceedings of the IEEK Conference
- /
- 1999.11a
- /
- pp.228-231
- /
- 1999
This paper describes the design and implementation of an application specific instruction-set processor developed for embedded DSP applications. The instruction-set has an uniform size of 16 bits, and supports 3 types of instructions: Primitive, Complex, and Specific. To reduce code size and cycle count we introduce complex instructions that can be selected according to the application under consideration, which leads to 50% code size reduction maximally. The processor has two independent data memories to double the data throughput and the address space. The processor is synthesized by 0.6$\mu$m single-poly double-metal technology. Critical path simulation shows that the maximum frequency is 110MHz and total gate count is 132, 000.
PDF

Study of Optimization for High Performance Adders (고성능 가산기의 최적화 연구)

허석원;김문경;이용주;이용석
- The Journal of Korean Institute of Communications and Information Sciences
- /
- v.29 no.5A
- /
- pp.554-565
- /
- 2004
In this paper, we implement single cycle and multi cycle adders. We can compare area and time by using the implemented adders. The size of adders is 64, 128, 256-bits. The architecture of hybrid adders is that the carry-out of small adder groups can be interconnected by utilizing n carry propagate unit. The size of small adder groups is selected in three formats - 4, 8, 16-bits. These adders were implemented with Verilog HDL with top-down methodology, and they were verified by behavioral model. The verified models were synthesized with a Samsung 0,35(um), 3.3(V) CMOS standard cell library while a using Synopsys Design Compiler. All adders were synthesized with group or ungroup. The optimized adder for a Crypto-processor included Smart Card IC is that a 64-bit RCA based on 16-bit CLA. All small adder groups in this optimized adder were synthesized with group. This adder can operate at a clock speed of 198 MHz and has about 961 gates. All adders can execute operations in this won case conditions of 2.7 V, 85 $^{\circ}C$.
PDF KSCI

Design of 32 bits tow Power Smart Card IC (32 비트 저전력 스마트카드 IC 설계)

김승철;김원종;조한진;정교일
- Proceedings of the IEEK Conference
- /
- 2002.06b
- /
- pp.349-352
- /
- 2002
In this Paper, we introduced 32 bit SOC implementation for multi-application Smart Card and described the methodology for reducing power consumption. It consists of ARMTTDMI micro-processor, 192 KBytes EEPROM, 16 KB SRAM, crypto processors and card reader interface based on AMBA bus system. We used Synopsys Power Compiler to estimate and optimize power consumption. Experimental results show that we can reduce Power consumption up to 62 % without increasing the chip area.
PDF

A Study on the Space Power System using Micro-Processor (마이크로 프로세서를 이용한 위성용 전력 시스템 제어에 관한 연구)

Kim, H.J.;Kim, Y.T.;Kim, I.G.
- Proceedings of the KIEE Conference
- /
- 1992.07b
- /
- pp.1032-1034
- /
- 1992
There are increasing demands for the Space Power System to get more useful performance. These demands are high-efficiency, high-confidency, light-weight, small-size, and the systems's flexibility which can shorten the development time. These demands can be achieved when we make the use of $\mu$-Processor. This paper, therefore, shows an analysis and experimental results for the Space Power System with MC 68000, Motorola's 16 bits MPU, to find the system's characteristics.
PDF

Search Result 27, Processing Time 0.022 seconds

이메일무단수집거부

이용약관

제 1 장 총칙

제 2 장 이용계약의 체결

제 3 장 계약 당사자의 의무

제 4 장 서비스의 이용

제 5 장 계약 해지 및 이용 제한

제 6 장 손해배상 및 기타사항

Detail Search

Image Search (β)