• Title/Summary/Keyword: 곱셈 알고리즘

Search Result 329, Processing Time 0.022 seconds

Efficient Scheduling Schemes for Low-Area Mixed-radix MDC FFT Processor (저면적 Mixed-radix MDC FFT 프로세서를 위한 효율적인 스케줄링 기법)

  • Jang, Jeong Keun;Sunwoo, Myung Hoon
    • Journal of the Institute of Electronics and Information Engineers
    • /
    • v.54 no.7
    • /
    • pp.29-35
    • /
    • 2017
  • This paper presents a high-throughput area-efficient mixed-radix fast Fourier transform (FFT) processor using the efficient scheduling schemes. The proposed FFT processor can support 64, 128, 256, and 512-point FFTs for orthogonal frequency division multiplexing (OFDM) systems, and can achieve a high throughput using mixed-radix algorithm and eight-parallel multipath delay commutator (MDC) architecture. This paper proposes new scheduling schemes to reduce the size of read-only memories (ROMs) and complex constant multipliers without increasing delay elements and computation cycles; thus, reducing the hardware complexity further. The proposed mixed-radix MDC FFT processor is designed and implemented using the Samsung 65nm complementary metal-oxide semiconductor (CMOS) technology. The experimental result shows that the area of the proposed FFT processor is 0.36 mm2. Furthermore, the proposed processor can achieve high throughput rates of up to 2.64 GSample/s at 330 MHz.

High Performance Elliptic Curve Cryptographic Processor for $GF(2^m)$ ($GF(2^m)$의 고속 타원곡선 암호 프로세서)

  • Kim, Chang-Hoon;Kim, Tae-Ho;Hong, Chun-Pyo
    • Journal of KIISE:Computer Systems and Theory
    • /
    • v.34 no.3
    • /
    • pp.113-123
    • /
    • 2007
  • This paper presents a high-performance elliptic curve cryptographic processor over $GF(2^m)$. The proposed design adopts Lopez-Dahab Montgomery algorithm for elliptic curve point multiplication and uses Gaussian normal basis for $GF(2^m)$ field arithmetic operations. We select m=163 which is the smallest value among five recommended $GF(2^m)$ field sizes by NIST and it is Gaussian normal basis of type 4. The proposed elliptic curve cryptographic processor consists of host interface, data memory, instruction memory, and control. We implement the proposed design using Xilinx XCV2000E FPGA device. Based on the FPGA implementation results, we can see that our design is 2.6 times faster and requires significantly less hardware resources compared with the previously proposed best hardware implementation.

High-speed Radix-8 FFT Structure for OFDM (OFDM용 고속 Radix-8 FFT 구조)

  • Jang, Young-Beom;Hur, Eun-Sung;Park, Jin-Su;Hong, Dae-Ki
    • Journal of the Institute of Electronics Engineers of Korea SP
    • /
    • v.44 no.5
    • /
    • pp.84-93
    • /
    • 2007
  • In this paper, a Radix-8 structure for high-speed FFT is propose. Main block of the proposed FFT structure is Radix-8 DIF(Decimation In Frequency) butterfly. Even throughput of the Radix-8 FFT is twice than that of the Radix-4 FFT, implementation area of the Radix-8 is larger than that of Radix-4 FFT. But, implementation area of the proposed Radix-8 FFT was reduced by using DA(Distributed Arithmetic) for multiplication. For comparison, the 64-point FFT was implemented using conventional Radix-4 butterfly and proposed Radix-8 butterfly, respectively. The Verilog-HDL coding results for the proposed FFT structure show 49.2% cell area increment comparison with those of the conventional Radix-4 FFT structure. Namely, to speed up twice, 49.2% of area cost is required. In case of same throughput, power consumption of the proposed structure is reduced by 25.4%. Due to its efficient processing scheme, the proposed FFT structure can be used in large size of FFT like OFDM Modem.

A Low-power DIF Radix-4 FFT Processor for OFDM Systems Using CORDIC Algorithm (CORDIC을 이용한 OFDM용 저전력 DIF Radix-4 FFT 프로세서)

  • Jang, Young-Beom;Choi, Dong-Kyu;Kim, Do-Han
    • Journal of the Institute of Electronics Engineers of Korea SP
    • /
    • v.45 no.3
    • /
    • pp.103-110
    • /
    • 2008
  • In this paper, an efficient butterfly structure for 8K/2K-Point Radix-4 FFT algorithm using CORDIC(coordinate rotation digital computer) is proposed. It is shown that CORDIC can be efficiently used in twiddle factor calculation of the Radix-4 FFT algorithm. The Verilog-HDL coding results for the proposed CORDIC butterfly structure show 36.9% cell area reduction comparison with those of the conventional multiplier butterfly structure. Furthermore, the 8K/2K-point Radix-4 pipeline structure using the proposed butterfly and delay commutators is compared with other conventional structures. Implementation coding results show 11.6% cell area reduction. Due to its efficient processing scheme, the proposed FFT structure can be widely used in large size of FFT like OFDM Modem.

A Design and Performance Evaluation of Path Search by Simplification of Estimated Values based on Variable Heuristic (가변 휴리스틱 기반 추정치 간소화를 통한 경로탐색 기법의 설계 및 성능 평가)

  • Kim, Jin-Deog
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.10 no.11
    • /
    • pp.2002-2007
    • /
    • 2006
  • The path search method in the telematics system should consider traffic flow of the roads as well as the shortest time because the optimal path with minimized travel time could be continuously changed by the traffic flow. The existing path search methods are not able to cope efficiently with the change of the traffic flow. The search method to use traffic information also needs more computation time than the existing shortest path search. In this paper, a method for efficiency improvement of path search is implemented and its performance is evaluated. The method employs the fixed grid for adjustable heuristic to traffic flow. Moreover, in order to simplify the computation of estimation values, it only adds graded decimal values instead of multiplication operation of floating point numbers with due regard to the gradient between a departure and a destination. The results obtained from the experiments show that it achieves the high accuracy and short execution time as well.

PSNR Comparison of DCT-domain Image Resizing Methods (DCT 영역 영상 크기 조절 방법들에 대한 PSNR 비교)

  • Kim Do nyeon;Choi Yoon sik
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.29 no.10C
    • /
    • pp.1484-1489
    • /
    • 2004
  • Given a video frame in terms of its 8${\times}$8 block-DCT coefncients, we wish to obtain a downsized or upsized version of this Dame also in terms of 8${\times}$8 block DCT coefficients. The DCT being a linear unitary transform is distributive over matrix multiplication. This fact has been used for downsampling video frames in the DCT domains in Dugad's, Mukherjee's, and Park's methods. The downsampling and upsampling schemes combined together preserve all the low-frequency DCT coefficients of the original image. This implies tremendous savings for coding the difference between the original frame (unsampled image) and its prediction (the upsampled image).This is desirable for many applications based on scalable encoding of video. In this paper, we extend the earlier works to various DCT sizes, when we downsample and then upsample of an image by a factor of two. Through experiment, we could improve the PSM values whenever we increase the DCT block size. However, because the complexity will be also increase, we can say there is a tradeoff. The experiment result would provide important data for developing fast algorithms of compressed-domain image/video resizing.

Accelerated Convolution Image Processing by Using Look-Up Table and Overlap Region Buffering Method (Loop-Up Table과 필터 중첩영역 버퍼링 기법을 이용한 컨벌루션 영상처리 고속화)

  • Kim, Hyun-Woo;Kim, Min-Young
    • Journal of the Institute of Electronics Engineers of Korea SC
    • /
    • v.49 no.4
    • /
    • pp.17-22
    • /
    • 2012
  • Convolution filtering methods have been widely applied to various digital signal processing fields for image blurring, sharpening, edge detection, and noise reduction, etc. According to their application purpose, the filter mask size or shape and the mask value are selected in advance, and the designed filter is applied to input image for the convolution processing. In this paper, we proposed an image processing acceleration method for the convolution processing by using two-dimensional Look-up table (LUT) and overlap-region buffering technique. First, based on the fixed convolution mask value, the multiplication operation between 8 or 10 bit pixel values of the input image and the filter mask values is performed a priori, and the results memorized in LUT are referred during the convolution process. Second, based on symmetric structural characteristics of the convolution filters, inherent duplicated operation region is analysed, and the saved operation results in one step before in the predefined memory buffer is recalled and reused in current operation step. Through this buffering, unnecessary repeated filter operation on the same regions is minimized in sequential manner. As the proposed algorithms minimize the computational amount needed for the convolution operation, they work well under the operation environments utilizing embedded systems with limited computational resources or the environments of utilizing general personnel computers. A series of experiments under various situations verifies the effectiveness and usefulness of the proposed methods.

90/150 RCA Corresponding to Maximum Weight Polynomial with degree 2n (2n 차 최대무게 다항식에 대응하는 90/150 RCA)

  • Choi, Un-Sook;Cho, Sung-Jin
    • The Journal of the Korea institute of electronic communication sciences
    • /
    • v.13 no.4
    • /
    • pp.819-826
    • /
    • 2018
  • The generalized Hamming weight is one of the important parameters of the linear code. It determines the performance of the code when the linear codes are applied to a cryptographic system. In addition, when the block code is decoded by soft decision using the lattice diagram, it becomes a measure for evaluating the state complexity required for the implementation. In particular, a bit-parallel multiplier on finite fields based on trinomials have been studied. Cellular automata(CA) has superior randomness over LFSR due to its ability to update its state simultaneously by local interaction. In this paper, we deal with the efficient synthesis of the pseudo random number generator, which is one of the important factors in the design of effective cryptosystem. We analyze the property of the characteristic polynomial of the simple 90/150 transition rule block, and propose a synthesis algorithm of the reversible 90/150 CA corresponding to the trinomials $x^2^n+x^{2^n-1}+1$($n{\geq}2$) and the 90/150 reversible CA(RCA) corresponding to the maximum weight polynomial with $2^n$ degree by using this rule block.

A binary adaptive arithmetic coding algorithm based on adaptive symbol changes for lossless medical image compression (무손실 의료 영상 압축을 위한 적응적 심볼 교환에 기반을 둔 이진 적응 산술 부호화 방법)

  • 지창우;박성한
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.22 no.12
    • /
    • pp.2714-2726
    • /
    • 1997
  • In this paper, adaptive symbol changes-based medical image compression method is presented. First, the differenctial image domain is obtained using the differentiation rules or obaptive predictors applied to original mdeical image. Also, the algorithm determines the context associated with the differential image from the domain. Then prediction symbols which are thought tobe the most probable differential image values are maintained at a high value through the adaptive symbol changes procedure based on estimates of the symbols with polarity coincidence between the differential image values to be coded under to context and differential image values in the model template. At the coding step, the differential image values are encoded as "predicted" or "non-predicted" by the binary adaptive arithmetic encoder, where a binary decision tree is employed. The simlation results indicate that the prediction hit ratios of differential image values using the proposed algorithm improve the coding gain by 25% and 23% than arithmetic coder with ISO JPEG lossless predictor and arithmetic coder with differentiation rules or adaptive predictors, respectively. It can be used in compression part of medical PACS because the proposed method allows the encoder be directly applied to the full bit-planes medical image without a decomposition of the full bit-plane into a series of binary bit-planes as well as lower complexity of encoder through using an additions when sub-dividing recursively unit intervals.

  • PDF