• Title/Summary/Keyword: computation reduction (연산 감소)

A Network-adaptive Context Extraction Method for JPEG2000 Using Tree-Structure of Coefficients from DWT (DWT 계수의 트리구조를 이용한 네트워크-적응적 JPEG2000 컨텍스트 추출방법)

  • Choi Hyun-Jun;Seo Young-Ho;Kim Dong-Wook
    • The Journal of Korean Institute of Communications and Information Sciences / v.30 no.9C / pp.939-948 / 2005
  • In EBCOT, the context extraction process takes excessive calculation time, and this paper proposes a method to reduce it. If a coefficient is less than a predefined threshold, the coefficient and its descendants skip the context extraction process. There is a trade-off between calculation time and image quality or output data volume: as the threshold increases, the calculation time and the amount of output data decrease, but the image degradation increases. By choosing the threshold according to the network environment or conditions, a network-adaptive context extraction method can therefore be established. The experimental results showed that the range of threshold values giving acceptable image quality (better than 30 dB) is from 0 to 4. In this range, the average reduction in calculation time was 3% to 64% and the average reduction in output data was 32% to 73%, which means that large reductions in calculation time and output data can be obtained at the cost of an acceptable degradation in image quality. The proposed method is therefore expected to be useful in applications such as real-time image/video data communication in wireless environments.
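
The skip rule described in the abstract above can be illustrated with a minimal Python sketch. It assumes a hypothetical `Node` class for the coefficient tree and a magnitude comparison against the threshold. It only lists which coefficients would still go through context extraction and does not implement EBCOT itself.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical DWT coefficient with its descendants in finer subbands."""
    value: int
    children: List["Node"] = field(default_factory=list)


def coefficients_to_code(node: Node, threshold: int) -> List[int]:
    """Return the coefficients that would still undergo context extraction.

    Assumption: a coefficient whose magnitude is below `threshold` is
    skipped together with all of its descendants.
    """
    if abs(node.value) < threshold:
        return []  # prune this coefficient and its whole subtree
    kept = [node.value]
    for child in node.children:
        kept.extend(coefficients_to_code(child, threshold))
    return kept


# Toy tree: one root, two children, each with two leaves.
tree = Node(40, [
    Node(9, [Node(5), Node(-1)]),
    Node(2, [Node(7), Node(0)]),
])

print(coefficients_to_code(tree, 0))  # threshold 0 keeps every coefficient
print(coefficients_to_code(tree, 4))  # a larger threshold prunes small subtrees
```

In this toy tree the leaf 7 is dropped along with its parent 2 when the threshold is 4, which is the kind of loss the abstract trades against calculation time.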

Hybrid FFT processor design using Parallel PD adder circuit (병렬 PD가산회로를 이용한 Hybrid FFT 연산기 설계)

  • 김성대;최전균;안점영;송홍복
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2000.10a / pp.499-503 / 2000
  • This paper extends the FFT (Fast Fourier Transform) from binary to multiple-valued logic (MVL) circuits. A multiple-valued FFT circuit can be implemented using current-mode CMOS techniques, reducing the transistor count and the number of wires between devices to half of those in a binary implementation. For the adder in the FFT, we use a number representation based on a redundant digit set, called the redundant positive-digit number representation, which allows carry-propagation-free addition. The designed multiple-valued FFT, which internally uses a PD (positive-digit) adder with the digit set {0, 1, 2, 3}, has attractive features in speed, regularity of structure, and reduced complexity of active elements and interconnections. For the multiplier, we use a multiple-valued LUT (look-up table) to facilitate simple mathematical operations on the stored digits. Finally, a multiple-valued 8-point FFT operation is used as an example to illustrate how a multiple-valued FFT can be beneficial.

A Novel Fixed-Complexity Signal Detection Technique Using Lattice Reduction for Multiple Antenna Systems (다중 안테나 시스템을 위한 고정된 연산 복잡도를 갖는 격자 감소 기반 신호 검출 기법)

  • Yang, Yusik;Suh, Dong Geun;Kim, Jaekwon
    • The Journal of Korean Institute of Communications and Information Sciences / v.38A no.1 / pp.10-18 / 2013
  • Recently, a fixed-complexity lattice reduction (fcLR) technique was proposed. A QR-LRL signal detection method was also proposed, in which all constellation symbols are tried as the symbol corresponding to the least reliable layer (LRL), thereby achieving high error performance. In this paper, we combine these two efficient methods to propose a novel detection method. When the LRL is disregarded in the LR process, the worst-case complexity of LR is significantly reduced. The proposed method is also shown to be superior to the conventional fcLR-based detection method in terms of error performance. Simulations are performed to demonstrate the efficacy of the proposed method.

Fast Frame Selection Method for Multi-Reference and Variable Block Motion Estimation (다중참조 및 가변블록 움직임 추정을 위한 고속 참조영상 선택 방법)

  • Kim, Sung-Dae;SunWoo, Myung-Hoon
    • Journal of the Institute of Electronics Engineers of Korea SP / v.45 no.6 / pp.1-8 / 2008
  • This paper introduces three efficient frame selection schemes that reduce the computational complexity of multi-reference, variable-block-size motion estimation (ME). The proposed RSP (Reference Selection Pass) scheme minimizes the overhead of frame selection. The MFS (Modified Frame Selection) scheme reduces the number of search points by about 18% compared with existing schemes by considering the motion of the image during the reference frame selection process. In addition, the TPRFS (Two Pass Reference Frame Selection) scheme minimizes the frame selection operations for variable-block-size ME in H.264/AVC by using the characteristics of the reference frame selected for each block size. The simulation results show that the proposed schemes can save up to 50% of the ME computation without degrading image quality. Because the proposed schemes can be separated from the block matching process, they can be used with any existing single-reference fast search algorithm.

An Efficient Hardware Design for Scaling and Transform Coefficients Decoding (스케일링과 변환계수 복호를 위한 효율적인 하드웨어 설계)

  • Jung, Hongkyun;Ryoo, Kwangki
    • Journal of the Korea Institute of Information and Communication Engineering / v.16 no.10 / pp.2253-2260 / 2012
  • In this paper, an efficient hardware architecture is proposed for the inverse transform (IT) and inverse quantization (IQ) of an H.264/AVC decoder. The previous IT and IQ architecture decodes AC and DC coefficients in different orders. In the proposed architecture, IQ is performed after IT regardless of whether the coefficients are DC or AC. A common operation unit is also proposed to reduce the computational complexity of inverse quantization. Because the previous architecture includes a division operation, it generates errors if the processing order is changed. To solve this problem, the proposed architecture performs the division after IT. The architecture is implemented with a 3-stage pipeline, and a parallel vertical and horizontal IDCT is also implemented to reduce the number of operation cycles. An analysis of the operation cycles needed for one macroblock shows that the proposed ITIQ architecture improves on the previous one by 45%.
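
The ordering problem mentioned in the abstract above arises because integer division does not commute with the additions inside a transform. The toy Python illustration below uses a single add stage and a divide-by-two. Both are assumptions chosen for illustration, not the actual H.264/AVC formulas or the authors' design.

```python
def divide_then_add(a: int, b: int) -> int:
    """Scale each input with integer division first, then combine."""
    return a // 2 + b // 2


def add_then_divide(a: int, b: int) -> int:
    """Combine first, then apply the integer division once at the end."""
    return (a + b) // 2


a, b = 3, 5
print(divide_then_add(a, b))  # 3 // 2 + 5 // 2 = 1 + 2 = 3
print(add_then_divide(a, b))  # (3 + 5) // 2 = 4
```

The two orders disagree whenever truncation discards a remainder, so a design that reorders its stages has to fix where the division happens. The abstract places it after IT.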

Floating Point Unit Design for the IEEE754-2008 (IEEE754-2008을 위한 고속 부동소수점 연산기 설계)

  • Hwang, Jin-Ha;Kim, Hyun-Pil;Park, Sang-Su;Lee, Yong-Surk
    • Journal of the Institute of Electronics Engineers of Korea SD / v.48 no.10 / pp.82-90 / 2011
  • With the development of smartphone devices, the demand for high-performance FPUs (floating-point units) is increasing. We therefore propose a high-speed single-/double-precision FPU design that includes an elementary add/sub unit, an improved multiplier, and compare and convert units. The add/sub unit, which is the most commonly used, is optimized with a parallel rounding unit. Matrix operations are used in complex calculations such as graphics. We designed a Multiply-Add Fused (MAF) unit in place of the multiplier to compute matrix operations more quickly. Branch instructions, which are decided by the compare operation, are used very frequently in various programs. We bypass the result of the compare operation before all pipeline stages have finished in order to decrease the total execution time. We also include the additional convert operations introduced in the IEEE754-2008 standard. To verify our RTL designs, we chose four hundred thousand test vectors by a weighted random method and simulated each unit. The FPU, synthesized with Samsung's 45-nm low-power process, met the 600-MHz operating frequency. We also confirmed a reduction in area by comparing the improved FPU with the existing FPU.

Development of Integer DCT for VLSI Implementation (VLSI 구현을 위한 정수화 DCT 개발)

  • 곽훈성;이종하
    • The Journal of Korean Institute of Communications and Information Sciences / v.18 no.12 / pp.1928-1934 / 1993
  • This paper presents a fast algorithm for the integer discrete cosine transform (IDCT) that allows VLSI implementation with integer arithmetic. The proposed fast algorithm has been developed using Chen's matrix decomposition of the DCT, and requires fewer arithmetic operations than the IDCT. In the presented algorithm, the number of additions is the same as in Chen's algorithm for the DCT, and the number of multiplications is the same as that in the DCT at N=8 but decreases drastically when N is above 8. In addition, drawbacks of the DCT such as performance degradation under finite-length arithmetic can be overcome by the IDCT.

OpenMP application to implement CUDA for FDTD algorithm and performance measurement (CUDA로 구현한 FDTD알고리즘의 OpenMP기술 적용 및 성능 측정)

  • Jung, Bok-Jae;Oh, Seung-Take;Lee, Cheol-Hoon
    • Proceedings of the Korean Society of Computer Information Conference / 2013.01a / pp.3-6 / 2013
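  • In semiconductor processing, simulations are run to verify the manufacturing process in order to reduce device manufacturing cost. These simulations analyze the behavior of impurities inside a semiconductor device by computing the physical quantities inside the device. The algorithms used for this solve physical differential equations that describe a three-dimensional geometry, and numerical methods such as the finite-difference time-domain (FDTD) method are used for accurate computation. In practice, FDTD computation accounts for more than 90% of the execution time of a semiconductor process simulation. To obtain faster performance for this computation, this paper reduces the execution time by applying multi-GPU control through OpenMP to an FDTD algorithm previously implemented in CUDA (Compute Unified Device Architecture), and measures the resulting performance improvement.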

Fast Mode Decision Algorithm for Intra Prediction in Spatial Enhancement Layer of SVC (SVC 공간적 향상 계층에서 빠른 인트라 예측 모드 결정 방법)

  • Cho, Mi-Sook;Kang, Jin-Mi;Chung, Ki-Dong
    • Proceedings of the Korean Information Science Society Conference / 2008.06d / pp.251-254 / 2008
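  • SVC, established as an extension of the H.264/AVC standard, performs inter-layer prediction in addition to the intra and inter prediction provided by H.264/AVC in order to improve the compression efficiency of spatial scalability. Because the intra prediction process of the SVC standard encodes all available modes and then selects the mode with the best RD (rate-distortion) value, the added inter-layer prediction further increases the amount of computation. This paper proposes a fast intra prediction mode decision method that effectively reduces the computation required for intra prediction in the spatial enhancement layer. By examining whether the boundaries within a macroblock are smooth and deciding on the Intra_BL mode in advance, the method reduces the RD comparisons needed for mode selection, so that the amount of computation is greatly reduced compared with the intra prediction method of the SVC standard.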

Efficient Complex Event Processing Scheme through Similar Operation Processing in Duplicate Events (중복 이벤트 유사 연산 처리를 통한 효율적인 복합 이벤트 처리 기법)

  • Kim, Daeyun;Kim, Byounghoon;Ko, Geonsik;Noh, Yeonwoo;Choi, Dojin;Lim, Jongtae;Bok, Kyoungsoo;Yoo, Jaesoo
    • Proceedings of the Korea Contents Association Conference / 2016.05a / pp.59-60 / 2016
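  • With the development of machine-to-machine (M2M) communication devices, real-time complex event processing of large-volume stream data is becoming increasingly important in various applications. This paper proposes a multiple complex event processing scheme that reduces the cost of processing similar operations. The proposed scheme represents the operators used to process multiple complex events as a graph and reduces duplicate operations.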

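
The idea of reducing duplicate operations across several complex event queries, as described in the last abstract above, can be sketched in Python. This is a simplified illustration under stated assumptions, not the authors' scheme: a cache keyed by operator name stands in for the shared operator graph, and `evaluate_queries`, the query names, and the predicates are all hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Event = Dict[str, float]
Operator = Tuple[str, Callable[[Event], bool]]  # (shared key, predicate)


def evaluate_queries(
    event: Event, queries: Dict[str, List[Operator]]
) -> Tuple[Dict[str, bool], int]:
    """Evaluate every query on one event, running each distinct operator once.

    Assumption: operators with the same key are identical, so their result
    is cached and reused by every query that contains them.
    """
    cache: Dict[str, bool] = {}
    evaluations = 0
    results: Dict[str, bool] = {}
    for name, operators in queries.items():
        matched = True
        for key, predicate in operators:
            if key not in cache:
                cache[key] = predicate(event)
                evaluations += 1
            matched = matched and cache[key]
        results[name] = matched
    return results, evaluations


hot = ("temp>30", lambda e: e["temp"] > 30)
humid = ("hum>70", lambda e: e["hum"] > 70)
windy = ("wind>10", lambda e: e["wind"] > 10)

queries = {"q1": [hot, humid], "q2": [hot, windy], "q3": [humid, windy]}
event = {"temp": 35.0, "hum": 80.0, "wind": 5.0}

results, evaluations = evaluate_queries(event, queries)
print(results)      # which queries matched this event
print(evaluations)  # distinct operators actually run
```

Here the three queries contain six operator slots in total, but only three distinct operators are evaluated for the event.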