• Title/Summary/Keyword: Variable latency

Search Result 62, Processing Time 0.029 seconds

Goldschmidt's Double Precision Floating Point Reciprocal Computation using 32 bit multiplier (32 비트 곱셈기를 사용한 골드스미트 배정도실수 역수 계산기)

  • Cho, Gyeong-Yeon
    • Journal of the Korea Academia-Industrial cooperation Society
    • /
    • v.15 no.5
    • /
    • pp.3093-3099
    • /
    • 2014
  • Modern graphic processors, multimedia processors and audio processors mostly use floating-point number. Meanwhile, high-level language such as C and Java uses both single-precision and double precision floating-point number. In this paper, an algorithm which computes the reciprocal of double precision floating-point number using a 32 bit multiplier is proposed. It divides the mantissa of double precision floating-point number to upper part and lower part, and calculates the reciprocal of the upper part with Goldschmidt's algorithm, and computes the reciprocal of double precision floating-point number with calculated upper part reciprocal as the initial value is proposed. Since the number of multiplications performed by the proposed algorithm is dependent on the mantissa of floating-point number, the average number of multiplications per an operation is derived from some reciprocal tables with varying sizes.

Fully parallel low-density parity-check code-based polar decoder architecture for 5G wireless communications

  • Dinesh Kumar Devadoss;Shantha Selvakumari Ramapackiam
    • ETRI Journal
    • /
    • v.46 no.3
    • /
    • pp.485-500
    • /
    • 2024
  • A hardware architecture is presented to decode (N, K) polar codes based on a low-density parity-check code-like decoding method. By applying suitable pruning techniques to the dense graph of the polar code, the decoder architectures are optimized using fewer check nodes (CN) and variable nodes (VN). Pipelining is introduced in the CN and VN architectures, reducing the critical path delay. Latency is reduced further by a fully parallelized, single-stage architecture compared with the log N stages in the conventional belief propagation (BP) decoder. The designed decoder for short-to-intermediate code lengths was implemented using the Virtex-7 field-programmable gate array (FPGA). It achieved a throughput of 2.44 Gbps, which is four times and 1.4 times higher than those of the fast-simplified successive cancellation and combinational decoders, respectively. The proposed decoder for the (1024, 512) polar code yielded a negligible bit error rate of 10-4 at 2.7 Eb/No (dB). It converged faster than the BP decoding scheme on a dense parity-check matrix. Moreover, the proposed decoder is also implemented using the Xilinx ultra-scale FPGA and verified with the fifth generation new radio physical downlink control channel specification. The superior error-correcting performance and better hardware efficiency makes our decoder a suitable alternative to the successive cancellation list decoders used in 5G wireless communication.

A Variable Latency Goldschmidt's Floating Point Number Square Root Computation (가변 시간 골드스미트 부동소수점 제곱근 계산기)

  • Kim, Sung-Gi;Song, Hong-Bok;Cho, Gyeong-Yeon
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.9 no.1
    • /
    • pp.188-198
    • /
    • 2005
  • The Goldschmidt iterative algorithm for finding a floating point square root calculated it by performing a fixed number of multiplications. In this paper, a variable latency Goldschmidt's square root algorithm is proposed, that performs multiplications a variable number of times until the error becomes smaller than a given value. To find the square root of a floating point number F, the algorithm repeats the following operations: $R_i=\frac{3-e_r-X_i}{2},\;X_{i+1}=X_i{\times}R^2_i,\;Y_{i+1}=Y_i{\times}R_i,\;i{\in}\{{0,1,2,{\ldots},n-1} }}'$with the initial value is $'\;X_0=Y_0=T^2{\times}F,\;T=\frac{1}{\sqrt {F}}+e_t\;'$. The bits to the right of p fractional bits in intermediate multiplication results are truncated, and this truncation error is less than $'e_r=2^{-p}'$. The value of p is 28 for the single precision floating point, and 58 for the doubel precision floating point. Let $'X_i=1{\pm}e_i'$, there is $'\;X_{i+1}=1-e_{i+1},\;where\;'\;e_{i+1}<\frac{3e^2_i}{4}{\mp}\frac{e^3_i}{4}+4e_{r}'$. If '|X_i-1|<2^{\frac{-p+2}{2}}\;'$ is true, $'\;e_{i+1}<8e_r\;'$ is less than the smallest number which is representable by floating point number. So, $\sqrt{F}$ is approximate to $'\;\frac{Y_{i+1}}{T}\;'$. Since the number of multiplications performed by the proposed algorithm is dependent on the input values, the average number of multiplications per an operation is derived from many reciprocal square root tables ($T=\frac{1}{\sqrt{F}}+e_i$) with varying sizes. The superiority of this algorithm is proved by comparing this average number with the fixed number of multiplications of the conventional algorithm. Since the proposed algorithm only performs the multiplications until the error gets smaller than a given value, it can be used to improve the performance of a square root unit. Also, it can be used to construct optimized approximate reciprocal square root tables. The results of this paper can be applied to many areas that utilize floating point numbers, such as digital signal processing, computer graphics, multimedia, scientific computing, etc.

A Variable Latency Goldschmidt's Floating Point Number Divider (가변 시간 골드스미트 부동소수점 나눗셈기)

  • Kim Sung-Gi;Song Hong-Bok;Cho Gyeong-Yeon
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.9 no.2
    • /
    • pp.380-389
    • /
    • 2005
  • The Goldschmidt iterative algorithm for a floating point divide calculates it by performing a fixed number of multiplications. In this paper, a variable latency Goldschmidt's divide algorithm is proposed, that performs multiplications a variable number of times until the error becomes smaller than a given value. To calculate a floating point divide '$\frac{N}{F}$', multifly '$T=\frac{1}{F}+e_t$' to the denominator and the nominator, then it becomes ’$\frac{TN}{TF}=\frac{N_0}{F_0}$'. And the algorithm repeats the following operations: ’$R_i=(2-e_r-F_i),\;N_{i+1}=N_i{\ast}R_i,\;F_{i+1}=F_i{\ast}R_i$, i$\in${0,1,...n-1}'. The bits to the right of p fractional bits in intermediate multiplication results are truncated, and this truncation error is less than ‘$e_r=2^{-p}$'. The value of p is 29 for the single precision floating point, and 59 for the double precision floating point. Let ’$F_i=1+e_i$', there is $F_{i+1}=1-e_{i+1},\;e_{i+1}',\;where\;e_{i+1}, If '$[F_i-1]<2^{\frac{-p+3}{2}}$ is true, ’$e_{i+1}<16e_r$' is less than the smallest number which is representable by floating point number. So, ‘$N_{i+1}$ is approximate to ‘$\frac{N}{F}$'. Since the number of multiplications performed by the proposed algorithm is dependent on the input values, the average number of multiplications per an operation is derived from many reciprocal tables ($T=\frac{1}{F}+e_t$) with varying sizes. 1'he superiority of this algorithm is proved by comparing this average number with the fixed number of multiplications of the conventional algorithm. Since the proposed algorithm only performs the multiplications until the error gets smaller than a given value, it can be used to improve the performance of a divider. Also, it can be used to construct optimized approximate reciprocal tables. The results of this paper can be applied to many areas that utilize floating point numbers, such as digital signal processing, computer graphics, multimedia, scientific computing, etc

A Variable Latency Newton-Raphson's Floating Point Number Reciprocal Computation (가변 시간 뉴톤-랍손 부동소수점 역수 계산기)

  • Kim Sung-Gi;Cho Gyeong-Yeon
    • The KIPS Transactions:PartA
    • /
    • v.12A no.2 s.92
    • /
    • pp.95-102
    • /
    • 2005
  • The Newton-Raphson iterative algorithm for finding a floating point reciprocal which is widely used for a floating point division, calculates the reciprocal by performing a fixed number of multiplications. In this paper, a variable latency Newton-Raphson's reciprocal algorithm is proposed that performs multiplications a variable number of times until the error becomes smaller than a given value. To find the reciprocal of a floating point number F, the algorithm repeats the following operations: '$'X_{i+1}=X=X_i*(2-e_r-F*X_i),\;i\in\{0,\;1,\;2,...n-1\}'$ with the initial value $'X_0=\frac{1}{F}{\pm}e_0'$. The bits to the right of p fractional bits in intermediate multiplication results are truncated, and this truncation error is less than $'e_r=2^{-p}'$. The value of p is 27 for the single precision floating point, and 57 for the double precision floating point. Let $'X_i=\frac{1}{F}+e_i{'}$, these is $'X_{i+1}=\frac{1}{F}-e_{i+1},\;where\;{'}e_{i+1}, is less than the smallest number which is representable by floating point number. So, $X_{i+1}$ is approximate to $'\frac{1}{F}{'}$. Since the number of multiplications performed by the proposed algorithm is dependent on the input values, the average number of multiplications per an operation is derived from many reciprocal tables $(X_0=\frac{1}{F}{\pm}e_0)$ with varying sizes. The superiority of this algorithm is proved by comparing this average number with the fixed number of multiplications of the conventional algorithm. Since the proposed algorithm only performs the multiplications until the error gets smaller than a given value, it can be used to improve the performance of a reciprocal unit. Also, it can be used to construct optimized approximate reciprocal tables. The results of this paper can be applied to many areas that utilize floating point numbers, such as digital signal processing, computer graphics, multimedia scientific computing, etc.

The Effects of prompting through 3-steps compliance training to reaction time for child with Asperger's syndrome (3단계 지시따르기에 의한 수용언어촉진이 아스퍼거 아동의 반응시간에 미치는 효과)

  • Yoon, Hyeon-Sook;Yoon, Sun-Young
    • Journal of the Korea Convergence Society
    • /
    • v.5 no.4
    • /
    • pp.137-146
    • /
    • 2014
  • This study investigated the effects of response prompting through 3-steps compliance training to reaction time for child with Asperger's syndrome(AS). The participant was 3 and 8 year-old boy who was diagnostic As with non-compliant, delayed receptive language. Study design was multiple-baseline across behaviors. Target Behaviors were hands-up, following direction, and answering behavior. Dependent variable was latency reaction time during compliance training. This results mean that reaction time was increased raise hands-up behavior, compliance behavior and response ask questions. During intervention, the participant improve the rate on-task behavior as well as reduce off-task behaviors.

Design of Dual-Path Decimal Floating-Point Adder (이중 경로 십진 부동소수점 가산기 설계)

  • Lee, Chang-Ho;Kim, Ji-Won;Hwang, In-Guk;Choi, Sang-Bang
    • Journal of the Institute of Electronics and Information Engineers
    • /
    • v.49 no.9
    • /
    • pp.183-195
    • /
    • 2012
  • We propose a variable-latency Decimal Floating Point(DFP) adder which adopts the dual data path scheme. It is to speed addition and subtraction of operand that has identical exponents. The proposed DFP adder makes use of L. K. Wang's operand alignment algorithm, but operates through high speed data-path in guaranteed accuracy range. Synthesis results show that the area of the proposed DFP adder is increased by 8.26% compared to the L. K. Wang's DFP adder, though critical path delay is reduced by 10.54%. It also operates at 13.65% reduced path than critical path in case of an operation which has two DFP operands with identical exponents. We prove that the proposed DFP adder shows higher efficiency than L. K. Wang's DFP adder when the ratio of identical exponents is larger than 2%.

Adaptive Multi-level Streaming Service using Fuzzy Similarity in Wireless Mobile Networks (무선 모바일 네트워크상에서 퍼지 유사도를 이용한 적응형 멀티-레벨 스트리밍 서비스)

  • Lee, Chong-Deuk
    • Journal of the Korea Academia-Industrial cooperation Society
    • /
    • v.11 no.9
    • /
    • pp.3502-3509
    • /
    • 2010
  • Streaming service in the wireless mobile network environment has been a very challenging issue due to the dynamic uncertain nature of the channels. Overhead such as congestion, latency, and jitter lead to the problem of performance degradation of an adaptive multi-streaming service. This paper proposes a AMSS (Adaptive Multi-level Streaming Service) mechanism to reduce the performance degradation due to overhead such as variable network bandwidth, mobility and limited resources of the wireless mobile network. The proposed AMSS optimizes streaming services by: 1) use of fuzzy similarity metric, 2) minimization of packet loss due to buffer overflow and resource waste, and 3) minimization of packet loss due to congestion and delay. The simulation result shows that the proposed method has better performance in congestion control and packet loss ratio than the other existing methods of TCP-based method, UDP-based method and VBM-based method. The proposed method showed improvement of 10% in congestion control ratio and 8% in packet loss ratio compared with VBM-based method which is one of the best method.

CLINICAL STUDY OF MANDIBLE SYMPHYSIS WIDENING (외과적 하악 정중부 골신장술)

  • Kwon, Kyung-Hwan;Min, Seung-Ki;Oh, Sung-Hwan;Lee, Jun;Cha, Jae-Won
    • Journal of the Korean Association of Oral and Maxillofacial Surgeons
    • /
    • v.30 no.6
    • /
    • pp.516-525
    • /
    • 2004
  • Mandibular symphyseal distraction osteogenesis is an alternative approach for correcting mandibular transverse deficiencies and dental crowding. The traditional approaches for these are extraction of teeth and arch expansion with traditional orthodontic treatment. Also extractions are usually unavoidable in patients with severe crowding. The purpose of this study is to evaluate the effect of mandibular symphyseal distraction osteogenesis by use of tooth-borne expansion appliance. All of 12 patients had been performed distraction osteogenesis. The surgical procedures were accomplished under local anesthesia and intravenous sedation in an ambulatory surgical setting using a routine distraction protocol. The latency period was 5 days or 7 days after symphyseal osteotomies. The rate & rhyth is a intermittent, 0.75mm or 1.0 mm per day and stabilized for 6, 8 weeks after distraction. The time of orthodontic tooth movement after distraction was variable from 2 weeks to 8 weeks (mean 3 weeks). All patients had been evaluated with study casts, plain periapical films, panorama radiograms before & after surgery. Mandibular symphyseal distraction osteogenesis increased mandibular arch width and corrected dental crowding, with paralleling tooth-borne movement, without proclination of the mandibular incisors.

A Low Power 3D Graphics Accelerator Considering Both Active and Standby Modes for Mobile Devices (모바일기기의 동작모드와 대기모드를 모두 고려한 저전력 3차원 그래픽 가속기)

  • Kim, Young-Sik
    • Journal of KIISE:Computer Systems and Theory
    • /
    • v.34 no.2
    • /
    • pp.57-64
    • /
    • 2007
  • This paper proposed the low power texture cache for mobile 3D graphics accelerators. It is very important to reduce the leakage power in the standby mode for mobile 3D graphics accelerators and the memory access latency of texture mapping in the active mode which needs a large memory bandwidth. The proposed structure reduces the leakage power using variable threshold values of power mode transitions according to the selected texture filtering algorithms of application programs, which has the run time gain for texture mapping. In the trace driven cache simulation the proposed structure shows the best 7% performance gain to the previous MSA cache according to the new performance metric considering both normalized leakage power and run time impact.