• Title/Summary/Keyword: Floating point number

Search Result 83, Processing Time 0.024 seconds

OpenGL ES 1.1 Implementation Using OpenGL (OpenGL을 이용한 OpenGL ES 1.1 구현)

  • Lee, Hwan-Yong;Baek, Nak-Hoon
    • The KIPS Transactions:PartA
    • /
    • v.16A no.3
    • /
    • pp.159-168
    • /
    • 2009
  • In this paper, we present an efficient way of implementing OpenGL ES 1.1 standard for the environments with hardware-supported OpenGL API, such as desktop PCs. Although OpenGL ES was started from the existing OpenGL features, it becomes a new three-dimensional graphics library customized for embedded systems through introducing fixed-point arithmetic operations, buffer management with fixed-point data type supports, completely new texture mapping functionalities and others. Currently, it is the official three dimensional graphics library for Google Android, Apple iPhone, PlayStation3, etc. In this paper, we achieved improvements on the arithmetic operations for the fixed-point number representation, which is the most characteristic data type for OpenGL ES. For the conversion of fixed-point data types to the floating-point number representations for the underlying OpenGL, we show the way of efficient conversion processes even with satisfying OpenGL ES standard requirements. We also introduced a simple memory management scheme to mange the converted data for the buffer containing fixed-point numbers. In the case of texture processing, the requirements in both standards are quite different and thus we used completely new software-implementations. Our final implementation result of OpenGL ES library provides all of over than 200 functions in OpenGL ES 1.1 standard and completely passed its conformance test, to show its compliance with the standard. From the efficiency viewpoint, we measured its execution times for several OpenGL ES-specific application programs and achieved at most 33.147 times improvements, to become the fastest one among the OpenGL ES implementations in the same category.

Design of a High-Performance Mobile GPGPU with SIMT Architecture based on a Small-size Warp Scheduler (작은 크기의 Warp 스케쥴러 기반 SIMT구조 고성능 모바일 GPGPU 설계)

  • Lee, Kwang-Yeob
    • Journal of IKEEE
    • /
    • v.25 no.3
    • /
    • pp.479-484
    • /
    • 2021
  • This paper proposed and designed a structure to achieve high performance with a small number of cores in GPGPU with SIMT structure. GPGPU for application to mobile devices requires a structure to increase performance compared to power consumption. In order to reduce power consumption, the number of cores decreased, but to improve performance, the size of the warp scheduler for managing threads was set to 4, which was greatly reduced than 32 of general GPGPU. Reducing warp size can reduce the number of idle cycles in pipelines and efficiently apply memory latency to reduce miss penalty when accessing cache memory. The designed GPGPU measured computational performance using a test program that includes floating point operations and measured power consumption through a 28nm CMOS process to obtain 104.5GFlops/Watt as a performance per power. The results of this paper showed about four times better performance per power compared to Tegra K1 of Nvidia

A DSP Implementation of Subband Sound Localization System

  • Park, Kyusik
    • The Journal of the Acoustical Society of Korea
    • /
    • v.20 no.4E
    • /
    • pp.52-60
    • /
    • 2001
  • This paper describes real time implementation of subband sound localization system on a floating-point DSP TI TMS320C31. The system determines two dimensional location of an active speaker in a closed room environment with real noise presents. The system consists of an two microphone array connected to TI DSP hosted by PC. The implemented sound localization algorithm is Subband CPSP which is an improved version of traditional CPSP (Cross-Power Spectrum Phase) method. The algorithm first split the input speech signal into arbitrary number of subband using subband filter banks and calculate the CPSP in each subband. It then averages out the CPSP results on each subband and compute a source location estimate. The proposed algorithm has an advantage over CPSP such that it minimize the overall estimation error in source location by limiting the specific band dominant noise to that subband. As a result, it makes possible to set up a robust real time sound localization system. For real time simulation, the input speech is captured using two microphone and digitized by the DSP at sampling rate 8192 hz, 16 bit/sample. The source location is then estimated at once per second to satisfy real-time computational constraints. The performance of the proposed system is confirmed by several real time simulation of the speech at a distance of 1m, 2m, 3m with various speech source locations and it shows over 5% accuracy improvement for the source location estimation.

  • PDF

Mooring Cost Sensitivity Study Based on Cost-Optimum Mooring Design

  • Ryu, Sam Sangsoo;Heyl, Caspar;Duggal, Arun
    • Journal of Ocean Engineering and Technology
    • /
    • v.23 no.1
    • /
    • pp.1-6
    • /
    • 2009
  • The paper describes results of a sensitivity study on an optimum mooring cost as a function of safety factor and allowable maximum offset of the offshore floating structure by finding the anchor leg component size and the declination angle. A harmony search (HS) based mooring optimization program was developed to conduct the study. This mooring optimization model was integrated with a frequency-domain global motion analysis program to assess both cost and design constraints of the mooring system. To find a trend of anchor leg system cost for the proposed sensitivity study, optimum costs after a certain number of improvisation were found and compared. For a case study a turret-moored FPSO with 3 ${\times}$ 3 anchor leg system was considered. To better guide search for the optimum cost, three different penalty functions were applied. The results show that the presented HS-based cost-optimum offshore mooring design tool can be used to find optimum mooring design values such as declination angle and horizontal end point separation as well as a cost-optimum mooring system in case either the allowable maximum offset or factor of safety varies.

A Study on the Design of Dolphin System for VLFS (부유식 해양구조물을 위한 돌핀 계류시스템의 설계 연구)

  • Cho Kyo-Nam
    • Journal of the Computational Structural Engineering Institute of Korea
    • /
    • v.19 no.1 s.71
    • /
    • pp.105-111
    • /
    • 2006
  • Dolphin mooring system can be a good candidate for the VLFS fastening system in view point of strength and effectiveness. In the design process of the dolphin system, precise calculation of the wave forces and the subsequent selecting the proper number of the piles adopted are one of the main factors. In this paper, one of the design process of the dolphin system is investigated and a proper configuration of the system is derived based on the structural characteristics of the system that was obtained through the structural analysis of the basic pile element confronted to the external loadings including wave impact load. It was found that lot the better design of ihe mooring system for VLFS, mono pile mooring system is more recommendable in a specific condition than other multi piles mooring system.

Optimized Integer Cosine Transform (최적화 정수형 여현 변환)

  • 이종하;김혜숙;송인준;곽훈성
    • Journal of the Korean Institute of Telematics and Electronics B
    • /
    • v.32B no.9
    • /
    • pp.1207-1214
    • /
    • 1995
  • We present an optimized integer cosine transform(OICT) as an alternative approach to the conventional discrete cosine transform(DCT), and its fast computational algorithm. In the actual implementation of the OICT, we have used the techniques similar to those of the orthogonal integer transform(OIT). The normalization factors are approximated to single one while keeping the reconstruction error at the best tolerable level. By obtaining a single normalization factor, both forward and inverse transform are performed using only the integers. However, there are so many sets of integers that are selected in the above manner, the best OICT matrix obtained through value minimizing the Hibert-Schmidt norm and achieving fast computational algorithm. Using matrix decomposing, a fast algorithm for efficient computation of the order-8 OICT is developed, which is minimized to 20 integer multiplications. This enables us to implement a high performance 2-D DCT processor by replacing the floating point operations by the integer number operations. We have also run the simulation to test the performance of the order-8 OICT with the transform efficiency, maximum reducible bits, and mean square error for the Wiener filter. When the results are compared to those of the DCT and OIT, the OICT has out-performed them all. Furthermore, when the conventional DCT coefficients are reduced to 7-bit as those of the OICT, the resulting reconstructed images were critically impaired losing the orthogonal property of the original DCT. However, the 7-bit OICT maintains a zero mean square reconstruction error.

  • PDF

Simple Al Robust Digital Position Control of PMSM using Neural Network Compensator (신경망 보상기를 이용한 PMSM의 간단한 지능형 강인 위치 제어)

  • Ko, Jong-Sun;Youn, Sung-Koo;Lee, Tae-Ho
    • The Transactions of the Korean Institute of Electrical Engineers B
    • /
    • v.49 no.8
    • /
    • pp.557-564
    • /
    • 2000
  • A very simple control approach using neural network for the robust position control of a Permanent Magnet Synchronous Motor(PMSM) is presented. The linear quadratic controller plus feedforward neural network is employed to obtain the robust PMSM system approximately linearized using field-orientation method for an AC servo. The neural network is trained in on-line phases and this neural network is composed by a feedforward recall and error back-propagation training. Since the total number of nodes are only eight, this system can be easily realized by the general microprocessor. During the normal operation, the input-output response is sampled and the weighting value is trained multi-times by error back-propagation method at each sample period to accommodate the possible variations in the parameters or load torque. In addition, the robustness is also obtained without affecting overall system response. This method is realized by a floating-point Digital Signal Processor DS1102 Board (TMS320C31).

  • PDF

Design of a High-speed Decision Feedback Equalizer using the Constant-Modulus Algorithm (CMA 알고리즘을 이용한 고속 DFE 등화기 설계)

  • Jeon, Yeong-Seop;;Kim, Gyeong-Ho
    • Journal of the Institute of Electronics Engineers of Korea TC
    • /
    • v.39 no.4
    • /
    • pp.173-179
    • /
    • 2002
  • This paper describes an equalizer using the DFE (Decision Feedback Equalizer) structure, CMA (Constant Modulus Algorithm) and LMS (Least Mean Square) algorithms. The DFE structure has better channel adaptive performance and lower BER than the transversal structure. The proposed equalizer can be used for 16/64 QAM modems. We employ high speed multipliers, square logics and many CSAs (Carry Save Adder) for high speed operations. We have developed floating-point models and fixed-point models using the COSSAP$\^$TM/ CAD tool and developed VHDL filter. The proposed equalizer shows low BER in multipath fading channel. We have performed models. From the simulation results, we employ a 12 tap feedback filter and a 8 tap feedforward logic synthesis using the SYNOPSYS$\^$TM/ CAD tool and the SAMSUNG 0.5$\mu\textrm{m}$ standard cell library (STD80) and verified function and timing simulations. The total number of gates is about 130,000.

A study on the Encoding Method for High Performance Moving Picture Encoder (고속 동영상 부호기를 위한 부호화 방법에 관한 연구)

  • 김용욱;허도근
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.8 no.2
    • /
    • pp.352-358
    • /
    • 2004
  • This paper is studied the improvement of performance for moving picture encoder using H.263. This is used the new motion vector search algorithm using a relation with neighborhood search point and is applied the integer DCT for the encoder. The integer DCT behaves DCT by the addition operation of the integer using WHT and a integer lifting than conventional DCT that needs the multiplication operation of a floating point number. Therefore, the integer Dn can reduce the operation amount than basis DCT with having an equal PSNR. The new motion vector search algorithm is showed almost similar PSNR as reducing the operation amount than the conventional motion vector search algorithm. To experiment a compatibility of the integer DCT and the conventional DCT, according to result compare case that uses a method only and case that uses the alternate two methods of the integer DCT or the conventional DCT to H.263 encoder and decoder, case that uses the alternate two methods is showed doing not deteriorate PSNR-and being each other compatible visually than case that uses an equal method only.

Time-optimized Color Conversion based on Multi-mode Chrominance Reconstruction and Operation Rearrangement for JPEG Image Decoding (JPEG 영상 복원을 위한 다중 모드 채도 복원과 연산 재배열 기반의 시간 최적화된 컬러 변환)

  • Kim, Young-Ju
    • Journal of the Korea Society of Computer and Information
    • /
    • v.14 no.1
    • /
    • pp.135-143
    • /
    • 2009
  • Recently, in the mobile device, the increase of the need for encoding and decoding of high-resolution images requires an efficient implementation of the image codec. This paper proposes a time-optimized color conversion method for the JPEG decoder, which reduces the number of calculations in the color conversion by the rearrangement of arithmetic operations being possible due to the linearity of the IDCT and the color conversion matrices and brings down the time complexity of the color conversion itself by the integer mapping replacing floating-point operations to the optimal fixed-point shift and addition operations, eventually reducing the time complexity of the JPEG decoder. And the proposed method compensates a decline of image quality incurred by the quantification error of the operation arrangement and the integer mapping by using the multi-mode chrominance reconstruction. The performance evaluation performed on the development platform of embedded systems showed that, compared to previous color conversion methods, the proposed method greatly reduces the image decoding time, minimizing the distortion of decoded images.