• Title/Summary/Keyword: PE블록 (PE block)


An Implementation of 3D Graphic Accelerator for Phong Shading (퐁 음영법을 위한 3차원 그래픽 가속기의 구현)

  • Lee, Hyung;Park, Youn-Ok;Park, Jong-Won
    • Journal of Korea Multimedia Society, v.3 no.5, pp.526-534, 2000
  • Demand from CAD/CAM, 3D modeling, virtual reality, and medical imaging has driven much research on high-speed 3D graphic accelerators. This paper proposes an SIMD processor architecture for a 3D graphic accelerator to reduce 3D graphics processing time, and presents a parallel Phong shading algorithm to estimate the performance of the proposed architecture. The architecture consists of a PCI local bus interface, 16 processing elements (PEs), and Park's multi-access memory system (MAMS) with 17 memory modules. A serial Phong shading algorithm is modified for the architecture; the key idea is to divide a polygon into $4\times{4}$ squares. To process a square, 4 PEs are logically regarded as a PE Group. Because MAMS supports the block access type with interval 1, 4 PE Groups can process a square at a time, so 16 pixels are processed simultaneously. The proposed architecture is simulated with CADENCE Verilog-XL, a hardware simulation package. The parallel algorithm produces the same results as the serial algorithm, with a speedup of 5.68 over the serial one.


Conflict-Free Memory System for Subarray Access (서브어레이 접근을 위한 충돌회피 기억장치)

  • 박춘자;박종원
    • Proceedings of the Korean Information Science Society Conference, 2002.04a, pp.43-45, 2002
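  • This paper proposes a conflict-free memory system that reduces memory access time in an SIMD processor with pq processing elements (PEs). The memory system supports simultaneous access to pq data elements arranged as a block or as a line in any of eight directions, with a constant interval, at an arbitrary position within an MxN data array. The number of memory modules is a prime number greater than pq, and the interval is a positive integer that is not a multiple of the number of memory modules. To keep the address calculation and routing circuits simple and fast, the addresses of the requested data are separated into the base address of the first data element and the pq address differences; the address differences are sorted in ascending order of memory module number, starting from the memory module number of the first data element, and stored in fast memory modules. For all nine subarray types with an interval, the m address differences are therefore added to the base address of the first data element and then rotated right by the memory module number of the first element. An efficient data routing circuit is proposed that reduces the nine data routing patterns to one by means of multiplexing and rotation. Compared with previous memory systems, the proposed conflict-free memory system improves on access types, interval, restrictions on the size of the data array, hardware cost, speed, and complexity.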

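The conflict-free condition stated in this abstract (a prime number of memory modules greater than pq, and an interval that is not a multiple of that number) can be illustrated with a small sketch. The skew function `module_of` below is a generic textbook-style assumption chosen for illustration, not the address function from the paper, and the names `subarray` and `is_conflict_free` are hypothetical. The sketch covers only the block, horizontal, and vertical access types with p = q = 4 and 17 modules.

```python
# Hypothetical skewed-storage sketch; NOT the address function from the paper.
P, Q = 4, 4          # PEs arranged as a p x q group, so pq = 16
M_MODULES = 17       # prime number of memory modules, greater than pq


def module_of(i: int, j: int) -> int:
    """Memory module holding array element (i, j) under the assumed skew."""
    return (Q * i + j) % M_MODULES


def subarray(kind: str, i: int, j: int, r: int) -> list[tuple[int, int]]:
    """Coordinates of the pq elements of one subarray anchored at (i, j)."""
    if kind == "block":        # p x q block with interval r
        return [(i + a * r, j + b * r) for a in range(P) for b in range(Q)]
    if kind == "row":          # 1 x pq horizontal line with interval r
        return [(i, j + k * r) for k in range(P * Q)]
    if kind == "column":       # pq x 1 vertical line with interval r
        return [(i + k * r, j) for k in range(P * Q)]
    raise ValueError(f"unknown subarray kind: {kind}")


def is_conflict_free(kind: str, i: int, j: int, r: int) -> bool:
    """True when all pq elements fall in distinct memory modules."""
    modules = [module_of(x, y) for x, y in subarray(kind, i, j, r)]
    return len(set(modules)) == len(modules)
```

Under this assumed skew, every element of a subarray lands at the base module plus a multiple of the interval, so distinctness follows from the interval being invertible modulo the prime 17. With an interval of 17, all 16 elements of a block map to the same module, which is why such intervals are excluded.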

Adaptive Predictive Image Coding of Variable Block Shapes Based on Edge Contents of Blocks (경계의 방향성에 근거를 둔 가변블록형상 적응 예측영상부호화)

  • Do, Jae-Su;Kim, Ju-Yeong;Jang, Ik-Hyeon
    • The Transactions of the Korea Information Processing Society, v.7 no.7, pp.2254-2263, 2000
  • This paper proposes an efficient predictive image-compression technique based on vector quantization of blocks of pels. In the proposed method, the edge contents of blocks control the selection of both predictors and block shapes. The maximum number of bits assigned to quantizers has been increased to 3 bits/pel from 1/5 bits/pel, the setting employed by forerunners in predictive vector quantization of images. This increase prevents the SNR saturation observed in their results at high bit rates. The variable block shape is instrumental in the reconstruction of edges. The adaptive procedure is controlled by the standard deviation of the prediction errors generated by a default predictor; the standard deviation addresses a decision table that can be set up beforehand. The proposed method is characterized by overall improvements in image quality over A-VQ-PE and A-DCT VQ, both of which are known for their efficient use of vector quantizers.


VLSI Implementation of Low-Power Motion Estimation Using Reduced Memory Accesses and Computations (메모리 호출과 연산횟수 감소기법을 이용한 저전력 움직임추정 VLSI 구현)

  • Moon, Ji-Kyung;Kim, Nam-Sub;Kim, Jin-Sang;Cho, Won-Kyung
    • The Journal of Korean Institute of Communications and Information Sciences, v.32 no.5A, pp.503-509, 2007
  • Low-power motion estimation is required for video coding in portable information devices. In this paper, we propose a low-power motion estimation algorithm and a 1-D systolic array VLSI architecture using the full search block matching algorithm (FSBMA). The main sources of power dissipation in FSBMA are complex computations and frequent memory accesses for data in the search area. The proposed algorithm reduces memory accesses and computations by using a 1-D PE (processing element) array architecture that performs motion estimation of two neighboring blocks in parallel, and by skipping unnecessary computations during motion estimation. The VLSI implementation results show that the proposed architecture saves 9.3% of power dissipation and operates two times faster than an existing low-power motion estimator.
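For orientation, the baseline that this abstract starts from, plain full search block matching, can be sketched as follows. This is the exhaustive reference method using the sum of absolute differences (SAD) as the matching cost, which is an assumption here since the abstract does not name the cost. It does not include the paper's memory-access reduction or computation skipping. Frames are assumed to be lists of rows of integer pel values, and the function names are hypothetical.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between an n x n block and its shifted match."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + y + dy][bx + x + dx])
    return total


def full_search(cur, ref, bx, by, n, search_range):
    """Return (best_sad, (dx, dy)) over every in-bounds displacement in the window."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            # Skip candidates whose block would leave the reference frame.
            if not (0 <= by + dy and by + dy + n <= height):
                continue
            if not (0 <= bx + dx and bx + dx + n <= width):
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if best is None or cost < best[0]:
                best = (cost, (dx, dy))
    return best
```

Every candidate displacement re-reads the n x n reference pels, which is the repeated search-area access that the abstract identifies as a main source of power dissipation.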

Performance Analysis of Implementation on Image Processing Algorithm for Multi-Access Memory System Including 16 Processing Elements (16개의 처리기를 가진 다중접근기억장치를 위한 영상처리 알고리즘의 구현에 대한 성능평가)

  • Lee, You-Jin;Kim, Jea-Hee;Park, Jong-Won
    • Journal of the Institute of Electronics Engineers of Korea CI, v.49 no.3, pp.8-14, 2012
  • Faster image processing is in great demand with the spread of high-quality visual media and massive image applications such as 3D TV, movies, and augmented reality (AR). An SIMD computer attached to a host computer can accelerate various image processing and massive data operations. MAMS is a multi-access memory system that, along with multiple processing elements (PEs), is well suited to building a high-performance pipelined SIMD machine. MAMS supports simultaneous access to pq data elements within a horizontal, a vertical, or a block subarray with a constant interval at an arbitrary position in an $M{\times}N$ array of data elements, where the number of memory modules (MMs), m, is a prime number greater than pq. MAMS-PP4 is the first realization of the MAMS architecture and consists of four PEs in a single chip and five MMs. This paper presents the implementation of image processing algorithms and a performance analysis for MAMS-PP16, which consists of 16 PEs with 17 MMs as an extension of the prior work, MAMS-PP4. The newly designed MAMS-PP16 has a 64-bit instruction format and an application-specific instruction set. A simulator of the MAMS-PP16 system was developed, on which the implemented algorithms can be executed, and the performance analysis was carried out by running the implemented image processing algorithms on this simulator. Using the pyramid operation as the workload, the analysis shows that MAMS-PP16 responds consistently compared with a Pentium-based serial processor: the processing time of the pyramid operation is consistent on MAMS-PP16, whereas the response time on the serial processor varies randomly.

Poly(ether block amide) (PEBA) Based Membranes for Carbon Dioxide Separation (이산화탄소 분리를 위한 PEBA공중합체 기반 분리막)

  • Lee, Jae Hun;Patel, Rajkumar
    • Membrane Journal, v.29 no.1, pp.1-10, 2019
  • Poly(ether block amide) (PEBA) is a commercially important class of block copolymer that is particularly suitable for $CO_2$ separation. Gas separation membranes need good mechanical strength as well as high gas permeability. The crystalline polyamide (PA) block provides the mechanical strength, while the rubbery polyether (PE) block, being $CO_2$-philic, facilitates $CO_2$ permeation through the membrane. The composition of the thermoplastic and rubbery phases in the polymer is changed to suit a given gas separation application. Although PEBA has good permeability, the selectivity of the membrane can be enhanced by incorporating molecular sieves without much affecting the gas permeability. Mixed matrix membranes (MMMs), a class of composite membrane, combine the advantages of the polymer matrix with those of inorganic fillers. However, there are some disadvantages related to the compatibility of the inorganic fillers with the polymeric phase. This review covers both the advantages and the limitations of PEBA block copolymer based composite membranes.

Deinterlacing Method for improving Motion Estimator based on multi arithmetic Architecture (다중연산구조기반의 고밀도 성능향상을 위한 움직임추정의 디인터레이싱 방법)

  • Lee, Kang-Whan
    • Journal of the Institute of Electronics Engineers of Korea SP, v.44 no.1, pp.49-55, 2007
  • To improve multi-resolution fast hierarchical motion estimation, a method using a de-interlacing algorithm is proposed that is effective in terms of both performance and VLSI implementation, so as to cover large search areas in field-based as well as frame-based image processing in SoC design. In this paper, various picture modes (M=2 or M=3) were simulated. Comparing the PSNR of the proposed algorithm with that of the full search block matching algorithm, the average performance degradation reached -0.7 dB, which did not affect the subjective quality of the reconstructed images at all. To make fast hierarchical motion estimation better suited to SoC design, we exploit a foreground and background search algorithm (FBSA) based on the dual arithmetic processor element (DAPE). It can estimate motion displacement over a large search area using half the number of PEs required by general operation methods. The proposed MHME architecture improves the VLSI hardware design through the proposed FBSA structure with DAPE, which removes the local memory. The proposed FBSA, which uses bit array processing in the search area, can improve the structure in the manner of a multiple processor array unit (MPAU).

Preparation of Nanostructures Using Layer-by-Layer Assembly and Applications (층상자기조립법을 이용한 나노구조체의 제조와 응용)

  • Cho, Jin-Han
    • Journal of the Korean Vacuum Society, v.19 no.2, pp.81-90, 2010
  • We introduce a novel and versatile approach for preparing self-assembled nanoporous multilayered films with antireflective properties. Protonated polystyrene-block-poly(4-vinylpyridine) (PS-b-P4VP) and anionic polystyrene-block-poly(acrylic acid) (PS-b-PAA) block copolymer micelles (BCMs) were used as building blocks for the layer-by-layer assembly of BCM multilayer films. BCM film growth is governed by electrostatic and hydrogen-bonding interactions between the oppositely charged BCMs. Both film porosity and film thickness depend on the charge density of the micelles, with the porosity of the film controlled by the solution pH and the molecular weight (Mw) of the constituents. PS7K-b-P4VP28K/PS2K-b-PAA8K films prepared at pH 4 (for PS7K-b-P4VP28K) and pH 6 (for PS2K-b-PAA8K) are highly nanoporous and antireflective. In contrast, PS7K-b-P4VP28K/PS2K-b-PAA8K films assembled at pH 4/4 show a relatively dense surface morphology due to the decreased charge density of PS2K-b-PAA8K. Films formed from BCMs with an increased PS block size and a decreased hydrophilic block (P4VP or PAA) size (e.g., PS36K-b-P4VP12K/PS16K-b-PAA4K at pH 4/4) were also nanoporous. Furthermore, we demonstrate that nanostructured electrochemical sensors based on patterning methods show electrochemical activity. Anionic poly(styrene sulfonate) (PSS) layers were selectively and uniformly deposited onto the catalase (CAT)-coated surface using the micro-contact printing method. The pH-induced charge reversal of catalase enables the selective deposition of consecutive PE multilayers onto the patterned PSS layers by causing electrostatic repulsion between the next PE layer and catalase. Based on this patterning method, hybrid patterned multilayers composed of platinum nanoparticles (PtNP) and catalase were prepared, and their electrochemical properties were then investigated by sensing $H_2O_2$ and NO gas. This study was based on the papers reported by our group (J. Am. Chem. Soc. 128, 9935 (2006); Adv. Mater. 19, 4364 (2007); Electro. Mater. Lett. 3, 163 (2007)).

A Study on the Prediction System of Block Matching Rework Time (블록 정합 재작업 시수 예측 시스템에 관한 연구)

  • Jang, Moon-Seuk;Ruy, Won-Sun;Park, Chang-Kyu;Kim, Deok-Eun
    • Journal of the Society of Naval Architects of Korea, v.55 no.1, pp.66-74, 2018
  • To evaluate the precision of blocks on the dock, shipyards have recently started to use point cloud approaches based on 3D scanners. However, they hesitate to adopt them because of the limited time, the cost, and the laborious post-processing. Instead, they still use the more traditional electro-optical wave devices, which yield a less dense point set (usually 1 point per meter) around the contact section of two blocks. This paper tries to expand the usage of such point sets. Our approach can estimate the rework time for welding between the Pre-Erected (PE) block and the Erected (ER) block, as well as the precision of block construction. In detail, two algorithms are applied to increase the efficiency of the estimation process. The first is the K-means clustering algorithm, which separates the relevant contact point set from points unrelated to the welding sections. The second is the concave hull algorithm, which separates the inner points of the contact section used for delayed outfitting and stiffener sections, and constructs the concave outline of the contact section as the primary object for estimating the welding rework time. The main purpose of this paper is to obtain the rework cost for welding easily and precisely from the defective point set. Approximating the points on the blocks' outlines with mathematical curves is challenging, owing to the many orthogonal parts and the small number of points. To solve this problem, we compared the Radial Basis Function-Multi-Layer (RBF-ML) and Akima interpolation methods. Combining the proposed methods, the paper suggests a novel point matching method for minimizing the rework time of block welding on the dock, unlike the previous approach, which paid attention only to the degree of accuracy.

Exploration of an Optimal Two-Dimensional Multi-Core System for Singular Value Decomposition (특이치 분해를 위한 최적의 2차원 멀티코어 시스템 탐색)

  • Park, Yong-Hun;Kim, Cheol-Hong;Kim, Jong-Myon
    • Journal of the Korea Society of Computer and Information, v.19 no.9, pp.21-31, 2014
  • Singular value decomposition (SVD) has been widely used to identify unique features from a data set in various fields. However, the complex matrix calculation of SVD requires tremendous computation time. This paper improves the performance of a representative one-sided block Jacobi algorithm using a two-dimensional (2D) multi-core system. In addition, this paper explores an optimal multi-core system by varying the number of processing elements in the 2D multi-core system, with the same 400MHz clock frequency and TSMC 28nm technology, for the one-sided block Jacobi algorithm on each matrix ($128{\times}128$, $64{\times}64$, $32{\times}32$, $16{\times}16$). Moreover, this paper demonstrates the potential of the 2D multi-core system for the one-sided block Jacobi algorithm by comparing the performance of the multi-core system with a commercial high-performance graphics processing unit (GPU).
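As background for this abstract, the idea behind one-sided Jacobi SVD can be sketched in a few lines. The code below is the plain, serial, element-wise (Hestenes) variant, which is an assumption made for brevity: the paper studies a block variant mapped onto a 2D multi-core system, and none of that is reproduced here. The function name and the `tol` and `max_sweeps` parameters are hypothetical, and only singular values are returned.

```python
import math


def one_sided_jacobi_singular_values(a, tol=1e-12, max_sweeps=50):
    """Singular values of a (list of rows), largest first, via Hestenes rotations."""
    a = [list(map(float, row)) for row in a]   # work on a copy
    rows, cols = len(a), len(a[0])
    for _ in range(max_sweeps):
        rotated = False
        for p in range(cols - 1):
            for q in range(p + 1, cols):
                alpha = sum(a[i][p] * a[i][p] for i in range(rows))
                beta = sum(a[i][q] * a[i][q] for i in range(rows))
                gamma = sum(a[i][p] * a[i][q] for i in range(rows))
                if abs(gamma) <= tol * math.sqrt(alpha * beta):
                    continue  # columns p and q are already orthogonal
                rotated = True
                # Rotation angle that makes the two columns orthogonal.
                zeta = (beta - alpha) / (2.0 * gamma)
                t = math.copysign(1.0, zeta) / (abs(zeta) + math.sqrt(1.0 + zeta * zeta))
                c = 1.0 / math.sqrt(1.0 + t * t)
                s = c * t
                for i in range(rows):
                    u, v = a[i][p], a[i][q]
                    a[i][p] = c * u - s * v
                    a[i][q] = s * u + c * v
        if not rotated:
            break  # a full sweep needed no rotation: converged
    # Once all columns are mutually orthogonal, their norms are the singular values.
    norms = [math.sqrt(sum(a[i][j] ** 2 for i in range(rows))) for j in range(cols)]
    return sorted(norms, reverse=True)
```

Each rotation touches only two columns, so rotations on disjoint column pairs are independent of one another. That independence is what makes this family of algorithms a natural fit for arrays of processing elements.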