• Title/Abstract/Keywords: parallel computer processing

648 search results (processing time: 0.029 s)

Novel Parallel Approach for SIFT Algorithm Implementation

  • Le, Tran Su;Lee, Jong-Soo
    • Journal of information and communication convergence engineering
    • /
    • Vol. 11, No. 4
    • /
    • pp.298-306
    • /
    • 2013
  • The scale invariant feature transform (SIFT) is an effective algorithm used in object recognition, panorama stitching, and image matching. However, due to its complexity, real-time processing is difficult to achieve with current software approaches. The increasing availability of parallel computers makes parallelizing these tasks an attractive approach. This paper proposes a novel parallel approach for SIFT algorithm implementation using a block filtering technique in a Gaussian convolution process on the SIMD Pixel Processor. This implementation fully exposes the available parallelism of the SIFT algorithm process and exploits the processing and input/output capabilities of the processor, which results in a system that can perform real-time image and video compression. We apply this implementation to images and measure the effectiveness of such an approach. Experimental simulation results indicate that the proposed method is capable of real-time applications, and the result of our parallel approach is outstanding in terms of the processing performance.
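The block filtering idea in this abstract can be illustrated with a minimal sketch: the image is cut into blocks, and each block is blurred using only its own pixels plus a small border of neighbors, so the blocks are independent and could be handed to separate processing elements. This is not the paper's SIMD Pixel Processor implementation. The block size, kernel radius, sigma, and clamped-border policy are all assumptions chosen for illustration.

```python
import math

def gaussian_kernel(sigma, radius):
    """Normalized 1-D Gaussian kernel of length 2 * radius + 1."""
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def blur_pixel(image, y, x, kernel, radius):
    """2-D Gaussian response at (y, x); borders are clamped (assumed policy)."""
    h, w = len(image), len(image[0])
    acc = 0.0
    for dy in range(-radius, radius + 1):
        yy = min(max(y + dy, 0), h - 1)
        for dx in range(-radius, radius + 1):
            xx = min(max(x + dx, 0), w - 1)
            acc += kernel[dy + radius] * kernel[dx + radius] * image[yy][xx]
    return acc

def blur_by_blocks(image, block=4, sigma=1.0, radius=2):
    """Blur the image one block at a time.

    Each (y0, x0) block reads only the input image and writes only its own
    output pixels, so the two outer loops could be distributed over
    parallel workers without synchronization.
    """
    h, w = len(image), len(image[0])
    kernel = gaussian_kernel(sigma, radius)
    out = [[0.0] * w for _ in range(h)]
    for y0 in range(0, h, block):
        for x0 in range(0, w, block):
            for y in range(y0, min(y0 + block, h)):
                for x in range(x0, min(x0 + block, w)):
                    out[y][x] = blur_pixel(image, y, x, kernel, radius)
    return out
```

Because every output pixel depends only on the input image, the result does not depend on the block size; the block size only controls how the work is partitioned.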

A Controllable Parallel CBC Block Cipher Mode of Operation

  • Ke Yuan;Keke Duanmu;Jian Ge;Bingcai Zhou;Chunfu Jia
    • Journal of Information Processing Systems
    • /
    • Vol. 20, No. 1
    • /
    • pp.24-37
    • /
    • 2024
  • To address the requirement for high-speed encryption of large amounts of data, this study improves the widely adopted cipher block chaining (CBC) mode and proposes a controllable parallel cipher block chaining (CPCBC) block cipher mode of operation. The mode consists of two phases: extension and parallel encryption. In the extension phase, the degree of parallelism n is determined as needed. In the parallel encryption phase, the n cipher blocks generated in the extension phase are used as initialization vectors to open n encryption chains that run in parallel. The security analysis demonstrates that CPCBC mode can enhance resistance to byte-flipping attacks and padding oracle attacks if the parallelism n is kept secret, which improves security compared with the traditional CBC mode. Performance analysis reveals that the scheme has an almost linear acceleration ratio when encrypting a large amount of data, and that encryption is significantly faster than with the conventional CBC mode.
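The two-phase structure described above can be sketched as follows. This is an illustrative sketch only, not the CPCBC specification: the abstract does not say how the n initialization vectors are derived or how plaintext blocks are assigned to chains, so `derive_ivs` (repeated encryption of a starting IV) and the interleaved assignment (block i goes to chain i mod n) are assumptions. The block cipher is a toy stand-in that is NOT secure, and the message is assumed to be already padded to whole blocks.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor

BLOCK = 16  # block size in bytes

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def _pad(key):
    return hashlib.sha256(key).digest()[:BLOCK]

def toy_encrypt_block(key, block):
    """Toy invertible block transform (NOT secure): XOR, then rotate one byte."""
    mixed = xor(block, _pad(key))
    return mixed[1:] + mixed[:1]

def toy_decrypt_block(key, block):
    unrotated = block[-1:] + block[:-1]
    return xor(unrotated, _pad(key))

def derive_ivs(key, iv, n):
    """Assumed extension phase: n IVs obtained by repeatedly encrypting iv."""
    ivs, cur = [], iv
    for _ in range(n):
        cur = toy_encrypt_block(key, cur)
        ivs.append(cur)
    return ivs

def cbc_encrypt_chain(key, iv, blocks):
    out, prev = [], iv
    for p in blocks:
        prev = toy_encrypt_block(key, xor(p, prev))  # standard CBC chaining
        out.append(prev)
    return out

def cbc_decrypt_chain(key, iv, blocks):
    out, prev = [], iv
    for c in blocks:
        out.append(xor(toy_decrypt_block(key, c), prev))
        prev = c
    return out

def parallel_cbc(key, iv, blocks, n, chain_fn):
    """Run n independent CBC chains; block i belongs to chain i % n (assumed)."""
    ivs = derive_ivs(key, iv, n)
    chains = [blocks[i::n] for i in range(n)]
    with ThreadPoolExecutor(max_workers=n) as pool:
        results = list(pool.map(lambda a: chain_fn(key, a[0], a[1]),
                                zip(ivs, chains)))
    merged = [None] * len(blocks)
    for i, chain in enumerate(results):
        merged[i::n] = chain  # put each chain's output back in original order
    return merged
```

Threads are used here only to show that the chains have no data dependencies on one another; in CPython they will not produce a real speedup for this pure-Python workload.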

Performance Study of Satellite Image Processing on Graphics Processors Unit Using CUDA

  • Jeong, In-Kyu;Hong, Min-Gee;Hahn, Kwang-Soo;Choi, Joonsoo;Kim, Choen
    • 대한원격탐사학회지
    • /
    • Vol. 28, No. 6
    • /
    • pp.683-691
    • /
    • 2012
  • High-resolution satellite images are now widely used for a variety of mapping applications, including photogrammetry, GIS data acquisition, and visualization. As the spectral and spatial data size of satellite images increases, more processing power is needed to process them. Parallel systems are one solution to this problem, and parallel processing techniques for image processing have been developed alongside advances in computational power. However, conventional CPU-based parallel computing is often not fast enough to meet the computational demands of processing these images. The GPU is a good candidate for achieving this goal. Recently, GPUs have been used for highly complex processing that involves many loop operations, such as mathematical transforms and ray tracing. In this study, we propose a technique for parallel processing of high-resolution satellite images using a GPU. We implemented a spectral radiometric processing algorithm on Landsat-7 ETM+ imagery using CUDA, a parallel computing architecture developed by NVIDIA for GPUs. We also compare the performance of the algorithm on the GPU and the CPU.

David II: 효과적인 메모리 시스템을 가지는 병렬 렌더링 프로세서 (David II: A new architecture for parallel rendering processors with effective memory system)

  • 이길환;박우찬;김일산;한탁돈
    • 한국정보처리학회:학술대회논문집
    • /
    • Korea Information Processing Society 2004 Spring Conference
    • /
    • pp.1655-1658
    • /
    • 2004
  • Current rendering processors are organized mainly to process a single triangle as fast as possible, and parallel 3D rendering processors, which can process multiple triangles in parallel with multiple rasterizers, have recently begun to appear. For high performance in processing triangles, it is desirable for each rasterizer to have its own local pixel cache. However, a consistency problem may occur when more than one rasterizer accesses data at the same address simultaneously. In this paper, we propose a parallel rendering processor architecture, called DAVID II, that resolves this consistency problem effectively. Moreover, the proposed architecture significantly reduces the latency caused by pixel cache misses. The experimental results show that DAVID II achieves almost linear speedup in the best case, even with sixteen rasterizers.

CDN Scalability Improvement using a Moderate Peer-assisted Method

  • Shi, Peichang;Wang, Huaimin;Yin, Hao;Ding, Bo;Wang, Tianzuo;Wang, Miao
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • Vol. 6, No. 3
    • /
    • pp.954-972
    • /
    • 2012
  • Fluctuating server loads require a Content Delivery Network (CDN) to improve its service scalability, especially when the peak load exceeds its service capacity. The peer-assisted scheme is widely used to improve CDN scalability. However, CDN operators do not want to lose profit by overusing it, since overuse may reduce CDN resource utilization. Therefore, it is necessary to improve CDN scalability moderately while keeping CDN resource utilization maximized. However, when and how to use the peer-assisted scheme to achieve such improvement remains a great challenge. In this paper, we propose a new method called the Dynamic Moderate Peer-assisted Method (DMPM), which uses time series analysis to predict and decide when and how much server load needs to be offloaded. A novel peer-assisted mechanism is designed based on the prediction, which can maximize the profit of CDN operators without affecting scalability. Extensive evaluations based on actual CDN load traces have shown the effectiveness of DMPM.
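The "predict, then decide how much to offload" step can be sketched in a few lines. The abstract does not name the time series model used by DMPM, so simple exponential smoothing is swapped in here as a stand-in; the function names, the smoothing factor `alpha`, and the capacity value are hypothetical and chosen only for illustration.

```python
def forecast_next(loads, alpha=0.5):
    """One-step-ahead forecast by simple exponential smoothing (assumed model)."""
    level = loads[0]
    for x in loads[1:]:
        level = alpha * x + (1 - alpha) * level
    return level

def offload_amount(loads, capacity, alpha=0.5):
    """Load to hand to peers: only the predicted excess over server capacity.

    Offloading nothing while the forecast stays under capacity reflects the
    'moderate' use of peers described in the abstract.
    """
    predicted = forecast_next(loads, alpha)
    return max(0.0, predicted - capacity)
```

For example, with a steady observed load of 100 units and a capacity of 80, the sketch offloads the 20-unit excess; with a capacity of 120, it offloads nothing and the servers stay fully used.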

Efficient Face Recognition using Low-Dimensional PCA: Hierarchical Image & Parallel Processing

  • Song, Young-Jun;Kim, Young-Gil;Kim, Kwan-Dong;Kim, Nam;Ahn, Jae-Hyeong
    • International Journal of Contents
    • /
    • Vol. 3, No. 2
    • /
    • pp.1-5
    • /
    • 2007
  • This paper proposes a principal component analysis (PCA) technique that raises the recognition rate of frontal faces in low dimensions by using hierarchical images and a parallel processing structure. Conventional PCA shows a recognition rate of less than 50% in low dimensions (dimensions 1 to 6) when used for facial recognition. In this paper, a face is represented as images at 3 fixed-size levels: the 1st level is a region around the nose, the 2nd level is a region including the eyes, nose, and mouth, and the 3rd level is the whole face. PCA of the 3-level images is handled by a parallel processing structure, and the resulting similarities are combined to achieve a high recognition rate in low dimensions. The proposed method was evaluated in an experimental feasibility study on the ORL face database. The experiments were run with PCA and with the proposed method at each level. The proposed method showed a high recognition rate of over 50% for dimensions 1 to 6.

Feasibility Study of a Distributed and Parallel Environment for Implementing the Standard Version of AAM Model

  • Naoui, Moulkheir;Mahmoudi, Said;Belalem, Ghalem
    • Journal of Information Processing Systems
    • /
    • Vol. 12, No. 1
    • /
    • pp.149-168
    • /
    • 2016
  • The Active Appearance Model (AAM) is a class of deformable models that, in the segmentation process, integrates a priori knowledge of the shape, texture, and deformation of the structures studied. In its sequential form, this model is computationally intensive and operates on large data sets. This paper presents another framework for implementing the standard version of the AAM model. We suggest a distributed and parallel approach justified by the characteristics of the model and their potential. We introduce a schema for representing the overall model and study the operations that can be parallelized. This approach is intended to exploit the benefits built up in the area of advanced image processing.

RISC 병렬 처리를 위한 기억공간의 효율적인 활용 알고리즘 (An efficient Storage Reclamation Algorithm for RISC Parallel Processing)

  • 이철원;임인칠
    • 전자공학회논문지B
    • /
    • Vol. 28B, No. 9
    • /
    • pp.703-711
    • /
    • 1991
  • In this paper, an efficient storage reclamation algorithm for RISC parallel processing in object-oriented programming environments is presented. Memory management for dynamic memory allocation and frequent memory access in object-oriented programming is the main factor that decreases RISC parallel processing performance. The proposed algorithm efficiently allocates the memory space of a RISCy computer that requires frequent memory access, and thereby increases RISC parallel processing performance. The efficiency of the proposed algorithm is verified by implementing it in the C language on a SUN SPARC (4.3 BSD UNIX).

대용량 고속화 수행을 위한 변형된 Feistel 구조 설계에 관한 연구 (Design of modified Feistel structure for high-capacity and high speed achievement)

  • 이선근;정우열
    • 한국컴퓨터정보학회논문지
    • /
    • Vol. 10, No. 3
    • /
    • pp.183-188
    • /
    • 2005
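  • The Feistel structure, the basic structure of block cipher algorithms, is a sequential processing structure and is therefore difficult to parallelize. This paper modifies the sequential structure so that the Feistel structure can be processed in parallel, and uses the result to design a DES with a parallel Feistel structure. The proposed parallel Feistel structure greatly improves the performance of block cipher algorithms such as DES, which could not use pipelining because of their structural problems and therefore had to accept a trade-off between data processing speed and data security. Accordingly, applying the proposed method to ciphers to which the Feistel structure is applied, such as SEED, AES's Rijndael, and Twofish, is expected to deliver stronger security and faster processing than at present.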
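The sequential nature of a Feistel network can be seen in a minimal sketch of the standard structure: each round consumes the halves produced by the previous round, so round i cannot start before round i-1 finishes. This shows the conventional Feistel network only, not the modified parallel structure proposed in the paper, which the abstract does not describe. The round function (a truncated SHA-256) and the subkeys are toy assumptions and are not DES.

```python
import hashlib

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def round_fn(half, subkey):
    """Toy round function (assumption): truncated SHA-256 of subkey + half."""
    return hashlib.sha256(subkey + half).digest()[:len(half)]

def feistel_encrypt(block, subkeys):
    """Standard Feistel network over an even-length block."""
    half = len(block) // 2
    left, right = block[:half], block[half:]
    for k in subkeys:
        # The new halves depend on the previous round's output: this data
        # dependency is what makes the structure inherently sequential.
        left, right = right, xor(left, round_fn(right, k))
    return right + left  # undo the final swap

def feistel_decrypt(block, subkeys):
    """Decryption is the same network with the subkeys in reverse order."""
    return feistel_encrypt(block, subkeys[::-1])
```

Note that the round function never has to be inverted, which is why the same routine serves for both encryption and decryption.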

CUDA based parallel design of a shot change detection algorithm using frame segmentation and object movement

  • Kim, Seung-Hyun;Lee, Joon-Goo;Hwang, Doo-Sung
    • 한국컴퓨터정보학회논문지
    • /
    • Vol. 20, No. 7
    • /
    • pp.9-16
    • /
    • 2015
  • This paper proposes a parallel design of a shot change detection algorithm using frame segmentation and moving blocks. In the proposed approach, the highly parallel components, such as frame histogram calculation, block histogram calculation, the Otsu threshold setting function, the frame moving operation, and block histogram comparison, are designed to run in parallel on an NVIDIA GPU. To minimize memory access delay and guarantee fast computation, the output of one GPU kernel becomes the input of another kernel in a pipelined manner using the shared memory of the GPU. In addition, the optimal sizes of CUDA processing blocks and threads are estimated through prior experiments. In the experimental test of the proposed shot change detection algorithm, the detection rate of the GPU-based parallel algorithm is the same as that of the CPU-based algorithm, but the average processing time is about 6 to 8 times faster.
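The block histogram comparison step can be sketched as below. This is a sequential CPU illustration under stated assumptions, not the paper's CUDA design: the bin count, the normalized L1 distance, and both thresholds (`block_thresh`, `frame_ratio`) are hypothetical, and the Otsu thresholding and frame moving operation named in the abstract are omitted. Frames are assumed to be 2-D lists of 8-bit gray values.

```python
def histogram(pixels, bins=8, max_value=256):
    """Count pixels per intensity bin."""
    hist = [0] * bins
    for p in pixels:
        hist[p * bins // max_value] += 1
    return hist

def hist_distance(h1, h2):
    """Normalized L1 distance between two histograms, in [0, 1]."""
    total = sum(h1) + sum(h2)
    if total == 0:
        return 0.0
    return sum(abs(a - b) for a, b in zip(h1, h2)) / total

def split_blocks(frame, block):
    """Yield the pixels of each block x block tile as a flat list."""
    h, w = len(frame), len(frame[0])
    for y0 in range(0, h, block):
        for x0 in range(0, w, block):
            yield [frame[y][x]
                   for y in range(y0, min(y0 + block, h))
                   for x in range(x0, min(x0 + block, w))]

def is_shot_change(prev, cur, block=4, block_thresh=0.5, frame_ratio=0.5):
    """Flag a shot change when most blocks differ strongly between frames.

    Each block pair is compared independently, which is the property a GPU
    design can exploit by assigning blocks to separate thread blocks.
    """
    changed = total = 0
    for a, b in zip(split_blocks(prev, block), split_blocks(cur, block)):
        total += 1
        if hist_distance(histogram(a), histogram(b)) > block_thresh:
            changed += 1
    return changed / total > frame_ratio
```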