• Title/Summary/Keyword: Parallel Image Processing

Search Result 343, Processing Time 0.03 seconds

Parallel Processing for Scaler SoC Implementation (Scaler SoC 구현을 위한 병렬처리 기술)

  • Moon, Ji-hye;Jeong, Yeon-Kyeong;Han, Jong-Ki
    • Proceedings of the Korean Society of Broadcast Engineers Conference
    • /
    • 2013.11a
    • /
    • pp.143-146
    • /
    • 2013
  • 멀티미디어 시대에 있어 이미지 신호처리(Image Signal Processing, ISP) 기술의 중요성이 거듭 강조되고 있는 가운데, 디지털 신호의 해상도를 변경하기 위해 여러 가지 커널을 사용하여 영상을 scaling 하는 방법들이 제안되고 있다. 본 논문에서 는 cubic convolution scaler를 이용하여 영상을 확대 및 축소할 때 하드웨어 관점에서의 메모리 최적화를 주제로 다룬다. SoC 상에서는 라인 메모리 개념을 가지기 때문에 영상의 해상도를 변환할 때 많은 메모리를 사용하게 된다. 또한 scaling을 할 때 곱하기 연산이 들어가게 되면 그에 비례하여 복잡도와 그에 따른 비용이 증가하게 된다. 이에 메모리와 process단계를 최적화하는 방법을 제안한다.

  • PDF

Optimal Design Space Exploration of Multi-core Architecture for Real-time Lane Detection Algorithm (실시간 차선인식 알고리즘을 위한 최적의 멀티코어 아키텍처 디자인 공간 탐색)

  • Jeong, Inkyu;Kim, Jongmyon
    • Asia-pacific Journal of Multimedia Services Convergent with Art, Humanities, and Sociology
    • /
    • v.7 no.3
    • /
    • pp.339-349
    • /
    • 2017
  • This paper proposes a four-stage algorithm for detecting lanes on a driving car. In the first stage, it extracts region of interests in an image. In the second stage, it employs a median filter to remove noise. In the third stage, a binary algorithm is used to classify two classes of backgrond and foreground of an input image. Finally, an image erosion algorithm is utilized to obtain clear lanes by removing noises and edges remained after the binary process. However, the proposed lane detection algorithm requires high computational time. To address this issue, this paper presents a parallel implementation of a real-time line detection algorithm on a multi-core architecture. In addition, we implement and simulate 8 different processing element (PE) architectures to select an optimal PE architecture for the target application. Experimental results indicate that 40×40 PE architecture show the best performance, energy efficiency and area efficiency.

Feature Extraction System for High-Speed Fingerprint Recognition using the Multi-Access Memory System (다중 접근 메모리 시스템을 이용한 고속 지문인식 특징추출 시스템)

  • Park, Jong Seon;Kim, Jea Hee;Ko, Kyung-Sik;Park, Jong Won
    • Journal of Korea Multimedia Society
    • /
    • v.16 no.8
    • /
    • pp.914-926
    • /
    • 2013
  • Among the recent security systems, security system with fingerprint recognition gets many people's interests through the strengths such as exclusiveness, convenience, etc, in comparison with other security systems. The most important matters for fingerprint recognition system are reliability of matching between the fingerprint in database and user's fingerprint and rapid process of image processing algorithms used for fingerprint recognition. The existing fingerprint recognition system reduces the processing time by removing some processes in the feature extraction algorithms but has weakness of a reliability. This paper realizes the fingerprint recognition algorithm using MAMS(Multi-Access Memory System) for both the rapid processing time and the reliability in feature extraction and matching accuracy. Reliability of this process is verified by the correlation between serial processor's results and MAMS-PP64's results. The performance of the method using MAMS-PP64 is 1.56 times faster than compared serial processor.

Enhanced Image Mapping Method for Computer-Generated Integral Imaging System (집적 영상 시스템을 위한 향상된 이미지 매핑 방법)

  • Lee Bin-Na-Ra;Cho Yong-Joo;Park Kyoung-Shin;Min Sung-Wook
    • The KIPS Transactions:PartB
    • /
    • v.13B no.3 s.106
    • /
    • pp.295-300
    • /
    • 2006
  • The integral imaging system is an auto-stereoscopic display that allows users to see 3D images without wearing special glasses. In the integral imaging system, the 3D object information is taken from several view points and stored as elemental images. Then, users can see a 3D reconstructed image by the elemental images displayed through a lens array. The elemental images can be created by computer graphics, which is referred to the computer-generated integral imaging. The process of creating the elemental images is called image mapping. There are some image mapping methods proposed in the past, such as PRR(Point Retracing Rendering), MVR(Multi-Viewpoint Rendering) and PGR(Parallel Group Rendering). However, they have problems with heavy rendering computations or performance barrier as the number of elemental lenses in the lens array increases. Thus, it is difficult to use them in real-time graphics applications, such as virtual reality or real-time, interactive games. In this paper, we propose a new image mapping method named VVR(Viewpoint Vector Rendering) that improves real-time rendering performance. This paper describes the concept of VVR first and the performance comparison of image mapping process with previous methods. Then, it discusses possible directions for the future improvements.

Color Media Instructions for Embedded Parallel Processors (임베디드 병렬 프로세서를 위한 칼라미디어 명령어 구현)

  • Kim, Cheol-Hong;Kim, Jong-Myon
    • Journal of KIISE:Computer Systems and Theory
    • /
    • v.35 no.7
    • /
    • pp.305-317
    • /
    • 2008
  • As a mobile computing environment is rapidly changing, increasing user demand for multimedia-over-wireless capabilities on embedded processors places constraints on performance, power, and sire. In this regard, this paper proposes color media instructions (CMI) for single instruction, multiple data (SIMD) parallel processors to meet the computational requirements and cost goals. While existing multimedia extensions store and process 48-bit pixels in a 32-bit register, CMI, which considers that color components are perceptually less significant, supports parallel operations on two-packed compressed 16-bit YCbCr (6 bit Y and 5 bits Cb, Cr) data in a 32-bit datapath processor. This provides greater concurrency and efficiency for YCbCr data processing. Moreover, the ability to reduce data format size reduces system cost. The reduction in data bandwidth also simplifies system design. Experimental results on a representative SIMD parallel processor architecture show that CMI achieves an average speedup of 6.3x over the baseline SIMD parallel processor performance. This is in contrast to MMX (a representative Intel's multimedia extensions), which achieves an average speedup of only 3.7x over the same baseline SIMD architecture. CMI also outperforms MMX in both area efficiency (a 52% increase versus a 13% increase) and energy efficiency (a 50% increase versus an 11% increase). CMI improves the performance and efficiency with a mere 3% increase in the system area and a 5% increase in the system power, while MMX requires a 14% increase in the system area and a 16% increase in the system power.

Characteristics of the Rock Cleavage in Jurassic Granite, Geochang (거창지역의 쥬라기 화강암에 발달된 결의 특성)

  • Park, Deok-Won
    • The Journal of the Petrological Society of Korea
    • /
    • v.24 no.3
    • /
    • pp.153-164
    • /
    • 2015
  • Jurassic granite from Geochang was analysed with respect to the characteristics of the rock cleavage. we have mainly discussed the structual anisotropy formed by microcracks. The phases of distribution of microcracks were well evidenced from the enlarged photomicrographs(${\times}6.7$) of the thin section. The planes of principal set of microcracks are parallel to the rift plane and those of secondary set are parallel to the grain plane. These rift and grain microcracks are mutually near-perpendicular on the hardway planes. From the directional angle(${\theta}$) - total length($L_t$), number(N) and density(${\rho}$) chart, the curve patterns of the above microcrack parameters reflect the phases of distribution of microcracks. Microcrack parameters such as number, length and density show an order of rift > grain > hardway. These results indicate a relative magnitude of the rock cleavage. Meanwhile, brazilian tensile strengths were measured with respect to the six directions. The results revealed a strong correlation between mechanical property with the above microcrack parameters. These general results correspond to those of the previous study for Jurassic granites from Pocheon and Hapcheon. Image processing technique for the enlarged photomicrograph of the thin section was carried out. The grain 1(G1) microcrack arrays developed in quartz and feldspar grains show excellent distribution on the photomicrograph. In particular, the directional angle of each microcrack set can be ascertained easily by brief image processing for the above photomicrograph.

VLSI Architecture Design of Reconstruction Filter for Morphological Image Segmentation (형태학적 영상 분할을 위한 재구성 필터의 VLSI 구조 설계)

  • Lee, Sang-Yeol;Chung, Eui-Yoon;Lee, Ho-Young;Kim, Hee-Soo;Ha, Yeong-Ho
    • Journal of the Korean Institute of Telematics and Electronics S
    • /
    • v.36S no.12
    • /
    • pp.41-50
    • /
    • 1999
  • In this paper, the new VLSI architecture of a reconstruction filter for morphological image segmentation is proposed. The filter, based on the $h_{max}$ operation, simplifies the interior of each region while preserving the boundary information. The proposed architecture adopts a partitioned memory structure and an efficient image scanning strategy to reduce the operations. The proposed memory partitioning scheme makes it possible that every data required for processing can be read from each memory at a time, resulting in parallel data processing. By the extended connectivity consideration, the operation is much decreased because more simplification is achieved in scanning stage. The selective raster scan strategy endows the satisfactory noise removal capability with negligible hardware complexity increase. The proposed architecture is designed using VHDL, and functional evaluation is performed by the CAD tool, Mentor. The experiment results show that the proposed architecture can simplify image profile with less than 18% operations of the conventional method.

  • PDF

Non-Photorealistic Rendering Using CUDA-Based Image Segmentation (CUDA 기반 영상 분할을 사용한 비사실적 렌더링)

  • Yoon, Hyun-Cheol;Park, Jong-Seung
    • KIPS Transactions on Software and Data Engineering
    • /
    • v.4 no.11
    • /
    • pp.529-536
    • /
    • 2015
  • When rendering both three-dimensional objects and photo images together, the non-photorealistic rendering results are in visual discord since the two contents have their own independent color distributions. This paper proposes a non-photorealistic rendering technique which renders both three-dimensional objects and photo images such as cartoons and sketches. The proposed technique computes the color distribution property of the photo images and reduces the number of colors of both photo images and 3D objects. NPR is performed based on the reduced colormaps and edge features. To enhance the natural scene presentation, the image region segmentation process is preferred when extracting and applying colormaps. However, the image segmentation technique needs a lot of computational operations. It takes a long time for non-photorealistic rendering for large size frames. To speed up the time-consuming segmentation procedure, we use GPGPU for the parallel computing using the GPU. As a result, we significantly improve the execution speed of the algorithm.

An Improved Convex Hull Algorithm Considering Sort in Plane Point Set (평면 점집합에서 정렬을 고려한 개선된 컨벡스 헐 알고리즘)

  • Park, Byeong-Ju;Lee, Jae-Heung
    • Journal of IKEEE
    • /
    • v.17 no.1
    • /
    • pp.29-35
    • /
    • 2013
  • In this paper, we suggest an improved Convex Hull algorithm considering sort in plane point set. This algorithm has low computational complexity since processing data are reduced by characteristic of extreme points. Also it obtains a complete convex set with just one processing using an convex vertex discrimination criterion. Initially it requires sorting of point set. However we can't quickly sort because of its heavy operations. This problem was solved by replacing value and index. We measure the execution time of algorithms by generating a random set of points. The results of the experiment show that it is about 2 times faster than the existing algorithm.

A Novel VLSI Architecture for Parallel Adaptive Dictionary-Base Text Compression (가변 적응형 사전을 이용한 텍스트 압축방식의 병렬 처리를 위한 VLSI 구조)

  • Lee, Yong-Doo;Kim, Hie-Cheol;Kim, Jung-Gyu
    • The Transactions of the Korea Information Processing Society
    • /
    • v.4 no.6
    • /
    • pp.1495-1507
    • /
    • 1997
  • Among a number of approaches to text compression, adaptive dictionary schemes based on a sliding window have been very frequently used due to their high performance. The LZ77 algorithm is the most efficient algorithm which implements such adaptive schemes for the practical use of text compression. This paperpresents a VLSI architecture designed for processing the LZ77 algorithm in parallel. Compared with the other VLSI architectures developed so far, the proposed architecture provides the more viable solution to high performance with regard to its throughput, efficient implementation of the VLSI systolic arrays, and hardware scalability. Indeed, without being affected by the size of the sliding window, our system has the complexity of O(N) for both the compression and decompression and also requires small wafer area, where N is the size of the input text.

  • PDF