• Title/Summary/Keyword: 연산 지도

Wavelet Image Coding Using the Significant Cluster Extraction by Morphology and the Adaptive Quantization (모폴로지에 의한 중요 클러스터 추출과 적응양자화를 이용한 웨이브릿 영상부호화)

  • 류태경;강경원;권기룡;김문수;문광석
    • Journal of the Institute of Convergence Signal Processing / v.5 no.2 / pp.85-90 / 2004
  • This paper proposes a wavelet image coding method that uses morphology-based significant cluster extraction and adaptive quantization. In the conventional MRWD method, the additional seed data takes up a large portion of the total data bits. The proposed method extracts the significant clusters using morphology to improve coding efficiency. In addition, adaptive quantization is proposed to reduce the number of redundant comparison operations that inevitably occur in MRWD quantization. Experimental results show that the proposed algorithm improves coding efficiency and computational cost while preserving a superior PSNR.

A Study of Car Plate Extraction and Segmentation using Morphology and ART2 (모폴로지와 ART2를 이용한 번호판 위치 검출 및 문자 세그멘테이션에 관한 연구)

  • 강동구;김도현;최선아;차의영
    • Proceedings of the Korean Information Science Society Conference / 2001.10b / pp.328-330 / 2001
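  • Vehicle license plate recognition using computer vision is advantageous in terms of cost because it requires no special equipment on the vehicle. To recognize a license plate, the plate region is first extracted, the character and digit regions are then separated from the plate, and the resulting segments are recognized by a neural network or another method. This paper proposes methods for license plate localization and segmentation. Morphological techniques and ART2 clustering are used to locate the license plate, and segmentation within the detected plate region uses binarization based on morphological operations together with labeling.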

The Design of a Structure of Network Co-processor for SDR(Software Defined Radio) (SDR(Software Defined Radio)에 적합한 네트워크 코프로세서 구조의 설계)

  • Kim, Hyun-Pil;Jeong, Ha-Young;Ham, Dong-Hyeon;Lee, Yong-Surk
    • The Journal of Korean Institute of Communications and Information Sciences / v.32 no.2A / pp.188-194 / 2007
  • As the world moves toward ubiquitous computing, compatibility among wireless devices has become an important characteristic of a communication terminal, which makes SDR a necessary technology and standard. However, in an environment with different communication protocols, it is difficult to build a terminal using only hardware such as an ASIC or SoC. This paper proposes a processor that can accelerate several communication protocols. It can be connected to a main processor and is specialized for the PHY layer of the network. C programs modeling the wireless protocols IEEE802.11a and IEEE802.11b, which are based on the widely used modulation schemes OFDM and CDM, were compiled with an ARM cross compiler, then simulated and profiled with Simplescalar-Arm. Profiling showed that most operations were Viterbi operations and complex floating-point operations. Based on this result, we propose a co-processor that accelerates Viterbi operations and complex floating-point operations, along with added instructions. These instructions were simulated with Simplescalar-Arm. Compared with computing on a single ARM core, Viterbi operations ran 4.5 times faster and complex floating-point operations ran twice as fast. Overall, IEEE802.11a operations were 3 times faster and IEEE802.11b operations were 1.5 times faster.

Cost-based Optimization of Block Recycling Scheme in NAND Flash Memory Based Storage System (NAND 플래시 메모리 저장 장치에서 블록 재활용 기법의 비용 기반 최적화)

  • Lee, Jong-Min;Kim, Sung-Hoon;Ahn, Seong-Jun;Lee, Dong-Hee;Noh, Sam-H.
    • Journal of KIISE:Computing Practices and Letters / v.13 no.7 / pp.508-519 / 2007
  • Flash memory based storage has been used in various mobile systems and is now being adopted in laptop computers under the name Solid State Disk. Flash memory has merits in terms of weight, shock resistance, and power consumption, but it also has limitations such as the erase-before-write property. To overcome these limitations, flash memory based storage requires special address mapping software called the FTL (Flash-memory Translation Layer), which often performs a merge operation for block recycling. To reduce the block recycling cost in NAND flash memory based storage, we introduce another block recycling scheme, which we call migration. As a result, the FTL can select either merge or migration for each block recycling, depending on their costs. Experimental results with the Postmark benchmark and an embedded system workload show that this cost-based selection between migration and merge improves the performance of flash memory based storage. We also present a macroscopic optimization that finds the migration/merge sequence minimizing the block recycling cost over each migration/merge combination period. Experimental results show that the macroscopic optimization improves performance further than the simple cost-based selection.

Analysis of Impact of Correlation Between Hardware Configuration and Branch Handling Methods Executing General Purpose Applications (범용 응용프로그램 실행 시 하드웨어 구성과 분기 처리 기법에 따른 GPU 성능 분석)

  • Choi, Hong Jun;Kim, Cheol Hong
    • The Journal of the Korea Contents Association / v.13 no.3 / pp.9-21 / 2013
  • Owing to the increased computing power and flexibility of GPUs, recent GPUs execute general purpose parallel applications as well as graphics applications. Programmers can use GPGPU through the APIs provided by GPU vendors. Unfortunately, the computational resources of a GPU are not fully utilized when executing general purpose applications because of frequent branch instructions. To handle the branch problem, several warp formation techniques have been proposed. Intuitively, one expects warp formations that provide higher computational resource utilization to show higher performance. Contrary to this expectation, simulation results show that a warp formation providing better utilization can perform worse than one providing lower utilization. This is because a warp formation providing high utilization causes a serious memory bottleneck due to the increased number of memory requests. Therefore, a warp formation providing high computational utilization cannot guarantee high performance without proper hardware resources. For this reason, we analyze the correlation between hardware configuration and warp formation. Our simulation results provide guidelines for solving the underutilization problem caused by branch instructions when designing GPUs.

Low-Gate-Count 32-Bit 2/3-Stage Pipelined Processor Design (소면적 32-bit 2/3단 파이프라인 프로세서 설계)

  • Lee, Kwang-Min;Park, Sungkyung
    • Journal of the Institute of Electronics and Information Engineers / v.53 no.4 / pp.59-67 / 2016
  • With the enhancement of built-in communication capabilities in various meters and wearable devices, a trend associated with the Internet of things (IoT), the demand for small-area embedded processors has increased. In this paper, we introduce Juno, a small-area 32-bit pipelined processor applicable to the IoT field. Juno is an EISC (Extendable Instruction Set Computer) machine and has a 2/3-stage pipeline structure to reduce the data dependency of the pipeline. It has a simple pipeline controller that controls only the program counter (PC) and two pipeline registers. It offers $32{\times}32=64$ multiplication, $64/32=32$ division, and $32{\times}32+64=64$ MAC (multiply and accumulate) operations, together with a $32{\times}32=64$ Galois field multiplication operation for encryption processing in wireless communications. These algebraic logic blocks can be selectively included as needed in order to reduce the area of the overall processor. In this case, the gate count of our integer core ranges from 12k to 22k, and the core achieves 0.57 DMIPS/MHz and 1.024 Coremark/MHz.
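The abstract does not say how the 32x32=64 Galois field multiplication is defined. One common reading is a carry-less product of two 32-bit operands treated as polynomials over GF(2), with no reduction step. The sketch below assumes that reading; the function name `clmul32` is hypothetical and is not taken from the paper.

```python
def clmul32(a: int, b: int) -> int:
    """Carry-less (GF(2) polynomial) product of two 32-bit operands.

    Assumption: no reduction polynomial is applied, so the result can
    occupy up to 63 bits and fits in a 64-bit register.
    """
    a &= 0xFFFFFFFF
    b &= 0xFFFFFFFF
    result = 0
    while b:
        if b & 1:
            result ^= a  # addition in GF(2) is XOR, so no carries propagate
        a <<= 1
        b >>= 1
    return result


if __name__ == "__main__":
    # (x + 1) * (x + 1) = x^2 + 1 over GF(2)
    print(bin(clmul32(0b11, 0b11)))
```

Replacing integer addition with XOR is the only difference from an ordinary shift-and-add multiplier, which suggests why such a block can share structure with the integer multiplier in a small core.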

Technique for Placing Continuous Media on a Disk Array under Fault-Tolerance and Arbitrary-Rate Search (결함허용과 임의 속도 탐색을 고려한 연속 매체 디스크 배치 기법)

  • O, Yu-Yeong;Kim, Seong-Su;Kim, Jae-Hun
    • Journal of KIISE:Computer Systems and Theory / v.26 no.9 / pp.1166-1176 / 1999
  • End-user operations on continuous media, especially video data, include arbitrary-rate search, pause, and various others in addition to normal-rate playback. Among these, fast-forward (FF) and fast-backward (FB), which are useful for quickly finding a scene of interest, require non-sequential disk access, unlike playback. If the disk load is not balanced in this case, accesses concentrate on a few disks and service quality degrades. Prime Round Robin (PRR) is a scheme that places continuous media on a disk array based storage system so that disk accesses are distributed uniformly, but it wastes some disk storage space. In this paper, we propose a new placement scheme, PRRgp (PRR with Grouped Parities), which uses the disk space wasted by PRR to improve reliability. Like PRR, PRRgp balances the load across all disks in the array under arbitrary-rate search, and it additionally improves reliability by storing parity information in the previously wasted disk space. We use combinatorial and Markov models to compute and compare reliability, taking the failure rate and repair rate into account. When continuous media are stored with PRR and parity information is stored in the wasted space, recovery is impossible if two or more faults occur simultaneously. With PRRgp, recovery using the stored parity information is possible in about 30% or more of the cases with two simultaneous faults, and two or more faults can also be recovered from when there are two or more parity groups.
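The abstract does not give the exact PRR layout, so the following is only a simplified illustration of why a prime modulus helps under strided (fast-forward) access. It assumes plain modulo placement, where block i is stored on disk `i % num_disks`; the function name `disk_loads` and this placement rule are assumptions, not the paper's scheme.

```python
from collections import Counter


def disk_loads(num_disks: int, stride: int, num_accesses: int) -> Counter:
    """Count how many accesses each disk receives when every `stride`-th
    block is read and block i is assumed to live on disk i % num_disks."""
    return Counter((i * stride) % num_disks for i in range(num_accesses))


if __name__ == "__main__":
    # 4x fast-forward over 70 accesses
    print("7 disks (prime):", sorted(disk_loads(7, 4, 70).items()))
    print("8 disks:        ", sorted(disk_loads(8, 4, 70).items()))
```

With a prime number of disks, any stride that is not a multiple of that prime is coprime to it, so the accesses cycle through every disk. With 8 disks and a stride of 4, only two disks are ever touched.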

Design of Efficient Gradient Orientation Bin and Weight Calculation Circuit for HOG Feature Calculation (HOG 특징 연산에 적용하기 위한 효율적인 기울기 방향 bin 및 가중치 연산 회로 설계)

  • Kim, Soojin;Cho, Kyeongsoon
    • Journal of the Institute of Electronics and Information Engineers / v.51 no.11 / pp.66-72 / 2014
  • Histogram of oriented gradient (HOG) features are widely used in vision-based pedestrian detection. Interpolation is the most important technique in HOG feature calculation for achieving a high detection rate. Interpolation requires, for each pixel, the two orientation bins nearest to the gradient orientation and the corresponding weights. This paper therefore proposes an efficient gradient orientation bin and weight calculation circuit for HOG features. In the proposed circuit, pre-calculated values are stored in tables to avoid tangent and division operations, and the table size is minimized by exploiting the characteristics of the tangent function and of the weights for each gradient orientation. A pipeline architecture is adopted to increase processing speed, and the orientation bins and corresponding weights for each pixel are calculated in two clock cycles by applying efficient coarse and fine search schemes. Since the proposed circuit calculates the gradient orientation of each pixel at intervals of $1^{\circ}$ and determines both the orientation bins and the weights required for interpolation, it can be used in HOG feature calculation to support interpolation and thus a high detection rate.
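As a floating-point reference for what such a circuit computes, the sketch below finds the two nearest orientation bins and their linear interpolation weights for a given angle. It assumes 9 unsigned bins over 0 to 180 degrees with centers at 10, 30, ..., 170 degrees, a common HOG configuration that the abstract does not confirm. It uses `atan2` and division directly, which the proposed circuit avoids by means of tables. The function names are hypothetical.

```python
import math


def bins_and_weights(angle_deg: float, n_bins: int = 9):
    """Return ((lower_bin, lower_weight), (upper_bin, upper_weight)) for an
    unsigned orientation in degrees, using linear interpolation between
    the two nearest bin centers (bins wrap around at 180 degrees)."""
    angle = angle_deg % 180.0
    width = 180.0 / n_bins
    pos = angle / width - 0.5          # position measured in bin centers
    lower = math.floor(pos)
    upper_weight = pos - lower
    lower_weight = 1.0 - upper_weight
    return (lower % n_bins, lower_weight), ((lower + 1) % n_bins, upper_weight)


def gradient_bins(gx: float, gy: float, n_bins: int = 9):
    """Same as bins_and_weights, starting from a pixel gradient (gx, gy)."""
    return bins_and_weights(math.degrees(math.atan2(gy, gx)), n_bins)


if __name__ == "__main__":
    print(bins_and_weights(20.0))   # halfway between the centers at 10 and 30
    print(gradient_bins(1.0, 1.0))  # a 45 degree gradient
```

The two weights always sum to 1, so each pixel's gradient magnitude is split between two adjacent bins rather than assigned to a single one.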

Baseline Wander Removing Method Based on Morphological Filter for Efficient QRS Detection (효율적인 QRS 검출을 위한 형태 연산 기반의 기저선 잡음 제거 기법)

  • Cho, Ik-Sung;Kim, Joo-Man;Kim, Seon-Jong;Kwon, Hyeog-Soong
    • Journal of the Korea Institute of Information and Communication Engineering / v.17 no.1 / pp.166-174 / 2013
  • QRS detection in the ECG is the most popular and easiest way to detect cardiac disease. However, ECG signals are difficult to analyze because of various types of noise. An important problem in recording ECG signals is baseline wander, which is caused by the rhythm of respiration and by muscle contraction at the electrode attachment site. In particular, a healthcare system that must continuously monitor a person's condition needs to process the ECG signal in real time. In other words, an algorithm is needed that accurately detects the QRS region with minimal computation by analyzing the person's physical condition and/or environment. This paper therefore presents a baseline wander removal method based on a morphological filter for efficient QRS detection. We detect the QRS complex through a preprocessing step that uses a morphological filter, an adaptive threshold, and a window. The signal distortion ratio of the proposed method is compared with that of other filtering methods, and R-wave detection is evaluated using the MIT-BIH arrhythmia database. Experimental results show that the proposed method removes baseline wander effectively without significant morphological distortion.
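A generic form of morphological baseline removal estimates the baseline with an opening followed by a closing, then subtracts it from the signal. The sketch below shows that generic form with flat structuring elements. The window lengths and function names are assumptions; the abstract does not give the paper's structuring elements, adaptive threshold, or window stage.

```python
def erode(x, k):
    """Flat 1-D erosion: sliding minimum over a window of length k."""
    h, n = k // 2, len(x)
    return [min(x[max(0, i - h):min(n, i + h + 1)]) for i in range(n)]


def dilate(x, k):
    """Flat 1-D dilation: sliding maximum over a window of length k."""
    h, n = k // 2, len(x)
    return [max(x[max(0, i - h):min(n, i + h + 1)]) for i in range(n)]


def opening(x, k):
    return dilate(erode(x, k), k)   # removes peaks narrower than k


def closing(x, k):
    return erode(dilate(x, k), k)   # fills valleys narrower than k


def remove_baseline(x, k_open=3, k_close=5):
    """Estimate the baseline by opening then closing, and subtract it.
    Returns (corrected_signal, baseline_estimate)."""
    baseline = closing(opening(x, k_open), k_close)
    return [a - b for a, b in zip(x, baseline)], baseline


if __name__ == "__main__":
    signal = [5] * 21        # constant baseline offset of 5
    signal[10] = 15          # one narrow, QRS-like spike
    corrected, baseline = remove_baseline(signal)
    print(corrected)
```

In practice the opening window would be chosen longer than the QRS complex and shorter than the drift period, so that the spike is removed from the baseline estimate while slow wander is retained.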

Implementation of High-radix Modular Exponentiator for RSA using CRT (CRT를 이용한 하이래딕스 RSA 모듈로 멱승 처리기의 구현)

  • 이석용;김성두;정용진
    • Journal of the Korea Institute of Information Security & Cryptology / v.10 no.4 / pp.81-93 / 2000
  • To improve the processing performance of modular exponentiation, the primary arithmetic operation in the RSA cryptographic algorithm, we present a new RSA hardware architecture based on high-radix modular multiplication and the CRT (Chinese Remainder Theorem). By implementing the modular multiplier with radix-16 arithmetic, we reduced the number of PEs (Processing Elements) to a quarter of that in the binary arithmetic scheme. This in turn reduces both the number of clock cycles and the delay of the pipelining flip-flops to a quarter. Because the receiver knows p and q, the factors of N, the CRT can be applied to the decryption process. To use the CRT, we built two s/2-bit multipliers that operate in parallel during decryption, which achieved 4 times faster performance than without the CRT. In the encryption phase, the two s/2-bit multipliers can be connected to form an s-bit linear multiplier for s-bit arithmetic operations. We limited the encryption exponent size to 17 bits to maintain high speed. We implemented a linear array modular multiplier by horizontally projecting the DG of the Montgomery algorithm. The proposed hardware performs encryption at 15 Mbps and decryption at 1.22 Mbps when estimated with the Samsung 0.5um CMOS Standard Cell Library, which is the fastest among the publications at present.
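The CRT step can be illustrated in software: decryption is split into two half-size exponentiations modulo p and q, and the results are recombined with Garner's formula. This is a minimal sketch with toy parameters chosen for illustration. It does not model the radix-16 Montgomery multiplier or the parallel hardware, and the function names are hypothetical. It requires Python 3.8 or later for `pow(x, -1, m)`.

```python
def rsa_decrypt_direct(c: int, d: int, n: int) -> int:
    """Plain RSA decryption: one full-size modular exponentiation."""
    return pow(c, d, n)


def rsa_decrypt_crt(c: int, d: int, p: int, q: int) -> int:
    """RSA decryption using the CRT with Garner's recombination."""
    dp = d % (p - 1)             # reduced exponent for the mod-p half
    dq = d % (q - 1)             # reduced exponent for the mod-q half
    q_inv = pow(q, -1, p)        # q^{-1} mod p
    mp = pow(c % p, dp, p)       # these two exponentiations are independent,
    mq = pow(c % q, dq, q)       # so hardware could run them in parallel
    h = (q_inv * (mp - mq)) % p
    return mq + h * q


if __name__ == "__main__":
    p, q, e = 61, 53, 17         # toy primes, far too small for real use
    n = p * q
    d = pow(e, -1, (p - 1) * (q - 1))
    cipher = pow(65, e, n)
    print(rsa_decrypt_direct(cipher, d, n), rsa_decrypt_crt(cipher, d, p, q))
```

Each half works on operands and exponents of roughly half the bit length, which is where the speedup over a single full-size exponentiation comes from.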