• Title/Summary/Keyword: Parallel operation algorithm


Implementation of a Real-Time Spatio-Temporal Noise Reduction System (실시간 시공 노이즈 제거 시스템 구현)

  • Hong, Hye-Jeong;Kim, Hyun-Jin;Kang, Sung-Ho
    • Journal of the Institute of Electronics Engineers of Korea SP
    • /
    • v.45 no.2
    • /
    • pp.74-80
    • /
    • 2008
  • Spatio-temporal filters can reduce noise in moving pictures that spatial filters cannot handle. However, the algorithm is too complicated to be realized directly in hardware. We implemented a real-time spatio-temporal noise reduction system that uses at most three frames and is based on an adaptive mean filter algorithm. Some factors that interfere with hardware implementation were modified. Noise estimated from the previous frame is used to filter the current frame, so that filtering can be conducted in parallel with noise estimation. This speeds up the system, thereby enabling real-time execution. The form of the filtering windows was also modified to facilitate synchronization. The proposed structure was implemented on a Virtex 4 XC4VLX60, occupying 66% of the total slices with a maximum operating frequency of 80 MHz.
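The pipelining idea in this abstract (filter the current frame with the noise level estimated from the previous frame) can be illustrated with a minimal sketch. The abstract does not give the filter equations, so everything here is an assumption: the filter is a generic local-statistics adaptive mean filter, the window is spatial only (3 x 3) rather than spanning three frames, and the noise estimator (median of local variances) and all function names are hypothetical.

```python
import numpy as np

def local_mean_var(frame, k=3):
    """Mean and variance over a k x k window around each pixel (edge-padded)."""
    pad = k // 2
    padded = np.pad(np.asarray(frame, dtype=np.float64), pad, mode="edge")
    windows = np.lib.stride_tricks.sliding_window_view(padded, (k, k))
    return windows.mean(axis=(-2, -1)), windows.var(axis=(-2, -1))

def estimate_noise_var(frame):
    """Hypothetical noise estimator: median of the local variances."""
    _, var = local_mean_var(frame)
    return float(np.median(var))

def adaptive_mean_filter(frame, noise_var):
    """Pull each pixel toward its local mean where the local variance
    is close to the noise variance; leave high-variance (edge) pixels alone."""
    frame = np.asarray(frame, dtype=np.float64)
    mean, var = local_mean_var(frame)
    ratio = np.clip(noise_var / np.maximum(var, 1e-12), 0.0, 1.0)
    return frame - ratio * (frame - mean)

def denoise_stream(frames):
    """Filter frame t with the noise estimated from frame t-1.

    The two calls inside the loop do not depend on each other, so a
    hardware pipeline could run them side by side on the same frame.
    """
    outputs = []
    noise_var = 0.0  # no estimate yet: the first frame passes through unchanged
    for frame in frames:
        outputs.append(adaptive_mean_filter(frame, noise_var))
        noise_var = estimate_noise_var(frame)
    return outputs
```

In software the loop still runs sequentially; the point of the sketch is only that the filter never waits for a noise estimate of the frame it is currently processing.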

Detection of Fallen Pear Bags caused by Natural Disaster (자연 재해로 인하여 낙과된 무채색 배 봉지 검출)

  • Choi, Doo-Hyun
    • Journal of the Institute of Electronics and Information Engineers
    • /
    • v.53 no.1
    • /
    • pp.153-158
    • /
    • 2016
  • This paper presents an algorithm for detecting pear bags that have fallen because of natural disasters such as heavy rain, typhoons, and hurricanes. The algorithm is developed for the gray pear bags with printed characters that are widely used at pear farms in Sangju and Naju, which produce large quantities of pears for export. It first sets a region of interest (ROI) and then eliminates the regions with chromatic color inside the ROI. Morphological operations and prior information are used to eliminate small noise and several unusual regions, so that only the regions of fallen pear bags remain. The remaining regions are analyzed and counted to estimate the scale of the damage. The test images were taken at pear farms in Sangju and Naju in 2014. Experimental results show that the detection rate of pear bags is more than 90% and that the proposed system can be implemented in real time on hand-held devices because of its simple and parallel architecture.

Optimization Study of Toom-Cook Algorithm in NIST PQC SABER Utilizing ARM/NEON Processor (ARM/NEON 프로세서를 활용한 NIST PQC SABER에서 Toom-Cook 알고리즘 최적화 구현 연구)

  • Song, JinGyo;Kim, YoungBeom;Seo, Seog Chung
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.31 no.3
    • /
    • pp.463-471
    • /
    • 2021
  • Since 2016, the National Institute of Standards and Technology (NIST) has been conducting a post-quantum cryptography standardization project in preparation for the quantum computing environment. The third round is currently in progress, and most of the candidates (5/7) are lattice-based. Lattice-based post-quantum cryptography is considered applicable even in resource-constrained embedded environments because it provides efficient operations and appropriate key lengths. Among these candidates, SABER KEM uses an efficient modulus and the Toom-Cook algorithm to process polynomial multiplication, its computation-intensive task. In this paper, we present an optimized implementation of the evaluation and interpolation steps of the Toom-Cook algorithm in SABER using ARM/NEON on the ARMv8-A platform. For the evaluation step, we propose an efficient interleaving method for ARM/NEON, and for the interpolation step, we introduce an optimized implementation methodology applicable to various embedded environments. As a result, the proposed implementation is 3.5 times faster in the evaluation step and 5 times faster in the interpolation step than the previous reference implementation.
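The evaluation and interpolation steps mentioned in this abstract can be illustrated with a small sketch. It is not the paper's implementation: SABER uses a 4-way Toom-Cook split with coefficients modulo a power of two and NEON vector instructions, whereas this sketch uses the simpler 3-way variant (Toom-3) on plain integer coefficients, with evaluation points 0, 1, -1, -2 and infinity. The function names are hypothetical.

```python
def schoolbook(a, b):
    """Reference O(n^2) polynomial multiplication on coefficient lists."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def toom3(a, b):
    """Multiply two integer polynomials of equal length (a multiple of 3)."""
    n = len(a)
    assert n == len(b) and n > 0 and n % 3 == 0
    m = n // 3

    def evaluate(p):
        # Split p into three blocks and evaluate the block polynomial
        # p0 + p1*X + p2*X^2 at X = 0, 1, -1, -2 and infinity.
        p0, p1, p2 = p[:m], p[m:2 * m], p[2 * m:]
        return [
            p0,
            [u + v + w for u, v, w in zip(p0, p1, p2)],
            [u - v + w for u, v, w in zip(p0, p1, p2)],
            [u - 2 * v + 4 * w for u, v, w in zip(p0, p1, p2)],
            p2,
        ]

    # Five small pointwise products replace nine block products.
    r0, r1, rm1, rm2, rinf = (
        schoolbook(x, y) for x, y in zip(evaluate(a), evaluate(b))
    )

    # Interpolation: recover the five block coefficients c0..c4.
    # Every division below is exact for integer inputs.
    t3 = [(x - y) // 3 for x, y in zip(rm2, r1)]
    t1 = [(x - y) // 2 for x, y in zip(r1, rm1)]
    t2 = [x - y for x, y in zip(rm1, r0)]
    c3 = [(x - y) // 2 + 2 * z for x, y, z in zip(t2, t3, rinf)]
    c2 = [x + y - z for x, y, z in zip(t2, t1, rinf)]
    c1 = [x - y for x, y in zip(t1, c3)]

    # Recomposition: add block i at offset i*m.
    out = [0] * (2 * n - 1)
    for i, block in enumerate([r0, c1, c2, c3, rinf]):
        for j, value in enumerate(block):
            out[i * m + j] += value
    return out
```

The evaluation and interpolation loops are element-wise over coefficient blocks, which is the kind of structure that maps naturally onto SIMD lanes.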

A Study of Big data-based Machine Learning Techniques for Wheel and Bearing Fault Diagnosis (차륜 및 차축베어링 고장진단을 위한 빅데이터 기반 머신러닝 기법 연구)

  • Jung, Hoon;Park, Moonsung
    • Journal of the Korea Academia-Industrial cooperation Society
    • /
    • v.19 no.1
    • /
    • pp.75-84
    • /
    • 2018
  • Increasing the operation rate of components and stabilizing operation through timely management of the core parts are crucial for improving the efficiency of the railroad maintenance industry. Demand has grown for diagnosis technology that assesses the condition of rolling stock components using history management and automated big data analysis, both to increase reliability and to reduce the maintenance cost of the core components in line with the trend toward rapid maintenance. This study developed a big data platform-based system for managing rolling stock component condition, which acquires, processes, and analyzes in real time the big data generated by the onboard and wayside devices of railroad cars. The system can monitor the conditions of the railroad car components and the system resources in real time. The study also proposed a machine learning technique that enables distributed and parallel processing of the acquired big data and automatic component fault diagnosis. A test using the virtual instance generation system of Amazon Web Service showed that the algorithm applying the distributed and parallel technology decreased the runtime, and confirmed that the fault diagnosis model, which uses random forest machine learning, predicts the condition of the bearing and wheel parts with 83% accuracy.

A Study on Improved Image Matching Method using the CUDA Computing (CUDA 연산을 이용한 개선된 영상 매칭 방법에 관한 연구)

  • Cho, Kyeongrae;Park, Byungjoon;Yoon, Taebok
    • Journal of the Korea Academia-Industrial cooperation Society
    • /
    • v.16 no.4
    • /
    • pp.2749-2756
    • /
    • 2015
  • Recently, as the quality of data has increased, the time needed to process images has become a problem, and image processing algorithms need to be accelerated. This study compares the computing speed and performance gains of a CUDA (Compute Unified Device Architecture)-based recognition system with those of a traditional CPU using OpenMP. For character recognition, the system learns character data and matches the input against it in an environment where the character regions are recognized well; an image matching method that computes the degree of matching was implemented for English alphabet character images whose fonts are constant and standardized in size. Using the four cores of an Intel i5 2500 with OpenMP, the proposed method improved the speed by about 3.2 times over the existing CPU implementation; it did not reach four times because of delays in the data partition and merge operations. With parallel processing on the video card using CUDA, a GPGPU (General Purpose GPU) programming platform, a performance gain of about 21 times over sequential CPU-based processing was confirmed.

Channel Searching Method of IEEE 802.15.4 Nodes for Avoiding WiFi Traffic Interference (WiFi 트래픽 간섭을 피하기 위한 IEEE 802.15.4 노드의 채널탐색방법)

  • Song, Myong Lyol
    • Journal of Internet Computing and Services
    • /
    • v.15 no.2
    • /
    • pp.19-31
    • /
    • 2014
  • In this paper, a parallel backoff delay procedure on multiple IEEE 802.15.4 channels and a channel searching method that considers the frequency spectrum of WiFi traffic are studied so that IEEE 802.15.4 nodes can avoid interference from WiFi traffic. To search for the channels occupied by WiFi traffic, we analyzed methods that measure the powers of adjacent channels simultaneously, check the duration for which the measured power levels exceed a threshold, and find, by signal processing, the same periodicity in the sampled RSSI data as that of the beacon frame. The operation of the CSMA-CA algorithm for IEEE 802.15.4 nodes in a wireless channel overlapping with an IEEE 802.11 network is explained. A method by which an IEEE 802.15.4 device executes a parallel backoff procedure on multiple IEEE 802.15.4 channels is proposed, together with a description of its algorithm. Analysis of the data measured by an experimental system implementing the proposed method shows that, while WiFi traffic is being generated, medium access delay times increase simultaneously in the associated IEEE 802.15.4 channels that are adjacent to each other. A channel evaluation function for deciding whether an IEEE 802.15.4 channel suffers interference from other traffic is defined. A channel searching method that considers the evaluations of adjacent channels together is proposed in order to find the IEEE 802.15.4 channels interfered with by WiFi, and the experimental results show that it correctly finds those channels.

Real-time Watermarking Algorithm using Multiresolution Statistics for DWT Image Compressor (DWT기반 영상 압축기의 다해상도의 통계적 특성을 이용한 실시간 워터마킹 알고리즘)

  • 최순영;서영호;유지상;김대경;김동욱
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.13 no.6
    • /
    • pp.33-43
    • /
    • 2003
  • In this paper, we propose a real-time watermarking algorithm that is combined with and works alongside a DWT (Discrete Wavelet Transform)-based image compressor. To reduce the amount of computation in selecting the watermarking positions, the proposed algorithm uses a pre-established look-up table of critical values, which was established statistically by computing the correlation according to the energy values of the corresponding wavelet coefficients. That is, the watermark is embedded into the coefficients whose values are greater than the critical value in the look-up table, which is searched on the basis of the energy values of the corresponding level-1 subband coefficients. Therefore, the proposed algorithm can operate in real time, because the watermarking process runs in parallel with the compression process without affecting the operation of the image compression. It also reduces the loss of the watermark that results from quantization and Huffman coding during image compression, and the loss of compression efficiency caused by watermark insertion. Visually recognizable patterns such as binary images were used as watermarks. The experimental results show that the proposed algorithm satisfies robustness and imperceptibility, which are the major requirements of watermarking.

Design and Hardware Implementation of High-Speed Variable-Length RSA Cryptosystem (가변길이 고속 RSA 암호시스템의 설계 및 하드웨어 구현)

  • 박진영;서영호;김동욱
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.27 no.9C
    • /
    • pp.861-870
    • /
    • 2002
  • In this paper, targeting the slow operation speed of RSA, a new 1024-bit RSA cryptosystem is proposed and implemented in hardware to increase the operational speed and perform variable-length encryption. The proposed cryptosystem mainly consists of a modular exponentiation part and a modular multiplication part. For the modular exponentiation, the RL-binary method, which performs squaring and modular multiplication in parallel, was improved and then applied. A 4-stage CSA structure and the radix-4 Booth algorithm were applied to enhance the variable-length operation and reduce the number of partial products in the modular multiplication. The proposed RSA cryptosystem, which can process at most 1024 bits at a time, was mapped onto an integrated circuit using the Hynix Phantom Cell Library for the Hynix 0.35 µm 2-Poly 4-Metal CMOS process. The result of a software implementation, which had been programmed prior to the hardware research, was used to verify the operation of the hardware system. The hardware implementation has a size of about 190k gates and an operational clock frequency of 150 MHz. Considering a variable-length modulus, the baud rate of the proposed scheme is one and a half times faster than that of previous works. Therefore, the proposed high-speed variable-length RSA cryptosystem should be usable in various information security systems that require high-speed operation.
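The property of the RL-binary (right-to-left) method that this abstract relies on can be shown in a few lines: in every iteration, the conditional multiplication and the squaring both read only the values from the previous iteration, so hardware can compute them at the same time. This is a minimal sketch of the textbook method, not of the improved variant the paper proposes; the function name is hypothetical.

```python
def modexp_rl_binary(base, exponent, modulus):
    """Right-to-left binary modular exponentiation for exponent >= 0."""
    result = 1 % modulus
    square = base % modulus
    while exponent > 0:
        # These two products are independent of each other: both use
        # only the old `square`, so they could run on two multipliers
        # in the same step.
        next_result = (result * square) % modulus if exponent & 1 else result
        next_square = (square * square) % modulus
        result, square = next_result, next_square
        exponent >>= 1  # move to the next higher exponent bit
    return result
```

By contrast, the left-to-right method must finish the squaring before it can multiply, which serializes the two operations.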

A Development of Automatic Lineament Extraction Algorithm from Landsat TM images for Geological Applications (지질학적 활용을 위한 Landsat TM 자료의 자동화된 선구조 추출 알고리즘의 개발)

  • 원중선;김상완;민경덕;이영훈
    • Korean Journal of Remote Sensing
    • /
    • v.14 no.2
    • /
    • pp.175-195
    • /
    • 1998
  • Automatic lineament extraction algorithms have been developed by various researchers for geological purposes using remotely sensed data. However, most of them are designed for a certain topographic model, for instance a rugged mountainous region or a flat basin. The most common topography in Korea is a mountainous region together with an alluvial plain, and consequently it is difficult to apply previous algorithms directly to this area. In this study, a new algorithm for automatic lineament extraction from remotely sensed images is developed specifically for geological applications. An algorithm named DSTA (Dynamic Segment Tracing Algorithm) is developed to produce a binary image composed of linear and non-linear components. The proposed algorithm effectively reduces the look-direction bias associated with the sun's azimuth angle and the noise in low-contrast regions by utilizing a dynamic sub-window. This algorithm can successfully accommodate lineaments in alluvial plains as well as in mountainous regions. Two additional algorithms for estimating the individual lineament vectors, named ALEHHT (Automatic Lineament Extraction by Hierarchical Hough Transform) and ALEGHT (Automatic Lineament Extraction by Generalized Hough Transform), which are merging operation steps based on the hierarchical Hough transform and the generalized Hough transform respectively, are also developed to generate geological lineaments. The merging operation proposed in this study uses three parameters: the angle between two lines ($\delta\beta$), the perpendicular distance ($d_{ij}$), and the distance between the midpoints of the lines ($d_n$). The test results of the developed algorithm on a Landsat TM image demonstrate that lineaments in alluvial plains as well as in rugged mountains are extracted extremely well. Even the lineaments parallel to the sun's azimuth angle are well detected by this approach. Further study is, however, required to accommodate the effect of the quantization interval ($d\rho$) parameter in ALEGHT for optimization.
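The three merging parameters named in this abstract can be computed for a pair of line segments as in the sketch below. The abstract does not define them precisely, so the definitions here are assumptions: the angle is taken between undirected lines, the perpendicular distance is measured from the midpoint of the second segment to the infinite line through the first, and the thresholds in `should_merge` are hypothetical.

```python
import math

def merge_parameters(seg_i, seg_j):
    """Return (angle, perpendicular distance, midpoint distance).

    Each segment is ((x1, y1), (x2, y2)); seg_i must have nonzero length.
    """
    (x1, y1), (x2, y2) = seg_i
    (x3, y3), (x4, y4) = seg_j

    # Angle between the two undirected lines, in [0, pi/2].
    diff = abs(math.atan2(y2 - y1, x2 - x1) - math.atan2(y4 - y3, x4 - x3))
    diff %= math.pi
    angle = min(diff, math.pi - diff)

    mid_i = ((x1 + x2) / 2.0, (y1 + y2) / 2.0)
    mid_j = ((x3 + x4) / 2.0, (y3 + y4) / 2.0)

    # Distance from the midpoint of seg_j to the line through seg_i,
    # via the magnitude of the 2-D cross product.
    length_i = math.hypot(x2 - x1, y2 - y1)
    cross = (x2 - x1) * (mid_j[1] - y1) - (y2 - y1) * (mid_j[0] - x1)
    perpendicular = abs(cross) / length_i

    midpoint_distance = math.hypot(mid_j[0] - mid_i[0], mid_j[1] - mid_i[1])
    return angle, perpendicular, midpoint_distance

def should_merge(seg_i, seg_j, max_angle, max_perpendicular, max_midpoint):
    """Hypothetical merge test: all three parameters within their limits."""
    angle, perpendicular, midpoint_distance = merge_parameters(seg_i, seg_j)
    return (angle <= max_angle
            and perpendicular <= max_perpendicular
            and midpoint_distance <= max_midpoint)
```

For two nearly collinear fragments of the same lineament, all three values are small, which is the case such a merging step is meant to catch.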

Implementation of High-radix Modular Exponentiator for RSA using CRT (CRT를 이용한 하이래딕스 RSA 모듈로 멱승 처리기의 구현)

  • 이석용;김성두;정용진
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.10 no.4
    • /
    • pp.81-93
    • /
    • 2000
  • To improve the processing performance of modular exponentiation, which is the primary arithmetic operation in the RSA cryptographic algorithm, we present a new RSA hardware architecture based on high-radix modular multiplication and the CRT (Chinese Remainder Theorem). By implementing the modular multiplier using radix-16 arithmetic, we reduced the number of PEs (Processing Elements) to a quarter of that of the binary arithmetic scheme. This in turn reduces both the number of clock cycles and the delay of the pipelining flip-flops to a quarter. Because the receiver knows p and q, the factors of N, it is possible to apply the CRT to the decryption process. To use the CRT, we made two s/2-bit multipliers operate in parallel during decryption, which is 4 times faster than decryption without the CRT. In the encryption phase, the two s/2-bit multipliers can be connected to form an s-bit linear multiplier for the s-bit arithmetic operation. We limited the encryption exponent size to at most 17 bits to maintain high speed. We implemented a linear array modular multiplier by projecting the DG of the Montgomery algorithm horizontally. The proposed hardware performs encryption at a bit rate of 15 Mbps and decryption at 1.22 Mbps, as estimated with reference to the Samsung 0.5 µm CMOS Standard Cell Library, which is the fastest among results published to date.
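The CRT decryption described in this abstract splits one s-bit exponentiation into two independent s/2-bit exponentiations modulo p and q, which is why the two half-size multipliers can work in parallel. The sketch below shows the standard software form of this idea using Garner's recombination; it is not the paper's hardware design, and the function name and toy key are illustrative only.

```python
def rsa_decrypt_crt(ciphertext, d, p, q):
    """RSA decryption with the Chinese Remainder Theorem.

    p and q are the prime factors of the modulus N = p * q,
    and d is the private exponent.
    """
    d_p = d % (p - 1)        # reduced exponent for the mod-p half
    d_q = d % (q - 1)        # reduced exponent for the mod-q half
    q_inv = pow(q, -1, p)    # modular inverse, needs Python 3.8+

    # The two half-size exponentiations share no data and can run
    # at the same time on separate multipliers.
    m_p = pow(ciphertext, d_p, p)
    m_q = pow(ciphertext, d_q, q)

    # Garner's recombination of the two residues into a value mod N.
    h = (q_inv * (m_p - m_q)) % p
    return m_q + h * q
```

A usage example with a toy key (p = 61, q = 53, e = 17, d = 2753): `rsa_decrypt_crt(pow(65, 17, 61 * 53), 2753, 61, 53)` recovers the message 65. Real keys are far larger, and a production implementation would also need padding and side-channel protections, which are outside the scope of this sketch.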