• Title/Summary/Keyword: computation-intensive


k-NN Join Based on LSH in Big Data Environment

  • Ji, Jiaqi;Chung, Yeongjee
    • Journal of information and communication convergence engineering
    • /
    • v.16 no.2
    • /
    • pp.99-105
    • /
    • 2018
  • k-Nearest neighbor join (k-NN Join) is a computationally intensive algorithm that is designed to find k-nearest neighbors from a dataset S for every object in another dataset R. Most related studies on k-NN Join are based on single-computer operations. As the data dimensions and data volume increase, running the k-NN Join algorithm on a single computer cannot generate results quickly. To solve this scalability problem, we introduce the locality-sensitive hashing (LSH) k-NN Join algorithm implemented in Spark, an approach for high-dimensional big data. LSH is used to map similar data onto the same bucket, which can reduce the data search scope. In order to achieve parallel implementation of the algorithm on multiple computers, the Spark framework is used to accelerate the computation of distances between objects in a cluster. Results show that our proposed approach is fast and accurate for high-dimensional and big data.
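To illustrate how bucketing narrows the search scope, here is a minimal single-machine sketch. It is not the authors' Spark implementation: the random-hyperplane hash family, the function names (`make_hash`, `lsh_knn_join`) and the parameters (`n_bits`, `seed`) are all assumptions chosen for illustration.

```python
import random
from collections import defaultdict


def make_hash(dim, n_bits, seed=0):
    """Build a sign-of-random-projection hash (an assumed LSH family)."""
    rng = random.Random(seed)
    planes = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_bits)]

    def h(v):
        # One bit per hyperplane: which side of the plane does v fall on?
        return tuple(
            int(sum(p_i * v_i for p_i, v_i in zip(p, v)) >= 0) for p in planes
        )

    return h


def lsh_knn_join(R, S, k, n_bits=4, seed=0):
    """For each object in R, return indices of up to k near neighbors in S.

    Only objects of S that share a bucket with the query are compared,
    so the result is approximate and may contain fewer than k entries.
    """
    h = make_hash(len(S[0]), n_bits, seed)
    buckets = defaultdict(list)
    for j, s in enumerate(S):
        buckets[h(s)].append(j)

    result = {}
    for i, r in enumerate(R):
        candidates = buckets.get(h(r), [])
        ranked = sorted(
            candidates,
            key=lambda j: sum((a - b) ** 2 for a, b in zip(r, S[j])),
        )
        result[i] = ranked[:k]
    return result
```

In a distributed setting the bucket key would presumably act as the partitioning key, so that distance computations for different buckets can run on different workers.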

A New Landsat Image Co-Registration and Outlier Removal Techniques

  • Kim, Jong-Hong;Heo, Joon;Sohn, Hong-Gyoo
    • Korean Journal of Remote Sensing
    • /
    • v.22 no.5
    • /
    • pp.439-443
    • /
    • 2006
  • Image co-registration is the process of overlaying two images of the same scene. One is the reference image, while the other (the sensed image) is geometrically transformed to match it. Numerous methods have been developed for automated image co-registration, which is known to be a time-consuming and/or computation-intensive procedure. In order to improve the efficiency and effectiveness of the co-registration of satellite imagery, this paper proposes pre-qualified area matching, which is composed of feature extraction with a Laplacian filter and an area matching algorithm using the correlation coefficient. Moreover, to improve the accuracy of co-registration, the outliers in the initial matching points should be removed. For this, two outlier detection techniques, studentized residuals and a modified RANSAC algorithm, are used in this study. Three pairs of Landsat images were used for the performance test, and the results were compared and evaluated in terms of robustness and efficiency.
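The correlation coefficient used for area matching can be sketched as follows. This is an illustrative helper only, assuming two equally sized grayscale patches given as nested lists; the name `correlation_coefficient` and the zero-variance convention are not from the paper.

```python
import math


def correlation_coefficient(patch_a, patch_b):
    """Pearson correlation between two equally sized image patches."""
    a = [x for row in patch_a for x in row]
    b = [x for row in patch_b for x in row]
    if len(a) != len(b):
        raise ValueError("patches must have the same number of pixels")

    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)

    # Assumed convention: a flat patch carries no texture to match on.
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)
```

Because the score is invariant to linear brightness changes, a candidate window in the sensed image that differs from the reference only by gain and offset still scores close to 1.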

Design of a DSP-Based Adaptive Controller for Real Time Dynamic Control of AM1 Robot

  • S. H. Han;K. S. Yoon;Lee, M. H.;Kim, S. K.
    • Institute of Control, Robotics and Systems: Conference Proceedings (제어로봇시스템학회:학술대회논문집)
    • /
    • 1998.10a
    • /
    • pp.100-104
    • /
    • 1998
  • This paper describes the real-time implementation of an adaptive controller for a robotic manipulator. Digital signal processors (DSPs) are special-purpose microprocessors that are particularly powerful for intensive numerical computations involving sums and products of variables. TMS320C50 chips are used to implement real-time adaptive control algorithms that provide enhanced motion for robotic manipulators. In the proposed scheme, adaptation laws are derived from the improved Lyapunov second stability analysis based on direct adaptive control theory. The adaptive controller consists of an adaptive feedforward controller and a feedback controller. The proposed control scheme is simple in structure, fast in computation, and suitable for real-time control. Moreover, this scheme requires neither accurate dynamic modeling nor values of manipulator parameters and payload. The performance of the adaptive controller is illustrated by simulation and experimental results for an assembly robot.

Accelerating 2D DCT in Multi-core and Many-core Environments (멀티코어와 매니코어 환경에서의 2 차원 DCT 가속)

  • Hong, Jin-Gun;Jung, Sung-Wook;Kim, Cheong-Ghil;Burgstaller, Bernd
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2011.04a
    • /
    • pp.250-253
    • /
    • 2011
  • Chip manufacturers have turned their attention from accelerating uniprocessors to integrating multiple cores on a chip. Moreover, desktop graphics hardware is now starting to support general-purpose computation. Desktop users can now use multi-core CPUs and GPUs as high-performance computing resources. However, exploiting parallel computing resources is still challenging because of the lack of higher-level programming abstractions for parallel programming. The 2-dimensional discrete cosine transform (2D-DCT) is the most computationally intensive part of JPEG encoding. Many fast 2D-DCT algorithms have already been studied. We implemented several algorithms and estimated their runtime in multi-core CPU and GPU environments. Experiments show that data parallelism can be fully exploited on CPU and GPU architectures. We expect parallelized DCT to bring performance benefits to applications such as JPEG and MPEG.
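As a reference point for what is being parallelized, the following is a plain row-column 2D DCT-II with orthonormal scaling. It is a sketch and not one of the fast algorithms the paper evaluates; the function names are illustrative.

```python
import math


def dct_1d(x):
    """Orthonormal 1D DCT-II of a sequence."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(
            x[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
            for i in range(n)
        )
        scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
        out.append(scale * s)
    return out


def dct_2d(block):
    """2D DCT computed separably: transform every row, then every column."""
    rows = [dct_1d(r) for r in block]
    cols = [dct_1d(list(c)) for c in zip(*rows)]
    # cols[j][u] holds coefficient (u, j); transpose back to row-major order.
    return [list(r) for r in zip(*cols)]
```

Each row transform is independent of the others, as is each column transform, which is the data parallelism a multi-core CPU or GPU implementation can exploit.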

Low Complexity Systolic Montgomery Multiplication over Finite Fields GF(2^m) (유한체상의 낮은 복잡도를 갖는 시스톨릭 몽고메리 곱셈)

  • Lee, Keonjik
    • Journal of Korea Society of Digital Industry and Information Management
    • /
    • v.18 no.1
    • /
    • pp.1-9
    • /
    • 2022
  • Galois field arithmetic is important in error-correcting codes and public-key cryptography schemes. Hardware realization of these schemes requires an efficient implementation of Galois field arithmetic operations. Multiplication is the main finite field operation, and designing an efficient multiplier can clearly affect the performance of compute-intensive applications. Diverse algorithms and hardware architectures for Galois field multiplication have been presented in the literature to reduce time and area. This paper presents a low-complexity semi-systolic multiplier that facilitates parallel processing by partitioning Montgomery modular multiplication (MMM) into two independent and identical units and using a two-level systolic computation scheme. Analytical results indicate that the proposed multiplier achieves lower area-time (AT) complexity than related multipliers. Moreover, the proposed method has regularity, concurrency, and modularity, and thus is well suited for VLSI implementation. It can be applied as a core circuit for multiplication and division/exponentiation.
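For readers unfamiliar with the operation, here is a bit-serial software sketch of Montgomery multiplication over GF(2^m), computing A·B·x^(-m) mod f(x). It does not model the paper's partitioned semi-systolic architecture. Polynomials are encoded as Python integers (bit i is the coefficient of x^i), and the names `mont_mul` and `gf_mul` are assumptions for illustration.

```python
def mont_mul(a, b, f, m):
    """Return a * b * x^(-m) mod f over GF(2).

    f is the field polynomial of degree m with constant term 1;
    a and b are polynomials of degree < m.
    """
    c = 0
    for i in range(m):
        if (a >> i) & 1:
            c ^= b          # add a_i * B (addition in GF(2) is XOR)
        if c & 1:
            c ^= f          # make c divisible by x without changing c mod f
        c >>= 1             # exact division by x
    return c


def gf_mul(a, b, f, m):
    """Plain shift-and-add multiplication mod f, used here only as a check."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if (a >> m) & 1:
            a ^= f          # reduce as soon as the degree reaches m
    return result
```

Multiplying the Montgomery product by x^m mod f recovers the ordinary product, which gives a simple way to check the routine against `gf_mul`.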

Comparative Analysis of Computation Times Based on the Number of Containers for CPU-Intensive Tasks in the Kubeflow Environment (Kubeflow 환경에서 CPU 집약적인 작업을 위한 컨테이너 수에 따른 연산 시간 비교 및 분석)

  • HyunSeung Jung;Taeshin Kang;Heonchang Yu;Jihun Kang
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2023.11a
    • /
    • pp.93-96
    • /
    • 2023
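  • As demand for machine learning has grown, so has demand for deploying machine learning workflows. Kubeflow makes machine learning deployment convenient, and Kubeflow Pipelines can distribute a single task across multiple containers for computation. However, increasing the number of containers does not necessarily improve performance. Therefore, to analyze the causes that limit performance improvement, this study distributed a CPU-intensive task across multiple containers in Kubeflow. Comparing and analyzing the computation completion times by number of containers, we confirmed that the speedup grows as the number of containers increases, but that past a certain point the speed gradually declines again. We attribute this primarily to resource limits that prevented all containers from being scheduled at the same time.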

Implementation of MPEG/Audio Decoder based on RISC Processor With Minimized DSP Accelerator (DSP 가속기가 내장된 RISC 프로세서 기반 MPEG/Audio 복호화기의 구현)

  • Bang Kyoung Ho;Lee Ken Sup;Park Young Cheol;Youn Dae Hee
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.29 no.12C
    • /
    • pp.1617-1622
    • /
    • 2004
  • An MPEG/Audio decoder for mobile multimedia systems requires low power consumption. Implementations of an AV decoder on a single RISC processor often consume a lot of power owing to cache misses when cache memory is insufficient. In this paper, we present an MPEG/Audio decoder for mobile handset applications and implement it on a RISC processor embedding a minimized DSP accelerator. The audio decoding algorithm is split into two parts: a computation-intensive part and a control-intensive part. These parts are allocated to the DSP and the RISC core, respectively, which are designed to run in parallel to increase processing efficiency. The proposed system implements MP3 and AAC decoders at 17 MHz and 24 MHz clocks, which are reductions of 48% and 40% in complexity compared with implementations on a single RISC processor. The proposed method is adequate for mobile multimedia applications with insufficient cache memory.

A Study on a large-scale materials simulation using a PC networked cluster (PC Network Cluster를 사용한 대규모 재료 시뮬레이션에 관한 연구)

  • Choi, Deok-Kee;Ryu, Han-Kyu
    • Journal of the Korean Society for Aeronautical & Space Sciences
    • /
    • v.30 no.5
    • /
    • pp.15-23
    • /
    • 2002
  • Because molecular dynamics requires high-performance computers or supercomputers to handle a huge amount of computation, only recently has its application to materials fracture simulations drawn attention from many researchers. With the recent advent of high-performance computers, computation-intensive methods have become more tractable than ever. However, carrying out materials simulations on high-performance computers is generally too costly. In this study, a PC cluster consisting of multiple commodity PCs is established, and computer simulations of materials with cracks are carried out on it using the molecular dynamics technique. The effect of the number of nodes, the speedup factors, and the communication time between nodes are measured to verify the performance of the PC cluster. Using the PC cluster, materials fracture simulations with more than 50,000 molecules are carried out successfully.

Towards efficient sharing of encrypted data in cloud-based mobile social network

  • Sun, Xin;Yao, Yiyang;Xia, Yingjie;Liu, Xuejiao;Chen, Jian;Wang, Zhiqiang
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.10 no.4
    • /
    • pp.1892-1903
    • /
    • 2016
  • Mobile social networks are becoming more and more popular with the development and spread of mobile devices and interpersonal sociality. As the amount of social data grows greatly and cloud computing techniques mature, the architecture of the mobile social network has evolved into a cloud-based one in which mobile clients send data to the cloud and make data accessible from clients. The data in the cloud should be stored in a secure fashion to protect user privacy and to restrict data sharing as defined by users. Ciphertext-policy attribute-based encryption (CP-ABE) is currently considered a promising security solution for encrypting sensitive data in cloud-based mobile social networks. However, its ciphertext size and decryption time grow linearly with the number of attributes in the access structure. In order to reduce the computing overhead borne by mobile devices, in this paper we propose a new Outsourcing decryption and Match-then-decrypt CP-ABE algorithm (OM-CP-ABE), which firstly outsources the computation-intensive bilinear pairing operations to a proxy, and secondly performs the decryption test on whether the attribute set matches the access policy in ciphertexts. The experimental performance assessments show the security strength and efficiency of the proposed solution in terms of computation, communication, and storage. Also, our construction is proven to be secure against replayable chosen-ciphertext attacks (RCCA) based on the decisional bilinear Diffie-Hellman (DBDH) assumption in the standard model.

A Study on the Fast Motion Estimation Coding by Moving Region Segmentation (동영역 분할에 의한 고속 움직임 추정 부호화에 관한 연구)

  • Lee, Bong-Ho;Choi, Kyung-Soo;Kwak, No-Youn;Hwang, Byong-Won
    • Journal of the Institute of Electronics Engineers of Korea SP
    • /
    • v.37 no.3
    • /
    • pp.88-97
    • /
    • 2000
  • This paper presents a motion estimation method using region segmentation information. Motion estimation, which is very difficult to implement in software alone because of its intensive computation cost, is implemented by special-purpose hardware in real-time applications. In this paper, we propose a region-based motion estimation algorithm that reduces the computation cost, compared with the FSMA algorithm, by using region segmentation information and setting a variable search window. Secondly, another proposed algorithm segments semantic regions such as faces, for selective coding and transfer of the semantic region using segmented region information. This work aims to improve the subjective quality of skin-color or face regions in pictures that have slow motion and are mainly composed of one or two speakers, as in video conference and video telephony applications.
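To show where the computation cost comes from, here is a sketch of exhaustive block matching with a sum-of-absolute-differences (SAD) cost. It assumes that FSMA refers to full-search block matching, which the abstract does not spell out; the function names and the fixed square search window are illustrative, and the paper's region-based variable window is not modeled.

```python
def sad(cur, ref, bx, by, dx, dy, size):
    """Sum of absolute differences between a block in cur and a shifted block in ref."""
    return sum(
        abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
        for y in range(size)
        for x in range(size)
    )


def full_search(cur, ref, bx, by, size, search):
    """Try every displacement within +/-search and return (cost, dx, dy) of the best."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # Skip displacements that would read outside the reference frame.
            if by + dy < 0 or by + dy + size > height:
                continue
            if bx + dx < 0 or bx + dx + size > width:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, size)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best
```

The number of candidate displacements grows with the square of the search range, which is why shrinking or adapting the window per region reduces the cost.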