• Title/Summary/Keyword: parallel algorithm (병렬 알고리즘)

Search Result 1,326, Processing Time 0.034 seconds

Development and Speed Comparison of Convolutional Neural Network Using CUDA (CUDA를 이용한 Convolutional Neural Network의 구현 및 속도 비교)

  • Ki, Cheol-min;Cho, Tai-Hoon
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference
    • /
    • 2017.05a
    • /
    • pp.335-338
    • /
    • 2017
  • Artificial intelligence and deep learning are currently the subject of wide public attention, and these technologies are being applied in many fields. Among the various artificial intelligence algorithms, the Convolutional Neural Network (CNN) is one that performs well. A CNN adds convolution layers, which extract features through the convolution operation, to a general neural network. When a CNN is trained on a small amount of data, or when its layer structure is simple, speed is not a concern. However, training time grows long when the training data is large and the layer structure is complex, so GPU-based parallel processing is widely used. In this paper, we implemented a Convolutional Neural Network using CUDA, and its training was faster and more efficient than with the CPU-based method.

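
The convolution operation that the abstract above describes can be sketched in a few lines. This is a minimal illustration in plain Python, not the authors' CUDA implementation; the function name `conv2d_valid` and the sample data are assumptions made for the example. As is common in CNN code, the kernel is applied without flipping.

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and return the feature map."""
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    oh, ow = ih - kh + 1, iw - kw + 1
    out = [[0] * ow for _ in range(oh)]
    for r in range(oh):          # every output position (r, c) is independent,
        for c in range(ow):      # so a GPU version could give each one its own thread
            acc = 0
            for i in range(kh):
                for j in range(kw):
                    acc += image[r + i][c + j] * kernel[i][j]
            out[r][c] = acc
    return out


# Hypothetical 3x3 input and 2x2 kernel, chosen only for illustration.
image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
kernel = [[1, 0], [0, -1]]
print(conv2d_valid(image, kernel))  # [[-4, -4], [-4, -4]]
```

Because no output element depends on another, the two outer loops are the natural place to parallelize, which is one reason this workload suits a GPU.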

A Semi-MMIC Hair-pin Resonator Oscillator for K-Band Application (K-Band용 Semi-MMIC Hair-pin 공진 발진기)

  • 이현태;이종철;김종헌;김남영;김복기;홍의석
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.25 no.8B
    • /
    • pp.1493-1498
    • /
    • 2000
  • In this paper, we introduce a modified interference cancellation scheme to overcome MAI (multiple access interference) in DS-CDMA. Among ICs (Interference Cancellers), the PIC (Parallel IC) requires more complexity, and the SIC (Successive IC) suffers from a long delay. The adaptive detector, in turn, achieves good BER performance using an adaptive filter driven by an iterative algorithm, but it requires many iterations. To resolve these problems, we propose an improved adaptive detector in which the received signal, with MAI removed through a sorting scheme and a cancellation method, is fed into the adaptive filter. Because an improved input signal is fed into the adaptive filter, the detector achieves the same BER performance with fewer iterations than the conventional adaptive detector, and it requires less complexity than the other detectors.


A Study on the Control Algorithm for Engine Clutch Engagement During Mode Change of Plug-in Hybrid Electric Vehicles (플러그인 하이브리드 차량의 모드변환에 따른 엔진클러치 접합 제어알고리즘 연구)

  • Sim, Kyuhyun;Lee, Suji;Namkoong, Choul;Lee, Ji-Suk;Han, Kwan-Soo;Hwang, Sung-Ho
    • Transactions of the Korean Society of Mechanical Engineers A
    • /
    • v.40 no.9
    • /
    • pp.801-805
    • /
    • 2016
  • In this paper, engine clutch engagement shock is analyzed during the mode change of plug-in hybrid electric vehicles. The driving modes include the EV (electric vehicle) mode, the HEV (hybrid electric vehicle) mode, and the engine operating mode. Depending on the mode change, the engine clutch is either engaged or disengaged. The magnitude of shock during clutch engagement is very important because it impacts vehicle acceleration and clutch synchronization speed, which affects ride comfort substantially. A performance simulator for plug-in hybrid electric vehicles was developed using MATLAB/Simulink. The simulation results show that a mode change control algorithm is necessary for minimizing shock during clutch engagement.

High-Performance Line-Based Filtering Architecture Using Multi-Filter Lifting Method (다중필터 리프팅 방식을 이용한 고성능 라인기반 필터링 구조)

  • 서영호;김동욱
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.41 no.8
    • /
    • pp.75-84
    • /
    • 2004
  • In this paper, we propose an efficient hardware architecture of the line-based lifting algorithm for Motion JPEG2000. We propose a new architecture for a lifting-based filtering cell with an optimized and simplified structure. It was implemented in hardware that accommodates both the (9,7) and (5,4) filters. Since the output rate is linearly proportional to the input rate, high throughput can be obtained through parallel operation simply by adding hardware units. The design was implemented in both ASIC and FPGA. The 0.35 $\mu\mathrm{m}$ CMOS library from Samsung was used for the ASIC, and Altera was the target for the FPGA. In ASIC, the proposed architecture used 41,592 gates for the lifting arithmetic and 128 Kbit of memory. For FPGA, it used 6,520 LEs (Logic Elements) and 128 ESBs (Embedded System Blocks). The implementations operated stably at clock frequencies of 128 MHz and 52 MHz, respectively.
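
To show what a lifting-based filter computes, here is a minimal one-dimensional sketch. The abstract gives no filter coefficients, so the example swaps in the reversible integer 5/3 lifting steps known from JPEG 2000 as an assumption; the function names and the symmetric boundary handling are likewise illustrative and are not taken from the paper's hardware design.

```python
def lifting_53_forward(x):
    """One level of integer 5/3 lifting on an even-length list of ints."""
    assert len(x) >= 2 and len(x) % 2 == 0
    even, odd = x[0::2], x[1::2]
    n = len(odd)
    # Predict step: high-pass = odd sample minus the mean of its even neighbours
    # (the right neighbour is mirrored at the boundary).
    d = [odd[i] - ((even[i] + even[min(i + 1, n - 1)]) >> 1) for i in range(n)]
    # Update step: low-pass = even sample plus a rounded quarter of adjacent details
    # (the left detail is mirrored at the boundary).
    s = [even[i] + ((d[max(i - 1, 0)] + d[i] + 2) >> 2) for i in range(n)]
    return s, d


def lifting_53_inverse(s, d):
    """Undo lifting_53_forward by running the same steps in reverse order."""
    n = len(s)
    even = [s[i] - ((d[max(i - 1, 0)] + d[i] + 2) >> 2) for i in range(n)]
    odd = [d[i] + ((even[i] + even[min(i + 1, n - 1)]) >> 1) for i in range(n)]
    x = [0] * (2 * n)
    x[0::2], x[1::2] = even, odd
    return x


# Hypothetical sample row, chosen only for illustration.
row = [10, 12, 14, 13, 9, 7, 8, 20]
low, high = lifting_53_forward(row)
assert lifting_53_inverse(low, high) == row  # lifting is exactly invertible
```

Each step only adds or subtracts a function of the other half of the samples, which is why the inverse is obtained by reversing the steps and why the cell maps to simple adder and shifter hardware.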

High-speed Design of 8-bit Architecture of AES Encryption (AES 암호 알고리즘을 위한 고속 8-비트 구조 설계)

  • Lee, Je-Hoon;Lim, Duk-Gyu
    • Convergence Security Journal
    • /
    • v.17 no.2
    • /
    • pp.15-22
    • /
    • 2017
  • This paper presents a new 8-bit implementation of AES. Most typical 8-bit AES designs reduce circuit area by sacrificing throughput. The presented AES architecture employs two separate S-boxes to perform the round operation and key generation in parallel. According to simulation results for the proposed AES-128, the maximum critical path delay is 13.0 ns. It can operate at 77 MHz, and its throughput is 15.2 Mbps. Consequently, the throughput of the proposed AES is 1.54 times higher than that of its counterpart, while the area increase is limited to 1.17 times. The proposed AES design enables a very low-area design without sacrificing performance. It is therefore suitable for various IoT applications that need high-speed communication.
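
The figures quoted in this abstract can be cross-checked with simple arithmetic. The sketch below is only a consistency check under the assumption that throughput equals block size times clock rate divided by cycles per block; the cycle count it derives is an estimate and is not stated in the abstract, and the function names are hypothetical.

```python
def max_clock_mhz(critical_path_ns):
    """Highest clock rate (MHz) allowed by a given critical path delay (ns)."""
    return 1000.0 / critical_path_ns


def implied_cycles_per_block(clock_mhz, throughput_mbps, block_bits=128):
    """Clock cycles per block implied by a clock rate and a throughput figure."""
    return block_bits * clock_mhz / throughput_mbps


clock = max_clock_mhz(13.0)                      # about 76.9 MHz, consistent with 77 MHz
cycles = implied_cycles_per_block(77.0, 15.2)    # about 648 cycles per 128-bit block
print(round(clock, 1), round(cycles))
```

A cycle count of this size is plausible for a byte-serial datapath, where each round processes the 16 state bytes one at a time.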

Hierarchical Visualization of Cloud-Based Social Network Service Using Fuzzy (퍼지를 이용한 클라우드 기반의 소셜 네트워크 서비스 계층적 시각화)

  • Park, Sun;Kim, Yong-Il;Lee, Seong Ro
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.38B no.7
    • /
    • pp.501-511
    • /
    • 2013
  • Recently, visualization methods for social network services have focused only on presenting network data; they do not consider efficient processing speed or the computational complexity that increases arithmetically as social network big data grows. This paper proposes a cloud-based visualization method that visualizes a user-focused hierarchical relationship between user nodes on a social network. The proposed method makes a user's social relationships intuitively understandable, since it uses fuzzy methods to represent the hierarchical relationship of user nodes in the social network. It can also easily identify key-role relationships among users on the social network. In addition, the method uses cloud-based Hadoop and Hive for distributed parallel processing of the visualization algorithm, which speeds up the processing of social network big data.

A Design of Turbo Decoder using MAP Algorithm (MAP 알고리즘을 이용한 터보 복호화기 설계)

  • 권순녀;이윤현
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.7 no.8
    • /
    • pp.1854-1863
    • /
    • 2003
  • In recent digital communication systems, the performance of turbo codes, a form of error correction coding, depends on the interleaver, which influences the determination of the free distance, and on the recursive decoding algorithm executed in the turbo decoder. However, performance depends on the interleaver depth, which introduces considerable delay in the reception process. Moreover, turbo codes are known as a robust and reliable coding method over fading channels. The International Telecommunication Union (ITU) recently adopted them as the channel coding standard for third-generation mobile communications (IMT-2000). Therefore, in this paper, we propose an interleaver that performs better than the existing block interleaver, and a modified turbo decoder with a parallel concatenated structure using the MAP algorithm. For real-time voice and video services over third-generation mobile communications, the performance of the two proposed methods was analyzed and compared with existing methods by computer simulation, in terms of reduced decoding delay using the variable decoding method over AWGN and fading channels in CDMA environments.
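
For reference, the conventional block interleaver that the abstract uses as its baseline can be sketched as follows. This shows only the baseline (write row by row, read column by column), not the interleaver proposed in the paper; the function names and the 2x3 example are assumptions.

```python
def block_interleave(symbols, rows, cols):
    """Write symbols into a rows x cols grid row by row, then read column by column."""
    assert len(symbols) == rows * cols
    return [symbols[r * cols + c] for c in range(cols) for r in range(rows)]


def block_deinterleave(symbols, rows, cols):
    """Inverse of block_interleave for the same grid dimensions."""
    assert len(symbols) == rows * cols
    out = [None] * (rows * cols)
    for k, sym in enumerate(symbols):
        c, r = divmod(k, rows)      # position k was read from column c, row r
        out[r * cols + c] = sym
    return out


data = [0, 1, 2, 3, 4, 5]
mixed = block_interleave(data, rows=2, cols=3)
print(mixed)                                   # [0, 3, 1, 4, 2, 5]
assert block_deinterleave(mixed, 2, 3) == data
```

Neighbouring input symbols end up `rows` positions apart after interleaving, so a burst of channel errors is spread out before decoding. The whole block must be received before it can be read out, which is the delay the abstract attributes to interleaver depth.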

The implementation of interface between industrial PC and PLC for multi-camera vision systems (멀티카메라 비전시스템을 위한 산업용 PC와 PLC간 제어 방법 개발)

  • Kim, Hyun Soo
    • Journal of the Korea Academia-Industrial cooperation Society
    • /
    • v.17 no.1
    • /
    • pp.453-458
    • /
    • 2016
  • One of the most common applications of machine vision is quality inspections in automated production. In this study, a welding inspection system that is controlled by a PC and a PLC equipped with a multi-camera setup was developed. The system was designed to measure the primary dimensions, such as the length and width of the welding areas. The TCP/IP protocols and multi-threading techniques were used for parallel control of the optical components and physical distribution. A coaxial light was used to maintain uniform lighting conditions and enhance the image quality of the weld areas. The core image processing system was established through a combination of various algorithms from the OpenCV library. The proposed vision inspection system was fully validated for an actual weld production line and was shown to satisfy the functional and performance requirements.

Non-Photorealistic Rendering Using CUDA-Based Image Segmentation (CUDA 기반 영상 분할을 사용한 비사실적 렌더링)

  • Yoon, Hyun-Cheol;Park, Jong-Seung
    • KIPS Transactions on Software and Data Engineering
    • /
    • v.4 no.11
    • /
    • pp.529-536
    • /
    • 2015
  • When rendering both three-dimensional objects and photo images together, the non-photorealistic rendering results are in visual discord since the two contents have their own independent color distributions. This paper proposes a non-photorealistic rendering technique which renders both three-dimensional objects and photo images in styles such as cartoons and sketches. The proposed technique computes the color distribution property of the photo images and reduces the number of colors of both photo images and 3D objects. NPR is performed based on the reduced colormaps and edge features. To enhance the natural scene presentation, an image region segmentation process is preferred when extracting and applying colormaps. However, image segmentation requires many computational operations, so non-photorealistic rendering of large frames takes a long time. To speed up the time-consuming segmentation procedure, we use GPGPU to run it in parallel on the GPU. As a result, we significantly improve the execution speed of the algorithm.

Hardware/Software Partitioning Methodology for Reconfigurable System (재구성형 시스템을 위한 하드웨어/소프트웨어 분할 기법)

  • Kim, Jun-Yong;Ahn, Seong-Yong;Lee, Jeong-A.
    • The KIPS Transactions:PartA
    • /
    • v.11A no.5
    • /
    • pp.303-312
    • /
    • 2004
  • In this paper, we propose a methodology for solving the hardware-software partitioning problem in reconfigurable systems using Y-chart design space exploration, and we implement a simulator according to the methodology. The methodology generates a mapping set between tasks and hardware elements using the hardware element model and the application model. We evaluate the throughput by simulating the cases in each mapping set. With the throughput evaluation result, we can select the mapping case with the highest throughput. We also propose a heuristic that improves the simulation time by reducing the mapping set on the basis of the relationship between workload and parallelism. Simulation results show that we can reduce the size of the mapping set, which poses difficulties for hardware-software partitioning, by up to 80%.
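
The idea of generating a mapping set and selecting the mapping with the highest throughput can be illustrated with a small exhaustive search. The throughput model here (one over the busiest element's load), the task workloads, and the element speeds are all hypothetical stand-ins for the paper's simulator, and the sketch does not include the paper's heuristic for shrinking the mapping set.

```python
from itertools import product


def throughput(mapping, workloads, speeds):
    """Toy model: throughput is limited by the most heavily loaded element."""
    load = {elem: 0.0 for elem in speeds}
    for task, elem in mapping.items():
        load[elem] += workloads[task] / speeds[elem]
    return 1.0 / max(load.values())


def best_mapping(workloads, speeds):
    """Enumerate every task-to-element mapping and keep the one with top throughput."""
    tasks, elems = list(workloads), list(speeds)
    best, best_tp = None, 0.0
    for choice in product(elems, repeat=len(tasks)):
        mapping = dict(zip(tasks, choice))
        tp = throughput(mapping, workloads, speeds)
        if tp > best_tp:
            best, best_tp = mapping, tp
    return best, best_tp


# Hypothetical workloads (work units) and element speeds (units per time step).
workloads = {"a": 4, "b": 2, "c": 2}
speeds = {"cpu": 1.0, "fpga": 2.0}
mapping, tp = best_mapping(workloads, speeds)
print(mapping, tp)
```

The mapping set grows as the number of elements raised to the number of tasks, which is why pruning it, as the abstract describes, matters for simulation time.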