• Title/Summary/Keyword: 1단 병렬 시스템

Search Result 77, Processing Time 0.021 seconds

Fast Hologram Generating of 3D Object with Super Multi-Light Source using Parallel Distributed Computing (병렬 분산 컴퓨팅을 이용한 초다광원 3차원 물체의 홀로그램 고속 생성)

  • Song, Joongseok;Kim, Changseob;Park, Jong-Il
    • Journal of Broadcast Engineering
    • /
    • v.20 no.5
    • /
    • pp.706-717
    • /
    • 2015
  • The computer generated hologram (CGH) method is the technology which can generate a hologram by using only a personal computer (PC) commonly used. However, the CGH method requires a huge amount of calculational time for the 3D object with a super multi-light source or a high-definition hologram. Hence, some solutions are obviously necessary for reducing the computational complexity of a CGH algorithm or increasing the computing performance of hardware. In this paper, we propose a method which can generate a digital hologram of the 3D object with a super multi-light source using parallel distributed computing. The traditional methods has the limitation of improving CGH performance by using a single PC. However, the proposed method where a server PC efficiently uses the computing power of client PCs can quickly calculate the CGH method for 3D object with super multi-light source. In the experimental result, we verified that the proposed method can generate the digital hologram with 1,5361,536 resolution size of 3D object with 157,771 light source in 121 ms. In addition, in the proposed method, we verify that the proposed method can reduce generation time of a digital hologram in proportion to the number of client PCs.

Design of Binary Constant Envelope System using the Pre-Coding Scheme in the Multi-User CDMA Communication System (다중 사용자 CDMA 통신 시스템에서 프리코딩 기법을 사용한 2진 정진폭 시스템 설계)

  • 김상우;유흥균;정순기;이상태
    • The Journal of Korean Institute of Electromagnetic Engineering and Science
    • /
    • v.15 no.5
    • /
    • pp.486-492
    • /
    • 2004
  • In this paper, we newly propose the binary CA-CDMA(constant amplitude CDMA) system using pre-coding method to solve the high PAPR problem caused by multi-user signal transmission in the CDMA system. 4-user CA-CDMA, the basis of proposed binary CA-CDMA system, makes binary output signal for 4 input users. It produces the output of binary(${\pm}$2) amplitude by using a parity signal resulting from the XOR operation of 4 users data. Another sub-channel or more bandwidth is not necessary because it is transmitted together with user data and can be easily recovered in the receiver. The extension of the number of users can be possible by the simple repetition of the basic binary 4-user CA-CDMA. For example, binary 16-user CA-CDMA is made easily by allocating the four 4-user CA-CDMA systems in parallel and leading the four outputs to the fifth 4-user CA-CDMA system as input, because the output signal of each 4-user CA-CDMA is also binary. By the same extension procedure, binary 64 and 256-user CA-CDMA systems can be made with the constant amplitude. As a result, the code rate of this proposed CA-CDMA system is just 1 and binary CA-CDMA does not change the transmission rate with the constant output signal(PAPR = 0 ㏈). Therefore, the power efficiency of the HPA can be maximized without the nonlinear distortion. From the simulation results, it is verified that the conventional CDMA system has multi-level output signal, but the proposed binary CA-CDMA system always produces binary output. And it is also found that the BER of conventional CDMA system is increased by nonlinear HPA, but the BER of proposed binary CA-CDMA system is not changed.

An Omnidirectional High Gain Antenna for UHF Band Ground Station (UHF대역 지상국용 무지향 고이득 안테나)

  • Bae, Ki-Hyoung;Chang, Min-Soo;Joo, Jae-Woo;Hwang, Chan-Ho;Hong, Ki-Pyo
    • Journal of the Korea Knowledge Information Technology Society
    • /
    • v.12 no.4
    • /
    • pp.539-550
    • /
    • 2017
  • In this paper, we designed, fabricated and tested an UHF band cylindrical dipole array antenna. In the proposed antenna, cylindrical dipoles were vertically arranged in four stages. A parallel structure feeding circuit was installed inside the cylindrical dipole and mounted so as to be broadband matching. The feeding circuit was installed at the center of the cylindrical dipole to optimize the gain flatness characteristic of the azimuth direction omnidirectional radiation pattern. Minimizing the difference between the signals branched from the feeding circuit and realizing the symmetry of the radiation pattern. The required specifications are more than 11.2% bandwidth in UHF band, above 6dBi antenna gain, standing wave ratio of 2:1 or less, less than ${\pm}1dB$ gain flatness in azimuth radiation pattern, more than 13 degrees in elevation radiation pattern of 3dB beamwidth. We confirmed the possibility of implementation through M&S and verified the result of M&S through production and testing. The test results are 11.2% bandwidth in the UHF band, 6.30 to 8.31 dBi gain, 1.53:1 standing wave ratio or less, within ${\pm}0.2dB$ gain flatness in the azimuth radiation pattern, elevation radiation pattern of 3dB beam width was 15.62 to 15.84 degrees. The test result meets all requirements specifications.

An implementation of 2D/3D Complex Optical System and its Algorithm for High Speed, Precision Solder Paste Vision Inspection (솔더 페이스트의 고속, 고정밀 검사를 위한 이차원/삼차원 복합 광학계 및 알고리즘 구현)

  • 조상현;최흥문
    • Journal of the Institute of Electronics Engineers of Korea SP
    • /
    • v.41 no.3
    • /
    • pp.139-146
    • /
    • 2004
  • A 2D/3D complex optical system and its vision inspection algerian is proposed and implemented as a single probe system for high speed, precise vision inspection of the solder pastes. One pass un length labeling algorithm is proposed instead of the conventional two pass labeling algorithm for fast extraction of the 2D shape of the solder paste image from the recent line-scan camera as well as the conventional area-scan camera, and the optical probe path generation is also proposed for the efficient 2D/3D inspection. The Moire interferometry-based phase shift algerian and its optical system implementation is introduced, instead of the conventional laser slit-beam method, for the high precision 3D vision inspection. All of the time-critical algorithms are MMX SIMD parallel-coded for further speedup. The proposed system is implemented for simultaneous 2D/3D inspection of 10mm${\times}$10mm FOV with resolutions of 10 ${\mu}{\textrm}{m}$ for both x, y axis and 1 ${\mu}{\textrm}{m}$ for z axis. Experiments conducted on several nBs show that the 2D/3D inspection of an FOV, excluding an image capturing, results in high speed of about 0.011sec/0.01sec, respectively, after image capturing, with $\pm$1${\mu}{\textrm}{m}$ height accuracy.

Acceleration of computation speed for elastic wave simulation using a Graphic Processing Unit (그래픽 프로세서를 이용한 탄성파 수치모사의 계산속도 향상)

  • Nakata, Norimitsu;Tsuji, Takeshi;Matsuoka, Toshifumi
    • Geophysics and Geophysical Exploration
    • /
    • v.14 no.1
    • /
    • pp.98-104
    • /
    • 2011
  • Numerical simulation in exploration geophysics provides important insights into subsurface wave propagation phenomena. Although elastic wave simulations take longer to compute than acoustic simulations, an elastic simulator can construct more realistic wavefields including shear components. Therefore, it is suitable for exploration of the responses of elastic bodies. To overcome the long duration of the calculations, we use a Graphic Processing Unit (GPU) to accelerate the elastic wave simulation. Because a GPU has many processors and a wide memory bandwidth, we can use it in a parallelised computing architecture. The GPU board used in this study is an NVIDIA Tesla C1060, which has 240 processors and a 102 GB/s memory bandwidth. Despite the availability of a parallel computing architecture (CUDA), developed by NVIDIA, we must optimise the usage of the different types of memory on the GPU device, and the sequence of calculations, to obtain a significant speedup of the computation. In this study, we simulate two- (2D) and threedimensional (3D) elastic wave propagation using the Finite-Difference Time-Domain (FDTD) method on GPUs. In the wave propagation simulation, we adopt the staggered-grid method, which is one of the conventional FD schemes, since this method can achieve sufficient accuracy for use in numerical modelling in geophysics. Our simulator optimises the usage of memory on the GPU device to reduce data access times, and uses faster memory as much as possible. This is a key factor in GPU computing. By using one GPU device and optimising its memory usage, we improved the computation time by more than 14 times in the 2D simulation, and over six times in the 3D simulation, compared with one CPU. Furthermore, by using three GPUs, we succeeded in accelerating the 3D simulation 10 times.

Transfer Learning using Multiple ConvNet Layers Activation Features with Principal Component Analysis for Image Classification (전이학습 기반 다중 컨볼류션 신경망 레이어의 활성화 특징과 주성분 분석을 이용한 이미지 분류 방법)

  • Byambajav, Batkhuu;Alikhanov, Jumabek;Fang, Yang;Ko, Seunghyun;Jo, Geun Sik
    • Journal of Intelligence and Information Systems
    • /
    • v.24 no.1
    • /
    • pp.205-225
    • /
    • 2018
  • Convolutional Neural Network (ConvNet) is one class of the powerful Deep Neural Network that can analyze and learn hierarchies of visual features. Originally, first neural network (Neocognitron) was introduced in the 80s. At that time, the neural network was not broadly used in both industry and academic field by cause of large-scale dataset shortage and low computational power. However, after a few decades later in 2012, Krizhevsky made a breakthrough on ILSVRC-12 visual recognition competition using Convolutional Neural Network. That breakthrough revived people interest in the neural network. The success of Convolutional Neural Network is achieved with two main factors. First of them is the emergence of advanced hardware (GPUs) for sufficient parallel computation. Second is the availability of large-scale datasets such as ImageNet (ILSVRC) dataset for training. Unfortunately, many new domains are bottlenecked by these factors. For most domains, it is difficult and requires lots of effort to gather large-scale dataset to train a ConvNet. Moreover, even if we have a large-scale dataset, training ConvNet from scratch is required expensive resource and time-consuming. These two obstacles can be solved by using transfer learning. Transfer learning is a method for transferring the knowledge from a source domain to new domain. There are two major Transfer learning cases. First one is ConvNet as fixed feature extractor, and the second one is Fine-tune the ConvNet on a new dataset. In the first case, using pre-trained ConvNet (such as on ImageNet) to compute feed-forward activations of the image into the ConvNet and extract activation features from specific layers. In the second case, replacing and retraining the ConvNet classifier on the new dataset, then fine-tune the weights of the pre-trained network with the backpropagation. In this paper, we focus on using multiple ConvNet layers as a fixed feature extractor only. However, applying features with high dimensional complexity that is directly extracted from multiple ConvNet layers is still a challenging problem. We observe that features extracted from multiple ConvNet layers address the different characteristics of the image which means better representation could be obtained by finding the optimal combination of multiple ConvNet layers. Based on that observation, we propose to employ multiple ConvNet layer representations for transfer learning instead of a single ConvNet layer representation. Overall, our primary pipeline has three steps. Firstly, images from target task are given as input to ConvNet, then that image will be feed-forwarded into pre-trained AlexNet, and the activation features from three fully connected convolutional layers are extracted. Secondly, activation features of three ConvNet layers are concatenated to obtain multiple ConvNet layers representation because it will gain more information about an image. When three fully connected layer features concatenated, the occurring image representation would have 9192 (4096+4096+1000) dimension features. However, features extracted from multiple ConvNet layers are redundant and noisy since they are extracted from the same ConvNet. Thus, a third step, we will use Principal Component Analysis (PCA) to select salient features before the training phase. When salient features are obtained, the classifier can classify image more accurately, and the performance of transfer learning can be improved. To evaluate proposed method, experiments are conducted in three standard datasets (Caltech-256, VOC07, and SUN397) to compare multiple ConvNet layer representations against single ConvNet layer representation by using PCA for feature selection and dimension reduction. Our experiments demonstrated the importance of feature selection for multiple ConvNet layer representation. Moreover, our proposed approach achieved 75.6% accuracy compared to 73.9% accuracy achieved by FC7 layer on the Caltech-256 dataset, 73.1% accuracy compared to 69.2% accuracy achieved by FC8 layer on the VOC07 dataset, 52.2% accuracy compared to 48.7% accuracy achieved by FC7 layer on the SUN397 dataset. We also showed that our proposed approach achieved superior performance, 2.8%, 2.1% and 3.1% accuracy improvement on Caltech-256, VOC07, and SUN397 dataset respectively compare to existing work.

ATM Cell Encipherment Method using Rijndael Algorithm in Physical Layer (Rijndael 알고리즘을 이용한 물리 계층 ATM 셀 보안 기법)

  • Im Sung-Yeal;Chung Ki-Dong
    • The KIPS Transactions:PartC
    • /
    • v.13C no.1 s.104
    • /
    • pp.83-94
    • /
    • 2006
  • This paper describes ATM cell encipherment method using Rijndael Algorithm adopted as an AES(Advanced Encryption Standard) by NIST in 2001. ISO 9160 describes the requirement of physical layer data processing in encryption/decryption. For the description of ATM cell encipherment method, we implemented ATM data encipherment equipment which satisfies the requirements of ISO 9160, and verified the encipherment/decipherment processing at ATM STM-1 rate(155.52Mbps). The DES algorithm can process data in the block size of 64 bits and its key length is 64 bits, but the Rijndael algorithm can process data in the block size of 128 bits and the key length of 128, 192, or 256 bits selectively. So it is more flexible in high bit rate data processing and stronger in encription strength than DES. For tile real time encryption of high bit rate data stream. Rijndael algorithm was implemented in FPGA in this experiment. The boundary of serial UNI cell was detected by the CRC method, and in the case of user data cell the payload of 48 octets (384 bits) is converted in parallel and transferred to 3 Rijndael encipherment module in the block size of 128 bits individually. After completion of encryption, the header stored in buffer is attached to the enciphered payload and retransmitted in the format of cell. At the receiving end, the boundary of ceil is detected by the CRC method and the payload type is decided. n the payload type is the user data cell, the payload of the cell is transferred to the 3-Rijndael decryption module in the block sire of 128 bits for decryption of data. And in the case of maintenance cell, the payload is extracted without decryption processing.