Search | Korea Science

A Benchmark of Micro Parallel Computing Technology for Real-time Control in Smart Farm (MPICH vs OpenMP) (스마트 시설환경 실시간 제어를 위한 마이크로 병렬 컴퓨팅 기술 분석)

Min, Jae-Ki;Lee, DongHoon
- Proceedings of the Korean Society for Agricultural Machinery Conference / 2017.04a / pp.161-161 / 2017
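The control elements of a smart greenhouse environment combine factors directly involved in regulating the environment, such as heaters, window openers, water and nutrient-solution valves, ventilation fans, and dehumidifiers, with elements indirectly related to control, such as communication for information exchange and user interfaces. Control based on mathematical logic, such as PID control, can coexist with control by nonlinear learning models built on the knowledge of expert managers. Conventional sequence-based control may be limited in its ability to link such diverse elements together. The conventional approach of determining the amount and timing of control from sufficient data acquired over a time series may not cope adequately with exceptional situations. Such situations can be divided into those that arise unavoidably from changes in natural conditions and those caused by system errors. This study investigated a method to complement control systems prepared from mathematical, predictable logic by analyzing in real time the various environmental factors that change within the facility and performing the corresponding control. High-performance computing (HPC) was in the past an advanced technology that raised computational capability by linking many computers over a high-speed network, and it required large investment in both cost and scale. With the development of mobile phones and mobile devices, small microprocessors have advanced, and application processors (APs) with clock speeds reaching 2 GHz have recently appeared. We studied ways to apply APs, whose advantages are low power consumption and a small platform despite relatively low performance, to real-time control of facility environments. Three systems differing in CPU clock, memory size, and number of cores were compared to evaluate the performance of AP-based micro clustering: 1) 1.5 GHz, 8 processors, 32 cores, 1 GByte/processor, 32-bit Linux (ARMv7l); 2) 2.0 GHz, 4 processors, 32 cores, 2 GByte/processor, 32-bit Linux (ARMv7l); 3) 1.5 GHz, 8 processors, 32 cores, 2 GByte/processor, 64-bit Linux (AArch64). MPICH (www.mpich.org) and OpenMP (www.openmp.org) were used as the development libraries for parallel computing. Finding the primes among the integers up to 2,500,000,000 took 1) 17 s, 2) 13 s, and 3) 3 s, and a two-dimensional FFT on a $12800 \times 12800$ matrix took 1) 10 s, 2) 8 s, and 3) 2 s. Case 3 can be judged faster than a commercial desktop with a clock speed of 3 GHz. The results were approximately the same for both libraries. When three-dimensional linear interpolation was performed at 1-second intervals on three-dimensional measurement data acquired in a previous study, the systems gave nearly identical results with 4 or fewer cores, while with 8 or more cores the trend was similar to the results above. We will continue to study AP-based micro clustering technology, taking into account field deployability, construction cost, and power consumption.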
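The prime-counting benchmark in this abstract lends itself to a simple decomposition: split the integer range into chunks and count the primes in each chunk independently, which is roughly how the work could be divided among MPI ranks or OpenMP threads. The sketch below is illustrative only and is not the authors' code. The function names, the chunking scheme, and the use of Python threads as a stand-in for MPICH or OpenMP are all assumptions.

```python
from concurrent.futures import ThreadPoolExecutor
from math import isqrt


def base_primes(limit):
    """Return all primes <= limit with a plain sieve of Eratosthenes."""
    if limit < 2:
        return []
    flags = bytearray([1]) * (limit + 1)
    flags[0] = flags[1] = 0
    for p in range(2, isqrt(limit) + 1):
        if flags[p]:
            # Clear every multiple of p starting at p*p.
            flags[p * p :: p] = bytearray(len(range(p * p, limit + 1, p)))
    return [i for i, is_prime in enumerate(flags) if is_prime]


def count_primes_in_chunk(lo, hi, primes):
    """Count primes in [lo, hi) with a segmented sieve over base primes."""
    lo = max(lo, 2)
    if hi <= lo:
        return 0
    flags = bytearray([1]) * (hi - lo)
    for p in primes:
        if p * p >= hi:
            break
        # First multiple of p inside the chunk, never below p*p.
        start = max(p * p, (lo + p - 1) // p * p)
        for multiple in range(start, hi, p):
            flags[multiple - lo] = 0
    return sum(flags)


def count_primes(n, workers=4):
    """Count primes <= n, one chunk per worker (hypothetical layout)."""
    primes = base_primes(isqrt(n))
    step = (n + 1 + workers - 1) // workers
    chunks = [(lo, min(lo + step, n + 1)) for lo in range(0, n + 1, step)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial = pool.map(
            lambda c: count_primes_in_chunk(c[0], c[1], primes), chunks
        )
        return sum(partial)
```

Because of Python's global interpreter lock the threads here show only the decomposition, not a real speedup; in an MPI program each chunk would go to a separate rank and the partial counts would be combined with a reduction.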

Real-Time Tracking of Moving Object by Adaptive Search in Spatial-temporal Spaces (시공간 적응탐색에 의한 실시간 이동물체 추적)

Kim, Gye-Young;Choi, Hyung-Ill
- Journal of the Korean Institute of Telematics and Electronics B / v.31B no.11 / pp.63-77 / 1994
This paper describes a real-time system that analyzes a sequence of images to extract motion information about a moving object and controls servo equipment so that the object always stays at the center of the image frame. An image is a large two-dimensional signal, so analyzing an entire image takes a long time. In particular, the time needed to load pixels from memory to the processor increases exponentially as the image size increases. To solve this problem and track a moving object in real time, this paper addresses how to search the spatial and time domains selectively. Based on this selective search, the paper suggests several techniques that are essential for implementing a real-time tracking system. Specifically, it describes how to detect the entrance of a moving object into the camera's field of view and the direction of entry, how to determine the time interval between adjacent images, how to determine the nonstationary areas formed by a moving object and calculate the object's velocity and position from those areas, how to control servo equipment to keep the moving object at the center of the image frame, and how to adjust the time interval ($\Delta t$) to track an object moving at variable speed.

Optimum Range Cutting for Packet Classification (최적화된 영역 분할을 이용한 패킷 분류 알고리즘)

Kim, Hyeong-Gee;Park, Kyong-Hye;Lim, Hye-Sook
- Journal of KIISE:Information Networking / v.35 no.6 / pp.497-509 / 2008
Various algorithms and architectures for efficient packet classification have been widely studied. Packet classification algorithms based on a decision tree structure, such as HiCuts and HyperCuts, are known to perform best because they exploit the geometrical representation of the rules in a classifier. However, these algorithms are not practical because they involve complicated heuristics for selecting the dimension to cut and determining the number of cuts at each node of the decision tree. Moreover, the cutting is not efficient enough because it is based on regular intervals that are unrelated to the actual range each rule covers. In this paper, we propose a new efficient packet classification algorithm using range cutting. The proposed algorithm first finds the ranges that each rule covers in the 2-dimensional prefix plane and then cuts according to those ranges. Hence, the proposed algorithm constructs a very efficient decision tree. The cutting applied at each node of the decision tree is optimal and deterministic and does not involve complicated heuristics. Simulation results for rule sets generated using ClassBench databases show that the proposed algorithm achieves better average search speed and consumes 3 to 300 times less memory than previous cutting algorithms.
KSCI
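The contrast between regular-interval cuts and cuts placed at the ranges the rules actually cover can be illustrated in one dimension. The sketch below is a simplified assumption and not the algorithm from the paper, which works in a 2-dimensional prefix plane and builds a decision tree; the function names and the flat list-plus-binary-search layout are hypothetical.

```python
from bisect import bisect_right


def build_range_cuts(rules):
    """Cut a 1-D field at rule boundaries.

    rules: list of (lo, hi) inclusive ranges, highest priority first.
    Returns (starts, matches): starts[i] is the first value of the i-th
    elementary interval and matches[i] lists the rule indices covering it.
    """
    points = set()
    for lo, hi in rules:
        points.add(lo)
        points.add(hi + 1)  # first value after the rule ends
    starts = sorted(points)
    matches = []
    for start in starts:
        # No rule boundary falls inside an elementary interval, so testing
        # its first value decides coverage for the whole interval.
        matches.append(
            [r for r, (lo, hi) in enumerate(rules) if lo <= start <= hi]
        )
    return starts, matches


def classify(value, starts, matches):
    """Return the index of the best matching rule, or None if none match."""
    idx = bisect_right(starts, value) - 1
    if idx < 0:
        return None
    covering = matches[idx]
    return covering[0] if covering else None
```

For example, with `rules = [(32, 47), (0, 63), (0, 255)]` the cuts fall at 0, 32, 48, 64, and 256, so every interval is covered by a fixed set of rules and no rule is split by an arbitrary equal-width boundary.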

A Study on GPU Computing of Bi-conjugate Gradient Method for Finite Element Analysis of the Incompressible Navier-Stokes Equations (유한요소 비압축성 유동장 해석을 위한 이중공액구배법의 GPU 기반 연산에 대한 연구)

Yoon, Jong Seon;Jeon, Byoung Jin;Jung, Hye Dong;Choi, Hyoung Gwon
- Transactions of the Korean Society of Mechanical Engineers B / v.40 no.9 / pp.597-604 / 2016
A parallel bi-conjugate gradient algorithm was developed in CUDA for parallel computation of the incompressible Navier-Stokes equations. The governing equations were discretized using a splitting P2P1 finite element method. An asymmetric stenotic flow problem was solved to validate the proposed algorithm, and the parallel performance of the GPU was then examined by measuring elapsed times. The GPU performance for sparse matrix-vector multiplication was also investigated with a matrix from a fluid-structure interaction problem. A kernel was written to compute the inner products of the rows of the sparse matrix with a vector simultaneously. The kernel was optimized using both parallel reduction and memory coalescing, and the effect of the warp on the parallel performance of the CUDA implementation was examined during kernel construction. In double precision, the GPU computation was more than 7 times faster than a single CPU.
https://doi.org/10.3795/KSME-B.2016.40.9.597 KSCI
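The sparse matrix-vector product described in this abstract, where each row's inner product with the vector is independent of the others, can be sketched as follows. This is a serial Python illustration and not the authors' CUDA kernel; the compressed sparse row (CSR) storage format and the function names are assumptions.

```python
def dense_to_csr(matrix):
    """Convert a dense list-of-lists matrix to CSR arrays."""
    values, col_idx, row_ptr = [], [], [0]
    for row in matrix:
        for j, v in enumerate(row):
            if v != 0:
                values.append(v)
                col_idx.append(j)
        row_ptr.append(len(values))  # end of this row in values/col_idx
    return values, col_idx, row_ptr


def csr_matvec(values, col_idx, row_ptr, x):
    """Compute y = A @ x for a CSR matrix A.

    Each iteration of the outer loop is one row's inner product and
    touches no other row's result, so rows can be processed in parallel.
    """
    y = []
    for row in range(len(row_ptr) - 1):
        acc = 0.0
        for k in range(row_ptr[row], row_ptr[row + 1]):
            acc += values[k] * x[col_idx[k]]
        y.append(acc)
    return y
```

On a GPU the outer loop would typically be mapped to threads or warps, with the inner sum handled by a parallel reduction; that mapping is not shown here.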

Temporary Metadata Journaling Scheme to Improve Performance and Stability of a FAT Compatible File System (FAT 파일 시스템의 호환성을 유지하며 성능과 안정성을 향상시키는 메타데이터 저널링 기법의 설계)

Hyun, Choul-Seung;Choi, Jong-Moo;Lee, Dong-Hee;Noh, Sam-H.
- Journal of KIISE:Computer Systems and Theory / v.36 no.3 / pp.191-198 / 2009
The FAT (File Allocation Table) compatible file system is widely used in mobile devices and memory cards because data can be exchanged among the many platforms that recognize FAT. Modern embedded systems, however, demand instant recovery from power failure and high performance for multimedia applications. The key issue is how to achieve high write performance and instant booting while keeping compatibility issues under control. To this end, we devised a temporary metadata journaling scheme for a FAT compatible file system. Benchmark results for the scheme, implemented in a FAT compatible file system, show that it improves write performance by converting the small random writes of metadata updates into a large sequential write to the journaling area. It also provides a natural way to implement instant booting. The scheme temporarily compromises file system compatibility because it stores updated metadata in the temporary journaling area rather than in its original locations. However, compatibility can be fully recovered at any time by journal flushing, which copies the metadata in the journaling area to its original locations. Generally, journal flushing is done before a memory card is unmounted so that the card can be used in other mobile devices that recognize the FAT file system but not the temporary metadata journaling scheme.
KSCI
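The journaling idea in this abstract, appending metadata updates sequentially and copying them back to their home locations only at flush time, can be modeled in a few lines. The sketch below is a toy in-memory model under stated assumptions, not the authors' on-disk design; the class and method names are hypothetical.

```python
class JournaledMetadata:
    """Toy model of temporary metadata journaling (hypothetical API)."""

    def __init__(self, num_blocks):
        self.home = [b""] * num_blocks  # original metadata locations
        self.journal = []               # append-only journaling area
        self.latest = {}                # block number -> index in journal

    def write(self, block, data):
        """Append the update to the journal instead of writing in place."""
        self.latest[block] = len(self.journal)
        self.journal.append((block, data))

    def read(self, block):
        """Return the newest copy, preferring the journal over home."""
        if block in self.latest:
            return self.journal[self.latest[block]][1]
        return self.home[block]

    def flush(self):
        """Copy the newest journaled data home and empty the journal."""
        for block, idx in self.latest.items():
            self.home[block] = self.journal[idx][1]
        self.journal.clear()
        self.latest.clear()
```

Until `flush()` runs, `home` still holds stale metadata, which corresponds to the temporary loss of compatibility the abstract describes; after the flush, a reader that knows nothing about the journal sees current data.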

A Distributed VOD Server Based on Virtual Interface Architecture and Interval Cache (버추얼 인터페이스 아키텍처 및 인터벌 캐쉬에 기반한 분산 VOD 서버)

Oh, Soo-Cheol;Chung, Sang-Hwa
- Journal of KIISE:Computer Systems and Theory / v.33 no.10 / pp.734-745 / 2006
This paper presents a PC cluster-based distributed VOD server that minimizes the load on the interconnection network by adopting the VIA communication protocol and the interval cache algorithm. Video data is distributed across the disks of the distributed VOD server, and each server node receives the data through the interconnection network and sends it to clients. The load on the interconnection network increases because of the large amount of video data transferred. We developed a VIA-based distributed VOD file system to minimize the cost of using the interconnection network when accessing remote disks. VIA is a user-level communication protocol that removes the overhead of TCP/IP. We also improved the performance of the interconnection network by expanding the maximum transfer size of VIA. In addition, the interval cache reduces traffic on the interconnection network by caching, in main memory, video data transferred from the disks of remote server nodes. In experiments on a four-node PC cluster, the proposed server showed a maximum performance improvement of 21.3% over a distributed VOD server without VIA and the interval cache.
KSCI

Implementation of Massive FDTD Simulation Computing Model Based on MPI Cluster for Semi-conductor Process (반도체 검증을 위한 MPI 기반 클러스터에서의 대용량 FDTD 시뮬레이션 연산환경 구축)

Lee, Seung-Il;Kim, Yeon-Il;Lee, Sang-Gil;Lee, Cheol-Hoon
- The Journal of the Korea Contents Association / v.15 no.9 / pp.21-28 / 2015
In semiconductor processing, simulation is used to detect defects by computing physical quantities of the internal elements and analyzing the behavior of impurities. The Finite-Difference Time-Domain (FDTD) algorithm is used for this simulation. As semiconductors composed of nanoscale elements advance, the simulations grow larger. A single processor such as a CPU or GPU may be unable to run the simulation because the matrix is too large, and even a computer with multiple processors may be unable to handle a massive FDTD problem. Parallel and distributed computing has been studied to address these problems, but past work used only a single type of processor. A GPU computes quickly but has limited memory, whereas a CPU is slower than a GPU. To solve this problem, we implemented a computing model that can handle an FDTD simulation of any size on a cluster of heterogeneous processors. We tested the simulation using MPI libraries based on point-to-point communication and verified that it operates correctly regardless of the number and type of nodes. We also analyzed performance by measuring the total execution time and the time spent in specific parts of the simulation in each test.
https://doi.org/10.5392/JKCA.2015.15.09.021 KSCI
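Splitting an FDTD grid across nodes requires each node to receive boundary field values from its neighbor at every time step, which is where point-to-point communication comes in. The sketch below illustrates this on a one-dimensional grid split into two subdomains and checks the result against a single-domain run. It is a minimal model, not the authors' implementation: plain variables stand in for MPI messages, and the grid size, cut position, normalized units, and function names are assumptions.

```python
def step_single(E, H, c):
    """One FDTD time step on a single 1-D grid (E ends held at zero)."""
    n = len(E)
    for i in range(n - 1):
        H[i] += c * (E[i + 1] - E[i])
    for i in range(1, n - 1):
        E[i] += c * (H[i] - H[i - 1])


def step_split(El, Hl, Er, Hr, c):
    """The same step on two subdomains joined by halo exchange.

    Left owns E[0..m-1] and H[0..m-1]; right owns the remainder.
    """
    m = len(El)
    ghost_E = Er[0]  # stands in for a message from right to left
    for i in range(m - 1):
        Hl[i] += c * (El[i + 1] - El[i])
    Hl[m - 1] += c * (ghost_E - El[m - 1])
    for i in range(len(Hr)):
        Hr[i] += c * (Er[i + 1] - Er[i])
    ghost_H = Hl[m - 1]  # stands in for a message from left to right
    for i in range(1, m):
        El[i] += c * (Hl[i] - Hl[i - 1])
    Er[0] += c * (Hr[0] - ghost_H)
    for i in range(1, len(Er) - 1):
        Er[i] += c * (Hr[i] - Hr[i - 1])


def run(n=40, steps=60, cut=17, c=0.5):
    """Run both versions side by side and return their E fields."""
    E = [0.0] * n
    E[n // 4] = 1.0  # initial pulse
    H = [0.0] * (n - 1)
    El, Er = E[:cut], E[cut:]
    Hl, Hr = H[:cut], H[cut:]
    for _ in range(steps):
        step_single(E, H, c)
        step_split(El, Hl, Er, Hr, c)
    return E, El + Er
```

In an MPI program the two `ghost_` assignments would become a send on one rank and a matching receive on the other; the rest of each subdomain update needs only local data.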

On the Performance of Sample-Adaptive Product Quantizer for Noisy Channels (표본적응 프러덕트 양자기의 전송로 잡음에서의 성능 분석에 관한 연구)

Kim, Dong Sik
- Journal of the Institute of Electronics Engineers of Korea SP / v.42 no.3 s.303 / pp.81-90 / 2005
When signals quantized by a vector quantizer (VQ) are transmitted through noisy channels, the overall performance of the coding system depends strongly on the quantization scheme employed and on the effect of channel errors. To design an optimal coding system, the source and channel coding schemes should be jointly optimized, as in the channel-optimized VQ. As a suboptimal approach, we may consider the robust VQ (RVQ), in which an index assignment function maps the quantizer outputs to channel symbols so that the effect of channel errors is minimized. Recently, a VQ that reduces encoding complexity, called the sample-adaptive product quantizer (SAPQ), has been proposed. SAPQ has a quantizer structure very similar to that of the product quantizer (PQ), but its quantization performance can be better than that of PQ. Further, its encoding complexity and codebook memory requirement are lower than those of the regular full-search VQ. In this paper, SAPQ is employed to design an RVQ that is robust to channel errors by reducing the vector dimension. The codebook structure of SAPQ is discussed, and experiments are presented from the standpoint of robustness to noisy channels.
KSCI

Data Transmission System from Distant Area Using SD-Card and Ethernet (SD 카드와 이더넷을 이용한 원격지 데이터 전송시스템)

Jo, Heung-Kuk
- Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2010.10a / pp.381-385 / 2010
In an aging society, the number of elderly people living alone is increasing. Nurses must visit them to check on their illnesses because they have difficulty moving about, but caring for them is hard because there are few nurses. This problem can be addressed by collecting data on their condition remotely. Ethernet data communication is well suited to collecting such measurements. The remote data storage system stores the data and, after several days, transmits it to the database over Ethernet. To miniaturize such a system, it must be an OS-less embedded Ethernet server that handles file management in hardware only. An SD card is used as the storage device because it is small and operates at low power. With a 512 MB SD memory card, 10 bytes of temperature data per second can be stored for 5 to 6 years. In this paper, we build an embedded Ethernet server using the W3100A and an ATmega128 MCU, and a data storage device using an SD card. The system operates as an OS-less embedded Ethernet server. We discuss the file system, storage, and Ethernet, and describe the ATmega128 MCU, the interface between the LAN LSI and the W3100A, the interface between the W3100A and the RTL8201 PHY transceiver, data I/O between the MCU and the SD card, and the file system. We present the experimental device and the monitoring results.

A 2.0-GS/s 5-b Current Mode ADC-Based Receiver with Embedded Channel Equalizer (채널 등화기를 내장한 2.0GS/s 5비트 전류 모드 ADC 기반 수신기)

Moon, Jong-Ho;Jung, Woo-Chul;Kim, Jin-Tae;Kwon, Kee-Won;Jun, Young-Hyun;Chun, Jung-Hoon
- Journal of the Institute of Electronics and Information Engineers / v.49 no.12 / pp.184-193 / 2012
In this paper, a 5-bit 2-GS/s 2-way time-interleaved pipeline ADC for a high-speed serial link receiver is demonstrated. Implemented as a current-mode amplifier, each stage ADC performs tracking and residue amplification simultaneously to achieve a higher sampling rate. In addition, each stage incorporates a built-in 1-tap FIR equalizer, reducing inter-symbol interference (ISI) without extra digital post-processing. The ADC is designed in a 110 nm CMOS technology and consumes 91 mW from a 1.2-V supply. The area excluding the memory block is $0.58 \times 0.42\,\mathrm{mm}^2$. Simulation results show that when the equalizer is enabled, the ADC achieves an SNDR of 25.2 dB and an ENOB of 3.9 bits at a 2.0 GS/s sample rate for a Nyquist input signal. When the equalizer is disabled, the SNDR is 26.0 dB for 20 MHz to 1.0 GHz input signals and the ENOB is 4.0 bits.
https://doi.org/10.5573/ieek.2012.49.12.184
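The ENOB figures quoted in this abstract are consistent with the standard relation between SNDR and effective number of bits, ENOB = (SNDR - 1.76 dB) / 6.02 dB. The short check below applies that textbook formula to the reported SNDR values; the function name is hypothetical, and the abstract itself does not state which formula the authors used.

```python
def enob_from_sndr(sndr_db):
    """Effective number of bits from SNDR in dB (standard relation)."""
    return (sndr_db - 1.76) / 6.02


# SNDR values reported in the abstract, rounded to one decimal place.
equalizer_on = round(enob_from_sndr(25.2), 1)   # abstract reports 3.9 bits
equalizer_off = round(enob_from_sndr(26.0), 1)  # abstract reports 4.0 bits
```

Both values round to the ENOB quoted in the abstract, so roughly one bit of the nominal 5-bit resolution is lost to noise and distortion in either mode.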
