• Title/Summary/Keyword: Multiprocessor Systems (다중 프로세서 시스템)


Development of a Heuristic Method for Solving a Class of Nonlinear Integer Programs with Application to Redundancy Optimization for the Safety Control System using Multi-processor (비선형정수계획의 새로운 발견적해법의 개발과 고성능 다중프로세서를 이용한 안전관리 시스템의 신뢰도 중복설계의 최적화)

  • 김장욱;김재환;황승옥;박춘일;금상호
    • Journal of the Korean Society of Marine Environment & Safety
    • /
    • v.1 no.2
    • /
    • pp.69-82
    • /
    • 1995
  • This study is concerned with developing a heuristic algorithm for solving a class of nonlinear integer programs (NLIP). An exact algorithm for solving an NLIP either may not exist or may take an unrealistically large amount of computing time. This study develops a new heuristic, the Excursion Algorithm (EA), for solving a class of NLIPs. It turns out that excursions over a bounded feasible and/or infeasible region are effective in alleviating the risk of being trapped at a local optimum. The developed EA is applied to redundancy optimization problems for improving system safety and is compared with other existing heuristic methods. We also include the simulated annealing (SA) method in the comparison experiment because of its popularity for solving complex combinatorial problems. Computational results indicate that the proposed EA performs consistently better than the others in terms of solution quality, with a moderate increase in computing time. Therefore, the proposed EA is believed to be an attractive alternative to other heuristic methods.
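
For context on the comparison above, the sketch below shows a generic simulated annealing baseline for a series-system redundancy allocation problem. It is not the paper's Excursion Algorithm; the reliability model, cost data, penalty weight, and cooling schedule are illustrative assumptions only.

```python
import math
import random

# Illustrative component data (assumed, not from the paper):
# r[i] = reliability of one unit of component i, c[i] = cost per unit.
r = [0.80, 0.85, 0.90, 0.75]
c = [4.0, 5.0, 6.0, 3.0]
COST_LIMIT = 60.0

def system_reliability(x):
    """Series system of parallel-redundant components."""
    return math.prod(1.0 - (1.0 - ri) ** xi for ri, xi in zip(r, x))

def cost(x):
    return sum(ci * xi for ci, xi in zip(c, x))

def objective(x):
    """Reliability, penalized when the cost budget is exceeded."""
    penalty = max(0.0, cost(x) - COST_LIMIT)
    return system_reliability(x) - 0.05 * penalty

def simulated_annealing(steps=20000, t0=1.0, alpha=0.9995, seed=1):
    rng = random.Random(seed)
    x = [1] * len(r)                      # start with one unit of each component
    best, best_val = x[:], objective(x)
    t = t0
    for _ in range(steps):
        y = x[:]
        i = rng.randrange(len(y))
        y[i] = max(1, y[i] + rng.choice((-1, 1)))   # add or remove one redundant unit
        delta = objective(y) - objective(x)
        if delta >= 0 or rng.random() < math.exp(delta / t):
            x = y
            if objective(x) > best_val:
                best, best_val = x[:], objective(x)
        t *= alpha
    return best, system_reliability(best), cost(best)

if __name__ == "__main__":
    print(simulated_annealing())
```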


Comparison of Genetic Algorithms and Simulated Annealing for Multiprocessor Task Allocation (멀티프로세서 태스크 할당을 위한 GA과 SA의 비교)

  • Park, Gyeong-Mo
    • The Transactions of the Korea Information Processing Society
    • /
    • v.6 no.9
    • /
    • pp.2311-2319
    • /
    • 1999
  • We present two heuristic algorithms for the task allocation problem (an NP-complete problem) in parallel computing. The problem is to find an optimal mapping of the multiple communicating tasks of a parallel program onto the multiple processing nodes of a distributed-memory multicomputer. The purpose of mapping these tasks onto the nodes of the target architecture is to minimize parallel execution time without sacrificing solution quality. Many heuristic approaches have been employed to obtain satisfactory mappings. Our heuristics are based on genetic algorithms and simulated annealing. We formulate an objective function as the total computational cost of a mapping configuration and evaluate the performance of our heuristic algorithms. We compare the solution quality and running times of the random, greedy, genetic, and annealing algorithms. Our experimental findings from a simulation study of the allocation algorithms are presented.
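
The sketch below illustrates the kind of encoding and objective described above: a chromosome assigns each task to a processor, and fitness combines per-processor load with inter-processor communication cost. The cost data, selection scheme, and operators are illustrative assumptions, not the authors' exact formulation.

```python
import random

N_TASKS, N_PROCS = 12, 4
rng = random.Random(42)

# Illustrative cost model (assumed): exec_cost[t] is the compute load of task t,
# comm[t1][t2] is the communication volume between tasks t1 and t2.
exec_cost = [rng.randint(1, 10) for _ in range(N_TASKS)]
comm = [[rng.randint(0, 5) if i != j else 0 for j in range(N_TASKS)]
        for i in range(N_TASKS)]

def total_cost(mapping):
    """Max per-processor load plus cost of all inter-processor communication."""
    load = [0] * N_PROCS
    for t, p in enumerate(mapping):
        load[p] += exec_cost[t]
    comm_cost = sum(comm[i][j]
                    for i in range(N_TASKS) for j in range(i + 1, N_TASKS)
                    if mapping[i] != mapping[j])
    return max(load) + comm_cost

def crossover(a, b):
    cut = rng.randrange(1, N_TASKS)
    return a[:cut] + b[cut:]

def mutate(m, rate=0.1):
    return [rng.randrange(N_PROCS) if rng.random() < rate else p for p in m]

def genetic_allocation(pop_size=40, generations=200):
    pop = [[rng.randrange(N_PROCS) for _ in range(N_TASKS)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=total_cost)
        parents = pop[:pop_size // 2]                 # truncation selection
        children = [mutate(crossover(rng.choice(parents), rng.choice(parents)))
                    for _ in range(pop_size - len(parents))]
        pop = parents + children
    return min(pop, key=total_cost)

if __name__ == "__main__":
    best = genetic_allocation()
    print(best, total_cost(best))
```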


A 2-Dimension Torus-based Genetic Algorithm for Multi-disk Data Allocation (2차원 토러스 기반 다중 디스크 데이터 배치 병렬 유전자 알고리즘)

  • 안대영;이상화;송해상
    • Journal of the Institute of Electronics Engineers of Korea CI
    • /
    • v.41 no.2
    • /
    • pp.9-22
    • /
    • 2004
  • This paper presents a parallel genetic algorithm for the multi-disk data allocation problem, an NP-complete problem. The problem is to find a method to distribute a Binary Cartesian Product File over disk arrays so as to maximize parallel disk I/O accesses. A sequential genetic algorithm (SGA), DAGA, has previously been proposed and shown to be superior to other proposed methods, but it has been observed that DAGA consumes considerable simulation time. In this paper, a parallel version of DAGA (ParaDAGA) is proposed. ParaDAGA is a 2-dimensional torus-based parallel genetic algorithm (PGA) built on a distributed population structure. ParaDAGA has been implemented on a parallel computer simulated on a single-processor platform. Through the simulation, we study the impact of varying the ParaDAGA parameters and compare the quality of the solutions derived by ParaDAGA and DAGA. In terms of solution quality, ParaDAGA is superior to DAGA in all configurations while requiring less simulation time.
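
The sketch below illustrates the general idea of a torus-based, distributed-population (island-model) parallel GA: each island evolves its own sub-population and periodically exchanges its best individual with its four torus neighbors. It is not ParaDAGA; the grid size, migration interval, and the stand-in fitness function are assumptions.

```python
import random

GRID, GENOME, POP, GENS, MIGRATE_EVERY = 3, 32, 20, 60, 10
rng = random.Random(7)

def fitness(ind):
    # Placeholder fitness (count of ones); a real ParaDAGA-style run would score
    # a disk allocation of a Binary Cartesian Product File instead.
    return sum(ind)

def evolve(pop):
    pop.sort(key=fitness, reverse=True)
    parents = pop[:POP // 2]
    children = []
    while len(parents) + len(children) < POP:
        a, b = rng.sample(parents, 2)
        cut = rng.randrange(1, GENOME)
        child = a[:cut] + b[cut:]
        if rng.random() < 0.2:
            i = rng.randrange(GENOME)
            child[i] ^= 1                       # bit-flip mutation
        children.append(child)
    return parents + children

# One island (sub-population) per node of a GRID x GRID torus.
islands = {(x, y): [[rng.randint(0, 1) for _ in range(GENOME)] for _ in range(POP)]
           for x in range(GRID) for y in range(GRID)}

for gen in range(GENS):
    for key in islands:
        islands[key] = evolve(islands[key])
    if gen % MIGRATE_EVERY == 0:
        # Each island sends its best individual to its four torus neighbors,
        # replacing their current worst individuals.
        bests = {k: max(p, key=fitness) for k, p in islands.items()}
        for (x, y), pop in islands.items():
            for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                neighbor = ((x + dx) % GRID, (y + dy) % GRID)
                pop.sort(key=fitness)
                pop[0] = bests[neighbor][:]

print(max(fitness(max(p, key=fitness)) for p in islands.values()))
```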

A High Speed and Low Jitter PLL Clock Generator (고속 저잡음 PLL 클럭 발생기)

  • Cho, Jeong-Hwan;Chong, Jong-Wha
    • Journal of the Institute of Electronics Engineers of Korea TE
    • /
    • v.39 no.3
    • /
    • pp.1-7
    • /
    • 2002
  • This paper presents a new PLL clock generator that improves jitter noise characteristics and the acquisition process by designing a multi-PFD (Phase Frequency Detector) and an adaptive charge pump circuit. The conventional PLL not only suffers from jitter noise caused by its wide dead zone and duty-cycle sensitivity, but also has a long delay interval that prevents high-speed operation. An advanced multi-structured PFD circuit using TSPC (True Single Phase Clocking) logic is proposed; it shows excellent behavior with respect to jitter noise because the circuit is designed with a precise dead zone and duty cycle. The newly designed adaptive charge pump in the loop filter of the PLL improves the acquisition characteristic by adaptively increasing the charging current. HSPICE simulation is performed to evaluate the performance of the proposed circuit. The simulation results show that the proposed PLL has a dead zone under 0.01 ns, is not influenced by the duty cycle of the input signals, and achieves an acquisition time under 50 ns. This circuit can be used in the development of high-performance microprocessors and digital systems.
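
The sketch below is a highly simplified, behavior-level illustration (not a circuit model) of the adaptive charge-pump idea: while the frequency error is still large, a larger pump current charges the VCO control voltage faster, shortening acquisition. All parameter values are assumed for illustration only.

```python
# Behavior-level sketch: frequency acquisition with a fixed vs. adaptive pump current.
def acquisition_time(adaptive, f_target=200e6, f0=150e6, kvco=100e6,   # Hz, Hz, Hz/V
                     i_cp=20e-6, cap=50e-12, dt=1e-9, lock_band=100e3):
    vctrl, t = 0.0, 0.0
    while t < 10e-6:
        f_vco = f0 + kvco * vctrl
        err = f_target - f_vco
        if abs(err) < lock_band:
            return t                                   # frequency acquired
        # Adaptive charge pump: boost the current while the error is still large.
        i = i_cp * (8.0 if adaptive and abs(err) > 5e6 else 1.0)
        vctrl += (i * dt / cap) * (1.0 if err > 0 else -1.0)   # pump charge into loop cap
        t += dt
    return None

print("fixed current   :", acquisition_time(False))
print("adaptive current:", acquisition_time(True))
```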

Data Prefetching Effect of the Stride Merging-Arrays Method (스트라이드 배열 병합 방법의 데이터 선인출 효과)

  • Jeong, In-Beom;Lee, Jun-Won
    • Journal of KIISE:Computer Systems and Theory
    • /
    • v.26 no.11
    • /
    • pp.1429-1436
    • /
    • 1999
  • Cache memory is composed of cache lines with multiple words to achieve a data-prefetching effect. However, if the prefetched data are not used, cache space is wasted and the cache miss rate increases. The merging-arrays method is used to reduce cache conflict misses, one of the causes of cache misses. However, the existing merging-arrays method results in prefetching useless data into cache lines. In this paper, a stride merging-arrays method is suggested to improve this behavior. Simulation results show that, when a cache line is composed of multiple words, the stride merging-arrays method improves cache performance not only by reducing cache conflict misses but also by increasing useful data prefetching. This enhanced cache performance also translates into more scalable performance of parallel applications as the number of processors increases.
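
The toy simulator below (an illustrative model, not the one used in the paper) makes the effect concrete: two arrays placed a whole cache apart conflict on every access, while an interleaved (merged) layout lets each fetched multi-word cache line usefully prefetch the neighboring array element.

```python
# Toy direct-mapped cache with multi-word lines, counting misses for two access layouts.
BLOCK_WORDS = 8          # words per cache line (prefetch granularity)
NUM_SETS = 64
CACHE_WORDS = BLOCK_WORDS * NUM_SETS

def misses(addresses):
    tags = [None] * NUM_SETS              # one tag per set (direct-mapped)
    miss = 0
    for addr in addresses:
        block = addr // BLOCK_WORDS
        s, tag = block % NUM_SETS, block // NUM_SETS
        if tags[s] != tag:
            tags[s] = tag
            miss += 1
    return miss

N = 4096
# Separate arrays a and b placed a whole cache apart: a[i] and b[i] map to the same set,
# so they evict each other (conflict misses) on every access.
separate = [addr for i in range(N) for addr in (i, CACHE_WORDS + i)]
# Merged (interleaved) layout: a[i] and b[i] share a cache line, so the line fetched for
# a[i] also prefetches b[i] (useful prefetching).
merged = [addr for i in range(N) for addr in (2 * i, 2 * i + 1)]

print("separate arrays:", misses(separate), "misses")
print("merged arrays  :", misses(merged), "misses")
```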

Multi-Core Transactional Memory for High Contention Parallel Processing (집중 충돌 병렬 처리를 위한 효율적인 다중 코어 트랜잭셔널 메모리)

  • Kim, Seung-Hun;Kim, Sun-Woo;Ro, Won-Woo
    • Journal of the Institute of Electronics Engineers of Korea CI
    • /
    • v.48 no.1
    • /
    • pp.72-79
    • /
    • 2011
  • The importance of parallel programming has emerged strongly since modern microprocessor architecture shifted to multi-core systems. Transactional memory has been proposed to address synchronization, which is usually implemented using locks. However, lock-based synchronization reduces parallelism and can cause deadlock. In this paper, we propose an efficient method for utilizing transactional memory in situations with high contention. The proposed idea is based on theoretical analysis and is verified with simulation results. The simulation environment has been implemented using an HTM (Hardware Transactional Memory) system. We also propose a model of the dining philosophers problem to discuss efficient resource management using the transactional memory technique.
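
The sketch below is a conceptual software model (not an HTM system) of the dining philosophers approach mentioned above: each fork pickup or putdown is an optimistic transaction that validates fork versions at commit time and retries on conflict, so no philosopher ever holds one fork while waiting for another.

```python
import threading, random, time

N = 5
forks = [{"owner": None, "version": 0} for _ in range(N)]
commit_lock = threading.Lock()          # held only for the brief validate-and-commit step
meals = [0] * N

def try_commit(reads, writes):
    """reads: {fork_id: version_seen}; writes: {fork_id: new_owner}."""
    with commit_lock:
        if any(forks[f]["version"] != v for f, v in reads.items()):
            return False                 # someone else committed first: abort and retry
        for f, owner in writes.items():
            forks[f]["owner"] = owner
            forks[f]["version"] += 1
        return True

def philosopher(i):
    left, right = i, (i + 1) % N
    for _ in range(200):
        while True:                      # transaction: atomically take both forks
            reads = {f: forks[f]["version"] for f in (left, right)}
            if all(forks[f]["owner"] is None for f in (left, right)) and \
               try_commit(reads, {left: i, right: i}):
                break
            time.sleep(random.random() * 1e-4)   # back off, then retry
        meals[i] += 1                    # "eat" while logically holding both forks
        while not try_commit({f: forks[f]["version"] for f in (left, right)},
                             {left: None, right: None}):
            pass                         # release both forks transactionally

threads = [threading.Thread(target=philosopher, args=(i,)) for i in range(N)]
for t in threads: t.start()
for t in threads: t.join()
print("meals:", meals)
```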

Parallel clustering technology for real-time LWIR band image processing (실시간 LWIR 밴드 영상 처리를 위한 병렬 클러스터링 기술)

  • Cho, Yongjin;Lee, Kyou-seung;Hong, Seongha;Oh, Jong-woo;Lee, DongHoon
    • Proceedings of the Korean Society for Agricultural Machinery Conference
    • /
    • 2017.04a
    • /
    • pp.158-158
    • /
    • 2017
  • We are conducting research on recognizing the early shoots of soybeans located beneath plastic mulch film during the initial growth stage. A previous study found that soybean shoots in contact with the mulch film change the thermal response distribution on the upper surface of the film. To recognize the position of soybean shoots in real time while driving in the field and to control a linked linear or rotary actuator so that a hole is punched at the exact position, real-time signal processing that minimizes the time lag between the measurement system and the control system is essential. The multiple IR sensor used in the previous study had a resolution of 16×4 pixels and a frame rate of 3 Hz, which proved insufficient for precise analysis of the top of the mulch film, which is about 30 cm wide. To solve this, an ultra-compact (1 cm × 1 cm × 1 cm) thermal imaging sensor with improved resolution and measurement period was used: the Lepton™ model (500-0690-00, FLIR, Goleta, CA), which provides a thermal resolution of 0.05 °C in the 8 μm to 14 μm LWIR (longwave infrared) band. Each frame provides 80×60 pixels of 2-byte data, and the thermal distribution of the target surface can be measured at 9 Hz. Theoretically, the data rate is 86,400 bytes per second (80 × 60 × 2 × 9); when applied to a traveling punch machine moving at 1 m/s, each frame covers an area of about 10 cm, so the maximum position resolution is about 10 cm / 60 pixels = 0.17 cm/pixel, allowing relatively precise position determination. To meet the technical challenge of analyzing 80 × 60 × 2 bytes of data within 0.1 s, a spatial distribution analysis algorithm was developed that can run at the clock speed (1 GHz) of a commercial SBC (single board computer) suitable for the punching implement. To minimize the time required to analyze the entire image domain at once, we studied a technique in which the spatial information matrix is evenly distributed, features are analyzed on separate processors, and the results of the individual processors are decided competitively. A signal analysis program developed with the open-source MPICH (www.mpich.org) library was installed and executed on the individual cores linked by clustering. The 2D thermal distribution matrix was spatially partitioned evenly, and spatial domain analysis was performed on each core. With a 20×20 clustering unit, a total of 12 cores were required, and 10 computations per second were confirmed to be possible. Using this parallel clustering technique, a system for analyzing the thermal distribution on top of the mulch film that can cope with a travel speed of about 1 m/s was implemented.
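
The sketch below mirrors the decomposition described above using mpi4py (an assumption; the original program was built on the MPICH library itself): the 80×60 frame is split into twelve 20×20 blocks, one per rank, each rank computes a simple block feature, and rank 0 makes the competitive decision.

```python
# Run with e.g.:  mpiexec -n 12 python lwir_blocks.py
from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

ROWS, COLS, BLOCK = 60, 80, 20          # 80x60 pixels, 20x20 clustering unit -> 12 blocks

if rank == 0:
    # Fake raw LWIR frame standing in for a real Lepton capture.
    frame = np.random.randint(27000, 31000, size=(ROWS, COLS), dtype=np.uint16)
    blocks = [frame[r:r + BLOCK, c:c + BLOCK].copy()
              for r in range(0, ROWS, BLOCK) for c in range(0, COLS, BLOCK)]
else:
    blocks = None

block = comm.scatter(blocks, root=0)    # one 20x20 block per rank

# Per-block feature: mean intensity (a stand-in for the paper's spatial-domain feature).
feature = float(block.mean())
features = comm.gather(feature, root=0)

if rank == 0:
    winner = int(np.argmax(features))   # competitive decision among the 12 block results
    row, col = divmod(winner, COLS // BLOCK)
    print(f"hottest block: index {winner} at block row {row}, col {col}")
```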


A High Performance and Low Power Banked-Promotion TLB Structure (저전력 고성능 뱅크-승격 TLB 구조)

  • Lee, Jung-Hoon;Kim, Shin-Dug
    • Journal of KIISE:Computer Systems and Theory
    • /
    • v.29 no.4
    • /
    • pp.232-243
    • /
    • 2002
  • There are many methods for improving TLB (translation lookaside buffer) performance, such as increasing the number of TLB entries and supporting large pages or multiple page sizes. The best approach is to support multiple page sizes, but operating systems generally do not support multiple page sizes in user mode. We therefore propose a new TLB structure that supports two page sizes dynamically and selectively, obtaining the effect of multiple page sizes with high performance and at low cost without any operating system support. For high performance, a promotion-TLB is designed that supports two page sizes. In addition, to attain low power consumption, a banked-TLB is constructed by dividing one fully associative TLB into two fully associative sub-banks. These two structures are integrated into a banked-promotion TLB as a low-power, high-performance TLB structure for embedded processors. According to the results of comparison and analysis, similar performance can be achieved with fewer TLB entries, and power consumption can be reduced by around 50% compared with a fully associative TLB.
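
The sketch below is a simplified software model of a two-bank TLB with page promotion, meant only to illustrate the idea; the actual design's page sizes, bank organization, replacement, and promotion policy are not reproduced here, and the promotion rule shown (promote a 64 KB region once all of its 4 KB pages have been touched) is an assumption.

```python
from collections import OrderedDict

SMALL, LARGE = 4 * 1024, 64 * 1024
SMALL_ENTRIES, LARGE_ENTRIES = 16, 8

class BankedPromotionTLB:
    def __init__(self):
        self.small = OrderedDict()          # fully associative bank, LRU via OrderedDict
        self.large = OrderedDict()
        self.touched = {}                   # 64 KB region -> set of 4 KB pages seen
        self.hits = self.misses = 0

    def lookup(self, vaddr):
        big, small = vaddr // LARGE, vaddr // SMALL
        if big in self.large:               # large-page bank checked first
            self.large.move_to_end(big)
            self.hits += 1
            return
        if small in self.small:
            self.small.move_to_end(small)
            self.hits += 1
        else:
            self.misses += 1
            self.small[small] = True
            if len(self.small) > SMALL_ENTRIES:
                self.small.popitem(last=False)      # evict LRU small entry
        pages = self.touched.setdefault(big, set())
        pages.add(small)
        if len(pages) == LARGE // SMALL:    # all 16 subpages seen: promote the region
            self.large[big] = True
            if len(self.large) > LARGE_ENTRIES:
                self.large.popitem(last=False)

tlb = BankedPromotionTLB()
for addr in range(0, 4 * LARGE, SMALL):     # sweep 256 KB twice: second pass hits large bank
    tlb.lookup(addr)
for addr in range(0, 4 * LARGE, SMALL):
    tlb.lookup(addr)
print("hits:", tlb.hits, "misses:", tlb.misses)
```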

Diagnosis of Valve Internal Leakage for Ship Piping System using Acoustic Emission Signal-based Machine Learning Approach (선박용 밸브의 내부 누설 진단을 위한 음향방출신호의 머신러닝 기법 적용 연구)

  • Lee, Jung-Hyung
    • Journal of the Korean Society of Marine Environment & Safety
    • /
    • v.28 no.1
    • /
    • pp.184-192
    • /
    • 2022
  • Valve internal leakage is caused by damage to the internal parts of the valve, resulting in accidents and shutdowns of the piping system. This study investigated the possibility of a real-time leak detection method using the acoustic emission (AE) signal generated from the piping system during internal leakage of a butterfly valve. Datasets of raw time-domain AE signals were collected and postprocessed for each operation mode of the valve in a systematic manner to develop a data-driven model for the detection and classification of internal leakage by applying machine learning algorithms. The aim of this study was to determine whether leak detection can be treated as a classification problem by applying two classification algorithms: support vector machine (SVM) and convolutional neural network (CNN). The results showed different performances for the algorithms and datasets used. The SVM-based binary classification models, based on feature extraction from the data, achieved an overall accuracy of 83% to 90%, while for the multi-class classification model the accuracy dropped to 66%. By contrast, the CNN-based classification model achieved an accuracy of 99.85%, superior to any of the SVM-based models. The results revealed that the SVM classification model requires effective feature extraction from the AE signals to improve the accuracy of multi-class classification. Moreover, CNN-based classification can be a promising approach for detecting both leakage and valve opening, as long as the performance of the processor does not degrade.
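
The sketch below illustrates the SVM branch of the pipeline described above: simple time-domain features (RMS, peak, crest factor, kurtosis) are extracted from AE-like waveforms and fed to a binary leak/no-leak classifier. The synthetic signals, feature set, and SVM settings are illustrative assumptions, not the paper's dataset or models.

```python
import numpy as np
from scipy.stats import kurtosis
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)

def synth_signal(leak, n=2048):
    """No-leak: background noise. Leak: noise plus sparse broadband AE-like bursts."""
    x = rng.normal(0.0, 1.0, n)
    if leak:
        x += rng.normal(0.0, 2.5, n) * (rng.random(n) < 0.05)
    return x

def features(x):
    rms = np.sqrt(np.mean(x ** 2))
    peak = np.max(np.abs(x))
    return [rms, peak, peak / rms, kurtosis(x)]    # RMS, peak, crest factor, kurtosis

labels = np.array([0, 1] * 200)
X = np.array([features(synth_signal(l)) for l in labels])
X_tr, X_te, y_tr, y_te = train_test_split(X, labels, test_size=0.3,
                                          random_state=0, stratify=labels)

clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=10.0))
clf.fit(X_tr, y_tr)
print("test accuracy:", clf.score(X_te, y_te))
```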

A Real-Time Scheduling Technique on Multi-Core Systems for Multimedia Multi-Streaming (다중 멀티미디어 스트리밍을 위한 멀티코어 시스템 기반의 실시간 스케줄링 기법)

  • Park, Sang-Soo
    • Journal of Korea Multimedia Society
    • /
    • v.14 no.11
    • /
    • pp.1478-1490
    • /
    • 2011
  • Recently, multi-core processors have been drawing significant interest from the embedded systems research and industry communities, due mainly to their potential for achieving high performance and fault tolerance at low cost in products such as automobiles and cell phones. To process multimedia data, a scheduling algorithm is required to meet the timing constraints of periodic tasks in the system. Although the Pfair scheduling algorithm can theoretically meet all timing constraints while achieving 100% utilization on a multi-core system, it incurs high scheduling overheads, including frequent core migrations and system-wide synchronization. To mitigate these problems, we propose a real-time scheduling algorithm for multi-core systems in which system-wide scheduling is performed only when it is absolutely necessary; otherwise, the proposed algorithm performs scheduling within each core independently. Experimental results from extensive simulations show that the proposed algorithm dramatically reduces the scheduling overhead, to a negligible level when the utilization is under 80%.
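
The sketch below illustrates the partitioned side of the trade-off described above: tasks are assigned to cores by first-fit on utilization, after which each core can schedule its own tasks (e.g., with EDF) without system-wide synchronization or migration. The task sets and the first-fit heuristic are assumptions, not the paper's algorithm.

```python
import random

def partition_first_fit(tasks, n_cores):
    """tasks: list of (wcet, period). Returns per-core task lists and utilizations,
    or (None, util) if some task does not fit under the per-core EDF bound (U <= 1)."""
    cores = [[] for _ in range(n_cores)]
    util = [0.0] * n_cores
    for wcet, period in sorted(tasks, key=lambda t: t[0] / t[1], reverse=True):
        u = wcet / period
        for c in range(n_cores):
            if util[c] + u <= 1.0:          # EDF-schedulable on one core iff U <= 1
                cores[c].append((wcet, period))
                util[c] += u
                break
        else:
            return None, util               # would require global scheduling / migration
    return cores, util

random.seed(3)
tasks = [(random.randint(1, 8), random.choice((20, 40, 80))) for _ in range(16)]
cores, util = partition_first_fit(tasks, n_cores=4)
if cores is None:
    print("not partitionable without migration; utilizations so far:", util)
else:
    for c, (ts, u) in enumerate(zip(cores, util)):
        print(f"core {c}: U={u:.2f}, tasks={ts}")
```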