• Title/Summary/Keyword: 병렬회로

Search Result 1,179, Processing Time 0.024 seconds

An Improvement of Implementation Method for Multi-Layer AHB BusMatrix (ML-AHB 버스 매트릭스 구현 방법의 개선)

  • Hwang Soo-Yun;Jhang Kyoung-Sun
    • Journal of KIISE:Computer Systems and Theory
    • /
    • v.32 no.11_12
    • /
    • pp.629-638
    • /
    • 2005
  • In the System on a Chip design, the on chip bus is one of the critical factors that decides the overall system performance. Especially, in the case or reusing the IPs such as processors, DSPs and multimedia IPs that requires higher bandwidth, the bandwidth problems of on chip bus are getting more serious. Recently ARM proposes the Multi-Layer AHB BusMatrix that is a highly efficient on chip bus to solve the bandwidth problems. The Multi-Layer AHB BusMatrix allows parallel access paths between multiple masters and slaves in a system. This is achieved by using a more complex interconnection matrix and gives the benefit of increased overall bus bandwidth, and a more flexible system architecture. However, there is one clock cycle delay for each master in existing Multi-Layer AHB BusMatrix whenever the master starts new transactions or changes the slave layers because of the Input Stage and arbitration logic realized with Moore type. In this paper, we improved the existing Multi-Layer AHB BusMatrix architecture to solve the one clock cycle delay problems and to reduce the area overhead of the Input Stage. With the elimination of the Input Stage and some restrictions on the arbitration scheme, we tan take away the one clock cycle delay and reduce the area overhead. Experimental results show that the end time of total bus transaction and the average latency time of improved Multi-Layer AHB BusMatrix are improved by $20\%\;and\;24\%$ respectively. in ease of executing a number of transactions by 4-beat incrementing burst type. Besides the total area and the clock period are reduced by $22\%\;and\;29\%$ respectively, compared with existing Multi-layer AHB BusMatrix.

A Load Balancing Method using Partition Tuning for Pipelined Multi-way Hash Join (다중 해시 조인의 파이프라인 처리에서 분할 조율을 통한 부하 균형 유지 방법)

  • Mun, Jin-Gyu;Jin, Seong-Il;Jo, Seong-Hyeon
    • Journal of KIISE:Databases
    • /
    • v.29 no.3
    • /
    • pp.180-192
    • /
    • 2002
  • We investigate the effect of the data skew of join attributes on the performance of a pipelined multi-way hash join method, and propose two new harsh join methods in the shared-nothing multiprocessor environment. The first proposed method allocates buckets statically by round-robin fashion, and the second one allocates buckets dynamically via a frequency distribution. Using harsh-based joins, multiple joins can be pipelined to that the early results from a join, before the whole join is completed, are sent to the next join processing without staying in disks. Shared nothing multiprocessor architecture is known to be more scalable to support very large databases. However, this hardware structure is very sensitive to the data skew. Unless the pipelining execution of multiple hash joins includes some dynamic load balancing mechanism, the skew effect can severely deteriorate the system performance. In this parer, we derive an execution model of the pipeline segment and a cost model, and develop a simulator for the study. As shown by our simulation with a wide range of parameters, join selectivities and sizes of relations deteriorate the system performance as the degree of data skew is larger. But the proposed method using a large number of buckets and a tuning technique can offer substantial robustness against a wide range of skew conditions.

Design of Software and Hardware Modules for a TCP/IP Offload Engine with Separated Transmission and Reception Paths (송수신 분리형 TCP/IP Offload Engine을 위한 소프트웨어 및 하드웨어 모듈의 설계)

  • Jang Hank-Kok;Chung Sang-Hwa;Choi Young-In
    • Journal of KIISE:Computer Systems and Theory
    • /
    • v.33 no.9
    • /
    • pp.691-698
    • /
    • 2006
  • TCP/IP Offload Engine (TOE) is a technology that processes TCP/IP on a network adapter instead of a host CPU to reduce protocol processing overhead from the host CPU. There have been some approaches to implementing TOE: software TOE based on an embedded processor; hardware TOE based on ASIC implementation; and hybrid TOE in which software and hardware functions are combined. In this paper, we designed software modules and hardware modules for a hybrid TOE on an FPGA that had two processor cores. Software modules are based on the embedded Linux. Hardware modules are for data transmission (TX) and reception (RX). One core controls the TX path and the other controls the RX path of the Linux. This TX/RX path separation mechanism can reduce task switching overheads between processes and overcome poor performance of single embedded processor. Hardware modules deal with creating headers for outgoing packets, processing headers of incoming packets, and fetching or storing data from or to the host memory by DMA. These can make it possible to improve the performance of data transmission and reception. We proved performance of the TOE with separated transmission and reception paths by performing experiments with a TOE network adapter that was equipped with the FPGA having processor cores.

An Efficient Disk Sharing Technique supporting Single Disk I/O Space in Linux Cluster Systems (리눅스 클러스터 시스템에서 단일 디스크 입출력 공간을 지원하는 효율적 디스크 공유 기법)

  • 김태호;이종우;이재원;김성동;채진석
    • Journal of KIISE:Computing Practices and Letters
    • /
    • v.9 no.6
    • /
    • pp.635-645
    • /
    • 2003
  • One of very important features that are necessarily supported by clustered parallel computer systems is a single I/O system image in which users can access both the local and remote I/O resources transparently. In this paper, we propose an efficient disk sharing technique supporting a single disk I/O system image architecture. The design separates the I/O subsystem of a cluster into the file system and a set of virtual hard disk drivers. The virtual hard disk driver deals with a hard disk in the remote node as a local hard disk. All services provided by it are performed in the device driver level without any modification of file systems. Users can, therefore, access all the disks in the cluster regardless of their locations. Our virtual hard disk driver is implemented under the linux, and also tested in a linux cluster system. We find by experiments that it can successfully support a single disk I/O space, and at the same time it shows better performance than NFS. We are sure that this paper can be a guideline for single I/O space of other devices to be easily constructed.

QoS-Guaranteed IP Mobility Management For Fast Moving Vehicles Using Multiple Tunnels (멀티 터널링을 이용한 고속 차량에서 QoS 보장 IP 이동성 관리 방법)

  • Chun, Seung-Man;Nah, Jae-Wook;Park, Jong-Tae
    • Journal of the Institute of Electronics Engineers of Korea TC
    • /
    • v.48 no.11
    • /
    • pp.44-52
    • /
    • 2011
  • In this article, we present a QoS-guaranteed IP mobility management scheme of Internet service for fast moving vehicles with multiple wireless network interfaces. The idea of the proposed mechanism consists of two things. One is that new wireless connections are established to available wireless channels whenever the measured data rate at the vehicle equipped with mobile gateway drops below to the required data rate of the user requirement. The other is that parallel distribution packet tunnels between an access router and the mobile gateway are dynamically constructed using multiple wireless network interfaces in order to guarantee the required data rate during the mobile gateway's movement. By doing these methods, the required data rate of the mobile gateway can be preserved while eliminating the possible delay and packet loss during handover operation, thus resulting in the guaranteed QoS. The architecture of the IETF standard HMIPv6 has been extended to realize the proposed scheme, and detailed algorithms for the extension of HMIPv6 has been designed. Finally, simulation has been done for performance evaluation, and the simulation results show that the proposed mechanism demonstrates guaranteed QoS during the handover with regard to the handover delay, packet loss and throughput.

Fast Image Pre-processing Algorithms Using SSE Instructions (SSE 명령어를 이용한 영상의 고속 전처리 알고리즘)

  • Park, Eun-Soo;Cui, Xuenan;Kim, Jun-Chul;Im, Yu-Cheong;Kim, Hak-Il
    • Journal of the Institute of Electronics Engineers of Korea SP
    • /
    • v.46 no.2
    • /
    • pp.65-77
    • /
    • 2009
  • This paper proposes fast image processing algorithms using SSE (Streaming SIMD Extensions) instructions. The CPU's supporting SSE instructions have 128bit XMM registers; data included in these registers are processed at the same time with the SIMD (Single Instruction Multiple Data) mode. This paper develops new SIMD image processing algorithms for Mean filter, Sobel horizontal edge detector, and Morphological erosion operation which are most widely used in automated optical inspection systems and compares their processing times. In order to objectively evaluate the processing time, the developed algorithms are compared with OpenCV 1.0 operated in SISD (Single Instruction Single Data) mode, Intel's IPP 5.2 and MIL 8.0 which are fast image processing libraries supporting SIMD mode. The experimental result shows that the proposed algorithms on average are 8 times faster than the SISD mode image processing library and 1.4 times faster than the SIMD fast image processing libraries. The proposed algorithms demonstrate their applicability to practical image processing systems at high speed without commercial image processing libraries or additional hardwares.

Improvement of Address Pointer Assignment in DSP Code Generation (DSP용 코드 생성에서 주소 포인터 할당 성능 향상 기법)

  • Lee, Hee-Jin;Lee, Jong-Yeol
    • Journal of the Institute of Electronics Engineers of Korea CI
    • /
    • v.45 no.1
    • /
    • pp.37-47
    • /
    • 2008
  • Exploitation of address generation units which are typically provided in DSPs plays an important role in DSP code generation since that perform fast address computation in parallel to the central data path. Offset assignment is optimization of memory layout for program variables by taking advantage of the capabilities of address generation units, consists of memory layout generation and address pointer assignment steps. In this paper, we propose an effective address pointer assignment method to minimize the number of address calculation instructions in DSP code generation. The proposed approach reduces the time complexity of a conventional address pointer assignment algorithm with fixed memory layouts by using minimum cost-nodes breaking. In order to contract memory size and processing time, we employ a powerful pruning technique. Moreover our proposed approach improves the initial solution iteratively by changing the memory layout for each iteration because the memory layout affects the result of the address pointer assignment algorithm. We applied the proposed approach to about 3,000 sequences of the OffsetStone benchmarks to demonstrate the effectiveness of the our approach. Experimental results with benchmarks show an average improvement of 25.9% in the address codes over previous works.

Responses of Phyllotreta striolata and Athalia rosae ruficornis to Colored-sticky Traps and Aggregation Pheromone and Seasonal Fluctuations in Radish Fields on Jeju Island (제주지역 무에서 벼룩잎벌레와 무잎벌의 색트랩과 집합페로몬에 대한 반응과 연중 발생특성)

  • Song, Jeong Heub;Yang, Young Taek;Yang, Cheol Jun;Choi, Byeong Ryul;Jwa, Chang Sook
    • Korean journal of applied entomology
    • /
    • v.54 no.4
    • /
    • pp.289-294
    • /
    • 2015
  • Striped flea beetle, Phyllotreta striolata (SFB) and turnip sawfly, Athalia rosae ruficornis (TSF) are two economically important sporadic pests in radish fields on Jeju island. The response of adult SFB and TSF to a mixture of aggregation pheromone, (+)-(6R,7S)-himachala-9,11-diene and host plant volatile, allyl isothiocyanate (HAI), as well as to yellow and blue sticky traps was examined in radish fields. Adult SFB was more attracted to the sticky trap with HAI, regardless of the color of the sticky trap; however, adult TSF was more attracted on the yellow sticky trap than blue, and no effect of HAI was observed. The adult SFB and TSF can be effectively monitored using yellow sticky traps placed 10 cm above the plant canopy. SFB and TSF had 3 and 5 peak times in a year, respectively. The first peak occurred in the middle of March for SFB and mid-late of April for TSF. We expect that the results of the present study can facilitate minimizing the damage caused by the two important pests in commercial radish fields.

Multiple Camera Based Imaging System with Wide-view and High Resolution and Real-time Image Registration Algorithm (다중 카메라 기반 대영역 고해상도 영상획득 시스템과 실시간 영상 정합 알고리즘)

  • Lee, Seung-Hyun;Kim, Min-Young
    • Journal of the Institute of Electronics Engineers of Korea SC
    • /
    • v.49 no.4
    • /
    • pp.10-16
    • /
    • 2012
  • For high speed visual inspection in semiconductor industries, it is essential to acquire two-dimensional images on regions of interests with a large field of view (FOV) and a high resolution simultaneously. In this paper, an imaging system is newly proposed to achieve high quality image in terms of precision and FOV, which is composed of single lens, a beam splitter, two camera sensors, and stereo image grabbing board. For simultaneously acquired object images from two camera sensors, Zhang's camera calibration method is applied to calibrate each camera first of all. Secondly, to find a mathematical mapping function between two images acquired from different view cameras, the matching matrix from multiview camera geometry is calculated based on their image homography. Through the image homography, two images are finally registered to secure a large inspection FOV. Here the inspection system of using multiple images from multiple cameras need very fast processing unit for real-time image matching. For this purpose, parallel processing hardware and software are utilized, such as Compute Unified Device Architecture (CUDA). As a result, we can obtain a matched image from two separated images in real-time. Finally, the acquired homography is evaluated in term of accuracy through a series of experiments, and the obtained results shows the effectiveness of the proposed system and method.

A Design of AES-based WiBro Security Processor (AES 기반 와이브로 보안 프로세서 설계)

  • Kim, Jong-Hwan;Shin, Kyung-Wook
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.44 no.7 s.361
    • /
    • pp.71-80
    • /
    • 2007
  • This paper describes an efficient hardware design of WiBro security processor (WBSec) supporting for the security sub-layer of WiBro wireless internet system. The WBSec processor, which is based on AES (Advanced Encryption Standard) block cipher algorithm, performs data oncryption/decryption, authentication/integrity, and key encryption/decryption for packet data protection of wireless network. It carries out the modes of ECB, CTR, CBC, CCM and key wrap/unwrap with two AES cores working in parallel. In order to achieve an area-efficient implementation, two design techniques are considered; First, round transformation block within AES core is designed using a shared structure for encryption/decryption. Secondly, SubByte/InvSubByte blocks that require the largest hardware in AES core are implemented using field transformation technique. It results that the gate count of WBSec is reduced by about 25% compared with conventional LUT (Look-Up Table)-based design. The WBSec processor designed in Verilog-HDL has about 22,350 gates, and the estimated throughput is about 16-Mbps at key wrap mode and maximum 213-Mbps at CCM mode, thus it can be used for hardware design of WiBro security system.