• Title/Summary/Keyword: multi-core processing

Preprocessing Methods for Effective Modulo Scheduling on High Performance DSPs (고성능 디지털 신호 처리 프로세서상에서 효율적인 모듈로 스케쥴링을 위한 전처리 기법)

  • Cho, Doo-San;Paek, Yun-Heung
    • Journal of KIISE:Software and Applications / v.34 no.5 / pp.487-501 / 2007
  • To achieve high resource utilization on multi-issue DSPs, production compilers commonly include variants of the iterative modulo scheduling algorithm. However, the excessive cyclic data dependences found in communication and media processing loops unduly restrict modulo scheduling freedom. As a result, the replicated functional units in multi-issue DSPs are often under-utilized. To address this resource under-utilization problem, our paper describes a novel compiler preprocessing strategy for effective modulo scheduling. The proposed strategy capitalizes on two new transformations, referred to as cloning and dismantling. The strategy has been validated by an implementation for the StarCore SC140 DSP compiler.

A Comparative Study on Performance of Open Source IDS/IPS Snort and Suricata (오픈소스 IDS/IPS Snort와 Suricata의 탐지 성능에 대한 비교 연구)

  • Seok, Jinug;Choi, Moonseok;Kim, Jimyung;Park, Jonsung
    • Journal of Korea Society of Digital Industry and Information Management / v.12 no.1 / pp.89-95 / 2016
  • The recent growth of hacking threats and developments in software and technology have put network security at risk. In addition, intrusions, malware, and worms have increased owing to a variety of sophisticated hacking methods. The goal of this study is to compare the Snort Alpha version with Suricata 2.0.11, whereas previous studies focused on comparing Snort 2.x in a single-threaded environment with Suricata in a multi-threaded environment. The experimental environment was as follows: an Intel(R) Core(TM) i5-4690 CPU at 3.50 GHz (4 threads), 16 GB of RAM, a 3 TB Seagate HDD, and Ubuntu 14.04. According to the results, Snort Alpha outperforms Suricata, but Snort Alpha showed some glitches when processing pcap files, which caused core dump errors. This experiment therefore analyzes which performs better: Snort Alpha, which supports multiple packet-processing threads, or Suricata, which supports multi-threading. Based on this experiment, better performance can be expected from the beta and official release versions of Snort in the future.

MGGC2.0: A preprocessing code for the multi-group cross section of the fast reactor with ultrafine group library

  • Kui Hu;Xubo Ma;Teng Zhang;Xuan Ma;Zifeng Huang;Yixue Chen
    • Nuclear Engineering and Technology / v.55 no.8 / pp.2785-2796 / 2023
  • Generating precise broad-group cross sections is important for fast reactor design. In this study, a fast reactor multi-group cross-section generation code, MGGC2.0, is developed in-house for processing ultrafine-group MATXS format libraries. Verification and validation of MGGC2.0 are performed using benchmarks from the ICSBEP handbook, and the results of MGGC2.0 agree well with those of MCNP. For the inner core and outer core regions, the consistent PN method with critical buckling search is in good agreement with the results condensed with the TWODANT flux and flux moments. For the radial blanket and reflector, a two-region approximation method is applied in MGGC2.0 using a collision probability method neutron flux solver. The RBEC-M benchmark was used to verify the power distribution calculation: the relative error of the power distribution with respect to the reference is less than 0.8% in the fuel region, and the maximum relative error is 5.58% in the reflector region. Therefore, MGGC2.0 can generate precise broad-group cross sections for fast reactors.

Exploration of Optimal Multi-Core Processor Architecture for Physical Modeling of Plucked-String Instruments (현악기의 물리적 모델링을 위한 최적의 멀티코어 프로세서 아키텍처 탐색)

  • Kang, Myeong-Su;Choi, Ji-Won;Kim, Yong-Min;Kim, Jong-Myon
    • The Journal of the Acoustical Society of Korea / v.30 no.5 / pp.281-294 / 2011
  • Physics-based sound synthesis usually incurs high computational costs, which restricts its use in real-time applications. This motivates us to implement the sound synthesis algorithm for plucked-string instruments on multi-core processor architectures and to determine the optimal processing element (PE) configuration for the target instruments. To determine the optimal PE configuration, we use architectural and workload simulations to evaluate how the sample-per-processing-element (SPE) ratio, defined as the amount of sample data directly mapped to each PE, affects system performance, area efficiency, and energy efficiency. For the acoustic guitar, the highest area and energy efficiencies are achieved at SPE ratios of 5,513 and 2,756, respectively, for the synthesis of musical sounds sampled at 44.1 kHz. For the classical guitar, the maximum area and energy efficiencies are achieved at SPE ratios of 22,050 and 5,513, respectively. In addition, the spectra of the synthetic sounds were very similar to those of the original sounds. Furthermore, we conducted a MUSHRA subjective listening test with ten subjects (nine graduate students and one professor from the University of Ulsan), and the synthetic sounds were rated excellent.

Analytical Approach to Compression and Shear Characteristics of the Unit Cell of PCM Core with Pyramidal Configuration (피라미드 형상의 PCM 코어 단위 셀의 압축 및 전단특성에 관한 해석적 연구)

  • Kim, S.W.;Jung, H.C.;Lee, Y.S.;Kang, B.S.
    • Transactions of Materials Processing / v.19 no.7 / pp.411-415 / 2010
  • A sandwich panel composed of truss cores faced with solid face sheets is lightweight and multi-functional. It is therefore widely used not only as a structural material but also as a heat transfer medium in transportation fields such as aircraft, trains, and vessels. There are various core topologies, such as pyramidal and tetrahedral trusses, square honeycombs, and kagome trusses. This study focuses on an analytical approach to optimizing the compression and shear characteristics of the unit cell of PCM with a pyramidal configuration. For various unit cell models that have the same core weight per unit area but different truss member angles, analytical solutions for the effective stress ($\bar{\sigma},\bar{\tau}$), the peak stress ($\bar{\sigma}_{peak},\bar{\tau}_{peak}$) due to yielding and buckling, the relative density ($\bar{\rho}_c$), and the effective stiffness ($\bar{E},\bar{G}$) were computed and compared with each other. With this approach, the optimal core configuration was predicted. The results provide effective guidelines for the design of PCM core structures.

Implementation of an Optimal Many-core Processor for Beamforming Algorithm of Mobile Ultrasound Image Signals (모바일 초음파 영상신호의 빔포밍 기법을 위한 최적의 매니코어 프로세서 구현)

  • Choi, Byong-Kook;Kim, Jong-Myon
    • Journal of the Korea Society of Computer and Information / v.16 no.8 / pp.119-128 / 2011
  • This paper presents a design space exploration of many-core processors that meet the high-performance and low-power requirements of the beamforming algorithm for mobile ultrasound image signals. For the design space exploration, we mapped different amounts of ultrasound image data to each processing element (PE) of the many-core processor, and then determined an optimal many-core architecture in terms of execution time, energy efficiency, and area efficiency. Experimental results indicate that PE=4096 and PE=1024 provide the highest energy efficiency and area efficiency, respectively. In addition, PE=4096 achieves 46x better energy efficiency and 10x better area efficiency than the TI DSP C6416, which is widely used in ultrasound imaging devices.

A Research of User Experience on Multi-Modal Interactive Digital Art

  • Qianqian Jiang;Jeanhun Chung
    • International Journal of Internet, Broadcasting and Communication / v.16 no.1 / pp.80-85 / 2024
  • The concept of single-modal digital art originated in the 20th century and has evolved through three key stages. Over time, digital art has transformed into multi-modal interaction, representing a new era in art forms. Based on multi-modal theory, this paper explores the characteristics of interactive digital art in innovative art forms and its impact on user experience. Through an analysis of practical applications of multi-modal interactive digital art, this study summarises the impact of creative models of digital art on the physical and mental aspects of user experience. In creating audio-visual art, multi-modal digital art should seamlessly incorporate sensory elements and leverage computer image processing technology. Focusing on user perception, emotional expression, and cultural communication, it strives to establish an immersive environment with user experience at its core. Future research, particularly with emerging technologies such as Artificial Intelligence (AI) and Virtual Reality (VR), should not merely prioritize technology but aim for meaningful interaction. Through multi-modal interaction, digital art is poised to continually innovate, offering new possibilities and expanding the realm of interactive digital art.

A Performance Improvement of Linux TCP/IP Stack based on Flow-Level Parallelism in a Multi-Core System (멀티코어 시스템에서 흐름 수준 병렬처리에 기반한 리눅스 TCP/IP 스택의 성능 개선)

  • Kwon, Hui-Ung;Jung, Hyung-Jin;Kwak, Hu-Keun;Kim, Young-Jong;Chung, Kyu-Sik
    • The KIPS Transactions:PartA / v.16A no.2 / pp.113-124 / 2009
  • With the growing use of multicore systems, much effort has been devoted to improving the performance of their applications. Because a multicore system has multiple processing devices in one system, its processing power is higher than that of a single-core system. In many cases, however, the advantages of multicore cannot be fully exploited because existing software and hardware were designed for a single core. When existing software runs on multicore, its performance improvement is limited by bottlenecks on shared resources and by inefficient use of cache memory. Consequently, as the number of cores increases, performance does not improve and, in the worst case, drops. In this paper we propose a method for improving the performance of multicore systems by applying flow-level parallelism to the existing TCP/IP network application and operating system. The proposed method sets up the execution environment so that each core operates as independently as possible across the network application, the TCP/IP stack in the operating system, the device driver, and the network interface. Moreover, it distributes network traffic to each core through an L2 switch. The proposed method minimizes the sharing of application data, data structures, sockets, device drivers, and network interfaces between cores. It also minimizes competition among cores for resources and increases the cache hit ratio. We implemented the proposed method on an 8-core system and ran experiments. Experimental results show that network access speed and bandwidth increase linearly with the number of cores.

Parallel Range Query Processing with R-tree on Multi-GPUs (다중 GPU를 이용한 R-tree의 병렬 범위 질의 처리 기법)

  • Ryu, Hongsu;Kim, Mincheol;Choi, Wonik
    • Journal of KIISE / v.42 no.4 / pp.522-529 / 2015
  • Ever since the R-tree was proposed to index multi-dimensional data, many efforts have been made to improve its query performance. One common approach is to parallelize query processing using multi-core architectures. To this end, a GPU-based R-tree has recently been proposed. However, even though a GPU-based R-tree can improve query performance, its ability to handle large volumes of data is limited because GPUs have limited physical memory. To address this problem, we propose the MGR-tree (Multi-GPU R-tree), which can manage large volumes of data by distributing nodes across multiple GPUs. Our experiments show that the MGR-tree is up to 9.1 times faster than a sequential search on a GPU and up to 1.6 times faster than a conventional GPU-based R-tree.

Bi-directional Bus Architecture Suitable to Multitasking in MPEG System (MPEG 시스템용 다중 작업에 적합한 양방향 버스 구조)

  • Jun Chi-hoon;Yeon Gyu-sung;Hwang Tae-jin;Wee Jae-Kyung
    • Journal of the Institute of Electronics Engineers of Korea SD / v.42 no.4 s.334 / pp.9-18 / 2005
  • This paper proposes a novel synchronous segmented bus architecture for MPEG systems that combines a pipelined bus architecture based on OCP (Open Core Protocol) with a memory-oriented bus. The proposed architecture includes buses that support the memory interface for image data processing in MPEG systems. It also has a segmented bi-directional multiple-bus architecture for multitasking with multiple masters and multiple slaves. In this scheme, the addresses of masters and slaves are fixed, so that IP cores are placed according to the operational characteristics of the system for efficient data processing. The bus architecture also adopts a synchronous segmented bus for the reuse of IPs and the architecture of developed chips. Owing to the inherent characteristics of multitasking operation and the segmented bus, this feature is well suited to high-performance, low-power multimedia SoC systems. The proposed bus architecture can achieve up to a 3.7-fold improvement in effective bandwidth and up to a 4-fold reduction in communication latency.