• Title/Summary/Keyword: Kernel Memory

Speaker Identification on Various Environments Using an Ensemble of Kernel Principal Component Analysis (커널 주성분 분석의 앙상블을 이용한 다양한 환경에서의 화자 식별)

  • Yang, Il-Ho;Kim, Min-Seok;So, Byung-Min;Kim, Myung-Jae;Yu, Ha-Jin
    • The Journal of the Acoustical Society of Korea / v.31 no.3 / pp.188-196 / 2012
  • In this paper, we propose a new speaker identification approach that uses an ensemble of multiple classifiers (speaker identifiers). KPCA (kernel principal component analysis) enhances the features for each classifier. To reduce processing time and memory requirements, we randomly select a limited number of samples to serve as the estimation set for each KPCA basis. The experimental results show that the proposed approach gives higher identification accuracy than GKPCA (greedy kernel principal component analysis).

Implementation of Hypervisor for Virtualizing uC/OS-II Real Time Kernel (uC/OS-II 실시간 커널의 가상화를 위한 하이퍼바이저 구현)

  • Shin, Dong-Ha;Kim, Ji-Yeon
    • Journal of the Korea Society of Computer and Information / v.12 no.5 / pp.103-112 / 2007
  • In this paper, we implement a hypervisor that runs multiple uC/OS-II real-time kernels on one microprocessor. The hypervisor virtualizes the microprocessor and memory, which are the main resources managed by the uC/OS-II kernel. The microprocessor is virtualized by controlling the interrupts that the uC/OS-II real-time kernel handles, and memory is virtualized by partitioning physical memory. The hypervisor consists of three components: interrupt control routines that virtualize the timer interrupt and the software interrupt, startup code that initializes the hypervisor and the uC/OS-II kernels, and an API that provides communication between two kernels. The original uC/OS-II kernel needs slight source-level modification to run on the hypervisor. We performed a real-time test and an independent computation test on the Jupiter 32-bit EISC microprocessor and showed that the virtualized kernels run without problems. Our results can reduce hardware cost, system space and weight, and system power consumption when the hypervisor is applied in embedded applications that require many embedded microprocessors.

Efficient Use of On-chip Memory through Profile-Driven Array Reorganization

  • Cho, Doosan;Youn, Jonghee
    • IEMEK Journal of Embedded Systems and Applications / v.6 no.6 / pp.345-359 / 2011
  • In high-performance embedded systems, the use of multiple on-chip memories is an essential architectural feature for exploiting the inherent parallelism in multimedia applications. This feature allows multiple data accesses to be executed in parallel. However, it remains difficult to exploit multiple on-chip memories effectively. Successful use of this architecture depends strongly on how efficiently memory parallelism in the target applications is detected and exploited. In this paper, we propose a technique based on a linear array access descriptor [1], which is generated from profiled data, to detect and exploit memory parallelism. The proposed technique tackles an array reorganization problem to maximize memory parallelism in multimedia applications. We present preliminary experiments applying the proposed technique to a representative coarse-grained reconfigurable array (CGRA) processor with multimedia kernel codes. Our experimental results demonstrate that the technique optimizes data placement by putting independent data on separate storage. The results show 9.8% higher performance on average compared with the existing method.

Performance Analysis and Identifying Characteristics of Processing-in-Memory System with Polyhedral Benchmark Suite (프로세싱 인 메모리 시스템에서의 PolyBench 구동에 대한 동작 성능 및 특성 분석과 고찰)

  • Jeonggeun Kim
    • Journal of the Semiconductor & Display Technology / v.22 no.3 / pp.142-148 / 2023
  • In this paper, we identify performance issues in executing compute kernels from PolyBench on Processing-in-Memory (PIM) devices. PolyBench includes compute kernels that are the core computational units of various data-intensive workloads, such as deep learning. Using our in-house simulator, we measured and compared various performance metrics of these workloads on traditional out-of-order and in-order processors and on PIM-based systems. The PIM-based system improves performance compared to the other computing models because of the short-term data reuse characteristic of the computational kernels from PolyBench. However, some kernels perform poorly on PIM-based systems without a multi-layer cache hierarchy because of their long-term data reuse characteristics. Hence, our evaluation and analysis suggest that further research should consider dynamic, workload-pattern-adaptive approaches to overcome the performance degradation of computational kernels with long-term data reuse characteristics and hidden data locality.

PMU (Performance Monitoring Unit)-Based Dynamic XIP (eXecute In Place) Technique for Embedded Systems (내장형 시스템을 위한 PMU (Performance Monitoring Unit) 기반 동적 XIP (eXecute In Place) 기법)

  • Kim, Dohun;Park, Chanik
    • IEMEK Journal of Embedded Systems and Applications / v.3 no.3 / pp.158-166 / 2008
  • These days, mobile embedded systems adopt XIP-capable flash memory because XIP can reduce memory usage, power consumption, and software load time. XIP gives processors direct access to ROM and flash memory. However, XIP degrades application performance because direct access to ROM and flash memory is slower than access to main memory. In this paper, we propose a memory management framework, dynamic XIP, which resolves this performance degradation. Using a constrained RAM cache, dynamic XIP changes the XIP region dynamically according to the page access pattern, reducing the execution-time or energy penalty of native XIP. The proposed framework consists of a page profiler, which gathers applications' memory access patterns using the PMU, and an XIP manager, which decides whether a page is accessed in main memory or in flash memory. The framework is implemented and evaluated in the Linux kernel. Our evaluation shows that it reduces execution time by up to 25% and energy consumption by up to 22% compared with the XIP-only configuration used in general mobile embedded systems. Moreover, in execution time and energy consumption, our modified LRU algorithm with code page filters achieves reductions of up to 90% and 80%, respectively, compared with applying the existing LRU algorithm alone to dynamic XIP.
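
The interplay between a page profiler and an XIP manager described in this abstract can be pictured with a small simulation. This is only a sketch under stated assumptions, not the authors' framework: the class name `DynamicXipCache`, the `hot_threshold` promotion rule, and the plain LRU eviction are hypothetical, and the software access counter merely stands in for PMU-based profiling. The paper's modified LRU with code page filters is not reproduced here.

```python
from collections import OrderedDict


class DynamicXipCache:
    """Toy model: pages execute in place from flash until they prove hot,
    then they are copied into a small RAM cache managed with plain LRU."""

    def __init__(self, capacity, hot_threshold=2):
        self.capacity = capacity            # number of pages the RAM cache holds
        self.hot_threshold = hot_threshold  # assumed promotion rule (hypothetical)
        self.counts = {}                    # per-page access counts (profiler stand-in)
        self.ram = OrderedDict()            # cached pages, least recently used first

    def access(self, page):
        """Return where this access is served from: 'ram' or 'flash'."""
        if page in self.ram:
            self.ram.move_to_end(page)      # refresh LRU position
            return "ram"
        self.counts[page] = self.counts.get(page, 0) + 1
        if self.counts[page] >= self.hot_threshold:
            if len(self.ram) >= self.capacity:
                self.ram.popitem(last=False)  # evict the least recently used page
            self.ram[page] = True             # later accesses hit RAM
        return "flash"                        # this access still went to flash


cache = DynamicXipCache(capacity=2)
trace = [1, 1, 1, 2, 2, 3, 3, 1]
served = [cache.access(p) for p in trace]
print(served)
print(list(cache.ram))
```

In this toy trace, page 1 is promoted after its second access and served from RAM on the third; it is later evicted when page 3 becomes hot, which is the kind of churn that a filter on top of LRU would aim to limit.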

A Study on Data Acquisition and Analysis Methods for Mac Memory Forensics (macOS 메모리 포렌식을 위한 데이터 수집 및 분석 방법에 대한 연구)

  • Jung Woo Lee;Dohyun Kim
    • Journal of the Korea Institute of Information Security & Cryptology / v.34 no.2 / pp.179-192 / 2024
  • macOS presents challenges for memory data acquisition due to its proprietary system architecture, closed-source kernel, and security features such as System Integrity Protection (SIP), which are exclusive to Apple's product line. Consequently, conventional memory acquisition tools are often ineffective or require a system reboot. This paper analyzes the status and limitations of existing memory forensics research and tools related to macOS. We investigate methods for memory acquisition and analysis across various macOS versions. Our findings include a practical memory acquisition and analysis process for digital forensic investigations that uses the OSXPmem and dd tools to acquire memory without rebooting the system, and Volatility 2 and 3 to analyze the memory data.

Design and Implementation of iSCSI Protocol Based Virtual USB Drive for Mobile Devices (모바일 장치를 위한 iSCSI 프로토콜 기반의 가상 USB 드라이브 설계 및 구현)

  • Choi, Jae-Hyun;Nam, Young Jin;Kim, JongWan
    • IEMEK Journal of Embedded Systems and Applications / v.5 no.4 / pp.175-184 / 2010
  • This paper designs a virtual USB drive for mobile devices that gives the illusion of a traditional USB flash memory drive and provides capacity-free storage space over an IP network. The virtual USB drive, which operates on an S3C2410 hardware platform with embedded Linux, consists of a USB device driver, an iSCSI-enabled network stack, and a seamless USB/iSCSI tunneling module. For performance enhancement, it additionally provides a kernel-level seamless USB/iSCSI tunneling module and data sharing with symbol references among kernel modules. Experiments reveal that the kernel-level implementation can improve I/O performance by up to 8 percent compared with the user-level implementation.

Concurrent Support Vector Machine Processor (Concurrent Support Vector Machine 프로세서)

  • 위재우;이종호
    • The Transactions of the Korean Institute of Electrical Engineers D / v.53 no.8 / pp.578-584 / 2004
  • We propose the CSVM (Concurrent Support Vector Machine), a digital architecture that performs all phases of the SVM (Support Vector Machine) recognition process, including kernel computing, learning, and recall, on a chip. Concurrent operation of the elements in a parallel architecture yields high speed and throughput. Classification problems on high-dimensional bio data are solved quickly and easily using the CSVM. The quadratic programming used in the original SVM learning algorithm is not suitable for hardware implementation because of its complexity and large memory consumption. Hardware-friendly SVM learning algorithms, kernel adatron and kernel perceptron, are therefore embedded on the chip. Experiments are performed on a fixed-point algorithm with quantization error, and the results are compared with those of the floating-point algorithm. The CSVM implemented on an FPGA chip produces fast and accurate results on high-dimensional cancer data.
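
As a rough illustration of why the kernel perceptron is considered hardware-friendly, the sketch below trains one in plain Python on a toy XOR problem. It is not the CSVM design: the RBF kernel, the `gamma` value, the floating-point arithmetic, and all function names are assumptions made for this example. The point is that the update rule needs only kernel evaluations, comparisons, and counter increments, with no quadratic programming solver.

```python
import math


def rbf_kernel(a, b, gamma=1.0):
    """Gaussian (RBF) kernel between two points; gamma is an assumed setting."""
    sq_dist = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-gamma * sq_dist)


def decision_value(X, y, alpha, x, kernel=rbf_kernel):
    """Weighted sum of kernel evaluations against the training points."""
    return sum(a * yj * kernel(xj, x) for a, yj, xj in zip(alpha, y, X))


def train_kernel_perceptron(X, y, kernel=rbf_kernel, epochs=100):
    """Return per-sample mistake counts (the dual coefficients)."""
    alpha = [0] * len(X)
    for _ in range(epochs):
        mistakes = 0
        for i, xi in enumerate(X):
            if y[i] * decision_value(X, y, alpha, xi, kernel) <= 0:
                alpha[i] += 1   # the only learning step: increment a counter
                mistakes += 1
        if mistakes == 0:       # every training point is classified correctly
            break
    return alpha


def predict(X, y, alpha, x, kernel=rbf_kernel):
    """Classify x as +1 or -1."""
    return 1 if decision_value(X, y, alpha, x, kernel) > 0 else -1


# XOR is not linearly separable in input space, but it is with an RBF kernel.
X = [(0, 0), (0, 1), (1, 0), (1, 1)]
y = [-1, 1, 1, -1]
alpha = train_kernel_perceptron(X, y)
predictions = [predict(X, y, alpha, x) for x in X]
print(alpha, predictions)
```

Because the coefficients are small integers, a fixed-point variant mainly has to quantize the kernel values, which is the kind of trade-off the abstract's fixed-point experiments address.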

MBS-LVM: A High-Performance Logical Volume Manager for Memory Bus-Connected Storages over NUMA Servers

  • Lee, Yongseob;Park, Sungyong
    • Journal of Information Processing Systems / v.15 no.1 / pp.151-158 / 2019
  • With recent advances in memory technologies, high-performance non-volatile memories such as the non-volatile dual in-line memory module (NVDIMM) have begun to be used as an addition or an alternative to server-side storage. When these memory bus-connected storages (MBSs) are installed in non-uniform memory access (NUMA) servers, the distance between NUMA nodes and MBSs is one of the crucial factors that influence file processing performance, because the access latency of a NUMA system varies depending on the distance from the NUMA nodes. This paper presents the design and implementation of a high-performance logical volume manager for MBSs, called MBS-LVM, for the case where multiple MBSs are scattered over a NUMA server. The MBS-LVM consolidates the address space of each MBS into a single global address space and dynamically utilizes storage space such that each thread can access an MBS with the lowest latency possible. We implemented the MBS-LVM in the Linux kernel and evaluated its performance by porting it to tmpfs, a memory-based file system widely used in Linux. The benchmarking results show that the write performance of tmpfs using MBS-LVM improves by up to twenty times over the original tmpfs on a NUMA server with four nodes.
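
The idea of steering each thread to the MBS with the lowest access latency can be sketched as a nearest-node allocation policy. This is a simplified illustration, not the MBS-LVM implementation: the distance table values, the one-MBS-per-node layout, and the names `pick_mbs` and `allocate_block` are all hypothetical.

```python
# Assumed node-to-node distance table for a four-node server
# (10 = local access, larger values = more hops).
DISTANCE = [
    [10, 21, 31, 21],
    [21, 10, 21, 31],
    [31, 21, 10, 21],
    [21, 31, 21, 10],
]


def pick_mbs(thread_node, distance, free_blocks):
    """Choose the MBS with free space that is closest to the thread's node.
    Ties are broken by the lower node number."""
    candidates = [node for node, free in free_blocks.items() if free > 0]
    if not candidates:
        raise RuntimeError("no free space on any MBS")
    return min(candidates, key=lambda node: (distance[thread_node][node], node))


def allocate_block(thread_node, distance, free_blocks):
    """Allocate one block and return the node whose MBS now holds it."""
    node = pick_mbs(thread_node, distance, free_blocks)
    free_blocks[node] -= 1
    return node


# One MBS per node; the MBS local to node 0 has only one free block left.
free_blocks = {0: 1, 1: 2, 2: 2, 3: 2}
placements = [allocate_block(0, DISTANCE, free_blocks) for _ in range(4)]
print(placements)
```

A thread on node 0 gets local storage first and spills to the nearest remote MBS only when the local one fills up, so the farthest node is used last.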

Kernel-level Software instrumentation via Light-weight Dynamic Binary Translation (경량 동적 코드 변환을 이용한 커널 수준 소프트웨어 계측에 관한 연구)

  • Lee, Dong-Woo;Kim, Jee-Hong;Eom, Young-Ik
    • Journal of Internet Computing and Services / v.12 no.5 / pp.63-72 / 2011
  • Binary translation is an emulation method that converts binary code compiled for one instruction set architecture into new binary code that can run on another. It has mostly been used for migrating legacy systems to new architectures. Recently, binary translation has also been used to instrument programs without modifying their source code, because it enables additional code to be inserted dynamically. For general applications, instrumentation software based on binary translation already exists, such as dynamic binary analyzers and virtual machine monitors. At the kernel level, however, several issues must be addressed before binary translation can be used, including system performance, memory management, privileged instructions, and synchronization. These issues stem from the structure of the kernel and from the differences between the kernel and user-level applications. In this paper, we present a scheme for applying binary translation and dynamic instrumentation to the kernel. We implement it on the Linux kernel and demonstrate that kernel-level binary translation adds insignificant overhead to system performance.