• Title/Summary/Keyword: overhead reduction

NVM-based Write Amplification Reduction to Avoid Performance Fluctuation of Flash Storage (플래시 스토리지의 성능 지연 방지를 위한 비휘발성램 기반 쓰기 증폭 감소 기법)

  • Lee, Eunji;Jeong, Minseong;Bahn, Hyokyung
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.16 no.4 / pp.15-20 / 2016
  • Write amplification is a critical factor that limits the stable performance of flash-based storage systems. To reduce write amplification, this paper presents a new technique that cooperatively manages data in flash storage and nonvolatile memory (NVM). Our scheme basically considers NVM as the cache of flash storage, but allows the original data in flash storage to be invalidated if there is a cached copy in NVM, which can temporarily serve as the original data. This scheme eliminates the copy-out operation for a substantial number of cached data, thereby enhancing garbage collection efficiency. Experimental results show that the proposed scheme reduces the copy-out overhead of garbage collection by 51.4% and decreases the standard deviation of response time by 35.4% on average.
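  A minimal sketch of the idea in the abstract above, not the authors' implementation: if a flash page also has a copy in NVM, the flash copy can be marked invalid, so garbage collection has fewer valid pages to copy out before erasing a block. The names `FlashBlock`, `copy_out_count`, and `invalidate_cached` are hypothetical, and the page-level bookkeeping is an assumption made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class FlashBlock:
    # Maps logical page number -> True while the flash copy is still valid.
    pages: Dict[int, bool] = field(default_factory=dict)


def copy_out_count(block: FlashBlock) -> int:
    """Number of valid pages garbage collection must copy before erasing the block."""
    return sum(1 for valid in block.pages.values() if valid)


def invalidate_cached(block: FlashBlock, nvm_cache: Set[int]) -> None:
    """Mark flash pages invalid when NVM holds a copy that can stand in for the original."""
    for lpn in block.pages:
        if lpn in nvm_cache:
            block.pages[lpn] = False


# Hypothetical victim block: pages 10, 11, 13 are valid, page 12 is already stale.
block = FlashBlock(pages={10: True, 11: True, 12: False, 13: True})
nvm_cache = {11, 13, 99}  # logical pages currently cached in NVM (assumed)

before = copy_out_count(block)       # valid pages before cooperative invalidation
invalidate_cached(block, nvm_cache)
after = copy_out_count(block)        # valid pages left for GC to copy out
print(before, after)
```

  In this toy block, only the page without an NVM copy still needs to be copied, which is the effect the abstract credits for the reduced copy-out overhead.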

LTRE: Lightweight Traffic Redundancy Elimination in Software-Defined Wireless Mesh Networks (소프트웨어 정의 무선 메쉬 네트워크에서의 경량화된 중복 제거 기법)

  • Park, Gwangwoo;Kim, Wontae;Kim, Joonwoo;Pack, Sangheon
    • Journal of KIISE / v.44 no.9 / pp.976-985 / 2017
  • The wireless mesh network (WMN) is a promising technology for building a cost-effective and easily deployed wireless networking infrastructure. To use the limited radio resources in WMNs efficiently, packet transmissions (particularly redundant ones) should be carefully managed. We therefore propose a lightweight traffic redundancy elimination (LTRE) scheme to reduce redundant packet transmissions in software-defined wireless mesh networks (SD-WMNs). In LTRE, the controller determines the optimal path of each packet to maximize the amount of traffic reduction. In addition, LTRE employs three novel techniques: 1) machine learning (ML)-based information request, 2) ID-based source routing, and 3) popularity-aware cache update. Simulation results show that LTRE reduces the traffic overhead by 18.34% to 48.89%.

Overhead Reduction Methods in Communication between 6LoWPAN and External Node (6LoWPAN 노드와 외부 노드의 통신 시에 오버헤드 감소 방법)

  • Choi, Dae-In;Enkhzul, Doopalam;Park, Jong-Tak;Kahng, Hyun-K.
    • The Journal of Korean Institute of Communications and Information Sciences / v.36 no.5B / pp.437-442 / 2011
  • As an Internet Engineering Task Force (IETF) Working Group, 6LoWPAN is standardizing the IPv6 packet transfer technology in accordance with IEEE 802.15.4. It has completed two Request for Comments (RFC) documents, one of which, RFC 4944, addresses fragmentation, reassembly, and header compression technologies. In this paper, a communication mechanism is proposed to provide efficient communication between 6LoWPAN and external nodes. In this mechanism, the gateway between 6LoWPAN and external networks serves as the proxy gateway between nodes. The simulation was conducted using QualNet to compare the performance of the proposed mechanism and the existing RFC 4944 method. The comparative analysis of the proposed mechanism and the existing method showed that the proposed method performed better.

Efficient Motion Information Representation in Splitting Region of HEVC (HEVC의 분할 영역에서 효율적인 움직임 정보 표현)

  • Lee, Dong-Shik;Kim, Young-Mo
    • Journal of Korea Multimedia Society / v.15 no.4 / pp.485-491 / 2012
  • This paper proposes a quadtree-based 'Coding Unit Tree' that efficiently represents the splitting information of a Coding Unit (CU) in HEVC together with its motion vectors. The new international video coding standard, High Efficiency Video Coding (HEVC), adopts various techniques and new unit concepts: the CU, the Prediction Unit (PU), and the Transform Unit (TU). The basic coding unit, the CU, is larger than the macroblock of H.264/AVC and is split hierarchically along a quadtree. However, when a CU contains complex motion, more signaling bits carrying motion information need to be transmitted. The quadtree structure provides flexibility and a basis for optimization, but it incurs overhead for the splitting information. This paper analyzes these signals and proposes a new algorithm that removes the redundancy. The proposed algorithm uses a type code, a dominant value, and residue values at each quadtree node to remove the additional bits. The type code represents the structure of an image tree, and the two values represent a node value. The results show that the proposed algorithm achieves a 13.6% bit-rate reduction over HM-1.0.

Method for Message Processing According to Priority in MQTT Broker (MQTT Broker에서 우선순위에 따른 메시지 처리를 위한 방법에 관한 연구)

  • Kim, Sung-jin;Oh, Chang-heon
    • Journal of the Korea Institute of Information and Communication Engineering / v.21 no.7 / pp.1320-1326 / 2017
  • Recently, lightweight protocols have been studied for IoT device communication in constrained network environments. MQTT is a typical lightweight protocol. It uses a small fixed header to minimize overhead and adopts a publish/subscribe structure to support real-time performance. However, MQTT does not support prioritization of important data and cannot provide QoS tailored to a specific IoT service. In this paper, we propose a message processing method that considers the priority of various IoT services in MQTT. In the proposed method, the sending node adds a priority flag to the MQTT fixed header, and the broker checks the priority of each message and processes higher-priority messages first. Through experiment and evaluation, we confirmed that end-to-end delay between nodes is reduced according to priority.
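  The broker-side behavior described above can be sketched with a priority queue. This is an illustrative assumption, not the paper's design: the abstract does not say how the priority flag is encoded in the fixed header, so the sketch carries priority as a plain integer next to each message. `PriorityBroker` and its methods are hypothetical names, and "lower number is served first" is an arbitrary convention.

```python
import heapq
import itertools


class PriorityBroker:
    """Toy broker queue: messages with a lower priority number are delivered first."""

    def __init__(self):
        self._heap = []
        # Monotonic sequence number keeps FIFO order among equal priorities.
        self._seq = itertools.count()

    def publish(self, priority, topic, payload):
        heapq.heappush(self._heap, (priority, next(self._seq), topic, payload))

    def deliver_next(self):
        _priority, _seq, topic, payload = heapq.heappop(self._heap)
        return topic, payload

    def __len__(self):
        return len(self._heap)


broker = PriorityBroker()
broker.publish(2, "home/temp", b"21.5")
broker.publish(0, "home/fire-alarm", b"ON")   # urgent message arrives second
broker.publish(2, "home/humidity", b"40")

delivered = [broker.deliver_next() for _ in range(len(broker))]
for topic, payload in delivered:
    print(topic, payload)
```

  The urgent message is delivered ahead of earlier routine messages, while messages that share a priority keep their arrival order.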

A Study of Integral Image Hardware Design for Memory Size Efficiency (메모리 크기에 효율적인 적분영상 하드웨어 설계 연구)

  • Lee, Su-Hyun;Jeong, Yong-Jin
    • Journal of the Institute of Electronics and Information Engineers / v.51 no.9 / pp.75-81 / 2014
  • An integral image stores, at each position, the sum of the input image pixel values above and to the left of that position. It is mainly used to speed up box filter operations, such as the computation of Haar-like features. However, the large memory needed for integral image data can be an obstacle in embedded hardware environments with limited memory resources, so an efficient method of storing the integral image is necessary. In this paper, we propose a hardware design that reduces the integral image memory size. The design uses two methods: a new integral image memory and modulo calculation to reduce the integral image data. The new integral image memory incurs additional calculation overhead, but this is not an obstacle in a hardware environment where parallel processing is possible. In experiments targeting the Xilinx Virtex5-LX330T, the integral image memory is reduced by 50% for a 640×480 8-bit gray-scale input image.
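  As a hedged illustration of why modulo calculation can shrink integral image storage (a general property, not necessarily the exact design of the paper above): if every entry is stored modulo M and M exceeds the largest box sum that will ever be queried, the box sum can still be recovered exactly from the wrapped entries. The function names, the 8×8 test image, and the 3×3 box size are assumptions chosen for the example.

```python
def integral_image(img, modulus=None):
    """Summed-area table with a zero top row and left column.

    If `modulus` is given, every stored entry is wrapped modulo that value.
    """
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        for x in range(w):
            v = img[y][x] + ii[y][x + 1] + ii[y + 1][x] - ii[y][x]
            ii[y + 1][x + 1] = v % modulus if modulus else v
    return ii


def box_sum(ii, top, left, bottom, right, modulus=None):
    """Sum of pixels in rows top..bottom-1 and columns left..right-1."""
    s = ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]
    return s % modulus if modulus else s


# Synthetic 8x8 image with 8-bit pixel values.
img = [[(37 * x + 91 * y) % 256 for x in range(8)] for y in range(8)]

MOD = 1 << 12  # 4096 > 9 * 255, the largest possible 3x3 box sum

full = integral_image(img)          # entries grow with the image area
wrapped = integral_image(img, MOD)  # every entry fits in 12 bits

exact = box_sum(full, 2, 3, 5, 6)
recovered = box_sum(wrapped, 2, 3, 5, 6, MOD)
print(exact, recovered)
```

  The wrapped table needs a word width set by the filter size rather than by the image size, which is where the memory saving comes from in this sketch.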

A Latency Optimization Mapping Algorithm for Hybrid Optical Network-on-Chip (하이브리드 광학 네트워크-온-칩에서 지연 시간 최적화를 위한 매핑 알고리즘)

  • Lee, Jae Hun;Li, Chang Lin;Han, Tae Hee
    • Journal of the Institute of Electronics and Information Engineers / v.50 no.7 / pp.131-139 / 2013
  • To overcome the performance and power consumption limitations of traditional network-on-chips (NoCs) based on electrical interconnects, a hybrid optical network-on-chip (HONoC) architecture using optical interconnects is emerging. However, the HONoC architecture must use a circuit-switching scheme because of the overhead of optical devices, which worsens the latency unfairness problem caused by frequent path collisions. This in turn degrades the overall performance of the system. In this paper, we propose a new task mapping algorithm that optimizes latency by reducing path collisions. The proposed algorithm allocates each task to a processing element (PE) so as to minimize path collisions and worst-case latencies. Compared with the random mapping technique and the bandwidth-constrained mapping technique, simulation results show average latency reductions of 43% and 61% for the 4×4 and 8×8 mesh topologies, respectively.

Design of a Low Power Turbo Decoder by Reducing Decoding Iterations (반복 복호수 감소에 의한 저전력 터보 복호기의 설계)

  • Back, Seo-Young;Kim, Sik;Back, Seo-Young
    • The Journal of Korean Institute of Communications and Information Sciences / v.29 no.1C / pp.1-8 / 2004
  • This paper proposes a novel algorithm for a low-power turbo decoder based on reducing the number of decoding iterations, targeting power-critical mobile communication devices. Previous approaches that attempt to reduce the number of decoding iterations, such as the CRC-aided and LLR methods, either degrade BER performance in return for reduced complexity or require additional hardware resources to control the number of iterations while meeting BER performance, respectively. The proposed algorithm reduces power consumption without degrading BER performance and does so with minimal hardware overhead, by comparing consecutive hard decision results using a simple buffer and counter. Simulation results show that the number of decoding iterations can be reduced to about 60% without degrading the BER performance in the proposed decoder, and that power consumption is saved in proportion to the number of decoding iterations.
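  A rough sketch of a stopping rule of the kind described above, under stated assumptions: the decoder stops once the hard decisions of two consecutive iterations agree. The abstract does not give the exact criterion, so the zero-mismatch threshold, the function name `decode_with_early_stop`, and the precomputed LLR trace standing in for real decoder output are all hypothetical.

```python
def decode_with_early_stop(llr_per_iteration, max_iterations=8):
    """Return (iterations_used, hard_bits), stopping when hard decisions repeat.

    `llr_per_iteration` is a list of per-iteration LLR vectors standing in
    for the output of a real iterative decoder.
    """
    previous = None  # buffer holding the previous iteration's hard decisions
    used = 0
    for llrs in llr_per_iteration[:max_iterations]:
        used += 1
        hard = [1 if llr >= 0 else 0 for llr in llrs]
        if previous is not None:
            # Counter of bit positions that changed since the last iteration.
            mismatches = sum(a != b for a, b in zip(hard, previous))
            if mismatches == 0:
                return used, hard
        previous = hard
    return used, previous


# Hypothetical LLR trace: decisions settle after the second iteration.
trace = [
    [-0.2, 0.1, -0.4, 0.3],
    [0.5, 0.9, -1.1, 0.8],
    [1.7, 2.0, -2.5, 1.9],
    [3.0, 3.4, -3.9, 3.1],
    [4.2, 4.8, -5.0, 4.4],
]

used, bits = decode_with_early_stop(trace)
print(used, bits)
```

  Only a one-frame buffer and a mismatch counter are needed here, which matches the abstract's point that the check adds little hardware.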

Design of Asynchronous 16-Bit Divider Using NST Algorithm (NST알고리즘을 이용한 비동기식 16비트 제산기 설계)

  • 이우석;박석재;최호용
    • Journal of the Institute of Electronics Engineers of Korea SD / v.40 no.3 / pp.33-42 / 2003
  • This paper describes an efficient design of an asynchronous 16-bit divider using the NST (new Svoboda-Tung) algorithm. The divider is designed to reduce power consumption by using an asynchronous design scheme in which the division operation is performed only when it is requested. The divider consists of three blocks in an asynchronous pipeline structure: a pre-scale block, an iteration step block, and an on-the-fly converter block. The pre-scale block is designed with a new subtracter for small area and high performance. The iteration step block consists of an asynchronous ring structure with 4 division steps for area reduction. In order to reduce hardware overhead, the part of the ring structure on the critical path is designed as a dual-rail circuit, and the remaining part as a single-rail circuit. The on-the-fly converter block is designed for high performance using the on-the-fly algorithm, which enables parallel operation with the iteration step block. The design results with a 0.6 μm CMOS process show that the divider consists of 12,956 transistors with an area of 1,480 × 1,200 μm² and an average-case delay of 41.7 ns.

Design of efficient self-repair system for multi-faults (다중고장에 대한 효율적인 자가치유시스템 설계)

  • Choi, Ho-Yong;Seo, Jung-Il;Yu, Chung-Ho;Woo, Cheol-Jong;Lee, Jae-Eun
    • Journal of the Institute of Electronics Engineers of Korea SD / v.43 no.11 s.353 / pp.69-76 / 2006
  • This paper proposes a self-repair system that can repair itself at the cell level by imitating the structure of living organisms. Because the data of artificial cells can also move diagonally, the system can repair faults cell by cell rather than column by column, which yields an efficient self-repair system for multiple faults. Moreover, in the artificial cell design, the logic-based design method yields a smaller system size than the previous register-based design method. Our experimental result for a 2-bit up/down counter shows a 40.3% reduction in hardware overhead compared to the previous method [6].