• Title/Summary/Keyword: low speed processor

Search Result 109, Processing Time 0.028 seconds

Effectiveness of Edge Selection on Mobile Devices (모바일 장치에서 에지 선택의 효율성)

  • Kang, Seok-Hoon
    • Journal of the Korea Society of Computer and Information
    • /
    • v.16 no.7
    • /
    • pp.149-156
    • /
    • 2011
  • This paper proposes the effective edge selection algorithm for the rapid processing time and low memory usage of efficient graph-based image segmentation on mobile device. The graph-based image segmentation algorithm is to extract objects from a single image. The objects are consisting of graph edges, which are created by information of each image's pixel. The edge of graph is created by the difference of color intensity between the pixel and neighborhood pixels. The object regions are found by connecting the edges, based on color intensity and threshold value. Therefore, the number of edges decides on the processing time and amount of memory usage of graph-based image segmentation. Comparing to personal computer, the mobile device has many limitations such as processor speed and amount of memory. Additionally, the response time of application is an issue of mobile device programming. The image processing on mobile device should offer the reasonable response time, so that, the image segmentation processing on mobile should provide with the rapid processing time and low memory usage. In this paper, we demonstrate the performance of the effective edge selection algorithm, which effectively controls the edges of graph for the rapid processing time and low memory usage of graph-based image segmentation on mobile device.

Performance improvement on mobile devices using MVC+Prefetch Controller Pattern (MVC+Prefetch Controller 패턴을 사용한 모바일 기기의 성능향상 기법)

  • Im, Byung-Jai;Lee, Eun-Seok
    • The KIPS Transactions:PartD
    • /
    • v.18D no.3
    • /
    • pp.179-184
    • /
    • 2011
  • Current mobile devices have surpassed its boundaries as a more communication tool to a smart device which provides additional features. These features have supported the smart life of its users, but have reached its limit from low-performance processors and short-battery time. These issues can be resolved b implementing higher performing hardware, but they come with a burden of high cost. This paper introduces a new way of managing computing resources in a mobile device by enhancing the quality of human-computer interaction. The real-speed felt by users are mainly influenced by the time it takes form a user's input to the device to display the completed result on the screen. Since the size of the screen for mobile devices are small, if the processor only fetch data to be used for displaying on screen, the time can be significantly reduced. MVC+Prefetch Controller pattern accomplished this goal by using the minimum amount of data from DB to fetch display and still manages to support high-speed data transfer to achieve seamless display. This idea has been realized by practice using Samsung mobile phone S8500, which demonstrated the superior performance on user's perspective.

Low-Complexity Deeply Embedded CPU and SoC Implementation (낮은 복잡도의 Deeply Embedded 중앙처리장치 및 시스템온칩 구현)

  • Park, Chester Sungchung;Park, Sungkyung
    • Journal of the Korea Academia-Industrial cooperation Society
    • /
    • v.17 no.3
    • /
    • pp.699-707
    • /
    • 2016
  • This paper proposes a low-complexity central processing unit (CPU) that is suitable for deeply embedded systems, including Internet of things (IoT) applications. The core features a 16-bit instruction set architecture (ISA) that leads to high code density, as well as a multicycle architecture with a counter-based control unit and adder sharing that lead to a small hardware area. A co-processor, instruction cache, AMBA bus, internal SRAM, external memory, on-chip debugger (OCD), and peripheral I/Os are placed around the core to make a system-on-a-chip (SoC) platform. This platform is based on a modified Harvard architecture to facilitate memory access by reducing the number of access clock cycles. The SoC platform and CPU were simulated and verified at the C and the assembly levels, and FPGA prototyping with integrated logic analysis was carried out. The CPU was synthesized at the ASIC front-end gate netlist level using a $0.18{\mu}m$ digital CMOS technology with 1.8V supply, resulting in a gate count of merely 7700 at a 50MHz clock speed. The SoC platform was embedded in an FPGA on a miniature board and applied to deeply embedded IoT applications.

Design and Implementation of a Bluetooth Baseband Module with DMA Interface (DMA 인터페이스를 갖는 블루투스 기저대역 모듈의 설계 및 구현)

  • Cheon, Ik-Jae;O, Jong-Hwan;Im, Ji-Suk;Kim, Bo-Gwan;Park, In-Cheol
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.39 no.3
    • /
    • pp.98-109
    • /
    • 2002
  • Bluetooth technology is a publicly available specification proposed for Radio Frequency (RF) communication for short-range :1nd point-to-multipoint voice and data transfer. It operates in the 2.4㎓ ISM(Industrial, Scientific and Medical) band and offers the potential for low-cost, broadband wireless access for various mobile and portable devices at range of about 10 meters. In this paper, we describe the structure and the test results of the bluetooth baseband module with direct memory access method we have developed. This module consists of three blocks; link controller, UART interface, and audio CODEC. This module has a bus interface for data communication between this module and main processor and a RF interface for the transmission of bit-stream between this module and RF module. The bus interface includes DMA interface. Compared with the link controller with FIFOs, The module with DMA has a wide difference in size of module and speed of data processing. The small size module supplies lorr cost and various applications. In addition, this supports a firmware upgrade capability through UART. An FPGA and an ASIC implementation of this module, designed as soft If, are tested for file and bit-stream transfers between PCs.

A Study of FC-NIC Design Using zynq SoC for Host Load Reduction (호스트 부하 경감 달성을 위한 zynq SoC를 적용한 FC-NIC 설계에 관한 연구)

  • Hwang, Byeung-Chang;Seo, Jung-hoon;Kim, Young-Su;Ha, Sung-woo;Kim, Jae-Young;Jang, Sun-geun
    • Journal of Advanced Navigation Technology
    • /
    • v.19 no.5
    • /
    • pp.423-432
    • /
    • 2015
  • This paper shows that design, manufacture and the performance of FC-NIC (fibre channel network interface card) for network unit configuration which is based on one of the 5 main configuration items of the common functional module for IMA (integrated modular Avionics) architecture. Especially, FC-NIC uses zynq SoC (system on chip) for host load reductions. The host merely transmit FC destination address, source memory location and size information to the FC-NIC. After then the FC-NIC read the host memory via DMA (direct memory access). FC upper layer protocol and sequence process at local processor and programmable logic of FC-NIC zynq SoC. It enables to free from host load for external communication. The performance of FC-NIC shows average 5.47 us low end-to-end latency at 2.125 Gbps line speed. It represent that FC-NIC is one of good candidate network for IMA.

Design and Implementation of an Alternate System Interconnect based on PCI Express (PCI Express 기반 시스템 인터커넥트의 설계 및 구현)

  • Kim, Young Woo;Ren, Ye;Choi, WonHyuk
    • Journal of the Institute of Electronics and Information Engineers
    • /
    • v.52 no.8
    • /
    • pp.74-85
    • /
    • 2015
  • PCI Express is a well-known and widely used de-facto system bus standard for connecting among a processor and IO devices. PCI Express is originated from old PCI standard, and its most of applications are limited to be used within a PC or server system. But, because of its fast speed, low power consumption, and good protocol efficiency, it is considered as one of a good candidate for an alternate system interconnect for many years. In this paper, we present design, implementation and early evaluation of an alternate system interconnect by utilizing PCI Express. The developed alternate system interconnect using PCI Express (named PCIeLINK) utilizes non-transparent bridging (NTB) technic which generally used in fail-over system in PCI and PCI Express. By using NTB technic, PCI Express device can be extended to outside of a system without electrical and logical problems arising during system boot and enumeration. To build up an alternate system interconnect, we designed and implemented a network interface card having multiple PCI Express ${\times}4$ connections (theoretically 20 Gbps) and tested, The early test results revealed that an ${\times}4$ port in the card showed 8.6 Gbps peak performance for bulk transmission and 5.1 Gbps peak for normal TCP/IP transfer.

Parallel SystemC Cosimulation using Virtual Synchronization (가상 동기화 기법을 이용한 SystemC 통합시뮬레이션의 병렬 수행)

  • Yi, Young-Min;Kwon, Seong-Nam;Ha, Soon-Hoi
    • Journal of KIISE:Computer Systems and Theory
    • /
    • v.33 no.12
    • /
    • pp.867-879
    • /
    • 2006
  • This paper concerns fast and time accurate HW/SW cosimulation for MPSoC(Multi-Processor System-on-chip) architecture where multiple software and/or hardware components exist. It is becoming more and more common to use MPSoC architecture to design complex embedded systems. In cosimulation of such architecture, as the number of the component simulators participating in the cosimulation increases, the time synchronization overhead among simulators increases, thereby resulting in low overall cosimulation performance. Although SystemC cosimulation frameworks show high cosimulation performance, it is in inverse proportion to the number of simulators. In this paper, we extend the novel technique, called virtual synchronization, which boosts cosimulation speed by reducing time synchronization overhead: (1) SystemC simulation is supported seamlessly in the virtual synchronization framework without requiring the modification on SystemC kernel (2) Parallel execution of component simulators with virtual synchronization is supported. We compared the performance and accuracy of the proposed parallel SystemC cosimulation framework with MaxSim, a well-known commercial SystemC cosimulation framework, and the proposed one showed 11 times faster performance for H.263 decoder example, while the accuracy was maintained below 5%.

Design and Implementation of Initial OpenSHMEM Based on PCI Express (PCI Express 기반 OpenSHMEM 초기 설계 및 구현)

  • Joo, Young-Woong;Choi, Min
    • KIPS Transactions on Computer and Communication Systems
    • /
    • v.6 no.3
    • /
    • pp.105-112
    • /
    • 2017
  • PCI Express is a bus technology that connects the processor and the peripheral I/O devices that widely used as an industry standard because it has the characteristics of high-speed, low power. In addition, PCI Express is system interconnect technology such as Ethernet and Infiniband used in high-performance computing and computer cluster. PGAS(partitioned global address space) programming model is often used to implement the one-sided RDMA(remote direct memory access) from multi-host systems, such as computer clusters. In this paper, we design and implement a OpenSHMEM API based on PCI Express maintaining the existing features of OpenSHMEM to implement RDMA based on PCI Express. We perform experiment with implemented OpenSHMEM API through a matrix multiplication example from system which PCs connected with NTB(non-transparent bridge) technology of PCI Express. The PCI Express interconnection network is currently very expensive and is not yet widely available to the general public. Nevertheless, we actually implemented and evaluated a PCI Express based interconnection network on the RDK evaluation board. In addition, we have implemented the OpenSHMEM software stack, which is of great interest recently.

Linear Resource Sharing Method for Query Optimization of Sliding Window Aggregates in Multiple Continuous Queries (다중 연속질의에서 슬라이딩 윈도우 집계질의 최적화를 위한 선형 자원공유 기법)

  • Baek, Seong-Ha;You, Byeong-Seob;Cho, Sook-Kyoung;Bae, Hae-Young
    • Journal of KIISE:Databases
    • /
    • v.33 no.6
    • /
    • pp.563-577
    • /
    • 2006
  • A stream processor uses resource sharing method for efficient of limited resource in multiple continuous queries. The previous methods process aggregate queries to consist the level structure. So insert operation needs to reconstruct cost of the level structure. Also a search operation needs to search cost of aggregation information in each size of sliding windows. Therefore this paper uses linear structure for optimization of sliding window aggregations. The method comprises of making decision, generation and deletion of panes in sequence. The decision phase determines optimum pane size for holding accurate aggregate information. The generation phase stores aggregate information of data per pane from stream buffer. At the deletion phase, panes are deleted that are no longer used. The proposed method uses resources less than the method where level structures were used as data structures as it uses linear data format. The input cost of aggregate information is saved by calculating only pane size of data though numerous stream data is arrived, and the search cost of aggregate information is also saved by linear searching though those sliding window size is different each other. In experiment, the proposed method has low usage of memory and the speed of query processing is increased.