• Title/Summary/Keyword: hardware optimization

Development of a Lane Departure Warning Application on a Smartphone (스마트폰용 차선이탈경보 애플리케이션 개발)

  • Ro, Kwang-Hyun
    • Journal of the Korea Academia-Industrial cooperation Society / v.12 no.6 / pp.2793-2800 / 2011
  • This research develops and optimizes a lane departure warning application for a smartphone, a platform that can host a wide range of mobile information applications. Lane departure warning systems, a representative safe-driving assistance solution, are now being commercialized, but their market is still not growing because they require a powerful and expensive embedded hardware platform. This work proposes developing and optimizing a lane departure warning application on the iPhone 3GS. OpenCV is used for efficient image processing, and a heuristic algorithm based on the Hough transform is proposed for lane detection. The application was developed on a Macintosh PC with the Xcode 3.2.4 development tools, downloaded to the iPhone, and tested on a real paved road. In the experiments, the detection ratio for straight lanes was over 90% and the processing speed was 1.52 fps. Several optimization methods were introduced to improve the speed, and the fastest speed achieved was 3.84 fps. With further improvement of the lane detection algorithm, additional optimization work, and the adoption of a more powerful platform, the application is expected to be commercially viable on the smartphone application market.
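
The Hough-transform voting step that this kind of lane detection builds on can be sketched in plain Python. The abstract gives neither the heuristic nor the OpenCV calls that were used, so the function name `hough_peak`, the angular resolution `n_theta`, and the toy edge points are assumptions made for illustration only.

```python
import math
from collections import Counter

def hough_peak(points, n_theta=180):
    """Return (rho, theta_deg, votes) for the strongest line through `points`.

    Every edge point (x, y) votes for each line
    rho = x*cos(theta) + y*sin(theta) that passes through it;
    collinear points accumulate in the same (rho, theta) bin.
    """
    votes = Counter()
    for x, y in points:
        for t in range(n_theta):
            theta = math.pi * t / n_theta
            rho = round(x * math.cos(theta) + y * math.sin(theta))
            votes[(rho, t)] += 1
    (rho, t), count = votes.most_common(1)[0]
    return rho, 180.0 * t / n_theta, count

# Toy input: 50 edge pixels on the diagonal y = x (hypothetical data).
edge_points = [(i, i) for i in range(50)]
rho, theta_deg, count = hough_peak(edge_points)
```

A real application would take the edge points from an edge detector run on each camera frame and would restrict the angle range to plausible lane orientations, which is one way to reduce the per-frame cost.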

Finite element-based software-in-the-loop for offline post-processing and real-time simulations

  • Oveisi, Atta;Sukhairi, T. Arriessa;Nestorovic, Tamara
    • Structural Engineering and Mechanics / v.67 no.6 / pp.643-658 / 2018
  • In this paper, we introduce a new framework for running finite element (FE) packages inside an online loop together with MATLAB. In contrast to Hardware-in-the-Loop (HiL) techniques, in the proposed Software-in-the-Loop (SiL) framework the FE package serves as a simulation platform that replicates the real system, which may be inaccessible for strategic reasons such as cost and availability. In practice, SiL for sophisticated structural design and multi-physics simulations provides a platform for preliminary tests before prototyping and mass production. This may significantly reduce the cost of a new product and adds flexibility in trying different instruments, so that the most cost-effective ones can be shortlisted before moving to real-time experiments on civil and mechanical systems. The proposed SiL interconnection is not limited to ABAQUS, as long as the host FE package can execute user-defined commands in FORTRAN. The focus of this research is on using a compiled FORTRAN subroutine as a messenger between the ABAQUS/CAE kernel and the MATLAB Engine. To show the generality of the proposed scheme, the limitations of the SiL schemes available in the literature are addressed. Additionally, all technical details for establishing the connection between FEM and MATLAB are provided for the interested reader. Finally, two numerical sub-problems are defined for offline and online post-processing, namely offline optimization and closed-loop system performance analysis in control theory.

Experimental and computational analysis of behavior of three-way catalytic converter under axial and radial flow conditions

  • Taibani, Arif Zakaria;Kalamkar, Vilas
    • International Journal of Fluid Machinery and Systems / v.5 no.3 / pp.134-142 / 2012
  • The competition to deliver ultra-low-emission vehicles at a reasonable cost is driving the automotive industry to invest significant manpower and test laboratory resources in the design optimization of increasingly complex exhaust after-treatment systems. Optimization can no longer rely on traditional approaches, which are intensive in hardware use and laboratory testing. CFD is in high demand for analysis and design because it reduces development cost and time-consuming experiments. This paper describes the development of a comprehensive, practical, experiment-based model for simulating the performance of automotive three-way catalytic converters, which are used to reduce engine exhaust emissions. An experiment is conducted to measure species concentrations before and after the catalytic converter at different engine loads. The model simulates the behavior of the emission system using an exhaust system heat conservation sub-model and a catalyst chemical kinetics sub-model. CFD simulation is used to study the performance of the catalytic converter. The substrate is modeled as a porous medium in FLUENT, and the standard k-ε model is used for turbulence. The flow pattern is changed from axial to radial by changing the substrate model inside the catalytic converter. The flow distribution and the conversion efficiencies of CO, HC and NOx are obtained, and the predictions are in good agreement with the experimental measurements. The change from axial to radial flow is found to make the catalytic converter more efficient. These studies give a better understanding of catalytic converter performance and thus help to optimize the converter design.

Efficient Design Methodology based on Hybrid Logic Synthesis for SoC (효율적인 SoC 논리합성을 위한 혼합방식의 설계 방법론)

  • Seo, Young-Ho;Kim, Dong-Wook
    • Journal of the Korea Institute of Information and Communication Engineering / v.16 no.3 / pp.571-578 / 2012
  • This paper makes two main contributions: the first concerns the constraints for logic synthesis, and the second is an efficient logic synthesis method. Logic synthesis is the process of obtaining a gate-level netlist from RTL (register transfer level) code through logic mapping and optimization under specified constraints. The result of logic synthesis depends strongly on the constraints and on the synthesis method. Since circuit size and timing can change dramatically with these choices, they must be considered carefully. We present the items to consider during logic synthesis, drawing on our experience and experimental results. The proposed techniques were applied to a circuit with hardware resources of about 650K gates. The synthesis time of the hybrid method was 47% shorter than that of the bottom-up method, and its timing slack was better than that of the top-down method.

Measuring Hadoop Optimality by Lorenz Curve (로렌츠 커브를 이용한 하둡 플랫폼의 최적화 지수)

  • Kim, Woo-Cheol;Baek, Changryong
    • The Korean Journal of Applied Statistics / v.27 no.2 / pp.249-261 / 2014
  • Ever-increasing "Big data" can only be processed effectively by parallel computing. Parallel computing refers to a high-performance computational method that divides a big query into smaller subtasks and aggregates the results of the subtasks into an output. However, it is well known that parallel computing does not readily achieve scalability, meaning that performance improves linearly as more computers are added, because scalability requires very careful assignment of tasks to each node and timely collection of results. Hadoop is one of the most successful platforms at attaining scalability. In this paper, we propose a measure of Hadoop optimization that uses a Lorenz curve as a proxy for the inequality of hardware resources. The proposed index takes into account the intrinsic overhead of Hadoop systems, such as CPU, disk I/O and network. It therefore also indicates explicitly whether, and in what capacity, a given Hadoop system can be improved. The proposed method is illustrated with experimental data and substantiated by Monte Carlo simulations.
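
For readers unfamiliar with the Lorenz curve, the sketch below computes the standard curve and the associated Gini coefficient for per-node resource usage. The abstract does not define the proposed index, so this is the textbook construction rather than the authors' measure, and the function names and the sample loads are hypothetical.

```python
def lorenz_curve(loads):
    """Cumulative share of the total load carried by the k least-loaded nodes, k = 0..n."""
    xs = sorted(loads)
    total = sum(xs)
    if total <= 0:
        raise ValueError("total load must be positive")
    curve = [0.0]
    running = 0.0
    for x in xs:
        running += x
        curve.append(running / total)
    return curve

def gini(loads):
    """Gini coefficient: 1 - 2 * (area under the Lorenz curve), by the trapezoid rule."""
    curve = lorenz_curve(loads)
    n = len(loads)
    area = sum((curve[k] + curve[k + 1]) / (2 * n) for k in range(n))
    return 1.0 - 2.0 * area

# Hypothetical CPU-seconds consumed on four worker nodes.
balanced = gini([10, 10, 10, 10])   # every node does the same amount of work
skewed = gini([0, 0, 0, 40])        # one node does all of the work
```

A value near 0 means the resource is used evenly across nodes; a value approaching 1 means a few nodes carry most of the load.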

Optimum Interleaver Design and Performance Analysis of Double-Binary Turbo Code for Wireless Metropolitan Area Networks (WMAN 시스템의 이중 이진 구조 터보부호 인터리버 최적화 설계 및 성능 분석)

  • Park, Sung-Joon
    • Journal of the Korea Society for Simulation / v.17 no.1 / pp.17-22 / 2008
  • Double-binary turbo codes have been adopted as the error control code of various future communication systems, including wireless metropolitan area networks (WMAN), because of their powerful error correction capability. One of the components affecting the performance of a turbo code is its internal interleaver. In the 802.16 d/e system, an almost regular permutation (ARP) interleaver is included in the specification; however, this interleaver does not appear to be optimized in terms of decoding performance. In this paper, we propose three optimization methods for the interleaver, based on spatial distance, spread, and minimum distance between the original and interleaved sequences. We find optimized interleaving parameters for each method and evaluate the proposed methods by computer simulation over an additive white Gaussian noise (AWGN) channel. The optimized parameters provide up to 1.0 dB of power gain over the conventional method, and this gain requires no additional hardware complexity.
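
As an illustration of one of the criteria named above, the sketch below evaluates the spread of a permutation using one common definition: the smallest value of |i - j| + |π(i) - π(j)| over all pairs of positions. The abstract does not state the exact metric or the ARP parameters, so the definition, the plain (non-circular) distances, and the stand-in `linear_interleaver` are all assumptions.

```python
from itertools import combinations

def spread(perm):
    """Smallest |i - j| + |perm[i] - perm[j]| over all index pairs i != j."""
    return min(
        abs(i - j) + abs(perm[i] - perm[j])
        for i, j in combinations(range(len(perm)), 2)
    )

def linear_interleaver(n, step):
    """Hypothetical stand-in permutation i -> (step * i) mod n; needs gcd(step, n) == 1."""
    return [(step * i) % n for i in range(n)]

identity = list(range(8))               # no interleaving at all
candidate = linear_interleaver(8, 3)    # [0, 3, 6, 1, 4, 7, 2, 5]
better = spread(candidate) > spread(identity)
```

A parameter search in this style would evaluate `spread` for each candidate parameter set and keep the permutations with the largest value.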

Optimization of Warp-wide CUDA Implementation for Parallel Shifted Sort Algorithm (병렬 Shifted Sort 알고리즘의 Warp 단위 CUDA 구현 최적화)

  • Park, Taejung
    • Journal of Digital Contents Society / v.18 no.4 / pp.739-745 / 2017
  • This paper presents and discusses an implementation of the GPU shifted sort method for finding approximate k nearest neighbors that executes within a "warp", the minimum execution unit in the GPU parallel architecture. It also compares the results with two other common nearest neighbor search methods, a GPU-based kd-tree and the ANN (Approximate Nearest Neighbor) library. The proposed implementation focuses on small values of k, i.e. 2, 4, 8, and 16, which are handled efficiently within a warp, since applications very commonly use small k. The paper also discusses ways to optimize the implementation by improving memory management in a loop for the CUB open library and by adopting CUDA commands that are supported by the GPU hardware. In the tests, the proposed implementation shows more than a 16-fold speed-up over the other GPU-based methods, suggesting that the improvement would be even greater for larger input data.

Optimization Method on the Number of the Processing Elements in the Multi-Stage Motion Estimation Algorithm for High Efficiency Video Coding (HEVC 다단계 움직임 추정 기법에서 단위 연산기 개수의 최적화 방법)

  • Lee, Seongsoo
    • Journal of IKEEE / v.21 no.1 / pp.100-103 / 2017
  • Motion estimation accounts for the largest share of computation in video compression. Multiple processing elements are often used in parallel to meet the required processing speed. More processing elements increase processing speed, but they also increase hardware area. Therefore, it is important to optimize the number of processing elements. HEVC (high efficiency video coding) usually uses multi-stage motion estimation algorithms for low computation and high performance. Since the number and position of search points differ in each stage, the utilization of the processing elements is not always 100%, and it varies considerably with the number of processing elements. This paper proposes a method for optimizing the number of processing elements. It finds the optimal number of processing elements for a given multi-stage motion estimation algorithm by calculating the utilization and execution cycles of the processing elements.
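
The trade-off described above can be sketched with a simple cycle model. The abstract does not give the paper's formulas, so the model below is an assumption: each processing element handles one search point per cycle, and a stage must finish before the next one starts. The stage sizes and the names `pe_schedule` and `sweep` are hypothetical.

```python
import math

def pe_schedule(points_per_stage, n_pe):
    """Return (cycles, utilization) when each stage's search points are spread over n_pe PEs.

    Assumed model: one search point per PE per cycle, stages run one after another.
    """
    cycles = sum(math.ceil(p / n_pe) for p in points_per_stage)
    utilization = sum(points_per_stage) / (n_pe * cycles)
    return cycles, utilization

def sweep(points_per_stage, candidates):
    """Evaluate several PE counts so that speed can be weighed against idle hardware."""
    return {n: pe_schedule(points_per_stage, n) for n in candidates}

# Hypothetical three-stage search with 9, 8 and 8 search points.
results = sweep([9, 8, 8], candidates=[4, 8, 9])
```

Under this model, a PE count that does not divide the stage sizes evenly leaves some elements idle in the last cycle of a stage, which is why utilization depends so strongly on the chosen count.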

An Optimization Tool for Determining Processor Affinity of Networking Processes (통신 프로세스의 프로세서 친화도 결정을 위한 최적화 도구)

  • Cho, Joong-Yeon;Jin, Hyun-Wook
    • KIPS Transactions on Software and Data Engineering / v.2 no.2 / pp.131-136 / 2013
  • Multi-core processors can improve the parallelism of application processes and thus enhance system throughput. Researchers have recently shown that processor affinity is an important factor in network I/O performance because of the architectural characteristics of multi-core processors; consequently, many researchers are trying to devise schemes for deciding an optimal processor affinity. Existing schemes that decide the processor affinity dynamically can adapt transparently to system changes, such as application modifications and hardware upgrades, but they have limited access to the characteristics of application behavior and to run-time information that can be collected heuristically. As a result, they can provide only sub-optimal processor affinity. In this paper, we define meaningful system variables for determining the optimal processor affinity and propose a tool to gather this information. We show that the implemented tool overcomes the limitations of existing schemes and improves network bandwidth.

New generation software of structural analysis and design optimization--JIFEX

  • Gu, Yuanxian;Zhang, Hongwu;Guan, Zhenqun;Kang, Zhan;Li, Yunpeng;Zhong, Wanxie
    • Structural Engineering and Mechanics / v.7 no.6 / pp.589-599 / 1999
  • This paper presents the development and applications of the software package JIFEX, a new finite element system for structural analysis and optimum design built on modern computer hardware and software technologies such as MS Windows95/NT and Pentium PC platforms. The complete JIFEX system is programmed in C/C++ to make full use of the advanced facilities of MS Windows95/NT. Finite element data pre-processing is implemented on top of the most popular CAD package, AutoCAD (R13, R14), so that finite element modeling can be integrated with the geometric modeling of CAD. The system not only has an interactive graphics facility for data post-processing, but also provides real-time visualization of the computation by means of the Dynamic Data Exchange (DDE) technique. Running on Pentium computers, JIFEX can solve large-scale finite element analysis problems, such as models with more than 60000 nodes.