• Title/Summary/Keyword: Hardware Resources

Analysis of Impact of Correlation Between Hardware Configuration and Branch Handling Methods Executing General Purpose Applications (범용 응용프로그램 실행 시 하드웨어 구성과 분기 처리 기법에 따른 GPU 성능 분석)

  • Choi, Hong Jun;Kim, Cheol Hong
    • The Journal of the Korea Contents Association / v.13 no.3 / pp.9-21 / 2013
  • As the computing power and flexibility of GPUs have increased, recent GPUs execute general-purpose parallel applications as well as graphics applications, and programmers can use GPGPU through the APIs provided by GPU vendors. Unfortunately, the computational resources of a GPU are not fully utilized when executing general-purpose applications because of frequent branch instructions. To handle the branch problem, several warp formation techniques have been proposed. Intuitively, we expect warp formations that provide higher computational resource utilization to show higher performance. Contrary to this expectation, simulation results show that the warp formation providing better utilization performs worse than the one providing lower utilization. This is because a warp formation with high utilization causes a serious memory bottleneck due to the increased number of memory requests. Therefore, a warp formation providing high computational utilization cannot guarantee high performance without proper hardware resources. For this reason, we analyze the correlation between hardware configuration and warp formation. Our simulation results provide a guideline for solving the underutilization problem caused by branch instructions when designing recent GPUs.
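
To make the underutilization caused by branches concrete, the following is a minimal sketch, not the simulator or warp formation schemes evaluated in the paper. It assumes a hypothetical warp that serializes the two sides of a branch, so every lane occupies a slot while either side is issued. The function name, the 32-lane width, and the path lengths are illustrative assumptions.

```python
def simd_utilization(taken, then_len, else_len):
    """Fraction of lane-slots doing useful work for one if/else executed by a warp.

    taken    -- list of booleans, one per lane: True if the lane takes the branch
    then_len -- instructions on the taken path (assumed > 0)
    else_len -- instructions on the not-taken path (assumed > 0)
    """
    lanes = len(taken)
    n_then = sum(1 for t in taken if t)
    n_else = lanes - n_then

    # Useful work: each lane only needs the instructions on its own path.
    useful = n_then * then_len + n_else * else_len

    # Issued work: a path is issued to the whole warp if any lane needs it.
    issued = 0
    if n_then:
        issued += then_len
    if n_else:
        issued += else_len

    return useful / (lanes * issued)


uniform = simd_utilization([True] * 32, 4, 6)                   # no divergence
split = simd_utilization([True] * 16 + [False] * 16, 4, 4)      # half the lanes diverge
print(uniform, split)
```

The sketch only models lane occupancy. It does not model the memory system, which is where the abstract locates the bottleneck for high-utilization warp formations.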

Area Efficient FPGA Implementation of Block Cipher Algorithm SEED (블록 암호알고리즘 SEED의 면적 효율성을 고려한 FPGA 구현)

  • Kim, Jong-Hyeon;Seo, Young-Ho;Kim, Dong-Wook
    • Journal of KIISE:Computing Practices and Letters / v.7 no.4 / pp.372-381 / 2001
  • In this paper, SEED, the Korean standard 128-bit block cipher algorithm, is implemented in VHDL and mapped into a single FPGA. SEED consists of a round key generation block, an F function block, a G function block, a round processing block, a control block, and an I/O block. The design is realized in an FPGA, but it is technology-independent so that ASIC or core-based implementation is also possible. SEED requires many hardware resources, which may make it impossible to realize in one FPGA, so it is necessary to minimize hardware resources. In this paper, only one G function is implemented, and it is used for both the F function block and the round key block. That is, by using one G function sequentially, we can realize all the SEED components in one FPGA. The cell usage rate after synthesis is 80% in an Altera FLEX10K100. The resulting design has a 28 MHz clock speed and 14.9 Mbps performance. The SEED hardware is technology-independent and needs no other external component. Thus, it can be applied to other SEED implementations and to cipher systems that use SEED.
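
As a rough check on the figures quoted in this abstract, the short calculation below derives how many clock cycles one 128-bit block takes at the reported clock speed and throughput. It assumes that Mbps means 10^6 bits per second and that the throughput figure covers the whole block operation. The 16-round count is the standard SEED round count rather than something stated in the abstract, and the derived values are estimates, not results reported by the authors.

```python
BLOCK_BITS = 128          # SEED block size (from the abstract)
CLOCK_HZ = 28e6           # reported clock speed: 28 MHz
THROUGHPUT_BPS = 14.9e6   # reported performance: 14.9 Mbps (assumed 10^6 bits/s)
ROUNDS = 16               # standard SEED round count (assumption, not in the abstract)

# Blocks processed per second at the reported throughput.
blocks_per_second = THROUGHPUT_BPS / BLOCK_BITS

# Clock cycles available per block, and per round if rounds dominate the time.
cycles_per_block = CLOCK_HZ / blocks_per_second
cycles_per_round = cycles_per_block / ROUNDS

print(f"{cycles_per_block:.1f} cycles per block")
print(f"{cycles_per_round:.1f} cycles per round")
```

A per-round cost of roughly 15 cycles is consistent with a design that reuses a single G function sequentially instead of instantiating several in parallel.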

Hierarchical IoT Edge Resource Allocation and Management Techniques based on Synthetic Neural Networks in Distributed AIoT Environments (분산 AIoT 환경에서 합성곱신경망 기반 계층적 IoT Edge 자원 할당 및 관리 기법)

  • Yoon-Su Jeong
    • Advanced Industrial SCIence / v.2 no.3 / pp.8-14 / 2023
  • The majority of IoT devices already employ AIoT; however, numerous issues still need to be resolved before AI applications can be deployed. To distribute IoT edge resources more effectively, this paper proposes a machine learning-based approach to managing IoT edge resources. The proposed method continuously improves the allocation of IoT resources by identifying IoT edge resource trends using machine learning. The optimized IoT resources use machine learning convolution to reliably sustain IoT edge resources that are always changing. By storing each machine learning-based IoT edge resource as a hash value alongside the resource of the previous pattern, the proposed approach effectively verifies whether a resource is an attack pattern in a distributed AIoT context. The experiments evaluate energy efficiency in three different test scenarios and verify the integrity of IoT edge resources to see whether they work well in complex environments with heterogeneous computational hardware.

Design and Implementation of Resources Management System for Extension of outside Data Space in Mobile Device (모바일 디바이스에서 외부 데이터 영역의 확장을 위한 자원관리시스템의 설계 및 구현)

  • 나승원;오세만
    • The Journal of Society for e-Business Studies / v.8 no.2 / pp.33-48 / 2003
  • Wireless Internet, created through the merging of mobile communication with Internet technology, provides the advantage of mobility, but the restrictions of the mobile environment are keeping it from growing into a mass public service. Among the restricting factors of the wireless environment, limited memory space makes it difficult to manage resources in mobile devices efficiently. Because there is a limit to how much memory hardware designed for portability can provide, future devices will need a platform design in which the storage area extends from internal storage to external storage space. In this paper, we present a mobile agent that extends the memory space from only the inside of a mobile device to an external server, making it possible to use data online at run time, and that can also manage internal files efficiently. We have designed and implemented an RMS (Resources Management System) as a realization of this idea. Devices using the proposed RMS will be able to run extended processes through the 'Mobile Space Extension' and will benefit from optimal memory space through efficient internal file management.

Static Timing Analysis Tool for ARM-based Embedded Software (ARM용 내장형 소프트웨어의 정적인 수행시간 분석 도구)

  • Hwang Yo-Seop;Ahn Seong-Yong;Shim Jea-Hong;Lee Jeong-A
    • Journal of KIISE:Computing Practices and Letters / v.11 no.1 / pp.15-25 / 2005
  • Embedded systems have a set of tasks to execute. These tasks can be implemented either on application-specific hardware or as software running on a specific processor. The design of an embedded system involves the selection of hardware and software resources, the partitioning of tasks into hardware and software, and performance evaluation. An accurate estimation of execution time for the extreme cases (best and worst case) is important for hardware/software codesign. A tighter estimation of the execution time bound may allow the use of a slower processor to execute the code and may help lower the system cost. In this paper, we consider an ARM-based embedded system and develop a tool that estimates a tight bound on the execution time of a task, given loop bounds and any additional program path information. The tool is based on an existing timing analysis tool named 'Cinderella', which currently supports the i960 and m68k architectures. We add a module that handles ARM ELF object files, extracting control flow and debugging information, and a module that handles the ARM instruction set, so that the new tool can support ARM processors. We validate the tool by comparing the estimated bound on execution time with the run-time execution time measured by ARMulator for selected benchmark programs.
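
The idea of bounding execution time from per-block costs and loop bounds can be sketched as follows. This is a simplified tree-based calculation over a structured program, not the analysis method of the tool described in the abstract. The node types (`Block`, `Seq`, `Branch`, `Loop`) and all cycle counts are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Block:
    best: int    # best-case cycles of a basic block
    worst: int   # worst-case cycles of a basic block


@dataclass
class Seq:
    parts: list  # nodes executed one after another


@dataclass
class Branch:
    cond: object   # node evaluating the condition
    then: object   # node run when the condition holds
    other: object  # node run otherwise


@dataclass
class Loop:
    body: object
    min_iter: int  # user-supplied lower loop bound
    max_iter: int  # user-supplied upper loop bound


def bounds(node):
    """Return (best_case, worst_case) cycle bounds for a structured program."""
    if isinstance(node, Block):
        return node.best, node.worst
    if isinstance(node, Seq):
        lo = hi = 0
        for part in node.parts:
            b, w = bounds(part)
            lo += b
            hi += w
        return lo, hi
    if isinstance(node, Branch):
        cb, cw = bounds(node.cond)
        tb, tw = bounds(node.then)
        eb, ew = bounds(node.other)
        # Best case takes the cheaper side, worst case the more expensive one.
        return cb + min(tb, eb), cw + max(tw, ew)
    if isinstance(node, Loop):
        b, w = bounds(node.body)
        return node.min_iter * b, node.max_iter * w
    raise TypeError(f"unknown node type: {type(node).__name__}")


program = Seq([
    Block(2, 3),
    Loop(Branch(Block(1, 1), Block(4, 5), Block(2, 2)), min_iter=1, max_iter=10),
    Block(1, 1),
])
print(bounds(program))  # (6, 64)
```

Tighter loop bounds or extra path information narrow the interval, which is why the abstract treats them as inputs to the estimation.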

A Scalable Montgomery Modular Multiplier (확장 가능형 몽고메리 모듈러 곱셈기)

  • Choi, Jun-Baek;Shin, Kyung-Wook
    • Journal of IKEEE / v.25 no.4 / pp.625-633 / 2021
  • This paper describes a scalable architecture for flexible hardware implementation of Montgomery modular multiplication. Our scalable modular multiplier architecture, which is based on a one-dimensional array of processing elements (PEs), performs word parallel operation and allows us to adjust computational performance and hardware complexity depending on the number of PEs used, NPE. Based on the proposed architecture, we designed a scalable Montgomery modular multiplier (sMM) core supporting eight field sizes defined in SEC2. Synthesized with 180-nm CMOS cell library, our sMM core was implemented with 38,317 gate equivalents (GEs) and 139,390 GEs for NPE=1 and NPE=8, respectively. When operating with a 100 MHz clock, it was evaluated that 256-bit modular multiplications of 0.57 million times/sec for NPE=1 and 3.5 million times/sec for NPE=8 can be computed. Our sMM core has the advantage of enabling an optimized implementation by determining the number of PEs to be used in consideration of computational performance and hardware resources required in application fields, and it can be used as an IP (intellectual property) in scalable hardware design of elliptic curve cryptography (ECC).
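
For readers unfamiliar with the operation itself, the following is a software reference sketch of word-serial Montgomery modular multiplication. It is not the processing-element architecture proposed in the paper. The function name, the word size `w`, and the word count `s` are illustrative assumptions, and `pow(n, -1, m)` requires Python 3.8 or later.

```python
def mont_mul_words(a, b, n, w, s):
    """Return a * b * R^(-1) mod n, where R = 2**(w * s).

    Assumes n is odd, n < R, and 0 <= a, b < n.
    The operand a is consumed one w-bit word per iteration.
    """
    mask = (1 << w) - 1
    # Precompute -n^(-1) mod 2^w, used to clear the low word of t each step.
    n0_inv = (-pow(n, -1, 1 << w)) & mask

    t = 0
    for i in range(s):
        a_i = (a >> (w * i)) & mask        # i-th word of a
        t += a_i * b                       # accumulate partial product
        m = ((t & mask) * n0_inv) & mask   # multiple of n that zeroes the low word
        t = (t + m * n) >> w               # exact division by 2^w
    # t < 2n here, so one conditional subtraction completes the reduction.
    return t - n if t >= n else t


# Usage with a 255-bit odd modulus, 32-bit words, 8 words (R = 2**256).
n = (1 << 255) - 19
a = 1234567890123456789
b = 9876543210987654321
result = mont_mul_words(a, b, n, w=32, s=8)
print(result == (a * b * pow(1 << 256, -1, n)) % n)
```

Each loop iteration depends only on one word of `a`, which is the kind of regular structure that lends itself to an array of identical processing elements.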

Hyperparameter Search for Facies Classification with Bayesian Optimization (베이지안 최적화를 이용한 암상 분류 모델의 하이퍼 파라미터 탐색)

  • Choi, Yonguk;Yoon, Daeung;Choi, Junhwan;Byun, Joongmoo
    • Geophysics and Geophysical Exploration / v.23 no.3 / pp.157-167 / 2020
  • With the recent advancement of computer hardware and the contribution of open-source libraries that facilitate access to artificial intelligence technology, the use of machine learning (ML) and deep learning (DL) technologies in various fields of exploration geophysics has increased. In addition, ML researchers have developed complex algorithms to improve the inference accuracy of various tasks such as image, video, voice, and natural language processing, and they are now expanding their interests into the field of automated machine learning (AutoML). AutoML can be divided into three areas: feature engineering, architecture search, and hyperparameter search. Among them, this paper focuses on hyperparameter search with Bayesian optimization and applies it to the problem of facies classification using seismic data and well logs. The effectiveness of the Bayesian optimization technique is demonstrated on Vincent field data by comparing its results with those of the random search technique.

A Client Agent Framework for Dynamic Connection with Web Services (웹 서비스 동적 연동을 위한 클라이언트 에이전트 프레임워크)

  • Park, Young-Joon;Lee, Woo-Jin
    • The KIPS Transactions:PartA / v.16A no.5 / pp.339-346 / 2009
  • To connect to web services, clients generally have to use heavy frameworks such as the .NET Framework and the Java Runtime Environment, which require high-performance hardware resources like those of a personal computer. Therefore, sensor nodes cannot handle web services because of their limited resources. In this paper, a client agent framework is proposed for dynamically connecting to web services from a client node with limited resources. A client agent, which is managed by the framework on another server, has full capability for connecting to web services, while the real client has only a simple module for connecting to the client agent. In this framework, a client agent is dynamically generated using the WSDL on the web service server. By using the framework, sensor nodes or mobile devices can enhance their functionalities and services by accessing web services with minimum resources.

A Study on the Design of Immersed Augmented Reality Education Models (몰입형 증강현실 교육 모델 설계에 관한 연구)

  • Tae, Hyo-Sik
    • Journal of Internet of Things and Convergence / v.7 no.4 / pp.23-28 / 2021
  • Through the 4th industrial revolution, fields such as artificial intelligence, AR/VR, and big data are developing rapidly, with software at the center. In education as well, the importance of integrated education that supports technological development is being emphasized, and in order to compete in software technology, securing human resources for software development should be prioritized domestically. Unlike in the hardware-centric society of the past, the role of software technology personnel is now very important, yet the reality is that the workforce being produced is far from the talent profile that companies need. In this paper, we present an immersed education model for training AR software professionals and, based on this, propose an evaluation index that can assess the program quality of the immersed AR education model. Through the AR education model, it is expected that the weaknesses and strengths of the model can be identified and that it can contribute to setting the direction for improving the education program.

Implementation of the low power platform for sensor network based IEEE 802.15.4 (IEEE 802.15.4 기반 센서 네트워크를 위한 저전력 실시간 플랫폼의 설계 및 구현)

  • Hwang, Tae-Ho;Song, Byung-Chul;Kim, Seong-Dong
    • Proceedings of the IEEK Conference / 2005.11a / pp.1145-1148 / 2005
  • The sensor network, which can be regarded as belonging to the field of ubiquitous computing, performs the basic function of transmitting sensing data through autonomous sensing and an ad hoc network. To collect and process various sensing data at the time of application and to manage extremely limited system resources, the sensor network requires an embedded operating system that uses low power, a small code size, and minimal hardware resources. In this paper, an operating system with a new structure for constructing IEEE 802.15.4 MAC and Zigbee sensor networks is proposed. It is designed by reviewing the characteristics and core structural requirements of sensor network operating systems, based on existing operating systems built under similar conditions, and applying those characteristics and requirements to the design.
