• Title/Summary/Keyword: AI Inference

AB9: A neural processor for inference acceleration

  • Cho, Yong Cheol Peter;Chung, Jaehoon;Yang, Jeongmin;Lyuh, Chun-Gi;Kim, HyunMi;Kim, Chan;Ham, Je-seok;Choi, Minseok;Shin, Kyoungseon;Han, Jinho;Kwon, Youngsu
    • ETRI Journal / v.42 no.4 / pp.491-504 / 2020
  • We present AB9, a neural processor for inference acceleration. AB9 consists of a systolic tensor core (STC) neural network accelerator designed to accelerate artificial intelligence applications by exploiting the data reuse and parallelism inherent in neural networks while providing fast access to large on-chip memory. Complementing the hardware is an intuitive, user-friendly development environment that includes a simulator and an implementation flow, offering a high degree of programmability with a short development time. Along with a 40-TFLOP STC that includes 32k arithmetic units and over 36 MB of on-chip SRAM, our baseline implementation of AB9 consists of a 1-GHz quad-core setup and various industry-standard peripheral intellectual property (IP) blocks. The acceleration performance and power efficiency were evaluated using YOLOv2, and the results show that AB9 achieves higher performance and power efficiency than a general-purpose graphics processing unit implementation. AB9 has been taped out in the TSMC 28-nm process with a chip size of 17 × 23 mm². Delivery is expected later this year.

Deep Analysis of Causal AI-Based Data Analysis Techniques for the Status Evaluation of Casual AI Technology (인과적 인공지능 기반 데이터 분석 기법의 심층 분석을 통한 인과적 AI 기술의 현황 분석)

  • Cha Jooho;Ryu Minwoo
    • Journal of Korea Society of Digital Industry and Information Management / v.19 no.4 / pp.45-52 / 2023
  • With the advent of deep learning, artificial intelligence (AI) technology has advanced rapidly, and its applications have spread across various industrial sectors. The focus has since shifted from the independent use of AI technology to its diffusion through the open AI ecosystem, marking a transition from a phase of research and development to an era in which AI technology is widely accessible to the general public. As this diffusion continues, however, there is growing demand for verification of the outcomes derived from AI technologies. Causal AI applies the traditional concept of causal inference to AI, allowing not only the analysis of correlations in data but also the derivation of the causes behind the results, thereby obtaining optimal output values. By applying the theory of causal inference to machine learning and deep learning, causal AI technology addresses this demand and derives the basis of the analysis results. This paper analyzes recent cases of causal AI technology and presents the major tasks and directions of causal AI, extracting patterns from data by using the correlations within them and presenting the results of the analysis.

Time-based Expert System Design for Coherent Integration Between M&S and AI (M&S와 AI간의 유기적 통합을 위한 시간기반 전문가 시스템 설계)

  • Shin, Suk-Hoon;Chi, Sung-Do
    • Journal of the Korea Society for Simulation / v.26 no.2 / pp.59-65 / 2017
  • As modeling and simulation (M&S) has developed, modeling research that uses AI technology has attracted attention because of growing demand in fields that involve human decision making, such as defense M&S. AI is clearly one way to solve complex problems. However, AI has not considered the logical time that M&S requires, such as input time and processing time. In this paper, we therefore propose a "time-based expert system," a redesign of the rule-based expert system, which is a representative AI technology. It consists of an "IF-THEN-AFTER" rule structure and an inference engine that takes logical time into consideration. We also carried out a logical analysis using a simple example. The analysis showed that the result of the proposed time-based expert system changes according to the input time and the inference time.
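
The "IF-THEN-AFTER" rule structure described in this abstract can be illustrated with a minimal sketch. It assumes that each rule asserts its conclusion a fixed logical delay after all of its conditions hold, and that the inference engine processes facts in order of logical time. The `Rule` class, the `infer` function, and the example facts are hypothetical and are not taken from the paper.

```python
import heapq
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """Hypothetical rule: IF all conditions hold, THEN assert the
    conclusion AFTER `delay` units of logical time."""
    conditions: frozenset
    conclusion: str
    delay: float


def infer(rules, timed_inputs):
    """Forward-chain over (time, fact) pairs in logical-time order.

    Returns a dict mapping each fact to the time it became true.
    """
    agenda = list(timed_inputs)
    heapq.heapify(agenda)          # earliest logical time first
    known = {}
    fired = set()
    while agenda:
        now, fact = heapq.heappop(agenda)
        if fact in known:
            continue               # keep the earliest time for each fact
        known[fact] = now
        for rule in rules:
            if rule not in fired and rule.conditions.issubset(known):
                fired.add(rule)
                # The conclusion only becomes true after the rule's delay.
                heapq.heappush(agenda, (now + rule.delay, rule.conclusion))
    return known


# Assumed example rules; the fact names are illustrative only.
rules = [
    Rule(frozenset({"enemy_detected"}), "alert", 2.0),
    Rule(frozenset({"alert", "order_received"}), "engage", 3.0),
]
early = infer(rules, [(0.0, "enemy_detected"), (1.0, "order_received")])
late = infer(rules, [(0.0, "enemy_detected"), (4.0, "order_received")])
```

In this sketch the same rules and the same facts yield different conclusion times when only the arrival time of one input changes, which is the kind of time dependence the abstract reports.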

Trends of Compiler Development for AI Processor (인공지능 프로세서 컴파일러 개발 동향)

  • Kim, J.K.;Kim, H.J.;Cho, Y.C.P.;Kim, H.M.;Lyuh, C.G.;Han, J.;Kwon, Y.
    • Electronics and Telecommunications Trends / v.36 no.2 / pp.32-42 / 2021
  • The rapid growth of deep-learning applications has spurred the R&D of artificial intelligence (AI) processors. A dedicated software framework, such as a compiler and runtime APIs, is required to achieve maximum processor performance. There are various compilers and frameworks for AI training and inference. In this study, we present the features and characteristics of AI compilers, training frameworks, and inference engines. In addition, we focus on the internals of compiler frameworks, which are based on either basic linear algebra subprograms or intermediate representation. For in-depth insight, we present the compiler infrastructure, internal components, and operation flow of ETRI's "AI-Ware." The significant role of the software framework is evident from the optimized neural processing unit code that the compiler produces after various optimization passes, such as scheduling, architecture-aware optimization, schedule selection, and power optimization. We conclude the study with thoughts on the future of state-of-the-art AI compilers.

Analysis on Lightweight Methods of On-Device AI Vision Model for Intelligent Edge Computing Devices (지능형 엣지 컴퓨팅 기기를 위한 온디바이스 AI 비전 모델의 경량화 방식 분석)

  • Hye-Hyeon Ju;Namhi Kang
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.24 no.1 / pp.1-8 / 2024
  • On-device AI technology, which runs AI models on edge devices to support real-time processing and enhance privacy, is attracting attention. As intelligent IoT is applied to various industries, services that use on-device AI technology are increasing significantly. However, general deep learning models require substantial computational resources for inference and training. Therefore, various lightweighting methods, such as quantization and pruning, have been proposed for running deep learning models on embedded edge devices. In this paper, we analyze how to make deep learning models lightweight and apply them to edge computing devices, focusing on pruning among these methods. In particular, we use dynamic and static pruning techniques to evaluate the inference speed, accuracy, and memory usage of a lightweight AI vision model. The analysis in this paper can be used for intelligent video control systems or video security systems in autonomous vehicles, where real-time processing is essential. In addition, it is expected to be useful across various IoT services and industries.
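
As a rough illustration of the pruning idea mentioned in this abstract, the sketch below shows static magnitude-based pruning of one flattened layer: the smallest-magnitude fraction of weights is set to zero. The abstract does not state which pruning criterion the authors use, so the magnitude criterion, the `magnitude_prune` name, and the sample weights are assumptions.

```python
def magnitude_prune(weights, sparsity):
    """Return a copy of `weights` with the smallest-magnitude fraction zeroed.

    weights: flat list of floats (one layer, flattened).
    sparsity: fraction in [0, 1] of weights to set to zero.
    """
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be between 0 and 1")
    n_prune = int(len(weights) * sparsity)
    # Indices ordered by ascending magnitude; the first n_prune are dropped.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    drop = set(order[:n_prune])
    return [0.0 if i in drop else w for i, w in enumerate(weights)]


# Assumed sample weights for illustration.
layer = [0.8, -0.05, 0.3, 0.01, -0.6, 0.02]
pruned = magnitude_prune(layer, 0.5)
```

Zeroed weights reduce memory and compute only when the runtime stores or executes the layer sparsely, which is one reason inference speed, accuracy, and memory usage need to be measured together, as the abstract does.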

Performance Analysis to Evaluate the Suitability of MicroVM with AI Applications for Edge Computing

  • Yunha Choi;Byungchul Tak
    • Journal of the Korea Society of Computer and Information / v.29 no.3 / pp.107-116 / 2024
  • In this paper, we analyze the performance of MicroVM when running AI applications in an edge computing environment and examine whether it can replace current container technology and traditional virtual machines. To this end, we set up Docker container, Firecracker MicroVM, and KVM virtual machine environments on a Raspberry Pi 4 and executed representative AI applications in each. We analyze the inference time, the total CPU usage and its trend over time, and the file I/O performance in each environment. The results show no significant performance difference between MicroVM and containers when running AI applications. Moreover, on average, MicroVM showed a stable inference time over multiple trials. We therefore confirm that running AI applications on MicroVM, instead of a container or a heavyweight virtual machine, is suitable for edge computing.
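
One way to quantify "stable inference time over multiple trials" is to time repeated runs and compare the spread of the latencies. The sketch below is an assumption about how such a measurement could be made, not the authors' benchmark harness; `time_trials`, `summarize`, and the use of the coefficient of variation as a stability measure are all hypothetical choices.

```python
import statistics
import time


def time_trials(fn, trials):
    """Run `fn` `trials` times and return each wall-clock latency in seconds."""
    latencies = []
    for _ in range(trials):
        start = time.perf_counter()
        fn()
        latencies.append(time.perf_counter() - start)
    return latencies


def summarize(latencies):
    """Mean, sample standard deviation, and coefficient of variation."""
    mean = statistics.mean(latencies)
    stdev = statistics.stdev(latencies) if len(latencies) > 1 else 0.0
    cv = stdev / mean if mean else 0.0
    return {"mean": mean, "stdev": stdev, "cv": cv}


# Placeholder workload standing in for one inference call.
samples = time_trials(lambda: sum(range(1000)), trials=5)
report = summarize(samples)
```

Running the same workload in each environment and comparing the resulting `cv` values would give a like-for-like stability comparison under these assumptions.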

A Study on the System for AI Service Production (인공지능 서비스 운영을 위한 시스템 측면에서의 연구)

  • Hong, Yong-Geun
    • KIPS Transactions on Computer and Communication Systems / v.11 no.10 / pp.323-332 / 2022
  • As various services using AI technology are developed, much attention is being paid to AI service production. AI technology has recently come to be regarded as an ICT service, and a great deal of research is being conducted on general-purpose AI service production. In this paper, I describe research results on systems for AI service production, focusing on the distribution and production of machine learning models, which are the final steps of the general machine learning development procedure. Three different Ubuntu systems were built, and experiments were conducted on them using data from the COCO 2017 validation dataset with combinations of different AI models (RFCN, SSD-Mobilenet) and different communication methods (gRPC, REST) to request and perform AI services through TensorFlow Serving. The experiments showed that the type of AI model has a greater influence on AI service inference time than the communication method between machines, and that, for an object detection AI service, the number and complexity of objects in the image have a greater effect than the file size of the image. In addition, it was confirmed that performing the AI service remotely takes more inference time than performing it locally, even when the remote machine has good performance. The results of this study are expected to enable system design suited to service goals, AI model development, and efficient AI service production.
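
For readers unfamiliar with the REST side of the comparison, the sketch below builds (but does not send) a prediction request in the row format of the TensorFlow Serving REST API, where the body is a JSON object with an `instances` list and the path ends in `:predict`. The host, the model name, and the tiny dummy image are assumptions for illustration, and the experiment's actual request code is not given in the abstract.

```python
import json


def build_predict_request(host, model_name, instances, port=8501):
    """Return (url, json_body) for a TensorFlow Serving REST predict call.

    8501 is the REST port commonly used in TensorFlow Serving examples;
    the real port depends on how the server was started.
    """
    url = f"http://{host}:{port}/v1/models/{model_name}:predict"
    body = json.dumps({"instances": instances})
    return url, body


# Hypothetical model name and a 1x1 RGB "image" as nested lists.
url, body = build_predict_request("localhost", "ssd_mobilenet", [[[[0, 0, 0]]]])
```

With REST the image tensor travels as JSON text, whereas gRPC sends it as a binary protocol buffer message; the abstract nevertheless reports that the choice of model mattered more for inference time than this choice of communication method.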

Trends in AI Processor Technology (인공지능프로세서 기술 동향)

  • Lee, M.Y.;Chung, J.;Lee, J.H.;Han, J.H.;Kwon, Y.S.
    • Electronics and Telecommunications Trends / v.35 no.3 / pp.66-75 / 2020
  • As rising expectations for practical artificial intelligence (AI) services make AI algorithms more complicated, an efficient processor for AI algorithms is required. To meet this requirement, processors optimized for parallel processing, such as graphics processing units (GPUs), have been widely employed. However, the GPU has a generalized structure intended for various applications, so it is not optimized for AI algorithms. Therefore, research on AI processors optimized for AI algorithm processing has been actively conducted. This paper briefly introduces an AI processor for inference acceleration developed by the Electronics and Telecommunications Research Institute, South Korea, as well as processors from other global vendors for mobile and server platforms.

Implementation of Scenario-based AI Voice Chatbot System for Museum Guidance (박물관 안내를 위한 시나리오 기반의 AI 음성 챗봇 시스템 구현)

  • Sun-Woo Jung;Eun-Sung Choi;Seon-Gyu An;Young-Jin Kang;Seok-Chan Jeong
    • The Journal of Bigdata / v.7 no.2 / pp.91-102 / 2022
  • As artificial intelligence develops, AI chatbot systems are being actively adopted. For example, public institutions are expanding the use of chatbots to work assistance and professional knowledge services in civil complaints and administration, and private companies are using chatbots for interactive customer response services. In this study, we propose a scenario-based AI voice chatbot system to reduce museum operating costs and provide interactive guidance services to visitors. The implemented voice chatbot system consists of a watcher object, which detects the user's voice by monitoring a specific directory in real time, and an event handler object, which outputs the AI's voice response by running inference with each model in sequence when a voice file is created. It also includes a duplication-prevention function based on a thread and a deque, so that GPU operations are not duplicated during inference in a single-GPU environment.
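
The thread-and-deque arrangement described in this abstract can be sketched as follows. This is an assumed design, not the authors' implementation: a handler queues each newly created voice file in a deque, ignores repeated events for the same path, and a single worker thread runs inference one file at a time so that two jobs never use the GPU at once. The directory watcher itself is omitted, and the class name, method names, and file names are hypothetical.

```python
import threading
from collections import deque


class SequentialInferenceHandler:
    """Hypothetical event handler that serializes inference on one worker."""

    def __init__(self, infer):
        self._infer = infer            # callable(path) -> response
        self._pending = deque()        # files waiting for inference, FIFO
        self._seen = set()             # paths already queued
        self._cond = threading.Condition()
        self._closed = False
        self.results = []
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def on_created(self, path):
        """Called by the directory watcher when a voice file appears."""
        with self._cond:
            if path in self._seen:
                return False           # duplicate event for the same file
            self._seen.add(path)
            self._pending.append(path)
            self._cond.notify()
            return True

    def close(self):
        """Finish the queued work, then stop the worker thread."""
        with self._cond:
            self._closed = True
            self._cond.notify()
        self._worker.join()

    def _run(self):
        while True:
            with self._cond:
                while not self._pending and not self._closed:
                    self._cond.wait()
                if not self._pending:
                    return             # closed and fully drained
                path = self._pending.popleft()
            # Inference runs outside the lock, but only in this one thread.
            self.results.append((path, self._infer(path)))


# Stand-in for the model pipeline (speech recognition, dialogue, synthesis).
handler = SequentialInferenceHandler(lambda path: f"response for {path}")
first = handler.on_created("q1.wav")
repeat = handler.on_created("q1.wav")   # duplicate event, ignored
handler.on_created("q2.wav")
handler.close()
```

Keeping inference on one worker thread trades throughput for predictability, which fits a single-GPU kiosk where requests arrive one visitor at a time.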

Development of ANN- and ANFIS-based Control Logics for Heating and Cooling Systems in Residential Buildings and Their Performance Tests (인공지능망과 뉴로퍼지 모델을 이용한 주거건물 냉난방 시스템 조절 로직 및 예비 성능 시험)

  • Moon, Jin-Woo
    • Journal of the Korean housing association / v.22 no.3 / pp.113-122 / 2011
  • This study aimed to develop artificial intelligence (AI)-based thermal control logics and test their performance in order to identify the optimal thermal control method for buildings. To this end, a conventional two-position on/off logic and two AI-based variable logics, which applied an artificial neural network (ANN) and an adaptive neuro-fuzzy inference system (ANFIS), were developed. The performance of each logic was tested in a typical two-story residential building in the U.S.A. using computer simulation incorporating MATLAB and IBPT (International Building Physics Toolbox). In the test results, the AI-based control logics provided better and more stable thermal comfort than the conventional logic, although they did not show significant energy savings. In conclusion, the predictive and adaptive AI-based control logics have the potential to maintain interior air temperature more comfortably, and the findings of this study could be a solid foundation for identifying the optimal thermal control method in buildings.
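
The conventional two-position on/off logic used as the baseline in this abstract can be illustrated with a short sketch. It assumes a heating system with a symmetric deadband around the setpoint; the abstract gives no setpoint or deadband, so the function name and all numeric values are hypothetical.

```python
def two_position_control(temp, setpoint, deadband, heating_on):
    """Two-position (on/off) heating logic with a symmetric deadband.

    Returns True if the heater should be on for the next step.
    """
    if temp <= setpoint - deadband / 2:
        return True                # too cold: switch heating on
    if temp >= setpoint + deadband / 2:
        return False               # too warm: switch heating off
    return heating_on              # inside the deadband: keep current state


# Assumed values: 20 degree setpoint with a 1 degree deadband.
turns_on = two_position_control(19.0, 20.0, 1.0, heating_on=False)
turns_off = two_position_control(20.6, 20.0, 1.0, heating_on=True)
holds = two_position_control(20.0, 20.0, 1.0, heating_on=True)
```

Because this logic reacts only after the temperature has crossed a threshold, the indoor temperature oscillates around the setpoint, which is the behavior the predictive ANN- and ANFIS-based logics are meant to smooth out.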