• Title/Summary/Keyword: Inference System


YOLO Model FPS Enhancement Method for Determining Human Facial Expression based on NVIDIA Jetson TX1 (NVIDIA Jetson TX1 기반의 사람 표정 판별을 위한 YOLO 모델 FPS 향상 방법)

  • Bae, Seung-Ju;Choi, Hyeon-Jun;Jeong, Gu-Min
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology / v.12 no.5 / pp.467-474 / 2019
  • In this paper, we propose a novel method to improve FPS while maintaining the accuracy of the YOLO v2 model on the NVIDIA Jetson TX1. In general, conversion to integer operations or reduction of network depth has been used to cut the amount of computation; however, recognition accuracy can deteriorate as a result. Instead, we reduce computation and memory consumption by adjusting filter sizes and integrating computation across the network. The first method replaces the $3{\times}3$ filter with a $1{\times}1$ filter, which reduces the number of parameters to one-ninth. The second method reduces the amount of computation through CBR (Convolution, Bias Add, ReLU) fusion, one of the inference acceleration functions of TensorRT, and the last method reduces memory consumption by integrating repeated layers using TensorRT. In the simulation results, although accuracy decreased by 1% compared to the existing YOLO v2 model, the FPS improved from 3.9 FPS to 11 FPS.
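
The parameter saving from the first method above follows directly from the convolution parameter count; a minimal sketch, with illustrative channel counts that are not taken from the paper:

```python
# Parameters of a conv layer: kernel_size^2 * in_channels * out_channels (+ biases).
def conv_params(kernel_size: int, c_in: int, c_out: int, bias: bool = True) -> int:
    return kernel_size * kernel_size * c_in * c_out + (c_out if bias else 0)

# Illustrative channel counts (hypothetical, not from the paper): 256 -> 256.
p3 = conv_params(3, 256, 256)
p1 = conv_params(1, 256, 256)
print(f"3x3: {p3:,} params, 1x1: {p1:,} params, ratio ~{p3 / p1:.2f}x")
# Ignoring bias terms, the weight count drops by exactly a factor of 9.
```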

Accuracy Analysis for Slope Movement Characterization by comparing the Data from Real-time Measurement Device and 3D Model Value with Drone based Photogrammetry (도로비탈면 상시계측 실측치와 드론 사진측량에 의한 3D 모델값의 정확도 비교분석)

  • CHO, Han-Kwang;CHANG, Ki-Tae;HONG, Seong-Jin;HONG, Goo-Pyo;KIM, Sang-Hwan;KWON, Se-Ho
    • Journal of the Korean Association of Geographic Information Studies / v.23 no.4 / pp.234-252 / 2020
  • This paper verifies the effectiveness of a 'Hybrid Disaster Management Strategy' that integrates an RTM (Real-time Monitoring) based on-line system with a UAV-based off-line system. For landslide-prone areas where sensors are installed, conventional risk management has relied entirely on RTM data collected in the field through instrumentation devices. This is not sufficient, however, because 'pin-point' sensors tend to provide only localized information at the fixed positions where they are installed, so the whole picture cannot be grasped. In this paper, using the digital photogrammetry software Pix4D, the possibility of inferring deformation in ungauged areas is reviewed. For this purpose, actual measurement data from the RTM system were compared with values estimated from the UAV-derived 3D point cloud, and the results proved highly accurate in terms of RMSE.
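
The accuracy comparison reported above comes down to a root-mean-square error between paired measurements; a minimal sketch with hypothetical displacement values, not the study's field data:

```python
import numpy as np

# Hypothetical paired observations at the same locations (units assumed to be mm):
# RTM sensor readings vs. values read off the UAV-derived 3D point-cloud model.
rtm_measured = np.array([12.1, 15.4, 9.8, 20.3, 7.6])
uav_estimated = np.array([11.7, 15.9, 10.2, 19.8, 7.9])

rmse = np.sqrt(np.mean((rtm_measured - uav_estimated) ** 2))
print(f"RMSE = {rmse:.2f} mm")
```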

Technology Trends of Smart Abnormal Detection and Diagnosis System for Gas and Hydrogen Facilities (가스·수소 시설의 스마트 이상감지 및 진단 시스템 기술동향)

  • Park, Myeongnam;Kim, Byungkwon;Hong, Gi Hoon;Shin, Dongil
    • Journal of the Korean Institute of Gas / v.26 no.4 / pp.41-57 / 2022
  • The global demand for carbon neutrality in response to climate change requires countermeasures against carbon trade barriers for countries such as Korea, which has an export-led economic structure and is a significant greenhouse gas emitter. Therefore, digital transformation, one of the foreseeable paths for the carbon-neutral transition, should be introduced early. This paper surveys trends in abnormal detection and diagnosis services that apply digital technology, namely cloud-based predictive diagnostic monitoring incorporating operating knowledge, to industrial gas manufacturing facilities used in major high-tech manufacturing industries and to hydrogen gas facilities, which are emerging as an eco-friendly energy source. Rather than simply monitoring real-time facility status, these services establish a direction for predictive abnormal-diagnosis monitoring through optimization, augmented reality technology, IoT, and AI knowledge inference. This suggests that technologies such as consensus knowledge in the engineering domain and predictive diagnostic monitoring, matched to economic feasibility and efficiency, can be disseminated to small and medium-sized companies that remain in the blind spot of carbon-neutral implementation. It is hoped that this survey will serve as a basis for countermeasures against carbon emission trade barriers built on state-of-the-art ICT technology.

Semantic Computing-based Dynamic Job Scheduling Model and Simulation (시멘틱 컴퓨팅 기반의 동적 작업 스케줄링 모델 및 시뮬레이션)

  • Noh, Chang-Hyeon;Jang, Sung-Ho;Kim, Tae-Young;Lee, Jong-Sik
    • Journal of the Korea Society for Simulation / v.18 no.2 / pp.29-38 / 2009
  • In a computing environment with heterogeneous resources, a job scheduling model is necessary for effective resource utilization and high-speed data processing, and the model has to cope with dynamic changes in resource conditions. There has been much research on resource estimation methods and heuristic algorithms for distributing and allocating jobs to heterogeneous resources. However, existing approaches are weak in system compatibility and scalability because they do not support a standard language, and they cannot process jobs effectively or handle the variety of computing situations in which resource conditions change dynamically in real time. To solve these problems, this paper proposes a semantic computing-based dynamic job scheduling model that defines various knowledge-based rules for scheduling methods adaptable to changes in resource conditions and allocates each job to the best-suited resource through inference. The paper also constructs a resource ontology to manage information about heterogeneous resources easily, using OWL, the standard ontology language established by the W3C. Experimental results show that the proposed scheduling model outperforms existing scheduling models in terms of throughput, job loss, and turnaround time.
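
To make the rule-and-inference idea concrete, here is a minimal sketch of rule-driven job-to-resource matching; the paper itself encodes resources in an OWL ontology and infers over semantic rules, whereas this illustration uses plain Python dictionaries and hypothetical rules:

```python
# Hypothetical resource descriptions; in the paper these live in an OWL ontology.
resources = [
    {"name": "node-a", "cpu_free": 0.8, "mem_free_gb": 16, "online": True},
    {"name": "node-b", "cpu_free": 0.3, "mem_free_gb": 64, "online": True},
    {"name": "node-c", "cpu_free": 0.9, "mem_free_gb": 8,  "online": False},
]

def eligible(res, job):
    # Rule 1: the resource must be online; Rule 2: it must satisfy the job's memory demand.
    return res["online"] and res["mem_free_gb"] >= job["mem_gb"]

def score(res, job):
    # Rule 3: prefer the resource with the most spare CPU, ties broken by free memory.
    return (res["cpu_free"], res["mem_free_gb"])

def schedule(job):
    candidates = [r for r in resources if eligible(r, job)]
    return max(candidates, key=lambda r: score(r, job))["name"] if candidates else None

print(schedule({"mem_gb": 12}))  # -> "node-a"
```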

Building a Model to Estimate Pedestrians' Critical Lags on Crosswalks (횡단보도에서의 보행자의 임계간격추정 모형 구축)

  • Kim, Kyung Whan;Kim, Daehyon;Lee, Ik Su;Lee, Deok Whan
    • KSCE Journal of Civil and Environmental Engineering Research / v.29 no.1D / pp.33-40 / 2009
  • The critical lag of crosswalk pedestrians is an important parameter in analyzing traffic operation at unsignalized crosswalks, but there has been little research in this field in Korea. The purpose of this study is to develop a model to estimate the critical lag. Among the factors that influence the critical lag, the age of pedestrians and the length of the crosswalk, both of which have fuzzy characteristics, were considered, and each rejected or accepted lag was collected at crosswalks with lengths ranging from 3.5 m to 10.5 m. The observed critical lags range from 2.56 sec. to 5.56 sec. Age and crosswalk length were each divided into three fuzzy variables, and the critical lag for each case was estimated using Raff's technique, yielding a total of 9 fuzzy rules. Based on these rules, an ANFIS (Adaptive Neuro-Fuzzy Inference System) model to estimate the critical lag was built. The predictive ability of the model was evaluated by comparing the observed critical lags with those estimated by the model; the $R^2$, MAE, and MSE statistics are 0.96, 0.097, and 0.015, respectively, so the model explains the observations well. During this study, it was also found that the critical lag increases rapidly beyond a pedestrian age of 40 years.
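
Raff's technique, used above to estimate the critical lag for each fuzzy case, takes the lag at which the share of accepted lags shorter than t equals the share of rejected lags longer than t; a minimal sketch with illustrative lag samples, not the study's field data:

```python
import numpy as np

# Illustrative lag observations in seconds (hypothetical, not the study's data).
accepted = np.array([3.2, 4.1, 4.8, 5.5, 6.0, 6.4])   # lags pedestrians accepted
rejected = np.array([1.8, 2.4, 2.9, 3.5, 4.0, 4.6])   # lags pedestrians rejected

ts = np.arange(0.0, 8.0, 0.01)
f_accept = np.array([(accepted <= t).mean() for t in ts])  # P(accepted lag <= t)
f_reject = np.array([(rejected > t).mean() for t in ts])   # P(rejected lag > t)

# Raff's critical lag: the intersection of the two cumulative curves.
critical_lag = ts[np.argmin(np.abs(f_accept - f_reject))]
print(f"Critical lag ~ {critical_lag:.2f} s")
```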

Building robust Korean speech recognition model by fine-tuning large pretrained model (대형 사전훈련 모델의 파인튜닝을 통한 강건한 한국어 음성인식 모델 구축)

  • Changhan Oh;Cheongbin Kim;Kiyoung Park
    • Phonetics and Speech Sciences / v.15 no.3 / pp.75-82 / 2023
  • Automatic speech recognition (ASR) has been revolutionized by deep learning-based approaches, among which self-supervised learning methods have proven particularly effective. In this study, we aim to enhance the performance of OpenAI's Whisper model, a multilingual ASR system, on the Korean language. Whisper was pretrained on a large corpus (around 680,000 hours) of web speech data and has demonstrated strong recognition performance for major languages. However, it faces challenges in recognizing languages such as Korean, which was not a major language during training. We address this issue by fine-tuning the Whisper model with an additional dataset comprising about 1,000 hours of Korean speech. We also compare its performance against a Transformer model trained from scratch on the same dataset. Our results indicate that fine-tuning the Whisper model significantly improved its Korean speech recognition capability in terms of character error rate (CER); specifically, performance improved with increasing model size. However, the Whisper model's performance on English deteriorated after fine-tuning, emphasizing the need for further research to develop robust multilingual models. Our study demonstrates the potential of using a fine-tuned Whisper model for Korean ASR applications. Future work will focus on multilingual recognition and optimization for real-time inference.
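
The character error rate (CER) used above is the edit distance between reference and hypothesis transcripts divided by the reference length; a minimal sketch (the example strings are illustrative only):

```python
# CER = (substitutions + deletions + insertions) / reference length,
# computed here with a standard edit-distance dynamic program.
def cer(reference: str, hypothesis: str) -> float:
    ref, hyp = list(reference), list(hypothesis)
    # dp[i][j] = edit distance between ref[:i] and hyp[:j]
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,         # deletion
                           dp[i][j - 1] + 1,         # insertion
                           dp[i - 1][j - 1] + cost)  # substitution / match
    return dp[len(ref)][len(hyp)] / max(len(ref), 1)

print(cer("음성 인식 결과", "음성 인식 결가"))  # one substituted character -> 0.125
```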

AI-Based Object Recognition Research for Augmented Reality Character Implementation (증강현실 캐릭터 구현을 위한 AI기반 객체인식 연구)

  • Seok-Hwan Lee;Jung-Keum Lee;Hyun Sim
    • The Journal of the Korea institute of electronic communication sciences / v.18 no.6 / pp.1321-1330 / 2023
  • This study addresses the problem of 3D pose estimation for multiple human subjects from a single image generated during the character development process, for use in augmented reality. In the existing top-down approach, all subjects in the image are first detected and then each is reconstructed independently; the problem is that inconsistent results may occur due to overlap or depth-order mismatch between the reconstructed subjects. The goal of this study is to solve these problems and develop a single network that provides consistent 3D reconstruction of all humans in a scene. Integrating a human body model based on the SMPL parametric system into a top-down framework was an important design choice. On this basis, two losses were introduced: a distance-field-based collision loss and a loss that considers depth order. The first loss prevents overlap between reconstructed people, and the second adjusts the depth ordering of people so that rendered occlusions are consistent with the annotated instance segmentation. This method allows depth information to be provided to the network without explicit 3D annotation of the image. Experimental results show that this study's methodology performs better than existing methods on standard 3D pose benchmarks, and the proposed losses enable more consistent reconstruction from natural images.
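
A minimal sketch in the spirit of the depth-ordering loss described above: wherever the instance annotation says person A occludes person B, the predicted depth of A should be smaller (closer to the camera) than that of B. Tensor names and the margin value are assumptions for illustration, not the paper's exact formulation:

```python
import torch
import torch.nn.functional as F

def depth_order_loss(depth_a: torch.Tensor,
                     depth_b: torch.Tensor,
                     a_in_front: torch.Tensor,
                     margin: float = 0.1) -> torch.Tensor:
    """depth_a, depth_b: predicted depths at overlapping pixels, shape (N,).
    a_in_front: boolean mask, True where the annotation puts A in front of B."""
    # Penalize pixels where A is annotated as in front but predicted farther than B.
    violation = F.relu(depth_a - depth_b + margin)
    return (violation * a_in_front.float()).mean()

# Illustrative usage on fake overlap pixels.
da = torch.tensor([2.0, 3.5, 1.2])
db = torch.tensor([2.5, 3.0, 1.0])
mask = torch.tensor([True, True, False])
print(depth_order_loss(da, db, mask))
```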

The Classification arranged from Protectorate period to the early Japanese Colonial rule period : for Official Documents during the period from Kabo Reform to The Great Han Empire - Focusing on Classification Stamp and Warehouse Number Stamp - (통감부~일제 초기 갑오개혁과 대한제국기 공문서의 분류 - 분류도장·창고번호도장을 중심으로 -)

  • Park, Sung-Joon
    • The Korean Journal of Archival Studies / no.22 / pp.115-155 / 2009
  • As Korea was annexed by Japan, the official documents from the Kabo Reform and Great Han Empire periods were handed over to the Government-General of Chosun and reclassified from a section basis to a ministry basis. They had, however, already been reclassified several times before. Traces of these reclassifications can be found in the classification stamps and warehouse number stamps that remain on the covers of official documents from the Kabo Reform to the Great Han Empire. The documents were classified by Section within the Ministry-Department-Section system, then stamped and numbered. This is consistent with the official document classification system of the Great Han Empire, which shows that the section-based classification was maintained. Although the documents were stamped and numbered by Section, the sub-classification systems differed by Section. In the documents of the Land Tax Section, many institutions appear: documents of the same year are found in different groups, and documents of similar character are classified in the same group. The Customs Section and Other Tax Section appear to have numbered their documents by year, but the years do not match the order of the 'i-ro-ha (イロハ) song'. From the Kabo Reform to the Great Han Empire, documents were grouped by Section, but there were no classification rules for the sub-units of a Section. It is therefore not clear whether the document grouping shown by the classification stamps can be understood as the original order of the Great Han Empire's official document classification system. However, given that the grouping method reflects the document classification system, the sub-Section classification of the Great Han Empire can be inferred from it; on this inference, the classification system was divided into two types, 'Section - Counterpart Institution' and 'Section - Document Issuance Year'. The Government-General of Chosun took over the official documents of the Great Han Empire, stored them in warehouses, and marked them with warehouse number stamps. A warehouse number stamp recorded the institution that had grouped the documents, and the documents were stored by warehouse. Although most of the documents on the shelves in each warehouse were arranged by classification stamp number, some were mixed, and the order of the shelves did not match that of the documents. Although the documents on the shelves were given symbols in the order of the 'i-ro-ha (イロハ) song', these symbols were not assigned in numerical order. During storage by the Government-General of Chosun, the classification system reflected in the classification stamps was thus disturbed. One characteristic of the warehouse number stamps is that the preservation period assigned to each document group lost its meaning. A preservation period is decided according to a document's historical and administrative value, but the warehouse number stamps did not distinguish documents by preservation period and placed documents with different preservation periods on the same shelf. After Japan annexed Korea, the official documents of the Great Han Empire were no longer regarded as administrative documents to be disposed of after some time; they were regarded as reference materials on the past, needed for colonial governance. As the meaning of the documents changed from general administrative documents to materials needed to govern the colony, all the official documents of the Great Han Empire were treated alike regardless of preservation period. When the Government-General of Chosun reclassified the official documents of the Kabo Reform and the Great Han Empire by Ministry in order to use them for colonial governance, it destroyed the Great Han Empire's classification system, which had been based on Sections and the functions within them.

Aspect-Based Sentiment Analysis Using BERT: Developing Aspect Category Sentiment Classification Models (BERT를 활용한 속성기반 감성분석: 속성카테고리 감성분류 모델 개발)

  • Park, Hyun-jung;Shin, Kyung-shik
    • Journal of Intelligence and Information Systems / v.26 no.4 / pp.1-25 / 2020
  • Sentiment Analysis (SA) is a Natural Language Processing (NLP) task that analyzes the sentiments consumers or the public feel about an arbitrary object from written texts. Aspect-Based Sentiment Analysis (ABSA) is a fine-grained analysis of the sentiments towards each aspect of an object. Since it has more practical value for business, ABSA is drawing attention from both academia and industry. When there is a review that says "The restaurant is expensive but the food is really fantastic", for example, general SA evaluates the overall sentiment towards the 'restaurant' as 'positive', while ABSA identifies the restaurant's 'price' aspect as 'negative' and its 'food' aspect as 'positive'. Thus, ABSA enables a more specific and effective marketing strategy. In order to perform ABSA, it is necessary to identify which aspect terms or aspect categories are included in the text and to judge the sentiments towards them. Accordingly, there are four main subtasks in ABSA: aspect term extraction, aspect category detection, Aspect Term Sentiment Classification (ATSC), and Aspect Category Sentiment Classification (ACSC). ABSA is usually conducted by extracting aspect terms and then performing ATSC to analyze sentiments for the given aspect terms, or by extracting aspect categories and then performing ACSC to analyze sentiments for the given aspect categories. Here, an aspect category is expressed by one or more aspect terms, or indirectly inferred from other words. In the preceding example sentence, 'price' and 'food' are both aspect categories, and the aspect category 'food' is expressed by the aspect term 'food' included in the review. If the review sentence includes 'pasta', 'steak', or 'grilled chicken special', these can all be aspect terms for the aspect category 'food'. An aspect category referred to by one or more specific aspect terms is called an explicit aspect. On the other hand, an aspect category like 'price', which has no specific aspect term but can be indirectly inferred from an emotional word such as 'expensive', is called an implicit aspect. So far, the term 'aspect category' has been used to avoid confusion with 'aspect term'; from now on, we treat 'aspect category' and 'aspect' as the same concept and use the word 'aspect' for convenience. One thing to note is that ATSC analyzes the sentiment towards given aspect terms, so it deals only with explicit aspects, whereas ACSC treats both explicit and implicit aspects. This study seeks answers to the following issues, ignored in previous studies applying the BERT pre-trained language model to ACSC, and derives superior ACSC models. First, is it more effective to reflect the output vectors of the aspect category tokens than to use only the final output vector of the [CLS] token as the classification vector? Second, is there any performance difference between QA (Question Answering) and NLI (Natural Language Inference) types in the sentence-pair configuration of the input data? Third, is there any performance difference according to the order of the sentence containing the aspect category in the QA- or NLI-type sentence-pair input? To address these research questions, we implemented 12 ACSC models and conducted experiments on 4 English benchmark datasets. As a result, ACSC models that outperform existing studies without expanding the training dataset were derived.
In addition, it was found that it is more effective to reflect the output vector of the aspect category token than to use only the output vector of the [CLS] token as the classification vector. It was also found that QA-type input generally provides better performance than NLI, and that the order of the sentence containing the aspect category in the QA type is irrelevant to performance. There may be some differences depending on the characteristics of the dataset, but when using NLI-type sentence-pair input, placing the sentence containing the aspect category second seems to provide better performance. The new methodology for designing ACSC models used in this study could be similarly applied to other tasks such as ATSC.
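
A minimal sketch of the two sentence-pair input styles compared above, using an off-the-shelf BERT tokenizer from the Hugging Face transformers library; the auxiliary-sentence templates are illustrative assumptions, not necessarily the paper's exact wording:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

review = "The restaurant is expensive but the food is really fantastic."
aspect = "price"

qa_aux = f"What do you think of the {aspect}?"  # QA-style auxiliary sentence (assumed template)
nli_aux = aspect                                 # NLI-style: the bare aspect as the second sentence

# BERT consumes "[CLS] sentence A [SEP] sentence B [SEP]" for sentence-pair input.
qa_inputs = tokenizer(review, qa_aux, return_tensors="pt")
nli_inputs = tokenizer(review, nli_aux, return_tensors="pt")

print(tokenizer.decode(qa_inputs["input_ids"][0]))
print(tokenizer.decode(nli_inputs["input_ids"][0]))
```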

A Generalized Adaptive Deep Latent Factor Recommendation Model (일반화 적응 심층 잠재요인 추천모형)

  • Kim, Jeongha;Lee, Jipyeong;Jang, Seonghyun;Cho, Yoonho
    • Journal of Intelligence and Information Systems / v.29 no.1 / pp.249-263 / 2023
  • Collaborative filtering, a representative recommendation methodology, consists of two approaches: neighbor methods and latent factor models. Among these, the latent factor model using matrix factorization decomposes the user-item interaction matrix into two lower-dimensional rectangular matrices and predicts an item's rating through the product of these matrices. Because the factor vectors inferred from rating patterns capture user and item characteristics, this method is superior to neighbor-based methods in scalability, accuracy, and flexibility. However, it has a fundamental drawback: it must reflect the diverse preferences of different individuals for items with no ratings, and this limitation leads to repetitive and inaccurate recommendations. The Adaptive Deep Latent Factor Model (ADLFM) was developed to address this issue. This model adaptively learns the preferences for each item by using the item description, which provides a detailed summary and explanation of the item. ADLFM takes the item description as input, calculates latent vectors for the user and item, and can reflect personal diversity using an attention score. However, because it requires a dataset that includes item descriptions, the domains to which ADLFM can be applied are limited, which restricts its generality. This study proposes a Generalized Adaptive Deep Latent Factor Recommendation Model, G-ADLFRM, to overcome these limitations. First, we use the item ID, commonly available in recommendation systems, as input instead of the item description. Additionally, we apply improved deep learning structures such as Self-Attention, Multi-head Attention, and Multi-Conv1D. We conducted experiments on various datasets with changes to the input and the model structure. The results showed that when only the input was changed, MAE increased slightly compared to ADLFM due to the accompanying information loss, resulting in decreased recommendation performance; however, the average learning speed per epoch improved significantly as the amount of information to be processed decreased. When both the input and the model structure were changed, the best-performing Multi-Conv1D structure showed performance similar to ADLFM, sufficiently counteracting the information loss caused by the input change. We conclude that G-ADLFRM is a new, lightweight, and generalizable model that maintains the performance of the existing ADLFM while enabling fast learning and inference.
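
A minimal sketch of a "Multi-Conv1D" block of the kind named above: parallel 1-D convolutions with different kernel widths over an embedded sequence, concatenated after global max pooling. Kernel sizes, dimensions, and how the item-ID embedding is arranged into a sequence are assumptions for illustration, not details from the paper:

```python
import torch
import torch.nn as nn

class MultiConv1D(nn.Module):
    def __init__(self, embed_dim: int, n_filters: int = 32, kernel_sizes=(2, 3, 4)):
        super().__init__()
        # One Conv1d branch per kernel width; channels = embedding dimension.
        self.convs = nn.ModuleList(
            [nn.Conv1d(embed_dim, n_filters, k, padding=k // 2) for k in kernel_sizes]
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, embed_dim) -> Conv1d expects (batch, channels, seq_len)
        x = x.transpose(1, 2)
        pooled = [conv(x).amax(dim=-1) for conv in self.convs]  # global max pool per branch
        return torch.cat(pooled, dim=-1)  # (batch, n_filters * len(kernel_sizes))

# Illustrative usage: a batch of 8 embedded sequences of length 10, dimension 64.
feats = MultiConv1D(embed_dim=64)(torch.randn(8, 10, 64))
print(feats.shape)  # torch.Size([8, 96])
```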