• Title/Summary/Keyword: Dataset Generation

A Study of Unified Framework with Light Weight Artificial Intelligence Hardware for Broad range of Applications (다중 애플리케이션 처리를 위한 경량 인공지능 하드웨어 기반 통합 프레임워크 연구)

  • Jeon, Seok-Hun;Lee, Jae-Hack;Han, Ji-Su;Kim, Byung-Soo
    • The Journal of the Korea institute of electronic communication sciences
    • /
    • v.14 no.5
    • /
    • pp.969-976
    • /
    • 2019
  • Lightweight artificial intelligence hardware has made great strides in many application areas. In general, a lightweight artificial intelligence system consists of a lightweight artificial intelligence engine and a preprocessor that performs feature selection, generation, extraction, and normalization. To achieve optimal performance across a broad range of applications, such a system needs well-chosen preprocessing functions and hyper-parameters for each of them. This paper proposes a unified framework for lightweight artificial intelligence systems, together with a method for finding the best-performing model for a given dataset. The proposed framework can easily generate a model that combines preprocessing functions with a lightweight artificial intelligence engine. In a performance evaluation using a handwritten image dataset and a fall detection dataset measured with inertial sensors, the framework built optimal artificial intelligence models with over 90% test accuracy.
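
The paper's framework itself is not public, but its core idea, composing interchangeable preprocessing functions with a lightweight engine and searching their hyper-parameters jointly, can be sketched with scikit-learn as a stand-in. The choice of PCA, a k-NN engine, and the parameter grid below are illustrative assumptions, with scikit-learn's digits dataset standing in for the handwritten image data:

```python
# Minimal sketch (not the paper's framework): a preprocessing chain plus a
# lightweight engine, with their hyper-parameters searched jointly.
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_digits(return_X_y=True)  # handwritten digits, as in the evaluation

pipeline = Pipeline([
    ("normalize", StandardScaler()),     # feature normalization
    ("extract", PCA()),                  # feature extraction
    ("engine", KNeighborsClassifier()),  # lightweight AI engine
])

# Jointly search preprocessing and engine hyper-parameters.
search = GridSearchCV(
    pipeline,
    param_grid={
        "extract__n_components": [16, 32, 64],
        "engine__n_neighbors": [1, 3, 5],
    },
    cv=3,
)
search.fit(X, y)
print(search.best_params_, search.best_score_)
```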

Transfer Learning-based Generated Synthetic Images Identification Model (전이 학습 기반의 생성 이미지 판별 모델 설계)

  • Chaewon Kim;Sungyeon Yoon;Myeongeun Han;Minseo Park
    • The Journal of the Convergence on Culture Technology
    • /
    • v.10 no.2
    • /
    • pp.465-470
    • /
    • 2024
  • The advancement of AI-based image generation technology has resulted in a wide variety of generated images, emphasizing the need for technology that can accurately discriminate them. Because the amount of generated image data is limited, this study proposes a transfer learning-based model for discriminating generated images that achieves high performance with a limited dataset. Models pre-trained on the ImageNet dataset are applied directly to the CIFAKE input dataset to reduce training cost; three hidden layers and one output layer are then added and the model is fine-tuned. The results show that performance improves when the final layers are adjusted. By applying transfer learning and then tuning the layers closest to the output, accuracy problems caused by small image datasets can be mitigated and generated images can be classified.
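
The abstract does not name the backbone or layer widths; the sketch below assumes a frozen ImageNet-pretrained ResNet50 from Keras and illustrative layer sizes, with CIFAKE's 32x32 images assumed to be resized to the backbone's input:

```python
# Minimal sketch of the transfer-learning recipe: a frozen ImageNet
# backbone, three added hidden layers, and one output layer. The backbone
# choice, layer sizes, and optimizer are assumptions, not the paper's.
import tensorflow as tf

base = tf.keras.applications.ResNet50(
    include_top=False, weights="imagenet", pooling="avg",
    input_shape=(224, 224, 3),
)
base.trainable = False  # keep the pre-trained feature extractor fixed

model = tf.keras.Sequential([
    base,
    tf.keras.layers.Dense(256, activation="relu"),   # hidden layer 1
    tf.keras.layers.Dense(128, activation="relu"),   # hidden layer 2
    tf.keras.layers.Dense(64, activation="relu"),    # hidden layer 3
    tf.keras.layers.Dense(1, activation="sigmoid"),  # real vs. generated
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])
# model.fit(train_ds, validation_data=val_ds, epochs=...)  # CIFAKE, resized
```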

HYBRID DATA SET GENERATION METHOD FOR COMPUTER VISION-BASED DEFECT DETECTION IN BUILDING CONSTRUCTION

  • Seung-mo Choi;Heesung Cha;Bo-sik Son
    • International conference on construction engineering and project management
    • /
    • 2024.07a
    • /
    • pp.311-318
    • /
    • 2024
  • Quality control in construction projects necessitates the detection of defects during construction. Currently, this task is performed manually by site supervisors. This manual process is inefficient, labor-intensive, and prone to human error, potentially leading to decreased productivity. To address this issue, research has been conducted to automate defect detection using computer vision-based object detection technologies. However, these studies often suffer from a lack of data for training deep learning models, resulting in inadequate accuracy. This study proposes a method to improve the accuracy of deep learning models through the use of virtual image data. The target building is created as a 3D model and finished with materials similar to actual components. Subsequently, a virtual defect texture is produced by layering three types of images: defect information, area information, and material information images, to fabricate materials with defects. Images are generated by rendering the 3D model and the defect, and annotations are created for segmentation. This approach creates a hybrid dataset by combining virtual data with actual site image data, which is then used to train the deep learning model. This research was conducted on the tile process of finishing construction projects, focusing on cracks and falls as the target defects. The training results of the deep learning model show that the F1-Score increased by 12.08% for falls and cracks when using the hybrid dataset compared to the real image dataset alone, validating the hybrid data approach. This study contributes not only to unmanned and automated smart construction management but also to enhancing safety on construction sites. To establish an integrated smart quality management system, it is necessary to detect various defects simultaneously with high accuracy. Utilizing this method for automatic defect detection in other types of construction can potentially expand the possibilities for implementing an integrated smart quality management system.
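
As a rough illustration of the texture-layering step, the following sketch composites a defect image onto a material texture, constrained by an area mask. The file names are placeholders, and PIL is an assumed stand-in for the paper's actual rendering toolchain:

```python
# Hypothetical sketch of layering the three image types the abstract names:
# defect information, area information, and material information. All images
# are assumed to have the same pixel dimensions.
from PIL import Image

material = Image.open("tile_material.png").convert("RGBA")  # material information
defect = Image.open("crack_defect.png").convert("RGBA")     # defect information
area = Image.open("defect_area_mask.png").convert("L")      # area information (mask)

# Keep the defect only inside the allowed area, then layer it on the material.
defect.putalpha(area)
defective_material = Image.alpha_composite(material, defect)
defective_material.save("tile_with_crack.png")  # texture for the 3D model
```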

Descent Dataset Generation and Landmark Extraction for Terrain Relative Navigation on Mars (화성 지형상대항법을 위한 하강 데이터셋 생성과 랜드마크 추출 방법)

  • Kim, Jae-In
    • Korean Journal of Remote Sensing
    • /
    • v.38 no.6_1
    • /
    • pp.1015-1023
    • /
    • 2022
  • The Entry-Descent-Landing process of a lander involves many environmental and technical challenges. To address them, terrain relative navigation (TRN) technology has recently become essential for landers. TRN estimates the position and attitude of a lander by comparing Inertial Measurement Unit (IMU) data and image data collected during descent with pre-built reference data. In this paper, we present a method for generating a descent dataset and extracting landmarks, which are key elements for developing TRN technologies for use on Mars. The proposed method generates IMU data for a descending lander from a simulated Mars landing trajectory and renders descent images from a high-resolution ortho-map and digital elevation map through ray tracing. Landmark extraction uses an area-based matching method, owing to the low-texture surfaces on Mars, and the search area is reduced to improve matching accuracy and speed. The evaluation showed that the proposed dataset generation method produces images that satisfy the imaging geometry, and that the landmark extraction method achieves positioning accuracy of several meters while running as fast as feature-based methods.
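
A minimal sketch of area-based landmark matching with search-area reduction, the two elements the abstract highlights: OpenCV's template matching is assumed as the area-based similarity measure, and `predicted_xy` and `margin` are hypothetical names for the predicted landmark position and search window half-size:

```python
# Hypothetical sketch: correlate the landmark template only inside a window
# around the predicted position instead of the full descent image.
import cv2
import numpy as np

def match_landmark(descent_img, template, predicted_xy, margin=64):
    """Area-based matching restricted to a window around predicted_xy."""
    h, w = template.shape
    px, py = predicted_xy
    x0, y0 = max(px - margin, 0), max(py - margin, 0)
    x1 = min(px + margin + w, descent_img.shape[1])
    y1 = min(py + margin + h, descent_img.shape[0])
    window = descent_img[y0:y1, x0:x1]

    scores = cv2.matchTemplate(window, template, cv2.TM_CCOEFF_NORMED)
    _, best, _, loc = cv2.minMaxLoc(scores)
    return (x0 + loc[0], y0 + loc[1]), best  # matched top-left corner, score

# Synthetic usage: plant a template and recover it near the prediction.
img = np.random.rand(512, 512).astype(np.float32)
tmpl = img[200:232, 300:332].copy()
print(match_landmark(img, tmpl, predicted_xy=(300, 200)))
```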

A Study on Fine-Tuning and Transfer Learning to Construct Binary Sentiment Classification Model in Korean Text (한글 텍스트 감정 이진 분류 모델 생성을 위한 미세 조정과 전이학습에 관한 연구)

  • JongSoo Kim
    • Journal of Korea Society of Industrial Information Systems
    • /
    • v.28 no.5
    • /
    • pp.15-30
    • /
    • 2023
  • Recently, generative models based on the Transformer architecture, such as ChatGPT, have been gaining significant attention. The Transformer architecture has been applied to various neural network models, including Google's BERT (Bidirectional Encoder Representations from Transformers). In this paper, a method is proposed to create a binary text classification model that determines whether a Korean movie review comment is positive or negative. To accomplish this, a pre-trained multilingual BERT model is fine-tuned and transfer-learned on a new Korean training dataset. The multilingual BERT-Base model used covers 104 languages and has 12 layers, a hidden size of 768, 12 attention heads, and 110M parameters. To turn the pre-trained BERT-Base model into a text classification model, the input and output layers were fine-tuned, resulting in a new model with 178 million parameters. Using the fine-tuned model with a maximum sequence length of 128, a batch size of 16, and 5 epochs, transfer learning was conducted on 10,000 training and 5,000 test samples, yielding a Korean movie review sentiment classification model with an accuracy of 0.9582, a loss of 0.1177, and an F1 score of 0.81. Repeating the transfer learning with a dataset five times larger produced a model with an accuracy of 0.9562, a loss of 0.1202, and an F1 score of 0.86.
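
The abstract's setup (multilingual BERT-Base, max length 128, batch size 16, 5 epochs) maps naturally onto the Hugging Face transformers library, used below as an assumed stand-in for the paper's implementation; the dataset objects and output directory are placeholders:

```python
# Minimal sketch, assuming Hugging Face transformers: multilingual BERT-Base
# fine-tuned as a binary sentiment classifier with the abstract's settings.
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-multilingual-cased", num_labels=2)  # positive / negative

def tokenize(batch):
    # Maximum sequence length of 128, as in the abstract.
    return tokenizer(batch["text"], truncation=True, max_length=128)

args = TrainingArguments(
    output_dir="korean-sentiment",     # placeholder path
    per_device_train_batch_size=16,
    num_train_epochs=5,
)
# train_ds / test_ds are placeholder Korean movie review datasets:
# trainer = Trainer(model=model, args=args,
#                   train_dataset=train_ds.map(tokenize, batched=True),
#                   eval_dataset=test_ds.map(tokenize, batched=True))
# trainer.train()
```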

A Study on Korean Speech Animation Generation Employing Deep Learning (딥러닝을 활용한 한국어 스피치 애니메이션 생성에 관한 고찰)

  • Suk Chan Kang;Dong Ju Kim
    • KIPS Transactions on Software and Data Engineering
    • /
    • v.12 no.10
    • /
    • pp.461-470
    • /
    • 2023
  • While speech animation generation employing deep learning has been actively researched for English, there has been no prior work for Korean. This paper is therefore the first to employ supervised deep learning to generate Korean speech animation. In doing so, we find that deep learning effectively reduces speech animation research to speech recognition research, the predominant underlying technique, and we study how to best exploit this effect for Korean speech animation generation. The effect can help revitalize the recently inactive Korean speech animation research by clarifying the top-priority research target. This paper proceeds as follows: (i) it chooses the blendshape animation technique, (ii) implements a deep learning model as a master-servant pipeline of an automatic speech recognition (ASR) module and a facial action coding (FAC) module, (iii) builds a Korean speech facial motion capture dataset, (iv) prepares two comparison deep learning models (one adopting an English ASR module, the other a Korean ASR module, with both sharing the same basic FAC module structure), and (v) trains the FAC modules of both models dependently on their ASR modules. A user study demonstrates that the model with the Korean ASR module and dependently trained FAC module (scoring 4.2/5.0) generates decisively more natural Korean speech animation than the model with the English ASR module (scoring 2.7/5.0). The result confirms the aforementioned effect, showing that the quality of Korean speech animation comes down to the accuracy of Korean ASR.
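
A minimal PyTorch sketch of the master-servant pipeline under stated assumptions: the ASR feature dimension, layer sizes, and the 52-coefficient blendshape set are all illustrative guesses, not the paper's actual architecture:

```python
# Hypothetical sketch: a (frozen) ASR encoder yields per-frame speech
# features; a small FAC module maps them to blendshape weights.
import torch
import torch.nn as nn

class FACModule(nn.Module):
    """Maps per-frame ASR features to blendshape coefficients."""
    def __init__(self, asr_dim=256, num_blendshapes=52):
        super().__init__()
        self.temporal = nn.GRU(asr_dim, 128, batch_first=True)
        self.head = nn.Linear(128, num_blendshapes)

    def forward(self, asr_features):        # (batch, frames, asr_dim)
        h, _ = self.temporal(asr_features)
        return torch.sigmoid(self.head(h))  # blendshape weights in [0, 1]

# "Dependent" training: the ASR module stays fixed while the FAC module
# fits the motion-capture blendshape targets.
fac = FACModule()
dummy = torch.randn(2, 100, 256)  # stand-in for Korean ASR features
print(fac(dummy).shape)           # torch.Size([2, 100, 52])
```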

A Novel Two-Stage Training Method for Unbiased Scene Graph Generation via Distribution Alignment

  • Dongdong Jia;Meili Zhou;Wei WEI;Dong Wang;Zongwen Bai
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.17 no.12
    • /
    • pp.3383-3397
    • /
    • 2023
  • Scene graphs serve as semantic abstractions of images and play a crucial role in enhancing visual comprehension and reasoning. However, the performance of scene graph generation (SGG) is often compromised when working with biased data in real-world situations. While many existing systems use a single stage of learning for both feature extraction and classification, some employ class-balancing strategies such as re-weighting, data resampling, and transfer learning from head to tail classes. In this paper, we propose a novel approach that decouples the feature extraction and classification phases of the scene graph generation process. For feature extraction, we leverage a transformer-based architecture and design an adaptive calibration function specifically for predicate classification, which dynamically adjusts the classification scores for each predicate category. Additionally, we introduce a distribution alignment technique that balances the class distribution after the feature extraction phase reaches a stable state, thereby facilitating retraining of the classification head. Importantly, our distribution alignment strategy is model-independent and requires no additional supervision, making it applicable to a wide range of SGG models. Using the scene graph diagnostic toolkit on Visual Genome with several popular models, we achieved significant improvements over previous state-of-the-art methods. Compared to the TDE model, our model improved mR@100 by 70.5% for PredCls, 84.0% for SGCls, and 97.6% for SGDet.
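
The exact calibration function is specific to the paper, but the general distribution-alignment idea, adjusting predicate logits by training-class priors before retraining the classification head, can be sketched as follows; the logit-adjustment form and `tau` are assumptions:

```python
# Hypothetical sketch of distribution alignment for long-tailed predicates:
# subtract log class priors so rare predicates are not drowned out.
import torch

def align_logits(logits, class_counts, tau=1.0):
    """Adjust predicate logits by the (log) training-class frequencies."""
    priors = class_counts / class_counts.sum()
    return logits - tau * torch.log(priors)

logits = torch.randn(4, 50)                       # 50 predicate classes
counts = torch.randint(1, 10_000, (50,)).float()  # long-tailed class counts
balanced = align_logits(logits, counts)
print(balanced.argmax(dim=1))  # predictions after alignment
```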

Development of Autonomous Vehicle Learning Data Generation System (자율주행 차량의 학습 데이터 자동 생성 시스템 개발)

  • Yoon, Seungje;Jung, Jiwon;Hong, June;Lim, Kyungil;Kim, Jaehwan;Kim, Hyungjoo
    • The Journal of The Korea Institute of Intelligent Transport Systems
    • /
    • v.19 no.5
    • /
    • pp.162-177
    • /
    • 2020
  • Perception of the traffic environment through various sensors in an autonomous driving system is directly related to driving safety. With advances in machine learning and deep neural network technology, deep-neural-network-based perception models are now widely used, which requires both model training and high-quality training datasets. However, there are practical difficulties in collecting data for every situation that may occur in self-driving: perception performance may degrade due to differences between overseas and domestic traffic environments, and the quality of data from bad weather, where sensors cannot operate normally, cannot be guaranteed. It is therefore necessary to build a virtual road environment in a simulator, rather than relying on actual roads, to collect training data. This paper proposes a training dataset collection process that diversifies weather, illumination, sensor position, and vehicle types and counts in a simulator environment reproducing domestic road conditions. To achieve better performance, the image domain was shifted closer to that of real images and further diversified. Performance evaluation on test data collected in a real road environment showed performance similar to that of a model trained only on real-world data.
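
A minimal sketch of the diversification idea: each simulated capture randomizes weather, illumination, sensor position, and vehicle mix. The parameter names and value ranges are hypothetical, and the dictionary stands in for whatever configuration the actual simulator consumes:

```python
# Hypothetical sketch: sample a randomized scenario configuration per
# simulated capture to diversify the collected training dataset.
import random

WEATHER = ["clear", "rain", "fog", "snow"]
VEHICLES = ["sedan", "suv", "truck", "bus"]

def sample_scenario():
    return {
        "weather": random.choice(WEATHER),
        "sun_altitude_deg": random.uniform(-10, 90),  # illumination
        "camera_height_m": random.uniform(1.2, 2.0),  # sensor position
        "vehicles": random.choices(VEHICLES, k=random.randint(5, 40)),
    }

for _ in range(3):  # one configuration per captured frame set
    print(sample_scenario())
```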

Co-registration of PET-CT Brain Images using a Gaussian Weighted Distance Map (가우시안 가중치 거리지도를 이용한 PET-CT 뇌 영상정합)

  • Lee, Ho;Hong, Helen;Shin, Yeong-Gil
    • Journal of KIISE:Software and Applications
    • /
    • v.32 no.7
    • /
    • pp.612-624
    • /
    • 2005
  • In this paper, we propose a surface-based registration method using a Gaussian-weighted distance map for PET-CT brain image fusion. Our method consists of three main steps: feature point extraction, Gaussian-weighted distance map generation, and weight-based similarity measurement. First, we segment the head in the PET and CT images using inverse region growing, remove noise attached to the head using region-growing-based labeling, and extract feature points of the head with a sharpening filter. Second, a Gaussian-weighted distance map is generated from the feature points in the CT images; this leads feature points to converge robustly on the optimal location even under large geometric displacement. Third, weight-based cross-correlation searches for the optimal location using the Gaussian-weighted distance map of the CT images matched against the feature points extracted from the PET images. In our experiments, we generated a software phantom dataset for evaluating accuracy and robustness, and used a clinical dataset for measuring computation time and visual inspection. The accuracy test evaluates root-mean-square error on arbitrarily transformed software phantom datasets. The robustness test evaluates whether weight-based cross-correlation reaches its maximum at the optimal location on software phantom datasets with large geometric displacement and noise. Experimental results showed that our method converges more accurately and robustly than conventional surface-based registration.
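
A minimal sketch of the Gaussian-weighted distance map under stated assumptions: SciPy's Euclidean distance transform measures the distance to the nearest CT feature point, and a Gaussian falloff (sigma is an assumed value) turns it into weights, giving the weight-based cross-correlation a wide, smooth basin of attraction:

```python
# Hypothetical sketch: distances from CT feature points become smoothly
# decaying weights for the similarity measure.
import numpy as np
from scipy.ndimage import distance_transform_edt

def gaussian_weighted_distance_map(feature_mask, sigma=10.0):
    """feature_mask: boolean array, True at CT feature points."""
    dist = distance_transform_edt(~feature_mask)    # distance to nearest feature
    return np.exp(-(dist ** 2) / (2 * sigma ** 2))  # Gaussian falloff

mask = np.zeros((128, 128), dtype=bool)
mask[64, 64] = True  # single feature point for illustration
weights = gaussian_weighted_distance_map(mask)
print(weights.max(), weights[64, 80])  # 1.0 at the feature, decaying away

# Weight-based similarity then sums map values under transformed PET feature
# points; the smooth falloff helps convergence from large displacements.
```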

Segmentation Foundation Model-based Automated Yard Management Algorithm (의미론적 분할 기반 모델을 이용한 조선소 사외 적치장 객체 자동 관리 기술)

  • Mingyu Jeong;Jeonghyun Noh;Janghyun Kim;Seongheon Ha;Taeseon Kang;Byounghak Lee;Kiryong Kang;Junhyeon Kim;Jinsun Park
    • Smart Media Journal
    • /
    • v.13 no.2
    • /
    • pp.52-61
    • /
    • 2024
  • In shipyards, aerial images of external storage yards are acquired at regular intervals using Unmanned Aerial Vehicles (UAVs), and the images are then inspected by humans to manage the status of the yards. This method requires significant time and manpower, especially for large areas. In this paper, we propose an automated management technology based on a semantic segmentation foundation model to address these challenges and accurately assess the status of external storage yards. Because publicly available datasets for external storage yards are insufficient, we collected a small-scale dataset of external storage yard objects and equipment. Using this dataset, we fine-tune an object detector and extract initial object candidates, which are used as prompts for the Segment Anything Model (SAM) to obtain precise semantic segmentation results. Furthermore, to facilitate continuous storage yard dataset collection, we propose a training data generation pipeline using SAM. The proposed method achieves 4.00%p higher performance on average than previous semantic segmentation methods; in particular, it outperforms SegFormer by 5.08%.
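
The detector-to-SAM prompting step can be sketched with the public segment-anything package; the checkpoint path, image, and box below are placeholders, and the fine-tuned detector that produces the box is outside this snippet:

```python
# Hypothetical sketch: boxes from a fine-tuned detector prompt SAM for
# pixel-precise masks of storage yard objects.
import numpy as np
from segment_anything import SamPredictor, sam_model_registry

sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h.pth")  # placeholder path
predictor = SamPredictor(sam)

image = np.zeros((1080, 1920, 3), dtype=np.uint8)  # stand-in aerial image
predictor.set_image(image)

box = np.array([100, 200, 400, 500])  # detector output (x0, y0, x1, y1)
masks, scores, _ = predictor.predict(box=box, multimask_output=False)
# The resulting masks can be saved as new annotations, seeding the
# continuous training data generation pipeline the abstract describes.
```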