• Title/Summary/Keyword: Dataset for AI

A Study on the Generation of Webtoons through Fine-Tuning of Diffusion Models (확산모델의 미세조정을 통한 웹툰 생성연구)

  • Kyungho Yu;Hyungju Kim;Jeongin Kim;Chanjun Chun;Pankoo Kim
    • Smart Media Journal / v.12 no.7 / pp.76-83 / 2023
  • This study proposes a method to assist webtoon artists during webtoon creation by using a pretrained Text-to-Image model to generate webtoon images from text. The proposed approach fine-tunes a pretrained Stable Diffusion model on a webtoon dataset transformed into the desired webtoon style. Fine-tuning with the LoRA technique completes in approximately 4.5 hours over 30,000 steps. The generated images render shapes and backgrounds according to the input text, producing webtoon-like images. Furthermore, quantitative evaluation using the Inception score shows that the proposed method outperforms DCGAN-based Text-to-Image models. If webtoon artists adopt the proposed Text-to-Image model for webtoon creation, it is expected to significantly reduce the time required for the creative process.
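
One reason LoRA fine-tuning can finish quickly is that it trains two small low-rank matrices instead of a full weight matrix. The sketch below compares trainable parameter counts for a single layer; the 768 x 768 layer size and rank 4 are illustrative assumptions, not values reported in the paper.

```python
def full_trainable_params(d_out: int, d_in: int) -> int:
    """Parameters updated when the whole d_out x d_in weight matrix is trained."""
    return d_out * d_in


def lora_trainable_params(d_out: int, d_in: int, rank: int) -> int:
    """Parameters updated by LoRA: B is d_out x rank, A is rank x d_in."""
    return rank * (d_out + d_in)


# Hypothetical layer size and rank, chosen only for illustration.
D_OUT, D_IN, RANK = 768, 768, 4
full = full_trainable_params(D_OUT, D_IN)
lora = lora_trainable_params(D_OUT, D_IN, RANK)
print(f"full: {full}, LoRA: {lora}, reduction: {full / lora:.0f}x")
```

With these assumed sizes, the low-rank update touches roughly 1% of the parameters that full fine-tuning of the same layer would.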

Detection and Prediction of Subway Failure using Machine Learning (머신러닝을 이용한 지하철 고장 탐지 및 예측)

  • Kuk-Kyung Sung
    • Advanced Industrial SCIence / v.2 no.4 / pp.11-16 / 2023
  • The subway is a means of public transportation that plays an important role in the transportation system of modern cities. However, sudden breakdowns and system outages often cause congestion and inconvenience. In this paper, we therefore study failure prediction and prevention using machine learning to operate the subway system efficiently. Using UC Irvine's MetroPT-3 dataset, we built a subway breakdown prediction model based on logistic regression. The model predicted the non-failure state with a high accuracy of 0.991. However, precision and recall are relatively low, suggesting that failure predictions may be unreliable. The ROC_AUC value is 0.901, indicating that the model classifies better than random guessing. The constructed model is useful for stable operation of the subway system, but additional research is needed to improve its performance. In the future, given sufficient training data and careful data cleaning, failures could be prevented through prediction-driven pre-inspection.
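
High accuracy alongside low precision and recall is typical when failures are rare, because correct "non-failure" predictions dominate the accuracy score. The sketch below illustrates this with hypothetical confusion-matrix counts; the numbers are assumptions for illustration and are not taken from the paper or the MetroPT-3 dataset.

```python
def classification_metrics(tp: int, fp: int, fn: int, tn: int):
    """Return (accuracy, precision, recall) from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


# Hypothetical counts: 1,000 samples, only 20 of which are failures.
acc, prec, rec = classification_metrics(tp=5, fp=3, fn=15, tn=977)
print(f"accuracy={acc:.3f} precision={prec:.3f} recall={rec:.3f}")
```

Here the model misses 15 of 20 failures yet still scores above 0.98 on accuracy, which is why precision and recall on the failure class are the more informative figures.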

Framework Design for Malware Dataset Extraction Using Code Patches in a Hybrid Analysis Environment (코드패치 및 하이브리드 분석 환경을 활용한 악성코드 데이터셋 추출 프레임워크 설계)

  • Ki-Sang Choi;Sang-Hoon Choi;Ki-Woong Park
    • Journal of the Korea Institute of Information Security & Cryptology / v.34 no.3 / pp.403-416 / 2024
  • Malware is being commercialized and sold on the black market, primarily driven by financial incentives. With the increasing demand driven by these sales, the scope of attacks via malware has expanded. In response, there has been a surge in research efforts leveraging artificial intelligence for detection and classification. However, adversaries are integrating various anti-analysis techniques into their malware to thwart analytical efforts. In this study, we introduce the "Malware Analysis with Dynamic Extraction (MADE)" framework, a hybrid binary analysis tool devised to procure datasets from advanced malware incorporating Anti-Analysis techniques. The MADE framework can autonomously execute dynamic analysis on binaries, including those protected by Anti-VM and Anti-Debugging defenses. Experimental results show that the MADE framework can effectively circumvent over 90% of diverse malware implementations that use Anti-Analysis techniques and can extract relevant datasets.

Detection of Urban Trees Using YOLOv5 from Aerial Images (항공영상으로부터 YOLOv5를 이용한 도심수목 탐지)

  • Park, Che-Won;Jung, Hyung-Sup
    • Korean Journal of Remote Sensing / v.38 no.6_2 / pp.1633-1641 / 2022
  • Urban population concentration and indiscriminate development are causing various environmental problems such as air pollution and heat island phenomena, and are worsening the damage caused by natural disasters. Urban trees have been proposed as a solution to these urban problems and play an important role, for example by providing environmental improvement functions. Accordingly, quantitative measurement and analysis of individual urban trees are required to understand the effect of trees on the urban environment. However, the complexity and diversity of urban trees lower the accuracy of single-tree detection. Therefore, we conducted a study to detect trees in Dongjak-gu using high-resolution aerial images, which enable effective detection of tree objects, and You Only Look Once Version 5 (YOLOv5), which has shown excellent performance in object detection. We created labeling guidelines for constructing tree AI training datasets and performed box annotation on Dongjak-gu trees based on them. We tested YOLOv5 models of various scales on the constructed dataset and adopted the optimal model for more efficient urban tree detection, achieving a mean Average Precision (mAP) of 0.663.
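
Scoring box annotations with mAP relies on the Intersection over Union (IoU) between a predicted box and a labeled box. The following is a minimal sketch, assuming boxes are given as (x_min, y_min, x_max, y_max) tuples; the coordinates are hypothetical and the function is not taken from the paper or from YOLOv5.

```python
def iou(box_a, box_b) -> float:
    """Intersection over Union of two axis-aligned boxes (x_min, y_min, x_max, y_max)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlapping region (zero if the boxes are disjoint).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    intersection = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


# Hypothetical labeled tree box and predicted box, in pixels.
labeled = (0, 0, 10, 10)
predicted = (5, 5, 15, 15)
overlap = iou(labeled, predicted)
print(f"IoU = {overlap:.3f}")
```

A prediction is usually counted as correct only when its IoU with a labeled box exceeds a chosen threshold, so tightly packed or irregular tree crowns make matches harder to achieve.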

Quantitative Estimation Method for ML Model Performance Change, Due to Concept Drift (Concept Drift에 의한 ML 모델 성능 변화의 정량적 추정 방법)

  • Soon-Hong An;Hoon-Suk Lee;Seung-Hoon Kim
    • KIPS Transactions on Software and Data Engineering / v.12 no.6 / pp.259-266 / 2023
  • It is very difficult to measure the performance of a machine learning model once it is deployed in a business service. As a result, operational departments do not manage model performance effectively. Academically, various studies have addressed concept drift detection methods for determining whether the model's status is appropriate. Operational departments want to know the performance of the operating model quantitatively, but concept drift detection can only indicate the state of the model in relation to the data; it cannot estimate the model's performance quantitatively. In this study, we propose a performance prediction model (PPM) that quantitatively estimates precision from concept drift statistics. The proposed model induces artificial drift in samples drawn from the training data, measures the precision on those samples, builds a dataset of drift and precision pairs, and trains on it. The difference between actual and predicted precision is then compared on test data to correct the error of the performance prediction model. The proposed PPM was applied to two models that can be used in real business, a loan underwriting model and a credit card fraud detection model, and was confirmed to predict precision effectively.
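
The idea of learning a mapping from a drift statistic to precision can be sketched with a simple regression. The abstract does not state which regressor the PPM uses, so ordinary least squares on a single drift value is a stand-in here, and the (drift, precision) pairs are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical dataset: drift statistic of each artificially drifted sample
# and the precision the deployed model achieved on that sample.
drift_values = [0.0, 0.1, 0.2, 0.3]
precisions = [0.90, 0.85, 0.80, 0.75]

slope, intercept = fit_line(drift_values, precisions)

# Estimate precision for newly observed drift without needing fresh labels.
estimated_precision = slope * 0.15 + intercept
print(f"estimated precision at drift 0.15: {estimated_precision:.3f}")
```

In operation, only the drift statistic has to be measured on incoming data, which is what makes the estimate usable when ground-truth labels arrive late or not at all.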

Automated Data Extraction from Unstructured Geotechnical Report based on AI and Text-mining Techniques (AI 및 텍스트 마이닝 기법을 활용한 지반조사보고서 데이터 추출 자동화)

  • Park, Jimin;Seo, Wanhyuk;Seo, Dong-Hee;Yun, Tae-Sup
    • Journal of the Korean Geotechnical Society / v.40 no.4 / pp.69-79 / 2024
  • Field geotechnical data are obtained from various field and laboratory tests and are documented in geotechnical investigation reports. For efficient design and construction, digitizing these geotechnical parameters is essential. However, current practices involve manual data entry, which is time-consuming, labor-intensive, and prone to errors. Thus, this study proposes an automatic data extraction method from geotechnical investigation reports using image-based deep learning models and text-mining techniques. A deep-learning-based page classification model and a text-searching algorithm were employed to classify geotechnical investigation report pages with 100% accuracy. Computer vision algorithms were utilized to identify valid data regions within report pages, and text analysis was used to match and extract the corresponding geotechnical data. The proposed model was validated using a dataset of 205 geotechnical investigation reports, achieving an average data extraction accuracy of 93.0%. Finally, a user-interface-based program was developed to enhance the practical application of the extraction model. It allowed users to upload PDF files of geotechnical investigation reports, automatically analyze these reports, and extract and edit data. This approach is expected to improve the efficiency and accuracy of digitizing geotechnical investigation reports and building geotechnical databases.

Transfer Learning-based Object Detection Algorithm Using YOLO Network (YOLO 네트워크를 활용한 전이학습 기반 객체 탐지 알고리즘)

  • Lee, Donggu;Sun, Young-Ghyu;Kim, Soo-Hyun;Sim, Issac;Lee, Kye-San;Song, Myoung-Nam;Kim, Jin-Young
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.20 no.1 / pp.219-223 / 2020
  • To guarantee a high recognition rate and recognition precision for an AI model, obtaining a large amount of data is essential. In this paper, we propose a transfer learning-based object detection algorithm that maintains strong performance even when the volume of training data is small. We also propose a transfer learning network combining ResNet-50 and the YOLO (You Only Look Once) network. The transfer learning network uses the Leeds Sports Pose dataset to train a network that detects the person who occupies the largest part of each image. Simulation results yield a detection rate of 84% and a detection precision of 97%.
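
Choosing the person who occupies the largest part of an image amounts to picking the bounding box with the greatest area. A minimal sketch, assuming detections are (x_min, y_min, x_max, y_max) tuples; the boxes are hypothetical and this is not the paper's implementation.

```python
def box_area(box) -> int:
    """Area of an axis-aligned box given as (x_min, y_min, x_max, y_max)."""
    x_min, y_min, x_max, y_max = box
    return max(0, x_max - x_min) * max(0, y_max - y_min)


def largest_box(boxes):
    """Return the box with the largest area, or None if there are no boxes."""
    return max(boxes, key=box_area, default=None)


# Hypothetical person detections in one image, in pixels.
detections = [(10, 20, 60, 120), (100, 30, 300, 400), (5, 5, 25, 45)]
main_person = largest_box(detections)
print(main_person)
```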

Performance comparison of wake-up-word detection on mobile devices using various convolutional neural networks (다양한 합성곱 신경망 방식을 이용한 모바일 기기를 위한 시작 단어 검출의 성능 비교)

  • Kim, Sanghong;Lee, Bowon
    • The Journal of the Acoustical Society of Korea / v.39 no.5 / pp.454-460 / 2020
  • Artificial intelligence assistants that provide speech recognition operate through cloud-based voice recognition with high accuracy. In cloud-based speech recognition, Wake-Up-Word (WUW) detection plays an important role in activating devices on standby. In this paper, we compare the performance of Convolutional Neural Network (CNN)-based WUW detection models for mobile devices on Google's speech commands dataset, using spectrogram and mel-frequency cepstral coefficient features as inputs. The models compared in this paper are a multi-layer perceptron, a general convolutional neural network, VGG16, VGG19, ResNet50, ResNet101, ResNet152, and MobileNet. We also propose a network that reduces the model size to 1/25 while maintaining the performance of MobileNet.

A New Head Pose Estimation Method based on Boosted 3-D PCA (새로운 Boosted 3-D PCA 기반 Head Pose Estimation 방법)

  • Lee, Kyung-Min;Lin, Chi-Ho
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.21 no.6 / pp.105-109 / 2021
  • In this paper, we evaluate the performance of Boosted 3-D PCA on datasets and then analyze the network's features and performance. Training was performed on the 300W-LP dataset using the same training method as Boosted 3-D PCA, and evaluation was performed on the AFLW2000 dataset. The results show performance similar to that reported in the Boosted 3-D PCA paper. Compared with the existing Landmark-to-Pose method, the approach can be trained more freely on face image datasets, so poses can be predicted accurately in real-world situations. Since the optimization of the set of key points is not independent, we confirmed the manual that can reduce the computation time. This analysis is expected to be a very important resource for improving the performance of the Boosted 3-D PCA network or applying it to various application domains.

Generation of Stage Tour Contents with Deep Learning Style Transfer (딥러닝 스타일 전이 기반의 무대 탐방 콘텐츠 생성 기법)

  • Kim, Dong-Min;Kim, Hyeon-Sik;Bong, Dae-Hyeon;Choi, Jong-Yun;Jeong, Jin-Woo
    • Journal of the Korea Institute of Information and Communication Engineering / v.24 no.11 / pp.1403-1410 / 2020
  • As interest in non-face-to-face experiences and services increases, demand is growing rapidly for web video content that can be easily consumed on mobile devices such as smartphones and tablets. To meet this demand, we propose a technique for efficiently producing video content that provides the experience of visiting famous places that appear in animation or movies (i.e., a stage tour). To this end, we built an image dataset by collecting images of stage areas using the Google Maps and Google Street View APIs. We then present a deep learning-based style transfer method that applies the unique style of animation videos to the collected street view images and generates video content from the style-transferred images. Finally, through various experiments, we show that the proposed method can produce more interesting stage-tour video content.