• Title/Summary/Keyword: AI Dataset

Analysis of YouTube Viewers' Characteristics and Responses to Virtual Idols (버추얼 아이돌에 대한 유튜브 시청자 특성과 반응 분석)

  • JeongYoon Kang;Choonsung Shin;Hieyong Jeong
    • Journal of Information Technology Services / v.23 no.3 / pp.103-118 / 2024
  • Advances in virtual reality technology have led to the wide use of virtual idols in industry and in cultural content. However, social perceptions of virtual idols are not well understood, which makes them difficult to put to use. This paper therefore collected and analyzed YouTube comments to identify differences in social perception between virtual idols and general idols. The dataset was constructed by crawling comments from virtual idol music videos with more than 10 million views and more than 10,000 comments. Keyword frequency and TF-IDF values were derived from the collected dataset, and degree centrality and CONCOR clusters were analyzed on a semantic network using the UCINET program. The analysis found that comments on virtual idols frequently used keywords such as "person," "quality," "character," "reality," and "animation," whereas the keywords derived for general idols related to reactions and perceptions. Based on these results, general idols are evaluated mainly on appearance and cultural factors, whereas social perception of virtual idols focuses on technical factors such as "people," "quality," "character," and "animation," mixed with evaluations of cultural factors such as "song," "voice," and "choreography." However, keywords such as "song," "voice," "choreography," and "music" are included in the top 30, as with regular idols, and appear in the same cluster, suggesting that virtual idols are gradually moving from a minority taste toward mainstream culture. By clarifying the social perception of virtual idols, this study aims to provide academic and practical implications for the future expansion of the virtual idol industry and its cultural content.
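The keyword frequency and TF-IDF step mentioned in this abstract can be illustrated with a minimal sketch. It assumes the comments are already tokenized, and it uses the common `tf * log(N / df)` weighting; the abstract does not say which TF-IDF variant the authors used, and the sample comments below are hypothetical, not data from the paper.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: tf-idf weight} dict per tokenized document.

    Assumed variant: tf = count / document length, idf = log(N / df).
    """
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical, already-tokenized comments (illustration only).
comments = [
    ["song", "voice", "quality"],
    ["animation", "quality", "character"],
    ["song", "choreography", "song"],
]

# Raw keyword frequency over the whole collection.
keyword_freq = Counter(term for comment in comments for term in comment)
weights = tf_idf(comments)
```

Under this weighting, a term that occurs in every comment gets a weight of zero, which is why TF-IDF complements raw frequency when ranking keywords.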

Construction of Artificial Intelligence Training Platform for Multi-Center Clinical Research (다기관 임상연구를 위한 인공지능 학습 플랫폼 구축)

  • Lee, Chung-Sub;Kim, Ji-Eon;No, Si-Hyeong;Kim, Tae-Hoon;Yoon, Kwon-Ha;Jeong, Chang-Won
    • KIPS Transactions on Computer and Communication Systems / v.9 no.10 / pp.239-246 / 2020
  • In the medical field, where artificial intelligence technology is being introduced, research on clinical decision support systems (CDSS) for diagnosis and prediction is actively being conducted. In particular, AI technologies have been applied to medical imaging-based disease diagnosis in various products. However, medical imaging data is inconsistent, and preparing it for research takes considerable time. This paper describes a one-stop AI learning platform that converts data to the medical imaging standard R_CDM (Radiology Common Data Model) and supports AI algorithm development research based on the resulting dataset. To this end, the focus is on linking with the existing CDM (Common Data Model) and on modeling the system, including the schema of the medical imaging standard model and report information for multi-center research based on DICOM (Digital Imaging and Communications in Medicine) tag information. We also show execution results based on datasets generated through the AI learning platform. The proposed platform is expected to be used in a variety of image-based artificial intelligence research.

Artificial Intelligence for Assistance of Facial Expression Practice Using Emotion Classification (감정 분류를 이용한 표정 연습 보조 인공지능)

  • Dong-Kyu, Kim;So Hwa, Lee;Jae Hwan, Bong
    • The Journal of the Korea institute of electronic communication sciences / v.17 no.6 / pp.1137-1144 / 2022
  • In this study, an artificial intelligence (AI) system was developed to assist facial expression practice for expressing emotions. The AI feeds multimodal inputs, consisting of sentences and facial images, into deep neural networks (DNNs). The DNNs calculate the similarity between the emotion predicted from the sentence and the emotion predicted from the facial image. The user practices facial expressions for the situation described by a sentence, and the AI gives the user numerical feedback based on that similarity. A ResNet34 architecture was trained on the public FER2013 data to predict emotions from facial images. To predict emotions from sentences, a KoBERT model was trained by transfer learning on the conversational speech dataset for emotion classification released to the public by AIHub. The DNN that predicts emotions from facial images demonstrated 65% accuracy, which is comparable to human emotion classification ability. The DNN that predicts emotions from sentences achieved 90% accuracy. The performance of the developed AI was evaluated through experiments on changing facial expressions in which an ordinary person participated.
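The numerical feedback described in this abstract compares two emotion predictions. The abstract does not name the similarity measure, so the sketch below assumes cosine similarity between two class-probability vectors and scales it to a 0-100 score; the emotion labels and probabilities are hypothetical.

```python
import math


def cosine_similarity(p, q):
    """Cosine similarity between two equal-length probability vectors."""
    dot = sum(a * b for a, b in zip(p, q))
    norm_p = math.sqrt(sum(a * a for a in p))
    norm_q = math.sqrt(sum(b * b for b in q))
    if norm_p == 0 or norm_q == 0:
        return 0.0  # an all-zero vector carries no emotion information
    return dot / (norm_p * norm_q)


# Hypothetical label set and model outputs (illustration only).
EMOTIONS = ["angry", "happy", "sad", "neutral"]
text_probs = [0.1, 0.7, 0.1, 0.1]   # emotion predicted from the sentence
face_probs = [0.2, 0.5, 0.1, 0.2]   # emotion predicted from the face image

# Numerical feedback on a 0-100 scale (an assumed presentation choice).
score = round(100 * cosine_similarity(text_probs, face_probs))
```

A higher score means the practiced expression is closer to the emotion implied by the sentence; any other similarity between distributions could be substituted for the cosine here.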

A Study on the Development Trend of Artificial Intelligence Using Text Mining Technique: Focused on Open Source Software Projects on Github (텍스트 마이닝 기법을 활용한 인공지능 기술개발 동향 분석 연구: 깃허브 상의 오픈 소스 소프트웨어 프로젝트를 대상으로)

  • Chong, JiSeon;Kim, Dongsung;Lee, Hong Joo;Kim, Jong Woo
    • Journal of Intelligence and Information Systems / v.25 no.1 / pp.1-19 / 2019
  • Artificial intelligence (AI) is one of the main driving forces of the Fourth Industrial Revolution. AI technologies have already shown abilities equal to or better than those of people in many fields, including image and speech recognition. Because AI technologies can be applied in a wide range of fields, including medicine, finance, manufacturing, services, and education, many efforts have been made to identify current technology trends and analyze their development directions. Major platforms for developing complex AI algorithms for learning, reasoning, and recognition have been opened to the public as open source projects. As a result, technologies and services that use them have increased rapidly, which has been identified as one of the major reasons for the fast development of AI technologies. The spread of the technology also owes much to open source software (OSS) developed by major global companies that supports natural language recognition, speech recognition, and image recognition. This study therefore aimed to identify practical trends in AI technology development by analyzing AI-related OSS projects developed through the online collaboration of many parties. It searched for and collected a list of major AI-related projects created on Github from 2000 to July 2018, and examined the development trends of major technologies in detail by applying text mining to topic information, which indicates the characteristics and technical fields of the collected projects. The analysis showed that fewer than 100 software development projects were created per year until 2013. The number then increased to 229 projects in 2014 and 597 projects in 2015, and AI-related open source projects increased rapidly in 2016 (2,559 OSS projects). The number of projects initiated in 2017 was 14,213, almost four times the total number of projects created from 2009 to 2016 (3,555 projects). The number of projects initiated from January to July 2018 was 8,737. The development trend of AI-related technologies was evaluated by dividing the study period into three phases, with the appearance frequency of topics indicating the technology trends of AI-related OSS projects. The results showed that natural language processing remained at the top in all years, implying that OSS had been developed continuously. Until 2015, the programming languages Python, C++, and Java were among the ten most frequent topics. After 2016, however, programming languages other than Python disappeared from the top ten. In their place, platforms that support the development of AI algorithms, such as TensorFlow and Keras, showed high appearance frequency. Reinforcement learning algorithms and convolutional neural networks, which have been used in various fields, were also frequent topics. Topic network analysis showed that the most important topics by degree centrality were similar to those by appearance frequency. The main difference was that visualization and medical imaging topics were found at the top of the list, although they were not at the top of the list from 2009 to 2012, indicating that OSS was developed in the medical field to make use of AI technology. Moreover, although computer vision was in the top 10 of the appearance frequency list from 2013 to 2015, it was not in the top 10 by degree centrality. The topics at the top of the degree centrality list were otherwise similar to those at the top of the appearance frequency list, and the ranks of convolutional neural networks and reinforcement learning changed slightly. The trend of technology development was then examined using both appearance frequency and degree centrality. Machine learning showed the highest frequency and the highest degree centrality in all years. Notably, although the deep learning topic showed low frequency and low degree centrality between 2009 and 2012, its rank rose abruptly between 2013 and 2015, and in recent years both technologies have had high appearance frequency and degree centrality. TensorFlow first appeared during the 2013-2015 phase, and its appearance frequency and degree centrality soared between 2016 and 2018, placing it at the top of the lists after deep learning and Python. Computer vision and reinforcement learning did not show an abrupt increase or decrease, and had relatively low appearance frequency and degree centrality compared with the topics above. These results make it possible to identify the fields in which AI technologies are actively developed, and they can be used as a baseline dataset for more empirical analysis of future technology trends and their convergence.
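The two measures this abstract relies on, topic appearance frequency and degree centrality, can be sketched as follows. The sketch assumes a co-occurrence network in which two topics are linked when they tag the same project, and it uses normalized degree centrality (number of neighbors divided by n - 1); the abstract does not describe how the authors built their network, and the project list is hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations


def degree_centrality(projects):
    """Normalized degree centrality on a topic co-occurrence network."""
    neighbors = defaultdict(set)
    for topics in projects:
        unique = sorted(set(topics))
        for topic in unique:
            neighbors[topic]  # register the node even if it has no links
        for a, b in combinations(unique, 2):
            neighbors[a].add(b)
            neighbors[b].add(a)
    n = len(neighbors)
    if n < 2:
        return {topic: 0.0 for topic in neighbors}
    return {topic: len(linked) / (n - 1) for topic, linked in neighbors.items()}


# Hypothetical topic tags per project (illustration only).
projects = [
    ["machine-learning", "python", "tensorflow"],
    ["machine-learning", "deep-learning"],
    ["machine-learning", "computer-vision", "python"],
]

frequency = Counter(topic for topics in projects for topic in set(topics))
centrality = degree_centrality(projects)
```

In this toy network `machine-learning` has both the highest frequency and the highest centrality, while `tensorflow` and `computer-vision` tie on centrality despite appearing in different projects, which shows how the two rankings can diverge.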

An Artificial Intelligence Approach for Word Semantic Similarity Measure of Hindi Language

  • Younas, Farah;Nadir, Jumana;Usman, Muhammad;Khan, Muhammad Attique;Khan, Sajid Ali;Kadry, Seifedine;Nam, Yunyoung
    • KSII Transactions on Internet and Information Systems (TIIS) / v.15 no.6 / pp.2049-2068 / 2021
  • AI combined with NLP techniques has promoted the use of virtual assistants and has made people rely on them for many diverse uses. Conversational agents are the most promising technique for assisting computer users in their operation. An important challenge in developing conversational agents globally is transferring the groundbreaking expertise obtained in English to other languages, and AI is making this transfer possible. There is a dire need to develop systems that understand secular languages. One such difficult language is Hindi, which is the fourth most spoken language in the world. Semantic similarity is an important part of natural language processing and underlies applications such as ontology learning and information extraction used in developing conversational agents. Most research has concentrated on English and other European languages. This paper presents a corpus-based word semantic similarity measure for Hindi. An experiment is performed in which an English benchmark dataset is translated into Hindi, investigating the incorporation of the corpus with human and machine similarity ratings. The correlation between human intuition and the algorithm's ratings, which is significant, is used to assess the accuracy of the proposed similarity measure. The method can be adapted to various applications of word semantic similarity or used as a module for other languages.
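The evaluation step in this abstract, comparing human similarity ratings with algorithm ratings, can be sketched with a correlation coefficient. The abstract does not say which coefficient was used, so Pearson's r is an assumption here, and the ratings are hypothetical values, not results from the paper.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (spread_x * spread_y)


# Hypothetical ratings for five word pairs (illustration only).
human_ratings = [3.9, 3.5, 2.1, 0.8, 0.2]
algorithm_ratings = [0.92, 0.85, 0.40, 0.30, 0.05]

r = pearson(human_ratings, algorithm_ratings)
```

A value of `r` close to 1 would indicate that the measure orders word pairs much as human raters do; a rank-based coefficient could be used instead if only the ordering matters.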

Implementation of a Deep Learning based Realtime Fire Alarm System using a Data Augmentation (데이터 증강 학습 이용한 딥러닝 기반 실시간 화재경보 시스템 구현)

  • Kim, Chi-young;Lee, Hyeon-Su;Lee, Kwang-yeob
    • Journal of IKEEE / v.26 no.3 / pp.468-474 / 2022
  • In this paper, we propose a method for implementing a real-time fire alarm system using deep learning. A deep learning image dataset of 1,500 fire images was collected from the Internet. If varied images acquired in everyday environments are used for training as they are, the learning accuracy is not high. We therefore propose a fire image data augmentation method to improve learning accuracy. With data augmentation, 600 training images were added using brightness control, blurring, and flame photo synthesis, for a total of 2,100 training images. The data added by flame image synthesis had a large effect on the accuracy improvement. A real-time fire detection system detects fires by applying deep learning to image data and sends notifications to users. An app was developed that detects fires by analyzing images in real time, using a model custom-trained from the YOLO V4 TINY model, which is suitable for Edge AI systems, and informs users of the results. With the proposed data, an accuracy improvement of approximately 10% can be obtained compared with conventional methods.
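Two of the three augmentation operations named in this abstract, brightness control and blurring, can be sketched without any image library. The sketch assumes a grayscale image stored as a list of rows with pixel values from 0 to 255, and it uses a 3x3 box blur; the abstract does not specify the authors' parameters or blur kernel, and flame photo synthesis is not covered here.

```python
def adjust_brightness(image, factor):
    """Scale every pixel by `factor`, clamping the result to 0..255."""
    return [
        [min(255, max(0, round(pixel * factor))) for pixel in row]
        for row in image
    ]


def box_blur(image):
    """3x3 box blur; border pixels average over the neighbors that exist."""
    height, width = len(image), len(image[0])
    blurred = []
    for y in range(height):
        row = []
        for x in range(width):
            window = [
                image[j][i]
                for j in range(max(0, y - 1), min(height, y + 2))
                for i in range(max(0, x - 1), min(width, x + 2))
            ]
            row.append(sum(window) // len(window))
        blurred.append(row)
    return blurred


# Hypothetical 3x3 grayscale patch with one bright pixel (illustration only).
patch = [
    [0, 0, 0],
    [0, 90, 0],
    [0, 0, 0],
]

augmented = [adjust_brightness(patch, 1.5), box_blur(patch)]
```

Each original image yields extra training samples this way, which is how an augmented set can grow beyond the number of collected images.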

Heart Disease Prediction Using Decision Tree With Kaggle Dataset

  • Noh, Young-Dan;Cho, Kyu-Cheol
    • Journal of the Korea Society of Computer and Information / v.27 no.5 / pp.21-28 / 2022
  • All health problems that occur in the circulatory system, such as heart and vascular diseases, are referred to as cardiovascular illness. Deaths from cardiovascular disorders accounted for one third of all deaths worldwide in 2019, and the number continues to rise. Therefore, if diseases with high mortality rates can be predicted from patient data with an AI system, they can be detected and treated in advance. In this study, models are built to predict heart disease, one of the cardiovascular diseases, and their performance is compared using Accuracy, Precision, and Recall, along with a description of how to improve the performance of the Decision Tree. Decision Tree, KNN (K-Nearest Neighbor), SVM (Support Vector Machine), and DNN (Deep Neural Network) are used in this study. Experiments were conducted in Python in Jupyter Notebook on macOS Big Sur, using the scikit-learn, Keras, and TensorFlow libraries. In the comparison, the Decision Tree demonstrated the highest performance, and this study therefore recommends the Decision Tree.
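The three metrics used to compare the models in this abstract can be computed from the counts of true and false predictions. The following is a minimal sketch for binary labels; the labels and predictions are hypothetical, and the function name is illustrative rather than taken from the paper or from any library.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return (accuracy, precision, recall) for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted/present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, precision, recall


# Hypothetical labels: 1 = heart disease, 0 = no heart disease.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

accuracy, precision, recall = classification_metrics(y_true, y_pred)
```

For a screening task like this one, recall is often examined alongside accuracy, because a missed patient (a false negative) is usually costlier than a false alarm.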

Injection Process Yield Improvement Methodology Based on eXplainable Artificial Intelligence (XAI) Algorithm (XAI(eXplainable Artificial Intelligence) 알고리즘 기반 사출 공정 수율 개선 방법론)

  • Ji-Soo Hong;Yong-Min Hong;Seung-Yong Oh;Tae-Ho Kang;Hyeon-Jeong Lee;Sung-Woo Kang
    • Journal of Korean Society for Quality Management / v.51 no.1 / pp.55-65 / 2023
  • Purpose: The purpose of this study is to propose an optimization procedure that uses process data to improve product yield. Research on low-cost, high-efficiency production in manufacturing using machine learning or deep learning has continued in recent years. This study therefore uses an eXplainable Artificial Intelligence (XAI) method to derive the major variables that affect product defects in the manufacturing process, and then presents the optimal range of those variables to propose a methodology for improving product yield. Methods: This study uses the injection molding machine AI dataset released on the Korea AI Manufacturing Platform (KAMP) organized by KAIST. Using the XAI-based SHAP method, the major variables affecting product defects are extracted from each set of process data. XGBoost and LightGBM were used as learning algorithms, and 5-6 variables were extracted as the main process variables for the injection process. Subsequently, the optimal control range of each process variable is presented using the ICE method. Finally, the proposed product yield improvement methodology is validated using test data. Results: In the injection process data, the improved defect rate was 0.21% with XGBoost and 0.29% with LightGBM, improvements of 0.79%p and 0.71%p, respectively, over the existing defect rate of 1.00%. Conclusion: This study is a case study. A research methodology was proposed for the injection process, and verification confirmed that the product yield was improved.
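The percentage-point figures in the Results can be checked with simple arithmetic. The sketch below reproduces the reported 0.79%p and 0.71%p differences from the stated defect rates; the relative reduction it also returns is derived here for illustration and is not a figure reported in the abstract.

```python
def defect_rate_improvement(baseline_pct, improved_pct):
    """Return (percentage-point drop, relative reduction in percent)."""
    point_drop = baseline_pct - improved_pct
    relative_reduction = point_drop / baseline_pct * 100
    return round(point_drop, 2), round(relative_reduction, 1)


# Defect rates stated in the abstract, in percent.
BASELINE = 1.00
xgboost_result = defect_rate_improvement(BASELINE, 0.21)
lightgbm_result = defect_rate_improvement(BASELINE, 0.29)
```

The distinction matters because a drop from 1.00% to 0.21% is 0.79 percentage points but a 79% relative reduction in defects.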

Research Trends and Datasets Review using Satellite Image (위성영상 이미지를 활용한 연구 동향 및 데이터셋 리뷰)

  • Kim, Se Hyoung;Chae, Jung Woo;Kang, Ju Young
    • Smart Media Journal / v.11 no.1 / pp.17-30 / 2022
  • Like other areas of computer vision research, research using satellite images has grown rapidly with the development of GPU-based computing capabilities and deep learning methodologies for image processing. As a result, satellite images are being used in various fields, and the number of studies on how to use them is increasing. This paper therefore introduces the fields in which satellite images are researched and used, together with the datasets available for such research. First, studies using satellite images were collected and classified according to research method. They were broadly classified into a Regression-based Approach and a Classification-based Approach, and papers using other methods were summarized. Next, the datasets used in studies using satellite images were summarized. This study provides information on these datasets and on how they are used in research. In addition, it introduces how to organize and utilize the domestic satellite image datasets recently opened by AI hub. Finally, we briefly examine the limitations of satellite image-related research and future trends.

Effective speech recognition system for patients with Parkinson's disease (파킨슨병 환자에 대한 효과적인 음성인식 시스템)

  • Huiyong, Bak;Ryul, Kim;Sangmin, Lee
    • The Journal of the Acoustical Society of Korea / v.41 no.6 / pp.655-661 / 2022
  • Since speech impairment is prevalent in patients with Parkinson's disease (PD), speech recognition systems suitable for these patients are needed. In this paper, we propose a speech recognition system that effectively recognizes the speech of patients with PD. The speech recognition system is first pre-trained with the Globalformer using speech data from healthy people, and then fine-tuned using a relatively small amount of speech data from patients with PD. For this analysis, we used the speech dataset of healthy people built by AI hub and that of patients with PD collected at Inha University Hospital. In the experiment, the proposed speech recognition system recognized the speech of patients with PD with a Character Error Rate (CER) of 22.15%, a better result than other methods.
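The Character Error Rate reported in this abstract is conventionally the character-level edit distance between the reference transcript and the recognizer output, divided by the reference length. The sketch below follows that standard definition; the abstract does not describe the authors' scoring script or text normalization, and the example strings are hypothetical.

```python
def edit_distance(reference, hypothesis):
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            current.append(min(
                previous[j] + 1,                           # deletion
                current[j - 1] + 1,                        # insertion
                previous[j - 1] + (ref_char != hyp_char),  # substitution
            ))
        previous = current
    return previous[-1]


def character_error_rate(reference, hypothesis):
    """CER in percent, relative to the length of the reference."""
    if not reference:
        raise ValueError("reference transcript must not be empty")
    return 100 * edit_distance(reference, hypothesis) / len(reference)


# Hypothetical transcript pair (illustration only).
cer = character_error_rate("abcd", "abxd")
```

Because insertions count as errors, CER can exceed 100% when the recognizer outputs far more characters than the reference contains.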