• Title/Summary/Keyword: 인공지능 모델링 (artificial intelligence modeling)

A School-tailored High School Integrated Science Q&A Chatbot with Sentence-BERT: Development and One-Year Usage Analysis (인공지능 문장 분류 모델 Sentence-BERT 기반 학교 맞춤형 고등학교 통합과학 질문-답변 챗봇 -개발 및 1년간 사용 분석-)

  • Gyeongmo Min;Junehee Yoo
    • Journal of The Korean Association For Science Education / v.44 no.3 / pp.231-248 / 2024
  • This study developed a chatbot for first-year high school students using open-source software and the Korean Sentence-BERT model for AI-powered document classification. The chatbot uses the Sentence-BERT model to find the six Q&A pairs most similar to a student's query and presents them in a carousel format. The initial dataset, built from online resources, was refined and expanded throughout the operational period based on student feedback and usability. By the end of the 2023 academic year, the chatbot had integrated a total of 30,819 datasets and recorded 3,457 student interactions. Analysis showed that students tended to use the chatbot when prompted by teachers during classes and, primarily, during self-study sessions after school, with an average of 2.1 to 2.2 inquiries per session, mostly via mobile phones. Text mining showed that student input covered not only science-related queries but also aspects of school life, such as assessment scope. Topic modeling using BERTopic, based on Sentence-BERT, categorized 88% of student questions into 35 topics, shedding light on common student interests. A year-end survey confirmed the efficacy of the carousel format and the chatbot's role in addressing curiosities beyond the integrated science learning objectives. This study underscores the importance of developing chatbots tailored for student use in public education and highlights their educational potential through long-term usage analysis.
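
The retrieval step described in this abstract (finding the six stored Q&A pairs most similar to a query) can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes cosine similarity as the similarity measure, and the tiny 3-dimensional vectors are hypothetical stand-ins for real Sentence-BERT embeddings. The function names `cosine_similarity` and `top_k_similar` are illustrative.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def top_k_similar(query_vec, qa_vecs, k=6):
    """Return indices of the k stored Q&A embeddings most similar to the query."""
    scored = [(cosine_similarity(query_vec, vec), idx)
              for idx, vec in enumerate(qa_vecs)]
    # Highest similarity first; ties broken by the lower index.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [idx for _, idx in scored[:k]]


# Hypothetical toy vectors standing in for Sentence-BERT embeddings.
qa_vecs = [
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
query_vec = [1.0, 0.05, 0.0]
print(top_k_similar(query_vec, qa_vecs, k=2))
```

In a real deployment the embeddings would come from a sentence-embedding model, and the six returned indices would select the Q&A pairs shown in the carousel.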

Topic Modeling on Research Trends of Industry 4.0 Using Text Mining (텍스트 마이닝을 이용한 4차 산업 연구 동향 토픽 모델링)

  • Cho, Kyoung Won;Woo, Young Woon
    • Journal of the Korea Institute of Information and Communication Engineering / v.23 no.7 / pp.764-770 / 2019
  • In this research, text mining techniques were used to analyze papers related to the "4th Industry". A total of 685 papers, published from 2016 to 2019, were collected by searching the Korea Citation Index (KCI) with the keyword "4th industry". A Python-based web scraping program was used to collect the papers, and topic modeling based on the LDA algorithm, implemented in R, was used for data analysis. Perplexity analysis of the collected papers determined nine topics to be optimal, and nine representative topics were extracted using the Gibbs sampling method. The results confirmed that artificial intelligence, big data, the Internet of Things (IoT), digital, and network technologies, among others, have emerged as the major technologies, and that research has been conducted on the changes these technologies bring to fields related to the 4th industry, such as industry, government, education, and jobs.
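
The abstract mentions fitting LDA with Gibbs sampling. As a rough illustration of what that involves, here is a minimal collapsed Gibbs sampler in Python. It is a sketch under stated assumptions, not the authors' R implementation: the hyperparameters `alpha` and `beta`, the iteration count, and the toy corpus of word ids are all hypothetical.

```python
import random


def lda_gibbs(docs, num_topics, vocab_size, alpha=0.1, beta=0.01,
              iters=200, seed=0):
    """Collapsed Gibbs sampling for LDA.

    docs is a list of documents, each a list of integer word ids.
    Returns (doc_topic, topic_word) count matrices.
    """
    rng = random.Random(seed)
    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [[0] * vocab_size for _ in range(num_topics)]
    topic_total = [0] * num_topics
    assignments = []

    # Random initial topic assignment for every token.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            z = rng.randrange(num_topics)
            z_doc.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assignments.append(z_doc)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                z = assignments[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                # Unnormalized conditional probability of each topic.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                # Resample the topic and restore the counts.
                z = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1
    return doc_topic, topic_word


# Hypothetical toy corpus: each document is a list of word ids.
docs = [[0, 1, 0, 1], [2, 3, 2, 3], [0, 1, 1, 0]]
doc_topic, topic_word = lda_gibbs(docs, num_topics=2, vocab_size=4)
print(doc_topic)
```

Choosing the number of topics by perplexity, as the abstract describes, would mean repeating such a fit for several values of `num_topics` and comparing held-out perplexity.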

Data Analysis of Dropouts of University Students Using Topic Modeling (토픽모델링을 활용한 대학생의 중도탈락 데이터 분석)

  • Jeong, Do-Heon;Park, Ju-Yeon
    • Journal of the Korea Institute of Information and Communication Engineering / v.25 no.1 / pp.88-95 / 2021
  • This study aims to provide implications for establishing student support policies by empirically analyzing data on university student dropouts. To this end, data on students enrolled at D University after 2017 were sampled and collected. The collected data were analyzed using topic modeling with Latent Dirichlet Allocation (LDA), a probabilistic model based on text mining. The study identified topics characteristic of dropout students, and classification performance between groups based on these topics was also excellent. Based on these results, a specific educational support system was proposed to prevent university students from dropping out. This study is meaningful in that it demonstrates the use of text mining techniques in the education field and suggests an education policy based on data analysis.

Self-Recognition Algorithm of Artificial Immune System (인공면역계의 자기-인식 알고리즘)

  • 심귀보;선상준
    • Journal of the Korean Institute of Intelligent Systems / v.11 no.9 / pp.801-806 / 2001
  • As more and more people begin using computers, damage from computer viruses and hacking by malicious users is rapidly increasing. A computer virus is a computer program that, like a biological virus, is capable of self-reproduction and destruction. Hacking is the act of intruding into a person's computer from the outside to steal or delete data. To block hacking, which intrudes into a person's computer, and computer viruses, which destroy data, research on system intrusion detection and virus detection based on the biological immune system is in progress. In this paper, we model the positive selection and negative selection of the self-recognition process, an ability of the cytotoxic T cell that plays an important part in the biological immune system. We then implement a self-nonself discrimination algorithm on a computer, which is an important component in detecting data infected by a computer virus and data modified by an outside intrusion. The resulting self-recognition process distinguishes self files from changed files. To demonstrate the efficacy of the self-recognition algorithm, we run simulations with cell changes and string changes in the self file.
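
To illustrate the negative selection idea mentioned in this abstract, the sketch below generates random detectors, discards any that match the protected "self" data, and then uses the survivors to flag changed data. It is an assumption-laden illustration, not the paper's algorithm: the r-contiguous matching rule, the binary string encoding, and all names and parameter values are hypothetical choices, and positive selection is not modeled.

```python
import random


def matches(detector, segment, r):
    """True if detector and segment agree on at least r contiguous positions."""
    run = 0
    for a, b in zip(detector, segment):
        run = run + 1 if a == b else 0
        if run >= r:
            return True
    return False


def generate_detectors(self_segments, length, r, count, seed=0,
                       max_tries=100000):
    """Negative selection: keep only random candidates that match no self segment."""
    rng = random.Random(seed)
    detectors = []
    tries = 0
    while len(detectors) < count and tries < max_tries:
        tries += 1
        candidate = "".join(rng.choice("01") for _ in range(length))
        if not any(matches(candidate, s, r) for s in self_segments):
            detectors.append(candidate)
    return detectors


def is_nonself(segment, detectors, r):
    """A segment is flagged as nonself if any detector matches it."""
    return any(matches(d, segment, r) for d in detectors)


# Hypothetical self data: fixed-length binary segments of a protected file.
self_segments = ["00000000", "00001111"]
detectors = generate_detectors(self_segments, length=8, r=4, count=5)
print(detectors)
print([is_nonself(s, detectors, r=4) for s in self_segments])
```

By construction, no detector matches a self segment, so unchanged self data is never flagged. Whether a particular modification is caught depends on how well the detector set covers the nonself space.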

Traffic Speed Prediction Based on Graph Neural Networks for Intelligent Transportation System (지능형 교통 시스템을 위한 Graph Neural Networks 기반 교통 속도 예측)

  • Kim, Sunghoon;Park, Jonghyuk;Choi, Yerim
    • The Journal of The Korea Institute of Intelligent Transport Systems / v.20 no.1 / pp.70-85 / 2021
  • Deep learning methodology, which has been actively studied in recent years, has improved the performance of artificial intelligence, and systems utilizing deep learning have accordingly been proposed in various industries. In traffic systems, spatio-temporal graph modeling using GNNs has been found effective in predicting traffic speed, but it has the disadvantage that the model is trained inefficiently due to a memory bottleneck. In this study, therefore, the road network is clustered with a graph clustering algorithm to reduce the memory bottleneck while also achieving superior performance. To verify the proposed method, the similarity of road speed distributions was measured using Jensen-Shannon divergence, based on an analysis of Incheon UTIC data. The road network was then clustered by spectral clustering based on the measured similarity. The experiments showed that when the road network was divided into seven networks, the memory bottleneck was alleviated while the model recorded the best performance among the baselines, with an MAE of 5.52 km/h.
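
The Jensen-Shannon divergence used in this abstract to compare road speed distributions can be computed as in the sketch below. The divergence itself follows the standard definition (base-2 logarithm, so values fall in [0, 1]). Converting it to a similarity as `1 - JSD` and the toy speed histograms are assumptions made for illustration; the abstract does not state how the authors built their similarity matrix.

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q) in bits; terms with p_i = 0 are skipped."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence between two discrete distributions."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


def similarity_matrix(distributions):
    """Pairwise similarity, assumed here to be 1 - JSD (a hypothetical choice)."""
    n = len(distributions)
    return [[1.0 - js_divergence(distributions[i], distributions[j])
             for j in range(n)] for i in range(n)]


# Hypothetical normalized speed histograms for three road segments.
road_speed_hist = [
    [0.1, 0.6, 0.3],
    [0.1, 0.5, 0.4],
    [0.7, 0.2, 0.1],
]
for row in similarity_matrix(road_speed_hist):
    print([round(x, 3) for x in row])
```

A matrix like this could then serve as the affinity input to a spectral clustering routine that partitions the road network.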

Effect of Anthropomorphism Level of Digital Human Banker Speech on User Experience: Focusing on Social Presence, Affinity, Trust, Perceived Intelligence, and Usefulness (디지털 휴먼 은행원 발화의 의인화 수준이 사용자 경험에 미치는 영향: 사회적 실재감, 친밀감, 신뢰도, 인지된 지능, 유용성을 중심으로)

  • Choi, Bomi;Jang, Seojin;Kang, Hyunmin
    • The Journal of the Convergence on Culture Technology / v.8 no.4 / pp.469-476 / 2022
  • As 3D modeling technology and conversational algorithms have developed, digital humans are being used in various fields, and virtual bankers have begun to appear in banks, including major banks such as Shin-Han Bank and Nong-Hyup Bank. However, most research on digital humans focuses on their appearance, and research on the robot persona that should be considered when anthropomorphizing a robot is insufficient. In this study, an experiment was conducted to examine user experience in the specific context of banking across three scenarios (student ID receipt, deposit and withdrawal account opening, leasehold loan consultation) that differed in the level of anthropomorphism of the speech strategy and the level of personal information use. The results showed an interaction effect of scenario and anthropomorphism level on social presence and usefulness. No interaction effect was found for intimacy, trustworthiness, or perceived intelligence, although a tendency could be observed.

Worker Collision Safety Management System using Object Detection (객체 탐지를 활용한 근로자 충돌 안전관리 시스템)

  • Lee, Taejun;Kim, Seongjae;Hwang, Chul-Hyun;Jung, Hoekyung
    • Journal of the Korea Institute of Information and Communication Engineering / v.26 no.9 / pp.1259-1265 / 2022
  • Recently, AI, big data, and IoT technologies have been used in various safety accident prevention solutions, such as fire detection and gas or dangerous substance detection. According to the occupational accident statistics published by the Ministry of Employment and Labor in 2021, the accident rate, the number of injured, and the number of deaths increased compared to 2020. In this paper, following the dataset construction guidelines provided by the National Information Society Agency (NIA), a dataset was collected directly from the field and used to train YOLOv4, and a collision risk detection system based on object detection is proposed. The accuracy in detecting dangerous-situation rule violations was 88% indoors and 92% outdoors. This system is expected to make it possible to analyze safety accidents at industrial sites in advance and to be applied to research on intelligent platforms.

Comparative Evaluation of Chest Image Pneumonia based on Learning Rate Application (학습률 적용에 따른 흉부영상 폐렴 유무 분류 비교평가)

  • Kim, Ji-Yul;Ye, Soo-Young
    • Journal of the Korean Society of Radiology / v.16 no.5 / pp.595-602 / 2022
  • This study sought to identify the most efficient learning rate for accurate and efficient automatic diagnosis of pneumonia in chest X-ray images using deep learning. The learning rate of the Inception V3 deep learning model was set to 0.1, 0.01, 0.001, and 0.0001, and deep learning modeling was performed three times for each setting. The average accuracy and loss of validation modeling and the metrics of test modeling were used as performance indicators, and performance was compared using the average of the three runs. In both the validation and test evaluations, modeling with a learning rate of 0.001 showed the highest accuracy and excellent performance. This paper therefore recommends a learning rate of 0.001 when classifying the presence or absence of pneumonia on chest X-ray images using a deep learning model. Deep learning modeling with the learning rate presented here was also judged capable of playing an auxiliary role in that classification. If research on deep-learning-based diagnosis and classification of pneumonia continues, the findings of this study can serve as basic data and are expected to help in selecting an efficient learning rate for classifying medical images with artificial intelligence.

Semantic Visualization of Dynamic Topic Modeling (다이내믹 토픽 모델링의 의미적 시각화 방법론)

  • Yeon, Jinwook;Boo, Hyunkyung;Kim, Namgyu
    • Journal of Intelligence and Information Systems / v.28 no.1 / pp.131-154 / 2022
  • Research on unstructured data analysis has recently been active, driven by developments in information and communication technology. In particular, topic modeling is a representative technique for discovering core topics in massive text data. Early topic modeling studies focused only on topic discovery. As the field matured, studies began to examine how topics change over time, and interest is growing in dynamic topic modeling, which handles changes in the keywords that constitute a topic. Dynamic topic modeling identifies major topics from the data of the initial period and tracks the change and flow of topics by using the topic information of the previous period to derive topics in subsequent periods. However, the results of dynamic topic modeling are very difficult to understand and interpret. Traditional dynamic topic modeling results simply reveal changes in keywords and their rankings, which is insufficient to show how the meaning of a topic has changed. In this study, we therefore propose a method for visualizing topics by period that reflects the meaning of the keywords in each topic, along with a method for intuitively interpreting changes in topics and the relationships among topics. Topics are visualized by period in three steps. In the first step, dynamic topic modeling is applied to the text data to derive the top keywords of each period and their weights. In the second step, we obtain vectors for the top keywords of each topic from a pre-trained word embedding model, perform dimension reduction on the extracted vectors, and form a semantic vector for each topic by computing the weighted sum of its keyword vectors, using each keyword's topic weight. In the third step, we visualize the semantic vector of each topic using matplotlib and analyze the relationships among the topics based on the visualization. To interpret topic change, we identify, from the dynamic topic modeling results, the top 5 rising keywords and the top 5 descending keywords for each period. Many existing topic visualization studies visualize the keywords of each topic, whereas our approach attempts to visualize each topic itself. To evaluate the practical applicability of the proposed methodology, we performed an experiment on 1,847 abstracts of artificial intelligence-related papers, divided into three periods (2016-2017, 2018-2019, 2020-2021). We selected seven topics based on the coherence score and used a Word2vec word embedding model pre-trained on Wikipedia. Using the proposed methodology, we generated a semantic vector for each topic and, by reflecting the meaning of the keywords, visualized and interpreted the topics by period. These experiments confirmed that the rise and fall of a keyword's topic weight is useful for interpreting the semantic change of the corresponding topic and for grasping the relationships among topics. In short, to overcome the limitations of dynamic topic modeling results, this study used word embedding and dimension reduction techniques to visualize topics by period. The results are meaningful in that they broaden the scope of topic understanding through the visualization of dynamic topic modeling results. The study also makes an academic contribution by laying the foundation for follow-up studies that use various word embedding and dimensionality reduction techniques to improve the performance of the proposed methodology.
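
The second step of this abstract (forming a topic's semantic vector as a weighted sum of its keyword vectors) can be sketched as follows. This is an illustration only: the keywords, topic weights, and 2-D vectors are hypothetical, and the vectors are assumed to have already been dimension-reduced from a pre-trained embedding model.

```python
def topic_semantic_vector(keyword_weights, keyword_vectors):
    """Weighted sum of keyword vectors, using each keyword's topic weight."""
    dim = len(next(iter(keyword_vectors.values())))
    semantic = [0.0] * dim
    for word, weight in keyword_weights.items():
        for i, component in enumerate(keyword_vectors[word]):
            semantic[i] += weight * component
    return semantic


# Hypothetical top keywords of one topic in one period, with topic weights.
keyword_weights = {"network": 0.5, "learning": 0.25, "image": 0.25}

# Hypothetical 2-D vectors, assumed to be already dimension-reduced.
keyword_vectors = {
    "network": [2.0, 0.0],
    "learning": [0.0, 4.0],
    "image": [4.0, 4.0],
}

print(topic_semantic_vector(keyword_weights, keyword_vectors))
```

Computing one such vector per topic and period yields points that can be drawn on a 2-D scatter plot, so a topic's movement across periods reflects changes in the meaning of its keywords.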

Boosting the Performance of the Predictive Model on the Imbalanced Dataset Using SVM Based Bagging and Out-of-Distribution Detection (SVM 기반 Bagging과 OoD 탐색을 활용한 제조공정의 불균형 Dataset에 대한 예측모델의 성능향상)

  • Kim, Jong Hoon;Oh, Hayoung
    • KIPS Transactions on Software and Data Engineering / v.11 no.11 / pp.455-464 / 2022
  • Datasets from a manufacturing process have two distinctive characteristics: severe class imbalance and a large number of Out-of-Distribution samples. Strategies such as oversampling the minority class and down-sampling the majority class are well known for handling class imbalance, and SMOTE has recently been adopted to address the issue as well. Out-of-Distribution samples, however, have been studied only with neural networks. Out-of-Distribution detection has rarely been applied to predictive models that use conventional machine learning algorithms such as SVM, Random Forest, and KNN. Conventional machine learning algorithms are known to perform much better than neural networks in prediction, because neural networks are vulnerable to over-fitting and require much larger datasets than conventional machine learning algorithms do. We therefore propose a new approach that applies Out-of-Distribution detection based on the SVM algorithm. In addition, a bagging technique is adopted to improve the precision of the model.