• Title/Summary/Keyword: Generative Models

The Role of GPT Models in Sentiment Analysis Tasks

  • Mashael M. Alsulami
    • International Journal of Computer Science & Network Security
    • /
    • v.24 no.9
    • /
    • pp.12-20
    • /
    • 2024
  • Sentiment analysis has become a pivotal component in understanding public opinion, market trends, and user experiences across various domains. The advent of GPT (Generative Pre-trained Transformer) models has revolutionized the landscape of natural language processing, introducing a new dimension to sentiment analysis. This comprehensive roadmap delves into the transformative impact of GPT models on sentiment analysis tasks, contrasting them with conventional methodologies. With an increasing need for nuanced and context-aware sentiment analysis, this study explores how GPT models, known for their ability to understand and generate human-like text, outperform traditional methods in capturing subtleties of sentiment expression. We scrutinize various case studies and benchmarks, highlighting GPT models' prowess in handling context, sarcasm, and idiomatic expressions. This roadmap not only underscores the superior performance of GPT models but also discusses challenges and future directions in this dynamic field, offering valuable insights for researchers, practitioners, and AI enthusiasts. The in-depth analysis provided in this paper serves as a testament to the transformational potential of GPT models in the realm of sentiment analysis.
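
A minimal, hypothetical sketch of the prompt-based setup such studies contrast with lexicon or feature pipelines: a generative model is asked to label sentiment directly, so handling of context, sarcasm, and idiom is delegated to the model. The `generate` function below is an assumed placeholder, not any specific GPT API.

```python
# Minimal sketch (not the paper's code): prompt-based sentiment classification
# with a generative model. `generate` is a hypothetical stand-in for a GPT-style
# completion call (local model or API).

def generate(prompt: str) -> str:
    """Placeholder for a real completion call; returns a canned answer so the sketch runs."""
    return "negative"  # replace with an actual model or API client

def classify_sentiment(text: str) -> str:
    prompt = (
        "Classify the sentiment of the following review as positive, negative, or neutral.\n"
        f"Review: {text}\n"
        "Sentiment:"
    )
    answer = generate(prompt).strip().lower()
    for label in ("positive", "negative", "neutral"):
        if label in answer:
            return label
    return "neutral"  # fall back when the completion cannot be parsed

print(classify_sentiment("Great, another update that breaks everything."))
```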

Analysis of generative AI's mathematical problem-solving performance: Focusing on ChatGPT 4, Claude 3 Opus, and Gemini Advanced (생성형 인공지능의 수학 문제 풀이에 대한 성능 분석: ChatGPT 4, Claude 3 Opus, Gemini Advanced를 중심으로)

  • Sejun Oh;Jungeun Yoon;Yoojin Chung;Yoonjoo Cho;Hyosup Shim;Oh Nam Kwon
    • The Mathematical Education
    • /
    • v.63 no.3
    • /
    • pp.549-571
    • /
    • 2024
  • As digital and AI-based teaching and learning are increasingly emphasized, discussion of the educational use of generative AI has become more active. This study analyzed the mathematical performance of ChatGPT 4, Claude 3 Opus, and Gemini Advanced in solving examples and problems from five first-year high school mathematics textbooks. Examining the overall correct-answer rate and the characteristics of each skill across a total of 1,317 questions, ChatGPT 4 achieved the highest overall correct-answer rate at 0.85, followed by Claude 3 Opus at 0.67 and Gemini Advanced at 0.42. By skill, all three models showed high correct-answer rates on 'Find functions' and 'Prove' but relatively low rates on 'Explain' and 'Draw graphs'. In particular, on 'Count', ChatGPT 4 and Claude 3 Opus reached a correct-answer rate of 1.00, while Gemini Advanced was low at 0.56. All models also had difficulty explaining with Venn diagrams and creating images. Based on these results, teachers should identify the strengths and limitations of each AI model and use them appropriately in class. This study is significant in that it demonstrates the potential for classroom use by analyzing the mathematical performance of generative AI, and it offers important implications for redefining the role of teachers in mathematics education in the era of artificial intelligence. Further research is needed to develop a cooperative educational model between generative AI and teachers and to study individualized learning plans that use AI.
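
A per-skill breakdown like the one reported above can be reproduced with a small aggregation script; the sketch below assumes a particular data layout (fields `model`, `skill`, `correct`) and is not the study's grading pipeline.

```python
# Minimal sketch: aggregating correct-answer rates per model and skill from
# already-graded items. The record fields are assumed, not the study's format.
from collections import defaultdict

graded = [
    {"model": "ChatGPT 4", "skill": "Count", "correct": True},
    {"model": "Gemini Advanced", "skill": "Count", "correct": False},
    {"model": "Claude 3 Opus", "skill": "Prove", "correct": True},
    # ... one record per model x question (1,317 questions per model in the study)
]

totals = defaultdict(lambda: [0, 0])  # (model, skill) -> [correct, attempted]
for item in graded:
    key = (item["model"], item["skill"])
    totals[key][0] += int(item["correct"])
    totals[key][1] += 1

for (model, skill), (correct, attempted) in sorted(totals.items()):
    print(f"{model:16s} {skill:8s} correct-answer rate = {correct / attempted:.2f}")
```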

Enhanced ACGAN based on Progressive Step Training and Weight Transfer

  • Jinmo Byeon;Inshil Doh;Dana Yang
    • Journal of the Korea Society of Computer and Information
    • /
    • v.29 no.3
    • /
    • pp.11-20
    • /
    • 2024
  • Among generative models in Artificial Intelligence (AI), the Generative Adversarial Network (GAN) in particular has been successful in applications such as image processing, density estimation, and style transfer. Although GAN variants including the Conditional GAN (CGAN), CycleGAN, and BigGAN have extended and improved the original model, researchers still face challenges in real-world applications in domains such as disaster simulation, healthcare, and urban planning because of data scarcity and unstable learning that causes image distortion. This paper proposes a new progressive learning methodology called Progressive Step Training (PST), built on the Auxiliary Classifier GAN (ACGAN), which discriminates class labels, and leveraging the progressive learning approach of the Progressive Growing of GAN (PGGAN). Compared with conventional methods, the PST model achieves 70.82% faster stabilization, a 51.3% lower standard deviation, stable convergence of loss values in the later high-resolution stages, and a 94.6% faster loss reduction.
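
The weight-transfer idea behind stage-wise growth can be pictured with the sketch below; it is an assumed PyTorch illustration in the spirit of PGGAN-style progressive training, not the paper's PST implementation.

```python
# Minimal sketch (an assumption, not the paper's PST code): grow a generator
# stage by stage and transfer weights from the previous, already-trained stage.
import torch
import torch.nn as nn

class StagedGenerator(nn.Module):
    """Toy generator whose depth (output resolution) grows with the training stage."""
    def __init__(self, n_stages: int, latent_dim: int = 64):
        super().__init__()
        self.stem = nn.Linear(latent_dim, 128)  # reshaped to 8 channels at 4x4
        self.stages = nn.ModuleList(
            nn.Sequential(nn.Upsample(scale_factor=2),
                          nn.Conv2d(8, 8, 3, padding=1),
                          nn.ReLU())
            for _ in range(n_stages)  # 4x4 -> 8x8 -> 16x16 -> ...
        )
        self.to_rgb = nn.Conv2d(8, 3, 1)

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        x = self.stem(z).view(-1, 8, 4, 4)
        for stage in self.stages:
            x = stage(x)
        return self.to_rgb(x)

def transfer_weights(prev: nn.Module, new: nn.Module) -> None:
    """Copy every parameter whose name and shape match; new layers keep their init."""
    new_state = new.state_dict()
    kept = {k: v for k, v in prev.state_dict().items()
            if k in new_state and v.shape == new_state[k].shape}
    new_state.update(kept)
    new.load_state_dict(new_state)

g_small = StagedGenerator(n_stages=1)     # train this at 8x8 first
# ... low-resolution training of g_small goes here ...
g_large = StagedGenerator(n_stages=2)     # then grow to 16x16
transfer_weights(g_small, g_large)        # reuse the trained low-resolution stage
print(g_large(torch.randn(4, 64)).shape)  # torch.Size([4, 3, 16, 16])
```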

Bayesian Model for Probabilistic Unsupervised Learning (확률적 자율 학습을 위한 베이지안 모델)

  • 최준혁;김중배;김대수;임기욱
    • Journal of the Korean Institute of Intelligent Systems
    • /
    • v.11 no.9
    • /
    • pp.849-854
    • /
    • 2001
  • The GTM (Generative Topographic Mapping) model is a probabilistic counterpart of the SOM (Self-Organizing Map) proposed by T. Kohonen. The GTM models data through a probability distribution over latent (hidden) variables, a characteristic the SOM does not have, which makes it possible to analyze data more accurately and thereby overcome the limitations of the SOM. In this study we propose BGTM (Bayesian GTM), which combines Bayesian learning with the GTM model and attains a small misclassification ratio. By combining the fast computation and probabilistic modeling of data offered by the GTM with sound inference based on the Bayesian model, BGTM yields improved results compared with existing models.
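
As a concrete illustration of the latent-variable formulation that distinguishes the GTM from the SOM, the sketch below (an assumption, not the paper's BGTM code) computes GTM responsibilities, treating the mapped latent grid nodes as components of an isotropic Gaussian mixture with equal priors.

```python
# Minimal sketch: the GTM responsibility (E) step. X holds data points, Y holds
# the latent grid nodes mapped into data space, beta is the inverse noise variance.
import numpy as np

def responsibilities(X: np.ndarray, Y: np.ndarray, beta: float) -> np.ndarray:
    """Return R with R[n, k] = p(latent node k | data point x_n)."""
    sq_dist = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=2)  # (N, K)
    log_p = -0.5 * beta * sq_dist              # equal priors; constants cancel below
    log_p -= log_p.max(axis=1, keepdims=True)  # numerical stability
    p = np.exp(log_p)
    return p / p.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))   # 100 data points in 2-D
Y = rng.normal(size=(9, 2))     # a 3x3 latent grid mapped into data space
R = responsibilities(X, Y, beta=1.0)
print(R.shape, R.sum(axis=1)[:3])  # (100, 9); each row sums to 1
```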

Theoretical Analyses of Science Teaching Models (과학수업모형들의 특성에 관한 이론적 분석)

  • Kim, Han-Ho
    • Journal of The Korean Association For Science Education
    • /
    • v.15 no.2
    • /
    • pp.201-212
    • /
    • 1995
  • The purpose of this study was to analyze science teaching models: the Cognitive Conflict Teaching Model (CCTM), Generative Learning Model (GLM), Learning Cycle Model (LCM), Hypothesis-Testing Model (HTM), and Discovery Teaching Model (DTM). Through a literature review, the models were analyzed and compared in several aspects: philosophical and psychological bases, primary goals and assumptions, syntax, implementation environments, and probable effects. The major findings were as follows. 1. The science teaching models have diverse features; comparing them revealed both differences and similarities, which varied in degree and emphasis. 2. The CCTM and GLM resemble each other in philosophical and psychological bases, primary goals and main assumptions, implementation environments, and probable effects. 3. The LCM and HTM show similarities in philosophical bases, syntax, and implementation environments but differ in the other aspects. These results indicate that the diverse features of science teaching models should be considered when choosing a model for science teaching.

A study on evaluation method of NIDS datasets in closed military network (군 폐쇄망 환경에서의 모의 네트워크 데이터 셋 평가 방법 연구)

  • Park, Yong-bin;Shin, Sung-uk;Lee, In-sup
    • Journal of Internet Computing and Services
    • /
    • v.21 no.2
    • /
    • pp.121-130
    • /
    • 2020
  • This paper proposes evaluating military closed-network data as images generated by a Generative Adversarial Network (GAN), applying image evaluation methods such as the InceptionV3-based Inception Score (IS) and Frechet Inception Distance (FID). We employed well-known image classification models in place of InceptionV3, added layers to those models, and converted the network data into images in diverse ways. Experimental results show that the Densenet121 model with one added Dense layer achieves the best performance on data converted with the arctangent algorithm at an 8 × 8 image size.
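
The data-to-image conversion step can be pictured with the sketch below; it is an assumed NumPy variant that squashes one flow's numeric features with an arctangent before packing them into an 8 × 8 grayscale image, not the authors' exact preprocessing.

```python
# Minimal sketch: map a numeric network-flow feature vector into an 8x8 image,
# squashing unbounded features with arctangent before rescaling to [0, 255].
import numpy as np

def flow_to_image(features: np.ndarray, size: int = 8) -> np.ndarray:
    squashed = np.arctan(features) / (np.pi / 2)      # (-inf, inf) -> (-1, 1)
    pixels = ((squashed + 1.0) / 2.0 * 255).astype(np.uint8)
    padded = np.zeros(size * size, dtype=np.uint8)    # zero-pad to fill the grid
    n = min(len(pixels), size * size)
    padded[:n] = pixels[:n]
    return padded.reshape(size, size)

flow = np.array([3.2, 1500.0, -0.7, 42.0, 0.01, 6.0, 80.0, 443.0])
print(flow_to_image(flow).shape)  # (8, 8), ready for a Densenet121-style scorer
```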

Semi-Supervised Spatial Attention Method for Facial Attribute Editing

  • Yang, Hyeon Seok;Han, Jeong Hoon;Moon, Young Shik
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.15 no.10
    • /
    • pp.3685-3707
    • /
    • 2021
  • In recent years, facial attribute editing based on generative adversarial networks and encoder-decoder models has been used successfully to change face images across various attributes. However, existing models have the limitation that they may alter unintended regions while editing an attribute or may generate unnatural results. In this paper, we propose a model that improves the learning of the attention mask by adding a spatial attention mechanism to the unified selective transfer network (STGAN) using semi-supervised learning. The proposed model can edit multiple attributes while preserving details independent of the attributes being edited. This study makes two main contributions. First, we propose an encoder-decoder model structure that learns and edits multiple facial attributes while suppressing distortion with an attention mask. Second, we define guide masks and propose a method and an objective function that use them for multiple facial attribute editing through semi-supervised learning. Qualitative and quantitative evaluations show that the proposed method preserves image details by suppressing unintended changes better than existing methods.
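
The role of the spatial attention mask can be summarized with a small blending sketch (an assumed PyTorch illustration, not the proposed network): where the mask is near zero, the original pixels pass through unchanged, so attribute-independent details are preserved.

```python
# Minimal sketch: blend the decoder's edited output with the input image using
# a spatial attention mask; values near 1 mean "edit here", near 0 mean "keep".
import torch

def blend_with_mask(original: torch.Tensor, edited: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
    """original, edited: (B, 3, H, W); mask: (B, 1, H, W) with values in [0, 1]."""
    return mask * edited + (1.0 - mask) * original

B, H, W = 2, 64, 64
original = torch.rand(B, 3, H, W)
edited = torch.rand(B, 3, H, W)
mask = torch.rand(B, 1, H, W)  # in the paper this mask is learned (guided, semi-supervised)
print(blend_with_mask(original, edited, mask).shape)  # torch.Size([2, 3, 64, 64])
```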

3D Object Generation and Renderer System based on VAE ResNet-GAN

  • Min-Su Yu;Tae-Won Jung;GyoungHyun Kim;Soonchul Kwon;Kye-Dong Jung
    • International journal of advanced smart convergence
    • /
    • v.12 no.4
    • /
    • pp.142-146
    • /
    • 2023
  • We present a method for generating 3D structures and rendering objects by combining a VAE (Variational Autoencoder) and a GAN (Generative Adversarial Network). The approach focuses on generating and rendering 3D models of improved quality by using residual learning in the encoder. We stack the encoder layers deeply to reflect image features accurately and apply residual blocks to resolve the problems that deep layers introduce, thereby improving encoder performance. This mitigates the gradient vanishing and exploding problems that arise when constructing a deep neural network and produces 3D models of improved quality. The residual function is applied during learning so that the model captures more detailed information. The generated model has more detailed voxels for a more accurate representation, is rendered with added materials and lighting, and is finally converted into a mesh model. The resulting 3D models have excellent visual quality and accuracy, making them useful in fields such as virtual reality, game development, and the metaverse.
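
A minimal sketch of the residual-block idea applied to a deep encoder is shown below (an assumed PyTorch illustration, not the paper's exact architecture): the identity skip connection keeps gradients flowing through the stacked layers.

```python
# Minimal sketch: a residual block used to deepen an encoder while avoiding
# vanishing/exploding gradients.
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return torch.relu(x + self.body(x))  # identity skip connection

encoder = nn.Sequential(
    nn.Conv2d(3, 32, 3, stride=2, padding=1),   # downsample 64x64 -> 32x32
    ResidualBlock(32),
    ResidualBlock(32),
    nn.Conv2d(32, 64, 3, stride=2, padding=1),  # downsample 32x32 -> 16x16
    ResidualBlock(64),
)
print(encoder(torch.rand(1, 3, 64, 64)).shape)  # torch.Size([1, 64, 16, 16])
```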

Towards a small language model powered chain-of-reasoning for open-domain question answering

  • Jihyeon Roh;Minho Kim;Kyoungman Bae
    • ETRI Journal
    • /
    • v.46 no.1
    • /
    • pp.11-21
    • /
    • 2024
  • We focus on open-domain question-answering tasks that involve a chain-of-reasoning, which are primarily implemented using large language models. With an emphasis on cost-effectiveness, we designed EffiChainQA, an architecture centered on the use of small language models. We employed a retrieval-based language model to address the limitations of large language models, such as the hallucination issue and the lack of updated knowledge. To enhance reasoning capabilities, we introduced a question decomposer that leverages a generative language model and serves as a key component in the chain-of-reasoning process. To generate training data for our question decomposer, we leveraged ChatGPT, which is known for its data augmentation ability. Comprehensive experiments were conducted using the HotpotQA dataset. Our method outperformed several established approaches, including the Chain-of-Thoughts approach, which is based on large language models. Moreover, our results are on par with those of state-of-the-art Retrieve-then-Read methods that utilize large language models.
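
The decompose-retrieve-read shape of such a chain can be sketched as follows; `decompose`, `retrieve`, and `read` are hypothetical placeholders standing in for the small generative question decomposer, the retrieval-based language model, and the reader described in the abstract, not the EffiChainQA implementation.

```python
# Minimal sketch of a chain-of-reasoning QA pipeline; all three components are
# hypothetical stubs that only show the control flow.

def decompose(question: str) -> list[str]:
    """Stub for the small generative question decomposer."""
    return [question]  # a real decomposer would emit single-hop sub-questions

def retrieve(sub_question: str) -> list[str]:
    """Stub for the retrieval step: return passages relevant to the sub-question."""
    return ["(retrieved passage)"]

def read(sub_question: str, passages: list[str]) -> str:
    """Stub for the reader: extract or generate an answer from the passages."""
    return "(intermediate answer)"

def answer(question: str) -> str:
    hops: list[str] = []
    for sub_q in decompose(question):
        passages = retrieve(sub_q)
        hops.append(read(sub_q, passages))
    return hops[-1]  # the final hop's answer resolves the original question

print(answer("Which country is the director of the film Parasite from?"))
```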