• Title/Summary/Keyword: chain-of-reasoning

Towards a small language model powered chain-of-reasoning for open-domain question answering

  • Jihyeon Roh;Minho Kim;Kyoungman Bae
    • ETRI Journal / v.46 no.1 / pp.11-21 / 2024
  • We focus on open-domain question-answering tasks that involve a chain-of-reasoning, which are primarily implemented using large language models. With an emphasis on cost-effectiveness, we designed EffiChainQA, an architecture centered on the use of small language models. We employed a retrieval-based language model to address the limitations of large language models, such as the hallucination issue and the lack of updated knowledge. To enhance reasoning capabilities, we introduced a question decomposer that leverages a generative language model and serves as a key component in the chain-of-reasoning process. To generate training data for our question decomposer, we leveraged ChatGPT, which is known for its data augmentation ability. Comprehensive experiments were conducted using the HotpotQA dataset. Our method outperformed several established approaches, including the Chain-of-Thoughts approach, which is based on large language models. Moreover, our results are on par with those of state-of-the-art Retrieve-then-Read methods that utilize large language models.

Research Trends in Large Language Models and Mathematical Reasoning (초거대 언어모델과 수학추론 연구 동향)

  • O.W. Kwon;J.H. Shin;Y.A. Seo;S.J. Lim;J. Heo;K.Y. Lee
    • Electronics and Telecommunications Trends / v.38 no.6 / pp.1-11 / 2023
  • Large language models seem promising for handling reasoning problems, but their underlying solving mechanisms remain unclear. Large language models will establish a new paradigm in artificial intelligence and in society as a whole. However, a major challenge of large language models is the massive resources required for training and operation. To address this issue, researchers are actively exploring compact large language models that retain the capabilities of large language models while notably reducing the model size. These research efforts are mainly focused on improving pretraining, instruction tuning, and alignment. Meanwhile, chain-of-thought prompting is a technique aimed at enhancing the reasoning ability of large language models. Given a problem, the model provides an answer through a series of intermediate reasoning steps. By guiding the model through a multistep problem-solving process, chain-of-thought prompting may improve the model's reasoning skills. Mathematical reasoning, which is a fundamental aspect of human intelligence, has played a crucial role in advancing large language models toward human-level performance. As a result, mathematical reasoning is being widely explored in the context of large language models. This type of research extends to various domains such as geometry problem solving, tabular mathematical reasoning, visual question answering, and other areas.
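
As a rough illustration of the chain-of-thought prompting described in this abstract, the sketch below assembles a few-shot prompt in which each worked example spells out its intermediate reasoning steps before the answer. The function name `build_cot_prompt`, the dictionary layout, and the sample questions are assumptions made for illustration; they are not taken from the paper, and no model is actually called.

```python
def build_cot_prompt(question, worked_examples):
    """Assemble a few-shot chain-of-thought prompt as plain text.

    Each worked example (hypothetical format) carries a question, a list of
    intermediate reasoning steps, and a final answer.
    """
    parts = []
    for ex in worked_examples:
        steps = " ".join(ex["steps"])
        parts.append(
            f"Q: {ex['question']}\nA: {steps} The answer is {ex['answer']}."
        )
    # The new question is left open so the model continues with its own steps.
    parts.append(f"Q: {question}\nA:")
    return "\n\n".join(parts)


# Illustrative data only.
examples = [
    {
        "question": "A box holds 3 red and 4 blue pens. How many pens are in 2 boxes?",
        "steps": [
            "One box holds 3 + 4 = 7 pens.",
            "Two boxes hold 2 * 7 = 14 pens.",
        ],
        "answer": "14",
    }
]

prompt = build_cot_prompt(
    "A shelf has 5 rows of 6 books. How many books are there?", examples
)
print(prompt)
```

The resulting string would be sent to a language model of the reader's choice; the worked steps in the example are what nudge the model toward multistep answers.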

Empowering Emotion Classification Performance Through Reasoning Dataset From Large-scale Language Model (초거대 언어 모델로부터의 추론 데이터셋을 활용한 감정 분류 성능 향상)

  • NunSol Park;MinHo Lee
    • Proceedings of the Korean Society of Computer Information Conference / 2023.07a / pp.59-61 / 2023
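  • This paper proposes a way to use a reasoning dataset generated by a large-scale language model to improve emotion classification performance. The approach is inspired by Google Research's 'Chain of Thought', and the reasoning data were generated with a large-scale language model such as ChatGPT. The goal of this paper is to improve performance on emotion classification by exploiting the ability of machine learning models to understand and apply reasoning data. We trained an emotion classification model on the reasoning dataset extracted from the large-scale language model (ChatGPT), and the model showed improved performance on the emotion classification task. This demonstrates that reasoning datasets can be of great value for emotion classification. The study also shows that the model using the reasoning data outperformed a model trained only on the datasets conventionally used for emotion classification. The study shows the feasibility of using a reasoning dataset generated from a large-scale language model at low cost and presents a new method for improving emotion classification performance. The proposed approach can be applied not only to emotion classification but also to other areas of natural language processing, and it suggests that more sophisticated natural language understanding and processing is possible.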

Enhancing Empathic Reasoning of Large Language Models Based on Psychotherapy Models for AI-assisted Social Support (인공지능 기반 사회적 지지를 위한 대형언어모형의 공감적 추론 향상: 심리치료 모형을 중심으로)

  • Yoon Kyung Lee;Inju Lee;Minjung Shin;Seoyeon Bae;Sowon Hahn
    • Korean Journal of Cognitive Science / v.35 no.1 / pp.23-48 / 2024
  • Building human-aligned artificial intelligence (AI) for social support remains challenging despite the advancement of Large Language Models (LLMs). We present a novel method, Chain of Empathy (CoE) prompting, that utilizes insights from psychotherapy to induce LLMs to reason about human emotional states. This method is inspired by various psychotherapy approaches, namely Cognitive-Behavioral Therapy (CBT), Dialectical Behavior Therapy (DBT), Person-Centered Therapy (PCT), and Reality Therapy (RT), each leading to different patterns of interpreting clients' mental states. LLMs without CoE reasoning generated predominantly exploratory responses. However, when LLMs used CoE reasoning, we found a more comprehensive range of empathic responses aligned with each psychotherapy model's different reasoning patterns. For empathic expression classification, the CBT-based CoE resulted in the most balanced classification of empathic expression labels and the text generation of empathic responses. However, regarding emotion reasoning, other approaches like DBT and PCT showed higher performance in emotion reaction classification. We further conducted qualitative analysis and alignment scoring of each prompt-generated output. The findings underscore the importance of understanding the emotional context and how it affects human-AI communication. Our research contributes to understanding how psychotherapy models can be incorporated into LLMs, facilitating the development of context-aware, safe, and empathically responsive AI.

DSS Architectures to Support Data Mining Activities for Supply Chain Management (데이터 마이닝을 활용한 공급사슬관리 의사결정지원시스템의 구조에 관한 연구)

  • Jhee, Won-Chul;Suh, Min-Soo
    • Asia pacific journal of information systems / v.8 no.3 / pp.51-73 / 1998
  • This paper evaluates the application potential of data mining in Supply Chain Management (SCM) and suggests architectures for Decision Support Systems (DSS) that support data mining activities. We first briefly introduce data mining and review the recent literature on SCM, and then evaluate data mining applications to SCM in three aspects: marketing, operations management, and information systems. By analyzing cases on pricing models in distribution channels, demand forecasting, and quality control, we show that artificial intelligence techniques such as artificial neural networks, case-based reasoning, and expert systems, combined with traditional analysis models, effectively mine useful knowledge from the large volume of SCM data. An agent-based information system is presented as an important architecture that enables the pursuit of global optimization of SCM through communication and information sharing among supply chain constituents without loss of their characteristics and independence. We expect that the suggested architectures of intelligent DSS will provide a basis for developing information systems for SCM that improve the quality of organizational decisions.

Bayesian Model for Cost Estimation of Construction Projects

  • Kim, Sang-Yon
    • Journal of the Korea Institute of Building Construction / v.11 no.1 / pp.91-99 / 2011
  • A Bayesian network is a form of probabilistic graphical model. It incorporates human reasoning to deal with sparse data availability and to determine the probabilities of uncertain cases. In this research, a Bayesian network is adopted to model the problem of construction project cost. General information, time, cost, and material, the four main factors dominating the characteristics of construction costs, are incorporated into the model. This research presents and verifies a model to illustrate the functionality and application of a decision support system for predicting costs. The Markov Chain Monte Carlo (MCMC) method is applied to estimate parameter distributions. Furthermore, it is shown that not all the parameters are normally distributed. In addition, cost estimates are produced based on the Gibbs output. This can enhance the decision-making process.

Analysis on value research trend and building the resource and competence based research framework for value creation (가치 연구의 동향 분석 및 가치창출에 대한 자원 및 역량기반 연구체계 구축)

  • Park, Changhyun;Lee, Heesang
    • Journal of the Korea Academia-Industrial cooperation Society / v.15 no.4 / pp.1923-1931 / 2014
  • Value creation is regarded as an important business strategy these days in both academia and industry. However, this phenomenon is not yet fully understood within a systematic framework. In this paper, we summarize trends in value research through exploratory research and inductive reasoning, drawing on both international and domestic journals, and we build a research framework for analyzing value creation between supplier and customer. Value research prior to 2004 is primarily divided into the value of goods or services and relationship value. After 2004, service-dominant (SD) logic was outlined. Other research trends examine relationship value in terms of relationship benefits and relationships in the network or supply chain. Four critical resource types (financial resource, knowledge resource, efficiency resource, and intellectual resource) and four competence types (relational capability, collaboration capability, innovation capability, and managing capability) are derived as principal factors for value creation through inductive reasoning based on a resource-based view (RBV) and a competence-based view (CBV). The research framework was built on these 4 resources and 4 competences.

Development and Application of the Scientific Inquiry Tasks for Small Group Argumentation (소집단의 논변활동을 위한 과학 탐구 과제의 개발과 적용)

  • Yun, Sun-Mi;Kim, Heui-Baik
    • Journal of The Korean Association For Science Education / v.31 no.5 / pp.694-708 / 2011
  • In this study, we developed tasks that include cognitive scaffolding to help students explain scientific phenomena using valid evidence in the science classroom, and we investigated how the tasks influence the development of small group scientific argumentation. Small groups that were heterogeneous in gender and achievement were organized in one classroom, and the tasks were applied to the class. Students were asked to write down their own ideas, share individual ideas, and then choose the most plausible opinion in a group. One group was chosen for investigating the effect of the tasks on the development of small group argumentation through analysis of the group's discourse transcripts over 10 lessons, semi-structured student interviews, field notes, and students' pre- and post-argument tests. Discrepant argument examples were included in the tasks so that students could refute an argument by presenting evidence. Moreover, comparing opinions within the group and persuading others were included in the tasks to prompt small group argumentation. As a result, students' post-argument test grades were higher than their pre-test grades, and their arguments involved evidence and reasoning. Higher-level arguments appeared, with a higher ratio of advanced utterances and longer reasoning chains, as the lessons went on. Students made elaborate claims involving valid evidence and reasoning through reflective and critical thinking while discussing the tasks. In addition, tasks that allowed various warrants based on the data led to students' spontaneous participation. Therefore, this study is significant for understanding the context in which small group argumentation develops, and it provides information about teaching and learning contexts that prompt students to construct arguments in middle school science inquiry lessons.

Preference-based Supply Chain Partner Selection Using Fuzzy Ontology (퍼지 온톨로지를 이용한 선호도 기반 공급사슬 파트너 선정)

  • Lee, Hae-Kyung;Ko, Chang-Seong;Kim, Tai-Oun
    • Journal of Intelligence and Information Systems / v.17 no.1 / pp.37-52 / 2011
  • Supply chain management is a strategic approach that enhances the value of the supply chain and adapts more promptly to a changing environment. For seamless partnership and value creation in supply chains, information and knowledge sharing and proper partner selection criteria must be applied. Thus, the partner selection criteria are critical to maintaining product quality and reliability. Each part of a product is supplied by an appropriate supply partner. The criteria for selecting partners include technological capability, quality, price, and consistency. In reality, the criteria for partner selection may change according to the characteristics of the components. When the part is a core component, quality takes priority over price. For a standardized component, a lower price has a higher priority. Sometimes an unexpected case occurs, such as an emergency order, in which the top preference may shift. Thus, SCM partner selection criteria must be determined dynamically according to the characteristics of the part and its context. The purpose of this research is to develop an OWL model for supply chain partnership that depends on the context and characteristics of the parts. The uncertainty of variables is handled through fuzzy logic. The parts, with their numerical preference values and context, are represented using OWL. Part preference is converted into a fuzzy membership function using fuzzy logic. For ontology reasoning, SWRL (Semantic Web Rule Language) is applied. To implement the proposed model, the starter motor of an automobile is adopted as a case. After the fuzzy ontology is constructed, the process of selecting a preference-based supply partner for each part is presented.
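
The conversion of a numerical part preference into fuzzy membership degrees, mentioned in this abstract, could look roughly like the sketch below. It uses triangular membership functions, which are one common choice and not necessarily the one used in the paper; the 0 to 10 score scale, the set boundaries in `PREFERENCE_SETS`, and all function names are assumptions for illustration, and the OWL and SWRL parts of the model are not covered.

```python
def triangular(x, a, b, c):
    """Triangular membership: 0 at or outside a and c, 1 at the peak b (a < b < c)."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


# Hypothetical fuzzy sets over a 0-10 preference score.
PREFERENCE_SETS = {
    "low": (0.0, 2.5, 5.0),
    "medium": (2.5, 5.0, 7.5),
    "high": (5.0, 7.5, 10.0),
}


def fuzzify(score):
    """Return the membership degree of a score in each fuzzy set."""
    return {
        label: triangular(score, *params)
        for label, params in PREFERENCE_SETS.items()
    }


def strongest_label(score):
    """Pick the fuzzy set in which the score has the highest membership."""
    memberships = fuzzify(score)
    return max(memberships, key=memberships.get)


memberships = fuzzify(6.0)
print(memberships)
print(strongest_label(6.0))
```

A score of 6.0 belongs partly to "medium" and partly to "high", which is the kind of graded preference that a crisp threshold would lose.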

Category-based Feature Inference in Causal Chain (인과적 사슬구조에서의 범주기반 속성추론)

  • Choi, InBeom;Li, Hyung-Chul O.;Kim, ShinWoo
    • Science of Emotion and Sensibility / v.24 no.1 / pp.59-72 / 2021
  • Concepts and categories offer the basis for inference pertaining to unobserved features. Prior research on category-based induction that used blank properties has suggested that similarity between categories and features explains feature inference (Rips, 1975; Osherson et al., 1990). However, later research showed that prior knowledge has a large influence on category-based inference, and cases were reported in which similarity effects disappeared completely. Thus, this study tested category-based feature inference when features are connected in a causal chain and proposed a feature inference model that predicts participants' inference ratings. Each participant learned a category with four features connected in a causal chain and then performed feature inference tasks for an unobserved feature in various exemplars of the category. The results revealed nonindependence: not only the features linked directly to the target feature but also those screened off by other feature nodes affected feature inference (a violation of the causal Markov condition). The feature inference model of causal model theory (Sloman, 2005) explained nonindependence by predicting the effects of both directly linked features and indirectly related features. Indirect features affected participants' inference equally regardless of causal distance, whereas the model predicted smaller effects for causally distant features.