• Title/Summary/Keyword: Data-Driven learning

Evaluating SR-Based Reinforcement Learning Algorithm Under the Highly Uncertain Decision Task (불확실성이 높은 의사결정 환경에서 SR 기반 강화학습 알고리즘의 성능 분석)

  • Kim, So Hyeon;Lee, Jee Hang
    • KIPS Transactions on Software and Data Engineering / v.11 no.8 / pp.331-338 / 2022
  • Successor representation (SR) is a model of human reinforcement learning (RL) that mimics the mechanism by which hippocampal cells construct cognitive maps. SR uses these learned features to respond adaptively to frequent reward changes. In this paper, we evaluated the performance of SR in a setting where changes in the latent variables of the environment trigger changes in the reward structure. As a benchmark, we adopted SR-Dyna, which integrates SR into the goal-driven Dyna RL algorithm, in the 2-stage Markov Decision Task (MDT), where we can intentionally manipulate two latent variables: state-transition uncertainty and goal condition. To investigate the characteristics of SR precisely, we conducted the experiments while controlling each latent variable that affects changes in the reward structure. The evaluation showed that SR-Dyna could learn to respond to reward changes driven by changes in the latent variables, but could not learn rapidly in that situation. This points to the need for more robust RL models that can rapidly learn to respond to frequent changes in environments where latent variables and reward structure change at the same time.
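To make the idea of responding to reward changes concrete, the following is a minimal sketch of a plain tabular SR learned with temporal-difference updates. It is not the SR-Dyna algorithm or the 2-stage MDT evaluated in the paper; the 3-state chain, the learning rate, the discount factor, and the function names are all illustrative assumptions. The point it shows is that state values are the product of the SR matrix and the reward vector, so a changed reward can be re-evaluated without relearning the SR.

```python
# Tabular successor representation (SR) learned with TD updates.
# Assumptions (not from the paper): a deterministic 3-state chain,
# alpha=0.5, gamma=0.9, and the helper names learn_sr / values.

def learn_sr(transitions, n_states, alpha=0.5, gamma=0.9, sweeps=1000):
    """Estimate M[s][j]: expected discounted future occupancy of j from s."""
    # Start from the identity: each state at least "visits" itself.
    M = [[1.0 if i == j else 0.0 for j in range(n_states)]
         for i in range(n_states)]
    for _ in range(sweeps):
        for s, s_next in transitions:
            for j in range(n_states):
                onehot = 1.0 if s == j else 0.0
                # TD target: visit s now, then follow the SR of the next state.
                target = onehot + gamma * M[s_next][j]
                M[s][j] += alpha * (target - M[s][j])
    return M


def values(M, reward):
    """State values as V = M r (one dot product per state)."""
    return [sum(m * r for m, r in zip(row, reward)) for row in M]


# Chain 0 -> 1 -> 2, with state 2 absorbing (2 -> 2).
transitions = [(0, 1), (1, 2), (2, 2)]
M = learn_sr(transitions, n_states=3)

# Same SR, two different reward structures: no relearning of M is needed.
v_goal_at_2 = values(M, [0.0, 0.0, 1.0])
v_goal_at_1 = values(M, [0.0, 1.0, 0.0])
print(v_goal_at_2, v_goal_at_1)
```

In this sketch only the reward vector changes between the two evaluations. The setting studied in the paper is harder, because the latent variables also alter the transition structure, and in that case the SR matrix itself has to be relearned.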

Design and Implementation of Memory-Centric Computing System for Big Data Analysis

  • Jung, Byung-Kwon
    • Journal of the Korea Society of Computer and Information / v.27 no.7 / pp.1-7 / 2022
  • Applications such as big data and machine learning programs, which generate large amounts of data while they run, have become common, and main memory alone is no longer sufficient to execute them quickly. In particular, the need to derive results more quickly has emerged in situations such as the coronavirus outbreak, where an entire sequence must be analyzed for genetic alterations. When large-capacity data were processed on a computing system equipped with a self-developed memory pool MOCA host adapter instead of an existing SSD, performance improved by 16% compared to the existing SSD system. In addition, in other benchmark tests, IO performance on the system equipped with the memory pool MOCA host adapter was 92.8%, 80.6%, and 32.8% faster than on SSD for the workflow tasks SortSampleBam, ApplyBQSR, and GatherBamFiles, respectively. When analyzing large amounts of data, as in genome analysis pipelines, the runtime delay is expected to be reduced on a computing system equipped with the memory pool MOCA host adapter developed in this research.

Data-centric XAI-driven Data Imputation of Molecular Structure and QSAR Model for Toxicity Prediction of 3D Printing Chemicals (3D 프린팅 소재 화학물질의 독성 예측을 위한 Data-centric XAI 기반 분자 구조 Data Imputation과 QSAR 모델 개발)

  • ChanHyeok Jeong;SangYoun Kim;SungKu Heo;Shahzeb Tariq;MinHyeok Shin;ChangKyoo Yoo
    • Korean Chemical Engineering Research / v.61 no.4 / pp.523-541 / 2023
  • As accessibility to 3D printers increases, exposure to chemicals associated with 3D printing is becoming more frequent. However, research on the toxicity and harmfulness of chemicals generated by 3D printing is insufficient, and the performance of toxicity prediction using in silico techniques is limited by missing molecular structure data. In this study, a quantitative structure-activity relationship (QSAR) model based on a data-centric AI approach was developed to predict the toxicity of new 3D printing materials by imputing missing values in molecular descriptors. First, the MissForest algorithm was used to impute missing values in the molecular descriptors of hazardous 3D printing materials. Then, based on four machine learning models (decision tree, random forest, XGBoost, SVM), a machine learning (ML)-based QSAR model was developed to predict the bioconcentration factor (Log BCF), octanol-air partition coefficient (Log Koa), and partition coefficient (Log P). Furthermore, the reliability of the data-centric QSAR model was validated with the Tree-SHAP (SHapley Additive exPlanations) method, one of the explainable artificial intelligence (XAI) techniques. The proposed MissForest-based imputation increased the amount of usable molecular structure data approximately 2.5-fold relative to the existing data. Based on the imputed dataset of molecular descriptors, the developed data-centric QSAR model achieved approximately 73%, 76%, and 92% prediction performance for Log BCF, Log Koa, and Log P, respectively. Lastly, Tree-SHAP analysis demonstrated that the data-centric QSAR model achieved high prediction performance for toxicity information by identifying key molecular descriptors highly correlated with the toxicity indices. Therefore, the proposed QSAR model based on the data-centric XAI approach can be extended to predict the toxicity of potential pollutants in emerging printing chemicals and in chemical, semiconductor, or display processes.

Development of Artificial Intelligence Joint Model for Hybrid Finite Element Analysis (하이브리드 유한요소해석을 위한 인공지능 조인트 모델 개발)

  • Jang, Kyung Suk;Lim, Hyoung Jun;Hwang, Ji Hye;Shin, Jaeyoon;Yun, Gun Jin
    • Journal of the Korean Society for Aeronautical & Space Sciences / v.48 no.10 / pp.773-782 / 2020
  • The development of joint FE models for deep learning neural network (DLNN)-based hybrid FEA is presented. Material models of the bolts and bearings in the front axle of a tractor, which show complex behavior induced by various tightening conditions, were replaced with DLNN models. Bolts are modeled as one-dimensional Timoshenko beam elements with six degrees of freedom, and bearings as three-dimensional solid elements. Stress-strain data were extracted from all elements after finite element analyses under various load conditions, and the DLNNs for the bolts and bearings were trained with TensorFlow. The DLNN-based joint models were implemented in ABAQUS user subroutines, where stresses for the next increment are updated and the algorithmic tangent stiffness matrix is calculated. Generalization of the trained DLNN in the FE model was verified by subjecting it to a new loading condition. Finally, the DLNN-based FEA of the tractor front axle was conducted, and its feasibility was verified by comparison with the results of a static structural experiment on the actual tractor.

KNU Korean Sentiment Lexicon: Bi-LSTM-based Method for Building a Korean Sentiment Lexicon (Bi-LSTM 기반의 한국어 감성사전 구축 방안)

  • Park, Sang-Min;Na, Chul-Won;Choi, Min-Seong;Lee, Da-Hee;On, Byung-Won
    • Journal of Intelligence and Information Systems / v.24 no.4 / pp.219-240 / 2018
  • Sentiment analysis, one of the text mining techniques, is a method for extracting subjective content embedded in text documents. Recently, sentiment analysis methods have been widely used in many fields. For example, data-driven surveys analyze the subjectivity of text data posted by users, and market research analyzes users' review posts to quantify a target product's reputation among users. The basic method of sentiment analysis is to use a sentiment dictionary (or lexicon), a list of sentiment vocabularies with positive, neutral, or negative semantics. In general, the meaning of many sentiment words is likely to differ across domains. For example, the sentiment word 'sad' carries a negative meaning in many fields but not in movies. To perform accurate sentiment analysis, we need to build a sentiment dictionary for a given domain. However, building such a lexicon is time-consuming, and many sentiment vocabularies are missed unless a general-purpose sentiment lexicon is used. To address this problem, several studies have constructed sentiment lexicons for specific domains based on 'OPEN HANGUL' and 'SentiWordNet', which are general-purpose sentiment lexicons. However, OPEN HANGUL is no longer in service, and SentiWordNet does not work well because of language differences that arise when converting Korean words into English words. These restrictions limit the use of such general-purpose sentiment lexicons as seed data for building a domain-specific sentiment lexicon. In this article, we construct the 'KNU Korean Sentiment Lexicon (KNU-KSL)', a new general-purpose Korean sentiment dictionary that is more advanced than existing general-purpose lexicons. The proposed dictionary, a list of domain-independent sentiment words such as 'thank you', 'worthy', and 'impressed', is built so that a sentiment dictionary for a target domain can be constructed quickly. Specifically, it builds sentiment vocabularies by analyzing the glosses contained in the Standard Korean Language Dictionary (SKLD) through the following procedure. First, we propose a sentiment classification model based on Bidirectional Long Short-Term Memory (Bi-LSTM). Second, the proposed deep learning model automatically classifies each gloss as having either positive or negative meaning. Third, positive words and phrases are extracted from the glosses classified as positive, and negative words and phrases are extracted from the glosses classified as negative. Our experimental results show that the average accuracy of the proposed sentiment classification model is up to 89.45%. In addition, the sentiment dictionary is further extended using various external sources, including SentiWordNet, SenticNet, Emotional Verbs, and Sentiment Lexicon 0603. Furthermore, we add sentiment information about frequently used coined words and emoticons that are used mainly on the Web. The KNU-KSL contains a total of 14,843 sentiment vocabularies, each of which is a 1-gram, a 2-gram, a phrase, or a sentence pattern. Unlike existing sentiment dictionaries, it is composed of words that are not affected by particular domains. The recent trend in sentiment analysis is to use deep learning techniques without sentiment dictionaries, and the importance of developing sentiment dictionaries has gradually declined. However, one recent study shows that the words in a sentiment dictionary can be used as features of deep learning models, resulting in sentiment analysis with higher accuracy (Teng, Z., 2016). This result indicates that a sentiment dictionary is useful not only for sentiment analysis itself but also as a source of features that improve the accuracy of deep learning models. The proposed dictionary can be used as basic data for constructing the sentiment lexicon of a particular domain and as features of deep learning models. It is also useful for automatically and quickly building large training sets for deep learning models.
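As an illustration of how a lexicon made of 1-grams and 2-grams might be applied to text, here is a minimal sketch of dictionary-based polarity scoring. The mini-lexicon, its polarity values, and the function name `score` are hypothetical and are not taken from KNU-KSL; the three positive entries reuse the English example words from the abstract, and the negative entries are invented for the example. The Bi-LSTM gloss classifier described above is not reproduced here.

```python
# Dictionary-based polarity scoring with 1-gram and 2-gram entries.
# The lexicon below is a hypothetical stand-in, not actual KNU-KSL data.
LEXICON = {
    "thank you": 1,
    "worthy": 1,
    "impressed": 1,
    "disappointed": -1,   # invented negative entry
    "waste of": -1,       # invented negative entry
}


def score(text, lexicon=LEXICON, max_n=2):
    """Sum the polarity of lexicon entries found in text, longest match first."""
    tokens = [t.strip(".,!?") for t in text.lower().split()]
    total = 0
    i = 0
    while i < len(tokens):
        for n in range(max_n, 0, -1):
            if i + n > len(tokens):
                continue  # not enough tokens left for an n-gram of this size
            gram = " ".join(tokens[i:i + n])
            if gram in lexicon:
                total += lexicon[gram]
                i += n  # consume the matched n-gram
                break
        else:
            i += 1  # no entry starts at this token
    return total


print(score("Thank you, I was impressed!"))
print(score("Disappointed, a waste of money."))
```

Matching the longest n-gram first keeps a 2-gram entry such as 'thank you' from being split into unrelated 1-gram lookups. A real Korean pipeline would also need morphological analysis before matching, which this sketch omits.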

A Study on the Development of a Competency-Based Intervention Course Curriculum of the Korean Academy of Sensory Integration (대한감각통합치료학회 역량기반 중재과정 교육커리큘럼 개발연구)

  • Namkung, Young;Kim, Kyeong-Mi;Kim, Misun;Lee, Jiyoung
    • The Journal of Korean Academy of Sensory Integration / v.17 no.3 / pp.26-45 / 2019
  • Objective: The purpose of this study is to develop educational goals, training content, and training methods for the intervention course of the Korean Academy of Sensory Integration (KASI) and to conduct competency-based intervention courses based on the competency model for sensory integration intervention. Methods: This study was conducted with occupational therapists who participated in the 2019 intervention course of KASI. In the first phase, educational needs were analyzed to set goals for the intervention course. In the second phase, a meeting of researchers drafted the intervention course education program and the methods of education, and the intervention course was conducted. In the third phase, educational satisfaction and the change in performance level before and after the intervention course were investigated for each competency index. Results: The educational goals of "learning and applying the clinical reasoning process of sensory integration intervention" and "intervention by applying the principle of sensory integration intervention" were set to reflect the results of the educational needs analysis. The length of the competency-based intervention course was 42 hours. The average education satisfaction level of participants in the intervention course was 4.48±0.73, and that of the supervisors was 3.92±0.71. In both groups, the most satisfying curriculum items were the data-driven decision-making process and the intervention goal-setting lecture. But the satisfaction level of was the lowest. Before and after the intervention course, there were significant changes in the performance of the two behavioral indicators of analytic skills in the expertise competency cluster of the competency model. Conclusion: This study is meaningful in that it carried out a survey of educational needs, the development and implementation of an educational curriculum, and an education satisfaction survey through the systematic steps necessary for education development.

Causal inference from nonrandomized data: key concepts and recent trends (비실험 자료로부터의 인과 추론: 핵심 개념과 최근 동향)

  • Choi, Young-Geun;Yu, Donghyeon
    • The Korean Journal of Applied Statistics / v.32 no.2 / pp.173-185 / 2019
  • Causal questions are prevalent in scientific research: for example, how effective a treatment was in preventing an infectious disease, how much a policy increased utility, or which advertisement would give the highest click rate for a given customer. Causal inference theory in statistics interprets such questions as inferring the effect of a given intervention (treatment or policy) on the data-generating process. Causal inference has been used in medicine, public health, and economics; it has also received recent attention as a tool for data-driven decision making. Many recent datasets are observational rather than experimental, which makes causal inference more complex. This review introduces key concepts and recent trends of statistical causal inference in observational studies. We first introduce the Neyman-Rubin potential outcome framework to formalize causal questions as average treatment effects, and discuss popular methods for estimating treatment effects, such as propensity score approaches and regression approaches. For recent trends, we briefly discuss (1) conditional (heterogeneous) treatment effects and machine learning-based approaches, (2) the curse of dimensionality in the estimation of treatment effects and its remedies, and (3) Pearl's structural causal model for dealing with more complex causal relationships, and its connection to the Neyman-Rubin potential outcome model.
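The propensity score approach mentioned above can be sketched with inverse probability weighting (IPW) on a small synthetic dataset. This is a minimal illustration under stated assumptions and is not code from the review: the data, the single binary confounder `x`, the outcome rule y = 2t + 3x (so the true average treatment effect is 2), and the function names are all invented for the example. The propensity score is estimated as the treated fraction within each stratum of `x`, which requires every stratum to contain both treated and control units.

```python
from collections import defaultdict

# Synthetic observational data: (x, t, y) with y = 2*t + 3*x.
# Units with x=1 are treated more often, so x confounds the naive comparison.
DATA = (
    [(1, 1, 5.0)] * 3 + [(1, 0, 3.0)] * 1 +
    [(0, 1, 2.0)] * 1 + [(0, 0, 0.0)] * 3
)


def propensity_by_stratum(data):
    """Estimate e(x) = P(T=1 | X=x) as the treated fraction in each stratum."""
    counts = defaultdict(lambda: [0, 0])  # x -> [n_treated, n_total]
    for x, t, _ in data:
        counts[x][0] += t
        counts[x][1] += 1
    return {x: treated / total for x, (treated, total) in counts.items()}


def naive_difference(data):
    """Difference in mean outcomes, ignoring the confounder."""
    treated = [y for _, t, y in data if t == 1]
    control = [y for _, t, y in data if t == 0]
    return sum(treated) / len(treated) - sum(control) / len(control)


def ipw_ate(data):
    """IPW estimate of the ATE: mean of t*y/e(x) - (1-t)*y/(1-e(x))."""
    e = propensity_by_stratum(data)
    total = 0.0
    for x, t, y in data:
        total += t * y / e[x] - (1 - t) * y / (1 - e[x])
    return total / len(data)


print(naive_difference(DATA), ipw_ate(DATA))
```

On this dataset the naive difference in means overstates the effect because treated units are concentrated in the high-outcome stratum, while the weighted estimate recovers the effect built into the data. With continuous or high-dimensional covariates the propensity score would be fitted with a model such as logistic regression rather than stratum frequencies.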

Mobile robot control by MNN using optimal EN (최적 EN를 사용한 MNN에 의한 Mobile Robot제어)

  • Choi, Woo-Kyung;Kim, Seong-Joo;Seo, Jae-Yong;Jeon, Hong-Tae
    • Journal of the Korean Institute of Intelligent Systems / v.13 no.2 / pp.186-191 / 2003
  • The tracking skills of a mobile robot (MR) can be divided into following, approaching, avoiding, warning, and so on. It is difficult for a single neural network to learn all of these skills. To compensate for this, each skill was implemented as a separate module, and the mobile robot was controlled by the output of the module appropriate to the situation. The mobile robot was equipped with multiple ultrasonic sensors and a USB camera, which take the place of human senses, and the measured environmental data were learned through a Modular Neural Network (MNN). The MNN used an optimal combination of activation functions in the Expert Network (EN), and this structure appeared to improve learning time and error. The Gating Network (GN) was used to control the output values of the MNN by switching between them for the angle and speed of the robot. In this paper, the EN of the Modular Neural Network was designed with this optimal combination. Driving experiments with a real MR were repeated to verify the usefulness of the proposed MNN. The robot was properly controlled and driven by the output values, and the experiments gave good results.

A Study on Composition and Utilization of Digital Literacy Education elements Using Open Contents (오픈 콘텐츠를 활용한 디지털 리터러시 학습 요소 구성과 활용)

  • Hong, Myunghui;Lee, Soonyoung
    • Journal of The Korean Association of Information Education / v.22 no.6 / pp.711-721 / 2018
  • The development of artificial intelligence technology and the shift to a software-driven society are raising the need for digital literacy education on how to access, understand, use, create, and share new open content among a variety of sustainable open content. This paper defines digital literacy as a sub-literacy concept covering data, tool, and device elements. It is defined as a concept that includes cognitive and non-cognitive abilities and is stratified into computer literacy, ICT literacy, and information literacy. Open content is defined as teaching-learning materials that anyone can use and share freely, as in the Open Education Resource (OER) and Open Access movements. Based on these two definitions, a three-step strategy for digital literacy education was developed: first selecting open content in the digital environment, then drawing up a digital literacy education plan, and finally building an education framework to foster digital literacy capabilities.

Classes in Object-Oriented Modeling (UML): Further Understanding and Abstraction

  • Al-Fedaghi, Sabah
    • International Journal of Computer Science & Network Security / v.21 no.5 / pp.139-150 / 2021
  • Object orientation has become the predominant paradigm for conceptual modeling (e.g., UML), where the notions of class and object form the primitive building blocks of thought. Classes act as templates for objects that have attributes and methods (actions). The modeled systems are not necessarily software systems: they can be human and artificial systems of many different kinds (e.g., teaching and learning systems). The UML class diagram is described as a central component of model-driven software development. It is the most common diagram in object-oriented models and is used to model the static design view of a system. Objects both carry data and execute actions. According to some authorities in modeling, a certain degree of difficulty exists in understanding the semantics of these notions in UML class diagrams. Some researchers claim that class diagrams have limited use for conceptual analysis and that they are best used for logical design. Conceptual analysis should not be concerned with the ways facts are grouped into structures. Whether a fact will end up in the design as an attribute is not a conceptual issue. UML leads to drilling down into physical design details (e.g., private/public attributes, encapsulated operations, and the navigation direction of an association). This paper is a venture to further the understanding of object-oriented concepts as exemplified in UML, with the aim of developing a broad comprehension of conceptual modeling fundamentals. Thinging machine (TM) modeling is a new modeling language employed in this undertaking. TM modeling interlaces structure (components) and actionality, where actions infiltrate the attributes as much as the classes. Although space limitations affect some aspects of the class diagram, the concluding assessment of this study is that the class description is a kind of shorthand for a richer semantic TM construct.