• Title/Summary/Keyword: data learning process

Semi-Supervised Learning for Fault Detection and Classification of Plasma Etch Equipment (준지도학습 기반 반도체 공정 이상 상태 감지 및 분류)

  • Lee, Yong Ho;Choi, Jeong Eun;Hong, Sang Jeen
    • Journal of the Semiconductor & Display Technology / v.19 no.4 / pp.121-125 / 2020
  • With the miniaturization of semiconductors, the manufacturing process has become more complex, and small undetected changes in the state of the equipment have unexpectedly changed the process results. A fault detection and classification (FDC) system that performs more active data analysis with advanced machine learning methods can achieve more precise manufacturing process control. However, applying machine learning, especially supervised learning, requires an arduous data labeling process to construct the training data. In this paper, we propose a semi-supervised learning method to minimize the data labeling work in data preprocessing. We employed equipment status variable identification (SVID) data and optical emission spectroscopy (OES) data from silicon etch with an SF6/O2/Ar gas mixture, and the results show a labeling accuracy as high as 95.2% with the suggested semi-supervised learning algorithm.

Character Recognition Algorithm using Accumulation Mask

  • Yoo, Suk Won
    • International Journal of Advanced Culture Technology / v.6 no.2 / pp.123-128 / 2018
  • The learning data is composed of 100 characters in 10 different fonts, and the test data is composed of 10 characters in a new font that is not used in the learning data. To account for the variety of the learning data across fonts, 10 learning masks are constructed by accumulating the pixel values of the same character in the 10 different fonts. This process eliminates minute differences between the same character in different fonts. After finding the maximum values of the learning masks, the test data is expanded by multiplying it by these maximum values. The algorithm calculates the sum of the differences between corresponding pixel values of the expanded test data and each learning mask. The learning mask with the smallest of these 10 sums is selected as the recognition result for the test data. The proposed algorithm can recognize various types of fonts, and the learning data can be modified easily by adding a new font. The recognition process is also easy to understand, and the algorithm produces satisfactory results for character recognition.
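The steps in this abstract can be sketched as follows. This is a minimal illustration, not the author's implementation: the function names `build_masks` and `recognize`, the use of binary images stored as nested lists, the per-mask maximum, and the use of absolute differences are all assumptions made for the example.

```python
def build_masks(training):
    """Accumulate pixel values per character across fonts.

    training: dict mapping a character label to a list of equally sized
    binary images (one per font), each image a list of rows of 0/1 values.
    """
    masks = {}
    for label, images in training.items():
        rows, cols = len(images[0]), len(images[0][0])
        mask = [[0] * cols for _ in range(rows)]
        for image in images:
            for r in range(rows):
                for c in range(cols):
                    mask[r][c] += image[r][c]
        masks[label] = mask
    return masks


def recognize(test_image, masks):
    """Return the label whose mask is closest to the expanded test image."""
    best_label, best_score = None, None
    for label, mask in masks.items():
        # Expand the test image to the mask's scale (assumed: per-mask maximum).
        peak = max(max(row) for row in mask)
        # Sum of absolute pixel differences (absolute value is an assumption).
        score = sum(
            abs(peak * t - m)
            for test_row, mask_row in zip(test_image, mask)
            for t, m in zip(test_row, mask_row)
        )
        if best_score is None or score < best_score:
            best_label, best_score = label, score
    return best_label


if __name__ == "__main__":
    # Toy 3x3 glyphs in two hypothetical fonts per character.
    training = {
        "I": [[[0, 1, 0], [0, 1, 0], [0, 1, 0]],
              [[1, 1, 1], [0, 1, 0], [1, 1, 1]]],
        "L": [[[1, 0, 0], [1, 0, 0], [1, 1, 1]],
              [[1, 0, 0], [1, 0, 0], [1, 1, 0]]],
    }
    masks = build_masks(training)
    print(recognize([[0, 1, 0], [0, 1, 0], [1, 1, 1]], masks))
```

Adding a new font only requires appending one more image per character before rebuilding the masks, which matches the abstract's remark that the learning data is easy to modify.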

Advanced Information Data-interactive Learning System Effect for Creative Design Project

  • Park, Sangwoo;Lee, Inseop;Lee, Junseok;Sul, Sanghun
    • KSII Transactions on Internet and Information Systems (TIIS) / v.16 no.8 / pp.2831-2845 / 2022
  • In contrast to the substantial body of project-based learning research, data-driven design project-based learning has not reached a meaningful consensus on the most valid and reliable method for assessing design creativity. This article proposes an advanced information data-interactive learning system for creative design using a service design process that incorporates design thinking. We propose a service framework to improve the convergence design process between students and advanced information data analysis, allowing students to participate actively in data visualization and research using patent data. By solving a design problem through a discovery and interpretation process, the advanced information-interactive learning framework allows students to verify the value of creative ideas or to ideate new factors and the associated feasible solutions. Students can work with the patent data through a business intelligence platform. Most of the new ideas for solving design projects are evaluated through complete patent data analysis and visualization at the beginning of the service design process. In this article, we propose adapting advanced information data to teach the service design process, allowing students to evaluate their own ideas and define problems iteratively until they are satisfied. Quantitative evaluation results show that the advanced information data-driven learning system can improve design project-based learning results in terms of design creativity. Our findings can contribute to data-driven project-based learning with advanced information data, which plays a crucial role in convergence design, in related standards, and in other linked smart educational fields.

Utility Analysis of Federated Learning Techniques through Comparison of Financial Data Performance (금융데이터의 성능 비교를 통한 연합학습 기법의 효용성 분석)

  • Jang, Jinhyeok;An, Yoonsoo;Choi, Daeseon
    • Journal of the Korea Institute of Information Security & Cryptology / v.32 no.2 / pp.405-416 / 2022
  • Current AI technology is improving the quality of life by using machine learning based on data. When machine learning is used, distributed data that is transmitted and collected in one place goes through a de-identification process because there is a risk of privacy infringement. De-identified data suffers information loss and omission, which degrades the performance of the machine learning process and complicates preprocessing. Accordingly, Google announced federated learning in 2016, a method of de-identifying data and learning without collecting the data on one server. This paper analyzes its effectiveness on actual financial data by comparing the learning performance of data de-identified with k-anonymity and of differential privacy reproduction data. As a result of the experiment, the accuracy of original data learning was 79% for k=2, 76% for k=5, 52% for k=7, 50% for 𝜖=1, and 82% for 𝜖=0.1, and 86% for federated learning.
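As a rough illustration of learning without collecting raw data on one server, the sketch below shows federated averaging on a one-parameter linear model. It is not the paper's experimental setup: the model `y = w * x`, the function names `local_update` and `federated_average`, the learning rate, and the toy client data are all assumptions chosen to keep the example short.

```python
def local_update(w, data, lr=0.1, epochs=5):
    """Run a few gradient-descent steps on one client's private data.

    data: list of (x, y) pairs that never leave the client.
    """
    for _ in range(epochs):
        # Gradient of the mean squared error for the model y = w * x.
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w


def federated_average(client_data, rounds=20, w=0.0):
    """Server loop: send w to clients, average the returned weights."""
    total = sum(len(data) for data in client_data)
    for _ in range(rounds):
        # Each client returns only its updated weight and its sample count.
        updates = [(local_update(w, data), len(data)) for data in client_data]
        # Average weighted by sample count.
        w = sum(w_i * n for w_i, n in updates) / total
    return w


if __name__ == "__main__":
    # Two hypothetical clients whose data follow y = 3x.
    clients = [
        [(1.0, 3.0), (2.0, 6.0)],
        [(0.5, 1.5), (1.5, 4.5), (1.0, 3.0)],
    ]
    print(federated_average(clients))
```

Only model weights cross the network in this sketch; the `(x, y)` pairs stay inside `local_update`, which is the property the abstract contrasts with de-identifying and centralizing the data.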

Learning process mining techniques based on open education platforms (개방형 e-Learning 플랫폼 기반 학습 프로세스 마이닝 기술)

  • Kim, Hyun-ah
    • The Journal of the Convergence on Culture Technology / v.5 no.2 / pp.375-380 / 2019
  • In this paper, we study learning process mining and analytics technology based on an open education platform. The study covers the mining of personal learning history log data from open education platforms such as MOOCs, which have recently attracted growing interest. The technology designs and implements a learning process mining framework for discovering and analyzing meaningful learning processes and knowledge from learning history log data. The learning process mining framework is a technique for expressing, extracting, analyzing, and visualizing the learning process to provide learners with improved learning processes and educational services.
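One common first step in process mining is counting which activity directly follows which in each learner's trace. The sketch below shows that step only; the log layout `(learner_id, timestamp, activity)`, the function name `directly_follows`, and the sample activities are hypothetical and are not taken from the paper's framework.

```python
from collections import Counter, defaultdict


def directly_follows(log):
    """Count directly-follows pairs of activities per learner.

    log: iterable of (learner_id, timestamp, activity) tuples in any order.
    Returns a Counter mapping (activity_a, activity_b) to the number of times
    activity_b immediately followed activity_a within one learner's trace.
    """
    traces = defaultdict(list)
    # Order events by learner, then by time, to rebuild each trace.
    for learner, _, activity in sorted(log, key=lambda e: (e[0], e[1])):
        traces[learner].append(activity)
    counts = Counter()
    for activities in traces.values():
        counts.update(zip(activities, activities[1:]))
    return counts


if __name__ == "__main__":
    log = [
        ("a", 1, "video"), ("a", 2, "quiz"), ("a", 3, "forum"),
        ("b", 1, "video"), ("b", 2, "forum"), ("b", 3, "quiz"),
        ("c", 1, "video"), ("c", 2, "quiz"),
    ]
    for pair, n in directly_follows(log).most_common():
        print(pair, n)
```

The resulting counts can be drawn as a weighted directed graph, which is one simple way to visualize a learning process.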

Text Classification with Heterogeneous Data Using Multiple Self-Training Classifiers

  • William Xiu Shun Wong;Donghoon Lee;Namgyu Kim
    • Asia pacific journal of information systems / v.29 no.4 / pp.789-816 / 2019
  • Text classification is a challenging task, especially when dealing with a huge amount of text data. The performance of a classification model can vary depending on what types of words are contained in the document corpus and what types of features are generated for classification. Rather than proposing a modified version of an existing algorithm or creating a new algorithm, we attempt to modify the use of data. Classifier performance is usually affected by the quality of the learning data, since the classifier is built from these training data. We assume that data from different domains might have different noise characteristics, which can be utilized in the process of learning the classifier. Therefore, we attempt to enhance the robustness of the classifier by artificially injecting heterogeneous data into the learning process in order to improve classification accuracy. A semi-supervised approach was applied to utilize the heterogeneous data in the process of learning the document classifier. However, the performance of the document classifier might be degraded by the unlabeled data. Therefore, we further propose an algorithm to extract only the documents that contribute to improving the accuracy of the classifier.
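The general idea of self-training with a filter on which unlabeled items are admitted can be sketched as below. This is generic self-training, not the authors' multiple-classifier algorithm: the nearest-centroid classifier on one-dimensional features, the margin-based confidence rule, and the names `centroids`, `predict_with_margin`, and `self_train` are assumptions for illustration, and the sketch needs at least two classes in the labeled data.

```python
def centroids(points, labels):
    """Mean feature value per class (a nearest-centroid 'model')."""
    sums, counts = {}, {}
    for x, y in zip(points, labels):
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}


def predict_with_margin(x, cents):
    """Return (predicted label, confidence margin) for one point."""
    ranked = sorted(cents, key=lambda y: abs(x - cents[y]))
    best, runner_up = ranked[0], ranked[1]
    # Margin: how much closer the best centroid is than the second best.
    margin = abs(x - cents[runner_up]) - abs(x - cents[best])
    return best, margin


def self_train(labeled, unlabeled, min_margin=1.0, max_rounds=10):
    """Repeatedly pseudo-label only the confidently classified points."""
    points = [x for x, _ in labeled]
    labels = [y for _, y in labeled]
    pool = list(unlabeled)
    for _ in range(max_rounds):
        cents = centroids(points, labels)
        confident, rest = [], []
        for x in pool:
            y, margin = predict_with_margin(x, cents)
            (confident if margin >= min_margin else rest).append((x, y))
        if not confident:
            break  # nothing left that the model is sure about
        for x, y in confident:
            points.append(x)
            labels.append(y)
        pool = [x for x, _ in rest]
    return centroids(points, labels), pool


if __name__ == "__main__":
    model, leftover = self_train([(0.0, "a"), (10.0, "b")],
                                 [1.0, 9.0, 5.2, 2.0, 8.0])
    print(model, leftover)
```

Points that never clear the margin stay in the leftover pool and do not influence the model, which mirrors the abstract's concern that unlabeled data can degrade the classifier if admitted indiscriminately.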

Learning Process Monitoring of e-Learning for Corporate Education (기업교육을 위한 인터넷 원격훈련 학습과정 모니터링 연구)

  • Kim, Do-Hun;Jung, Hyojung
    • The Journal of Industrial Distribution & Business / v.9 no.8 / pp.35-40 / 2018
  • Purpose - The purpose of this study is to conduct a monitoring study on the learning process of e-learning contents. This study has two research objectives. First, by conducting monitoring research on the learning process, we aim to explore the implications for content development that reflects future student needs. Second, we want to collect basic empirical data for estimating the appropriate amount of learning. Research design, data, and methodology - This study is a case study of learners' learning processes in e-learning. After the learning was completed, a test was conducted to measure the total cognitive load and the level of engagement that occurred during the learning process, followed by an in-depth interview. The tool used to measure cognitive load is NASA-TLX, a subjective cognitive load measurement method. In the monitoring process, we observe external phenomena such as page movement and mouse movement paths, and identify cognitive activities using techniques such as think-aloud. Results - Across the three research subjects, two courses showed more actual learning time than the planned learning time, and one course showed less. This gives the following implications for content development. First, it is necessary to consider the importance of selecting the target and content level according to the level of the subject. Second, it is necessary to design learner participation activities that meet the learning goal level and to calculate the appropriate time accordingly. Third, it is necessary to design an appropriate learning support strategy according to the learning task. This should be considered in designing lessons. Fourth, it is necessary to revitalize content design centered on learning activities such as simulation. Conclusions - The implications for the examination system are as follows. First, it can be confirmed that it is difficult to calculate the amount of learning based on learning time and to secure objectivity. Second, it can be seen that various variables besides the amount of content affect the actual learning time. Third, the system for examining the amount of learning centered on 'learning time' needs to be reviewed.

Anomaly Detection Model Based on Semi-Supervised Learning Using LIME: Focusing on Semiconductor Process (LIME을 활용한 준지도 학습 기반 이상 탐지 모델: 반도체 공정을 중심으로)

  • Kang-Min An;Ju-Eun Shin;Dong Hyun Baek
    • Journal of Korean Society of Industrial and Systems Engineering / v.45 no.4 / pp.86-98 / 2022
  • Recently, many studies have been conducted to improve quality by applying machine learning models to semiconductor manufacturing process data. However, in the semiconductor manufacturing process, the ratio of good products is much higher than that of defective products, so data imbalance is a serious problem for machine learning. In addition, since the number of features in the data is very large, it is important to extract only the important features before training in order to increase accuracy and usability. This study proposes an anomaly detection methodology that learns well despite the data imbalance and high dimensionality of semiconductor process data. The methodology applies the LIME algorithm after applying the SMOTE method and the RFECV method. It analyzes the classification results of the anomaly classification model, detects the cause of the anomaly, and identifies the semiconductor process that requires action. The applicability and feasibility of the proposed methodology were confirmed through case applications.
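To make the oversampling step concrete, the sketch below shows the core idea of SMOTE: a synthetic minority sample is placed on the line segment between a minority sample and one of its nearest minority neighbours. It covers only this first stage of the pipeline named in the abstract (not RFECV or LIME), and it is a simplified stand-in rather than the paper's code or any library's implementation; the function name `smote_like`, the neighbour count, and the seed are assumptions.

```python
import random


def smote_like(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between neighbours.

    minority: list of feature tuples for the minority (defective) class,
    containing at least two points.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # k nearest minority neighbours of the base point (squared distance).
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        # Random position on the segment from base to neighbour.
        lam = rng.random()
        synthetic.append(tuple(a + lam * (b - a) for a, b in zip(base, neighbour)))
    return synthetic


if __name__ == "__main__":
    defective = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
    for point in smote_like(defective, n_new=5):
        print(point)
```

Because every synthetic point is a convex combination of two real minority points, the new samples stay inside the region already occupied by the minority class.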

Generating Training Dataset of Machine Learning Model for Context-Awareness in a Health Status Notification Service (사용자 건강 상태알림 서비스의 상황인지를 위한 기계학습 모델의 학습 데이터 생성 방법)

  • Mun, Jong Hyeok;Choi, Jong Sun;Choi, Jae Young
    • KIPS Transactions on Software and Data Engineering / v.9 no.1 / pp.25-32 / 2020
  • In context-aware systems, rule-based AI technology has been used in the abstraction process for obtaining context information. However, the rules have become complicated as user requirements for the service diversify and data usage increases. Therefore, there are technical limitations in maintaining rule-based models and in processing unstructured data. To overcome these limitations, many studies have applied machine learning techniques to context-aware systems. To use such a machine learning-based model in a context-aware system, a management process that periodically injects training data is required. A previous study on a machine learning-based context-aware system considered a series of management processes, such as generating and providing learning data for operating several machine learning models, but the method was limited to the system to which it was applied. In this paper, we propose a training data generation method for machine learning models in order to extend the machine learning-based context-aware system. The proposed method defines a training data generating model that can reflect the requirements of the machine learning models and generates the training data for each machine learning model. In the experiment, the training data generating model is defined based on the training data generating schema of the cardiac status analysis model for older adults in a health status notification service, and the training data is generated by applying the defined model in the real software environment. In addition, we compare the accuracy obtained by training the machine learning model on the generated training data in order to verify the validity of the generated data.

Privacy-Preserving Deep Learning using Collaborative Learning of Neural Network Model

  • Hye-Kyeong Ko
    • International journal of advanced smart convergence / v.12 no.2 / pp.56-66 / 2023
  • The goal of deep learning is to extract complex features from multidimensional data and use the features to create models that connect input and output. Deep learning is a process of learning nonlinear features and functions from complex data, and the user data employed to train deep learning models has become the focus of privacy concerns. Companies that collect users' sensitive personal information, such as users' images and voices, own this data for indefinite periods of time. Users cannot delete their personal information, and they cannot limit the purposes for which the data is used. The study designed a deep learning method that employs privacy protection technology based on distributed collaborative learning, so that multiple participants can use neural network models collaboratively without sharing their input datasets. To prevent direct leaks of personal information, participants are not shown the training datasets during the model training process, unlike in traditional deep learning, so that the personal information in the data can be protected. The study used a method that can selectively share subsets via an optimization algorithm based on modified distributed stochastic gradient descent, and the results showed that it was possible to learn with improved accuracy while protecting personal information.
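The idea of selectively sharing subsets during distributed stochastic gradient descent can be illustrated with the sketch below, in which each participant uploads only a few gradient entries and the server applies them to the shared parameters. The abstract does not state the selection rule; choosing the entries with the largest magnitude is an assumption here, as are the function names `select_largest` and `apply_shared` and the toy gradients.

```python
def select_largest(gradient, k):
    """Pick the k gradient entries with the largest magnitude.

    Returns a dict {parameter_index: gradient_value}; this sparse subset is
    all a participant would upload, instead of its data or full gradient.
    """
    order = sorted(range(len(gradient)), key=lambda i: abs(gradient[i]),
                   reverse=True)
    return {i: gradient[i] for i in order[:k]}


def apply_shared(global_params, shared_updates, lr=0.1):
    """Server side: apply each participant's sparse update to the parameters."""
    params = list(global_params)
    for update in shared_updates:
        for i, g in update.items():
            params[i] -= lr * g
    return params


if __name__ == "__main__":
    # Hypothetical local gradients computed privately by two participants.
    grad_a = [0.1, -2.0, 0.5, 1.5]
    grad_b = [1.0, 0.0, -3.0, 0.2]
    shared = [select_largest(grad_a, 2), select_largest(grad_b, 2)]
    print(apply_shared([0.0, 0.0, 0.0, 0.0], shared, lr=0.5))
```

The fraction of entries shared (here `k` out of four) is the knob that trades privacy against how quickly the shared model improves.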