• Title/Summary/Keyword: data process

Principles of Multivariate Data Visualization

  • Huh, Moon Yul;Cha, Woon Ock
    • Communications for Statistical Applications and Methods / v.11 no.3 / pp.465-474 / 2004
  • Data visualization is an automated discovery process applied to data sets in an effort to uncover the information underlying the data. It provides rich visual depictions of the data and has distinct advantages over traditional data analysis techniques, such as exploring the structure of large-scale data sets, in terms of both the number of observations and the number of variables, by allowing close interaction between the data and the end user. We discuss the principles of data visualization and evaluate the characteristics of various visualization tools against these principles.

Character Recognition Algorithm using Accumulation Mask

  • Yoo, Suk Won
    • International Journal of Advanced Culture Technology / v.6 no.2 / pp.123-128 / 2018
  • The learning data consist of 100 characters rendered in 10 different fonts, and the test data consist of 10 characters in a new font not used in the learning data. To account for the variety of fonts in the learning data, 10 learning masks are constructed by accumulating the pixel values of the same character across the 10 fonts; this accumulation smooths out minute differences between fonts. After the maximum value of each learning mask is found, the test data are expanded by multiplying them by these maximum values. The algorithm then calculates the sum of the differences between corresponding pixel values of the expanded test data and each learning mask, and the learning mask with the smallest of these 10 sums is selected as the recognition result for the test data. The proposed algorithm can recognize various types of fonts, and the learning data can easily be extended by adding a new font. The recognition process is also easy to understand, and the algorithm produces satisfactory character recognition results.
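
A minimal sketch of the accumulation-mask matching described above, assuming each character image is a fixed-size binary NumPy array; the function names, dictionary layout, and scaling step are illustrative assumptions, not the paper's code.

```python
import numpy as np

def build_masks(learning_data):
    """learning_data: dict mapping character label -> list of binary images
    (one per font). Each mask accumulates pixel values over the fonts."""
    return {label: np.sum(np.stack(images), axis=0)
            for label, images in learning_data.items()}

def recognize(test_image, masks):
    """Scale the test image by each mask's maximum value, then pick the mask
    with the smallest sum of absolute pixel differences."""
    best_label, best_score = None, np.inf
    for label, mask in masks.items():
        expanded = test_image * mask.max()      # expand test data to mask range
        score = np.abs(expanded - mask).sum()   # sum of pixel-wise differences
        if score < best_score:
            best_label, best_score = label, score
    return best_label
```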

A Study on Analysis of Superlarge Manufacturing Process Data for Six Sigma (6 시그마 위한 대용량 공정데이터 분석에 관한 연구)

  • 박재홍;변재현
    • Proceedings of the Korean Operations and Management Science Society Conference / 2001.10a / pp.411-415 / 2001
  • Advances in computer and sensor technology have made it possible to obtain superlarge manufacturing process data in real time, allowing us to extract meaningful information from these superlarge data sets. We propose a systematic data analysis procedure that field engineers can easily apply to manufacture quality products. The procedure consists of a data cleaning stage and a data analysis stage. The data cleaning stage constructs a database suitable for statistical analysis from the original superlarge manufacturing process data. In the data analysis stage, we suggest a graphical, easy-to-implement approach to extract practical information from the cleaned database. This study will help manufacturing companies achieve Six Sigma quality.
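
As a rough illustration of the two-stage procedure (cleaning, then a graphical screening), the sketch below uses pandas and matplotlib; the file name, column names, and resampling interval are assumptions, and the paper's actual cleaning rules are not reproduced.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Data cleaning stage: build an analysis-ready table from the raw process log.
raw = pd.read_csv("process_log.csv", parse_dates=["timestamp"])    # assumed file
clean = (raw.drop_duplicates()
            .dropna(subset=["temperature", "pressure"])            # assumed columns
            .set_index("timestamp")
            .resample("1min").mean(numeric_only=True))   # thin out the superlarge data

# Data analysis stage: a simple graphical screening of each process variable.
clean[["temperature", "pressure"]].plot(subplots=True, figsize=(8, 4))
plt.tight_layout()
plt.show()
```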

Utility Analysis of Federated Learning Techniques through Comparison of Financial Data Performance (금융데이터의 성능 비교를 통한 연합학습 기법의 효용성 분석)

  • Jang, Jinhyeok;An, Yoonsoo;Choi, Daeseon
    • Journal of the Korea Institute of Information Security & Cryptology / v.32 no.2 / pp.405-416 / 2022
  • Current AI technology improves the quality of life through machine learning based on data. When machine learning is used, transmitting distributed data and collecting it in one place requires a de-identification process because of the risk of privacy infringement. De-identified data suffers from information damage and omission, which degrades the performance of machine learning and complicates preprocessing. Accordingly, in 2016 Google announced federated learning, a method of training a model without collecting the data onto a single server. This paper analyzes the utility of federated learning by comparing, on actual financial data, the learning performance of data de-identified with k-anonymity and of differentially private synthetic data against that of federated learning. In the experiments, compared with learning on the original data, the accuracy was 79% for k=2, 76% for k=5, 52% for k=7, 50% for ε=1, and 82% for ε=0.1, versus 86% for federated learning.
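
The federated setting compared above can be illustrated with a minimal FedAvg-style loop in which only model weights leave each client; the logistic-regression model, client count, and toy data below are assumptions, not the paper's experimental setup.

```python
import numpy as np

def local_train(weights, X, y, lr=0.1, epochs=5):
    """One client's local logistic-regression update on its private data."""
    w = weights.copy()
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-X @ w))        # sigmoid predictions
        w -= lr * X.T @ (p - y) / len(y)        # gradient step
    return w

def federated_average(global_w, clients):
    """Weighted average of client updates (FedAvg); raw data is never exchanged."""
    updates = [(len(y), local_train(global_w, X, y)) for X, y in clients]
    total = sum(n for n, _ in updates)
    return sum(n * w for n, w in updates) / total

rng = np.random.default_rng(0)
clients = [(rng.normal(size=(100, 5)), rng.integers(0, 2, 100)) for _ in range(3)]
w = np.zeros(5)
for _ in range(10):                             # communication rounds
    w = federated_average(w, clients)
```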

Design of Data Fusion and Data Processing Model According to Industrial Types (산업유형별 데이터융합과 데이터처리 모델의 설계)

  • Jeong, Min-Seung;Jin, Seon-A;Cho, Woo-Hyun
    • KIPS Transactions on Software and Data Engineering / v.6 no.2 / pp.67-76 / 2017
  • Industrial sites in various fields generate large amounts of data that are correlated with one another. A variety of data can be collected for each type of industrial process, but the associations between the individual processes are not integrated. In existing practice, the set values of the molding condition table are entered by the operator as arbitrary values whenever a problem occurs in the work process. In this paper, we design a fusion and analysis processing model for the data collected for each industrial type, illustrated with a prediction case (Automobile Connect) aimed at improving corporate earnings in process manufacturing industries. By comparing master data in the form of a standard molding condition table with the production history files collected during the manufacturing process, the operator's arbitrary values are replaced with a new, digitized molding condition table, which reduces the failure rate; new pattern analysis and the reinterpretation of various malfunction factors and exceptions lead to increased productivity, process improvement, and cost savings. The model can be designed for a variety of data analyses and model validations. In addition, the manufacturing process gains objectivity, consistency, and optimization through analyzed and verified standard set values, and optimization (standard-setting) techniques suited to the industry type can be supported through various pattern types.

Recurrent Neural Network Modeling of Etch Tool Data: a Preliminary for Fault Inference via Bayesian Networks

  • Nawaz, Javeria;Arshad, Muhammad Zeeshan;Park, Jin-Su;Shin, Sung-Won;Hong, Sang-Jeen
    • Proceedings of the Korean Vacuum Society Conference / 2012.02a / pp.239-240 / 2012
  • With advancements in semiconductor device technologies, manufacturing processes are getting more complex and it has become more difficult to maintain tight process control. As the number of processing steps for fabricating complex chip structures increases, potential fault-inducing factors become prevalent and their allowable margins are continuously reduced. Therefore, one of the keys to success in semiconductor manufacturing is highly accurate and fast fault detection and classification at each stage, to reduce any undesired variation and identify the cause of the fault. Sensors in the equipment are used to monitor the state of the process; the idea is that whenever there is a fault in the process, it appears as some variation in the output of the sensors monitoring the process. These sensors may provide information about pressure, RF power, gas flow, etc. in the equipment. By relating the data from these sensors to the process condition, any abnormality in the process can be identified, although only with some degree of certainty. Our approach in this research is to capture the features of equipment condition data from a library of healthy process runs. The healthy data can then serve as a reference for upcoming processes, which is made possible by mathematically modeling the acquired data. In this work, a recurrent neural network (RNN) is used; an RNN is a dynamic neural network that makes the output a function of previous inputs. In our case we have etch equipment tool set data consisting of 22 parameters and 9 runs. This data was first synchronized using the Dynamic Time Warping (DTW) algorithm. The synchronized sensor data, in the form of time series, is then provided to the RNN, which trains and restructures itself according to the input and then predicts a value one step ahead in time that depends on the past values of the data. Eight runs of process data were used to train the network, while one run was used as a test input to check the network's performance. Next, a mean-squared-error-based probability generating function was used to assign a probability of fault to each parameter by comparing the predicted and actual values of the data. In the future we will use Bayesian networks to classify the detected faults. Bayesian networks use directed acyclic graphs that relate different parameters through their conditional dependencies in order to find inference among them. The relationships between parameters in the data will be used to generate the structure of the Bayesian network, and the posterior probability of different faults will then be calculated using inference algorithms.
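
A hedged sketch of the one-step-ahead RNN modeling step is given below in PyTorch, assuming the sensor traces have already been DTW-synchronized into a tensor of shape (runs, time, parameters); the network size, training loop, and random toy tensors are illustrative only.

```python
import torch
import torch.nn as nn

class OneStepRNN(nn.Module):
    def __init__(self, n_params, hidden=32):
        super().__init__()
        self.rnn = nn.RNN(n_params, hidden, batch_first=True)
        self.out = nn.Linear(hidden, n_params)

    def forward(self, x):                       # x: (batch, time, n_params)
        h, _ = self.rnn(x)
        return self.out(h)                      # prediction one step ahead

n_params = 22
train = torch.randn(8, 200, n_params)           # 8 healthy runs (toy stand-in)
test = torch.randn(1, 200, n_params)            # 1 held-out run

model = OneStepRNN(n_params)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for _ in range(200):
    pred = model(train[:, :-1])                 # predict x[t+1] from x[..t]
    loss = nn.functional.mse_loss(pred, train[:, 1:])
    opt.zero_grad()
    loss.backward()
    opt.step()

# Per-parameter mean squared error on the test run, used as a simple fault score.
with torch.no_grad():
    err = (model(test[:, :-1]) - test[:, 1:]).pow(2).mean(dim=(0, 1))
```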

An Analysis Techniques for Coatings Mixing using the R Data Analysis Framework (R기반 데이터 분석 프레임워크를 이용한 코팅제 배합 분석 기술)

  • Noh, Seong Yeo;Kim, Minjung;Kim, Young-Jin
    • Journal of Korea Multimedia Society / v.18 no.6 / pp.734-741 / 2015
  • Coating is a type of paint; it protects a product by forming a film layer on the product and gives the product various properties. Coating is one of the fields being studied actively in the polymer industry, and its importance across industries keeps increasing. However, the mixing process has been performed depending on the operator's experience. In this paper, we identify the relationships among the data from the coating formulation process and propose a framework to analyze that process, which can improve the coating formulation process. In particular, the suggested framework may reduce degradation and loss costs caused by the absence of standard data that provide accurate formulation criteria. It also suggests responses to errors that may occur in the future, through analysis of the error data generated in the mixing step.
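
The paper's framework is built on R; purely for illustration (and to keep one language across the sketches in this list), the Python snippet below shows the kind of pairwise-relationship screening the abstract describes, with an assumed file name and column names.

```python
import pandas as pd

mix = pd.read_csv("coating_mixing.csv")         # assumed: one row per formulation batch
cols = ["resin_ratio", "solvent_ratio", "hardness", "gloss"]   # assumed columns
print(mix[cols].corr().round(2))                # pairwise correlations between
                                                # mixing inputs and film properties
```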

A Decision Tree Approach for Identifying Defective Products in the Manufacturing Process

  • Choi, Sungsu;Battulga, Lkhagvadorj;Nasridinov, Aziz;Yoo, Kwan-Hee
    • International Journal of Contents / v.13 no.2 / pp.57-65 / 2017
  • Recently, owing to the significance of Industry 4.0, the manufacturing industry is developing globally. The manufacturing industry typically generates a large volume of data related to processes, lines, and products. In this paper, we analyze the causes of defective products in the manufacturing process using the decision tree technique, a well-known data mining technique. We used data collected from the domestic manufacturing industry, including Manufacturing Execution System (MES) data, Point of Production (POP) data, equipment data accumulated directly in the equipment, and in-process/external air-conditioning and static-electricity sensor data. We propose a model built with the C4.5 decision tree algorithm; specifically, the decision tree is modeled on the components of a specific part. We propose to identify the state of products in which a defect occurred and compare it with the generated decision tree model to determine the cause of the defect.
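
A hedged sketch of the decision-tree step follows; scikit-learn's DecisionTreeClassifier implements CART rather than C4.5, so it only stands in for the paper's model, and the CSV file and column names are assumptions.

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_text

data = pd.read_csv("mes_pop_sensors.csv")       # assumed MES/POP/sensor feature table
X = data.drop(columns=["defect"])               # e.g. temperature, humidity, static
y = data["defect"]                              # 1 = defective, 0 = good

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
tree = DecisionTreeClassifier(criterion="entropy", max_depth=5).fit(X_tr, y_tr)
print("accuracy:", tree.score(X_te, y_te))
print(export_text(tree, feature_names=list(X.columns)))   # inspect the defect rules
```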

Multivariate Control Chart for Autocorrelated Process (자기상관자료를 갖는 공정을 위한 다변량 관리도)

  • Nam, Gook-Hyun;Chang, Young-Soon;Bai, Do-Sun
    • Journal of Korean Institute of Industrial Engineers / v.27 no.3 / pp.289-296 / 2001
  • This paper proposes a multivariate control chart for autocorrelated data, which are common in chemical and process industries and lead to an increased number of false alarms when conventional control charts are applied. The autocorrelation is modeled as a vector autoregressive process, and canonical analysis is used to reduce the dimensionality of the data set and find the canonical variables that explain as much of the data variation as possible. Charting statistics are constructed from the residual vectors of the canonical variables, which are uncorrelated over time, so that the control charts for these statistics can attenuate the autocorrelation in the process data. The charting procedures are illustrated with a numerical example, and a Monte Carlo simulation is conducted to investigate the performance of the proposed control charts.
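
The residual-based charting idea can be sketched as follows: fit a vector autoregressive model to autocorrelated data and monitor the residuals with a Hotelling T² statistic (the paper's canonical-analysis dimension reduction is omitted); the simulated data and empirical control limit are assumptions.

```python
import numpy as np
from statsmodels.tsa.api import VAR

# Toy autocorrelated series standing in for multivariate process measurements.
rng = np.random.default_rng(0)
data = np.zeros((500, 3))
for t in range(1, 500):
    data[t] = 0.7 * data[t - 1] + rng.normal(size=3)

fit = VAR(data).fit(maxlags=2)                  # model the autocorrelation
resid = fit.resid                               # approximately uncorrelated over time
diff = resid - resid.mean(axis=0)
S_inv = np.linalg.inv(np.cov(resid, rowvar=False))
t2 = np.einsum("ij,jk,ik->i", diff, S_inv, diff)   # Hotelling T^2 per time point

ucl = np.percentile(t2, 99)                     # simple empirical control limit
print("points above the control limit:", np.where(t2 > ucl)[0])
```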

Modeling of Nuclear Power Plant Steam Generator using Neural Networks (신경회로망을 이용한 원자력발전소 증기발생기의 모델링)

  • 이재기;최진영
    • Journal of Institute of Control, Robotics and Systems / v.4 no.4 / pp.551-560 / 1998
  • This paper presents a neural network model representing the complex hydro-thermo-dynamic characteristics of a steam generator in a nuclear power plant. The key modeling steps are gathering training data, analyzing the system dynamics and determining the neural network structure, training, and finally validating the trained model. We suggest a method for gathering training data from an unstable steam generator so that the data sufficiently represent the dynamic characteristics of the plant over a wide operating range. In addition, we define the inputs and outputs of the neural network model by analyzing the system dimension, relative degree, and inputs/outputs of the plant. Several types of neural networks are applied to the modeling and training process. The trained networks are verified using a class of test data, and their performances are discussed.
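
A minimal NARX-style sketch of the input/output structure described above, in which a network predicts the next plant output from windows of past inputs and outputs; the signal names, toy dynamics, window length, and network size are assumptions, not the paper's identified model.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

def make_dataset(u, y, lags=3):
    """Stack [y[t-lags..t-1], u[t-lags..t-1]] as features for predicting y[t]."""
    X, target = [], []
    for t in range(lags, len(y)):
        X.append(np.concatenate([y[t - lags:t], u[t - lags:t]]))
        target.append(y[t])
    return np.array(X), np.array(target)

rng = np.random.default_rng(0)
u = rng.normal(size=2000)                       # e.g. feedwater flow command (toy)
y = np.zeros(2000)
for t in range(1, 2000):                        # toy first-order dynamics stand-in
    y[t] = 0.9 * y[t - 1] + 0.1 * u[t - 1]

X, target = make_dataset(u, y)
model = MLPRegressor(hidden_layer_sizes=(16,), max_iter=2000, random_state=0)
model.fit(X[:1500], target[:1500])
print("validation R^2:", model.score(X[1500:], target[1500:]))
```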
