• Title/Summary/Keyword: data pre-processing

A Proposal of Remaining Useful Life Prediction Model for Turbofan Engine based on k-Nearest Neighbor (k-NN을 활용한 터보팬 엔진의 잔여 유효 수명 예측 모델 제안)

  • Kim, Jung-Tae;Seo, Yang-Woo;Lee, Seung-Sang;Kim, So-Jung;Kim, Yong-Geun
    • Journal of the Korea Academia-Industrial cooperation Society / v.22 no.4 / pp.611-620 / 2021
  • The maintenance industry has progressed from corrective maintenance and preventive maintenance toward condition-based maintenance. In condition-based maintenance, maintenance is performed at the optimum time based on the condition of the equipment. To find the optimal maintenance point, it is important to accurately understand the condition of the equipment, especially its remaining useful life (RUL). Thus, using simulation data (C-MAPSS), a model is proposed to predict the remaining useful life of a turbofan engine. The modeling process consisted of pre-processing the C-MAPSS dataset, transforming it, and predicting the remaining useful life. Data pre-processing was performed through piecewise RUL, moving average filters, and standardization. The remaining useful life was predicted using principal component analysis and the k-NN method. To derive the optimal performance, the number of principal components and the number of neighbors for the k-NN method were determined through 5-fold cross validation. The validity of the prediction results was analyzed through a scoring function that accounts for the usefulness of early predictions and the unsuitability of late predictions. In addition, the usefulness of the RUL prediction model was demonstrated through comparison with the prediction performance of other neural network-based algorithms.
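
The pre-processing steps and the asymmetric scoring idea named in this abstract can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the RUL cap of 125 cycles, the window size, and the constants 13 and 10 (taken from the scoring function commonly used with C-MAPSS) are not given in the abstract, and the PCA and k-NN stages are omitted.

```python
import math

def piecewise_rul(n_cycles, cap=125):
    """RUL label for each cycle of a run-to-failure unit, clipped at `cap`.
    The cap value is an assumption; the abstract does not state it."""
    return [min(cap, n_cycles - t) for t in range(1, n_cycles + 1)]

def moving_average(values, window=3):
    """Trailing moving average; early samples average over what is available."""
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        out.append(sum(chunk) / len(chunk))
    return out

def standardize(values):
    """Rescale a sensor channel to zero mean and unit variance."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    std = math.sqrt(var) or 1.0  # guard against a constant channel
    return [(v - mean) / std for v in values]

def asymmetric_score(predicted, actual):
    """Penalize late predictions (predicted > actual) more than early ones.
    The constants 13 and 10 are assumed from common C-MAPSS practice."""
    total = 0.0
    for p, a in zip(predicted, actual):
        d = p - a
        total += math.exp(-d / 13) - 1 if d < 0 else math.exp(d / 10) - 1
    return total
```

With these definitions, overestimating the RUL by 10 cycles scores worse than underestimating it by 10 cycles, which matches the preference for early warnings described in the abstract.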

Implementation of CNN-based Classification Training Model for Unstructured Fashion Image Retrieval using Preprocessing with MASK R-CNN (비정형 패션 이미지 검색을 위한 MASK R-CNN 선형처리 기반 CNN 분류 학습모델 구현)

  • Seunga, Cho;Hayoung, Lee;Hyelim, Jang;Kyuri, Kim;Hyeon-Ji, Lee;Bong-Ki, Son;Jaeho, Lee
    • Journal of Korea Society of Industrial Information Systems / v.27 no.6 / pp.13-23 / 2022
  • In this paper, we propose an algorithm that classifies the detailed components of each fashion item to support unstructured data retrieval in the fashion field. With the COVID-19 environment, AI-based online shopping malls have been increasing. However, existing keyword search and personalized style recommendations based on user browsing behavior are limited in how accurately they can search unstructured data. In this study, images crawled from online shopping sites were pre-processed using Mask R-CNN, and the components of each fashion item were then classified with a CNN. The accuracy was 93.28% for the shirt collar, 98.10% for the shirt pattern, 91.73% for the 3-class jeans fit, 81.59% for the 4-class jeans fit, and 93.91% for the jeans color. For the decorative details, the accuracy was 91.20% for jeans washing and 92.96% for jeans damage.

Differences of Teachers and Students' Perceptions on Teaching Skills (교사의 수업전문성에 관한 교사와 학생의 인식 차이)

  • Lee, Okhwa
    • Korean Educational Research Journal / v.43 no.1 / pp.125-152 / 2022
  • The purpose of this study is to examine the differences between teachers' and students' perceptions of teaching skills. For the analysis, data were collected with the ICALT (International Comparative Analysis of Learning and Teaching) class observation tool and a student survey called the My Teacher Questionnaire (MTQ). The teacher and student data can be compared because the two tools share seven common domains (safe and stimulating learning climate, efficient organization, clear and structured instructions, intensive and activating teaching, adjusting instructions and learner processing to inter-learner differences, teaching learning strategies, and learner engagement). In 2016, in Daejeon, Chungbuk, and Chungnam, trained teachers collected data from 106 classes, and 2,866 students responded to the survey. The reliability and validity of the two tools, class observation and the MTQ, proved satisfactory for use in Korean schools. Students' perceptions of teaching were high, particularly when the students were in lower grades and learning major subjects such as English, Korean, and math. In the higher-level teaching skill domains, male students showed higher perceptions, while female students reported higher perceptions in the lower-level teaching skill domains. To compare the perceptions of teachers and students, the predictive reliability of student engagement against the teaching skill domains was used. Teachers showed higher predictive reliability in the lower-level teaching skill domains, while students showed higher predictive reliability in the higher-level teaching skill domains. Further study is recommended to develop a professional development model for pre-service teachers and school teachers using the class observation tool and the My Teacher Questionnaire.

The Noise Robust Algorithm to Detect the Starting Point of Music for Content Based Music Retrieval System (노이즈에 강인한 음악 시작점 검출 알고리즘)

  • Kim, Jung-Soo;Sung, Bo-Kyung;Koo, Kwang-Hyo;Ko, Il-Ju
    • Journal of the Korea Society of Computer and Information / v.14 no.9 / pp.95-104 / 2009
  • This paper proposes a noise-robust algorithm for detecting the starting point of music. Detecting the starting point is necessary to solve the problems of wasted computation and of retrieval comparisons against inconsistent input data in content-based music retrieval systems. It is even more necessary for the time-sequential retrieval method, which compares data in time order. Time-sequential retrieval has the advantage of being fast, because it performs a simple comparison in time order, but it has the disadvantage that the data to be compared must start at the same time. Digitized music, however, cannot guarantee identical starting times after bit rate conversion. Therefore, this paper detects the music starting point in the pre-processing stage of retrieval so that the recognition rate does not decrease even when high-speed time-sequential retrieval is applied. Starting point detection used a minimum wave model that can detect effective sound, and, for robustness against noise, the noise present in silent sections was swapped. The proposed algorithm was confirmed to perform about 38% better than retrieval without starting point detection, and its robustness against noise was verified.
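
To illustrate why a start-point detector needs some noise tolerance, here is a minimal sketch of a generic frame-energy threshold detector. It is not the paper's minimum wave model, whose details the abstract does not give; the function name, frame size, threshold, and the rule of requiring several consecutive loud frames are all assumptions made for illustration.

```python
def find_start(samples, frame=4, threshold=0.1, min_frames=2):
    """Return the sample index where the first run of `min_frames` consecutive
    loud frames begins, or None if no such run exists.
    A frame is 'loud' when its mean absolute amplitude exceeds `threshold`.
    Requiring several loud frames in a row ignores short noise bursts."""
    run = 0
    n_frames = len(samples) // frame
    for f in range(n_frames):
        chunk = samples[f * frame:(f + 1) * frame]
        level = sum(abs(s) for s in chunk) / frame
        if level > threshold:
            run += 1
            if run == min_frames:
                return (f - min_frames + 1) * frame
        else:
            run = 0  # an isolated burst does not count as the start
    return None
```

Trimming two recordings at the index returned by such a detector gives them a common starting point, which is what a time-ordered comparison requires.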

Construction and Application of Intelligent Decision Support System through Defense Ontology - Application example of Air Force Logistics Situation Management System (국방 온톨로지를 통한 지능형 의사결정지원시스템 구축 및 활용 - 공군 군수상황관리체계 적용 사례)

  • Jo, Wongi;Kim, Hak-Jin
    • Journal of Intelligence and Information Systems / v.25 no.2 / pp.77-97 / 2019
  • The large amount of data that emerges from the hyper-connected environment of the Fourth Industrial Revolution is a major factor that distinguishes the Fourth Industrial Revolution from the existing production environment. This environment is two-sided in that it produces data while using it, and the data so produced creates further value. Due to the massive scale of data, future information systems need to process more data in terms of quantity than existing information systems. In addition, in terms of quality, only a large amount of data, Ability is required. In a small-scale information system, it is possible for a person to accurately understand the system and obtain the necessary information, but in a variety of complex systems where it is difficult to understand the system accurately, it becomes increasingly difficult to acquire the desired information. In other words, more accurate processing of large amounts of data has become a basic condition for future information systems. This problem, which concerns the efficient performance of the information system, can be solved by building a semantic web that enables various kinds of information processing by expressing the collected data as an ontology that can be understood not only by people but also by computers. For example, as in most other organizations, IT has been introduced in the military, and most of the work is now done through information systems. As existing systems contain increasingly large amounts of data, efforts are needed to make the systems easier to use through better data utilization. An ontology-based system has a large semantic data network through its connections with other systems, has a wide range of databases that can be utilized, and has the advantage of searching more precisely and quickly through relationships between predefined concepts.
In this paper, we propose a defense ontology as a method for effective data management and decision support. In order to judge the applicability and effectiveness of the actual system, we reconstructed the existing air force munitions situation management system as an ontology based system. It is a system constructed to strengthen management and control of logistics situation of commanders and practitioners by providing real - time information on maintenance and distribution situation as it becomes difficult to use complicated logistics information system with large amount of data. Although it is a method to take pre-specified necessary information from the existing logistics system and display it as a web page, it is also difficult to confirm this system except for a few specified items in advance, and it is also time-consuming to extend the additional function if necessary And it is a system composed of category type without search function. Therefore, it has a disadvantage that it can be easily utilized only when the system is well known as in the existing system. The ontology-based logistics situation management system is designed to provide the intuitive visualization of the complex information of the existing logistics information system through the ontology. In order to construct the logistics situation management system through the ontology, And the useful functions such as performance - based logistics support contract management and component dictionary are further identified and included in the ontology. In order to confirm whether the constructed ontology can be used for decision support, it is necessary to implement a meaningful analysis function such as calculation of the utilization rate of the aircraft, inquiry about performance-based military contract. 
Especially, in contrast to building ontology database in ontology study in the past, in this study, time series data which change value according to time such as the state of aircraft by date are constructed by ontology, and through the constructed ontology, It is confirmed that it is possible to calculate the utilization rate based on various criteria as well as the computable utilization rate. In addition, the data related to performance-based logistics contracts introduced as a new maintenance method of aircraft and other munitions can be inquired into various contents, and it is easy to calculate performance indexes used in performance-based logistics contract through reasoning and functions. Of course, we propose a new performance index that complements the limitations of the currently applied performance indicators, and calculate it through the ontology, confirming the possibility of using the constructed ontology. Finally, it is possible to calculate the failure rate or reliability of each component, including MTBF data of the selected fault-tolerant item based on the actual part consumption performance. The reliability of the mission and the reliability of the system are calculated. In order to confirm the usability of the constructed ontology-based logistics situation management system, the proposed system through the Technology Acceptance Model (TAM), which is a representative model for measuring the acceptability of the technology, is more useful and convenient than the existing system.
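
The quantities this abstract says can be derived from the ontology (utilization rate, failure rate, and reliability from MTBF) follow standard formulas. The sketch below computes them in plain Python rather than through ontology reasoning, assumes a constant failure rate (exponential model) and a series system, and defines utilization as the fraction of days in an available state; none of these choices is confirmed by the abstract, and the state names are hypothetical.

```python
import math

def utilization_rate(daily_states, available="available"):
    """Fraction of days on which the aircraft was in the `available` state.
    The state label and this definition are assumptions for illustration."""
    return sum(1 for s in daily_states if s == available) / len(daily_states)

def failure_rate(mtbf_hours):
    """Constant failure rate implied by an MTBF (exponential model assumed)."""
    return 1.0 / mtbf_hours

def reliability(mission_hours, mtbf_hours):
    """Probability of surviving a mission of the given length: exp(-t / MTBF)."""
    return math.exp(-mission_hours / mtbf_hours)

def series_reliability(component_reliabilities):
    """System reliability when every component must work (series assumption)."""
    result = 1.0
    for r in component_reliabilities:
        result *= r
    return result
```

For example, a component with an MTBF of 200 hours has a failure rate of 0.005 per hour under this model, and its reliability over a 200-hour mission is exp(-1).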

Conditional Generative Adversarial Network based Collaborative Filtering Recommendation System (Conditional Generative Adversarial Network(CGAN) 기반 협업 필터링 추천 시스템)

  • Kang, Soyi;Shin, Kyung-shik
    • Journal of Intelligence and Information Systems / v.27 no.3 / pp.157-173 / 2021
  • With the development of information technology, the amount of available information increases daily. However, having access to so much information makes it difficult for users to easily find the information they seek. Users want a visualized system that reduces information retrieval and learning time, saving them from personally reading and judging all available information. As a result, recommendation systems are increasingly important technologies that are essential to business. Collaborative filtering is used in various fields with excellent performance because recommendations are made based on similar user interests and preferences. However, limitations do exist. Sparsity occurs when user-item preference information is insufficient, and is the main limitation of collaborative filtering. The evaluation values in the user-item matrix may be distorted depending on the popularity of the product, or there may be new users who have not yet rated any items. The lack of historical data to identify consumer preferences is referred to as data sparsity, and various methods have been studied to address this problem. However, most attempts to solve the sparsity problem are not optimal because they can only be applied when additional data such as users' personal information, social networks, or characteristics of items are included. Another problem is that real-world score data are mostly biased toward high scores, resulting in severe imbalances. One cause of this imbalanced distribution is purchasing bias: only users who rate a product highly purchase it, so those with low ratings are less likely to purchase products and thus do not leave negative product reviews. Due to these characteristics, unlike most users' actual preferences, reviews by users who purchase products are more likely to be positive.
Therefore, because of its biased characteristics, actual rating data is over-learned in the frequently occurring classes, distorting the market. Applying collaborative filtering to such imbalanced data leads to poor recommendation performance due to excessive learning of the biased classes. Traditional oversampling techniques for this problem are likely to cause overfitting because they repeat the same data, which acts as noise in learning and reduces recommendation performance. In addition, most existing pre-processing methods for data imbalance are designed and used for binary classes. Binary class imbalance techniques are difficult to apply to multi-class problems because they cannot model multi-class situations such as objects at cross-class boundaries or objects overlapping multiple classes. To solve this problem, research has been conducted on converting multi-class problems into binary class problems. However, simplifying multi-class problems can cause potential classification errors when the results are combined with those of classifiers learned from other sub-problems, resulting in the loss of important information about relationships beyond the selected items. Therefore, it is necessary to develop more effective methods to address multi-class imbalance problems. We propose a collaborative filtering model that uses CGAN to generate realistic virtual data to populate the empty user-item matrix. The conditional vector y identifies the distributions of the minority classes, and data reflecting their characteristics are generated. Collaborative filtering then maximizes the performance of the recommendation system via hyperparameter tuning. This process should improve the accuracy of the model by addressing the sparsity problem of collaborative filtering implementations while mitigating the data imbalances arising from real data. Our model has superior recommendation performance over existing oversampling techniques and over existing real-world data with data sparsity.
SMOTE, Borderline SMOTE, SVM-SMOTE, ADASYN, and GAN were used as comparative models and we demonstrate the highest prediction accuracy on the RMSE and MAE evaluation scales. Through this study, oversampling based on deep learning will be able to further refine the performance of recommendation systems using actual data and be used to build business recommendation systems.
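
The two evaluation measures named above, RMSE and MAE, can be computed as in the short sketch below. The function names and the sample ratings in the usage note are illustrative only and are not taken from the paper.

```python
import math

def mae(predicted, actual):
    """Mean absolute error between predicted and actual ratings."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)

def rmse(predicted, actual):
    """Root mean squared error; penalizes large errors more heavily than MAE."""
    return math.sqrt(
        sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)
    )
```

For hypothetical predicted ratings [4, 3, 5] against actual ratings [5, 3, 3], the MAE is 1.0 and the RMSE is the square root of 5/3. Lower values indicate better prediction on both measures.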

Measuring the Public Service Quality Using Process Mining: Focusing on N City's Building Licensing Complaint Service (프로세스 마이닝을 이용한 공공서비스의 품질 측정: N시의 건축 인허가 민원 서비스를 중심으로)

  • Lee, Jung Seung
    • Journal of Intelligence and Information Systems / v.25 no.4 / pp.35-52 / 2019
  • As public services are provided in various forms, including e-government, the level of public demand for public service quality is increasing. Although continuous measurement and improvement of the quality of public services is needed to improve the quality of public services, traditional surveys are costly and time-consuming and have limitations. Therefore, there is a need for an analytical technique that can measure the quality of public services quickly and accurately at any time based on the data generated from public services. In this study, we analyzed the quality of public services based on data using process mining techniques for civil licensing services in N city. It is because the N city's building license complaint service can secure data necessary for analysis and can be spread to other institutions through public service quality management. This study conducted process mining on a total of 3678 building license complaint services in N city for two years from January 2014, and identified process maps and departments with high frequency and long processing time. According to the analysis results, there was a case where a department was crowded or relatively few at a certain point in time. In addition, there was a reasonable doubt that the increase in the number of complaints would increase the time required to complete the complaints. According to the analysis results, the time required to complete the complaint was varied from the same day to a year and 146 days. The cumulative frequency of the top four departments of the Sewage Treatment Division, the Waterworks Division, the Urban Design Division, and the Green Growth Division exceeded 50% and the cumulative frequency of the top nine departments exceeded 70%. Higher departments were limited and there was a great deal of unbalanced load among departments. Most complaint services have a variety of different patterns of processes. 
Research shows that the number of 'complementary' decisions has the greatest impact on the length of a complaint. This is interpreted as a lengthy period until the completion of the entire complaint is required because the 'complement' decision requires a physical period in which the complainant supplements and submits the documents again. In order to solve these problems, it is possible to drastically reduce the overall processing time of the complaints by preparing thoroughly before the filing of the complaints or in the preparation of the complaints, or the 'complementary' decision of other complaints. By clarifying and disclosing the cause and solution of one of the important data in the system, it helps the complainant to prepare in advance and convinces that the documents prepared by the public information will be passed. The transparency of complaints can be sufficiently predictable. Documents prepared by pre-disclosed information are likely to be processed without problems, which not only shortens the processing period but also improves work efficiency by eliminating the need for renegotiation or multiple tasks from the point of view of the processor. The results of this study can be used to find departments with high burdens of civil complaints at certain points of time and to flexibly manage the workforce allocation between departments. In addition, as a result of analyzing the pattern of the departments participating in the consultation by the characteristics of the complaints, it is possible to use it for automation or recommendation when requesting the consultation department. In addition, by using various data generated during the complaint process and using machine learning techniques, the pattern of the complaint process can be found. It can be used for automation / intelligence of civil complaint processing by making this algorithm and applying it to the system. 
This study is expected to be used to suggest future public service quality improvement through process mining analysis on civil service.

A Flow Control Scheme based on Queue Priority (큐의 우선순위에 근거한 흐름제어방식)

  • Lee, Gwang-Jun;Son, Ji-Yeon;Son, Chang-Won
    • The Transactions of the Korea Information Processing Society / v.4 no.1 / pp.237-245 / 1997
  • In this paper, a flow control mechanism is proposed that is based on priority control among the communication paths of a node. In this scheme, a demanded queue length is pre-defined for each path, and each node on that path is forced to keep its buffer size under that limit by controlling the priority level of the path. A communication path that requires higher bandwidth sets its demanded queue length smaller. By relating the priority of a path to the length of its queue, a path requesting high bandwidth has a better chance of obtaining it by defining a smaller demanded queue size. Also, by forcing a path with a high flow rate to maintain a small queue along the communication path, the scheme keeps the transmission delay of the path small. The demanded queue size of a path is regularly adjusted to meet the application's requirements and the load status of the network during the lifetime of the communication. Priority control based on the demanded queue size is provided in the intermediate nodes as well as the end nodes. Because this flow control can react more quickly than end-to-end flow control, it provides a performance advantage, especially for high-speed networks.
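
One way to picture the relationship between a path's demanded queue length and its priority is the sketch below. The rule used here, serving the path whose current queue length is largest relative to its demanded length, is an assumption made for illustration; the abstract does not specify the exact priority function, and the `Path` class and `serve_one` function are hypothetical names.

```python
from collections import deque

class Path:
    """A communication path through a node, with its own packet queue."""
    def __init__(self, name, demanded_len):
        self.name = name
        self.demanded_len = demanded_len  # smaller value = more bandwidth requested
        self.queue = deque()

    def priority(self):
        # Assumed rule: priority grows as the queue exceeds its demanded length.
        return len(self.queue) / self.demanded_len

def serve_one(paths):
    """Transmit one packet from the non-empty path with the highest priority.
    Returns (path name, packet), or None if every queue is empty."""
    candidates = [p for p in paths if p.queue]
    if not candidates:
        return None
    chosen = max(candidates, key=Path.priority)
    return chosen.name, chosen.queue.popleft()
```

Under this rule, two paths holding the same number of packets are not treated equally: the one with the smaller demanded queue length is served first, so its queue and its delay stay short.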

Axial Load Capacity Prediction of Single Piles in Clay and Sand Layers Using Nonlinear Load Transfer Curves (비선형 하중전이법에 의한 점토 및 모래층에서 파일의 지지력 예측)

  • Kim, Hyeongjoo;Mission, Joseleo;Song, Youngsun;Ban, Jaehong;Baeg, Pilsoon
    • Journal of the Korean GEO-environmental Society / v.9 no.5 / pp.45-52 / 2008
  • The present study has extended OpenSees, an open-source software framework (a DOS program) for developing applications that idealize geotechnical and structural problems, to the static analysis of the axial load capacity and settlement of single piles in the MS Windows environment. The Windows version of OpenSees developed in this study turns the DOS version from a general-purpose program into a special-purpose program for driven and bored pile analysis, with additional pre-processing and post-processing features and a user-friendly graphical interface. The load capacity analysis uses a numerical method based on load transfer functions combined with finite elements. The use of empirical nonlinear T-z and Q-z load transfer curves to model soil-pile interaction in skin friction and end bearing, respectively, has been shown to capture the nonlinear soil-pile response under settlement due to load. Validation studies have shown that the static load capacity and settlement predictions implemented in this study are in fair agreement with reference data from static loading tests.

Hardware Design of High Performance HEVC Deblocking Filter for UHD Videos (UHD 영상을 위한 고성능 HEVC 디블록킹 필터 설계)

  • Park, Jaeha;Ryoo, Kwangki
    • Journal of the Korea Institute of Information and Communication Engineering / v.19 no.1 / pp.178-184 / 2015
  • This paper proposes a hardware architecture for a high-performance deblocking filter (DBF) in High Efficiency Video Coding (HEVC) for UHD (Ultra High Definition) videos. To reduce processing time, the proposed hardware uses a 4-stage pipelined architecture with two filters and a parallel boundary strength module. The proposed filter can also be used in low-voltage designs because it applies clock gating in the 4-stage pipeline. The segmented memory architecture solves the hazard issue that arises when single-port SRAM is accessed. The proposed filtering order shortens the delay that arises when data are stored into the single-port SRAM at the pre-processing stage. The DBF hardware proposed in this paper was designed in Verilog HDL and, when synthesized with the TSMC 0.18um CMOS standard cell library, was implemented with 22k logic gates. Furthermore, the design can process UHD 8K ($7680{\times}4320$) samples at 60 fps at an operating frequency of 150MHz, and the maximum operating frequency is 285MHz. The analysis shows that the number of operation cycles the proposed DBF hardware needs to process one coding unit is improved by 32% over the previous design.
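
The throughput claim can be put in perspective with a short back-of-the-envelope calculation. The sketch below counts only the $7680{\times}4320$ samples per frame quoted in the abstract and ignores chroma planes and pipeline stalls, so it is a rough lower bound under those assumptions rather than a model of the actual design; the function names are hypothetical.

```python
def samples_per_second(width, height, fps):
    """Number of samples that must be filtered each second."""
    return width * height * fps

def samples_per_cycle(width, height, fps, clock_hz):
    """Average samples the filter must handle per clock cycle to keep up."""
    return samples_per_second(width, height, fps) / clock_hz

rate = samples_per_second(7680, 4320, 60)             # 1,990,656,000 samples/s
per_cycle = samples_per_cycle(7680, 4320, 60, 150e6)  # about 13.27 samples/cycle
```

At 150MHz the filter therefore has to sustain roughly 13 samples per clock cycle on average, which is consistent with the abstract's emphasis on pipelining and parallel filter units.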