• Title/Summary/Keyword: data pre-processing

Improvement of Environmental Sounds Recognition by Post Processing (후처리를 이용한 환경음 인식 성능 개선)

  • Park, Jun-Qyu;Baek, Seong-Joon
    • The Journal of the Korea Contents Association / v.10 no.7 / pp.31-39 / 2010
  • In this study, we prepared real environmental sound data sets, arising from people's movement, comprising 9 different environment types. The environmental sounds are pre-processed with pre-emphasis and a Hamming window, and MFCC (Mel-Frequency Cepstral Coefficients) features are then extracted for the classification experiments. A GMM (Gaussian Mixture Model) classifier without post-processing tends to yield abruptly changing classification results because it does not consider the results of neighboring frames. We therefore propose post-processing methods that suppress abruptly changing classification results by taking the probability or the rank of the neighboring frames into account. According to the experimental results, the method using the probability of neighboring frames improves recognition performance by more than 10% compared with the method without post-processing.
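
The abstract above does not give the exact post-processing rule, so the following is only a minimal sketch of the general idea: instead of choosing each frame's class from its own likelihoods, pool the per-class log-likelihoods over a window of neighboring frames and choose from the pooled values. The function name `smooth_frame_labels`, the `half_window` parameter, the summation rule, and the toy numbers are all assumptions made for illustration, not the authors' method.

```python
from typing import List, Sequence


def smooth_frame_labels(
    log_likes: Sequence[Sequence[float]], half_window: int = 2
) -> List[int]:
    """Pick one class per frame from log-likelihoods pooled over nearby frames.

    log_likes[t][c] is the (hypothetical) classifier log-likelihood of
    class c at frame t. Pooling by summation is an assumed rule.
    """
    n_frames = len(log_likes)
    labels = []
    for t in range(n_frames):
        # Window of neighboring frames, clipped at the sequence edges.
        lo = max(0, t - half_window)
        hi = min(n_frames, t + half_window + 1)
        n_classes = len(log_likes[t])
        # Sum each class's log-likelihood over the window.
        pooled = [
            sum(log_likes[k][c] for k in range(lo, hi)) for c in range(n_classes)
        ]
        labels.append(max(range(n_classes), key=pooled.__getitem__))
    return labels


# Toy data: two classes, five frames; frame 2 is an isolated outlier.
frames = [[-1.0, -3.0], [-1.0, -3.0], [-2.5, -2.0], [-1.0, -3.0], [-1.0, -3.0]]
raw = [max(range(2), key=f.__getitem__) for f in frames]  # frame-by-frame decision
smoothed = smooth_frame_labels(frames, half_window=1)
print(raw, smoothed)
```

In this toy input the frame-by-frame decision flips to class 1 at the outlier frame, while the pooled decision stays on class 0, which is the kind of abrupt change the abstract says post-processing is meant to suppress.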

A Case Study of Basic Data Science Education using Public Big Data Collection and Spreadsheets for Teacher Education (교사교육을 위한 공공 빅데이터 수집 및 스프레드시트 활용 기초 데이터과학 교육 사례 연구)

  • Hur, Kyeong
    • Journal of The Korean Association of Information Education / v.25 no.3 / pp.459-469 / 2021
  • This paper presents a case study of basic data science practice education for field teachers and pre-service teachers. Spreadsheet software was used as the data collection and analysis tool for this basic data science education. Training then covered statistics for data processing, predictive hypotheses, and predictive model verification. In addition, an educational case was proposed for collecting and processing thousands of items of public big data and verifying population prediction hypotheses and prediction models. A 34-hour, 17-week curriculum using a spreadsheet tool was presented covering this basic data science content. As a tool for data collection, processing, and analysis, spreadsheets, unlike Python, do not carry the burden of learning programming languages and data structures, and they have the advantage of letting learners study the theory of processing and analyzing qualitative and quantitative data visually. As a result of this educational case study, three predictive hypothesis test cases were presented and analyzed. First, quantitative public data were collected to verify a hypothesis predicting a difference in the mean value between groups of the population. Second, qualitative public data were collected to verify a hypothesis predicting an association within the qualitative data of the population. Third, quantitative public data were collected to verify a regression prediction model based on a hypothesis predicting a correlation within the quantitative data of the population. Finally, the effectiveness of this educational case for data science education was analyzed through a satisfaction analysis of pre-service and field teachers.

Point Cloud Classification Method for Mountainous Area (산악지역 점군자료 분류기법 연구)

  • Choi, Yun-Woong;Lee, Geun-Sang;Cho, Gi-Sung
    • Proceedings of the Korean Society of Surveying, Geodesy, Photogrammetry, and Cartography Conference / 2010.04a / pp.387-388 / 2010
  • There is not yet a generalized and systematic method of data pre-processing for point cloud classification, even though many previous studies have proposed filters such as the local maxima filter, the morphology filter, and the slope-based filter. The main focus of this study is to present a method for classifying bare-ground information from LiDAR data for mountainous areas.

Development of High Fidelity Supersonic Flow Air Data Processing Algorithm (고 신뢰도 초고속 공기 유동 데이터 처리 알고리즘 개발)

  • Choi, Jong-Ho;Yoon, Hyun-Gull
    • Journal of the Korean Society of Propulsion Engineers / v.14 no.2 / pp.54-62 / 2010
  • This paper describes the development of a high-fidelity air data processing algorithm that can be applied to an air data system for a high-speed aerial vehicle. Unlike previous air data systems, the current algorithm uses several pre-determined pressure data obtained with a computational fluid dynamics approach, without using total pressures, while providing sufficient sensor redundancy and fault detection ability. The algorithm was verified using the commercial software Matlab and Simulink.

Introduction of Acquisition System, Processing System and Distributing Service for Geostationary Ocean Color Imager (GOCI) Data (정지궤도 해색탑재체(GOCI) 데이터의 수신.처리 시스템과 배포 서비스)

  • Yang, Chan-Su;Bae, Sang-Soo;Han, Hee-Jeong;Ahn, Yu-Hwan;Ryu, Joo-Hyung;Han, Tai-Hyun;Yoo, Hong-Rhyong
    • Korean Journal of Remote Sensing / v.26 no.2 / pp.263-275 / 2010
  • KOSC (Korea Ocean Satellite Center), the primary operational organization for GOCI (Geostationary Ocean Color Imager), was established at KORDI (Korea Ocean Research & Development Institute). To provide a stable distribution service for GOCI data, the following systems were installed at KOSC: GOCI Data Acquisition System, Image Pre-processing System, GOCI Data Processing System, GOCI Data Distribution System, Data Management System, Total Management & Control System, and External Data Exchange System. KOSC distributes GOCI data to users 8 times at 1-hour intervals during the daytime, in near-real time, according to the distribution policy. Finally, we introduce the KOSC website, where users can search, request, and download GOCI data.

A Prediction Model for Low Cycle and High Cycle Fatigue Lives of Pre-strained Fe-18Mn TWIP Steel (Fe-18Mn TWIP강의 Pre-strain에 따른 저주기 및 고주기 피로 수명 예측 모델)

  • Kim, Y.W.;Lee, C.S.
    • Transactions of Materials Processing / v.19 no.1 / pp.11-16 / 2010
  • The influence of pre-strain on the low-cycle fatigue behavior of Fe-18Mn-0.05Al-0.6C TWIP steel was studied by conducting axial strain-controlled tests. As-received plates were deformed by rolling with reduction ratios of 10% and 30%. A triangular waveform with a constant frequency of 1 Hz was employed for the low-cycle fatigue tests at total strain amplitudes in the range of $\pm 0.4$ to $\pm 0.6$ pct. The results showed that low-cycle fatigue life depended strongly on the amount of pre-strain as well as on the strain amplitude. With increasing pre-strain, the number of reversals to failure decreased significantly at high strain amplitudes, but the effect was negligible at low strain amplitudes. A new model for predicting the fatigue life of a pre-strained body is suggested, which adds $\Delta E_{\text{pre-strain}}$ to the energy-based fatigue damage parameter. In addition, high-cycle fatigue lives predicted from the low-cycle fatigue data agreed well with the experimental ones.

Survey on the use of pre-processed food materials in school foodservices in the Kyunggi area (경기지역 학교급식소에서 전처리 식재료의 이용에 대한 실태 조사 및 중요도·수행도 평가)

  • Lee, Seung-Mi;Lee, Seung-Joo
    • Korean journal of food and cookery science / v.22 no.5 s.95 / pp.553-564 / 2006
  • This study was conducted to investigate the use and acceptability of pre-processed food materials in school foodservice. Self-administered questionnaires were collected from 81 schools in the Kyunggi area. Statistical data analysis was completed using the SPSS v. 10.0 program. Eighty-one school dietitians from 31 elementary, 31 middle, and 19 high schools participated in the survey. Most of the subjects (over 95%) understood that it is necessary to use pre-processed foods, and they considered food hygiene the most important factor. The percentages of school foodservices that purchased and used pre-processed foods were: 82.7% for cabbage, 86.4% for onion, 72.8% for carrot, 97% for garlic, 82.7% for potato, and over 90% for meats and fishes. Dietitians were most satisfied with the performance of ‘trash reduction’ and ‘saving cooking time’ when using pre-processed food materials. ‘Appearance’, ‘freshness’, ‘hygiene’, ‘nutrition’, and ‘specialty of the food-processing company’ were the aspects of most concern when purchasing and using pre-processed food materials.

Imputation of Medical Data Using Subspace Condition Order Degree Polynomials

  • Silachan, Klaokanlaya;Tantatsanawong, Panjai
    • Journal of Information Processing Systems / v.10 no.3 / pp.395-411 / 2014
  • Temporal medical data is often collected during patient treatments that require personal analysis. Each observation recorded in the temporal medical data is associated with measurements and treatment times. A major problem in the analysis of temporal medical data is missing values, caused, for example, by patients dropping out of a study before completion. The imputation of missing data is therefore an important pre-processing step and can provide useful information before the data is mined. For each patient and each variable, this imputation replaces the missing data with a value drawn from an estimated distribution of that variable. In this paper, we propose a new method, called Newton's finite divided difference polynomial interpolation with condition order degree, for dealing with missing values in temporal medical data related to obesity. We compared the new imputation method with three existing subspace estimation techniques: the k-nearest neighbor, local least squares, and natural cubic spline approaches. The performance of each approach was then evaluated using the normalized root mean square error and statistical significance test results. The experimental results demonstrate that the proposed method provides the best fit with the smallest error and is more accurate than the other methods.
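
For readers unfamiliar with the interpolation the abstract above builds on, here is a minimal sketch of plain Newton divided-difference interpolation used to fill one missing time point. It does not reproduce the paper's "condition order degree" selection, which the abstract does not describe; the function names and the toy measurements are hypothetical.

```python
from typing import List, Sequence


def divided_differences(xs: Sequence[float], ys: Sequence[float]) -> List[float]:
    """Return the Newton coefficients f[x0], f[x0,x1], ..., f[x0..xn]."""
    coef = list(ys)
    n = len(xs)
    for j in range(1, n):
        # Update in place from the end so lower-order entries are still available.
        for i in range(n - 1, j - 1, -1):
            coef[i] = (coef[i] - coef[i - 1]) / (xs[i] - xs[i - j])
    return coef


def newton_eval(xs: Sequence[float], coef: Sequence[float], x: float) -> float:
    """Evaluate the Newton-form polynomial at x using Horner-like nesting."""
    result = coef[-1]
    for i in range(len(coef) - 2, -1, -1):
        result = result * (x - xs[i]) + coef[i]
    return result


# Toy series: a measurement observed at times 0, 1 and 3; time 2 is missing.
times = [0.0, 1.0, 3.0]
values = [1.0, 3.0, 13.0]
coef = divided_differences(times, values)
imputed = newton_eval(times, coef, 2.0)
print(coef, imputed)
```

The toy values lie on the quadratic x² + x + 1, so the interpolating polynomial through the three observed points recovers 7.0 at the missing time. With real patient series, a high-degree polynomial through many points can oscillate, which is presumably one reason a method would constrain the order; that interpretation is an assumption, not a claim from the abstract.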

Efficient Data Management for Finite Element Analysis with Pre-Post Processing of Large Structures (전-후 처리 과정을 포함한 거대 구조물의 유한요소 해석을 위한 효율적 데이터 구조)

  • 박시형;박진우;윤태호;김승조
    • Proceedings of the Computational Structural Engineering Institute Conference / 2004.04a / pp.389-395 / 2004
  • We consider the interface between a parallel distributed-memory multifrontal solver and the finite element method. We describe in detail the requirements and the data structure of the parallel FEM interface, which includes the element data and the node array. The full procedure for solving a large-scale structural problem is assumed to include pre- and post-processors, whose algorithms are not considered in this paper. The main advantage of implementing the parallel FEM interface appears when a distributed-memory system with a large number of processors is used to solve a very large-scale problem. The memory efficiency and the effect on performance are examined by analyzing some examples on the Pegasus cluster system.

The Study of Failure Mode Data Development and Feature Parameter's Reliability Verification Using LSTM Algorithm for 2-Stroke Low Speed Engine for Ship's Propulsion (선박 추진용 2행정 저속엔진의 고장모드 데이터 개발 및 LSTM 알고리즘을 활용한 특성인자 신뢰성 검증연구)

  • Jae-Cheul Park;Hyuk-Chan Kwon;Chul-Hwan Kim;Hwa-Sup Jang
    • Journal of the Society of Naval Architects of Korea / v.60 no.2 / pp.95-109 / 2023
  • In the 4th industrial revolution, changes in the technological paradigm have had a direct impact on the maintenance system of ships. The 2-stroke low-speed engine system is integrated with the core equipment required for propulsive power. Condition Based Management (CBM) is defined as a technology that applies predictive maintenance methods to existing calendar-based or running-time-based maintenance systems by monitoring the condition of machinery and performing failure diagnosis and prognosis. In this study, we established our own framework for CBM technology development, carried out engineering-based failure analysis, data development and management, and data feature analysis and pre-processing, and verified the reliability of the failure mode DB using LSTM algorithms. We developed various simulated failure mode scenarios for the 2-stroke low-speed engine and produced data on onshore test beds. The normal and abnormal status data acquired through the failure mode simulation experiments were analyzed and pre-processed with various Exploratory Data Analysis (EDA) techniques to extract not only data on the performance and efficiency of the 2-stroke low-speed engine but also key feature data using multivariate statistical analysis. In addition, by developing an LSTM classification algorithm, we sought to verify the reliability of various failure mode data with time-series characteristics.