• Title/Summary/Keyword: data pre-processing

A Clustering Algorithm Considering Structural Relationships of Web Contents

  • Kang Hyuncheol;Han Sang-Tae;Sun Young-Su
    • Communications for Statistical Applications and Methods / v.12 no.1 / pp.191-197 / 2005
  • The application of data mining techniques to the world wide web, referred to as web mining, has been the focus of much recent research. With the explosive growth of information sources available on the world wide web, it has become increasingly necessary to track and analyze their usage patterns. In this study, we introduce a procedure for pre-processing and cluster analysis of web log data and propose a distance measure that accounts for the structural relationships between web contents. We also present some real examples of cluster analysis of web log data and examine the practical application of web usage mining to eCRM.

GPS phase measurement cycle-slip detection based on a new wavelet function

  • Zuoya, Zheng;Xiushan, Lu;Xinzhou, Wang;Chuanfa, Chen
    • Proceedings of the Korean Institute of Navigation and Port Research Conference / v.2 / pp.91-96 / 2006
  • At present, many cycle-slip detection methods difference two adjacent points, which is inherently a simple form of wavelet analysis. We put forward a new idea in which the number of points spanned by the difference can be adjusted by a parameter factor, and we study this method for smoothing raw data and detecting cycle slips with wavelet analysis. Taking CHAMP satellite data as an example, we reach some significant conclusions. The results show that cycle-slip detection in GPS phase measurements based on this wavelet function is valid and helps improve the precision of GPS data pre-processing and positioning.
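
The idea of differencing over an adjustable number of points can be illustrated with a minimal sketch. It is not the wavelet function proposed in the paper: it uses a plain lagged difference and a median-based threshold as a stand-in, and the names `lagged_differences`, `flag_cycle_slips`, `gap`, and `threshold` are hypothetical.

```python
from statistics import median


def lagged_differences(phase, gap):
    """Difference the phase series over `gap` points (gap=1 is adjacent differencing)."""
    return [phase[i + gap] - phase[i] for i in range(len(phase) - gap)]


def flag_cycle_slips(phase, gap=1, threshold=0.5):
    """Return start indices whose lagged difference deviates from the median
    difference by more than `threshold` (an assumed value, in cycles)."""
    diffs = lagged_differences(phase, gap)
    typical = median(diffs)
    return [i for i, d in enumerate(diffs) if abs(d - typical) > threshold]


if __name__ == "__main__":
    # Synthetic phase: a smooth ramp with a 5-cycle jump starting at index 10.
    phase = [0.1 * i + (5.0 if i >= 10 else 0.0) for i in range(20)]
    print(flag_cycle_slips(phase, gap=1))
    print(flag_cycle_slips(phase, gap=3))
```

With a larger `gap`, every difference that straddles the jump is flagged, so a single slip shows up as a run of consecutive indices rather than one point.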

Trend of Utilization of Machine Learning Technology for Digital Healthcare Data Analysis (디지털 헬스케어 데이터 분석을 위한 머신 러닝 기술 활용 동향)

  • Woo, Y.C.;Lee, S.Y.;Choi, W.;Ahn, C.W.;Baek, O.K.
    • Electronics and Telecommunications Trends / v.34 no.1 / pp.98-110 / 2019
  • Machine learning has been applied to medical imaging and has shown an excellent recognition rate. Recently, there has been much interest in preventive medicine. If data are accessible, machine learning packages can be used easily in digital healthcare fields. However, it is necessary to prepare the data in advance, and model evaluation and tuning are required to construct a reliable model. On average, these processes take more than 80% of the total effort required. In this study, we describe the basic concepts of machine learning, pre-processing and visualization of datasets, feature engineering for reliable models, model evaluation and tuning, and the latest trends in popular machine learning frameworks. Finally, we survey an explainable machine learning analysis tool and discuss the future direction of machine learning.

Damage progression study in fibre reinforced concrete using acoustic emission technique

  • Banjara, Nawal Kishor;Sasmal, Saptarshi;Srinivas, V.
    • Smart Structures and Systems / v.23 no.2 / pp.173-184 / 2019
  • The main objective of this study is to evaluate the true fracture energy and monitor the damage progression in steel fibre reinforced concrete (SFRC) specimens using acoustic emission (AE) features. Four-point bending tests are carried out on pre-notched plain and fibre reinforced (0.5% and 1% volume fraction) concrete specimens under monotonic loading. AE sensors are affixed at different locations on the specimens, and AE parameters such as rise time, AE energy, hits, counts, amplitude, and duration are obtained. Using the captured and processed AE event data, the fracture process zone is identified and the true fracture energy is evaluated. The AE data are also employed to trace the damage progression in plain and fibre reinforced concrete, using both parametric- and signal-based techniques. The Hilbert-Huang transform (HHT) is used in signal-based processing to evaluate the instantaneous frequency of the acoustic events. It is found that appropriately processed and carefully analyzed acoustic data can provide vital information on the progression of damage in different types of concrete.

Optimised ML-based System Model for Adult-Child Actions Recognition

  • Alhammami, Muhammad;Hammami, Samir Marwan;Ooi, Chee-Pun;Tan, Wooi-Haw
    • KSII Transactions on Internet and Information Systems (TIIS) / v.13 no.2 / pp.929-944 / 2019
  • Many critical applications require accurate real-time human action recognition. However, capturing and pre-processing image data, calculating features, and classification pose many hurdles because they consume significant storage and computation resources. To circumvent these hurdles, this paper presents a machine learning (ML) based recognition system model that uses features with a reduced data structure, obtained by projecting the real 3D skeleton modality onto a virtual 2D space. The MMU VAAC dataset is used to test the proposed ML model. The results show a high accuracy rate of 97.88%, which is only slightly lower than the accuracy obtained with the original 3D modality-based features, but with a 75% reduction ratio compared with using the RGB modality. These results motivate implementing the proposed recognition model on an embedded system platform in the future.

Optimization of Action Recognition based on Slowfast Deep Learning Model using RGB Video Data (RGB 비디오 데이터를 이용한 Slowfast 모델 기반 이상 행동 인식 최적화)

  • Jeong, Jae-Hyeok;Kim, Min-Suk
    • Journal of Korea Multimedia Society / v.25 no.8 / pp.1049-1058 / 2022
  • Human Action Recognition (HAR), including tasks such as anomaly and object detection, has become a trend in research fields that use Artificial Intelligence (AI) methods to analyze patterns of human action in crime-ridden areas, media services, and industrial facilities. In particular, in real-time systems that use video streaming data, HAR has become an increasingly important AI-based research field for application development, and many research fields that use HAR are being developed and improved. In this paper, we propose and analyze a more efficient deep-learning-based HAR scheme that uses intelligent AI models; such a system can be applied to media services that use RGB video streaming data without feature-extraction pre-processing. As the method, we adopt Slowfast, based on a Deep Neural Network (DNN) model, with an open dataset (HMDB-51 or UCF101) to improve prediction accuracy.

Classification for Imbalanced Breast Cancer Dataset Using Resampling Methods

  • Hana Babiker, Nassar
    • International Journal of Computer Science & Network Security / v.23 no.1 / pp.89-95 / 2023
  • Analyzing breast cancer patient files is becoming an exciting area of medical information analysis, especially with the increasing number of patient files. In this paper, breast cancer data are collected from Khartoum state hospital, and the dataset is classified into recurrence and no recurrence. The data are imbalanced, meaning that one of the two classes has more samples than the other. Several pre-processing techniques are applied to classify these imbalanced data: resampling, attribute selection, and handling of missing values. Different classifier models are then built. In the first experiment, five classifiers (ANN, REP TREE, SVM, and J48) are used, and in the second experiment, meta-learning algorithms (Bagging, Boosting, and Random subspace) are used. Finally, the ensemble model is used. The best result was obtained from the ensemble model (Boosting with J48), with the highest accuracy among all the algorithms, 95.2797%, followed by Bagging with J48 (90.559%) and Random subspace with J48 (84.2657%). The imbalanced breast cancer dataset was classified into recurrence and no recurrence with different classification algorithms, and the best result was obtained from the ensemble model.
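
The abstract does not say which resampling method was used. As one common possibility, random oversampling of the minority class can be sketched as follows; the function name `random_oversample`, the `seed` parameter, and the toy data are hypothetical and not taken from the paper.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of each smaller class until every class
    has as many samples as the largest class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        # Pool of original rows belonging to this class.
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


if __name__ == "__main__":
    # Toy data: four "no recurrence" rows and one "recurrence" row.
    rows = [[1], [2], [3], [4], [5]]
    labels = ["no", "no", "no", "no", "yes"]
    balanced_rows, balanced_labels = random_oversample(rows, labels)
    print(Counter(balanced_labels))
```

Resampling of this kind is normally applied to the training split only, so that duplicated rows do not leak into the data used for evaluation.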

Design of Disease Prediction Algorithm Applying Machine Learning Time Series Prediction

  • Hye-Kyeong Ko
    • International Journal of Internet, Broadcasting and Communication / v.16 no.3 / pp.321-328 / 2024
  • This paper designs a disease prediction algorithm that uses machine learning-based time series analysis to diagnose one type of disease, migraine, in advance. The study uses patient data statistics, such as electroencephalogram activity, to design a prediction algorithm that determines the onset signals of migraine symptoms, so that patients can efficiently predict and manage their disease. The study evaluates how accurately the proposed algorithm predicts migraine and how quickly it can predict the onset of migraine for disease prevention purposes. A machine learning algorithm is used to analyze time series of the data indicators used for migraine identification. We designed an algorithm that can efficiently predict and manage patients' diseases by quickly determining the symptoms that signal the onset of disease development, using existing patient data as input. The experimental results show that the proposed prediction algorithm can accurately predict the occurrence of migraine using machine learning algorithms.

Development of an Automated Operational Orbit Processing System (자동 궤도운용 시스템 개발)

  • Kim, Hae-Dong;Jung, Ok-Chul;Kim, Eun-Kyou;Bang, Hyo-Choong
    • Journal of the Korean Society for Aeronautical & Space Sciences / v.35 no.9 / pp.836-842 / 2007
  • This paper describes the development of an automated operational orbit processing system (KGS automated Operational Orbit Processing System, KOOPS), which can determine, evaluate, update, and generate orbit data automatically. The developed system can be applied to multi-satellite mission operations as a generic satellite orbit processing system, because KOOPS can process various kinds of tracking data and assign pre- and post-processes according to each satellite system. Results of applying KOOPS to the KOMPSAT-1 and KOMPSAT-2 mission operations show that manpower is greatly reduced and that the efficiency and stability of the mission operations are significantly increased. The experience gained in developing KOOPS and operating multi-satellite missions with it can be applied to further enhance multi-satellite, generic flight dynamics systems.

Visualizing Article Material using a Big Data Analytical Tool R Language (빅데이터 분석 도구 R 언어를 이용한 논문 데이터 시각화)

  • Nam, Soo-Tai;Shin, Seong-Yoon;Jin, Chan-Yong
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2021.05a / pp.326-327 / 2021
  • Recently, the use of big data has attracted wide interest across a variety of industrial fields. Big data analysis is the process of discovering meaningful new correlations, patterns, and trends in large volumes of data stored in data stores and creating new value. Accordingly, most big data analysis techniques include data mining, machine learning, natural language processing, and pattern recognition, as used in existing statistics and computer science. In addition, with the R language, a big data tool, analysis results can be expressed through various visualization functions applied to pre-processed text data. The data analyzed in this study were drawn from 29 papers in a specific journal. In the final analysis results, the most frequently mentioned keyword was "Research", which ranked first with 743 occurrences. Based on the results of the analysis, the limitations of the study and its theoretical implications are suggested.
