• Title/Summary/Keyword: Statistical Learning Model


Optimized Chinese Pronunciation Prediction by Component-Based Statistical Machine Translation

  • Zhu, Shunle
    • Journal of Information Processing Systems / v.17 no.1 / pp.203-212 / 2021
  • To eliminate ambiguities in existing methods for simplifying Chinese pronunciation learning, we propose a model that automatically predicts the pronunciation of Chinese characters. The proposed model relies on a statistical machine translation (SMT) framework. In particular, we treat the components of Chinese characters as the basic unit and treat pronunciation prediction as a machine translation procedure, with the component sequence as the source sentence and the pronunciation (pinyin) as the target sentence. In addition to traditional features such as bidirectional word translation and the n-gram language model, we implement a component similarity feature to handle typos that occur in practical use. We incorporate these features into a log-linear model. The experimental results show that our approach significantly outperforms the baseline models.
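
The log-linear combination of features described in this abstract can be illustrated with a minimal sketch. The feature names, feature values, and weights below are hypothetical and are not taken from the paper; the sketch only shows how a weighted sum of log feature values ranks candidate outputs.

```python
import math


def loglinear_score(features, weights):
    """Return sum_i w_i * log(h_i) for one candidate's feature values."""
    return sum(weights[name] * math.log(value) for name, value in features.items())


def best_candidate(candidates, weights):
    """Pick the candidate with the highest log-linear score."""
    return max(candidates, key=lambda c: loglinear_score(candidates[c], weights))


# Hypothetical feature values (probability-like, in (0, 1]) for two pinyin candidates.
candidates = {
    "ma": {"translation": 0.6, "language_model": 0.5, "component_similarity": 0.9},
    "mu": {"translation": 0.3, "language_model": 0.4, "component_similarity": 0.2},
}
# Hypothetical feature weights; in an SMT system these would be tuned on held-out data.
weights = {"translation": 1.0, "language_model": 0.5, "component_similarity": 0.8}

print(best_candidate(candidates, weights))
```

In this toy setup the candidate with uniformly higher feature values wins; with real features the weights decide how the translation, language model, and similarity evidence trade off.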

Electric Power Demand Prediction Using Deep Learning Model with Temperature Data (기온 데이터를 반영한 전력수요 예측 딥러닝 모델)

  • Yoon, Hyoup-Sang;Jeong, Seok-Bong
    • KIPS Transactions on Software and Data Engineering / v.11 no.7 / pp.307-314 / 2022
  • Recently, research on deep learning-based models has been actively conducted to replace statistical time series forecasting techniques for predicting electric power demand. An analysis of this research shows that LSTM-based prediction models perform acceptably but are not sufficient for long-term, region-wide power demand prediction. In this paper, we propose a WaveNet deep learning model that predicts electric power demand 24 hours ahead using temperature data, with the aim of achieving prediction accuracy better than the MAPE of 2% that statistical time series forecasting techniques can provide. First, we describe the dilated causal one-dimensional convolutional neural network architecture of WaveNet and the preprocessing of the electric power demand and temperature input data. Second, we present the training process and walk-forward validation with the modified WaveNet. The performance comparison shows that the prediction model with temperature data achieves a MAPE of 1.33%, which is better than the MAPE of 2.33% for the same model without temperature data.
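
Two ideas in this abstract, the MAPE metric and the dilated causal one-dimensional convolution used in WaveNet, can be sketched in a few lines. This is an illustrative sketch only, not the authors' implementation; the function names and the sample numbers are assumptions.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent. Assumes no actual value is zero."""
    errors = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(errors) / len(errors)


def dilated_causal_conv1d(x, kernel, dilation):
    """Compute y[t] = sum_k kernel[k] * x[t - k * dilation].

    Only current and past samples are used (causal); samples before the
    start of the series are treated as zero.
    """
    out = []
    for t in range(len(x)):
        total = 0.0
        for k, w in enumerate(kernel):
            idx = t - k * dilation
            if idx >= 0:
                total += w * x[idx]
        out.append(total)
    return out


# Hypothetical hourly demand values, used only to exercise the functions.
demand = [1.0, 2.0, 3.0, 4.0, 5.0]
print(dilated_causal_conv1d(demand, kernel=[1.0, 1.0], dilation=2))
print(mape([100.0, 200.0], [98.0, 204.0]))
```

Stacking such layers with dilations 1, 2, 4, and so on lets the receptive field grow quickly with depth while each output still depends only on past inputs.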

A Study on Killer Services in Ubiquitous Computing: The Case of the Scene of Labor Learning (유비쿼터스 컴퓨팅 환경에서의 킬러서비스 사례연구: 현장체험 학습을 중심으로)

  • Kim, Kyung-Kyu;Park, Sung-Kook;Ryoo, Sung-Yul;Kim, Moon-Oh;Chang, Hang-Bae
    • Journal of Information Technology Services / v.6 no.2 / pp.99-112 / 2007
  • In this study, we designed killer services for the scene of labor learning in ubiquitous computing. To this end, we explored the unmet needs of teachers in the scene of labor learning and examined whether those needs could be served by the resources and capabilities of ubiquitous computing. We then crafted detailed killer services, including value propositions and resource maps, using statistical methodology. Finally, we propose the killer services for the scene of labor learning, together with a service architecture, to serve educational users. As basic research, the results of this study can be applied to developing new business models in ubiquitous computing.

The Analysis of Association between Learning Styles and a Model of IoT-based Education : Chi-Square Test for Association

  • Sayassatov, Dulan;Cho, Namjae
    • Journal of Information Technology Applications and Management / v.27 no.3 / pp.19-36 / 2020
  • The Internet of Things (IoT) is a system of interrelated computing devices, digital machines, and physical objects that are provided with unique identifiers and the ability to transmit data to people or machines (M2M) without requiring human interaction. IoT devices can be used to monitor and control electrical and electronic systems in fields such as smart homes, smart cities, and smart healthcare. In this study, we introduce four imaginary IoT devices as learning support assistants matched to students' dominant learning styles as measured by the Honey and Mumford Learning Styles: Activists, Reflectors, Theorists, and Pragmatists. This research emphasizes the association between students' strong learning styles and their preference for IoT devices with specific characteristics. Moreover, the different levels of IoT device architecture are explained, and all of the imaginary devices are designed based on this structure. The experimental data were analyzed using the chi-square test for association, and the results showed the statistical significance of the estimated model and the impact of each category on the model, yielding accurate estimates for the research variables. This study revealed the importance of considering students' dominant learning styles before inventing a new IoT device.
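
The chi-square test for association named in this abstract compares observed counts in a contingency table with the counts expected under independence. The sketch below computes the test statistic and degrees of freedom in plain Python; the 4x4 table of counts (learning style by preferred device) is entirely hypothetical and is not data from the study.

```python
def chi_square_association(table):
    """Return (statistic, degrees_of_freedom) for a contingency table.

    table is a list of rows of observed counts; every row and column
    total is assumed to be positive.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if row and column variables were independent.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


# Hypothetical counts: rows = Activists, Reflectors, Theorists, Pragmatists;
# columns = four imaginary IoT devices.
observed = [
    [20, 5, 3, 2],
    [4, 18, 5, 3],
    [3, 6, 19, 2],
    [2, 4, 5, 19],
]
stat, dof = chi_square_association(observed)
print(stat, dof)
```

The statistic is then compared against a chi-square distribution with the returned degrees of freedom; a large value relative to that distribution suggests that learning style and device preference are associated.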

Application of Reinforcement Learning in Detecting Fraudulent Insurance Claims

  • Choi, Jung-Moon;Kim, Ji-Hyeok;Kim, Sung-Jun
    • International Journal of Computer Science & Network Security / v.21 no.9 / pp.125-131 / 2021
  • Detecting fraudulent insurance claims is difficult because the data are small and unbalanced. Some research has been carried out to better cope with various types of fraudulent claims. Technology for detecting fraudulent insurance claims is increasingly used in the insurance and technology fields, thanks to artificial intelligence (AI) methods that supplement traditional statistical detection and rule-based methods. This study obtained meaningful results for a fraudulent insurance claim detection model based on machine learning (ML) and deep learning (DL) technologies, using fraudulent insurance claim data from previous research. In our search for a way to improve the detection of fraudulent insurance claims, we investigated the reinforcement learning (RL) method and examined how it could be applied to this task. Few previous studies have applied the RL method, so we first had to define the essential RL elements based on previous research on anomaly detection. We applied the deep Q-network (DQN) and double deep Q-network (DDQN) to train the fraudulent insurance claim detection model. By doing so, we confirmed that our model performed better than previous machine learning models.
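
The difference between the DQN and DDQN methods named in this abstract lies in how the learning target is computed. The sketch below shows the two standard target formulas in plain Python. The abstract does not define the states, actions, or rewards, so the two-action setup (for example, flag a claim or pass it) and all numbers here are assumptions for illustration.

```python
def dqn_target(reward, next_q_target, gamma, done):
    """DQN target: the target network both selects and evaluates the next action."""
    if done:
        return reward
    return reward + gamma * max(next_q_target)


def ddqn_target(reward, next_q_online, next_q_target, gamma, done):
    """Double DQN target: the online network selects the next action,
    and the target network evaluates it."""
    if done:
        return reward
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    return reward + gamma * next_q_target[best_action]


# Hypothetical Q-values for two actions in the next state.
next_q_online = [1.0, 0.2]   # online network prefers action 0
next_q_target = [0.5, 2.0]   # target network rates action 1 highest

print(dqn_target(1.0, next_q_target, gamma=0.9, done=False))
print(ddqn_target(1.0, next_q_online, next_q_target, gamma=0.9, done=False))
```

Because the Double DQN target does not take the maximum over the same network that evaluates the action, it tends to reduce the overestimation of Q-values that plain DQN can exhibit.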

Structural Relationship among the Self-Efficacy, Self-Directed Learning Ability, School Adjustment, and Learning Flow in Middle School Students (중학생의 자기효능감, 자기주도학습, 학교적응과 학습몰입 간의 관계 분석)

  • Kang, Seung Hee
    • Journal of Fisheries and Marine Sciences Education / v.24 no.6 / pp.935-949 / 2012
  • The purpose of this study was to investigate the structural relationship among self-efficacy, self-directed learning ability, school adjustment, and learning flow in middle school students using structural equation modeling. The subjects were 553 middle school students. The data were analyzed with descriptive statistics, Pearson correlations, and structural equation modeling using the SPSS 12.0 and AMOS 5.0 statistical programs. The results were as follows. First, there were significant correlations among self-efficacy, self-directed learning ability, school adjustment, and learning flow. Second, self-directed learning ability and school adjustment directly affected learning flow. Third, self-efficacy and school adjustment indirectly affected learning flow. The fit indices of the best-fitting model for these variables were adequate. This study shows that self-efficacy, self-directed learning ability, and school adjustment are significant predictors of learning flow during adolescence.

A Spatial Analysis of Seismic Vulnerability of Buildings Using Statistical and Machine Learning Techniques Comparative Analysis (통계분석 기법과 머신러닝 기법의 비교분석을 통한 건물의 지진취약도 공간분석)

  • Seong H. Kim;Sang-Bin Kim;Dae-Hyeon Kim
    • Journal of Industrial Convergence / v.21 no.1 / pp.159-165 / 2023
  • The frequency of earthquakes has been increasing recently, yet the domestic seismic response system remains weak. The objective of this research is to compare and analyze the seismic vulnerability of buildings using statistical analysis and machine learning techniques. With the statistical technique, the model developed through the optimal scaling method showed a prediction accuracy of about 87%. With the machine learning technique, the Random Forest method achieved an accuracy of 94% on the training set and 76.7% on the test set, the highest among the four methods analyzed, and was therefore chosen as the final machine learning technique. Accordingly, the statistical analysis technique showed the higher accuracy, about 87%, whereas the machine learning technique showed an accuracy of about 76.7%. As the final result, among the 22,296 buildings analyzed, 1,627 (0.1%) buildings are expected to be more dangerous when the statistical analysis technique is used, 10,146 (49%) buildings showed the same rate, and the remaining 10,523 (50%) buildings are expected to be more dangerous when the machine learning technique is used. By comparing the results of advanced machine learning techniques with those of the existing statistical analysis techniques, we hope that this research helps in preparing more reliable seismic countermeasures in spatial analysis decisions.

Comparison of Scala and R for Machine Learning in Spark (스파크에서 스칼라와 R을 이용한 머신러닝의 비교)

  • Woo-Seok Ryu
    • The Journal of the Korea institute of electronic communication sciences / v.18 no.1 / pp.85-90 / 2023
  • Data analysis methodology in the healthcare field is shifting from traditional statistics-oriented research methods to predictive research using machine learning. In this study, we survey various machine learning tools and compare several programming models that combine R and Spark in order to apply R, a statistical tool widely used in the healthcare field, to machine learning. In addition, we compare the performance of a linear regression model using R and Scala, the base language of Spark. In the experiment, the training execution time when using SparkR was 10 to 20% longer than with Scala. Considering this performance degradation, we confirmed that SparkR's distributed processing is useful because R, the traditional statistical analysis tool, can be used as it is.

Semi-supervised learning using similarity and dissimilarity

  • Seok, Kyung-Ha
    • Journal of the Korean Data and Information Science Society / v.22 no.1 / pp.99-105 / 2011
  • We propose a semi-supervised learning algorithm based on a form of regularization that incorporates similarity and dissimilarity penalty terms. Our approach uses a graph-based encoding of similarity and dissimilarity. We also present a model-selection method that employs cross-validation to choose the hyperparameters that affect the performance of the proposed method. Simulations using two types of data sets demonstrate that the proposed method is promising.

Estimation of Software Reliability with Immune Algorithm and Support Vector Regression (면역 알고리즘 기반의 서포트 벡터 회귀를 이용한 소프트웨어 신뢰도 추정)

  • Kwon, Ki-Tae;Lee, Joon-Kil
    • Journal of Information Technology Services / v.8 no.4 / pp.129-140 / 2009
  • Accurate estimation of software reliability is important for successful software development. Until recently, models using regression analysis based on statistical algorithms and machine learning methods have been used. This paper estimates software reliability using support vector regression, a machine learning technique. It also finds the best set of parameters by applying an immune algorithm while varying the number of generations, memory cells, and alleles. The proposed IA-SVR model outperforms some recent results reported in the literature.