• Title/Summary/Keyword: regression analysis model (회귀분석모델)

A Study on the Sentiment analysis of Google Play Store App Comment Based on WPM(Word Piece Model) (WPM(Word Piece Model)을 활용한 구글 플레이스토어 앱의 댓글 감정 분석 연구)

  • Park, jae Hoon;Koo, Myong-wan
    • Annual Conference on Human and Language Technology / 2016.10a / pp.291-295 / 2016
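  • In this paper, we perform sentiment analysis of Google Play Store app comments using WPM as the basic unit for Korean. After applying an automatic word-spacing system, we built models using eojeol (space-delimited word) units, a morphological analyzer, and WPM respectively, and compared positive/negative sentiment classification of comments using logistic regression, softmax regression, and support vector machine (SVM) algorithms. WPM improved results by up to 25% over the eojeol units and the morphological analyzer. In classification, SVM outperformed logistic regression and softmax regression, and performance reached up to 91% when the tuned parameters ({'kernel': ('linear','rbf', 'sigmoid', 'poly'), 'C':[0.01, 0.1, 1.4.5]}) were applied instead of the SVM default parameters ({'kernel':('linear'), 'c':[4]}).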
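The abstract above does not describe how WPM segments text, so the following is only a minimal sketch of greedy longest-match subword segmentation, as used by WordPiece-style tokenizers at inference time. The vocabulary, the `##` continuation prefix, and the `[UNK]` token are assumptions chosen for illustration, not details taken from the paper.

```python
def wordpiece_tokenize(word, vocab, unk="[UNK]", prefix="##"):
    """Split one word into subword pieces by greedy longest-match-first lookup.

    `vocab` is a hypothetical set of known pieces; pieces that continue a word
    carry the (assumed) "##" prefix.
    """
    pieces = []
    start = 0
    while start < len(word):
        end = len(word)
        piece = None
        # Try the longest remaining substring first, then shrink from the right.
        while end > start:
            candidate = word[start:end]
            if start > 0:
                candidate = prefix + candidate
            if candidate in vocab:
                piece = candidate
                break
            end -= 1
        if piece is None:
            # No piece matches at this position: the whole word is unknown.
            return [unk]
        pieces.append(piece)
        start = end
    return pieces
```

For example, with the hypothetical vocabulary `{"재미", "##있", "##어요"}` the word `재미있어요` is split into three pieces, which could then be counted as features for a classifier such as logistic regression or an SVM.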

Analysis of Water Balance in Paddy Fields using Open Source SWMM Model (Open source SWMM모형을 활용한 논배수로 물수지 분석)

  • Kim Beom gu;Choo In Kyo;Kareem Kola Yusuff;Jung Young Hun
    • Proceedings of the Korea Water Resources Association Conference / 2023.05a / pp.403-403 / 2023
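  • Urbanization is increasing the demand for domestic, industrial, and agricultural water, but dam construction to meet that demand faces critical public opinion because of ecosystem fragmentation and the creation of submerged areas, so securing new water resources is becoming difficult. We therefore need to improve the management of existing water resources and develop rational water allocation techniques rather than secure new resources. Within this, the return flow of agricultural water needs to be examined. Agricultural water supplied from hydraulic facilities is not entirely consumed by crops; some of it is drained through the irrigation canals without ever reaching the fields. Water that does reach the fields may drain over the field outlets, and part of it infiltrates and flows out through groundwater. The portion of the agricultural water supply that is not consumed and flows into rivers is called the irrigation return flow. In this study, we implemented a model that computes the agricultural return flow so that the return flow entering rivers without being consumed by agriculture can be controlled accurately. SWMM (Storm Water Management Model) is a model that analyzes rainfall-runoff and surface runoff mainly in urban areas, including roads, ditches, pipes, and grassland, and it has the advantage of representing the channel-network characteristics of farmland well. In this study, a test-bed model is built by treating the irrigation canals as open channels. SWMM is already used to simulate agricultural water circulation, but its limitations must be addressed for accurate simulation: transpiration from paddy fields and the groundwater level within the benefited area are not reflected. To address these limitations, the return-flow formula was implemented in C and, using the EPA SWMM source code, a version of SWMM capable of estimating return flow was implemented. By computing the return flow of agricultural water, this study enables accurate water balance analysis and should help secure water resources in agricultural areas.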
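The abstract above does not give the return-flow formula that was implemented in C, so the following is only a minimal sketch of the bookkeeping it describes: return flow as the sum of the supplied water that leaves through the canals, over the field outlets, and through groundwater. The function name, the argument names, and the units are assumptions for illustration.

```python
def return_flow(supply, canal_drainage, outlet_overflow, groundwater_outflow):
    """Return (return-flow volume, return-flow ratio) for one accounting period.

    All arguments are volumes in the same (assumed) unit, e.g. cubic metres.
    The three loss terms follow the pathways named in the abstract; the real
    study's formula may include further terms.
    """
    if supply <= 0:
        raise ValueError("supply must be positive")
    volume = canal_drainage + outlet_overflow + groundwater_outflow
    if volume > supply:
        raise ValueError("return flow cannot exceed the supplied volume")
    # The ratio is the share of the supply that flows back to the river.
    return volume, volume / supply
```

Calling `return_flow(100.0, 10.0, 15.0, 5.0)` with these made-up volumes gives a return flow of 30.0 and a ratio of 0.3; the remaining 70% would be the water consumed in the fields.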

Analyzing Performance and Dynamics of Echo State Networks Given Various Structures of Hidden Neuron Connections (Echo State Network 모델의 은닉 뉴런 간 연결구조에 따른 성능과 동역학적 특성 분석)

  • Yoon, Sangwoong;Zhang, Byoung-Tak
    • KIISE Transactions on Computing Practices / v.21 no.4 / pp.338-342 / 2015
  • A Recurrent Neural Network (RNN), a machine learning model that can handle time-series data, can possess more varied structures than a feed-forward neural network, since an RNN allows hidden-to-hidden connections. This research focuses on the network structure among hidden neurons and discusses the information processing capability of RNNs. The time-series learning potential and dynamics of RNNs are investigated for several well-established network structure models. The hidden neuron network structure is found to have a significant impact on the performance of a model, and the performance variations are generally correlated with the criticality of the network dynamics. In particular, the Preferential Attachment Network model showed interesting behavior. These findings provide clues for improving the performance of RNNs.
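To make "hidden-to-hidden connection structure" concrete, here is a minimal sketch of an echo-state-style reservoir update, x(t+1) = tanh(W_in u(t) + W x(t)), where the matrix W encodes which hidden neurons are connected. The ring topology, the weight values, and the function names are assumptions for illustration and are not necessarily among the structures studied in the paper; training of the linear readout is omitted.

```python
import math


def ring_reservoir(n, weight):
    """Build an n x n hidden-to-hidden matrix where neuron i listens to neuron i-1."""
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        W[i][(i - 1) % n] = weight
    return W


def run_reservoir(W, w_in, inputs):
    """Drive the reservoir with a scalar input sequence and collect its states."""
    n = len(W)
    x = [0.0] * n  # initial hidden state
    states = []
    for u in inputs:
        # Each neuron combines the external input with its incoming connections.
        x = [
            math.tanh(w_in[i] * u + sum(W[i][j] * x[j] for j in range(n)))
            for i in range(n)
        ]
        states.append(x)
    return states
```

Swapping `ring_reservoir` for another generator (random, small-world, preferential attachment) changes only W, which is the kind of comparison the abstract describes.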

Analysis on the Survivor's Pension Payment with Logistic Regression Model (로지스틱 회귀모형을 이용한 유족연금 수급 분석)

  • Kim, Mi-Jung;Kim, Jin-Hyung
    • The Korean Journal of Applied Statistics / v.21 no.2 / pp.183-200 / 2008
  • Research on efficient management of the National Pension has been emphasized as society trends toward an aging population and a low birth rate. In this article, we suggest a statistical model for effective classification and prediction of the reserve for the survivor's pension in Korea. A logistic regression model is incorporated; the correct classification rate and the distribution of the posterior probability for the reserve of the survivor's pension are investigated and compared with the results from general logistic models. The predictive model is also assessed with a lift graph, ROC curve, and K-S statistic. We suggest strategies for reducing financial risks in managing and planning the pension as an application of the suggested model.
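Two of the assessment measures named in the abstract above can be computed directly from predicted probabilities and true labels. The sketch below is a minimal illustration under assumed inputs (a list of scores and a list of 0/1 labels); it is not the authors' procedure, and the function names are hypothetical.

```python
def ks_statistic(scores, labels):
    """Largest gap between the cumulative score distributions of the two classes."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes must be present")
    pairs = sorted(zip(scores, labels))
    cum_pos = cum_neg = 0
    best = 0.0
    i = 0
    while i < len(pairs):
        score = pairs[i][0]
        # Consume all observations tied at this score before comparing the CDFs.
        while i < len(pairs) and pairs[i][0] == score:
            if pairs[i][1] == 1:
                cum_pos += 1
            else:
                cum_neg += 1
            i += 1
        best = max(best, abs(cum_pos / n_pos - cum_neg / n_neg))
    return best


def roc_auc(scores, labels):
    """Area under the ROC curve: chance that a positive outscores a negative."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    # Ties between a positive and a negative count as half a win.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

Both measures equal 1.0 when the scores separate the classes perfectly; the pairwise AUC loop is quadratic and is meant only for small illustrative inputs.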

Typhoon Path and Prediction Model Development for Building Damage Ratio Using Multiple Regression Analysis (태풍타입별 피해 분석 및 다중회귀분석을 활용한 태풍피해예측모델 개발 연구)

  • Yang, Seong-Pil;Son, Kiyoung;Lee, Kyoung-Hun;Kim, Ji-Myong
    • Journal of the Korea Institute of Building Construction / v.16 no.5 / pp.437-445 / 2016
  • Since typhoons are a critical meteorological disaster, some advanced countries have developed typhoon damage prediction models. However, although South Korea is vulnerable to typhoons, there is still a shortage of studies on typhoon damage prediction models that reflect the vulnerability of domestic buildings and the features of the disaster. Moreover, many studies have focused only on typhoon characteristics and regional characteristics, without considering various influencing factors. Therefore, the objective of this study is to analyze typhoon damage by path and to develop a prediction model for the building damage ratio using multiple regression analysis. This study classifies building damage by typhoon path to identify influencing factors, and then a correlation analysis is conducted between the building damage ratio and those factors. In addition, a multiple regression analysis is applied to develop a typhoon damage prediction model. Four categories (typhoon information, geography, construction environment, and socio-economy) are used as the independent variables. The results of this study will be used as fundamental material for developing a typhoon damage prediction model for South Korea.

Application of Multiple Linear Regression Analysis and Tree-Based Machine Learning Techniques for Cutter Life Index(CLI) Prediction (커터수명지수 예측을 위한 다중선형회귀분석과 트리 기반 머신러닝 기법 적용)

  • Ju-Pyo Hong;Tae Young Ko
    • Tunnel and Underground Space / v.33 no.6 / pp.594-609 / 2023
  • The TBM (Tunnel Boring Machine) method is gaining popularity in urban and underwater tunneling projects due to its ability to ensure excavation face stability and minimize environmental impact. Among the prominent models for predicting disc cutter life, the NTNU model uses the Cutter Life Index (CLI) as a key parameter, but the complexity of the testing procedures and the rarity of the equipment make measurement challenging. In this study, CLI was predicted from rock properties using multiple linear regression analysis and tree-based machine learning techniques. Through a literature review, a database including rock uniaxial compressive strength, Brazilian tensile strength, equivalent quartz content, and Cerchar abrasivity index was built, and derived variables were added. The multiple linear regression analysis selected input variables based on statistical significance and multicollinearity, while the machine learning prediction model chose variables based on their importance. The data were divided into 80% for training and 20% for testing, a comparative analysis of predictive performance was conducted, and XGBoost was identified as the optimal model. The validity of the multiple linear regression and XGBoost models derived in this study was confirmed by comparing their predictive performance with prior research.
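The workflow in the abstract above (80/20 split, fit a multiple linear regression, score it on the held-out part) can be sketched with the standard library alone. This is an illustration under assumptions: the row layout `(features, target)`, the function names, and the RMSE metric are hypothetical, and the tree-based models such as XGBoost are not covered.

```python
import math
import random


def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle rows reproducibly and hold out `test_fraction` of them for testing."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("singular system (perfectly collinear inputs?)")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x


def fit_ols(X, y):
    """Least-squares coefficients [intercept, b1, b2, ...] via the normal equations."""
    Z = [[1.0] + list(row) for row in X]  # prepend the intercept column
    k = len(Z[0])
    ZtZ = [[sum(z[i] * z[j] for z in Z) for j in range(k)] for i in range(k)]
    Zty = [sum(z[i] * t for z, t in zip(Z, y)) for i in range(k)]
    return solve(ZtZ, Zty)


def predict(coef, row):
    return coef[0] + sum(c * v for c, v in zip(coef[1:], row))


def rmse(coef, rows):
    """Root-mean-square prediction error on (features, target) rows."""
    errors = [(predict(coef, x) - y) ** 2 for x, y in rows]
    return math.sqrt(sum(errors) / len(errors))
```

A typical use is `train, test = train_test_split(rows)`, then `coef = fit_ols([x for x, _ in train], [y for _, y in train])` and `rmse(coef, test)`. The singular-system error is a crude stand-in for the multicollinearity screening the abstract mentions.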

Spatial Data Analysis for the U.S. Regional Income Convergence,1969-1999: A Critical Appraisal of $\beta$-convergence (미국 소득분포의 지역적 수렴에 대한 공간자료 분석(1969∼1999년) - 베타-수렴에 대한 비판적 검토 -)

  • Sang-Il Lee
    • Journal of the Korean Geographical Society / v.39 no.2 / pp.212-228 / 2004
  • This paper is concerned with an important aspect of regional income convergence, ${\beta}$-convergence, which refers to the negative relationship between initial income levels and income growth rates of regions over a period of time. The common research framework on ${\beta}$-convergence which is based on OLS regression models has two drawbacks. First, it ignores spatially autocorrelated residuals. Second, it does not provide any way of exploring spatial heterogeneity across regions in terms of ${\beta}$-convergence. Given that empirical studies on ${\beta}$-convergence need to be edified by spatial data analysis, this paper aims to: (1) provide a critical review of empirical studies on ${\beta}$-convergence from a spatial perspective; (2) investigate spatio-temporal income dynamics across the U.S. labor market areas for the last 30 years (1969-1999) by fitting spatial regression models and applying bivariate ESDA techniques. The major findings are as follows. First, the hypothesis of ${\beta}$-convergence was only partially evidenced, and the trend substantively varied across sub-periods. Second, a SAR model indicated that ${\beta}$-coefficient for the entire period was not significant at the 99% confidence level, which may lead to a conclusion that there is no statistical evidence of regional income convergence in the US over the last three decades. Third, the results from bivariate ESDA techniques and a GWR model report that there was a substantive level of spatial heterogeneity in the catch-up process, and suggested possible spatial regimes. It was also observed that the sub-periods showed a substantial level of spatio-temporal heterogeneity in ${\beta}$-convergence: the catch-up scenario in a spatial sense was least pronounced during the 1980s.
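The baseline that the abstract above criticizes, an OLS regression of income growth on initial income, can be sketched as follows. Regressing the average annual log growth rate on the log of initial income is a common specification but is an assumption here, as are the function and argument names; the sketch deliberately ignores spatial autocorrelation and heterogeneity, which is exactly the shortcoming the paper addresses with SAR and GWR models.

```python
import math


def beta_convergence(initial_income, final_income, years):
    """Return (alpha, beta) from regressing annual log growth on log initial income.

    A negative beta means regions that started poorer grew faster,
    i.e. evidence of beta-convergence in this simple non-spatial setting.
    """
    x = [math.log(v) for v in initial_income]
    growth = [
        (math.log(end) - math.log(start)) / years
        for start, end in zip(initial_income, final_income)
    ]
    n = len(x)
    mean_x = sum(x) / n
    mean_g = sum(growth) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxg = sum((xi - mean_x) * (gi - mean_g) for xi, gi in zip(x, growth))
    beta = sxg / sxx  # OLS slope
    alpha = mean_g - beta * mean_x  # OLS intercept
    return alpha, beta
```

With made-up incomes where the poorest region doubles while the richest grows by a quarter, beta comes out negative; reversing the pattern makes it positive.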

Estimation of Cerchar abrasivity index based on rock strength and petrological characteristics using linear regression and machine learning (선형회귀분석과 머신러닝을 이용한 암석의 강도 및 암석학적 특징 기반 세르샤 마모지수 추정)

  • Ju-Pyo Hong;Yun Seong Kang;Tae Young Ko
    • Journal of Korean Tunnelling and Underground Space Association / v.26 no.1 / pp.39-58 / 2024
  • Tunnel Boring Machines (TBM) use multiple disc cutters to excavate tunnels through rock. These cutters wear out due to continuous contact and friction with the rock, leading to decreased cutting efficiency and reduced excavation performance. The rock's abrasivity significantly affects cutter wear, with highly abrasive rocks causing more wear and reducing the cutter's lifespan. The Cerchar Abrasivity Index (CAI) is a key indicator for assessing rock abrasivity, essential for predicting disc cutter life and performance. This study aims to develop a new method for effectively estimating CAI using rock strength, petrological characteristics, linear regression, and machine learning. A database including CAI, uniaxial compressive strength, Brazilian tensile strength, and equivalent quartz content was created, with additional derived variables. Variables for multiple linear regression were selected considering statistical significance and multicollinearity, while machine learning model inputs were chosen based on variable importance. Among the machine learning prediction models, the Gradient Boosting model showed the highest predictive performance. Finally, the predictive performance of the multiple linear regression analysis and the Gradient Boosting model derived in this study were compared with the CAI prediction models of previous studies to validate the results of this research.

A Propose on Seismic Performance Evaluation Model of Slope using Artificial Neural Network Technique (인공신경망 기법을 이용한 사면의 내진성능평가 모델 제안)

  • Kwag, Shinyoung;Hahm, Daegi
    • Journal of the Computational Structural Engineering Institute of Korea / v.32 no.2 / pp.93-101 / 2019
  • The objective of this study is to develop a model that can predict the seismic performance of a slope relatively accurately and efficiently by using the artificial neural network (ANN) technique. Quantifying the seismic performance of a slope is not an easy task because of the randomness and uncertainty of the earthquake input and the slope model. Under these circumstances, probabilistic seismic fragility analyses of slopes have been carried out by several researchers, and a closed-form equation for slope seismic performance was proposed through a multiple linear regression analysis. However, traditional statistical linear regression analysis has a limitation: it cannot accurately represent the nonlinear relationship between slopes under various conditions and their seismic performance. To overcome this problem, in this study we applied the ANN to generate prediction models of the seismic performance of the slope. The validity of the derived model was verified by comparing it with the conventional multi-linear and multi-nonlinear regression models. As a result, the models obtained through the ANN showed excellent performance in predicting the seismic performance of the slope compared to the models obtained by the statistical regression analyses of the previous study.