• Title/Abstract/Keyword: statistical learning approach

Search results: 152 items

Sentiment Analysis on Indonesia Economic Growth using Deep Learning Neural Network Method

  • KRISMAWATI, Dewi;MARIEL, Wahyu Calvin Frans;ARSYI, Farhan Anshari;PRAMANA, Setia
    • 산경연구논집 / Vol. 13, No. 6 / pp.9-18 / 2022
  • Purpose: Governments around the world are still focused on the effects of the new variant of Covid-19. The government continues its efforts to restore the economy through several programs, one of which is the National Economic Recovery program. This program is expected to increase public and investor confidence in the handling of Covid-19. This study aims to capture public sentiment on the economic growth rate in Indonesia, especially during the third wave of the omicron variant of the Covid-19 virus, i.e., the fourth quarter of 2021. Research design, data, and methodology: The approach used in this research is to collect crowdsourced data from Twitter posted between 1 and 10 October 2021. The analysis is done by building a model using a deep learning neural network method. Results: Most of the tweets carry a neutral sentiment toward the economic growth discussion. The central figures in the discussion were the Coordinating Minister for Economic Affairs of Indonesia and the Minister of State-Owned Enterprises. Conclusions: Data from social media can be used by the government to capture public responses, especially public sentiment regarding economic growth. It can also be used by policy makers and, for example, by entrepreneurs to anticipate economic movements under certain conditions.
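
The abstract describes the method only at a high level; a minimal sketch of the general idea (a bag-of-words neural-network sentiment classifier, with made-up tweets and labels rather than the authors' crowdsourced data or architecture) might look like this:

```python
# Hypothetical sketch: classifying tweet sentiment with a small feed-forward
# neural network. Tweets and labels are illustrative, not from the paper.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline

tweets = [
    "pemulihan ekonomi nasional berjalan baik",      # "national economic recovery is going well"
    "pertumbuhan ekonomi kuartal ini mengecewakan",  # "this quarter's economic growth is disappointing"
    "data pertumbuhan ekonomi dirilis hari ini",     # "economic growth data released today"
]
labels = ["positive", "negative", "neutral"]

model = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),
    MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=500, random_state=0),
)
model.fit(tweets, labels)
print(model.predict(["pertumbuhan ekonomi membaik"]))
```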

통계적 접근 방법을 이용한 저속비대선 및 컨테이너선의 동력 성능 추정 (Powering Performance Prediction of Low-Speed Full Ships and Container Carriers Using Statistical Approach)

  • 김유철;김건도;김명수;황승현;김광수;연성모;이영연
    • 대한조선학회논문집 / Vol. 58, No. 4 / pp.234-242 / 2021
  • In this study, we introduce the prediction of brake power for low-speed full ships and container carriers using linear regression and a machine learning approach. The residual resistance coefficient, wake fraction coefficient, and thrust deduction factor are predicted by regression models from the main dimensions of the ship and propeller. The brake power of a ship can then be calculated from these coefficients according to the 1978 ITTC performance prediction method. The mean absolute error of the predicted power was under 7%. Across several validation cases, the machine learning model showed slightly better results than linear regression.
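
The power-prediction chain referenced above can be sketched roughly as follows; the function, constants, and example numbers are illustrative assumptions, and the form factor and roughness/air allowances of the full 1978 ITTC method are omitted:

```python
# Simplified sketch of the 1978 ITTC power-prediction chain described above.
# The regression models for C_R, w and t are stand-ins; the paper derives them
# from the main dimensions of the ship and propeller.
RHO = 1025.0  # sea-water density [kg/m^3]

def brake_power(V, S, C_F, C_R, w, t, eta_O, eta_R, eta_S):
    """V: speed [m/s], S: wetted surface [m^2]; all other inputs dimensionless."""
    C_T = C_F + C_R                      # total resistance coefficient (form factor etc. omitted)
    R_T = 0.5 * RHO * S * V**2 * C_T     # total resistance [N]
    P_E = R_T * V                        # effective power [W]
    eta_H = (1.0 - t) / (1.0 - w)        # hull efficiency
    P_D = P_E / (eta_H * eta_O * eta_R)  # delivered power [W]
    return P_D / eta_S                   # brake power [W]

# Illustrative numbers only
print(brake_power(V=10.3, S=9500.0, C_F=1.5e-3, C_R=0.8e-3,
                  w=0.25, t=0.18, eta_O=0.55, eta_R=1.01, eta_S=0.99))
```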

Causality, causal discovery, causal inference and counterfactuals in Civil Engineering: Causal machine learning and case studies for knowledge discovery

  • M.Z. Naser;Arash Teymori Gharah Tapeh
    • Computers and Concrete / Vol. 31, No. 4 / pp.277-292 / 2023
  • Many of our experiments are designed to uncover the cause(s) and effect(s) behind a phenomenon (i.e., a data-generating mechanism) we happen to be interested in. Uncovering such relationships allows us to identify the true workings of a phenomenon and, most importantly, to realize and articulate a model to explore the phenomenon at hand and/or predict it accurately. Fundamentally, such models are likely to be derived via a causal approach (as opposed to observational or empirical means). In this approach, causal discovery is required to create a causal model, which can then be applied to infer the influence of interventions and answer hypothetical questions (i.e., in the form of "what ifs?") that commonly used prediction- and statistics-based models may not be able to address. Through this lens, this paper builds a case for causal discovery and causal inference and contrasts them against common machine learning approaches, all from a civil and structural engineering perspective. More specifically, this paper outlines the key principles of causality and the most commonly used algorithms and packages for causal discovery and causal inference. Finally, it presents a series of examples and case studies of how causal concepts can be adopted in our domain.
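
As a toy illustration of the gap between observational conditioning and intervention that the paper argues for, a back-door adjustment over a single confounder can be computed directly; the variables and probabilities below are invented for the example:

```python
# Toy illustration of causal inference by back-door adjustment:
# estimate P(Y=1 | do(X=1)) = sum_z P(Y=1 | X=1, Z=z) * P(Z=z),
# which the observational P(Y=1 | X=1) cannot recover when Z confounds X and Y.
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
Z = rng.integers(0, 2, n)                      # confounder (e.g., load class)
X = rng.binomial(1, 0.2 + 0.6 * Z)             # treatment depends on Z
Y = rng.binomial(1, 0.1 + 0.3 * X + 0.4 * Z)   # outcome depends on X and Z

observational = Y[X == 1].mean()
adjusted = sum(Y[(X == 1) & (Z == z)].mean() * (Z == z).mean() for z in (0, 1))
print(f"P(Y=1 | X=1)     = {observational:.3f}")  # inflated by confounding
print(f"P(Y=1 | do(X=1)) = {adjusted:.3f}")       # ~0.6 by construction
```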

유전자 알고리즘을 활용한 인공신경망 모형 최적입력변수의 선정 : 부도예측 모형을 중심으로 (Using GA based Input Selection Method for Artificial Neural Network Modeling Application to Bankruptcy Prediction)

  • 홍승현;신경식
    • 한국지능정보시스템학회 학술대회논문집 / 1999 Fall Conference (Information Technology and Future Organization) / pp.365-373 / 1999
  • Recently, numerous studies have demonstrated that artificial intelligence techniques such as neural networks can be an alternative methodology for classification problems to which traditional statistical methods have long been applied. In building a neural network model, the selection of independent and dependent variables should be approached with great care and treated as part of the model construction process. Irrespective of the efficiency of a learning procedure in terms of convergence, generalization, and stability, the ultimate performance of the estimator will depend on the relevance of the selected input variables and the quality of the data used. Approaches developed in statistics, such as correlation analysis and stepwise selection, are often very useful. These methods, however, may not be optimal for the development of neural network models. In this paper, we propose a genetic algorithms approach to find an optimal or near-optimal set of input variables for neural network modeling. The proposed approach is demonstrated through an application to bankruptcy prediction modeling. Our experimental results show that this approach significantly increases the overall classification accuracy rate.

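A hedged sketch of the paper's idea, GA-based selection of input variables for a neural-network classifier, is given below; the synthetic data, population size, mutation rate, and network size are placeholders, not the authors' settings:

```python
# Sketch of GA-based input selection: individuals are feature bitmasks, and
# fitness is the cross-validated accuracy of a small neural network trained
# on the selected inputs only. Data are synthetic placeholders.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=200, n_features=20, n_informative=6, random_state=0)
rng = np.random.default_rng(0)

def fitness(mask):
    if mask.sum() == 0:
        return 0.0
    clf = MLPClassifier(hidden_layer_sizes=(8,), max_iter=300, random_state=0)
    return cross_val_score(clf, X[:, mask.astype(bool)], y, cv=3).mean()

pop = rng.integers(0, 2, size=(10, X.shape[1]))       # random initial population of bitmasks
for generation in range(5):
    scores = np.array([fitness(ind) for ind in pop])
    parents = pop[np.argsort(scores)[-4:]]             # keep the fittest individuals
    children = []
    while len(children) < len(pop):
        a, b = parents[rng.integers(0, len(parents), 2)]
        cut = rng.integers(1, X.shape[1])               # single-point crossover
        child = np.concatenate([a[:cut], b[cut:]])
        flip = rng.random(X.shape[1]) < 0.05            # mutation
        children.append(np.where(flip, 1 - child, child))
    pop = np.array(children)

best = pop[np.argmax([fitness(ind) for ind in pop])]
print("selected inputs:", np.flatnonzero(best))
```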

개념 설계 단계에서 인공 신경망과 통계적 분석을 이용한 제품군의 근사적 전과정 평가 (Approximate Life Cycle Assessment of Classified Products using Artificial Neural Network and Statistical Analysis in Conceptual Product Design)

  • 박지형;서광규
    • 한국정밀공학회지 / Vol. 20, No. 3 / pp.221-229 / 2003
  • In the early phases of the product life cycle, Life Cycle Assessment (LCA) has recently been used to support decision-making in conceptual product design, where the best alternative can be selected based on its estimated LCA and its benefits. Both the lack of detailed information and the time required for a full LCA over a wide range of design concepts call for a new approach to environmental analysis. This paper suggests a novel approximate LCA methodology for the conceptual design stage, grouping products according to their environmental characteristics and mapping product attributes onto an impact driver index. The relationship is statistically verified by exploring the correlation between the total impact indicator and the energy impact category. A neural network approach is then developed to predict an approximate LCA of grouped products in conceptual design. Learning algorithms trained on the known characteristics of existing products quickly give LCA results for new product designs. The training is generalized by using product attributes for an ID in a group as well as product attributes for other IDs in other groups. A neural network model with the back-propagation algorithm is used, and the results are compared with those of multiple regression analysis. The proposed approach does not replace a full LCA, but it gives useful guidelines for the design of environmentally conscious products in the conceptual design phase.
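
A minimal sketch of the comparison described above (a back-propagation neural network versus multiple regression for approximating an impact indicator from product attributes) could look like the following; the synthetic data and model sizes are assumptions for illustration only:

```python
# Hedged sketch: neural network vs. multiple regression for approximating an
# LCA impact indicator from product attributes. Data are synthetic placeholders.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_percentage_error
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
attrs = rng.uniform(size=(200, 4))  # e.g., mass, power, lifetime, material mix
impact = 1.0 + 2.0 * attrs[:, 0] + attrs[:, 1] ** 2 + 0.3 * attrs[:, 2] + rng.normal(0, 0.05, 200)

X_tr, X_te, y_tr, y_te = train_test_split(attrs, impact, random_state=0)
nn = MLPRegressor(hidden_layer_sizes=(16,), max_iter=2000, random_state=0).fit(X_tr, y_tr)
lr = LinearRegression().fit(X_tr, y_tr)
print("NN  MAPE:", mean_absolute_percentage_error(y_te, nn.predict(X_te)))
print("OLS MAPE:", mean_absolute_percentage_error(y_te, lr.predict(X_te)))
```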

An Ensemble Approach to Detect Fake News Spreaders on Twitter

  • Sarwar, Muhammad Nabeel;UlAmin, Riaz;Jabeen, Sidra
    • International Journal of Computer Science & Network Security / Vol. 22, No. 5 / pp.294-302 / 2022
  • Detection of fake news is a complex and challenging task. The generation of fake news is very hard to stop; only steps to control its circulation can help minimize its impact. Humans tend to believe misleading false information. Researchers started with social media sites to categorize news as real or fake. False information misleads individuals and organizations and may cause major failures and financial losses. Automatic detection of false information circulating on social media is an emerging area of research and has been gaining the attention of both industry and academia since the 2016 US presidential elections. Fake news has severe negative effects on individuals and organizations, extending its hostile effects to society, so predicting fake news in a timely manner is important. This research focuses on the detection of fake news spreaders. In this context, six models are developed, trained, and tested with the PAN 2020 dataset. N-gram-based and user-statistics-based models are trained with different hyperparameter values, and an extensive grid search with cross-validation is applied to each machine learning model. For the N-gram-based models, out of the many available machine learning algorithms, this research focuses on those yielding better results, as assessed by a close reading of state-of-the-art related work in the field: Random Forest, Logistic Regression, SVM, and XGBoost. All four algorithms were trained with cross-validated, grid-searched hyperparameters. The advantages of this research over previous work are the user-statistics-based model and the ensemble learning model, which were designed to classify Twitter users as fake news spreaders or not with high reliability. The user-statistics model used 17 features, on the basis of which it categorized a Twitter user as malicious. A new dataset based on the predictions of the machine learning models was then constructed, and three combination techniques (simple mean, logistic regression, and random forest) were applied in the ensemble model. Logistic regression in the ensemble model gave the best training and testing results, achieving an accuracy of 72%.
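
A rough sketch of the two-stage design described above (base models tuned by cross-validated grid search, then a logistic-regression ensemble over their predictions) is shown below; the texts, the 17 user-level features, the labels, and the hyperparameter grids are placeholders, and the ensemble is fit in-sample only for brevity:

```python
# Hedged sketch: an n-gram text model and a user-statistics model, each tuned
# with cross-validated grid search, combined by a logistic-regression ensemble.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline

texts = ["breaking!!! shocking cure found", "official report released today"] * 20
user_stats = np.random.RandomState(0).rand(40, 17)   # 17 user-level features (placeholder)
y = np.array([1, 0] * 20)                             # 1 = fake-news spreader

ngram_model = GridSearchCV(
    make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression(max_iter=1000)),
    {"logisticregression__C": [0.1, 1, 10]}, cv=5).fit(texts, y)
stats_model = GridSearchCV(
    RandomForestClassifier(random_state=0),
    {"n_estimators": [100, 300]}, cv=5).fit(user_stats, y)

# Stage 2: ensemble over the base models' predicted probabilities
meta_X = np.column_stack([ngram_model.predict_proba(texts)[:, 1],
                          stats_model.predict_proba(user_stats)[:, 1]])
ensemble = LogisticRegression().fit(meta_X, y)
print("ensemble training accuracy:", ensemble.score(meta_X, y))
```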

Sparse Multinomial Kernel Logistic Regression

  • Shim, Joo-Yong;Bae, Jong-Sig;Hwang, Chang-Ha
    • Communications for Statistical Applications and Methods / Vol. 15, No. 1 / pp.43-50 / 2008
  • Multinomial logistic regression is a well-known multiclass classification method in the field of statistical learning. More recently, the development of sparse multinomial logistic regression models has found application in microarray classification, where explicit identification of the most informative observations is of value. In this paper, we propose a sparse multinomial kernel logistic regression model in which the sparsity arises from the use of a Laplacian prior, and a fast exact algorithm is derived by employing a bound optimization approach. Experimental results are presented to indicate the performance of the proposed procedure.
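
One standard way to write down such a model (kernel-expanded class scores with a softmax link, and a Laplacian prior that acts as an L1 penalty inducing sparsity) is sketched below; the notation is generic and not taken verbatim from the paper:

```latex
% Softmax over kernel-expanded class scores, and MAP estimation under a
% Laplacian prior on the coefficients (equivalent to an L1 penalty).
\[
  P(y = c \mid \mathbf{x}) \;=\;
  \frac{\exp\big(f_c(\mathbf{x})\big)}{\sum_{j=1}^{C}\exp\big(f_j(\mathbf{x})\big)},
  \qquad
  f_c(\mathbf{x}) \;=\; \sum_{i=1}^{n} \alpha_{ci}\, K(\mathbf{x}, \mathbf{x}_i) + b_c .
\]
\[
  \hat{\boldsymbol{\alpha}} \;=\;
  \arg\max_{\boldsymbol{\alpha}}
  \sum_{i=1}^{n} \log P\big(y_i \mid \mathbf{x}_i\big)
  \;-\; \lambda \sum_{c,i} \lvert \alpha_{ci} \rvert .
\]
```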

Effects of a GAISE-based teaching method on students' learning in introductory statistics

  • Erhardt, Erik Barry;Lim, Woong
    • Communications for Statistical Applications and Methods / Vol. 27, No. 3 / pp.269-284 / 2020
  • This study compares two teaching methods in an introductory statistics course at a large state university. The first method is the traditional lecture-based approach. The second implements a flipped classroom that incorporates the recommendations of the American Statistical Association's Guidelines for Assessment and Instruction in Statistics Education (GAISE) College Report. We compare the two methods based on student performance, illustrate the procedures of the flipped pedagogy, and discuss the impact of aligning our course with current guidelines for teaching statistics at the college level. Results show that students in the flipped class performed better than students in the traditional delivery. Student questionnaire responses also indicate that students in the flipped delivery aligned with the GAISE recommendations developed a productive mindset toward statistics.

Optimized Chinese Pronunciation Prediction by Component-Based Statistical Machine Translation

  • Zhu, Shunle
    • Journal of Information Processing Systems / Vol. 17, No. 1 / pp.203-212 / 2021
  • To eliminate ambiguities in the existing methods for simplifying Chinese pronunciation learning, we propose a model that can predict the pronunciation of Chinese characters automatically. The proposed model relies on a statistical machine translation (SMT) framework. In particular, we take the components of Chinese characters as the basic unit and treat pronunciation prediction as a machine translation procedure (the component sequence as the source sentence and the pronunciation, pinyin, as the target sentence). In addition to traditional features such as bidirectional word translation and an n-gram language model, we also implement a component similarity feature to handle typos that occur in practical use. We incorporate these features into a log-linear model. The experimental results show that our approach significantly outperforms the baseline models.
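
The log-linear decision rule underlying this kind of SMT model can be written as follows, where e is the candidate pinyin sequence, f the component sequence, and the h_m are feature functions (translation models, n-gram language model, component similarity); this is the generic formulation, not the paper's exact notation:

```latex
% Generic log-linear SMT decision rule: the best target sequence maximizes a
% weighted sum of feature functions over the source sequence.
\[
  e^{*} \;=\; \arg\max_{e}\; P(e \mid f)
        \;=\; \arg\max_{e}\; \sum_{m=1}^{M} \lambda_m\, h_m(e, f).
\]
```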

A New Methodology for Software Reliability based on Statistical Modeling

  • Avinash S;Y.Srinivas;P.Annan naidu
    • International Journal of Computer Science & Network Security / Vol. 23, No. 9 / pp.157-161 / 2023
  • Reliability is one of the quantifiable quality attributes of software. To assess reliability, software reliability growth models (SRGMs) based on statistical learning models are used at different test times. Traditional time-based SRGMs may not be sufficient in all situations, and such models cannot recognize errors in small and medium-sized applications. Numerous traditional reliability measures are used to test for software errors during application development and testing. In the software testing and maintenance phase, however, new errors are taken into consideration in real time in order to determine the reliability estimate. In this article, we suggest the Weibull model as a computational approach to address the problem of software reliability modeling. In the suggested model, a new distribution is proposed to improve the reliability estimation method. We evaluate the developed model and compare its efficiency with other popular software reliability growth models from the literature. Our assessment results show that the proposed model performs better than the S-shaped Yamada, Generalized Poisson, and NHPP models.
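
A Weibull-type mean value function of the kind the abstract points to can be written as below, with m(t) the expected number of faults detected by time t; this is one common generic parameterization, not necessarily the paper's exact model:

```latex
% Weibull-type software reliability growth model: a is the total fault content,
% b and c are scale and shape parameters of the fault-detection process.
\[
  m(t) \;=\; a\Big(1 - e^{-(b\,t)^{c}}\Big),
  \qquad a > 0,\; b > 0,\; c > 0 .
\]
```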