• Title/Summary/Keyword: random variable (랜덤변수)

Search Result 259, Processing Time 0.023 seconds

Movie Box-office Prediction using Deep Learning and Feature Selection : Focusing on Multivariate Time Series

  • Byun, Jun-Hyung;Kim, Ji-Ho;Choi, Young-Jin;Lee, Hong-Chul
    • Journal of the Korea Society of Computer and Information
    • /
    • v.25 no.6
    • /
    • pp.35-47
    • /
    • 2020
  • Box-office prediction is important to movie stakeholders, which makes it necessary both to predict box-office accurately and to identify the important variables. In this paper, we propose a multivariate time series classification and important-variable selection method to improve the accuracy of box-office prediction. We collected daily data from KOBIS and NAVER for South Korean movies, selected important variables using Random Forest, and predicted multivariate time series using Deep Learning. Based on the Korean screen quota system, Deep Learning was used to compare the accuracy of box-office predictions on the 73rd day after movie release using the important variables against using all variables, and the results were tested for statistical significance. The Deep Learning models used were Multi-Layer Perceptron, Fully Convolutional Neural Networks, and Residual Network. Among them, the model using the important variables and Residual Network had the highest prediction accuracy, at 93%.

Machine learning model for residual chlorine prediction in sediment basin to control pre-chlorination in water treatment plant (정수장 전염소 공정제어를 위한 침전지 잔류염소농도 예측 머신러닝 모형)

  • Kim, Juhwan;Lee, Kyunghyuk;Kim, Soojun;Kim, Kyunghun
    • Journal of Korea Water Resources Association
    • /
    • v.55 no.spc1
    • /
    • pp.1283-1293
    • /
    • 2022
  • The purpose of this study is to predict residual chlorine with artificial intelligence algorithms so that a stable residual chlorine concentration can be maintained in the sedimentation basin of a water treatment process employing pre-chlorination. Available water quantity and quality data were collected and analyzed statistically, then applied to a multiple regression model and to artificial intelligence models including multi-layer perceptron neural network, random forest, and long short term memory (LSTM) algorithms. Water temperature, turbidity, pH, conductivity, flow rate, alkalinity, and pre-chlorination dosage were used as the input parameters of the prediction models. The random forest algorithm gave the most suitable prediction results among the four models (long short term memory, multi-layer perceptron, multiple regression, and random forest). In particular, the multiple regression model could not represent residual chlorine, because the input parameters vary independently with seasonal change and differ in numerical scale and dimension between the quantity and quality data. For this reason, the random forest model, a decision-tree-type algorithm, is more appropriate for predicting water quality than the other algorithms. Real-time prediction by artificial intelligence models is also expected to contribute to the stable control of residual chlorine in water treatment plants that include a pre-chlorination process.

Fast Blind Image Denoising Algorithm Based on Estimating Noise Parameters (노이즈 매개변수 예측 기반 고속 노이즈 제거 방식)

  • Nguyen, Tuan-Anh;Kim, Beomsu;Hong, Min-Cheol
    • Journal of IKEEE
    • /
    • v.18 no.4
    • /
    • pp.523-531
    • /
    • 2014
  • In this paper, a fast single-image blind denoising algorithm is presented, in which noise parameters are estimated from local statistics of an observed degraded image without a priori information about the additive noise. The estimated noise parameters are used to define the constraints on noise detection, which is coupled with a 1st-order Markov Random Field. In addition, an adaptive modified weighted Gaussian filter is introduced, in which variable window sizes and weighting coefficients defined by the constraints control the degree of smoothness of the reconstructed image. The experimental results demonstrate the capability of the proposed algorithm.

FEM-based Seismic Reliability Analysis of Real Structural Systems (실제 구조계의 유한요소법에 기초한 지진 신뢰성해석)

  • Huh Jung-Won;Haldar Achintya
    • Journal of the Computational Structural Engineering Institute of Korea
    • /
    • v.19 no.2 s.72
    • /
    • pp.171-185
    • /
    • 2006
  • A sophisticated reliability analysis method is proposed to evaluate the reliability of real, complicated nonlinear dynamic structural systems excited by short-duration dynamic loadings such as earthquake motions. The method integrates the response surface method, the finite element method, the first-order reliability method, and an iterative linear interpolation scheme. It explicitly considers all major sources of nonlinearity and uncertainty in the load and resistance-related random variables. The unique feature of the technique is that the seismic loading is applied in the time domain, providing an alternative to the classical random vibration approach. The four-parameter Richard model is used to represent the flexibility of connections of real steel frames. Uncertainties in the Richard parameters are also incorporated in the algorithm. The laterally flexible steel frame is then reinforced with reinforced concrete shear walls. The stiffness degradation of shear walls after cracking is also considered. The applicability of the method to estimating the reliability of real structures is demonstrated with three examples: a laterally flexible steel frame with fully restrained connections, the same steel frame with partially restrained connections of different rigidities, and a steel frame reinforced with concrete shear walls.

Prediction of golf scores on the PGA tour using statistical models (PGA 투어의 골프 스코어 예측 및 분석)

  • Lim, Jungeun;Lim, Youngin;Song, Jongwoo
    • The Korean Journal of Applied Statistics
    • /
    • v.30 no.1
    • /
    • pp.41-55
    • /
    • 2017
  • This study predicts the average scores of the top 150 PGA golf players in 132 PGA Tour tournaments (2013-2015) using data mining techniques and statistical analysis. It also aims to predict the Top 10 and Top 25 players in 4 different playoffs. Linear and nonlinear regression methods were used to predict average scores. Stepwise regression, best subset, LASSO, ridge regression, and principal component regression were used as the linear regression methods. Tree, bagging, gradient boosting, neural network, random forests, and KNN were used as the nonlinear regression methods. We found that the average score increases as fairway firmness, green height, or average maximum wind speed increases. We also found that the average score decreases as the number of one-putts, the scrambling variable, or the longest driving distance increases. All 11 models have low prediction error when predicting the average scores of the 2015 PGA tournaments, which are not included in the training set. However, the Bagging and Random Forest models perform best among all models, and these two models have the highest prediction accuracy when predicting the Top 10 and Top 25 players in the 4 playoffs.

Optimization of Input Features for Vegetation Classification Based on Random Forest and Sentinel-2 Image (랜덤포레스트와 Sentinel-2를 이용한 식생 분류의 입력특성 최적화)

  • LEE, Seung-Min;JEONG, Jong-Chul
    • Journal of the Korean Association of Geographic Information Studies
    • /
    • v.23 no.4
    • /
    • pp.52-67
    • /
    • 2020
  • In recent years, snow-covered land in the Arctic has been exposed every year as permafrost melts, and the Korea Geographic Information Institute (NGII) provides a polar spatial information service by establishing spatial information for the polar region. However, spatial information on vegetation, which is sensitive to climate change, is lacking. This research used multi-temporal Sentinel-2 images to perform land cover classification of Ny-Ålesund in Arctic Svalbard. In the pre-processing step, 10 bands and 6 vegetation spectral indices were generated from the multi-temporal Sentinel-2 images. The image classification step consisted of extracting the vegetation area through 8-class land cover classification and then classifying the vegetation species. Random Forest was used as the image classification algorithm, with accuracy evaluated and feature importance calculated through Out-Of-Bag (OOB) estimation. To identify the advantages of multi-temporal Sentinel-2 images for vegetation classification, the overall accuracy was compared according to the number of images stacked and the vegetation spectral indices used. Overall accuracy was 77% with a single-date Sentinel-2 image and improved to 81% with multi-temporal Sentinel-2 images. The overall accuracy improved further, to about 83%, when the vegetation indices were added. The spectral variables most important for distinguishing between vegetation classes lie in the Red, Green, and short wave infrared-1 (SWIR1) bands. This research can serve as a basic study for optimizing input features when classifying vegetation in the polar regions.

A Study on Fault Diagnosis of Compressors Using the Pattern Recognition Method (패턴인식법에 의한 압축기의 이상진단에 관한 연구)

  • 김태구;김광일
    • Proceedings of the Korean Institute of Industrial Safety Conference
    • /
    • 2001.11a
    • /
    • pp.25-30
    • /
    • 2001
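  • The dynamic characteristics of the vibration of machinery such as engines and compressors form a random process with irregularly fluctuating components, and it is difficult to describe them precisely in mathematical terms [1]. From a statistical standpoint, however, each time series can be regarded as a random variable belonging to its own population. Focusing on this point, the theory holds that if the stochastic features of the time series are extracted so that each time series can be distinguished in a probability space, then the state represented by the time series can be identified [2]. (Abridged)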

An Improved Partitioning Algorithm in Hardware Software Codesign (하드웨어 소프트웨어 통합설계에서의 개선된 분할 알고리즘)

  • Oh, Ju-Young
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2001.10a
    • /
    • pp.689-692
    • /
    • 2001
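  • This paper proposes a partitioning algorithm that decides which parts to implement in hardware and which in software, in order to synthesize a low-cost, high-efficiency target that satisfies the given constraints. In the simulated annealing presented in paper [6], candidates are selected by moving nodes at random, so the same candidates are selected repeatedly and the search takes a long time. To overcome this drawback, this paper selects candidates by considering those variables of the cost function that can affect system execution time and implementation cost, which shortens the execution time of the partitioning algorithm's search for an optimal solution. Experimental results show that, as the number of target nodes grows, the proposed method finds the optimal solution faster than the existing method.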
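
The contrast between random and cost-aware candidate selection in simulated annealing can be sketched roughly as follows. This is a minimal illustration, not the paper's algorithm: the node data, the deadline, the penalty weight, and the `pick_guided` weighting heuristic are all hypothetical, and execution time is assumed to be the simple sum of node times.

```python
import math
import random

# Hypothetical nodes: (software time, hardware time, hardware cost).
NODES = [(8, 2, 5), (6, 3, 4), (9, 1, 9), (4, 3, 2), (7, 2, 6)]
DEADLINE = 20  # assumed timing constraint


def cost(in_hw):
    """Hardware cost plus an assumed penalty for exceeding the deadline."""
    time = sum(hw if h else sw for (sw, hw, _), h in zip(NODES, in_hw))
    area = sum(c for (_, _, c), h in zip(NODES, in_hw) if h)
    return area + 10 * max(0, time - DEADLINE)


def pick_random(in_hw, rng):
    """Baseline: choose the node to move uniformly at random."""
    return rng.randrange(len(NODES))


def pick_guided(in_hw, rng):
    """Assumed heuristic: favor nodes whose move changes time or cost most."""
    weights = [abs(sw - hw) + c for sw, hw, c in NODES]
    return rng.choices(range(len(NODES)), weights=weights)[0]


def anneal(pick, seed=0, temp=10.0, cooling=0.95, steps=500):
    rng = random.Random(seed)
    current = [False] * len(NODES)  # start with every node in software
    best = current[:]
    for _ in range(steps):
        i = pick(current, rng)
        candidate = current[:]
        candidate[i] = not candidate[i]  # move node i across the partition
        delta = cost(candidate) - cost(current)
        # Accept improvements; accept worse moves with Boltzmann probability.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current = candidate
            if cost(current) < cost(best):
                best = current[:]
        temp = max(temp * cooling, 1e-3)
    return best, cost(best)
```

Calling `anneal(pick_random)` and `anneal(pick_guided)` with the same seed gives a simple way to compare the two selection strategies on this toy instance; any speed-up seen here says nothing about the results reported in the paper.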

Field data analyses for repairable products (수리가능한 제품의 사용현장 데이터 분석)

  • 배도선;윤형제;최인수
    • The Korean Journal of Applied Statistics
    • /
    • v.8 no.2
    • /
    • pp.133-145
    • /
    • 1995
  • This paper is concerned with estimating the lifetime distribution from field data for repairable products with multiple modes of failure, and is an extension of Bai et al. (1995). A log-linear function is considered as the model describing the relation between the failure time of a product and covariates. Using the nonhomogeneous Poisson process, general methods for obtaining pseudo maximum likelihood estimators (PMLEs) of the parameters are outlined, and specific formulas for the Weibull distribution are obtained. The effects of the follow-up percentage on the PMLEs are investigated. An extension to the case-cohort design is also considered.

Efficient Methods for Reducing Clock Cycles in VHDL Model Verification (VHDL 모델 검증의 효율적인 시간단축 방법)

  • Kim, Kang-Chul
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.40 no.12
    • /
    • pp.39-45
    • /
    • 2003
  • Design verification of VHDL models is becoming more difficult and has become a critical and time-consuming process in hardware design. Recently, methods using Bayesian estimation and a stopping rule have been introduced to verify behavioral models and to reduce clock cycles. This paper presents two strategies for reducing clock cycles when using a stopping rule in VHDL model verification. The first defines a semi-random variable and skips the data that stay within the range of the semi-random variable while the stopping rule is running. The second keeps the old values of the parameters when the phases of the stopping rule change. Twelve VHDL models are examined to observe the effectiveness of the strategies, and the simulation results show that the two proposed strategies reduce clock cycles by more than about 25% with a 0.6% loss in branch coverage rate.