• Title/Summary/Keyword: R package

Search Result 175, Processing Time 0.027 seconds

Comparision of Missing Imputaion Methods In fine dust data (미세먼지 자료에서의 결측치 대체 방법 비교)

  • Kim, YeonJin;Park, HeonJin
    • The Journal of Bigdata / v.4 no.2 / pp.105-114 / 2019
  • Missing value imputation is one of the major issues in data analysis. Ignoring missing values and proceeding with the analysis can introduce bias and yield incorrect estimates. In this paper, we look for an appropriate imputation method for missing values in weather data. To this end, we run simulations for various situations and compare existing R-based methods such as MICE and MissForest with time series-based models. Comparing the results variable by variable, we found that Kalman filtering on an auto ARIMA model, using the imputeTS package, and the MissForest model gave good results on the weather data.

Simple principal component analysis using Lasso (라소를 이용한 간편한 주성분분석)

  • Park, Cheolyong
    • Journal of the Korean Data and Information Science Society / v.24 no.3 / pp.533-541 / 2013
  • In this study, a simple principal component analysis using Lasso is proposed. The method consists of two steps. The first step computes the principal components by ordinary principal component analysis. The second step regresses each principal component on the original data matrix by Lasso regression. Each new principal component is then computed as a linear combination of the columns of the original data matrix, with the scaled Lasso regression coefficient estimates as the weights. The method is justified because the estimator of the regression of each principal component on the original data matrix is the corresponding eigenvector, and, by the properties of Lasso regression, it yields easily interpretable principal components with more zero coefficients. The method is applied to real and simulated data sets with the help of an R package for Lasso regression, and its usefulness is demonstrated.
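
The two steps described in this abstract can be sketched in Python with NumPy. This is a minimal illustration, not the author's implementation (the paper uses an R package for Lasso): the function names `lasso_cd` and `simple_lasso_pca`, the coordinate-descent solver, the penalty value `alpha`, and the choice of unit-norm scaling for the coefficients are all assumptions made for the example.

```python
import numpy as np

def lasso_cd(X, y, alpha, n_iter=500):
    """Lasso by cyclic coordinate descent (hypothetical helper).

    Minimizes (1 / (2n)) * ||y - X b||^2 + alpha * ||b||_1.
    """
    n, p = X.shape
    b = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n   # x_j' x_j / n for each column
    r = y.copy()                        # residual y - X b, with b = 0
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of column j with the partial residual.
            rho = X[:, j] @ r / n + col_sq[j] * b[j]
            # Soft-thresholding produces exact zeros.
            new = np.sign(rho) * max(abs(rho) - alpha, 0.0) / col_sq[j]
            r += X[:, j] * (b[j] - new)
            b[j] = new
    return b

def simple_lasso_pca(X, n_components, alpha):
    """Return (ordinary loadings, sparse loadings), each of shape (p, k)."""
    Xc = X - X.mean(axis=0)
    # Step 1: ordinary PCA via the SVD of the centered data.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    V = Vt[:n_components].T
    scores = Xc @ V
    # Step 2: Lasso regression of each component score on the data,
    # then rescale the coefficients (unit norm is an assumption here).
    sparse = np.zeros_like(V)
    for k in range(n_components):
        b = lasso_cd(Xc, scores[:, k], alpha)
        norm = np.linalg.norm(b)
        sparse[:, k] = b / norm if norm > 0 else b
    return V, sparse
```

The new component scores would then be `(X - X.mean(axis=0)) @ sparse`. With `alpha = 0` the regression reduces to least squares, whose solution is the eigenvector itself when the centered data matrix has full column rank; larger `alpha` drives small loadings to exactly zero.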

Prediction of K-league soccer scores using bivariate Poisson distributions (이변량 포아송분포를 이용한 K-리그 골 점수의 예측)

  • Lee, Jang Taek
    • Journal of the Korean Data and Information Science Society / v.25 no.6 / pp.1221-1229 / 2014
  • In this paper we choose the best model among several bivariate Poisson models for Korean soccer data. The models considered allow for correlation between the numbers of goals scored by the two competing teams. We use an R package called bivpois for bivariate Poisson regression models, together with K-league data for the 1983-2012 seasons. We conclude that the best-fitting model according to both AIC and BIC is the bivariate Poisson model with constant covariance. The zero-inflated and diagonal-inflated models did not improve the fit. The model can be used to examine the home-away effect, goodness of fit, and attack and defense parameters.
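
To make the "constant covariance" model concrete, the sketch below evaluates the standard bivariate Poisson probability mass function, in which X = W1 + W3 and Y = W2 + W3 for independent Poisson variables with means lam1, lam2, lam3, so that Cov(X, Y) = lam3. It is an illustration in plain Python, not the bivpois package's API; the function names and the parameter values used in the example are hypothetical.

```python
from math import exp, factorial

def bivpois_pmf(x, y, lam1, lam2, lam3):
    """P(X = x, Y = y) for the bivariate Poisson distribution.

    X = W1 + W3 and Y = W2 + W3 with independent Poisson W1, W2, W3,
    so lam3 is the covariance between the two goal counts.
    """
    total = 0.0
    for k in range(min(x, y) + 1):      # k = value of the shared term W3
        total += (lam1 ** (x - k) / factorial(x - k)
                  * lam2 ** (y - k) / factorial(y - k)
                  * lam3 ** k / factorial(k))
    return exp(-(lam1 + lam2 + lam3)) * total

def outcome_probs(lam1, lam2, lam3, max_goals=40):
    """Approximate (home win, draw, away win) by summing over a score grid."""
    home = draw = away = 0.0
    for x in range(max_goals):
        for y in range(max_goals):
            p = bivpois_pmf(x, y, lam1, lam2, lam3)
            if x > y:
                home += p
            elif x == y:
                draw += p
            else:
                away += p
    return home, draw, away
```

Calling `outcome_probs(1.2, 0.9, 0.2)` with these made-up rates gives three probabilities that sum to approximately one; in a regression setting lam1 and lam2 would depend on home advantage and team attack and defense parameters, while lam3 stays constant.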

Long-Term Operation Modeling for the Hydropower Reservoir in the Han River Basin Using Linear Programming (선형계획법을 이용한 한강 수계 수력발전 댐 장기모형 구축)

  • Lee, Eunkyung;Ji, Jungwon;Yi, Jaeeung
    • Proceedings of the Korea Water Resources Association Conference / 2015.05a / pp.156-156 / 2015
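  • Environmental damage such as global warming caused by fossil fuel use has been steadily increasing, and hydropower, one of the renewable energy sources, is drawing attention as a result. Hydropower operates in an environmentally friendly way, converting the potential energy of water into mechanical energy and then into electrical energy. Hydropower accounts for only about 1.5% of Korea's total power generation, but it is essential because its short start-up time makes it possible to respond to sudden changes in electricity demand. Under climate change, annual mean precipitation tends to increase while the number of rainy days per year decreases, so the uncertainty of water resources is growing. Research on the efficient use of water resources is therefore needed to prepare for an uncertain future water supply. In this study, considering that river flow varies widely by season, we built a model that applies linear programming to maximize monthly power generation. In linear programming, both the objective function and the constraints are first-order expressions and cannot contain nonlinear terms, but the method has the advantages that no initial solution is required and an optimal solution is guaranteed. When the objective function or some constraints contain nonlinear terms, they can be linearized using methods such as Successive Linear Programming (SLP), Piecewise Linear Programming (PLP), or Taylor expansion. In this study, the nonlinear constraints were linearized using Taylor expansion, and a long-term operation model was built that maximizes the monthly power generation of nine dams in the Han River basin. The development environment is Linux-CentOS, and the program used is R, which is widely used for statistical analysis. R makes package-based development easy and is compatible not only with Windows but also with Linux, Mac, Unix, and other operating systems.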
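
As a hedged illustration of the Taylor-expansion step, assume (the abstract does not say which constraint is nonlinear) that the nonlinear term is the product of turbine release $Q$ and effective head $H$ in the standard hydropower relation $P = \eta \rho g Q H$. A first-order expansion about an operating point $(Q_0, H_0)$ gives an expression that is linear in $Q$ and $H$:

$$QH \approx Q_0 H_0 + H_0 (Q - Q_0) + Q_0 (H - H_0) = H_0 Q + Q_0 H - Q_0 H_0$$

The symbols $\eta$, $\rho$, $g$, $Q_0$, and $H_0$ are illustrative. The approximation error is exactly $(Q - Q_0)(H - H_0)$, so the linearized constraint is accurate when the solution stays close to the chosen operating point.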

A MA-plot-based Feature Selection by MRMR in SVM-RFE in RNA-Sequencing Data

  • Kim, Chayoung
    • The Journal of Korean Institute of Information Technology / v.16 no.12 / pp.25-30 / 2018
  • Methods for constructing a Gene Regulatory Network (GRN) from RNA-Sequencing (RNA-Seq) data are lacking and urgently needed, because in Big Data settings GRNs have received substantial attention as a description of the interactions among relevant feature genes and their regulation. We propose a new computational method for selecting comparative feature patterns that implements a minimum-redundancy maximum-relevancy (MRMR) filter in support vector machine-recursive feature elimination (SVM-RFE), with intensity-dependent normalization (DEGSEQ) as a preprocessor to emphasize equal precision in RNA-Seq Big Data. We found that the proposed algorithm may be more scalable and convenient because all of its libraries are available as R packages, and that it may improve both the running time on Big Data and the minimum-redundancy maximum-relevancy of the selected set of feature patterns.

Synthetic data generation by probabilistic PCA (주성분 분석을 활용한 재현자료 생성)

  • Min-Jeong Park
    • The Korean Journal of Applied Statistics / v.36 no.4 / pp.279-294 / 2023
  • A well-known way to generate synthetic data sets is the sequential regression multiple imputation (SRMI) method, and the R package synthpop is widely used for generating synthetic data by SRMI approaches. In this paper, I suggest generating synthetic data based on the probabilistic principal component analysis (PPCA) method. Two simple data sets are used in a simulation study to compare the SRMI and PPCA approaches. The simulation results demonstrate that pairwise coefficients in synthetic data sets generated by PPCA can be closer to the original ones than those generated by SRMI. Furthermore, for data types where PPCA applications are well established, such as time series data, the PPCA approach can be extended to generate synthetic data sets.
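
A minimal sketch of PPCA-based synthetic data generation follows. It uses the standard closed-form maximum-likelihood PPCA fit (noise variance equal to the mean of the discarded eigenvalues) and then samples from the fitted Gaussian model. It is not the paper's procedure: the function names `fit_ppca` and `sample_ppca` and the restriction to continuous variables are assumptions for illustration.

```python
import numpy as np

def fit_ppca(X, q):
    """Closed-form maximum-likelihood PPCA fit with q latent dimensions.

    Requires q < number of columns. Returns (mu, W, sigma2) for the model
    x = mu + W z + eps, z ~ N(0, I_q), eps ~ N(0, sigma2 * I).
    """
    mu = X.mean(axis=0)
    S = np.cov(X - mu, rowvar=False, bias=True)   # ML covariance (1/n)
    eigval, eigvec = np.linalg.eigh(S)
    order = np.argsort(eigval)[::-1]              # sort descending
    eigval, eigvec = eigval[order], eigvec[:, order]
    sigma2 = eigval[q:].mean()                    # average discarded variance
    scale = np.sqrt(np.maximum(eigval[:q] - sigma2, 0.0))
    W = eigvec[:, :q] * scale                     # scale each retained column
    return mu, W, sigma2

def sample_ppca(mu, W, sigma2, n, rng):
    """Draw n synthetic rows from the fitted PPCA model."""
    z = rng.standard_normal((n, W.shape[1]))
    noise = np.sqrt(sigma2) * rng.standard_normal((n, mu.size))
    return mu + z @ W.T + noise
```

The fitted covariance `W @ W.T + sigma2 * I` reproduces the leading `q` eigen-directions of the sample covariance exactly, which is one reason pairwise relationships in the synthetic rows can stay close to those in the original data.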

Effects of Graphene Oxide Addition on the Electromigration Characteristics of Sn-3.0Ag-0.5Cu Pb-free Solder Joints (Graphene Oxide 첨가에 따른 Sn-3.0Ag-0.5Cu 무연솔더 접합부의 Electromigration 특성 분석)

  • Son, Kirak;Kim, Gahui;Ko, Yong-Ho;Park, Young-Bae
    • Journal of the Microelectronics and Packaging Society / v.26 no.3 / pp.81-88 / 2019
  • In this study, the effects of graphene oxide (GO) addition on the electromigration (EM) lifetime of Sn-3.0Ag-0.5Cu Pb-free solder joints between a ball grid array (BGA) package and a printed circuit board (PCB) were investigated. In the as-bonded state, a $(Cu,Ni)_6Sn_5$ intermetallic compound (IMC) formed at the interface on the package side, which was finished with electroplated Ni/Au, while a $Cu_6Sn_5$ IMC formed at the interface on the OSP-treated PCB side. At $130^{\circ}C$ and a current density of $1.0{\times}10^3 A/cm^2$, the mean time to failure was 189.9 hrs for the solder joint without GO and 367.1 hrs for the joint with GO. EM open failure occurred at the interface on the PCB side, which has a smaller pad diameter than the package side, due to Cu consumption by electron flow. Meanwhile, we observed that the added GO was distributed at the interface between the $Cu_6Sn_5$ IMC and the solder. We therefore assume that the EM reliability of the solder joint with GO was superior to that of the joint without GO because GO suppressed Cu diffusion at current crowding regions.

Design and Implementation of TV Emulator Based On HTML5 based Smart TV Platform (HTML5 기반 스마트 TV 플랫폼 표준 기반 TV 에뮬레이터 설계 및 구현)

  • Kim, Ho-Youn;Kim, Jung-Hyun;Lee, Dong-Hoon;Park, Dong-Young
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2016.06a / pp.317-320 / 2016
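  • "HTML5-based Smart TV Platform (TTAK.KO-07.0111/R1)", established by the Telecommunications Technology Association (hereafter TTA), is a standard that defines the technical requirements of a smart TV platform, based on the open international technical standard W3C/HTML5, so that applications can run on smart TVs independently of the broadcast environment and operating system. To promote a TV app ecosystem based on this standard, TTA developed and distributed an app development kit (SDK), and then pursued the development of an emulator that can run standard-based apps in a PC environment without a TV device. The resulting emulator provides broadcast playback and control functions based on the user's broadcast information settings, and can run apps written with the standard technologies in either broadcast-linked or packaged form. This paper introduces the design and implementation of the standard-based TV emulator developed by TTA.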

A Study on the National Lawmaking Trends of the EU Electronic Communications Regulatory Package in the Member Nations (EU 통신법의 회원국내 국내법화 추진동향)

  • Kim, P.R.;Cha, S.M.
    • Electronics and Telecommunications Trends / v.20 no.2 s.92 / pp.103-114 / 2005
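  • The "electronic communications regulatory package" adopted by the EU can be regarded as a new regulatory framework reflecting the reality that the information and communications industry, formerly vertically integrated, is being subdivided into layers (physical networks, transmission services, and content) as Internet technology develops. The EU recommended that member states transpose this legislation into national law by July 2003, but only 5 of the 15 member states had done so by the deadline, and the European Commission's response is drawing attention. Korea is likewise expected to need to promptly put in place regulatory legislation for telecommunications-broadcasting convergence that suits its circumstances, taking the problems of this legislation into account.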

The Development of Pattern Classification for Inner Defects in Semiconductor packages by Self-Organizing map (자기조직화 지도를 이용한 반도체 패키지 내부결함의 패턴분류 알고리즘 개발)

  • 김재열;윤성운;김훈조;김창현;송경석;양동조
    • Proceedings of the Korean Society of Machine Tool Engineers Conference / 2002.10a / pp.80-84 / 2002
  • In this study, the researchers developed an estimative algorithm for artificial defects in semiconductor packages and evaluated it using pattern recognition technology. For this purpose, the algorithm was implemented as software written in MATLAB. The software consists of several procedures, including ultrasonic acquisition, equalization filtering, a self-organizing map, and a backpropagation neural network. The self-organizing map and the backpropagation neural network are both neural network methods. The pattern recognition technique was applied to classify three kinds of defect patterns in semiconductor packages: crack, delamination, and normal. According to the results, the estimative algorithm achieved recognition rates of 75.7% (crack), 83.4% (delamination), and 87.2% (normal).