• Title/Summary/Keyword: R package

Review of Spatial Linear Mixed Models for Non-Gaussian Outcomes (공간적 상관관계가 존재하는 이산형 자료를 위한 일반화된 공간선형 모형 개관)

  • Park, Jincheol
    • The Korean Journal of Applied Statistics
    • /
    • v.28 no.2
    • /
    • pp.353-360
    • /
    • 2015
  • Various statistical models have been proposed over the last decade for spatially correlated Gaussian outcomes. The spatial linear mixed model (SLMM), which incorporates a spatial effect as a random component in the linear model, is one of the most widely used approaches in various application contexts. By employing link functions, the SLMM can be naturally extended to the spatial generalized linear mixed model (SGLMM) for non-Gaussian outcomes. We review popular SGLMMs for non-Gaussian spatial outcomes and demonstrate their applications with publicly available data.

Intelligent Wordcloud Using Text Mining (텍스트 마이닝을 이용한 지능적 워드클라우드)

  • Kim, Yeongchang;Ji, Sangsu;Park, Dongseo;Lee, Choong Ho
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference
    • /
    • 2019.05a
    • /
    • pp.325-326
    • /
    • 2019
  • This paper proposes an intelligent word cloud that improves on the existing approach of building word clouds from noun frequencies obtained by text mining. We propose a method that effectively adds newly coined words and similar terms to the dictionary used for noun extraction, and that can also display word clouds focused on other parts of speech, such as verbs. In the experiment, the KoNLP package was used to extract the frequencies of existing nouns, and 80 new words that it did not support were added manually after examining their frequency.

A Study on the Diffusion Prediction Model of COVID-19 (COVID-19 확산 예측 모형에 관한 연구)

  • Yun, Seok-Yong
    • Annual Conference of KIPS
    • /
    • 2020.05a
    • /
    • pp.413-416
    • /
    • 2020
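  • COVID-19 (Coronavirus Disease 2019) is an acute respiratory disease caused by an RNA virus and transmitted through mucous membrane infection and droplet transmission. After the first infections were reported in Wuhan, Hubei, China, in December 2019, it spread rapidly around the world, and many countries and regions are currently under lockdown. Although the fatality rate of COVID-19 varies by country and age group, it cannot be called high compared with SARS (SARS-CoV) or MERS (MERS-CoV). However, COVID-19 is a novel coronavirus for which no vaccine or antiviral drug has yet been developed, and because it spreads faster than other diseases, it is causing serious gaps in medical care, social disruption, and economic losses. If the spread of the virus can be predicted through data analysis, the social and economic damage can be reduced; this study therefore presents a quantitative prediction model for the spread of COVID-19 using the Bass model and R packages.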
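
As a rough illustration of the Bass model named in the abstract above, the sketch below evaluates the standard closed-form Bass cumulative curve in Python. The study itself used R packages that the abstract does not name, and the parameter values used here (`m`, `p`, `q`) are hypothetical placeholders, not estimates from the paper.

```python
import math


def bass_cumulative_fraction(t, p, q):
    """Cumulative adoption fraction F(t) of the Bass diffusion model.

    p is the coefficient of innovation (external influence) and
    q is the coefficient of imitation (internal influence).
    """
    decay = math.exp(-(p + q) * t)
    return (1.0 - decay) / (1.0 + (q / p) * decay)


def bass_cumulative_count(t, m, p, q):
    """Cumulative count at time t, where m is the eventual total."""
    return m * bass_cumulative_fraction(t, p, q)


if __name__ == "__main__":
    # Hypothetical parameters chosen only for illustration.
    m, p, q = 10_000, 0.03, 0.38
    for day in (0, 5, 10, 20, 40):
        print(day, round(bass_cumulative_count(day, m, p, q), 1))
```

In practice `m`, `p`, and `q` would be estimated from observed case counts, for example by nonlinear least squares; that fitting step is not shown here.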

Decommissioning Cost Estimation of Kori Unit 1 Using a Multi-Regression Analysis Model (회귀 분석 모델을 이용한 고리 1호기 해체 비용 추정)

  • Joo, Han Young;Kim, Jae Wook;Jeong, So Yun;Moon, Joo Hyun
    • Journal of Nuclear Fuel Cycle and Waste Technology(JNFCWT)
    • /
    • v.18 no.2_spc
    • /
    • pp.247-260
    • /
    • 2020
  • A multi-regression model was developed to estimate the decommissioning cost for Kori unit 1 using foreign nuclear power plant (NPP) decommissioning cost data. First, the decommissioning cost data were collected for 13 boiling water reactors and 16 pressurized water reactors and converted into the values as of November 2019. Then, for the regression model, the decommissioning cost was chosen as the dependent variable, and two variables were selected as independent variables: a contamination factor that was designed to reflect the operational characteristics of the decommissioned NPP and the decommissioning period. A statistical package in the R language was used to derive the regression model. Finally, the regression model was applied to estimate the decommissioning cost for Kori unit 1. The estimated decommissioning cost for Kori unit 1 was 663.40~928.32 million US dollars (782,812~1,095,418 million Korean won).
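
The regression setup described in the abstract above (one dependent variable, two independent variables) can be sketched as an ordinary least squares fit. This is a minimal Python illustration under stated assumptions: the paper used a statistical package in R, the function name is hypothetical, and the numbers are synthetic placeholders rather than decommissioning data.

```python
def fit_two_predictor_ols(x1, x2, y):
    """Return (b0, b1, b2) minimizing sum((y - b0 - b1*x1 - b2*x2)**2)."""
    rows = [[1.0, a, b] for a, b in zip(x1, x2)]
    # Normal equations: (X^T X) beta = X^T y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(3)]
    # Solve the 3x3 system by Gaussian elimination with partial pivoting.
    aug = [xtx[i] + [xty[i]] for i in range(3)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, 3):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, 4):
                aug[r][c] -= factor * aug[col][c]
    # Back substitution.
    beta = [0.0, 0.0, 0.0]
    for i in reversed(range(3)):
        tail = sum(aug[i][j] * beta[j] for j in range(i + 1, 3))
        beta[i] = (aug[i][3] - tail) / aug[i][i]
    return tuple(beta)


if __name__ == "__main__":
    # Synthetic data generated from y = 2 + 3*x1 - 1*x2 (illustration only).
    x1 = [1.0, 2.0, 3.0, 4.0, 5.0]   # stand-in for a contamination factor
    x2 = [2.0, 1.0, 4.0, 3.0, 6.0]   # stand-in for a decommissioning period
    y = [2 + 3 * a - b for a, b in zip(x1, x2)]
    print(fit_two_predictor_ols(x1, x2, y))
```

Solving the normal equations directly keeps the sketch dependency-free; a real analysis would more likely rely on a statistics library that also reports standard errors and prediction intervals.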

Processing large-scale data with Apache Spark (Apache Spark를 활용한 대용량 데이터의 처리)

  • Ko, Seyoon;Won, Joong-Ho
    • The Korean Journal of Applied Statistics
    • /
    • v.29 no.6
    • /
    • pp.1077-1094
    • /
    • 2016
  • Apache Spark is a fast and general-purpose cluster computing package. It provides a new abstraction, the resilient distributed dataset (RDD), which supports fault tolerance while keeping data in memory. This abstraction yields a significant speedup over the legacy large-scale data framework MapReduce. In particular, the Spark framework is well suited to iterative machine learning applications, such as logistic regression and K-means clustering, and to interactive data querying. Thanks to its versatility, Spark also supports high-level libraries for various applications such as machine learning, streaming data processing, database querying, and graph data mining. In this work, we introduce the concept and programming model of Spark and show implementations of some simple statistical computing applications. We also review the machine learning package MLlib and the R language interface SparkR.

Data visualization of airquality data using R software (R 소프트웨어를 이용한 대기오염 데이터의 시각화)

  • Oh, Youngchang;Park, Eunsik
    • Journal of the Korean Data and Information Science Society
    • /
    • v.26 no.2
    • /
    • pp.399-408
    • /
    • 2015
  • This paper presents the airquality data through several forms of data visualization and describes its characteristics in relation to statistical methods of analysis. R was used as the visualization tool. The airquality data were measured in New York City from May to September 1973. First, a simple exploratory data analysis was carried out, in terms of both visualization and analysis, to identify univariate characteristics. Then, through data transformation and multiple regression analysis, a model describing the air quality level was found. Finally, after some data categorization, the overall features of the data were explored using box plots, three-dimensional perspective plots, and scatter plots.

Exploring the Factors Influencing Students' Career Maturity in Seoul City Middle School: A Machine Learning (머신러닝을 활용한 서울시 중학생 진로성숙도 예측 요인 탐색)

  • Park, Jung
    • The Journal of Bigdata
    • /
    • v.5 no.2
    • /
    • pp.155-170
    • /
    • 2020
  • The purpose of this study was to apply machine learning techniques (Decision Tree, Random Forest, XGBoost) to data from the 4th~6th year of the Seoul Education Longitudinal Study to find the factors predicting the career maturity of middle school students in Seoul. To evaluate the machine learning results, the performance of each model was checked against the evaluation indicators. In addition, the model was analyzed using the XGBoostExplainer package, and R and RStudio were used for this study. As a result, the ranking of variable importance differed slightly between models, but 'Achievement goal awareness', 'Creativity', 'Self-concept', 'Relationship with parents and children', and 'Resilience' ranked high. In addition, the XGBoostExplainer package identified the factors that protect and the factors that impair career maturity by panel, and showed that 'Achievement goal awareness' is the top-priority factor for predicting career maturity. Based on these results, it was suggested that a comparative study of machine learning and variable selection methods and a comparative study of each cohort of the Seoul Education Longitudinal Study should be conducted.

Media exposure analysis of official sponsors and general companies of mega sport event (메가 스포츠이벤트의 공식스폰서와 일반기업의 미디어 노출 분석)

  • Kim, Joo-Hak;Cho, Sun-Mi
    • Asia-pacific Journal of Multimedia Services Convergent with Art, Humanities, and Sociology
    • /
    • v.8 no.4
    • /
    • pp.171-181
    • /
    • 2018
  • As the proportion of sports events in the sports industry grows, the official sponsor market for sports events is also growing. But because official sponsorships are limited and expensive, some companies approach sporting events by way of ambush marketing. This study analyzes the differences in media exposure between official sponsors and general companies at mega sport events. To this end, we collected and analyzed text articles from the period of the 2016 Rio Olympics and from one year before and one year after the Games. Web crawling was performed using Python to collect the articles. Morphological and frequency analysis was performed using the KoNLP package and the TM package of the statistical program R. In addition, the opinions of a group of related experts were gathered to classify the companies or organizations appearing in the media as the Organizing Committees for the Olympic Games (OCOGs), official sponsors, or general companies. The analysis found 5,220 mentions related to the OCOGs, 7,845 related to official sponsors, and 7,028 related to general companies. There was little difference in the frequency of exposure between official sponsors and general companies, which implies that ambush marketing is recognized as a strategic marketing technique. The International Olympic Committee (IOC) has to recognize these social phenomena and establish reasonable standards for the marketing activities of official sponsors and general companies. This study will serve as a basis for fair sponsorship and marketing activities at sports events.

Nondestructive Quantification of Corrosion in Cu Interconnects Using Smith Charts (스미스 차트를 이용한 구리 인터커텍트의 비파괴적 부식도 평가)

  • Minkyu Kang;Namgyeong Kim;Hyunwoo Nam;Tae Yeob Kang
    • Journal of the Microelectronics and Packaging Society
    • /
    • v.31 no.2
    • /
    • pp.28-35
    • /
    • 2024
  • Corrosion inside electronic packages significantly impacts system performance and reliability, necessitating non-destructive diagnostic techniques for system health management. This study presents a non-destructive method for assessing corrosion in copper interconnects using the Smith chart, a tool that integrates the magnitude and phase of complex impedance for visualization. For the experiment, specimens simulating copper transmission lines were subjected to temperature and humidity cycles according to the MIL-STD-810G standard to induce corrosion. The corrosion level of each specimen was quantitatively assessed and labeled based on color changes in the R channel. S-parameters and Smith charts at progressing corrosion stages showed unique patterns corresponding to five levels of corrosion, confirming the effectiveness of the Smith chart as a tool for corrosion assessment. Furthermore, by employing data augmentation, 4,444 Smith charts representing various corrosion levels were obtained, and artificial intelligence models were trained to output the corrosion stage of copper interconnects from the input Smith charts. Among CNN and Transformer models specialized for image classification, the ConvNeXt model achieved the highest diagnostic performance, with an accuracy of 89.4%. Diagnosing corrosion with the Smith chart makes non-destructive evaluation possible using electronic signals. Additionally, by integrating and visualizing signal magnitude and phase information, the method is expected to enable intuitive and noise-robust diagnosis.
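
To make concrete how a Smith chart combines magnitude and phase, as the abstract above describes, the sketch below maps a complex impedance to the reflection coefficient that a Smith chart plots, using the standard relation Γ = (Z − Z0) / (Z + Z0). It is a minimal Python illustration: the 50 Ω reference impedance and the sample impedances are assumptions for demonstration, not values from the study.

```python
import cmath


def reflection_coefficient(z_load, z0=50.0):
    """Map a complex load impedance to the reflection coefficient Gamma.

    A Smith chart plots Gamma in the complex plane, so each point carries
    both the magnitude and the phase of the reflected signal.
    z0 is the reference impedance (50 ohms is assumed here).
    """
    return (z_load - z0) / (z_load + z0)


def magnitude_and_phase_deg(gamma):
    """Return (|Gamma|, phase in degrees) for a reflection coefficient."""
    magnitude, phase_rad = cmath.polar(gamma)
    return magnitude, phase_rad * 180.0 / cmath.pi


if __name__ == "__main__":
    # Hypothetical impedances: matched load, resistive mismatch, reactive load.
    for z in (50 + 0j, 75 + 0j, 50 + 30j):
        gamma = reflection_coefficient(z)
        print(z, gamma, magnitude_and_phase_deg(gamma))
```

A matched load maps to the chart center (Γ = 0), while purely reactive loads lie on the unit circle, which is why changes in an interconnect's impedance appear as shifts in the plotted trace.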

Development of Manufacturing System Package for CFRP Machining (패키지형 탄소섬유복합재 가공시스템 개발)

  • Kim, Hyo-Young;Kim, Tae-Gon;Lee, Seok-Woo;Yoon, Han-Sol;Kyung, Dae-Su;Choi, In-Hue;Choi, Hyun;Ko, Jong-Min
    • Journal of the Korean Society for Precision Engineering
    • /
    • v.33 no.6
    • /
    • pp.431-438
    • /
    • 2016
  • Recently, environmental concerns have become more important because of global warming and the exhaustion of the earth's resources. In the aviation and automobile industries, the application of lightweight materials is increasingly important for eco-friendliness and effectiveness. Carbon fiber reinforced plastic (CFRP) is a composite material that combines great formability with the high strength of carbon fiber. CFRP, which is both light and strong, is hard to manufacture. In addition, CFRP machining has a high chance of defects. This research discusses the development of a manufacturing system package for CFRP machining. It involves CFRP drilling/water-jet manufacturing machines, inspection/post-processing systems, a CNC platform for EtherCAT servo communication, flexible manufacturing systems, and CFRP machining processes.