• Title/Summary/Keyword: R Statistical Software

Using R Software for Reliability Data Analysis

  • Shaffer, Leslie B.;Young, Timothy M.;Guess, Frank M.;Bensmail, Halima;Leon, Ramon V.
    • International Journal of Reliability and Applications / v.9 no.1 / pp.53-70 / 2008
  • In this paper, we discuss the many uses of the software package R, focusing on its applications in reliability data analysis. Examples are presented, including the R coding protocol, R code, and plots for various statistical and reliability analyses. We explore Kaplan-Meier estimates and maximum likelihood estimation for distributions including the Weibull. Finally, we discuss future applications of R and uses of quantile regression in reliability.
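
The Kaplan-Meier estimate mentioned in this abstract can be illustrated with a minimal sketch. The paper itself works in R; the Python function below, its name `kaplan_meier`, and the toy failure data are assumptions made for illustration and are not taken from the paper.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(time, survival)] at each distinct failure time.

    times  -- observed times (failure or censoring)
    events -- 1 if the unit failed at that time, 0 if it was censored
    """
    failures = Counter(t for t, e in zip(times, events) if e)
    leaving = Counter(times)          # units leaving the risk set at each time
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = failures.get(t, 0)
        if d:
            # Product-limit step: multiply by the fraction surviving time t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= leaving[t]         # failed and censored units both leave
    return curve


# Hypothetical data: four units, the second one censored at time 2.
curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])
for t, s in curve:
    print(t, round(s, 3))
```

Censored units reduce the risk set without adding a step to the curve, which is the feature that distinguishes this estimator from a plain empirical survival function.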

Knowledge Exchange Activities and Performances in Software Industry Clusters: Focus on Firm Size Effect

  • CHO, Sung Eui
    • The Journal of Economics, Marketing and Management / v.10 no.6 / pp.9-16 / 2022
  • Purpose: This research investigates the differences in knowledge exchange activities and performance between startups and large companies in software industry clusters. Research design, data, and methodology: Six independent factors (human resource information, R&D and technology, marketing knowledge, government support information, strategic knowledge, and cooperation information) were extracted to test the firm size effect on their relationships with two performance factors: satisfaction with industry cluster location and satisfaction with financial performance. Data were collected through a survey of entrepreneurs, managers, and employees and tested with statistical analysis methods. Results: Three independent factors (human resource information, R&D and technology, and cooperation information) were particularly significant in their relationships with both dependent factors. Strategic knowledge significantly affected financial performance. Knowledge exchange activities were more important in startups than in large companies for all eight factors. Conclusion: Policies for software industry clusters need different approaches for startups and large companies.

Implementation of GrADS and R Scripts for Processing Future Climate Data to Produce Agricultural Climate Information (농업 기후 정보 생산을 위한 미래 기후 자료 처리 GrADS 및 R 프로그램 구현)

  • Lee, Kyu Jong;Lee, Semi;Lee, Byun Woo;Kim, Kwang Soo
    • Atmosphere / v.23 no.2 / pp.237-243 / 2013
  • A set of scripts for GrADS (Grid Analysis and Display System) and R was implemented to produce agricultural climate information from future climate scenarios based on the Representative Concentration Pathways. The GrADS script was used to calculate agricultural climate indices, including growing degree days and cooling degree days. The script generated agricultural climate maps of these indices that are compatible with common Geographic Information System (GIS) applications. To perform a statistical analysis of the agricultural climate maps, a script for R, an open source statistical software package, was used. Because a large volume of spatial climate data was produced, parallel processing packages such as SNOW, doSNOW, and foreach were used to perform a simple statistical analysis in the R script. The parallel R script ran faster on workstations with multiple CPU cores.
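
The degree-day indices named in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the authors' GrADS or R script: it uses the common simple-average definition, the base temperatures (10 °C for growing, 18 °C for cooling) are assumed defaults, and the function names and temperature values are hypothetical.

```python
def degree_days(daily_min, daily_max, base):
    """Sum of daily mean temperature in excess of `base` (never negative)."""
    return sum(
        max(0.0, (lo + hi) / 2.0 - base)
        for lo, hi in zip(daily_min, daily_max)
    )


def growing_degree_days(daily_min, daily_max, base=10.0):
    # Assumed base temperature; crop-specific values differ.
    return degree_days(daily_min, daily_max, base)


def cooling_degree_days(daily_min, daily_max, base=18.0):
    # Assumed base temperature; conventions vary by country.
    return degree_days(daily_min, daily_max, base)


# Hypothetical three-day record for a single grid cell, in degrees Celsius.
tmin = [8.0, 12.0, 20.0]
tmax = [14.0, 20.0, 30.0]
print(growing_degree_days(tmin, tmax))
print(cooling_degree_days(tmin, tmax))
```

Each grid cell is computed independently of the others, which is the property that makes this kind of workload straightforward to spread across CPU cores.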

Research on Natural Language Processing Package using Open Source Software (오픈소스 소프트웨어를 활용한 자연어 처리 패키지 제작에 관한 연구)

  • Lee, Jong-Hwa;Lee, Hyun-Kyu
    • The Journal of Information Systems / v.25 no.4 / pp.121-139 / 2016
  • Purpose: In this study, we propose a special-purpose R package named "new_Noun()" to process nonstandard text appearing in various social networks. As interest in big data grows, R, an analysis tool and open source software, is also receiving more attention in many fields. Design/methodology/approach: With more than 9,000 packages, R provides user-friendly functions for data mining, social network analysis, and simulation, including statistical analysis, classification, prediction, clustering, and association analysis. In particular, "KoNLP", a natural language processing package for the Korean language, has reduced the time and effort required of many researchers. However, as the volume of social data increases, informal Hangeul (Korean script) expressions such as emoticons, informal terms, and symbols make natural language processing more difficult. Findings: To address these difficulties, we studied special algorithms that upgrade the existing open source natural language processing package. By using the "KoNLP" package and analyzing the main functions in its noun extraction command, we developed a new integrated noun processing function, "new_Noun()", to extract nouns, which improves on the existing package by more than 29.1%.

A study on unstructured text mining algorithm through R programming based on data dictionary (Data Dictionary 기반의 R Programming을 통한 비정형 Text Mining Algorithm 연구)

  • Lee, Jong Hwa;Lee, Hyun-Kyu
    • Journal of Korea Society of Industrial Information Systems / v.20 no.2 / pp.113-124 / 2015
  • Unlike structured data, which are gathered and saved in a predefined structure, unstructured text data, which are mostly written in natural language, have recently found wider application with the emergence of Web 2.0. Text mining, which extracts meaningful information from text, is one of the most important big data analysis techniques, because not only has the amount of text data increased but such text also expresses human emotion directly. In this study, we used R, an open source software package for statistical analysis, and studied algorithm implementations for conducting analyses such as frequency analysis, cluster analysis, word clouds, and social network analysis. To keep within our research scope, we used a keyword extraction method based on a data dictionary. By applying the method to real cases, we found that R is very useful as statistical analysis software that runs on a variety of operating systems and interfaces with other languages.

Development of Web Contents for Statistical Analysis Using Statistical Package and Active Server Page (통계패키지와 Active Server Page를 이용한 통계 분석 웹 컨텐츠 개발)

  • Kang, Tae-Gu;Lee, Jae-Kwan;Kim, Mi-Ah;Park, Chan-Keun;Heo, Tae-Young
    • Journal of Korea Society of Industrial Information Systems / v.15 no.1 / pp.109-114 / 2010
  • In this paper, we developed web content for statistical analysis using a statistical package and Active Server Page (ASP). Statistical packages are very difficult for non-statisticians to learn and use; however, non-statisticians want to analyze data without learning statistical packages such as SAS, S-plus, and R. Therefore, we developed web-based statistical analysis content using S-plus, a popular statistical package, and ASP. As a real application, we developed web content for various statistical analyses, such as exploratory data analysis, analysis of variance, and time series analysis, using water quality data. The developed statistical analysis web content is very useful for non-statisticians such as public servants and researchers. By combining web-based content with a statistical package, users can access the site quickly and analyze data easily.

TRAPR: R Package for Statistical Analysis and Visualization of RNA-Seq Data

  • Lim, Jae Hyun;Lee, Soo Youn;Kim, Ju Han
    • Genomics & Informatics / v.15 no.1 / pp.51-53 / 2017
  • High-throughput transcriptome sequencing, also known as RNA sequencing (RNA-Seq), is a standard technology for measuring gene expression with unprecedented accuracy. Numerous Bioconductor packages have been developed for the statistical analysis of RNA-Seq data. However, these tools focus on specific aspects of the data analysis pipeline and are difficult to integrate with one another because of their disparate data structures and processing methods. They also lack visualization methods for confirming the integrity of the data and the process. In this paper, we propose an R-based RNA-Seq analysis pipeline called TRAPR, an integrated tool that facilitates the statistical analysis and visualization of RNA-Seq expression data. TRAPR provides functions for data management, filtering of low-quality data, normalization, transformation, statistical analysis, data visualization, and result visualization that allow researchers to build customized analysis pipelines.

A computational note on maximum likelihood estimation in random effects panel probit model

  • Lee, Seung-Chun
    • Communications for Statistical Applications and Methods / v.26 no.3 / pp.315-323 / 2019
  • Panel data sets have recently been developed in various areas, and many recent studies have analyzed panel, or longitudinal, data sets. A dichotomous dependent variable often occurs in survival analysis and in biomedical and epidemiological studies, and it is analyzed with a generalized linear mixed effects model (GLMM). The most common estimation method for binary panel data is arguably maximum likelihood (ML). Many statistical packages provide ML estimates; however, the estimates are computed from a numerically approximated likelihood function. For instance, the R package pglm (Croissant, 2017) approximates the likelihood function by Gauss-Hermite quadrature, while Rchoice (Sarrias, Journal of Statistical Software, 74, 1-31, 2016) uses a Monte Carlo integration method for the approximation. As a result, different packages can give different results because of their different numerical computation methods. In this note, we discuss the pros and cons of numerical methods compared with the exact computation method.
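
To make the contrast between the two approximation strategies concrete, the sketch below integrates a single probit term over a normal random effect, E[Φ(a + σu)] with u ~ N(0, 1), by 5-point Gauss-Hermite quadrature and by Monte Carlo. This is a toy integrand chosen because it has a closed form, Φ(a / √(1 + σ²)); it is not the panel likelihood from the paper and does not reproduce pglm, Rchoice, or the paper's exact method. All function names and parameter values are assumptions.

```python
import math
import random

# 5-point Gauss-Hermite nodes and weights for the weight function exp(-x^2).
GH_NODES = [-2.0201828704560856, -0.9585724646138185, 0.0,
            0.9585724646138185, 2.0201828704560856]
GH_WEIGHTS = [0.01995324205904591, 0.3936193231522412, 0.9453087204829419,
              0.3936193231522412, 0.01995324205904591]


def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def gauss_hermite(a, sigma):
    # Change of variable u = sqrt(2) * x maps the N(0, 1) density onto exp(-x^2).
    total = sum(w * normal_cdf(a + sigma * math.sqrt(2.0) * x)
                for x, w in zip(GH_NODES, GH_WEIGHTS))
    return total / math.sqrt(math.pi)


def monte_carlo(a, sigma, draws=20000, seed=1):
    # Average the integrand over simulated random effects.
    rng = random.Random(seed)
    return sum(normal_cdf(a + sigma * rng.gauss(0.0, 1.0))
               for _ in range(draws)) / draws


def closed_form(a, sigma):
    # Available for this toy integrand only.
    return normal_cdf(a / math.sqrt(1.0 + sigma * sigma))


a, sigma = 0.5, 1.0  # hypothetical linear predictor and random-effect scale
print(gauss_hermite(a, sigma), monte_carlo(a, sigma), closed_form(a, sigma))
```

The quadrature result is deterministic and depends on the number of nodes, while the Monte Carlo result changes with the seed and the number of draws, which is one way two packages can report different estimates from the same data.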

Interactive Statistics Laboratory using R and Sage (R을 활용한 '대화형 통계학 입문 실습실' 개발과 활용)

  • Lee, Sang-Gu;Lee, Geung-Hee;Choi, Yong-Seok;Lee, Jae Hwa;Lee, Jenny Jyoung
    • Communications of Mathematical Education / v.29 no.4 / pp.573-588 / 2015
  • In this paper, we introduce the development process and application of a simple and effective model of a statistics laboratory using the open source software R, one of the leading languages and environments for statistical computing and graphics. This model consists of HTML files that include Sage cells, video lectures, and ample internet resources. Users do not have to install statistical software to run their code. Clicking the 'evaluate' button on the web page displays the result, which is calculated in a cloud-computing environment. Hence, with any type of mobile device and an internet connection, learners can freely practice statistical concepts and theorems through various examples with the sample R (or Sage) code provided, while instructors can easily design and modify the material for their lectures simply by gathering existing resources and editing an HTML file. This is a reasonable model of a laboratory for studying statistics. With the provided materials, the model reduces the time and effort R beginners need to become acquainted with and understand the R language, and it stimulates beginners' interest in statistics. We introduce this interactive statistical laboratory as a useful model for beginners learning basic statistical concepts and R.

An Exploratory Study on Determinants Affecting R Programming Acceptance (R 프로그래밍 수용 결정 요인에 대한 탐색 연구)

  • Rubianogroot, Jennifer;Namn, Su Hyeon
    • Management & Information Systems Review / v.37 no.1 / pp.139-154 / 2018
  • R is a free and open source system associated with a rich and ever-growing set of function libraries developed and submitted by independent end users. It is recognized as a popular tool for handling and analyzing big data sets. Reflecting these characteristics, R has been gaining popularity among data analysts. However, the antecedents of R technology acceptance have not yet been studied. In this study, we identify and investigate cognitive factors that contribute to building user acceptance of R in an educational environment. We extend the existing technology acceptance model by incorporating social norms and software capability. We found that subjective norm, perceived usefulness, and ease of use positively affect the intention to accept R programming. In addition, perceived usefulness is related to subjective norms, perceived ease of use, and software capability. The main difference between this research and previous studies is that the target system is not stand-alone. Nor is the system static, in the sense that it is not a final version; instead, R is an evolving, open source system. We applied the Technology Acceptance Model (TAM) to the target system, a platform on which diverse applications such as statistical analysis, big data analysis, and visual rendering can be performed. The model presented in this work can be useful both for colleges that plan to invest in new statistical software and for companies that need to pursue future installations of new technologies. In addition, we identified a modified version of TAM that extends the original model with the constructs of subjective norm and software capability. However, one weakness that might limit the reliability and validity of the model is the small sample size.