• Title/Abstract/Keywords: R Statistical Software

The Use of a Biplot in Studying the Career Maturity of College Freshmen (행렬도를 이용한 대학 신입생의 진로의식 분석)

  • Choi, Hye-Mi;Park, Chan-Yong;Lee, Sang-Hyeop;Chung, Sung-Suk
    • The Korean Journal of Applied Statistics / v.23 no.5 / pp.933-941 / 2010
  • The biplot is a modern graphical method that projects high-dimensional data onto a low-dimensional subspace rich in information on variation in the data, correlation among variables, and class separation. To construct biplots, we use the BiplotGUI package in R, a free statistical software environment of increasing popularity. Using data from questionnaires given to Chonbuk National University freshmen in 2009, we apply the biplot method to study the relationship between career goals and career maturity.

Practice of causal inference with the propensity of being zero or one: assessing the effect of arbitrary cutoffs of propensity scores

  • Kang, Joseph;Chan, Wendy;Kim, Mi-Ok;Steiner, Peter M.
    • Communications for Statistical Applications and Methods / v.23 no.1 / pp.1-20 / 2016
  • Causal inference methodologies have been developed over the past decade to estimate the unconfounded effect of an exposure under several key assumptions. These assumptions include, but are not limited to, the stable unit treatment value assumption, the strong ignorability of treatment assignment assumption, and the assumption that propensity scores be bounded away from zero and one (the positivity assumption). Of these assumptions, the first two have received much attention in the literature, whereas the positivity assumption has been discussed in only a few recent papers. Propensity scores of zero or one indicate deterministic exposure, so causal effects cannot be defined for these subjects. Such subjects therefore need to be removed, because no comparable comparison group can be found for them. In this paper, using currently available causal inference methods, we evaluate the effect of arbitrary cutoffs in the distribution of propensity scores and the impact of those decisions on bias and efficiency. We propose a tree-based method that performs well in terms of bias reduction when the definition of positivity is based on a single confounder. This tree-based method can be easily implemented in the statistical software program R. R code for the studies is available online.
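
The idea of applying a cutoff to propensity scores can be illustrated with a minimal Python sketch. It is not the tree-based method proposed in the abstract, and it is not the authors' R code. The simulated data, the single confounder `x`, the true effect of 1.0, the cutoff values, and the helper names `trim_by_propensity`, `ipw_ate`, and `simulate` are all assumptions made for illustration.

```python
import math
import random


def trim_by_propensity(scores, lower=0.05, upper=0.95):
    """Return indices of units whose propensity score lies strictly inside (lower, upper)."""
    return [i for i, p in enumerate(scores) if lower < p < upper]


def ipw_ate(treat, outcome, scores):
    """Normalized inverse-probability-weighted estimate of the average treatment effect."""
    w1 = [t / p for t, p in zip(treat, scores)]              # weights for treated units
    w0 = [(1 - t) / (1 - p) for t, p in zip(treat, scores)]  # weights for control units
    mean1 = sum(w * y for w, y in zip(w1, outcome)) / sum(w1)
    mean0 = sum(w * y for w, y in zip(w0, outcome)) / sum(w0)
    return mean1 - mean0


def simulate(n, rng):
    """Hypothetical data: one confounder x drives both treatment and outcome."""
    x = [rng.gauss(0, 1) for _ in range(n)]
    ps = [1 / (1 + math.exp(-2.5 * xi)) for xi in x]   # true propensity scores
    treat = [1 if rng.random() < p else 0 for p in ps]
    y = [1.0 * t + xi + rng.gauss(0, 1) for t, xi in zip(treat, x)]  # assumed true effect: 1.0
    return x, ps, treat, y


if __name__ == "__main__":
    rng = random.Random(0)
    x, ps, treat, y = simulate(5000, rng)
    for lower in (0.0, 0.01, 0.05, 0.10):
        keep = trim_by_propensity(ps, lower, 1 - lower)
        est = ipw_ate([treat[i] for i in keep],
                      [y[i] for i in keep],
                      [ps[i] for i in keep])
        print(f"cutoff={lower:.2f}  kept={len(keep)}  estimate={est:.3f}")
```

Tighter cutoffs keep fewer units, so the choice trades the instability caused by extreme weights against a change in the population to which the estimate applies.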

Sensitivity analysis of skull fracture

  • Vicini, Anthony;Goswami, Tarun
    • Biomaterials and Biomechanics in Bioengineering / v.3 no.1 / pp.47-57 / 2016
  • Results from multiple high-profile experiments on the parameters influencing the impacts that cause skull fractures to the frontal, temporal, and parietal bones were gathered and analyzed. The location of the impact as a binary function of frontal or lateral strike, the velocity, the striking area of the impactor, and the force needed to cause skull fracture in each experiment were subjected to statistical analysis using the JMP statistical software package. A novel neural network model predicting skull fracture threshold was developed with a high statistical correlation ($R^2=0.978$) and is presented in this text. Despite variation within individual studies, the equation herein proposes a 3 kN greater resistance to fracture for the frontal bone than for the temporoparietal bones. Additionally, impacts with low velocities (<4.1 m/s) were more prone to cause fracture in the lateral regions of the skull than frontal impacts of similar velocity. Conversely, higher velocity impacts (>4.1 m/s) showed a greater frontal sensitivity.

On inference of multivariate means under ranked set sampling

  • Rochani, Haresh;Linder, Daniel F.;Samawi, Hani;Panchal, Viral
    • Communications for Statistical Applications and Methods / v.25 no.1 / pp.1-13 / 2018
  • In many studies, a researcher attempts to describe a population where units are measured for multiple outcomes, or responses. In this paper, we present an efficient procedure based on ranked set sampling to estimate and perform hypothesis testing on a multivariate mean. The method is based on ranking on an auxiliary covariate, which is assumed to be correlated with the multivariate response, in order to improve the efficiency of the estimation. We show that the proposed estimators developed under this sampling scheme are unbiased, have smaller variance in the multivariate sense, and are asymptotically Gaussian. We also demonstrate that the efficiency of the multivariate regression estimator can be improved by using ranked set sampling. A bootstrap routine is developed in the statistical software R to perform inference when the sample size is small. We use a simulation study to investigate the performance of the method under known conditions and apply the method to biomarker data collected in the China Health and Nutrition Survey (CHNS 2009).
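
Ranking on an auxiliary covariate can be sketched in a few lines of Python. This is a minimal illustration of balanced ranked set sampling, not the authors' estimator or their R bootstrap routine. The population model in `draw_unit`, the set size, the number of cycles, and all function names are assumptions.

```python
import random


def ranked_set_sample(draw_unit, set_size, cycles, rng):
    """Balanced ranked set sample.

    In each cycle, for every rank r, draw `set_size` units, order them by the
    auxiliary covariate, and measure the response of the r-th ranked unit only.
    `draw_unit(rng)` must return (auxiliary_value, response_tuple).
    """
    sample = []
    for _ in range(cycles):
        for rank in range(set_size):
            units = sorted((draw_unit(rng) for _ in range(set_size)),
                           key=lambda u: u[0])
            sample.append(units[rank][1])
    return sample


def mean_vector(rows):
    """Component-wise mean of a list of equal-length response tuples."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def draw_unit(rng):
    """Hypothetical population: two responses correlated with the covariate x."""
    x = rng.gauss(0, 1)
    y1 = 2.0 + 0.8 * x + rng.gauss(0, 0.6)
    y2 = -1.0 + 0.5 * x + rng.gauss(0, 0.9)
    return x, (y1, y2)


if __name__ == "__main__":
    rng = random.Random(42)
    rss = ranked_set_sample(draw_unit, set_size=3, cycles=200, rng=rng)
    print("measured units:", len(rss))
    print("estimated mean vector:", mean_vector(rss))
```

Only one unit per ranked set has its response measured, which is why the scheme suits settings where ranking on the covariate is cheap and measuring the response is expensive.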

Forest Fire Severity Classification Using Probability Density Function and KOMPSAT-3A (확률밀도함수와 KOMPSAT-3A를 활용한 산불피해강도 분류)

  • Lee, Seung-Min;Jeong, Jong-Chul
    • Korean Journal of Remote Sensing / v.35 no.6_4 / pp.1341-1350 / 2019
  • This research presents an algorithm for classifying forest fire severity using multi-temporal KOMPSAT-3A images to map forest fire areas. KOMPSAT-3A, a recent satellite of the KOMPSAT series, provides high-resolution multispectral imagery with infrared and high-resolution electro-optical bands. However, there is little research on classifying forest fire severity using KOMPSAT-3A. Therefore, the purpose of this study is to analyze forest fire severity using KOMPSAT-3A images. In addition, pre-fire and post-fire Sentinel-2 images were used with the differenced Normalized Burn Ratio (dNBR) to produce a burn severity distribution map. To test the effectiveness of the proposed procedure, the Gangneung wildfires of April 4, 2019 were used as a case study. The probability density function was used to classify forest fire damage severity in R, a free software environment for statistical computing and graphics. Burn severity was estimated from the change in NDVI before and after the fire. Furthermore, the standard deviation of the probability density function was used to calculate the size of each class interval. A total of five levels of forest fire severity were effectively classified.
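
The indices named in the abstract follow standard definitions: NDVI = (NIR - Red) / (NIR + Red), NBR = (NIR - SWIR) / (NIR + SWIR), and dNBR = NBR(pre-fire) - NBR(post-fire). The Python sketch below computes the pre/post NDVI difference and then builds five classes whose interval width is one standard deviation. The exact placement of the class edges (centered on the mean) and all pixel values are assumptions for illustration, since the abstract does not state them, and the study itself used R.

```python
import bisect
import statistics


def ndvi(nir, red):
    """Normalized Difference Vegetation Index for one pixel (assumes nir + red != 0)."""
    return (nir - red) / (nir + red)


def nbr(nir, swir):
    """Normalized Burn Ratio for one pixel (assumes nir + swir != 0)."""
    return (nir - swir) / (nir + swir)


def sd_class_edges(values, n_classes=5):
    """Class edges one standard deviation apart, centered on the mean (assumed scheme)."""
    mu = statistics.fmean(values)
    sd = statistics.pstdev(values)
    half = n_classes / 2
    return [mu + (k - half) * sd for k in range(1, n_classes)]


def classify(value, edges):
    """Return a class index in 0..len(edges); a higher index means a larger NDVI drop."""
    return bisect.bisect_right(edges, value)


if __name__ == "__main__":
    # Hypothetical (NIR, Red) reflectances for the same pixels before and after a fire.
    pre = [(0.50, 0.08), (0.48, 0.10), (0.52, 0.07), (0.45, 0.09), (0.47, 0.08)]
    post = [(0.20, 0.15), (0.40, 0.11), (0.12, 0.14), (0.44, 0.09), (0.30, 0.12)]
    d_ndvi = [ndvi(*a) - ndvi(*b) for a, b in zip(pre, post)]
    edges = sd_class_edges(d_ndvi)
    for d in d_ndvi:
        print(f"dNDVI={d:.3f}  class={classify(d, edges)}")
```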

A Study on Method that Estimate Expertness of Pulse Diagnosis in 8 Constitution Medicine (8체질맥진(體質脈診) 숙연도(熟練度) 평가방법(評價方法)에 관(關)한 연구(硏究))

  • Shin, Yong-Sup;Park, Young-Jae;Oh, Hwan-Sup;Park, Young-Bae
    • The Journal of the Society of Korean Medicine Diagnostics / v.10 no.1 / pp.78-97 / 2006
  • Background: Few studies have examined methods for estimating expertness in pulse diagnosis in 8 Constitution Medicine, despite the importance of the diagnostician in 8 Constitution Medicine. Objectives: This study evaluates diagnosticians' consistency and accuracy in pulse diagnosis in 8 Constitution Medicine using a Gage R&R study. Methods: The subjects comprised 28 volunteers. Among them, 3 diagnosticians and 10 participants were chosen through a questionnaire. The diagnosticians, with their eyes covered by eyepatches, diagnosed each participant's constitution by pulse diagnosis in 8 Constitution Medicine. MINITAB statistical software (ver. 13.20) was used for statistical analysis; an Attribute Gage R&R study was used to verify the results. Results: 1. In the measurements of consistency, diagnostician b (agreement=80%, value of k=0.8276) was very good, diagnostician a (agreement=70%, value of k=0.7465) was good, and diagnostician c (agreement=50%, value of k=0.5365) was moderate. 2. In the measurements of accuracy, diagnostician b (agreement=70%, value of k=0.6812) was good, diagnostician a (agreement=60%, value of k=0.6414) was good, and diagnostician c (agreement=0%, value of k=-0.1000) was poor. 3. In confidence of diagnosis, diagnostician c was 75%, diagnostician a was 70%, and diagnostician b was 64%. Conclusion: The results suggest that diagnosticians' consistency and accuracy in pulse diagnosis in 8 Constitution Medicine can be evaluated by a Gage R&R study. Further study is needed on estimation methods for pulse diagnosis in 8 Constitution Medicine.
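
Agreement percentages and kappa values of the kind reported above can be illustrated with Cohen's kappa for two sets of categorical ratings. This Python sketch is a simplified stand-in: the abstract does not say which kappa variant MINITAB computed, and the constitution labels and ratings below are hypothetical.

```python
from collections import Counter


def percent_agreement(a, b):
    """Share of items on which two rating sequences agree."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cohen_kappa(a, b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(a)
    p_obs = percent_agreement(a, b)
    count_a, count_b = Counter(a), Counter(b)
    # Chance agreement from the product of the two marginal category frequencies.
    p_exp = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if p_exp == 1.0:
        return 1.0  # both sequences use one identical category throughout
    return (p_obs - p_exp) / (1 - p_exp)


if __name__ == "__main__":
    # Hypothetical first and second readings by one diagnostician for 10 participants.
    trial_1 = ["A", "B", "A", "C", "D", "A", "B", "C", "D", "A"]
    trial_2 = ["A", "B", "A", "C", "D", "B", "B", "C", "A", "A"]
    print("agreement:", percent_agreement(trial_1, trial_2))
    print("kappa:", round(cohen_kappa(trial_1, trial_2), 4))
```

Comparing a diagnostician's two readings measures consistency, while comparing a reading against a reference diagnosis measures accuracy.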

Multi-dimensional Visualization Tool for Baseball Statistical Data Using R (R을 활용한 야구 통계 데이터 다차원 시각화 도구)

  • Kim, Ju Hee;Choi, Yong Suk
    • Proceedings of the Korean Society of Computer Information Conference / 2016.01a / pp.143-146 / 2016
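  • In this study, we built a web page that visualizes large-scale baseball data as bubble charts using the R package googleVis. On the web page, each visualized object is drawn as a bubble, and there are three kinds of objects: batters, pitchers, and teams. By mapping the attributes of each object to bubble color, bubble size, X-Y coordinates, and year, the data can be visualized in five dimensions. The page's time-slip animation lets the user observe at a glance how records change over time, and a player search function makes it possible to select specific players for comparison and analysis.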

A Study on Comparison Analysis of Collaborative Filtering in Java and R

  • Nasridinov, Aziz;Park, Young-Ho
    • Proceedings of the Korea Information Processing Society Conference / 2013.11a / pp.1156-1157 / 2013
  • The mobile application market has been growing extensively in recent years. Currently, Apple's App Store has more than 400,000 applications and Google's Android Market has more than 150,000 applications. This growth in the volume of mobile applications has created a need for recommender systems that help users make the right choice when searching for a mobile application. In this paper, we study tools for building recommendation systems based on collaborative filtering. Specifically, we present a comparative analysis of collaborative filtering in Java and the R statistical software. We implement collaborative filtering using Java's Apache Mahout and R's recommenderlab package. We evaluate both methods and describe the advantages and disadvantages of using each to implement collaborative filtering.
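
Collaborative filtering can be illustrated with a minimal user-based sketch in plain Python. It uses neither Apache Mahout nor recommenderlab and does not reproduce the paper's evaluation. The cosine similarity measure, the neighborhood size `k`, the app names, and the ratings are assumptions.

```python
import math


def cosine(u, v):
    """Cosine similarity between two sparse rating dicts (item -> rating)."""
    common = set(u) & set(v)
    if not common:
        return 0.0
    dot = sum(u[i] * v[i] for i in common)
    norm_u = math.sqrt(sum(r * r for r in u.values()))
    norm_v = math.sqrt(sum(r * r for r in v.values()))
    return dot / (norm_u * norm_v)


def recommend(ratings, user, k=2, top_n=3):
    """Rank items the user has not rated by the similarity-weighted ratings of the k nearest users."""
    neighbours = sorted(
        ((cosine(ratings[user], r), other)
         for other, r in ratings.items() if other != user),
        reverse=True,
    )[:k]
    scores, weights = {}, {}
    for sim, other in neighbours:
        if sim <= 0:
            continue
        for item, rating in ratings[other].items():
            if item not in ratings[user]:
                scores[item] = scores.get(item, 0.0) + sim * rating
                weights[item] = weights.get(item, 0.0) + sim
    predicted = {item: scores[item] / weights[item] for item in scores}
    # Highest predicted rating first; ties broken alphabetically.
    return sorted(predicted, key=lambda item: (-predicted[item], item))[:top_n]


if __name__ == "__main__":
    # Hypothetical app ratings on a 1-5 scale.
    ratings = {
        "alice": {"maps": 5, "chat": 4},
        "bob": {"maps": 5, "chat": 4, "notes": 5, "game": 1},
        "carol": {"maps": 1, "chat": 1, "game": 5},
    }
    print(recommend(ratings, "alice"))
```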

Research on Big Data Integration Method

  • Kim, Jee-Hyun;Cho, Young-Im
    • Journal of the Korea Society of Computer and Information / v.22 no.1 / pp.49-56 / 2017
  • In this paper we propose an approach to big data integration for analyzing, visualizing, and predicting market trends: an integrated data model built with the R language, described here as the future of statistics, and Hadoop, a parallel data-processing framework. Four approaches using R and Hadoop are considered: the ff package in R, R with Streaming (a Hadoop utility), and Rhipe and RHadoop, which are R and Hadoop interface packages. The strengths and weaknesses of the four methods are described and analyzed, and Rhipe and RHadoop are proposed as a complete data integration model. Integrating R, which is popular for statistical algorithms, with Hadoop, which comprises a distributed file system and a resource management platform and can implement the MapReduce programming model, yields a new environment in which R code can be written and deployed in Hadoop without any data movement. This model enables high-performance predictive analysis and a deep understanding of big data.

A Study on the Development of Automation System for Social Science Research Based on Cloud (클라우드 기반의 사회과학연구 자동화 시스템 개발에 관한 연구)

  • Yoon, Cheolho
    • Information Systems Review / v.17 no.1 / pp.217-238 / 2015
  • Much of the social science research process can be expedited with an automation system, leading to greater research efficiency and a dramatic improvement in the research process. This study proposes a cloud-based social science research automation system that generates questionnaires, supports data collection, and intuitively processes statistical analyses of the collected data. The Cloud-based Social Science Research Automation System is developed with GNU/GPL-based open source software. We also integrate R for statistical computing to enable advanced statistical analyses such as PLS structural equation modeling, mediation effect analysis, and between-group comparisons, as well as general statistics. The system developed in this study is expected to play an important role in improving the social science research process and in conducting social science studies efficiently.