• Title/Summary/Keyword: probability.statistics

Analysis-based Pedestrian Traffic Incident Analysis Based on Logistic Regression (로지스틱 회귀분석 기반 노인 보행자 교통사고 요인 분석)

  • Siwon Kim;Jeongwon Gil;Jaekyung Kwon;Jae seong Hwang;Choul ki Lee
    • The Journal of The Korea Institute of Intelligent Transport Systems
    • /
    • v.23 no.2
    • /
    • pp.15-31
    • /
    • 2024
  • This study identifies the characteristics of elderly traffic accidents against the background of Korea's elderly population as the country enters an ultra-aged society. Elderly pedestrian traffic accidents were classified with a binary variable into accidents of serious severity or higher and accidents of minor severity or lower, and the relationship between the independent variables and this dependent variable was analyzed. Elderly pedestrian accident data for the past 10 years (2013 to 2022) were acquired from the Traffic Accident Analysis System (TAAS), followed by data processing, variable selection, basic statistics, and analysis by accident factor. A total of 15 influencing variables were derived by applying the logistic regression model, and the variables with the greatest influence on the probability of a serious-or-worse elderly pedestrian accident were identified. Statistical tests were then performed to assess the goodness of fit of the logistic model, and a method for predicting accident probability with the resulting prediction model was presented.
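
The prediction step described in this abstract can be illustrated with a minimal sketch of how a fitted logistic model turns predictor values into a probability. The intercept, the two coefficients, and the predictor names below are hypothetical placeholders chosen for illustration; they are not values reported by the paper.

```python
import math

def severe_probability(intercept, coefs, x):
    """Logistic model: P(severe) = 1 / (1 + exp(-(b0 + sum(b_i * x_i))))."""
    z = intercept + sum(b * v for b, v in zip(coefs, x))
    return 1.0 / (1.0 + math.exp(-z))

def odds_ratio(coef):
    """Multiplicative change in the odds for a one-unit increase in a predictor."""
    return math.exp(coef)

# Hypothetical coefficients (NOT from the paper): intercept, night-time, age 80+.
INTERCEPT = -1.2
COEFS = [0.8, 0.5]

p_base = severe_probability(INTERCEPT, COEFS, [0, 0])   # daytime, under 80
p_night = severe_probability(INTERCEPT, COEFS, [1, 0])  # night-time, under 80

def odds(p):
    return p / (1.0 - p)

print(f"P(severe | base)  = {p_base:.3f}")
print(f"P(severe | night) = {p_night:.3f}")
print(f"odds ratio (night) = {odds_ratio(COEFS[0]):.3f}")
```

Reading a coefficient through its odds ratio is the usual way to rank influencing variables in a model of this kind: the ratio of the two odds above equals `exp(0.8)` regardless of the other predictor values.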

An Analysis of Statistics Chapter of the Grade 7's Current Textbook in View of the Distribution Concepts (중학교 1학년 통계단원에 나타난 분포개념에 관한 분석)

  • Lee, Young-Ha;Choi, Ji-An
    • Journal of Educational Research in Mathematics
    • /
    • v.18 no.3
    • /
    • pp.407-434
    • /
    • 2008
  • This research analyzes the descriptions in the statistics chapter of current grade 7 textbooks. The analysis is based on the distribution concepts suggested by Nam (2007). We therefore assumed that the goal of the statistics chapter is to establish concepts of distribution and to learn ways of communicating and comparing through distributional presentations. What we learned and wish to suggest through the study is the following. 1) Students should learn what a distribution is and what it is not. 2) Every presentational form of a distribution should be taught in its own right, so that students are encouraged to learn the forms and use them appropriately. 3) The density histogram should be introduced to extend students' experience of viewing an area as a relative frequency, which later develops into probability density. 4) Whether to include the comparison of two distributions, especially through frequency polygons, should become a topic of debate among educational stakeholders. Such comparison is very important when stochastic correlation is learned, because correlation is nothing but a comparison between conditional distributions. 5) Statistical literacy is also an important issue for students' daily lives. In particular, the process preceding data collection must be introduced so that students recognize the importance of accurate and purpose-oriented data.
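
Point 3 above, viewing the area of a histogram bar as a relative frequency, can be made concrete with a small sketch. The height data and class boundaries are invented for illustration, and all values are assumed to fall inside the given boundaries.

```python
def density_histogram(data, edges):
    """Return (low, high, relative_frequency, density) for each class.

    density = relative_frequency / class_width, so bar area = relative frequency.
    The last class is closed on the right so the maximum value is counted.
    """
    n = len(data)
    bars = []
    for i, (lo, hi) in enumerate(zip(edges, edges[1:])):
        is_last = i == len(edges) - 2
        count = sum(1 for x in data if lo <= x < hi or (is_last and x == hi))
        rel = count / n
        bars.append((lo, hi, rel, rel / (hi - lo)))
    return bars

# Hypothetical student heights in cm, with classes of unequal width.
heights = [150, 152, 155, 158, 160, 161, 163, 165, 170, 178]
edges = [150, 155, 160, 170, 180]

bars = density_histogram(heights, edges)
total_area = sum(density * (hi - lo) for lo, hi, _, density in bars)

for lo, hi, rel, density in bars:
    print(f"[{lo}, {hi}): relative frequency {rel:.2f}, density {density:.3f}")
print(f"total area = {total_area:.2f}")
```

With unequal class widths the bar heights no longer match the relative frequencies, but the areas still do and they sum to 1, which is the property that carries over to a probability density.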

A Comparative Study of Curriculum and Mathematics Learning Programme of Lower Grade Between Korea and New Zealand (한국과 뉴질랜드의 초등학교 저학년 교육과정 및 수학학습 프로그램의 비교와 분석)

  • 최창우
    • School Mathematics
    • /
    • v.6 no.1
    • /
    • pp.1-19
    • /
    • 2004
  • Recently we have been hearing, through mass media such as newspapers and broadcasting, about the crisis of public education. This suggests that we have not had enough opportunity to think carefully about curriculum design, or about education programmes that could normalize school education, given still-unresolved problems relating to overcrowded classes, teachers, and poor finances. As we know, the older generation is familiar with rote learning, which was under the influence of behaviorism for about three hundred years. Fortunately, the 7th curriculum, announced by the Ministry of Education on 30 Dec. 1997, changed many things, for example by introducing real-life-based and activity-based learning. It still leaves something to be desired, however, in reflecting the demands of teachers in the field. Given this situation, I wondered how the curriculum is run in countries such as New Zealand and how their lower-grade mathematics learning programme differs from ours, and I tried to find suggestive points by comparing the curricula and textbooks of Korea and New Zealand. Comparing all strands of the two countries' curricula would be too broad, so this paper deals only with number and operations (number), measurement, figures (geometry), equations and patterns (algebra), and probability and statistics (statistics), which receive comparatively more attention in the lower grades of primary school. Because the main purpose of this paper is to compare and analyze the curriculum and mathematics learning programmes of the lower grades of primary school in the two countries, we first compare the general characteristics of their education systems and curricula, and then deal with the core content of the New Zealand curriculum within levels 1, 2, and 3, together with the general characteristics of its learning programme.

Reproducibility of Hypothesis Testing and Confidence Interval (가설검정과 신뢰구간의 재현성)

  • Huh, Myung-Hoe
    • The Korean Journal of Applied Statistics
    • /
    • v.27 no.4
    • /
    • pp.645-653
    • /
    • 2014
  • The p-value is the probability of observing the current sample, and possibly other samples departing equally or more extremely from the null hypothesis toward the postulated alternative hypothesis. When the p-value is less than a certain level called $\alpha$ (= 0.05), researchers claim that the alternative hypothesis is supported empirically. Unfortunately, some findings discovered in that way are not reproducible, partly because the p-value itself is a statistic vulnerable to random variation. Boos and Stefanski (2011) suggest calculating the upper limit of the p-value in hypothesis testing using a bootstrap predictive distribution. To determine the sample size of a replication study, this study proposes thought experiments that simulate boosted bootstrap samples of different sizes from the given observations. The method is illustrated for the cases of two-group comparison and multiple linear regression. This study also addresses the reproducibility of the points in a given 95% confidence interval. Numerical examples show that the center point is covered by 95% confidence intervals generated from bootstrap resamples, whereas the end points are covered with only a 50% chance. Hence this study draws the graph of the reproducibility rate for each parameter value in the confidence interval.
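
The idea that the p-value is itself a statistic subject to random variation can be sketched by resampling. The code below is an illustrative assumption rather than the procedure of the paper or of Boos and Stefanski (2011): it uses a normal approximation for a two-group comparison of means, simulated data, and an arbitrary 90% quantile as the "upper limit".

```python
import random
from statistics import NormalDist, mean, variance

def two_sample_p(a, b):
    """Two-sided p-value for a difference in means (normal approximation)."""
    se = (variance(a) / len(a) + variance(b) / len(b)) ** 0.5
    diff = mean(a) - mean(b)
    if se == 0:  # degenerate resample: no spread in either group
        return 1.0 if diff == 0 else 0.0
    return 2 * (1 - NormalDist().cdf(abs(diff / se)))

def bootstrap_p_values(a, b, n_boot=1000, seed=0):
    """Sorted p-values from resampling each group with replacement."""
    rng = random.Random(seed)
    return sorted(
        two_sample_p(rng.choices(a, k=len(a)), rng.choices(b, k=len(b)))
        for _ in range(n_boot)
    )

def quantile(sorted_values, q):
    """Simple empirical quantile of an already sorted list."""
    idx = min(len(sorted_values) - 1, int(q * len(sorted_values)))
    return sorted_values[idx]

# Simulated (hypothetical) data: two groups of 30 with a true mean shift of 0.6.
data_rng = random.Random(42)
group_a = [data_rng.gauss(0.0, 1.0) for _ in range(30)]
group_b = [data_rng.gauss(0.6, 1.0) for _ in range(30)]

observed_p = two_sample_p(group_a, group_b)
p_values = bootstrap_p_values(group_a, group_b)
median_p = quantile(p_values, 0.5)
upper_p = quantile(p_values, 0.9)

print(f"observed p = {observed_p:.4f}")
print(f"bootstrap median p = {median_p:.4f}, 90% upper limit = {upper_p:.4f}")
```

Comparing `upper_p` with $\alpha$ gives a rough sense of how often a replication of the same size might fail to reach significance, which is the concern the abstract raises.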

Emerging P2P Traffic Analysis and Modeling (P2P 트래픽의 특성 분석과 트래픽 모델링)

  • 주성돈;이채우
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.29 no.2B
    • /
    • pp.279-288
    • /
    • 2004
  • Rapidly emerging P2P (peer-to-peer) applications generate very bursty traffic, which places a heavy burden on the network, and the amount of such traffic is increasing rapidly. It is therefore becoming more important to understand the characteristics of this traffic and to reflect them when designing and analyzing networks. To that end, we measured traffic in a campus network, present flow statistics and traffic models of the measured traffic, and compare them with those of web traffic. The results indicate that P2P traffic is much burstier than web traffic and, as a result, negatively affects network performance. We modeled P2P traffic with a self-similar traffic model to predict the packet delay and loss occurring in the network, which are very important for evaluating network performance. We also predict the queue length distribution and loss probability in an SSQ (Single Server Queue). To assess the accuracy of the traffic models, we compare their SSQ statistics with those of the traffic trace. The results show that the self-similar traffic models we use can predict P2P traffic behavior in the network precisely. The traffic models we derived are expected to be useful for designing network capacity and predicting the network performance and QoS of P2P applications.
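
The effect of burstiness on a single server queue can be shown with a toy discrete-time simulation. This is only a sketch under strong assumptions: the two arrival traces are deterministic inventions with the same mean rate, not a self-similar model or the measured campus trace, and the service rate and buffer size are arbitrary.

```python
def simulate_queue(arrivals, service_per_slot, buffer_size):
    """Discrete-time single server queue with a finite buffer.

    Per slot: packets arrive, overflow beyond the buffer is dropped,
    then up to `service_per_slot` packets are served.
    Returns (loss_fraction, queue_length_after_each_slot).
    """
    queue = 0
    lost = 0
    lengths = []
    for arriving in arrivals:
        queue += arriving
        if queue > buffer_size:
            lost += queue - buffer_size
            queue = buffer_size
        queue = max(0, queue - service_per_slot)
        lengths.append(queue)
    total = sum(arrivals)
    return (lost / total if total else 0.0), lengths

# Two hypothetical traces with the same mean rate (2 packets per slot).
smooth = [2] * 1000            # evenly spread arrivals
bursty = [8, 0, 0, 0] * 250    # same volume, concentrated in bursts

smooth_loss, smooth_lengths = simulate_queue(smooth, service_per_slot=3, buffer_size=4)
bursty_loss, bursty_lengths = simulate_queue(bursty, service_per_slot=3, buffer_size=4)

print(f"smooth: loss fraction {smooth_loss:.2f}, max queue {max(smooth_lengths)}")
print(f"bursty: loss fraction {bursty_loss:.2f}, max queue {max(bursty_lengths)}")
```

Although both traces offer the same load, only the bursty one overflows the buffer, which is why queue length distribution and loss probability are the statistics used to judge a traffic model.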

Additive hazards models for interval-censored semi-competing risks data with missing intermediate events (결측되었거나 구간중도절단된 중간사건을 가진 준경쟁적위험 자료에 대한 가산위험모형)

  • Kim, Jayoun;Kim, Jinheum
    • The Korean Journal of Applied Statistics
    • /
    • v.30 no.4
    • /
    • pp.539-553
    • /
    • 2017
  • We propose a multi-state model to analyze semi-competing risks data with interval-censored or missing intermediate events. This model is an extension of the three-state illness-death model, whose states are healthy, diseased, and dead. The 'diseased' state can be considered the intermediate event. Two more states are added to the illness-death model to incorporate the missing events, which are caused by loss to follow-up before the end of a study. One of them is a lost-to-follow-up (LTF) state, and the other is an unobservable state that represents an intermediate event experienced after the occurrence of LTF. Given covariates, we employ the Lin and Ying additive hazards model with log-normal frailty and construct a conditional likelihood to estimate transition intensities between states in the multi-state model. Marginalization of the full likelihood is carried out using adaptive importance sampling, and the optimal solution for the regression parameters is obtained through an iterative quasi-Newton algorithm. Simulation studies are performed to investigate the finite-sample performance of the proposed estimation method in terms of the empirical coverage probability of the true regression parameters. Our proposed method is also illustrated with a dataset adapted from Helmer et al. (2001).

Application of Multi-Dimensional Precipitation Models to the Sampling Error Problem (관측오차문제에 대한 다차원 강우모형의 적용)

  • Yu, Cheol-Sang
    • Journal of Korea Water Resources Association
    • /
    • v.30 no.5
    • /
    • pp.441-447
    • /
    • 1997
  • Rainfall observation using rain gage networks or satellites involves a sampling error that depends on the observation method or plan. For example, sampling with rain gages is continuous in time but discontinuous in space, which is precisely the source of the sampling error. Sampling with satellites is the reverse case: continuous in space and discontinuous in time. The sampling error can be quantified using the temporal-spatial characteristics of rainfall and the sampling design. One recent work on this problem is by North and Nakamoto (1989), who derived a formula for estimating the sampling error based on the temporal-spatial rainfall spectrum and the design scheme. Given the statistical characteristics of rainfall, the formula enables the design of an optimal rain gage network or satellite operation plan. In this paper the formula is reviewed and applied to sampling error problems using several multi-dimensional precipitation models. The results show a limitation of the formula: it cannot distinguish between models when their parameters reproduce similar second-order statistics of rainfall. This limitation can be overcome by developing a new way to consider higher-order statistics and, eventually, the probability density function (PDF) of rainfall.

Analysis of Characteristics of the Cancelled Districts of Housing Redevelopment Project - Focusing on Decision Tree Analysis - (재정비사업 해제구역 의사결정 특성 연구 - 의사결정나무기법 중심으로 -)

  • Lee, Do-Ghil
    • Journal of the Korean Regional Science Association
    • /
    • v.37 no.4
    • /
    • pp.49-59
    • /
    • 2021
  • This study aims to identify the characteristics of cancelled districts in housing redevelopment and housing reconstruction projects. The subject of this study is 189 project districts (121 promoted districts and 68 cancelled districts), all of which were analyzed by Decision Tree Analysis. The first split on the factors influencing cancellation was made by Development Actors. In other words, the most important independent variable for determining cancellation was the presence or absence of a project-promoting body. Of the 89 districts without a promoting body, 41 were cancelled and 48 were promoted; of the 100 districts with one, 9 were cancelled and 91 were promoted. The second split was made by Land Owners: the probability of cancellation increased when the number of landowners was less than 468, with 37 of 62 such districts cancelled. In contrast, 4 of the 27 districts with 468 or more landowners were cancelled and 23 were promoted. The third split was made by Average Land Assessment: 35 districts were cancelled below the threshold of KRW 2.6964 million/m² (approximately KRW 8.91 million per pyeong), and two districts were cancelled at higher officially assessed prices. Among the districts with 468 or more landowners from the second split (node 4), the four cancelled districts all had a public land area ratio of 29.43% or more, and no district with a lower ratio was cancelled. SPSS Statistics 26 was used for the analysis.
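
The node counts quoted in this abstract can be turned into cancellation rates, and the land price threshold can be checked against its per-pyeong equivalent. This sketch only redoes arithmetic on the reported figures; it assumes the threshold is KRW 2.6964 million per m² (269.64 ten-thousand won), the reading that agrees with the stated KRW 8.91 million per pyeong, and it uses the standard conversion 1 pyeong = 400/121 m². The node labels are descriptive names introduced here.

```python
SQM_PER_PYEONG = 400 / 121  # 1 pyeong is about 3.3058 square metres

def cancellation_rate(cancelled, total):
    """Share of districts in a tree node whose designation was cancelled."""
    return cancelled / total

# (cancelled, total) per node, as reported in the abstract.
nodes = {
    "no promoting body": (41, 89),
    "promoting body present": (9, 100),
    "landowners < 468": (37, 62),
    "landowners >= 468": (4, 27),
}
rates = {name: cancellation_rate(c, t) for name, (c, t) in nodes.items()}

# Assumed threshold: KRW 2.6964 million per square metre.
price_per_sqm_million = 2.6964
price_per_pyeong_million = price_per_sqm_million * SQM_PER_PYEONG

for name, rate in rates.items():
    print(f"{name}: {rate:.1%} cancelled")
print(f"threshold = KRW {price_per_pyeong_million:.2f} million per pyeong")
```

The gap between the first two rates is what makes the promoting-body variable the first split: a decision tree chooses the split that separates cancelled from promoted districts most sharply.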

Input Variables Selection by Principal Component Analysis and Mutual Information Estimation (주요성분분석과 상호정보 추정에 의한 입력변수선택)

  • Cho, Yong-Hyun;Hong, Seong-Jun
    • Journal of the Korean Institute of Intelligent Systems
    • /
    • v.17 no.2
    • /
    • pp.220-225
    • /
    • 2007
  • This paper presents an efficient input variable selection method using both principal component analysis (PCA) and adaptive partition mutual information (AP-MI) estimation. PCA, which is based on second-order statistics, is applied to prevent overestimation by quickly removing the dependence between input variables. AP-MI estimation is applied to estimate dependence information accurately by equally partitioning the samples of the input variables when calculating the probability density function. The proposed method was applied to two input variable selection problems: 7 artificial signals of 500 samples and 24 environmental pollution signals of 55 samples. The experimental results show that the proposed method achieves fast and accurate selection. It also performs better than both AP-MI estimation without PCA and regular partition MI estimation.
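
Partition-based mutual information estimation can be sketched as follows. This is a simplified stand-in, not the AP-MI estimator of the paper: it uses a fixed number of equal-count bins per variable instead of an adaptive partition, and the test signals are invented.

```python
import math
from collections import Counter

def equal_count_bins(values, k):
    """Assign each sample to one of k bins holding (nearly) equal numbers of samples."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for rank, i in enumerate(order):
        bins[i] = rank * k // len(values)
    return bins

def mutual_information(x, y, k=4):
    """Plug-in MI estimate in nats: sum of p(a,b) * log(p(a,b) / (p(a) p(b)))."""
    bx, by = equal_count_bins(x, k), equal_count_bins(y, k)
    n = len(x)
    joint = Counter(zip(bx, by))
    px, py = Counter(bx), Counter(by)
    return sum(
        count / n * math.log(count * n / (px[a] * py[b]))
        for (a, b), count in joint.items()
    )

# Fully dependent pair: y is a copy of x.
x_dep = list(range(100))
mi_dependent = mutual_information(x_dep, x_dep, k=4)

# Independent pair by construction: every (x-bin, y-bin) cell occurs exactly once.
x_ind = list(range(16))
y_ind = [(i % 4) * 100 + i for i in range(16)]
mi_independent = mutual_information(x_ind, y_ind, k=4)

print(f"MI(x, x)  = {mi_dependent:.4f} nats")
print(f"MI(x, y)  = {mi_independent:.4f} nats")
```

A candidate input with high estimated mutual information with the target is informative, while one that is highly dependent on inputs already chosen is redundant, which is the motivation for removing dependence between inputs before estimation.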

The research for the management and financial affairs of geriatric hospital (노인병원의 운명 및 재무구조 특성에 관한 연구)

  • Kim, Do-Hun;Lee, Jong-Gil;Jung, Key-Stm;Lee, Chang-Eun
    • Korea Journal of Hospital Management
    • /
    • v.6 no.1
    • /
    • pp.1-17
    • /
    • 2001
  • As the proportion of aged people increases, medical demand for senile chronic diseases has grown; aged people therefore call for geriatric hospitals that provide specialized geriatric medical services. The main purpose of this study was to analyze the general characteristics and financial status of geriatric hospitals. For the study, a questionnaire was designed and sent to geriatric hospitals to collect patient statistics, headcount by department, and so on; to assess stability, profitability, activity, and so on, the hospitals' financial statements were analyzed. The major findings of this study are as follows. 1. The ratio of medical expenses to revenue in geriatric hospitals is much lower than in acute care hospitals. However, the probability of bankruptcy is higher because of the high ratio of liabilities, so the financial position needs to be stabilized through more donations. 2. The government budget for elderly people is insufficient. The government should increase the budget to support geriatric hospitals through subsidies. 3. A portion of the patients of geriatric hospitals are government-supported patients. Since the government does not pay the medical charges quickly, geriatric hospitals have a serious cash flow problem. The government should therefore prepay the bills. 4. Since geriatric hospitals treat elderly patients and most patients are government-supported patients, geriatric hospitals can be said to operate under the strict. 5. When the daily medical charge is introduced, the patient's own liability will be reduced to approximately 50% of the current level. This effect will greatly improve the financial structure and medical profit of geriatric hospitals, and patients' families will feel less of an economic burden.
