• Title/Abstract/Keywords: count data

Search results: 1,116 items (processing time: 0.024 s)

Weighted zero-inflated Poisson mixed model with an application to Medicaid utilization data

  • Lee, Sang Mee;Karrison, Theodore;Nocon, Robert S.;Huang, Elbert
    • Communications for Statistical Applications and Methods
    • /
    • Vol. 25, No. 2
    • /
    • pp.173-184
    • /
    • 2018
  • In medical or public health research, it is common to encounter clustered or longitudinal count data that exhibit excess zeros. For example, health care utilization data often have a multi-modal distribution with excess zeros as well as a multilevel structure in which patients are nested within physicians and hospitals. To analyze this type of data, zero-inflated count models with mixed effects have been developed, in which a count response variable is assumed to follow a mixture of a Poisson or negative binomial distribution and a point mass at zero, with random effects included. However, no study has considered the situation where data are also censored due to the finite observation or follow-up period. In this paper, we present a weighted version of the zero-inflated Poisson model with random effects that accounts for variable individual follow-up times. We suggest two different types of weight function. The performance of the proposed model is evaluated and compared to a standard zero-inflated mixed model through simulation studies. This approach is then applied to Medicaid data analysis.
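
As a rough illustration of the mixture described in this abstract, the sketch below evaluates the probability mass function of a plain zero-inflated Poisson distribution. It deliberately omits the random effects and the follow-up weights that the paper proposes; the parameter names `pi` (zero-inflation probability) and `lam` (Poisson mean) are assumptions made here for illustration.

```python
import math


def zip_pmf(y: int, pi: float, lam: float) -> float:
    """P(Y = y) for a zero-inflated Poisson variable.

    With probability pi the outcome is a structural zero; otherwise it is
    drawn from Poisson(lam). No random effects or weights are included.
    """
    poisson = math.exp(-lam) * lam ** y / math.factorial(y)
    if y == 0:
        # Zeros come from both the point mass and the Poisson component.
        return pi + (1.0 - pi) * poisson
    return (1.0 - pi) * poisson


def zip_mean(pi: float, lam: float) -> float:
    """Mean of the zero-inflated Poisson: only the Poisson part contributes."""
    return (1.0 - pi) * lam
```

For example, `zip_pmf(0, 0.3, 2.0)` is larger than the plain Poisson probability `exp(-2.0)`, which is the "excess zeros" effect the abstract refers to.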

A Study on Phone Call Big Data Analytics

  • 김정래;정찬기
    • 정보화연구
    • /
    • Vol. 10, No. 3
    • /
    • pp.387-397
    • /
    • 2013
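  • This study proposes a big data analytics approach for data generated by telephone calls. The analysis model for phone call data consists of the PVPF (Parallel Variable-length Phrase Finding) algorithm, which identifies natural-language vocabulary, and a word count algorithm, which measures keyword usage frequency. In the proposed model, vocabulary is first identified by extracting linked words with the PVPF algorithm, and the usage frequency of the identified vocabulary and words is then measured with the MapReduce word count algorithm. The results can be interpreted from various perspectives. To demonstrate the effectiveness of the proposed model, it was designed and implemented on HDFS (Hadoop Distributed File System) and applied experimentally to phone call data. In the experiments, meaningful results were derived through keyword correlation analysis and analysis of changes in usage frequency.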
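
The word count step named in this abstract can be sketched in the map/reduce style as follows. This is a single-process illustration in plain Python, not the Hadoop implementation from the paper, and it does not attempt the PVPF phrase-finding step; the function names and the whitespace tokenization are assumptions.

```python
from collections import defaultdict
from itertools import chain


def map_words(line):
    """Map phase: emit a (word, 1) pair for every whitespace-separated token."""
    return [(word.lower(), 1) for word in line.split()]


def reduce_counts(pairs):
    """Reduce phase: sum the emitted counts for each distinct word."""
    totals = defaultdict(int)
    for word, count in pairs:
        totals[word] += count
    return dict(totals)


def word_count(lines):
    """Run the map phase on every line, then reduce all emitted pairs."""
    return reduce_counts(chain.from_iterable(map_words(line) for line in lines))
```

In a real MapReduce job the pairs emitted by the map phase would be shuffled across workers by key before the reduce phase; here that grouping happens inside one dictionary.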

Design of Zigbee based Portable ECG monitoring system

  • 홍주현;김남진;차은종;이태수
    • 대한전기학회:학술대회논문집
    • /
    • Proceedings of the 2006 대한전기학회 Symposium, Information and Control Section
    • /
    • pp.51-53
    • /
    • 2006
  • This paper proposes a portable ECG monitoring system that integrates up-to-date PDA and RF communication technology. The aim of the study is to acquire the subject's biomedical signal without any constraint. The system has two transmission modes: total signal transmission mode and HR (heart rate)/SC (step count) transmission mode. In addition, the wireless link uses Zigbee Wireless PAN and can work in low-power mode, which is one of the advantages of Zigbee communication technology. The developed system is composed of a transmitter and a receiver. The transmitter has a three-axis acceleration sensor, an ECG amplifier, and a Zigbee communication controller. In total signal transmission mode, it can send 50 data packets per second, a rate that corresponds to 300 ECG samples and 60 acceleration samples. In HR/SC transmission mode, it can calculate heart rate from ECG data sampled at 216 samples per second and step count from acceleration data, and send a packet every cardiac cycle. The receiver forwards the received data to the PDA, where the data can be stored and displayed. The developed device therefore enables continuous monitoring of Activities of Daily Living (ADL). This method will also reduce medical costs in an aged society.

Statistical Classification of Highway Segments for Improving the Efficiency of Short-term Traffic Count Planning

  • 정유석;오주삼
    • 한국도로학회논문집
    • /
    • Vol. 18, No. 3
    • /
    • pp.109-114
    • /
    • 2016
  • PURPOSES: The demand for extending national highways is increasing, but traffic monitoring is hindered by resource limitations. Hence, this study classified highway segments into 5 types to improve the efficiency of short-term traffic count planning. METHODS: The traffic volume trends of 880 highway segments were classified through R-squared and linear regression analyses; the steadiness of traffic volume trends was evaluated through the coefficient of variation (COV), and the normality of the data was determined through the Shapiro-Wilk W-test. RESULTS: Of the 880 segments, 574 segments had relatively low COV and were classified as type 1 segments, and 123 and 64 segments with increasing and decreasing traffic volume trends were classified as type 2 and type 3 segments, respectively; 80 segments that failed the normality test were classified as type 4, and the remaining 39 were classified as type 5 segments. CONCLUSIONS: A theoretical basis for biennial count planning was established. A biennial count is recommended for types 1-4 because their mean absolute percentage errors (MAPEs) are approximately 10%. For type 5 (MAPE = 19.26%), the conventional annual count can be continued. The results of this analysis can reduce the traffic monitoring budget.
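
Two of the statistics named in this abstract, the coefficient of variation (COV) and the mean absolute percentage error (MAPE), follow standard textbook definitions, sketched below. The function names are assumptions, and the paper's classification thresholds are not given in the abstract, so none are encoded here.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean (unitless spread)."""
    return stdev(values) / mean(values)


def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Assumes every value in `actual` is non-zero.
    """
    return 100.0 * mean(abs(a - p) / a for a, p in zip(actual, predicted))
```

A segment whose yearly volumes barely move has a COV close to zero, which is the sense in which the abstract describes type 1 segments as having relatively low COV.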

Integer-Valued GARCH Models for Count Time Series: Case Study

  • 윤재은;황선영
    • 응용통계연구
    • /
    • Vol. 28, No. 1
    • /
    • pp.115-122
    • /
    • 2015
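  • This study deals with volatility, the conditional second moment, of integer-valued count time series. Several integer-valued GARCH (INGARCH) models are introduced and applied to the number of rubella cases in Korea, a count time series. Over-dispersion and zero-inflation are examined from the standpoint of volatility analysis of count time series, and the zero-inflated INGARCH (ZI-INGARCH) model is considered as a model for future analysis.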
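
To make the INGARCH idea concrete, the sketch below simulates a Poisson INGARCH(1,1) series under its usual textbook form, in which the conditional mean follows lambda_t = omega + alpha * y_{t-1} + beta * lambda_{t-1} and y_t is Poisson with mean lambda_t. The specific models fitted in the paper are not stated in the abstract, so this form, the parameter names, and the simple Poisson sampler are all assumptions.

```python
import math
import random


def poisson_draw(lam, rng):
    """Draw one Poisson(lam) variate with Knuth's multiplication method."""
    limit = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def simulate_ingarch(n, omega, alpha, beta, seed=0):
    """Simulate n steps of a Poisson INGARCH(1,1) process.

    Requires alpha + beta < 1 so that the stationary mean
    omega / (1 - alpha - beta) exists; the recursion starts there.
    """
    rng = random.Random(seed)
    lam = omega / (1.0 - alpha - beta)
    counts, means = [], []
    for _ in range(n):
        y = poisson_draw(lam, rng)
        counts.append(y)
        means.append(lam)
        # Next conditional mean depends on the last count and the last mean.
        lam = omega + alpha * y + beta * lam
    return counts, means
```

Because the conditional distribution is Poisson, the conditional variance equals the conditional mean, so the `means` series also traces the volatility of the simulated counts.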

Trends and Methodological Issues in Spatial Cluster Analysis for Count Data

  • 조대헌
    • 대한지리학회지
    • /
    • Vol. 48, No. 5
    • /
    • pp.768-785
    • /
    • 2013
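  • Count data aggregated by spatial units such as administrative districts are the most basic data in geographic research. Spatial cluster analysis of count data has been carried out continuously, but it has received relatively little attention and has been conducted sporadically across several fields, which makes it difficult to grasp its overall direction as well as its main achievements and open problems. This study examines the trends and methodological characteristics of count-data-based spatial cluster analysis over the past 20 or so years, and then reviews the issues and challenges in order to identify implications for geographic research. Various methods are used in geography as well as in areas such as public health and crime; these methods differ fairly clearly in purpose and methodological characteristics, and issues of statistical reliability also exist. Therefore, the methodology should be reviewed carefully when an analysis is carried out, and further empirical studies on methodology and the development of analytical tools are called for.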

Determinants of the Performance of Government Assistance to R&D Activities

  • Kwak, So-Yoon;Yoo, Seung-Hoon
    • Asian Journal of Innovation and Policy
    • /
    • Vol. 3, No. 1
    • /
    • pp.94-116
    • /
    • 2014
  • Technological innovation is considered an important factor, and technology development carries a positive externality in the form of technology spillover. In this context, it is argued that government should play an active role in advancing technology development, and several means have been introduced. This study analyzes manufacturing firms' evaluation of the performance of government assistance programs for their R&D activities. Because the performance evaluation takes the form of a count outcome, we apply several kinds of count data models. Some interesting findings emerge from the analysis. For example, we find that a firm's sales amount, a dummy for the firm having an R&D department, a dummy for the firm being a venture firm, and the number of the firm's innovative activities are positively related to the degree to which the firm evaluates government assistance as useful.

Analysis of Types of Gather Drape with Visual Evaluation

  • 이명희;정희경
    • 한국의상디자인학회지
    • /
    • Vol. 7, No. 1
    • /
    • pp.33-40
    • /
    • 2005
  • Gathering is a method used to control fullness along a seam line. The purpose of this study was to investigate the relationship between quantitative and qualitative methods, namely the effect of gathers and the types of gather drape. The experimental design consists of four factors: (1) three fabrics of different weight and thickness, (2) three stitch densities, (3) five gather ratios, and (4) three grain directions. Accordingly, one hundred thirty-five (135) samples were made. The SPSS WIN 10.0 package was used for data analysis. The results of this study were as follows. First, in the frequency analysis, side height, hem line width, node depth, node count, and node width agreed with the recorded result data. Second, in the correlation analysis, side height was related to the front statements, and side height was negatively correlated with the overall visual evaluation. Hem line width, node depth, and node count were negatively correlated with the section statements, whereas node width was positively correlated with them. Third, in the $\chi^2$ analysis, the front picture parts that received excellent evaluations were the 1st side height, 3rd hem line width, 4th node depth, 3rd node count, and 3rd node width. The section illustration parts that received excellent evaluations were the 4th side height, 1st hem line width, 2nd node depth, 3rd node count, and 4th node width.

Assessing Hematological Change Associated with Cardiovascular Disease Risk among Korean Taxi Drivers Using Data from the Second (2012-2014) Korean National Environmental Health Survey: A Propensity Score Matching Approach

  • 백기욱
    • 한국산업보건학회지
    • /
    • Vol. 31, No. 4
    • /
    • pp.367-377
    • /
    • 2021
  • Objectives: Taxi drivers are exposed to various hazards, such as long periods of sedentary work and traffic-related air pollutants. However, studies on the health effects among taxi drivers in South Korea are insufficient. Methods: To assess subclinical hematologic change related to cardiovascular disease among male taxi drivers, we analyzed data from the second Korean National Environmental Health Survey. Fifty-nine taxi drivers and 1,912 controls were included in the analysis. Propensity score matching was performed to adjust for age, body mass index, and urinary cotinine. A total of 295 subjects were matched with the 59 taxi drivers. Leukocyte count, platelet count, hematocrit, triglyceride, total cholesterol, HDL cholesterol, and total IgE of the taxi drivers were compared with those of the control group. Results: Taxi drivers showed significantly elevated blood leukocytes and platelets. Serum total IgE was significantly reduced in taxi drivers. However, blood leukocytes, platelets, and serum total IgE were not significantly correlated with work period among taxi drivers. Conclusions: Given the changes in blood leukocyte count, platelet count, and serum total IgE, taxi driving may be associated with peripheral inflammation, humoral immunity, and cardiovascular risk.

Improving Flexibility of External Data Exchange in Counter-fire Operation System by Adapting Dynamic Parser Software

  • 홍원의
    • 한국군사과학기술학회지
    • /
    • Vol. 11, No. 1
    • /
    • pp.51-56
    • /
    • 2008
  • The counter-fire operation system performs its mission by exchanging information with other related systems such as command & control systems and military information systems. In the process of exchanging information, the counter-fire operation system uses a type of data message that contains the exchanged data in the KMTF format. The data exchange requirements of the counter-fire operation will continue to evolve. However, the EDX (External Data eXchange) configuration item of the current counter-fire operation system cannot cope effectively with changes in data exchange requirements because of its fixed software structure. In this paper, a solution for improving the flexibility of external data exchange in the counter-fire operation system is proposed.