• Title/Summary/Keyword: 로그 회귀분석 (log regression analysis)

Search Result 92, Processing Time 0.026 seconds

Development of Statistical Prediction Engine for Integrated Log Analysis Systems (통합 로그 분석 시스템을 위한 통계학적 예측 엔진 개발)

  • KO, Kwang-Man;Kwon, Beom-Chul;Kim, Sung-Chul;Lee, Sang-Jun
    • Proceedings of the Korea Information Processing Society Conference / 2013.11a / pp.638-639 / 2013
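  • Anymon Plus (ver 3.0) is an integrated log analysis system that can collect, store, and analyze large-volume logs and big data in real time (processing 40,000 events per second). Its applications are expanding into diverse areas such as detecting abnormal network behavior through firewall log analysis, analyzing usage patterns through web log analysis, analyzing and detecting fraudulent orders at online shopping malls, and analyzing and detecting internal information leaks. In this paper, to develop an integrated log analysis system grounded in statistical theory that analyzes and forecasts security-related infrastructure logs and enables preemptive response through concentrated vigilance at times when security incidents are expected, we design and implement a prediction engine capable of regression analysis and time series analysis.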

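Maximum Likelihood Estimators and a Simulation Study for Log-Linear Regression Models with Poisson Responses (포아송 반응을 갖는 로그 선형 회귀 모형에 대한 최우추정량과 모의실험 연구)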

  • 한정혜;조중재
    • Communications for Statistical Applications and Methods / v.2 no.1 / pp.22-31 / 1995
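  • This paper applies the bootstrap method to log-linear regression models with Poisson responses, studies and introduces useful probabilistic results for various kinds of statistical inference, and presents a range of small-sample properties obtained through simulation. In particular, it analyzes simulation results concerning probabilistic properties such as consistency and normality of the maximum likelihood estimator $\hat{\beta}_n$ and of the estimators $I_1(\hat{\beta}_n \cdot X)$ and $I_2(\hat{\beta}_n \cdot X)$ of the information matrix $I(\beta_0)$, together with large-sample properties obtained by applying the bootstrap method.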
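
As a rough illustration of the kind of experiment this abstract describes, the sketch below fits a Poisson log-linear model by Newton-Raphson and then estimates a standard error with a pairs bootstrap. Everything specific here is an assumption made for the example, not taken from the paper: the single covariate, the true coefficients 0.5 and 0.8, the sample size, and the choice of a pairs bootstrap (the paper's own resampling scheme is not stated in the abstract).

```python
import math
import random

def fit_poisson_loglinear(x, y, iters=15):
    """Newton-Raphson MLE for the model log E[y] = b0 + b1 * x."""
    b0, b1 = math.log(sum(y) / len(y)), 0.0  # start from the intercept-only fit
    for _ in range(iters):
        mu = [math.exp(b0 + b1 * xi) for xi in x]
        # Score vector: sum over observations of (y - mu) * (1, x)
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum((yi - mi) * xi for yi, mi, xi in zip(y, mu, x))
        # Information matrix: sum over observations of mu * (1, x)(1, x)^T
        i00 = sum(mu)
        i01 = sum(mi * xi for mi, xi in zip(mu, x))
        i11 = sum(mi * xi * xi for mi, xi in zip(mu, x))
        det = i00 * i11 - i01 * i01
        # Newton step: beta <- beta + I^{-1} * score (explicit 2x2 inverse)
        b0 += (i11 * g0 - i01 * g1) / det
        b1 += (i00 * g1 - i01 * g0) / det
    return b0, b1

def sample_poisson(mu, rng):
    """Knuth's multiplication method; adequate for small means."""
    limit, k, p = math.exp(-mu), 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k

# Synthetic data with assumed true coefficients (0.5, 0.8)
rng = random.Random(0)
n = 300
x = [rng.uniform(-1.0, 1.0) for _ in range(n)]
y = [sample_poisson(math.exp(0.5 + 0.8 * xi), rng) for xi in x]
b0_hat, b1_hat = fit_poisson_loglinear(x, y)

# Pairs bootstrap: resample (x, y) rows with replacement and refit the slope
boot = []
for _ in range(100):
    idx = [rng.randrange(n) for _ in range(n)]
    boot.append(fit_poisson_loglinear([x[i] for i in idx], [y[i] for i in idx])[1])
mean_b1 = sum(boot) / len(boot)
se_b1 = math.sqrt(sum((b - mean_b1) ** 2 for b in boot) / (len(boot) - 1))
print(f"b0_hat={b0_hat:.3f} b1_hat={b1_hat:.3f} bootstrap SE(b1)={se_b1:.3f}")
```

Repeating the whole procedure over many simulated data sets and several values of `n` would give the kind of small-sample and large-sample comparison the abstract mentions.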

Estimations of the student numbers by nonlinear regression model (비선형 회귀모형을 이용한 학년별 학생수 추계)

  • Yoon, Yong-Hwa;Kim, Jong-Tae
    • Journal of the Korean Data and Information Science Society / v.23 no.1 / pp.71-77 / 2012
  • This paper introduces projection methods based on nonlinear regression models. To predict student numbers, a log model and an involution model, both trend-extrapolation methods, are used. Empirical evidence shows that projection by the log model is better than projection by the involution model, based on confidence interval estimates for the coefficients of determination.
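
The comparison described in this abstract can be sketched as follows. The code assumes that the "involution model" is a power model y = c * t^d (an interpretation, not confirmed by the abstract), uses made-up enrollment figures, and compares only point estimates of the coefficient of determination rather than the confidence intervals used in the paper.

```python
import math

def ols(x, y):
    """Least-squares intercept and slope for y = a + b * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

def r_squared(y, fitted):
    """Coefficient of determination on the original scale of y."""
    my = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot

# Hypothetical yearly student counts (illustration only)
years = list(range(1, 11))
students = [1000, 960, 941, 925, 915, 907, 899, 893, 888, 884]
log_t = [math.log(t) for t in years]

# Log model: y = a + b * ln(t)
a, b = ols(log_t, students)
log_fit = [a + b * lt for lt in log_t]

# Power model (assumed reading of "involution"): ln(y) = ln(c) + d * ln(t)
ln_c, d = ols(log_t, [math.log(s) for s in students])
pow_fit = [math.exp(ln_c) * t ** d for t in years]

r2_log = r_squared(students, log_fit)
r2_pow = r_squared(students, pow_fit)
print(f"R^2 log model: {r2_log:.4f}, R^2 power model: {r2_pow:.4f}")
```

Extrapolating either fitted curve to a later `t` gives the projected student number for that year.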

Predicting the success of CDM Registration for Hydropower Projects using Logistic Regression and CART (로그 회귀분석 및 CART를 활용한 수력사업의 CDM 승인여부 예측 모델에 관한 연구)

  • Park, Jong-Ho;Koo, Bonsang
    • Korean Journal of Construction Engineering and Management / v.16 no.2 / pp.65-76 / 2015
  • The Clean Development Mechanism (CDM) is the multi-lateral 'cap and trade' system endorsed by the Kyoto Protocol. CDM allows developed (Annex I) countries to buy CER credits from New and Renewable (NE) projects of non-Annex countries to meet their carbon reduction requirements. This in effect subsidizes and promotes NE projects in developing countries, ultimately reducing global greenhouse gases (GHG). To be registered as a CDM project, a project must prove 'additionality,' which depends on numerous factors including the adopted technology, baseline methodology, emission reductions, and the project's internal rate of return. This makes it difficult to determine ex ante whether a project will be accepted as a CDM project, and exposes project stakeholders to sunk costs and even project cancellation. Focusing on hydro power projects and employing UNFCCC public data, this research developed a prediction model using logistic regression and CART to determine the likelihood of approval as a CDM project. The AUC values for the logistic regression and CART models were 0.7674 and 0.7231, respectively, which supports the models' predictive accuracy. More importantly, results indicate that the emission reduction amount, MW per hour, and investment/emission are crucial variables, whereas the baseline methodology and technology types were insignificant. This suggests that, at least for hydro power projects, the specific technology is not as important as the amount of emission reductions, relatively small project scale, and the ratio of investment to carbon reduction.
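
For readers unfamiliar with the AUC figures quoted above, the following minimal sketch computes AUC as the probability that a randomly chosen approved project receives a higher predicted score than a randomly chosen rejected one, counting ties as one half. The labels and scores are invented for illustration and have no connection to the UNFCCC data used in the study.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0   # positive ranked above negative
            elif p == q:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))

# Hypothetical registration outcomes (1 = approved) and model scores
labels = [1, 1, 0, 0]
scores = [0.9, 0.3, 0.5, 0.1]
print(auc(labels, scores))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation, so values in the 0.72 to 0.77 range sit between the two.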

Reconstruction of Urbanization Levels and the Nature of Over/underurbanization Problems in China (중국 도시화율의 재구성과 과잉/과소 도시화 문제의 성격)

  • Jun, Kwang-Hee
    • Korea journal of population studies / v.27 no.2 / pp.257-289 / 2004
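  • The purpose of this study is to reconstruct China's urbanization levels and, on that basis, to re-examine the over/underurbanization debate. The study starts from the question of whether the urbanization level of 36.01% published in the 2000 census report is a reliable figure compared with previously published urbanization levels. The answer is negative. The study therefore uses the United Nations urban/rural population growth projection method to reconstruct two sets of time-series data on urbanization levels, and uses one of them, the 1982-2000 data, to clarify the nature of the over/underurbanization problem. The study develops two kinds of regression models to clarify the relationship between per capita national income and urbanization. Based on World Bank data, it estimates regression equations for the relationship between economic development and urbanization levels worldwide, and confirms that the logarithmic equation has higher predictive power than the linear equation. According to the estimates from the logarithmic equation, China was overurbanized before the 1978 reform and opening-up policy, whereas in recent years underurbanization caused by lagging urbanization has become a statistically significant phenomenon. Noting that the urbanization lag appeared only some 15 years after China introduced the market economy in 1978, the analysis emphasizes that China's various urban policies acted as powerful regulatory obstacles to urban development.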

A Comparative Study of Statistical Packages for Survival Analysis - SAS, SPSS, STATA - (생존분석을 위한 통계패키지의 비교 연구 - SAS, SPSS, STATA -)

  • Jo, Mi-Sun;Kim, Sun-Gwi
    • Proceedings of the Korean Statistical Society Conference / 2003.10a / pp.335-340 / 2003
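  • Survival analysis techniques have recently attracted attention in many fields, and a number of packages for analyzing survival data have been developed and studied. This paper briefly introduces several survival analysis models, surveys the features of SAS, SPSS, and STATA, which are widely used packages for analyzing survival data, and compares their characteristics.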

Analysis of Factors for Korean Women's Cancer Screening through Hadoop-Based Public Medical Information Big Data Analysis (Hadoop기반의 공개의료정보 빅 데이터 분석을 통한 한국여성암 검진 요인분석 서비스)

  • Park, Min-hee;Cho, Young-bok;Kim, So Young;Park, Jong-bae;Park, Jong-hyock
    • Journal of the Korea Institute of Information and Communication Engineering / v.22 no.10 / pp.1277-1286 / 2018
  • In this paper, we provide flexibly scalable computing resources in an Apache Hadoop based cloud environment for the analysis of public medical information big data. The environment can quickly and flexibly extend storage, memory, and other resources as log data accumulates or grows over time. In addition, when real-time analysis of accumulated unstructured log data is required, the system adopts a Hadoop-based analysis module to overcome the processing limits of existing analysis tools. It therefore provides fast and reliable parallel distributed processing of large amounts of log data. Frequency analysis and chi-square tests were performed for the big data analysis. In addition, multivariate logistic regression analysis at a significance level of 0.05 and multivariate logistic regression analysis of the significant variables (p<0.05) were performed. Multivariate logistic regression analysis was performed for each model 3.
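
The chi-square screening step mentioned in this abstract can be illustrated with a small sketch. It computes Pearson's chi-square statistic for a contingency table and, for the 2x2 case only (1 degree of freedom), the upper-tail p-value through the identity P(Z^2 > c) = erfc(sqrt(c / 2)). The table values are hypothetical, and the sketch ignores the Hadoop-based distributed processing that the paper is actually about.

```python
import math

def chi_square_statistic(table):
    """Pearson chi-square statistic for an r x c table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat

def p_value_df1(stat):
    """Upper-tail p-value for a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2.0))

# Hypothetical 2x2 table: rows = factor present / absent, columns = screened / not screened
table = [[20, 10], [10, 20]]
stat = chi_square_statistic(table)
p = p_value_df1(stat)
print(f"chi2={stat:.4f}, p={p:.4f}, significant at 0.05: {p < 0.05}")
```

Variables that pass such a screen at p<0.05 would then be candidates for the multivariate logistic regression step.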

Using Logistic Regression for Determining the Factors Affecting Bidding Success in World Bank's International Consulting Projects in Indonesia (로그 회귀분석을 이용한 해외 엔지니어링 사업의 낙찰 성공 요인 분석 - 세계은행의 인도네시아 사업을 중심으로-)

  • Yu, Youngsu;Shin, Byungjin;Koo, Bonsang;Han, Seungheon
    • Korean Journal of Construction Engineering and Management / v.19 no.1 / pp.80-89 / 2018
  • World Bank projects enable Korean engineering firms to enter new markets and diversify their portfolios. These firms need to understand the critical factors for bidding success in such projects. The World Bank publishes all of its bidding history data as open records in its open database. This provides an opportunity to identify empirically the factors that determine which firms are chosen. This research collected relevant bid data, focusing on Indonesia, and performed a logistic regression with the goal of statistically identifying significant factors that result in bidding success. Results showed that work experience, being included in a consortium, and having a local partner positively affected winning a bid. On the other hand, having a local competitor from the recipient country negatively affects the chances of winning a bid. Accordingly, Korean engineering firms need to increase their work experience in internationally recognized projects and include a local partner as a joint venture partner to increase their chances, while refraining from conventional projects that can be performed by local engineering firms.

A Study on Use Motivation, Consumers' Characteristics, and Viewing Satisfaction of Need Fulfillment Video Contents(Vlog / ASMR / Muk-bang) (욕구 충족 영상 콘텐츠(브이로그 / ASMR / 먹방) 이용 동기, 수용자 특성, 시청 만족도에 관한 연구)

  • Kang, Mee-Jeong;Cho, Chang-Hoan
    • The Journal of the Korea Contents Association / v.20 no.1 / pp.73-98 / 2020
  • This study aims to redefine Vlog, ASMR, and Muk-bang contents as 'Need Fulfillment Video Contents,' which are emerging as major genres among the video contents. And this study explores the relationships between consumers' motives, viewing satisfaction, and consumers' characteristics such as demographic characteristics, big five personality traits, and individualism-collectivism tendencies in terms of uses and gratifications theory. Statistical analysis techniques such as factor analysis and hierarchical regression analysis were used to analyze 441 samples. As a result, age, income level, and collectivism were found to influence consumers' choice of Need Fulfillment Video Content genre. It was also found that the motivation of using Need Fulfillment Video Contents consisted of five factors: self-assessment and improvement, sensory stimulation and relaxation, entertainment, escapism and passing time, and following trends. Also, each usage motive influenced the viewing satisfaction in various ways. Based on the results of the analyses, the study concludes with discussion of the academic significance and practical implications for Need Fulfillment Video Contents industry development.

Bayesian analysis of directional conditionally autoregressive models (방향성 공간적 조건부 자기회귀 모형의 베이즈 분석 방법)

  • Kyung, Minjung
    • Journal of the Korean Data and Information Science Society / v.27 no.5 / pp.1133-1146 / 2016
  • Counts or averages over arbitrary regions are often analyzed using conditionally autoregressive (CAR) models. The spatial neighborhoods within CAR model are generally formed using only the inter-distance or boundaries between the sub-regions. Kyung and Ghosh (2009) proposed a new class of models to accommodate spatial variations that may depend on directions, using different weights given to neighbors in different directions. The proposed model, directional conditionally autoregressive (DCAR) model, generalized the usual CAR model by accounting for spatial anisotropy. Bayesian inference method is discussed based on efficient Markov chain Monte Carlo (MCMC) sampling of the posterior distributions of the parameters. The method is illustrated using a data set of median property prices across Greater Glasgow, Scotland, in 2008.