• Title/Summary/Keyword: 제한이 있는 회귀모형 (regression model with restrictions)

Search Result 58, Processing Time 0.036 seconds

Characteristics of Geometric Conditions Affecting Freeway Traffic Safety at Nighttime, Sunrise, and Sunset (야간 및 일출몰 시간대 교통안전에 영향을 미치는 고속도로 기하구조 특성분석)

  • Hong, Sung-Min;Kim, Joon-Ki;Oh, Cheol
    • Journal of Korean Society of Transportation / v.30 no.4 / pp.95-106 / 2012
  • A driver's ability to perceive changes in freeway alignment and environment is one of the important factors associated with freeway traffic safety. In particular, driver visibility and recognition capability depend strongly on the altitude of the sun at sunset, at sunrise, and at nighttime. The purpose of this study is to identify the geometric conditions that affect crash occurrence at sunset, sunrise, and nighttime. Poisson and negative binomial regressions were adopted to predict freeway crash frequency. Freeway crash data from 2007 to 2010 were used to develop the crash frequency models. A set of variables representing geometric conditions was identified as significantly affecting crash occurrence. The results should be useful in deriving effective countermeasures against traffic crashes that mainly occur at sunset, sunrise, and nighttime on freeways.
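
For reference, the two count models named in this abstract are usually written as below. This is the standard textbook form (log link, NB2 variance), not the fitted model from the paper; the covariates x_{ij}, coefficients \beta_j, and dispersion parameter \alpha are generic symbols assumed here for illustration.

```latex
\begin{aligned}
\text{Poisson:}\quad & P(Y_i = y) = \frac{e^{-\mu_i}\,\mu_i^{y}}{y!}, \qquad \ln \mu_i = \beta_0 + \sum_j \beta_j x_{ij}, \qquad \operatorname{Var}(Y_i) = \mu_i \\
\text{Negative binomial:}\quad & \ln \mu_i = \beta_0 + \sum_j \beta_j x_{ij}, \qquad \operatorname{Var}(Y_i) = \mu_i + \alpha\,\mu_i^{2}
\end{aligned}
```

When \alpha approaches zero the negative binomial variance reduces to the Poisson variance, which is why the two models are commonly compared on overdispersed crash counts.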

Optimization for Elsholtzia ciliata Hylander Extraction using Supercritical Carbon Dioxide (초임계 이산화탄소를 이용한 향유 추출공정의 최적화)

  • Youn Kwang-Sup;Hong Joo-Heon;Kwon Joong-Ho;Choi Yong-Hee
    • Food Science and Preservation / v.13 no.3 / pp.363-368 / 2006
  • This study was performed to develop flavor materials from Elsholtzia ciliata Hylander by analyzing their functionality and aroma profile, and to optimize the supercritical fluid extraction method and its conditions. The qualities of the water extracts, such as total yield, total phenolic compounds, electron donating ability, estragole, and L-carvone, were affected more by extraction pressure than by extraction time. The response variables were more significantly related to pressure than to time, and the lack-of-fit analysis indicated that the established polynomial model was suitable (P>0.05). Under the experimental conditions of the central composite design, the optimum extraction conditions, constrained to the maximum values of the dependent variables, were 238 bar and 42 min.

A Study on Estimating Route Travel Time Using Collected Data of Bus Information System (버스정보시스템(BIS) 수집자료를 이용한 경로통행시간 추정)

  • Lee, Young Woo
    • KSCE Journal of Civil and Environmental Engineering Research / v.33 no.3 / pp.1115-1122 / 2013
  • Demand for traffic information has been increasing recently, and travel time may be one of the most important kinds of traffic information. Estimating travel time accurately requires highly reliable traffic data collection. BIS (Bus Information System) data would be useful for estimating route travel time because BIS collects bus travel times on the main roads of the city in real time. Traditionally, the use of BIS data has been limited to bus operations and has not extended to other traffic applications. Therefore, this study estimates route travel time on urban road networks from real-time BIS data and then constructs regression models. These models use an explanatory variable that corresponds to bus travel time excluding service time at bus stops. The results show that the coefficient of determination of the constructed regression model is greater than 0.950. A t-test comparing the collected data with the values estimated by the model indicates that the model is statistically significant at the 95% confidence level. Overall, the results suggest that accurate real-time travel time estimation is plausible when BIS data are used.

Analysis of the Relationships among Energy, Economic Growth and Greenhouse Gas Emissions Using Metropolitan City/Province Level Data (광역시·도별 자료를 이용한 에너지, 경제성장, 온실가스 배출 간의 관계 분석)

  • Lee, Jaeseok;Lee, Keun-Dae;Yu, Bok-Keun
    • Environmental and Resource Economics Review / v.30 no.3 / pp.503-533 / 2021
  • This paper analyzes the relationships among energy consumption, renewable energy production, real gross regional domestic product (GRDP), and greenhouse gas (GHG) emissions. It uses metropolitan city and province level data for Korea from 2010 to 2018 and employs a panel vector autoregressive (VAR) model. We find that an increase in energy consumption has a limited impact on boosting renewable energy production or gross regional domestic product, while it leads to an increase in greenhouse gas emissions. A rise in renewable energy production can increase gross regional domestic product, but it has no meaningful effect on energy consumption or on the reduction of greenhouse gas emissions. Our findings indicate that it is crucial to expand the supply of renewable energy as well as to decrease energy consumption in order to reduce greenhouse gas emissions while achieving economic growth.

Study on Predicting the Designation of Administrative Issue in the KOSDAQ Market Based on Machine Learning Based on Financial Data (머신러닝 기반 KOSDAQ 시장의 관리종목 지정 예측 연구: 재무적 데이터를 중심으로)

  • Yoon, Yanghyun;Kim, Taekyung;Kim, Suyeong
    • Asia-Pacific Journal of Business Venturing and Entrepreneurship / v.17 no.1 / pp.229-249 / 2022
  • This paper investigates machine learning models for predicting the designation of administrative issues in the KOSDAQ market using various techniques. When a company in the Korean stock market is designated as an administrative issue, the market treats the event itself as negative information, causing losses to the company and its investors. The purpose of this study is to evaluate alternative methods for developing an artificial intelligence service that detects the possibility of designation as an administrative issue early, based on companies' financial ratios, and helps investors manage portfolio risk. In this study, 21 financial ratios representing profitability, stability, activity, and growth are used as independent variables. Financial data are sampled for administrative-issue and non-administrative-issue companies from 2011 to 2020, the period in which K-IFRS was applied. Logistic regression, decision tree, support vector machine, random forest, and LightGBM are used to predict the designation of administrative issues. According to the results, LightGBM is the best prediction model with 82.73% classification accuracy, and the decision tree has the lowest classification accuracy at 71.94%. Examining the top three variables by importance in the decision-tree-based learning models shows that the financial variables common to each model are ROE(Net profit) and Capital stock turnover ratio, which are relatively important variables in the designation of administrative issues. In general, the ensemble learning models showed higher predictive performance than the single learning models.

Growth of Civic Organizations in South Korea (한국 시민단체의 성장에 대한 양적 연구)

  • Shin, Dong-Joon;Kim, Kwang-Soo;Kim, Jae-On
    • Survey Research / v.6 no.2 / pp.75-101 / 2005
  • This study introduces and analyzes data from the Directory of Korean NGOs, which was published in 1997 and again in 200, to conduct quantitative research on the growth of civic organizations in South Korea. This paper focuses on membership size and founding year, which are essential indicators of organizational growth. Missing rates on those two indicators are checked to evaluate the quality of the data. We examine the changes in membership size between the two time periods, 1996 and 1999. The results show a considerable decrease in membership size for civic and advocacy organizations that are oriented to national issues. This suggests competition among organizations over limited resources, which is consistent with the assumption of a non-linear growth pattern in the ecological theory of organizations. Using founding year data from 1945 to 1996, we estimate pseudo growth curves of civic organizations based on the logistic growth curve model and discuss the different growth patterns of organizations across areas of activity.
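
The logistic growth curve mentioned at the end of this abstract can be sketched as follows. The functional form N(t) = K / (1 + exp(-r (t - t0))) is the standard three-parameter logistic. The parameter values below are hypothetical and are not estimates from the study.

```python
import math


def logistic_growth(year: float, capacity: float, rate: float, midpoint: float) -> float:
    """Cumulative number of organizations under N(t) = K / (1 + exp(-r (t - t0)))."""
    return capacity / (1.0 + math.exp(-rate * (year - midpoint)))


# Hypothetical parameters: carrying capacity K, growth rate r, inflection year t0
K, r, t0 = 1000.0, 0.15, 1985.0

# Evaluate the curve over the founding-year window named in the abstract
curve = {year: logistic_growth(year, K, r, t0) for year in range(1945, 1997)}

for year in (1945, 1965, 1985, 1996):
    print(year, round(curve[year], 1))
```

The curve rises slowly, accelerates until the inflection year, and then flattens toward the capacity K, which is the non-linear pattern that a resource-competition account would predict.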


The Analysis of water quality using Satellite Remotely Sensed Imagery (위성사진을 이용한 해양환경분석)

  • Shin, Bum-Shick;Kim, Kyu-Han;Pyun, Chong-Kun
    • Proceedings of the Korea Water Resources Association Conference / 2006.05a / pp.1940-1944 / 2006
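  • Surveying wide areas continuously, accurately, and precisely through field observation, and then carrying out comprehensive analysis, prediction, and decision-making, involves many economic and time-related difficulties because of the complex characteristics of the ocean and the practical problems of survey work. In contrast, marine environment assessment using satellite remote sensing and GIS builds a database from large volumes of data beyond the limited data obtainable from field observation, analyzes it qualitatively and quantitatively, and visualizes it at the same time. It can therefore be regarded as an advanced environmental survey and assessment technique that allows the environmental changes inevitably caused by marine development to be analyzed more accurately and objectively and to be predicted over the long term. In this study, water temperature and chlorophyll were extracted from high-resolution Landsat TM imagery and NOAA AVHRR data, and the other water quality factors were derived by preparing spatial distribution maps in GIS from field observation data and digital nautical charts. For the satellite image analysis, a Landsat TM image taken at the same time as the field survey was acquired. Because satellite images are geometrically distorted relative to the actual terrain, owing to the curvature and rotation of the Earth, the attitude, altitude, and velocity of the satellite, and the geometric characteristics of the sensor, ground control points (GCPs) were extracted from topographic maps and geometric correction to the UTM coordinate system was performed in ERDAS Imagine. NOAA AVHRR imagery from the same period was processed to extract water temperature data. To map surface water temperature and field-observed chlorophyll digitally, a regression analysis was performed between the digital number (DN) of TM band 6, the thermal infrared band, and water temperature data at the same locations. This yielded a water temperature extraction algorithm and verified the reliability of the analyzed data. Water temperature, chlorophyll, transparency, and other factors were then analyzed spatially using the satellite remote sensing data and GIS, and spatial distribution maps were prepared to characterize the marine environment of the study area. The results show that, when the analyzed satellite data are not verified by field surveys, surface water temperature extracted through image analysis can be underestimated relative to measured values because the calculated values reflect errors caused by atmospheric water vapor and aerosols, so such verification is essential. Satellite image analysis, which saves substantial cost and time compared with field observation, is judged to be capable of characterizing marine water quality. By building diverse and complex data into a GIS database, visualizing it, and performing spatial analysis on that basis, the spatial distribution of each environmental factor can be identified, and the results are expected to serve as basic data for assessing and predicting various environmental impacts through numerical model experiments.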


Development of optimization algorithm to set transition point for multi-segmented rating curve (구간 분할된 레이팅 커브의 천이점 선정을 위한 최적화 알고리즘 개발)

  • Kim, Yeonsu;Noh, Joonwoo;Kim, Sunghoon;Yu, Wansik
    • Proceedings of the Korea Water Resources Association Conference / 2018.05a / pp.421-421 / 2018
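  • Various projects, such as the national watershed survey and the long-term comprehensive water resources plan, are carried out for efficient water resources management, and runoff analysis is an essential part of them. Hydrological models or discharge data from gauging stations are used for runoff analysis, but these discharge data are reproduced from a stage-discharge relationship curve (rating curve) built on previously observed discharge data. That is, stage is measured at the stations every hour, whereas discharge is difficult to measure and has high variability and uncertainty, so the stage time series is converted to discharge through the curve equation. Although the accuracy of the stage-discharge relationship curve is thus a key factor in producing hydrological data, research on it is limited. In particular, studies on improving the accuracy of the curve equation when joining, at transition points, the segments divided to account for effects such as floodplains are rare. This study therefore used Particle Swarm Optimization (PSO) to select the optimal transition points of a segmented curve and examined how the results change when RMSE, RSR, and the coefficient of determination are applied as the objective function for each segment, for up to five segments. Using RMSE, which measures absolute error over a segment, was found to increase the error in the low-stage range, whereas using the relative error measures, RSR and the coefficient of determination, was found to compensate for error across the whole range. For curve equations derived with PSO, uncertainty can be reviewed using the errors (RMSE, coefficient of determination, RSR, MAPE) for each segment and for the whole range, and a tool was developed to check for outliers through residual analysis and to check the normality of the regression curve. Automating transition point selection with an optimization algorithm when segmenting a rating curve can greatly reduce the time the selection requires, and it establishes a basis for deriving high-quality rating curves that take the segment errors into account comprehensively.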
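
The error measures that serve as objective functions in this abstract can be sketched as follows. Several things here are assumptions for illustration: the power-law segment form Q = a (h - h0)^b, the segment parameters, the sample stage and discharge values, and the function names. RSR is taken in its usual sense, RMSE divided by the standard deviation of the observations. The PSO search itself is not shown; it would vary the transition stages and score each candidate with one of these functions.

```python
import math


def rating_discharge(stage, segments):
    """Discharge from a piecewise power-law rating curve (assumed form).

    segments: list of (upper_stage, a, h0, b) sorted by upper_stage,
    where each segment gives Q = a * (stage - h0) ** b.
    """
    for upper, a, h0, b in segments:
        if stage <= upper:
            return a * max(stage - h0, 0.0) ** b
    # Above the last transition point: extrapolate with the last segment
    _, a, h0, b = segments[-1]
    return a * max(stage - h0, 0.0) ** b


def rmse(obs, sim):
    """Root mean square error (absolute error measure)."""
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(obs, sim)) / len(obs))


def rsr(obs, sim):
    """RMSE divided by the standard deviation of the observations."""
    mean_obs = sum(obs) / len(obs)
    sse = sum((o - s) ** 2 for o, s in zip(obs, sim))
    sst = sum((o - mean_obs) ** 2 for o in obs)
    return math.sqrt(sse / sst)


# Hypothetical two-segment curve with a transition point at stage 1.5 m
segments = [(1.5, 10.0, 0.2, 1.8), (math.inf, 14.0, 0.4, 2.0)]
stages = [0.5, 1.0, 1.4, 2.0, 3.0]            # hypothetical stages (m)
observed = [1.2, 6.5, 14.0, 36.0, 95.0]       # hypothetical discharges (m^3/s)
simulated = [rating_discharge(h, segments) for h in stages]

print(f"RMSE={rmse(observed, simulated):.3f}, RSR={rsr(observed, simulated):.4f}")
```

Because RMSE is dominated by the large discharges at high stage, a relative measure such as RSR evaluated per segment weights the low-stage range more evenly, which matches the behavior reported above.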


A Regression-Model-based Method for Combining Interestingness Measures of Association Rule Mining (연관상품 추천을 위한 회귀분석모형 기반 연관 규칙 척도 결합기법)

  • Lee, Dongwon
    • Journal of Intelligence and Information Systems / v.23 no.1 / pp.127-141 / 2017
  • Advances in Internet technologies and the proliferation of mobile devices have given consumers access to a wide range of goods and services, with the adverse effect that consumers have a hard time finding items that suit them even when they devote much time to searching. Accordingly, businesses use recommender systems to help consumers find the desired items more easily. Association Rule Mining (ARM) is advantageous for recommender systems in that it provides an intuitive rule form with interestingness measures (support, confidence, and lift) describing the relationship between items. Given an item, its relevant items can be distinguished with the help of the measures, which show the strength of the relationship between items. Based on that strength, the most pertinent items can be chosen from among the other items and displayed on the given item's web page. However, the diversity of the measures can make it unclear which items are more recommendable. Given two rules, for example, one rule's support and confidence may not both be superior to the other rule's. Such discrepancies among the measures in ranking one rule above another can make it difficult to select proper items for recommendation. In addition, in an online environment where a web page or mobile screen can show only a limited number of recommendations that attract consumer interest, the prudent selection of items for the recommendation list is very important. Showing items of little interest may lead consumers to ignore the recommendations. Such consumers may then stop paying attention to other forms of marketing activity. Therefore, the measures should be aligned with the probability that a consumer accepts a recommendation. For this reason, this study proposes a model-based approach that combines those measures into one unified measure that can consistently determine the ranking of recommended items.
A regression model was designed to describe how well the measures (independent variables; i.e., support, confidence, and lift) explain consumers' acceptance of recommendations (dependent variable; i.e., the hit rate of recommended items). The model is intuitive to understand and easy to use in that the equation consists of the commonly used ARM measures and can be used to estimate hit rates. An experiment using transaction data from one of Korea's largest online shopping malls was conducted to show that the proposed model can improve the hit rates of recommendations. From the top of the list down to 13th place, recommended items ranked higher by the proposed model show higher hit rates than those from the competing model. The result shows that the proposed model outperforms the competing model in an online recommendation environment. On a web page, consumers are shown around ten recommendations, a range in which the proposed model outperforms. Moreover, a mobile device cannot display many items simultaneously because of its limited screen size. Therefore, the result shows that the newly devised recommendation technique is suitable for mobile recommender systems. While this study covers cross-selling in online shopping malls that handle merchandise, the proposed method can be expected to apply in various situations in which association rules apply. For example, the model could be applied to medical diagnostic systems that predict candidate diseases from a patient's symptoms. To increase the efficiency of the model, additional variables will need to be considered in future studies. For example, price is a good candidate for an explanatory variable because it has a major impact on consumer purchase decisions. If the prices of recommended items are much higher than those of the items in which a consumer is interested, the consumer may hesitate to accept the recommendations.
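
The three interestingness measures named in this abstract, and the idea of combining them into one score, can be sketched as follows. The measure definitions are the standard ones. The linear form of `combined_score`, its coefficients, and the sample transactions are assumptions for illustration; the abstract does not give the paper's fitted equation.

```python
def rule_measures(transactions, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent."""
    n = len(transactions)
    n_a = sum(1 for t in transactions if antecedent in t)
    n_c = sum(1 for t in transactions if consequent in t)
    n_both = sum(1 for t in transactions if antecedent in t and consequent in t)
    support = n_both / n                               # P(A and C)
    confidence = n_both / n_a if n_a else 0.0          # P(C | A)
    lift = confidence / (n_c / n) if n_c else 0.0      # P(C | A) / P(C)
    return support, confidence, lift


def combined_score(support, confidence, lift, coef):
    """Estimated hit rate as a linear combination (hypothetical coefficients)."""
    b0, b1, b2, b3 = coef
    return b0 + b1 * support + b2 * confidence + b3 * lift


# Hypothetical baskets of purchased items
transactions = [{"A", "B"}, {"A", "B"}, {"A"}, {"B"}, {"C"}]
s, c, l = rule_measures(transactions, "A", "B")

# Hypothetical coefficients standing in for a fitted regression
score = combined_score(s, c, l, coef=(0.0, 1.0, 1.0, 1.0))
print(f"support={s:.3f}, confidence={c:.3f}, lift={l:.3f}, score={score:.3f}")
```

Ranking candidate items by a single score of this kind removes the ambiguity that arises when one rule has higher support and another has higher confidence.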

The effect for exercise intensity on hypertension using propensity score (성향점수를 이용한 운동강도가 고혈압에 미치는 영향)

  • Hwang, Jinseub;Pi, Seonmi;Choi, Woochul;Kim, Jongtae
    • Journal of the Korean Data and Information Science Society / v.28 no.1 / pp.109-117 / 2017
  • This study aims to identify the effect of exercise intensity on hypertension using propensity scores, based on data from the sixth Korea National Health and Nutrition Examination Survey, and to provide evidence on the most effective exercise intensity for preventing or treating hypertension. Specifically, we select 3,486 subjects aged between 18 and 65 years after excluding subjects who are expected to have limited athletic ability. We estimate propensity scores for exercise intensity based on confounders such as sex, age, smoking, drinking, and sodium intake. Taking the complex survey design into account, we conduct a descriptive analysis and a multiple logistic regression for hypertension with the propensity score as a covariate. Although the results did not show a statistically significant relationship between exercise intensity and hypertension, we expect that they can serve as baseline evidence that appropriate exercise of moderate intensity may be more effective for preventing and treating hypertension than high-intensity exercise or no exercise.