• Title/Summary/Keyword: imputation

Forecasting the Demand Areas of a Factory Site: Based on a Statistical Model and Sampling Survey (공장용지 수요 추정 모형 개발 및 수요예측)

  • Jeong, Hyeong-Chul;Han, Geun-Shik;Kim, Seong-Yong
    • The Korean Journal of Applied Statistics
    • /
    • v.24 no.3
    • /
    • pp.465-475
    • /
    • 2011
  • In this paper, we consider the problem of estimating the gross area of factory sites in relation to the area of industrial complex land, based on a statistical forecasting model and the results of a sampling survey. For the gross area of factory sites, data are available only for 1981-2003. In 2009, the Korea Industrial Complex Corp. conducted a sampling survey to estimate the total area and to investigate demand for the next five years. In this study, we adopt the sampling survey results and build a statistical growth model for the gross area of factory sites to improve the prediction of factory site areas. Three sources of data are used as the basis for analysis: the factory site areas reported by the Korea National Statistical Office, imputation results from the statistical forecasting model, and the sampling survey results. Combining the three sources through the spline smoothing method yields a new forecast of the factory site area.

A Study on the Optimal Cut-off Point in the Cut-off Sampling Method (절사표본에서 최적 절사점에 관한 연구)

  • Lee, Sang Eun;Cho, Min Ji;Shin, Key-Il
    • The Korean Journal of Applied Statistics
    • /
    • v.27 no.3
    • /
    • pp.501-512
    • /
    • 2014
  • Modified cut-off sampling is widely used for highly skewed data. A serious drawback of modified cut-off sampling is the difficulty of adjusting for non-response in the take-all stratum. Solutions to non-response in the take-all stratum have therefore been studied in various ways, such as sample substitution, imputation, and re-weighting. In this paper, we suggest a new cut-off point, based on minimizing the MSE with exponential and power functions, that can reduce the size of the take-all stratum. We also investigate another cut-off point determination method that assumes underlying distributions such as the truncated log-normal and truncated gamma distributions. Finally, we suggest the optimal cut-off point, which gives the smallest take-all stratum among the suggested methods. Simulation studies are performed, and Labor Survey data and simulated data are used for the case study.
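
To make the role of the cut-off point concrete, the following is a minimal sketch of a take-all/take-some estimator of a population total. It is not the method of the paper: the `frame` of `(size, y)` pairs, the function name `cutoff_total`, and the simple random sample below the cut-off are all assumptions made for illustration, and it assumes that "modified" means units below the cut-off are sampled rather than discarded.

```python
import random

def cutoff_total(frame, cutoff, n_sample, rng):
    """Estimate the total of y from (size, y) pairs using a take-all stratum."""
    # Units at or above the cut-off are all surveyed (take-all stratum).
    take_all = [y for size, y in frame if size >= cutoff]
    # Units below the cut-off are sampled (take-some stratum).
    take_some = [y for size, y in frame if size < cutoff]
    if not 0 < n_sample <= len(take_some):
        raise ValueError("n_sample must be between 1 and the take-some stratum size")
    sample = rng.sample(take_some, n_sample)
    # Each sampled unit represents len(take_some) / n_sample units.
    weight = len(take_some) / n_sample
    return sum(take_all) + weight * sum(sample)

# Hypothetical skewed frame: two very large units dominate the total.
frame = [(3, 12.0), (5, 20.0), (4, 15.0), (6, 22.0), (250, 900.0), (400, 1500.0)]
estimate = cutoff_total(frame, cutoff=100, n_sample=2, rng=random.Random(0))
print(estimate)
```

Lowering the cut-off enlarges the take-all stratum, which removes sampling error for more units but makes every non-response there harder to compensate for; that trade-off is what a cut-off rule has to balance.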

A Study for Traffic Forecasting Using Traffic Statistic Information (교통 통계 정보를 이용한 속도 패턴 예측에 관한 연구)

  • Choi, Bo-Seung;Kang, Hyun-Cheol;Lee, Seong-Keon;Han, Sang-Tae
    • The Korean Journal of Applied Statistics
    • /
    • v.22 no.6
    • /
    • pp.1177-1190
    • /
    • 2009
  • Traffic operating speed is an important piece of information for measuring road capacity. When a navigation system provides information on heavily trafficked roads, offering both current traffic information and forecasts of future conditions is a key function for giving more accurate expected travel times and intervals. In this study, we propose a traffic speed forecasting model based on accumulated traffic speed data for roads and highways, and forecast the average speed for each road and highway interval and each time interval using Fourier transformation and a time series regression model with trigonometric functions. We also propose a suitable method for missing data imputation and outlier treatment to raise the accuracy of traffic speed forecasting, and a speed grouping method that groups data with similar traffic speed patterns to increase the efficiency of the analysis.
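
As an illustration of regression on trigonometric functions, the sketch below fits a mean plus a few harmonics to one full period of evenly spaced speeds. The function names, the 24-hour period, and the synthetic data are assumptions, not taken from the paper. For complete, evenly spaced data covering one period, these discrete Fourier coefficients coincide with the least-squares estimates.

```python
import math

def fit_harmonics(y, n_harmonics):
    """Fit mean + harmonics to one full period of evenly spaced observations."""
    n = len(y)
    if not 0 < 2 * n_harmonics < n:
        raise ValueError("need 0 < 2 * n_harmonics < len(y)")
    mean = sum(y) / n
    coefs = []
    for k in range(1, n_harmonics + 1):
        # Cosine and sine coefficients of the k-th harmonic.
        a = 2 / n * sum(v * math.cos(2 * math.pi * k * t / n) for t, v in enumerate(y))
        b = 2 / n * sum(v * math.sin(2 * math.pi * k * t / n) for t, v in enumerate(y))
        coefs.append((a, b))
    return mean, coefs

def predict(mean, coefs, t, period):
    """Evaluate the fitted harmonic curve at time t."""
    return mean + sum(
        a * math.cos(2 * math.pi * k * t / period)
        + b * math.sin(2 * math.pi * k * t / period)
        for k, (a, b) in enumerate(coefs, start=1)
    )

# Synthetic hourly average speeds (km/h) over one day, for illustration only.
speeds = [60 + 10 * math.cos(2 * math.pi * t / 24) - 5 * math.sin(4 * math.pi * t / 24)
          for t in range(24)]
mean, coefs = fit_harmonics(speeds, n_harmonics=2)
print(round(predict(mean, coefs, 8, 24), 3), round(speeds[8], 3))
```

The closed-form coefficients rely on every time slot being observed, which is one reason missing values and outliers need to be treated before a fit of this kind.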

Survival Prognostic Factors of Male Breast Cancer in Southern Iran: a LASSO-Cox Regression Approach

  • Shahraki, Hadi Raeisi;Salehi, Alireza;Zare, Najaf
    • Asian Pacific Journal of Cancer Prevention
    • /
    • v.16 no.15
    • /
    • pp.6773-6777
    • /
    • 2015
  • We used the LASSO-Cox method to determine prognostic factors of male breast cancer survival and showed its superiority over the Cox proportional hazard model in a low sample size setting. The LASSO-Cox method was used to identify the most important factors affecting the survival duration of male breast cancer and to estimate their relative hazards precisely. Our data include information on male breast cancer patients in Fars province, southern Iran, from 1989 to 2008. Cox proportional hazard and LASSO-Cox models were fitted for 20 classified variables. To reduce the impact of missing data, multiple imputation was performed 20 times using the Markov chain Monte Carlo method, and the results were combined with Rubin's rules. Among the 50 patients, the mean age at diagnosis was 59.6 (SD=12.8) years, with a minimum of 34 and a maximum of 84 years, and the mean survival time was 62 months. Three-, 5- and 10-year survival rates were 92%, 77% and 26%, respectively. Using the LASSO-Cox method eliminated 8 low-effect variables and decreased the standard errors by a factor of 2.5 to 7. The relative efficiency of the LASSO-Cox method compared with the Cox proportional hazard method was calculated as 22.39. The 19-year follow-up of male breast cancer patients shows that age, a history of alcohol use, nipple discharge, laterality, histological grade and duration of symptoms were the most important variables affecting patient survival. In such situations, estimating the coefficients by the LASSO-Cox method is more efficient than by the Cox proportional hazard method.
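
The pooling step mentioned above can be sketched as follows. This is a minimal illustration of Rubin's rules for a single scalar coefficient; the function name `pool_rubin` and the numbers are hypothetical and are not results from the paper.

```python
from statistics import mean, variance

def pool_rubin(estimates, variances):
    """Pool m completed-data estimates and their squared standard errors."""
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations with matching variances")
    q_bar = mean(estimates)        # pooled point estimate
    within = mean(variances)       # average within-imputation variance
    between = variance(estimates)  # between-imputation variance (divisor m - 1)
    total = within + (1 + 1 / m) * between
    return q_bar, total

# Hypothetical log hazard ratios and squared standard errors from 4 imputed datasets.
estimate, total_var = pool_rubin([0.50, 0.60, 0.55, 0.65], [0.04, 0.05, 0.04, 0.05])
print(round(estimate, 4), round(total_var ** 0.5, 4))
```

The between-imputation term is what carries the extra uncertainty due to the missing data, so the pooled standard error is never smaller than the one implied by the average within-imputation variance alone.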

Retrospective Air Quality Simulations of the TexAQS-II: Focused on Emissions Uncertainty

  • Lee, DaeGyun;Kim, Soontae;Kim, Hyuncheol;Ngan, Fong
    • Asian Journal of Atmospheric Environment
    • /
    • v.8 no.4
    • /
    • pp.212-224
    • /
    • 2014
  • Several studies have examined the effects of emissions of highly reactive volatile organic compounds (HRVOC) from industrial sources in the Houston-Galveston-Brazoria (HGB) area on the high ozone events during the Texas Air Quality Study (TexAQS) in the summer of 2000. They showed that the modeled atmosphere lacked the reactivity to produce the observed high ozone events and suggested "imputation" of HRVOC emissions from the base inventory. Byun et al. (2007b) showed that the imputed inventory leads to ethylene concentrations that are too high compared to the measurements at the chemical super sites but still too low aloft compared to the NOAA aircraft measurements. That paper suggested that the lack of reactivity in the modeled Houston atmosphere must be corrected by targeted, and sometimes episodic, increases of HRVOC emissions from large sources such as flares in the Houston Ship Channel (HSC), distributed into the deeper levels of the boundary layer. We performed retrospective meteorological and air quality modeling to achieve better prediction of ozone, evaluated against various chemical and meteorological measurements during the Texas Air Quality Study period of August-September 2006 (TexAQS-II). After identifying several shortcomings of the forecast meteorological simulations and emissions inputs, we prepared new retrospective meteorological simulations and updated emissions inputs. We utilized assimilated MM5 inputs to achieve better meteorological simulations (a detailed description of the MM5 assimilation can be found in Ngan et al., 2012) and used them in this study for air quality simulations. Using the better predicted meteorological results, we focused on emissions uncertainty in order to capture the high ozone peaks that occasionally occur in the HGB area. We describe how the ozone predictions are affected by emissions uncertainty in air quality simulations that use different emission inventories and adjustments.

A Study on the Legal Character of Contractual Liability in Freight Agency under Chinese Contract Law (중국계약법상 화물운송대리에서의 계약책임과 귀책원칙)

  • KIM, Young-Ju
    • THE INTERNATIONAL COMMERCE & LAW REVIEW
    • /
    • v.66
    • /
    • pp.119-148
    • /
    • 2015
  • Generally, liability for breach is defined as the civil liability that arises from conduct that violates a contract. Two notable principles governing liability for breach have had a fundamental impact on the remedies in the unified Contract Law of the People's Republic of China (hereinafter Chinese Contract Law). In China, during the drafting of the Contract Law, there was a great debate as to whether damages for breach of contract ought to follow the fault principle or the strict liability principle. Ultimately, the Chinese Contract Law follows the model of the CISG on this point: it adopts the strict liability principle (article 107), with force majeure as a ground for exemption. Under Chinese Contract Law, liability is interpreted as strict liability in principle. Strict liability is a notion introduced into Chinese Contract Law from Anglo-Saxon law. In contrast to the fault principle, the strict liability or no-fault doctrine allows a party to claim damages if the other party fails to fulfill its contractual obligations, regardless of the fault of the failing party. Pursuant to the strict liability doctrine, if the performance of a contract is due, any non-performance constitutes a breach, and the fault of the party in breach is irrelevant. This paper reviews problems concerning the legal character or legal ground of contractual liability in Chinese contract law. Specifically, focusing on the interpretation of sections of Chinese contract law and an analysis of three cases related to contractual liability in freight agency, the paper proposes some implications of the structural features of Chinese contract law for international commercial transactions.

Variational Mode Decomposition with Missing Data (결측치가 있는 자료에서의 변동모드분해법)

  • Choi, Guebin;Oh, Hee-Seok;Lee, Youngjo;Kim, Donghoh;Yu, Kyungsang
    • The Korean Journal of Applied Statistics
    • /
    • v.28 no.2
    • /
    • pp.159-174
    • /
    • 2015
  • Dragomiretskiy and Zosso (2014) developed a new decomposition method, termed variational mode decomposition (VMD), which is efficient for tone detection and separation of signals. However, VMD may be inefficient in the presence of missing data since it is based on a fast Fourier transform (FFT) algorithm. To overcome this problem, we propose a new approach based on a novel combination of VMD and the hierarchical (or h-) likelihood method. The h-likelihood provides an effective imputation methodology for missing data when VMD decomposes the signal into several meaningful modes. A simulation study and a real data analysis demonstrate that the proposed method can produce substantially effective results.

An Evaluation System For Freeway Traffic Data Processing Techniques (고속도로 교통자료 처리기법 통합평가 시스템 개발)

  • Oh, Dong-Wook;Oh, Cheol;NamKoong, Sung;Jeon, Se-Kil
    • The Journal of The Korea Institute of Intelligent Transport Systems
    • /
    • v.7 no.4
    • /
    • pp.13-24
    • /
    • 2008
  • Real-time traffic data are readily obtainable from the traffic surveillance systems of intelligent transportation systems (ITS). Such data greatly support further applications in the fields of traffic operations, planning, and safety. However, traffic data should be appropriately processed to fully exploit the benefits of this data collection capability. Rather than developing individual data processing techniques, which is the major concern of existing studies, this study proposes a novel methodology for evaluating data processing techniques in an integrated manner. A tool for implementing the proposed methodology is also developed. With the evaluation tool developed in this study, users can extract useful and more reliable traffic data according to their ultimate purpose of data usage. As an example, actual freeway traffic data are fed into the evaluation tool, and the results are discussed.

A Study on the Imposition of Sanctions on Illegal Use of Government R&D Expenses (정부연구개발비 유용행위 시 제재부가금에 관한 연구)

  • Noh, Sang-Kyun;An, Eun-Sook;Hyun, Byung-Hwan
    • Journal of the Korea Academia-Industrial cooperation Society
    • /
    • v.19 no.12
    • /
    • pp.854-862
    • /
    • 2018
  • The government R&D budget for 2019 exceeded 20 trillion won, aimed at developing future growth markets through basic research investment and the creation of growth engines. As the importance of R&D investment increases, various schemes for ensuring efficient and transparent execution of project expenses are being expanded. However, cases of misused research expenses (charge ratio), such as fraudulent execution of funds, continue to arise, and a system of sanction surcharges is being introduced. In this paper, the legal grounds of the sanctions, a comparative review of laws and regulations between the ministries, and the criteria of imposition (imputation) were analyzed. In addition, since the amendment of the standard for imposing the intergovernmental surcharges, a single standard has been applied, and the transition process of the surcharging system has been reviewed. The data analysis showed a focus on micro-utility activities, and new policy measures corresponding to them are suggested.

Store Sales Prediction Using Gradient Boosting Model (그래디언트 부스팅 모델을 활용한 상점 매출 예측)

  • Choi, Jaeyoung;Yang, Heeyoon;Oh, Hayoung
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.25 no.2
    • /
    • pp.171-177
    • /
    • 2021
  • With the rapid development of machine learning, its use has spread not only across industrial fields but also into daily life. Applications of machine learning to financial data have also attracted interest. Herein, we apply machine learning algorithms to store sales data and present future applications for fintech enterprises. We use several missing data processing methods and apply gradient boosting algorithms (XGBoost, LightGBM, CatBoost) to predict the future revenue of individual stores. We found that median imputation of missing data combined with the XGBoost algorithm gives the best accuracy. By employing the proposed method, both fintech enterprises and their customers can benefit. Stores can benefit by receiving financial assistance beforehand from fintech companies, while these companies can benefit by offering financial support to these stores with low risk.
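
For readers unfamiliar with the imputation step, the following is a minimal sketch of median imputation on a single feature. The function name `impute_median`, the use of `None` to mark missing values, and the data are assumptions for illustration; the gradient boosting step is omitted, and this is not the authors' pipeline.

```python
from statistics import median

def impute_median(values):
    """Replace None entries with the median of the observed entries."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("no observed values to compute a median from")
    fill = median(observed)
    return [fill if v is None else v for v in values]

# Hypothetical monthly revenue for one store, with two missing months.
monthly_sales = [120.0, None, 95.0, 300.0, None, 110.0]
print(impute_median(monthly_sales))
```

The median is less sensitive than the mean to a single unusually large month such as the 300.0 above. When the imputed feature feeds a predictive model, the fill value would normally be computed on the training data only and reused for the test data.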