• Title/Abstract/Keywords: over-fitting

347 search results, processing time 0.028 s

Application of the Gradient-Based 3D Patch Extraction Method to Terrain and Man-made Objects for Construction of 3D CyberCity

  • 서수영
    • 한국측량학회:학술대회논문집
    • /
    • 한국측량학회 2010년 춘계학술발표회 논문집
    • /
    • pp.227-229
    • /
    • 2010
  • This study applies a gradient-based 3D patch extraction method to obtain 3D planar patches over terrain and man-made objects from lidar data. The method consists of a sequence of processes: segmentation by slope, initiation of triggering patches by mode selection, and expansion of the triggering patches. Because urban areas contain many planar regions on the terrain surface, the method was tested on extracting 3D planar patches not only from non-terrain objects but also from the terrain. The experimental results show that the method extracts 3D planar patches efficiently.

Stochastic procedures for extreme wave induced responses in flexible ships

  • Jensen, Jorgen Juncher;Andersen, Ingrid Marie Vincent;Seng, Sopheak
    • International Journal of Naval Architecture and Ocean Engineering
    • /
    • Vol. 6, No. 4
    • /
    • pp.1148-1159
    • /
    • 2014
  • Different procedures for estimation of the extreme global wave hydroelastic responses in ships are discussed. Firstly, stochastic procedures for application in detailed numerical studies (CFD) are outlined. The use of the First Order Reliability Method (FORM) to generate critical wave episodes of short duration, less than 1 minute, with prescribed probability content is discussed for use in extreme response predictions including hydroelastic behaviour and slamming load events. The possibility of combining FORM results with Monte Carlo simulations is discussed for faster but still very accurate estimation of extreme responses. Secondly, stochastic procedures using measured time series of responses as input are considered. The Peak-over-Threshold procedure and the Weibull fitting are applied and discussed for the extreme value predictions including possible corrections for clustering effects.

Distributed fiber-optic sensor network for the over temperature protection relay of electric power systems

  • 박형준;이준호;송민호
    • 한국조명전기설비학회:학술대회논문집
    • /
    • 한국조명전기설비학회 2006년도 춘계학술대회 논문집
    • /
    • pp.86-90
    • /
    • 2006
  • We proposed a distributed fiber-optic sensor system with 10 fiber Bragg gratings for an over-temperature protection relay in power systems. We applied a Gaussian line-fitting algorithm to compensate for distortion effects in the wavelength-scanned Fabry-Perot filter demodulation scheme. Compared with the highest-peak-detection method, the proposed algorithm was shown to minimize the random errors of distorted PD profiles. In the experiments, the overall measurement error was within 1 % of the reference thermocouple reading and the linearity error was less than 0.37 %.
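
The contrast between highest-peak detection and Gaussian fitting can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes evenly spaced wavelength samples and uses the common three-point log-parabola interpolation, which is exact for a noise-free Gaussian. The function name, the wavelength grid, and the peak parameters are all hypothetical.

```python
import math

def gaussian_peak_center(xs, ys):
    """Estimate a peak center by fitting a Gaussian through the top three samples.

    Assumes evenly spaced xs and strictly positive ys around the maximum.
    """
    i = max(range(len(ys)), key=ys.__getitem__)    # index of the highest sample
    if i == 0 or i == len(ys) - 1:
        return xs[i]                               # no neighbour on one side
    la, lb, lc = math.log(ys[i - 1]), math.log(ys[i]), math.log(ys[i + 1])
    denom = la - 2.0 * lb + lc                     # curvature of the log-parabola
    if denom == 0.0:
        return xs[i]
    offset = 0.5 * (la - lc) / denom               # vertex offset in sample units
    return xs[i] + offset * (xs[1] - xs[0])

# Hypothetical reflection peak: center 1551.03 nm, width 0.2 nm, 0.1 nm sampling.
xs = [1550.0 + 0.1 * k for k in range(21)]
ys = [math.exp(-((x - 1551.03) ** 2) / (2 * 0.2 ** 2)) for x in xs]

coarse = xs[ys.index(max(ys))]           # highest-peak detection: nearest grid point
estimate = gaussian_peak_center(xs, ys)  # Gaussian fit: sub-sample resolution
```

Highest-peak detection can only return a grid point, whereas the fitted center lands between samples. With measured, noisy profiles a least-squares fit over more than three points would usually be preferred.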

A Model-Fitting Approach of External Force on Electric Pole Using Generalized Additive Model

  • 박철영;신창선;박명혜;이승배;박장우
    • 정보처리학회논문지:컴퓨터 및 통신 시스템
    • /
    • Vol. 6, No. 11
    • /
    • pp.445-452
    • /
    • 2017
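  • An electric pole is a support structure used for power transmission and distribution, and acceleration sensors are used to measure the external force acting on it. Weather phenomena affect the external force on a pole in various ways; a change in the elasticity of overhead wires is one of them. For this reason, modeling the weather factors that act on the pole is very important. Data received from the acceleration sensor are converted into pitch and roll angles. Weather variables are highly correlated with one another, and selecting significant explanatory variables for modeling is a key factor in the over-fitting problem. To build a model with high explanatory power that takes multicollinearity into account, we used the generalized additive model (GAM), a machine learning method. The weather variables used to build the model were temperature, humidity, precipitation, wind speed, wind direction, vapor pressure, atmospheric pressure, dew point temperature, sunshine duration, solar radiation, and cloud cover. A variance inflation factor (VIF) test selected temperature, precipitation, wind speed, wind direction, atmospheric pressure, dew point temperature, sunshine duration, and cloud cover. Among the explanatory variables, sunshine duration, cloud cover, and atmospheric pressure had the greatest influence, and the generalized additive model achieved an average coefficient of determination (R-squared) of 0.69, yielding a significant model. The model is expected to help predict the effects of external force on electric poles and to contribute to ensuring safety.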
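
The variance inflation factor screening mentioned in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: it relies on the standard identity that VIF_j equals the j-th diagonal entry of the inverse correlation matrix of the predictors. The column names and values are made up, and any cutoff (10 is a common rule of thumb) is likewise an assumption.

```python
import math

def correlation_matrix(columns):
    """Pearson correlation matrix of equal-length predictor columns."""
    n = len(columns[0])
    scaled = []
    for col in columns:
        mean = sum(col) / n
        dev = [v - mean for v in col]
        norm = math.sqrt(sum(d * d for d in dev))
        scaled.append([d / norm for d in dev])
    return [[sum(a * b for a, b in zip(ci, cj)) for cj in scaled] for ci in scaled]

def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(matrix)
    aug = [row[:] + [float(i == j) for j in range(n)] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("predictors are exactly collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def variance_inflation_factors(columns):
    """VIF_j is the j-th diagonal entry of the inverse correlation matrix."""
    inv = invert(correlation_matrix(columns))
    return [inv[j][j] for j in range(len(columns))]

# Made-up standardized readings; the third column nearly duplicates the first.
col_a = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
col_b = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
col_c = [1.1, 2.1, 2.9, 3.9, 5.1, 6.1, 6.9, 7.9]
vifs = variance_inflation_factors([col_a, col_b, col_c])
```

In this toy data the first and third columns receive very large VIFs, so one of them would be dropped before fitting, while the second column stays close to 1.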

Modeling Age-specific Cancer Incidences Using Logistic Growth Equations: Implications for Data Collection

  • Shen, Xing-Rong;Feng, Rui;Chai, Jing;Cheng, Jing;Wang, De-Bin
    • Asian Pacific Journal of Cancer Prevention
    • /
    • Vol. 15, No. 22
    • /
    • pp.9731-9737
    • /
    • 2014
  • Large-scale secular registry and surveillance systems have accumulated vast data that allow mathematical modeling of cancer incidence and mortality rates. Most contemporary models of this kind use time series and APC (age-period-cohort) methods and focus primarily on predicting or analyzing cancer epidemiology, with little attention paid to implications for designing cancer registry, surveillance or evaluation initiatives. This research models age-specific cancer incidence rates using logistic growth equations and explores their performance under different scenarios of data completeness, in the hope of deriving clues for reshaping relevant data collection. The study used the China Cancer Registry Report 2012 as the data source. It employed 3-parameter logistic growth equations and modeled the age-specific incidence rates of all cancers and of the top 10 cancers presented in the registry report. The study performed 3 types of modeling: full age-span fitting, multiple 5-year-segment fitting and single-segment fitting. Model performance was measured with an adjusted goodness of fit that combines the sum of squared residuals and relative errors. Both model simulation and performance evaluation used self-developed algorithms programmed in C# with MS Visual Studio 2008. For models built on full age-span data, predicted age-specific cancer incidence rates fitted observed values very well for most cancers (except cervical and breast), with estimated goodness of fit (Rs) over 0.96. For a given cancer, the R value of the logistic growth model derived from urban residents' data was greater than or equal to that of the same model built on rural residents' data. For models based on multiple 5-year-segment data, the Rs remained fairly high (over 0.89) until three-fourths of the data segments were excluded. For models using a fixed-length single segment of observed data, the older the age covered by the segment, the higher the resulting Rs. Logistic growth models describe age-specific incidence rates very well for most cancers and may be used to inform data collection for monitoring and analyzing the cancer epidemic. With the help of appropriate logistic growth equations, the work volume of contemporary data collection, e.g., cancer registry and surveillance systems, may be reduced substantially.
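
A 3-parameter logistic growth fit of the kind described here can be sketched in a few lines. The abstract does not give the authors' parameterization, fitting algorithm, or their adjusted goodness-of-fit formula, so everything below is an assumption: the form y = K / (1 + exp(-r (t - t0))), a search over candidate ceilings K combined with a linear fit on the logit scale, a plain R-squared as the score, and synthetic noise-free rates.

```python
import math

def logistic(t, k, r, t0):
    """Three-parameter logistic growth curve (assumed parameterization)."""
    return k / (1.0 + math.exp(-r * (t - t0)))

def fit_logistic(ts, ys, k_candidates):
    """For each ceiling k, fit r and t0 by least squares on logit(y / k).

    Returns (k, r, t0, r_squared) for the candidate with the smallest
    sum of squared residuals on the original scale. Requires all ys > 0.
    """
    n = len(ts)
    t_mean = sum(ts) / n
    y_mean = sum(ys) / n
    sst = sum((y - y_mean) ** 2 for y in ys)
    best = None
    for k in k_candidates:
        if k <= max(ys):
            continue                       # the logit is undefined unless k > y
        zs = [math.log(y / (k - y)) for y in ys]
        z_mean = sum(zs) / n
        slope = (sum((t - t_mean) * (z - z_mean) for t, z in zip(ts, zs))
                 / sum((t - t_mean) ** 2 for t in ts))
        if slope == 0.0:
            continue
        t0 = t_mean - z_mean / slope       # logit is zero at t = t0
        sse = sum((y - logistic(t, k, slope, t0)) ** 2 for t, y in zip(ts, ys))
        if best is None or sse < best[0]:
            best = (sse, k, slope, t0)
    if best is None:
        raise ValueError("no candidate ceiling exceeds the largest observation")
    sse, k, r, t0 = best
    return k, r, t0, 1.0 - sse / sst

# Synthetic rates per 100,000 at 5-year age-group midpoints (not registry data).
ages = [2.5 + 5.0 * i for i in range(17)]
rates = [logistic(a, 300.0, 0.12, 60.0) for a in ages]
k_hat, r_hat, t0_hat, r2 = fit_logistic(ages, rates, [250.0 + 10.0 * i for i in range(16)])
```

Dropping age groups from `ages` and `rates` before calling `fit_logistic` mimics the reduced-data scenarios the abstract describes, although noisy real data would call for a proper nonlinear least-squares routine.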

Refractive Power Changes after Removal of Contact Lenses

  • 조윤경;김수운;유동식
    • 한국안광학회지
    • /
    • Vol. 18, No. 3
    • /
    • pp.279-289
    • /
    • 2013
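  • Purpose: To evaluate changes in refractive power during refraction for refractive correction after temporary removal of soft contact lenses. Methods: The subjects were 91 soft lens wearers (15 males and 76 females; 182 eyes in total) aged 17 to 39 years (mean $24 \pm 4.8$ years). Objective refraction, subjective refraction, and corneal radius of curvature were measured immediately after lens removal and 30, 60, and 90 minutes afterward. Changes in refractive power over measurement time were evaluated by lens type, fitting state, and wearing conditions. Results: Objective refraction, subjective refraction, and corneal radius of curvature all changed significantly across measurement times (p<0.0001). Refractive power showed a moderate myopic shift early in the measurement period (30 minutes after lens removal) and a slight myopic shift later (60 to 90 minutes after lens removal). There were no significant differences by lens type, fitting state, wearing time, number of wearing days, or hours of sleep on the night before the examination, but for changes in corneal curvature the interactions of measurement time with lens type (p=0.017), fitting state (p=0.019), and hours of sleep (p=0.010) were significant. Conclusions: Regardless of soft contact lens type, fitting state, and wearing conditions, refractive power and corneal curvature took at least 60 minutes after lens removal to stabilize. Refraction for refractive correction should therefore be performed after waiting at least 60 minutes where possible.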

Learning Similarity with Probabilistic Latent Semantic Analysis for Image Retrieval

  • Li, Xiong;Lv, Qi;Huang, Wenting
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • Vol. 9, No. 4
    • /
    • pp.1424-1440
    • /
    • 2015
  • Searching for the intended images among a large number of candidates is a challenging problem. Content-based image retrieval (CBIR) is the most promising way to tackle this problem, where the most important topic is measuring the similarity of images so as to cover variation in shape, color, pose, illumination, etc. While previous works have made significant progress, their ability to adapt to a dataset has not been fully explored. In this paper, we propose a similarity learning method based on a probabilistic generative model, namely probabilistic latent semantic analysis (PLSA). It first derives a Fisher kernel, a function over the parameters and variables, based on PLSA. Then, the parameters are determined by simultaneously maximizing the log likelihood function of PLSA and the retrieval performance over the training dataset. The main advantages of this work are twofold: (1) deriving a similarity measure based on PLSA, which fully exploits the data distribution and Bayes inference; (2) learning model parameters by simultaneously maximizing the fit of the model to the data and the retrieval performance. The proposed method (PLSA-FK) is empirically evaluated on three datasets, and the results exhibit promising performance.

Modeling the Natural Occurrence of Selected Dipterocarp Genera in Sarawak, Borneo

  • Teo, Stephen;Phua, Mui-How
    • Journal of Forest and Environmental Science
    • /
    • Vol. 28, No. 3
    • /
    • pp.170-178
    • /
    • 2012
  • Dipterocarps, or Dipterocarpaceae, are a commercially important timber-producing and dominant keystone tree family in the rain forests of Borneo. Borneo's landscape has been changing at an unprecedented rate in recent years, which affects this important biodiversity. This paper attempts to model the natural occurrence of dipterocarp species in Sarawak (their distribution including areas that held natural forests before conversion to other land uses, as opposed to their current distribution), which is important for forest biodiversity conservation and management. The local modeling method of Inverse Distance Weighting was compared with a commonly used statistical method (Binary Logistic Regression) to build the best natural distribution models for three genera (12 species) of dipterocarps. A database of species occurrence data and pseudoabsence data was constructed and divided into two halves for model building and validation. For logistic regression modeling, climatic, topographical and edaphic parameters were used. Proxy variables were used to represent parameters that were highly (p>0.75) correlated, to avoid over-fitting. The results show that Inverse Distance Weighting produced the best and most consistent predictions, with an average accuracy of over 80%. This study demonstrates that a local interpolation method can be used to model the natural distribution of dipterocarp species. Inverse Distance Weighting proved to be the better method, and possible reasons are discussed.
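
Inverse Distance Weighting itself is simple enough to sketch. The snippet below is a generic illustration, not the study's implementation: the power of 2, the coordinates, and the coding of presence as 1 and pseudoabsence as 0 are assumptions chosen for the example.

```python
import math

def idw_predict(known, query, power=2.0):
    """Inverse Distance Weighting estimate at `query`.

    `known` is a list of ((x, y), value) pairs; nearer points get larger
    weights 1 / distance**power. A query that coincides with a known point
    returns that point's value directly.
    """
    num = 0.0
    den = 0.0
    for (x, y), value in known:
        d = math.hypot(query[0] - x, query[1] - y)
        if d == 0.0:
            return value
        w = 1.0 / d ** power
        num += w * value
        den += w
    return num / den

# Hypothetical records: 1.0 = species present, 0.0 = pseudoabsence.
records = [((0.0, 0.0), 1.0), ((0.0, 2.0), 1.0),
           ((4.0, 0.0), 0.0), ((4.0, 2.0), 0.0)]

score_near_presence = idw_predict(records, (1.0, 1.0))
score_on_absence = idw_predict(records, (4.0, 0.0))
```

The query at (1, 1) sits closer to the two presence records and therefore scores well above 0.5, while a query on top of a pseudoabsence record returns 0. Turning such scores into presence or absence calls requires a threshold, which the abstract does not specify.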

Classroom Roll-Call System Based on ResNet Networks

  • Zhu, Jinlong;Yu, Fanhua;Liu, Guangjie;Sun, Mingyu;Zhao, Dong;Geng, Qingtian;Su, Jinbo
    • Journal of Information Processing Systems
    • /
    • Vol. 16, No. 5
    • /
    • pp.1145-1157
    • /
    • 2020
  • Convolutional neural networks (CNNs) have demonstrated outstanding performance compared to other algorithms in the field of face recognition. To address the over-fitting problem of CNNs, researchers have proposed residual networks, which ease training and improve recognition accuracy. In this study, a novel face recognition model based on game theory was proposed for roll call in the classroom. In the proposed scheme, an image with multiple faces was used as input, and the residual network identified each face with a confidence score to form a list of student identities. Face tracks with the same identity or with low confidence were taken as the optimisation objective, with the set of game participants formed from the student identity list. Game theory optimises the authentication strategy according to the confidence value and identity set to improve recognition accuracy. We observed that there exists an optimal mapping between faces and identities that avoids multiple faces being associated with one identity, and that the proposed game-based scheme can reduce the error rate compared with existing schemes that use deeper neural networks.

The Influence of Assay Error Weight on Gentamicin Pharmacokinetics Using the Bayesian and Nonlinear Least Square Regression Analysis in Appendicitis Patients

  • Jin, Pil-Burm
    • Archives of Pharmacal Research
    • /
    • Vol. 28, No. 5
    • /
    • pp.598-603
    • /
    • 2005
  • The purpose of this study was to determine the influence of weighting by gentamicin assay error on Bayesian and nonlinear least squares regression analysis in 12 Korean appendicitis patients. Gentamicin was administered intravenously over 0.5 h every 8 h. Three specimens were collected from every patient 48 h after the first dose, at the following times: just before a regularly scheduled infusion, and at 0.5 h and 2 h after the end of the 0.5 h infusion. Serum gentamicin levels were analyzed by the fluorescence polarization immunoassay technique with TDxFLx. The standard deviation (SD) of the assay over its working range was determined at serum gentamicin concentrations of 0, 2, 4, 8, 12, and 16 $\mu$g/mL in quadruplicate. The polynomial equation for the gentamicin assay error was found to be SD ($\mu$g/mL) = $0.0246 - 0.0495C + 0.00203C^2$. Weighting by gentamicin assay error changed the pharmacokinetic parameters of gentamicin under nonlinear least squares regression analysis but not under Bayesian analysis. This polynomial equation can be used to improve the precision with which pharmacokinetic models are fitted, optimizing model simulation for both population and individualized pharmacokinetic models. The result would be improved dosage regimens and better, safer care of patients receiving gentamicin.