• Title/Summary/Keyword: Outliers

A Real-Time Head Tracking Algorithm Using Mean-Shift Color Convergence and Shape Based Refinement (Mean-Shift의 색 수렴성과 모양 기반의 재조정을 이용한 실시간 머리 추적 알고리즘)

  • Jeong Dong-Gil; Kang Dong-Goo; Yang Yu Kyung; Ra Jong Beom
    • Journal of the Institute of Electronics Engineers of Korea SP / v.42 no.6 / pp.1-8 / 2005
  • In this paper, we propose a two-stage head tracking algorithm suited to a real-time active camera system with pan-tilt-zoom functions. In the color convergence stage, we first assume that the shape of a head is an ellipse and that its model color histogram is acquired in advance. Then, the mean-shift method is applied to roughly estimate the target position by examining the histogram similarity between the model and a candidate ellipse. To reflect temporal changes in object color and enhance the reliability of mean-shift based tracking, the target histogram obtained in the previous frame is used to update the model histogram. In the updating process, to alleviate error accumulation due to outliers in the target ellipse of the previous frame, the target histogram of the previous frame is obtained within an ellipse adaptively shrunk on the basis of the model histogram. In addition, to further enhance tracking reliability, we set the initial position closer to the true position by compensating for the global motion, which is rapidly estimated on the basis of two 1-D projection datasets. In the subsequent stage, we refine the position and size of the ellipse obtained in the first stage by using shape information. Here, we define a robust shape-similarity function based on the gradient direction. Extensive experimental results show that the proposed algorithm performs head tracking well, even when a person moves fast, the head size changes drastically, or the background has many clusters and distracting colors. The proposed algorithm can also perform tracking at a processing speed of about 30 fps on a standard PC.
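
The abstract mentions estimating global motion from two 1-D projection datasets but gives no details. The sketch below is one plausible reading, not the authors' implementation: each frame is reduced to a row-sum and a column-sum profile, and the shift that minimizes the mean absolute difference between consecutive profiles is taken as the camera motion. The function names, the cost function, and the `max_shift` search range are all assumptions.

```python
import numpy as np

def projection_shift(prev_proj, curr_proj, max_shift):
    """Return the integer shift s that best aligns curr_proj[i + s] with prev_proj[i]."""
    best_shift, best_cost = 0, np.inf
    n = len(prev_proj)
    for s in range(-max_shift, max_shift + 1):
        # Compare only the samples where the two shifted profiles overlap.
        if s >= 0:
            a, b = prev_proj[: n - s], curr_proj[s:]
        else:
            a, b = prev_proj[-s:], curr_proj[: n + s]
        cost = np.mean(np.abs(a - b))
        if cost < best_cost:
            best_shift, best_cost = s, cost
    return best_shift

def estimate_global_motion(prev_frame, curr_frame, max_shift=8):
    # Column sums give the horizontal profile, row sums the vertical one.
    dx = projection_shift(prev_frame.sum(axis=0), curr_frame.sum(axis=0), max_shift)
    dy = projection_shift(prev_frame.sum(axis=1), curr_frame.sum(axis=1), max_shift)
    return dx, dy

# Synthetic check: move a random texture 2 px down and 3 px left.
rng = np.random.default_rng(0)
prev = rng.integers(0, 256, size=(40, 50)).astype(float)
curr = np.roll(prev, shift=(2, -3), axis=(0, 1))
dx, dy = estimate_global_motion(prev, curr)
print(dx, dy)
```

Searching two 1-D profiles costs far less than a 2-D block search, which fits the abstract's emphasis on speed. The estimated shift would then be used to move the initial ellipse before the mean-shift iterations start.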

Statistical Analysis of Protein Content in Wheat Germplasm Based on Near-infrared Reflectance Spectroscopy (밀 유전자원의 근적외선분광분석 예측모델에 의한 단백질 함량 변이분석)

  • Oh, Sejong; Choi, Yu Mi; Yoon, Hyemyeong; Lee, Sukyeung; Yoo, Eunae; Hyun, Do Yoon; Shin, Myoung-Jae; Lee, Myung Chul; Chae, Byungsoo
    • KOREAN JOURNAL OF CROP SCIENCE / v.64 no.4 / pp.353-365 / 2019
  • A near-infrared reflectance spectroscopy (NIRS) prediction model was developed to establish a rapid analysis system for wheat germplasm and to provide statistical information on protein content. The variability index value (VIV) of the calibration resources was 0.80, the average protein content was 13.2%, and the content ranged from 7.0% to 13.2%. After the near-infrared spectra of the calibration resources were measured, the NIRS prediction model was developed through a regression analysis between protein content and spectral data, and then optimized by excluding outliers. The standard error of calibration, $R^2$, and the slope of the optimized model were 0.132, 0.997, and 1.000, respectively, and those of the external validation were 0.994, 0.191, and 1.013, respectively. Based on these results, the developed NIRS model can be applied to the rapid analysis of protein in wheat. The distribution of the NIRS-predicted protein content of 6,794 resources was analyzed using a normal distribution analysis. The VIV was 0.79, the average protein content was 12.1%, and the content ranges accounting for 42.1% and 68% of the total accessions were 10-13% and 9.5-14.6%, respectively. The resources were classified into breeding lines (3,128), landraces (2,705), and varieties (961). For breeding lines, the VIV was 0.80, the average protein content was 11.8%, and 68% of the resources ranged from 9.2% to 14.5%. For landraces, the VIV was 0.76, the average protein content was 12.1%, and 68% of the accessions ranged from 9.8% to 14.4%. For varieties, the VIV was 0.80, the average protein content was 12.8%, and 68% of the accessions ranged from 10.2% to 15.4%. These results should be helpful to wheat breeding experts.
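
As a rough consistency check on the reported distribution, the 68% range can be read as approximately the mean plus or minus one standard deviation of a normal distribution. That reading is an assumption, since the abstract does not state how the ranges were derived. Under it, the sketch below recovers an implied standard deviation from the reported 9.5-14.6% range and compares the model's share for 10-13% with the reported 42.1%.

```python
from statistics import NormalDist

mean = 12.1             # average protein content (%) reported for all accessions
low, high = 9.5, 14.6   # reported range covering 68% of accessions
sd = (high - low) / 2   # assumption: the 68% range is roughly mean +/- 1 SD

dist = NormalDist(mu=mean, sigma=sd)
share_68 = dist.cdf(high) - dist.cdf(low)
share_10_13 = dist.cdf(13.0) - dist.cdf(10.0)

print(f"implied SD: {sd:.2f}")
print(f"share within 9.5-14.6%: {share_68:.3f}")
print(f"share within 10-13%:    {share_10_13:.3f}")
```

If the accessions were exactly normal with these parameters, the 10-13% share would come out close to the reported 42.1%; a large gap would suggest skew or heavy tails in the collection.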

Processing and Quality Control of Flux Data at Gwangneung Forest (광릉 산림의 플럭스 자료 처리와 품질 관리)

  • Lim, Hee-Jeong; Lee, Young-Hee
    • Korean Journal of Agricultural and Forest Meteorology / v.10 no.3 / pp.82-93 / 2008
  • To ensure a standardized analysis of eddy covariance measurements, Hong and Kim's quality control program was updated and used to process eddy covariance data measured at two levels on the main flux tower at the Gwangneung site from January to May 2005. The updated program removes outliers automatically for $CO_2$ and latent heat fluxes. The flag system consists of four quality groups (G, D, B, and M). During the study period, missing data made up about 25% of the total records. After quality control, about 60% of the data were of good quality. The number of records in the G group was larger at 40 m than at 20 m, because the 20 m level lay within the roughness sublayer, where the canopy directly influences the character of the turbulence. About 60% of the bad data were due to low wind speed. Energy balance closure at this site was about 40% during the study period. The large imbalance is attributed partly to the combined effects of neglected heat storage terms, inaccuracy in the ground heat flux, and advection due to the local wind system near the surface. The analysis of wind direction indicates that the frequent occurrence of positive momentum flux was closely associated with the mountain-valley wind system at this site. The negative $CO_2$ flux at night was examined in terms of averaging time. The results show that when the averaging time is longer than 10 min, the magnitude of the calculated $CO_2$ fluxes increases rapidly, suggesting that the 30 min $CO_2$ flux is severely influenced by mesoscale motion or nonstationarity. A proper averaging time needs to be chosen to obtain accurate turbulent fluxes during nighttime.
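
The sensitivity to averaging time can be illustrated with a toy calculation. In the sketch below, the flux is taken as the covariance of vertical wind `w` and scalar concentration `c` within each averaging block. The signals contain only a slow drift and no turbulence, so any flux that appears is an artifact of nonstationarity. The series, their units, and the block lengths are made up for illustration and are not the Gwangneung data or the authors' program.

```python
import numpy as np

def block_flux(w, c, block_len):
    """Mean of per-block covariances between vertical wind w and scalar c."""
    n_blocks = len(w) // block_len
    fluxes = []
    for k in range(n_blocks):
        sl = slice(k * block_len, (k + 1) * block_len)
        w_fluct = w[sl] - w[sl].mean()   # fluctuation about the block mean
        c_fluct = c[sl] - c[sl].mean()
        fluxes.append(np.mean(w_fluct * c_fluct))
    return float(np.mean(fluxes))

t = np.linspace(0.0, 1.0, 1800)
w = 0.5 * t   # slow drift only, no turbulent fluctuations
c = 2.0 * t

flux_short = block_flux(w, c, 100)    # short averaging window
flux_long = block_flux(w, c, 1800)    # one long averaging window
print(flux_short, flux_long)
```

The long window attributes the shared drift to the flux, while the short windows remove most of it with the block means. This is the same mechanism the abstract points to when it links the growth in nighttime flux magnitude to mesoscale motion or nonstationarity.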

Evaluation of Oil Spill Detection Models by Oil Spill Distribution Characteristics and CNN Architectures Using Sentinel-1 SAR data (Sentinel-1 SAR 영상을 활용한 유류 분포특성과 CNN 구조에 따른 유류오염 탐지모델 성능 평가)

  • Park, Soyeon; Ahn, Myoung-Hwan; Li, Chenglei; Kim, Junwoo; Jeon, Hyungyun; Kim, Duk-jin
    • Korean Journal of Remote Sensing / v.37 no.5_3 / pp.1475-1490 / 2021
  • Detecting oil spill areas using the statistical characteristics of SAR images has limitations in that the classification algorithm is complicated and is greatly affected by outliers. To overcome these limitations, studies using neural networks to classify oil spills have recently been conducted. However, few studies have evaluated whether a model shows consistent detection performance across various oil spill cases. Therefore, in this study, two CNNs (Convolutional Neural Networks) with basic structures (Simple CNN and U-net) were used to determine whether detection performance differs according to the CNN structure and the distribution characteristics of the oil spill. With the method proposed in this study, the Simple CNN, which has only a contracting path, detected oil spills with an F1 score of 86.24%, and U-net, which has both contracting and expansive paths, achieved an F1 score of 91.44%. Both models successfully detected oil spills, but the detection performance of U-net was higher than that of the Simple CNN. Additionally, to compare the accuracy of the models across various oil spill cases, the cases were classified into four categories according to the spatial distribution characteristics of the oil spill (presence of land near the oil spill area) and the clarity of the border between oil and seawater. The Simple CNN had F1 scores of 85.71%, 87.43%, 86.50%, and 85.86% for the four categories, with a maximum difference of 1.71%. For U-net, the values were 89.77%, 92.27%, 92.59%, and 92.66%, with a maximum difference of 2.90%. These results indicate that neither model showed significant differences in detection performance according to the characteristics of the oil spill distribution. However, detection tendencies differed according to the model structure and the oil spill distribution characteristics. In all four oil spill categories, the Simple CNN tended to overestimate the oil spill area and U-net tended to underestimate it. These tendencies were more pronounced when the border between oil and seawater was unclear.
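
To make the F1 score and the over- or underestimation tendency concrete, the sketch below scores a predicted oil mask against a reference mask pixel by pixel. The masks are tiny hypothetical arrays, and the area ratio (predicted oil pixels divided by reference oil pixels) is an illustrative measure rather than a metric named in the abstract.

```python
import numpy as np

def f1_and_area_ratio(pred, truth):
    """Pixel-wise F1 score and predicted/true area ratio for binary masks.

    Assumes both masks contain at least one positive pixel.
    """
    pred, truth = pred.astype(bool), truth.astype(bool)
    tp = np.sum(pred & truth)    # oil predicted and present
    fp = np.sum(pred & ~truth)   # oil predicted but absent
    fn = np.sum(~pred & truth)   # oil present but missed
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    area_ratio = pred.sum() / truth.sum()   # > 1 overestimates, < 1 underestimates
    return float(f1), float(area_ratio)

truth = np.zeros((6, 6), dtype=bool)
truth[2:4, 2:4] = True    # 4 reference oil pixels
over = np.zeros_like(truth)
over[1:4, 1:4] = True     # 9 pixels: covers the slick plus a margin
under = np.zeros_like(truth)
under[2:4, 2:3] = True    # 2 pixels: only half of the slick

f1_over, ratio_over = f1_and_area_ratio(over, truth)
f1_under, ratio_under = f1_and_area_ratio(under, truth)
print(f1_over, ratio_over, f1_under, ratio_under)
```

Two models can reach similar F1 scores while erring in opposite directions, which is why the area ratio is reported alongside the score here.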

Comparison of Housewives' Agricultural Food Consumption Characteristics by Age (주부의 연령대별 농식품 소비 특성 비교)

  • Hong, Jun-Ho; Kim, Jin-Sil; Yu, Yeon-Ju; Lee, Kyung-Hee; Cho, Wan-Sup
    • The Journal of Bigdata / v.6 no.1 / pp.83-89 / 2021
  • Lifestyles are changing rapidly, and food consumption patterns vary widely among households as dietary habits and food processing technologies evolve. This paper reclassified the food groups of the consumer panel data established by the Rural Development Administration, which contain information on agricultural product purchases by household, and compared the consumption characteristics of agricultural products by age group. Age groups were defined by the prevalence of metabolic diseases: those in their 60s and older, with a prevalence of 20% or more, and those in their 30s and 40s, with a prevalence of less than 10%. Using the LightGBM algorithm, we classified the food consumption patterns of the two age groups and obtained a precision of 0.85, a recall of 0.71, and an F1 score of 0.77. The top five variables by variable importance were confectionery, leafy vegetables, seasoned vegetables, fruit vegetables, and marine products, and the top five by SHAP value were confectionery, marine products, seasoned vegetables, fruit vegetables, and leafy vegetables. When consumption was binarized at the median, rather than at the mean, which is sensitive to outliers, confectionery consumption among those in their 30s and 40s was more than twice that of those in their 60s. Other variables also showed significant differences between those in their 30s and 40s and those in their 60s and older. According to the study, people in their 30s and 40s consumed more than twice as much confectionery as those in their 60s, while those in their 60s consumed more than twice as much marine products, seasoned vegetables, fruit vegetables, and leafy vegetables as those in their 30s and 40s. Beyond the top five items, those in their 30s and 40s consumed more wheat-based snacks, breads, and noodles, which differed from the food consumption patterns of those in their 60s.
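
The choice of the median over the mean as the split point can be seen in a small example. The purchase amounts below are invented for illustration and are not taken from the consumer panel data; the last household is an extreme buyer.

```python
from statistics import mean, median

# Hypothetical purchase amounts for one food group across six households.
spend = [10, 12, 11, 13, 12, 200]

avg = mean(spend)
med = median(spend)

# Label each household as a high consumer relative to each threshold.
high_by_mean = [x > avg for x in spend]
high_by_median = [x > med for x in spend]

print(avg, med)
print(sum(high_by_mean), sum(high_by_median))
```

A single extreme household pulls the mean far above typical spending, so a mean-based split labels almost everyone as a low consumer. The median stays near typical spending and gives a more balanced binary label.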

Meta-analysis on the Effect of Startup Support Policies to Startup Performance (창업지원정책이 창업성과에 미치는 영향에 관한 메타분석)

  • Kim, Sun Chic; Jeon, Byung Hoon; Yun, Sung Im
    • Asia-Pacific Journal of Business Venturing and Entrepreneurship / v.15 no.6 / pp.95-114 / 2020
  • In this paper, a meta-analysis was conducted to examine the effect of start-up support policies on the start-up performance of beneficiary companies and to provide theoretical and practical implications for support organizations and practitioners. To this end, 35 papers reporting correlation coefficients were selected from previous studies in academic journals and dissertations published in Korea from 2007 to 2020. In these studies, the independent variables were funding, education support, facility/equipment support, network support, mentoring support, consulting support, marketing support, management support, technical support, and manpower support, and the effect sizes of their impact on financial and non-financial performance, the dependent variables, were reviewed. The pattern of the effect sizes was presented as a forest plot for easy visual understanding, and outliers were checked through sensitivity analysis for data showing small-study effects with publication bias. The analysis of the effect sizes of government support policies confirmed that the effect size was generally medium or higher, indicating an effect on entrepreneurial performance. Among the independent variables, manpower support had the greatest effect on start-up performance, followed, in descending order of effect size, by technical support, marketing support, management support, facility/equipment support, education support, mentoring support, funding, network support, and consulting support. Because the 「Small and Medium Business Startup Support Act」 was revised on October 8, 2020, shifting its focus from manufacturing to digital transformation and smartization, start-up support policy should take the start-up stage into account and set priorities when allocating the budget.
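
A common way to pool correlation coefficients is to apply Fisher's z-transform, weight each study by its sample size minus three, and transform the weighted mean back. The sketch below uses this fixed-effect approach together with a leave-one-out sensitivity check. The abstract does not say which pooling model the authors used, and the `(r, n)` pairs are hypothetical.

```python
import math

def pooled_correlation(studies):
    """studies: list of (r, n) pairs. Returns the fixed-effect pooled correlation."""
    num = den = 0.0
    for r, n in studies:
        z = math.atanh(r)   # Fisher z-transform of the correlation
        weight = n - 3      # inverse of the sampling variance 1 / (n - 3)
        num += weight * z
        den += weight
    return math.tanh(num / den)

def leave_one_out(studies):
    """Pooled correlation recomputed with each study removed in turn."""
    return [
        pooled_correlation(studies[:i] + studies[i + 1:])
        for i in range(len(studies))
    ]

# Hypothetical studies; the last one is a small study with an unusually large effect.
studies = [(0.30, 120), (0.35, 200), (0.28, 80), (0.85, 25)]
overall = pooled_correlation(studies)
sensitivity = leave_one_out(studies)
print(round(overall, 3), [round(r, 3) for r in sensitivity])
```

If dropping one study moves the pooled estimate noticeably, that study is a candidate outlier. This is the kind of check a sensitivity analysis for small-study effects performs.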

An Outlier Detection Using Autoencoder for Ocean Observation Data (해양 이상 자료 탐지를 위한 오토인코더 활용 기법 최적화 연구)

  • Kim, Hyeon-Jae; Kim, Dong-Hoon; Lim, Chaewook; Shin, Yongtak; Lee, Sang-Chul; Choi, Youngjin; Woo, Seung-Buhm
    • Journal of Korean Society of Coastal and Ocean Engineers / v.33 no.6 / pp.265-274 / 2021
  • Outlier detection in ocean data has traditionally been performed using statistical and distance-based machine learning algorithms. Recently, AI-based methods have received much attention, and supervised learning methods, which require classification information for the data, are mainly used. Supervised learning takes considerable time and cost because classification information (labels) must be manually assigned to all the data required for learning. In this study, an autoencoder based on unsupervised learning was applied to outlier detection to overcome this problem. Two experiments were designed: univariate learning, in which only SST data from the Deokjeok Island observations were used, and multivariate learning, in which SST, air temperature, wind direction, wind speed, air pressure, and humidity were used. The data cover 25 years, from 1996 to 2020, and were pre-processed in consideration of the characteristics of ocean data. We tried to detect outliers in real SST data using the trained univariate and multivariate autoencoders. To compare model performance, various outlier detection methods were applied to synthetic data with artificially inserted errors. A quantitative evaluation showed that the multivariate and univariate accuracies were about 96% and 91%, respectively, indicating that the multivariate autoencoder had better outlier detection performance. Outlier detection using an unsupervised learning-based autoencoder is expected to be widely applicable because it can reduce subjective classification errors and the cost and time required for data labeling.
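
The core idea of autoencoder-based detection is to learn a compressed representation of normal data and to flag observations that reconstruct poorly. To stay dependency-free, the sketch below replaces the neural autoencoder with a linear autoencoder fitted by SVD (equivalent to PCA), so it is a stand-in for the technique and not the authors' model. The synthetic variables, the single latent dimension, and the mean-plus-three-standard-deviations threshold are assumptions.

```python
import numpy as np

def fit_linear_autoencoder(train, n_latent):
    """Fit the encoder/decoder weights as the top right-singular vectors."""
    mean = train.mean(axis=0)
    _, _, vt = np.linalg.svd(train - mean, full_matrices=False)
    return mean, vt[:n_latent]

def reconstruction_error(x, mean, components):
    centered = x - mean
    latent = centered @ components.T   # encode into the latent space
    recon = latent @ components        # decode back to observation space
    return np.linalg.norm(centered - recon, axis=1)

rng = np.random.default_rng(0)

def make_obs(n):
    # Three correlated variables driven by one hidden factor plus small noise.
    driver = rng.normal(size=(n, 1))
    return driver * np.array([1.0, 2.0, 3.0]) + 0.01 * rng.normal(size=(n, 3))

train = make_obs(500)
test = make_obs(50)
test[10] = [5.0, -5.0, 5.0]   # artificially inserted error

mean, comps = fit_linear_autoencoder(train, n_latent=1)
train_err = reconstruction_error(train, mean, comps)
threshold = train_err.mean() + 3 * train_err.std()
flagged = np.flatnonzero(reconstruction_error(test, mean, comps) > threshold)
print(flagged)
```

The inserted record violates the relationship among the three variables, so it reconstructs poorly even though each value on its own is not extreme. This is one intuition for why a multivariate model can detect outliers that a univariate one misses.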

The Economic Cycle and Contributing Factors to the Operating Profit Ratio of Korean Liner Shipping (경기순환과 우리나라 정기선 해운의 영업이익률 변동 요인)

  • Mok, Ick-soo; Ryoo, Dong-keun
    • Journal of Navigation and Port Research / v.46 no.4 / pp.375-384 / 2022
  • The shipping industry is cyclically affected by complex variables such as various economic indicators, social events, and supply and demand. The purpose of this study was to analyze the operating profit of 13 Korean liner companies over 30 years, a period that includes the financial crisis of the late 1990s, the global financial crisis of the late 2000s, and the COVID-19 global pandemic. The study also aimed to identify factors that affected the profit ratio of Korea's liner shipping companies according to economic conditions. Reflecting the characteristics of liner shipping companies, the companies were divided into ocean-going and short-sea shipping and analyzed by hierarchical multiple regression. The time series data are based on the Korean International Financial Reporting Standards (K-IFRS) and comprise seaborne trade volume, fleet evolution, and macroeconomic indicators. The outliers representing economic downturns due to social events were analyzed separately. The analysis showed that the China Container Freight Index (CCFI) positively affected both ocean-going and short-sea liner shipping companies. However, the Korean container shipping volume positively affected only ocean-going liners. Additionally, world and Korean GDP, world seaborne trade volume, and fuel price were factors in the operating profit of short-sea liner shipping. The GDP growth rate of China, the exchange rate, and the interest rate did not significantly affect either group. Notably, the operating profitability of Korea's liner shipping was exceptionally high during the recessions of 1998 and 2020. This is paradoxical and is not correlated with the classical economic indicators. Unlike other studies, this paper focused on the operating profit before financial expenses, considering the complexity and difficulty of forecasting the shipping cycle, and drew conclusions from a relatively long-term empirical analysis that includes three economic shocks.
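
Hierarchical multiple regression enters predictors in blocks and tracks how much explained variance each block adds. The sketch below shows the mechanics with ordinary least squares on synthetic data. The variable names echo the abstract, but the values, the entry order, and the coefficients are invented for illustration.

```python
import numpy as np

def r_squared(X, y):
    """R^2 of an OLS fit of y on X with an intercept."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    return float(1.0 - resid @ resid / np.sum((y - y.mean()) ** 2))

rng = np.random.default_rng(1)
n = 30   # e.g. 30 annual observations
freight_index = rng.normal(size=n)
trade_volume = rng.normal(size=n)
gdp_growth = rng.normal(size=n)
# Synthetic response: driven by the first two predictors only.
profit_ratio = 0.8 * freight_index + 0.3 * trade_volume + 0.2 * rng.normal(size=n)

# Enter one block of predictors per step and record R^2 after each step.
blocks = [freight_index[:, None], trade_volume[:, None], gdp_growth[:, None]]
r2_steps = []
for k in range(1, len(blocks) + 1):
    X = np.hstack(blocks[:k])
    r2_steps.append(r_squared(X, profit_ratio))

delta_r2 = np.diff(r2_steps)   # variance explained by each added block
print(r2_steps, delta_r2)
```

A block that adds little to R^2, such as the unrelated third predictor here, corresponds to a factor that does not significantly affect the outcome. In practice the increment would be tested with an F-test rather than inspected by eye.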

Gridding of Automatic Mountain Meteorology Observation Station (AMOS) Temperature Data Using Optimal Kriging with Lapse Rate Correction (기온감률 보정과 최적크리깅을 이용한 산악기상관측망 기온자료의 우리나라 500미터 격자화)

  • Youjeong Youn; Seoyeon Kim; Jonggu Kang; Yemin Jeong; Soyeon Choi; Yungyo Im; Youngmin Seo; Myoungsoo Won; Junghwa Chun; Kyungmin Kim; Keunchang Jang; Joongbin Lim; Yangwon Lee
    • Korean Journal of Remote Sensing / v.39 no.5_1 / pp.715-727 / 2023
  • To provide detailed and appropriate meteorological information in mountainous areas, the Korea Forest Service has established an Automatic Mountain Meteorology Observation Station (AMOS) network in major mountainous regions since 2012, and 464 stations are currently operated. In this study, we proposed an optimal kriging technique with lapse rate correction to produce gridded temperature data suitable for Korean forests using AMOS point observations. First, the outliers of the AMOS temperature data were removed through statistical processing. Then, an optimized theoretical variogram, which best approximates the empirical variogram, was derived to perform the optimal kriging with lapse rate correction. A 500-meter resolution Kriging map for temperature was created to reflect the elevation variations in Korean mountainous terrain. A blind evaluation of the method using a spatially unbiased validation sample showed a correlation coefficient of 0.899 to 0.953 and an error of 0.933 to 1.230℃, indicating a slight accuracy improvement compared to regular kriging without lapse rate correction. However, the critical advantage of the proposed method is that it can appropriately represent the complex terrain of Korean forests, such as local variations in mountainous areas and coastal forests in Gangwon province and topographical differences in Jirisan and Naejangsan and their surrounding forests.
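
Lapse rate correction can be sketched in three steps: reduce the station temperatures to sea level, interpolate the reduced values, and restore each grid cell to its own elevation. In the sketch below, inverse-distance weighting stands in for the optimized kriging step to keep the example short, so it is not the authors' interpolator. The lapse rate of -6.5 °C per km is the standard-atmosphere value, assumed here, and the station coordinates and temperatures are hypothetical.

```python
import numpy as np

LAPSE_RATE = -6.5e-3   # degC per metre; standard-atmosphere value, assumed here

def to_sea_level(temp, elev):
    """Remove the elevation effect from station temperatures."""
    return temp - LAPSE_RATE * elev

def idw(xy_obs, values, xy_target, power=2.0):
    """Inverse-distance weighting; a simple stand-in for kriging."""
    d = np.linalg.norm(xy_obs[None, :, :] - xy_target[:, None, :], axis=2)
    w = 1.0 / np.maximum(d, 1e-9) ** power
    return (w * values).sum(axis=1) / w.sum(axis=1)

def grid_temperature(xy_obs, temp_obs, elev_obs, xy_grid, elev_grid):
    t0 = to_sea_level(temp_obs, elev_obs)     # 1. reduce to sea level
    t0_grid = idw(xy_obs, t0, xy_grid)        # 2. interpolate reduced values
    return t0_grid + LAPSE_RATE * elev_grid   # 3. restore grid elevation

# Three hypothetical stations that share a sea-level temperature of 20 degC.
xy_obs = np.array([[0.0, 0.0], [1000.0, 0.0], [0.0, 1000.0]])
elev_obs = np.array([200.0, 800.0, 1400.0])
temp_obs = 20.0 + LAPSE_RATE * elev_obs

xy_grid = np.array([[500.0, 500.0]])
elev_grid = np.array([1000.0])
t_grid = grid_temperature(xy_obs, temp_obs, elev_obs, xy_grid, elev_grid)
print(t_grid)
```

Interpolating the raw station temperatures would blend values from different elevations. Interpolating at a common reference level and re-applying the terrain height lets the gridded field follow the topography.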

Efficient Deep Learning Approaches for Active Fire Detection Using Himawari-8 Geostationary Satellite Images (Himawari-8 정지궤도 위성 영상을 활용한 딥러닝 기반 산불 탐지의 효율적 방안 제시)

  • Sihyun Lee; Yoojin Kang; Taejun Sung; Jungho Im
    • Korean Journal of Remote Sensing / v.39 no.5_3 / pp.979-995 / 2023
  • As wildfires are difficult to predict, real-time monitoring is crucial for a timely response. Geostationary satellite images are very useful for active fire detection because they can monitor a vast area with high temporal resolution (e.g., 2 min). Existing satellite-based active fire detection algorithms detect thermal outliers using threshold values based on the statistical analysis of brightness temperature. However, the difficulty of establishing suitable thresholds for such methods hinders their ability to detect low-intensity fires and to achieve generalized performance. In light of these challenges, machine learning has emerged as a potential solution. Until now, relatively simple techniques such as random forest, vanilla convolutional neural network (CNN), and U-net have been applied to active fire detection. Therefore, this study proposed an active fire detection algorithm that applies state-of-the-art (SOTA) deep learning techniques to data from the Advanced Himawari Imager and evaluated it over East Asia and Australia. The SOTA model was developed by applying EfficientNet and the Lion optimizer, and the results were compared with those of a model using the vanilla CNN structure. EfficientNet outperformed the CNN, with F1-scores of 0.88 and 0.83 in East Asia and Australia, respectively. Performance improved further after weighted loss, equal sampling, and image augmentation were used to address data imbalance, resulting in F1-scores of 0.92 in East Asia and 0.84 in Australia. It is anticipated that timely responses facilitated by the SOTA deep learning-based approach to active fire detection will effectively mitigate the damage caused by wildfires.
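
The threshold-based baseline that the abstract contrasts with deep learning can be sketched as a simple statistical test on brightness temperature. The rule below flags pixels more than `k` standard deviations above the scene mean. Operational algorithms use more elaborate contextual tests, so the rule, the value of `k`, and the synthetic scene are assumptions for illustration.

```python
import numpy as np

def thermal_outliers(bt, k=3.0):
    """Flag pixels whose brightness temperature exceeds mean + k * std."""
    threshold = bt.mean() + k * bt.std()
    return bt > threshold

# Synthetic 5 x 5 scene: background of 300-302 K with one hot pixel.
i, j = np.indices((5, 5))
bt = 300.0 + (i + j) % 3
bt[2, 2] = 350.0

mask = thermal_outliers(bt)
print(np.argwhere(mask))
```

The weakness noted in the abstract shows up here as well: a low-intensity fire only a few kelvin above the background would fall below the threshold, and lowering `k` to catch it would raise false alarms elsewhere.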