• Title/Summary/Keyword: Kernel Density Analysis


Multivariate Time Series Simulation With Component Analysis (독립성분분석을 이용한 다변량 시계열 모의)

  • Lee, Tae-Sam;Salas, Jose D.;Karvanen, Juha;Noh, Jae-Kyoung
    • Proceedings of the Korea Water Resources Association Conference / 2008.05a / pp.694-698 / 2008
  • In hydrology, dealing with multivariate time series, such as modeling the streamflows of an entire complex river system, is a difficult task. Models based on the normal distribution, such as MARMA (Multivariate Autoregressive Moving Average), have been the major approach to modeling multivariate time series, but they have limitations. One is the unfavorable data transformation that forces the data to follow the normal distribution. Another is that a high-dimensional multivariate model requires a very large parameter matrix. An alternative is to decompose the multivariate data into independent components and model each component individually. In 1985, Lins used Principal Component Analysis (PCA): five scores, the data decomposed from the original data, were taken and formulated individually. One of the five scores was modeled with an AR-2 model, while the others were modeled with AR-1 models. From the time series analysis of the five component scores, he noted that "principal component time series might provide a relatively simple and meaningful alternative to conventional large MARMA models". Inspired by this remark, this study proposes a multivariate simulation model using Principal Component Analysis (PCA) and Independent Component Analysis (ICA). The simulation has three steps. (1) PCA decomposes the correlated multivariate data into uncorrelated data, while ICA decomposes the data into independent components. The autocorrelation structure of the decomposed data, inherited from the data in the original domain, is still dominant. (2) Each component is resampled by block bootstrapping or K-nearest neighbor resampling. (3) The resampled components are transformed back to the original domain. With the suggested approach, one might expect that a) the simulated data differ from the historical data, b) no data transformation is required (in the case of ICA), and c) a complex system can be decomposed into independent components that are modeled individually. The PCA-based and ICA-based models are compared using various statistics, including basic statistics (mean, standard deviation, skewness, autocorrelation), reservoir-related statistics, and kernel density estimates.
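The three steps in the abstract above can be sketched roughly as follows. This is a minimal illustration of the PCA variant only, assuming a moving block bootstrap with a fixed block length; the function name `pca_block_bootstrap` and the parameters `block_len`, `n_out` and `seed` are hypothetical and are not taken from the paper.

```python
import numpy as np

def pca_block_bootstrap(X, block_len=12, n_out=None, seed=0):
    """Sketch: PCA scores -> moving block bootstrap -> back-transform.

    X is an (n_times, n_sites) array. All parameter names are illustrative.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    n_out = n if n_out is None else n_out
    mean = X.mean(axis=0)
    # Step 1: PCA via SVD of the centered data; the scores are uncorrelated.
    U, sing, Vt = np.linalg.svd(X - mean, full_matrices=False)
    scores = U * sing
    # Step 2: resample each component separately with moving blocks, which
    # keeps the short-range autocorrelation inside every block.
    n_blocks = -(-n_out // block_len)  # ceiling division
    sim_scores = np.empty((n_blocks * block_len, scores.shape[1]))
    for j in range(scores.shape[1]):
        starts = rng.integers(0, n - block_len + 1, size=n_blocks)
        sim_scores[:, j] = np.concatenate(
            [scores[s0:s0 + block_len, j] for s0 in starts]
        )
    # Step 3: bring the resampled components back to the original domain.
    return sim_scores[:n_out] @ Vt + mean
```

Because the blocks are drawn independently for each component, the simulated series differs from the historical one while the loadings in `Vt` restore the cross-site structure. The ICA variant and the K-nearest neighbor resampling mentioned in the abstract are not covered by this sketch.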


Cluster exploration of water pipe leak and complaints surveillance using a spatio-temporal statistical analysis (스캔통계량 분석을 통한 상수도 누수 및 수질 민원 발생 클러스터 탐색)

  • Juwon Lee;Eunju Kim;Sookhyun Nam;Tae-Mun Hwang
    • Journal of Korean Society of Water and Wastewater / v.37 no.5 / pp.261-269 / 2023
  • In light of recent social concerns about the deterioration of water supply pipes, which leads to problems such as leaks and degraded water quality, maintenance efforts to enhance water source quality and ensure a stable water supply have become substantially more important. In this study, a scan statistic was applied to water quality complaints and water leakage accidents from 2015 to 2021 to present a reasonable method for identifying areas that require improvement in water management. SaTScan, a spatio-temporal statistical analysis program, and ArcGIS were used for spatial information analysis, and clusters with high relative risk (RR) were determined using the maximum log-likelihood ratio, relative risk, and Monte Carlo hypothesis test for I city, the target area. Specifically, in the case of water quality complaints, the analysis results were compared by distinguishing cases occurring before and after the onset of "red water." In the period from 2015 to 2019, before the occurrence of red water, the leak cluster at location L2 posed a significantly higher risk (RR: 2.45) than other regions. As for water quality complaints, cluster C2 exhibited a notably elevated RR (RR: 2.21) and appeared concentrated in areas D and S, respectively. On the other hand, water quality complaints after the red water incidents were predominantly concentrated in area S. The analysis found that the locations of complaint clusters were similar to those of red water incidents. Of these, cluster C7 exhibited a substantial RR of 4.58, signifying more than a twofold increase compared to pre-incident levels. A kernel density map analysis was performed using GIS to identify priority areas for waterworks management based on the central location of clusters and complaint cluster RR data.
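To illustrate the kernel density mapping step mentioned at the end of this abstract, the following minimal sketch evaluates a weighted Gaussian kernel density surface on a regular grid. The Gaussian kernel, the function name `kernel_density_grid`, the bandwidth, and the use of cluster RR values as weights are assumptions made for illustration; GIS packages may use a different kernel, and the study's actual settings are not given in the abstract.

```python
import numpy as np

def kernel_density_grid(points, weights, bandwidth, grid_x, grid_y):
    """Weighted 2-D Gaussian kernel density evaluated on a grid.

    points  : (m, 2) array of cluster centers (hypothetical input)
    weights : (m,) array, e.g. relative risk per cluster (assumption)
    Returns an array of shape (len(grid_y), len(grid_x)).
    """
    pts = np.asarray(points, dtype=float)
    w = np.asarray(weights, dtype=float)
    gx, gy = np.meshgrid(grid_x, grid_y)
    # Squared distance from every grid cell to every point: (ny, nx, m).
    d2 = (gx[..., None] - pts[:, 0]) ** 2 + (gy[..., None] - pts[:, 1]) ** 2
    kernel = np.exp(-0.5 * d2 / bandwidth ** 2) / (2.0 * np.pi * bandwidth ** 2)
    # Each point contributes its kernel scaled by its weight.
    return (kernel * w).sum(axis=-1)
```

Cells with the highest values would then be candidates for priority areas; how the surface is classified into priority levels is left open here.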

Spatial Analysis of Colorectal Cancer Cases in Kuala Lumpur

  • Shah, Shamsul Azhar;Neoh, Hui-Min;Syed Abdul Rahim, Syed Sharizman;Azhar, Zahir Izuan;Hassan, Mohd Rohaizat;Safian, Nazarudin;Jamal, Rahman
    • Asian Pacific Journal of Cancer Prevention / v.15 no.3 / pp.1149-1154 / 2014
  • Background: In Malaysia, data from the Malaysian Health Ministry showed colorectal cancer (CRC) to be the second most common type of cancer in 2007-2009, after breast cancer. The same was apparent when male and female cases were examined separately. In the present study, a Geographic Information System (GIS) was employed to describe the distribution of CRC cases in Kuala Lumpur (KL), Malaysia, according to socio-demographic factors (age, gender, ethnicity and district). Materials and Methods: This retrospective review used data for patients diagnosed with colorectal cancer in the years 1995 to 2011, collected from the Wilayah Persekutuan Health Office (taken from the cancer notification form, NCR-2) and from patient medical records of the Surgical Department, Universiti Kebangsaan Malaysia Medical Centre (UKMMC). A total of 146 cases were analyzed. All the data collected were analysed using ArcGIS version 10.0 and SPSS version 19.0. Results: Patients aged 60 to 69 years accounted for the highest proportion of cases (34.2%), males slightly predominated at 76 (52.1%), Chinese had the highest number of registered cases at 108 (74.0%), and staging revealed most cases in the 3rd and 4th stages. Kernel density analysis showed that cases are more concentrated in the northern area of Petaling and Kuala Lumpur subdistricts. Spatial global pattern analysis by average nearest neighbour resulted in a nearest neighbour ratio of 0.75, with a Z-score of -5.59 and a p value of <0.01. Spatial autocorrelation (Moran's I) showed significant clustering, with p<0.01, a Z-score of 3.14 and a Moran's Index of 0.007. When mapping clusters with hotspot analysis (Getis-Ord Gi), hot and cold spots were identified. Hot spot areas fell on the northeast side of KL. Conclusions: This study demonstrated significant spatial patterns of cancer incidence in KL. Knowledge about these spatial patterns can provide useful information to policymakers in planning CRC screening in the targeted population and improving healthcare facilities to provide better treatment for CRC patients.
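For readers unfamiliar with the average nearest neighbour statistic reported above, here is a minimal sketch of how the ratio and its z-score are commonly computed (the Clark-Evans form). The function name `average_nearest_neighbour` is illustrative, and the study area is an input the caller must supply; the area used in the paper is not stated in the abstract.

```python
import numpy as np

def average_nearest_neighbour(points, area):
    """Return (nearest neighbour ratio, z-score) for a 2-D point pattern.

    ratio < 1 suggests clustering, ratio > 1 suggests dispersion.
    `area` is the size of the study region (assumed known).
    """
    pts = np.asarray(points, dtype=float)
    n = len(pts)
    # Pairwise distances; the diagonal is masked so a point is not its own neighbour.
    diff = pts[:, None, :] - pts[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)
    observed = dist.min(axis=1).mean()
    # Expected mean distance and its standard error under complete spatial randomness.
    expected = 0.5 / np.sqrt(n / area)
    std_err = 0.26136 / np.sqrt(n ** 2 / area)
    return observed / expected, (observed - expected) / std_err
```

A ratio below 1 with a strongly negative z-score, as in the abstract, points to clustering rather than a random pattern. The result is sensitive to the chosen area, so the same points can give different ratios for different study boundaries.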

Studies on Classification and Genetic Nature of Korean Local Corn Lines (한국(韓國) 재래종(在來種) 옥수수의 계통분류(系統分類) 및 유전적(遺傳的) 특성(特性)에 관(關)한 연구(硏究))

  • Lee, In Sup;Choi, Bong Ho
    • Korean Journal of Agricultural Science / v.9 no.1 / pp.396-450 / 1982
  • To obtain basic information on Korean local corn lines, a total of 57 lines were selected from the 1,000 Korean local collections at Chungnam National University, classified by principal component analysis, and investigated for their genetic nature. The results are summarized as follows. 1. There was great variation in the mean values of plant characters among the lines. The mean values of plant characters, except for density of kernels, varied with the type of crossing. All characters except tasselling date were reduced in magnitude when selfed, while those characters increased when topcrossed. 2. The correlation coefficients among the characters studied ranged from 0.99 to -0.59. The correlation coefficients among characters did not change greatly with the type of cross. 3. To classify the lines more effectively, 12 selected plant characters were used to classify the 57 local lines by principal component analysis. The first four components could explain 86.4%, 83.4% and 81.1% of the total variation in sibbed lines, selfed lines and topcrossed lines, respectively. 4. The contribution of characters to the principal components was high for the upper principal components and low for the lower principal components. 5. The biological meaning of the principal components and the plant types corresponding to each principal component were explained clearly by the correlation coefficients between principal components and characters. The first principal component appeared to correspond to the size of plant and ear. The second principal component appeared to correspond to the degree of differentiation in organs and the duration of the vegetative growing period. However, the biological meaning of the third and fourth principal components was not clear. 6. The lines were classified into 4 lineal groups by taxonomic distance. Group I included 52 lines, which was 91.2% of the total lines, group II 3 lines, group III 1 line and group IV 1 line.
The four groups could be characterized as follows: Group I: early maturity, short-culmed, medium height plant, small ears, medium kernels and medium yielding. Group II: late maturity, medium height plant, small ears, small kernels, prolific ears and higher yielding. Group III: medium maturity, tall-culmed, small ears, small kernels and low yielding. Group IV: medium maturity, tall-culmed, large ears, one ear per plant and medium yielding. 7. Inbreeding depression varied with plant characters and lines. Characters such as yield, kernel weight per ear, ear weight and plant height showed a great degree of inbreeding depression. Group I showed high inbreeding depression in characters such as 100-kernel weight, leaf number, plant height and days to tasselling, while group II showed high inbreeding depression in other plant characters. 8. Heterosis of plant characters also varied with lines. Ear weight, kernel weight per ear, yield, 100-kernel weight, and plant height were some of the plant characters showing high heterosis. Group II showed high values of heterosis in characters such as ear length, ear diameter, ear weight, kernel weight per ear, 100-kernel weight, and leaf length, while group I was high in heterosis in other plant characters. 9. The degree of homozygosity was highest in ear weight (79.1%) and lowest in ear number per plant (-21%). Group II showed a higher degree of homozygosity than group I. 10. Correlation coefficients between characters of sibbed and topcrossed lines were positive for all characters. Highly significant correlation coefficients between sibbed and topcrossed lines were obtained especially for characters such as ear number per plant, plant height, leaf length and yield per plot.


Analysis of Temporal and Spatial Red Tide Change in the South Sea of Korea Using the GOCI Images of COMS (천리안 위성 GOCI 영상을 이용한 남해안의 시공간적 적조변화 분석)

  • Kim, Dong Kyoo;Yoo, Hwan Hee
    • Journal of Korean Society for Geospatial Information Science / v.22 no.3 / pp.129-136 / 2014
  • This study deals with red tide detection using remote sensing imagery from the Geostationary Ocean Color Imager (GOCI), the world's first geostationary ocean color sensor, around the southern coast of Korea, where the most severe red tide occurred recently. The red tide zone was determined by selecting the available data from the GOCI imagery during the period of red tide occurrence, and the severe red tide zone was detected through a spatial analysis of temporal change within the red tide zone. The results showed that the coast in the vicinity of Hansan and Yokji in Tongyeong-si was classified as the severe red tide zone, and that the red tide was likely to spread from the coast of Hansan and Yokji to that of Sanyang-eub. In addition, a comparative analysis of the area of red tide occurrence, the prevention activities of the Gyeongsangnam-do provincial government and the amount of damage cost over time showed a close correlation among them. It is still too early to conclude that the study shows the severe red tide zone and the spread path exactly, because various factors affect red tide occurrence and activity. To improve the reliability of the results, more data analysis is required.

Development of Drought Index based on Streamflow for Monitoring Hydrological Drought (수문학적 가뭄감시를 위한 하천유량 기반 가뭄지수 개발)

  • Yoo, Jiyoung;Kim, Tae-Woong;Kim, Jeong-Yup;Moon, Jang-Won
    • KSCE Journal of Civil and Environmental Engineering Research / v.37 no.4 / pp.669-680 / 2017
  • This study evaluated the consistency of various drought indices with the standard flow used to forecast low flow. The data used in this study were streamflow data at the Gurye2 station on the Seomjin River and the Angang station on the Hyeongsan River, as well as rainfall data from nearby weather stations (Namwon and Pohang). Using the streamflow data, the streamflow accumulation drought index (SADI) was developed in this study to represent the hydrological drought condition. For the SADI calculations, the drought threshold was determined by a change-point analysis of the flow pattern, and a reduction factor was estimated based on the kernel density function. The standardized runoff index (SRI) and standardized precipitation index (SPI) were also calculated for comparison with the SADI. SRI and SPI were calculated for 30-, 90-, 180-, and 270-day periods, and an ROC curve analysis was then performed to determine the time period with the highest consistency with the standard flow. The ROC curve analysis indicated that, under the attention stage, the indices in order of decreasing consistency with the standard flow were SADI_C3, SRI30, SADI_C1, SADI_C2, and SPI90 for the Seomjin River-Gurye2 station, and SADI_C3, SADI_C1, SPI270, SRI30, and SADI_C2 for the Hyeongsan River-Angang station.

Estimation of Flow Population of Seoul Walking Tour Courses Using Telecommunications Data (통신 데이터를 활용한 도보관광코스 유동인구 추정 및 분석)

  • Park, Ye Rim;Kang, Youngok
    • Journal of Cadastre & Land InformatiX / v.49 no.1 / pp.181-195 / 2019
  • This study aims to analyze the spatial context of walking tour courses by analyzing their flow characteristics and visualizing them effectively, using floating population data constructed from telecommunications data. A floating population data refinement algorithm was developed to estimate the flow population along roads, and floating population data were constructed for each walking tour course. To choose an algorithm suitable for the analysis of walking tour courses, an estimate of the floating population that considers the road area was compared with an estimate that considers the floating population values around the road. As a result, the estimate that considers the surrounding floating population values was adopted, because it yields relatively consistent data and is therefore more appropriate for the analysis. Then, a data mining algorithm for walking tour courses was constructed according to the characteristics of the floating population data (the absence of missing values). Finally, this study analyzed the flow characteristics and spatial patterns of 18 walking trails in Seoul using the floating population data for each walking tour course. To do this, kernel density analysis and the Getis-Ord $G^*_i$ statistical hotspot analysis were applied to visualize the main characteristics of each walking tour course.
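As a rough illustration of the Getis-Ord $G^*_i$ hotspot statistic named in this abstract, the sketch below computes the standard z-score form for a set of cells given a spatial weights matrix. The function name `getis_ord_gi_star` and the binary neighbourhood weights in the usage note are assumptions for illustration; the weighting scheme used in the study is not described in the abstract.

```python
import numpy as np

def getis_ord_gi_star(values, weights):
    """Getis-Ord Gi* z-scores.

    values  : (n,) array, e.g. floating population per road segment (assumed)
    weights : (n, n) spatial weights with w[i, i] > 0, since the "star"
              variant includes the cell itself in its own neighbourhood.
    """
    x = np.asarray(values, dtype=float)
    w = np.asarray(weights, dtype=float)
    n = x.size
    mean = x.mean()
    s = np.sqrt((x ** 2).mean() - mean ** 2)
    w_sum = w.sum(axis=1)
    # Local weighted sum compared with what the global mean would predict.
    numerator = w @ x - mean * w_sum
    denominator = s * np.sqrt((n * (w ** 2).sum(axis=1) - w_sum ** 2) / (n - 1))
    return numerator / denominator
```

For example, with values `[1, 1, 1, 10, 10, 1, 1]` along a line and weights of 1 for each cell and its immediate neighbours, the cells holding the value 10 receive positive scores (hot spots) and the end cells receive negative scores. The statistic is undefined when all values are equal or when a cell's neighbourhood covers every cell with equal weight.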

A Study on the Landscape Elements and Distribution Characteristics of Mount Tai Appearing in Poems (시문(詩文)에 나타난 태산(泰山) 경관요소 및 분포특성 연구)

  • Yu, Ying;Jung, Taeyeol
    • Journal of the Korean Institute of Landscape Architecture / v.49 no.6 / pp.80-92 / 2021
  • Mount Tai, with an elevation of 1,532 meters, has a reputation as 'The Most Revered of the Five Sacred Mountains(五嶽獨尊)', despite not being the highest mountain in China. The literati of the past dynasties created a multitude of works based on the landscape of Mount Tai. Traditional literature is a part of national culture that directly reflects the national characteristics and styles, and is an important part of humanities, which can be linked to landscapes. The purpose of this study is to investigate the landscape elements and characteristics of Mount Tai by analyzing the landscape types and elements and the Kernel Density, Mean Center and Standard Deviational Ellipse of the landscape elements appearing in the representative poems of traditional literature. The research results of this study are summarized as follows. First, Mount Tai is a scenic spot dominated by human activities, different from the natural landscape of prior research related to scenic spots. Second, among the landscape elements of Mount Tai, the importance of "sunrise", "cyan", "towering" and "majestic", "Divine Dragon" is confirmed, symbolizing the hope, brightness, vitality, national stability and prosperity represented by Mount Tai, which can explain the leadership position of Mount Tai. Third, it can be found from the poems about Mount Tai that various landscape elements were embodied in belief (the behavior of gods or emperors) in the Pre-Qin, Sui and Tang dynasties, while in modern times, landscape elements are shown by action (climbing and looking far into distance), so it can be said that the landscape elements have changed from belief landscapes to experience landscapes. Fourth, the spatial distribution of landscape elements in the past dynasties was widely distributed in the Daiding(岱頂). 
Approaching modern times, the mean center moved from the south outside Mount Tai to the summit of Mount Tai, and the spatial distribution changed from a widely scattered distribution to a narrow linear distribution centered on Mount Tai. The present study is significant in that it provides key factors and spaces for the future landscape protection and restoration of Mount Tai.
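The mean center and standard deviational ellipse mentioned in this abstract can be sketched as below. This is one common covariance-based formulation, assumed here for illustration; GIS software may scale the axes differently, and the function name `mean_center_and_ellipse` is hypothetical.

```python
import numpy as np

def mean_center_and_ellipse(points):
    """Return (mean center, semi-axes [major, minor], major-axis angle in degrees).

    points is an (n, 2) array of x/y locations of landscape elements (assumed).
    The angle is measured counterclockwise from the x axis, in [0, 180).
    """
    pts = np.asarray(points, dtype=float)
    center = pts.mean(axis=0)
    # Population covariance of the coordinates.
    cov = np.cov(pts, rowvar=False, bias=True)
    evals, evecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
    evals = np.clip(evals, 0.0, None)   # guard against tiny negative round-off
    major = evecs[:, -1]
    angle = np.degrees(np.arctan2(major[1], major[0])) % 180.0
    return center, np.sqrt(evals[::-1]), angle
```

A shift of `center` between periods corresponds to the movement of the mean center described above, and a shrinking minor axis relative to the major axis corresponds to a change toward a narrow linear distribution.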

Performance Evaluation and Analysis on Single and Multi-Network Virtualization Systems with Virtio and SR-IOV (가상화 시스템에서 Virtio와 SR-IOV 적용에 대한 단일 및 다중 네트워크 성능 평가 및 분석)

  • Jaehak Lee;Jongbeom Lim;Heonchang Yu
    • The Transactions of the Korea Information Processing Society / v.13 no.2 / pp.48-59 / 2024
  • As hardware functions that natively support virtualization are developed, user applications with various workloads are operating efficiently in virtualization systems. SR-IOV is a virtualization support function that gives direct access to PCI devices, thus providing high I/O performance by minimizing the need for hypervisor or operating system intervention. With SR-IOV, network I/O acceleration can be realized in virtualization systems, which have relatively long I/O paths compared to bare-metal systems and frequent context switches between the user area and the kernel area. To take advantage of the performance of SR-IOV, network resource management policies that can derive optimal network performance when SR-IOV is applied to an instance such as a virtual machine (VM) or container are being actively studied. This paper evaluates and analyzes the network performance of SR-IOV, which implements I/O acceleration, in comparison with Virtio in terms of 1) network delay, 2) network throughput, 3) network fairness, 4) performance interference, and 5) multi-network. The contributions of this paper are as follows. First, the network I/O processes of Virtio and SR-IOV in the virtualization system are clearly explained. Second, the evaluation results for the network performance of Virtio and SR-IOV are analyzed based on various performance metrics. Third, the system overhead and the possibility of optimization for the SR-IOV network in a virtualization system with high VM density are experimentally confirmed. The experimental results and analysis of the paper are expected to serve as a reference for network resource management policies in virtualization systems that operate network-intensive services such as smart factories, connected cars, deep learning inference models, and crowdsourcing.

Nonlinear Autoregressive Modeling of Southern Oscillation Index (비선형 자기회귀모형을 이용한 남방진동지수 시계열 분석)

  • Kwon, Hyun-Han;Moon, Young-Il
    • Journal of Korea Water Resources Association / v.39 no.12 s.173 / pp.997-1012 / 2006
  • We present a nonparametric stochastic approach for the SOI (Southern Oscillation Index) series that uses a nonlinear methodology called Nonlinear AutoRegressive (NAR) modeling, based on a conditional kernel density function and CAFPE (Corrected Asymptotic Final Prediction Error) lag selection. The fitted linear AR model exhibits heteroscedasticity, and in addition the BDS (Brock-Dechert-Scheinkman) statistic is rejected. Hence, we applied the NAR model to the SOI series. We identified lags 1, 2 and 4 as appropriate and estimated the conditional mean function. The Portmanteau test shows no autocorrelation in the residuals. However, the null hypotheses of normality and of no heteroscedasticity are rejected by the Jarque-Bera test and the ARCH-LM test, respectively. Moreover, lag selection for the conditional standard deviation function with CAFPE gives lags 3, 8 and 9. As a result of the conditional standard deviation analysis, all i.i.d. assumptions on the residuals are accepted. In particular, the BDS statistic is accepted at the 95% and 99% significance levels. Finally, we split the SOI set into a sample for estimating the model and a sample for out-of-sample prediction; that is, we conduct one-step-ahead forecasts for the last 97 values (15%). The NAR model shows an MSEP of 0.5464, which is 7% lower than that of the linear model. Hence, these results support the relevance of the NAR model, and the nonparametric NAR model is preferable to a linear one for reflecting the nonlinearity of the SOI series.
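The idea of a kernel-based conditional mean for a nonlinear autoregression can be sketched as follows. A Nadaraya-Watson (local constant) estimator with a Gaussian kernel is used here as a simple stand-in; the estimator, bandwidth choice and CAFPE lag search used in the paper may differ, and the function name `nar_forecast` and the `bandwidth` default are hypothetical. Only the lag set (1, 2, 4) is taken from the abstract.

```python
import numpy as np

def nar_forecast(series, lags=(1, 2, 4), bandwidth=0.5):
    """One-step-ahead forecast from a kernel estimate of E[y_t | lagged values]."""
    y = np.asarray(series, dtype=float)
    n = len(y)
    p = max(lags)
    # Each row of X holds the lagged values (y[t-1], y[t-2], y[t-4]) for target y[t].
    X = np.column_stack([y[p - lag:n - lag] for lag in lags])
    targets = y[p:]
    # Lag vector for the next, unobserved time step.
    x_new = np.array([y[n - lag] for lag in lags])
    # Gaussian kernel weights: past states close to x_new count the most.
    d2 = ((X - x_new) ** 2).sum(axis=1)
    k = np.exp(-0.5 * d2 / bandwidth ** 2)
    # Weighted average of the values that followed similar past states.
    return (k * targets).sum() / k.sum()
```

Because the forecast is a weighted average of observed values, it always stays within the range of the historical series. In practice the bandwidth would be chosen by a data-driven criterion rather than fixed by hand.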