• Title/Summary/Keyword: Nearest neighbor

Multivariate Time Series Simulation With Component Analysis (독립성분분석을 이용한 다변량 시계열 모의)

  • Lee, Tae-Sam;Salas, Jose D.;Karvanen, Juha;Noh, Jae-Kyoung
    • Proceedings of the Korea Water Resources Association Conference / 2008.05a / pp.694-698 / 2008
  • In hydrology, modeling multivariate time series, such as the streamflows of an entire complex river system, is a difficult task. Models based on the normal distribution, such as MARMA (Multivariate Autoregressive Moving Average), have been a major approach to modeling multivariate time series. Normal-based models have some limitations. One is the unfavorable data transformation needed to force the data to follow the normal distribution. Furthermore, a high-dimensional multivariate model requires a very large parameter matrix. An alternative is to decompose the multivariate data into independent components and model each one individually. In 1985, Lins used Principal Component Analysis (PCA). Five scores, the data decomposed from the original data, were taken and modeled individually: one of the five scores was modeled with an AR-2 model while the others were modeled with AR-1 models. From the time series analysis of the five component scores, he noted that "principal component time series might provide a relatively simple and meaningful alternative to conventional large MARMA models". This study is inspired by that remark to develop a multivariate simulation model. The multivariate simulation model suggested here uses Principal Component Analysis (PCA) and Independent Component Analysis (ICA). Simulation proceeds in three modeling steps. (1) PCA decomposes the correlated multivariate data into uncorrelated data, while ICA decomposes the data into independent components. The autocorrelation structure of the decomposed data, inherited from the data in the original domain, is still dominant. (2) Each component is resampled by block bootstrapping or K-nearest neighbor. (3) The resampled components are transformed back to the original domain. With the suggested approach one might expect that a) the simulated data differ from the historical data, b) no data transformation is required (in the case of ICA), and c) a complex system can be decomposed into independent components and modeled individually. The models with PCA and ICA are compared using various statistics, such as the basic statistics (mean, standard deviation, skewness, autocorrelation), reservoir-related statistics, and the kernel density estimate.
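
Step (2) of the abstract above can be illustrated with a minimal sketch of K-nearest-neighbor resampling for a single decomposed component. The abstract does not give the resampling details, so the lag-1 neighbor search, the 1/j rank weights, the function name `knn_resample`, and the sample values are all assumptions made for illustration.

```python
import random

def knn_resample(series, length, k=3, seed=0):
    """Generate a synthetic series by lag-1 K-nearest-neighbor resampling."""
    rng = random.Random(seed)
    # Candidates are all points that have a successor in the record.
    candidates = range(len(series) - 1)
    # Rank weights 1/j favor the closest neighbors (an assumed kernel).
    weights = [1.0 / j for j in range(1, k + 1)]
    current = rng.choice(series)
    out = []
    for _ in range(length):
        # Indices of the k historical values closest to the current value.
        nearest = sorted(candidates, key=lambda i: abs(series[i] - current))[:k]
        pick = rng.choices(nearest, weights=weights, k=1)[0]
        current = series[pick + 1]  # successor of the chosen neighbor
        out.append(current)
    return out

# Hypothetical scores of one decomposed component.
component = [0.2, 0.5, 0.9, 0.4, -0.1, -0.6, -0.3, 0.1, 0.6, 0.8]
synthetic = knn_resample(component, length=20, k=3)
```

Because each new value is the successor of a historical neighbor, the resampled component keeps some of the short-range dependence of the original record; the recombination of several such components in step (3) is not shown here.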

Development of Knee Pain Diagnosis Questionnaire and Clinical Study of Diagnostic Correspondent Rate (슬통 진단용 설문지개발 및 진단 일치도 평가연구)

  • Hwang, Ji-Hoo;Kim, Yu-Jong;Kim, Eun-Jung;Lee, Cham-Kyul;Lee, Eun-Yong;Lee, Seung-Deok;Kim, Kap-Sung
    • Journal of Acupuncture Research / v.29 no.5 / pp.61-74 / 2012
  • Objectives: This study was performed to prepare oriental medicine clinical guidelines by drawing up standards for oriental medicine demonstration and diagnosis classification of knee pain. Methods: Using an oriental medicine diagnosis questionnaire, we conducted a statistical analysis of Crane's-knee wind(鶴膝風), arthralgia syndrome(痺症), knee injury(膝傷), gout arthritis(痛風), and Youk jeol poung(歷節風), the categories into which experts' opinions about knee pain patients were classified by the Delphi method. The results were classified using linear discriminant analysis (LDA), diagonal linear discriminant analysis (DLDA), diagonal quadratic discriminant analysis (DQDA), K-nearest neighbor classification (KNN), classification and regression trees (CART), and support vector machines (SVM). Results: The results are summarized as follows. 1. LDA had a hit rate of 81.65% in comparison with the original diagnosis. 2. DLDA had a hit rate of 63.3% in comparison with the original diagnosis. 3. DQDA had a hit rate of 65.14% in comparison with the original diagnosis. 4. KNN had a hit rate of 74.31% in comparison with the original diagnosis. 5. CART had a hit rate of 75.23% in comparison with the original diagnosis when the test was performed on 13 significant questions selected by analysis of variance. 6. SVM had a hit rate of 87.16% in comparison with the original diagnosis. Conclusions: Statistical analysis using the oriental medicine diagnosis questionnaire on knee pain generally produced significant results.

A Study for 8 Constitution Medicine Diagnosis Expert System Development(2) (8체질 진단을 위한 전문가 시스템 개발에 관한 연구(2))

  • Shin, Yong-Sup;Park, Young-Bae;Park, Young-Jae;Kim, Min-Yong;Lee, Sang-Chul;Oh, Hwan-Sup
    • The Journal of the Society of Korean Medicine Diagnostics / v.12 no.2 / pp.107-126 / 2008
  • Background: Few studies have examined methods of diagnosing the 8 constitutions other than pulse diagnosis in 8 Constitution Medicine. Objectives: This study aims to develop an 8 Constitution Medicine diagnosis expert system using CBR (Case-Based Reasoning). Methods: First, in the case base construction process, we built the case base for the CBR implementation by gathering 925 cases from patients whose constitution had been verified. Second, in the study model establishment process, to develop a superior expert system we divided the CBR reasoning process into a fundamental-type CBR that uses the basis data values as they are and expert-type I, II, and III CBR that apply weights to the basis data values according to expert opinion. Third, in the system implementation process, we explained the constitution diagnosis process and the way of assigning weights through the Nearest Neighbor Method sampling process, one of the CBR techniques. Fourth, in the system evaluation process, we selected the superior CBR models by comparing and evaluating the diagnosis rates of the fundamental-type system (GECBR) model and the expert-type I, II, and III CBR system (AVCBR, AACBR, AGCBR) models, which reflect expert opinion on top of the fundamental-type system. GECBR and AGCBR were chosen as the superior study models. Through these four study processes, we developed the 8 constitution diagnosis expert system. Results: 1. When GECBR, the fundamental type, is selected as the reasoning system, the expected diagnosis rate of the 8 constitution diagnosis expert system is 78.91%, and the expected diagnosis rates by constitution are Hepatonia 90.4%, Cholecystonia 63.0%, Pancreotonia 91.1%, Gastrotonia 0%, Pulmotonia 71.2%, Colonotonia 74.4%, Renotonia 37.5%, and Vesicotonia 67.1%. 2. When AGCBR, expert type III, is selected as the reasoning system, the expected diagnosis rate of the 8 constitution diagnosis expert system is 77.51%, and the expected diagnosis rates by constitution are Hepatonia 93.4%, Cholecystonia 58.5%, Pancreotonia 91.1%, Gastrotonia 0%, Pulmotonia 73.1%, Colonotonia 64.4%, Renotonia 41.7%, and Vesicotonia 72.2%. Conclusion: Based on this study, the 8 constitution diagnosis expert system may help in diagnosing the 8 constitutions and is expected to serve as an objective evaluation tool for 8 constitution diagnosis; further study on developing the 8 Constitution Medicine diagnosis expert system using CBR (Case-Based Reasoning) is needed to supplement this work.
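
The difference between a fundamental-type and an expert-weighted nearest-neighbor retrieval, as described in the abstract above, can be sketched as follows. This is only an illustration: the weighted Euclidean distance, the function name `nearest_case`, the three-feature cases, and the weight values are hypothetical and are not taken from the paper.

```python
def nearest_case(case_base, query, weights=None):
    """Return the stored case closest to the query under a weighted distance."""
    # Equal weights reproduce the unweighted (fundamental-type) retrieval.
    weights = weights or [1.0] * len(query)

    def distance(features):
        return sum(w * (a - b) ** 2
                   for w, a, b in zip(weights, features, query)) ** 0.5

    return min(case_base, key=lambda case: distance(case[0]))

# Hypothetical cases: (questionnaire scores, verified constitution).
case_base = [
    ([1.0, 0.0, 3.0], "Hepatonia"),
    ([0.0, 2.0, 1.0], "Pancreotonia"),
    ([3.0, 3.0, 0.0], "Colonotonia"),
]
query = [1.0, 2.0, 2.0]

basic = nearest_case(case_base, query)                            # unweighted
expert = nearest_case(case_base, query, weights=[5.0, 0.1, 0.1])  # assumed expert weights
```

With these made-up numbers the unweighted retrieval returns the Pancreotonia case, while emphasizing the first feature shifts the match to the Hepatonia case, which shows how expert weights can change the diagnosis.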

A Learning Agent for Automatic Bookmark Classification (북 마크 자동 분류를 위한 학습 에이전트)

  • Kim, In-Cheol;Cho, Soo-Sun
    • The KIPS Transactions:PartB / v.8B no.5 / pp.455-462 / 2001
  • The World Wide Web has become one of the major services provided through the Internet. When searching the vast web space, users use bookmarking facilities to record the sites of interest encountered during the course of navigation. One of the typical problems arising from bookmarking is that the list of bookmarks loses coherent organization when the list becomes too lengthy, thus ceasing to function as a practical finding aid. In order to maintain the bookmark file in an efficient, organized manner, the user has to classify all the bookmarks newly added to the file and update the folders. This paper introduces our learning agent called BClassifier, which automatically classifies bookmarks by analyzing the contents of the corresponding web documents. The chief source of training examples is the set of bookmarks already classified by the user into several bookmark folders according to their subject. Additionally, web pages found under the top categories of the Yahoo site are collected and included in the training examples to diversify both the subject categories represented and the training examples for those categories. Our agent employs the naive Bayesian learning method, a well-tested, probability-based categorizing technique. In this paper, the outcome of some experimentation is also outlined and evaluated. A comparison of the naive Bayesian learning method with other learning methods such as k-Nearest Neighbor and TFIDF is also presented.

Exploratory Research on Automating the Analysis of Scientific Argumentation Using Machine Learning (머신 러닝을 활용한 과학 논변 구성 요소 코딩 자동화 가능성 탐색 연구)

  • Lee, Gyeong-Geon;Ha, Heesoo;Hong, Hun-Gi;Kim, Heui-Baik
    • Journal of The Korean Association For Science Education / v.38 no.2 / pp.219-234 / 2018
  • In this study, we explored the possibility of automating the process of analyzing elements of scientific argument in the context of a Korean classroom. To gather training data, we collected 990 sentences from science education journals that illustrate the results of coding elements of argumentation according to Toulmin's argumentation structure framework. We extracted 483 sentences as a test data set from the transcription of students' discourse in scientific argumentation activities. The words and morphemes of each argument were analyzed using the Python 'KoNLPy' package and the 'Kkma' module for Korean Natural Language Processing. After constructing the 'argument-morpheme:class' matrix for 1,473 sentences, five machine learning techniques were applied to generate predictive models relating each sentence to the element of argument with which it corresponded. The accuracy of the predictive models was investigated by comparing them with the results of pre-coding by researchers and confirming the degree of agreement. The predictive model generated by the k-nearest neighbor algorithm (KNN) demonstrated the highest degree of agreement [54.04% ($\kappa=0.22$)] when machine learning was performed considering the morphemes of each sentence. The predictive model generated by KNN exhibited higher agreement [55.07% ($\kappa=0.24$)] when the coding results of the previous sentence were added to the prediction process. In addition, the results indicated the importance of considering the context of discourse by reflecting the codes of previous sentences in the analysis. The results are significant in that they show the possibility of automating the analysis of students' argumentation activities in the Korean language by applying machine learning.
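
The abstract above reports agreement as a percentage together with Cohen's kappa. As a small illustration of how the two quantities relate, the sketch below computes both for two label sequences. The function name `agreement_and_kappa` and the six example labels are hypothetical and are not data from the study.

```python
from collections import Counter

def agreement_and_kappa(human, machine):
    """Return (raw agreement, Cohen's kappa) for two equal-length label lists."""
    n = len(human)
    # Observed agreement: share of items given the same label by both coders.
    observed = sum(h == m for h, m in zip(human, machine)) / n
    # Chance agreement from each coder's marginal label frequencies.
    h_counts, m_counts = Counter(human), Counter(machine)
    expected = sum(h_counts[c] * m_counts[c] for c in h_counts) / (n * n)
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa

# Hypothetical codes for six sentences (researcher vs. predictive model).
human = ["claim", "data", "claim", "warrant", "data", "claim"]
machine = ["claim", "data", "data", "warrant", "claim", "claim"]
observed, kappa = agreement_and_kappa(human, machine)
```

Kappa discounts the agreement that would be expected by chance, which is why a raw agreement above 50% can correspond to a much lower kappa.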

A Parameter-Free Approach for Clustering and Outlier Detection in Image Databases (이미지 데이터베이스에서 매개변수를 필요로 하지 않는 클러스터링 및 아웃라이어 검출 방법)

  • Oh, Hyun-Kyo;Yoon, Seok-Ho;Kim, Sang-Wook
    • Journal of the Institute of Electronics Engineers of Korea CI / v.47 no.1 / pp.80-91 / 2010
  • As the volume of image data increases dramatically, good organization of image data is crucial for efficient image retrieval. Clustering is a typical way of organizing image data. However, traditional clustering methods have the difficulty of requiring the user to provide the number of clusters as a parameter before clustering. In this paper, we discuss an approach for clustering image data that does not require this parameter. The proposed approach is based on Cross-Association, which finds structure or patterns hidden in data using the relationships between individual objects. In order to apply Cross-Association to the clustering of image data, we first convert the image data into a graph. Then, we perform Cross-Association on the resulting graph and interpret the results from the clustering perspective. We also propose a method of hierarchical clustering and a method of outlier detection based on Cross-Association. Through a series of experiments, we verify the effectiveness of the proposed approach. Finally, we discuss how to find a good value of k for the k-nearest neighbor search and compare the clustering results obtained with symmetric and asymmetric ways of building the graph.

Microscopic Growth Structure of Ultrathin Films on Asymmetric Surfaces (비대칭적 표면 위에 초미세 박막의 미시적 성장구조)

  • 서지근;신영호;김재성;민항기
    • Proceedings of the Korean Vacuum Society Conference / 1999.07a / pp.187-187 / 1999
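  • When an ultrathin film is grown on a surface with 2-fold symmetry, such as an fcc(110) or bcc(110) surface, the adsorbate has been reported to grow as asymmetric clusters because of the asymmetry between the two directions of the substrate surface. Recent STM growth experiments on Ps(110), Si(100), and W(110) surfaces report adsorbates growing as long single rows or as clusters with unequal length and width, and attribute these growth forms to the direction-dependent asymmetry of the adatom diffusion rate, asymmetric interactions with neighboring atoms, or the diffusion rate along cluster edges. However, for most material systems the adatom diffusion rates and diffusion barriers are still not well known. Understanding adsorbate diffusion rates is essential for fabricating desired atomic-scale structures, and this study attempts to determine the individual diffusion rates for each site and condition by comparing KMC simulations with experimental results. We are particularly interested in the variety of microscopic growth structures that appear on asymmetric substrates, and we use KMC simulation as the research method. The microscopic growth mode is largely determined by the form of the diffusion barriers. The barriers with a relatively large influence on growth are Ed, the diffusion barrier for an atom moving on a terrace; Ep, the barrier for an atom attached to a step edge to detach; and the Schwoebel barrier encountered when an atom descends from an upper terrace over a step. First, summarizing the growth structures on the symmetric fcc(100) surface, we observed a variety of microscopic growth forms depending on the diffusion barriers. Because multilayer growth follows the same mode as sub-ML growth, the overall growth mode can be predicted from the sub-ML growth structure. The general trend is that growth becomes more fractal as Ep increases and the cluster density decreases as Ed decreases, while the same Ed+Ep gives the same arm width (cluster thickness in the horizontal and vertical directions). Therefore, the value of Ed+Ep can be determined from the experimentally measured cluster arm width, and Ed and Ep can be obtained separately from the cluster density and the fractal dimension. The Es value can also be obtained from the roughness of multilayer growth. For surfaces without symmetry between the two directions, such as fcc(110), the growth forms are diverse but can be estimated by the same method. On a (110) surface, nearest neighbor atoms form along one axis, so both the interactions and the diffusion barriers are asymmetric between this axis and the axis perpendicular to it. The diffusion barriers must therefore be treated separately for the x and y directions, as Edx, Edy, Epx, Epy, and so on. KMC simulations that take these asymmetric diffusion barriers into account showed growth in which the cluster thickness ratio varies with the ratio of the diffusion barriers along the horizontal and vertical axes, and the arm width along one axis was nearly identical to the arm width for the same Ed+Ep value on the fcc(100) surface. Therefore, even for growth with such asymmetric shapes, it appears that the diffusion barriers along each axis can be obtained from the cluster density, the cluster shape, the ratio of cluster lengths along the two axes, and the average arm width along the two axes.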
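
To give a feel for why direction-dependent barriers such as Edx and Edy lead to anisotropic growth, the sketch below compares hop rates along two axes. It assumes the usual Arrhenius form used in kinetic Monte Carlo simulations, which the abstract itself does not state; the barrier values, the temperature, and the attempt frequency are hypothetical.

```python
import math

K_B = 8.617333262e-5  # Boltzmann constant in eV/K

def hop_rate(barrier_ev, temperature_k, attempt_hz=1e13):
    """Arrhenius hop rate; the 1e13 Hz attempt frequency is an assumed typical value."""
    return attempt_hz * math.exp(-barrier_ev / (K_B * temperature_k))

# Hypothetical terrace diffusion barriers (eV) along x and y on an anisotropic surface.
Edx, Edy = 0.30, 0.45
T = 300.0  # temperature in kelvin (assumed)

rate_x = hop_rate(Edx, T)
rate_y = hop_rate(Edy, T)
anisotropy = rate_x / rate_y  # equals exp((Edy - Edx) / (K_B * T))
```

Under these assumptions, even a modest barrier difference makes hops along one axis far more frequent than along the other, which is consistent with the elongated clusters described in the abstract.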

An Analysis of Policy Effects of Export Infrastructure Strengthening Program on Export of Food Distribution Companies (수출인프라강화사업이 식품유통기업 수출에 미치는 정책효과 분석)

  • Huang, Seong-Hyuk;Ji, Seong-Tae
    • Journal of Distribution Science / v.16 no.1 / pp.87-99 / 2018
  • Purpose - The Export Infrastructure Strengthening Program (EISP) is a project to expand exports of agri-food products by providing customized export information to food distribution companies and supporting overseas information activities. A total of 39.6 billion won had been provided by 2016. The purpose of this study is to analyze whether the EISP is effective in expanding exports of agri-food products. Research design, data, and methodology - A simple average difference between the export performance of policy beneficiaries and non-beneficiaries can be biased if the export capacity or inherent characteristics of the enterprises are not taken into account. To address this bias, the propensity score matching (PSM) method was employed in this study. PSM converts the characteristics of an export company into an index through logit analysis, reducing the matching to one dimension to improve the accuracy of the performance measurement. Results - A balancing test was conducted to determine how closely the characteristics of the policy beneficiary group and the matched non-beneficiary group corresponded. We could not reject the null hypothesis that there was no difference between the two groups, so after matching the two groups were similar and the explanatory variables were well controlled. Using nearest neighbor matching on propensity scores estimated through logit analysis, we estimated the average treatment effect on the treated (ATT). Participation in the EISP increased the exports of food companies by $5.88 million. In addition, the number of export contracts increased by 11.77, the number of exporting countries by 7.52, the number of export items by 47.51, and the number of buyer consultations by 3.50, and overseas marketing expenses increased by 35.92 million won. Except for the number of export contracts, the export performance results were statistically significant. Conclusions - As the EISP has a positive effect on the expansion of agri-food exports, efforts should be made to identify the limitations or problems of the policy and to make a greater contribution to the increase of exports.
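
The nearest neighbor matching step described above can be sketched in a few lines, assuming the propensity scores have already been estimated (the logit step is not shown). Matching with replacement on the absolute score difference, the function name `att_nearest_neighbor`, and all numbers are assumptions for illustration, not the study's data or exact procedure.

```python
def att_nearest_neighbor(treated, controls):
    """Average treatment effect on the treated via 1-nearest-neighbor matching.

    treated, controls: lists of (propensity_score, outcome) pairs.
    Each treated unit is matched, with replacement, to the control
    whose propensity score is closest.
    """
    effects = []
    for score, outcome in treated:
        match = min(controls, key=lambda c: abs(c[0] - score))
        effects.append(outcome - match[1])
    return sum(effects) / len(effects)

# Hypothetical (propensity score, export outcome) pairs.
treated = [(0.80, 12.0), (0.60, 9.0), (0.40, 7.0)]
controls = [(0.75, 8.0), (0.55, 6.0), (0.35, 5.0), (0.10, 2.0)]
att = att_nearest_neighbor(treated, controls)
```

Here each treated unit is compared only with its closest control, so the low-score control at 0.10 never enters the estimate; this is the sense in which matching on a single index controls for differences between the groups.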

An Implementation of Automatic Genre Classification System for Korean Traditional Music (한국 전통음악 (국악)에 대한 자동 장르 분류 시스템 구현)

  • Lee Kang-Kyu;Yoon Won-Jung;Park Kyu-Sik
    • The Journal of the Acoustical Society of Korea / v.24 no.1 / pp.29-37 / 2005
  • This paper proposes an automatic genre classification system for Korean traditional music. The proposed system accepts queried input music and classifies it, based on music content, into one of six musical genres: Royal Shrine Music, Classical Chamber Music, Folk Song, Folk Music, Buddhist Music, and Shamanist Music. In general, content-based music genre classification consists of two stages: music feature vector extraction and pattern classification. For feature extraction, the system extracts 58-dimensional feature vectors, including spectral centroid, spectral rolloff, and spectral flux based on STFT as well as coefficient-domain features such as LPC and MFCC, and these features are further optimized using the SFS method. For pattern or genre classification, k-NN, Gaussian, GMM, and SVM algorithms are considered. In addition, the proposed system adopts the MFC method to resolve the uncertainty in system performance caused by different query patterns (or portions). The experimental results verify successful genre classification performance of over 97% for both the k-NN and SVM classifiers; however, the SVM classifier classifies almost three times faster than k-NN.

A Study for 8 Constitution Medicine Diagnosis Expert System Development (8체질의학을 위한 진단 전문가 시스템 개발 및 고찰)

  • Shin, Yong-Sup;Park, Young-Bae;Park, Young-Jae;Kim, Min-Yong;Oh, Hwan-Sup
    • The Journal of the Society of Korean Medicine Diagnostics / v.12 no.1 / pp.142-184 / 2008
  • Background: Few studies have examined methods of diagnosing the 8 constitutions other than pulse diagnosis in 8 Constitution Medicine. Objectives: This study aims to develop an 8 Constitution Medicine diagnosis expert system using CBR (Case-Based Reasoning). Methods: First, in the case base construction process, we built the case base for the CBR implementation by gathering 925 cases from patients whose constitution had been verified. Second, in the study model establishment process, to develop a superior expert system we divided the CBR reasoning process into a fundamental-type CBR that uses the basis data values as they are and expert-type I, II, and III CBR that apply weights to the basis data values according to expert opinion. Third, in the system implementation process, we explained the constitution diagnosis process and the way of assigning weights through the Nearest Neighbor Method sampling process, one of the CBR techniques. Fourth, in the system evaluation process, we selected the superior CBR models by comparing and evaluating the diagnosis rates of the fundamental-type system (GECBR) model and the expert-type I, II, and III CBR system (AVCBR, AACBR, AGCBR) models, which reflect expert opinion on top of the fundamental-type system. GECBR and AGCBR were chosen as the superior study models. Through these four study processes, we developed the 8 constitution diagnosis expert system. Results: 1. When GECBR, the fundamental type, is selected as the reasoning system, the expected diagnosis rate of the 8 constitution diagnosis expert system is 78.91%, and the expected diagnosis rates by constitution are Hepatonia 90.4%, Cholecystonia 63.0%, Pancreotonia 91.1%, Gastrotonia 0%, Pulmotonia 71.2%, Colonotonia 74.4%, Renotonia 37.5%, and Vesicotonia 67.1%. 2. When AGCBR, expert type III, is selected as the reasoning system, the expected diagnosis rate of the 8 constitution diagnosis expert system is 77.51%, and the expected diagnosis rates by constitution are Hepatonia 93.4%, Cholecystonia 58.5%, Pancreotonia 91.1%, Gastrotonia 0%, Pulmotonia 73.1%, Colonotonia 64.4%, Renotonia 41.7%, and Vesicotonia 72.2%. Conclusion: Based on this study, the 8 constitution diagnosis expert system may help in diagnosing the 8 constitutions and is expected to serve as an objective evaluation tool for 8 constitution diagnosis; further study on developing the 8 Constitution Medicine diagnosis expert system using CBR (Case-Based Reasoning) is needed to supplement this work.
