• Title/Summary/Keyword: Similarity sampling

Search Result 122, Processing Time 0.028 seconds

A Study on Measuring the Similarity Among Sampling Sites in Lake Yongdam with Water Quality Data Using Multivariate Techniques (다변량기법을 활용한 용담호 수질측정지점 유사성 연구)

  • Lee, Yosang;Kwon, Sehyug
    • Journal of Environmental Impact Assessment
    • /
    • v.18 no.6
    • /
    • pp.401-409
    • /
    • 2009
  • Multivariate statistical approaches to classify sampling sites with measuring their similarity by water quality data and understand the characteristics of classified clusters have been discussed for the optimal water quality monitering network. For empirical study, data of two years (2005, 2006) at the 9 sampling sites with the combination of 2 depth levels and 7 important variables related to water quality is collected in Yongdam reservoir. The similarity among sampling sites is measured with Euclidean distances of water quality related variables and they are classified by hierarchical clustering method. The clustered sites are discussed with principal component variables in the view of the geographical characteristics of them and reducing the number of measuring sites. Nine sampling sites are clustered as follows; One cluster of 5, 6, and 7 sampling sites shows the characteristic of low water depth and main stream of water. The sites of 2 and 4 are clustered into the same group by characteristics of hydraulics which come from that of main stream. But their changing pattern of water quality looks like different since the site of 2 is near to dam. The sampling sites of 3, 8, and 9 are individually positioned due to the different tributary.

A Study on Measuring the Similarity Among Sampling Sites in Lake (저수지 수질조사 지점간 유사성 분석)

  • Lee, Yo-Sang;Koh, Deuk-Koo;Lee, Hyun-Seok
    • Proceedings of the Korea Water Resources Association Conference
    • /
    • 2010.05a
    • /
    • pp.957-961
    • /
    • 2010
  • Multivariate statistical approaches to classify sampling sites with measuring their similarity by water quality data. For empirical study, data of two years at the 9 sampling sites with the combination of 2 depth levels and 7 important variables related to water quality is collected in reservoir. The similarity among sampling sites is measured with Euclidean distances of water quality related variables and they are classified by hierarchical clustering method. The clustered sites are discussed with principal component variables in the view of the geographical characteristics of them and reducing the number of measuring sites. Nine sampling sites are clustered as follows; One cluster of 5, 6, and 7 sampling sites shows the characteristic of low water depth and main stream of water. The sites of 2 and 4 are clustered into the same group by characteristics of hydraulics which come from that of main stream. But their changing pattern of water quality looks like different since the site of 2 is near to dam. The sampling sites of 3, 8, and 9 are individually positioned due to the different tributary.

  • PDF

A Sampling-based Algorithm for Top-${\kappa}$ Similarity Joins (Top-${\kappa}$ 유사도 조인을 위한 샘플링 기반 알고리즘)

  • Park, Jong Soo
    • Journal of KIISE:Databases
    • /
    • v.41 no.4
    • /
    • pp.256-261
    • /
    • 2014
  • The problem of top-${\kappa}$ set similarity joins finds the top-${\kappa}$ pairs of records ranked by their similarities between two sets of input records. We propose an efficient algorithm to return top-${\kappa}$ similarity join pairs using a sampling technique. From a sample of the input records, we construct a histogram of set similarity joins, and then compute an estimated similarity threshold in the histogram for top-${\kappa}$ join pairs within the error bound of 95% confidence level based on statistical inference. Finally, the estimated threshold is applied to the traditional similarity join algorithm which uses the min-heap structure to get top-${\kappa}$ similarity joins. The experimental results show the good performance of the proposed algorithm on large real datasets.

Comparison of terrestrial insect communities associated with the crabgrass (Digitaria ciliaris) community, Korea

  • Jeong Ho Hwang;Jong-Hak Yun
    • Journal of Ecology and Environment
    • /
    • v.47 no.4
    • /
    • pp.250-260
    • /
    • 2023
  • Background: Crabgrass (Digitaria ciliaris, Poaceae) is a globally distributed weed, including in Afro-Eurasia, America, and Australia. As a highly gregarious plant, crabgrass is an important habitat for a diverse array of insects, and a potential habitat for agricultural pests. To compare the insect communities associated with the crabgrass community, insects were sampled using sweep sampling (100 sweeps per sample) at five sites, including Daejeon (Daejeon and Gap rivers), Anseong, Namhae, and Inje, with a focus on the Daejeon River. Results: A total of 5,888 individual insects belonging to eight orders, 42 families, and 115 species were collected from the five sites. Both the number of species and individuals of Hemiptera were the highest at all of the sites. In the present study, 73% of the insect population fed on D. ciliaris as a host plant. The dominant species in the D. ciliaris community was Laodelphax striatellus (Delphacidae), being ubiquitous at all the sites which showed a high abundance of rice pests in the communities and the suitability of D. ciliaris as an alternative host plant for them. The Shannon-Wiener diversity index was highest in Inje on 17 September (2.88), and the Chao1-bc diversity index was highest in the Gap River on 5 September (80). The sampling efficiency of 100 sweep samples (sample coverage) was calculated to be as high as 90%. The results of the samples taken from September to November in the Daejeon River showed that the number of species and individuals decreased gradually over time, and the number of dominant species decreased sharply between September and October. Similarity analysis indicated that sampling dates that were closer together yielded sampled assemblages with higher faunal similarity. In addition, in each sampling, the difference in the minimum temperature during the two-week period prior to sampling and faunal similarities were negatively correlated. Conclusions: This study provides foundational data that could enhance our understanding of insect diversity in D. ciliaris. The data can facilitate ecological conservation and management of Korean grasslands generally, as well as identification of potential pests that may disperse from D. ciliaris communities to nearby farmland.

Improvement of ASIFT for Object Matching Based on Optimized Random Sampling

  • Phan, Dung;Kim, Soo Hyung;Na, In Seop
    • International Journal of Contents
    • /
    • v.9 no.2
    • /
    • pp.1-7
    • /
    • 2013
  • This paper proposes an efficient matching algorithm based on ASIFT (Affine Scale-Invariant Feature Transform) which is fully invariant to affine transformation. In our approach, we proposed a method of reducing similar measure matching cost and the number of outliers. First, we combined the Manhattan and Chessboard metrics replacing the Euclidean metric by a linear combination for measuring the similarity of keypoints. These two metrics are simple but really efficient. Using our method the computation time for matching step was saved and also the number of correct matches was increased. By applying an Optimized Random Sampling Algorithm (ORSA), we can remove most of the outlier matches to make the result meaningful. This method was experimented on various combinations of affine transform. The experimental result shows that our method is superior to SIFT and ASIFT.

An Empirical Study on Improvement model for Measuring of Project Similarity (과제 유사도 측정 개선모형에 관한 실증적 연구)

  • Jung, Ok-Nam;Rhew, Sung-Yul;Kim, Jong-Bae
    • Journal of Digital Contents Society
    • /
    • v.12 no.4
    • /
    • pp.457-465
    • /
    • 2011
  • The annual R&D investment in Korea increased by an average of 12.2percent during the last 5 years. Therefore, prevention of duplicate projects being performed became an important factor in promoting the efficiency of R&D investment and the originality of R&D projects. On measuring the similarity of projects, the measurement model used to estimate the accuracy of the similarity is crucial. In this paper, we propose an advanced measurement model on checking the similarity of R&D projects for promoting the efficiency of R&D investment. The proposed model is made up of the following steps for the model measurement, sampling and analyzing. During the sampling step, we append the abstract of R&D reports on the search engine based on document vector. We then measure the similarity on projects to use research title network which is consists of the compound keyword and the weight of items on during the analysis. The proposed method improved the accuracy for measuring the similarity of projects by an average of 0.19 over the existing search engine and by 9.25 over the simple keyword search on R&D projects. On searching the similarity with the appending conditions and high sampling, it improved the accuracy of measuring the similarity of R&D projects.

Similarity of Sampling Sites by Water Quality (수질 관측지점 유사성 측정방법 연구)

  • Kwon, Se-Hyug;Lee, Yo-Sang
    • Communications for Statistical Applications and Methods
    • /
    • v.17 no.1
    • /
    • pp.39-45
    • /
    • 2010
  • As the value of environment is increasing, the water quality has been a matter of interest to the nation and people. Research on water quality has been widely studied, but focused on geographical characteristic and river characteristics like inflow, outflow, quantity and speed of water. In this paper, two approaches to measure the similarity of sampling sites by using water quality data are discussed and compared with two-years empirical data of Yongdam-Dam. The existing method has calculated their similarities with principal component scores. The proposed approach in this paper use correlation matrix of water quality related variables and MDS for measuring the similarity, which is shown to be better in the sense of being clustering which is identical to geographical clustering since it can consider the time series pattern of water quality.

Design of Solving Similarity Recognition for Cloth Products Based on Fuzzy Logic and Particle Swarm Optimization Algorithm

  • Chang, Bae-Muu
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.11 no.10
    • /
    • pp.4987-5005
    • /
    • 2017
  • This paper introduces a new method to solve Similarity Recognition for Cloth Products, which is based on Fuzzy logic and Particle swarm optimization algorithm. For convenience, it is called the SRCPFP method hereafter. In this paper, the SRCPFP method combines Fuzzy Logic (FL) and Particle Swarm Optimization (PSO) algorithm to solve similarity recognition for cloth products. First, it establishes three features, length, thickness, and temperature resistance, respectively, for each cloth product. Subsequently, these three features are engaged to construct a Fuzzy Inference System (FIS) which can find out the similarity between a query cloth and each sampling cloth in the cloth database D. At the same time, the FIS integrated with the PSO algorithm can effectively search for near optimal parameters of membership functions in eight fuzzy rules of the FIS for the above similarities. Finally, experimental results represent that the SRCPFP method can realize a satisfying recognition performance and outperform other well-known methods for similarity recognition under considerations here.

Mining Clusters of Sequence Data using Sequence Element-based Similarity Measure (시퀀스 요소 기반의 유사도를 이용한 시퀀스 데이터 클러스터링)

  • 오승준;김재련
    • Proceedings of the Korea Inteligent Information System Society Conference
    • /
    • 2004.11a
    • /
    • pp.221-229
    • /
    • 2004
  • Recently, there has been enormous growth in the amount of commercial and scientific data, such as protein sequences, retail transactions, and web-logs. Such datasets consist of sequence data that have an inherent sequential nature. However, only a few of the existing clustering algorithms consider sequentiality. This study presents a method for clustering such sequence datasets. The similarity between sequences must be decided before clustering the sequences. This study proposes a new similarity measure to compute the similarity between two sequences using a sequence element. Two clustering algorithms using the proposed similarity measure are proposed: a hierarchical clustering algorithm and a scalable clustering algorithm that uses sampling and a k-nearest neighbor method. Using a splice dataset and synthetic datasets, we show that the quality of clusters generated by our proposed clustering algorithms is better than that of clusters produced by traditional clustering algorithms.

  • PDF

Neural-Network and Log-Polar Sampling Based Associative Pattern Recognizer for Aircraft Images (신경 회로망과 Log-Polar Sampling 기법을 사용한 항공기 영상의 연상 연식)

  • 김종오;김인철;진성일
    • Journal of the Korean Institute of Telematics and Electronics B
    • /
    • v.28B no.12
    • /
    • pp.59-67
    • /
    • 1991
  • In this paper, we aimed to develop associative pattern recognizer based on neural network for aircraft identification. For obtaining invariant feature space description of an object regardless of its scale change and rotation, Log-polar sampling technique recently developed partly due to its similarity to the human visual system was introduced with Fourier transform post-processing. In addition to the recognition results, image recall was associatively performed and also used for the visualization of the recognition reliability. The multilayer perceptron model was learned by backpropagation algorithm.

  • PDF