• Title/Summary/Keyword: Distributed Clustering

Search Result 222, Processing Time 0.021 seconds

RAPD Polymorphism and Genetic Distance among Phenotypic Variants of Tamarindus indica

  • Mayavel, A;Vikashini, B;Bhuvanam, S;Shanthi, A;Kamalakannan, R;Kim, Ki-Won;Kang, Kyu-Suk
    • Journal of Korean Society of Forest Science / v.109 no.4 / pp.421-428 / 2020
  • Tamarind (Tamarindus indica L.) is one of the multipurpose tree species distributed in tropical and sub-tropical climates. It is an important fruit-yielding tree that supports livelihoods and has high social and cultural value for rural communities. The vegetative, reproductive, qualitative, and quantitative traits of tamarind vary widely. Characterization of phenotypic and genetic structure is essential for the selection of suitable accessions for sustainable cultivation and conservation. This study aimed to examine the genetic relationship among the collected accessions of sweet, red, and sour tamarind by using Random Amplified Polymorphic DNA (RAPD) primers. Nine accessions were collected from germplasm gene banks and subjected to marker analysis. Fifteen highly polymorphic primers generated a total of 169 fragments, of which 138 bands were polymorphic. The polymorphic information content of the RAPD markers varied from 0.10 to 0.44, and the Jaccard's similarity coefficient values ranged from 0.37 to 0.70. The genetic clustering showed sizable genetic variation among the tamarind accessions at the molecular level. The molecular and biochemical variations in the selected accessions are very important for developing varieties with high sugar, anthocyanin, and acidity traits in the ongoing tamarind improvement program.
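
The Jaccard similarity coefficient mentioned in the abstract can be illustrated with a minimal sketch. It assumes each accession is scored as a 0/1 profile (band absent/present) over the same set of RAPD bands; the band scores and accession labels below are hypothetical and are not data from the study.

```python
def jaccard_similarity(a, b):
    """Jaccard coefficient between two 0/1 band profiles of equal length."""
    if len(a) != len(b):
        raise ValueError("profiles must score the same set of bands")
    # Bands present in both accessions.
    shared = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    # Bands present in at least one accession (shared absences are ignored).
    either = sum(1 for x, y in zip(a, b) if x == 1 or y == 1)
    return shared / either if either else 0.0


# Hypothetical band scores (1 = band present) for three accessions.
profiles = {
    "sweet": [1, 1, 0, 1, 0, 1, 1, 0],
    "red":   [1, 0, 0, 1, 1, 1, 0, 0],
    "sour":  [0, 1, 1, 1, 0, 0, 1, 1],
}

names = sorted(profiles)
for i, first in enumerate(names):
    for second in names[i + 1:]:
        score = jaccard_similarity(profiles[first], profiles[second])
        print(f"{first} vs {second}: {score:.2f}")
```

A pairwise matrix of such coefficients is the usual input to a hierarchical clustering step; the abstract does not say which clustering method was used.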

Cluster of Parasite Infections by the Spatial Scan Analysis in Korea

  • Bae, Kyoung-Eun;Chang, Yoon Kyung;Kim, Tong-Soo;Hong, Sung-Jong;Ahn, Hye-Jin;Nam, Ho-Woo;Kim, Dongjae
    • Parasites, Hosts and Diseases / v.58 no.6 / pp.603-608 / 2020
  • This study was performed to identify clusters with high parasite infection risk and to discuss their geographical pattern. Clusters were detected using SaTScan software, a statistical spatial scan program based on Kulldorff's scan statistic. Information on parasitic infection cases in Korea from 2011 to 2019 was collected from the Korea Centers for Disease Control and Prevention. Clusters of Ascaris lumbricoides infection were detected in Jeollabuk-do, and clusters of T. trichiura in Ulsan, Busan, and Gyeongsangnam-do. C. sinensis clusters were detected in Ulsan, Daegu, Busan, Gyeongsangnam-do, and Gyeongsangbuk-do. Clusters of intestinal trematodes were detected in Ulsan, Busan, and Gyeongsangnam-do. A P. westermani cluster was found in Jeollabuk-do. E. vermicularis clusters were distributed in Gangwon-do, Jeju-do, Daegu, Daejeon, and Gwangju. This clustering information can serve as a reference for surveillance and control of parasitic infection outbreaks in infection-prone areas.
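
As a rough illustration of the idea behind Kulldorff's scan statistic, the sketch below scores candidate zones with the Poisson log-likelihood ratio. It is a simplification under stated assumptions: each candidate zone is a single region, whereas SaTScan scans windows of varying size and assesses significance by Monte Carlo replication, which is omitted here. The region names, populations, and case counts are hypothetical.

```python
import math


def poisson_llr(cases_in, expected_in, total_cases):
    """Poisson log-likelihood ratio for one candidate zone."""
    c, e, total = cases_in, expected_in, total_cases
    if c <= e:
        return 0.0  # only zones with more cases than expected are scored
    inside = c * math.log(c / e)
    outside = (total - c) * math.log((total - c) / (total - e)) if total > c else 0.0
    return inside + outside


# Hypothetical regions: name -> (population, observed cases).
regions = {"A": (100_000, 12), "B": (250_000, 20), "C": (150_000, 40)}

total_pop = sum(pop for pop, _ in regions.values())
total_cases = sum(cases for _, cases in regions.values())

scores = {}
for name, (pop, cases) in regions.items():
    # Expected cases under the null hypothesis of uniform risk.
    expected = total_cases * pop / total_pop
    scores[name] = poisson_llr(cases, expected, total_cases)

most_likely_cluster = max(scores, key=scores.get)
print(most_likely_cluster, round(scores[most_likely_cluster], 2))
```

The zone with the largest ratio is the candidate "most likely cluster"; whether it is statistically significant would have to be judged against the ratios obtained from randomized data sets.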

Efficient Illegal Contents Detection and Attacker Profiling in Real Environments

  • Kim, Jin-gang;Lim, Sueng-bum;Lee, Tae-jin
    • KSII Transactions on Internet and Information Systems (TIIS) / v.16 no.6 / pp.2115-2130 / 2022
  • With the development of over-the-top (OTT) services, the demand for content is increasing, and users can easily and conveniently acquire various content online. As a result, copyrighted content can be easily copied and distributed, resulting in serious copyright infringement. Some special forms of online service providers (OSP) use filtering-based technologies to protect copyrights, but illegal uploaders use methods that bypass traditional filters. Content uploaded under a title that bypasses the filter cannot be detected with a similarity search. In this paper, we propose a technique for profiling heavy uploaders by normalizing the bypassed content titles and efficiently detecting illegal content. First, words are extracted from the normalized title and converted into a bit array to detect illegal works. This Bloom filter method produces false positives but no false negatives. The false positive rate has a trade-off relationship with processing performance: as the false positive rate increases, the processing performance increases, and as the false positive rate decreases, the processing performance decreases. We increased the detection rate by directly comparing the words against the results of a Bloom filter configured with a higher false positive rate. The processing time was also as fast as when the false positive rate was increased. Afterwards, we create a function that includes information about overall piracy and identify heavy uploaders through clustering. We then analyze the behavior of heavy uploaders to find the first uploader and detect the source site.
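
The two-stage check described in the abstract (a fast Bloom filter lookup followed by a direct word comparison that removes false positives) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the normalization rule, the hash construction, the filter size, and the word list are all assumptions.

```python
import hashlib
import re


class BloomFilter:
    """Small Bloom filter: may report false positives, never false negatives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for clarity

    def _positions(self, word):
        # Derive several bit positions from seeded SHA-256 digests.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{word}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, word):
        for pos in self._positions(word):
            self.bits[pos] = 1

    def might_contain(self, word):
        return all(self.bits[pos] for pos in self._positions(word))


def normalize_title(title):
    """Hypothetical normalization: lowercase, split on punctuation and underscores."""
    return [w for w in re.split(r"[\W_]+", title.lower()) if w]


def flagged_words(title, bloom, protected_words):
    """Stage 1: Bloom filter lookup. Stage 2: exact comparison to drop false positives."""
    return [w for w in normalize_title(title)
            if bloom.might_contain(w) and w in protected_words]


# Hypothetical list of words taken from protected titles.
protected_words = {"some", "movie"}
bloom = BloomFilter()
for word in protected_words:
    bloom.add(word)

print(flagged_words("S0me.Movie_2022-[HD]", bloom, protected_words))
```

A smaller bit array makes the first stage cheaper but lets more non-members through, which is why the exact comparison in the second stage matters.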

Analysis of the genetic diversity and population structure of Lindera obtusiloba (Lauraceae), a dioecious tree in Korea

  • Ho Bang Kim;Hye-Young Lee;Mi Sun Lee;Yi Lee;Youngtae Choi;Sung-Yeol Kim;Jaeyong Choi
    • Journal of Plant Biotechnology / v.50 / pp.207-214 / 2023
  • Lindera obtusiloba (Lauraceae) is a dioecious tree that is widely distributed in the low-altitude montane forests of East Asia, including Korea. Despite its various pharmacological properties and ornamental value, the genetic diversity and population structure of this species in Korea have not been explored. In this study, we selected 6 nuclear and 6 chloroplast microsatellite markers with polymorphism or clean cross-amplification and used these markers to perform genetic diversity and population structure analyses of L. obtusiloba samples collected from 20 geographical regions. Using these 12 markers, we identified a total of 44 alleles, ranging from 1 to 8 per locus, and the average observed and expected heterozygosity values were 0.11 and 0.44, respectively. The average polymorphism information content was 0.39. Genetic relationship and population structure analyses revealed that the natural L. obtusiloba population in Korea is composed of 2 clusters, possibly due to two different plastid genotypes. The same clustering patterns have also been observed in Lindera species in mainland China and Japan.
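
The summary statistics reported in the abstract (observed heterozygosity, expected heterozygosity, and polymorphism information content) can be computed per locus as in the sketch below. It assumes a diploid nuclear microsatellite locus and uses the standard formulas, expected heterozygosity as 1 minus the sum of squared allele frequencies and PIC as defined by Botstein et al.; the genotypes are hypothetical, and haploid chloroplast markers would need a different treatment.

```python
from collections import Counter
from itertools import combinations


def locus_stats(genotypes):
    """Return (Ho, He, PIC) for one diploid locus.

    genotypes: list of (allele1, allele2) tuples, one per individual.
    """
    allele_counts = Counter(a for genotype in genotypes for a in genotype)
    total = sum(allele_counts.values())
    freqs = [n / total for n in allele_counts.values()]

    # Observed heterozygosity: share of individuals carrying two different alleles.
    ho = sum(1 for a, b in genotypes if a != b) / len(genotypes)
    # Expected heterozygosity under random mating.
    he = 1.0 - sum(p * p for p in freqs)
    # PIC subtracts the uninformative heterozygote-by-heterozygote matings.
    pic = he - sum(2 * p * p * q * q for p, q in combinations(freqs, 2))
    return ho, he, pic


# Hypothetical allele sizes (bp): two alleles, but no heterozygous individuals.
samples = [(150, 150), (150, 150), (154, 154), (154, 154)]
ho, he, pic = locus_stats(samples)
print(f"Ho={ho:.2f} He={he:.2f} PIC={pic:.3f}")
```

An observed heterozygosity well below the expected value, as in this toy locus, is the same qualitative pattern as the averages of 0.11 and 0.44 reported in the abstract.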

Anomalous Pattern Analysis of Large-Scale Logs with Spark Cluster Environment

  • Sion Min;Youyang Kim;Byungchul Tak
    • Journal of the Korea Society of Computer and Information / v.29 no.3 / pp.127-136 / 2024
  • This study explores the correlation between system anomalies and large-scale logs within the Spark cluster environment. While research on anomaly detection using logs is growing, there remains a limitation in adequately leveraging logs from various components of the cluster and considering the relationship between anomalies and the system. Therefore, this paper analyzes the distribution of normal and abnormal logs and explores the potential for anomaly detection based on the occurrence of log templates. By employing Hadoop and Spark, normal and abnormal log data are generated, and through t-SNE and K-means clustering, templates of abnormal logs in anomalous situations are identified to comprehend anomalies. Ultimately, unique log templates occurring only during abnormal situations are identified, thereby presenting the potential for anomaly detection.
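
The final step described in the abstract, finding log templates that occur only in abnormal situations, can be sketched with the standard library alone. This does not reproduce the t-SNE and K-means analysis from the paper; the regex-based templating is a crude stand-in for a real log parser, and the log lines are hypothetical.

```python
import re
from collections import Counter


def to_template(line):
    """Crude templating: replace number-like tokens with a <*> wildcard."""
    return re.sub(r"\b\d+(\.\d+)*\b", "<*>", line.strip())


def template_counts(lines):
    return Counter(to_template(line) for line in lines)


def abnormal_only(normal_lines, abnormal_lines):
    """Templates (with counts) seen in abnormal runs but never in normal runs."""
    normal = template_counts(normal_lines)
    abnormal = template_counts(abnormal_lines)
    return {t: n for t, n in abnormal.items() if t not in normal}


# Hypothetical log lines from a normal run and an anomalous run.
normal_logs = [
    "Task 12 finished in 340 ms",
    "Task 13 finished in 298 ms",
]
abnormal_logs = [
    "Task 14 finished in 9120 ms",
    "Executor 3 lost: heartbeat timed out after 120000 ms",
    "Executor 5 lost: heartbeat timed out after 120000 ms",
]

for template, count in abnormal_only(normal_logs, abnormal_logs).items():
    print(count, template)
```

Templates that survive this filter are candidate anomaly indicators; in practice they would still need to be checked across many runs before being used for detection.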

Segmentation of Multispectral MRI Using Fuzzy Clustering (퍼지 클러스터링을 이용한 다중 스펙트럼 자기공명영상의 분할)

  • 윤옥경;김현순;곽동민;김범수;김동휘;변우목;박길흠
    • Journal of Biomedical Engineering Research / v.21 no.4 / pp.333-338 / 2000
  • In this paper, an automated segmentation algorithm is proposed for MR brain images that uses T1-weighted, T2-weighted, and PD images complementarily. The proposed segmentation algorithm consists of three steps. In the first step, cerebrum images are extracted by applying a cerebrum mask to the three input images. In the second step, outstanding clusters that represent the inner tissues of the cerebrum are chosen among 3-dimensional (3D) clusters. The 3D clusters are determined by intersecting the densely distributed parts of the 2D histograms in the 3D space formed by three optimal scale images. An optimal scale image is obtained by applying scale-space filtering to each 2D histogram and searching its graph structure; it best describes the shape of the densely distributed parts of pixels in the 2D histogram. In the final step, the cerebrum images are segmented using the FCM algorithm, with the centroids of the outstanding clusters as its initial centroid values. The proposed method determines the cluster centroids accurately, and multispectral analysis with the proposed algorithm yields better segmentation results than single-spectral analysis.
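
The final step, fuzzy c-means (FCM) started from given centroids, can be illustrated with a minimal one-dimensional sketch. The paper works on three-channel MR intensities and derives the initial centroids from histogram analysis; here the data are hypothetical scalar intensities, the initial centroids are simply supplied by hand, and the fuzzifier m = 2 is an assumption.

```python
def fcm_1d(values, initial_centroids, m=2.0, iterations=50, eps=1e-9):
    """Fuzzy c-means on scalar data, started from the given centroids."""
    centroids = list(initial_centroids)
    power = 2.0 / (m - 1.0)
    memberships = []
    for _ in range(iterations):
        # Membership update: u_k(x) = 1 / sum_j (d_k / d_j)^(2 / (m - 1)).
        memberships = []
        for x in values:
            dists = [abs(x - c) + eps for c in centroids]  # eps avoids division by zero
            memberships.append(
                [1.0 / sum((d_k / d_j) ** power for d_j in dists) for d_k in dists]
            )
        # Centroid update: mean of the data weighted by membership^m.
        for k in range(len(centroids)):
            weights = [row[k] ** m for row in memberships]
            centroids[k] = sum(w * x for w, x in zip(weights, values)) / sum(weights)
    # Memberships are those computed in the last pass, before the final centroid update.
    return centroids, memberships


# Hypothetical pixel intensities forming three tissue-like groups.
intensities = [10, 12, 11, 50, 52, 49, 90, 91, 88]
centroids, memberships = fcm_1d(intensities, [0.0, 45.0, 100.0])
print([round(c, 1) for c in centroids])
```

Each pixel receives a degree of membership in every cluster rather than a hard label; a hard segmentation can be obtained afterwards by assigning each pixel to its highest-membership cluster.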

A Study on Price Volatility and Properties of Time-series for the Tangerine Price in Jeju (제주지역 감귤가격의 시계열적 특성 및 가격변동성에 관한 연구)

  • Ko, Bong-Hyun
    • Journal of the Korea Academia-Industrial cooperation Society / v.21 no.6 / pp.212-217 / 2020
  • The purpose of this study was to analyze the volatility and time-series properties of tangerine prices in Jeju using the GARCH model of Bollerslev (1986). First, it was found that the time series for the rate of change in tangerine prices had thicker tails than a normal distribution. At a significance level of 1%, the Jarque-Bera statistic led to a rejection of the null hypothesis that the time series for the rate of change in tangerine prices is normally distributed. Second, the correlation between the time series was high based on the Ljung-Box Q statistic, which was statistically verified through the ARCH-LM test. Third, the GARCH(1,1) model estimates were statistically significant at a significance level of 1%, except for the constant of the mean equation. The persistence parameter of the variance equation was estimated to be close to 1, which means that there is a high possibility that a similar level of volatility will persist in the future. Finally, it is expected that the results of this study can be used as basic data to optimize the government's tangerine supply and demand control policy.
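
The role of the persistence parameter in a GARCH(1,1) variance equation can be made concrete with a short sketch. It uses the standard recursion in which the conditional variance at time t equals omega + alpha * (previous residual squared) + beta * (previous conditional variance), with persistence alpha + beta. The parameter values and residuals are hypothetical and are not the estimates from the study.

```python
import math


def garch11_variance(residuals, omega, alpha, beta):
    """Conditional variances: sigma2[t] = omega + alpha * e[t-1]**2 + beta * sigma2[t-1]."""
    persistence = alpha + beta
    if not 0.0 <= persistence < 1.0:
        raise ValueError("need 0 <= alpha + beta < 1 for a finite long-run variance")
    sigma2 = [omega / (1.0 - persistence)]  # start at the unconditional variance
    for e in residuals[:-1]:
        sigma2.append(omega + alpha * e * e + beta * sigma2[-1])
    return sigma2


def shock_half_life(alpha, beta):
    """Number of periods until a volatility shock decays to half its size."""
    return math.log(0.5) / math.log(alpha + beta)


# Hypothetical parameters with persistence close to 1, and hypothetical residuals.
omega, alpha, beta = 0.02, 0.08, 0.90
residuals = [0.1, -2.5, 0.3, 0.2, -0.1]

for t, variance in enumerate(garch11_variance(residuals, omega, alpha, beta)):
    print(t, round(variance, 3))
print("half-life (periods):", round(shock_half_life(alpha, beta), 1))
```

The closer alpha + beta is to 1, the longer a large price shock keeps the conditional variance elevated, which is the sense in which the abstract expects similar volatility to continue.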

The Changes in the Quality of Life Measure of the Seoul Metropolitan Area (수도권 삶의 질 지수 변동에 관한 연구)

  • Lee, Se-Hyung;Chang, Hoon;Rho, Jin-A
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography / v.29 no.1 / pp.29-37 / 2011
  • The purpose of this research is to measure Quality of Life (QoL) indices using Factor Analysis and Principal Component Analysis and to analyze the spatial patterns of the QoL distribution in the Seoul Metropolitan Area in terms of spatial association, using spatial statistics and exploratory spatial techniques. To check the degree of clustering, this study used a spatial autocorrelation index, the global Moran's I. In addition, local-scale analysis was conducted using the Moran scatterplot and Local Moran's I to identify the spatial association pattern and the areas with high quality of life. The analysis based on global statistics showed that QoL indices in the Seoul Metropolitan Area were distributed with positive spatial association. According to the local spatial statistics, H-H clusters were mainly concentrated in Seoul, L-H clusters were concentrated in Kyunggi-Do, and L-L clusters marked the extent of lagging regions. However, the H-H and L-H clusters spread into the new towns as the population increased.
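
The global Moran's I and the Moran scatterplot quadrants (H-H, L-H, L-L, H-L) referred to in the abstract can be computed as in the sketch below. It assumes a binary adjacency weight matrix and omits the permutation-based significance test normally used with Local Moran's I; the four areas and their index values are hypothetical.

```python
def morans_i(values, weights):
    """Global Moran's I for values observed on areas with spatial weights w[i][j]."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)  # sum of all weights
    cross = sum(weights[i][j] * z[i] * z[j] for i in range(n) for j in range(n))
    return (n / s0) * cross / sum(d * d for d in z)


def moran_quadrants(values, weights):
    """Label each area by its own deviation and its neighbours' mean deviation."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    labels = []
    for i in range(n):
        row_sum = sum(weights[i])
        lag = sum(weights[i][j] * z[j] for j in range(n)) / row_sum if row_sum else 0.0
        labels.append(("H" if z[i] > 0 else "L") + "-" + ("H" if lag > 0 else "L"))
    return labels


# Hypothetical QoL index for four areas arranged in a row (0-1-2-3).
qol = [10.0, 9.0, 2.0, 1.0]
adjacency = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

print(round(morans_i(qol, adjacency), 3))
print(moran_quadrants(qol, adjacency))
```

A positive global value indicates that similar index values tend to be neighbours, while the quadrant labels show where the high and low groupings sit.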

Development of an SNP set for marker-assisted breeding based on the genotyping-by-sequencing of elite inbred lines in watermelon (수박 엘리트 계통의 GBS를 통한 마커이용 육종용 SNP 마커 개발)

  • Lee, Junewoo;Son, Beunggu;Choi, Youngwhan;Kang, Jumsoon;Lee, Youngjae;Je, Byoung Il;Park, Younghoon
    • Journal of Plant Biotechnology / v.45 no.3 / pp.242-249 / 2018
  • This study was conducted to develop an SNP set that can be useful for marker-assisted breeding (MAB) in watermelon (Citrullus lanatus L.) using genotyping-by-sequencing (GBS) analysis of 20 commercial elite watermelon inbreds. The GBS results showed that 77% of approximately 1.1 billion raw reads were mapped on the watermelon genome with an average mapping region of about 4,000 Kb, which indicated genome coverage of 2.3%. After the filtering process, a total of 2,670 SNPs with an average depth of 31.57 and PIC (Polymorphic Information Content) values of 0.1 to 0.38 for the 20 elite inbreds were obtained. Among those SNPs, 55 SNPs (5 SNPs per chromosome, equally distributed on each chromosome) were selected. To understand the genetic relationships among the 20 elite inbreds, PCA (Principal Component Analysis) was carried out with the 55 SNPs, which classified the inbreds into 4 groups based on PC1 (52%) and PC2 (11%), thus differentiating the inbreds. A similar classification pattern was observed from hierarchical clustering analysis. The SNP set developed in this study has the potential for application to cultivar identification, F1 seed purity tests, and marker-assisted backcrossing (MABC), not only for the 20 elite inbreds but also for diverse resources for watermelon breeding.

The Model of Network Packet Analysis based on Big Data (빅 데이터 기반의 네트워크 패킷 분석 모델)

  • Choi, Bomin;Kong, Jong-Hwan;Han, Myung-Mook
    • Journal of the Korean Institute of Intelligent Systems / v.23 no.5 / pp.392-399 / 2013
  • With the development of information technology and the arrival of the information age, our dependence on networks has grown across most areas of life. Although networks give us access to a variety of useful information and services, they also have the negative effect of exposing vulnerabilities to network intruders. In other words, we urgently need to cope with serious security problems in which attackers exploit packet information to disable services or disrupt network-connected systems. Many security experts are working to develop security solutions that respond to these threats, but existing solutions suffer from problems such as insufficient storage capacity and performance degradation as the volume of packet data grows massively. Therefore, we propose a packet analysis model that applies Big Data technology to the security field. Specifically, we used NoSQL, a technology for massive data storage, to collect the rapidly growing packet data, and implemented a packet analysis model based on K-means clustering using MapReduce, a distributed programming framework. Experiments showed that the model achieves high performance.
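
How K-means maps onto the MapReduce pattern can be sketched in plain Python: the map step emits a (nearest centroid, point) pair for every point, and the reduce step averages the points collected under each key. This runs in a single process and only imitates the data flow; it is not the authors' Hadoop implementation, and the packet features (packet length in bytes, inter-arrival time in ms) are hypothetical.

```python
from collections import defaultdict


def nearest(point, centroids):
    """Index of the centroid closest to the point (squared Euclidean distance)."""
    return min(
        range(len(centroids)),
        key=lambda k: sum((p - c) ** 2 for p, c in zip(point, centroids[k])),
    )


def map_phase(points, centroids):
    """Map: emit (cluster index, point) for every input record."""
    for point in points:
        yield nearest(point, centroids), point


def reduce_phase(pairs, centroids):
    """Reduce: average the points that share a cluster index."""
    groups = defaultdict(list)
    for key, point in pairs:  # stands in for the shuffle / group-by-key step
        groups[key].append(point)
    new_centroids = list(centroids)  # clusters with no points keep their centroid
    for key, members in groups.items():
        new_centroids[key] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return new_centroids


def kmeans(points, centroids, iterations=10):
    """Each iteration corresponds to one map/reduce job."""
    for _ in range(iterations):
        centroids = reduce_phase(map_phase(points, centroids), centroids)
    return centroids


# Hypothetical packet features: (length in bytes, inter-arrival time in ms).
packets = [(60, 1), (64, 2), (62, 1), (1500, 40), (1480, 42), (1490, 38)]
final_centroids = kmeans(packets, [(0.0, 0.0), (1000.0, 50.0)])
print(final_centroids)
```

In a real cluster the map tasks run in parallel over partitions of the packet store, and only per-cluster sums and counts need to cross the network, which is what makes the algorithm a natural fit for this framework.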