• Title/Summary/Keyword: Data Clustering

Network Analysis of Herbs that are Frequently Prescribed for Osteoporosis with a Focus on Oasis Platform Research (골다공증 다빈도 처방과 구성 약물의 네트워크 분석 - 오아시스 검색을 중심으로)

  • Shin, Seon-mi;Ko, Heung
    • The Journal of Internal Korean Medicine
    • /
    • v.42 no.4
    • /
    • pp.628-644
    • /
    • 2021
  • Objectives: This study used network analysis and data mining to examine the relationships among herbs used in osteoporosis prescriptions, to diversify the analysis of osteoporosis-related prescriptions, and to analyze the herb combinations used in them. Methods: The prescriptions used in osteoporosis treatment and experiments were identified through a full survey of the papers published on the OASIS site. A database of osteoporosis-related prescriptions was established, herbs were extracted, and the frequencies of herbs and prescriptions were investigated using Excel (MS Office ver. 2013). Using the freeware R version 4.0.3 (2020-10-10) with the igraph and arules packages, network analysis was performed in the first second of prescription composition. Results: Among the osteoporosis-related prescriptions, the most studied were Yukmijihwang-tang (六味地黃湯) and Samul-tang (四物湯). In the osteoporosis prescription network, the herbs ranked by degree centrality, closeness centrality, betweenness centrality, and eigenvector centrality appeared in the order of Rehmanniae Radix Preparata, Angelicae Gigantis Radix, Poria Sclerotium, Paeoniae Radix, and Glycyrrhizae Radix et Rhizoma. After the herbal combination network including these herbs was extracted and clustered, it could be divided into herbs of the Yukmijihwang-tang (六味地黃湯) series and the Samul-tang (四物湯) series. Conclusions: This study could assist researchers in diversifying formula analysis in future studies. Moreover, the herbal combinations used in osteoporosis prescriptions could be used to search for osteoporosis prescriptions in other databases or to create new prescriptions.
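
As a rough illustration of the network step described in this abstract, the sketch below builds a herb co-occurrence network from prescriptions and computes normalized degree centrality. The study itself used R with igraph; this Python version, the function name `degree_centrality`, and the placeholder herb labels are assumptions made only for illustration, not the authors' code or data.

```python
from itertools import combinations


def degree_centrality(prescriptions):
    """Normalized degree centrality of each herb in a co-occurrence network.

    Two herbs are linked when they appear together in at least one
    prescription. Centrality = (number of distinct neighbors) / (n - 1).
    """
    neighbors = {}
    for herbs in prescriptions:
        unique = set(herbs)
        for herb in unique:
            neighbors.setdefault(herb, set())
        for u, v in combinations(unique, 2):
            neighbors[u].add(v)
            neighbors[v].add(u)
    n = len(neighbors)
    if n < 2:
        return {herb: 0.0 for herb in neighbors}
    return {herb: len(nb) / (n - 1) for herb, nb in neighbors.items()}


# Hypothetical prescriptions with placeholder herb labels (not study data).
toy_prescriptions = [["herb_a", "herb_b", "herb_c"], ["herb_a", "herb_d"]]
centrality = degree_centrality(toy_prescriptions)
for herb, value in sorted(centrality.items(), key=lambda kv: -kv[1]):
    print(herb, round(value, 3))
```

Closeness, betweenness, and eigenvector centrality follow the same pattern of scoring nodes on the co-occurrence graph, but are better taken from a graph library than reimplemented by hand.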

Group-based speaker embeddings for text-independent speaker verification (문장 독립 화자 검증을 위한 그룹기반 화자 임베딩)

  • Jung, Youngmoon;Eom, Youngsik;Lee, Yeonghyeon;Kim, Hoirin
    • The Journal of the Acoustical Society of Korea
    • /
    • v.40 no.5
    • /
    • pp.496-502
    • /
    • 2021
  • Recently, the deep speaker embedding approach has been widely used in text-independent speaker verification, where it shows better performance than the traditional i-vector approach. In this work, to improve the deep speaker embedding approach, we propose a novel method called group-based speaker embedding, which incorporates group information. We cluster all speakers of the training data into a predefined number of groups in an unsupervised manner, so that a fixed-length group embedding represents the corresponding group. A Group Decision Network (GDN) produces the group weights, and an aggregated group embedding is generated as the sum of the group embeddings weighted by the group weights. Finally, we generate a group-based embedding by adding the aggregated group embedding to the deep speaker embedding. In this way, a speaker embedding can reduce the search space of the speaker identity by incorporating group information, and can thereby flexibly represent a significant number of speakers. We conducted experiments using the VoxCeleb1 database to show that our proposed approach can improve on previous approaches.
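
The aggregation described in this abstract (a weighted sum of group embeddings added to the speaker embedding) can be sketched as follows. The abstract does not say how the Group Decision Network normalizes its outputs, so the softmax used here is an assumption, as are the function names and the toy vectors.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of raw scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def group_based_embedding(speaker_emb, group_embs, group_scores):
    """Add a weighted sum of group embeddings to a speaker embedding.

    group_scores stands in for the raw output of a group decision network;
    turning it into weights with softmax is an assumption of this sketch.
    """
    weights = softmax(group_scores)
    dim = len(speaker_emb)
    aggregated = [
        sum(w * g[i] for w, g in zip(weights, group_embs)) for i in range(dim)
    ]
    return [s + a for s, a in zip(speaker_emb, aggregated)]


# Toy example: two groups with equal scores, 2-dimensional embeddings.
result = group_based_embedding([1.0, 1.0], [[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0])
print(result)
```

With equal scores both groups receive weight 0.5, so the aggregated group embedding is the mean of the two group embeddings.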

A Method for the Classification of Water Pollutants using Machine Learning Model with Swimming Activities Videos of Caenorhabditis elegans (예쁜꼬마선충의 수영 행동 영상과 기계학습 모델을 이용한 수질 오염 물질 구분 방법)

  • Kang, Seung-Ho;Jeong, In-Seon;Lim, Hyeong-Seok
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.25 no.7
    • /
    • pp.903-909
    • /
    • 2021
  • Caenorhabditis elegans, whose DNA sequence has been completely identified, is a representative species used in various research fields such as gene functional analysis and animal behavioral research. Meanwhile, many studies have been conducted on bio-monitoring systems that determine whether water is contaminated by using the swimming activities of nematodes. In this paper, we show the possibility of using the swimming activities of C. elegans in the development of a machine learning based bio-monitoring system that identifies the chemicals that cause water pollution. To characterize the swimming activities of a nematode, BLS entropy is computed for the nematode in each frame. BLS entropy profiles, assemblies of these entropies, are then classified into several patterns using clustering algorithms. Finally, these patterns are used to construct data sets. We recorded images of the swimming behavior of nematodes in arenas to which formaldehyde, benzene, and toluene were each added at a concentration of 0.1 ppm, and evaluated the performance of the developed HMM.

Development of Polymorphic Simple Sequence Repeat Markers using High-Throughput Sequencing in Button Mushroom (Agaricus bisporus)

  • Lee, Hwa-Yong;Raveendar, Sebastin;An, Hyejin;Oh, Youn-Lee;Jang, Kab-Yeul;Kong, Won-Sik;Ryu, Hojin;So, Yoon-Sup;Chung, Jong-Wook
    • Mycobiology
    • /
    • v.46 no.4
    • /
    • pp.421-428
    • /
    • 2018
  • The white button mushroom (Agaricus bisporus) is one of the most widely cultivated species of edible mushroom. Despite its economic importance, relatively little is known about the genetic diversity of this species. Illumina paired-end sequencing produced 43,871,558 clean reads, and 69,174 contigs were generated from five offspring. These contigs were subsequently assembled into 57,594 unigenes. The unigenes were annotated against the reference genome, in which 6,559 unigenes were associated with clusters, indicating orthologous genes. Gene ontology classification assigned many unigenes. Based on the genome data of the five offspring, 44 polymorphic simple sequence repeat (SSR) markers were developed. The major allele frequency ranged from 0.42 to 0.92. The number of genotypes and the number of alleles ranged from 1 to 4 and from 2 to 4, respectively. The observed heterozygosity and the expected heterozygosity ranged from 0.00 to 1.00 and from 0.15 to 0.64, respectively. The polymorphic information content value ranged from 0.14 to 0.57. The genetic distances and UPGMA clustering discriminated the offspring strains. The SSR markers developed in this study can be applied in polymorphism analyses of button mushroom and for cultivar discrimination.
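
The marker statistics reported in this abstract can be illustrated with their commonly used textbook definitions: expected heterozygosity He = 1 - Σ p_i² and polymorphic information content PIC = 1 - Σ p_i² - Σ_{i<j} 2 p_i² p_j², where p_i are allele frequencies. The abstract does not state which formulas or software were used, so treating these definitions as the ones applied is an assumption, and the allele frequencies below are made up.

```python
from itertools import combinations


def expected_heterozygosity(freqs):
    """He = 1 - sum(p_i^2) for allele frequencies p_i."""
    return 1.0 - sum(p * p for p in freqs)


def polymorphic_information_content(freqs):
    """PIC = 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2."""
    homozygous = sum(p * p for p in freqs)
    cross = sum(2 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return 1.0 - homozygous - cross


# Hypothetical marker with two equally frequent alleles.
allele_freqs = [0.5, 0.5]
print("major allele frequency:", max(allele_freqs))
print("He:", expected_heterozygosity(allele_freqs))   # 0.5
print("PIC:", polymorphic_information_content(allele_freqs))  # 0.375
```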

Delineation of Rice Productivity Projected via Integration of a Crop Model with Geostationary Satellite Imagery in North Korea

  • Ng, Chi Tim;Ko, Jonghan;Yeom, Jong-min;Jeong, Seungtaek;Jeong, Gwanyong;Choi, Myungin
    • Korean Journal of Remote Sensing
    • /
    • v.35 no.1
    • /
    • pp.57-81
    • /
    • 2019
  • Satellite images can be integrated into a crop model to combine the advantages of each technique for crop monitoring and to compensate for each other's weaknesses, and the combination can be systematically applied to monitoring inaccessible croplands. The objective of this study was to outline the productivity of paddy rice based on simulation of the yield of all paddy fields in North Korea, using a grid crop model combined with optical satellite imagery. The grid GRAMI-rice model was used to simulate paddy rice yields for inaccessible North Korea based on the bidirectional reflectance distribution function-adjusted vegetation indices (VIs) and the solar insolation. VIs and solar insolation for the model simulation were obtained from the Geostationary Ocean Color Imager (GOCI) and the Meteorological Imager (MI) sensors of the Communication Ocean and Meteorological Satellite (COMS). Reanalysis data of air temperature were obtained from the Korea Local Analysis and Prediction System (KLAPS). Study results showed that the yields of paddy rice were reproduced with a statistically significant range of accuracy. The regional characteristics of crops for all of the sites in North Korea were successfully grouped into four clusters through a spatial analysis using the K-means clustering approach. The current study has demonstrated the potential effectiveness of characterizing crop productivity by incorporating satellite images into a crop model, which proved to be a consistent technique for monitoring crop productivity in inaccessible regions.
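
For readers unfamiliar with the K-means step mentioned in this abstract, here is a minimal sketch of Lloyd's algorithm. The abstract does not describe the implementation or the input features used in the study; the function names, the fixed initial centroids, and the toy points are all assumptions for illustration.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, initial_centroids, max_iter=100):
    """Plain Lloyd's algorithm with caller-supplied initial centroids.

    Returns (labels, centroids). An empty cluster keeps its old centroid.
    """
    centroids = [tuple(c) for c in initial_centroids]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda k: squared_distance(p, centroids[k]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        updated = []
        for k, old in enumerate(centroids):
            members = [p for p, label in zip(points, labels) if label == k]
            if members:
                updated.append(tuple(sum(axis) / len(members) for axis in zip(*members)))
            else:
                updated.append(old)
        if updated == centroids:
            break
        centroids = updated
    return labels, centroids


# Toy data: two well-separated groups (made-up values, not study data).
toy_points = [(0, 0), (0, 1), (10, 10), (10, 11)]
labels, centroids = kmeans(toy_points, [(0, 0), (10, 10)])
print(labels, centroids)
```

In practice the number of clusters (four in the study) and the initialization strategy both affect the result, so a library implementation with multiple restarts is usually preferable to this sketch.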

A Study on the Classification of Jeokbyeok-ga's Version by the Computer Analysis Technique of Bibliographies (컴퓨터 문헌 분석 기법을 활용한 <적벽가> 이본의 계통 분류 연구)

  • Lee, Jin-O;Kim, Dong-Keon
    • The Journal of the Korea Contents Association
    • /
    • v.19 no.6
    • /
    • pp.1-9
    • /
    • 2019
  • The purpose of this study is to examine the lineage of the versions of Jeokbyeok-ga using computer-based bibliographic analysis techniques and to review the achievements of studies on the versions of Jeokbyeok-ga. First, to provide basic data for analysis, a raw corpus was constructed for 46 versions of Jeokbyeok-ga. Through this, the common narrative units of Jeokbyeok-ga were identified in 5 layers, and 146 individual paragraphs could be extracted. Based on the encoded corpus, we measured the similarity and the distance between versions. Next, we applied multidimensional scaling, hierarchical cluster analysis, and cladistic analysis to confirm the distribution of version groups, which made it possible to visually grasp the distance between versions and the lineage of the work. The computer-based bibliographic analysis showed that the versions of Jeokbyeok-ga are divided into a Wanpan (完板) series and a Changbon (唱本) series. It was also possible to examine the influence relationships between the traditions and transmission of Pansori.

An optimal feature selection algorithm for the network intrusion detection system (네트워크 침입 탐지를 위한 최적 특징 선택 알고리즘)

  • Jung, Seung-Hyun;Moon, Jun-Geol;Kang, Seung-Ho
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference
    • /
    • 2014.10a
    • /
    • pp.342-345
    • /
    • 2014
  • A network intrusion detection system based on machine learning methods depends heavily on the selected features in terms of accuracy and efficiency. Nevertheless, choosing the optimal combination of features from the generally used features to detect network intrusion requires extensive computing resources. For instance, the number of possible feature combinations from n given features is $2^n-1$. In this paper, to tackle this problem, we propose an optimal feature selection algorithm. The proposed algorithm is based on local search, a representative meta-heuristic algorithm for solving optimization problems. In addition, the accuracy of the clusters obtained using the selected feature components and the k-means clustering algorithm is adopted to evaluate a feature assembly. To estimate the performance of the proposed algorithm, we compare it with a method in which all features are used, on the NSL-KDD data set with a multi-layer perceptron.
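
The idea of searching the $2^n-1$ non-empty feature subsets with local search can be sketched as below. This is a generic first-improvement hill climber, not the authors' algorithm: the neighborhood (toggling one feature), the function name `local_search`, and the toy scoring function are assumptions. In the paper the score would be the accuracy of k-means clusters built from the selected features.

```python
def local_search(n_features, score, start=None):
    """First-improvement local search over non-empty feature subsets.

    A neighbor differs from the current subset by toggling one feature.
    score(subset) must return a number where higher is better.
    """
    current = frozenset(start) if start else frozenset([0])
    best = score(current)
    improved = True
    while improved:
        improved = False
        for feature in range(n_features):
            neighbor = current ^ frozenset([feature])  # toggle one feature
            if not neighbor:
                continue  # the empty subset is not a valid selection
            candidate = score(neighbor)
            if candidate > best:
                current, best, improved = neighbor, candidate, True
                break
    return current, best


# Toy score (hypothetical): closeness to a hidden "ideal" subset {1, 3}.
ideal = frozenset([1, 3])


def toy_score(subset):
    return -len(subset ^ ideal)


n = 4
print("non-empty subsets:", 2 ** n - 1)
print(local_search(n, toy_score))
```

Local search evaluates only a small fraction of the subsets, at the cost of possibly stopping in a local optimum; restarts from different initial subsets are a common mitigation.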

Online Video Synopsis via Multiple Object Detection

  • Lee, JaeWon;Kim, DoHyeon;Kim, Yoon
    • Journal of the Korea Society of Computer and Information
    • /
    • v.24 no.8
    • /
    • pp.19-28
    • /
    • 2019
  • In this paper, an online video summarization algorithm based on multiple object detection is proposed. As crime has risen with recent rapid urbanization, public demand for safety has grown and the installation of surveillance cameras such as closed-circuit television (CCTV) has increased in many cities. However, it takes a lot of time and labor to retrieve and analyze the huge amount of video data from numerous CCTVs. As a result, there is an increasing demand for intelligent video recognition systems that can automatically detect and summarize various events occurring on CCTVs. Video summarization is a method of generating a synopsis video from a long original video so that users can watch it in a short time. The proposed video summarization method can be divided into two stages. The object extraction step detects objects in the video and extracts the specific objects desired by the user. The video summary step creates a final synopsis video based on the objects extracted in the object extraction step. While existing methods do not consider the interaction between objects in the original video when generating the synopsis video, the new object clustering algorithm in the proposed method can effectively maintain the interactions between objects of the original video in the synopsis video. This paper also proposes an online optimization method that can efficiently summarize the large number of objects appearing in long videos. Finally, experimental results show that the performance of the proposed method is superior to that of the existing video synopsis algorithm.

Motherhood Ideology and Parenting Stress according to Parenting Behavior Patterns of Married Immigrant Women with Young Children (유아기 자녀를 둔 결혼이주여성의 양육행위 유형별 모성이데올로기 및 양육스트레스)

  • Moon, So-Hyun;Kim, Miok;Na, Hyeun
    • Journal of Korean Academy of Nursing
    • /
    • v.49 no.4
    • /
    • pp.449-460
    • /
    • 2019
  • Purpose: This study aims to provide base data for designing education and counseling programs for child-raising by identifying the types, characteristics, and predictors of parenting behaviors of married immigrant women. Methods: We used a self-report questionnaire to survey 126 immigrant mothers of young children, who agreed to participate and who could speak Korean, Vietnamese, Chinese, Filipino, or English, at two children's hospitals and two multicultural support centers. Statistical analysis was conducted using descriptive analysis, K-means clustering, the $\chi^2$ test, Fisher's exact test, one-way ANOVA, Scheffé's test, and multinomial logistic regression. Results: We identified three clusters of parenting behaviors: 'affectionate acceptance group' (38.9%), 'active engaging group' (26.2%), and 'passive parenting group' (34.9%). Passive parenting and affectionate acceptance groups were distinguished by the conversation time between couples (p=.028, OR=5.52), ideology of motherhood (p=.032, OR=4.33), and parenting stress between parent and child (p=.049, OR=0.22). The passive parenting group was distinguished from the active engaging group by support from spouses for participating in multicultural support centers or relevant programs (p=.011, OR=2.37) and ideology of motherhood (p=.001, OR=16.65). Ideology of motherhood was also the distinguishing factor between affectionate acceptance and active engaging groups (p=.041, OR=3.85). Conclusion: Since immigrant women's parenting type depends on their ideology of motherhood, parenting stress, and spousal relationships in terms of communication and support to help their child-raising and socio-cultural adaptation, it is necessary to provide them with systematic education and support, as well as interventions across personal, family, and community levels.

Association Rules Analysis of Safe Accidents Caused by Falling Objects (낙하물에 기인한 안전사고의 연관규칙 분석)

  • Son, Ki-Young;Ryu, Han-Guk
    • Journal of the Korea Institute of Building Construction
    • /
    • v.19 no.4
    • /
    • pp.341-350
    • /
    • 2019
  • The construction industry is one of the most dangerous industries. Because construction accidents occur due to repeated factors found in each accident, there is a limit to analyzing all types of occupational accidents with existing descriptive analysis and statistical tests. In this study, we classified safety accidents caused by falling objects, among the accident types occurring at construction sites, into fatal and nonfatal accidents and deduced their factors. In addition, we deduced association rules among the factors of safety accidents caused by falling objects using association rule analysis, a machine learning technique. Considering the association rules for fatal and nonfatal accidents proposed in this study, it would be possible to prevent accidents by searching for countermeasures against safety accidents caused by falling objects.
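
Association rule analysis of the kind described in this abstract typically ranks rules by support, confidence, and lift. The sketch below computes these three measures for a single rule; the abstract does not give the accident factors or thresholds used, so the function name `rule_metrics` and the factor labels in the toy records are hypothetical.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Support, confidence, and lift of the rule antecedent -> consequent.

    transactions is a list of sets of items (here: accident factors).
    """
    a, c = set(antecedent), set(consequent)
    n = len(transactions)
    n_a = sum(1 for t in transactions if a <= t)
    n_c = sum(1 for t in transactions if c <= t)
    n_both = sum(1 for t in transactions if (a | c) <= t)
    support = n_both / n if n else 0.0
    confidence = n_both / n_a if n_a else 0.0
    lift = confidence / (n_c / n) if n_c else 0.0
    return support, confidence, lift


# Hypothetical accident records; the factor labels are made up.
records = [
    {"no_safety_net", "fatal"},
    {"no_safety_net", "fatal"},
    {"no_safety_net"},
    {"fatal"},
]
support, confidence, lift = rule_metrics(records, ["no_safety_net"], ["fatal"])
print(round(support, 3), round(confidence, 3), round(lift, 3))
```

A lift above 1 suggests the antecedent and consequent co-occur more often than expected under independence; in this toy data the lift is below 1.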