• Title/Summary/Keyword: statistical clustering method

A STUDY OF MANDIBULAR DENTAL ARCH OF KOREAN ADULTS (한국 성인 유치악자의 하악 치열궁에 관한 조사)

  • Kim, Il-Han;Choi, Dae-Gyun
    • The Journal of Korean Academy of Prosthodontics
    • /
    • v.36 no.1
    • /
    • pp.166-182
    • /
    • 1998
  • The purposes of this study were to evaluate the mandibular dental arch of Korean adults and to classify its shape and size based on the incisal angle, canine angle, and inter-second-molar width and height. Mandibular study models were fabricated using irreversible hydrocolloid impression material from 225 volunteers with a mean age of 23.62 years (range 19-29). The study models were measured with a 3-dimensional measuring device, the mandibular dental arch was classified by means of the K-means clustering method and visual inspection, and the resulting data were analyzed with the t-test. The results were as follows: 1. The average canine height was 5.19 mm (s.d. 1.17) for both sexes combined, 5.34 mm in males, and 4.95 mm in females; the sex difference was significant. 2. The average second molar height was 39.81 mm (s.d. 2.44) for both sexes combined, 40.19 mm in males, and 39.21 mm in females; the sex difference was significant. 3. The average inter-canine width was 27.16 mm (s.d. 1.78) for both sexes combined, 27.41 mm in males, and 26.77 mm in females; the sex difference was significant. 4. The average inter-first-molar width was 46.93 mm (s.d. 2.67) for both sexes combined, 47.72 mm in males, and 45.7 mm in females; the sex difference was significant. 5. The average inter-second-molar width was 56.09 mm (s.d. 3.01) for both sexes combined, 57.24 mm in males, and 54.32 mm in females; the sex difference was significant. 6. The arch form was classified into three shapes based on the incisal and canine angles. The V-shape showed an incisal angle of $124.88^{\circ}$ and a canine angle of $141.64^{\circ}$, the U-shape showed $152.76^{\circ}$ and $125.35^{\circ}$, and the O-shape showed $138.03^{\circ}$ and $33.66^{\circ}$, respectively. The V-shape accounted for 14.2%, the U-shape for 14.7%, and the O-shape for 71.1% of the 225 study models. 7. The second molar width was judged more reasonable than the height for classifying dental arch size. The arch size was classified into four sizes based on the second molar width: size 1 ranged over 42.24-48.23 mm, size 2 over 48.24-54.23 mm, size 3 over 54.24-60.23 mm, and size 4 over 60.24-66.23 mm. Size 1 accounted for 1.3%, size 2 for 27.1%, size 3 for 63.6%, and size 4 for 8.0% of the 225 study models.
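
The four width ranges in item 7 amount to a lookup on the inter-second-molar width. The following is a minimal Python sketch of that lookup, assuming widths are given in millimetres. The function name `classify_arch_size` and the choice to treat each range as a half-open 6 mm interval are illustrative assumptions, not part of the study.

```python
from bisect import bisect_right

# Lower edges of sizes 1-4 followed by the upper limit of size 4, in mm.
# The values come from the ranges quoted in the abstract; treating them as
# half-open intervals [low, next_low) is an assumption made for this sketch.
SIZE_EDGES_MM = [42.24, 48.24, 54.24, 60.24, 66.24]


def classify_arch_size(width_mm):
    """Return arch size 1-4 for an inter-second-molar width, or None if out of range."""
    index = bisect_right(SIZE_EDGES_MM, width_mm)
    if index == 0 or index == len(SIZE_EDGES_MM):
        return None
    return index


# The reported overall mean width (56.09 mm) falls in size 3.
print(classify_arch_size(56.09))  # prints 3
```

Under these ranges the reported male and female mean widths (57.24 mm and 54.32 mm) also fall in size 3, which is consistent with size 3 being the most common class.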

Analysis of Genetic Diversity and Identification of Domestic Bred Phalaenopsis Varieties Using SRAP and SSR Markers (SRAP과 SSR 마커를 이용한 국내 육성 팔레놉시스 품종의 유전적 다양성 분석과 품종판별)

  • Park, Pue Hee;Park, Yong-Jin;Kim, Mi Seon;Lee, Young Ran;Park, Pil Man;Lee, Dong Soo;Yae, Byeong Woo
    • Horticultural Science & Technology
    • /
    • v.31 no.3
    • /
    • pp.337-343
    • /
    • 2013
  • The aims of this study were to compare genetic distances among 14 Phalaenopsis varieties using simple sequence repeat (SSR) and sequence-related amplified polymorphism (SRAP) marker systems and to determine whether the varieties could be discriminated using SSR. A total of 111 SSR primers and 30 SRAP combinations were initially screened. Twelve SSR primers and thirty SRAP combinations showed high polymorphism among the 14 Phalaenopsis varieties, including domestically bred varieties, conserved at the National Institute of Horticultural & Herbal Science (NIHHS). The amplified DNA fragments were separated on denaturing acrylamide gels and detected by silver staining. A total of 474 polymorphic bands, 55 from SSRs and 419 from SRAPs, were identified and used for genetic diversity analysis. Polymorphic bands were scored to calculate a simple matching coefficient of genetic similarity and to perform cluster analysis with the Multi-Variate Statistical Package (MVSP) 3.1. The fourteen Phalaenopsis varieties were classified into three major groups at similarity coefficient values of 0.683 and 0.66 using SRAP and SSR, respectively. We could also discriminate these domestically bred Phalaenopsis varieties using only SSR 20 and SSR 22. The results indicate that SSR analysis is effective for discriminating among Phalaenopsis varieties and that SRAP is useful for assessing genetic diversity when no sequence information is available. The SSR and SRAP markers studied here will be useful tools for genotype identification, germplasm conservation, and the study of genetic relationships in Phalaenopsis.
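
The simple matching coefficient mentioned above is the fraction of scored band positions at which two varieties agree, counting shared presences and shared absences alike. A minimal Python sketch follows. The band profiles are made-up illustrative data, and the function name is hypothetical. The study itself used MVSP 3.1, which is not reproduced here.

```python
def simple_matching_coefficient(profile_a, profile_b):
    """Fraction of band positions where two 0/1 profiles agree."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must score the same band positions")
    if not profile_a:
        raise ValueError("profiles must not be empty")
    matches = sum(1 for a, b in zip(profile_a, profile_b) if a == b)
    return matches / len(profile_a)


# Hypothetical band scores (1 = band present, 0 = band absent).
variety_1 = [1, 0, 1, 1, 0, 1, 0, 0]
variety_2 = [1, 1, 1, 0, 0, 1, 0, 1]

# The profiles agree at 5 of 8 positions.
print(simple_matching_coefficient(variety_1, variety_2))  # prints 0.625
```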

Evaluation of Water Quality Characteristics in the Nakdong River using Statistical Analysis (통계분석을 이용한 낙동강유역의 수질변화 특성 조사)

  • Choi, Kil Yong;Im, Toe Hyo;Lee, Jae Woon;Cheon, Se Uk
    • Journal of Korea Water Resources Association
    • /
    • v.45 no.11
    • /
    • pp.1157-1168
    • /
    • 2012
  • In this study, we assess changes in water quality trends over time, based on certain control measurements, in order to identify and analyze the causes of those trends. Water pollution in the Nakdong River was analyzed because significant changes in water quality appear to have occurred between 2006 and 2010. Based on monthly average data, we examined trends in water temperature, Biological Oxygen Demand (BOD), Chemical Oxygen Demand (COD), Total Nitrogen (TN), and Total Phosphorus (TP) in the Nakdong River watershed. We also investigated seasonal variation in water quality at sites within the Nakdong River Basin using correlation coefficients, regression analysis, hierarchical clustering, and time series analysis in SPSS. Geology and topography of the watershed, controlled by various conditions such as climate, vegetation, topography, soil, and rain medium, have been affected by the non-homogeneity. Our study suggests that such variables could cause eutrophication problems in the river. One possible way to overcome this particular problem is to lay up a ship on the river by increasing the nasal flow measurement of the Nakdong River during rainy season. Moreover, the water management requires arranging the measurement of the flow in order to secure the river while the numerous construction projects need to be continuously observed. However, the water is not flowing tributary of the reason for the timing to be flowing in a natural state of river water and industrial water intake because agriculture. Therefore, ongoing research is needed in addition to configuration of all observations.

Elicitation of Collective Intelligence by Fuzzy Relational Methodology (퍼지관계 이론에 의한 집단지성의 도출)

  • Joo, Young-Do
    • Journal of Intelligence and Information Systems
    • /
    • v.17 no.1
    • /
    • pp.17-35
    • /
    • 2011
  • Collective intelligence is a common-based production arising from the collaboration and competition of many peer individuals. In other words, it is the aggregation of individual intelligence that leads to the wisdom of crowds. Recently, the utilization of collective intelligence has become an emerging research area, since it has been adopted as an important principle of Web 2.0, which aims at openness, sharing, and participation. This paper introduces an approach that seeks collective intelligence through cognition of the relations and interactions among individual participants. It describes a methodology well suited to evaluating individual intelligence, with information retrieval and classification as the application field. The research investigates how to derive and represent such cognitive intelligence from individuals by applying fuzzy relational theory to personal construct theory and the knowledge grid technique. Crucial to this research is to implement formally, and process interpretatively, the cognitive knowledge of participants who form mutual relations and social interactions. What is needed is a technique for analyzing the structure of cognitive intelligence in the form of a Hasse diagram, which is an instantiation of this perceptive intelligence of human beings. The search for collective intelligence requires a theory of similarity to deal with two underlying problems: clustering social subgroups of individuals by identifying individual intelligence and the commonality among intelligences, and then eliciting collective intelligence by aggregating the congruence or sharing of all participants in the entire group. Unlike standard approaches to similarity based on statistical techniques, the method presented employs a theory of fuzzy relational products, with the related computational procedures, to cover issues of similarity and dissimilarity.

Analysis of crown size and morphology, and gingival shape in the maxillary anterior dentition in Korean young adults

  • Song, Jae-Won;Leesungbok, Richard;Park, Su-Jung;Chang, Se Hun;Ahn, Su-Jin;Lee, Suk-Won
    • The Journal of Advanced Prosthodontics
    • /
    • v.9 no.4
    • /
    • pp.315-320
    • /
    • 2017
  • PURPOSE. The aim of this investigation was to analyze the dimensions of clinical crowns and to classify the crown and the gingival type in the anterior teeth in Korean young adults. MATERIALS AND METHODS. Casts were obtained from 50 subjects ranging in age from 24 to 32. Measurements of length and width were made on the casts using a pair of digital calipers on the entire dentition. Crown thickness and papilla height were also measured and MDW/CL (mesiodistal width to clinical length) and CW/CL (cervical width to clinical length) ratios of the maxillary anterior teeth were calculated. The K-clustering method was used for CW/CL to classify the anterior tooth shape into three groups (tapered, ovoid, and square), and one-way analysis of variance and Duncan's post-hoc comparison were used to evaluate statistical significance between the groups. Pearson's correlation analysis was performed between tooth shape and papillary height (PH) to demonstrate the correlation between tooth shape and gingival morphological characteristics. RESULTS. The average length of the maxillary central incisors was 9.89 mm; the mesio-distal width was 8.54 mm; and the ratio of width/length was 0.86 in Korean young adults. The average bucco-palatal thickness of the central incisor was 3.14 mm at the incisal 1/3 aspect. Ovoid type was the most common tooth shape (48%), followed by square type (29%) and taper type (23%) in the central incisors of Korean young adults. Tooth shape and gingival type were correlated with each other. CONCLUSION. New reference data were established for tooth size in Korean young adults and the data show several patterns of tooth shape and gingival type. Clinicians should diagnose and treat based on these characteristics for better results in the Korean population.
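
The grouping step described above can be illustrated by computing the CW/CL ratio per tooth and clustering the ratios into three groups. The sketch below is plain Python under stated assumptions: the abstract says only "K-clustering", so standard k-means (Lloyd's algorithm) is assumed, and the measurements, the function name `kmeans_1d`, and the deterministic initialization are illustrative choices. No mapping from clusters to the tapered, ovoid, and square labels is implied.

```python
def kmeans_1d(values, k, max_iter=100):
    """Lloyd's algorithm on scalars; returns (centers, groups) from the final pass."""
    data = sorted(values)
    n = len(data)
    if k < 1 or k > n:
        raise ValueError("k must be between 1 and the number of values")
    # Deterministic start: evenly spaced order statistics of the sorted data.
    centers = [data[(2 * i + 1) * n // (2 * k)] for i in range(k)]
    groups = []
    for _ in range(max_iter):
        # Assignment step: each value joins its nearest center.
        groups = [[] for _ in range(k)]
        for v in data:
            nearest = min(range(k), key=lambda j: abs(v - centers[j]))
            groups[nearest].append(v)
        # Update step: each center moves to the mean of its group.
        updated = [sum(g) / len(g) if g else centers[j] for j, g in enumerate(groups)]
        if updated == centers:
            break
        centers = updated
    return centers, groups


# Hypothetical (cervical width, clinical length) pairs in mm.
measurements = [(6.0, 10.0), (6.2, 10.0), (6.1, 10.0),
                (7.5, 10.0), (7.7, 10.0), (7.6, 10.0),
                (9.0, 10.0), (9.2, 10.0), (9.1, 10.0)]
cw_cl = [cw / cl for cw, cl in measurements]

centers, groups = kmeans_1d(cw_cl, k=3)
print([len(g) for g in groups])  # prints [3, 3, 3]
```

Because k-means results depend on initialization, a real analysis would usually rely on a statistics package and several restarts rather than a single deterministic start.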

The Habitat Classification of mammals in Korea based on the National Ecosystem Survey (전국자연환경조사를 활용한 포유류 서식지 유형의 분류)

  • Lee, Hwajin;Ha, Jeongwook;Cha, Jinyeol;Lee, Junghyo;Yoon, Heenam;Chung, Chulun;Oh, Hongshik;Bae, Soyeon
    • Journal of Environmental Impact Assessment
    • /
    • v.26 no.2
    • /
    • pp.160-170
    • /
    • 2017
  • The purpose of this study is to cluster habitat types and to identify the characteristics of the species in each habitat type, using the mammal data (70,562) of the 3rd National Ecosystem Survey conducted from 2006 to 2012. The 15 habitat types recorded on the field paper of the 3rd National Ecosystem Survey were reclassified, and the mammal habitat types were then analyzed statistically. For the cluster analysis of habitat types, non-hierarchical cluster analysis (k-means cluster analysis), hierarchical cluster analysis, and non-metric multidimensional scaling were applied to the 14 habitat types recorded more than 30 times. A total of 7 orders, 16 families, and 39 species of mammals were identified in the 3rd National Ecosystem Survey data collected nationwide. The simple structure index was highest (ssi = 0.07) when the habitat types were classified into 11 clusters. According to the similarities and hierarchy between habitat types suggested by the hierarchical cluster analysis, residential areas were the most distinct habitat type for mammals, followed by a cluster combining rivers and coasts. The non-metric multidimensional scaling analysis showed that both Mus musculus and Rattus norvegicus appeared only in residential areas, the most discriminating habitat type, and that Lutra lutra appeared only in coastal and river areas. In summary, according to our results, mammalian habitat can be divided into the following four types: (1) the forest type (using forest as the main habitat and migration route); (2) the river type (using water as the main habitat); (3) the residence type (living near residential areas); and (4) the lowland type (consuming grain or seeds as the main food resource).

Online news-based stock price forecasting considering homogeneity in the industrial sector (산업군 내 동질성을 고려한 온라인 뉴스 기반 주가예측)

  • Seong, Nohyoon;Nam, Kihwan
    • Journal of Intelligence and Information Systems
    • /
    • v.24 no.2
    • /
    • pp.1-19
    • /
    • 2018
  • Because forecasting stock movements is an important issue both academically and practically, studies on stock price prediction have been actively conducted. Stock price forecasting research is classified by whether it uses structured or unstructured data, and is further divided into technical analysis, fundamental analysis, and media effect analysis. In the big data era, research on stock price prediction that incorporates big data is actively underway, and such research mainly relies on machine learning techniques. In particular, methods that incorporate media effects have recently attracted attention, and studies that analyze online news to forecast stock prices are becoming mainstream. Previous studies that predict stock prices from online news mostly perform sentiment analysis of the news, building a separate corpus for each company and a dictionary that predicts stock prices by recording responses according to past stock prices. Existing studies have therefore examined the impact of online news on individual companies. For example, stock movements of Samsung Electronics are predicted using only online news about Samsung Electronics. In addition, methods that consider influences among highly related companies have also been studied recently. For example, stock movements of Samsung Electronics are predicted using news about Samsung Electronics and a highly related company such as LG Electronics. These previous studies examine the effects of news about a homogeneous industrial sector on the individual company. In them, homogeneous industries are classified according to the Global Industrial Classification Standard. In other words, the existing studies assumed that industries grouped by the Global Industrial Classification Standard are homogeneous. 
However, existing studies have limitations in that they neither take into account influential companies with high relevance nor reflect the heterogeneity that exists within the same Global Industrial Classification Standard sector. Our examination of various sectors shows that some industrial sectors are not homogeneous groups. To overcome these limitations, our study proposes a methodology that applies k-means clustering to reflect the heterogeneous effects within an industrial sector that affect stock prices. Multiple Kernel Learning is mainly used to integrate data with various characteristics. It has several kernels, each of which receives different data and makes predictions from it. We used Multiple Kernel Learning to incorporate the effects of the target firm and its relevant firms simultaneously. Each kernel was assigned to predict stock prices from financial news variables of the target firm and of the industrial groups obtained by k-means cluster analysis. To validate the proposed methodology, experiments were conducted on three years of online news and stock prices. The results of this study are as follows. (1) We confirmed that information about the industrial sectors related to the target company also contains meaningful information for predicting the target company's stock movements, and that the machine learning algorithm has better predictive power when the news of the relevant companies and the target company's news are considered together. (2) It is important to predict stock movements with a number of clusters that varies according to the level of homogeneity in the industrial sector. In other words, when stock prices are homogeneous within an industrial sector, it is important to use the relational effect at the level of the industry group without clustering, or to use a small number of clusters. 
When stock prices are heterogeneous within an industry group, it is important to cluster the firms into groups. This study contributes by showing that firms classified under the Global Industrial Classification Standard are heterogeneous and by suggesting that relevance should be defined through machine learning and statistical analysis rather than simply by the Global Industrial Classification Standard. It also contributes by demonstrating the effectiveness of a prediction model that reflects heterogeneity.

A Study on the Regional Characteristics of Broadband Internet Termination by Coupling Type using Spatial Information based Clustering (공간정보기반 클러스터링을 이용한 초고속인터넷 결합유형별 해지의 지역별 특성연구)

  • Park, Janghyuk;Park, Sangun;Kim, Wooju
    • Journal of Intelligence and Information Systems
    • /
    • v.23 no.3
    • /
    • pp.45-67
    • /
    • 2017
  • According to the Internet Usage Research performed in 2016, the number of internet users and the amount of internet usage have been increasing. The smartphone, compared with the computer, is taking a more dominant role as an internet access device. As the number of smart devices has increased, some expect that the demand for high-speed internet will decrease; however, the high-speed Internet market is expected to grow slightly for a while owing to faster Giga Internet and the growth of the IoT market. As the broadband Internet market saturates, telecom operators are competing excessively to win new customers, but if they knew the causes of customer cancellation, they could be expected to reduce marketing costs through more effective marketing. In this study, we analyzed the relationship between the cancellation rates of telecommunication products and the factors affecting them by combining a telecommunication company's data for three cities (Anyang, Gunpo, and Uiwang) with regional data from KOSIS (Korean Statistical Information Service). In particular, assuming that neighboring areas affect the distribution of cancellation rates by coupling type, we conducted spatial cluster analysis on the three types of cancellation rates of each region using the spatial analysis tool SaTScan, and analyzed the relationships between the cancellation rates and the regional data. In the analysis phase, we first summarized the characteristics of the clusters derived by combining spatial information and the cancellation data. Next, based on the results of the cluster analysis, analysis of variance, correlation analysis, and regression analysis were used to analyze the relationship between the cancellation rate data and the regional data. Based on the results of the analysis, we proposed appropriate marketing methods for each region. 
Unlike previous studies of regional characteristics, this study is academically distinctive in that it clusters on spatial information, so that adjacent regions with similar cancellation types are grouped together. In addition, few previous studies on the determinants of subscription to high-speed Internet services have considered regional characteristics; in this study, we analyzed the relationship between the clusters and the regional characteristics data, assuming that different factors operate in different regions. We also sought more efficient marketing methods for new subscriptions and customer management in high-speed internet that take the characteristics of each region into account. The analysis of variance confirmed significant differences in regional characteristics among the clusters, the correlation analysis showed stronger correlations within the clusters than across all regions, and regression analysis was used to analyze the relationship between the cancellation rate and the regional characteristics. As a result, we found that the cancellation rate differs according to regional characteristics and that differentiated marketing can be targeted at each region. The biggest limitation of this study was the difficulty of obtaining enough data to carry out the analysis. In particular, it is difficult to find variables that represent regional characteristics at the Dong unit. Most of the data were disclosed at the city level rather than at the Dong unit, which limited detailed analysis. Data such as income, card usage information, and telecommunications company policies or characteristics that could affect cancellation were not available at the time. The most urgent requirement for a more sophisticated analysis is to obtain Dong-unit data on regional characteristics. 
Future studies should carry out target marketing based on these results. It would also be meaningful to analyze the effect of marketing by comparing the results before and after target marketing, and to use clusters based on new subscription data as well as cancellation data.

The Need for Paradigm Shift in Semantic Similarity and Semantic Relatedness : From Cognitive Semantics Perspective (의미간의 유사도 연구의 패러다임 변화의 필요성-인지 의미론적 관점에서의 고찰)

  • Choi, Youngseok;Park, Jinsoo
    • Journal of Intelligence and Information Systems
    • /
    • v.19 no.1
    • /
    • pp.111-123
    • /
    • 2013
  • Measures of semantic similarity/relatedness between two concepts play an important role in research on system integration and database integration. Moreover, current research on keyword recommendation and tag clustering depends strongly on this kind of semantic measure. For this reason, many researchers in various fields, including computer science and computational linguistics, have tried to improve methods for calculating semantic similarity/relatedness. The study of similarity between concepts aims to discover how a computational process can model the way a human determines the relationship between two concepts. Most research on calculating semantic similarity uses ready-made reference knowledge, such as a semantic network or a dictionary, to measure concept similarity. The topological method calculates relatedness or similarity between concepts based on various forms of a semantic network, including a hierarchical taxonomy. This approach assumes that the semantic network reflects human knowledge well. The nodes in a network represent concepts, and ways of measuring the similarity between two nodes are regarded as ways of determining the conceptual similarity of two words (i.e., two nodes in a network). Topological methods can be categorized as node-based or edge-based, which are also called the information content approach and the conceptual distance approach, respectively. The node-based approach calculates similarity between concepts based on how much information the two concepts share in terms of a semantic network or taxonomy, while the edge-based approach estimates the distance between the nodes that correspond to the concepts being compared. Both approaches assume that the semantic network is static. That is, the topological approach does not consider changes in the semantic relations between concepts in the semantic network. 
However, as information and communication technologies facilitate knowledge sharing among people, the semantic relations between concepts in a semantic network may change. To explain this change in semantic relations, we adopt cognitive semantics. The basic assumption of cognitive semantics is that humans judge semantic relations based on their cognition and understanding of concepts. This cognition and understanding is called 'World Knowledge.' World knowledge can be categorized as personal knowledge and cultural knowledge. Personal knowledge is the knowledge gained from personal experience; everyone can have different personal knowledge of the same concept. Cultural knowledge is the knowledge shared by people who live in the same culture or use the same language. People in the same culture have a common understanding of specific concepts. Cultural knowledge can be the starting point for discussing changes in semantic relations. If the culture shared by people changes for some reason, their cultural knowledge may also change. Today's society and culture are changing at a fast pace, and the change of cultural knowledge is not a negligible issue in research on the semantic relationship between concepts. In this paper, we propose future directions for research on semantic similarity. In other words, we discuss how research on semantic similarity can reflect the change in semantic relations caused by the change of cultural knowledge. We suggest three directions for future research on semantic similarity. First, the research should include versioning and update methodologies for semantic networks. Second, a dynamically generated semantic network can be used to calculate semantic similarity between concepts. If researchers can develop a methodology for extracting a semantic network from a given knowledge base in real time, this approach can solve many problems related to the change of semantic relations. 
Third, a statistical approach based on corpus analysis can be an alternative to methods that use a semantic network. We believe that these proposed research directions can be a milestone for research on semantic relations.
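
The edge-based (conceptual distance) approach described above can be sketched as a shortest-path computation over a taxonomy. The Python example below uses breadth-first search on a toy taxonomy. The concepts, the function names, and the conversion of distance to similarity as 1 / (1 + distance) are illustrative assumptions rather than the method of any particular study.

```python
from collections import deque

# Hypothetical toy taxonomy given as (parent, child) edges.
TAXONOMY_EDGES = [
    ("animal", "mammal"),
    ("animal", "bird"),
    ("mammal", "cat"),
    ("mammal", "dog"),
    ("bird", "sparrow"),
]


def build_graph(edges):
    """Build an undirected adjacency map from (parent, child) pairs."""
    graph = {}
    for parent, child in edges:
        graph.setdefault(parent, set()).add(child)
        graph.setdefault(child, set()).add(parent)
    return graph


def path_length(graph, source, target):
    """Number of edges on the shortest path, or None if no path exists."""
    if source not in graph or target not in graph:
        return None
    queue = deque([(source, 0)])
    seen = {source}
    while queue:
        node, dist = queue.popleft()
        if node == target:
            return dist
        for neighbor in graph[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, dist + 1))
    return None


def edge_similarity(graph, a, b):
    """One simple distance-to-similarity mapping: 1 / (1 + path length)."""
    dist = path_length(graph, a, b)
    return None if dist is None else 1 / (1 + dist)


graph = build_graph(TAXONOMY_EDGES)
print(path_length(graph, "cat", "dog"))      # prints 2
print(path_length(graph, "cat", "sparrow"))  # prints 4
```

Because the graph is fixed when it is built, this sketch shares the static-network assumption that the abstract criticizes. Rebuilding the graph from updated edges is the simplest way to reflect a changed semantic relation.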

An Investigation on the Periodical Transition of News related to North Korea using Text Mining (텍스트마이닝을 활용한 북한 관련 뉴스의 기간별 변화과정 고찰)

  • Park, Chul-Soo
    • Journal of Intelligence and Information Systems
    • /
    • v.25 no.3
    • /
    • pp.63-88
    • /
    • 2019
  • The goal of this paper is to investigate changes in North Korea's domestic and foreign policies through automated text analysis of how North Korea is represented in South Korean mass media. Based on that data, we then analyze the status of text mining research, using a text mining technique to find the topics, methods, and trends of text mining research. We also investigate the characteristics and analysis methods of the text mining techniques, confirmed by analysis of the data. In this study, the R program, which is free software for statistical computing and graphics, was used to apply the text mining technique. Text mining methods make it possible to highlight the most frequently used keywords in a body of text and to create a word cloud, also referred to as a text cloud or tag cloud. This study proposes a procedure for finding meaningful tendencies based on a combination of word clouds and co-occurrence networks. It aims to explore more objectively the images of North Korea represented in South Korean newspapers by quantitatively reviewing the patterns of language use related to North Korea in newspaper big data from November 1, 2016 to May 23, 2019. We divided this span into three periods in light of recent inter-Korean relations. The time before January 1, 2018 was set as the Before Phase of Peace Building. The time from January 1, 2018 to February 24, 2019 was set as the Peace Building Phase, when the New Year's message of Kim Jong-un and the PyeongChang Olympics formed an atmosphere of peace on the Korean peninsula. The third period, after the Hanoi summit, was marked by silence in the relationship between North Korea and the United States and was therefore called the Depression Phase of Peace Building. This study analyzes news articles related to North Korea in the Korea Press Foundation database (www.bigkinds.or.kr) through text mining, in order to investigate the characteristics of the Kim Jong-un regime's South Korea policy and unification discourse. 
The main results of this study show that trends in the North Korean national policy agenda can be discovered using clustering and visualization algorithms. In particular, the study examines changes in international circumstances, domestic conflicts, the living conditions of North Korea, the South's aid projects for the North, the conflicts between the two Koreas, the North Korean nuclear issue, and the North Korean refugee problem through co-occurrence word analysis. It also offers an analysis of the South Korean mentality toward North Korea in terms of semantic prosody. In the Before Phase of Peace Building, the analysis yielded the order 'Missiles', 'North Korea Nuclear', 'Diplomacy', 'Unification', and 'South-North Korean'. The Peace Building Phase yielded the order 'Panmunjom', 'Unification', 'North Korea Nuclear', 'Diplomacy', and 'Military'. The Depression Phase of Peace Building yielded the order 'North Korea Nuclear', 'North and South Korea', 'Missile', 'State Department', and 'International'. There are 16 words adopted in all three periods. The order is as follows: 'missile', 'North Korea Nuclear', 'Diplomacy', 'Unification', 'North and South Korea', 'Military', 'Kaesong Industrial Complex', 'Defense', 'Sanctions', 'Denuclearization', 'Peace', 'Exchange and Cooperation', and 'South Korea'. We expect that the results of this study will contribute to analyzing trends in news content about North Korea associated with North Korea's provocations, and that future research on North Korean trends will build on these results. We will continue to develop a model for North Korea risk measurement that can anticipate and respond to North Korea's behavior in advance. We expect that the text mining analysis method and scientific data analysis techniques will be applied to the field of North Korea and unification research. 
We hope that such academic work leads to many studies that make important contributions to the nation.
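
The co-occurrence word analysis mentioned above rests on counting how often two keywords appear in the same article. A minimal sketch of that counting step follows. The study itself used R; Python is used here only for illustration, and the article keyword lists and the function name are made-up assumptions, not data or code from the paper.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(documents):
    """Count, for each unordered keyword pair, the documents containing both."""
    counts = Counter()
    for keywords in documents:
        # Deduplicate and sort so each pair is counted once per document
        # and always stored in the same (alphabetical) order.
        for pair in combinations(sorted(set(keywords)), 2):
            counts[pair] += 1
    return counts


# Hypothetical keyword lists extracted from three articles.
articles = [
    ["missile", "sanctions", "diplomacy"],
    ["missile", "sanctions"],
    ["peace", "diplomacy", "unification"],
]

counts = cooccurrence_counts(articles)
print(counts[("missile", "sanctions")])  # prints 2
```

The resulting pair counts can serve as edge weights in a co-occurrence network, with keywords as nodes.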