• Title/Summary/Keyword: index clustering

Bankruptcy Type Prediction Using A Hybrid Artificial Neural Networks Model (하이브리드 인공신경망 모형을 이용한 부도 유형 예측)

  • Jo, Nam-ok;Kim, Hyun-jung;Shin, Kyung-shik
    • Journal of Intelligence and Information Systems / v.21 no.3 / pp.79-99 / 2015
  • The prediction of bankruptcy has been extensively studied in the accounting and finance field. It can have an important impact on lending decisions and the profitability of financial institutions in terms of risk management. Many researchers have focused on constructing a more robust bankruptcy prediction model. Early studies primarily used statistical techniques such as multiple discriminant analysis (MDA) and logit analysis for bankruptcy prediction. However, many studies have demonstrated that artificial intelligence (AI) approaches, such as artificial neural networks (ANN), decision trees, case-based reasoning (CBR), and support vector machines (SVM), have outperformed statistical techniques since the 1990s for business classification problems because statistical methods rest on rigid assumptions. In previous studies on corporate bankruptcy, many researchers have focused on developing a bankruptcy prediction model using financial ratios. However, few studies address the specific types of bankruptcy. Previous bankruptcy prediction models have generally been concerned with predicting whether or not firms will become bankrupt. Most of the studies on bankruptcy types have focused on reviewing the previous literature or performing a case study. Thus, this study develops a model using data mining techniques for predicting the specific types of bankruptcy as well as the occurrence of bankruptcy in Korean small- and medium-sized construction firms in terms of profitability, stability, and activity indices. With such predictions, firms will be able to take preventive measures in advance. We propose a hybrid approach using two artificial neural networks (ANNs) for the prediction of bankruptcy types. The first is a back-propagation neural network (BPN) model using supervised learning for bankruptcy prediction and the second is a self-organizing map (SOM) model using unsupervised learning to classify bankruptcy data into several types.
Based on the constructed model, we predict the bankruptcy of companies by applying the BPN model to a validation set that was not utilized in the development of the model. The firms that the BPN model predicts to be bankrupt are then passed to the SOM model, which identifies the specific type of bankruptcy. To interpret the characteristics of the clusters derived by the SOM model, we calculated the cluster-wise averages of the input variables selected through statistical tests. Each cluster represents a bankruptcy type derived from the data of bankrupt firms, and the input variables are the financial ratios used to interpret the meaning of each cluster. The experimental results show that each of the five bankruptcy types has different characteristics according to its financial ratios. Type 1 (severe bankruptcy) has inferior financial statements except for EBITDA (earnings before interest, taxes, depreciation, and amortization) to sales. Type 2 (lack of stability) has a low quick ratio, low stockholders' equity to total assets, and high total borrowings to total assets. Type 3 (lack of activity) has a slightly low total asset turnover and fixed asset turnover. Type 4 (lack of profitability) has low retained earnings to total assets and low EBITDA to sales, which are the indices of profitability. Type 5 (recoverable bankruptcy) includes firms that are in relatively good financial condition compared to the other bankruptcy types even though they are bankrupt. Based on the findings, researchers and practitioners engaged in the credit evaluation field can obtain more useful information about the types of corporate bankruptcy. In this paper, we utilized the financial ratios of firms to classify bankruptcy types. It is important to select the input variables that correctly predict bankruptcy and meaningfully classify the type of bankruptcy. In a further study, we will include non-financial factors such as size, industry, and age of the firms.
Thus, we can obtain realistic clustering results for bankruptcy types by combining qualitative factors and reflecting the domain knowledge of experts.
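
The cluster interpretation step described in this abstract (averaging each input variable within each cluster) can be illustrated with a minimal sketch. The firm records, variable names, and cluster labels below are hypothetical and are not taken from the paper; the labels stand in for whatever cluster assignment a SOM would produce.

```python
from collections import defaultdict


def cluster_means(rows, labels):
    """Average every input variable within each cluster label."""
    sums = defaultdict(lambda: defaultdict(float))
    counts = defaultdict(int)
    for row, label in zip(rows, labels):
        counts[label] += 1
        for name, value in row.items():
            sums[label][name] += value
    return {
        label: {name: total / counts[label] for name, total in variables.items()}
        for label, variables in sums.items()
    }


# Hypothetical financial ratios for three bankrupt firms (illustrative values only).
firms = [
    {"quick_ratio": 0.4, "total_asset_turnover": 1.2},
    {"quick_ratio": 0.6, "total_asset_turnover": 1.0},
    {"quick_ratio": 1.5, "total_asset_turnover": 0.3},
]
# Hypothetical cluster labels, standing in for SOM output.
labels = [2, 2, 3]

profile = cluster_means(firms, labels)
for label, means in sorted(profile.items()):
    print(label, means)
```

Comparing the per-cluster averages against each other is what allows a cluster to be given a label such as "lack of stability" or "lack of activity".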

A Statistical Analysis of Phenotypic Diversity Based on Genetic Traits in Barley Germplasms (특성평가 정보를 활용한 보리 유전자원 형태적 형질 다양성의 통계적 분석)

  • Yu, Dong Su;Shin, Myoung-Jae;Park, Jin-Cheon;Kang, Manjung
    • Korean Journal of Plant Resources / v.35 no.5 / pp.641-651 / 2022
  • Biodiversity research on barley, a functional food, aims to conserve germplasm and to develop new barley cultivars with improved functional effects. In this study of 25,104 barley germplasms held at the National Agrobiodiversity Center, South Korea, the biodiversity index for species (1.17) was much lower than that for origins (24.73) because one species, Hordeum vulgare subsp. vulgare, dominated the collection; nevertheless, both species and origin differed significantly with regard to genetic traits. In the clustering analysis based on genetic traits, we found that 97% of the barley germplasms fell into clusters 1~7 out of a total of 15 clusters; 'normal and uzu type', 'lodging', and 'loose smut' were commonly represented in clusters 1~7, and some clusters showed specific differences in five genetic traits including 'growth habit'. In the correlation analysis of the genetic traits, infection with 'barley yellow mosaic virus' was highly correlated with 'number of grains per spike', and '1000 grain weight' was weakly correlated with seven genetic traits including 'number of grains per spike'. Our analysis of barley biodiversity can provide a useful guide to the phenotypes that need to be collected to conserve biodiversity and to breed new barley varieties.

Population Genetic Variation of Ulmus davidiana var. japonica in South Korea Based on ISSR Markers (ISSR 표지자를 이용한 느릅나무 자연집단의 유전변이 분석)

  • Ahn, Ji Young;Hong, Kyung Nak;Lee, Jei Wan;Yang, Byung Hoon
    • Journal of Korean Society of Forest Science / v.102 no.4 / pp.560-565 / 2013
  • Population genetic structure and diversity of Ulmus davidiana var. japonica in South Korea were studied using ISSR markers. A total of 45 polymorphic ISSR amplicons were obtained from 7 ISSR primers and 171 individuals of 7 populations. The average number of effective alleles and the proportion of polymorphic loci were 1.5 and 89%, respectively. Shannon's diversity index ($I$) was 0.435, and the expected heterozygosity from the frequentist method ($H_e$) and from Bayesian inference ($h_s$) were 0.289 and 0.323, respectively. From AMOVA, 4.2% of the total genetic variation in the elm populations was explained by differences among populations (${\Phi}_{ST}=0.042$) and the other 95.8% was distributed within populations. The ${\theta}^{II}$ value from the Bayesian method, which is comparable to $F_{ST}$, was 0.043. Thus the level of genetic diversity in the elm populations was similar to that in the genus Ulmus, and the level of genetic differentiation was lower than that of others. No population showed a significant difference in the population-specific fixation indices (average of $PS-F_{IS}=0.822$) or the population-specific genetic differentiations (average of $PS-F_{ST}=0.101$). The seven populations were allocated into 3 groups in both the UPGMA and the PCA, but the grouping patterns were different. Also, we could not confirm any geographic trend from Bayesian clustering.
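
As a rough illustration of how a Shannon diversity index can be computed from binary (present/absent) marker data such as ISSR bands, the sketch below averages $-(p \ln p + q \ln q)$ over loci. It is a simplification under stated assumptions: it uses the observed band frequency directly as $p$, whereas population genetics software often first estimates allele frequencies for dominant markers, and the abstract does not state which procedure was used. The band matrix is hypothetical.

```python
import math


def shannon_index(band_matrix):
    """Mean Shannon index over loci.

    band_matrix[i][j] is 1 if individual i shows band j, else 0.
    Assumption: band frequency is used directly as p (no allele-frequency estimation).
    """
    n_individuals = len(band_matrix)
    n_loci = len(band_matrix[0])
    total = 0.0
    for j in range(n_loci):
        p = sum(row[j] for row in band_matrix) / n_individuals
        for freq in (p, 1.0 - p):
            if freq > 0:  # 0 * ln(0) is taken as 0
                total -= freq * math.log(freq)
    return total / n_loci


# Hypothetical data: 4 individuals scored at 3 loci.
bands = [
    [1, 1, 0],
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 0],
]
print(round(shannon_index(bands), 3))
```

A locus where every individual has the same score contributes zero, so monomorphic loci pull the average down.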

Analysis of Future Bioclimatic Zones Using Multi-climate Models (다중기후모형을 활용한 동북아시아의 미래 생물기후권역 변화분석)

  • Choi, Yuyoung;Lim, Chul-Hee;Ryu, Jieun;Jeon, Seongwoo
    • Journal of Environmental Impact Assessment / v.27 no.5 / pp.489-508 / 2018
  • As the climate changes, it is necessary to predict changes in the habitat environment in order to establish more proactive adaptation strategies. Bioclimatic classification, which clusters areas with similar habitats, can provide a useful ecosystem management framework. Therefore, in this study, the biological habitat environment of Northeast Asia (NEA) was identified by establishing bioclimatic zones, and the impact of climate change on the biological habitat was analyzed. ISODATA clustering was used to classify NEA into 15 bioclimatic zones, and climate change impacts were predicted by projecting the future spatial distribution of the bioclimatic zones based upon an ensemble of 17 GCMs across the RCP4.5 and RCP8.5 scenarios for the 2050s and 2070s. The results demonstrated that significant changes in bioclimatic conditions can be expected throughout NEA by the 2050s and 2070s. Overall, the zones moved upward, and some zones were predicted to expand or shrink greatly; we suggest these as regions requiring intensive management. This analysis provides a basis for understanding the potential impacts of climate change on biodiversity and ecosystems, and it could be used to support decision making on climate change adaptation more effectively.

Spatial Analysis of Colorectal Cancer Cases in Kuala Lumpur

  • Shah, Shamsul Azhar;Neoh, Hui-Min;Syed Abdul Rahim, Syed Sharizman;Azhar, Zahir Izuan;Hassan, Mohd Rohaizat;Safian, Nazarudin;Jamal, Rahman
    • Asian Pacific Journal of Cancer Prevention / v.15 no.3 / pp.1149-1154 / 2014
  • Background: In Malaysia, data from the Malaysian Health Ministry showed colorectal cancer (CRC) to be the second most common type of cancer in 2007-2009, after breast cancer. The same was apparent when male and female cases were examined separately. In the present study, the Geographic Information System (GIS) was employed to describe the distribution of CRC cases in Kuala Lumpur (KL), Malaysia, according to socio-demographic factors (age, gender, ethnicity and district). Materials and Methods: This retrospective review concerned data for patients diagnosed with colorectal cancer in the years 1995 to 2011 collected from the Wilayah Persekutuan Health Office, taken from the cancer notification form (NCR-2), and patient medical records from the Surgical Department, Universiti Kebangsaan Malaysia Medical Centre (UKMMC). A total of 146 cases were analyzed. All the data collected were analysed using ArcGIS version 10.0 and SPSS version 19.0. Results: Patients aged 60 to 69 years accounted for the highest proportion of cases (34.2%), males slightly predominated at 76 (52.1%), Chinese patients had the highest number of registered cases at 108 (74.0%), and staging revealed most cases in the 3rd and 4th stages. Kernel density analysis showed that cases were concentrated in the northern area of the Petaling and Kuala Lumpur subdistricts. Spatial global pattern analysis by average nearest neighbour gave a nearest neighbour ratio of 0.75, with a Z-score of -5.59 and a p value of <0.01. Spatial autocorrelation (Moran's I) showed significant clustering, with p<0.01, a Z-score of 3.14 and a Moran's Index of 0.007. When mapping clusters with hotspot analysis (Getis-Ord Gi), hot and cold spots were identified. Hot spot areas fell on the northeast side of KL. Conclusions: This study demonstrated significant spatial patterns of cancer incidence in KL.
Knowledge about these spatial patterns can provide useful information to policymakers in the planning of screening of CRC in the targeted population and improvement of healthcare facilities to provide better treatment for CRC patients.
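
The average nearest neighbour statistic reported in this abstract can be sketched as follows. This is a minimal illustration of the standard formulation (observed mean nearest-neighbour distance divided by the distance expected under complete spatial randomness, with the usual standard-error constant 0.26136); the coordinates and study-area size are hypothetical, and the code is not a reproduction of the ArcGIS tool used in the study.

```python
import math


def average_nearest_neighbour(points, area):
    """Return (ratio, z_score) for a list of (x, y) points in a study area of given size.

    ratio < 1 suggests clustering, ratio > 1 suggests dispersion.
    """
    n = len(points)
    nearest = []
    for i, (xi, yi) in enumerate(points):
        # Distance from point i to its closest other point.
        nearest.append(
            min(math.hypot(xi - xj, yi - yj)
                for j, (xj, yj) in enumerate(points) if j != i)
        )
    observed = sum(nearest) / n
    expected = 0.5 / math.sqrt(n / area)
    standard_error = 0.26136 / math.sqrt(n * n / area)
    return observed / expected, (observed - expected) / standard_error


# Hypothetical case locations packed into one corner of a 2 x 2 study area.
clustered = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1)]
ratio, z = average_nearest_neighbour(clustered, area=4.0)
print(ratio, z)
```

A ratio below 1 with a strongly negative z-score, as in the abstract, indicates that cases lie closer together than a random pattern would produce.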

The Comparison of Community Characteristics of Ground-dwelling Invertebrates According Agroecosystem Types in the Eastern Region of the Korean Peninsula (한반도 동부 농업생태계에 따른 지표배회성 무척추동물의 군집 특성 비교)

  • Ahn, Chi-Hyun;Oh, Young-Ju;Ock, Suk-Mi;Lee, Wook-Jae;Sohn, Soo-In;Kim, Myung-Hyun;Na, Young-Eun;Kim, Chang-Seok
    • Korean journal of applied entomology / v.56 no.1 / pp.29-39 / 2017
  • To compare the features of ground-dwelling invertebrates across agroecosystems, we selected paddy fields, dry fields, and orchards in the eastern region of Korea. Surveys were performed using pitfall traps twice per year from 2013 to 2015. In total, 6,420 individuals of 172 species belonging to 13 orders and 58 families were recorded in the eastern region; Hymenoptera (38.26%) and Orthoptera (16.28%) accounted for a large portion of the communities. Geographically, 2,983 individuals were caught in Gyeongsangnam-do, the diversity index of the Gyeongsangbuk-do community was higher than that of the others, and the abundance and species richness of paddy fields were higher than those of dry fields or orchards. To understand the relation between taxonomic groups and environmental factors, we carried out canonical correspondence analysis and hierarchical clustering. As a result, Homoptera, Blattaria, Isoptera, and Coleoptera were positively related to soil pH, soil temperature, and moisture content, and negatively related to the other factors. The invertebrate community was also patterned according to the type of ecosystem. These results show that the distribution of invertebrates is somewhat influenced by the relationship between the habitats the invertebrates occupy and environmental factors.

An Efficient Video Sequence Matching Algorithm (효율적인 비디오 시퀀스 정합 알고리즘)

  • 김상현;박래홍
    • Journal of the Institute of Electronics Engineers of Korea SP / v.41 no.5 / pp.45-52 / 2004
  • With the development of digital media technologies, various algorithms have been proposed to match video sequences efficiently. A large number of video sequence matching methods have focused on frame-wise queries, whereas relatively few algorithms have been presented for video sequence matching or video shot matching. In this paper, we propose an efficient algorithm to index video sequences and to retrieve them for video sequence queries. To improve the accuracy and performance of video sequence matching, we employ the Cauchy function as a similarity measure between histograms of consecutive frames, which yields high performance compared with conventional measures. The key frames extracted from segmented video shots can be used not only for video shot clustering but also for video sequence matching or browsing, where a key frame is defined as a frame that is significantly different from the previous frames. Several key frame extraction algorithms have been proposed, in which methods similar to those used for shot boundary detection were employed with proper similarity measures. In this paper, we propose an efficient algorithm to extract key frames using the cumulative Cauchy function measure and compare its performance with that of conventional algorithms. Video sequence matching can be performed by evaluating the similarity between data sets of key frames. To improve the matching efficiency with the set of extracted key frames, we employ the Cauchy function and the modified Hausdorff distance. Experimental results with several color video sequences show that the proposed method yields high matching performance and accuracy with a low computational load compared with conventional algorithms.
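
The idea of comparing two sets of key frames with the modified Hausdorff distance can be sketched as below. This is an illustrative version only: each key frame is represented by a hypothetical feature vector, and plain Euclidean distance is used between frames as an assumption, whereas the abstract states that the authors use a Cauchy function measure whose exact form is not given here.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two feature vectors (assumed frame distance)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def directed_distance(set_a, set_b, dist=euclidean):
    """Mean, over frames in set_a, of the distance to the closest frame in set_b."""
    return sum(min(dist(a, b) for b in set_b) for a in set_a) / len(set_a)


def modified_hausdorff(set_a, set_b, dist=euclidean):
    """Modified Hausdorff distance: the larger of the two directed mean distances."""
    return max(directed_distance(set_a, set_b, dist),
               directed_distance(set_b, set_a, dist))


# Hypothetical key-frame feature vectors for a query and a stored sequence.
query_frames = [(0.0, 0.0), (2.0, 0.0)]
stored_frames = [(0.0, 0.0)]
print(modified_hausdorff(query_frames, stored_frames))
```

Because the directed distances are averages rather than maxima, a single outlying key frame affects the result less than it would with the classical Hausdorff distance.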

Calculation of the Peak-hour Ratio at Urban Railway Stations Reflecting Passenger Demand Pattern and Land Use Inventory - A Case of Seoul - (승객 수요 패턴과 역세권의 토지이용 특성을 반영한 도시철도역 첨두시간 집중률 산정 - 서울시를 대상으로 -)

  • Jang, Sunghoon;Kim, Hyo-Seung;Lee, Chungwon;Kim, Dong-Kyu
    • KSCE Journal of Civil and Environmental Engineering Research / v.33 no.4 / pp.1581-1589 / 2013
  • The aim of this study is to suggest a methodology for calculating the peak-hour ratio of passengers at urban railway stations by reflecting the characteristics of passenger demand patterns and the land use inventory of stations. To achieve this, urban railway stations in Seoul are divided into three groups by using factor analysis and cluster analysis. For each station group, we calculate five variables related to the passenger demand patterns and four related to the land use inventory of stations, as well as the peak-hour ratios of passengers. Among these nine variables, average daily passengers and the location quotient (LQ) index for business services are selected as the classification criteria for station groups based on statistical tests. Using the two variables, a group allocation process is suggested to estimate the peak-hour ratio of passengers for a newly constructed station. Evaluation results based on thirteen stations show that the proposed methodology produces lower errors than the currently used guideline does. The results of this study contribute to efficiently establishing construction and operation plans for newly constructed stations.
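
For readers unfamiliar with the location quotient used here as a classification variable, the following sketch shows its standard definition: the local share of a sector divided by the share of that sector in the reference region. The function and argument names and the figures are hypothetical; the abstract does not specify which quantity (for example, floor area or employment) the authors measured.

```python
def location_quotient(local_sector, local_total, region_sector, region_total):
    """LQ = (local sector share) / (regional sector share).

    LQ > 1 means the sector is more concentrated locally than in the region.
    """
    local_share = local_sector / local_total
    region_share = region_sector / region_total
    return local_share / region_share


# Hypothetical figures: business services around one station versus the whole city.
lq = location_quotient(local_sector=30, local_total=100,
                       region_sector=150, region_total=1000)
print(lq)
```

In this made-up case the station area has twice the city-wide concentration of business services, which is the kind of contrast a group allocation rule could exploit.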

Morphological Characteristics and Genetic Diversity Analysis of Cultivated Sancho (Zanthoxylum schinifolium) and Chopi (Zanthoxylum piperitum) in Korea (국내 재배지의 산초(Zanthoxylum schinifolium)와 초피(Zanthoxylum piperitum)의 형태학적 특성과 유전적 다양성)

  • Ryu, Jaihyunk;Choi, Hae-Sik;Lyu, Jae-il;Bae, Chang-Hyu
    • Korean Journal of Plant Resources / v.29 no.5 / pp.555-563 / 2016
  • The morphological characteristics and genetic relationships among 32 germplasms of Zanthoxylum schinifolium and Zanthoxylum piperitum collected from two farms in Korea were investigated. The traits with the most variability were seed color, leaf size, and spine size. The intraspecific polymorphism of Z. schinifolium and Z. piperitum was 96.5% and 60.3%, respectively. The genetic diversity and Shannon’s information index values ranged from 0.11 to 0.33 and 0.19 to 0.50, with average values of 0.26 and 0.42, respectively. Two ISSR primers (UBC861 and UBC862) were able to distinguish the two species. The genetic similarity matrix (GSM) revealed variability among the accessions ranging from 0.116 to 0.816. The intraspecific GSM for Z. schinifolium and Z. piperitum was 0.177-0.780 and 0.250-0.816, respectively. The GSM findings indicate that Z. schinifolium and Z. piperitum accessions have high genetic diversity and include germplasms that qualify as good genetic resources for cross breeding. The clustering analysis separated Z. schinifolium and Z. piperitum into independent groups, and all accessions could be classified into three categories. Z. schinifolium var. inermis formed an independent group. Comparison of the clusters based on morphological analysis with those based on ISSR data showed no clear pattern of division among the accessions. The study findings indicate that Z. schinifolium and Z. piperitum accessions are genetically diverse and that ISSR markers are useful for identifying Z. schinifolium and Z. piperitum.
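
A genetic similarity matrix of the kind mentioned in this abstract is built by comparing the band profiles of every pair of accessions. The sketch below uses the Jaccard coefficient, which is one common choice for binary marker data; this is an assumption, since the abstract does not say which coefficient the authors used. The band profiles are hypothetical.

```python
def jaccard_similarity(profile_a, profile_b):
    """Shared bands divided by bands present in at least one of the two profiles."""
    shared = sum(1 for a, b in zip(profile_a, profile_b) if a == 1 and b == 1)
    either = sum(1 for a, b in zip(profile_a, profile_b) if a == 1 or b == 1)
    return shared / either if either else 0.0


def similarity_matrix(profiles):
    """Pairwise similarity for a list of binary band profiles."""
    return [[jaccard_similarity(p, q) for q in profiles] for p in profiles]


# Hypothetical band profiles (1 = band present) for three accessions.
accessions = [
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
]
for row in similarity_matrix(accessions):
    print([round(value, 3) for value in row])
```

A matrix like this is the usual input to a clustering step such as UPGMA, typically after converting similarity to distance.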

Variation of Fruit and Seed Morphology of 6 Natural Populations of Sorbus commixta Hedl. in Korea (마가목 6개 천연집단의 열매와 종자 형질 변이)

  • Song, Jeong-Ho;Jang, Kyung-Hwan
    • Journal of Korean Society of Forest Science / v.102 no.1 / pp.1-6 / 2013
  • This study was conducted to investigate the variation of fruit and seed morphology among populations and among individuals within populations of Sorbus commixta Hedl. distributed in Korea. Fruits were collected from 42 trees in six natural populations, and six fruit and four seed characteristics were analyzed. In all characteristics, there were significant differences among populations and among individuals within populations. In particular, the number of fruits per fruit-bearing branch and the number of seeds per fruit showed a higher among-population share of the total variance component. Coefficients of variation for the number of fruits per fruit-bearing branch and seed weight were relatively high (42.0~75.3%) compared to other traits (11.9~32.1%). In a simple correlation analysis, the number of fruits per fruit-bearing branch showed a significant positive correlation with latitude but a negative correlation with longitude. According to cluster analysis, geographically close populations tended to cluster into the same group. Three principal components (PC) were derived from principal component analysis, which together explain 87% of the total variance of fruit and seed characteristics. The largest contributions came from seed length and seed weight in PC1, fruit width and seed index in PC2, and fruit length and number of fruits per fruit-bearing branch in PC3.
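
The coefficient of variation quoted in this abstract is the standard deviation expressed as a percentage of the mean. A minimal sketch follows; it uses the sample standard deviation (an assumption, since the abstract does not say whether the sample or population form was used), and the measurements are hypothetical.

```python
import statistics


def coefficient_of_variation(values):
    """Sample standard deviation as a percentage of the mean."""
    return statistics.stdev(values) / statistics.mean(values) * 100


# Hypothetical counts of fruits per fruit-bearing branch for three trees.
fruits_per_branch = [1, 2, 3]
print(coefficient_of_variation(fruits_per_branch))
```

Because it is unit-free, the coefficient of variation lets traits measured on different scales, such as counts and weights, be compared directly.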