• Title/Summary/Keyword: 자기조직화 특성지도 (self-organizing feature map)


Identification of shear layer at river confluence using (RGB) aerial imagery (RGB 항공 영상을 이용한 하천 합류부 전단층 추출법)

  • Noh, Hyoseob;Park, Yong Sung
    • Journal of Korea Water Resources Association / v.54 no.8 / pp.553-566 / 2021
  • River confluences are often characterized by a shear layer and the strong mixing associated with it. In natural rivers, the main channel and its tributary can be distinguished across the shear layer by their contrasting colors. The shear layer can therefore be easily observed in aerial images from satellites or unmanned aerial vehicles. This study proposes a low-cost identification method that extracts geographic features of the shear layer from RGB aerial images. The method consists of three stages. First, to identify the shear layer, it performs image segmentation with a Gaussian mixture model and extracts the water bodies of the main channel and the tributary. Next, a self-organizing map simplifies the flow line of the water bodies into a one-dimensional curve grid. Finally, a curvilinear coordinate transformation is performed using the water-body pixels and the curve grid. The method was successfully applied to the confluence of the Nakdong River and the Nam River to extract geometric shear layer features (confluence angle, upstream and downstream channel widths, shear layer length, and maximum shear layer thickness).
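The second stage, reducing a set of water-body pixels to a one-dimensional curve, can be illustrated with a minimal 1-D self-organizing map in NumPy. This is only a sketch under assumptions: the function name `fit_som_1d`, the linear decay schedules, and the principal-axis initialisation are illustrative choices, not the authors' implementation, and the input is assumed to be pixel coordinates already produced by the segmentation stage.

```python
import numpy as np

def fit_som_1d(points, n_nodes=20, n_iter=2000, lr0=0.5, sigma0=None, seed=0):
    """Fit a 1-D chain of SOM nodes to 2-D points; returns an (n_nodes, 2) array.

    Hypothetical helper for illustration; all parameter defaults are assumptions.
    """
    rng = np.random.default_rng(seed)
    points = np.asarray(points, dtype=float)
    if sigma0 is None:
        sigma0 = n_nodes / 4.0
    # Initialise the nodes along the first principal axis so the chain starts ordered.
    mean = points.mean(axis=0)
    _, _, vt = np.linalg.svd(points - mean, full_matrices=False)
    proj = (points - mean) @ vt[0]
    t = np.linspace(proj.min(), proj.max(), n_nodes)
    nodes = mean + np.outer(t, vt[0])
    idx = np.arange(n_nodes)
    for step in range(n_iter):
        frac = step / n_iter
        lr = lr0 * (1.0 - frac)                    # linearly decaying learning rate
        sigma = max(sigma0 * (1.0 - frac), 0.5)    # shrinking neighbourhood width
        x = points[rng.integers(len(points))]      # one random training sample
        bmu = np.argmin(np.sum((nodes - x) ** 2, axis=1))      # best-matching unit
        h = np.exp(-((idx - bmu) ** 2) / (2.0 * sigma ** 2))   # Gaussian over chain index
        nodes += lr * h[:, None] * (x - nodes)     # pull the BMU and its neighbours toward x
    return nodes
```

Because neighbouring nodes in the chain are updated together, the fitted nodes tend to trace the centreline of an elongated point cloud, which is the property a curve grid needs.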

Gene Screening and Clustering of Yeast Microarray Gene Expression Data (효모 마이크로어레이 유전자 발현 데이터에 대한 유전자 선별 및 군집분석)

  • Lee, Kyung-A;Kim, Tae-Houn;Kim, Jae-Hee
    • The Korean Journal of Applied Statistics / v.24 no.6 / pp.1077-1094 / 2011
  • We perform cluster analyses of yeast cell-cycle microarray expression data. To reflect the characteristics of time-course data, we screen genes with test statistics based on Fourier coefficients, applying a false discovery rate (FDR) procedure. On the yeast data, we compare the results of model-based clustering, K-means, PAM, SOM, the hierarchical Ward method, and a fuzzy method. As validity measures for the clustering results, connectivity, the Dunn index, and silhouette values are computed and compared. A biological interpretation based on GO analysis is also included.
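One of the validity measures named above, the Dunn index, can be sketched in a few lines of NumPy. The sketch assumes the common definition (smallest distance between points in different clusters divided by the largest within-cluster distance) with Euclidean distances; the abstract does not say which variant the authors used, and `dunn_index` is a hypothetical helper name.

```python
import numpy as np

def dunn_index(X, labels):
    """Smallest between-cluster distance divided by the largest cluster diameter.

    Assumes at least two clusters and at least one cluster with two or more
    points; otherwise the ratio is undefined.
    """
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    # Full pairwise Euclidean distance matrix (fine for small data sets).
    d = np.sqrt(((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2))
    same = labels[:, None] == labels[None, :]
    max_diameter = d[same].max()       # largest within-cluster distance
    min_separation = d[~same].min()    # smallest distance across clusters
    return min_separation / max_diameter
```

Larger values indicate compact, well-separated clusters, so the index can be compared across the clustering methods applied to the same data.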

Long-term Prediction of Bus Travel Time Using Bus Information System Data (BIS 자료를 이용한 중장기 버스 통행시간 예측)

  • LEE, Jooyoung;Gu, Eunmo;KIM, Hyungjoo;JANG, Kitae
    • Journal of Korean Society of Transportation / v.35 no.4 / pp.348-359 / 2017
  • Various policies to promote public transportation have recently been implemented to mitigate traffic congestion in metropolitan areas. In metropolitan areas in particular, bus information systems have been introduced to provide the current location of buses and their estimated arrival times. However, travel time is difficult to predict for buses passing through complex urban areas because of recurrent traffic congestion and bus bunching. Previous bus travel time studies, which relied on short-term prediction with data-driven methods, have had difficulty providing route travel times for bus users and long-term travel times. This study develops a path-based methodology for long-term bus travel time prediction. Bus travel information from 2015 serves as training data and data from 2016 as verification data. We analyze the bus travel information and classify the factors affecting bus travel time into departure time, day of week, and weather. A self-organizing map then groups these factors into clusters with similar patterns. Based on the derived clusters, reference tables of bus travel time by day and departure time were constructed for sunny and rainy days. The accuracy of the bus travel times derived in this study was verified using the verification data. The prediction algorithm of this paper is expected to overcome the limitations of existing intuitive and empirical approaches and, by improving prediction accuracy, to improve bus user satisfaction and support flexible public transportation policy.

The Pattern Analysis of Financial Distress for Non-audited Firms using Data Mining (데이터마이닝 기법을 활용한 비외감기업의 부실화 유형 분석)

  • Lee, Su Hyun;Park, Jung Min;Lee, Hyoung Yong
    • Journal of Intelligence and Information Systems / v.21 no.4 / pp.111-131 / 2015
  • Compared with research on bankruptcy prediction, only a handful of studies have analyzed patterns of corporate distress. The few that exist focus mainly on audited firms because financial data are easier to collect for them. In reality, however, financial distress is far more common and critical among non-audited firms, which are mainly small and medium-sized firms. The purpose of this paper is to classify non-audited firms under distress according to their financial ratios using a data mining technique, the Self-Organizing Map (SOM). SOM is a type of artificial neural network trained by unsupervised learning to produce a lower-dimensional, discretized representation of the input space of the training samples, called a map. SOM differs from other artificial neural networks in that it applies competitive learning rather than error-correction learning such as backpropagation with gradient descent, and in that it uses a neighborhood function to preserve the topological properties of the input space. It is one of the most popular and successful clustering algorithms. In this study, we classify the types of financially distressed firms, specifically non-audited firms. In the empirical test, we collect 10 financial ratios for 100 non-audited firms that were under distress in 2004, covering the previous two years (2002 and 2003). Using these financial ratios and the SOM algorithm, five distinct patterns were distinguished. In pattern 1, financial distress was very serious in almost all financial ratios; 12% of the firms fell into this pattern. In pattern 2, financial distress was weak in almost all financial ratios; 14% of the firms fell into this pattern. In pattern 3, the growth ratio was the worst among all patterns. It is speculated that the firms of this pattern may be under distress due to severe competition in their industries. Approximately 30% of the firms fell into this group. In pattern 4, the growth ratio was higher than in any other pattern, but the cash ratio and profitability ratio were not at the level of the growth ratio. We conclude that the firms of this pattern fell into distress while pursuing business expansion. About 25% of the firms were in this pattern. Last, pattern 5 encompassed very solvent firms. Perhaps firms of this pattern were distressed due to a bad short-term strategic decision or due to problems with the entrepreneur of the firm. Approximately 18% of the firms were in this pattern. This study makes both an academic and an empirical contribution. Academically, it uses a data mining technique (Self-Organizing Map) to classify non-audited companies, which tend to go bankrupt easily and whose financial data are unstructured or easily manipulated, rather than large audited firms, whose financial data are well prepared and reliable. Empirically, even though the analysis relies on the financial data of non-audited firms, it is useful for finding the first-order symptoms of financial distress, which helps to forecast bankruptcy and to manage early warning signals. The limitation of this research is that it analyzes only 100 firms, owing to the difficulty of collecting financial data for non-audited firms, which makes it hard to analyze by category or size. Also, non-financial qualitative data are crucial for the analysis of bankruptcy; non-financial qualitative factors will therefore be taken into account in a follow-up study. This study sheds some light on the future prediction of distress in non-audited small and medium-sized firms.
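The competitive learning and neighborhood function described in this abstract correspond to the standard Kohonen update rule, written out here for reference. The notation is an assumption, not taken from the paper: $x(t)$ is the training sample at step $t$, $w_i$ the weight vector of node $i$, $r_i$ its position on the map, $\alpha(t)$ a decreasing learning rate, and $\sigma(t)$ a decreasing neighborhood width. The Gaussian neighborhood is one common choice among several.

```latex
% Competition: find the best-matching unit (BMU) for the current sample
c(t) = \arg\min_{i} \, \lVert x(t) - w_i(t) \rVert

% Cooperation and adaptation: move the BMU and its map neighbours toward the sample
w_i(t+1) = w_i(t) + \alpha(t)\, h_{c,i}(t)\, \bigl[ x(t) - w_i(t) \bigr]

% A common (assumed) Gaussian neighbourhood function on the map grid
h_{c,i}(t) = \exp\!\left( -\frac{\lVert r_c - r_i \rVert^{2}}{2\,\sigma(t)^{2}} \right)
```

No error signal or gradient of a loss appears in the update; only the winning node and its map neighbours move, which is what preserves the topology of the input space on the map.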

Visualizing Excercise Prescription Using Visual Path Map (비쥬얼패스맵을 이용한 운동처방 과정 시각화)

  • Ham, Jun-Seok;Jeong, Chan-Soon;Ko, Il-Ju
    • Journal of Korea Multimedia Society / v.14 no.9 / pp.1182-1189 / 2011
  • We propose a system, named Visual Path Map, that visualizes the distribution of clusters by characteristic and the entire process of an exercise prescription. Visual Path Map visualizes the distribution of clusters by characteristic, the current and target distributions, and the distribution as changed by the prescription. It thereby visualizes the paths from the current distribution to the target distribution under the prescription. We used SOM to express the properties of subjects in Visual Path Map, and visualized the distribution of clusters for the physical characteristics, body mass index, and age of 1,500 ordinary people. We also visualize practical exercise prescriptions based on real data from exercise prescription experts.

Patterning Waterbird Assemblages on Rice Fields Using Self-Organizing Map and Random Forest (자기조직화지도(Self-organizing map)와 랜덤 포레스트 분석(Random forest)을 이용한 논습지에 도래하는 수조류 군집 특성 파악)

  • Nam, Hyung-Kyu;Choi, Seung-Hye;Yoo, Jeong-Chil
    • Korean Journal of Environmental Agriculture / v.34 no.3 / pp.168-177 / 2015
  • BACKGROUND: In recent years, there has been great concern regarding agricultural land uses and their importance for the conservation of biodiversity. Rice fields are unique managed wetlands for wildlife, especially waterbirds. Comprehensive monitoring of the waterbird assemblage was undertaken to understand changes in its patterns in a rice ecosystem in South Korea. This rice ecosystem has been recognized as one of the most important for waterbird conservation. METHODS AND RESULTS: Biweekly monitoring was carried out over 4 years, from April 2009 to March 2010 and from April 2011 to March 2014. A total of 32 species of waterbirds were observed. Self-organizing map (SOM) and random forest analyses were applied to the waterbird dataset to identify the characteristics of waterbird distribution. The SOM and random forest analyses clearly classified the assemblages into four clusters and extracted ecological information from the waterbird dataset. Waterbird assemblages showed strong seasonality and habitat use that differed by waterbird group, such as shorebirds, herons, and waterfowl. CONCLUSION: Our results showed that the combination of SOM and random forest analysis could be useful for ecosystem assessment and management. Furthermore, we strongly suggest a strict management strategy for rice fields to conserve waterbirds. The strategy could be season- and species-specific.

Water demand forecasting at the DMA level considering sociodemographic and waterworks characteristics (사회인구통계 및 상수도시설 특성을 고려한 소블록 단위 물 수요예측 연구)

  • Saemmul Jin;Dooyong Choi;Kyoungpil Kim;Jayong Koo
    • Journal of Korean Society of Water and Wastewater / v.37 no.6 / pp.363-373 / 2023
  • Numerous studies have established a correlation between sociodemographic characteristics and water usage, identifying population as a primary independent variable in mid- to long-term demand forecasting. Recent dramatic sociodemographic changes, including urban concentration and rural depopulation, low birth rates and an aging population, and the rise in single-person households, are expected to affect water demand and supply patterns. This underscores the need for operational and managerial changes in existing water supply systems. While sociodemographic characteristics are regularly surveyed, the surveys use aggregate units that do not align with the actual supply system. Consequently, many water demand forecasts have been made at the administrative district level without adequately considering the water supply system. This study presents a bottom-up water demand forecasting model that accurately reflects real water facilities and consumers. The model comprises three key steps. First, data from Statistics Korea's SGIS (Statistical Geographic Information Service) were reorganized at the DMA level. Second, DMAs were classified using the SOM (Self-Organizing Map) algorithm to account for differences in water facilities and consumer characteristics. Last, water demand was forecast with the PCR (Principal Component Regression) method to address multicollinearity and overfitting. The performance evaluation of this model was conducted for DMAs classified as rural areas due to the insufficient number of DMAs. The estimation results indicate that the correlation coefficients exceeded 0.9 and that the MAPE remained within approximately 10% for the test dataset. This method is expected to be useful for reorganization plans, such as the expansion and contraction of existing facilities.
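The third step, principal component regression, together with the MAPE used for evaluation, can be sketched in NumPy as follows. This is a generic textbook formulation under assumptions, not the authors' model: `fit_pcr` and `mape` are hypothetical helper names, the predictors are only centred (not scaled), and the number of retained components is left to the caller.

```python
import numpy as np

def fit_pcr(X, y, n_components):
    """Principal component regression: regress y on the leading PCs of X."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc = X - x_mean
    # Rows of vt are the principal directions of the centred predictors.
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    V = vt[:n_components].T                  # shape (n_features, n_components)
    scores = Xc @ V                          # mutually uncorrelated component scores
    gamma, *_ = np.linalg.lstsq(scores, y - y_mean, rcond=None)
    beta = V @ gamma                         # coefficients in the original feature space
    intercept = y_mean - x_mean @ beta
    return beta, intercept

def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent (y_true must be nonzero)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return 100.0 * np.mean(np.abs((y_true - y_pred) / y_true))
```

Regressing on a few leading components discards the low-variance directions along which collinear predictors are nearly indistinguishable, which is how the method addresses multicollinearity; predictions are then `X @ beta + intercept`.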

Bankruptcy Type Prediction Using A Hybrid Artificial Neural Networks Model (하이브리드 인공신경망 모형을 이용한 부도 유형 예측)

  • Jo, Nam-ok;Kim, Hyun-jung;Shin, Kyung-shik
    • Journal of Intelligence and Information Systems / v.21 no.3 / pp.79-99 / 2015
  • The prediction of bankruptcy has been extensively studied in the accounting and finance field. It can have an important impact on lending decisions and the profitability of financial institutions in terms of risk management. Many researchers have focused on constructing a more robust bankruptcy prediction model. Early studies primarily used statistical techniques such as multiple discriminant analysis (MDA) and logit analysis for bankruptcy prediction. However, many studies have demonstrated that artificial intelligence (AI) approaches, such as artificial neural networks (ANN), decision trees, case-based reasoning (CBR), and support vector machines (SVM), have outperformed statistical techniques since the 1990s on business classification problems, because statistical methods rest on rigid assumptions. In previous studies on corporate bankruptcy, many researchers have focused on developing bankruptcy prediction models using financial ratios. However, few studies address the specific types of bankruptcy. Previous bankruptcy prediction models have generally been concerned with predicting whether or not firms will become bankrupt, and most studies on bankruptcy types have focused on reviewing the literature or performing case studies. This study therefore develops a model using data mining techniques that predicts the specific type of bankruptcy as well as its occurrence in Korean small- and medium-sized construction firms, in terms of profitability, stability, and activity indices, so that firms can take steps to prevent bankruptcy in advance. We propose a hybrid approach using two artificial neural networks (ANNs) for the prediction of bankruptcy types. The first is a back-propagation neural network (BPN) model using supervised learning for bankruptcy prediction, and the second is a self-organizing map (SOM) model using unsupervised learning to classify bankruptcy data into several types. Based on the constructed model, we predict the bankruptcy of companies by applying the BPN model to a validation set that was not used in developing the model. The bankruptcy data predicted by the BPN model then allow the specific types of bankruptcy to be identified. To interpret the characteristics of the clusters derived by the SOM model, we calculated, for each cluster, the averages of input variables selected through statistical tests. Each cluster represents a bankruptcy type classified from the data of bankrupt firms, and the input variables, which are financial ratios, are used to interpret the meaning of each cluster. The experimental results show that each of the five bankruptcy types has different characteristics in terms of financial ratios. Type 1 (severe bankruptcy) has inferior financial statements except for EBITDA (earnings before interest, taxes, depreciation, and amortization) to sales, based on the clustering results. Type 2 (lack of stability) has a low quick ratio, low stockholders' equity to total assets, and high total borrowings to total assets. Type 3 (lack of activity) has a slightly low total asset turnover and fixed asset turnover. Type 4 (lack of profitability) has low retained earnings to total assets and low EBITDA to sales, which are indices of profitability. Type 5 (recoverable bankruptcy) includes firms that are in relatively good financial condition compared with the other bankruptcy types even though they are bankrupt. Based on the findings, researchers and practitioners engaged in the credit evaluation field can obtain more useful information about the types of corporate bankruptcy. In this paper, we used the financial ratios of firms to classify bankruptcy types. It is important to select input variables that correctly predict bankruptcy and meaningfully classify the type of bankruptcy. In a further study, we will include non-financial factors such as the size, industry, and age of the firms. We can thereby obtain realistic clustering results for bankruptcy types by combining qualitative factors and reflecting the domain knowledge of experts.

Improved Focused Sampling for Class Imbalance Problem (클래스 불균형 문제를 해결하기 위한 개선된 집중 샘플링)

  • Kim, Man-Sun;Yang, Hyung-Jeong;Kim, Soo-Hyung;Cheah, Wooi Ping
    • The KIPS Transactions:PartB / v.14B no.4 / pp.287-294 / 2007
  • Many classification algorithms for real-world data suffer from the class imbalance problem. To solve this problem, various methods have been proposed, such as altering the training balance and designing better sampling strategies. Previous methods, however, fall short with respect to the distribution of the input data and the constraints. In this paper, we propose a focused sampling method that is superior to previous methods. To solve the problem, a useful subset must be selected from the full training set. To obtain this subset, the proposed method divides the data into regions according to scores computed from the distribution of the SOM over the input data. The scores are sorted in ascending order. They represent the distribution of the input data, which may in turn represent the characteristics of the whole data. A new training dataset is obtained by eliminating the data that are not useful, namely those located in the region between an upper bound and a lower bound. The proposed method gives better, or at least similar, classification accuracy compared with previous approaches. It also offers several benefits: a reduced class imbalance ratio, a smaller training set, and prevention of over-fitting. The proposed method has been tested with a kNN classifier. An experimental result on the ecoli data set shows that this method achieves precision up to 2.27 times that of the other methods.

Status and Implications of Hydrogeochemical Characterization of Deep Groundwater for Deep Geological Disposal of High-Level Radioactive Wastes in Developed Countries (고준위 방사성 폐기물 지질처분을 위한 해외 선진국의 심부 지하수 환경 연구동향 분석 및 시사점 도출)

  • Jaehoon Choi;Soonyoung Yu;SunJu Park;Junghoon Park;Seong-Taek Yun
    • Economic and Environmental Geology / v.55 no.6 / pp.737-760 / 2022
  • For the geological disposal of high-level radioactive wastes (HLW), an understanding of the deep subsurface environment, gained through geological, hydrogeological, geochemical, and geotechnical investigations, is essential. Although South Korea plans the geological disposal of HLW, only a few studies have characterized the geochemistry of the deep subsurface environment. To guide hydrogeochemical research for selecting suitable repository sites, this study reviews the status and trends of hydrogeochemical characterization of deep groundwater for the deep geological disposal of HLW in developed countries. An examination of the selection process for geological disposal sites in eight countries (the USA, Canada, Finland, Sweden, France, Japan, Germany, and Switzerland) showed that the following geochemical parameters were needed to characterize the deep subsurface environment: major and minor elements and isotopes (e.g., ³⁴S and ¹⁸O of SO₄²⁻, ¹³C and ¹⁴C of DIC, ²H and ¹⁸O of water) of both groundwater and pore water (in aquitards), fracture-filling minerals, organic materials, colloids, and oxidation-reduction indicators (e.g., Eh, Fe²⁺/Fe³⁺, H₂S/SO₄²⁻, NH₄⁺/NO₃⁻). A suitable repository was selected based on the integrated interpretation of these geochemical data from the deep subsurface. In South Korea, hydrochemical types and evolutionary patterns of deep groundwater were identified using artificial neural networks (e.g., Self-Organizing Map), and the impact of shallow groundwater mixing was evaluated based on multivariate statistics (e.g., M3 modeling). The relationship between fracture-filling minerals and groundwater chemistry has also been investigated through reaction-path modeling. However, these previous studies in South Korea were conducted without some important geochemical data, including isotopes, oxidation-reduction indicators, and DOC, mainly due to the lack of available data. Therefore, a detailed geochemical investigation across the country is required to collect these hydrochemical data so that a geological disposal site can be selected based on scientific evidence.