• Title/Summary/Keyword: Self Organizing Map(SOM)

Search Result 235, Processing Time 0.02 seconds

Knowledge Discovery in Aerodynamic Design Space using Data Mining (데이터 마이닝을 통한 공력설계공간 지식습득)

  • Jeong, Sin-Gyu;;, 동북대학교
    • Journal of the Korean Society for Aeronautical & Space Sciences
    • /
    • v.34 no.1
    • /
    • pp.49-55
    • /
    • 2006
  • Two data mining techniques, analysis of variance (ANOVA) and self-organizing map (SOM), are applied to knowledge discovery in aerodynamic design space. These methods make it possible to identify the effect of each design variable on the objective functions. Furthermore, ANOVA shows the effect of interaction between design variables on the objective function and SOM visualizes the trade-off among objective functions. Present methods are applied to the result of the supersonic wing design which includes 72 design variables and 4 objective functions.

Speech Query Recognition for Tamil Language Using Wavelet and Wavelet Packets

  • Iswarya, P.;Radha, V.
    • Journal of Information Processing Systems
    • /
    • v.13 no.5
    • /
    • pp.1135-1148
    • /
    • 2017
  • Speech recognition is one of the fascinating fields in the area of Computer science. Accuracy of speech recognition system may reduce due to the presence of noise present in speech signal. Therefore noise removal is an essential step in Automatic Speech Recognition (ASR) system and this paper proposes a new technique called combined thresholding for noise removal. Feature extraction is process of converting acoustic signal into most valuable set of parameters. This paper also concentrates on improving Mel Frequency Cepstral Coefficients (MFCC) features by introducing Discrete Wavelet Packet Transform (DWPT) in the place of Discrete Fourier Transformation (DFT) block to provide an efficient signal analysis. The feature vector is varied in size, for choosing the correct length of feature vector Self Organizing Map (SOM) is used. As a single classifier does not provide enough accuracy, so this research proposes an Ensemble Support Vector Machine (ESVM) classifier where the fixed length feature vector from SOM is given as input, termed as ESVM_SOM. The experimental results showed that the proposed methods provide better results than the existing methods.

Self Organized Pattern Classification and Analysis of Hydrologic Data in Juam Lake (주암호 수문자료의 자기조직화 패턴분류 및 분석)

  • Park, Sung-Chun;Jin, Young-Hoon;Roh, Kyong-Bum;Yang, Dong-Hyun
    • Proceedings of the Korea Water Resources Association Conference
    • /
    • 2012.05a
    • /
    • pp.790-794
    • /
    • 2012
  • 우리나라는 여름철에 강우가 편중되어 있고 동고서저의 산악지형으로 수자원확보가 어려운 실정이며 이는 곧 하천의 유지유량확보의 어려움과도 직결된다. 이러한 수자원확보를 위해 최근 기존 저수지 둑을 높이는 사업이 전국적으로 활발히 진행되고 있으며 이는 저수지나 댐의 수체와 같은 수자원을 보다 적극적으로 활용하여 그 가치를 높임과 동시에 하천에 대한 활용도를 높이고자 하는 데 그 목적이 있다. 따라서 저수지나 댐의 저류량에 기여하는 강우량, 유입량과 같은 수문학적 자료의 심도 있는 분석이 필요하며 수문변수들이 나타내는 복잡한 패턴에 대한 연구가 이루어져야 할 것이다. 본 연구에서는 저수지나 댐의 저류량에 직접적으로 영향을 주는 수문변수들을 전체적으로 파악하기 위해 수집된 수문자료의 각각의 특성 및 자료들 사이의 복합적인 관계를 파악하였으며 이를 위하여 패턴분류 분야에서 그 적용타당성이 입증된 자기조직화 지도(Self-Organizing Map: SOM)를 이용하였다. 본 연구의 대상지점은 섬진강 유역내에 위치한 주암호를 대상지점으로 선정하였으며 패턴분석에 사용한 수문자료의 기간은 2007~2010년까지 5년간의 월평균 자료를 활용하였다. SOM의 적용 결과, 측정수문자료에 대한 전체적인 특성을 패턴분류를 통해 분류하였으며, 각 변수에 대한 패턴별 상대성을 고려한 클러스터별 특성 및 시간적 이질성을 파악할 수 있었다. 이는 측정 자료에 대한 분석 기법개발의 일환으로 향후 수자원 확보에 대한 개발 및 정책의 기초자료로 활용될 것으로 기대된다.

  • PDF

The Pattern Analysis of Financial Distress for Non-audited Firms using Data Mining (데이터마이닝 기법을 활용한 비외감기업의 부실화 유형 분석)

  • Lee, Su Hyun;Park, Jung Min;Lee, Hyoung Yong
    • Journal of Intelligence and Information Systems
    • /
    • v.21 no.4
    • /
    • pp.111-131
    • /
    • 2015
  • There are only a handful number of research conducted on pattern analysis of corporate distress as compared with research for bankruptcy prediction. The few that exists mainly focus on audited firms because financial data collection is easier for these firms. But in reality, corporate financial distress is a far more common and critical phenomenon for non-audited firms which are mainly comprised of small and medium sized firms. The purpose of this paper is to classify non-audited firms under distress according to their financial ratio using data mining; Self-Organizing Map (SOM). SOM is a type of artificial neural network that is trained using unsupervised learning to produce a lower dimensional discretized representation of the input space of the training samples, called a map. SOM is different from other artificial neural networks as it applies competitive learning as opposed to error-correction learning such as backpropagation with gradient descent, and in the sense that it uses a neighborhood function to preserve the topological properties of the input space. It is one of the popular and successful clustering algorithm. In this study, we classify types of financial distress firms, specially, non-audited firms. In the empirical test, we collect 10 financial ratios of 100 non-audited firms under distress in 2004 for the previous two years (2002 and 2003). Using these financial ratios and the SOM algorithm, five distinct patterns were distinguished. In pattern 1, financial distress was very serious in almost all financial ratios. 12% of the firms are included in these patterns. In pattern 2, financial distress was weak in almost financial ratios. 14% of the firms are included in pattern 2. In pattern 3, growth ratio was the worst among all patterns. It is speculated that the firms of this pattern may be under distress due to severe competition in their industries. Approximately 30% of the firms fell into this group. In pattern 4, the growth ratio was higher than any other pattern but the cash ratio and profitability ratio were not at the level of the growth ratio. It is concluded that the firms of this pattern were under distress in pursuit of expanding their business. About 25% of the firms were in this pattern. Last, pattern 5 encompassed very solvent firms. Perhaps firms of this pattern were distressed due to a bad short-term strategic decision or due to problems with the enterpriser of the firms. Approximately 18% of the firms were under this pattern. This study has the academic and empirical contribution. In the perspectives of the academic contribution, non-audited companies that tend to be easily bankrupt and have the unstructured or easily manipulated financial data are classified by the data mining technology (Self-Organizing Map) rather than big sized audited firms that have the well prepared and reliable financial data. In the perspectives of the empirical one, even though the financial data of the non-audited firms are conducted to analyze, it is useful for find out the first order symptom of financial distress, which makes us to forecast the prediction of bankruptcy of the firms and to manage the early warning and alert signal. These are the academic and empirical contribution of this study. The limitation of this research is to analyze only 100 corporates due to the difficulty of collecting the financial data of the non-audited firms, which make us to be hard to proceed to the analysis by the category or size difference. Also, non-financial qualitative data is crucial for the analysis of bankruptcy. Thus, the non-financial qualitative factor is taken into account for the next study. This study sheds some light on the non-audited small and medium sized firms' distress prediction in the future.

Web Mining Using Fuzzy Integration of Multiple Structure Adaptive Self-Organizing Maps (다중 구조적응 자기구성지도의 퍼지결합을 이용한 웹 마이닝)

  • 김경중;조성배
    • Journal of KIISE:Software and Applications
    • /
    • v.31 no.1
    • /
    • pp.61-70
    • /
    • 2004
  • It is difficult to find an appropriate web site because exponentially growing web contains millions of web documents. Personalization of web search can be realized by recommending proper web sites using user profile but more efficient method is needed for estimating preference because user's evaluation on web contents presents many aspects of his characteristics. As user profile has a property of non-linearity, estimation by classifier is needed and combination of classifiers is necessary to anticipate diverse properties. Structure adaptive self-organizing map (SASOM) that is suitable for Pattern classification and visualization is an enhanced model of SOM and might be useful for web mining. Fuzzy integral is a combination method using classifiers' relevance that is defined subjectively. In this paper, estimation of user profile is conducted by using ensemble of SASOM's teamed independently based on fuzzy integral and evaluated by Syskill & Webert UCI benchmark data. Experimental results show that the proposed method performs better than previous naive Bayes classifier as well as voting of SASOM's.

Estimation of Inundation Area by Linking of Rainfall-Duration-Flooding Quantity Relationship Curve with Self-Organizing Map (강우량-지속시간-침수량 관계곡선과 자기조직화 지도의 연계를 통한 범람범위 추정)

  • Kim, Hyun Il;Keum, Ho Jun;Han, Kun Yeun
    • KSCE Journal of Civil and Environmental Engineering Research
    • /
    • v.38 no.6
    • /
    • pp.839-850
    • /
    • 2018
  • The flood damage in urban areas due to torrential rain is increasing with urbanization. For this reason, accurate and rapid flooding forecasting and expected inundation maps are needed. Predicting the extent of flooding for certain rainfalls is a very important issue in preparing flood in advance. Recently, government agencies are trying to provide expected inundation maps to the public. However, there is a lack of quantifying the extent of inundation caused by a particular rainfall scenario and the real-time prediction method for flood extent within a short time. Therefore the real-time prediction of flood extent is needed based on rainfall-runoff-inundation analysis. One/two dimensional model are continued to analyize drainage network, manhole overflow and inundation propagation by rainfall condition. By applying the various rainfall scenarios considering rainfall duration/distribution and return periods, the inundation volume and depth can be estimated and stored on a database. The Rainfall-Duration-Flooding Quantity (RDF) relationship curve based on the hydraulic analysis results and the Self-Organizing Map (SOM) that conducts unsupervised learning are applied to predict flooded area with particular rainfall condition. The validity of the proposed methodology was examined by comparing the results of the expected flood map with the 2-dimensional hydraulic model. Based on the result of the study, it is judged that this methodology will be useful to provide an unknown flood map according to medium-sized rainfall or frequency scenario. Furthermore, it will be used as a fundamental data for flood forecast by establishing the RDF curve which the relationship of rainfall-outflow-flood is considered and the database of expected inundation maps.

Improved Focused Sampling for Class Imbalance Problem (클래스 불균형 문제를 해결하기 위한 개선된 집중 샘플링)

  • Kim, Man-Sun;Yang, Hyung-Jeong;Kim, Soo-Hyung;Cheah, Wooi Ping
    • The KIPS Transactions:PartB
    • /
    • v.14B no.4
    • /
    • pp.287-294
    • /
    • 2007
  • Many classification algorithms for real world data suffer from a data class imbalance problem. To solve this problem, various methods have been proposed such as altering the training balance and designing better sampling strategies. The previous methods are not satisfy in the distribution of the input data and the constraint. In this paper, we propose a focused sampling method which is more superior than previous methods. To solve the problem, we must select some useful data set from all training sets. To get useful data set, the proposed method devide the region according to scores which are computed based on the distribution of SOM over the input data. The scores are sorted in ascending order. They represent the distribution or the input data, which may in turn represent the characteristics or the whole data. A new training dataset is obtained by eliminating unuseful data which are located in the region between an upper bound and a lower bound. The proposed method gives a better or at least similar performance compare to classification accuracy of previous approaches. Besides, it also gives several benefits : ratio reduction of class imbalance; size reduction of training sets; prevention of over-fitting. The proposed method has been tested with kNN classifier. An experimental result in ecoli data set shows that this method achieves the precision up to 2.27 times than the other methods.

Preliminary Ecological Assessments of Water Chemistry, Trophic Compositions, and the Ecosystem Health on Massive Constructions of Three Weirs in Geum-River Watershed

  • Ko, Dae-Geun;Choi, Ji-Woong;An, Kwang-Guk
    • Journal of Ecology and Environment
    • /
    • v.39 no.1
    • /
    • pp.61-70
    • /
    • 2016
  • Major objectives of the study were to analyze chemical and biological influences of the river ecosystem on the artificial weir construction at three regions of Sejong-Weir (Sj-W), Gongju-Weir (Gj-W), and Baekje-Weir (Bj-W) during 2008-2012. After the weir construction, the discharge volume increased up to 2.9 times, and biological oxygen demand (BOD) and electrical conductivity (EC) significantly decreased (p < 0.05). Also, the decrease of total phosphorus (TP) was also evident after the weir construction, but still hyper-eutrophic conditions, based on criteria by , were maintained. Multi-metric model of Index of Biological Integrity (IBI) showed that IBI values averaged 21.0 (range: 20-22; fair condition) in the Bwc, and 14.3 (range: 12-18; poor condition) in the Awc. The model values of IBI in Sj-W and Gj-W were significantly decreased after the weir construction. The model of Self-Organizing Map (SOM) showed that two groups (cluster I and cluster II) of Bwc and Awc were divided in the analysis based on the clustering map trained by the SOM. Principal Component Analysis (PCA) was similar to the results of the SOM analysis. Taken together, this research suggests that the weir construction on the river modified the discharge volume and the physical habitat structures along with distinct changes of some chemical water quality. These physical and chemical factors influenced the ecosystem health, measured as a model value of IBI.

A Hybrid Forecasting Framework based on Case-based Reasoning and Artificial Neural Network (사례기반 추론기법과 인공신경망을 이용한 서비스 수요예측 프레임워크)

  • Hwang, Yousub
    • Journal of Intelligence and Information Systems
    • /
    • v.18 no.4
    • /
    • pp.43-57
    • /
    • 2012
  • To enhance the competitive advantage in a constantly changing business environment, an enterprise management must make the right decision in many business activities based on both internal and external information. Thus, providing accurate information plays a prominent role in management's decision making. Intuitively, historical data can provide a feasible estimate through the forecasting models. Therefore, if the service department can estimate the service quantity for the next period, the service department can then effectively control the inventory of service related resources such as human, parts, and other facilities. In addition, the production department can make load map for improving its product quality. Therefore, obtaining an accurate service forecast most likely appears to be critical to manufacturing companies. Numerous investigations addressing this problem have generally employed statistical methods, such as regression or autoregressive and moving average simulation. However, these methods are only efficient for data with are seasonal or cyclical. If the data are influenced by the special characteristics of product, they are not feasible. In our research, we propose a forecasting framework that predicts service demand of manufacturing organization by combining Case-based reasoning (CBR) and leveraging an unsupervised artificial neural network based clustering analysis (i.e., Self-Organizing Maps; SOM). We believe that this is one of the first attempts at applying unsupervised artificial neural network-based machine-learning techniques in the service forecasting domain. Our proposed approach has several appealing features : (1) We applied CBR and SOM in a new forecasting domain such as service demand forecasting. (2) We proposed our combined approach between CBR and SOM in order to overcome limitations of traditional statistical forecasting methods and We have developed a service forecasting tool based on the proposed approach using an unsupervised artificial neural network and Case-based reasoning. In this research, we conducted an empirical study on a real digital TV manufacturer (i.e., Company A). In addition, we have empirically evaluated the proposed approach and tool using real sales and service related data from digital TV manufacturer. In our empirical experiments, we intend to explore the performance of our proposed service forecasting framework when compared to the performances predicted by other two service forecasting methods; one is traditional CBR based forecasting model and the other is the existing service forecasting model used by Company A. We ran each service forecasting 144 times; each time, input data were randomly sampled for each service forecasting framework. To evaluate accuracy of forecasting results, we used Mean Absolute Percentage Error (MAPE) as primary performance measure in our experiments. We conducted one-way ANOVA test with the 144 measurements of MAPE for three different service forecasting approaches. For example, the F-ratio of MAPE for three different service forecasting approaches is 67.25 and the p-value is 0.000. This means that the difference between the MAPE of the three different service forecasting approaches is significant at the level of 0.000. Since there is a significant difference among the different service forecasting approaches, we conducted Tukey's HSD post hoc test to determine exactly which means of MAPE are significantly different from which other ones. In terms of MAPE, Tukey's HSD post hoc test grouped the three different service forecasting approaches into three different subsets in the following order: our proposed approach > traditional CBR-based service forecasting approach > the existing forecasting approach used by Company A. Consequently, our empirical experiments show that our proposed approach outperformed the traditional CBR based forecasting model and the existing service forecasting model used by Company A. The rest of this paper is organized as follows. Section 2 provides some research background information such as summary of CBR and SOM. Section 3 presents a hybrid service forecasting framework based on Case-based Reasoning and Self-Organizing Maps, while the empirical evaluation results are summarized in Section 4. Conclusion and future research directions are finally discussed in Section 5.

Water demand forecasting at the DMA level considering sociodemographic and waterworks characteristics (사회인구통계 및 상수도시설 특성을 고려한 소블록 단위 물 수요예측 연구)

  • Saemmul Jin;Dooyong Choi;Kyoungpil Kim;Jayong Koo
    • Journal of Korean Society of Water and Wastewater
    • /
    • v.37 no.6
    • /
    • pp.363-373
    • /
    • 2023
  • Numerous studies have established a correlation between sociodemographic characteristics and water usage, identifying population as a primary independent variable in mid- to long-term demand forecasting. Recent dramatic sociodemographic changes, including urban concentration-rural depopulation, low birth rates-aging population, and the rise in single-person households, are expected to impact water demand and supply patterns. This underscores the necessity for operational and managerial changes in existing water supply systems. While sociodemographic characteristics are regularly surveyed, the conducted surveys use aggregate units that do not align with the actual system. Consequently, many water demand forecasts have been conducted at the administrative district level without adequately considering the water supply system. This study presents an upward water demand forecasting model that accurately reflects real water facilities and consumers. The model comprises three key steps. Firstly, Statistics Korea's SGIS (Statistical Geological Information System) data was reorganized at the DMA level. Secondly, DMAs were classified using the SOM (Self-Organizing Map) algorithm to consider differences in water facilities and consumer characteristics. Lastly, water demand forecasting employed the PCR (Principal Component Regression) method to address multicollinearity and overfitting issues. The performance evaluation of this model was conducted for DMAs classified as rural areas due to the insufficient number of DMAs. The estimation results indicate that the correlation coefficients exceeded 0.9, and the MAPE remained within approximately 10% for the test dataset. This method is expected to be useful for reorganization plans, such as the expansion and contraction of existing facilities.