• Title/Summary/Keyword: MC technique


Introduction of Denitrification Method for Nitrogen and Oxygen Stable Isotopes (δ15N-NO3 and δ18O-NO3) in Nitrate and Case Study for Tracing Nitrogen Source (탈질미생물을 이용한 질산성 질소의 산소 및 질소 동위원소 분석법 소개)

  • Lim, Bo-La;Kim, Min-Seob;Yoon, Suk-Hee;Park, Jaeseon;Park, Hyunwoo;Chung, Hyen-Mi;Choi, Jong-Woo
    • Korean Journal of Ecology and Environment / v.50 no.4 / pp.459-469 / 2017
  • Nitrogen (N) loading from domestic, agricultural, and industrial sources can lead to excessive growth of macrophytes or phytoplankton in aquatic environments. Many studies have used stable isotope ratios to identify anthropogenic nitrogen in aquatic systems and to study the nitrogen cycle. In this study, to evaluate the precision and accuracy of the denitrifying bacteria method (Pseudomonas chlororaphis ssp. aureofaciens, ATCC® 13985), three reference materials (IAEA-NO-3 (potassium nitrate, $KNO_3$), USGS34 (potassium nitrate, $KNO_3$), and USGS35 (sodium nitrate, $NaNO_3$)) were each analyzed five times. The measured ${\delta}^{15}N-NO_3$ and ${\delta}^{18}O-NO_3$ values were ${\delta}^{15}N$: $4.7{\pm}0.1$‰ and ${\delta}^{18}O$: $25.6{\pm}0.5$‰ for IAEA-NO-3, ${\delta}^{15}N$: $-1.8{\pm}0.1$‰ and ${\delta}^{18}O$: $-27.8{\pm}0.4$‰ for USGS34, and ${\delta}^{15}N$: $2.7{\pm}0.2$‰ and ${\delta}^{18}O$: $57.5{\pm}0.7$‰ for USGS35, which fall within the analytical uncertainties of the recommended values. We also investigated the isotope values of potential nitrogen sources (soil, synthetic fertilizer, and organic animal manure) and the temporal patterns of ${\delta}^{15}N-NO_3$ and ${\delta}^{18}O-NO_3$ values in river samples from May to December. The ${\delta}^{15}N-NO_3$ and ${\delta}^{18}O-NO_3$ values were enriched in December, suggesting that organic animal manure is likely one of the main N sources in those areas. The current study demonstrates the reliability of the denitrifying bacteria method and the usefulness of stable isotope techniques for tracing anthropogenic nitrogen sources in freshwater ecosystems.
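
The precision figures quoted above (mean ± standard deviation over five replicate analyses) can be illustrated with a minimal sketch. The function names and the replicate values below are hypothetical and are not data from the study; the delta formula is the conventional per mil definition, (R_sample / R_standard - 1) × 1000.

```python
from statistics import mean, stdev


def delta_permil(r_sample: float, r_standard: float) -> float:
    """Conventional delta notation in per mil: (R_sample / R_standard - 1) * 1000."""
    return (r_sample / r_standard - 1.0) * 1000.0


def summarize_replicates(values):
    """Return (mean, sample standard deviation) of replicate delta values."""
    return mean(values), stdev(values)


# Hypothetical replicate delta-15N values in per mil (illustration only,
# NOT measurements reported in the abstract).
replicates = [4.6, 4.7, 4.8, 4.7, 4.7]
avg, sd = summarize_replicates(replicates)
print(f"mean = {avg:.2f} per mil, sd = {sd:.2f} per mil")
```

Comparing the mean against a reference material's recommended value gives a rough view of accuracy, while the standard deviation reflects precision.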

Optimization of Multiclass Support Vector Machine using Genetic Algorithm: Application to the Prediction of Corporate Credit Rating (유전자 알고리즘을 이용한 다분류 SVM의 최적화: 기업신용등급 예측에의 응용)

  • Ahn, Hyunchul
    • Information Systems Review / v.16 no.3 / pp.161-177 / 2014
  • Corporate credit rating assessment consists of complicated processes in which various factors describing a company are taken into consideration. Such assessment is known to be very expensive since domain experts must be employed to assess the ratings. As a result, data-driven corporate credit rating prediction using statistical and artificial intelligence (AI) techniques has received considerable attention from researchers and practitioners. In particular, statistical methods such as multiple discriminant analysis (MDA) and multinomial logistic regression analysis (MLOGIT), and AI methods including case-based reasoning (CBR), artificial neural network (ANN), and multiclass support vector machine (MSVM) have been applied to corporate credit rating. Among them, MSVM has recently become popular because of its robustness and high prediction accuracy. In this study, we propose a novel optimized MSVM model and apply it to corporate credit rating prediction in order to enhance the accuracy. Our model, named 'GAMSVM (Genetic Algorithm-optimized Multiclass Support Vector Machine),' is designed to simultaneously optimize the kernel parameters and the feature subset selection. Prior studies such as Lorena and de Carvalho (2008) and Chatterjee (2013) show that proper kernel parameters may improve the performance of MSVMs. Also, the results of studies such as Shieh and Yang (2008) and Chatterjee (2013) imply that appropriate feature selection may lead to higher prediction accuracy. Based on these prior studies, we propose to apply GAMSVM to corporate credit rating prediction. As a tool for optimizing the kernel parameters and the feature subset selection, we suggest the genetic algorithm (GA). GA is known as an efficient and effective search method that attempts to simulate biological evolution. By applying genetic operations such as selection, crossover, and mutation, it is designed to gradually improve the search results. 
In particular, the mutation operator helps prevent GA from falling into local optima, so a globally optimal or near-optimal solution can be found. GA has been widely applied to search for optimal parameters or feature subsets of AI techniques including MSVM. For these reasons, we also adopt GA as an optimization tool. To empirically validate the usefulness of GAMSVM, we applied it to a real-world case of credit rating in Korea. Our application is in bond rating, which is the most frequently studied area of credit rating for specific debt issues or other financial obligations. The experimental dataset was collected from a large credit rating company in South Korea. It contained 39 financial ratios of 1,295 companies in the manufacturing industry, together with their credit ratings. Using various statistical methods including one-way ANOVA and stepwise MDA, we selected 14 financial ratios as the candidate independent variables. The dependent variable, i.e. credit rating, was labeled as four classes: 1 (A1); 2 (A2); 3 (A3); 4 (B and C). Eighty percent of the data for each class was used for training, and the remaining 20 percent was used for validation. To overcome the small sample size, we applied five-fold cross validation to our dataset. In order to examine the competitiveness of the proposed model, we also tested several comparative models including MDA, MLOGIT, CBR, ANN, and MSVM. In the case of MSVM, we adopted the One-Against-One (OAO) and DAGSVM (Directed Acyclic Graph SVM) approaches because they are known to be the most accurate among the various MSVM approaches. GAMSVM was implemented using LIBSVM, an open-source software package, and Evolver 5.5, a commercial software package that supports GA. The other comparative models were tested using various statistical and AI packages such as SPSS for Windows, Neuroshell, and Microsoft Excel VBA (Visual Basic for Applications). Experimental results showed that the proposed model, GAMSVM, outperformed all the competing models. 
In addition, the model was found to use fewer independent variables while showing higher accuracy. In our experiments, five variables, X7 (total debt), X9 (sales per employee), X13 (years after founded), X15 (accumulated earning to total asset), and X39 (the index related to the cash flows from operating activity), were found to be the most important factors in predicting corporate credit ratings. However, the values of the finally selected kernel parameters were found to be almost the same across the data subsets. To examine whether the predictive performance of GAMSVM was significantly greater than that of the other models, we used the McNemar test. As a result, we found that GAMSVM was better than MDA, MLOGIT, CBR, and ANN at the 1% significance level, and better than OAO and DAGSVM at the 5% significance level.
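
The McNemar test mentioned above compares two classifiers on the same validation cases by looking only at the cases where they disagree. The following is a minimal sketch of the continuity-corrected chi-square form of the test; the function name and the toy correctness vectors are hypothetical, and the abstract does not state which variant of the test the authors used.

```python
import math


def mcnemar_test(correct_a, correct_b):
    """Continuity-corrected McNemar test on paired correctness flags.

    correct_a[i] / correct_b[i] are True when model A / model B classified
    case i correctly. Returns (chi-square statistic, two-sided p-value).
    """
    # b: A right, B wrong; c: A wrong, B right (the discordant pairs).
    b = sum(1 for a, bb in zip(correct_a, correct_b) if a and not bb)
    c = sum(1 for a, bb in zip(correct_a, correct_b) if not a and bb)
    if b + c == 0:
        return 0.0, 1.0  # the models never disagree
    stat = (abs(b - c) - 1) ** 2 / (b + c)
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Toy example (hypothetical flags, not results from the study).
model_a = [True] * 10 + [False] * 2 + [True] * 30
model_b = [False] * 10 + [True] * 2 + [True] * 30
stat, p_value = mcnemar_test(model_a, model_b)
print(f"chi-square = {stat:.3f}, p = {p_value:.4f}")
```

Cases on which both models are right or both are wrong do not affect the statistic, which is why the test suits paired comparisons on a shared validation set.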

A Characterization of Oil Sand Reservoir and Selections of Optimal SAGD Locations Based on Stochastic Geostatistical Predictions (지구통계 기법을 이용한 오일샌드 저류층 해석 및 스팀주입중력법을 이용한 비투멘 회수 적지 선정 사전 연구)

  • Jeong, Jina;Park, Eungyu
    • Economic and Environmental Geology / v.46 no.4 / pp.313-327 / 2013
  • In this study, three-dimensional geostatistical simulations of the McMurray Formation, the largest oil sand reservoir in the Athabasca area, Canada, were performed, and the optimal site for steam assisted gravity drainage (SAGD) was selected based on the predictions. In the selection, factors related to the vertical extendibility of the steam chamber were considered as the criteria for an optimal site. For the predictions, data from 110 boreholes in the study area were analyzed in the Markovian transition probability (TP) framework, and the three-dimensional distributions of the constituent media were predicted stochastically through an existing TP-based geostatistical model. The potential of a specific medium at a position within the prediction domain was estimated from the ensemble probability based on multiple realizations. From the ensemble map, the cumulative thickness of the permeable media (i.e. breccia and sand) was analyzed, and the locations with the highest potential for SAGD applications were delineated. As a supporting criterion for an optimal SAGD site, the mean vertical extension of a unit of permeable media was also delineated through transition rate based computations. The mean vertical extension of the permeable media shows rough agreement with the cumulative thickness in its general distribution. However, the two distributions disagree distinctly at a few locations where the cumulative thickness is higher because of highly alternating juxtaposition of the permeable and the less permeable media. This observation implies that the cumulative thickness alone may not be a sufficient criterion for an optimal SAGD site and that the mean vertical extension of the permeable media needs to be considered jointly for a sound selection.
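
Two ideas in this abstract, transition probabilities between media and the ensemble probability over multiple realizations, can be sketched in a simplified discrete form. This is only an illustration: the study uses an existing TP-based geostatistical model with transition rates, whereas the sketch below simply counts transitions between successive intervals of one log. All names and sample data are hypothetical.

```python
from collections import Counter, defaultdict


def transition_probabilities(sequence):
    """Estimate one-step transition probabilities from an ordered facies log."""
    counts = defaultdict(Counter)
    for current, following in zip(sequence, sequence[1:]):
        counts[current][following] += 1
    return {
        state: {nxt: n / sum(row.values()) for nxt, n in row.items()}
        for state, row in counts.items()
    }


def ensemble_probability(realizations, target):
    """Per-cell fraction of realizations in which the cell holds `target`."""
    n = len(realizations)
    n_cells = len(realizations[0])
    return [sum(r[i] == target for r in realizations) / n for i in range(n_cells)]


# Hypothetical borehole log, listed from bottom to top.
log = ["sand", "sand", "mud", "sand", "breccia", "mud"]
tp = transition_probabilities(log)

# Hypothetical simulated realizations of a two-cell domain.
realizations = [["sand", "mud"], ["sand", "sand"], ["mud", "sand"], ["sand", "mud"]]
p_sand = ensemble_probability(realizations, "sand")
print(tp["sand"], p_sand)
```

Summing the ensemble probabilities of the permeable media along a vertical column, multiplied by the cell height, would give an expected cumulative thickness of the kind the abstract describes.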

Estimation of Residual Useful Life and Tracking of Real-time Damage Paths of Rubble-Mound Breakwaters Using Stochastic Wiener Process (추계학적 위너 확률과정을 이용한 경사제의 실시간 피해경로 추적과 잔류수명 추정)

  • Lee, Cheol-Eung
    • Journal of Korean Society of Coastal and Ocean Engineers / v.32 no.3 / pp.147-160 / 2020
  • A stochastic probabilistic model for harbor structures such as rubble-mound breakwaters has been formulated using the generalized Wiener process, considering the nonlinearity of the damage drift and its nonlinear uncertainty. With this model, the damage path can be tracked in real time and the residual useful lifetime at a given age can be analyzed properly. The formulated stochastic model can easily calculate the probability of failure over time through the probability density function of cumulative damage. In particular, the probability density functions of the residual useful lifetime of existing harbor structures can be derived, taking into account the current age, the present damage state, and the future damage process. The parameters of the stochastic model can be estimated by using the maximum likelihood method and the least squares method together. In the calibration of the stochastic model presented in this paper, the results agree very well with those of Monte Carlo simulation (MCS) in tracking the damage paths as well as in evaluating the density functions of the cumulative damage and the residual useful lifetime. The mean time to failure (MTTF) and the mean residual life (MRL) are also evaluated exactly. The stochastic probabilistic model has then been applied to a rubble-mound breakwater. The related parameters can be estimated from experimental data on the cumulative damage of armor units measured as a function of time. The theoretical results for the probability density function of cumulative damage and the probability of failure agree very well with the MCS results: the density functions of the cumulative damage shift to the right and their uncertainty increases as time elapses. Thus, the probability of failure also increases sharply with elapsed time. Finally, the behavior of the residual useful lifetime has been investigated as a function of the elapsed age. 
It is concluded for rubble-mound breakwaters that the probability density functions of the residual useful lifetime tend to have a longer tail on the right side than on the left because of the gradual increase in the cumulative damage of armor units. Therefore, the MRL decreases sharply after some age. In this paper, special attention is paid to the relationship among MTTF, MRL, and the elapsed age of the existing structure. Although the sum of the elapsed age and MRL must equal MTTF in the deterministic case, a large difference appears as the elapsed age increases, which is due to the uncertainty of the cumulative damage that will occur in the future.
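
The relationship between MTTF, MRL, and elapsed age can be illustrated with a small Monte Carlo sketch. It assumes a Wiener process with a constant (linear) drift, which is a simplification of the generalized process with nonlinear drift described in the abstract; the parameter values, function names, and failure threshold are all hypothetical.

```python
import math
import random


def simulate_failure_time(drift, sigma, threshold, dt, rng, max_time=1e4):
    """First time a discretized Wiener damage path reaches `threshold`."""
    damage, t = 0.0, 0.0
    step_sd = sigma * math.sqrt(dt)
    while damage < threshold and t < max_time:
        damage += drift * dt + rng.gauss(0.0, step_sd)
        t += dt
    return t


def simulate_failure_times(drift, sigma, threshold, dt, n_paths, seed):
    """Failure times of `n_paths` independent damage paths."""
    rng = random.Random(seed)
    return [simulate_failure_time(drift, sigma, threshold, dt, rng) for _ in range(n_paths)]


def mean_residual_life(failure_times, age):
    """Average remaining life of the paths that have survived beyond `age`."""
    residuals = [t - age for t in failure_times if t > age]
    return sum(residuals) / len(residuals) if residuals else float("nan")


# Hypothetical parameters: unit drift, threshold 10, so the theoretical MTTF
# of the linear-drift process is threshold / drift = 10.
times = simulate_failure_times(drift=1.0, sigma=0.5, threshold=10.0, dt=0.01, n_paths=500, seed=42)
mttf = sum(times) / len(times)
print(f"MTTF ~ {mttf:.2f}, age 8 + MRL(8) ~ {8.0 + mean_residual_life(times, 8.0):.2f}")
```

Because the MRL averages only over paths that survive to the given age, the sum of age and MRL cannot fall below the MTTF in this sketch, and the gap widens with age, consistent with the difference the abstract attributes to uncertainty in future damage.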

The prediction of the stock price movement after IPO using machine learning and text analysis based on TF-IDF (증권신고서의 TF-IDF 텍스트 분석과 기계학습을 이용한 공모주의 상장 이후 주가 등락 예측)

  • Yang, Suyeon;Lee, Chaerok;Won, Jonggwan;Hong, Taeho
    • Journal of Intelligence and Information Systems / v.28 no.2 / pp.237-262 / 2022
  • There has been a growing interest in IPOs (Initial Public Offerings) due to the profitable returns that IPO stocks can offer to investors. However, IPOs can be speculative investments that may involve substantial risk as well because shares tend to be volatile, and the supply of IPO shares is often highly limited. Therefore, it is crucially important that IPO investors are well informed of the issuing firms and the market before deciding whether to invest or not. Unlike institutional investors, individual investors are at a disadvantage since there are few opportunities for individuals to obtain information on the IPOs. In this regard, the purpose of this study is to provide individual investors with the information they may consider when making an IPO investment decision. This study presents a model that uses machine learning and text analysis to predict whether an IPO stock price would move up or down after the first 5 trading days. Our sample includes 691 Korean IPOs from June 2009 to December 2020. The input variables for the prediction are three tone variables created from IPO prospectuses and quantitative variables that are either firm-specific, issue-specific, or market-specific. The three prospectus tone variables indicate the percentage of positive, neutral, and negative sentences in a prospectus, respectively. We considered only the sentences in the Risk Factors section of a prospectus for the tone analysis in this study. All sentences were classified into 'positive', 'neutral', and 'negative' via text analysis using TF-IDF (Term Frequency - Inverse Document Frequency). Measuring the tone of each sentence was conducted by machine learning instead of a lexicon-based approach due to the lack of sentiment dictionaries suitable for Korean text analysis in the context of finance. 
For this reason, the training set was created by randomly selecting 10% of the sentences from each prospectus, and the sentences in the training set were labeled manually after reading each one. Then, based on the training set, a Support Vector Machine model was used to predict the tone of the sentences in the test set. Finally, the machine learning model calculated the percentages of positive, neutral, and negative sentences in each prospectus. To predict the price movement of an IPO stock, four different machine learning techniques were applied: Logistic Regression, Random Forest, Support Vector Machine, and Artificial Neural Network. According to the results, models that use the quantitative variables based on technical analysis together with the prospectus tone variables show higher accuracy than models that use only the quantitative variables. More specifically, the prediction accuracy improved by 1.45 percentage points in the Random Forest model, 4.34 percentage points in the Artificial Neural Network model, and 5.07 percentage points in the Support Vector Machine model. Among the machine learning techniques tested, the Artificial Neural Network model using both the quantitative variables and the prospectus tone variables had the highest prediction accuracy, 61.59%. The results indicate that the tone of a prospectus is a significant factor in predicting the price movement of an IPO stock. In addition, the McNemar test was used to verify whether the difference between the models was statistically significant. The model using only the quantitative variables and the model using both the quantitative variables and the prospectus tone variables were compared, and it was confirmed that the predictive performance improved significantly at the 1% significance level.
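
The TF-IDF weighting used to represent sentences before classification can be sketched as follows. This is a minimal, unsmoothed variant (term frequency times the natural log of the inverse document frequency); the abstract does not specify which TF-IDF variant or tokenizer the authors used, and the tokenized English sentences below are hypothetical stand-ins for Korean prospectus text.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per tokenized document.

    weight = (count of term in document / document length)
             * ln(number of documents / number of documents containing term)
    """
    n_docs = len(documents)
    doc_freq = Counter()
    for tokens in documents:
        doc_freq.update(set(tokens))  # count each term once per document
    weights = []
    for tokens in documents:
        counts = Counter(tokens)
        weights.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical tokenized sentences (illustration only).
sentences = [
    ["risk", "market", "risk"],
    ["market", "growth"],
    ["growth", "profit"],
]
weights = tf_idf(sentences)
print(weights[0])
```

The resulting weight vectors would then serve as features for a sentence-level tone classifier such as the Support Vector Machine described above.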

Business Application of Convolutional Neural Networks for Apparel Classification Using Runway Image (합성곱 신경망의 비지니스 응용: 런웨이 이미지를 사용한 의류 분류를 중심으로)

  • Seo, Yian;Shin, Kyung-shik
    • Journal of Intelligence and Information Systems / v.24 no.3 / pp.1-19 / 2018
  • Large amounts of data are now available for research and business sectors to extract knowledge from. This data can be unstructured, such as audio, text, and image data, and can be analyzed with deep learning methods. Deep learning is now widely used for various estimation, classification, and prediction problems. In particular, the fashion business adopts deep learning techniques for apparel recognition, apparel search and retrieval engines, and automatic product recommendation. The core model of these applications is image classification using Convolutional Neural Networks (CNN). A CNN is made up of neurons that learn parameters such as weights as inputs pass through the network to the outputs. A CNN has a layer structure that is well suited to image classification, as it comprises convolutional layers for generating feature maps, pooling layers for reducing the dimensionality of the feature maps, and fully-connected layers for classifying the extracted features. However, most classification models have been trained on online product images, which are taken under controlled conditions, such as images of the apparel itself or of a professional model wearing the apparel. Such images may not be effective for training a classification model when one wants to classify street fashion images or walking images, which are taken in uncontrolled situations and involve people's movement and unexpected poses. Therefore, we propose to train the model with a runway apparel image dataset that captures mobility. This allows the classification model to be trained with far more variable data and enhances its adaptation to diverse query images. To achieve both convergence and generalization of the model, we apply Transfer Learning to our training network. As Transfer Learning in CNN is composed of pre-training and fine-tuning stages, we divide the training step into two. 
First, we pre-train our architecture with a large-scale dataset, the ImageNet dataset, which consists of 1.2 million images in 1000 categories including animals, plants, activities, materials, instrumentations, scenes, and foods. We use GoogLeNet as our main architecture as it has achieved high accuracy with efficiency in the ImageNet Large Scale Visual Recognition Challenge (ILSVRC). Second, we fine-tune the network with our own runway image dataset. Because we could not find any existing publicly available runway image dataset, we collected the dataset from Google Image Search, obtaining 2426 images of 32 major fashion brands including Anna Molinari, Balenciaga, Balmain, Brioni, Burberry, Celine, Chanel, Chloe, Christian Dior, Cividini, Dolce and Gabbana, Emilio Pucci, Ermenegildo, Fendi, Giuliana Teso, Gucci, Issey Miyake, Kenzo, Leonard, Louis Vuitton, Marc Jacobs, Marni, Max Mara, Missoni, Moschino, Ralph Lauren, Roberto Cavalli, Sonia Rykiel, Stella McCartney, Valentino, Versace, and Yves Saint Laurent. We perform 10-fold experiments to account for the random generation of the training data, and our proposed model achieved an accuracy of 67.2% on the final test. Our research offers several advantages over previous related studies: to the best of our knowledge, no previous study has trained a network for apparel image classification on a runway image dataset. We suggest the idea of training the model with images capturing all the possible postures, which we denote as mobility, by using our own runway apparel image dataset. Moreover, by applying Transfer Learning and using the checkpoint and parameters provided by Tensorflow Slim, we could reduce the time spent on training the classification model, taking 6 minutes per experiment to train the classifier. This model can be used in many business applications where the query image can be a runway image, a product image, or a street fashion image. 
To be specific, runway query images can be used in a mobile application service during fashion week to facilitate brand search, street style query images can be classified during fashion editorial tasks to label the brand or style, and website query images can be processed by an e-commerce multi-complex service that provides item information or recommends similar items.
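
The roles of the convolutional and pooling layers described in this abstract can be illustrated with a tiny pure-Python sketch on a single-channel image. This is not the authors' GoogLeNet or Tensorflow Slim pipeline; the function names, the 5x5 input, and the 2x2 kernel are hypothetical and chosen only to show how a feature map is produced and then reduced in size.

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) to build a feature map.

    As in most CNN libraries, this is cross-correlation: the kernel is not flipped.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + di][j + dj] * kernel[di][dj] for di in range(kh) for dj in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def max_pool(feature_map, size=2):
    """Non-overlapping max pooling; rows/columns that do not fill a window are dropped."""
    out_h = len(feature_map) // size
    out_w = len(feature_map[0]) // size
    return [
        [
            max(
                feature_map[i * size + di][j * size + dj]
                for di in range(size)
                for dj in range(size)
            )
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


# Hypothetical 5x5 "image" with pixel value 5*row + column, and a 2x2 kernel.
image = [[5 * r + c for c in range(5)] for r in range(5)]
kernel = [[1, 0], [0, 1]]
feature_map = conv2d_valid(image, kernel)  # 4x4 feature map
pooled = max_pool(feature_map)             # 2x2 after pooling
print(pooled)
```

In a full network, many such kernels are learned rather than fixed, and the pooled feature maps feed the fully-connected layers that perform the final classification.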