• Title/Summary/Keyword: Optimal School Size


Optimization of Multiclass Support Vector Machine using Genetic Algorithm: Application to the Prediction of Corporate Credit Rating (유전자 알고리즘을 이용한 다분류 SVM의 최적화: 기업신용등급 예측에의 응용)

  • Ahn, Hyunchul
    • Information Systems Review / v.16 no.3 / pp.161-177 / 2014
  • Corporate credit rating assessment consists of complicated processes in which various factors describing a company are taken into consideration. Such assessment is known to be very expensive since domain experts must be employed to assess the ratings. As a result, data-driven corporate credit rating prediction using statistical and artificial intelligence (AI) techniques has received considerable attention from researchers and practitioners. In particular, statistical methods such as multiple discriminant analysis (MDA) and multinomial logistic regression analysis (MLOGIT), and AI methods including case-based reasoning (CBR), artificial neural networks (ANN), and multiclass support vector machines (MSVM), have been applied to corporate credit rating. Among them, MSVM has recently become popular because of its robustness and high prediction accuracy. In this study, we propose a novel optimized MSVM model and apply it to corporate credit rating prediction in order to enhance prediction accuracy. Our model, named 'GAMSVM (Genetic Algorithm-optimized Multiclass Support Vector Machine),' is designed to simultaneously optimize the kernel parameters and the feature subset selection. Prior studies such as Lorena and de Carvalho (2008) and Chatterjee (2013) show that proper kernel parameters may improve the performance of MSVMs. Also, results from studies such as Shieh and Yang (2008) and Chatterjee (2013) imply that appropriate feature selection may lead to higher prediction accuracy. Based on these prior studies, we propose to apply GAMSVM to corporate credit rating prediction. As the tool for optimizing the kernel parameters and the feature subset selection, we adopt the genetic algorithm (GA). GA is known as an efficient and effective search method that simulates biological evolution. By applying genetic operations such as selection, crossover, and mutation, it is designed to gradually improve its search results. In particular, the mutation operator prevents GA from falling into local optima, so a globally optimal or near-optimal solution can be found. GA has been widely applied to search for optimal parameters or feature subsets of AI techniques including MSVM. For these reasons, we also adopt GA as the optimization tool. To empirically validate the usefulness of GAMSVM, we applied it to a real-world case of credit rating in Korea. Our application is in bond rating, which is the most frequently studied area of credit rating for specific debt issues or other financial obligations. The experimental dataset was collected from a large credit rating company in South Korea. It contained 39 financial ratios of 1,295 companies in the manufacturing industry, along with their credit ratings. Using various statistical methods including one-way ANOVA and stepwise MDA, we selected 14 financial ratios as the candidate independent variables. The dependent variable, i.e. credit rating, was labeled as four classes: 1 (A1); 2 (A2); 3 (A3); 4 (B and C). 80 percent of the data for each class was used for training, and the remaining 20 percent was used for validation. In addition, to overcome the small sample size, we applied five-fold cross validation to our dataset. In order to examine the competitiveness of the proposed model, we also experimented with several comparative models including MDA, MLOGIT, CBR, ANN, and MSVM.
In the case of MSVM, we adopted the One-Against-One (OAO) and DAGSVM (Directed Acyclic Graph SVM) approaches because they are known to be the most accurate among the various MSVM approaches. GAMSVM was implemented using LIBSVM, an open-source SVM library, and Evolver 5.5, a commercial genetic algorithm package. The other comparative models were implemented using various statistical and AI packages such as SPSS for Windows, Neuroshell, and Microsoft Excel VBA (Visual Basic for Applications). Experimental results showed that the proposed model, GAMSVM, outperformed all the comparative models. In addition, the model was found to use fewer independent variables while showing higher accuracy. In our experiments, five variables, X7 (total debt), X9 (sales per employee), X13 (years since founding), X15 (accumulated earnings to total assets), and X39 (the index related to cash flows from operating activity), were found to be the most important factors in predicting corporate credit ratings. However, the values of the finally selected kernel parameters were found to be almost the same across the data subsets. To examine whether the predictive performance of GAMSVM was significantly greater than those of the other models, we used the McNemar test. As a result, we found that GAMSVM was better than MDA, MLOGIT, CBR, and ANN at the 1% significance level, and better than OAO and DAGSVM at the 5% significance level.
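The authors implemented GAMSVM with LIBSVM and Evolver 5.5; the sketch below is not their code but a rough, hypothetical illustration of the underlying idea, in which a GA chromosome jointly encodes a binary feature mask and the RBF kernel parameters, and the fitness is the cross-validated accuracy of a one-against-one multiclass SVM (library choices here, NumPy and scikit-learn, are ours; X is assumed to be a NumPy feature matrix).

```python
# Hypothetical GAMSVM-style sketch: chromosome = [feature bits..., log10(C), log10(gamma)];
# fitness = 5-fold cross-validated accuracy of a one-against-one RBF SVM on the masked features.
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

def fitness(chrom, X, y):
    mask = chrom[:-2].astype(bool)
    if not mask.any():
        return 0.0
    C, gamma = 10 ** chrom[-2], 10 ** chrom[-1]          # decode from log scale
    clf = SVC(C=C, gamma=gamma, kernel="rbf", decision_function_shape="ovo")
    return cross_val_score(clf, X[:, mask], y, cv=5).mean()

def evolve(X, y, pop_size=30, generations=20):
    n_feat = X.shape[1]
    pop = np.hstack([rng.integers(0, 2, (pop_size, n_feat)).astype(float),
                     rng.uniform(-1, 3, (pop_size, 1)),   # log10(C)
                     rng.uniform(-4, 1, (pop_size, 1))])  # log10(gamma)
    for _ in range(generations):
        scores = np.array([fitness(ind, X, y) for ind in pop])
        parents = pop[np.argsort(scores)[-pop_size // 2:]]      # truncation selection
        children = []
        while len(children) < pop_size - len(parents):
            a, b = parents[rng.integers(len(parents), size=2)]
            cut = rng.integers(1, n_feat + 2)                    # one-point crossover
            child = np.concatenate([a[:cut], b[cut:]])
            flip = rng.random(n_feat) < 0.05                     # bit-flip mutation on the mask
            child[:n_feat][flip] = 1 - child[:n_feat][flip]
            child[-2:] += rng.normal(0, 0.1, 2)                  # Gaussian mutation on parameters
            children.append(child)
        pop = np.vstack([parents, np.array(children)])
    best = pop[np.argmax([fitness(ind, X, y) for ind in pop])]
    return best[:n_feat].astype(bool), 10 ** best[-2], 10 ** best[-1]
```

Truncation selection, one-point crossover, and bit-flip/Gaussian mutation are generic GA operators chosen for illustration; the abstract does not specify which operators or rates the paper used.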

Experimental and Numerical Analysis of Package and Solder Ball Crack Reliability using Solid Epoxy Material (Solid Epoxy를 이용한 패키지 및 솔더 크랙 신뢰성 확보를 위한 실험 및 수치해석 연구)

  • Cho, Youngmin; Choa, Sung-Hoon
    • Journal of the Microelectronics and Packaging Society / v.27 no.1 / pp.55-65 / 2020
  • The use of underfill materials in semiconductor packages is important not only for relieving stress in the package but also for improving its reliability under shock and vibration. In recent years, however, as packages have become larger and thinner, the use of underfill has shown adverse effects and can even degrade package reliability. To resolve these issues, we developed a package using a solid epoxy material as a substitute for underfill in order to improve package reliability. The developed solid epoxy was applied to an application processor package in a smartphone, and the reliability of the package was evaluated using thermal cycling reliability tests and numerical analysis. In order to find the optimal solid epoxy material and process conditions for improving reliability, the effects of various factors, such as the number of solid epoxy application points, the type of PCB pad, and different solid epoxy materials, were investigated. The reliability test results indicated that the package with solid epoxy exhibited higher reliability than that without solid epoxy. Applying solid epoxy at six locations showed higher reliability than applying it at four locations, indicating that the solid epoxy relieves stress in the package and thereby improves its reliability. For the different types of PCB pad, the NSMD (non-solder mask defined) pad showed higher reliability than the SMD (solder mask defined) pad; the NSMD pad is more advantageous in terms of thermomechanical stress reliability because its solder-pad bond area is larger. In addition, among solid epoxy materials with different thermal expansion coefficients, reliability improved further when a solid epoxy with a lower thermal expansion coefficient was used.
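As background for the last finding (the comparison is the paper's, but the relation below is a generic first-order approximation rather than one taken from it), the cyclic shear strain imposed on a solder joint at distance $L$ from the package's neutral point scales as ${\Delta}{\gamma}{\approx}L{\cdot}{\Delta}{\alpha}{\cdot}{\Delta}T/h$, where ${\Delta}{\alpha}$ is the CTE mismatch between the joined components, ${\Delta}T$ is the temperature swing of the thermal cycle, and $h$ is the joint height; reducing the effective ${\Delta}{\alpha}$, for example with a lower-CTE solid epoxy, reduces the strain accumulated per cycle and hence tends to extend thermal cycling life.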

A study on the comparison by the methods of estimating the relaxation load of SEM-pile (SEM파일의 이완하중 산정방법별 이완하중량 비교 연구)

  • Kim, Hyeong-Gyu; Park, Eun-Hyung; Cho, Kook-Hwan
    • Journal of Korean Tunnelling and Underground Space Association / v.20 no.3 / pp.543-560 / 2018
  • With the increased development of downtown underground space facilities that vertically cross under railways at shallow depth, demand for non-open-cut methods is increasing. However, most construction sites still adopt the pipe roof method, in which medium and large diameter steel pipes are pressed in to form a roof, enabling excavation of the inside space. Among the many factors that influence the loosening region and the loads that occur while pressing in steel pipes, the size of the pipe has the largest impact, and this factor may correspond to the magnitude of the load applied to the underground structure inside the steel pipe roof. The super equilibrium method (SEM) has been developed to minimize ground disturbance and loosening load, and uses small diameter pipes of approximately 114 mm instead of conventional medium and large diameter pipes. This small diameter steel pipe is called an SEM pile. After the SEM piles are pressed in and the grouting reinforcement is constructed, a crossing structure is pressed in using a hydraulic jack without ground subsidence or heaving. The SEM pile, which plays the role of timbering, is a fore-poling pile of approximately 5 m length that prevents ground collapse and supports the surface load during excavation of the toe part. The loosening region should be calculated adequately to estimate the spacing and construction length of the piles and the stiffness of the members. In this paper, we conducted a comparative analysis of calculations of the loosening load that occurs during press-in of the SEM pile in order to obtain an optimal SEM design. We analyzed the influence of the factors in the main theoretical and empirical formulas applied for calculating loosening regions, and carried out FEM analysis to determine an appropriate loosening load for the SEM pile. In order to estimate the soil loosening caused by actual SEM-pile press-in and excavation, a reduced-scale steel pipe press-in model test was conducted. Soil subsidence and soil loosening were investigated quantitatively according to the ratio of soil cover height to steel pipe diameter (H/D).
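For context on the "theoretical and empirical formulas" mentioned above (the abstract does not list which ones the paper compares), the expression most commonly cited for loosening earth pressure is Terzaghi's arching solution: for a loosened width $B_1$, soil unit weight ${\gamma}$, cohesion $c$, friction angle ${\phi}$, lateral earth pressure coefficient $K$, and cover depth $H$, the vertical loosening pressure is ${\sigma}_v=\frac{B_1({\gamma}-c/B_1)}{K\tan{\phi}}(1-e^{-K\tan{\phi}{\cdot}H/B_1})$, and the loosening load carried by the pipe or pile roof follows from integrating this pressure over the loosened width; commonly used empirical rules instead assume a fixed loosened height above the roof. Whether and how the paper applies these particular forms is not stated in the abstract.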

Effects of Dropwort Powder on the Quality of Castella (미나리가루의 첨가가 Castella의 품질에 미치는 영향)

  • Park, Sang-Jun; Lee, Kwang-Suck; An, Bye-Lyung
    • Journal of the East Asian Society of Dietary Life / v.17 no.6 / pp.834-839 / 2007
  • This study was designed to determine the optimal ratio of dropwort powder in castella by adding the powder at levels of 0, 3, 6, 9, and 12%. The properties of the castella were analyzed by specific gravity, specific volume, color determination, texture properties, and sensory evaluation. The specific gravity increased with increasing amounts of dropwort powder, whereas the specific volume decreased. For the color values, the L-value decreased as more dropwort powder was added. The castella with 9% dropwort powder had higher hardness, gumminess, and chewiness. The sensory panel perceived that the external and internal color of the castella became darker with the dropwort powder substitution and that the grain size decreased with increasing amounts of dropwort powder, while sweetness showed no significant difference. The order of overall preference was DP 9 > DP 6 > DP 12 > CON > DP 3. Therefore, substituting 9% of the wheat flour with dropwort powder is recommended in the production of castella.
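For reference, the two batter and loaf measures reported above are usually defined as follows (standard baking-test definitions; the abstract does not spell out the paper's exact procedure): specific gravity $=$ (weight of a fixed volume of batter)/(weight of an equal volume of water), and specific volume $=$ (loaf volume in mL)/(loaf weight in g). A denser batter (higher specific gravity) traps less air, which is consistent with the lower specific volume observed as more dropwort powder was added.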


An Empirical Study on the Influencing Factors for Big Data Intented Adoption: Focusing on the Strategic Value Recognition and TOE Framework (빅데이터 도입의도에 미치는 영향요인에 관한 연구: 전략적 가치인식과 TOE(Technology Organizational Environment) Framework을 중심으로)

  • Ka, Hoi-Kwang; Kim, Jin-soo
    • Asia pacific journal of information systems / v.24 no.4 / pp.443-472 / 2014
  • To survive in the globally competitive environment, enterprises should be able to solve various problems and find optimal solutions effectively. Big data is perceived as a tool for solving enterprise problems effectively and improving competitiveness through its varied problem-solving and advanced predictive capabilities. Owing to this remarkable potential, the implementation of big data systems has increased among enterprises around the world. Big data is currently called the 'crude oil' of the 21st century and is expected to provide competitive superiority. It is in the limelight because, whereas conventional IT technology has been reaching the limits of what it can deliver, big data goes beyond those limits and can be utilized to create new value, such as business optimization and new business creation, through analysis. However, because big data has often been introduced hastily, without considering the strategic value to be derived and achieved through it, organizations experience difficulties in deriving strategic value and utilizing their data. According to a survey of 1,800 IT professionals from 18 countries worldwide, the percentage of corporations utilizing big data well was only 28%, and many respondents reported difficulties in deriving strategic value and in operating big data. To introduce big data, its strategic value should be identified and environmental aspects, such as internal and external regulations and systems, should be considered, but these factors have not been well reflected. The cause of failure turns out to be that big data is introduced in response to IT trends and the surrounding environment, hastily and without the conditions for introduction being properly arranged. For a successful introduction, the strategic value obtainable through big data should be clearly understood, and systematic analysis of the environment and of applicability is very important; however, because corporations consider only the partial achievements and technological aspects obtainable through big data, successful introductions are not being made. Previous work shows that most big data research focuses on concepts, cases, and practical suggestions without empirical study. The purpose of this study is to provide a theoretically and practically useful implementation framework and strategies for big data systems by conducting a comprehensive literature review, identifying the factors that influence successful big data systems implementation, and analyzing empirical models. To do this, the factors that can affect the intention to adopt big data were derived by reviewing information systems success factors, strategic value perception factors, factors in the information systems adoption environment, and the big data literature, and a structured questionnaire was developed. The questionnaire was then administered to the people in charge of big data inside corporations, and statistical analysis was performed.
According to the statistical analysis, the strategic value perception factor and the industry's internal environmental factors positively affected the intention to adopt big data. The theoretical, practical, and policy implications derived from the results are as follows. The first theoretical implication is that this study proposes the factors that affect big data adoption intention by reviewing strategic value perception, environmental factors, and prior big data studies, and proposes variables and measurement items that were empirically analyzed and verified. The study is meaningful in that it measured the influence of each variable on adoption intention by verifying the relationships between the independent and dependent variables through a structural equation model. Second, this study defined the independent variables (strategic value perception and environment), the dependent variable (adoption intention), and the moderating variables (type of business and corporate size) for big data adoption intention, and established a theoretical basis for subsequent empirical studies in the big data field by developing measurement items with demonstrated reliability and validity. Third, by verifying the significance of the strategic value perception factors and environmental factors proposed in prior studies, this study can support future empirical research on the factors that influence big data adoption. The practical implications are as follows. First, this study establishes an empirical basis for the big data field by investigating the cause-and-effect relationship between the strategic value perception factor, the environmental factor, and adoption intention, and by proposing measurement items with demonstrated reliability and validity. Second, the finding that the strategic value perception factor positively affects big data adoption intention is meaningful in that it highlights the importance of strategic value perception. Third, the study suggests that a corporation introducing big data should do so based on a precise analysis of its industry's internal environment. Fourth, by showing that the influencing factors differ depending on the size and type of business of the corporation, this study suggests that corporate size and type of business should be considered when introducing big data. The policy implications are as follows. First, more diverse utilization of big data is needed. The strategic value of big data can be approached in various ways, in areas such as products and services, productivity, and decision making, and it can be utilized across all business fields on that basis, but the areas that major domestic corporations are considering are limited to parts of the product and service fields. Accordingly, when introducing big data, it is necessary to review utilization in detail and to design the big data system in a form that maximizes the utilization rate. Second, the study points out the burden of system introduction cost, difficulty in utilizing the system, and lack of trust in supplier corporations at the stage of big data adoption by corporations.
Since global IT corporations dominate the big data market, domestic corporations cannot help but depend on foreign corporations when introducing big data. Considering that Korea does not have global IT corporations even though it is a leading IT country, big data can be seen as an opportunity to foster world-class corporations. Accordingly, the government needs to foster star corporations through active policy support. Third, corporations lack the internal and external professional manpower needed for big data introduction and operation. In big data, how valuable insight can be derived from data matters more than the system construction itself. This requires talent equipped with academic knowledge and experience in various fields such as IT, statistics, strategy, and management, and such talent should be developed through systematic education. This study establishes a theoretical basis for empirical research in big data related fields by identifying and verifying the main variables that affect big data adoption intention, and it is expected to provide useful guidelines for corporations and policy makers who are considering big data implementation by empirically analyzing that theoretical basis.
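The abstract does not say which software the authors used for the structural equation model; purely as an illustrative sketch, with hypothetical construct and indicator names, the measurement and structural model described above could be specified with the open-source semopy package roughly as follows.

```python
# Illustrative only: hypothetical indicator names (sv1..sv3, env1..env3, int1, int2)
# and a hypothetical data file; the paper's actual items, software, and model
# specification are not given in the abstract.
import pandas as pd
import semopy

MODEL_DESC = """
strategic_value =~ sv1 + sv2 + sv3
environment     =~ env1 + env2 + env3
intention       =~ int1 + int2
intention ~ strategic_value + environment
"""

df = pd.read_csv("survey_responses.csv")    # hypothetical questionnaire responses
model = semopy.Model(MODEL_DESC)
model.fit(df)
print(model.inspect())                      # loadings, path coefficients, p-values

# Moderation by firm size / type of business (examined in the paper) could be
# approximated by fitting the model separately per subgroup and comparing paths.
```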

Preparation of Pure CO2 Standard Gas from Calcium Carbonate for Stable Isotope Analysis (탄산칼슘을 이용한 이산화탄소 안정동위원소 표준시료 제작에 대한 연구)

  • Park, Mi-Kyung; Park, Sunyoung; Kang, Dong-Jin; Li, Shanlan; Kim, Jae-Yeon; Jo, Chun Ok; Kim, Jooil; Kim, Kyung-Ryul
    • The Sea:JOURNAL OF THE KOREAN SOCIETY OF OCEANOGRAPHY / v.18 no.1 / pp.40-46 / 2013
  • The isotope ratios $^{13}C/^{12}C$ and $^{18}O/^{16}O$ of a sample in a mass spectrometer are measured relative to those of a pure $CO_2$ reference gas (i.e., the laboratory working standard). Thus, calibrating a laboratory working standard gas to the international isotope scales (Pee Dee Belemnite (PDB) for ${\delta}^{13}C$ and Vienna Standard Mean Ocean Water (V-SMOW) for ${\delta}^{18}O$) is essential for comparisons with data sets obtained by other groups on other mass spectrometers. However, one often has difficulty obtaining well-calibrated standard gases because of their long production time and high price. An additional difficulty is that fractionation processes can occur inside the gas cylinder, most likely due to the pressure drop during long-term use. Therefore, studies on the laboratory production of pure $CO_2$ isotope standard gas from stable solid calcium carbonate standard materials have been performed. In this study, we propose a method to extract pure $CO_2$ gas without isotope fractionation from a solid calcium carbonate material. The method is similar to that suggested by Coplen et al. (1983), but is better optimized, particularly for producing a large amount of pure $CO_2$ gas from calcium carbonate material. The $CaCO_3$ releases $CO_2$ in reaction with 100% pure phosphoric acid at $25^{\circ}C$ in a custom designed, evacuated reaction vessel. Here we present the optimal procedure, reaction conditions, and sample/reactant sizes for the calcium carbonate-phosphoric acid reaction, and also provide the details for extracting, purifying, and collecting the $CO_2$ gas out of the reaction vessel. The measurements of ${\delta}^{18}O$ and ${\delta}^{13}C$ of $CO_2$ were performed at Seoul National University using a stable isotope ratio mass spectrometer (VG Isotech, SIRA Series II) operated in dual-inlet mode. The overall analytical precisions for ${\delta}^{18}O$ and ${\delta}^{13}C$ were evaluated based on the standard deviations of multiple measurements on 15 separate samples of purified $CO_2$. The pure $CO_2$ samples were taken from 100-mg aliquots of a solid calcium carbonate (Solenhofen-ori $CaCO_3$) during an 8-day experimental period. The multiple measurements yielded $1{\sigma}$ precisions of ${\pm}0.01$‰ for ${\delta}^{13}C$ and ${\pm}0.05$‰ for ${\delta}^{18}O$, comparable to the internal instrumental precisions of the SIRA. Therefore, we conclude that the method proposed in this study can serve as a way to produce an accurate secondary and/or laboratory $CO_2$ standard gas. We hope this study helps resolve the difficulties in placing a laboratory working standard onto the international isotope scales and in making accurate comparisons with data sets from other groups.
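For readers unfamiliar with the notation, the ${\delta}$ values above follow the standard per-mil definition (a textbook relation rather than anything specific to this paper): ${\delta}=(R_{sample}/R_{standard}-1){\times}1000$‰, where $R$ is the $^{13}C/^{12}C$ or $^{18}O/^{16}O$ ratio and the standard is PDB for carbon and V-SMOW for oxygen. Calibrating a working gas therefore amounts to determining its ${\delta}$ values relative to these international reference materials.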

Development of Market Growth Pattern Map Based on Growth Model and Self-organizing Map Algorithm: Focusing on ICT products (자기조직화 지도를 활용한 성장모형 기반의 시장 성장패턴 지도 구축: ICT제품을 중심으로)

  • Park, Do-Hyung; Chung, Jaekwon; Chung, Yeo Jin; Lee, Dongwon
    • Journal of Intelligence and Information Systems / v.20 no.4 / pp.1-23 / 2014
  • Market forecasting aims to estimate the sales volume of a product or service sold to consumers over a specific selling period. From the perspective of the enterprise, accurate market forecasting assists in determining the timing of new product introduction and product design, and in establishing production plans and marketing strategies, enabling a more efficient decision-making process. Moreover, accurate market forecasting enables governments to organize the national budget efficiently. This study aims to generate market growth curves for ICT (information and communication technology) goods using past time series data; categorize products showing similar growth patterns; understand the markets in the industry; and forecast the future outlook of such products. This study suggests a useful and meaningful process (or methodology) to identify market growth patterns using quantitative growth models and a data mining algorithm. The study employs the following methodology. At the first stage, past time series data are collected for the target products or services of the categorized industry. The data, such as the volume of sales and domestic consumption for a specific product or service, are collected from the relevant government ministry, the National Statistical Office, and other relevant government organizations. For collected data that cannot be analyzed due to the lack of past data or the alteration of code names, data pre-processing should be performed. At the second stage, an optimal model for market forecasting should be selected. This model can vary depending on the characteristics of each categorized industry. As this study focuses on the ICT industry, in which new technologies appear more frequently and change the market structure, the Logistic, Gompertz, and Bass models are selected. A hybrid model that combines different models can also be considered. The hybrid model considered in this study estimates the size of the market potential with the Logistic and Gompertz models, and those figures are then used in the Bass model. The third stage is to evaluate which model explains the data most accurately. To do this, the parameters are estimated from the collected past time series data to generate each model's predicted values and calculate the root mean squared error (RMSE). The model that shows the lowest average RMSE value for every product type is considered the best model. At the fourth stage, based on the parameter values estimated by the best model, a market growth pattern map is constructed with the self-organizing map algorithm. A self-organizing map is trained with the market pattern parameters of all products or services as input data, and the products or services are organized onto an $N{\times}N$ map. The number of clusters increases from 2 to M, depending on the characteristics of the nodes on the map. The clusters are divided into zones, and the clusters that provide the most meaningful explanation are selected. Based on the final selection of clusters, the boundaries between the nodes are set and, ultimately, the market growth pattern map is completed. The last step is to determine the final characteristics of the clusters as well as the market growth curves. The average of the market growth pattern parameters in each cluster is taken as a representative figure.
Using this figure, a growth curve is drawn for each cluster and its characteristics are analyzed. Also, taking into consideration the product types in each cluster, their characteristics can be described qualitatively. We expect that the process and system suggested in this paper can be used as a tool for forecasting demand in the ICT and other industries.
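The second and third stages described above, fitting Logistic, Gompertz, and Bass curves and selecting by RMSE, can be sketched as follows. This is a minimal illustration with generic starting values and SciPy's curve_fit, not the authors' implementation; the fitted parameters would then be passed to the self-organizing map stage.

```python
# Sketch of the model-selection step: fit three diffusion curves to a cumulative
# sales series and keep the one with the lowest RMSE. Function names and starting
# values are illustrative, not taken from the paper.
import numpy as np
from scipy.optimize import curve_fit

def logistic(t, M, b, t0):
    return M / (1.0 + np.exp(-b * (t - t0)))

def gompertz(t, M, b, c):
    return M * np.exp(-b * np.exp(-c * t))

def bass(t, M, p, q):                    # cumulative Bass adoption curve
    e = np.exp(-(p + q) * t)
    return M * (1.0 - e) / (1.0 + (q / p) * e)

def rmse(y, yhat):
    return float(np.sqrt(np.mean((y - yhat) ** 2)))

def best_growth_model(t, y):
    candidates = {
        "logistic": (logistic, [y.max(), 0.5, t.mean()]),
        "gompertz": (gompertz, [y.max(), 2.0, 0.3]),
        "bass":     (bass,     [y.max(), 0.03, 0.4]),
    }
    results = {}
    for name, (f, p0) in candidates.items():
        try:
            params, _ = curve_fit(f, t, y, p0=p0, maxfev=10000)
            results[name] = (rmse(y, f(t, *params)), params)
        except RuntimeError:             # fit failed to converge
            continue
    return min(results.items(), key=lambda kv: kv[1][0])

# Example with a synthetic cumulative series:
# t = np.arange(1, 21, dtype=float); y = logistic(t, 1000, 0.6, 10)
# name, (err, params) = best_growth_model(t, y)
```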

Automatic Quality Evaluation with Completeness and Succinctness for Text Summarization (완전성과 간결성을 고려한 텍스트 요약 품질의 자동 평가 기법)

  • Ko, Eunjung;Kim, Namgyu
    • Journal of Intelligence and Information Systems
    • /
    • v.24 no.2
    • /
    • pp.125-148
    • /
    • 2018
  • Recently, as the demand for big data analysis has increased, cases of analyzing unstructured data and using the results are also increasing. Among the various types of unstructured data, text is used as a means of communicating information in almost all fields. In addition, text draws the interest of many analysts because the amount of data is very large and it is relatively easy to collect compared to other unstructured and structured data. Among the various text analysis applications, document classification, which classifies documents into predetermined categories; topic modeling, which extracts major topics from a large number of documents; sentiment analysis or opinion mining, which identifies emotions or opinions contained in texts; and text summarization, which summarizes the main content of one or several documents, have been actively studied. In particular, text summarization is actively applied in business through news summary services, privacy policy summary services, etc. In addition, much research has been done in academia on the extractive approach, which selectively provides the main elements of a document, and the abstractive approach, which extracts elements of a document and composes new sentences by combining them. However, techniques for evaluating the quality of automatically summarized documents have not progressed as much as automatic text summarization itself. Most existing studies dealing with the quality evaluation of summarization carry out manual summarization of documents, use them as reference documents, and measure the similarity between the automatic summary and the reference document. Specifically, automatic summarization is performed on the full text using various techniques, and a comparison with the reference document, which is regarded as an ideal summary, is performed to measure the quality of the automatic summarization. Reference documents are provided in two major ways; the most common is manual summarization, in which a person creates an ideal summary by hand. Since this method requires human intervention in preparing the summary, it takes a lot of time and cost, and there is the limitation that the evaluation result may differ depending on the subjectivity of the summarizer. Therefore, in order to overcome these limitations, attempts have been made to measure the quality of summary documents without human intervention. As a representative attempt, a method has recently been devised that reduces the size of the full text and measures the similarity between the reduced full text and the automatic summary. In this method, the more the frequent terms of the full text appear in the summary, the better the quality of the summary is judged to be. However, since summarization essentially means condensing a large amount of content while minimizing content omission, it is unreasonable to say that a "good summary" based only on term frequency is always a good summary in this essential sense. In order to overcome the limitations of these previous studies of summarization evaluation, this study proposes an automatic quality evaluation method for text summarization based on the essential meaning of summarization. Specifically, succinctness is defined as an element indicating how little duplicated content there is among the sentences of the summary, and completeness is defined as an element indicating how little of the original content is missing from the summary.
In this paper, we propose a method for the automatic quality evaluation of text summarization based on the concepts of succinctness and completeness. In order to evaluate the practical applicability of the proposed methodology, 29,671 sentences were extracted from TripAdvisor hotel reviews, the reviews for each hotel were summarized, and the quality of the resulting summaries was evaluated in accordance with the proposed methodology. We also provide a way to integrate completeness and succinctness, which are in a trade-off relationship, into an F-score, and propose a method to perform optimal summarization by varying the sentence similarity threshold.
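As one possible operationalization of the definitions above, not necessarily the paper's exact formulation, completeness can be measured as the share of source sentences covered by at least one summary sentence above a similarity threshold, succinctness as one minus the share of summary sentences that near-duplicate another summary sentence, and the two combined as a harmonic mean; a minimal sketch using TF-IDF cosine similarity (our choice of sentence representation) follows.

```python
# Illustrative sketch of completeness / succinctness / F-score for a summary,
# under the assumptions stated in the lead-in; thresholds and representation
# are hypothetical choices, not the paper's.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def summary_quality(source_sents, summary_sents, threshold=0.3):
    vec = TfidfVectorizer()
    mat = vec.fit_transform(source_sents + summary_sents)
    src, summ = mat[:len(source_sents)], mat[len(source_sents):]

    # completeness: how little of the source content is left out of the summary
    cover = cosine_similarity(src, summ).max(axis=1)
    completeness = float((cover >= threshold).mean())

    # succinctness: how little the summary repeats itself
    within = cosine_similarity(summ)
    n = within.shape[0]
    duplicated = sum(
        any(within[i, j] >= threshold for j in range(n) if j != i) for i in range(n)
    )
    succinctness = (1.0 - duplicated / n) if n else 0.0

    # harmonic-mean combination of the two, analogous to an F-score
    f_score = (2 * completeness * succinctness / (completeness + succinctness)
               if completeness + succinctness else 0.0)
    return completeness, succinctness, f_score
```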