• Title/Summary/Keyword: small school

Optimization of Multiclass Support Vector Machine using Genetic Algorithm: Application to the Prediction of Corporate Credit Rating (유전자 알고리즘을 이용한 다분류 SVM의 최적화: 기업신용등급 예측에의 응용)

  • Ahn, Hyunchul
    • Information Systems Review
    • /
    • v.16 no.3
    • /
    • pp.161-177
    • /
    • 2014
  • Corporate credit rating assessment consists of complicated processes in which various factors describing a company are taken into consideration. Such assessment is known to be very expensive since domain experts must be employed to assess the ratings. As a result, data-driven corporate credit rating prediction using statistical and artificial intelligence (AI) techniques has received considerable attention from researchers and practitioners. In particular, statistical methods such as multiple discriminant analysis (MDA) and multinomial logistic regression analysis (MLOGIT), and AI methods including case-based reasoning (CBR), artificial neural networks (ANN), and multiclass support vector machines (MSVM), have been applied to corporate credit rating. Among them, MSVM has recently become popular because of its robustness and high prediction accuracy. In this study, we propose a novel optimized MSVM model and apply it to corporate credit rating prediction in order to enhance accuracy. Our model, named 'GAMSVM (Genetic Algorithm-optimized Multiclass Support Vector Machine),' is designed to simultaneously optimize the kernel parameters and the feature subset selection. Prior studies such as Lorena and de Carvalho (2008) and Chatterjee (2013) show that proper kernel parameters may improve the performance of MSVMs, and the results of studies such as Shieh and Yang (2008) and Chatterjee (2013) imply that appropriate feature selection may lead to higher prediction accuracy. Based on these prior studies, we propose to apply GAMSVM to corporate credit rating prediction. As the tool for optimizing the kernel parameters and the feature subset selection, we adopt the genetic algorithm (GA). GA is known as an efficient and effective search method that simulates biological evolution: by applying genetic operations such as selection, crossover, and mutation, it gradually improves the search results. In particular, the mutation operator prevents GA from falling into local optima, which allows it to find a globally optimal or near-optimal solution. GA has been widely applied to search for optimal parameters or feature subsets of AI techniques including MSVM, which is another reason we adopt it. To empirically validate the usefulness of GAMSVM, we applied it to a real-world case of credit rating in Korea. Our application is in bond rating, the most frequently studied area of credit rating for specific debt issues or other financial obligations. The experimental dataset was collected from a large credit rating company in South Korea and contained 39 financial ratios of 1,295 companies in the manufacturing industry, along with their credit ratings. Using various statistical methods including one-way ANOVA and stepwise MDA, we selected 14 financial ratios as the candidate independent variables. The dependent variable, i.e., credit rating, was labeled with four classes: 1 (A1); 2 (A2); 3 (A3); 4 (B and C). For each class, 80 percent of the data was used for training and the remaining 20 percent for validation, and to compensate for the small sample size we applied five-fold cross-validation to our dataset. To examine the competitiveness of the proposed model, we also tested several comparative models including MDA, MLOGIT, CBR, ANN, and MSVM.
For MSVM, we adopted the One-Against-One (OAO) and DAGSVM (Directed Acyclic Graph SVM) approaches because they are known to be the most accurate among the various MSVM approaches. GAMSVM was implemented using LIBSVM, an open-source library, and Evolver 5.5, a commercial software package for genetic algorithms. The other comparative models were tested using various statistical and AI packages such as SPSS for Windows, Neuroshell, and Microsoft Excel VBA (Visual Basic for Applications). Experimental results showed that the proposed model, GAMSVM, outperformed all the competitive models; moreover, it used fewer independent variables while showing higher accuracy. In our experiments, five variables, X7 (total debt), X9 (sales per employee), X13 (years since founding), X15 (accumulated earnings to total assets), and X39 (an index related to cash flows from operating activities), were found to be the most important factors in predicting corporate credit ratings, and the finally selected kernel parameter values were almost the same across the data subsets. To examine whether the predictive performance of GAMSVM was significantly greater than that of the other models, we used the McNemar test. We found that GAMSVM was better than MDA, MLOGIT, CBR, and ANN at the 1% significance level, and better than OAO and DAGSVM at the 5% significance level.
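As a rough illustration of the GAMSVM idea, the sketch below evolves a population of chromosomes that jointly encode the RBF kernel parameters (C, gamma) and a binary feature-inclusion mask, scoring each chromosome by the cross-validated accuracy of a multiclass SVM. The synthetic data, GA settings, and parameter ranges are placeholders, not the authors' configuration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

rng = np.random.default_rng(0)
# stand-in for the 4-class, 14-ratio credit rating data
X, y = make_classification(n_samples=300, n_features=14, n_informative=6,
                           n_classes=4, n_clusters_per_class=1, random_state=0)

def decode(chrom):
    # genes 0-1 encode log2(C) and log2(gamma); the rest is a feature mask
    C = 2.0 ** (chrom[0] * 10 - 5)          # C in [2^-5, 2^5]
    gamma = 2.0 ** (chrom[1] * 10 - 8)      # gamma in [2^-8, 2^2]
    return C, gamma, chrom[2:] > 0.5

def fitness(chrom):
    C, gamma, mask = decode(chrom)
    if not mask.any():
        return 0.0
    clf = SVC(C=C, gamma=gamma)             # LIBSVM-style one-against-one internally
    return cross_val_score(clf, X[:, mask], y, cv=5).mean()

pop = rng.random((30, 2 + X.shape[1]))      # initial population
for gen in range(20):
    scores = np.array([fitness(c) for c in pop])
    parents = pop[np.argsort(scores)[-15:]]            # truncation selection
    i, j = rng.integers(0, 15, (2, 15))
    mix = rng.random((15, pop.shape[1])) < 0.5         # uniform crossover
    children = np.where(mix, parents[i], parents[j])
    # mutation keeps the search from stalling in local optima
    children += rng.normal(0, 0.1, children.shape) * (rng.random(children.shape) < 0.1)
    pop = np.vstack([parents, np.clip(children, 0.0, 1.0)])

C, gamma, mask = decode(pop[np.argmax([fitness(c) for c in pop])])
print(f"best C={C:.3g}, gamma={gamma:.3g}, features kept={int(mask.sum())}")
```

The truncation selection and uniform crossover here are simplifications chosen for brevity; the paper's GA (implemented with Evolver) may use different operators.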

The Effects of the Computer Aided Innovation Capabilities on the R&D Capabilities: Focusing on the SMEs of Korea (Computer Aided Innovation 역량이 연구개발역량에 미치는 효과: 국내 중소기업을 대상으로)

  • Shim, Jae Eok;Byeon, Moo Jang;Moon, Hyo Gon;Oh, Jay In
    • Asia pacific journal of information systems
    • /
    • v.23 no.3
    • /
    • pp.25-53
    • /
    • 2013
  • This study empirically analyzes the effect of Computer Aided Innovation (CAI) on improving R&D capabilities. A survey was distributed by e-mail and Google Docs to the CTOs of 235 SMEs, and 142 surveys were returned (response rate 60.4%). Responses from 119 companies (83.8%), the effective sample after excluding non-responses, insincere responses, estimated values, etc., were used for the statistical analysis. In terms of sample traits, companies with less than 50 billion KRW in sales account for 76.5% of the surveyed companies, and companies with fewer than 300 employees account for 83.2%. By business type, companies that do business with big companies (called 'partners with big companies' hereafter) account for 68.1%, while SMEs based on their own business ('independent SMEs') account for 31.9%. The status of IT system adoption was classified into partners with big companies versus independent SMEs: 18.5% versus 34.5% for ERP, 11.8% versus 9.2% for QMS, 6.7% versus 2.5% for PLM (Product Life-cycle Management), and 47.1% versus 21% for 3D CAD. IT system holding and application by independent SMEs appeared very weak compared with partners of big companies. The independent variables in this study are the CAI capability factors, IT infra and IT utilization; the dependent variables, the R&D capability factors, are organization capability, process capability, HR capability, technology-accumulating capability, and internal/external collaboration capability. The highest average value among the variables was 4.24 (organization capability 2); the lowest was 3.01, for the IT infra item on whether users can easily access and use data and information from other areas when required during new product development. The inferior IT infra environment of general SMEs seems to be reflected in CAI itself. To review the validity of the measurements, a factor analysis was conducted: seven factors with eigenvalues over 1.0 were extracted from the dependent and independent variables, together explaining 71.167% of the total variance. The reliability of each item was then checked with Cronbach's alpha; all factors had coefficients of at least 0.611, indicating acceptable reliability. Next, a correlation analysis was conducted: all five R&D capability factors (the dependent variables) showed significant correlations, at the 99% confidence level, with both IT infra and IT utilization (the independent variables). In addition, all inter-factor correlation coefficients were below 0.8, which supports the validity of the measures; the highest coefficient, 0.628, was between IT utilization and technology-accumulating capability. A regression model was used, under the hypothesis of a linear relation between the independent and dependent variables, to identify the impact of the CAI capability factors on R&D capabilities.
IT infra explains 10.3%, 7%, 11.9%, 30.9%, and 10.5% of the variance in organization capability, process capability, human resources capability, technology-accumulating capability, and collaboration capability, respectively. IT utilization likewise shows generally low explanatory power, at 12.4%, 5.9%, 11.1%, 38.9%, and 13.4% for the same five factors. However, both independent variables show relatively very high explanatory power for technology-accumulating capability. The regression equations between the independent and dependent variables are all significant (p<0.005), so the fit of the regression model appears good. All ten hypothesized coefficients in the regression models were also significant (p<0.01). The linear regression analysis thus shows that IT infra and IT utilization, the CAI capability factors, are positively related to all five dependent R&D capability factors: organization capability, process capability, human resources capability, technology-accumulating capability, and internal/external collaboration capability; both were identified as significant factors affecting R&D capability. However, when the moderating variable of business type is considered, large gaps appear compared with the full sample. First, for partners with big companies, IT infra is positively related to organization capability, process capability, human resources capability, and technology-accumulating capability, but its effect on collaboration capability is insignificant; IT utilization is positively related to organization capability, process capability, human resources capability, and internal/external collaboration capability, as in the full sample. Next, for independent SMEs, the results differ greatly from those of the full sample or the partner companies: all IT infra effects except that on technology-accumulating capability were rejected, and all IT utilization effects except those on technology-accumulating capability and collaboration capability were rejected. Summarizing these moderation results, first, for big companies and their partner companies, IT infra and IT utilization positively affect R&D capabilities, presumably because most big companies encourage innovation by requiring a certain level of IT infra building and IT utilization from their partners. Second, across all companies, IT infra and IT utilization as CAI capability positively affect at least technology-accumulating capability among the R&D capability factors. The explanatory power for most factors is low, around 10%, but for technology-accumulating capability it is considerably higher, around 25.6% to 38.4%, showing that CAI capability contributes strongly to technology accumulation.
Companies should therefore not regard IT infra and IT utilization as simple product development tools for the R&D department. Rather than only improving technology-accumulating capability within R&D, they should be used as a strategic management innovation tool that drives company-wide management innovation centered on new product development. This suggests they can be a means to improve both technology-accumulating capability in R&D and the dynamic capability needed to acquire sustainable competitive advantage.
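For readers unfamiliar with the reported statistics, here is a minimal sketch of two of them, Cronbach's alpha for scale reliability and an OLS regression of one R&D capability factor on the two CAI factors, on simulated survey data; the item names and effect sizes are invented placeholders, not the study's data.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 119  # effective sample size reported in the study

def cronbach_alpha(items: pd.DataFrame) -> float:
    # alpha = k/(k-1) * (1 - sum of item variances / variance of the total score)
    k = items.shape[1]
    return k / (k - 1) * (1 - items.var(ddof=1).sum() / items.sum(axis=1).var(ddof=1))

# hypothetical Likert-style items for an "IT infra" scale, driven by one latent factor
latent = rng.normal(0, 1, n)
infra_items = pd.DataFrame({f"infra{i}": latent + rng.normal(0, 0.8, n)
                            for i in range(4)})
print("Cronbach's alpha:", round(cronbach_alpha(infra_items), 3))

# regress a technology-accumulating capability score on the two CAI factors
df = pd.DataFrame({"it_infra": infra_items.mean(axis=1),
                   "it_util": rng.normal(3.5, 0.7, n)})
df["tech_accum"] = 0.5 * df["it_infra"] + 0.6 * df["it_util"] + rng.normal(0, 0.5, n)
fit = sm.OLS(df["tech_accum"], sm.add_constant(df[["it_infra", "it_util"]])).fit()
print(fit.summary().tables[1])  # coefficients, t-statistics, p-values
```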

Results of Hyperfractionated Radiation Therapy in Bulky Stage Ib, IIa, and IIb Uterine Cervical Cancer (종괴가 큰 병기 Ib, IIa, IIb 자궁경부암에서 다분할 방사선치료의 결과)

  • Kim, Jin-Hee;Kim, Ok-Bae
    • Radiation Oncology Journal
    • /
    • v.15 no.4
    • /
    • pp.349-356
    • /
    • 1997
  • Purpose: To evaluate the efficacy of hyperfractionated radiation therapy in carcinoma of the cervix, especially in bulky exophytic and endophytic stage Ib, IIa, and IIb disease. Materials and Methods: Forty-one patients with carcinoma of the cervix were treated with hyperfractionated radiation therapy at the Department of Therapeutic Radiology, Dongsan Hospital, Keimyung University School of Medicine, from July 1991 to April 1994. According to the FIGO staging system, there were stage Ib (3 patients) and IIa (6 patients) with exophytic (≥5 cm in diameter) or bulky endophytic masses, and stage IIb (32 patients); the median age was 55 years. Radiation therapy consisted of hyperfractionated external irradiation to the whole pelvis (120 cGy/fraction, 2 fractions/day with a minimum interval of 6 hours, 3600-5520 cGy) and boost parametrial doses (to a total of 4480-6480 cGy) with a midline shield (4×10 cm), combined with intracavitary irradiation (up to 7480-8520 cGy to point A in Ib/IIa and 8480-9980 cGy in IIb). The maximum and mean follow-up durations were 70 and 47 months, respectively. Results: The five-year local control rate was 78%, and the actuarial overall five-year survival rate was 66.1% for all patients, 44.4% for stage Ib/IIa, and 71.4% for stage IIb. In bulky IIb disease (tumor size above 5 cm, 11 patients), the five-year local control rate and five-year survival rate were 88.9% and 73%, respectively. Pelvic lymph node status (negative: 74%, positive: 25%, p=0.0015) was a significant prognostic factor for five-year survival. There was a marginally significant survival difference by total dose to point A (>84 Gy: 70% vs. ≤84 Gy: 42.8%, p=0.1). We consider the difference in total dose to point A by stage (mean, Ib/IIa: 79 Gy; IIb: 89 Gy; p=0.001) to be one cause of the worse local control and survival of Ib/IIa compared with IIb. The overall recurrence rate was 39% (16/41); the rates of local failure alone, distant failure alone, and combined local and distant failure were 9.7%, 19.5%, and 9.7%, respectively. Two patients developed grade 3 leukopenia and three patients developed grade 3 gastrointestinal complications; no complication above grade 3 was noted, and there were no treatment-related deaths. Conclusion: We think it may be necessary to increase the point A dose to more than 85 Gy in hyperfractionated radiotherapy of bulky exophytic and endophytic stage Ib/IIa disease. Hyperfractionated radiation therapy appears tolerable in bulky exophytic and endophytic stage IIb cervical carcinoma, with acceptable morbidity and a possible survival gain, but these results come from a small patient group and should be confirmed by long-term follow-up in a larger population.

Effect of Air Circulation Velocity on the Rate of Lumber Drying in a Small Compartment Wood Drying Kiln (소형 목재인공건조실에 있어서 공기순환속도가 목재건조율에 미치는 영향)

  • Chung, Byung-Jae
    • Journal of the Korean Wood Science and Technology
    • /
    • v.2 no.2
    • /
    • pp.5-7
    • /
    • 1974
  • 1. This study indicates that above the fiber saturation point the drying rate can be increased by increasing the velocity of air circulation, i.e., the drying rate of the sample boards is proportional to the air velocity; below the fiber saturation point, however, the effect of the velocity of air circulation is very small, as shown in Figs. 1 and 2. 2. Under controlled temperature and humidity in the kiln, the more moisture the sample boards contain, the higher the drying rate that can be obtained. In other words, even when drying wood of various moisture contents, approximately the same final moisture content can be secured by employing a higher velocity of air circulation. 3. This study shows that the rate of drying in the kiln changes distinctly at the fiber saturation point: above the fiber saturation point, the drying curve is concave against the X axis, but below the fiber saturation point, in the range from 30 percent to 20 percent moisture content, the curve is convex, as shown in Fig. 3. As drying progresses further, the curve becomes concave again below 20 percent moisture content. This means that the inflection point of the drying curve is located clearly at the fiber saturation point, i.e., 30 percent moisture content. As mentioned above, the 30 percent moisture content at which the inflection point appears can be recognized as a critical point, i.e., the fiber saturation point at which all free water has been removed from the wood. The existence of the inflection point indicates that the evaporation of hygroscopic water in the cell wall is more difficult than the evaporation of free water in the cell cavity and the minor spaces of the cell wall. The convex curve in the range of 30 to 20 percent moisture content means that the evaporation of capillary-condensed water proceeds at approximately the same rate, but as the moisture content approaches 20 percent, the transfusion of moisture from the wood becomes difficult because less moisture remains in the cell wall. Below 20 percent moisture content, the drying curve becomes concave again, which means that it is difficult to remove the moisture located nearer to the surface of the cellulose molecules and the surface-bound water. These relations are shown in Fig. 4. Comparing curve AC, which does not have the two inflection points, with curve BD, which has the two inflection points B and D mentioned already, the curve BD shows that the change of drying rate in the interval from 30 percent to 20 percent moisture content is not as great as in the case of curve AC over the same interval. At the inflection point of 30 percent moisture content, the change in drying rate is very conspicuous. As Fig. 3 also shows, the drying rate from green to 30 percent moisture content is very high, but the inclination of the curve is very gentle from 30 percent to 20 percent moisture content, i.e., the curve becomes almost horizontal. Acknowledgments: Gratitude is expressed to Fred E. Dickinson, Professor of Wood Technology, School of Natural Resources, University of Michigan, USA, for his suggestion to carry out this study.

Effects of firm strategies on customer acquisition of Software as a Service (SaaS) providers: A mediating and moderating role of SaaS technology maturity (SaaS 기업의 차별화 및 가격전략이 고객획득성과에 미치는 영향: SaaS 기술성숙도 수준의 매개효과 및 조절효과를 중심으로)

  • Chae, SeongWook;Park, Sungbum
    • Journal of Intelligence and Information Systems
    • /
    • v.20 no.3
    • /
    • pp.151-171
    • /
    • 2014
  • Firms today have sought management effectiveness and efficiency by utilizing information technologies (IT). Numerous firms outsource specific information systems functions to cope with their shortage of information resources or IT experts, or to reduce capital costs. Recently, Software-as-a-Service (SaaS), a new type of information system, has become one of the powerful outsourcing alternatives. SaaS is software deployed as a hosted service and accessed over the internet. It embodies the ideas of on-demand, pay-per-use, and utility computing, and is now being applied to support the core competencies of clients in areas ranging from individual productivity to vertical industries and e-commerce. In this study, therefore, we seek to quantify the value SaaS has for business performance by examining the relationships among firm strategies, SaaS technology maturity, and the business performance of SaaS providers. We begin by drawing on prior literature on SaaS, technology maturity, and firm strategy. SaaS technology maturity is classified into three phases: application service providing (ASP), Web-native application, and Web-service application. Firm strategies are operationalized as low-cost strategy and differentiation strategy. Finally, we consider customer acquisition as the business performance measure. The specific objectives of this study are as follows. First, we examine the relationships between customer acquisition performance and both the low-cost strategy and the differentiation strategy of SaaS providers. Second, we investigate the mediating and moderating effects of SaaS technology maturity on those relationships. For this purpose, the study collects data on SaaS providers and their lines of applications registered in the CNK (Commerce net Korea) database in Korea, using a questionnaire administered by a professional research institution. The unit of analysis is the SBU (strategic business unit) within the software provider, and a total of 199 SBUs are used for analyzing and testing our hypotheses. With regard to the measurement of firm strategy, we take three measurement items for differentiation strategy: application uniqueness (whether an application aims to differentiate within just one or a small number of target industries), supply channel diversification (whether the SaaS vendor has diversified supply channels), and the number of specialized experts; and two items for low-cost strategy: subscription fee and initial set-up fee. We employ hierarchical regression analysis for testing the moderation effects of SaaS technology maturity and follow Baron and Kenny's procedure for determining whether firm strategies affect customer acquisition through technology maturity. Empirical results revealed, first, that when a differentiation strategy is applied to attain business performance such as customer acquisition, the effects of the strategy are moderated by the technology maturity level of the SaaS provider. In other words, securing a higher level of SaaS technology maturity is essential for higher business performance. For instance, given that firms implement application uniqueness or distribution channel diversification as a differentiation strategy, they can acquire more customers when their level of SaaS technology maturity is higher rather than lower.
Second, the results indicate that pursuing either a differentiation strategy or a low-cost strategy effectively helps SaaS providers acquire customers: continuously differentiating their service from others, or keeping their service fees (subscription fee or initial set-up fee) low, supports their business success in terms of customer acquisition. Lastly, the results show that the level of SaaS technology maturity mediates the relationship between low-cost strategy and customer acquisition. That is, based on our research design, customers usually perceive the real value of a low subscription fee or initial set-up fee only through the SaaS service provided by the vendor, and this in turn affects their decision on whether to subscribe.
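A hedged sketch of the moderation step described above: a hierarchical regression in which the differentiation-to-customer-acquisition effect is allowed to depend on SaaS technology maturity via an interaction term. The data and variable names are simulated placeholders, not the study's 199-SBU dataset.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(2)
n = 199  # number of SBUs analyzed in the study

df = pd.DataFrame({"differentiation": rng.normal(0, 1, n),  # e.g., application uniqueness
                   "maturity": rng.normal(0, 1, n)})        # SaaS technology maturity
# simulate a positive interaction: differentiation pays off more at high maturity
df["customers"] = (0.3 * df["differentiation"] + 0.2 * df["maturity"]
                   + 0.4 * df["differentiation"] * df["maturity"]
                   + rng.normal(0, 1, n))

m1 = smf.ols("customers ~ differentiation + maturity", df).fit()   # step 1: main effects
m2 = smf.ols("customers ~ differentiation * maturity", df).fit()   # step 2: add interaction
print(f"R2 step 1 = {m1.rsquared:.3f}, step 2 = {m2.rsquared:.3f}")
print(m2.params)  # a significant interaction coefficient indicates moderation
```

An increase in R-squared from step 1 to step 2, together with a significant interaction term, is the usual hierarchical-regression evidence of moderation.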

Impact of Shortly Acquired IPO Firms on ICT Industry Concentration (ICT 산업분야 신생기업의 IPO 이후 인수합병과 산업 집중도에 관한 연구)

  • Chang, YoungBong;Kwon, YoungOk
    • Journal of Intelligence and Information Systems
    • /
    • v.26 no.3
    • /
    • pp.51-69
    • /
    • 2020
  • It is now a stylized fact that a small number of technology firms, such as Apple, Alphabet, Microsoft, Amazon, and Facebook, have become larger and dominant players in their industries. Coupled with the rise of these leading firms, we have also observed that a large number of young firms become acquisition targets in their early IPO stages. This has resulted in a sharp decline in the number of new entries on public exchanges, even though a series of policy reforms have been promulgated to foster competition through an increase in new entries. Given this industry trend, a number of studies have reported increased concentration in most developed countries; however, it is less well understood what caused the increase. In this paper, we uncover the mechanisms by which industries have become concentrated over the last decades by tracing the changes in industry concentration associated with a firm's status change in its early IPO stages, with emphasis on the case in which firms are acquired shortly after going public. Especially with the transition to digital-based economies, it is imperative for incumbent firms to adapt to and keep pace with new ICT and related intelligent systems. For instance, after acquiring a young firm equipped with AI-based solutions, an incumbent firm may respond better to changes in customer taste and preference by integrating the acquired AI solutions and analytics skills into multiple business processes. Accordingly, it is not unusual for young ICT firms to become attractive acquisition targets. To examine the role of M&As involving young firms in reshaping the level of industry concentration, we identify a firm's status in its early post-IPO stages over sample periods spanning from 1990 to 2016 as follows: i) being delisted, ii) remaining standalone, and iii) being acquired. According to our analysis, firms that have gone public since the 2000s have been acquired by incumbent firms more quickly than those that went public in earlier periods. We also show a greater acquisition rate for IPO firms in the ICT sector compared with their counterparts in other sectors. Our results based on multinomial logit models suggest that a large number of IPO firms have been acquired in their early post-IPO lives despite their financial soundness. Specifically, we show that IPO firms are more likely to be acquired than to be delisted due to financial distress in their early IPO stages when they are more profitable, more mature, or less leveraged; IPO firms with venture capital backing have also become acquisition targets more frequently. As a larger number of firms are acquired shortly after their IPO, our results show increased concentration. While providing limited evidence on the impact of large incumbent firms in explaining the change in industry concentration, our results show that the large firms' effect on concentration is pronounced in the ICT sector. This result plausibly captures the current trend in which a few tech giants such as Alphabet, Apple, and Facebook continue to increase their market share. In addition, compared with acquisitions of non-ICT firms, the concentration impact of early-stage IPO firms becomes larger when ICT firms are acquired. Our study makes new contributions: to the best of our knowledge, this is one of only a few studies that link a firm's post-IPO status to the associated changes in industry concentration.
Although some studies have addressed concentration issues, their primary focus was on market power or proprietary software. In contrast to earlier studies, we are able to uncover the mechanism by which industries have become concentrated by placing emphasis on M&As involving young IPO firms. Interestingly, the concentration impact of IPO firm acquisitions is magnified when a large incumbent firm is involved as the acquirer. This leads us to infer the underlying reasons why industries have become more concentrated in favor of large firms in recent decades. Overall, our study sheds new light on the literature by providing a plausible explanation of why industries have become concentrated.
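A minimal sketch, on simulated data, of the multinomial logit setup described above: modeling a firm's early post-IPO status (delisted, standalone, or acquired) as a function of profitability, leverage, and venture capital backing. The covariates and coefficients are illustrative assumptions, not the paper's estimates.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(3)
n = 1000

df = pd.DataFrame({"profitability": rng.normal(0, 1, n),
                   "leverage": rng.normal(0, 1, n),
                   "vc_backed": rng.integers(0, 2, n)})
# simulate: profitable, low-leverage, VC-backed firms tend to be acquired (2)
# rather than delisted (0); the baseline utility is remaining standalone (1)
u_del = -0.9 * df["profitability"] + 0.7 * df["leverage"]
u_acq = 0.8 * df["profitability"] - 0.5 * df["leverage"] + 0.6 * df["vc_backed"]
p = np.exp(np.column_stack([u_del, np.zeros(n), u_acq]))
p /= p.sum(axis=1, keepdims=True)
df["status"] = [rng.choice(3, p=row) for row in p]  # 0=delisted, 1=standalone, 2=acquired

X = sm.add_constant(df[["profitability", "leverage", "vc_backed"]])
fit = sm.MNLogit(df["status"], X).fit(disp=0)
print(fit.summary())  # coefficients are log-odds relative to the base category (0)
```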

Development of the Accident Prediction Model for Enlisted Men through an Integrated Approach to Datamining and Textmining (데이터 마이닝과 텍스트 마이닝의 통합적 접근을 통한 병사 사고예측 모델 개발)

  • Yoon, Seungjin;Kim, Suhwan;Shin, Kyungshik
    • Journal of Intelligence and Information Systems
    • /
    • v.21 no.3
    • /
    • pp.1-17
    • /
    • 2015
  • In this paper, we report what we have observed with regard to a prediction model for the military based on enlisted men's internal data (cumulative records) and external data (SNS data). This work is significant for the military's efforts to supervise enlisted men. In spite of their efforts, many commanders have failed to prevent accidents involving their subordinates. One of the important duties of officers is to take care of their subordinates and prevent unexpected accidents, but this is hard to do, so a proper method must be found. Our motivation for this paper is to make it possible to predict accidents using enlisted men's internal and external data. The biggest issue facing the military is the occurrence of accidents by enlisted men related to maladjustment and the relaxation of military discipline, and the core method of prevention is to identify problems and manage them quickly. Commanders predict accidents by interviewing their soldiers and observing their surroundings; this requires considerable time and effort, and the results differ significantly depending on the commander's capabilities. In this paper, we instead seek to predict accidents with objective data that can easily be obtained. Recently, records of enlisted men, as well as SNS communication between commanders and soldiers, have made it possible to predict and prevent accidents. This paper concerns the application of data mining to identify soldiers' interests and predict accidents using internal and external (SNS) data. We propose a combination of topic analysis and the decision tree method, conducted in two steps. First, topic analysis is conducted on the SNS of enlisted men. Second, the decision tree method is used to analyze the internal data together with the results of the first analysis. The dependent variable for these analyses is the presence of any accident. To analyze the SNS data, we require tools such as text mining and topic analysis; we used SAS Enterprise Miner 12.1, which provides a text miner module. Our approach for finding soldiers' interests is composed of three main phases: collection, topic analysis, and conversion of the topic analysis results into points for use as independent variables. In the first phase, we collect enlisted men's SNS data by commander's ID. After gathering the unstructured SNS data, the topic analysis phase extracts issues from it; for simplicity, five topics (vacation, friends, stress, training, and sports) are extracted from 20,000 articles. In the third phase, we quantify these five topics as personal points and include them among the independent variables, which also comprise 15 internal data items. We then build two decision trees: the first uses the internal data only, and the second uses the external (SNS) data as well as the internal data. We then compare the misclassification results from SAS Enterprise Miner: the first model's misclassification rate is 12.1%, while the second model's is 7.8%, i.e., the method predicts accidents with an accuracy of approximately 92%. The gap between the two models is 4.3%. Finally, we test whether the difference between them is meaningful using the McNemar test; the difference is statistically significant (p-value: 0.0003). This study has two limitations. First, the results of the experiments cannot be generalized, mainly because the experiment is limited to a small amount of enlisted men's data.
Additionally, various independent variables used in the decision tree model are treated as categorical variables instead of continuous variables, so some information is lost. In spite of extensive efforts to provide prediction models for the military, commanders' predictions are accurate only when they have sufficient data about their subordinates. Our proposed methodology can support decision-making in the military, and this study is expected to contribute to the prevention of accidents through scientific analysis of enlisted men and their proper management.
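The two-model comparison described above can be sketched as follows: one decision tree on internal records only, a second on internal records plus SNS topic scores, and a McNemar test on the paired test-set predictions. All data here is simulated, and scikit-learn/statsmodels stand in for SAS Enterprise Miner.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from statsmodels.stats.contingency_tables import mcnemar

rng = np.random.default_rng(4)
n = 2000
internal = rng.normal(0, 1, (n, 15))   # 15 internal record variables
topics = rng.normal(0, 1, (n, 5))      # 5 SNS topic scores (vacation, friends, ...)
y = (0.6 * internal[:, 0] + 0.8 * topics[:, 2] + rng.normal(0, 1, n) > 1).astype(int)

Xi_tr, Xi_te, Xb_tr, Xb_te, y_tr, y_te = train_test_split(
    internal, np.hstack([internal, topics]), y, test_size=0.3, random_state=0)

tree_int = DecisionTreeClassifier(max_depth=5, random_state=0).fit(Xi_tr, y_tr)
tree_both = DecisionTreeClassifier(max_depth=5, random_state=0).fit(Xb_tr, y_tr)
c1 = tree_int.predict(Xi_te) == y_te    # per-case correctness, internal-only model
c2 = tree_both.predict(Xb_te) == y_te   # per-case correctness, internal + SNS model

# McNemar test compares the two models' disagreements on the same cases
table = [[np.sum(c1 & c2), np.sum(c1 & ~c2)],
         [np.sum(~c1 & c2), np.sum(~c1 & ~c2)]]
print("accuracies:", c1.mean(), c2.mean())
print(mcnemar(table, exact=False))      # chi-square statistic and p-value
```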

Automatic gasometer reading system using selective optical character recognition (관심 문자열 인식 기술을 이용한 가스계량기 자동 검침 시스템)

  • Lee, Kyohyuk;Kim, Taeyeon;Kim, Wooju
    • Journal of Intelligence and Information Systems
    • /
    • v.26 no.2
    • /
    • pp.1-25
    • /
    • 2020
  • In this paper, we suggest an application system architecture which provides accurate, fast, and efficient automatic gasometer reading. The system captures a gasometer image using a mobile device camera, transmits the image to a cloud server over a private LTE network, and analyzes the image to extract the device ID and gas usage amount by selective optical character recognition based on deep learning. In general, an image contains many types of characters, and optical character recognition technology extracts all of them; but some applications need to ignore characters that are not of interest and focus only on specific types. For example, an automatic gasometer reading system only needs to extract the device ID and gas usage amount from gasometer images in order to bill users; character strings that are not of interest, such as the device type, manufacturer, manufacturing date, and specification, are not valuable to the application. Thus, the application has to analyze the region of interest and specific types of characters to extract valuable information only. We adopted CNN (Convolutional Neural Network)-based object detection and CRNN (Convolutional Recurrent Neural Network) technology for selective optical character recognition, which analyzes only the region of interest for character information extraction. We build three neural networks for the application system: the first is a convolutional neural network which detects the regions of interest containing the gas usage amount and device ID character strings; the second is another convolutional neural network which transforms the spatial information of a region of interest into spatial sequential feature vectors; and the third is a bidirectional long short-term memory network which converts the spatial sequential information into character strings through a time-series mapping from feature vectors to characters. In this system, the character strings of interest are the device ID, consisting of 12 Arabic numerals, and the gas usage amount, consisting of 4-5 Arabic numerals. All system components are implemented in the Amazon Web Services cloud with Intel Xeon E5-2686 v4 CPUs and an NVIDIA Tesla V100 GPU. The architecture adopts a master-slave processing structure for efficient, fast parallel processing, coping with about 700,000 requests per day. A mobile device captures the gasometer image and transmits it to the master process in the AWS cloud. The master process runs on the Intel Xeon CPU and pushes each reading request from a mobile device onto an input queue with a FIFO (First In First Out) structure. The slave process consists of the three deep neural networks, which conduct the character recognition, and runs on the NVIDIA GPU module. The slave process continually polls the input queue for recognition requests; when requests from the master process are present, the slave process converts the image into the device ID character string, the gas usage amount character string, and the position information of the strings, returns this information to the output queue, and switches back to polling the input queue. The master process gets the final information from the output queue and delivers it to the mobile device. We used a total of 27,120 gasometer images for training, validation, and testing of the three deep neural networks.
22,985 images were used for training and validation, and 4,135 images were used for testing. We randomly split the 22,985 images in an 8:2 ratio into training and validation sets for each training epoch. The 4,135 test images were categorized into five types (normal, noise, reflex, scale, and slant): normal data are clean images; noise means images with noise signals; reflex means images with light reflections in the gasometer region; scale means images with small object sizes due to long-distance capture; and slant means images that are not horizontally level. The final character string recognition accuracies for the device ID and gas usage amount on normal data are 0.960 and 0.864, respectively.
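A simplified sketch of the master-slave queueing structure described above: a master pushes reading requests into a FIFO input queue, worker ("slave") threads poll it, run the recognition pipeline, and place results on an output queue. The recognition step is stubbed out; in the real system it would invoke the CNN detector, CRNN, and bidirectional LSTM on the GPU.

```python
import queue
import threading

input_q = queue.Queue()    # FIFO input queue fed by the master
output_q = queue.Queue()   # results returned to the master

def recognize(image_path):
    # placeholder for: CNN region detection, then CRNN + BiLSTM decoding on GPU
    return (image_path, "123456789012", "1234")   # (image, device ID, usage)

def slave_worker():
    while True:
        img = input_q.get()            # blocking poll of the input queue
        if img is None:                # sentinel tells the worker to stop
            break
        output_q.put(recognize(img))
        input_q.task_done()

workers = [threading.Thread(target=slave_worker) for _ in range(3)]
for w in workers:
    w.start()

for i in range(5):                     # master pushes incoming reading requests
    input_q.put(f"gasometer_{i}.jpg")
input_q.join()                         # wait until every request is processed
for _ in workers:
    input_q.put(None)
for w in workers:
    w.join()

while not output_q.empty():            # master delivers results to the devices
    print(output_q.get())
```

In the described deployment the master and slaves are separate processes on CPU and GPU hosts; threads and in-memory queues are used here only to keep the sketch self-contained.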

The Prediction of Export Credit Guarantee Accident using Machine Learning (기계학습을 이용한 수출신용보증 사고예측)

  • Cho, Jaeyoung;Joo, Jihwan;Han, Ingoo
    • Journal of Intelligence and Information Systems
    • /
    • v.27 no.1
    • /
    • pp.83-102
    • /
    • 2021
  • The government recently announced various policies for developing the big-data and artificial intelligence fields, providing the public a great opportunity with respect to the disclosure of high-quality data held by public institutions. KSURE (Korea Trade Insurance Corporation) is a major public institution for financial policy in Korea and is strongly committed to backing export companies with various systems. Nevertheless, few realized business models based on big-data analyses exist so far. In this situation, this paper aims to develop a new business model for the ex-ante prediction of the likelihood of credit guarantee insurance accidents. We utilize internal data from KSURE, which supports export companies in Korea, apply machine learning models, and compare the performance of predictive models including Logistic Regression, Random Forest, XGBoost, LightGBM, and DNN (Deep Neural Network). For decades, researchers have tried to find better models for predicting bankruptcy, since ex-ante prediction is crucial for corporate managers, investors, creditors, and other stakeholders. Research on predicting financial distress or bankruptcy originated with Smith (1930), Fitzpatrick (1932), and Merwin (1942). One of the most famous models is Altman's Z-score model (Altman, 1968), based on multiple discriminant analysis, which is still widely used in both research and practice; it utilizes five key financial ratios to predict the probability of bankruptcy within the next two years. Ohlson (1980) introduced the logit model to complement some limitations of previous models, and Elmer and Borowski (1988) developed and examined a rule-based, automated system for the financial analysis of savings and loans. Since the 1980s, researchers in Korea have also examined the prediction of financial distress or bankruptcy: Kim (1987) analyzed financial ratios and developed a prediction model; Han et al. (1995, 1996, 1997, 2003, 2005, 2006) constructed prediction models using various techniques including artificial neural networks; Yang (1996) introduced multiple discriminant analysis and the logit model; and Kim and Kim (2001) utilized artificial neural network techniques for the ex-ante prediction of insolvent enterprises. Since then, many scholars have tried to predict financial distress or bankruptcy more precisely with models such as Random Forest or SVM. One major distinction of our research from previous work is that we focus on examining the predicted probability of default for each sample case, not only on the classification accuracy of each model over the entire sample. Most predictive models in this paper achieve a classification accuracy of about 70% on the entire sample; specifically, the LightGBM model shows the highest accuracy of 71.1% and the logit model the lowest at 69%. However, we find that these results are open to multiple interpretations. In the business context, more emphasis must be placed on minimizing type 2 errors, which cause more harmful operating losses for the guarantee company. Thus, we also compare classification accuracy by splitting the predicted probability of default into ten equal intervals.
Examining the classification accuracy for each interval, the logit model has the highest accuracy, 100%, for the 0-10% interval of predicted default probability, but a relatively low accuracy of 61.5% for the 90-100% interval. On the other hand, Random Forest, XGBoost, LightGBM, and DNN show more desirable results: they are highly accurate for both the 0-10% and 90-100% intervals but less accurate around the 50% interval. As for the distribution of samples across intervals, both the LightGBM and XGBoost models place a relatively large number of samples in the 0-10% and 90-100% intervals. Although the Random Forest model has an advantage in classification accuracy for a small number of cases, LightGBM or XGBoost could be the more desirable models since they classify a large number of cases into the two extreme intervals of predicted default probability, even allowing for their relatively lower classification accuracy there. Considering the importance of type 2 errors and total prediction accuracy, XGBoost and DNN show superior performance, followed by Random Forest and LightGBM, with logistic regression performing worst. Still, each predictive model has a comparative advantage under different evaluation standards; for instance, the Random Forest model shows almost 100% accuracy for samples expected to have a high probability of default. Collectively, one could construct more comprehensive ensemble models that contain multiple classification machine learning models and conduct majority voting to maximize overall performance.
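A minimal sketch of the interval-wise evaluation described above: bin the predicted default probabilities into ten equal intervals and compute the sample count and classification accuracy per bin. Simulated data and scikit-learn's gradient boosting stand in for the KSURE data and LightGBM.

```python
import numpy as np
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=5000, n_features=20, weights=[0.7], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

clf = GradientBoostingClassifier(random_state=0).fit(X_tr, y_tr)
prob = clf.predict_proba(X_te)[:, 1]          # predicted probability of default

df = pd.DataFrame({"prob": prob,
                   "correct": ((prob >= 0.5).astype(int) == y_te)})
df["interval"] = pd.cut(df["prob"], bins=np.linspace(0, 1, 11), include_lowest=True)
report = df.groupby("interval", observed=False).agg(n=("correct", "size"),
                                                    accuracy=("correct", "mean"))
print(report)  # accuracy is typically high near 0 and 1, lower around 0.5
```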

Perceptional Change of a New Product, DMB Phone

  • Kim, Ju-Young;Ko, Deok-Im
    • Journal of Global Scholars of Marketing Science
    • /
    • v.18 no.3
    • /
    • pp.59-88
    • /
    • 2008
  • Digital convergence means integration across industries, technologies, and contents; in marketing, it usually brings the creation of new types of products and services based on digital technology as digitalization progresses in electro-communication industries, including the telecommunication, home appliance, and computer industries. Digital convergence appears not only in devices such as PCs, AV appliances, and cellular phones, but also in the contents, networks, and services required for the production, modification, distribution, and re-production of information. Convergence in contents started around 1990; convergence in networks and services began as broadcasting and telecommunication integrated, and DMB (digital multimedia broadcasting), launched in May 2005, is the symbolic icon of this trend. There are both positive and negative expectations about DMB. These opposite expectations exist because DMB came not from customer needs but from technology development, so customers may have a hard time interpreting the real meaning of DMB. Time is quite critical for a high-tech product like DMB, because another product with the same function based on a different technology can replace it within a short period. If DMB is not positioned well in customers' minds quickly, other products like WiBro, IPTV, or HSDPA could replace it before it even spreads; therefore, positioning strategy is critical for the success of DMB. To build a correct positioning strategy, one needs to understand how consumers interpret DMB and how that interpretation can be changed via communication strategy. In this study, we investigate how consumers perceive a new product like DMB and how ad strategy changes consumer perception. More specifically, the paper segments consumers into sub-groups based on their DMB perceptions and compares their characteristics in order to understand how they perceive DMB, and then exposes them to different printed ads whose messages guide consumers to think of DMB in a specific way, as either a cellular phone or a personal TV. Research Question 1: Segment consumers according to their perceptions of DMB and compare the characteristics of the segments. Research Question 2: Compare perceptions of DMB after ads that induce categorization of DMB in a given direction for each segment. If one can understand and predict the direction in which consumers perceive a new product, a firm can select target customers easily. We segment consumers according to their perceptions and analyze their characteristics in order to find variables that can influence perceptions, such as prior experience, usage, or habit; marketing people can then use these variables to identify target customers and predict their perceptions. If one knows how customers' perceptions are changed by ad messages, a communication strategy can be constructed properly; in particular, information from segmented customers helps to develop an efficient ad strategy for each segment with its prior perception. The research framework consists of two measurements and one treatment, O1 X O2. The first observation collects information about consumers' perceptions and characteristics. Based on it, the paper segments consumers into two groups: one perceives DMB as similar to a cellular phone, and the other perceives DMB as similar to TV. The characteristics of the two segments are compared to find out why they perceive DMB differently. Next, we expose the subjects to two kinds of ads.
One ad describes DMB as a cellular phone and the other as a personal TV. When the two ads are shown, no account is taken of each subject's prior perception of DMB, that is, whether the subject belongs to the 'similar-to-cellular-phone' segment or the 'similar-to-TV' segment; however, we analyze the ads' effects separately for each segment. In the research design, the final observation investigates the ad effect: perception before the ad is compared with perception after the ad, with comparisons made for each segment and for each ad. For the segment that perceives DMB as similar to TV, the ad describing DMB as a cellular phone could change the prior perception, while the ad describing DMB as a personal TV could reinforce it. For data collection, subjects were selected from undergraduate students because they have basic knowledge about most digital equipment and an open attitude toward new products and media; the total number of subjects is 240. To measure perceptions of DMB, we use an indirect measurement: comparison with other similar digital products. To select the comparison products, we pre-surveyed students and finally selected the PDA, Car-TV, cellular phone, MP3 player, TV, and PSP. The quasi-experiment was conducted in several classes with the instructors' permission. After a brief introduction, prior knowledge, awareness, and usage of DMB and the other digital devices were asked, and their similarities and perceived characteristics were measured. Then the two kinds of manipulated color-printed ads were distributed, and the similarities and perceived characteristics of DMB were re-measured. Finally, purchase intention, ad attitude, a manipulation check, and demographic variables were asked, and subjects were given a small gift for participation. The stimuli are color-printed advertisements, A4 in size, produced after several pre-tests with advertising professionals and students. As a result, consumers are segmented into two subgroups based on their perceptions of DMB, using the measured similarity between DMB and the cellular phone and between DMB and TV. If a subject's first measure is less than the second, she is classified into segment A, characterized as perceiving DMB like TV; otherwise she is classified into segment B, perceiving DMB like a cellular phone. Discriminant analysis on these groups, with their usage and attitude characteristics, shows that segment A knows much about DMB and uses many digital devices, while segment B, which thinks of DMB as a cellular phone, does not know DMB well and is not familiar with other digital devices. Thus, consumers with higher knowledge perceive DMB as similar to TV, because the launch advertising for DMB led consumers to think of DMB as TV, whereas consumers with less interest in digital products do not know the DMB ads well and so think of DMB as a cellular phone. To investigate the perceptions of DMB and the other digital devices, we apply PROXSCAL, a multidimensional scaling (MDS) technique in the SPSS statistical package. In the first step, subjects are presented with the 21 pairs of the 7 digital devices and give similarity judgments on a 7-point scale; for each segment, the judgments are averaged into a similarity matrix. Second, PROXSCAL analyses of segments A and B are conducted. In the third stage, similarity judgments between DMB and the other digital devices are obtained after ad exposure.
Lastly, the similarity judgments of groups A-1, A-2, B-1, and B-2 are labeled 'after DMB' and added to the matrices made at the first stage; PROXSCAL analysis is then applied to these matrices to check the positional difference between DMB and 'after DMB'. The results show that the map of segment A, which perceives DMB as similar to TV, positions DMB closer to TV than to the cellular phone, as expected, while the map of segment B, which perceives DMB as similar to a cellular phone, positions DMB closer to the cellular phone than to TV, also as expected; the stress values and R-squares are acceptable. The post-stimulus results show that the ads bend the DMB perception toward the cellular phone when the cellular-phone-like ad is shown, and move DMB's position toward Car-TV, the more personalized product, when the TV-like ad is shown; this holds consistently for both segments A and B. Furthermore, the paper applies correspondence analysis to the same data and finds almost the same results. The paper answers its two main research questions: first, perception of a new product is formed mainly from prior experience; second, advertising is effective in changing and reinforcing perception. In addition, we extend the perception change to purchase intention: purchase intention is high when an ad reinforces the original perception, and the ad that shows DMB as TV produces the worst intention. This paper has limitations and issues to be pursued in the near future. Methodologically, the current approach cannot provide a statistical test of the perceptual change, since classical MDS models like PROXSCAL and correspondence analysis are not probability models; a new probabilistic MDS model for testing hypotheses about configurations needs to be developed. Next, the advertising messages need to be developed more rigorously from theoretical and managerial perspectives. The experimental procedure could also be improved for more realistic data collection, for example with web-based experiments, real product stimuli, multimedia presentation, or products displayed together in a simulated shop. In addition, demand and social desirability threats to internal validity could influence the results; to handle these threats, results from the model-intended advertising and other 'pseudo' advertising could be compared. Furthermore, one could try various levels of innovativeness to check whether they make any difference in the results (cf. Moon 2006). Finally, if one could create a hypothetical product that is genuinely innovative and new, it would help to start from a blank impression state and then study impression formation in a more rigorous way.
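A hedged sketch of the perceptual-mapping step: averaged similarity judgments among the seven products are converted to distances and fitted with a 2-D MDS configuration. scikit-learn's MDS stands in for SPSS PROXSCAL here, and the similarity values are invented placeholders, not the study's data.

```python
import numpy as np
from sklearn.manifold import MDS

products = ["PDA", "Car-TV", "Cellular", "MP3", "TV", "PSP", "DMB"]
rng = np.random.default_rng(5)

# symmetric 7x7 similarity matrix on a 7-point scale (placeholder values)
sim = rng.uniform(1, 7, (7, 7))
sim = (sim + sim.T) / 2
np.fill_diagonal(sim, 7.0)

dist = 7.0 - sim                       # higher similarity -> smaller distance
mds = MDS(n_components=2, dissimilarity="precomputed", random_state=0)
coords = mds.fit_transform(dist)

for name, (x1, x2) in zip(products, coords):
    print(f"{name:9s} ({x1: .2f}, {x2: .2f})")
print("stress:", round(mds.stress_, 3))  # lower stress indicates a better fit
```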
