Search | Korea Science

Export Prediction Using Separated Learning Method and Recommendation of Potential Export Countries (분리학습 모델을 이용한 수출액 예측 및 수출 유망국가 추천)

Jang, Yeongjin;Won, Jongkwan;Lee, Chaerok
- Journal of Intelligence and Information Systems / v.28 no.1 / pp.69-88 / 2022
One characteristic of South Korea's economic structure is its high dependence on exports. Thus, many businesses are closely tied to the global economy and the diplomatic situation. In addition, small and medium-sized enterprises (SMEs) specialized in exporting are struggling due to the spread of COVID-19. Therefore, this study aimed to develop a model that forecasts exports for the next year to support SMEs' export strategy and decision making. This study also proposed a strategy to recommend promising export countries for each item based on the forecasting model. We analyzed important variables used in previous studies, such as country-specific, item-specific, and macroeconomic variables, and collected those variables to train our prediction model. Next, exploratory data analysis (EDA) showed that exports, the target variable, have a highly skewed distribution. To deal with this issue and improve predictive performance, we suggest a separated learning method. In a separated learning method, the whole dataset is divided into homogeneous subgroups and a prediction algorithm is applied to each group. Thus, the characteristics of each group can be learned more precisely using different input variables and algorithms. In this study, we divided the dataset into five subgroups based on exports to decrease the skewness of the target variable. After the separation, we found that each group has different characteristics in countries and goods. For example, in Group 1, most of the exporting countries are developing countries and the majority of exported goods are low-value products such as glass and prints. On the other hand, major export countries of South Korea such as China, the USA, and Vietnam are included in Group 4 and Group 5, and most exported goods in these groups are high-value products. We then used LightGBM (LGBM) and Exponential Moving Average (EMA) for prediction.
Considering the characteristics of each group, models were built using LGBM for Groups 1 to 4 and EMA for Group 5. To evaluate the performance of the model, we compared different model structures and algorithms. The separated learning model performed best among the models compared. After the model was built, we also provided the variable importance of each group using SHAP values to add explainability to our model. Based on the prediction model, we proposed a two-stage recommendation strategy for potential export countries. In the first phase, the BCG matrix was used to find Star and Question Mark markets that are expected to grow rapidly. In the second phase, we calculated scores for each country and made recommendations according to the ranking. Using this recommendation framework, potential export countries were selected and information about those countries was presented for each item. This study has several implications. First, most preceding studies examined a specific situation or country, whereas this study uses various variables and develops a machine learning model for a wide range of countries and items. Second, to our knowledge, it is the first attempt to adopt a separated learning method for export prediction. By separating the dataset into five homogeneous subgroups, we could enhance the predictive performance of the model. A more detailed explanation of the models by group is also provided using SHAP values. Lastly, this study has several practical implications. Some platforms, including KOTRA, provide trade information, but most of them are based on past data, so it is not easy for companies to predict future trends. By utilizing the model and recommendation strategy in this research, the trade-related services of each platform can be improved so that companies, including SMEs, can fully utilize them when making export strategies and decisions.
https://doi.org/10.13088/jiis.2022.28.1.069
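
The separated learning idea in the abstract above (split the data into subgroups by export level, then apply a different predictor to each subgroup) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the equal-size rank-based grouping, and the smoothing factor `alpha` are assumptions, and any callable can stand in for the per-group model (the paper uses LGBM for Groups 1 to 4 and EMA for Group 5).

```python
from statistics import mean


def ema_forecast(series, alpha=0.5):
    """One-step-ahead forecast: the exponential moving average of the series.

    alpha is a hypothetical smoothing factor; the abstract does not state one.
    """
    ema = series[0]
    for value in series[1:]:
        ema = alpha * value + (1 - alpha) * ema
    return ema


def assign_groups(levels, n_groups=5):
    """Map each key to a group 1..n_groups by the rank of its export level.

    Equal-size rank groups are an assumption made for this sketch.
    """
    ranked = sorted(levels, key=levels.get)
    return {key: 1 + (rank * n_groups) // len(ranked)
            for rank, key in enumerate(ranked)}


def separated_forecast(history, models, n_groups=5):
    """Forecast each series with the model registered for its group."""
    levels = {key: mean(series) for key, series in history.items()}
    groups = assign_groups(levels, n_groups)
    return {key: models[groups[key]](history[key]) for key in history}
```

Here `models` is a dictionary from group number to a forecasting callable, for example `{1: mean, 2: mean, 3: mean, 4: mean, 5: ema_forecast}`, where `mean` is only a placeholder for a trained gradient-boosting model.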

Steel Plate Faults Diagnosis with S-MTS (S-MTS를 이용한 강판의 표면 결함 진단)

Kim, Joon-Young;Cha, Jae-Min;Shin, Junguk;Yeom, Choongsub
- Journal of Intelligence and Information Systems / v.23 no.1 / pp.47-67 / 2017
Steel plate faults are one of the important factors affecting the quality and price of steel plates. So far, many steelmakers have generally used visual inspection, which relies on an inspector's intuition or experience. Specifically, the inspector checks for steel plate faults by looking at the surface of the steel plates. However, the accuracy of this method is so low that it can cause judgment errors above 30%. Therefore, an accurate steel plate fault diagnosis system has been continuously required in the industry. To meet this need, this study proposed a new steel plate fault diagnosis system using Simultaneous MTS (S-MTS), an advanced Mahalanobis Taguchi System (MTS) algorithm, to classify various surface defects of steel plates. MTS has generally been used to solve binary classification problems in various fields, but it has not been used for multi-class classification due to its low accuracy. The reason is that only one Mahalanobis space is established in MTS. In contrast, S-MTS is suitable for multi-class classification because it establishes an individual Mahalanobis space for each class. 'Simultaneous' implies comparing Mahalanobis distances at the same time. The proposed steel plate fault diagnosis system was developed in four main stages. In the first stage, after the reference groups and related variables are defined, steel plate fault data are collected and used to establish an individual Mahalanobis space for each reference group and to construct the full measurement scale. In the second stage, the Mahalanobis distances of the test groups are calculated based on the established Mahalanobis spaces of the reference groups. Then, the appropriateness of the spaces is verified by examining the separability of the Mahalanobis distances. In the third stage, orthogonal arrays and the dynamic-type Signal-to-Noise (SN) ratio are applied for variable optimization.
Also, the overall SN ratio gain is derived from the SN ratio and SN ratio gain. If the derived overall SN ratio gain is negative, the variable should be removed, whereas a variable with a positive gain may be considered worth keeping. Finally, in the fourth stage, the measurement scale is reconstructed from the selected useful variables. Next, an experimental test is carried out to verify the multi-class classification ability and to obtain the classification accuracy. If the accuracy is acceptable, the diagnosis system can be used for future applications. This study also compared the accuracy of the proposed steel plate fault diagnosis system with that of other popular classification algorithms, including Decision Tree, Multi-Layer Perceptron Neural Network (MLPNN), Logistic Regression (LR), Support Vector Machine (SVM), Tree Bagger Random Forest, Grid Search (GS), Genetic Algorithm (GA), and Particle Swarm Optimization (PSO). The steel plate faults dataset used in the study was taken from the University of California at Irvine (UCI) machine learning repository. The proposed S-MTS-based diagnosis system shows a classification accuracy of 90.79%, which is 6-27% higher than MLPNN, LR, GS, GA, and PSO. Given that the accuracy of commercial systems is only about 75-80%, the proposed system has sufficient classification performance to be applied in industry. In addition, the proposed system can reduce the number of measurement sensors installed in the field because of the variable optimization process. These results show that the proposed system not only performs well in steel plate fault diagnosis but can also reduce operation and maintenance costs.
In future work, we plan to apply the proposed system in the field to validate its actual effectiveness and to improve the accuracy based on the results.
https://doi.org/10.13088/jiis.2017.23.1.047
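
The core of the multi-class step described above (one Mahalanobis space per class, then comparing the distances at the same time) can be illustrated with a small sketch. It is restricted to two features so that the covariance matrix can be inverted by hand without external libraries. The function names are hypothetical, and the variable optimization with orthogonal arrays and SN ratios is not covered.

```python
def fit_space(samples):
    """Build a Mahalanobis space (mean vector, inverse covariance) from 2-D samples."""
    n = len(samples)
    mx = sum(x for x, _ in samples) / n
    my = sum(y for _, y in samples) / n
    sxx = sum((x - mx) ** 2 for x, _ in samples) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in samples) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in samples) / (n - 1)
    det = sxx * syy - sxy * sxy  # must be non-zero for the inverse to exist
    inverse = ((syy / det, -sxy / det), (-sxy / det, sxx / det))
    return (mx, my), inverse


def mahalanobis_sq(point, space):
    """Squared Mahalanobis distance from a point to the centre of a space."""
    (mx, my), inv = space
    dx, dy = point[0] - mx, point[1] - my
    return (dx * (inv[0][0] * dx + inv[0][1] * dy)
            + dy * (inv[1][0] * dx + inv[1][1] * dy))


def classify(point, spaces):
    """Compare the distances to every class space and pick the smallest."""
    return min(spaces, key=lambda label: mahalanobis_sq(point, spaces[label]))
```

A caller would build `spaces = {label: fit_space(samples_for_label)}` from the reference groups and then call `classify(point, spaces)` for each test observation.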

A Study of Guide System for Cerebrovascular Intervention (뇌혈관 중재시술 지원 가이드 시스템에 관한 연구)

Lee, Sung-Gwon;Jeong, Chang-Won;Yoon, Kwon-Ha;Joo, Su-Chong
- Journal of Internet Computing and Services / v.17 no.1 / pp.101-107 / 2016
Due to recent advances in digital imaging technology, the development of intervention equipment has become widespread. An image-guided interventional procedure inserts a tiny catheter and a guide wire into the body, so high-quality X-ray images should be used to enhance the effectiveness and safety of the treatment. However, the accompanying increase in radiation has become a problem, and studies to improve the performance of X-ray detectors are therefore being actively pursued. Moreover, the intervention relies on angiographic imaging and 3D medical image processing as references. In this paper, we propose a guidance system to support this intervention. The system can address the limitations of existing 2D medical images of vessels affected by cerebrovascular disease, and it provides real-time tracking of the interventional catheter and guide wire along with guidance on the optimal route to the target lesion. The system consists of a medical image acquisition unit, an image processing unit, and a display device. In the experimental environment, the guide services provided by the proposed system were tested on X-ray images taken of a Brain Phantom (complete intracranial model with aneurysms, ref H+N-S-A-010). A reference image was generated by image processing based on the Laplacian algorithm, and the cerebral blood vessel model was derived by applying the volume ray casting technique to the DICOM data. The $A^*$ algorithm was used to provide a tracking path for the catheter and guide wire. Finally, the results show that the proposed system presents the location of the catheter and guide wire, and it is expected to provide a useful guide for future intervention services.
https://doi.org/10.7472/jksii.2016.17.1.101
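
The abstract above names the $A^*$ algorithm for the catheter and guide wire path. As a rough illustration of the algorithm itself, the sketch below runs A* on a small 2-D grid. The grid is a stand-in assumption (the paper works on a vessel model derived from medical images), as are the 4-connected moves, unit step cost, and Manhattan-distance heuristic.

```python
import heapq


def astar(grid, start, goal):
    """Shortest 4-connected path on a grid of 0 (free) and 1 (blocked) cells.

    Returns the list of cells from start to goal, or None if no path exists.
    """
    def heuristic(cell):
        # Manhattan distance never overestimates on a unit-cost 4-connected grid.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    open_heap = [(heuristic(start), 0, start)]
    came_from = {start: None}
    best_cost = {start: 0}
    while open_heap:
        _, cost, cell = heapq.heappop(open_heap)
        if cell == goal:
            path = []
            while cell is not None:
                path.append(cell)
                cell = came_from[cell]
            return path[::-1]
        if cost > best_cost[cell]:
            continue  # stale heap entry; a cheaper route was found already
        row, col = cell
        for nxt in ((row + 1, col), (row - 1, col), (row, col + 1), (row, col - 1)):
            nr, nc = nxt
            if not (0 <= nr < len(grid) and 0 <= nc < len(grid[0])):
                continue
            if grid[nr][nc] == 1:
                continue
            new_cost = cost + 1
            if new_cost < best_cost.get(nxt, float("inf")):
                best_cost[nxt] = new_cost
                came_from[nxt] = cell
                heapq.heappush(open_heap, (new_cost + heuristic(nxt), new_cost, nxt))
    return None
```

In a vessel-tracking setting the free cells would correspond to the vessel lumen and the goal to the target lesion; that mapping is not specified in the abstract and is left open here.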

A Hybrid SVM Classifier for Imbalanced Data Sets (불균형 데이터 집합의 분류를 위한 하이브리드 SVM 모델)

Lee, Jae Sik;Kwon, Jong Gu
- Journal of Intelligence and Information Systems / v.19 no.2 / pp.125-140 / 2013
We call a data set in which the number of records belonging to one class far exceeds the number of records belonging to the other class an 'imbalanced data set'. Most classification techniques perform poorly on imbalanced data sets. When we evaluate the performance of a classification technique, we need to measure not only 'accuracy' but also 'sensitivity' and 'specificity'. In a customer churn prediction problem, 'retention' records form the majority class and 'churn' records form the minority class. Sensitivity measures the proportion of actual retentions that are correctly identified as such. Specificity measures the proportion of churns that are correctly identified as such. The poor performance of classification techniques on imbalanced data sets is due to the low value of specificity. Many previous studies on imbalanced data sets employed the 'oversampling' technique, in which members of the minority class are sampled more than those of the majority class in order to make a relatively balanced data set. When a classification model is constructed using this oversampled balanced data set, specificity can be improved but sensitivity will decrease. In this research, we developed a hybrid model of support vector machine (SVM), artificial neural network (ANN), and decision tree that improves specificity while maintaining sensitivity. We named this hybrid model the 'hybrid SVM model.' The process of construction and prediction of our hybrid SVM model is as follows. By oversampling from the original imbalanced data set, a balanced data set is prepared. The SVM_I model and ANN_I model are constructed using the imbalanced data set, and the SVM_B model is constructed using the balanced data set. The SVM_I model is superior in sensitivity and the SVM_B model is superior in specificity. For a record on which both the SVM_I model and the SVM_B model make the same prediction, that prediction becomes the final solution.
If they make different predictions, the final solution is determined by the discrimination rules obtained by ANN and decision tree. For the records on which the SVM_I model and the SVM_B model make different predictions, a decision tree model is constructed using the ANN_I output value as input and actual retention or churn as target. We obtained the following two discrimination rules: 'IF ANN_I output value $< 0.285$, THEN Final Solution = Retention' and 'IF ANN_I output value $\geq 0.285$, THEN Final Solution = Churn.' The threshold 0.285 is the value optimized for the data used in this research. The result we present in this research is the structure or framework of our hybrid SVM model, not a specific threshold value such as 0.285. Therefore, the threshold value in the above discrimination rules can be changed to any value depending on the data. To evaluate the performance of our hybrid SVM model, we used the 'churn data set' in the UCI Machine Learning Repository, which consists of 85% retention customers and 15% churn customers. The accuracy of the hybrid SVM model is 91.08%, which is better than that of the SVM_I model or the SVM_B model. The points worth noticing here are its sensitivity, 95.02%, and specificity, 69.24%. The sensitivity of the SVM_I model is 94.65%, and the specificity of the SVM_B model is 67.00%. Therefore, the hybrid SVM model developed in this research improves the specificity of the SVM_B model while maintaining the sensitivity of the SVM_I model.
https://doi.org/10.13088/jiis.2013.19.2.125
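
The prediction step of the hybrid model described above reduces to a short decision rule, sketched below together with the abstract's definitions of sensitivity (retentions correctly identified) and specificity (churns correctly identified). The function names and the string labels are assumptions made for illustration. The default threshold 0.285 is the value reported for the paper's data and would need to be re-tuned for other data, as the abstract itself notes.

```python
def hybrid_predict(svm_i, svm_b, ann_i_output, threshold=0.285):
    """Combine the SVM_I and SVM_B predictions, falling back to the ANN_I rule."""
    if svm_i == svm_b:
        return svm_i  # both SVMs agree, so their prediction is final
    # The SVMs disagree: apply the discrimination rule on the ANN_I output.
    return "Retention" if ann_i_output < threshold else "Churn"


def sensitivity_specificity(actual, predicted):
    """Sensitivity on 'Retention' records and specificity on 'Churn' records."""
    pairs = list(zip(actual, predicted))
    retained = [p for a, p in pairs if a == "Retention"]
    churned = [p for a, p in pairs if a == "Churn"]
    sensitivity = retained.count("Retention") / len(retained)
    specificity = churned.count("Churn") / len(churned)
    return sensitivity, specificity
```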

A Study on the Improvement of Recommendation Accuracy by Using Category Association Rule Mining (카테고리 연관 규칙 마이닝을 활용한 추천 정확도 향상 기법)

Lee, Dongwon
- Journal of Intelligence and Information Systems / v.26 no.2 / pp.27-42 / 2020
Traditional companies with offline stores were unable to secure large display spaces because of cost. This limitation meant that only a limited variety of products could be displayed on the shelves, which deprived consumers of the opportunity to experience various items. Taking advantage of the virtual space of the Internet, online shopping overcomes the physical space limitations of offline shopping and can display numerous products on web pages, satisfying consumers with a variety of needs. Paradoxically, however, this can also make it difficult for consumers to compare and evaluate too many alternatives in their purchase decision-making process. To address this side effect, various kinds of purchase decision support systems have been studied, such as keyword-based item search services and recommender systems. These systems can reduce search time for items, prevent consumers from leaving while browsing, and contribute to increased sales for the seller. Among those systems, recommender systems based on association rule mining techniques can effectively detect interrelated products from transaction data such as orders. The association between products obtained by statistical analysis provides clues for predicting how interested consumers will be in another product. However, since the algorithm is based on the number of transactions, products that have not yet sold enough in the early days after launch may not be included in the list of recommendations even though they are highly likely to sell. Such missing items may not get sufficient exposure to consumers to record sufficient sales, and then fall into a vicious cycle of declining sales and omission from the recommendation list.
This outcome is inevitable when recommendations are made based on past transaction histories rather than on potential future sales. This study started with the idea that indirectly identifying this potential would help to select products worth recommending. Given that the attributes of a product affect the consumer's purchasing decisions, this study was conducted to reflect them in recommender systems. In other words, consumers who visit a product page have shown interest in the attributes of the product and would also be interested in other products with the same attributes. Under this assumption, the recommender system can use these attributes to select recommended products with a higher acceptance rate. Given that a category is one of the main attributes of a product, it can be a good indicator of not only direct associations between two items but also potential associations that have yet to be revealed. Based on this idea, the study devised a recommender system that reflects not only associations between products but also associations between categories. Through regression analysis, the two kinds of associations were combined to form a model that could predict the hit rate of a recommendation. To evaluate the performance of the proposed model, another regression model was also developed based only on associations between products. Comparative experiments were designed to resemble the environment in which products are actually recommended in online shopping malls. First, the association rules for all possible combinations of antecedent and consequent items were generated from the order data. Then, hit rates for each of the association rules were predicted by each model from the calculated support and confidence.
The comparative experiments using order data collected from an online shopping mall show that the recommendation accuracy can be improved by further reflecting not only the association between products but also categories in the recommendation of related products. The proposed model showed a 2 to 3 percent improvement in hit rates compared to the existing model. From a practical point of view, it is expected to have a positive effect on improving consumers' purchasing satisfaction and increasing sellers' sales.
https://doi.org/10.13088/jiis.2020.26.2.027
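
The support and confidence measures that the abstract above feeds into its regression models can be computed at both the product level and the category level, as in the sketch below. The function names and the `category_of` mapping are hypothetical, and the regression step that combines the two kinds of association is not reproduced here.

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    return sum(items <= set(t) for t in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Support of antecedent and consequent together, relative to the antecedent."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)


def to_categories(transactions, category_of):
    """Rewrite each transaction as the set of categories of its products."""
    return [{category_of[item] for item in t} for t in transactions]
```

A newly launched product with few orders has a weak or zero product-level confidence, but the rule between its category and another category can still be strong, which is the signal the category-level association is meant to capture.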

Improved Sentence Boundary Detection Method for Web Documents (웹 문서를 위한 개선된 문장경계인식 방법)

Lee, Chung-Hee;Jang, Myung-Gil;Seo, Young-Hoon
- Journal of KIISE:Software and Applications / v.37 no.6 / pp.455-463 / 2010
In this paper, we present an approach to sentence boundary detection for web documents that builds on statistical methods and uses rule-based correction. The proposed system uses a classification model learned offline from a training set of human-labeled web documents. Web documents contain many word-spacing errors and frequently lack the punctuation marks that indicate the end of a sentence. As sentence boundary candidates, the proposed method considers every ending eomi as well as punctuation marks. We optimize engine performance by selecting the best features, the best training data, and the best classification algorithm. For evaluation, we made two test sets: Set1, consisting of articles and blog documents, and Set2, consisting of web community documents. We use the F-measure to compare results on a large variety of tasks. When detecting only periods as sentence boundaries, our basis engine scored 96.5% on Set1 and 56.7% on Set2. We improved our basis engine by adapting the features and the boundary search algorithm. For the final evaluation, we compared our adaptation engine with our basis engine on Set2. As a result, the adaptation engine improved on the basis engine by 39.6%. These results demonstrate the effectiveness of the proposed method in sentence boundary detection.

Applying Ensemble Model for Identifying Uncertainty in the Species Distribution Models (종분포모형의 불확실성 확인을 위한 앙상블모형 적용)

Kwon, Hyuk Soo
- Journal of Korean Society for Geospatial Information Science / v.22 no.4 / pp.47-52 / 2014
Species distribution models (SDMs) have been widely applied to assess biodiversity, design reserves, manage habitats, and predict the effects of climate change. However, SDMs have seen only limited use in the public and policy sectors owing to model uncertainty. Studies on ensemble and consensus models have recently increased in order to reduce model uncertainty. This paper applied single models and a multi-model (ensemble) approach to Corylopsis coreana and compared the two. First, the models were evaluated using AUC, kappa, and TSS. TSS was the most effective measure because it made it easy to compare several models and to convert the results to binary maps. Second, both the single and ensemble models showed good performance; RF, Maxent, and GBM were rated relatively high, while GAM and SRE were rated relatively low. Third, the ensemble model tended to overestimate compared with the single models. This problem can be solved by suitable model selection and weighting through collaboration between field experts and modelers. Finally, we should identify the causes and magnitude of model uncertainty and improve data quality and modeling methods in order to apply SDMs to special decision-making support systems and conservation planning, and when we make policy decisions using SDMs, we should recognize the uncertainty and risk.
https://doi.org/10.7319/kogsis.2014.22.4.047
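
Two of the evaluation measures named above, the true skill statistic (TSS) and Cohen's kappa, can be computed from a presence/absence confusion matrix as in the following sketch. The function name and argument order are assumptions, and the formulas are the standard definitions rather than anything specific to the paper.

```python
def tss_and_kappa(tp, fn, fp, tn):
    """Return (TSS, Cohen's kappa) for a 2x2 presence/absence confusion matrix."""
    n = tp + fn + fp + tn
    sensitivity = tp / (tp + fn)  # presences predicted as presences
    specificity = tn / (tn + fp)  # absences predicted as absences
    tss = sensitivity + specificity - 1
    observed = (tp + tn) / n  # observed agreement
    # Agreement expected by chance, from the row and column totals.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    kappa = (observed - expected) / (1 - expected)
    return tss, kappa
```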

Psychological burden for legal responsibility of 119 emergency personnels (119구급대원의 법적책임에 대한 심리적 부담감)

Lim, Jae-Man;Yun, Seok-Jeong;Lim, Gwan-Su;Kang, Shin-Kap;Choi, Eun-Sook;Seo, Kyung-Hee
- The Korean Journal of Emergency Medical Services / v.13 no.1 / pp.87-96 / 2009
Purpose: To understand the mental burden of legal responsibility that rescue members bear in the performance of their jobs. Method: A questionnaire was given to rescue members serving in 2 direct-control safety centers of fire stations located in Seoul, Daejon, Incheon, Kwangju, Busan, Daegu and Ulsan. Results: 1. Asked whether they feel a mental burden of legal responsibility while performing their job on site, the rescue members responded: very burdensome, 38.0%; burdensome, 56.0%; moderate, 4.5%; not burdensome, 1.0%; no burden at all, 0.5%. 2. Asked which first-aid treatment places the greatest mental burden on them, the rescue members responded: endotracheal intubation and laryngeal mask airway (LMA), 40.4%; automatic external defibrillator, 16.3%; securing a vein and administering fluids (medicine), 10.8%; basic cardiopulmonary resuscitation, 7.2%; removing foreign matter from the body, 5.4%; stanching external bleeding and treating injuries, 5.4%; immobilizing the extremities and spine with splints, 1.8%; measuring vital signs, 1.2%; providing oxygen, 0.0%. 3. Asked whether they had experienced a legal problem or a formal complaint raised by a patient while serving as rescue members, 41.6% had experienced a complaint and 58.4% had not. Asked to rate the stress they suffered when a lawsuit or formal complaint was raised, on a scale of 0 (weak) to 10 (strong), they answered 8.8 on average. 4. Asked whether 119 rescue members bear legal responsibility when they cause damage to patients intentionally in performing their duties, the average response was 3.66 (out of 5.00). The responses differed significantly (F=2.874, p=.024) according to whether they held a license. 5. 99% answered that legal action against rescue members will increase in the future, because of the improvement of people's rights (63.1%), high expectations for the rescue system (29.5%), and non-licensed rescue members (5.1%).
Conclusion: The rescue members were found to bear a severe mental burden for advanced life support procedures that preceding research found to have a low enforcement rate, for instance endotracheal intubation, securing a vein, and use of the automatic external defibrillator. To improve the quality of rescue services in the fire service, it may be necessary to build an environment that removes the mental burden of legal responsibility from rescue members.

Design and Implementation of 3D Geospatial Open Platform Based on HTML5/WebGL Technology (HTML5/WebGL 기반 3D 공간정보 오픈플랫폼 소프트웨어 설계 및 구현)

Kim, Min Soo;Jang, In Sung
- Spatial Information Research / v.23 no.6 / pp.57-66 / 2015
Recently, the use of geospatial open platforms has steadily increased, and interest in 3D geospatial data such as terrain, buildings, and shopping malls has grown significantly. In particular, interest has grown in high-precision 3D geospatial data that closely represents real-world objects, rather than simplified 3D geospatial data. To satisfy the demand for such high-precision 3D geospatial data, various 3D geospatial open platforms have been developed and provide services on the web. However, most of these platforms have used plug-in modules to ensure fast 3D rendering performance on the web, despite many problems such as difficult installation, lack of cross-browser and cross-operating-system support, and security issues. In addition, the existing plug-in-based 3D geospatial open platforms now face a serious problem following the announced discontinuation of NPAPI support in the Chrome and Firefox browsers. In this study, we present the design and implementation of a new 3D geospatial open platform based on HTML5/WebGL technology without the use of plug-ins. The new platform can support multiple browsers such as IE, Chrome, Firefox, and Safari, and multiple OS platforms such as Windows, Linux, Mac, and mobile OSs.
https://doi.org/10.12672/ksis.2015.23.6.057

The big data method for flash flood warning (돌발홍수 예보를 위한 빅데이터 분석방법)

Park, Dain;Yoon, Sanghoo
- Journal of Digital Convergence / v.15 no.11 / pp.245-250 / 2017
A flash flood is defined as flooding caused by intense rainfall over a relatively small area that flows rapidly through rivers and valleys in a short time with no advance warning. It can therefore cause property damage and casualties. This study aims to establish a flash-flood warning system using 38 accident records reported by the National Disaster Information Center and output from a Land Surface Model (TOPLATS) between 2009 and 2012. Three variables from the Land Surface Model were used: precipitation, soil moisture, and surface runoff. The values of the three variables over the 6 hours preceding each flash flood were reduced to 3 factors through factor analysis. Decision tree, random forest, Naive Bayes, Support Vector Machine, and logistic regression models were considered as big data methods. Prediction performance was evaluated by comparing Accuracy, Kappa, TP Rate, FP Rate, and F-Measure. The best method was suggested based on a reproducibility evaluation at each point of flash flood occurrence and on predicted versus actual counts using 4 years of data.
https://doi.org/10.14400/JDC.2017.15.11.245
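
Several of the comparison measures listed above (Accuracy, TP Rate, FP Rate, and F-Measure) follow directly from the counts of a binary confusion matrix. The sketch below uses the standard definitions; the function name and the dictionary keys are illustrative assumptions, and a flash flood occurrence is taken as the positive class.

```python
def warning_metrics(tp, fn, fp, tn):
    """Standard binary-classification measures from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fn + fp + tn)
    tp_rate = tp / (tp + fn)  # share of actual floods that were flagged
    fp_rate = fp / (fp + tn)  # share of non-flood cases flagged by mistake
    precision = tp / (tp + fp)
    f_measure = 2 * precision * tp_rate / (precision + tp_rate)
    return {
        "accuracy": accuracy,
        "tp_rate": tp_rate,
        "fp_rate": fp_rate,
        "f_measure": f_measure,
    }
```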
