• Title/Summary/Keyword: weight decision

Search Result 520, Processing Time 0.034 seconds

Water Balance Projection Using Climate Change Scenarios in the Korean Peninsula (기후변화 시나리오를 활용한 미래 한반도 물수급 전망)

  • Kim, Cho-Rong;Kim, Young-Oh;Seo, Seung Beom;Choi, Su-Woong
    • Journal of Korea Water Resources Association
    • /
    • v.46 no.8
    • /
    • pp.807-819
    • /
    • 2013
  • This study proposes a new methodology for future water balance projection considering climate change by assigning a weight to each scenario instead of inputting future streamflows based on GCMs into a water balance model directly. K-nearest neighbor algorithm was employed to assign weights and streamflows in non-flood period (October to the following June) was selected as the criterion for assigning weights. GCM-driven precipitation was input to TANK model to simulate future streamflow scenarios and Quantile Mapping was applied to correct bias between GCM hindcast and historical data. Based on these bias-corrected streamflows, different weights were assigned to each streamflow scenarios to calculate water shortage for the projection periods; 2020s (2010~2039), 2050s (2040~2069), and 2080s (2070~2099). As a result by applying the proposed methodology to project water shortage over the Korean Peninsula, average water shortage for 2020s is projected to increase to 10~32% comparing to the basis (1967~2003). In addition, according to getting decreased in streamflows in non-flood period gradually by 2080s, average water shortage for 2080s is projected to increase up to 97% (516.5 million $m^3/yr$) as maximum comparing to the basis. While the existing research on climate change gives radical increase in future water shortage, the results projected by the weighting method shows conservative change. This study has significance in the applicability of water balance projection regarding climate change, keeping the existing framework of national water resources planning and this lessens the confusion for decision-makers in water sectors.

Application of diversity of recommender system accordingtouserpreferencechange (사용자 선호도 변화에 따른 추천시스템의 다양성 적용)

  • Na, Hyeyeon;Nam, Kihwan
    • Journal of Intelligence and Information Systems
    • /
    • v.26 no.4
    • /
    • pp.67-86
    • /
    • 2020
  • Recommender Systems have been huge influence users and business more and more. Recently the importance of E-commerce has been reached rapid growth greatly in world-wide COVID-19 pandemic. Recommender system is the center of E-commerce lively. Top ranked E-commerce managers mentioned that recommender systems have a major influence on customer's purchase such as about 50% of Netflix, Amazon sales from their recommender systems. Most algorithms have been focused on improving accuracy of recommender system regardless of novelty, diversity, serendipity etc. Recommender systems with only high accuracy cannot satisfy business long-term profit because of generating sales polarization. In addition, customers do not experience enjoyment of shopping from only focusing accuracy recommender system because customer's preference is changed constantly. Therefore, recommender systems with various values need to be developed for user's high satisfaction. Reranking is the most useful methodology to realize diversity of recommender system. In this paper, diversity of recommender system is represented through constructing high similarity with users who have different preference using each user's purchased item's category algorithm. It is distinguished from past research approach which is changing the algorithm of recommender system without user's diversity preference level. We tried to discover user's diversity preference level and observed the results how the effect was different according to user's diversity preference level. In addition, graph-based recommender system was used to show diversity through user's network, not collaborative filtering. In this paper, Amazon Grocery and Gourmet Food data was used because the low-involvement product, such as habitual product, foods, low-priced goods etc., had high probability to show customer's diversity. First, a bipartite graph with users and items simultaneously is constructed to make graph-based recommender system. However, each users and items unipartite graph also need to be established to show diversity of recommender system. The weight of each unipartite graph has played crucial role changing Jaccard Distance of item's category. We can observe two important results from the user's unipartite network. First, the user's diversity preference level is observed from the network and second, dissimilar users can be discovered in the user's network. Through the research process, diversity of recommender system is presented highly with small accuracy loss and optimalization for higher accuracy is possible controlling diversity ratio. This paper has three important theoretical points. First, this research expands recommender system research for user's satisfaction with various values. Second, the graph-based recommender system is developed newly. Third, the evaluation indicator of diversity is made for diversity. In addition, recommender systems are useful for corporate profit practically and this paper has contribution on business closely. Above all, business long-term profit can be improved using recommender system with diversity and the recommender system can provide right service according to user's diversity level. Lastly, the corporate selling low-involvement products have great effect based on the results.

A study on the rock mass classification in boreholes for a tunnel design using machine learning algorithms (머신러닝 기법을 활용한 터널 설계 시 시추공 내 암반분류에 관한 연구)

  • Lee, Je-Kyum;Choi, Won-Hyuk;Kim, Yangkyun;Lee, Sean Seungwon
    • Journal of Korean Tunnelling and Underground Space Association
    • /
    • v.23 no.6
    • /
    • pp.469-484
    • /
    • 2021
  • Rock mass classification results have a great influence on construction schedule and budget as well as tunnel stability in tunnel design. A total of 3,526 tunnels have been constructed in Korea and the associated techniques in tunnel design and construction have been continuously developed, however, not many studies have been performed on how to assess rock mass quality and grade more accurately. Thus, numerous cases show big differences in the results according to inspectors' experience and judgement. Hence, this study aims to suggest a more reliable rock mass classification (RMR) model using machine learning algorithms, which is surging in availability, through the analyses based on various rock and rock mass information collected from boring investigations. For this, 11 learning parameters (depth, rock type, RQD, electrical resistivity, UCS, Vp, Vs, Young's modulus, unit weight, Poisson's ratio, RMR) from 13 local tunnel cases were selected, 337 learning data sets as well as 60 test data sets were prepared, and 6 machine learning algorithms (DT, SVM, ANN, PCA & ANN, RF, XGBoost) were tested for various hyperparameters for each algorithm. The results show that the mean absolute errors in RMR value from five algorithms except Decision Tree were less than 8 and a Support Vector Machine model is the best model. The applicability of the model, established through this study, was confirmed and this prediction model can be applied for more reliable rock mass classification when additional various data is continuously cumulated.

Space design Effect on Marketing ­ - Concentrating on B to B transaction - (공간 디자인이 마케팅에 미치는 영향 ­ - 전문전시회에서 B to B 거래중심으로 -)

  • Kim, Young Soo;Jeong, Dong Bin;Kim, Kyong Hoon
    • Korea Science and Art Forum
    • /
    • v.20
    • /
    • pp.147-158
    • /
    • 2015
  • This study made an approach to the industrial exhibition space, which is a medium of marketing communication, from the position of an enterprise and consumers through the output of Space Design, and conducted it with focus on B2B transactions among specialized exhibitions. In addition, this study inquired into what factors should be considered along with space design by interpreting the purpose of participating in the exhibition and space design of the enterprise which supply capital goods, elements, related technologies and materials, etc. This study aimed at drawing the direct/indirect effect, produced by space design, on the marketing by analyzing correlation between space design and participating enterprises' marketing. Despite the marketing effect of the exhibition, which was proved by preceding research results, the reality is that exhibition-participating expenses work as considerable burden on enterprises. Particularly, booth design, which is forming the most proportion among the participating expenses, was found to have insufficient influence on visitors due to the decline in its importance among diverse factors influencing visitor's decision to visit a booth. Regardless of the business category of participating enterprises in the exhibition, the standard of exhibits was ranked as the most important consideration factor in visiting a booth. Even by business category, the standard of booth design rarely had an influence on booth visit. Booth design had an affirmative influence on participating enterprise's preference, but its influence on product purchase or business talk & contact with a participating enterprise or price was found to be extremely low. It's difficult to judge marketing success or failure of an exhibition by the form and standard of booth design. Preferably, this study infers that it's necessary to put much weight on qualitative excellence of an exhibition, which consists of participation of an enterprise in possession of excellent technologies, exhibits with higher standards and high-quality visitors with purchasing power. This study suggests that it's more effective to set up the plan for expansion of participation in exhibition by optimally regulating the proportion of space design in participating expense to increase marketing effectiveness of an exhibition. The limitations of this study, analysis of which based on the visitors to an exhibition only, requires supplementation through the follow-up research work on participating enterprises in the exhibition.

The detection of collapsible airways contributing to airflow limitation (기류 제한에 영향을 미치는 허탈성 기도의 분석)

  • Kim, Yun Seong;Park, Byung Gyu;Lee, Kyong In;Son, Seok Man;Lee, Hyo Jin;Lee, Min Ki;Son, Choon Hee;Park, Soon Kew
    • Tuberculosis and Respiratory Diseases
    • /
    • v.43 no.4
    • /
    • pp.558-570
    • /
    • 1996
  • Background : The detection of Collapsible airways has important therapeutic implications in chronic airway disease and bronchial asthma. The distinction of a purely collapsible airways disease from that of asthma is important because the treatment of the dormer may include the use of pursed lip breathing or nasal positive pressure ventilation whereas in the latter, pharmacologic approaches are used. One form of irreversible airflow limitation is collapsible airways, which has been shown to be a Component of asthma or to emphysema, it can be assessed by the volume difference between what exits the lung as determined by a spirometer and the volume compressed as measured by the plethysmography. Method : To investigate whether volume difference between slow and forced vital Capacity(SVC-FVC) by spirometry may be used as a surrogate index of airway collapse, we examined pulmonary function parameters before and after bronchodilator agent inhalation by spirometry and body plethysmography in 20 cases of patients with evidence of airflow limitation(chronic obstructive pulmonary disease 12 cases, stable bronchial asthma 7 cases, combined chronic obstructive pulmonary disease with asthma 1 case) and 20 cases of normal subjects without evidence of airflow limitation referred to the Pusan National University Hospital pulmonary function laboratory from January 1995 to July 1995 prospectively. Results : 1) Average and standard deviation of age, height, weight of patients with airflow limitation was $58.3{\pm}7.24$(yr), $166{\pm}8.0$(cm), $59.0{\pm}9.9$(kg) and those of normal subjects was $56.3{\pm}12.47$(yr), $165.9{\pm}6.9$(cm), $64.4{\pm}10.4$(kg), respectively. The differences of physical characteristics of both group were not significant statistically and male to female ratio was 14:6 in both groups. 2) The difference between slow vital capacity and forced vital capacity was $395{\pm}317ml$ in patients group and $154{\pm}176ml$ in normal group and there was statistically significance between two groups(p<0.05). Sensitivity and specificity were most higher when the cut-off value was 208ml. 3) After bronchodilator inhalation, reversible airway obstructions were shown in 16 cases of patients group, 7 cases of control group(p<0.05) by spirometry or body plethysmography d the differences of slow vital capacity and forced vital capacity in bronchodilator response group and nonresponse group were $300.4{\pm}306ml$, $144.7{\pm}180ml$ and this difference was statistically significant. 4) The difference between slow vital capacity and forced vital capacity before bronchodilator inhalation was correlated with airway resistance before bronchodilator(r=0.307 p=0.05), and the difference between slow vital capacity and forced vital capacity after bronchodilator was correlated with difference between slow vital capacity and forced vital capacity(r=0.559 p=0.0002), thoracic gas volume(r=0.488 p=0.002) before bronchodilator and airway resistance(r=0.583 p=0.0001), thoracic gas volume(r=0.375 p=0.0170) after bronchodilator, respectively. 5) The difference between slow vital capacity and forced vital capacity in smokers and nonsmokers was $257.5{\pm}303ml$, $277.5{\pm}276ml$, respectively and this difference did not reach statistical significance(p>0.05). Conclusion : The difference between slow vital capacity and forced vital capacity by spirometry may be useful for the detection of collapsible airway and may help decision making of therapeutic plans.

  • PDF

The Effect of Common Features on Consumer Preference for a No-Choice Option: The Moderating Role of Regulatory Focus (재몰유선택적정황하공동특성대우고객희호적영향(在没有选择的情况下共同特性对于顾客喜好的影响): 조절초점적조절작용(调节焦点的调节作用))

  • Park, Jong-Chul;Kim, Kyung-Jin
    • Journal of Global Scholars of Marketing Science
    • /
    • v.20 no.1
    • /
    • pp.89-97
    • /
    • 2010
  • This study researches the effects of common features on a no-choice option with respect to regulatory focus theory. The primary interest is in three factors and their interrelationship: common features, no-choice option, and regulatory focus. Prior studies have compiled vast body of research in these areas. First, the "common features effect" has been observed bymany noted marketing researchers. Tversky (1972) proposed the seminal theory, the EBA model: elimination by aspect. According to this theory, consumers are prone to focus only on unique features during comparison processing, thereby dismissing any common features as redundant information. Recently, however, more provocative ideas have attacked the EBA model by asserting that common features really do affect consumer judgment. Chernev (1997) first reported that adding common features mitigates the choice gap because of the increasing perception of similarity among alternatives. Later, however, Chernev (2001) published a critically developed study against his prior perspective with the proposition that common features may be a cognitive load to consumers, and thus consumers are possible that they are prone to prefer the heuristic processing to the systematic processing. This tends to bring one question to the forefront: Do "common features" affect consumer choice? If so, what are the concrete effects? This study tries to answer the question with respect to the "no-choice" option and regulatory focus. Second, some researchers hold that the no-choice option is another best alternative of consumers, who are likely to avoid having to choose in the context of knotty trade-off settings or mental conflicts. Hope for the future also may increase the no-choice option in the context of optimism or the expectancy of a more satisfactory alternative appearing later. Other issues reported in this domain are time pressure, consumer confidence, and alternative numbers (Dhar and Nowlis 1999; Lin and Wu 2005; Zakay and Tsal 1993). This study casts the no-choice option in yet another perspective: the interactive effects between common features and regulatory focus. Third, "regulatory focus theory" is a very popular theme in recent marketing research. It suggests that consumers have two focal goals facing each other: promotion vs. prevention. A promotion focus deals with the concepts of hope, inspiration, achievement, or gain, whereas prevention focus involves duty, responsibility, safety, or loss-aversion. Thus, while consumers with a promotion focus tend to take risks for gain, the same does not hold true for a prevention focus. Regulatory focus theory predicts consumers' emotions, creativity, attitudes, memory, performance, and judgment, as documented in a vast field of marketing and psychology articles. The perspective of the current study in exploring consumer choice and common features is a somewhat creative viewpoint in the area of regulatory focus. These reviews inspire this study of the interaction possibility between regulatory focus and common features with a no-choice option. Specifically, adding common features rather than omitting them may increase the no-choice option ratio in the choice setting only to prevention-focused consumers, but vice versa to promotion-focused consumers. The reasoning is that when prevention-focused consumers come in contact with common features, they may perceive higher similarity among the alternatives. This conflict among similar options would increase the no-choice ratio. Promotion-focused consumers, however, are possible that they perceive common features as a cue of confirmation bias. And thus their confirmation processing would make their prior preference more robust, then the no-choice ratio may shrink. This logic is verified in two experiments. The first is a $2{\times}2$ between-subject design (whether common features or not X regulatory focus) using a digital cameras as the relevant stimulus-a product very familiar to young subjects. Specifically, the regulatory focus variable is median split through a measure of eleven items. Common features included zoom, weight, memory, and battery, whereas the other two attributes (pixel and price) were unique features. Results supported our hypothesis that adding common features enhanced the no-choice ratio only to prevention-focus consumers, not to those with a promotion focus. These results confirm our hypothesis - the interactive effects between a regulatory focus and the common features. Prior research had suggested that including common features had a effect on consumer choice, but this study shows that common features affect choice by consumer segmentation. The second experiment was used to replicate the results of the first experiment. This experimental study is equal to the prior except only two - priming manipulation and another stimulus. For the promotion focus condition, subjects had to write an essay using words such as profit, inspiration, pleasure, achievement, development, hedonic, change, pursuit, etc. For prevention, however, they had to use the words persistence, safety, protection, aversion, loss, responsibility, stability etc. The room for rent had common features (sunshine, facility, ventilation) and unique features (distance time and building state). These attributes implied various levels and valence for replication of the prior experiment. Our hypothesis was supported repeatedly in the results, and the interaction effects were significant between regulatory focus and common features. Thus, these studies showed the dual effects of common features on consumer choice for a no-choice option. Adding common features may enhance or mitigate no-choice, contradictory as it may sound. Under a prevention focus, adding common features is likely to enhance the no-choice ratio because of increasing mental conflict; under the promotion focus, it is prone to shrink the ratio perhaps because of a "confirmation bias." The research has practical and theoretical implications for marketers, who may need to consider common features carefully in a practical display context according to consumer segmentation (i.e., promotion vs. prevention focus.) Theoretically, the results suggest some meaningful moderator variable between common features and no-choice in that the effect on no-choice option is partly dependent on a regulatory focus. This variable corresponds not only to a chronic perspective but also a situational perspective in our hypothesis domain. Finally, in light of some shortcomings in the research, such as overlooked attribute importance, low ratio of no-choice, or the external validity issue, we hope it influences future studies to explore the little-known world of the "no-choice option."

Ensemble Learning with Support Vector Machines for Bond Rating (회사채 신용등급 예측을 위한 SVM 앙상블학습)

  • Kim, Myoung-Jong
    • Journal of Intelligence and Information Systems
    • /
    • v.18 no.2
    • /
    • pp.29-45
    • /
    • 2012
  • Bond rating is regarded as an important event for measuring financial risk of companies and for determining the investment returns of investors. As a result, it has been a popular research topic for researchers to predict companies' credit ratings by applying statistical and machine learning techniques. The statistical techniques, including multiple regression, multiple discriminant analysis (MDA), logistic models (LOGIT), and probit analysis, have been traditionally used in bond rating. However, one major drawback is that it should be based on strict assumptions. Such strict assumptions include linearity, normality, independence among predictor variables and pre-existing functional forms relating the criterion variablesand the predictor variables. Those strict assumptions of traditional statistics have limited their application to the real world. Machine learning techniques also used in bond rating prediction models include decision trees (DT), neural networks (NN), and Support Vector Machine (SVM). Especially, SVM is recognized as a new and promising classification and regression analysis method. SVM learns a separating hyperplane that can maximize the margin between two categories. SVM is simple enough to be analyzed mathematical, and leads to high performance in practical applications. SVM implements the structuralrisk minimization principle and searches to minimize an upper bound of the generalization error. In addition, the solution of SVM may be a global optimum and thus, overfitting is unlikely to occur with SVM. In addition, SVM does not require too many data sample for training since it builds prediction models by only using some representative sample near the boundaries called support vectors. A number of experimental researches have indicated that SVM has been successfully applied in a variety of pattern recognition fields. However, there are three major drawbacks that can be potential causes for degrading SVM's performance. First, SVM is originally proposed for solving binary-class classification problems. Methods for combining SVMs for multi-class classification such as One-Against-One, One-Against-All have been proposed, but they do not improve the performance in multi-class classification problem as much as SVM for binary-class classification. Second, approximation algorithms (e.g. decomposition methods, sequential minimal optimization algorithm) could be used for effective multi-class computation to reduce computation time, but it could deteriorate classification performance. Third, the difficulty in multi-class prediction problems is in data imbalance problem that can occur when the number of instances in one class greatly outnumbers the number of instances in the other class. Such data sets often cause a default classifier to be built due to skewed boundary and thus the reduction in the classification accuracy of such a classifier. SVM ensemble learning is one of machine learning methods to cope with the above drawbacks. Ensemble learning is a method for improving the performance of classification and prediction algorithms. AdaBoost is one of the widely used ensemble learning techniques. It constructs a composite classifier by sequentially training classifiers while increasing weight on the misclassified observations through iterations. The observations that are incorrectly predicted by previous classifiers are chosen more often than examples that are correctly predicted. Thus Boosting attempts to produce new classifiers that are better able to predict examples for which the current ensemble's performance is poor. In this way, it can reinforce the training of the misclassified observations of the minority class. This paper proposes a multiclass Geometric Mean-based Boosting (MGM-Boost) to resolve multiclass prediction problem. Since MGM-Boost introduces the notion of geometric mean into AdaBoost, it can perform learning process considering the geometric mean-based accuracy and errors of multiclass. This study applies MGM-Boost to the real-world bond rating case for Korean companies to examine the feasibility of MGM-Boost. 10-fold cross validations for threetimes with different random seeds are performed in order to ensure that the comparison among three different classifiers does not happen by chance. For each of 10-fold cross validation, the entire data set is first partitioned into tenequal-sized sets, and then each set is in turn used as the test set while the classifier trains on the other nine sets. That is, cross-validated folds have been tested independently of each algorithm. Through these steps, we have obtained the results for classifiers on each of the 30 experiments. In the comparison of arithmetic mean-based prediction accuracy between individual classifiers, MGM-Boost (52.95%) shows higher prediction accuracy than both AdaBoost (51.69%) and SVM (49.47%). MGM-Boost (28.12%) also shows the higher prediction accuracy than AdaBoost (24.65%) and SVM (15.42%)in terms of geometric mean-based prediction accuracy. T-test is used to examine whether the performance of each classifiers for 30 folds is significantly different. The results indicate that performance of MGM-Boost is significantly different from AdaBoost and SVM classifiers at 1% level. These results mean that MGM-Boost can provide robust and stable solutions to multi-classproblems such as bond rating.

Multi-Dimensional Analysis Method of Product Reviews for Market Insight (마켓 인사이트를 위한 상품 리뷰의 다차원 분석 방안)

  • Park, Jeong Hyun;Lee, Seo Ho;Lim, Gyu Jin;Yeo, Un Yeong;Kim, Jong Woo
    • Journal of Intelligence and Information Systems
    • /
    • v.26 no.2
    • /
    • pp.57-78
    • /
    • 2020
  • With the development of the Internet, consumers have had an opportunity to check product information easily through E-Commerce. Product reviews used in the process of purchasing goods are based on user experience, allowing consumers to engage as producers of information as well as refer to information. This can be a way to increase the efficiency of purchasing decisions from the perspective of consumers, and from the seller's point of view, it can help develop products and strengthen their competitiveness. However, it takes a lot of time and effort to understand the overall assessment and assessment dimensions of the products that I think are important in reading the vast amount of product reviews offered by E-Commerce for the products consumers want to compare. This is because product reviews are unstructured information and it is difficult to read sentiment of reviews and assessment dimension immediately. For example, consumers who want to purchase a laptop would like to check the assessment of comparative products at each dimension, such as performance, weight, delivery, speed, and design. Therefore, in this paper, we would like to propose a method to automatically generate multi-dimensional product assessment scores in product reviews that we would like to compare. The methods presented in this study consist largely of two phases. One is the pre-preparation phase and the second is the individual product scoring phase. In the pre-preparation phase, a dimensioned classification model and a sentiment analysis model are created based on a review of the large category product group review. By combining word embedding and association analysis, the dimensioned classification model complements the limitation that word embedding methods for finding relevance between dimensions and words in existing studies see only the distance of words in sentences. Sentiment analysis models generate CNN models by organizing learning data tagged with positives and negatives on a phrase unit for accurate polarity detection. Through this, the individual product scoring phase applies the models pre-prepared for the phrase unit review. Multi-dimensional assessment scores can be obtained by aggregating them by assessment dimension according to the proportion of reviews organized like this, which are grouped among those that are judged to describe a specific dimension for each phrase. In the experiment of this paper, approximately 260,000 reviews of the large category product group are collected to form a dimensioned classification model and a sentiment analysis model. In addition, reviews of the laptops of S and L companies selling at E-Commerce are collected and used as experimental data, respectively. The dimensioned classification model classified individual product reviews broken down into phrases into six assessment dimensions and combined the existing word embedding method with an association analysis indicating frequency between words and dimensions. As a result of combining word embedding and association analysis, the accuracy of the model increased by 13.7%. The sentiment analysis models could be seen to closely analyze the assessment when they were taught in a phrase unit rather than in sentences. As a result, it was confirmed that the accuracy was 29.4% higher than the sentence-based model. Through this study, both sellers and consumers can expect efficient decision making in purchasing and product development, given that they can make multi-dimensional comparisons of products. In addition, text reviews, which are unstructured data, were transformed into objective values such as frequency and morpheme, and they were analysed together using word embedding and association analysis to improve the objectivity aspects of more precise multi-dimensional analysis and research. This will be an attractive analysis model in terms of not only enabling more effective service deployment during the evolving E-Commerce market and fierce competition, but also satisfying both customers.

A Study on Searching for Export Candidate Countries of the Korean Food and Beverage Industry Using Node2vec Graph Embedding and Light GBM Link Prediction (Node2vec 그래프 임베딩과 Light GBM 링크 예측을 활용한 식음료 산업의 수출 후보국가 탐색 연구)

  • Lee, Jae-Seong;Jun, Seung-Pyo;Seo, Jinny
    • Journal of Intelligence and Information Systems
    • /
    • v.27 no.4
    • /
    • pp.73-95
    • /
    • 2021
  • This study uses Node2vec graph embedding method and Light GBM link prediction to explore undeveloped export candidate countries in Korea's food and beverage industry. Node2vec is the method that improves the limit of the structural equivalence representation of the network, which is known to be relatively weak compared to the existing link prediction method based on the number of common neighbors of the network. Therefore, the method is known to show excellent performance in both community detection and structural equivalence of the network. The vector value obtained by embedding the network in this way operates under the condition of a constant length from an arbitrarily designated starting point node. Therefore, it has the advantage that it is easy to apply the sequence of nodes as an input value to the model for downstream tasks such as Logistic Regression, Support Vector Machine, and Random Forest. Based on these features of the Node2vec graph embedding method, this study applied the above method to the international trade information of the Korean food and beverage industry. Through this, we intend to contribute to creating the effect of extensive margin diversification in Korea in the global value chain relationship of the industry. The optimal predictive model derived from the results of this study recorded a precision of 0.95 and a recall of 0.79, and an F1 score of 0.86, showing excellent performance. This performance was shown to be superior to that of the binary classifier based on Logistic Regression set as the baseline model. In the baseline model, a precision of 0.95 and a recall of 0.73 were recorded, and an F1 score of 0.83 was recorded. In addition, the light GBM-based optimal prediction model derived from this study showed superior performance than the link prediction model of previous studies, which is set as a benchmarking model in this study. The predictive model of the previous study recorded only a recall rate of 0.75, but the proposed model of this study showed better performance which recall rate is 0.79. The difference in the performance of the prediction results between benchmarking model and this study model is due to the model learning strategy. In this study, groups were classified by the trade value scale, and prediction models were trained differently for these groups. Specific methods are (1) a method of randomly masking and learning a model for all trades without setting specific conditions for trade value, (2) arbitrarily masking a part of the trades with an average trade value or higher and using the model method, and (3) a method of arbitrarily masking some of the trades with the top 25% or higher trade value and learning the model. As a result of the experiment, it was confirmed that the performance of the model trained by randomly masking some of the trades with the above-average trade value in this method was the best and appeared stably. It was found that most of the results of potential export candidates for Korea derived through the above model appeared appropriate through additional investigation. Combining the above, this study could suggest the practical utility of the link prediction method applying Node2vec and Light GBM. In addition, useful implications could be derived for weight update strategies that can perform better link prediction while training the model. On the other hand, this study also has policy utility because it is applied to trade transactions that have not been performed much in the research related to link prediction based on graph embedding. The results of this study support a rapid response to changes in the global value chain such as the recent US-China trade conflict or Japan's export regulations, and I think that it has sufficient usefulness as a tool for policy decision-making.

DEVELOPMENT OF STATEWIDE TRUCK TRAFFIC FORECASTING METHOD BY USING LIMITED O-D SURVEY DATA (한정된 O-D조사자료를 이용한 주 전체의 트럭교통예측방법 개발)

  • 박만배
    • Proceedings of the KOR-KST Conference
    • /
    • 1995.02a
    • /
    • pp.101-113
    • /
    • 1995
  • The objective of this research is to test the feasibility of developing a statewide truck traffic forecasting methodology for Wisconsin by using Origin-Destination surveys, traffic counts, classification counts, and other data that are routinely collected by the Wisconsin Department of Transportation (WisDOT). Development of a feasible model will permit estimation of future truck traffic for every major link in the network. This will provide the basis for improved estimation of future pavement deterioration. Pavement damage rises exponentially as axle weight increases, and trucks are responsible for most of the traffic-induced damage to pavement. Consequently, forecasts of truck traffic are critical to pavement management systems. The pavement Management Decision Supporting System (PMDSS) prepared by WisDOT in May 1990 combines pavement inventory and performance data with a knowledge base consisting of rules for evaluation, problem identification and rehabilitation recommendation. Without a r.easonable truck traffic forecasting methodology, PMDSS is not able to project pavement performance trends in order to make assessment and recommendations in the future years. However, none of WisDOT's existing forecasting methodologies has been designed specifically for predicting truck movements on a statewide highway network. For this research, the Origin-Destination survey data avaiiable from WisDOT, including two stateline areas, one county, and five cities, are analyzed and the zone-to'||'&'||'not;zone truck trip tables are developed. The resulting Origin-Destination Trip Length Frequency (00 TLF) distributions by trip type are applied to the Gravity Model (GM) for comparison with comparable TLFs from the GM. The gravity model is calibrated to obtain friction factor curves for the three trip types, Internal-Internal (I-I), Internal-External (I-E), and External-External (E-E). ~oth "macro-scale" calibration and "micro-scale" calibration are performed. The comparison of the statewide GM TLF with the 00 TLF for the macro-scale calibration does not provide suitable results because the available 00 survey data do not represent an unbiased sample of statewide truck trips. For the "micro-scale" calibration, "partial" GM trip tables that correspond to the 00 survey trip tables are extracted from the full statewide GM trip table. These "partial" GM trip tables are then merged and a partial GM TLF is created. The GM friction factor curves are adjusted until the partial GM TLF matches the 00 TLF. Three friction factor curves, one for each trip type, resulting from the micro-scale calibration produce a reasonable GM truck trip model. A key methodological issue for GM. calibration involves the use of multiple friction factor curves versus a single friction factor curve for each trip type in order to estimate truck trips with reasonable accuracy. A single friction factor curve for each of the three trip types was found to reproduce the 00 TLFs from the calibration data base. Given the very limited trip generation data available for this research, additional refinement of the gravity model using multiple mction factor curves for each trip type was not warranted. In the traditional urban transportation planning studies, the zonal trip productions and attractions and region-wide OD TLFs are available. However, for this research, the information available for the development .of the GM model is limited to Ground Counts (GC) and a limited set ofOD TLFs. The GM is calibrated using the limited OD data, but the OD data are not adequate to obtain good estimates of truck trip productions and attractions .. Consequently, zonal productions and attractions are estimated using zonal population as a first approximation. Then, Selected Link based (SELINK) analyses are used to adjust the productions and attractions and possibly recalibrate the GM. The SELINK adjustment process involves identifying the origins and destinations of all truck trips that are assigned to a specified "selected link" as the result of a standard traffic assignment. A link adjustment factor is computed as the ratio of the actual volume for the link (ground count) to the total assigned volume. This link adjustment factor is then applied to all of the origin and destination zones of the trips using that "selected link". Selected link based analyses are conducted by using both 16 selected links and 32 selected links. The result of SELINK analysis by u~ing 32 selected links provides the least %RMSE in the screenline volume analysis. In addition, the stability of the GM truck estimating model is preserved by using 32 selected links with three SELINK adjustments, that is, the GM remains calibrated despite substantial changes in the input productions and attractions. The coverage of zones provided by 32 selected links is satisfactory. Increasing the number of repetitions beyond four is not reasonable because the stability of GM model in reproducing the OD TLF reaches its limits. The total volume of truck traffic captured by 32 selected links is 107% of total trip productions. But more importantly, ~ELINK adjustment factors for all of the zones can be computed. Evaluation of the travel demand model resulting from the SELINK adjustments is conducted by using screenline volume analysis, functional class and route specific volume analysis, area specific volume analysis, production and attraction analysis, and Vehicle Miles of Travel (VMT) analysis. Screenline volume analysis by using four screenlines with 28 check points are used for evaluation of the adequacy of the overall model. The total trucks crossing the screenlines are compared to the ground count totals. L V/GC ratios of 0.958 by using 32 selected links and 1.001 by using 16 selected links are obtained. The %RM:SE for the four screenlines is inversely proportional to the average ground count totals by screenline .. The magnitude of %RM:SE for the four screenlines resulting from the fourth and last GM run by using 32 and 16 selected links is 22% and 31 % respectively. These results are similar to the overall %RMSE achieved for the 32 and 16 selected links themselves of 19% and 33% respectively. This implies that the SELINICanalysis results are reasonable for all sections of the state.Functional class and route specific volume analysis is possible by using the available 154 classification count check points. The truck traffic crossing the Interstate highways (ISH) with 37 check points, the US highways (USH) with 50 check points, and the State highways (STH) with 67 check points is compared to the actual ground count totals. The magnitude of the overall link volume to ground count ratio by route does not provide any specific pattern of over or underestimate. However, the %R11SE for the ISH shows the least value while that for the STH shows the largest value. This pattern is consistent with the screenline analysis and the overall relationship between %RMSE and ground count volume groups. Area specific volume analysis provides another broad statewide measure of the performance of the overall model. The truck traffic in the North area with 26 check points, the West area with 36 check points, the East area with 29 check points, and the South area with 64 check points are compared to the actual ground count totals. The four areas show similar results. No specific patterns in the L V/GC ratio by area are found. In addition, the %RMSE is computed for each of the four areas. The %RMSEs for the North, West, East, and South areas are 92%, 49%, 27%, and 35% respectively, whereas, the average ground counts are 481, 1383, 1532, and 3154 respectively. As for the screenline and volume range analyses, the %RMSE is inversely related to average link volume. 'The SELINK adjustments of productions and attractions resulted in a very substantial reduction in the total in-state zonal productions and attractions. The initial in-state zonal trip generation model can now be revised with a new trip production's trip rate (total adjusted productions/total population) and a new trip attraction's trip rate. Revised zonal production and attraction adjustment factors can then be developed that only reflect the impact of the SELINK adjustments that cause mcreases or , decreases from the revised zonal estimate of productions and attractions. Analysis of the revised production adjustment factors is conducted by plotting the factors on the state map. The east area of the state including the counties of Brown, Outagamie, Shawano, Wmnebago, Fond du Lac, Marathon shows comparatively large values of the revised adjustment factors. Overall, both small and large values of the revised adjustment factors are scattered around Wisconsin. This suggests that more independent variables beyond just 226; population are needed for the development of the heavy truck trip generation model. More independent variables including zonal employment data (office employees and manufacturing employees) by industry type, zonal private trucks 226; owned and zonal income data which are not available currently should be considered. A plot of frequency distribution of the in-state zones as a function of the revised production and attraction adjustment factors shows the overall " adjustment resulting from the SELINK analysis process. Overall, the revised SELINK adjustments show that the productions for many zones are reduced by, a factor of 0.5 to 0.8 while the productions for ~ relatively few zones are increased by factors from 1.1 to 4 with most of the factors in the 3.0 range. No obvious explanation for the frequency distribution could be found. The revised SELINK adjustments overall appear to be reasonable. The heavy truck VMT analysis is conducted by comparing the 1990 heavy truck VMT that is forecasted by the GM truck forecasting model, 2.975 billions, with the WisDOT computed data. This gives an estimate that is 18.3% less than the WisDOT computation of 3.642 billions of VMT. The WisDOT estimates are based on the sampling the link volumes for USH, 8TH, and CTH. This implies potential error in sampling the average link volume. The WisDOT estimate of heavy truck VMT cannot be tabulated by the three trip types, I-I, I-E ('||'&'||'pound;-I), and E-E. In contrast, the GM forecasting model shows that the proportion ofE-E VMT out of total VMT is 21.24%. In addition, tabulation of heavy truck VMT by route functional class shows that the proportion of truck traffic traversing the freeways and expressways is 76.5%. Only 14.1% of total freeway truck traffic is I-I trips, while 80% of total collector truck traffic is I-I trips. This implies that freeways are traversed mainly by I-E and E-E truck traffic while collectors are used mainly by I-I truck traffic. Other tabulations such as average heavy truck speed by trip type, average travel distance by trip type and the VMT distribution by trip type, route functional class and travel speed are useful information for highway planners to understand the characteristics of statewide heavy truck trip patternS. Heavy truck volumes for the target year 2010 are forecasted by using the GM truck forecasting model. Four scenarios are used. Fo~ better forecasting, ground count- based segment adjustment factors are developed and applied. ISH 90 '||'&'||' 94 and USH 41 are used as example routes. The forecasting results by using the ground count-based segment adjustment factors are satisfactory for long range planning purposes, but additional ground counts would be useful for USH 41. Sensitivity analysis provides estimates of the impacts of the alternative growth rates including information about changes in the trip types using key routes. The network'||'&'||'not;based GMcan easily model scenarios with different rates of growth in rural versus . . urban areas, small versus large cities, and in-state zones versus external stations. cities, and in-state zones versus external stations.

  • PDF