• Title/Summary/Keyword: Science and Technology Predictions

Prediction of infectious diseases using multiple web data and LSTM (다중 웹 데이터와 LSTM을 사용한 전염병 예측)

  • Kim, Yeongha;Kim, Inhwan;Jang, Beakcheol
    • Journal of Internet Computing and Services / v.21 no.5 / pp.139-148 / 2020
  • Infectious diseases have long plagued mankind, and predicting and preventing them has been a major challenge. For this reason, various studies have been conducted to predict infectious diseases. Most early studies relied on epidemiological data from the Centers for Disease Control and Prevention (CDC); however, because the CDC updates its data only once a week, it is difficult to estimate the number of disease outbreaks in real time. With the emergence of various Internet media driven by recent developments in IT, studies have begun to predict the occurrence of infectious diseases from web data, but most of the studies we surveyed use a single web data source. Disease forecasting from a single web source has the disadvantage that it is difficult to collect a large amount of training data and to make accurate predictions for recent outbreaks such as COVID-19. We therefore demonstrate through experiments that an LSTM model that uses multiple web data sources to predict the occurrence of infectious diseases is more accurate than one that uses a single source, and we suggest a model suitable for predicting infectious diseases. In the experiments, we predicted the occurrence of malaria and epidemic parotitis using single-web-data models and the proposed model. A total of 104 weeks of news, SNS, and search query data were collected, of which 75 weeks were used for training and 29 weeks for validation. When predicting the validation data, the proposed model achieved the highest Pearson correlation coefficients of 0.94 and 0.86 and the lowest RMSE values of 0.19 and 0.07.
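As a rough illustration of the multi-source setup described in the abstract, the sketch below fuses weekly news, SNS, and search-query signals into a single LSTM and evaluates it with the Pearson correlation and RMSE metrics the paper reports. The window length, layer sizes, and the random placeholder data are assumptions, not the authors' implementation.

```python
# Illustrative sketch only: a single LSTM fed with multiple weekly web signals
# (news, SNS, search queries) to predict weekly case counts. Window length,
# layer sizes, and the random data are assumptions.
import numpy as np
from scipy.stats import pearsonr
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Input, LSTM, Dense

def make_windows(features, targets, window=4):
    """Turn a weekly multivariate series into (samples, window, n_features) windows."""
    X, y = [], []
    for t in range(window, len(features)):
        X.append(features[t - window:t])
        y.append(targets[t])
    return np.array(X), np.array(y)

# 104 weeks x 3 web signals, plus weekly reported case counts (placeholders).
rng = np.random.default_rng(0)
web_signals = rng.random((104, 3))
case_counts = rng.random(104)

X, y = make_windows(web_signals, case_counts, window=4)
X_train, y_train = X[:71], y[:71]   # roughly the first 75 weeks for training
X_val, y_val = X[71:], y[71:]       # remaining weeks for validation

model = Sequential([
    Input(shape=(X.shape[1], X.shape[2])),
    LSTM(32),
    Dense(1),
])
model.compile(optimizer="adam", loss="mse")
model.fit(X_train, y_train, epochs=100, batch_size=8, verbose=0)

pred = model.predict(X_val, verbose=0).ravel()
rmse = float(np.sqrt(np.mean((pred - y_val) ** 2)))
corr = float(pearsonr(pred, y_val)[0])   # the paper evaluates with Pearson r and RMSE
print(f"Pearson r = {corr:.2f}, RMSE = {rmse:.2f}")
```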

Pressure Drop Predictions Using Multiple Regression Model in Pulse Jet Type Bag Filter Without Venturi (다중회귀모형을 이용한 벤츄리가 없는 충격기류식 여과집진장치 압력손실 예측)

  • Suh, Jeong-Min;Park, Jeong-Ho;Cho, Jae-Hwan;Jin, Kyung-Ho;Jung, Moon-Sub;Yi, Pyong-In;Hong, Sung-Chul;Sivakumar, S.;Choi, Kum-Chan
    • Journal of Environmental Science International / v.23 no.12 / pp.2045-2056 / 2014
  • In this study, pressure drop was measured in a pulse jet bag filter without venturi, on which 16 filter bags (Ø$140{\times}850{\ell}$) were installed, under varying operating conditions (filtration velocity, inlet dust concentration, pulse pressure, and pulse interval) using coke dust from a steel mill. The 180 pressure drop measurements obtained were used to predict pressure drop with a multiple regression model, so that the predictions can guide effective operating conditions and serve as basic data for economical design. The results showed that when the filtration velocity was increased by 1%, the pressure drop increased by 2.2%, indicating that filtration velocity contributed the most to pressure drop among the operating conditions. The pressure drop decreased by 1.53% when the pulse pressure was increased by 1%, confirming that pulse pressure was the next most influential factor after filtration velocity. Meanwhile, the pressure drop increased by 0.3% and 0.37% when the inlet dust concentration and pulse interval, respectively, were increased by 1%, implying that their effects were smaller than those of filtration velocity and pulse pressure. Therefore, the influence on the pressure drop of the pulse jet bag filter decreased in the order of filtration velocity ($V_f$), pulse pressure ($P_p$), inlet dust concentration ($C_i$), and pulse interval ($P_i$). The predictions also indicated that stable operation can be achieved with a filtration velocity below 1.5 m/min and an inlet dust concentration below $4g/m^3$; when the inlet dust concentration exceeds $4g/m^3$, the pulse pressure and pulse interval need to be adjusted. Examining filtration velocity and pulse pressure, operation was possible regardless of pulse pressure at a filtration velocity of 1.5 m/min, whereas at 2 m/min operation was possible only when the pulse pressure was set above $5.8kgf/cm^2$. Likewise, the prediction of pressure drop as a function of filtration velocity and pulse interval showed that a pulse interval shorter than 50 sec should be used at a filtration velocity of 1.5 m/min, while the pulse interval should be set below 11 sec at 2 m/min. Under a filtration velocity below 1 m/min and a pulse pressure above $7kgf/cm^2$, the pressure drop would be small, but economic feasibility would be low because the dust collection equipment becomes larger, increasing installation and operating costs, and the high pulse pressure shortens the life of the filter bags.
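As a hedged sketch of how such a multiple regression can be fitted, the example below regresses pressure drop on the four operating variables. The log-log functional form (so coefficients read as elasticities, i.e. a 1% change in $V_f$ maps to a fixed percentage change in pressure drop), the column names, and the synthetic placeholder data are assumptions, not the paper's actual model.

```python
# Illustrative sketch only: a multiple regression of pressure drop on the four
# operating variables. The log-log form (coefficients as elasticities), the
# column names, and the synthetic data are assumptions, not the paper's model.
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(0)
df = pd.DataFrame({
    "v_f": rng.uniform(1.0, 2.0, 180),    # filtration velocity [m/min]
    "c_i": rng.uniform(1.0, 5.0, 180),    # inlet dust concentration [g/m3]
    "p_p": rng.uniform(3.0, 7.0, 180),    # pulse pressure [kgf/cm2]
    "p_i": rng.uniform(10.0, 60.0, 180),  # pulse interval [sec]
})
# Placeholder pressure drop loosely following the reported sensitivities, plus noise.
df["dp"] = (100 * df.v_f**2.2 * df.c_i**0.3 * df.p_p**-1.53 * df.p_i**0.37
            * np.exp(rng.normal(0, 0.05, 180)))

X = sm.add_constant(np.log(df[["v_f", "c_i", "p_p", "p_i"]]))
y = np.log(df["dp"])
fit = sm.OLS(y, X).fit()
print(fit.summary())  # coefficient on log(v_f) ~ % change in dp per 1% change in V_f

# Predicted pressure drop at a candidate operating point (same column order as X).
point = pd.DataFrame({"const": [1.0], "v_f": [np.log(1.5)], "c_i": [np.log(4.0)],
                      "p_p": [np.log(5.8)], "p_i": [np.log(30.0)]})
print("predicted dp:", float(np.exp(fit.predict(point)[0])))
```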

Machine learning-based corporate default risk prediction model verification and policy recommendation: Focusing on improvement through stacking ensemble model (머신러닝 기반 기업부도위험 예측모델 검증 및 정책적 제언: 스태킹 앙상블 모델을 통한 개선을 중심으로)

  • Eom, Haneul;Kim, Jaeseong;Choi, Sangok
    • Journal of Intelligence and Information Systems / v.26 no.2 / pp.105-129 / 2020
  • This study uses corporate data from 2012 to 2018, the period in which K-IFRS was applied in earnest, to predict default risk. The data used in the analysis totaled 10,545 rows and 160 columns, including 38 from the statement of financial position, 26 from the statement of comprehensive income, 11 from the statement of cash flows, and 76 financial-ratio indices. Unlike most prior studies, which used default events as the learning target, this study calculated default risk from each company's market capitalization and stock price volatility based on the Merton model. This resolves the data imbalance problem caused by the scarcity of default events, which has been pointed out as a limitation of the existing methodology, and also reflects the differences in default risk that exist among ordinary companies. Because learning was conducted using only corporate information that is also available for unlisted companies, the default risk of unlisted companies without stock price information can be derived appropriately. This makes it possible to provide stable default risk assessment services to companies, such as small and medium-sized enterprises and startups, whose default risk is difficult to determine with traditional credit rating models. Although the prediction of corporate default risk using machine learning has been studied actively in recent years, most studies make predictions with a single model, so model bias remains an issue. A stable and reliable valuation methodology is required for calculating default risk, given that default risk information is used very widely in the market and sensitivity to differences in default risk is high, and strict standards are also required for the calculation method. The credit rating method stipulated by the Financial Services Commission in the Financial Investment Regulations calls for the preparation of evaluation methods, including verification of their adequacy, in consideration of past statistical data and experience with credit ratings and changes in future market conditions. This study reduced the bias of individual models by using a stacking ensemble technique that synthesizes various machine learning models. This makes it possible to capture complex nonlinear relationships between default risk and various corporate information while preserving the advantage of machine learning-based default risk prediction models, namely their short computation time. To produce the sub-model forecasts used as input to the stacking ensemble model, the training data were divided into seven folds, and the sub-models were trained on these folds to generate forecasts. To compare predictive power, Random Forest, MLP, and CNN models were trained on the full training data, and the predictive power of each model was then verified on the test set. The analysis showed that the stacking ensemble model exceeded the predictive power of the Random Forest model, which performed best among the single models. Next, to check for statistically significant differences, pairs of forecasts from the stacking ensemble model and each individual model were constructed. Because the Shapiro-Wilk normality test showed that none of the pairs followed a normal distribution, the nonparametric Wilcoxon rank-sum test was used to check whether the two forecasts in each pair differed significantly. The analysis showed that the forecasts of the stacking ensemble model differed significantly from those of the MLP and CNN models. In addition, this study provides a methodology that allows existing credit rating agencies to apply machine learning-based default risk prediction, given that traditional credit rating models can also be included as sub-models when calculating the final default probability. The stacking ensemble technique proposed in this study can also help satisfy the requirements of the Financial Investment Business Regulations through the combination of various sub-models. We hope that this research will be used as a resource to increase practical adoption by overcoming the limitations of existing machine learning-based models.
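The sketch below illustrates the stacking idea on synthetic placeholder data: sub-model forecasts are generated out-of-fold (here via scikit-learn's StackingRegressor with cv=7, echoing the seven-fold split mentioned above) and combined by a meta-learner. The sub-models shown, the meta-learner, and the data are assumptions; the paper's CNN sub-model and Merton-based target are not reproduced here.

```python
# Illustrative sketch only: a stacking ensemble whose sub-model forecasts are
# produced out-of-fold (cv=7, echoing the paper's seven-way split) and combined
# by a meta-learner. Sub-models, meta-learner, and data are placeholder
# assumptions; the paper's CNN sub-model and Merton-based target are not shown.
import numpy as np
from sklearn.ensemble import RandomForestRegressor, StackingRegressor
from sklearn.linear_model import Ridge
from sklearn.neural_network import MLPRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)
X = rng.normal(size=(2_000, 160))   # placeholder for the 160 financial columns
y = rng.random(2_000)               # placeholder for a continuous default-risk target

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

stack = StackingRegressor(
    estimators=[
        ("rf", RandomForestRegressor(n_estimators=100, random_state=0)),
        ("mlp", MLPRegressor(hidden_layer_sizes=(64, 32), max_iter=300, random_state=0)),
    ],
    final_estimator=Ridge(),
    cv=7,                           # sub-model forecasts come from 7 folds
)
stack.fit(X_tr, y_tr)
pred = stack.predict(X_te)
print("stacking RMSE:", np.sqrt(mean_squared_error(y_te, pred)))
```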

The Adaptive Personalization Method According to Users Purchasing Index : Application to Beverage Purchasing Predictions (고객별 구매빈도에 동적으로 적응하는 개인화 시스템 : 음료수 구매 예측에의 적용)

  • Park, Yoon-Joo
    • Journal of Intelligence and Information Systems / v.17 no.4 / pp.95-108 / 2011
  • This is a study of a personalization method that intelligently adapts the level of clustering to the purchasing index of each customer. In the e-business era, many companies gather customers' demographic and transactional information such as age, gender, purchasing date, and product category. They use this information to predict customers' preferences or purchasing patterns so that they can provide more customized services. The conventional Customer-Segmentation method provides customized services for each customer group: it clusters the whole customer set into groups based on similarity and builds a predictive model for each resulting group. It thus keeps the number of predictive models manageable and also provides more data for customers who do not have enough data of their own by borrowing the data of similar customers. However, this method often fails to provide highly personalized services to each individual, which is especially important for VIP customers. Furthermore, it clusters customers who already have a considerable amount of data together with customers who have only a small amount, which increases computational cost unnecessarily without a significant performance improvement. The other conventional method, the 1-to-1 method, provides more customized services than the Customer-Segmentation method because the predictive model is built using only the data of the individual customer. This method not only provides highly personalized services but also builds a relatively simple and less costly model for each customer. However, the 1-to-1 method does not produce a good predictive model when a customer has only a few data records; in other words, when a customer has an insufficient number of transactions, its performance deteriorates. To overcome the limitations of these two conventional methods, we suggest a new method, the Intelligent Customer Segmentation method, which provides adaptively personalized services according to the customer's purchasing index. The suggested method clusters customers according to their purchasing index, so that predictions for customers who purchase less are based on data from more intensively clustered groups, while VIP customers, who already have a considerable amount of data, are clustered to a much lesser extent or not at all. The main idea is to apply the clustering technique when the number of transactions of the target customer is less than a predefined criterion data size. To find this criterion, we suggest an algorithm called sliding window correlation analysis, which aims to find the transactional data size at which the performance of the 1-to-1 method degrades sharply due to data sparsity. After finding this criterion data size, we apply the conventional 1-to-1 method to customers who have more data than the criterion and apply the clustering technique to those who have less, until each can use at least the criterion amount of data for model building. We apply the two conventional methods and the newly suggested method to Nielsen's beverage purchasing data to predict customers' purchasing amounts and purchasing categories. We use two data mining techniques (Support Vector Machine and Linear Regression) and two performance measures (MAE and RMSE) to predict the two dependent variables. The results show that the suggested Intelligent Customer Segmentation method outperforms the conventional 1-to-1 method in many cases and produces the same level of performance as the Customer-Segmentation method at much lower computational cost.
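A minimal sketch of the adaptive idea, under assumed names and a made-up criterion value: customers with at least the criterion number of transactions get a 1-to-1 model, while sparse customers borrow data from a cluster of similar customers. This is an illustration, not the paper's implementation (which also uses SVM and chooses the criterion via the sliding-window correlation analysis).

```python
# Illustrative sketch only: adaptively choose between a 1-to-1 model and a
# cluster-pooled model based on a criterion data size. The criterion value,
# clustering profile, and synthetic data are assumptions, not the paper's code.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LinearRegression

CRITERION = 30  # data size below which clustering kicks in (assumed; the paper
                # finds it with a sliding-window correlation analysis)

def fit_adaptive_models(customers, n_clusters=5):
    """customers: dict of customer_id -> (X, y) transaction features and targets."""
    ids = list(customers)
    # Cluster customers by a simple profile (mean feature vector) -- an assumption.
    profiles = np.array([customers[c][0].mean(axis=0) for c in ids])
    labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit_predict(profiles)

    models = {}
    for i, cid in enumerate(ids):
        X, y = customers[cid]
        if len(y) >= CRITERION:
            # Enough data: 1-to-1 model trained on this customer's data only.
            models[cid] = LinearRegression().fit(X, y)
        else:
            # Sparse data: pool transactions from the customer's cluster.
            member = [j for j, lab in enumerate(labels) if lab == labels[i]]
            Xp = np.vstack([customers[ids[j]][0] for j in member])
            yp = np.concatenate([customers[ids[j]][1] for j in member])
            models[cid] = LinearRegression().fit(Xp, yp)
    return models

# Toy usage with random data standing in for beverage purchase histories.
rng = np.random.default_rng(1)
customers = {}
for i in range(50):
    n = int(rng.integers(5, 80))
    customers[f"c{i}"] = (rng.normal(size=(n, 4)), rng.random(n))
models = fit_adaptive_models(customers)
print(len(models), "per-customer models fitted")
```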

A Study on the Born Global Venture Corporation's Characteristics and Performance ('본글로벌(born global)전략'을 추구하는 벤처기업의 특성과 성과에 관한 연구)

  • Kim, Hyung-Jun;Jung, Duk-Hwa
    • Journal of Global Scholars of Marketing Science / v.17 no.3 / pp.39-59 / 2007
  • The international involvement of a firm has been described as a gradual development process, "a process in which the enterprise gradually increases its international involvement. This process evolves in the interplay between the development of knowledge about foreign markets and operations on the one hand and an increasing commitment of resources to foreign markets on the other." On the basis of the Uppsala internationalization model, many studies have provided strong theoretical and empirical support. According to the predictions of the classic stages theory, the internationalization process of firms has been characterized as a gradual evolution into foreign markets, the so-called stage theory: indirect and direct export, strategic alliance, and foreign direct investment. However, the terms "international new ventures" (McDougall, Shane, and Oviatt 1994), "born globals" (Knight 1997; Knight and Cavusgil 1996; Madsen and Servais 1997), "instant internationals" (Preece, Miles, and Baetz 1999), and "global startups" (Oviatt and McDougall 1994) have come into the spotlight in studies on the internationalization of technology-intensive venture companies. Recent research on venture companies has suggested the phenomenon of 'born global' firms as a contradiction to the stages theory. In particular, the article by Oviatt and McDougall threw the spotlight on international entrepreneurs, on international new ventures, and on their importance in the globalizing world economy. Since venture companies by definition lack economies of scale and resources (financial and knowledge) and are averse to risk taking, they have difficulty expanding their market abroad and tend to pursue internationalization gradually, step by step. Nevertheless, many venture companies have pursued a 'born global strategy', which differs from the process strategy, because the corporate environment has been rapidly changing toward globalization. Existing studies investigate (1) why ventures enter overseas markets at such an early stage, even in infancy, (2) what makes international strategies differ among ventures, and whether the born global strategy is better for infant ventures. However, as for venture performance (growth and profitability), the existing results do not agree with each other, and they do not include the marketing strategies (differentiation, low price, market breadth, and market pioneering) that are important factors in studying BGV performance. In this paper I aim to delineate the appearance of international new ventures and the phenomenon of venture companies' internationalization strategies. To address the research problems, I develop a resource-based model and marketing-strategy measures for analyzing the effects of born global venture firms. I suggest three research problems. First, do Korean venture companies gain advantages in corporate performance (growth, profitability, and overall market performance) when they pursue internationalization from inception? Second, do Korean BGVs have firm-specific assets (foreign experience, foreign orientation, organizational absorptive capacity)? Third, what are the marketing strategies of Korean BGVs, and are they different from those of other firms? Under these problems, I test (1) whether a BGV, a firm that started its internationalization activity almost from inception, has more intangible resources (foreign experience of corporate members, foreign orientation, technological competence, and absorptive capacity) than other venture firms (Non_BGV), and (2) whether the BGVs' marketing strategies of differentiation, low price, market diversification, and preemption differ from those of Non_BGVs. Above all, the main purpose of this research is to examine whether the results achieved by BGVs are indeed better than those obtained by Non_BGV firms with respect to growth rate and efficiency. For this research, I surveyed venture companies located in Seoul and Daejeon, Korea, from November to December 2005. I gathered data from 200 venture companies and then selected 84 samples founded during 1999${\sim}$2000. To compare the characteristics of BGVs with those of Non_BGVs, I classified firms as BGVs by an export intensity over 50% among five- or six-year-old venture firms. Many other studies have tried to classify BGV and Non_BGV, but there are as many criteria as there are researchers on this topic: some use the time gap between establishment and the first internationalization experience, and others use export intensity, the ratio of export sales to total sales. Although I use a mixed criterion drawn from prior research, this kind of criterion is subjective and arbitrary rather than objective, so this research has a critical limitation in the classification of BGV and Non_BGV. The first purpose of the research is to test the difference in performance between BGV and Non_BGV. The t-test results show statistically significant differences not only in growth rate (sales growth rate compared to competitors and three-year average sales growth rate) but also in the general market performance of BGVs. In the case of profitability, however, the hypothesis that BGVs are more profitable (return on investment (ROI) compared to competitors and three-year average ROI) than Non_BGVs was not supported. From these results, the paper concludes that BGVs grow rapidly and achieve high market performance (in terms of market share and customer loyalty), but there is no profitability difference between BGV and Non_BGV. The second result is that BGVs have more absorptive capacity, especially knowledge competence, and more entrepreneurial international experience than Non_BGVs. The paper also finds that BGVs pursue product differentiation, preemption, and market diversification strategies, while Non_BGVs pursue a low-price strategy. These results have not been dealt with in existing studies. This research has some limitations. The first concerns the definition of BGV, as mentioned above: conceptually, a BGV is a company that pursues internationalization from inception, but in an empirical study it is very difficult to distinguish BGV from Non_BGV. I tried to classify on the basis of time difference and export intensity, but these criteria are so subjective and arbitrary that the results would not be robust if the criteria were changed. The second limitation concerns the sample: I surveyed venture companies located only in Seoul and Daejeon and used only 84 samples, which may cause a sample bias problem and limit the generalization of the results. Follow-up studies focusing on ventures located in other regions would help verify the results of this paper.
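For illustration only, the sketch below shows the kind of independent-samples t-test used to compare a performance measure between the BGV and Non_BGV groups; the group sizes, means, and the chosen variable are placeholder assumptions, not the paper's data.

```python
# Illustrative sketch only: an independent-samples t-test comparing a growth
# measure between BGV and Non_BGV firms. Group sizes, means, and the variable
# are placeholder assumptions, not the paper's data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
growth_bgv = rng.normal(0.25, 0.10, size=30)      # placeholder BGV growth rates
growth_non_bgv = rng.normal(0.15, 0.10, size=54)  # placeholder Non_BGV growth rates

t_stat, p_value = stats.ttest_ind(growth_bgv, growth_non_bgv, equal_var=False)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")     # small p -> significant group difference
```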
