• Title/Summary/Keyword: data value prediction


A Time Series Graph based Convolutional Neural Network Model for Effective Input Variable Pattern Learning : Application to the Prediction of Stock Market (효과적인 입력변수 패턴 학습을 위한 시계열 그래프 기반 합성곱 신경망 모형: 주식시장 예측에의 응용)

  • Lee, Mo-Se;Ahn, Hyunchul
    • Journal of Intelligence and Information Systems
    • /
    • v.24 no.1
    • /
    • pp.167-181
    • /
    • 2018
  • Over the past decade, deep learning has been in the spotlight among various machine learning algorithms. In particular, CNN (Convolutional Neural Network), which is known as an effective solution for recognizing and classifying images or voices, has been popularly applied to classification and prediction problems. In this study, we investigate how to apply CNN to business problem solving. Specifically, this study proposes to apply CNN to stock market prediction, one of the most challenging tasks in machine learning research. As mentioned, CNN has strength in interpreting images. Thus, the model proposed in this study adopts CNN as a binary classifier that predicts the stock market direction (upward or downward) using time series graphs as its inputs. That is, our proposal is to build a machine learning algorithm that mimics the experts called 'technical analysts', who examine graphs of past price movements and predict future price movements. Our proposed model, named 'CNN-FG (Convolutional Neural Network using Fluctuation Graph)', consists of five steps. In the first step, it divides the dataset into intervals of 5 days. It then creates time series graphs for the divided dataset in step 2. The size of the image in which the graph is drawn is 40 × 40 pixels, and the graph of each independent variable is drawn in a different color. In step 3, the model converts the images into matrices. Each image is converted into a combination of three matrices expressing the color value on the R (red), G (green), and B (blue) scales. In the next step, it splits the dataset of graph images into training and validation datasets. We used 80% of the total dataset as the training dataset and the remaining 20% as the validation dataset. Finally, CNN classifiers are trained on the images of the training dataset.
Regarding the parameters of CNN-FG, we adopted two convolution filters (5 × 5 × 6 and 5 × 5 × 9) in the convolution layer. In the pooling layer, a 2 × 2 max pooling filter was used. The numbers of nodes in the two hidden layers were set to 900 and 32, respectively, and the number of nodes in the output layer was set to 2 (one for the prediction of an upward trend and the other for a downward trend). The activation function for the convolution layer and the hidden layers was ReLU (Rectified Linear Unit), and that for the output layer was the Softmax function. To validate our model, CNN-FG, we applied it to the prediction of KOSPI200 over 2,026 days in eight years (from 2009 to 2016). To match the proportions of the two groups of the dependent variable (i.e., tomorrow's stock market movement), we selected 1,950 samples by random sampling. Finally, we built the training dataset from 80% of the total dataset (1,560 samples) and the validation dataset from the remaining 20% (390 samples). The independent variables of the experimental dataset included twelve technical indicators popularly used in previous studies, including Stochastic %K, Stochastic %D, Momentum, ROC (rate of change), LW %R (Larry Williams' %R), A/D oscillator (accumulation/distribution oscillator), OSCP (price oscillator), CCI (commodity channel index), and so on. To confirm the superiority of CNN-FG, we compared its prediction accuracy with those of other classification models. Experimental results showed that CNN-FG outperforms LOGIT (logistic regression), ANN (artificial neural network), and SVM (support vector machine) with statistical significance. These empirical results imply that converting time series business data into graphs and building CNN-based classification models on these graphs can be effective from the perspective of prediction accuracy.
Thus, this paper sheds light on how to apply deep learning techniques to the domain of business problem solving.
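The graph-generation step described above (5-day windows rasterized onto a 40 × 40 image) can be sketched roughly as follows. This is a minimal single-channel illustration, not the authors' implementation; the scaling and rounding choices are assumptions:

```python
def fluctuation_graph(series, size=40):
    """Rasterize a short indicator series onto a size x size grid.

    Returns a size x size matrix of 0/1 where 1 marks the plotted
    point for each day (columns spread evenly across the width,
    rows scaled between the window's min and max values).
    """
    lo, hi = min(series), max(series)
    span = hi - lo or 1.0
    grid = [[0] * size for _ in range(size)]
    n = len(series)
    for i, v in enumerate(series):
        col = round(i * (size - 1) / (n - 1)) if n > 1 else 0
        row = (size - 1) - round((v - lo) / span * (size - 1))
        grid[row][col] = 1
    return grid

# One 5-day window of a single (made-up) indicator
g = fluctuation_graph([10, 12, 11, 15, 14])
```

In the paper each independent variable would get its own color, yielding the three R/G/B matrices fed to the CNN; here one variable maps to one binary plane.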

A PLS Path Modeling Approach on the Cause-and-Effect Relationships among BSC Critical Success Factors for IT Organizations (PLS 경로모형을 이용한 IT 조직의 BSC 성공요인간의 인과관계 분석)

  • Lee, Jung-Hoon;Shin, Taek-Soo;Lim, Jong-Ho
    • Asia pacific journal of information systems
    • /
    • v.17 no.4
    • /
    • pp.207-228
    • /
    • 2007
  • Measurement of Information Technology (IT) organizations' activities has long been limited mainly to financial indicators. However, given the multifarious functions of information systems, a number of studies have examined new measurement methodologies that complement financial measurement with new measurement methods. In particular, research on the IT Balanced Scorecard (BSC), a concept derived from the BSC for measuring IT activities, has been carried out in recent years. BSC provides more advantages than merely the integration of non-financial measures into a performance measurement system. The core of BSC rests on the cause-and-effect relationships between measures that allow prediction of value chain performance, communication, and realization of the corporate strategy and incentive-controlled actions. More recently, BSC proponents have focused on the need to tie measures together into a causal chain of performance and to test the validity of these hypothesized effects to guide the development of strategy. Kaplan and Norton [2001] argue that one of the primary benefits of the balanced scorecard is its use in gauging the success of strategy. Norreklit [2000] insists that the cause-and-effect chain is central to the balanced scorecard. The cause-and-effect chain is also central to the IT BSC. However, the relationship between information systems and enterprise strategies, as well as the connections between various IT performance measurement indicators, has received little prior study. Ittner et al. [2003] report that 77% of all surveyed companies with an implemented BSC place no or only little interest on soundly modeled cause-and-effect relationships, despite the importance of cause-and-effect chains as an integral part of BSC. This shortcoming can be explained with one theoretical and one practical reason [Blumenberg and Hinz, 2006].
From a theoretical point of view, causalities within the BSC method and their application are only vaguely described by Kaplan and Norton. From a practical consideration, modeling corporate causalities is a complex task due to tedious data acquisition and the subsequent reliability maintenance. However, cause-and-effect relationships are an essential part of BSCs because they differentiate performance measurement systems like BSCs from simple key performance indicator (KPI) lists. KPI lists present an ad-hoc collection of measures to managers but do not allow for a comprehensive view of corporate performance. Instead, performance measurement systems like BSCs try to model the relationships of the underlying value chain as cause-and-effect relationships. Therefore, to overcome the deficiencies of causal modeling in the IT BSC, sound and robust causal modeling approaches are required in theory as well as in practice. The purpose of this study is to suggest critical success factors (CSFs) and KPIs for measuring the performance of IT organizations and to empirically validate the causal relationships among those CSFs. For this purpose, we define four perspectives of the BSC for IT organizations according to Van Grembergen's study [2000] as follows. The Future Orientation perspective represents the human and technology resources needed by IT to deliver its services. The Operational Excellence perspective represents the IT processes employed to develop and deliver the applications. The User Orientation perspective represents the user evaluation of IT. The Business Contribution perspective captures the business value of the IT investments. Each of these perspectives has to be translated into corresponding metrics and measures that assess the current situation. This study suggests 12 CSFs for the IT BSC based on previous IT BSC studies and COBIT 4.1. These CSFs consist of 51 KPIs.
We define the cause-and-effect relationships among BSC CSFs for IT organizations as follows. The Future Orientation perspective will have positive effects on the Operational Excellence perspective. The Operational Excellence perspective, in turn, will have positive effects on the User Orientation perspective. Finally, the User Orientation perspective will have positive effects on the Business Contribution perspective. This research tests the validity of these hypothesized causal effects and the sub-hypothesized causal relationships. For this purpose, we used the Partial Least Squares approach to Structural Equation Modeling (PLS Path Modeling) for analyzing the multiple IT BSC CSFs. PLS path modeling has special abilities that make it more appropriate than other techniques, such as multiple regression and LISREL, when analyzing small sample sizes. The use of PLS path modeling has been gaining interest among IS researchers in recent years because of its ability to model latent constructs under conditions of non-normality and with small to medium sample sizes (Chin et al., 2003). The empirical results of our study using PLS path modeling show that the hypothesized causal effects in the IT BSC are partially significant.
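The hypothesized causal chain (Future Orientation → Operational Excellence → User Orientation → Business Contribution) can be illustrated with a toy path-coefficient calculation. Note this is only a single-predictor correlation sketch on made-up composite scores, not the actual PLS algorithm, which iteratively estimates latent variable weights:

```python
def path_coefficient(x, y):
    """Standardized coefficient for a single-predictor path
    (reduces to Pearson's r between the two composite scores)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sx = sum((v - mx) ** 2 for v in x) ** 0.5
    sy = sum((v - my) ** 2 for v in y) ** 0.5
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / (sx * sy)

# Hypothetical composite scores per perspective (4 respondents)
future = [3.0, 4.0, 2.0, 5.0]
operational = [3.2, 4.1, 2.2, 4.9]
beta_future_operational = path_coefficient(future, operational)
```

A positive coefficient on each link of the chain would be consistent with the study's hypotheses; real PLS path modeling additionally bootstraps significance.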

A Study on Intelligent Value Chain Network System based on Firms' Information (기업정보 기반 지능형 밸류체인 네트워크 시스템에 관한 연구)

  • Sung, Tae-Eung;Kim, Kang-Hoe;Moon, Young-Su;Lee, Ho-Shin
    • Journal of Intelligence and Information Systems
    • /
    • v.24 no.3
    • /
    • pp.67-88
    • /
    • 2018
  • Until recently, as the significance of sustainable growth and the competitiveness of small and medium-sized enterprises (SMEs) has been recognized, governmental support has mainly been provided for tangible resources such as R&D, manpower, funds, etc. However, it is also true that the inefficiency of support systems, such as underestimated or redundant support, has been raised because there exist conflicting policies in terms of the appropriateness, effectiveness, and efficiency of business support. From the perspective of the government or a company, we believe that, due to the limited resources of SMEs, technology development and capacity enhancement through collaboration with external sources is the basis for creating competitive advantage, and we also emphasize value creation activities for it. This is why value chain network analysis is necessary to analyze inter-company deal relationships across a series of value chains and to visualize the results by establishing knowledge ecosystems at the corporate level. There exist the Technology Opportunity Discovery (TOD) system, which provides information on the relevant products or technology status of companies with patents through searches by patent, product, or company name, and CRETOP and KISLINE, which both allow users to view company (financial) information and credit information; however, there exists no online system that provides a list of similar (competitive) companies based on value chain network analysis, or information on potential clients or demanders with whom business deals could be made in the future. Therefore, we focus on the "Value Chain Network System (VCNS)", a support partner for planning corporate business strategy developed and managed by KISTI, and investigate the types of embedded network-based analysis modules, the databases (D/Bs) that support them, and how to utilize the system efficiently.
Further, we explore the network visualization function in the intelligent value chain analysis system, which becomes the core information for understanding industrial structure and for developing a company's new products. In order for a company to have competitive superiority over other companies, it is necessary to identify who the competitors are in terms of patents or products currently being produced, and searching for similar companies or competitors by industry type is the key to securing competitiveness in the commercialization stage of the target company. In addition, transaction information, which reflects business activity between companies, plays an important role in providing information on potential customers when both parties enter similar fields. Identifying a competitor at the enterprise or industry level by using a network map based on such inter-company sales information can be implemented as a core module of value chain analysis. The Value Chain Network System (VCNS) combines the concepts of value chain and industrial structure analysis with the corporate information collected to date, so that it can grasp not only the market competition situation of individual companies but also the value chain relationships of a specific industry. In particular, it can be useful as a corporate-level information analysis tool for tasks such as identification of industry structure, identification of competitor trends, analysis of competitors, locating suppliers (sellers) and demanders (buyers), industry trends by item, finding promising items, finding new entrants, finding core companies and items by value chain, and recognizing the patents of the corresponding companies, etc.
In addition, based on the objectivity and reliability of the analysis results from transaction deal information and financial data, the value chain network system is expected to be utilized for various purposes such as information support for business evaluation, R&D decision support, and mid- or short-term demand forecasting, in particular for more than 15,000 member companies in Korea, employees in R&D service sectors, government-funded research institutes, and public organizations. In order to strengthen the business competitiveness of companies, technology, patent, and market information has so far been provided mainly by government agencies and private research-and-development service companies. This service has been presented in the frames of patent analysis (mainly for rating and quantitative analysis) or market analysis (for market prediction and demand forecasting based on market reports). However, there was a limitation in solving the lack of information, which is one of the difficulties that firms in Korea often face in the commercialization stage. In particular, it is much more difficult to obtain information about competitors and potential candidates. In this study, the real-time value chain analysis and visualization service module, based on the proposed network map and the data at hand, is presented together with the expected market share, estimated sales volume, and contact information (which implies potential suppliers for raw materials/parts and potential demanders for complete products/modules). In future research, we intend to carry out in-depth research to further investigate the indices of competitive factors through the participation of research subjects, to newly develop competitive indices for competitors or substitute items, and to additionally apply data mining techniques and algorithms for improving the performance of VCNS.
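One core idea of such a network map, locating similar (competitive) companies from inter-company deal records, can be illustrated with a naive shared-buyer count. The VCNS modules are certainly more elaborate; treat this purely as a toy sketch on made-up deal data:

```python
from collections import defaultdict

def competitors(deals):
    """From (seller, buyer) deal records, score every seller pair by
    the number of buyers they share - a naive proxy for 'similar
    company' in a value chain network."""
    buyers = defaultdict(set)
    for seller, buyer in deals:
        buyers[seller].add(buyer)
    scores = {}
    sellers = sorted(buyers)
    for i, a in enumerate(sellers):
        for b in sellers[i + 1:]:
            shared = len(buyers[a] & buyers[b])
            if shared:
                scores[(a, b)] = shared
    return scores

# A and B both sell to X and Y, so they look like competitors
scores = competitors([("A", "X"), ("A", "Y"),
                      ("B", "X"), ("B", "Y"),
                      ("C", "Z")])
```

A real module would weight edges by transaction volume and restrict comparisons to the same industry segment.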

Recent Changes in Bloom Dates of Robinia pseudoacacia and Bloom Date Predictions Using a Process-Based Model in South Korea (최근 12년간 아까시나무 만개일의 변화와 과정기반모형을 활용한 지역별 만개일 예측)

  • Kim, Sukyung;Kim, Tae Kyung;Yoon, Sukhee;Jang, Keunchang;Lim, Hyemin;Lee, Wi Young;Won, Myoungsoo;Lim, Jong-Hwan;Kim, Hyun Seok
    • Journal of Korean Society of Forest Science
    • /
    • v.110 no.3
    • /
    • pp.322-340
    • /
    • 2021
  • Due to climate change and the consequential rise in spring temperatures, the flowering time of Robinia pseudoacacia has advanced, and a simultaneous blooming phenomenon has occurred in different regions of South Korea. These changes in flowering time have become a major crisis for the domestic beekeeping industry, and the demand for accurate prediction of the flowering time of R. pseudoacacia is increasing. In this study, we developed and compared the performance of four different models predicting the flowering time of R. pseudoacacia for the entire country: a Single Model for the country (SM), a Modified Single Model (MSM) using correction factors derived from SM, a Group Model (GM) estimating parameters for each region, and a Local Model (LM) estimating parameters for each site. To achieve this goal, bloom date data observed at 26 points across the country over the past 12 years (2006-2017) and daily temperature data were used. As a result, bloom dates in the north central region, where the spring temperature increase was more than two-fold higher than in southern regions, have advanced, and the difference compared with the southwest region decreased by 0.7098 days per year (p-value=0.0417). Model comparisons showed that MSM and LM performed better than the other models, with 24% and 15% lower RMSE than SM, respectively. Furthermore, validation with 16 additional sites over 4 years revealed that co-kriging of LM showed better performance than expansion of MSM to the entire nation (RMSE: p-value=0.0118, Bias: p-value=0.0471). This study improved predictions of bloom dates for R. pseudoacacia and proposed methods for reliable expansion to the entire nation.
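Process-based phenology models of the kind compared above are typically thermal-time (degree-day) formulations: bloom is forecast once accumulated forcing above a base temperature reaches a critical sum. The sketch below uses illustrative parameter values (`t_base`, `f_crit` are placeholders, not the paper's fitted parameters):

```python
def predict_bloom_day(daily_temps, t_base=5.0, f_crit=150.0):
    """Thermal-time sketch: bloom is forecast on the first day the
    accumulated forcing (degree-days above t_base) reaches f_crit.
    Returns the 1-based day index, or None if never reached."""
    forcing = 0.0
    for day, t in enumerate(daily_temps, start=1):
        forcing += max(0.0, t - t_base)
        if forcing >= f_crit:
            return day
    return None

# A warmer spring (higher daily temps) advances the predicted day
cool_spring = predict_bloom_day([8.0] * 120)
warm_spring = predict_bloom_day([11.0] * 120)
```

Site- or region-specific parameter fitting (as in the paper's GM and LM variants) amounts to estimating `t_base` and `f_crit` per group from observed bloom dates.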

Supercomputing Performance Demand Forecasting Using Cross-sectional and Time Series Analysis (횡단면분석과 추세분석을 이용한 슈퍼컴퓨팅 성능수요 예측)

  • Park, Manhee
    • Journal of Technology Innovation
    • /
    • v.23 no.2
    • /
    • pp.33-54
    • /
    • 2015
  • Supercomputing performance demand forecasting at the national level provides important information to researchers in the computational science field, to the specialized agencies that establish and operate R&D infrastructure, and to the government agencies that establish science and technology infrastructure. This study derived the factors affecting scientific and technological capability through an analysis of supercomputing performance prediction research, and it proposed a hybrid forecasting model that applies supercomputer technology trends. In the cross-sectional analysis, multiple regression analysis was performed using factors that could affect supercomputing performance: GDP, GERD, the number of researchers, and the number of SCI papers. In addition, supercomputing performance was predicted by multiplying the cross-sectional estimate by the technical progress rate over time, which was calculated by time series analysis using the performance (Rmax) of Top500 data. Korea's supercomputing performance scale in 2016 was predicted using the proposed forecasting model based on Top500 supercomputer data, and supercomputing performance demand in Korea was predicted using the cross-sectional analysis and the technical progress rate. The results of this study showed that the required supercomputing performance is expected to be 15~30 PF under the current trend, and 20~40 PF under the national-level target trend. These two results differ significantly from the forecast value of the regression analysis (9.6 PF) and that of the cross-sectional analysis (2.5 PF).
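The hybrid step, scaling a cross-sectional estimate by a time-series progress rate, can be sketched as follows. The progress rate comes from a log-linear least-squares fit to performance data (all numbers below are illustrative, not the study's Top500 figures):

```python
import math

def annual_progress_rate(years, rmax):
    """Fit log(performance) = a + b*year by least squares and return
    the implied annual technical progress rate exp(b) - 1."""
    n = len(years)
    mx = sum(years) / n
    my = sum(math.log(r) for r in rmax) / n
    num = sum((x - mx) * (math.log(r) - my) for x, r in zip(years, rmax))
    den = sum((x - mx) ** 2 for x in years)
    slope = num / den
    return math.exp(slope) - 1.0

def hybrid_forecast(cross_section_pf, rate, horizon_years):
    """Cross-sectional estimate compounded by the progress rate."""
    return cross_section_pf * (1.0 + rate) ** horizon_years

# Toy series doubling every year -> progress rate of 100%
rate = annual_progress_rate([0, 1, 2], [1.0, 2.0, 4.0])
forecast = hybrid_forecast(10.0, rate, 3)
```

The paper's two scenarios (current trend vs. national-level target trend) would correspond to fitting the rate on different subsets or targets of the Top500 data.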

Suggestion of a Modified Compression Index for Secondary Consolidation Using Nonlinear Elasto-Viscoplastic Models (비선형 점탄소성 모델을 이용한 2차압밀이 포함된 수정압축지수개발)

  • Choi, Bu-Sung;Im, Jong-Chul;Kwon, Jung-Keun
    • Proceedings of the Korean Geotechnical Society Conference
    • /
    • 2008.10a
    • /
    • pp.1115-1123
    • /
    • 2008
  • When constructing projects such as road embankments, bridge approaches, dikes, or buildings on soft, compressible soils, significant settlements may occur due to the consolidation of these soils under the superimposed loads. The compressibility of the soil skeleton of a soft clay is influenced by such factors as structure and fabric, stress path, temperature, and loading rate. Although it is possible to determine appropriate relations and the corresponding material parameters in the laboratory, it is well known that sample disturbance due to stress release, temperature change, and moisture content change can have a profound effect on the compressibility of a clay. The early research of Terzaghi and Casagrande has had a lasting influence on our interpretation of consolidation data. The 24-hour, incremental-load oedometer test has become, more or less, the standard procedure for determining the one-dimensional stress-strain behavior of clays. An important notion related to the interpretation of the data is the pre-consolidation pressure σp, which is located approximately at the break in the slope of the curve. From a practical point of view, this pressure is usually viewed as corresponding to the maximum past effective stress supported by the soil. Researchers have shown, however, that the value of σp depends on the test procedure. Furthermore, owing to sampling disturbance, the results of the laboratory consolidation test must be corrected to better capture the in-situ compressibility characteristics. The corrections apply, strictly speaking, to soils where the relation between strain and effective stress is time-independent. An important assumption in Terzaghi's one-dimensional theory of consolidation is that the soil skeleton behaves elastically. On the other hand, Buisman recognized that creep deformations can be important in settlement analysis.
This has led to extensions of Terzaghi's theory by various investigators, including the authors and their coworkers. The main object of this study is to suggest a modified compression index value for predicting settlements by back-calculating Cc from different numerical models that give the best settlement predictions for multi-layer deposits including very thick soft clay.
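The compression index Cc that the study back-calculates enters the standard one-dimensional primary consolidation settlement formula, shown here as a sketch (the paper's modified index additionally folds in secondary/creep compression, which this classical expression omits):

```python
import math

def primary_settlement(cc, e0, h, sigma0, sigma_f):
    """Classical 1-D primary consolidation settlement for a normally
    consolidated layer:
        s = Cc / (1 + e0) * H * log10(sigma_f / sigma_0)
    cc: compression index, e0: initial void ratio, h: layer thickness,
    sigma0/sigma_f: initial/final effective vertical stress."""
    return cc / (1.0 + e0) * h * math.log10(sigma_f / sigma0)

# Illustrative layer: Cc=0.4, e0=1.0, 10 m thick, stress 100 -> 1000 kPa
s = primary_settlement(0.4, 1.0, 10.0, 100.0, 1000.0)
```

A modified (larger) compression index, as proposed in the study, effectively inflates Cc so that this expression reproduces settlements that include the secondary component.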


A Study on Non-financial Factors Affecting the Insolvency of Social Enterprises (사회적기업의 부실에 영향을 미치는 비재무요인에 관한 연구 )

  • Chun, Yong-Chan;Kim, Hyeok;Lee, Dong-Myung
    • Journal of Industrial Convergence
    • /
    • v.21 no.11
    • /
    • pp.13-27
    • /
    • 2023
  • As the role of social enterprises in our economy grows, this study aims to contribute to reducing the failure rate and the social costs resulting from business failures by analyzing the factors that affect the insolvency of social enterprises. The data used in this study were drawn from social enterprises (including prospective social enterprises) that were established between 2009 and 2018 and had received credit guarantees from credit guarantee institutions as of the end of June 2022, classified into normal and insolvent companies. Among the collected data, 439 social enterprises with available financial information were targeted; 406 (92.5%) were normal enterprises and 33 (7.5%) were insolvent enterprises. Through a literature review, eight non-financial factors commonly used for insolvency prediction were selected. The cross-analysis results showed that four of these factors were significant. Logistic regression analysis revealed that two variables, the corporate credit rating and the personal credit rating of the representative, were significant. Financial factors such as the debt ratio, operating profit margin on sales, and total asset turnover were used as control variables. The empirical analysis confirmed that the two independent variables maintained their influence even after controlling for the financial factors. Given that government-led support and development policies have limitations, there is a need to shift the policy direction so that various companies aspiring to create social value can enter the social enterprise sector through private and regional initiatives. This would enable the social economy to create an environment in which local residents collaborate to realize social value, and the government should actively support this.
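The scoring step of a logistic insolvency model like the one above can be sketched as follows. The coefficients and intercept here are placeholders, not the paper's estimates, and the sign of each coefficient depends on how the credit ratings are coded:

```python
import math

def insolvency_probability(corp_rating, rep_rating, coef, intercept):
    """Logistic score from the two significant non-financial
    variables (corporate credit rating, representative's personal
    credit rating). coef is a (b1, b2) pair; all values illustrative."""
    z = intercept + coef[0] * corp_rating + coef[1] * rep_rating
    return 1.0 / (1.0 + math.exp(-z))

# With zero inputs the score sits at the intercept's baseline
p_baseline = insolvency_probability(0.0, 0.0, (0.5, 0.5), 0.0)
```

In the study, financial control variables (debt ratio, profit margin, turnover) would add further terms to `z`; the finding is that the two rating terms stay significant even then.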

The Height of Fall as a Predictor of Fatality of Fall (추락 후 사망 예측인자로서의 추락 높이)

  • Suh, Joo Hyun;Eo, Eun Kyung;Jung, Koo Young
    • Journal of Trauma and Injury
    • /
    • v.18 no.2
    • /
    • pp.101-106
    • /
    • 2005
  • Purpose: The number of deaths from falls from height is increasing nowadays. Such falls have become a serious social problem in that even survivors may suffer from cord injury, brain injury, and so on. We analyzed cases of fall patients to find out whether injury severity is mainly correlated with the height of the fall. Methods: We retrospectively investigated the characteristics of patients who fell from heights above 2 m from January 2000 to August 2004. We excluded patients who were transferred to or from other hospitals and those whose height of fall was unknown. A total of 145 patients were evaluated. The variables included in the data analysis were age, height of fall, injury severity score (ISS), the presence of a barrier, and survival. To find the correlation between the height of fall and death, we used receiver operating characteristic (ROC) curve analysis. Results: The mean age of the patients was 36.5 ± 19.4 years; 110 were male and 35 were female. The mean height of fall was 11.1 ± 8.5 m. Fifty-one patients (35.2%) died, and 30 of them (58.9%) arrived at the emergency room already deceased. The mean height of fall was 8.9 ± 5.8 m for the 94 survivors and 15.2 ± 11.0 m for the 51 deceased (p<0.001). The area under the ROC curve was 0.646, which means the height of fall was not an adequate factor for predicting death. At a cutoff value of 13.5 m, sensitivity was 52.9%, specificity 86.2%, positive predictive value 67.5%, and negative predictive value 77.1%. There were statistically significant differences in mortality rate and ISS between the 'below 13.5 m' and 'above 13.5 m' groups, but there was no statistically significant difference in head and neck AIS. Conclusion: The height of fall is not an adequate factor for the prediction of death. Therefore, other factors, such as intoxication and the presence of a barrier or protective device, need to be evaluated to predict death in fall patients.
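The reported area under the ROC curve (0.646) has a direct empirical reading: the probability that a randomly chosen deceased patient fell from a greater height than a randomly chosen survivor, counting ties as half. A sketch of that computation (the heights below are made-up numbers, not the study's data):

```python
def auc(pos, neg):
    """Empirical AUC: P(value of a positive case > value of a
    negative case), with ties counted as 0.5. pos = values for the
    outcome group (e.g. deceased), neg = values for the rest."""
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Perfectly separated toy groups give AUC = 1.0
example = auc([14.0, 18.0], [6.0, 9.0])
```

An AUC of 0.646, as in the study, sits much closer to the 0.5 of a useless predictor than to 1.0, which is why the authors judge fall height inadequate on its own.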

A numerical study of the effects of the ventilation velocity on the thermal characteristics in underground utility tunnel (지하공동구 터널내 풍속 변화에 따른 열특성에 관한 수치 해석적 연구)

  • Yoo, Ji-Oh;Kim, Jin-Su;Ra, Kwang-Hoon
    • Journal of Korean Tunnelling and Underground Space Association
    • /
    • v.19 no.1
    • /
    • pp.29-39
    • /
    • 2017
  • In this research, thermal design data, such as the heat transfer coefficient on the wall surface required for designing a ventilation system to prevent temperature rise in an underground utility tunnel whose three sides adjoin the ground, were investigated by numerical analysis. The numerical model includes the tunnel lining of the underground utility tunnel in order to account for heat transfer in the tunnel walls. The air temperature in the tunnel, the wall temperature, and the heat flow through the wall were calculated by CFD simulation based on the heating value (117~468 kW/km) of the power cables installed in the tunnel and the wind speed in the tunnel (0.5~4.0 m/s). In addition, the wall heat transfer coefficient was computed from the analysis results, and the limit distance over which the air temperature in the tunnel remains stable was examined. The convective heat transfer coefficient at the wall surface shows an unstable pattern in the inlet area; however, it converges to a constant value beyond approximately 100 meters. The tunnel wall heat transfer coefficient is 3.1~9.16 W/m²·°C depending on the wind speed, and the corresponding dimensionless correlation is Nu = 1.081 Re^0.4927 (μ/μ_w)^0.14. This study has suggested a prediction model for the temperature in the tunnel based on the thermal resistance analysis technique, and the model is appraised to be usable within an estimated deviation of about 3%.
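The reported correlation can be evaluated directly to recover a wall heat transfer coefficient via h = Nu·k/D. The air conductivity and hydraulic diameter below are illustrative placeholders, and the viscosity ratio is taken as ~1 for near-wall air:

```python
def nusselt(re, mu_ratio=1.0):
    """The paper's correlation: Nu = 1.081 Re^0.4927 (mu/mu_w)^0.14."""
    return 1.081 * re ** 0.4927 * mu_ratio ** 0.14

def wall_h(re, k_air, d_h, mu_ratio=1.0):
    """Wall heat transfer coefficient from Nu: h = Nu * k / D_h.
    k_air [W/m.K] and d_h [m] are illustrative inputs here."""
    return nusselt(re, mu_ratio) * k_air / d_h

# Higher wind speed -> higher Re -> higher h, as the abstract reports
h_slow = wall_h(5.0e4, k_air=0.026, d_h=2.0)
h_fast = wall_h(4.0e5, k_air=0.026, d_h=2.0)
```

The roughly Re^0.49 exponent means h grows a bit slower than the square root of wind speed, consistent with the modest 3.1~9.16 W/m²·°C range over an 8× speed span.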

The study on estimated breeding value and accuracy for economic traits in Gyoungnam Hanwoo cow (Korean cattle)

  • Kim, Eun Ho;Kim, Hyeon Kwon;Sun, Du Won;Kang, Ho Chan;Lee, Doo Ho;Lee, Seung Hwan;Lee, Jae Bong;Lim, Hyun Tae
    • Journal of Animal Science and Technology
    • /
    • v.62 no.4
    • /
    • pp.429-437
    • /
    • 2020
  • This study was conducted to construct basic data for the selection of elite cows by analyzing the estimated breeding value (EBV) and its accuracy using the pedigree of Hanwoo cows in Gyeongnam. The phenotypic traits used in the analysis are carcass weight (CWT), eye muscle area (EMA), backfat thickness (BFT), and marbling score (MS). The pedigrees of the test group and the reference group were collected to build a pedigree structure and a numerator relationship matrix (NRM). The EBV, genetic parameters, and accuracy were estimated by applying the NRM to the best linear unbiased prediction (BLUP) multiple-trait animal model of the BLUPF90 program. Looking at the pedigree structure of the test group, there were a total of 2,371 cows born between 2003 and 2009; of these, 603 cows had basic registration (25%), 562 cows had pedigree registration (24%), and 1,206 cows had advanced registration (51%). The proportion of pedigree-registered cows was relatively low, but it gradually increased, reaching 20,847 cows (68%) between 2010 and 2017. Looking at the change in the EBV, the CWT improved from 4.992 kg to 9.885 kg, the EMA from 0.970 ㎠ to 2.466 ㎠, the BFT from -0.186 mm to -0.357 mm, and the MS from 0.328 to 0.559 points. As a result of genetic parameter estimation, the heritabilities of CWT, EMA, BFT, and MS were 0.587, 0.416, 0.476, and 0.571, respectively, and their accuracies were estimated to be 0.559, 0.551, 0.554, and 0.558, respectively. Selection of genetically superior animals and efficient improvement would be possible if cow performance verification were implemented using the accurate pedigree of each individual on the farms.
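Two quantities reported above, heritability and EBV accuracy, follow standard variance-component relations in animal breeding; a minimal sketch with illustrative (not the study's) variance values:

```python
import math

def heritability(var_additive, var_residual):
    """Narrow-sense heritability: h^2 = Va / (Va + Ve)."""
    return var_additive / (var_additive + var_residual)

def ebv_accuracy(pev, var_additive):
    """Accuracy of an EBV from its prediction error variance:
    r = sqrt(1 - PEV / Va); PEV comes from the BLUP equations."""
    return math.sqrt(max(0.0, 1.0 - pev / var_additive))

# Illustrative numbers: Va=3, Ve=2 -> h^2=0.6; PEV=0.75, Va=1 -> r=0.5
h2 = heritability(3.0, 2.0)
r = ebv_accuracy(0.75, 1.0)
```

In BLUPF90 the PEV term is obtained from the inverse of the mixed-model equations built on the numerator relationship matrix; deeper, more complete pedigrees shrink PEV and so raise the accuracy, which is the practical argument the abstract closes with.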