• Title/Summary/Keyword: combining forecasting


Forecasting Cryptocurrency Prices in COVID-19 Phase: Convergence Study on Naver Trends and Deep Learning (COVID-19 국면의 암호화폐 가격 예측: 네이버트렌드와 딥러닝의 융합 연구)

  • Kim, Sun-Woong
    • Journal of Convergence for Information Technology / v.12 no.3 / pp.116-125 / 2022
  • This study analyzes whether investor anxiety caused by COVID-19 affected cryptocurrency prices during the pandemic, and experiments with cryptocurrency price prediction using deep learning models. Investor anxiety is measured by combining Naver's Corona search index with confirmed-case data; the study then analyzes Granger causality with cryptocurrency prices and predicts those prices using deep learning models. The experimental results are as follows. First, the CCI indicator showed significant Granger causality for the returns of Bitcoin, Ethereum, and Litecoin. Second, an LSTM with CCI as an input variable showed high predictive performance. Third, among the cryptocurrencies compared, price prediction performance was highest for Bitcoin. This study is academically significant as the first attempt to analyze the relationship between Naver's Corona search information and cryptocurrency prices during the COVID-19 phase. Future studies should extend the analysis to a wider range of deep learning models to improve price prediction accuracy.

Instruction Fine-tuning and LoRA Combined Approach for Optimizing Large Language Models (대규모 언어 모델의 최적화를 위한 지시형 미세 조정과 LoRA 결합 접근법)

  • Sang-Gook Kim;Kyungran Noh;Hyuk Hahn;Boong Kee Choi
    • Journal of Korean Society of Industrial and Systems Engineering / v.47 no.2 / pp.134-146 / 2024
  • This study introduces and experimentally validates a novel approach that combines instruction fine-tuning and Low-Rank Adaptation (LoRA) fine-tuning to optimize the performance of Large Language Models (LLMs). LLMs have become revolutionary tools in natural language processing, showing remarkable performance across diverse application areas. However, optimizing their performance for specific domains requires fine-tuning the foundation models (FMs), which is often limited by challenges such as data complexity and resource costs. The proposed approach aims to overcome these limitations, particularly by improving the precision and efficiency of LLM-based analysis of national Research and Development (R&D) data. The study provides the theoretical foundations and technical implementations of instruction fine-tuning and LoRA fine-tuning. Experimental validation demonstrates that the proposed method significantly improves the precision and efficiency of data analysis, outperforming traditional fine-tuning methods. This improvement benefits national R&D data analysis and suggests potential applicability in other data-centric domains, such as medical data analysis, financial forecasting, and educational assessment. The findings highlight the method's broad utility and its contribution to advancing data analysis in specialized knowledge domains, and point to more efficient and effective use of LLMs in complex, resource-intensive tasks in both academic and industrial settings.
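
To make the low-rank idea concrete, the sketch below shows the general LoRA formulation, in which a frozen weight matrix W0 is augmented by a trainable product B·A of rank r, scaled by alpha / r. This is the standard formulation from the LoRA literature, not the specific implementation used in the paper; the function names and the example dimensions are assumptions chosen for illustration.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def lora_forward(w0, a, b, x, alpha):
    """Compute y = W0 x + (alpha / r) * B (A x).

    w0: frozen pretrained weights, shape (d_out, d_in)
    a:  trainable matrix, shape (r, d_in)
    b:  trainable matrix, shape (d_out, r)
    """
    r = len(a)  # the rank is the number of rows of A
    base = matvec(w0, x)
    update = matvec(b, matvec(a, x))
    return [y + (alpha / r) * u for y, u in zip(base, update)]


def param_counts(d_out, d_in, r):
    """Return (parameters in the full matrix, parameters in the LoRA update)."""
    return d_out * d_in, r * (d_in + d_out)


if __name__ == "__main__":
    # Hypothetical layer size; rank 8 trains far fewer parameters.
    full, lora = param_counts(4096, 4096, 8)
    print(full, lora)
```

When B is initialized to zero, the adapted layer initially reproduces the frozen layer exactly, which is why fine-tuning can start from the pretrained behavior.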

Online news-based stock price forecasting considering homogeneity in the industrial sector (산업군 내 동질성을 고려한 온라인 뉴스 기반 주가예측)

  • Seong, Nohyoon;Nam, Kihwan
    • Journal of Intelligence and Information Systems / v.24 no.2 / pp.1-19 / 2018
  • Because forecasting stock movements is important both academically and practically, stock price prediction has been studied actively. Stock price forecasting research can be classified by whether it uses structured or unstructured data and, in more detail, into technical analysis, fundamental analysis, and media effect analysis. In the big data era, research that applies big data to stock price prediction is actively underway, and it focuses mainly on machine learning techniques. Methods that incorporate media effects have recently attracted particular attention, and among them studies that analyze online news to forecast stock prices are becoming mainstream. Previous studies that predict stock prices from online news mostly perform sentiment analysis of the news, building a separate corpus for each company and a dictionary that predicts stock prices from recorded responses to past stock price movements. Existing studies have therefore examined the impact of online news on individual companies. For example, stock movements of Samsung Electronics are predicted using only online news about Samsung Electronics. Methods that consider influences among highly related companies have also been studied recently. For example, stock movements of Samsung Electronics are predicted using news about Samsung Electronics and a highly related company such as LG Electronics. These previous studies examine the effect of news from a homogeneous industrial sector on an individual company, and they classify homogeneous industries according to the Global Industry Classification Standard (GICS). In other words, they assume that the industries defined by GICS are homogeneous.
However, existing studies are limited in that they do not take into account influential, highly relevant companies or the heterogeneity that exists within a single GICS sector. Our examination of various sectors shows that some industrial sectors are not homogeneous groups. To overcome this limitation, our study proposes a methodology that applies k-means clustering to reflect the heterogeneous effects within an industrial sector on stock prices. Multiple Kernel Learning, which is mainly used to integrate data with different characteristics, has several kernels, each of which receives and predicts from different data. We used Multiple Kernel Learning to incorporate the effects of the target firm and its relevant firms simultaneously. Each kernel was assigned to predict stock prices from financial news variables of the industrial group, divided into the target firm and the clusters produced by k-means cluster analysis. To validate the proposed methodology, experiments were conducted on three years of online news and stock prices. The results are as follows. (1) Information about industrial sectors related to the target company also contains meaningful information for predicting the target company's stock movements, and the machine learning algorithm has better predictive power when news about relevant companies is considered together with the target company's news. (2) It is important to vary the number of clusters according to the level of homogeneity in the industrial sector when predicting stock movements. In other words, when stock prices within an industrial sector are homogeneous, it is important to use the relational effect at the level of the industry group without clustering, or to use a small number of clusters.
When stock prices within an industry group are heterogeneous, it is important to cluster the firms into groups. This study contributes by showing that firms classified under the same GICS sector are heterogeneous, and by suggesting that relevance should be defined through machine learning and statistical analysis rather than simply by the GICS classification. It also contributes by demonstrating the effectiveness of a prediction model that reflects heterogeneity.
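
The clustering step described above can be sketched with a plain k-means loop. This is a minimal illustration only: the abstract does not say which firm features are clustered, so the two-dimensional points, the function names, and the simple "first k points" initialization are all assumptions, and the Multiple Kernel Learning stage is not shown.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def nearest(point, centroids):
    """Index of the centroid closest to the point."""
    return min(range(len(centroids)), key=lambda c: sq_dist(point, centroids[c]))


def kmeans(points, k, iters=20):
    """Lloyd's algorithm with a deterministic (first k points) initialization."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: attach each point to its nearest centroid.
        labels = [nearest(p, centroids) for p in points]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


if __name__ == "__main__":
    # Hypothetical per-firm feature vectors within one sector.
    firms = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
    labels, centroids = kmeans(firms, k=2)
    print(labels, centroids)
```

Because k-means is sensitive to initialization, a practical version would typically use several random restarts or a seeding scheme such as k-means++ rather than the fixed initialization used here.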

A Study on Chaff Echo Detection using AdaBoost Algorithm and Radar Data (AdaBoost 알고리즘과 레이더 데이터를 이용한 채프에코 식별에 관한 연구)

  • Lee, Hansoo;Kim, Jonggeun;Yu, Jungwon;Jeong, Yeongsang;Kim, Sungshin
    • Journal of the Korean Institute of Intelligent Systems / v.23 no.6 / pp.545-550 / 2013
  • In the field of pattern recognition, data classification is an essential process for extracting meaningful information from data. The adaptive boosting algorithm, known as AdaBoost, is an improved boosting algorithm suited to real data analysis. It consists of weak classifiers, such as random guessing or random forest, whose accuracy is only slightly better than 50%, and weights for combining them; a strong classifier is built from the weak classifiers and the weights. In this paper, the AdaBoost algorithm is used to detect chaff echo, which has characteristics similar to precipitation echo and interferes with weather forecasting. The process of implementing the chaff echo classifier starts with spatial and temporal clustering of weather radar data based on similarity. From the clusters, a training data set separating chaff echo from non-chaff echo is prepared, and the AdaBoost classifier is generated from it. To verify the classifier, it is applied to an actual chaff echo case, confirming that it can distinguish chaff echo efficiently.
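
As a rough illustration of how AdaBoost combines weighted weak classifiers into a strong one, the sketch below trains threshold "stumps" on a toy one-dimensional data set. The stump learner, the function names, and the toy data are assumptions made for illustration; they are not the radar features or the implementation used in the paper.

```python
import math


def stump_predict(x, threshold, polarity):
    """Weak classifier: returns polarity if x >= threshold, else -polarity."""
    return polarity if x >= threshold else -polarity


def train_stump(xs, ys, weights):
    """Return (weighted error, threshold, polarity) of the best stump."""
    best = None
    for threshold in sorted(set(xs)):
        for polarity in (1, -1):
            err = sum(
                w
                for w, x, y in zip(weights, xs, ys)
                if stump_predict(x, threshold, polarity) != y
            )
            if best is None or err < best[0]:
                best = (err, threshold, polarity)
    return best


def adaboost_fit(xs, ys, rounds=3):
    """Train an ensemble of (alpha, threshold, polarity) triples."""
    n = len(xs)
    weights = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        err, threshold, polarity = train_stump(xs, ys, weights)
        err = min(max(err, 1e-10), 1 - 1e-10)  # guard against log(0)
        alpha = 0.5 * math.log((1 - err) / err)  # weight of this weak classifier
        ensemble.append((alpha, threshold, polarity))
        # Increase the weight of misclassified samples, then renormalize.
        weights = [
            w * math.exp(-alpha * y * stump_predict(x, threshold, polarity))
            for w, x, y in zip(weights, xs, ys)
        ]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def adaboost_predict(ensemble, x):
    """Strong classifier: sign of the alpha-weighted vote of the stumps."""
    score = sum(a * stump_predict(x, t, p) for a, t, p in ensemble)
    return 1 if score >= 0 else -1


if __name__ == "__main__":
    xs = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
    ys = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]  # no single threshold separates these
    model = adaboost_fit(xs, ys, rounds=3)
    print([adaboost_predict(model, x) for x in xs])
```

On this toy data, no single stump classifies every point correctly, but the weighted vote of three stumps does, which is the behavior the abstract describes.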

Scenario Planning based on Collective Intelligence Using Wiki (위키를 활용한 집단지성 기반의 시나리오 플래닝)

  • Han, Jongmin;Yim, Hyun;Lee, Jae-Shin
    • Journal of Technology Innovation / v.20 no.2 / pp.29-48 / 2012
  • As the complexity and uncertainty of social and economic systems increase, strategic foresight that responds actively and effectively to environmental change becomes important. A wide range of future forecasting methods are available for strategic foresight, and the choice among them depends on factors such as the time and financial resources available and the objectives of the exercise. Although trend extrapolation has been used for many years, scenario planning has recently become widely used by governments and corporations as a tool for strategic decision making. Scenario planning is generally carried out through workshops in which experts with diverse backgrounds exchange information, views, and insights and integrate their diverse viewpoints. However, only a small number of experts can participate in a workshop, and citizens' opinions are not easily carried into policy through the scenario exercise because of limited budgets and short project durations. It is also much harder to develop creative ideas in a workshop because of its limited time and space. In this study, a new scenario process that combines the scenario workshop with a wiki is proposed to overcome the limitations of the scenario workshop. This combined approach can be more productive than using the scenario workshop alone when developing new ideas. We applied the combined approach to develop scenarios for the strategic foresight of future media and present suggestions for improving the process.


Stock prediction using combination of BERT sentiment Analysis and Macro economy index

  • Jang, Euna;Choi, HoeRyeon;Lee, HongChul
    • Journal of the Korea Society of Computer and Information / v.25 no.5 / pp.47-56 / 2020
  • The stock index is used not only as an economic indicator for a country but also as a basis for investment decisions, which is why research on predicting the stock index is ongoing. Predicting the stock price index involves technical, fundamental, and psychological factors, and complex factors must be considered to achieve prediction accuracy. It is therefore necessary to study models that predict the stock price index by selecting and reflecting the technical and auxiliary factors that affect stock price fluctuations. Most existing studies use either news information or macroeconomic indicators that drive market fluctuations, or reflect only a few combinations of indicators. In this paper, we propose an effective combination of news sentiment analysis and various macroeconomic indicators for predicting the US Dow Jones Index (DJI). After crawling more than 93,000 business news articles from the New York Times over two years, we combined the sentiment results obtained with the latest natural language processing techniques, BERT and NLTK, with five macroeconomic indicators, gold prices, oil prices, and five foreign exchange rates affecting the US economy, and applied the combination to the LSTM prediction algorithm, which is known to be the most suitable for combining numeric and text information. Experiments with various combinations showed that the combination of DJI, NLTK, BERT, OIL, GOLD, and EURUSD yielded the smallest MSE in DJI prediction.

Parallel Flood Inundation Analysis using MPI Technique (MPI 기법을 이용한 병렬 홍수침수해석)

  • Park, Jae Hong
    • Journal of Korea Water Resources Association / v.47 no.11 / pp.1051-1060 / 2014
  • This study attempts to improve computational performance by combining MPI (Message Passing Interface), a standard model for parallel programming in distributed-memory environments, with the DHM (Diffusion Hydrodynamic Model), an inundation analysis model. The parallelized inundation model was compared with the existing calculation method on complicated problems that require long computing times. In addition, the study sought to demonstrate the model's ability to estimate inundation extent and depth and to speed up the computation of flooding in protected lowlands, and to validate the applicability of the parallel model to actual flood analysis by simulating various inundation scenarios. To verify the model, it was applied to a hypothetical two-dimensional protected lowland and to a real flooding case. The results show that, at the same accuracy, calculation time improved by up to about 41% to 48% when using multiple cores compared with a single core. The parallel flood analysis model can be used to calculate flooding water depth, flooded area, propagation speed of flood waves, and so on with a shorter runtime on multiple cores, and is expected to be used for prompt real-time flood forecasting and for drawing flood risk maps.

Predicting the Number of Confirmed COVID-19 Cases Using Deep Learning Models with Search Term Frequency Data (검색어 빈도 데이터를 반영한 코로나 19 확진자수 예측 딥러닝 모델)

  • Sungwook Jung
    • KIPS Transactions on Software and Data Engineering / v.12 no.9 / pp.387-398 / 2023
  • The COVID-19 outbreak has significantly affected human lifestyles and behavior patterns. Because COVID-19 spreads through the air via droplets and aerosols, people were advised to avoid face-to-face contact and crowded indoor places as much as possible. It is therefore reasonable to expect that a person who has been in contact with a COVID-19 patient, or who was at a place where a COVID-19 case occurred, and who is concerned about having been infected will search for COVID-19 symptoms on Google. In this study, an exploratory data analysis using deep learning models (DNN and LSTM) was conducted to see whether the number of confirmed COVID-19 cases could be predicted by revisiting Google Trends, which played a major role in the surveillance and management of influenza, and combining it with data on the number of confirmed COVID-19 cases. Notably, the search term frequency data used in this study are publicly available and do not invade privacy. When the deep neural network model was applied, Seoul (9.6 million), the most populous city in South Korea, and Busan (3.4 million), the second most populous, recorded lower error rates when search term frequency data were included in the forecast. These results indicate that search term frequency data play an important role in cities with a population above a certain size. We also hope that these predictions can be used as evidence for policy decisions, such as easing restrictions or implementing stronger preventive measures.

A Spatial Projection of Demand for Green Infrastructure and Its Application to GeoDesign - Evidence-Based Design for Urban Resilience - (융합도시모델링을 통한 그린인프라 수요 예측 및 지오디자인 적용 - 도시 레질리언스를 위한 근거 기반 디자인 -)

  • Kwak, Yoonshin
    • Journal of the Korean Institute of Landscape Architecture / v.51 no.5 / pp.30-43 / 2023
  • Green infrastructure (GI) is considered a key strategy in establishing sustainable communities. However, research on GI from the perspective of urban system dynamics and resilience lacks depth, as does its integration with physical design. This research addresses two primary causes of this shortfall. First, there is a gap in methods between existing GI planning, which considers static variables, and urban modeling research, which addresses dynamic variables. Second, there is a gap in information between landscape design and urban modeling research. To address these issues, this study proposes an integrated modeling approach that takes design decision-making into consideration. By combining the LEAM model and the MCDA model, this study evaluates the relationship between GI services and socioeconomic growth while spatially forecasting the geographies of GI demand in 2050. The resulting information reveals a potential degradation in ecosystem services across the region due to Chicago's suburbanization. This indicates that there would be a spatial shift in GI demand, emphasizing the need for comprehensive, dynamic GI strategies. This study further discusses the applications of evidence-based design in a studio environment. It aims to contribute to the GeoDesign literature on creating a more resilient urban environment by facilitating efficient evidence-based decision-making.

Monthly temperature forecasting using large-scale climate teleconnections and multiple regression models (대규모 기후 원격상관성 및 다중회귀모형을 이용한 월 평균기온 예측)

  • Kim, Chul-Gyum;Lee, Jeongwoo;Lee, Jeong Eun;Kim, Nam Won;Kim, Hyeonjun
    • Journal of Korea Water Resources Association / v.54 no.9 / pp.731-745 / 2021
  • In this study, the monthly temperature of the Han River basin was predicted by statistical multiple regression models that use global climate indices and weather data of the target region as predictors. The optimal predictors were selected through teleconnection analysis between the monthly temperature and the preceding patterns of each climate index, and forecast models capable of predicting up to 12 months in advance were constructed by combining the selected predictors and cross-validating over the past period. For each target month, 1000 optimized models were derived and forecast ranges were presented. An analysis of the predictability of monthly temperature from January 1992 to December 2020 gave a PBIAS of -1.4 to -0.7%, an RSR of 0.15 to 0.16, an NSE of 0.98, and an r of 0.99, indicating a high goodness-of-fit. The probability that a monthly observation fell within the forecast range was about 64.4% on average; by month, predictability was relatively high in September, December, February, and January, and low in April, August, and March. The predicted range and median were in good agreement with the observations, except for some periods when the temperature was dramatically lower or higher than in normal years. The quantitative temperature forecast information derived from this study will be useful not only for forecasting temperature changes 1 to 12 months in advance, but also for predicting changes in the hydro-ecological environment, including evapotranspiration, which is highly correlated with temperature.
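
The goodness-of-fit statistics quoted above can be computed as in the sketch below. It assumes the commonly used definitions: PBIAS as 100 times the sum of (observed minus simulated) divided by the sum of observed values, NSE as one minus the ratio of squared errors to the variance of the observations, RSR as RMSE divided by the standard deviation of the observations, and r as the Pearson correlation. The abstract does not state its formulas, so the paper may use a different sign convention for PBIAS; the function names and sample values are illustrative only.

```python
import math


def pbias(obs, sim):
    """Percent bias (assumed convention: positive means underestimation)."""
    return 100.0 * sum(o - s for o, s in zip(obs, sim)) / sum(obs)


def _sums_of_squares(obs, sim):
    """Return (sum of squared errors, sum of squared deviations of obs)."""
    mean_obs = sum(obs) / len(obs)
    sse = sum((o - s) ** 2 for o, s in zip(obs, sim))
    sst = sum((o - mean_obs) ** 2 for o in obs)
    return sse, sst


def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit."""
    sse, sst = _sums_of_squares(obs, sim)
    return 1.0 - sse / sst


def rsr(obs, sim):
    """RMSE divided by the standard deviation of the observations."""
    sse, sst = _sums_of_squares(obs, sim)
    return math.sqrt(sse) / math.sqrt(sst)


def pearson_r(obs, sim):
    """Pearson correlation coefficient between observed and simulated values."""
    mean_o = sum(obs) / len(obs)
    mean_s = sum(sim) / len(sim)
    cov = sum((o - mean_o) * (s - mean_s) for o, s in zip(obs, sim))
    var_o = sum((o - mean_o) ** 2 for o in obs)
    var_s = sum((s - mean_s) ** 2 for s in sim)
    return cov / math.sqrt(var_o * var_s)


if __name__ == "__main__":
    # Made-up values for illustration, not data from the study.
    observed = [1.0, 2.0, 3.0, 4.0]
    simulated = [2.0, 3.0, 4.0, 5.0]
    print(pbias(observed, simulated), nse(observed, simulated))
    print(rsr(observed, simulated), pearson_r(observed, simulated))
```

Under these definitions RSR equals the square root of (1 - NSE), so the two statistics carry the same information on different scales.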