• Title/Summary/Keyword: 지능 시스템 (intelligent systems)

Development of the forecasting model for import volume by item of major countries based on economic, industrial structural and cultural factors: Focusing on the cultural factors of Korea (경제적, 산업구조적, 문화적 요인을 기반으로 한 주요 국가의 한국 품목별 수입액 예측 모형 개발: 한국의, 한국에 대한 문화적 요인을 중심으로)

  • Jun, Seung-pyo;Seo, Bong-Goon;Park, Do-Hyung
    • Journal of Intelligence and Information Systems / v.27 no.4 / pp.23-48 / 2021
  • The Korean economy has achieved continuous growth over the past several decades thanks to the government's export-oriented policies. Rising exports have played a leading role in driving Korea's economic growth by improving economic efficiency, creating jobs, and promoting technology development. Traditionally, the main factors affecting Korea's exports have been examined from two perspectives: economic factors and industrial structural factors. First, economic factors relate to exchange rates and global economic fluctuations. The impact of the exchange rate on Korea's exports depends on both its level and its volatility. Global economic fluctuations affect global import demand, which is a decisive factor in Korea's exports. Second, industrial structural factors are characteristics specific to industries or products, such as the slowdown in the international division of labor, China's increased domestic substitution of certain imported goods, and changes in the overseas production patterns of major export industries. Recent studies on global exchange point to the importance of cultural aspects in addition to economic and industrial structural factors. Therefore, this study develops a forecasting model that considers cultural factors along with economic and industrial structural factors in estimating each country's import volume from Korea. In particular, this study approaches the influence of cultural factors on imports of Korean products from the perspective of the PUSH-PULL framework. The PUSH dimension reflects Korea's development and active promotion of its own brand, and is defined as each country's degree of interest in Korean brands, represented by K-POP, K-FOOD, and K-CULTURE. The PULL dimension centers on the cultural and psychological characteristics of the people of each country. It is defined as how inclined each country is to accept the Korean Wave, as captured by that country's cultural code: its governance system, masculinity, risk avoidance, and short-term/long-term orientation. A distinctive feature of this study is that the final prediction model was selected based on design principles. The design principles are as follows. 1) The model was developed to reflect interest in Korea and cultural characteristics through newly added data sources. 2) It was designed to be practical and convenient, so that a forecast value can be retrieved immediately by inputting changes in economic factors, an item code, and a country code. 3) To derive theoretically meaningful results, an algorithm was selected that allows the relationship between the inputs and the target variable to be interpreted. This study offers implications from technical, economic, and policy perspectives, and the import forecasting model is expected to contribute to export support strategies for small and medium-sized enterprises.

The prediction of the stock price movement after IPO using machine learning and text analysis based on TF-IDF (증권신고서의 TF-IDF 텍스트 분석과 기계학습을 이용한 공모주의 상장 이후 주가 등락 예측)

  • Yang, Suyeon;Lee, Chaerok;Won, Jonggwan;Hong, Taeho
    • Journal of Intelligence and Information Systems / v.28 no.2 / pp.237-262 / 2022
  • Interest in IPOs (Initial Public Offerings) has been growing because of the profitable returns that IPO stocks can offer investors. However, IPOs can also be speculative investments involving substantial risk, because shares tend to be volatile and the supply of IPO shares is often highly limited. It is therefore crucial that IPO investors be well informed about the issuing firms and the market before deciding whether to invest. Unlike institutional investors, individual investors are at a disadvantage because they have few opportunities to obtain information on IPOs. The purpose of this study is to provide individual investors with information they can consider when making an IPO investment decision. This study presents a model that uses machine learning and text analysis to predict whether an IPO stock price will move up or down after the first 5 trading days. Our sample includes 691 Korean IPOs from June 2009 to December 2020. The input variables for the prediction are three tone variables created from IPO prospectuses and quantitative variables that are firm-specific, issue-specific, or market-specific. The three prospectus tone variables indicate the percentages of positive, neutral, and negative sentences in a prospectus, respectively. Only the sentences in the Risk Factors section of a prospectus were considered for the tone analysis. All sentences were classified as 'positive', 'neutral', or 'negative' via text analysis using TF-IDF (Term Frequency - Inverse Document Frequency). The tone of each sentence was measured by machine learning instead of a lexicon-based approach because of the lack of sentiment dictionaries suitable for Korean text analysis in the context of finance. For this reason, the training set was created by randomly selecting 10% of the sentences from each prospectus, and each sentence in the training set was labeled manually after being read. Then, based on the training set, a Support Vector Machine model was used to predict the tone of the sentences in the test set. Finally, the percentages of positive, neutral, and negative sentences in each prospectus were calculated from the model's predictions. To predict the price movement of an IPO stock, four machine learning techniques were applied: Logistic Regression, Random Forest, Support Vector Machine, and Artificial Neural Network. According to the results, models that combine the quantitative variables (using technical analysis) with the prospectus tone variables show higher accuracy than models that use only quantitative variables. More specifically, prediction accuracy improved by 1.45 percentage points in the Random Forest model, 4.34 percentage points in the Artificial Neural Network model, and 5.07 percentage points in the Support Vector Machine model. Among the techniques tested, the Artificial Neural Network model using both quantitative variables and prospectus tone variables achieved the highest prediction accuracy, 61.59%. The results indicate that the tone of a prospectus is a significant factor in predicting the price movement of an IPO stock. In addition, the McNemar test was used to verify whether the difference between the models is statistically significant. The model using only quantitative variables was compared with the model using both the quantitative variables and the prospectus tone variables, and the improvement in predictive performance was confirmed to be significant at the 1% level.
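The TF-IDF weighting and the tone percentages described in this abstract can be illustrated with a minimal sketch. It assumes one common textbook variant of TF-IDF (term frequency normalized by document length, idf = log(N / df)); the abstract does not say which variant the authors used. The function names `tfidf` and `tone_shares` and the sample sentences are hypothetical, and the classifier itself (a Support Vector Machine in the paper) is left out.

```python
import math
from collections import Counter


def tfidf(documents):
    """Return one {term: weight} dict per tokenized document.

    Assumption: tf = count / document length and idf = log(N / df).
    Other variants (smoothed idf, sublinear tf) are equally common.
    """
    n_docs = len(documents)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in documents for term in set(doc))
    weights = []
    for doc in documents:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


def tone_shares(labels):
    """Percentage of positive, neutral, and negative sentences."""
    counts = Counter(labels)
    return {
        tone: 100 * counts[tone] / len(labels)
        for tone in ("positive", "neutral", "negative")
    }


# Hypothetical, already-tokenized sentences (not taken from the paper).
sentences = [
    ["revenue", "may", "decline"],
    ["revenue", "may", "grow"],
    ["litigation", "risk", "may", "increase"],
]
vectors = tfidf(sentences)
shares = tone_shares(["positive", "neutral", "neutral", "negative"])
```

A term that appears in every sentence ("may" here) receives weight zero under this variant, while rarer terms such as "decline" receive the largest weights. The per-prospectus tone shares would then be computed from the labels predicted for that prospectus.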

KB-BERT: Training and Application of Korean Pre-trained Language Model in Financial Domain (KB-BERT: 금융 특화 한국어 사전학습 언어모델과 그 응용)

  • Kim, Donggyu;Lee, Dongwook;Park, Jangwon;Oh, Sungwoo;Kwon, Sungjun;Lee, Inyong;Choi, Dongwon
    • Journal of Intelligence and Information Systems / v.28 no.2 / pp.191-206 / 2022
  • Using a pre-trained language model (PLM) has recently become the de facto approach to achieving state-of-the-art performance on various natural language tasks (called downstream tasks), such as sentiment analysis and question answering. However, like any other machine learning method, a PLM tends to depend on the data distribution seen during training and performs worse on unseen (out-of-distribution) domains. For this reason, there have been many efforts to develop domain-specific PLMs for fields such as the medical and legal industries. In this paper, we discuss the training of a finance-specific PLM for the Korean language and its applications. Our finance-specific PLM, KB-BERT, is trained on a carefully curated financial corpus that includes domain-specific documents such as financial reports. We provide extensive performance evaluation results on three natural language tasks: topic classification, sentiment analysis, and question answering. Compared with state-of-the-art Korean PLMs such as KoELECTRA and KLUE-RoBERTa, KB-BERT shows comparable performance on general datasets based on common corpora such as Wikipedia and news articles. Moreover, KB-BERT outperforms the compared models on finance domain datasets that require finance-specific knowledge to solve the given problems.

Analysis of the relationship between interest rate spreads and stock returns by industry (금리 스프레드와 산업별 주식 수익률 관계 분석)

  • Kim, Kyuhyeong;Park, Jinsoo;Suh, Jihae
    • Journal of Intelligence and Information Systems / v.28 no.3 / pp.105-117 / 2022
  • This study analyzes the relationship between stock returns and the interest rate spread, the difference between long-term and short-term interest rates, using polynomial regression analysis. Existing research has concentrated on forecasting business conditions through the interest rate spread, focusing on the US market. Previous studies verified the interest rate spread as a leading indicator for business forecasting by varying the maturities of the long-term and short-term interest rates and analyzing the length of the lead. Since the 7th revision of Korea's composite business indicators in 2006, the interest rate spread has been included among the components of the leading index, and it remains in use today. Nevertheless, there has been little research on the relationship between industry-level stock returns and the interest rate spread in the domestic stock market. Therefore, this study analyzed industry-level stock returns and the interest rate spread in the Korean stock market. It selected the long-term and short-term interest rates with high causality through regression analysis, and then examined the correlations for each lead period and industry. To overcome the limitations of simple linear regression, polynomial regression was used, which raised explanatory power. As a result, high causality was verified when the interest rate spread was defined as the difference between the yield on three-year unsecured corporate bonds (AA-), led by six months, and the call rate. In addition, in the analysis of industry-level stock returns, the automobile industry's returns were most closely related to this interest rate spread. This study is significant in that it verifies the causality among the interest rate spread, business forecasts, and stock returns in Korea. Although forecasting stock prices using the interest rate spread alone has limitations, the spread can serve as a strong factor when properly combined with other factors.
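As an illustration of regressing returns on a leading interest rate spread with a polynomial term, the following is a minimal sketch using ordinary least squares solved through the normal equations. The helper names `lead_pairs` and `polyfit`, the one-period lead, and the synthetic numbers are assumptions made for the example; the abstract does not describe the authors' actual estimation procedure or data layout.

```python
def lead_pairs(spread, returns, lead):
    """Pair the spread at time t with the return at time t + lead."""
    n = len(spread) - lead
    return spread[:n], returns[lead:lead + n]


def polyfit(xs, ys, degree):
    """Least-squares polynomial coefficients [b0, b1, ..., b_degree]."""
    n = degree + 1
    # Normal equations (X^T X) b = X^T y with design matrix X[i][j] = xs[i]**j.
    a = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n):
                a[row][k] -= factor * a[col][k]
            b[row] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(a[i][k] * coeffs[k] for k in range(i + 1, n))
        coeffs[i] = (b[i] - tail) / a[i][i]
    return coeffs


# Synthetic data (not from the paper): each return follows the previous
# period's spread through an exact quadratic, 0.5 + 2*s - s**2.
spread = [-1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0]
returns = [0.0] + [0.5 + 2 * s - s * s for s in spread[:-1]]

x, y = lead_pairs(spread, returns, lead=1)
coeffs = polyfit(x, y, degree=2)  # approximately [0.5, 2.0, -1.0]
```

With monthly data, a six-month lead as reported in the abstract would correspond to `lead=6`. Solving the normal equations directly is adequate for low degrees and small samples; a library routine would be the more robust choice in practice.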

Nonlinear Vector Alignment Methodology for Mapping Domain-Specific Terminology into General Space (전문어의 범용 공간 매핑을 위한 비선형 벡터 정렬 방법론)

  • Kim, Junwoo;Yoon, Byungho;Kim, Namgyu
    • Journal of Intelligence and Information Systems / v.28 no.2 / pp.127-146 / 2022
  • As word embedding has shown excellent performance in various deep learning-based natural language processing tasks, research on the advancement and application of word, sentence, and document embedding is being actively conducted. Among these topics, cross-language transfer, which enables semantic exchange between different languages, is growing along with the development of embedding models. Academic interest in vector alignment is growing with the expectation that it can be applied to various embedding-based analyses. In particular, vector alignment is expected to be applied to mapping between specialized domains and general domains. In other words, it is expected to make it possible to map the vocabulary of specialized fields such as R&D, medicine, and law into the space of a pre-trained language model trained on a huge volume of general-purpose documents, or to provide a clue for mapping vocabulary between different specialized fields. However, because the linear vector alignment that has mainly been studied in academia assumes statistical linearity, it tends to oversimplify the vector space. It essentially assumes that different vector spaces are geometrically similar, which causes inevitable distortion in the alignment process. To overcome this limitation, we propose a deep learning-based vector alignment methodology that effectively learns the nonlinearity of the data. The proposed methodology consists of sequential training of a skip-connected autoencoder and a regression model to align the specialized word embeddings expressed in each space to the general embedding space. Finally, through inference with the two trained models, the specialized vocabulary can be aligned in the general space. To verify the performance of the proposed methodology, an experiment was performed on a total of 77,578 documents in the field of 'health care' among national R&D tasks performed from 2011 to 2020. The results confirmed that the proposed methodology showed superior performance in terms of cosine similarity compared with existing linear vector alignment.
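The evaluation criterion named in this abstract, cosine similarity between an aligned vector and its target in the general space, can be sketched as follows. The function names and the two-dimensional vectors are hypothetical; the alignment models themselves (the skip-connected autoencoder and the regression model) are not reproduced here.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mean_cosine(aligned, targets):
    """Average cosine similarity over pairs of aligned and target vectors."""
    pairs = list(zip(aligned, targets))
    return sum(cosine_similarity(a, t) for a, t in pairs) / len(pairs)


# Hypothetical vectors: where each term should land in the general space,
# and where some alignment method actually placed it.
targets = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[2.0, 0.0], [1.0, 1.0]]
score = mean_cosine(aligned, targets)
```

Because cosine similarity ignores vector length, the first pair scores 1.0 even though the magnitudes differ. Two alignment methods can be compared by computing `mean_cosine` for each method's output against the same targets.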

Image based Experience Goods, Text-based Search Goods: Cognitive Fit between Product Information Composition and Product Type depending on Regulatory Focus (이미지 기반의 경험재, 텍스트 기반의 탐색재: 조절초점에 따른 제품 정보 구성 방식과 제품 유형의 일치 효과)

  • Park, Kyung-Hee;Seo, Bong-Goon;Park, Do-Hyung
    • Journal of Intelligence and Information Systems / v.28 no.2 / pp.75-100 / 2022
  • Contactless ("untact") mobile commerce has grown rapidly because of the prolonged COVID-19 pandemic, and companies face intense competition in this market. However, product detail pages, which play an important role in purchase decisions, have mostly been presented to consumers in a stereotyped form of information composition. This study finds that the form of information composition (image-centered vs. text-centered) used for detailed product descriptions on mobile product detail pages affects product attitude and purchase intention, because the way information appeals to consumers varies with the product type (search goods vs. experience goods). That is, for search goods, for which information is easy to search and quality is predictable, text-centered information composition had a more positive effect on product attitude and purchase intention. For experience goods, whose quality is unpredictable, image-centered information composition had a more positive effect on product attitude and purchase intention. Drawing on Higgins' regulatory focus theory, the study also finds that the congruence effect between product type and the form of information composition varies with consumers' chronic regulatory focus. Promotion-focused consumers showed the congruence effect between product type and the form of information composition, whereas prevention-focused consumers did not. That is, promotion-focused consumers showed more positive product attitude and purchase intention for image-centered information composition of experience goods and text-centered information composition of search goods. For prevention-focused consumers, presenting image-centered or text-centered information composition for search and experience goods had no effect on product attitude or purchase intention. The study implies that the information composition of mobile product detail pages should be designed, produced, and provided with both product type and consumer propensity in mind.

ICT Company Profiling Analysis and the Mechanism for Performance Creation Depending on the Type of Government Start-up Support Program (정부창업지원 프로그램 참여에 따른 ICT 기업 프로파일링과 성과창출 메커니즘)

  • Ha, Sangjip;Park, Do-Hyung
    • Journal of Intelligence and Information Systems / v.28 no.3 / pp.237-258 / 2022
  • As the global market environment changes, the domestic ICT industry has a growing influence on the world economy. This industry is regarded as an important driving force of the national economy from both a technological and a social point of view. In particular, small and medium-sized enterprises (SMEs) in the ICT industry are regarded as essential actors in domestic economic development in terms of company diversity, technology development, and job creation. However, because SMEs are small compared with large enterprises, it is difficult for them to survive with a differentiated strategy in an imperfect and rapidly changing environment. Therefore, SMEs must make considerable efforts to improve their own capabilities, and the government needs to provide help suited to each company's internal resources so that they can remain competitive. This study classifies the types of ICT SMEs participating in government support programs and analyzes the relationship between resources and performance creation for each type. It uses data from the "ICT Small and Medium Enterprises Survey" conducted annually by the Ministry of Science and ICT. In the first step, ICT SMEs were clustered based on common factors according to their experiences with government support programs. Three meaningful clusters were identified and named "active participation type," "initial support type," and "soloist type." In the second step, this study compared the characteristics of the clusters through profiling analysis. In the third step, regression analysis was used to identify the mechanism of R&D performance creation for each cluster. Different factors affected performance creation in each cluster, and the magnitude of their influence also differed. Specifically, for the "active participation type," "current workforce," "technological competitiveness," and "R&D investment in the previous year" were found to be important factors in creating R&D performance. For the "initial support type," "whether a dedicated R&D organization exists," "R&D investment in the previous year," "ratio of sales to large companies," and "ratio of vendors supplied to large companies" contributed to performance. Lastly, for the "soloist type," "current workforce," "future recruitment plan," "technological competitiveness," "R&D investment," "ratio of sales to large companies," and "overseas sales ratio" showed a significant relationship with performance. This study has practical implications: it shows what strategy should be established when supporting SMEs according to their participation in government start-up programs, and it provides a guide to what kind of support should be provided.

Classification Algorithm-based Prediction Performance of Order Imbalance Information on Short-Term Stock Price (분류 알고리즘 기반 주문 불균형 정보의 단기 주가 예측 성과)

  • Kim, S.W.
    • Journal of Intelligence and Information Systems / v.28 no.4 / pp.157-177 / 2022
  • Investors trade stocks while closely watching, in real time, the order information submitted by domestic and foreign investors through the Limit Order Book, the so-called "price current" provided by securities firms. Is the order information released in the Limit Order Book useful for stock price prediction? This study analyzes whether order imbalance, which appears when investors' buy and sell orders are concentrated on one side during intra-day trading, is a significant predictor of future stock price rises or falls. Using classification algorithms, this study improved the accuracy with which order imbalance information predicts the short-term price trend, that is, whether the day's closing price rises or falls. Day trading strategies based on the price trends predicted by the classification algorithms are proposed, and their trading performance is analyzed empirically. The 5-minute KOSPI200 Index Futures data were analyzed for 4,564 days from January 19, 2004 to June 30, 2022. The results of the empirical analysis are as follows. First, order imbalance information has a significant impact on current stock prices. Second, the order imbalance information observed in the early morning has significant forecasting power for the price trend from the early morning to the market close. Third, the Support Vector Machines algorithm showed the highest prediction accuracy, 54.1%, for the day's closing price trend using order imbalance information. Fourth, order imbalance information measured early in the day had higher prediction accuracy than order imbalance information measured later in the day. Fifth, the day trading strategies using the classification algorithms' predictions of price trends outperformed the benchmark trading strategy. Sixth, except for the K-Nearest Neighbor algorithm, all investments using the classification algorithms showed higher average total profits than the benchmark strategy. Seventh, the strategies using the predictions of the Logistic Regression, Random Forest, Support Vector Machines, and XGBoost algorithms outperformed the benchmark strategy on the Sharpe Ratio, which evaluates both profitability and risk. This study differs academically from existing studies in that it documents the economic value of the total buy and sell order volume information in the Limit Order Book. The empirical results are also valuable to market participants from a trading perspective. Future studies should improve the performance of the trading strategy with more accurate price predictions by extending the analysis to deep learning models, which have recently been actively studied for stock price prediction.
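Two quantities in this abstract lend themselves to a short worked example: an order imbalance measure and the Sharpe Ratio. The sketch below assumes a commonly used normalized imbalance, (buy - sell) / (buy + sell), and an unannualized Sharpe Ratio based on the sample standard deviation. The abstract does not give the paper's exact definitions, so both formulas, the function names, and the numbers are illustrative assumptions.

```python
import statistics


def order_imbalance(buy_volume, sell_volume):
    """Normalized imbalance in [-1, 1]; positive means buy-side pressure.

    Assumption: (buy - sell) / (buy + sell). The paper may define it differently.
    """
    total = buy_volume + sell_volume
    return (buy_volume - sell_volume) / total if total else 0.0


def sharpe_ratio(returns, risk_free=0.0):
    """Mean excess return divided by its sample standard deviation.

    Not annualized; needs at least two returns that are not all equal.
    """
    excess = [r - risk_free for r in returns]
    return statistics.mean(excess) / statistics.stdev(excess)


# Hypothetical figures (not from the paper).
imbalance = order_imbalance(buy_volume=600, sell_volume=400)
strategy_sharpe = sharpe_ratio([0.01, 0.03, 0.02])
```

Here the imbalance is 0.2, indicating more buy than sell volume. Comparing `sharpe_ratio` for a strategy's daily returns against that of a benchmark mirrors the risk-adjusted comparison described in the seventh finding.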

The Effect of Domain Specificity on the Performance of Domain-Specific Pre-Trained Language Models (도메인 특수성이 도메인 특화 사전학습 언어모델의 성능에 미치는 영향)

  • Han, Minah;Kim, Younha;Kim, Namgyu
    • Journal of Intelligence and Information Systems / v.28 no.4 / pp.251-273 / 2022
  • Research on applying deep learning to text analysis has continued steadily. In particular, research has been actively conducted on understanding the meaning of words and performing tasks such as summarization and sentiment classification through pre-trained language models trained on large datasets. However, existing pre-trained language models are limited in that they do not understand specific domains well. Therefore, in recent years, research has shifted toward creating language models specialized for a particular domain. Domain-specific pre-trained language models understand the knowledge of a particular domain better and show performance improvements on various tasks in that field. However, domain-specific further pre-training is expensive because corpus data for the target domain must be acquired. Furthermore, many cases have been reported in which the performance improvement after further pre-training is insignificant in some domains. It is therefore difficult to decide whether to develop a domain-specific pre-trained language model when it is not clear that performance will improve substantially. In this paper, we present a way to check the expected performance improvement from further pre-training in a domain before actually performing it. Specifically, after selecting three domains, we measured the increase in classification accuracy achieved through further pre-training in each domain. We also developed new indicators that estimate the specificity of a domain based on the normalized frequency of the keywords used in that domain. Finally, we conducted classification using a pre-trained language model and domain-specific pre-trained language models for the three domains. As a result, we confirmed that the higher the domain specificity index, the greater the performance improvement through further pre-training.

Analysis of Munitions Contract Work Using Process Mining (프로세스 마이닝을 이용한 군수품 계약업무 분석 : 공군 군수사 계약업무를 중심으로)

  • Joo, Yong Seon;Kim, Su Hwan
    • Journal of Intelligence and Information Systems / v.28 no.4 / pp.41-59 / 2022
  • The timely procurement of military supplies is essential to maintaining the military's operational capabilities, and contract work is the first step toward timely procurement. In addition, signing a contract quickly allows a comfortable delivery date to be set and increases the likelihood that the budget will be executed, so improving the contract process is essential for executing the budget early and preventing budget carry-over or non-use. Recently, research using big data has been actively conducted in various fields, and process mining, a technique for analyzing and improving processes using big data, is widely used in the private sector. However, the analysis of contract work in the military has been limited to case-by-case analysis, such as identifying the cause of each problematic carried-over or unused contract based on the experience and fragmentary information of the person in charge. To improve the contract process, this study applied process mining to data on a total of 560 contract tasks contracted directly by the Department of Finance of the Air Force Logistics Command over about one year from November 2019. Process maps were derived by synthesizing distributed data, and process flow analysis, execution time analysis, bottleneck analysis, and additional detailed analysis were conducted. The analysis found that review/modification occurred repeatedly after the request in a number of contracts. Repeated reviews/modifications significantly delay the completion of the cost calculation, which was also clearly revealed through bottleneck visualization. Review/modification occurs in more than 60% of contracts from the top 5 departments by number of contract requests, and it usually occurs in the first half of the year, when requests are concentrated, which means that requesting departments need to conduct a thorough review before requesting contracts. In addition, the contract work of the Department of Finance was carried out in accordance with the procedures prescribed by laws and regulations, but the order of some tasks was found to need adjustment. This study is the first to use process mining for the analysis of contract work in the military. If further research applies process mining to other tasks in the military, the efficiency of those tasks is expected to improve.
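The repeated review/modification pattern described in this abstract is the kind of finding that a directly-follows count, a basic building block of process maps, can surface. The sketch below is a minimal illustration: the case identifiers, activity names, and helper functions are hypothetical and are not taken from the Air Force Logistics Command data or from any specific process mining tool.

```python
from collections import Counter


def directly_follows(traces):
    """Count how often activity a is immediately followed by activity b."""
    pairs = Counter()
    for trace in traces:
        pairs.update(zip(trace, trace[1:]))
    return pairs


def rework_count(trace, activity):
    """Number of repeat occurrences of an activity within one case."""
    return max(trace.count(activity) - 1, 0)


# Hypothetical event log: one ordered list of activities per contract case.
log = {
    "contract-001": [
        "request", "review/modification", "cost calculation", "contract signing",
    ],
    "contract-002": [
        "request", "review/modification", "review/modification",
        "cost calculation", "contract signing",
    ],
}

dfg = directly_follows(log.values())
repeats = {case: rework_count(trace, "review/modification")
           for case, trace in log.items()}
```

A self-loop such as `("review/modification", "review/modification")` in `dfg` marks rework. Attaching timestamps to each event would extend the same structure to the execution time and bottleneck analyses mentioned in the abstract.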