Search | Korea Science

Predicting stock movements based on financial news with systematic group identification (시스템적인 군집 확인과 뉴스를 이용한 주가 예측)

Seong, NohYoon;Nam, Kihwan
- Journal of Intelligence and Information Systems / v.25 no.3 / pp.1-17 / 2019
Because stock price forecasting is an important issue both academically and practically, research in stock price prediction has been actively conducted. The stock price forecasting research is classified into using structured data and using unstructured data. With structured data such as historical stock price and financial statements, past studies usually used technical analysis approach and fundamental analysis. In the big data era, the amount of information has rapidly increased, and the artificial intelligence methodology that can find meaning by quantifying string information, which is an unstructured data that takes up a large amount of information, has developed rapidly. With these developments, many attempts with unstructured data are being made to predict stock prices through online news by applying text mining to stock price forecasts. The stock price prediction methodology adopted in many papers is to forecast stock prices with the news of the target companies to be forecasted. However, according to previous research, not only news of a target company affects its stock price, but news of companies that are related to the company can also affect the stock price. However, finding a highly relevant company is not easy because of the market-wide impact and random signs. Thus, existing studies have found highly relevant companies based primarily on pre-determined international industry classification standards. However, according to recent research, global industry classification standard has different homogeneity within the sectors, and it leads to a limitation that forecasting stock prices by taking them all together without considering only relevant companies can adversely affect predictive performance. To overcome the limitation, we first used random matrix theory with text mining for stock prediction. Wherever the dimension of data is large, the classical limit theorems are no longer suitable, because the statistical efficiency will be reduced. 
Therefore, a simple correlation analysis in the financial market does not reveal the true correlation. To solve the issue, we adopt random matrix theory, which is mainly used in econophysics, to remove market-wide effects and random signals and find the true correlation between companies. With the true correlation, we perform cluster analysis to find relevant companies. Based on the clustering analysis, we then used a multiple kernel learning algorithm, an ensemble of support vector machines, to incorporate the effects of the target firm and its relevant firms simultaneously. Each kernel was assigned to predict stock prices with features of financial news of the target firm and its relevant firms. The results of this study are as follows. (1) Following the existing research flow, we confirmed that using news from relevant companies is an effective way to forecast stock prices. (2) Identifying relevant companies in the wrong way can lower AI prediction performance. (3) The proposed approach with random matrix theory shows better performance than previous studies when cluster analysis is performed on the true correlation obtained by removing market-wide effects and random signals. The contributions of this study are as follows. First, this study shows that random matrix theory, which is used mainly in econophysics, can be combined with artificial intelligence to produce good methodologies. This suggests that it is important not only to develop AI algorithms but also to adopt physics theory. This extends the existing research that presented a methodology integrating artificial intelligence with complex system theory through transfer entropy. Second, this study stressed that finding the right companies in the stock market is an important issue. This suggests that it is important not only to study artificial intelligence algorithms but also to adjust the input values on a theoretical basis.
Third, we confirmed that firms classified as Global Industrial Classification Standard (GICS) might have low relevance and suggested it is necessary to theoretically define the relevance rather than simply finding it in the GICS.
https://doi.org/10.13088/jiis.2019.25.3.001 KSCI
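The abstract above does not spell out how random matrix theory separates true correlation from noise. A minimal sketch of one common approach is to compare the eigenvalues of the empirical correlation matrix with the Marchenko-Pastur bounds expected for purely random data. The function names, the three labels, and the sample eigenvalues below are illustrative assumptions, not the authors' implementation.

```python
import math


def marchenko_pastur_bounds(n_assets, n_obs):
    """Eigenvalue range expected for a correlation matrix of pure noise.

    Assumes unit-variance (standardized) returns and n_obs > n_assets.
    """
    q = n_assets / n_obs
    lam_min = (1 - math.sqrt(q)) ** 2
    lam_max = (1 + math.sqrt(q)) ** 2
    return lam_min, lam_max


def label_eigenvalues(eigenvalues, n_assets, n_obs):
    """Label each eigenvalue as 'market', 'group', or 'noise'.

    Assumption for this sketch: everything at or below the upper bound is
    treated as noise, the largest eigenvalue above it as the market-wide
    mode, and the remaining ones as group (sector-like) structure.
    """
    _, lam_max = marchenko_pastur_bounds(n_assets, n_obs)
    largest = max(eigenvalues)
    labels = []
    for lam in eigenvalues:
        if lam <= lam_max:
            labels.append("noise")
        elif lam == largest:
            labels.append("market")
        else:
            labels.append("group")
    return labels


# Hypothetical eigenvalues for 100 stocks observed over 400 days.
labels = label_eigenvalues([30.0, 4.0, 1.2, 0.4], n_assets=100, n_obs=400)
```

In a full pipeline, the correlation matrix would be rebuilt from the 'group' eigenvectors only before clustering, which is the step the abstract describes as removing market-wide effects and random signals.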

A Recidivism Prediction Model Based on XGBoost Considering Asymmetric Error Costs (비대칭 오류 비용을 고려한 XGBoost 기반 재범 예측 모델)

Won, Ha-Ram;Shim, Jae-Seung;Ahn, Hyunchul
- Journal of Intelligence and Information Systems / v.25 no.1 / pp.127-137 / 2019
Recidivism prediction has been a subject of constant research by experts since the early 1970s, and it has become more important as crimes committed by recidivists steadily increase. In particular, in the 1990s, after the US and Canada adopted the 'Recidivism Risk Assessment Report' as a decisive criterion during trial and parole screening, research on recidivism prediction became more active. In the same period, empirical studies on 'Recidivism Factors' also began in Korea. Even though most recidivism prediction studies have so far focused on factors of recidivism or the accuracy of recidivism prediction, it is important to minimize the misclassification cost, because recidivism prediction has an asymmetric error cost structure. In general, the cost of misclassifying people who will not reoffend as likely recidivists is lower than the cost of misclassifying people who will reoffend as non-recidivists, because the former increases only the additional monitoring costs, while the latter increases social and economic costs. Therefore, in this paper, we propose an XGBoost (eXtreme Gradient Boosting; XGB) based recidivism prediction model considering asymmetric error cost. In the first step of the model, XGB, which is recognized as a high-performance ensemble method in the field of data mining, was applied. The results of XGB were compared with various prediction models such as LOGIT (logistic regression analysis), DT (decision trees), ANN (artificial neural networks), and SVM (support vector machines). In the next step, the threshold is optimized to minimize the total misclassification cost, which is the weighted average of FNE (False Negative Error) and FPE (False Positive Error). To verify the usefulness of the model, the model was applied to a real recidivism prediction dataset.
As a result, it was confirmed that the XGB model not only showed better prediction accuracy than other prediction models but also reduced the cost of misclassification most effectively.
https://doi.org/10.13088/jiis.2019.25.1.127 KSCI
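The threshold-optimization step described in the abstract above can be sketched as a simple grid search over cut-off values. This is a hedged illustration only: the cost weights, the grid resolution, and the use of a weighted sum of error counts (rather than the paper's weighted average of error rates) are assumptions, and the probabilities would come from whatever classifier is used.

```python
def total_cost(y_true, y_prob, threshold, cost_fn, cost_fp):
    """Weighted misclassification cost at a given probability threshold."""
    fn = fp = 0
    for y, p in zip(y_true, y_prob):
        pred = 1 if p >= threshold else 0
        if y == 1 and pred == 0:
            fn += 1  # a reoffender predicted as a non-recidivist
        elif y == 0 and pred == 1:
            fp += 1  # a non-reoffender predicted as a recidivist
    return cost_fn * fn + cost_fp * fp


def best_threshold(y_true, y_prob, cost_fn, cost_fp, steps=100):
    """Grid-search the threshold in [0, 1] that minimizes total cost."""
    candidates = [i / steps for i in range(steps + 1)]
    return min(
        candidates,
        key=lambda t: total_cost(y_true, y_prob, t, cost_fn, cost_fp),
    )


# Hypothetical labels and predicted probabilities; a false negative is
# assumed to cost five times as much as a false positive.
y_true = [1, 1, 0, 0]
y_prob = [0.9, 0.4, 0.6, 0.1]
t = best_threshold(y_true, y_prob, cost_fn=5, cost_fp=1)
```

With a heavier false-negative cost, the search settles on a lower threshold than the default 0.5, accepting more false positives to avoid missing reoffenders.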

A Study of the Application of 'Digital Heritage ODA' - Focusing on the Myanmar cultural heritage management system - (디지털 문화유산 ODA 적용에 관한 시론적 연구 -미얀마 문화유산 관리시스템을 중심으로-)

Jeong, Seongmi
- Korean Journal of Heritage: History & Science / v.53 no.4 / pp.198-215 / 2020
Official development assistance refers to assistance provided by governments and other public institutions in donor countries, aimed at promoting economic development and social welfare in developing countries. The purpose of this research is to examine the construction process of the "Myanmar Cultural Heritage Management System" that is underway as part of the ODA project to strengthen cultural and artistic capabilities and analyze the achievements and challenges of the Digital Cultural Heritage ODA. The digital cultural heritage management system is intended to achieve the permanent preservation and sustainable utilization of tangible and intangible cultural heritage materials. Cultural heritage can be stored in digital archives, newly approached using computer analysis technology, and information can be used in multiple dimensions. First, the Digital Cultural Heritage ODA was able to permanently preserve cultural heritage content that urgently needed digitalization by overcoming and documenting the "risk" associated with cultural heritage under threat of being extinguished, damaged, degraded, or distorted in Myanmar. Second, information on Myanmar's cultural heritage can be systematically managed and used in many ways through linkages between materials. Third, cultural maps can be implemented that are based on accurate geographical location information as to where cultural heritage is located or inherited. Various items of cultural heritage were collectively and intensively visualized to maximize utility and convenience for academic, policy, and practical purposes. Fourth, we were able to overcome the one-sided limitations of cultural ODA in relations between donor and recipient countries. Fifth, the capacity building program run by officials in charge of the beneficiary country, which could be the most important form of sustainable development in the cultural ODA, was operated together. 
Sixth, there is an implication that it is an ODA that can be relatively smooth and non-face-to-face in nature, without requiring the movement of manpower between countries during the current global pandemic. However, the following tasks remain to be solved through active discussion and deliberation in the future. First, the content of the data uploaded to the system should be verified. Second, to preserve digital cultural heritage, it must be protected from various threats. For example, it is necessary to train local experts to prepare for errors caused by computer viruses, stored data, or operating systems. Third, due to the nature of the rapidly changing environment of computer technology, measures should also be discussed to address the problems that tend to follow when new versions and programs are developed after the end of the ODA project, or when developers have not continued to manage their programs. Fourth, since the classification system criteria and decisions regarding whether the data will be disclosed or not are set according to Myanmar's political judgment, it is necessary to let the beneficiary country understand the ultimate purpose of the cultural ODA project.
https://doi.org/10.22755/kjchs.2020.53.4.198

UX Methodology Study by Data Analysis Focusing on deriving persona through customer segment classification (데이터 분석을 통한 UX 방법론 연구 고객 세그먼트 분류를 통한 페르소나 도출을 중심으로)

Lee, Seul-Yi;Park, Do-Hyung
- Journal of Intelligence and Information Systems / v.27 no.1 / pp.151-176 / 2021
As the information technology industry develops, various kinds of data are being created, and it is now essential to process them and use them in the industry. Analyzing and utilizing various digital data collected online and offline is a necessary process to provide an appropriate experience for customers in the industry. In order to create new businesses, products, and services, it is essential to use customer data collected in various ways to deeply understand potential customers' needs and analyze behavior patterns to capture hidden signals of desire. However, research using data analysis and UX methodology, which should be conducted in parallel for effective service development, is being conducted separately, and there is a lack of examples of use in the industry. In this work, we construct a single process by applying data analysis methods and UX methodologies. This study is important in that it is highly likely to be used because it applies methodologies that are actively used in practice. We conducted a survey on the topic to identify and cluster the associations between factors to establish customer classification and target customers. The research methods are as follows. First, we conduct factor and regression analyses to determine the association between factors in the happiness data survey. Respondents are grouped according to the survey results, and we identify the relationships among 34 questions on psychological stability, family life, relational satisfaction, health, economic satisfaction, work satisfaction, daily life satisfaction, and residential environment satisfaction. Second, we classify clusters based on factors affecting happiness and extract the optimal number of clusters. Based on the results, we cross-analyzed the characteristics of each cluster. Third, for service definition, analysis was conducted by correlating with keywords related to happiness.
We leverage keyword analysis of the thumb trend to derive ideas based on the interest and associations of the keyword. We also collected approximately 11,000 news articles based on the top three keywords that are highly related to happiness, then derived issues between keywords through text mining analysis in SAS, and utilized them in defining services after ideas were conceived. Fourth, based on the characteristics identified through data analysis, we selected segmentation and targeting appropriate for service discovery. To this end, the characteristics of the factors were grouped into four groups, the profile was drawn up, and the main target customers were selected. Fifth, based on the characteristics of the main target customers, interviewees were selected and in-depth interviews were conducted to discover the causes of happiness, causes of unhappiness, and needs for services. Sixth, we derive customer behavior patterns based on segment results and detailed interviews, and specify the objectives associated with the characteristics. Seventh, a typical persona using qualitative surveys and a persona using data were produced, and the two personas were compared to analyze the characteristics, pros, and cons of each. Existing market segmentation classifies customers based on purchasing factors, and UX methodology measures users' behavior variables to establish criteria and redefine users' classification. Applying these segment classification methods to the process of producing user classifications and personas in UX methodology can yield more accurate customer classification schemes. The significance of this study is summarized in two ways: First, the idea of using data to create a variety of services was linked to the UX methodology used to plan IT services by applying it in the hot topic era. Second, we further enhance user classification by applying segment analysis methods that are not currently used well in UX methodologies.
To provide a consistent experience in creating a single service, from large to small, it is necessary to define customers with common goals. To this end, it is necessary to derive persona and persuade various stakeholders. Under these circumstances, designing a consistent experience from beginning to end, through fast and concrete user descriptions, would be a very effective way to produce a successful service.
https://doi.org/10.13088/jiis.2021.27.1.151 KSCI

Stand-alone Real-time Healthcare Monitoring Driven by Integration of Both Triboelectric and Electro-magnetic Effects (실시간 헬스케어 모니터링의 독립 구동을 위한 접촉대전 발전과 전자기 발전 원리의 융합)

Cho, Sumin;Joung, Yoonsu;Kim, Hyeonsu;Park, Minseok;Lee, Donghan;Kam, Dongik;Jang, Sunmin;Ra, Yoonsang;Cha, Kyoung Je;Kim, Hyung Woo;Seo, Kyoung Duck;Choi, Dongwhi
- Korean Chemical Engineering Research / v.60 no.1 / pp.86-92 / 2022
Recently, the bio-healthcare market is enlarging worldwide due to various reasons such as the COVID-19 pandemic. Among them, biometric measurement and analysis technology are expected to bring about future technological innovation and socio-economic ripple effect. Existing systems require a large-capacity battery to drive signal processing, wireless transmission part, and an operating system in the process. However, due to the limitation of the battery capacity, it causes a spatio-temporal limitation on the use of the device. This limitation can act as a cause for the disconnection of data required for the user's health care monitoring, so it is one of the major obstacles of the health care device. In this study, we report the concept of a standalone healthcare monitoring module, which is based on both triboelectric effects and electromagnetic effects, by converting biomechanical energy into suitable electric energy. The proposed system can be operated independently without an external power source. In particular, the wireless foot pressure measurement monitoring system, which is rationally designed triboelectric sensor (TES), can recognize the user's walking habits through foot pressure measurement. By applying the triboelectric effects to the contact-separation behavior that occurs during walking, an effective foot pressure sensor was made, the performance of the sensor was verified through an electrical output signal according to the pressure, and its dynamic behavior is measured through a signal processing circuit using a capacitor. In addition, the biomechanical energy dissipated during walking is harvested as electrical energy by using the electromagnetic induction effect to be used as a power source for wireless transmission and signal processing. Therefore, the proposed system has a great potential to reduce the inconvenience of charging caused by limited battery capacity and to overcome the problem of data disconnection.
https://doi.org/10.9713/kcer.2022.60.1.86 KSCI

Classification Algorithm-based Prediction Performance of Order Imbalance Information on Short-Term Stock Price (분류 알고리즘 기반 주문 불균형 정보의 단기 주가 예측 성과)

Kim, S.W.
- Journal of Intelligence and Information Systems / v.28 no.4 / pp.157-177 / 2022
Investors are trading stocks by keeping a close watch on the order information submitted by domestic and foreign investors in real time through Limit Order Book information, so-called price current provided by securities firms. Will order information released in the Limit Order Book be useful in stock price prediction? This study analyzes whether it is significant as a predictor of future stock price up or down when order imbalances appear as investors' buying and selling orders are concentrated to one side during intra-day trading time. Using classification algorithms, this study improved the prediction accuracy of the order imbalance information on the short-term price up and down trend, that is the closing price up and down of the day. Day trading strategies are proposed using the predicted price trends of the classification algorithms and the trading performances are analyzed through empirical analysis. The 5-minute KOSPI200 Index Futures data were analyzed for 4,564 days from January 19, 2004 to June 30, 2022. The results of the empirical analysis are as follows. First, order imbalance information has a significant impact on the current stock prices. Second, the order imbalance information observed in the early morning has a significant forecasting power on the price trends from the early morning to the market closing time. Third, the Support Vector Machines algorithm showed the highest prediction accuracy on the day's closing price trends using the order imbalance information at 54.1%. Fourth, the order imbalance information measured at an early time of day had higher prediction accuracy than the order imbalance information measured at a later time of day. Fifth, the trading performances of the day trading strategies using the prediction results of the classification algorithms on the price up and down trends were higher than that of the benchmark trading strategy. 
Sixth, except for the K-Nearest Neighbor algorithm, all investment performances using the classification algorithms showed higher average total profits than that of the benchmark strategy. Seventh, the trading performances using the predictive results of the Logistic Regression, Random Forest, Support Vector Machines, and XGBoost algorithms showed higher results than the benchmark strategy in the Sharpe Ratio, which evaluates both profitability and risk. This study differs academically from existing studies in that it documented the economic value of the total buy & sell order volume information among the Limit Order Book information. The empirical results of this study are also valuable to market participants from a trading perspective. Future studies should improve the performance of the trading strategy with more accurate price predictions by extending the approach to deep learning models, which have recently been actively studied for stock price prediction.
https://doi.org/10.13088/jiis.2022.28.4.157 KSCI
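The abstract above compares strategies by the Sharpe Ratio, which evaluates both profitability and risk. As a rough sketch, it can be computed as the mean excess return divided by the sample standard deviation of excess returns. The per-period (non-annualized) form, the zero default risk-free rate, and the sample returns are assumptions made for illustration; the paper's exact definition is not given in the abstract.

```python
import statistics


def sharpe_ratio(returns, risk_free=0.0):
    """Per-period Sharpe ratio: mean excess return over its sample std dev."""
    excess = [r - risk_free for r in returns]
    spread = statistics.stdev(excess)  # needs at least two observations
    if spread == 0:
        raise ValueError("returns have zero variance; ratio is undefined")
    return statistics.mean(excess) / spread


# Hypothetical daily returns of a day-trading strategy.
strategy = sharpe_ratio([0.01, 0.03, 0.02])
```

A strategy with higher total profit can still have a lower Sharpe ratio than the benchmark if its returns are more volatile, which is why the abstract reports both measures.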

Utilizing the Idle Railway Sites: A Proposal for the Location of Solar Power Plants Using Cluster Analysis (철도 유휴부지 활용방안: 군집분석을 활용한 태양광발전 입지 제안)

Eunkyung Kang;Seonuk Yang;Jiyoon Kwon;Sung-Byung Yang
- Journal of Intelligence and Information Systems / v.29 no.1 / pp.79-105 / 2023
Due to unprecedented extreme weather events such as global warming and climate change, many parts of the world suffer from severe pain, and economic losses are also snowballing. In order to address these problems, 'The Paris Agreement' was signed in 2016, and an intergovernmental consultative body was formed to keep the average temperature rise of the Earth below 1.5℃. Korea also declared 'Carbon Neutrality in 2050' to prevent climate catastrophe. In particular, it was found that the increase in temperature caused by greenhouse gas emissions hurts the environment and society as a whole, as well as the export-dependent economy of Korea. In addition, as the diversification of transportation types is accelerating, the change in means of choice is also increasing. As the development paradigm in the low-growth era changes to urban regeneration, interest in idle railway sites is rising due to reduced demand for routes, improvement of alignment, and relocation of urban railways. Meanwhile, it is possible to partially achieve the solar power generation goal of 'Renewable Energy 3020' by utilizing already developed but idle railway sites and take advantage of being free from environmental damage and resident acceptance issues surrounding the location; but the actual use and plan for these solar power facilities are still lacking. Therefore, in this study, using the big data provided by the Korea National Railway and the Renewable Energy Cloud Platform, we develop an algorithm to discover and analyze suitable idle sites where solar power generation facilities can be installed and identify potentially applicable areas considering conditions desired by users. By searching and deriving these idle but relevant sites, it is intended to devise a plan to save enormous costs for facilities or expansion in the early stages of development. 
This study uses various cluster analyses to develop an optimal algorithm that can derive solar power plant locations on idle railway sites and, as a result, suggests 202 'actively recommended areas.' These results would help decision-makers make rational decisions from the viewpoint of simultaneously considering the economy and the environment.
https://doi.org/10.13088/jiis.2023.29.1.079

A COVID-19 Diagnosis Model based on Various Transformations of Cough Sounds (기침 소리의 다양한 변환을 통한 코로나19 진단 모델)

Minkyung Kim;Gunwoo Kim;Keunho Choi
- Journal of Intelligence and Information Systems / v.29 no.3 / pp.57-78 / 2023
COVID-19, which started in Wuhan, China in November 2019, spread beyond China in 2020 and spread worldwide in March 2020. It is important to prevent a highly contagious virus like COVID-19 in advance and to actively treat it when confirmed, but because the virus spreads quickly, it is even more important to identify infections quickly and prevent their spread. However, the PCR test used to check for infection is costly and time-consuming, and although self-test kits are easy to access, their cost makes it difficult to use one every time. Therefore, if it is possible to determine whether or not a person is positive for COVID-19 based on the sound of a cough, anyone can easily check whether they are infected anytime and anywhere, which can bring great economic advantages. In this study, an experiment was conducted on a method to identify whether or not a person is positive for COVID-19 based on a cough sound. Cough sound features were extracted through MFCC, Mel-Spectrogram, and spectral contrast. To ensure the quality of the cough sounds, noisy data was removed based on SNR, and only the cough sound was extracted from the voice file through chunking. Since the objective is to classify COVID-19 positive and negative cases, learning was performed with the XGBoost, LightGBM, and FCNN algorithms, which are often used for classification, and the results were compared. Additionally, we conducted a comparative experiment on the performance of the model using multidimensional vectors obtained by converting cough sounds into both images and vectors. The experimental results showed that the LightGBM model utilizing features obtained by converting basic information about health status and cough sounds into multidimensional vectors through MFCC, Mel-Spectrogram, Spectral contrast, and Spectrogram achieved the highest accuracy of 0.74.
https://doi.org/10.13088/jiis.2023.29.3.057
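The abstract above says noisy recordings were removed based on SNR but does not say how the SNR was estimated. The sketch below assumes, purely for illustration, that each recording comes with a separate noise-only segment and that recordings below a chosen decibel cut-off are dropped. The function names, the tuple layout, and the 10 dB cut-off are hypothetical.

```python
import math


def mean_power(samples):
    """Average power of a sequence of audio samples."""
    return sum(s * s for s in samples) / len(samples)


def snr_db(signal, noise):
    """Signal-to-noise ratio in decibels: 10 * log10(P_signal / P_noise)."""
    return 10 * math.log10(mean_power(signal) / mean_power(noise))


def keep_clean(recordings, min_snr_db):
    """Return the names of recordings whose SNR meets the cut-off.

    Each recording is a (name, cough_samples, noise_samples) tuple; this
    layout is an assumption of the sketch, not the paper's data format.
    """
    return [
        name
        for name, signal, noise in recordings
        if snr_db(signal, noise) >= min_snr_db
    ]


# Two toy recordings: one clean, one dominated by background noise.
recordings = [
    ("clean.wav", [1.0, -1.0, 1.0, -1.0], [0.1, -0.1, 0.1, -0.1]),
    ("noisy.wav", [1.0, -1.0, 1.0, -1.0], [0.9, -0.9, 0.9, -0.9]),
]
kept = keep_clean(recordings, min_snr_db=10.0)
```

Feature extraction (MFCC, Mel-Spectrogram, spectral contrast) would then run only on the recordings that survive this filter.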

Estimation of the CY Area Required for Each Container Handling System in Mokpo New Port (목표 신항만의 터미널 운영시스템에 따른 CY 소요면적 산정에 관한 연구)

Keum, J.S.
- Journal of Korean Port Research / v.12 no.1 / pp.35-46 / 1998
The CY can be said to function in various respects as a buffer zone between the maritime and overland inflow and outflow of containers. The amount of storage area needed requires a very critical appraisal at the pre-operational stage. A container terminal should be designed to handle and store containers in the most efficient and economic way possible. In order to achieve this aim, it is necessary to figure out or forecast the numbers and types of containers to be handled, the CY area required, and the internal handling systems to be adopted. This paper aims to calculate the CY area required for each container handling system in Mokpo New Port. The CY area required is directly dependent on the equipment being used and the storage demand. It also depends on the dwell time. Furthermore, containers need to be segregated by destination, weight, class, FCL (full container load), LCL (less than container load), direction of travel, and sometimes by type and often by shipping line or service. Thus the full use of a storage area is not always possible, as major imbalances and fluctuations in these flows occur all the time. The calculation of the CY area must therefore take these operational factors into account. To solve this problem, all these factors have been applied to the estimation of the CY area in Mokpo New Port. The CY area required in Mokpo New Port is summarized in the conclusion section.
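The abstract names the drivers of the required CY area (handling equipment, storage demand, dwell time, and imperfect use of the yard) without giving the formula. The following sketch uses a generic textbook-style relation as an assumption, not the paper's method: the average number of TEUs in the yard is scaled by a peak factor and divided by stacking height and a utilization ratio to get ground slots, then multiplied by the ground area per TEU for the chosen handling system. All parameter values are hypothetical.

```python
def cy_area_m2(annual_teu, dwell_days, area_per_teu_m2, stack_height,
               utilization=0.8, peak_factor=1.2):
    """Estimate the container yard area in square metres.

    area_per_teu_m2 and stack_height depend on the handling system
    (for example, chassis, straddle carrier, or transfer crane), and
    utilization < 1 reflects the segregation losses the abstract mentions.
    """
    avg_teu_in_yard = annual_teu * dwell_days / 365
    ground_slots = avg_teu_in_yard * peak_factor / (stack_height * utilization)
    return ground_slots * area_per_teu_m2


# Hypothetical comparison of two handling systems for the same demand.
low_stack = cy_area_m2(365_000, 5, area_per_teu_m2=20.0, stack_height=3)
high_stack = cy_area_m2(365_000, 5, area_per_teu_m2=20.0, stack_height=4)
```

Running the same demand through different equipment parameters gives one area estimate per handling system, which mirrors the per-system comparison the paper describes.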

Development of an Eye Patch-Type Biosignal Measuring Device to Measure Sleep Quality (수면의 질을 측정하기 위한 안대형 생체신호 측정기기 개발)

Changsun Ahn;Jaekwan Lim;Bongsu Jung;Youngjoo Kim
- KIPS Transactions on Computer and Communication Systems / v.12 no.5 / pp.171-180 / 2023
The three major sleep disorders in Korea are snoring, sleep apnea, and insomnia. Lack of sleep is the root of all diseases. Some of the most serious potential problems associated with sleep deprivation are cardiovascular problems, cognitive impairment, obesity, diabetes, colitis, and prostate cancer. To solve these problems, the Korean government began providing low-cost national health insurance benefits for polysomnography tests in July 2018. However, insomnia patients still face barriers to treatment in terms of time, space, and cost. Therefore, it would be better for insomnia patients to be able to take the test at home. The measuring device can measure six biosignals (eye movement, tossing and turning, body temperature, oxygen saturation, heart rate, and audio). A gyroscope sensor (MPU9250, InvenSense, USA) was used for eye movement, tossing, and turning. The input range of the sensor was 258°/sec to 460°/sec, and the data range was within the input range. Body temperature, oxygen saturation, and heart rate were measured by a sensor (MAX30102, Analog Devices, USA). The body temperature measurement range was 30 ℃ to 45 ℃, and the oxygen saturation range was 0 % for the unused state and 20 % to 90 % for the used state. The heart rate measurement range was 40 bpm to 180 bpm. The audio signal was measured by an audio sensor (AMM2742-T-R, PUIaudio, USA). The sensitivity was -42 dB ±1 dB, and the frequency range was 20 Hz to 20 kHz. The measured data was successfully received under wireless network conditions. The system consisted of a PC and a mobile app for bio-signal measurement and data collection. The measured data was collected by mobile phones and desktops. The collected data can be used as preliminary data to determine the stage of sleep and to screen for sleep induction and sleep disturbances. In the future, this convenient sleep measurement device could be beneficial for treating insomnia.
https://doi.org/10.3745/KTCCS.2023.12.5.171
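The measurement ranges quoted in the abstract above lend themselves to a simple plausibility check on incoming samples. The sketch below is an assumption about how collected data might be screened, not part of the described device: the field names and the dictionary layout are hypothetical, while the numeric ranges are the ones stated in the abstract.

```python
# Ranges taken from the abstract; the key names are illustrative only.
VALID_RANGES = {
    "body_temp_c": (30.0, 45.0),
    "spo2_percent": (20.0, 90.0),   # 0 % is reported when the sensor is unused
    "heart_rate_bpm": (40.0, 180.0),
}


def out_of_range(sample):
    """Return the fields of a sample that are missing or outside their range."""
    bad = []
    for key, (low, high) in VALID_RANGES.items():
        value = sample.get(key)
        if value is None or not (low <= value <= high):
            bad.append(key)
    return bad


# A plausible in-use reading and a reading taken with the device not worn.
worn = out_of_range({"body_temp_c": 36.5, "spo2_percent": 85, "heart_rate_bpm": 62})
unworn = out_of_range({"body_temp_c": 36.5, "spo2_percent": 0, "heart_rate_bpm": 200})
```

A check like this could flag the unused state (oxygen saturation of 0 %) before the data is passed on for sleep-stage analysis.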
