Search | Korea Science

Predicting the Direction of the Stock Index by Using a Domain-Specific Sentiment Dictionary (주가지수 방향성 예측을 위한 주제지향 감성사전 구축 방안)

Yu, Eunji;Kim, Yoosin;Kim, Namgyu;Jeong, Seung Ryul
- Journal of Intelligence and Information Systems / v.19 no.1 / pp.95-110 / 2013
Recently, the amount of unstructured data being generated through a variety of social media has been increasing rapidly, resulting in the increasing need to collect, store, search for, analyze, and visualize this data. This kind of data cannot be handled appropriately by using the traditional methodologies usually used for analyzing structured data because of its vast volume and unstructured nature. In this situation, many attempts are being made to analyze unstructured data such as text files and log files through various commercial or noncommercial analytical tools. Among the various contemporary issues dealt with in the literature of unstructured text data analysis, the concepts and techniques of opinion mining have been attracting much attention from pioneer researchers and business practitioners. Opinion mining or sentiment analysis refers to a series of processes that analyze participants' opinions, sentiments, evaluations, attitudes, and emotions about selected products, services, organizations, social issues, and so on. In other words, many attempts based on various opinion mining techniques are being made to resolve complicated issues that could not have otherwise been solved by existing traditional approaches. One of the most representative attempts using the opinion mining technique may be the recent research that proposed an intelligent model for predicting the direction of the stock index. This model works mainly on the basis of opinions extracted from an overwhelming number of economic news reports. News content published on various media is obviously a traditional example of unstructured text data. Every day, a large volume of new content is created, digitized, and subsequently distributed to us via online or offline channels. Many studies have revealed that we make better decisions on political, economic, and social issues by analyzing news and other related information. 
In this sense, we expect to predict the fluctuation of stock markets partly by analyzing the relationship between economic news reports and the pattern of stock prices. So far, in the literature on opinion mining, most studies including ours have utilized a sentiment dictionary to elicit sentiment polarity or sentiment value from a large number of documents. A sentiment dictionary consists of pairs of selected words and their sentiment values. Sentiment classifiers refer to the dictionary to formulate the sentiment polarity of words, sentences in a document, and the whole document. However, most traditional approaches have common limitations in that they do not consider the flexibility of sentiment polarity, that is, the sentiment polarity or sentiment value of a word is fixed and cannot be changed in a traditional sentiment dictionary. In the real world, however, the sentiment polarity of a word can vary depending on the time, situation, and purpose of the analysis. It can also be contradictory in nature. The flexibility of sentiment polarity motivated us to conduct this study. In this paper, we have stated that sentiment polarity should be assigned, not merely on the basis of the inherent meaning of a word but on the basis of its ad hoc meaning within a particular context. To implement our idea, we presented an intelligent investment decision-support model based on opinion mining that performs the scraping and parsing of massive volumes of economic news on the web, tags sentiment words, classifies sentiment polarity of the news, and finally predicts the direction of the next day's stock index. In addition, we applied a domain-specific sentiment dictionary instead of a general purpose one to classify each piece of news as either positive or negative. For the purpose of performance evaluation, we performed intensive experiments and investigated the prediction accuracy of our model. 
For the experiments to predict the direction of the stock index, we gathered and analyzed 1,072 articles about stock markets published by "M" and "E" media between July 2011 and September 2011.
https://doi.org/10.13088/jiis.2013.19.1.095
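
The dictionary-based classification described in the abstract above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the lexicon entries, their values, the tokenizer, and the majority-vote rule for the next-day direction are all assumptions made for the example.

```python
import re

# Hypothetical domain-specific lexicon: word -> sentiment value.
# The words and values are illustrative only, not taken from the paper.
DOMAIN_LEXICON = {
    "rally": 1.0, "surge": 1.0, "rebound": 0.5,
    "plunge": -1.0, "default": -1.0, "volatile": -0.5,
}

def score_article(text, lexicon=DOMAIN_LEXICON):
    """Sum the sentiment values of all lexicon words found in the text."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return sum(lexicon.get(token, 0.0) for token in tokens)

def classify_article(text, lexicon=DOMAIN_LEXICON):
    """Label one article as positive or negative from its total score."""
    return "positive" if score_article(text, lexicon) > 0 else "negative"

def predict_direction(articles, lexicon=DOMAIN_LEXICON):
    """Assumed rule: predict 'up' when positive articles outnumber negative ones."""
    labels = [classify_article(article, lexicon) for article in articles]
    positives = labels.count("positive")
    return "up" if positives > len(labels) - positives else "down"
```

Swapping `DOMAIN_LEXICON` for a general-purpose dictionary changes only the `lexicon` argument, which is the comparison the abstract describes.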

Seismic Evaluation of Low-rise RC Building in Korea (국내 저층구조물의 내진성능평가)

Park, Jin Hwa;Ahn, Tea Sang;Seo, Hyun Sik;Kim, Sang Dea
- Korean Society of Hazard Mitigation: Conference Proceedings (한국방재학회:학술대회논문집) / 2011.02a / pp.29-29 / 2011
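In the roughly 20 years since research on seismic performance evaluation of existing buildings began in Korea, a variety of evaluation methods have been proposed. However, the proposed methods mainly adopt and modify evaluation methods from the United States or Japan, and many aspects have been found not to suit domestic conditions. Consequently, research in Korea on seismic performance evaluation techniques for existing buildings, on rational methods for selecting buildings to retrofit based on earthquake damage prediction, and on seismic retrofit methods suited to those buildings can be said to remain at an early stage. The purpose of this study is therefore to evaluate the seismic performance of a low-rise structure in Korea by applying such evaluation techniques. To this end, a four-story school building constructed before seismic design was introduced in 1988 was selected as the structure to be analyzed. Its seismic performance was evaluated with two methods: a seismic retrofit priority evaluation method, which draws on the Japanese seismic performance evaluation method and adapts its somewhat complicated procedures to domestic conditions; and the FEMA-440 linearization method, which improves on the relations for the equivalent effective damping and equivalent effective period used when converting to an equivalent single-degree-of-freedom model in the ATC-40 performance evaluation method, a method widely used worldwide for detailed seismic performance evaluation. The analysis was carried out with Perform-3D, a dedicated nonlinear structural analysis program that is widely used worldwide. For the school building selected as an existing low-rise structure, the seismic retrofit priority evaluation and the FEMA-440 evaluation showed similar trends. In summary, in the Y direction no separate retrofit to improve seismic performance is needed, owing to the masonry walls infilled between beams and columns, whereas in the X direction the structure shows somewhat brittle seismic behavior, owing to elements such as the partial-height masonry walls below the windows, and additional retrofit is judged necessary to secure sufficient seismic performance.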

A domain-specific sentiment lexicon construction method for stock index directionality (주가지수 방향성 예측을 위한 도메인 맞춤형 감성사전 구축방안)

Kim, Jae-Bong;Kim, Hyoung-Joong
- Journal of Digital Contents Society / v.18 no.3 / pp.585-592 / 2017
As the development of personal devices has made everyday internet use much easier than before, finding information and sharing it through social media has become commonplace. In particular, communities specialized in individual fields have become powerful enough to influence society significantly. As a result, businesses and governments pay attention to reflecting these communities' opinions in their strategies. The stock market fluctuates with various social factors. To take social trends into account, many studies have applied big data analysis to stock market research, alongside traditional approaches based on buzz volume. For example, studies using text data such as newspaper articles are being published. In this paper, we analyzed posts on 'Paxnet', a securities specialists' site, to compensate for the limitations of news. Based on this, we help researchers analyze investor sentiment by generating a domain-specific sentiment lexicon for the stock market.
https://doi.org/10.9728/dcs.2017.18.3.585

A Time Series Graph based Convolutional Neural Network Model for Effective Input Variable Pattern Learning : Application to the Prediction of Stock Market (효과적인 입력변수 패턴 학습을 위한 시계열 그래프 기반 합성곱 신경망 모형: 주식시장 예측에의 응용)

Lee, Mo-Se;Ahn, Hyunchul
- Journal of Intelligence and Information Systems / v.24 no.1 / pp.167-181 / 2018
Over the past decade, deep learning has been in the spotlight among machine learning algorithms. In particular, the CNN (Convolutional Neural Network), known as an effective solution for recognizing and classifying images or voices, has been widely applied to classification and prediction problems. In this study, we investigate how to apply CNNs to business problem solving. Specifically, this study proposes applying a CNN to stock market prediction, one of the most challenging tasks in machine learning research. As mentioned, CNNs are strong at interpreting images. Thus, the model proposed in this study adopts a CNN as a binary classifier that predicts stock market direction (upward or downward) using time series graphs as its inputs. That is, our proposal is to build a machine learning algorithm that mimics the experts called 'technical analysts', who examine graphs of past price movements and predict future financial price movements. Our proposed model, named 'CNN-FG (Convolutional Neural Network using Fluctuation Graph)', consists of five steps. In the first step, it divides the dataset into intervals of 5 days. In step 2, it creates time series graphs for the divided dataset. Each graph is drawn in an image of $40{\times}40$ pixels, and the graph of each independent variable is drawn in a different color. In step 3, the model converts the images into matrices. Each image is converted into a combination of three matrices that express the color values on the R (red), G (green), and B (blue) scales. In the next step, it splits the dataset of graph images into training and validation datasets. We used 80% of the total dataset as the training dataset and the remaining 20% as the validation dataset. In the final step, CNN classifiers are trained using the images of the training dataset. 
Regarding the parameters of CNN-FG, we adopted two convolution filters ($5{\times}5{\times}6$ and $5{\times}5{\times}9$) in the convolution layer. In the pooling layer, a $2{\times}2$ max pooling filter was used. The numbers of nodes in the two hidden layers were set to 900 and 32, respectively, and the number of nodes in the output layer was set to 2 (one for the prediction of an upward trend and the other for a downward trend). The activation function for the convolution layer and the hidden layers was set to ReLU (Rectified Linear Unit), and that for the output layer to the Softmax function. To validate CNN-FG, we applied it to the prediction of KOSPI200 for 2,026 days over eight years (from 2009 to 2016). To balance the proportions of the two groups in the dependent variable (i.e. tomorrow's stock market movement), we selected 1,950 samples by random sampling. Finally, we built the training dataset using 80% of the total dataset (1,560 samples) and the validation dataset using 20% (390 samples). The independent variables of the experimental dataset included twelve technical indicators widely used in previous studies. They include Stochastic %K, Stochastic %D, Momentum, ROC (rate of change), LW %R (Larry Williams' %R), A/D oscillator (accumulation/distribution oscillator), OSCP (price oscillator), CCI (commodity channel index), and so on. To confirm the superiority of CNN-FG, we compared its prediction accuracy with that of other classification models. Experimental results showed that CNN-FG outperforms LOGIT (logistic regression), ANN (artificial neural network), and SVM (support vector machine) with statistical significance. These empirical results imply that converting time series business data into graphs and building CNN-based classification models on these graphs can be effective from the perspective of prediction accuracy. 
Thus, this paper sheds light on how to apply deep learning techniques to the domain of business problem solving.
https://doi.org/10.13088/jiis.2018.24.1.167
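
Steps 1 to 3 of the abstract above (5-day intervals, a 40 by 40 pixel graph per interval, three color matrices per image) might look something like the sketch below. It is only an illustration under stated assumptions: the abstract does not say whether intervals overlap, how values are scaled, or which colors are used, so the non-overlapping windows, per-window min-max scaling, linear interpolation between days, and the palette are all hypothetical choices.

```python
def window_to_rgb(window, colors, size=40):
    """Rasterize one window into three size x size matrices (R, G, B).

    window: list of rows, one per day (at least two), each a list of values.
    colors: one (r, g, b) tuple per variable (hypothetical palette).
    """
    days = len(window)
    channels = [[[0] * size for _ in range(size)] for _ in range(3)]
    for j, color in enumerate(colors):
        series = [float(row[j]) for row in window]
        lo, hi = min(series), max(series)
        for col in range(size):
            # Map the pixel column onto the day axis and interpolate linearly.
            t = col * (days - 1) / (size - 1)
            i = min(int(t), days - 2)
            value = series[i] + (series[i + 1] - series[i]) * (t - i)
            # Scale to [0, 1] within the window; a flat series sits mid-height.
            norm = (value - lo) / (hi - lo) if hi > lo else 0.5
            row_px = round((1.0 - norm) * (size - 1))  # row 0 is the image top
            for ch in range(3):
                channels[ch][row_px][col] = color[ch]
    return channels

def make_images(data, colors, days=5, size=40):
    """Split data into consecutive non-overlapping windows and rasterize each."""
    n_windows = len(data) // days
    return [window_to_rgb(data[k * days:(k + 1) * days], colors, size)
            for k in range(n_windows)]
```

Each returned image is a list of three matrices, which could then be stacked into the input tensor of whatever CNN library is used for the final training step.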

Comparative Study of Automatic Trading and Buy-and-Hold in the S&P 500 Index Using a Volatility Breakout Strategy (변동성 돌파 전략을 사용한 S&P 500 지수의 자동 거래와 매수 및 보유 비교 연구)

Sunghyuck Hong
- Journal of Internet of Things and Convergence / v.9 no.6 / pp.57-62 / 2023
This research is a comparative analysis of the U.S. S&P 500 index using the volatility breakout strategy against the Buy and Hold approach. The volatility breakout strategy is a trading method that exploits price movements after periods of relative market stability or concentration. Specifically, it is observed that large price movements tend to occur more frequently after periods of low volatility. When a stock moves within a narrow price range for a while and then suddenly rises or falls, it is expected to continue moving in that direction. To capitalize on these movements, traders adopt the volatility breakout strategy. The 'k' value is used as a multiplier applied to a measure of recent market volatility. One method of measuring volatility is the Average True Range (ATR), which represents the difference between the highest and lowest prices of recent trading days. The 'k' value plays a crucial role for traders in setting their trade threshold. This study calculated the 'k' value at a general level and compared its returns with the Buy and Hold strategy, finding that algorithmic trading using the volatility breakout strategy achieved slightly higher returns. In the future, we plan to present simulation results for maximizing returns by determining the optimal 'k' value for automated trading of the S&P 500 index using artificial intelligence deep learning techniques.
https://doi.org/10.20465/KIOTS.2023.9.6.057
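
To make the role of the 'k' multiplier concrete, here is a small sketch of one common formulation of the volatility breakout rule. It is an assumption, not the paper's exact procedure: the entry level is taken as the day's open plus k times the previous day's high-low range, a triggered position is closed at the same day's close, and costs and slippage are ignored.

```python
def breakout_return(bars, k=0.5):
    """Cumulative return of a simple volatility breakout rule.

    bars: list of (open, high, low, close) tuples, oldest first.
    k: multiplier applied to the previous day's range (assumed volatility measure).
    """
    equity = 1.0
    for prev, cur in zip(bars, bars[1:]):
        prev_range = prev[1] - prev[2]            # previous high minus previous low
        open_, high, _low, close = cur
        target = open_ + k * prev_range           # breakout entry level
        if high >= target:                        # price reached the entry level
            equity *= close / target              # enter at target, exit at close
    return equity - 1.0

def buy_and_hold_return(bars):
    """Return from buying at the first open and holding to the last close."""
    return bars[-1][3] / bars[0][0] - 1.0
```

Running both functions over the same bars and varying `k` gives the kind of comparison the abstract describes; a larger `k` demands a bigger move before entry and so trades less often.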

A Study on the Men's Fashion Trend through the Statistical Analysis (통계적 분석을 통한 남성 패션 트렌드 연구)

Kim, Yoon-Kyoung;Lee, Kyoung-Hee
- Journal of the Korean Society of Clothing and Textiles / v.31 no.6 s.165 / pp.837-847 / 2007
A total of 1,098 photographs (1995-2002) of men's suit styles were classified according to fashion images in order to examine their features and changes through statistical analysis. The findings from examining the features of the trend by year with the test of homogeneity, correspondence analysis, biplots, correlation analysis, and regression analysis are as follows: (a) the test of homogeneity shows significant differences in fashion images by year; (b) correspondence analysis and biplots show remarkable differences in the fashion trend by year; (c) correlation analysis shows significant correlations among fashion images in their frequency of appearance; and (d) regression analysis suggests that fashion images will gradually become more prominent.
https://doi.org/10.5850/JKSCT.2007.31.6.837

Numerical Simulation of the Formation of Oxygen Deficient Water-masses in Jinhae Bay (진해만의 빈산소 수괴 형성에 관한 수치실험)

CHOI Woo-Jeung;PARK Chung-Kill;LEE Suk-Mo
- Korean Journal of Fisheries and Aquatic Sciences / v.27 no.4 / pp.413-433 / 1994
Jinhae Bay was once a productive fishing area. It is now, however, notorious for its red tides, and oxygen deficient water-masses develop extensively in summer. As a result, shellfish production in the bay has been decreasing and mass mortality often occurs. Under these circumstances, the three-dimensional numerical hydrodynamic and material cycle models developed by the Institute for Resources and Environment of Japan were applied to analyze the processes affecting oxygen depletion and to evaluate the environmental capacity for receiving pollutant loads without dissolved oxygen depletion. In field surveys, oxygen deficient water-masses with concentrations below 2.0 mg/l formed at the bottom layer in Masan Bay and the western part of Jinhae Bay during the summer. Current directions, computed for the $M_2$ constituent, were mainly toward the western part of Jinhae Bay during flood flows and in the opposite direction during ebb flows. Tidal current velocities during the ebb tide were stronger than those during the flood tide. The comparison between the simulated and observed tidal ellipses showed fairly good agreement. The residual currents, obtained by averaging the simulated tidal currents over one tidal cycle, showed the presence of counterclockwise eddies in the central part of Jinhae Bay. Density driven currents were generated southward at the surface and northward at the bottom in Masan Bay and Jindong Bay, where fresh water from rivers entered. The material cycle model was calibrated with data surveyed in the field in the study area from June to July 1992. The calibrated results are in fairly good agreement with measured values, within a relative error of 28%. 
The simulated dissolved oxygen concentrations of the bottom layer were relatively high, $6.0{\sim}8.0$ mg/l, at the boundaries, but oxygen deficient water-masses with concentrations within 2.0 mg/l formed in the inner part of Masan Bay and the western part of Jinhae Bay. The sensitivity analyses showed that sediment oxygen demand (SOD) was one of the most important influences on the formation of oxygen depletion. Therefore, reducing SOD by improving the polluted sediment is an effective way to control the oxygen deficient water-masses and conserve the coastal environment. In the simulations, oxygen deficient water-masses in Masan Bay recovered to 5.0 mg/l when input COD loads from the Masan basin were reduced by 50% and SOD by 70%. In the western part of Jinhae Bay, oxygen deficient water-masses recovered to 5.0 mg/l when SOD was reduced by 95% and fecal loads from culturing grounds by 90%.
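
The residual current mentioned in this abstract is the tidal current averaged over one tidal cycle. A toy sketch of that averaging is shown below; the synthetic velocity series, the sample count, and the function names are assumptions for illustration, and the real study averaged three-dimensional model output rather than a single cosine.

```python
import math

M2_PERIOD_H = 12.42  # approximate period of the M2 constituent, in hours

def synthetic_tide(mean_flow, amplitude, n_samples=120):
    """Velocity samples evenly spaced over exactly one M2 cycle (end excluded)."""
    dt = M2_PERIOD_H / n_samples
    omega = 2.0 * math.pi / M2_PERIOD_H
    return [mean_flow + amplitude * math.cos(omega * k * dt)
            for k in range(n_samples)]

def residual_current(velocities):
    """Average velocity over the cycle; the oscillating tidal part cancels out."""
    return sum(velocities) / len(velocities)
```

With a 0.6 m/s tidal amplitude superimposed on a 0.05 m/s steady flow, `residual_current(synthetic_tide(0.05, 0.6))` recovers approximately the steady 0.05 m/s component.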

Assessment of Soil Loss Estimated by Soil Catena Originated from Granite and Gneiss in Catchment (소유역단위 화강암/편마암 기원 토양 연접군(catena)에 따른 토양 유실 평가)

Hur, Seung-Oh;Sonn, Yeon-Kyu;Jung, Kang-Ho;Park, Chan-Won;Lee, Hyun-Hang;Ha, Sang-Keun;Kim, Jeong-Gyu
- Korean Journal of Soil Science and Fertilizer / v.40 no.5 / pp.383-391 / 2007
This study assessed soil loss by estimating it for each catchment classified by soil catena. Ten catchments (Geumgang21, Namgang03, Dongjincheon, Gapyongcheon01, Gyongancheon02, Geumgang16, Byongsungcheon01, Daesincheon, Bukcheon02, and Youngsangang08) were selected from the hydrologic unit map and the detailed digital soil map (1:25,000). The Geumgang21, Namgang03, Dongjincheon, Gapyongcheon01, and Gyongancheon02 catchments were mainly composed of soils originating from gneiss. The Geumgang16, Byongsungcheon01, Daesincheon, Bukcheon02, and Youngsangang08 catchments were mainly composed of soils originating from granite. Soil erosion estimated by USLE was divided into seven grades, A (very tolerable), B (tolerable), C (moderate), D (low), E (high), F (severe), and G (very severe), and the catchments fell mostly into grades A and B because of paddy land and forestry. In detail, catchments dominated by gneiss-derived soils showed a larger share of grades B and C than catchments dominated by granite-derived soils. This result likely derives from the topographic characteristics of gneiss-derived soils, which are located in mountainous areas. The soil loss for the soil catena linked with the Songsan and Jigok series, which are soils originating from gneiss, was calculated as 7.66 ton ha$^{-1}$ yr$^{-1}$. The soil loss of Geumgang16, Byongsungcheon01, Daesincheon, and Bukcheon02, which have the soil catena linked with the Samgak and Sangju soil series originating from granite, was calculated as 5.55 ton ha$^{-1}$ yr$^{-1}$. The soil loss of Youngsangang08, which has the soil catena linked with the Songjung and Baeksan soil series originating from granite, was calculated as 9.6 ton ha$^{-1}$ yr$^{-1}$, but a conclusion on soil loss for this kind of soil catena should be drawn from the analysis of more catchments. 
In conclusion, the results of this study indicate that classifying soil catena by catchment and estimating soil loss according to soil catena would be effective for analyzing the grade of non-point pollution from soil erosion in a catchment.
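
The USLE estimate referred to in this abstract multiplies five factors, A = R x K x LS x C x P. The sketch below shows that product and an area-weighted catchment average. The factor values in the usage note are hypothetical, the area weighting is an assumed way of aggregating map units, and the thresholds for grades A to G are not given in the abstract, so no grading is attempted.

```python
def usle_soil_loss(r, k, ls, c, p):
    """Annual soil loss A (e.g. ton/ha/yr) as the product of the USLE factors.

    r: rainfall erosivity, k: soil erodibility, ls: slope length and steepness,
    c: cover management, p: support practice.
    """
    return r * k * ls * c * p

def catchment_mean_loss(units):
    """Area-weighted mean soil loss over map units given as (area_ha, loss)."""
    total_area = sum(area for area, _ in units)
    return sum(area * loss for area, loss in units) / total_area
```

For example, `usle_soil_loss(4000, 0.03, 1.2, 0.1, 0.5)` gives about 7.2 with these made-up factors, and `catchment_mean_loss([(10, 2.0), (30, 6.0)])` gives 5.0.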

Study of Rainfall-Runoff Variation by Grid Size and Critical Area (격자크기와 임계면적에 따른 홍수유출특성 변화)

Ahn, Seung-Seop;Lee, Jeung-Seok;Jung, Do-Joon;Han, Ho-Chul
- Journal of Environmental Science International / v.16 no.4 / pp.523-532 / 2007
This study used the 1/25,000 topographic map, from the National Geographic Information Institute, of the area upstream of the Geum-ho watermark located in the middle reach of the Geum-ho river. For the analysis, the influence of the size of the critical area on the hydro topographic factors was first examined by changing the grid size to $10m{\times}10m$, $30m{\times}30m$, and $50m{\times}50m$, and the critical area for the formation of a river over $0.01km^2{\sim}0.50km^2$. The examination of watershed morphology by grid size shows that the smaller the grid size, the better the resolution and accuracy. The analysis of the degree of the river according to the minimum critical area for each grid size shows that the grid size does not affect the degree of the river, and that the number of rivers of 2nd and higher degree does not differ remarkably, while the number of 1st degree rivers differs greatly. From these results, a critical area of $0.15km^2{\sim}0.20km^2$ is thought to be appropriate for the formation of a river, irrespective of grid size, when extracting the hydro topographic parameters used in runoff analysis models from topographic maps. The GIUH model was then applied, using the river level difference law proposed in this study, to explain how the outflow response changes with the choice of critical value for a minimum level difference river. Because the ogival occurrence time and ogival flow volume are very significant for flood occurrence where there are no undertow facilities, and given the model's convenient application and the easy acquisition of its data, it gave good results for forecasting river outflow; the model is therefore judged suitable as an algorithm for deciding the critical value of a river basin.
https://doi.org/10.5322/JES.2007.16.4.523
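
The interaction between grid size and critical area in this abstract can be illustrated with a short sketch. It assumes the usual GIS convention, not stated in the abstract, that a cell is treated as part of a river once the number of upstream cells draining into it reaches the critical area divided by the cell area; the function names and the sample grid are hypothetical.

```python
def threshold_cells(critical_area_km2, cell_size_m):
    """Number of upstream cells whose combined area equals the critical area."""
    return critical_area_km2 * 1_000_000 / (cell_size_m ** 2)

def stream_mask(flow_accumulation, critical_area_km2, cell_size_m):
    """Mark cells whose upstream cell count reaches the critical area.

    flow_accumulation: 2-D list of upstream cell counts (assumed precomputed).
    """
    limit = threshold_cells(critical_area_km2, cell_size_m)
    return [[count >= limit for count in row] for row in flow_accumulation]
```

For a critical area of 0.15 km^2 the threshold is about 1,500 cells on a 10 m grid but only about 60 cells on a 50 m grid, which is why the same critical area has to be re-expressed whenever the grid size changes.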

Study on the channel of bipolar plate for PEM fuel cell (고분자 전해질 연료전지용 바이폴라 플레이트의 유로 연구)

Ahn Bum Jong;Ko Jae-Churl;Jo Young-Do
- Journal of the Korean Institute of Gas / v.8 no.2 s.23 / pp.15-27 / 2004
The purpose of this paper is to improve the performance of the polymer electrolyte membrane fuel cell (PEMFC) by studying the channel dimensions of bipolar plates using the commercial CFD program 'Fluent'. Simulations were run for sizes ranging from 0.5 to 3.0 mm to find the channel size that shows the highest hydrogen consumption. The results showed that the smaller the channel width, land width, and channel depth, the higher the hydrogen consumption at the anode. When channel width is increased, the pressure drop in the channel decreases because the total channel length decreases, and when land width is increased, net hydrogen consumption decreases because hydrogen diffuses under the land. Hydrogen consumption was also found to be influenced more by changes in channel width than by changes in land width. The change in hydrogen consumption with channel depth is not as large as that with channel width, but channel depth should be as small as possible because it affects the volume of the bipolar plates. However, among channel sizes of 1.0 mm or more, which can be machined in practice, hydrogen utilization is highest at a channel width of 1.0 mm, a land width of 1.0 mm, and a channel depth of 0.5 mm, which is considered the optimum channel size. A fuel cell combining a 2cm${\times}$2cm diagonal or serpentine type flow field with an MEA (Membrane Electrode Assembly) was tested on a 100 W PEMFC test station to confirm the channel size studied in simulation. The results showed that the diagonal and serpentine flow fields have similarly high OCV; the current density of the diagonal flow field is higher ($2-40mA/m^2$) than that of the serpentine flow field under 0.6 V, but the serpentine type shows higher current density ($5-10mA/m^2$) than the diagonal flow field under 0.7-0.8 V.
