Search | Korea Science

The Analysis on the Relationship between Firms' Exposures to SNS and Stock Prices in Korea (기업의 SNS 노출과 주식 수익률간의 관계 분석)

Kim, Taehwan;Jung, Woo-Jin;Lee, Sang-Yong Tom
- Asia pacific journal of information systems
- /
- v.24 no.2
- /
- pp.233-253
- /
- 2014
Can the stock market really be predicted? Stock market prediction has attracted much attention from many fields including business, economics, statistics, and mathematics. Early research on stock market prediction was based on random walk theory (RWT) and the efficient market hypothesis (EMH). According to the EMH, stock market are largely driven by new information rather than present and past prices. Since it is unpredictable, stock market will follow a random walk. Even though these theories, Schumaker [2010] asserted that people keep trying to predict the stock market by using artificial intelligence, statistical estimates, and mathematical models. Mathematical approaches include Percolation Methods, Log-Periodic Oscillations and Wavelet Transforms to model future prices. Examples of artificial intelligence approaches that deals with optimization and machine learning are Genetic Algorithms, Support Vector Machines (SVM) and Neural Networks. Statistical approaches typically predicts the future by using past stock market data. Recently, financial engineers have started to predict the stock prices movement pattern by using the SNS data. SNS is the place where peoples opinions and ideas are freely flow and affect others' beliefs on certain things. Through word-of-mouth in SNS, people share product usage experiences, subjective feelings, and commonly accompanying sentiment or mood with others. An increasing number of empirical analyses of sentiment and mood are based on textual collections of public user generated data on the web. The Opinion mining is one domain of the data mining fields extracting public opinions exposed in SNS by utilizing data mining. There have been many studies on the issues of opinion mining from Web sources such as product reviews, forum posts and blogs. In relation to this literatures, we are trying to understand the effects of SNS exposures of firms on stock prices in Korea. Similarly to Bollen et al. [2011], we empirically analyze the impact of SNS exposures on stock return rates. We use Social Metrics by Daum Soft, an SNS big data analysis company in Korea. Social Metrics provides trends and public opinions in Twitter and blogs by using natural language process and analysis tools. It collects the sentences circulated in the Twitter in real time, and breaks down these sentences into the word units and then extracts keywords. In this study, we classify firms' exposures in SNS into two groups: positive and negative. To test the correlation and causation relationship between SNS exposures and stock price returns, we first collect 252 firms' stock prices and KRX100 index in the Korea Stock Exchange (KRX) from May 25, 2012 to September 1, 2012. We also gather the public attitudes (positive, negative) about these firms from Social Metrics over the same period of time. We conduct regression analysis between stock prices and the number of SNS exposures. Having checked the correlation between the two variables, we perform Granger causality test to see the causation direction between the two variables. The research result is that the number of total SNS exposures is positively related with stock market returns. The number of positive mentions of has also positive relationship with stock market returns. Contrarily, the number of negative mentions has negative relationship with stock market returns, but this relationship is statistically not significant. This means that the impact of positive mentions is statistically bigger than the impact of negative mentions. We also investigate whether the impacts are moderated by industry type and firm's size. We find that the SNS exposures impacts are bigger for IT firms than for non-IT firms, and bigger for small sized firms than for large sized firms. The results of Granger causality test shows change of stock price return is caused by SNS exposures, while the causation of the other way round is not significant. Therefore the correlation relationship between SNS exposures and stock prices has uni-direction causality. The more a firm is exposed in SNS, the more is the stock price likely to increase, while stock price changes may not cause more SNS mentions.
https://doi.org/10.14329/apjis.2014.24.2.233 인용 PDF

The research for the yachting development of Korean Marina operation plans (요트 발전을 위한 한국형 마리나 운영방안에 관한 연구)

Jeong Jong-Seok;Hugh Ihl
- Journal of Navigation and Port Research
- /
- v.28 no.10 s.96
- /
- pp.899-908
- /
- 2004
The rise of income and introduction of 5 day a week working system give korean people opportunities to enjoy their leisure time. And many korean people have much interest in oceanic sports such as yachting and also oceanic leisure equipments. With the popularization and development of the equipments, the scope of oceanic activities has been expanding in Korea just as in the advanced oceanic countries. However, The current conditions for the sports in Korea are not advanced and even worse than underdeveloped countries. In order to develop the underdeveloped resources of Korean marina, we need to customize the marina models of advanced nations to serve the specific needs and circumstances of Korea As such we have carried out a comparative analysis of how Austrailia, Newzealand, Singapore, japan and Malaysia operate their marina, reaching the following conclusions. Firstly, in marina operations, in order to protect personal property rights and to preserve the environment, we must operate membership and non-membership, profit and non-profit schemes separately, yet without regulating the dress code entering or leaving the club house. Secondly, in order to accumulate greater value added, new sporting events should be hosted each year. There is also the need for an active use of volunteers, the generation of greater interest in yacht tourism, and the simplification of CIQ procedures for foreign yachts as well as the provision of language services. Thirdly, a permanent yacht school should be established, and classes should be taught by qualified instructors. Beginners, intermediary, and advanced learner classes should be managed separately with special emphasis on the dinghy yacht program for children. Fourthly, arrival and departure at the moorings must be regulated autonomically, and there must be systematic measures for the marina to be able, in part, to compensate for loss and damages to equipment, security and surveillance after usage fees have been paid for. Fifthly, marine safety personnel must be formed in accordance with Korea's current circumstances from civilian organizations in order to be used actively in benchmarking, rescue operations, and oceanic searches at times of disaster at sea.
https://doi.org/10.5394/KINPR.2004.28.10.899 인용 PDF KSCI

The PRISM-based Rainfall Mapping at an Enhanced Grid Cell Resolution in Complex Terrain (복잡지형 고해상도 격자망에서의 PRISM 기반 강수추정법)

Chung, U-Ran;Yun, Kyung-Dahm;Cho, Kyung-Sook;Yi, Jae-Hyun;Yun, Jin-I.
- Korean Journal of Agricultural and Forest Meteorology
- /
- v.11 no.2
- /
- pp.72-78
- /
- 2009
The demand for rainfall data in gridded digital formats has increased in recent years due to the close linkage between hydrological models and decision support systems using the geographic information system. One of the most widely used tools for digital rainfall mapping is the PRISM (parameter-elevation regressions on independent slopes model) which uses point data (rain gauge stations), a digital elevation model (DEM), and other spatial datasets to generate repeatable estimates of monthly and annual precipitation. In the PRISM, rain gauge stations are assigned with weights that account for other climatically important factors besides elevation, and aspects and the topographic exposure are simulated by dividing the terrain into topographic facets. The size of facet or grid cell resolution is determined by the density of rain gauge stations and a $5{\times}5km$ grid cell is considered as the lowest limit under the situation in Korea. The PRISM algorithms using a 270m DEM for South Korea were implemented in a script language environment (Python) and relevant weights for each 270m grid cell were derived from the monthly data from 432 official rain gauge stations. Weighted monthly precipitation data from at least 5 nearby stations for each grid cell were regressed to the elevation and the selected linear regression equations with the 270m DEM were used to generate a digital precipitation map of South Korea at 270m resolution. Among 1.25 million grid cells, precipitation estimates at 166 cells, where the measurements were made by the Korea Water Corporation rain gauge network, were extracted and the monthly estimation errors were evaluated. An average of 10% reduction in the root mean square error (RMSE) was found for any months with more than 100mm monthly precipitation compared to the RMSE associated with the original 5km PRISM estimates. This modified PRISM may be used for rainfall mapping in rainy season (May to September) at much higher spatial resolution than the original PRISM without losing the data accuracy.
https://doi.org/10.5532/KJAFM.2009.11.2.072 인용 PDF KSCI

The Construction of QoS Integration Platform for Real-time Negotiation and Adaptation Stream Service in Distributed Object Computing Environments (분산 객체 컴퓨팅 환경에서 실시간 협약 및 적응 스트림 서비스를 위한 QoS 통합 플랫폼의 구축)

Jun, Byung-Taek;Kim, Myung-Hee;Joo, Su-Chong
- The Transactions of the Korea Information Processing Society
- /
- v.7 no.11S
- /
- pp.3651-3667
- /
- 2000
Recently, in the distributed multimedia environments based on internet, as radical growing technologies, the most of researchers focus on both streaming technology and distributed object thchnology, Specially, the studies which are tried to integrate the streaming services on the distributed object technology have been progressing. These technologies are applied to various stream service mamgements and protocols. However, the stream service management mexlels which are being proposed by the existing researches are insufficient for suporting the QoS of stream services. Besides, the existing models have the problems that cannot support the extensibility and the reusability, when the QoS-reiatedfunctions are being developed as a sub-module which is suited on the specific-purpose application services. For solving these problems, in this paper. we suggested a QoS Integrated platform which can extend and reuse using the distributed object technologies, and guarantee the QoS of the stream services. A structure of platform we suggested consists of three components such as User Control Module(UCM), QoS Management Module(QoSM) and Stream Object. Stream Object has Send/Receive operations for transmitting the RTP packets over TCP/IP. User Control ModuleI(UCM) controls Stream Objects via the COREA service objects. QoS Management Modulel(QoSM) has the functions which maintain the QoS of stream service between the UCMs in client and server. As QoS control methexlologies, procedures of resource monitoring, negotiation, and resource adaptation are executed via the interactions among these comiXments mentioned above. For constmcting this QoS integrated platform, we first implemented the modules mentioned above independently, and then, used IDL for defining interfaces among these mexlules so that can support platform independence, interoperability and portability base on COREA. This platform is constructed using OrbixWeb 3.1c following CORBA specification on Solaris 2.5/2.7, Java language, Java, Java Media Framework API 2.0, Mini-SQL1.0.16 and multimedia equipments. As results for verifying this platform functionally, we showed executing results of each module we mentioned above, and a numerical data obtained from QoS control procedures on client and server's GUI, while stream service is executing on our platform.
PDF

Transfer and Validation of NIRS Calibration Models for Evaluating Forage Quality in Italian Ryegrass Silages (이탈리안 라이그라스 사일리지의 품질평가를 위한 근적외선분광 (NIRS) 검량식의 이설 및 검증)

Cho, Kyu Chae;Park, Hyung Soo;Lee, Sang Hoon;Choi, Jin Hyeok;Seo, Sung;Choi, Gi Jun
- Journal of Animal Environmental Science
- /
- v.18 no.sup
- /
- pp.81-90
- /
- 2012
This study was evaluated high end research grade Near infrared spectrophotometer (NIRS) to low end popular field grade multiple Near infrared spectrophotometer (NIRS) for rapid analysis at forage quality at sight with 241 samples of Italian ryegrass silage during 3 years collected whole country for evaluate accuracy and precision between instruments. Firstly collected and build database high end research grade NIRS using with Unity Scientific Model 2500X (650 nm~2,500 nm) then trim and fit to low end popular field grade NIRS with Unity Scientific Model 1400 (1,400 nm~2,400 nm) then build and create calibration, transfer calibration with special transfer algorithm. The result between instruments was 0.000%~0.343% differences, rapidly analysis for chemical constituents, NDF, ADF, and crude protein, crude ash and fermentation parameter such as moisture, pH and lactic acid, finally forage quality parameter, TDN, DMI, RFV within 5 minutes at sight and the result equivalent with laboratory data. Nevertheless during 3 years collected samples for build calibration was organic samples that make differentiate by local or yearly bases etc. This strongly suggest population evaluation technique needed and constantly update calibration and maintenance calibration to proper handling database accumulation and spread out by knowledgable control laboratory analysis and reflect calibration update such as powerful control center needed for long lasting usage of forage analysis with NIRS at sight. Especially the agriculture products such as forage will continuously changes that made easily find out the changes and update routinely, if not near future NIRS was worthless due to those changes. Many research related NIRS was shortly study not long term study that made not well using NIRS, so the system needed check simple and instantly using with local language supported signal methods Global Distance (GD) and Neighbour Distance (ND) algorithm. Finally the multiple popular field grades instruments should be the same results not only between research grade instruments but also between multiple popular field grade instruments that needed easily transfer calibration and maintenance between instruments via internet networking techniques.
PDF KSCI

Installation Art In Indonesian Contemporary Art; A Quest For Medium and Social Spaces (인도네시아 현대미술에 있어서의 설치미술 - 미디엄과 사회적 공간을 위한 탐색)

Kusmara, A. Rikrik
- The Journal of Art Theory & Practice
- /
- no.5
- /
- pp.217-229
- /
- 2007
Many historical research and facet about modern art in Indonesia which formulating background of contemporary Indonesian Art. Indonesian art critic Sanento Yuliman states that Modern art has been rapidly developing in Indonesia since the Indonesian Independence in 1945. Modern Art is a part of the super culture of the Indonesian metropolitan and is closely related to the contact between the Indonesian and Western Cultures. Its birth was part of the nationalism project, when the Indonesian people consists of various ethnics were determined to become a new nation, the Indonesian nation, and they wished for a new culture, and therefore, a new art. The period 1960s, which was the beginning of the creation and development of the painters and the painters associations, was the first stage of the development of modern art in Indonesia. The second stage showed the important role of the higher education institutes for art. These institutes have developed since the 1950s and in the 1970s they were the main education institutes for painters and other artists. The artists awareness of the medium, forms or the organization of shapes were encouraged more intensely and these encouraged the exploring and experimental attitudes. Meanwhile, the information about the world's modern art, particularly Western Art; was widely and rapidly spread. The 1960s and 1970s were marked by the development of various abstractions and abstract art and the great number of explorations in various new media, like the experiment with collage, assemblage, mixed media. The works of the Neo Art Movement-group in the second half of the 1970s and in the 1980s shows environmental art and installations, influenced by the elements of popular art, from the commercial world and mass media, as well as the involvement of art in the social and environmental affairs. The issues about the environment, frequently launched by the intellectuals in the period of economic development starting in the 1970s, echoed among the artists, and they were widened in the social, art and cultural circles. The Indonesian economic development following the important change in the 1970s has caused a change in the life of the middle and upper class society, as has the change in various aspects of a big city, particularly Jakarta. The new genre emerged in 1975 which indicates contemporary art in Indonesia, when a group of young artists organized a movement, which was widely known as the Indonesian New Art Movement. This movement criticized international style, universalism and the long standing debate on an east-west-dichotomy. As far as the actual practice of the arts was concerned the movement criticized the domination of the art of painting and saw this as a sign of stagnation in Indonesian art development. Based on this criticism 'the movement' introduced ready-mades and installations (Jim Supangkat). Takes almost two decades that the New Art Movement activists were establishing Indonesian Installation art genre as contemporary paradigm and influenced the 1980's gene ration like, FX Harsono, Dadang Christanto, Arahmaiani, Tisna Sanjaya, Diyanto, Andarmanik, entering the 1990's decade as "rebellion period" ; reject towards established aesthetic mainstream i.e. painting, sculpture, graphic art which are insufficient to express "new language" and artistic needs especially to mediate social politic and cultural situation. Installation Art which contains open possibilities of creation become a vehicle for aesthetic establishment rejection and social politics stagnant expression in 1990s. Installation art accommodates two major field; first, the rejection of aesthetic establishment has a consequences an artists quest for medium; deconstruction models and cross disciplines into multi and intermedia i.e. performance, music, video etc. Second aspect is artists' social politic intention for changes, both conclude as characteristics of Indonesian Installation Art and establishing the freedom of expression in contemporary Indonesian Art until today.
PDF

Knowledge Extraction Methodology and Framework from Wikipedia Articles for Construction of Knowledge-Base (지식베이스 구축을 위한 한국어 위키피디아의 학습 기반 지식추출 방법론 및 플랫폼 연구)

Kim, JaeHun;Lee, Myungjin
- Journal of Intelligence and Information Systems
- /
- v.25 no.1
- /
- pp.43-61
- /
- 2019
Development of technologies in artificial intelligence has been rapidly increasing with the Fourth Industrial Revolution, and researches related to AI have been actively conducted in a variety of fields such as autonomous vehicles, natural language processing, and robotics. These researches have been focused on solving cognitive problems such as learning and problem solving related to human intelligence from the 1950s. The field of artificial intelligence has achieved more technological advance than ever, due to recent interest in technology and research on various algorithms. The knowledge-based system is a sub-domain of artificial intelligence, and it aims to enable artificial intelligence agents to make decisions by using machine-readable and processible knowledge constructed from complex and informal human knowledge and rules in various fields. A knowledge base is used to optimize information collection, organization, and retrieval, and recently it is used with statistical artificial intelligence such as machine learning. Recently, the purpose of the knowledge base is to express, publish, and share knowledge on the web by describing and connecting web resources such as pages and data. These knowledge bases are used for intelligent processing in various fields of artificial intelligence such as question answering system of the smart speaker. However, building a useful knowledge base is a time-consuming task and still requires a lot of effort of the experts. In recent years, many kinds of research and technologies of knowledge based artificial intelligence use DBpedia that is one of the biggest knowledge base aiming to extract structured content from the various information of Wikipedia. DBpedia contains various information extracted from Wikipedia such as a title, categories, and links, but the most useful knowledge is from infobox of Wikipedia that presents a summary of some unifying aspect created by users. These knowledge are created by the mapping rule between infobox structures and DBpedia ontology schema defined in DBpedia Extraction Framework. In this way, DBpedia can expect high reliability in terms of accuracy of knowledge by using the method of generating knowledge from semi-structured infobox data created by users. However, since only about 50% of all wiki pages contain infobox in Korean Wikipedia, DBpedia has limitations in term of knowledge scalability. This paper proposes a method to extract knowledge from text documents according to the ontology schema using machine learning. In order to demonstrate the appropriateness of this method, we explain a knowledge extraction model according to the DBpedia ontology schema by learning Wikipedia infoboxes. Our knowledge extraction model consists of three steps, document classification as ontology classes, proper sentence classification to extract triples, and value selection and transformation into RDF triple structure. The structure of Wikipedia infobox are defined as infobox templates that provide standardized information across related articles, and DBpedia ontology schema can be mapped these infobox templates. Based on these mapping relations, we classify the input document according to infobox categories which means ontology classes. After determining the classification of the input document, we classify the appropriate sentence according to attributes belonging to the classification. Finally, we extract knowledge from sentences that are classified as appropriate, and we convert knowledge into a form of triples. In order to train models, we generated training data set from Wikipedia dump using a method to add BIO tags to sentences, so we trained about 200 classes and about 2,500 relations for extracting knowledge. Furthermore, we evaluated comparative experiments of CRF and Bi-LSTM-CRF for the knowledge extraction process. Through this proposed process, it is possible to utilize structured knowledge by extracting knowledge according to the ontology schema from text documents. In addition, this methodology can significantly reduce the effort of the experts to construct instances according to the ontology schema.
https://doi.org/10.13088/jiis.2019.25.1.043 인용 PDF KSCI HTML

Performance of Investment Strategy using Investor-specific Transaction Information and Machine Learning (투자자별 거래정보와 머신러닝을 활용한 투자전략의 성과)

Kim, Kyung Mock;Kim, Sun Woong;Choi, Heung Sik
- Journal of Intelligence and Information Systems
- /
- v.27 no.1
- /
- pp.65-82
- /
- 2021
Stock market investors are generally split into foreign investors, institutional investors, and individual investors. Compared to individual investor groups, professional investor groups such as foreign investors have an advantage in information and financial power and, as a result, foreign investors are known to show good investment performance among market participants. The purpose of this study is to propose an investment strategy that combines investor-specific transaction information and machine learning, and to analyze the portfolio investment performance of the proposed model using actual stock price and investor-specific transaction data. The Korea Exchange offers daily information on the volume of purchase and sale of each investor to securities firms. We developed a data collection program in C# programming language using an API provided by Daishin Securities Cybosplus, and collected 151 out of 200 KOSPI stocks with daily opening price, closing price and investor-specific net purchase data from January 2, 2007 to July 31, 2017. The self-organizing map model is an artificial neural network that performs clustering by unsupervised learning and has been introduced by Teuvo Kohonen since 1984. We implement competition among intra-surface artificial neurons, and all connections are non-recursive artificial neural networks that go from bottom to top. It can also be expanded to multiple layers, although many fault layers are commonly used. Linear functions are used by active functions of artificial nerve cells, and learning rules use Instar rules as well as general competitive learning. The core of the backpropagation model is the model that performs classification by supervised learning as an artificial neural network. We grouped and transformed investor-specific transaction volume data to learn backpropagation models through the self-organizing map model of artificial neural networks. As a result of the estimation of verification data through training, the portfolios were rebalanced monthly. For performance analysis, a passive portfolio was designated and the KOSPI 200 and KOSPI index returns for proxies on market returns were also obtained. Performance analysis was conducted using the equally-weighted portfolio return, compound interest rate, annual return, Maximum Draw Down, standard deviation, and Sharpe Ratio. Buy and hold returns of the top 10 market capitalization stocks are designated as a benchmark. Buy and hold strategy is the best strategy under the efficient market hypothesis. The prediction rate of learning data using backpropagation model was significantly high at 96.61%, while the prediction rate of verification data was also relatively high in the results of the 57.1% verification data. The performance evaluation of self-organizing map grouping can be determined as a result of a backpropagation model. This is because if the grouping results of the self-organizing map model had been poor, the learning results of the backpropagation model would have been poor. In this way, the performance assessment of machine learning is judged to be better learned than previous studies. Our portfolio doubled the return on the benchmark and performed better than the market returns on the KOSPI and KOSPI 200 indexes. In contrast to the benchmark, the MDD and standard deviation for portfolio risk indicators also showed better results. The Sharpe Ratio performed higher than benchmarks and stock market indexes. Through this, we presented the direction of portfolio composition program using machine learning and investor-specific transaction information and showed that it can be used to develop programs for real stock investment. The return is the result of monthly portfolio composition and asset rebalancing to the same proportion. Better outcomes are predicted when forming a monthly portfolio if the system is enforced by rebalancing the suggested stocks continuously without selling and re-buying it. Therefore, real transactions appear to be relevant.
https://doi.org/10.13088/jiis.2021.27.1.065 인용 PDF KSCI

Topic Modeling Insomnia Social Media Corpus using BERTopic and Building Automatic Deep Learning Classification Model (BERTopic을 활용한 불면증 소셜 데이터 토픽 모델링 및 불면증 경향 문헌 딥러닝 자동분류 모델 구축)

Ko, Young Soo;Lee, Soobin;Cha, Minjung;Kim, Seongdeok;Lee, Juhee;Han, Ji Yeong;Song, Min
- Journal of the Korean Society for information Management
- /
- v.39 no.2
- /
- pp.111-129
- /
- 2022
Insomnia is a chronic disease in modern society, with the number of new patients increasing by more than 20% in the last 5 years. Insomnia is a serious disease that requires diagnosis and treatment because the individual and social problems that occur when there is a lack of sleep are serious and the triggers of insomnia are complex. This study collected 5,699 data from 'insomnia', a community on 'Reddit', a social media that freely expresses opinions. Based on the International Classification of Sleep Disorders ICSD-3 standard and the guidelines with the help of experts, the insomnia corpus was constructed by tagging them as insomnia tendency documents and non-insomnia tendency documents. Five deep learning language models (BERT, RoBERTa, ALBERT, ELECTRA, XLNet) were trained using the constructed insomnia corpus as training data. As a result of performance evaluation, RoBERTa showed the highest performance with an accuracy of 81.33%. In order to in-depth analysis of insomnia social data, topic modeling was performed using the newly emerged BERTopic method by supplementing the weaknesses of LDA, which is widely used in the past. As a result of the analysis, 8 subject groups ('Negative emotions', 'Advice and help and gratitude', 'Insomnia-related diseases', 'Sleeping pills', 'Exercise and eating habits', 'Physical characteristics', 'Activity characteristics', 'Environmental characteristics') could be confirmed. Users expressed negative emotions and sought help and advice from the Reddit insomnia community. In addition, they mentioned diseases related to insomnia, shared discourse on the use of sleeping pills, and expressed interest in exercise and eating habits. As insomnia-related characteristics, we found physical characteristics such as breathing, pregnancy, and heart, active characteristics such as zombies, hypnic jerk, and groggy, and environmental characteristics such as sunlight, blankets, temperature, and naps.
https://doi.org/10.3743/KOSIM.2022.39.2.111 인용 PDF KSCI

A Study on Knowledge Entity Extraction Method for Individual Stocks Based on Neural Tensor Network (뉴럴 텐서 네트워크 기반 주식 개별종목 지식개체명 추출 방법에 관한 연구)

Yang, Yunseok;Lee, Hyun Jun;Oh, Kyong Joo
- Journal of Intelligence and Information Systems
- /
- v.25 no.2
- /
- pp.25-38
- /
- 2019
Selecting high-quality information that meets the interests and needs of users among the overflowing contents is becoming more important as the generation continues. In the flood of information, efforts to reflect the intention of the user in the search result better are being tried, rather than recognizing the information request as a simple string. Also, large IT companies such as Google and Microsoft focus on developing knowledge-based technologies including search engines which provide users with satisfaction and convenience. Especially, the finance is one of the fields expected to have the usefulness and potential of text data analysis because it's constantly generating new information, and the earlier the information is, the more valuable it is. Automatic knowledge extraction can be effective in areas where information flow is vast, such as financial sector, and new information continues to emerge. However, there are several practical difficulties faced by automatic knowledge extraction. First, there are difficulties in making corpus from different fields with same algorithm, and it is difficult to extract good quality triple. Second, it becomes more difficult to produce labeled text data by people if the extent and scope of knowledge increases and patterns are constantly updated. Third, performance evaluation is difficult due to the characteristics of unsupervised learning. Finally, problem definition for automatic knowledge extraction is not easy because of ambiguous conceptual characteristics of knowledge. So, in order to overcome limits described above and improve the semantic performance of stock-related information searching, this study attempts to extract the knowledge entity by using neural tensor network and evaluate the performance of them. Different from other references, the purpose of this study is to extract knowledge entity which is related to individual stock items. Various but relatively simple data processing methods are applied in the presented model to solve the problems of previous researches and to enhance the effectiveness of the model. From these processes, this study has the following three significances. First, A practical and simple automatic knowledge extraction method that can be applied. Second, the possibility of performance evaluation is presented through simple problem definition. Finally, the expressiveness of the knowledge increased by generating input data on a sentence basis without complex morphological analysis. The results of the empirical analysis and objective performance evaluation method are also presented. The empirical study to confirm the usefulness of the presented model, experts' reports about individual 30 stocks which are top 30 items based on frequency of publication from May 30, 2017 to May 21, 2018 are used. the total number of reports are 5,600, and 3,074 reports, which accounts about 55% of the total, is designated as a training set, and other 45% of reports are designated as a testing set. Before constructing the model, all reports of a training set are classified by stocks, and their entities are extracted using named entity recognition tool which is the KKMA. for each stocks, top 100 entities based on appearance frequency are selected, and become vectorized using one-hot encoding. After that, by using neural tensor network, the same number of score functions as stocks are trained. Thus, if a new entity from a testing set appears, we can try to calculate the score by putting it into every single score function, and the stock of the function with the highest score is predicted as the related item with the entity. To evaluate presented models, we confirm prediction power and determining whether the score functions are well constructed by calculating hit ratio for all reports of testing set. As a result of the empirical study, the presented model shows 69.3% hit accuracy for testing set which consists of 2,526 reports. this hit ratio is meaningfully high despite of some constraints for conducting research. Looking at the prediction performance of the model for each stocks, only 3 stocks, which are LG ELECTRONICS, KiaMtr, and Mando, show extremely low performance than average. this result maybe due to the interference effect with other similar items and generation of new knowledge. In this paper, we propose a methodology to find out key entities or their combinations which are necessary to search related information in accordance with the user's investment intention. Graph data is generated by using only the named entity recognition tool and applied to the neural tensor network without learning corpus or word vectors for the field. From the empirical test, we confirm the effectiveness of the presented model as described above. However, there also exist some limits and things to complement. Representatively, the phenomenon that the model performance is especially bad for only some stocks shows the need for further researches. Finally, through the empirical study, we confirmed that the learning method presented in this study can be used for the purpose of matching the new text information semantically with the related stocks.
https://doi.org/10.13088/jiis.2019.25.2.025 인용 PDF KSCI HTML

Search Result 883, Processing Time 0.028 seconds

이메일무단수집거부

이용약관

제 1 장 총칙

제 2 장 이용계약의 체결

제 3 장 계약 당사자의 의무

제 4 장 서비스의 이용

제 5 장 계약 해지 및 이용 제한

제 6 장 손해배상 및 기타사항

Detail Search

Image Search (β)