• Title/Summary/Keyword: Word learning system

A Study on Knowledge Entity Extraction Method for Individual Stocks Based on Neural Tensor Network (뉴럴 텐서 네트워크 기반 주식 개별종목 지식개체명 추출 방법에 관한 연구)

  • Yang, Yunseok; Lee, Hyun Jun; Oh, Kyong Joo
    • Journal of Intelligence and Information Systems / v.25 no.2 / pp.25-38 / 2019
  • Selecting high-quality information that meets users' interests and needs from an overflowing volume of content is becoming increasingly important. Amid this flood of information, efforts are under way to reflect the user's intention in search results rather than treating an information request as a simple string. Large IT companies such as Google and Microsoft are also focusing on knowledge-based technologies, including search engines, that provide users with satisfaction and convenience. Finance is one of the fields where text data analysis is expected to be especially useful, because new information is generated constantly and the earlier information is obtained, the more valuable it is. Automatic knowledge extraction can be effective in areas such as the financial sector, where the flow of information is vast and new information continues to emerge. However, automatic knowledge extraction faces several practical difficulties. First, it is difficult to build corpora for different fields with the same algorithm and to extract high-quality triples. Second, producing human-labeled text data becomes harder as the extent and scope of knowledge grow and patterns are constantly updated. Third, performance evaluation is difficult because of the characteristics of unsupervised learning. Finally, defining the problem of automatic knowledge extraction is not easy because the concept of knowledge is itself ambiguous. To overcome these limits and improve the semantic performance of stock-related information search, this study extracts knowledge entities using a neural tensor network and evaluates the performance of the approach. Unlike previous work, the purpose of this study is to extract knowledge entities related to individual stock items.
Various but relatively simple data processing methods are applied in the presented model to solve the problems of previous research and to enhance the effectiveness of the model. This study makes three contributions. First, it presents a practical and simple automatic knowledge extraction method. Second, it shows that performance evaluation is possible through a simple problem definition. Finally, it increases the expressiveness of the knowledge by generating input data on a sentence basis without complex morphological analysis. The results of the empirical analysis and an objective performance evaluation method are also presented. For the empirical study confirming the usefulness of the presented model, experts' reports on 30 individual stocks, the top 30 items by publication frequency from May 30, 2017 to May 21, 2018, are used. The total number of reports is 5,600; 3,074 reports, about 55% of the total, are designated as the training set, and the remaining 45% are designated as the testing set. Before constructing the model, all reports in the training set are classified by stock, and their entities are extracted using the named entity recognition tool KKMA. For each stock, the top 100 entities by appearance frequency are selected and vectorized using one-hot encoding. Then, using the neural tensor network, one score function is trained per stock. When a new entity from the testing set appears, its score is calculated with every score function, and the stock whose function gives the highest score is predicted as the item related to the entity. To evaluate the presented model, we confirm its predictive power and whether the score functions are well constructed by calculating the hit ratio over all reports in the testing set.
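The scoring and prediction step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the commonly used neural tensor network score for a pair of entity vectors, u · tanh(e1ᵀ W e2 + V [e1; e2] + b), and it uses small random, untrained parameters. The names `ntn_score`, `random_params`, `one_hot`, `predict_stock`, and the stock labels are hypothetical, and the abstract does not say how entity pairs are formed.

```python
import math
import random

def ntn_score(e1, e2, params):
    """Assumed NTN score: sum_k u[k] * tanh(e1^T W[k] e2 + V[k].[e1;e2] + b[k])."""
    W, V, b, u = params["W"], params["V"], params["b"], params["u"]
    concat = e1 + e2  # list concatenation gives [e1; e2]
    total = 0.0
    for k in range(len(u)):
        bilinear = sum(e1[i] * W[k][i][j] * e2[j]
                       for i in range(len(e1)) for j in range(len(e2)))
        linear = sum(V[k][i] * concat[i] for i in range(len(concat)))
        total += u[k] * math.tanh(bilinear + linear + b[k])
    return total

def random_params(dim, slices, rng):
    """Untrained placeholder parameters; a real model would learn these."""
    def r():
        return rng.uniform(-1.0, 1.0)
    return {
        "W": [[[r() for _ in range(dim)] for _ in range(dim)] for _ in range(slices)],
        "V": [[r() for _ in range(2 * dim)] for _ in range(slices)],
        "b": [r() for _ in range(slices)],
        "u": [r() for _ in range(slices)],
    }

def one_hot(index, dim):
    """One-hot vector of length dim with a 1.0 at position index."""
    return [1.0 if i == index else 0.0 for i in range(dim)]

def predict_stock(e1, e2, score_functions):
    """Return the stock whose score function gives the highest score."""
    return max(score_functions,
               key=lambda stock: ntn_score(e1, e2, score_functions[stock]))

rng = random.Random(0)
DIM, SLICES = 5, 2  # toy sizes; the study uses the top 100 entities per stock
score_functions = {name: random_params(DIM, SLICES, rng)
                   for name in ("stock_a", "stock_b", "stock_c")}
predicted = predict_stock(one_hot(1, DIM), one_hot(3, DIM), score_functions)
print(predicted)
```

Keeping one parameter set per stock mirrors the statement that as many score functions are trained as there are stocks; prediction then reduces to an argmax over those functions.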
As a result of the empirical study, the presented model shows a 69.3% hit ratio on the testing set, which consists of 2,526 reports. This hit ratio is meaningfully high despite some constraints on conducting the research. Looking at the prediction performance for each stock, only three stocks, LG ELECTRONICS, KiaMtr, and Mando, show performance far below the average. This result may be due to interference from other similar items and to the generation of new knowledge. In this paper, we propose a methodology for finding the key entities, or combinations of them, that are needed to search for related information in accordance with the user's investment intention. Graph data is generated using only the named entity recognition tool and is applied to the neural tensor network without a learning corpus or word vectors for the field. The empirical test confirms the effectiveness of the presented model as described above. However, some limitations and points to improve remain. In particular, the fact that model performance is especially poor for a few stocks shows the need for further research. Finally, the empirical study confirms that the learning method presented here can be used to semantically match new text information with the related stocks.
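The hit ratio used for evaluation, overall and per stock, can be computed as in the sketch below. The labels are made up for illustration and are not data from the study; the function names `hit_ratio` and `hit_ratio_by_stock` are hypothetical. Only the report counts (5,600 total, 3,074 training, 2,526 testing) come from the abstract.

```python
from collections import defaultdict

def hit_ratio(predicted, actual):
    """Share of reports whose predicted stock equals the true stock."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(1 for p, a in zip(predicted, actual) if p == a)
    return hits / len(actual)

def hit_ratio_by_stock(predicted, actual):
    """Hit ratio computed separately for each true stock."""
    hits, totals = defaultdict(int), defaultdict(int)
    for p, a in zip(predicted, actual):
        totals[a] += 1
        hits[a] += int(p == a)
    return {stock: hits[stock] / totals[stock] for stock in totals}

# Made-up labels for illustration only; not data from the study.
actual = ["stock_a", "stock_a", "stock_b", "stock_b", "stock_c"]
predicted = ["stock_a", "stock_b", "stock_b", "stock_b", "stock_a"]
overall = hit_ratio(predicted, actual)
per_stock = hit_ratio_by_stock(predicted, actual)

# Split sizes reported in the abstract.
train_share = 3074 / 5600  # about 55% of all reports
test_share = 2526 / 5600   # about 45% of all reports
```

A per-stock breakdown of this kind is what reveals cases such as the three stocks reported to perform far below the average.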

Spatial effect on the diffusion of discount stores (대형할인점 확산에 대한 공간적 영향)

  • Joo, Young-Jin; Kim, Mi-Ae
    • Journal of Distribution Research / v.15 no.4 / pp.61-85 / 2010
  • Introduction: Diffusion is the process by which an innovation is communicated through certain channels over time among the members of a social system (Rogers 1983). Bass (1969) proposed the Bass model to describe the diffusion process. The Bass model assumes that potential adopters of an innovation are influenced by mass media and by word of mouth from communication with previous adopters. The Bass model has been extended in various ways. Some extensions propose a third factor affecting diffusion. Others propose a multinational diffusion model that stresses the interactive effect on diffusion among several countries. We add a spatial factor to the Bass model as a third communication factor. Because the interaction between markets cannot be controlled, we need to consider that diffusion within a given market can be influenced by diffusion in a contiguous market. The process by which a certain type of retail expands in a particular market can be described by the retail life cycle. The diffusion of retail follows the three phases of spatial diffusion: adoption of the innovation first happens near the diffusion center, then spreads to the vicinity of the diffusion center, and is finally completed in peripheral areas in the saturation stage. We therefore expect the spatial effect to be important in describing the diffusion of domestic discount stores. We define a spatial diffusion model based on the multinational diffusion model and apply it to the diffusion of discount stores. Modeling: To define the spatial diffusion model, we extend the learning model (Kumar and Krishnan 2002) and separate the diffusion process in the diffusion center (market A) from the diffusion process in the vicinity of the diffusion center (market B). The proposed spatial diffusion model is shown in equations (1a) and (1b).
Equation (1a) is the diffusion process in the diffusion center and equation (1b) is the diffusion process in the vicinity of the diffusion center.
$$\begin{aligned} S_{i,t} &= \left(p_i + q_i\frac{Y_{i,t-1}}{m_i}\right)(m_i - Y_{i,t-1}), \quad i \in \{1,\cdots,I\} && (1a) \\ S_{j,t} &= \left(p_j + q_j\frac{Y_{j,t-1}}{m_j} + \sum_{i=1}^{I}\gamma_{ij}\frac{Y_{i,t-1}}{m_i}\right)(m_j - Y_{j,t-1}), \quad j \in \{I+1,\cdots,I+J\} && (1b) \end{aligned}$$
We raise two research questions. (1) The proposed spatial diffusion model is more effective than the Bass model in describing the diffusion of discount stores. (2) The more similar the retail environment of the diffusion center is to that of a contiguous market in its vicinity, the larger the spatial effect of the diffusion center on diffusion in that market. To examine these two questions, we first use the Bass model to estimate the diffusion of discount stores. Next, we estimate it with the spatial diffusion model, in which a spatial factor is added to the Bass model. Finally, by comparing the Bass model with the spatial diffusion model, we determine which model describes the diffusion of discount stores better. In addition, we use correlation analysis to investigate the relationship between the similarity of retail environments (conceptual distance) and the impact of the spatial factor. Result and Implication: We propose a spatial diffusion model to describe the diffusion of discount stores. To examine the proposed model, we use data on 347 domestic discount stores and divide the nation into 5 districts: Seoul-Gyeongin (SG), Busan-Gyeongnam (BG), Daegu-Gyeongbuk (DG), Gwangju-Jeonla (GJ), and Daejeon-Chungcheong (DC). The results are as follows. In Bass model (I), the estimates of the innovation coefficient (p) and the imitation coefficient (q) are 0.017 and 0.323, respectively, while the estimate of market potential is 384. In Bass model (II), estimated for each district, the estimate of the innovation coefficient (p) in SG is 0.019, the lowest among the 5 areas. This is because SG is the diffusion center. The estimate of the imitation coefficient (q) in BG is 0.353, the highest. The imitation coefficient in the vicinity of the diffusion center, such as BG, is higher than that in the diffusion center because more information flows through various paths as diffusion progresses. In spatial diffusion model (IV), we can observe changes between the coefficients of the Bass model and those of the spatial diffusion model. Except for GJ, the estimates of the innovation and imitation coefficients in Model IV are lower than those in Model II. The changes in the innovation and imitation coefficients are reflected in the spatial coefficient (${\gamma}$). From the spatial coefficient (${\gamma}$) we can infer that diffusion in the vicinity of the diffusion center is influenced by diffusion in the diffusion center. The difference between Bass model (II) and spatial diffusion model (IV) is statistically significant: the ${\chi}^2$-distributed likelihood ratio statistic is 16.598 (p=0.0023). This implies that the spatial diffusion model is more effective than the Bass model in describing the diffusion of discount stores.
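The reported p-value can be checked with a short calculation. The abstract does not state the degrees of freedom; the sketch below assumes 4, which would correspond to one spatial coefficient for each of the four districts other than SG, and it assumes the Bass model (II) is nested in the spatial model (IV). The function name `chi2_sf_even_df` is hypothetical; it uses the closed form of the chi-square upper tail that holds for even degrees of freedom.

```python
import math

def chi2_sf_even_df(x, df):
    """Upper-tail probability P(X > x) of a chi-square variable, even df only.

    For df = 2n the tail is exp(-x/2) * sum_{k=0}^{n-1} (x/2)^k / k!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("df must be a positive even integer")
    half = x / 2.0
    term, total = 1.0, 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total

# Likelihood ratio statistic from the abstract; df = 4 is an assumption.
p_value = chi2_sf_even_df(16.598, 4)
print(round(p_value, 4))  # 0.0023
```

Under this assumption the computed tail probability agrees with the p-value given in the abstract.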
Thus, research question (1) is supported. In addition, correlation analysis shows a statistically significant relationship between the similarity of retail environments and the spatial effect, so research question (2) is also supported.
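To make equations (1a) and (1b) concrete, the sketch below simulates one diffusion center and one market in its vicinity. It is an illustration under assumptions, not the estimation procedure of the paper: the parameter values are invented, the function names `adoptions` and `simulate` are hypothetical, and the own-market imitation term in (1b) is taken to be scaled by the vicinity market's own potential.

```python
def adoptions(p, q, m, y_prev, spillover=0.0):
    """New adoptions in one period: (p + q*Y/m + spillover) * (m - Y)."""
    return (p + q * y_prev / m + spillover) * (m - y_prev)

def simulate(center, vicinity, gamma, periods):
    """Simulate a diffusion center (1a) and one vicinity market (1b).

    center and vicinity are dicts with keys p, q, m; gamma is the spatial
    coefficient. Returns per-period new adoptions as (center, vicinity) pairs.
    """
    y_c = y_v = 0.0  # cumulative adoptions at the end of the previous period
    path = []
    for _ in range(periods):
        s_c = adoptions(center["p"], center["q"], center["m"], y_c)
        spill = gamma * y_c / center["m"]  # spatial term of equation (1b)
        s_v = adoptions(vicinity["p"], vicinity["q"], vicinity["m"], y_v, spill)
        y_c += s_c
        y_v += s_v
        path.append((s_c, s_v))
    return path

# Invented parameter values for illustration; not estimates from the study.
center = {"p": 0.02, "q": 0.30, "m": 100.0}
vicinity = {"p": 0.02, "q": 0.35, "m": 60.0}
with_spill = simulate(center, vicinity, gamma=0.10, periods=15)
without_spill = simulate(center, vicinity, gamma=0.0, periods=15)
```

With gamma set to zero the vicinity market follows the plain Bass model, so comparing the two runs isolates the contribution of the spatial coefficient.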

