• Title/Summary/Keyword: Used Trading


The prediction of the stock price movement after IPO using machine learning and text analysis based on TF-IDF (증권신고서의 TF-IDF 텍스트 분석과 기계학습을 이용한 공모주의 상장 이후 주가 등락 예측)

  • Yang, Suyeon;Lee, Chaerok;Won, Jonggwan;Hong, Taeho
    • Journal of Intelligence and Information Systems / v.28 no.2 / pp.237-262 / 2022
  • There has been a growing interest in IPOs (Initial Public Offerings) due to the profitable returns that IPO stocks can offer to investors. However, IPOs can be speculative investments that may involve substantial risk as well because shares tend to be volatile, and the supply of IPO shares is often highly limited. Therefore, it is crucially important that IPO investors are well informed of the issuing firms and the market before deciding whether to invest or not. Unlike institutional investors, individual investors are at a disadvantage since there are few opportunities for individuals to obtain information on the IPOs. In this regard, the purpose of this study is to provide individual investors with the information they may consider when making an IPO investment decision. This study presents a model that uses machine learning and text analysis to predict whether an IPO stock price would move up or down after the first 5 trading days. Our sample includes 691 Korean IPOs from June 2009 to December 2020. The input variables for the prediction are three tone variables created from IPO prospectuses and quantitative variables that are either firm-specific, issue-specific, or market-specific. The three prospectus tone variables indicate the percentage of positive, neutral, and negative sentences in a prospectus, respectively. We considered only the sentences in the Risk Factors section of a prospectus for the tone analysis in this study. All sentences were classified into 'positive', 'neutral', and 'negative' via text analysis using TF-IDF (Term Frequency - Inverse Document Frequency). Measuring the tone of each sentence was conducted by machine learning instead of a lexicon-based approach due to the lack of sentiment dictionaries suitable for Korean text analysis in the context of finance. 
For this reason, the training set was created by randomly selecting 10% of the sentences from each prospectus, and these sentences were labeled manually after reading each one. Then, based on the training set, a Support Vector Machine model was used to predict the tone of the sentences in the test set. Finally, the percentages of positive, neutral, and negative sentences in each prospectus were calculated from the model's predictions. To predict the price movement of an IPO stock, four machine learning techniques were applied: Logistic Regression, Random Forest, Support Vector Machine, and Artificial Neural Network. According to the results, models that use both the quantitative variables using technical analysis and the prospectus tone variables show higher accuracy than models that use only quantitative variables. More specifically, prediction accuracy improved by 1.45 percentage points in the Random Forest model, 4.34 percentage points in the Artificial Neural Network model, and 5.07 percentage points in the Support Vector Machine model. Among the techniques tested, the Artificial Neural Network model using both quantitative variables and prospectus tone variables achieved the highest prediction accuracy, 61.59%. The results indicate that the tone of a prospectus is a significant factor in predicting the price movement of an IPO stock. In addition, the McNemar test was used to check whether the difference between the models is statistically significant. The model using only quantitative variables was compared with the model using both the quantitative variables and the prospectus tone variables, and the predictive performance was confirmed to improve significantly at the 1% significance level.
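
The tone-variable step described in this abstract can be illustrated with a minimal sketch. It assumes one common TF-IDF variant (relative term frequency times log(N/df)); the abstract does not say which weighting the authors used. The sentences, labels, and function names are hypothetical, and the SVM itself is left out: the `predicted` list merely stands in for its output.

```python
import math
from collections import Counter


def tfidf_vectors(sentences):
    """Return one {term: weight} dict per tokenized, non-empty sentence.

    Assumed variant: tf = count / sentence length, idf = log(N / df).
    """
    n = len(sentences)
    # Document frequency: number of sentences that contain each term.
    df = Counter(term for sent in sentences for term in set(sent))
    vectors = []
    for sent in sentences:
        counts = Counter(sent)
        vectors.append({
            term: (c / len(sent)) * math.log(n / df[term])
            for term, c in counts.items()
        })
    return vectors


def tone_shares(labels):
    """Percentage of positive, neutral, and negative sentences."""
    counts = Counter(labels)
    return {tone: 100.0 * counts[tone] / len(labels)
            for tone in ("positive", "neutral", "negative")}


# Hypothetical, already tokenized Risk Factors sentences.
sentences = [
    ["revenue", "may", "decline"],
    ["demand", "may", "grow"],
    ["the", "firm", "operates", "in", "korea"],
    ["litigation", "may", "reduce", "revenue"],
]
# Hypothetical stand-in for the tone classifier's predictions.
predicted = ["negative", "positive", "neutral", "negative"]

vectors = tfidf_vectors(sentences)   # features a classifier could consume
shares = tone_shares(predicted)      # the three prospectus tone variables
```

In this toy data, "may" appears in three of the four sentences, so it receives a lower weight than a term such as "decline" that appears only once; the three values in `shares` correspond to the three tone variables fed to the price-movement models.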

A Study on the Location of Retail Trade in Kwangju-si and Its Inhabitants' Efficient Utilization (광주시 소매업의 입지와 주민의 효율적 이용에 관한 연구)

  • ;Jeon, Kyung-sook
    • Journal of the Korean Geographical Society / v.30 no.1 / pp.68-92 / 1995
  • Recently the structure of the retail trade has changed along with its environment. Studies are therefore needed on how that environment is changing and on the fundamental structure of the retail trade. This study analyzes the location of retail trade, inhabitants' retail behavior, and a desirable scheme for its utilization in Kwangju-si. The methods, contents, and results are as follows: 1. Retail trade can be classified into independent stores, chain stores (supermarkets, voluntary chains, franchise systems, and convenience stores), department stores, cooperative associations, traditional markets, mail-order marketing, automatic vending, and others, according to service level, items sold, price, management, retailing method, and store or non-store type. 2. In Kwangju, the retail environment is related to the population structure of consumers: changes in consumption patterns, trends toward aging and the nuclear family, increased leisure time, and women's advancement into society. These social changes have also brought a rapid structural shift in retail trade. The traditional and premodern markets that prevailed until the 1970s gave way to supermarkets and department stores in the 1980s, and various store types, large enterprises, and foreign capital appeared in the 1990s. 3. The locational characteristics of retail trade are derived from a spatial analysis of the total population distribution and from a segregation index calculated in light of potential demand. The densely populated areas are newly built apartment housing complexes, which are distributed in a ring around the old urban core. The number and share of residents aged over sixty are larger and higher in Kwangsan-gu and the area around Mt. Moodeung, where rural elements are pronounced. The relation between population distribution and retail trade is analyzed with the index of population per shop. 
The index of population per shop is lower in the urban center as a whole, which is more convenient for consumers. In newly formed apartment complex areas, on the other hand, the index exceeds 1,000 per shop and does not meet consumer demand. Because both younger and aged residents are numerous in these areas, retail trade patterns suited to both are needed. Urban fringes, including Kwangsan-gu and the vicinity of Mt. Moodeung, have problems because they have the highest population per shop (more than 1,500) and are the most extensive areas as well. 4. The regional characteristics of retail trade are analyzed through the location quotient of shops by locational pattern and through the centrality index. Chungkum-dong is the highest-order central place in the CBD. It is the core of retail trade, with higher-order specialty stores, including three big department stores, supermarkets, and large stores. Taegum-dong, Chungsu-dong, Taeui-dong, and Numun-dong, which neighbor Chungkum-dong, fall into the second group. They have a central commercial section where large chain stores, specialty shopping streets, narrow-line retail shops (furniture, amusement services, and galleries), supermarkets, and daily markets are located. The third group is formed along the axes of the state roads leading to Naju-kun, Changseong-kun, Tamyang-kun, Hwasun-kun, and the former Songjeong-eup. It is related to the newly rising apartment housing complexes along trunk roads and is characterized by markets and specialty stores. The fourth group has neighborhood shopping centers, including the older residential areas and the Songjeong-eup area, with independent stores and supermarkets as the main retailing functions. The last group contains the inner residential areas and the outer part of the city, including Songjeong-eup. The outer part, where miscellaneous shops are occasionally found, is rural rather than urban (Fig. 7). 5. Residents' use of retail trade is analyzed by factors of goods and facilities. 
Department stores are strongly preferred for higher-order shopping goods such as formal clothes, in view of both the diversity and the quality of goods (28.9%). But they have severe traffic congestion and intense competition for market range caused by their sma . 64.0% of respondents make combined-purpose trips for banking and shopping together. 6. For more efficient retail trading, it is necessary to introduce a spatial distribution policy that takes into account the frequency of opportunities for goods selection by central place, frontier region, and age group. We must also analyze competition among different types of retail trade and the consumption behavior of working women and younger age groups in terms of time and space. Service improvement and the rationalization of management should be accomplished, and cooperative location (situation) must be considered in relation to other functions such as finance, leisure and sports, and culture centers. Various service systems now used by enterprises, such as installment plans, credit cards, and premium tickets, must also be brought into service improvement. Rationalization and professionalization with respect to commercial goods are basically required.
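
Two of the indices named in this abstract, population per shop and the location quotient, can be sketched as follows. The location quotient uses its standard definition (a district's share of one shop type divided by the citywide share of that type); the abstract does not give the exact formulation used in the study, and the district names and figures below are hypothetical.

```python
def population_per_shop(population, shops):
    """Residents per shop; higher values mean fewer shops per resident."""
    return population / shops


def location_quotient(local_type, local_total, city_type, city_total):
    """District share of one shop type relative to its citywide share.

    Values above 1 indicate that the type is concentrated in the district.
    """
    return (local_type / local_total) / (city_type / city_total)


# Hypothetical figures, not taken from the study.
districts = {
    "core": {"population": 12_000, "shops": 400, "specialty": 120},
    "new_apartments": {"population": 60_000, "shops": 50, "specialty": 5},
}

city_shops = sum(d["shops"] for d in districts.values())
city_specialty = sum(d["specialty"] for d in districts.values())

per_shop = {
    name: population_per_shop(d["population"], d["shops"])
    for name, d in districts.items()
}
lq = {
    name: location_quotient(d["specialty"], d["shops"],
                            city_specialty, city_shops)
    for name, d in districts.items()
}
```

With these invented numbers the apartment district has 1,200 residents per shop, the kind of value the abstract flags as under-served, while the core district has a specialty-store location quotient above 1.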


Customer Behavior Prediction of Binary Classification Model Using Unstructured Information and Convolution Neural Network: The Case of Online Storefront (비정형 정보와 CNN 기법을 활용한 이진 분류 모델의 고객 행태 예측: 전자상거래 사례를 중심으로)

  • Kim, Seungsoo;Kim, Jongwoo
    • Journal of Intelligence and Information Systems / v.24 no.2 / pp.221-241 / 2018
  • Deep learning has recently been attracting attention. The deep learning technique applied in the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) and in AlphaGo is the Convolutional Neural Network (CNN). A CNN divides the input image into small sections, recognizes partial features in each, and combines them to recognize the image as a whole. Deep learning technologies are expected to bring many changes to our lives, but until now their applications have been limited to image recognition and natural language processing. The use of deep learning techniques for business problems is still at an early research stage. If their performance is proven, they can be applied to traditional business problems such as marketing response prediction, fraudulent transaction detection, bankruptcy prediction, and so on. It is therefore a meaningful experiment to assess whether business problems can be solved with deep learning technologies, based on the case of online shopping companies, which have big data, make customer behavior relatively easy to identify, and have high utilization value. In particular, the competitive environment of online shopping companies is changing rapidly and becoming more intense. Analysis of customer behavior for maximizing profit is therefore becoming more and more important for online shopping companies. In this study, we propose a 'CNN model of Heterogeneous Information Integration' as a way to improve the power to predict customer behavior in online shopping enterprises. 
To propose a model that optimizes performance, namely a model that learns through a convolutional neural network with a multi-layer perceptron structure by combining structured and unstructured information, we consider three architectural elements, 'heterogeneous information integration', 'unstructured information vector conversion', and 'multi-layer perceptron design', evaluate the performance of each, and confirm the proposed model based on the results. In addition, the target variables for predicting customer behavior are defined as six binary classification problems: re-purchaser, churn, frequent shopper, frequent refund shopper, high amount shopper, and high discount shopper. To verify the usefulness of the proposed model, we conducted experiments using actual transaction, customer, and VOC (voice of the customer) data from a particular online shopping company in Korea. The data cover 47,947 customers who registered at least one VOC in January 2011 (one month). The profiles of these customers, a total of 19 months of trading data from September 2010 to March 2012, and the VOCs posted during that month are used. The experiment is divided into two stages. In the first stage, we evaluate the three architectures that affect the performance of the proposed model and select optimal parameters. In the second stage, we evaluate the performance of the proposed model. Experimental results show that the proposed model, which combines structured and unstructured information, is superior to NBC (Naïve Bayes classification), SVM (Support Vector Machine), and ANN (Artificial Neural Network). It is therefore significant that the use of unstructured information contributes to predicting customer behavior, and that CNN can be applied to business problems as well as to image recognition and natural language processing problems. 
The experiments confirm that CNN is more effective at understanding and interpreting the meaning of context in text VOC data. It is also significant that this empirical research, based on the actual data of an e-commerce company, shows that very meaningful information for predicting customer behavior can be extracted from VOC data written in text form directly by the customer. Finally, through various experiments, the proposed model provides useful information for future research on parameter selection and its performance.
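
The idea of recognizing partial features in small sections of the input and then combining them with structured information can be sketched in a few lines. This is an illustrative toy, not the authors' architecture: the abstract does not specify layer sizes, filters, or embeddings, so the token vectors, kernels, and structured features below are all hypothetical.

```python
def conv1d_max_pool(sequence, kernel, bias=0.0):
    """Slide `kernel` over `sequence`, apply ReLU, keep the largest response.

    sequence: list of token vectors (each a list of floats)
    kernel:   list of `width` weight vectors of the same dimension
    Assumes len(sequence) >= len(kernel).
    """
    width = len(kernel)
    responses = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        total = bias + sum(
            w * x
            for k_vec, x_vec in zip(kernel, window)
            for w, x in zip(k_vec, x_vec)
        )
        responses.append(max(0.0, total))  # ReLU activation
    return max(responses)                  # max-over-time pooling


# Hypothetical 2-dimensional token vectors for one VOC text.
voc_tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]

# Two hypothetical filters, each spanning two consecutive tokens.
kernels = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[-1.0, 0.0], [0.0, -1.0]],
]
text_features = [conv1d_max_pool(voc_tokens, k) for k in kernels]

# Hypothetical structured features (e.g. scaled order count and spend).
structured = [0.5, 3.0]

# Heterogeneous information integration, in its simplest form: concatenate
# both feature groups before passing them to fully connected layers.
combined = structured + text_features
```

Each filter responds to a local pattern of neighboring tokens, and pooling keeps only the strongest response per filter, so the text is reduced to a fixed-length vector that can sit alongside the structured customer features.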