Search | Korea Science

Stock Price Prediction by Utilizing Category Neutral Terms: Text Mining Approach (카테고리 중립 단어 활용을 통한 주가 예측 방안: 텍스트 마이닝 활용)

Lee, Minsik; Lee, Hong Joo
- Journal of Intelligence and Information Systems / v.23 no.2 / pp.123-138 / 2017
Since the stock market is driven by traders' expectations, many studies have tried to predict stock price movements by analyzing various sources of text data. This research has examined not only the relationship between text data and stock price fluctuations, but also stock trading based on news articles and social media responses. Studies that predict stock price movements have also applied classification algorithms to a term-document matrix, in the same way as other text mining approaches. Because documents contain many words, it is better to select the words that contribute most when building the term-document matrix. Words whose frequency or importance is too low are removed based on word frequency. Words are also selected according to how much they contribute to classifying a document correctly. The basic idea behind constructing a term-document matrix has been to collect all the documents to be analyzed and to select and use the words that influence the classification. In this study, we analyze the documents for each individual stock and select the words that are irrelevant to all categories as neutral words. We extract the words surrounding each selected neutral word and use them to generate the term-document matrix. The approach starts from the idea that stock movement is only weakly related to the presence of the neutral words themselves, whereas the words surrounding a neutral word are more likely to affect stock price movements. The generated term-document matrix is then passed to an algorithm that classifies stock price fluctuations. We first removed stop words and selected neutral words for each stock. From the selected words, we then excluded those that also appear in news articles about other stocks.
Through an online news portal, we collected four months of news articles on the top 10 stocks by market capitalization. We used three months of news articles as training data and applied the remaining month of articles to the model to predict the next day's stock price movements. We used SVM, boosting, and random forest to build the models and predict stock price movements. The stock market was open for a total of 80 days over the four months (2016/02/01 ~ 2016/05/31); the first 60 days were used as the training set and the remaining 20 days as the test set. The word-based algorithm proposed in this study showed better classification performance than the word selection method based on sparsity. This study predicted stock price fluctuations by collecting and analyzing news articles on the top 10 stocks by market capitalization. We used a classification model based on the term-document matrix to estimate stock price fluctuations, and compared the performance of the existing sparsity-based word extraction method with that of the suggested method of removing words from the term-document matrix. The suggested method differs from the existing word extraction method in that it uses not only the news articles for the corresponding stock but also news about other stocks to determine the words to extract. In other words, it removed not only the words that appeared in both the rising and falling categories but also the words that commonly appeared in the news for other stocks. When prediction accuracy was compared, the suggested method showed higher accuracy. A limitation of this study is that stock price prediction was set up as a classification of rise and fall, and the experiment was conducted only for the top ten stocks. The 10 stocks used in the experiment do not represent the entire stock market. In addition, it is difficult to show investment performance, because stock price fluctuation and rate of return may differ. Therefore, further research is needed that uses more stocks and predicts yield through trading simulation.
https://doi.org/10.13088/jiis.2017.23.2.123
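The neutral-word procedure described in the abstract above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not give the exact selection rule, so the sketch assumes that a word is neutral when it occurs in both the rise and fall categories, and that the context is a fixed window of tokens around each neutral word. The stop-word list, the `window` parameter, the function names, and the sample headlines are all hypothetical.

```python
from collections import Counter

# Tiny illustrative stop-word list (hypothetical, not from the paper).
STOP_WORDS = {"the", "a", "an", "of", "to", "in", "on", "and", "for", "after"}


def tokenize(text):
    """Lowercase, split on whitespace, and drop stop words."""
    return [w for w in text.lower().split() if w not in STOP_WORDS]


def select_neutral_words(up_docs, down_docs, other_stock_docs):
    """Assumed rule: a word is neutral if it occurs in both categories;
    words that also occur in news about other stocks are then excluded."""
    up_vocab = {w for d in up_docs for w in tokenize(d)}
    down_vocab = {w for d in down_docs for w in tokenize(d)}
    other_vocab = {w for d in other_stock_docs for w in tokenize(d)}
    return (up_vocab & down_vocab) - other_vocab


def context_terms(doc, neutral, window=1):
    """Return the non-neutral tokens within `window` positions of a neutral word."""
    tokens = tokenize(doc)
    keep = set()
    for i, tok in enumerate(tokens):
        if tok in neutral:
            keep.update(range(max(0, i - window), min(len(tokens), i + window + 1)))
    return [tokens[j] for j in sorted(keep) if tokens[j] not in neutral]


def term_document_matrix(docs, neutral, window=1):
    """Build a document-by-term count matrix from the context terms only."""
    counts = [Counter(context_terms(d, neutral, window)) for d in docs]
    vocab = sorted({t for c in counts for t in c})
    matrix = [[c[t] for t in vocab] for c in counts]
    return vocab, matrix


# Hypothetical headlines for one stock ("acme") and one other stock ("globex").
up_docs = ["acme shares surge after strong earnings report",
           "acme report shows record profit"]
down_docs = ["acme shares plunge after weak earnings report",
             "acme report reveals heavy loss"]
other_docs = ["globex shares steady"]

neutral = select_neutral_words(up_docs, down_docs, other_docs)
vocab, matrix = term_document_matrix(up_docs + down_docs, neutral)
print(sorted(neutral))
print(vocab)
for row in matrix:
    print(row)
```

The resulting rows could then be passed, together with rise/fall labels, to any classifier such as the SVM, boosting, or random forest models named in the abstract.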

Effects of Benzo[a]pyrene on Growth and Photosynthesis of Phytoplankton (식물플랑크톤의 성장과 광합성에 대한 benzo[a]pyrene의 영향)

Kim, Sun-Ju; Shin, Kyung-Soon; Moon, Chang-Ho; Park, Dong-Won; Chang, Man
- Korean Journal of Environmental Biology / v.22 / pp.54-62 / 2004
We examined the impacts of an anthropogenic pollutant (benzo[a]pyrene) on the growth and photosynthesis of five marine phytoplankton species (Skeletonema costatum, Heterosigma akashiwo, Prorocentrum dentatum, P. minimum, Akashiwo sanguinea) that are dominant in Korean coastal waters. After 72 h of exposure to benzo[a]pyrene, a dramatic decrease in cell numbers was observed in the range of 1 to 10 $\mu$g L$^{-1}$ for S. costatum, P. minimum, and P. dentatum, whereas for A. sanguinea and H. akashiwo it was observed at the lower concentrations of 0.1 to 1 $\mu$g L$^{-1}$. Among the five phytoplankton species, the highest growth inhibition concentration ($IC_{50}$) was 6.20 $\mu$g L$^{-1}$ for P. minimum, followed by 2.14 $\mu$g L$^{-1}$ for P. dentatum, 1.68 $\mu$g L$^{-1}$ for S. costatum, 0.74 $\mu$g L$^{-1}$ for H. akashiwo, and 0.10 $\mu$g L$^{-1}$ for A. sanguinea. The five species exposed to the low concentration of 1 $\mu$g L$^{-1}$ recovered after being transferred to new media, but the species exposed to the high concentrations of 10 and 100 $\mu$g L$^{-1}$ did not recover, with the exception of P. minimum. These results indicate that the thecate dinoflagellate P. minimum is the most tolerant to the chemical and the athecate dinoflagellate A. sanguinea is the least tolerant. Generally, the cell-specific photosynthetic capacity of H. akashiwo exposed to the low concentrations of 0.1 and 1 $\mu$g L$^{-1}$ was higher than that of the control cells, whereas the cells exposed to the high concentrations of 5 and 10 $\mu$g L$^{-1}$ showed negligible photosynthetic levels during the first few days of the experiment. For the cells exposed to 5 $\mu$g L$^{-1}$, photosynthetic capacity increased from day 12 toward the end of the experiment. This indicates that H. akashiwo may utilize benzo[a]pyrene as a carbon source for its growth when exposed to low concentrations. The results suggest that anthropogenic pollutants such as benzo[a]pyrene may have a significant influence on the succession of phytoplankton species composition and on primary production in coastal marine environments.
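The $IC_{50}$ values above can be read as the concentration at which growth falls to half of the control. The abstract does not say how the values were estimated, so the following is only a sketch of one common approach, linear interpolation on the log of concentration. The function name and the dose-response numbers passed to it are hypothetical and are not data from the study; only the `reported_ic50` dictionary is taken from the abstract.

```python
import math


def ic50_log_interp(concs, growth_pct):
    """Estimate IC50 by linear interpolation on log10(concentration).

    concs: ascending positive concentrations.
    growth_pct: growth relative to the control (%), one value per concentration.
    Returns None if growth never crosses 50% within the tested range.
    """
    points = list(zip(concs, growth_pct))
    for (c_lo, g_lo), (c_hi, g_hi) in zip(points, points[1:]):
        if g_lo >= 50.0 >= g_hi and g_lo != g_hi:
            frac = (g_lo - 50.0) / (g_lo - g_hi)
            log_ic50 = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_ic50
    return None


# Hypothetical dose-response values, for illustration only (not from the study).
demo_concs = [0.1, 1.0, 10.0, 100.0]      # ug/L
demo_growth = [90.0, 70.0, 30.0, 5.0]     # % of control
demo_ic50 = ic50_log_interp(demo_concs, demo_growth)
print(demo_ic50)

# IC50 values reported in the abstract (ug/L), ranked from most to least tolerant.
reported_ic50 = {
    "P. minimum": 6.20,
    "P. dentatum": 2.14,
    "S. costatum": 1.68,
    "H. akashiwo": 0.74,
    "A. sanguinea": 0.10,
}
ranking = sorted(reported_ic50, key=reported_ic50.get, reverse=True)
print(ranking)
```

A higher $IC_{50}$ means more of the chemical is needed to halve growth, which is why the ranking places P. minimum first and A. sanguinea last.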

The Analyses of Treatment Results and Prognostic Factors in Supradiaphragmatic CS I-II Hodgkin's Disease (횡경막상부에 국한된 임상적 병기 1-2기 호지킨병에서 치료 결과와 예후 인자의 분석)

Park Won; Suh Chang Ok; Chung Eun Ji; Cho Jae Ho; Chung Hyun Cheol; Kim Joo Hang; Roh Jae Kyung; Hahn Jee Sook; Kim Gwi Eon
- Radiation Oncology Journal / v.16 no.2 / pp.147-157 / 1998
Purpose: The aim of this retrospective study is to assess the necessity of staging laparotomy in the management of supradiaphragmatic CS I-II Hodgkin's disease. Prognostic factors and the usefulness of prognostic factor groups were also analyzed.
Materials and Methods: From 1985 to 1995, fifty-one patients who were diagnosed with supradiaphragmatic CS I-II Hodgkin's disease at Yonsei Cancer Center in Seoul, Korea were enrolled in this study. Age ranged from 4 to 67, with a median age of 30. The numbers of patients with CS IA, IIA, and IIB were 16, 25, and 10, respectively. Radiotherapy (RT) was delivered using a 4 or 6 MV photon beam to a total dose of 19.5 to 55.6 Gy (median dose: 45 Gy) at 1.5 to 1.8 Gy per fraction. Chemotherapy (CT) was given in 2-12 cycles (median: 6 cycles). Thirty-one patients were treated with RT alone, 4 patients with CT alone, and 16 patients with combined chemoradiotherapy. RT volumes comprised involved fields (3), subtotal nodal fields (18), and mantle fields (26).
Results: The five-year disease-free survival rate (DFS) was $78.0\%$ and the overall survival rate (OS) was $87.6\%$. Fifty patients achieved a complete remission after initial treatment and 8 patients relapsed. Salvage therapy was given to 7 patients: 1 with RT alone, 4 with CT alone, and 2 with RT+CT. Only two patients were successfully salvaged. Female gender and large mediastinal adenopathy were significant adverse prognostic factors in the univariate analysis for DFS. The significant adverse prognostic factors for OS were B symptoms and clinical stage. When patients were analyzed according to the European Organization for Research and Treatment of Cancer (EORTC) prognostic factor groups, the DFS in patients in the very favorable, favorable, and unfavorable groups was $100\%$, $100\%$, and $55.8\%$ (p<0.05), and the OS in each group was $100\%$, $100\%$, and $75.1\%$ (p<0.05), respectively. In the very favorable and favorable groups, the DFS and OS were all $100\%$ with RT alone, but in the unfavorable group, RT with CT had a lower relapse rate than RT alone. Subtotal nodal irradiation had better DFS than mantle RT in patients treated with RT.
Conclusion: In the present study, the DFS and OS in patients who did not undergo staging laparotomy were similar to the results reported in the literature for surgically staged patients. Therefore, we suggest that staging laparotomy would not influence the outcome of treatment. In univariate analysis, gender, large mediastinal adenopathy, B symptoms, and clinical stage were significant prognostic factors for the survival rate. We confirm the usefulness of EORTC prognostic factor groups which may be a good.

Major Class Recommendation System based on Deep learning using Network Analysis (네트워크 분석을 활용한 딥러닝 기반 전공과목 추천 시스템)

Lee, Jae Kyu; Park, Heesung; Kim, Wooju
- Journal of Intelligence and Information Systems / v.27 no.3 / pp.95-112 / 2021
In university education, the choice of major classes plays an important role in students' careers. However, in line with changes in industry, the major subjects offered by each department are diversifying and increasing in number. As a result, students have difficulty choosing and taking classes that fit their career paths. In general, students choose classes based on experience, such as the choices of peers or advice from seniors. This has the advantage of taking the general situation into account, but it does not reflect individual tendencies or consideration of existing courses, and it leads to information inequality because the information is shared only among specific students. In addition, as classes have recently been conducted non-face-to-face and exchanges between students have decreased, even experience-based decisions have become harder to make. Therefore, this study proposes a recommendation system model that can recommend college major classes suited to individual characteristics based on data rather than experience. A recommendation system recommends information and content (music, movies, books, images, etc.) that a specific user may be interested in. Such systems are already widely used in services where considering individual tendencies is important, such as YouTube and Facebook, and are familiar from the personalized services of content providers such as over-the-top (OTT) media services. Taking classes is also a kind of content consumption, in that students select classes that suit them from a set list of content. However, unlike other content consumption, the result of the selection has a large impact. For example, music and movies are usually consumed once, and the time required to consume them is short. Therefore, the importance of each item is relatively low, and selecting one requires little deliberation.
Major classes usually have a long consumption time because they must be taken for a whole semester, and each item is highly important and requires greater caution in selection because the composition of the selected classes affects many things, such as career and graduation requirements. Given these unique characteristics of major classes, a recommendation system in the education field supports decision-making that reflects meaningful individual characteristics that experience-based decision-making cannot, even though the range of items is relatively small. This study aims to realize personalized education and enhance students' educational satisfaction by presenting a recommendation model for university major classes. The model study used the class history data of undergraduate students at University from 2015 to 2017, with students and their major names as metadata. The class history data is implicit feedback data that indicates only whether content was consumed and does not reflect preferences for classes. Therefore, embedding vectors derived from it to characterize students and classes have low expressive power. With these issues in mind, this study proposes a Net-NeuMF model that generates vectors of students and classes through network analysis and uses them as the model's input values. The model is based on the structure of NeuMF, a representative model for implicit feedback data that uses one-hot vectors. The input vectors of the model are generated through network analysis to represent the characteristics of students and classes. To generate a vector representing a student, each student is set as a node, and a weighted edge connects two students if they take the same class. Similarly, to generate a vector representing a class, each class is set as a node, and an edge connects two classes if any students have taken both.
We then apply Node2Vec, a representation learning methodology that quantifies the characteristics of each node. To evaluate the model, we used four indicators that are commonly used for recommendation systems, and we ran experiments with three different embedding dimensions to analyze the impact of embedding dimension on the model. The results show better performance on the evaluation metrics, regardless of dimension, than the existing NeuMF structure with one-hot vectors. Thus, this work contributes by building a network of students (users) and classes (items) to increase expressiveness over existing one-hot embeddings, by matching the characteristics of each structure that constitutes the model, and by showing better performance on various evaluation metrics compared to existing methodologies.
https://doi.org/10.13088/jiis.2021.27.3.095
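The two networks described in the abstract above (students as nodes, classes as nodes) can be sketched from class history data as follows. This is a minimal illustration under stated assumptions, not the authors' code: the abstract does not define the edge weights, so the sketch assumes the student-student weight is the number of shared classes and the class-class weight is the number of students who took both. The function names and the sample enrollments are hypothetical, and the Node2Vec step that would consume these edge lists is not shown.

```python
from collections import defaultdict
from itertools import combinations


def student_network(enrollments):
    """Connect two students if they share a class.

    enrollments: dict mapping student id -> set of class ids.
    Assumed weight: number of classes the two students have in common.
    """
    edges = {}
    for a, b in combinations(sorted(enrollments), 2):
        shared = len(enrollments[a] & enrollments[b])
        if shared:
            edges[(a, b)] = shared
    return edges


def class_network(enrollments):
    """Connect two classes if at least one student took both.

    Assumed weight: number of students who took both classes.
    """
    edges = defaultdict(int)
    for classes in enrollments.values():
        for c1, c2 in combinations(sorted(classes), 2):
            edges[(c1, c2)] += 1
    return dict(edges)


# Hypothetical class history (implicit feedback: taken or not taken).
enrollments = {
    "s1": {"ML", "DB"},
    "s2": {"ML", "DB", "OS"},
    "s3": {"OS"},
}

student_edges = student_network(enrollments)
class_edges = class_network(enrollments)
print(student_edges)
print(class_edges)
```

Each weighted edge list could then be loaded into a graph library and embedded with a Node2Vec implementation to obtain the student and class vectors that replace the one-hot inputs of NeuMF.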
