• Title/Summary/Keyword: Recommended Algorithm

Evaluation of Perfusion and Image Quality Changes by Reconstruction Methods in 13N-Ammonia Myocardial Perfusion PET/CT (13N-암모니아 심근관류 PET/CT 검사 시 영상 재구성 방법에 따른 관류량 변화와 영상 평가)

  • Do, Yong Ho; Lee, Hong Jae; Kim, Jin Eui
    • The Korean Journal of Nuclear Medicine Technology / v.18 no.1 / pp.69-75 / 2014
  • Purpose: The aim of this study was to evaluate how the choice of image reconstruction method changes quantitative and semi-quantitative myocardial perfusion indices and image quality in $^{13}N$-ammonia ($^{13}N-NH_3$) myocardial perfusion PET/CT. Materials and Methods: Data from 14 patients (8 men, 6 women) who underwent rest and adenosine-stress $^{13}N-NH_3$ PET/CT (Biograph TruePoint 40 with TrueV, Siemens) were collected. List-mode scans were acquired for 10 minutes after injection of 370 MBq of $^{13}N-NH_3$. Dynamic and static reconstructions were performed with the FBP, iterative 2D (2D), iterative 3D (3D), and iterative TrueX (TrueX) algorithms. Coronary flow reserve (CFR) from the dynamic reconstructions, and extent (%) and total perfusion deficit (TPD, %) measured on the summed 4-10 minute static scans, were evaluated by comparison with the vendor-recommended 2D method. Image quality of each reconstruction was rated by five nuclear medicine physicians in a blind test. Results: Compared with 2D, CFR was lower with TrueX by 18.68% (P=0.0002) and with FBP by 4.35% (P=0.1243), and higher with 3D by 7.91% (P<0.0001). For the semi-quantitative indices at stress, extent and TPD were higher with 3D by 3.07%p (P=0.001) and 2.36%p (P=0.0002), with FBP by 1.93%p (P=0.4275) and 1.57%p (P=0.4595), and with TrueX by 5.43%p (P=0.0003) and 3.93%p (P<0.0001). At rest, extent and TPD were lower with FBP by 0.86%p (P=0.1953) and 0.57%p (P=0.2053), and higher with 3D by 3.21%p (P=0.0006) and 2.57%p (P=0.0001) and with TrueX by 5.36%p (P<0.0001) and 4.36%p (P<0.0001). In the blind test of image resolution and noise, 3D obtained the highest score, followed by 2D, TrueX, and FBP. Conclusion: Quantitative and semi-quantitative myocardial perfusion values can be under- or over-estimated depending on the reconstruction algorithm in $^{13}N-NH_3$ PET/CT. Proper dynamic and static reconstruction methods should therefore be established to provide accurate myocardial perfusion values.
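The comparisons above are expressed as per-patient percentage differences against the vendor-recommended 2D reconstruction, with paired significance tests. A minimal sketch of that kind of comparison is shown below; the CFR arrays are placeholder numbers, not the study's patient data.

```python
# Hedged sketch: comparing CFR from one reconstruction against the 2D reference.
# The values below are placeholders, not the study's measurements.
import numpy as np
from scipy import stats

cfr_2d    = np.array([2.1, 2.4, 1.9, 2.8, 2.2, 2.5, 2.0, 2.6])   # reference (iterative 2D)
cfr_truex = np.array([1.7, 2.0, 1.6, 2.3, 1.8, 2.0, 1.6, 2.1])   # same patients, TrueX

# Mean percentage difference relative to the 2D reference
pct_diff = (cfr_truex - cfr_2d) / cfr_2d * 100
print(f"mean difference vs 2D: {pct_diff.mean():.2f}%")

# Paired test across patients (each patient serves as their own control)
t_stat, p_value = stats.ttest_rel(cfr_truex, cfr_2d)
print(f"paired t-test: t={t_stat:.3f}, p={p_value:.4f}")
```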

Recommender system using BERT sentiment analysis (BERT 기반 감성분석을 이용한 추천시스템)

  • Park, Ho-yeon; Kim, Kyoung-jae
    • Journal of Intelligence and Information Systems / v.27 no.2 / pp.1-15 / 2021
  • When a decision is difficult, we ask friends or people around us for advice; when we buy products online, we read anonymous reviews before purchasing. With the advent of the data-driven era, developments in IT are producing vast amounts of data about both individuals and objects. Companies and individuals have accumulated, processed, and analyzed so much data that decisions which once depended on experts can now be made, or executed directly, from data. Today the recommender system plays a vital role in identifying users' preferences for purchasing goods, and web services such as Facebook, Amazon, Netflix, and YouTube use recommender systems to induce clicks. YouTube's recommender system, used by a billion people worldwide every month, draws on videos users have liked and videos they have watched. Recommender-system research is closely tied to practical business, so many researchers are interested in building better solutions. Recommender systems generate recommendations from the information obtained about their users, because building one requires information on the items a user is likely to prefer. Through recommender systems we have come to trust patterns and rules derived from data rather than empirical intuition, and the growing capacity of data has pushed machine learning toward deep learning. However, such recommender systems are not a complete solution: they require data that is sufficient in quantity, without major gaps, and detailed at the level of the individual, and they work correctly only when these conditions hold. When the interaction log is insufficient, recommendation becomes a difficult problem for both sellers, who need to make recommendations at a personal level, and consumers, who expect appropriate recommendations built on reliable data. In this paper, to improve the accuracy of "appropriate recommendation" for consumers, a recommender system is proposed that combines collaborative filtering with context-based deep learning. The aim is a hybrid recommender system that integrates user data with deep learning; it is not a purely collaborative approach but a collaborative extension. Customer review data were used as the data set. Consumers buy products in online shopping malls and then write product reviews; these ratings and reviews from buyers who have already purchased give prospective buyers confidence before purchasing. However, recommender systems mainly use scores or ratings rather than the reviews themselves to suggest items purchased by many users, even though consumer reviews contain product opinions and user sentiment relevant to evaluation. By incorporating these signals, this paper aims to improve the recommender system. The proposed algorithm supports individuals who have difficulty selecting an item: consumer reviews and purchase patterns make it possible to rely on the recommendations. The algorithm implements recommendation through collaborative filtering, and predictive accuracy is measured by Root Mean Squared Error (RMSE) and Mean Absolute Error (MAE).
Netflix has made strategic use of its recommendation system through annual competitions to reduce RMSE, making predictive accuracy a standard yardstick. Research on hybrid recommender systems that combine NLP approaches, deep learning, and personalization has been increasing. Within NLP, sentiment analysis took shape in the mid-2000s as user review data grew; it is a text classification task traditionally based on machine learning. Machine-learning-based sentiment analysis has the disadvantage that it struggles to capture the expressive characteristics of review text. In this study, we propose a deep learning recommender system that uses BERT-based sentiment analysis to minimize these disadvantages. The comparison models were recommender systems based on Naive-CF (collaborative filtering), SVD (singular value decomposition)-CF, MF (matrix factorization)-CF, BPR-MF (Bayesian personalized ranking matrix factorization)-CF, LSTM, CNN-LSTM, and GRU (gated recurrent units). In the experiments, the BERT-based recommender system performed best.
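As a rough illustration of the hybrid idea described above, the sketch below scores review text with a pretrained BERT-family sentiment model and blends that score with a collaborative-filtering rating prediction. The checkpoint name, the blending weight, and the `cf_predict` helper are assumptions for illustration, not the authors' implementation.

```python
# Hedged sketch: blending a CF rating prediction with BERT-based review sentiment.
# The checkpoint, weighting, and cf_predict() are illustrative assumptions only.
from transformers import pipeline

# Pretrained sentiment classifier (any BERT-family checkpoint could stand in here)
sentiment = pipeline("sentiment-analysis",
                     model="distilbert-base-uncased-finetuned-sst-2-english")

def cf_predict(user_id: str, item_id: str) -> float:
    """Placeholder for a collaborative-filtering predictor returning a 1-5 rating."""
    return 3.5  # hypothetical baseline prediction

def hybrid_score(user_id: str, item_id: str, review_text: str, alpha: float = 0.7) -> float:
    """Weighted blend of the CF prediction and the review's sentiment mapped to a 1-5 scale."""
    result = sentiment(review_text)[0]                 # e.g. {'label': 'POSITIVE', 'score': 0.98}
    polarity = result["score"] if result["label"] == "POSITIVE" else 1.0 - result["score"]
    sentiment_rating = 1.0 + 4.0 * polarity            # map [0, 1] polarity onto 1-5
    return alpha * cf_predict(user_id, item_id) + (1 - alpha) * sentiment_rating

print(hybrid_score("u1", "i9", "Great quality, arrived quickly and works as described."))
```

Prediction error for such a blend could then be reported with RMSE and MAE, as in the paper's evaluation.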

Utilizing the Idle Railway Sites: A Proposal for the Location of Solar Power Plants Using Cluster Analysis (철도 유휴부지 활용방안: 군집분석을 활용한 태양광발전 입지 제안)

  • Eunkyung Kang; Seonuk Yang; Jiyoon Kwon; Sung-Byung Yang
    • Journal of Intelligence and Information Systems / v.29 no.1 / pp.79-105 / 2023
  • Due to unprecedented extreme weather events driven by global warming and climate change, many parts of the world are suffering severely, and economic losses are snowballing. To address these problems, the Paris Agreement was signed in 2016 and an intergovernmental consultative body was formed to keep the rise in the Earth's average temperature below 1.5℃. Korea also declared 'Carbon Neutrality in 2050' to prevent climate catastrophe. In particular, the temperature increase caused by greenhouse gas emissions has been found to harm not only the environment and society as a whole but also Korea's export-dependent economy. In addition, as transportation types diversify, people's choice of transport mode is also shifting. As the development paradigm of the low-growth era shifts toward urban regeneration, interest in idle railway sites is rising due to reduced route demand, alignment improvements, and the relocation of urban railways. Utilizing railway sites that are already developed but idle can partially achieve the solar power generation goal of 'Renewable Energy 3020' while avoiding the environmental-damage and resident-acceptance issues that usually surround siting; however, actual use of and plans for such solar power facilities are still lacking. Therefore, in this study, using big data provided by the Korea National Railway and the Renewable Energy Cloud Platform, we develop an algorithm to discover and analyze suitable idle sites where solar power generation facilities can be installed and identify potentially applicable areas under user-specified conditions. By searching for and deriving these relevant idle sites, the study aims to devise a plan that saves substantial facility and expansion costs in the early stages of development. This study uses various cluster analyses to develop an optimal algorithm for deriving solar power plant locations on idle railway sites and, as a result, suggests 202 'actively recommended areas.' These results should help decision-makers make rational decisions that consider the economy and the environment simultaneously.
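The clustering step described above could look roughly like the sketch below, which groups candidate idle sites by site attributes with k-means. The feature names, placeholder values, and selection rule are assumptions for illustration, not the paper's actual variables.

```python
# Hedged sketch: clustering candidate idle railway sites with k-means.
# Column names, values, and the selection rule are illustrative assumptions.
import pandas as pd
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans

# Hypothetical site attributes (the study draws on Korea National Railway and
# Renewable Energy Cloud Platform data)
sites = pd.DataFrame({
    "area_m2":           [1200, 350, 9800, 4100, 760, 5300],
    "annual_irradiance": [1350, 1290, 1420, 1380, 1300, 1405],   # kWh/m^2, placeholders
    "grid_distance_km":  [0.4, 2.1, 0.9, 0.2, 3.5, 1.1],
    "slope_deg":         [2, 8, 1, 3, 12, 2],
})

X = StandardScaler().fit_transform(sites)
kmeans = KMeans(n_clusters=3, n_init=10, random_state=42).fit(X)
sites["cluster"] = kmeans.labels_

# Inspect cluster centroids; a rule over these (illustrative only) could flag
# clusters of "actively recommended" sites
summary = sites.groupby("cluster")[["area_m2", "annual_irradiance"]].mean()
print(summary)
```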

Robo-Advisor Algorithm with Intelligent View Model (지능형 전망모형을 결합한 로보어드바이저 알고리즘)

  • Kim, Sunwoong
    • Journal of Intelligence and Information Systems / v.25 no.2 / pp.39-55 / 2019
  • Recently, banks and large financial institutions have introduced many Robo-Advisor products. A Robo-Advisor is a robot that produces an optimal asset allocation portfolio for investors using financial engineering algorithms, without human intervention. Since its first introduction on Wall Street in 2008, the market has grown to 60 billion dollars and is expected to expand to 2,000 billion dollars by 2020. Because Robo-Advisor algorithms present asset allocation output to investors, mathematical or statistical asset allocation strategies are applied. The mean-variance optimization model developed by Markowitz is the typical asset allocation model. It is simple but quite intuitive: assets are allocated so as to minimize portfolio risk while maximizing expected portfolio return using optimization techniques. Despite its theoretical foundation, both academics and practitioners find that the standard mean-variance optimization portfolio is very sensitive to expected returns calculated from past price data, and corner solutions allocated to only a few assets are often found. The Black-Litterman optimization model overcomes these problems by starting from a neutral Capital Asset Pricing Model equilibrium point. Implied equilibrium returns for each asset are derived from the equilibrium market portfolio through reverse optimization. The Black-Litterman model then uses a Bayesian approach to combine subjective views on the price forecasts of one or more assets with the implied equilibrium returns, resulting in new estimates of risk and expected returns. These new estimates feed the well-known Markowitz mean-variance optimization algorithm to produce the optimal portfolio. If the investor has no views on the asset classes, the Black-Litterman optimization model produces the same portfolio as the market portfolio. But what if the subjective views are incorrect? Surveys of the performance of stocks recommended in securities analysts' reports show very poor results, so incorrect views combined with implied equilibrium returns may produce very poor portfolios for Black-Litterman users. This paper suggests an objective investor-views model based on Support Vector Machines (SVM), which have shown good performance in stock price forecasting. An SVM is a discriminative classifier defined by a separating hyperplane; linear, radial basis, and polynomial kernel functions are used to learn the hyperplanes. The input variables for the SVM are returns, standard deviations, Stochastics %K, and the price parity degree for each asset class. The SVM outputs expected stock price movements and their probabilities, which are used as inputs to the intelligent views model. The stock price movements are categorized into three phases: down, neutral, and up. The expected stock returns form the P matrix, and their probability results are used in the Q matrix. The implied equilibrium return vector is combined with the intelligent views matrix, yielding the Black-Litterman optimal portfolio. For comparison, the Markowitz mean-variance optimization model and a risk parity model are used, with the value-weighted and equal-weighted market portfolios as benchmark indexes. We collect the 8 KOSPI 200 sector indexes from January 2008 to December 2018, comprising 132 monthly index values. The training period is 2008 to 2015 and the testing period is 2016 to 2018.
Our suggested intelligent view model combined with implied equilibrium returns produced the optimal Black-Litterman portfolio. Over the out-of-sample period, this portfolio outperformed the well-known Markowitz mean-variance optimization portfolio, the risk parity portfolio, and the market portfolios. The total return of the Black-Litterman portfolio over the three-year period was 6.4%, the highest value, and its maximum drawdown of -20.8% was the lowest. Its Sharpe ratio, which measures return relative to risk, was also the highest at 0.17. Overall, the suggested view model shows the possibility of replacing subjective analysts' views with an objective view model when practitioners apply Robo-Advisor asset allocation algorithms in real trading.
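The Black-Litterman combination step described above can be written compactly. The sketch below implements the standard posterior-return formula with toy inputs; the covariance matrix, equilibrium returns, and the P/Q/Omega values standing in for the SVM-generated views are assumptions for illustration, not the paper's data.

```python
# Hedged sketch: Black-Litterman posterior expected returns with externally supplied views.
# Sigma, Pi, P, Q, and Omega are toy placeholders, not the paper's SVM-derived views.
import numpy as np

Sigma = np.array([[0.040, 0.006],
                  [0.006, 0.090]])        # asset covariance matrix
Pi    = np.array([0.05, 0.07])            # implied equilibrium returns (reverse optimization)
tau   = 0.05                              # uncertainty scaling on the prior

# One relative view: asset 1 outperforms asset 0 by 2%
# (in the paper, an SVM-based model would supply the views and their confidence)
P     = np.array([[-1.0, 1.0]])
Q     = np.array([0.02])
Omega = np.array([[0.0009]])              # view uncertainty

tauSigma_inv = np.linalg.inv(tau * Sigma)
Omega_inv    = np.linalg.inv(Omega)

# mu_BL = [(tau*Sigma)^-1 + P' Omega^-1 P]^-1 [(tau*Sigma)^-1 Pi + P' Omega^-1 Q]
A = tauSigma_inv + P.T @ Omega_inv @ P
b = tauSigma_inv @ Pi + P.T @ Omega_inv @ Q
mu_bl = np.linalg.solve(A, b)
print("Black-Litterman expected returns:", mu_bl)
```

The resulting expected returns would then be passed to a mean-variance optimizer to obtain the portfolio weights.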

A Study on the Effect of Network Centralities on Recommendation Performance (네트워크 중심성 척도가 추천 성능에 미치는 영향에 대한 연구)

  • Lee, Dongwon
    • Journal of Intelligence and Information Systems / v.27 no.1 / pp.23-46 / 2021
  • Collaborative filtering, which is often used in personalized recommendation, is recognized as a very useful technique for finding similar customers and recommending products to them based on their purchase history. However, the traditional collaborative filtering technique has difficulty calculating similarity for new customers or products because it computes similarities from direct connections and common features among customers. For this reason, hybrid techniques were designed that use content-based filtering together with collaborative filtering. Separately, efforts have been made to solve these problems by applying the structural characteristics of social networks, that is, by calculating similarity indirectly through the similar customers placed between two customers. A customer network is created from purchasing data, and the similarity between two customers is calculated from the features of the network that indirectly connects them. Such similarity can be used as a measure to predict whether the target customer will accept a recommendation, and the centrality metrics of the network can be utilized in this similarity calculation. Different centrality metrics matter because they may affect recommendation performance differently; furthermore, in this study, the effect of these centrality metrics on recommendation performance may vary depending on the recommender algorithm. In addition, recommendation techniques using network analysis can be expected to improve recommendation performance not only for new customers or products but also for entire customers or products. By treating a customer's purchase of an item as a link created between the customer and the item on the network, the prediction of user acceptance of a recommendation becomes a prediction of whether a new link will be created between them. Since this is a binary classification problem of whether a link is formed or not, decision tree, k-nearest neighbors (KNN), logistic regression, artificial neural network, and support vector machine (SVM) classifiers were selected for the research. The data for performance evaluation were order data collected from an online shopping mall over four years and two months. The first three years and eight months of records were organized into the social network used in the experiment, and the following four months of records were used to train and evaluate the recommender models. Experiments applying the centrality metrics to each model show that the recommendation acceptance rates of the centrality metrics differ across algorithms at a meaningful level. This work analyzed only four commonly used centrality metrics: degree centrality, betweenness centrality, closeness centrality, and eigenvector centrality. Eigenvector centrality recorded the lowest performance in all models except the support vector machine. Closeness centrality and betweenness centrality showed similar performance across all models. Degree centrality ranked moderately across models, while betweenness centrality always ranked higher than degree centrality. Finally, closeness centrality was characterized by distinct differences in performance according to the model.
It ranks first in logistic regression, artificial neural network, and decision tree with numerically high performance, but records very low rankings, with low performance, in the support vector machine and k-nearest neighbors models. As the experimental results reveal, in a classification model, network centrality metrics over the subnetwork connecting two nodes can effectively predict the connectivity between those nodes in a social network, and each metric performs differently depending on the classification model type. This result implies that choosing appropriate metrics for each algorithm can lead to higher recommendation performance. In general, betweenness centrality can guarantee a high level of performance in any model, and introducing closeness centrality could be considered to obtain higher performance for certain models.
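The setup described above, centrality metrics computed over a purchase network feeding a link-prediction classifier, might look roughly like this sketch. The edge list, training pairs, and the way node centralities are combined into pair features are simplified assumptions, not the study's data or exact procedure.

```python
# Hedged sketch: centrality features from a customer-item purchase network
# feeding a binary link-prediction classifier. Edge and label data are illustrative only.
import networkx as nx
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy bipartite purchase network: customers c*, items i*
edges = [("c1", "i1"), ("c1", "i2"), ("c2", "i2"), ("c2", "i3"),
         ("c3", "i1"), ("c3", "i3"), ("c4", "i2")]
G = nx.Graph(edges)

# The four centrality metrics compared in the study
degree      = nx.degree_centrality(G)
betweenness = nx.betweenness_centrality(G)
closeness   = nx.closeness_centrality(G)
eigenvector = nx.eigenvector_centrality(G, max_iter=1000)

def pair_features(u, v):
    """Combine node centralities into a feature vector for the candidate link (u, v)."""
    return [degree[u] + degree[v],
            betweenness[u] + betweenness[v],
            closeness[u] + closeness[v],
            eigenvector[u] + eigenvector[v]]

# Hypothetical training pairs: 1 = a purchase link was later formed, 0 = it was not
pairs  = [("c1", "i3"), ("c2", "i1"), ("c3", "i2"), ("c4", "i1"), ("c4", "i3")]
labels = [1, 0, 1, 0, 1]
X = np.array([pair_features(u, v) for u, v in pairs])

clf = LogisticRegression().fit(X, labels)
print(clf.predict_proba([pair_features("c1", "i3")])[0, 1])   # predicted acceptance probability
```

Swapping the logistic regression for a decision tree, KNN, neural network, or SVM, and using one centrality metric at a time, reproduces the kind of comparison the abstract describes.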

The Evaluation of TrueX Reconstruction Method in Low Dose (저선량에서의 TrueX 재구성 방법에 의한 유용성 평가)

  • Oh, Se-Moon; Kim, Kye-Hwan; Kim, Seung-Jeong; Lee, Hong-Jae; Kim, Jin-Eui
    • The Korean Journal of Nuclear Medicine Technology / v.15 no.2 / pp.83-87 / 2011
  • Purpose: Recently, PET/CT has been used in a variety of diagnostic areas, including oncology as well as cardiology and neurology. As the importance of PET/CT grows, there has been a range of research on image quality in relation to reconstruction methods. We compared the Iterative 2D reconstruction method with Siemens' TrueX reconstruction method in phantom experiments to see whether the clinical usefulness of PET/CT can be increased. Materials and Methods: We measured contrast ratio and FWHM to evaluate images as a function of dose, using a Biograph 40 TruePoint PET/CT (Siemens, Germany). A NEMA IEC PET body phantom (Data Spectrum Corp.) and a capillary tube were used to obtain the contrast ratio and FWHM, respectively. The current TrueX and the previous Iterative 2D algorithms were applied to all images, each acquired over 10 minutes, using clinically suitable parameters for Iterative 2D and the Siemens-recommended parameters for TrueX. Results: The FWHM measured with the capillary tube was smaller with TrueX than with Iterative 2D, and the difference in FWHM grew larger at low dose. The contrast ratio measured with the NEMA IEC PET body phantom was better with TrueX than with Iterative 2D, but showed no difference with dose. Conclusion: In this experiment, TrueX gave a higher contrast ratio and better spatial resolution than Iterative 2D, and its resolution advantage over Iterative 2D increased at low dose, while the contrast ratio showed no specific difference. In other words, the TrueX reconstruction method has higher clinical value in PET/CT because it can reduce patient exposure while maintaining better image quality.
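FWHM measurements like those described above are commonly obtained by fitting a Gaussian to a line profile through the capillary-tube image. A minimal sketch of that fit follows; the profile data are synthetic, not the phantom measurements from the study.

```python
# Hedged sketch: estimating FWHM by fitting a Gaussian to a 1-D line profile.
# The profile is synthetic; real input would be a profile through the capillary image.
import numpy as np
from scipy.optimize import curve_fit

def gaussian(x, amp, mu, sigma, offset):
    return amp * np.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) + offset

x = np.linspace(-20, 20, 81)                       # position along the profile (mm)
profile = gaussian(x, amp=100, mu=0.3, sigma=2.5, offset=5) + np.random.normal(0, 1, x.size)

popt, _ = curve_fit(gaussian, x, profile, p0=[profile.max(), 0, 3, 0])
fwhm = 2 * np.sqrt(2 * np.log(2)) * abs(popt[2])   # FWHM = 2.355 * sigma
print(f"estimated FWHM: {fwhm:.2f} mm")
```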

A Multimodal Profile Ensemble Approach to Development of Recommender Systems Using Big Data (빅데이터 기반 추천시스템 구현을 위한 다중 프로파일 앙상블 기법)

  • Kim, Minjeong; Cho, Yoonho
    • Journal of Intelligence and Information Systems / v.21 no.4 / pp.93-110 / 2015
  • A recommender system recommends products to customers who are likely to be interested in them. Based on automated information-filtering technology, various recommender systems have been developed. Collaborative filtering (CF), one of the most successful recommendation algorithms, has been applied in a number of different domains, such as recommending Web pages, books, movies, music, and products. However, CF has a critical shortcoming: it finds neighbors whose preferences are like those of the target customer and recommends products those customers have liked most, so it works properly only when there is a sufficient number of ratings on common products from customers. When customer ratings are scarce, CF forms neighborhoods inaccurately, resulting in poor recommendations. To improve the performance of CF-based recommender systems, most related studies have focused on developing novel algorithms under the assumption of a single profile created from users' item ratings, purchase transactions, or Web access logs. With the advent of big data, companies have come to collect more data and to use a greater variety of large-scale information, and many recognize that utilizing big data is important for improving competitiveness and creating new value. In particular, the use of personal big data in recommender systems is on the rise, because personal big data facilitate more accurate identification of users' preferences and behaviors. The proposed recommendation methodology is as follows. First, multimodal user profiles are created from personal big data in order to grasp the preferences and behavior of users from various viewpoints; we derive five user profiles based on rating, site preference, demographic, Internet usage, and topic-in-text information. Next, the similarity between users is calculated based on the profiles, and neighbors are found from the results. One of three ensemble approaches is applied to calculate the similarity, using, respectively, the similarity of the combined profile, the average similarity of each profile, and the weighted average similarity of each profile. Finally, the products that the neighborhood prefers most are recommended to the target users. For the experiments, we used demographic data and a very large volume of Web log transactions for 5,000 panel users of a company specialized in analyzing the rankings of Web sites. R was used to implement the proposed recommender system, and SAS E-miner was used to conduct the topic analysis via keyword search. To evaluate recommendation performance, 60% of the data were used for training and 40% for testing, and 5-fold cross-validation was conducted to enhance the reliability of the experiments. The widely used F1 metric, which gives equal weight to recall and precision, was employed for evaluation. The proposed methodology achieved a significant improvement over the single-profile-based CF algorithm; in particular, the ensemble approach using weighted average similarity showed the highest performance.
Specifically, the rate of improvement in F1 is 16.9 percent for the ensemble approach using weighted average similarity and 8.1 percent for the ensemble approach using the average similarity of each profile. From these results, we conclude that the multimodal profile ensemble approach is a viable solution to the problems encountered when customer ratings are scarce. This study is significant in suggesting what kinds of information can be used to create profiles in a big data environment and how they can be combined and utilized effectively. However, the methodology should be studied further for real-world application: the differences in recommendation accuracy should be compared by applying the proposed method to different recommendation algorithms to identify which combination shows the best performance.
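The weighted-average ensemble of per-profile similarities described above can be sketched as follows. The profile matrices, feature dimensions, and weights are illustrative placeholders, not the panel data or tuned weights used in the study.

```python
# Hedged sketch: weighted average of per-profile user similarities for neighbor selection.
# Profile matrices and weights are illustrative, not the study's panel data.
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity

n_users = 5
rng = np.random.default_rng(0)

# One feature matrix per profile (rating, site preference, demographic, usage, topic)
profiles = {name: rng.random((n_users, 8)) for name in
            ["rating", "site_pref", "demographic", "usage", "topic"]}
weights  = {"rating": 0.4, "site_pref": 0.2, "demographic": 0.1, "usage": 0.1, "topic": 0.2}

# Weighted average of the per-profile cosine similarity matrices
combined = sum(weights[name] * cosine_similarity(mat) for name, mat in profiles.items())

# Neighbors of user 0: most similar users, excluding the user itself
neighbors = np.argsort(combined[0])[::-1][1:4]
print("top-3 neighbors of user 0:", neighbors)
```

Items preferred by these neighbors would then be recommended to the target user, exactly as in single-profile CF but with the ensembled similarity.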

Comparison of Nutrient Intakes Regarding Stages of Change in Dietary Fat Reduction for College Students in Gyeonggi-Do (경기지역 일부 대학생의 지방제한 섭취 행동단계에 따른 영양소 섭취상태 비교)

  • Chung, Eun-Jung
    • Journal of the Korean Society of Food Science and Nutrition / v.33 no.8 / pp.1327-1336 / 2004
  • This study was conducted to compare nutrient intakes across stages of change in dietary fat reduction behavior. The subjects were 383 healthy college students (250 females and 133 males) in Gyeonggi-Do. Stage of change, classified by an algorithm based on 6 items, assigned each subject to one of 5 stages: precontemplation (PC), contemplation (CO), preparation (PR), action (AC), and maintenance (MA). Nutrient intakes were assessed by the 24-hr recall method. Across the 5 stages of change, the PR stage comprised the largest group (31.1%), followed by AC (28.7%), PC (19.3%), CO (13.8%), and MA (7.1%). Females were more likely to belong to either AC or MA. Those in PC and PR had the highest intakes of energy, fat, saturated fatty acid, and cholesterol (except in males), and those in AC and MA had the lowest; these dietary patterns were more distinctive in females than in males. The higher the stage of change in dietary fat reduction behavior, the higher the self-efficacy. The percentage of energy from fat in PC, CO, and PR was well above 20%, whereas in AC and MA (except males in MA) it was within 20%. The average P/S and $\omega$6/$\omega$3 ratios of dietary fat for females were similar to the recommended ratios, but the average $\omega$6/$\omega$3 ratio for males was 10.1~12.9, beyond the suggested range of 4~10. In males, energy, fat, and protein intakes from dinner differed significantly among stages of change; in females, intakes from breakfast, lunch, and snacks, in addition to dinner, also differed significantly among stages of change. These results confirm differences in nutritional status across stages of change in fat intake, especially in females, and indicate the need to take these stages of change into account in nutrition advice.

Treatment Strategies for Depression during Pregnancy and Lactation (임신과 수유기 우울증의 치료 전략)

  • Lee, Soyoung Irene; Jung, Han-Yong
    • Korean Journal of Biological Psychiatry / v.14 no.2 / pp.91-98 / 2007
  • Objectives : Considering the impact of depressive illness on the physical and mental health of both mother and fetus, specification of a treatment algorithm for depressive disorder during pregnancy is warranted. This article provides a systematic review of treatments for depressive disorder during pregnancy and lactation. Methods : Following the search strategy of the Clinical Research Center for Depression of the Korean Health 21 R&D Project, PubMed and EMBASE were searched using terms related to the treatment of depressive disorders during pregnancy and lactation. Reference lists of related reviews and studies were searched, and relevant practice guidelines were retrieved through PubMed. All identified clinical literature was reviewed and summarized in a narrative manner. Results : Pharmacotherapy during pregnancy and lactation requires a comprehensive assessment of the risks and benefits of treatment for both mother and fetus or neonate. There is growing evidence that the use of tricyclic antidepressants and selective serotonin reuptake inhibitors during pregnancy and lactation does not increase the risk of teratogenicity. Treatment strategies are described according to the stage of pregnancy or lactation, and FDA categories for antidepressants during pregnancy and lactation are presented. Issues regarding electroconvulsive therapy and psychosocial treatment are also discussed. Conclusion : The treatment option for depressive disorders during pregnancy and lactation depends on the severity of the individual patient's depressive illness. For mild to moderate depression, non-pharmacological treatment should be considered first; for moderate to severe depression, pharmacotherapy should be administered in addition to psychosocial treatment. ECT is recommended for depressive disorder of severe intensity. As the research evidence is limited, recommendations should be based on the best judgment of psychiatrists.

The Effect of Data Size on the k-NN Predictability: Application to Samsung Electronics Stock Market Prediction (데이터 크기에 따른 k-NN의 예측력 연구: 삼성전자주가를 사례로)

  • Chun, Se-Hak
    • Journal of Intelligence and Information Systems / v.25 no.3 / pp.239-251 / 2019
  • Statistical methods such as moving averages, Kalman filtering, exponential smoothing, regression analysis, and ARIMA (autoregressive integrated moving average) have been used for stock market prediction, but they have not produced superior performance. In recent years, machine learning techniques have been widely used in stock market prediction, including artificial neural networks, SVM, and genetic algorithms. In particular, a case-based reasoning method known as k-nearest neighbor is widely used for stock price prediction. Case-based reasoning retrieves several similar cases from previous cases when a new problem occurs and combines the class labels of the similar cases to create a classification for the new problem. However, case-based reasoning has some problems. First, it tends to search for a fixed number of neighbors in the observation space, always selecting the same number of neighbors rather than the best similar neighbors for the target case, so it may take more cases into account even when fewer applicable cases exist for the subject. Second, it may select neighbors that are far away from the target case. Thus, case-based reasoning does not guarantee an optimal pseudo-neighborhood for various target cases, and predictability can be degraded by deviation from the desired similar neighbors. This paper examines how the size of the learning data affects stock price predictability with k-nearest neighbor and compares the predictability of k-nearest neighbor with a random walk model according to the size of the learning data and the number of neighbors. In this study, Samsung Electronics stock prices were predicted with the learning dataset divided into two settings. For prediction of the next day's closing price, four variables were used: opening value, daily high, daily low, and daily close. In the first experiment, data from January 1, 2000 to December 31, 2017 were used for learning; in the second experiment, data from January 1, 2015 to December 31, 2017 were used. The test data cover January 1, 2018 to August 31, 2018 for both experiments. We compared the performance of k-NN with the random walk model on the two learning datasets. In the first experiment, the mean absolute percentage error (MAPE) was 1.3497 for the random walk model and 1.3570 for k-NN when the learning data were small. In the second experiment, when the learning data were large, the MAPE was 1.3497 for the random walk model and 1.2928 for k-NN. These results show that predictive power is higher when more learning data are used than when less learning data are used. The paper also shows that k-NN generally produces better predictive power than the random walk model for larger learning datasets but not when the learning dataset is relatively small. Future studies need to consider macroeconomic variables related to stock price forecasting in addition to the opening, low, high, and closing prices. To produce better results, it is also recommended that the k-nearest neighbor method find nearest neighbors using a second-step filtering method that considers fundamental economic variables as well as a sufficient amount of learning data.
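A rough sketch of the k-NN next-day prediction and the MAPE comparison with a random walk described above is shown below. The synthetic price series, feature layout, and k value are assumptions for illustration, not the paper's Samsung Electronics data or exact setup.

```python
# Hedged sketch: k-NN prediction of next-day close vs. a random-walk baseline (MAPE).
# Prices are synthetic; the study used Samsung Electronics open/high/low/close data.
import numpy as np
from sklearn.neighbors import KNeighborsRegressor

rng = np.random.default_rng(1)
close = 50000 * np.exp(np.cumsum(rng.normal(0, 0.01, 600)))      # synthetic price path
high, low, open_ = close * 1.01, close * 0.99, np.roll(close, 1)

# Features for day t: open, high, low, close; target: next day's close
X = np.column_stack([open_, high, low, close])[1:-1]
y = close[2:]

split = int(len(X) * 0.8)
X_train, y_train, X_test, y_test = X[:split], y[:split], X[split:], y[split:]

knn = KNeighborsRegressor(n_neighbors=5).fit(X_train, y_train)
pred_knn = knn.predict(X_test)
pred_rw  = X_test[:, 3]                                           # random walk: tomorrow = today's close

mape = lambda y_true, y_pred: np.mean(np.abs((y_true - y_pred) / y_true)) * 100
print(f"k-NN MAPE: {mape(y_test, pred_knn):.4f}   random walk MAPE: {mape(y_test, pred_rw):.4f}")
```

Repeating the run with a longer or shorter training slice illustrates the paper's question of how learning-data size affects k-NN predictability relative to the random walk.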