• Title/Summary/Keyword: 성능개선

Search Result 12,193, Processing Time 0.047 seconds

The way to make training data for deep learning model to recognize keywords in product catalog image at E-commerce (온라인 쇼핑몰에서 상품 설명 이미지 내의 키워드 인식을 위한 딥러닝 훈련 데이터 자동 생성 방안)

  • Kim, Kitae;Oh, Wonseok;Lim, Geunwon;Cha, Eunwoo;Shin, Minyoung;Kim, Jongwoo
    • Journal of Intelligence and Information Systems
    • /
    • v.24 no.1
    • /
    • pp.1-23
    • /
    • 2018
  • From the 21st century, various high-quality services have come up with the growth of the internet or 'Information and Communication Technologies'. Especially, the scale of E-commerce industry in which Amazon and E-bay are standing out is exploding in a large way. As E-commerce grows, Customers could get what they want to buy easily while comparing various products because more products have been registered at online shopping malls. However, a problem has arisen with the growth of E-commerce. As too many products have been registered, it has become difficult for customers to search what they really need in the flood of products. When customers search for desired products with a generalized keyword, too many products have come out as a result. On the contrary, few products have been searched if customers type in details of products because concrete product-attributes have been registered rarely. In this situation, recognizing texts in images automatically with a machine can be a solution. Because bulk of product details are written in catalogs as image format, most of product information are not searched with text inputs in the current text-based searching system. It means if information in images can be converted to text format, customers can search products with product-details, which make them shop more conveniently. There are various existing OCR(Optical Character Recognition) programs which can recognize texts in images. But existing OCR programs are hard to be applied to catalog because they have problems in recognizing texts in certain circumstances, like texts are not big enough or fonts are not consistent. Therefore, this research suggests the way to recognize keywords in catalog with the Deep Learning algorithm which is state of the art in image-recognition area from 2010s. Single Shot Multibox Detector(SSD), which is a credited model for object-detection performance, can be used with structures re-designed to take into account the difference of text from object. But there is an issue that SSD model needs a lot of labeled-train data to be trained, because of the characteristic of deep learning algorithms, that it should be trained by supervised-learning. To collect data, we can try labelling location and classification information to texts in catalog manually. But if data are collected manually, many problems would come up. Some keywords would be missed because human can make mistakes while labelling train data. And it becomes too time-consuming to collect train data considering the scale of data needed or costly if a lot of workers are hired to shorten the time. Furthermore, if some specific keywords are needed to be trained, searching images that have the words would be difficult, as well. To solve the data issue, this research developed a program which create train data automatically. This program can make images which have various keywords and pictures like catalog and save location-information of keywords at the same time. With this program, not only data can be collected efficiently, but also the performance of SSD model becomes better. The SSD model recorded 81.99% of recognition rate with 20,000 data created by the program. Moreover, this research had an efficiency test of SSD model according to data differences to analyze what feature of data exert influence upon the performance of recognizing texts in images. As a result, it is figured out that the number of labeled keywords, the addition of overlapped keyword label, the existence of keywords that is not labeled, the spaces among keywords and the differences of background images are related to the performance of SSD model. This test can lead performance improvement of SSD model or other text-recognizing machine based on deep learning algorithm with high-quality data. SSD model which is re-designed to recognize texts in images and the program developed for creating train data are expected to contribute to improvement of searching system in E-commerce. Suppliers can put less time to register keywords for products and customers can search products with product-details which is written on the catalog.

A Study on Knowledge Entity Extraction Method for Individual Stocks Based on Neural Tensor Network (뉴럴 텐서 네트워크 기반 주식 개별종목 지식개체명 추출 방법에 관한 연구)

  • Yang, Yunseok;Lee, Hyun Jun;Oh, Kyong Joo
    • Journal of Intelligence and Information Systems
    • /
    • v.25 no.2
    • /
    • pp.25-38
    • /
    • 2019
  • Selecting high-quality information that meets the interests and needs of users among the overflowing contents is becoming more important as the generation continues. In the flood of information, efforts to reflect the intention of the user in the search result better are being tried, rather than recognizing the information request as a simple string. Also, large IT companies such as Google and Microsoft focus on developing knowledge-based technologies including search engines which provide users with satisfaction and convenience. Especially, the finance is one of the fields expected to have the usefulness and potential of text data analysis because it's constantly generating new information, and the earlier the information is, the more valuable it is. Automatic knowledge extraction can be effective in areas where information flow is vast, such as financial sector, and new information continues to emerge. However, there are several practical difficulties faced by automatic knowledge extraction. First, there are difficulties in making corpus from different fields with same algorithm, and it is difficult to extract good quality triple. Second, it becomes more difficult to produce labeled text data by people if the extent and scope of knowledge increases and patterns are constantly updated. Third, performance evaluation is difficult due to the characteristics of unsupervised learning. Finally, problem definition for automatic knowledge extraction is not easy because of ambiguous conceptual characteristics of knowledge. So, in order to overcome limits described above and improve the semantic performance of stock-related information searching, this study attempts to extract the knowledge entity by using neural tensor network and evaluate the performance of them. Different from other references, the purpose of this study is to extract knowledge entity which is related to individual stock items. Various but relatively simple data processing methods are applied in the presented model to solve the problems of previous researches and to enhance the effectiveness of the model. From these processes, this study has the following three significances. First, A practical and simple automatic knowledge extraction method that can be applied. Second, the possibility of performance evaluation is presented through simple problem definition. Finally, the expressiveness of the knowledge increased by generating input data on a sentence basis without complex morphological analysis. The results of the empirical analysis and objective performance evaluation method are also presented. The empirical study to confirm the usefulness of the presented model, experts' reports about individual 30 stocks which are top 30 items based on frequency of publication from May 30, 2017 to May 21, 2018 are used. the total number of reports are 5,600, and 3,074 reports, which accounts about 55% of the total, is designated as a training set, and other 45% of reports are designated as a testing set. Before constructing the model, all reports of a training set are classified by stocks, and their entities are extracted using named entity recognition tool which is the KKMA. for each stocks, top 100 entities based on appearance frequency are selected, and become vectorized using one-hot encoding. After that, by using neural tensor network, the same number of score functions as stocks are trained. Thus, if a new entity from a testing set appears, we can try to calculate the score by putting it into every single score function, and the stock of the function with the highest score is predicted as the related item with the entity. To evaluate presented models, we confirm prediction power and determining whether the score functions are well constructed by calculating hit ratio for all reports of testing set. As a result of the empirical study, the presented model shows 69.3% hit accuracy for testing set which consists of 2,526 reports. this hit ratio is meaningfully high despite of some constraints for conducting research. Looking at the prediction performance of the model for each stocks, only 3 stocks, which are LG ELECTRONICS, KiaMtr, and Mando, show extremely low performance than average. this result maybe due to the interference effect with other similar items and generation of new knowledge. In this paper, we propose a methodology to find out key entities or their combinations which are necessary to search related information in accordance with the user's investment intention. Graph data is generated by using only the named entity recognition tool and applied to the neural tensor network without learning corpus or word vectors for the field. From the empirical test, we confirm the effectiveness of the presented model as described above. However, there also exist some limits and things to complement. Representatively, the phenomenon that the model performance is especially bad for only some stocks shows the need for further researches. Finally, through the empirical study, we confirmed that the learning method presented in this study can be used for the purpose of matching the new text information semantically with the related stocks.

Korean Sentence Generation Using Phoneme-Level LSTM Language Model (한국어 음소 단위 LSTM 언어모델을 이용한 문장 생성)

  • Ahn, SungMahn;Chung, Yeojin;Lee, Jaejoon;Yang, Jiheon
    • Journal of Intelligence and Information Systems
    • /
    • v.23 no.2
    • /
    • pp.71-88
    • /
    • 2017
  • Language models were originally developed for speech recognition and language processing. Using a set of example sentences, a language model predicts the next word or character based on sequential input data. N-gram models have been widely used but this model cannot model the correlation between the input units efficiently since it is a probabilistic model which are based on the frequency of each unit in the training set. Recently, as the deep learning algorithm has been developed, a recurrent neural network (RNN) model and a long short-term memory (LSTM) model have been widely used for the neural language model (Ahn, 2016; Kim et al., 2016; Lee et al., 2016). These models can reflect dependency between the objects that are entered sequentially into the model (Gers and Schmidhuber, 2001; Mikolov et al., 2010; Sundermeyer et al., 2012). In order to learning the neural language model, texts need to be decomposed into words or morphemes. Since, however, a training set of sentences includes a huge number of words or morphemes in general, the size of dictionary is very large and so it increases model complexity. In addition, word-level or morpheme-level models are able to generate vocabularies only which are contained in the training set. Furthermore, with highly morphological languages such as Turkish, Hungarian, Russian, Finnish or Korean, morpheme analyzers have more chance to cause errors in decomposition process (Lankinen et al., 2016). Therefore, this paper proposes a phoneme-level language model for Korean language based on LSTM models. A phoneme such as a vowel or a consonant is the smallest unit that comprises Korean texts. We construct the language model using three or four LSTM layers. Each model was trained using Stochastic Gradient Algorithm and more advanced optimization algorithms such as Adagrad, RMSprop, Adadelta, Adam, Adamax, and Nadam. Simulation study was done with Old Testament texts using a deep learning package Keras based the Theano. After pre-processing the texts, the dataset included 74 of unique characters including vowels, consonants, and punctuation marks. Then we constructed an input vector with 20 consecutive characters and an output with a following 21st character. Finally, total 1,023,411 sets of input-output vectors were included in the dataset and we divided them into training, validation, testsets with proportion 70:15:15. All the simulation were conducted on a system equipped with an Intel Xeon CPU (16 cores) and a NVIDIA GeForce GTX 1080 GPU. We compared the loss function evaluated for the validation set, the perplexity evaluated for the test set, and the time to be taken for training each model. As a result, all the optimization algorithms but the stochastic gradient algorithm showed similar validation loss and perplexity, which are clearly superior to those of the stochastic gradient algorithm. The stochastic gradient algorithm took the longest time to be trained for both 3- and 4-LSTM models. On average, the 4-LSTM layer model took 69% longer training time than the 3-LSTM layer model. However, the validation loss and perplexity were not improved significantly or became even worse for specific conditions. On the other hand, when comparing the automatically generated sentences, the 4-LSTM layer model tended to generate the sentences which are closer to the natural language than the 3-LSTM model. Although there were slight differences in the completeness of the generated sentences between the models, the sentence generation performance was quite satisfactory in any simulation conditions: they generated only legitimate Korean letters and the use of postposition and the conjugation of verbs were almost perfect in the sense of grammar. The results of this study are expected to be widely used for the processing of Korean language in the field of language processing and speech recognition, which are the basis of artificial intelligence systems.

A Comparative Study of Subset Construction Methods in OSEM Algorithms using Simulated Projection Data of Compton Camera (모사된 컴프턴 카메라 투사데이터의 재구성을 위한 OSEM 알고리즘의 부분집합 구성법 비교 연구)

  • Kim, Soo-Mee;Lee, Jae-Sung;Lee, Mi-No;Lee, Ju-Hahn;Kim, Joong-Hyun;Kim, Chan-Hyeong;Lee, Chun-Sik;Lee, Dong-Soo;Lee, Soo-Jin
    • Nuclear Medicine and Molecular Imaging
    • /
    • v.41 no.3
    • /
    • pp.234-240
    • /
    • 2007
  • Purpose: In this study we propose a block-iterative method for reconstructing Compton scattered data. This study shows that the well-known expectation maximization (EM) approach along with its accelerated version based on the ordered subsets principle can be applied to the problem of image reconstruction for Compton camera. This study also compares several methods of constructing subsets for optimal performance of our algorithms. Materials and Methods: Three reconstruction algorithms were implemented; simple backprojection (SBP), EM, and ordered subset EM (OSEM). For OSEM, the projection data were grouped into subsets in a predefined order. Three different schemes for choosing nonoverlapping subsets were considered; scatter angle-based subsets, detector position-based subsets, and both scatter angle- and detector position-based subsets. EM and OSEM with 16 subsets were performed with 64 and 4 iterations, respectively. The performance of each algorithm was evaluated in terms of computation time and normalized mean-squared error. Results: Both EM and OSEM clearly outperformed SBP in all aspects of accuracy. The OSEM with 16 subsets and 4 iterations, which is equivalent to the standard EM with 64 iterations, was approximately 14 times faster in computation time than the standard EM. In OSEM, all of the three schemes for choosing subsets yielded similar results in computation time as well as normalized mean-squared error. Conclusion: Our results show that the OSEM algorithm, which have proven useful in emission tomography, can also be applied to the problem of image reconstruction for Compton camera. With properly chosen subset construction methods and moderate numbers of subsets, our OSEM algorithm significantly improves the computational efficiency while keeping the original quality of the standard EM reconstruction. The OSEM algorithm with scatter angle- and detector position-based subsets is most available.

A Study on the Utilzation of Two Furrow Combine (2조형(條型) Combine의 이용(利用)에 관(關)한 연구(硏究))

  • Lee, Sang Woo;Kim, Soung Rai
    • Korean Journal of Agricultural Science
    • /
    • v.3 no.1
    • /
    • pp.95-104
    • /
    • 1976
  • This study was conducted to test the harvesting operation of two kinds of rice varieties such as Milyang #15 and Tong-il with a imported two furrow Japanese combine and was performed to find out the operational accuracy of it, the adaptability of this machine, and the feasibility of supplying this machine to rural area in Korea. The results obtained in this study are summarized as follows; 1. The harvesting test of the Milyang #15 was carried out 5 times from the optimum harvesting operation was good regardless of its maturity. The field grain loss ratio and the rate of unthreshed paddy were all about 1 percent. 2. The field grain loss of Tong-il harvested was increased from 5.13% to 10.34% along its maturity as shown in Fig 1. In considering this, it was needed that the combine mechanism should be improved mechanically for harvesting of Tong-il rice variety. 3. The rate of unthreshed paddy of Tong-il rice variety of which stem was short was average 1.6 percent, because the sample combine used in this study was developed on basisof the long stem variety in Japan, therefore some ears owing to the uneven stem of Tong-il rice could nat reach the teeth of the threshing drum. 4. The cracking rates of brown rice depending mostly upon the revolution speed of the threshing drum(240-350 rpm) in harvesting of Tong-il and Milyang #15 were all below 1 percent, and there was no significance between two varieties. 5. Since the ears of Tong-il rice variety covered with its leaves, a lots of trashes was produced, especially when threshed in raw materials, and the cleaning and the trashout mechanisms were clogged with those trashes very often, and so these two mechanisms were needed for being improved. 6. The sample combine of which track pressure was $0.19kg/cm^2$ could drive on the soft ground of which sinking was even 25cm as shown in Fig 3. But in considering the reaping height adjustment, 5cm sinking may be afford to drive the combine on the irregular sinking level ground without any readjustment of the resaping height. 7. The harvesting expenses per ha. by the sample combine of which annual coverage area is 4.7 ha. under conditions that the yearly workable days is 40, percentage of days being good for harvesting operation is 60%, field efficiency is 56%, working speed is 0.273m/sec, and daily workable hours is 8 hrs is reasonable to spread this combine to rural area in Korea, comparing to the expenses by the conventional harvesting expenses, if mechanical improvement is supplemented so as to harvest Tong-il rice. 8. In order to harvest Tong-il rice, the two furrow combine should be needed some mechanical improvements that divider can control not to touch ears of paddy, the space between the feeding chain and the thrshing drum is reduced, trash treatment apparatus must be improved, fore and rear adjust-interval is enlarged, and width of track must be enlarged so as to drive on the soft ground.

  • PDF

A Study on Improvement on National Legislation for Sustainable Progress of Space Development Project (우주개발사업의 지속발전을 위한 국내입법의 개선방향에 관한 연구)

  • Lee, Kang-Bin
    • The Korean Journal of Air & Space Law and Policy
    • /
    • v.25 no.1
    • /
    • pp.97-158
    • /
    • 2010
  • The purpose of this paper is to research on the contents and improvement of national legislations relating to space development in Korea to make the sustainable progress of space development project in Korea. Korea has launched its first satellite KITST-1 in 1992. The National Space Committee has established "The Space Development Promotion Basic Plan" in 2007. The plan addressed the development of total 13 satellites by 2010 and the space launch vehicle by 2020, and the launch of moon exploration spaceship by 2021. Korea has built the space center at Oinarodo, Goheng Province in June 2009. In Korea the first small launch vehicle KSLV-1 was launched at the Naro Space Center in August 2009, and its second launch was made in June 2010. The United Nations has adopted five treaties relating to the development of outer space as follows : The Outer Space Treaty of 1967, the Rescue and Return Agreement of 1968, the Liability Convention of 1972, the Registration Convention of 1974, and the Moon Treaty of 1979. All five treaties has come into force. Korea has ratified the Outer Space Treaty, the Rescue and Return Agreement, the Liability Convention and the Registration Convention excepting the Moon Treaty. Most of development countries have enacted the national legislation relating to the development of our space as follows : The National Aeronautic and Space Act of 1958 and the Commercial Space Act of 1998 in the United States, Outer Space Act of 1986 in England, Establishment Act of National Space Center of 1961 in France, Canadian Space Agency Act of 1990 in Canada, Space Basic Act of 2008 in Japan, and Law on Space Activity of 1993 in Russia. There are currently three national legislations relating to space development in Korea as follows : Aerospace Industry Development Promotion Act of 1987, Outer Space Development Promotion Act of 2005, Outer Space Damage Compensation Act of 2008. The Ministry of Knowledge Economy of Korea has announced the Full Amendment Draft of Aerospace Industry Development Promotion Act in December 2009, and it's main contents are as follows : (1) Changing the title of Act into Aerospace Industry Promotion Act, (2) Newly regulating the definition of air flight test place, etc., (3) Establishment of aerospace industry basic plan, establishment of aerospace industry committee, (4) Project for promoting aerospace industry, (5) Exploration development, international joint development, (6) Cooperative research development, (7) Mutual benefit project, (8) Project for furthering basis of aerospace industry, (9) Activating cluster of aerospace industry, (10) Designation of air flight test place, etc., (11) Abolishing the designation and assistance of specific enterprise, (12) Abolishing the inspection of performance and quality. The Outer Space Development Promotion Act should be revised with regard to the following matters : (1) Overlapping problem in legal system between the Outer Space Development Promotion Act and the Aerospace industry Development promotion Act, (2) Distribution and adjustment problem of the national research development budget for space development between National Space Committee and National Science Technology Committee, (3) Consideration and preservation of environment in space development, (4) Taking the legal action and maintaining the legal system for policy and regulation relating to space development. The Outer Space Damage Compensation Act should be revised with regard to the following matters : (1) Definition of space damage and indirect damage, (2) Currency unit of limit of compensation liability, (3) Joint liability and compensation claim right of launching person of space object, (4) Establishment of Space Damage Compensation Council. In Korea, it will be possible to make a space tourism in 2013, and it is planned to introduce and operate a manned spaceship in 2013. Therefore, it is necessary to develop the policy relating to the promotion of commercial space transportation industry. Also it is necessary to make the proper maintenance of the current Aviation Law and space development-related laws and regulations for the promotion of space transportation industry in Korea.

  • PDF

Application of Support Vector Regression for Improving the Performance of the Emotion Prediction Model (감정예측모형의 성과개선을 위한 Support Vector Regression 응용)

  • Kim, Seongjin;Ryoo, Eunchung;Jung, Min Kyu;Kim, Jae Kyeong;Ahn, Hyunchul
    • Journal of Intelligence and Information Systems
    • /
    • v.18 no.3
    • /
    • pp.185-202
    • /
    • 2012
  • .Since the value of information has been realized in the information society, the usage and collection of information has become important. A facial expression that contains thousands of information as an artistic painting can be described in thousands of words. Followed by the idea, there has recently been a number of attempts to provide customers and companies with an intelligent service, which enables the perception of human emotions through one's facial expressions. For example, MIT Media Lab, the leading organization in this research area, has developed the human emotion prediction model, and has applied their studies to the commercial business. In the academic area, a number of the conventional methods such as Multiple Regression Analysis (MRA) or Artificial Neural Networks (ANN) have been applied to predict human emotion in prior studies. However, MRA is generally criticized because of its low prediction accuracy. This is inevitable since MRA can only explain the linear relationship between the dependent variables and the independent variable. To mitigate the limitations of MRA, some studies like Jung and Kim (2012) have used ANN as the alternative, and they reported that ANN generated more accurate prediction than the statistical methods like MRA. However, it has also been criticized due to over fitting and the difficulty of the network design (e.g. setting the number of the layers and the number of the nodes in the hidden layers). Under this background, we propose a novel model using Support Vector Regression (SVR) in order to increase the prediction accuracy. SVR is an extensive version of Support Vector Machine (SVM) designated to solve the regression problems. The model produced by SVR only depends on a subset of the training data, because the cost function for building the model ignores any training data that is close (within a threshold ${\varepsilon}$) to the model prediction. Using SVR, we tried to build a model that can measure the level of arousal and valence from the facial features. To validate the usefulness of the proposed model, we collected the data of facial reactions when providing appropriate visual stimulating contents, and extracted the features from the data. Next, the steps of the preprocessing were taken to choose statistically significant variables. In total, 297 cases were used for the experiment. As the comparative models, we also applied MRA and ANN to the same data set. For SVR, we adopted '${\varepsilon}$-insensitive loss function', and 'grid search' technique to find the optimal values of the parameters like C, d, ${\sigma}^2$, and ${\varepsilon}$. In the case of ANN, we adopted a standard three-layer backpropagation network, which has a single hidden layer. The learning rate and momentum rate of ANN were set to 10%, and we used sigmoid function as the transfer function of hidden and output nodes. We performed the experiments repeatedly by varying the number of nodes in the hidden layer to n/2, n, 3n/2, and 2n, where n is the number of the input variables. The stopping condition for ANN was set to 50,000 learning events. And, we used MAE (Mean Absolute Error) as the measure for performance comparison. From the experiment, we found that SVR achieved the highest prediction accuracy for the hold-out data set compared to MRA and ANN. Regardless of the target variables (the level of arousal, or the level of positive / negative valence), SVR showed the best performance for the hold-out data set. ANN also outperformed MRA, however, it showed the considerably lower prediction accuracy than SVR for both target variables. The findings of our research are expected to be useful to the researchers or practitioners who are willing to build the models for recognizing human emotions.

Implementation Strategy for the Elderly Care Solution Based on Usage Log Analysis: Focusing on the Case of Hyodol Product (사용자 로그 분석에 기반한 노인 돌봄 솔루션 구축 전략: 효돌 제품의 사례를 중심으로)

  • Lee, Junsik;Yoo, In-Jin;Park, Do-Hyung
    • Journal of Intelligence and Information Systems
    • /
    • v.25 no.3
    • /
    • pp.117-140
    • /
    • 2019
  • As the aging phenomenon accelerates and various social problems related to the elderly of the vulnerable are raised, the need for effective elderly care solutions to protect the health and safety of the elderly generation is growing. Recently, more and more people are using Smart Toys equipped with ICT technology for care for elderly. In particular, log data collected through smart toys is highly valuable to be used as a quantitative and objective indicator in areas such as policy-making and service planning. However, research related to smart toys is limited, such as the development of smart toys and the validation of smart toy effectiveness. In other words, there is a dearth of research to derive insights based on log data collected through smart toys and to use them for decision making. This study will analyze log data collected from smart toy and derive effective insights to improve the quality of life for elderly users. Specifically, the user profiling-based analysis and elicitation of a change in quality of life mechanism based on behavior were performed. First, in the user profiling analysis, two important dimensions of classifying the type of elderly group from five factors of elderly user's living management were derived: 'Routine Activities' and 'Work-out Activities'. Based on the dimensions derived, a hierarchical cluster analysis and K-Means clustering were performed to classify the entire elderly user into three groups. Through a profiling analysis, the demographic characteristics of each group of elderlies and the behavior of using smart toy were identified. Second, stepwise regression was performed in eliciting the mechanism of change in quality of life. The effects of interaction, content usage, and indoor activity have been identified on the improvement of depression and lifestyle for the elderly. In addition, it identified the role of user performance evaluation and satisfaction with smart toy as a parameter that mediated the relationship between usage behavior and quality of life change. Specific mechanisms are as follows. First, the interaction between smart toy and elderly was found to have an effect of improving the depression by mediating attitudes to smart toy. The 'Satisfaction toward Smart Toy,' a variable that affects the improvement of the elderly's depression, changes how users evaluate smart toy performance. At this time, it has been identified that it is the interaction with smart toy that has a positive effect on smart toy These results can be interpreted as an elderly with a desire to meet emotional stability interact actively with smart toy, and a positive assessment of smart toy, greatly appreciating the effectiveness of smart toy. Second, the content usage has been confirmed to have a direct effect on improving lifestyle without going through other variables. Elderly who use a lot of the content provided by smart toy have improved their lifestyle. However, this effect has occurred regardless of the attitude the user has toward smart toy. Third, log data show that a high degree of indoor activity improves both the lifestyle and depression of the elderly. The more indoor activity, the better the lifestyle of the elderly, and these effects occur regardless of the user's attitude toward smart toy. In addition, elderly with a high degree of indoor activity are satisfied with smart toys, which cause improvement in the elderly's depression. However, it can be interpreted that elderly who prefer outdoor activities than indoor activities, or those who are less active due to health problems, are hard to satisfied with smart toys, and are not able to get the effects of improving depression. In summary, based on the activities of the elderly, three groups of elderly were identified and the important characteristics of each type were identified. In addition, this study sought to identify the mechanism by which the behavior of the elderly on smart toy affects the lives of the actual elderly, and to derive user needs and insights.

How to improve the accuracy of recommendation systems: Combining ratings and review texts sentiment scores (평점과 리뷰 텍스트 감성분석을 결합한 추천시스템 향상 방안 연구)

  • Hyun, Jiyeon;Ryu, Sangyi;Lee, Sang-Yong Tom
    • Journal of Intelligence and Information Systems
    • /
    • v.25 no.1
    • /
    • pp.219-239
    • /
    • 2019
  • As the importance of providing customized services to individuals becomes important, researches on personalized recommendation systems are constantly being carried out. Collaborative filtering is one of the most popular systems in academia and industry. However, there exists limitation in a sense that recommendations were mostly based on quantitative information such as users' ratings, which made the accuracy be lowered. To solve these problems, many studies have been actively attempted to improve the performance of the recommendation system by using other information besides the quantitative information. Good examples are the usages of the sentiment analysis on customer review text data. Nevertheless, the existing research has not directly combined the results of the sentiment analysis and quantitative rating scores in the recommendation system. Therefore, this study aims to reflect the sentiments shown in the reviews into the rating scores. In other words, we propose a new algorithm that can directly convert the user 's own review into the empirically quantitative information and reflect it directly to the recommendation system. To do this, we needed to quantify users' reviews, which were originally qualitative information. In this study, sentiment score was calculated through sentiment analysis technique of text mining. The data was targeted for movie review. Based on the data, a domain specific sentiment dictionary is constructed for the movie reviews. Regression analysis was used as a method to construct sentiment dictionary. Each positive / negative dictionary was constructed using Lasso regression, Ridge regression, and ElasticNet methods. Based on this constructed sentiment dictionary, the accuracy was verified through confusion matrix. The accuracy of the Lasso based dictionary was 70%, the accuracy of the Ridge based dictionary was 79%, and that of the ElasticNet (${\alpha}=0.3$) was 83%. Therefore, in this study, the sentiment score of the review is calculated based on the dictionary of the ElasticNet method. It was combined with a rating to create a new rating. In this paper, we show that the collaborative filtering that reflects sentiment scores of user review is superior to the traditional method that only considers the existing rating. In order to show that the proposed algorithm is based on memory-based user collaboration filtering, item-based collaborative filtering and model based matrix factorization SVD, and SVD ++. Based on the above algorithm, the mean absolute error (MAE) and the root mean square error (RMSE) are calculated to evaluate the recommendation system with a score that combines sentiment scores with a system that only considers scores. When the evaluation index was MAE, it was improved by 0.059 for UBCF, 0.0862 for IBCF, 0.1012 for SVD and 0.188 for SVD ++. When the evaluation index is RMSE, UBCF is 0.0431, IBCF is 0.0882, SVD is 0.1103, and SVD ++ is 0.1756. As a result, it can be seen that the prediction performance of the evaluation point reflecting the sentiment score proposed in this paper is superior to that of the conventional evaluation method. In other words, in this paper, it is confirmed that the collaborative filtering that reflects the sentiment score of the user review shows superior accuracy as compared with the conventional type of collaborative filtering that only considers the quantitative score. We then attempted paired t-test validation to ensure that the proposed model was a better approach and concluded that the proposed model is better. In this study, to overcome limitations of previous researches that judge user's sentiment only by quantitative rating score, the review was numerically calculated and a user's opinion was more refined and considered into the recommendation system to improve the accuracy. The findings of this study have managerial implications to recommendation system developers who need to consider both quantitative information and qualitative information it is expect. The way of constructing the combined system in this paper might be directly used by the developers.

Estimation of Fresh Weight and Leaf Area Index of Soybean (Glycine max) Using Multi-year Spectral Data (다년도 분광 데이터를 이용한 콩의 생체중, 엽면적 지수 추정)

  • Jang, Si-Hyeong;Ryu, Chan-Seok;Kang, Ye-Seong;Park, Jun-Woo;Kim, Tae-Yang;Kang, Kyung-Suk;Park, Min-Jun;Baek, Hyun-Chan;Park, Yu-hyeon;Kang, Dong-woo;Zou, Kunyan;Kim, Min-Cheol;Kwon, Yeon-Ju;Han, Seung-ah;Jun, Tae-Hwan
    • Korean Journal of Agricultural and Forest Meteorology
    • /
    • v.23 no.4
    • /
    • pp.329-339
    • /
    • 2021
  • Soybeans (Glycine max), one of major upland crops, require precise management of environmental conditions, such as temperature, water, and soil, during cultivation since they are sensitive to environmental changes. Application of spectral technologies that measure the physiological state of crops remotely has great potential for improving quality and productivity of the soybean by estimating yields, physiological stresses, and diseases. In this study, we developed and validated a soybean growth prediction model using multispectral imagery. We conducted a linear regression analysis between vegetation indices and soybean growth data (fresh weight and LAI) obtained at Miryang fields. The linear regression model was validated at Goesan fields. It was found that the model based on green ratio vegetation index (GRVI) had the greatest performance in prediction of fresh weight at the calibration stage (R2=0.74, RMSE=246 g/m2, RE=34.2%). In the validation stage, RMSE and RE of the model were 392 g/m2 and 32%, respectively. The errors of the model differed by cropping system, For example, RMSE and RE of model in single crop fields were 315 g/m2 and 26%, respectively. On the other hand, the model had greater values of RMSE (381 g/m2) and RE (31%) in double crop fields. As a result of developing models for predicting a fresh weight into two years (2018+2020) with similar accumulated temperature (AT) in three years and a single year (2019) that was different from that AT, the prediction performance of a single year model was better than a two years model. Consequently, compared with those models divided by AT and a three years model, RMSE of a single crop fields were improved by about 29.1%. However, those of double crop fields decreased by about 19.6%. When environmental factors are used along with, spectral data, the reliability of soybean growth prediction can be achieved various environmental conditions.