• Title/Summary/Keyword: 성능향상 (performance improvement)


Analysis of Market and Technology Status of Major Agricultural Machinery (Tractor, Combine Harvester and Rice Transplanter) (핵심 농기계(트랙터, 콤바인 및 이앙기) 시장 및 기술 현황 분석)

  • Hong, Sungha;Choi, Kyu-hong
    • Journal of the Korean Society of International Agriculture / v.31 no.1 / pp.8-16 / 2019
  • Alternatives for increasing the competitiveness of locally manufactured agricultural machinery in domestic and foreign markets are proposed, based on an analysis of the price, market share, performance, and quality of the major agricultural machinery. In the Korean domestic market, the market share of Japanese agricultural machinery is 14.5% for tractors, 31.1% for combine harvesters, and 35.8% for rice transplanters, and is on track to increase further. Japanese manufacturers' domestic patent shares are 58.5% for tractors, 79.9% for combine harvesters, and 69.8% for rice transplanters, showing the dire need for Korean domestic firms to expand their technological rights. To strengthen the industrial competitiveness of agricultural machinery, research that develops the fundamental and elemental technologies needed to reduce the frequency of breakdowns is required in the short term. To achieve this, it is imperative to establish a technology roadmap, promote greater cooperation between academia and industry, and systematically increase research funding. In addition, as a long-term solution for enhancing competitiveness, the establishment of an Agricultural Equipment Technology Institute is strongly recommended to systematically support R&D for developing core technologies, particularly high-quality components that guarantee durability and quality.

Development Strengths of High Strength Headed Bars of RC and SFRC Exterior Beam-Column Joint (RC 및 SFRC 외부 보-기둥 접합부에 대한 고강도 확대머리 철근의 정착강도)

  • Duck-Young Jang;Jae-Won Jeong;Kang-Seok Lee;Seung-Hun Kim
    • Journal of the Korea Institute for Structural Maintenance and Inspection / v.27 no.6 / pp.94-101 / 2023
  • In this study, the development performance of SD700 headed bars was experimentally evaluated in RC (reinforced concrete) and SFRC (steel fiber reinforced concrete) exterior beam-column joints. A total of 10 specimens were tested, with variables including steel fibers, development length, effective depth of the beam, and column stirrups. The specimens showed side-face blowout, concrete breakout, or shear failure depending on the test variables. In the RC series with development length as the variable, the development strength increased by 26.5-42.2% as the development length increased by 25-80%, which is not proportional to the development length. The JD series, with twice the effective beam depth, showed concrete breakout failure and a maximum strength 31.5-62% lower than that of the reference specimen. The S series, in which the spacing of the shear reinforcement around the headed bars was half that of the reference specimen, increased the maximum strength by 8.4-9.7%. Although the concrete compressive strength of the SFRC was 29.3% lower than that of the RC, the development strength of the SFRC specimens increased by 7.3-12.2%, confirming that the development performance of the headed bars is greatly improved by steel fiber reinforcement. Considering that the specimens detailed with 92% and 110% of the KDS-based development length reached only 92% and 99% of the expected maximum strength, respectively, a larger safety factor should be considered. In addition, a design formula that accounts for the effective beam depth relative to the development length is required.

Evaluation of Application Possibility for Floating Marine Pollutants Detection Using Image Enhancement Techniques: A Case Study for Thin Oil Film on the Sea Surface (영상 강화 기법을 통한 부유성 해양오염물질 탐지 기술 적용 가능성 평가: 해수면의 얇은 유막을 대상으로)

  • Soyeong Jang;Yeongbin Park;Jaeyeop Kwon;Sangheon Lee;Tae-Ho Kim
    • Korean Journal of Remote Sensing / v.39 no.6_1 / pp.1353-1369 / 2023
  • In the event of a disaster accident at sea, the scale of damage varies with weather effects such as wind, currents, and tidal waves, and it is essential to minimize the damage by establishing appropriate control plans through quick on-site identification. In particular, pollutants that form a thin film on the sea surface are difficult to identify because of their relatively low viscosity and surface tension compared with other pollutants discharged into the sea. Therefore, this study aims to develop an algorithm that detects floating pollutants on the sea surface in RGB images acquired with imaging equipment that can be easily used in the field, and to evaluate the performance of the algorithm using input data obtained from actual waters. The developed algorithm uses image enhancement techniques to improve the contrast between the intensity values of pollutants and the general sea surface; a background threshold is then found through histogram analysis, suspended solids other than pollutants are removed, and finally the pollutants are classified. To evaluate the performance of the developed algorithm, a real sea test using substitute materials was performed; most of the floating marine pollutants were detected, but false detections occurred in areas with strong waves. Nevertheless, the detection results are about three times better than those of the existing method using a single threshold. The results of this R&D are expected to be useful for on-site control response activities by detecting floating marine pollutants that were previously difficult to identify with the naked eye.
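
A minimal sketch of the contrast-enhancement-plus-histogram-thresholding idea described in the abstract above, using OpenCV and NumPy. The abstract does not name the exact enhancement method, threshold search, or post-filtering step, so CLAHE, Otsu's method, and the area cutoff here are illustrative stand-ins, not the authors' implementation.

```python
import cv2
import numpy as np

def detect_thin_film(rgb_image):
    """Sketch: enhance contrast, threshold from the histogram, keep large regions."""
    gray = cv2.cvtColor(rgb_image, cv2.COLOR_BGR2GRAY)

    # Contrast enhancement (CLAHE is an assumption; the abstract only says
    # "image enhancement techniques").
    clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
    enhanced = clahe.apply(gray)

    # Histogram-based background threshold (Otsu as a stand-in).
    _, mask = cv2.threshold(enhanced, 0, 255,
                            cv2.THRESH_BINARY + cv2.THRESH_OTSU)

    # Remove small connected components (floating debris other than the film).
    n, labels, stats, _ = cv2.connectedComponentsWithStats(mask)
    cleaned = np.zeros_like(mask)
    for i in range(1, n):
        if stats[i, cv2.CC_STAT_AREA] > 500:   # area cutoff is illustrative
            cleaned[labels == i] = 255
    return cleaned
```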

A Study on the Digital Drawing of Archaeological Relics Using Open-Source Software (오픈소스 소프트웨어를 활용한 고고 유물의 디지털 실측 연구)

  • LEE Hosun;AHN Hyoungki
    • Korean Journal of Heritage: History & Science / v.57 no.1 / pp.82-108 / 2024
  • With the transition of archaeological recording methods from analog to digital, 3D scanning technology has been actively adopted in the field, and research on digital archaeological data gathered from 3D scanning and photogrammetry continues. However, due to cost and manpower issues, most buried cultural heritage organizations hesitate to adopt such digital technology. This paper presents a digital recording method for relics that uses open-source software and photogrammetry, which is believed to be the most efficient of the 3D scanning approaches. The digital recording process consists of three stages: acquiring a 3D model, creating a joining map with the edited 3D model, and creating a digital drawing. To enhance accessibility, only open-source software is used throughout the entire process. The results confirm that, in the quantitative evaluation, the deviation between measurements of the actual artifact and the 3D model was minimal, and the quantitative quality analyses from the open-source and commercial software showed high similarity. However, data processing was overwhelmingly faster with the commercial software, presumably because of the higher computational speed of its improved algorithms. In the qualitative evaluation, some differences in mesh and texture quality appeared: the 3D models generated by open-source software showed noise on the mesh surface, a rough mesh surface, and difficulty in confirming production marks and the expression of patterns on the relics. Nevertheless, some of the open-source software produced quality comparable to that of the commercial software in both the quantitative and qualitative evaluations. The open-source software for editing 3D models was able not only to post-process, match, and merge the 3D models, but also to adjust scale, produce joining surfaces, and render the images necessary for the actual measurement of relics. The final drawing was traced in a CAD program, which is also open-source software. In archaeological research, photogrammetry is applicable to various processes, including excavation, report writing, and research on numerical data from 3D models. With the breakthrough development of computer vision, the types of open-source software have diversified and their performance has significantly improved. Given the high accessibility of such digital technology, the acquisition of 3D model data in archaeology will serve as basic data for the preservation and active study of cultural heritage.
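
The abstract mentions scale adjustment of the edited 3D model before drawing. As a purely illustrative sketch (the paper works with open-source GUI tools that are not named here), the same kind of scale correction can be expressed with the open-source trimesh library in Python, assuming a reference dimension measured on the real artifact; the file names and measurement are hypothetical.

```python
import trimesh

# Load the photogrammetry mesh (path is hypothetical).
mesh = trimesh.load("artifact_model.ply", force="mesh")

# Photogrammetry models have arbitrary scale; rescale so that the model's
# longest bounding-box edge matches a dimension measured on the real relic.
measured_length_mm = 152.0                      # caliper measurement (example value)
model_length = mesh.bounding_box.extents.max()  # same edge in model units
mesh.apply_scale(measured_length_mm / model_length)

# Export the scaled model for joining-map creation and drawing.
mesh.export("artifact_model_scaled.ply")
```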

Feasibility of Deep Learning Algorithms for Binary Classification Problems (이진 분류문제에서의 딥러닝 알고리즘의 활용 가능성 평가)

  • Kim, Kitae;Lee, Bomi;Kim, Jong Woo
    • Journal of Intelligence and Information Systems / v.23 no.1 / pp.95-108 / 2017
  • Recently, AlphaGo, the Baduk (Go) artificial intelligence program by Google DeepMind, won a landmark victory against Lee Sedol. Many people thought that machines would not be able to beat a human at Go because, unlike chess, the number of possible moves exceeds the number of atoms in the universe, but the result was the opposite of what people predicted. After the match, artificial intelligence was highlighted as a core technology of the fourth industrial revolution and attracted attention from various application domains. In particular, deep learning has drawn attention as the core artificial intelligence technique used in the AlphaGo algorithm. Deep learning is already being applied to many problems: it shows especially good performance in image recognition, and in high-dimensional data such as voice, images, and natural language, where good performance was difficult to achieve with existing machine learning techniques. In contrast, it is difficult to find deep learning research on traditional business data and structured data analysis. In this study, we examined whether the deep learning techniques studied so far can be used not only for the recognition of high-dimensional data but also for binary classification problems in traditional business data analysis, such as customer churn analysis, marketing response prediction, and default prediction, and we compared the performance of deep learning techniques with that of traditional artificial neural network models. The experimental data are the telemarketing response data of a bank in Portugal, with input variables such as age, occupation, loan status, and the number of previous telemarketing contacts, and a binary target variable recording whether the customer opened an account. To evaluate the applicability of deep learning to binary classification, we compared models using the CNN and LSTM algorithms and the dropout technique, which are widely used in deep learning, with MLP models, a traditional artificial neural network. Because not all network design alternatives can be tested, the experiment was conducted with restricted settings for the number of hidden layers, the number of neurons per hidden layer, the number of output filters, and the application of dropout. The F1 score was used instead of overall accuracy to evaluate how well the models classify the class of interest. The deep learning techniques were applied as follows. The CNN algorithm reads adjacent values and recognizes local features, but business data fields are usually independent, so the distance between fields carries little meaning; we therefore set the CNN filter size to the number of fields so that the whole record is learned at once, and added a hidden layer to make decisions based on the extracted features. For the model with two LSTM layers, the input direction of the second layer was reversed with respect to the first in order to reduce the influence of field position. For the dropout technique, neurons were dropped with a probability of 0.5 in each hidden layer. The experimental results show that the model with the highest F1 score was the CNN model with dropout, followed by the MLP model with two hidden layers and dropout. Several findings emerged. First, models using dropout make slightly more conservative predictions than those without and generally show better classification performance. Second, CNN models show better classification performance than MLP models; this is interesting because the CNN performed well in binary classification problems to which it has rarely been applied, as well as in fields where its effectiveness has already been proven. Third, the LSTM algorithm seems unsuitable for these binary classification problems because the training time is too long relative to the performance improvement. From these results, we confirm that some deep learning algorithms can be applied to business binary classification problems.
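
A minimal sketch of the CNN-with-dropout setup described in the abstract above (a filter whose width equals the number of input fields, followed by a dense hidden layer and a sigmoid output), assuming Keras. The layer sizes, the field count, and the toy data are illustrative; only the dropout probability of 0.5 comes from the abstract.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

n_fields = 16  # number of input variables (illustrative; the bank dataset differs)

# 1D CNN whose filter window spans all fields at once, so the convolution
# sees the whole record in a single step, as described in the abstract.
model = keras.Sequential([
    keras.Input(shape=(n_fields, 1)),
    layers.Conv1D(filters=32, kernel_size=n_fields, activation="relu"),
    layers.Flatten(),
    layers.Dense(32, activation="relu"),
    layers.Dropout(0.5),                       # dropout probability from the abstract
    layers.Dense(1, activation="sigmoid"),     # binary target: opened account or not
])
model.compile(optimizer="adam", loss="binary_crossentropy")

# Toy data standing in for the Portuguese bank telemarketing set.
X = np.random.rand(256, n_fields, 1).astype("float32")
y = np.random.randint(0, 2, size=(256, 1))
model.fit(X, y, epochs=2, batch_size=32, verbose=0)
```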

Adaptive RFID anti-collision scheme using collision information and m-bit identification (충돌 정보와 m-bit인식을 이용한 적응형 RFID 충돌 방지 기법)

  • Lee, Je-Yul;Shin, Jongmin;Yang, Dongmin
    • Journal of Internet Computing and Services / v.14 no.5 / pp.1-10 / 2013
  • RFID (Radio Frequency Identification) is a non-contact identification technology. A basic RFID system consists of a reader and a set of tags. RFID tags can be divided into active and passive tags: active tags have their own power source and can operate on their own, while passive tags are small and low-cost, making them more suitable for the distribution industry. A reader processes the information received from tags, and the RFID system achieves fast identification of multiple tags using radio frequency. RFID systems have been applied to a variety of fields such as distribution, logistics, transportation, inventory management, access control, and finance. To encourage the introduction of RFID systems, several problems (price, size, power consumption, security) should be resolved. In this paper, we propose an algorithm that significantly alleviates the collision problem caused by the simultaneous responses of multiple tags. Anti-collision schemes for RFID systems fall into three categories: probabilistic, deterministic, and hybrid. ALOHA-based protocols are probabilistic and Tree-based protocols are deterministic. In ALOHA-based protocols, time is divided into multiple slots and each tag randomly selects a slot in which to transmit its ID; being probabilistic, these protocols cannot guarantee that all tags are identified. In contrast, Tree-based protocols guarantee that a reader identifies all tags within its transmission range: the reader sends a query, tags whose IDs match respond with their IDs, and when two or more tags respond a collision occurs and the reader issues a new query. Frequent collisions degrade identification performance, so collisions must be reduced efficiently to identify tags quickly. Each RFID tag has a 96-bit EPC (Electronic Product Code) ID, and tags from the same company or manufacturer have similar IDs with the same prefix, so unnecessary collisions occur when identifying multiple tags with the Query Tree protocol; this increases the number of query-responses and the idle time, and the identification time grows significantly. To address this, the Collision Tree protocol and the M-ary Query Tree protocol have been proposed. However, the Collision Tree and Query Tree protocols identify only one bit per query-response, and when similar tag IDs exist, the M-ary Query Tree protocol generates unnecessary query-responses. In this paper, we propose an Adaptive M-ary Query Tree protocol that improves identification performance using m-bit recognition, collision information of tag IDs, and a prediction technique. We compare the proposed scheme with other Tree-based protocols under the same conditions and show that it outperforms them in terms of identification time and identification efficiency.
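
A minimal sketch of the baseline Query Tree behavior described in the abstract above: the reader extends its query prefix by one bit whenever two or more tags respond, which is exactly the one-bit-per-collision progress the paper's Adaptive M-ary Query Tree protocol tries to improve on. The tag IDs and lengths are illustrative.

```python
def query_tree_identify(tag_ids):
    """Identify all binary-string tag IDs with the basic Query Tree protocol.

    Returns the identified IDs and the number of reader queries issued.
    """
    identified, queries = [], 0
    stack = [""]                      # start with the empty prefix query
    while stack:
        prefix = stack.pop()
        queries += 1
        responders = [t for t in tag_ids if t.startswith(prefix)]
        if len(responders) == 1:      # exactly one response: tag identified
            identified.append(responders[0])
        elif len(responders) > 1:     # collision: split the query one bit deeper
            stack.append(prefix + "0")
            stack.append(prefix + "1")
        # zero responders: idle query, nothing to do
    return identified, queries

# Tags sharing a common prefix (same manufacturer) cause many collisions.
tags = ["110010", "110011", "110100", "000111"]
found, n_queries = query_tree_identify(tags)
print(found, n_queries)
```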

A Study on Knowledge Entity Extraction Method for Individual Stocks Based on Neural Tensor Network (뉴럴 텐서 네트워크 기반 주식 개별종목 지식개체명 추출 방법에 관한 연구)

  • Yang, Yunseok;Lee, Hyun Jun;Oh, Kyong Joo
    • Journal of Intelligence and Information Systems / v.25 no.2 / pp.25-38 / 2019
  • Selecting high-quality information that meets the interests and needs of users from the overflowing contents is becoming ever more important. In this flood of information, there are attempts to better reflect the intention of the user in search results rather than treating the information request as a simple string, and large IT companies such as Google and Microsoft focus on developing knowledge-based technologies, including search engines, that provide users with satisfaction and convenience. Finance in particular is a field where text data analysis is expected to be useful, because new information is constantly generated and the earlier the information is obtained, the more valuable it is. Automatic knowledge extraction can be effective in such areas, where the information flow is vast and new information continues to emerge. However, automatic knowledge extraction faces several practical difficulties. First, it is difficult to build corpora from different fields with the same algorithm and to extract good-quality triples. Second, producing labeled text data manually becomes harder as the extent and scope of knowledge grow and patterns are constantly updated. Third, performance evaluation is difficult due to the characteristics of unsupervised learning. Finally, defining the problem of automatic knowledge extraction is not easy because of the ambiguous conceptual characteristics of knowledge. To overcome these limits and improve the semantic performance of stock-related information searching, this study extracts knowledge entities using a neural tensor network and evaluates their performance. Unlike other work, the purpose of this study is to extract knowledge entities related to individual stock items. Various but relatively simple data processing methods are applied in the presented model to address the problems of previous research and to enhance the effectiveness of the model. From these processes, this study has three significances: first, a practical and simple automatic knowledge extraction method that can be applied directly; second, the possibility of performance evaluation through a simple problem definition; and third, increased expressiveness of the knowledge, obtained by generating input data on a sentence basis without complex morphological analysis. The results of the empirical analysis and an objective performance evaluation method are also presented. For the empirical study, experts' reports on the 30 stocks with the highest publication frequency from May 30, 2017 to May 21, 2018 are used. The total number of reports is 5,600; 3,074 reports, about 55% of the total, are designated as the training set, and the remaining 45% as the testing set. Before constructing the model, all reports in the training set are classified by stock, and their entities are extracted using the KKMA named entity recognition tool. For each stock, the top 100 entities by appearance frequency are selected and vectorized using one-hot encoding. Then, using a neural tensor network, one score function per stock is trained. Thus, when a new entity from the testing set appears, its score can be calculated with every score function, and the stock whose function yields the highest score is predicted as the item related to that entity. To evaluate the presented model, we check the prediction power and whether the score functions are well constructed by calculating the hit ratio over all reports in the testing set. The presented model shows 69.3% hit accuracy on the testing set of 2,526 reports, which is meaningfully high despite the constraints of the research setting. Looking at the prediction performance per stock, only three stocks, LG ELECTRONICS, KiaMtr, and Mando, show much lower performance than average; this may be due to interference from other similar items and the generation of new knowledge. In this paper, we propose a methodology to find the key entities, or combinations of entities, needed to search related information in accordance with the user's investment intention. Graph data are generated using only the named entity recognition tool and fed to the neural tensor network without a learning corpus or word vectors for the field. The empirical test confirms the effectiveness of the presented model as described above. However, some limits remain; in particular, the markedly poor performance on a few stocks shows the need for further research. Finally, through the empirical study, we confirmed that the learning method presented here can be used to match new text information semantically with the related stocks.
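
A minimal sketch of a neural tensor network score function of the kind referenced in the abstract above, following the standard NTN formulation g(e1, e2) = u · tanh(e1ᵀ W e2 + V[e1; e2] + b). The dimensions, the random parameters, and the per-stock scoring loop are illustrative, not the authors' trained configuration.

```python
import numpy as np

def ntn_score(e1, e2, W, V, b, u):
    """Standard NTN score: u . tanh(e1' W e2 + V [e1; e2] + b).

    e1, e2 : entity vectors (d,)
    W      : bilinear tensor (k, d, d)
    V      : linear weights (k, 2d)
    b      : bias (k,)
    u      : output weights (k,)
    """
    bilinear = np.einsum("i,kij,j->k", e1, W, e2)   # e1' W_slice e2 for each slice
    linear = V @ np.concatenate([e1, e2]) + b
    return float(u @ np.tanh(bilinear + linear))

# Illustrative use: score a one-hot entity vector against each stock's
# parameters and predict the stock whose score function fires highest.
d, k, n_stocks = 100, 4, 30
rng = np.random.default_rng(0)
entity = np.zeros(d); entity[17] = 1.0              # one-hot entity (top-100 vocabulary)
params = [(rng.normal(size=(k, d, d)), rng.normal(size=(k, 2 * d)),
           rng.normal(size=k), rng.normal(size=k), rng.normal(size=d))
          for _ in range(n_stocks)]                 # (W, V, b, u, stock_vector) per stock
scores = [ntn_score(entity, sv, W, V, b, u) for (W, V, b, u, sv) in params]
predicted_stock = int(np.argmax(scores))
```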

Pipetting Stability and Improvement Test of the Robotic Liquid Handling System Depending on Types of Liquid (용액에 따른 자동분주기의 분주능력 평가와 분주력 향상 실험)

  • Back, Hyangmi;Kim, Youngsan;Yun, Sunhee;Heo, Uisung;Kim, Hosin;Ryu, Hyeonggi;Lee, Guiwon
    • The Korean Journal of Nuclear Medicine Technology / v.20 no.2 / pp.62-68 / 2016
  • Purpose: A cyclosporine assay using a robotic liquid handling system showed deviations in its standard curve and low reproducibility of patients' results. What distinguishes this assay is that methanol is mixed with the samples and the extracts are used for the test; we therefore assumed that the abnormal results came from using methanol and conducted this study. The manual of the robotic liquid handling system states that several setting parameters can be chosen depending on the viscosity of the liquids, the size of the sampling tips, and the motor speeds, but gives no exact guidance. This study was undertaken to confirm the pipetting ability for different types of liquid and to find proper setting parameters for optimal dispensing. Materials and Methods: Four types of liquid (water, serum, methanol, 25% PEG 6000) and a TSH ¹²⁵I tracer (515 kBq) were used to confirm pipetting ability, and 29 cyclosporine specimens were used to compare results. Eight plastic tubes were prepared for each liquid; 400 μL of each liquid was dispensed into the 8 tubes with a multi-pipette and 100 μL of the TSH ¹²⁵I tracer was added to every tube. From the prepared samples, 100 μL was dispensed using the robotic liquid handling system and counted, and the CV(%) was calculated for each liquid type. Then, by adjusting several setting parameters (air gap, dispense time, delay time), the change in CV(%) was calculated to find the optimum settings. The 29 specimens were tested with three methods: (A) the manual method, (B) the robotic liquid handling system with the existing parameters, and (C) the robotic liquid handling system with the adjusted parameters. Pipetting ability for each liquid type was assessed with CV(%); taking (A) as the reference, the patients' results of (B) and (C) were compared with (A) and assessed with %RE (% relative error) and %Diff (% difference). Results: The CV(%) of the CPM by liquid type was 0.88 for water, 0.95 for serum, 10.22 for methanol, and 0.68 for PEG. As expected, dispensing methanol with the liquid handling system was the problem, while the others were good. Methanol dispensing was therefore repeated with adjusted parameters. When the transport air gap was changed from 0 to 2 and 5, the CV(%) was 20.16 and 12.54; when the system air gap was changed from 0 to 2 and 5, the CV(%) was 8.94 and 1.36. With system air gap 2 and transport air gap 2 the CV(%) was 12.96, and with system air gap 5 and transport air gap 5 it was 1.33. When the dispense speed was reduced from 300 to 100 the CV(%) was 13.32, and when the dispense delay was reduced from 200 to 100 it was 13.55. Compared with (A), the results of (B) increased by 99.44% with a %RE of 93.59%, while the results of (C), with the system air gap adjusted from 0 to 5, increased by 6.75% with a %RE of 5.10%. Conclusion: Adjusting the speed and delay time of aspiration and dispensing was not meaningful, but changing the system air gap was effective. By adjusting several parameters, a proper value was found, and it affected the practical results of the experiment. Active efforts through such tests are needed to optimize the system, and when dispensing a new type of liquid, a proper test is required to check that the liquid is suitable for the equipment.
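
A minimal sketch of the CV(%) figure of merit used in the abstract above, computed from replicate counts of dispensed aliquots. The CPM values are illustrative, not data from the paper.

```python
import numpy as np

def cv_percent(counts):
    """Coefficient of variation (%) = 100 * sample standard deviation / mean."""
    counts = np.asarray(counts, dtype=float)
    return 100.0 * counts.std(ddof=1) / counts.mean()

# Illustrative CPM readings for 8 replicate tubes of one liquid type.
methanol_cpm = [5210, 4180, 6050, 4890, 5530, 3970, 6400, 5120]
print(f"CV = {cv_percent(methanol_cpm):.2f}%")
```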


Transfer Learning using Multiple ConvNet Layers Activation Features with Principal Component Analysis for Image Classification (전이학습 기반 다중 컨볼류션 신경망 레이어의 활성화 특징과 주성분 분석을 이용한 이미지 분류 방법)

  • Byambajav, Batkhuu;Alikhanov, Jumabek;Fang, Yang;Ko, Seunghyun;Jo, Geun Sik
    • Journal of Intelligence and Information Systems / v.24 no.1 / pp.205-225 / 2018
  • Convolutional Neural Networks (ConvNets) are a class of powerful deep neural networks that can analyze and learn hierarchies of visual features. The first such neural network (the Neocognitron) was introduced in the 1980s, but at that time neural networks were not broadly used in industry or academia because of the shortage of large-scale datasets and low computational power. A few decades later, in 2012, Krizhevsky made a breakthrough in the ILSVRC-12 visual recognition competition using a Convolutional Neural Network, which revived interest in neural networks. The success of Convolutional Neural Networks rests on two main factors: the emergence of advanced hardware (GPUs) for sufficient parallel computation, and the availability of large-scale datasets such as ImageNet (ILSVRC) for training. Unfortunately, many new domains are bottlenecked by these factors: for most domains it is difficult and laborious to gather a large-scale dataset to train a ConvNet, and even with such a dataset, training a ConvNet from scratch requires expensive resources and is time-consuming. These two obstacles can be overcome with transfer learning, a method for transferring knowledge from a source domain to a new domain. There are two major transfer learning cases: using the ConvNet as a fixed feature extractor, and fine-tuning the ConvNet on a new dataset. In the first case, a pre-trained ConvNet (for example, trained on ImageNet) computes feed-forward activations of the image and activation features are extracted from specific layers. In the second case, the ConvNet classifier is replaced and retrained on the new dataset, and the weights of the pre-trained network are fine-tuned with backpropagation. In this paper, we focus on using multiple ConvNet layers as a fixed feature extractor only. However, applying the high-dimensional features extracted directly from multiple ConvNet layers is still challenging. We observe that features extracted from different ConvNet layers capture different characteristics of the image, which means a better representation could be obtained by finding the optimal combination of multiple layers. Based on that observation, we propose to employ representations from multiple ConvNet layers for transfer learning instead of a single ConvNet layer representation. Our primary pipeline has three steps. First, images from the target task are fed forward through a pre-trained AlexNet and the activation features of the three fully connected layers are extracted. Second, the activation features of the three layers are concatenated to obtain a multiple-layer representation, which carries more information about the image; when the three fully connected layer features are concatenated, the resulting image representation has 9,192 (4,096 + 4,096 + 1,000) dimensions. However, features extracted from multiple layers of the same ConvNet are redundant and noisy. Thus, in the third step, we use Principal Component Analysis (PCA) to select salient features before the training phase. With salient features the classifier can classify images more accurately, and the performance of transfer learning improves. To evaluate the proposed method, experiments were conducted on three standard datasets (Caltech-256, VOC07, and SUN397) to compare multiple ConvNet layer representations against single-layer representations, using PCA for feature selection and dimension reduction. Our experiments demonstrate the importance of feature selection for the multiple-layer representation. Moreover, the proposed approach achieved 75.6% accuracy compared to 73.9% for the FC7 layer on Caltech-256, 73.1% compared to 69.2% for the FC8 layer on VOC07, and 52.2% compared to 48.7% for the FC7 layer on SUN397. We also showed that the proposed approach achieves superior performance, improving accuracy by 2.8%, 2.1%, and 3.1% on Caltech-256, VOC07, and SUN397, respectively, compared to existing work.
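
A minimal sketch of the pipeline described in the abstract above: extract the three fully connected layer activations of a pre-trained AlexNet, concatenate them into one 9,192-dimensional vector, and reduce it with PCA before a linear classifier. The PCA dimensionality, the classifier choice, and the stand-in data are illustrative, not the authors' exact settings.

```python
import torch
import torchvision.models as models
from sklearn.decomposition import PCA
from sklearn.svm import LinearSVC

# Pre-trained AlexNet as a fixed feature extractor.
alexnet = models.alexnet(weights=models.AlexNet_Weights.IMAGENET1K_V1).eval()

# Capture the three fully connected layer activations (FC6, FC7, FC8) with
# forward hooks; in torchvision's AlexNet these are classifier[1], [4], [6].
activations = {}
def hook(name):
    def fn(module, inputs, output):
        activations[name] = output.detach()
    return fn
for name, idx in [("fc6", 1), ("fc7", 4), ("fc8", 6)]:
    alexnet.classifier[idx].register_forward_hook(hook(name))

def multi_layer_features(batch):
    """Return concatenated FC6+FC7+FC8 activations: shape (N, 4096+4096+1000)."""
    with torch.no_grad():
        alexnet(batch)
    return torch.cat([activations["fc6"], activations["fc7"],
                      activations["fc8"]], dim=1).numpy()

# Illustrative training flow with random stand-in images and labels.
X_train = multi_layer_features(torch.rand(8, 3, 224, 224))
y_train = [0, 1, 0, 1, 0, 1, 0, 1]
pca = PCA(n_components=6).fit(X_train)           # component count is illustrative
clf = LinearSVC().fit(pca.transform(X_train), y_train)
```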

How to improve the accuracy of recommendation systems: Combining ratings and review texts sentiment scores (평점과 리뷰 텍스트 감성분석을 결합한 추천시스템 향상 방안 연구)

  • Hyun, Jiyeon;Ryu, Sangyi;Lee, Sang-Yong Tom
    • Journal of Intelligence and Information Systems / v.25 no.1 / pp.219-239 / 2019
  • As providing customized services to individuals becomes more important, research on personalized recommendation systems is constantly being carried out. Collaborative filtering is one of the most popular approaches in academia and industry, but it has the limitation that recommendations are mostly based on quantitative information such as users' ratings, which lowers accuracy. To address this, many studies have attempted to improve recommendation performance by using additional information, a good example being sentiment analysis of customer review texts. Nevertheless, existing research has not directly combined the results of sentiment analysis with quantitative rating scores in the recommendation system. This study therefore aims to reflect the sentiments expressed in reviews in the rating scores: we propose a new algorithm that converts a user's review, originally qualitative information, into quantitative information and feeds it directly into the recommendation system. To quantify users' reviews, a sentiment score was calculated using text-mining sentiment analysis. Movie reviews were used as the data, and a domain-specific sentiment dictionary was constructed for them. Regression analysis was used to build the dictionary: positive/negative dictionaries were constructed with Lasso regression, Ridge regression, and ElasticNet. The accuracy of each dictionary was verified with a confusion matrix: 70% for the Lasso-based dictionary, 79% for the Ridge-based dictionary, and 83% for ElasticNet (α = 0.3). Therefore, the sentiment score of each review was calculated with the ElasticNet-based dictionary and combined with the rating to create a new rating. We show that collaborative filtering that reflects the sentiment scores of user reviews is superior to the traditional method that considers only the existing rating. To demonstrate this, the proposed algorithm was applied to memory-based user-based collaborative filtering (UBCF), item-based collaborative filtering (IBCF), and model-based matrix factorization with SVD and SVD++. The mean absolute error (MAE) and the root mean square error (RMSE) were calculated to compare the system using the sentiment-combined scores with the system using only the ratings. With MAE as the evaluation index, the improvement was 0.059 for UBCF, 0.0862 for IBCF, 0.1012 for SVD, and 0.188 for SVD++; with RMSE, the improvement was 0.0431 for UBCF, 0.0882 for IBCF, 0.1103 for SVD, and 0.1756 for SVD++. These results show that the prediction performance of ratings reflecting the sentiment score proposed in this paper is superior to that of the conventional method; in other words, collaborative filtering that reflects the sentiment scores of user reviews is more accurate than conventional collaborative filtering that considers only quantitative ratings. We then performed paired t-test validation and concluded that the proposed model is indeed better. To overcome the limitation of previous research that judges a user's sentiment only by the quantitative rating score, this study numerically evaluates the review so that the user's opinion is considered in a more refined way in the recommendation system, improving accuracy. The findings have managerial implications for recommendation system developers, who are expected to consider both quantitative and qualitative information; the way of constructing the combined system in this paper can be used directly by developers.
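
A minimal sketch of the core idea in the abstract above: score a review against a sentiment dictionary, blend that score with the explicit rating to form a new rating, and evaluate predictions with MAE/RMSE. The dictionary entries, the scale mapping, and the mixing weight are illustrative assumptions; the paper's exact combination rule is not given in the abstract.

```python
import numpy as np

# Toy ElasticNet-style sentiment dictionary: word -> learned weight.
# (Illustrative entries; the paper builds a movie-domain dictionary from data.)
sentiment_dict = {"masterpiece": 1.8, "touching": 1.2, "boring": -1.5, "waste": -2.0}

def sentiment_score(review, scale=(1.0, 5.0)):
    """Average dictionary weight of the review's words, mapped onto the rating scale."""
    weights = [sentiment_dict[w] for w in review.lower().split() if w in sentiment_dict]
    raw = np.mean(weights) if weights else 0.0        # raw score roughly in [-2, 2]
    lo, hi = scale
    return lo + (hi - lo) * (raw + 2.0) / 4.0          # linear mapping (illustrative)

def combined_rating(rating, review, alpha=0.5):
    """Blend the explicit rating with the review's sentiment score.

    alpha is an illustrative mixing weight, not the paper's rule.
    """
    return alpha * rating + (1 - alpha) * sentiment_score(review)

def mae(y_true, y_pred):
    return float(np.mean(np.abs(np.asarray(y_true) - np.asarray(y_pred))))

def rmse(y_true, y_pred):
    return float(np.sqrt(np.mean((np.asarray(y_true) - np.asarray(y_pred)) ** 2)))

# The blended rating would replace the raw rating fed to UBCF/IBCF/SVD/SVD++.
print(combined_rating(4.0, "a touching masterpiece"))
print(combined_rating(4.0, "boring waste of time"))
```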