• Title/Summary/Keyword: level set algorithm

Search Result 420, Processing Time 0.03 seconds

A Deep Learning Based Approach to Recognizing Accompanying Status of Smartphone Users Using Multimodal Data (스마트폰 다종 데이터를 활용한 딥러닝 기반의 사용자 동행 상태 인식)

  • Kim, Kilho;Choi, Sangwoo;Chae, Moon-jung;Park, Heewoong;Lee, Jaehong;Park, Jonghun
    • Journal of Intelligence and Information Systems
    • /
    • v.25 no.1
    • /
    • pp.163-177
    • /
    • 2019
  • As smartphones are getting widely used, human activity recognition (HAR) tasks for recognizing personal activities of smartphone users with multimodal data have been actively studied recently. The research area is expanding from the recognition of the simple body movement of an individual user to the recognition of low-level behavior and high-level behavior. However, HAR tasks for recognizing interaction behavior with other people, such as whether the user is accompanying or communicating with someone else, have gotten less attention so far. And previous research for recognizing interaction behavior has usually depended on audio, Bluetooth, and Wi-Fi sensors, which are vulnerable to privacy issues and require much time to collect enough data. Whereas physical sensors including accelerometer, magnetic field and gyroscope sensors are less vulnerable to privacy issues and can collect a large amount of data within a short time. In this paper, a method for detecting accompanying status based on deep learning model by only using multimodal physical sensor data, such as an accelerometer, magnetic field and gyroscope, was proposed. The accompanying status was defined as a redefinition of a part of the user interaction behavior, including whether the user is accompanying with an acquaintance at a close distance and the user is actively communicating with the acquaintance. A framework based on convolutional neural networks (CNN) and long short-term memory (LSTM) recurrent networks for classifying accompanying and conversation was proposed. First, a data preprocessing method which consists of time synchronization of multimodal data from different physical sensors, data normalization and sequence data generation was introduced. We applied the nearest interpolation to synchronize the time of collected data from different sensors. Normalization was performed for each x, y, z axis value of the sensor data, and the sequence data was generated according to the sliding window method. Then, the sequence data became the input for CNN, where feature maps representing local dependencies of the original sequence are extracted. The CNN consisted of 3 convolutional layers and did not have a pooling layer to maintain the temporal information of the sequence data. Next, LSTM recurrent networks received the feature maps, learned long-term dependencies from them and extracted features. The LSTM recurrent networks consisted of two layers, each with 128 cells. Finally, the extracted features were used for classification by softmax classifier. The loss function of the model was cross entropy function and the weights of the model were randomly initialized on a normal distribution with an average of 0 and a standard deviation of 0.1. The model was trained using adaptive moment estimation (ADAM) optimization algorithm and the mini batch size was set to 128. We applied dropout to input values of the LSTM recurrent networks to prevent overfitting. The initial learning rate was set to 0.001, and it decreased exponentially by 0.99 at the end of each epoch training. An Android smartphone application was developed and released to collect data. We collected smartphone data for a total of 18 subjects. Using the data, the model classified accompanying and conversation by 98.74% and 98.83% accuracy each. Both the F1 score and accuracy of the model were higher than the F1 score and accuracy of the majority vote classifier, support vector machine, and deep recurrent neural network. In the future research, we will focus on more rigorous multimodal sensor data synchronization methods that minimize the time stamp differences. In addition, we will further study transfer learning method that enables transfer of trained models tailored to the training data to the evaluation data that follows a different distribution. It is expected that a model capable of exhibiting robust recognition performance against changes in data that is not considered in the model learning stage will be obtained.

The Prediction of DEA based Efficiency Rating for Venture Business Using Multi-class SVM (다분류 SVM을 이용한 DEA기반 벤처기업 효율성등급 예측모형)

  • Park, Ji-Young;Hong, Tae-Ho
    • Asia pacific journal of information systems
    • /
    • v.19 no.2
    • /
    • pp.139-155
    • /
    • 2009
  • For the last few decades, many studies have tried to explore and unveil venture companies' success factors and unique features in order to identify the sources of such companies' competitive advantages over their rivals. Such venture companies have shown tendency to give high returns for investors generally making the best use of information technology. For this reason, many venture companies are keen on attracting avid investors' attention. Investors generally make their investment decisions by carefully examining the evaluation criteria of the alternatives. To them, credit rating information provided by international rating agencies, such as Standard and Poor's, Moody's and Fitch is crucial source as to such pivotal concerns as companies stability, growth, and risk status. But these types of information are generated only for the companies issuing corporate bonds, not venture companies. Therefore, this study proposes a method for evaluating venture businesses by presenting our recent empirical results using financial data of Korean venture companies listed on KOSDAQ in Korea exchange. In addition, this paper used multi-class SVM for the prediction of DEA-based efficiency rating for venture businesses, which was derived from our proposed method. Our approach sheds light on ways to locate efficient companies generating high level of profits. Above all, in determining effective ways to evaluate a venture firm's efficiency, it is important to understand the major contributing factors of such efficiency. Therefore, this paper is constructed on the basis of following two ideas to classify which companies are more efficient venture companies: i) making DEA based multi-class rating for sample companies and ii) developing multi-class SVM-based efficiency prediction model for classifying all companies. First, the Data Envelopment Analysis(DEA) is a non-parametric multiple input-output efficiency technique that measures the relative efficiency of decision making units(DMUs) using a linear programming based model. It is non-parametric because it requires no assumption on the shape or parameters of the underlying production function. DEA has been already widely applied for evaluating the relative efficiency of DMUs. Recently, a number of DEA based studies have evaluated the efficiency of various types of companies, such as internet companies and venture companies. It has been also applied to corporate credit ratings. In this study we utilized DEA for sorting venture companies by efficiency based ratings. The Support Vector Machine(SVM), on the other hand, is a popular technique for solving data classification problems. In this paper, we employed SVM to classify the efficiency ratings in IT venture companies according to the results of DEA. The SVM method was first developed by Vapnik (1995). As one of many machine learning techniques, SVM is based on a statistical theory. Thus far, the method has shown good performances especially in generalizing capacity in classification tasks, resulting in numerous applications in many areas of business, SVM is basically the algorithm that finds the maximum margin hyperplane, which is the maximum separation between classes. According to this method, support vectors are the closest to the maximum margin hyperplane. If it is impossible to classify, we can use the kernel function. In the case of nonlinear class boundaries, we can transform the inputs into a high-dimensional feature space, This is the original input space and is mapped into a high-dimensional dot-product space. Many studies applied SVM to the prediction of bankruptcy, the forecast a financial time series, and the problem of estimating credit rating, In this study we employed SVM for developing data mining-based efficiency prediction model. We used the Gaussian radial function as a kernel function of SVM. In multi-class SVM, we adopted one-against-one approach between binary classification method and two all-together methods, proposed by Weston and Watkins(1999) and Crammer and Singer(2000), respectively. In this research, we used corporate information of 154 companies listed on KOSDAQ market in Korea exchange. We obtained companies' financial information of 2005 from the KIS(Korea Information Service, Inc.). Using this data, we made multi-class rating with DEA efficiency and built multi-class prediction model based data mining. Among three manners of multi-classification, the hit ratio of the Weston and Watkins method is the best in the test data set. In multi classification problems as efficiency ratings of venture business, it is very useful for investors to know the class with errors, one class difference, when it is difficult to find out the accurate class in the actual market. So we presented accuracy results within 1-class errors, and the Weston and Watkins method showed 85.7% accuracy in our test samples. We conclude that the DEA based multi-class approach in venture business generates more information than the binary classification problem, notwithstanding its efficiency level. We believe this model can help investors in decision making as it provides a reliably tool to evaluate venture companies in the financial domain. For the future research, we perceive the need to enhance such areas as the variable selection process, the parameter selection of kernel function, the generalization, and the sample size of multi-class.

A Study on the Prediction Model of Stock Price Index Trend based on GA-MSVM that Simultaneously Optimizes Feature and Instance Selection (입력변수 및 학습사례 선정을 동시에 최적화하는 GA-MSVM 기반 주가지수 추세 예측 모형에 관한 연구)

  • Lee, Jong-sik;Ahn, Hyunchul
    • Journal of Intelligence and Information Systems
    • /
    • v.23 no.4
    • /
    • pp.147-168
    • /
    • 2017
  • There have been many studies on accurate stock market forecasting in academia for a long time, and now there are also various forecasting models using various techniques. Recently, many attempts have been made to predict the stock index using various machine learning methods including Deep Learning. Although the fundamental analysis and the technical analysis method are used for the analysis of the traditional stock investment transaction, the technical analysis method is more useful for the application of the short-term transaction prediction or statistical and mathematical techniques. Most of the studies that have been conducted using these technical indicators have studied the model of predicting stock prices by binary classification - rising or falling - of stock market fluctuations in the future market (usually next trading day). However, it is also true that this binary classification has many unfavorable aspects in predicting trends, identifying trading signals, or signaling portfolio rebalancing. In this study, we try to predict the stock index by expanding the stock index trend (upward trend, boxed, downward trend) to the multiple classification system in the existing binary index method. In order to solve this multi-classification problem, a technique such as Multinomial Logistic Regression Analysis (MLOGIT), Multiple Discriminant Analysis (MDA) or Artificial Neural Networks (ANN) we propose an optimization model using Genetic Algorithm as a wrapper for improving the performance of this model using Multi-classification Support Vector Machines (MSVM), which has proved to be superior in prediction performance. In particular, the proposed model named GA-MSVM is designed to maximize model performance by optimizing not only the kernel function parameters of MSVM, but also the optimal selection of input variables (feature selection) as well as instance selection. In order to verify the performance of the proposed model, we applied the proposed method to the real data. The results show that the proposed method is more effective than the conventional multivariate SVM, which has been known to show the best prediction performance up to now, as well as existing artificial intelligence / data mining techniques such as MDA, MLOGIT, CBR, and it is confirmed that the prediction performance is better than this. Especially, it has been confirmed that the 'instance selection' plays a very important role in predicting the stock index trend, and it is confirmed that the improvement effect of the model is more important than other factors. To verify the usefulness of GA-MSVM, we applied it to Korea's real KOSPI200 stock index trend forecast. Our research is primarily aimed at predicting trend segments to capture signal acquisition or short-term trend transition points. The experimental data set includes technical indicators such as the price and volatility index (2004 ~ 2017) and macroeconomic data (interest rate, exchange rate, S&P 500, etc.) of KOSPI200 stock index in Korea. Using a variety of statistical methods including one-way ANOVA and stepwise MDA, 15 indicators were selected as candidate independent variables. The dependent variable, trend classification, was classified into three states: 1 (upward trend), 0 (boxed), and -1 (downward trend). 70% of the total data for each class was used for training and the remaining 30% was used for verifying. To verify the performance of the proposed model, several comparative model experiments such as MDA, MLOGIT, CBR, ANN and MSVM were conducted. MSVM has adopted the One-Against-One (OAO) approach, which is known as the most accurate approach among the various MSVM approaches. Although there are some limitations, the final experimental results demonstrate that the proposed model, GA-MSVM, performs at a significantly higher level than all comparative models.

A study on the prediction of korean NPL market return (한국 NPL시장 수익률 예측에 관한 연구)

  • Lee, Hyeon Su;Jeong, Seung Hwan;Oh, Kyong Joo
    • Journal of Intelligence and Information Systems
    • /
    • v.25 no.2
    • /
    • pp.123-139
    • /
    • 2019
  • The Korean NPL market was formed by the government and foreign capital shortly after the 1997 IMF crisis. However, this market is short-lived, as the bad debt has started to increase after the global financial crisis in 2009 due to the real economic recession. NPL has become a major investment in the market in recent years when the domestic capital market's investment capital began to enter the NPL market in earnest. Although the domestic NPL market has received considerable attention due to the overheating of the NPL market in recent years, research on the NPL market has been abrupt since the history of capital market investment in the domestic NPL market is short. In addition, decision-making through more scientific and systematic analysis is required due to the decline in profitability and the price fluctuation due to the fluctuation of the real estate business. In this study, we propose a prediction model that can determine the achievement of the benchmark yield by using the NPL market related data in accordance with the market demand. In order to build the model, we used Korean NPL data from December 2013 to December 2017 for about 4 years. The total number of things data was 2291. As independent variables, only the variables related to the dependent variable were selected for the 11 variables that indicate the characteristics of the real estate. In order to select the variables, one to one t-test and logistic regression stepwise and decision tree were performed. Seven independent variables (purchase year, SPC (Special Purpose Company), municipality, appraisal value, purchase cost, OPB (Outstanding Principle Balance), HP (Holding Period)). The dependent variable is a bivariate variable that indicates whether the benchmark rate is reached. This is because the accuracy of the model predicting the binomial variables is higher than the model predicting the continuous variables, and the accuracy of these models is directly related to the effectiveness of the model. In addition, in the case of a special purpose company, whether or not to purchase the property is the main concern. Therefore, whether or not to achieve a certain level of return is enough to make a decision. For the dependent variable, we constructed and compared the predictive model by calculating the dependent variable by adjusting the numerical value to ascertain whether 12%, which is the standard rate of return used in the industry, is a meaningful reference value. As a result, it was found that the hit ratio average of the predictive model constructed using the dependent variable calculated by the 12% standard rate of return was the best at 64.60%. In order to propose an optimal prediction model based on the determined dependent variables and 7 independent variables, we construct a prediction model by applying the five methodologies of discriminant analysis, logistic regression analysis, decision tree, artificial neural network, and genetic algorithm linear model we tried to compare them. To do this, 10 sets of training data and testing data were extracted using 10 fold validation method. After building the model using this data, the hit ratio of each set was averaged and the performance was compared. As a result, the hit ratio average of prediction models constructed by using discriminant analysis, logistic regression model, decision tree, artificial neural network, and genetic algorithm linear model were 64.40%, 65.12%, 63.54%, 67.40%, and 60.51%, respectively. It was confirmed that the model using the artificial neural network is the best. Through this study, it is proved that it is effective to utilize 7 independent variables and artificial neural network prediction model in the future NPL market. The proposed model predicts that the 12% return of new things will be achieved beforehand, which will help the special purpose companies make investment decisions. Furthermore, we anticipate that the NPL market will be liquidated as the transaction proceeds at an appropriate price.

The Impact of Market Environments on Optimal Channel Strategy Involving an Internet Channel: A Game Theoretic Approach (시장 환경이 인터넷 경로를 포함한 다중 경로 관리에 미치는 영향에 관한 연구: 게임 이론적 접근방법)

  • Yoo, Weon-Sang
    • Journal of Distribution Research
    • /
    • v.16 no.2
    • /
    • pp.119-138
    • /
    • 2011
  • Internet commerce has been growing at a rapid pace for the last decade. Many firms try to reach wider consumer markets by adding the Internet channel to the existing traditional channels. Despite the various benefits of the Internet channel, a significant number of firms failed in managing the new type of channel. Previous studies could not cleary explain these conflicting results associated with the Internet channel. One of the major reasons is most of the previous studies conducted analyses under a specific market condition and claimed that as the impact of Internet channel introduction. Therefore, their results are strongly influenced by the specific market settings. However, firms face various market conditions in the real worlddensity and disutility of using the Internet. The purpose of this study is to investigate the impact of various market environments on a firm's optimal channel strategy by employing a flexible game theory model. We capture various market conditions with consumer density and disutility of using the Internet.

    shows the channel structures analyzed in this study. Before the Internet channel is introduced, a monopoly manufacturer sells its products through an independent physical store. From this structure, the manufacturer could introduce its own Internet channel (MI). The independent physical store could also introduce its own Internet channel and coordinate it with the existing physical store (RI). An independent Internet retailer such as Amazon could enter this market (II). In this case, two types of independent retailers compete with each other. In this model, consumers are uniformly distributed on the two dimensional space. Consumer heterogeneity is captured by a consumer's geographical location (ci) and his disutility of using the Internet channel (${\delta}_{N_i}$).
    shows various market conditions captured by the two consumer heterogeneities.
    (a) illustrates a market with symmetric consumer distributions. The model captures explicitly the asymmetric distributions of consumer disutility in a market as well. In a market like that is represented in
    (c), the average consumer disutility of using an Internet store is relatively smaller than that of using a physical store. For example, this case represents the market in which 1) the product is suitable for Internet transactions (e.g., books) or 2) the level of E-Commerce readiness is high such as in Denmark or Finland. On the other hand, the average consumer disutility when using an Internet store is relatively greater than that of using a physical store in a market like (b). Countries like Ukraine and Bulgaria, or the market for "experience goods" such as shoes, could be examples of this market condition. summarizes the various scenarios of consumer distributions analyzed in this study. The range for disutility of using the Internet (${\delta}_{N_i}$) is held constant, while the range of consumer distribution (${\chi}_i$) varies from -25 to 25, from -50 to 50, from -100 to 100, from -150 to 150, and from -200 to 200.
    summarizes the analysis results. As the average travel cost in a market decreases while the average disutility of Internet use remains the same, average retail price, total quantity sold, physical store profit, monopoly manufacturer profit, and thus, total channel profit increase. On the other hand, the quantity sold through the Internet and the profit of the Internet store decrease with a decreasing average travel cost relative to the average disutility of Internet use. We find that a channel that has an advantage over the other kind of channel serves a larger portion of the market. In a market with a high average travel cost, in which the Internet store has a relative advantage over the physical store, for example, the Internet store becomes a mass-retailer serving a larger portion of the market. This result implies that the Internet becomes a more significant distribution channel in those markets characterized by greater geographical dispersion of buyers, or as consumers become more proficient in Internet usage. The results indicate that the degree of price discrimination also varies depending on the distribution of consumer disutility in a market. The manufacturer in a market in which the average travel cost is higher than the average disutility of using the Internet has a stronger incentive for price discrimination than the manufacturer in a market where the average travel cost is relatively lower. We also find that the manufacturer has a stronger incentive to maintain a high price level when the average travel cost in a market is relatively low. Additionally, the retail competition effect due to Internet channel introduction strengthens as average travel cost in a market decreases. This result indicates that a manufacturer's channel power relative to that of the independent physical retailer becomes stronger with a decreasing average travel cost. This implication is counter-intuitive, because it is widely believed that the negative impact of Internet channel introduction on a competing physical retailer is more significant in a market like Russia, where consumers are more geographically dispersed, than in a market like Hong Kong, that has a condensed geographic distribution of consumers.
    illustrates how this happens. When mangers consider the overall impact of the Internet channel, however, they should consider not only channel power, but also sales volume. When both are considered, the introduction of the Internet channel is revealed as more harmful to a physical retailer in Russia than one in Hong Kong, because the sales volume decrease for a physical store due to Internet channel competition is much greater in Russia than in Hong Kong. The results show that manufacturer is always better off with any type of Internet store introduction. The independent physical store benefits from opening its own Internet store when the average travel cost is higher relative to the disutility of using the Internet. Under an opposite market condition, however, the independent physical retailer could be worse off when it opens its own Internet outlet and coordinates both outlets (RI). This is because the low average travel cost significantly reduces the channel power of the independent physical retailer, further aggravating the already weak channel power caused by myopic inter-channel price coordination. The results implies that channel members and policy makers should explicitly consider the factors determining the relative distributions of both kinds of consumer disutility, when they make a channel decision involving an Internet channel. These factors include the suitability of a product for Internet shopping, the level of E-Commerce readiness of a market, and the degree of geographic dispersion of consumers in a market. Despite the academic contributions and managerial implications, this study is limited in the following ways. First, a series of numerical analyses were conducted to derive equilibrium solutions due to the complex forms of demand functions. In the process, we set up V=100, ${\lambda}$=1, and ${\beta}$=0.01. Future research may change this parameter value set to check the generalizability of this study. Second, the five different scenarios for market conditions were analyzed. Future research could try different sets of parameter ranges. Finally, the model setting allows only one monopoly manufacturer in the market. Accommodating competing multiple manufacturers (brands) would generate more realistic results.

  • PDF
  • GPU Based Feature Profile Simulation for Deep Contact Hole Etching in Fluorocarbon Plasma

    • Im, Yeon-Ho;Chang, Won-Seok;Choi, Kwang-Sung;Yu, Dong-Hun;Cho, Deog-Gyun;Yook, Yeong-Geun;Chun, Poo-Reum;Lee, Se-A;Kim, Jin-Tae;Kwon, Deuk-Chul;Yoon, Jung-Sik;Kim3, Dae-Woong;You, Shin-Jae
      • Proceedings of the Korean Vacuum Society Conference
      • /
      • 2012.08a
      • /
      • pp.80-81
      • /
      • 2012
    • Recently, one of the critical issues in the etching processes of the nanoscale devices is to achieve ultra-high aspect ratio contact (UHARC) profile without anomalous behaviors such as sidewall bowing, and twisting profile. To achieve this goal, the fluorocarbon plasmas with major advantage of the sidewall passivation have been used commonly with numerous additives to obtain the ideal etch profiles. However, they still suffer from formidable challenges such as tight limits of sidewall bowing and controlling the randomly distorted features in nanoscale etching profile. Furthermore, the absence of the available plasma simulation tools has made it difficult to develop revolutionary technologies to overcome these process limitations, including novel plasma chemistries, and plasma sources. As an effort to address these issues, we performed a fluorocarbon surface kinetic modeling based on the experimental plasma diagnostic data for silicon dioxide etching process under inductively coupled C4F6/Ar/O2 plasmas. For this work, the SiO2 etch rates were investigated with bulk plasma diagnostics tools such as Langmuir probe, cutoff probe and Quadruple Mass Spectrometer (QMS). The surface chemistries of the etched samples were measured by X-ray Photoelectron Spectrometer. To measure plasma parameters, the self-cleaned RF Langmuir probe was used for polymer deposition environment on the probe tip and double-checked by the cutoff probe which was known to be a precise plasma diagnostic tool for the electron density measurement. In addition, neutral and ion fluxes from bulk plasma were monitored with appearance methods using QMS signal. Based on these experimental data, we proposed a phenomenological, and realistic two-layer surface reaction model of SiO2 etch process under the overlying polymer passivation layer, considering material balance of deposition and etching through steady-state fluorocarbon layer. The predicted surface reaction modeling results showed good agreement with the experimental data. With the above studies of plasma surface reaction, we have developed a 3D topography simulator using the multi-layer level set algorithm and new memory saving technique, which is suitable in 3D UHARC etch simulation. Ballistic transports of neutral and ion species inside feature profile was considered by deterministic and Monte Carlo methods, respectively. In case of ultra-high aspect ratio contact hole etching, it is already well-known that the huge computational burden is required for realistic consideration of these ballistic transports. To address this issue, the related computational codes were efficiently parallelized for GPU (Graphic Processing Unit) computing, so that the total computation time could be improved more than few hundred times compared to the serial version. Finally, the 3D topography simulator was integrated with ballistic transport module and etch reaction model. Realistic etch-profile simulations with consideration of the sidewall polymer passivation layer were demonstrated.

    • PDF

    Study of Crustal Structure in North Korea Using 3D Velocity Tomography (3차원 속도 토모그래피를 이용한 북한지역의 지각구조 연구)

    • So Gu Kim;Jong Woo Shin
      • The Journal of Engineering Geology
      • /
      • v.13 no.3
      • /
      • pp.293-308
      • /
      • 2003
    • New results about the crustal structure down to a depth of 60 km beneath North Korea were obtained using the seismic tomography method. About 1013 P- and S-wave travel times from local earthquakes recorded by the Korean stations and the vicinity were used in the research. All earthquakes were relocated on the basis of an algorithm proposed in this study. Parameterization of the velocity structure is realized with a set of nodes distributed in the study volume according to the ray density. 120 nodes located at four depth levels were used to obtain the resulting P- and S-wave velocity structures. As a result, it is found that P- and S-wave velocity anomalies of the Rangnim Massif at depth of 8 km are high and low, respectively, whereas those of the Pyongnam Basin are low up to 24 km. It indicates that the Rangnim Massif contains Archean-early Lower Proterozoic Massif foldings with many faults and fractures which may be saturated with underground water and/or hot springs. On the other hand, the Pyongyang-Sariwon in the Pyongnam Basin is an intraplatform depression which was filled with sediments for the motion of the Upper Proterozoic, Silurian and Upper Paleozoic, and Lower Mesozoic origin. In particular, the high P- and S-wave velocity anomalies are observed at depth of 8, 16, and 24 km beneath Mt. Backdu, indicating that they may be the shallow conduits of the solidified magma bodies, while the low P-and S-wave velocity anomalies at depth of 38 km must be related with the magma chamber of low velocity bodies with partial melting. We also found the Moho discontinuities beneath the Origin Basin including Sari won to be about 55 km deep, whereas those of Mt. Backdu is found to be about 38 km. The high ratio of P-wave velocity/S-wave velocity at Moho suggests that there must be a partial melting body near the boundary of the crust and mantle. Consequently we may well consider Mt. Backdu as a dormant volcano which is holding the intermediate magma chamber near the Moho discontinuity. This study also brought interesting and important findings that there exist some materials with very high P- and S-wave velocity annomoalies at depth of about 40 km near Mt. Myohyang area at the edge of the Rangnim Massif shield.

    A Collaborative Filtering System Combined with Users' Review Mining : Application to the Recommendation of Smartphone Apps (사용자 리뷰 마이닝을 결합한 협업 필터링 시스템: 스마트폰 앱 추천에의 응용)

    • Jeon, ByeoungKug;Ahn, Hyunchul
      • Journal of Intelligence and Information Systems
      • /
      • v.21 no.2
      • /
      • pp.1-18
      • /
      • 2015
    • Collaborative filtering(CF) algorithm has been popularly used for recommender systems in both academic and practical applications. A general CF system compares users based on how similar they are, and creates recommendation results with the items favored by other people with similar tastes. Thus, it is very important for CF to measure the similarities between users because the recommendation quality depends on it. In most cases, users' explicit numeric ratings of items(i.e. quantitative information) have only been used to calculate the similarities between users in CF. However, several studies indicated that qualitative information such as user's reviews on the items may contribute to measure these similarities more accurately. Considering that a lot of people are likely to share their honest opinion on the items they purchased recently due to the advent of the Web 2.0, user's reviews can be regarded as the informative source for identifying user's preference with accuracy. Under this background, this study proposes a new hybrid recommender system that combines with users' review mining. Our proposed system is based on conventional memory-based CF, but it is designed to use both user's numeric ratings and his/her text reviews on the items when calculating similarities between users. In specific, our system creates not only user-item rating matrix, but also user-item review term matrix. Then, it calculates rating similarity and review similarity from each matrix, and calculates the final user-to-user similarity based on these two similarities(i.e. rating and review similarities). As the methods for calculating review similarity between users, we proposed two alternatives - one is to use the frequency of the commonly used terms, and the other one is to use the sum of the importance weights of the commonly used terms in users' review. In the case of the importance weights of terms, we proposed the use of average TF-IDF(Term Frequency - Inverse Document Frequency) weights. To validate the applicability of the proposed system, we applied it to the implementation of a recommender system for smartphone applications (hereafter, app). At present, over a million apps are offered in each app stores operated by Google and Apple. Due to this information overload, users have difficulty in selecting proper apps that they really want. Furthermore, app store operators like Google and Apple have cumulated huge amount of users' reviews on apps until now. Thus, we chose smartphone app stores as the application domain of our system. In order to collect the experimental data set, we built and operated a Web-based data collection system for about two weeks. As a result, we could obtain 1,246 valid responses(ratings and reviews) from 78 users. The experimental system was implemented using Microsoft Visual Basic for Applications(VBA) and SAS Text Miner. And, to avoid distortion due to human intervention, we did not adopt any refining works by human during the user's review mining process. To examine the effectiveness of the proposed system, we compared its performance to the performance of conventional CF system. The performances of recommender systems were evaluated by using average MAE(mean absolute error). The experimental results showed that our proposed system(MAE = 0.7867 ~ 0.7881) slightly outperformed a conventional CF system(MAE = 0.7939). Also, they showed that the calculation of review similarity between users based on the TF-IDF weights(MAE = 0.7867) leaded to better recommendation accuracy than the calculation based on the frequency of the commonly used terms in reviews(MAE = 0.7881). The results from paired samples t-test presented that our proposed system with review similarity calculation using the frequency of the commonly used terms outperformed conventional CF system with 10% statistical significance level. Our study sheds a light on the application of users' review information for facilitating electronic commerce by recommending proper items to users.

    A Study on Spatial Pattern of Impact Area of Intersection Using Digital Tachograph Data and Traffic Assignment Model (차량 운행기록정보와 통행배정 모형을 이용한 교차로 영향권의 공간적 패턴에 관한 연구)

    • PARK, Seungjun;HONG, Kiman;KIM, Taegyun;SEO, Hyeon;CHO, Joong Rae;HONG, Young Suk
      • Journal of Korean Society of Transportation
      • /
      • v.36 no.2
      • /
      • pp.155-168
      • /
      • 2018
    • In this study, we studied the directional pattern of entering the intersection from the intersection upstream link prior to predicting short future (such as 5 or 10 minutes) intersection direction traffic volume on the interrupted flow, and examined the possibility of traffic volume prediction using traffic assignment model. The analysis method of this study is to investigate the similarity of patterns by performing cluster analysis with the ratio of traffic volume by intersection direction divided by 2 hours using taxi DTG (Digital Tachograph) data (1 week). Also, for linking with the result of the traffic assignment model, this study compares the impact area of 5 minutes or 10 minutes from the center of the intersection with the analysis result of taxi DTG data. To do this, we have developed an algorithm to set the impact area of intersection, using the taxi DTG data and traffic assignment model. As a result of the analysis, the intersection entry pattern of the taxi is grouped into 12, and the Cubic Clustering Criterion indicating the confidence level of clustering is 6.92. As a result of correlation analysis with the impact area of the traffic assignment model, the correlation coefficient for the impact area of 5 minutes was analyzed as 0.86, and significant results were obtained. However, it was analyzed that the correlation coefficient is slightly lowered to 0.69 in the impact area of 10 minutes from the center of the intersection, but this was due to insufficient accuracy of O/D (Origin/Destination) travel and network data. In future, if accuracy of traffic network and accuracy of O/D traffic by time are improved, it is expected that it will be able to utilize traffic volume data calculated from traffic assignment model when controlling traffic signals at intersections.

    A Hybrid Recommender System based on Collaborative Filtering with Selective Use of Overall and Multicriteria Ratings (종합 평점과 다기준 평점을 선택적으로 활용하는 협업필터링 기반 하이브리드 추천 시스템)

    • Ku, Min Jung;Ahn, Hyunchul
      • Journal of Intelligence and Information Systems
      • /
      • v.24 no.2
      • /
      • pp.85-109
      • /
      • 2018
    • Recommender system recommends the items expected to be purchased by a customer in the future according to his or her previous purchase behaviors. It has been served as a tool for realizing one-to-one personalization for an e-commerce service company. Traditional recommender systems, especially the recommender systems based on collaborative filtering (CF), which is the most popular recommendation algorithm in both academy and industry, are designed to generate the items list for recommendation by using 'overall rating' - a single criterion. However, it has critical limitations in understanding the customers' preferences in detail. Recently, to mitigate these limitations, some leading e-commerce companies have begun to get feedback from their customers in a form of 'multicritera ratings'. Multicriteria ratings enable the companies to understand their customers' preferences from the multidimensional viewpoints. Moreover, it is easy to handle and analyze the multidimensional ratings because they are quantitative. But, the recommendation using multicritera ratings also has limitation that it may omit detail information on a user's preference because it only considers three-to-five predetermined criteria in most cases. Under this background, this study proposes a novel hybrid recommendation system, which selectively uses the results from 'traditional CF' and 'CF using multicriteria ratings'. Our proposed system is based on the premise that some people have holistic preference scheme, whereas others have composite preference scheme. Thus, our system is designed to use traditional CF using overall rating for the users with holistic preference, and to use CF using multicriteria ratings for the users with composite preference. To validate the usefulness of the proposed system, we applied it to a real-world dataset regarding the recommendation for POI (point-of-interests). Providing personalized POI recommendation is getting more attentions as the popularity of the location-based services such as Yelp and Foursquare increases. The dataset was collected from university students via a Web-based online survey system. Using the survey system, we collected the overall ratings as well as the ratings for each criterion for 48 POIs that are located near K university in Seoul, South Korea. The criteria include 'food or taste', 'price' and 'service or mood'. As a result, we obtain 2,878 valid ratings from 112 users. Among 48 items, 38 items (80%) are used as training dataset, and the remaining 10 items (20%) are used as validation dataset. To examine the effectiveness of the proposed system (i.e. hybrid selective model), we compared its performance to the performances of two comparison models - the traditional CF and the CF with multicriteria ratings. The performances of recommender systems were evaluated by using two metrics - average MAE(mean absolute error) and precision-in-top-N. Precision-in-top-N represents the percentage of truly high overall ratings among those that the model predicted would be the N most relevant items for each user. The experimental system was developed using Microsoft Visual Basic for Applications (VBA). The experimental results showed that our proposed system (avg. MAE = 0.584) outperformed traditional CF (avg. MAE = 0.591) as well as multicriteria CF (avg. AVE = 0.608). We also found that multicriteria CF showed worse performance compared to traditional CF in our data set, which is contradictory to the results in the most previous studies. This result supports the premise of our study that people have two different types of preference schemes - holistic and composite. Besides MAE, the proposed system outperformed all the comparison models in precision-in-top-3, precision-in-top-5, and precision-in-top-7. The results from the paired samples t-test presented that our proposed system outperformed traditional CF with 10% statistical significance level, and multicriteria CF with 1% statistical significance level from the perspective of average MAE. The proposed system sheds light on how to understand and utilize user's preference schemes in recommender systems domain.


    (34141) Korea Institute of Science and Technology Information, 245, Daehak-ro, Yuseong-gu, Daejeon
    Copyright (C) KISTI. All Rights Reserved.