• Title/Summary/Keyword: Intelligence information technology


A Study on People Counting in Public Metro Service using Hybrid CNN-LSTM Algorithm (Hybrid CNN-LSTM 알고리즘을 활용한 도시철도 내 피플 카운팅 연구)

  • Choi, Ji-Hye;Kim, Min-Seung;Lee, Chan-Ho;Choi, Jung-Hwan;Lee, Jeong-Hee;Sung, Tae-Eung
    • Journal of Intelligence and Information Systems / v.26 no.2 / pp.131-145 / 2020
  • In line with the trend of industrial innovation, IoT technology, used in a variety of fields, is emerging as a key element in creating new business models and providing user-friendly services in combination with big data. Data accumulated from Internet-of-Things (IoT) devices are being used in many ways to build convenience-oriented smart systems, since they enable customized intelligent systems through analysis of user environments and patterns. Recently, IoT has been applied to innovation in the public domain, including smart cities and smart transportation, for example in solving traffic and crime problems using CCTV. In particular, when planning underground services or establishing a passenger flow control information system to improve the convenience of citizens and commuters on congested public transportation such as subways and urban railways, it is necessary to consider both the ease of securing real-time service data and the stability of security. However, previous studies that use image data are limited by privacy issues and by reduced object detection performance under abnormal conditions. The IoT device-based sensor data used in this study are free from privacy issues because they do not require identification of individuals, and can therefore be used effectively to build intelligent public services for many unspecified people. We use IoT-based infrared sensor devices for an intelligent pedestrian tracking system in the metro service, which many people use on a daily basis, and the temperature data measured by the sensors are transmitted in real time. 
The experimental environment for collecting real-time sensor data was set up at the equally spaced midpoints of a 4×4 grid on the ceiling of subway entrances with high passenger flow, and it measured the temperature change caused by objects entering and leaving the detection spots. The measured data were preprocessed by setting reference values for the 16 areas and calculating, per unit of time, the difference between the temperature in each of the 16 areas and its reference value. This corresponds to the methodology that maximizes movement within the detection area. In addition, the data values were multiplied by 10 in order to reflect the temperature differences between areas more sensitively. For example, if the temperature collected from a sensor at a given time was 28.5℃, the analysis was conducted with the value changed to 285. The data collected from the sensors thus have the characteristics of both time series data and image data with 4×4 resolution. Reflecting these characteristics of the measured, preprocessed data, we propose a hybrid algorithm, referred to as CNN-LSTM (Convolutional Neural Network-Long Short Term Memory), that combines CNN, which performs well in image classification, with LSTM, which is especially suitable for analyzing time series data. In the study, the CNN-LSTM algorithm is used to predict the number of persons passing through one of the 4×4 detection areas. We validated the proposed model by comparing its performance with other artificial intelligence algorithms: Multi-Layer Perceptron (MLP), Long Short Term Memory (LSTM), and RNN-LSTM (Recurrent Neural Network-Long Short Term Memory). In the experiment, the proposed CNN-LSTM hybrid model showed the best predictive performance compared with MLP, LSTM, and RNN-LSTM. 
By utilizing the proposed devices and models, it is expected that various metro services, such as real-time monitoring of public transport facilities and congestion-based emergency response services, can be provided without legal issues concerning personal information. However, the data were collected from only one side of the entrances, and only data collected over a short period were used for prediction. A limitation therefore remains in that applicability to other environments still needs to be verified. In the future, the proposed model is expected to become more reliable if sufficient experimental data are collected in various environments or if the training data are extended with measurements from other sensors.
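
The preprocessing described in the abstract (per-area reference values, per-time-step differences, and scaling by 10) can be sketched roughly as follows. This is a minimal illustration in pure Python, not the authors' code: the function names, the `SCALE` constant, and in particular the choice of computing each reference value as the mean over passenger-free background frames are assumptions, since the abstract does not say how the reference values were set.

```python
from typing import List

Frame = List[List[float]]  # one 4x4 grid of temperatures in degrees Celsius

SCALE = 10  # the abstract multiplies readings by 10 (e.g. 28.5 becomes 285)


def scale_frame(frame: Frame) -> List[List[int]]:
    """Multiply every reading by SCALE and round to an integer."""
    return [[round(t * SCALE) for t in row] for row in frame]


def reference_frame(background: List[Frame]) -> Frame:
    """Per-cell mean over background frames (assumed to contain no passengers)."""
    n = len(background)
    return [
        [sum(f[r][c] for f in background) / n for c in range(4)]
        for r in range(4)
    ]


def preprocess(frames: List[Frame], reference: Frame) -> List[List[List[int]]]:
    """Return, for each time step, the scaled per-cell difference from the reference."""
    ref = scale_frame(reference)
    result = []
    for frame in frames:
        scaled = scale_frame(frame)
        result.append(
            [[scaled[r][c] - ref[r][c] for c in range(4)] for r in range(4)]
        )
    return result
```

The output is a sequence of 4×4 integer grids, which matches the abstract's description of data that are both a time series and a low-resolution image.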

Sentiment Analysis of Movie Review Using Integrated CNN-LSTM Mode (CNN-LSTM 조합모델을 이용한 영화리뷰 감성분석)

  • Park, Ho-yeon;Kim, Kyoung-jae
    • Journal of Intelligence and Information Systems / v.25 no.4 / pp.141-154 / 2019
  • Internet technology and social media are growing rapidly. Data mining technology has evolved to enable unstructured document representations in a variety of applications. Sentiment analysis is an important technology that can distinguish poor from high-quality content through text data about products, and it has proliferated within text mining. Sentiment analysis mainly analyzes people's opinions in text data by assigning predefined categories such as positive and negative. It has been studied in various directions in terms of accuracy, from simple rule-based approaches to dictionary-based approaches using predefined labels. In fact, sentiment analysis is one of the most active research areas in natural language processing and is widely studied in text mining. When real online reviews aren't available for others, it's not only easy to openly collect information, but it also affects your business. In marketing, real-world information from customers is gathered on websites, not surveys. Depending on whether a website's posts are positive or negative, the customer response is reflected in sales, and attempts are made to identify this information. However, many reviews on a website are not always good, and they are difficult to identify. Earlier studies in this research area used review data from the Amazon.com shopping mall, whereas recent studies use data on stock market trends, blogs, news articles, weather forecasts, IMDB, Facebook, etc. However, a lack of accuracy is recognized because sentiment calculations change according to the subject, paragraph, sentiment lexicon direction, and sentence strength. This study aims to classify the polarity of sentiment into positive and negative categories and to increase the prediction accuracy of polarity analysis using the pretrained IMDB review data set. 
First, as comparative models, we adopt popular machine learning algorithms for text classification in sentiment analysis: NB (naive Bayes), SVM (support vector machines), XGBoost, RF (random forests), and Gradient Boost. Second, deep learning has demonstrated the ability to extract complex, discriminative features from data. Representative algorithms are CNN (convolutional neural networks), RNN (recurrent neural networks), and LSTM (long short-term memory). CNN can be used similarly to BoW when processing a sentence in vector format, but it does not consider the sequential attributes of data. RNN handles ordered data well because it takes into account the time information of the data, but it suffers from the long-term dependency problem. LSTM is used to solve the problem of long-term dependency. For the comparison, CNN and LSTM were chosen as simple deep learning models. In addition to classical machine learning algorithms, CNN, LSTM, and the integrated models were analyzed. Although the algorithms have many parameters, we examined the relationship between parameter values and precision to find the optimal combination, and we tried to understand how well and in what way the models work for sentiment analysis. This study proposes an integrated CNN and LSTM algorithm to extract the positive and negative features of text. The reasons for combining these two algorithms are as follows. CNN can extract features for classification automatically by applying convolution layers and massively parallel processing. LSTM is not capable of highly parallel processing. Like faucets, the LSTM has input, output, and forget gates that can be opened and closed at a desired time. These gates have the advantage of placing memory blocks on hidden nodes. The memory block of the LSTM may not store all the data, but it can solve the CNN's long-term dependency problem. 
Furthermore, when LSTM is used in CNN's pooling layer, the model has an end-to-end structure, so that spatial and temporal features can be designed simultaneously. The combined CNN-LSTM achieved 90.33% accuracy. It is slower than CNN, but faster than LSTM. The presented model was more accurate than the other models. In addition, each word embedding layer can be improved when training the kernel step by step. CNN-LSTM can compensate for the weaknesses of each model, and it has the advantage of improving learning layer by layer using the end-to-end structure of LSTM. For these reasons, this study tries to enhance the classification accuracy of movie reviews using the integrated CNN-LSTM model.
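
The "faucet" behavior of the input, forget, and output gates mentioned in the abstract can be illustrated with a single LSTM step written in the standard textbook form. This is a minimal sketch in pure Python, not the authors' model: the function name `lstm_step`, the layout of the weight dictionary, and the simplification to scalar inputs and states are all assumptions made for the example.

```python
import math
from typing import Dict, Tuple

# Hypothetical layout: gate name -> (input weight, recurrent weight, bias)
Weights = Dict[str, Tuple[float, float, float]]


def sigmoid(x: float) -> float:
    """Logistic function; maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x: float, h_prev: float, c_prev: float, w: Weights) -> Tuple[float, float]:
    """Compute one scalar LSTM step and return (hidden state, cell state)."""

    def pre(name: str) -> float:
        w_x, w_h, b = w[name]
        return w_x * x + w_h * h_prev + b

    i = sigmoid(pre("input"))        # how much new content is written
    f = sigmoid(pre("forget"))       # how much of the old cell state is kept
    o = sigmoid(pre("output"))       # how much of the cell state is exposed
    g = math.tanh(pre("candidate"))  # candidate content for the cell

    c = f * c_prev + i * g           # updated memory cell
    h = o * math.tanh(c)             # updated hidden state
    return h, c
```

With a large positive forget-gate bias and a large negative input-gate bias, the cell state passes through a step almost unchanged, which is the mechanism that lets an LSTM retain information across long sequences.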

The Effect of Herding Behavior and Perceived Usefulness on Intention to Purchase e-Learning Content: Comparison Analysis by Purchase Experience (무리행동과 지각된 유용성이 이러닝 컨텐츠 구매의도에 미치는 영향: 구매경험에 의한 비교분석)

  • Yoo, Chul-Woo;Kim, Yang-Jin;Moon, Jung-Hoon;Choe, Young-Chan
    • Asia Pacific Journal of Information Systems / v.18 no.4 / pp.105-130 / 2008
  • Consumers in the e-learning market differ from those in other markets in that they are replaced on a specific time scale. For example, e-learning contents aimed at high school seniors cannot be consumed by a given consumer beyond the designated period of time. Hence e-learning service providers need to attract new groups of students every year. Due to the lack of information on products designed for continuously emerging consumers, these consumers face difficulties in making rational decisions in a short period of time. Increased uncertainty about product purchases leads customers to herding behavior: obtaining information about the product from others and imitating them. Taking these features of the e-learning market into consideration, this study focuses on online herding behavior in purchasing e-learning contents. There is no definite concept of e-learning; it is discussed from a wide range of perspectives, from educational engineering to management to e-business. Based on existing studies, we identify two main viewpoints regarding e-learning. The first defines e-learning as a concept that includes existing terminologies such as CBT (Computer Based Training), WBT (Web Based Training), and IBT (Internet Based Training). In this view, e-learning utilizes IT to support professors and part or all of an education system. In the second perspective, e-learning is defined as the use of Internet technology to deliver diverse intelligence- and achievement-enhancing solutions. In other words, only education delivered through the Internet and networks can be classified as e-learning. We take the second definition as our working definition. The main goal of this study is to investigate what factors affect consumer intention to purchase e-learning contents and to identify the differential impact of these factors between consumers with purchase experience and those without. 
To accomplish the goal of this study, it focuses on herding behavior and perceived usefulness as antecedents to behavioral intention. The proposed research model in the study extends the Technology Acceptance Model by adding herding behavior and usability to take into account the unique characteristics of e-learning content market and e-learning systems use, respectively. The current study also includes consumer experience with e-learning content purchase because the previous experience is believed to affect purchasing intention when consumers buy experience goods or services. Previous studies on e-learning did not consider the characteristics of e-learning contents market and the differential impact of consumer experience on the relationship between the antecedents and behavioral intention, which is the target of this study. This study employs a survey method to empirically test the proposed research model. A survey questionnaire was developed and distributed to 629 informants. 528 responses were collected, which consist of potential customer group (n = 133) and experienced customer group (n = 395). The data were analyzed using PLS method, a structural equation modeling method. Overall, both herding behavior and perceived usefulness influence consumer intention to purchase e-learning contents. In detail, in the case of potential customer group, herding behavior has stronger effect on purchase intention than does perceived usefulness. However, in the case of shopping-experienced customer group, perceived usefulness has stronger effect than does herding behavior. In sum, the results of the analysis show that with regard to purchasing experience, perceived usefulness and herding behavior had differential effects upon the purchase of e-learning contents. As a follow-up analysis, the interaction effects of the number of purchase transaction and herding behavior/perceived usefulness on purchase intention were investigated. 
The results show that there are no interaction effects. This study contributes to the literature in a couple of ways. From a theoretical perspective, this study examined and showed evidence that the characteristics of e-learning market such as continuous renewal of consumers and thus high uncertainty and individual experiences are important factors to be considered when the purchase intention of e-learning content is studied. This study can be used as a basis for future studies on e-learning success. From a practical perspective, this study provides several important implications on what types of marketing strategies e-learning companies need to build. The bottom lines of these strategies include target group attraction, word-of-mouth management, enhancement of web site usability quality, etc. The limitations of this study are also discussed for future studies.

Determinants of Mobile Application Use: A Study Focused on the Correlation between Application Categories (모바일 앱 사용에 영향을 미치는 요인에 관한 연구: 앱 카테고리 간 상관관계를 중심으로)

  • Park, Sangkyu;Lee, Dongwon
    • Journal of Intelligence and Information Systems / v.22 no.4 / pp.157-176 / 2016
  • For a long time, the mobile phone had the sole function of communication. Recently, however, abrupt technological innovations have extended the sphere of mobile phone activities. Technological development has enabled an almost computer-like environment even on a very small device. Such advancement yielded several new high-tech devices, such as smartphones and tablet PCs, which quickly proliferated. Along with the diffusion of mobile devices, mobile applications for those devices also prospered and soon penetrated deeply into consumers' daily lives. Numerous mobile applications have been released in app stores, yielding trillions of cumulative downloads. However, the great majority of applications are disregarded by consumers. Even after applications are purchased, they do not survive long on consumers' mobile devices and are soon abandoned. It is therefore imperative for both app developers and app-store operators to understand consumer behavior and to develop marketing strategies for a sustainable business, first by increasing sales of mobile applications and also by designing survival strategies for applications. Accordingly, this research analyzes consumers' mobile application usage behavior in terms of substitution/supplementary relationships between application categories and several explanatory variables. Considering that consumers of mobile devices use multiple apps simultaneously, this research adopts multivariate probit models to explain mobile application usage behavior and to derive correlations between application categories in order to observe substitution/supplementary relationships in application use. 
The research adopts several explanatory variables, including sociodemographic data, user experience with purchased applications (which reflects future purchasing behavior for paid applications), and consumer attitudes toward marketing efforts, namely attitudes toward app ratings and toward app-store promotion efforts (i.e., top developer badge and editor's choice badge). The results of this study can be explained within a hedonic and utilitarian framework. Consumers who use hedonic applications, such as game and entertainment-related applications, are young and have a low education level. In contrast, consumers who are older and have a higher education level prefer utilitarian application categories such as life and information. There are conflicting arguments over whether SNS users are hedonic or utilitarian. In our results, consumers who are younger and those with a higher education level prefer SNS category applications, which places SNS between the utilitarian and hedonic results. Also, applications directly related to tangible assets, such as banking, stock, and mobile shopping, are only negatively related to experience of purchasing paid apps, meaning that consumers who place weight on tangible assets do not prefer buying paid applications. Regarding categories, most correlations among categories are significantly positive. This is because someone who spends more time on mobile devices tends to use more applications. The game and entertainment categories show a significant positive correlation; however, there is a significant negative correlation between game and information, as well as between game and e-commerce categories. Meanwhile, game and SNS, as well as game and finance, show no significant correlation. 
This result clearly shows that mobile application usage behavior is distinguishable: the purposes of using mobile devices are polarized into utilitarian and hedonic purposes. This research supports several arguments that can only be explained by second-hand real data, not by survey data, and offers behavioral explanations of mobile application usage from the consumer's perspective. This research also shows substitution/supplementary patterns of consumer application usage, which in turn explain consumers' mobile application usage behavior. However, this research has some limitations. The classification of categories is itself disputable, since classifications diverge among studies. Therefore, results may change depending on the classification. Lastly, although the data were collected at the individual application level, we reduced the observations to the individual level. Further research will be done to resolve these limitations.

Case Study on the Enterprise Microblog Usage: Focusing on Knowledge Management Strategy (기업용 마이크로블로그의 사용행태에 대한 사례연구: 지식경영전략을 중심으로)

  • Kang, Min Su;Park, Arum;Lee, Kyoung-Jun
    • Journal of Intelligence and Information Systems / v.21 no.1 / pp.47-63 / 2015
  • As knowledge gains attention as a new production factor that generates added value, studies continue to apply knowledge management to the business environment. In addition, as ICT (Information Communication Technology) has been integrated into the business environment, it has increased the task efficiency and productivity of individual workers. Accordingly, the way a business achieves its goals has changed to one in which individual members willingly take part in the organization and share information to create new value (Han, 2003), and studies of systems and services to support this transition are being carried out. Recently, a new concept called 'Enterprise 2.0' has appeared. It extends Web 2.0 and its technology, which focus on participation, sharing, and openness, to the work environment of a business (Jung, 2013). Enterprise 2.0 is used as a collaborative tool to support individual creativity and collective intelligence by combining Web 2.0 technologies such as blogs, wikis, RSS, and tags with business software (McAfee, 2006). As Twitter became popular, the Enterprise Microblog (EMB), an example of Enterprise 2.0, was developed as a business equivalent of Twitter, and SaaS (Software as a Service) offerings such as Yammer were introduced. Studies of EMB mainly focus on demonstrating its usability in terms of intra-firm communication and knowledge management. However, existing studies lean too heavily toward large companies and particular departments rather than the company as a whole. As a result, few studies have been conducted on small and medium-sized companies, which have difficulty preparing separate resources and a dedicated workforce to introduce knowledge management. In this respect, the present study focuses on small companies that actually use EMB in order to understand how they use it. 
Based on the findings, this study examined their knowledge management strategies for EMB from the viewpoints of codification and personalization. The hypothesis, "as a company grows, it shifts its EMB strategy from codification to personalization", was established on the basis of a review of previous studies and literature. To test the hypothesis, this study analyzed the usage of EMB by small companies that have used it since their foundation. For the case study, the period of use was divided into 2 spans and longitudinal analysis was employed to examine the contents of the blogs. Using the key findings of the analysis, this study aims to propose practical implications for knowledge management in small companies and for the suitable application of knowledge management systems. Knowledge management strategy can be classified into codification strategy and personalization strategy (Hansen et al., 1999), and how to manage the two strategies has long been studied. Also, existing studies of knowledge management strategy have mostly targeted major companies, resulting in a lack of studies on how it can be applied to SMEs. This research adopts the Enterprise Microblog (EMB) as a knowledge management tool suited to SMEs and reviews its application from the perspective of SMEs' codification and personalization strategies. Based on previous research on knowledge management strategy and EMB, the hypothesis is set that "depending on the development of the company, the main application of EMB shifts from codification strategy to personalization strategy". To check the hypothesis, an SME that has used the EMB called 'Yammer' was analyzed from the date of its foundation until today. The case study used longitudinal analysis, dividing the period in which the EMB was used into three stages and analyzing the contents. 
As the result of the study, this suggests a substantial implication regarding the application of Knowledge Management Strategy and its Knowledge Management System that is suitable for SME.

The Role of Open Innovation for SME's R&D Success (중소기업 R&D 성공에 있어서 개방형 혁신의 효과에 관한 연구)

  • Yoo, In-Jin;Seo, Bong-Goon;Park, Do-Hyung
    • Journal of Intelligence and Information Systems / v.24 no.3 / pp.89-117 / 2018
  • Korean companies face intensifying competition not only with domestic companies but also with foreign companies as globalization proceeds. In this environment, acquiring and developing core competencies is essential not only for large companies but also for Small and Medium Enterprises (SMEs). In particular, SMEs, which lag in various resources such as financial resources, can innovate through effective R&D investment, thereby securing competitiveness and surviving in this environment. Conventionally, "self-development" using only the internal resources of the company has been the dominant method. Recently, however, R&D through cooperation, also called "Open Innovation", has been emerging. SMEs in particular are relatively short of available internal resources. Therefore, it is necessary for them to utilize technology and resources through cooperation with external companies (such as joint development or contract development) rather than self-development R&D. In this context, we examined the effect of SMEs' factors on sales in Korea. Specifically, the factors that SMEs hold are classified as 'Technical characteristic', 'Company competency', and 'R&D activity', and we analyzed how they influence the sales achieved as a result of R&D. The analysis was based on a two-year statistical survey conducted by the Korean government. In addition, we examined the influence of the factors on sales according to the R&D method (Self-Development vs. Open Innovation), and also observed how the influence changes across 29 industrial categories. The results of the study are summarized as follows. First, regression analysis shows that twelve factors of SMEs have a significant effect on sales. Specifically, of the 15 factors included in the analysis, 12 were found to have a significant influence. In the technical characteristic, the 'imitation period' and 'product life cycle' of the technology were confirmed. 
In the company competency, 'R&D led person', 'researcher number', 'intellectual property registration status', 'number of R&D attempts', and 'ratio of success to trial' were confirmed. In R&D activity, all included factors were found to have a significant impact. Second, the influence of the factors according to the R&D method was examined, and a change was confirmed in four factors. That is, these factors were found to have different effects on sales depending on the R&D method. Specifically, 'researcher number', 'number of R&D attempts', 'performance compensation system', and 'R&D investment' were found to have significant moderating effects. In other words, the moderating effect of open innovation was confirmed for four factors. Third, different factors were confirmed to have a significant influence in each industrial classification. Depending on the industrial classification, between one and nine factors had a significant effect on sales. Furthermore, different moderating effects were confirmed across industrial classifications and R&D methods. Up to eight significant moderating effects were confirmed, depending on the industrial classification. In particular, 'R&D investment' and 'performance compensation system' were the most common moderating effects, appearing 12 times and 11 times, respectively, across all industrial classifications. This study provides the following suggestions. First, SMEs need to determine the R&D method in consideration of the characteristics of the technology to be developed as well as the enterprise competency and the R&D activity. In addition, they need to identify and concentrate on the factors that increase sales in R&D decisions, which are mainly affected by the industry classification to which the company belongs. Second, governments that support SMEs' R&D need to provide guidelines that fit their situation. 
It is necessary to differentiate the support for the company considering various factors such as technology and R&D purpose for their effective budget execution. Finally, based on the results of this study, we urge the need to reconsider the effectiveness of existing SME support policies.
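
As an illustrative sketch only (the abstract does not give the model specification), a moderating effect of the R&D method is commonly tested by adding an interaction term to the regression. Here $S$ denotes sales, $x_k$ is one of the SME factors, and $D$ is a hypothetical dummy variable equal to 1 for open innovation and 0 for self-development; all symbols are assumptions introduced for this example.

```latex
S = \beta_0 + \beta_k x_k + \gamma D + \delta_k \,(x_k \cdot D) + \varepsilon
```

Under this assumed form, a statistically significant $\delta_k$ would indicate that the effect of factor $x_k$ on sales differs between the two R&D methods, which is the sense in which the abstract reports moderating effects for four factors.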

Risk Education and Educational Needs Related to Science and Technology: A Study on Science Teachers' Perceptions (중등 과학교사들이 생각하는 과학기술 관련 위험교육 실태와 교육 요구)

  • Jinhee Kim;Jiyeon Na;Yong Wook Cheong
    • Journal of The Korean Association For Science Education / v.44 no.1 / pp.57-75 / 2024
  • This study aimed to investigate the current state and educational needs of risk education related to science and technology as perceived by secondary science teachers. A survey was conducted with a total of 366 secondary science teachers. The results are as follows. First, there were more teachers who had not provided education on risks arising from science and technology in terms of risk perception, risk assessment, and risk management than those who had. Global warming was the most common risk taught by teachers, followed by earthquakes, artificial intelligence, and traffic accidents. Second, teachers recognized that they lacked understanding that the achievement standards of the 2022 revised science curriculum include risks that may occur due to science and technology, but they thought they were prepared to teach. Third, teachers recognized that their understanding of risk perception was higher than that of risk management and risk assessment. Fourth, the experience of teachers in training on risk was very limited, with fewer having training in risk assessment and risk management compared to risk perception. The most common training experienced was in laboratory safety. Fifth, teachers recognized that their capabilities for the 10 goals of risk education were not high. Middle school teachers or teachers majoring in integrated science education evaluated their capabilities relatively highly. Sixth, many teachers thought it was important to address risks in school science education. They prioritized 'information use', 'decision-making skills', and 'influence of mass media', in that order, for importance and called for urgent education in 'action skills', 'information use', and 'influence of risk perception'. Seventh, as a result of deriving the priorities of education needs for each of the 10 goals of risk education, 'action skills', 'influence of risk perception', and 'evaluate risk assessment' were ranked 1st, 2nd, and 3rd, respectively.

A Multimodal Profile Ensemble Approach to Development of Recommender Systems Using Big Data (빅데이터 기반 추천시스템 구현을 위한 다중 프로파일 앙상블 기법)

  • Kim, Minjeong;Cho, Yoonho
    • Journal of Intelligence and Information Systems / v.21 no.4 / pp.93-110 / 2015
  • A recommender system recommends products to customers who are likely to be interested in them. Based on automated information filtering technology, various recommender systems have been developed. Collaborative filtering (CF), one of the most successful recommendation algorithms, has been applied in a number of different domains such as recommending Web pages, books, movies, music, and products. However, CF is known to have a critical shortcoming. CF finds neighbors whose preferences are like those of the target customer and recommends the products those neighbors have liked most. Thus, CF works properly only when there is a sufficient number of ratings on common products from customers. When customer ratings are scarce, CF forms neighborhoods inaccurately, resulting in poor recommendations. To improve the performance of CF-based recommender systems, most related studies have focused on developing novel algorithms under the assumption of a single profile, created from users' item ratings, purchase transactions, or Web access logs. With the advent of big data, companies have come to collect more data and to use a wide variety of information at large scale. Many companies therefore recognize the importance of utilizing big data, because it helps them improve their competitiveness and create new value. In particular, the use of personal big data in recommender systems is an emerging issue, because personal big data facilitate more accurate identification of users' preferences and behaviors. The proposed recommendation methodology is as follows. First, multimodal user profiles are created from personal big data in order to grasp the preferences and behavior of users from various viewpoints. We derive five user profiles based on personal information: rating, site preference, demographic, Internet usage, and topic in text. 
Next, the similarity between users is calculated based on the profiles, and the neighbors of each user are then found from the results. One of three ensemble approaches is applied to calculate the similarity: the similarity of a combined profile, the average similarity of each profile, or the weighted average similarity of each profile. Finally, the products most preferred by the users in the neighborhood are recommended to the target users. For the experiments, we used demographic data and a very large volume of Web log transactions for 5,000 panel users of a company that specializes in analyzing Web site rankings. R and SAS E-miner were used to implement the proposed recommender system and to conduct the topic analysis using keyword search, respectively. To evaluate recommendation performance, we used 60% of the data for training and 40% for testing. Five-fold cross validation was also conducted to enhance the reliability of our experiments. The F1 metric, a widely used combined metric that gives equal weight to recall and precision, was employed for evaluation. In the evaluation, the proposed methodology achieved a significant improvement over the single-profile-based CF algorithm. In particular, the ensemble approach using weighted average similarity showed the highest performance: the rate of improvement in F1 was 16.9 percent for the ensemble approach using weighted average similarity and 8.1 percent for the ensemble approach using the average similarity of each profile. From these results, we conclude that the multimodal profile ensemble approach is a viable solution to the problems encountered when customer ratings are scarce. This study is significant in suggesting what kinds of information can be used to create profiles in a big data environment and how they can be combined and utilized effectively. 
However, our methodology needs further study with regard to real-world application. We need to compare the differences in recommendation accuracy when the proposed method is applied to different recommendation algorithms, and then identify which combination shows the best performance.
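
The three ensemble approaches named in this abstract (combined profile, average similarity, weighted average similarity) can be illustrated with a minimal sketch. This is not the authors' implementation, which used R and SAS E-miner. It assumes each profile is a numeric vector and uses cosine similarity as the per-profile measure, which the abstract does not specify. The profile names, toy values, and weights below are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (assumed measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def combined_similarity(profiles_a, profiles_b):
    """Approach 1: concatenate all profiles, then compare once."""
    names = sorted(profiles_a)
    vec_a = [x for name in names for x in profiles_a[name]]
    vec_b = [x for name in names for x in profiles_b[name]]
    return cosine(vec_a, vec_b)


def average_similarity(profiles_a, profiles_b):
    """Approach 2: unweighted mean of the per-profile similarities."""
    sims = [cosine(profiles_a[n], profiles_b[n]) for n in profiles_a]
    return sum(sims) / len(sims)


def weighted_similarity(profiles_a, profiles_b, weights):
    """Approach 3: weighted mean of the per-profile similarities."""
    total = sum(weights[n] for n in profiles_a)
    return sum(
        weights[n] * cosine(profiles_a[n], profiles_b[n]) for n in profiles_a
    ) / total


# Hypothetical toy users with two of the five profile types.
alice = {"rating": [1.0, 0.0], "site": [1.0, 1.0]}
bob = {"rating": [0.0, 1.0], "site": [1.0, 1.0]}
weights = {"rating": 3.0, "site": 1.0}  # hypothetical weights

print(combined_similarity(alice, bob))
print(average_similarity(alice, bob))
print(weighted_similarity(alice, bob, weights))
```

Neighbors would then be the users with the highest similarity to the target user under whichever approach is chosen. How the weights are determined is not described in the abstract, so they are left as an input here.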

PIRS : Personalized Information Retrieval System using Adaptive User Profiling and Real-time Filtering for Search Results (적응형 사용자 프로파일기법과 검색 결과에 대한 실시간 필터링을 이용한 개인화 정보검색 시스템)

  • Jeon, Ho-Cheol;Choi, Joong-Min
    • Journal of Intelligence and Information Systems
    • /
    • v.16 no.4
    • /
    • pp.21-41
    • /
    • 2010
  • This paper proposes a system that serves users with appropriate search results through real-time filtering. We implemented a personalized information retrieval system (PIRS) based on adaptive user profiling that uses users' implicit feedback, in order to address the problem that existing search systems such as Google or MSN do not satisfy users' varied personal search needs. One reason existing search systems find it hard to satisfy these needs is that users' search intentions are uncertain and therefore not easy to recognize. The uncertainty of search intentions means that users may want different search results for the same query. For example, when a user inputs the query "java", the user may want results about Java as a programming language, Java coffee, or Java the island of Indonesia. In other words, this uncertainty is due to the ambiguity of search queries. Moreover, the fewer words a query contains, the greater this uncertainty becomes. Real-time filtering of search results returns only those results that belong to the user-selected domain for a given query. Although it looks similar to a general directory search, it differs in that the search is executed over all web documents rather than sites, and each document in the search results is classified into the given domain in real time. By applying information filtering that uses real-time directory classification of search results to personalization, the number of results delivered to users is effectively decreased, and satisfaction with the results is improved. In this paper, a user preference profile has a hierarchical structure and consists of domains, used queries, and selected documents. 
Because the hierarchical structure of the user preference profile can reflect the context in which users performed a search, it is able to deal with the uncertainty of user intentions: for the same query, the intention may differ according to context such as time or place. Furthermore, this structure can more effectively track a user's web document search behavior for each domain and recognize changes in user intentions in a timely manner. The IP address of each device was used to identify each user, and the user preference profile is continuously updated based on the observed user behavior for search results. We also measured user satisfaction with search results by observing user behavior on the selected search result. Our proposed system automatically recognizes user preferences by using implicit feedback from users, such as the time spent on the selected search result and the exit condition from the page, and dynamically updates their preferences. Whenever a user performs a search, our system looks up the user preference profile for the given IP address; if the file does not exist, a new user preference profile is created on the server, and otherwise the file is updated with the transmitted information. If the file does not exist on the server, the system provides Google's results to the user, and the reflection value is increased or decreased whenever the user searches. We carried out experiments to evaluate the performance of the adaptive user preference profile technique and real-time filtering, and the results are satisfactory. According to our experimental results, participants were satisfied with an average of 4.7 documents in the top 10 search results when using the adaptive user preference profile technique with real-time filtering, which shows that our method outperforms Google's by 23.2%.
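
The profile handling described in this abstract (look up a profile by IP address, create it if missing, update it from implicit feedback, and filter results by domain) might be sketched as follows. This is only an illustration under stated assumptions, not the PIRS implementation: the class and function names, the dwell-time threshold, and the plus or minus one scoring rule are all hypothetical, and the profile is held in memory rather than in a server-side file.

```python
DWELL_THRESHOLD_S = 30.0  # hypothetical cut-off for a "satisfied" visit


class ProfileStore:
    """In-memory stand-in for per-user preference profiles keyed by IP."""

    def __init__(self):
        self._profiles = {}

    def get_or_create(self, ip):
        # Create an empty profile the first time an IP address is seen.
        return self._profiles.setdefault(ip, {})

    def record_feedback(self, ip, domain, query, doc, dwell_s):
        """Update the domain -> query -> document hierarchy from dwell time."""
        profile = self.get_or_create(ip)
        docs = profile.setdefault(domain, {}).setdefault(query, {})
        # Assumed rule: long visits raise the score, short visits lower it.
        delta = 1 if dwell_s >= DWELL_THRESHOLD_S else -1
        docs[doc] = docs.get(doc, 0) + delta
        return docs[doc]


def filter_results(results, domain):
    """Keep only results already classified into the selected domain."""
    return [url for url, result_domain in results if result_domain == domain]


store = ProfileStore()
store.record_feedback("10.0.0.1", "programming", "java", "doc-1", 45.0)

# Hypothetical classified results for the ambiguous query "java".
results = [
    ("doc-1", "programming"),
    ("doc-2", "travel"),
    ("doc-3", "programming"),
]
print(filter_results(results, "programming"))
```

The real-time classification of each document into a domain is taken as given here. The abstract does not say how that classifier works, so the sketch starts from results that already carry a domain label.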

A MVC Framework for Visualizing Text Data (텍스트 데이터 시각화를 위한 MVC 프레임워크)

  • Choi, Kwang Sun;Jeong, Kyo Sung;Kim, Soo Dong
    • Journal of Intelligence and Information Systems
    • /
    • v.20 no.2
    • /
    • pp.39-58
    • /
    • 2014
  • As the importance of big data and related technologies continues to grow in industry, visualizing the results of processing and analyzing big data has come to the fore. Data visualization gives people an effective and clear way to understand analysis results. Visualization also serves as the GUI (Graphical User Interface) that supports communication between people and analysis systems. To make development and maintenance easier, these GUI parts should usually be loosely coupled from the parts that process and analyze data. To implement a loosely coupled architecture, it is necessary to adopt design patterns such as MVC (Model-View-Controller), which is designed to minimize coupling between the UI part and the data processing part. Big data can be classified into structured data and unstructured data. Visualizing structured data is relatively easy compared with unstructured data. Nevertheless, as the use and analysis of unstructured data has spread, people usually develop a visualization system separately for each project to overcome the limitations of traditional visualization systems for structured data. Furthermore, for text data, which makes up a large part of unstructured data, visualization is more difficult. This results from the complexity of the technologies for analyzing text data, such as linguistic analysis, text mining, and social network analysis. These technologies are also not standardized. This situation makes it more difficult to reuse the visualization system of one project in other projects. We assume that the reason is a lack of common design in visualization systems that considers extension to other systems. In our research, we suggest a common information model for visualizing text data and propose a comprehensive and reusable framework, TexVizu, for visualizing text data. First, we survey representative research in the text visualization area. 
We also identify common elements for text visualization and common patterns among its various cases. We then review and analyze the elements and patterns from three viewpoints: structural, interactive, and semantic. Next, we design an integrated model of text data that represents the elements for visualization. The structural viewpoint identifies structural elements of various text documents, such as title, author, and body. The interactive viewpoint identifies the types of relations and interactions between text documents, such as post, comment, and reply. The semantic viewpoint identifies semantic elements that are extracted by analyzing text data linguistically and are represented as tags classifying entity types such as people, place or location, time, and event. After that, we extract and select common requirements for visualizing text data. The requirements are categorized into four types: structure information, content information, relation information, and trend information. Each type of requirement comprises the required visualization techniques, data, and goal (what to know). These are the common and key requirements for designing a framework that keeps a visualization system loosely coupled from the data processing or analysis system. Finally, we designed a common text visualization framework, TexVizu, which is reusable and extensible for various visualization projects by collaborating with various Text Data Loaders and Analytical Text Data Visualizers via common interfaces such as ITextDataLoader and IATDProvider. TexVizu also comprises an Analytical Text Data Model, Analytical Text Data Storage, and an Analytical Text Data Controller. In this framework, the external components are the specifications of the required interfaces for collaborating with the framework. 
As an experiment, we also apply this framework to two text visualization systems: a social opinion mining system and an online news analysis system.
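
The loose coupling this abstract describes can be sketched with two small interfaces. Only the interface names ITextDataLoader and IATDProvider come from the abstract. Their methods (`load`, `documents`), the reading of IATDProvider as the interface through which visualizers obtain analytical text data, and every other class below are assumptions made for illustration, not the TexVizu design.

```python
from dataclasses import dataclass
from typing import List, Protocol


@dataclass
class TextDocument:
    """Structural elements named in the abstract: title, author, body."""
    title: str
    author: str
    body: str


class ITextDataLoader(Protocol):
    def load(self) -> List[TextDocument]: ...  # hypothetical method


class IATDProvider(Protocol):
    def documents(self) -> List[TextDocument]: ...  # hypothetical method


class Controller:
    """Pulls data from a loader into storage and exposes it to views."""

    def __init__(self, loader: ITextDataLoader) -> None:
        self._loader = loader
        self._storage: List[TextDocument] = []

    def refresh(self) -> None:
        self._storage = self._loader.load()

    def documents(self) -> List[TextDocument]:
        return list(self._storage)


class InMemoryLoader:
    """Example loader; a real one might read posts or news articles."""

    def __init__(self, docs: List[TextDocument]) -> None:
        self._docs = list(docs)

    def load(self) -> List[TextDocument]:
        return list(self._docs)


class TitleListVisualizer:
    """Example view that depends only on the provider interface."""

    def __init__(self, provider: IATDProvider) -> None:
        self._provider = provider

    def render(self) -> str:
        docs = self._provider.documents()
        return "\n".join(f"{d.title} ({d.author})" for d in docs)


loader = InMemoryLoader([
    TextDocument("First post", "kim", "..."),
    TextDocument("Reply", "lee", "..."),
])
controller = Controller(loader)
controller.refresh()
print(TitleListVisualizer(controller).render())
```

Because the visualizer sees only the provider interface and the controller sees only the loader interface, either side can be replaced for a new project without touching the other, which is the reuse property the abstract argues for.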