• Title/Summary/Keyword: 웹정보 (web information)

A Ranking Algorithm for Semantic Web Resources: A Class-oriented Approach (시맨틱 웹 자원의 랭킹을 위한 알고리즘: 클래스중심 접근방법)

  • Rho, Sang-Kyu;Park, Hyun-Jung;Park, Jin-Soo
    • Asia Pacific Journal of Information Systems / v.17 no.4 / pp.31-59 / 2007
  • We frequently use search engines to find relevant information on the Web but still end up with too much information. To address this problem of information overload, ranking algorithms have been applied to various domains. As more information becomes available, ranking search results effectively and efficiently will become more critical. In this paper, we propose a ranking algorithm for Semantic Web resources, specifically RDF resources. Traditionally, the importance of a particular Web page is estimated from the number of keywords found in the page, which is subject to manipulation. In contrast, link analysis methods such as Google's PageRank capitalize on the information inherent in the link structure of the Web graph. PageRank considers a page highly important if many other pages refer to it. The degree of importance also increases if the referring pages are themselves important. Kleinberg's algorithm is another link-structure-based ranking algorithm for Web pages. Unlike PageRank, Kleinberg's algorithm uses two kinds of scores: the authority score and the hub score. A page with a high authority score is an authority on a given topic, and many pages refer to it. A page with a high hub score links to many authoritative pages. As mentioned above, link-structure-based ranking has played an essential role in the World Wide Web (WWW), and its effectiveness and efficiency are now widely recognized. Meanwhile, because the Resource Description Framework (RDF) data model forms the foundation of the Semantic Web, any information in the Semantic Web can be expressed as an RDF graph, which makes ranking algorithms for RDF knowledge bases highly important. The RDF graph consists of nodes and directed links, similar to the Web graph. As a result, link-structure-based ranking seems highly applicable to ranking Semantic Web resources.
However, the information space of the Semantic Web is more complex than that of the WWW. For instance, the WWW can be viewed as one huge class, i.e., a collection of Web pages, with only one recursive property, i.e., a 'refers to' property corresponding to hyperlinks. The Semantic Web, by contrast, encompasses various kinds of classes and properties, so ranking methods used in the WWW should be modified to reflect the complexity of its information space. Previous research addressed the problem of ranking query results retrieved from RDF knowledge bases. Mukherjea and Bamba modified Kleinberg's algorithm to rank Semantic Web resources. They defined the objectivity score and the subjectivity score of a resource, which correspond to Kleinberg's authority score and hub score, respectively. They concentrated on the diversity of properties and introduced property weights to control the influence of one resource on another depending on the characteristics of the property linking the two. A node with a high objectivity score is the object of many RDF triples, and a node with a high subjectivity score is the subject of many RDF triples. They developed several kinds of Semantic Web systems to validate their technique and presented experimental results verifying the applicability of their method to the Semantic Web. Despite their efforts, however, some limitations remained, which they reported in their paper. First, their algorithm is useful only when a Semantic Web system represents most of the knowledge pertaining to a certain domain. In other words, for their algorithm to work properly, the ratio of links to nodes should be high, or resources overall should be described in sufficient detail.
Second, the Tightly-Knit Community (TKC) effect, in which pages that are less important but densely connected score higher than pages that are more important but sparsely connected, remains a problem. Third, a resource may have a high score not because it is actually important but simply because it is very common and therefore has many links pointing to it. In this paper, we examine these ranking problems from a novel perspective and propose a new algorithm that can solve the problems left open by previous studies. Our proposed method is based on a class-oriented approach. In contrast to the predicate-oriented approach adopted in previous research, under our approach a user determines the weight of a property by comparing its significance relative to other properties when evaluating the importance of resources in a specific class. This approach stems from the idea that most queries are meant to find resources belonging to the same class in the Semantic Web, which consists of many heterogeneous classes in RDF Schema. This approach closely reflects the way people evaluate things in the real world and will turn out to be superior to the predicate-oriented approach for the Semantic Web. Our proposed algorithm can resolve the TKC effect and can also shed light on other limitations reported in previous research. In addition, we propose two ways to incorporate data-type properties, which have not been used so far even when they bear on resource importance. We designed an experiment to show the effectiveness of our proposed algorithm and the validity of its ranking results, which had not been attempted in previous research. We also conducted a comprehensive mathematical analysis, which was overlooked in previous research. The mathematical analysis enabled us to simplify the calculation procedure.
Finally, we summarize our experimental results and discuss further research issues.
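
The abstract above describes a Kleinberg-style scheme in which each resource receives a subjectivity (hub-like) score and an objectivity (authority-like) score, with property weights controlling how strongly one resource influences another. The following is a minimal sketch of that idea only. It is not the algorithm of Mukherjea and Bamba or of this paper; the function names, the example triples, and the weight values are all hypothetical.

```python
from collections import defaultdict
from math import sqrt


def normalize(scores, nodes):
    """Scale scores to unit Euclidean length; nodes without a score get 0."""
    norm = sqrt(sum(v * v for v in scores.values())) or 1.0
    return {n: scores.get(n, 0.0) / norm for n in nodes}


def weighted_hits(triples, weights, iterations=50):
    """Iterate hub/authority-style updates over (subject, predicate, object) triples.

    weights maps a predicate to a number; unknown predicates default to 1.0.
    Returns (subjectivity, objectivity) dictionaries keyed by node.
    """
    nodes = {s for s, _, _ in triples} | {o for _, _, o in triples}
    subj = {n: 1.0 for n in nodes}
    obj = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        # Objectivity: weighted sum of subjectivity of resources pointing here.
        new_obj = defaultdict(float)
        for s, p, o in triples:
            new_obj[o] += weights.get(p, 1.0) * subj[s]
        obj = normalize(new_obj, nodes)
        # Subjectivity: weighted sum of objectivity of resources pointed to.
        new_subj = defaultdict(float)
        for s, p, o in triples:
            new_subj[s] += weights.get(p, 1.0) * obj[o]
        subj = normalize(new_subj, nodes)
    return subj, obj


# Hypothetical toy RDF graph: two papers cite a third; authorship links are
# given a lower weight than citation links.
triples = [
    ("paper1", "cites", "paper3"),
    ("paper2", "cites", "paper3"),
    ("paper1", "writtenBy", "alice"),
    ("paper2", "writtenBy", "bob"),
    ("paper3", "writtenBy", "alice"),
]
weights = {"cites": 1.0, "writtenBy": 0.2}

subjectivity, objectivity = weighted_hits(triples, weights)
print(max(objectivity, key=objectivity.get))
```

In a class-oriented variant as described in the abstract, the weights would presumably be chosen per class rather than per predicate alone; that refinement is not shown here.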

SKU recommender system for retail stores that carry identical brands using collaborative filtering and hybrid filtering (협업 필터링 및 하이브리드 필터링을 이용한 동종 브랜드 판매 매장간(間) 취급 SKU 추천 시스템)

  • Joe, Denis Yongmin;Nam, Kihwan
    • Journal of Intelligence and Information Systems / v.23 no.4 / pp.77-110 / 2017
  • Recently, consumption patterns have rapidly diversified and individualized through the web and Internet-based mobile devices. As a result, the efficient operation of offline stores, a traditional distribution channel, has become more important. To raise both sales and profits, stores need to supply and sell the products most attractive to consumers in a timely manner. However, there is little research on which SKUs, out of many products, can increase the probability of sales and reduce inventory costs. In particular, if a company sells products through multiple in-store outlets across multiple locations, recommending SKUs that appeal to customers would help increase the sales and profitability of stores. In this study, recommender-system techniques used for personalized recommendation (collaborative filtering and hybrid filtering) are proposed as a store-level SKU recommendation method for a distribution company that handles a homogeneous brand through many sales stores across countries and regions. We calculated the similarity between stores using purchase data on the items each store handles, applied collaborative filtering to each store's sales history by SKU, and finally recommended individual SKUs to each store. In addition, stores were classified into four clusters through PCA (Principal Component Analysis) and cluster analysis using store profile data. The hybrid filtering method applies collaborative filtering within each cluster, and we measured the performance of both methods on actual sales data. Most existing recommendation systems have been studied for recommending items such as movies and music to users. In practice, industrial applications have also become popular.
In the meantime, there has been little research on recommending SKUs to individual stores by applying these recommendation systems, which have mainly been studied for personalization services, to the stores of distributors handling similar brands. Whereas existing recommendation methods target individuals, this study extends the scope to the store level, across many sales stores by country and region, and proposes a recommendation method for the stores of a distribution company handling the same brand's SKUs. In addition, whereas existing recommendation systems are largely limited to online use, this study applies data mining techniques to develop an algorithm suited to the store level, extending the range of use to offline settings rather than analyzing individuals as before. The significance of this study is that a personalization recommendation algorithm is applied to many sales outlets handling the same brand, a meaningful result is derived, and a concrete methodology that companies can build and use as a system is proposed. It is also meaningful as the first attempt to extend recommender-system research, which has focused on the personalization domain, to the sales stores of a company handling the same brand. From 05 to 03 in 2014, the number of stores' sales volume of the top 100 SKUs are limited to 52 SKUs by collaborative filtering and the hybrid filtering method SKU recommended. We compared the performance of the two recommendation methods by totaling the sales results.
The two recommendation methods are compared because offline collaborative filtering is defined as the reference model, against which the method of this study should demonstrate higher performance than the existing recommendation method. The results of this model are compared with the hybrid filtering method, a model that reflects the characteristics of offline stores. The proposed method showed higher performance than the existing recommendation method. It was validated using actual sales data from large Korean apparel companies. In this study, we propose a method to extend the recommendation system from the individual level to the group level and to approach it efficiently, which is of great value in addition to the theoretical framework.
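
The store-level collaborative filtering described in this abstract can be illustrated with a small sketch: stores are compared by the cosine similarity of their per-SKU sales, and SKUs a store does not yet carry are ranked by similarity-weighted sales at other stores. This is an assumption-laden illustration, not the authors' implementation. The function names, the `peers` parameter (standing in for "stores in the same cluster" in the hybrid variant), and the sales figures are hypothetical.

```python
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(a[k] * b[k] for k in set(a) & set(b))
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recommend_skus(sales, target, peers=None, top_n=2):
    """Rank SKUs the target store does not carry.

    sales maps store -> {sku: units sold}. If peers is given, only those
    stores are used as neighbours (e.g. the target's cluster).
    """
    scores = {}
    for store, items in sales.items():
        if store == target or (peers is not None and store not in peers):
            continue
        similarity = cosine(sales[target], items)
        for sku, units in items.items():
            if sku not in sales[target]:
                scores[sku] = scores.get(sku, 0.0) + similarity * units
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda sku: (-scores[sku], sku))[:top_n]


# Hypothetical sales data for three stores.
sales = {
    "seoul": {"A": 10, "B": 5},
    "busan": {"A": 9, "B": 6, "C": 8},
    "tokyo": {"A": 1, "D": 7},
}

print(recommend_skus(sales, "seoul"))                   # all stores as neighbours
print(recommend_skus(sales, "seoul", peers={"tokyo"}))  # restricted to one cluster
```

Restricting neighbours to a cluster is one simple way to combine profile-based clustering with collaborative filtering; the abstract does not specify how the two are combined in detail.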

Analysis of media trends related to spent nuclear fuel treatment technology using text mining techniques (텍스트마이닝 기법을 활용한 사용후핵연료 건식처리기술 관련 언론 동향 분석)

  • Jeong, Ji-Song;Kim, Ho-Dong
    • Journal of Intelligence and Information Systems / v.27 no.2 / pp.33-54 / 2021
  • With the fourth industrial revolution and the arrival of the New Normal era due to COVID-19, the importance of non-contact technologies such as artificial intelligence and big data research has been increasing. Convergent research is being conducted in earnest to keep up with these trends, but few studies in the area of nuclear research have used artificial intelligence and big data technologies such as natural language processing and text mining. This study was conducted to confirm the applicability of data science analysis techniques to the field of nuclear research. Furthermore, identifying trends in the perception of spent nuclear fuel is critical for determining the direction of nuclear industry policies and responding in advance to changes in industrial policy. For these reasons, this study conducted a media trend analysis of pyroprocessing, a spent nuclear fuel treatment technology. We objectively analyze changes in media perception of spent nuclear fuel dry treatment technology by applying text mining techniques. Text data from Naver web news articles containing the keywords "Pyroprocessing" and "Sodium Cooled Reactor" were collected with Python code to identify changes in perception over time. The analysis period was set from 2007, when the first article was published, to 2020, and a detailed, multi-layered analysis of the text data was carried out using methods such as word clouds based on frequency analysis, TF-IDF, and degree centrality calculation. Keyword frequency analysis showed that media perception of spent nuclear fuel dry treatment technology changed in the mid-2010s, influenced by the Gyeongju earthquake in 2016 and the implementation of the new government's energy conversion policy in 2017.
Therefore, trend analysis was conducted based on the corresponding time periods, and word frequency analysis, TF-IDF, degree centrality values, and semantic network graphs were derived. The results show that before the 2010s, media perception of spent nuclear fuel dry treatment technology was diplomatic and positive. Over time, however, the frequency of keywords such as "safety", "reexamination", "disposal", and "disassembly" increased, indicating that the sustainability of spent nuclear fuel dry treatment technology is being seriously considered. It was confirmed that social awareness also changed as spent nuclear fuel dry treatment technology, which had been recognized as a political and diplomatic technology, became ambiguous due to changes in domestic policy. This means that domestic policy changes, such as nuclear power policy, have a greater impact on media perception than issues of "spent nuclear fuel processing technology" itself. This seems to be because nuclear policy is a more widely discussed and more publicly familiar topic than spent nuclear fuel. Therefore, to improve social awareness of spent nuclear fuel processing technology, it would be necessary to provide sufficient information about it, and linking it to nuclear policy issues would also be a good idea. In addition, the study highlighted the importance of social science research in nuclear power. The social sciences need to be applied widely to nuclear engineering, and we could confirm that the nuclear industry would be sustainable if national policy changes are taken into account. However, this study has the limitation that it applied big data analysis methods only to a narrow research area, namely "Pyroprocessing," a spent nuclear fuel dry processing technology. Furthermore, there was no clear basis for the cause of the change in social perception, and only news articles were analyzed to determine social perception.
If comments are also considered in future work, a media trend analysis of nuclear power is expected to produce more reliable results that can be used efficiently in nuclear policy research. Recently, the development of non-contact technologies such as artificial intelligence and big data research has been accelerating in the wake of the New Normal era caused by COVID-19. Convergence research is being conducted in earnest in various fields to follow these trends, but few studies in the nuclear field have used artificial intelligence and big data technologies such as natural language processing and text mining. The academic significance of this study is that it confirmed the applicability of data science analysis technology to nuclear research. Furthermore, because current government energy policies such as nuclear power plant reductions have prompted a re-evaluation of spent fuel treatment technology research, keyword analysis in the field can contribute to the direction of future research. It is important to consider outside views, not just the safety technology and engineering integrity of nuclear power, and to reconsider whether it is appropriate to discuss nuclear engineering technology only internally. In addition, if multidisciplinary research on nuclear power is carried out, reasonable alternatives can be prepared to maintain the nuclear industry.
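
The abstract names TF-IDF and degree centrality as two of its text mining measures. A minimal sketch of both follows, using only the Python standard library. It assumes one common TF-IDF variant (term frequency times log(N/df), without smoothing) and a co-occurrence network in which two terms are linked when they appear in the same article; the study's actual preprocessing, formulas, and tooling are not stated in the abstract, and the sample tokens are hypothetical.

```python
from collections import Counter
from itertools import combinations
from math import log


def tf_idf(docs):
    """Return one {term: score} dict per document.

    docs is a list of token lists. Score = (count / doc length) * log(N / df).
    """
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    scored = []
    for doc in docs:
        counts = Counter(doc)
        scored.append({
            term: (count / len(doc)) * log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return scored


def degree_centrality(docs):
    """Degree centrality of terms in a same-document co-occurrence network.

    Each term's degree is divided by (number of terms - 1).
    """
    neighbours = {}
    for doc in docs:
        for a, b in combinations(sorted(set(doc)), 2):
            neighbours.setdefault(a, set()).add(b)
            neighbours.setdefault(b, set()).add(a)
    n_terms = len(neighbours)
    if n_terms < 2:
        return {}
    return {term: len(nb) / (n_terms - 1) for term, nb in neighbours.items()}


# Hypothetical tokenized articles.
articles = [
    ["pyroprocessing", "safety", "policy"],
    ["pyroprocessing", "reactor"],
    ["pyroprocessing", "safety", "disposal"],
]

scores = tf_idf(articles)
centrality = degree_centrality(articles)
print(scores[1])
print(centrality)
```

A term that appears in every article, such as the search keyword itself, receives a TF-IDF of zero under this variant while having the highest degree centrality, which is one reason the two measures are often reported side by side.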

The Effect of Users' Personality on Emotional and Cognitive Evaluation in UCC Web Site Usage (UCC(user-created-contents) 웹 사이트에서 사용자의 인성이 감정적, 인지적 평가와 UCC 활용에 미치는 영향)

  • Moon, Yun-Ji;Kang, So-Ra;Kim, Woo-Gon
    • Asia Pacific Journal of Information Systems / v.20 no.3 / pp.167-190 / 2010
  • This research focuses on factors that affect the behavior of UCC (User Created Content) website users other than the user's rational recognition of how useful a UCC website can be. Most discussions in the existing information systems literature have focused on users' evaluation of how a UCC website can help them attain their own goals. However, there are other factors, and this research pays attention to an individual's 'personality,' which is stable and biological in nature. Specifically, I note that 'extroversion' and 'neuroticism,' the two personality factors common to the most representative models, Eysenck's 'EPQ Model' and the 'Big Five Model,' affect a site's 'usefulness,' that is, how useful the user considers the website and its content. Perceived usefulness is the factor that has been regarded as the antecedent of the adoption of information systems in existing MIS (Management Information System) research. Secondly, because using or creating a UCC website does not guarantee the user's or the creator's extrinsic motivation, unlike using an information system within an organization, increased user activity on a UCC website is more likely to be motivated by emotional factors than by rational factors. Thus, I include the relationship between an individual's personality and what they find pleasurable in the research model. Thirdly, according to the S-O-R paradigm of Mehrabian and Russell, cognitive and emotional factors are affected by stimuli, and these factors in turn affect an individual's response behavior. Therefore, this research assumes that the recognition of how useful the site and its content are and the emotional pleasure they provide will ultimately affect the behavior of UCC website users.
Finally, the relationship between the recognition of how useful a site is, how pleasurable it is to use, and UCC usage may differ depending on certain situational conditions. In other words, the relationship between the three factors may vary according to how much users are involved in creating the website content. Creation thus emerges as the keyword of UCC. I analyzed the above relationships using the user's involvement in creation as a moderating variable. The results show the following. Regarding the relationship between an individual's personality and what they find pleasurable, extroverted users are more likely to feel pleasure when using a UCC website, as expected. This in turn leads to more active usage of the UCC website, because an extrovert likes to spend time on activities with other people, is sensitive to new experiences and stimuli, and thus actively responds to them. An extroverted person accepts new UCC activities as part of his or her social life rather than avoiding the new UCC environment. This is represented by the term 'Folksonomy,' where users meet a variety of users from all over the world and encounter new types of content created by them. Neuroticism, however, creates the opposite situation. The representative symptoms of neuroticism are instability, stress, and tension. These dispositions are more closely related to stress caused by a new environment than to curiosity or pleasure. Thus, neurotic persons feel uneasy and will eventually avoid situations where their own or others' daily lives are frequently exposed to the open web environment, which eventually gives them a negative attitude towards the web environment.
Regarding an individual's personality and how useful a site is, the two personality factors of extroversion and neuroticism both have a positive relationship with the recognition of how useful the site and its content are. The positive, curious, and social dispositions of extroverted persons tend to make them consider the future usefulness and possibilities of a new type of information system or website, which has a significant influence on the recognition of how useful these UCC sites are. Neuroticism also favorably affects how useful a UCC website is perceived to be, through a different mechanism from that of extroversion. Because neurotic persons tend to feel uneasy and have many doubts about a new type of information system, they actively explore its usefulness in order to relieve their uncomfortable feelings. In other words, neurotic persons seek out how useful a site can be in order to secure their own stable feelings, whereas extroverted persons explore how useful a site can be because of their positive attitude and curiosity. As much MIS research has revealed, the recognition of how useful a site can be and how pleasurable it can be to use have a significant effect on UCC activity. However, the relationship between these factors differs according to the user's involvement in creation. This factor of creation gauges users' interest in creating UCC contents. Involvement is a variable that shows the level of an individual's mental effort in creating UCC contents. When a user is highly involved in the creation process and makes an enormous effort to create UCC content (classed as part of a high-involvement group), their own pleasure and recognition of how useful the site is have a significantly greater effect on future usage of UCC contents than for users who sit back and just retrieve UCC content created by others.
The cognitive and emotional response of those in the low-involvement group is unlikely to last long, even if they recognize that the contents of a UCC website are pleasurable and useful to them. The high-involvement group, however, tends to participate in the creation and usage of UCC more favorably, connecting the experience with their own goals. In this respect, this research presents an answer to the question of why so many people participate in UCC, the representative form of Web 2.0 that has drawn more and more people into creating content, even though they cannot gain any monetary or social compensation. Neither an information system nor a website can succeed unless it secures a certain level of user base. Moreover, even with a certain user base, it cannot be further developed when the reasons for, or problems with, people's participation are not suitably explored. Thus, what is significant in this research is that it has studied users' response behavior based on an individual's innate personality, emotion, and cognitive interaction, unlike existing research that has focused on 'compensation' to explain users' participation in UCC websites. There are also limitations in this research. Firstly, I divided an individual's personality into extroversion and neuroticism; however, there are many other personal factors, such as neuro-psychiatricism, whose influence on UCC activities also needs to be analyzed. Secondly, because UCC websites come in many types, such as multimedia, wikis, and podcasting, these types need to be included as sub-categories of UCC websites, and their relationship with personality, emotion, cognition, and behavior also needs to be analyzed.