• Title/Summary/Keyword: Social semantic web


A Technique for Extracting GeoSemantic Knowledge from Micro-blog (마이크로 블로그기반의 공간 지식 추출 기법연구)

  • Ha, Su-Wook;Nam, Kwang-Woo;Ryu, Keun-Ho
    • Spatial Information Research / v.20 no.2 / pp.129-136 / 2012
  • Recently, international organizations such as ISO/TC211, OGC, and INSPIRE (Infrastructure for Spatial Information in Europe) have been working to share geospatial data using semantic web technologies. In addition, smartphones and social networking services give communities the opportunity to share issues about social phenomena tied to a geographic area, and many researchers are looking for methods to extract such issues. However, serviceable spatial ontologies are still insufficient at the application level, and studies of spatial information extraction from SNS have focused on finding users' locations or on geocoding through text mining. A study on extracting spatial phenomena from social media and converting them into geosemantic knowledge is therefore valuable. In this paper, we propose a framework that extracts keywords from micro-blogs, one of the social media services, finds relationships among them using data mining techniques, and converts them into spatiotemporal knowledge. The results of this study can serve as a procedure and an ontology model for implementing a related system that constructs geosemantic issues, and they are expected to improve the effectiveness of finding, publishing, and analysing spatial issues.

A Study on the Ontology-Based Regional User-centric convergence content design information retrieval (온톨로지 기반의 사용자 중심 융합 컨텐츠 디자인 정보 검색에 관한 연구)

  • Park, Ju-Ok;Yeom, Mi-Ryeong;Jung, Doo-Yong
    • Journal of the Korea Convergence Society / v.7 no.2 / pp.19-24 / 2016
  • On the huge information space called the Internet, users can use the smart mobile web to get information on various intellectual fields and can access various media such as personal blogs and social networking sites (SNS). For this reason, the vast amount of information on the web is now managed and researched through a technology called the Semantic Web. However, although the Semantic Web has improved the integration of widely scattered information and user-oriented search for intellectual information, research on searching for intellectual information still needs improvement. This study therefore investigates how to search the information and knowledge scattered across a knowledge-rich information space in a way that improves credibility according to user-oriented logic.

Analysis of Social Media Utilization based on Big Data-Focusing on the Chinese Government Weibo

  • Li, Xiang;Guo, Xiaoqin;Kim, Soo Kyun;Lee, Hyukku
    • KSII Transactions on Internet and Information Systems (TIIS) / v.16 no.8 / pp.2571-2586 / 2022
  • The rapid rise in popularity of government social media has generated huge amounts of text data, and the analysis of these data has gradually become a focus of digital government research. This study uses the Python language to analyze big data from Chinese provincial government Weibo accounts. First, a web crawler is used to collect and statistically describe over 360,000 records from 31 provincial government microblogs in China, covering the period from January 2018 to April 2022. Second, a word segmentation engine is constructed, and the text data are analyzed using word cloud word frequencies as well as semantic relationships. Finally, the text data are analyzed for sentiment using natural language processing methods, and the text topics are studied using the LDA algorithm. The results show that, first, the number and scale of posts on Chinese government Weibo have grown rapidly. Second, government Weibo has certain social attributes, and the epidemic, people's livelihood, and services have become its focus. Third, more than 30% of government Weibo content carries negative sentiment. The classified topics show that the epidemic and epidemic prevention and control overshadowed the other topics, which inhibits the diversification of government Weibo.

A Study of Comparison between Cruise Tours in China and U.S.A through Big Data Analytics

  • Shuting, Tao;Kim, Hak-Seon
    • Culinary science and hospitality research / v.23 no.6 / pp.1-11 / 2017
  • The purpose of this study was to compare cruise tours in China and the U.S.A. through semantic network analysis of big data collected online with SCTM (Smart crawling & Text mining), a data collecting and processing program. The data covered the period from January 1, 2015 to August 15, 2017; "cruise tour, china" and "cruise tour, usa" were used as keywords to collect related data, and UCINET 6.0 with its packaged Netdraw was used for data analysis. The results show that Chinese cruisers are currently concerned with cruising destinations, while American cruisers pay more attention to the onboard experience and cruising expenditure. CONCOR (convergence of iterated correlation) analysis produced three clusters for Chinese cruise tours (domestic destinations, international destinations, and hospitality tourism) and four groups for American cruise tours (cruise expenditure, onboard experience, cruise brand, and destinations). Since cruise tourism in America is highly developed, this study also aims to provide meaningful, social network-oriented suggestions for Chinese cruise tourism.

Customized Knowledge Creation Framework using Context- and intensity-based Similarity (상황과 정보 집적도를 고려한 유사도 기반의 맞춤형 지식 생성프레임워크)

  • Sohn, Mye M.;Lee, Hyun-Jung
    • Journal of Internet Computing and Services / v.12 no.5 / pp.113-125 / 2011
  • As information resources have become more varied and more numerous, knowledge customization on the social web has become more difficult. To reduce this burden, we offer a framework for context-based similarity calculation for knowledge customization using an ontology on CBR. We developed new context- and intensity-based similarity calculation methods, which are applied to extract the most similar case, taking both semantic and syntactic similarity into account, and to create user-tailored knowledge effectively from the selected case. The process comprises converting unstructured web information into cases, extracting an appropriate case according to the user requirements, and customizing the knowledge using the selected case. In the experimental section, the effectiveness of the developed similarity methods is compared with that of other edge-counting similarity methods using two classes that are compared with each other. The results show that our framework yields higher similarity values for conceptually close classes than the other methods.
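
The edge-counting baseline mentioned in this abstract can be illustrated with a small sketch. It assumes a tiny hypothetical is-a taxonomy and one common formulation, sim = 1 / (1 + shortest path length); the names `PARENT`, `build_graph`, `path_length`, and `edge_similarity` are illustrative only, and this is not the context- and intensity-based method the authors propose.

```python
from collections import deque

# Hypothetical toy taxonomy (child -> parent); not taken from the paper.
PARENT = {
    "sedan": "car",
    "car": "vehicle",
    "truck": "vehicle",
    "bicycle": "vehicle",
    "vehicle": "thing",
    "apple": "fruit",
    "fruit": "thing",
}

def build_graph(parent):
    """Turn child -> parent links into an undirected adjacency map."""
    graph = {}
    for child, par in parent.items():
        graph.setdefault(child, set()).add(par)
        graph.setdefault(par, set()).add(child)
    return graph

def path_length(graph, a, b):
    """Breadth-first search for the number of edges between a and b."""
    if a == b:
        return 0
    seen = {a}
    queue = deque([(a, 0)])
    while queue:
        node, dist = queue.popleft()
        for nxt in graph.get(node, ()):
            if nxt == b:
                return dist + 1
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None  # the two classes are not connected

def edge_similarity(graph, a, b):
    """Assumed edge-counting similarity: 1 / (1 + shortest path length)."""
    d = path_length(graph, a, b)
    return 0.0 if d is None else 1.0 / (1.0 + d)

GRAPH = build_graph(PARENT)
```

With this toy taxonomy, `edge_similarity(GRAPH, "sedan", "car")` is higher than `edge_similarity(GRAPH, "sedan", "apple")`, because the second pair is separated by more edges.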

A Study on the Direction of Art Policy through Semantic Network Analysis in New Normal Era (뉴노멀(New Normal) 시대 언어네트워크 분석에 의한 예술정책 방향 연구)

  • Kim, Mi Yeon;Kwon, Byeong Woong
    • Korean Association of Arts Management / no.58 / pp.153-177 / 2021
  • This study analyzed language networks based on art policy theory in the New Normal era triggered by COVID-19 and on domestic and foreign policy trends. For the analysis, data containing the keywords "Corona" and "Art" were collected from Google News and Web documents from March to September 2020, and 227 refined subject words were extracted; these were analyzed in terms of frequency and centrality using the NetMiner program. In addition, the semantic network was visualized to analyze the relationships between topic words. In the semantic network analysis, the most frequent topic was "Corona," and "Culture and Art," "Art," "Performance," "Online," and "Support" were in the group with the highest frequencies. In the centrality analysis, "Corona" ranked highest, followed by "the era," "after," "post," "art," and "cultural arts"; the high-frequency words "Corona," "art," and "cultural arts" also dominated most centrality measures. In particular, "online," "support," and "policy" were among the top-level keywords in both the frequency and centrality analyses. This suggests that, as social distancing becomes part of daily life due to COVID-19, non-face-to-face and online content is rising rapidly and support policies for the artistic communities are needed.
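
The frequency and centrality indicators described above can be sketched roughly as follows. This is a minimal illustration, assuming a handful of hypothetical pre-tokenized documents and plain degree centrality on a keyword co-occurrence network; the study itself used a dedicated network analysis program, and `DOCS` and all function names here are invented for the example.

```python
from collections import Counter
from itertools import combinations

# Hypothetical pre-tokenized documents; the study's real corpus is not reproduced here.
DOCS = [
    ["corona", "art", "online"],
    ["corona", "art", "support"],
    ["corona", "policy", "support"],
    ["art", "online"],
]

def term_frequency(docs):
    """Count how often each subject word appears across all documents."""
    return Counter(term for doc in docs for term in doc)

def cooccurrence_edges(docs):
    """Link two words if they appear together in at least one document."""
    edges = set()
    for doc in docs:
        for a, b in combinations(sorted(set(doc)), 2):
            edges.add((a, b))
    return edges

def degree_centrality(edges):
    """Share of the other nodes that each node is directly connected to."""
    nodes = {n for edge in edges for n in edge}
    denom = len(nodes) - 1
    if denom <= 0:
        return {}
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    return {n: degree[n] / denom for n in nodes}
```

In this toy data, "corona" co-occurs with every other word and so has the highest degree centrality, while a word such as "policy" appears only once but still connects to two neighbours.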

A Folksonomy Ranking Framework: A Semantic Graph-based Approach (폭소노미 사이트를 위한 랭킹 프레임워크 설계: 시맨틱 그래프기반 접근)

  • Park, Hyun-Jung;Rho, Sang-Kyu
    • Asia pacific journal of information systems / v.21 no.2 / pp.89-116 / 2011
  • In collaborative tagging systems such as Delicious.com and Flickr.com, users assign keywords or tags to their uploaded resources, such as bookmarks and pictures, for future use or sharing. The collection of resources and tags generated by a user is called a personomy, and the collection of all personomies constitutes the folksonomy. The most significant need of folksonomy users is to efficiently find useful resources or experts on specific topics. An excellent ranking algorithm would assign higher rankings to more useful resources or experts. What resources are considered useful in a folksonomic system? Does a standard superior to frequency or freshness exist? A resource recommended by more users with more expertise should be worthy of attention. This ranking paradigm can be implemented through a graph-based ranking algorithm. Two well-known representatives of such a paradigm are PageRank by Google and HITS (Hypertext Induced Topic Selection) by Kleinberg. Both PageRank and HITS assign a higher evaluation score to pages linked to more higher-scored pages. HITS differs from PageRank in that it uses two kinds of scores: authority and hub scores. The ranking objects of these algorithms are limited to Web pages, whereas the ranking objects of a folksonomic system are heterogeneous (i.e., users, resources, and tags). Therefore, uniform application of the link-based voting notion of PageRank and HITS to a folksonomy would be unreasonable. In a folksonomic system, each link corresponding to a property can have an opposite direction, depending on whether the property is expressed in the active or the passive voice. The current research stems from the idea that a graph-based ranking algorithm could be applied to the folksonomic system using the concept of mutual interactions between entities, rather than the voting notion of PageRank or HITS.
The concept of mutual interactions, proposed for ranking Semantic Web resources, enables the calculation of importance scores of various resources unaffected by link directions. The weights of a property representing the mutual interaction between classes are assigned depending on the relative significance of the property to the resource importance of each class. This class-oriented approach is based on the fact that, in the Semantic Web, there are many heterogeneous classes; thus, applying a different appraisal standard for each class is more reasonable. This is similar to the evaluation method of humans, where different items are assigned specific weights, which are then summed up to determine the weighted average. We can check for missing properties more easily with this approach than with other predicate-oriented approaches. A user of a tagging system usually assigns more than one tag to the same resource, and there can be more than one tag with the same subject and object. When many users assign similar tags to the same resource, grading the users differently depending on the assignment order becomes necessary. This idea comes from studies in psychology in which expertise involves the ability to select the most relevant information for achieving a goal. An expert should be someone who not only has a large collection of documents annotated with a particular tag, but also tends to add documents of high quality to his/her collections. Such documents are identified by the number, as well as the expertise, of users who have the same documents in their collections. In other words, there is a relationship of mutual reinforcement between the expertise of a user and the quality of a document. In addition, there is a need to rank entities related more closely to a certain entity. Because the popularity of a topic in social media is temporary, recent data should carry more weight than old data.
We propose a comprehensive folksonomy ranking framework that deals with all these considerations and can be easily customized to each folksonomy site for ranking purposes. To examine the validity of our ranking algorithm and show the mechanism of adjusting property, time, and expertise weights, we first use a dataset designed for analyzing the effect of each ranking factor independently. We then show the ranking results of a real folksonomy site, with the ranking factors combined. Because the ground truth of a given dataset is not known when it comes to ranking, we inject simulated data whose ranking results can be predicted into the real dataset and compare the ranking results of our algorithm with those of a previous HITS-based algorithm. Our semantic ranking algorithm based on the concept of mutual interaction seems to be preferable to the HITS-based algorithm as a flexible folksonomy ranking framework. Some concrete points of difference are as follows. First, with the time concept applied to the property weights, our algorithm shows superior performance in lowering the scores of older data and raising the scores of newer data. Second, applying the time concept to the expertise weights, as well as to the property weights, our algorithm controls the conflicting influence of expertise weights and enhances the overall consistency of time-valued ranking. The expertise weights of the previous study can act as an obstacle to time-valued ranking because the number of followers increases as time goes on. Third, many new properties and classes can be included in our framework. The previous HITS-based algorithm, based on the voting notion, loses ground when the domain consists of more than two classes, or when other important properties, such as "sent through twitter" or "registered as a friend," are added to the domain. Fourth, there is a big difference in calculation time and memory use between the two kinds of algorithms.
While the multiplication of two matrices has to be executed twice for the previous HITS-based algorithm, this is unnecessary with our algorithm. In our ranking framework, various folksonomy ranking policies can be expressed by combining the ranking factors, and our approach can work even if the folksonomy site is not implemented with Semantic Web languages. Above all, the time weight proposed in this paper will be applicable to various domains, including social media, where time value is considered important.
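
To make the comparison point concrete, the sketch below shows the HITS-style baseline with a time weight attached to each link. It is only an illustration under stated assumptions: the exponential half-life form of `time_weight`, the `BOOKMARKS` data, and all names are hypothetical, and the code does not implement the authors' mutual-interaction algorithm or their specific weighting scheme.

```python
import math

def time_weight(age_days, half_life_days=30.0):
    """Assumed decay: a link loses half its weight every half_life_days."""
    return 0.5 ** (age_days / half_life_days)

def hits(edges, iterations=50):
    """Weighted HITS by power iteration.

    edges: list of (source, target, weight). Returns (hub, authority) dicts.
    """
    nodes = {n for s, t, _ in edges for n in (s, t)}
    hub = {n: 1.0 for n in nodes}
    auth = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        # Authority update: sum of weighted hub scores of nodes pointing in.
        new_auth = {n: 0.0 for n in nodes}
        for s, t, w in edges:
            new_auth[t] += w * hub[s]
        norm = math.sqrt(sum(v * v for v in new_auth.values())) or 1.0
        auth = {n: v / norm for n, v in new_auth.items()}
        # Hub update: sum of weighted authority scores of nodes pointed to.
        new_hub = {n: 0.0 for n in nodes}
        for s, t, w in edges:
            new_hub[s] += w * auth[t]
        norm = math.sqrt(sum(v * v for v in new_hub.values())) or 1.0
        hub = {n: v / norm for n, v in new_hub.items()}
    return hub, auth

# Hypothetical bookmarks: (user, resource, age of the bookmark in days).
BOOKMARKS = [
    ("alice", "old_page", 300),
    ("bob", "old_page", 300),
    ("alice", "new_page", 1),
    ("bob", "new_page", 2),
]
EDGES = [(user, res, time_weight(age)) for user, res, age in BOOKMARKS]
HUB, AUTH = hits(EDGES)
```

Both pages are bookmarked by the same two users, so unweighted HITS would score them equally; with the decay applied, `AUTH["new_page"]` ends up above `AUTH["old_page"]`. Note that each iteration performs two passes over the links (one for authorities, one for hubs), which corresponds to the two matrix multiplications the abstract mentions for the HITS-based approach.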

Automatic In-Text Keyword Tagging based on Information Retrieval

  • Kim, Jin-Suk;Jin, Du-Seok;Kim, Kwang-Young;Choe, Ho-Seop
    • Journal of Information Processing Systems / v.5 no.3 / pp.159-166 / 2009
  • As Wikipedia shows, tagging or cross-linking major keywords in a document collection improves not only the readability of documents but also responsive and adaptive navigation among related documents. In recent years, the Semantic Web has increased the importance of social tagging as a key feature of Web 2.0, and the Tag Cloud has emerged publicly as its crucial phenotype. In this paper we provide an efficient method of automated in-text keyword tagging based on a large-scale controlled term collection or keyword dictionary, in which the computational complexity of O(mN) when a pattern matching algorithm is used can be reduced to O(m log N) when an Information Retrieval technique is adopted, where m is the length of the target document and N is the total number of candidate terms to be tagged. The results show that automatic in-text tagging with keywords filtered by Information Retrieval is about 6 to 40 times faster than the fastest pattern matching algorithm.
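
The complexity argument can be illustrated with a simple stand-in: instead of testing every one of the N dictionary terms against the document, each of the m tokens is looked up in a sorted term list by binary search, which costs O(log N) per token. This sketch assumes single-word terms, whitespace tokenization, and hypothetical `[[ ]]` tag markers; it is not the Information Retrieval technique used in the paper.

```python
from bisect import bisect_left

def build_index(terms):
    """Sort the lowercased controlled vocabulary once so lookups can use binary search."""
    return sorted({t.lower() for t in terms})

def contains(index, word):
    """Binary search: O(log N) comparisons for a vocabulary of N terms."""
    i = bisect_left(index, word)
    return i < len(index) and index[i] == word

def tag_keywords(text, index, open_tag="[[", close_tag="]]"):
    """Wrap each whitespace-separated token found in the index with tag markers."""
    out = []
    for token in text.split():
        # Ignore surrounding punctuation when looking the token up.
        core = token.strip(".,;:!?()\"'")
        if core and contains(index, core.lower()):
            out.append(token.replace(core, open_tag + core + close_tag, 1))
        else:
            out.append(token)
    return " ".join(out)

# Hypothetical three-term dictionary for illustration.
INDEX = build_index(["Semantic", "ontology", "folksonomy"])
```

For example, `tag_keywords("An ontology, not a folksonomy.", INDEX)` returns `"An [[ontology]], not a [[folksonomy]]."`. Multi-word terms would need a different index structure, which this sketch leaves out.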

Semantic analysis via application of deep learning using Naver movie review data (네이버 영화 리뷰 데이터를 이용한 의미 분석(semantic analysis))

  • Kim, Sojin;Song, Jongwoo
    • The Korean Journal of Applied Statistics / v.35 no.1 / pp.19-33 / 2022
  • With the explosive growth of social media, the abundant text-based data generated by web users has become an important source for data analysis. For example, online movie reviews on 'Naver Movie' often influence whether the general public decides to watch a movie. This study analyzed Naver Movie's text-based review data to predict the actual ratings. After examining the distribution of movie ratings, we performed semantic analysis using Korean Natural Language Processing. This research sought the best review rating prediction model by comparing machine learning and deep learning models. We also compared various regression and classification models in 2-class and multi-class cases. Lastly, we explained the causes of review misclassification in relation to the characteristics of movie review data.

User-centralized Social Semantic Web Framework (사용자 중심 소셜 시맨틱 웹 프레임워크)

  • Wang, Dong-Seung;Sohn, Jong-Soo;Kim, Jung-Hun;Chung, In-Jeong
    • Proceedings of the Korean Information Science Society Conference / 2012.06a / pp.185-187 / 2012
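  • With the rapid increase in users of the social web, including SNS, the social web has not only gained growing influence across many areas of society but has also come to play an important role as a repository of data. Accordingly, the role of the Semantic Web in analyzing the vast amount of social web data has recently become important. However, research on frameworks for effectively combining social web data with Semantic Web technologies is relatively scarce. This paper therefore proposes a framework that collects social web data and processes it with Semantic Web technologies. The proposed framework collects data provided by multiple social web services and processes it using Semantic Web technologies. With the proposed framework, a user's messages and profiles distributed across multiple services can be used to analyze data more reliably.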