• Title/Summary/Keyword: Semantic Web Technologies

Efficient Authorization Conflict Detection Using Prime Number Graph Labeling in RDF Access Control (RDF 접근 제어에서 소수 그래프 레이블링을 사용한 효율적 권한 충돌 발견)

  • Kim, Jae-Hoon;Park, Seog
    • Journal of KIISE:Databases
    • /
    • v.35 no.2
    • /
    • pp.112-124
    • /
    • 2008
  • RDF and OWL are the primary base technologies for implementing the Semantic Web. Recently, much research related to them, or applying them to other application domains, has been introduced. However, relatively little work has been done on securing RDF and OWL data. In this article, we briefly introduce an RDF-triple-based model for specifying RDF access authorization related to RDF security. Next, to efficiently find authorization conflicts caused by RDF inference, we introduce in detail a method using prime number graph labeling. The problem of authorization conflict by RDF inference is that although access to a lower concept is permitted, the concept can become inaccessible due to a denial on an upper concept, because through RDF inference the lower concept can be interpreted as the upper concept. Experimental results show that the proposed method using prime number graph labeling performs better than the existing simple method for detecting authorization conflicts.
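
The prime-labeling idea can be sketched as follows: give each concept a distinct prime, and make each concept's label the product of its own prime and its parent's label; an upper concept's label then divides every descendant's label, so the inference-induced conflict above reduces to a divisibility test. This is a minimal sketch of the general technique on a tree-shaped hierarchy, not the paper's exact algorithm.

```python
# Sketch: prime-number graph labeling for subsumption tests on a class
# hierarchy (a simplified illustration of the general idea, not the
# paper's exact algorithm).

def primes():
    """Yield primes 2, 3, 5, ... by trial division."""
    candidate, found = 2, []
    while True:
        if all(candidate % p for p in found):
            found.append(candidate)
            yield candidate
        candidate += 1

def label_hierarchy(parents):
    """parents: dict mapping class -> parent class (None for the root).
    Each class's label = its own prime times its parent's label, so an
    ancestor's label always divides a descendant's label."""
    gen, labels = primes(), {}
    def label(node):
        if node not in labels:
            own = next(gen)
            p = parents[node]
            labels[node] = own if p is None else own * label(p)
        return labels[node]
    for node in parents:
        label(node)
    return labels

def is_subconcept(labels, lower, upper):
    """True when `upper` is an ancestor of (or equal to) `lower`."""
    return labels[lower] % labels[upper] == 0

def has_conflict(labels, permitted, denied):
    """A grant on `permitted` conflicts with a denial on any upper
    concept, since inference lifts the lower concept to the upper one."""
    return is_subconcept(labels, permitted, denied)

hierarchy = {"Resource": None, "Document": "Resource", "Report": "Document"}
labels = label_hierarchy(hierarchy)
print(has_conflict(labels, "Report", "Resource"))  # denial on an ancestor -> True
print(has_conflict(labels, "Document", "Report"))  # Report is not an ancestor -> False
```

The divisibility test replaces a graph traversal per access check, which is where the efficiency gain of labeling schemes generally comes from.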

uLAMP: Unified Linguistic Asset Management Platform for Natural Language Processing (uLAMP: 자연어 처리를 위한 자원 통합 관리 플랫폼)

  • Um, Jung-Ho;Shin, Sung-Ho;Choi, Sung-Pil;Jung, Hanmin
    • The Journal of the Korea Contents Association
    • /
    • v.12 no.12
    • /
    • pp.25-34
    • /
    • 2012
  • Due to the development of the internet and wireless devices such as smartphones, a large amount of linguistic resources is being actively opened in each area of expertise. Also, various systems using Semantic Web technologies are being developed to determine whether such information is useful. Building these systems requires data collection and natural language processing, but there are few systems that integrate the software and data those processes require. In this paper, we propose uLAMP, a system that integrates software and data related to natural language processing. In economic terms, cost can be reduced by preventing duplicated implementation and data collection; in terms of management, the usability of data and software increases. In addition, a user survey was conducted to evaluate the usability and effectiveness of uLAMP. Through this evaluation, the advantages of data currentness and ease of use were confirmed.

HTML specification and semantics analysis of korean news sites (한국 인터넷신문 HTML 규격 및 시맨틱스 수준 분석)

  • Lee, Byoung-Hak
    • Journal of Digital Contents Society
    • /
    • v.18 no.5
    • /
    • pp.949-956
    • /
    • 2017
  • Visual interfaces of news sites look similar, while their HTML varies widely in specification and quality. As the HTML5 specification notes, it is becoming increasingly important to describe HTML semantically so that any computer can understand the contents being shared. In this study, I analyzed the HTML code of 110 Korean news sites in comparison with that of 8 global news sites. As a result, 68% of the news sites are still written to HTML4 specifications, and only 10 out of 110 are written to the HTML5 specification with quality and semantics as strong as those of the global news sites. The results show that most Korean news site platforms have not changed since they were developed in the mid-2000s, and they need to be upgraded, as language translation technologies are making it possible to share Korean digital contents with the rest of the world.
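
A crude version of this kind of audit can be automated by checking a page for an HTML5 doctype and counting HTML5 semantic elements. The sketch below, using only Python's standard library, is my own illustration of such a check, not the study's actual measurement method.

```python
# Sketch: a crude HTML5/semantics check using only the standard library.
# This illustrates the kind of audit described, not the study's metric.
from html.parser import HTMLParser

SEMANTIC_TAGS = {"article", "section", "nav", "header",
                 "footer", "aside", "main", "figure"}

class SemanticsCounter(HTMLParser):
    def __init__(self):
        super().__init__()
        self.html5_doctype = False
        self.semantic_count = 0

    def handle_decl(self, decl):
        # The HTML5 doctype is exactly "DOCTYPE html" (case-insensitive).
        self.html5_doctype = decl.strip().lower() == "doctype html"

    def handle_starttag(self, tag, attrs):
        if tag in SEMANTIC_TAGS:
            self.semantic_count += 1

page = """<!DOCTYPE html>
<html><body>
<header><nav>menu</nav></header>
<article><section>news text</section></article>
<footer>contact</footer>
</body></html>"""

counter = SemanticsCounter()
counter.feed(page)
print(counter.html5_doctype, counter.semantic_count)  # True 5
```

A real audit would of course weigh far more signals (ARIA roles, metadata, validation errors), but the doctype and semantic-element counts are the two the abstract highlights.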

Use of Text Processing Technologies in a Semantic Web Application (시맨틱 웹 응용 서비스에서의 텍스트 처리 기술 적용)

  • Jung, Han-Min;Kang, In-Su;Koo, Hee-Kwan;Lee, Seung-Woo;Kim, Pyung;Sung, Won-Kyung
    • Annual Conference on Human and Language Technology
    • /
    • 2006.10e
    • /
    • pp.189-196
    • /
    • 2006
  • This paper examines, through a case of application to $OntoFrame-K^{(R)}$, a Semantic Web based information distribution framework, what role text processing technologies can play in efficiently building the ontology instances that are essential for implementing Semantic Web application services. The text processing technologies introduced in this paper are applied to concept instantiation through entity resolution, metadata expansion through subject/field assignment, and construction of object relation properties through citation information extraction and citation network construction. For entity resolution, metadata comparison and merging were used, and manual construction based on this yielded URIs for 8,543 researchers. For subject and field assignment, index terms were matched against thesaurus concept terms mapped to field classification labels, and various features for assignment were obtained and applied: the term frequency (TF) of each index term, the TF of each matched concept term, the thesaurus depth of each matched concept term, the concept facet of each matched concept term, and the list of field classification labels attached to each matched concept term. For citation information extraction and citation network construction, automatically extracted citation information was incorporated on the basis of object URIs and researcher URIs, and a total of 135 citation network groups were automatically obtained from 7,237 documents. We expect the application methods for text processing technologies presented in this study to be used in many ways in future implementations of Semantic Web application services and infrastructure.
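
The entity-resolution step described above — comparing and merging metadata records so that each person ends up with a single URI — can be sketched roughly as below. The field names, matching rule (normalized name plus affiliation), and URI scheme are my own illustrative assumptions, not OntoFrame-K's implementation.

```python
# Sketch: merging person records by comparing metadata, then minting one
# URI per merged entity. Field names, the matching rule, and the URI
# scheme are illustrative assumptions, not the system's actual design.

def normalize(text):
    """Lowercase and collapse whitespace for comparison."""
    return " ".join(text.lower().split())

def resolve(records):
    """Group records whose normalized (name, affiliation) metadata match,
    and assign one URI per group."""
    groups = {}
    for rec in records:
        key = (normalize(rec["name"]), normalize(rec["affiliation"]))
        groups.setdefault(key, []).append(rec)
    entities = []
    for i, (key, recs) in enumerate(sorted(groups.items())):
        entities.append({
            "uri": f"http://example.org/person/{i}",  # hypothetical URI scheme
            "name": recs[0]["name"],
            "records": len(recs),
        })
    return entities

records = [
    {"name": "Jung, Han-Min", "affiliation": "KISTI"},
    {"name": "JUNG,  han-min", "affiliation": "kisti"},
    {"name": "Kang, In-Su", "affiliation": "KISTI"},
]
for e in resolve(records):
    print(e["uri"], e["name"], e["records"])
```

The paper pairs this automatic comparison with manual verification, which is how a reliable set of person URIs at that scale would realistically be obtained.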

Identifying potential buyers in the technology market using a semantic network analysis (시맨틱 네트워크 분석을 이용한 원천기술 분야의 잠재적 기술수요 발굴기법에 관한 연구)

  • Seo, Il Won;Chon, ChaeNam;Lee, Duk Hee
    • Journal of Technology Innovation
    • /
    • v.21 no.1
    • /
    • pp.279-301
    • /
    • 2013
  • This study demonstrates how social network analysis can be used to identify potential buyers in technology marketing; the methodology and empirical results are presented. First, we derived the three most important 'seed' keywords from the 'technology description' sections of technologies generated by various R&D activities of South Korea's public research institutes in the fundamental science fields. Second, some 3,000 words were collected from websites related to the three seed keywords. Next, three network matrices (one matrix per seed keyword) were constructed. To explore the technology network structure, each network was analyzed using degree centrality and Euclidean distance. The network analysis suggested 100 potentially demanding companies and, after comparing the results from each network, identified seven companies common to all three. The usefulness of the result was verified by investigating the business areas described on the firms' homepages; five of the seven firms were shown to have strong relevance to the target technology. In terms of social network analysis, this study expands the application scope of the methodology by combining semantic network analysis with technology marketing. From a practical perspective, the empirical study suggests an illustrative framework for finding prospective demanding companies on the web, raising the possibility of technology commercialization in basic research fields. Future research will examine how to increase the efficiency of the process and the accuracy of the results.
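
The degree-centrality step can be sketched as follows: build a co-occurrence network of terms collected around a seed keyword and rank nodes by normalized degree. The edge list below is a toy example of my own, not the study's data or full pipeline.

```python
# Sketch: degree centrality over a keyword co-occurrence network.
# The edge list is a toy example; the study built one network per seed
# keyword from ~3,000 collected terms.

edges = [
    ("sensor", "battery"), ("sensor", "wireless"), ("sensor", "coating"),
    ("battery", "electrode"), ("coating", "polymer"),
]

# Build an undirected adjacency map.
adj = {}
for a, b in edges:
    adj.setdefault(a, set()).add(b)
    adj.setdefault(b, set()).add(a)

# Normalized degree centrality: degree / (n - 1).
n = len(adj)
centrality = {node: len(nbrs) / (n - 1) for node, nbrs in adj.items()}

for node in sorted(centrality, key=centrality.get, reverse=True):
    print(f"{node}: {centrality[node]:.2f}")
```

High-centrality terms point to the market segments (and hence companies) most strongly associated with the seed technology, which is the intuition behind ranking candidates this way.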

Index for Efficient Ontology Retrieval and Inference (효율적인 온톨로지 검색과 추론을 위한 인덱스)

  • Song, Seungjae;Kim, Insung;Chun, Jonghoon
    • The Journal of Society for e-Business Studies
    • /
    • v.18 no.2
    • /
    • pp.153-173
    • /
    • 2013
  • Ontologies have been gaining increasing interest with the recent rise of the semantic web and related technologies. The focus is mostly on inference query processing, which requires high-level techniques for storing and searching ontologies efficiently, and it has been actively studied in the area of semantic-based searching. W3C recommends RDFS and OWL for representing ontologies. However, memory-based editors, inference engines, and triple stores all store an ontology as a simple set of triples. Naturally, performance is limited, especially when a large-scale ontology needs to be processed. A variety of studies proposing algorithms for efficient inference query processing have been conducted, many of them based on proven relational database technology. However, none of them has succeeded in obtaining the complete set of inference results reflecting the five characteristics of ontology properties. In this paper, we propose a new index structure, called the hyper cube index, to efficiently process inference queries. Our approach is based on the intuition that an index can speed up query processing when extensive inferencing is required.
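
To make the baseline concrete: storing triples as a flat set forces scans, whereas keeping several permutation indexes answers any bound triple pattern by dictionary lookup. The sketch below shows that standard triple-indexing technique and one transitive inference step over it; it is not the paper's hyper cube index, whose structure the abstract does not describe.

```python
# Sketch: indexing RDF triples by permutation (SPO, POS, OSP) so that any
# triple pattern with a bound term is answered by lookup rather than a
# scan. This is standard triple indexing, not the paper's hyper cube index.
from collections import defaultdict

class TripleIndex:
    def __init__(self):
        self.spo = defaultdict(lambda: defaultdict(set))
        self.pos = defaultdict(lambda: defaultdict(set))
        self.osp = defaultdict(lambda: defaultdict(set))

    def add(self, s, p, o):
        self.spo[s][p].add(o)
        self.pos[p][o].add(s)
        self.osp[o][s].add(p)

    def objects(self, s, p):
        return self.spo[s][p]

    def subjects(self, p, o):
        return self.pos[p][o]

idx = TripleIndex()
idx.add("ex:Report", "rdfs:subClassOf", "ex:Document")
idx.add("ex:Document", "rdfs:subClassOf", "ex:Resource")

# One inference step for the transitive subClassOf property: follow the
# index one hop from a class's direct superclasses.
direct = idx.objects("ex:Report", "rdfs:subClassOf")
inferred = set(direct)
for cls in direct:
    inferred |= idx.objects(cls, "rdfs:subClassOf")
print(sorted(inferred))  # ['ex:Document', 'ex:Resource']
```

Inference queries repeat lookups like this many times, which is why index structure, rather than raw storage, dominates their cost.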

Design and Implementation of Thesaurus System for Geological Terms (지질용어 시소러스 시스템의 설계 및 구축)

  • Hwang, Jaehong;Chi, KwangHoon;Han, JongGyu;Yeon, Young Kwang;Ryu, Keun Ho
    • Journal of the Korean Association of Geographic Information Studies
    • /
    • v.10 no.2
    • /
    • pp.23-35
    • /
    • 2007
  • With the development of semantic web technologies in the information retrieval area, the necessity for thesauri is increasing along with internet lexicons. A thesaurus combines a classification scheme with a lexicon: it is a topic map of knowledge structure expressing the relations among the concepts (terms) involved in human knowledge activities such as learning and research, using formally organized and controlled index terms to clarify the context of superordinate and subordinate concepts. However, although thesauri are regarded as essential tools for controlling and standardizing terms and for searching and processing information efficiently, there is no Korean thesaurus for geology. Building a thesaurus requires standardized and well-defined guidelines; such guidelines enable efficient information management and help users find correct information easily and conveniently. The present study aimed to build a thesaurus system for terms used in geology. For this, first, we surveyed related work on standardizing geological terms in Korea and other countries. Second, we defined geological topics in 15 areas and prepared a draft classification system for each topic. Third, based on the geological thesaurus classification system, we created the specification of the geological thesaurus. Lastly, we designed and implemented an internet-based geological thesaurus system using the specification.
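
A thesaurus of the kind described — controlled terms linked by broader (BT), narrower (NT), related (RT), and preferred-term (USE/UF) relations — fits in a very small data structure. The sketch below is a generic illustration with invented geological terms, not the system's actual schema.

```python
# Sketch: a minimal thesaurus with USE (preferred term), BT/NT
# (broader/narrower), and RT (related) relations. Terms are invented
# examples, not entries from the actual geological thesaurus.
from collections import defaultdict

class Thesaurus:
    def __init__(self):
        self.broader = defaultdict(set)   # term -> broader terms (BT)
        self.narrower = defaultdict(set)  # term -> narrower terms (NT)
        self.related = defaultdict(set)   # term -> related terms (RT)
        self.use = {}                     # non-preferred -> preferred term

    def add_bt(self, term, broader_term):
        self.broader[term].add(broader_term)
        self.narrower[broader_term].add(term)  # NT is the inverse of BT

    def add_rt(self, a, b):
        self.related[a].add(b)
        self.related[b].add(a)             # RT is symmetric

    def preferred(self, term):
        return self.use.get(term, term)

    def ancestors(self, term):
        """All superordinate concepts, following BT transitively."""
        seen, stack = set(), [term]
        while stack:
            for bt in self.broader[stack.pop()]:
                if bt not in seen:
                    seen.add(bt)
                    stack.append(bt)
        return seen

t = Thesaurus()
t.add_bt("basalt", "igneous rock")
t.add_bt("igneous rock", "rock")
t.add_rt("basalt", "lava")
t.use["volcanic basalt"] = "basalt"

print(sorted(t.ancestors("basalt")))   # ['igneous rock', 'rock']
print(t.preferred("volcanic basalt"))  # basalt
```

A retrieval system can then expand a query term to its narrower terms, or normalize user vocabulary through the USE mapping, which is exactly what makes a thesaurus useful for search.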

Analysis of media trends related to spent nuclear fuel treatment technology using text mining techniques (텍스트마이닝 기법을 활용한 사용후핵연료 건식처리기술 관련 언론 동향 분석)

  • Jeong, Ji-Song;Kim, Ho-Dong
    • Journal of Intelligence and Information Systems
    • /
    • v.27 no.2
    • /
    • pp.33-54
    • /
    • 2021
  • With the fourth industrial revolution and the arrival of the New Normal era due to COVID-19, the importance of non-contact technologies such as artificial intelligence and big data research has been increasing. Convergent research is being conducted in earnest to keep up with these trends, but few studies in the nuclear field have used artificial intelligence and big-data-related technologies such as natural language processing and text mining. This study was conducted to confirm the applicability of data science analysis techniques to the field of nuclear research. Furthermore, identifying trends in the perception of spent nuclear fuel is important because it can inform directions for nuclear industry policy and allow responding in advance to policy changes. For these reasons, this study conducted a media trend analysis of pyroprocessing, a spent nuclear fuel treatment technology, objectively analyzing changes in media perception of spent nuclear fuel dry treatment techniques through text mining. Text data from Naver web news articles containing the keywords "Pyroprocessing" and "Sodium Cooled Reactor" were collected through Python code to identify changes in perception over time. The analysis period was set from 2007, when the first article was published, to 2020, and a detailed, multi-layered analysis of the text data was carried out through word clouds based on frequency analysis, TF-IDF, and degree centrality calculation. Keyword frequency analysis showed that media perception of spent nuclear fuel dry treatment technology changed in the mid-2010s, influenced by the Gyeongju earthquake in 2016 and the implementation of the new government's energy conversion policy in 2017.
Therefore, trend analysis was conducted for the corresponding periods, and word frequencies, TF-IDF values, degree centrality values, and semantic network graphs were derived. The results show that before the mid-2010s, media perception of spent nuclear fuel dry treatment technology was diplomatic and positive. Over time, however, the frequency of keywords such as "safety", "reexamination", "disposal", and "dismantling" increased, indicating that the sustainability of the technology is being seriously questioned. Social awareness also changed as the technology, once framed as political and diplomatic, became ambiguous due to changes in domestic policy. This means that domestic policy changes, such as nuclear power policy, have a greater impact on media perception than issues of the spent nuclear fuel processing technology itself, presumably because nuclear policy is a more widely discussed and public-friendly topic than spent nuclear fuel. Therefore, to improve social awareness of spent nuclear fuel processing technology, it would be necessary to provide sufficient information about it, and linking it to nuclear policy issues could also help. The study further highlights the importance of social science research on nuclear power: applying the social sciences widely to nuclear engineering, while considering national policy changes, can help keep the nuclear industry sustainable. However, this study has limitations: big data analysis methods were applied only to a narrow research area ("Pyroprocessing", a spent nuclear fuel dry processing technology), no clear basis for the cause of the change in social perception was established, and only news articles were analyzed to determine social perception.
If future media trend analyses of nuclear power also consider reader comments, more reliable results can be expected for use in nuclear policy research. The academic significance of this study is that it confirmed the applicability of data science analysis in the field of nuclear research. Furthermore, as current government energy policies such as nuclear power plant reductions prompt re-evaluation of spent fuel treatment technology research, key keyword analysis in the field can contribute to future research directions. It is important to consider outside views, not just the safety and engineering integrity of nuclear power, and to reconsider whether it is appropriate to discuss nuclear engineering technology only internally. In addition, if multidisciplinary research on nuclear power is carried out, reasonable alternatives can be prepared to sustain the nuclear industry.
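
The TF-IDF weighting used in the trend analysis is term frequency times inverse document frequency. The sketch below is a generic, stdlib-only implementation of one common variant of the formula over an invented toy corpus, not the study's code or data.

```python
# Sketch: TF-IDF over a toy corpus, using the common variant
# tfidf(t, d) = tf(t, d) * log(N / df(t)). The "articles" are invented
# stand-ins for the news corpus, not the study's data.
import math
from collections import Counter

docs = [
    "pyroprocessing safety review pyroprocessing".split(),
    "reactor policy energy policy".split(),
    "pyroprocessing disposal debate".split(),
]

n_docs = len(docs)
df = Counter()
for doc in docs:
    df.update(set(doc))  # document frequency: count each term once per doc

def tfidf(doc):
    tf = Counter(doc)
    return {t: tf[t] * math.log(n_docs / df[t]) for t in tf}

weights = tfidf(docs[0])
# Terms unique to one document outweigh the corpus-wide term, even though
# "pyroprocessing" occurs more often within the document itself.
for term, w in sorted(weights.items(), key=lambda kv: -kv[1]):
    print(f"{term}: {w:.3f}")
```

This down-weighting of ubiquitous terms is what lets TF-IDF surface period-specific keywords such as "safety" or "disposal" against a background where "pyroprocessing" appears everywhere.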

Efficient Topic Modeling by Mapping Global and Local Topics (전역 토픽의 지역 매핑을 통한 효율적 토픽 모델링 방안)

  • Choi, Hochang;Kim, Namgyu
    • Journal of Intelligence and Information Systems
    • /
    • v.23 no.3
    • /
    • pp.69-94
    • /
    • 2017
  • Recently, increased demand for big data analysis has been driving the vigorous development of related technologies and tools. In addition, the development of IT and the increased penetration rate of smart devices are producing a large amount of data. Accordingly, data analysis technology is rapidly becoming popular, and attempts to acquire insights through data analysis have been continuously increasing; big data analysis will only become more important across industries for the foreseeable future. Big data analysis is generally performed by a small number of experts and delivered to each party requesting the analysis. However, growing interest in big data analysis is stimulating computer programming education and the development of many data analysis programs. Accordingly, the entry barriers to big data analysis are gradually lowering, data analysis technology is spreading, and big data analysis is expected to be performed by the requesters themselves. Along with this, interest in various kinds of unstructured data is continually increasing, with particular attention focused on text data. The emergence of new web-based platforms and techniques is bringing about mass production of text data and active attempts to analyze it, and the results of text analysis are utilized in various fields. Text mining is a concept that embraces various theories and techniques for text analysis. Among the many text mining techniques used for various research purposes, topic modeling is one of the most widely used and studied. Topic modeling is a technique that extracts the major issues from a large number of documents, identifies the documents that correspond to each issue, and provides the identified documents as clusters. It is evaluated as very useful in that it reflects the semantic elements of documents.
Traditional topic modeling is based on the distribution of key terms across the entire document collection, so the entire collection must be analyzed at once to identify the topic of each document. This requirement causes long analysis times when topic modeling is applied to many documents, and it creates a scalability problem: processing time increases exponentially with the number of analysis objects. The problem is particularly noticeable when the documents are distributed across multiple systems or regions. To overcome these problems, a divide-and-conquer approach can be applied to topic modeling: a large collection of documents is divided into sub-units, and topics are derived by repeating topic modeling on each unit. This method enables topic modeling over a large number of documents with limited system resources and improves processing speed. It can also significantly reduce analysis time and cost, since documents can be analyzed in each location without first combining them. Despite these advantages, however, the method has two major problems. First, the relationship between the local topics derived from each unit and the global topics derived from the entire collection is unclear: local topics can be identified within each unit, but global topics cannot. Second, a method for measuring the accuracy of such a methodology needs to be established; that is, taking the global topics as the ideal answer, the deviation of the local topics from the global topics must be measured. Because of these difficulties, this approach has not been studied as thoroughly as other directions in topic modeling. In this paper, we propose a topic modeling approach that solves these two problems.
First, we divide the entire document cluster (global set) into sub-clusters (local sets) and generate a reduced global set (RGS) consisting of delegate documents extracted from each local set. We attempt to solve the first problem by mapping RGS topics to local topics. Along with this, we verify the accuracy of the proposed methodology by detecting whether documents are assigned to the same topic in the global and local results. Using 24,000 news articles, we conduct experiments to evaluate the practical applicability of the proposed methodology. Through an additional experiment, we confirm that the proposed methodology can provide results similar to topic modeling over the entire collection, and we also propose a reasonable method for comparing the results of both approaches.
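
The mapping step this approach relies on — relating topics derived from local sets to topics derived from a (reduced) global set — can be illustrated by matching each local topic to the global topic with the most similar top-term set. The toy topics and the Jaccard-similarity matching below are my own illustration of the mapping idea, not the paper's algorithm.

```python
# Sketch: mapping local topics to global topics by Jaccard similarity of
# their top-term sets. Topics here are toy examples; the paper maps local
# topics against topics of a reduced global set (RGS).

def jaccard(a, b):
    """Similarity of two term sets: |intersection| / |union|."""
    return len(a & b) / len(a | b)

global_topics = {
    "G0": {"election", "vote", "party", "poll"},
    "G1": {"match", "goal", "league", "team"},
}
local_topics = {
    "L0": {"goal", "team", "coach", "league"},
    "L1": {"vote", "poll", "turnout", "party"},
}

# Assign each local topic to its most similar global topic.
mapping = {
    lid: max(global_topics, key=lambda gid: jaccard(terms, global_topics[gid]))
    for lid, terms in local_topics.items()
}
print(mapping)  # {'L0': 'G1', 'L1': 'G0'}
```

With such a mapping in hand, accuracy can be checked as the paper suggests: count how often a document's local topic maps to the same global topic the full-collection model would have assigned it.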