• Title/Summary/Keyword: semantic weight

Search results: 71

Research on Function and Policy for e-Government System using Semantic Technology (전자정부내 의미기반 기술 도입에 따른 기능 및 정책 연구)

  • Go, Gwang-Seop;Jang, Yeong-Cheol;Lee, Chang-Hun
    • 한국디지털정책학회:학술대회논문집 / 2007.06a / pp.79-87 / 2007
  • This paper aims to offer a solution based on semantic document classification to improve e-Government utilization and efficiency for people who use their own information retrieval systems and linguistic expressions. Generally, semantic document classification is an approach that classifies documents based on the diverse relationships between keywords in a document, without fully describing the hierarchical concepts between keywords. Our approach considers the deep meanings within the context of the document and radically enhances information retrieval performance. The Concept Weight Document Classification (CoWDC) method, which goes beyond existing keyword and simple thesaurus/ontology methods by fully considering the concept hierarchy of various concepts, is proposed, tested, and evaluated. The study recognizes that verifying the superiority of semantic retrieval technology through the CoWDC test results and integrating it efficiently into e-Government requires the creation of a thesaurus, management of the operating system, expansion of the knowledge base, and improvements in search service and accuracy at the national level.


Lexical Semantic Information and Pitch Accent in English (영어 어휘 의미 정보와 피치 액센트)

  • Jeon, Yoon-Shil;Kim, Kee-Ho;Lee, Yong-Jae
    • Speech Sciences / v.10 no.3 / pp.187-209 / 2003
  • In this paper, we examine whether the lexical information of a verb and its noun object affects the pitch accent patterns of verb phrase focus. Three types of verb-object combinations with different semantic weights are discussed: verbs with optional direct objects, objects with greater semantic weight than their verbs, and verbs and objects with equal semantic weight. Argument-structure-based works note that the pitch accent location in a focused phrase is closely related to argument structure and contextual information. For example, it has been argued that contextually new noun objects receive an accent while given noun objects do not. Unlike nouns, verbs can be accented or not in verb phrase focus regardless of whether they are given or new information (Selkirk 1984, 1992). However, the production experiment in this paper shows that the accenting of verbs is not fully optional, but is influenced by the lexical semantic information of the verbs. Noun objects carrying given information can be accented, and new noun objects can be deaccented, depending on the lexical information of the noun objects. The results demonstrate that, in addition to argument structure and the information status established by context sentences, the lexical semantic information of words influences the pitch accent location in a focused phrase.


Constructing the Semantic Information Model using A Collective Intelligence Approach

  • Lyu, Ki-Gon;Lee, Jung-Yong;Sun, Dong-Eon;Kwon, Dai-Young;Kim, Hyeon-Cheol
    • KSII Transactions on Internet and Information Systems (TIIS) / v.5 no.10 / pp.1698-1711 / 2011
  • Knowledge is often represented as a set of rules or a semantic network in intelligent systems. Recently, ontology has been widely used to represent semantic knowledge, because it organizes thesaurus and hierarchical information between concepts in a particular domain. However, it is not easy to collect semantic relationships among concepts, and ontology construction incurs much time and expense. Collective intelligence can be a good alternative approach to solve these problems. In this paper, we propose a collective intelligence approach based on Games With A Purpose (GWAP) to collect various semantic resources, such as words and word senses. We detail how to construct the semantic information model or ontology from the collected semantic resources, building a system named FunWords, a Korean lexical-based semantic resource collection tool. Experiments showed that the resources were grouped into common nouns, abstract nouns, adjectives, and neologisms. Finally, we analyzed their characteristics to acquire the semantic relationships noted above. For common nouns, structural semantic relationships such as hypernym and hyponym are highlighted. For abstract nouns, descriptive and characteristic semantic relationships such as synonym and antonym are prominent. For adjectives, semantic relationships such as description, status, and illustration (for example, color and sound) are expressed more. Last, for neologisms, semantic relationships such as description and characteristics are emphasized. Weighting the semantic relationships by these characteristics can help reduce time and cost, because unnecessary or only slightly related factors need not be considered. It can also improve expressive power, such as readability, by concentrating on the weighted characteristics. Our proposal, to collect semantic resources through the collective intelligence approach of GWAP (our FunWords) and to weight their semantic relationships, can help construct the semantic information model or ontology and would be a more effective and expressive alternative.

Semantic Network Analysis of Online News and Social Media Text Related to Comprehensive Nursing Care Service (간호간병통합서비스 관련 온라인 기사 및 소셜미디어 빅데이터의 의미연결망 분석)

  • Kim, Minji;Choi, Mona;Youm, Yoosik
    • Journal of Korean Academy of Nursing / v.47 no.6 / pp.806-816 / 2017
  • Purpose: As comprehensive nursing care service has gradually expanded, it has become necessary to explore the various opinions about it. The purpose of this study is to explore the large amount of text data regarding comprehensive nursing care service extracted from online news and social media by applying a semantic network analysis. Methods: The web pages of the Korean Nurses Association (KNA) News, major daily newspapers, and Twitter were crawled by searching the keyword 'comprehensive nursing care service' using Python. A morphological analysis was performed using KoNLPy. Nodes on a 'comprehensive nursing care service' cluster were selected, and frequency, edge weight, and degree centrality were calculated and visualized with Gephi for the semantic network. Results: A total of 536 news pages and 464 tweets were analyzed. In the KNA News and major daily newspapers, 'nursing workforce' and 'nursing service' were highly rated in frequency, edge weight, and degree centrality. On Twitter, the most frequent nodes were 'National Health Insurance Service' and 'comprehensive nursing care service hospital.' The nodes with the highest edge weight were 'national health insurance,' 'wards without caregiver presence,' and 'caregiving costs.' 'National Health Insurance Service' was highest in degree centrality. Conclusion: This study provides an example of how to use atypical big data for a nursing issue through semantic network analysis to explore diverse perspectives surrounding the nursing community through various media sources. Applying semantic network analysis to online big data to gather information regarding various nursing issues would help to explore opinions for formulating and implementing nursing policies.
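The three network measures named in this abstract (node frequency, edge weight, and degree centrality) can be illustrated with a minimal sketch. It assumes that an edge links two terms appearing in the same document and that degree centrality is normalized by the number of other nodes; the abstract does not state either choice, and the function name `build_cooccurrence_network` and the sample tokens are hypothetical. The study itself used KoNLPy and Gephi, which are not reproduced here.

```python
from collections import Counter
from itertools import combinations


def build_cooccurrence_network(documents):
    """Return (node_freq, edge_weight, degree_centrality) for tokenized documents.

    Assumption: two terms are linked when they co-occur in one document.
    """
    node_freq = Counter()
    edge_weight = Counter()
    for tokens in documents:
        node_freq.update(tokens)
        # Count each unordered pair of distinct terms once per document.
        for a, b in combinations(sorted(set(tokens)), 2):
            edge_weight[(a, b)] += 1

    # Collect the neighbours of every node from the edge list.
    neighbors = {}
    for a, b in edge_weight:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    # Degree centrality: share of the other nodes a node is connected to.
    n = len(node_freq)
    degree_centrality = {
        node: (len(neighbors.get(node, ())) / (n - 1) if n > 1 else 0.0)
        for node in node_freq
    }
    return node_freq, edge_weight, degree_centrality


if __name__ == "__main__":
    # Hypothetical tokenized posts, for illustration only.
    docs = [["nurse", "ward", "cost"], ["nurse", "ward"], ["nurse", "insurance"]]
    freq, edges, centrality = build_cooccurrence_network(docs)
    print(freq.most_common(2))
    print(edges.most_common(1))
    print(centrality)
```

A node can be frequent without being central, which is why the study reports the three measures separately.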

Semantic Similarity Measures Between Words within a Document using WordNet (워드넷을 이용한 문서내에서 단어 사이의 의미적 유사도 측정)

  • Kang, SeokHoon;Park, JongMin
    • Journal of the Korea Academia-Industrial cooperation Society / v.16 no.11 / pp.7718-7728 / 2015
  • Semantic similarity between words can be applied in many fields, including computational linguistics, artificial intelligence, and information retrieval. In this paper, we present a weighted method for measuring the semantic similarity between words in a document. The method uses the edge distance and depth of WordNet and calculates the semantic similarity between words on the basis of document information, namely word term frequencies (TF) and word concept frequencies (CF). Each word's weight is calculated from its TF and CF in the document. The method combines the edge distance between words, the depth of the subsumer, and the word weight in the document. We compared our scheme with other methods in experiments, and the proposed method outperformed the other similarity measures. Other methods, based simply on shortest distance or depth, had difficulty representing or merging this information. This paper considers shortest distance, depth, and the information of words in the document together, and thereby improves performance.
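As a rough illustration of how edge distance, subsumer depth, and a document-level word weight might be combined, the sketch below uses the well-known Wu-Palmer measure on a toy taxonomy and scales it by an averaged word weight. This is not the paper's formula, which the abstract does not give: the `PARENT` taxonomy, the `alpha` mixing parameter, and the way TF and CF are normalized and averaged are all assumptions made for the example.

```python
# Toy is-a taxonomy (child -> parent); a hypothetical stand-in for WordNet.
PARENT = {
    "entity": None,
    "animal": "entity",
    "dog": "animal",
    "cat": "animal",
    "vehicle": "entity",
    "car": "vehicle",
}


def path_to_root(word):
    """Return the chain of nodes from the word up to the root."""
    path = []
    while word is not None:
        path.append(word)
        word = PARENT[word]
    return path


def depth(word):
    """Depth counted in nodes, so the root has depth 1."""
    return len(path_to_root(word))


def lowest_common_subsumer(a, b):
    """Deepest node that is an ancestor of (or equal to) both words."""
    ancestors_a = set(path_to_root(a))
    for node in path_to_root(b):
        if node in ancestors_a:
            return node
    raise ValueError("words do not share a root")


def edge_distance(a, b):
    """Number of edges on the shortest is-a path between the two words."""
    lcs = lowest_common_subsumer(a, b)
    return depth(a) + depth(b) - 2 * depth(lcs)


def wu_palmer(a, b):
    """Wu-Palmer similarity: rewards a deep subsumer and a short path."""
    lcs = lowest_common_subsumer(a, b)
    return 2 * depth(lcs) / (depth(a) + depth(b))


def word_weight(word, tf, cf, alpha=0.5):
    """Assumed weight: a mix of normalized term and concept frequency."""
    return alpha * tf[word] / max(tf.values()) + (1 - alpha) * cf[word] / max(cf.values())


def weighted_similarity(a, b, tf, cf, alpha=0.5):
    """Assumed combination: taxonomy similarity scaled by the mean word weight."""
    w = (word_weight(a, tf, cf, alpha) + word_weight(b, tf, cf, alpha)) / 2
    return w * wu_palmer(a, b)


if __name__ == "__main__":
    tf = {"dog": 4, "cat": 2, "car": 1}  # hypothetical term frequencies
    cf = {"dog": 2, "cat": 2, "car": 1}  # hypothetical concept frequencies
    print(edge_distance("dog", "cat"), weighted_similarity("dog", "cat", tf, cf))
    print(edge_distance("dog", "car"), weighted_similarity("dog", "car", tf, cf))
```

Scaling by the word weight means two words that are close in the taxonomy but rare in the document score lower than an equally close pair that the document uses often.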

Latent Semantic Analysis Approach for Document Summarization Based on Word Embeddings

  • Al-Sabahi, Kamal;Zuping, Zhang;Kang, Yang
    • KSII Transactions on Internet and Information Systems (TIIS) / v.13 no.1 / pp.254-276 / 2019
  • Since the amount of information on the internet is growing rapidly, it is not easy for users to find information relevant to their queries. To tackle this issue, researchers are paying much attention to document summarization. The key to any successful document summarizer is a good document representation. Traditional approaches based on word overlap mostly fail to produce that kind of representation. Word embedding has shown good performance, allowing words to match on a semantic level. Naively concatenating word embeddings makes common words dominant, which in turn diminishes the representation quality. In this paper, we employ word embeddings to improve the weighting schemes for calculating the Latent Semantic Analysis input matrix. Two embedding-based weighting schemes are proposed and then combined to calculate the values of this matrix. They are modified versions of the augment weight and the entropy frequency that combine the strengths of traditional weighting schemes and word embedding. The proposed approach is evaluated on three English datasets: DUC 2002, DUC 2004, and Multilingual 2015 Single-document Summarization. Experimental results on the three datasets show that the proposed model achieves competitive performance compared with the state of the art, leading to the conclusion that it provides a better document representation and, as a result, a better document summary.
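For orientation, the sketch below builds an LSA input matrix from the traditional forms of the two schemes the abstract says it modifies: augmented term frequency as the local weight and an entropy-based global weight. It does not include the paper's embedding-based modifications, which the abstract does not specify. The function names and the sample sentences are hypothetical, and treating each sentence as one column is an assumption.

```python
import math


def augmented_weight(counts):
    """Local weight per term in one sentence: 0.5 + 0.5 * tf / max_tf."""
    if not counts:
        return {}
    max_tf = max(counts.values())
    return {term: 0.5 + 0.5 * c / max_tf for term, c in counts.items()}


def entropy_weight(sentence_counts):
    """Global weight per term: 1 + sum_j(p_ij * log p_ij) / log n."""
    n = len(sentence_counts)
    totals = {}
    for counts in sentence_counts:
        for term, c in counts.items():
            totals[term] = totals.get(term, 0) + c

    weights = {}
    for term, global_freq in totals.items():
        if n < 2:
            weights[term] = 1.0  # entropy is undefined for a single sentence
            continue
        h = 0.0
        for counts in sentence_counts:
            c = counts.get(term, 0)
            if c:
                p = c / global_freq
                h += p * math.log(p)
        weights[term] = 1.0 + h / math.log(n)
    return weights


def lsa_input_matrix(sentence_counts):
    """Return (terms, matrix) with one row per term and one column per sentence."""
    global_w = entropy_weight(sentence_counts)
    local_w = [augmented_weight(counts) for counts in sentence_counts]
    terms = sorted(global_w)
    matrix = [
        [local.get(term, 0.0) * global_w[term] for local in local_w]
        for term in terms
    ]
    return terms, matrix


if __name__ == "__main__":
    # Hypothetical per-sentence term counts.
    sentences = [{"cat": 2, "the": 1}, {"dog": 1, "the": 1}]
    terms, matrix = lsa_input_matrix(sentences)
    for term, row in zip(terms, matrix):
        print(term, row)
```

A term spread evenly across all sentences, such as "the" here, gets a global weight near zero, which is the traditional remedy for the common-word dominance the abstract describes.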

Research on Function and Policy for e-Government System using Semantic Technology (전자정부내 의미기반 기술 도입에 따른 기능 및 정책 연구)

  • Jang, Young-Cheol
    • Journal of Korea Society of Industrial Information Systems / v.13 no.5 / pp.22-28 / 2008
  • This paper aims to offer a solution based on semantic document classification to improve e-Government utilization and efficiency for people who use their own information retrieval systems and linguistic expressions. Generally, semantic document classification is an approach that classifies documents based on the diverse relationships between keywords in a document, without fully describing the hierarchical concepts between keywords. Our approach considers the deep meanings within the context of the document and radically enhances information retrieval performance. The Concept Weight Document Classification (CoWDC) method, which goes beyond existing keyword and simple thesaurus/ontology methods by fully considering the concept hierarchy of various concepts, is proposed, tested, and evaluated. The study recognizes that verifying the superiority of semantic retrieval technology through the CoWDC test results and integrating it efficiently into e-Government requires the creation of a thesaurus, management of the operating system, expansion of the knowledge base, and improvements in search service and accuracy at the national level.


Disambiguation of Homograph Suffixes using Lexical Semantic Network(U-WIN) (어휘의미망(U-WIN)을 이용한 동형이의어 접미사의 의미 중의성 해소)

  • Bae, Young-Jun;Ock, Cheol-Young
    • KIPS Transactions on Software and Data Engineering / v.1 no.1 / pp.31-42 / 2012
  • To process Korean suffix-derived nouns, most Korean language processing systems register them in a dictionary. However, this approach is limited because suffixes are highly productive. It is therefore necessary to analyze unregistered suffix-derived nouns semantically. In this paper, we propose a method to disambiguate homograph suffixes using the Korean lexical semantic network (U-WIN) for the semantic analysis of suffix-derived nouns. For the experiments, we used 33,104 suffix-derived nouns containing homograph suffixes from the morphologically and semantically tagged Sejong Corpus. We first semantically tagged the homograph suffixes, extracted the roots of the suffix-derived nouns, and mapped the roots to nodes in U-WIN. We then assigned distance weights to the U-WIN nodes that can combine with each homograph suffix and used these weights to disambiguate the homograph suffixes. Experiments on the 35 homograph suffixes that occur in the Sejong Corpus, out of the 49 homograph suffixes in a Korean dictionary, yielded 91.01% accuracy.

A New Semantic Distance Measurement Method using TF-IDF in Linked Open Data (링크드 오픈 데이터에서 TF-IDF를 이용한 새로운 시맨틱 거리 측정 기법)

  • Cho, Jung-Gil
    • Journal of the Korea Convergence Society / v.11 no.10 / pp.89-96 / 2020
  • Linked Data allows structured data to be published in a standard way so that datasets from various domains can be interlinked. With the rapid evolution of Linked Open Data (LOD), researchers are exploiting it to solve particular problems such as semantic similarity assessment. In this paper, we propose a method, built on the basic concept of Linked Data Semantic Distance (LDSD), for calculating the Linked Data semantic distance between resources, which can be used in an LOD-based recommender system. The proposed semantic distance measurement model is based on a similarity measurement that combines the LOD-based semantic distance with a new link weight using TF-IDF, which is well known in the field of information retrieval. To verify the effectiveness of this approach, performance was evaluated in the context of an LOD-based recommender system using mixed data from DBpedia and MovieLens. Experimental results show that the proposed method achieves higher accuracy than other similar methods. In addition, it improves the accuracy of the recommender system by expanding the range of the semantic distance calculation.
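The idea of weighting links in a semantic distance can be sketched as follows. The unweighted function follows the direct form of LDSD, 1 / (1 + number of direct links in either direction). The weighted variant is only one possible reading of "a new link weight using TF-IDF": it assumes each link contributes a smoothed inverse document frequency of its predicate, with each subject resource treated as a document. The abstract does not define the paper's weight, so the functions `predicate_idf` and `weighted_direct_distance` and the sample triples are hypothetical.

```python
import math


def direct_distance(triples, a, b):
    """Unweighted direct LDSD: 1 / (1 + direct links between a and b)."""
    links = sum(1 for s, _, o in triples if (s, o) in ((a, b), (b, a)))
    return 1.0 / (1.0 + links)


def predicate_idf(triples):
    """Assumed weight: smoothed IDF of each predicate over subject resources."""
    subjects = {s for s, _, _ in triples}
    users = {}
    for s, p, _ in triples:
        users.setdefault(p, set()).add(s)
    n = len(subjects)
    return {p: math.log(1.0 + n / len(subs)) for p, subs in users.items()}


def weighted_direct_distance(triples, a, b):
    """Direct distance where each link counts by its predicate's IDF."""
    idf = predicate_idf(triples)
    total = sum(idf[p] for s, p, o in triples if (s, o) in ((a, b), (b, a)))
    return 1.0 / (1.0 + total)


if __name__ == "__main__":
    # Hypothetical (subject, predicate, object) triples.
    triples = [
        ("A", "starring", "B"),
        ("B", "knows", "A"),
        ("A", "type", "Film"),
        ("C", "type", "Film"),
        ("B", "type", "Person"),
    ]
    print(direct_distance(triples, "A", "B"))
    print(weighted_direct_distance(triples, "A", "B"))
    print(weighted_direct_distance(triples, "A", "C"))
```

Under this assumption, links through rare predicates pull two resources closer than links through predicates that almost every resource uses, and two resources with no direct link keep the maximum distance of 1.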

Ontology Selection Ranking Model based on Semantic Similarity Approach (의미적 유사성에 기반한 온톨로지 선택 랭킹 모델)

  • Oh, Sun-Ju;Ahn, Joong-Ho;Park, Jin-Soo
    • The Journal of Society for e-Business Studies / v.14 no.2 / pp.95-116 / 2009
  • Ontologies have provided support for integrating heterogeneous and distributed information. More and more ontologies and tools have been developed in various domains. However, building ontologies requires much time and effort, so ontologies need to be shared and reused among users. Specifically, finding the desired ontology in an ontology repository will benefit users. Most past studies on retrieving and ranking ontologies have focused on lexical-level support. In those cases, it is impossible to find an ontology that includes the concepts users want to use at the semantic level. Most ontology libraries and ontology search engines have not provided semantic matching capability. Retrieving the ontology that users want requires a new ontology selection and ranking mechanism based on semantic similarity matching. We propose an ontology selection and ranking model consisting of selection criteria and metrics with enhanced semantic matching capabilities. The proposed model has two novel features that distinguish it from previous research models. First, it enhances the ontology selection and ranking method practically and effectively by enabling semantic matching of taxonomy or relational linkage between concepts. Second, it identifies which measures should be used to rank ontologies in a given context and what weight should be assigned to each selection measure.
