• Title/Summary/Keyword: Semantic Importance

Automatic In-Text Keyword Tagging based on Information Retrieval

  • Kim, Jin-Suk;Jin, Du-Seok;Kim, Kwang-Young;Choe, Ho-Seop
    • Journal of Information Processing Systems
    • /
    • v.5 no.3
    • /
    • pp.159-166
    • /
    • 2009
  • As shown in Wikipedia, tagging or cross-linking through major keywords in a document collection improves not only the readability of documents but also responsive and adaptive navigation among related documents. In recent years, the Semantic Web has increased the importance of social tagging as a key feature of Web 2.0 and, as its crucial phenotype, the tag cloud has emerged to the public. In this paper we provide an efficient method of automated in-text keyword tagging based on a large-scale controlled term collection or keyword dictionary, where the computational complexity of O(mN), if a pattern matching algorithm is used, can be reduced to O(m log N), if an Information Retrieval technique is adopted, where m is the length of the target document and N is the total number of candidate terms to be tagged. The results show that automatic in-text tagging with keywords filtered by Information Retrieval is about 6 to 40 times faster than the fastest pattern matching algorithm.
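
The complexity argument in this abstract can be illustrated with a small sketch. It is not the paper's Information Retrieval technique: as a stand-in, it looks up each candidate phrase of the document in a sorted term list with binary search, so each lookup costs O(log N) instead of a scan over all N terms. The function name `tag_keywords`, the `max_words` limit, and whitespace tokenization (no punctuation handling) are assumptions made for illustration.

```python
import bisect

def tag_keywords(text, dictionary, max_words=3):
    """Return (word_index, term) pairs for dictionary terms found in text.

    Hypothetical helper: binary search on a sorted term list stands in for
    an index lookup, giving O(log N) per candidate phrase.
    """
    terms = sorted(t.lower() for t in dictionary)  # sorted once up front
    words = text.lower().split()                   # naive whitespace tokens
    hits = []
    i = 0
    while i < len(words):
        matched_len = 0
        # Prefer the longest phrase that starts at word i.
        for n in range(min(max_words, len(words) - i), 0, -1):
            candidate = " ".join(words[i:i + n])
            pos = bisect.bisect_left(terms, candidate)
            if pos < len(terms) and terms[pos] == candidate:
                hits.append((i, candidate))
                matched_len = n
                break
        # Skip past a matched phrase, otherwise advance one word.
        i += matched_len if matched_len else 1
    return hits
```

With a dictionary of `["semantic web", "tag cloud", "web"]`, the phrase "semantic web" is tagged as one term rather than as the shorter "web", because longer candidates are tried first.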

A Framework for Internet of Things (IoT) Data Management

  • Kim, Kyung-Chang
    • Journal of the Korea Society of Computer and Information
    • /
    • v.24 no.3
    • /
    • pp.159-166
    • /
    • 2019
  • The collection and manipulation of Internet of Things (IoT) data is increasing at a fast pace and its importance is recognized in every sector of our society. For efficient utilization of IoT data, the vast and varied IoT data needs to be reliable and meaningful. In this paper, we propose an IoT framework to realize this need. The IoT framework is based on a four-layer IoT architecture onto which context-aware computing technology is applied. If the collected IoT data is unreliable, it cannot be used for its intended purpose and the whole service using the data must be abandoned. In this paper, we include techniques to remove uncertainty in the early stage of IoT data capture and collection, resulting in reliable data. Since the data coming out of the various IoT devices have different formats, it is important to convert them into a standard format before further processing. We propose the RDF format as the standard format for all IoT data. In addition, it is not feasible to process all captured IoT data from the sensor devices. In order to decide which data to process and understand, we propose to use contexts and reasoning based on these contexts. For reasoning, we propose to use standard AI and statistical techniques. We also propose an experiment environment that can be used to develop an IoT application to realize the IoT framework.

Semantic Analysis of Information Assurance Concept : A Literature Review (문헌 연구를 통한 정보보증 개념의 구문 분석)

  • Kang, Ji-Won;Choi, Heon-jun;Lee, Hanhee
    • Convergence Security Journal
    • /
    • v.19 no.1
    • /
    • pp.31-40
    • /
    • 2019
  • Today, information security (INFOSEC) as a discipline is gaining more and more importance with the emergence and expansion of cyberspace. Originating from the Joint Doctrine for Information Operation (Joint Pub 3-13) by the U.S. Department of Defense, 'information assurance (IA)' is a concept widely used in the relevant field. Having grown out of the practice of information security, it encompasses broader and more proactive protection that includes countermeasures and repair, security management throughout an information system (IS)'s life-cycle, and trustworthiness of an IS in the process of risk analysis. In Korea, many industry professionals tend to misunderstand IA, remaining unaware of the conceptual differences between IA and INFOSEC. On this account, the current study attempted to provide a combined definition of IA by reviewing the relevant literature. The study demonstrated the validity of the wording used in the proposed definition, phrase by phrase.

Aspect-Based Sentiment Analysis with Position Embedding Interactive Attention Network

  • Xiang, Yan;Zhang, Jiqun;Zhang, Zhoubin;Yu, Zhengtao;Xian, Yantuan
    • Journal of Information Processing Systems
    • /
    • v.18 no.5
    • /
    • pp.614-627
    • /
    • 2022
  • Aspect-based sentiment analysis aims to discover the sentiment polarity towards an aspect from user-generated natural language. So far, most methods use only the implicit position information of the aspect in the context, instead of directly utilizing the position relationship between the aspect and the sentiment terms. In fact, words neighboring the aspect terms should be given more attention than other words in the context. This paper studies the influence of different position embedding methods on the sentiment polarities of given aspects, and proposes a position embedding interactive attention network based on a long short-term memory network. Firstly, it uses the position information of the context simultaneously in the input layer and the attention layer. Secondly, it mines the importance of different context words for the aspect with the interactive attention mechanism. Finally, it generates a valid representation of the aspect and the context for sentiment classification. The proposed model was evaluated on the datasets of Semantic Evaluation 2014. Compared with other baseline models, the accuracy of our model increases by about 2% on the restaurant dataset and 1% on the laptop dataset.

Analyzing Self-Introduction Letter of Freshmen at Korea National College of Agricultural and Fisheries by Using Semantic Network Analysis : Based on TF-IDF Analysis (언어네트워크분석을 활용한 한국농수산대학 신입생 자기소개서 분석 - TF-IDF 분석을 기초로 -)

  • Joo, J.S.;Lee, S.Y.;Kim, J.S.;Kim, S.H.;Park, N.B.
    • Journal of Practical Agriculture & Fisheries Research
    • /
    • v.23 no.1
    • /
    • pp.89-104
    • /
    • 2021
  • Based on TF-IDF weights, which evaluate the importance of words that play a key role, semantic network analysis (SNA) was conducted on the self-introduction letters of freshmen at Korea National College of Agriculture and Fisheries (KNCAF) in 2020. The top three words by TF-IDF weight were agriculture, mathematics, study (Q. 1); clubs, plants, friends (Q. 2); friends, clubs, opinions (Q. 3); and mushrooms, insects, fathers (Q. 4). In the relationships between words, the words with high betweenness centrality were reason, high school, attending (Q. 1); garbage, high school, school (Q. 2); importance, misunderstanding, completion (Q. 3); and processing, feed, farmhouse (Q. 4). The words with high degree centrality were high school, inquiry, grades (Q. 1); garbage, cleanup, class time (Q. 2); opinion, meetings, volunteer activities (Q. 3); and processing, space, practice (Q. 4). Word pairs with a high frequency of co-occurrence, that is, high correlation, were 'certification - acquisition', 'problem - solution', 'science - life', and 'misunderstanding - concession'. In the cluster analysis, the number of clusters obtained from the height of the cluster dendrogram was 2 (Q. 1), 4 (Q. 2 and Q. 4), and 5 (Q. 3). Cohesion within clusters was high, and heterogeneity between clusters was clearly shown.
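
For readers unfamiliar with the weighting used in this abstract, the following is a minimal sketch of one common TF-IDF variant (relative term frequency times the natural log of inverse document frequency). The abstract does not state which variant or tooling the authors used, so the formula choice and the names `tf_idf` and `top_terms` are assumptions.

```python
import math
from collections import Counter

def tf_idf(docs):
    """docs: list of token lists. Returns one {term: weight} dict per document.

    Assumed variant: tf = count / document length, idf = ln(N / df).
    """
    n_docs = len(docs)
    doc_freq = Counter()
    for tokens in docs:
        doc_freq.update(set(tokens))  # count each term once per document
    weights = []
    for tokens in docs:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights

def top_terms(weight, k=3):
    """Return the k highest-weighted terms, ties broken alphabetically."""
    return sorted(weight, key=lambda t: (-weight[t], t))[:k]
```

A term that appears in every document gets weight zero under this variant, which is why frequent but uninformative words drop out of the top-ranked list.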

Story-based Information Retrieval (스토리 기반의 정보 검색 연구)

  • You, Eun-Soon;Park, Seung-Bo
    • Journal of Intelligence and Information Systems
    • /
    • v.19 no.4
    • /
    • pp.81-96
    • /
    • 2013
  • Video information retrieval has become a very important issue because of the explosive increase in video data from Web content development. Meanwhile, content-based video analysis using visual features has been the main source for video information retrieval and browsing. Content in video can be represented with content-based analysis techniques, which can extract various features from audio-visual data such as frames, shots, colors, texture, or shape. Moreover, similarity between videos can be measured through content-based analysis. However, a movie, which is one of the typical types of video data, is organized by story as well as by audio-visual data. This causes a semantic gap between the significant information recognized by people and the information resulting from content-based analysis when content-based video analysis using only low-level audio-visual data is applied to movie information retrieval. The reason for this semantic gap is that the story line of a movie is high-level information, with relationships in the content that change as the movie progresses. Information retrieval related to the story line of a movie cannot be executed by content-based analysis techniques alone. A formal model is needed, which can determine relationships among movie contents, or track meaning changes, in order to accurately retrieve the story information. Recently, story-based video analysis techniques have emerged using a social network concept for story information retrieval. These approaches represent a story by using the relationships between characters in a movie, but these approaches have problems. First, they do not express dynamic changes in relationships between characters according to story development. Second, they miss profound information, such as emotions indicating the identities and psychological states of the characters. Emotion is essential to understanding a character's motivation, conflict, and resolution. 
Third, they do not take account of events and background that contribute to the story. As a result, this paper reviews the importance and weaknesses of previous video analysis methods ranging from content-based approaches to story analysis based on social networks. Also, we suggest necessary elements, such as character, background, and events, based on narrative structures introduced in the literature. First, we extract characters' emotional words from the script of the movie Pretty Woman by using the hierarchical attribute of WordNet, which is an extensive English thesaurus. WordNet offers relationships between words (e.g., synonyms, hypernyms, hyponyms, antonyms). We present a method to visualize the emotional pattern of a character over time. Second, a character's inner nature must be predetermined in order to model a character arc that can depict the character's growth and development. To this end, we analyze the amount of the character's dialogue in the script and track the character's inner nature using social network concepts, such as in-degree (incoming links) and out-degree (outgoing links). Additionally, we propose a method that can track a character's inner nature by tracing indices such as degree, in-degree, and out-degree of the character network in a movie through its progression. Finally, the spatial background where characters meet and where events take place is an important element in the story. We take advantage of the movie script to extract significant spatial background and suggest a scene map describing spatial arrangements and distances in the movie. Important places where main characters first meet or where they stay during long periods of time can be extracted through this scene map. In view of the aforementioned three elements (character, event, background), we extract a variety of information related to the story and evaluate the performance of the proposed method. 
We can track story information extracted over time and detect a change in the character's emotion or inner nature, spatial movement, and conflicts and resolutions in the story.
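
The in-degree and out-degree indices mentioned in this abstract can be sketched as follows. The sketch assumes, purely for illustration, that a script has already been reduced to (speaker, listener) pairs; the abstract does not describe the authors' actual data format, and the function name `degree_profile` and the character labels are hypothetical.

```python
from collections import Counter

def degree_profile(dialogues):
    """dialogues: iterable of (speaker, listener) pairs.

    Returns {character: (in_degree, out_degree)}, where out-degree counts
    lines a character speaks and in-degree counts lines addressed to them.
    """
    out_deg = Counter()
    in_deg = Counter()
    for speaker, listener in dialogues:
        out_deg[speaker] += 1
        in_deg[listener] += 1
    characters = set(out_deg) | set(in_deg)
    return {c: (in_deg[c], out_deg[c]) for c in characters}
```

Calling the function once per scene, or on a growing prefix of the script, would give the per-character indices over the progression of the movie that the abstract describes tracking.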

Korea National College of Agriculture and Fisheries in Naver News by Web Crawling : Based on Keyword Analysis and Semantic Network Analysis (웹 크롤링에 의한 네이버 뉴스에서의 한국농수산대학 - 키워드 분석과 의미연결망분석 -)

  • Joo, J.S.;Lee, S.Y.;Kim, S.H.;Park, N.B.
    • Journal of Practical Agriculture & Fisheries Research
    • /
    • v.23 no.2
    • /
    • pp.71-86
    • /
    • 2021
  • This study was conducted to find information on the university's image from words related to 'Korea National College of Agriculture and Fisheries (KNCAF)' in Naver News. For this purpose, word frequency analysis, TF-IDF evaluation, and semantic network analysis were performed using web crawling technology. In the word frequency analysis, 'agriculture', 'education', 'support', 'farmer', 'youth', 'university', 'business', 'rural', and 'CEO' were important words. In the TF-IDF evaluation, the key words were 'farmer', 'drone', 'agricultural and livestock food department', 'Jeonbuk', 'young farmer', 'agriculture', 'Chonju', 'university', 'device', and 'spreading'. In the semantic network analysis, the bigrams showed high correlations in the order of 'youth' - 'farmer', 'digital' - 'agriculture', 'farming' - 'settlement', 'agriculture' - 'rural', and 'digital' - 'turnover'. When the importance of keywords was evaluated with five centrality indices, 'agriculture' ranked first. The keywords in second place were 'farmers' (Cc, Cb), 'education' (Cd, Cp), and 'future' (Ce). Spearman's rank correlation coefficient between centrality indices showed that the rankings by degree centrality and PageRank centrality were the most similar. In the KNCAF articles of Naver News, the important words in terms of word frequency were 'agriculture', 'education', 'support', 'farmer', and 'youth'. However, in the evaluation that includes document frequency, words such as 'farmer', 'drone', 'Ministry of Agriculture, Food and Rural Affairs', 'Jeonbuk', and 'young farmers' were found to be key words. For the centrality analysis, which considers the network connectivity between words, Cd and Cp were suitable for evaluation. The words with strong centrality were 'agriculture', 'education', 'future', 'farmer', 'digital', 'support', and 'utilization'.
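
The rank comparison between centrality indices in this abstract relies on Spearman's rank correlation. A minimal sketch of the textbook formula for untied data, rho = 1 - 6 * sum(d^2) / (n (n^2 - 1)), is shown below. It assumes no tied scores and at least two items; the function names are illustrative, and the abstract does not say how the authors computed the coefficient.

```python
def ranks(values):
    """Rank 1 goes to the largest value. Assumes there are no ties."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result

def spearman(x, y):
    """Spearman's rho for two equal-length score lists without ties (n >= 2)."""
    n = len(x)
    rx, ry = ranks(x), ranks(y)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))  # squared rank differences
    return 1 - 6 * d2 / (n * (n * n - 1))
```

Here `x` and `y` would be two centrality scores for the same list of keywords; a value near 1 means the two indices order the keywords almost identically.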

Machine Learning Based MMS Point Cloud Semantic Segmentation (머신러닝 기반 MMS Point Cloud 의미론적 분할)

  • Bae, Jaegu;Seo, Dongju;Kim, Jinsoo
    • Korean Journal of Remote Sensing
    • /
    • v.38 no.5_3
    • /
    • pp.939-951
    • /
    • 2022
  • The most important factor in designing autonomous driving systems is to recognize the exact location of the vehicle within the surrounding environment. To date, various sensors and navigation systems have been used for autonomous driving systems; however, all have limitations. Therefore, the need for high-definition (HD) maps that provide high-precision infrastructure information for safe and convenient autonomous driving is increasing. HD maps are drawn using three-dimensional point cloud data acquired through a mobile mapping system (MMS). However, this process requires manual work due to the large numbers of points and drawing layers, increasing the cost and effort associated with HD mapping. The objective of this study was to improve the efficiency of HD mapping by segmenting semantic information in an MMS point cloud into six classes: roads, curbs, sidewalks, medians, lanes, and other elements. Segmentation was performed using various machine learning techniques including random forest (RF), support vector machine (SVM), k-nearest neighbor (KNN), and gradient-boosting machine (GBM), and 11 variables including geometry, color, intensity, and other road design features. MMS point cloud data for a 130-m section of a five-lane road near Minam Station in Busan were used to evaluate the segmentation models; the average F1 scores of the models were 95.43% for RF, 92.1% for SVM, 91.05% for GBM, and 82.63% for KNN. The RF model showed the best segmentation performance, with F1 scores of 99.3%, 95.5%, 94.5%, 93.5%, and 90.1% for roads, sidewalks, curbs, medians, and lanes, respectively. The variable importance results of the RF model showed high mean decrease accuracy and mean decrease Gini for XY dist. and Z dist. variables related to road design, respectively. Thus, variables related to road design contributed significantly to the segmentation of semantic information. 
The results of this study demonstrate the applicability of segmentation of MMS point cloud data based on machine learning, and will help to reduce the cost and effort associated with HD mapping.
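
The per-class and average F1 scores reported in this abstract can be illustrated with a short sketch that computes F1 = 2TP / (2TP + FP + FN) for each class and then takes an unweighted mean. Whether the authors averaged in exactly this way is an assumption, and the function names and toy labels are hypothetical.

```python
def f1_per_class(y_true, y_pred):
    """Return {label: F1} computed from true and predicted label lists."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = {}
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores

def macro_f1(scores):
    """Unweighted mean of per-class F1 scores (assumed averaging scheme)."""
    return sum(scores.values()) / len(scores)
```

In a point cloud setting, `y_true` and `y_pred` would hold one class label per point, so classes with few points (such as lanes) count as much as large classes in the unweighted mean.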

A Comparison of the Freshmen's Cognitive Frame about the 'Crisis of Earth' ('위기의 지구'에 대한 인지프레임 비교: 대학교 신입생들 대상으로)

  • Chung, Duk Ho;Choi, Hyeon A;Park, Seon Ok
    • Journal of the Korean earth science society
    • /
    • v.37 no.2
    • /
    • pp.117-131
    • /
    • 2016
  • The purpose of this study was to compare freshmen's cognitive frames about the 'Crisis of the Earth' according to whether they had taken the Earth Science I course in high school, and to confirm whether the frames reasonably reflect the goal of the curriculum. Data were collected from 67 freshmen who had graduated from high school. All participants were asked to express the 'Crisis of the Earth' in a painting with an explanation, and we then picked meaningful units from the paintings. We analyzed the words and frames presented in the paintings using Semantic Network Analysis. The results are as follows. First, when both groups (one that took the course and one that did not) built their cognitive frames for the 'Crisis of the Earth', they reasonably connected the areas that make up the global environment and understood that the relations among these areas change constantly through interaction with each other. Second, when configuring a cognitive frame about the 'Crisis of the Earth', both groups reflected the characteristics of its interrelationship with human activities. In particular, the group that took the Earth Science I course fully reflected the goal of the curriculum. It is suggested that students recognize the 'Crisis of the Earth' not only from a cosmic perspective but also from the Earth's interior, since most students strongly connected it to phenomena of the Earth's interior rather than to the Earth's outward symptoms. In addition, it is recommended that the Earth science curriculum put more emphasis on understanding the importance of solving the problems of the Earth's crisis.

A Case Study on the Types of Queries' Relations for Recognizing User intention (검색의도 파악을 위한 질의어 관계유형에 관한 사례연구)

  • Kwon, Soon-Jin;Kim, Won-Il;Yoo, Seong-Joon
    • Journal of the Korean Institute of Intelligent Systems
    • /
    • v.21 no.4
    • /
    • pp.414-422
    • /
    • 2011
  • IR (Information Retrieval) systems use methods that compare the relationships between a query and an index to identify documents that may fit the user's query keyword. However, these methods usually ignore the importance of relations that are not expressed in the query. Therefore, in this study, we describe how to refine the queries' relations from keywords and to reveal the hidden intent. A useful relationship between query and keyword in IR was studied, and we classified the types of relation. First of all, we reviewed research related to semantic relationships and ontologies in foreign and domestic studies, analyzed semantic network practices and information retrieval technology, and extracted and classified the types of relationships in the site's real-world data to which information retrieval technologies are applied. Next, we sought to solve problems that occur frequently in the situations that searchers typically face. In current search technology, the mesh search results are poured out by simply comparing a query with index terms. Therefore, an intelligent search that fits users' intent is required. The hidden relationships between two queries have to be revealed in order to identify the searcher's intent. By analyzing practical cases of queries and classifying them into nine kinds of relationship types, we proposed a method to design relation revealing and role naming, and we have also illustrated the limitations of that method.