• Title/Summary/Keyword: 그래프 검색 (graph search)


Text Extraction and Summarization from Web News (웹 뉴스의 기사 추출과 요약)

  • Han, Kwang-Rok;Sun, Bok-Keun;Yoo, Hyoung-Sun
    • Journal of the Korea Society of Computer and Information / v.12 no.5 / pp.1-10 / 2007
  • Many types of information provided through the web, including news content, contain unnecessary clutter. This clutter makes it difficult to build automated information processing systems for tasks such as document summarization, extraction and retrieval. We propose a system that extracts and summarizes news content from the web. The extraction system receives news content in HTML as input, builds an element tree similar to a DOM tree, and extracts text from the element tree while removing clutter by using the hyperlink attributes of HTML tags. The extracted text is passed to the summarization system, which extracts key sentences from it. We implement the summarization system using a co-occurrence relation graph. The summarized sentences are expected to be transmissible to PDAs or cellular phones through message services such as SMS.
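
The abstract does not say how the co-occurrence relation graph is scored, so the following is only a minimal sketch under stated assumptions: words are nodes, two words are linked when they appear in the same sentence, and a sentence is ranked by the summed weighted degree of its words. The function names (`split_sentences`, `tokenize`, `key_sentences`) and the scoring rule are illustrative, not the authors' method.

```python
import re
from collections import defaultdict
from itertools import combinations


def split_sentences(text):
    # Naive splitter (assumption): break after ., ! or ? followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    # Lowercased ASCII word tokens; a real system would also drop stop words.
    return set(re.findall(r"[a-z0-9]+", sentence.lower()))


def key_sentences(text, top_n=1):
    sentences = split_sentences(text)
    tokens = [tokenize(s) for s in sentences]

    # Co-occurrence graph: edge weight = number of sentences sharing both words.
    weight = defaultdict(int)
    for words in tokens:
        for a, b in combinations(sorted(words), 2):
            weight[(a, b)] += 1

    # Word score = weighted degree in the co-occurrence graph.
    degree = defaultdict(int)
    for (a, b), w in weight.items():
        degree[a] += w
        degree[b] += w

    # Sentence score = sum of the scores of its words (hypothetical rule).
    scores = [sum(degree[w] for w in words) for words in tokens]
    ranked = sorted(range(len(sentences)), key=lambda i: (-scores[i], i))
    # Return the selected sentences in their original order.
    return [sentences[i] for i in sorted(ranked[:top_n])]


text = ("Graph search finds paths. "
        "Graph search on a graph database finds short paths quickly. "
        "Cats sleep.")
print(key_sentences(text, top_n=1))
```

This scoring favors long sentences, so a practical summarizer would probably normalize by sentence length.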


Concurrency Control based on Serialization Graph for Query Transactions in Broadcast Environment : CCSG/QT (방송환경에서 질의 거래를 위해 직렬화 그래프에 기반을 둔 동시성 제어 기법)

  • 이욱현;황부현
    • Journal of KIISE:Databases / v.30 no.1 / pp.95-107 / 2003
  • The broadcast environment is asymmetric: the communication bandwidth available from the server to clients is typically much greater than in the opposite direction. In addition, most mobile computing systems allow mostly read-only transactions from mobile clients, which retrieve different types of information such as stock data, traffic information and news updates. Because previous concurrency control protocols do not consider these characteristics, performance degrades when they are applied to the broadcast environment. In this paper, we propose an efficient concurrency control scheme for query transactions in the broadcast environment. By adopting weak consistency, which is the appropriate correctness criterion for read-only transactions, the scheme satisfies the following requirements: (1) the mutual consistency of data maintained by the server and read by clients, and (2) the currency of data read by clients. We also use a serialization graph to check weak consistency efficiently. As a result, we improve performance by reducing the unnecessary aborts and restarts of read-only transactions that occur when global serializability is adopted.
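
The abstract does not give the details of the CCSG/QT check, so the sketch below shows only the generic idea behind any serialization graph test: transactions are nodes, conflicts are directed edges, and a schedule is acceptable only while the graph stays acyclic. The transaction names `U1`, `U2` and `Q` and the function `has_cycle` are hypothetical.

```python
from collections import defaultdict


def has_cycle(edges):
    """Return True if the directed graph given as (u, v) pairs has a cycle."""
    graph = defaultdict(list)
    nodes = set()
    for u, v in edges:
        graph[u].append(v)
        nodes.update((u, v))

    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on the DFS stack, finished
    color = {n: WHITE for n in nodes}

    def visit(n):
        color[n] = GRAY
        for m in graph[n]:
            if color[m] == GRAY:  # back edge: a cycle exists
                return True
            if color[m] == WHITE and visit(m):
                return True
        color[n] = BLACK
        return False

    return any(color[n] == WHITE and visit(n) for n in nodes)


# Query Q reads a value written by update U1, and U2 later overwrites
# another item that Q has already read: U1 -> Q -> U2.
edges = [("U1", "Q"), ("Q", "U2")]
print(has_cycle(edges))                   # acyclic, Q may commit
print(has_cycle(edges + [("U2", "U1")]))  # cyclic, Q would be aborted
```

Under weak consistency, each query would presumably be checked only against the update transactions rather than against other queries, which is where the reduction in aborts would come from.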

Development of Technology Mapping Algorithm for CPLD by Considering Time Constraint (시간제약 조건을 고려한 CPLD 기술 매핑 알고리즘 개발)

  • Kim, Hi-Seok;Byun, Sang-Zoon
    • Journal of the Korean Institute of Telematics and Electronics C / v.36C no.6 / pp.9-17 / 1999
  • In this paper, we propose a new technology mapping algorithm for CPLDs under a time constraint. In our algorithm, a given logic equation is first represented as a DAG, and the DAG is then reconstructed by replicating every node whose out-degree is 2 or more. This minimizes both the delay time and the number of CLBs. After the number of multi-levels is defined and the cost of each node is calculated, the graph is partitioned to fit k, the number of OR terms within a CLB. The partitioned nodes are merged through collapsing, and bin packing is performed to fit the number of OR terms within a CLB. In experiments on the MCNC logic synthesis benchmark circuits, we show that the proposed technology mapping algorithm reduces the delay time and the number of CLBs much more than existing technology mapping tools.
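
The replication step can be pictured as unfolding the DAG into a tree: any node that feeds two or more gates is copied once per use. The sketch below is an assumption-laden illustration, not the authors' procedure; the dictionary layout (gate name mapped to its list of inputs) and the function names are hypothetical, and it also copies primary inputs, which a real mapper would likely leave shared.

```python
def expand_to_tree(dag, root):
    """Unfold the logic DAG under `root` into a tree of (name, children) tuples.

    `dag` maps each gate to the list of its input nodes; nodes missing from
    `dag` are treated as primary inputs. A node with out-degree >= 2 is
    visited once per fan-out edge, so it is replicated in the result.
    """
    return (root, [expand_to_tree(dag, child) for child in dag.get(root, [])])


def count_nodes(tree):
    _, children = tree
    return 1 + sum(count_nodes(c) for c in children)


# Gate n feeds both g and h, so its out-degree is 2.
dag = {"f": ["g", "h"], "g": ["n"], "h": ["n"], "n": ["a", "b"]}
tree = expand_to_tree(dag, "f")
print(count_nodes(tree))  # 6 distinct nodes become 9 after replication
```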


Knowledge Graph-based Korean New Words Detection Mechanism for Spam Filtering (스팸 필터링을 위한 지식 그래프 기반의 신조어 감지 매커니즘)

  • Kim, Ji-hye;Jeong, Ok-ran
    • Journal of Internet Computing and Services / v.21 no.1 / pp.79-85 / 2020
  • Today, spam texts on smartphones are blocked by a simple string comparison between text messages and spam keywords, or by blocking spam phone numbers. As a result, spam texts are sent in gradually changing forms to prevent them from being blocked automatically. In particular, words that appear in spam keyword lists are rewritten as abnormal words using special characters, Chinese characters, and whitespace so that a simple string match does not detect them. Traditional spam filtering methods cannot block these spam texts well. Therefore, new technologies are needed to respond to changing spam text messages. In this paper, we propose a knowledge graph-based new word detection mechanism that can detect new words frequently used in spam texts and respond to changing spam texts. We also show experimental results for the performance obtained when the detected Korean new words are applied to the Naive Bayes algorithm.
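
To make the weakness of simple string matching concrete, the sketch below compares a plain substring match with a baseline that strips special characters and whitespace first. It is not the knowledge graph mechanism of the paper; the keyword, the sample message and the function names are made up for illustration.

```python
def naive_match(message, keywords):
    # Simple string comparison: flag the message if any keyword occurs verbatim.
    return any(keyword in message for keyword in keywords)


def normalize(message):
    # Keep letters and digits only. str.isalnum is Unicode-aware, so Hangul
    # syllables are kept while '*', '.', spaces and similar are dropped.
    return "".join(ch for ch in message if ch.isalnum())


def normalized_match(message, keywords):
    return naive_match(normalize(message), keywords)


keywords = ["대출"]              # hypothetical spam keyword ("loan")
message = "무료 대*출 상담"       # keyword obfuscated with a special character

print(naive_match(message, keywords))       # False: the obfuscation hides it
print(normalized_match(message, keywords))  # True: stripping '*' reveals it
```

Stripping characters handles only the simplest obfuscation. It does nothing for Chinese character substitution or genuinely new words, and removing whitespace can create false matches across word boundaries.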

Development of CPLD technology mapping algorithm improving run-time under Time Constraint (시간적 조건에서 실행시간을 개선한 CPLD 기술 매핑 알고리즘 개발)

  • 윤충모;김희석
    • Journal of the Korea Society of Computer and Information / v.4 no.3 / pp.35-46 / 1999
  • In this paper, we propose a new CPLD technology mapping algorithm that improves run-time under a time constraint. In our algorithm, a given logic equation is first represented as a DAG, and the DAG is then reconstructed by replicating every node whose out-degree is 2 or more. This minimizes the delay time, the number of CLBs, and the run-time. After the number of multi-levels is defined and the cost of each node is calculated, the graph is partitioned to fit k, the number of OR terms within a CLB. The partitioned nodes are merged through collapsing, and bin packing is performed to fit the number of OR terms within a CLB. In experiments on the MCNC logic synthesis benchmark circuits, we show that the proposed technology mapping algorithm reduces run-time much more than TMCPLD.
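
The abstract does not name the bin packing heuristic it uses. As one common choice, the sketch below applies first-fit decreasing, treating each partitioned node as an item whose size is its OR-term count and each CLB as a bin of capacity k. The function name and the sample sizes are hypothetical.

```python
def first_fit_decreasing(sizes, k):
    """Pack OR-term counts into CLBs of capacity k; return the list of bins."""
    bins = []
    for size in sorted(sizes, reverse=True):
        if size > k:
            raise ValueError(f"item with {size} OR terms exceeds capacity {k}")
        for b in bins:
            if sum(b) + size <= k:  # first bin with enough room
                b.append(size)
                break
        else:
            bins.append([size])     # no bin fits: open a new CLB
    return bins


# Six partitioned nodes, CLBs that hold k = 5 OR terms each.
packed = first_fit_decreasing([3, 2, 2, 1, 4, 1], k=5)
print(packed)       # [[4, 1], [3, 2], [2, 1]]
print(len(packed))  # 3 CLBs
```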

Implementation of WebGIS for Integration of GIS Spatial Analysis and Social Network Analysis (GIS 공간분석과 소셜 네트워크 분석의 통합을 위한 WebGIS 구현)

  • Choi, Hyo-Seok;Yom, Jae-Hong
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography / v.32 no.2 / pp.95-107 / 2014
  • In general, topographical phenomena are represented graphically by data in the spatial domain, while attributes of the non-spatial domain are expressed as alphanumeric text. GIS functions for analyzing attributes in the non-spatial domain remain quite simple, such as search methods and basic statistical analysis. Recently, graph modeling and network analysis have become common tools for understanding various social events and phenomena. In this study, we applied network analysis functions to the non-spatial domain data of GIS to enhance the overall spatial analysis. For this purpose, we present a novel design that integrates the spatial database and the graph database, and implement this design in a WebGIS system for better decision making. The developed WebGIS, with its underlying synchronized databases, was tested in a simulated application that selects households for water supply during an epidemic of foot-and-mouth disease. The results of this test indicate that the developed WebGIS can contribute to improved decisions by taking into account social proximity factors as well as geospatial factors.

Implementation of Digitizing System for Sea Level Measurements Record (조위관측 기록 디지타이징 시스템 구현)

  • Yu, Young-Jung;Park, Seong-Ho
    • Journal of the Korea Institute of Information and Communication Engineering / v.14 no.8 / pp.1907-1917 / 2010
  • Ocean scientists have a strong need for a digitizing system that effectively extracts and digitizes the sea level records accumulated from the past. The main difficulty of such a system is the huge amount of data to be processed. In this paper, we implement a digitizing system that handles this mass of sea level records. The system consists of a pre-process step, a digitizing step and a post-process step. In the pre-process step, the system corrects the skew of scanned images and normalizes the image size automatically. In the digitizing step, it extracts the graph area from the images and thins it. Finally, in the post-process step, the system tests the reliability of the result. The software is cost-effective and labor-saving, sparing scientists from tedious manual digitizing work.

Analysis of interest in non-face-to-face medical counseling of modern people in the medical industry (의료 산업에 있어 현대인의 비대면 의학 상담에 대한 관심도 분석 기법)

  • Kang, Yooseong;Park, Jong Hoon;Oh, Hayoung;Lee, Se Uk
    • Journal of the Korea Institute of Information and Communication Engineering / v.26 no.11 / pp.1571-1576 / 2022
  • This study aims to analyze the interest of modern people in non-face-to-face medical counseling in the medical industry. Big data was collected from two social platforms: 지식인 (Knowledge iN), a platform where users can receive medical counseling from experts, and YouTube. A data set was built from each platform using a total of eight search terms: the top five keywords of telephone counseling ("internal medicine", "general medicine", "department of neurology", "department of mental health", and "pediatrics") plus "specialist", "medical counseling", and "health information". The crawled data was then pre-processed through steps such as morpheme classification, disease extraction, and normalization. Based on word frequency, the data was visualized with word clouds, line graphs, quarterly graphs, and bar graphs of disease frequency. A sentiment classification model was constructed for the YouTube data only, and the performance of GRU-based and BERT-based models was compared.

A Parameter-Free Approach for Clustering and Outlier Detection in Image Databases (이미지 데이터베이스에서 매개변수를 필요로 하지 않는 클러스터링 및 아웃라이어 검출 방법)

  • Oh, Hyun-Kyo;Yoon, Seok-Ho;Kim, Sang-Wook
    • Journal of the Institute of Electronics Engineers of Korea CI / v.47 no.1 / pp.80-91 / 2010
  • As the volume of image data increases dramatically, good organization of image data is crucial for efficient image retrieval. Clustering is a typical way of organizing image data. However, traditional clustering methods require the user to provide the number of clusters as a parameter before clustering. In this paper, we discuss an approach for clustering image data that does not require this parameter. The proposed approach is based on Cross-Association, which finds structure or patterns hidden in data using the relationships between individual objects. To apply Cross-Association to the clustering of image data, we first convert the image data into a graph. We then perform Cross-Association on the resulting graph and interpret the results from a clustering perspective. We also propose a hierarchical clustering method and an outlier detection method based on Cross-Association. Through a series of experiments, we verify the effectiveness of the proposed approach. Finally, we discuss how to find a good value of k for the k-nearest neighbor search, and compare the clustering results obtained with symmetric and asymmetric ways of building the graph.
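
The difference between the asymmetric and symmetric graph constructions can be sketched as follows. The abstract does not define either construction, so this assumes the asymmetric graph keeps a directed edge from each object to its k nearest neighbors and the symmetric graph adds the reverse of every edge. Feature vectors are plain tuples, and the function names are hypothetical.

```python
import math


def knn_edges(points, k):
    """Directed k-NN edges: (i, j) means j is one of the k nearest to i."""
    edges = set()
    for i, p in enumerate(points):
        # Sort the other points by distance, breaking ties by index.
        others = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: (math.dist(p, points[j]), j),
        )
        edges.update((i, j) for j in others[:k])
    return edges


def symmetrize(edges):
    # Symmetric variant (assumption): add the reverse of every directed edge.
    return edges | {(j, i) for i, j in edges}


points = [(0.0,), (1.0,), (3.0,)]  # toy one-dimensional feature vectors
asym = knn_edges(points, k=1)
sym = symmetrize(asym)
print(sorted(asym))  # [(0, 1), (1, 0), (2, 1)]
print(sorted(sym))   # [(0, 1), (1, 0), (1, 2), (2, 1)]
```

Point 2 chooses point 1 as its nearest neighbor but not the other way round, which is exactly the case where the two constructions differ.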

Integrating Color, Texture and Edge Features for Content-Based Image Retrieval (내용기반 이미지 검색을 위한 색상, 텍스쳐, 에지 기능의 통합)

  • Ma Ming;Park Dong-Won
    • Science of Emotion and Sensibility / v.7 no.4 / pp.57-65 / 2004
  • In this paper, we present a hybrid approach that incorporates color, texture and shape in content-based image retrieval. The colors in each image are clustered into a small number of representative colors. The feature descriptor consists of the representative colors and their percentages in the image. A similarity measure similar to the cumulative color histogram distance measure is defined for this descriptor. The co-occurrence matrix, a statistical method, is used for texture analysis. An optimal set of five statistical functions is extracted from the co-occurrence matrix of each image in order to make the feature vector for each image maximally informative. The edge information captured in edge histograms is extracted after a pre-processing phase that performs color transformation, quantization, and filtering. The extracted features are stored in feature vectors and later compared using an intersection-based method. Experimental results and precision-recall analysis show that the content-based retrieval system is effective in terms of retrieval and scalability.
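
The abstract does not specify its intersection-based method. Assuming it resembles classic histogram intersection, the comparison of two feature histograms can be sketched as the sum of bin-wise minima after normalization. The function names and bin counts below are illustrative only.

```python
def normalize(counts):
    """Scale a histogram of raw counts so that its bins sum to 1."""
    total = sum(counts)
    if total == 0:
        raise ValueError("histogram is empty")
    return [c / total for c in counts]


def histogram_intersection(h1, h2):
    """Similarity of two normalized histograms: sum of bin-wise minima."""
    if len(h1) != len(h2):
        raise ValueError("histograms must have the same number of bins")
    return sum(min(a, b) for a, b in zip(h1, h2))


query = normalize([5, 3, 2])      # hypothetical edge histogram of a query image
candidate = normalize([2, 3, 5])  # hypothetical histogram of a database image
print(histogram_intersection(query, candidate))  # about 0.7
print(histogram_intersection(query, query))      # identical images score 1.0
```

The score lies between 0 (no overlap) and 1 (identical distributions), so database images can be ranked by it directly.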
