• Title/Summary/Keyword: Similar information retrieval


An Indexing System for Retrieving Similar Paths in XML Documents (XML 문서의 유사 경로 검색을 위한 인덱싱 시스템)

  • Lee, Bum-Suk;Hwang, Byung-Yeon
    • The KIPS Transactions:PartD / v.15D no.2 / pp.171-178 / 2008
  • Since the W3C introduced the XML standard in 1998, the number of documents written in XML has been gradually increasing. Accordingly, several systems have been developed to efficiently manage and retrieve massive collections of XML documents. BitCube, a bitmap indexing system, is a representative system in this field of research. Based on the bitmap indexing technique, the path bitmap indexing system (LH06), which clusters similar paths, addressed a problem that the existing BitCube system could not solve, namely, determining similar paths. The path bitmap indexing system offers a higher retrieval speed not only for exactly matched path searching but also for similar path searching. However, the similarity calculation algorithm of this system has a few particular problems. As a result, it sometimes cannot calculate the similarity even when two paths are extremely similar; further, it increases the number of meaningless clusters. In this paper, we propose a novel method for clustering paths by their similarity in order to solve these problems. The proposed system yields a stable clustering result, and it obtains a high score in clustering precision in a performance evaluation against LH06.

Exploring the Strategy for Acquiring ISMS Certification through Probit Regression: Focusing on Organizational Characteristics (Probit 회귀분석을 통한 ISMS 인증 취득 전략 탐색: 조직 특성을 중심으로)

  • SunJoo Kim;Tae-Sung Kim
    • Journal of Information Technology Services / v.23 no.1 / pp.11-25 / 2024
  • In the field of information security management systems, ISMS-P is one of the representative certifications in Korea, and ISO/IEC 27001 is the internationally recognized certification. When companies acquire both ISMS-P (or ISMS) and ISO/IEC 27001 certifications, budget and manpower are duplicated in similar areas. Therefore, a company needs to choose and invest in the certification that suits its conditions. This paper proposes a strategy for obtaining information security management system certification that suits the characteristics of the company, allowing for effective information security management based on the company's conditions. To achieve this, data were collected from the Ministry of Science and ICT's Information Security Disclosure System (ISDS), the Korea Internet & Security Agency (KISA), and the Financial Supervisory Service's Data Analysis, Retrieval and Transfer System (DART), and Probit regression analysis was conducted. The Probit regression analysis examined the relationships between seven independent variables and five acquisition cases: ISMS-P (or ISMS), ISMS-P, ISMS, ISO/IEC 27001, and both ISMS-P (or ISMS) and ISO/IEC 27001. The analysis results revealed the relationship between company characteristics, including industry, and certification acquisition in the ISMS field. Based on these results, strategies for certification acquisition by company type could be suggested.
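
For readers unfamiliar with the method, the general form of a probit model is given below. This is the textbook formulation, not the paper's own specification: the symbols are notation assumed here, with $y_i$ indicating whether company $i$ falls into one of the five acquisition cases and $\mathbf{x}_i$ holding the seven independent variables.

```latex
P(y_i = 1 \mid \mathbf{x}_i) = \Phi\!\left(\beta_0 + \mathbf{x}_i^{\top}\boldsymbol{\beta}\right),
\qquad
\Phi(z) = \int_{-\infty}^{z} \frac{1}{\sqrt{2\pi}}\, e^{-t^{2}/2}\, dt
```

Here $\Phi$ is the standard normal cumulative distribution function, and a positive coefficient in $\boldsymbol{\beta}$ means the corresponding characteristic raises the probability of acquisition.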

Clustering of Web Document Exploiting with the Co-link in Hypertext (동시링크를 이용한 웹 문서 클러스터링 실험)

  • 김영기;이원희;권혁철
    • Journal of Korean Library and Information Science Society / v.34 no.2 / pp.233-253 / 2003
  • Knowledge organization is the way we humans understand the world. Two types of information organization mechanisms are studied in information retrieval: classification and clustering. Classification organizes entities by pigeonholing them into predefined categories, whereas clustering organizes information by grouping similar or related entities together. Systems for Internet information resources extract keywords from the words that appear in a web document and build an inverted file. Term clustering, which groups related terms, did not prove very successful, however, and was mostly abandoned for documents written in different languages or doorway pages composed only of anchor text. This study examines informetric analysis and the feasibility of clustering web documents based on the co-link topology of web pages.
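
As an illustration of the co-link idea, the sketch below counts how often two pages are linked from the same source page. It assumes that "co-link" means co-inlinking in this sense; the function name `colink_counts` and the sample link data are hypothetical and not taken from the study.

```python
from collections import Counter
from itertools import combinations


def colink_counts(outlinks):
    """Count, for each pair of pages, how many source pages link to both."""
    counts = Counter()
    for targets in outlinks.values():
        # De-duplicate and sort so each pair is counted once per source page
        # and always appears in the same (a, b) order.
        for a, b in combinations(sorted(set(targets)), 2):
            counts[(a, b)] += 1
    return counts


# Hypothetical link data: source page -> pages it links to.
outlinks = {
    "p1": ["a", "b", "c"],
    "p2": ["a", "b"],
    "p3": ["b", "c"],
}
counts = colink_counts(outlinks)
```

Pairs with high counts are candidates for the same cluster. Because the counts depend only on link structure, they do not rely on the pages sharing a language or containing body text.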


QuLa: Queue and Latency-Aware Service Selection and Routing in Service-Centric Networking

  • Smet, Piet;Simoens, Pieter;Dhoedt, Bart
    • Journal of Communications and Networks / v.17 no.3 / pp.306-320 / 2015
  • Due to the explosive growth in services running in different datacenters, there is a need for service selection and routing to deliver user requests to the best service instance. In current solutions, it is generally the client that must first select a datacenter to forward the request to, before an internal load balancer of the selected datacenter can select the optimal instance. An optimal selection requires knowledge of both network and server characteristics, which makes clients less suitable for making this decision. Information-Centric Networking (ICN) research solved a similar selection problem for static data retrieval by integrating content delivery as a native network feature. We address the selection problem for services by extending ICN principles to services. In this paper we present Queue and Latency (QuLa), a network-driven service selection algorithm that maps user demand to service instances, taking into account both network and server metrics. To reduce the size of service router forwarding tables, we present a statistical method that approximates an optimal load distribution while minimizing the router state required. Simulation results show that our statistical routing approach approximates the average system response time of source-based routing with minimized state in forwarding tables.
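
The sketch below illustrates what selecting an instance on both network and server metrics can look like. It is not the QuLa algorithm: the cost model (network latency plus queued work) and all names and numbers are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    name: str
    latency_ms: float       # network latency to reach the instance (assumed known)
    queue_length: int       # requests already waiting at the instance
    service_time_ms: float  # mean time to process one request


def estimated_response_ms(inst):
    """Assumed cost model: travel time plus time to drain the queue and serve us."""
    return inst.latency_ms + (inst.queue_length + 1) * inst.service_time_ms


def select_instance(instances):
    """Pick the instance with the lowest estimated response time."""
    return min(instances, key=estimated_response_ms)


# Hypothetical instances: the nearby one is heavily loaded.
instances = [
    Instance("near-busy", latency_ms=5.0, queue_length=10, service_time_ms=20.0),
    Instance("far-idle", latency_ms=40.0, queue_length=0, service_time_ms=20.0),
]
best = select_instance(instances)
```

With these numbers the distant but idle instance wins (60 ms against 225 ms). A client that sees only network latency would pick the other instance, which is the kind of decision the abstract says clients are poorly placed to make.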

An Index Mechanism and Structure Information for Efficient Retrieval of XML DTD (XML DTD의 효율적인 검색을 위한 구조 정보 및 인덱스 메카니즘)

  • 김영란
    • Journal of the Korea Society of Computer and Information / v.8 no.3 / pp.80-86 / 2003
  • XML is attracting keen interest as a means of communicating and storing information. Information represented in XML can be referenced more accurately and more quickly after processing. However, XML documents are difficult to exchange or share across different areas such as electronic commerce and digital libraries, because documents that are similar in logic may differ in syntax, which calls for structural difference analysis. In this thesis, we convert an object-oriented class diagram to an XML DTD and design an index mechanism based on the structure information of the converted XML DTD. With our method, specific elements can be retrieved effectively and quickly, and elements can be accessed with simple operations.


Retrieval of background surface reflectance with pre-running BRD components

  • Choi, Sungwon;Lee, Chang Suk;Seo, Minji;Seong, Noh-hun;Lee, Kyeong-Sang;Han, Kyung-Soo
    • Korean Journal of Remote Sensing / v.32 no.1 / pp.61-65 / 2016
  • Remote sensing of the surface has become more important than in the past, and many countries are trying various ways to retrieve surface reflectance. In this study, we use a Bidirectional Reflectance Distribution Function (BRDF) to retrieve surface reflectance. We apply the BRDF to observed surface reflectance from SPOT/VEGETATION (VGT-S1) and angular data to obtain Bidirectional Reflectance Distribution (BRD) coefficients for calculating scattering. We then apply the BRDF in the opposite direction, with the BRD coefficients and angular data, to retrieve Background Surface Reflectance (BSR). The BSR does not exceed $0.4{\mu}m$ (blue), $0.45{\mu}m$ (red), $0.55{\mu}m$ (NIR). For validation, we compare BSR with VGT-S1: the bias ranges from 0.0116 to 0.0158 and the RMSE from 0.0459 to 0.0545. As a result, we confirm that BSR is similar to VGT-S1.
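
The two validation metrics named in the abstract, bias and RMSE, can be computed as in the minimal sketch below. It uses the standard definitions, which the paper is assumed to follow, and the sample reflectance values are hypothetical, not data from the study.

```python
import math


def bias(estimates, references):
    """Mean signed difference between estimates and reference values."""
    if len(estimates) != len(references) or not estimates:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(e - r for e, r in zip(estimates, references)) / len(estimates)


def rmse(estimates, references):
    """Root mean square error between estimates and reference values."""
    if len(estimates) != len(references) or not estimates:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((e - r) ** 2 for e, r in zip(estimates, references))
    return math.sqrt(squared / len(estimates))


# Hypothetical reflectance samples (retrieved values vs. reference values).
retrieved = [0.10, 0.20, 0.30]
reference = [0.08, 0.21, 0.27]
b = bias(retrieved, reference)
r = rmse(retrieved, reference)
```

Bias keeps the sign of the error, so it shows whether the retrieved values run systematically high or low. RMSE measures the overall spread of the differences.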

Similar Trajectory Retrieval on Road Networks using Spatio-Temporal Similarity (시공간 유사성을 이용한 도로 네트워크 상의 유사한 궤적 검색)

  • Hwang Jung-Rae;Kang Hye-Young;Li Ki-Joune
    • The KIPS Transactions:PartD / v.13D no.3 s.106 / pp.337-346 / 2006
  • In order to analyze the behavior of moving objects, a measure for determining the similarity of trajectories needs to be defined. Although research has been conducted on retrieving similar trajectories of moving objects in Euclidean space, very little research has been conducted on moving objects in the space defined by road networks. In real applications, most moving objects are located in road network space rather than in Euclidean space. Previous methods for measuring the similarity between trajectories, however, were based on Euclidean distance and considered only spatial similarity. In this paper, we define a similarity measure based on POI and TOI in road network space. With this definition, we present methods to retrieve similar trajectories using the spatio-temporal similarity between trajectories. We show clustering results for similar trajectories. Experimental results show the similar trajectories retrieved by each method and the consistency rate between the methods for the retrieved trajectories.

A Study on Location-based Routing Technique for Improving the Performance of P2P in MANET (MANET에서 P2P 성능 향상을 위한 위치기반 라우팅 기법에 관한 연구)

  • Yang, Hwanseok
    • Journal of Korea Society of Digital Industry and Information Management / v.11 no.2 / pp.37-45 / 2015
  • MANET technology has advanced and P2P services have spread widely. In particular, many application services that integrate P2P services into MANETs are being actively developed. P2P networks are commonly used because they use network bandwidth efficiently and exchange information rapidly. In a P2P network, there is no central infrastructure managing the nodes, and each node acts as both sender and receiver. Such a structure is very similar to the structure of a MANET. However, it is difficult to provide reliable P2P service due to the high mobility of mobile nodes. In this paper, we propose a location-based routing technique in order to provide efficient file sharing and management between nodes. After the network is configured into areas of a certain size, a GMN that manages each group is elected. Each node is dynamically assigned a 12-bit identifier so that routing can use location information through the identifier. The GMN maintains a ZGT in order to manage group nodes and distributed cache information. The distributed cache technique is applied to provide rapid retrieval of the files shared by each node. The excellent performance of the proposed technique was confirmed through experiments.

Retrieving Information from Korean OCR Text Database (문자 인식에 의해 구축된 한글 문서 데이터베이스에 대한 정보 검색)

  • Lee, Jun-Ho;Lee, Chung-Sik;Han, Seon-Hwa;Kim, Jin-Hyeong
    • The Transactions of the Korea Information Processing Society / v.6 no.4 / pp.833-841 / 1999
  • Texts constructed with Optical Character Recognition (OCR) contain more errors than those constructed with keyboard typing. Therefore, in order to retrieve useful information from OCR texts, we need to develop an effective automatic indexing method. In this paper, we investigate automatic indexing methods that can retrieve information effectively from a Korean OCR text database with a character-level recognition rate of 90%. Experimental results show that 2-gram indexing provides retrieval effectiveness similar to that of morpheme-based indexing for the Korean OCR text database.
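
The following is a minimal sketch of character 2-gram indexing, meant only to illustrate the technique named in the abstract. The tokenization (bigrams within whitespace-separated words), the overlap-count scoring, and the sample documents are assumptions, not the paper's method or data.

```python
from collections import Counter, defaultdict


def char_bigrams(text):
    """Split text on whitespace and return the character 2-grams of each word."""
    grams = []
    for word in text.split():
        if len(word) == 1:
            grams.append(word)  # keep one-character words as they are
        grams.extend(word[i:i + 2] for i in range(len(word) - 1))
    return grams


def build_index(docs):
    """Map each 2-gram to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for gram in char_bigrams(text):
            index[gram].add(doc_id)
    return index


def search(index, query):
    """Score each document by how many distinct query 2-grams it contains."""
    scores = Counter()
    for gram in set(char_bigrams(query)):
        for doc_id in index.get(gram, ()):
            scores[doc_id] += 1
    return scores


# Hypothetical documents; d2 simulates an OCR error in the last character.
docs = {"d1": "정보 검색 시스템", "d2": "정보 검색 시스텀"}
index = build_index(docs)
scores = search(index, "시스템")
```

In this example the misrecognized document still matches one of the two query 2-grams, so it can be retrieved with a lower score. A single wrong character damages only the 2-grams that contain it, and no morphological analysis of the noisy text is needed.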


A Study of Integrated RM & IM with KM Governance: Public Enterprise Centered (KM 기반의 기록관리 및 일반 정보관리 통합화 연구 - 공기업을 중심으로 -)

  • Jeong, Ki-Ae;Nam, Young-Joon
    • Journal of the Korean BIBLIA Society for library and Information Science / v.19 no.2 / pp.23-43 / 2008
  • Knowledge resources are classified into two groups: records produced within the company and general information materials from external organizations. The production, acquisition, storage, retrieval and utilization patterns of the two groups have become similar due to the digitization of knowledge resources. Accordingly, separately managed RM and IM should move to an integrated management concept. This paper compares RM and IM based on a KM governance strategy and presents several methods to integrate them. In particular, it presents the selection and identification of knowledge resources, the information systems to be integrated, and the methods of integration and integrators of public enterprises.