• Title/Summary/Keyword: data index information

Hyper-Text Compression Method Based on LZW Dictionary Entry Management (개선된 LZW 사전 관리 기법에 기반한 효과적인 Hyper-Text 문서 압축 방안)

  • Sin, Gwang-Cheol;Han, Sang-Yong
    • The KIPS Transactions:PartA
    • /
    • v.9A no.3
    • /
    • pp.311-316
    • /
    • 2002
  • LZW is a popular variant of LZ78 for compressing text documents. It yields a high compression rate and is widely used in commercial programs. Its core idea is to assign a dictionary entry to each frequently occurring group of characters: when a group that is already in the dictionary appears in the data stream, it is replaced by the index of its dictionary entry. In this paper, we propose a new, efficient method that uses counters to find the least used entries in the dictionary. We also achieve a higher compression rate by preassigning dictionary entries to tags that are widely used in hyper-text documents. Experimental results show that the proposed method is more effective than V.42bis and the Unix compression method, giving 3∼8% better compression on the standard Calgary Corpus and 23∼24% better compression on HTML documents.
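
The dictionary idea in this abstract can be illustrated with a minimal LZW sketch. It is not the authors' implementation: the counter-based replacement of least used entries is not shown, the tag list passed as `preset` is a hypothetical example, and the sketch assumes every input character has a code point below 256. Pre-assigned tags are seeded together with all of their prefixes so that the usual greedy matching can reach them.

```python
def _seed(preset):
    """Initial phrase list: 256 single characters, then every prefix
    (length >= 2) of each pre-assigned phrase, in a fixed order."""
    phrases = [chr(i) for i in range(256)]
    seen = set(phrases)
    for phrase in preset:
        for end in range(2, len(phrase) + 1):
            prefix = phrase[:end]
            if prefix not in seen:
                seen.add(prefix)
                phrases.append(prefix)
    return phrases


def lzw_compress(text, preset=()):
    """Replace character groups by dictionary indices (basic LZW)."""
    table = {phrase: i for i, phrase in enumerate(_seed(preset))}
    codes, current = [], ""
    for ch in text:
        candidate = current + ch
        if candidate in table:
            current = candidate              # keep extending the match
        else:
            codes.append(table[current])     # emit index of longest match
            table[candidate] = len(table)    # learn the new group
            current = ch
    if current:
        codes.append(table[current])
    return codes


def lzw_decompress(codes, preset=()):
    """Inverse of lzw_compress; must be given the same preset."""
    if not codes:
        return ""
    table = _seed(preset)
    prev = table[codes[0]]
    out = [prev]
    for code in codes[1:]:
        # A code equal to len(table) refers to the entry being built right now.
        entry = table[code] if code < len(table) else prev + prev[0]
        out.append(entry)
        table.append(prev + entry[0])
        prev = entry
    return "".join(out)
```

With a hypothetical preset such as `("<html>", "<body>", "</body>", "</html>")`, a short HTML string compresses to fewer indices than with the plain single-character dictionary, which is the effect the abstract attributes to preassigned tags.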

Recognition of Bill Form using Feature Pyramid Network (FPN(Feature Pyramid Network)을 이용한 고지서 양식 인식)

  • Kim, Dae-Jin;Hwang, Chi-Gon;Yoon, Chang-Pyo
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.25 no.4
    • /
    • pp.523-529
    • /
    • 2021
  • In the era of the Fourth Industrial Revolution, technological changes are being applied in various fields, and automation, digitization, and data management are also reaching the field of bills. Tens of thousands of bill forms circulate in society, so bill recognition is essential for automation, digitization, and data management. Currently, OCR technology is used for character recognition in order to manage the various bills. Accuracy can be increased by first recognizing the form of the bill and then recognizing the bill itself. In this paper, a logo that can serve as an index for classifying the form of the bill is recognized as an object. Because the logo is small relative to the entire bill, FPN, a deep learning technique, was used for small object detection. As a result, the proposed algorithm made it possible to reduce resource waste and increase the accuracy of OCR recognition.

A Study on DB Security Problem Improvement of DB Masking by Security Grade (DB 보안의 문제점 개선을 위한 보안등급별 Masking 연구)

  • Baek, Jong-Il;Park, Dea-Woo
    • Journal of the Korea Society of Computer and Information
    • /
    • v.14 no.4
    • /
    • pp.101-109
    • /
    • 2009
  • Oracle DBMS has shipped with a built-in encryption module since version 8i, but the module degrades performance and its use is restricted. In this paper, we analyze DB security problems by technology: whether index search is possible, object management failures, serious DB performance degradation caused by encryption, the lack of real-time data encryption, and the lack of IP-based data access control. We then present a comprehensive security framework that uses the DB Masking technique, a technical alternative to encryption, in order to improve the availability of DB security. As the alternative, we use virtual accounts and set up DB Masking criteria by security grade. Through the virtual accounts, we check user authentication and SQL query approval in advance and integrity after the fact, and we collect audit logs so that an administrator can manage the DB safely.

Efficient Distributed Broadcast Schemes using Sensor Networks in Road Network Environments (도로 네트워크 환경에서 센서 네트워크를 이용한 효율적인 분산 브로드캐스트 기법)

  • Jang, Yong-Jin;Lee, Jin-Ju;Park, Jun-Ho;Seong, Dong-Ook;Yoo, Jae-Soo
    • The Journal of the Korea Contents Association
    • /
    • v.11 no.1
    • /
    • pp.26-33
    • /
    • 2011
  • In ubiquitous environments where numerous mobile objects exist, location-based services have emerged as an important application field. Various broadcast-based techniques have been studied to make location-based services efficient. However, they were mainly concerned with the implementation of a broadcast index and did not consider techniques for reducing the size of the entire broadcast data. Therefore, this paper proposes a distributed data broadcast scheme based on sensor networks that considers the movement patterns of objects in road network environments. We also propose a road network based sensor clustering technique to make the proposed distributed broadcast scheme efficient. To show the superiority of the proposed scheme, we compare it with the existing broadcast scheme in various environments.

Efficient Multi-Step k-NN Search Methods Using Multidimensional Indexes in Large Databases (대용량 데이터베이스에서 다차원 인덱스를 사용한 효율적인 다단계 k-NN 검색)

  • Lee, Sanghun;Kim, Bum-Soo;Choi, Mi-Jung;Moon, Yang-Sae
    • Journal of KIISE
    • /
    • v.42 no.2
    • /
    • pp.242-254
    • /
    • 2015
  • In this paper, we address the problem of improving the performance of multi-step k-NN search using multi-dimensional indexes. Because lower-dimensional transformations lose information, existing multi-step k-NN search solutions produce a large tolerance (i.e., a large search range) and thus incur a large number of candidates, which are retrieved by a range query. These many candidates lead to overwhelming I/O and CPU overheads in the postprocessing step. To overcome this problem, we propose two efficient solutions that improve search performance by reducing the tolerance of the range query and, accordingly, the number of candidates. First, we propose a tolerance reduction-based (approximate) solution that forcibly decreases the tolerance, which is determined by a k-NN query on the index, by the average ratio of high- and low-dimensional distances. Second, we propose a coefficient control-based (exact) solution that uses c·k instead of k in the k-NN query to obtain a tighter tolerance and performs the range query using this tighter tolerance. Experimental results show that the proposed solutions significantly reduce the number of candidates and, accordingly, improve search performance in comparison with the existing multi-step k-NN solution.
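
The multi-step search and the coefficient control idea can be sketched as follows. This is an illustration under stated assumptions, not the paper's code: linear scans stand in for the multidimensional index, the lower-dimensional transformation is assumed to be a projection onto the first `m` coordinates (which never overestimates the Euclidean distance), and the names `multistep_knn`, `project`, `m`, and `c` are hypothetical. The sketch also assumes the data set holds at least `k` points.

```python
import heapq
import math


def dist(a, b):
    """Euclidean distance between two equal-length tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def project(v, m):
    """Assumed lower-dimensional transformation: keep the first m coordinates."""
    return v[:m]


def multistep_knn(data, query, k, m, c=1):
    """Return (indices of the k nearest points, number of candidates).

    c=1 mimics the classic multi-step search; c>1 mimics the
    coefficient control idea of asking for c*k neighbors first.
    """
    low_q = project(query, m)
    low = [project(v, m) for v in data]

    # Step 1: (c*k)-NN query in the low-dimensional space.
    first = heapq.nsmallest(c * k, range(len(data)),
                            key=lambda i: dist(low[i], low_q))

    # Step 2: tolerance = k-th smallest exact distance among those objects.
    exact = sorted(dist(data[i], query) for i in first)
    tolerance = exact[k - 1]

    # Step 3: range query in the low-dimensional space.
    candidates = [i for i in range(len(data))
                  if dist(low[i], low_q) <= tolerance]

    # Step 4: postprocessing with exact high-dimensional distances.
    result = heapq.nsmallest(k, candidates,
                             key=lambda i: dist(data[i], query))
    return result, len(candidates)
```

Because the tolerance in step 2 is the exact distance of some k-th object, every true neighbor passes the range query, so the result stays exact; a larger `c` can only shrink or keep the tolerance, and therefore the candidate count.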

An Efficient Technique for Processing Frequent Updates in the R-tree (R-트리에서 빈번한 변경 질의 처리를 위한 효율적인 기법)

  • Kwon, Dongseop;Lee, Sangjun;Lee, Sukho
    • Journal of KIISE:Databases
    • /
    • v.31 no.3
    • /
    • pp.261-273
    • /
    • 2004
  • Advances in information and communication technologies have been creating new classes of database applications. For example, in moving object databases, which track the positions of many objects, or stream databases, which process data streams from many sensors, the data being processed usually changes very rapidly and continuously. However, traditional database systems have difficulty processing such data because they assume that a data item stored in the database remains constant until it is explicitly modified. The problem becomes more serious in the R-tree, a typical index structure for multidimensional data, because modifying data in the R-tree can trigger cascading node splits or merges. To process frequent updates more efficiently, we propose a novel update technique for the R-tree, which we call the leaf-update technique. If the new value of a data item lies within the MBR of the leaf to which the data item belongs, the leaf-update technique changes only the leaf node, not the whole tree. Using this leaf-update approach and a leaf-access hash table for direct access to leaf nodes, the proposed technique can greatly reduce update cost. In addition, the leaf-update technique can be adopted in diverse variants of the R-tree and in various applications that use the R-tree, since it is based on the R-tree and guarantees the correctness of the R-tree. In this paper, we prove the effectiveness of the leaf-update technique theoretically and present experimental results showing that our technique outperforms the traditional one.
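
The leaf-update idea can be sketched with a toy structure. This is only an illustration under simplifying assumptions: the "tree" is a fixed list of leaves with static MBRs rather than a real R-tree, the fallback path is a plain re-insert, and the names `Leaf`, `LeafUpdateIndex`, and `leaf_of` are hypothetical.

```python
class Leaf:
    """A leaf node with a fixed minimum bounding rectangle (MBR)."""

    def __init__(self, xmin, ymin, xmax, ymax):
        self.mbr = (xmin, ymin, xmax, ymax)
        self.entries = {}  # object id -> (x, y)

    def contains(self, x, y):
        xmin, ymin, xmax, ymax = self.mbr
        return xmin <= x <= xmax and ymin <= y <= ymax


class LeafUpdateIndex:
    """Toy stand-in for an R-tree plus a leaf-access hash table."""

    def __init__(self, leaves):
        self.leaves = leaves
        self.leaf_of = {}  # leaf-access hash table: object id -> Leaf

    def insert(self, oid, x, y):
        # Stand-in for a top-down R-tree insertion.
        for leaf in self.leaves:
            if leaf.contains(x, y):
                leaf.entries[oid] = (x, y)
                self.leaf_of[oid] = leaf
                return
        raise ValueError("no leaf covers the point")

    def update(self, oid, x, y):
        old = self.leaf_of[oid]          # direct access, no tree traversal
        if old.contains(x, y):
            old.entries[oid] = (x, y)    # leaf-only update
            return "leaf"
        self.insert(oid, x, y)           # fall back to a regular insertion
        del old.entries[oid]             # then remove the stale entry
        return "full"
```

An update that stays inside the current leaf's MBR touches only that leaf; only a move outside the MBR takes the more expensive path.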

A PageRank based Data Indexing Method for Designing Natural Language Interface to CRM Databases (분석 CRM 실무자의 자연어 질의 처리를 위한 기업 데이터베이스 구성요소 인덱싱 방법론)

  • Park, Sung-Hyuk;Hwang, Kyeong-Seo;Lee, Dong-Won
    • CRM연구 (CRM Research)
    • /
    • v.2 no.2
    • /
    • pp.53-70
    • /
    • 2009
  • Understanding consumer behavior through the analysis of customer data is an essential part of analytic CRM. This requires users to have analytic skills for data extraction and data processing. Because users bring various kinds of questions to consumer data analysis, they have to use a database language such as SQL. However, generating SQL statements is not easy for a firm's users, because the accuracy of the query result depends heavily on knowledge of work-site operations and of the firm's database. This paper proposes a natural language based database search framework that finds relevant database elements. Specifically, we describe how our TableRank method interprets the user's natural language query and provides the appropriate relations and attributes of data records to the user. Several experiments support that TableRank provides accurate database elements related to the user's natural language query. We also show that a close distance among relations in the database represents high data connectivity, which guarantees matching with a user's search query.

Feature extraction for recognition rate improvement of hand written numerals (필기체 숫자 인식률 향상을 위한 특징추출)

  • Koh, Chan;Lee, Chang-In
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.22 no.10
    • /
    • pp.2102-2111
    • /
    • 1997
  • A handwritten numeral is projected onto 3D space after pre-processing of the input, and an index is built by tracking the numeral. The distances between all extracted features are computed. To adapt to variation, a statistical histogram of the normalized data is used as the input to the recognition process. One hundred numeral patterns were used to build a standard feature map and 100 patterns were used for the recognition experiment. As a result, the recognition rate was 93.5% with a threshold of 0.20 and 97.5% with a threshold of 0.25.

Efficient Broadcast Scheme Based on Ergodic Index Coding (에르고딕 인덱스 코딩을 바탕으로 한 효율적인 브로드캐스트 기법)

  • Choi, Sang Won;Kim, Juyeop;Kim, Yong-Kyu
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.40 no.8
    • /
    • pp.1500-1506
    • /
    • 2015
  • In this paper, an efficient broadcast scheme with acknowledged mode is proposed. Specifically, based on the stochastic pattern of ACK/NACK across all users and on index coding, an adaptive coding scheme with the XOR operation is used at the transmitter. At each receiver, packets are decoded using a layered decoding method that exploits packets already decoded successfully. Numerical results show that the proposed index coded broadcast scheme is more efficient than a naive broadcast scheme in terms of the average total number of transmitted packets.
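
The basic XOR step behind index coded retransmission can be sketched as follows. It is a minimal illustration, not the proposed scheme: the adaptive choice of packets from ACK/NACK statistics and the layered decoding are not modeled, packets are assumed to have equal length, and the function names are hypothetical.

```python
from functools import reduce


def xor_bytes(a, b):
    """Bytewise XOR of two equal-length packets."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(packets):
    """Transmitter side: combine several packets into one coded packet."""
    return reduce(xor_bytes, packets)


def decode(coded, known_packets):
    """Receiver side: cancel the packets already decoded successfully."""
    return reduce(xor_bytes, known_packets, coded)
```

For example, if one user lost packet `p1` but holds `p2` and another user lost `p2` but holds `p1`, a single broadcast of `encode([p1, p2])` lets both recover their missing packet, where a naive scheme would retransmit two packets.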

A research on the algorithm of traffic card for blacklist checking (교통카드 블랙리스트 체크를 위한 알고리즘에 관한 연구)

  • Jeong, Yang-Kwon;Kim, Yong-Sik;Kim, Kyung-Hee
    • The Journal of the Korea institute of electronic communication sciences
    • /
    • v.5 no.1
    • /
    • pp.58-65
    • /
    • 2010
  • This study concerns prepaid and postpaid traffic cards. Unlike existing card systems, the proposed method organizes only the information on unusable or usable cards, with the aim of shortening the response time of system operation and improving how that information is organized; the accompanying control method improves update speed and the effectiveness of the system. In this study, each file is composed of multiple sections, each section is composed of multiple blocks, and each block is divided into multiple units of a fixed size. Together with an index part, the unusable or usable card information is organized into a data area, which improved the efficiency of the system.