• Title/Summary/Keyword: Document Reading

A system for detecting document leakage by insiders through continuous user authentication by using document reading behavior (문서 읽기 행위를 이용한 연속적 사용자 인증 기반의 내부자 문서유출 탐지기술 연구)

  • Cho, Sungyoung;Kim, Minsu;Won, Jongil;Kwon, SangEun;Lim, Chaeho;Kang, Brent ByungHoon;Kim, Sehun
    • Journal of the Korea Institute of Information Security & Cryptology / v.23 no.2 / pp.181-192 / 2013
  • Various techniques exist to detect and control document leakage, but most concentrate on leakage by outsiders; few detect and monitor leakage by insiders. In this study, we observe users' document reading behavior to detect and control document leakage by insiders. We build each user's document reading pattern from attributes gathered by a logger program running on Microsoft Word, and then apply the proposed system to help determine whether the user currently reading a document matches the true user. We expect that our system, based on document reading behavior, can effectively prevent document leakage.

Automatic Reading System for On-off Type DNA Chip

  • Ryu, Mun-Ho;Kim, Jong-Dae;Kim, Jong-Won
    • Journal of Information Processing Systems / v.2 no.3 s.4 / pp.189-193 / 2006
  • In this study, we propose an automatic reading system for diagnostic DNA chips. We define a general specification for an automatic reading system and propose a possible implementation method. The proposed system performs the whole reading process automatically without any user intervention, covering image acquisition, image analysis, and report generation. We applied the system to the automatic report generation of a commercialized DNA chip for cervical cancer detection. The fluorescence image of the hybridization result was acquired with a GenePix™ scanner using its library running in HTML pages. The processing of the acquired image and the report generation were executed by a component object module programmed with Microsoft Visual C++ 6.0. To generate the report document, we made an HWP 2002 document template with marker strings that are searched for and replaced with the corresponding information, such as patient information and diagnosis results. The proposed system generates the report document by reading the template and replacing the marker strings with the resulting content. The system is expected to facilitate the use of diagnostic DNA chips for mass screening by automating the conventional manual reading process, shortening its processing time, and quantifying the reading criteria.
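
The template step described above (search for marker strings, replace them with patient and diagnosis information) can be sketched roughly as follows. This is a minimal illustration on a plain-text template, not the authors' HWP-based implementation; the marker syntax `<<NAME>>`, the function name `fill_template`, and the sample values are all hypothetical.

```python
def fill_template(template: str, values: dict[str, str]) -> str:
    """Return a copy of the template with every marker string replaced."""
    report = template
    for marker, value in values.items():
        # str.replace substitutes all occurrences of the marker.
        report = report.replace(marker, value)
    return report


# Hypothetical template and values, for illustration only.
template = "Patient: <<NAME>>\nResult: <<RESULT>>"
report = fill_template(template, {"<<NAME>>": "Patient A", "<<RESULT>>": "negative"})
print(report)
```

Keeping the markers in a dictionary means a new report field only needs a new marker in the template and a new dictionary entry.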

Analyzing Undergraduate Nursing Students' Electronic Document Use and Document Reading Behavior (간호학과 학생들의 전자형태 문서이용 및 문서읽기행태에 대한 분석)

  • Na, Kyoungsik;Lee, Jisu
    • Journal of the Korean Society for Information Management / v.31 no.3 / pp.271-291 / 2014
  • The purpose of this study is to analyze undergraduate nursing students' electronic document use and reading behavior. To this end, survey questionnaires were collected from 509 respondents who had engaged in document reading during the previous semester. The results show that nursing students generally prefer electronic documents to printed documents, including when they want to keep documents. About 94% or more of respondents spent 30 minutes or more finding information; the most common information source was the 'Naver' search engine, and the most common place to access information was 'Home'. In particular, respondents preferred electronic documents when 'on the move', mainly because such documents are convenient and easy to access and carry. The findings are expected to improve the understanding of undergraduate nursing students' electronic document use and reading behavior, so that medical digital library services and tools can be designed and developed more effectively and efficiently. Furthermore, they are expected to provide useful data for promoting user services in digital libraries as a whole.

SATS: Structure-Aware Touch-Based Scrolling

  • Kim, Dohyung;Gweon, Gahgene;Lee, Geehyuk
    • ETRI Journal / v.38 no.6 / pp.1104-1113 / 2016
  • Non-linear document navigation refers to the process of repeatedly reading a document at different levels to provide an overview, including selective reading to search for useful information within a document under time constraints. Currently, this function is not supported well by small-screen tablets. In this study, we propose the concept of structure-aware touch-based scrolling (SATS), which allows structural document navigation using region-dependent touch gestures for non-sequential navigation within tablets or tablet-sized e-book readers. In SATS, the screen is divided into four vertical sections representing the different structural levels of a document, where dragging into the different sections allows navigating from the macro to micro levels. The implementation of a prototype is presented, as well as details of a comparative evaluation using typical non-sequential navigation tasks performed under time constraints. The results showed that SATS obtained better performance, higher user satisfaction, and a lower usability workload compared with a conventional structural overview interface.
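
The region-dependent mapping described above can be illustrated with a small sketch that assigns a touch position to one of four structural levels. It assumes the four sections are side-by-side strips selected by the horizontal coordinate, which the abstract does not state; the level names and the function `level_for_touch` are hypothetical.

```python
# Hypothetical structural levels, ordered from macro to micro.
LEVELS = ["chapter", "section", "paragraph", "line"]


def level_for_touch(x: float, screen_width: float) -> str:
    """Map a horizontal touch position to the structural level of its strip."""
    if not 0 <= x < screen_width:
        raise ValueError("touch position is outside the screen")
    # Each of the four strips covers an equal share of the screen width.
    index = int(x * len(LEVELS) / screen_width)
    return LEVELS[index]


print(level_for_touch(200, 800))
```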

Analysis of Furniture Planning and Layout Type in Subject Specialization of University Library (대학도서관 주제자료실의 가구계획 및 배치유형 분석)

  • Chang, Ari;Hwang, Yeon-Sook
    • Korean Institute of Interior Design Journal / v.24 no.2 / pp.180-188 / 2015
  • University libraries aim to improve not only educational effects but also the general quality of colleges. A primary way of pursuing this goal is through providing professors and students with sufficient amounts of available references and materials that can be used for academic purposes. However, even though university libraries are intended to be used by college students majoring in different fields, they tend to provide mostly books. This limited offering of resources means that they are not distinguishing themselves from regular libraries. The purpose of this study is to present basic data for the spatial design of a subject specialization room in a college library. Included in the design are recommendations for the type and placement of the furniture in the room. The summary of results for this study and the conclusions are as follows: The layout of data space and reading space in a subject specialization room can be categorized into both document-oriented (document centralized and document categorized) and reading-oriented (reading centralized, all, and group types). The public reading seats and private reading seats in a subject specialization room, according to their ratio, can be divided into private reading, public reading, and distributed reading sections. The ratio of open-spaced tables is higher for groups of four or more people, but users often sit separately from others in order to ensure privacy. Unfortunately, this practice results in seating gaps that do not make efficient use of space. The result is that the public reading seats are less efficient than the private reading seats in terms of space. Therefore, it is necessary to increase the number of cubicles.

HTML Tag Depth Embedding: An Input Embedding Method of the BERT Model for Improving Web Document Reading Comprehension Performance (HTML 태그 깊이 임베딩: 웹 문서 기계 독해 성능 개선을 위한 BERT 모델의 입력 임베딩 기법)

  • Mok, Jin-Wang;Jang, Hyun Jae;Lee, Hyun-Seob
    • Journal of Internet of Things and Convergence / v.8 no.5 / pp.17-25 / 2022
  • Recently, a massive amount of data has been generated as the number of edge devices increases. In particular, the number of raw, unstructured HTML documents has increased. Therefore, machine reading comprehension (MRC), in which a natural language processing model finds the important information within an HTML document, is becoming more important. In this paper, we propose the HTML Tag Depth Embedding method (HTDE), which allows BERT to learn the depth of the HTML document structure. HTDE builds a tag stack from the HTML document for each BERT input token and then extracts the depth information. We then add an HTML embedding layer, which takes the depth of the token as input, to the input embedding step of BERT. Because tokenization using HTDE identifies the HTML document structure through the relationships among surrounding tokens, HTDE improves the accuracy of BERT on HTML documents. Finally, we demonstrate that the proposed method achieves higher accuracy than the conventional BERT embedding.
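
The tag-stack idea can be sketched with Python's standard `html.parser` module: the stack grows on each opening tag, shrinks on each closing tag, and its height at the moment a text token is read is that token's depth. This is only an illustration of the depth-extraction step under simplifying assumptions, not the HTDE implementation: whitespace splitting stands in for BERT's tokenizer, the embedding layer itself is omitted, and `DepthParser` and `token_depths` are hypothetical names.

```python
from html.parser import HTMLParser

# Void elements have no closing tag, so they must not stay on the stack.
VOID_TAGS = {"br", "img", "hr", "input", "meta", "link"}


class DepthParser(HTMLParser):
    """Collect (token, depth) pairs, where depth is the tag-stack height."""

    def __init__(self):
        super().__init__()
        self.stack = []
        self.tokens = []

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag in self.stack:
            # Pop up to and including the matching opening tag.
            while self.stack and self.stack.pop() != tag:
                pass

    def handle_data(self, data):
        # Whitespace splitting is a stand-in for a real subword tokenizer.
        for word in data.split():
            self.tokens.append((word, len(self.stack)))


def token_depths(html: str):
    parser = DepthParser()
    parser.feed(html)
    parser.close()
    return parser.tokens


print(token_depths("<html><body><h1>Title</h1><p>Hello <b>world</b></p></body></html>"))
```

In a full model, each depth value would index an additional embedding table whose output is added to the token's other input embeddings.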

Open Domain Machine Reading Comprehension using InferSent (InferSent를 활용한 오픈 도메인 기계독해)

  • Jeong-Hoon, Kim;Jun-Yeong, Kim;Jun, Park;Sung-Wook, Park;Se-Hoon, Jung;Chun-Bo, Sim
    • Smart Media Journal / v.11 no.10 / pp.89-96 / 2022
  • Open domain machine reading comprehension is a model that adds a paragraph search function because no paragraph related to a given question is provided. Document search suffers from lower performance when there are many documents, despite abundant research on word-frequency-based TF-IDF. Paragraph selection fails to extract paragraph context, including sentence characteristics, accurately, despite much research on word-based embedding. Document reading comprehension suffers from slow learning due to the growing number of parameters, despite much research on BERT. To solve these three issues, this study used BM25, which also takes sentence length into account, and InferSent to capture sentence context, and proposed an open domain machine reading comprehension model with ALBERT to reduce the number of parameters. An experiment was conducted with the SQuAD1.1 dataset. BM25 outperformed TF-IDF in document search by 3.2%. InferSent outperformed Transformer in paragraph selection by 0.9%. Finally, as the number of paragraphs increased in document comprehension, ALBERT was 0.4% higher in EM and 0.2% higher in F1.
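
To show how BM25 differs from plain term-frequency weighting, here is a minimal sketch of the common Okapi BM25 scoring formula, in which a term's contribution saturates with its frequency and is normalized by text length. It is not the paper's retrieval pipeline: the parameter defaults `k1=1.5` and `b=0.75`, the whitespace tokenization, and the sample texts are assumptions made for illustration.

```python
import math
from collections import Counter


def bm25_scores(query: str, docs: list[str], k1: float = 1.5, b: float = 0.75) -> list[float]:
    """Score every document against the query with Okapi BM25."""
    tokenized = [d.lower().split() for d in docs]
    n = len(tokenized)
    avgdl = sum(len(d) for d in tokenized) / n
    # Document frequency: in how many documents each term appears.
    df = Counter(term for d in tokenized for term in set(d))
    scores = []
    for d in tokenized:
        tf = Counter(d)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            # Length normalization: longer-than-average texts are penalized.
            denom = tf[term] + k1 * (1 - b + b * len(d) / avgdl)
            score += idf * tf[term] * (k1 + 1) / denom
        scores.append(score)
    return scores


docs = [
    "the cat sat on the mat",
    "dogs chase cats in the park",
    "stock markets fell sharply today",
]
scores = bm25_scores("cat mat", docs)
print(scores.index(max(scores)))  # index of the best-matching text
```

With `b` set to 0 the length normalization is switched off, which makes the effect of text length easy to inspect in isolation.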

Keyword Weight based Paragraph Extraction Algorithm (키워드 가중치 기반 문단 추출 알고리즘)

  • Lee, Jongwon;Joo, Sangwoong;Lee, Hyunju;Jung, Hoekyung
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2017.10a / pp.504-505 / 2017
  • Existing morpheme analyzers classify the words used in documents, and systems that extract sentences and paragraphs based on a morpheme analyzer are being developed. However, very few systems compress documents and extract important paragraphs. The algorithm proposed in this paper calculates the weights of the keywords in a document and extracts the paragraphs containing those keywords. Users can reduce the time needed to understand a document by reading only the paragraphs containing the keywords instead of the entire document. In addition, since the number of extracted paragraphs varies with the number of keywords used in the search, the user can search in more varied patterns than with existing systems.
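
A rough sketch of keyword-weighted paragraph extraction follows. The abstract does not give the weighting formula, so this example assumes a keyword's weight is its relative frequency in the whole document and ranks each matching paragraph by the summed weights of the keywords it contains; the function name `extract_paragraphs` and the regex-based word splitting (in place of a morpheme analyzer) are likewise assumptions.

```python
import re
from collections import Counter


def extract_paragraphs(document: str, keywords: list[str]) -> list[str]:
    """Return paragraphs containing any keyword, highest total weight first."""
    paragraphs = [p.strip() for p in document.split("\n\n") if p.strip()]
    words = re.findall(r"\w+", document.lower())
    counts = Counter(words)
    total = len(words) or 1
    # Assumed weighting: relative frequency of each keyword in the document.
    weights = {k.lower(): counts[k.lower()] / total for k in keywords}
    scored = []
    for p in paragraphs:
        present = weights.keys() & set(re.findall(r"\w+", p.lower()))
        if present:
            scored.append((sum(weights[k] for k in present), p))
    # sorted() is stable, so ties keep their original document order.
    scored = sorted(scored, key=lambda item: item[0], reverse=True)
    return [p for _, p in scored]


document = (
    "Security matters for documents.\n\n"
    "The weather is nice.\n\n"
    "Document security needs encryption and security audits."
)
for paragraph in extract_paragraphs(document, ["security", "encryption"]):
    print(paragraph)
```

Adding or removing keywords changes which paragraphs match, which mirrors the abstract's point that the number of extracted paragraphs depends on the number of keywords.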


XML-based EDI Document Processing System with Binary Format Mapping Rules

  • Kim, Chang-Su;Jung, Hoe-Kyung
    • Journal of Information and Communication Convergence Engineering / v.10 no.3 / pp.258-263 / 2012
  • Recently, the volume of electronic data interchange (EDI) document processing for port logistics has been increasing sharply. The existing system processes EDI documents in a script mode, but because of its complicated script preparation procedure and low document processing efficiency, it cannot meet demand as document traffic grows. In this paper, an EDI electronic document processing system was designed and implemented with a document scanner and mapper, which are binary-format electronic document processing tools that do not require script files during the conversion of extensible markup language (XML)-based electronic documents. The new system retains the merits of XML during reading and writing, with improved speed, ease of use, and good portability across systems compared with conventional ones.

The Analysis of the security requirements for a circulation of the classified documents (비밀문서유통을 위한 보안 요구사항 분석)

  • Lee, Ji-Yeong;Park, Jin-Seop;Kang, Seong-Ki
    • Journal of National Security and Military Science / s.1 / pp.361-390 / 2003
  • In this paper, we analyze the security requirements for the circulation of classified documents. Across all phases of document processing, including drafting, sending and receiving messages, document approval, storing and saving, reading, examining, sending out, and canceling a document, we identify the accompanying threat factors and derive every security threat. We also propose an appropriate security approach for each. Finally, we present security guidelines for the security architecture of classified document circulation.
