• Title/Summary/Keyword: Multimedia Documents

An Analysis of Element Information in XML Documents

  • Kim, Sungrim;Yoon, Yong-ik
    • Proceedings of the IEEK Conference / 2002.07b / pp.834-837 / 2002
  • This paper proposes a way to analyze XML documents according to their element information. XML documents, which are becoming the new standard for expressing and exchanging data on the Internet, do not have a defined schema. It is not adequate to directly apply existing relational or object-oriented database query languages to XML documents. Research on how to extract schemas for XML documents and on query languages is actively under way. For a user's query, the results could be too many or too few, so it is important to give users adequate results. Our proposed analysis method can be narrowed or extended to correspond to users' queries more flexibly.
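The abstract does not spell out what "element information" is collected, so the following is only a minimal sketch under an assumption: that one useful summary of a schema-less XML document is how often each element path occurs. The function name `element_path_counts` and the sample document are hypothetical and are not taken from the paper.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def element_path_counts(xml_text):
    """Count how often each element path (e.g. /library/book) occurs."""
    root = ET.fromstring(xml_text)
    counts = Counter()

    def walk(node, prefix):
        # Build the path of this element from its parent's path.
        path = f"{prefix}/{node.tag}"
        counts[path] += 1
        for child in node:
            walk(child, path)

    walk(root, "")
    return counts


# Hypothetical sample document without any schema.
sample = (
    "<library>"
    "<book><title>A</title><author>X</author></book>"
    "<book><title>B</title></book>"
    "</library>"
)
summary = element_path_counts(sample)
```

A summary like this could be narrowed to shorter paths or extended to deeper ones, depending on how many results a query returns.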

A Method for Measuring Similarity Measure of Thesaurus Transformation Documents using DBSCAN (DBSCAN을 활용한 유의어 변환 문서 유사도 측정 방법)

  • Kim, Byeongsik;Shin, Juhyun
    • Journal of Korea Multimedia Society / v.21 no.9 / pp.1035-1043 / 2018
  • In some cases, the core content of another person's work is presented as one's own thoughts by rewording it without citing the source. The plagiarism test of the free CopyKiller service, which is used for plagiarism checking, works by comparing matches of six or more words. However, a six-word match is not enough to judge plagiarism when words have been replaced with similar words. Therefore, in this paper, we construct word clusters using the DBSCAN algorithm, find synonyms, convert the words in the clusters into representative synonyms, and construct L-R tables through L-R parsing. We then propose a method for determining the similarity of documents by applying weights to the thesaurus and weights for each paragraph of the thesis.
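A small sketch may help show why a six-word match fails once words are swapped for synonyms, and why mapping words to a representative synonym first restores the match. It assumes the synonym clusters already exist; the `representative` mapping below is a hand-written, hypothetical stand-in for the output of clustering. The DBSCAN clustering, L-R parsing, and weighting steps of the paper are not shown.

```python
def ngrams(words, n=6):
    """Return the set of all runs of n consecutive words."""
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def normalize(words, representative):
    """Replace each word with its cluster's representative synonym, if any."""
    return [representative.get(w, w) for w in words]


def shares_ngram(a, b, n=6):
    """True if the two word lists share at least one run of n words."""
    return bool(ngrams(a, n) & ngrams(b, n))


# Hypothetical cluster output: each word maps to its representative synonym.
representative = {"big": "large", "huge": "large", "quick": "fast", "rapid": "fast"}

original = "the large dog ran fast across the field".split()
reworded = "the huge dog ran quick across the field".split()

raw_match = shares_ngram(original, reworded)
normalized_match = shares_ngram(
    normalize(original, representative),
    normalize(reworded, representative),
)
```

Here the raw comparison finds no shared six-word run, while the normalized comparison does.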

Analysis of Indexing Schemes for Structure-Based Retrieval (구조 기반 검색을 위한 색인 구조에 대한 분석)

  • 김영자;김현주;배종민
    • Journal of Korea Multimedia Society / v.7 no.5 / pp.601-616 / 2004
  • Information retrieval systems for structured documents provide multiple levels of retrieval capability by supporting structure-based queries. To process structure-based queries on structured documents, information about the structural nesting relationships between elements and about element order must be maintained. This paper presents four index structures that can process various types of structural queries, such as queries on structural relationships between elements or on element occurrence order. The proposed algorithms are based on the concept of the Global Document Instance Tree.
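The abstract does not describe the four index structures or the Global Document Instance Tree in detail, so the sketch below swaps in a different, widely used technique: interval numbering from a depth-first walk. It is included only to illustrate how nesting and order information can be kept in an index. All names (`build_interval_index`, `contains`, `precedes`) and the sample document are hypothetical.

```python
import xml.etree.ElementTree as ET


def build_interval_index(xml_text):
    """Give each element a (tag, start, end) entry from a depth-first walk."""
    root = ET.fromstring(xml_text)
    index = []      # entries in document order
    counter = 0

    def walk(node):
        nonlocal counter
        start = counter
        counter += 1
        slot = len(index)
        index.append(None)          # reserve the slot to keep document order
        for child in node:
            walk(child)
        index[slot] = (node.tag, start, counter)
        counter += 1

    walk(root)
    return index


def contains(a, b):
    """True if element a is a proper ancestor of element b."""
    return a[1] < b[1] and b[2] < a[2]


def precedes(a, b):
    """True if element a ends before element b starts."""
    return a[2] < b[1]


sample = "<doc><sec><title/><para/></sec><sec><para/></sec></doc>"
index = build_interval_index(sample)
```

With these intervals, a nesting query becomes a pair of integer comparisons, and an order query becomes a single comparison.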

Designing and Implementing EJB Component for Transform XML Documents based on Object Model (객체 모델 기반 XML 문서 변환을 위한 EJB 컴포넌트 설계 및 구현)

  • 김용수;임종선;주경수
    • Journal of Korea Multimedia Society / v.5 no.4 / pp.468-476 / 2002
  • Nowadays, much research addresses component-based software development as a way to obtain reliable software and reduce cost. One of the challenges in designing a component-based system is determining which components are required and where they fit in the overall system architecture. In this paper, we developed an EJB component for transforming RDB instances to XML documents. Accordingly, users can build XML applications based on a relational database just by assembling this component. Consequently, they can reduce the time and cost of developing their XML applications.
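As a rough illustration of the transformation itself (rows of a relational table turned into an XML document), here is a minimal sketch in Python using the standard `sqlite3` and `xml.etree.ElementTree` modules. It is not the paper's EJB component, and the table name, the `row` element name, and the mapping of columns to child elements are assumptions made for the example.

```python
import sqlite3
import xml.etree.ElementTree as ET


def table_to_xml(conn, table):
    """Serialize every row of a table as <table><row><col>...</col></row></table>.

    The table name is interpolated into the SQL text, so it is assumed to
    come from trusted code, not from user input.
    """
    cur = conn.execute(f"SELECT * FROM {table}")
    columns = [d[0] for d in cur.description]
    root = ET.Element(table)
    for row in cur:
        rec = ET.SubElement(root, "row")
        for name, value in zip(columns, row):
            # Each column becomes a child element; NULL becomes empty text.
            ET.SubElement(rec, name).text = "" if value is None else str(value)
    return ET.tostring(root, encoding="unicode")


# Hypothetical in-memory database for demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE book (id INTEGER, title TEXT)")
conn.executemany(
    "INSERT INTO book VALUES (?, ?)",
    [(1, "XML Basics"), (2, "EJB Design")],
)
xml_doc = table_to_xml(conn, "book")
```

Mapping columns to child elements rather than attributes is only one possible design; an object-model-based mapping such as the one the paper describes may differ.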

Table based Matching Algorithm for Soft Categorization of News Articles in Reuter 21578

  • Jo, Tae-Ho
    • Journal of Korea Multimedia Society / v.11 no.6 / pp.875-882 / 2008
  • This research proposes an alternative to machine-learning-based approaches to text categorization. To use machine-learning-based approaches for any text mining task, documents must be encoded into numerical vectors, which causes two problems: huge dimensionality and sparse distribution. Although text mining covers various tasks, such as text categorization, text clustering, and text summarization, the scope of this research is restricted to text categorization. The idea of this research is to avoid the two problems by encoding a document or documents into a table instead of numerical vectors. The goal is therefore to improve the performance of text categorization by proposing approaches that are free from the two problems.
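The idea of encoding documents as tables rather than fixed-length vectors can be sketched as follows. The abstract does not give the weighting or matching rule, so this example assumes word frequency as the weight and a sum of weight products over shared words as the matching score. All function names and the tiny training set are hypothetical.

```python
from collections import Counter


def encode(text):
    """Encode a document as a table mapping each word to its frequency."""
    return Counter(text.lower().split())


def build_category_tables(labeled_docs):
    """Merge the tables of all training documents of each category."""
    tables = {}
    for label, text in labeled_docs:
        tables.setdefault(label, Counter()).update(encode(text))
    return tables


def match_score(doc_table, category_table):
    """Assumed rule: sum the products of weights over shared words."""
    shared = doc_table.keys() & category_table.keys()
    return sum(doc_table[w] * category_table[w] for w in shared)


def categorize(text, tables):
    """Soft categorization: return a score for every category."""
    doc_table = encode(text)
    return {label: match_score(doc_table, t) for label, t in tables.items()}


# Hypothetical training data with two categories.
training = [
    ("grain", "wheat corn harvest wheat"),
    ("trade", "export import tariff trade"),
]
tables = build_category_tables(training)
scores = categorize("wheat export harvest", tables)
```

Because each table holds only the words that actually occur, no document is padded out to the size of the full vocabulary, which is how the table encoding sidesteps the dimensionality and sparsity problems named in the abstract.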

A Method on Associated Document Recommendation with Word Correlation Weights (단어 연관성 가중치를 적용한 연관 문서 추천 방법)

  • Kim, Seonmi;Na, InSeop;Shin, Juhyun
    • Journal of Korea Multimedia Society / v.22 no.2 / pp.250-259 / 2019
  • Big data processing technology and artificial intelligence (AI) are attracting increasing attention. Natural language processing is an important research area of artificial intelligence. In this paper, we use Korean news articles to extract topic distributions in documents and word distribution vectors in topics through LDA-based topic modeling. Then, we use Word2vec to vectorize words and generate a weight matrix to derive a relevance score that considers the semantic relationships between words. We propose a way to recommend documents in descending order of score.

Grouping of Multimedia Documents using SRR and DRR (SRR과 DRR을 이용한 멀티미디어 문서 그룹화)

  • 이종득;김양범;정택원
    • Journal of the Korea Computer Industry Society / v.2 no.4 / pp.435-442 / 2001
  • With the growing usefulness of information on the Internet, several methods have been proposed for efficiently managing and retrieving multimedia information. This paper proposes a new grouping method based on SRR (Semantic Reference Relation) and DRR (Direct Reference Relation). The key point of the proposed method is to group MDI (Multimedia Document Information) as clusters of multimedia objects. According to an experimental simulation on 1,000 multimedia items from the Internet, the method enables more efficient service and grouping of MDI than other methods.

Representing and Processing Multimedia and Structured Documents For XML-Based Virtual Documents (XML 기반 가상문서에서의 멀티미디어 및 구조적 문서의 표현과 처리)

  • 박천수;임동수;박종현;강민구;강지훈
    • Proceedings of the Korean Information Science Society Conference / 2000.10a / pp.246-248 / 2000
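  • A virtual document is the concept of creating a new document by using links to bring together only the desired parts of content that exists on the Web. In this paper, we extended the DTD notation so that a digital library system supporting virtual documents can process not only text and image data but also multimedia data and data with structural meaning. In addition, we presented a method for handling multimedia and structured documents in the document converter so that virtual documents with various kinds of links created by the authoring tool, such as embedded links, reference links, and generic links, can be browsed.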

Obscene Material Searching Method in WWW (WWW상에서 음란물 검색기법)

  • 노경택;김경우;이기영;김규호
    • Journal of the Korea Society of Computer and Information / v.4 no.2 / pp.1-7 / 1999
  • The World-Wide Web (WWW) changed information exchange on existing networks, which had centered on text documents, into the exchange of multimedia data. Because data are stored as hypertext, even a beginner can search for and access the data he or she wants. The ease of searching for and accessing multimedia data on the WWW has played an important role in making obscene materials widespread and multimedia-based, and their commercialization causes social problems. To solve such problems, researchers have actively studied ways to effectively block sites that provide obscene materials. This paper presents and implements a method for blocking sites that contain obscene material by searching for them effectively. The proposed model is based on a link-based information retrieval method, and it retrieved relevant documents more efficiently than the probabilistic model, which is known to generate the most accurate results. Average recall and precision improved by 12% and 8%, respectively. In particular, the ability to retrieve relevant documents that include non-text data and have few links increased markedly.
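For readers unfamiliar with the two measures reported above, the sketch below computes precision and recall for a single query using their standard definitions. The document identifiers are made up for illustration and have no connection to the paper's experiments.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    precision = relevant documents retrieved / all documents retrieved
    recall    = relevant documents retrieved / all relevant documents
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical query: four documents retrieved, three actually relevant.
p, r = precision_recall(["a", "b", "c", "d"], ["a", "c", "e"])
```

Averaging these values over a set of queries gives the kind of average recall and precision figures the abstract compares.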

An Efficient Algorithm for Detecting Tables in HTML Documents (HTML 문서의 테이블 식별을 위한 효율적인 알고리즘)

  • Kim Yeon-Seok;Lee Kyong-Ho
    • Journal of Korea Multimedia Society / v.7 no.10 / pp.1339-1353 / 2004
  • <TABLE> tags in HTML documents are widely used for formatting the layout of Web documents as well as for describing genuine tables with relational information. As a prerequisite for information extraction from the Web, this paper presents an efficient method for sophisticated table detection. The proposed method consists of two phases: preprocessing and attribute-value relations extraction. In the preprocessing phase, where clearly genuine or non-genuine tables are filtered out, appropriate rules are devised based on a careful examination of the general characteristics of <TABLE> tags. The remaining tables are classified in the attribute-value relations extraction phase. Specifically, a value area is extracted and checked for syntactic coherency. Furthermore, the method looks for semantic coherency between the attribute area and the value area of a table for which the syntactic coherency check may be inappropriate. Experimental results with 11,477 <TABLE> tags from 1,393 HTML documents show that the method performed better than previous works, with a precision of 97.54% and a recall of 99.22% on average.
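The preprocessing phase can be pictured with a small sketch that collects the shape of each table and applies one filtering rule. The rule used here (at least two rows, each with two or more cells) is a hypothetical example and is not one of the paper's rules. The class and function names are also assumptions, and the later syntactic and semantic coherency checks are not shown.

```python
from html.parser import HTMLParser


class TableCollector(HTMLParser):
    """Record the number of cells in each row of every <table>."""

    def __init__(self):
        super().__init__()
        self.stack = []    # tables currently open (handles nesting)
        self.tables = []   # finished tables, each a list of per-row cell counts

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.stack.append([])
        elif tag == "tr" and self.stack:
            self.stack[-1].append(0)
        elif tag in ("td", "th") and self.stack and self.stack[-1]:
            self.stack[-1][-1] += 1

    def handle_endtag(self, tag):
        if tag == "table" and self.stack:
            self.tables.append(self.stack.pop())


def looks_genuine(rows):
    """Hypothetical rule: at least two rows, each with two or more cells."""
    return len(rows) >= 2 and all(n >= 2 for n in rows)


# Hypothetical page with one layout table and one data table.
html = (
    "<table><tr><td>page layout</td></tr></table>"
    "<table><tr><th>Name</th><th>Price</th></tr>"
    "<tr><td>Pen</td><td>2</td></tr></table>"
)
collector = TableCollector()
collector.feed(html)
flags = [looks_genuine(t) for t in collector.tables]
```

A single-cell table like the first one is a typical layout device, so a cheap structural rule can discard it before any more expensive content analysis runs.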
