• Title/Summary/Keyword: Information retrieval


A Personal Digital Library on a Distributed Mobile Multiagents Platform (분산 모바일 멀티에이전트 플랫폼을 이용한 사용자 기반 디지털 라이브러리 구축)

  • Cho Young Im
    • Journal of KIISE:Software and Applications / v.31 no.12 / pp.1637-1648 / 2004
  • When digital libraries are developed on a traditional client/server system using a single agent in a distributed environment, several problems occur. First, because the search method is one-dimensional, the search results have little relationship to each other. Second, the results do not reflect the user's preferences. Third, users have to go through certification whenever a client connects to the server. As a result, document retrieval is less efficient, causing dissatisfaction with the system. I propose a new mobile multiagent platform for a personal digital library to overcome these problems. To develop this platform, I combine the existing DECAF multiagent platform with the Voyager mobile ORB and propose a new negotiation algorithm and a new scheduling algorithm. Although there has been some research on personal digital libraries, I believe there have been few studies on their integration and systemization. For searches of related information, the proposed platform can strengthen the relationship among search results by subdividing the related documents, which are classified by a supervised neural network. To reflect the user's preferences, modular clients are applied to a neural network so that the search results are optimized. Combining a mobile platform with a multiagent platform yields a new mobile multiagent platform that reduces the network burden. Furthermore, the new negotiation and scheduling algorithms are applied to improve the effectiveness of PDS. The simulation results demonstrate that as the number of servers and agents increases, the search time for PDS decreases, while the degree of user satisfaction is four times greater than with the C/S model.

Development of Collaborative Environment for Community-driven Scientific Data Curation (커뮤니티 주도적 과학 데이터 큐레이션 협업 환경의 개발)

  • Choi, Dong-Hoon;Park, Jae-Won;Kim, ByungKyu;Shin, Jin-Sup
    • The Journal of the Korea Contents Association / v.17 no.9 / pp.1-11 / 2017
  • The importance of data curation is increasingly recognized as the need for data reuse grows drastically. Because of the recent data explosion, scientists invest almost 90% of their effort in retrieving and collecting the data needed for their studies. In this paper, we describe the development and application of a collaborative environment for community-driven data curation, which is essential for enhancing the reusability and citability of scientific data. The collaborative scientific data curation environment focuses on cross-linking data (or data collections) with their associated literature in order to capture and organize the inter-relations among research results in a specific domain. In addition, plenty of contextual information is provided as metadata to help users understand the data. The cross-linking is realized with the DOI system, which guarantees global accessibility to the data and their relationships to the literature. The curation environment has been adopted by a globally well-known intrinsically disordered protein research group to build a community-driven curated DB. The curated DB will drastically reduce researchers' efforts to retrieve and collect the data required for scientific discovery.

Multiple Cause Model-based Topic Extraction and Semantic Kernel Construction from Text Documents (다중요인모델에 기반한 텍스트 문서에서의 토픽 추출 및 의미 커널 구축)

  • 장정호;장병탁
    • Journal of KIISE:Software and Applications / v.31 no.5 / pp.595-604 / 2004
  • Automatic analysis of concepts or semantic relations in text documents enables not only efficient acquisition of relevant information but also comparison of documents at the concept level. We present a multiple cause model-based approach to text analysis, in which latent topics are automatically extracted from document sets and the similarity between documents is measured by semantic kernels constructed from the extracted topics. In our approach, a document is assumed to be generated by various combinations of underlying topics. A topic is defined by a set of words that are related to the same theme or co-occur frequently within a document. In a network representing a multiple-cause model, each topic is identified by a group of words having high connection weights from a latent node. To facilitate learning and inference in multiple-cause models, some approximation method is required, and we use an approximation by Helmholtz machines. In an experiment on the TDT-2 data set, we extract sets of meaningful words in which each set contains theme-specific terms. Using semantic kernels constructed from the latent topics extracted by multiple cause models, we also achieve significant improvements in retrieval effectiveness over the basic vector space model.
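
The idea of comparing documents through extracted topics rather than raw terms can be illustrated with a small sketch. The abstract does not give the form of the kernel, so the linear kernel in topic space used here is an assumption, and the vocabulary, the `TOPIC_WEIGHTS` matrix, and the function names are hypothetical stand-ins for weights that a multiple cause model would learn.

```python
# Hypothetical 4-term vocabulary: ["car", "automobile", "engine", "banana"].
# Hypothetical topic-to-word connection weights (rows = topics, columns = terms).
# In the paper these would come from a trained multiple cause model.
TOPIC_WEIGHTS = [
    [1.0, 1.0, 1.0, 0.0],  # a "vehicle" topic
    [0.0, 0.0, 0.0, 1.0],  # a "fruit" topic
]


def dot(u, v):
    """Plain inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def to_topic_space(doc, topic_weights):
    """Map a term-frequency vector to one activation score per topic."""
    return [dot(topic, doc) for topic in topic_weights]


def semantic_kernel(doc_a, doc_b, topic_weights):
    """Assumed linear kernel: inner product of the two topic-space vectors."""
    return dot(to_topic_space(doc_a, topic_weights),
               to_topic_space(doc_b, topic_weights))


doc_car = [1.0, 0.0, 0.0, 0.0]         # contains only "car"
doc_automobile = [0.0, 1.0, 0.0, 0.0]  # contains only "automobile"
doc_banana = [0.0, 0.0, 0.0, 1.0]      # contains only "banana"

plain = dot(doc_car, doc_automobile)                                # no shared terms
semantic = semantic_kernel(doc_car, doc_automobile, TOPIC_WEIGHTS)  # shared topic
unrelated = semantic_kernel(doc_car, doc_banana, TOPIC_WEIGHTS)     # no shared topic
print(plain, semantic, unrelated)
```

Two documents with no terms in common score zero under the basic vector space model but receive a positive similarity once both are projected onto a shared topic, which is the effect the abstract attributes to semantic kernels.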

A MPEG Audio-Visual Conversational Communication Terminal on the B-ISDN Environment (광대역 ISDN용 MPEG 오디오-비쥬열 대화형 통신단말의 설계 및 구현)

  • Hwang, Dae-Hwan;Cho, Kyu-Seob
    • The Transactions of the Korea Information Processing Society / v.5 no.8 / pp.1960-1971 / 1998
  • Research and development to provide multimedia communication services such as Video on Demand (VoD), real-time video phone, and multipoint video conferencing in broadband ISDN environments have been actively pursued. The Digital Audio-Visual Council (DAVIC) has already finished its specifications for VoD services, which cover the detailed technologies of a total service system consisting of a VoD server, a delivery network, and a Set-Top Box (STB), and ITU-T SG16 has also recommended the H.300-series standards on terminal aspects for conversational multimedia services. However, the multimedia terminal architectures recommended and specified by these organizations do not have an efficient structure for providing all of the retrieval, distribution, and conversational services, because the organizations take different points of view on multimedia terminals and services. In this paper, we analyze the recommendations and specifications of international public and private organizations such as ITU-T, DAVIC, and the ATM Forum. Based on this analysis, we propose an efficient terminal architecture, and we have designed and implemented a multimedia communication terminal for offering VoD and real-time conversation [...] functional module test according to the individual communication service session, and confirmed the validity of the implemented terminal for use in broadband ISDN environments.


Construction of Component Repository for Supporting the CBD Process (CBD 프로세스 지원을 위한 컴포넌트 저장소의 구축)

  • Cha, Jung-Eun;Kim, Hang-Kon
    • Journal of KIISE:Software and Applications / v.29 no.7 / pp.476-486 / 2002
  • CBD (Component Based Development) has become the leading strategic method for business applications. Because CBD is a new development paradigm that makes it possible to assemble software components into applications, it copes with rapid changes in business processes and meets the increasing requirements for productivity. Since business processes are changing rapidly, CBD technology is a promising way to address the productivity problem. In particular, the repository is the most important part of the development, distribution, and reuse of components. In a component repository, we can store and manage the related work products produced at each step of component development as well as the components themselves. In this paper, we suggest a practical approach to repository construction that supports and realizes the CBD process, and we develop CRMS (Component Repository Management System) as an implementation of the proposed techniques. CRMS can manage a variety of component products based on the component architecture, and it helps software developers search for candidate components for their projects and understand a variety of information about each component. In short, a practical approach to a component repository is suggested, and a supporting environment is constructed so that CBD works efficiently. We expect this work to be valuable research on component repositories and on support for the entire Component Based Development process.

Detecting Errors in POS-Tagged Corpus on XGBoost and Cross Validation (XGBoost와 교차검증을 이용한 품사부착말뭉치에서의 오류 탐지)

  • Choi, Min-Seok;Kim, Chang-Hyun;Park, Ho-Min;Cheon, Min-Ah;Yoon, Ho;Namgoong, Young;Kim, Jae-Kyun;Kim, Jae-Hoon
    • KIPS Transactions on Software and Data Engineering / v.9 no.7 / pp.221-228 / 2020
  • A part-of-speech (POS) tagged corpus is a collection of electronic text in which each word is annotated with its corresponding POS tag, and it is widely used as training data for natural language processing. Training data is generally assumed to contain no errors, but in reality it includes various types of errors, which degrade the performance of systems trained on it. To alleviate this problem, we propose a novel method for detecting errors in an existing POS-tagged corpus using an XGBoost classifier and cross-validation as the evaluation technique. We first train a POS-tagger classifier on the POS-tagged corpus, which contains some errors, and then detect errors in that corpus using cross-validation; however, the classifier cannot detect errors directly because there is no training data for POS-tagging errors. We therefore detect errors by comparing the outputs of the classifier (the POS probabilities) while adjusting hyperparameters. The hyperparameters are estimated on a small error-tagged corpus, whose text is sampled from the POS-tagged corpus and whose POS errors are marked up by experts. In this paper, we use recall and precision, which are widely used in information retrieval, as evaluation metrics. Because not all detected errors can be checked, we show that the proposed method is valid by comparing the distributions of the sample (the error-tagged corpus) and the population (the POS-tagged corpus). In the near future, we will apply the proposed method to a dependency tree-tagged corpus and a semantic role-tagged corpus.
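
As a rough illustration of detecting errors from classifier probabilities and scoring the result with precision and recall, consider the sketch below. The abstract does not specify how the POS probabilities are compared, so the simple threshold rule, the function names, and the toy data are all assumptions; `tag_probs` stands in for held-out probabilities that a cross-validated classifier would produce.

```python
def flag_suspects(annotated_tags, tag_probs, threshold):
    """Flag token indices whose annotated tag gets a probability below threshold.

    Assumed rule for illustration only; the threshold plays the role of a
    hyperparameter tuned on a small expert-checked sample.
    """
    return {
        i
        for i, (tag, probs) in enumerate(zip(annotated_tags, tag_probs))
        if probs.get(tag, 0.0) < threshold
    }


def precision_recall(flagged, true_errors):
    """Precision and recall of the flagged set against expert-marked errors."""
    true_positives = len(flagged & true_errors)
    precision = true_positives / len(flagged) if flagged else 0.0
    recall = true_positives / len(true_errors) if true_errors else 0.0
    return precision, recall


# Toy corpus: tags as annotated in the corpus (some may be wrong).
annotated = ["NN", "VB", "NN", "JJ"]
# Hypothetical held-out class probabilities for each token.
tag_probs = [
    {"NN": 0.9, "VB": 0.1},
    {"VB": 0.3, "NN": 0.7},
    {"NN": 0.2, "JJ": 0.8},
    {"JJ": 0.4, "NN": 0.6},
]
expert_errors = {1, 2}  # indices an expert marked as tagging errors

flagged = flag_suspects(annotated, tag_probs, threshold=0.5)
precision, recall = precision_recall(flagged, expert_errors)
print(sorted(flagged), round(precision, 3), recall)
```

Raising the threshold flags more tokens, which tends to increase recall at the expense of precision; this is the trade-off that tuning on an error-tagged sample is meant to balance.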

An Interconnection Method for Streaming Framework and Multimedia Database (스트리밍 프레임워크와 멀티미디어 데이타베이스와의 연동기법)

  • Lee, Jae-Wook;Lee, Sung-Young;Lee, Jong-Won
    • Journal of KIISE:Software and Applications / v.29 no.7 / pp.436-449 / 2002
  • This paper describes our experience in developing the Database Connector as an interconnection method between a multimedia database and a streaming framework. If an interconnection method is provided between the streaming system and multimedia databases, diverse and mature multimedia database services, such as retrieval and join operations, can be supported during streaming. The currently available interconnection schemes, however, have mainly used file systems or relational databases, implemented so that the metadata, which describes the multimedia contents, is kept separate from the streaming data, which is the multimedia data itself. Consequently, existing interconnection mechanisms cannot offer many of the benefits of multimedia database services during streaming. To resolve these drawbacks, we propose a novel scheme for interconnecting a streaming framework and a multimedia database, called the Inter-Process Communication (IPC) based Database Connector, under the assumption that the two systems are located on the same host. We define four transaction primitives (Read, Write, Find, and Play) and an interface for transactions that is implemented as a plug-in, so that it can be extended to other multimedia databases that appear in the coming years. Our simulation study shows that the performance of the proposed IPC-based interconnection scheme is not far behind that of file systems.
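
A plug-in interface built around the four transaction primitives might look roughly like the following sketch. Only the primitive names (Read, Write, Find, Play) come from the abstract; the class names, method signatures, and the in-memory backend are hypothetical, and the IPC layer between the streaming framework and the database is omitted entirely.

```python
from abc import ABC, abstractmethod
from typing import Dict, Iterator, List, Tuple


class MediaDatabasePlugin(ABC):
    """Hypothetical plug-in interface exposing the four primitives."""

    @abstractmethod
    def read(self, object_id: str) -> bytes:
        """Return the stored media data for one object."""

    @abstractmethod
    def write(self, object_id: str, data: bytes, metadata: dict) -> None:
        """Store media data together with its metadata."""

    @abstractmethod
    def find(self, **criteria) -> List[str]:
        """Return ids of objects whose metadata matches all criteria."""

    @abstractmethod
    def play(self, object_id: str, chunk_size: int) -> Iterator[bytes]:
        """Yield the media data in chunks, as a streaming server would."""


class InMemoryPlugin(MediaDatabasePlugin):
    """Toy backend standing in for a real multimedia database."""

    def __init__(self) -> None:
        self._store: Dict[str, Tuple[bytes, dict]] = {}

    def read(self, object_id: str) -> bytes:
        return self._store[object_id][0]

    def write(self, object_id: str, data: bytes, metadata: dict) -> None:
        self._store[object_id] = (data, dict(metadata))

    def find(self, **criteria) -> List[str]:
        return [
            object_id
            for object_id, (_, meta) in self._store.items()
            if all(meta.get(key) == value for key, value in criteria.items())
        ]

    def play(self, object_id: str, chunk_size: int) -> Iterator[bytes]:
        data = self.read(object_id)
        for start in range(0, len(data), chunk_size):
            yield data[start:start + chunk_size]


plugin = InMemoryPlugin()
plugin.write("clip1", b"0123456789", {"title": "demo"})
matches = plugin.find(title="demo")
chunks = list(plugin.play("clip1", chunk_size=4))
print(matches, chunks)
```

Keeping the primitives behind an abstract base class means a different database can be supported by adding another subclass, which is the kind of extensibility the abstract attributes to the plug-in design.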

Development of Real-time Video Search System Using the Intelligent Object Recognition Technology (지능형 객체 인식 기술을 이용한 실시간 동영상 검색시스템)

  • Chang, Jae-Young;Kang, Chan-Hyeok;Yoon, Jae-Min;Cho, Jae-Won;Jung, Ji-Sung;Chun, Jonghoon
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.20 no.6 / pp.85-91 / 2020
  • Recently, video recording equipment such as CCTV has seen growing use for crime prevention and general safety. Because this equipment operates around the clock, the need for security personnel is reduced, and the cost of managing that manpower should decrease as well. However, the technology in predominant use cannot search on its own for a specific object, such as a person, in recorded video, so the search has to be done manually; current security video equipment is therefore insufficient in environments where real-time information retrieval is required. In this paper, we propose a technology that uses the latest deep-learning technology and the OpenCV library to quickly search for a specific person in a video; the search is based on clothing information entered by the user, and the result is transmitted in real time. We implemented our system to automatically recognize human objects in real time using the YOLO library, while deep learning is used to classify clothes into top and bottom garments. Colors are detected through the OpenCV library, and all of these results are combined to identify the requested object. The system presented in this paper not only recognizes a person wearing specific clothing accurately and quickly, but can also be extended to other types of object recognition in video surveillance systems for various purposes.

Design of Standard Metadata Schema for Computing Resource Management (컴퓨팅 리소스 관리를 위한 표준 메타데이터 스키마 설계)

  • Lee, Mikyoung;Cho, Minhee;Song, Sa-Kwang;Yim, Hyung-Jun
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2022.10a / pp.433-435 / 2022
  • In this paper, we introduce a computing resource standard metadata schema design plan for registering, retrieving, and managing computing resources used for research data analysis and utilization in the Korea Research Data Commons(KRDC). KRDC is a joint utilization system of research data and computing resources to maximize the sharing and utilization of research data. Computing resources refer to all resources in the computing environment, such as analysis infrastructure and analysis software, necessary to analyze and utilize research data used in the entire research process. The standard metadata schema for KRDC computing resource management is designed by considering common attributes for computing resource management and other attributes according to each computing resource feature. The standard metadata schema for computing resource management consists of a computing resource metadata schema and a computing resource provider metadata schema. In addition, the metadata schema of computing resources and providers was designed as a service schema and a system schema group according to their characteristics. The standard metadata schema designed in this paper is used for computing resource registration, retrieval, management, and workflow services for computing resource providers and computing resource users through the KRDC web service, and is designed in a scalable form for various computing resource links.


Blind Rhythmic Source Separation (블라인드 방식의 리듬 음원 분리)

  • Kim, Min-Je;Yoo, Ji-Ho;Kang, Kyeong-Ok;Choi, Seung-Jin
    • The Journal of the Acoustical Society of Korea / v.28 no.8 / pp.697-705 / 2009
  • An unsupervised (blind) method is proposed for extracting rhythmic sources from commercial polyphonic music whose number of channels is limited to one. Commercial music signals are usually provided with no more than two channels, while they often contain multiple instruments including singing voice. Therefore, instead of conventional modeling of mixing environments or statistical characteristics, other source-specific characteristics must be introduced to separate or extract sources in such underdetermined environments. In this paper, we concentrate on extracting rhythmic sources from a mixture with other harmonic sources. An extension of nonnegative matrix factorization (NMF), called nonnegative matrix partial co-factorization (NMPCF), is used to analyze multiple relationships between spectral and temporal properties in the given input matrices. Moreover, the temporal repeatability of rhythmic sound sources is exploited as a common rhythmic property among segments of an input mixture signal. The proposed method shows acceptable, though not superior, separation quality compared with the referenced prior-knowledge-based drum source separation systems, but it has better applicability because of its blind manner of separation, for example, when there is no prior information or the target rhythmic source is irregular.