• Title/Summary/Keyword: Priority Retrieval


A Probabilistic Context Sensitive Rewriting Method for Effective Transliteration Variants Generation (효과적인 외래어 이형태 생성을 위한 확률 문맥 의존 치환 방법)

  • Lee, Jae-Sung
    • The Journal of the Korea Contents Association / v.7 no.2 / pp.73-83 / 2007
  • An information retrieval system that relies on exact matching needs preprocessing or query expansion to generate transliteration variants so that it can find the variants of a foreign word that appear in documents. This paper proposes an effective method for generating other transliteration variants from a given transliteration. Because simply rewriting confusable characters produces too many false variants, the proposed method controls the generation priority by learning confusion patterns from real usage and calculating their probabilities. In particular, the left and right context of each pattern is considered, and a local rewriting probability and a global rewriting probability are calculated so that more probable variants are produced at an earlier stage. In experiments, the method achieved more than 80% recall within the top 20 generated variants on a set of transliteration variants collected from KT SET 2.0.
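
The idea of generating the most probable variants first can be sketched as a best-first search over context-sensitive rewrite rules. This is only an illustration, not the paper's method: the rule table, its probabilities, and the function names `rule_matches` and `generate_variants` are all hypothetical, and a variant's score is simply the product of the probabilities of the rules applied to reach it (the abstract's local and global probabilities are not modeled).

```python
import heapq

# Hypothetical confusion rules (assumed values, not learned from data):
# (left context, source, target, right context, probability).
# An empty context string matches any neighbor; source must be non-empty.
RULES = [
    ("센", "터", "타", "", 0.6),  # rewrite only when preceded by "센"
    ("", "센", "쎈", "", 0.2),
]


def rule_matches(word, i, left, src, right):
    """True if src occurs at position i with the required left/right context."""
    return (
        word.startswith(src, i)
        and word[:i].endswith(left)
        and word.startswith(right, i + len(src))
    )


def generate_variants(word, rules, top_n=20):
    """Return up to top_n (variant, score) pairs, most probable first."""
    heap = [(-1.0, word)]  # max-heap via negated scores
    emitted = set()
    results = []
    while heap and len(results) < top_n:
        neg_score, current = heapq.heappop(heap)
        if current in emitted:
            continue  # already reached through a higher-scoring path
        emitted.add(current)
        if current != word:
            results.append((current, -neg_score))
        # Expand: apply every rule at every position where it matches.
        for left, src, dst, right, prob in rules:
            for i in range(len(current)):
                if rule_matches(current, i, left, src, right):
                    variant = current[:i] + dst + current[i + len(src):]
                    if variant not in emitted:
                        heapq.heappush(heap, (neg_score * prob, variant))
    return results
```

With these assumed rules, `generate_variants("센터", RULES, top_n=3)` yields "센타" before "쎈터" and "쎈타". Because every probability is at most 1, scores never increase along a path, so variants leave the heap in non-increasing score order.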

Development of Meteorologic Data Retrieval Program for Vulnerability Assessment to Natural Hazards (재해 취약성 평가를 위한 기상자료 처리 프로그램 MetSystem 개발)

  • Jang, Min-Won;Kim, Sang-Min
    • Journal of Korean Society of Rural Planning / v.19 no.4 / pp.47-54 / 2013
  • Climate change is among the most direct threats to sustaining agricultural productivity. It is necessary to reduce the damage from natural hazards such as floods, droughts, typhoons, and snowstorms caused by climate change. Vulnerability assessment for climate change adaptation makes it possible to analyze the priority, feasibility, and effect of reduction policies. Such an assessment requires a large amount of weather data for each meteorological station. Building a database management system for the meteorological data could resolve the difficulties of handling and processing the weather data. In this study, we developed a meteorological data retrieval system (MetSystem) for climate change vulnerability assessment. The user interface of MetSystem was implemented in the web browser so that the database server can be accessed at any time and place. It supports queries by meteorological station, temporal range, meteorological item, statistic, and range of values, and it can export results to Excel format (*.xls). The developed system is expected to make it easier to try different analyses of vulnerability to natural hazards through simple access to the meteorological database and its extensive search functions.

Multi-class Support Vector Machines Model Based Clustering for Hierarchical Document Categorization in Big Data Environment (빅 데이터 환경에서 계층적 문서 유형 분류를 위한 클러스터링 기반 다중 SVM 모델)

  • Kim, Young Soo;Lee, Byoung Yup
    • The Journal of the Korea Contents Association / v.17 no.11 / pp.600-608 / 2017
  • Recently, data volumes have been growing exponentially with the rapid expansion of the internet. Because users need only part of all this information, they bear a heavy workload in examining and discovering the content they need. Information retrieval must therefore provide hierarchical class information and an examination priority by evaluating the similarity between queries and documents. In this paper, we propose a clustering-based multi-class support vector machine (SVM) model for hierarchical document categorization that makes semantic search possible by considering word co-occurrence measures. Combining hierarchical document categorization with an SVM classifier gives high performance in the analytical classification of web documents, whose number increases exponentially as the document hierarchy is extended. We expect more information retrieval systems to adopt the proposed model and thereby provide accurate and rapid information retrieval services.

A Scheduling Algorithm using The Priority of Broker for Improving The Performance of Semantic Web-based Visual Media Retrieval Framework (분산시각 미디어 검색 프레임워크의 성능향상을 위한 브로커 서버 우선순위를 이용한 라운드 로빈 스케줄링 기법)

  • Shim, Jun-Yong;Won, Jae-Hoon;Kim, Se-Chang;Kim, Jung-Sun
    • Journal of KIISE:Software and Applications / v.35 no.1 / pp.22-32 / 2008
  • HERMES was proposed to overcome the weaknesses of existing ontology-based image retrieval systems and of distributed image retrieval built on simply structured databases; it preserves the autonomy of the various image providers and supports semantic-based image retrieval. However, the framework did not consider the performance degradation and scalability problems that arise when many users connect to the broker server simultaneously. In this paper, multiple broker servers are installed so that a comparable level of service can be provided without performance degradation when numerous users connect at the same time. The execution time of each internal broker component is measured and stored through a monitoring system, and the brokers are ranked from the stored data in a broker ranking table. We propose a load-balancing system that improves performance reliability by distributing the queries entered through the user interface across several servers with reference to the broker ranking table. Experiments showed that the proposed scheduling technique is faster than existing techniques.
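
Ranking brokers by measured time and then dispatching queries according to that ranking can be sketched as a weighted round robin. The abstract does not give the exact scheduling rule, so the weighting here (the fastest of n brokers gets n slots per round, the slowest gets one) is an assumption, as are the function names and the measurement format.

```python
from itertools import cycle
from statistics import mean


def build_ranking(measurements):
    """Rank brokers fastest first.

    measurements maps a broker name to a list of measured times in seconds
    (a hypothetical stand-in for the data stored by the monitoring system).
    """
    return sorted(measurements, key=lambda b: (mean(measurements[b]), b))


def weighted_schedule(ranking):
    """One round of dispatch slots: higher-ranked brokers get more slots."""
    n = len(ranking)
    slots = []
    for rank, broker in enumerate(ranking):
        slots.extend([broker] * (n - rank))  # assumed weighting, not the paper's
    return slots


def dispatch(queries, ranking):
    """Assign each query to a broker by cycling through the weighted slots."""
    if not ranking:
        raise ValueError("at least one broker is required")
    slots = cycle(weighted_schedule(ranking))
    return [(query, next(slots)) for query in queries]
```

For three brokers ranked `["b2", "b3", "b1"]`, one round is `b2, b2, b2, b3, b3, b1`. A production scheduler would more likely interleave the slots and refresh the ranking as new measurements arrive.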

Component Classification and Retrieval using Clustering Algorithm (클러스터링 알고리즘을 이용한 컴포넌트 분유 및 검색)

  • 김귀정
    • The Journal of the Korea Contents Association / v.2 no.3 / pp.87-95 / 2002
  • This study proposes a method for classifying components in a repository and retrieving them, introducing the idea of domain orientation for successful component reuse. For components of existing systems to which design patterns have been applied, we suggest a component classification method that compares the structural similarity between each component in the relevant domain and a criterion pattern. Component reusability and portability between platforms can be increased by classifying reusable components by function and presenting their structures as diagrams. The efficiency of component reuse can be raised because the component most appropriate to a query and similar candidate components are provided in priority order by the E-SARM algorithm.


Systematic Review of Bug Report Processing Techniques to Improve Software Management Performance

  • Lee, Dong-Gun;Seo, Yeong-Seok
    • Journal of Information Processing Systems / v.15 no.4 / pp.967-985 / 2019
  • Bug report processing is a key element of bug fixing in modern software maintenance. Bug reports are not processed immediately after submission; they pass through several processes, such as bug report deduplication and bug report triage, before bug fixing is initiated. This approach is very inefficient because all these processes are performed manually. Software engineers have persistently highlighted the need to automate these processes, and as a result many automation techniques have been proposed for bug report processing; however, the accuracy of the existing methods is not satisfactory. Therefore, this study surveys existing bug report processing techniques with a view to improving their accuracy. The review of each method in this study consists of a description, the techniques used, experiments, and comparison results. The results indicate that research on bug deduplication is still lacking and that many more studies integrating clustering and natural language processing are required. The study further indicates that, although all studies on triage are based on machine learning, results from studies on deep learning are still insufficient.

Research Needs in Librarianship

  • Wilson, T.D.
    • Journal of the Korean Society for Library and Information Science / v.44 no.4 / pp.5-18 / 2010
  • Library and information research is often directed towards the management of resources (e.g., the economics of resource management), their storage and retrieval (e.g., much information retrieval research), or the users of these resources (the whole area of information behaviour). However, the question that is less often asked is, "What research do librarians want to have carried out to help them in their work?" Clearly, some of the topics just mentioned will fall into the priority areas, but what do librarians actually perceive will be of use to them? There is a notion that a research-practice gap exists in the field, and perhaps the reason for that is that researchers do not ask the practitioners what research will be of value to them. To find an answer to this question on a global basis would, of course, be impossible - at least impossible without a level of funding that would be difficult to obtain from any source. However, it is possible to carry out research on a national level that could prove useful both to practitioners and to the library and information research community. This was the aim of a project, supported by the Svensk Biblioteksförening (Swedish Library Association), which was carried out in 2008/2009. Ideas on potential research projects were collected from librarians themselves, from discussion group archives and from the professional journals in a number of countries. These ideas were then grouped thematically and formed the basis of two rounds of a Delphi process to solicit the opinions of a panel of librarians in different sectors, recommended by their peers as 'expert' in their field. The Delphi process was concluded with a workshop involving a subset of the panel. This paper will report on the results of the investigation, which attracted a great deal of interest within the profession in Sweden, and will also reflect on issues that were ranked low in the investigation. For example, not a great deal of priority was given to topics relating to the development and use of technology: why was this? And would the same result be found in other countries? One major area of research interest was the future of libraries, and a topic of relevance here, especially for academic and research libraries, is the changing information behaviour of researchers: what, now, do researchers want of libraries? Clearly, technology is playing a role here, but digitized resources and the World Wide Web may not be the answer to every researcher's need. Research into libraries and research for libraries ought to figure largely in the profession's view of its aims, objectives and visions of the future: but for it to do so requires a recognition that the work will not be done unless researchers and practitioners come together to determine how to approach the future.

A Study on the Development of Search Algorithm for Identifying the Similar and Redundant Research (유사과제파악을 위한 검색 알고리즘의 개발에 관한 연구)

  • Park, Dong-Jin;Choi, Ki-Seok;Lee, Myung-Sun;Lee, Sang-Tae
    • The Journal of the Korea Contents Association / v.9 no.11 / pp.54-62 / 2009
  • To avoid redundant investment in the project selection process, it is necessary to check whether submitted research topics have already been proposed or carried out at other institutions. This is possible through search engines that apply a Boolean keyword matching algorithm to a national-scale database of research results. Even though the accuracy and speed of information retrieval have improved, such engines still have fundamental limits caused by keyword matching. This paper examines an implemented TF-IDF-based algorithm and presents a search engine experiment that retrieves and prioritizes documents that are similar or redundant to research proposals. In addition to the generic TF-IDF algorithm, feature weighting and K-Nearest Neighbors classification methods are implemented in this algorithm. The test documents were extracted from the NDSL (National Digital Science Library) web directory service.
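
Ranking documents by similarity to a proposal with generic TF-IDF can be sketched as follows. This is a minimal illustration, not the paper's implementation: whitespace tokenization, the smoothed IDF formula, cosine similarity, and the function names are all assumptions, and the feature weighting and K-Nearest Neighbors steps mentioned in the abstract are not included.

```python
import math
from collections import Counter


def tfidf_vectors(texts):
    """Build sparse TF-IDF vectors (dicts) for a list of texts."""
    tokenized = [text.lower().split() for text in texts]
    n = len(tokenized)
    # Document frequency: number of texts that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    # Smoothed IDF (an assumed variant) keeps every weight positive.
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({t: (c / len(tokens)) * idf[t] for t, c in tf.items()})
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_similar(proposal, documents, k=3):
    """Return up to k (document index, similarity) pairs, most similar first."""
    vectors = tfidf_vectors([proposal] + documents)
    query, rest = vectors[0], vectors[1:]
    scored = [(cosine(query, vec), i) for i, vec in enumerate(rest)]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(i, score) for score, i in scored[:k]]
```

Here the proposal is included in the corpus used for the IDF statistics, which is one simple choice; a real system would typically compute IDF once over the indexed collection.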

A Development of Ontology-Based Law Retrieval System: Focused on Railroad R&D Projects (온톨로지 기반 법령 검색시스템의 개발: 철도·교통 분야 연구개발사업을 중심으로)

  • Won, Min-Jae;Kim, Dong-He;Jung, Hae-Min;Lee, Sang Keun;Hong, June Seok;Kim, Wooju
    • The Journal of Society for e-Business Studies / v.20 no.4 / pp.209-225 / 2015
  • Research and development (R&D) projects in the railroad domain differ from those in other domains in their close relationship with laws. Cases have been reported in which new technologies from R&D projects could not be commercialized because relevant laws restricted them. This problem arises because researchers do not know exactly which laws can affect the results of their R&D projects. To deal with this problem, we propose a model for a law retrieval system that researchers on railroad R&D projects can use to find related legislation. The input to the system is a research plan describing the main content of a project. The system then returns the laws related to the R&D project together with their rankings, which are assigned by scores we developed. The ranking of a law indicates its order of priority for checking. With this system, researchers can search for laws that may affect their R&D projects at every stage of the project cycle, and can therefore obtain a list of laws to consider before the project ends. As a result, they can adjust the direction of a project by checking the list of laws, preventing carefully developed projects from becoming useless.

Tag Ranking System based on Semantic Similarity of Tag-pair (태그쌍의 의미유사도 기반 태그 랭킹 시스템)

  • Lee, Si-Hwa;Hwang, Dae-Hoon
    • Journal of Korea Multimedia Society / v.16 no.11 / pp.1305-1314 / 2013
  • Existing tag-based systems produce retrieval results with low accuracy because they match single tags against the tags attached to content. In addition, they do not effectively convey the content-related information that the tags carry, because users tag content without considering the priority of tags or the associative relations between them. To solve these problems, this paper proposes a tag ranking system that extracts the semantic similarity between tags and re-ranks the tags attached to content. To evaluate the proposed system, this paper compares its ranking results with those of a baseline method that uses the tags attached to images and a frequency method that uses tag co-occurrence frequency.