• Title/Summary/Keyword: example retrieval

Search Result 108, Processing Time 0.023 seconds

Fast, Flexible Text Search Using Genomic Short-Read Mapping Model

  • Kim, Sung-Hwan;Cho, Hwan-Gue
    • ETRI Journal / v.38 no.3 / pp.518-528 / 2016
  • Searching an extensive document database for documents that are locally similar to a given query document, and subsequently detecting the similar regions between such documents, is considered an essential task in information retrieval and data management. In this paper, we present a framework for this task. The proposed framework employs short-read mapping, a method used in bioinformatics to reveal similarities between genomic sequences. Documents are treated as biological objects; consequently, edit operations between locally similar documents are viewed as an evolutionary process. Accordingly, we are able to apply evolution tracing to the detection of similar regions between documents. In addition, we propose heuristic methods to address issues associated with the different stages of the proposed framework, for example, a frequency-based fragment ordering method and a locality-aware interval aggregation method. Extensive experiments covering various scenarios of searching an extensive document database for locally similar documents indicate that the proposed framework outperforms existing methods.

Improved Query By Sketch Method for Contents-Based Retrieval (내용 기반 검색을 위한 향상된 스케치 질의 방법)

  • Ha Myung-Hwan;Jung Byung-Hee;Kim Hee-Jung;Lim Mi-Young;Kim Hyoung-Joon;Kim Whoi-Yul Yura
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2003.11a / pp.275-278 / 2003
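  • As digital content grows, much research has been conducted on content-based retrieval for the efficient search and management of such content. The representative query methods for content-based retrieval are QBE (Query By Example), which uses a similar image as the query, and QBS (Query By Sketch), in which the user directly sketches an image to use as the query. This paper proposes an improved query method that addresses both the constraint of QBE, which requires an exact image to use as the query, and the drawback of QBS, which requires the user to sketch the entire query image from scratch. The proposed method simplifies an input image to provide a base drawing that serves as the foundation of the sketch, which the user then modifies before querying. Its advantages are accurate retrieval results and a reduction in the time and effort required for searching.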

An Image Retrieving Scheme Using Salient Features and Annotation Watermarking

  • Wang, Jenq-Haur;Liu, Chuan-Ming;Syu, Jhih-Siang;Chen, Yen-Lin
    • KSII Transactions on Internet and Information Systems (TIIS) / v.8 no.1 / pp.213-231 / 2014
  • Existing image search systems allow users to search for images by keywords, or by example images through content-based image retrieval (CBIR). On the other hand, users might learn more relevant textual information about an image from its text captions or surrounding contexts within documents or Web pages. Without such contexts, it is difficult to extract a semantic description directly from the image content. In this paper, we propose an annotation watermarking system that lets users embed text descriptions and retrieve more relevant textual information from similar images. First, tags associated with an image are converted into a two-dimensional code and embedded into the image using the discrete wavelet transform (DWT). Next, for images without annotations, similar images can be obtained by CBIR techniques and their embedded annotations can be extracted. Specifically, we use global features such as color ratios and dominant sub-image colors for preliminary filtering. Then, local features such as Scale-Invariant Feature Transform (SIFT) descriptors are extracted for similarity matching. This design can achieve good effectiveness with reasonable processing time in practical systems. Our experimental results showed good accuracy in retrieving similar images and extracting relevant tags from similar images.

Development of a multi criteria decision analysis framework for the assessment of integrated waste management options for irradiated graphite

  • Abrahamsen-Mills, Liam;Wareing, Alan;Fowler, Linda;Jarvis, Richard;Norris, Simon;Banford, Anthony
    • Nuclear Engineering and Technology / v.53 no.4 / pp.1224-1235 / 2021
  • An integrated waste management approach for irradiated graphite was developed during the European Commission project 'Treatment and Disposal of Irradiated Graphite and other Carbonaceous Waste'. This included the identification of potential options for the management of irradiated graphite, taking account of storage, retrieval, treatment and disposal methods. This paper describes how these options can be assessed using multi-criteria decision analysis (MCDA) for a case study relating to a generic power reactor. Criteria have been defined to account for safety, environmental, economic and socio-political factors, including radiological impact, resource usage, economic costs and risks. The impact of each option against each criterion has been assessed using data from the project and the wider literature. A linear additive approach has been used to convert the calculated impacts to scores. To account for the relative importance of the criteria, example weightings were allocated. This application has shown that MCDA approaches can be used to support complex decisions regarding irradiated graphite management, accounting for a wide range of criteria. Use of this approach by individual countries or organisations will need to account for the specific options, scores, weightings and constraints that apply, based on their national strategies, regulatory requirements and public acceptability.
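The linear additive step described in this abstract can be illustrated with a minimal sketch. It assumes min-max linear scaling of impacts to scores in [0, 1] and that lower impacts are better; the abstract specifies neither, and the function names, criteria, impacts and weights below are hypothetical placeholders rather than values from the paper.

```python
def linear_scores(impacts, lower_is_better=True):
    """Linearly rescale raw impacts to scores in [0, 1] (assumed min-max scaling)."""
    lo, hi = min(impacts), max(impacts)
    if hi == lo:
        # All options perform identically on this criterion.
        return [1.0] * len(impacts)
    if lower_is_better:
        return [(hi - x) / (hi - lo) for x in impacts]
    return [(x - lo) / (hi - lo) for x in impacts]


def weighted_totals(impact_table, weights):
    """Linear additive model: total = sum over criteria of (normalised weight * score).

    impact_table maps each criterion to a list of impacts, one per option.
    weights maps each criterion to its relative importance.
    """
    total_weight = sum(weights.values())
    n_options = len(next(iter(impact_table.values())))
    totals = [0.0] * n_options
    for criterion, impacts in impact_table.items():
        scores = linear_scores(impacts)
        share = weights[criterion] / total_weight
        for i, score in enumerate(scores):
            totals[i] += share * score
    return totals


# Hypothetical impacts for three options (illustrative numbers only).
example_impacts = {
    "radiological_impact": [10, 20, 30],
    "economic_cost": [300, 100, 200],
}
example_weights = {"radiological_impact": 3, "economic_cost": 1}
totals = weighted_totals(example_impacts, example_weights)
```

Changing `example_weights` and recomputing the totals is a simple way to see how sensitive the ranking of options is to the weightings.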

Sorting Instagram Hashtags all the Way throw Mass Tagging using HITS Algorithm

  • D.Vishnu Vardhan;Dr.CH.Aparna
    • International Journal of Computer Science & Network Security / v.23 no.11 / pp.93-98 / 2023
  • Instagram is one of the fastest-growing online photo social web services, where users share images and videos from their lives with other users. Image tagging is an essential step in developing Automatic Image Annotation (AIA) methods based on the learning-by-example paradigm. Hashtags can be used on just about any social media platform, but they are most popular on Twitter and Instagram. Using hashtags is essentially a way to group conversations or content around a certain topic, making it easy for people to find content that interests them. In practice, on average 20% of Instagram hashtags are related to the actual visual content of the image they accompany, i.e., they are descriptive hashtags, while there are many irrelevant hashtags, i.e., stophashtags, that are used across totally different images just to gather clicks and enhance searchability. Hence, this work presents the sorting of Instagram hashtags through mass tagging using the HITS (Hyperlink-Induced Topic Search) algorithm. The hashtags can be sorted into several groups according to the Jensen-Shannon divergence between any two hashtags. This approach provides an effective and consistent way of finding pairs of Instagram images and hashtags, which leads to representative and noise-free training sets for content-based image retrieval. The HITS algorithm is first used to rank the annotators in terms of their effectiveness in the crowd tagging task and then to identify the right hashtags per image.
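As a rough illustration of how HITS can rank annotators and hashtags together, the sketch below runs the standard hub/authority iteration on a bipartite graph in which annotators act as hubs and hashtags as authorities. This mapping, the function name `hits`, and the toy edge list are assumptions made for illustration; the abstract does not describe the authors' exact graph construction.

```python
import math


def hits(edges, iterations=50):
    """Standard HITS iteration on (annotator, hashtag) edges.

    Returns (hub_scores, authority_scores), each L2-normalised.
    """
    hubs = {annotator: 1.0 for annotator, _ in edges}
    auths = {tag: 1.0 for _, tag in edges}
    for _ in range(iterations):
        # Authority update: sum the hub scores of annotators who chose the tag.
        new_auths = {tag: 0.0 for tag in auths}
        for annotator, tag in edges:
            new_auths[tag] += hubs[annotator]
        norm = math.sqrt(sum(v * v for v in new_auths.values()))
        auths = {tag: v / norm for tag, v in new_auths.items()}
        # Hub update: sum the authority scores of the tags the annotator chose.
        new_hubs = {annotator: 0.0 for annotator in hubs}
        for annotator, tag in edges:
            new_hubs[annotator] += auths[tag]
        norm = math.sqrt(sum(v * v for v in new_hubs.values()))
        hubs = {annotator: v / norm for annotator, v in new_hubs.items()}
    return hubs, auths


# Hypothetical crowd-tagging data for a single image.
example_edges = [
    ("ann1", "#beach"), ("ann2", "#beach"), ("ann3", "#beach"),
    ("ann1", "#sunset"), ("ann2", "#sunset"),
    ("ann3", "#follow4follow"),
]
hub_scores, authority_scores = hits(example_edges)
```

In this toy data, the hashtag chosen by all three annotators receives the highest authority score, and the annotator who picked the isolated hashtag receives the lowest hub score.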

A Proposal of Methods for Extracting Temporal Information of History-related Web Document based on Historical Objects Using Machine Learning Techniques (역사객체 기반의 기계학습 기법을 활용한 웹 문서의 시간정보 추출 방안 제안)

  • Lee, Jun;KWON, YongJin
    • Journal of Internet Computing and Services / v.16 no.4 / pp.39-50 / 2015
  • When retrieving information through a search engine, some users want to find documents that correspond to a specific time period. For example, a user who wants to find a document describing the situation before the 'Japanese invasions of Korea' era may use the keyword 'Japanese invasions of Korea' in the search query. The search engine then returns all documents about the 'Japanese invasions of Korea' without regard to time period, which forces the user to do additional work. In addition, in a large percentage of history-related documents, the date on which the document was generated differs from the time period recorded in its contents. If the time period of a document's contents can be extracted, it can provide useful information for retrieval and various applications. In this paper, we therefore study the extraction of the time period of historical documents on the Joseon era, using historical literature on the Joseon era to deduce the time period corresponding to the document content. We define historical objects based on historical literature collected from the web and confirm the feasibility of extracting the time period of a web document using machine learning techniques. In addition to the machine learning techniques, we propose and apply similarity filtering based on comparison between the historical objects. Finally, we evaluate the accuracy of the temporal indexing and the resulting improvement.

Study on a Methodology for Developing Shanghanlun Ontology (상한론(傷寒論)온톨로지 구축 방법론 연구)

  • Jung, Tae-Young;Kim, Hee-Yeol;Park, Jong-Hyun
    • Journal of Physiology & Pathology in Korean Medicine / v.25 no.5 / pp.765-772 / 2011
  • Knowledge represented in formal logic is widely used in many domains such as artificial intelligence, information retrieval, and e-commerce. In the medical field, medical document retrieval, hospital information systems, medical data sharing, remote treatment, and expert systems all need knowledge representation technology. To retrieve information intelligently and provide advanced information services, a systematically controlled mechanism is needed to represent and share knowledge. Importantly, medical experts' knowledge should be represented in a form that is understandable to both computers and humans so that it can be applied to medical information systems that support decision making. It should also have a structure that is suitable and efficient for its purposes, including reasoning, extensibility of knowledge, data management, accuracy of expression, and diversity. We call such a machine-processable representation an ontology. An ontology can represent traditional medicine knowledge in a structured and systematic way with visualization, and it can also be used as educational material. Hence, the authors developed a Shanghanlun ontology as an example, and thereby suggest a methodology for ontology development as well as a model for structuring traditional medical knowledge. The result can be used by students to learn Shanghanlun through a graphical representation of its knowledge. We analyzed the text of Shanghanlun to construct a relational database including its original text, symptoms, and herbal formulas. We then classified the terms according to certain criteria and confirmed the structure of the ontology to describe semantic relations between the terms; in particular, we developed the ontology with visual representation in mind. The ontology developed in this study provides a database showing the formulas, herbs, symptoms, disease names, and text written in Shanghanlun.
Contents are easy to retrieve by their semantic relations, which makes it convenient to search and learn the knowledge of Shanghanlun. The ontology can display related concepts when terms are searched and provides expanded information with a simple click. It has some limitations, such as standardization problems, limited coverage of patterns (證), and errors in Chinese character input. Nevertheless, we believe this research can serve as a foundation for making traditional medicine more structured and systematic, for developing application software, and for use in Shanghanlun education.

An Efficient Bitmap Indexing Method for Multimedia Data Reflecting the Characteristics of MPEG-7 Visual Descriptors (MPEG-7 시각 정보 기술자의 특성을 반영한 효율적인 멀티미디어 데이타 비트맵 인덱싱 방법)

  • Jeong Jinguk;Nang Jongho
    • Journal of KIISE:Computer Systems and Theory / v.32 no.1 / pp.9-20 / 2005
  • Recently, the MPEG-7 standard, a multimedia content description standard, has come to be widely used in content-based image/video retrieval systems. However, since the descriptors standardized in MPEG-7 are usually multidimensional and subject to the so-called 'curse of dimensionality', previously proposed indexing methods (for example, multidimensional indexing methods, dimensionality reduction methods, and filtering methods) could not be used to effectively index multimedia databases represented in MPEG-7. This paper proposes an efficient multimedia data indexing mechanism reflecting the characteristics of MPEG-7 visual descriptors. In the proposed indexing mechanism, the descriptor is transformed into a histogram of some attributes. By representing the value of each bin as a binary number, the histogram, which is the visual descriptor for an object in the multimedia database, can be represented as a bit string. The bit strings for all objects in the multimedia database are collected to form an index file, the bitmap index. By XORing them with the descriptor of the query object, candidate solutions for similarity search can be computed easily; these are then checked again against the query object to compute the similarity precisely with an exact metric such as the L1-norm. These indexing and searching mechanisms are efficient because the filtering process is performed by simple bit operations and reduces the search space dramatically. In experiments with more than 100,000 real images, the proposed indexing and searching mechanisms were about 15 times faster than sequential search, with more than 90% accuracy.
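The filter-then-refine idea in this abstract can be sketched as follows. This is a simplified illustration under stated assumptions, not the paper's encoding: each histogram bin is packed as a fixed-width binary number (`bits` is an assumed parameter), candidates are kept when the XOR of the bit strings has few set bits (`max_diff_bits` is a hypothetical threshold), and the survivors are then ranked by the exact L1-norm.

```python
def encode(hist, bits=4):
    """Pack each bin value (0 .. 2**bits - 1) into one integer used as a bit string."""
    code = 0
    for value in hist:
        if not 0 <= value < (1 << bits):
            raise ValueError("bin value does not fit in the chosen bit width")
        code = (code << bits) | value
    return code


def l1_distance(a, b):
    """Exact L1-norm between two histograms of equal length."""
    return sum(abs(x - y) for x, y in zip(a, b))


def search(query, database, max_diff_bits=3, bits=4):
    """Filter with XOR on the bitmap index, then refine with the exact L1-norm."""
    query_code = encode(query, bits)
    # Bitmap index: one bit string per stored object.
    index = [encode(hist, bits) for hist in database]
    # Filtering: cheap bit operation; count the differing bits after XOR.
    candidates = [
        i for i, code in enumerate(index)
        if bin(code ^ query_code).count("1") <= max_diff_bits
    ]
    # Refinement: rank the surviving candidates by the exact metric.
    return sorted(candidates, key=lambda i: l1_distance(query, database[i]))


# Hypothetical 4-bin histograms (illustrative values only).
example_db = [[1, 2, 3, 4], [1, 2, 3, 5], [15, 0, 15, 0], [1, 2, 2, 4]]
result = search([1, 2, 3, 4], example_db)
```

Note that the number of differing bits is only a rough proxy for histogram distance, so a filter like this can discard some true neighbours; that is consistent with the abstract reporting accuracy above 90% rather than exact results.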

A Smart Image Classification Algorithm for Digital Camera by Exploiting Focal Length Information (초점거리 정보를 이용한 디지털 사진 분류 알고리즘)

  • Ju, Young-Ho;Cho, Hwan-Gue
    • Journal of the Korea Computer Graphics Society / v.12 no.4 / pp.23-32 / 2006
  • In recent years, as digital cameras have become popular, users can easily collect hundreds of photos in a single use. Managing hundreds of digital photos is therefore not a simple job compared with keeping paper photos, and managing and classifying a large number of digital photo files can be burdensome and annoying. People thus hope to use an automated system for managing digital photos, especially for their own purposes. Previous studies, e.g. content-based image retrieval, focused on clustering general images and are not readily applicable to digital photo clustering and classification. Recently, some clustering algorithms specialized for digital camera images have been proposed. These algorithms mainly exploit the statistics of the time gap between consecutive photos. Although they showed quite good results in clustering digital camera images, many improvements remain to be made. For example, current tools completely ignore the image transformation caused by different focal lengths. In this paper, we present a photo classification method that considers the focal length information recorded in EXIF. We propose an algorithm based on MVA (Matching Vector Analysis) for classifying digital images taken in everyday activities. Our experiment shows that our algorithm achieves a success rate of more than 95%, which is competitive among all available methods in terms of sensitivity, specificity and flexibility.

Research Needs in Librarianship

  • Wilson, T.D.
    • Journal of the Korean Society for Library and Information Science / v.44 no.4 / pp.5-18 / 2010
  • Library and information research is often directed towards the management of resources (e.g., the economics of resource management), their storage and retrieval (e.g., much information retrieval research), or the users of these resources (the whole area of information behaviour). However, the question that is less often asked is, "What research do librarians want to have carried out to help them in their work?" Clearly, some of the topics just mentioned will fall into the priority areas, but what do librarians actually perceive will be of use to them? There is a notion that a research-practice gap exists in the field, and perhaps the reason for that is that researchers do not ask the practitioners what research will be of value to them. To find an answer to this question on a global basis would, of course, be impossible - at least impossible without a level of funding that would be difficult to obtain from any source. However, it is possible to carry out research on a national level that could prove useful both to practitioners and to the library and information research community. This was the aim of a project, supported by the Svensk Biblioteksförening (Swedish Library Association), which was carried out in 2008/2009. Ideas on potential research projects were collected from librarians themselves, from discussion group archives and from the professional journals in a number of countries. These ideas were then grouped thematically and formed the basis of two rounds of a Delphi process to solicit the opinions of a panel of librarians in different sectors, recommended by their peers as 'expert' in their field. The Delphi process was concluded with a workshop involving a subset of the panel. This paper will report on the results of the investigation, which attracted a great deal of interest within the profession in Sweden, and will also reflect on issues that were ranked lowly in the investigation. 
For example, not a great deal of priority was given to topics relating to the development and use of technology: why was this? And would the same result be found in other countries? One major area of research interest was into the future of libraries and a topic of relevance here, especially for academic and research libraries, is the changing information behaviour of researchers: what, now, do researchers want of libraries? Clearly, technology is playing a role here, but digitized resources and the World Wide Web may not be the answer to every researcher's need. Research into libraries and research for libraries ought to figure largely in the profession's view of its aims, objectives and visions of the future: but for it to do so requires a recognition that the work will not be done unless researchers and practitioners come together to determine how to approach the future.