Search | Korea Science

Improving Transformer with Dynamic Convolution and Shortcut for Video-Text Retrieval

Liu, Zhi;Cai, Jincen;Zhang, Mengmeng
- KSII Transactions on Internet and Information Systems (TIIS), v.16 no.7, pp.2407-2424, 2022
Recently, Transformers have made great progress in video retrieval tasks owing to their high representation capability. In a Transformer, the cascaded self-attention modules can capture long-distance feature dependencies. However, local feature details are likely to deteriorate. In addition, increasing the depth of the structure is likely to introduce learning bias into the learned features. In this paper, an improved Transformer structure named TransDCS (Transformer with Dynamic Convolution and Shortcut) is proposed. A Multi-head Conv-Self-Attention module is introduced to model local dependencies and improve the efficiency of local feature extraction. Meanwhile, an augmented shortcuts module based on a dual identity matrix is applied to enhance the conduction of input features and mitigate the learning bias. The proposed model is tested on the MSRVTT, LSMDC and Activity-Net benchmarks, and it surpasses all previous solutions for the video-text retrieval task. For example, on the LSMDC benchmark, a gain of about 2.3% MdR and 6.1% MnR is obtained over recently proposed multimodal-based methods.
https://doi.org/10.3837/tiis.2022.07.016
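The abstract above reports gains in MdR and MnR, which in retrieval evaluation conventionally stand for the median and mean rank of the correct item. The following is a minimal sketch of how such ranks might be computed from a query-by-candidate similarity matrix. It is not taken from the paper: the function names, the toy scores, and the assumption that candidate i is the correct match for query i are all illustrative.

```python
from statistics import mean, median


def rank_of_match(scores, correct_index):
    """1-based rank of the correct candidate when sorted by descending score."""
    target = scores[correct_index]
    # Every candidate scoring strictly higher pushes the correct one down a rank.
    return 1 + sum(1 for s in scores if s > target)


def median_and_mean_rank(score_matrix):
    """score_matrix[i][j] is the similarity of query i to candidate j.

    Assumption (hypothetical setup): candidate i is the correct match for query i.
    """
    ranks = [rank_of_match(row, i) for i, row in enumerate(score_matrix)]
    return median(ranks), mean(ranks)


# Toy similarity scores, invented for illustration only.
scores = [
    [0.9, 0.1, 0.3],
    [0.8, 0.2, 0.5],
    [0.1, 0.2, 0.7],
]
mdr, mnr = median_and_mean_rank(scores)
```

Lower values are better for both metrics; the median is less sensitive than the mean to a few queries with very poor ranks.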

Using Context Information to Improve Retrieval Accuracy in Content-Based Image Retrieval Systems

Hejazi, Mahmoud R.;Woo, Woon-Tack;Ho, Yo-Sung
- 한국HCI학회:학술대회논문집 (HCI Society of Korea: Conference Proceedings), 2006.02a, pp.926-930, 2006
Current image retrieval techniques have shortcomings that make it difficult to search for images based on a semantic understanding of what the image is about. Since an image is normally associated with multiple contexts (e.g., when and where a picture was taken), knowledge of these contexts can enhance the quantity of semantic understanding of an image. In this paper, we present a context-aware image retrieval system, which uses context information to infer a kind of metadata for the captured images as well as for images in different collections and databases. Experimental results show that using this kind of information can not only significantly increase the retrieval accuracy of conventional content-based image retrieval systems but also reduce the problems that arise from manual annotation in text-based image retrieval systems.

Consideration of a Robust Search Methodology that could be used in Full-Text Information Retrieval Systems (퍼지 논리를 이용한 사용자 중심적인 Full-Text 검색방법에 관한 연구)

Lee, Won-Bu
- Asia pacific journal of information systems, v.1 no.1, pp.87-101, 1991
The primary purpose of this study was to investigate a robust search methodology that could be used in full-text information retrieval systems. A robust search methodology is one that can be easily used by a variety of users (particularly naive users) and that gives them comparable search performance regardless of their different expertise or interests. In order to develop a possibly robust search methodology, a fully functional prototype of a fuzzy knowledge-based information retrieval system was developed. An experiment using this prototype information retrieval system was also designed to investigate the performance of that search methodology over a small exploratory sample of user queries. To probe the relationships between the possibly robust search performance and the query organization using fuzzy inference logic, the search performance of a shallow query structure was analyzed. The following noteworthy findings were obtained: 1) the hierarchical (tree-type) query structure might be a better query organization than the linear query structure; 2) compared with the complex tree query structure, the simple tree query structure that has at most three levels of query might provide better search performance; 3) the fuzzy search methodology that employs a proper cut-off value might provide more efficient search performance than the Boolean search methodology. Although the findings could not be statistically verified because the experiments were done using a single replication, they provide valuable information for developing a possibly robust search methodology in full-text information retrieval.
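The third finding above contrasts a fuzzy search with a cut-off value against a Boolean search. The sketch below illustrates that contrast under stated assumptions: the term weights, the use of `min` as the fuzzy AND operator, and all function names are hypothetical and are not described in the abstract. The tree-type query structures from the study are not modelled here.

```python
def fuzzy_and(doc_weights, terms):
    """Fuzzy conjunction: the weakest term membership (min) scores the document."""
    return min(doc_weights.get(t, 0.0) for t in terms)


def fuzzy_search(index, terms, cutoff):
    """Ids of documents whose fuzzy AND score reaches the cut-off, best first."""
    scored = [(fuzzy_and(w, terms), doc_id) for doc_id, w in index.items()]
    return [doc_id for score, doc_id in sorted(scored, reverse=True) if score >= cutoff]


def boolean_search(index, terms):
    """Boolean AND: a document matches if every term is present at all."""
    return sorted(
        doc_id for doc_id, w in index.items() if all(w.get(t, 0.0) > 0 for t in terms)
    )


# Hypothetical index: document id -> membership degree of each term in [0, 1].
index = {
    "d1": {"fuzzy": 0.9, "retrieval": 0.8},
    "d2": {"fuzzy": 0.2, "retrieval": 0.9},
    "d3": {"retrieval": 0.7},
}
fuzzy_hits = fuzzy_search(index, ["fuzzy", "retrieval"], cutoff=0.5)
boolean_hits = boolean_search(index, ["fuzzy", "retrieval"])
```

In this toy index the Boolean query returns both `d1` and `d2`, while the fuzzy query with a cut-off of 0.5 drops `d2`, whose weak membership for one term pulls its score below the threshold.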

A Study on Radiological Image Retrieval System (방사선 의료영상 검색 시스템에 관한 연구)

Park, Byung-Rae;Shin, Yong-Won
- Journal of radiological science and technology, v.28 no.1, pp.19-24, 2005
The purpose of this study was to design and implement a useful annotation-based radiological image retrieval system that helps radiological technologists accurately locate educational and image information. For better retrieval performance on large image databases, we present an indexing technique that integrates the B+-tree proposed by Bayer, for indexing simple attributes, with an inverted file structure for medical text keywords acquired from additional descriptions of radiological images. We implemented the proposed retrieval system with Delphi in a Windows XP environment. End users, radiological technologists, can store simple attribute information such as doctor name, operator name, body part, and disease, additional text-based descriptions, and the radiological image itself, and can retrieve the desired results from large image databases by simple attributes and text keywords through a graphical user interface. Consequently, the proposed system can be used for effective clinical decisions on radiological images, for reducing education time by organizing knowledge, and for well-organized education in clinical fields. In addition, it can be expected to develop into a decision support system by constructing a web-based integrated imaging system that includes general images and special contrast images in the future.
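The inverted file structure mentioned above maps each text keyword to the images whose descriptions contain it. A minimal sketch of that idea follows, assuming whitespace-tokenized descriptions; the sample descriptions, ids, and function names are invented for illustration, and the B+-tree index for simple attributes is not shown.

```python
from collections import defaultdict


def build_inverted_file(descriptions):
    """Map each lowercase keyword to the set of image ids described with it."""
    inverted = defaultdict(set)
    for image_id, text in descriptions.items():
        for word in text.lower().split():
            inverted[word].add(image_id)
    return inverted


def search(inverted, keywords):
    """Sorted ids of images whose descriptions contain every keyword."""
    postings = [inverted.get(k.lower(), set()) for k in keywords]
    return sorted(set.intersection(*postings)) if postings else []


# Hypothetical description text attached to each stored image.
descriptions = {
    "img1": "chest fracture left rib",
    "img2": "chest contrast study",
    "img3": "skull fracture",
}
inverted = build_inverted_file(descriptions)
hits = search(inverted, ["chest", "fracture"])
```

Intersecting posting sets keeps a multi-keyword lookup proportional to the size of the postings rather than to the size of the whole image database.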

A Study on the Improvement of Retrieval Efficiency Based on the CRFMD (공통기술표현포맷에 기반한 다매체자료의 검색효율 향상에 관한 연구)

Park, Il-Jong;Jeong, Ki-Tai
- Journal of the Korean Society for information Management, v.23 no.3 s.61, pp.5-21, 2006
In recent years, theories of image and sound analysis have been proposed to work with text retrieval systems and have progressed quickly with the rapid progress in data processing speeds. This study proposes a common representation format for multimedia documents (CRFMD) composed of both images and text to form a single data structure. It also shows that image classification of a given test set is dramatically improved when text features are encoded together with image features. CRFMD might be applicable to other areas of multimedia document retrieval and processing, such as medical image retrieval, World Wide Web searching, and museum collection retrieval.
https://doi.org/10.3743/KOSIM.2006.23.3.005

A Categorization Scheme of Tag-based Folksonomy Images for Efficient Image Retrieval (효과적인 이미지 검색을 위한 태그 기반의 폭소노미 이미지 카테고리화 기법)

Ha, Eunji;Kim, Yongsung;Hwang, Eenjun
- KIISE Transactions on Computing Practices, v.22 no.6, pp.290-295, 2016
Recently, folksonomy-based image-sharing sites where users cooperatively create and use tags to annotate images have been gaining popularity. Typically, these sites retrieve images for a user request using simple text-based matching and display the retrieved images as a photo stream. However, these tags are personal and subjective and the images are not categorized, which results in poor retrieval accuracy and low user satisfaction. In this paper, we propose a categorization scheme for folksonomy images that can improve the retrieval accuracy of tag-based image retrieval systems. In this scheme, images are classified by semantic similarity using the text information and image information generated on the folksonomy. To evaluate the performance of our proposed scheme, we collect folksonomy images and categorize them using text features and image features. We then compare its retrieval accuracy with that of existing systems.
https://doi.org/10.5626/KTCP.2016.22.6.290

Efficient and User-Friendly Image Retrieval System Based on Query by Visual Keys

Serata, M.;Sakuma, K.;Stejic, Z.;Kawamoto, K.;Nobuhara, H.;Yoshida, S.;Hirota, K.
- Proceedings of the Korean Institute of Intelligent Systems Conference, 2003.09a, pp.451-454, 2003
A new query method, called query by visual keys, is proposed with the aim of easy operation and efficient region-based image retrieval (RBIR). Visual keys are constructed from representative regions/subimages in a given image database, and the database is indexed with visual keys. A PC-based system is presented in which text retrieval techniques are applied to image retrieval with visual keys. Experimental results show that one retrieval completes within 4 ms and that the proposed system achieves retrieval precision comparable to conventional region-based image retrieval systems, with user-friendly operation and low computational cost.

The Prefix Array for Multimedia Information Retrieval in the Real-Time Stenograph (실시간 속기 자막 환경에서 멀티미디어 정보 검색을 위한 Prefix Array)

Kim, Dong-Joo;Kim, Han-Woo
- Proceedings of the KIEE Conference, 2006.10c, pp.521-523, 2006
This paper proposes an algorithm and data structure to support real-time full-text search over streamed or broadcast multimedia data containing real-time stenographic text. Because the traditional indexing methods used in information retrieval rely on linguistic information, they are costly. Therefore, we base our algorithm and data structure on the suffix array, a simple data structure with low space complexity. Suffix arrays are frequently used to search huge texts. However, the subtitle text of multimedia data grows over time, so a suffix array must be reconstructed as the subtitle text continually changes. We propose a data structure called the prefix array and a search algorithm that uses it.
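For context, the suffix array that the abstract above starts from can be sketched as follows. This shows only the baseline structure with a deliberately simple construction; the prefix array proposed in the paper is not described in the abstract and is not reproduced here. Function names and the sample text are illustrative.

```python
def build_suffix_array(text):
    """Start positions of all suffixes of text, in lexicographic order.

    Simple construction by sorting suffix slices; fine for short texts only.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_occurrences(text, suffix_array, pattern):
    """Binary-search the suffix array for every position where pattern occurs."""
    m = len(pattern)
    lo, hi = 0, len(suffix_array)
    while lo < hi:  # first suffix whose leading m characters are >= pattern
        mid = (lo + hi) // 2
        if text[suffix_array[mid]:suffix_array[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start = lo
    hi = len(suffix_array)
    while lo < hi:  # first suffix whose leading m characters are > pattern
        mid = (lo + hi) // 2
        if text[suffix_array[mid]:suffix_array[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid
    return sorted(suffix_array[start:lo])


text = "banana"
sa = build_suffix_array(text)
hits = find_occurrences(text, sa, "ana")
```

Appending characters to `text` changes every suffix, so this array has to be rebuilt whenever the text grows, which is the cost the abstract identifies for continually extended subtitle text.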

Enhancing the Performance of Blog Retrieval by User Tagging and Social Network Analysis (사용자 태그와 중심성 지수를 이용한 블로그 검색 성능 향상에 관한 연구)

Kim, Eun-Hee;Chung, Young-Mee
- Journal of the Korean Society for information Management, v.27 no.1, pp.61-77, 2010
Blogs are now one of the major information resources on the web. The purpose of this study is to enhance the performance of blog retrieval by means of user-assigned tags and trackback information. To this end, retrieval experiments were performed with a dataset of 4,908 blog pages together with their associated trackback URLs. In the experiments, text terms, user tags, and network centrality values based on trackbacks were combined in various ways as retrieval features. The experimental results showed that employing user tags and network centrality values as retrieval features in addition to text words could improve the performance of blog retrieval.
https://doi.org/10.3743/KOSIM.2010.27.1.061
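One way to picture combining text terms, user tags, and trackback-based centrality as retrieval features is a weighted sum, sketched below. Everything specific here is an assumption made for illustration: the abstract does not say which centrality measure or combination method was used, so in-degree centrality, the term-overlap scores, the weights, and the sample pages are all hypothetical.

```python
def in_degree_centrality(trackbacks, page_ids):
    """Fraction of the other pages that send a trackback to each page."""
    counts = {p: 0 for p in page_ids}
    for source, target in set(trackbacks):
        if source != target and target in counts:
            counts[target] += 1
    denom = max(len(counts) - 1, 1)
    return {p: c / denom for p, c in counts.items()}


def overlap(query_terms, words):
    """Share of the query terms that appear among the given words."""
    query = set(query_terms)
    return len(query & set(words)) / len(query)


def rank_pages(query_terms, pages, trackbacks, weights=(0.6, 0.3, 0.1)):
    """Order page ids by a weighted sum of text, tag, and centrality scores."""
    w_text, w_tag, w_cent = weights  # hypothetical weights, not from the study
    centrality = in_degree_centrality(trackbacks, list(pages))
    scores = {
        p: w_text * overlap(query_terms, d["text"])
        + w_tag * overlap(query_terms, d["tags"])
        + w_cent * centrality[p]
        for p, d in pages.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# Invented sample data: page text words, user tags, and (source, target) trackbacks.
pages = {
    "p1": {"text": ["fuzzy", "search", "notes"], "tags": ["search"]},
    "p2": {"text": ["search", "engine"], "tags": ["search", "blog"]},
    "p3": {"text": ["cooking"], "tags": ["food"]},
}
trackbacks = [("p1", "p2"), ("p3", "p2")]
ranking = rank_pages(["blog", "search"], pages, trackbacks)
```

Here `p2` ranks first because it matches the query in both text and tags and also receives trackbacks from the other two pages.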

Semantic Image Retrieval Using RDF Metadata Based on the Representation of Spatial Relationships (공간관계 표현 기반 RDF 메타데이터를 이용한 의미적 이미지 검색)

Hwang, Myung-Gwun;Kong, Hyun-Jang;Kim, Pan-Koo
- The KIPS Transactions:PartB, v.11B no.5, pp.573-580, 2004
As modern techniques have improved, people increasingly store and manage information on the web. Image data in particular accounts for a large share of this information, owing to advances in scanning and the popularization of digital cameras and cell-phone cameras. However, most image retrieval systems are still based on text annotations, even though many images are created on the web every day. In this paper, we suggest a new approach to semantic image retrieval using RDF metadata based on the representation of spatial relationships. First, we define new vocabularies to represent the spatial relationships between the objects in an image. Second, we write the metadata about the image using RDF and the new vocabularies. Finally, we expect more accurate results from our image retrieval system.
https://doi.org/10.3745/KIPSTB.2004.11B.5.573

Search Result 213, Processing Time 0.026 seconds
