Search | Korea Science

An Experimental Study on Topic Distillation Using Web Site Structure (웹 사이트 구조를 이용한 토픽 검색 연구)

Lee, Jee-Suk; Chung, Yung-Mee
Journal of the Korean Society for Information Management, v.24 no.3, pp.201-218, 2007
This study proposes a topic distillation algorithm that ranks the relevant sites selected from retrieved web pages, and evaluates the performance of the algorithm. The algorithm calculates the topic score of a site using its hierarchical structure. The TREC .GOV test collection and a set of TREC-2004 queries for the topic distillation task are used for the experiment. The experimental results show that the algorithm returned at least two relevant sites in the top ten retrieval results. We performed an in-depth analysis of the list of relevant sites provided by TREC-2004 and found that the definition of topic distillation had not been strictly applied in selecting relevant sites. When we re-evaluated the retrieved sites and sub-sites using the revised list of relevant sites, the performance of the proposed algorithm improved significantly.
https://doi.org/10.3743/KOSIM.2007.24.3.201
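
The abstract above does not give the scoring formula, so the following is only a minimal sketch of one way a site-level topic score could be derived from hierarchical structure. It assumes that each retrieved page already has a relevance score, that a "site" is identified by its host name, and that pages deeper in the URL path contribute less. The names `page_depth`, `site_topic_scores`, and the `decay` parameter are hypothetical and are not taken from the paper.

```python
from collections import defaultdict
from urllib.parse import urlparse


def page_depth(url: str) -> int:
    """Number of non-empty path segments; the site root has depth 0."""
    path = urlparse(url).path
    return len([seg for seg in path.split("/") if seg])


def site_topic_scores(pages, decay=0.5):
    """Aggregate page scores into per-host scores.

    pages: iterable of (url, relevance_score) pairs from a retrieval run.
    decay: assumed per-level discount, so pages near the site root count more.
    """
    scores = defaultdict(float)
    for url, score in pages:
        host = urlparse(url).netloc
        scores[host] += score * (decay ** page_depth(url))
    # Highest-scoring sites first.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Illustrative input only; these URLs and scores are made up.
retrieved = [
    ("http://www.example.gov/", 0.9),
    ("http://www.example.gov/topics/energy/", 0.8),
    ("http://other.example.gov/a/b/c/page.html", 1.0),
]
ranking = site_topic_scores(retrieved)
```

With this weighting, a site whose entry page and shallow sub-pages match the query outranks a site with a single highly scored but deeply nested page, which is the general intuition behind ranking sites rather than individual pages.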

Novel Category Discovery in Plant Species and Disease Identification through Knowledge Distillation

Jiuqing Dong; Alvaro Fuentes; Mun Haeng Lee; Taehyun Kim; Sook Yoon; Dong Sun Park
Smart Media Journal, v.13 no.7, pp.36-44, 2024
Identifying plant species and diseases is crucial for maintaining biodiversity and achieving optimal crop yields, making it a topic of significant practical importance. Recent studies have extended plant disease recognition from traditional closed-set scenarios to open-set environments, where the goal is to reject samples that do not belong to known categories. However, in open-world tasks, it is essential not only to label unknown samples as "unknown" but also to classify them further. This task assumes that images and labels of known categories are available and that samples of unknown categories can be accessed. The model classifies unknown samples by learning the prior knowledge of known categories. To the best of our knowledge, there is no existing research on this topic in plant-related recognition tasks. To address this gap, this paper utilizes knowledge distillation to model the category space relationships between known and unknown categories. Specifically, we identify similarities between different species or diseases. By leveraging a model fine-tuned on known categories, we generate pseudo-labels for unknown categories. Additionally, we enhance the baseline method's performance by using a larger pre-trained model, DINOv2. We evaluate the effectiveness of our method on the large plant specimen dataset Herbarium 19 and the disease dataset PlantVillage. Notably, our method outperforms the baseline by 1% to 20% in terms of accuracy for novel category classification. We believe this study will contribute to the community.
https://doi.org/10.30693/SMJ.2024.13.7.36
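
The abstract above describes generating pseudo-labels from a model fine-tuned on known categories, but it does not state the loss. As a rough illustration, the sketch below uses the generic soft-label form of knowledge distillation: temperature-scaled teacher outputs serve as a soft pseudo-label, and a KL divergence measures how far the student's prediction is from it. This is an assumed stand-in rather than the paper's formulation, and the function names, logits, and temperature value are hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; a higher temperature gives a softer distribution."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """Compare the student's prediction with the teacher's soft pseudo-label."""
    soft_label = softmax(teacher_logits, temperature)
    student_probs = softmax(student_logits, temperature)
    return kl_divergence(soft_label, student_probs)


# Made-up teacher logits over three known categories for one unlabeled image.
teacher_logits = [2.0, 0.5, -1.0]
soft_label = softmax(teacher_logits, temperature=2.0)
loss_same = distillation_loss(teacher_logits, teacher_logits)
loss_diff = distillation_loss(teacher_logits, [-1.0, 0.5, 2.0])
```

The soft label keeps the teacher's relative similarities between categories instead of collapsing to a single hard class, which is one way the relationship between known and unknown categories can be carried over to the student.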
