• Title/Abstract/Keywords: Precision-recall

Search results: 703 items (processing time: 0.026 s)

A Study on the Effectiveness of Information Retrieval

  • 윤구호
    • 한국문헌정보학회지 / Vol. 8 / pp. 73-101 / 1981
  • Retrieval effectiveness is the principal criterion for measuring the performance of an information retrieval system. The effectiveness of a retrieval system depends primarily on the extent to which it can retrieve wanted documents without retrieving unwanted ones. Ultimately, then, effectiveness is a function of the relevant and nonrelevant documents retrieved. Consequently, the 'relevance' of information to the user's request has become one of the most fundamental concepts in the theory of information retrieval. Although there is at present no consensus on how this notion should be defined, relevance has been widely used as a meaningful quantity and an adequate criterion for evaluating retrieval effectiveness. Among the various parameters based on the 'two-by-two' table (or contingency table), recall and precision were the major considerations in this paper, because it is assumed that recall and precision are sufficient for the measurement of effectiveness. Accordingly, the different concepts of 'relevance' and 'pertinence' of documents to user requests and their proper usage were investigated, even though the two terms have unfortunately been used rather loosely in the literature. In addition, a number of variables affecting the recall and precision values were discussed. Some conclusions derived from this study are as follows: Any notion of retrieval effectiveness is based on 'relevance', which itself is extremely difficult to define. Recall and precision are valuable concepts in the study of any information retrieval system. They are, however, not the only criteria by which a system may be judged. The recall-precision curve represents the average performance of any given system, and this may vary quite considerably in particular situations. Therefore, it is possible to some extent to vary the indexing policy, the indexing language, or the search methodology to improve the performance of the system in terms of recall and precision. The 'inverse relationship' between average recall and precision could be accepted as the 'fundamental law of retrieval', and it should certainly be used as an aid to evaluation. Finally, there is a limit to the performance (in terms of effectiveness) achievable by an information retrieval system. That is: "Perfect retrieval is impossible."
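
The two-by-two contingency table mentioned in the abstract above can be made concrete with a short sketch. The class name, field names, and counts below are illustrative assumptions, not taken from the paper; the two ratios follow the standard definitions of recall and precision.

```python
from dataclasses import dataclass


@dataclass
class ContingencyTable:
    """Two-by-two table of retrieval outcomes for one query (hypothetical names)."""
    relevant_retrieved: int      # relevant documents the system returned
    nonrelevant_retrieved: int   # unwanted documents the system returned
    relevant_missed: int         # relevant documents the system failed to return
    nonrelevant_rejected: int    # unwanted documents correctly left out

    def recall(self) -> float:
        # Share of all relevant documents that were retrieved.
        relevant = self.relevant_retrieved + self.relevant_missed
        return self.relevant_retrieved / relevant if relevant else 0.0

    def precision(self) -> float:
        # Share of all retrieved documents that are relevant.
        retrieved = self.relevant_retrieved + self.nonrelevant_retrieved
        return self.relevant_retrieved / retrieved if retrieved else 0.0


# Hypothetical counts, chosen only for illustration.
table = ContingencyTable(
    relevant_retrieved=30,
    nonrelevant_retrieved=10,
    relevant_missed=20,
    nonrelevant_rejected=940,
)
print(f"recall={table.recall():.2f} precision={table.precision():.2f}")
```

Note that the fourth cell (nonrelevant documents correctly rejected) enters neither ratio, which is one reason the abstract cautions that recall and precision are not the only criteria for judging a system.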

Image Clustering Using Machine Learning: Study of InceptionV3 with K-means Methods

  • 닌담 솜사우트;이효종
    • 한국정보처리학회:학술대회논문집 / 한국정보처리학회 2021년도 추계학술발표대회 / pp. 681-684 / 2021
  • In this paper, we study image clustering without labels using machine learning techniques. We propose an unsupervised machine learning technique for designing an image clustering model that automatically categorizes images into groups. Our experiment focuses on the Inception convolutional neural network (Inception V3) combined with the k-means method to cluster images. For this, we collected public datasets (Food-K5, Flowers, Handwritten Digit, and Cats-dogs), our own dataset Rice Germination, and the owner dataset Palm print. The experiment has three parts. First, all images are stripped of their labels and moved into the whole dataset. Second, the dataset is loaded into Inception V3 to extract image features, which are passed to k-means to form cluster groups for six classes. Lastly, model accuracy is evaluated from the confusion matrix using precision, recall, and F1. With this method we obtained the following results: 1) Handwritten Digit (precision = 1.000, recall = 1.000, F1 = 1.00), 2) Food-K5 (precision = 0.975, recall = 0.945, F1 = 0.96), 3) Palm print (precision = 1.000, recall = 0.999, F1 = 1.00), 4) Cats-dogs (precision = 0.997, recall = 0.475, F1 = 0.64), 5) Flowers (precision = 0.610, recall = 0.982, F1 = 0.75), and our dataset 6) Rice Germination (precision = 0.997, recall = 0.943, F1 = 0.97). Our experiment showed that the model achieved an accuracy of 0.8908; the outcomes indicate that the proposed model is strong enough to differentiate the images and group them into clusters.
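
The F1 values in the abstract above are consistent with the usual definition of F1 as the harmonic mean of precision and recall. The sketch below recomputes F1 for four of the reported precision/recall pairs; the function name `f1_score` is an illustrative choice, and the authors' own evaluation code is not shown in the source.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall) pairs as reported in the abstract.
reported = {
    "Food-K5": (0.975, 0.945),
    "Cats-dogs": (0.997, 0.475),
    "Flowers": (0.610, 0.982),
    "Rice Germination": (0.997, 0.943),
}

for name, (p, r) in reported.items():
    print(f"{name}: F1 = {f1_score(p, r):.2f}")
```

Because the harmonic mean is pulled toward the smaller of the two inputs, the Cats-dogs result (high precision, low recall) ends up with a much lower F1 than its precision alone would suggest.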

A Film-Defect Inspection System Using Image Segmentation and Template Matching Techniques

  • 윤영근;이석룡;박호현;정진완;김상희
    • 한국정보과학회논문지:데이타베이스 / Vol. 34, No. 2 / pp. 99-108 / 2007
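  • In this paper, we design and implement a Film Defect Inspection System (FDIS) that detects defects and determines their types in the final stage of manufacturing the polarized film used in TFT-LCDs. The proposed system detects defects in polarized-film images using an image segmentation technique and is designed to determine the defect type by analyzing the image of each detected defect. The defect type is determined by extracting features such as the morphological characteristics and texture of the defect region and comparing them with reference defect images stored in a template database. In experiments with FDIS, all defect regions in the test images were detected quickly (0.64 seconds on average) and accurately (Precision 1.0, Recall 1.0), and the defect-type classification experiment also showed very high accuracy, with an average Precision of 0.96 and Recall of 0.95. In addition, a defect-type detection experiment with rotation applied yielded an average Precision of 0.95 and Recall of 0.89, showing that the proposed technique is robust to rotation.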

Sentiment Analysis From Images - Comparative Study of SAI-G and SAI-C Models' Performances Using AutoML Vision Service from Google Cloud and Clarifai Platform

  • Marcu, Daniela;Danubianu, Mirela
    • International Journal of Computer Science & Network Security / Vol. 21, No. 9 / pp. 179-184 / 2021
  • In our study we performed sentiment analysis on images. For this purpose, we used 153 images containing people, animals, buildings, landscapes, cakes, and objects, which we divided into two categories: images suggesting a positive emotion and images suggesting a negative emotion. To classify the images into the two categories, we created two models. The SAI-G model was created with Google's AutoML Vision service. The SAI-C model was created on the Clarifai platform. The data were labeled in a preprocessing stage, and for the SAI-C model we created the concepts POSITIVE (POZITIV) and NEGATIVE (NEGATIV). To evaluate the performance of the two models, we used a series of evaluation metrics: Precision, Recall, ROC (Receiver Operating Characteristic) curve, Precision-Recall curve, Confusion Matrix, Accuracy Score, and Average Precision. Precision and Recall for the SAI-G model are both 0.875 at a confidence threshold of 0.5, while for the SAI-C model we obtained much lower scores, Precision = 0.727 and Recall = 0.571, at the same confidence threshold. The results indicate a lower classification performance of the SAI-C model compared to the SAI-G model. The exception is the value of Precision for the POSITIVE concept, which is 1.000.
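
Precision and recall "at a confidence threshold" depend on where the cut-off is placed. The following is a minimal sketch of that dependence, assuming each image receives a confidence score for the positive class; the function name, scores, and labels are hypothetical and are not data from the study or output of either platform.

```python
def precision_recall_at(scores, labels, threshold=0.5):
    """Precision and recall for the positive class at a confidence threshold.

    scores: predicted confidence (0..1) that each item is positive.
    labels: ground truth, True for positive items.
    """
    predicted = [s >= threshold for s in scores]
    tp = sum(p and y for p, y in zip(predicted, labels))
    fp = sum(p and not y for p, y in zip(predicted, labels))
    fn = sum((not p) and y for p, y in zip(predicted, labels))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical confidences and ground-truth labels for seven images.
scores = [0.95, 0.80, 0.65, 0.55, 0.40, 0.30, 0.20]
labels = [True, True, False, True, True, False, False]

for t in (0.3, 0.5, 0.7):
    p, r = precision_recall_at(scores, labels, t)
    print(f"threshold={t}: precision={p:.2f} recall={r:.2f}")
```

On this toy data, raising the threshold trades recall for precision, which is the relationship a Precision-Recall curve plots across all thresholds.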

Tree-Pattern-Based Clone Detection with High Precision and Recall

  • Lee, Hyo-Sub;Choi, Myung-Ryul;Doh, Kyung-Goo
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 12, No. 5 / pp. 1932-1950 / 2018
  • The paper proposes a code-clone detection method that gives the highest possible precision and recall, without giving much attention to efficiency and scalability. The goal is to automatically create a reliable reference corpus that can be used as a basis for evaluating the precision and recall of clone detection tools. The algorithm takes an abstract-syntax-tree representation of source code and thoroughly examines every possible pair of all duplicate tree patterns in the tree, while avoiding unnecessary and duplicated comparisons wherever possible. The largest possible duplicate patterns are then collected in the set of pattern clusters that are used to identify code clones. The method is implemented and evaluated for a standard set of open-source Java applications. The experimental result shows very high precision and recall. False-negative clones missed by our method are all non-contiguous clones. Finally, the concept of neighbor patterns, which can be used to improve recall by detecting non-contiguous clones and intertwined clones, is proposed.
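
To give a feel for clone detection on abstract syntax trees, here is a deliberately simplified sketch. It is not the paper's tree-pattern algorithm: it only groups statements whose subtrees are exactly identical, it works on Python source through the standard `ast` module rather than on Java, and the function name and `min_nodes` parameter are illustrative assumptions.

```python
import ast
from collections import defaultdict


def find_duplicate_statements(source: str, min_nodes: int = 5):
    """Group structurally identical statements in `source`.

    Returns a list of clusters; each cluster is the sorted list of line
    numbers where the same statement subtree occurs. Exact matches only.
    """
    tree = ast.parse(source)
    clusters = defaultdict(list)
    for node in ast.walk(tree):
        if not isinstance(node, ast.stmt):
            continue
        # Skip tiny subtrees, which would match almost everywhere.
        size = sum(1 for _ in ast.walk(node))
        if size < min_nodes:
            continue
        # ast.dump omits line/column attributes by default, so statements
        # with the same shape and identifiers produce the same key.
        clusters[ast.dump(node)].append(node.lineno)
    return [sorted(lines) for lines in clusters.values() if len(lines) > 1]


sample = """
def area(w, h):
    total = w * h + 1
    return total

def size(w, h):
    total = w * h + 1
    return total
"""

print(find_duplicate_statements(sample))
```

Exact-subtree grouping like this misses clones that differ in identifiers or that are non-contiguous, which are the kinds of cases the abstract's tree patterns and neighbor patterns are meant to address.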

Evaluation of the Newspaper Library: With Emphasis on the Document Delivery Capability and Retrieval Effectiveness

  • 노동조
    • 한국비블리아학회지 / Vol. 7, No. 1 / pp. 319-351 / 1994
  • This research is a case study of newspaper libraries in Seoul; its primary purpose is to investigate their document delivery capability. To this end, representative users visited seven newspaper libraries and recorded their searching time. Document delivery capability was measured as searching time in hours, minutes, and seconds. Retrieval effectiveness was tested through the recall ratio and the precision ratio. The major findings of the study are summarized as follows: 1) Most of the newspaper libraries showed excellent document delivery capability; six newspaper libraries delivered data related to the subject. 2) Across all materials, the newspaper libraries showed a mean recall ratio of 50.1% and a mean precision ratio of 84.8%. 3) For their own articles, the newspaper libraries showed a recall ratio of 71.4% and a precision ratio of 90.0%, meaning that retrieval of their own articles was more effective than retrieval of other materials. 4) The Kookmin Ilbo library had the most excellent system, and the precision ratio of The Dong-A Ilbo library exceeded its recall ratio. The Han Kyoreh Shinmun library had an excellent arrangement of its own articles, but The Segye Times library had problems in every respect.

A Study on a Filtering Method of Recommendation Service System Using User's Context

  • 한동조;박대영;최기호
    • 한국ITS학회 논문지 / Vol. 8, No. 1 / pp. 119-126 / 2009
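  • Recently, many recommendation service systems have been developed that automatically find or recommend information for users by taking individual tastes and characteristics into account. However, accurate recommendation is difficult when preferences that depend on the user's context are not considered. This paper therefore proposes a filtering method that can make accurate recommendations by considering context-dependent user preferences. To this end, we derive user preferences by context and use the Pearson correlation coefficient to compute each user's object preferences per context. In experiments, compared with existing service systems, precision improved by 11% and 2% and recall by 8% and 4%; overall, precision was 77% and recall was 53%.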
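
The Pearson correlation coefficient named in the abstract above can be computed as in the sketch below. How the paper feeds context-dependent preferences into the coefficient is not described in the source, so the example simply correlates two hypothetical users' ratings of the same objects in one context; the function name and the ratings are assumptions.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length rating lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # A constant rating list has no defined correlation; treat as 0.
        return 0.0
    return cov / sqrt(var_x * var_y)


# Hypothetical ratings of the same four objects by two users in one context.
user_a = [5, 3, 4, 1]
user_b = [4, 2, 5, 1]
print(f"similarity = {pearson(user_a, user_b):.3f}")
```

In a collaborative-filtering setting, a value near 1 would mark the two users as having similar preferences in that context, so one user's ratings could inform recommendations for the other.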

CCR: Tree-pattern based Code-clone Detector

  • 이효섭;도경구
    • 한국소프트웨어감정평가학회 논문지 / Vol. 8, No. 2 / pp. 13-27 / 2012
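  • In this study, we propose and implement CCR (Code Clone Ransacker), a tool that detects code clones based on tree patterns. CCR compares all pairs of subtrees of the program tree to find the duplicated parts, called tree patterns, and groups patterns of identical shape together to exhaustively detect the clones present in a program. In doing so, it excludes clone patterns inside already-found patterns from comparison, which avoids redundant work and keeps unnecessary computation to a minimum. An experimental evaluation shows that CCR has high precision and recall. Existing tree-pattern-based code-clone detection tools, which compare program structure, are already known to have good precision and recall, but CCR maintains high precision while achieving recall up to 5 times higher than Asta and about 1.9 times higher than CloneDigger. Moreover, the code clones found by CCR include most of the clones in an existing reference corpus of code clones.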

An Experimental Study on Semantic Searches for Image Data Using Structured Social Metadata

  • 김현희;김용호
    • 한국문헌정보학회지 / Vol. 44, No. 1 / pp. 117-135 / 2010
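  • For the semantic search of images, this study proposes a model of a structured folksonomy system in which tags are controlled by using equivalent terms, synonyms, and related terms during query expansion. To evaluate the effectiveness of the proposed system, we then experimentally compared it with a tag-based system in which tags are not controlled at all, in terms of retrieval effectiveness (recall and precision) and satisfaction. We also investigated how retrieval effectiveness differs by query expansion method. The experiments showed that the proposed structured folksonomy system scored higher than the tag-based system in recall, precision, and satisfaction, and that the differences were statistically significant. Recall did not differ by query expansion method, whereas precision differed in part. The results of this study can be applied to digital library systems in the Library 2.0 era to improve access to digital resources.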

Diagnostic performance of artificial intelligence using cone-beam computed tomography imaging of the oral and maxillofacial region: A scoping review and meta-analysis

  • Farida Abesi;Mahla Maleki;Mohammad Zamani
    • Imaging Science in Dentistry / Vol. 53, No. 2 / pp. 101-108 / 2023
  • Purpose: The aim of this study was to conduct a scoping review and meta-analysis to provide overall estimates of the recall and precision of artificial intelligence for detection and segmentation using oral and maxillofacial cone-beam computed tomography (CBCT) scans. Materials and Methods: A literature search was done in Embase, PubMed, and Scopus through October 31, 2022 to identify studies that reported the recall and precision values of artificial intelligence systems using oral and maxillofacial CBCT images for the automatic detection or segmentation of anatomical landmarks or pathological lesions. Recall (sensitivity) indicates the percentage of certain structures that are correctly detected. Precision (positive predictive value) indicates the percentage of accurately identified structures out of all detected structures. The performance values were extracted and pooled, and the estimates were presented with 95% confidence intervals (CIs). Results: In total, 12 eligible studies were finally included. The overall pooled recall for artificial intelligence was 0.91 (95% CI: 0.87-0.94). In a subgroup analysis, the pooled recall was 0.88 (95% CI: 0.77-0.94) for detection and 0.92 (95% CI: 0.87-0.96) for segmentation. The overall pooled precision for artificial intelligence was 0.93 (95% CI: 0.88-0.95). A subgroup analysis showed that the pooled precision value was 0.90 (95% CI: 0.77-0.96) for detection and 0.94 (95% CI: 0.89-0.97) for segmentation. Conclusion: Excellent performance was found for artificial intelligence using oral and maxillofacial CBCT images.