• Title/Summary/Keyword: BM25 algorithm


A Research on Enhancement of Text Categorization Performance by using Okapi BM25 Word Weight Method (Okapi BM25 단어 가중치법 적용을 통한 문서 범주화의 성능 향상)

  • Lee, Yong-Hun;Lee, Sang-Bum
    • Journal of the Korea Academia-Industrial cooperation Society / v.11 no.12 / pp.5089-5096 / 2010
  • Text categorization, which classifies documents according to given criteria, is an important feature of information retrieval systems. The general approach extracts important index words from the target documents and assigns weights to them. Because the performance and correctness of text categorization depend heavily on the weighting algorithm, its effectiveness is critical. This paper introduces an enhanced text categorization method based on an improved word weighting technique. Okapi BM25 has proven effective in several information retrieval engines. We applied Okapi BM25 to categorization and showed that it performs well. It is compared with other word weighting methods: TF-IDF, TF-ICF, and TF-ISF. The experiments use the Reuters-21578 collection with SVM and KNN classifiers. The modified Okapi BM25 shows the best performance.
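
The abstract does not say how Okapi BM25 was modified, so the following is only a minimal sketch of standard BM25 used as a per-document term weight (the kind of value that could replace TF-IDF as a feature for SVM or KNN). The function name `bm25_weights`, the parameter defaults `k1=1.2` and `b=0.75`, and the non-negative idf variant are assumptions, not the authors' method.

```python
import math
from collections import Counter

def bm25_weights(docs, k1=1.2, b=0.75):
    """Return one {term: weight} dict per tokenized document.

    Standard Okapi BM25 term weighting (assumed parameters k1, b);
    the paper's modified variant is not described in the abstract.
    """
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents that contain each term.
    df = Counter(term for d in docs for term in set(d))
    weighted = []
    for d in docs:
        tf = Counter(d)
        # Length normalization: longer-than-average documents are penalized.
        norm = k1 * (1 - b + b * len(d) / avg_len)
        weighted.append({
            t: math.log(1 + (n_docs - df[t] + 0.5) / (df[t] + 0.5))
               * f * (k1 + 1) / (f + norm)
            for t, f in tf.items()
        })
    return weighted

docs = [
    ["oil", "price", "oil"],
    ["grain", "price"],
    ["grain", "export", "grain", "wheat"],
]
weights = bm25_weights(docs)
```

In the first toy document, "oil" receives a larger weight than "price" because it occurs more often in the document and in fewer documents overall.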

Optimization of block-matching and 3D filtering (BM3D) algorithm in brain SPECT imaging using fan beam collimator: Phantom study

  • Do, Yongho;Cho, Youngkwon;Kang, Seong-Hyeon;Lee, Youngjin
    • Nuclear Engineering and Technology / v.54 no.9 / pp.3403-3414 / 2022
  • The purpose of this study is to model and optimize the block-matching and 3D filtering (BM3D) algorithm and to evaluate its applicability in brain single-photon emission computed tomography (SPECT) images using a fan beam collimator. For quantitative evaluation of the noise level, the coefficient of variation (COV) and contrast-to-noise ratio (CNR) were used, and finally, a no-reference-based evaluation parameter was used for optimization of the BM3D algorithm in the brain SPECT images. As a result, optimized results were derived when the sigma values of the BM3D algorithm were 0.15, 0.2, and 0.25 in brain SPECT images acquired for 5, 10, and 15 s, respectively. In addition, when the sigma value of the optimized BM3D algorithm was applied, superior results were obtained compared with conventional filtering methods. In particular, we confirmed that the COV and CNR of the images obtained using the BM3D algorithm were improved by 2.40 and 2.33 times, respectively, compared with the original image. In conclusion, the usefulness of the optimized BM3D algorithm in brain SPECT images using a fan beam collimator has been proven, and based on the results, it is expected that its application in various nuclear medicine examinations will be possible.
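
The abstract names COV and CNR but does not give their formulas. The sketch below uses one common set of definitions (COV as standard deviation over mean in a region of interest, CNR as the absolute mean difference over the combined standard deviation); the function names and the exact CNR denominator are assumptions and may differ from the paper's.

```python
import math
from statistics import mean, pstdev

def cov(roi):
    """Coefficient of variation of pixel values in one region of interest."""
    return pstdev(roi) / mean(roi)

def cnr(target, background):
    """Contrast-to-noise ratio between a target ROI and a background ROI.

    Assumed definition: |mean_t - mean_b| / sqrt(sd_t^2 + sd_b^2).
    """
    noise = math.sqrt(pstdev(target) ** 2 + pstdev(background) ** 2)
    return abs(mean(target) - mean(background)) / noise

# Toy pixel values, for illustration only.
print(cov([8, 12]))
print(cnr([20, 22], [10, 12]))
```

A lower COV indicates less noise and a higher CNR indicates better separation of the target from the background.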

Assistant Chatbot for Database Design Course (데이터베이스 설계 교과목을 위한 조교 챗봇)

  • Kim, Eun-Gyung;Jeong, Tae-Hun
    • Journal of the Korea Institute of Information and Communication Engineering / v.26 no.11 / pp.1615-1622 / 2022
  • To overcome the limitations of instructor-centered lectures, flipped learning, a learner-centered teaching method, has recently been widely introduced. Despite its many advantages, however, students cannot get questions that arise during pre-class learning answered in real time. To solve this problem, we developed DBbot, an assistant chatbot for a database design course taught with flipped learning. DBbot consists of a chatbot app for learners and a chatbot management app for instructors. Questions that instructors can anticipate, such as questions about class operation and questions about learning content that recur every semester, are answered using Google's Dialogflow. Questions that instructors cannot predict, such as questions about team projects, are answered using a question/answer DB and the BM25 algorithm, a similarity comparison algorithm.
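
As a rough illustration of answering from a question/answer DB with BM25, the sketch below scores each stored question against a learner's query and returns the answer attached to the best match. The names `bm25_scores` and `answer`, the whitespace tokenizer, the parameters `k1` and `b`, and the sample Q/A pairs are all hypothetical; the abstract does not describe DBbot's actual implementation.

```python
import math
from collections import Counter

def bm25_scores(query, corpus, k1=1.5, b=0.75):
    """Score every text in `corpus` against `query` with standard BM25."""
    docs = [text.lower().split() for text in corpus]
    n = len(docs)
    avg_len = sum(len(d) for d in docs) / n
    df = Counter(t for d in docs for t in set(d))
    query_terms = set(query.lower().split())
    scores = []
    for d in docs:
        tf = Counter(d)
        norm = k1 * (1 - b + b * len(d) / avg_len)
        score = 0.0
        for t in query_terms:
            if t in tf:
                idf = math.log(1 + (n - df[t] + 0.5) / (df[t] + 0.5))
                score += idf * tf[t] * (k1 + 1) / (tf[t] + norm)
        scores.append(score)
    return scores

def answer(query, qa_pairs):
    """Return the stored answer whose question best matches, or None."""
    scores = bm25_scores(query, [q for q, _ in qa_pairs])
    best = max(range(len(scores)), key=scores.__getitem__)
    return qa_pairs[best][1] if scores[best] > 0 else None

# Hypothetical Q/A DB entries.
qa_pairs = [
    ("how do we normalize tables in the team project", "See the normalization guide."),
    ("when is the team project deadline", "The deadline is announced in class."),
    ("what is a foreign key", "A column that references another table's key."),
]
print(answer("team project deadline", qa_pairs))
```

Returning `None` when no query term matches leaves room for a fallback, such as forwarding the question to the instructor.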

Implementation of Battery Management System for Li-ion Battery Considering Self-energy Balancing (셀프에너지 밸런싱을 고려한 리튬이온전지의 Battery Management System 구현)

  • Kim, Ji-Myung;Lee, Hu-Dong;Tae, Dong-Hyun;Ferreira, Marito;Park, Ji-Hyun;Rho, Dae-Seok
    • Journal of the Korea Academia-Industrial cooperation Society / v.21 no.3 / pp.585-593 / 2020
  • Until now, 29 ESS fire accidents have occurred; 22 of them were in systems interconnected with renewable energy sources and occurred, regardless of the season, during the rest period after the lithium-ion battery had been fully charged. These fires were attributed to thermal runaway caused by the overcharging of a few cells through self-energy balancing, an unintentional current flow from cells with a high state of charge (SOC) to cells with a low SOC that arises when cells connected in parallel have different SOCs. Therefore, this paper proposes a novel configuration and operation algorithm of the battery management system (BMS) to prevent self-energy balancing in ESS, and presents a hybrid SOC estimation algorithm. Tests of the self-energy balancing phenomenon between aged and normal cells, based on the proposed algorithm and BMS, confirmed that self-energy balancing can occur. In addition, the proposed BMS configuration is useful and practical for improving the safety of lithium-ion batteries, because the BMS can reliably disconnect a parallel connection of cells if the self-energy balancing current becomes excessively high.

Usefulness of Deep Learning Image Reconstruction in Pediatric Chest CT (소아 흉부 CT 검사 시 딥러닝 영상 재구성의 유용성)

  • Do-Hun Kim;Hyo-Yeong Lee
    • Journal of the Korean Society of Radiology / v.17 no.3 / pp.297-303 / 2023
  • Pediatric Computed Tomography (CT) examinations can often result in exam failures or the need for frequent retests due to the difficulty of cooperation from young patients. Deep Learning Image Reconstruction (DLIR) methods offer the potential to obtain diagnostically valuable images while reducing the retest rate in CT examinations of pediatric patients with high radiation sensitivity. In this study, we investigated the possibility of applying DLIR to reduce artifacts caused by respiration or motion and obtain clinically useful images in pediatric chest CT examinations. Retrospective analysis was conducted on chest CT examination data of 43 children under the age of 7 from P Hospital in Gyeongsangnam-do. The images reconstructed using Filtered Back Projection (FBP), Adaptive Statistical Iterative Reconstruction (ASIR-50), and the deep learning algorithm TrueFidelity-Middle (TF-M) were compared. Regions of interest (ROI) were drawn on the right ascending aorta (AA) and back muscle (BM) in contrast-enhanced chest images, and noise (standard deviation, SD) was measured using Hounsfield units (HU) in each image. Statistical analysis was performed using SPSS (ver. 22.0), analyzing the mean values of the three measurements with one-way analysis of variance (ANOVA). The results showed that the SD values for AA were FBP=25.65±3.75, ASIR-50=19.08±3.93, and TF-M=17.05±4.45 (F=66.72, p=0.00), while the SD values for BM were FBP=26.64±3.81, ASIR-50=19.19±3.37, and TF-M=19.87±4.25 (F=49.54, p=0.00). Post-hoc tests revealed significant differences among the three groups. DLIR using TF-M demonstrated significantly lower noise values compared to conventional reconstruction methods. Therefore, the application of the deep learning algorithm TrueFidelity-Middle (TF-M) is expected to be clinically valuable in pediatric chest CT examinations by reducing the degradation of image quality caused by respiration or motion.

Tensor-based tag emotion aware recommendation with probabilistic ranking

  • Lim, Hyewon;Kim, Hyoung-Joo
    • KSII Transactions on Internet and Information Systems (TIIS) / v.13 no.12 / pp.5826-5841 / 2019
  • In our previous research, we proposed a tag emotion-based item recommendation scheme. The ternary associations among users, items, and tags are described as a third-order tensor in order to capture the emotions in tags. The candidates for recommendation are created based on the latent semantics derived by a high-order singular value decomposition (HOSVD) technique. However, the tensor is very sparse because the number of tagged items is much smaller than the total number of items. The previous research also did not consider the previous behaviors of users and items. To mitigate these problems, this paper uses an item-based collaborative filtering scheme to build an extended dataset. We also apply a probabilistic ranking algorithm that considers the user and item profiles to improve recommendation performance. The proposed method is evaluated on the MovieLens dataset, and the results show that our approach improves performance compared to other methods.
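
The abstract does not detail how item-based collaborative filtering extends the sparse data. As a generic sketch of the technique only, the code below estimates a missing rating from the user's other ratings, weighted by item-to-item cosine similarity. The nested-dict layout of `ratings` and the names `item_cosine` and `predict` are assumptions for illustration.

```python
import math

def item_cosine(ratings, i, j):
    """Cosine similarity between items i and j over users who rated both."""
    common = [u for u in ratings if i in ratings[u] and j in ratings[u]]
    if not common:
        return 0.0
    dot = sum(ratings[u][i] * ratings[u][j] for u in common)
    norm_i = math.sqrt(sum(ratings[u][i] ** 2 for u in common))
    norm_j = math.sqrt(sum(ratings[u][j] ** 2 for u in common))
    return dot / (norm_i * norm_j)

def predict(ratings, user, item):
    """Similarity-weighted average of the user's ratings on other items."""
    num = den = 0.0
    for other, rating in ratings[user].items():
        sim = item_cosine(ratings, item, other)
        num += sim * rating
        den += abs(sim)
    return num / den if den else None

# Toy user -> {item: rating} data, for illustration only.
ratings = {
    "u1": {"a": 5, "b": 5},
    "u2": {"a": 1, "b": 1, "c": 1},
    "u3": {"a": 4, "c": 4},
}
print(predict(ratings, "u1", "c"))
```

Predicted values of this kind could fill empty entries before further ranking; how the paper combines them with the tensor model is not stated in the abstract.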

TAKES: Two-step Approach for Knowledge Extraction in Biomedical Digital Libraries

  • Song, Min
    • Journal of Information Science Theory and Practice / v.2 no.1 / pp.6-21 / 2014
  • This paper proposes a novel knowledge extraction system, TAKES (Two-step Approach for Knowledge Extraction System), which integrates advanced techniques from Information Retrieval (IR), Information Extraction (IE), and Natural Language Processing (NLP). In particular, TAKES adopts a novel keyphrase extraction-based query expansion technique to collect promising documents. It also uses a Conditional Random Field-based machine learning technique to extract important biological entities and relations. TAKES is applied to biological knowledge extraction, particularly retrieving promising documents that contain Protein-Protein Interaction (PPI) and extracting PPI pairs. TAKES consists of two major components: DocSpotter, which is used to query and retrieve promising documents for extraction, and a Conditional Random Field (CRF)-based entity extraction component known as FCRF. The present paper investigated research problems addressing the issues with a knowledge extraction system and conducted a series of experiments to test our hypotheses. The findings from the experiments are as follows: First, the author verified, using three different test collections to measure the performance of our query expansion technique, that DocSpotter is robust and highly accurate when compared to Okapi BM25 and SLIPPER. Second, the author verified that our relation extraction algorithm, FCRF, is highly accurate in terms of F-Measure compared to four other competitive extraction algorithms: Support Vector Machine, Maximum Entropy, Single POS HMM, and Rapier.