• Title/Summary/Keyword: Search algorithm


Using Roots and Patterns to Detect Arabic Verbs without Affixes Removal

  • Abdulmonem Ahmed;Aybaba Hancrliogullari;Ali Riza Tosun
    • International Journal of Computer Science & Network Security / v.23 no.4 / pp.1-6 / 2023
  • Morphological analysis, a branch of natural language processing, is now a rapidly growing field. Its fundamental tenet is that it can establish the roots or stems of words and enable comparison with the original term. Arabic is a highly inflected and derivational language with a strong structure. Because of the non-concatenative nature of Arabic morphology, each root or stem can take a large number of affixes, increasing the number of possible inflected words that can be created. Accurate verb recognition and extraction are necessary for nearly all problems in well-known research areas, including Web Search, Information Retrieval, Machine Translation, Question Answering, and so forth. In this work we have designed and implemented an algorithm to detect and recognize Arabic verbs in Arabic text. The suggested technique was implemented with Python and the PyQt5 visual package, allowing quick modification and easy addition of new patterns. We employed 17 alternative patterns to represent all verbs in terms of singular, plural, masculine, and feminine pronouns as well as past, present, and imperative verb tenses. All verbs matching these patterns were detected when the verb had a root, and the outcomes were reliable. The approach is able to recognize all verbs with the same structure without requiring any alterations to the code or design. The verbs that are not recognized by our method have no antecedents among the Arabic roots. According to our work, the strategy can rapidly and precisely identify verbs with roots, but it cannot identify verbs that are not of Arabic origin. We therefore advise employing a hybrid approach that combines multiple principles.
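
The root-and-pattern matching described above lends itself to a compact implementation. Below is a minimal Python sketch of the idea, not the authors' code: it encodes verb patterns as templates with numbered root-consonant slots, compiles them to regular expressions, and accepts a word as a verb only when the extracted consonants form a known root. The two templates and the tiny root lexicon are illustrative stand-ins for the paper's 17 patterns and a full root dictionary.

```python
import re

# Verb patterns as templates: digits 1-3 mark the positions of the three
# root consonants; other characters are fixed pattern letters.
# Two illustrative templates stand in for the paper's 17 patterns.
PATTERNS = [
    "123",    # bare perfect stem, e.g. كتب "he wrote"
    "ي123",   # imperfect prefix ya-, e.g. يكتب "he writes"
]

ROOTS = {"كتب", "درس"}  # tiny illustrative root lexicon

def pattern_to_regex(pattern: str) -> re.Pattern:
    """Compile a template into a regex with a capture group per root slot."""
    parts = []
    for ch in pattern:
        parts.append("(.)" if ch in "123" else re.escape(ch))
    return re.compile("^" + "".join(parts) + "$")

COMPILED = [pattern_to_regex(p) for p in PATTERNS]

def detect_verb(word: str):
    """Return the extracted root if the word matches a verb pattern and the
    extracted consonants form a known root; otherwise None (not a verb)."""
    for rx in COMPILED:
        m = rx.match(word)
        if m and "".join(m.groups()) in ROOTS:
            return "".join(m.groups())
    return None

print(detect_verb("كتب"))   # -> كتب  (matched without removing any affixes)
print(detect_verb("يدرس"))  # -> درس
print(detect_verb("قلم"))   # -> None (extracted consonants not a known root)
```

Because matching is done against whole-word templates, no affix-stripping pass is needed; adding a new pattern is just adding one template string, which matches the paper's claim of easy extensibility.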

Conversion of Large RDF Data using Hash-based ID Mapping Tables with MapReduce Jobs (맵리듀스 잡을 사용한 해시 ID 매핑 테이블 기반 대량 RDF 데이터 변환 방법)

  • Kim, InA;Lee, Kyu-Chul
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2021.10a / pp.236-239 / 2021
  • With the growth of AI technology, the scale of Knowledge Graphs continues to expand. Knowledge Graphs are mainly expressed as RDF representations consisting of connected triples. Many RDF stores compress RDF triples into condensed IDs. However, transforming a large volume of RDF triples incurs high processing time and memory overhead because a large ID mapping table must be searched. In this paper, we propose a method for converting RDF triples using hash-based ID mapping tables with MapReduce, a software framework for parallel, distributed processing. Our proposed method not only transforms RDF triples into integer-based IDs, but also improves the conversion speed and memory overhead. In our experiment with the proposed method on LUBM, the size of the dataset was reduced by about 3.8 times and the conversion took about 106 seconds.
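
The core of the hash-based scheme is that an ID can be computed from a term itself, so no shared mapping table has to be consulted during conversion. A minimal single-process Python sketch of that idea follows; the hash choice and function names are illustrative, and the distributed MapReduce plumbing is omitted.

```python
import hashlib

def term_to_id(term: str) -> int:
    """Map an RDF term to a 64-bit integer ID by hashing the term itself.

    Because the ID is a pure function of the term, every mapper can assign
    IDs independently -- no shared ID mapping table has to be searched."""
    digest = hashlib.md5(term.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")

def encode_triple(s: str, p: str, o: str) -> tuple:
    """Convert one (subject, predicate, object) triple to integer IDs."""
    return (term_to_id(s), term_to_id(p), term_to_id(o))

triple = ("http://example.org/alice",
          "http://xmlns.com/foaf/0.1/knows",
          "http://example.org/bob")
print(encode_triple(*triple))

# An (ID -> term) dictionary is kept as a side output so the compact
# integer triples can be decoded back to RDF terms later.
id_to_term = {term_to_id(t): t for t in triple}
```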


The evaluation of Spectral Vegetation Indices for Classification of Nutritional Deficiency in Rice Using Machine Learning Method

  • Jaekyeong Baek;Wan-Gyu Sang;Dongwon Kwon;Sungyul Chanag;Hyeojin Bak;Ho-young Ban;Jung-Il Cho
    • Proceedings of the Korean Society of Crop Science Conference / 2022.10a / pp.88-88 / 2022
  • Detection of stress responses in crops is important for diagnosing crop growth and evaluating yield. Multispectral sensors are known to be effective for evaluating stress caused by nutrient and moisture deficits or by biological agents such as weeds and diseases. In this experiment, multispectral images were therefore taken by an unmanned aerial vehicle (UAV) under field conditions. The experiment was conducted in the long-term fertilizer field of the National Institute of Crop Science, and the experimental area was divided into plots of different NPK status (Control, N-deficiency, P-deficiency, K-deficiency, Non-fertilizer). A total of 11 vegetation indices were derived from RGB and NIR reflectance values using Python. Variations in plant nutrient content affect the amount of light reflected or absorbed in each wavelength band. The objective of this experiment was therefore to evaluate vegetation indices derived from multispectral reflectance data as input to a machine learning algorithm for the classification of nutritional deficiency in rice. A Random Forest model was used as a representative ensemble model, and its parameters were adjusted through hyperparameter tuning with RandomizedSearchCV. As a result, training accuracy was 0.95 and test accuracy was 0.80, and IPCA, NDRE, and EVI were among the top three indices by feature importance. Precision, recall, and F1-score, which are indicators for evaluating the performance of the classification model, ranged from 0.7 to 0.9 for each class.
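
As a rough illustration of the modeling step, the following Python sketch trains a Random Forest with RandomizedSearchCV-based hyperparameter tuning on a synthetic table of 11 vegetation-index features; the data, parameter grid, and class labels are placeholders, not the study's values.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV, train_test_split

rng = np.random.default_rng(0)
X = rng.random((200, 11))          # 11 vegetation-index features per sample
y = rng.integers(0, 5, size=200)   # 5 classes: Control, N-, P-, K-def., Non-fert.

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

# randomized hyperparameter search over an illustrative grid
search = RandomizedSearchCV(
    RandomForestClassifier(random_state=0),
    param_distributions={
        "n_estimators": [100, 300, 500],
        "max_depth": [None, 5, 10, 20],
        "min_samples_leaf": [1, 2, 4],
    },
    n_iter=10, cv=5, random_state=0,
)
search.fit(X_tr, y_tr)
print("test accuracy:", search.score(X_te, y_te))

# Feature importances of the best model rank the indices, which is how
# indices such as IPCA, NDRE, and EVI can be identified as most informative.
print(search.best_estimator_.feature_importances_)
```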


Improving Classification Accuracy in Hierarchical Trees via Greedy Node Expansion

  • Byungjin Lim;Jong Wook Kim
    • Journal of the Korea Society of Computer and Information / v.29 no.6 / pp.113-120 / 2024
  • With the advancement of information and communication technology, we can easily generate various forms of data in our daily lives. To efficiently manage such a large amount of data, systematic classification into categories is essential. For effective search and navigation, data is organized into a tree-like hierarchical structure known as a category tree, which is commonly seen in news websites and Wikipedia. As a result, various techniques have been proposed to classify large volumes of documents into the terminal nodes of category trees. However, document classification methods using category trees face a problem: as the height of the tree increases, the number of terminal nodes multiplies exponentially, which increases the probability of misclassification and ultimately leads to a reduction in classification accuracy. Therefore, in this paper, we propose a new node expansion-based classification algorithm that satisfies the classification accuracy required by the application, while enabling detailed categorization. The proposed method uses a greedy approach to prioritize the expansion of nodes with high classification accuracy, thereby maximizing the overall classification accuracy of the category tree. Experimental results on real data show that the proposed technique provides improved performance over naive methods.
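
The greedy strategy can be sketched with a priority queue over frontier nodes, expanding the most accurate node first and keeping an expansion only while the accuracy constraint holds. The Python sketch below is illustrative, not the paper's algorithm: it uses a toy tree, assumes per-node accuracy estimates are given, and approximates the accuracy of a terminal set by its minimum node accuracy.

```python
import heapq

children = {                 # category tree as an adjacency map (toy data)
    "root": ["news", "science"],
    "news": ["politics", "sports"],
    "science": ["physics", "biology"],
}
accuracy = {                 # assumed accuracy if classification stops here
    "root": 1.00, "news": 0.95, "science": 0.90,
    "politics": 0.85, "sports": 0.88, "physics": 0.70, "biology": 0.72,
}

def greedy_expand(min_accuracy: float):
    """Expand the most accurate frontier node first, keeping only
    expansions that leave the terminal set above min_accuracy."""
    frontier = ["root"]                     # current terminal nodes
    heap = [(-accuracy["root"], "root")]    # max-heap by node accuracy
    while heap:
        _, node = heapq.heappop(heap)
        kids = children.get(node, [])
        if not kids:
            continue
        # tentatively replace node by its children; keep if accuracy holds
        candidate = [n for n in frontier if n != node] + kids
        if min(accuracy[n] for n in candidate) >= min_accuracy:
            frontier = candidate
            for k in kids:
                heapq.heappush(heap, (-accuracy[k], k))
    return frontier

print(greedy_expand(0.80))   # finer categories wherever accuracy permits
```

With the threshold 0.80, "news" is refined into "politics" and "sports" but "science" stays unexpanded, since its children would drop the accuracy below the requirement; this mirrors the trade-off between detailed categorization and the application's accuracy constraint.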

Audio Fingerprint Extraction Method Using Multi-Level Quantization Scheme (다중 레벨 양자화 기법을 적용한 오디오 핑거프린트 추출 방법)

  • Song Won-Sik;Park Man-Soo;Kim Hoi-Rin
    • The Journal of the Acoustical Society of Korea / v.25 no.4 / pp.151-158 / 2006
  • In this paper, we propose a new audio fingerprint extraction method, based on Philips' music retrieval algorithm, which uses the energy difference between neighboring filter banks and the probabilistic characteristics of music. Since the Philips method uses too many filter banks in a limited frequency band, it may cause audio fingerprints to be highly sensitive to additive noise and to have too high a correlation between neighboring bands. The proposed method improves robustness to noise by reducing the number of filter banks, while maintaining discriminative power by representing the energy difference of bands with 2 bits, where the quantization levels are determined by probabilistic characteristics. The correlation that exists among the 4 levels in 2 bits is utilized not only in similarity measurement but also in efficient reduction of the search area. Experiments show that the proposed method is not only more robust to various environmental noises (street, department store, car, office, and restaurant), but also takes less time for database search than the Philips method when the music is highly degraded.
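
The 2-bit representation described above can be sketched as quantizing neighboring-band energy differences into four levels. The Python snippet below is a toy illustration under assumed thresholds, not the paper's fingerprint: in the actual method the quantization levels are chosen from the probabilistic characteristics of music.

```python
import numpy as np

def fingerprint(frame_energies: np.ndarray,
                thresholds=(-0.1, 0.0, 0.1)) -> np.ndarray:
    """frame_energies: (n_frames, n_bands) filter-bank energies.
    Returns (n_frames, n_bands - 1) symbols in {0, 1, 2, 3}, i.e. 2 bits."""
    # energy difference between neighboring bands, as in the Philips scheme
    diff = np.diff(frame_energies, axis=1)
    # quantize each difference into one of four levels; assumed thresholds
    # stand in for levels derived from the statistics of music
    return np.digitize(diff, thresholds)

energies = np.abs(np.random.randn(5, 9))  # 5 frames, 9 bands of toy data
print(fingerprint(energies))              # one 2-bit symbol per band pair
```

Because adjacent levels of the 2-bit symbol encode how far apart two band energies are, a near-miss (level 1 vs. level 2) can be scored as closer than a far miss (level 0 vs. level 3), which is the correlation the paper exploits for similarity measurement and search-space pruning.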

The new frontier: utilizing ChatGPT to expand craniofacial research

  • Andi Zhang;Ethan Dimock;Rohun Gupta;Kevin Chen
    • Archives of Craniofacial Surgery / v.25 no.3 / pp.116-122 / 2024
  • Background: Due to the importance of evidence-based research in plastic surgery, the authors of this study aimed to assess the accuracy of ChatGPT in generating novel systematic review ideas within the field of craniofacial surgery. Methods: ChatGPT was prompted to generate 20 novel systematic review ideas for 10 different subcategories within the field of craniofacial surgery. For each topic, the chatbot was told to give 10 "general" and 10 "specific" ideas related to the concept. In order to determine the accuracy of ChatGPT, a literature review was conducted using PubMed, CINAHL, Embase, and Cochrane. Results: In total, 200 systematic review research ideas were generated by ChatGPT. We found that the algorithm had an overall 57.5% accuracy at identifying novel systematic review ideas. ChatGPT was found to be 39% accurate for general topics and 76% accurate for specific topics. Conclusion: Craniofacial surgeons should use ChatGPT as a tool. We found that ChatGPT provided more precise answers with specific research questions than with general questions and helped narrow down the search scope, leading to more relevant and accurate responses. Beyond research purposes, ChatGPT can augment patient consultations, improve healthcare equity, and assist in clinical decision-making. With rapid advancements in artificial intelligence (AI), it is important for plastic surgeons to consider using AI in their clinical practice to improve patient-centered outcomes.

Sixteen years progress in recanalization of chronic carotid artery occlusion: A comprehensive review

  • Stanishevskiy Artem;Babichev Konstantin;Savello Alexander;Gizatullin Shamil;Svistov Dmitriy;Davydov Denis
    • Journal of Cerebrovascular and Endovascular Neurosurgery / v.25 no.1 / pp.1-12 / 2023
  • Objective: Although chronic carotid artery occlusion appears to be associated with a significant risk of ischemic stroke, revascularization techniques are neither well established nor widespread. In contrast, extracranial-intracranial bypass is common despite the lack of evidence regarding neurological improvement or prevention of ischemic events. The aim of the current review is to evaluate the effectiveness of various methods of recanalization of chronic carotid artery occlusion. Methods: A comprehensive literature search through the PubMed, Scopus, Cochrane, and Web of Science databases was performed. Various parameters were assessed among patients who underwent surgical, endovascular, and hybrid recanalization for chronic carotid artery occlusion. Results: 40 publications from 2005 to 2021, covering a total of more than 1300 cases of revascularization of chronic carotid artery occlusion, were reviewed. The following parameters were assessed among patients who underwent surgical, endovascular, and hybrid recanalization for chronic carotid artery occlusion: mean age, male-to-female ratio, mean duration of occlusion before treatment, rate of successful recanalization, frequency of restenosis and reocclusion, prevalence of postoperative ischemic stroke, improvement of neurological or other symptoms, and complications. Based on the indications for revascularization and the predictive factors of the various recanalizing procedures proposed in the reviewed literature, an algorithm for clinical decision making has been formulated. Conclusions: Although treatment of chronic carotid artery occlusion remains challenging, the current literature suggests revascularization as the single option for verified neurological improvement and prevention of ischemic events. Surgical and endovascular procedures should be taken into account when treating patients with symptomatic chronic carotid artery occlusion.

Improved Sentence Boundary Detection Method for Web Documents (웹 문서를 위한 개선된 문장경계인식 방법)

  • Lee, Chung-Hee;Jang, Myung-Gil;Seo, Young-Hoon
    • Journal of KIISE: Software and Applications / v.37 no.6 / pp.455-463 / 2010
  • In this paper, we present an approach to sentence boundary detection for web documents that builds on statistical methods and uses rule-based correction. The proposed system uses a classification model learned offline from a training set of human-labeled web documents. Web documents contain many word-spacing errors and frequently lack the punctuation marks that indicate sentence boundaries. As sentence boundary candidates, the proposed method therefore considers every ending eomi (Korean sentence-final ending) as well as punctuation marks. We optimize engine performance by selecting the best features, the best training data, and the best classification algorithm. For evaluation, we made two test sets: Set1, consisting of articles and blog documents, and Set2, consisting of web community documents, and we use F-measure to compare results. Detecting only periods as sentence boundaries, our baseline engine scored 96.5% on Set1 and 56.7% on Set2. We improved the baseline engine by adapting the features and the boundary search algorithm. For the final evaluation, we compared the adapted engine with the baseline engine on Set2. As a result, the adapted engine improved on the baseline by 39.6%, demonstrating the effectiveness of the proposed method for sentence boundary detection.
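
A candidate-plus-classifier pipeline of the kind described can be sketched briefly: generate boundary candidates at punctuation marks and ending eomis, featurize the local context, and score each candidate with a trained classifier. The Python sketch below is illustrative only; the eomi list, features, training data, and the choice of logistic regression are placeholders for the paper's engine.

```python
from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LogisticRegression

ENDING_EOMIS = ("다", "요", "죠")   # tiny stand-in for a real eomi list

def candidates(text):
    """Offsets that might end a sentence: punctuation marks and ending
    eomis (web text often drops punctuation, so eomis are needed too)."""
    for i, ch in enumerate(text):
        if ch in ".!?" or ch in ENDING_EOMIS:
            yield i

def features(text, i):
    """Local context features around a candidate boundary."""
    nxt = text[i + 1] if i + 1 < len(text) else "<END>"
    return {"char": text[i], "next": nxt, "next_is_space": nxt.isspace()}

# toy labeled data: (text, candidate offset, 1 = real boundary)
train = [("좋아요 정말", 2, 1), ("먹었다 그리고", 2, 1), ("요즘 날씨", 0, 0)]
vec = DictVectorizer()
X = vec.fit_transform([features(t, i) for t, i, _ in train])
clf = LogisticRegression().fit(X, [lbl for _, _, lbl in train])

text = "오늘 만났다 기분이 좋아요"
for i in candidates(text):
    p = clf.predict_proba(vec.transform([features(text, i)]))[0, 1]
    print(i, text[i], f"boundary prob = {p:.2f}")
```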

Task Balancing Scheme of MPI Gridding for Large-scale LiDAR Data Interpolation (대용량 LiDAR 데이터 보간을 위한 MPI 격자처리 과정의 작업량 발란싱 기법)

  • Kim, Seon-Young;Lee, Hee-Zin;Park, Seung-Kyu;Oh, Sang-Yoon
    • Journal of the Korea Society of Computer and Information / v.19 no.9 / pp.1-10 / 2014
  • In this paper, we propose an MPI gridding algorithm for LiDAR data that minimizes communication between cores. LiDAR data collected from aircraft is 3D spatial information used in various applications. Since the LiDAR data often has a higher resolution than actually required, or includes non-surface information, filtering the raw LiDAR data is required. To use the filtered data, interpolation based on a data structure for searching adjacent locations is conducted to reconstruct the data. Since the processing time of LiDAR data is directly proportional to its size, there have been many studies on high-performance parallel processing systems using MPI. However, previously proposed parallel methods suffer possible performance degradation from imbalanced data sizes among cores or from the communication overhead of resolving boundary inconsistencies. We conducted empirical experiments to verify the effectiveness of our proposed algorithm. The results show that the total execution time of the proposed method was up to 4.2 times lower than that of the conventional method on heterogeneous clusters.
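
The balancing idea, giving every core an equal share of points rather than an equal spatial extent, can be sketched with mpi4py. The snippet below is a toy illustration, not the paper's system: it sorts points along one axis, scatters equal-sized chunks to the ranks, and grids each chunk locally; the boundary-reconciliation step is omitted.

```python
# run with e.g.: mpiexec -n 4 python grid_points.py
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

if rank == 0:
    pts = np.random.rand(1_000_000, 3)         # toy LiDAR points (x, y, z)
    order = np.argsort(pts[:, 0])              # sort by x: chunks stay compact
    chunks = np.array_split(pts[order], size)  # equal point counts per rank
else:
    chunks = None

local = comm.scatter(chunks, root=0)           # each rank gets a balanced slab

# local gridding: accumulate z sums and counts per cell of a 100x100 grid
x0, x1 = local[:, 0].min(), local[:, 0].max()
xi = np.clip(((local[:, 0] - x0) / (x1 - x0 + 1e-12) * 100).astype(int), 0, 99)
yi = np.clip((local[:, 1] * 100).astype(int), 0, 99)
z_sum = np.zeros((100, 100))
count = np.zeros((100, 100))
np.add.at(z_sum, (xi, yi), local[:, 2])
np.add.at(count, (xi, yi), 1)

print(f"rank {rank}: gridded {len(local)} points")
```

Splitting on sorted coordinates keeps each rank's points spatially contiguous, so neighbor searches during interpolation rarely cross partition boundaries, which is where the communication savings come from.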

Automatic Extraction of Eye and Mouth Fields from Face Images using MultiLayer Perceptrons and Eigenfeatures (고유특징과 다층 신경망을 이용한 얼굴 영상에서의 눈과 입 영역 자동 추출)

  • Ryu, Yeon-Sik;O, Se-Yeong
    • Journal of the Institute of Electronics Engineers of Korea CI / v.37 no.2 / pp.31-43 / 2000
  • This paper presents a novel algorithm for extraction of the eye and mouth fields (facial features) from 2D gray-level face images. First, it has been found that eigenfeatures, derived from the eigenvalues and eigenvectors of the binary edge data set constructed from the eye and mouth fields, are very good features for locating these fields. The eigenfeatures, extracted from positive and negative training samples of the facial features, are used to train a multilayer perceptron (MLP) whose output indicates the degree to which a particular image window contains the eye or the mouth. Second, to ensure robustness, an ensemble network consisting of multiple MLPs is used instead of a single MLP. The output of the ensemble network is the average of the multiple locations of the field found by the constituent MLPs. Finally, in order to reduce computation time, we extract a coarse search region for the eyes and mouth using prior information about face images. The advantages of the proposed approach include that only a small number of frontal faces is sufficient to train the networks and, furthermore, that the networks generalize well to non-frontal poses and even to other people's faces. It was also experimentally verified that the proposed algorithm is robust against slight variations of facial size and pose, owing to the generalization characteristics of neural networks.
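
The pipeline described above, eigenfeatures from edge windows feeding an ensemble of MLPs whose located positions are averaged, can be sketched in Python with scikit-learn as a stand-in. Everything below (synthetic data, window size, number of components and MLPs) is illustrative, not the paper's configuration.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
win = 16                                       # window side, in pixels

# toy training set: flattened binary edge windows, 1 = contains the feature
X_train = rng.integers(0, 2, size=(300, win * win)).astype(float)
y_train = rng.integers(0, 2, size=300)

pca = PCA(n_components=20).fit(X_train)        # eigenfeatures of edge windows
Z = pca.transform(X_train)

# ensemble of small MLPs trained on the same eigenfeatures
ensemble = [MLPClassifier(hidden_layer_sizes=(16,), max_iter=500,
                          random_state=s).fit(Z, y_train) for s in range(3)]

def locate(image, stride=4):
    """Slide a window over a binary edge image; each MLP picks the window
    it scores highest, and the ensemble output is the average location."""
    best = []
    for mlp in ensemble:
        scores = {}
        for r in range(0, image.shape[0] - win, stride):
            for c in range(0, image.shape[1] - win, stride):
                z = pca.transform(image[r:r+win, c:c+win].reshape(1, -1))
                scores[(r, c)] = mlp.predict_proba(z)[0, 1]
        best.append(max(scores, key=scores.get))
    return tuple(np.mean(best, axis=0))        # averaged (row, col) location

edge_img = rng.integers(0, 2, size=(64, 64)).astype(float)
print(locate(edge_img))
```

In practice the sliding search would be confined to the coarse eye/mouth regions derived from prior knowledge of face layout, which is what keeps the computation time manageable.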
