• Title/Summary/Keyword: bag of words

Evaluating AI Techniques for Blind Students Using Voice-Activated Personal Assistants

  • Almurayziq, Tariq S;Alshammari, Gharbi Khamis;Alshammari, Abdullah;Alsaffar, Mohammad;Aljaloud, Saud
    • International Journal of Computer Science & Network Security
    • /
    • v.22 no.1
    • /
    • pp.61-68
    • /
    • 2022
  • This study develops an AI-based model to support the academic registration needs of blind students, enabling them to submit academic service requests and tasks with ease. The design builds on findings from previous studies: functionality gaps that blind students identified in the literature were used when the system was devised. Primary simulation data were composed from several thousand cases, so the model rests on archival insight. Because the model is theoretical, it was partially applied, by incorporating it into the portal that institutions currently use, to determine how efficient and effective the associated AI tools are in real-world settings. In this paper, we argue that a voice-activated personal assistant (VAPA), text mining, bag of words, and case-based reasoning (CBR) perform better together than other classifiers for analyzing and classifying the text of academic requests submitted through the VAPA.

INSTABILITY OF THE BETTI SEQUENCE FOR PERSISTENT HOMOLOGY AND A STABILIZED VERSION OF THE BETTI SEQUENCE

  • JOHNSON, MEGAN;JUNG, JAE-HUN
    • Journal of the Korean Society for Industrial and Applied Mathematics
    • /
    • v.25 no.4
    • /
    • pp.296-311
    • /
    • 2021
  • Topological Data Analysis (TDA), a relatively new field of data analysis, has proved very useful in a variety of applications. The main tool from TDA is persistent homology, in which data structure is examined at many scales. Representations of persistent homology include persistence barcodes and persistence diagrams, neither of which is straightforward to reconcile with traditional machine learning algorithms, as they are sets of intervals or multisets. The problem of faithfully representing barcodes and persistence diagrams has been pursued along two main avenues: kernel methods and vectorizations. One vectorization is the Betti sequence, or Betti curve, derived from the persistence barcode. While the Betti sequence has been used in classification problems in various applications, to our knowledge the stability of the sequence has never before been discussed. In this paper we show that the Betti sequence is unstable under the 1-Wasserstein metric with respect to small perturbations in the barcode from which it is calculated. In addition, we propose a novel stabilized version of the Betti sequence based on the Gaussian smoothing seen in the Stable Persistence Bag of Words for persistent homology. We then introduce the normalized cumulative Betti sequence and provide numerical examples that support the main statement of the paper.
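
The sensitivity described in this abstract can be illustrated with a small sketch. It is not the paper's construction or proof: the function names, the toy barcode, and the integer sampling grid are all assumptions chosen for illustration. The Betti sequence is taken here as the number of intervals [birth, death) alive at each grid value.

```python
def betti_curve(barcode, grid):
    """Count the intervals [birth, death) that contain each grid value."""
    return [sum(1 for birth, death in barcode if birth <= t < death) for t in grid]


def l1_distance(u, v):
    """Sum of absolute coordinate differences between two sequences."""
    return sum(abs(a - b) for a, b in zip(u, v))


# Hypothetical toy data: two bars, sampled at integer scales.
grid = [0, 1, 2, 3, 4, 5]
barcode = [(0, 3), (1, 5)]
# Move a single death time by a tiny amount (1e-9).
perturbed = [(0, 3 + 1e-9), (1, 5)]

original_curve = betti_curve(barcode, grid)
perturbed_curve = betti_curve(perturbed, grid)
print(original_curve, perturbed_curve, l1_distance(original_curve, perturbed_curve))
```

In this toy case the sampled sequence changes by a full unit at the grid value 3 even though the bar endpoint moved by only 1e-9, because the count is a step function of the endpoints. A smoothed count would change gradually instead, which is the general idea behind a stabilized version.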

Some Management Practices Affecting Outcrossing and Seed Production in Burley Tobacco (Nicotiana tabacum L.) (연초 버어리종의 자연교잡율과 종자생산에 관련된 몇가지 요인)

  • 정석훈;최상주;조천준;김대송;조명조;이승철
    • Journal of the Korean Society of Tobacco Science
    • /
    • v.18 no.2
    • /
    • pp.126-131
    • /
    • 1996
  • This study investigated the effects of isolation distance, transplanting time of maternal plants, and bagging of the flower head with a gauze-cloth bag on outcrossing in burley tobacco (Nicotiana tabacum L.). The effects of fertilizer level and of controlling the number of capsules per plant on seed production and quality were also examined. A male-sterile line produced 0.3 to 3.8 capsules per plant when planted with normally flowering tobacco, with an average of 7.2 outcrossed plants (range 2 to 18) out of 20. The farther the isolation distance between the maternal plants and the pollen donor plants, the lower the outcrossing, although outcrossing occurred even at an isolation distance of 312 m. Transplanting the maternal plants 35 days after the pollen donor plants did not significantly decrease the number of outcrossed plants. Bagging the flower head with the gauze-cloth bag (#0.9∼1.0 mm) significantly decreased the number of outcrossed plants but could not prevent outcrossing completely. Seed yield per plant was higher under high fertilization. The number of seed capsules per plant significantly affected seed yield and quality: when the seed capsules were limited to 30 or 50 per plant, the 1,000-seed weight and germination rate were higher than with 70 or 90 capsules per plant. Key words: Nicotiana tabacum, outcrossing, bagging.

How Different are Vowel Epentheses in Learner Speech and Loanword Phonology?

  • Park, Mi-Sun;Kim, Jong-Mi
    • Speech Sciences
    • /
    • v.15 no.2
    • /
    • pp.33-51
    • /
    • 2008
  • The difference between learner speech and loanword phonology is investigated in terms of Korean learners' speech and their loanword adaptation of English words with a post-vocalic word-final stop. When we compared the speech of 12 Korean learners at the mid-intermediate level with that of eight English speakers, the learner speech did not reflect the loanword phonology of vowel insertion after a voiced word-final stop (e.g., rib[ɨ], bad[ɨ], gag[ɨ] vs. tip[=], cat[=], book[=]) but, instead, the target phonology of vowel lengthening before a voiced word-final stop (e.g., rib[rɪ:b], CAD[kæ:d], bag[bæ:g] vs. rip[rɪp], cat[kæt], back[bæk]). A longitudinal study of learner speech before and after instruction showed some development toward the acquisition of target phonology. The results indicate that learner speech departs from loanword phonology and approaches target speech at a rate faster than direct proportion. Thus, native phonology predicts loanword phonology but lends little support to learner speech. Our results also indicate that loanword phonology is constant, while learner speech changes toward the acquisition of target phonology.

Document Clustering Using Semantic Features and Fuzzy Relations

  • Kim, Chul-Won;Park, Sun
    • Journal of information and communication convergence engineering
    • /
    • v.11 no.3
    • /
    • pp.179-184
    • /
    • 2013
  • Traditional clustering methods are usually based on the bag-of-words (BOW) model. A disadvantage of the BOW model is that it ignores the semantic relationship among terms in the data set. To resolve this problem, ontology or matrix factorization approaches are usually used. However, a major problem of the ontology approach is that it is usually difficult to find a comprehensive ontology that can cover all the concepts mentioned in a collection. This paper proposes a new document clustering method using semantic features and fuzzy relations for solving the problems of ontology and matrix factorization approaches. The proposed method can improve the quality of document clustering because the clustered documents use fuzzy relation values between semantic features and terms to distinguish clearly among dissimilar documents in clusters. The selected cluster label terms can represent the inherent structure of a document set better by using semantic features based on non-negative matrix factorization, which is used in document clustering. The experimental results demonstrate that the proposed method achieves better performance than other document clustering methods.
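
The limitation of the BOW model mentioned in this abstract can be seen in a minimal sketch. The whitespace tokenizer, the example sentences, and the helper names are assumptions for illustration, not part of the paper's method.

```python
from collections import Counter


def bag_of_words(text):
    """Lowercase, split on whitespace, and count tokens; word order is discarded."""
    return Counter(text.lower().split())


def shared_terms(x, y):
    """Number of token occurrences the two bags have in common."""
    return sum((x & y).values())


a = bag_of_words("the dog bit the man")
b = bag_of_words("the man bit the dog")         # opposite meaning, same words
c = bag_of_words("a canine attacked a person")  # similar meaning, different words

print(a == b)              # identical bags despite different meaning
print(shared_terms(a, c))  # no overlap despite related meaning
```

Two sentences with opposite meanings receive identical representations, while a paraphrase built from synonyms shares no terms at all. This is the gap that ontology or matrix factorization approaches try to close.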

Map Alignment Method in Monocular SLAM based on Point-Line Feature (특징점과 특징선을 활용한 단안 카메라 SLAM에서의 지도 병합 방법)

  • Back, Mu Hyun;Lee, Jin Kyu;Moon, Ji Won;Hwang, Sung Soo
    • Journal of Korea Multimedia Society
    • /
    • v.23 no.2
    • /
    • pp.127-134
    • /
    • 2020
  • In this paper, we propose a map alignment method for maps generated by point-line monocular SLAM. In the proposed method, feature lines as well as feature points extracted from multiple maps are fused into a single map. To this end, the proposed method first searches for similar areas between maps via Bag-of-Words-based image matching. Thereafter, it calculates the similarity transformation between the maps in the corresponding areas to align the maps. Finally, we merge the overlapping information of multiple maps into a single map by removing duplicate information from similar areas. Experimental results show that maps created by different users are combined into a single map, and that the accuracy of the fused map is similar to that of a map generated by a single user. We expect that the proposed method can be utilized for fast imagery map generation.
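
Bag-of-Words-based image matching, as named in this abstract, can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the authors' pipeline: the two-dimensional descriptors, the three-word vocabulary, and all function names are hypothetical, and practical SLAM systems typically use binary descriptors with a much larger, tree-structured vocabulary.

```python
from math import sqrt


def nearest_word(descriptor, vocabulary):
    """Index of the vocabulary entry closest to the descriptor (squared Euclidean)."""
    return min(
        range(len(vocabulary)),
        key=lambda i: sum((d - w) ** 2 for d, w in zip(descriptor, vocabulary[i])),
    )


def bow_histogram(descriptors, vocabulary):
    """Count how many descriptors fall on each visual word."""
    hist = [0] * len(vocabulary)
    for d in descriptors:
        hist[nearest_word(d, vocabulary)] += 1
    return hist


def cosine(u, v):
    """Cosine similarity between two histograms (0.0 if either is empty)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Hypothetical toy vocabulary and per-image descriptors.
vocabulary = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
image_a = [(0.1, 0.0), (0.9, 0.1), (1.1, -0.1)]
image_b = [(0.0, 0.1), (1.0, 0.2), (0.8, 0.0)]
image_c = [(0.1, 0.9), (0.0, 1.2), (-0.1, 1.0)]

hist_a = bow_histogram(image_a, vocabulary)
hist_b = bow_histogram(image_b, vocabulary)
hist_c = bow_histogram(image_c, vocabulary)
print(cosine(hist_a, hist_b), cosine(hist_a, hist_c))
```

Images whose descriptors quantize to the same visual words score close to 1, which is how candidate overlapping areas between maps could be shortlisted before a geometric alignment step.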

Text Summarization on Large-scale Vietnamese Datasets

  • Ti-Hon, Nguyen;Thanh-Nghi, Do
    • Journal of information and communication convergence engineering
    • /
    • v.20 no.4
    • /
    • pp.309-316
    • /
    • 2022
  • This investigation addresses automatic text summarization on large-scale Vietnamese datasets. Vietnamese articles were collected from newspaper websites, and plain text was extracted to build a dataset of 1,101,101 documents. Next, a new single-document extractive text summarization model was proposed and evaluated on this dataset. In this model, the k-means algorithm is used to cluster the sentences of the input document using different text representations, such as BoW (bag-of-words), TF-IDF (term frequency - inverse document frequency), Word2Vec (Word-to-vector), GloVe, and FastText. The summary algorithm then uses the trained k-means model to rank the candidate sentences and create a summary from the highest-ranked sentences. The model achieved F1-scores of 51.91% ROUGE-1, 18.77% ROUGE-2, and 29.72% ROUGE-L, compared with 52.33% ROUGE-1, 16.17% ROUGE-2, and 33.09% ROUGE-L for a competitive abstractive model. The advantage of the proposed model is that it can perform well with O(n, k, p) = O(n(k + 2/p)) + O(n log₂ n) + O(np) + O(nk²) + O(k) time complexity.
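
For readers unfamiliar with the ROUGE-1 figures quoted in this abstract, the following is a minimal sketch of a ROUGE-1 F1-score as clipped unigram overlap. The whitespace tokenizer and the function name are assumptions; this is not the authors' evaluation script, and standard ROUGE tooling may apply further preprocessing.

```python
from collections import Counter


def rouge1_f1(candidate, reference):
    """F1 of clipped unigram overlap between a candidate and a reference summary."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((cand & ref).values())  # each token counted at most min(cand, ref) times
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


score = rouge1_f1("the cat sat", "the cat sat on the mat")
print(round(score, 4))
```

Here every candidate token appears in the reference (precision 1.0), but only half of the reference tokens are covered (recall 0.5), so the F1-score is 2/3.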

A Sentiment Analysis of Internet Movie Reviews Using String Kernels (문자열 커널을 이용한 인터넷 영화평의 감정 분석)

  • Kim, Sang-Do;Yoon, Hee-Geun;Park, Seong-Bae;Park, Se-Young;Lee, Sang-Jo
    • Annual Conference on Human and Language Technology
    • /
    • 2009.10a
    • /
    • pp.56-60
    • /
    • 2009
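  • Today the Internet has become a space where individuals can share their feelings and opinions. However, because the Internet contains such a vast number of documents, it is not easy to use other users' sentiment and opinion information in one's own decision making. Research on automatically extracting sentiments and opinions has recently been very active, and most existing studies on sentiment analysis rely on a sentiment lexicon that holds polarity information for phrases. However, because new words are coined on the Internet every day and nonstandard language use is common, lexicon-based methods have limitations. This paper treats sentiment analysis as a binary classification problem that distinguishes positive from negative. We use Support Vector Machines (SVM), which perform very well on binary classification problems, and, to compute the similarity between documents, a string kernel that compares the substrings of sentences. In experiments on real movie reviews, the proposed model showed more stable performance than the Bag of Words (BOW) model used as the baseline.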
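
The idea of comparing documents through shared substrings can be sketched with a p-spectrum kernel, one simple member of the string kernel family. The abstract does not say which string kernel the authors used, so the choice of kernel, the value p=2, and the function names below are assumptions for illustration only.

```python
from collections import Counter
from math import sqrt


def spectrum(text, p):
    """Count every contiguous substring of length p in the text."""
    return Counter(text[i:i + p] for i in range(len(text) - p + 1))


def spectrum_kernel(s, t, p=2):
    """Inner product of the two substring-count vectors."""
    fs, ft = spectrum(s, p), spectrum(t, p)
    return sum(count * ft[gram] for gram, count in fs.items())


def normalized_kernel(s, t, p=2):
    """Kernel value scaled so that a string compared with itself scores 1."""
    denom = sqrt(spectrum_kernel(s, s, p) * spectrum_kernel(t, t, p))
    return spectrum_kernel(s, t, p) / denom if denom else 0.0


# A nonstandard spelling still shares most of its substrings with the standard form.
similarity = normalized_kernel("goood movie", "good movie")
print(round(similarity, 3))
```

A word-level bag of words would treat "goood" and "good" as unrelated tokens, whereas the substring comparison still finds them highly similar. That is one plausible reason a string kernel may cope better with the coinages and nonstandard spellings the abstract mentions. A kernel function of this kind could be supplied to an SVM as a precomputed Gram matrix.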

Object Categorization Using PLSA Based on Weighting (특이점 가중치 기반 PLSA를 이용한 객체 범주화)

  • Song, Hyun-Chul;Whoang, In-Teck;Choi, Kwang-Nam
    • Journal of Internet Computing and Services
    • /
    • v.10 no.4
    • /
    • pp.45-54
    • /
    • 2009
  • In this paper we propose a new approach that recognizes similar categories by weighting distinctive features. The approach is based on PLSA, one of the effective methods for object categorization, which was introduced from text-domain information retrieval. PLSA, an unsupervised method, shows impressive performance in category recognition. However, it shows relatively low performance on similar categories whose features have analogous distributions. We therefore consider effective object categorization for similar categories by weighting the most distinctive features. We show that the proposed algorithm, weighted PLSA, recognizes similar categories, and that it gives better results than the standard PLSA.

Loop Closure in a Line-based SLAM (직선기반 SLAM에서의 루프결합)

  • Zhang, Guoxuan;Suh, Il-Hong
    • The Journal of Korea Robotics Society
    • /
    • v.7 no.2
    • /
    • pp.120-128
    • /
    • 2012
  • The loop closure problem is one of the most challenging issues in the vision-based simultaneous localization and mapping community. It requires the robot to recognize a previously visited place from current camera measurements. While loop closure in previous works often relies on a visual bag-of-words built from point features, in this paper we propose a line-based method to solve loop closure in corridor environments. We use both the floor line and the anchored vanishing point as loop closing features, and devise a two-step loop closure algorithm to detect a known place and perform global pose correction. We propose the anchored vanishing point as a novel loop closure feature, as it includes position information and represents the vanishing points in both directions. In our system, the accumulated heading error is first reduced using an observation of a previously registered anchored vanishing point, and the observation of known floor lines then allows for further pose correction. Experimental results show that our method is very efficient in a structured indoor environment as a suitable loop closure solution.