• Title/Summary/Keyword: Topic vector

Search Result 72, Processing Time 0.024 seconds

Fast Decision Method of Adaptive Motion Vector Resolution (적응적 움직임 벡터 해상도 고속 결정 기법)

  • Park, Sang-hyo
    • Journal of Broadcast Engineering
    • /
    • v.25 no.3
    • /
    • pp.305-312
    • /
    • 2020
  • As a demand for a new video coding standard having higher coding efficiency than the existing standards is growing, recently, MPEG and VCEG has been developing and standardizing the next-generation video coding project, named Versatile Video Coding (VVC). Many inter prediction techniques have been introduced to increase the coding efficiency, and among them, an adaptive motion vector resolution (AMVR) technique has contributed on increasing the efficiency of VVC. However, the best motion vector can only be determined by computing many rate-distortion costs, thereby increasing encoding complexity. It is necessary to reduce the complexity for real-time video broadcasting and streaming services, but it is yet an open research topic to reduce the complexity of AMVR. Therefore, in this paper, an efficient technique is proposed, which reduces the encoding complexity of AMVR. For that, the proposed method exploits a special VVC tree structure (i.e., multi-type tree structure) to accelerate the decision process of AMVR. Experiment results show that the proposed decision method reduces the encoding complexity of VVC test model by 10% with a negligible loss of coding efficiency.

Data Analysis of Dropouts of University Students Using Topic Modeling (토픽모델링을 활용한 대학생의 중도탈락 데이터 분석)

  • Jeong, Do-Heon;Park, Ju-Yeon
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.25 no.1
    • /
    • pp.88-95
    • /
    • 2021
  • This study aims to provide implications for establishing support policies for students by empirically analyzing data on university students dropouts. To this end, data of students enrolled in D University after 2017 were sampled and collected. The collected data was analyzed using topic modeling(LDA: Latent Dirichlet Allocation) technique, which is a probabilistic model based on text mining. As a result of the study, it was found that topics that were characteristic of dropout students were found, and the classification performance between groups through topics was also excellent. Based on these results, a specific educational support system was proposed to prevent dropout of university students. This study is meaningful in that it shows the use of text mining techniques in the education field and suggests an education policy based on data analysis.

Emotional Recognition System Using Eigenfaces (Eigenface를 이용한 인간의 감정인식 시스템)

  • Joo, Young-Hoon;Lee, Sang-Yun;Sim, Kwee-Bo
    • Journal of the Korean Institute of Intelligent Systems
    • /
    • v.13 no.2
    • /
    • pp.216-221
    • /
    • 2003
  • Emotions recognition is a topic on which little research has been done to date. This paper proposes a new method that can recognize the human s emotion from facial image by using eigenspace. To do so, first, we get the face image by using the skin color from the original color image acquired by CCD color camera. Second, we get the vector image which is projected the obtained face image into eigenspace. And then, we propose the method for finding out each person s identification and emotion from the weight of vector image. Finally, we show the practical application possibility of the proposed method through the experiment.

A Korean Sentence and Document Sentiment Classification System Using Sentiment Features (감정 자질을 이용한 한국어 문장 및 문서 감정 분류 시스템)

  • Hwang, Jaw-Won;Ko, Young-Joong
    • Journal of KIISE:Computing Practices and Letters
    • /
    • v.14 no.3
    • /
    • pp.336-340
    • /
    • 2008
  • Sentiment classification is a recent subdiscipline of text classification, which is concerned not with the topic but with opinion. In this paper, we present a Korean sentence and document classification system using effective sentiment features. Korean sentiment classification starts from constructing effective sentiment feature sets for positive and negative. The synonym information of a English word thesaurus is used to extract effective sentiment features and then the extracted English sentiment features are translated in Korean features by English-Korean dictionary. A sentence or a document is represented by using the extracted sentiment features and is classified and evaluated by SVM(Support Vector Machine).

Empirical Comparison of Word Similarity Measures Based on Co-Occurrence, Context, and a Vector Space Model

  • Kadowaki, Natsuki;Kishida, Kazuaki
    • Journal of Information Science Theory and Practice
    • /
    • v.8 no.2
    • /
    • pp.6-17
    • /
    • 2020
  • Word similarity is often measured to enhance system performance in the information retrieval field and other related areas. This paper reports on an experimental comparison of values for word similarity measures that were computed based on 50 intentionally selected words from a Reuters corpus. There were three targets, including (1) co-occurrence-based similarity measures (for which a co-occurrence frequency is counted as the number of documents or sentences), (2) context-based distributional similarity measures obtained from a latent Dirichlet allocation (LDA), nonnegative matrix factorization (NMF), and Word2Vec algorithm, and (3) similarity measures computed from the tf-idf weights of each word according to a vector space model (VSM). Here, a Pearson correlation coefficient for a pair of VSM-based similarity measures and co-occurrence-based similarity measures according to the number of documents was highest. Group-average agglomerative hierarchical clustering was also applied to similarity matrices computed by individual measures. An evaluation of the cluster sets according to an answer set revealed that VSM- and LDA-based similarity measures performed best.

Usage of coot optimization-based random forests analysis for determining the shallow foundation settlement

  • Yi, Han;Xingliang, Jiang;Ye, Wang;Hui, Wang
    • Geomechanics and Engineering
    • /
    • v.32 no.3
    • /
    • pp.271-291
    • /
    • 2023
  • Settlement estimation in cohesion materials is a crucial topic to tackle because of the complexity of the cohesion soil texture, which could be solved roughly by substituted solutions. The goal of this research was to implement recently developed machine learning features as effective methods to predict settlement (Sm) of shallow foundations over cohesion soil properties. These models include hybridized support vector regression (SVR), random forests (RF), and coot optimization algorithm (COM), and black widow optimization algorithm (BWOA). The results indicate that all created systems accurately simulated the Sm, with an R2 of better than 0.979 and 0.9765 for the train and test data phases, respectively. This indicates extraordinary efficiency and a good correlation between the experimental and simulated Sm. The model's results outperformed those of ANFIS - PSO, and COM - RF findings were much outstanding to those of the literature. By analyzing established designs utilizing different analysis aspects, such as various error criteria, Taylor diagrams, uncertainty analyses, and error distribution, it was feasible to arrive at the final result that the recommended COM - RF was the outperformed approach in the forecasting process of Sm of shallow foundation, while other techniques were also reliable.

Analysis of Research Trends in SIAM Journal on Applied Mathematics Using Topic Modeling (토픽모델링을 활용한 SIAM Journal on Applied Mathematics의 연구 동향 분석)

  • Kim, Sung-Yeun
    • Journal of the Korea Academia-Industrial cooperation Society
    • /
    • v.21 no.7
    • /
    • pp.607-615
    • /
    • 2020
  • The purpose of this study was to analyze the research status and trends related to the industrial mathematics based on text mining techniques with a sample of 4910 papers collected in the SIAM Journal on Applied Mathematics from 1970 to 2019. The R program was used to collect titles, abstracts, and key words from the papers and to analyze topic modeling techniques based on LDA algorithm. As a result of the coherence score on the collected papers, 20 topics were determined optimally using the Gibbs sampling methods. The main results were as follows. First, studies on industrial mathematics were conducted in a variety of mathematics fields, including computational mathematics, geometry, mathematical modeling, topology, discrete mathematics, probability and statistics, with a focus on analysis and algebra. Second, 5 hot topics (mathematical biology, nonlinear partial differential equation, discrete mathematics, statistics, topology) and 1 cold topic (probability theory) were found based on time series regression analysis. Third, among the fields that were not reflected in the 2015 revised mathematics curriculum, numeral system, matrix, vector in space, and complex numbers were extracted as the contents to be covered in the high school mathematical curriculum. Finally, this study suggested strategies to activate industrial mathematics in Korea, described the study limitations, and proposed directions for future research.

Multiple Cause Model-based Topic Extraction and Semantic Kernel Construction from Text Documents (다중요인모델에 기반한 텍스트 문서에서의 토픽 추출 및 의미 커널 구축)

  • 장정호;장병탁
    • Journal of KIISE:Software and Applications
    • /
    • v.31 no.5
    • /
    • pp.595-604
    • /
    • 2004
  • Automatic analysis of concepts or semantic relations from text documents enables not only an efficient acquisition of relevant information, but also a comparison of documents in the concept level. We present a multiple cause model-based approach to text analysis, where latent topics are automatically extracted from document sets and similarity between documents is measured by semantic kernels constructed from the extracted topics. In our approach, a document is assumed to be generated by various combinations of underlying topics. A topic is defined by a set of words that are related to the same topic or cooccur frequently within a document. In a network representing a multiple-cause model, each topic is identified by a group of words having high connection weights from a latent node. In order to facilitate teaming and inferences in multiple-cause models, some approximation methods are required and we utilize an approximation by Helmholtz machines. In an experiment on TDT-2 data set, we extract sets of meaningful words where each set contains some theme-specific terms. Using semantic kernels constructed from latent topics extracted by multiple cause models, we also achieve significant improvements over the basic vector space model in terms of retrieval effectiveness.

영상압축 : Digital Image Compression

  • Kim, Gyeong-Seop
    • Korean Journal of Digital Imaging in Medicine
    • /
    • v.4 no.1
    • /
    • pp.166-180
    • /
    • 1998
  • $\cdot$ 영상 압축은 영상의 통계학적 분포, 반복성을 이용하여 빈도가 높은 데이터는 적은 수의 bits를, 빈도가 낮은 데이터에는 보다 많은 수의 bits를 할당하여 전체 영상을 나타내는 bits 수를 줄이는 것임. $\cdot$ 영상 압축은 크게 Lossy Coding, Lossless Coding으로 나뉘며, Lossy coding은 DCT, 양자화기, VLC Codes를 쓰며 압축 율은 높으나 원래의 영상을 정확히 복원하지 못함. $\cdot$ 영상 압축에 대한 국제 규격 협회는 JPEG, MPEG I, MPEG II, MPEG IV, H.261, H.263 등이 있으나 본 seminar에서는 JPEG 규격만 논함. $\cdot$ 의학 영상은 Resolution이 크고 study 단위로 관리되기 때문에 영상 데이터량이 많으나 진단의 목적으로 쓰이기 때문에 주로 lossless 압축을 쓰게 되나 압축율이 낮음.(3:1 이하). 최근에는 Fractal, Wavelet Coding을 통한 압축율을 증가 시키는 Image Compression Algorithms이 활용됨. $\cdot$ MPEG은 동영상의 압축 표준안이며, 동영상은 한frame 당 25개 이상의 정지 화상으로 이루어지기 때문에 JPEG 규격에서 사용되었던 기법이 그대로 활용되며 영상과 영상간, 또는 frame과 frame 간의 여상의 변화, 움직임을 Vector로 coding하는 interframe Coding 기법을 활용하나 설명하기에는 광범위한 topic이므로 본 seminar에서는 생략함.

  • PDF

Academic Registration Text Classification Using Machine Learning

  • Alhawas, Mohammed S;Almurayziq, Tariq S
    • International Journal of Computer Science & Network Security
    • /
    • v.22 no.1
    • /
    • pp.93-96
    • /
    • 2022
  • Natural language processing (NLP) is utilized to understand a natural text. Text analysis systems use natural language algorithms to find the meaning of large amounts of text. Text classification represents a basic task of NLP with a wide range of applications such as topic labeling, sentiment analysis, spam detection, and intent detection. The algorithm can transform user's unstructured thoughts into more structured data. In this work, a text classifier has been developed that uses academic admission and registration texts as input, analyzes its content, and then automatically assigns relevant tags such as admission, graduate school, and registration. In this work, the well-known algorithms support vector machine SVM and K-nearest neighbor (kNN) algorithms are used to develop the above-mentioned classifier. The obtained results showed that the SVM classifier outperformed the kNN classifier with an overall accuracy of 98.9%. in addition, the mean absolute error of SVM was 0.0064 while it was 0.0098 for kNN classifier. Based on the obtained results, the SVM is used to implement the academic text classification in this work.