• Title/Summary/Keyword: Text Model learning

Search Result 427, Processing Time 0.03 seconds

Text Classification on Social Network Platforms Based on Deep Learning Models

  • YA, Chen;Tan, Juan;Hoekyung, Jung
    • Journal of information and communication convergence engineering
    • /
    • v.21 no.1
    • /
    • pp.9-16
    • /
    • 2023
  • The natural language on social network platforms has a certain front-to-back dependency in structure, and the direct conversion of Chinese text into a vector makes the dimensionality very high, thereby resulting in the low accuracy of existing text classification methods. To this end, this study establishes a deep learning model that combines a big data ultra-deep convolutional neural network (UDCNN) and long short-term memory network (LSTM). The deep structure of UDCNN is used to extract the features of text vector classification. The LSTM stores historical information to extract the context dependency of long texts, and word embedding is introduced to convert the text into low-dimensional vectors. Experiments are conducted on the social network platforms Sogou corpus and the University HowNet Chinese corpus. The research results show that compared with CNN + rand, LSTM, and other models, the neural network deep learning hybrid model can effectively improve the accuracy of text classification.

Pill Identification Algorithm Based on Deep Learning Using Imprinted Text Feature (음각 정보를 이용한 딥러닝 기반의 알약 식별 알고리즘 연구)

  • Seon Min, Lee;Young Jae, Kim;Kwang Gi, Kim
    • Journal of Biomedical Engineering Research
    • /
    • v.43 no.6
    • /
    • pp.441-447
    • /
    • 2022
  • In this paper, we propose a pill identification model using engraved text feature and image feature such as shape and color, and compare it with an identification model that does not use engraved text feature to verify the possibility of improving identification performance by improving recognition rate of the engraved text. The data consisted of 100 classes and used 10 images per class. The engraved text feature was acquired through Keras OCR based on deep learning and 1D CNN, and the image feature was acquired through 2D CNN. According to the identification results, the accuracy of the text recognition model was 90%. The accuracy of the comparative model and the proposed model was 91.9% and 97.6%. The accuracy, precision, recall, and F1-score of the proposed model were better than those of the comparative model in terms of statistical significance. As a result, we confirmed that the expansion of the range of feature improved the performance of the identification model.

A Lightweight Deep Learning Model for Text Detection in Fashion Design Sketch Images for Digital Transformation

  • Ju-Seok Shin;Hyun-Woo Kang
    • Journal of the Korea Society of Computer and Information
    • /
    • v.28 no.10
    • /
    • pp.17-25
    • /
    • 2023
  • In this paper, we propose a lightweight deep learning architecture tailored for efficient text detection in fashion design sketch images. Given the increasing prominence of Digital Transformation in the fashion industry, there is a growing emphasis on harnessing digital tools for creating fashion design sketches. As digitization becomes more pervasive in the fashion design process, the initial stages of text detection and recognition take on pivotal roles. In this study, a lightweight network was designed by building upon existing text detection deep learning models, taking into consideration the unique characteristics of apparel design drawings. Additionally, a separately collected dataset of apparel design drawings was added to train the deep learning model. Experimental results underscore the superior performance of our proposed deep learning model, outperforming existing text detection models by approximately 20% when applied to fashion design sketch images. As a result, this paper is expected to contribute to the Digital Transformation in the field of clothing design by means of research on optimizing deep learning models and detecting specialized text information.

A Novel Text Sample Selection Model for Scene Text Detection via Bootstrap Learning

  • Kong, Jun;Sun, Jinhua;Jiang, Min;Hou, Jian
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.13 no.2
    • /
    • pp.771-789
    • /
    • 2019
  • Text detection has been a popular research topic in the field of computer vision. It is difficult for prevalent text detection algorithms to avoid the dependence on datasets. To overcome this problem, we proposed a novel unsupervised text detection algorithm inspired by bootstrap learning. Firstly, the text candidate in a novel form of superpixel is proposed to improve the text recall rate by image segmentation. Secondly, we propose a unique text sample selection model (TSSM) to extract text samples from the current image and eliminate database dependency. Specifically, to improve the precision of samples, we combine maximally stable extremal regions (MSERs) and the saliency map to generate sample reference maps with a double threshold scheme. Finally, a multiple kernel boosting method is developed to generate a strong text classifier by combining multiple single kernel SVMs based on the samples selected from TSSM. Experimental results on standard datasets demonstrate that our text detection method is robust to complex backgrounds and multilingual text and shows stable performance on different standard datasets.

Development of Elementary School AI Education Contents using Entry Text Model Learning (엔트리 텍스트 모델 학습을 활용한 초등 인공지능 교육 내용 개발)

  • Kim, Byungjo;Kim, Hyenbae
    • Journal of The Korean Association of Information Education
    • /
    • v.26 no.1
    • /
    • pp.65-73
    • /
    • 2022
  • In this study, by using Entry text model learning, educational contents for artificial intelligence education of elementary school students are developed and applied to actual classes. Based on the elementary and secondary artificial intelligence content table, the achievement standards of practical software education and artificial intelligence education will be reconstructed.. Among text, images, and sounds capable of machine learning, "production of emotion recognition programs using text model learning" will be selected as the educational content, which can be easily understood while reducing data preparation time for elementary school students. Entry artificial intelligence is selected as an education platform to develop artificial intelligence education contents that create emotion recognition programs using text model learning and apply them to actual elementary school classes. Based on the contents of this study, As a result of class application, students showed positive responses and interest in the entry AI class. it is suggested that quantitative research on the effectiveness of classes for elementary school students is necessary as a follow-up study.

A Text Sentiment Classification Method Based on LSTM-CNN

  • Wang, Guangxing;Shin, Seong-Yoon;Lee, Won Joo
    • Journal of the Korea Society of Computer and Information
    • /
    • v.24 no.12
    • /
    • pp.1-7
    • /
    • 2019
  • With the in-depth development of machine learning, the deep learning method has made great progress, especially with the Convolution Neural Network(CNN). Compared with traditional text sentiment classification methods, deep learning based CNNs have made great progress in text classification and processing of complex multi-label and multi-classification experiments. However, there are also problems with the neural network for text sentiment classification. In this paper, we propose a fusion model based on Long-Short Term Memory networks(LSTM) and CNN deep learning methods, and applied to multi-category news datasets, and achieved good results. Experiments show that the fusion model based on deep learning has greatly improved the precision and accuracy of text sentiment classification. This method will become an important way to optimize the model and improve the performance of the model.

An Optimal Weighting Method in Supervised Learning of Linguistic Model for Text Classification

  • Mikawa, Kenta;Ishida, Takashi;Goto, Masayuki
    • Industrial Engineering and Management Systems
    • /
    • v.11 no.1
    • /
    • pp.87-93
    • /
    • 2012
  • This paper discusses a new weighting method for text analyzing from the view point of supervised learning. The term frequency and inverse term frequency measure (tf-idf measure) is famous weighting method for information retrieval, and this method can be used for text analyzing either. However, it is an experimental weighting method for information retrieval whose effectiveness is not clarified from the theoretical viewpoints. Therefore, other effective weighting measure may be obtained for document classification problems. In this study, we propose the optimal weighting method for document classification problems from the view point of supervised learning. The proposed measure is more suitable for the text classification problem as used training data than the tf-idf measure. The effectiveness of our proposal is clarified by simulation experiments for the text classification problems of newspaper article and the customer review which is posted on the web site.

A Tensor Space Model based Deep Neural Network for Automated Text Classification (자동문서분류를 위한 텐서공간모델 기반 심층 신경망)

  • Lim, Pu-reum;Kim, Han-joon
    • Database Research
    • /
    • v.34 no.3
    • /
    • pp.3-13
    • /
    • 2018
  • Text classification is one of the text mining technologies that classifies a given textual document into its appropriate categories and is used in various fields such as spam email detection, news classification, question answering, emotional analysis, and chat bot. In general, the text classification system utilizes machine learning algorithms, and among a number of algorithms, naïve Bayes and support vector machine, which are suitable for text data, are known to have reasonable performance. Recently, with the development of deep learning technology, several researches on applying deep neural networks such as recurrent neural networks (RNN) and convolutional neural networks (CNN) have been introduced to improve the performance of text classification system. However, the current text classification techniques have not yet reached the perfect level of text classification. This paper focuses on the fact that the text data is expressed as a vector only with the word dimensions, which impairs the semantic information inherent in the text, and proposes a neural network architecture based upon the semantic tensor space model.

Research on Chinese Microblog Sentiment Classification Based on TextCNN-BiLSTM Model

  • Haiqin Tang;Ruirui Zhang
    • Journal of Information Processing Systems
    • /
    • v.19 no.6
    • /
    • pp.842-857
    • /
    • 2023
  • Currently, most sentiment classification models on microblogging platforms analyze sentence parts of speech and emoticons without comprehending users' emotional inclinations and grasping moral nuances. This study proposes a hybrid sentiment analysis model. Given the distinct nature of microblog comments, the model employs a combined stop-word list and word2vec for word vectorization. To mitigate local information loss, the TextCNN model, devoid of pooling layers, is employed for local feature extraction, while BiLSTM is utilized for contextual feature extraction in deep learning. Subsequently, microblog comment sentiments are categorized using a classification layer. Given the binary classification task at the output layer and the numerous hidden layers within BiLSTM, the Tanh activation function is adopted in this model. Experimental findings demonstrate that the enhanced TextCNN-BiLSTM model attains a precision of 94.75%. This represents a 1.21%, 1.25%, and 1.25% enhancement in precision, recall, and F1 values, respectively, in comparison to the individual deep learning models TextCNN. Furthermore, it outperforms BiLSTM by 0.78%, 0.9%, and 0.9% in precision, recall, and F1 values.

Self-Supervised Document Representation Method

  • Yun, Yeoil;Kim, Namgyu
    • Journal of the Korea Society of Computer and Information
    • /
    • v.25 no.5
    • /
    • pp.187-197
    • /
    • 2020
  • Recently, various methods of text embedding using deep learning algorithms have been proposed. Especially, the way of using pre-trained language model which uses tremendous amount of text data in training is mainly applied for embedding new text data. However, traditional pre-trained language model has some limitations that it is hard to understand unique context of new text data when the text has too many tokens. In this paper, we propose self-supervised learning-based fine tuning method for pre-trained language model to infer vectors of long-text. Also, we applied our method to news articles and classified them into categories and compared classification accuracy with traditional models. As a result, it was confirmed that the vector generated by the proposed model more accurately expresses the inherent characteristics of the document than the vectors generated by the traditional models.