• Title/Abstract/Keywords: Text network

Search results: 1,111 items (processing time: 0.026 s)

Text Classification on Social Network Platforms Based on Deep Learning Models

  • YA, Chen;Tan, Juan;Hoekyung, Jung
    • Journal of information and communication convergence engineering
    • /
    • Vol. 21, No. 1
    • /
    • pp.9-16
    • /
    • 2023
  • Natural language on social network platforms has front-to-back dependencies in its structure, and directly converting Chinese text into vectors yields very high dimensionality, which results in the low accuracy of existing text classification methods. To address this, the study builds a deep learning model that combines a big-data ultra-deep convolutional neural network (UDCNN) with a long short-term memory (LSTM) network. The deep structure of the UDCNN extracts features for classifying text vectors. The LSTM stores historical information to capture the context dependency of long texts, and word embedding converts the text into low-dimensional vectors. Experiments are conducted on the Sogou corpus of social network platforms and the University HowNet Chinese corpus. The results show that, compared with CNN + rand, LSTM, and other models, the hybrid deep learning model effectively improves text classification accuracy.

Arabic Text Recognition with Harakat Using Deep Learning

  • Ashwag, Maghraby;Esraa, Samkari
    • International Journal of Computer Science & Network Security
    • /
    • Vol. 23, No. 1
    • /
    • pp.41-46
    • /
    • 2023
  • Because of the significant role that harakat plays in Arabic text, this paper used deep learning to extract Arabic text with its harakat from an image. Convolutional neural networks and recurrent neural network algorithms were applied to the dataset, which contained 110 images, each representing one word. The results showed the ability to extract some letters with harakat.

Neural Text Categorizer for Exclusive Text Categorization

  • Jo, Tae-Ho
    • Journal of Information Processing Systems
    • /
    • Vol. 4, No. 2
    • /
    • pp.77-86
    • /
    • 2008
  • This research proposes a new neural network for text categorization that represents documents in a form other than numerical vectors. Because the proposed network was designed specifically for text categorization, it is called NTC (Neural Text Categorizer). Numerical vectors that represent documents for text mining tasks have two inherent problems: huge dimensionality and sparse distribution. Although many feature selection methods have been developed to address the first problem, the reduced dimension still remains large, and if a feature selection method reduces the dimension excessively, the robustness of text categorization degrades. Even though SVM (Support Vector Machine) tolerates huge dimensionality, it does not tolerate the second problem. The goal of this research is to address both problems at the same time by proposing a new representation of documents and a new neural network that takes this representation as its input.

Speaker Verification Using Hidden LMS Adaptive Filtering Algorithm and Competitive Learning Neural Network

  • 조성원;김재민
    • 대한전기학회논문지:시스템및제어부문D
    • /
    • Vol. 51, No. 2
    • /
    • pp.69-77
    • /
    • 2002
  • Speaker verification falls into two categories: text-dependent and text-independent. This paper discusses text-dependent speaker verification, in which the system determines whether the sound characteristics of a speaker match those of a specific person. We obtain speaker data with a sound card under various noisy conditions, apply a new Hidden LMS (Least Mean Square) adaptive algorithm to the data, and extract LPC (Linear Predictive Coding)-cepstrum coefficients as feature vectors. Finally, we use a competitive learning neural network for speaker verification. The proposed hidden LMS adaptive filter, which uses a neural network, reduces noise and enhances features under various noisy conditions. We construct a separate neural network for each speaker, so the whole network does not need to be retrained when a speaker is added, which makes the system easy to expand. Experiments show that the proposed method improves speaker verification performance.
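
For orientation, the standard LMS (least mean square) weight update that adaptive filters of this kind build on can be sketched as follows. This is the textbook LMS algorithm, not the paper's Hidden LMS variant, whose details the abstract does not give. The function name `lms_filter`, the tap count, the step size `mu`, and the test signal are all illustrative assumptions.

```python
import random


def lms_filter(x, d, num_taps=4, mu=0.05):
    """Textbook LMS adaptive filter (NOT the paper's Hidden LMS variant).

    x: input samples, d: desired samples (same length as x).
    Returns (outputs, errors, final_weights).
    """
    w = [0.0] * num_taps
    outputs, errors = [], []
    for n in range(len(x)):
        # Most recent num_taps inputs, newest first, zero-padded at the start.
        window = [x[n - k] if n - k >= 0 else 0.0 for k in range(num_taps)]
        y = sum(wi * xi for wi, xi in zip(w, window))  # filter output
        e = d[n] - y                                   # estimation error
        # LMS update: w <- w + mu * e * x
        w = [wi + mu * e * xi for wi, xi in zip(w, window)]
        outputs.append(y)
        errors.append(e)
    return outputs, errors, w


# Usage: identify a hypothetical system d[n] = 0.5*x[n] - 0.3*x[n-1].
rng = random.Random(0)
x = [rng.uniform(-1.0, 1.0) for _ in range(2000)]
d = [0.5 * x[n] - 0.3 * (x[n - 1] if n > 0 else 0.0) for n in range(len(x))]
_, errors, weights = lms_filter(x, d)
print([round(wi, 3) for wi in weights])
```

In this noise-free setup the weights should approach the coefficients of the hypothetical system; a smaller `mu` converges more slowly but is more stable.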

Analyzing Insurance Image Using Text Network Analysis

  • 박경보;고해리;홍종의
    • 예술인문사회 융합 멀티미디어 논문지
    • /
    • Vol. 8, No. 3
    • /
    • pp.531-541
    • /
    • 2018
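  • This study applied text mining and text network analysis to examine consumers' image of NongHyup insurance. With the recent growth of social media, large volumes of text are being produced and reproduced, and this text gives companies important information. Many studies use text mining and text network analysis to extract meaning from such information. The text analysis showed that the positive image of NongHyup insurance centered on safety and stability, while the negative image centered on concern and anxiety. The image of NongHyup insurance derived from the text network analysis formed around safety and concern. These results were then confirmed through interviews. According to the interviews, the main drivers of the positive image were NongHyup's stable finances, reflected in factors such as its asset size, and the reliability of its insurance claim payments. On the negative side, consumers were highly concerned about leaks of personal information following the recent data leak incident. The analysis used in this study could also be applied to the image of other products.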

A Tensor Space Model based Deep Neural Network for Automated Text Classification

  • 임푸름;김한준
    • 데이타베이스연구회지:데이타베이스연구
    • /
    • Vol. 34, No. 3
    • /
    • pp.3-13
    • /
    • 2018
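  • Text classification, one of the text mining techniques, assigns a given text document to a suitable category and is used in many areas, including spam detection, news classification, automated response, sentiment analysis, and chatbots. Text classification systems generally rely on machine learning algorithms; among these, Naive Bayes and the Support Vector Machine, which suit text data, are known to perform reasonably well. With recent advances in deep learning, studies have applied recurrent neural networks (RNN) and convolutional neural networks (CNN) to improve text classification performance. However, these recent techniques still fall short of perfect document classification. This paper attributes the shortfall to the fact that text data is represented as vectors centered on the word dimension, which damages the semantic information inherent in the text. It proposes a deep neural network architecture based on the semantic tensor space model, whose effectiveness was verified in prior work, and shows that a document classifier built on this architecture performs substantially better.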

Sentiment Analysis and Network Analysis based on Review Text

  • 김유미;허고은
    • 한국문헌정보학회지
    • /
    • Vol. 55, No. 3
    • /
    • pp.397-417
    • /
    • 2021
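  • Review text records users' experiences and opinions in concrete detail, so analyzing it reveals a great deal about the reviewed subject. Accordingly, prior studies have applied sentiment analysis to review text to identify users' evaluations of individual restaurant factors, and network analysis to identify user preferences. This study selected restaurants with high and with low satisfaction, based on the star ratings of their reviews, and performed sentiment analysis and network analysis together. By using the differences between the review texts of the two groups to characterize the restaurants, it sought to identify the criteria for a good restaurant and the main factors that affect restaurant satisfaction.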

Using Text Network Analysis for Analyzing Academic Papers in Nursing

  • 박찬숙
    • Perspectives in Nursing Science
    • /
    • Vol. 16, No. 1
    • /
    • pp.12-24
    • /
    • 2019
  • Purpose: This study examined the suitability of using text network analysis (TNA) methodology for topic analysis of academic papers related to nursing. Methods: TNA background theories, software programs, and research processes have been described in this paper. Additionally, the research methodology that applied TNA to the topic analysis of the academic nursing papers was analyzed. Results: As background theories for the study, we explained information theory, word co-occurrence analysis, graph theory, network theory, and social network analysis. The TNA procedure was described as follows: 1) collection of academic articles, 2) text extraction, 3) preprocessing, 4) generation of word co-occurrence matrices, 5) social network analysis, and 6) interpretation and discussion. Conclusion: TNA using author-keywords has several advantages. It can utilize recognized terms such as MeSH headings or terms chosen by professionals, and it saves time and effort. Additionally, the study emphasizes the necessity of developing a sophisticated research design that explores nursing research trends in a multidimensional method by applying TNA methodology.
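
Steps 4 and 5 of the procedure described in this abstract (word co-occurrence matrices, then social network analysis) can be sketched roughly as follows. This is a minimal illustration under assumptions, not the paper's implementation: the function names `cooccurrence_counts` and `degree_centrality` and the sample author keywords are hypothetical, and degree centrality stands in for whichever network measures a given study actually uses.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(keyword_lists):
    """Count how often each unordered keyword pair appears in the same paper."""
    counts = Counter()
    for keywords in keyword_lists:
        # Deduplicate and sort so each pair has one canonical ordering.
        for a, b in combinations(sorted(set(keywords)), 2):
            counts[(a, b)] += 1
    return counts


def degree_centrality(counts):
    """Normalized degree: distinct neighbours divided by (n - 1)."""
    neighbours = {}
    for a, b in counts:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    n = len(neighbours)
    if n < 2:
        return {node: 0.0 for node in neighbours}
    return {node: len(adj) / (n - 1) for node, adj in neighbours.items()}


# Hypothetical author keywords for three papers (illustrative data only).
papers = [
    ["nursing", "education", "simulation"],
    ["nursing", "burnout"],
    ["nursing", "education"],
]
counts = cooccurrence_counts(papers)
centrality = degree_centrality(counts)
for node, score in sorted(centrality.items(), key=lambda kv: -kv[1]):
    print(f"{node}: {score:.2f}")
```

Keywords with high centrality are the ones that link to many other keywords, which is what the interpretation step (step 6) would then examine.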

Text Categorization for Authorship based on the Features of Lingual Conceptual Expression

  • Zhang, Quan;Zhang, Yun-liang;Yuan, Yi
    • 한국언어정보학회:학술대회논문집
    • /
    • Korean Society for Language and Information (한국언어정보학회), 2007 Annual Conference
    • /
    • pp.515-521
    • /
    • 2007
  • Text categorization is an important field in automatic text information processing, and authorship identification can be treated as a special case of it. This paper adopts the conceptual primitive expressions of the Hierarchical Network of Concepts (HNC) theory, which describes word meanings with hierarchical symbols, to avoid the data sparseness caused by natural-language surface features in text categorization. The KNN algorithm is used as the classifier. An experiment was conducted on authorship identification for Chinese texts. The results show that the proposed approach achieves a high accuracy rate and is therefore feasible for text authorship identification.


An Ensemble Approach for Cyber Bullying Text Messages and Images

  • Zarapala Sunitha Bai;Sreelatha Malempati
    • International Journal of Computer Science & Network Security
    • /
    • Vol. 23, No. 11
    • /
    • pp.59-66
    • /
    • 2023
  • Text mining (TM) is widely used to find patterns in text documents. Cyber-bullying refers to abusing a person on an online or offline platform, and it is becoming more dangerous for people who use social networking sites (SNS). It takes many forms, such as text messages, morphed images, and morphed videos, and preventing this kind of abuse on online SNS is very difficult. Finding accurate text mining patterns improves the detection of cyber-bullying on any platform. On online SNS, cyber-bullying involves sending defamatory statements, verbally bullying other people, or abusing someone in front of other SNS users. Deep learning (DL) is a significant domain used to extract and learn quality features dynamically from low-level text inclusions. In this setting, convolutional neural networks (CNN) are used to train on text data, images, and videos; CNN is a powerful approach for these data types and has achieved better text classification. This paper introduces an ensemble model that integrates term frequency-inverse document frequency (TF-IDF) and a deep neural network (DNN) with advanced feature extraction techniques to classify bullying text, images, and videos. The proposed approach also aims to reduce training time and memory usage, which helps improve classification.
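
The TF-IDF weighting named in this abstract can be sketched as follows. This is a minimal illustration of one common variant (term count divided by document length, times log(N / df)), not necessarily the weighting the paper uses. The function name `tf_idf`, the whitespace tokenization, and the sample messages are assumptions.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per document.

    Uses tf = count / document length and idf = log(N / df),
    one common variant among several.
    """
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))
    weights = []
    for tokens in tokenized:
        tf = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in tf.items()
        })
    return weights


# Hypothetical messages (illustrative data only).
docs = ["good game everyone", "bad game loser", "good work"]
scores = tf_idf(docs)
for term, weight in sorted(scores[1].items(), key=lambda kv: -kv[1]):
    print(f"{term}: {weight:.3f}")
```

Terms that occur in fewer messages receive higher weights than terms shared across messages, which is why such vectors are a common input to a downstream classifier.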