• Title/Summary/Keyword: Text analysis

Study of Mental Disorder Schizophrenia, based on Big Data

  • Hye-Sun Lee
    • International Journal of Advanced Culture Technology / v.11 no.4 / pp.279-285 / 2023
  • This study offers academic implications by examining trends in domestic research on psychosocial therapy for the mental disorder schizophrenia. For the analysis, text mining with the R program and social network analysis were applied to 65 collected papers. The results are as follows. First, the collected data were visualized through keyword analysis using the word cloud method. Second, keyword frequency analysis showed that intervention, schizophrenia, research, patients, program, effect, society, mind, ability, and function were the most frequent keywords. Third, LDA (latent Dirichlet allocation) topic modeling yielded 3 topics: patient subjects, psychosocial intervention, and efficacy of interventions. Fourth, the social network analysis yielded connectivity, closeness centrality, and betweenness centrality. In conclusion, this study is significant in that it provides basic rehabilitation data for schizophrenia and psychosocial therapy by identifying research trends through text mining and social network analysis, a big data approach, and by presenting the results visually.
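
The keyword frequency step described in this abstract can be illustrated with a minimal sketch. The study itself used R; the Python code below, its stopword list, and the sample strings are all assumptions made for illustration and are not taken from the paper.

```python
from collections import Counter
import re

# Hypothetical stopword list; the study's actual preprocessing is not described.
STOPWORDS = {"the", "of", "for", "and", "a", "in", "with", "on", "to"}

def keyword_frequencies(documents, top_n=5):
    """Count how often each non-stopword token appears across all documents."""
    counts = Counter()
    for doc in documents:
        tokens = re.findall(r"[a-z]+", doc.lower())
        counts.update(t for t in tokens if t not in STOPWORDS)
    return counts.most_common(top_n)

# Made-up sample texts, not data from the study.
abstracts = [
    "Psychosocial intervention for patients with schizophrenia",
    "Effect of a rehabilitation program on schizophrenia patients",
    "Intervention program and social function in schizophrenia",
]
top = keyword_frequencies(abstracts, top_n=3)
print(top)
```

A table like `top` is typically what feeds a word cloud, where each keyword is drawn at a size proportional to its count.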

Text Region Detection Method in Mobile Phone Video (휴대전화 동영상에서의 문자 영역 검출 방법)

  • Lee, Hoon-Jae;Sull, Sang-Hoon
    • Journal of the Institute of Electronics Engineers of Korea SP / v.47 no.5 / pp.192-198 / 2010
  • With the popularization of mobile phones with built-in cameras, much effort has gone into providing useful information to users by detecting and recognizing text in video captured by the phone camera, which first requires detecting the text regions in such video. In this paper, we propose a method to detect text regions in mobile phone video. We apply a morphological operation as preprocessing and obtain a binarized image using modified k-means clustering. Candidate text regions are then obtained by applying connected component analysis and general text characteristic analysis. In addition, we increase the precision of text detection by examining the frequency of the candidate regions. Experimental results show that the proposed method detects the text regions in mobile phone video with high precision and recall.
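
As a rough illustration of the connected component analysis step, the sketch below labels 4-connected foreground regions in an already binarized image and returns their bounding boxes. The function name, the choice of 4-connectivity, and the toy image are assumptions; the paper's preprocessing, modified k-means binarization, and text characteristic filters are not reproduced here.

```python
from collections import deque

def connected_components(binary):
    """Return bounding boxes (top, left, bottom, right) of 4-connected regions of 1s."""
    rows, cols = len(binary), len(binary[0])
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if binary[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first flood fill from this unvisited foreground pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            top, left, bottom, right = r, c, r, c
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and binary[ny][nx] == 1 and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes

# Toy binarized image (1 = foreground), made up for illustration.
image = [
    [1, 1, 0, 0, 0],
    [1, 1, 0, 0, 1],
    [0, 0, 0, 0, 1],
]
boxes = connected_components(image)
print(boxes)
```

Each bounding box would then be a candidate region to keep or discard according to size, aspect ratio, or similar text characteristics.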

The Effect of Text Information Frame Ratio and Font Size on the Text Readability of Circle Smartwatch

  • Park, Seungtaek;Park, Jaekyu;Choe, Jaeho;Jung, Eui S.
    • Journal of the Ergonomics Society of Korea / v.33 no.6 / pp.499-513 / 2014
  • Objective: This study examined the frame ratio of text information and the font size on a circular smartwatch display. Background: Electronics manufacturers have recently been trying to bring the original metaphor of the traditional (circular) wristwatch to the smartwatch, moving away from the square display in order to improve emotional customer satisfaction. Method: The experiment examined twenty text information designs, combining four frame ratios (1:1, 4:3, 16:9, 21:9) and five font sizes (6pt, 7pt, 8pt, 9pt, 10pt). Nineteen participants volunteered for the experiment. The dependent variables were WPM (Words per Minute), reading preference, design preference and total preference. A small circle display was built from the data of the 1.3-inch circle display exhibited at IFA (Internationale Funkausstellung) 2014. Results: ANOVA (Analysis of Variance) revealed that frame ratio and font size affected WPM and task time preference. The ANOVA results for reading preference, design preference and total preference were grouped by post hoc LSD (Least Significant Difference) analysis. Users preferred the 16:9 and 21:9 display ratios and the 9pt font size. Conclusion: The 16:9 display ratio and 9pt font size are more suitable than the other conditions for text information on a 1.3-inch circle display, mainly because the order of frame ratio and font size may affect the usability of reading long text on a small circle display. Therefore, when developers design a circle display, the square frame ratio and font size need to be considered according to circle size. Application: The 16:9 display ratio and 9pt font size may be used as a text information frame in circle display design guidelines for smartwatches.

- For the Development of Inquiring, integrated Science Curricular Materials - The Comparison and Analysis of Inquiry Activity between "The FAST Program" and "The Secondary Science Books" (탐구적 통합 과학 교재 개발을 위한, "FAST program"과 "중등 과학 교과서"의 탐구 활동 비교 분석)

  • Son, Yeon-A;Lee, Hack-Dong
    • Journal of The Korean Association For Science Education / v.14 no.1 / pp.45-57 / 1994
  • The purpose of this study is to verify whether the FAST program qualifies as inquiry science curricular material, through a comparison and analysis of inquiry activities in the FAST program and in our secondary science books. The results are as follows: 1. FAST has 226 inquiry activity tasks, more than twice as many as our textbooks. 2. In level one, FAST includes Synthesizing Results and Evaluation, Hypothesizing, and Designing an Experiment, but these are not found in our textbooks. 3. In level two, our textbooks show No Discussion at 72.2% and Demonstrating or Verifying the Content of the Text at 82%, whereas FAST shows Discussion Guided at 81.8% and contains no task of Demonstrating or Verifying the Content of the Text. 4. In level three, our textbooks show a typical type I with an Inquiry Index of 15-25 (Middle), whereas FAST shows type IV, except for Manipulating Apparatus and Observation, with an Inquiry Index over 35 (Very High). Therefore, the FAST program proves to be desirable inquiry science curricular material. Future work will address the following: 1. Verification of the FAST program as integrated science curricular material. 2. Development of inquiring, integrated science curricular materials based on the results of the preceding study.

Modern Methods of Text Analysis as an Effective Way to Combat Plagiarism

  • Myronenko, Serhii;Myronenko, Yelyzaveta
    • International Journal of Computer Science & Network Security / v.22 no.8 / pp.242-248 / 2022
  • The article analyzes modern methods of automatically comparing original and unoriginal text to detect textual plagiarism. The study covers two types of plagiarism: literal, in which plagiarists copy the text exactly without changing anything, and intelligent, which uses more sophisticated techniques that are harder to detect because the text is manipulated, for example by word and sign replacement. Standard techniques for extrinsic detection are string-based, vector space and semantic-based. The first, most common and most successful target models for detecting literal plagiarism, N-gram and Vector Space, are analyzed, and their advantages and disadvantages are evaluated. The most effective target models for detecting intelligent plagiarism, particularly for identifying paraphrases by measuring the semantic similarity of short components of the text, are investigated. Models using neural network architecture and based on natural language sentence matching approaches such as Densely Interactive Inference Network (DIIN), Bilateral Multi-Perspective Matching (BiMPM) and Bidirectional Encoder Representations from Transformers (BERT) and its family of models are considered. The progress in improving plagiarism detection systems, techniques and related models is summarized. Relevant and urgent problems that remain unresolved in detecting intelligent plagiarism, namely effective recognition of unoriginal ideas and of well-paraphrased text, are outlined.
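
To make the N-gram idea concrete, the following sketch compares two texts by the Jaccard overlap of their word trigrams. It is one common way to build an N-gram detector, not necessarily the formulation the article evaluates; the function names and sample sentences are assumptions.

```python
def word_ngrams(text, n=3):
    """Return the set of word n-grams in a text (lowercased, whitespace tokenized)."""
    tokens = text.lower().split()
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def ngram_jaccard(a, b, n=3):
    """Jaccard similarity of the two texts' n-gram sets, between 0.0 and 1.0."""
    grams_a, grams_b = word_ngrams(a, n), word_ngrams(b, n)
    if not grams_a and not grams_b:
        return 0.0
    return len(grams_a & grams_b) / len(grams_a | grams_b)

# Made-up sentences for illustration.
source = "the quick brown fox jumps over the lazy dog"
copied = "the quick brown fox jumps over the lazy dog"
reworded = "a fast brown fox leaps over a sleepy dog"

print(ngram_jaccard(source, copied))    # literal copy
print(ngram_jaccard(source, reworded))  # paraphrase
```

The literal copy scores 1.0 while the paraphrase shares no trigram with the source and scores 0.0, which illustrates why string-based models handle literal plagiarism well but miss intelligent plagiarism.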

Text-Driven Multiple-Path Discourse Processing for Descriptive Texts

  • Seo, Jungyun
    • Journal of Electrical Engineering and information Science / v.1 no.2 / pp.1-8 / 1996
  • This paper presents a text-driven discourse analysis system called DPAS. DPAS constructs a discourse structure by weaving together the clauses in a text, finding discourse relations between each clause and the clauses in a context. The basic processing model of DPAS is based on the stack-based model of discourse analysis suggested by Grosz and Sidner. We extend the model with a dynamic programming method to handle various discourse ambiguities effectively and efficiently. We develop the idea of a context space to keep all the information of a context. DPAS parses a text by considering all possible discourse relations between a clause and a context. Since different discourse relations may result in different states of a context, DPAS maintains multiple context spaces for an ambiguous text. Since maintaining all interpretations until the whole text is processed requires too many computing resources, DPAS uses depth-limited search to limit the search space. If there is more than one discourse relation between an input clause and a context, DPAS constructs one context space for each discourse relation. Then, after processing some more input clauses, DPAS applies heuristics to choose the most desirable context space. Although we used descriptive texts to demonstrate DPAS, its basic idea is domain independent, so we believe it can be extended to understand other styles of text.

A Study on Monitoring Method of Citizen Opinion based on Big Data : Focused on Gyeonggi Local Currency (Gyeonggi Money) (빅데이터 기반 시민의견 모니터링 방안 연구 : "경기지역화폐"를 중심으로)

  • Ahn, Soon-Jae;Lee, Sae-Mi;Ryu, Seung-Ei
    • Journal of Digital Convergence / v.18 no.7 / pp.93-99 / 2020
  • Text mining is a big data analysis method that extracts meaningful information from large-scale unstructured text data. In this study, text mining was used to monitor citizens' opinions on the policies and systems being implemented. We collected 5,108 newspaper articles and 748 online cafe posts related to 'Gyeonggi Local Currency' and performed frequency analysis, TF-IDF analysis, association analysis, and word tree visualization analysis. The results show that many newspaper articles dealt with the purpose of introducing local currency, the benefits provided, and the method of use, whereas content on the actual use of local currency appeared in the online cafe posts. The news acted as an informant, promoting local currency in order to revitalize it, while the online cafe posts consisted of the opinions of citizens who use local currency. SNS and text mining are expected to help activate various policies effectively, not only local currency.
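
The TF-IDF analysis mentioned in this abstract can be sketched as follows. This uses one common definition, term frequency times log(N / document frequency); the study does not state which variant it used, and the sample posts are invented.

```python
import math
from collections import Counter

def tf_idf(documents):
    """Return one {term: weight} dict per document, using tf * log(N / df)."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term occurs.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))
    weights = []
    for tokens in tokenized:
        tf = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in tf.items()
        })
    return weights

# Invented sample posts, not data from the study.
posts = [
    "local currency discount benefit",
    "local currency card charge",
    "local currency store list",
]
scores = tf_idf(posts)
print(scores[0])
```

Terms that occur in every document, such as "local" and "currency" here, receive a weight of zero, so the terms that distinguish each post stand out.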

A Study on Text Pattern Analysis Applying Discrete Fourier Transform - Focusing on Sentence Plagiarism Detection - (이산 푸리에 변환을 적용한 텍스트 패턴 분석에 관한 연구 - 표절 문장 탐색 중심으로 -)

  • Lee, Jung-Song;Park, Soon-Cheol
    • Journal of Korea Society of Industrial Information Systems / v.22 no.2 / pp.43-52 / 2017
  • Pattern analysis is one of the most important techniques in the signal processing, image processing and text mining fields. The Discrete Fourier Transform (DFT) is generally used to analyze the patterns of signals and images, and we expected that it could also be used to analyze text patterns. In this paper, DFT is applied for the first time to sentence plagiarism detection, which detects whether text patterns of a document exist in other documents. We convert the texts into signals by mapping them to ASCII codes and apply the cross-correlation method to detect simple text plagiarism such as cut-and-paste and term relocation. WordNet is used to find similarities in order to detect plagiarism that uses synonyms, translations, summarizations and so on. The 2013 corpus data set provided by PAN, one of the well-known workshops on text plagiarism, is used in our experiments. Our method ranks fourth among the eleven most outstanding plagiarism detection methods.
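
The idea of treating text as a signal and cross-correlating it through the DFT can be sketched as below. This is an illustration under stated assumptions, not the authors' method: it uses a naive O(n²) DFT for self-containment, and it detects exact copies by checking where the squared difference between the pattern and a text window is zero, a criterion chosen here for simplicity. All function names are hypothetical.

```python
import cmath

def dft(values, inverse=False):
    """Naive discrete Fourier transform (or its inverse) of a sequence."""
    n = len(values)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = 0j
        for t, v in enumerate(values):
            acc += v * cmath.exp(sign * 2j * cmath.pi * k * t / n)
        out.append(acc / n if inverse else acc)
    return out

def cross_correlation(text_codes, pattern_codes):
    """corr[k] = sum_i text[k + i] * pattern[i], computed through the DFT."""
    n = len(text_codes) + len(pattern_codes)  # zero-pad to avoid wrap-around
    a = list(text_codes) + [0] * (n - len(text_codes))
    b = list(pattern_codes) + [0] * (n - len(pattern_codes))
    product = [x * y.conjugate() for x, y in zip(dft(a), dft(b))]
    corr = dft(product, inverse=True)
    valid = len(text_codes) - len(pattern_codes) + 1
    return [round(c.real) for c in corr[:valid]]

def find_exact_matches(text, pattern):
    """Return offsets where pattern occurs in text, using ASCII-code signals."""
    t = [ord(ch) for ch in text]
    p = [ord(ch) for ch in pattern]
    if not p or len(p) > len(t):
        return []
    corr = cross_correlation(t, p)
    p_energy = sum(v * v for v in p)
    matches = []
    for k, c in enumerate(corr):
        window_energy = sum(v * v for v in t[k:k + len(p)])
        # Squared difference between window and pattern is zero only on a match.
        if window_energy - 2 * c + p_energy == 0:
            matches.append(k)
    return matches

document = "text mining finds copied text"
print(find_exact_matches(document, "text"))
```

In practice a fast Fourier transform would replace the naive `dft`, which is where the efficiency advantage over direct sliding comparison comes from.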

Detecting Spam Data for Securing the Reliability of Text Analysis (텍스트 분석의 신뢰성 확보를 위한 스팸 데이터 식별 방안)

  • Hyun, Yoonjin;Kim, Namgyu
    • The Journal of Korean Institute of Communications and Information Sciences / v.42 no.2 / pp.493-504 / 2017
  • Recently, the tremendous amount of unstructured text data distributed through news, blogs, and social media has gained much attention from researchers and practitioners, as this data contains abundant information about consumers' opinions. However, as the usefulness of text data increases, so do attempts to gain profit by distorting text data, maliciously or non-maliciously. This increase in spam text data not only burdens users who want useful information with a large amount of inappropriate information, but also damages the reliability of information and of information providers. Therefore, efforts must be made to improve the reliability of information and the quality of analysis results by detecting and removing spam data in advance. For this purpose, many studies on spam detection have been actively conducted in areas such as opinion spam detection, spam e-mail detection, and web spam detection. In this study, we introduce the core concepts and current research trends of spam detection and propose a methodology to detect spam tags in blogs as one challenging attempt to improve the reliability of blog information.

WCTT: Web Crawling System based on HTML Document Formalization (WCTT: HTML 문서 정형화 기반 웹 크롤링 시스템)

  • Kim, Jin-Hwan;Kim, Eun-Gyung
    • Journal of the Korea Institute of Information and Communication Engineering / v.26 no.4 / pp.495-502 / 2022
  • Web crawlers, which are the main tool for collecting text on the web today, are difficult to maintain and extend because researchers must implement different collection logic for each collection channel after analyzing the tags and styles of its HTML documents. To solve this problem, a web crawler should be able to collect text by formalizing HTML documents into the same structure. In this paper, we designed and implemented WCTT (Web Crawling system based on Tag path and Text appearance frequency), a web crawling system that collects text with a single collection logic by formalizing HTML documents based on tag path and text appearance frequency. Because WCTT collects text with the same logic for all collection channels, it is easy to maintain and to extend with new collection channels. In addition, it provides a preprocessing function that removes stopwords and extracts only nouns, for uses such as keyword network analysis.
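
One way to picture collecting text by tag path and text appearance frequency is the sketch below, which groups text fragments by the path of enclosing tags and keeps the path that carries the most text. This is an assumed simplification for illustration only; the abstract does not specify WCTT's actual formalization rules, and the class and function names are hypothetical.

```python
from html.parser import HTMLParser

# Tags that never have a closing tag, so they must not be pushed on the stack.
VOID_TAGS = {"br", "img", "hr", "meta", "link", "input"}

class TagPathCollector(HTMLParser):
    """Group text fragments by the path of tags that encloses them."""

    def __init__(self):
        super().__init__()
        self.stack = []
        self.texts = {}  # tag path -> list of text fragments

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag in self.stack:
            # Pop back to the matching open tag, tolerating unclosed children.
            while self.stack.pop() != tag:
                pass

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.texts.setdefault("/".join(self.stack), []).append(text)

def main_text(html):
    """Return the fragments under the tag path that holds the most characters."""
    collector = TagPathCollector()
    collector.feed(html)
    collector.close()
    if not collector.texts:
        return []
    best = max(collector.texts,
               key=lambda path: sum(len(t) for t in collector.texts[path]))
    return collector.texts[best]

# Made-up page for illustration.
page = """<html><body>
<div><a>Home</a><a>Login</a></div>
<div><p>First paragraph of the article.</p><p>Second paragraph of the article.</p></div>
</body></html>"""
print(main_text(page))
```

Because the selection depends only on tag paths and text volume, the same function can be applied to pages from different sites without site-specific selectors, which is the maintenance benefit the abstract describes.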