• Title/Summary/Keyword: Text analysis

Understanding Facility Management on Tunnel through Text Mining of Precision Safety Diagnosis Data (터널시설물 점검진단 데이터의 텍스트마이닝 분석을 통한 유형별·지역별 중점 유지관리요소의 이해)

  • Seo, Jeong-eun;Oh, Jintak
    • Journal of Korean Association for Spatial Structures, v.21 no.3, pp.85-92, 2021
  • The purpose of this paper is to identify the key factors for efficient maintenance of rapidly aging facilities. To this end, safety inspection and diagnosis reports accumulated as unstructured data were collected, preprocessed, and analyzed using text mining. In the short term, the derived vulnerabilities of tunnel facilities can serve as inspection items that reflect the characteristics of individual facilities during regular and daily inspections. In the long term, if detailed specification information and other inspection results (safety, durability, and ease of use) are added to the analysis, the approach provides a stepping stone toward supporting preemptive maintenance decision-making.

Romanian-Lexicon-Based Sentiment Analysis for Assessing Teachers' Activity

  • Barila, Adina;Danubianu, Mirela;Gradinaru, Bogdanel
    • International Journal of Computer Science & Network Security, v.22 no.10, pp.43-50, 2022
  • Students' feedback is important for measuring and improving teaching performance. Many teacher performance evaluation systems are based on responses to closed questions, but free-text answers can contain useful information that deserves exploration. In this paper we present a lexicon-based sentiment analysis to explore students' text feedback. The data was collected from a system for the evaluation of teachers by students, developed and used in our university. The students' comments are in Romanian, so we built a Romanian sentiment word lexicon. We used this lexicon to categorize the feedback text as positive, negative, or neutral. In addition, we added a new polarity, indifferent, to categorize blank and "I don't answer" responses.
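
The labeling scheme this abstract describes can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration: the toy lexicon uses English stand-in words rather than the authors' Romanian lexicon, and the scoring rule (sum of word polarities), the tokenization, and the name `classify_feedback` are hypothetical, since the abstract does not give them.

```python
import re

# Hypothetical toy lexicon with English stand-ins; the paper builds a Romanian one.
LEXICON = {
    "good": 1, "clear": 1, "helpful": 1,
    "boring": -1, "confusing": -1, "bad": -1,
}

# Assumed forms of a non-answer, mapped to the extra "indifferent" polarity.
NON_ANSWERS = {"", "i don't answer"}


def classify_feedback(text: str) -> str:
    """Label one free-text comment as positive, negative, neutral, or indifferent."""
    normalized = text.strip().lower()
    if normalized in NON_ANSWERS:
        return "indifferent"
    # Split into alphabetic tokens (Unicode-aware, so diacritics survive).
    tokens = re.findall(r"[^\W\d_]+", normalized)
    # Assumed scoring rule: sum the polarity of every token found in the lexicon.
    score = sum(LEXICON.get(token, 0) for token in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

Under these assumptions, a blank response and the literal text "I don't answer" both fall into the indifferent class, while a comment with no lexicon words is neutral.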

Analysis of Success Factors of Electric Scooter Sharing Service Using User Review Text Mining

  • Kyoung-ae Seo;Jung Seung Lee
    • Journal of Information Technology Applications and Management, v.30 no.2, pp.19-30, 2023
  • This study collects user reviews of shared electric scooter service applications, one of the various sharing-economy models, and uses text mining to analyze service improvements and success factors for electric scooter sharing companies. The factors of user satisfaction and dissatisfaction were identified using the term frequency-inverse document frequency (TF-IDF) technique, and topics for each keyword were extracted using Latent Dirichlet Allocation (LDA) topic modeling. According to the analysis results, the main topics were entertainment, safety, service area, application complaints, use complaints, convenience, and mobility. Using these results, employees and researchers of electric scooter sharing service companies will be able to contribute to the improvement and success of related services.
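
As a rough illustration of the TF-IDF weighting named in this abstract, the following Python sketch scores terms in a few made-up reviews. The variant used (term count divided by document length, times the natural log of N over document frequency) is one common formulation and an assumption here; the abstract does not say which variant or tooling the study used, and the `reviews` data and the name `tf_idf` are hypothetical.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document."""
    n_docs = len(docs)
    # Document frequency: the number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            # Assumed variant: relative term frequency times log(N / df).
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical, already-tokenized reviews (not data from the study).
reviews = [
    "scooter fun ride".split(),
    "scooter app crash".split(),
    "scooter app slow".split(),
]
scores = tf_idf(reviews)
```

With this variant, a word that occurs in every review (here "scooter") gets weight zero, while words confined to a single review get the highest weights, which is why TF-IDF is useful for surfacing distinguishing keywords.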

Korean Voice Phishing Text Classification Performance Analysis Using Machine Learning Techniques (머신러닝 기법을 이용한 한국어 보이스피싱 텍스트 분류 성능 분석)

  • Boussougou, Milandu Keith Moussavou;Jin, Sangyoon;Chang, Daeho;Park, Dong-Joo
    • Proceedings of the Korea Information Processing Society Conference, 2021.11a, pp.297-299, 2021
  • Text classification is one of the popular tasks in Natural Language Processing (NLP), used to classify texts or documents in applications such as sentiment analysis and email filtering. Nowadays, state-of-the-art (SOTA) Machine Learning (ML) and Deep Learning (DL) algorithms are the core engines used to perform these classification tasks with high accuracy, and they show satisfying results. This paper conducts a benchmark performance analysis of multiple SOTA algorithms on the first known labeled Korean voice phishing dataset, called KorCCVi. Experimental results on a test set of 366 samples reveal which algorithm performs best considering the training time and metrics such as accuracy and F1 score.

A Study about Inter-Textuality in Modern Hair Style - Focused on Collections - (현대 헤어스타일에 표현된 텍스트의 다원화 현상에 관한 연구 - 컬렉션을 중심으로 -)

  • Kim, Sung-Ah;Yoo, Tae-Soon
    • Fashion & Textile Research Journal, v.11 no.6, pp.934-941, 2009
  • The purpose of this study is to examine how the pluralistic phenomenon of text operates in hairstyles, compared with fashion, in collections. The results indicate that the pluralistic image of text in modern fashion appears as pluralism in gender, T.P.O., coordination, and material, whereas in hairstyles it appears as pluralism in gender, material, and cultural category. The study was limited to fashion collections from 2001 to 2007; it analyzed hairstyle features in photos taken from style.com, an online site specializing in fashion, alongside a literature review of the theoretical background on intertextuality. Analyzing works in terms of the pluralistic phenomenon of text offered a new perspective distinct from earlier views and opened the possibility of understanding many unfamiliar representations that had previously been hard to interpret. The process of imitating and reconstructing each text according to compositional principles also showed the need for an artist's ability to realize an original creative world.

An Investigation of Exposure to Informational Text through English Textbooks

  • Kim, Tae-Eun
    • English Language & Literature Teaching, v.15 no.2, pp.185-207, 2009
  • This study investigated the extent to which the informational text genre appeared in English textbooks at grades six, seven, and nine. Employing content analysis to analyze the literary forms, the researcher identified the genre of each reading selection in each English textbook and classified it into one of six categories: fiction, information, biography, poetry, play, or fantasy. The informational genre was further divided into two subcategories, non-narrative and narrative, in order to investigate the extent of non-narrative informational text alone. The text genre was examined by analyzing (a) the number of reading selections representing each genre and (b) the number of words in reading selections devoted to each genre. The most frequent genre at grades 6 and 7 was fiction, with 94% and 71% respectively, whereas at grade 9 it was information (51%), followed by fiction (37%). The largest number of words was devoted to fiction, with 96% at the sixth grade and 70% at the seventh grade; at grade 9, by contrast, the largest share was devoted to information (46%), followed by fiction (39%). Although there was variance across publishers, the informational text genre gained more significance as the grade level increased. In particular, the percentage of reading selections and words devoted to the non-narrative or expository informational genre was overall 4% at grade 6, 17% at grade 7, and 44% at grade 9. The findings demonstrate the need to pay more attention to informational literacy, especially in the early grades, for the development of balanced genre knowledge.

Unstructured Data Quantification Scheme Based on Text Mining for User Feedback Extraction (사용자 의견 추출을 위한 텍스트 마이닝 기반 비정형 데이터 정량화 방안)

  • Jo, Jung-Heum;Chung, Yong-Taek;Choi, Seong-Wook;Ok, Changsoo
    • Journal of Korean Society of Industrial and Systems Engineering, v.41 no.4, pp.131-137, 2018
  • People write reviews of numerous products and services on the Internet, in their blogs or on community bulletin boards. These unstructured data contain the authors' emotions and opinions about a product or service, which can provide important information for future product design or marketing. However, this text-based information cannot be evaluated quantitatively, so it is difficult to apply to mathematical models or optimization problems for product design and improvement. Therefore, this study proposes a method to quantitatively extract users' opinions or preferences about a specific product or service from the large volume of text-based information available online. The extracted unstructured text is decomposed into basic unit words, and a positive rate is evaluated using existing emotional dictionaries and additional lists proposed in this study. This can be a way to make effective use, in product or service design, of the unstructured text data being generated and stored in vast quantities. Finally, to verify the effectiveness of the proposed method, a case study was conducted using movie review data retrieved from a portal website. By comparing the positive rates calculated by the proposed framework with user ratings for movies, a guideline on text-mining-based evaluation of unstructured data is provided.

Exploring the Core Keywords of the Secondary School Home Economics Teacher Selection Test: A Mixed Method of Content and Text Network Analyses (중등학교 가정과교사 임용시험의 핵심 키워드 탐색: 내용 분석과 텍스트 네트워크 분석을 중심으로)

  • Mi Jeong, Park;Ju, Han
    • Human Ecology Research, v.60 no.4, pp.625-643, 2022
  • The purpose of this study was to explore the trends and core keywords of the secondary school home economics teacher selection test using content analysis and text network analysis. The sample comprised texts of the secondary school home economics teacher 1st selection test for the 2017-2022 school years. Determination of frequency of occurrence, generation of word clouds, centrality analysis, and topic modeling were performed using NetMiner 4.4. The key results were as follows. First, content analysis revealed that the number of questions and scores for each subject (field) has remained constant since 2020, unlike before 2020. In terms of subjects, most questions focused on 'theory of home economics education', and among the evaluation content elements, the highest percentage of questions asked was for 'home economics teaching·learning methods and practice'. Second, the network of the secondary school home economics teacher selection test covering the 2017-2022 school years has an extremely weak density. For the 2017-2019 school years, 'learning', 'evaluation', 'instruction', and 'method' appeared as important keywords, and 7 topics were extracted. For the 2020-2022 school years, 'evaluation', 'class', 'learning', 'cycle', and 'model' were influential keywords, and five topics were extracted. This study is meaningful in that it attempted a new research method combining content analysis and text network analysis and prepared basic data for the revision of the evaluation area and evaluation content elements of the secondary school home economics teacher selection test.

An Analysis of Linguistic Features in Science Textbooks across Grade Levels: Focus on Text Cohesion (과학교과서의 학년 간 언어적 특성 분석 -텍스트 정합성을 중심으로-)

  • Ryu, Jisu;Jeon, Moongee
    • Journal of The Korean Association For Science Education, v.41 no.2, pp.71-82, 2021
  • Learning efficiency can be maximized by careful matching of text features to expected reader features (i.e., linguistic and cognitive abilities, and background knowledge). The present study aims to explore whether this systematic principle is reflected in the development of science textbooks. The study examined science textbook texts on 20 measures provided by Auto-Kohesion, a Korean language analysis tool. In addition to the surface-level features (basic counts, word-related measures, syntactic complexity measures) commonly used in previous text analysis studies, the study also included cohesion-related features (noun overlap ratios, connectives, pronouns). The main findings demonstrate that the surface measures (e.g., word and sentence length, word frequency) overall increased in complexity with grade level, whereas the majority of the other measures, particularly cohesion-related measures, did not vary systematically across grade levels. The results suggest that students in lower grades are likely to experience learning difficulties and lowered motivation because of challenging texts. The textbooks are also unlikely to help students in higher grades develop the ability to process texts at the difficulty level required for higher education. The study suggests that various text-related features, including cohesion-related measures, need to be carefully considered in the process of textbook development.

Validity Analysis on Writing Directions and Content Development of Texts for 'Invention and Problem Solving' ('발명과 문제해결'의 집필 방향과 교재 내용에 대한 타당도 분석)

  • Lee, Byung-Wook;Choi, Yu-Hyun;Kim, Taehoon;Kang, Kyoung-Kyoon
    • 대한공업교육학회지, v.34 no.1, pp.155-170, 2009
  • This study examines the contents and writing directions of a textbook for "invention and problem solving" and analyzes their validity, in order to develop a textbook for the advanced courses of specialized high schools of invention and patents. Literature research and professional association meetings were used to develop the textbook contents and writing directions, and survey research was used to verify their validity. The survey subjects were seventy-five teachers who participated in the training course for invention leaders hosted by the International Intellectual Property Training Institute (IIPTI) of the Korean Intellectual Property Office (KIPO). To examine the validity of the writing directions and of each area, theme, and module of the text, questionnaires were developed that consisted of multiple-choice questions and open questions in which participants could describe their opinions. Textbook writing plans were included in the questionnaires to aid understanding of the textbook contents. The conclusions drawn from the validity analysis are as follows. First, each theme and module of 'invention and problem solving' was properly developed for a common textbook for the advanced course of specialized high schools of invention and patents. Second, regarding the writing direction of 'invention and problem solving', the textbook emphasizes research ability and creative thinking, and it was developed to help increase critical thinking, logical thinking, and problem-solving ability.