• Title/Summary/Keyword: Text data

An Autobiographical Narrative Inquiry on the Process of Becoming-Scientist for Science Teachers (과학교사의 과학연구자-되기 과정에 관한 자서전적 내러티브 탐구)

  • Kwan-Young Kim;Sang-Hak Jeon
    • Journal of The Korean Association For Science Education / v.43 no.4 / pp.369-387 / 2023
  • This study aims to interpret the experience of science research in a graduate school laboratory from the perspective of Gilles Deleuze's concepts of "agencement" and "becoming". The research was conducted as an autobiographical narrative inquiry. The research text tells the story of my science research experience and retells it from the perspective of Gilles Deleuze. In Deleuze's view, science research is a constantly flowing agencement. The science research agencement is composed of a mechanical agencement of various experimental tools-machines and researcher-machines as well as a collective agencement of speech acts such as biological knowledge, experiment protocols, and laboratory rules. Furthermore, the science research agencement is fluid, as events occur all over the agencement. Data, as a change occurring in the material dimension, is an event and a sign that raises problems. It has the agency to influence the agencement through an intersubjective relationship with researchers, and the meaning of data is generated in this process. The change of agencement compelled me to perform science practice. I have performed science practice repeatedly, meaning that my body has constantly been connected to other machines. As a result of this connection, my body has been affected, and the capacity of my body that constitutes the agencement has been augmented. In addition, I was able to be deterritorialized from the existing science research agencement and reterritorialized in a new science research agencement with data. This process of differentiation enabled my becoming-scientist. In sum, this study provides implications for science practice-oriented education by exploring the process of becoming-scientist based on my science research experience.

Liaohe National Park based on big data visualization Visitor Perception Study

  • Qi-Wei Jing;Zi-Yang Liu;Cheng-Kang Zheng
    • Journal of the Korea Society of Computer and Information / v.28 no.4 / pp.133-142 / 2023
  • National parks are one of the important types of protected area management systems established by the IUCN and a management model for the effective conservation and sustainable use of natural and cultural heritage in countries around the world. They play important roles in conservation, scientific research, education, recreation, and community development. In the context of big data, this study takes China's Liaohe National Park, a typical representative of global coastal wetlands, as a case study and uses Python to collect tourists' travelogues and reviews from major OTA websites in China as source text. The text spans 2015 to 2022 and contains 2,998 reviews with 166,588 words in total. The results show that wildlife resources, natural landscape, wetland ecology, and the fishing and hunting culture of northern China are fully reflected in the perceptions of visitors to Liaohe National Park. Visitors have strong positive feelings toward the park, but there is still much room for improvement in supporting services and facilities, public education, and visitor experience and participation.

GPT-enabled SNS Sentence writing support system Based on Image Object and Meta Information (이미지 객체 및 메타정보 기반 GPT 활용 SNS 문장 작성 보조 시스템)

  • Dong-Hee Lee;Mikyeong Moon;Bong-Jun, Choi
    • Journal of the Institute of Convergence Signal Processing / v.24 no.3 / pp.160-165 / 2023
  • In this study, we propose an SNS sentence writing assistance system that uses YOLO and GPT to help users write texts that accompany images, such as SNS posts. We use the YOLO model to extract objects from images inserted during writing, extract meta-information such as GPS information and creation time, and use both as prompt values for GPT. To use the YOLO model, we trained it on form image data; the mAP score of the model is about 0.25 on average. GPT was trained on 1,000 blog texts on the topic of 'restaurant reviews', and the resulting model was used to generate sentences from two types of keywords extracted from the images. A closed-ended survey was conducted to evaluate the practicality of the generated sentences so that the results could be analyzed clearly. The questionnaire presented the inserted image and the keyword-based sentences and comprised three evaluation items. The results showed that the keywords in the images generated meaningful sentences. Through this study, we found that the accuracy of image-based sentence generation depends on the relationship between the image keywords and the content GPT was trained on.

Automatic scoring of mathematics descriptive assessment using random forest algorithm (랜덤 포레스트 알고리즘을 활용한 수학 서술형 자동 채점)

  • Inyong Choi;Hwa Kyung Kim;In Woo Chung;Min Ho Song
    • The Mathematical Education / v.63 no.2 / pp.165-186 / 2024
  • Despite the growing attention on artificial intelligence-based automated scoring technology as a support method for the introduction of descriptive items in school environments and large-scale assessments, there is a noticeable lack of foundational research in mathematics compared to other subjects. This study developed an automated scoring model for two descriptive items in first-year middle school mathematics using the Random Forest algorithm, evaluated its performance, and explored ways to enhance this performance. The accuracy of the final models for the two items ranged from 0.95 to 1.00 and from 0.73 to 0.89, respectively, which is relatively high compared to automated scoring models in other subjects. We discovered that the strategic selection of the number of evaluation categories, taking into account the amount of data, is crucial for the effective development and performance of automated scoring models. Additionally, text preprocessing by mathematics education experts proved effective in improving both the performance and interpretability of the automated scoring model. Selecting a vectorization method that matches the characteristics of the items and data was identified as one way to enhance model performance. Furthermore, we confirmed that oversampling is a useful method to supplement performance in situations where practical limitations hinder balanced data collection. To enhance educational utility, further research is needed on how to utilize feature importance derived from the Random Forest-based automated scoring model to generate useful information for teaching and learning, such as feedback. This study is significant as foundational research in the field of automated scoring of descriptive mathematics items, and various follow-up studies are needed through close collaboration between AI experts and mathematics education experts.
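The oversampling step mentioned in the abstract above can be illustrated with a minimal sketch. The abstract does not say which oversampling method the authors used; the version below assumes plain random oversampling (duplicating minority-class samples), and the function name `random_oversample` and the toy answers and scores are hypothetical.

```python
import random
from collections import Counter


def random_oversample(samples, labels, seed=0):
    """Duplicate minority-class samples at random until every class
    is as large as the largest class (plain random oversampling)."""
    rng = random.Random(seed)
    by_label = {}
    for sample, label in zip(samples, labels):
        by_label.setdefault(label, []).append(sample)
    target = max(len(group) for group in by_label.values())

    out_samples, out_labels = [], []
    for label, group in by_label.items():
        # Draw extra copies (with replacement) only for under-represented classes.
        extra = [rng.choice(group) for _ in range(target - len(group))]
        for sample in group + extra:
            out_samples.append(sample)
            out_labels.append(label)
    return out_samples, out_labels


# Hypothetical scored answers: score category 2 is rare.
answers = ["x=3", "x = 3", "3", "x=3 because 2x=6", "no answer", "x=5"]
scores = [1, 1, 1, 2, 0, 0]

balanced_answers, balanced_scores = random_oversample(answers, scores)
print(Counter(balanced_scores))  # every score category now has 3 samples
```

In practice, oversampling is normally applied to the training split only, so that duplicated answers do not leak into the evaluation data.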

A Study on the Perception of Pit and Fissure Sealant using Unstructured Big Data (비정형 빅데이터를 이용한 치면열구전색(치아홈메우기)에 대한 인식분석)

  • Han-A Cho
    • Journal of Korean Dental Hygiene Science / v.6 no.2 / pp.101-114 / 2023
  • Background: This study aimed to explore the overall perception of pit and fissure sealants and suggest methods to revitalize their currently stagnant use. Methods: To determine the social perception of changes in coverage policy for pit and fissure sealants, we divided the study span into five time periods: the first period (December 1, 2009 to November 30, 2010), the second period (December 1, 2010 to September 30, 2012), the third period (October 1, 2012 to May 5, 2013), the fourth period (May 6, 2013 to September 30, 2017), and the fifth period (October 1, 2017 to December 31, 2022). We utilized text mining, an unstructured big data analysis method. Keywords were collected and analyzed using Textom, and frequency analysis of the top 30 keywords, analysis of the structural features of the semantic network, centrality analysis, QAP correlation analysis, and co-occurrence analysis were conducted. Results: The frequency analysis showed that the top keywords for each time period were 'Cavities', 'Treatment', and 'Children'. In the structural features of the semantic network of pit and fissure sealants by time period, the density index was found to be around 1.00 for all time periods. The QAP correlation analysis showed the highest correlation between the first and second periods and between the fourth and fifth periods, with a correlation coefficient of 0.834. The co-occurrence analysis showed that 'cavities' and 'prevention' were the top two words across all time periods. Conclusion: This study showed that pit and fissure sealants are well accepted by society as a preventive treatment for caries. However, awareness of health education related to these sealants was found to be low. Efforts to revitalize the stagnant use of pit and fissure sealants need to be strengthened with effective education.
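To make the density index reported above concrete, here is a minimal sketch of how the density of a keyword co-occurrence network can be computed. It assumes the standard definition for an undirected graph, 2E / (N(N - 1)); how Textom computes its density index is not stated in the abstract, and the toy keyword lists are hypothetical.

```python
from itertools import combinations


def cooccurrence_edges(documents):
    """Return the set of undirected keyword pairs that appear
    together in at least one document."""
    edges = set()
    for keywords in documents:
        # Sorting makes (a, b) and (b, a) map to the same tuple.
        for a, b in combinations(sorted(set(keywords)), 2):
            edges.add((a, b))
    return edges


def network_density(documents):
    """Density = observed edges / possible edges = 2E / (N * (N - 1))."""
    nodes = {k for keywords in documents for k in keywords}
    n = len(nodes)
    if n < 2:
        return 0.0
    return 2 * len(cooccurrence_edges(documents)) / (n * (n - 1))


# Hypothetical keyword lists, one per collected document.
docs = [
    ["cavities", "prevention", "children"],
    ["cavities", "treatment"],
    ["prevention", "treatment", "children"],
]
print(network_density(docs))  # 1.0: every keyword pair co-occurs somewhere
```

A density near 1.00 therefore means that almost every pair of the analyzed keywords appears together at least once.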

Analysis of the AI Convergence Science Education Research Trends Using Text Mining (텍스트 마이닝을 활용한 AI융합 과학교육 연구 동향 분석)

  • Lee, Ju-Young
    • Journal of Korean Elementary Science Education / v.43 no.4 / pp.544-553 / 2024
  • The purpose of this study was to analyze trends in research on artificial intelligence and science education and to derive important problems, topics, and research trends. The analysis targeted 83 articles on the awareness of artificial intelligence, research trends, and the design, development, and application of education programs related to artificial intelligence. The data were collected through RISS and refined using Excel and Textom, and the main keywords were identified and analyzed through frequency analysis and keyword network analysis. The connection centrality of the keywords was confirmed using CONCOR analysis. The results showed that AI convergence science education research was expanding in both quantitative and qualitative terms, and that the main keywords were 'AI,' 'AI convergence education,' 'AI convergence science education,' 'AI education,' 'science education,' 'science,' 'machine learning,' 'elementary school,' 'generative AI,' and 'educational program.' Through the connection centrality analysis and CONCOR analysis, it was confirmed that clusters formed around 'naming,' 'content and method,' 'elementary,' and 'data' in AI integrated science education. Based on the results, the main topics and trends of research integrating artificial intelligence into science subjects were derived, and implications and directions for follow-up research were set forth.

Development of a Large-scale Korean Language Model in the Field of Geosciences (지질과학 분야 한국어 대규모 언어 모델 개발)

  • Sang-ho Lee
    • Economic and Environmental Geology / v.57 no.5 / pp.539-550 / 2024
  • With the rapid development and commercialization of large-scale generative language models, concerns regarding the appropriateness of model outputs, expertise, and data security have emerged. In particular, Korean generative language models specialized in the field of geoscience have not yet been studied due to difficulties in data processing and preprocessing and a lack of development cases. This study carried out the entire process of developing a Korean language model specialized in the field of geoscience and evaluated its applicability in related fields. To achieve this, academic data related to geoscience were collected and preprocessed to create a dataset suitable for training the language model. The dataset was used to train the Llama2 model. The trained model was quantitatively evaluated using 19 different evaluation datasets from various fields. The results demonstrated improved functionality in scientific question-answering and Korean text interpretation compared to the original model. The language model developed through this study can potentially enhance research productivity in the field of geoscience, offering benefits such as idea generation. The outcomes of this study are expected to stimulate further research on and utilization of generative language models in geoscience.

Analysis of Twitter for 2012 South Korea Presidential Election by Text Mining Techniques (텍스트 마이닝을 이용한 2012년 한국대선 관련 트위터 분석)

  • Bae, Jung-Hwan;Son, Ji-Eun;Song, Min
    • Journal of Intelligence and Information Systems / v.19 no.3 / pp.141-156 / 2013
  • Social media is a representative form of Web 2.0 that shapes changes in users' information behavior by allowing users to produce their own content without any expert skills. In particular, as a new communication medium, it has a profound impact on social change by enabling users to communicate their opinions and thoughts to the masses and to acquaintances. Social media data plays a significant role in the emerging Big Data arena. A variety of research areas, such as social network analysis and opinion mining, have therefore paid attention to discovering meaningful information from the vast amounts of data buried in social media. Social media has recently become a main focus of the fields of Information Retrieval and Text Mining because it not only produces massive unstructured textual data in real time but also serves as an influential channel for opinion leading. However, most previous studies have adopted broad-brush and limited approaches, which have made it difficult to find and analyze new information. To overcome these limitations, we developed a real-time Twitter trend mining system that captures trends by processing big stream datasets from Twitter in real time. The system offers term co-occurrence retrieval, visualization of Twitter users by query, similarity calculation between two users, topic modeling to keep track of changes in topical trends, and mention-based user network analysis. In addition, we conducted a case study on the 2012 Korean presidential election. We collected 1,737,969 tweets mentioning the candidates' names and the election on Twitter in Korea (http://www.twitter.com/) for one month in 2012 (October 1 to October 31). The case study shows that the system provides useful information and detects societal trends effectively. The system also retrieves the list of terms that co-occur with given query terms.
We compare the results of term co-occurrence retrieval using the names of influential candidates, 'Geun Hae Park', 'Jae In Moon', and 'Chul Su Ahn', as query terms. General terms related to the presidential election, such as 'Presidential Election', 'Proclamation in Support', and 'Public opinion poll', appear frequently. The results also show specific terms that differentiate each candidate, such as 'Park Jung Hee' and 'Yuk Young Su' for the query 'Geun Hae Park'; 'a single candidacy agreement' and 'Time of voting extension' for the query 'Jae In Moon'; and 'a single candidacy agreement' and 'down contract' for the query 'Chul Su Ahn'. Our system not only extracts 10 topics along with related terms but also shows the topics' dynamic changes over time by employing the multinomial Latent Dirichlet Allocation technique. Each topic shows one of two patterns, a rising tendency or a falling tendency, depending on the change in its probability distribution. To determine the relationship between topic trends on Twitter and social issues in the real world, we compare topic trends with related news articles. We find that Twitter can track issues faster than the other medium, newspapers. The user network on Twitter differs from those of other social media because of the distinctive way relationships are formed on Twitter: users can form relationships by exchanging mentions. We visualize and analyze mention-based networks of 136,754 users, using the three candidates' names, 'Geun Hae Park', 'Jae In Moon', and 'Chul Su Ahn', as query terms. The results show that Twitter users mention all candidates' names regardless of their political tendencies. This case study shows that Twitter could be an effective tool for detecting and predicting dynamic changes in social issues, and that mention-based user networks, as a network type unique to Twitter, could reveal different aspects of user behavior.
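The term co-occurrence retrieval described above can be sketched in a few lines. This is not the authors' system: the function name `cooccurring_terms`, the whitespace tokenization, and the placeholder tweets are all assumptions made for illustration (real Korean tweets would need a morphological analyzer rather than `str.split`).

```python
from collections import Counter


def cooccurring_terms(tweets, query, top_n=3):
    """Count the terms that appear in the same tweet as `query`
    and return the `top_n` most frequent ones."""
    counts = Counter()
    for tweet in tweets:
        # A set counts each term at most once per tweet.
        tokens = set(tweet.lower().split())
        if query in tokens:
            counts.update(tokens - {query})
    return counts.most_common(top_n)


# Hypothetical, already-tokenized tweets with placeholder candidate names.
tweets = [
    "candidate_a poll debate",
    "candidate_a poll",
    "candidate_b debate",
    "candidate_a poll rally",
]
print(cooccurring_terms(tweets, "candidate_a", top_n=1))  # [('poll', 3)]
```

Running the same query for each candidate and comparing the resulting lists separates the shared election vocabulary from the candidate-specific terms.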

Implementation of Realtime B2B System using Mobile Terminal (모바일 단말기를 이용한 실시간 B2B 시스템 구현)

  • Lee Hyae-Jung;Joung Suck-Tae
    • Journal of the Korea Institute of Information and Communication Engineering / v.10 no.1 / pp.1-6 / 2006
  • Before business processes in industry were computerized, recording everything by hand in account books was inefficient. Recently, many companies have become more competitive by improving work efficiency and cutting costs through electronic data processing systems. Moreover, the sudden increase in Internet users calls for effective management information systems and for information sharing over network connections. Accordingly, it is necessary to move from text-based information systems to multimedia-based systems and to introduce an e-catalog system for merchandise information. In addition, technological development and merchandising that support mobility, real-time operation, ubiquity, and portability are required. In this paper, we build a mobile B2B system for the precious metals and jewelry field using portable terminals and mobile technology.

A Study on Implementation of Writing Supporting System(ICWS) for Interactive Storytelling Contents (인터렉티브 스토리텔링 콘텐츠 저작지원도구 설계 및 구현에 관한 연구)

  • Lee, Eun Ryoung;Kim, Kio Chung
    • Journal of Digital Convergence / v.11 no.2 / pp.263-269 / 2013
  • This paper applies a writing support system to the writing-tool data model for interactive storytelling about family stories that was developed in a previous study. The family story writing support system enables users to create text, images, videos, and other digital content based on experiential knowledge collected from the first and second generations. This study of a writing tool system for family stories aims to create high-quality, documentary-based content about each family member and the family's history while overcoming generation gaps and the lack of authoring infrastructure. Through this process, the authors hope to contribute to the expansion of authoring tools that can be applied in other research and writing tools.