• Title/Summary/Keyword: 텍스트 연구 (text research)

A study of Artificial Intelligence (AI) Speaker's Development Process in Terms of Social Constructivism: Focused on the Products and Periodic Co-revolution Process (인공지능(AI) 스피커에 대한 사회구성 차원의 발달과정 연구: 제품과 시기별 공진화 과정을 중심으로)

  • Cha, Hyeon-ju;Kweon, Sang-hee
    • Journal of Internet Computing and Services / v.22 no.1 / pp.109-135 / 2021
  • This study classified the development process of artificial intelligence (AI) speakers by analyzing news texts about AI speakers in traditional news reports, and identified the characteristics of each product by period. The theoretical background of the analysis is news frames and topic frames. The research method was content analysis, using topic modeling with the LDA method and semantic network analysis. First, 2,710 news articles related to AI speakers published from 2014 to 2019 were collected; second, topic frames were analyzed using NodeXL. The first result is that the trend of topic frames by AI speaker provider type differed according to the characteristics of the four operator types (communication service provider, online platform, OS provider, and IT device manufacturer). Specifically, online platform operators (Google, Naver, Amazon, Kakao) were associated with a frame that uses AI speakers as 'search or input devices'. On the other hand, telecommunications operators (SKT, KT) showed prominent frames for IPTV, which is the parent company's flagship business, and for an 'auxiliary device' of the telecommunication business. Furthermore, the frame of 'personalization of products and voice service' was prominent for OS operators (MS, Apple), and the frame for IT device manufacturers (Samsung) was 'Internet of Things (IoT) Integrated Intelligence System'. The second result is that the trend of the topic frame by AI speaker development period (by year) centered on AI technology in the first phase (2014-2016), was related to the social interaction between AI technology and users in the second phase (2017-2018), and shifted from AI technology-centered to user-centered in the third phase (2019). 
As a result of QAP analysis, it was found that news frames by business operator and development period in AI speaker development are socially constructed by determinants of media discourse. The study implies that AI speakers evolved through the characteristics of the parent company and through a process of co-evolution driven by interactions with users, by business operator and development period. These results are important indicators for predicting the future prospects of AI speakers and presenting directions accordingly.

A Study on the Causality of Technology Culture of East Asian Roof Tile Making Technology Since the 17th Century (17세기 이후 동아시아 제와(製瓦)의 기술문화적 인과성)

  • Kim, Hajin
    • Korean Journal of Heritage: History & Science / v.52 no.3 / pp.56-73 / 2019
  • This paper aims to establish the technical style of roof tiles by analyzing East Asian roof tile making techniques. It examines the existing main research data, such as excavation results and the subsequent analysis of the roof tiles' production traces, as well as references and transmitted techniques. Regions are grouped according to technical similarity, then grouped again by artistic styles of pattern and shape and by the technical styles of tools, procedures, and manpower plans. Accordingly, the paper intends to find out whether an understanding of technical style can facilitate an understanding of not only cultural aspects, but also the causality of techniques. Korean, Chinese and Japanese tools were examined, and procedures for making roof tiles were classified into 4 groups. Superficially, China, Okinawa, Korea, and Honshu share similar technical traits. Research of procedural details and manpower plans revealed characteristics of each region. Comparisons were then made between each region's technical characteristics in an attempt to investigate their causes. The groups were classified according to the techniques they possess, but it was revealed that East Asia's shared production techniques were based on architectural methods. The skill of "Pyeon Jeol (Clay Cutting)", classified according to its possessing techniques, turned out to be one such technique. Also, the procedure of technical localization based on the skill of "Ta-nal (Tapping)" showed that the condition of this technique was the power to localize in response to a transfer of techniques. Previous comparison parameters of artifacts would have been a similarity of style originating from exchanges between regions and stylistic characteristics of regions decided by the demander's taste of beauty. This methodology enlarges cultural perception and affords a positive basis of historical facts. 
However, it suggests the possibility of finding the origins of cultural aspects by understanding the technical style and seeing the same result in view of "technology culture."

Exploring Changes in Science PCK Characteristics through a Family Resemblance Approach (가족유사성 접근을 통한 과학 PCK 변화 탐색)

  • Kwak, Youngsun
    • Journal of the Korean Society of Earth Science Education / v.15 no.2 / pp.235-248 / 2022
  • With the changes in the future educational environment, such as the rapid decline of the school-age population and the expansion of students' choice of curriculum, changes are also required in PCK, the expertise of science teachers. In other words, the categories constituting the existing 'consensus-PCK' and the characteristics of 'science PCK' are not fixed, so more categories and characteristics can be added. The purpose of this study is to explore the potential area of science PCK required to cope with changes in the future educational environment in the form of 'Family Resemblance Science PCK (Family Resemblance-PCK, hereafter)' through Wittgenstein's family resemblance approach. For this purpose, in-depth interviews were conducted with three focus groups. In the focus group in-depth interview, participants discussed how the science PCK required for science teachers in future schools in 2030-2045 will change due to changes in the future society and educational environment. Qualitative analysis was performed based on the in-depth interview, and semantic network analysis was performed on the in-depth interview text to analyze the characteristics of 'Family Resemblance-PCK' differentiated from the existing 'consensus-PCK'. In results, the characteristics of Family Resemblance-PCK, which are newly requested along with changes in role expectations of science teachers, were examined by PCK area. As a result of semantic network analysis of Family Resemblance-PCK, it was found that Family Resemblance-PCK expands its boundaries from the existing consensus-PCK, which is the starting point, and new PCK elements were added. 
Looking at the aspects of Family Resemblance-PCK, [AI-Convergence Knowledge-Contents-Digital], [Community-Network-Human Resources-Relationships], [Technology-Exploration-Virtual Reality-Research], [Self-Directed Learning-Collaboration-Community], etc., form a distinct network cluster, and it is expected that future science teacher expertise will be formed and strengthened around these PCK areas. Based on the research results, changes in the professionalism of science teachers in future schools and countermeasures were proposed as a conclusion.

Semantic Visualization of Dynamic Topic Modeling (다이내믹 토픽 모델링의 의미적 시각화 방법론)

  • Yeon, Jinwook;Boo, Hyunkyung;Kim, Namgyu
    • Journal of Intelligence and Information Systems / v.28 no.1 / pp.131-154 / 2022
  • Recently, research on unstructured data analysis has been actively conducted with the development of information and communication technology. In particular, topic modeling is a representative technique for discovering core topics from massive text data. In the early stages of topic modeling, most studies focused only on topic discovery. As the topic modeling field matured, studies on how topics change over time began to be carried out. Accordingly, interest in dynamic topic modeling, which handles changes in the keywords constituting a topic, is also increasing. Dynamic topic modeling identifies major topics from the data of the initial period and manages the change and flow of topics by using topic information from the previous period to derive further topics in subsequent periods. However, it is very difficult to understand and interpret the results of dynamic topic modeling. The results of traditional dynamic topic modeling simply reveal changes in keywords and their rankings, and this information is insufficient to represent how the meaning of the topic has changed. Therefore, in this study, we propose a method to visualize topics by period by reflecting the meaning of keywords in each topic. In addition, we propose a method that can intuitively interpret changes in topics and relationships between or among topics. The detailed method of visualizing topics by period is as follows. In the first step, dynamic topic modeling is implemented to derive the top keywords of each period and their weights from text data. In the second step, we derive vectors of the top keywords of each topic from the pre-trained word embedding model. Then, we perform dimension reduction on the extracted vectors. Then, we formulate a semantic vector of each topic by calculating the weighted sum of the keyword vectors, using the topic weight of each keyword. 
In the third step, we visualize the semantic vector of each topic using matplotlib, and analyze the relationship between or among the topics based on the visualized result. The change of a topic can be interpreted in the following manner. From the result of dynamic topic modeling, we identify the top 5 rising keywords and the top 5 descending keywords for each period to show the change of the topic. Many existing topic visualization studies visualize the keywords of each topic, but the approach proposed in this study differs in that it attempts to visualize each topic itself. To evaluate the practical applicability of the proposed methodology, we performed an experiment on 1,847 abstracts of artificial intelligence-related papers, divided into three periods (2016-2017, 2018-2019, 2020-2021). We selected seven topics based on the consistency score, and utilized a pre-trained Word2vec word embedding model trained on 'Wikipedia', an Internet encyclopedia. Based on the proposed methodology, we generated a semantic vector for each topic. Through this, by reflecting the meaning of keywords, we visualized and interpreted the topics by period. Through these experiments, we confirmed that the rise and fall of the topic weight of a keyword can be used to interpret the semantic change of the corresponding topic and to grasp the relationship among topics. In this study, to overcome the limitations of dynamic topic modeling results, we used word embedding and dimension reduction techniques to visualize topics by period. The results of this study are meaningful in that they broadened the scope of topic understanding through the visualization of dynamic topic modeling results. 
In addition, the academic contribution can be acknowledged in that it laid the foundation for follow-up studies using various word embeddings and dimensionality reduction techniques to improve the performance of the proposed methodology.
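
The second step of the abstract above (a topic's semantic vector as a weighted sum of keyword vectors) and the rising/descending keyword comparison can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the 2-dimensional vectors (standing in for embeddings after dimension reduction), and the keyword weights are all hypothetical toy values.

```python
def topic_vector(keyword_weights, embeddings):
    """Weighted sum of keyword vectors for one topic in one period."""
    dim = len(next(iter(embeddings.values())))
    vec = [0.0] * dim
    for word, weight in keyword_weights.items():
        if word not in embeddings:
            continue  # keywords without an embedding are skipped (assumption)
        for i, x in enumerate(embeddings[word]):
            vec[i] += weight * x
    return vec


def rising_and_descending(prev, curr, k=5):
    """Keywords whose topic weight rose or fell most between two periods."""
    words = set(prev) | set(curr)
    delta = {w: curr.get(w, 0.0) - prev.get(w, 0.0) for w in words}
    ranked = sorted(delta, key=lambda w: (-delta[w], w))
    rising = [w for w in ranked[:k] if delta[w] > 0]
    descending = [w for w in reversed(ranked[-k:]) if delta[w] < 0]
    return rising, descending


# Hypothetical reduced embeddings and keyword weights for one topic.
embeddings = {"network": [1.0, 0.0], "vision": [0.0, 1.0], "speech": [1.0, 1.0]}
period_1 = {"network": 0.5, "vision": 0.25, "speech": 0.25}
period_2 = {"network": 0.25, "vision": 0.5, "speech": 0.25}

vec_1 = topic_vector(period_1, embeddings)
vec_2 = topic_vector(period_2, embeddings)
rising, descending = rising_and_descending(period_1, period_2)
print(vec_1, vec_2, rising, descending)
```

Plotting `vec_1` and `vec_2` as points (for example with matplotlib, as the abstract mentions) would then show how the topic moved between periods.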

A comparative study on Diaspora consciousness of polish emigrants before and after the transformation of the political system reflected in the polish literary works (2) (체제전환 이전과 이후 폴란드 문학에 나타난 폴란드 이민자들의 디아스포라적 의식 비교 연구 (2))

  • Choi, Sung Eun
    • East European & Balkan Studies / v.35 / pp.153-186 / 2013
  • Literature has held a special place for the Polish people, who suffered numerous invasions from surrounding countries because of Poland's geographical location at the center of Europe. In the late 18th century, when Poland was divided and ruled by Russia, Prussia and Austria, literature played an important role in uniting Poland. During the Second World War, in which Poland was occupied by the Soviet Union and by Germany, and during the Cold War period under the socialist system (1948-1989), Polish literature was at the forefront of preserving a unique national culture, with the overseas migrant community at its center. Polish Diaspora literature from the 19th century up to now has naturally embodied national sufferings from foreign powers in a literary tradition linked to the problem of 'migration'. In addition, its writers belong to another cultural sphere but keep their own unique identity, which is similar to Korean Diaspora literature to a great degree. This study has two stages. In the first stage, it outlines the formation and trends of Polish Diaspora literature and their meaning in the history of Polish literature. In the second stage, specific texts (two dramas) from before and after the system transition in 1989 are analyzed: before the transition, S. Mrożek, Emigranci (1974); after the transition, J. Głowacki, Antygona w Nowym Yorku (1992). Mrożek and Głowacki both had migration experiences themselves and earned high achievement and recognition in literature not only in Poland but also around the world. Their works vividly describe hardships as 'strangers' in foreign countries, emotional wandering and agony, nostalgia for the lost homeland, and the exploration of identity. By comparing the two literary texts, this study attempts to trace the change in the Diaspora consciousness that Polish migrants experienced in foreign countries with different political systems, such as socialism and capitalism.

History and Characteristics of Risk Perception and Response Related to Science: Focused on Blood Pressure (과학에 관련된 위험 인식과 대응의 역사와 특징 -혈압을 중심으로-)

  • Wonbin Jang;Minchul Kim
    • Journal of The Korean Association For Science Education / v.43 no.6 / pp.549-562 / 2023
  • Contemporary society is in the VUCA era, in which various risks produced by humans spread along with the development of science and technology. There is a need to raise citizens' risk literacy to strengthen their everyday preparedness to respond to these risks. To this end, it is necessary to reconsider the role of science education so that risks can be perceived and responded to scientifically and objectively. Accordingly, in order to investigate the role of science education in a risk society, this study reviewed the history of risk perception and response related to science and analyzed its characteristics. In this process, perception of and response to risks arising from blood pressure were analyzed in three contexts (historical context, curriculum context, textbook context). For the historical context, journals registered in SCIE were selected as research subjects from among journals publishing research on the history of knowledge of the heart and cardiovascular system. Papers with the keywords 'hypertension' and 'history' were selected from the journals, and changes in perception and responses related to blood pressure were compared and analyzed by period. For the curriculum context, the 1st national curriculum through the 2022 revised curriculum were analyzed, and content elements and achievement standards related to blood pressure were compared. It was confirmed that risks arising from blood pressure were not included from the 1st to the 6th national curriculum, and that they have been included since the 7th national curriculum (excluding the 2009 revised curriculum). For the textbook context, the 7th national curriculum BiologyⅠ, the 2015 revised curriculum Life ScienceⅠ, and Health were selected; through text mining, keywords representing the curricula and textbooks were selected, and the presentation of risk perception and response was analyzed based on these keywords. 
In addition, by analyzing the figures and tables presented in the textbooks, the characteristics of risk perception and risk response were derived. This study is meaningful in that it confirmed the role of risk perception and response in science education.

Degree of Self-Understanding Through "Self-Guided Interpretation" in Yeoncheon, Hantan River UNESCO Geopark: Focusing on Readability and Curriculum Relevance (한탄강 세계지질공원 연천 지역의 자기-안내식 해설 매체를 통한 스스로 이해 가능 정도: 이독성과 교육과정 관련성을 중심으로)

  • Min Ji Kim;Chan-Jong Kim;Eun-Jeong Yu
    • Journal of the Korean earth science society / v.44 no.6 / pp.655-674 / 2023
  • This study examined whether the "self-guided interpretation" media in the Yeoncheon area of the Hantangang River UNESCO Geopark are intelligible to visitors. To this end, two on-site investigations were conducted in the Hantangang River Global Geopark in September and November 2022. The Yeoncheon area, known for its diverse geological features and the era of geological attraction formation, was selected for analysis. We analyzed the readability levels, graphic characteristics, and alignment with the science curriculum of the interpretive media specific to geological sites among a total of 36 self-guided interpretive media in the Yeoncheon area. Results indicated that information boards, primarily offering guidance on geological attractions, were the most prevalent type of interpretive media in the Yeoncheon area. The quantity of text in explanatory media surpassed that of a 12th-grade science textbook. The average vocabulary grade was similar to that of 11th- and 12th-grade science textbooks, with somewhat reduced readability due to a high occurrence of complex sentences. Predominant graphic types included illustrative photographs, aiding comprehension of the geological formation process through multi-structure graphics. Regarding scientific terms used in the interpretive media, 86.3% of the terms were within the "Solid Earth" section of the 2015 revised curriculum, with the majority being at the 4th-grade level. The 11th-grade optional curriculum terms comprised the second largest portion, and 13.7% of all science terms were from outside the curriculum. Notably, the complexity of the scientific terminology varied by geological attraction. Specifically, the terminology level on the homepage tended to be higher than that on information boards. Through these findings, specific factors impeding visitor comprehension of geological attractions in the Yeoncheon area, based on the interpretation medium, were identified. 
We suggest further research to effect improvements in self-guided interpretation media, fostering geological resource education for general visitors and anticipating advancements in geology education.

A study on the classification of research topics based on COVID-19 academic research using Topic modeling (토픽모델링을 활용한 COVID-19 학술 연구 기반 연구 주제 분류에 관한 연구)

  • Yoo, So-yeon;Lim, Gyoo-gun
    • Journal of Intelligence and Information Systems / v.28 no.1 / pp.155-174 / 2022
  • From January 2020 to October 2021, more than 500,000 academic studies related to COVID-19 (Coronavirus-2, a fatal respiratory syndrome) were published. The rapid increase in the number of papers related to COVID-19 is putting time and technical constraints on healthcare professionals and policy makers who need to find important research quickly. Therefore, in this study, we propose a method of extracting useful information from the text data of extensive literature using the LDA and Word2vec algorithms. Papers related to the keywords to be searched were extracted from papers related to COVID-19, and detailed topics were identified. The data used was the CORD-19 data set on Kaggle, a free academic resource prepared by major research groups and the White House to respond to the COVID-19 pandemic and updated weekly. The research method is divided into two main parts. First, 41,062 articles were collected through data filtering and pre-processing of the abstracts of 47,110 academic papers including full text. For this purpose, the number of publications related to COVID-19 by year was analyzed through exploratory data analysis using a Python program, and the top 10 journals under active research were identified. The LDA and Word2vec algorithms were used to derive research topics related to COVID-19, and after analyzing related words, similarity was measured. Second, papers containing 'vaccine' and 'treatment' were extracted from among the topics derived from all papers, yielding a total of 4,555 papers related to 'vaccine' and 5,971 papers related to 'treatment'. For each collected paper, detailed topics were analyzed using the LDA and Word2vec algorithms, and a clustering method through PCA dimension reduction was applied to visualize groups of papers with similar themes using the t-SNE algorithm. A noteworthy point from the results of this study is that topics that were not derived from the topic modeling of all papers related to COVID-19 were found in the topic modeling results for each research topic. For example, as a result of topic modeling for papers related to 'vaccine', a new topic titled Topic 05 'neutralizing antibodies' was extracted. A neutralizing antibody is an antibody that protects cells from infection when a virus enters the body, and is said to play an important role in the production of therapeutic agents and vaccine development. In addition, as a result of extracting topics from papers related to 'treatment', a new topic called Topic 05 'cytokine' was discovered. A cytokine storm is when the immune cells of our body do not defend against attacks, but attack normal cells. Hidden topics that could not be found in the entire collection of papers were found by classifying papers according to keywords and performing topic modeling to find detailed topics. In this study, we proposed a method of extracting topics from a large amount of literature using the LDA algorithm and extracting similar words using Skip-gram, the Word2vec model that predicts surrounding words from the central word. The combination of the LDA model and the Word2vec model aimed at better performance by identifying both the relationship between documents and LDA topics and the relationships between documents captured by Word2vec. In addition, as a clustering method through PCA dimension reduction, we presented a way to classify documents intuitively, using the t-SNE technique to group documents with similar themes into a structured organization of documents. In a situation where the efforts of many researchers to overcome COVID-19 cannot keep up with the rapid publication of academic papers related to COVID-19, we hope this method will save the precious time and effort of healthcare professionals and policy makers and help them rapidly gain new insights. It is also expected to be used as basic data for researchers to explore new research directions.

An Analytical Approach Using Topic Mining for Improving the Service Quality of Hotels (호텔 산업의 서비스 품질 향상을 위한 토픽 마이닝 기반 분석 방법)

  • Moon, Hyun Sil;Sung, David;Kim, Jae Kyeong
    • Journal of Intelligence and Information Systems / v.25 no.1 / pp.21-41 / 2019
  • Thanks to the rapid development of information technologies, the data available on the Internet have grown rapidly. In this era of big data, many studies have attempted to offer insights and demonstrate the effects of data analysis. In the tourism and hospitality industry, many firms and studies have paid attention to online reviews on social media because of their large influence over customers. As tourism is an information-intensive industry, the effect of these information networks on social media platforms is more remarkable than that of any other type of media. However, there are some limitations to the improvements in service quality that can be made based on opinions on social media platforms. Users on social media platforms express their opinions as text, images, and so on, so raw data sets from these reviews are unstructured. Moreover, these data sets are too big for humans to extract new information and hidden knowledge from unaided. To use them for business intelligence and analytics applications, proper big data techniques such as Natural Language Processing and data mining are needed. This study suggests an analytical approach to directly yield insights from these reviews to improve the service quality of hotels. Our proposed approach consists of topic mining to extract topics contained in the reviews and decision tree modeling to explain the relationship between topics and ratings. Topic mining refers to a method for finding a group of words from a collection of documents that represents a document. Among several topic mining methods, we adopted the Latent Dirichlet Allocation (LDA) algorithm, which is considered the most universal algorithm. However, LDA alone is not enough to find insights that can improve service quality because it cannot find the relationship between topics and ratings. To overcome this limitation, we also use the Classification and Regression Tree (CART) method, which is a kind of decision tree technique. 
Through the CART method, we can find which topics are related to positive or negative ratings of a hotel and visualize the results. Therefore, this study aims to present an analytical approach for improving hotel service quality from unstructured review data sets. Through experiments on four hotels in Hong Kong, we find the strengths and weaknesses of services for each hotel and suggest improvements to aid customer satisfaction. From positive reviews in particular, we find what these hotels should maintain for service quality. For example, compared with the other hotels, one hotel has a good location and room condition, as extracted from its positive reviews. In contrast, from negative reviews we find what the hotels should change in their services. For example, one hotel should improve its room condition with respect to soundproofing. These results mean that our approach is useful in finding insights for the service quality of hotels. That is, from an enormous volume of review data, our approach can provide practical suggestions for hotel managers to improve their service quality. In the past, studies for improving service quality relied on surveys or interviews of customers. However, these methods are often costly and time consuming, and the results may be distorted by biased sampling or untrustworthy answers. The proposed approach directly obtains honest feedback from customers' online reviews and draws insights through a type of big data analysis, so it will be a more useful tool for overcoming the limitations of surveys or interviews. Moreover, our approach can easily obtain service quality information for other hotels or services in the tourism industry because it needs only open online reviews and ratings as input data. Furthermore, the performance of our approach will be better if other structured and unstructured data sources are added.
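
To illustrate how a CART-style tree could relate topics to ratings, the sketch below searches for the single split on per-review topic weights that minimizes the weighted Gini impurity of positive (1) versus negative (0) ratings. It is only a sketch under stated assumptions: the abstract does not give the authors' features or settings, and the topic columns ("location", "noise") and all values here are hypothetical.

```python
def gini(labels):
    """Gini impurity of a list of binary labels (1 = positive rating)."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 2.0 * p * (1.0 - p)


def best_split(rows, labels):
    """Return (feature_index, threshold, weighted_gini) of the best split."""
    best = None
    n = len(rows)
    for j in range(len(rows[0])):
        values = sorted({r[j] for r in rows})
        # Candidate thresholds are midpoints between adjacent distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0
            left = [y for r, y in zip(rows, labels) if r[j] <= t]
            right = [y for r, y in zip(rows, labels) if r[j] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[2]:
                best = (j, t, score)
    return best


# Hypothetical topic weights per review: [location, noise].
rows = [[0.75, 0.0], [0.5, 0.25], [0.25, 0.75], [0.0, 1.0]]
labels = [1, 1, 0, 0]  # 1 = positive rating, 0 = negative rating
split = best_split(rows, labels)
print(split)
```

A full CART tree would apply this search recursively to each side of the split; here a single split already separates the toy reviews by their "location" weight.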

Increasing Accuracy of Classifying Useful Reviews by Removing Neutral Terms (중립도 기반 선택적 단어 제거를 통한 유용 리뷰 분류 정확도 향상 방안)

  • Lee, Minsik;Lee, Hong Joo
    • Journal of Intelligence and Information Systems / v.22 no.3 / pp.129-142 / 2016
  • Customer product reviews have become one of the important factors in purchase decision making. Customers believe that reviews written by others who have already had an experience with the product offer more reliable information than that provided by sellers. However, because there are too many products and reviews, the advantage of e-commerce can be outweighed by increasing search costs. Reading all of the reviews to find out the pros and cons of a certain product can be exhausting. To help users find the most useful information about products without much difficulty, e-commerce companies try to provide various ways for customers to write and rate product reviews. To assist potential customers, online stores have devised various ways to provide useful customer reviews. Different methods have been developed to classify and recommend useful reviews to customers, primarily using feedback provided by customers about the helpfulness of reviews. Most shopping websites provide customer reviews and offer the following information: the average preference of a product, the number of customers who have participated in preference voting, and preference distribution. Most information on the helpfulness of product reviews is collected through a voting system. Amazon.com asks customers whether a review on a certain product is helpful, and it places the most helpful favorable and the most helpful critical review at the top of the list of product reviews. Some companies also predict the usefulness of a review based on certain attributes including length, author(s), and the words used, publishing only reviews that are likely to be useful. Text mining approaches have been used for classifying useful reviews in advance. To apply a text mining approach based on all reviews for a product, we need to build a term-document matrix. We have to extract all words from reviews and build a matrix with the number of occurrences of a term in a review. 
Since there are many reviews, the term-document matrix is very large, which makes it difficult to apply text mining algorithms. Thus, researchers need to delete some terms on the basis of sparsity, since sparse words have little effect on classifications or predictions. The purpose of this study is to suggest a better way of building the term-document matrix by deleting useless terms for review classification. In this study, we propose a neutrality index to select words to be deleted. Many words still appear in both classes (useful and not useful), and these words have little or negative effect on classification performance. Thus, we defined these words as neutral terms and deleted the neutral terms that appear similarly in both classes. After deleting sparse words, we selected words to be deleted in terms of neutrality. We tested our approach with Amazon.com's review data from five different product categories: Cellphones & Accessories, Movies & TV program, Automotive, CDs & Vinyl, Clothing, Shoes & Jewelry. We used reviews that received more than four votes from users, with a 60% ratio of useful votes among total votes as the threshold for classifying useful and not-useful reviews. We randomly selected 1,500 useful reviews and 1,500 not-useful reviews for each product category. We then applied Information Gain and Support Vector Machine algorithms to classify the reviews and compared the classification performance in terms of precision, recall, and F-measure. Though the performance varies according to product categories and data sets, deleting terms by both sparsity and neutrality showed the best performance in terms of F-measure for the two classification algorithms. However, deleting terms by sparsity only showed the best performance in terms of recall for Information Gain, and using all terms showed the best performance in terms of precision for SVM. 
Thus, care is needed in selecting term deletion methods and classification algorithms based on the data sets.
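
The labeling rule and the idea of a neutrality index can be sketched as follows. The vote rule (more than four votes, 60% useful-vote ratio) comes from the abstract, although treating exactly 60% as useful is an assumption. The abstract does not give the formula of the neutrality index, so the `neutrality` function below is a hypothetical stand-in based on how evenly a term's document frequency is spread across the two classes; the toy reviews are also made up.

```python
def label_review(helpful_votes, total_votes, min_votes=4, threshold=0.6):
    """Label a review from its votes; None means too few votes to use."""
    if total_votes <= min_votes:
        return None
    ratio = helpful_votes / total_votes
    # Assumption: a ratio of exactly 60% counts as useful.
    return "useful" if ratio >= threshold else "not_useful"


def neutrality(term, useful_docs, not_useful_docs):
    """Hypothetical index: 1.0 if a term is equally common in both classes,
    0.0 if it appears in only one class (or in neither)."""
    u = sum(term in doc for doc in useful_docs) / len(useful_docs)
    n = sum(term in doc for doc in not_useful_docs) / len(not_useful_docs)
    if u + n == 0:
        return 0.0
    return 1.0 - abs(u - n) / (u + n)


# Toy reviews represented as sets of terms.
useful_docs = [{"great", "battery", "phone"}, {"great", "phone"}]
not_useful_docs = [{"phone", "bad"}, {"phone"}]

vocabulary = set().union(*useful_docs, *not_useful_docs)
# Drop terms whose neutrality reaches a cutoff (0.9 is an arbitrary choice).
kept = sorted(t for t in vocabulary
              if neutrality(t, useful_docs, not_useful_docs) < 0.9)
print(kept)
```

With these toy reviews, "phone" appears in every review of both classes, so it is treated as neutral and dropped before the term-document matrix is built.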

