• Title/Summary/Keyword: TextMining


Analysis of Research Trends on Archival Information Services Using Text Mining (텍스트마이닝을 활용한 국내외 기록서비스 연구동향 분석)

  • Seohee Park;Hye-Eun Lee
    • Journal of Korean Society of Archives and Records Management / v.24 no.1 / pp.89-109 / 2024
  • The study analyzed the research trends of domestic and international record information services from 2003 to 2022. A total of 136 academic papers registered in the Korea Citation Index (KCI) and 74 from the Library, Information Science & Technology Abstracts (LISTA) were examined by quantitative and qualitative content analysis to understand the research status over 20 years from various angles, such as publication year, research type, researcher type, subject, and purpose. Frequency analysis, co-occurrence frequency analysis, centrality analysis, and topic modeling were performed by applying text mining techniques. Results showed that domestic papers followed a research flow focused on specific institutions or records, with user-centered satisfaction surveys and content-centered studies. Foreign papers, in contrast, included various evaluation-oriented and information provision studies covering data, resources, and collections, along with a research trend focusing on the relationship between archivists and users. The management of information resources was identified as a common topic in both domestic and foreign papers, but domestic research focuses on maintaining the quality of domestic information resources, while foreign research focuses on the storage and retrieval of information.
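The abstract above names frequency analysis and co-occurrence frequency analysis but gives no implementation details. As a minimal sketch only, assuming whitespace tokenization and document-level co-occurrence (both are assumptions, and the function name `cooccurrence_counts` and the sample documents are hypothetical), the two counts could be computed like this:

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(documents):
    """Return (term frequencies, document-level co-occurrence counts of term pairs)."""
    term_freq = Counter()
    pair_freq = Counter()
    for doc in documents:
        tokens = doc.lower().split()  # assumption: simple whitespace tokenization
        term_freq.update(tokens)
        # Each unordered pair is counted at most once per document.
        unique_terms = sorted(set(tokens))
        pair_freq.update(combinations(unique_terms, 2))
    return term_freq, pair_freq


# Hypothetical toy documents, not data from the paper.
docs = [
    "archival information service users",
    "archival reference service evaluation",
    "information retrieval users",
]
term_freq, pair_freq = cooccurrence_counts(docs)
print(term_freq.most_common(3))
print(pair_freq.most_common(3))
```

The pair counts form the weighted edge list of a keyword network, which is the usual input for the centrality analysis the abstract mentions.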

A Study on the Purchasing Factors of Color Cosmetics Using Big Data: Focusing on Topic Modeling and Concor Analysis (빅데이터를 활용한 색조화장품의 구매 요인에 관한 연구: 토픽모델링과 Concor 분석을 중심으로)

  • Eun-Hee Lee;Seung-Hee Bae
    • Journal of the Korean Applied Science and Technology / v.40 no.4 / pp.724-732 / 2023
  • This study collected data on consumers' online interest in the color cosmetics market after COVID-19 and used text mining to analyze the characteristics of color cosmetics information searches and the main topics of interest. In the empirical analysis, text mining was performed on all documents, such as news, blogs, cafes, and web pages, that include the word "color cosmetics". The analysis showed that online information searches for color cosmetics after COVID-19 focused mainly on purchase information, information on skin and mask-related makeup methods, and major topics such as brands of interest and event information. Post-COVID-19 color cosmetics buyers are therefore expected to become more sensitive to purchase information such as product value, safety, price benefits, and store information through active online information search, so a response strategy is required.

An Exploratory Study of Success Factors for Generative AI Services: Utilizing Text Mining and ChatGPT (생성형AI 서비스의 성공요인에 대한 탐색적 연구: 텍스트 마이닝과 ChatGPT를 활용하여)

  • Ji Hoon Yang;Sung-Byung Yang;Sang-Hyeak Yoon
    • Information Systems Review / v.25 no.2 / pp.125-144 / 2023
  • Generative Artificial Intelligence (AI) technology is gaining global attention because it can automatically generate sentences, images, and voices that previously only humans produced. In particular, ChatGPT, a representative generative AI service, shows proactivity and accuracy that set it apart from existing chatbot services, and its number of users is increasing rapidly. Despite this growing interest in generative AI services, most preceding studies are still in their infancy. Therefore, this study utilized LDA topic modeling and keyword network diagrams to derive success factors for generative AI services and to propose successful business strategies based on them. In addition, a new research methodology that uses ChatGPT to complement the existing text-mining method was presented. This study overcomes the limitations of previous research that relied on qualitative methods and makes academic and practical contributions to the future development of generative AI services.

Analysis of accident types at small and medium-sized construction sites based on web scraping and text mining (웹 스크래핑 및 텍스트마이닝에 기반한 중소규모 건설현장 사고유형 분석)

  • Younggeun Yoon
    • The Journal of the Convergence on Culture Technology / v.10 no.1 / pp.609-615 / 2024
  • The construction industry's fatality count stands at 402, comprising approximately 46% of total industrial accidents. Notably, sites with construction costs of less than 5 billion won account for about 69%, so safety management at small and medium-sized construction sites needs to be strengthened. In this study, 19,511 accident investigation records were collected using web scraping. Through statistical analysis of the collected structured data and text mining analysis of the unstructured data, accident types and causes were analyzed by construction cost for sites under 5 billion won. The results confirmed that accident types and causes differ depending on the construction cost. It is hoped that the results of this study will be used for customized safety management at small and medium-sized construction sites.

A study on cultural consumers of OTT original contents based on Text Mining, focusing on Netflix's 'Parasyte: The Grey' based on a comic book (텍스트 마이닝 기반 OTT 오리지널 콘텐츠의 문화소비자 연구, 만화 원작의 넷플릭스 '기생수: 더 그레이' 중심으로)

  • Oh Se Jong
    • The Journal of the Convergence on Culture Technology / v.10 no.6 / pp.789-797 / 2024
  • The study of cultural consumers plays an important role in actor selection, location selection, marketing, and scenarios for movies and series, and in box office factors. In particular, the study of cultural consumers of OTT original contents can support viewer-tailored works by utilizing massive viewing data, social media user analysis, and location-based information. The research method combined text mining N-gram analysis, CONCOR, and emotional vocabulary analysis using Bayesian classifier machine learning. Using the Netflix work 'Parasyte: The Grey', which is based on a comic book, the study analyzed cultural consumption patterns, actor selection, implications of genre change, complex human emotional narrative, location selection, and the effects of VFX. In addition, the changed story development and storytelling structure were examined through Dan Harmon's 8 stages of hero storytelling. By identifying the response factors of cultural consumers of OTT original contents and considerations for production, this study is expected to help save costs and time in cultural content development and the entertainment industry.
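The abstract above lists N-gram analysis among its text mining methods without describing the procedure. The following is only an illustrative sketch of counting word N-grams, assuming whitespace tokenization; the function names and the sample reviews are hypothetical and not taken from the paper:

```python
from collections import Counter


def ngrams(tokens, n):
    """Return the list of consecutive n-token tuples in a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_counts(texts, n=2):
    """Count word n-grams across a collection of texts."""
    counts = Counter()
    for text in texts:
        counts.update(ngrams(text.lower().split(), n))
    return counts


# Hypothetical viewer comments, used only to exercise the functions.
reviews = [
    "great visual effects and great acting",
    "the visual effects were impressive",
]
bigram_counts = ngram_counts(reviews, n=2)
print(bigram_counts.most_common(2))
```

Frequent bigrams such as adjacent word pairs give a first view of which phrases dominate viewer responses before a network method like CONCOR is applied.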

A Study on Perception Analysis and Strategic Direction of Spatial Computing through Text Mining: Focusing on the Case of Apple Vision Pro (텍스트마이닝을 통한 공간 컴퓨팅 인식 분석 및 전략 방향에 관한 연구: 애플 비전 프로 사례를 중심으로)

  • Heetae Yang
    • Information Systems Review / v.26 no.2 / pp.205-221 / 2024
  • In June 2023, the term "spatial computing" began gaining recognition among the public with Apple's Vision Pro announcement, and interest surged exponentially after its official release in February 2024. With the market opening up, there's a need to analyze public perception for sustainable growth of Spatial Computing and provide evidence-based strategies for industry and government response. This study explores domestic public perception of Spatial Computing using various text mining techniques and seeks strategic directions for successful market penetration based on the analysis. Significantly, the study contributes by leading research on Spatial Computing, proposing new research methodologies, and offering strategic and policy directions for stakeholders.

Towards Improving Causality Mining using BERT with Multi-level Feature Networks

  • Ali, Wajid;Zuo, Wanli;Ali, Rahman;Rahman, Gohar;Zuo, Xianglin;Ullah, Inam
    • KSII Transactions on Internet and Information Systems (TIIS) / v.16 no.10 / pp.3230-3255 / 2022
  • Causality mining in NLP is a significant area of interest that benefits many daily life applications, including decision making, business risk management, question answering, future event prediction, scenario generation, and information retrieval. Mining those causalities was a challenging and open problem for prior non-statistical and statistical techniques using web sources, which required hand-crafted linguistic patterns for feature engineering, depended on domain knowledge, and demanded much human effort. Those studies overlooked implicit, ambiguous, and heterogeneous causality and focused on explicit causality mining. In contrast to statistical and non-statistical approaches, we present Bidirectional Encoder Representations from Transformers (BERT) integrated with Multi-level Feature Networks (MFN), called BERT+MFN, for causality recognition in noisy and informal web datasets without human-designed features. In our model, MFN consists of a three-column knowledge-oriented network (TC-KN), bi-LSTM, and Relation Network (RN) that mine causality information at the segment level. BERT captures semantic features at the word level. We perform experiments on Alternative Lexicalization (AltLexes) datasets. The experimental outcomes show that our model outperforms baseline causality and text mining techniques.

Reinforcement Method for Automated Text Classification using Post-processing and Training with Definition Criteria (학습방법개선과 후처리 분석을 이용한 자동문서분류의 성능향상 방법)

  • Choi, Yun-Jeong;Park, Seung-Soo
    • The KIPS Transactions:PartB / v.12B no.7 s.103 / pp.811-822 / 2005
  • Automated text categorization classifies free-text documents into predefined categories automatically, and its main goal is to reduce the considerable manual work the task requires. Research on improving text categorization performance (efficiency) in recent years has focused on enhancing existing classification models and algorithms themselves, but its range has been limited by feature-based statistical methodology. In this paper, we propose the RTPost system, which differs in style from any traditional method and takes a fault-tolerant system approach and a data mining strategy. The two important parts of the RTPost system are reinforcement training and post-processing. First, the training method deals with the problem of defining the categories to be classified before selecting training sample documents. Second, the post-processing method deals with the problem of assigning categories, not with the performance of classification algorithms. In experiments, we applied our system to documents with low classification accuracy that lay near a decision boundary. Through the experiments, we show that our system has high accuracy and stability in actual conditions. It did not depend at all on variables that strongly influence classification power, such as the number of training documents, the selection problem, and the performance of classification algorithms. In addition, we can expect a self-learning effect that decreases the training cost and increases the training power by exploiting the advantages of active learning.

User Experience Evaluation of Menstrual Cycle Measurement Application Using Text Mining Analysis Techniques (텍스트 마이닝 분석 기법을 활용한 월경주기측정 애플리케이션 사용자 경험 평가)

  • Wookyung Jeong;Donghee Shin
    • Journal of the Korean Society for Information Management / v.40 no.4 / pp.1-31 / 2023
  • This study evaluated the user experience of mobile menstrual cycle measurement applications, which are closely related to women's health, by applying various text mining techniques along with topic modeling, and analyzed the results by combining them with a honeycomb model. To evaluate the user experience revealed in application reviews, 47,117 Korean reviews of menstrual cycle measurement applications were collected. Topic modeling analysis was conducted to confirm the overall discourse on the user experience revealed in the reviews, and text network analysis was conducted to confirm the specific experience of each topic. In addition, sentiment analysis was conducted to understand the emotional experience of users. Based on this, a development strategy for menstrual cycle measurement applications was presented in terms of accuracy, design, monitoring, data management, and user management. The study confirmed that the accuracy and monitoring functions of menstrual cycle measurement should be improved and that various design attempts are required. In addition, the need to strengthen the management of personal information and users' biometric data was also confirmed. By exploring the user experience (UX) of menstrual cycle measurement applications in depth, this study revealed various factors experienced by users and suggested practical improvements to provide a better experience. It is also significant in that it presents a methodology that combines topic modeling and text network analysis techniques so that researchers can closely grasp vast amounts of review data in the process of evaluating user experiences.

A Machine Learning Based Facility Error Pattern Extraction Framework for Smart Manufacturing (스마트제조를 위한 머신러닝 기반의 설비 오류 발생 패턴 도출 프레임워크)

  • Yun, Joonseo;An, Hyeontae;Choi, Yerim
    • The Journal of Society for e-Business Studies / v.23 no.2 / pp.97-110 / 2018
  • With the advent of the Fourth Industrial Revolution, manufacturing companies are increasingly interested in realizing smart manufacturing by utilizing their accumulated facility data. However, most previous research dealt with structured data such as sensor signals, and only a few studies focused on unstructured data such as text, which actually comprises a large portion of the accumulated data. Therefore, we propose an association rule mining based facility error pattern extraction framework, in which text data written by operators are analyzed. Specifically, phrases were extracted and used as the unit of text data analysis, since a word, which is normally used as the unit, cannot convey the technical meaning of facility errors. The performance of the proposed framework was evaluated on a real-world case, and it is expected that the productivity of manufacturing companies will be enhanced by adopting the proposed framework.
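The abstract above describes association rule mining over phrases extracted from operator-written text but does not specify the algorithm or thresholds. As a rough sketch under stated assumptions (rules limited to single-phrase antecedents and consequents, hypothetical thresholds, and made-up phrase sets; the function name `mine_pair_rules` is also hypothetical), support and confidence can be computed as follows:

```python
from itertools import combinations


def mine_pair_rules(transactions, min_support=0.5, min_confidence=0.7):
    """Return rules (antecedent, consequent, support, confidence) between phrase pairs."""
    n = len(transactions)
    item_count = {}
    pair_count = {}
    for transaction in transactions:
        items = sorted(set(transaction))  # one count per phrase per record
        for item in items:
            item_count[item] = item_count.get(item, 0) + 1
        for pair in combinations(items, 2):
            pair_count[pair] = pair_count.get(pair, 0) + 1

    rules = []
    for (a, b), count in pair_count.items():
        support = count / n  # share of records containing both phrases
        if support < min_support:
            continue
        for antecedent, consequent in ((a, b), (b, a)):
            # confidence = P(consequent | antecedent)
            confidence = count / item_count[antecedent]
            if confidence >= min_confidence:
                rules.append((antecedent, consequent, support, confidence))
    return rules


# Hypothetical phrase sets, standing in for phrases extracted from operator logs.
logs = [
    ["motor overheat", "belt slip"],
    ["motor overheat", "belt slip", "sensor fault"],
    ["sensor fault"],
    ["motor overheat", "belt slip"],
]
rules = mine_pair_rules(logs, min_support=0.5, min_confidence=0.7)
for rule in rules:
    print(rule)
```

Treating each multi-word phrase as a single item, rather than splitting it into words, mirrors the abstract's point that individual words lose the technical meaning of a facility error.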