• Title/Summary/Keyword: 텍스트 연구 (text research)

Online Document Mining Approach to Predicting Crowdfunding Success (온라인 문서 마이닝 접근법을 활용한 크라우드펀딩의 성공여부 예측 방법)

  • Nam, Suhyeon;Jin, Yoonsun;Kwon, Ohbyung
    • Journal of Intelligence and Information Systems / v.24 no.3 / pp.45-66 / 2018
  • Crowdfunding has become more popular than angel funding for fundraising by venture companies. Identification of success factors may be useful for fundraisers and investors to make decisions related to crowdfunding projects and predict a priori whether they will be successful or not. Recent studies have suggested several numeric factors, such as project goals and the number of associated SNS, studying how these affect the success of crowdfunding campaigns. However, prediction of the success of crowdfunding campaigns via non-numeric and unstructured data is not yet possible, especially through analysis of structural characteristics of documents introducing projects in need of funding. Analysis of these documents is promising because they are open and inexpensive to obtain. We propose a novel method to predict the success of a crowdfunding project based on the introductory text. To test the performance of the proposed method, in our study, texts related to 1,980 actual crowdfunding projects were collected and empirically analyzed. From the text data set, the following details about the projects were collected: category, number of replies, funding goal, fundraising method, reward, number of SNS followers, number of images and videos, and miscellaneous numeric data. These factors were identified as significant input features to be used in classification algorithms. The results suggest that the proposed method outperforms other recently proposed, non-text-based methods in terms of accuracy, F-score, and elapsed time.
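The abstract above compares methods by accuracy and F-score. As a minimal sketch of how these two measures are computed for a binary success/failure prediction, the following uses invented labels (not the study's data); the function name `accuracy_and_f1` is hypothetical.

```python
def accuracy_and_f1(y_true, y_pred, positive=1):
    """Return (accuracy, F1) for binary labels; `positive` marks the success class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return accuracy, f1

# Toy labels invented for illustration: 1 = funded, 0 = not funded.
y_true = [1, 1, 0, 0, 1]
y_pred = [1, 0, 0, 1, 1]
accuracy, f1 = accuracy_and_f1(y_true, y_pred)
```

The abstract does not state which F-score variant the authors used; F1 is assumed here because it is the most common choice.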

Impact of Word Embedding Methods on Performance of Sentiment Analysis with Machine Learning Techniques

  • Park, Hoyeon;Kim, Kyoung-jae
    • Journal of the Korea Society of Computer and Information / v.25 no.8 / pp.181-188 / 2020
  • In this study, we conduct a comparative analysis of how various word embedding techniques affect the performance of sentiment analysis. Sentiment analysis is an opinion mining technique that uses natural language processing to identify and extract subjective information from text, and it can be used to classify the sentiment of product reviews or comments. Since sentiment can be classified as either positive or negative, it can be treated as a general classification problem. For sentiment analysis, text must be converted into a form that a computer can process. Text such as a word or document is therefore transformed into a vector, a step known in natural language processing as word embedding. Techniques such as Bag of Words, TF-IDF, and Word2Vec are used for this purpose. To date, few studies have examined which word embedding techniques are suitable for sentiment analysis. In this study, Bag of Words, TF-IDF, and Word2Vec are compared on movie review sentiment analysis. The data set is the IMDB data set, which is widely used in text mining. The results show that TF-IDF and Bag of Words outperformed Word2Vec, and that TF-IDF performed better than Bag of Words, although the difference was not large.
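To make the first two vectorization techniques named above concrete, here is a minimal pure-Python sketch of Bag of Words counts and one common TF-IDF weighting (raw count times log(N / df)). The toy reviews are invented, not taken from the IMDB data set, and the exact TF-IDF variant used in the study is not stated, so this formula is an assumption. Word2Vec is omitted because it requires training a model.

```python
import math
from collections import Counter

def bag_of_words(docs):
    """Return (vocabulary, count vectors) for a list of tokenized documents."""
    vocab = sorted({tok for doc in docs for tok in doc})
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append([counts[term] for term in vocab])
    return vocab, vectors

def tf_idf(docs):
    """Weight raw counts by idf = log(N / df), one common TF-IDF variant."""
    vocab, counts = bag_of_words(docs)
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = [sum(1 for vec in counts if vec[j] > 0) for j in range(len(vocab))]
    idf = [math.log(n_docs / d) for d in df]
    return vocab, [[c * w for c, w in zip(vec, idf)] for vec in counts]

# Toy reviews invented for illustration.
reviews = [
    "a great great movie".split(),
    "a boring movie".split(),
]
vocab, bow = bag_of_words(reviews)
_, weighted = tf_idf(reviews)
```

Terms that occur in every document ("a", "movie") receive an idf of zero, so TF-IDF keeps only the terms that distinguish one review from another, while Bag of Words keeps all counts.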

3D Web based Collaborative Authoring System of Tangible Contents (3D 웹 기반 실감 콘텐츠 협업 저작시스템 연구)

  • Lee, Changhyeon;Kwon, Yong-Moo
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2011.07a / pp.308-309 / 2011
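  • The term social media has recently become so important and widely used that it is familiar not only to people working in computer-related research fields but also to the general public. With the development of the Internet and communication devices, social media content is being generated at a volume that users cannot handle; this study addresses how such social media should be used and how it can be efficiently combined and presented. Types of social media include blogs, social networking services (SNS), wikis, user-created content (UCC), and microblogs. This study investigates a system for authoring content for a Tangible Blog based on social media. Here, the aim is to add 3D content to the commonly used social media types of photos, videos, sound effects, and text. The 3D content is generated using Kinect, which is currently widely used in gaming, and we introduce a method for authoring these social media in a web environment. The ultimate goal is to create a Tangible Blog that advances beyond the blog format widely used today. A Tangible Blog is intended to be an immersive blog that goes beyond expressing users' daily lives and opinions through conventional sources such as text, music, and video, by using 3D content, storytelling techniques, and sensory effects.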

A Study on the Imjin War's Historical Materials with Multi-layer Network Analysis and Topic Modeling (다중 네트워크 분석과 토픽 모델링을 이용한 임진왜란 시기 사료에 관한 연구)

  • Cho, HyunChul;Song, Min
    • Journal of the Korean BIBLIA Society for Library and Information Science / v.33 no.1 / pp.167-198 / 2022
  • Convergence science research is gaining momentum, and digital humanities research is likewise being encouraged in the humanities. This study therefore proposes an experimental approach that applies text mining and entitymetrics methods to historical materials. The Annals of King Seonjo, the Revised Annals of King Seonjo, the Miscellaneous Record of the War, and Writings on the Imjin War were used as sources, and network analysis and DMR topic models were used to explore topic changes and common entities in them. The results suggest that quantitative analysis of text data is feasible for such materials, revealing changes over time in specific topics and previously undiscovered relationships between person entities.

A study on detective story authors' style differentiation and style structure based on Text Mining (텍스트 마이닝 기법을 활용한 고전 추리 소설 작가 간 문체적 차이와 문체 구조에 대한 연구)

  • Moon, Seok Hyung;Kang, Juyoung
    • Journal of Intelligence and Information Systems / v.25 no.3 / pp.89-115 / 2019
  • This study uses data analysis to present the stylistic differences between Arthur Conan Doyle and Agatha Christie, two well-known writers of classical mystery novels, and proposes a text mining-based methodology for the study of style. We chose mystery novels because the distinctive devices of the classical mystery give the genre strong stylistic characteristics, and we chose Doyle and Christie because they are familiar to general readers, which makes the research accessible to people unfamiliar with the field. The primary objective is to identify what differences exist within the texts and to interpret the effects of these differences on the reader. Accordingly, in addition to events and characters, which are key elements of mystery novels, the writer's grammatical writing style was included in the definition of style and analyzed. Two series and four books were selected for each writer, and the text was divided into sentences to obtain the data. After an emotional score was assigned to each sentence, the emotions were visualized as a graph following the progression of pages, and the progression of events in each novel was identified under eight themes by applying topic modeling by page. By building co-occurrence matrices and performing network analysis, we were able to visualize changes in the relationships between characters as events progressed. In addition, every sentence was classified under a grammatical system based on six types of writing style to identify differences between writers and between works. This enabled us to identify not only each author's general grammatical writing style but also the stylistic characteristics inherent in their unconscious, and to interpret the effects of these characteristics on the reader. This series of research processes can help in understanding the context of an entire text based on a defined understanding of style, and can integrate stylistic studies that were previously conducted separately. Such an understanding can also help in identifying and clarifying text within unstructured data, including online text. This could enable more accurate recognition of emotions and delivery of commands on interactive artificial intelligence platforms that currently convert voice into natural language. As attempts grow to analyze online texts, including new media, in various ways in order to discover social phenomena and managerial value, this study is expected to contribute to more meaningful online text analysis and semantic interpretation through links to such work. However, the data used in this study comprise only two series and four books per author, which is a limitation because the analysis was not carried out on a sufficient volume of data. Applying writing characteristics defined for Korean text to English text is another possible limitation. The diverse range of stylistic characteristics was reduced to six types, which narrowed the possible interpretations; this is also a limitation. It is also regrettable that the study analyzed classical mystery novels rather than the kinds of text in common use today, and that a wider range of classical mystery writers was not compared. Subsequent research will attempt to broaden the range of interpretations by considering a wider variety of grammatical systems and stylistic structures, and will apply the approach to the online text commonly analyzed today to assess its interpretive potential. This is expected to make it possible to interpret and define the specific structure of style and to open up a range of applications.
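The co-occurrence step described in this abstract can be sketched as follows. This is an illustrative assumption about how such a matrix might be built, not the authors' implementation: the sentences are invented, tokenization is plain whitespace splitting, and the function name `cooccurrence` is hypothetical.

```python
from collections import Counter
from itertools import combinations

def cooccurrence(sentences, names):
    """Count how often each pair of names appears in the same sentence."""
    pairs = Counter()
    for sentence in sentences:
        tokens = set(sentence.split())
        # Sort so each pair is always stored in the same order.
        present = sorted(name for name in names if name in tokens)
        for a, b in combinations(present, 2):
            pairs[(a, b)] += 1
    return pairs

# Toy sentences invented for illustration.
sentences = [
    "Holmes looked at Watson",
    "Watson followed Holmes and Lestrade",
    "Lestrade left",
]
pairs = cooccurrence(sentences, ["Holmes", "Watson", "Lestrade"])
```

Each counted pair can then serve as a weighted edge in a character network; computing the counts separately per chapter or page range would show how relationships change as events progress.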

Text Mining-Based Emerging Trend Analysis for e-Learning Contents Targeting for CEO (텍스트마이닝을 통한 최고경영자 대상 이러닝 콘텐츠 트렌드 분석)

  • Kyung-Hoon Kim;Myungsin Chae;Byungtae Lee
    • Information Systems Review / v.19 no.2 / pp.1-19 / 2017
  • Original scripts of e-learning lectures for the CEOs of corporation S were analyzed using topic analysis, a text mining method. Twenty-two topics were extracted based on keywords chosen from five years of records, from 2011 to 2015. Research analysis was then conducted on various issues. Promising topics were selected through evaluation and element analysis of the members of each topic. In management and economics, members demonstrated high satisfaction with and interest in topics in marketing strategy, human resource management, and communication. In the humanities, philosophy, history of war, and history drew high interest and satisfaction, whereas in lifestyle, mental health drew high interest and satisfaction. Studies were also conducted to identify topics on the proportion of content, but these studies failed to increase member satisfaction. In the field of IT, educational content responds sensitively to the changing times, but it may not increase the interest and satisfaction of members. The present study found that content production for CEOs should draw out deep implications for value innovation through technology application instead of stopping at the technical aspect of information delivery. Previous studies classified content superficially, based on the name of the content program, when analyzing the status of content operation. Text mining, however, can derive deep content and subject classification from the unstructured script data themselves. This approach can reveal current shortages and needed fields if the service content of each theme is displayed by year. This study was based on data obtained from influential e-learning companies in Korea. Obtaining practical results was difficult because data were not acquired from portal sites or social networking services. The e-learning content trends for CEOs were analyzed, and data analysis was also conducted on the intellectual interests of CEOs in each field.

Analysis of Issues on Underground Space between Central and Local Governments Utilizing Social Media Data (소셜미디어 데이터를 활용한 중앙정부와 지방정부 간 지하공간의 주요 이슈 고찰)

  • Choi, Hae-Ok;Baek, Sung-Joon
    • Journal of Cadastre & Land InformatiX / v.46 no.1 / pp.75-86 / 2016
  • This study examines the social issues between the central and local governments concerning underground space following the sinkholes that occurred in the Jamsil area in July 2014. We use keyword network analysis, a form of social network analysis, as the research methodology. The social issues regarding underground space were examined by analyzing centrality and group density to characterize the attributes of the network. The results show that the central government has steadily supported local governments in establishing the socialized law for underground space. This research suggests that laws and technologies relating to underground space issues should be coordinated in the future. It also shows that the government should enact policies and national plans for the development of underground space.
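The abstract mentions centrality and density as the network attributes analyzed. As a rough sketch under stated assumptions, the code below computes degree centrality and overall density for a small undirected keyword network. The keywords and edges are invented, the graph is assumed to have no self-loops and at least two nodes, and the abstract does not say which centrality measure or density definition the authors actually used.

```python
def degree_centrality(edges):
    """Degree of each node divided by (n - 1), for an undirected simple graph."""
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    n = len(neighbors)
    return {node: len(adj) / (n - 1) for node, adj in neighbors.items()}

def density(edges):
    """Share of all possible undirected edges that are actually present."""
    nodes = {node for edge in edges for node in edge}
    unique_edges = {frozenset(edge) for edge in edges}
    n = len(nodes)
    return len(unique_edges) / (n * (n - 1) / 2)

# Toy keyword network invented for illustration.
edges = [
    ("sinkhole", "safety"),
    ("sinkhole", "law"),
    ("law", "government"),
]
centrality = degree_centrality(edges)
network_density = density(edges)
```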

A Study on the Retrieval Effectiveness Based on Image Query Types (이미지 인지 유형 및 검색질의 방식에 따른 검색 효율성에 관한 연구)

  • Kim, Seonghee;Yi, Keunyoung
    • Journal of the Korean Society for Library and Information Science / v.47 no.3 / pp.321-342 / 2013
  • The purpose of this study was to compare and evaluate the retrieval effectiveness of different retrieval methods across three types of image perception. Image types included specific, general, and abstract topics. The retrieval methods included text-only search, query by example (QBE) search, and hybrid search. Thirty-two college students were recruited to search for topics using the Google image search system. The search results were compared using one-way and two-way ANOVA. Text search and hybrid search showed an advantage when searching for specific and general topics, whereas QBE search performed better than both text-only and hybrid search for abstract topics. The results have implications for the implementation of image retrieval systems.
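For readers unfamiliar with the test named above, the following is a minimal sketch of the one-way ANOVA F statistic, which compares variance between groups with variance within groups. The scores are invented and stand in for a hypothetical effectiveness measure under three retrieval methods; they are not the study's data, and the two-way case is not shown.

```python
def one_way_anova_f(groups):
    """Return the F statistic: between-group mean square over within-group mean square."""
    all_values = [x for group in groups for x in group]
    n = len(all_values)
    k = len(groups)
    grand_mean = sum(all_values) / n
    means = [sum(group) / len(group) for group in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    # Degrees of freedom: k - 1 between groups, n - k within groups.
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Toy scores invented for illustration (e.g. relevant images found per session).
text_only = [1, 2, 3]
qbe = [2, 3, 4]
hybrid = [5, 6, 7]
f_statistic = one_way_anova_f([text_only, qbe, hybrid])
```

A large F value indicates that the group means differ by more than the within-group spread would suggest; turning it into a p-value requires the F distribution, which is left out of this sketch.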

Linguistic and Cognitive Factors that Affect Word Problem Solving (수학 문장제 해결에 영향을 주는 언어적.인지적 요인 -혼합물 문제를 중심으로-)

  • 김선희
    • Journal of Educational Research in Mathematics / v.14 no.3 / pp.267-281 / 2004
  • Many students find word problems very difficult. This study analyzes the linguistic and cognitive factors that affect word problem solving in order to help students overcome this difficulty. The linguistic aspects comprise a text base, a situation model, and the real world. Students have difficulty with the transition from the text base to the situation model (equation) and make many errors at the situation model. For the cognitive aspects, I investigated problem-solving schemes, strategies, and complexity level. Students tend to choose a strategy according to the content their teacher taught rather than according to low complexity level, and they confuse the amount of sugar, the amount of sugar water, and concentration. The study shows how complex each type of word problem is to solve, which strategies students mostly choose, and what errors students make in problem solving.

An Analysis of the Discourse Topics of Users who Exhibit Symptoms of Depression on Social Media (소셜미디어를 통한 우울 경향 이용자 담론 주제 분석)

  • Seo, Harim;Song, Min
    • Journal of the Korean Society for Information Management / v.36 no.4 / pp.207-226 / 2019
  • Depression is a serious psychological disease that is expected to afflict an increasing number of people. Studies on depression have been conducted in the context of social media because it is a platform on which users often frankly express their emotions and reveal their mental states. In this study, large amounts of Korean text were collected and analyzed to determine whether such data could be used to detect depression in users. The study analyzed data collected between January 2016 and February 2019 from Twitter users with and without depressive tendencies. The data for each user were analyzed separately before and after the appearance of depressive tendencies to see how their expression changed. The data were analyzed through co-occurrence word analysis, topic modeling, and sentiment analysis. The automated data collection method enabled analysis of data collected over a relatively long period of time, and the study compared the textual characteristics of users with depressive tendencies to those of users without.