• Title/Summary/Keyword: Text Search

Exploring the Trend of Korean Creative Dance by Analyzing Research Topics : Application of Text Mining (연구주제 분석을 통한 한국창작무용 경향 탐색 : 텍스트 마이닝의 적용)

  • Yoo, Ji-Young;Kim, Woo-Kyung
    • Journal of Korea Entertainment Industry Association / v.14 no.6 / pp.53-60 / 2020
  • The study is based on the assumption that trends in a phenomenon and trends in research on it are contextually consistent. The purpose of this study is therefore to explore trends in dance through a topic analysis of Korean creative dance research using text mining. To this end, 1,291 words from 616 journal article titles collected from a paper search website were analyzed. Data collection, refinement, and analysis were all performed in R 3.6.0. The results are as follows. First, keywords representing the times were frequently used before the 2000s, although research on Korean creative dance in terms of education and physical training was also found. Second, keywords related to dance troupes' performances were frequent after the 2000s, but Choi Seung-hee was confirmed to still hold an important position in the study of Korean creative dance. Third, an analysis of the overall research topics of Korean creative dance studies showed that research on 'Art of Choi Seung-hee in the modern era' accounted for the highest proportion. Fourth, the Hot Topics, which have been rising since 2000, were 'the performance activities of the National Dance Company' and 'the choreographic expression and utilization of traditional dance'. Since the National Dance Company's recent performances advocate 'modernization based on tradition', it was confirmed that Korean creative dance since the 2000s has focused on the use of traditional dance motifs. Fifth, the Cold Topic, which has been falling since 2000, was the study of 'dancing expressions by age'. Interest in this research was judged to have decreased because of the tendency to mix various dance styles after Korean creative dance was established as a genre.

A New Approach to Automatic Keyword Generation Using Inverse Vector Space Model (키워드 자동 생성에 대한 새로운 접근법: 역 벡터공간모델을 이용한 키워드 할당 방법)

  • Cho, Won-Chin;Rho, Sang-Kyu;Yun, Ji-Young Agnes;Park, Jin-Soo
    • Asia pacific journal of information systems / v.21 no.1 / pp.103-122 / 2011
  • Recently, numerous documents have been made available electronically. Internet search engines and digital libraries commonly return query results containing hundreds or even thousands of documents. In this situation, it is virtually impossible for users to examine complete documents to determine whether they might be useful for them. For this reason, some on-line documents are accompanied by a list of keywords specified by the authors in an effort to guide the users by facilitating the filtering process. In this way, a set of keywords is often considered a condensed version of the whole document and therefore plays an important role for document retrieval, Web page retrieval, document clustering, summarization, text mining, and so on. Since many academic journals ask the authors to provide a list of five or six keywords on the first page of an article, keywords are most familiar in the context of journal articles. However, many other types of documents could also benefit from the use of keywords, including Web pages, email messages, news reports, magazine articles, and business papers. Although the potential benefit is large, the implementation itself is the obstacle; manually assigning keywords to all documents is a daunting task, or even impractical, in that it is extremely tedious and time-consuming and requires a certain level of domain knowledge. Therefore, it is highly desirable to automate the keyword generation process. There are mainly two approaches to achieving this aim: the keyword assignment approach and the keyword extraction approach. Both approaches use machine learning methods and require, for training purposes, a set of documents with keywords already attached. In the former approach, there is a given vocabulary, and the aim is to match its terms to the texts. In other words, the keyword assignment approach seeks to select the words from a controlled vocabulary that best describe a document. 
Although this approach is domain dependent and is not easy to transfer and expand, it can generate implicit keywords that do not appear in a document. On the other hand, in the latter approach, the aim is to extract keywords with respect to their relevance in the text without prior vocabulary. In this approach, automatic keyword generation is treated as a classification task, and keywords are commonly extracted based on supervised learning techniques. Thus, keyword extraction algorithms classify candidate keywords in a document into positive or negative examples. Several systems such as Extractor and Kea were developed using the keyword extraction approach. The most indicative words in a document are selected as keywords for that document and, as a result, keyword extraction is limited to terms that appear in the document. Therefore, keyword extraction cannot generate implicit keywords that are not included in a document. According to the experiment results of Turney, about 64% to 90% of keywords assigned by the authors can be found in the full text of an article. Inversely, it also means that 10% to 36% of the keywords assigned by the authors do not appear in the article and so cannot be generated through keyword extraction algorithms. Our preliminary experiment result also shows that 37% of keywords assigned by the authors are not included in the full text. This is the reason why we have decided to adopt the keyword assignment approach. In this paper, we propose a new approach for automatic keyword assignment, namely IVSM (Inverse Vector Space Model). The model is based on the vector space model, which is a conventional information retrieval model that represents documents and queries by vectors in a multidimensional space. IVSM generates an appropriate keyword set for a specific document by measuring the distance between the document and the keyword sets. 
The keyword assignment process of IVSM is as follows: (1) calculating the vector length of each keyword set based on each keyword weight; (2) preprocessing and parsing a target document that does not have keywords; (3) calculating the vector length of the target document based on the term frequency; (4) measuring the cosine similarity between each keyword set and the target document; and (5) generating keywords that have high similarity scores. Two keyword generation systems were implemented applying IVSM: an IVSM system for a Web-based community service and a stand-alone IVSM system. The former is implemented in a community service for sharing knowledge and opinions on current trends such as fashion, movies, social problems, and health information. The stand-alone IVSM system is dedicated to generating keywords for academic papers, and it has been tested on a number of academic papers including those published by the Korean Association of Shipping and Logistics, the Korea Research Academy of Distribution Information, the Korea Logistics Society, the Korea Logistics Research Association, and the Korea Port Economic Association. We measured the performance of IVSM by the number of matches between the IVSM-generated keywords and the author-assigned keywords. According to our experiment, the precisions of IVSM applied to the Web-based community service and academic journals were 0.75 and 0.71, respectively. The performance of both systems is much better than that of baseline systems that generate keywords based on simple probability. Also, IVSM shows performance comparable to Extractor, a representative system of the keyword extraction approach developed by Turney. As electronic documents increase, we expect that the IVSM proposed in this paper can be applied to many electronic documents in Web-based communities and digital libraries.
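
The five assignment steps listed in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say how keyword weights are learned, so `keyword_sets` is a hypothetical hand-written mapping from each candidate keyword to a term-weight vector, and `tokenize`, `assign_keywords`, and `top_n` are names invented for this example.

```python
import math
import re
from collections import Counter


def vector_length(weights):
    """Euclidean length of a sparse term-weight vector (steps 1 and 3)."""
    return math.sqrt(sum(w * w for w in weights.values()))


def tokenize(text):
    """Step 2, simplified: lowercase the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())


def cosine_similarity(doc_tf, keyword_weights):
    """Step 4: cosine of the angle between a document and one keyword set."""
    dot = sum(doc_tf.get(term, 0) * w for term, w in keyword_weights.items())
    norm = vector_length(doc_tf) * vector_length(keyword_weights)
    return dot / norm if norm else 0.0


def assign_keywords(text, keyword_sets, top_n=2):
    """Step 5: return up to top_n keywords whose sets are most similar to the text."""
    doc_tf = Counter(tokenize(text))
    scored = [
        (cosine_similarity(doc_tf, weights), keyword)
        for keyword, weights in keyword_sets.items()
    ]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [keyword for score, keyword in scored[:top_n] if score > 0]


# Hypothetical keyword sets; the weights are made up for illustration.
keyword_sets = {
    "logistics": {"shipping": 2.0, "port": 1.5, "cargo": 1.0},
    "fashion": {"dress": 2.0, "style": 1.0},
    "finance": {"stock": 2.0, "market": 1.0},
}
text = "The port handles cargo shipping; shipping volume at the port grew."
print(assign_keywords(text, keyword_sets))  # ['logistics']
```

Note that "logistics" never occurs in the sample text, which is the kind of implicit keyword that, according to the abstract, an assignment approach can produce and an extraction approach cannot.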

Predicting the Direction of the Stock Index by Using a Domain-Specific Sentiment Dictionary (주가지수 방향성 예측을 위한 주제지향 감성사전 구축 방안)

  • Yu, Eunji;Kim, Yoosin;Kim, Namgyu;Jeong, Seung Ryul
    • Journal of Intelligence and Information Systems / v.19 no.1 / pp.95-110 / 2013
  • Recently, the amount of unstructured data being generated through a variety of social media has been increasing rapidly, resulting in the increasing need to collect, store, search for, analyze, and visualize this data. This kind of data cannot be handled appropriately by using the traditional methodologies usually used for analyzing structured data because of its vast volume and unstructured nature. In this situation, many attempts are being made to analyze unstructured data such as text files and log files through various commercial or noncommercial analytical tools. Among the various contemporary issues dealt with in the literature of unstructured text data analysis, the concepts and techniques of opinion mining have been attracting much attention from pioneer researchers and business practitioners. Opinion mining or sentiment analysis refers to a series of processes that analyze participants' opinions, sentiments, evaluations, attitudes, and emotions about selected products, services, organizations, social issues, and so on. In other words, many attempts based on various opinion mining techniques are being made to resolve complicated issues that could not have otherwise been solved by existing traditional approaches. One of the most representative attempts using the opinion mining technique may be the recent research that proposed an intelligent model for predicting the direction of the stock index. This model works mainly on the basis of opinions extracted from an overwhelming number of economic news reports. News content published on various media is obviously a traditional example of unstructured text data. Every day, a large volume of new content is created, digitalized, and subsequently distributed to us via online or offline channels. Many studies have revealed that we make better decisions on political, economic, and social issues by analyzing news and other related information. 
In this sense, we expect to predict the fluctuation of stock markets partly by analyzing the relationship between economic news reports and the pattern of stock prices. So far, in the literature on opinion mining, most studies including ours have utilized a sentiment dictionary to elicit sentiment polarity or sentiment value from a large number of documents. A sentiment dictionary consists of pairs of selected words and their sentiment values. Sentiment classifiers refer to the dictionary to formulate the sentiment polarity of words, sentences in a document, and the whole document. However, most traditional approaches have common limitations in that they do not consider the flexibility of sentiment polarity, that is, the sentiment polarity or sentiment value of a word is fixed and cannot be changed in a traditional sentiment dictionary. In the real world, however, the sentiment polarity of a word can vary depending on the time, situation, and purpose of the analysis. It can also be contradictory in nature. The flexibility of sentiment polarity motivated us to conduct this study. In this paper, we have stated that sentiment polarity should be assigned, not merely on the basis of the inherent meaning of a word but on the basis of its ad hoc meaning within a particular context. To implement our idea, we presented an intelligent investment decision-support model based on opinion mining that performs the scraping and parsing of massive volumes of economic news on the web, tags sentiment words, classifies sentiment polarity of the news, and finally predicts the direction of the next day's stock index. In addition, we applied a domain-specific sentiment dictionary instead of a general purpose one to classify each piece of news as either positive or negative. For the purpose of performance evaluation, we performed intensive experiments and investigated the prediction accuracy of our model. 
For the experiments to predict the direction of the stock index, we gathered and analyzed 1,072 articles about stock markets published by "M" and "E" media between July 2011 and September 2011.
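
The role of a domain-specific sentiment dictionary in this abstract can be illustrated with a small sketch. Everything here is an assumption made for illustration: the word values in `general_dict` and `domain_dict` are invented, ties are counted as negative, and the rule in `predict_direction` (predict "up" when the share of positive articles exceeds a threshold) is a stand-in, since the abstract does not state how the paper maps article polarity to an index direction.

```python
def classify_article(tokens, sentiment_dict):
    """Sum the dictionary values of the tokens; a positive sum means positive news."""
    score = sum(sentiment_dict.get(token, 0.0) for token in tokens)
    return "positive" if score > 0 else "negative"  # ties count as negative (assumption)


def predict_direction(articles, sentiment_dict, threshold=0.5):
    """Hypothetical rule: predict 'up' when the share of positive articles exceeds threshold."""
    if not articles:
        raise ValueError("at least one article is required")
    labels = [classify_article(tokens, sentiment_dict) for tokens in articles]
    positive_share = labels.count("positive") / len(labels)
    return "up" if positive_share > threshold else "down"


# Invented values: "cut" reads as negative in general use but may be
# positive for stocks when it refers to an interest-rate cut.
general_dict = {"cut": -1.0, "rally": 0.5, "loss": -1.0}
domain_dict = {"cut": 1.0, "rally": 1.0, "loss": -1.0}

headline = ["central", "bank", "rate", "cut", "fuels", "rally"]
today = [headline, ["exporters", "rally"], ["quarterly", "loss"]]

print(classify_article(headline, general_dict))  # negative
print(classify_article(headline, domain_dict))   # positive
print(predict_direction(today, domain_dict))     # up
```

The same headline flips polarity depending on the dictionary, which is the flexibility of sentiment polarity that the abstract argues a fixed general-purpose dictionary cannot capture.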

An analysis of creative trend of election Ads and PR strategy which appears in recent political campaign - Focused on 2010. 6.2 local election, 2011. 10.26 by-election, 2012. 4.11 general election, 2012. 12.19 presidential election (한국 최근 정치캠페인에서 나타난 크리에이티브한 선거광고홍보전략 트렌드 분석 -2010. 6.2지방선거, 2011. 10.26 보궐선거 2012. 4.11 총선, 2012. 12.19 대선을 중심으로)

  • Kim, Man-Ki
    • Journal of Digital Convergence / v.11 no.8 / pp.65-73 / 2013
  • The outcome of an election depends on which candidate uses more original and creative ideas for election advertising and PR in the campaign strategy. Because political advertising and PR capture voters' sensibilities with a single line of copy (slogan) and a single image, they are especially important. This research analyzes the unique and creative trends in the political campaigns used in four elections held during 2010~2012 (the 2010. 6.2 local election, 2011. 10.26 by-election, 2012. 4.11 general election, and 2012. 12.19 presidential election). The analysis examines the text and images used in video, internet, and booklet-type election advertising and PR materials and in the election campaigns. Video is used in election campaigns during the election period. The unique and creative political campaigns reflect a trend toward customized micro-marketing election strategies that try to fit supporters' tendencies, including gender, age group, and social atmosphere. This research excludes the degree of success of these election strategies from the subject of analysis.

Application of Advertisement Filtering Model and Method for its Performance Improvement (광고 글 필터링 모델 적용 및 성능 향상 방안)

  • Park, Raegeun;Yun, Hyeok-Jin;Shin, Ui-Cheol;Ahn, Young-Jin;Jeong, Seungdo
    • Journal of the Korea Academia-Industrial cooperation Society / v.21 no.11 / pp.1-8 / 2020
  • In recent years, the exponential increase in internet data has driven progress in many fields such as deep learning, but it has also produced side effects in the form of commercial advertisements such as viral marketing. These not only undermine the internet's essential role of sharing high-quality information but also increase the time users spend searching for such information. In this study, we define an advertisement as "a text that obscures the essence of information transmission" and propose a model for filtering information according to that definition. The proposed model consists of an advertisement filtering step and a step for improving filtering performance, and it is designed to improve performance continuously. We collected data for filtering advertisements and trained a document classifier using KorBERT. Experiments were conducted to verify the performance of this model. For data combining five topics, accuracy and precision were 89.2% and 84.3%, respectively. High performance was confirmed even when the atypical characteristics of advertisements are considered. This approach is expected to reduce the time and fatigue involved in searching for information, because the model delivers high-quality information to users by identifying and filtering advertisement paragraphs.

Characteristics of Smartphone User in Application Usage and Implications for Applications Business Model (스마트폰 사용자들의 앱 이용 특성과 앱 비즈니스 모델에의 시사)

  • Yun, Hyung Bo;Wang, Boram;Park, Jiyun
    • The Journal of the Korea Contents Association / v.13 no.3 / pp.32-42 / 2013
  • As the smartphone market grows, the need for new business models is also increasing. However, most previous research on smartphone applications has focused on the Technology Acceptance Model (TAM) and Rogers' Diffusion of Innovation Theory, so research on the characteristics of actual smartphone users has been lacking. In this research, we divided smartphone applications into five functional categories (Call & Text / Music & Video / Information Search / Game / Social Network Service (SNS)). We analyzed differences in the characteristics of the users of each application category and found that the differences were statistically significant in both demographic and smartphone usage characteristics (frequency of downloading applications and experience of downloading paid applications). Additionally, smartphone usage characteristics are closely related to usage duration. The representative result is that people who actively used the Music & Video function were typically women in their 20s who downloaded applications more than three times per week and had experience of downloading paid applications. This is a positive result for players in the application market, because it means these users are willing to pay for paid applications. However, large companies already hold most of the market share in music applications, so small and medium-sized players need to develop innovative and distinctive business models in order to succeed. We believe these results provide significant implications for players planning a successful business model and developing user-specific application products.

XML Document Editing System for Structural Processing of the Digital Document to Including Mathematical Formula (수식을 포함한 전자문헌의 구조적 처리를 위한 XML 문서편집시스템)

  • 윤화묵;유범종;김창수;정회경
    • Journal of the Korean Society for information Management / v.19 no.4 / pp.96-111 / 2002
  • Institutions and organizations hold large amounts of accumulated data, but most of it remains in a format standardized separately by each institution or organization, which makes information difficult to exchange and share. The concept of knowledge information resource management was introduced to overcome this disadvantage, and knowledge information resources are being digitized so that accumulated data can be shared and managed. In science and technology and in education and scholarship in particular, XML is increasingly adopted to process structurally the data needed for exchanging and sharing knowledge information resources. However, the many mathematical expressions used in electronic documents in these fields are processed as unstructured data such as images or text, which limits search, indexing, and reuse. Interest has therefore converged on processing mathematical expressions with MathML, and a solution is required that can process MathML easily and efficiently in structured documents. In this paper, we designed and implemented an XML document editing system that makes structural processing of electronic documents for knowledge information resources easy and that creates and renders MathML in structured documents without expert knowledge of MathML.

The Search of the Habitus Formation Process in Professional Football Club Supporters (프로축구 서포터즈의 아비투스 형성과정 탐색)

  • Oh, Byoung-Don;Yu, Young-Seol
    • Journal of the Korea Academia-Industrial cooperation Society / v.14 no.8 / pp.3672-3681 / 2013
  • The purpose of this study was to analyze the subculture of professional football supporters from the viewpoint of Bourdieu's habitus theory and to identify what sorts of mechanisms were at work when supporters formed their habitus. The method adopted was qualitative. Qualitative information on the nature of their fandom was gleaned from 'virtual participant observation' and interviews with professional football fans who participated in discussions on the internet, and from participant observation of those fans. Intensity and criterion sampling were used to select 6 supporters who had participated in supporting activities for more than five years and played crucial roles in running their organization as officials. Textual analysis, consisting of translation, coding, and processing, was used here to mean the identification and exegesis of contextualization cues that make a text meaningful to an intended audience, in this case professional football supporters. The findings of this study were that (1) a variety of subcultural activities directed at players and coaching staff were the fundamental factors when supporters formed their subculture, and (2) professional football supporters acquired their habitus through the progressive development of subcultures such as enthusiastic supporters, small meetings, and events relating to soccer players.

A Korean Document Sentiment Classification System based on Semantic Properties of Sentiment Words (감정 단어의 의미적 특성을 반영한 한국어 문서 감정분류 시스템)

  • Hwang, Jae-Won;Ko, Young-Joong
    • Journal of KIISE:Software and Applications / v.37 no.4 / pp.317-322 / 2010
  • This paper proposes a way to improve the performance of a Korean document sentiment classification system using the semantic properties of sentiment words. A sentiment word is a word that carries sentiment, and sentiment features are defined as a set of sentiment words, which are an important lexical resource for sentiment classification. A sentiment feature has a different sentiment intensity in the general field and in a specific domain. In the general field, the sentiment intensity can be estimated using snippets from a search engine, while in a specific domain, training data can be used for this estimation. The estimated sentiment intensity of a sentiment feature is called its semantic orientation and is used to estimate the sentiment intensity of the sentences in text documents. After estimating the sentiment intensity of the sentences, we apply it to the weights of the sentiment features. We evaluate our system with a support vector machine in three cases: general, domain-specific, and combined general/domain-specific semantic orientation. Our experimental results show improved performance in all cases; in particular, with general/domain-specific semantic orientation, the proposed method performs 3.1% better than a baseline system indexed by content words only.

Distribution Information Technology Investment and the Market Value of the Firm : Focusing on RFID case (한국에서 유통정보기술 투자가 주가에 미치는 영향에 관한 연구 : RFID 사례를 중심으로)

  • Son, Sam-Ho
    • Journal of Distribution Science / v.16 no.10 / pp.65-76 / 2018
  • Purpose - This paper investigates how the market value of firms is affected by distribution information technology investment in Korea over time and across markets, industries, and project characteristics. This is the first empirical study on the market payoffs from RFID investment in Korea. The purpose of this study is to provide an appropriate guideline for investors and practitioners with respect to announcements of RFID adoption in Korea. This guideline will encourage practitioners to monitor and evaluate the benefits and costs of the innovative RFID technology. Research design, data, and methodology - This paper employs event study methodology to analyze the payoffs from distribution information technology investment announcements over a fifteen-year period from 2003 to 2017. The event study method rests on assumptions such as market efficiency, unanticipated RFID investment announcements, and no confounding effects in the data. This study collected information on RFID investment announcements from January 2003 through December 2017 using Bigkinds, a full-text search engine provided by the Korea Press Foundation. This paper selected 88 announcements of RFID adoption by 46 firms. It estimated the payoffs from RFID investment announcements over event windows using the market model of McWilliams and Siegel (1997) and calculated the Z-values. Using these test statistics, we could infer whether RFID adoption makes large differences in abnormal returns across various classifications of firms. Results - There are significant positive market returns from announcements of distribution information technology investment in the pre-2009 period, but the significance of the payoffs disappears in the post-2009 period. 
For this reason, investors and practitioners can understand the importance of market entry timing and the fact that the greater rewards may go to early innovators while late imitators cannot reap such rewards. This paper also finds that there are large differences in the payoffs from the announcements across markets, industries, and project characteristics. Conclusions - Analysing the selected sample of 88 announcements of RFID adoption over the fifteen-year period from 2003 to 2017, this study finds not only significant abnormal returns from RFID investment announcements but also large differences in the abnormal returns over time and across firm sizes or affiliated markets, industries, and project characteristics. This means that there is considerable value for investors across various firm classifications. The findings of this paper provide useful implications for practitioners deciding whether to adopt innovative technologies in general, taking into account the various concrete circumstances in Korea.
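
The event study calculation named in this abstract can be sketched with the textbook market model: fit the stock's return against the market return over an estimation window, then treat the deviation from the fitted line in the event window as the abnormal return. This is only a generic illustration under that assumption; the return series below are made-up numbers, the function names are invented for the example, and the Z-value standardization mentioned in the abstract is not reproduced.

```python
def fit_market_model(stock_returns, market_returns):
    """Ordinary least squares fit of stock = alpha + beta * market."""
    n = len(stock_returns)
    mean_s = sum(stock_returns) / n
    mean_m = sum(market_returns) / n
    cov = sum((m - mean_m) * (s - mean_s)
              for m, s in zip(market_returns, stock_returns))
    var = sum((m - mean_m) ** 2 for m in market_returns)
    beta = cov / var
    alpha = mean_s - beta * mean_m
    return alpha, beta


def abnormal_returns(stock_returns, market_returns, alpha, beta):
    """Actual return minus the return the market model predicts for each day."""
    return [s - (alpha + beta * m)
            for s, m in zip(stock_returns, market_returns)]


# Made-up estimation window in which the stock follows the model exactly.
market_est = [0.01, -0.02, 0.005, 0.015]
stock_est = [0.001 + 1.5 * m for m in market_est]
alpha, beta = fit_market_model(stock_est, market_est)

# Made-up two-day event window around an announcement.
market_evt = [0.01, 0.0]
stock_evt = [0.036, 0.011]
ar = abnormal_returns(stock_evt, market_evt, alpha, beta)
car = sum(ar)  # cumulative abnormal return over the event window
print([round(x, 4) for x in ar], round(car, 4))  # [0.02, 0.01] 0.03
```

In a study like the one described, the cumulative abnormal returns of many announcements would then be aggregated and tested for significance within each firm classification.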