• Title/Summary/Keyword: 연관단어 (associated words)


A Study on Text Mining Methods to Analyze Civil Complaints: Structured Association Analysis (민원 분석을 위한 텍스트 마이닝 기법 연구: 계층적 연관성 분석)

  • Kim, HyunJong;Lee, TaiHun;Ryu, SeungEui;Kim, NaRang
    • Journal of Korea Society of Industrial Information Systems
    • /
    • v.23 no.3
    • /
    • pp.13-24
    • /
    • 2018
  • For government and public institutions, civil complaints state citizens' requirements directly and can therefore serve as important data for policy development. However, because complaint text is unstructured, it is difficult to extract accurate requirements with conventional text mining methods. This study proposes a new method that improves on earlier text mining approaches to civil complaint data and derives citizens' precise requirements. The method is based on the principle of the Co-Occurrence Structure Map and is structured as a two-step association analysis: starting from a core subject word, it identifies first-order related words and then second-order related words. The analysis uses 3,004 cases posted on the electronic bulletin board of Busan City in 2016. The academic contribution of this study is a method for deriving citizens' requirements from civil complaint data. As a practical contribution, it also enables policy development using civil complaint data.
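
The two-step association idea in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes association is measured by document-level co-occurrence counts, and the function names (`related_words`, `two_step_map`), the `top_k` cutoff, and the sample documents are all hypothetical.

```python
from collections import Counter

def related_words(docs, anchor_words, top_k=2):
    """Rank words by the number of documents they share with all anchor words."""
    anchors = set(anchor_words)
    counts = Counter()
    for doc in docs:
        tokens = set(doc)
        if anchors <= tokens:  # document contains every anchor word
            counts.update(tokens - anchors)
    # Sort by count (descending), then alphabetically for a stable order.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:top_k]

def two_step_map(docs, core, top_k=2):
    """Build {first_order_word: [second_order_words]} around a core word."""
    result = {}
    for first, _ in related_words(docs, [core], top_k):
        second = related_words(docs, [core, first], top_k)
        result[first] = [word for word, _ in second]
    return result

# Hypothetical, already-tokenized complaints (not data from the study).
docs = [
    ["bus", "route", "delay"],
    ["bus", "route", "stop"],
    ["bus", "fare", "card"],
    ["bus", "route", "delay", "night"],
    ["park", "light"],
]
structure = two_step_map(docs, "bus")
print(structure)
```

Conditioning the second step on both the core word and the first-order word keeps the second-order words tied to the core subject, which is one plausible reading of "based on the core subject word".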

Analysis of Author Image Based on Book Recommendation from Readers (독자 추천도서 정보를 이용한 작가 이미지 분석 연구)

  • Choi, Sanghee
    • Journal of the Korean Society for information Management
    • /
    • v.34 no.4
    • /
    • pp.153-171
    • /
    • 2017
  • Many readers tend to read books by a specific author and to expand their reading from that author outward. This study chose Edgar Allan Poe and analyzed the author's image using the authors and books that other readers co-recommended. The frequencies of co-occurring authors and books were investigated, and the relations among authors and books were analyzed with network analysis methods. As a result, genre images of Poe, related authors, and related books were identified. This study also suggests methods for identifying an author's image, related author groups, and related books for use in libraries' reading programs and book curation.

Query Extension of Retrieve System Using Hangul Word Embedding and Apriori (한글 워드임베딩과 아프리오리를 이용한 검색 시스템의 질의어 확장)

  • Shin, Dong-Ha;Kim, Chang-Bok
    • Journal of Advanced Navigation Technology
    • /
    • v.20 no.6
    • /
    • pp.617-624
    • /
    • 2016
  • Hangul word embedding requires a noun extraction step. Otherwise, unnecessary words are also trained, and efficient embedding results cannot be obtained. In this paper, we propose a model that retrieves answers more efficiently through query expansion using Hangul word embedding, Apriori, and text mining. The word embedding and Apriori step expands the query by extracting associated words according to the meaning and context of the query. The Hangul text mining step extracts a similar answer and returns it to the user using noun extraction, TF-IDF, and cosine similarity. The proposed model can improve answer accuracy by learning the answers of a specific domain and expanding the query with highly correlated terms. As future research, more highly correlated query terms need to be extracted by analyzing the user queries stored in the database.
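
The retrieval step described above (TF-IDF weighting followed by cosine similarity) might look like the sketch below. It is written under several assumptions: the texts are already reduced to nouns, a smoothed IDF formula is used, and the word embedding and Apriori expansion steps are left out. The helper names and the sample answers are hypothetical, not taken from the paper.

```python
import math
from collections import Counter

def build_idf(docs):
    """Smoothed inverse document frequency for every term in the corpus."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return {term: math.log((1 + n) / (1 + d)) + 1.0 for term, d in df.items()}

def tfidf(tokens, idf):
    """Sparse TF-IDF vector; terms unknown to the corpus are ignored."""
    tf = Counter(t for t in tokens if t in idf)
    return {t: count * idf[t] for t, count in tf.items()}

def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Hypothetical stored answers, already reduced to nouns.
answers = [
    ["password", "reset", "email"],
    ["delivery", "delay", "refund"],
    ["refund", "card", "payment"],
]
idf = build_idf(answers)
vectors = [tfidf(a, idf) for a in answers]

query = ["refund", "delay"]  # an (already expanded) user query
q_vec = tfidf(query, idf)
scores = [cosine(q_vec, v) for v in vectors]
best = max(range(len(answers)), key=scores.__getitem__)
print(best, scores)
```

In the proposed model, the query would first be expanded with associated words before this ranking is applied; here the expanded query is simply given.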

The Trends of Artiodactyla Researches in Korea, China and Japan using Text-mining and Co-occurrence Analysis of Words (텍스트마이닝과 동시출현단어분석을 이용한 한국, 중국, 일본의 우제목 연구 동향 분석)

  • Lee, Byeong-Ju;Kim, Baek-Jun;Lee, Jae Min;Eo, Soo Hyung
    • Korean Journal of Environment and Ecology
    • /
    • v.33 no.1
    • /
    • pp.9-15
    • /
    • 2019
  • Artiodactyla, the even-toed mammals, are distributed worldwide. In recent years, wild Artiodactyla species have attracted public attention because of the rapid increase in crop damage and road-kill involving species such as water deer and wild boar, and because of the decline of species such as long-tailed goral and musk deer. Despite this attention, there have been few studies on Artiodactyla in Korea, and none have focused on analyzing research trends, which makes it difficult to understand the actual problems. Many recent trend studies have used text mining and co-occurrence analysis to make the classification of research subjects more objective by extracting keywords from the literature and quantifying the relevance between words. In this study, we analyzed texts from research articles of three countries (Korea, China, and Japan) through text mining and co-occurrence analysis and compared the research subjects in each country. We extracted 199 words from 665 articles related to Artiodactyla in the three countries through text mining. Co-occurrence analysis of the extracted words produced three word clusters. We determined that cluster1 was related to "habitat condition and ecology", cluster2 to "disease", and cluster3 to "conservation genetics and molecular ecology". Comparing the rates of occurrence of each word cluster by country showed that they were relatively even in China and Japan, whereas in Korea cluster2, related to "disease", was dominant (69%). In the regression analysis of the number of words per year in each cluster, the number of words in China and Japan increased evenly by year across clusters, while in Korea the rate of increase of cluster2 was five times that of the other clusters. The results indicate that Korean research on Artiodactyla has tended to focus on disease more than research in China and Japan, and that few researchers have considered other subjects such as habitat characteristics, behavior, and molecular ecology. To control the damage caused by Artiodactyla and to establish a reasonable policy for protecting endangered species, basic ecological data must be accumulated through more research on wild Artiodactyla.
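
The per-cluster regression mentioned in this abstract can be illustrated with an ordinary least-squares slope of word counts against year. This is only a sketch: the abstract does not state the exact regression setup, and the years and counts below are invented for illustration and are not the study's data.

```python
def slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx

# Made-up yearly word counts per cluster (hypothetical numbers).
years = [2014, 2015, 2016, 2017, 2018]
counts = {
    "cluster1": [4, 5, 6, 7, 8],
    "cluster2": [3, 6, 9, 12, 15],
}
slopes = {name: slope(years, ys) for name, ys in counts.items()}
ratio = slopes["cluster2"] / slopes["cluster1"]
print(slopes, ratio)
```

Comparing slopes in this way gives a "times faster" figure of the kind the abstract reports for cluster2 in Korea.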

Associated Keyword Recommendation System for Keyword-based Blog Marketing (키워드 기반 블로그 마케팅을 위한 연관 키워드 추천 시스템)

  • Choi, Sung-Ja;Son, Min-Young;Kim, Young-Hak
    • KIISE Transactions on Computing Practices
    • /
    • v.22 no.5
    • /
    • pp.246-251
    • /
    • 2016
  • Recently, the influence of SNS and online media has been growing rapidly, and interest in marketing with these tools has grown with it. Blog marketing can increase the ripple effect and information delivery of marketing at low cost by achieving priority in the keyword search results of influential portal sites. However, because competition for the top positions in the search results for specific keywords is fierce, long-term and proactive efforts are needed. We therefore propose a new method that recommends groups of associated keywords with a higher likelihood of blog exposure. The proposed method first collects blog documents from the search results for a target keyword, then extracts and filters the keywords with higher association, considering word frequency and location information. Next, each associated keyword is compared with the target keyword, and an associated keyword group with a higher likelihood of exposure is recommended based on information such as their association, the monthly search volume of the associated keyword, the number of blogs included in the search results, and the average writing date of the blogs. The experimental results show that the proposed method recommends keyword groups with higher association.

Profiling and Co-word Analysis of Teaching Korean as a Foreign Language Domain (프로파일링 분석과 동시출현단어 분석을 이용한 한국어교육학의 정체성 분석)

  • Kang, Beomil;Park, Ji-Hong
    • Journal of the Korean Society for information Management
    • /
    • v.30 no.4
    • /
    • pp.195-213
    • /
    • 2013
  • This study aims to establish the identity of the teaching Korean as a Foreign Language (KFL) domain by using journal profiling and co-word analysis in comparison with relevant and adjacent domains. First, by extracting and comparing topic terms, we calculate the similarity of the academic journals of three domains: KFL, teaching Korean as a Native Language (KNL), and Korean Linguistics (KL). The result shows that the KFL journals form a cluster distinct from the others. Profiling analysis and co-word analysis are then conducted to visualize the relationships among the three domains and to uncover the characteristics of KFL. The findings show that KFL is more similar to KNL than to KL. Finally, a comparison of the knowledge structures of the three domains based on the co-word analysis demonstrates the uniqueness of KFL as an independent domain in relation to the other relevant domains.

Contextual Advertisement System based on Document Clustering (문서 클러스터링을 이용한 문맥 광고 시스템)

  • Lee, Dong-Kwang;Kang, In-Ho;An, Dong-Un
    • The KIPS Transactions:PartB
    • /
    • v.15B no.1
    • /
    • pp.73-80
    • /
    • 2008
  • In this paper, an advertisement-keyword finding method using document clustering is proposed to solve problems caused by ambiguous words and incorrect identification of main keywords. News articles that have similar content and the same advertisement keywords are clustered to construct the contextual information of the advertisement keywords. In addition to news articles, the web page and summary of a product are also used to construct the contextual information. A given document is classified into one of the news article clusters, and the advertisement keywords relevant to that cluster are then used to identify keywords in the document. The proposed method achieved a 21% improvement in precision.

Favorable analysis of users through the social data analysis based on sentimental analysis (소셜데이터 감성분석을 통한 사용자의 호감도 분석)

  • Lee, Min-gyu;Sohn, Hyo-jung;Seong, Baek-min;Kim, Jong-bae
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference
    • /
    • 2014.10a
    • /
    • pp.438-440
    • /
    • 2014
  • Recently, data from SNS services has been actively used for commercial purposes. In this paper, we therefore propose a method that can accurately analyze information related to the reputation of companies and products in a real-time SNS environment. Relationships between words are identified by performing morphological analysis on text data gathered by crawling SNS. In addition, the morphemes extracted from sentences are analyzed statistically against an established sentiment dictionary, and the results are visualized. We also propose an algorithm that automatically adds an extracted word to the sentiment dictionary if the word is not already in it.


Ranking Contribution of Star in Each Domain Using Association Text Mining News Articles on the Web (뉴스기사의 연관 단어 텍스트 마이닝을 이용한 스타의 분야별 기여도순위 비교기법)

  • Kang, Yoonjeong;Yoon, Jaeyeol;Lim, JiYeon;Kim, Ung-mo
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2011.11a
    • /
    • pp.1191-1194
    • /
    • 2011
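  • Star marketing is a marketing strategy in which a star's popularity with the public is used to enhance a brand's image and draw commercial impact. Today's stars are active not only in broadcasting and entertainment but also in diverse fields such as sports, politics, and social contribution, and a star's image is affected by those activities. Because a star's image is directly linked to the image of a brand or company, analyzing it in advance is an important element of marketing. We therefore propose a method that classifies the domains in which stars are generally active and, when a star is searched for, ranks the star's contribution by domain, showing in which fields the star is active and contributing. Using text mining techniques on news articles, related words for the star's name and for the activity domains are extracted according to frequency. Using these related words, the news articles about the star that relate to each domain are counted, and when the coverage of a domain is positive or negative, a polarity is assigned so that the weight differs. The fields to which the star contributes are then ranked by a score that takes both frequency and polarity into account.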
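
The frequency-and-polarity scoring described above could be sketched as follows. The abstract does not give the actual weights or word lists, so the polarity weights, domain vocabularies, and sample articles here are all assumptions made for illustration, as is the function name `rank_domains`.

```python
# Hypothetical related words per domain (in the paper these are mined by frequency).
DOMAIN_WORDS = {
    "sports": {"match", "goal"},
    "charity": {"donation", "volunteer"},
}
# Assumed polarity weights; the paper's real values are not given in the abstract.
POLARITY_WEIGHT = {"positive": 1.0, "neutral": 0.5, "negative": -1.0}

def rank_domains(articles, domain_words, weights):
    """Score each domain by polarity-weighted article counts and rank them."""
    scores = {domain: 0.0 for domain in domain_words}
    for tokens, polarity in articles:
        token_set = set(tokens)
        for domain, words in domain_words.items():
            if token_set & words:  # article mentions this domain
                scores[domain] += weights[polarity]
    # Highest score first; ties broken alphabetically.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))

# Hypothetical news articles about one star: (tokens, polarity label).
articles = [
    (["star", "goal", "match"], "positive"),
    (["star", "donation"], "positive"),
    (["star", "match", "injury"], "negative"),
    (["star", "volunteer", "donation"], "neutral"),
]
ranking = rank_domains(articles, DOMAIN_WORDS, POLARITY_WEIGHT)
print(ranking)
```

With these assumed weights, negative coverage lowers a domain's score, so a domain with fewer but more favorable articles can outrank one with more mixed coverage.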

A Language Model based Knowledge Network for Analyzing Disaster Safety related Social Interest (재난안전 사회관심 분석을 위한 언어모델 활용 정보 네트워크 구축)

  • Choi, Dong-Jin;Han, So-Hee;Kim, Kyung-Jun;Bae, Eun-Sol
    • Proceedings of the Korean Society of Disaster Information Conference
    • /
    • 2022.10a
    • /
    • pp.145-147
    • /
    • 2022
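  • This paper points out the limitations of existing methods for building information networks or knowledge graphs used to discover issues in large-scale text data, and proposes a new method that builds an information network at the sentence level. First, we measured the distribution of the number of words and characters in each sentence and set threshold values to remove noise such as onomatopoeia. Next, we vectorized all sentences using a BERT-based language model and measured the similarity between two sentence vectors using cosine similarity. To minimize misclassified similarity results, we developed an algorithm that compares the semantic relatedness of nouns. Reviewing the results of the proposed similar-sentence comparison algorithm, we confirmed that the paired sentences, though phrased differently, deal with the same topic and content. The method proposed in this paper is a new approach that can overcome the difficulty of interpreting word-level knowledge graphs. If applied in the future to futures-research areas such as issue and trend analysis, it could contribute to gathering social interest in specific topics on a data-driven basis and to deriving policy recommendations that reflect demand.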
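
The sentence-level network construction described above can be sketched in a simplified form. In this sketch the sentence vectors are small hand-written stand-ins for BERT embeddings, the thresholds (`min_words`, `sim_threshold`) are assumed values, and the noun-relatedness check from the paper is omitted; none of these details come from the source.

```python
import math
from itertools import combinations

def cosine(u, v):
    """Cosine similarity between two dense vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def build_edges(sentences, vectors, min_words=3, sim_threshold=0.8):
    """Drop very short (noisy) sentences, then link pairs of similar sentences."""
    kept = [i for i, s in enumerate(sentences) if len(s.split()) >= min_words]
    edges = []
    for i, j in combinations(kept, 2):
        sim = cosine(vectors[i], vectors[j])
        if sim >= sim_threshold:
            edges.append((i, j, sim))
    return edges

# Hypothetical sentences with toy 3-dimensional vectors standing in for
# the output of a BERT-based sentence encoder.
sentences = [
    "heavy rain flooded the road",
    "the road was flooded by heavy rain",
    "boom boom",
    "new park opens downtown",
]
vectors = [
    [0.9, 0.1, 0.0],
    [0.8, 0.2, 0.1],
    [0.5, 0.5, 0.5],
    [0.0, 0.1, 0.9],
]
edges = build_edges(sentences, vectors)
print(edges)
```

Each edge links two sentence indices, so the resulting network has sentences rather than single words as nodes, which is the property the abstract highlights.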
