• Title/Abstract/Keywords: Network Mining

Search results: 1,053 items (processing time: 0.026 s)

Examining the Intellectual Structure of Records Management & Archival Science in Korea with Text Mining

  • 이재윤;문주영;김희정
    • 한국문헌정보학회지 / Vol. 41, No. 1 / pp.345-372 / 2007
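  • This study analyzed the intellectual structure of records management and archival science research by applying document clustering and document similarity network analysis, two major text mining techniques. The data comprised 145 records management articles selected from five representative Korean library and information science journals published from 2001 to 2006. Cluster-level analysis showed that the core subject areas of records management research conducted in Korea were <electronic records management and digital preservation>, <records management policy and institutions>, <archival description/cataloging>, and <scope of records management studies/education>, while document-level analysis confirmed that <digital archiving> was the central subject area. Period-by-period analysis further revealed that <archival information services> was newly emerging as a subject area.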
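
The abstract above names document similarity network analysis without giving details. As a rough illustration only, the sketch below assumes cosine similarity over raw term counts and a hypothetical cutoff of 0.3 for drawing an edge; the function names, the threshold, and the toy documents are all assumptions and are not taken from the study.

```python
import math
from collections import Counter
from itertools import combinations


def cosine_similarity(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def similarity_network(docs, threshold=0.3):
    """Return edges (i, j, similarity) for document pairs at or above threshold.

    The threshold is an assumed value chosen for this toy example.
    """
    vectors = [Counter(doc.lower().split()) for doc in docs]
    edges = []
    for i, j in combinations(range(len(vectors)), 2):
        sim = cosine_similarity(vectors[i], vectors[j])
        if sim >= threshold:
            edges.append((i, j, round(sim, 3)))
    return edges


# Hypothetical toy documents, not the 145 articles analyzed in the study.
docs = [
    "electronic records digital preservation",
    "digital preservation of electronic archives",
    "records management education curriculum",
]
edges = similarity_network(docs)
print(edges)
```

Documents that share many terms end up linked, and clusters or central documents can then be read off the resulting graph.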

StrokeBase: A Database of Cerebrovascular Disease-related Candidate Genes

  • Kim, Young-Uk;Kim, Il-Hyun;Bang, Ok-Sun;Kim, Young-Joo
    • Genomics & Informatics / Vol. 6, No. 3 / pp.153-156 / 2008
  • Complex diseases such as stroke and cancer have two or more genetic loci and are affected by environmental factors that contribute to the diseases. Due to the complex characteristics of these diseases, identifying candidate genes requires a system-level analysis of the following: gene ontology, pathway, and interactions. A database and user interface, termed StrokeBase, was developed; StrokeBase provides queries that search for pathways, candidate genes, candidate SNPs, and gene networks. The database was developed by using in silico data mining of HGNC, ENSEMBL, STRING, RefSeq, UCSC, GO, HPRD, KEGG, GAD, and OMIM. Forty candidate genes that are associated with cerebrovascular disease were selected by human experts and public databases. Networked cerebrovascular disease gene maps were also developed; these maps describe gene-gene interactions and biological pathways. We identified 1127 genes, related indirectly to cerebrovascular disease but directly to the etiology of cerebrovascular disease. We found that a protein-protein interaction (PPI) network that was associated with cerebrovascular disease follows the power-law degree distribution that is evident in other biological networks. Not only was in silico data mining utilized, but 250K Affymetrix SNP chips were also utilized in the 320 control/disease association study to generate associated markers that were pertinent to cerebrovascular disease as a genome-wide search. The associated genes and the genes that were retrieved from the in silico data mining system were compared and analyzed. We developed a well-curated cerebrovascular disease-associated gene network and provided bioinformatic resources to cerebrovascular disease researchers. This cerebrovascular disease network can be used as a framework for systematic genomic research, applicable to other complex diseases. Therefore, the ongoing database efficiently supports medical and genetic research in order to overcome cerebrovascular disease.

Analysis of Unstructured Data on Detecting of New Drug Indication of Atorvastatin

  • 정휘수;강길원;최웅;박종혁;신광수;서영성
    • Journal of health informatics and statistics / Vol. 43, No. 4 / pp.329-335 / 2018
  • Objectives: In recent years, there has been an increased need for a way to extract desired information from many medical publications at once. This study was conducted to confirm the usefulness of unstructured data analysis of previously published medical literature in searching for new indications. Methods: New indications were searched for through text mining, network analysis, and topic modeling analysis of 5,057 articles, published from 1990 to 2017, on atorvastatin, a treatment for hyperlipidemia. Results: A total of 273 keywords were extracted. In the frequency analysis from text mining and in the network analysis, the existing indications of atorvastatin ranked at the top. The novel indications identified by Term Frequency-Inverse Document Frequency (TF-IDF) were atrial fibrillation, heart failure, breast cancer, rheumatoid arthritis, combined hyperlipidemia, arrhythmias, multiple sclerosis, non-alcoholic fatty liver disease, contrast-induced acute kidney injury, and prostate cancer. Conclusions: Unstructured data analysis for discovering new indications from massive medical literature is expected to be used in the drug repositioning industry.

A Study of Consumer Perception on Fashion Show Using Big Data Analysis

  • 김다정;이승희
    • 패션비즈니스 / Vol. 23, No. 3 / pp.85-100 / 2019
  • This study examines changes in consumer perceptions of fashion shows, which are critical elements in the apparel industry and a means to represent a brand's image and originality. For this purpose, big data analysis in clothing marketing, text mining, and semantic network analysis techniques were applied. This study aims to verify the effectiveness and significance of fashion shows in an effort to give directions for their future utilization. The study was conducted in two major stages. First, data were collected with the keyword "fashion shows" across websites including Naver and Daum between 2015 and 2018. The data collection period was divided into first-half and second-half periods. Next, Textom 3.0 was utilized for data refinement, text mining, and word cloud generation. Ucinet 6.0 and NetDraw were used for semantic network analysis, degree centrality, CONCOR analysis, and visualization. The level of interest in "models" was found to be the highest among the perception factors related to fashion shows in both periods. In the first-half period, consumer interest focused on detailed visual stimulants such as models and clothing, while in the second-half period, perceptions changed as the value of designers and brands was increasingly recognized over time. The findings of this study can be utilized as a tool to evaluate fashion shows, apparel industry sectors, and marketing methods. Additionally, they can be used as a theoretical framework for big data analysis and as a basis for strategies and research in industrial development.
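
Degree centrality, one of the measures mentioned in the abstract above, can be illustrated with a small keyword co-occurrence network. This is a minimal sketch that assumes the common normalization (number of neighbors divided by n - 1); it is a stand-in written for illustration, not the Ucinet procedure the authors ran, and the keyword lists are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def cooccurrence_network(documents):
    """Link two keywords whenever they appear in the same document."""
    neighbors = defaultdict(set)
    for keywords in documents:
        for a, b in combinations(sorted(set(keywords)), 2):
            neighbors[a].add(b)
            neighbors[b].add(a)
    return neighbors


def degree_centrality(neighbors):
    """Normalized degree centrality: neighbors / (n - 1) for each node."""
    n = len(neighbors)
    if n < 2:
        return {node: 0.0 for node in neighbors}
    return {node: len(adj) / (n - 1) for node, adj in neighbors.items()}


# Hypothetical keyword lists, not data from the study.
documents = [
    ["model", "runway", "clothing"],
    ["model", "designer"],
    ["model", "brand", "designer"],
]
centrality = degree_centrality(cooccurrence_network(documents))
for keyword, score in sorted(centrality.items(), key=lambda kv: -kv[1]):
    print(f"{keyword}: {score:.2f}")
```

In this toy network "model" co-occurs with every other keyword, so it receives the highest score, mirroring the kind of result the abstract reports.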

Quantitative Analysis of Research Trends in Korean E-Government Using Text Mining and Network Analysis Methods

  • 이수인;신신애;강동석;김상현
    • 정보화정책 / Vol. 25, No. 4 / pp.84-107 / 2018
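  • Previous studies of e-government research trends in Korea have the weakness of relying solely on qualitative methods. This study therefore conducted a quantitative analysis, as of September 2018, based on data from 1996 to 2017. Text mining yielded seven research topics in total, among which "framework" and "public policy effects" were identified as having high network centrality. The results provide academic and policy implications needed for the advancement of e-government. One implication is that using quantitative analysis, instead of the qualitative analysis that earlier studies mainly relied on, contributes to securing relatively greater objectivity and scholarly diversity.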

Regional Image Change Analysis using Text Mining and Network Analysis

  • 정은희
    • 한국정보전자통신기술학회논문지 / Vol. 15, No. 2 / pp.79-88 / 2022
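  • Social media big data contains a wealth of information for understanding not only consumers' consumption patterns but also the image of a region. In this paper, data containing 'Samcheok' were collected in one-year units from 2015 to 2019 from the blogs and cafes of the Korean portal sites Naver and Daum, and text mining and network analysis were performed to extract the keywords that form the regional image and to analyze how that image changed. According to the results, the regional image in 2015 was expressed through cognitive image elements such as nearby place names and sites, including 'Jangho Port', 'Donghae', and 'beach', whereas in 2016 and 2019 the cognitive image element shifted to Samcheok Sol Beach, a specific site within the region. Because the keywords associated with the regional image include 'Jangho Port', a landmark of Samcheok, and resorts, infrastructure facilities appear to play a major role in forming the regional image. The significance of the network data was tested with the bootstrap method; the p-values for 2015, 2016, and 2019 were 0.0002, 0.0006, and 0.0002, respectively, which are statistically significant at the 5% level.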

Social perception of the Arduino lecture as seen in big data

  • 이은상
    • 정보교육학회논문지 / Vol. 25, No. 6 / pp.935-945 / 2021
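  • The purpose of this study is to analyze the social perception of Arduino lectures using big data analysis methods. To this end, data from January 2012 to May 2021 were collected through the Textom site from Naver's blog, cafe, and news channels, using 'Arduino + lecture' as the search keyword. The collected data were refined with Textom, and text mining and semantic network analyses were performed with Textom, Ucinet 6, and NetDraw. Text mining analyses, including frequency analysis, TF-IDF analysis, and degree centrality, showed that 'education' and 'coding' were among the top keywords. CONCOR analysis, performed for the semantic network analysis, identified four clusters: 'Arduino-related education', 'physical computing lectures', 'Arduino special lectures', and 'GUI programming'. Through this study, a variety of meaningful social perceptions held by the general public about Arduino lectures on the internet were identified. The results are expected to serve as material offering meaningful implications for instructors preparing Arduino lectures, researchers studying the topic, and policymakers establishing policies related to software education or coding education.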

Big Data Analysis on the Perception of Home Training According to the Implementation of COVID-19 Social Distancing

  • Hyun-Chang Keum;Kyung-Won Byun
    • International Journal of Internet, Broadcasting and Communication / Vol. 15, No. 3 / pp.211-218 / 2023
  • Due to the implementation of COVID-19 social distancing, interest in 'home training' and the number of its users are rapidly increasing. Therefore, the purpose of this study is to identify perceptions of 'home training' through big data analysis of social media channels and to provide basic data to the related business sector. Big data were collected from various news and social content provided on Naver and Google. Data for three years from March 22, 2020 were collected, based on the time when COVID-19 distancing was implemented in Korea. The collected data included 4,000 Naver blog posts, 2,673 news items, 4,000 cafe posts, 3,989 Knowledge iN entries, and 953 Google news items. TF and TF-IDF were analyzed through text mining, and semantic network analysis was then conducted on 70 keywords. Big data analysis programs such as Textom and Ucinet were used for social big data analysis, and NetDraw was used for visualization. In the text mining analysis, 'home training' was the most frequent term by TF, appearing 4,045 times, followed by 'exercise', 'Homt', 'house', 'apparatus', 'recommendation', and 'diet'. By TF-IDF, the main keywords were 'exercise', 'apparatus', 'home', 'house', 'diet', 'recommendation', and 'mat'. Based on these results, 70 high-frequency keywords were extracted, and semantic indicators and centrality analysis were then conducted. Finally, CONCOR analysis grouped the keywords into a 'purchase cluster', an 'equipment cluster', a 'diet cluster', and an 'execute method cluster'. Based on these four clusters, basic data for the 'home training' business sector were presented, drawing on consumers' main perceptions of 'home training' and the analysis of the semantic network.
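
The abstract above reports different keyword rankings under TF and TF-IDF. The sketch below shows one way this can arise, assuming a common TF-IDF variant (corpus term count multiplied by log(N / document frequency)); the exact weighting used by Textom is not stated in the abstract, and the tokenized documents are hypothetical.

```python
import math
from collections import Counter


def tf_and_tfidf(documents):
    """Return corpus-level TF counts and TF-IDF scores for each term.

    Assumed weighting: tfidf = tf * log(N / df), where df is the number
    of documents that contain the term.
    """
    n_docs = len(documents)
    tf = Counter()
    df = Counter()
    for tokens in documents:
        tf.update(tokens)
        df.update(set(tokens))
    tfidf = {term: tf[term] * math.log(n_docs / df[term]) for term in tf}
    return tf, tfidf


# Hypothetical tokenized posts, not the data collected in the study.
documents = [
    ["home_training", "exercise", "mat"],
    ["home_training", "diet", "exercise"],
    ["home_training", "apparatus"],
]
tf, tfidf = tf_and_tfidf(documents)
print(tf.most_common(3))
print(sorted(tfidf.items(), key=lambda kv: -kv[1]))
```

A term that appears in every document, like the search keyword itself, has the highest TF but an inverse document frequency of zero under this weighting, which is one plausible reason a search term can lead the TF list yet be absent from the TF-IDF list.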

Dual-stream Co-enhanced Network for Unsupervised Video Object Segmentation

  • Hongliang Zhu;Hui Yin;Yanting Liu;Ning Chen
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 18, No. 4 / pp.938-958 / 2024
  • Unsupervised Video Object Segmentation (UVOS) is a highly challenging problem in computer vision because the annotation of the target object in the testing video is entirely unknown. The main difficulty is to effectively handle the complicated and changeable motion state of the target object and the confusion caused by similar background objects in the video sequence. In this paper, we propose a novel deep Dual-stream Co-enhanced Network (DC-Net) for UVOS via bidirectional motion cues refinement and multi-level feature aggregation, which can take full advantage of motion cues and effectively integrate features from different levels to produce high-quality segmentation masks. DC-Net is a dual-stream architecture in which the two streams are co-enhanced by each other. One is a motion stream with a Motion-cues Refine Module (MRM), which learns from bidirectional optical flow images and produces a fine-grained, complete, and distinctive motion saliency map; the other is an appearance stream with a Multi-level Feature Aggregation Module (MFAM) and a Context Attention Module (CAM), which are designed to integrate features from different levels effectively. Specifically, the motion saliency map obtained by the motion stream is fused with each stage of the decoder in the appearance stream to improve the segmentation, and in turn the segmentation loss in the appearance stream feeds back into the motion stream to enhance the motion refinement. Experimental results on three datasets (Davis2016, VideoSD, SegTrack-v2) demonstrate that DC-Net achieves results comparable with some state-of-the-art methods.

Exploration of Association Rules for Social Survey Data

  • Park, Hee-Chang;Cho, Kwang-Hyun
    • 한국데이터정보과학회:학술대회논문집 / 한국데이터정보과학회 2005 Spring Conference / pp.18-24 / 2005
  • Data mining methods include decision trees, association rules, clustering, neural networks, and so on. Data mining is a method for finding useful information in large amounts of data in a database. It is used to find hidden knowledge, unexpected patterns, and new rules and relations in massive data. We analyze the 2003 Gyeongnam social indicator survey data using the association rule technique to obtain environmental information. Association rules are useful for determining correlations between attributes of a relation and have applications in the marketing, financial, and retail sectors. The association rule outputs can be used for environmental preservation and environmental improvement.
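
Association rules of the kind described above are usually scored by support and confidence. The sketch below is a minimal illustration restricted to single-item rules; the thresholds, function name, and survey responses are hypothetical and do not come from the Gyeongnam survey or the authors' implementation.

```python
from itertools import combinations


def association_rules(transactions, min_support=0.5, min_confidence=0.6):
    """Find rules lhs -> rhs between single items.

    support(lhs, rhs) = share of transactions containing both items
    confidence        = support(lhs, rhs) / support(lhs)
    """
    n = len(transactions)
    sets = [set(t) for t in transactions]
    items = sorted(set().union(*sets))

    def support(itemset):
        return sum(1 for t in sets if itemset <= t) / n

    rules = []
    for a, b in combinations(items, 2):
        pair_support = support({a, b})
        if pair_support < min_support:
            continue  # the pair is too rare to yield a rule
        for lhs, rhs in ((a, b), (b, a)):
            confidence = pair_support / support({lhs})
            if confidence >= min_confidence:
                rules.append((lhs, rhs, pair_support, confidence))
    return rules


# Hypothetical survey responses, one set of attributes per respondent.
responses = [
    ["recycles", "concerned_air"],
    ["recycles", "concerned_air", "uses_transit"],
    ["recycles"],
    ["concerned_air", "uses_transit"],
]
rules = association_rules(responses)
for lhs, rhs, sup, conf in rules:
    print(f"{lhs} -> {rhs}: support={sup:.2f}, confidence={conf:.2f}")
```

Raising `min_support` or `min_confidence` trims the rule list, which is the usual way to keep the output manageable on real survey data.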
