• Title/Summary/Keyword: Bibliographic big data

Developing Graphic Interface for Efficient Online Searching and Analysis of Graph-Structured Bibliographic Big Data (그래프 구조를 갖는 서지 빅데이터의 효율적인 온라인 탐색 및 분석을 지원하는 그래픽 인터페이스 개발)

  • You, Youngseok;Park, Beomjun;Jo, Sunhwa;Lee, Suan;Kim, Jinho
    • The Journal of Bigdata
    • /
    • v.5 no.1
    • /
    • pp.77-88
    • /
    • 2020
  • Recently, much research has been done to organize and analyze the various complex real-world relationships that are represented in the form of graphs. In particular, bibliographic data systems for the computer science literature, such as DBLP, are representative graph data composed of papers, their authors, and citations among papers. Because graph data is very complex in storage structure and representation, it is very difficult to search, analyze, and visualize large-scale bibliographic big data. In this paper, we develop a graphical user interface tool, called EEUM, which visualizes bibliographic big data in the form of graphs. EEUM lets users browse bibliographic big data along the connected graph structure by visually displaying graph data, and implements search, management, and analysis of the bibliographic big data. We also show that EEUM can be conveniently used for searching, exploring, and analyzing by applying it to the bibliographic graph big data provided by DBLP. Through EEUM, users can easily find influential authors or papers in each research field and conveniently use it as a search and analysis tool for complex bibliographic big data, for example to get an overview of all the relationships between several authors and papers.

Automatic Switching of Clustering Methods based on Fuzzy Inference in Bibliographic Big Data Retrieval System

  • Zolkepli, Maslina;Dong, Fangyan;Hirota, Kaoru
    • International Journal of Fuzzy Logic and Intelligent Systems
    • /
    • v.14 no.4
    • /
    • pp.256-267
    • /
    • 2014
  • An automatic switch among ensembles of clustering algorithms is proposed as part of a bibliographic big data retrieval system. It uses a fuzzy inference engine as a decision support tool to select the fastest-performing clustering algorithm among fuzzy C-means (FCM) clustering, Newman-Girvan clustering, and the combination of both. It aims to achieve the best clustering performance while reducing computational complexity from O($n^3$) to O($n$). The automatic switch is implemented as a fuzzy logic controller written in Java and accepts three inputs from each clustering result: the number of clusters, the number of vertices, and the time taken to complete the clustering process. Experimental results on a PC (Intel Core i5-3210M at 2.50 GHz) demonstrate that the combination of both clustering algorithms is selected as the best-performing algorithm in 20 out of 27 cases, with the highest percentage of 83.99%, completed in 161 seconds. The self-adapted FCM is selected as the best-performing algorithm in 4 cases and Newman-Girvan in 3 cases. The automatic switch is to be incorporated into the bibliographic big data retrieval system, which focuses on visualizing fuzzy relationships using a hybrid approach combining the FCM and Newman-Girvan algorithms, and is planned for public release over the Internet.

A Quantitative Analysis on Machine Learning and Smart Farm with Bibliographic Data from 2013 to 2023

  • Yong Sauk Hau
    • International Journal of Internet, Broadcasting and Communication
    • /
    • v.16 no.3
    • /
    • pp.388-393
    • /
    • 2024
  • The convergence of machine learning and smart farm is becoming more and more important. The purpose of this research is to quantitatively analyze machine learning and smart farm with bibliographic data from 2013 to 2023. This study analyzed 251 articles, filtered from the Web of Science, with regard to the article publication trend, the article citation trend, the top 10 research areas, and the top 10 keywords representing the articles. The quantitative analysis reveals four points. First, the number of article publications in machine learning and smart farm has grown continuously since 2016. Second, article citations in machine learning and smart farm have increased drastically since 2018. Third, Computer Science, Engineering, Agriculture, Telecommunications, Chemistry, Environmental Sciences Ecology, Material Science, Instruments Instrumentation, Science Technology Other Topics, and Physics are the top 10 research areas. Fourth, the top 10 authors' keywords in the articles on machine learning and smart farm from 2013 to 2023 are 'machine learning', 'smart farming', 'internet of things', 'precision agriculture', 'deep learning', 'agriculture', 'big data', 'machine', 'smart', and 'smart agriculture'.

A Study on Interdisciplinary Structure of Big Data Research with Journal-Level Bibliographic-Coupling Analysis (학술지 단위 서지결합분석을 통한 빅데이터 연구분야의 학제적 구조에 관한 연구)

  • Lee, Boram;Chung, EunKyung
    • Journal of the Korean Society for Information Management
    • /
    • v.33 no.3
    • /
    • pp.133-154
    • /
    • 2016
  • An interdisciplinary approach has been recognized as one of the key strategies for addressing various and complex research problems in modern science. The purpose of this study is to investigate the interdisciplinary characteristics and structure of the field of big data. Among the 1,083 journals related to the field of big data, multiple Subject Categories (SC) from the Web of Science were assigned to 420 journals (38.8%), and 239 journals (22.1%) were assigned SCs from different fields. These results show that the field of big data exhibits the characteristics of interdisciplinarity. In addition, through bibliographic coupling network analysis of the top 56 journals, 10 clusters in the network were recognized. Among the 10 clusters, 7 were from the computer science field, focusing on technical aspects such as storing, processing, and analyzing the data. The results of the cluster analysis also identified multiple research works on analyzing and utilizing big data in various fields such as science & technology, engineering, communication, law, geography, and bio-engineering. Finally, measuring three types of centrality (betweenness centrality, nearest centrality, triangle betweenness centrality) of journals showed that computer science journals appear to have strong impact and subjective relations to other fields in the network.
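As a rough illustration of the bibliographic coupling idea mentioned in the abstract above, the sketch below counts how many cited references two journals share. It is not the authors' implementation: the function name `coupling_strength`, the journal labels, and the reference identifiers are hypothetical, and the raw shared-reference count is only one common way to weight a coupling network.

```python
from itertools import combinations


def coupling_strength(references):
    """Count shared references for every journal pair.

    `references` maps a journal name to the set of works it cites.
    Pairs with no shared reference are omitted from the result.
    """
    strengths = {}
    for a, b in combinations(sorted(references), 2):
        shared = len(set(references[a]) & set(references[b]))
        if shared:
            strengths[(a, b)] = shared
    return strengths


# Hypothetical toy data, not taken from the study.
journal_refs = {
    "Journal A": {"r1", "r2", "r3"},
    "Journal B": {"r2", "r3", "r4"},
    "Journal C": {"r5"},
}
coupling = coupling_strength(journal_refs)
print(coupling)
```

The resulting pair weights could serve as edge weights of a journal network on which clustering and centrality measures are then computed.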

Big Data Platform for Public Library Users: Focusing on the Cultural Programs and Community Service (이용자를 위한 공공도서관 빅데이터 플랫폼 구축 방안 연구 - 문화프로그램 및 커뮤니티 서비스 정보를 중심으로 -)

  • Yoon, SoYoung
    • Journal of the Korean BIBLIA Society for library and Information Science
    • /
    • v.33 no.3
    • /
    • pp.347-370
    • /
    • 2022
  • Most public library websites provide cultural program data in unstructured form, so it cannot be produced and utilized systematically as bibliographic information. Such data is not sufficiently used in existing library big data research or cases, and it risks disappearing when the website is reorganized or the person in charge changes. This study collected and analyzed cultural program and community service data produced in an unstructured manner and developed a data schema that can be used in conjunction with bibliographic data. It proposes sharing and utilizing public library cultural program and community service data and establishing a library big data platform that can serve as an information channel among the librarians who plan cultural programs. Through the platform, library program data posted on library websites can be integrated and managed, securing continuity of work and systematically managing and preserving the specialized service history of individual libraries.

Personalized Book Curation System based on Integrated Mining of Book Details and Body Texts (도서 정보 및 본문 텍스트 통합 마이닝 기반 사용자 맞춤형 도서 큐레이션 시스템)

  • Ahn, Hee-Jeong;Kim, Kee-Won;Kim, Seung-Hoon
    • Journal of Information Technology Applications and Management
    • /
    • v.24 no.1
    • /
    • pp.33-43
    • /
    • 2017
  • Content curation services based on big data analysis are receiving great attention in various content fields, such as film, games, music, and books. Such a service recommends personalized content to a user based on the user's preferences. Existing book curation systems recommend books to users by using bibliographic citations, user profiles, or user log data. However, these systems have difficulty recommending books related to character names or spatio-temporal information in the text. Therefore, in this paper, we suggest a personalized book curation system based on integrated mining of a book. The proposed system consists of a mining system, a recommendation system, and a visualization system. The mining system analyzes book text, user information or profiles, and SNS data. The recommendation system recommends personalized books for users based on the data analyzed in the mining system. This system can recommend related books based on book keywords even when there is no user information, as with a new customer. The visualization system visualizes bibliographic information about books, mining data such as keywords, characters, and character relations, and book recommendation results. In addition, this paper covers the design and implementation of the proposed mining and recommendation modules in the system. The proposed system is expected to broaden users' selection of books and encourage balanced consumption of book content.

Technology Clustering Using Textual Information of Reference Titles in Scientific Paper (과학기술 논문의 참고문헌 텍스트 정보를 활용한 기술의 군집화)

  • Park, Inchae;Kim, Songhee;Yoon, Byungun
    • Journal of Korean Society of Industrial and Systems Engineering
    • /
    • v.43 no.2
    • /
    • pp.25-32
    • /
    • 2020
  • Data on patents and scientific papers is considered a useful information source for analyzing technological information and has been widely utilized. Technology big data is analyzed in various ways to identify the latest technological trends and predict promising future technologies. Clustering is one way to discover new features by creating groups from technology big data. Patents include refined bibliographic information such as patent classification codes, whereas scientific papers do not have appropriate bibliographic information for clustering. This research proposes a new approach for clustering scientific paper data by utilizing the reference titles in each scientific paper. In this approach, the reference titles are treated as textual information because each reference contains the title of the cited paper, which represents the core content of that paper. We collected scientific paper data, extracted the titles of the references, and conducted clustering by measuring text-based similarity. The results of the proposed approach are compared with the results of two existing methodologies: one utilizing textual information from titles and abstracts, and the other a citation-based approach. The approach suggested in this paper shows a statistically significant difference compared to the existing approaches and shows better clustering performance. The proposed approach can be considered a useful method for clustering scientific papers.
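To make the idea of text-based similarity over reference titles concrete, here is a minimal sketch. The abstract does not state which similarity measure the authors used, so Jaccard similarity over lowercase word sets is an assumption chosen for simplicity. The helper names `title_tokens` and `jaccard` and the sample titles are hypothetical.

```python
def title_tokens(reference_titles):
    """Collect the lowercase words that appear in a paper's reference titles."""
    tokens = set()
    for title in reference_titles:
        tokens.update(title.lower().split())
    return tokens


def jaccard(a, b):
    """Jaccard similarity of two token sets (an assumed measure, see lead-in)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Hypothetical reference titles for two papers.
paper_x = ["Deep learning for images", "Graph clustering methods"]
paper_y = ["Graph clustering algorithms", "Deep learning survey"]

similarity = jaccard(title_tokens(paper_x), title_tokens(paper_y))
print(round(similarity, 3))
```

A pairwise similarity matrix built this way could then be passed to any standard clustering routine. A real pipeline would likely also remove stop words and weight terms.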

Exploring the dynamic knowledge structure of studies on the Internet of things: Keyword analysis

  • Yoon, Young Seog;Zo, Hangjung;Choi, Munkee;Lee, Donghyun;Lee, Hyun-woo
    • ETRI Journal
    • /
    • v.40 no.6
    • /
    • pp.745-758
    • /
    • 2018
  • A wide range of studies in various disciplines has focused on the Internet of Things (IoT) and cyber-physical systems (CPS). However, it is necessary to summarize the current status and to establish future directions because each study has its own individual goals independent of the completion of all IoT applications. The absence of a comprehensive understanding of IoT and CPS has disrupted an efficient resource allocation. To assess changes in the knowledge structure and emerging technologies, this study explores the dynamic research trends in IoT by analyzing bibliographic data. We retrieved 54,237 keywords in 12,600 IoT studies from the Scopus database, and conducted keyword frequency, co-occurrence, and growth-rate analyses. The analysis results reveal how IoT technologies have been developed and how they are connected to each other. We also show that such technologies have diverged and converged simultaneously, and that the emerging keywords of trust, smart home, cloud, authentication, context-aware, and big data have been extracted. We also unveil that the CPS is directly involved in network, security, management, cloud, big data, system, industry, architecture, and the Internet.
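The keyword frequency, co-occurrence, and growth-rate analyses named in the abstract above can be sketched as follows. This is an illustrative sketch, not the authors' procedure. The function names and the sample keyword lists are hypothetical, and the growth rate is assumed to be the relative change between two period counts because the abstract does not define it.

```python
from collections import Counter
from itertools import combinations


def keyword_stats(papers):
    """Return (frequency, co-occurrence) counters for lists of author keywords."""
    freq = Counter()
    cooc = Counter()
    for keywords in papers:
        unique = sorted(set(keywords))  # count each keyword once per paper
        freq.update(unique)
        cooc.update(combinations(unique, 2))  # unordered pairs, alphabetical
    return freq, cooc


def growth_rate(earlier, later):
    """Relative change between two period counts (assumed definition)."""
    return (later - earlier) / earlier


# Hypothetical keyword lists, one per paper.
papers = [
    ["iot", "cloud", "security"],
    ["iot", "cloud"],
    ["iot", "big data"],
]
freq, cooc = keyword_stats(papers)
print(freq.most_common(2))
print(cooc[("cloud", "iot")])
```

Sorting the keywords of each paper before pairing ensures that the same pair always maps to the same key, whatever order the keywords were listed in.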

An Identification on Big Data Application Fields by Utilizing Journal Bibliographic Coupling Analysis (서지결합분석을 통한 빅데이터 활용 분야 연구)

  • Lee, Boram
    • Proceedings of the Korean Society for Information Management Conference
    • /
    • 2016.08a
    • /
    • pp.19-22
    • /
    • 2016
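  • This study focuses on the analysis and utilization aspects of big data rather than technical aspects such as processing and storage, aiming to identify the related academic disciplines and clarify the intellectual structure among fields. The results confirm that big data research differs clearly by subject area. Subject category analysis shows that related research is relatively evenly distributed across engineering and technology (34.60%), social sciences (25.24%), natural sciences (23.14%), and medicine and health sciences (14.85%), whereas research in the humanities (1.69%) and agricultural sciences (0.21%) is scarce. Network analysis shows that big data research is more active in engineering and natural sciences (68.42%) than in social sciences (31.58%). In addition, research in engineering and natural sciences covers diverse subject areas, whereas research in the social sciences is still confined to a limited range of subject areas.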

Current Trends for National Bibliography through Analyzing the Status of Representative National Bibliographies (주요국 국가서지 현황조사를 통한 국가서지의 최신 경향 분석)

  • Lee, Mihwa;Lee, Ji-Won
    • Journal of the Korean BIBLIA Society for library and Information Science
    • /
    • v.32 no.1
    • /
    • pp.35-57
    • /
    • 2021
  • This paper aims to identify the current trends of national bibliographies by analyzing representative national bibliographies through a literature review, an analysis of national bibliographies' web pages, and a survey. First, to conform to the definition of a national bibliography as a record of a nation's publications, national bibliographies attempt to include a variety of materials from print to electronic resources, but in practice they cannot contain all materials, so there are exceptions. It is impossible to create a general selection guide for national bibliography coverage; what is needed is a plan that reflects national characteristics and prepares valid and comprehensive coverage based on analysis. Second, national bibliographies are generated in cooperation with publishers and libraries for efficiency. To generate them efficiently, changes should be sought such as standardization and consistency, collection-level metadata description for digital resources, and the creation of national bibliographies using linked data. Third, national bibliographies are published through the national bibliographic online search system, linked data search, MARC download using PDF, OAI-PMH, SRU, Z39.50, and mass download in RDF/XML format, and are either integrated with the online public access catalog or built separately. Above all, national bibliographies and online public access catalogs need to be built so that data is reused through an integrated library system. Fourth, as differentiated functions for national bibliographies, various services such as user tagging and national bibliographic statistics are provided along with various browsing functions. In addition, services such as analysis of national bibliographic big data, links to electronic publications, and mass download of linked data should be provided, and it is necessary to identify users' needs and provide open services that reflect them in order to develop differentiated services. The current trends and considerations analyzed in this study will make it possible to explore changes in national bibliographies both nationally and internationally.