• Title/Summary/Keyword: Library Big Data

Analyzing Students' Non-face-to-face Course Evaluation by Topic Modeling and Developing Deep Learning-based Classification Model (토픽 모델링 기반 비대면 강의평 분석 및 딥러닝 분류 모델 개발)

  • Han, Ji Yeong;Heo, Go Eun
    • Journal of the Korean Society for Library and Information Science / v.55 no.4 / pp.267-291 / 2021
  • The global COVID-19 pandemic in 2020 brought major changes to education. Universities fully adopted remote learning, previously regarded as a supplementary mode of instruction, non-face-to-face classes became commonplace, and professors and students have been working hard to adapt to the new educational environment. To improve the quality of non-face-to-face lectures amid these changes, the factors affecting lecture satisfaction need to be studied. This paper therefore presents a new big-data methodology for identifying how the factors affecting university lecture satisfaction changed before and after COVID-19. We apply topic modeling to lecture reviews written before and after COVID-19 to identify those factors and, on that basis, suggest a direction for university education. In addition, by building a topic classification model based on KoBERT, a deep learning language model, that achieves an F1-score of 0.84, we can identify the factors of lecture satisfaction and dissatisfaction from multiple angles and contribute to continuous qualitative improvement of lecture satisfaction.
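
The F1-score cited in this abstract combines precision and recall. The sketch below shows one way per-class and macro-averaged F1 could be computed for a multi-class topic classifier. The topic labels and predictions are invented for illustration, and the abstract does not say which averaging scheme the authors used, so macro averaging is an assumption.

```python
from collections import Counter


def f1_per_class(y_true, y_pred):
    """Return {label: F1} computed from true and predicted labels."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # p was predicted but is wrong
            fn[t] += 1  # the true label t was missed
    scores = {}
    for label in set(y_true) | set(y_pred):
        # F1 = 2TP / (2TP + FP + FN), the harmonic mean of precision and recall
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores (an assumed averaging choice)."""
    scores = f1_per_class(y_true, y_pred)
    return sum(scores.values()) / len(scores)


# Hypothetical topic labels for six lecture reviews
y_true = ["exam", "exam", "workload", "delivery", "delivery", "workload"]
y_pred = ["exam", "workload", "workload", "delivery", "exam", "workload"]
```

Calling `macro_f1(y_true, y_pred)` averages the three per-class scores, so a weak class lowers the result even when the other classes are predicted well.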

Analysis of Borrows Demand for Books in Public Libraries Considering Cultural Characteristics (문화적 특성을 고려한 공공도서관 도서 대출수요 분석 : 대구광역시 시립도서관을 사례로)

  • Oh, Min-Ki;Kim, Kyung-Rae;Jeong, Won-Oong;Kim, Keun-Wook
    • Journal of Digital Convergence / v.19 no.3 / pp.55-64 / 2021
  • Public libraries are spaces where residents encounter a wide range of knowledge and ideas, and because they are closely tied to daily life, they have been the subject of many studies. Most previous studies found variables such as population, traffic accessibility, and environment to be highly relevant to library use. This study differs from previous work in that it analyzes the relationship between book borrowing demand and variables reflecting cultural characteristics, based on book borrowing history (1,820,407 cases) and member information (297,222 persons). The analysis showed that as borrowing of social science and literature books increased relative to technical science books, the demand for book borrowing increased. In addition, various descriptive statistical analyses were used to characterize library book borrowing demand, and policy implications and limitations of the study are presented based on the results. Considering that cultural characteristics change depending on the location and time of day, related research should continue in the future.

A Comparative Study for Digital Animation Production using Database (데이터베이스를 활용한 디지털 애니메이션 제작 방법 비교 분석)

  • Lee, Dong-Eun
    • Journal of Korea Multimedia Society / v.11 no.1 / pp.96-105 / 2008
  • The introduction of digital media brought the 21st century into the era of the database paradigm, and that paradigm has brought major changes to the production process and industry of animation. One characteristic of animation is that every image has to be newly produced. With digital technology, however, the original images produced for an animation can be saved, and this permanently saved data can easily be modified and composited to produce new images. This study therefore focuses on how databases are used in digital animation and how they have changed creative practice in animation. To that end, it reviews several examples and discusses the future of animation for the next generation.

Analysis of Twitter for 2012 South Korea Presidential Election by Text Mining Techniques (텍스트 마이닝을 이용한 2012년 한국대선 관련 트위터 분석)

  • Bae, Jung-Hwan;Son, Ji-Eun;Song, Min
    • Journal of Intelligence and Information Systems / v.19 no.3 / pp.141-156 / 2013
  • Social media is a representative form of Web 2.0 that shapes users' information behavior by allowing them to produce their own content without any expert skills. In particular, as a new communication medium, it has a profound impact on social change by enabling users to share their opinions and thoughts with both the general public and their acquaintances. Social media data plays a significant role in the emerging Big Data arena. A variety of research areas, such as social network analysis and opinion mining, have therefore sought to discover meaningful information in the vast amounts of data buried in social media. Social media has recently become a main focus of Information Retrieval and Text Mining because it not only produces massive unstructured textual data in real time but also serves as an influential channel for opinion leading. Most previous studies, however, have adopted broad-brush and limited approaches, which has made it difficult to find and analyze new information. To overcome these limitations, we developed a real-time Twitter trend mining system that captures trends by processing large streaming Twitter datasets in real time. The system offers term co-occurrence retrieval, visualization of Twitter users by query, similarity calculation between two users, topic modeling to track changes in topical trends, and mention-based user network analysis. In addition, we conducted a case study on the 2012 Korean presidential election. We collected 1,737,969 tweets containing the candidates' names and 'election' from Twitter in Korea (http://www.twitter.com/) for one month in 2012 (October 1 to October 31). The case study shows that the system provides useful information and detects societal trends effectively. The system also retrieves the list of terms that co-occur with given query terms.
We compare the results of term co-occurrence retrieval using the names of the influential candidates, 'Geun Hae Park', 'Jae In Moon', and 'Chul Su Ahn', as query terms. General terms related to the presidential election, such as 'Presidential Election', 'Proclamation in Support', and 'Public opinion poll', appear frequently. The results also show specific terms that differentiate each candidate, such as 'Park Jung Hee' and 'Yuk Young Su' for the query 'Geun Hae Park', 'a single candidacy agreement' and 'Time of voting extension' for the query 'Jae In Moon', and 'a single candidacy agreement' and 'down contract' for the query 'Chul Su Ahn'. Our system not only extracts 10 topics along with related terms but also shows the topics' dynamic changes over time by employing the multinomial Latent Dirichlet Allocation technique. Each topic shows one of two patterns, a rising tendency or a falling tendency, depending on the change in its probability distribution. To determine the relationship between topic trends on Twitter and social issues in the real world, we compare topic trends with related news articles. We find that Twitter can track an issue faster than other media such as newspapers. The user network on Twitter differs from those of other social media because of the distinctive way relationships are formed on Twitter: users can build relationships by exchanging mentions. We visualize and analyze mention-based networks of 136,754 users, using the three candidates' names, 'Geun Hae Park', 'Jae In Moon', and 'Chul Su Ahn', as query terms. The results show that Twitter users mention all candidates' names regardless of their political tendencies. This case study shows that Twitter could be an effective tool for detecting and predicting dynamic changes in social issues, and that mention-based user networks, a structure unique to Twitter, could reveal different aspects of user behavior.
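
Term co-occurrence retrieval, one of the system functions listed in this abstract, can be illustrated with a short sketch: given a query term, count the other terms that appear in the same tweet. The function name, the whitespace tokenizer, and the sample tweets are assumptions made for illustration. Real Korean text would need morphological analysis, and the abstract does not describe the authors' actual implementation.

```python
from collections import Counter


def cooccurring_terms(documents, query, top_n=5):
    """Rank terms by how many documents they share with `query`."""
    counts = Counter()
    for doc in documents:
        # Naive whitespace tokenizer (assumption); each term counts once per document
        tokens = set(doc.lower().split())
        if query in tokens:
            counts.update(tokens - {query})
    # Sort by descending count, then alphabetically so ties are deterministic
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_n]


# Hypothetical, already-tokenized tweets
tweets = [
    "moon candidacy agreement poll",
    "park election poll",
    "moon voting extension",
    "ahn candidacy agreement",
    "moon candidacy poll",
]

top_terms = cooccurring_terms(tweets, "moon")
```

Here `top_terms` ranks "candidacy" and "poll" first because each shares two tweets with the query. Swapping the query for another candidate's name gives the kind of per-candidate comparison the abstract describes.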

Design of a High-Speed Data Packet Allocation Circuit for Network-on-Chip (NoC 용 고속 데이터 패킷 할당 회로 설계)

  • Kim, Jeonghyun;Lee, Jaesung
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2022.10a / pp.459-461 / 2022
  • One of the big differences between a Network-on-Chip (NoC) and existing parallel processing systems based on an off-chip network is that data packet routing is performed using a centralized control scheme. In such an environment, the best-effort packet routing problem becomes a real-time assignment problem in which data packet arrival time and processing time are the costs. In this paper, the Hungarian algorithm, a representative algorithm for reducing the computational complexity of the linear-algebraic formulation of the assignment problem, is implemented as a hardware accelerator. Logic synthesis with the TSMC 0.18 µm standard cell library shows that the circuit designed through case analysis of the cost distribution reduces area by about 16% and propagation delay by about 52%, compared with a circuit implementing the original operation sequence of the Hungarian algorithm.

Patent Technology Trends of Oral Health: Application of Text Mining

  • Hee-Kyeong Bak;Yong-Hwan Kim;Han-Na Kim
    • Journal of dental hygiene science / v.24 no.1 / pp.9-21 / 2024
  • Background: The purpose of this study was to use text network analysis and topic modeling to identify the relationships among keywords in patent information related to oral health, and then to extract and visualize latent topics. By examining key terms and specific subjects, the study sought to understand technological trends in oral health-related innovations. It also aimed to serve as foundational material suggesting directions for technological advancement in dentistry and dental hygiene. Methods: The data consisted of information registered over a 20-year period until July 31st, 2023, obtained from the patent information retrieval service KIPRIS. A total of 6,865 patent titles matching keywords such as "dentistry," "teeth," and "oral health" were collected through searches. The research tool was a custom program written in Python 3.10 for the research objectives. It was used for keyword frequency analysis, semantic network analysis, and Latent Dirichlet Allocation topic modeling. Results: In the analysis of connection centrality among the 50 most frequent words, "method," "tooth," and "manufacturing" displayed the highest centrality, while "active ingredient" had the lowest. In the topic modeling results, the "implant" topic constituted the largest share at 22.0%, while the topics "devices and materials for oral health" and "toothbrushes and oral care" had the lowest proportions at 5.5% each. Conclusion: Technologies concerning methods and implants are continually being researched in patents related to oral health, while there is comparatively less technological development in devices and materials for oral health. This study is expected to be a valuable resource for uncovering potential themes from a large volume of patent titles and suggesting research directions.
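
The centrality analysis mentioned in the Results can be sketched in plain Python: build a keyword co-occurrence graph from patent titles and compute normalized degree centrality. Degree centrality is an assumed choice, since the abstract only refers to "centrality of connections". The function name and sample titles are hypothetical, and the authors' custom program is not reproduced here.

```python
from collections import defaultdict
from itertools import combinations


def degree_centrality(titles):
    """Normalized degree centrality of words in a title co-occurrence graph.

    Two words are linked if they appear in the same title. Words that never
    co-occur with another word are left out of the graph.
    """
    neighbors = defaultdict(set)
    for title in titles:
        words = sorted(set(title.lower().split()))  # naive tokenizer (assumption)
        for a, b in combinations(words, 2):
            neighbors[a].add(b)
            neighbors[b].add(a)
    n = len(neighbors)
    if n < 2:
        return {}
    # Degree divided by the maximum possible degree, n - 1
    return {word: len(linked) / (n - 1) for word, linked in neighbors.items()}


# Hypothetical patent titles
titles = [
    "tooth implant method",
    "implant manufacturing method",
    "toothbrush manufacturing method",
]
centrality = degree_centrality(titles)
```

In this toy input "method" co-occurs with every other word and so has centrality 1.0, which mirrors how generic terms can dominate a title network.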

An Investigation on Digital Humanities Research Trend by Analyzing the Papers of Digital Humanities Conferences (디지털 인문학 연구 동향 분석 - Digital Humanities 학술대회 논문을 중심으로 -)

  • Chung, EunKyung
    • Journal of the Korean Society for Library and Information Science / v.55 no.1 / pp.393-413 / 2021
  • Digital humanities, which creates new and innovative knowledge by combining digital information technology with humanities research problems, is a representative multidisciplinary field. To investigate the intellectual structure of the field, author network analysis and keyword co-word analysis were performed on a total of 441 papers presented at the Digital Humanities Conference over the last two years (2019, 2020). The author and keyword analyses show that authors from Europe, North America, and East Asia (Japan and China) are active. The co-author network contains 11 disconnected sub-networks, which can be seen as the result of closed co-authoring activity. Keyword analysis identifies 16 sub-subject areas: machine learning, pedagogy, metadata, topic modeling, stylometry, cultural heritage, network, digital archive, natural language processing, digital library, twitter, drama, big data, neural network, virtual reality, and ethics. These results imply that a diverse range of digital information technologies play a major role in the digital humanities. In addition, high-frequency keywords can be classified into humanities-based keywords, digital information technology-based keywords, and convergence keywords. The dynamics of the growth and development of digital humanities can be represented by these combinations of keywords.
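
Disconnected sub-networks in a co-author network correspond to connected components of the graph. The sketch below finds them with a union-find structure. The function name and the author lists are hypothetical, and the abstract does not state which tool the author used.

```python
def coauthor_components(papers):
    """Group authors into connected components of the co-author network.

    `papers` is a list of author lists. Returns a sorted list of sorted groups.
    """
    parent = {}

    def find(a):
        parent.setdefault(a, a)
        while parent[a] != a:
            parent[a] = parent[parent[a]]  # path halving keeps trees shallow
            a = parent[a]
        return a

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for authors in papers:
        if not authors:
            continue
        first = authors[0]
        find(first)  # register single-author papers too
        for other in authors[1:]:
            union(first, other)

    groups = {}
    for author in parent:
        groups.setdefault(find(author), set()).add(author)
    return sorted((sorted(g) for g in groups.values()), key=lambda g: g[0])


# Hypothetical author lists, one per paper
papers = [["A", "B"], ["B", "C"], ["D"], ["E", "F"]]
components = coauthor_components(papers)
```

Each group in `components` is one sub-network. Authors A, B, and C are joined through the shared author B, while D forms a component on its own.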

Analyzing Patterns of Sales and Floating Population Using Markov Chain (마르코브 체인을 적용한 유동인구의 매출 및 이동 패턴 분석)

  • Kim, Bong Gyun;Lee, Wonsang;Lee, Bong Gyou
    • Journal of Internet Computing and Services / v.21 no.1 / pp.71-78 / 2020
  • As gentrification has recently emerged as an issue, it has become important to understand the dynamics of local commercial districts, which play an important role in supporting the local economy and building community in a city. This paper proposes a framework for systematically analyzing and understanding local commercial districts. It then empirically analyzes the patterns of sales and the flow of the floating population in two representative local commercial districts in Seoul. In addition, floating population data from telecommunication base stations is modeled with a Markov chain to understand the local commercial districts systematically. Finally, the transition patterns and consumption amounts of the floating population are comprehensively analyzed to draw implications for the evolution of local commercial districts in a city. We expect that our findings could contribute to the economic growth of local commercial districts, which could in turn lead to the continued development of the city economy.
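
Modeling movement with a Markov chain starts from estimating transition probabilities between locations. The sketch below estimates a first-order transition matrix from observed location sequences and advances a distribution by one step. The location names, trajectories, and function names are hypothetical, and the paper's actual data and model details are not given in the abstract.

```python
from collections import Counter, defaultdict


def transition_matrix(trajectories):
    """Estimate first-order transition probabilities from location sequences."""
    counts = defaultdict(Counter)
    for path in trajectories:
        for src, dst in zip(path, path[1:]):
            counts[src][dst] += 1
    # Normalize each row of counts so its probabilities sum to 1
    return {
        src: {dst: c / sum(row.values()) for dst, c in row.items()}
        for src, row in counts.items()
    }


def step(distribution, matrix):
    """Advance a {state: probability} distribution by one transition.

    Probability mass in states with no observed outgoing transition is dropped.
    """
    nxt = defaultdict(float)
    for src, mass in distribution.items():
        for dst, prob in matrix.get(src, {}).items():
            nxt[dst] += mass * prob
    return dict(nxt)


# Hypothetical movement sequences between three zones
trajectories = [
    ["station", "cafe", "shop"],
    ["station", "shop"],
    ["cafe", "shop", "station"],
    ["station", "cafe", "station"],
]
P = transition_matrix(trajectories)
after_one_step = step({"station": 1.0}, P)
```

Applying `step` repeatedly shows where a population starting in one zone tends to spread, which is the kind of transition pattern the abstract refers to.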

A Study of 'Emotion Trigger' by Text Mining Techniques (텍스트 마이닝을 이용한 감정 유발 요인 'Emotion Trigger'에 관한 연구)

  • An, Juyoung;Bae, Junghwan;Han, Namgi;Song, Min
    • Journal of Intelligence and Information Systems / v.21 no.2 / pp.69-92 / 2015
  • The explosion of social media data has led to the application of text-mining techniques to analyze big social media data more rigorously. Although social media text analysis algorithms have improved, previous approaches still have limitations. In sentiment analysis of social media written in Korean, there are two typical approaches. One is the linguistic approach using machine learning, which is the most common; some studies have added grammatical factors to the feature sets used to train classification models. The other applies semantic analysis to sentiment analysis, but it has mainly been applied to English texts. To overcome these limitations, this study applies the Word2Vec algorithm, an extension of neural network algorithms, to capture the broader semantic features that existing sentiment analysis has undervalued. The result of the Word2Vec algorithm is compared with the result of co-occurrence analysis to identify the difference between the two approaches. The results show that Word2Vec extracts about three times as many related words expressing some emotion about the keyword as co-occurrence analysis does. The difference stems from Word2Vec's vectorization of semantic features. Word2Vec can therefore be said to capture hidden related words that traditional analysis has not found. In addition, Part Of Speech (POS) tagging for Korean is used to detect adjectives as "emotional words" in Korean. The emotional words extracted from the text are then converted into word vectors by the Word2Vec algorithm to find related words. Among these related words, nouns are selected because each of them may have a causal relationship with the "emotional word" in the sentence.
The process of extracting these trigger factors of emotional words is named "Emotion Trigger" in this study. As a case study, the datasets were collected by searching with three keywords, professor, prosecutor, and doctor, because these keywords carry rich public emotion and opinion. Data was collected in advance to select secondary keywords for data gathering. The secondary keywords used to gather the data for the actual analysis are as follows: Professor (sexual assault, misappropriation of research money, recruitment irregularities, polifessor), Doctor (Shin hae-chul sky hospital, drinking and plastic surgery, rebate), Prosecutor (lewd behavior, sponsor). The size of the text data is about 100,000 (Professor: 25720, Doctor: 35110, Prosecutor: 43225), and the data were gathered from news, blogs, and Twitter to reflect various levels of public emotion in the text analysis. Gephi (http://gephi.github.io) was used for visualization, and all programs used for text processing and analysis were written in Java. The contributions of this study are as follows. First, different approaches to sentiment analysis are integrated to overcome the limitations of existing approaches. Second, finding Emotion Triggers can detect hidden connections to public emotion that existing methods cannot detect. Finally, the approach used in this study could be generalized regardless of the type of text data. The limitation of this study is that it is hard to claim that the words extracted by Emotion Trigger processing have a significant causal relationship with the emotional word in a sentence. Future work will clarify the causal relationship between emotional words and the words extracted by Emotion Trigger by comparing them with manually tagged relationships. Furthermore, the text data used in Emotion Trigger include Twitter data, which has a number of distinct features that we did not deal with in this study.
These features will be considered in further study.
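
Finding words related to an "emotional word" from word vectors usually comes down to ranking other words by cosine similarity. The sketch below shows that step only. The three-dimensional vectors are invented for illustration, not output from a trained Word2Vec model, and the function names are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def most_similar(word, vectors, top_n=3):
    """Rank the other words by cosine similarity to `word`."""
    target = vectors[word]
    scored = [
        (other, cosine(target, vec))
        for other, vec in vectors.items()
        if other != word
    ]
    # Highest similarity first; alphabetical order breaks ties
    return sorted(scored, key=lambda kv: (-kv[1], kv[0]))[:top_n]


# Toy vectors (assumption); a trained model would supply real embeddings
vectors = {
    "angry": [0.9, 0.1, 0.0],
    "furious": [0.8, 0.2, 0.1],
    "rebate": [0.7, 0.1, 0.6],
    "happy": [-0.6, 0.8, 0.0],
}
related = most_similar("angry", vectors)
```

In the study's pipeline, the nouns among such related words would then be kept as candidate triggers. That filtering step needs a POS tagger and is not shown here.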

Current Trends for National Bibliography through Analyzing the Status of Representative National Bibliographies (주요국 국가서지 현황조사를 통한 국가서지의 최신 경향 분석)

  • Lee, Mihwa;Lee, Ji-Won
    • Journal of the Korean BIBLIA Society for library and Information Science / v.32 no.1 / pp.35-57 / 2021
  • This paper identifies current trends in national bibliographies by analyzing representative national bibliographies through a literature review, an analysis of national bibliographies' web pages, and a survey. First, to conform to the definition of a national bibliography as a record of a nation's publications, national bibliographies attempt to include a variety of materials from print to electronic resources, but in practice they cannot contain all materials, so there are exceptions. Because a general selection guide for national bibliography coverage cannot be created, a plan is needed that reflects national characteristics and establishes valid, comprehensive coverage based on analysis. Second, cooperation with publishers and libraries is under way to generate national bibliographies efficiently. For efficient generation, changes should be pursued such as standardization and consistency, collection-level metadata description for digital resources, and creation of national bibliographies using linked data. Third, national bibliographies are published through national bibliographic online search systems, linked data search, MARC download using PDF, OAI-PMH, SRU, Z39.50, and mass download in RDF/XML format, and are either integrated with the online public access catalog or built separately. Above all, national bibliographies and online public access catalogs need to be built so that data is reused through an integrated library system. Fourth, as differentiated functions for national bibliographies, various services such as user tagging and national bibliographic statistics are provided along with various browsing functions. In addition, services such as analysis of national bibliographic big data, links to electronic publications, and mass download of linked data should be provided, and users' needs must be identified and reflected in open services in order to develop differentiated services.
The current trends and considerations identified in this study will make it possible to explore changes in national bibliographies both domestically and internationally.