• Title/Summary/Keyword: Keyword-based

Search results: 1,126

Analysis of Research Topics in Archival Studies: Focusing on Academic Papers in Archival Science, Library and Information Science, and History from 2002 to 2023 (국내 기록분야 연구주제 분석: 2002~2023년간 기록관리학, 문헌정보학, 역사학 학술논문을 중심으로)

  • SeonWook Kim
    • Journal of Korean Society of Archives and Records Management / v.23 no.4 / pp.91-111 / 2023
  • This study analyzes research topics in the domain of archival studies by examining bibliographic information from academic papers in archival science, library and information science, and history. After collecting 1,173 academic papers, network analysis was performed on author keyword data, topic modeling was conducted on abstract data, and the results were organized over time. The network analysis of author keywords confirmed that the research topic network changed actively in response to changes in major laws and policies. Moreover, topic modeling of the abstracts showed that the subjects of the collected papers fell into three topics: "Records Management," "Archiving," and "National Records Policy." Notably, from 2002 to 2009, "Records Management" and "National Records Policy" were relatively dominant, but the three topics have grown in quantitative balance since 2009, peaking in 2019.

Anatomy of Sentiment Analysis of Tweets Using Machine Learning Approach

  • Misbah Iram;Saif Ur Rehman;Shafaq Shahid;Sayeda Ambreen Mehmood
    • International Journal of Computer Science & Network Security / v.23 no.10 / pp.97-106 / 2023
  • Sentiment analysis using social network platforms such as Twitter has achieved tremendous results. Twitter is an online social networking site that contains a rich amount of data. The platform is known as an information channel corresponding to different sites and categories. Tweets are most often publicly accessible, with very few limitations and security options available. Twitter also has powerful tools that enhance its utility and a powerful search system that makes recently posted tweets publicly accessible by keyword. As a popular social medium, Twitter has the potential to interconnect information, reviews, and updates, all of which are important for engaging the targeted population. In this work, numerous methods that classify tweet sentiment on Twitter are discussed. There has been a lot of work in the field of sentiment analysis of Twitter data. This study provides a comprehensive analysis of the most standard and widely applicable techniques for opinion mining, both machine learning-based and lexicon-based, along with their metrics. The proposed work helps to analyze the information in tweets, where opinions are highly unstructured, heterogeneous, and polarized as positive, negative, or neutral. To validate the performance of the proposed framework, an extensive series of experiments was performed on a real-world Twitter dataset to show the effectiveness of the proposed framework. This research effort also highlights recent challenges in the field of sentiment analysis along with the future scope of the proposed work.

Evaluation of the Quality of the Case Reports in the Journal of Korean Obstetrics and Gynecology from April 2019 to February 2024 Based on the CARE Guidelines (CARE(CAse REport) 지침에 따른 2019-2024년 대한한방부인과학회지 증례보고의 질 평가)

  • Han-Seul Kwon;Ye-Jin Yoon;Hyeong-Jun Kim
    • The Journal of Korean Obstetrics and Gynecology / v.37 no.2 / pp.17-34 / 2024
  • Objectives: The purpose of this study is to evaluate the quality of case reports published in the Journal of Korean Obstetrics and Gynecology from April 2019 to February 2024, compared with those from January 2015 to March 2019. Methods: Case reports were selected by searching the archive on the website of the society that publishes the Journal of Korean Obstetrics and Gynecology. The quality of the case reports was assessed based on the CAse REport (CARE) guidelines. Results: A total of 30 case reports were finally included in the assessment. The overall quality of reporting for case reports published from April 2019 to February 2024 improved compared with that found in the previous study. However, four items of the CARE guidelines had an unreported rate of 50% or more and require active description in future case reports: patient's perspective on interventions (96.67%), diagnostic challenges (93.33%), intervention adherence and tolerability (93.33%), and adverse events (56.67%). In addition, keywords and timeline were rated 'Not-sufficient' in more than 50% of reports in both the previous and present studies. Active efforts by researchers are therefore needed to include 'Case report (or Case study)' in the keywords and to include interventions by period and symptom changes in a timeline. Conclusions: Despite the overall improvement in the quality of reporting in the Journal of Korean Obstetrics and Gynecology, efforts to improve the quality of case reports should be continued.

Improving Accuracy of Chapter-level Lecture Video Recommendation System using Keyword Cluster-based Graph Neural Networks

  • Purevsuren Chimeddorj;Doohyun Kim
    • Journal of the Korea Society of Computer and Information / v.29 no.7 / pp.89-98 / 2024
  • In this paper, we propose a system for recommending lecture videos at the chapter level, addressing the balance between accuracy and processing speed in chapter-level video recommendations. Specifically, it has been observed that enhancing recommendation accuracy reduces processing speed, while increasing processing speed decreases accuracy. To mitigate this trade-off, a hybrid approach is proposed, utilizing techniques such as TF-IDF, k-means++ clustering, and Graph Neural Networks (GNN). The approach involves pre-constructing clusters based on chapter similarity to reduce computational load during recommendations, thereby improving processing speed, and applying GNN to the graph of clusters as nodes to enhance recommendation accuracy. Experimental results indicate that the use of GNN resulted in an approximate 19.7% increase in recommendation accuracy, as measured by the Mean Reciprocal Rank (MRR) metric, and an approximate 27.7% increase in precision defined by similarities. These findings are expected to contribute to the development of a learning system that recommends more suitable video chapters in response to learners' queries.
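
The Mean Reciprocal Rank (MRR) metric named in the abstract above can be illustrated with a minimal sketch. It is not the authors' evaluation code: the function names, chapter IDs, and relevance sets below are hypothetical, and the sketch assumes that each query's score is the reciprocal of the rank of its first relevant chapter.

```python
from typing import Hashable, Sequence, Set


def reciprocal_rank(ranked: Sequence[Hashable], relevant: Set[Hashable]) -> float:
    """Return 1/rank of the first relevant item, or 0.0 if none is ranked."""
    for position, item in enumerate(ranked, start=1):
        if item in relevant:
            return 1.0 / position
    return 0.0


def mean_reciprocal_rank(rankings, relevant_sets) -> float:
    """Average the reciprocal rank over all queries."""
    if not rankings:
        return 0.0
    total = sum(reciprocal_rank(r, s) for r, s in zip(rankings, relevant_sets))
    return total / len(rankings)


# Hypothetical chapter IDs returned for three learner queries.
rankings = [["c3", "c1", "c7"], ["c2", "c5", "c9"], ["c4", "c8", "c6"]]
# Hypothetical ground truth: the chapter(s) judged relevant for each query.
relevant = [{"c1"}, {"c2"}, {"c0"}]

mrr = mean_reciprocal_rank(rankings, relevant)
print(mrr)
```

Here the first query finds its relevant chapter at rank 2, the second at rank 1, and the third not at all, so the three reciprocal ranks are 1/2, 1, and 0.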

Peatland restoration research: a global overview with insights from Indonesia

  • Kushartati Budiningsih;Prakoso Bhairawa Putera;Ari Nurlia;Nur Arifatul Ulya;Fitri Nurfatriani;Mimi Salminah;Dhany Yuniati;Asmanah Widarti
    • Journal of Ecology and Environment / v.48 no.3 / pp.263-276 / 2024
  • Background: Repeated and severe fires have led to a large investment in research directed towards recapturing the natural values of Indonesia's peatland forest resources. The aim of this study was to identify the patterns and trends in research on peatland restoration-related literature available on the Scopus database. Methods: A bibliometric methodology, the Scopus database, and VOSviewer were used to explore the trends in the peatland restoration literature published in the period 1994-2021; the leading journals and the most influential authors, affiliations, countries, documents, and research themes were identified. Results: Three hundred and seventeen documents, including 266 journal articles, were identified. The leading journals based on numbers of articles published and citations were Restoration Ecology and Ecological Engineering. Authors affiliated with institutions in Canada and the United Kingdom were the most influential. Indonesia was the third most influential country based on numbers of documents. The most influential article was "The underappreciated potential of peatlands in global climate change mitigation strategies" by Leifeld J in Nature Communications, with an average citation rate of 66 per year. A keyword co-occurrence network identified nine main themes in peat restoration research. Conclusions: The findings of the study are used to outline the types of research in peat restoration now required to meet the outstanding and unmet challenges confronted in Indonesia. Three significant challenges have been identified: (1) anthropogenic, those that encompass issues related to community acceptance and participation in peatland restoration, (2) ecological, those associated with severely degraded peatlands, and (3) economic, the absence of secure funding to cover substantial costs.
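
A keyword co-occurrence network of the kind mentioned above is built from counts of how often two keywords appear on the same document. The sketch below shows only that counting step, under stated assumptions: the keyword lists are invented for illustration, keywords are normalized by lowercasing, and no clustering into themes is attempted. It does not reproduce VOSviewer's own processing.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(keyword_lists):
    """Count, for each unordered keyword pair, the documents that list both."""
    counts = Counter()
    for keywords in keyword_lists:
        # Normalize case and drop duplicates within a single document.
        unique = sorted({k.strip().lower() for k in keywords if k.strip()})
        for a, b in combinations(unique, 2):
            counts[(a, b)] += 1
    return counts


# Hypothetical author keywords for three documents.
docs = [
    ["Peatland", "Restoration", "Rewetting"],
    ["peatland", "restoration", "Fire"],
    ["Peatland", "Carbon"],
]

edges = cooccurrence_counts(docs)
for (a, b), weight in edges.most_common():
    print(a, b, weight)
```

Each key in `edges` is an edge of the network and its count is the edge weight; a tool such as VOSviewer would then lay out and cluster the weighted graph.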

A Search Method for Components Based-on XML Component Specification (XML 컴포넌트 명세서 기반의 컴포넌트 검색 기법)

  • Park, Seo-Young;Shin, Yoeng-Gil;Wu, Chi-Su
    • Journal of KIISE:Software and Applications / v.27 no.2 / pp.180-192 / 2000
  • Recently, component technology has played a main role in software reuse. It has changed code-based reuse into binary code-based reuse, because components can be easily combined into the software under development through component interfaces alone. Since components and component users have increased rapidly, users of components need to search for the most proper components for HTML among the enormous number of components on the Internet. It is desirable to use web-document-typed specifications for component specifications on the Internet. This paper proposes to use XML component specifications instead of HTML specifications, because it is impossible to represent the semantics of contexts using HTML. We also propose an XML context-search method based on XML component specifications. In their queries for searching components, component users use contexts for the component properties and terms for the values of component properties. The index structure for the context-based search method is an inverted file indexing structure of term-context-component specification. In addition to the XML context-based search method, a variety of search methods built on context-based search, such as keyword search, faceted search, and browsing search, are provided for the convenience of users. For an efficient index scheme, the search engine uses a 3-layer architecture with an interface layer, a query expansion layer, and an XML search engine layer. In this paper, an XML DTD (Document Type Definition) for component specification is defined, and experimental results comparing the search performance of XML with HTML are discussed.
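
The term-context-component inverted file described above can be pictured as a nested mapping from a term, to the property (context) in which it occurs, to the component specifications that contain it. The following is a minimal sketch under stated assumptions: the property names, component IDs, and whitespace tokenization are hypothetical, and specifications are given as plain dictionaries rather than parsed from XML.

```python
from collections import defaultdict


def build_index(specs):
    """Build an inverted file: term -> context -> set of component IDs."""
    index = defaultdict(lambda: defaultdict(set))
    for component_id, properties in specs.items():
        for context, value in properties.items():
            for term in value.lower().split():
                index[term][context].add(component_id)
    return index


def search(index, term, context=None):
    """Context-based search when a context is given, else keyword search."""
    postings = index.get(term.lower(), {})
    if context is not None:
        return set(postings.get(context, set()))
    found = set()
    for component_ids in postings.values():
        found |= component_ids
    return found


# Hypothetical component specifications (property name -> property value).
specs = {
    "comp-1": {"name": "Calendar widget", "language": "Java"},
    "comp-2": {"name": "Java parser", "language": "C++"},
}
index = build_index(specs)

print(search(index, "java", context="language"))
print(search(index, "java"))
```

Restricting the term "java" to the `language` context returns only `comp-1`, whereas the plain keyword search also returns `comp-2`, whose name merely mentions Java. This is the kind of distinction a context-free specification cannot express.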


Ontology-based User Customized Search Service Considering User Intention (온톨로지 기반의 사용자 의도를 고려한 맞춤형 검색 서비스)

  • Kim, Sukyoung;Kim, Gunwoo
    • Journal of Intelligence and Information Systems / v.18 no.4 / pp.129-143 / 2012
  • Recently, the rapid progress of a number of standardized web technologies and the proliferation of web users around the world have brought an explosive increase in the production and consumption of information documents on the web. In addition, most companies have produced, shared, and managed a huge number of information documents that are needed to perform their businesses. They have also collected, stored, and managed, at their own discretion, a number of web documents published on the web for their business. Along with this increase in the information documents that companies must manage, the need for a solution that locates information documents more accurately among a huge number of information sources has increased. To satisfy this need for accurate search, the search engine solution market is steadily expanding. The most important of the many functions provided by a search engine is locating accurate information documents within huge information sources. The major metric for evaluating the accuracy of a search engine is relevance, which consists of two measures, precision and recall. Precision is thought of as a measure of exactness, that is, what percentage of the information considered to be true answers actually are such, whereas recall is a measure of completeness, that is, what percentage of the true answers are retrieved as such. These two measures can be weighted differently according to the applied domain. If we need to search information exhaustively, as with patent documents and research papers, it is better to increase recall. On the other hand, when the amount of information is small in scale, it is better to increase precision. Most existing web search engines typically use a keyword search method that returns web documents containing keywords that correspond to the search words entered by a user. This method has the virtue of locating all web documents quickly, even when many search words are entered. 
However, this method has a fundamental limitation: it does not consider the search intention of a user, and therefore retrieves irrelevant results as well as relevant ones. Thus, it takes additional time and effort to sort the relevant results out from all the results returned by a search engine. That is, the keyword search method can increase recall, but it is difficult to locate the web documents a user actually wants to find, because the method provides no means of understanding the intention of a user and reflecting it in the process of searching for information. Thus, this research suggests a new method that combines an ontology-based search solution with the core search functionalities provided by existing search engine solutions. The method enables a search engine to provide optimal search results by inferring the search intention of a user. To that end, we build an ontology that contains concepts and the relationships among them in a specific domain. The ontology is used to infer synonyms of the set of search keywords entered by a user, so that the search intention of the user is reflected in the search process more actively than in existing search engines. Based on the proposed method, we implement a prototype search system and test it in the patent domain, where we experiment with searching for relevant documents associated with a patent. The experiment shows that our system increases both recall and precision and improves search productivity through an improved user interface that enables a user to interact with the search system effectively. In future research, we will study a means of validating the better performance of our prototype system by comparing it with other search engine solutions, and we will extend the applied domain to other information search domains such as portals.
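
The two relevance measures defined in the abstract above reduce to simple set arithmetic. The sketch below is illustrative only: the document IDs are invented, and the function name is an assumption rather than anything taken from the paper.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    precision = share of retrieved documents that are relevant (exactness)
    recall    = share of relevant documents that were retrieved (completeness)
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical result list and ground truth for one patent query.
retrieved_docs = ["d1", "d2", "d3", "d4"]
relevant_docs = ["d2", "d4", "d5"]

precision, recall = precision_recall(retrieved_docs, relevant_docs)
print(precision, recall)
```

In this example two of the four retrieved documents are relevant (precision 0.5) and two of the three relevant documents were found (recall 2/3). Returning every document in the collection would push recall to 1 while precision collapses, which is the trade-off the abstract attributes to plain keyword search.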

Odysseus/Parallel-OOSQL: A Parallel Search Engine using the Odysseus DBMS Tightly-Coupled with IR Capability (오디세우스/Parallel-OOSQL: 오디세우스 정보검색용 밀결합 DBMS를 사용한 병렬 정보 검색 엔진)

  • Ryu, Jae-Joon;Whang, Kyu-Young;Lee, Jae-Gil;Kwon, Hyuk-Yoon;Kim, Yi-Reun;Heo, Jun-Suk;Lee, Ki-Hoon
    • Journal of KIISE:Computing Practices and Letters / v.14 no.4 / pp.412-429 / 2008
  • As the number of electronic documents increases rapidly with the growth of the Internet, a parallel search engine capable of handling a large number of documents is becoming ever more important. To implement a parallel search engine, we need to partition the inverted index and search through the partitioned index in parallel. There are two methods of partitioning the inverted index: 1) document-identifier based partitioning and 2) keyword-identifier based partitioning. However, each method alone has drawbacks. The former is convenient for inserting documents and has high throughput, but performs poorly in top-k query processing. The latter performs well in top-k query processing, but is inconvenient for inserting documents and has low throughput. In this paper, we propose a hybrid partitioning method to compensate for the drawbacks of each method. We design and implement a parallel search engine that supports the hybrid partitioning method using the Odysseus DBMS tightly coupled with information retrieval capability. We first introduce the architecture of the parallel search engine, Odysseus/Parallel-OOSQL. We then show the effectiveness of the proposed system through systematic experiments. The experimental results show that the query processing time of the document-identifier based partitioning method is approximately inversely proportional to the number of blocks in the partition of the inverted index. The results also show that the keyword-identifier based partitioning method performs well in top-k query processing. The proposed parallel search engine can be optimized for performance by customizing the methods of partitioning the inverted index according to the application environment. The Odysseus/Parallel-OOSQL parallel search engine is capable of indexing, storing, and querying 100 million web documents per node, or tens of billions of web documents for the entire system.
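
Document-identifier based partitioning, the first of the two methods described above, can be sketched as follows. This is a toy illustration and not the Odysseus implementation: assigning documents to partitions by `doc_id % num_partitions`, scoring by raw term frequency, and all function names are assumptions made for the example. The partitions are scanned in a loop here, whereas a real system would search them in parallel.

```python
import heapq
from collections import Counter, defaultdict


def build_partitions(docs, num_partitions):
    """Place each document's postings in exactly one partition (by doc ID)."""
    partitions = [defaultdict(Counter) for _ in range(num_partitions)]
    for doc_id, text in docs.items():
        partition = partitions[doc_id % num_partitions]
        for term in text.lower().split():
            partition[term][doc_id] += 1  # term frequency as a stand-in score
    return partitions


def local_top_k(partition, term, k):
    """Return the k best (score, doc_id) pairs held by a single partition."""
    postings = partition.get(term, {})
    return heapq.nlargest(k, ((tf, doc_id) for doc_id, tf in postings.items()))


def top_k(partitions, term, k):
    """Merge the local top-k lists of every partition into a global top-k."""
    candidates = []
    for partition in partitions:  # one task per partition in a parallel engine
        candidates.extend(local_top_k(partition, term, k))
    return [doc_id for _, doc_id in heapq.nlargest(k, candidates)]


# Hypothetical document collection keyed by integer document ID.
docs = {
    0: "index index index search",
    1: "index search",
    2: "index index parallel",
    3: "search engine",
}
partitions = build_partitions(docs, num_partitions=2)
print(top_k(partitions, "index", k=2))
```

Inserting a document touches only one partition, which matches the convenience the abstract attributes to this method. A top-k query, however, must consult every partition and merge up to k candidates from each, which hints at why the method is reported to be weaker for top-k processing.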

A Study on Increasing the Efficiency of Image Search Using Image Attribute in the area of content-Based Image Retrieval (내용기반 이미지 검색에 있어 이미지 속성정보를 활용한 검색 효율성 향상)

  • Mo, Yeong-Il;Lee, Cheol-Gyu
    • Journal of the Korea Society for Simulation / v.18 no.2 / pp.39-48 / 2009
  • This study reviews the limits of image search by examining the search methods related to content-based image retrieval, and suggests a user interface for more efficient content-based image retrieval along with ways to utilize image properties. At present, most studies on image search focus on content-based image retrieval; they try to search based on an image's colors, texture, shapes, and overall form. However, the results are not satisfactory because of various technological limits. Accordingly, this study suggests a new retrieval system that combines content-based image retrieval with the conventional keyword search method. It attaches properties to images using text and searches images quickly by expressing those attributes as keywords. The study also presents a simulation of a user interface for composing queries on the Internet and, as an application of the retrieval system based on image attributes, a search for clothes in an online shopping mall. This study will contribute to adding a new purchase pattern in online shopping malls and to the development of the area of similar image search.

An Intelligence Support System Research on KTX Rolling Stock Failure Using Case-based Reasoning and Text Mining (사례기반추론과 텍스트마이닝 기법을 활용한 KTX 차량고장 지능형 조치지원시스템 연구)

  • Lee, Hyung Il;Kim, Jong Woo
    • Journal of Intelligence and Information Systems / v.26 no.1 / pp.47-73 / 2020
  • KTX rolling stock is a system consisting of many machines, electrical devices, and components. Its maintenance requires considerable expertise and experience on the part of maintenance workers. In the event of a rolling stock failure, the knowledge and experience of the maintainer determine how quickly and how well the problem is solved, and the resulting availability of the vehicle varies accordingly. Although problem solving is generally based on fault manuals, experienced and skilled professionals can quickly diagnose a failure and take action by applying personal know-how. Since this knowledge exists in a tacit form, it is difficult to pass it on completely to a successor, and previous studies have developed case-based rolling stock expert systems to turn it into data-driven knowledge. Nonetheless, research on the KTX rolling stock most commonly used on the main line, or on the development of a system that extracts the meaning of text and searches for similar cases, is still lacking. Therefore, this study proposes an intelligent support system that provides an action guide for emerging failures by using the know-how of rolling stock maintenance experts as examples of problem solving. For this purpose, a case base was constructed by collecting the rolling stock failure data generated from 2015 to 2017, and an integrated dictionary was constructed separately from the case base to include the essential terminology and failure codes, in consideration of the specialized nature of the railway rolling stock sector. Using the constructed case base, each new failure was matched against past cases, and the three most similar failure cases were extracted so that the actions actually taken in those cases could be proposed as a diagnostic guide. 
To compensate for the limitations of keyword-matching case search in previous case-based reasoning expert systems for rolling stock failures, this study applied various dimensionality reduction techniques that calculate similarity while taking into account the semantic relationships among failure details, and verified their usefulness through experiments. Specifically, similar cases were retrieved by applying three algorithms, Non-negative Matrix Factorization (NMF), Latent Semantic Analysis (LSA), and Doc2Vec, to extract the characteristics of each failure and then measuring the cosine distance between the resulting vectors. Precision, recall, and F-measure were used to assess the performance of the proposed actions. To compare the dimensionality reduction techniques, two baselines were added: an algorithm that randomly extracts failure cases with identical failure codes and an algorithm that applies cosine similarity directly to words. An analysis of variance confirmed that the performance differences among the five algorithms were statistically significant. In addition, optimal techniques were derived for practical application by verifying differences in performance depending on the number of dimensions used for dimensionality reduction. The analysis showed that word-based cosine similarity performed better than dimensionality reduction using NMF and LSA, and that the algorithm using Doc2Vec performed best. Furthermore, for the dimensionality reduction techniques, performance improved as the number of dimensions increased, up to an appropriate level. 
Through this study, we confirmed the usefulness of effective methods for extracting the characteristics of data and converting unstructured data when applying case-based reasoning in the specialized field of KTX rolling stock, where most attributes are recorded as text. Text mining is being studied for use in many areas, but studies using such text data are still lacking in environments with many specialized terms and limited access to data, such as the one addressed in this study. In this regard, it is significant that the study is the first to present an intelligent diagnostic system that suggests actions by searching for cases with text mining techniques that extract the characteristics of a failure, complementing keyword-based case searches. It is expected that this will provide implications as a basic study for developing diagnostic systems that can be used immediately on site.
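
The retrieval step described in this abstract, finding the three past failures most similar to a new one, can be sketched with the simplest of the compared approaches: cosine similarity applied directly to words. The sketch is not the authors' system. The failure descriptions and case IDs are invented, tokenization is a plain whitespace split, and no dimensionality reduction (NMF, LSA, or Doc2Vec) is performed.

```python
import math
from collections import Counter


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar_cases(query, case_base, top_n=3):
    """Return the IDs of the top_n past cases most similar to the query."""
    query_vector = Counter(query.lower().split())
    scored = [
        (cosine_similarity(query_vector, Counter(text.lower().split())), case_id)
        for case_id, text in case_base.items()
    ]
    # Highest similarity first; ties broken by case ID for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [case_id for _, case_id in scored[:top_n]]


# Hypothetical past failure descriptions keyed by case ID.
case_base = {
    "F1": "door sensor fault closed signal lost",
    "F2": "brake pressure low warning",
    "F3": "door closed signal fault",
    "F4": "pantograph contact loss",
    "F5": "brake sensor fault",
}

print(most_similar_cases("door signal fault", case_base))
```

The actions recorded for the returned cases would then be offered to the maintainer as a guide. Replacing the word-count vectors with NMF, LSA, or Doc2Vec vectors changes only how `query_vector` and the case vectors are produced; the cosine ranking stays the same.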