• Title/Summary/Keyword: term extraction (용어추출)

NFT(Non-Fungible Token) Patent Trend Analysis using Topic Modeling

  • Sin-Nyum Choi;Woong Kim
    • Journal of the Korea Society of Computer and Information / v.28 no.12 / pp.41-48 / 2023
  • In this paper, we analyze recent trends in the NFT (Non-Fungible Token) industry using topic modeling techniques, focusing on the broad applicability of NFTs across industrial fields. Patent data was used to understand industry trends. We collected data on 371 domestic and 454 international NFT-related patents registered in the patent information search service KIPRIS from 2017, when the first NFT standard was introduced, to October 2023. In the preprocessing stage, stopwords and lemmas were removed, and only nouns were extracted. For the analysis, the top 50 words by frequency were listed, and their corresponding TF-IDF values were examined to derive the key terms of the industry trends. Next, using the LDA algorithm, we identified four major latent topics within both the domestic and the international patent data. We analyzed these topics and presented our findings on NFT industry trends, underpinned by real-world industry cases. While previous reviews presented trends from an academic perspective using paper data, this study is significant in that it provides practical trend information based on data rooted in field practice. It is expected to be a useful reference for professionals in the NFT industry in understanding market conditions and generating new items.
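
The frequency and TF-IDF step described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function name `top_terms_with_tfidf`, the toy documents, and the corpus-level weighting (total term frequency times `log(N / document frequency)`) are all assumptions, since the abstract does not say which TF-IDF variant was used.

```python
import math
from collections import Counter

def top_terms_with_tfidf(docs, top_n=50):
    """Return (term, frequency, tf-idf) for the top_n most frequent terms.

    docs is a list of token lists, assumed to be already reduced to nouns
    with stopwords removed. The tf-idf variant is an assumption:
    corpus-wide term frequency * log(N / document frequency).
    """
    n_docs = len(docs)
    term_freq = Counter(tok for doc in docs for tok in doc)
    doc_freq = Counter(tok for doc in docs for tok in set(doc))
    results = []
    for term, freq in term_freq.most_common(top_n):
        idf = math.log(n_docs / doc_freq[term])
        results.append((term, freq, freq * idf))
    return results

# Hypothetical toy corpus standing in for tokenized patent abstracts.
docs = [
    ["token", "blockchain", "token"],
    ["token", "game", "item"],
    ["blockchain", "art", "token"],
]
ranked = top_terms_with_tfidf(docs, top_n=3)
```

In this toy corpus the most frequent term appears in every document, so its TF-IDF weight is zero; this is the kind of contrast between raw frequency and TF-IDF that motivates examining both.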

A Bibliometric Study on Sustainable Development Goals (SDGs) Research Trends in Entrepreneurship (키워드 네트워크 분석을 활용한 창업분야 지속가능발전목표(SDGs) 연구동향 분석)

  • An, Seung Kwon;Choi, Min Jung
    • Asia-Pacific Journal of Business Venturing and Entrepreneurship / v.18 no.2 / pp.21-34 / 2023
  • The purpose of this study is to examine the extent of Sustainable Development Goals (SDGs)-related research in the field of entrepreneurship globally since the adoption of the SDGs at the UN General Assembly, and to compare international and domestic research trends in order to determine the direction of SDGs-related research in entrepreneurship in Korea. Using three databases, Web of Science (WoS), KCI, and DBpia, SDGs-related studies in entrepreneurship were extracted with specific search terms. After data cleaning, a total of 356 international studies and 4 Korean studies were analyzed. Due to the limited number of domestic studies, the research trends were examined by conducting frequency analysis and keyword network analysis on the international studies alone. Frequency analysis revealed that SDGs research in entrepreneurship primarily focused on sustainability-related terms and was conducted in conjunction with business models, innovation, entrepreneurship education, and strategies. Furthermore, yearly frequency analysis demonstrated an expansion of topics to encompass research on entrepreneurship and SDGs policies, the roles and capabilities of female entrepreneurs in SDGs implementation, energy start-ups and SDGs, directions for implementing SDGs in business schools and SDGs education, indicators for SDGs implementation and evaluation, and technologies for sustainability. The keyword network analysis identified central topics such as business, sustainability, SDGs, innovation, entrepreneurship, business models, and education, with research areas extending to entrepreneurship ecosystems, change and strategy, ethics, and climate. 
This study holds significance in establishing a foundation for SDGs research in entrepreneurship, which is currently an underexplored area in Korea, by presenting emerging research trends related to SDGs in entrepreneurship.

Analysis of Keywords in national river occupancy permits by region using text mining and network theory (텍스트 마이닝과 네트워크 이론을 활용한 권역별 국가하천 점용허가 키워드 분석)

  • Seong Yun Jeong
    • Smart Media Journal / v.12 no.11 / pp.185-197 / 2023
  • This study applied text mining and network theory to the permit register, which has so far been used only to record occupancy permit information, in order to extract information from the permit contents that is useful for applying for occupancy and for carrying out permit tasks. Based on text mining, we performed normalization steps such as stopword removal and morpheme analysis, and then analyzed and compared word frequencies and topic models across five regions: Seoul/Gyeonggi, Gyeongsang, Jeolla, Chungcheong, and Gangwon. By applying four centrality measures widely used in network theory (degree, closeness, betweenness, and eigenvector), we examined keywords that occupy a central position or act as intermediaries in the network. Through a comprehensive analysis of word frequency, topic modeling, and network centrality, it was found that the 'installation' keyword was the most influential in all regions. This is believed to be the result of the Ministry of Environment's permit management office issuing many permits for constructing facilities or installing structures. In addition, keywords related to road facilities, flood control facilities, underground facilities, power/communication facilities, sports/park facilities, etc. were found to occupy a central position or to act as intermediaries in the topic models and networks. Most of the keywords appeared to follow a Zipf's law distribution, with low frequency of occurrence and a low distribution ratio.
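
As a rough illustration of the network centrality step in this abstract, the sketch below builds a keyword co-occurrence network and computes degree centrality only. The function name `degree_centrality`, the rule that two keywords are linked when they appear in the same permit record, and the toy records are assumptions for illustration; the abstract does not describe how the network was constructed.

```python
from itertools import combinations

def degree_centrality(records):
    """Degree centrality for a keyword co-occurrence network.

    records is a list of keyword lists (one per permit record). Two
    keywords are linked if they share a record (an assumed rule).
    Centrality is degree / (number of nodes - 1).
    """
    neighbors = {}
    for keywords in records:
        for kw in keywords:
            neighbors.setdefault(kw, set())
        for a, b in combinations(sorted(set(keywords)), 2):
            neighbors[a].add(b)
            neighbors[b].add(a)
    n = len(neighbors)
    if n < 2:
        return {kw: 0.0 for kw in neighbors}
    return {kw: len(adj) / (n - 1) for kw, adj in neighbors.items()}

# Hypothetical permit records after morpheme analysis.
records = [
    ["installation", "bridge"],
    ["installation", "pipe"],
    ["installation", "road", "bridge"],
]
centrality = degree_centrality(records)
most_central = max(centrality, key=centrality.get)
```

Closeness, betweenness, and eigenvector centrality need shortest-path or iterative computations and are usually taken from a graph library instead of being written by hand.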

An Analysis of University Record-Related Regulations and Proposals for Improvement (대학 기록 관련 규정의 현황 분석과 개선 방향)

  • SeonWook Kim
    • Journal of the Korean BIBLIA Society for library and Information Science / v.35 no.3 / pp.105-135 / 2024
  • This study aims to analyze the current state of record-related regulations in Korean universities and suggest improvement measures. The research involved a three-step process: exploratory analysis of existing regulations, establishment of classification criteria, and comparative analysis with the National Archives' guidelines. Data were gathered from 66 regulations across 63 universities. The findings reveal that many universities revised their regulations after 2020, often without clear standards, leading to inconsistencies. Notably, many private universities still lack proper record-related regulations. Discrepancies between the National Archives' guidelines and university practices were identified. To address these issues, it is recommended that the National Archives incorporate feedback from universities in guideline revisions, universities enhance their record management by consulting experts and increasing personnel, and record management professionals report their institutional status and share best practices. Accurate terminology use is essential to avoid confusion.

Building a Korean Sentiment Lexicon Using Collective Intelligence (집단지성을 이용한 한글 감성어 사전 구축)

  • An, Jungkook;Kim, Hee-Woong
    • Journal of Intelligence and Information Systems / v.21 no.2 / pp.49-67 / 2015
  • Recently, the emergence of big data and social media has brought about a big bang of data. Social networking services are widely used by people around the world, and they have become major communication tools for all ages. Over the last decade, as online social networking sites have become increasingly popular, companies have tended to focus on advanced social media analysis for their marketing strategies. In addition to social media analysis, companies are mainly concerned about the propagation of negative opinions on social networking sites such as Facebook and Twitter, as well as on e-commerce sites. Online word of mouth (WOM) such as product ratings, product reviews, and product recommendations is very influential, and negative opinions have a significant impact on product sales. This trend has increased researchers' attention to natural language processing tasks such as sentiment analysis. Sentiment analysis, also referred to as opinion mining, is the process of identifying the polarity of subjective information and has been applied to various research and practical fields. However, the Korean language (Hangul) poses obstacles for natural language processing because it is an agglutinative language with rich morphology. Therefore, there is a lack of Korean natural language processing resources such as a sentiment lexicon, and this has resulted in significant limitations for researchers and practitioners who are considering sentiment analysis. Our study builds a Korean sentiment lexicon with collective intelligence and provides an API (Application Programming Interface) service to open and share the sentiment lexicon data with the public (www.openhangul.com). For pre-processing, we created a Korean lexicon database with over 517,178 words and classified them into sentiment and non-sentiment words. 
In order to classify them, we first identified stop words, which are quite likely to play a negative role in sentiment analysis, and excluded them from our sentiment scoring. In general, sentiment words are nouns, adjectives, verbs, and adverbs, as they carry sentimental expressions such as positive, neutral, and negative. On the other hand, non-sentiment words are interjections, determiners, numerals, postpositions, etc., as they generally carry no sentimental expressions. To build a reliable sentiment lexicon, we adopted the concept of collective intelligence as a model for crowdsourcing. In addition, the concept of folksonomy was implemented in the taxonomy process to support collective intelligence. To make up for an inherent weakness of folksonomy, we adopted majority rule by building a voting system. Participants, as voters, were offered three voting options (positivity, negativity, and neutrality), and the voting was conducted on one of the largest social networking sites for college students in Korea. More than 35,000 votes have been cast by college students in Korea, and we keep this voting system open by maintaining the project as a perpetual study. Moreover, any change in the sentiment score of a word can be an important observation because it enables us to keep track of temporal changes in Korean as a natural language. Lastly, our study offers a RESTful, JSON-based API service through a web platform to make the lexicon easier to use for researchers, companies, and developers. Overall, our study makes important contributions to both research and practice. In terms of research, our Korean sentiment lexicon plays an important role as a resource for Korean natural language processing. In terms of practice, practitioners such as managers and marketers can implement sentiment analysis effectively by using the Korean sentiment lexicon we built. 
Moreover, our study sheds new light on the value of folksonomy by combining it with collective intelligence, and we also expect it to give a new direction and a new start to the development of Korean natural language processing.
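
The majority-rule voting described in this abstract might look something like the sketch below. It is an illustration under stated assumptions rather than the project's actual scoring logic: the function name `majority_label`, the minimum vote count, and the requirement of a strict majority (more than half of the votes) are all hypothetical choices not given in the abstract.

```python
from collections import Counter

def majority_label(votes, min_votes=3):
    """Return the label chosen by a strict majority of voters, else None.

    votes is a list of "positive", "negative", or "neutral" strings.
    min_votes and the strict-majority rule are assumptions.
    """
    if len(votes) < min_votes:
        return None  # too few votes to decide
    label, count = Counter(votes).most_common(1)[0]
    if count * 2 <= len(votes):
        return None  # no option has more than half of the votes
    return label

# Hypothetical votes for three words.
clear_case = majority_label(["positive", "positive", "negative"])
split_case = majority_label(["positive", "negative", "neutral"])
sparse_case = majority_label(["negative"])
```

Because the voting stays open, a word's label can be recomputed as votes accumulate, which is how a change in sentiment over time would show up.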

An Intelligence Support System Research on KTX Rolling Stock Failure Using Case-based Reasoning and Text Mining (사례기반추론과 텍스트마이닝 기법을 활용한 KTX 차량고장 지능형 조치지원시스템 연구)

  • Lee, Hyung Il;Kim, Jong Woo
    • Journal of Intelligence and Information Systems / v.26 no.1 / pp.47-73 / 2020
  • KTX rolling stock is a system consisting of several machines, electrical devices, and components. Its maintenance requires considerable expertise and experience on the part of maintenance workers. In the event of a rolling stock failure, the knowledge and experience of the maintainer determine how quickly and how well the problem is solved, and the resulting availability of the vehicle varies accordingly. Although problem solving is generally based on fault manuals, experienced and skilled professionals can quickly diagnose failures and take action by applying personal know-how. Since this knowledge exists in tacit form, it is difficult to pass on completely to a successor, and previous studies have developed case-based rolling stock expert systems to turn it into data-driven knowledge. Nonetheless, research on KTX rolling stock, the most commonly used rolling stock on the main line, and on systems that extract the meaning of text and search for similar cases is still lacking. Therefore, this study proposes an intelligent support system that provides an action guide for new failures by using the know-how of rolling stock maintenance experts as problem-solving cases. For this purpose, a case base was constructed by collecting rolling stock failure data generated from 2015 to 2017, and an integrated dictionary was constructed separately from the case base to include essential terminology and failure codes, in consideration of the specialized nature of the railway rolling stock sector. Based on the deployed case base, each new failure was matched against past cases, and the three most similar failure cases were extracted so that the actions actually taken in those cases could be proposed as a diagnostic guide. 
In this study, to compensate for the limitations of keyword-matching case search in earlier case-based rolling stock failure expert systems, various dimensionality reduction techniques were applied to calculate similarity while taking into account the semantic relationships among failure details, and their usefulness was verified through experiments. Similar cases were retrieved by applying three algorithms, Non-negative Matrix Factorization (NMF), Latent Semantic Analysis (LSA), and Doc2Vec, to extract the characteristics of each failure and by measuring the cosine distance between the resulting vectors. Precision, recall, and F-measure were used to assess the performance of the proposed actions. To compare the dimensionality reduction techniques, they were evaluated alongside two baselines, an algorithm that randomly extracts failure cases with identical failure codes and an algorithm that applies cosine similarity directly to words, and an analysis of variance confirmed that the performance differences among the five algorithms were statistically significant. In addition, optimal techniques for practical application were derived by verifying differences in performance depending on the number of dimensions used for dimensionality reduction. The analysis showed that the word-based cosine similarity outperformed the dimensionality reduction using Non-negative Matrix Factorization (NMF) and Latent Semantic Analysis (LSA), and that the algorithm using Doc2Vec performed best. Furthermore, among the dimensionality reduction techniques, performance improved as the number of dimensions increased, up to an appropriate level. 
Through this study, we confirmed the usefulness of effective methods for extracting characteristics of data and converting unstructured data when applying case-based reasoning in the specialized field of KTX rolling stock, where most attributes are recorded as text. Text mining is being studied for use in many areas, but studies using such text data are still lacking in environments with many specialized terms and limited access to data, such as the one considered in this study. In this regard, the study is significant in that it is the first to present an intelligent diagnostic system that suggests actions by searching for cases with text mining techniques that extract the characteristics of a failure, complementing keyword-based case search. It is expected to serve as a basic study for developing diagnostic systems that can be used immediately in the field.
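
The retrieval step in this abstract (find the three most similar past failures and propose their actions) can be sketched with the word-based cosine similarity baseline it mentions. This is a simplified illustration, not the authors' system: the names `cosine_similarity` and `retrieve_similar_cases`, the `tokens`/`action` record layout, and the toy case base are assumptions, and the NMF, LSA, and Doc2Vec variants are not shown.

```python
import math
from collections import Counter

def cosine_similarity(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def retrieve_similar_cases(case_base, query_tokens, k=3):
    """Return (similarity, action) for the k most similar past cases."""
    query = Counter(query_tokens)
    scored = [
        (cosine_similarity(query, Counter(case["tokens"])), case)
        for case in case_base
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(round(score, 3), case["action"]) for score, case in scored[:k]]

# Hypothetical case base; the field names are illustrative only.
case_base = [
    {"tokens": ["door", "sensor", "fault"], "action": "replace door sensor"},
    {"tokens": ["brake", "pressure", "low"], "action": "check brake line"},
    {"tokens": ["door", "motor", "fault"], "action": "inspect door motor"},
    {"tokens": ["pantograph", "arc"], "action": "inspect pantograph"},
]
guide = retrieve_similar_cases(case_base, ["door", "sensor", "fault", "alarm"])
```

Swapping the bag-of-words vectors for reduced-dimension vectors would leave the ranking logic unchanged, which is what makes the comparison across algorithms in the abstract straightforward.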

Color-related Query Processing for Intelligent E-Commerce Search (지능형 검색엔진을 위한 색상 질의 처리 방안)

  • Hong, Jung A;Koo, Kyo Jung;Cha, Ji Won;Seo, Ah Jeong;Yeo, Un Yeong;Kim, Jong Woo
    • Journal of Intelligence and Information Systems / v.25 no.1 / pp.109-125 / 2019
  • As interest in intelligent search engines increases, various studies have been conducted to extract and utilize product-related features intelligently. In particular, when users search for goods in e-commerce search engines, the 'color' of a product is an important feature that describes the product. Therefore, it is necessary to handle synonyms of color terms in order to produce accurate results for users' color-related queries. Previous studies have suggested a dictionary-based approach to processing synonyms for color features. However, the dictionary-based approach has the limitation that it cannot handle unregistered color-related terms in user queries. To overcome this limitation of the conventional methods, this research proposes a model that extracts RGB values from an internet search engine in real time and outputs similar color names based on the designated color information. First, a color term dictionary for basic color search was constructed, which includes color names and the R, G, B values of each color from the Korean color standard digital palette program and the Wikipedia color list. The dictionary was made more robust by adding 138 color names, converted from English color names into Korean loanwords, with their corresponding RGB values. The final color dictionary thus includes a total of 671 color names and corresponding RGB values. The proposed method starts from a specific color that a user searched for. Then, the presence of the searched color in the built-in color dictionary is checked. If the color exists in the dictionary, its RGB values in the dictionary are used as the reference values of the searched color. If the searched color does not exist in the dictionary, the top-5 Google image search results for the searched color are crawled, and average RGB values are extracted from a certain middle area of each image. 
To extract the RGB values from images, a variety of approaches were attempted, since simply averaging the RGB values of the center area of the images has limits. As a result, clustering the RGB values in a certain area of the image and taking the average value of the cluster with the highest density as the reference values showed the best performance. The reference RGB values of the searched color are then compared with the RGB values of all the colors in the previously constructed color dictionary. A color list is created from the colors whose R, G, and B values each fall within $\pm 50$ of the reference values. Finally, using the Euclidean distance between these candidates and the reference RGB values of the searched color, the color with the highest similarity among up to five colors becomes the final outcome. To evaluate the usefulness of the proposed method, we performed an experiment in which 300 color names and their corresponding RGB values were obtained through questionnaires. These were used to compare the RGB values obtained from four different methods, including the proposed method. The average Euclidean distance in CIE-Lab using our method was about 13.85, which is relatively low compared to 3088 for the case using the synonym dictionary only and 30.38 for the case using the dictionary with the Korean synonym website WordNet. The variant of the proposed method without clustering showed an average Euclidean distance of 13.88, which implies that the DBSCAN clustering of the proposed method can reduce the Euclidean distance. This research suggests a new color synonym processing method based on RGB values that combines the dictionary method with real-time synonym processing for new color names. This method removes the limitation of the dictionary-based approach, the conventional synonym processing method. 
This research can help improve the intelligence of e-commerce search systems, especially their color search feature.
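
The final matching step in this abstract (keep dictionary colors within ±50 on each channel, then rank by Euclidean distance) can be sketched as below. This is a minimal illustration and not the authors' code: the function name `closest_color`, the tiny palette, and the choice to return `None` when no color passes the filter are assumptions, and the image crawling and DBSCAN clustering that produce the reference values are not shown.

```python
import math

def closest_color(reference, dictionary, tolerance=50, max_candidates=5):
    """Return the dictionary color name closest to the reference RGB.

    Only colors whose R, G, and B values are each within `tolerance` of
    the reference are considered; up to `max_candidates` of them are
    shortlisted by Euclidean distance and the nearest one is returned.
    """
    candidates = [
        (math.dist(reference, rgb), name)
        for name, rgb in dictionary.items()
        if all(abs(r - c) <= tolerance for r, c in zip(reference, rgb))
    ]
    candidates.sort()
    shortlist = candidates[:max_candidates]
    return shortlist[0][1] if shortlist else None

# Hypothetical palette standing in for the 671-entry color dictionary.
palette = {
    "red": (255, 0, 0),
    "crimson": (220, 20, 60),
    "navy": (0, 0, 128),
    "tomato": (255, 99, 71),
}
match = closest_color((230, 30, 50), palette)
no_match = closest_color((0, 255, 0), palette)
```

The per-channel filter cheaply discards colors that differ strongly on any single channel before the distance ranking is applied.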

The Development of the Korean Form of Childhood Attention Problem(CAP) Scale: A Study on the Reliability and Validity (한국형 소아기 집중력 문제척도: 신뢰도 및 타당도 연구)

  • Seo, Wan-Seok;Lee, Jong-Bum;Park, Hyung-Bae;Suh, Hyea-Soo;Lee, Kwang-Hun;SaKong, Jeong-Kyu
    • Journal of Yeungnam Medical Science / v.14 no.1 / pp.123-136 / 1997
  • The purpose of this study was to examine the reliability and validity of a Korean form of the Childhood Attention Problem (CAP) scale. The CAP was administered to 98 normal elementary school students as a control group and to 98 attention deficit hyperactivity disorder patients. Male students showed higher scores than female students in both subscale and total scores, but the differences were not statistically significant. There was no significant difference in CAP scores between male and female students among the attention deficit hyperactivity disorder patients. The test-retest reliability coefficients were highly satisfactory: 0.83 for the inattention subscale, 0.70 for the impulsivity subscale, and 0.82 for the total score. In the test of reliability by internal consistency, the Cronbach $\alpha$ coefficients were highly satisfactory: 0.91 for the inattention subscale and 0.89 for the overactivity subscale (p<0.05). The concurrent validity between the CAP scale and the ADDES-BV scale was 0.85 in the attention deficit hyperactivity disorder patient group and 0.73 in the normal control group (p<0.05). In the discriminant validity test between the attention deficit hyperactivity disorder patient group and the normal control group, the patient group showed higher scores (p<0.05). The total discriminant capacity of the patient group in CAP was 93.4%. From this point of view, the CAP scale showed high reliability and validity when applied to Korean subjects, proved to be a good and simple screening tool for attention deficit hyperactivity disorder research, and can help many young patients receive early treatment.
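
For readers unfamiliar with the internal-consistency measure reported in this abstract, the sketch below computes Cronbach's alpha from item scores using the standard formula, alpha = k/(k-1) × (1 − sum of item variances / variance of total scores). The function name `cronbach_alpha` and the toy scores are assumptions for illustration; they are not data from the study.

```python
from statistics import variance

def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of items.

    item_scores[i][j] is the score of respondent j on item i.
    Sample variances are used for both items and totals.
    """
    k = len(item_scores)
    totals = [sum(responses) for responses in zip(*item_scores)]
    item_variance_sum = sum(variance(item) for item in item_scores)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))

# Hypothetical scores: three items answered by four respondents.
scores = [
    [1, 2, 3, 4],
    [2, 3, 4, 5],
    [1, 3, 3, 5],
]
alpha = cronbach_alpha(scores)
```

Values close to 1 indicate that the items move together across respondents, which is the sense in which the subscale coefficients above are described as highly satisfactory.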

A Study on Body Image of Women Who Participate in Physical Exercise (스포츠 센터 운동 참여에 따른 여성의 신체이미지에 관한 연구)

  • Kang, Byeol-Nim
    • Proceedings of the Korea Contents Association Conference / 2006.11a / pp.145-148 / 2006
  • This study aimed to prevent women from suffering from health problems and stress due to excessive lookism and to encourage them to participate in sports activities so as to form a desirable body image and eventually live a healthy and sound life. To achieve this goal, this study defined a population of members of sports centers located in the Seoul and Daejeon areas as of 2006 and drew a sample of 450 participants in physical exercise at a sports center through stratified cluster random sampling, together with a sample of 450 non-participants through a survey of mothers and sisters of elementary and secondary school students within the areas used for sampling the participants' group, thereby analyzing data on a total of 900 persons. A questionnaire was used to collect data; a reliability test yielded $\alpha$=.807, $\alpha$=.819, and $\alpha$=.784 for the weight-, health-, and figure-related factors, respectively. The data were analyzed using t-tests, one-way ANOVA, and analysis of covariance. The following conclusions were drawn. Participation in physical exercise has a positive effect on body image. Participants in physical exercise at a sports center show higher satisfaction with body image than non-participants.

An Analysis of High School Technology·Home Economics Textbooks' Activities to Improve the Resilience of Youth (청소년의 회복탄력성에 대한 고등학교 기술·가정 교과서 활동과제 분석)

  • Choi, Yoo-ri;Kim, Eun-Jong;Lee, So-Young;Lee, Gi-Sen;Lim, So-Jin;Park, Mi-Jeong
    • Journal of Korean Home Economics Education Association / v.30 no.4 / pp.37-55 / 2018
  • The purpose of this study is to contribute to the improvement of the resilience of adolescents through an analysis of the activities in high school Technology·Home Economics textbooks developed according to the 2015 revised high school Technology·Home Economics curriculum. For this purpose, we analyzed the activities of 12 high school Technology·Home Economics textbooks in the 'human development and family' and 'family life and safety' areas based on the sub-factors of resilience. A total of 303 activities were extracted from the 12 textbooks. After three analysts examined the activities, the analysis criteria were revised and supplemented through consultation three times and then reviewed by three experts. First, the analysis found that, although the textbooks differed in the number of activities covered, they commonly focused on raising interpersonal ability (54.8%) among the sub-factors of resilience, followed by self-regulation (39.4%) and positivity (5.8%). Second, the analysis of the activities by core concept showed that the most activities dealing with the sub-factors of resilience were in the 'family life and safety' area, which deals with 'safety' (44.3%) as a core concept. The 'human development and family' area, which deals with 'development' (25.1%) and 'relationships' (36%) as core concepts, also covered the sub-factors of resilience. From this it can be inferred that the home economics curriculum is suitable for systematic resilience education, and that the curriculum had considered and dealt with resilience before the term was specifically mentioned. We hope that the results of this study will be used as basic data for the development of home economics and resilience education programs in the future.