• Title/Summary/Keyword: text-based similarity (텍스트 기반 유사도)

Search Result 196, Processing Time 0.024 seconds

Incorporating Social Relationship discovered from User's Behavior into Collaborative Filtering (사용자 행동 기반의 사회적 관계를 결합한 사용자 협업적 여과 방법)

  • Thay, Setha;Ha, Inay;Jo, Geun-Sik
    • Journal of Intelligence and Information Systems / v.19 no.2 / pp.1-20 / 2013
  • Social networks have become a huge communication platform that lets people connect with one another and brings users together to share common interests, experiences, and daily activities. Users spend hours per day maintaining personal information and interacting with other people via posting, commenting, messaging, games, social events, and applications. Given the growth of users' distributed information in social networks, there is great potential to use social data to enhance the quality of recommender systems. Some research on social network analysis investigates how social networks can be used in the recommendation domain. Within this research, we are interested in taking advantage of the interaction between a user and others in a social network, which can be identified and is known as a social relationship. Furthermore, users' purchase decisions mostly depend on suggestions from people who have either the same preferences or a closer relationship. For this reason, we believe that users' relationships in social networks can provide an effective way to improve how well a recommender system predicts users' interests. The social relationship between users found in a social network is therefore a common factor for improving how the conventional approach predicts users' preferences. Recommender systems are rapidly increasing in popularity and are currently used by many e-commerce sites such as Amazon.com, Last.fm, and eBay.com. Collaborative filtering (CF) is one of the essential and powerful techniques in recommender systems for suggesting appropriate items to a user by learning the user's preferences. CF focuses on user data and automatically generates predictions about a user's interests by gathering information from users who share a similar background and preferences.
Specifically, the intention of CF is to find users who have similar preferences and to suggest to the target user the items most preferred by those nearest-neighbor users. CF considers two basic units, the user and the item. Each user provides rating values on items (e.g., movies, products, books) to indicate interest in those items. CF uses the user-rating matrix to find a group of users whose ratings are similar to the target user's. It then predicts unknown rating values for items that the target user has not rated. CF has been successfully implemented in both information filtering and e-commerce applications. However, important challenges remain, such as cold start, data sparsity, and scalability, which affect the quality and accuracy of prediction. To overcome these challenges, many researchers have proposed various kinds of CF, such as hybrid CF, trust-based CF, and social network-based CF. To improve the recommendation performance and prediction accuracy of standard CF, in this paper we propose a method that integrates the traditional CF technique with the social relationship between users discovered from users' behavior in a social network, namely Facebook. We identify users' relationships from their behavior, such as posts and comments exchanged with friends on Facebook. We believe that a social relationship implicitly inferred from user behavior can compensate for the limitations of the conventional approach. We therefore extract each user's posts and comments using the Facebook Graph API and calculate a feature score for each term to obtain a feature vector for computing user similarity. We then combine the result with the similarity value computed using the traditional CF technique. Finally, our system provides a list of recommended items according to the neighbor users who have the largest total similarity value to the target user.
To verify and evaluate the proposed method, we performed an experiment on data collected from our Movies Rating System. Prediction accuracy is evaluated in terms of MAE to show how correct our algorithm's recommendations are. Performance is then evaluated in terms of precision, recall, and F1-measure to show the effectiveness of our method. Coverage is also evaluated to assess the ability to generate recommendations. The experimental results show that our proposed method outperforms the benchmark methods, suggesting items to users more accurately and with better performance. In particular, using users' behavior in the social network improves recommendation accuracy by up to 6%. Moreover, the recommendation performance experiment shows that incorporating the social relationship observed from user behavior into CF is beneficial, generating recommendations with a 7% improvement in performance compared with benchmark methods. Finally, we confirm that interaction between users in a social network can enhance the accuracy of the conventional approach and yield better recommendations.
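
The combination step described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not name the similarity measure or the combination rule, so cosine similarity, the weighted sum, and the names `cosine`, `combined_similarity`, and `weight` are all assumptions made for illustration. MAE is included because the abstract uses it as the accuracy measure.

```python
import math


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    shared = set(u) & set(v)
    dot = sum(u[k] * v[k] for k in shared)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def combined_similarity(ratings_a, ratings_b, terms_a, terms_b, weight=0.5):
    """Blend rating-based and text-based similarity.

    ratings_*: item -> rating (the CF side).
    terms_*:   term -> feature score from posts and comments (the social side).
    weight:    hypothetical mixing parameter, not taken from the paper.
    """
    rating_sim = cosine(ratings_a, ratings_b)
    social_sim = cosine(terms_a, terms_b)
    return weight * rating_sim + (1 - weight) * social_sim


def mean_absolute_error(predicted, actual):
    """MAE between predicted and actual ratings of equal length."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)
```

Neighbors would then be ranked by `combined_similarity` against the target user, and the highest-scoring ones used to produce the recommendation list.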

Research on Development of Support Tools for Local Government Business Transaction Operation Using Big Data Analysis Methodology (빅데이터 분석 방법론을 활용한 지방자치단체 단위과제 운영 지원도구 개발 연구)

  • Kim, Dabeen;Lee, Eunjung;Ryu, Hanjo
    • The Korean Journal of Archival Studies / no.70 / pp.85-117 / 2021
  • The purpose of this study is to investigate and analyze the current status of the unit tasks used by local governments, their operation, and the associated record management problems, and to present improvement measures using text-based big data technology based on the implications derived from that process. Record management in local governments is in a serious state owing to errors in preservation periods caused by misclassified unit tasks, the inability to identify the types of common and institutional affairs, errors in unit tasks and their names, and problems with referenceable standards and tools. Moreover, there are about 720,000 unit tasks, too many to control effectively, so strict and controllable tools and standards are needed. To solve these problems, this study developed a system that applies text-based big data analysis tools, such as corpus and tokenization technology, to the names and constituent terms that make up the record management standard. These unit task operation support tools are expected to contribute significantly to record management because they support the operation of the standard, for example by applying uniform preservation periods, identifying delegated office records, controlling the creation of duplicate and similar unit tasks, and handling common tasks. If the big data analysis methodology can be linked to BRM and RMS in the future, the quality of record management standard work is expected to improve.

Investigation of Elementary Students' Scientific Communication Competence Considering Grammatical Features of Language in Science Learning (과학 학습 언어의 문법적 특성을 고려한 초등학생의 과학적 의사소통 능력 고찰)

  • Maeng, Seungho;Lee, Kwanhee
    • Journal of Korean Elementary Science Education / v.41 no.1 / pp.30-43 / 2022
  • In this study, elementary students' science communication competence was investigated based on the grammatical features expressed in their language-use in classroom discourse and science writings. The classes were designed to integrate the evidence-based reasoning framework and traditional learning cycle and were conducted on fifth graders in an elementary school. Eight elementary students' discourse data and writings were analyzed using lexico-grammatical resource analysis, which examined the discourse text's content and logical relations. The results revealed that the student language used in analyzing data, interpreting evidence, or constructing explanations did not precisely conform to the grammatical features in science language use. However, they provided examples of grammatical metaphors by nominalizing observed events in the classroom discourses and those of causal relations in their writings. Thus, elementary students can use science language grammatically from science language-use experiences through listening to a teacher's instructional discourses or recognizing the grammatical structures of science texts in workbooks. The opportunities in which elementary students experience the language-use model in science learning need to be offered to understand the appropriate language use in the epistemic context of evidence-based reasoning and learn literacy skills in science.

A Study on Intertextuality in <2013 Home of the Legends> (연작 웹툰 《2013 전설의 고향》에 나타나는 상호텍스트성 연구)

  • Yang, Hyelim
    • Cartoon and Animation Studies / s.34 / pp.293-316 / 2014
  • <Home of the Legends> (傳說의 故鄕) is a television drama series of one-act plays based on Korean legends and folktales. It first aired on KBS in 1977, and as its name gained popularity and public recognition, it has been borrowed in a variety of genres such as books and movies. In this sense, it is a representative work of the Korean horror genre. Recently, for example, the serial webtoon <2013 Home of the Legends> has been published on NAVER, one of the main portal websites, since July 2013. This webtoon is the main subject of this study. The purpose of this study is to discuss how the genre characteristics of Korean horror in the TV series are transmitted and changed in the serial webtoon <2013 Home of the Legends>. The TV series is a representative narrative based on Korean folktales; it varies its narrative by combining original motifs, within a range that does not destroy the basic moves of the folktale. The serial webtoon <2013 Home of the Legends>, however, deconstructs this combination of motifs in the folktale form and leads to new moves in the narrative. For Korean users accustomed to the Korean folktale form as the architext, this is expected to work as a reversal and produce catharsis. Meanwhile, deconstructing the combination of motifs dissolves the cause-and-effect structure that forms the axis of the original narrative form, which weakens the themes of good overcoming evil and punitive justice. The changes seen in <2013 Home of the Legends> represent a new orientation for Korean horror.

Comparison Between Hidden Layers of Neural Networks and Topics for Hidden Layer Comprehension (인공신경망 은닉층 해석을 위한 토픽과의 비교)

  • Jeong, Young-Seob
    • Proceedings of the Korea Information Processing Society Conference / 2017.04a / pp.910-913 / 2017
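  • With the growing volume of data, data analysis techniques based on artificial neural networks are attracting attention. They automatically analyze many kinds of data, from text to images and video, and are used in research and service development for translators, chatbots, automatic image caption generation, and more. Many studies based on artificial neural networks share a common limitation: the hidden layer is difficult to interpret. For example, if an artificial neural network consisting of an input layer, a hidden layer, and an output layer is trained on arbitrary data, the matrix between the input layer and the hidden layer comes to contain the pattern information present in that data. If the pattern information in the matrix could be analyzed directly, it would therefore be possible not only to interpret the network's output but also to gain intuition about what adjustments are needed to improve performance. However, the matrix actually consists of numeric vectors, so a person cannot interpret it directly, and most neural network studies to date share this limitation. This study seeks clues to interpreting the hidden layers of artificial neural networks by comparing the network's output with a topic model, which captures patterns in data while remaining interpretable. Through experiments, we verify the similarity between topics and hidden-layer patterns and discuss the possibilities of hidden layers in future neural network research.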

A Comparative Study on Korean Zero-shot Relation Extraction using a Large Language Model (거대 언어 모델을 활용한 한국어 제로샷 관계 추출 비교 연구)

  • Jinsung Kim;Gyeongmin Kim;Kinam Park;Heuiseok Lim
    • Annual Conference on Human and Language Technology / 2023.10a / pp.648-653 / 2023
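  • Relation extraction is the task of inferring the appropriate relation between two entities from a given text, and it underpins application tasks such as knowledge base construction and question answering. As the internal knowledge of generative large language models has recently been used to achieve outstanding performance across natural language processing, ways to actively use these models in relation extraction, a representative information extraction task, also need to be explored. In particular, given the importance of relation extraction research in low-resource and especially zero-shot settings, which stems from their similarity to real-world inference environments, many previous studies have demonstrated that applying effective prompting techniques is meaningful. This study therefore conducts a comparative study of zero-shot inference in Korean relation extraction by applying a variety of prompting techniques to large language models, aiming to provide a basis for future in-depth research on optimal large language model prompting techniques for Korean relation extraction. Specifically, we introduce three prompting techniques into Korean relation extraction, including Chain-of-Thought and Self-Refine, which have shown large performance gains on other challenging tasks such as commonsense reasoning, and provide quantitative and qualitative comparative analyses. According to the experimental results, prompting that includes general task instructions achieves the best quantitative zero-shot performance, ahead of the Chain-of-Thought and Self-Refine techniques. However, this can be interpreted not as exposing the limitations of the two methods but as implying the need to optimize them for the Korean relation extraction task, and we expect improvement from future experimental studies that develop these methodologies.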

A Design and Implementation of a Content_Based Image Retrieval System using Color Space and Keywords (칼라공간과 키워드를 이용한 내용기반 화상검색 시스템 설계 및 구현)

  • Kim, Cheol-Ueon;Choi, Ki-Ho
    • The Transactions of the Korea Information Processing Society / v.4 no.6 / pp.1418-1432 / 1997
  • Most general content-based image retrieval techniques use color and texture as retrieval indices. Among color techniques, retrieval based on color histograms and color pairs suffers from a lack of spatial information and text. This paper describes the design and implementation of a content-based image retrieval system using color space and keywords. The preprocessor for image retrieval uses the existing HSI (Hue, Saturation, Intensity) coordinate system and splits each image into a chromatic region and an achromatic region. The image size must be normalized to 200*N or N*200, and true colors converted to 256 colors. Two color histograms, one for the background and one for the object, are used to decide on color selection in the color space. Spatial information is obtained using maximum entropy discretization. Keywords make it possible to choose the class, color, shape, location, and size of an image. An input color is limited to 15 keywords for the chromatic and achromatic colors of the Korea Industrial Standards. The image retrieval method uses similarity as the key retrieval property. The user can choose the weight values of color space $\alpha$ (%) and keyword $\beta$ (%) when entering the query words, adjusting them according to the properties of the image contents. In the test, retrieval using only the extracted features, such as color space and keyword, for the query image gave lower results than retrieval with weight values. With weight values, the averages of the measured parameters are approximately Precision 0.858, Recall 0.936, RT 1, and MT 0. These results show a higher retrieval effect than content-based image retrieval using color space or keywords alone.
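
As a rough illustration of the user-chosen weights $\alpha$ (%) and $\beta$ (%), the Python sketch below blends a color-space similarity score with a keyword similarity score and computes precision and recall for a result list. The abstract does not give the combination formula, so the normalized weighted average and the function names `weighted_similarity` and `precision_recall` are assumptions made for illustration only.

```python
def weighted_similarity(color_sim, keyword_sim, alpha=50.0, beta=50.0):
    """Blend two similarity scores in [0, 1] using percentage weights.

    alpha: weight (%) for the color-space score.
    beta:  weight (%) for the keyword score.
    The normalized weighted average is an assumed combination rule.
    """
    total = alpha + beta
    if total <= 0:
        raise ValueError("alpha + beta must be positive")
    return (alpha * color_sim + beta * keyword_sim) / total


def precision_recall(retrieved, relevant):
    """Precision and recall of a retrieved list against a relevant set."""
    retrieved_set = set(retrieved)
    relevant_set = set(relevant)
    hits = len(retrieved_set & relevant_set)
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    recall = hits / len(relevant_set) if relevant_set else 0.0
    return precision, recall
```

For example, `weighted_similarity(0.8, 0.4, alpha=75, beta=25)` favors the color-space score three to one over the keyword score.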

System Implement to Identify Copyright Infringement Based on the Text Reference Point (텍스트 기준점 기반의 저작권 침해 판단 시스템 구현)

  • Choi, Kyung-Ung;Park, Soon-Cheol;Yang, Seung-Won
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.15 no.1 / pp.77-84 / 2015
  • Most existing methods build an index key from every 6 words of every sentence in a document in order to identify copyright infringement between two documents. However, these methods take a long time to inspect for copyright infringement because indexing a large-scale document is slow. In this paper, we propose a method that selects the longest word (called a feature block) as the index key within a window of predetermined size that scans a document character by character. The method removes duplicate blocks while scanning a document, dramatically reducing the number of index keys. A system using this method can find the copyright infringement positions in two documents very accurately and quickly, since a relatively small number of blocks are compared.
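
The feature-block selection described in this abstract can be sketched in Python as follows. This is an illustrative reading, not the authors' system: the abstract does not say how a word straddling the window edge is handled or how ties are broken, so the sketch assumes only words lying fully inside the window count and that the earliest word wins a tie. The function name `feature_blocks` and the default window size are hypothetical.

```python
import re


def feature_blocks(text, window=20):
    """Return de-duplicated longest words found by a sliding character window.

    The window advances one character at a time. In each position the
    longest word lying fully inside the window is taken as the feature
    block; a block already seen is not recorded again.
    """
    # Record each word together with its character span.
    words = [(m.start(), m.end(), m.group()) for m in re.finditer(r"\w+", text)]
    blocks = []
    seen = set()
    for start in range(max(1, len(text) - window + 1)):
        end = start + window
        inside = [w for s, e, w in words if s >= start and e <= end]
        if not inside:
            continue
        longest = max(inside, key=len)  # earliest word wins a tie
        if longest not in seen:
            seen.add(longest)
            blocks.append(longest)
    return blocks
```

Because neighboring windows usually share their longest word, the number of distinct blocks stays far below the number of window positions, which is the reduction in index keys the abstract describes. Two documents could then be compared by intersecting their block sets.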

Study of the Application of VQA Deep Learning Technology to the Operation and Management of Urban Parks - Analysis of SNS Images - (도시공원 운영 및 관리를 위한 VQA 딥러닝 기술 활용 연구 - SNS 이미지 분석을 중심으로 -)

  • Lee, Da-Yeon;Park, Seo-Eun;Lee, Jae Ho
    • Journal of the Korean Institute of Landscape Architecture / v.51 no.5 / pp.44-56 / 2023
  • This research explores the enhancement of park operation and management by analyzing the changing demands of park users. While traditional methods depended on surveys, there has been a recent shift towards utilizing social media data to understand park usage trends. Notably, most research has focused on text data from social media, overlooking the valuable insights from image data. Addressing this gap, our study introduces a novel method of assessing park usage using social media image data and then applies it to actual city park evaluations. A unique image analysis tool, built on Visual Question Answering (VQA) deep learning technology, was developed. This tool revealed specific city park details such as user demographics, behaviors, and locations. Our findings highlight three main points: (1) The VQA-based image analysis tool's validity was proven by matching its results with traditional text analysis outcomes. (2) VQA deep learning technology offers insights like gender, age, and usage time, which aren't accessible from text analysis alone. (3) Using VQA, we derived operational and management strategies for city parks. In conclusion, our VQA-based method offers significant methodological advancements for future park usage studies.

Brand Platformization and User Sentiment: A Text Mining Analysis of Nike Run Club with Comparative Insights from Adidas Runtastic (텍스트마이닝을 활용한 브랜드 플랫폼 사용자 감성 분석: 나이키 및 아디다스 러닝 앱 리뷰 비교분석을 중심으로)

  • Hanna Park;Yunho Maeng;Hyogun Kym
    • Knowledge Management Research / v.25 no.1 / pp.43-66 / 2024
  • In an era where digital technology reshapes brand-consumer interactions, this study examines the influence of Nike's Run Club and Adidas' Runtastic apps on loyalty and advocacy. Analyzing 3,715 English reviews from January 2020 to October 2023 through text mining, and conducting a focused sentiment analysis on 155 'recommend' mentions, we explore the nuances of 'hot loyalty'. The findings reveal Nike as a 'companion' with an emphasis on emotional engagement, versus Runtastic's 'tool' focus on reliability. This underscores the varied consumer perceptions across similar platforms, highlighting the necessity for brands to integrate user preferences and address technical flaws to foster loyalty. Demonstrating how customized technology adaptations impact loyalty, this research offers crucial insights for digital brand strategy, suggesting a proactive approach in app development and management for brand loyalty enhancement.