• Title/Summary/Keyword: 다중 해시

Search Result 30, Processing Time 0.031 seconds

A Study on the Emotion Analysis of Instagram Using Images and Hashtags (이미지와 해시태그를 이용한 인스타그램의 감정 분석 연구)

  • Jeong, Dahye;Gim, Jangwon
    • The Journal of Korean Institute of Information Technology
    • /
    • v.17 no.9
    • /
    • pp.123-131
    • /
    • 2019
  • Social network service users actively express and share their feelings about social issues and content of interest through postings. As a result, the sharing of emotions among individuals and community members in social network is spreading rapidly. Therefore, resulting in active research of emotion analysis on posting of users. However, There is insufficient research on emotion analysis for postings containing various emotions. In this paper, we propose a method that analyzes the emotions of an Instagram posts using hashtags and images. This method extracts representative emotion from user posts containing multiple emotions with 66.4% accuracy and 81.7% recall, which improves the emotion classification performance compared to the previous method.

Design and Implementation of Multiple Filter Distributed Deduplication System Applying Cuckoo Filter Similarity (쿠쿠 필터 유사도를 적용한 다중 필터 분산 중복 제거 시스템 설계 및 구현)

  • Kim, Yeong-A;Kim, Gea-Hee;Kim, Hyun-Ju;Kim, Chang-Geun
    • Journal of Convergence for Information Technology
    • /
    • v.10 no.10
    • /
    • pp.1-8
    • /
    • 2020
  • The need for storage, management, and retrieval techniques for alternative data has emerged as technologies based on data generated from business activities conducted by enterprises have emerged as the key to business success in recent years. Existing big data platform systems must load a large amount of data generated in real time without delay to process unstructured data, which is an alternative data, and efficiently manage storage space by utilizing a deduplication system of different storages when redundant data occurs. In this paper, we propose a multi-layer distributed data deduplication process system using the similarity of the Cuckoo hashing filter technique considering the characteristics of big data. Similarity between virtual machines is applied as Cuckoo hash, individual storage nodes can improve performance with deduplication efficiency, and multi-layer Cuckoo filter is applied to reduce processing time. Experimental results show that the proposed method shortens the processing time by 8.9% and increases the deduplication rate by 10.3%.

Similar Contents Recommendation Model Based On Contents Meta Data Using Language Model (언어모델을 활용한 콘텐츠 메타 데이터 기반 유사 콘텐츠 추천 모델)

  • Donghwan Kim
    • Journal of Intelligence and Information Systems
    • /
    • v.29 no.1
    • /
    • pp.27-40
    • /
    • 2023
  • With the increase in the spread of smart devices and the impact of COVID-19, the consumption of media contents through smart devices has significantly increased. Along with this trend, the amount of media contents viewed through OTT platforms is increasing, that makes contents recommendations on these platforms more important. Previous contents-based recommendation researches have mostly utilized metadata that describes the characteristics of the contents, with a shortage of researches that utilize the contents' own descriptive metadata. In this paper, various text data including titles and synopses that describe the contents were used to recommend similar contents. KLUE-RoBERTa-large, a Korean language model with excellent performance, was used to train the model on the text data. A dataset of over 20,000 contents metadata including titles, synopses, composite genres, directors, actors, and hash tags information was used as training data. To enter the various text features into the language model, the features were concatenated using special tokens that indicate each feature. The test set was designed to promote the relative and objective nature of the model's similarity classification ability by using the three contents comparison method and applying multiple inspections to label the test set. Genres classification and hash tag classification prediction tasks were used to fine-tune the embeddings for the contents meta text data. As a result, the hash tag classification model showed an accuracy of over 90% based on the similarity test set, which was more than 9% better than the baseline language model. Through hash tag classification training, it was found that the language model's ability to classify similar contents was improved, which demonstrated the value of using a language model for the contents-based filtering.

A Secure RFID Multi-Tag Search Protocol Without On-line Server (서버가 없는 환경에서 안전한 RFID 다중 태그 검색 프로토콜)

  • Lee, Jae-Dong
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.22 no.3
    • /
    • pp.405-415
    • /
    • 2012
  • In many applications a reader needs to determine whether a particular tag exists within a group of tags without a server. This is referred to as serverless RFID tag searching. A few protocols for the serverless RFID searching are proposed but they are the single tag search protocol which can search a tag at one time. In this paper, we propose a multi-tag search protocol based on a hash function and a random number generator which can search some tags at one time. For this study, we introduce a protocol which can resolve the problem of synchronization of seeds when communication error occurs in the S3PR protocol[1], and propose a multi-tag search protocol which can reduce the communication overhead. The proposed protocol is secure against tracking attack, impersonation attack, replay attack and denial-of-service attack. This study will be the basis of research for multi-tag serach protocol.

A Design of SMS DDoS Detection and Defense Method using Counting Bloom Filter (Counting Bloom Filter를 이용한 SMS DDoS 탐지 및 방어 기법 설계)

  • Shin, Kwang-Kyoon;Park, Ui-Chung;Jun, Moon-Seog
    • Proceedings of the KAIS Fall Conference
    • /
    • 2011.05a
    • /
    • pp.53-56
    • /
    • 2011
  • 지난 7.7 DDoS(Distributed Denial of Service), 3.3 DDoS 대란을 통해서 보여주듯 DDoS 공격이 네트워크 주요 위협요소로 매우 부각되고 있으나, 공격에 대해서 실시간으로 감지하고 대응하기에 어렵다. 그리고 현재 여러 분야에서 매우 많은 용도로 사용되는 SMS(Short Message Service)도 DDoS 공격 수단으로 사용되어 이동전화 시스템에 큰 혼란을 야기할 수 있다. 기존의 Bloom Filter 탐지 기법은 구조가 간단하고 실시간 탐지가 가능한 장점을 갖지만 오탐지율에 대한 문제점을 가진다. 본 논문에서는 목적지 기반의 다중의 해시함수를 사용한 Counting Bloom Filter 기법을 이용하여 임계치 이상 카운트된 동일한 목적지로 발송되는 SMS에 대하여 공격으로 탐지하고 SMSC에 통보하여 차단시키는 시스템을 제안한다.

  • PDF

SVC and CAS Combining Scheme for Support Multi-Device watching Environment (다중기기 시청 환경을 지원하기 위한 SVC와 CAS 결합 기법)

  • Son, Junggab;Lee, Hoonjung;Oh, Heekuck
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2011.11a
    • /
    • pp.907-910
    • /
    • 2011
  • IPTV나 DTV에서 사용하는 CAS는 하나의 스트리밍으로 하나의 콘텐츠만을 전송하는 환경이지만 SVC와의 결합을 통해 사용자의 다양한 비디오 어플리케이션을 단일 스트리밍으로 지원하도록 개선할 수 있다. 이러한 환경에서는 효율성을 우선적으로 설계하여야 하며, 서비스 등급별 과금 정책을 위해 계층적 키 관리 기법의 적용이 필요하다. 본 논문에서는 CAS에 SVC를 적용함에 있어 발생할 수 있는 문제점들에 대해 살펴보고 CAS환경에서의 SVC 암호화기법에 대해 제안한다. 제안하는 기법의 안전성은 기존 CAS와 단방향 해시 함수의 안전성에 기반하며, 기존 CAS에 비교적 적은 오버헤드로 적용이 가능하다는 장점이 있다.

Effective Load Shedding for Multi-Way windowed Joins Based on the Arrival Order of Tuples on Data Streams (다중 윈도우 조인을 위한 튜플의 도착 순서에 기반한 효과적인 부하 감소 기법)

  • Kwon, Tae-Hyung;Lee, Ki-Yong;Son, Jin-Hyun;Kim, Myoung-Ho
    • Journal of KIISE:Databases
    • /
    • v.37 no.1
    • /
    • pp.1-11
    • /
    • 2010
  • Recently, there has been a growing interest in the processing of continuous queries over multiple data streams. When the arrival rates of tuples exceed the memory capacity of the system, a load shedding technique is used to avoid the system becoming overloaded by dropping some subset of input tuples. In this paper, we propose an effective load shedding algorithm for multi-way windowed joins over multiple data streams. Most previous load shedding algorithms estimate the productivity of each tuple, i.e., the number of join output tuples produced by the tuple, based on its "join attribute value" and drop tuples with the lowest productivity. However, the productivity of a tuple cannot be accurately estimated from its join attribute value when the join attribute values are unique and do not repeat, or the distribution of the join attribute values changes over time. For these cases, we estimate the productivity of a tuple based on its "arrival order" on data streams, rather than its join attribute value. The proposed method can effectively estimate the productivity of a tuple even when the productivity of a tuple cannot be accurately estimated from its join attribute value. Through extensive experiments and analysis, we show that our proposed method outperforms the previous methods in terms of effectiveness and efficiency.

Design and Implementation of High-Speed Pattern Matcher Using Multi-Entry Simultaneous Comparator in Network Intrusion Detection System (네트워크 침입 탐지 시스템에서 다중 엔트리 동시 비교기를 이용한 고속패턴 매칭기의 설계 및 구현)

  • Jeon, Myung-Jae;Hwang, Sun-Young
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.40 no.11
    • /
    • pp.2169-2177
    • /
    • 2015
  • This paper proposes a new pattern matching module to overcome the increased runtime of previous algorithm using RAM, which was designed to overcome cost limitation of hash-based algorithm using CAM (Content Addressable Memory). By adopting Merge FSM algorithm to reduce the number of state, the proposed module contains state block and entry block to use in RAM. In the proposed module, one input string is compared with multiple entry strings simultaneously using entry block. The effectiveness of the proposed pattern matching unit is verified by executing Snort 2.9 rule set. Experimental results show that the number of memory reads has decreased by 15.8%, throughput has increased by 47.1%, while memory usage has increased by 2.6%, when compared to previous methods.

SVC and CAS Combining Scheme for Support Multi-Device Watching Environment (다중기기 시청환경을 지원하기 위한 SVC와 CAS 결합 기법)

  • Son, Junggab;Oh, Heekuck;Kim, SangJin
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.23 no.6
    • /
    • pp.1111-1120
    • /
    • 2013
  • CAS used in IPTV or DTV has an environment of sending single type of contents through single streaming. But it can be improved to support users' various video applications through single streaming by combining with SVC. For such an environment, efficiency should be firstly considered, and hierarchical key management methods for billing policy by service levels should be applied. This study aims to look into considerations to apply SVC to CAS and propose SVC encryption in CAS environment. The security of the proposed scheme is based on the safety of CAS and oneway hash function. If the proposed scheme is applied, scalability can be efficiently provided even in the encrypted contents and it is possible to bill users according to picture quality. In addition, the test results show that SVC contents given by streaming service with the average less than 10%overhead can be safely protected against illegal uses.

Finding Pseudo Periods over Data Streams based on Multiple Hash Functions (다중 해시함수 기반 데이터 스트림에서의 아이템 의사 주기 탐사 기법)

  • Lee, Hak-Joo;Kim, Jae-Wan;Lee, Won-Suk
    • Journal of Information Technology Services
    • /
    • v.16 no.1
    • /
    • pp.73-82
    • /
    • 2017
  • Recently in-memory data stream processing has been actively applied to various subjects such as query processing, OLAP, data mining, i.e., frequent item sets, association rules, clustering. However, finding regular periodic patterns of events in an infinite data stream gets less attention. Most researches about finding periods use autocorrelation functions to find certain changes in periodic patterns, not period itself. And they usually find periodic patterns in time-series databases, not in data streams. Literally a period means the length or era of time that some phenomenon recur in a certain time interval. However in real applications a data set indeed evolves with tiny differences as time elapses. This kind of a period is called as a pseudo-period. This paper proposes a new scheme called FPMH (Finding Periods using Multiple Hash functions) algorithm to find such a set of pseudo-periods over a data stream based on multiple hash functions. According to the type of pseudo period, this paper categorizes FPMH into three, FPMH-E, FPMH-PC, FPMH-PP. To maximize the performance of the algorithm in the data stream environment and to keep most recent periodic patterns in memory, we applied decay mechanism to FPMH algorithms. FPMH algorithm minimizes the usage of memory as well as processing time with acceptable accuracy.