• Title/Summary/Keyword: Large tag data

Search Results: 67

Extending Semantic Image Annotation using User-Defined Rules and Inference in Mobile Environments (모바일 환경에서 사용자 정의 규칙과 추론을 이용한 의미 기반 이미지 어노테이션의 확장)

  • Seo, Kwang-won;Im, Dong-Hyuk
    • Journal of Korea Multimedia Society / v.21 no.2 / pp.158-165 / 2018
  • Since the amount of multimedia images has increased dramatically, it is important to be able to search for semantically relevant images. Accordingly, several semantic image annotation methods using the RDF (Resource Description Framework) model in mobile environments have been introduced. Earlier studies on annotating images semantically focused on both image tags and context-aware information such as temporal and spatial data. However, to fully express the semantics of an image, more annotations described in the RDF model are needed. In this paper, we propose an annotation method that performs inference with RDFS entailment rules and user-defined rules. Our approach, implemented in the Moment system, shows that it can represent the semantics of an image more fully with additional annotation triples.
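The rule-based enrichment this abstract describes can be illustrated with a toy forward-chaining sketch. The triples, the user-defined rule, and the `infer()` helper below are invented for illustration and are not the Moment system's code; the RDFS entailment rule shown is rdfs9 (subclass membership propagation).

```python
# Illustrative sketch: forward-chaining inference over RDF-style triples,
# combining one RDFS entailment rule with a hypothetical user-defined rule.
def infer(triples):
    """Expand a set of (subject, predicate, object) triples to a fixpoint."""
    triples = set(triples)
    changed = True
    while changed:
        changed = False
        new = set()
        for s, p, o in triples:
            # RDFS rule rdfs9: (x type C1) + (C1 subClassOf C2) => (x type C2)
            if p == "rdf:type":
                for s2, p2, o2 in triples:
                    if p2 == "rdfs:subClassOf" and s2 == o:
                        new.add((s, "rdf:type", o2))
            # Hypothetical user rule: a photo taken in a city is taken in its country
            if p == "takenIn":
                for s2, p2, o2 in triples:
                    if p2 == "locatedIn" and s2 == o:
                        new.add((s, "takenIn", o2))
        if not new <= triples:
            triples |= new
            changed = True
    return triples

annotations = {
    ("img1", "rdf:type", "BeachPhoto"),
    ("BeachPhoto", "rdfs:subClassOf", "Photo"),
    ("img1", "takenIn", "Busan"),
    ("Busan", "locatedIn", "SouthKorea"),
}
enriched = infer(annotations)
```

Starting from four explicit triples, the fixpoint adds the entailed facts that `img1` is a `Photo` and was taken in `SouthKorea`, which is the sense in which inference yields "more annotation triples".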

Semantic Image Annotation and Retrieval in Mobile Environments (모바일 환경에서 의미 기반 이미지 어노테이션 및 검색)

  • No, Hyun-Deok;Seo, Kwang-won;Im, Dong-Hyuk
    • Journal of Korea Multimedia Society / v.19 no.8 / pp.1498-1504 / 2016
  • The progress of mobile computing technology is producing a large amount of multimedia content such as images. Thus, we need an image retrieval system that searches for semantically relevant images. In this paper, we propose a semantic image annotation and retrieval system for mobile environments. Previous mobile-based annotation approaches cannot fully express the semantics of an image due to the limitations of their annotation form (i.e., keyword tagging). Our approach allows mobile devices to annotate images automatically using context-aware information such as temporal and spatial data. In addition, since we annotate images using the RDF (Resource Description Framework) model, we can issue SPARQL queries for semantic image retrieval. Our system, implemented in the Android environment, shows that it can represent the semantics of images more fully and retrieve images semantically compared with other image annotation systems.
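A minimal sketch of the annotation-then-query idea, assuming invented names (`annotate()`, the predicate strings): context data is turned into RDF-style triples, and a tiny pure-Python pattern matcher stands in for a real SPARQL engine, which the paper would use.

```python
# Sketch only, not the paper's implementation: context-aware triples plus a
# SPARQL-like single-pattern matcher ('?x' marks a variable).
def annotate(image_id, lat, lon, taken_at, tags):
    """Build RDF-style triples from context data and user tags."""
    triples = [
        (image_id, "geo:lat", lat),
        (image_id, "geo:long", lon),
        (image_id, "dc:date", taken_at),
    ]
    triples += [(image_id, "dc:subject", t) for t in tags]
    return triples

def match(triples, pattern):
    """Return variable bindings for every triple matching one (s, p, o) pattern."""
    results = []
    for triple in triples:
        binding, ok = {}, True
        for term, value in zip(pattern, triple):
            if isinstance(term, str) and term.startswith("?"):
                binding[term] = value      # bind the variable
            elif term != value:
                ok = False                 # constant mismatch
                break
        if ok:
            results.append(binding)
    return results

store = annotate("img42", 37.57, 126.98, "2016-08-01", ["sunset", "river"])
hits = match(store, ("?img", "dc:subject", "sunset"))
```

The same query written against a real triple store would be `SELECT ?img WHERE { ?img dc:subject "sunset" }`.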

Ontology and Sequential Rule Based Streaming Media Event Recognition (온톨로지 및 순서 규칙 기반 대용량 스트리밍 미디어 이벤트 인지)

  • Soh, Chi-Seung;Park, Hyun-Kyu;Park, Young-Tack
    • Journal of KIISE / v.43 no.4 / pp.470-479 / 2016
  • As the number of various types of media data such as UCC (User Created Contents) increases, research is actively being carried out in many different fields to provide meaningful media services. Among these studies, a semantic web-based media classification approach has been proposed; however, it encounters limitations in video classification because its underlying ontology is derived from meta-information such as video tags and titles. In this paper, we define objects recognized in a video and activities composed of video objects in a shot, and introduce a reasoning approach based on description logic. We define sequential rules for a sequence of shots in a video and describe how to classify it. To process the large and growing amount of media data, we utilize Spark Streaming, a distributed in-memory big data processing framework, and describe how to classify media data in parallel. To evaluate the efficiency of the proposed approach, we conducted an experiment using a large amount of media ontology extracted from YouTube videos.
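The sequential-rule idea can be sketched as subsequence matching over the per-shot activities: a rule fires if its activities occur in order across the video's shots. The rule names and activity labels below are invented for illustration, and the real system runs this over Spark Streaming rather than a single process.

```python
# Illustrative sketch: classify a video event by matching sequential rules
# (ordered activity subsequences) against the recognized shot activities.
def matches_in_order(shot_activities, rule):
    """True if every activity in `rule` appears in `shot_activities` in order."""
    it = iter(shot_activities)
    return all(act in it for act in rule)  # membership consumes the iterator

# Hypothetical sequential rules for two event classes
RULES = {
    "soccer_goal": ["kick", "ball_flight", "net_shake", "celebration"],
    "interview": ["sit_down", "talk", "talk"],
}

def classify(shot_activities):
    return [event for event, rule in RULES.items()
            if matches_in_order(shot_activities, rule)]

shots = ["walk", "kick", "ball_flight", "net_shake", "celebration"]
```

Because `act in it` advances the shared iterator, each rule activity must be found strictly after the previous one, which is exactly the ordering constraint a sequential rule imposes.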

Food Detection by Fine-Tuning Pre-trained Convolutional Neural Network Using Noisy Labels

  • Alshomrani, Shroog;Aljoudi, Lina;Aljabri, Banan;Al-Shareef, Sarah
    • International Journal of Computer Science & Network Security / v.21 no.7 / pp.182-190 / 2021
  • Deep learning is an advanced technology for large-scale data analysis, with numerous promising applications such as image processing and object detection. It has become customary to use transfer learning and fine-tune a pre-trained CNN model for most image recognition tasks. People taking photos and tagging themselves provide a valuable source of data. However, these tags and labels can be noisy, as the people who annotate the images may not be experts. This paper explores the impact of noisy labels on fine-tuning pre-trained CNN models. The effect is measured on a food recognition task using Food101 as a benchmark. Four pre-trained CNN models are included in this study: InceptionV3, VGG19, MobileNetV2, and DenseNet121. Symmetric label noise is added at different ratios. In all cases, models based on DenseNet121 outperformed the other models. When noisy labels were introduced to the data, the performance of all models degraded almost linearly with the amount of added noise.
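Symmetric label noise, as used above, means a chosen fraction of labels is replaced by a different class drawn uniformly at random. A minimal sketch of the injection step (the ratio, class count, and function name are illustrative, not taken from the paper):

```python
import random

# Sketch of symmetric label-noise injection: each selected sample's label is
# replaced by a uniformly random *different* class.
def add_symmetric_noise(labels, num_classes, noise_ratio, seed=0):
    rng = random.Random(seed)
    noisy = list(labels)
    n_flip = int(len(labels) * noise_ratio)
    for i in rng.sample(range(len(labels)), n_flip):
        # draw uniformly among the other num_classes - 1 classes,
        # skipping the original label so the flip is guaranteed
        wrong = rng.randrange(num_classes - 1)
        noisy[i] = wrong if wrong < noisy[i] else wrong + 1
    return noisy

clean = [i % 10 for i in range(1000)]          # toy 10-class label vector
noisy = add_symmetric_noise(clean, num_classes=10, noise_ratio=0.2)
```

With `noise_ratio=0.2`, exactly 20% of the labels differ from the clean ones, so plotting accuracy against the ratio reproduces the "almost linear" degradation setup described in the abstract.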

Spatial Clustering Analysis based on Text Mining of Location-Based Social Media Data (위치기반 소셜 미디어 데이터의 텍스트 마이닝 기반 공간적 클러스터링 분석 연구)

  • Park, Woo Jin;Yu, Ki Yun
    • Journal of Korean Society for Geospatial Information Science / v.23 no.2 / pp.89-96 / 2015
  • Location-based social media data have high potential for use in various areas such as big data and location-based services. In this study, we applied a series of analysis methods to determine how important keywords in location-based social media are spatially distributed by analyzing their text content. For this purpose, we collected geo-tagged tweets in the Gangnam district of Seoul and its environs for the month of August 2013. From these tweets, principal keywords were extracted. Among these, keywords in three categories (food, entertainment, and work/study) were selected and classified by category. Spatial clustering was then applied to the tweets containing keywords in each category, and the clusters of each category were compared with buildings and benchmark POIs at the same locations. As a result, clusters in the food category showed high consistency with large-scale commercial areas, clusters in the entertainment category corresponded with theaters and sports complexes, and clusters in the work/study category showed high consistency with areas where private institutes and office buildings are concentrated.
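A toy sketch of the category-then-cluster pipeline: posts containing a category's keywords are bucketed into grid cells, and dense cells are kept as spatial clusters. The keyword lists, cell size, density threshold, and coordinates below are illustrative assumptions, and the paper's actual clustering method may differ.

```python
from collections import defaultdict

# Sketch: grid-based density clustering of geo-tagged posts per keyword category.
def dense_cells(posts, keywords, cell=0.01, min_count=2):
    """posts: (lat, lon, text) tuples. Return cells with >= min_count keyword hits."""
    counts = defaultdict(int)
    for lat, lon, text in posts:
        if any(k in text for k in keywords):
            # bucket the post into a cell of `cell` degrees on each side
            counts[(int(lat / cell), int(lon / cell))] += 1
    return {c: n for c, n in counts.items() if n >= min_count}

posts = [
    (37.498, 127.027, "great pasta lunch"),
    (37.499, 127.028, "best ramen ever"),
    (37.551, 126.988, "watching a movie"),
]
food_clusters = dense_cells(posts, keywords=["pasta", "ramen", "bbq"])
```

The two food posts fall into the same cell and form one cluster, while the movie post is filtered out of the food category entirely; comparing such cells against building and POI layers is the paper's validation step.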

Instruction Queue Architecture for Low Power Microprocessors (마이크로프로세서 전력소모 절감을 위한 명령어 큐 구조)

  • Choi, Min;Maeng, Seung-Ryoul
    • Journal of the Institute of Electronics Engineers of Korea SD / v.45 no.11 / pp.56-62 / 2008
  • Modern microprocessors must deliver high application performance, while the design process must not subordinate power. In the performance-power tradeoff, the instruction window is particularly important, because a large instruction window helps achieve high performance. However, naively scaling a conventional instruction window can severely increase complexity and power consumption. This paper explores an architecture-level approach to reducing power dissipation. We propose low-power issue logic with efficient tag translation. The direct lookup table (DTL) issue logic eliminates the associative wake-up of a conventional instruction window, and the tag translation scheme handles data dependencies and resource conflicts using a bit-vector-based structure. Experimental results show that, for the SPEC2000 benchmarks, the proposed design reduces power consumption by 24.45% on average over the conventional approach.
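The core contrast, associative wake-up versus direct lookup, can be sketched in a few lines. This is a behavioral illustration under assumed structures (table and queue sizes, helper names), not the paper's actual DTL design: a table indexed by register tag holds a bit-vector of the queue slots waiting on that tag, so a result broadcast becomes one table read instead of a compare at every entry.

```python
# Behavioral sketch of direct-lookup wakeup with bit-vectors (not the paper's RTL).
NUM_TAGS, QUEUE_SIZE = 8, 4

wait_table = [0] * NUM_TAGS   # wait_table[tag]: bit-vector of slots waiting on tag
pending = [0] * QUEUE_SIZE    # pending[slot]: number of not-yet-ready operands

def dispatch(slot, source_tags, ready_tags):
    """Insert an instruction; record which source tags it still waits on."""
    for tag in source_tags:
        if tag not in ready_tags:
            wait_table[tag] |= 1 << slot
            pending[slot] += 1

def writeback(tag):
    """Result for `tag` arrives: one table lookup wakes all waiters, no CAM search."""
    woken = []
    vec, wait_table[tag] = wait_table[tag], 0
    for slot in range(QUEUE_SIZE):
        if vec & (1 << slot):
            pending[slot] -= 1
            if pending[slot] == 0:
                woken.append(slot)          # all operands ready: eligible to issue
    return woken

dispatch(0, source_tags=[3, 5], ready_tags={5})   # slot 0 waits only on tag 3
dispatch(1, source_tags=[3], ready_tags=set())    # slot 1 also waits on tag 3
```

In hardware terms, the per-entry tag comparators of a conventional CAM-based window are what this lookup removes, which is where the power saving comes from.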

Efficient Capturing of Spatial Data in Geographic Database System (지리 데이타베이스 시스템에서의 효율적인 공간 데이타 수집)

  • Kim, Jong-Hun;Kim, Jae-Hong;Bae, Hae-Yeong
    • The Transactions of the Korea Information Processing Society / v.1 no.3 / pp.279-289 / 1994
  • A geographic database system is a database system that supports map-formed output and allows users to store, retrieve, manage, and analyze spatial and aspatial data. Because of the large amount of data, inputting spatial data into a geographic database system takes too much time and storage. Therefore, an efficient spatial data collecting system is needed to reduce input processing time and use storage efficiently. In this paper, we analyze conventional vectorizing methods and suggest a different approach: instead of vectorizing all the map data, we vectorize specific geographic data when users input its aspatial data. We also propose an optimized vector data format that uses a tag bit to store the collected data efficiently.

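One common way a tag bit compacts vector data, offered here purely as a hypothetical illustration since the paper's exact format is not given in the abstract: each vertex is stored either as a full absolute coordinate or, when its offset from the previous vertex is small, as a short delta, with a leading tag telling the reader which encoding follows.

```python
import struct

# Hypothetical tag-bit vector format: ABS records carry full 4-byte coordinates,
# DELTA records carry 1-byte offsets from the previous vertex.
ABS, DELTA = 0, 1

def encode(points):
    out, prev = bytearray(), None
    for x, y in points:
        if prev and abs(x - prev[0]) < 128 and abs(y - prev[1]) < 128:
            out += struct.pack(">Bbb", DELTA, x - prev[0], y - prev[1])  # 3 bytes
        else:
            out += struct.pack(">Bii", ABS, x, y)                        # 9 bytes
        prev = (x, y)
    return bytes(out)

def decode(data):
    points, i = [], 0
    while i < len(data):
        if data[i] == ABS:
            x, y = struct.unpack_from(">ii", data, i + 1)
            i += 9
        else:
            dx, dy = struct.unpack_from(">bb", data, i + 1)
            x, y = points[-1][0] + dx, points[-1][1] + dy
            i += 3
        points.append((x, y))
    return points

polyline = [(100000, 200000), (100003, 200001), (100010, 199999)]
```

For this three-vertex polyline the tagged encoding takes 15 bytes instead of 24 for three absolute records, and digitized map lines, where consecutive vertices are close together, are exactly the case where deltas dominate.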

Blockchain and AI-based big data processing techniques for sustainable agricultural environments (지속가능한 농업 환경을 위한 블록체인과 AI 기반 빅 데이터 처리 기법)

  • Yoon-Su Jeong
    • Advanced Industrial Science / v.3 no.2 / pp.17-22 / 2024
  • Recently, as ICT has been applied in various environments, it has become possible in sustainable agriculture to analyze pests by crop, use robots when harvesting crops, and make predictions from big data. However, sustainable agriculture also constantly demands efforts to address resource depletion, the decline of the agricultural population, rising poverty, and environmental destruction. This paper proposes an artificial intelligence-based big data processing and analysis method to reduce production costs and increase crop efficiency in a sustainable agricultural environment. The proposed technique strengthens the security and reliability of data by processing crop big data combined with AI, and enables better decision-making and business value extraction. It can lead to innovative changes in various industries and fields and promote the development of data-oriented business models. In experiments, the proposed technique produced accurate answers from only a small amount of labeled data; at a farm site where tagging each correct answer individually is difficult, it achieved performance similar to learning with a large amount of labeled data (with an error rate within 0.05).

Suggestions on how to convert official documents to Machine Readable (공문서의 기계가독형(Machine Readable) 전환 방법 제언)

  • Yim, Jin Hee
    • The Korean Journal of Archival Studies / no.67 / pp.99-138 / 2021
  • In the era of big data, analyzing not only structured data but also unstructured data is emerging as an important task. Official documents produced by government agencies, as large text-based unstructured data, are also subject to big data analysis. From the perspective of internal work efficiency, knowledge management, records management, and so on, it is necessary to analyze the big data of official documents to derive useful implications. However, since many of the official documents currently held by public institutions are not in an open format, a pre-processing step of extracting text from a bitstream is required for big data analysis. In addition, since contextual metadata is not sufficiently stored in the document files, separate efforts to secure metadata are required for high-quality analysis. In conclusion, current official documents have a low level of machine readability, which makes big data analysis expensive.

Detecting Spam Data for Securing the Reliability of Text Analysis (텍스트 분석의 신뢰성 확보를 위한 스팸 데이터 식별 방안)

  • Hyun, Yoonjin;Kim, Namgyu
    • The Journal of Korean Institute of Communications and Information Sciences / v.42 no.2 / pp.493-504 / 2017
  • Recently, the tremendous amount of unstructured text data distributed through news, blogs, and social media has gained much attention from researchers and practitioners, as this data contains abundant information about various consumers' opinions. However, as the usefulness of text data increases, attempts to gain profit by distorting text data maliciously or non-maliciously are also increasing. This increase in spam text data not only burdens users who want to obtain useful information with a large amount of inappropriate information, but also damages the reliability of information and of information providers. Therefore, efforts must be made to improve the reliability of information and the quality of analysis results by detecting and removing spam data in advance. For this purpose, many studies on spam detection have been actively conducted in areas such as opinion spam detection, spam e-mail detection, and web spam detection. In this study, we introduce core concepts and current research trends in spam detection and propose a methodology for detecting spam tags on blogs as a challenging attempt to improve the reliability of blog information.
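One simple signal used in tag-spam detection, sketched here as a toy and not as the paper's proposed methodology: tags that never occur in the post body are suspect, and a post whose tags are mostly unrelated to its body is flagged. The function names and threshold are illustrative assumptions.

```python
# Toy tag-spam signal: fraction of a post's tags absent from its body text.
def tag_spam_score(body, tags):
    """Return the fraction of tags that do not appear in the post body."""
    words = set(body.lower().split())
    unrelated = [t for t in tags if t.lower() not in words]
    return len(unrelated) / len(tags) if tags else 0.0

def is_tag_spam(body, tags, threshold=0.7):
    """Flag a post whose tags are mostly unrelated to its body."""
    return tag_spam_score(body, tags) >= threshold

body = "review of a new mirrorless camera lens"
honest = tag_spam_score(body, ["camera", "lens", "review"])
spammy = tag_spam_score(body, ["loan", "casino", "diet", "camera"])
```

A real detector would combine several such features (posting frequency, link density, tag popularity) in a classifier, but even this single score separates the honest post (score 0.0) from the keyword-stuffed one (score 0.75).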