• Title/Summary/Keyword: Descriptor systems

Person-Independent Facial Expression Recognition with Histograms of Prominent Edge Directions

  • Makhmudkhujaev, Farkhod;Iqbal, Md Tauhid Bin;Arefin, Md Rifat;Ryu, Byungyong;Chae, Oksam
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.12 no.12
    • /
    • pp.6000-6017
    • /
    • 2018
  • This paper presents a new descriptor, named Histograms of Prominent Edge Directions (HPED), for the recognition of facial expressions in a person-independent environment. We raise the issue of sampling error in generating the code-histogram from spatial regions of the face image, as observed in existing descriptors. HPED describes facial appearance changes based on the statistical distribution of the top two prominent edge directions (i.e., the primary and secondary directions) captured over small spatial regions of the face. Compared to existing descriptors, HPED uses a smaller number of code-bins to describe the spatial regions, which helps avoid sampling error despite fewer samples while preserving valuable spatial information. In contrast to the existing Histogram of Oriented Gradients (HOG), which uses a histogram of the primary edge direction (i.e., gradient orientation) only, we additionally consider a histogram of the secondary edge direction, which provides more meaningful shape information related to the local texture. Experiments on popular facial expression datasets demonstrate the superior performance of the proposed HPED against existing descriptors in a person-independent environment.
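
As an illustration of the descriptor's core idea, the following is a minimal sketch of an HPED-style computation: each small cell's edge-direction histogram is reduced to its top two (primary and secondary) directions, which are accumulated into two compact histograms. The bin count, cell size, and gradient operator here are assumptions, not the paper's exact parameters.

```python
# Minimal HPED-style sketch (assumed parameters: 8 quantized directions, 8x8 cells).
import numpy as np

def hped_descriptor(gray, n_dirs=8, cell=8):
    # Per-pixel gradients, magnitudes, and quantized (unsigned) edge directions.
    gy, gx = np.gradient(gray.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.mod(np.arctan2(gy, gx), np.pi)
    dirs = np.minimum((ang / np.pi * n_dirs).astype(int), n_dirs - 1)

    h, w = gray.shape
    primary = np.zeros(n_dirs)
    secondary = np.zeros(n_dirs)
    for y in range(0, h - cell + 1, cell):
        for x in range(0, w - cell + 1, cell):
            d = dirs[y:y + cell, x:x + cell].ravel()
            m = mag[y:y + cell, x:x + cell].ravel()
            votes = np.bincount(d, weights=m, minlength=n_dirs)
            order = np.argsort(votes)[::-1]
            primary[order[0]] += 1    # most prominent direction in this cell
            secondary[order[1]] += 1  # second most prominent direction
    feat = np.concatenate([primary, secondary])
    return feat / (feat.sum() + 1e-9)  # L1-normalized descriptor
```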

T-DMB Hybrid Data Service Part 1: Hybrid BIFS Technology (T-DMB 하이브리드 데이터 서비스 Part 1: 하이브리드 BIFS 기술)

  • Lim, Young-Kwon;Kim, Kyu-Heon;Jeong, Je-Chang
    • Journal of Broadcast Engineering
    • /
    • v.16 no.2
    • /
    • pp.350-359
    • /
    • 2011
  • Rapid development of broadcasting technologies since the 1990s has enabled not only High Definition Television services that provide high-quality audiovisual content at home, but also mobile broadcasting services that deliver audiovisual content to vehicles moving at high speed. Terrestrial Digital Multimedia Broadcasting (T-DMB) is one of the technologies developed for mobile broadcasting, and it has been successfully commercialized. Besides robust vehicular reception, one of the major technical breakthroughs achieved by T-DMB is the adoption of a framework based on the MPEG-4 System, which naturally enables integrated interactive data services by using Binary Format for Scene (BIFS) for scene description and the representation of graphics objects, and the Object Descriptor Framework for representing multimedia service components as objects. The T-DMB interactive data service has two fundamental limitations. First, graphics for the interactive service must always be overlaid on top of the video and cannot be rendered outside it. Second, data for the interactive service can be received only through the broadcasting channel. These limitations were once considered natural for broadcasting systems, but they are now hard constraints for personalized data services that use location information and user characteristics, which are becoming common on smart devices. In this paper, an architecture for a T-DMB hybrid data service is proposed that utilizes the broadcasting network, wireless internet, and local storage for delivering BIFS data to overcome these limitations. This paper also presents hybrid BIFS technology to implement the T-DMB hybrid data service while maintaining backward compatibility with legacy T-DMB players.
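
To make the hybrid-delivery idea concrete, here is a hypothetical sketch of how a receiver could resolve a BIFS scene update from the three delivery paths the paper names (broadcast channel, wireless internet, local storage), preferring the legacy broadcast path. All identifiers and the priority order are illustrative, not taken from the paper.

```python
# Hypothetical resolver for hybrid BIFS delivery; names are illustrative.
import urllib.request
from dataclasses import dataclass
from typing import Optional

@dataclass
class BifsUpdate:
    scene_id: str
    payload: bytes
    source: str  # "broadcast" | "local" | "internet"

def resolve_bifs_update(scene_id: str, broadcast_cache: dict,
                        local_store: dict,
                        base_url: Optional[str]) -> Optional[BifsUpdate]:
    # Prefer data already delivered on the broadcast channel (the legacy path),
    # then locally stored data, and finally fall back to wireless internet.
    if scene_id in broadcast_cache:
        return BifsUpdate(scene_id, broadcast_cache[scene_id], "broadcast")
    if scene_id in local_store:
        return BifsUpdate(scene_id, local_store[scene_id], "local")
    if base_url is not None:
        payload = urllib.request.urlopen(f"{base_url}/{scene_id}").read()
        return BifsUpdate(scene_id, payload, "internet")
    return None  # legacy receiver with no hybrid sources available
```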

Edge-based spatial descriptor for content-based Image retrieval (내용 기반 영상 검색을 위한 에지 기반의 공간 기술자)

  • Kim, Nac-Woo;Kim, Tae-Yong;Choi, Jong-Soo
    • Journal of the Institute of Electronics Engineers of Korea SP
    • /
    • v.42 no.5 s.305
    • /
    • pp.1-10
    • /
    • 2005
  • Content-based image retrieval systems are being actively investigated owing to their ability to retrieve images based on actual visual content rather than manually associated textual descriptions. In this paper, we propose a novel approach to image retrieval based on edge structural features, using an edge correlogram and a color coherence vector. After the color vector angle is applied in the pre-processing stage, an image is divided into two parts: a high-frequency image and a low-frequency image. In the low-frequency image, the global color distribution of smooth pixels is extracted by the color coherence vector, thereby incorporating spatial information into the proposed color descriptor. In the high-frequency image, the distribution of gray pairs at edges is extracted by the edge correlogram. Since the proposed algorithm includes both spatial and edge information between colors, it robustly reduces the effect of significant changes in appearance and shape during image analysis. The proposed method provides a simple and flexible description of images with complex scenes in terms of the structural features of the image contents. Experimental evidence suggests that our algorithm outperforms recent histogram refinement methods for image indexing and retrieval. To index the multidimensional feature vectors, we use an R*-tree structure.
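
As background for one of the two descriptors above, the following is a minimal sketch of a color coherence vector: the pixels of each quantized color are split into "coherent" and "incoherent" counts depending on whether they belong to a connected region of at least tau pixels. The quantization level and the size threshold are illustrative choices, not the paper's.

```python
# Minimal color coherence vector (CCV) sketch; parameters are illustrative.
import numpy as np
from scipy import ndimage

def color_coherence_vector(rgb, n_levels=4, tau=25):
    # Quantize each channel to n_levels and combine into one color index.
    q = rgb.astype(int) * n_levels // 256
    idx = q[..., 0] * n_levels ** 2 + q[..., 1] * n_levels + q[..., 2]
    n_colors = n_levels ** 3
    coherent = np.zeros(n_colors)
    incoherent = np.zeros(n_colors)
    for c in range(n_colors):
        mask = idx == c
        if not mask.any():
            continue
        labels, _ = ndimage.label(mask)          # 4-connected components
        sizes = np.bincount(labels.ravel())[1:]  # skip background label 0
        coherent[c] = sizes[sizes >= tau].sum()  # pixels in large regions
        incoherent[c] = sizes[sizes < tau].sum()
    return coherent, incoherent
```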

Performance Evaluations for Leaf Classification Using Combined Features of Shape and Texture (형태와 텍스쳐 특징을 조합한 나뭇잎 분류 시스템의 성능 평가)

  • Kim, Seon-Jong;Kim, Dong-Pil
    • Journal of Intelligence and Information Systems
    • /
    • v.18 no.3
    • /
    • pp.1-12
    • /
    • 2012
  • Many trees line roadsides and fill parks and landscaped facilities. Although we see trees everywhere, it is difficult to classify one and obtain information about it, such as its name, species, and habitat; to find these, one has to consult illustrated plant books or search the internet. The important components of a tree are its leaves, flowers, bark, and so on, and generally we can classify a tree by its leaves. A leaf carries inherited features such as its shape and vein structure: the shape plays an important role in deciding what the tree is, and the texture of the veins is also an effective feature for classification. This paper evaluates the performance of a leaf classification system using both shape and texture features. We use Fourier descriptors for shape features, and both gray-level co-occurrence matrices and wavelets for texture features, and evaluate combinations of these features on images from the Flavia dataset. We compared the recognition rates and precision-recall performance of these features. Various experiments showed that a combination of shape and texture gives better performance, with the best result coming from combining texture features with a Fourier descriptor computed on a flipped contour.
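
A minimal sketch of a Fourier shape descriptor for a leaf contour follows, using the common complex-coordinate formulation with magnitude normalization for translation, scale, rotation, and start-point invariance; the paper's exact variant (including the flipped-contour version that performed best) may differ.

```python
# Minimal Fourier shape descriptor; n_coeffs is an illustrative choice.
import numpy as np

def fourier_descriptor(contour_xy, n_coeffs=16):
    # Ordered boundary points as a complex signal; subtracting the mean
    # removes translation, magnitudes remove rotation/start-point effects,
    # and dividing by the first harmonic removes scale.
    z = contour_xy[:, 0] + 1j * contour_xy[:, 1]
    mags = np.abs(np.fft.fft(z - z.mean()))
    mags = mags / (mags[1] + 1e-9)
    return mags[1:n_coeffs + 1]
```

A texture vector from a gray-level co-occurrence matrix or wavelet sub-bands would then simply be concatenated with this shape vector before classification.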

A Basic Study on the Extraction of Dangerous Region for Safe Landing of Self-Driving UAMs (자율주행 UAM의 안전착륙을 위한 위험영역 추출에 관한 기초 연구)

  • Park, Chang Min
    • Journal of Platform Technology
    • /
    • v.11 no.3
    • /
    • pp.24-31
    • /
    • 2023
  • Recently, interest in Urban Air Mobility (UAM), which can take off and land vertically in urban air transportation systems, has been increasing, and various start-up companies are developing related technologies as eco-friendly future transportation. However, studies on ways to increase safety in the operation of UAM are still scarce. In particular, improving safety against the risks that arise when an autonomous UAM attempts to land in a city center is urgent. Accordingly, this study proposes a method for landing safely by avoiding dangerous regions that interfere when an autonomous UAM attempts to land in the city center. To this end, the latitude and longitude coordinates of dangerous objects observed by the UAM's sensors are first calculated. Based on this, we propose converting the coordinates of the distorted planar image derived from the 3D image into latitude and longitude, and then comparing a pre-learned feature descriptor with a HOG (Histogram of Oriented Gradients) feature descriptor computed at those coordinates to extract the dangerous region. Although the dangerous region could not be completely extracted, generally satisfactory results were obtained. The proposed method can reduce the enormous cost of selecting take-off and landing sites for UAM equipped with autonomous driving technology, and contributes to basic measures that reduce risk and increase safety when landing in complex environments such as urban areas.
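
The HOG-comparison step can be illustrated with a short sketch: a HOG descriptor is computed for a grayscale image patch and compared against a pre-learned reference descriptor by Euclidean distance. The patch size, HOG parameters, and distance threshold are illustrative assumptions.

```python
# Sketch of HOG-descriptor comparison; parameters are illustrative.
import numpy as np
from skimage.feature import hog

def is_dangerous_region(patch, reference_hog, threshold=0.5):
    # patch: 2D grayscale array with the same shape used to build reference_hog.
    feat = hog(patch, orientations=9, pixels_per_cell=(8, 8),
               cells_per_block=(2, 2), feature_vector=True)
    return np.linalg.norm(feat - reference_hog) < threshold
```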

HK Curvature Descriptor-Based Surface Registration Method Between 3D Measurement Data and CT Data for Patient-to-CT Coordinate Matching of Image-Guided Surgery (영상 유도 수술의 환자 및 CT 데이터 좌표계 정렬을 위한 HK 곡률 기술자 기반 표면 정합 방법)

  • Kwon, Ki-Hoon;Lee, Seung-Hyun;Kim, Min Young
    • Journal of Institute of Control, Robotics and Systems
    • /
    • v.22 no.8
    • /
    • pp.597-602
    • /
    • 2016
  • In image-guided surgery, patient registration is critical to a successful operation, since it is required in order to use pre-operative images such as CT and MRI during the operation. Though several patient registration methods have been studied, in this paper we concentrate on one that utilizes 3D surface measurement data. First, a hand-held 3D surface measurement device measures the surface of the patient; second, this data is matched with the CT or MRI data using optimization algorithms. However, the commonly used ICP algorithm is very slow without a proper initial pose and also suffers from the local-minimum problem. Usually this is addressed by manually providing a proper initial pose before running ICP, but this has the disadvantages that an experienced user must perform the step and that it takes a long time. In this paper, we propose a method that can accurately and automatically find a proper initial pose. The proposed method finds the initial pose for ICP by converting the 3D data to 2D curvature images and performing image matching. Curvature features are robust to rotation, translation, and even some deformation. The proposed method is also faster than traditional methods because it performs 2D image matching instead of 3D point-cloud matching.
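
To illustrate the HK part of the method, here is a minimal sketch that derives mean (H) and Gaussian (K) curvature maps from a range image using the standard Monge-patch formulas; matching these 2D maps would then supply the initial pose for ICP. The discretization is a textbook construction, not necessarily the paper's implementation.

```python
# Mean (H) and Gaussian (K) curvature of a depth surface z(x, y).
import numpy as np

def hk_curvature_maps(z):
    zy, zx = np.gradient(z)     # first derivatives
    zyy, _ = np.gradient(zy)    # second derivatives
    zxy, zxx = np.gradient(zx)
    denom = 1.0 + zx ** 2 + zy ** 2
    K = (zxx * zyy - zxy ** 2) / denom ** 2
    H = ((1 + zy ** 2) * zxx - 2 * zx * zy * zxy
         + (1 + zx ** 2) * zyy) / (2 * denom ** 1.5)
    return H, K
```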

A Method to Improve the Performance of Adaboost Algorithm by Using Mixed Weak Classifier (혼합 약한 분류기를 이용한 AdaBoost 알고리즘의 성능 개선 방법)

  • Kim, Jeong-Hyun;Teng, Zhu;Kim, Jin-Young;Kang, Dong-Joong
    • Journal of Institute of Control, Robotics and Systems
    • /
    • v.15 no.5
    • /
    • pp.457-464
    • /
    • 2009
  • The weak classifier of the AdaBoost algorithm is a central classification element that uses a single criterion to separate positive and negative learning candidates, and finding the best criterion to separate the two feature distributions determines the learning capacity of the algorithm. A common way to classify the distributions is to threshold at the mean value of the features. However, the positive and negative distributions of a Haar-like feature used as an image descriptor are hard to separate with a single threshold. The poor classification ability of the single threshold also increases the number of boosting operations and finally results in a poor classifier. This paper proposes a weak classifier that uses multiple criteria by adding a probabilistic criterion on the positive candidate distribution to the conventional mean classifier: the positive distribution has low variation, with values close to its mean, while the negative distribution has large variation, with widely spread values. The difference in variance between the positive and negative distributions is used as the additional criterion. In the learning procedure, we use a new classifier that selects the better of the two by switching between the mean and the standard deviation. We call this new type of combined classifier the "Mixed Weak Classifier". The proposed weak classifier is more robust than the mean classifier alone and decreases the number of boosting operations needed to converge.
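
A minimal sketch of the mixed weak-classifier idea follows: for a single Haar-like feature, both a mean-midpoint threshold and a positive-variance criterion are trained, and whichever yields the lower weighted error is kept. The midpoint rule and the 2-sigma band are illustrative assumptions.

```python
# Sketch of a "mixed" weak classifier on one scalar feature; labels are 0/1.
import numpy as np

def train_mixed_weak_classifier(feats, labels, weights):
    # Criterion 1: threshold at the midpoint of the class means.
    mu_pos, mu_neg = feats[labels == 1].mean(), feats[labels == 0].mean()
    thr = 0.5 * (mu_pos + mu_neg)
    pred_mean = (feats > thr) if mu_pos > mu_neg else (feats <= thr)
    # Criterion 2: accept samples within 2 standard deviations of the positive
    # mean, exploiting the low variance of the positive distribution.
    band = 2.0 * (feats[labels == 1].std() + 1e-9)
    pred_std = np.abs(feats - mu_pos) < band
    # Keep whichever criterion has the lower weighted training error.
    err_mean = weights[pred_mean.astype(int) != labels].sum()
    err_std = weights[pred_std.astype(int) != labels].sum()
    if err_mean <= err_std:
        return ("mean", thr, err_mean)
    return ("std", (mu_pos, band), err_std)
```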

Research Trends and Case Study on Keypoint Recognition and Tracking for Augmented Reality in Mobile Devices (모바일 증강현실을 위한 특징점 인식, 추적 기술 및 사례 연구)

  • Choi, Heeseung;Ahn, Sang Chul;Kim, Ig-Jae
    • Journal of the HCI Society of Korea
    • /
    • v.10 no.2
    • /
    • pp.45-55
    • /
    • 2015
  • In recent years, keypoint recognition and tracking have been considered crucial tasks in many practical systems for markerless augmented reality. These technologies are widely studied in many research areas, including computer vision, robot navigation, and human-computer interaction. Moreover, due to the rapid growth of the mobile market for augmented reality applications, several effective keypoint-based matching and tracking methods designed for mobile embedded systems have been introduced. In this paper, we therefore analyze recent research trends in keypoint-based recognition and tracking in terms of their core components: keypoint detection, description, matching, and tracking. We also present one of our own systems for mobile augmented reality, a mobile tour guide system based on real-time recognition and tracking of tour maps on mobile devices.
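
As one concrete instance of the detect/describe/match pipeline the survey covers, the following sketch uses ORB from OpenCV, a binary descriptor often chosen for mobile devices; the survey itself discusses several alternatives.

```python
# Keypoint detection, description, and matching with ORB (OpenCV).
import cv2

def match_keypoints(img_query, img_map, max_matches=50):
    orb = cv2.ORB_create(nfeatures=1000)
    kp1, des1 = orb.detectAndCompute(img_query, None)  # detect + describe
    kp2, des2 = orb.detectAndCompute(img_map, None)
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = sorted(matcher.match(des1, des2), key=lambda m: m.distance)
    return kp1, kp2, matches[:max_matches]  # best matches first
```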

Design of Client-Server Model For Effective Processing and Utilization of Bigdata (빅데이터의 효과적인 처리 및 활용을 위한 클라이언트-서버 모델 설계)

  • Park, Dae Seo;Kim, Hwa Jong
    • Journal of Intelligence and Information Systems
    • /
    • v.22 no.4
    • /
    • pp.109-122
    • /
    • 2016
  • Recently, big data analysis has developed into a field of interest to individuals and non-experts as well as companies and professionals, and it is used for marketing and for solving social problems by analyzing data that is openly available or collected directly. In Korea, various companies and individuals are attempting big data analysis, but limited data disclosure and collection difficulties make even the initial stage of analysis hard. System improvements for big data activation and disclosure services are being carried out in Korea and abroad, mainly services for opening public data such as the Korean Government 3.0 portal (data.go.kr). In addition to these government efforts, services that share data held by corporations or individuals are running, but useful data is hard to find because little is shared, and big traffic problems can occur because the entire dataset must be downloaded and examined just to grasp the attributes of, and basic information about, the shared data. A new system for big data processing and utilization is therefore needed. First, big data pre-analysis technology is needed to solve the sharing problem. Pre-analysis, a concept proposed in this paper, means providing users with results generated by analyzing the data in advance: when a user searches for big data, pre-analysis supplies information that conveys the properties and characteristics of the data, improving its usability. Moreover, by sharing the summary data or sample data generated through pre-analysis, the security problems that may arise when the original data is disclosed can be avoided, enabling big data sharing between the data provider and the data user. Second, appropriate preprocessing results must be generated quickly, according to the disclosure level of the raw data and the network status, and provided to users through distributed big data processing with Spark. Third, to solve the big traffic problem, the system monitors network traffic in real time; when preprocessing the data requested by a user, it reduces the data to a size transferable on the current network before transmitting it, so that no big traffic occurs. In this paper, we present various data sizes according to the level of disclosure determined through pre-analysis; this method is expected to generate far less traffic than the conventional approach of sharing only raw data across many systems. We describe how to solve the problems that occur when big data is released and used, and how to facilitate sharing and analysis. The client-server model uses Spark for fast analysis and processing of user requests, and consists of a Server Agent and a Client Agent, deployed on the server and client sides respectively. The Server Agent, required by the data provider, performs the pre-analysis of the big data to generate a Data Descriptor containing information on the Sample Data, Summary Data, and Raw Data; it also performs fast and efficient preprocessing through distributed big data processing and continuously monitors network traffic. The Client Agent, placed on the data-user side, can search for big data through the Data Descriptor produced by the pre-analysis and can request the desired data from the server to download it according to its disclosure level. The Server Agent and Client Agent are separated so that data published by a provider can be used by a user. In particular, we focus on big data sharing, distributed big data processing, and the big traffic problem, construct the detailed modules of the client-server model, and present the design of each module. In a system designed on the basis of the proposed model, a user who acquires data analyzes it in a desired direction or preprocesses it into new data; by publishing the newly processed data through the Server Agent, the data user takes on the role of data provider. A data provider can likewise obtain useful statistical information from the Data Descriptor of the data it discloses and become a data user performing new analyses on the sample data. In this way, raw data is processed and the processed big data is utilized by users, forming a natural sharing environment in which the roles of data provider and data user are not fixed and everyone can be both a provider and a user. The client-server model thus solves the big data sharing problem and provides a free sharing environment for secure big data disclosure and an ideal shared service that makes big data easy to find.
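
A hypothetical sketch of the Server Agent's pre-analysis step is given below: from a raw dataset it derives a Data Descriptor holding the schema, summary statistics, and a small sample using Spark, so that the Client Agent can browse the data without downloading it. Field names and the sampling fraction are illustrative, not from the paper.

```python
# Hypothetical pre-analysis step producing a Data Descriptor with PySpark.
from pyspark.sql import SparkSession

def build_data_descriptor(path, sample_fraction=0.01):
    spark = SparkSession.builder.appName("server-agent").getOrCreate()
    df = spark.read.csv(path, header=True, inferSchema=True)
    summary = df.describe().toPandas()  # count/mean/stddev/min/max per column
    sample = df.sample(fraction=sample_fraction, seed=42).toPandas()
    return {
        "raw_data_path": path,            # raw data stays with the provider
        "schema": df.schema.json(),
        "row_count": df.count(),
        "summary_data": summary.to_dict(),
        "sample_data": sample.to_dict(orient="records"),
    }
```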

Human Action Recognition Using Pyramid Histograms of Oriented Gradients and Collaborative Multi-task Learning

  • Gao, Zan;Zhang, Hua;Liu, An-An;Xue, Yan-Bing;Xu, Guang-Ping
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.8 no.2
    • /
    • pp.483-503
    • /
    • 2014
  • In this paper, human action recognition using pyramid histograms of oriented gradients and collaborative multi-task learning is proposed. First, we accumulate global activities and construct a motion history image (MHI) for the RGB and depth channels respectively to encode the dynamics of an action in different modalities, and then extract different action descriptors from the depth and RGB MHIs to represent the global textural and structural characteristics of the actions. Specifically, hierarchical block averages, GIST, and pyramid histograms of oriented gradients descriptors are employed to represent human motion. To demonstrate the superiority of the proposed method, we evaluate these descriptors with KNN, SVM with linear and RBF kernels, SRC, and CRC models on the DHA dataset, a well-known dataset for human action recognition. Large-scale experimental results show that our descriptors are robust, stable, and efficient, and outperform state-of-the-art methods. We further investigate the descriptors by combining them on the DHA dataset and observe that combined descriptors perform much better than any single descriptor. With these multimodal features, we also propose a collaborative multi-task learning method for model learning and inference based on transfer learning theory. The main contributions lie in four aspects: 1) the proposed encoding scheme can filter out the stationary parts of the human body and reduce noise interference; 2) different kinds of features and models are assessed, and neighboring-gradient information and pyramid layers prove very helpful for representing these actions; 3) the proposed model can fuse features from different modalities regardless of sensor type, value range, or feature dimensionality; 4) latent common knowledge among different modalities can be discovered by transfer learning to boost performance.
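
To make the MHI construction concrete, here is a minimal sketch: frame differences above a threshold set the history image to its maximum value, while older motion decays at each step, so a single grayscale image encodes the dynamics of an action. The threshold and duration are illustrative.

```python
# Minimal motion history image (MHI) from a list of grayscale frames.
import numpy as np

def motion_history_image(frames, threshold=30, duration=255):
    mhi = np.zeros_like(frames[0], dtype=float)
    for prev, cur in zip(frames[:-1], frames[1:]):
        motion = np.abs(cur.astype(int) - prev.astype(int)) > threshold
        mhi = np.where(motion, duration, np.maximum(mhi - 1, 0))  # set or decay
    return mhi.astype(np.uint8)
```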