• Title/Summary/Keyword: Majority voting

Fuzzy-Membership Based Writer Identification from Handwritten Devnagari Script

  • Kumar, Rajiv;Ravulakollu, Kiran Kumar;Bhat, Rajesh
    • Journal of Information Processing Systems / v.13 no.4 / pp.893-913 / 2017
  • Handwriting-based person identification systems typically use structural properties of handwriting, as perceived by their designers, as features. In this paper, we present a system whose features are the structural properties that graphologists and expert handwriting analysts use to determine a writer's personality traits and to make other assessments. The advantage of these features is that their definition rests on sound historical knowledge (i.e., the knowledge discovered by graphologists, psychiatrists, forensic experts, and experts of other domains in analyzing the relationships between handwritten stroke characteristics and the phenomena that embed individuality in a stroke). Hence, each stroke characteristic reflects a personality trait. We measured the effectiveness of these features on a subset of the handwritten Devnagari and Latin script datasets from the Center for Pattern Analysis and Recognition (CPAR-2012), written by 100 people, each of whom wrote three samples of the Devnagari and Latin text that we designed for our experiments. The experiment yielded 100% correct identification on the training set. However, with 200 training samples and 100 test samples, we observed correct identification rates of 88% and 89% on the handwritten Devnagari and Latin text, respectively. After introducing a majority-voting-based rejection criterion, the identification accuracy increased to 97% on both script sets.
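
The abstract does not spell out the rejection criterion, so the following is only a minimal sketch of one plausible reading: several per-sample predictions vote on a writer ID, and the decision is rejected when no ID wins a strict majority. The function name `vote_with_rejection`, the `min_share` threshold, and the writer labels are hypothetical, not taken from the paper.

```python
from collections import Counter

def vote_with_rejection(predictions, min_share=0.5):
    """Return the winning writer ID, or None (reject) when no ID
    receives more than min_share of the votes."""
    if not predictions:
        return None
    label, count = Counter(predictions).most_common(1)[0]
    # Accept only when the top label holds a strict majority of the votes.
    return label if count / len(predictions) > min_share else None

# Hypothetical votes for one test document.
print(vote_with_rejection(["w7", "w7", "w3"]))   # accepted
print(vote_with_rejection(["w1", "w2", "w3"]))   # rejected
```

Under this reading, rejected documents are left out of the accuracy figure, which is one way a rejection rule can raise the reported identification rate.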

Patch based Semi-supervised Linear Regression for Face Recognition

  • Ding, Yuhua;Liu, Fan;Rui, Ting;Tang, Zhenmin
    • KSII Transactions on Internet and Information Systems (TIIS) / v.13 no.8 / pp.3962-3980 / 2019
  • To deal with single-sample face recognition, this paper presents a patch-based semi-supervised linear regression (PSLR) algorithm, which draws facial variation information from unlabeled samples. Each facial image is divided into overlapping patches, and a regression model with a mapping matrix is constructed on each patch. Then, we adjust these matrices by mapping unlabeled patches to $[1,1,{\cdots},1]^T$. The solutions of all the mapping matrices are integrated into an overall objective function, which uses ${\ell}_{2,1}$-norm minimization constraints to improve the discrimination ability of the mapping matrices and reduce the impact of noise. After the mapping matrices are computed, we adopt a majority-voting strategy to classify the probe samples. To further learn the discrimination information between probe samples and obtain more robust mapping matrices, we also propose a multistage PSLR (MPSLR) algorithm, which iteratively updates the training dataset by adding the reliably labeled probe samples to it. The effectiveness of our approaches is evaluated on three public facial databases. Experimental results show that our approaches are robust to illumination, expression, and occlusion.

A Methodology for Predicting Changes in Product Evaluation Based on Customer Experience Using Deep Learning (딥러닝을 활용한 고객 경험 기반 상품 평가 변화 예측 방법론)

  • An, Jiyea;Kim, Namgyu
    • Journal of Information Technology Services / v.21 no.4 / pp.75-90 / 2022
  • Reviews have long had a strong influence on consumers' purchasing decisions. Companies are making various efforts to increase the number of reviews, such as introducing review incentive systems. Recently, as it has become possible to leave various types of reviews, reviews have begun to be recognized as interesting new content. Reviews have thus become essential in creating loyal customers, and research on and utilization of reviews are being actively pursued. Existing studies analyze reviews to discover customers' needs, upgrade recommendation systems using reviews, and analyze consumers' emotions and attitudes through reviews. However, research that predicts the future using reviews is insufficient. This study used a dataset of review pairs, in which the two reviews in each pair differ in usage period. The direction of change in consumer product evaluation is predicted using KoBERT, which shows excellent performance in deep learning for text. We used 7,233 collected reviews to demonstrate the performance of the proposed model. As a result, the proposed model, which uses the review text and the star rating, outperformed the baseline that follows majority voting.

Credit Risk Evaluations of Online Retail Enterprises Using Support Vector Machines Ensemble: An Empirical Study from China

  • LI, Xin;XIA, Han
    • The Journal of Asian Finance, Economics and Business / v.9 no.8 / pp.89-97 / 2022
  • The e-commerce market faces significant credit risks due to the complexity of the industry and information asymmetries, and credit risk has started to stymie the growth of e-commerce. However, there is no reliable system for evaluating the creditworthiness of e-commerce companies. Therefore, this paper constructs a credit risk evaluation index system that comprehensively considers the online and offline behavior of online retail enterprises, including 15 indicators that reflect online credit risk and 15 indicators that reflect offline credit risk. This paper establishes an integration method based on a fuzzy integral support vector machine, which takes the factor analysis results of the credit risk evaluation index system as the input and the credit risk evaluation results of online retail enterprises as the output. The method takes into account both the classification result of each sub-classifier and the importance of each sub-classifier's decision to the final decision. Sample data on 1,500 online retail loan customers of a bank are used to test the model. The empirical results demonstrate that the proposed method outperforms both a single SVM and the traditional technique of aggregating SVMs via majority voting in terms of classification accuracy, which provides a basis for banks to establish a reliable evaluation system.

A Study on Classification of CNN-based Linux Malware using Image Processing Techniques (영상처리기법을 이용한 CNN 기반 리눅스 악성코드 분류 연구)

  • Kim, Se-Jin;Kim, Do-Yeon;Lee, Hoo-Ki;Lee, Tae-Jin
    • Journal of the Korea Academia-Industrial cooperation Society / v.21 no.9 / pp.634-642 / 2020
  • With the proliferation of Internet of Things (IoT) devices, the use of the Linux operating system on various architectures has increased. Security threats against Linux-based IoT devices are also increasing, and variants of existing malware are constantly appearing. In this paper, we propose a system in which the binary data of an Executable and Linkable Format (ELF) file is visualized as an image, processed with the Local Binary Pattern (LBP) technique and a median filter, and classified by a Convolutional Neural Network (CNN). As a result, the original image showed the highest accuracy and F1-score at 98.77%, and the highest recall at 98.55%. With the median filter, the highest precision was 99.19%, and the lowest false positive rate was 0.008%. With the LBP technique, the overall results were lower than those obtained by putting the original ELF file through the median filter. When the results of putting the original file through the image processing techniques were combined by majority vote, the accuracy, precision, F1-score, and false positive rate were better than those obtained by putting the original file through the median filter. In the future, the proposed system will be used to classify malware families, or other image processing techniques will be added to improve the accuracy of majority-vote classification.

Building a Korean Sentiment Lexicon Using Collective Intelligence (집단지성을 이용한 한글 감성어 사전 구축)

  • An, Jungkook;Kim, Hee-Woong
    • Journal of Intelligence and Information Systems / v.21 no.2 / pp.49-67 / 2015
  • The recent emergence of big data and social media has brought about a big bang of data. Social networking services are widely used by people around the world and have become major communication tools for all ages. Over the last decade, as online social networking sites have become increasingly popular, companies have tended to focus on advanced social media analysis for their marketing strategies. In addition to social media analysis, companies are mainly concerned about the propagation of negative opinions on social networking sites such as Facebook and Twitter, as well as on e-commerce sites. Online word of mouth (WOM), such as product ratings, product reviews, and product recommendations, is very influential, and negative opinions have a significant impact on product sales. This trend has drawn researchers' attention to natural language processing techniques such as sentiment analysis. Sentiment analysis, also referred to as opinion mining, is a process of identifying the polarity of subjective information and has been applied to various research and practical fields. However, applying natural language processing to the Korean language (Hangul) is difficult because it is an agglutinative language whose rich morphology poses problems. Consequently, there is a lack of Korean natural language processing resources such as a sentiment lexicon, and this has resulted in significant limitations for researchers and practitioners who are considering sentiment analysis. Our study builds a Korean sentiment lexicon with collective intelligence and provides an API (Application Programming Interface) service to open and share the sentiment lexicon data with the public (www.openhangul.com). For pre-processing, we created a Korean lexicon database with over 517,178 words and classified them into sentiment and non-sentiment words. To classify them, we first identified stop words, which are likely to play a negative role in sentiment analysis, and excluded them from our sentiment scoring. In general, sentiment words are nouns, adjectives, verbs, and adverbs, as they carry sentimental expressions such as positive, neutral, and negative. On the other hand, non-sentiment words are interjections, determiners, numerals, postpositions, etc., as they generally carry no sentimental expressions. To build a reliable sentiment lexicon, we adopted the concept of collective intelligence as a model for crowdsourcing. In addition, the concept of folksonomy was implemented in the taxonomy process to support collective intelligence. To make up for an inherent weakness of folksonomy, we adopted majority rule by building a voting system. As voters, participants were offered three options to choose from (positive, negative, and neutral), and the voting was conducted on one of the largest social networking sites for college students in Korea. More than 35,000 votes have been cast by college students in Korea, and we keep the voting system open by maintaining the project as a perpetual study. Moreover, any change in the sentiment score of a word can be an important observation because it enables us to keep track of temporal changes in Korean as a natural language. In addition, our study offers a RESTful, JSON-based API service through a web platform to better support users such as researchers, companies, and developers. Finally, our study makes important contributions to both research and practice. In terms of research, our Korean sentiment lexicon plays an important role as a resource for Korean natural language processing. In terms of practice, practitioners such as managers and marketers can implement sentiment analysis effectively by using the Korean sentiment lexicon we built. Moreover, our study sheds new light on the value of folksonomy by combining it with collective intelligence, and we also expect it to give a new direction and a new start to the development of Korean natural language processing.

Performance Comparison of Various Features for Off-line Handwritten Numerals Recognition and Suggestions for Improving Recognition Rate (오프라인 필기체 숫자 인식을 위한 다양한 특징들의 성능 비교 및 인식률 개선 방안)

  • Park, Chang-Sun;Kim, Du-Yeong
    • The Transactions of the Korea Information Processing Society / v.3 no.4 / pp.915-925 / 1996
  • In this paper, to find effective features that can handle variations in off-line handwritten numerals, we performed a comparative study of various feature sets. The experimental performance comparison shows that 4-directional features using contours, and features that combine cross distance, cross, mesh, and projection features, are very effective for off-line handwritten numeral recognition in terms of recognition rate and recognition time. To surmount the recognition rate limit of a single neural network, we propose a modularized neural network that uses majority voting and a reliability factor with a complex feature that mixes the effective features together. To verify the performance of the proposed method, the handwritten numeral databases of Concordia University of Canada and Dong-A University of Korea are used in the experiments. With the Concordia University database, a recognition rate of 97.1%, a rejection rate of 1.5%, an error rate of 1.4%, and a reliability of 98.5% are obtained; with the Dong-A University database, a recognition rate of 98%, a rejection rate of 1.2%, an error rate of 0.8%, and a reliability of 99.1% are obtained.
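
The abstract does not define reliability. A common definition in recognition systems with a reject option is the share of accepted (non-rejected) samples that are recognized correctly; the sketch below assumes that definition, and the function name `reliability` is hypothetical. With the reported rates it gives values within about 0.1 percentage point of the figures in the abstract.

```python
def reliability(recognition_rate, error_rate):
    """Percentage of accepted (non-rejected) samples recognized correctly.
    Assumed definition: recognition / (recognition + error)."""
    return 100.0 * recognition_rate / (recognition_rate + error_rate)

# Rates reported in the abstract (in percent).
print(round(reliability(97.1, 1.4), 2))  # Concordia University database: 98.58
print(round(reliability(98.0, 0.8), 2))  # Dong-A University database: 99.19
```

Note that in both cases the recognition, rejection, and error rates sum to 100%, so the denominator equals 100 minus the rejection rate.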

Quorum Consensus Method based on Ghost using Simplified Metadata (단순화된 메타데이타를 이용한 고스트 기반 정족수 동의 기법의 개선)

  • Cho, Song-Yean;Kim, Tai-Yun
    • Journal of KIISE:Computer Systems and Theory / v.27 no.1 / pp.34-43 / 2000
  • Replicated data used in fault-tolerant distributed systems requires a replica control protocol to maintain data consistency. One such protocol is the quorum consensus method, which accesses replicated data after obtaining majority approval. If a site failure or communication link failure occurs and no one can obtain quorum consensus, the availability of the data managed by the quorum consensus protocol is degraded. A ghost is therefore needed to replace the failed site. Because a ghost is not a full replica but a process that holds state information in the form of metadata, it is important to simplify the metadata. To maintain availability and simplify the metadata, we propose a method that uses the cohort set as the ghost's metadata. The proposed method makes it possible to organize the metadata in 2N+logN bits and to achieve higher availability than quorum consensus with only a cohort set and the dynamic linear voting protocol. Using a Markov model, we calculate the availability of the proposed method and compare it with existing protocols.
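
As a rough illustration of the majority-approval idea only, the sketch below checks whether the reachable sites form a strict majority of all replicas. It is not the protocol proposed in the paper: the function name `has_majority_quorum`, the site names, and the simplification that a ghost standing in for a failed site contributes one vote are all assumptions, and the cohort-set metadata is not modeled.

```python
def has_majority_quorum(all_sites, live_sites, ghost_sites=()):
    """True when live replicas plus ghosts (assumed to vote on behalf of
    failed sites) form a strict majority of all replica sites."""
    all_sites = set(all_sites)
    voters = (set(live_sites) | set(ghost_sites)) & all_sites
    return 2 * len(voters) > len(all_sites)

sites = {"A", "B", "C", "D", "E"}                             # hypothetical replicas
print(has_majority_quorum(sites, {"A", "B", "C"}))            # 3 of 5 reachable
print(has_majority_quorum(sites, {"A", "B"}))                 # 2 of 5: no quorum
print(has_majority_quorum(sites, {"A", "B"}, {"C"}))          # ghost stands in for C
```

The third call shows why a ghost helps availability in this simplified model: the operation proceeds even though only two full replicas are reachable.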

Structural Analysis Algorithm for Automatic Transcription 'Pansori' (판소리 자동채보를 위한 구조분석 알고리즘)

  • Ju, Young-Ho;Kim, Joon-Cheol;Seo, Kyoung-Suk;Lee, Joon-Whoan
    • The Journal of the Korea Contents Association / v.14 no.2 / pp.28-38 / 2014
  • For Western music, there has been a large volume of research on music information analysis for automatic transcription and content-based music retrieval, but similar research on Korean traditional music is hard to find. In this paper, we propose several algorithms that automatically analyze the structure of the Korean traditional music 'Pansori'. The proposed algorithm automatically distinguishes the 'sound' part from the 'speech' part, named 'sori' and 'aniri', respectively, using the ratio of phonetic and pause time intervals. To classify the rhythm, called 'jangdan', the algorithm makes a robust decision using a majority voting process based on template matching. An algorithm based on the Kalman filter is also suggested to detect the bar positions in the 'sori' part. Every proposed algorithm works well enough on the sample 'Pansori' music sources that the results may be used to transcribe 'Pansori' automatically.

Fault Tolerant Encryption and Data Compression under Ubiquitous Environment (Ubiquitous 환경 하에서 고장 극복 암호 및 데이터 압축)

  • You, Young-Gap;Kim, Han-Byeo-Ri;Park, Kyung-Chang;Lee, Sang-Jin;Kim, Seung-Youl;Hong, Yoon-Ki
    • The Journal of the Korea Contents Association / v.9 no.8 / pp.91-98 / 2009
  • This paper presents a solution to the error avalanche in deciphering, where radio noise introduces random bit errors into encrypted image data in a ubiquitous environment. The image capturing module is to comprise data compression and encryption features to reduce data traffic volume and to protect privacy. Block cipher algorithms may experience an error avalanche: multiple pixel defects caused by a single bit error in an encrypted message. The new fault-tolerant scheme addresses the error avalanche effect by exploiting a three-dimensional data shuffling process, which disperses error bits over many frames, resulting in sparsely isolated errors. Averaging or majority voting with neighboring pixels can then tolerate prominent pixel defects without the increase in data volume that error correction would incur. This scheme has a 33% lower data traffic load than the conventional Hamming code based approach.
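
To illustrate how majority voting with neighboring pixels can absorb sparsely isolated defects, here is a minimal sketch on a binary image. The 3x3 window, the binary pixel values, and the function name `majority_filter` are assumptions made for the example; the abstract does not specify the window or the pixel format.

```python
def majority_filter(image):
    """Replace each pixel of a binary image (lists of 0/1) with the
    majority value of its 3x3 neighborhood, clipped at the borders."""
    rows, cols = len(image), len(image[0])
    result = [row[:] for row in image]
    for r in range(rows):
        for c in range(cols):
            window = [
                image[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
            ]
            # The pixel becomes 1 only when ones are a strict majority.
            result[r][c] = 1 if 2 * sum(window) > len(window) else 0
    return result

# A single isolated defect in an otherwise uniform patch is voted away.
noisy = [[0, 0, 0],
         [0, 1, 0],
         [0, 0, 0]]
print(majority_filter(noisy))
```

This only works when errors are isolated, which is why dispersing the error bits across frames first matters: a cluster of defects could outvote the correct neighbors.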