• Title/Summary/Keyword: Voting method

Search Result 187, Processing Time 0.022 seconds

Artificial Intelligence Algorithms, Model-Based Social Data Collection and Content Exploration (소셜데이터 분석 및 인공지능 알고리즘 기반 범죄 수사 기법 연구)

  • An, Dong-Uk;Leem, Choon Seong
    • The Journal of Bigdata
    • /
    • v.4 no.2
    • /
    • pp.23-34
    • /
    • 2019
  • Recently, the crime that utilizes the digital platform is continuously increasing. About 140,000 cases occurred in 2015 and about 150,000 cases occurred in 2016. Therefore, it is considered that there is a limit handling those online crimes by old-fashioned investigation techniques. Investigators' manual online search and cognitive investigation methods those are broadly used today are not enough to proactively cope with rapid changing civil crimes. In addition, the characteristics of the content that is posted to unspecified users of social media makes investigations more difficult. This study suggests the site-based collection and the Open API among the content web collection methods considering the characteristics of the online media where the infringement crimes occur. Since illegal content is published and deleted quickly, and new words and alterations are generated quickly and variously, it is difficult to recognize them quickly by dictionary-based morphological analysis registered manually. In order to solve this problem, we propose a tokenizing method in the existing dictionary-based morphological analysis through WPM (Word Piece Model), which is a data preprocessing method for quick recognizing and responding to illegal contents posting online infringement crimes. In the analysis of data, the optimal precision is verified through the Vote-based ensemble method by utilizing a classification learning model based on supervised learning for the investigation of illegal contents. This study utilizes a sorting algorithm model centering on illegal multilevel business cases to proactively recognize crimes invading the public economy, and presents an empirical study to effectively deal with social data collection and content investigation.

  • PDF

Predicting Stock Liquidity by Using Ensemble Data Mining Methods

  • Bae, Eun Chan;Lee, Kun Chang
    • Journal of the Korea Society of Computer and Information
    • /
    • v.21 no.6
    • /
    • pp.9-19
    • /
    • 2016
  • In finance literature, stock liquidity showing how stocks can be cashed out in the market has received rich attentions from both academicians and practitioners. The reasons are plenty. First, it is known that stock liquidity affects significantly asset pricing. Second, macroeconomic announcements influence liquidity in the stock market. Therefore, stock liquidity itself affects investors' decision and managers' decision as well. Though there exist a great deal of literature about stock liquidity in finance literature, it is quite clear that there are no studies attempting to investigate the stock liquidity issue as one of decision making problems. In finance literature, most of stock liquidity studies had dealt with limited views such as how much it influences stock price, which variables are associated with describing the stock liquidity significantly, etc. However, this paper posits that stock liquidity issue may become a serious decision-making problem, and then be handled by using data mining techniques to estimate its future extent with statistical validity. In this sense, we collected financial data set from a number of manufacturing companies listed in KRX (Korea Exchange) during the period of 2010 to 2013. The reason why we selected dataset from 2010 was to avoid the after-shocks of financial crisis that occurred in 2008. We used Fn-GuidPro system to gather total 5,700 financial data set. Stock liquidity measure was computed by the procedures proposed by Amihud (2002) which is known to show best metrics for showing relationship with daily return. We applied five data mining techniques (or classifiers) such as Bayesian network, support vector machine (SVM), decision tree, neural network, and ensemble method. Bayesian networks include GBN (General Bayesian Network), NBN (Naive BN), TAN (Tree Augmented NBN). Decision tree uses CART and C4.5. Regression result was used as a benchmarking performance. Ensemble method uses two types-integration of two classifiers, and three classifiers. Ensemble method is based on voting for the sake of integrating classifiers. Among the single classifiers, CART showed best performance with 48.2%, compared with 37.18% by regression. Among the ensemble methods, the result from integrating TAN, CART, and SVM was best with 49.25%. Through the additional analysis in individual industries, those relatively stabilized industries like electronic appliances, wholesale & retailing, woods, leather-bags-shoes showed better performance over 50%.

Development of Deep Learning Based Ensemble Land Cover Segmentation Algorithm Using Drone Aerial Images (드론 항공영상을 이용한 딥러닝 기반 앙상블 토지 피복 분할 알고리즘 개발)

  • Hae-Gwang Park;Seung-Ki Baek;Seung Hyun Jeong
    • Korean Journal of Remote Sensing
    • /
    • v.40 no.1
    • /
    • pp.71-80
    • /
    • 2024
  • In this study, a proposed ensemble learning technique aims to enhance the semantic segmentation performance of images captured by Unmanned Aerial Vehicles (UAVs). With the increasing use of UAVs in fields such as urban planning, there has been active development of techniques utilizing deep learning segmentation methods for land cover segmentation. The study suggests a method that utilizes prominent segmentation models, namely U-Net, DeepLabV3, and Fully Convolutional Network (FCN), to improve segmentation prediction performance. The proposed approach integrates training loss, validation accuracy, and class score of the three segmentation models to enhance overall prediction performance. The method was applied and evaluated on a land cover segmentation problem involving seven classes: buildings,roads, parking lots, fields, trees, empty spaces, and areas with unspecified labels, using images captured by UAVs. The performance of the ensemble model was evaluated by mean Intersection over Union (mIoU), and the results of comparing the proposed ensemble model with the three existing segmentation methods showed that mIoU performance was improved. Consequently, the study confirms that the proposed technique can enhance the performance of semantic segmentation models.

Estimation of the Percent of the Vote by Adjustment of Voter Turnout in Election Polls (선거여론조사에서 투표율 반영을 통한 득표율 추정)

  • Kim, Jeonghoon;Han, Sang-Tae;Kang, Hyuncheol
    • Journal of the Korean Data Analysis Society
    • /
    • v.20 no.6
    • /
    • pp.2873-2881
    • /
    • 2018
  • It is very important to obtain objective and credible information through election polls in order to contribute to the correct voting behavior of the voters or to establish appropriate election strategies for candidates or political parties. Therefore, many related organizations such as political parties, media organizations, and research institutions have been making efforts to improve the accuracy of the results of the polls and the election prediction. Kim et al. (2017) analyzed whether the non-response group responded that there is no support candidate in the election survey to increase the accuracy of the estimation of the vote rate. As a result, it has been confirmed that the accuracy of the estimation of the vote rate can be significantly improved by performing an appropriate classification on the non-response layer. In this study, we propose a method to estimate the turnout by each strata (sex, age group) under the condition that the total turnout rate is given for a specific district (region) and propose a procedure to predict the vote rate by reflecting the turnout. In addition, case studies were conducted using data gathered through telephone interviews for the 20th National Assembly elections in 2016.

The Problems and Improvement Measures of Protection for Politician (정치인 경호제도의 문제점 및 개선방안)

  • Jo, Sung-Gu;Kim, Tae-Min
    • Korean Security Journal
    • /
    • no.22
    • /
    • pp.169-196
    • /
    • 2010
  • Although more priority is given to politicians from the aspect that they represent people and decide the future of country, the current situation is that politicians are not free from terrorism because of insufficient guard-concerned law, negative social recognition and increased crime and terrorism. The measure for politician terrorism shall be handled from the aspect of national security rather than public peace. For the purpose, basic legal foundation shall be prepared and specialized guard technique considering specialty of politician shall be established. Basic solution shall be established by reinforcing law against politician terrorism and establishing new law from the national viewpoint. The guard for politician has two faces that both of safety of guard target and voting intention of voter shall be met at the same time. Although special guard technique is required for guarding politician, current situation is that it is not researched professionally. In relation to the measure to develop the system of protection for politician, First, the study suggested legal foundation for politician guard. Although the 17th National Assembly proposed revised legal plan to protect politician from terrorism, it is suspended, expired and abolished now. The legal plan presented by members of the National Assembly was simply restricted to the scope of public guard. The study divided establishment of legal foundation into two things. The first one is the dispatch type of effective public guard and the second one is the transfer to private guard. Second, the study suggested environmental development method of politician guard. in the environment of politician guard, the study suggested improvement and development method by analyzing social recognition, politician's mind and voter's mind psychologically. After the beginning of human society, if human race is continued, political activity won't disappear. It is obvious that the safety of political leader is very important issue for human race because he plays the role to decide the future of human. In the future, more specialized, effective law shall be prepared and deeper study of scholar shall be performed.

  • PDF

Bankruptcy prediction using an improved bagging ensemble (개선된 배깅 앙상블을 활용한 기업부도예측)

  • Min, Sung-Hwan
    • Journal of Intelligence and Information Systems
    • /
    • v.20 no.4
    • /
    • pp.121-139
    • /
    • 2014
  • Predicting corporate failure has been an important topic in accounting and finance. The costs associated with bankruptcy are high, so the accuracy of bankruptcy prediction is greatly important for financial institutions. Lots of researchers have dealt with the topic associated with bankruptcy prediction in the past three decades. The current research attempts to use ensemble models for improving the performance of bankruptcy prediction. Ensemble classification is to combine individually trained classifiers in order to gain more accurate prediction than individual models. Ensemble techniques are shown to be very useful for improving the generalization ability of the classifier. Bagging is the most commonly used methods for constructing ensemble classifiers. In bagging, the different training data subsets are randomly drawn with replacement from the original training dataset. Base classifiers are trained on the different bootstrap samples. Instance selection is to select critical instances while deleting and removing irrelevant and harmful instances from the original set. Instance selection and bagging are quite well known in data mining. However, few studies have dealt with the integration of instance selection and bagging. This study proposes an improved bagging ensemble based on instance selection using genetic algorithms (GA) for improving the performance of SVM. GA is an efficient optimization procedure based on the theory of natural selection and evolution. GA uses the idea of survival of the fittest by progressively accepting better solutions to the problems. GA searches by maintaining a population of solutions from which better solutions are created rather than making incremental changes to a single solution to the problem. The initial solution population is generated randomly and evolves into the next generation by genetic operators such as selection, crossover and mutation. The solutions coded by strings are evaluated by the fitness function. The proposed model consists of two phases: GA based Instance Selection and Instance based Bagging. In the first phase, GA is used to select optimal instance subset that is used as input data of bagging model. In this study, the chromosome is encoded as a form of binary string for the instance subset. In this phase, the population size was set to 100 while maximum number of generations was set to 150. We set the crossover rate and mutation rate to 0.7 and 0.1 respectively. We used the prediction accuracy of model as the fitness function of GA. SVM model is trained on training data set using the selected instance subset. The prediction accuracy of SVM model over test data set is used as fitness value in order to avoid overfitting. In the second phase, we used the optimal instance subset selected in the first phase as input data of bagging model. We used SVM model as base classifier for bagging ensemble. The majority voting scheme was used as a combining method in this study. This study applies the proposed model to the bankruptcy prediction problem using a real data set from Korean companies. The research data used in this study contains 1832 externally non-audited firms which filed for bankruptcy (916 cases) and non-bankruptcy (916 cases). Financial ratios categorized as stability, profitability, growth, activity and cash flow were investigated through literature review and basic statistical methods and we selected 8 financial ratios as the final input variables. We separated the whole data into three subsets as training, test and validation data set. In this study, we compared the proposed model with several comparative models including the simple individual SVM model, the simple bagging model and the instance selection based SVM model. The McNemar tests were used to examine whether the proposed model significantly outperforms the other models. The experimental results show that the proposed model outperforms the other models.

Finding Influential Users in the SNS Using Interaction Concept : Focusing on the Blogosphere with Continuous Referencing Relationships (상호작용성에 의한 SNS 영향유저 선정에 관한 연구 : 연속적인 참조관계가 있는 블로고스피어를 중심으로)

  • Park, Hyunjung;Rho, Sangkyu
    • The Journal of Society for e-Business Studies
    • /
    • v.17 no.4
    • /
    • pp.69-93
    • /
    • 2012
  • Various influence-related relationships in Social Network Services (SNS) among users, posts, and user-and-post, can be expressed using links. The current research evaluates the influence of specific users or posts by analyzing the link structure of relevant social network graphs to identify influential users. We applied the concept of mutual interactions proposed for ranking semantic web resources, rather than the voting notion of Page Rank or HITS, to blogosphere, one of the early SNS. Through many experiments with network models, where the performance and validity of each alternative approach can be analyzed, we showed the applicability and strengths of our approach. The weight tuning processes for the links of these network models enabled us to control the experiment errors form the link weight differences and compare the implementation easiness of alternatives. An additional example of how to enter the content scores of commercial or spam posts into the graph-based method is suggested on a small network model as well. This research, as a starting point of the study on identifying influential users in SNS, is distinctive from the previous researches in the following points. First, various influence-related properties that are deemed important but are disregarded, such as scraping, commenting, subscribing to RSS feeds, and trusting friends, can be considered simultaneously. Second, the framework reflects the general phenomenon where objects interacting with more influential objects increase their influence. Third, regarding the extent to which a bloggers causes other bloggers to act after him or her as the most important factor of influence, we treated sequential referencing relationships with a viewpoint from that of PageRank or HITS (Hypertext Induced Topic Selection).