• Title/Summary/Keyword: real-world dataset

Spectral clustering based on the local similarity measure of shared neighbors

  • Cao, Zongqi;Chen, Hongjia;Wang, Xiang
    • ETRI Journal
    • /
    • v.44 no.5
    • /
    • pp.769-779
    • /
    • 2022
  • Spectral clustering has become a typical and efficient clustering method used in a variety of applications. The critical step of spectral clustering is the similarity measurement, which largely determines the performance of the spectral clustering method. In this paper, we propose a novel spectral clustering algorithm based on the local similarity measure of shared neighbors. This similarity measurement exploits the local density information between data points based on the weight of the shared neighbors in a directed k-nearest neighbor graph with only one parameter k, that is, the number of nearest neighbors. Numerical experiments on synthetic and real-world datasets demonstrate that our proposed algorithm outperforms other existing spectral clustering algorithms in terms of the clustering performance measured via the normalized mutual information, clustering accuracy, and F-measure. As an example, the proposed method can provide an improvement of 15.82% in the clustering performance for the Soybean dataset.
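A shared-neighbor similarity of the kind this abstract describes can be sketched as follows. This is a minimal illustration, not the authors' method: the abstract does not give the exact weighting, so the sketch assumes the similarity of two points is simply the fraction of their k nearest neighbors that they share. The function names `knn_lists` and `shared_neighbor_similarity` are hypothetical.

```python
import math

def knn_lists(points, k):
    """For each point, return the index set of its k nearest neighbors (directed kNN graph)."""
    neighbors = []
    for i, p in enumerate(points):
        # Sort all other points by Euclidean distance to p.
        dists = sorted((math.dist(p, q), j) for j, q in enumerate(points) if j != i)
        neighbors.append({j for _, j in dists[:k]})
    return neighbors

def shared_neighbor_similarity(points, k):
    """Assumed weighting: similarity = (number of shared k-nearest neighbors) / k."""
    nbrs = knn_lists(points, k)
    n = len(points)
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            shared = len(nbrs[i] & nbrs[j])
            sim[i][j] = sim[j][i] = shared / k
    return sim

# Two well-separated groups of three points each.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
S = shared_neighbor_similarity(points, k=2)
```

Points in the same group share neighbors and get a positive similarity, while points in different groups get zero. A matrix like `S` could then serve as the affinity matrix for a standard spectral clustering routine; k remains the only parameter.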

Synthetic Image Generation for Military Vehicle Detection (군용물체탐지 연구를 위한 가상 이미지 데이터 생성)

  • Se-Yoon Oh;Hunmin Yang
    • Journal of the Korea Institute of Military Science and Technology
    • /
    • v.26 no.5
    • /
    • pp.392-399
    • /
    • 2023
  • This research paper investigates the effectiveness of using computer graphics (CG)-based synthetic data for deep learning in military vehicle detection. In particular, we explore the use of synthetic image generation techniques to train deep neural networks for object detection tasks. Our approach involves the generation of a large dataset of synthetic images of military vehicles, which is then used to train a deep learning model. The resulting model is then evaluated on real-world images to measure its effectiveness. Our experimental results show that synthetic training data alone can achieve effective results in object detection. Our findings demonstrate the potential of CG-based synthetic data for deep learning and suggest its value as a tool for training models in a variety of applications, including military vehicle detection.

A Study on Human-AI Collaboration Process to Support Evidence-Based National Innovation Monitoring: Case Study on Ministry of Oceans and Fisheries (Human-AI 협력 프로세스 기반의 증거기반 국가혁신 모니터링 연구: 해양수산부 사례)

  • Jung Sun Lim;Seoung Hun Bae;Kil-Ho Ryu;Sang-Gook Kim
    • Journal of Korean Society of Industrial and Systems Engineering
    • /
    • v.46 no.2
    • /
    • pp.22-31
    • /
    • 2023
  • Governments around the world are enacting laws mandating explainable traceability when using AI (Artificial Intelligence) to solve real-world problems. HAI (Human-Centric Artificial Intelligence) is an approach that induces human decision-making through Human-AI collaboration. This research presents a case study that implements Human-AI collaboration to achieve explainable traceability in governmental data analysis. The Human-AI collaboration explored in this study performs AI inferences for generating labels, followed by AI interpretation to make results more explainable and traceable. The study utilized an example dataset from the Ministry of Oceans and Fisheries to reproduce the Human-AI collaboration process used in actual policy-making, in which the Ministry of Science and ICT utilized R&D PIE (R&D Platform for Investment and Evaluation) to build a government investment portfolio.

Modeling of Photovoltaic Power Systems using Clustering Algorithm and Modular Networks (군집화 알고리즘 및 모듈라 네트워크를 이용한 태양광 발전 시스템 모델링)

  • Lee, Chang-Sung;Ji, Pyeong-Shik
    • The Transactions of the Korean Institute of Electrical Engineers P
    • /
    • v.65 no.2
    • /
    • pp.108-113
    • /
    • 2016
  • Real-world problems usually exhibit nonlinear and multivariate characteristics, so it is difficult to establish concrete mathematical models for them. Data-driven modeling techniques are therefore commonly used in these cases. The most widely adopted are regression models and intelligent models such as neural networks. Regression models have the drawback of lower performance when the relationship between input and output data is strongly nonlinear. Intelligent models have been shown to be superior to linear models because they can effectively estimate the desired output for both linear and nonlinear problems. This paper proposes a method for modeling daily photovoltaic power systems using modular networks based on the ELM (Extreme Learning Machine). Rather than a single model, the proposed method uses sub-models obtained by fuzzy clustering, each implemented by an ELM. To show the effectiveness of the proposed method, we performed various experiments on a dataset acquired during 2014 at a real plant.

A Novel Duty Cycle Based Cross Layer Model for Energy Efficient Routing in IWSN Based IoT Application

  • Singh, Ghanshyam;Joshi, Pallavi;Raghuvanshi, Ajay Singh
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.16 no.6
    • /
    • pp.1849-1876
    • /
    • 2022
  • The Wireless Sensor Network (WSN) is considered an integral part of the Internet of Things (IoT) for collecting real-time data on site, with many applications in Industry 4.0 and smart cities. The task of the nodes is to sense the environment and send the relevant information over the internet. Though this task seems very straightforward, it is vulnerable to issues such as energy consumption, delay, and throughput. To efficiently address these issues, this work develops a cross-layer model for the optimization between the MAC and Network layers of the OSI model for WSN. A high value of duty cycle for nodes is selected to control the delay and further enhance data transmission reliability. A node measurement prediction system based on the Kalman filter has been introduced, which uses a constraint based on the covariance value to decide the scheduling scheme of the nodes. The concept of duty cycle for node scheduling is employed with a greedy data forwarding scheme. The proposed Duty Cycle-based Greedy Routing (DCGR) scheme aims to minimize the hop count, thereby mitigating the energy consumption rate. The proposed algorithm is tested using a real-world wastewater treatment dataset. The proposed method marks an 87.5% increase in energy efficiency and a 61% reduction in network latency when validated against other similar pre-existing schemes.
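The covariance-based scheduling idea in this abstract can be illustrated with a scalar Kalman filter. The sketch below is an assumption-laden toy, not the DCGR scheme itself: it assumes a random-walk signal model, and it assumes a node stays asleep while the predicted error variance is below a threshold and wakes to measure once the variance exceeds it. The function name `kalman_schedule` and the values of `q`, `r`, and `p_threshold` are hypothetical.

```python
def kalman_schedule(measurements, q=0.01, r=0.1, p_threshold=0.05):
    """Scalar Kalman filter with a variance-triggered wake-up rule.

    q: assumed process noise variance, r: assumed measurement noise variance,
    p_threshold: assumed variance level above which the node wakes and measures.
    Returns (estimates, awake_flags), one entry per time step.
    """
    x, p = measurements[0], r          # initialise from the first reading
    estimates, awake = [x], [True]
    for z in measurements[1:]:
        p = p + q                      # predict: random-walk model, variance grows
        if p > p_threshold:            # uncertainty too high: wake up and use z
            gain = p / (p + r)
            x = x + gain * (z - x)
            p = (1.0 - gain) * p
            awake.append(True)
        else:                          # prediction is trusted: stay asleep, skip z
            awake.append(False)
        estimates.append(x)
    return estimates, awake

estimates, awake = kalman_schedule([1.0] * 6)
```

With these illustrative parameters the node skips some time steps once the filter has settled, which is the energy-saving effect the duty-cycle idea relies on. A lower threshold means more frequent wake-ups and lower estimation error.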

Effects of Intellectual Property Rights Protection on Services Export Diversification in Developing Countries

  • SENA KIMM GNANGNON
    • KDI Journal of Economic Policy
    • /
    • v.46 no.1
    • /
    • pp.53-89
    • /
    • 2024
  • The effects of the betterment of enforced intellectual property rights (IPRs) provisions on services export diversification are investigated. The analysis used an unbalanced panel dataset of 76 developing countries over the period of 1970-2014. The empirical analysis is based on the feasible generalized least squares estimator. It suggests that the implementation of weaker IPR protection fosters services export diversification in less developed countries (i.e., those whose real per capita incomes are less than US$ 1458.60), including those with a low level of export product upgrading. Conversely, in relatively advanced developing countries (countries whose real per capita income exceeds US$ 3356.80), including those with high levels of export product upgrading, the implementation of stronger IPR laws induces greater services export diversification. Finally, the analysis revealed the existence of a non-linear relationship between IPR protection and services export diversification. The implementation of stronger intellectual property laws spurs services export diversification in countries with a high degree of IPR protection, especially when IPR protection exceeds a certain level, recorded here as having a score of 1.197. In contrast, in countries with weaker IPR protection, in particular those with IPR protection levels that score less than 0.915, it is rather the implementation of weaker intellectual property laws that promotes services export diversification.

Violent Behavior Detection using Motion Analysis in Surveillance Video (감시 영상에서 움직임 정보 분석을 통한 폭력행위 검출)

  • Kang, Joohyung;Kwak, Sooyeong
    • Journal of Broadcast Engineering
    • /
    • v.20 no.3
    • /
    • pp.430-439
    • /
    • 2015
  • The demand for violence detection techniques that use video analysis to help prevent crimes has been increasing recently. Many researchers have studied vision-based behavior recognition, but violent behavior analysis techniques usually focus on violent scenes in television and movie content. Many previously published methods used both color (e.g., skin and blood) and motion information to detect violent scenes, because violence in movies usually involves blood. However, color information (e.g., blood scenes) may not be a useful cue for violence detection in surveillance videos, because such scenes are rarely captured in real-world situations. In this paper, we propose a method of violent behavior detection in surveillance videos that uses motion vectors, such as flow vector magnitudes and changes in direction, without the color information. To evaluate the proposed algorithm, we test it on both the USI dataset and various real-world surveillance videos from YouTube.

Privacy-Preserving Estimation of Users' Density Distribution in Location-based Services through Geo-indistinguishability

  • Song, Seung Min;Kim, Jong Wook
    • Journal of the Korea Society of Computer and Information
    • /
    • v.27 no.12
    • /
    • pp.161-169
    • /
    • 2022
  • With the development of mobile devices and global positioning systems, various location-based services (LBS) can be utilized, which collect users' location information and provide services based on it. In this process, there is a risk that sensitive personal information will be exposed, and thus Geo-indistinguishability (Geo-Ind), which protects the location privacy of LBS users by perturbing their true locations, is widely used. However, owing to the data perturbation mechanism of Geo-Ind, it is hard to accurately obtain the density distribution of LBS users from the collection of perturbed location data. Thus, in this paper, we aim to develop a novel method that effectively computes the user density distribution from a perturbed location dataset collected under Geo-Ind. In particular, the proposed method leverages the Expectation-Maximization (EM) algorithm to precisely estimate the density distribution of LBS users from the perturbed location dataset. Experimental results on real-world datasets show that our proposed method achieves significantly better performance than a baseline approach.
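The EM step mentioned in this abstract can be sketched on a discretized map. This is a simplified illustration under stated assumptions, not the paper's algorithm: locations are assumed to be binned into grid cells, and the perturbation mechanism is assumed to be summarized by a known matrix `channel`, where `channel[x][y]` is the probability that a user in true cell x reports cell y. The function name `em_density` is hypothetical.

```python
def em_density(observed_counts, channel, iterations=200):
    """Estimate true cell probabilities from counts of perturbed location reports.

    observed_counts[y]: number of reports that fell in cell y.
    channel[x][y]: assumed-known probability that true cell x is reported as cell y.
    """
    n_cells = len(channel)
    total = sum(observed_counts)
    pi = [1.0 / n_cells] * n_cells            # start from a uniform density
    for _ in range(iterations):
        expected = [0.0] * n_cells
        for y, count in enumerate(observed_counts):
            if count == 0:
                continue
            # E-step: posterior over the true cell given a report in cell y.
            denom = sum(pi[x] * channel[x][y] for x in range(n_cells))
            for x in range(n_cells):
                expected[x] += count * pi[x] * channel[x][y] / denom
        # M-step: normalise expected true-cell counts into probabilities.
        pi = [v / total for v in expected]
    return pi

# Toy example: two cells, 20% chance that a report lands in the wrong cell.
channel = [[0.8, 0.2], [0.2, 0.8]]
observed = [740, 260]                          # consistent with a true density of (0.9, 0.1)
pi_hat = em_density(observed, channel)
```

Reading the density directly from the perturbed counts would give (0.74, 0.26) in this toy case; the EM iterations undo the known perturbation and move the estimate back toward (0.9, 0.1).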

Performance Assessment of Machine Learning and Deep Learning in Regional Name Identification and Classification in Scientific Documents (머신러닝을 이용한 과학기술 문헌에서의 지역명 식별과 분류방법에 대한 성능 평가)

  • Jung-Woo Lee;Oh-Jin Kwon
    • The Journal of the Korea institute of electronic communication sciences
    • /
    • v.19 no.2
    • /
    • pp.389-396
    • /
    • 2024
  • Generative AI has recently been utilized across all fields, achieving expert-level advancements in deep data analysis. However, identifying regional names in scientific literature remains a challenge due to insufficient training data and limited AI application. This study developed a standardized dataset for effectively classifying regional names using address data from Korean institution-affiliated authors listed in the Web of Science. It tested and evaluated the applicability of machine learning and deep learning models in real-world problems. The BERT model showed superior performance, with a precision of 98.41%, recall of 98.2%, and F1 score of 98.31% for metropolitan areas, and a precision of 91.79%, recall of 88.32%, and F1 score of 89.54% for city classifications. These findings offer a valuable data foundation for future research on regional R&D status, researcher mobility, collaboration status, and so on.

Prediction Model of Real Estate ROI with the LSTM Model based on AI and Bigdata

  • Lee, Jeong-hyun;Kim, Hoo-bin;Shim, Gyo-eon
    • International journal of advanced smart convergence
    • /
    • v.11 no.1
    • /
    • pp.19-27
    • /
    • 2022
  • Across the world, 'housing' comprises a significant portion of wealth and assets. For this reason, fluctuations in real estate prices are highly sensitive issues to individual households. In Korea, housing prices have steadily increased over the years, and thus many Koreans view the real estate market as an effective channel for their investments. However, if one purchases a real estate property for the purpose of investing, then there are several risks involved when prices begin to fluctuate. The purpose of this study is to design a real estate price 'return rate' prediction model to help mitigate the risks involved with real estate investments and promote reasonable real estate purchases. Various approaches are explored to develop a model capable of predicting real estate prices based on an understanding of the immovability of the real estate market. This study employs the LSTM method, which is based on artificial intelligence and deep learning, to predict real estate prices and validate the model. LSTM networks are based on recurrent neural networks (RNN) but add cell states (which act as a type of conveyor belt) to the hidden states. LSTM networks are able to obtain cell states and hidden states in a recursive manner. Data on the actual trading prices of apartments in autonomous districts between January 2006 and December 2019 are collected from the Actual Trading Price Disclosure System of the Ministry of Land, Infrastructure and Transport (MOLIT). Additionally, basic data on apartments and commercial buildings are collected from the Public Data Portal and Seoul Metropolitan Government's data portal. The collected actual trading price data are scaled to monthly average trading amounts, and each data entry is pre-processed according to address to produce 168 data entries. An LSTM model for return rate prediction is prepared based on a time series dataset where the training period is set as April 2015~August 2017 (29 months), the validation period is set as September 2017~September 2018 (13 months), and the test period is set as December 2018~December 2019 (13 months). In the return rate prediction study, the final model, prepared after collecting the time series data, achieved a prediction similarity level of almost 76%. All in all, the results demonstrate the reliability of the LSTM-based model for return rate prediction.
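The recursive update of cell state and hidden state described in this abstract can be written out for a single LSTM unit. This is a generic textbook sketch with scalar weights, not the study's trained model; the function name `lstm_step` and the weight dictionary layout are hypothetical.

```python
import math

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def lstm_step(x, h_prev, c_prev, w):
    """One time step of a single-unit LSTM.

    w maps each gate name to a tuple (w_x, w_h, bias).
    Returns the new hidden state h and the new cell state c.
    """
    def pre(gate):
        w_x, w_h, b = w[gate]
        return w_x * x + w_h * h_prev + b

    f = sigmoid(pre("forget"))         # how much of the old cell state to keep
    i = sigmoid(pre("input"))          # how much of the candidate to write
    o = sigmoid(pre("output"))         # how much of the cell state to expose
    g = math.tanh(pre("candidate"))    # candidate value for the cell state
    c = f * c_prev + i * g             # cell state: the "conveyor belt"
    h = o * math.tanh(c)               # hidden state passed to the next step
    return h, c

# With all-zero weights every gate is 0.5 and the candidate is 0.
zero = {name: (0.0, 0.0, 0.0) for name in ("forget", "input", "output", "candidate")}
h, c = lstm_step(x=1.0, h_prev=0.0, c_prev=2.0, w=zero)
```

Calling `lstm_step` repeatedly over a sequence of monthly values, feeding each returned `h` and `c` into the next call, is the recursion the abstract refers to.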