• Title/Summary/Keyword: Research dataset

EMOS: Enhanced moving object detection and classification via sensor fusion and noise filtering

  • Dongjin Lee;Seung-Jun Han;Kyoung-Wook Min;Jungdan Choi;Cheong Hee Park
    • ETRI Journal / v.45 no.5 / pp.847-861 / 2023
  • Dynamic object detection is essential for ensuring safe and reliable autonomous driving. Recently, light detection and ranging (LiDAR)-based object detection has been introduced and has shown excellent performance on various benchmarks. Although LiDAR sensors estimate distance with excellent accuracy, they lack texture and color information and have a lower resolution than conventional cameras. In addition, owing to the domain gap phenomenon, performance degrades when a LiDAR-based object detection model is applied to different driving environments or when sensors from different LiDAR manufacturers are used. To address these issues, a sensor-fusion-based object detection and classification method is proposed. The proposed method operates in real time, making it suitable for integration into autonomous vehicles. It performs well on our custom dataset and on publicly available datasets, demonstrating its effectiveness in real-world road environments. In addition, we will make available a novel three-dimensional moving object detection dataset called ETRI 3D MOD.

Prediction of Diabetic Nephropathy from Diabetes Dataset Using Feature Selection Methods and SVM Learning (특징점 선택방법과 SVM 학습법을 이용한 당뇨병 데이터에서의 당뇨병성 신장합병증의 예측)

  • Cho, Baek-Hwan;Lee, Jong-Shill;Chee, Young-Joan;Kim, Kwang-Won;Kim, In-Young;Kim, Sun-I.
    • Journal of Biomedical Engineering Research / v.28 no.3 / pp.355-362 / 2007
  • Diabetes mellitus can cause devastating complications, which often result in disability and death, and diabetic nephropathy is a leading cause of death in people with diabetes. In this study, we tried to predict the onset of diabetic nephropathy from an irregular and unbalanced diabetic dataset. We collected clinical data from 292 patients with type 2 diabetes and performed preprocessing to extract 184 features to resolve the irregularity of the dataset. We compared several feature selection methods, such as ReliefF and sensitivity analysis, to remove redundant features and improve classification performance. We also compared support vector machine learning methods, such as equal-cost learning and cost-sensitive learning, to tackle the class imbalance in the dataset. The best classifier, using 39 selected features, achieved an area under the receiver operating characteristic curve (AUC) of 0.969, which indicates that our method can predict diabetic nephropathy with high generalization performance from an irregular and unbalanced dataset and that physicians can benefit from it when predicting diabetic nephropathy.
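
The area-under-the-curve figure and the cost-sensitive learning mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' code: the function names `roc_auc` and `balanced_class_weights` are hypothetical, the labels and scores are made up, and the inverse-frequency weighting is one common heuristic for cost-sensitive training that the abstract does not confirm was used.

```python
from collections import Counter


def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney form).

    labels: iterable of 0/1 class labels (1 = positive class).
    scores: classifier scores, higher meaning "more likely positive".
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


def balanced_class_weights(labels):
    """Assumed heuristic: weight each class by n_samples / (n_classes * count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


if __name__ == "__main__":
    # Illustrative values only, not data from the study.
    y = [0, 0, 1, 1]
    s = [0.1, 0.4, 0.35, 0.8]
    print(roc_auc(y, s))                         # 0.75
    print(balanced_class_weights([0, 0, 0, 1]))  # minority class gets weight 2.0
```

With weights like these, errors on the rare class cost more during training, which is the general idea behind cost-sensitive learning on unbalanced data.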

Remote Sensing Image Classification for Land Cover Mapping in Developing Countries: A Novel Deep Learning Approach

  • Lynda, Nzurumike Obianuju;Nnanna, Nwojo Agwu;Boukar, Moussa Mahamat
    • International Journal of Computer Science & Network Security / v.22 no.2 / pp.214-222 / 2022
  • Convolutional neural networks (CNNs) are a category of deep learning networks that have proven very effective in computer vision tasks such as image classification. Nevertheless, they have seen little use for remote sensing image classification in developing countries, mainly because of the scarcity of training data. Recently, transfer learning has been used successfully to develop state-of-the-art models for remote sensing (RS) image classification tasks using training and testing data from well-known RS data repositories. However, the ability of such models to classify RS test data from a different dataset has not been sufficiently investigated. In this paper, we propose a deep CNN model that can classify RS test data from a dataset different from the training dataset. To achieve this objective, we first re-trained a ResNet-50 model on EuroSAT, a large-scale RS dataset, to develop a base model, and then integrated augmentation and ensemble learning to improve its generalization ability. We further tested the ability of this model to classify a novel dataset (Nig_Images). The final classification results show that our model achieves 96% and 80% accuracy on the EuroSAT and Nig_Images test data, respectively. Adequate knowledge and use of this framework are expected to encourage research on, and the use of, deep CNNs for land cover mapping where training data are lacking, as is the case in developing countries.

Developing and Pre-Processing a Dataset using a Rhetorical Relation to Build a Question-Answering System based on an Unsupervised Learning Approach

  • Dutta, Ashit Kumar;Wahab sait, Abdul Rahaman;Keshta, Ismail Mohamed;Elhalles, Abheer
    • International Journal of Computer Science & Network Security / v.21 no.11 / pp.199-206 / 2021
  • Rhetorical relations between two text fragments are essential information that supports natural language processing applications, such as question-answering (QA) systems and automatic text summarization, in producing effective outcomes. A QA system helps users retrieve a meaningful response. Rhetorical-relation-based datasets are in demand for developing such systems to interpret and respond to user requests. Only a limited number of datasets are available for developing an Arabic QA system; thus, effective QA systems for the Arabic language are lacking. Recent research reveals that unsupervised learning can support a QA system in replying to users' queries. In this study, the researchers aim to develop a rhetorical-relation-based dataset for implementing unsupervised learning applications. A web crawler is developed to crawl Arabic content from the web. A discourse-annotated corpus is generated using rhetorical structure theory. A Naïve Bayes based QA system is developed to evaluate the performance of the datasets. The outcome shows that the performance of the QA system improves with the proposed dataset and that the system is able to answer user queries with an appropriate response. In addition, the results on fine-grained and coarse-grained relations reveal that the dataset is highly reliable.

A Study on Insider Threat Dataset Sharing Using Blockchain (블록체인을 활용한 내부자 유출위협 데이터 공유 연구)

  • Wonseok Yoon;Hangbae Chang
    • Journal of Platform Technology / v.11 no.2 / pp.15-25 / 2023
  • This study analyzes the limitations of the insider threat datasets used in insider threat detection research and, to overcome them, compares public insider threat data with solution-based insider threat data collected using a security solution. On this basis, we design a data format suitable for insider threat detection and implement a system that uses blockchain technology to share insider threat information safely between different institutions and companies. Currently, none of the insider threat datasets available to researchers is collected from actual events. Public datasets are synthetic data randomly created for research, and models trained on them have many limitations in real environments. To address these limitations, this study designs a private blockchain to secure information sharing between institutions of different affiliations and derives a method to increase reliability and maintain information integrity and consistency through agreement and verification among participants. The proposed method is expected to collect data through an outflow threat collector and to gather, through a blockchain-based sharing system, quality datasets of events that posed a real threat rather than synthetic data, thereby solving the current outflow threat dataset problem and contributing to future insider threat detection models.
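
The integrity property that a blockchain-based sharing system relies on can be sketched with a plain hash chain. This is a toy illustration under stated assumptions, not the private blockchain designed in the study: there is no consensus, networking, or access control, and the record fields and function names (`record_hash`, `append_record`, `verify_chain`) are hypothetical.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # assumed placeholder for the first block's predecessor


def record_hash(payload, prev_hash):
    """SHA-256 over a canonical JSON encoding of the payload and previous hash."""
    body = json.dumps({"payload": payload, "prev": prev_hash}, sort_keys=True)
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append_record(chain, payload):
    """Append a shared record, linking it to the hash of the previous block."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS_HASH
    chain.append({
        "payload": payload,
        "prev": prev_hash,
        "hash": record_hash(payload, prev_hash),
    })
    return chain


def verify_chain(chain):
    """Return True only if every block's link and hash are still consistent."""
    prev_hash = GENESIS_HASH
    for block in chain:
        if block["prev"] != prev_hash:
            return False
        if block["hash"] != record_hash(block["payload"], prev_hash):
            return False
        prev_hash = block["hash"]
    return True


if __name__ == "__main__":
    chain = []
    # Hypothetical insider-threat event records, invented for illustration.
    append_record(chain, {"org": "A", "event": "bulk file copy to USB"})
    append_record(chain, {"org": "B", "event": "off-hours database export"})
    print(verify_chain(chain))            # True
    chain[0]["payload"]["event"] = "edited"
    print(verify_chain(chain))            # False: tampering is detected
```

Because each block's hash covers the previous hash, altering any earlier record invalidates every later link, which is what lets participants check that shared data have not been modified.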

Arabic Words Extraction and Character Recognition from Picturesque Image Macros with Enhanced VGG-16 based Model Functionality Using Neural Networks

  • Ayed Ahmad Hamdan Al-Radaideh;Mohd Shafry bin Mohd Rahim;Wad Ghaban;Majdi Bsoul;Shahid Kamal;Naveed Abbas
    • KSII Transactions on Internet and Information Systems (TIIS) / v.17 no.7 / pp.1807-1822 / 2023
  • Innovation and rapidly increasing functionality in user-friendly smartphones have encouraged shutterbugs to capture picturesque image macros at work or while traveling. Formal signboards are placed with marketing objectives and are enriched with text to attract people. Extracting and recognizing text from natural images is an emerging research issue that needs consideration. Compared with conventional optical character recognition (OCR), the complex background, implicit noise, lighting, and orientation of these scenic text photos make the problem more difficult. Arabic scene text extraction and recognition adds a number of complications and difficulties. The method described in this paper uses a two-phase methodology to extract Arabic text, with word boundary awareness, from scenic images with varying text orientations. The first stage uses a convolutional autoencoder, and the second uses Arabic Character Segmentation (ACS), followed by traditional two-layer neural networks for recognition. This study also presents how an Arabic training and synthetic dataset can be created to exemplify superimposed text in different scene images. For this purpose, a dataset of 10K cropped images in which Arabic text was found was created for the detection phase, and a 127K Arabic character dataset was created for the recognition phase. The phase-1 labels were generated from an Arabic corpus of 15K quotes and sentences. The Arabic Word Awareness Region Detection (AWARD) approach, which is highly flexible in identifying complex Arabic scene text such as arbitrarily oriented, curved, or deformed text, is used to detect these texts. Our experiments show that the system achieves 91.8% word segmentation accuracy and 94.2% character recognition accuracy. We believe that, in the future, researchers will excel in the field of image processing by treating text images to improve them or reduce noise and by processing scene images in any language through enhancing the functionality of VGG-16 based models using neural networks.

Evaluation and validation of stem volume models for Quercus glauca in the subtropical forest of Jeju Island, Korea

  • Seo, Yeon Ok;Lumbres, Roscinto Ian C.;Won, Hyun Kyu;Jung, Sung Cheol;Lee, Young Jin
    • Journal of Ecology and Environment / v.38 no.4 / pp.485-491 / 2015
  • This study was conducted to develop stem volume models for the volume estimation of Quercus glauca Thunb. in Jeju Island, Republic of Korea. Furthermore, this study validated the developed stem volume models using an independent dataset. A total of 167 trees were measured for their diameter at breast height (DBH), total height, and stem volume using non-destructive sampling methods. Eighty percent of the dataset was used for the initial model development, while the remaining 20% was used for model validation. The performance of the different models was evaluated using the following fit statistics: standard error of estimate (SEE), mean bias, absolute mean deviation (AMD), coefficient of determination ($R^2$), and root mean square error (RMSE). The AMD of the five models for the different DBH classes was determined using the validation dataset. Model 5 ($V = aD^{b}H^{c}$), which estimates volume using DBH and total height as predictor variables, had the best SEE (0.02745), AMD (0.01538), $R^2$ (0.97603), and RMSE (0.02746). Overall, volume models with two independent variables (DBH and total height) performed better than those with only one (DBH) based on the model evaluation and validation. The models developed in this study can provide forest managers with accurate estimates of the stem volumes of Quercus glauca in the subtropical forests of Jeju Island, Korea.
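
The two-variable power model and the fit statistics named in this abstract can be sketched as follows. The coefficient and sample values are invented for illustration, the function names are hypothetical, and the formulas are commonly used textbook definitions (residual = observed minus predicted; SEE divides by n minus the number of parameters); the abstract does not state the exact definitions the authors used.

```python
import math


def stem_volume(dbh, height, a, b, c):
    """Power-form volume model V = a * D^b * H^c (D = DBH, H = total height)."""
    return a * dbh ** b * height ** c


def fit_statistics(observed, predicted, n_params):
    """Common fit statistics; definitions here are assumptions, not the paper's."""
    n = len(observed)
    residuals = [o - p for o, p in zip(observed, predicted)]
    sse = sum(r * r for r in residuals)                 # sum of squared errors
    mean_obs = sum(observed) / n
    sst = sum((o - mean_obs) ** 2 for o in observed)    # total sum of squares
    return {
        "SEE": math.sqrt(sse / (n - n_params)),
        "mean_bias": sum(residuals) / n,
        "AMD": sum(abs(r) for r in residuals) / n,
        "R2": 1.0 - sse / sst,
        "RMSE": math.sqrt(sse / n),
    }


if __name__ == "__main__":
    # Made-up observations and predictions, not measurements from the study.
    observed = [1.0, 2.0, 3.0, 4.0]
    predicted = [1.0, 2.0, 3.0, 5.0]
    for name, value in fit_statistics(observed, predicted, n_params=2).items():
        print(name, round(value, 4))
```

In a workflow like the one described, the coefficients would be estimated on the 80% development subset and the statistics recomputed on the held-out 20%.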

Improvement of Cloud Physics Parameterization in the KMA Earth System Model (기상청 지구시스템모델에서의 구름입자 수농도 모수화 방법 개선)

  • Lee, Hannah;Yum, Seong Soo;Shim, Sungbo;Boo, Kyung-On;Cho, ChunHo
    • Atmosphere / v.24 no.1 / pp.111-122 / 2014
  • In the Korea Meteorological Administration (KMA) earth system model (HadGEM2-AO), cloud drop number concentration is determined from aerosol number concentration according to the observed relationship between aerosol and cloud drop number concentrations. However, the observational dataset used for establishing the relationship was obtained from limited regions of the earth and therefore may not be representative of the entire earth. Here we reestablished the relationship between aerosol and cloud drop number concentrations based on a composite of observational datasets obtained from many different regions around the world, including the original dataset. The new relationship tends to give a lower cloud drop number concentration for aerosol number concentrations < 600 $cm^{-3}$ and a higher one for > 600 $cm^{-3}$. This new empirical relationship was applied to the KMA earth system model, and the historical run (1861~2005) was performed again. Here only the 30-year (1861~1890) averages from the runs with the new and the original relationships between aerosol and cloud drop number concentrations (newHIST and HIST, respectively) were compared. For this early period, aerosol number concentrations were generally lower than 600 $cm^{-3}$, and therefore cloud drop number concentrations were generally lower but cloud drop effective radii were larger for newHIST than for HIST. The results from the complete historical run with the new relationship are expected to show more significant differences from the original historical run.

Reevaluation of Photon Activation Yields of 11C, 13N, and 15O for the Estimation of Activity in Gas and Water Induced by the Operation of Electron Accelerators for Medical Use

  • Masumoto, Kazuyoshi;Matsumura, Hiroshi;Kosako, Kazuaki;Bessho, Kotaro;Toyoda, Akihiro
    • Journal of Radiation Protection and Research / v.41 no.3 / pp.286-290 / 2016
  • Background: Activation of air and water in electron linear accelerators for medical use has not been considered seriously. Under the new Japanese regulation for protection against radiation hazards, it has become indispensable to evaluate the activation of air and water in the accelerator room. Measuring the induced activity of air and water components in the electron energy region of 10 to 20 MeV is very difficult because this region is close to the threshold energy of photonuclear reactions. We therefore measured the photonuclear reaction yields of $^{13}N$, $^{15}O$, and $^{11}C$ using an electron linear accelerator. The data obtained were compared with data calculated by the Monte Carlo method. Materials and Methods: An activation experiment was performed at the Research Center for Electron Photon Science, Tohoku University. Highly purified $SiO_2$, $Si_3N_4$, and carbon disks were irradiated for 10 minutes with bremsstrahlung produced by a tungsten converter plate. Induced activity from C, N, and O was obtained. Monte Carlo calculations were performed using MCNP5 and AERY (DCHAIN-SP) to simulate the experimental conditions. Cross section data were taken from the KAERI dataset. Results and Discussion: In our experiment in a hospital, the calculated values did not agree with the experimental values. There are three possible causes of this difference: irradiation energy, calculation procedure, and cross section data. In this work, the calculated and experimental values agreed with each other within one order of magnitude. Here we used the KAERI dataset of photonuclear reactions instead of JENDL. Therefore, it was found that the photonuclear cross section data of light elements are most important for yield calculation in these reactions. Conclusion: Further improvement of the calculation, using the new dataset JENDL/PD-2015 and considering electron energy spread, will be needed.

Progression-Preserving Dimension Reduction for High-Dimensional Sensor Data Visualization

  • Yoon, Hyunjin;Shahabi, Cyrus;Winstein, Carolee J.;Jang, Jong-Hyun
    • ETRI Journal / v.35 no.5 / pp.911-914 / 2013
  • This letter presents Progression-Preserving Projection, a dimension reduction technique that finds a linear projection that maps a high-dimensional sensor dataset into a two- or three-dimensional subspace with a particularly useful property for visual exploration. As a demonstration of its effectiveness as a visual exploration and diagnostic means, we empirically evaluate the proposed technique over a dataset acquired from our own virtual-reality-enhanced ball-intercepting training system designed to promote the upper extremity movement skills of individuals recovering from stroke-related hemiparesis.