• Title/Summary/Keyword: Label Encoding


Fast XML Encoding Scheme Using Reuse of Deleted Nodes (삭제된 노드의 재사용을 이용한 Fast XML 인코딩 기법)

  • Hye-Kyeong Ko
    • The Journal of the Convergence on Culture Technology / v.9 no.3 / pp.835-843 / 2023
  • Given the structure of XML data, path and tree pattern matching algorithms play an important role in XML query processing. To make relationships between nodes easy to determine, nodes in an XML tree are typically labeled so that the ancestor-descendant relationship between two nodes can be established quickly. However, existing techniques have the disadvantage that existing nodes must be re-labeled or certain values recalculated when insertions occur during sequential updates. The cost of updating labels in current labeling techniques is therefore very high. In this paper, we propose a new labeling technique called Fast XML encoding, which supports updates of order-sensitive XML documents without re-labeling or recalculation. It also controls label length by reusing deleted labels at the same location in the XML tree. The proposed reuse algorithm can reduce label length when all deleted labels are inserted in the same location. Experimental results show that the proposed technique handles order-sensitive queries and updates efficiently.

Variation for Mental Health of Children of Marginalized Classes through Exercise Therapy using Deep Learning (딥러닝을 이용한 소외계층 아동의 스포츠 재활치료를 통한 정신 건강에 대한 변화)

  • Kim, Myung-Mi
    • The Journal of the Korea institute of electronic communication sciences / v.15 no.4 / pp.725-732 / 2020
  • This paper uses the following variables, each rated 0-9, recorded during physical activity in an exercise learning program for children of marginalized classes: how well the child follows instructions, how long the child takes to make a decision, and lethargy. The data are classified by 'gender', 'physical education classroom', and age ('upper, middle and lower'), and changes in ego-resiliency and self-control through sports rehabilitation therapy are observed to identify changes in mental health. To this end, the acquired data were merged, and the effect of large and small numeric values was removed using a label encoder and one-hot encoding. To evaluate the performance of the MLP, SVM, decision tree, RNN, and LSTM algorithms, the data were split into 75% training and 25% test sets; each algorithm was trained on the training data and its accuracy was measured on the test data. According to the measurements, LSTM was the most effective for gender, MLP and LSTM for physical education classroom, and SVM for age.
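
The preprocessing steps named in this abstract (label encoding, one-hot encoding, and a 75%/25% train/test split) can be sketched in plain Python. This is only an illustration under assumptions: the helper names `label_encode`, `one_hot`, and `train_test_split`, the sample `age_band` values, and the fixed shuffle seed are hypothetical and are not taken from the paper.

```python
import random

def label_encode(values):
    """Map each distinct category to an integer code, using sorted order."""
    classes = sorted(set(values))
    index = {c: i for i, c in enumerate(classes)}
    return [index[v] for v in values], classes

def one_hot(codes, n_classes):
    """Turn integer codes into 0/1 indicator vectors of length n_classes."""
    return [[1 if code == k else 0 for k in range(n_classes)] for code in codes]

def train_test_split(rows, test_ratio=0.25, seed=0):
    """Shuffle the rows and hold out test_ratio of them as test data."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    n_test = round(len(rows) * test_ratio)
    return rows[n_test:], rows[:n_test]

# Hypothetical sample of the 'upper, middle and lower' age variable.
age_band = ["upper", "lower", "middle", "upper"]
codes, classes = label_encode(age_band)
vectors = one_hot(codes, len(classes))
train, test = train_test_split(range(8))
```

Label encoding alone imposes an arbitrary numeric order on the categories; one-hot encoding removes that order at the cost of one column per category.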

An Improved Method of the Prime Number Labeling Scheme for Dynamic XML Documents (빈번히 갱신되는 XML 문서에 대한 프라임 넘버 레이블링 기법)

  • Yoo, Ji-You;Yoo, Sang-Won;Kim, Hyoung-Joo
    • Journal of KIISE:Databases / v.33 no.1 / pp.129-137 / 2006
  • An XML labeling scheme is an efficient encoding method for determining the ancestor-descendant relationships of elements and the order of siblings. Recently, many dynamic XML documents have appeared in Web Services and AXML (Active XML), so they need to be managed with a dynamic XML labeling scheme. The prime number labeling scheme is a representative scheme that supports dynamic XML documents. It determines the ancestor-descendant relationship between two elements using the properties of prime numbers. When a new element is inserted into an XML document under this scheme, assigning a label to the new element does not change the labels of existing nodes, which is an advantage. However, the scheme requires additional expensive operations and a data structure to maintain the order of siblings. In this paper, we propose an order number sharing method and algorithms categorized by the insertion position of the new node. They greatly reduce the sibling order maintenance cost of the existing method.
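
The core idea of prime number labeling can be sketched as follows. This is a minimal illustration of the general scheme, not the improved method proposed in the paper: each node receives its own unused prime, its label is that prime multiplied by the parent's label, and an ancestor's label then divides the labels of all its descendants. The class name `PrimeLabeler` and the node names are hypothetical, and sibling order (the topic of the paper) is deliberately left out.

```python
from itertools import count

def primes():
    """Yield 2, 3, 5, ... by trial division (adequate for a small sketch)."""
    found = []
    for n in count(2):
        if all(n % p for p in found):
            found.append(n)
            yield n

class PrimeLabeler:
    """Assigns label = parent's label * a fresh prime to every new node."""

    def __init__(self):
        self._primes = primes()
        self.labels = {}

    def add(self, name, parent=None):
        own_prime = next(self._primes)
        parent_label = self.labels[parent] if parent is not None else 1
        self.labels[name] = parent_label * own_prime
        return self.labels[name]

    def is_ancestor(self, ancestor, descendant):
        """True if the ancestor's label divides the descendant's label."""
        if ancestor == descendant:
            return False
        return self.labels[descendant] % self.labels[ancestor] == 0

tree = PrimeLabeler()
tree.add("root")
tree.add("b", parent="root")
tree.add("c", parent="root")
tree.add("d", parent="b")
before = dict(tree.labels)
tree.add("e", parent="c")  # a later insertion leaves existing labels untouched
```

Labels grow quickly because they are products of primes, which is one reason practical schemes need extra care with label size.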

Feature Selection with Ensemble Learning for Prostate Cancer Prediction from Gene Expression

  • Abass, Yusuf Aleshinloye;Adeshina, Steve A.
    • International Journal of Computer Science & Network Security / v.21 no.12spc / pp.526-538 / 2021
  • Machine and deep learning-based models are emerging techniques that are being used to address prediction problems in biomedical data analysis. DNA sequence prediction is a critical problem that has attracted a great deal of attention in the biomedical domain. Machine and deep learning-based models have been shown to provide more accurate results when compared to conventional regression-based models. The prediction of the gene sequence that leads to cancerous diseases, such as prostate cancer, is crucial. Identifying the most important features in a gene sequence is a challenging task. Extracting the components of the gene sequence that can provide an insight into the types of mutation in the gene is of great importance as it will lead to effective drug design and the promotion of the new concept of personalised medicine. In this work, we extracted the exons in the prostate gene sequences that were used in the experiment. We built a Deep Neural Network (DNN) and a Bi-directional Long Short-Term Memory (Bi-LSTM) model using a k-mer encoding for the DNA sequence and one-hot encoding for the class label. The models were evaluated using different classification metrics. Our experimental results show that the DNN model achieves a training accuracy of 99 percent and a validation accuracy of 96 percent. The Bi-LSTM model achieves a training accuracy of 95 percent and a validation accuracy of 91 percent.
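
A k-mer encoding of a DNA sequence, paired with a one-hot class label, might look like the sketch below. The abstract does not state the value of k, the vocabulary construction, or the number of classes, so `k=3`, the incrementally built vocabulary with ID 0 left free for padding, and the two-class label are all assumptions made for illustration; the function names are hypothetical.

```python
def kmers(seq, k=3):
    """Split a sequence into overlapping substrings of length k."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]

def encode_kmers(seq, vocab, k=3):
    """Map each k-mer to an integer ID, growing the vocabulary as needed.

    IDs start at 1 so that 0 stays free for padding (an assumption here).
    """
    return [vocab.setdefault(kmer, len(vocab) + 1) for kmer in kmers(seq, k)]

def one_hot_label(class_index, n_classes):
    """One-hot vector for a class label."""
    return [1 if i == class_index else 0 for i in range(n_classes)]

vocab = {}
first = encode_kmers("ATGCAT", vocab)    # ATG, TGC, GCA, CAT
second = encode_kmers("ATGATG", vocab)   # ATG, TGA, GAT, ATG
label = one_hot_label(1, 2)              # hypothetical two-class label
```

The resulting integer sequences would typically be padded to a common length before being fed to a DNN or Bi-LSTM.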

Design of an Effective Deep Learning-Based Non-Profiling Side-Channel Analysis Model (효과적인 딥러닝 기반 비프로파일링 부채널 분석 모델 설계방안)

  • Han, JaeSeung;Sim, Bo-Yeon;Lim, Han-Seop;Kim, Ju-Hwan;Han, Dong-Guk
    • Journal of the Korea Institute of Information Security & Cryptology / v.30 no.6 / pp.1291-1300 / 2020
  • Recently, a deep learning-based non-profiling side-channel analysis was proposed. Deep learning-based non-profiling analysis is a technique that trains a neural network model for every guessed key and then finds the correct secret key through differences in the training metrics. Because the performance of non-profiling analysis varies greatly depending on the design of the neural network training model, a criterion for correct model design is required. This paper describes the two types of loss functions and eight labeling methods used in training model design. It predicts the analysis performance of each labeling method in terms of non-profiling analysis and the power consumption model. Considering the characteristics of non-profiling analysis and assuming the HW (Hamming Weight) power consumption model, we predict that the learning model that applies the HW label without one-hot encoding and the Correlation Optimization (CO) loss will have the best analysis performance. We then performed actual analysis on three data sets covering the SubBytes operation in round 1 of AES-128. We verified our prediction by performing non-profiling analysis on two data sets with a total of 16 MLP-based models, which we describe.
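
The HW labeling mentioned above can be illustrated with a short sketch: for one guessed key byte, each trace is labeled with the Hamming weight of the AES SubBytes output for its plaintext byte. The S-box is generated from its standard definition (inverse in GF(2^8) followed by the affine transform) rather than copied from the paper, and the function names and sample plaintext bytes are hypothetical. The paper's other labeling methods and its CO loss are not reproduced here.

```python
def gf_mul(a, b):
    """Multiply in GF(2^8) modulo the AES polynomial x^8 + x^4 + x^3 + x + 1."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result

def gf_inv(a):
    """Multiplicative inverse computed as a^254; 0 maps to 0 by convention."""
    result = 1
    for _ in range(254):
        result = gf_mul(result, a)
    return result if a else 0

def sbox_entry(x):
    """AES S-box: field inverse, then the affine transform with constant 0x63."""
    b = gf_inv(x)
    out = 0x63
    for shift in range(5):
        out ^= ((b << shift) | (b >> (8 - shift))) & 0xFF
    return out

SBOX = [sbox_entry(x) for x in range(256)]

def hamming_weight(x):
    """Number of set bits in x."""
    return bin(x).count("1")

def hw_labels(plaintext_bytes, key_guess):
    """One integer label in 0..8 per trace, for a single guessed key byte."""
    return [hamming_weight(SBOX[p ^ key_guess]) for p in plaintext_bytes]

labels = hw_labels([0x00, 0x01, 0x53], key_guess=0x00)
```

Used without one-hot encoding, the label is the single integer in the range 0 to 8; a one-hot version would instead be a 9-element indicator vector.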

High Compression Image Coding with BTC Parameters (BTC 파라메타를 이용한 고압축 영상부호화)

  • Shim, Young-Serk;Lee, Hark-Jun
    • Journal of the Korean Institute of Telematics and Electronics / v.26 no.2 / pp.140-146 / 1989
  • Efficient quantization and encoding of the BTC (Block Truncation Coding) parameters $\{(Y_{\alpha}, Y_{\beta}), P_{\beta/\beta}\}$ are investigated. In our algorithm, $4 \times 4$ blocks are classified as flat or edge blocks. An edge block is represented by two approximation levels $Y_{\alpha}, Y_{\beta}$ together with a label plane $P_{\beta/\beta}$, while a flat block is represented by a single approximation level $Y$. The approximation levels $Y$, $Y_{\alpha}$ and $Y_{\beta}$ are encoded by specially designed predictive quantization, and encoding of the label plane $P_{\beta/\beta}$ is attempted using 32 stored reference planes. The performance of the proposed scheme is comparable to that of much more complex transform coding in terms of SNR, although the representation of small slopes in the background requires further study.
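
The flat/edge representation described here can be sketched with a simple block truncation coder. The sketch uses the absolute-moment variant of BTC (the two levels are the means of the pixels below and at or above the block mean), which is a swapped-in simplification; the paper's predictive quantization and its 32 reference planes are not reproduced. The flatness test (pixel range at most `flat_threshold`) and all names are assumptions.

```python
def encode_block(block, flat_threshold=8):
    """Encode one block as ('flat', Y) or ('edge', low, high, label_plane)."""
    pixels = [p for row in block for p in row]
    mean = sum(pixels) / len(pixels)
    # Hypothetical flatness criterion: small spread between darkest and brightest pixel.
    if max(pixels) - min(pixels) <= flat_threshold:
        return ("flat", round(mean))
    plane = [[1 if p >= mean else 0 for p in row] for row in block]
    high = [p for p in pixels if p >= mean]
    low = [p for p in pixels if p < mean]
    return ("edge", round(sum(low) / len(low)), round(sum(high) / len(high)), plane)

def decode_block(code, size=4):
    """Rebuild the block from the approximation level(s) and the label plane."""
    if code[0] == "flat":
        return [[code[1]] * size for _ in range(size)]
    _, low, high, plane = code
    return [[high if bit else low for bit in row] for row in plane]

edge_block = [[10, 10, 200, 200] for _ in range(4)]
flat_block = [[50, 50, 50, 50] for _ in range(4)]
edge_code = encode_block(edge_block)
flat_code = encode_block(flat_block)
```

In this sketch the two levels of an edge block play the role of $Y_{\alpha}$ and $Y_{\beta}$, and the 0/1 plane plays the role of the label plane; which level corresponds to which symbol is not specified in the abstract.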


Computer Codes for Korean Sounds: K-SAMPA

  • Kim, Jong-mi
    • The Journal of the Acoustical Society of Korea / v.20 no.4E / pp.3-16 / 2001
  • An ASCII encoding of Korean has been developed for extended phonetic transcription of the Speech Assessment Methods Phonetic Alphabet (SAMPA). SAMPA is a machine-readable phonetic alphabet used for multilingual computing. It has been under development since 1987 and has been extended to more than twenty languages. The motivation for creating Korean SAMPA (K-SAMPA) is to label Korean speech for a multilingual corpus or to transcribe the native language (L1) interfered pronunciation of a second language learner for bilingual education. Korean SAMPA represents each Korean allophone with a particular SAMPA symbol. Sounds that closely resemble it are represented by the same symbol, regardless of the language they are uttered in. Each of its symbols represents a speech sound that is spectrally and temporally so distinct as to be perceptually different when the components are heard in isolation. Each type of sound has a separate IPA-like designation. Korean SAMPA is superior to other transcription systems with similar objectives. It describes the cross-linguistic sound quality of Korean better than the official Romanization system, proclaimed by the Korean government in July 2000, because it uses an internationally shared phonetic alphabet. It is also phonetically more accurate than the official Romanization in that it dispenses with orthographic adjustments. It is also more convenient for computing than the International Phonetic Alphabet (IPA) because it consists of the symbols on a standard keyboard. This paper demonstrates how Korean SAMPA can express allophonic details and prosodic features by adopting the transcription conventions of the extended SAMPA (X-SAMPA) and the prosodic SAMPA (SAMPROSA).


Correcting Misclassified Image Features with Convolutional Coding

  • Mun, Ye-Ji;Kim, Nayoung;Lee, Jieun;Kang, Je-Won
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2018.11a / pp.11-14 / 2018
  • The aim of this study is to rectify misclassified image features and enhance the performance of image classification tasks by incorporating a channel-coding technique widely used in telecommunication. Specifically, the proposed algorithm combines the error-correcting mechanism of convolutional coding with convolutional neural networks (CNNs), which are state-of-the-art image classifiers. We develop an encoder and a decoder to exploit the error-correcting capability of convolutional coding. In the encoder, the label values of the image data are converted to convolutional codes that are used as target outputs of the CNN, and the network is trained to minimize the Euclidean distance between the target output codes and the actual output codes. To correct misclassified features, the outputs of the network are decoded through the trellis structure with the Viterbi algorithm before the final prediction is determined. This paper demonstrates that the proposed architecture improves the performance of neural networks compared to the traditional one-hot encoding method.
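
The idea of replacing one-hot targets with convolutional codewords can be sketched as follows. The abstract does not give the code parameters, so the rate-1/2, constraint-length-3 encoder with generators 7 and 5 (octal), the 4-bit binary label representation, and the ten classes are assumptions. Decoding is done by a brute-force nearest-codeword search over the class labels as a simple stand-in for the trellis-based Viterbi decoder the paper uses; all names are hypothetical.

```python
def conv_encode(bits):
    """Rate-1/2 convolutional encoder, generators 111 and 101, zero initial state."""
    s1 = s2 = 0  # the two most recent input bits
    out = []
    for u in bits:
        out += [u ^ s1 ^ s2, u ^ s2]
        s1, s2 = u, s1
    return out

def label_to_code(label, n_bits=4):
    """Write the label as n_bits binary digits (MSB first) and encode them."""
    bits = [(label >> i) & 1 for i in reversed(range(n_bits))]
    return conv_encode(bits)

def decode(output, n_classes=10, n_bits=4):
    """Pick the class whose codeword is closest in squared Euclidean distance."""
    def distance(label):
        code = label_to_code(label, n_bits)
        return sum((o - c) ** 2 for o, c in zip(output, code))
    return min(range(n_classes), key=distance)

target = label_to_code(9)  # target output vector for class 9
noisy_output = [0.9, 0.8, 0.7, 0.2, 0.6, 0.9, 0.8, 0.7]  # hypothetical network output
predicted = decode(noisy_output)
```

Because codewords of different classes differ in several positions, a network output with a few unreliable components can still be mapped back to the intended class.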


A Study on the Considerations for Constructing RDA Application Profiles (RDA 응용 프로파일 구축시 고려사항에 관한 연구)

  • Lee, Mihwa
    • Journal of the Korean BIBLIA Society for library and Information Science / v.30 no.4 / pp.29-50 / 2019
  • This study suggests considerations for application profiles of the 2019 revised RDA, which was revised to reflect the LRM and linked data, based on literature reviews and case studies. First, new elements were recommended as contents of application profiles: inverse element, broader element, narrow element, domain, range, alternate label name, mapping to MARC, mapping to BIBFRAME, and RDA description examples. These are in addition to the elements already suggested by previous research: element name, element ID, element URL, description method, vocabulary encoding scheme, data provenance element, data provenance value, and notes. Second, representing RDA rules as flow charts and application profiles, derived from an analysis of the rules, was suggested so that the rules can be applied to and structured in RDA application profiles, since every element has four types of description method, many conditions, and options. Third, mapping RDA to BIBFRAME was suggested for RDA application profiles because RDA and BIBFRAME are related as content standard and encoding format, and mapping between BIBFRAME and RDA is necessary for programming BIBFRAME editors that use RDA as the content standard. This study will contribute to finding methods for constructing RDA application profiles and BIBFRAME application profiles with RDA as the content standard.

A Supervised Feature Selection Method for Malicious Intrusions Detection in IoT Based on Genetic Algorithm

  • Saman Iftikhar;Daniah Al-Madani;Saima Abdullah;Ammar Saeed;Kiran Fatima
    • International Journal of Computer Science & Network Security / v.23 no.3 / pp.49-56 / 2023
  • Machine learning methods, widely applied in the Internet of Things (IoT) field, have been successful owing to improvements in computer processing power. They offer an effective way of detecting malicious intrusions in IoT because of their high-level feature extraction capabilities. In this paper, we propose a novel feature selection method for malicious intrusion detection in IoT that uses an evolutionary technique, the Genetic Algorithm (GA), together with Machine Learning (ML) algorithms. The proposed model classifies the BoT-IoT dataset, and its quality is evaluated by training and testing with classifiers. The data are reduced and several preprocessing steps are applied: unnecessary information removal, null value checking, label encoding, standard scaling, and data balancing. GA is applied to the preprocessed data to select the most relevant features and maintain model optimization. The features selected by GA are given to ML classifiers such as Logistic Regression (LR) and Support Vector Machine (SVM), and the results are evaluated using performance measures including recall, precision, and F1-score. Two sets of experiments are conducted, and it is concluded that hyperparameter tuning has a significant effect on the performance of both ML classifiers. SVM remained the best model in both cases, and overall results improved.
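
The evaluation measures named in this abstract follow standard definitions and can be computed as in the sketch below. The function name and the sample predictions are hypothetical and do not come from the paper's experiments.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Precision, recall and F1-score for one class treated as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1

# Hypothetical labels: 1 = malicious traffic, 0 = normal traffic.
y_true = [1, 1, 1, 0, 0]
y_pred = [1, 1, 0, 1, 0]
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
```

On imbalanced intrusion data these measures are usually more informative than accuracy, which is one reason data balancing appears among the preprocessing steps.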