• Title/Summary/Keyword: Cosine

Search results: 1,078 items (processing time: 0.023 s)

Multi-Document Summarization Method of Reviews Using Word Embedding Clustering (워드 임베딩 클러스터링을 활용한 리뷰 다중문서 요약기법)

  • Lee, Pil Won;Hwang, Yun Young;Choi, Jong Seok;Shin, Young Tae
    • KIPS Transactions on Software and Data Engineering / Vol. 10, No. 11 / pp. 535-540 / 2021
  • Multi-document refers to a document that covers various topics rather than a single topic; online reviews are a typical example. There have been several attempts to summarize online reviews because of the vast amount of information they contain. However, summarizing reviews collectively with existing summary models loses the various topics that make up the reviews. Therefore, in this paper, we present a method to summarize reviews with minimal loss of topics. The proposed method classifies reviews through preprocessing, importance evaluation, embedding substitution using BERT, and embedding clustering. The classified sentences are then passed to a trained Transformer summary model to generate the final summary. The proposed model was evaluated against the existing summary model, a seq2seq model, using cosine similarity and the ROUGE score, and it produced higher-quality summaries than the existing summary model.
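
The ROUGE score mentioned in the abstract above can be illustrated with a minimal sketch of ROUGE-N, assuming simple whitespace tokenization and lowercasing. The function name `rouge_n` and the sample sentences are hypothetical and do not come from the paper, which does not state which ROUGE variant or tokenizer it used.

```python
from collections import Counter


def rouge_n(candidate: str, reference: str, n: int = 1) -> dict:
    """Return ROUGE-N precision, recall and F1 for whitespace-tokenized text."""

    def ngrams(text: str) -> Counter:
        tokens = text.lower().split()
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

    cand, ref = ngrams(candidate), ngrams(reference)
    # Counter intersection keeps the minimum count per n-gram (clipped matches).
    overlap = sum((cand & ref).values())
    precision = overlap / max(sum(cand.values()), 1)
    recall = overlap / max(sum(ref.values()), 1)
    f1 = 0.0 if overlap == 0 else 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical generated summary and reference summary.
scores = rouge_n("the battery lasts long", "the battery life is long")
print(scores)
```

Recall is the figure most often reported for summarization, since it measures how much of the reference the generated summary covers.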

Performance Improvement of Context-Sensitive Spelling Error Correction Techniques using Knowledge Graph Embedding of Korean WordNet (alias. KorLex) (한국어 어휘 의미망(alias. KorLex)의 지식 그래프 임베딩을 이용한 문맥의존 철자오류 교정 기법의 성능 향상)

  • Lee, Jung-Hun;Cho, Sanghyun;Kwon, Hyuk-Chul
    • Journal of Korea Multimedia Society / Vol. 25, No. 3 / pp. 493-501 / 2022
  • This paper studies context-sensitive spelling error correction. It uses the Korean WordNet (KorLex)[1], which defines the relationships between words as a graph, to improve the performance of the correction[2] based on the word vector information embedded in the correction technique. The Korean WordNet is based on WordNet[3], developed at Princeton University in the United States, and was additionally constructed for Korean. To learn a semantic network in graph form, or to use it together with learned vector information, it must be transformed into vector form through embedding learning. For this transformation, a limited number of nodes in the network-shaped graph are listed in a line, like a sentence, before being used as training input. One learning technique that uses this strategy is DeepWalk[4]. DeepWalk is used to learn the graph between words in the Korean WordNet. The graph embedding information is concatenated with the word vector information of the trained language model for correction, and the final correction word is determined by the cosine distance between the vectors. To test whether graph embedding information improves the performance of context-sensitive spelling error correction, confused word pairs were constructed and tested from the perspective of Word Sense Disambiguation (WSD). In the experimental results, the average correction performance over all confused word pairs improved by 2.24% compared with the baseline correction performance.
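
The concatenate-then-compare step described in the abstract above might look roughly like the following sketch. Everything here is an assumption for illustration: the vectors are toy values, the names `cosine_distance` and `pick_correction` are hypothetical, and the abstract does not say how the context vector is built, so it is simply given as input.

```python
import math


def cosine_distance(u, v):
    """Cosine distance, 1 - cos(u, v); returns 1.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 1.0
    return 1.0 - dot / (norm_u * norm_v)


def pick_correction(context_vec, candidates):
    """Pick the candidate whose concatenated vector is closest to the context.

    candidates maps a word to (language_model_vector, graph_embedding_vector);
    the two lists are concatenated before the distance is computed.
    """
    return min(
        candidates,
        key=lambda w: cosine_distance(context_vec, candidates[w][0] + candidates[w][1]),
    )


# Toy confused word pair with made-up 2-d word vectors and 2-d graph embeddings.
candidates = {
    "weather": ([0.9, 0.1], [0.8, 0.2]),
    "whether": ([0.1, 0.9], [0.2, 0.8]),
}
context_vec = [1.0, 0.0, 1.0, 0.0]  # assumed to live in the same 4-d space
best = pick_correction(context_vec, candidates)
print(best)
```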

Semi-automatic Data Fusion Method for Spatial Datasets (공간 정보를 가지는 데이터셋의 준자동 융합 기법)

  • Yoon, Jong-chan;Kim, Han-joon
    • The Journal of Society for e-Business Studies / Vol. 26, No. 4 / pp. 1-13 / 2021
  • With the development of big data-related technologies, it has become possible to process vast amounts of data that could not be processed before. Accordingly, establishing an automated data selection and fusion process for big data-based services has become a necessity, not an option. In this paper, we propose an automation technique that creates meaningful new information by fusing datasets containing spatial information. First, the given datasets are embedded using the Node2Vec model and the keywords of each dataset. Then, the semantic similarities among all datasets are obtained by calculating the cosine similarity between the embedding vectors of each pair of datasets. Next, a person intervenes to select candidate datasets with one or more spatial identifiers from among the dataset pairs with relatively high similarity, and fuses the dataset pairs to visualize them. Through this semi-automatic data fusion process, we show that significant fused information that cannot be obtained from a single dataset can be generated.
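
The pairwise similarity step in the abstract above can be sketched as follows, assuming each dataset has already been reduced to one embedding vector. The dataset names and vectors are invented for illustration, and `rank_dataset_pairs` is a hypothetical helper, not part of the paper or of Node2Vec.

```python
import math
from itertools import combinations


def cosine_similarity(u, v):
    """cos(u, v) = u.v / (|u| |v|); returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_dataset_pairs(embeddings):
    """Score every unordered pair of datasets and sort by similarity, highest first."""
    pairs = [
        (a, b, cosine_similarity(embeddings[a], embeddings[b]))
        for a, b in combinations(sorted(embeddings), 2)
    ]
    return sorted(pairs, key=lambda p: p[2], reverse=True)


# Made-up dataset embeddings (in the paper these would come from Node2Vec).
embeddings = {
    "bus_stops": [1.0, 0.2, 0.0],
    "subway_exits": [0.9, 0.3, 0.1],
    "rainfall": [0.0, 0.1, 1.0],
}
ranked = rank_dataset_pairs(embeddings)
for a, b, sim in ranked:
    print(f"{a} ~ {b}: {sim:.3f}")
```

The top of the ranked list is what a human reviewer would then inspect when choosing candidate pairs to fuse.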

The Research Trends and Keywords Modeling of Shoulder Rehabilitation using the Text-mining Technique (텍스트 마이닝 기법을 활용한 어깨 재활 연구분야 동향과 키워드 모델링)

  • Kim, Jun-hee;Jung, Sung-hoon;Hwang, Ui-jae
    • Journal of the Korean Society of Physical Medicine / Vol. 16, No. 2 / pp. 91-100 / 2021
  • PURPOSE: This study analyzed the trends and characteristics of shoulder rehabilitation research through keyword analysis, and their relationships were modeled using text-mining techniques. METHODS: Abstracts of 10,121 articles registered on MEDLINE via PubMed with 'shoulder' and 'rehabilitation' as keywords were collected using Python. By analyzing word frequencies, 10 keywords were selected in descending order of frequency. Word embedding was performed using the word2vec technique to analyze the similarity of words. In addition, the words were grouped and analyzed based on distance (cosine similarity) through the t-SNE technique. RESULTS: The number of studies related to shoulder rehabilitation is increasing year after year, and the keywords most frequently used in shoulder rehabilitation studies are 'patient', 'pain', and 'treatment'. The word2vec results showed that the words were highly correlated with 12 keywords from studies related to shoulder rehabilitation. Furthermore, through t-SNE, the keywords of the studies were divided into 5 groups. CONCLUSION: This study was the first to use text-mining techniques to model the keywords, and their relationships, that make up the abstracts of research on MEDLINE via PubMed related to 'shoulder' and 'rehabilitation'. The results of this study will help diversify the research topics of future shoulder rehabilitation studies.

Buckling failure of cylindrical ring structures subjected to coupled hydrostatic and hydrodynamic pressures

  • Ping, Liu;Feng, Yang Xin;Ngamkhanong, Chayut
    • Structural Monitoring and Maintenance / Vol. 8, No. 4 / pp. 345-360 / 2021
  • This paper presents an analytical approach to calculate the buckling load of cylindrical ring structures subjected to both hydrostatic and hydrodynamic pressures. Based on the law of conservation of energy and Timoshenko beam theory, a theoretical formula that can be used to evaluate the critical buckling pressure is first derived for simplified cylindrical ring structures. It is assumed that the hydrodynamic pressure can be treated as an equivalent hydrostatic pressure that follows a cosine function along the perimeter, while the thickness ratio is limited to 0.2. Note that this paper limits the deformed shape of the cylindrical ring structures to an elliptical shape. The proposed analytical solutions are then compared with numerical simulations. The critical pressure is evaluated considering two possible failure modes: ultimate failure and buckling failure. The results show that the proposed analytical solutions can correctly predict the critical pressure for both failure modes. However, they are not recommended for use when the hydrostatic pressure is low or medium (less than 80% of the critical pressure), as the analytical solutions underestimate the critical pressure, especially when the ultimate failure mode occurs. This implies that the proposed solutions can still be used properly when subsea vehicles are located in the deep parts of the ocean, where the hydrostatic pressure is high. The findings will further help improve the geometric design of subsea vehicles against both hydrostatic and hydrodynamic pressures to enhance their strength and stability when they move underwater. They will also help to control the speed of subsea vehicles, especially when they move close to the sea bottom, to prevent catastrophic failure.

Selective Shuffling for Hiding Hangul Messages in Steganography (스테가노그래피에서 한글 메시지 은닉을 위한 선택적 셔플링)

  • Ji, Seon-su
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology / Vol. 15, No. 3 / pp. 211-216 / 2022
  • Steganography technology protects the existence of hidden information by embedding a secret message in a specific location on the cover medium. Security and resistance are strengthened by applying various hybrid methods based on encryption and steganography. In particular, techniques that increase chaos and randomness are needed to improve security. In fact, applying a shuffling method based on the discrete cosine transform (DCT) and the least significant bit (LSB) is an area that needs further study. I propose a new approach that hides the bit information of Hangul messages by integrating a selective shuffling method, which adds complexity to message hiding, and applying the spatial domain technique to steganography. Inverse shuffling is applied when extracting messages. In this paper, the Hangul message to be inserted is decomposed into choseong, jungseong, and jongseong. Security and chaos are improved by applying a selective shuffling process based on the corresponding information. The correlation coefficient and PSNR were used to confirm the performance of the proposed method. It was confirmed that the PSNR value of the proposed method was appropriate when compared with the reference value.
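
The decomposition into choseong, jungseong, and jongseong mentioned in the abstract above follows directly from the Unicode layout of precomposed Hangul syllables, and can be sketched as below. The shuffle shown is only a stand-in: a seeded permutation with a hypothetical `key`, not the paper's selective shuffling, whose rules the abstract does not give. The 5-bit encoding per index is likewise an assumption made for this example.

```python
import random

HANGUL_BASE = 0xAC00  # first precomposed syllable
HANGUL_LAST = 0xD7A3  # last precomposed syllable


def decompose(syllable):
    """Return (choseong, jungseong, jongseong) indices; jongseong 0 means none."""
    code = ord(syllable)
    if not HANGUL_BASE <= code <= HANGUL_LAST:
        raise ValueError("not a precomposed Hangul syllable")
    index = code - HANGUL_BASE
    # 19 choseong x 21 jungseong x 28 jongseong slots; 21 * 28 = 588.
    return index // 588, (index % 588) // 28, index % 28


def to_bits(values, width=5):
    """Encode each index as a fixed-width bit list (5 bits cover 0..27)."""
    return [(v >> shift) & 1 for v in values for shift in range(width - 1, -1, -1)]


def shuffle_bits(bits, key):
    """Permute bits with a key-seeded order (illustrative stand-in only)."""
    order = list(range(len(bits)))
    random.Random(key).shuffle(order)
    return [bits[i] for i in order]


def unshuffle_bits(shuffled, key):
    """Invert shuffle_bits by regenerating the same permutation from the key."""
    order = list(range(len(shuffled)))
    random.Random(key).shuffle(order)
    restored = [0] * len(shuffled)
    for position, source in enumerate(order):
        restored[source] = shuffled[position]
    return restored


jamo = decompose("한")
bits = to_bits(jamo)
hidden = shuffle_bits(bits, key=2022)
recovered = unshuffle_bits(hidden, key=2022)
print(jamo, bits == recovered)
```

Because the permutation is regenerated from the key, the receiver needs only the key and the message length to apply the inverse shuffle.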

A DCT Learning Combined RRU-Net for the Image Splicing Forgery Detection (DCT 학습을 융합한 RRU-Net 기반 이미지 스플라이싱 위조 영역 탐지 모델)

  • Young-min Seo;Jung-woo Han;Hee-jung Kwon;Su-bin Lee;Joongjin Kook
    • Journal of the Semiconductor & Display Technology / Vol. 22, No. 1 / pp. 11-17 / 2023
  • This paper proposes a lightweight deep learning network for detecting image splicing forgery. Research on image forgery detection using CNNs, a type of deep learning network, and research on detecting and localizing forgery at the pixel level are in progress. Among these, CAT-Net, which learns the discrete cosine transform (DCT) coefficients of images together with the images, was released in 2022. The DCT coefficients presented by CAT-Net are combined with the JPEG artifact learning module and the backbone model as pre-training, and the weights are fixed. The dataset used for pre-training is not included in the public dataset, and the backbone model has a relatively large number of network parameters, which causes overfitting on small datasets and hinders generalization performance. In this paper, the learning module is designed to learn DCT-domain characteristics in real time during network training, without pre-training. The DCT RRU-Net proposed in this paper combines RRU-Net, which detects forgery by learning only from images, with a JPEG artifact learning module. It is confirmed that, through the network architecture and training method of DCT RRU-Net, the network has fewer parameters than CAT-Net, its forgery detection performance is better than that of RRU-Net, and its generalization performance improves across various datasets.
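
For readers unfamiliar with the DCT coefficients that the abstract above refers to, here is a minimal pure-Python sketch of the orthonormal DCT-II applied to an 8x8 block, the block size JPEG uses. It is a direct, unoptimized evaluation of the textbook formula; the function names are hypothetical, and it says nothing about how CAT-Net or DCT RRU-Net actually extract or consume coefficients.

```python
import math


def dct_ii(x):
    """Orthonormal 1-D DCT-II of a sequence."""
    n = len(x)
    out = []
    for k in range(n):
        total = sum(
            x[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n)) for i in range(n)
        )
        scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
        out.append(scale * total)
    return out


def dct_2d(block):
    """Separable 2-D DCT-II: transform every row, then every column."""
    rows = [dct_ii(row) for row in block]
    cols = [dct_ii(list(col)) for col in zip(*rows)]
    # zip(*cols) transposes back so that result[u][v] is indexed by (row, column).
    return [list(r) for r in zip(*cols)]


# A flat 8x8 block: all energy should land in the DC coefficient at [0][0].
flat_block = [[16.0] * 8 for _ in range(8)]
coeffs = dct_2d(flat_block)
print(round(coeffs[0][0], 6))
```

With this normalization the DC term of a flat block equals the pixel value times the block side length, and every other coefficient is zero up to rounding error.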

Mathematical formulations for static behavior of bi-directional FG porous plates rested on elastic foundation including middle/neutral-surfaces

  • Amr E. Assie;Salwa A. Mohamed;Alaa A. Abdelrahman;Mohamed A. Eltaher
    • Steel and Composite Structures / Vol. 48, No. 2 / pp. 113-130 / 2023
  • The present manuscript investigates the deviation between the middle surface (MS) and neutral surface (NS) formulations in the static response of bi-directionally functionally graded (BDFG) porous plates. A higher-order shear deformation plate theory with four variables is used to define the displacement field of the BDFG plate. The displacement field variables based on both the NS and the MS are presented in detail. These relations are used to derive a new set of boundary conditions (BCs). The porosity distribution is described by a cosine function in three different configurations: center, bottom, and top distributions. An elastic foundation with shear and normal stiffnesses, following the Winkler-Pasternak model, is included. The equilibrium equations based on the MS and the NS are derived using Hamilton's principle and expressed as variable-coefficient partial differential equations. The numerical differential quadrature method (DQM) is adopted to solve the derived variable-coefficient partial differential equations. Rigidity coefficients and stress resultants for both the MS and NS formulations are derived. The mathematical formulation is validated against previously published work. Additional numerical and parametric results are developed to present the influence of the modified boundary conditions, NS and MS formulations, gradation parameters, elastic foundation coefficients, porosity type, and porosity coefficient on the static response of the BDFG porous plate. The present model can be used in the design and analysis of BDFG structures used in aerospace, vehicle, dental, bio-structure, civil, and nuclear applications.

Forced vibration of a sandwich Timoshenko beam made of GPLRC and porous core

  • Mohammad Safari;Mehdi Mohammadimehr;Hossein Ashrafi
    • Structural Engineering and Mechanics / Vol. 88, No. 1 / pp. 1-12 / 2023
  • In this study, the forced vibration behavior of a piezo-magneto-electric sandwich Timoshenko beam is investigated. The sandwich beam is assumed to have a porous core and graphene platelet reinforced composite (GPLRC) face sheets, with magneto-electro-elastic and temperature-dependent material properties. The magneto-electric platelets are subjected to a function that is linear along the thickness and includes a cosine function as well as constant magnetic and electric potentials. The governing equations of motion are derived using modified strain gradient theory for microstructures. The effects of material length scale parameters, temperature change, different porosity distributions, various patterns of graphene platelets, and the core-to-face-sheet thickness ratio on the natural frequency and excitation frequency of a sandwich Timoshenko beam are scrutinized. The effects of various size-dependent methods, such as MSGT, MCST, and CT, on the natural frequency are considered. Moreover, the final results affirm that an increase in the porosity coefficient and volume fractions leads to an increase in the natural frequency, while the opposite holds for an increase in the aspect ratio. From the forced vibration analysis, it is understood that increasing the volume fraction and the length thickness of GPL decreases the maximum deflection of the sandwich beam. It is also concluded that increasing the temperature, the thickness of GPL, and the initial force leads to a decrease in the maximum deflection of GPL. It is further shown that resonance occurs when the natural and excitation frequencies become equal. The outcomes also reveal that the third natural frequency has the minimum value of both deflection and frequency ratio, and the first natural frequency has the maximum.

Assessment of compressive strength of high-performance concrete using soft computing approaches

  • Chukwuemeka Daniel;Jitendra Khatti;Kamaldeep Singh Grover
    • Computers and Concrete / Vol. 33, No. 1 / pp. 55-75 / 2024
  • The present study introduces an optimum-performance soft computing model for predicting the compressive strength of high-performance concrete (HPC) by comparing, for the first time on a common database, models based on conventional (kernel-based, covariance function-based, and tree-based), advanced machine (least square support vector machine, LSSVM, and minimax probability machine regressor, MPMR), and deep (artificial neural network, ANN) learning approaches. A compressive strength database containing the results of 1030 concrete samples has been compiled from the literature and preprocessed. For training, testing, and validation of the soft computing models, 803, 101, and 101 data points have been selected arbitrarily from the 1005 preprocessed data points. Thirteen performance metrics, including three new metrics (a20-index, index of agreement, and index of scatter), have been implemented for each model. The performance comparison reveals that the SVM (kernel-based), ET (tree-based), MPMR (advanced), and ANN (deep) models have achieved higher performance in predicting the compressive strength of HPC. From the overall analysis of performance, accuracy, Taylor plot, accuracy metric, regression error characteristics curve, Anderson-Darling, Wilcoxon, uncertainty, and reliability, model CS4, based on the ensemble tree, has been recognized as the optimum-performance model, with a correlation coefficient of 0.9352, a root mean square error of 5.76 MPa, and a mean absolute error of 4.1069 MPa. The present study also reveals that multicollinearity affects the prediction accuracy of the Gaussian process regression, decision tree, multilinear regression, and adaptive boosting regressor models, which is a novel finding in compressive strength prediction of HPC. The cosine sensitivity analysis reveals that the prediction of the compressive strength of HPC is highly affected by cement content, fine aggregate, coarse aggregate, and water content.
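
Several of the metrics named in the abstract above have short closed forms, sketched below on made-up strength values in MPa. Two points are assumptions rather than facts from the paper: the a20-index is taken as the fraction of samples whose predicted-to-actual ratio lies in [0.8, 1.2], and the "cosine sensitivity analysis" is assumed to be the cosine amplitude method, which scores an input against the output by the cosine of the angle between their data vectors.

```python
import math


def rmse(actual, predicted):
    """Root mean square error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def a20_index(actual, predicted):
    """Fraction of samples with predicted/actual in [0.8, 1.2] (assumed definition)."""
    hits = sum(1 for a, p in zip(actual, predicted) if 0.8 <= p / a <= 1.2)
    return hits / len(actual)


def cosine_amplitude(feature, target):
    """Cosine amplitude strength: sum(x*y) / sqrt(sum(x^2) * sum(y^2))."""
    numerator = sum(x * y for x, y in zip(feature, target))
    denominator = math.sqrt(sum(x * x for x in feature) * sum(y * y for y in target))
    return numerator / denominator


# Hypothetical measured and predicted compressive strengths (MPa).
actual = [40.0, 50.0, 60.0, 30.0]
predicted = [42.0, 45.0, 75.0, 30.0]

print(rmse(actual, predicted), mae(actual, predicted), a20_index(actual, predicted))
```

A cosine amplitude value close to 1 indicates that an input moves almost in proportion to the output, which is how such an analysis ranks inputs like cement or water content.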