• Title/Summary/Keyword: Seq2seq model


Sentence Recommendation Using Beam Search in a Military Intelligent Image Analysis System (군사용 지능형 영상 판독 시스템에서의 빔서치를 활용한 문장 추천)

  • Na, Hyung-Sun;Jeon, Tae-Hyeon;Kang, Hyung-Seok;Ahn, Jinhyun;Im, Dong-Hyuk
    • KIPS Transactions on Software and Data Engineering / v.10 no.11 / pp.521-528 / 2021
  • In the image analysis systems currently used in the military, analysts examine and identify images themselves, then write and disseminate the related content; this process involves frequent repetitive tasks, which adds to the workload. To address this problem, we propose an algorithm that lets the Seq2Seq model, which normally operates on a sentence basis, operate on a word basis, and we apply the attention technique to improve accuracy. In addition, by applying beam search, we recommend a variety of current identification sentences based on the past identification contents of a specific area. Experiments confirmed that beam search recommends sentences more effectively than the existing greedy search, and that recommendation accuracy increases as the beam size grows.
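
The contrast between greedy search and beam search described above can be illustrated with a minimal sketch. The toy next-token table `TOY_MODEL` and the function names are hypothetical stand-ins, not the paper's Seq2Seq model; the sketch only shows why keeping more than one hypothesis can recover a more probable sentence.

```python
import math

# Hypothetical toy "model": next-token probabilities given the last token.
# A real system would query a trained Seq2Seq decoder here instead.
TOY_MODEL = {
    "<s>": {"A": 0.6, "B": 0.4},
    "A": {"x": 0.5, "y": 0.5},
    "B": {"z": 0.9, "w": 0.1},
}


def next_token_probs(prefix):
    # Any token without an entry is followed by the end-of-sentence token.
    return TOY_MODEL.get(prefix[-1], {"</s>": 1.0})


def beam_search(beam_width, max_len=3):
    """Return the top hypotheses as (log-probability, token list) pairs."""
    beams = [(0.0, ["<s>"])]
    for _ in range(max_len):
        candidates = []
        for score, seq in beams:
            if seq[-1] == "</s>":
                # Finished hypotheses are carried over unchanged.
                candidates.append((score, seq))
                continue
            for tok, p in next_token_probs(seq).items():
                candidates.append((score + math.log(p), seq + [tok]))
        # Keep only the beam_width highest-scoring hypotheses.
        candidates.sort(key=lambda c: c[0], reverse=True)
        beams = candidates[:beam_width]
    return beams


greedy = beam_search(beam_width=1)[0]   # beam width 1 is greedy search
wider = beam_search(beam_width=2)[0]
print(greedy[1], math.exp(greedy[0]))
print(wider[1], math.exp(wider[0]))
```

In this toy table, greedy search commits to the locally best first token and ends with a less probable sentence than the width-2 beam, which mirrors the tendency the abstract reports.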

Epigenetic regulation of key gene of PCK1 by enhancer and super-enhancer in the pathogenesis of fatty liver hemorrhagic syndrome

  • Yi Wang;Shuwen Chen;Min Xue;Jinhu Ma;Xinrui Yi;Xinyu Li;Xuejin Lu;Meizi Zhu;Jin Peng;Yunshu Tang;Yaling Zhu
    • Animal Bioscience / v.37 no.8 / pp.1317-1332 / 2024
  • Objective: The scarcity of studies on the non-coding and regulatory regions of the genome limits our ability to decode the mechanisms of fatty liver hemorrhagic syndrome (FLHS) in chickens. Methods: We constructed a high-fat diet-induced FLHS chicken model to investigate genome-wide active enhancers and the transcriptome, using H3K27ac-targeted chromatin immunoprecipitation sequencing (ChIP-seq) and RNA sequencing (RNA-Seq) profiles of normal and FLHS liver tissues. In addition, an integrative analysis combining ChIP-seq with RNA-Seq, and a comparative transcriptome-level analysis of chicken FLHS, rat non-alcoholic fatty liver disease (NAFLD), and human NAFLD, revealed the enhancer and super-enhancer target genes and the conserved genes involved in metabolic processes. Results: In total, 56 upregulated peak-genes positively regulated by H3K27ac (PP; peak-gene correlation Cor ≥0.5 and log2(FoldChange) ≥1) and 199 downregulated peak-genes positively regulated by H3K27ac (PN; Cor ≥0.5 and log2(FoldChange) ≤-1) were identified. We then screened key regulatory targets, mainly involved in lipid metabolism (PCK1, APOA4, APOA1, INHBE) and apoptosis (KIT, NTRK2), together with the MAPK and PPAR signaling pathways in FLHS. Intriguingly, PCK1 was also significantly covered by up-regulated super-enhancers (SEs), which further implies a vital role for PCK1 in the development of FLHS. Conclusion: Together, our studies identified PCK1 as a potential therapeutic biomarker and provided novel insights into the pathogenesis of FLHS, especially from an epigenetic perspective.
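
The PP/PN thresholds stated in the Results can be written as a small filter. This is only a sketch of the stated cut-offs (Cor ≥0.5, log2(FoldChange) ≥1 or ≤-1); the function name, the record layout, and the example values are hypothetical and are not taken from the paper's data.

```python
def classify_peak_gene(cor, log2_fc, cor_min=0.5, fc_min=1.0):
    """Label a peak-gene pair as 'PP', 'PN', or None using the stated cut-offs."""
    if cor < cor_min:
        return None  # peak signal and expression are not correlated enough
    if log2_fc >= fc_min:
        return "PP"  # upregulated, positively regulated by H3K27ac
    if log2_fc <= -fc_min:
        return "PN"  # downregulated, positively regulated by H3K27ac
    return None


# Made-up illustrative records: (identifier, Cor, log2 fold change).
records = [("gene_a", 0.8, 1.7), ("gene_b", 0.6, -2.1), ("gene_c", 0.3, 3.0)]
labels = {name: classify_peak_gene(cor, fc) for name, cor, fc in records}
print(labels)
```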

A Query-aware Dialog Model for Open-domain Dialog (입력 발화의 키워드를 반영하는 응답을 생성하는 대화 모델)

  • Lim, Yeon-Soo;Kim, So-Eon;Kim, Bong-Min;Jung, Heejae;Park, Seong-Bae
    • Annual Conference on Human and Language Technology / 2020.10a / pp.274-279 / 2020
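  • A dialog system generates appropriate and meaningful responses to a user's input utterance, and dialog models with a seq2seq architecture are the main focus of current research. However, seq2seq-based dialog models tend to generate responses that are only weakly related to the input utterance, or bland responses that would fit any input utterance. To address this, this paper proposes a model that finds the keyword to be considered in the input utterance and generates a response that reflects it. The proposed model uses self-attention to compute a keyword score for each token of the input utterance. The token with the highest keyword score is defined as the keyword carrying the topic or core content of the dialog, and the model is guided to generate a response related to that keyword. Experiments show that the proposed model improves performance in terms of grammar and the relevance between the input utterance and the generated response. In particular, the relevance score of the proposed model was about 0.25 points higher than that of the comparison model. These results confirm the superiority of the proposed model.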


Analysis of the Structural Relationships among Self-efficacy, Experience, Mobile Learning Quality, and Learner Satisfaction in Universities

  • LEE, Jong-Yeon;PARK, Sanghoon
    • Educational Technology International / v.17 no.2 / pp.203-228 / 2016
  • This study was designed to determine the factors affecting learner satisfaction and to examine the relationships among these factors in mobile learning linked to pre-existing e-learning in universities. In the structural model, the endogenous variables are three mobile learning quality factors perceived by students, namely system quality (SYQ), information quality (INQ), and service quality (SEQ), together with learner satisfaction (LS), whereas students' self-efficacy (SE) and experience (EX) in mobile learning are the exogenous variables. The subjects were 900 students who registered for mobile learning courses offered by a private university in Seoul, Korea. The results indicated that SE in mobile learning had positive effects on SYQ, INQ, and SEQ. Furthermore, SE influenced LS when analyzed without quality factors as parameters. Mobile learning EX directly affected INQ, but not SYQ or SEQ. EX likewise had a direct effect on LS when analyzed without quality factors as parameters. Meanwhile, both SYQ and INQ showed a positive effect on LS, but SEQ did not. SE and EX affected LS indirectly when SYQ and INQ were used as parameters. This study addresses the importance of increasing SE, EX, SYQ, and INQ to increase LS in mobile learning in universities.

Multi-Document Summarization Method of Reviews Using Word Embedding Clustering (워드 임베딩 클러스터링을 활용한 리뷰 다중문서 요약기법)

  • Lee, Pil Won;Hwang, Yun Young;Choi, Jong Seok;Shin, Young Tae
    • KIPS Transactions on Software and Data Engineering / v.10 no.11 / pp.535-540 / 2021
  • A multi-document is a document that covers various topics rather than a single topic; online reviews are a typical example. There have been several attempts to summarize online reviews because of the vast amount of information they contain. However, summarizing reviews collectively with existing summary models loses the various topics that make up the reviews. Therefore, in this paper, we present a method that summarizes reviews with minimal loss of topics. The proposed method classifies reviews through preprocessing, importance evaluation, embedding substitution using BERT, and embedding clustering. The classified sentences are then used to generate the final summary with a trained Transformer summary model. The proposed model was evaluated against the existing summary model, a seq2seq model, using ROUGE scores and cosine similarity, and it produced better summaries than the existing summary model.

K-mer Based RNA-seq Read Distribution Method For Accelerating De Novo Transcriptome Assembly

  • Kwon, Hwijun;Jung, Inuk
    • Journal of the Korea Society of Computer and Information / v.25 no.8 / pp.1-8 / 2020
  • In this paper, we propose a gene-family-based RNA-seq read distribution method to reduce the overall computation time of transcriptome assembly. To measure the performance of our transcriptome sequence data distribution method, we tested four types of data sets from the Arabidopsis thaliana genome: Whole Unclassified Reads (WUR), Family-Classified Reads, Model-Classified Reads, and Randomly Classified Reads. When de novo transcript assembly was run on distributed nodes using the model-classified data, the generated gene contigs matched 95% of the contigs generated from WUR, and the execution time was reduced by a factor of 4.2 compared with a single-node environment using the same resources.
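
As a rough illustration of k-mer based read distribution, the sketch below assigns each read to the gene family whose reference sequences share the most k-mers with it, so that each family's reads could be assembled on a separate node. The voting rule, the function names, and the toy sequences are assumptions made for illustration; the abstract does not specify the paper's actual classification procedure.

```python
from collections import Counter, defaultdict


def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def build_index(family_refs, k):
    """Map each k-mer to the set of families whose references contain it."""
    index = defaultdict(set)
    for family, refs in family_refs.items():
        for ref in refs:
            for km in kmers(ref, k):
                index[km].add(family)
    return index


def distribute_reads(reads, index, k):
    """Bin reads by the family receiving the most k-mer votes."""
    bins = defaultdict(list)
    for read in reads:
        votes = Counter()
        for km in kmers(read, k):
            for family in index.get(km, ()):
                votes[family] += 1
        family = votes.most_common(1)[0][0] if votes else "unclassified"
        bins[family].append(read)
    return dict(bins)


# Toy data (hypothetical sequences, not from Arabidopsis thaliana).
family_refs = {"famA": ["ACGTACGTGG"], "famB": ["TTTTCCCCGG"]}
reads = ["ACGTAC", "TTCCCC", "GAGAGA"]
bins = distribute_reads(reads, build_index(family_refs, k=4), k=4)
print(bins)
```

Each bin could then be handed to an independent assembler process; reads that match no family fall into an `unclassified` bin rather than being dropped.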

Deep Learning-based Delinquent Taxpayer Prediction: A Scientific Administrative Approach

  • YongHyun Lee;Eunchan Kim
    • KSII Transactions on Internet and Information Systems (TIIS) / v.18 no.1 / pp.30-45 / 2024
  • This study introduces an effective method for predicting individual local tax delinquencies using prevalent machine learning and deep learning algorithms. The evaluation of credit risk holds great significance in the financial realm, impacting both companies and individuals. While credit risk prediction has been explored using statistical and machine learning techniques, their application to tax arrears prediction remains underexplored. We forecast individual local tax defaults in the Republic of Korea using machine and deep learning algorithms, including convolutional neural networks (CNN), long short-term memory (LSTM), and sequence-to-sequence (seq2seq). Our model incorporates diverse credit and public information, such as loan history, delinquency records, credit card usage, and public taxation data, offering richer insights than prior studies. The results highlight the superior predictive accuracy of the CNN model. Anticipating local tax arrears more effectively could lead to efficient allocation of administrative resources. By leveraging advanced machine learning, this research offers a promising avenue for refining tax collection strategies and resource management.

Characterizing Milk Production Related Genes in Holstein Using RNA-seq

  • Seo, Minseok;Lee, Hyun-Jeong;Kim, Kwondo;Caetano-Anolles, Kelsey;Jeong, Jin Young;Park, Sungkwon;Oh, Young Kyun;Cho, Seoae;Kim, Heebal
    • Asian-Australasian Journal of Animal Sciences / v.29 no.3 / pp.343-351 / 2016
  • Although the chemical, physical, and nutritional properties of bovine milk have been extensively studied, only a few studies have attempted to characterize milk-synthesizing genes using RNA-seq data. RNA-seq data was collected from 21 Holstein samples, along with group information about milk production ability; milk yield; and protein, fat, and solid contents. Meta-analysis was employed in order to generally characterize genes related to milk production. In addition, we attempted to investigate the relationship between milk related traits, parity, and lactation period. We observed that milk fat is highly correlated with lactation period; this result indicates that this effect should be considered in the model in order to accurately detect milk production related genes. By employing our developed model, 271 genes were significantly (false discovery rate [FDR] adjusted p-value<0.1) detected as milk production related differentially expressed genes. Of these genes, five (albumin, nitric oxide synthase 3, RNA-binding region (RNP1, RRM) containing 3, secreted and transmembrane 1, and serine palmitoyltransferase, small subunit B) were technically validated using quantitative real-time polymerase chain reaction (qRT-PCR) in order to check the accuracy of RNA-seq analysis. Finally, 83 gene ontology biological processes, including several blood vessel and mammary gland development related terms, were significantly detected using DAVID gene-set enrichment analysis. From these results, we observed that detected milk production related genes are highly enriched in the circulation system process and mammary gland related biological functions. In addition, we observed that detected genes including caveolin 1, mammary serum amyloid A3.2, lingual antimicrobial peptide, cathelicidin 4 (CATHL4), and cathelicidin 6 (CATHL6) have been reported in other species as milk production related genes. For this reason, we concluded that our detected 271 genes would be strong candidates for determining milk production.

A Transliteration Model based on the Seq2seq Learning and Methods for Phonetically-Aware Partial Match for Transliterated Terms in Korean (문장대문장 학습을 이용한 음차변환 모델과 한글 음차변환어의 발음 유사도 기반 부분매칭 방법론)

  • Park, Joohee;Park, Wonjun;Seo, Heecheol
    • Annual Conference on Human and Language Technology / 2018.10a / pp.443-448 / 2018
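  • To improve the quality of web search results, flexible matching, such as matching a Korean string and an English string that refer to the same entity (e.g., 네이버-naver), is as important as exact query matching. This paper presents a method for transliterating English strings into Korean strings through sequence-to-sequence learning. It also presents a pronunciation-similarity-based partial matching method that can match the Korean string produced by transliteration against the various transliterations of the same English string, and it evaluates both methods quantitatively using Wikipedia redirect keywords. The paper thereby shows that seq2seq-based transliteration can take complex context into account, and that using jamo similarity when computing the Damerau-Levenshtein distance enables more effective partial matching between Korean keywords than before.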
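
One way to picture a jamo-aware Damerau-Levenshtein distance is sketched below. It uses the restricted (optimal string alignment) variant and charges a substitution between two Hangul syllables by the fraction of their three jamo positions (initial, medial, final) that differ. That cost function and all names here are assumptions for illustration; the abstract does not give the paper's exact similarity measure.

```python
HANGUL_BASE, HANGUL_LAST = 0xAC00, 0xD7A3


def to_jamo(ch):
    """Decompose a precomposed Hangul syllable into (initial, medial, final) indices."""
    code = ord(ch)
    if not HANGUL_BASE <= code <= HANGUL_LAST:
        return None  # not a Hangul syllable
    idx = code - HANGUL_BASE
    return (idx // 588, (idx % 588) // 28, idx % 28)


def sub_cost(a, b):
    """Assumed substitution cost: share of differing jamo positions, in [0, 1]."""
    if a == b:
        return 0.0
    ja, jb = to_jamo(a), to_jamo(b)
    if ja is None or jb is None:
        return 1.0
    return sum(x != y for x, y in zip(ja, jb)) / 3


def jamo_dl_distance(s, t):
    """Restricted Damerau-Levenshtein distance with jamo-weighted substitutions."""
    d = [[0.0] * (len(t) + 1) for _ in range(len(s) + 1)]
    for i in range(len(s) + 1):
        d[i][0] = float(i)
    for j in range(len(t) + 1):
        d[0][j] = float(j)
    for i in range(1, len(s) + 1):
        for j in range(1, len(t) + 1):
            d[i][j] = min(
                d[i - 1][j] + 1.0,                               # deletion
                d[i][j - 1] + 1.0,                               # insertion
                d[i - 1][j - 1] + sub_cost(s[i - 1], t[j - 1]),  # substitution
            )
            if i > 1 and j > 1 and s[i - 1] == t[j - 2] and s[i - 2] == t[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1.0)    # transposition
    return d[len(s)][len(t)]


# "네이버" vs. the variant spelling "내이버": only one vowel jamo differs.
print(jamo_dl_distance("네이버", "내이버"))
```

Under this cost, spelling variants that differ in a single jamo score closer to each other than strings that differ by a whole syllable, which is the kind of graded partial match the abstract describes.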


Korean Generation-based Dialogue State Tracking using Korean Token-Free Pre-trained Language Model KeByT5 (한국어 토큰-프리 사전학습 언어모델 KeByT5를 이용한 한국어 생성 기반 대화 상태 추적)

  • Kiyoung Lee;Jonghun Shin;Soojong Lim;Ohwoog Kwon
    • Annual Conference on Human and Language Technology / 2023.10a / pp.644-647 / 2023
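  • In a dialog system, dialog state tracking plays an important role in identifying the user's intent as the conversation proceeds and in determining the system response. In task-oriented dialog in particular, dialog state tracking is essential for satisfying the user's goal. Recently, many downstream natural language processing tasks have achieved good performance by using a pre-trained language model as the backbone network and fine-tuning it on the target domain task. This paper introduces a Korean generation-based dialog state tracking model that uses KeByT5, a Korean token-free pre-trained language model, fine-tuned in an end-to-end seq2seq manner, and describes the results of the related experiments.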
