• Title/Abstract/Keywords: Seq2seq model

Search results: 36 (processing time: 0.025 s)

실내 사람 위치 추적 기반 LSTM 모델을 이용한 고객 혼잡 예측 연구 (An Approach Using LSTM Model to Forecasting Customer Congestion Based on Indoor Human Tracking)

  • 채희주;곽경헌;이다연;김은경
    • 한국시뮬레이션학회논문지 / Vol. 32, No. 3 / pp. 43-53 / 2023
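  • This study aims to develop a system that uses security cameras in indoor commercial spaces, particularly cafés, to track the number and location of visitors in real time and thereby provide available-seat information and congestion forecasts. We use YOLO, a real-time object detection and tracking algorithm, to determine the number and location of visitors in real time, and we update this information on an indoor map of the café so that visitors can check which seats are available. In addition, we use Long Short-Term Memory (LSTM), which addresses the vanishing gradient problem, together with the Sequence-to-Sequence (Seq2Seq) technique, which is useful for processing temporally related data, to learn visitor counts and movement patterns at various time intervals, and on that basis we developed a system that predicts café congestion in real time. By providing expected congestion levels to both café managers and customers, the system can improve operational efficiency and raise customer satisfaction. The study demonstrates the utility of camera-based indoor location tracking and discusses its applicability in commercial spaces along with directions for future research.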
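One way to picture how a Seq2Seq forecaster of this kind is fed is as sliding windows over the visitor-count series, where each window is split into an encoder input and a decoder target. The sketch below is only an illustration under stated assumptions: the function name `make_seq2seq_windows`, the window lengths, and the sample counts are all hypothetical and are not taken from the paper.

```python
from typing import List, Sequence, Tuple


def make_seq2seq_windows(
    series: Sequence[float], input_len: int, output_len: int, step: int = 1
) -> List[Tuple[List[float], List[float]]]:
    """Split a time series into (encoder input, decoder target) pairs.

    Hypothetical helper: each pair holds `input_len` past values followed
    by the next `output_len` values that the model should predict.
    """
    pairs = []
    last_start = len(series) - input_len - output_len
    # If the series is too short, the range is empty and no pairs are made.
    for start in range(0, last_start + 1, step):
        split = start + input_len
        encoder_input = list(series[start:split])
        decoder_target = list(series[split:split + output_len])
        pairs.append((encoder_input, decoder_target))
    return pairs


# Hypothetical visitor counts sampled at a fixed interval.
visitor_counts = [3, 5, 8, 12, 15, 14, 10, 6]
pairs = make_seq2seq_windows(visitor_counts, input_len=4, output_len=2)
for encoder_input, decoder_target in pairs:
    print(encoder_input, "->", decoder_target)
```

Changing `step` or the two lengths is one simple way to represent the "various time intervals" mentioned in the abstract, although the paper may have handled this differently.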

Screening for candidate genes related with histological microstructure, meat quality and carcass characteristic in pig based on RNA-seq data

  • Ropka-Molik, Katarzyna;Bereta, Anna;Zukowski, Kacper;Tyra, Miroslaw;Piorkowska, Katarzyna;Zak, Grzegorz;Oczkowicz, Maria
    • Asian-Australasian Journal of Animal Sciences / Vol. 31, No. 10 / pp. 1565-1574 / 2018
  • Objective: The aim of the present study was to identify genetic variants based on RNA-seq data, obtained via transcriptome sequencing of muscle tissue of pigs differing in muscle histological structure, and to verify the variants' effect on histological microstructure and production traits in a larger pig population. Methods: RNA-seq data were used to identify a panel of single nucleotide polymorphisms (SNPs) significantly related to the percentage and diameter of each fiber type (I, IIA, IIB). Detected polymorphisms were mapped to quantitative trait loci (QTL) regions. Next, an association study was performed on 944 animals representing five breeds (Landrace, Large White, Pietrain, Duroc, and the native Puławska breed) to evaluate the relationship between selected SNPs and histological characteristics, meat quality, and carcass traits. Results: Mapping of detected genetic variants to QTL regions showed that chromosome 14 was the most overrepresented, with four QTLs identified that relate to the percentage of fiber types I and IIA. The association study performed on 293 longissimus muscle samples confirmed a significant positive effect of transforming acidic coiled-coil-containing protein 2 (TACC2) polymorphisms on fiber diameter, while an SNP within the forkhead box O1 (FOXO1) locus was associated with a decrease in the diameter of fiber types IIA and IIB. Moreover, subsequent general linear model analysis showed a significant relationship of the FOXO1, delta 4-desaturase, sphingolipid 1 (DEGS1), and troponin T2 (TNNT2) genes with loin 'eye' area, of FOXO1 with loin weight, and of FOXO1 and TACC2 with lean meat percentage. Furthermore, intramuscular fat content was positively associated (p<0.01) with the occurrence of polymorphisms within the DEGS1 and TNNT2 genes and negatively associated with the occurrence of the TACC2 polymorphism. Conclusion: This study's results indicate that SNP calling analysis based on RNA-seq data can be used to search for candidate genes and establish the genetic basis of phenotypic traits. The presented results can be used in future studies evaluating selected SNPs as genetic markers related to muscle histological profile and production traits in pig breeding.

Understanding recurrent neural network for texts using English-Korean corpora

  • Lee, Hagyeong;Song, Jongwoo
    • Communications for Statistical Applications and Methods / Vol. 27, No. 3 / pp. 313-326 / 2020
  • Deep Learning is the most important key to the development of Artificial Intelligence (AI). There are several distinguishable architectures of neural networks such as MLP, CNN, and RNN. Among them, we try to understand one of the main architectures, the Recurrent Neural Network (RNN), which differs from other networks in handling sequential data, including time series and texts. As one of the main recent tasks in Natural Language Processing (NLP), we consider Neural Machine Translation (NMT) using RNNs. We also summarize the fundamental structures of recurrent networks and some topics in representing natural-language words as numeric vectors. We organize the topics so as to explain the estimation procedure from representing input source sequences to predicting target translated sequences. In addition, we apply multiple translation models with Gated Recurrent Units (GRUs) in Keras to English-Korean sentences comprising about 26,000 sentence pairs in total from two different corpora, colloquialism and news. We verified some crucial factors that influence the quality of training. We found that loss decreases with more recurrent dimensions and with a bidirectional RNN in the encoder when dealing with short sequences. We also computed BLEU scores, which are the main measure of translation performance, and compared them with the score from Google Translate on the same test sentences. We summarize some difficulties in training a proper translation model as well as in dealing with the Korean language. Using Keras in Python for all tasks, from processing raw texts to evaluating the translation model, also allows us to include some useful functions and vocabulary libraries.
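Since the abstract leans on BLEU as its main translation metric, a small sentence-level version may help make the measure concrete. This is a simplified sketch, assuming a single reference, uniform n-gram weights, and no smoothing. It is not the evaluation code used in the paper, and the function names and example sentences are hypothetical.

```python
import math
from collections import Counter
from typing import List


def ngram_counts(tokens: List[str], n: int) -> Counter:
    """Count the n-grams of order n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def sentence_bleu(candidate: List[str], reference: List[str], max_n: int = 4) -> float:
    """Simplified BLEU: clipped n-gram precisions plus a brevity penalty."""
    if not candidate:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand = ngram_counts(candidate, n)
        ref = ngram_counts(reference, n)
        total = sum(cand.values())
        # Clip each candidate n-gram count by its count in the reference.
        clipped = sum(min(count, ref[gram]) for gram, count in cand.items())
        if total == 0 or clipped == 0:
            return 0.0  # without smoothing, one zero precision zeroes the score
        log_precisions.append(math.log(clipped / total))
    # Brevity penalty: penalize candidates shorter than the reference.
    if len(candidate) > len(reference):
        brevity_penalty = 1.0
    else:
        brevity_penalty = math.exp(1 - len(reference) / len(candidate))
    return brevity_penalty * math.exp(sum(log_precisions) / max_n)


reference = "the cat sat on the mat".split()
candidate = "the cat sat on a mat".split()
score = sentence_bleu(candidate, reference)
print(round(score, 3))
```

Corpus-level BLEU, as usually reported, aggregates the clipped counts over all test sentences before taking the geometric mean, so scores from this per-sentence sketch are not directly comparable with published figures.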

앙상블 기법을 활용한 RNA-Sequencing 데이터의 폐암 예측 연구 (A Study on Predicting Lung Cancer Using RNA-Sequencing Data with Ensemble Learning)

  • Geon AN;JooYong PARK
    • Journal of Korea Artificial Intelligence Association / Vol. 2, No. 1 / pp. 7-14 / 2024
  • In this paper, we explore the application of RNA-sequencing data and ensemble machine learning to predicting lung cancer, a leading cause of cancer mortality worldwide, and to informing its treatment strategies. The research utilizes Random Forest, XGBoost, and LightGBM models to analyze gene expression profiles from extensive datasets, aiming to enhance predictive accuracy for lung cancer prognosis. The methodology focuses on preprocessing RNA-seq data to standardize expression levels across samples and on applying ensemble algorithms to maximize prediction stability and reduce model overfitting. Key findings indicate that ensemble models, especially XGBoost, substantially outperform traditional predictive models. Significant genetic markers such as ADGRF5 are identified as crucial for predicting lung cancer outcomes. In conclusion, ensemble learning using RNA-seq data proves highly effective in predicting lung cancer, suggesting a potential shift towards more precise and personalized treatment approaches. The results advocate for further integration of molecular and clinical data to refine diagnostic models and improve clinical outcomes, underscoring the critical role of advanced molecular diagnostics in enhancing patient survival rates and quality of life. This study lays the groundwork for future research on applying RNA-sequencing data and ensemble machine learning techniques in clinical settings.

PC-SAN: Pretraining-Based Contextual Self-Attention Model for Topic Essay Generation

  • Lin, Fuqiang;Ma, Xingkong;Chen, Yaofeng;Zhou, Jiajun;Liu, Bo
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 14, No. 8 / pp. 3168-3186 / 2020
  • Automatic topic essay generation (TEG) is a controllable text generation task that aims to generate informative, diverse, and topic-consistent essays based on multiple topics. To generate high-quality essays, a reasonable method should consider both diversity and topic consistency. Another essential issue is the intrinsic link among the topics, which helps the essays stay close to the semantics of the provided topics. However, it remains challenging for TEG to bridge the semantic gap between source topic words and target output, and a more powerful model is needed to capture the semantics of the given topics. To this end, we propose a pretraining-based contextual self-attention (PC-SAN) model built upon the seq2seq framework. For the encoder of our model, we employ a dynamic weighted sum of layers from BERT to fully utilize the semantics of topics, which helps bridge the gap and improve the quality of the generated essays. In the decoding phase, we also transform the target-side contextual history information into the query layers to alleviate the lack of context in typical self-attention networks (SANs). Experimental results on large-scale paragraph-level Chinese corpora verify that our model is capable of generating diverse, topic-consistent text and improves on strong baselines. Furthermore, extensive analysis validates the effectiveness of contextual embeddings from BERT and of contextual history information in SANs.
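The abstract does not spell out how the weighted sum of BERT layers is computed. A common form of such mixing, assumed here purely for illustration, normalizes one learnable score per layer with a softmax and sums the layer outputs with those weights. The names `weighted_layer_sum`, `layer_scores`, and `gamma` are hypothetical, and plain Python lists stand in for tensors.

```python
import math
from typing import List


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_layer_sum(
    layer_outputs: List[List[float]], layer_scores: List[float], gamma: float = 1.0
) -> List[float]:
    """Mix per-layer vectors for one token into a single vector.

    Assumption: one scalar score per layer (learned in a real model),
    softmax-normalized, with an optional global scale `gamma`.
    """
    weights = softmax(layer_scores)
    mixed = [0.0] * len(layer_outputs[0])
    for weight, vector in zip(weights, layer_outputs):
        for i, value in enumerate(vector):
            mixed[i] += weight * value
    return [gamma * value for value in mixed]


# Three hypothetical layers, each giving a 2-dimensional vector for one token.
layers = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
mixed = weighted_layer_sum(layers, layer_scores=[0.0, 0.0, 0.0])
print(mixed)
```

With equal scores the result is the plain average of the layers. In training, the scores would move so that the layers most useful for the task receive more weight.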

BSA-Seq Technologies Identify a Major QTL for Clubroot Resistance in Chinese Cabbage (Brassica rapa ssp. pekinesis)

  • Yuan, Yu-Xiang;Wei, Xiao-Chun;Zhang, Qiang;Zhao, Yan-Yan;Jiang, Wu-Sheng;Yao, Qiu-Ju;Wang, Zhi-Yong;Zhang, Ying;Tan, Yafei;Li, Yang;Xu, Qian;Zhang, Xiao-Wei
    • 한국균학회소식:학술대회논문집 / 한국균학회 2015 Spring Conference and Extraordinary General Meeting / p. 41 / 2015
  • BSA-seq technologies, which combine Bulked Segregant Analysis (BSA) and Next-Generation Sequencing (NGS), are making it faster and more efficient to establish the association of agronomic traits with molecular markers or candidate genes, which is a requirement for marker-assisted selection in molecular breeding. Clubroot disease, caused by Plasmodiophora brassicae, is a serious threat to Brassica crops. Although we have bred new clubroot-resistant varieties of Chinese cabbage (B. rapa ssp. pekinensis), the underlying genetic mechanism is unclear. In this study, an $F_2$ population of 340 plants was inoculated with P. brassicae from Xinye (Pathotype 2 on the differentials of Williams). The resistance phenotype segregation ratio for the population fit a 3:1 (R:S) segregation model, consistent with a single dominant gene model. Super-BSA, based on re-sequencing the parents and extreme R and S DNA pools of 50 plants each, revealed three potential candidate regions on chromosome A03, with the most significant region falling between 24.30 Mb and 24.75 Mb. A linkage map with 31 markers in this region was constructed, and several closely linked markers were identified. A major QTL for clubroot resistance, CRq, was identified with a peak LOD score of 169.3, explaining 89.9% of the phenotypic variation. We also developed a new co-segregating InDel marker, BrQ-2. Joint BSA-seq and traditional QTL analysis delimited CRq to a 250 kb genomic region in which four TIR-NBS-LRR genes (Bra019409, Bra019410, Bra019412 and Bra019413) are clustered. The CR gene CRq and the closely linked markers will be highly useful for breeding new resistant Chinese cabbage cultivars.
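The fit to a 3:1 (R:S) model is typically checked with a chi-square goodness-of-fit test on the observed resistant and susceptible counts. The abstract reports only the population size (340), so the counts below are hypothetical placeholders, and the function name is likewise an assumption. For one degree of freedom the p-value can be written with the complementary error function, which keeps the sketch within the standard library.

```python
import math
from typing import Tuple


def chi_square_3_to_1(resistant: int, susceptible: int) -> Tuple[float, float]:
    """Chi-square goodness-of-fit test against a 3:1 (R:S) ratio.

    Returns (chi-square statistic, p-value) with 1 degree of freedom.
    """
    total = resistant + susceptible
    expected_r = 0.75 * total
    expected_s = 0.25 * total
    chi2 = ((resistant - expected_r) ** 2 / expected_r
            + (susceptible - expected_s) ** 2 / expected_s)
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts that sum to the reported population of 340 plants.
chi2, p_value = chi_square_3_to_1(resistant=258, susceptible=82)
print(f"chi2 = {chi2:.3f}, p = {p_value:.3f}")
```

A p-value above the chosen significance level (commonly 0.05) means the 3:1 hypothesis is not rejected, which is what "fit a 3:1 segregation model" conveys.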


Genome-wide survey and expression analysis of F-box genes in wheat

  • Kim, Dae Yeon;Hong, Min Jeong;Seo, Yong Weon
    • 한국작물학회:학술대회논문집 / 한국작물학회 2017, 9th Asian Crop Science Association conference / p. 141 / 2017
  • The ubiquitin-proteasome pathway is the major regulatory mechanism for selective degradation of proteins in a number of cellular processes and involves three steps: (1) ATP-dependent activation of ubiquitin by the E1 enzyme, (2) transfer of activated ubiquitin to E2, and (3) transfer of ubiquitin to the protein to be degraded by the E3 complex. F-box proteins are subunits of the SCF complex and determine its specificity for the target substrate to be degraded. F-box proteins regulate many important biological processes such as embryogenesis, floral development, plant growth and development, biotic and abiotic stress, hormonal responses, and senescence. However, little is known about the F-box genes in wheat. The draft genome sequence of wheat (IWGSC Reference Sequence v1.0 assembly) was used to conduct a genome-wide survey of the F-box gene family in wheat. The Hidden Markov Model (HMM) profiles of the F-box (PF00646), F-box-like (PF12937), F-box-like 2 (PF13013), FBA (PF04300), FBA_1 (PF07734), FBA_2 (PF07735), FBA_3 (PF08268), and FBD (PF08387) domains were downloaded from the Pfam database and searched against the IWGSC Reference Sequence v1.0 assembly. RNA-seq paired-end libraries from different stages of wheat (seedling, tillering, booting, and 1, 10, 20, and 30 days after flowering (DAF)) were constructed and sequenced on an Illumina HiSeq2000 for expression analysis of F-box protein genes. Basic analyses including Hisat, HTseq, DEseq, gene ontology analysis, and KEGG mapping were conducted to identify differentially expressed genes (DEGs) across the stages and to annotate them. About 950 F-box domain proteins identified by Pfam were mapped to the wheat reference genome sequence by blastX (e-value < 0.05). Among them, more than 140 putative F-box protein genes were selected using cut-offs of fold change > 2, p-value < 0.01, and FDR < 0.01. Expression profiling of the selected F-box protein genes was shown by heatmap analysis, and the 144 putative F-box protein genes were clustered by expression pattern using average linkage and squared Euclidean distance. This work may provide valuable basic information for further investigation of the protein degradation mechanism of the ubiquitin-proteasome system involving F-box proteins during wheat developmental stages.
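The selection step described above (fold change > 2, p-value < 0.01, FDR < 0.01) amounts to a simple filter over per-gene test results. The sketch below applies those three thresholds literally. The record type `GeneResult`, the gene identifiers, and all numeric values are hypothetical, and whether the original analysis also kept strongly downregulated genes is not stated in the abstract.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class GeneResult:
    """Hypothetical per-gene output of a differential expression test."""
    gene_id: str
    fold_change: float  # expression ratio between two stages
    p_value: float
    fdr: float


def select_degs(
    results: List[GeneResult],
    min_fold_change: float = 2.0,
    max_p_value: float = 0.01,
    max_fdr: float = 0.01,
) -> List[GeneResult]:
    """Keep genes that pass all three cut-offs quoted in the abstract."""
    return [
        r for r in results
        if r.fold_change > min_fold_change
        and r.p_value < max_p_value
        and r.fdr < max_fdr
    ]


# Made-up example records; identifiers are placeholders, not real wheat genes.
results = [
    GeneResult("gene_A", fold_change=3.5, p_value=0.001, fdr=0.004),
    GeneResult("gene_B", fold_change=1.4, p_value=0.0005, fdr=0.002),
    GeneResult("gene_C", fold_change=4.2, p_value=0.03, fdr=0.08),
]
selected = select_degs(results)
print([r.gene_id for r in selected])
```

To treat downregulation symmetrically, one option would be to compare `max(fold_change, 1 / fold_change)` with the threshold instead, under the assumption that fold change is stored as a positive ratio.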


미세먼지 예측 성능 개선을 위한 시공간 트랜스포머 모델의 적용 (Application of spatiotemporal transformer model to improve prediction performance of particulate matter concentration)

  • 김영광;김복주;안성만
    • 지능정보연구 / Vol. 28, No. 1 / pp. 329-352 / 2022
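  • Particulate matter is reported to penetrate the lungs and blood vessels and to cause various heart diseases and respiratory diseases such as lung cancer. The subway is used by an average of ten million people per day, so a clean and comfortable environment is important; however, because trains run through underground tunnels and particulate matter trapped in the tunnels is carried into underground stations by train-induced wind, particulate pollution in underground stations is high. The Ministry of Environment and the Seoul Metropolitan Government have established air quality improvement measures for underground stations and are making various efforts to reduce particulate matter. A smart air quality management system manages air quality by collecting air quality data and predicting particulate matter concentration, so the concentration prediction model is a key component. Although many studies on time-series forecasting have been conducted, research on predicting particulate matter concentration in subway stations has been limited to statistical models and deep learning models based on recurrent neural networks. This study therefore proposes four transformer-based models, including a spatiotemporal transformer. In experiments predicting the particulate matter concentration one hour ahead in the concourses of Seoul subway stations, the transformer-based models outperformed the existing ARIMA, LSTM, and Seq2Seq models. Among the transformer-based models, the spatiotemporal transformer performed best. A smart air quality management system that operates on data-driven predictions can run more effectively and more energy-efficiently as prediction accuracy improves. The results of this study are expected to contribute to the efficient operation of smart air quality management systems.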

전인적 호스피스 간호중재 프로그램이 호스피스완화의료병동 입원 환자의 자아존중감과 영적안녕에 미치는 효과 (Effects of Holistic Hospice Nursing Intervention Program on Self Esteem and Spiritual Well-being for Inpatients of Hospice Palliative Care Unit)

  • 최성은;강은실
    • Journal of Hospice and Palliative Care / Vol. 12, No. 4 / pp. 209-219 / 2009
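  • Purpose: This study used a one-group pretest-posttest pre-experimental design to test the effects of a holistic hospice nursing intervention program (the "Rainbow Program") on the self-esteem and spiritual well-being of inpatients in a hospice palliative care unit. Methods: The participants were 27 adult inpatients aged 18 or older who were admitted to the hospice palliative care unit of Sunlin Hospital in Pohang, Gyeongbuk, between April 6, 2004 and April 20, 2005, were able to communicate, and gave written consent to participate. After a pretest, the holistic hospice nursing intervention program was provided over two weeks in a total of 10 sessions (120 minutes each), followed by a posttest. Self-esteem was measured with the Self Esteem Questionnaire (SEQ), modified and supplemented for adults, and spiritual well-being was measured with the Spiritual Well-being Questionnaire; the data were analyzed with paired t-tests using SPSS/WIN 12.0. Results: 1. Hypothesis 1, "Inpatients of the hospice palliative care unit who receive the holistic hospice nursing intervention program (hereafter, the experimental group) will have higher self-esteem after the experiment than before," was supported (t=11.554, P<0.001). 2. Hypothesis 2, "The experimental group receiving the holistic hospice nursing intervention program will have higher spiritual well-being after the experiment than before," was supported (t=6.387, P<0.001). Conclusion: The holistic hospice nursing intervention program is effective in improving the self-esteem and spiritual well-being of inpatients in a hospice palliative care unit. It can therefore be applied in clinical practice for terminally ill patients admitted to such units, and it should also be useful in research and education as a multidisciplinary team approach model for hospice professionals.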
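The paired t-test used here compares each participant's posttest score with their own pretest score. A minimal sketch of the statistic follows, using made-up scores for five participants. The study's raw data are not given in the abstract, and the function name is hypothetical.

```python
import math
from statistics import mean, stdev
from typing import List, Tuple


def paired_t_statistic(before: List[float], after: List[float]) -> Tuple[float, int]:
    """Return (t statistic, degrees of freedom) for a paired t-test.

    t = mean(d) / (sd(d) / sqrt(n)), where d = after - before per participant.
    """
    if len(before) != len(after) or len(before) < 2:
        raise ValueError("need two equally long samples with at least 2 pairs")
    diffs = [a - b for a, b in zip(after, before)]
    spread = stdev(diffs)  # sample standard deviation (n - 1 denominator)
    if spread == 0:
        raise ValueError("all differences are identical; t is undefined")
    n = len(diffs)
    return mean(diffs) / (spread / math.sqrt(n)), n - 1


# Hypothetical pretest and posttest scores, not data from the study.
pretest = [20, 22, 19, 24, 21]
posttest = [23, 25, 20, 28, 24]
t_value, dof = paired_t_statistic(pretest, posttest)
print(f"t = {t_value:.3f}, df = {dof}")
```

With 27 participants, as in the study, the test has 26 degrees of freedom. The p-value is then read from the t distribution with that many degrees of freedom.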


Ginsenoside Rg3 increases gemcitabine sensitivity of pancreatic adenocarcinoma via reducing ZFP91 mediated TSPYL2 destabilization

  • Pan, Haixia;Yang, Linhan;Bai, Hansong;Luo, Jing;Deng, Ying
    • Journal of Ginseng Research / Vol. 46, No. 5 / pp. 636-645 / 2022
  • Background: Ginsenoside Rg3 and gemcitabine have mutually enhancing antitumor effects. However, the underlying mechanisms are not clear. This study explored the influence of ginsenoside Rg3 on Zinc finger protein 91 homolog (ZFP91) expression in pancreatic adenocarcinoma (PAAD) and their regulatory mechanisms on gemcitabine sensitivity. Methods: RNA-seq and survival data from The Cancer Genome Atlas (TCGA)-PAAD and Genotype-Tissue Expression (GTEx) were used for in silico analysis. PANC-1, BxPC-3, and PANC-1 gemcitabine-resistant (PANC-1/GR) cells were used for in vitro analysis. A PANC-1-derived tumor xenograft nude mouse model was used to assess the influence of ginsenoside Rg3 and ZFP91 on tumor growth in vivo. Results: Ginsenoside Rg3 reduced ZFP91 expression in PAAD cells in a dose-dependent manner. ZFP91 upregulation was associated with significantly shorter survival of patients with PAAD. ZFP91 overexpression induced gemcitabine resistance, which was partly overcome by ginsenoside Rg3 treatment. ZFP91 depletion sensitized PANC-1/GR cells to gemcitabine treatment. ZFP91 interacted with Testis-Specific Y-Encoded-Like Protein 2 (TSPYL2), induced its poly-ubiquitination, and promoted its proteasomal degradation. Ginsenoside Rg3 treatment weakened ZFP91-induced TSPYL2 poly-ubiquitination and degradation. Enforced TSPYL2 expression increased the gemcitabine sensitivity of PAAD cells and partly reversed induced gemcitabine resistance in PANC-1/GR cells. Conclusion: Ginsenoside Rg3 can increase the gemcitabine sensitivity of pancreatic adenocarcinoma, at least in part by reducing ZFP91-mediated TSPYL2 destabilization.