• Title/Summary/Keyword: Long-term memory

Deep Learning Architectures and Applications (딥러닝의 모형과 응용사례)

  • Ahn, SungMahn
    • Journal of Intelligence and Information Systems / v.22 no.2 / pp.127-142 / 2016
  • A deep learning model is a kind of neural network that allows multiple hidden layers. There are various deep learning architectures such as convolutional neural networks, deep belief networks and recurrent neural networks. These have been applied to fields like computer vision, automatic speech recognition, natural language processing, audio recognition and bioinformatics, where they have been shown to produce state-of-the-art results on various tasks. Among these architectures, convolutional neural networks and recurrent neural networks are classified as supervised learning models, and in recent years these supervised learning models have gained more popularity than unsupervised learning models such as deep belief networks, because they have yielded successful applications in the fields mentioned above. Deep learning models can be trained with the backpropagation algorithm. Backpropagation is an abbreviation for "backward propagation of errors" and a common method of training artificial neural networks, used in conjunction with an optimization method such as gradient descent. The method calculates the gradient of an error function with respect to all the weights in the network. The gradient is fed to the optimization method, which in turn uses it to update the weights in an attempt to minimize the error function. Convolutional neural networks use a special architecture which is particularly well-adapted to classifying images. Using this architecture makes convolutional networks fast to train. This, in turn, helps us train deep, multi-layer networks, which are very good at classifying images. These days, deep convolutional networks are used in most neural networks for image recognition. Convolutional neural networks rest on three basic ideas: local receptive fields, shared weights, and pooling. By local receptive fields, we mean that each neuron in the first (or any) hidden layer is connected to a small region of the input (or previous layer's) neurons. Shared weights mean that we use the same weights and bias for each of the local receptive fields. This means that all the neurons in the hidden layer detect exactly the same feature, just at different locations in the input image. In addition to the convolutional layers just described, convolutional neural networks also contain pooling layers. Pooling layers are usually used immediately after convolutional layers. What the pooling layers do is simplify the information in the output from the convolutional layer. Recent convolutional network architectures have 10 to 20 hidden layers and billions of connections between units. Training deep networks took weeks several years ago, but thanks to progress in GPUs and algorithmic enhancements, training time has been reduced to several hours. Neural networks with time-varying behavior are known as recurrent neural networks or RNNs. A recurrent neural network is a class of artificial neural network where connections between units form a directed cycle. This creates an internal state in the network which allows it to exhibit dynamic temporal behavior. Unlike feedforward neural networks, RNNs can use their internal memory to process arbitrary sequences of inputs. Early RNN models turned out to be very difficult to train, harder even than deep feedforward networks. The reason is the unstable gradient problem, i.e., vanishing and exploding gradients. The gradient can get smaller and smaller as it is propagated back through the layers, which makes learning in the early layers extremely slow. The problem actually gets worse in RNNs, since gradients aren't just propagated backward through layers, they're propagated backward through time. If the network runs for a long time, the gradient can become extremely unstable and hard to learn from. Incorporating long short-term memory units (LSTMs) into RNNs makes it much easier to get good results when training them, and many recent papers make use of LSTMs or related ideas.
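
The three ideas this abstract walks through for convolutional networks (local receptive fields, shared weights, and pooling) can be made concrete in a few lines of NumPy. The sketch below is illustrative only: the image size, kernel size, and pooling window are assumptions, not values taken from the paper.

```python
# Minimal NumPy sketch of the three CNN ideas described above:
# local receptive fields, shared weights, and pooling.
import numpy as np

def conv2d(image, kernel):
    """Slide one shared kernel over the image: each output neuron sees
    only a small local receptive field, and all neurons share weights,
    so they all detect the same feature at different locations."""
    h, w = image.shape
    kh, kw = kernel.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool(feature_map, size=2):
    """Pooling simplifies the convolutional output by keeping only the
    strongest activation in each size-by-size block."""
    h, w = feature_map.shape
    h, w = h - h % size, w - w % size
    blocks = feature_map[:h, :w].reshape(h // size, size, w // size, size)
    return blocks.max(axis=(1, 3))

image = np.random.rand(8, 8)     # toy "input image" (assumed size)
kernel = np.random.randn(3, 3)   # one shared weight set = one feature detector
print(max_pool(conv2d(image, kernel)).shape)  # (3, 3)
```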

Deep Learning-based Abnormal Behavior Detection System for Dementia Patients (치매 환자를 위한 딥러닝 기반 이상 행동 탐지 시스템)

  • Kim, Kookjin;Lee, Seungjin;Kim, Sungjoong;Kim, Jaegeun;Shin, Dongil;Shin, Dong-kyoo
    • Journal of Internet Computing and Services / v.21 no.3 / pp.133-144 / 2020
  • The number of elderly people with dementia is increasing as fast as the overall proportion of older people, which creates a social and economic burden. In particular, dementia care costs, including indirect costs such as lost caregiver working hours, have grown exponentially over the years. To reduce these costs, it is urgent to introduce a management system for the care of dementia patients. This study therefore proposes a sensor-based abnormal behavior detection system to manage dementia patients who live alone or cannot be cared for around the clock. Existing studies merely evaluated behavior or recognized only normal behavior, and some perceived behavior by processing images rather than sensor data. Recognizing the limitations of real data collection, we used both an autoencoder, an unsupervised learning model, and an LSTM, a supervised learning model. The autoencoder was trained on normal behavioral data to learn patterns of normal behavior, and the LSTM further refined classification by learning behaviors that can be perceived by sensors. The test results show that the models reach about 96% and 98% accuracy, respectively, and the system is designed to pass data on to the LSTM model when the autoencoder's outlier score exceeds 3%. The system is expected to effectively manage elderly and dementia patients who live alone and to reduce the cost of care.
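
The two-stage pipeline described in this abstract (an autoencoder trained on normal data flags outliers by reconstruction error, and only flagged windows go on to an LSTM classifier) might be wired up roughly as below. This is a hedged sketch: the window shape, layer sizes, threshold handling, and all names are illustrative assumptions, not the paper's actual architecture.

```python
# Rough sketch of the abstract's two-stage idea: the autoencoder screens,
# the LSTM classifies only what the autoencoder flags. All sizes assumed.
import numpy as np
import tensorflow as tf

TIMESTEPS, FEATURES = 30, 8  # assumed sensor-window shape

autoencoder = tf.keras.Sequential([
    tf.keras.layers.Input((TIMESTEPS * FEATURES,)),
    tf.keras.layers.Dense(32, activation="relu"),  # compressed code
    tf.keras.layers.Dense(TIMESTEPS * FEATURES),   # reconstruction
])
autoencoder.compile(optimizer="adam", loss="mse")  # train on normal data only

lstm_clf = tf.keras.Sequential([
    tf.keras.layers.Input((TIMESTEPS, FEATURES)),
    tf.keras.layers.LSTM(64),
    tf.keras.layers.Dense(1, activation="sigmoid"),  # abnormal vs. normal
])
lstm_clf.compile(optimizer="adam", loss="binary_crossentropy")

def detect(window, threshold):
    """Stage 1: reconstruction error; stage 2: LSTM only for outliers."""
    flat = window.reshape(1, -1)
    error = float(np.mean((autoencoder.predict(flat, verbose=0) - flat) ** 2))
    if error <= threshold:          # reconstructs well -> looks normal
        return "normal"
    score = lstm_clf.predict(window[np.newaxis], verbose=0)[0, 0]
    return "abnormal" if score > 0.5 else "normal"

print(detect(np.random.rand(TIMESTEPS, FEATURES), threshold=0.1))
```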

Research on the Utilization of Recurrent Neural Networks for Automatic Generation of Korean Definitional Sentences of Technical Terms (기술 용어에 대한 한국어 정의 문장 자동 생성을 위한 순환 신경망 모델 활용 연구)

  • Choi, Garam;Kim, Han-Gook;Kim, Kwang-Hoon;Kim, You-eil;Choi, Sung-Pil
    • Journal of the Korean Society for Library and Information Science / v.51 no.4 / pp.99-120 / 2017
  • In order to develop a semiautomatic support system that allows researchers to efficiently analyze technical trends in the ever-growing industry and market, this paper introduces a couple of Korean sentence generation models that can automatically generate definitional statements as well as descriptions of technical terms and concepts. The proposed models are based on LSTM (Long Short-Term Memory), a deep learning model capable of effectively labeling textual sequences by taking into account the contextual relations of each item in the sequences. Our models take technical terms as inputs and can generate a broad range of heterogeneous textual descriptions that explain the concept of the terms. In experiments using large-scale training collections, we confirmed that more accurate and reasonable sentences are generated by the CHAR-CNN-LSTM model, a word-based LSTM exploiting character embeddings built with convolutional neural networks (CNNs). The results of this study can serve as a basis for an extension model that generates a set of sentences covering the same subjects and, going further, for an artificial intelligence model that automatically creates technical literature.
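
As a reading aid, a CHAR-CNN-LSTM of the general kind named in this abstract (character embeddings pooled by a CNN into word vectors, which feed a word-level LSTM that predicts the next word) could be assembled as in the sketch below. The vocabulary sizes, layer widths, and sequence lengths are invented for illustration; the paper's actual configuration is not reproduced here.

```python
# Hedged sketch of a CHAR-CNN-LSTM next-word generator; all sizes assumed.
import tensorflow as tf

CHARS, WORDS = 100, 5000        # assumed character / word vocabulary sizes
SEQ_LEN, MAX_WORD_LEN = 20, 12  # words per sentence, characters per word

# A sentence arrives as a (SEQ_LEN, MAX_WORD_LEN) grid of character ids.
chars_in = tf.keras.layers.Input((SEQ_LEN, MAX_WORD_LEN), dtype="int32")
char_emb = tf.keras.layers.Embedding(CHARS, 16)(chars_in)

# Character-level CNN, applied per word, pooled into one word embedding.
conv = tf.keras.layers.TimeDistributed(
    tf.keras.layers.Conv1D(64, 3, activation="relu"))(char_emb)
word_emb = tf.keras.layers.TimeDistributed(
    tf.keras.layers.GlobalMaxPooling1D())(conv)

# Word-level LSTM predicts the next word at every position.
hidden = tf.keras.layers.LSTM(128, return_sequences=True)(word_emb)
next_word = tf.keras.layers.Dense(WORDS, activation="softmax")(hidden)

model = tf.keras.Model(chars_in, next_word)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
model.summary()
```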

Beyond Clot Dissolution; Role of Tissue Plasminogen Activator in Central Nervous System

  • Kim, Ji-Woon;Lee, Soon-Young;Joo, So-Hyun;Song, Mi-Ryoung;Shin, Chan-Young
    • Biomolecules & Therapeutics / v.15 no.1 / pp.16-26 / 2007
  • Tissue plasminogen activator (tPA) is a serine protease catalyzing the proteolytic conversion of plasminogen into plasmin, which is involved in thrombolysis. During the last two decades, the role of tPA in brain physiology and pathology has been extensively investigated. tPA is expressed in brain regions such as the cortex, hippocampus, amygdala and cerebellum, and major neural cell types such as neurons, astrocytes, microglia and endothelial cells express tPA under basal conditions. After strong neural stimulation such as a seizure, tPA behaves as an immediate early gene, its expression level increasing within an hour. Neural activity and/or postsynaptic stimulation increases the release of tPA from axonal terminals and presumably from dendritic compartments. Neuronal tPA regulates plastic changes in neuronal function and structure, mediating key neurologic processes such as visual cortex plasticity, seizure spreading, cerebellar motor learning, long-term potentiation, and addictive or withdrawal behavior after morphine discontinuance. In addition to these physiological roles, tPA mediates excitotoxicity leading to neurodegeneration in several pathological conditions including ischemic stroke. An increasing amount of evidence also suggests a role for tPA in neurodegenerative diseases such as Alzheimer's disease and multiple sclerosis, even though beneficial effects have also been reported in the case of Alzheimer's disease, based on the observation of tPA-induced degradation of Aβ aggregates. Target proteins of tPA action include the extracellular matrix protein laminin, proteoglycans and the NMDA receptor. In addition, several receptors (or binding partners) for tPA have been reported, such as low-density lipoprotein receptor-related protein (LRP) and annexin II, even though the intracellular signaling mechanism underlying tPA action is not yet clear. Interestingly, the action of tPA comprises both proteolytic and non-proteolytic mechanisms. In the case of microglial activation, tPA shows a non-proteolytic, cytokine-like function. The search for the exact target proteins and receptor molecules of tPA, along with the identification of the mechanisms regulating tPA expression and release in the nervous system, will enable us to better understand several key neurological processes, like learning and memory, as well as to obtain therapeutic tools against neurodegenerative diseases.

Korean Abbreviation Generation using Sequence to Sequence Learning (Sequence-to-sequence 학습을 이용한 한국어 약어 생성)

  • Choi, Su Jeong;Park, Seong-Bae;Kim, Kweon-Yang
    • KIISE Transactions on Computing Practices / v.23 no.3 / pp.183-187 / 2017
  • Smartphone users prefer fast reading and texting, so they frequently use abbreviated sequences of words and phrases. Nowadays, abbreviations are widely used, from chat terms to technical terms. Therefore, gathering abbreviations would be helpful to many services, including information retrieval, recommendation systems, and so on. However, gathering abbreviations manually requires much effort and cost, because new abbreviations are continuously generated whenever new material such as a TV program or a phenomenon appears. Thus it is necessary to generate abbreviations automatically. To generate Korean abbreviations, existing methods use a rule-based approach, which has limitations in that it cannot generate irregular abbreviations. Another problem is deciding on the correct abbreviation among the candidates generated by the rules. To address these limitations, in this paper we propose a method of generating Korean abbreviations automatically using sequence-to-sequence learning. Sequence-to-sequence learning can generate irregular abbreviations and does not face the problem of deciding among candidate abbreviations; accordingly, it is suitable for generating Korean abbreviations. To evaluate the proposed method, we use two types of datasets. The experimental results show that our method is effective for irregular abbreviations.
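
The sequence-to-sequence setup the abstract relies on is the standard encoder-decoder pair; a minimal sketch is below, with the encoder reading the full phrase and the decoder emitting the abbreviation symbol by symbol. The vocabulary size, embedding width, and hidden size are assumptions for illustration.

```python
# Minimal encoder-decoder (sequence-to-sequence) sketch; sizes assumed.
import tensorflow as tf

VOCAB, UNITS = 2000, 128  # assumed symbol vocabulary and hidden size

# Encoder: read the full phrase, keep only the final LSTM state.
enc_in = tf.keras.layers.Input((None,), dtype="int32")
enc_emb = tf.keras.layers.Embedding(VOCAB, 64)(enc_in)
_, state_h, state_c = tf.keras.layers.LSTM(UNITS, return_state=True)(enc_emb)

# Decoder: generate the abbreviation, seeded with the encoder's state.
# Irregular outputs are possible because nothing forces the decoder to
# copy symbols from the input, as a hand-written rule would.
dec_in = tf.keras.layers.Input((None,), dtype="int32")
dec_emb = tf.keras.layers.Embedding(VOCAB, 64)(dec_in)
dec_seq = tf.keras.layers.LSTM(UNITS, return_sequences=True)(
    dec_emb, initial_state=[state_h, state_c])
probs = tf.keras.layers.Dense(VOCAB, activation="softmax")(dec_seq)

model = tf.keras.Model([enc_in, dec_in], probs)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
model.summary()
```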

The Ability of L2 LSTM Language Models to Learn the Filler-Gap Dependency

  • Kim, Euhee
    • Journal of the Korea Society of Computer and Information / v.25 no.11 / pp.27-40 / 2020
  • In this paper, we investigate the correlation between the amount of English input that Korean learners of English (L2ers) are exposed to and their sentence processing patterns by examining what Long Short-Term Memory (LSTM) language models (LMs) can learn about an implicit syntactic relationship: the filler-gap dependency. The filler-gap dependency refers to the relationship between a (wh-)filler, a wh-phrase like 'what' or 'who' that appears overtly in clause-peripheral position, and its gap in clause-internal position, an invisible, empty syntactic position to be filled by the (wh-)filler for proper interpretation. To model L2ers' English learning, we build LSTM LMs that learn a subset of the known restrictions on the filler-gap dependency from English sentences in an L2 corpus that L2ers can potentially encounter in their English learning. Examining the LSTM LMs' behavior on controlled sentences designed around the filler-gap dependency, we characterize L2ers' sentence processing using the information-theoretic metric of surprisal, which quantifies violations of the filler-gap dependency, i.e., wh-licensing interaction effects. Furthermore, comparing the L2ers' LMs with a native speakers' LM with respect to processing the filler-gap dependency, we not only note that both can track the abstract syntactic structures involved in the dependency, but also show, using linear mixed-effects regression models, that significant differences exist between them in processing such a dependency.
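
The surprisal metric the abstract leans on is simply the negative log-probability a language model assigns to each word given its prefix; a word that violates a filler-gap restriction should receive high surprisal. The toy function below illustrates the computation; the stand-in `lm` callable and the vocabulary size are assumptions, not the paper's models.

```python
# Surprisal(w_t) = -log2 P(w_t | w_1 .. w_{t-1}); the probability would
# come from a trained LSTM LM. `toy_lm` is an assumed stand-in.
import numpy as np

def surprisal(lm, sentence):
    """Per-word surprisal in bits; higher means more unexpected, e.g. at
    a word that violates a filler-gap (wh-licensing) restriction."""
    bits = []
    for t in range(1, len(sentence)):
        p_next = lm(sentence[:t])            # distribution over the vocab
        bits.append(-np.log2(p_next[sentence[t]]))
    return bits

# Toy uniform LM over a 10-word vocabulary, just to make this runnable.
toy_lm = lambda prefix: np.full(10, 0.1)
print(surprisal(toy_lm, [0, 3, 7, 2]))       # ~3.32 bits per word
```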

Reconsidering of critical factors for high quality e-Learning (이 러닝의 질적 우수성에 대한 재고(再考)무엇이 질을 결정하는가?)

  • Cho Eun-Soon
    • Proceedings of the Korea Contents Association Conference / 2005.05a / pp.36-50 / 2005
  • e-Learning has been mushrooming across a wide range of learning groups, from pedagogy to andragogy. Despite increasing e-learning opportunities, many people doubt whether e-learning learners really learn anything. The related research papers emphasize that e-Learning can fail in terms of understanding e-learners and providing intuitive learning activities that engage learners' long-term memory. Current e-Learning strategies tend to be based on the traditional classroom, and this results in boring and ineffective learning outcomes. This paper analyzes, drawing on the research, how learners have received e-Learning over the last few years and explains what the failing aspects of e-Learning could be. To be successful, e-Learning should consider each e-learner's individual learning style and thinking patterns. When considering the various components of e-Learning, quality should not be focused on any single factor; rather, every individual factor should be developed to a high level of quality. In conclusion, this paper suggests that we need a new understanding of e-Learning and the e-Learner. E-Learning strategies should also be examined thoroughly as to whether they take the learners' side and reflect how learners actually learn from e-Learning. Finally, we should bring far more imagination into e-Learning for the next generation, because their learning patterns differ significantly from those of their parents' generation.

A Study on e-Learning Quality Improvement (이 러닝의 질적 향상 방안에 대한 연구)

  • Cho Eun-Soon
    • The Journal of the Korea Contents Association / v.5 no.5 / pp.316-324 / 2005
  • e-Learning has been mushrooming across a wide range of learning groups, from pedagogy to andragogy. As e-learning opportunities increase, many people question whether e-learning shows positive learning effects. The related research emphasizes that e-learning can fail in terms of understanding e-Learners and activating intuitive learning activities that draw on learners' long-term memory. E-learning strategies based on the traditional classroom, which have resulted in boring and ineffective learning outcomes, should be changed to provide authentic and effective learning results. This paper analyzes, drawing on the research, how learners have received e-Learning over the last few years and explains what the failing aspects of e-Learning could be. To be successful, e-learning should consider each e-learner's individual learning style and thinking patterns. When considering the various components of e-Learning, quality should not be focused on any single factor; rather, every individual factor should be developed and integrated into a high level of quality. In conclusion, this paper suggests that new understandings of e-Learning and the e-Learner are needed. E-Learning strategies should also be examined thoroughly as to whether they take the learners' side and reflect how learners actually learn from e-Learning. Finally, we should bring far more imagination into e-learning for the next generation, because the new generation's learning patterns differ significantly from those of their parents' generation.

Proposal of a Step-by-Step Optimized Campus Power Forecast Model using CNN-LSTM Deep Learning (CNN-LSTM 딥러닝 기반 캠퍼스 전력 예측 모델 최적화 단계 제시)

  • Kim, Yein;Lee, Seeun;Kwon, Youngsung
    • Journal of the Korea Academia-Industrial cooperation Society / v.21 no.10 / pp.8-15 / 2020
  • A forecasting method using deep learning does not produce consistent results across datasets with different characteristics, even with the same forecasting model and parameters. For example, forecasting model X optimized on dataset A would not produce optimal results on another dataset B. The forecasting model therefore needs to be optimized to the characteristics of the dataset to increase its accuracy. This paper proposes novel optimization steps, covering outlier removal, dataset classification, and a CNN-LSTM-based hyperparameter tuning process, to forecast the daily power usage of a university campus from hourly data. The proposed model achieves high forecasting accuracy, with a MAPE of 2% using a single power input variable. The proposed model can be used in an EMS to suggest improved strategies to users and consequently improve power efficiency.
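
A CNN-LSTM forecaster of the general shape this abstract tunes (convolutional filters over an hourly window feeding an LSTM, with a dense head producing the forecast) might look like the sketch below. The window length, filter sizes, and output horizon are assumptions, not the paper's tuned hyperparameters.

```python
# Hedged CNN-LSTM sketch for hourly power forecasting; sizes assumed.
import tensorflow as tf

HOURS = 168  # assumed one-week input window at hourly resolution

model = tf.keras.Sequential([
    tf.keras.layers.Input((HOURS, 1)),                  # single power variable
    tf.keras.layers.Conv1D(32, 24, activation="relu"),  # daily-scale filters
    tf.keras.layers.MaxPooling1D(2),
    tf.keras.layers.LSTM(64),                           # longer-range dynamics
    tf.keras.layers.Dense(24),                          # next day's 24 hourly values
])
model.compile(optimizer="adam", loss="mae",
              metrics=[tf.keras.metrics.MeanAbsolutePercentageError()])
model.summary()
```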

Comparison of physics-based and data-driven models for streamflow simulation of the Mekong river (메콩강 유출모의를 위한 물리적 및 데이터 기반 모형의 비교·분석)

  • Lee, Giha;Jung, Sungho;Lee, Daeeop
    • Journal of Korea Water Resources Association / v.51 no.6 / pp.503-514 / 2018
  • Recently, the hydrological regime of the Mekong river has been changing drastically due to climate change and haphazard watershed development, including dam construction. Information on hydrologic features such as the streamflow of the Mekong river is required for water disaster prevention and sustainable water resources development in the countries sharing the river. In this study, runoff simulations at the Kratie station on the lower Mekong river are performed using SWAT (Soil and Water Assessment Tool), a physics-based hydrologic model, and LSTM (Long Short-Term Memory), a data-driven deep learning algorithm. The SWAT model was set up based on globally available databases (topography: HydroSHED; land use: GLCF-MODIS; soil: FAO soil map; rainfall: APHRODITE; etc.) and simulated daily discharge from 2003 to 2007. The LSTM was built using the deep learning open-source library TensorFlow, and its deep-layer neural networks were trained merely on daily water level data from 10 stations upstream of Kratie during two periods, 2000~2002 and 2008~2014; the LSTM then simulated daily discharge for 2003~2007, as in the SWAT model. The simulation results show Nash-Sutcliffe Efficiency (NSE) values of 0.9 (SWAT) and 0.99 (LSTM), respectively. For simply simulating the hydrological time series of ungauged large watersheds, a data-driven model like the LSTM is more applicable than a physics-based hydrological model, whose complexity demands various databases, because the LSTM is able to memorize preceding time series sequences and reflect them in its predictions.
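
The Nash-Sutcliffe Efficiency used above to score both models is a standard goodness-of-fit measure, NSE = 1 - Σ(obs - sim)² / Σ(obs - mean(obs))². The short function below computes it; the sample discharges are invented for illustration.

```python
# Nash-Sutcliffe Efficiency: 1.0 is a perfect fit; 0.0 means the model
# predicts no better than the mean of the observations.
import numpy as np

def nse(observed, simulated):
    observed, simulated = np.asarray(observed), np.asarray(simulated)
    return 1.0 - (np.sum((observed - simulated) ** 2)
                  / np.sum((observed - observed.mean()) ** 2))

obs = np.array([1200.0, 1500.0, 900.0, 2000.0])  # toy daily discharges
print(nse(obs, obs * 0.98))  # close to 1 for a near-perfect simulation
```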