• Title/Summary/Keyword: Receptive fields

Search Result 51

Face recognition using a sparse population coding model for receptive field formation of the simple cells in the primary visual cortex (주 시각피질에서의 단순세포 수용영역 형성에 대한 성긴 집단부호 모델을 이용한 얼굴이식)

  • 김종규;장주석;김영일
    • Journal of the Korean Institute of Telematics and Electronics C / v.34C no.10 / pp.43-50 / 1997
  • In this paper, we present a method for recognizing face images using a sparse population code, a learning model of the receptive fields of simple cells in the primary visual cortex. Twenty front-view facial images from twenty persons were used for training, and 200 varied facial images, 20 per person, were used for testing. The correct recognition rate was 100% for the front-view test images alone, which include images with spectacles or with various expressions, while it was 90% on average over all input images, which include rotated faces. We analyzed the effect of the nonlinear function that determines the sparseness, and compared the recognition rate obtained with the sparse population code against that obtained with eigenvectors (eigenfaces), a compact code that contrasts with the sparse population code.


Prediction of Nonlinear Sequences by Self-Organized CMAC Neural Network (자율조직 CMAC 신경망에 의한 비선형 시계열 예측)

  • 이태호
    • Journal of the Institute of Convergence Signal Processing / v.3 no.4 / pp.62-66 / 2002
  • We report an attempt to use the SOCMAC neural network to predict a nonlinear sequence generated by the Mackey-Glass equation. The report shows that the SOCMAC can handle a system with multi-dimensional continuous inputs, which has been considered a very difficult, if not impossible, task for a CMAC neural network because of the huge amount of memory required. An improved training method based on variable receptive fields is also proposed. The performance fell in roughly the same range as that of TDNN and BP neural networks.

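For readers unfamiliar with the benchmark named in the abstract above, the following is a minimal sketch of how a Mackey-Glass sequence can be generated and cut into input/target pairs for a predictor. It is not the paper's setup: the Euler discretization, the constant initial history, the parameter values (`tau=17`, `beta=0.2`, `gamma=0.1`, `n=10`), and the helper names `mackey_glass` and `make_windows` are all assumptions chosen for illustration.

```python
def mackey_glass(n_samples, tau=17, beta=0.2, gamma=0.1, n=10, dt=1.0, x0=1.2):
    """Generate a Mackey-Glass series with a simple Euler discretization.

    dx/dt = beta * x(t - tau) / (1 + x(t - tau)**n) - gamma * x(t)
    All parameter defaults are illustrative assumptions.
    """
    delay = int(round(tau / dt))
    # Assumed constant history: x(t) = x0 for all t <= 0.
    x = [x0] * (delay + 1)
    for _ in range(n_samples - 1):
        x_tau = x[-1 - delay]          # delayed value x(t - tau)
        x_now = x[-1]                  # current value x(t)
        dx = beta * x_tau / (1.0 + x_tau ** n) - gamma * x_now
        x.append(x_now + dt * dx)
    return x[delay:]                   # drop the artificial history


def make_windows(series, n_inputs=4, horizon=1):
    """Build (inputs, target) pairs: n_inputs past values predict a future one."""
    samples = []
    for t in range(n_inputs - 1, len(series) - horizon):
        inputs = series[t - n_inputs + 1 : t + 1]
        target = series[t + horizon]
        samples.append((inputs, target))
    return samples


series = mackey_glass(500)
samples = make_windows(series)
```

Each element of `samples` pairs a short window of past values with the value to be predicted, which is the kind of multi-dimensional continuous input the abstract refers to.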

A Multi-Stage Convolution Machine with Scaling and Dilation for Human Pose Estimation

  • Nie, Yali;Lee, Jaehwan;Yoon, Sook;Park, Dong Sun
    • KSII Transactions on Internet and Information Systems (TIIS) / v.13 no.6 / pp.3182-3198 / 2019
  • Vision-based human pose estimation is considered a challenging research subject because of problems such as confounding background clutter, the diversity of human appearances, and illumination changes in scenes. To tackle these problems, we propose a new multi-stage convolution machine for estimating human pose. To provide better heatmap predictions of body joints, the proposed machine repeatedly produces predictions stage by stage, with a receptive field large enough to learn long-range spatial relationships. The stages are composed of various modules according to their strategic purposes. A pyramid stacking module and a dilation module are used to handle human pose at multiple scales. The multi-scale information from their different receptive fields is fused by concatenation, which captures more contextual information from different features. In addition, the spatial and channel information of a given input is converted to gating factors by squeezing each feature map to a single numeric value based on its importance, so that each network channel receives a different weight. Compared with other ConvNet-based architectures, our proposed architecture achieved higher accuracy in experiments on the standard LSP and MPII pose benchmarks.
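The effect of dilation on receptive field size can be checked with a short calculation. The sketch below uses the standard recurrence for stacked convolution layers; the layer configurations and the function name `receptive_field` are hypothetical and do not describe the architecture in the paper.

```python
def receptive_field(layers):
    """Receptive field (in input pixels) of a stack of convolution layers.

    layers: list of (kernel_size, stride, dilation) tuples, first layer first.
    """
    rf, jump = 1, 1                      # jump = input-pixel distance between adjacent outputs
    for kernel, stride, dilation in layers:
        k_eff = dilation * (kernel - 1) + 1   # effective kernel size with dilation
        rf += (k_eff - 1) * jump
        jump *= stride
    return rf


# Three plain 3x3 convolutions versus three 3x3 convolutions with dilations 1, 2, 4.
plain = [(3, 1, 1), (3, 1, 1), (3, 1, 1)]
dilated = [(3, 1, 1), (3, 1, 2), (3, 1, 4)]
rf_plain = receptive_field(plain)        # 7
rf_dilated = receptive_field(dilated)    # 15
```

With the same number of weights, the dilated stack sees a 15-pixel-wide region instead of 7, which is why dilation is a common way to capture long-range context.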

ASPPMVSNet: A high-receptive-field multiview stereo network for dense three-dimensional reconstruction

  • Saleh Saeed;Sungjun Lee;Yongju Cho;Unsang Park
    • ETRI Journal / v.44 no.6 / pp.1034-1046 / 2022
  • The learning-based multiview stereo (MVS) methods for three-dimensional (3D) reconstruction generally use 3D volumes for depth inference. The quality of the reconstructed depth maps and the corresponding point clouds is directly influenced by the spatial resolution of the 3D volume. Consequently, these methods produce point clouds with sparse local regions because of the lack of the memory required to encode a high volume of information. Here, we apply the atrous spatial pyramid pooling (ASPP) module in MVS methods to obtain dense feature maps with multiscale, long-range, contextual information using high receptive fields. For a given 3D volume with the same spatial resolution as that in the MVS methods, the dense feature maps from the ASPP module encoded with superior information can produce dense point clouds without a high memory footprint. Furthermore, we propose a 3D loss for training the MVS networks, which improves the predicted depth values by 24.44%. The ASPP module provides state-of-the-art qualitative results by constructing relatively dense point clouds, which improves the DTU MVS dataset benchmarks by 2.25% compared with those achieved in the previous MVS methods.

Pixel-Wise Polynomial Estimation Model for Low-Light Image Enhancement

  • Muhammad Tahir Rasheed;Daming Shi
    • KSII Transactions on Internet and Information Systems (TIIS) / v.17 no.9 / pp.2483-2504 / 2023
  • Most existing low-light enhancement algorithms either use a large number of training parameters or lack generalization to real-world scenarios. This paper presents a novel lightweight and robust pixel-wise polynomial approximation-based deep network for low-light image enhancement. For mapping the low-light image to the enhanced image, pixel-wise higher-order polynomials are employed. A deep convolution network is used to estimate the coefficients of these higher-order polynomials. The proposed network uses multiple branches to estimate pixel values based on different receptive fields. With a smaller receptive field, the first branch enhanced local features, the second and third branches focused on medium-level features, and the last branch enhanced global features. The low-light image is downsampled by a factor of 2^(b-1) (b is the branch number) and fed as input to each branch. After combining the outputs of each branch, the final enhanced image is obtained. A comprehensive evaluation of our proposed network on six publicly available no-reference test datasets shows that it outperforms state-of-the-art methods on both quantitative and qualitative measures.
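To make the idea of a pixel-wise polynomial mapping concrete, here is a minimal sketch under stated assumptions: each pixel has its own coefficients a_0 ... a_K, and the enhanced value is the sum of a_k times the input intensity raised to the power k. The exact polynomial form used in the paper is not given in the abstract, and in the paper the coefficients come from a deep network; here they are fixed by hand, and the function name `apply_pixelwise_polynomial` is hypothetical.

```python
def apply_pixelwise_polynomial(image, coeffs):
    """Map each pixel through its own polynomial.

    image:  2D list of intensities in [0, 1].
    coeffs: list [a_0, a_1, ..., a_K] of 2D coefficient maps, same size as image.
    Returns a 2D list with values clipped to [0, 1].
    """
    height, width = len(image), len(image[0])
    out = [[0.0] * width for _ in range(height)]
    for i in range(height):
        for j in range(width):
            value = 0.0
            # Horner's scheme: start from the highest-order coefficient.
            for a_k in reversed(coeffs):
                value = value * image[i][j] + a_k[i][j]
            out[i][j] = min(1.0, max(0.0, value))
    return out


# Hand-picked coefficients (assumption): y = 2x - x^2 at every pixel, which brightens dark values.
image = [[0.1, 0.2], [0.3, 0.4]]
a0 = [[0.0, 0.0], [0.0, 0.0]]
a1 = [[2.0, 2.0], [2.0, 2.0]]
a2 = [[-1.0, -1.0], [-1.0, -1.0]]
enhanced = apply_pixelwise_polynomial(image, [a0, a1, a2])
```

Because the coefficient maps vary per pixel, a learned version can brighten dark regions strongly while leaving already bright regions nearly unchanged.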

Multi-scale context fusion network for melanoma segmentation

  • Zhenhua Li;Lei Zhang
    • KSII Transactions on Internet and Information Systems (TIIS) / v.18 no.7 / pp.1888-1906 / 2024
  • Melanoma images have fuzzy edges, low contrast with the background, and hair occlusion, all of which make accurate segmentation difficult. To address these problems, this paper proposes MSCNet, a melanoma segmentation model based on the U-Net framework. First, a multi-scale pyramid fusion module is designed to reconstruct the skip connections and transmit global information to the decoder. Second, a contextual information conduction module is added to the top of the encoder. The module provides different receptive fields for the segmented target by using dilated (atrous) convolutions with different dilation rates, so as to better fuse multi-scale contextual information. In addition, to suppress redundant information in the input image and pay more attention to melanoma features, a global channel attention mechanism is introduced into the decoder. Finally, to address lesion class imbalance, a combined loss function is used. The algorithm is verified on the ISIC 2017 and ISIC 2018 public datasets. The experimental results indicate that the proposed algorithm segments melanoma more accurately than other CNN-based image segmentation algorithms.

Partially Connected Multi-Layer Perceptrons and their Combination for Off-line Handwritten Hangul Recognition (오프라인 필기체 전표용 한글 인식을 위한 부분 연결 다층 신경망과 결합)

  • 백영목;임길택;진성일
    • Journal of the Korean Institute of Telematics and Electronics C / v.36C no.4 / pp.87-94 / 1999
  • This paper presents a study on off-line handwritten Hangul (Korean) character recognition using the partially connected neural network (PCNN), which is based on partial connections between the input receptive fields and the hidden nodes. The hidden nodes of the three PCNNs have ten receptive fields and different input feature sets. We also introduce the modular partially connected neural network (MPCNN), which combines three PCNNs with a merging network. The learning scheme of the proposed networks consists of two steps: a PCNN learning step and a merging step that combines the three PCNNs. In the merging step, another merging PCNN is introduced and trained by treating the hidden output of each PCNN as a new input feature vector. The performance of the proposed classifier is verified on the recognition of 18 off-line handwritten Hangul characters widely used in business cards in Korea.


Controlled Release of 5-Fluorouracil from Crosslinked Poly(2-hydroxyethyl methacrylate) (가교된 Poly(2-hydroxyethyl methacrylate)로 부터의 5-Fluorouracil의 방출성)

  • Cho, Chong-Su;Chung, Sook-Ja;Lee, Kang-Choon
    • Journal of Biomedical Engineering Research / v.10 no.2 / pp.191-194 / 1989
  • A new neural network architecture for recognizing patterns in images is proposed, which is partially based on the results of physiological studies. The proposed network is composed of multiple layers, and the nerve cells in each layer are connected by spatial filters that approximate receptive fields in optic nerve fields. In the proposed method, pattern recognition for complicated images is carried out using global features as well as local features such as lines and end-points. A new method of generating matched filters that represent global features is also proposed for this network.


Deep Learning Architectures and Applications (딥러닝의 모형과 응용사례)

  • Ahn, SungMahn
    • Journal of Intelligence and Information Systems / v.22 no.2 / pp.127-142 / 2016
  • A deep learning model is a kind of neural network that allows multiple hidden layers. There are various deep learning architectures, such as convolutional neural networks, deep belief networks, and recurrent neural networks. These have been applied to fields such as computer vision, automatic speech recognition, natural language processing, audio recognition, and bioinformatics, where they have been shown to produce state-of-the-art results on various tasks. Among these architectures, convolutional neural networks and recurrent neural networks are classified as supervised learning models. In recent years, these supervised learning models have gained more popularity than unsupervised learning models such as deep belief networks, because supervised learning models have found prominent applications in the fields mentioned above. Deep learning models can be trained with the backpropagation algorithm. Backpropagation is an abbreviation for "backward propagation of errors" and is a common method of training artificial neural networks, used in conjunction with an optimization method such as gradient descent. The method calculates the gradient of an error function with respect to all the weights in the network. The gradient is fed to the optimization method, which in turn uses it to update the weights in an attempt to minimize the error function. Convolutional neural networks use a special architecture that is particularly well adapted to classifying images. Using this architecture makes convolutional networks fast to train. This, in turn, helps us train deep, multi-layer networks, which are very good at classifying images. These days, deep convolutional networks are used in most neural networks for image recognition. Convolutional neural networks use three basic ideas: local receptive fields, shared weights, and pooling.
By local receptive fields, we mean that each neuron in the first (or any) hidden layer is connected to a small region of the input (or previous layer's) neurons. Shared weights mean that the same weights and bias are used for each local receptive field. This means that all the neurons in the hidden layer detect exactly the same feature, just at different locations in the input image. In addition to the convolutional layers just described, convolutional neural networks also contain pooling layers. Pooling layers are usually used immediately after convolutional layers. What the pooling layers do is simplify the information in the output from the convolutional layer. Recent convolutional network architectures have 10 to 20 hidden layers and billions of connections between units. Training deep learning networks took weeks several years ago, but thanks to progress in GPUs and algorithm enhancements, training time has been reduced to several hours. Neural networks with time-varying behavior are known as recurrent neural networks, or RNNs. A recurrent neural network is a class of artificial neural network in which connections between units form a directed cycle. This creates an internal state of the network that allows it to exhibit dynamic temporal behavior. Unlike feedforward neural networks, RNNs can use their internal memory to process arbitrary sequences of inputs. Early RNN models turned out to be very difficult to train, harder even than deep feedforward networks. The reason is the unstable gradient problem, that is, vanishing and exploding gradients. The gradient can get smaller and smaller as it is propagated back through the layers, which makes learning in early layers extremely slow. The problem gets worse in RNNs, since gradients are propagated backward not only through layers but also through time. If the network runs for a long time, the gradient can become extremely unstable and hard to learn from.
It has been possible to incorporate an idea known as long short-term memory units (LSTMs) into RNNs. LSTMs make it much easier to get good results when training RNNs, and many recent papers make use of LSTMs or related ideas.
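The three ideas named in the abstract above (local receptive fields, shared weights, and pooling) can be illustrated with a small pure-Python sketch. This is not code from the paper: the 5x5 toy image, the 2x2 edge kernel, and the function names `conv2d_valid` and `max_pool` are assumptions chosen for illustration.

```python
def conv2d_valid(image, kernel, bias=0.0):
    """Slide one shared kernel over the image (no padding, stride 1).

    Each output value depends only on a small kh x kw patch of the input
    (its local receptive field), and every patch uses the same weights.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            bias + sum(
                image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh)
                for dj in range(kw)
            )
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def max_pool(feature_map, size=2):
    """Non-overlapping max pooling: keep the strongest response in each block."""
    return [
        [
            max(feature_map[i + di][j + dj] for di in range(size) for dj in range(size))
            for j in range(0, len(feature_map[0]) - size + 1, size)
        ]
        for i in range(0, len(feature_map) - size + 1, size)
    ]


# Toy image with a vertical edge between columns 1 and 2.
image = [[0, 0, 1, 1, 1] for _ in range(5)]
# Shared 2x2 kernel that responds to a dark-to-bright step from left to right.
kernel = [[-1, 1], [-1, 1]]

feature_map = conv2d_valid(image, kernel)   # 4x4 map, each row is [0.0, 2.0, 0.0, 0.0]
pooled = max_pool(feature_map)              # [[2.0, 0.0], [2.0, 0.0]]
```

The same kernel fires wherever the edge appears, and pooling keeps only the fact that an edge exists within each 2x2 block, which is the simplification the abstract describes.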

A Study on Static Situation Awareness System with the Aid of Optimized Polynomial Radial Basis Function Neural Networks (최적화된 pRBF 뉴럴 네트워크에 의한 정적 상황 인지 시스템에 관한 연구)

  • Oh, Sung-Kwun;Na, Hyun-Suk;Kim, Wook-Dong
    • The Transactions of The Korean Institute of Electrical Engineers / v.60 no.12 / pp.2352-2360 / 2011
  • In this paper, we introduce a comprehensive design methodology for Radial Basis Function Neural Networks (RBFNN) based on clustering and optimization algorithms. A clustering algorithm divides the input dataset into clusters based on similarity. The number of clusters equals the number of nodes in the hidden layer, and the center of each cluster is used as the center of the corresponding receptive field in the hidden layer. In this study, we applied the Fuzzy C-Means (FCM) and K-Means (KM) clustering algorithms and compared them. The connection weights of the model are expanded into polynomial functions, such as linear and quadratic ones, so that the output of the model expresses a relation between input and output. To obtain the optimal structure and better performance, Particle Swarm Optimization (PSO) is used. Structural optimization yields the number of clusters and the polynomial order of the connection weights, while parametric optimization yields the widths of the receptive fields. To evaluate the performance of the proposed model, NXT equipment offered by National Instruments (NI) is used. The intelligent model for the situation awareness system was built from an experimental dataset of distance information measured between an object and diverse sensors of the NXT equipment, such as the sound sensor, light sensor, and ultrasonic sensor.
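The structure described in the abstract above, with Gaussian receptive fields placed at cluster centers and polynomial connection weights, can be sketched as a forward pass. This is only an illustration under assumptions: the Gaussian form, the unnormalized weighted sum, the linear (first-order) weights, and the names `rbf_activations` and `prbf_output` are not taken from the paper, and the centers, widths, and coefficients are set by hand rather than found by clustering or PSO.

```python
import math


def rbf_activations(x, centers, widths):
    """Gaussian receptive field response of each hidden node to input x."""
    return [
        math.exp(-sum((xi - ci) ** 2 for xi, ci in zip(x, center)) / (2.0 * width ** 2))
        for center, width in zip(centers, widths)
    ]


def prbf_output(x, centers, widths, linear_weights):
    """Output with first-order polynomial weights (assumed form).

    linear_weights[k] = [a0, a1, ..., ad], so node k contributes
    activation_k * (a0 + a1*x1 + ... + ad*xd).
    """
    output = 0.0
    for act, coeffs in zip(rbf_activations(x, centers, widths), linear_weights):
        local_model = coeffs[0] + sum(a * xi for a, xi in zip(coeffs[1:], x))
        output += act * local_model
    return output


# Hand-set example values (assumptions); in the paper these come from clustering and PSO.
centers = [[0.0, 0.0], [1.0, 1.0]]
widths = [1.0, 1.0]
linear_weights = [[1.0, 0.0, 0.0], [2.0, 1.0, 1.0]]

acts = rbf_activations([0.0, 0.0], centers, widths)
y = prbf_output([0.0, 0.0], centers, widths, linear_weights)
```

An input that coincides with a center activates that node fully, and the node's local polynomial then determines how much it contributes to the output.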