• Title/Summary/Keyword: Input preprocessing

Search Result 295, Processing Time 0.019 seconds

R-lambda Model based Rate Control for GOP Parallel Coding in A Real-Time HEVC Software Encoder (HEVC 실시간 소프트웨어 인코더에서 GOP 병렬 부호화를 지원하는 R-lambda 모델 기반의 율 제어 방법)

  • Kim, Dae-Eun;Chang, Yongjun;Kim, Munchurl;Lim, Woong;Kim, Hui Yong;Seok, Jin Wook
    • Journal of Broadcast Engineering
    • /
    • v.22 no.2
    • /
    • pp.193-206
    • /
    • 2017
  • In this paper, we propose a rate control method based on the $R-{\lambda}$ model that supports a parallel encoding structure in GOP levels or IDR period levels for 4K UHD input video in real-time. For this, a slice-level bit allocation method is proposed for parallel encoding instead of sequential encoding. When a rate control algorithm is applied in the GOP level or IDR period level parallelism, the information of how many bits are consumed cannot be shared among the frames belonging to a same frame level except the lowest frame level of the hierarchical B structure. Therefore, it is impossible to manage the bit budget with the existing bit allocation method. In order to solve this problem, we improve the bit allocation procedure of the conventional ones that allocate target bits sequentially according to the encoding order. That is, the proposed bit allocation strategy is to assign the target bits in GOPs first, then to distribute the assigned target bits from the lowest depth level to the highest depth level of the HEVC hierarchical B structure within each GOP. In addition, we proposed a processing method that is used to improve subjective image qualities by allocating the bits according to the coding complexities of the frames. Experimental results show that the proposed bit allocation method works well for frame-level parallel HEVC software encoders and it is confirmed that the performance of our rate controller can be improved with a more elaborate bit allocation strategy by using the preprocessing results.

Landslide Susceptibility Mapping Using Deep Neural Network and Convolutional Neural Network (Deep Neural Network와 Convolutional Neural Network 모델을 이용한 산사태 취약성 매핑)

  • Gong, Sung-Hyun;Baek, Won-Kyung;Jung, Hyung-Sup
    • Korean Journal of Remote Sensing
    • /
    • v.38 no.6_2
    • /
    • pp.1723-1735
    • /
    • 2022
  • Landslides are one of the most prevalent natural disasters, threating both humans and property. Also landslides can cause damage at the national level, so effective prediction and prevention are essential. Research to produce a landslide susceptibility map with high accuracy is steadily being conducted, and various models have been applied to landslide susceptibility analysis. Pixel-based machine learning models such as frequency ratio models, logistic regression models, ensembles models, and Artificial Neural Networks have been mainly applied. Recent studies have shown that the kernel-based convolutional neural network (CNN) technique is effective and that the spatial characteristics of input data have a significant effect on the accuracy of landslide susceptibility mapping. For this reason, the purpose of this study is to analyze landslide vulnerability using a pixel-based deep neural network model and a patch-based convolutional neural network model. The research area was set up in Gangwon-do, including Inje, Gangneung, and Pyeongchang, where landslides occurred frequently and damaged. Landslide-related factors include slope, curvature, stream power index (SPI), topographic wetness index (TWI), topographic position index (TPI), timber diameter, timber age, lithology, land use, soil depth, soil parent material, lineament density, fault density, normalized difference vegetation index (NDVI) and normalized difference water index (NDWI) were used. Landslide-related factors were built into a spatial database through data preprocessing, and landslide susceptibility map was predicted using deep neural network (DNN) and CNN models. The model and landslide susceptibility map were verified through average precision (AP) and root mean square errors (RMSE), and as a result of the verification, the patch-based CNN model showed 3.4% improved performance compared to the pixel-based DNN model. The results of this study can be used to predict landslides and are expected to serve as a scientific basis for establishing land use policies and landslide management policies.

Development of deep learning structure for complex microbial incubator applying deep learning prediction result information (딥러닝 예측 결과 정보를 적용하는 복합 미생물 배양기를 위한 딥러닝 구조 개발)

  • Hong-Jik Kim;Won-Bog Lee;Seung-Ho Lee
    • Journal of IKEEE
    • /
    • v.27 no.1
    • /
    • pp.116-121
    • /
    • 2023
  • In this paper, we develop a deep learning structure for a complex microbial incubator that applies deep learning prediction result information. The proposed complex microbial incubator consists of pre-processing of complex microbial data, conversion of complex microbial data structure, design of deep learning network, learning of the designed deep learning network, and GUI development applied to the prototype. In the complex microbial data preprocessing, one-hot encoding is performed on the amount of molasses, nutrients, plant extract, salt, etc. required for microbial culture, and the maximum-minimum normalization method for the pH concentration measured as a result of the culture and the number of microbial cells to preprocess the data. In the complex microbial data structure conversion, the preprocessed data is converted into a graph structure by connecting the water temperature and the number of microbial cells, and then expressed as an adjacency matrix and attribute information to be used as input data for a deep learning network. In deep learning network design, complex microbial data is learned by designing a graph convolutional network specialized for graph structures. The designed deep learning network uses a cosine loss function to proceed with learning in the direction of minimizing the error that occurs during learning. GUI development applied to the prototype shows the target pH concentration (3.8 or less) and the number of cells (108 or more) of complex microorganisms in an order suitable for culturing according to the water temperature selected by the user. In order to evaluate the performance of the proposed microbial incubator, the results of experiments conducted by authorized testing institutes showed that the average pH was 3.7 and the number of cells of complex microorganisms was 1.7 × 108. Therefore, the effectiveness of the deep learning structure for the complex microbial incubator applying the deep learning prediction result information proposed in this paper was proven.

Estimation of Rice Heading Date of Paddy Rice from Slanted and Top-view Images Using Deep Learning Classification Model (딥 러닝 분류 모델을 이용한 직하방과 경사각 영상 기반의 벼 출수기 판별)

  • Hyeok-jin Bak;Wan-Gyu Sang;Sungyul Chang;Dongwon Kwon;Woo-jin Im;Ji-hyeon Lee;Nam-jin Chung;Jung-Il Cho
    • Korean Journal of Agricultural and Forest Meteorology
    • /
    • v.25 no.4
    • /
    • pp.337-345
    • /
    • 2023
  • Estimating the rice heading date is one of the most crucial agricultural tasks related to productivity. However, due to abnormal climates around the world, it is becoming increasingly challenging to estimate the rice heading date. Therefore, a more objective classification method for estimating the rice heading date is needed than the existing methods. This study, we aimed to classify the rice heading stage from various images using a CNN classification model. We collected top-view images taken from a drone and a phenotyping tower, as well as slanted-view images captured with a RGB camera. The collected images underwent preprocessing to prepare them as input data for the CNN model. The CNN architectures employed were ResNet50, InceptionV3, and VGG19, which are commonly used in image classification models. The accuracy of the models all showed an accuracy of 0.98 or higher regardless of each architecture and type of image. We also used Grad-CAM to visually check which features of the image the model looked at and classified. Then verified our model accurately measure the rice heading date in paddy fields. The rice heading date was estimated to be approximately one day apart on average in the four paddy fields. This method suggests that the water head can be estimated automatically and quantitatively when estimating the rice heading date from various paddy field monitoring images.

Application of Support Vector Regression for Improving the Performance of the Emotion Prediction Model (감정예측모형의 성과개선을 위한 Support Vector Regression 응용)

  • Kim, Seongjin;Ryoo, Eunchung;Jung, Min Kyu;Kim, Jae Kyeong;Ahn, Hyunchul
    • Journal of Intelligence and Information Systems
    • /
    • v.18 no.3
    • /
    • pp.185-202
    • /
    • 2012
  • .Since the value of information has been realized in the information society, the usage and collection of information has become important. A facial expression that contains thousands of information as an artistic painting can be described in thousands of words. Followed by the idea, there has recently been a number of attempts to provide customers and companies with an intelligent service, which enables the perception of human emotions through one's facial expressions. For example, MIT Media Lab, the leading organization in this research area, has developed the human emotion prediction model, and has applied their studies to the commercial business. In the academic area, a number of the conventional methods such as Multiple Regression Analysis (MRA) or Artificial Neural Networks (ANN) have been applied to predict human emotion in prior studies. However, MRA is generally criticized because of its low prediction accuracy. This is inevitable since MRA can only explain the linear relationship between the dependent variables and the independent variable. To mitigate the limitations of MRA, some studies like Jung and Kim (2012) have used ANN as the alternative, and they reported that ANN generated more accurate prediction than the statistical methods like MRA. However, it has also been criticized due to over fitting and the difficulty of the network design (e.g. setting the number of the layers and the number of the nodes in the hidden layers). Under this background, we propose a novel model using Support Vector Regression (SVR) in order to increase the prediction accuracy. SVR is an extensive version of Support Vector Machine (SVM) designated to solve the regression problems. The model produced by SVR only depends on a subset of the training data, because the cost function for building the model ignores any training data that is close (within a threshold ${\varepsilon}$) to the model prediction. Using SVR, we tried to build a model that can measure the level of arousal and valence from the facial features. To validate the usefulness of the proposed model, we collected the data of facial reactions when providing appropriate visual stimulating contents, and extracted the features from the data. Next, the steps of the preprocessing were taken to choose statistically significant variables. In total, 297 cases were used for the experiment. As the comparative models, we also applied MRA and ANN to the same data set. For SVR, we adopted '${\varepsilon}$-insensitive loss function', and 'grid search' technique to find the optimal values of the parameters like C, d, ${\sigma}^2$, and ${\varepsilon}$. In the case of ANN, we adopted a standard three-layer backpropagation network, which has a single hidden layer. The learning rate and momentum rate of ANN were set to 10%, and we used sigmoid function as the transfer function of hidden and output nodes. We performed the experiments repeatedly by varying the number of nodes in the hidden layer to n/2, n, 3n/2, and 2n, where n is the number of the input variables. The stopping condition for ANN was set to 50,000 learning events. And, we used MAE (Mean Absolute Error) as the measure for performance comparison. From the experiment, we found that SVR achieved the highest prediction accuracy for the hold-out data set compared to MRA and ANN. Regardless of the target variables (the level of arousal, or the level of positive / negative valence), SVR showed the best performance for the hold-out data set. ANN also outperformed MRA, however, it showed the considerably lower prediction accuracy than SVR for both target variables. The findings of our research are expected to be useful to the researchers or practitioners who are willing to build the models for recognizing human emotions.