• Title/Summary/Keyword: Pooling Method

YOLO based Optical Music Recognition and Virtual Reality Content Creation Method (YOLO 기반의 광학 음악 인식 기술 및 가상현실 콘텐츠 제작 방법)

  • Oh, Kyeongmin;Hong, Yoseop;Baek, Geonyeong;Chun, Chanjun
    • Smart Media Journal / v.10 no.4 / pp.80-90 / 2021
  • We propose applying the results of deep-learning-based optical music recognition to VR games. YOLO v5 was used as the deep learning model to detect musical objects in sheet music, and the Hough transform, together with an adjustment of the staff size, was employed to find objects that the model missed. The output result files are analyzed to obtain the BPM, the maximum number of combos, and the musical notes used in the VR game, and object pooling is used for resource management to prevent a backlog of notes. In this way, VR games can be produced from musical elements derived by optical music recognition, which broadens the use of optical music recognition while providing VR content.
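
The object pooling idea mentioned in this abstract can be sketched as follows. This is a minimal illustration only, not the authors' implementation: the `Note` and `NotePool` names, the fixed pool size, and the choice to return `None` when the pool is empty are all assumptions made for the example.

```python
class Note:
    """A reusable game object (hypothetical); its fields are reset on reuse."""

    def __init__(self):
        self.pitch = None
        self.active = False


class NotePool:
    """Fixed-size pool that hands out pre-allocated Note objects."""

    def __init__(self, size):
        # Allocate every object up front so nothing is created during play.
        self._free = [Note() for _ in range(size)]

    def acquire(self, pitch):
        # Assumption: an exhausted pool returns None instead of allocating more.
        if not self._free:
            return None
        note = self._free.pop()
        note.pitch = pitch
        note.active = True
        return note

    def release(self, note):
        # Reset the state and make the object available for later reuse.
        note.pitch = None
        note.active = False
        self._free.append(note)

    def available(self):
        return len(self._free)
```

Because released objects are reused rather than destroyed and re-created, the number of live note objects stays bounded by the pool size.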

Modified AWSSDR method for frequency-dependent reverberation time estimation (주파수 대역별 잔향시간 추정을 위한 변형된 AWSSDR 방식)

  • Min Sik Kim;Hyung Soon Kim
    • Phonetics and Speech Sciences / v.15 no.4 / pp.91-100 / 2023
  • Reverberation time (T60) is a typical acoustic parameter that provides information about reverberation. Since the impact of reverberation varies with frequency band even in the same space, frequency-dependent (FD) T60, which offers detailed insight into the acoustic environment, can be useful. However, most conventional blind T60 estimation methods, which estimate T60 from speech signals, focus on fullband T60 estimation, and the few blind FDT60 estimation methods that exist commonly perform poorly in the low-frequency bands. This paper introduces a modified approach based on the Attentive pooling based Weighted Sum of Spectral Decay Rates (AWSSDR), previously proposed for blind T60 estimation, by extending its target from fullband T60 to FDT60. The experimental results show that the proposed method outperforms conventional blind FDT60 estimation methods on the Acoustic Characterization of Environments (ACE) challenge evaluation dataset. Notably, it consistently exhibits excellent estimation performance in all frequency bands. This demonstrates that the mechanism of the AWSSDR method is valuable for blind FDT60 estimation: it reflects the FD variations in the impact of reverberation, aggregating information about FDT60 from the speech signal by processing the spectral decay rates associated with the physical properties of reverberation in each frequency band.
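
As a rough illustration of what attentive pooling into a weighted sum means, the sketch below combines a list of values using softmax-normalized attention scores. It is not the AWSSDR method itself: in the paper the scores would come from a learned model and the values would be spectral decay rates, whereas here both are plain numbers supplied by hand, and the function name `attentive_pool` is hypothetical.

```python
import math


def attentive_pool(values, scores):
    """Return the softmax-weighted sum of `values` and the weights used."""
    if len(values) != len(scores) or not values:
        raise ValueError("values and scores must be non-empty and of equal length")
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    pooled = sum(w * v for w, v in zip(weights, values))
    return pooled, weights
```

With equal scores the result reduces to the plain average; a much larger score on one element makes the pooled value approach that element.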

Characteristics of Sputtered Ta films by Statistical Method (통계적 실험 방법에 의한 Ta 박막의 증착 특성 연구)

  • Seo, Yu-Seok;Park, Dae-Gyu;Jeong, Cheol-Mo;Kim, Sang-Beom;Son, Pyeong-Geun;Lee, Seung-Jin;Kim, Han-Min;Yang, Hong-Seon;Park, Jin-Won
    • Korean Journal of Materials Research / v.11 no.6 / pp.492-497 / 2001
  • We report the characteristics of sputter-deposited Ta films and their dependence on process parameters. Properties of the as-deposited Ta films, such as deposition rate, resistivity, Rs uniformity, reflectivity, and stress, were investigated and analyzed as functions of the process parameters using a statistical experimental method. The functional relationships between the independent and dependent variables were predicted by response surface analysis. The optimal deposition condition for DC magnetron sputtered Ta films, judged by resistivity and Rs uniformity, was a chamber pressure of 2 mTorr, a power density of 8 W/cm$^2$, and a substrate temperature of 20$^{\circ}$C. The goodness of fit of the quadratic model, as evaluated by R-squared, was 0.85 to 0.9 without pooling. The as-deposited Ta films exhibited a resistivity of ~180 $\mu\Omega$cm with an Rs uniformity of ~2%. Transmission electron microscopy and x-ray diffractometry identified the phase of the as-deposited film as $\beta$-Ta with a grain size of 100~200.

Efficient UML Modeling Method for Remote University Application EJB Component Extraction (원격대학 애플리케이션용 EJB 컴포넌트 추출을 위한 UML 설계에 관한 연구)

  • 반길우;최유순;박종구
    • KSCI Review / v.8 no.1 / pp.29-36 / 2001
  • The EJB application development environment supports component-based, object-oriented distributed processing; it is a component architecture for distributed deployment. Applications developed with EJB are assembled from components, which makes business programs easier to develop. EJB automatically handles security, resource pooling, persistence, concurrency, and transaction transparency. This paper illustrates a method for extracting EJB components that gives the development environment sufficient flexibility, and applies it to the remote university application domain.

Efficient Yard Tractor Control Method for the Dual Cycling in Container Terminal (효율적인 듀얼 사이클을 위한 야드 트랙터 통제 방법)

  • Chung, Chang-Yun;Shin, Jae-Young
    • Journal of Navigation and Port Research / v.36 no.1 / pp.69-74 / 2012
  • In today's global supply chains, improving the efficiency of the container shipping process is very important. In overseas shipping, very large container ships are commonly used to handle increasing cargo volumes. Container terminal managers are therefore experimenting with double-cycle and dual-cycle operations, in which ship loading and unloading are carried out simultaneously, to maximize quay-side productivity. Yard tractor (YT) pooling methods have also been introduced to make the assignment of YTs more efficient. In this paper, we analyze the efficiency of dual cycling by comparing existing pooling methods with a method modified for dual cycling. We developed a simulation model using the simulation analysis software Arena. The experimental results show that more dual cycling does not always increase the gross crane rate (GCR), that is, the productivity of quay cranes (QCs) per hour.

Deep Learning based Photo Horizon Correction (딥러닝을 이용한 영상 수평 보정)

  • Hong, Eunbin;Jeon, Junho;Cho, Sunghyun;Lee, Seungyong
    • Journal of the Korea Computer Graphics Society / v.23 no.3 / pp.95-103 / 2017
  • Horizon correction is a crucial stage for image composition enhancement. In this paper, we propose a deep learning based method for estimating the slanted angle of a photograph and correcting it. To estimate and correct the horizon direction, existing methods use hand-crafted low-level features such as lines, planes, and gradient distributions. However, these methods may not work well on the images that contain no lines or planes. To tackle this limitation and robustly estimate the slanted angle, we propose a convolutional neural network (CNN) based method to estimate the slanted angle by learning more generic features using a huge dataset. In addition, we utilize multiple adaptive spatial pooling layers to extract multi-scale image features for better performance. In the experimental results, we show our CNN-based approach robustly and accurately estimates the slanted angle of an image regardless of the image content, even if the image contains no lines or planes at all.
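
To make the idea of multi-scale features from adaptive spatial pooling concrete, here is a small sketch in plain Python. It assumes average pooling into a fixed output grid with bin boundaries chosen by integer floor and ceiling division; the abstract does not specify the paper's exact layer design, and the function names and the `(1, 2)` scale choice are illustrative only.

```python
def adaptive_avg_pool2d(grid, out_h, out_w):
    """Average-pool a 2D list into an out_h x out_w grid, whatever the input size."""
    h, w = len(grid), len(grid[0])
    out = []
    for i in range(out_h):
        r0 = (i * h) // out_h                     # floor of the bin start
        r1 = ((i + 1) * h + out_h - 1) // out_h   # ceiling of the bin end
        row = []
        for j in range(out_w):
            c0 = (j * w) // out_w
            c1 = ((j + 1) * w + out_w - 1) // out_w
            cells = [grid[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            row.append(sum(cells) / len(cells))
        out.append(row)
    return out


def multi_scale_features(grid, sizes=(1, 2)):
    """Concatenate pooled outputs at several scales into one flat feature list."""
    feats = []
    for s in sizes:
        for row in adaptive_avg_pool2d(grid, s, s):
            feats.extend(row)
    return feats
```

Because the output grid size is fixed, inputs of different resolutions yield feature vectors of the same length, which is what lets several scales be concatenated.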

Dual Cycle Plan for Efficient Ship Loading and Unloading in Container Terminals (컨테이너 터미널의 효율적인 선적 작업을 위한 Dual Cycle 계획)

  • Chung, Chang-Yun;Shin, Jae-Young
    • Journal of Navigation and Port Research / v.33 no.8 / pp.555-562 / 2009
  • At container terminals, quay-side work efficiency is a major measure of productivity. At the apron, containers are loaded onto the ship and unloaded onto the apron by quay cranes (Q/Cs). To improve quay crane productivity, a more efficient yard tractor (Y/T) operation method is necessary. Between the quay side and the yard area, the current transfer method is single cycling, in which loading does not start until unloading has finished. Dual cycling is a technique that can improve quay-side productivity and yard tractor utilization by loading and unloading the ship simultaneously. Using dual cycling at terminals requires only an operational change, without purchasing extra equipment. Specifically, the Y/T operation method has to change from a dedicated system to a pooling system. This paper presents an efficient ship loading and unloading plan for container terminals that use dual cycling. We propose genetic and tabu search algorithms for this problem.

Deep Learning Architectures and Applications (딥러닝의 모형과 응용사례)

  • Ahn, SungMahn
    • Journal of Intelligence and Information Systems / v.22 no.2 / pp.127-142 / 2016
  • A deep learning model is a kind of neural network that allows multiple hidden layers. There are various deep learning architectures, such as convolutional neural networks, deep belief networks, and recurrent neural networks. These have been applied to fields such as computer vision, automatic speech recognition, natural language processing, audio recognition, and bioinformatics, where they have been shown to produce state-of-the-art results on various tasks. Among these architectures, convolutional neural networks and recurrent neural networks are classified as supervised learning models. In recent years, these supervised learning models have gained more popularity than unsupervised learning models such as deep belief networks, because they have shown prominent applications in the fields mentioned above. Deep learning models can be trained with the backpropagation algorithm. Backpropagation is an abbreviation for "backward propagation of errors" and is a common method of training artificial neural networks, used in conjunction with an optimization method such as gradient descent. The method calculates the gradient of an error function with respect to all the weights in the network. The gradient is fed to the optimization method, which in turn uses it to update the weights in an attempt to minimize the error function. Convolutional neural networks use a special architecture that is particularly well adapted to classifying images. Using this architecture makes convolutional networks fast to train. This, in turn, helps us train deep, multi-layer networks, which are very good at classifying images. These days, deep convolutional networks are used in most neural networks for image recognition. Convolutional neural networks use three basic ideas: local receptive fields, shared weights, and pooling.
By local receptive fields, we mean that each neuron in the first (or any) hidden layer is connected to a small region of the input (or previous layer's) neurons. Shared weights mean that the same weights and bias are used for each of the local receptive fields. This means that all the neurons in the hidden layer detect exactly the same feature, just at different locations in the input image. In addition to the convolutional layers just described, convolutional neural networks also contain pooling layers. Pooling layers are usually used immediately after convolutional layers. What the pooling layers do is simplify the information in the output from the convolutional layer. Recent convolutional network architectures have 10 to 20 hidden layers and billions of connections between units. Several years ago, training deep learning networks took weeks, but thanks to progress in GPUs and algorithmic improvements, training time has been reduced to several hours. Neural networks with time-varying behavior are known as recurrent neural networks, or RNNs. A recurrent neural network is a class of artificial neural network in which connections between units form a directed cycle. This creates an internal state of the network that allows it to exhibit dynamic temporal behavior. Unlike feedforward neural networks, RNNs can use their internal memory to process arbitrary sequences of inputs. Early RNN models turned out to be very difficult to train, harder even than deep feedforward networks. The reason is the unstable gradient problem, which includes vanishing and exploding gradients. The gradient can get smaller and smaller as it is propagated back through the layers. This makes learning in early layers extremely slow. The problem actually gets worse in RNNs, since gradients are propagated backward not just through layers but also through time. If the network runs for a long time, the gradient can become extremely unstable and hard to learn from.
It has been possible to incorporate an idea known as long short-term memory units (LSTMs) into RNNs. LSTMs make it much easier to get good results when training RNNs, and many recent papers make use of LSTMs or related ideas.
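
The pooling step described in this abstract, which simplifies the output of a convolutional layer, can be sketched with max pooling, one common choice (the abstract does not say which pooling operation is meant). The function name, the non-overlapping windows, and the default window size of 2 are assumptions for illustration.

```python
def max_pool2d(feature_map, size=2):
    """Keep the largest value in each non-overlapping size x size window."""
    h, w = len(feature_map), len(feature_map[0])
    pooled = []
    for r in range(0, h - size + 1, size):
        row = []
        for c in range(0, w - size + 1, size):
            window = [
                feature_map[r + dr][c + dc]
                for dr in range(size)
                for dc in range(size)
            ]
            row.append(max(window))
        pooled.append(row)
    return pooled
```

A 4 x 4 feature map becomes 2 x 2 after this step, so later layers see a condensed summary of where a feature was detected.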

Cross platform classification of microarrays by rank comparison

  • Lee, Sunho
    • Journal of the Korean Data and Information Science Society / v.26 no.2 / pp.475-486 / 2015
  • Mining the microarray data accumulated in public data repositories can save experimental cost and time and provide valuable biomedical information. Big data analysis that pools multiple data sets increases statistical power, improves the reliability of the results, and reduces the bias specific to an individual study. However, integrating several data sets from different studies requires dealing with many problems. In this study, I limit the focus to cross-platform classification, in which the platform of a test sample differs from the platform of the training set, and suggest a simple rank-based classification method. This method is compared with diagonal linear discriminant analysis, the k-nearest neighbor method, and the support vector machine using real cross-platform example data sets for two cancers.
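
One way to see why ranks help across platforms: ranks within a sample do not change when the measurement scale changes. The sketch below converts each sample to within-sample ranks and assigns a test sample to the class with the nearest mean rank profile. This is an illustrative stand-in, not the classifier proposed in the paper; the function names, the nearest-centroid rule, and the tie handling (ties broken by position) are assumptions.

```python
def to_ranks(values):
    """Replace each value by its rank within the sample (1 = smallest)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        ranks[idx] = rank
    return ranks


def centroid(rows):
    """Column-wise mean of a list of equal-length rows."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def classify_by_rank(train, labels, test_sample):
    """Assign the label whose mean rank profile is closest to the test sample."""
    by_class = {}
    for sample, label in zip(train, labels):
        by_class.setdefault(label, []).append(to_ranks(sample))
    test_ranks = to_ranks(test_sample)
    best_label, best_dist = None, float("inf")
    for label, rows in by_class.items():
        center = centroid(rows)
        dist = sum((a - b) ** 2 for a, b in zip(test_ranks, center))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label
```

In this sketch the training values can be in the thousands and the test values below one, and the classifier still works, because only the ordering of genes within each sample is used.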

Distribution and Determinants of Out-of-pocket Healthcare Expenditures in Bangladesh

  • Mahumud, Rashidul Alam;Sarker, Abdur Razzaque;Sultana, Marufa;Islam, Ziaul;Khan, Jahangir;Morton, Alec
    • Journal of Preventive Medicine and Public Health / v.50 no.2 / pp.91-99 / 2017
  • Objectives: As in many low-income and middle-income countries, out-of-pocket (OOP) payments by patients or their families are a key healthcare financing mechanism in Bangladesh that leads to economic burdens for households. The objective of this study was to identify whether and to what extent socioeconomic, demographic, and behavioral factors of the population had an impact on OOP expenditures in Bangladesh. Methods: A total of 12 400 patients who had paid to receive any type of healthcare services within the previous 30 days were analyzed from the Bangladesh Household Income and Expenditure Survey data, 2010. We employed regression analysis to identify factors influencing OOP health expenditures using the ordinary least squares method. Results: The mean total OOP healthcare expenditure was US dollar (USD) 27.66, while the cost of medicines (USD 16.98) was the highest cost driver (61% of total OOP healthcare expenditure). In addition, this study identified age, sex, marital status, place of residence, and family wealth as significant factors associated with higher OOP healthcare expenditures. In contrast, unemployment and not receiving financial social benefits were inversely associated with OOP expenditures. Conclusions: The findings of this study can help decision-makers by clarifying the determinants of OOP, discussing the mechanisms driving these determinants, and thereby underscoring the need to develop policy options for building stronger financial protection mechanisms. The government should consider devoting more resources to providing free or subsidized care. In parallel with government action, the development of other prudential and sustainable risk-pooling mechanisms may help attract enthusiastic subscribers to community-based health insurance schemes.