A Frame Unit Based Adaptive Pruning Algorithm for the Fast Speech Recognition (음성인식의 고속화를 위한 프레임 단위 적응 프루닝 알고리즘)
- Proceedings of the Acoustical Society of Korea Conference / autumn / pp.183-186 / 2000
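This paper proposes a new frame-unit adaptive pruning algorithm that speeds up speech recognition by effectively reducing the search space while recognition is in progress, and confirms its effectiveness through experiments. The algorithm rests on the fact that the maximum probabilities of consecutive frames are highly correlated, so the pruning threshold can be derived effectively from the maximum probability of the preceding frame. In this method, the pruning threshold of the current frame is updated from a combination of the maximum likelihood probability of the preceding frame and the candidate probabilities. Because the threshold for the current frame is obtained during recognition itself, no preliminary experiments are needed to determine the threshold even when the recognition task changes. In addition, a threshold obtained adaptively frame by frame can improve recognition speed in other environments as well. To confirm the effectiveness of the proposed algorithm, it was applied to a Korean address recognition system. The system uses 48 phoneme-like units (PLUs) as the basic recognition units, maximum a posteriori probability estimation (MAP) as the adaptation algorithm, and one-pass dynamic programming (OPDP) as the recognition algorithm. In recognition experiments with three male speakers and 25 connected address names, the proposed frame-unit adaptive pruning threshold was compared with the conventional fixed pruning threshold and variable pruning threshold; with no change in recognition rate, the search space was, in relative terms, respectively ...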
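The frame-by-frame threshold update described above can be sketched roughly as follows. This is a minimal illustration, not the authors' update rule: the abstract does not give the formula, so the blend of the previous frame's best score with the mean of its surviving candidates, the weight `alpha`, and the names `adaptive_threshold`, `prune` and `decode` are all assumptions. Scores are assumed to be per-frame normalized log-likelihoods, so that values are comparable between frames.

```python
from statistics import mean


def adaptive_threshold(prev_scores, alpha=0.5):
    """Pruning threshold for the current frame, from the previous frame.

    prev_scores: log-likelihoods of the hypotheses that survived the
                 previous frame.
    alpha:       hypothetical blending weight (not taken from the paper).
    """
    best = max(prev_scores)
    return alpha * best + (1.0 - alpha) * mean(prev_scores)


def prune(scores, threshold):
    """Keep hypotheses that reach the threshold; never drop all of them."""
    kept = {h: s for h, s in scores.items() if s >= threshold}
    if not kept:
        best = max(scores, key=scores.get)
        kept = {best: scores[best]}
    return kept


def decode(frames, alpha=0.5):
    """frames: list of dicts mapping hypothesis id -> log-likelihood."""
    survivors = dict(frames[0])  # first frame: nothing to adapt from yet
    sizes = [len(survivors)]
    for scores in frames[1:]:
        threshold = adaptive_threshold(list(survivors.values()), alpha)
        survivors = prune(scores, threshold)
        sizes.append(len(survivors))
    return survivors, sizes
```

Because the threshold is recomputed from scores already available at run time, nothing has to be tuned in advance for a new task, which is the property the abstract emphasizes.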
In a video sequence, the motion vector of the current block is temporally correlated with the motion vector of the previous block. If enough useful information can be obtained from the motion vector of the block at the same coordinates in the previous frame, the total number of search points needed to find the motion vector of the current block can be reduced significantly. In this paper, we propose a block-matching motion estimation method that uses an adaptive initial search point, predicted from the motion information of the same block in the previous frame. The first search point of the proposed algorithm is moved to the location where a match is most likely, and the search after this move proceeds according to a fast search pattern. Simulation results show that PSNR (Peak Signal-to-Noise Ratio) values improve by up to 1.05 dB depending on the image sequence, and by about 0.33~0.37 dB on average. Search times are reduced by about 29~97% compared with the other fast search algorithms. Simulation results also show that the proposed scheme gives better subjective picture quality than the other fast search algorithms and comes closer to that of the FS (Full Search) algorithm.
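A rough sketch of the adaptive initial search point is given below. The abstract does not name the fast search pattern, so a small-diamond descent is assumed here, with the sum of absolute differences (SAD) as the matching cost. The function names `sad` and `estimate_mv` and the `max_steps` limit are illustrative, not from the paper.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """SAD between the n x n block at (bx, by) in cur and the block
    displaced by (dx, dy) in ref; infinite if it leaves the frame."""
    h, w = len(ref), len(ref[0])
    x0, y0 = bx + dx, by + dy
    if x0 < 0 or y0 < 0 or x0 + n > w or y0 + n > h:
        return float("inf")
    return sum(
        abs(cur[by + j][bx + i] - ref[y0 + j][x0 + i])
        for j in range(n)
        for i in range(n)
    )


def estimate_mv(cur, ref, bx, by, n, predicted=(0, 0), max_steps=16):
    """Start at the predicted vector (e.g. the motion vector of the same
    block in the previous frame), then refine with a small diamond."""
    best = predicted
    best_cost = sad(cur, ref, bx, by, best[0], best[1], n)
    checked = 1
    if best_cost == float("inf"):  # prediction points outside the frame
        best = (0, 0)
        best_cost = sad(cur, ref, bx, by, 0, 0, n)
        checked += 1
    for _ in range(max_steps):
        cx, cy = best
        improved = False
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            cand = (cx + dx, cy + dy)
            cost = sad(cur, ref, bx, by, cand[0], cand[1], n)
            checked += 1
            if cost < best_cost:
                best, best_cost, improved = cand, cost, True
        if not improved:
            break
    return best, best_cost, checked
```

The third return value counts evaluated search points, so the saving from a good prediction can be observed directly by calling `estimate_mv` once with `predicted=(0, 0)` and once with the previous frame's vector.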
With the growth of ubiquitous services, various types of ad hoc networks have emerged. Wireless sensor networks (WSN) and mobile ad hoc networks (MANET) are the most widely known, but there are also other kinds of wireless ad hoc networks in which the characteristics of these two network types are mixed. This paper proposes a variant of the Low-Energy Adaptive Clustering Hierarchy (LEACH) routing protocol, modified to suit such a combined network environment. The proposed routing protocol provides node detection and route discovery/maintenance in a network with a large number of mobile sensor nodes, while preserving node mobility, network connectivity, and energy efficiency. It is implemented with a multi-hop multi-path algorithm, a topology reconfiguration technique that uses node movement estimation and vibration sensors, and an efficient path selection and data transmission technique for a large number of moving nodes. In the experiments, the performance of the proposed protocol is demonstrated by comparing it with the conventional LEACH protocol.
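For orientation, the cluster-head election of the conventional LEACH protocol, the baseline in the comparison above, can be sketched as follows. This is not the proposed variant, whose details the abstract does not give. In round r, an eligible node becomes cluster head with probability T = p / (1 - p (r mod 1/p)), where p is the desired fraction of cluster heads. The eligibility bookkeeping with `last_head_round` is a simplification, and all names are illustrative.

```python
import random


def leach_threshold(p, round_no):
    """Standard LEACH election threshold for eligible nodes."""
    period = round(1 / p)
    return p / (1 - p * (round_no % period))


def elect_cluster_heads(node_ids, last_head_round, p, round_no, rng):
    """Return the nodes elected as cluster heads in this round.

    last_head_round: dict node id -> last round in which it was a head;
                     a node is eligible again after 1/p rounds.
    rng:             a random.Random instance (injected for repeatability).
    """
    period = round(1 / p)
    threshold = leach_threshold(p, round_no)
    heads = []
    for node in node_ids:
        last = last_head_round.get(node)
        eligible = last is None or round_no - last >= period
        if eligible and rng.random() < threshold:
            heads.append(node)
            last_head_round[node] = round_no
    return heads
```

The threshold rises through each cycle of 1/p rounds and reaches 1 in the final round, so every node serves as cluster head once per cycle and the energy cost of that role is shared.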
Realistic bus service coverage must be established in order to build optimal city bus line networks and reasonable bus service coverage areas. The purposes of this study are to understand the characteristics of the present walking time and the marginal walking time in small and medium-sized cities, and to construct an ANFIS (Adaptive Neuro-Fuzzy Inference System) model that estimates the marginal walking time for a given age and income. The cities of Masan, Chongwon and Jinju were selected for the study. The 80th percentile of the present walking time of bus users in these cities is 10.2-11.1 minutes, which exceeds the 5-minute maximum walking time used in the USA, and the marginal walking times of 21.1-21.8 minutes are greater still. An ANFIS model based on the pooled data of the three cities was constructed to estimate the marginal walking time in small and medium-sized cities. According to the model, the marginal walking time decreases as age increases but is nearly constant between the ages of 25 and 35, and it is inversely proportional to income. Comparing the surveyed and estimated values, the coefficient of determination, MSE and MAE are 0.996, 0.163 and 0.333, respectively, so the explanatory power of the model can be judged to be very high. The technique developed in this study can be applied to other cities.
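The three goodness-of-fit statistics quoted above can be computed as in the following sketch. The function name `regression_metrics` is illustrative; the survey data themselves are not reproduced in the abstract, so any numbers passed to it here would be toy values, not the study's.

```python
def regression_metrics(observed, estimated):
    """Coefficient of determination, mean squared error and mean
    absolute error between surveyed and model-estimated values."""
    n = len(observed)
    residuals = [o - e for o, e in zip(observed, estimated)]
    ss_res = sum(r * r for r in residuals)
    mean_obs = sum(observed) / n
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "mse": ss_res / n,
        "mae": sum(abs(r) for r in residuals) / n,
    }
```

MSE is in squared units (here, minutes squared) while MAE is in minutes, so the two values are not directly comparable in magnitude.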
In this paper, a new audio reproduction system is developed in which the cross-talk signals are reasonably cancelled at an arbitrary listener position. To remove the cross-talk signals adaptively according to the listener's position, a method of tracking the listener position is employed. Tracking uses two microphones: the listener direction is estimated from the time delay between the signals received at the two microphones. Room reverberation effects are also taken into account by means of linear prediction analysis. To remove the cross-talk signals at the left and right ears, the paths between the sources and the ears are represented by KEMAR head-related transfer functions (HRTFs), which were measured with an artificial dummy head. To evaluate the usefulness of the proposed listener tracking system, the performance of cross-talk cancellation was evaluated at the estimated listener positions in terms of the channel separation ratio (CSR); a CSR of -10 dB was achieved experimentally even when the listener position deviated somewhat. A real-time system was implemented on a floating-point digital signal processor (DSP). The average error of the estimated listener direction was 5 degrees, and the subjects indicated that 80% of the stimuli were perceived in the correct directions.
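The direction estimate from the inter-microphone time delay can be sketched as follows. The assumptions are not from the paper: the delay is taken as the lag that maximizes a plain cross-correlation, the source is in the far field so that delay = d sin(theta) / c, and the speed of sound is 343 m/s. The linear prediction step the abstract mentions for handling reverberation is not included, and the function names are illustrative.

```python
import math


def estimate_delay(x, y, max_lag):
    """Lag in samples such that y[n] is best matched by x[n - lag].
    A positive lag means the second microphone receives the sound later."""
    best_lag, best_val = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        val = sum(
            x[n - lag] * y[n]
            for n in range(len(y))
            if 0 <= n - lag < len(x)
        )
        if val > best_val:
            best_lag, best_val = lag, val
    return best_lag


def direction_deg(lag, fs, mic_distance, c=343.0):
    """Angle from broadside in degrees under the far-field model."""
    s = (lag / fs) * c / mic_distance
    s = max(-1.0, min(1.0, s))  # guard against delays beyond d / c
    return math.degrees(math.asin(s))
```

With a 0.2 m microphone spacing and a 48 kHz sampling rate, one sample of delay corresponds to roughly two degrees near broadside, which gives a sense of the angular resolution such a setup can offer without interpolation.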
The wall shear stress in the vicinity of end-to-end anastomoses under steady flow conditions was measured using a flush-mounted hot-film anemometer (FMHFA) probe. The experimental measurements were in good agreement with numerical results except in flows with low Reynolds numbers. The wall shear stress increased proximal to the anastomosis in flow from the Penrose tubing (simulating an artery) to the PTFE graft. In flow from the PTFE graft to the Penrose tubing, low wall shear stress was observed distal to the anastomosis. Abnormal distributions of wall shear stress in the vicinity of the anastomosis, resulting from the compliance mismatch between the graft and the host artery, might be an important factor in the formation of anastomotic neointimal fibrous hyperplasia (ANFH) and in graft failure. The present study suggests a correlation between regions of low wall shear stress and the development of ANFH in end-to-end anastomoses.
Air pressure decay (APD) rate and ultrafiltration rate (UFR) tests were performed on new and saline-rinsed dialyzers as well as on those reused in patients several times. C-DAK 4000 (Cordis Dow) and CF IS-11 (Baxter Travenol) reused dialyzers obtained from the dialysis clinic were used in the present study. The new dialyzers exhibited a relatively flat APD, whereas saline-rinsed and reused dialyzers showed a considerable amount of decay. C-DAK dialyzers had a larger APD (11.70
Climate change has altered not only the frequency and intensity of extreme climate events but also temperature and precipitation. Damage to agricultural production systems is expected to increase because of heavy rainfall and snow. In this study we assessed the vulnerability of crop cultivation facilities and animal husbandry facilities to heavy rain in 232 agricultural districts. Climate data for the 2000s were used to analyze present vulnerability, and data derived from the A1B scenario were used for the assessments for 2020, 2050 and 2100. The vulnerability of local districts was evaluated with three indices: climate exposure, sensitivity and adaptive capacity. Each index was determined from selected proxy variables. The collected data were normalized and then multiplied by weights elicited in a Delphi survey. Jeonla-do and Gangwon-do showed higher climate exposure than the other provinces. Regions with large-scale cultivation facility complexes showed higher sensitivity to abnormal weather than other regions, and vulnerability to abnormal weather was also higher in these regions. In the projection based on the SRES A1B scenario, the overall vulnerability of controlled agricultural facilities in Korea increased, most dramatically between the 2000s and 2020.
As smartphones have become widely used, human activity recognition (HAR) tasks that recognize the personal activities of smartphone users from multimodal data have been actively studied. The research area is expanding from recognizing the simple body movements of an individual user to recognizing low-level and high-level behavior. However, HAR tasks that recognize interaction behavior with other people, such as whether the user is accompanying or communicating with someone else, have received less attention so far. Previous research on recognizing interaction behavior has usually depended on audio, Bluetooth, and Wi-Fi sensors, which are vulnerable to privacy issues and need a long time to collect enough data. In contrast, physical sensors such as accelerometer, magnetic field and gyroscope sensors are less vulnerable to privacy issues and can collect a large amount of data within a short time. In this paper, we propose a method for detecting accompanying status with a deep learning model that uses only multimodal physical sensor data from the accelerometer, magnetic field sensor and gyroscope. The accompanying status is defined as a subset of user interaction behavior: whether the user is accompanied by an acquaintance at close distance, and whether the user is actively communicating with that acquaintance. We propose a framework based on convolutional neural networks (CNN) and long short-term memory (LSTM) recurrent networks for classifying accompanying and conversation. First, a data preprocessing method is introduced, consisting of time synchronization of the multimodal data from the different physical sensors, data normalization, and sequence data generation. We applied nearest interpolation to synchronize the timestamps of the data collected from the different sensors.
Normalization was performed on each x, y, z axis value of the sensor data, and the sequence data were generated with the sliding window method. The sequence data then became the input to the CNN, which extracts feature maps representing local dependencies of the original sequence. The CNN consisted of 3 convolutional layers and had no pooling layer, in order to preserve the temporal information of the sequence data. Next, the LSTM recurrent networks received the feature maps, learned long-term dependencies from them and extracted features. The LSTM recurrent networks consisted of two layers, each with 128 cells. Finally, the extracted features were classified by a softmax classifier. The loss function of the model was the cross-entropy function, and the weights of the model were randomly initialized from a normal distribution with a mean of 0 and a standard deviation of 0.1. The model was trained with the adaptive moment estimation (Adam) optimization algorithm, with the mini-batch size set to 128. We applied dropout to the input values of the LSTM recurrent networks to prevent overfitting. The initial learning rate was set to 0.001 and was decayed exponentially by a factor of 0.99 at the end of each training epoch. An Android smartphone application was developed and released to collect data, and we collected smartphone data from a total of 18 subjects. Using these data, the model classified accompanying and conversation with accuracies of 98.74% and 98.83%, respectively. Both the F1 score and the accuracy of the model were higher than those of the majority vote classifier, the support vector machine, and the deep recurrent neural network. In future research, we will focus on more rigorous multimodal sensor data synchronization methods that minimize timestamp differences. In addition, we will further study transfer learning methods that allow models tailored to the training data to be transferred to evaluation data that follow a different distribution.
This is expected to yield a model that shows robust recognition performance against changes in the data that were not considered during model training.
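The three preprocessing steps named in this abstract (nearest-timestamp synchronization, per-axis normalization, sliding-window sequence generation) can be sketched as below. The abstract does not say which normalization was used, so z-scoring per axis is an assumption, as are the function names and the window `size` and `step` parameters.

```python
from bisect import bisect_left
from statistics import mean, pstdev


def nearest_sync(ref_times, times, values):
    """For each reference timestamp, take the sample of another sensor
    whose timestamp is nearest. `times` must be sorted ascending."""
    out = []
    for t in ref_times:
        i = bisect_left(times, t)
        if i == 0:
            j = 0
        elif i == len(times):
            j = len(times) - 1
        else:
            j = i if times[i] - t < t - times[i - 1] else i - 1
        out.append(values[j])
    return out


def normalize_axes(samples):
    """Z-score each axis (column) independently; a constant axis maps to 0."""
    columns = list(zip(*samples))
    stats = [(mean(c), pstdev(c) or 1.0) for c in columns]
    return [
        tuple((v - m) / s for v, (m, s) in zip(row, stats))
        for row in samples
    ]


def sliding_windows(seq, size, step):
    """Fixed-length, possibly overlapping windows over a sequence."""
    return [seq[i:i + size] for i in range(0, len(seq) - size + 1, step)]
```

Each window produced this way would be one input sequence for the CNN. Since the network has no pooling layer, the window length is also the number of time steps passed on to the LSTM layers.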