• Title/Summary/Keyword: weighted function

The Study on Speaker Change Verification Using SNR based weighted KL distance (SNR 기반 가중 KL 거리를 활용한 화자 변화 검증에 관한 연구)

  • Cho, Joon-Beom;Lee, Ji-eun;Lee, Kyong-Rok
    • Journal of Convergence for Information Technology / v.7 no.6 / pp.159-166 / 2017
  • In this paper, we experimented with improving the verification performance of speaker change detection on broadcast news. The approach enhances the noisy input speech and applies a KL distance $D_s$ that uses the SNR-based weighting function $w_m$. The baseline is a speaker-change verification system using the GMM-UBM-based KL distance D (Experiment 0). Experiment 1 adds enhancement of the noisy input speech using MMSE Log-STSA. Experiment 2 applies the new KL distance $D_s$ to the system of Experiment 1. Experiments were conducted under the condition of 0% MDR in order to avoid missing any speaker-change points. The FAR of Experiment 0 was 71.5%. The FAR of Experiment 1 was 67.3%, an improvement of 4.2 percentage points over Experiment 0. The FAR of Experiment 2 was 60.7%, an improvement of 10.8 percentage points over Experiment 0.
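
The abstract does not give the exact definitions of $D_s$ and $w_m$, so the following Python sketch only illustrates the general idea: per-component symmetric KL distances between two GMMs, combined with an SNR-dependent weight that down-weights unreliable (low-SNR) components. The logistic form of `snr_weight` is an assumption, not the paper's formula.

```python
import numpy as np

def symmetric_kl_diag(mu1, var1, mu2, var2):
    """Symmetric KL distance between two diagonal-covariance Gaussians."""
    kl12 = 0.5 * np.sum(np.log(var2 / var1) + (var1 + (mu1 - mu2) ** 2) / var2 - 1.0)
    kl21 = 0.5 * np.sum(np.log(var1 / var2) + (var2 + (mu1 - mu2) ** 2) / var1 - 1.0)
    return kl12 + kl21

def snr_weight(snr_db, alpha=0.1):
    """Hypothetical SNR-based weight w_m: a logistic ramp in SNR (dB), so
    low-SNR components contribute less to the distance (assumption)."""
    return 1.0 / (1.0 + np.exp(-alpha * np.asarray(snr_db, dtype=float)))

def weighted_kl_distance(means1, vars1, means2, vars2, snrs_db):
    """Sketch of an SNR-weighted KL distance D_s over GMM components."""
    w = snr_weight(snrs_db)
    d = np.array([symmetric_kl_diag(m1, v1, m2, v2)
                  for m1, v1, m2, v2 in zip(means1, vars1, means2, vars2)])
    return float(np.sum(w * d) / np.sum(w))
```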

A Spatial Interpolation Model for Daily Minimum Temperature over Mountainous Regions (산악지대의 일 최저기온 공간내삽모형)

  • Yun Jin-Il;Choi Jae-Yeon;Yoon Young-Kwan;Chung Uran
    • Korean Journal of Agricultural and Forest Meteorology / v.2 no.4 / pp.175-182 / 2000
  • Spatial interpolation of daily temperature forecasts and observations issued by public weather services is frequently required to make them applicable to agricultural activities and modeling tasks. In contrast to long-term averages such as monthly normals, terrain effects are not considered in most spatial interpolations of short-term temperatures. This may cause erroneous results in mountainous regions, where the observation network hardly covers the full features of the complicated terrain. We developed a spatial interpolation model for daily minimum temperature which combines inverse distance squared weighting with an elevation difference correction. The model uses a time-dependent function for the 'mountain slope lapse rate', which can be derived from regression analyses of the station observations with respect to the geographical and topographical features of the surroundings, including the station elevation. We applied this model to interpolation of daily minimum temperature over the mountainous Korean Peninsula using data from 63 standard weather stations. First, a primitive temperature surface was interpolated by inverse distance squared weighting of the 63 point data. Next, a virtual elevation surface was reconstructed by spatially interpolating the 63 station elevations and subtracted from the elevation surface of a digital elevation model with 1 km grid spacing to obtain the elevation difference at each grid cell. Final estimates of daily minimum temperature at all grid cells were obtained by applying the calculated daily lapse rate to the elevation difference and adjusting the inverse distance weighted estimates accordingly. Independent measured data from 267 automated weather station locations were used to calculate estimation errors on 12 dates, one randomly selected from each month of 1999. Analysis of three error statistics (mean error, mean absolute error, and root mean squared error) indicates a substantial improvement over inverse distance squared weighting alone.
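
As a rough illustration of the two-step scheme described above, here is a minimal NumPy sketch: inverse distance squared weighting of station values, a "virtual" station-elevation surface, and a lapse-rate correction on the DEM-minus-virtual elevation difference. The function names and the constant lapse rate passed as a parameter are illustrative; the paper derives a time-dependent lapse rate by regression.

```python
import numpy as np

def idw_squared(xy_stations, values, xy_grid, eps=1e-9):
    """Inverse distance squared weighting of station values onto grid points."""
    d2 = ((xy_grid[:, None, :] - xy_stations[None, :, :]) ** 2).sum(-1)
    w = 1.0 / (d2 + eps)                      # weight = 1 / distance^2
    return (w * values).sum(1) / w.sum(1)

def tmin_grid(xy_st, tmin_st, elev_st, xy_grid, dem_grid, lapse_rate=-0.006):
    """Daily Tmin on a grid: IDW of station Tmin (step 1), then a correction
    using the difference between the DEM and an IDW-interpolated 'virtual'
    station-elevation surface (step 2). lapse_rate is degC per meter; the
    constant used here is a placeholder for the paper's daily regression."""
    t_primitive = idw_squared(xy_st, tmin_st, xy_grid)
    elev_virtual = idw_squared(xy_st, elev_st, xy_grid)
    dz = dem_grid - elev_virtual              # elevation difference per cell
    return t_primitive + lapse_rate * dz      # lapse-rate adjustment
```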

Doubly-robust Q-estimation in observational studies with high-dimensional covariates (고차원 관측자료에서의 Q-학습 모형에 대한 이중강건성 연구)

  • Lee, Hyobeen;Kim, Yeji;Cho, Hyungjun;Choi, Sangbum
    • The Korean Journal of Applied Statistics / v.34 no.3 / pp.309-327 / 2021
  • Dynamic treatment regimes (DTRs) are decision-making rules designed to provide personalized treatment to individuals in multi-stage randomized trials. Unlike classical methods, in which all individuals are prescribed the same type of treatment, DTRs prescribe patient-tailored treatments that take into account individual characteristics that may change over time. The Q-learning method, one of the regression-based algorithms for finding optimal treatment rules, has become popular because it is easy to implement. However, the performance of the Q-learning algorithm relies heavily on correct specification of the Q-function for the response, especially in observational studies. In this article, we examine a number of doubly-robust weighted least-squares estimating methods for Q-learning in high-dimensional settings, where treatment models for the propensity score and penalization for sparse estimation are also investigated. We further consider flexible ensemble machine learning methods for the treatment model to achieve double robustness, so that the optimal decision rule is correctly estimated as long as at least one of the outcome model or the treatment model is correct. Extensive simulation studies show that the proposed methods work well with practical sample sizes. The practical utility of the proposed methods is demonstrated with a real-data example.
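
As a single-stage illustration of the doubly-robust weighted least-squares idea, the sketch below fits a propensity model for a binary treatment, uses inverse-probability weights in the outcome regression, and reads off the estimated rule. This is a minimal sketch under simplifying assumptions (one stage, linear Q-function, logistic propensity); the paper additionally considers penalized and ensemble treatment models.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

def dr_wls_rule(X, A, Y):
    """One-stage sketch of doubly-robust weighted least-squares Q-learning.
    X: (n, p) covariates, A: (n,) binary treatment, Y: (n,) outcome."""
    # Treatment (propensity score) model
    ps = LogisticRegression(max_iter=1000).fit(X, A).predict_proba(X)[:, 1]
    w = A / ps + (1 - A) / (1 - ps)            # inverse-probability weights
    # Outcome (Q-function) model with treatment-covariate interactions
    XA = np.hstack([X, A[:, None], X * A[:, None]])
    q = LinearRegression().fit(XA, Y, sample_weight=w)
    # Estimated rule: treat when Q(x, a=1) > Q(x, a=0)
    X1 = np.hstack([X, np.ones((len(X), 1)), X])
    X0 = np.hstack([X, np.zeros((len(X), 1)), np.zeros_like(X)])
    return (q.predict(X1) > q.predict(X0)).astype(int)
```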

A Study on the Improvement of Types and Grades of Forest Wetland through Correlation Analysis of Forest Wetland Evaluation Factors and Types (산림습원 가치평가 요소와 유형 및 등급의 상관성 분석을 통한 산림습원 유형 구분 및 등급의 개선 방안 연구)

  • Lee, Jong-Won;Yun, Ho-Geun;Lee, Kyu Song;An, Jong Bin
    • Korean Journal of Plant Resources / v.35 no.4 / pp.471-501 / 2022
  • This study was carried out on 455 forest wetlands in South Korea for which an inventory had been established through value evaluation and grading. Correlation analysis was conducted between the types and grades of the forest wetlands and 23 evaluation factors in four categories: vegetation and landscape, material circulation and hydraulics/hydrology, humanities and social landscape, and degree of disturbance. By improving the types and grades of forest wetlands, it is possible to secure basic data for use in setting up conservation measures, to prepare the standards needed for future forest wetland conservation and restoration, and to establish a systematic monitoring system. First, forest wetland type showed a positive correlation with size and accessibility, while the remaining items showed negative or no correlation. In particular, the items showed negative or no correlation with forest wetland grade, and a very strong negative correlation with the weighted four-category items. This suggests an error in the weights: the evaluation criteria of the value-evaluation items should be adjusted, and items that increase objectivity should be added. In particular, since the largest ecosystem-service function of forest wetlands comes from biodiversity, the evaluation items should be improved with this in mind. The evaluation can therefore be divided into five categories, uniqueness and rarity (15%), wildlife habitat (15%), vegetation and landscape (35%), material cycle and hydraulics/hydrology (30%), and humanities and social landscape (5%), making it possible to propose weights that increase effectiveness.
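
The five proposed categories and weights lend themselves to a simple weighted scoring function. The sketch below is purely illustrative: the weights are from the abstract, but the 0-1 normalization of category scores and the use of a plain weighted sum are assumptions.

```python
# Proposed category weights from the study's conclusion (sum to 1.0)
WEIGHTS = {
    "uniqueness_rarity": 0.15,
    "wildlife_habitat": 0.15,
    "vegetation_landscape": 0.35,
    "material_cycle_hydrology": 0.30,
    "humanities_social_landscape": 0.05,
}

def wetland_score(category_scores):
    """Weighted sum of category scores (assumed normalized to 0-1 each);
    any mapping of this score to conservation grades is hypothetical."""
    return sum(WEIGHTS[k] * category_scores[k] for k in WEIGHTS)

# Example: a wetland strong in vegetation/landscape and hydrology
print(wetland_score({
    "uniqueness_rarity": 0.6, "wildlife_habitat": 0.7,
    "vegetation_landscape": 0.9, "material_cycle_hydrology": 0.8,
    "humanities_social_landscape": 0.4,
}))  # -> 0.77
```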

An Enlarged Perivascular Space: Clinical Relevance and the Role of Imaging in Aging and Neurologic Disorders (늘어난 혈관주위공간: 노화와 신경계질환에서의 임상적의의와 영상의 역할)

  • Younghee Yim;Won-Jin Moon
    • Journal of the Korean Society of Radiology / v.83 no.3 / pp.538-558 / 2022
  • The perivascular space (PVS) of the brain, also known as the Virchow-Robin space, consists of cerebrospinal fluid and connective tissue bordered by astrocyte endfeet. In short, the PVS is the route along the arterioles, capillaries, and venules through which substances can move. Although the PVS was first identified and described in the literature over 150 years ago, its importance has been highlighted only recently, after its role in clearing interstitial fluid and waste was revealed. The PVS is a microscopic structure that, when enlarged, appears on T2-weighted brain MRI as dot-like hyperintense lesions. Although until recently it was regarded as normal, of no clinical consequence, and ignored in many circumstances, several studies have argued for an association between an enlarged PVS and neurodegenerative or other diseases. Many questions and unknowns about this structure remain; we can only assume that normal PVS function is crucial to keeping the brain healthy. In this review, we cover the history, anatomy, pathophysiology, and MRI findings of the PVS; finally, we briefly touch upon recent attempts to better visualize the PVS, providing a glimpse of brain fluid dynamics and the clinical importance of the PVS.

A Two-Stage Learning Method of CNN and K-means RGB Cluster for Sentiment Classification of Images (이미지 감성분류를 위한 CNN과 K-means RGB Cluster 이-단계 학습 방안)

  • Kim, Jeongtae;Park, Eunbi;Han, Kiwoong;Lee, Junghyun;Lee, Hong Joo
    • Journal of Intelligence and Information Systems / v.27 no.3 / pp.139-156 / 2021
  • The biggest reason for using a deep learning model in image classification is that it can consider the relationships between regions by extracting each region's features from the overall information of the image. However, a CNN model may not be suitable for emotional image data that lacks distinctive regional features. To address the difficulty of classifying emotion images, researchers propose CNN-based architectures suited to emotion images each year. Studies on the relationship between color and human emotion have also been conducted, finding that different emotions are induced by different colors. Among deep learning studies, some have applied color information to image sentiment classification: using the image's color information in addition to the image itself improves accuracy over training the classification model on the image alone. This study proposes two ways to increase accuracy by adjusting the result value after the model classifies an image's emotion; both modify the result value using statistics based on the colors of the picture. Before testing, the most common two-color combinations are found across all training data; during testing, the dominant two-color combination of each test image is found, and the result values are corrected according to the color-combination distribution. The method weights the model's output using expressions based on the log function and the exponential function. Emotion6, classified into six emotions, and Artphoto, classified into eight categories, were used as the image data. Densenet169, Mnasnet, Resnet101, Resnet152, and Vgg19 architectures were used for the CNN model, and performance was compared before and after applying the two-stage learning. Inspired by color psychology, which deals with the relationship between colors and emotions, we studied how to improve accuracy by modifying result values based on color. Sixteen colors were used: red, orange, yellow, green, blue, indigo, purple, turquoise, pink, magenta, brown, gray, silver, gold, white, and black. Using scikit-learn's clustering, the seven colors most prevalent in each image are identified; the RGB coordinates of these colors are then compared with the RGB coordinates of the 16 reference colors, so that each is converted to the closest reference color. If combinations of three or more colors were selected, too many combinations would occur and the distribution would be scattered, so each combination would have little influence on the result value. To avoid this, two-color combinations were found and used to weight the model. Before training, the most prevalent color combinations were found for all training images, and the distribution of color combinations for each class was stored in a Python dictionary for use during testing. During testing, the two-color combination most prevalent in each test image is found, its distribution in the training data is checked, and the result is corrected. Several equations were devised to weight the model's output based on the extracted colors, as described above.
The data set was randomly divided 80:20, and the model was verified using 20% of the data as a test set. The remaining 80% was split into five folds for 5-fold cross-validation, so the model was trained five times using different validation sets. Finally, performance was checked using the previously held-out test set. Adam was used as the optimizer, with the learning rate set to 0.01. Training ran for up to 20 epochs and was stopped if the validation loss did not decrease for five consecutive epochs; early stopping was set to load the model with the best validation loss. Classification accuracy was better when the extracted color information was used together with the CNN than when only the CNN architecture was used.
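
A minimal sketch of the color step, assuming the palette RGB coordinates and a simple log-based correction (the paper's exact equations are not given in the abstract): cluster each image's pixels into seven colors with scikit-learn's KMeans, snap the cluster centers to the nearest of the 16 reference colors, and reweight the classifier's probabilities by how often the resulting two-color combination occurred per class in training.

```python
import numpy as np
from sklearn.cluster import KMeans

# The paper's 16 colors; these RGB coordinates are approximate assumptions.
PALETTE = {
    "red": (255, 0, 0), "orange": (255, 165, 0), "yellow": (255, 255, 0),
    "green": (0, 128, 0), "blue": (0, 0, 255), "indigo": (75, 0, 130),
    "purple": (128, 0, 128), "turquoise": (64, 224, 208),
    "pink": (255, 192, 203), "magenta": (255, 0, 255),
    "brown": (139, 69, 19), "gray": (128, 128, 128),
    "silver": (192, 192, 192), "gold": (255, 215, 0),
    "white": (255, 255, 255), "black": (0, 0, 0),
}
NAMES = list(PALETTE)
COORDS = np.array(list(PALETTE.values()), dtype=float)

def dominant_color_pair(image_rgb):
    """Cluster pixels into 7 colors (as in the paper), snap each cluster
    center to the nearest palette color, and return the two most frequent."""
    pixels = image_rgb.reshape(-1, 3).astype(float)
    km = KMeans(n_clusters=7, n_init=10).fit(pixels)
    counts = np.bincount(km.labels_, minlength=7)
    snapped = [NAMES[int(np.argmin(((c - COORDS) ** 2).sum(1)))]
               for c in km.cluster_centers_]
    pair = []
    for i in np.argsort(-counts):          # most to least frequent cluster
        if snapped[i] not in pair:
            pair.append(snapped[i])
        if len(pair) == 2:
            break
    return tuple(pair)

def reweight(probs, pair, pair_counts_by_class, beta=0.5):
    """Hypothetical log-based correction: boost each class in proportion to
    how often this color pair occurred for that class in the training data."""
    freq = np.array([pair_counts_by_class[c].get(pair, 0) + 1
                     for c in range(len(probs))], dtype=float)
    adjusted = probs * (1.0 + beta * np.log(freq))
    return adjusted / adjusted.sum()
```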

Functional MRI of Visual Cortex: Correlation between Photic Stimulator Size and Cortex Activation (시각피질의 기능적 MR 연구: 광자극 크기와 피질 활성화와의 관계)

  • 김경숙;이호규;최충곤;서대철
    • Investigative Magnetic Resonance Imaging / v.1 no.1 / pp.114-118 / 1997
  • Purpose: Functional MR imaging is a method of demonstrating changes in regional cerebral blood flow produced by sensory, motor, and other tasks. Functional MR of the visual cortex is performed while the subject stares at a photic stimulus, so an adaptable photic stimulator is necessary. The purpose of this study was to evaluate whether the size of the photic stimulator affects the degree of visual cortex activation. Materials and Methods: Functional MR imaging was performed in 5 volunteers with normal visual acuity. The photic stimulator was made of 39 light-emitting diodes on a plate, operating at 8 Hz. The stimulator sizes were full field, half field, and focal central field. The MR imager was a Siemens 1.5-T Magnetom Vision system using a standard head coil. Functional MRI used an EPI sequence (TR/TE = 1.0/51.0 msec, matrix = $98{\times}128$, slice thickness = 8 mm) with 3 sets of 6 images during stimulation and 6 images during rest, for a total of 36 scans. Activation images were obtained using postprocessing software (statistical analysis by Z-score), and these images were combined with T1-weighted anatomical images. The activated signal was quantified by counting the activated pixels, and an activation index was obtained by dividing the pixel count for each stimulator size by the sum of the pixel counts across the 3 studies using the 3 stimulator sizes. The correlation between the activation index and stimulator size was analyzed. Results: The mean increase of signal intensity in the activation area using the full-field photic stimulator was about 9.6%. The activation index was greatest for full field, second for half field, and smallest for focal central field in 4 volunteers; in 1 volunteer the index for half field was greater than that for full field. The ranges of the activation index were full field 43-73% (mean 55%), half field 22-40% (mean 32%), and focal central field 5-24% (mean 13%). Conclusion: The degree of visual cortex activation increases with the size of the photic stimulator.
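
The activation index described above is a simple normalization of activated pixel counts across the three stimulator sizes; a small sketch (the function name and example counts are illustrative):

```python
def activation_index(activated_pixels):
    """Activation index per stimulator size: the activated pixel count for
    one size divided by the summed counts over the three sizes."""
    total = sum(activated_pixels.values())
    return {size: count / total for size, count in activated_pixels.items()}

# Illustrative counts roughly matching the reported mean indices (55/32/13%)
print(activation_index({"full": 550, "half": 320, "focal": 130}))
```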

Construction of Gene Network System Associated with Economic Traits in Cattle (소의 경제형질 관련 유전자 네트워크 분석 시스템 구축)

  • Lim, Dajeong;Kim, Hyung-Yong;Cho, Yong-Min;Chai, Han-Ha;Park, Jong-Eun;Lim, Kyu-Sang;Lee, Seung-Su
    • Journal of Life Science / v.26 no.8 / pp.904-910 / 2016
  • Complex traits are determined by the combined effects of many loci and are affected by gene networks or biological pathways. Systems biology approaches play an important role in identifying candidate genes related to complex diseases or traits at the system level. Gene network analysis has been performed with diverse methods, such as gene co-expression, gene regulatory relationships, protein-protein interaction (PPI), and genetic networks. Moreover, network-based methods for predicting gene function have been described, such as graph-theoretic methods, neighborhood-counting methods, and weighted functions. However, there are few such studies in livestock. The present study systematically analyzed genes associated with 102 types of economic traits based on the Animal Trait Ontology (ATO) and identified their relationships in cattle based on a gene co-expression network and a PPI network. We then constructed the two types of gene network databases and a network visualization system (http://www.nabc.go.kr/cg). The gene co-expression network was generated from bovine gene expression values. The PPI network was constructed from the Human Protein Reference Database based on the orthologous relationships between human and cattle. Finally, candidate genes and their network relationships were identified for each trait; these genes were topologically central, with large degree and betweenness centrality (BC) values in the gene network. The Ontle program was applied to generate the database and to visualize the gene network results. This information should serve as a valuable resource for exploiting genomic functions that influence economically and agriculturally important traits in cattle.
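
Degree and betweenness centrality, the two topological measures used above to flag candidate genes, are straightforward to compute with networkx; a minimal sketch (combining the two measures with equal weight is an assumption for illustration):

```python
import networkx as nx

def rank_candidate_genes(edges, top_k=10):
    """Rank genes in a co-expression/PPI network by degree and betweenness
    centrality (BC), the topological measures used to flag candidate genes."""
    g = nx.Graph(edges)
    deg = dict(g.degree())
    bc = nx.betweenness_centrality(g)
    max_deg = max(deg.values()) or 1
    max_bc = max(bc.values()) or 1
    # Equal-weight combination of normalized degree and BC (assumption)
    score = {n: deg[n] / max_deg + bc[n] / max_bc for n in g}
    return sorted(score, key=score.get, reverse=True)[:top_k]

# Usage: edges as (gene_a, gene_b) pairs from either network
print(rank_candidate_genes([("G1", "G2"), ("G2", "G3"), ("G2", "G4")]))
```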

Research on ITB Contract Terms Classification Model for Risk Management in EPC Projects: Deep Learning-Based PLM Ensemble Techniques (EPC 프로젝트의 위험 관리를 위한 ITB 문서 조항 분류 모델 연구: 딥러닝 기반 PLM 앙상블 기법 활용)

  • Hyunsang Lee;Wonseok Lee;Bogeun Jo;Heejun Lee;Sangjin Oh;Sangwoo You;Maru Nam;Hyunsik Lee
    • KIPS Transactions on Software and Data Engineering / v.12 no.11 / pp.471-480 / 2023
  • Construction order volume in South Korea grew significantly, from 91.3 trillion won in public orders in 2013 to a total of 212 trillion won in 2021, with particular growth in the private sector. As the domestic and overseas markets grew, the scale and complexity of EPC (Engineering, Procurement, Construction) projects increased, and risk management of project execution and ITB (Invitation to Bid) documents became a critical issue. The time granted to construction companies in the bidding process following the EPC project award is limited, and reviewing all the risk terms in an ITB document is extremely challenging due to manpower and cost constraints. Previous research attempted to categorize the risk terms in EPC contract documents and detect them with AI, but data-related problems, such as limited labeled data and class imbalance, restricted practical use. Therefore, this study aims to develop an AI model that categorizes contract terms in detail based on the FIDIC (Fédération Internationale des Ingénieurs-Conseils) Yellow Book 2017 standard contract terms, rather than defining and classifying risk terms as in previous research. A multi-text classification capability is necessary because the contract terms that need detailed review vary with the scale and type of the project. To enhance the performance of the multi-text classification model, we developed an ELECTRA PLM (Pre-trained Language Model) capable of efficiently learning the context of text data from the pre-training stage, and conducted a four-step experiment to validate the model's performance. As a result, an ensemble of the self-developed ITB-ELECTRA model and Legal-BERT achieved the best performance, with a weighted average F1-score of 76% in the classification of 57 contract terms.
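
A minimal sketch of the final step, assuming soft voting between the two models and scikit-learn's weighted-average F1 (the paper's exact ensembling scheme is not specified in the abstract):

```python
import numpy as np
from sklearn.metrics import f1_score

def ensemble_predict(probs_itb_electra, probs_legal_bert, w=0.5):
    """Soft-voting ensemble over the 57 contract-term classes: average the
    two models' class probabilities and take the argmax. The 50/50 weight
    is an assumption; the paper does not give its exact scheme."""
    probs = w * probs_itb_electra + (1 - w) * probs_legal_bert
    return probs.argmax(axis=1)

def weighted_f1(y_true, y_pred):
    """Weighted-average F1, the metric reported in the paper (76%)."""
    return f1_score(y_true, y_pred, average="weighted")
```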

Rough Set Analysis for Stock Market Timing (러프집합분석을 이용한 매매시점 결정)

  • Huh, Jin-Nyung;Kim, Kyoung-Jae;Han, In-Goo
    • Journal of Intelligence and Information Systems / v.16 no.3 / pp.77-97 / 2010
  • Market timing is an investment strategy used to obtain excess return from financial markets. In general, market timing means determining when to buy and sell so as to earn excess return from trading. In many market timing systems, trading rules have been used as an engine to generate trade signals. On the other hand, some researchers have proposed rough set analysis as a proper tool for market timing because, by using the control function, it does not generate a trade signal when the market pattern is uncertain. Numeric values in the data must be discretized for rough set analysis because the method only accepts categorical data. Discretization searches for proper "cuts" in the numeric data that determine intervals; all values that lie within an interval are transformed into the same value. In general, there are four methods of data discretization in rough set analysis: equal frequency scaling, expert's knowledge-based discretization, minimum entropy scaling, and naïve and Boolean reasoning-based discretization. Equal frequency scaling fixes a number of intervals, examines the histogram of each variable, and then determines cuts so that approximately the same number of samples falls into each interval. Expert's knowledge-based discretization determines cuts according to the knowledge of domain experts, gathered through literature review or interviews. Minimum entropy scaling recursively partitions the value set of each variable so that a local measure of entropy is optimized. Naïve and Boolean reasoning-based discretization finds categorical values by naïvely scaling the data and then searches for optimized discretization thresholds through Boolean reasoning. Although rough set analysis is promising for market timing, there is little research on how the various data discretization methods affect the performance of trading based on rough set analysis. In this study, we compare stock market timing models that use rough set analysis with various data discretization methods. The research data are the KOSPI 200 from May 1996 to October 1998. The KOSPI 200 is the underlying index of the KOSPI 200 futures, the first derivative instrument in the Korean stock market; it is a market-value-weighted index consisting of 200 stocks selected by criteria on liquidity and status in the corresponding industries, including manufacturing, construction, communication, electricity and gas, distribution and services, and financing. The total number of samples is 660 trading days. In addition, this study uses popular technical indicators as independent variables. The experimental results show that the most profitable method for the training sample is naïve and Boolean reasoning-based discretization, but expert's knowledge-based discretization is the most profitable for the validation sample. In addition, expert's knowledge-based discretization produced robust performance on both the training and validation samples. We also compared rough set analysis with a decision tree, using C4.5 for comparison. The results show that rough set analysis with expert's knowledge-based discretization produced more profitable rules than C4.5.
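
Of the four discretization schemes described above, equal frequency scaling is the easiest to sketch: pandas' qcut chooses cuts so that roughly the same number of samples falls into each interval. The bin count of 4 and the synthetic indicator series below are illustrative, not the study's settings.

```python
import numpy as np
import pandas as pd

def equal_frequency_discretize(series, n_intervals=4):
    """Equal frequency scaling: choose cuts so roughly the same number of
    samples falls into each interval, then replace values by interval codes."""
    codes, cuts = pd.qcut(series, q=n_intervals, labels=False,
                          retbins=True, duplicates="drop")
    return codes, cuts

# Usage on a synthetic technical indicator (660 trading days, as in the study)
rng = np.random.default_rng(0)
indicator = pd.Series(rng.normal(size=660))
codes, cuts = equal_frequency_discretize(indicator)
print(cuts)                  # the "cuts" that bound each interval
print(codes.value_counts())  # approximately equal counts per interval
```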