• Title/Summary/Keyword: 성능목표

Search Result 2,190, Processing Time 0.026 seconds

Analysis of grout injection distance in single rock joint (단일절리 암반에서 그라우팅 주입거리 분석)

  • Ji-Yeong Kim;Jo-Hyun Weon;Jong-Won Lee;Tae-Min Oh
    • Journal of Korean Tunnelling and Underground Space Association
    • /
    • v.25 no.6
    • /
    • pp.541-554
    • /
    • 2023
  • The utilization of underground spaces in relation to tunnels and energy/waste storage is on the rise. To ensure the stability of underground spaces, it is crucial to reinforce rock fractures and discontinuities. Discontinuities, such as joints, can weaken the strength of the rock and lead to groundwater inflow into underground spaces. In order to enhance the strength and stability of the area around these discontinuities, rock grouting techniques are employed. However, during rock grouting, it is impossible to visually confirm whether the grouting material is being smoothly injected as intended. Without proper injection, the expected increases in strength, durability, and degree of consolidation may not be achieved. Therefore, it is necessary to predict in advance whether the grouting material is being injected as designed. In this study, we aimed to assess the injection performance based on injection variables such as the water/cement mixture ratio, injection pressure, and injection flow using UDEC (Universal Distinct Element Code) numerical program. Additionally, numerical results were validated by the lab experiment. The results of this study are expected to help optimize variables such as injection material properties, injection time, and pump pressure in the grouting design in the field.

One-shot multi-speaker text-to-speech using RawNet3 speaker representation (RawNet3를 통해 추출한 화자 특성 기반 원샷 다화자 음성합성 시스템)

  • Sohee Han;Jisub Um;Hoirin Kim
    • Phonetics and Speech Sciences
    • /
    • v.16 no.1
    • /
    • pp.67-76
    • /
    • 2024
  • Recent advances in text-to-speech (TTS) technology have significantly improved the quality of synthesized speech, reaching a level where it can closely imitate natural human speech. Especially, TTS models offering various voice characteristics and personalized speech, are widely utilized in fields such as artificial intelligence (AI) tutors, advertising, and video dubbing. Accordingly, in this paper, we propose a one-shot multi-speaker TTS system that can ensure acoustic diversity and synthesize personalized voice by generating speech using unseen target speakers' utterances. The proposed model integrates a speaker encoder into a TTS model consisting of the FastSpeech2 acoustic model and the HiFi-GAN vocoder. The speaker encoder, based on the pre-trained RawNet3, extracts speaker-specific voice features. Furthermore, the proposed approach not only includes an English one-shot multi-speaker TTS but also introduces a Korean one-shot multi-speaker TTS. We evaluate naturalness and speaker similarity of the generated speech using objective and subjective metrics. In the subjective evaluation, the proposed Korean one-shot multi-speaker TTS obtained naturalness mean opinion score (NMOS) of 3.36 and similarity MOS (SMOS) of 3.16. The objective evaluation of the proposed English and Korean one-shot multi-speaker TTS showed a prediction MOS (P-MOS) of 2.54 and 3.74, respectively. These results indicate that the performance of our proposed model is improved over the baseline models in terms of both naturalness and speaker similarity.

Blind Rhythmic Source Separation (블라인드 방식의 리듬 음원 분리)

  • Kim, Min-Je;Yoo, Ji-Ho;Kang, Kyeong-Ok;Choi, Seung-Jin
    • The Journal of the Acoustical Society of Korea
    • /
    • v.28 no.8
    • /
    • pp.697-705
    • /
    • 2009
  • An unsupervised (blind) method is proposed aiming at extracting rhythmic sources from commercial polyphonic music whose number of channels is limited to one. Commercial music signals are not usually provided with more than two channels while they often contain multiple instruments including singing voice. Therefore, instead of using conventional modeling of mixing environments or statistical characteristics, we should introduce other source-specific characteristics for separating or extracting sources in the under determined environments. In this paper, we concentrate on extracting rhythmic sources from the mixture with the other harmonic sources. An extension of nonnegative matrix factorization (NMF), which is called nonnegative matrix partial co-factorization (NMPCF), is used to analyze multiple relationships between spectral and temporal properties in the given input matrices. Moreover, temporal repeatability of the rhythmic sound sources is implicated as a common rhythmic property among segments of an input mixture signal. The proposed method shows acceptable, but not superior separation quality to referred prior knowledge-based drum source separation systems, but it has better applicability due to its blind manner in separation, for example, when there is no prior information or the target rhythmic source is irregular.

A Study on the Drug Classification Using Machine Learning Techniques (머신러닝 기법을 이용한 약물 분류 방법 연구)

  • Anmol Kumar Singh;Ayush Kumar;Adya Singh;Akashika Anshum;Pradeep Kumar Mallick
    • Advanced Industrial SCIence
    • /
    • v.3 no.2
    • /
    • pp.8-16
    • /
    • 2024
  • This paper shows the system of drug classification, the goal of this is to foretell the apt drug for the patients based on their demographic and physiological traits. The dataset consists of various attributes like Age, Sex, BP (Blood Pressure), Cholesterol Level, and Na_to_K (Sodium to Potassium ratio), with the objective to determine the kind of drug being given. The models used in this paper are K-Nearest Neighbors (KNN), Logistic Regression and Random Forest. Further to fine-tune hyper parameters using 5-fold cross-validation, GridSearchCV was used and each model was trained and tested on the dataset. To assess the performance of each model both with and without hyper parameter tuning evaluation metrics like accuracy, confusion matrices, and classification reports were used and the accuracy of the models without GridSearchCV was 0.7, 0.875, 0.975 and with GridSearchCV was 0.75, 1.0, 0.975. According to GridSearchCV Logistic Regression is the most suitable model for drug classification among the three-model used followed by the K-Nearest Neighbors. Also, Na_to_K is an essential feature in predicting the outcome.

Customer Behavior Prediction of Binary Classification Model Using Unstructured Information and Convolution Neural Network: The Case of Online Storefront (비정형 정보와 CNN 기법을 활용한 이진 분류 모델의 고객 행태 예측: 전자상거래 사례를 중심으로)

  • Kim, Seungsoo;Kim, Jongwoo
    • Journal of Intelligence and Information Systems
    • /
    • v.24 no.2
    • /
    • pp.221-241
    • /
    • 2018
  • Deep learning is getting attention recently. The deep learning technique which had been applied in competitions of the International Conference on Image Recognition Technology(ILSVR) and AlphaGo is Convolution Neural Network(CNN). CNN is characterized in that the input image is divided into small sections to recognize the partial features and combine them to recognize as a whole. Deep learning technologies are expected to bring a lot of changes in our lives, but until now, its applications have been limited to image recognition and natural language processing. The use of deep learning techniques for business problems is still an early research stage. If their performance is proved, they can be applied to traditional business problems such as future marketing response prediction, fraud transaction detection, bankruptcy prediction, and so on. So, it is a very meaningful experiment to diagnose the possibility of solving business problems using deep learning technologies based on the case of online shopping companies which have big data, are relatively easy to identify customer behavior and has high utilization values. Especially, in online shopping companies, the competition environment is rapidly changing and becoming more intense. Therefore, analysis of customer behavior for maximizing profit is becoming more and more important for online shopping companies. In this study, we propose 'CNN model of Heterogeneous Information Integration' using CNN as a way to improve the predictive power of customer behavior in online shopping enterprises. In order to propose a model that optimizes the performance, which is a model that learns from the convolution neural network of the multi-layer perceptron structure by combining structured and unstructured information, this model uses 'heterogeneous information integration', 'unstructured information vector conversion', 'multi-layer perceptron design', and evaluate the performance of each architecture, and confirm the proposed model based on the results. In addition, the target variables for predicting customer behavior are defined as six binary classification problems: re-purchaser, churn, frequent shopper, frequent refund shopper, high amount shopper, high discount shopper. In order to verify the usefulness of the proposed model, we conducted experiments using actual data of domestic specific online shopping company. This experiment uses actual transactions, customers, and VOC data of specific online shopping company in Korea. Data extraction criteria are defined for 47,947 customers who registered at least one VOC in January 2011 (1 month). The customer profiles of these customers, as well as a total of 19 months of trading data from September 2010 to March 2012, and VOCs posted for a month are used. The experiment of this study is divided into two stages. In the first step, we evaluate three architectures that affect the performance of the proposed model and select optimal parameters. We evaluate the performance with the proposed model. Experimental results show that the proposed model, which combines both structured and unstructured information, is superior compared to NBC(Naïve Bayes classification), SVM(Support vector machine), and ANN(Artificial neural network). Therefore, it is significant that the use of unstructured information contributes to predict customer behavior, and that CNN can be applied to solve business problems as well as image recognition and natural language processing problems. It can be confirmed through experiments that CNN is more effective in understanding and interpreting the meaning of context in text VOC data. And it is significant that the empirical research based on the actual data of the e-commerce company can extract very meaningful information from the VOC data written in the text format directly by the customer in the prediction of the customer behavior. Finally, through various experiments, it is possible to say that the proposed model provides useful information for the future research related to the parameter selection and its performance.

Improvement of the Fishing Gear and Fishing Method of the East-Sea Trawl Fishery (동해구 트롤 어구어법의 개량)

  • 권병국;이주희;이춘우;김형석;김용식;안영일;김정문
    • Journal of the Korean Society of Fisheries and Ocean Technology
    • /
    • v.37 no.2
    • /
    • pp.106-116
    • /
    • 2001
  • A serious of studies on the fishing gear and system of the East Sea trawl fishery was carried out to improve the fishing efficiency and the working conditions. As the first step of these studies, the fishing gear and system of the traditional East Sea trawl were checked in order to solve the some problems, such as the poor sheering efficiency of net mouth, the inconvenient fishing system of the side trawl and etc. And then the fishing system was reorganized from the side trawl into the stern trawl by setting up the net drum system on the stern deck, and introduction of two types of new designed nets, one for mainly the midwater trawl and the other for the bottom trawl. The results of the field experiment on the modified system and nets can be summarized as follows : 1. the modified system was well worked and could save the man-labour by about 80%. 2. The sheering efficiency of the improved net, A type was improved to 20 m height and 30 m width in the net mouth, and that of B type net, to 10 m height and 33 m width, compared with 1.5 m height and 15 m width in the traditional net. 3. Catch efficiency of pink shrimp in A or B type net was better about 3 or 5 times than that of traditional net, and in B net, for herring and other bottom fishes is better about 2 times than that of the traditional net.

  • PDF

Development of Deep Learning Structure to Improve Quality of Polygonal Containers (다각형 용기의 품질 향상을 위한 딥러닝 구조 개발)

  • Yoon, Suk-Moon;Lee, Seung-Ho
    • Journal of IKEEE
    • /
    • v.25 no.3
    • /
    • pp.493-500
    • /
    • 2021
  • In this paper, we propose the development of deep learning structure to improve quality of polygonal containers. The deep learning structure consists of a convolution layer, a bottleneck layer, a fully connect layer, and a softmax layer. The convolution layer is a layer that obtains a feature image by performing a convolution 3x3 operation on the input image or the feature image of the previous layer with several feature filters. The bottleneck layer selects only the optimal features among the features on the feature image extracted through the convolution layer, reduces the channel to a convolution 1x1 ReLU, and performs a convolution 3x3 ReLU. The global average pooling operation performed after going through the bottleneck layer reduces the size of the feature image by selecting only the optimal features among the features of the feature image extracted through the convolution layer. The fully connect layer outputs the output data through 6 fully connect layers. The softmax layer multiplies and multiplies the value between the value of the input layer node and the target node to be calculated, and converts it into a value between 0 and 1 through an activation function. After the learning is completed, the recognition process classifies non-circular glass bottles by performing image acquisition using a camera, measuring position detection, and non-circular glass bottle classification using deep learning as in the learning process. In order to evaluate the performance of the deep learning structure to improve quality of polygonal containers, as a result of an experiment at an authorized testing institute, it was calculated to be at the same level as the world's highest level with 99% good/defective discrimination accuracy. Inspection time averaged 1.7 seconds, which was calculated within the operating time standards of production processes using non-circular machine vision systems. Therefore, the effectiveness of the performance of the deep learning structure to improve quality of polygonal containers proposed in this paper was proven.

Development of Digital Transceiver Unit for 5G Optical Repeater (5G 광중계기 구동을 위한 디지털 송수신 유닛 설계)

  • Min, Kyoung-Ok;Lee, Seung-Ho
    • Journal of IKEEE
    • /
    • v.25 no.1
    • /
    • pp.156-167
    • /
    • 2021
  • In this paper, we propose a digital transceiver unit design for in-building of 5G optical repeaters that extends the coverage of 5G mobile communication network services and connects to a stable wireless network in a building. The digital transceiver unit for driving the proposed 5G optical repeater is composed of 4 blocks: a signal processing unit, an RF transceiver unit, an optical input/output unit, and a clock generation unit. The signal processing unit plays an important role, such as a combination of a basic operation of the CPRI interface, a 4-channel antenna signal, and response to external control commands. It also transmits and receives high-quality IQ data through the JESD204B interface. CFR and DPD blocks operate to protect the power amplifier. The RF transmitter/receiver converts the RF signal received from the antenna to AD, is transmitted to the signal processing unit through the JESD204B interface, and DA converts the digital signal transmitted from the signal processing unit to the JESD204B interface and transmits the RF signal to the antenna. The optical input/output unit converts an electric signal into an optical signal and transmits it, and converts the optical signal into an electric signal and receives it. The clock generator suppresses jitter of the synchronous clock supplied from the CPRI interface of the optical input/output unit, and supplies a stable synchronous clock to the signal processing unit and the RF transceiver. Before CPRI connection, a local clock is supplied to operate in a CPRI connection ready state. XCZU9CG-2FFVC900I of Xilinx's MPSoC series was used to evaluate the accuracy of the digital transceiver unit for driving the 5G optical repeater proposed in this paper, and Vivado 2018.3 was used as the design tool. The 5G optical repeater digital transceiver unit proposed in this paper converts the 5G RF signal input to the ADC into digital and transmits it to the JIG through CPRI and outputs the downlink data signal received from the JIG through the CPRI to the DAC. And evaluated the performance. The experimental results showed that flatness, Return Loss, Channel Power, ACLR, EVM, Frequency Error, etc. exceeded the target set value.

Automatic Speech Style Recognition Through Sentence Sequencing for Speaker Recognition in Bilateral Dialogue Situations (양자 간 대화 상황에서의 화자인식을 위한 문장 시퀀싱 방법을 통한 자동 말투 인식)

  • Kang, Garam;Kwon, Ohbyung
    • Journal of Intelligence and Information Systems
    • /
    • v.27 no.2
    • /
    • pp.17-32
    • /
    • 2021
  • Speaker recognition is generally divided into speaker identification and speaker verification. Speaker recognition plays an important function in the automatic voice system, and the importance of speaker recognition technology is becoming more prominent as the recent development of portable devices, voice technology, and audio content fields continue to expand. Previous speaker recognition studies have been conducted with the goal of automatically determining who the speaker is based on voice files and improving accuracy. Speech is an important sociolinguistic subject, and it contains very useful information that reveals the speaker's attitude, conversation intention, and personality, and this can be an important clue to speaker recognition. The final ending used in the speaker's speech determines the type of sentence or has functions and information such as the speaker's intention, psychological attitude, or relationship to the listener. The use of the terminating ending has various probabilities depending on the characteristics of the speaker, so the type and distribution of the terminating ending of a specific unidentified speaker will be helpful in recognizing the speaker. However, there have been few studies that considered speech in the existing text-based speaker recognition, and if speech information is added to the speech signal-based speaker recognition technique, the accuracy of speaker recognition can be further improved. Hence, the purpose of this paper is to propose a novel method using speech style expressed as a sentence-final ending to improve the accuracy of Korean speaker recognition. To this end, a method called sentence sequencing that generates vector values by using the type and frequency of the sentence-final ending appearing in the utterance of a specific person is proposed. To evaluate the performance of the proposed method, learning and performance evaluation were conducted with a actual drama script. The method proposed in this study can be used as a means to improve the performance of Korean speech recognition service.

A preliminary assessment of high-spatial-resolution satellite rainfall estimation from SAR Sentinel-1 over the central region of South Korea (한반도 중부지역에서의 SAR Sentinel-1 위성강우량 추정에 관한 예비평가)

  • Nguyen, Hoang Hai;Jung, Woosung;Lee, Dalgeun;Shin, Daeyun
    • Journal of Korea Water Resources Association
    • /
    • v.55 no.6
    • /
    • pp.393-404
    • /
    • 2022
  • Reliable terrestrial rainfall observations from satellites at finer spatial resolution are essential for urban hydrological and microscale agricultural demands. Although various traditional "top-down" approach-based satellite rainfall products were widely used, they are limited in spatial resolution. This study aims to assess the potential of a novel "bottom-up" approach for rainfall estimation, the parameterized SM2RAIN model, applied to the C-band SAR Sentinel-1 satellite data (SM2RAIN-S1), to generate high-spatial-resolution terrestrial rainfall estimates (0.01° grid/6-day) over Central South Korea. Its performance was evaluated for both spatial and temporal variability using the respective rainfall data from a conventional reanalysis product and rain gauge network for a 1-year period over two different sub-regions in Central South Korea-the mixed forest-dominated, middle sub-region and cropland-dominated, west coast sub-region. Evaluation results indicated that the SM2RAIN-S1 product can capture general rainfall patterns in Central South Korea, and hold potential for high-spatial-resolution rainfall measurement over the local scale with different land covers, while less biased rainfall estimates against rain gauge observations were provided. Moreover, the SM2RAIN-S1 rainfall product was better in mixed forests considering the Pearson's correlation coefficient (R = 0.69), implying the suitability of 6-day SM2RAIN-S1 data in capturing the temporal dynamics of soil moisture and rainfall in mixed forests. However, in terms of RMSE and Bias, better performance was obtained with the SM2RAIN-S1 rainfall product over croplands rather than mixed forests, indicating that larger errors induced by high evapotranspiration losses (especially in mixed forests) need to be included in further improvement of the SM2RAIN.