• Title/Summary/Keyword: Residual Block

Emotion Transfer with Strength Control for End-to-End TTS (감정 제어 가능한 종단 간 음성합성 시스템)

  • Jeon, Yejin;Lee, Gary Geunbae
    • Annual Conference on Human and Language Technology
    • /
    • 2021.10a
    • /
    • pp.423-426
    • /
    • 2021
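  • This paper introduces a method for controlling the strength of emotion based on Global Style Tokens (GST). Previous GST work synthesized speech using a reference audio clip that contained the desired style. However, because speech could only be synthesized in the style of the reference audio, fine-grained emotion control was difficult. To address this problem, this paper replaces the reference encoder of the GST model with residual blocks and AlexNet, a network used in computer vision. AlexNet consists of five convolutional layers, but this paper uses only four of them, leaving one out. A listening evaluation (Mean Opinion Score) shows that the proposed method makes it possible to control emotion strength.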

Development of Technique in Super Resolution domain that eliminates unnecessary Correlation information between Pixels & Channels. (픽셀, 채널간 불필요한 상호연관 정보를 제거하는 초해상화 딥러닝 기법)

  • Kang, Jung-Heum;Bae, Sung-Ho
    • Proceedings of the Korean Society of Broadcast Engineers Conference
    • /
    • 2020.07a
    • /
    • pp.656-659
    • /
    • 2020
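  • Super-resolution deep learning methods need at least several hundred epochs to converge during training, which takes a long time. Recently, the Deconvolution technique, which removes unnecessary correlation between pixels and between channels, was proposed to speed up training convergence in deep learning models for image recognition. This paper applies the Deconvolution technique to super-resolution deep learning for the first time in an attempt to accelerate training convergence. Unlike image recognition, super-resolution must preserve the information in the feature-extraction and image-reconstruction parts of the network. We therefore used EDSR as the baseline model, kept the conventional convolution operations in the layers at both ends, and replaced only the convolution operations inside the ResBlocks of the middle layers with Deconvolution operations. Experiments on super-resolution benchmark datasets showed that convergence did not become faster. We attribute the failure of the Deconvolution technique to improve the baseline model to the residual learning that is applied by default in the super-resolution field.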
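
As a rough illustration of the residual learning this abstract refers to, a residual block adds its input back to the output of an inner transform, y = x + F(x). The sketch below is not the paper's EDSR code: `residual_block` and `toy_transform` are hypothetical names, and the toy transform only stands in for the convolution layers inside a real ResBlock.

```python
from typing import Callable, List

Vector = List[float]


def residual_block(x: Vector, transform: Callable[[Vector], Vector]) -> Vector:
    """Return x + transform(x), the skip-connection form of a residual block."""
    fx = transform(x)
    if len(fx) != len(x):
        raise ValueError("transform must preserve the input length")
    return [xi + fi for xi, fi in zip(x, fx)]


def toy_transform(x: Vector) -> Vector:
    # Hypothetical stand-in for the inner layers: scale by 0.5, then ReLU.
    return [max(0.0, 0.5 * xi) for xi in x]


y = residual_block([1.0, -2.0, 4.0], toy_transform)
```

Because the input is passed through unchanged, the inner transform only has to model the difference between input and target, which is the sense of "residual learning" used above.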

A Fast Inter-layer Mode Decision Method in Scalable Video Coding (공간적 스케일러블 비디오 부호화에서 계층간 모드 고속 결정 방법)

  • Lee, Bum-Shik;Hahm, Sang-Jin;Park, Chang-Seob;Park, Keun-Soo;Kim, Mun-Churl
    • Journal of Broadcast Engineering
    • /
    • v.12 no.4
    • /
    • pp.360-372
    • /
    • 2007
  • We propose a fast inter-layer mode decision method that uses the coding information of the base layer for its enhancement layer in scalable video coding (SVC), also called MPEG-4 Part 10 Advanced Video Coding Amendment 3 or H.264 Scalable Extension (SE), which is being standardized. In this paper, when the motion vectors from the base layer have zero motion (0, 0) in inter-layer motion prediction, or when the integer transform coefficients of the residual between the current MB and the MB motion-compensated by the predicted motion vectors from the base layer are all zero, the block mode of the corresponding block to be encoded at the enhancement layer is set to the $16{\times}16$ mode. In addition, if the predicted mode of the MB to be encoded at the enhancement layer is not the $16{\times}16$ mode, rate-distortion optimization is performed only on a reduced set of candidate modes, namely those with the same or smaller partitions. The proposed method reduces encoding time by up to 72%, with negligible PSNR degradation of up to 0.25 dB and a bit rate increase of up to 1.73%.
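
The decision rule in this abstract can be sketched as a small function. This is only an illustration under stated assumptions, not the authors' encoder: the mode list, the `PARTITION_LEVEL` ranking (16x8 and 8x16 treated as the same partition size), and the fallback to a full search when the predicted mode is 16x16 are all assumptions, and `candidate_modes` is a hypothetical name.

```python
from typing import Dict, List, Sequence, Tuple

# Assumed mode set, ordered from the largest to the smallest partition.
ALL_MODES: List[str] = ["16x16", "16x8", "8x16", "8x8"]
# Assumed ranking: a higher level means a smaller partition.
PARTITION_LEVEL: Dict[str, int] = {"16x16": 0, "16x8": 1, "8x16": 1, "8x8": 2}


def candidate_modes(
    base_mv: Tuple[int, int],
    residual_coeffs: Sequence[int],
    predicted_mode: str,
) -> List[str]:
    """Return the enhancement-layer modes left for rate-distortion optimization."""
    # Early termination: zero base-layer motion or an all-zero residual.
    if base_mv == (0, 0) or all(c == 0 for c in residual_coeffs):
        return ["16x16"]
    # Reduced search: keep only the same or smaller partitions.
    if predicted_mode != "16x16":
        level = PARTITION_LEVEL[predicted_mode]
        return [m for m in ALL_MODES if PARTITION_LEVEL[m] >= level]
    # Assumption: the abstract does not cover this case, so search all modes.
    return list(ALL_MODES)


modes = candidate_modes((1, -2), [5, 0, 0], "16x8")
```

The encoding-time saving comes from the first two branches, which shrink the set of modes that the costly rate-distortion search has to evaluate.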

The Consideration for Optimum 3D Seismic Processing Procedures in Block II, Northern Part of South Yellow Sea Basin (대륙붕 2광구 서해분지 북부지역의 3D전산처리 최적화 방안시 고려점)

  • Ko, Seung-Won;Shin, Kook-Sun;Jung, Hyun-Young
    • The Korean Journal of Petroleum Geology
    • /
    • v.11 no.1 s.12
    • /
    • pp.9-17
    • /
    • 2005
  • In the main target area of Block II, large-scale faults occur below the unconformity developed at around 1 km depth. The contrast in seismic velocity across the unconformity is generally so large that strong multiples and abrupt velocity variation seriously distort the migrated section and degrade its quality. More than 15 kinds of data processing techniques were applied to improve the image resolution of the structures formed by this active crustal activity. In the first step, bad and noisy traces were edited on the common shot gathers to remove acquisition problems caused by unfavorable conditions, such as weather changes during data acquisition. Amplitude attenuation caused by spherical divergence and inelastic attenuation was also corrected. A mild F/K filter was used to attenuate coherent noise such as guided waves and side scatter. Predictive deconvolution was applied before stacking to remove peg-leg multiples and water reverberations. Velocity analysis was conducted at 2 km intervals to determine the migration velocity, and it was iterated to obtain a high-fidelity image. The strum noise caused by the streamer was completely removed by applying predictive deconvolution in the time-space and ${\tau}-P$ domains. Residual multiples caused by thin layers or the water bottom were eliminated with a parabolic Radon transform demultiple process. Migration with a curved-ray Kirchhoff-style algorithm was applied to the stacked data. The velocity obtained after several iterations of MVA (migration velocity analysis) was used as the migration velocity instead of DMO. By using various testing methods, optimum seismic processing parameters can be obtained for structural and stratigraphic interpretation in Block II, Yellow Sea Basin.

Production of Single Core with Waste Zirconia Block (지르코니아 블록 폐기물을 이용한 싱글코어의 제조법)

  • Jo, Jun-Ho;Seo, Jeong-Il;Bae, Won-Tae
    • Journal of Technologic Dentistry
    • /
    • v.35 no.1
    • /
    • pp.57-64
    • /
    • 2013
  • Purpose: Waste pieces of zirconia blocks and powders remain after the CAD/CAM process. To put this residual zirconia to practical use, zirconia single cores were produced by a drain casting process. Methods: The remaining zirconia blocks were ground to powder with a zirconia mortar and screened with a 180 mesh sieve. Zirconia slip was prepared from the waste zirconia by ball milling. Plaster molds for forming cores by slip casting were also prepared. The formed cores were removed from the mold after partial drying. The dried cores were biscuit fired at $1,100^{\circ}C$ for 1 hour. The biscuit-fired cores were worked with tools to adjust the fit and thickness. The finished cores were given a second firing at $1,500^{\circ}C$ for 1 hour. The microstructure of the core cross section was observed by SEM. Results: The optimum slip for drain casting was obtained when the mill pot was filled with 100 g of mixed zirconia and alumina powder, 300 g of zirconia balls, and 180 g of distilled water. Gypsum plaster for ceramic forming was more suitable than yellow stone plaster for the casting process. SEM photographs showed a fully dense microstructure with uniform zirconia grain size and alumina grains well dispersed in the zirconia matrix. Conclusion: Zirconia single cores were produced by a drain casting process. Drain casting is a useful process for putting this residual zirconia to practical use. Further study will focus on preparing bridge-type cores by casting.

Perceptual Generative Adversarial Network for Single Image De-Snowing (단일 영상에서 눈송이 제거를 위한 지각적 GAN)

  • Wan, Weiguo;Lee, Hyo Jong
    • KIPS Transactions on Software and Data Engineering
    • /
    • v.8 no.10
    • /
    • pp.403-410
    • /
    • 2019
  • Image de-snowing aims to eliminate the negative influence of snow particles and improve scene understanding in images. In this paper, a single-image snow removal method based on a perceptual generative adversarial network is proposed. A residual U-Net is designed as the generator that produces the snow-free image. To handle snow particles of various sizes, an inception module with different filter kernels is adopted to extract multi-resolution features from the input snow image. In addition to the adversarial loss, a perceptual loss and a total variation loss are employed to improve the quality of the resulting image. Experimental results indicate that our method performs well on both synthetic and realistic snow images in terms of visual observation and commonly used visual quality indices.

Complications after Senning Operation for TGA with and Without VSD (대혈관전위증에서 Senning수술후 합병증에 관한 임상적 고찰)

  • 안재호
    • Journal of Chest Surgery
    • /
    • v.26 no.8
    • /
    • pp.595-603
    • /
    • 1993
  • We analysed 60 consecutive patients who underwent the Senning operation for transposition of the great arteries [TGA] with or without ventricular septal defect [VSD]. There were 41 patients with simple TGA [group I] and 19 with TGA and VSD [group II]; the operative mortality was 20 % [4.9 % in group I, 52.6 % in group II]. Among the survivors [n=48], the mean follow-up period was 7 years [range, 1 year to 13.5 years], and the actuarial survival rate at 13 years was 95 % in group I and 42 % in group II. High preoperative left ventricular pressure and high pulmonary arterial pressure affected survival [p<0.01]. Various types of arrhythmia occurred, including junctional rhythm, first degree atrioventricular [AV] block, sick sinus syndrome and complete AV block, and we implanted 2 permanent pacemakers in these patients. The incidence of arrhythmia was 28.2 % [11/39] in group I and 55.6 % [5/9] in group II, and the actuarial freedom from arrhythmia at 13 years after operation was 66 % [71 % in group I, 44 % in group II]. Longer aortic cross-clamping time, which reflects the complexity of the operation, was associated with the development of arrhythmia [p<0.05]. The total incidence of left ventricular outflow tract obstruction [LVOTO] was 31.3 % [15/48], but only 3 patients [6.25 %] showed a significant gradient requiring reoperation. Pulmonary venous pathway obstruction [PVO] was found in 3 patients, all in group I, and only one of them required reoperation. The estimated freedom from PVO was 89 % at 13 years [87 % in group I, 100 % in group II], and we did not find any significant systemic venous obstruction in our series. Mild tricuspid valve regurgitation occurred in 27.1 % [13/48], none requiring surgical correction. The reoperation rate was 14.6 % [7/48]: 3 residual VSD, 3 LVOTO, 1 PVO, 3 atrial baffle leakage. Because of the high complication rate after the Senning operation and the high mortality in TGA with VSD, we no longer use this procedure and have performed the Jatene operation for all TGA patients for several years.

Bandwidth-Efficient OFDM Transmission with Iterative Cyclic Prefix Reconstruction

  • Lim, Jong-Bu;Kim, Eung-Sun;Park, Cheol-Jin;Won, Hui-Chul;Kim, Ki-Ho;Im, Gi-Hong
    • Journal of Communications and Networks
    • /
    • v.10 no.3
    • /
    • pp.239-252
    • /
    • 2008
  • For orthogonal frequency division multiplexing (OFDM), cyclic prefix (CP) should be longer than the length of channel impulse response, resulting in a loss of bandwidth efficiency. In this paper, we describe a new technique to restore the cyclicity of the received signal when the CP is not sufficient for OFDM systems. The proposed technique efficiently restores the cyclicity of the current received symbol by adding the weighted next received symbol to the current received symbol. Iterative CP reconstruction (CPR) procedure, based on the residual intersymbol interference cancellation (RISIC) algorithm, is analyzed and compared to the RISIC. In addition, we apply the CPR method to Alamouti space-time block coded (STBC) OFDM system. It is shown that in the STBC OFDM, tail cancellation as well as cyclic reconstruction of the CPR procedure should be repeated. The computational complexities of the RISIC, the proposed CPR, the RISIC with STBC, and the proposed CPR with STBC are analyzed and their performances are evaluated in multipath fading environments. We also propose an iterative channel estimation (CE) method for OFDM with insufficient CP. Further, we discuss the CE method for the STBC OFDM system with the CPR. It is shown that the CPR technique with the proposed CE method minimizes the loss of bandwidth efficiency due to the use of CP, without sacrificing the diversity gain of the STBC OFDM system.
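
The first sentence of this abstract rests on a standard property: when the CP covers the channel memory, removing it leaves a circular convolution of the symbol with the channel, and that cyclicity is lost when the CP is too short. The sketch below only demonstrates this property for one isolated symbol, ignoring noise and interference from the preceding symbol; it does not implement the paper's CPR or RISIC procedures, and all function names and sample values are hypothetical.

```python
from typing import List


def linear_convolve(x: List[float], h: List[float]) -> List[float]:
    """Plain linear convolution, modelling the multipath channel."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y


def circular_convolve(x: List[float], h: List[float]) -> List[float]:
    """Circular convolution over the symbol length."""
    n = len(x)
    y = [0.0] * n
    for i in range(n):
        for j, hj in enumerate(h):
            y[i] += x[(i - j) % n] * hj
    return y


def receive_with_cp(symbol: List[float], channel: List[float], cp_len: int) -> List[float]:
    """Prepend a cyclic prefix, pass through the channel, then drop the prefix."""
    tx = symbol[-cp_len:] + symbol if cp_len > 0 else list(symbol)
    rx = linear_convolve(tx, channel)
    return rx[cp_len:cp_len + len(symbol)]


symbol = [1.0, 2.0, 3.0, 4.0]
channel = [1.0, 0.5, 0.25]  # three taps, so the CP must cover two samples

sufficient = receive_with_cp(symbol, channel, cp_len=2)
insufficient = receive_with_cp(symbol, channel, cp_len=1)
reference = circular_convolve(symbol, channel)
```

Here `sufficient` matches `reference`, while `insufficient` differs in its first sample, which is the lost cyclicity that a reconstruction step has to restore.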

Volumetric stability of autogenous bone graft with mandibular body bone: cone-beam computed tomography and three-dimensional reconstruction analysis

  • Lee, Hyeong-Geun;Kim, Yong-Deok
    • Journal of the Korean Association of Oral and Maxillofacial Surgeons
    • /
    • v.41 no.5
    • /
    • pp.232-239
    • /
    • 2015
  • Objectives: The purpose of this study was to estimate the volumetric change of augmented autobone harvested from mandibular body cortical bone, using cone-beam computed tomography (CBCT) and three-dimensional reconstruction. In addition, the clinical success of dental implants placed 4 to 6 months after bone grafting was also evaluated. Materials and Methods: Ninety-five patients (48 men and 47 women) aged 19 to 72 years were included in this study. A total of 128 graft sites were evaluated. The graft sites were divided into three parts: anterior and both posterior regions of one jaw. All patients included in the study were scheduled for an onlay graft and implantation using a two-stage procedure. The dental implants were inserted 4 to 6 months after the bone graft. Volumetric stability was evaluated by serial CBCT images. Results: No major complications were observed for the donor sites. A total of 128 block bones were used to augment severely resorbed alveolar bone. Only 1 of the 128 bone grafts was resorbed by more than half, and that was due to infection. In total, the average amount of residual grafted bone after resorption at the recipient sites was $74.6\% \pm 8.4\%$. Conclusion: Volumetric stability of mandibular body autogenous block grafts is predictable. The procedure is satisfactory for patients who want dental implants regardless of atrophic alveolar bone.

Distributed Video Coding based on Adaptive Block Quantization Using Received Motion Vectors (수신된 움직임 벡터를 이용한 적응적 블록 양자화 기반 분산 비디오 코딩 방법)

  • Min, Kyung-Yeon;Park, Sea-Nae;Nam, Jung-Hak;Sim, Dong-Gyu;Kim, Sang-Hyo
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.35 no.2C
    • /
    • pp.172-181
    • /
    • 2010
  • In this paper, we propose an adaptive block quantization method. The proposed method perfectly reconstructs the side information without high complexity at the encoder side by transmitting motion vectors from the decoder to the encoder. At the encoder side, residual signals between the reconstructed side information and the original frame are adaptively quantized to minimize the parity bits to be transmitted to the decoder. The proposed method can allocate bits effectively based on the bit error rate of the side information. We can also achieve bit savings by transmitting parity bits according to the error correction ability of the LDPC channel decoder, because the bit error rate and the positions of the error bits are known at the encoder side. Experimental results show that the proposed algorithm reduces the bit rate by around 66%, and also reduces the feedback-channel delay, compared with the conventional algorithm.