• Title/Summary/Keyword: Software engineering

Extending StarGAN-VC to Unseen Speakers Using RawNet3 Speaker Representation (RawNet3 화자 표현을 활용한 임의의 화자 간 음성 변환을 위한 StarGAN의 확장)

  • Bogyung Park;Somin Park;Hyunki Hong
    • KIPS Transactions on Software and Data Engineering / v.12 no.7 / pp.303-314 / 2023
  • Voice conversion, a technology that regenerates an individual's speech with the acoustic properties (tone, cadence, gender) of another speaker, has many applications in education, communication, and entertainment. This paper proposes an approach based on the StarGAN-VC model that generates realistic-sounding speech without requiring parallel utterances. To overcome the constraints of the existing StarGAN-VC model, which encodes source and target speaker information as one-hot vectors, this paper extracts feature vectors of target speakers using a pre-trained RawNet3 model. This yields a latent space in which voice conversion can be performed without direct speaker-to-speaker mappings, enabling an any-to-any structure. In addition to the loss terms used in the original StarGAN-VC model, the Wasserstein distance is used as a loss term to ensure that generated voice segments match the acoustic properties of the target voice. The Two Time-Scale Update Rule (TTUR) is also used to facilitate stable training. Experimental results show that the proposed method outperforms previous methods, including the StarGAN-VC network on which it is based.
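
The abstract names a Wasserstein loss term and TTUR but does not give their exact form. The sketch below shows the common WGAN-style critic and generator objectives together with TTUR expressed as two separate learning rates. The function names, the learning-rate values, and the plain gradient step are illustrative assumptions, not the paper's configuration.

```python
from statistics import fmean

# Hypothetical TTUR setting: the critic is updated with a larger step
# than the generator. The values are assumptions, not from the paper.
LEARNING_RATES = {"critic": 4e-4, "generator": 1e-4}


def critic_loss(real_scores, fake_scores):
    """WGAN-style critic objective to minimise: E[D(fake)] - E[D(real)]."""
    return fmean(fake_scores) - fmean(real_scores)


def generator_loss(fake_scores):
    """WGAN-style generator objective to minimise: -E[D(fake)]."""
    return -fmean(fake_scores)


def gradient_step(param, grad, role):
    """One plain gradient-descent step using the role-specific learning rate."""
    return param - LEARNING_RATES[role] * grad
```

In a full training loop, the critic scores would come from a discriminator network, and each network would typically have its own optimizer configured with its own rate.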

Heterogeneous Sensor Coordinate System Calibration Technique for AR Whole Body Interaction (AR 전신 상호작용을 위한 이종 센서 간 좌표계 보정 기법)

  • Hangkee Kim;Daehwan Kim;Dongchun Lee;Kisuk Lee;Nakhoon Baek
    • KIPS Transactions on Software and Data Engineering / v.12 no.7 / pp.315-324 / 2023
  • Elderly patients with age-related diseases are steadily increasing in number and need a simple and accurate whole-body rehabilitation interaction technology based on immersive digital content. In this study, we introduce a whole-body interaction technology using HoloLens and Kinect for this purpose. To align the coordinate systems of the two devices, we propose three coordinate transformation methods: mesh feature point-based transformation, AR marker-based transformation, and body recognition-based transformation. The mesh feature point-based transformation aligns the coordinate systems by designating three feature points on the spatial mesh and computing a transform matrix. This method requires manual work and is less usable, but has a relatively high accuracy of 8.5 mm. The AR marker-based method uses AR and QR markers recognized simultaneously by HoloLens and Kinect and achieves a respectable accuracy of 11.2 mm. The body recognition-based transformation aligns the coordinate systems using the position of the head or HMD and the positions of both hands or controllers, as recognized by both devices. This method is less accurate, but requires no additional tools or manual work, making it more user-friendly. Additionally, we reduced the error by more than 10% using RANSAC as a post-processing technique. The three methods can be applied selectively depending on the usability and accuracy required by the content. We validated the technology by applying it to the "Thunder Punch" content and to rehabilitation therapy content.
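
To illustrate the three-feature-point idea, here is a minimal sketch that builds an orthonormal frame from three non-collinear points in each sensor's coordinate system and maps points from one frame to the other. It assumes the three points are the same physical points seen by both sensors and that the relation is rigid. The helper names are hypothetical, and this is not necessarily how the authors compute their transform matrix.

```python
import math


def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def unit(v):
    norm = math.sqrt(dot(v, v))
    if norm == 0.0:
        raise ValueError("feature points must be distinct and non-collinear")
    return tuple(x / norm for x in v)


def frame(p0, p1, p2):
    """Origin and orthonormal axes (x, y, z) spanned by three points."""
    x = unit(sub(p1, p0))
    z = unit(cross(x, sub(p2, p0)))
    y = cross(z, x)
    return p0, (x, y, z)


def make_transform(src_points, dst_points):
    """Return a function mapping source-sensor points to the target sensor."""
    s0, s_axes = frame(*src_points)
    d0, d_axes = frame(*dst_points)

    def transform(q):
        # Express q in the source frame, then rebuild it in the target frame.
        local = [dot(sub(q, s0), axis) for axis in s_axes]
        return tuple(d0[i] + sum(local[k] * d_axes[k][i] for k in range(3))
                     for i in range(3))

    return transform
```

With noisy measurements, a least-squares fit over more than three correspondences, optionally wrapped in RANSAC as the abstract mentions, would usually be more robust than this exact three-point construction.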

UX Design of Mobile Banking Usage Improvement for Seniors (시니어들을 위한 모바일 뱅킹 이용률 개선을 위한 UX 디자인)

  • Jongbin Lee;Homin Boun
    • KIPS Transactions on Software and Data Engineering / v.12 no.7 / pp.325-332 / 2023
  • The world's population has already entered a super-aging era, and the share of older people is expected to rise rapidly to about 40% by 2050. However, the rapid development of automation technology and online services, the main technologies of the Fourth Industrial Revolution, is further isolating seniors, who face many inconveniences in a world where these technologies are applied. Such alienation in daily life extends across many fields, but financial services are essential regardless of age because of their strongly public character, and they matter all the more at a time when bank branches are rapidly decreasing. Nevertheless, the current utilization rate of mobile banking services among users over 60 is only around 5%, so they are rarely able to use them. This study proposes a UX design for the remittance screen, the most frequently used screen in mobile banking services. The difficulty of finding the preferred bank among 56 or more banks is addressed by analyzing the usage rate of each bank and dividing the banks into three stages by age group for users aged 50 or older. In addition, the design strengthens customization by showing the user's recently used banks first. In usability satisfaction interviews with a group of fewer than 50 seniors, the proposed design scored an average of 4.8 or more out of 5 points. This study is expected to help banks upgrade their differing mobile banking designs in a unified manner.

Impacts of Nitrate in Base Flow Discharge on Surface Water Quality (질산성 질소 기저유출이 지표수 수질에 미치는 영향)

  • Kim, Geonha;Lee, Hosik
    • KSCE Journal of Civil and Environmental Engineering Research / v.29 no.1B / pp.105-109 / 2009
  • It is well known that base flow discharge of rainfall runoff significantly affects the quality of surface water. In this paper, the impact of nitrate discharged as base flow on stream water quality was studied using PULSE, a software package from the USGS, to calculate monthly groundwater discharge from the hydrograph. We used water quality and flow rate data for the Ghapcheon2 site in Daejeon city for 2005, as well as groundwater quality data for the watershed acquired from government agencies. Agricultural and forestry land uses are dominant upstream of Ghapcheon2 in the watershed. Base flow contributes about 85~95% of stream flow during spring and fall, while 25~38% of stream flow is induced by base flow during summer and winter. The monthly nitrate loading discharged as base flow at Ghapcheon2 was estimated using the averaged nitrate concentration of groundwater in the watershed. The nitrate loading induced by base flow at Ghapcheon2 was estimated as 5.4 ton $\mathrm{NO_3^-}$-N/km$^2$, which is about 60% of the nitrate loading of surface water, 9.2 ton $\mathrm{NO_3^-}$-N/km$^2$. The seasonal variation of the nitrate concentration of base flow was estimated by dividing the monthly nitrate loading by the monthly base flow discharge. The nitrate concentration of groundwater increased from the rainy season onward. This study shows that groundwater quality monitoring is important for the proper management of surface water quality.
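
The two calculations described above can be sketched as follows: the base flow share of nitrate loading (5.4 of 9.2 ton, roughly 59%, which the abstract rounds to about 60%) and a concentration obtained by dividing a monthly load by a monthly discharge volume. The function names and the kg and m³ units are assumptions chosen for illustration, and no monthly values from the paper are reproduced here.

```python
def baseflow_share(baseflow_load, surface_load):
    """Fraction of the surface-water nitrate loading attributable to base flow."""
    return baseflow_load / surface_load


def concentration_mg_per_l(load_kg, discharge_m3):
    """Mean concentration from a load (kg) and a discharge volume (m^3).

    1 kg/m^3 equals 1000 mg/L, hence the factor of 1000.
    """
    if discharge_m3 <= 0:
        raise ValueError("discharge volume must be positive")
    return load_kg / discharge_m3 * 1000.0


# Loadings quoted in the abstract, in ton NO3-N per km^2.
share = baseflow_share(5.4, 9.2)
```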

Empirical and Numerical Analyses of a Small Planing Ship Resistance using Longitudinal Center of Gravity Variations (경험식과 수치해석을 이용한 종방향 무게중심 변화에 따른 소형선박의 저항성능 변화에 관한 연구)

  • Michael;Jun-Taek Lim;Nam-Kyun Im;Kwang-Cheol Seo
    • Journal of the Korean Society of Marine Environment & Safety / v.29 no.7 / pp.971-979 / 2023
  • Small ships (<499 GT) constitute 46% of existing ships; it can therefore be concluded that they produce relatively high CO2 emissions. Operating in optimal trim conditions can reduce the resistance of a ship, which results in lower greenhouse gas emissions. An affordable way to optimize trim is to adjust the weight distribution to obtain an optimum longitudinal center of gravity (LCG). In this study, therefore, the effect of LCG changes on the resistance of a small planing ship is studied using empirical and numerical analyses. The Savitsky method in Maxsurf Resistance and the commercial computational fluid dynamics (CFD) software STAR-CCM+ are used for the empirical and numerical analyses, respectively. Finally, the total resistance values from the ship design process are compared to obtain the optimum LCG. In the numerical analysis, the optimum LCG is found at 46.2% of length overall (LoA) at Froude number 0.56 and at 43.4% LoA at Froude number 0.63, which provides a significant resistance reduction of 41.12-45.16% compared to the reference point at 29.2% LoA.
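
For readers unfamiliar with the quantities above, here is a small sketch of the length-based Froude number and of a percentage resistance reduction relative to a reference condition. The abstract does not say which characteristic length the authors use or give the hull dimensions, so the length argument and all example inputs are hypothetical.

```python
import math

G = 9.80665  # standard gravitational acceleration, m/s^2


def froude_number(speed_m_s, length_m):
    """Length-based Froude number Fr = V / sqrt(g * L)."""
    return speed_m_s / math.sqrt(G * length_m)


def speed_for_froude(fr, length_m):
    """Speed (m/s) that gives the requested Froude number for length L."""
    return fr * math.sqrt(G * length_m)


def resistance_reduction_percent(reference, optimised):
    """Reduction of total resistance relative to the reference case, in %."""
    return (reference - optimised) / reference * 100.0
```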

An Experiment for Surface Soil Moisture Mapping Using Sentinel-1 and Sentinel-2 Image on Google Earth Engine (Google Earth Engine 제공 Sentinel-1과 Sentinel-2 영상을 이용한 지표 토양수분도 제작 실험)

  • Jihyun Lee;Kwangseob Kim;Kiwon Lee
    • Korean Journal of Remote Sensing / v.39 no.5_1 / pp.599-608 / 2023
  • The increasing interest in satellite-derived soil moisture data for applications in hydrology, meteorology, and agriculture has led to the development of methods for generating soil moisture maps of variable resolution. This study demonstrates the capability of generating soil moisture maps using Sentinel-1 and Sentinel-2 data provided by Google Earth Engine (GEE). The soil moisture map was derived from synthetic aperture radar (SAR) and optical images. SAR data from the Sentinel-1 analysis-ready data in GEE were combined with the normalized difference vegetation index (NDVI) based on Sentinel-2 and an Environmental Systems Research Institute (ESRI)-based land cover map. This study produced a soil moisture map for a research area in Victoria, Australia, and compared it with field measurements obtained in a previous study. To validate the accuracy of the applied method, the comparison showed meaningful agreement, with differences of 4-10%p between the values obtained using the algorithm applied in this study and the field-based values, and very high agreement with satellite-based soil moisture data, with differences of 0.5-2%p. Therefore, the public open data provided by GEE and the algorithm applied in this study can be used for high-resolution soil moisture mapping to represent regional land surface characteristics.
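
The NDVI mentioned above is computed per pixel from near-infrared and red reflectance, which for Sentinel-2 are usually bands B8 and B4. The sketch below uses plain Python lists rather than the GEE API, and the reflectance values in any call are the caller's own; none come from the study.

```python
def ndvi(nir, red):
    """Normalized difference vegetation index: (NIR - Red) / (NIR + Red)."""
    total = nir + red
    if total == 0:
        return 0.0  # assumption: treat pixels with no signal as NDVI 0
    return (nir - red) / total


def ndvi_image(nir_band, red_band):
    """Apply NDVI pixel by pixel to two equally sized 2-D lists."""
    return [[ndvi(n, r) for n, r in zip(nir_row, red_row)]
            for nir_row, red_row in zip(nir_band, red_band)]
```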

Soil Moisture Estimation Using KOMPSAT-3 and KOMPSAT-5 SAR Images and Its Validation: A Case Study of Western Area in Jeju Island (KOMPSAT-3와 KOMPSAT-5 SAR 영상을 이용한 토양수분 산정과 결과 검증: 제주 서부지역 사례 연구)

  • Jihyun Lee;Hayoung Lee;Kwangseob Kim;Kiwon Lee
    • Korean Journal of Remote Sensing / v.39 no.6_1 / pp.1185-1193 / 2023
  • The increasing interest in soil moisture data from satellite imagery for applications in hydrology, meteorology, and agriculture has led to the development of methods to produce variable-resolution soil moisture maps. Research on accurate soil moisture estimation using satellite imagery is essential for remote sensing applications. The purpose of this study is to generate a soil moisture estimation map for a test area using KOMPSAT-3/3A and KOMPSAT-5 SAR imagery and to quantitatively compare the results with soil moisture data from the Soil Moisture Active Passive (SMAP) mission provided by NASA, with a focus on accuracy validation. In addition, the Korean Environmental Geographic Information Service (EGIS) land cover map was used to determine soil moisture, especially in agricultural and forested regions. The selected test area is the western part of Jeju, South Korea, where input data were available for the soil moisture estimation algorithm based on the Water Cloud Model (WCM). Synthetic aperture radar (SAR) imagery from KOMPSAT-5 HV and Sentinel-1 VV was used for soil moisture estimation, while vegetation indices were calculated from the surface reflectance of KOMPSAT-3 imagery. Comparison of the derived soil moisture results with SMAP (L-3) and SMAP (L-4) data by differencing showed mean differences of 4.13±3.60%p and 14.24±2.10%p, respectively, indicating a level of agreement. This research suggests the potential for producing highly accurate and precise soil moisture maps using future South Korean satellite imagery and publicly available data sources.
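
As background on the Water Cloud Model, the sketch below implements the commonly cited form, in which total backscatter is a vegetation term plus a soil term attenuated by the canopy, and then inverts it for soil moisture. It assumes a single vegetation descriptor (for example NDVI) in both vegetation terms, a linear soil model, and linear power units throughout. The coefficients a, b, c and d are hypothetical calibration parameters, and the abstract does not state which variant the authors use.

```python
import math


def _vegetation_terms(veg, theta_deg, a, b):
    """Vegetation backscatter and two-way canopy attenuation tau^2."""
    cos_t = math.cos(math.radians(theta_deg))
    tau2 = math.exp(-2.0 * b * veg / cos_t)
    sigma_veg = a * veg * cos_t * (1.0 - tau2)
    return sigma_veg, tau2


def wcm_forward(mv, veg, theta_deg, a, b, c, d):
    """Total backscatter for soil moisture mv (soil term assumed c + d * mv)."""
    sigma_veg, tau2 = _vegetation_terms(veg, theta_deg, a, b)
    return sigma_veg + tau2 * (c + d * mv)


def wcm_invert(sigma0, veg, theta_deg, a, b, c, d):
    """Soil moisture recovered from observed backscatter sigma0."""
    sigma_veg, tau2 = _vegetation_terms(veg, theta_deg, a, b)
    return ((sigma0 - sigma_veg) / tau2 - c) / d
```

Many implementations express the soil term in dB instead, in which case the conversion between dB and linear units has to be handled before the terms are combined.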

Gear Fault Diagnosis Based on Residual Patterns of Current and Vibration Data by Collaborative Robot's Motions Using LSTM (LSTM을 이용한 협동 로봇 동작별 전류 및 진동 데이터 잔차 패턴 기반 기어 결함진단)

  • Baek Ji Hoon;Yoo Dong Yeon;Lee Jung Won
    • KIPS Transactions on Software and Data Engineering / v.12 no.10 / pp.445-454 / 2023
  • Recently, various fault diagnosis studies have been conducted using data from collaborative robots. Existing studies on fault diagnosis for collaborative robots use static data collected under the assumed operation of predefined devices, so the resulting fault diagnosis models depend heavily on the learned data patterns. In addition, because the experiments use a single motor, the diagnosis cannot reflect the characteristics of collaborative robots that operate with multiple joints. This paper proposes an LSTM diagnostic model that overcomes these two limitations. The proposed method selects representative normal patterns using correlation analysis of vibration and current data in single-axis and multi-axis work environments, and generates residual patterns as differences from the representative normal patterns. An LSTM model that performs gear wear diagnosis for each axis is built using the generated residual patterns as inputs. This fault diagnosis model not only reduces the dependence on the training data patterns through representative patterns for each operation, but also diagnoses faults occurring during multi-axis operation. Finally, by reflecting both internal and external data characteristics, the model improved fault diagnosis performance, achieving a high diagnostic performance of 98.57%.
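
The residual-pattern step can be sketched as below. The selection rule, picking the normal cycle with the highest mean Pearson correlation to the other normal cycles, is an assumption made for illustration; the abstract says only that correlation analysis is used. The function names are hypothetical, and the LSTM itself is omitted.

```python
import math


def pearson(a, b):
    """Pearson correlation of two equally long sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # assumption: a flat signal is treated as uncorrelated
    return cov / math.sqrt(var_a * var_b)


def representative_pattern(normal_cycles):
    """Normal cycle most correlated, on average, with the other normal cycles."""
    def mean_corr(i):
        others = [c for j, c in enumerate(normal_cycles) if j != i]
        return sum(pearson(normal_cycles[i], c) for c in others) / len(others)

    best = max(range(len(normal_cycles)), key=mean_corr)
    return normal_cycles[best]


def residual_pattern(observed, representative):
    """Pointwise difference between an observed cycle and the representative."""
    return [o - r for o, r in zip(observed, representative)]
```

The residual sequences, one per motion and axis, would then serve as the model inputs in place of the raw current and vibration signals.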

TAGS: Text Augmentation with Generation and Selection (생성-선정을 통한 텍스트 증강 프레임워크)

  • Kim Kyung Min;Dong Hwan Kim;Seongung Jo;Heung-Seon Oh;Myeong-Ha Hwang
    • KIPS Transactions on Software and Data Engineering / v.12 no.10 / pp.455-460 / 2023
  • Text augmentation is a methodology that creates new augmented texts by transforming or generating original texts in order to improve the performance of NLP models. However, existing text augmentation techniques have limitations such as a lack of expressive diversity, semantic distortion, and a limited number of augmented texts. Recent text augmentation using large language models and few-shot learning can overcome these limitations, but it carries a risk of noise due to incorrect generation. In this paper, we propose a text augmentation method called TAGS that generates multiple candidate texts and selects the appropriate ones as augmented texts. TAGS generates various expressions using few-shot learning and effectively selects suitable data, even from a small amount of original text, by using contrastive learning and similarity comparison. We applied this method to task-oriented chatbot data and achieved a more than sixtyfold increase in the quantity of data. We also analyzed the generated texts and confirmed that they are semantically and expressively diverse compared to the original texts. Moreover, we trained and evaluated a classification model using the augmented texts and showed a performance improvement of more than 0.1915, confirming that the method helps improve actual model performance.
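
The generate-then-select pattern can be sketched as follows. TAGS performs selection with contrastively learned representations; this sketch substitutes a simple bag-of-words cosine similarity to keep the example self-contained, and it takes the candidate texts as given instead of calling a language model. The thresholds and function names are hypothetical.

```python
import math
from collections import Counter


def cosine_similarity(text_a, text_b):
    """Cosine similarity of word-count vectors (a stand-in for learned embeddings)."""
    counts_a = Counter(text_a.lower().split())
    counts_b = Counter(text_b.lower().split())
    numerator = sum(counts_a[w] * counts_b[w] for w in counts_a)
    norm_a = math.sqrt(sum(v * v for v in counts_a.values()))
    norm_b = math.sqrt(sum(v * v for v in counts_b.values()))
    denominator = norm_a * norm_b
    return numerator / denominator if denominator else 0.0


def select_candidates(original, candidates, low=0.3, high=0.95):
    """Keep candidates that are related to the original but not near-copies."""
    return [c for c in candidates
            if low <= cosine_similarity(original, c) <= high]
```

The lower bound filters out off-topic generations (the noise the abstract mentions), while the upper bound drops candidates that add no expressive diversity.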

A Study on Korean Speech Animation Generation Employing Deep Learning (딥러닝을 활용한 한국어 스피치 애니메이션 생성에 관한 고찰)

  • Suk Chan Kang;Dong Ju Kim
    • KIPS Transactions on Software and Data Engineering / v.12 no.10 / pp.461-470 / 2023
  • While speech animation generation employing deep learning has been actively researched for English, there has been no prior work for Korean. This paper therefore employs supervised deep learning to generate Korean speech animation for the first time. In doing so, we identify a significant effect of deep learning: it reduces speech animation research to speech recognition research, which is the predominant technique. We also study how to make the best use of this effect for Korean speech animation generation. By clarifying the top-priority research target, the effect can help efficiently and effectively revitalize Korean speech animation research, which has recently been inactive. This paper proceeds as follows: (i) it chooses the blendshape animation technique, (ii) implements the deep learning model as a master-servant pipeline of an automatic speech recognition (ASR) module and a facial action coding (FAC) module, (iii) builds a Korean speech facial motion capture dataset, (iv) prepares two deep learning models for comparison (one adopts an English ASR module and the other a Korean ASR module, while both share the same basic structure for their FAC modules), and (v) trains the FAC modules of both models dependently on their ASR modules. The user study demonstrates that the model that adopts the Korean ASR module and dependently trains its FAC module (scoring 4.2/5.0 points) generates decidedly more natural Korean speech animations than the model that adopts the English ASR module and dependently trains its FAC module (scoring 2.7/5.0 points). The result confirms the aforementioned effect, showing that the quality of Korean speech animation comes down to the accuracy of Korean ASR.