Extending StarGAN-VC to Unseen Speakers Using RawNet3 Speaker Representation (RawNet3 화자 표현을 활용한 임의의 화자 간 음성 변환을 위한 StarGAN의 확장)
- KIPS Transactions on Software and Data Engineering / v.12 no.7 / pp.303-314 / 2023
Voice conversion, a technology that allows an individual's speech data to be regenerated with the acoustic properties (tone, cadence, gender) of another, has many applications in education, communication, and entertainment. This paper proposes an approach based on the StarGAN-VC model that generates realistic-sounding speech without requiring parallel utterances. To overcome the constraints of the existing StarGAN-VC model, which encodes source and target speaker information as one-hot vectors, this paper extracts feature vectors of target speakers using a pre-trained RawNet3. This results in a latent space where voice conversion can be performed without direct speaker-to-speaker mappings, enabling an any-to-any structure. In addition to the loss terms used in the original StarGAN-VC model, the Wasserstein distance is used as a loss term to ensure that generated voice segments match the acoustic properties of the target voice. The Two Time-Scale Update Rule (TTUR) is also used to facilitate stable training. Experimental results show that the proposed method outperforms previous methods, including the StarGAN-VC network on which it is based.
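As a rough illustration of the two training ingredients named in the abstract above, a Wasserstein-style loss and TTUR, a minimal sketch might look as follows. The function names, the learning-rate values, and the plain gradient step are assumptions made for illustration; the paper's actual loss weighting, optimizer, and hyperparameters are not stated in this abstract.

```python
from statistics import fmean

# Hypothetical step sizes: TTUR only requires that the critic (discriminator)
# and the generator use separate learning rates; these values are illustrative.
LEARNING_RATES = {"generator": 1e-4, "critic": 4e-4}


def critic_loss(real_scores, fake_scores):
    """Wasserstein critic loss: minimizing it widens E[D(real)] - E[D(fake)]."""
    return fmean(fake_scores) - fmean(real_scores)


def generator_loss(fake_scores):
    """Wasserstein generator loss: push critic scores of generated samples up."""
    return -fmean(fake_scores)


def gradient_step(param, grad, role):
    """One plain gradient-descent step using the learning rate for `role`."""
    return param - LEARNING_RATES[role] * grad
```

Here `real_scores` and `fake_scores` stand for the critic's outputs on target-speaker speech and on converted speech. In a full model these would come from neural networks, and the critic would also need a Lipschitz constraint (for example weight clipping or a gradient penalty), which this sketch omits.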
The construction of smart communities is a new and important measure for ensuring the security of residential areas. To address the low accuracy of face recognition caused by facial features being distorted by surveillance camera angles and other external factors, this paper proposes the following optimization strategies for designing a face recognition network. First, a global graph convolution module is designed to encode facial features as graph nodes, and a multi-scale feature enhancement residual module is designed to extract facial keypoint features in conjunction with the global graph convolution module. Second, once the facial keypoints are obtained, they are constructed as a directed graph, and graph attention mechanisms are used to enhance the representation power of the graph features. Finally, tensor computations are performed on the graph features of two faces, and the aggregated features are extracted and discriminated by a fully connected layer to determine whether the two individuals' identities are the same. In experiments, the network designed in this paper achieves an AUC of 85.65% for facial keypoint localization on the 300W public dataset and 88.92% on a self-built dataset. For face recognition, the proposed network achieves an accuracy of 83.41% on the IBUG public dataset and 96.74% on a self-built dataset. The experimental results demonstrate that the network achieves high detection and recognition accuracy for faces in surveillance videos.
As media have developed, the scope of visual expression has expanded from still images to moving pictures. In moving pictures such as animation, film, TV commercials, and GUIs, movement becomes an essential formative element, unlike in still images, because of the apparent-movement phenomenon and because unit structures such as the shot and the scene appear. Among formative elements such as shape, color, space, size, and movement, movement is therefore the one that most distinguishes the moving image. In Saussure's account of the relationship between the signified and the signifier, the expression and the form of an image limit its content, yet they complement each other and are accepted together as a sign. This makes it possible to infer that the formal features of movement participate in the message content. To verify this, a visual perception experiment on moving pictures based on the gestalt grouping principles was conducted; 70-80 percent of subjects regarded 'movement' as the important grouping cue in perception. When the meaning structure of a moving picture is analyzed in terms of its structural features, movement helps maintain the context of the message content in the communication process. First, identity can be maintained if there is movement toward a similar directional point, even if the color and shape of people, objects, and background change. Second, the clarity of the content is raised because movement distinguishes an object as a figure. Third, movement acts as a knowledge representation from which the similar movement process of the next piece of information can be predicted. Fourth, movement gives the content consistency even when two or more scenes are switched quickly or edited in a complicated structure such as cross-cutting. Movement thus becomes a cue for grouping the information received through visual perception. It also gives order to visual expression that might otherwise be used improperly, by forming a structural frame for the image message, and it effectively raises the clarity of signification.
A moving picture carries a discourse in which several unit structures are mixed, because it fundamentally contains time, and media-mix circumstances call for expression that is both common across media and distinctive. When the gestalt grouping principles are applied to the moving picture, movement therefore becomes more distinctive than the other formative elements and affects the formation of the meaning structure. This study proposes a viewpoint for developing structural formative beauty and new image expression in the field of media imagery.
The purpose of this study is to identify the design elements of traditional palaces in Korea, China, and Japan. The study proceeds in three steps. First, an analysis framework is established from the literature. Second, the design elements (form, material, pattern, and color) are collected and investigated through observation of actual traditional palaces: Changdeokgung, the Forbidden City, and Nijo Castle. Third, the results of the investigation in the second step are analyzed. The similarities and differences among the design elements of the three countries' palaces can be summarized as follows. The main common characteristics of the artistic design are 'naturalism', 'harmonious ideas', and 'Confucianism', but the style in which the design elements are represented differs by country. The typical features of China are symmetry, glassy surfaces produced by artificial processes, meandering curves, magnificent patterns, and contrasting colors. In Japan, mathematical asymmetry, rough surfaces made by artificial skill, abbreviated decorative patterns, and achromatic colors are the important features. The major features of Korean design elements are asymmetrical balance with nature, rough surfaces produced by natural processes, moderate patterns, and harmonious colors.
Since humanity's first poster appeared in Egypt 3,000 years ago, the invention of printing and the development of civilization have accelerated poster production technology. Alongside this, poster expression has developed from a simple arrangement of characters into an attempt to express artistic sensibility, and it is now an art form that belongs to professional designers. However, the technology of poster expression remains two-dimensional and dependent on printing, untouched by the changes in the ICT environment brought by modern multimedia. In particular, movie posters, the only kind of poster whose subject is video, are still printed on paper; many attempts have been made so far, but the movie industry still does not consider ICT integration at all. This study starts from the fact that the subject of a movie poster is video, and attempts to introduce augmented reality so that the dynamic images of the movie can be applied to the static poster. For the graduation exhibition of the media design major at a university in Korea, a poster promoting each student's video work was designed and printed in the form of a commercial film poster. Among them, six works judged suitable for augmented reality were selected and exhibited with augmented reality applied. The content that appears matched to the poster on a mobile device plays back a scene of the video on the poster while the text information of the original poster is kept as it is, producing a moving poster that resembles the wanted posters in the movie "Harry Potter". To produce these augmented reality posters, we applied augmented reality to existing commercial film posters produced in two different formats and found a way to enhance the characteristics of AR content.
Through this, we were able to understand what poster design suits AR representation, and the technical requirements for stable operation of augmented reality can be summarized in terms of the matching process in AR content production.
This paper proposes a method for controlling the facial expression of a 3D avatar by having the user select a sequence of facial expressions in a space of facial expressions, and describes a system that implements it. The expression space is created from about 2,400 frames of motion-captured facial expression data. To represent the state of each expression, we use a distance matrix that holds the distances between pairs of feature points on the face. The set of distance matrices serves as the expression space. However, this is not a space in which one state can move to another along the straight trajectory between them. We therefore derive trajectories between two states approximately from the captured set of expressions. First, two states are regarded as adjacent if the distance between their distance matrices is below a given threshold. Any two states are considered to have a trajectory between them if there is a sequence of adjacent states connecting them. It is assumed that one state moves to another via the shortest trajectory between them. The shortest trajectories are found by dynamic programming. The space of facial expressions, as a set of distance matrices, is multidimensional. The facial expression of the 3D avatar is controlled in real time as the user navigates this space. To support this process, we visualize the expression space in 2D using multidimensional scaling (MDS). To evaluate how effective the system is, we had users control the facial expressions of a 3D avatar with it. The users judged the system to be very useful for controlling the facial expression of a 3D avatar in real time.
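The trajectory construction described in this abstract (distance matrices as states, adjacency by threshold, shortest trajectories by dynamic programming) can be sketched roughly as below. This is only an illustration under assumptions: the abstract does not say which matrix distance or which dynamic-programming algorithm is used, so the sketch takes the Frobenius norm and Floyd-Warshall, and all function names are hypothetical.

```python
import math
from itertools import combinations


def distance_matrix(points):
    """Pairwise Euclidean distances between the feature points of one frame."""
    return [[math.dist(p, q) for q in points] for p in points]


def state_distance(a, b):
    """Frobenius distance between two distance matrices (an assumed metric)."""
    return math.sqrt(sum((x - y) ** 2
                         for row_a, row_b in zip(a, b)
                         for x, y in zip(row_a, row_b)))


def shortest_trajectories(states, threshold):
    """All-pairs shortest trajectories over states adjacent below `threshold`."""
    n = len(states)
    cost = [[math.inf] * n for _ in range(n)]
    nxt = [[None] * n for _ in range(n)]
    for i in range(n):
        cost[i][i] = 0.0
        nxt[i][i] = i
    # Two states are adjacent if their distance matrices are close enough.
    for i, j in combinations(range(n), 2):
        d = state_distance(states[i], states[j])
        if d < threshold:
            cost[i][j] = cost[j][i] = d
            nxt[i][j], nxt[j][i] = j, i
    # Floyd-Warshall: allow intermediate states 0..k, one k at a time.
    for k in range(n):
        for i in range(n):
            for j in range(n):
                via_k = cost[i][k] + cost[k][j]
                if via_k < cost[i][j]:
                    cost[i][j] = via_k
                    nxt[i][j] = nxt[i][k]
    return cost, nxt


def trajectory(nxt, start, goal):
    """Sequence of state indices from start to goal, or [] if unreachable."""
    if nxt[start][goal] is None:
        return []
    path = [start]
    while path[-1] != goal:
        path.append(nxt[path[-1]][goal])
    return path
```

With `states = [distance_matrix(frame) for frame in frames]`, calling `shortest_trajectories(states, threshold)` once and then `trajectory(nxt, i, j)` returns the chain of captured expressions to play back between two selected states. Floyd-Warshall is cubic in the number of states, so for roughly 2,400 frames a real system would likely precompute the table offline.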
This study of the Musan twelve peaks of Yongho garden in Jinju, Gyeongnam was expected to provide data and implications for reproducing similar spaces and for modern adaptation as a design factor, since the peaks are a prototype of the traditional artificial mound built to relieve monotonous terrain and to introduce variety and interest. The study analyzed and interpreted the symbolism of the twelve peaks, the principles of spatial composition, and the function and effect of the visual construction pursued by the builder in terms of landscape view, with the following results. Yonghoji(龍虎池), the center of Yongho garden, is a typical man-made pond that serves as a supplementary feng shui feature. It is a supporting device for completing the feng shui of the site, and that completion is reinforced through the connection with the dragon-related place name. The Musan twelve peaks resemble the oval form of Geumseongsan(金星山), 2~3.5m in height and 6~12m in diameter. Peaks are estimated as 1.5~3.7m (2.4m on average) in height,
The aim of this essay is to shed light on Sunjung Manhwa of the 1970s, which has been neglected in comics studies. The essay analyzes the articles and serial comics in Schoolgirl, a magazine of the 1970s, and examines the ideal representations of girls at that time. Sunjung Manhwa of the 1970s differs markedly from that of the 1960s, and this gap cannot be explained by analyzing Sunjung Manhwa in book form alone. Although censorship hampered the development of comics as a whole, the slump of Sunjung Manhwa in the 1970s was far more severe than that of other comics genres. The reasons for the changes in Sunjung Manhwa can be found by studying the magazines, which, together with Sunjung Manhwa, were the main mass media aimed at girls. The articles in the magazines show their editorial direction and character, and they also reflect the values and ideologies of the time. The same is true of the comics in the magazines, which in particular were relatively free from censorship. This essay examines how the articles and comics in a girls' magazine of the 1970s represented the image of girls at the time, focusing on the feature articles and comics in Schoolgirl. Among the full-length serial comics in Schoolgirl, it explores Um, Hee-Ja's Blue Zone and Bang, Young-Jin's Mini March. Both Blue Zone and Mini March reveal the image of the ideal girl emphasized by the articles in Schoolgirl. Blue Zone depicts an earnest and obedient daughter, and Mini March portrays a cheerful and bright girl.
This study shows that the magazines of the 1970s held up as ideal those girls who were obedient to the given society and devoted to a harmonious family. It can also be surmised that the ideal image of girls constantly promoted by the magazines served as a standard for the censorship of comics and their creativity, and had a huge impact on the content and expression of a great many works. The 1970s is a period whose importance has been lost in the history of comics studies because of the censorship of comics and the monopoly of the "Hapdong(합동)" publisher. The limits that censorship placed on expression were very distinct; as a result there were few works of good quality, and many gaps remain in the study of 1970s comics. This study is meaningful in that it fills one of those gaps.
The wall shear stress in the vicinity of end-to-end anastomoses under steady flow conditions was measured using a flush-mounted hot-film anemometer (FMHFA) probe. The experimental measurements were in good agreement with numerical results except in flow at low Reynolds numbers. The wall shear stress increased proximal to the anastomosis in flow from the Penrose tubing (simulating an artery) to the PTFE graft. In flow from the PTFE graft to the Penrose tubing, low wall shear stress was observed distal to the anastomosis. Abnormal distributions of wall shear stress in the vicinity of the anastomosis, resulting from the compliance mismatch between the graft and the host artery, might be an important factor in the formation of anastomotic neointimal fibrous hyperplasia (ANFH) and in graft failure. The present study suggests a correlation between regions of low wall shear stress and the development of ANFH in end-to-end anastomoses.
Air pressure decay (APD) rate and ultrafiltration rate (UFR) tests were performed on new and saline-rinsed dialyzers as well as those reused in patients several times. C-DAK 4000 (Cordis Dow) and CF IS-11 (Baxter Travenol) reused dialyzers obtained from the dialysis clinic were used in the present study. The new dialyzers exhibited a relatively flat APD, whereas saline-rinsed and reused dialyzers showed a considerable amount of decay. C-DAK dialyzers had a larger APD(11.70