Search | Korea Science

A DB Pruning Method in a Large Corpus-Based TTS with Multiple Candidate Speech Segments (대용량 복수후보 TTS 방식에서 합성용 DB의 감량 방법)

Lee, Jung-Chul;Kang, Tae-Ho
- The Journal of the Acoustical Society of Korea / v.28 no.6 / pp.572-577 / 2009
Large corpus-based concatenative Text-to-Speech (TTS) systems can generate natural synthetic speech without additional signal processing. To prune redundant speech segments from a large speech segment DB, we can use the decision-tree-based triphone clustering algorithm widely used in speech recognition. However, conventional methods have problems in representing the acoustic transitional characteristics of phones and in applying context questions with hierarchical priority. In this paper, we propose a new clustering algorithm to downsize the speech DB. First, three 13th-order MFCC vectors, taken from the first, medial, and final frames of a phone, are combined into a 39-dimensional vector that represents the transitional characteristics of the phone. Then three hierarchically grouped question sets are used to construct the triphone trees. For the performance test, we used the dynamic time warping (DTW) algorithm to calculate the acoustic similarity between the target triphone and the triphone returned by the tree search. Experimental results show that the proposed method reduces the size of the speech DB by 23% and selects better phones with higher acoustic similarity. The proposed method can therefore be applied to build a small-sized TTS system.
https://doi.org/10.7776/ASK.2009.28.6.572
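The feature construction and similarity test described in the abstract above can be sketched as follows. This is a minimal illustration and not the authors' implementation: the names `transition_vector` and `dtw_distance`, the choice of `len(frames) // 2` as the medial frame, and the Euclidean local distance are all assumptions.

```python
import math
from typing import List, Sequence

Frame = Sequence[float]


def transition_vector(frames: Sequence[Frame]) -> List[float]:
    """Concatenate the first, medial, and final frames of one phone.

    With 13 MFCC coefficients per frame this yields a 39-dimensional vector.
    Using index len(frames) // 2 as the medial frame is an assumption.
    """
    if not frames:
        raise ValueError("a phone needs at least one frame")
    medial = frames[len(frames) // 2]
    return [*frames[0], *medial, *frames[-1]]


def euclidean(a: Frame, b: Frame) -> float:
    """Local frame-to-frame distance (assumed; the abstract does not name one)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_distance(a: Sequence[Frame], b: Sequence[Frame]) -> float:
    """Classic dynamic time warping cost between two frame sequences."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = best accumulated cost aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = euclidean(a[i - 1], b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b repeats
                cost[i][j - 1],      # b advances, a repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]
```

A lower DTW cost between the target triphone and the triphone picked from the pruned tree would indicate higher acoustic similarity, which is how the abstract describes the evaluation.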

A Multimodal Profile Ensemble Approach to Development of Recommender Systems Using Big Data (빅데이터 기반 추천시스템 구현을 위한 다중 프로파일 앙상블 기법)

Kim, Minjeong;Cho, Yoonho
- Journal of Intelligence and Information Systems / v.21 no.4 / pp.93-110 / 2015
A recommender system recommends products to the customers who are likely to be interested in them. Various recommender systems have been developed on the basis of automated information filtering technology. Collaborative filtering (CF), one of the most successful recommendation algorithms, has been applied in a number of domains, such as recommending Web pages, books, movies, music, and products. However, CF is known to have a critical shortcoming. CF finds neighbors whose preferences are similar to those of the target customer and recommends the products those neighbors have liked most. CF therefore works properly only when there is a sufficient number of ratings on common products from customers. When customer ratings are scarce, the neighborhood is formed inaccurately, which results in poor recommendations. To improve the performance of CF-based recommender systems, most related studies have focused on developing novel algorithms under the assumption that a single profile is used, created from users' item ratings, purchase transactions, or Web access logs. With the advent of big data, companies have come to collect more data and to use a wide variety of large-scale information. Many companies therefore recognize the importance of utilizing big data, because it lets them improve their competitiveness and create new value. In particular, the use of personal big data in recommender systems is a rising issue, because personal big data facilitate more accurate identification of the preferences and behaviors of users. The proposed recommendation methodology is as follows. First, multimodal user profiles are created from personal big data in order to grasp the preferences and behavior of users from various viewpoints. We derive five user profiles from personal information: rating, site preference, demographic, Internet usage, and topic in text.
Next, the similarity between users is calculated from the profiles, and the neighbors of each user are found from the results. One of three ensemble approaches is applied to calculate the similarity: the similarity of the combined profile, the average similarity of each profile, or the weighted average similarity of each profile. Finally, the products that the neighbors prefer most are recommended to the target users. For the experiments, we used demographic data and a very large volume of Web log transactions for 5,000 panel users of a company that specializes in analyzing the rankings of Web sites. R was used to implement the proposed recommender system, and SAS E-Miner was used to conduct the topic analysis using keyword search. To evaluate recommendation performance, we used 60% of the data for training and 40% for testing. Five-fold cross-validation was also conducted to enhance the reliability of our experiments. For evaluation we employed the F1 metric, a widely used combination metric that gives equal weight to recall and precision. The evaluation shows that the proposed methodology achieves a significant improvement over the single-profile CF algorithm. In particular, the ensemble approach using weighted average similarity shows the highest performance: the improvement in F1 is 16.9 percent for the ensemble approach using weighted average similarity and 8.1 percent for the ensemble approach using the average similarity of each profile. From these results, we conclude that the multimodal profile ensemble approach is a viable solution to the problems encountered when customer ratings are scarce. This study is significant in suggesting what kinds of information can be used to create profiles in a big data environment and how they can be combined and utilized effectively.
However, our methodology needs further study before it can be applied in the real world. We need to compare differences in recommendation accuracy by applying the proposed method to different recommendation algorithms and then identify which combination shows the best performance.
https://doi.org/10.13088/jiis.2015.21.4.093
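The three ensemble similarities and the F1 metric named in the abstract above might be computed along these lines. This is a sketch under stated assumptions: cosine similarity is assumed as the per-profile measure (the abstract does not say which one was used), and the function names and the example weights are hypothetical.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def combined_similarity(a: Sequence[Vector], b: Sequence[Vector]) -> float:
    """Approach 1: concatenate all profiles of a user, then compare once."""
    flat_a = [x for profile in a for x in profile]
    flat_b = [x for profile in b for x in profile]
    return cosine(flat_a, flat_b)


def average_similarity(a: Sequence[Vector], b: Sequence[Vector]) -> float:
    """Approach 2: unweighted mean of the per-profile similarities."""
    sims = [cosine(pa, pb) for pa, pb in zip(a, b)]
    return sum(sims) / len(sims)


def weighted_average_similarity(
    a: Sequence[Vector], b: Sequence[Vector], weights: Sequence[float]
) -> float:
    """Approach 3: weighted mean; how the weights are chosen is not specified here."""
    if len(weights) != len(a) or len(a) != len(b):
        raise ValueError("one weight per profile is required")
    sims: List[float] = [cosine(pa, pb) for pa, pb in zip(a, b)]
    return sum(w * s for w, s in zip(weights, sims)) / sum(weights)


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (equal weight to both)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0
```

With two toy profiles per user, `[[1, 0], [1, 1]]` and `[[1, 0], [1, -1]]`, the per-profile similarities are 1 and 0, so the three approaches give different neighbor rankings, which is the point of comparing them.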

Development of an Automatic 3D Coregistration Technique of Brain PET and MR Images (뇌 PET과 MR 영상의 자동화된 3차원적 합성기법 개발)

Lee, Jae-Sung;Kwark, Cheol-Eun;Lee, Dong-Soo;Chung, June-Key;Lee, Myung-Chul;Park, Kwang-Suk
- The Korean Journal of Nuclear Medicine / v.32 no.5 / pp.414-424 / 1998
Purpose: Cross-modality coregistration of positron emission tomography (PET) and magnetic resonance (MR) images could enhance clinical information. In this study we propose a refined technique that improves the robustness of registration and provides more realistic visualization of the coregistered images. Materials and Methods: Using the sinogram of the PET emission scan, we extracted a robust head boundary and used the boundary-enhanced PET to coregister PET with MR. Pixels with 10% of the maximum pixel value were taken as the boundary of the sinogram, and the boundary pixel values were replaced with the maximum value of the sinogram. From each slice of the MR images, 180 boundary points were extracted at intervals of about 2 degrees using a simple threshold method. The best affine transformation between the two point sets was found by least-squares fitting, which minimizes the sum of Euclidean distances between the point sets. We reduced calculation time by using a pre-defined distance map. Finally, we developed an automatic coregistration program using this boundary detection and surface matching technique. We also designed a new weighted normalization technique to display the coregistered PET and MR images simultaneously. Results: With the newly developed method, robust extraction of the head boundary was possible and spatial registration was performed successfully. The mean displacement error was less than 2.0 mm. When coregistered images were visualized with the weighted normalization method, structures shown in the MR image were represented realistically. Conclusion: Our refined technique could practically enhance the performance of automated three-dimensional coregistration.

Experimental Design of S box and G function strong with attacks in SEED-type cipher (SEED 형식 암호에서 공격에 강한 S 박스와 G 함수의 실험적 설계)

박창수;송홍복;조경연
- Journal of the Korea Institute of Information and Communication Engineering / v.8 no.1 / pp.123-136 / 2004
In this paper, the complexity and regularity of polynomial multiplication over $GF(2^n)$ are defined using the Hamming weights of the rows and columns of the matrix over GF(2) that represents the polynomial multiplication. It is shown experimentally that, to construct a block cipher robust against differential cryptanalysis, the polynomial multiplication of the substitution layer and the permutation layer should have high complexity and high regularity. Based on the experimental results, a way of constructing the S box and G function is suggested for block ciphers whose structure is similar to SEED, the Korean standard 128-bit block cipher. The S box can be formed from a nonlinear function and an affine transform. The nonlinear function must be resistant to differential and linear attacks; it consists of an inversion over $GF(2^8)$ that has neither a fixed point, where the input and output are the same (except for 0 and 1), nor an opposite fixed point, where the output is the one's complement of the input. The affine transform can be constructed so that the input/output correlation is the lowest and there is no fixed point or opposite fixed point. The G function applies a linear transform to the 4 S-box outputs using a $4 \times 4$ matrix over $GF(2^8)$. The components of the linear transformation matrix have high complexity and high regularity. Furthermore, the G function can be constructed so that an MDS (Maximum Distance Separable) code is formed, the SAC (Strict Avalanche Criterion) is met, and there is no weak input, that is, no fixed point, no opposite fixed point, and no input whose output is its two's complement. The primitive polynomials of the nonlinear function, the affine transform, and the linear transformation differ from each other. The S box and G function suggested in this paper can be used as components of a highly secure block cipher, in that they are resistant to differential and linear attacks, have no weak inputs, and provide excellent diffusion.
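The fixed-point and opposite-fixed-point conditions in the abstract above can be checked mechanically for any 8-bit table. The sketch below does so for inversion over $GF(2^8)$. It is illustrative only: the reduction polynomial `0x11B` (the one used by AES) is an assumption, since the abstract does not state which polynomials the authors chose, and the helper names are hypothetical.

```python
from typing import List, Sequence


def gf_mul(a: int, b: int, poly: int = 0x11B) -> int:
    """Multiply two elements of GF(2^8) modulo the given degree-8 polynomial."""
    result = 0
    while b:
        if b & 1:
            result ^= a      # add (XOR) the current multiple of a
        a <<= 1
        if a & 0x100:
            a ^= poly        # reduce back to 8 bits
        b >>= 1
    return result


def gf_inv(a: int, poly: int = 0x11B) -> int:
    """Multiplicative inverse via a^254; 0 is mapped to 0 by convention."""
    if a == 0:
        return 0
    result, base, exp = 1, a, 254
    while exp:
        if exp & 1:
            result = gf_mul(result, base, poly)
        base = gf_mul(base, base, poly)
        exp >>= 1
    return result


def fixed_points(table: Sequence[int]) -> List[int]:
    """Inputs whose output equals the input."""
    return [x for x in range(256) if table[x] == x]


def opposite_fixed_points(table: Sequence[int]) -> List[int]:
    """Inputs whose output is the one's complement of the input."""
    return [x for x in range(256) if table[x] == x ^ 0xFF]


# Inversion table under the assumed polynomial.
INVERSE = [gf_inv(x) for x in range(256)]
```

In a field of characteristic 2, x·x = 1 has only the solution x = 1, so `fixed_points(INVERSE)` contains just 0 and 1, matching the exception the abstract mentions. Whether `opposite_fixed_points(INVERSE)` is empty depends on the polynomial, which is why a search over candidate polynomials, as the paper describes, is needed.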

The Zhouyi and Artificial Intelligence (『주역』과 인공지능)

Bang, In
- Journal of Korean Philosophical Society / v.145 / pp.91-117 / 2018
This paper aims to clarify the similarities and differences between the Zhouyi and artificial intelligence. The divination of the Zhouyi is rooted in the oldest system of human knowledge, while artificial intelligence stands at the cutting edge of the modern scientific revolution. At first sight, there does not appear to be any association that links one to the other. However, seen from a semiotic standpoint they share the same ground, because both depend on a semiotic system as a means of obtaining knowledge. At least four similarities can be pointed out. First, artificial intelligence and the Zhouyi use an artificial language that consists of semiotic signs. Second, the principle that enables both divination and artificial intelligence lies in imitation and representation. Third, artificial intelligence and the Zhouyi carry out inferences based on mathematical algorithms that adopt the binary system. Fourth, artificial intelligence and the Zhouyi use analogy as a means of obtaining knowledge. These similarities do not guarantee that the Zhouyi can arrive at scientific certainty. Nevertheless, it can give us important insight into the essence of our civilization. The Zhouyi uses intellect in order to get new information about the unknown world. However, it is hard to know what kind of intellect is involved in the process of divination. Likewise, we do not know the fundamental character of artificial intelligence. The intellect hidden in an unknown subject is a mysterious and fearful existence to us. Just as the divination of the Zhouyi inspires a sense of reverence toward the supernatural subject, we cannot help but feel fear before the invisible subject hidden in artificial intelligence. In the past, traditional philosophy acknowledged the existence of intellect only in conscious beings. Nonetheless, it is becoming evident that human civilization is entering a new epoch.
As Ray Kurzweil mentioned, the moment of singularity comes when artificial intelligence surpasses human intelligence. In my view, the term singularity can be used to denote the critical point at which the human species enters a new phase of civilization. To borrow the terms of Shao Yong (邵雍) of the Northern Song Dynasty, the past civilization belongs to the Earlier Heaven (先天) and the future civilization belongs to the Later Heaven (後天). Once our civilization passes the critical point, it is impossible to go back to the past. The opening of the Later Heaven foretold by religious thinkers in the late Joseon Dynasty was a prophecy in its own age, but it is becoming a reality in the present.
https://doi.org/10.20293/jokps.2018.145.91

Image Watermarking for Copyright Protection of Images on Shopping Mall (쇼핑몰 이미지 저작권보호를 위한 영상 워터마킹)

Bae, Kyoung-Yul
- Journal of Intelligence and Information Systems / v.19 no.4 / pp.147-157 / 2013
With the advent of a digital environment that can be accessed anytime and anywhere over high-speed networks, the free distribution and use of digital content has become possible. Ironically, this environment is giving rise to a variety of copyright infringements, and product images used in online shopping malls are pirated frequently. Whether shopping mall images are creative works is a matter of controversy. According to a Supreme Court decision in 2001, advertising photographs of ham products merely reproduced the appearance of the objects and were not creative expression. However, the photographer's losses were recognized, and damages were estimated at the typical cost of an advertising photo shoot. According to a Seoul District Court precedent in 2003, a work should be protected by copyright law if the photographer's personality and creativity are present in the selection of the subject, the composition of the set, the direction and amount of light, the camera angle, the shutter speed, the shutter chance, other shooting methods, and the developing and printing process. To receive copyright protection under the law, a shopping mall image must therefore do more than simply convey the state of the product; it takes effort for the photographer's personality and creativity to be recognized. Accordingly, the cost of making shopping mall images increases, and the need for copyright protection grows. Product images in online shopping malls have a very distinctive composition, unlike general pictures such as portraits and landscape photos, and general image watermarking techniques therefore cannot satisfy their watermarking requirements.
Because the background of product images commonly used in shopping malls is white, black, or a grayscale gradient, it is difficult to use that space to embed a watermark, and the area is very sensitive to even a slight change. In this paper, the characteristics of images used in shopping malls are analyzed and a watermarking technology suited to shopping mall images is proposed. The proposed technology divides a product image into smaller blocks, transforms the blocks with the DCT (Discrete Cosine Transform), and then inserts the watermark information into the image by quantizing the DCT coefficients. Because uniform treatment of the DCT coefficients during quantization causes visual blocking artifacts, the proposed algorithm uses a weighted mask that quantizes the coefficients near block boundaries finely and the coefficients in the center of the block coarsely. This mask improves the subjective visual quality as well as the objective quality of the images. In addition, to improve the security of the algorithm, the blocks in which the watermark is embedded are selected randomly, and a turbo code is used to reduce the BER when the watermark is extracted. The PSNR (Peak Signal-to-Noise Ratio) of shopping mall images watermarked by the proposed algorithm is 40.7 to 48.5 dB, and the BER (Bit Error Rate) after JPEG compression with QF = 70 is 0. This means that the watermarked image is of high quality and that the algorithm is robust to the JPEG compression generally used by online shopping malls. The BER is also 0 for a 40% change in size and for 40 degrees of rotation. In general, shopping malls use compressed images with a QF higher than 90. Because pirated images are copied from the original images, the proposed algorithm can identify copyright infringement in most cases. As the experimental results show, the proposed algorithm is suitable for shopping mall images with a simple background.
However, future work should enhance the robustness of the proposed algorithm, because some robustness is lost after the masking process.
https://doi.org/10.13088/jiis.2013.19.4.147
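Embedding a watermark bit by quantizing a transform coefficient, as the abstract above describes, can be illustrated with a simple parity-based quantizer. This is a generic sketch and not the paper's algorithm: the parity rule, the function names, and the step size are assumptions, and the DCT, the weighted mask, the random block selection, and the turbo code are all omitted. A `psnr` helper is included because the abstract reports quality in those terms.

```python
import math
from typing import Sequence


def embed_bit(coeff: float, bit: int, step: float) -> float:
    """Move coeff to a nearby multiple of step whose index parity equals bit."""
    index = math.floor(coeff / step)
    if index % 2 != bit:
        index += 1           # the neighbouring multiple has the other parity
    return index * step


def extract_bit(coeff: float, step: float) -> int:
    """Recover the bit from the parity of the nearest quantization index."""
    return round(coeff / step) % 2


def psnr(original: Sequence[float], marked: Sequence[float], peak: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB; infinite when the signals are identical."""
    mse = sum((a - b) ** 2 for a, b in zip(original, marked)) / len(original)
    return float("inf") if mse == 0 else 10 * math.log10(peak ** 2 / mse)
```

A larger `step` tolerates more distortion before the extracted bit flips (any perturbation smaller than half a step is absorbed) but changes the coefficient more. That trade-off is presumably why a mask with finer steps near block boundaries helps visual quality at some cost in robustness.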

A Study on Fast Iris Detection for Iris Recognition in Mobile Phone (휴대폰에서의 홍채인식을 위한 고속 홍채검출에 관한 연구)

Park Hyun-Ae;Park Kang-Ryoung
- Journal of the Institute of Electronics Engineers of Korea SP / v.43 no.2 s.308 / pp.19-29 / 2006
As the security of personal information becomes more important in mobile phones, iris recognition technology is starting to be applied to these devices. Conventional iris recognition requires magnified iris images. Capturing them has required a camera with a large zoom and focus lens, but such lenses are difficult to use in mobile phones because of constraints on size and cost. However, with rapid development and multimedia convergence trends in mobile phones, more and more companies have built mega-pixel cameras into their phones. These devices make it possible to capture a magnified iris image without a zoom and focus lens. Even when a facial image is captured far from the user with a mega-pixel camera, the captured iris region contains sufficient pixel information for iris recognition. In this case, however, the eye region must be detected in the facial image for accurate iris recognition. We therefore propose a new fast iris detection method, suitable for mobile phones, based on corneal specular reflection. To detect the specular reflection robustly, we present a theoretical basis for estimating its size and brightness from eye, camera, and illuminator models. In addition, we use a successive on/off scheme for the illuminator to detect optical and motion blurring and the effect of sunlight on the input image. Experimental results show that the total processing time (detecting the iris region) is 65 ms on average on a Samsung SCH-S2300 mobile phone (with a 150 MHz ARM 9 CPU). The rate of correct iris detection is 99% for indoor images and 98.5% for outdoor images.

The Relationship Analysis between the Epicenter and Lineaments in the Odaesan Area using Satellite Images and Shaded Relief Maps (위성영상과 음영기복도를 이용한 오대산 지역 진앙의 위치와 선구조선의 관계 분석)

CHA, Sung-Eun;CHI, Kwang-Hoon;JO, Hyun-Woo;KIM, Eun-Ji;LEE, Woo-Kyun
- Journal of the Korean Association of Geographic Information Studies / v.19 no.3 / pp.61-74 / 2016
The purpose of this paper is to analyze the relationship between lineament features and the location of the epicenter of a medium-sized earthquake (magnitude 4.8) that occurred on January 20, 2007 in the Odaesan area, using a shaded relief map (1/25,000 scale) and satellite images from LANDSAT-8 and KOMPSAT-2. Previous studies have analyzed lineament features in tectonic settings primarily by examining two-dimensional satellite images and shaded relief maps. These methods, however, limit the visual interpretation of relief features, long considered the major component of lineament extraction. To overcome some limitations of two-dimensional images, this study examined three-dimensional images, produced from a Digital Elevation Model and a drainage network map, for lineament extraction. This approach reduces mapping errors introduced by visual interpretation. In addition, spline interpolation was conducted to produce density maps of lineament frequency, intersection, and length, which are required to estimate the lineament density at the epicenter of the earthquake. An algorithm was developed to compute the Value of the Relative Density (VRD), which represents the relative density of lineaments on the map. The VRD is the lineament density of each map grid cell divided by the maximum density value on the map. As such, it is a quantified value that indicates the concentration of lineament density across the area affected by the earthquake. Using this algorithm, the VRD calculated at the earthquake epicenter from the lineament frequency, intersection, and length density maps ranged from approximately 0.60 (min) to 0.90 (max). However, because the mapped images differed in conditions such as solar altitude and azimuth, the mean VRD was used rather than the values for individual images.
The results show that the mean VRD for frequency was approximately 0.85, which was 21% higher than the VRDs for intersection and length, demonstrating the close relationship between lineaments and the epicenter. It is therefore concluded that the density map analysis described in this study, based on lineament extraction, is valid and can be used as a primary data analysis tool for future earthquake research.
https://doi.org/10.11108/kagis.2016.19.3.061
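The VRD definition in the abstract above (each grid cell's density divided by the map maximum) and the averaging over several images can be written out as a short sketch. The function names and the toy grids are hypothetical, and the density maps themselves (spline-interpolated in the paper) are taken as given inputs.

```python
from typing import List, Sequence

Grid = Sequence[Sequence[float]]


def relative_density(density_map: Grid) -> List[List[float]]:
    """VRD per cell: the cell's lineament density divided by the map maximum."""
    peak = max(value for row in density_map for value in row)
    if peak <= 0:
        raise ValueError("density map needs a positive maximum")
    return [[value / peak for value in row] for row in density_map]


def mean_vrd_at(maps: Sequence[Grid], row: int, col: int) -> float:
    """Average the VRD at one cell (e.g. the epicenter) over several maps."""
    values = [relative_density(m)[row][col] for m in maps]
    return sum(values) / len(values)


# Toy example: two 2x2 density maps derived from different images.
map_a = [[2.0, 4.0], [8.0, 6.0]]
map_b = [[1.0, 3.0], [4.0, 2.0]]
print(relative_density(map_a))             # [[0.25, 0.5], [1.0, 0.75]]
print(mean_vrd_at([map_a, map_b], 0, 1))   # 0.625
```

Because every map is normalized by its own maximum, VRD values from images taken under different solar altitude and azimuth become comparable, which is what makes averaging them meaningful.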

Reconstruction of Metabolic Pathway for the Chicken Genome (닭 특이 대사 경로 재확립)

Kim, Woon-Su;Lee, Se-Young;Park, Hye-Sun;Baik, Woon-Kee;Lee, Jun-Heon;Seo, Seong-Won
- Korean Journal of Poultry Science / v.37 no.3 / pp.275-282 / 2010
The chicken is important both as a food source for humans and as a valuable biomedical model, and there is a strong rationale for improving our understanding of the metabolism and physiology of this organism. The first draft of the chicken genome assembly was released in 2004, which enables elaboration of the linkage between the genetic and metabolic traits of the chicken. The objectives of this study were thus to reconstruct the metabolic pathways of the chicken genome and to construct a chicken-specific pathway genome database (PGDB). We developed a comprehensive genome database for the chicken by integrating all known annotations for chicken genes and proteins using a pipeline written in Perl. Based on the comprehensive genome annotations, the metabolic pathways of the chicken genome were reconstructed using the PathoLogic algorithm in the Pathway Tools software. We identified a total of 212 metabolic pathways, 2,709 enzymes, 71 transporters, 1,698 enzymatic reactions, 8 transport reactions, and 1,360 compounds in the current chicken genome build, Gallus_gallus-2.1. Comparative metabolic analysis with the human, mouse, and cattle genomes revealed that core metabolic pathways are highly conserved in the chicken genome. The results indicate that the quality of the assembly and annotations of the chicken genome needs to be improved and that more research is required to improve our understanding of the function of genes and metabolic pathways in avian species. We conclude that the chicken PGDB is useful for studies on avian and chicken metabolism and provides a platform for comparative genomic and metabolic analysis in animal biology and biomedicine.
https://doi.org/10.5536/KJPS.2010.37.3.275

Target Word Selection Disambiguation using Untagged Text Data in English-Korean Machine Translation (영한 기계 번역에서 미가공 텍스트 데이터를 이용한 대역어 선택 중의성 해소)

Kim Yu-Seop;Chang Jeong-Ho
- The KIPS Transactions:PartB / v.11B no.6 / pp.749-758 / 2004
In this paper, we propose a new method for disambiguating target word selection in English-Korean machine translation that uses only a raw corpus, without additional human effort. We use two data-driven techniques: Latent Semantic Analysis (LSA) and Probabilistic Latent Semantic Analysis (PLSA). These two techniques can represent complex semantic structures in given contexts, such as text passages. We construct linguistic semantic knowledge with the two techniques and use that knowledge for target word selection in English-Korean machine translation. For target word selection, we utilize grammatical relationships stored in a dictionary. We use the k-nearest neighbor learning algorithm to resolve the data sparseness problem in target word selection and estimate the distance between instances based on these models. In the experiments, we use TREC AP news data to construct the latent semantic space and the Wall Street Journal corpus to evaluate target word selection. With the latent semantic analysis methods, the accuracy of target word selection improved by over 10%, and PLSA showed better accuracy than LSA. Finally, using correlation calculation, we show the relationship between accuracy and two important factors: the dimensionality of the latent space and the k value of k-NN learning.
https://doi.org/10.3745/KIPSTB.2004.11B.6.749
