Extending StarGAN-VC to Unseen Speakers Using RawNet3 Speaker Representation (RawNet3 화자 표현을 활용한 임의의 화자 간 음성 변환을 위한 StarGAN의 확장)

  • Bogyung Park;Somin Park;Hyunki Hong
    • KIPS Transactions on Software and Data Engineering / v.12 no.7 / pp.303-314 / 2023
  • Voice conversion, a technology that regenerates an individual's speech with the acoustic properties (tone, cadence, gender) of another speaker, has countless applications in education, communication, and entertainment. This paper proposes an approach based on the StarGAN-VC model that generates realistic-sounding speech without requiring parallel utterances. To overcome the constraints of the existing StarGAN-VC model, which uses one-hot vectors for source and target speaker information, this paper extracts feature vectors of target speakers using a pre-trained RawNet3. This yields a latent space in which voice conversion can be performed without direct speaker-to-speaker mappings, enabling an any-to-any structure. In addition to the loss terms used in the original StarGAN-VC model, the Wasserstein distance is used as a loss term to ensure that generated voice segments match the acoustic properties of the target voice. The Two Time-Scale Update Rule (TTUR) is also used to facilitate stable training. Experimental results show that the proposed method outperforms previous methods, including the StarGAN-VC network on which it is based.

A Desirability Function-Based Multi-Characteristic Robust Design Optimization Technique (호감도 함수 기반 다특성 강건설계 최적화 기법)

  • Jong Pil Park;Jae Hun Jo;Yoon Eui Nahm
    • Journal of Korean Society of Industrial and Systems Engineering / v.46 no.4 / pp.199-208 / 2023
  • The Taguchi method is one of the most popular approaches to design optimization, making performance characteristics robust to uncontrollable noise variables. However, most previous applications of the Taguchi method have addressed single-characteristic problems, whereas problems with multiple characteristics are more common in practice. The multi-criteria decision making (MCDM) problem is to select the best of multiple alternatives by integrating a number of criteria that may conflict with each other. Representative MCDM methods include TOPSIS (Technique for Order of Preference by Similarity to Ideal Solution), GRA (Grey Relational Analysis), PCA (Principal Component Analysis), and fuzzy logic systems. Accordingly, numerous studies have addressed the multi-characteristic design problem by combining the original Taguchi method with MCDM methods. In an MCDM problem, the criteria generally have different measurement units, so their physical values may differ greatly, which makes it difficult to integrate the measurements. Normalization is therefore usually applied to convert the different units into one common unit. Four normalization techniques are commonly used in MCDM problems: vector normalization and three linear scale transformations (max-min, max, and sum). However, these normalization techniques have several shortcomings and do not adequately reflect practical considerations. For example, if an alternative has the maximum value for a certain criterion, it is considered the solution in the original process. However, if that maximum value does not satisfy the degree of fulfillment required by the designer or customer, the alternative should not necessarily be considered the solution. To solve this problem, this paper employs the desirability function proposed in our previous research. The desirability function uses an upper limit and a lower limit in the normalization process. The threshold points that establish the upper and lower limits express the degree of fulfillment required by the designer or customer. This paper proposes a new design optimization technique for the multi-characteristic design problem by integrating the Taguchi method and our desirability functions. The proposed technique obtains an optimal solution that is robust across multiple performance characteristics.
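
The contrast between vector normalization and a limit-based desirability score can be sketched as follows. This is a minimal illustration, not the authors' function: the abstract does not give the form of their desirability function, so a linear larger-is-better form is assumed, and the function names, scores, and limits are hypothetical.

```python
import math


def vector_normalize(values):
    """Vector normalization: divide each value by the Euclidean norm."""
    norm = math.sqrt(sum(v * v for v in values))
    return [v / norm for v in values]


def desirability_larger_is_better(y, lower, upper):
    """Assumed linear desirability: 0 at or below `lower`, 1 at or above `upper`."""
    if upper <= lower:
        raise ValueError("upper limit must exceed lower limit")
    if y <= lower:
        return 0.0
    if y >= upper:
        return 1.0
    return (y - lower) / (upper - lower)


# Hypothetical scores of three alternatives on one criterion.
scores = [60.0, 70.0, 80.0]

# Vector normalization always ranks the largest value highest,
# whether or not it meets the required level.
print(vector_normalize(scores))

# With a required lower limit of 75 and a target of 95 (both assumed),
# only the third alternative earns a nonzero desirability.
print([desirability_larger_is_better(y, lower=75.0, upper=95.0) for y in scores])
```

Under these assumed limits, an alternative that is merely the best of those available still scores low if it falls short of the target, which is the behaviour the abstract asks of the normalization step.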

Study on Predicting the Designation of Administrative Issue in the KOSDAQ Market Based on Machine Learning Based on Financial Data (머신러닝 기반 KOSDAQ 시장의 관리종목 지정 예측 연구: 재무적 데이터를 중심으로)

  • Yoon, Yanghyun;Kim, Taekyung;Kim, Suyeong
    • Asia-Pacific Journal of Business Venturing and Entrepreneurship / v.17 no.1 / pp.229-249 / 2022
  • This paper investigates machine learning models for predicting the designation of administrative issues in the KOSDAQ market. When a company in the Korean stock market is designated as an administrative issue, the market treats the event itself as negative information, causing losses to the company and its investors. The purpose of this study is to evaluate alternative methods for developing an artificial intelligence service that uses companies' financial ratios to detect the possibility of designation early and to help investors manage portfolio risk. The independent variables are 21 financial ratios representing profitability, stability, activity, and growth. Financial data were sampled for administrative-issue and non-administrative-issue companies from 2011 to 2020, the period in which K-IFRS was applied. Logistic regression, decision tree, support vector machine, random forest, and LightGBM are used to predict designation. LightGBM is the best prediction model, with 82.73% classification accuracy, and the decision tree is the worst, with 71.94% accuracy. An examination of the top three variables by importance in the decision tree-based learning models shows that the financial variables common to each model are ROE (net profit) and capital stock turnover ratio, which are relatively important in the designation of administrative issues. Overall, the ensemble learning models showed higher predictive performance than the single learning models.

A Spatial-Temporal Correlation Analysis of Housing Prices in Busan Using SpVAR and GSTAR (SpVAR(공간적 벡터자기회귀모델)과 GSTAR(일반화 시공간자기회귀모델)를 이용한 부산지역 주택가격의 시공간적 상관성 분석)

  • Kwon, Youngwoo;Choi, Yeol
    • KSCE Journal of Civil and Environmental Engineering Research / v.44 no.2 / pp.245-256 / 2024
  • Since 2020, quantitative easing and easy-money policies have been implemented to stimulate the economy, and real estate prices have skyrocketed as a result. This study analyzes, spatio-temporally, the relationship between sales and rental prices by housing type in Busan during this period of soaring real estate prices. Monthly district-level data by housing type and transaction type were constructed from actual transaction price data. Two spatio-temporal models were used: SpVAR, which captures the temporal and spatial effects of variables, and GSTAR, which captures the effect of each region on those variables. The results show that apartment sales prices had a positive effect on the sales prices of apartments, row houses, and detached houses in the target area and the surrounding area. On the other hand, rising apartment sales prices shifted demand toward apartment rentals, and sales prices fell again over time. The spatio-temporal spillover effect of apartments was positive, but the positive effects of row houses and detached houses were concentrated in the original downtown area.
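
How GSTAR lets each region carry its own coefficients can be sketched with a one-step prediction. This assumes the common first-order form GSTAR(1;1), in which each region's next value depends on its own previous value and on a weighted average of its neighbours; the abstract does not state the model order, and the weights and coefficients below are hypothetical.

```python
def gstar_one_step(z_prev, weights, phi0, phi1):
    """One-step GSTAR(1;1) prediction (noise term omitted).

    z_prev  : previous value for each region
    weights : spatial weight matrix, weights[i][j] = influence of region j on i
    phi0    : region-specific autoregressive coefficients
    phi1    : region-specific spatial-lag coefficients
    """
    n = len(z_prev)
    predictions = []
    for i in range(n):
        # Weighted average of the neighbours' previous values.
        spatial_lag = sum(weights[i][j] * z_prev[j] for j in range(n))
        predictions.append(phi0[i] * z_prev[i] + phi1[i] * spatial_lag)
    return predictions


# Hypothetical example: three districts, row-normalized weights.
z_prev = [1.0, 2.0, 4.0]
weights = [
    [0.0, 0.5, 0.5],
    [0.5, 0.0, 0.5],
    [0.5, 0.5, 0.0],
]
phi0 = [0.6, 0.5, 0.4]  # assumed values, not estimates from the paper
phi1 = [0.2, 0.3, 0.1]
print(gstar_one_step(z_prev, weights, phi0, phi1))
```

Because `phi0` and `phi1` differ by region, the sketch shows why the model can describe how strongly each district responds to its neighbours, rather than forcing one shared spatial coefficient.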

The Analysis on the Relationship between Firms' Exposures to SNS and Stock Prices in Korea (기업의 SNS 노출과 주식 수익률간의 관계 분석)

  • Kim, Taehwan;Jung, Woo-Jin;Lee, Sang-Yong Tom
    • Asia Pacific Journal of Information Systems / v.24 no.2 / pp.233-253 / 2014
  • Can the stock market really be predicted? Stock market prediction has attracted attention from many fields, including business, economics, statistics, and mathematics. Early research on stock market prediction was based on random walk theory (RWT) and the efficient market hypothesis (EMH). According to the EMH, stock markets are driven largely by new information rather than by present and past prices. Because new information is unpredictable, the stock market follows a random walk. Despite these theories, Schumaker [2010] asserted that people keep trying to predict the stock market using artificial intelligence, statistical estimates, and mathematical models. Mathematical approaches include percolation methods, log-periodic oscillations, and wavelet transforms to model future prices. Artificial intelligence approaches, which deal with optimization and machine learning, include genetic algorithms, support vector machines (SVM), and neural networks. Statistical approaches typically predict the future from past stock market data. Recently, financial engineers have started to predict stock price movement patterns using SNS data. SNS is a place where people's opinions and ideas flow freely and affect others' beliefs. Through word of mouth on SNS, people share product usage experiences, subjective feelings, and the accompanying sentiment or mood with others. An increasing number of empirical analyses of sentiment and mood are based on textual collections of public user-generated data on the web. Opinion mining is a domain of data mining that extracts public opinions expressed on SNS. There have been many studies of opinion mining from web sources such as product reviews, forum posts, and blogs. Building on this literature, we try to understand the effects of firms' SNS exposure on stock prices in Korea. Similarly to Bollen et al. [2011], we empirically analyze the impact of SNS exposures on stock return rates. We use Social Metrics by Daum Soft, an SNS big data analysis company in Korea. Social Metrics provides trends and public opinion on Twitter and blogs using natural language processing and analysis tools. It collects sentences circulated on Twitter in real time, breaks them down into word units, and extracts keywords. In this study, we classify firms' SNS exposures into two groups: positive and negative. To test the correlation and causal relationship between SNS exposures and stock price returns, we first collect the stock prices of 252 firms and the KRX100 index on the Korea Stock Exchange (KRX) from May 25, 2012 to September 1, 2012. We also gather public attitudes (positive, negative) toward these firms from Social Metrics over the same period. We conduct regression analysis between stock prices and the number of SNS exposures. Having checked the correlation between the two variables, we perform a Granger causality test to determine the direction of causation. The results show that the total number of SNS exposures is positively related to stock market returns. The number of positive mentions also has a positive relationship with stock market returns. In contrast, the number of negative mentions has a negative relationship with stock market returns, but this relationship is not statistically significant. This means that the impact of positive mentions is statistically bigger than the impact of negative mentions. We also investigate whether the impacts are moderated by industry type and firm size. We find that the impact of SNS exposure is bigger for IT firms than for non-IT firms, and bigger for small firms than for large firms. The Granger causality test shows that changes in stock price returns are caused by SNS exposures, while causation in the opposite direction is not significant. Therefore, the relationship between SNS exposures and stock prices has unidirectional causality. The more a firm is exposed on SNS, the more its stock price is likely to increase, while stock price changes may not cause more SNS mentions.

Monitoring soybean growth using L, C, and X-bands automatic radar scatterometer measurement system (L, C, X-밴드 레이더 산란계 자동측정시스템을 이용한 콩 생육 모니터링)

  • Kim, Yi-Hyun;Hong, Suk-Young;Lee, Hoon-Yol;Lee, Jae-Eun
    • Korean Journal of Remote Sensing / v.27 no.2 / pp.191-201 / 2011
  • Soybean is widely grown for its edible bean, which has numerous uses. Microwave remote sensing has great potential compared with conventional remote sensing in the visible and infrared spectra because of its all-weather, day-and-night imaging capability. In this investigation, a ground-based polarimetric scatterometer operating at multiple frequencies was used to continuously monitor crop conditions in a soybean field. Polarimetric backscatter data at L-, C-, and X-bands were acquired every 10 minutes at various soybean growth stages. The polarimetric scatterometer consists of a vector network analyzer, a microwave switch, radio frequency cables, a power unit, and a personal computer. Its components were installed inside an air-conditioned shelter to maintain constant temperature and humidity during the data acquisition period. The backscattering coefficients were calculated from the data measured at an incidence angle of $40^{\circ}$ with full polarization (HH, VV, HV, VH) by applying the radar equation. Soybean growth data such as leaf area index (LAI), plant height, fresh and dry weight, vegetation water content, and pod weight were measured periodically throughout the growing season. We measured the temporal variations of the backscattering coefficients of the soybean crop at L-, C-, and X-bands during the growth period. In all three bands, VV-polarized backscattering coefficients were higher than HH-polarized backscattering coefficients until mid-June, and thereafter HH-polarized backscattering coefficients were higher than the VV- and HV-polarized backscattering coefficients. However, the cross-over stage (HH > VV) differed by frequency: DOY 200 for L-band and DOY 210 for both C- and X-bands. The temporal trend of the backscattering coefficients in all bands agreed with soybean growth data such as LAI, dry weight, and plant height; that is, they increased until about DOY 271 and decreased afterward. We plotted the relationships between the backscattering coefficients in the three bands and the soybean growth parameters. The growth parameters were highly correlated with HH-polarization at L-band (over r=0.92).

Analysis of Trading Performance on Intelligent Trading System for Directional Trading (방향성매매를 위한 지능형 매매시스템의 투자성과분석)

  • Choi, Heung-Sik;Kim, Sun-Woong;Park, Sung-Cheol
    • Journal of Intelligence and Information Systems / v.17 no.3 / pp.187-201 / 2011
  • The KOSPI200 index is a Korean stock price index consisting of 200 actively traded stocks in the Korean stock market. Its base value of 100 was set on January 3, 1990. The Korea Exchange (KRX) developed derivatives markets on the KOSPI200 index. The KOSPI200 index futures market, introduced in 1996, has become one of the most actively traded index markets in the world. Traders can profit by entering a long position on the KOSPI200 index futures contract if the index rises, and by entering a short position if the index declines. KOSPI200 index futures trading is essentially a short-term zero-sum game, so most futures traders use technical indicators. Advanced traders make stable profits using system trading, also known as algorithm trading. Algorithm trading uses computer programs to receive real-time stock market data, analyze price movements with various technical indicators, and automatically enter trading orders, deciding the timing, price, and quantity of each order without human intervention. Recent studies have shown the usefulness of artificial intelligence systems in forecasting stock prices or investment risk. KOSPI200 index data are numerical time-series data, that is, a sequence of data points measured at successive uniform time intervals such as a minute, day, week, or month. KOSPI200 index futures traders use technical analysis to find patterns on the time-series chart. Although there are many technical indicators, their results all indicate one of three market states: bull, bear, or flat. Most strategies based on technical analysis are divided into trend-following and non-trend-following strategies. Both decide the market state based on patterns in the KOSPI200 index time-series data, which fits well with a Markov model (MM). Everybody knows that the next price will be higher than, lower than, or similar to the last price, and that the next price is influenced by the last price. However, nobody knows exactly whether the next price will go up, go down, or stay flat. A hidden Markov model (HMM) is therefore a better fit than an MM. HMMs are divided into discrete HMMs (DHMM) and continuous HMMs (CHMM). The only difference between them is in their representation of state probabilities: a DHMM uses a discrete probability density function, whereas a CHMM uses a continuous probability density function such as a Gaussian mixture model. KOSPI200 index values are real numbers that follow a continuous probability density function, so a CHMM is more appropriate than a DHMM for the KOSPI200 index. In this paper, we present an artificial intelligence trading system based on CHMM for KOSPI200 index futures system traders. Traders have practiced technical trading in the KOSPI200 index futures market ever since its introduction and have applied many strategies to profit from trading KOSPI200 index futures. Some strategies are based on technical indicators such as moving averages or stochastics, and others are based on candlestick patterns such as three outside up, three outside down, harami, or doji star. We present a trading system that applies a moving average cross strategy based on CHMM and compare it with a traditional algorithmic trading system. We set the moving average parameters to values commonly used by market practitioners. Empirical results compare the simulated performance with that of the traditional algorithmic trading system using more than 20 years of daily KOSPI200 index data. The suggested trading system shows higher trading performance than naive system trading.
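
The plain moving average cross strategy that serves as the baseline above can be sketched as follows. This covers only the traditional rule (long when the short average is above the long average, short when below); it does not implement the CHMM-based variant, and the window lengths and prices are hypothetical.

```python
def simple_moving_average(prices, window):
    """Trailing simple moving average; None until enough data is available."""
    averages = []
    for i in range(len(prices)):
        if i < window - 1:
            averages.append(None)
        else:
            averages.append(sum(prices[i - window + 1:i + 1]) / window)
    return averages


def crossover_positions(prices, short_window, long_window):
    """Return +1 (long), -1 (short), or 0 (flat / not enough data) per bar."""
    short_ma = simple_moving_average(prices, short_window)
    long_ma = simple_moving_average(prices, long_window)
    positions = []
    for s, l in zip(short_ma, long_ma):
        if s is None or l is None or s == l:
            positions.append(0)
        elif s > l:
            positions.append(1)
        else:
            positions.append(-1)
    return positions


# Hypothetical price series that rises and then falls.
prices = [1.0, 2.0, 3.0, 4.0, 3.0, 2.0, 1.0]
print(crossover_positions(prices, short_window=2, long_window=3))
# prints [0, 0, 1, 1, 1, -1, -1]
```

The position flips from long to short one bar after the peak, which illustrates the lag inherent in moving averages that a state-based model such as a CHMM is meant to address.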

Estimation of Mean Surface Current and Current Variability in the East Sea using Surface Drifter Data from 1991 to 2017 (1991년부터 2017년까지 표층 뜰개 자료를 이용하여 계산한 동해의 평균 표층 해류와 해류 변동성)

  • PARK, JU-EUN;KIM, SOO-YUN;CHOI, BYOUNG-JU;BYUN, DO-SEONG
    • The Sea:JOURNAL OF THE KOREAN SOCIETY OF OCEANOGRAPHY / v.24 no.2 / pp.208-225 / 2019
  • To understand the mean surface circulation and surface currents in the East Sea, trajectories of surface drifters that passed through the East Sea from 1991 to 2017 were analyzed. Based on the drifter trajectory data, the main paths of surface ocean currents were grouped and the variation in each main current path was investigated. The East Korea Warm Current (EKWC), heading northward, separates from the coast at $36{\sim}38^{\circ}N$ and flows to the northeast until $131^{\circ}E$. In the middle of the East Sea (from $131^{\circ}E$ to $137^{\circ}E$), the average latitude of the eastward-flowing currents ranges from 36 to $40^{\circ}N$, and the currents meander with large amplitude. When the average latitude of the surface drifter paths was north (south) of $37.5^{\circ}N$, the meandering amplitude was about 50 (100) km. The most frequent route of surface drifters in the middle of the East Sea was the path along $37.5-38.5^{\circ}N$. The surface drifters deployed off the coast of Vladivostok, in the north of the East Sea, moved southwest along the coast, separated from the coast, and flowed southeastward along the cyclonic circulation around the Japan Basin. The drifters then moved east along $39-40^{\circ}N$. The mean surface current vector and mean speed were calculated in each cell of a lattice with $0.25^{\circ}$ grid spacing using the velocity data of the surface drifters that passed through that cell. The current variance ellipses were calculated with $0.5^{\circ}$ grid spacing. Because the path of the EKWC changes every year in the western part of the Ulleung Basin and the current paths in the Yamato Basin keep changing with many eddies, the current variance ellipses are relatively large in these regions. We present a schematic map of the East Sea surface currents based on the surface drifter data. The significance of this study is that the surface ocean circulation of the East Sea, which has mainly been studied using numerical model simulations and sea surface height data from satellite altimeters, was analyzed using in-situ Lagrangian current observations.
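
The per-cell averaging described above can be sketched as a simple binning step. This is an illustration under assumptions: each drifter observation is taken to be a tuple of longitude, latitude, eastward velocity `u`, and northward velocity `v`, and the mean speed is taken as the average of the individual speeds; the abstract does not specify either detail, and the sample values are hypothetical.

```python
import math
from collections import defaultdict


def mean_current_by_cell(observations, spacing=0.25):
    """Average drifter velocities within each lon/lat grid cell.

    observations : iterable of (lon, lat, u, v) tuples
    Returns {cell: (mean_u, mean_v, mean_speed, count)} where cell is the
    integer (column, row) index of the grid cell.
    """
    sums = defaultdict(lambda: [0.0, 0.0, 0.0, 0])
    for lon, lat, u, v in observations:
        cell = (math.floor(lon / spacing), math.floor(lat / spacing))
        acc = sums[cell]
        acc[0] += u
        acc[1] += v
        acc[2] += math.hypot(u, v)  # speed of this single observation
        acc[3] += 1
    return {
        cell: (su / n, sv / n, sspeed / n, n)
        for cell, (su, sv, sspeed, n) in sums.items()
    }


# Hypothetical observations (degrees, m/s); the first two share a cell.
observations = [
    (130.1, 37.1, 0.3, 0.4),
    (130.2, 37.2, 0.3, 0.4),
    (131.6, 38.4, -0.1, 0.2),
]
for cell, stats in mean_current_by_cell(observations).items():
    print(cell, stats)
```

Keeping the mean vector and the mean speed separate matters in eddy-rich areas, where velocities in opposing directions can cancel in the vector mean while the mean speed stays large.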

A Study of 'Emotion Trigger' by Text Mining Techniques (텍스트 마이닝을 이용한 감정 유발 요인 'Emotion Trigger'에 관한 연구)

  • An, Juyoung;Bae, Junghwan;Han, Namgi;Song, Min
    • Journal of Intelligence and Information Systems / v.21 no.2 / pp.69-92 / 2015
  • The explosion of social media data has led researchers to apply text-mining techniques to analyze big social media data more rigorously. Although social media text analysis algorithms have improved, previous approaches still have limitations. In sentiment analysis of social media written in Korean, there are two typical approaches. One is the linguistic approach using machine learning, which is the most common. Some studies have added grammatical factors to the feature sets used to train classification models. The other approach applies semantic analysis to sentiment analysis, but it has mainly been applied to English texts. To overcome these limitations, this study applies the Word2Vec algorithm, an extension of neural network algorithms, to handle the more extensive semantic features that were underestimated in existing sentiment analysis. The result of the Word2Vec algorithm is compared with the result of co-occurrence analysis to identify the difference between the two approaches. The results show that Word2Vec extracts about three times as many related words expressing emotion about the keyword as co-occurrence analysis does. The difference comes from Word2Vec's vectorization of semantic features. It is therefore possible to say that the Word2Vec algorithm can catch hidden related words that traditional analysis has not found. In addition, part-of-speech (POS) tagging for Korean is used to detect adjectives as "emotional words". The emotional words extracted from the text are then converted into word vectors by the Word2Vec algorithm to find related words. Among these related words, nouns are selected because each of them may have a causal relationship with the "emotional word" in the sentence. The process of extracting these trigger factors of emotional words is named "Emotion Trigger" in this study. As a case study, the datasets were collected by searching with three keywords, professor, prosecutor, and doctor, because these keywords carry rich public emotion and opinion. Data were collected in advance to select secondary keywords for data gathering. The secondary keywords used to gather the data for the actual analysis are as follows: Professor (sexual assault, misappropriation of research money, recruitment irregularities, polifessor), Doctor (Shin hae-chul sky hospital, drinking and plastic surgery, rebate), Prosecutor (lewd behavior, sponsor). The size of the text data is about 100,000 (Professor: 25720, Doctor: 35110, Prosecutor: 43225), and the data were gathered from news, blogs, and Twitter to reflect various levels of public emotion in the text data analysis. Gephi (http://gephi.github.io) was used for visualization, and all programs used for text processing and analysis were written in Java. The contributions of this study are as follows. First, different approaches to sentiment analysis are integrated to overcome the limitations of existing approaches. Second, Emotion Trigger can detect hidden connections to public emotion that existing methods cannot detect. Finally, the approach used in this study could be generalized regardless of the type of text data. The limitation of this study is that it is hard to say that a word extracted by Emotion Trigger processing has a significant causal relationship with the emotional word in a sentence. Future work will clarify the causal relationship between emotional words and the words extracted by Emotion Trigger by comparing them with manually tagged relationships. Furthermore, the text data used in Emotion Trigger are Twitter data, so they have a number of distinct features that we did not deal with in this study. These features will be considered in further study.
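
The step of finding words related to an emotional word through its vector can be sketched with cosine similarity, the usual measure for comparing Word2Vec embeddings. The sketch does not train Word2Vec; the two-dimensional vectors and the vocabulary are hypothetical stand-ins for learned embeddings, and the study's own implementation (written in Java) is not reproduced here.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(query, vectors, top_n=2):
    """Rank every other word by cosine similarity to the query word."""
    query_vector = vectors[query]
    scored = [
        (word, cosine_similarity(query_vector, vector))
        for word, vector in vectors.items()
        if word != query
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


# Hypothetical embeddings: "angry" is the emotional word, the rest are nouns.
vectors = {
    "angry": [1.0, 0.0],
    "scandal": [0.9, 0.1],
    "bribe": [0.7, 0.3],
    "weather": [0.0, 1.0],
}
print(most_similar("angry", vectors, top_n=2))
```

In the pipeline the abstract describes, the nouns ranked highest for an emotional word would be the candidate triggers, which is why a similarity ranking rather than raw co-occurrence counts drives the selection.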

Construction of X-band automatic radar scatterometer measurement system and monitoring of rice growth (X-밴드 레이더 산란계 자동 측정시스템 구축과 벼 생육 모니터링)

  • Kim, Yi-Hyun;Hong, Suk-Young;Lee, Hoon-Yol
    • Korean Journal of Soil Science and Fertilizer / v.43 no.3 / pp.374-383 / 2010
  • Microwave radar can penetrate cloud cover regardless of weather conditions and can be used day and night. In particular, a ground-based polarimetric scatterometer has the advantage of monitoring crop conditions continuously with full polarization and at different frequencies. Kim et al. (2009) measured backscattering coefficients of paddy rice using an L-, C-, and X-band scatterometer system with full polarization and various angles during the rice growth period, and showed the need for near-continuous automatic measurement to eliminate the difficulty, inaccuracy, and sparseness of data acquisition arising from manual operation of the system. In this study, we constructed an X-band automatic scatterometer system, analyzed the scattering characteristics of paddy rice from X-band scatterometer data, and estimated rice growth parameters using X-band backscattering coefficients. The system was installed inside a shelter in an experimental paddy field at the National Academy of Agricultural Science (NAAS) before rice transplanting. The scatterometer system consists of X-band antennas, an HP8720D vector network analyzer, RF cables, and a personal computer that controls frequency, polarization, and data storage. The system automatically measures fully polarimetric backscattering coefficients of the rice crop every 10 minutes. The backscattering coefficients were calculated from the data measured at a fixed incidence angle of $45^{\circ}$ with full polarization (HH, VV, HV, VH) by applying the radar equation, and were compared with rice growth data such as plant height, stem number, fresh and dry weight, and leaf area index (LAI) collected at the same time. We examined the temporal behaviour of the backscattering coefficients of the rice crop at X-band during the rice growth period. The HH- and VV-polarization backscattering coefficients increased steadily toward the panicle initiation stage, decreased thereafter, and increased again in early September. We analyzed the relationships between X-band backscattering coefficients and plant parameters and predicted the rice growth parameters using the backscattering coefficients. It was confirmed that X-band is sensitive to grain maturity near the harvesting season.