• Title/Summary/Keyword: dialect classification

Dialect classification based on the speed and the pause of speech utterances (발화 속도와 휴지 구간 길이를 사용한 방언 분류)

  • Jonghwan Na;Bowon Lee
    • Phonetics and Speech Sciences / v.15 no.2 / pp.43-51 / 2023
  • In this paper, we propose an approach for dialect classification based on the speed and pauses of speech utterances as well as the age and gender of the speakers. Dialect classification is an important technique for speech analysis. For example, an accurate dialect classification model can potentially improve the performance of speaker or speech recognition. According to previous studies, deep learning on Mel-Frequency Cepstral Coefficient (MFCC) features has been the dominant approach. We focus on the acoustic differences between regions and conduct dialect classification using features derived from those differences. Specifically, we extract underexplored additional features, namely the speech rate and the pauses of speech utterances, along with metadata including the age and gender of the speakers. Experimental results show that the proposed approach achieves higher accuracy than the method using only MFCC features, especially with the speech rate feature. Incorporating all the proposed features improved the accuracy from 91.02% to 97.02% compared to the previous method that used only MFCC features.
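
    A minimal sketch of how speech rate and pause features of the kind described in this abstract might be computed. The abstract does not give the extraction procedure, so everything here is an assumption for illustration: the function name, the energy threshold, the minimum pause length, and the use of a syllable count taken from a transcript.

```python
def pause_and_rate_features(frame_energy, frame_sec, n_syllables,
                            silence_thresh=0.01, min_pause_sec=0.2):
    """Return pause and speech-rate features from per-frame energies.

    All parameters are illustrative assumptions, not values from the paper.
    """
    pauses = []
    run = 0  # length of the current run of silent frames
    # A sentinel frame with infinite energy closes a trailing silent run.
    for energy in list(frame_energy) + [float("inf")]:
        if energy < silence_thresh:
            run += 1
        else:
            if run * frame_sec >= min_pause_sec:
                pauses.append(run * frame_sec)
            run = 0

    total_sec = len(frame_energy) * frame_sec
    pause_sec = sum(pauses)
    speaking_sec = total_sec - pause_sec
    return {
        "pause_count": len(pauses),
        "pause_total_sec": pause_sec,
        # Syllables per second of speaking time (pauses excluded).
        "speech_rate": n_syllables / speaking_sec if speaking_sec > 0 else 0.0,
    }


features = pause_and_rate_features(
    [0.5, 0.5, 0.0, 0.0, 0.0, 0.5, 0.0, 0.5], frame_sec=0.1, n_syllables=5)
print(features)
```

    Short silent runs below `min_pause_sec` are treated as part of speech, so only the three-frame gap in the example counts as a pause.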

Performance Comparison of Korean Dialect Classification Models Based on Acoustic Features

  • Kim, Young Kook;Kim, Myung Ho
    • Journal of the Korea Society of Computer and Information / v.26 no.10 / pp.37-43 / 2021
  • Using the acoustic features of speech, important social and linguistic information about the speaker can be obtained, and one of the key features is the dialect. A speaker's use of a dialect is a major barrier to interaction with a computer. Dialects can be distinguished at various levels such as phonemes, syllables, words, phrases, and sentences, but it is difficult to distinguish dialects by identifying these one by one. Therefore, in this paper, we propose a lightweight Korean dialect classification model that uses only MFCC among the features of speech data. We study the optimal method of utilizing MFCC features on Korean conversational voice data, and compare classification performance on five Korean dialects (Gyeonggi/Seoul, Gangwon, Chungcheong, Jeolla, and Gyeongsang) across eight machine learning and deep learning classification models. Normalizing the MFCC improved the performance of most classification models; accuracy improved by 1.07% and F1-score by 2.04% compared to the best performance obtained before normalizing the MFCC.
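
    The abstract does not say which normalization was applied to the MFCC. As one common possibility, the sketch below shows per-coefficient mean and variance normalization over the frames of an utterance; the function name and the `eps` guard are assumptions made for illustration.

```python
from statistics import fmean, pstdev


def normalize_mfcc(frames, eps=1e-8):
    """Scale each MFCC coefficient to zero mean and unit variance.

    `frames` is a list of equal-length coefficient vectors, one per frame.
    This is an assumed normalization scheme, not the paper's stated method.
    """
    columns = list(zip(*frames))            # one tuple per coefficient
    means = [fmean(col) for col in columns]
    stds = [pstdev(col) for col in columns]
    return [
        [(x - m) / (s + eps) for x, m, s in zip(row, means, stds)]
        for row in frames
    ]


normalized = normalize_mfcc([[1.0, 10.0], [3.0, 10.0]])
print(normalized)
```

    A coefficient that is constant across frames has zero deviation, so `eps` keeps the division defined and the normalized value stays at zero.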

Emotion Recognition in Arabic Speech from Saudi Dialect Corpus Using Machine Learning and Deep Learning Algorithms

  • Hanaa Alamri;Hanan S. Alshanbari
    • International Journal of Computer Science & Network Security / v.23 no.8 / pp.9-16 / 2023
  • Speech can actively elicit feelings and attitudes through words. It is important for researchers to identify the emotional content contained in speech signals as well as the kind of emotion that the speech produced. In this study, we examined an emotion recognition system using a database in Arabic, specifically in the Saudi dialect. The database is drawn from a YouTube channel called Telfaz11. The four emotions examined were anger, happiness, sadness, and neutral. In our experiments, we extracted features from audio signals, such as Mel Frequency Cepstral Coefficients (MFCC) and Zero-Crossing Rate (ZCR), and then classified emotions using machine learning algorithms (Support Vector Machine (SVM) and K-Nearest Neighbor (KNN)) and deep learning algorithms (Convolutional Neural Network (CNN) and Long Short-Term Memory (LSTM)). Our experiments showed that the MFCC feature extraction method combined with the CNN model obtained the best accuracy, 95%, demonstrating the effectiveness of this classification system in recognizing emotions in spoken Arabic.
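
    For readers unfamiliar with the Zero-Crossing Rate feature named in this abstract, the following is a minimal sketch of its usual definition: the fraction of adjacent sample pairs whose signs differ. The function name and the convention of treating zero as non-negative are assumptions; the paper's own implementation is not described.

```python
def zero_crossing_rate(samples):
    """Fraction of adjacent sample pairs where the signal changes sign."""
    if len(samples) < 2:
        return 0.0
    crossings = sum(
        1 for a, b in zip(samples, samples[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(samples) - 1)


print(zero_crossing_rate([1.0, -1.0, 1.0, -1.0]))
print(zero_crossing_rate([1.0, 2.0, 3.0]))
```

    In practice the rate is computed per short frame rather than over a whole recording, which yields one value per frame to use alongside the MFCC vectors.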

Identifying Mobile Owner based on Authorship Attribution using WhatsApp Conversation

  • Almezaini, Badr Mohammd;Khan, Muhammad Asif
    • International Journal of Computer Science & Network Security / v.21 no.7 / pp.317-323 / 2021
  • Social media is increasingly becoming a part of our daily life for communicating with each other. There are various tools and applications for communication, and identity theft is therefore a common issue among users of such applications. A new style of identity theft occurs when cybercriminals break into a WhatsApp account, pose as real friends, and demand money or resort to emotional blackmail. To prevent such issues, data mining can be used for text classification (TC) in authorship attribution (AA) analysis to recognize the original sender of a message. Arabic is one of the most widely spoken languages in the world, with different variants. In this research, we built a machine learning model for mining and analyzing Arabic messages to identify the author of messages in the Saudi dialect. Several aspects of authorship attribution mining and analysis are addressed, including the collection of Arabic messages in the Saudi dialect and the filtering of the messages' tokens. The classification uses a cross-validation technique and different machine learning algorithms (Naïve Bayes, Support Vector Machine). Average accuracy results for Naïve Bayes and Support Vector Machine are presented, along with suggestions for future work.

On the primacy of auditory phonetics In tonological analysis and pitch description;In connection with the development of a new pitch scale (성조 분석과 음조 기술에서 청각음성학의 일차성;반자동 음조 청취 등급 분석기 개발과 관련하여)

  • Gim, Cha-Gyun
    • Proceedings of the KSPS conference / 2007.05a / pp.3-23 / 2007
  • King Sejong the Great, his students in the Jip-hyeun-jeon school, and Choe Sejin, their sixteenth-century successor, indicated that Middle Korean had three distinctive pitches: low, high, and rising (phyeong-, geo-, sang-sheong). Thanks to Hun-min-jeng-∅eum, as well as its Annotation and the side-dot literature of the fifteenth and sixteenth centuries, we can compare Middle Korean with the Hamgyeong dialect, the Gyeongsang dialect, and other extant tone dialects, which jointly preserve what was probably the tonal system of a unitary parent Korean language. What is most remarkable about Middle Korean phonetic work is its manifest superiority in conception and execution over anything produced in present-day linguistic scholarship. At that stage in linguistics, before the technology and equipment needed for the scientific analysis of sound waves existed, auditory description was the only possible frame for an accurate and systematic classification. Auditory phonetics still remains fundamental in pitch description, even though modern acoustic categories may supplement and supersede auditory ones in tonological analysis. Auditory phonetics, however, has the serious shortcoming that its theory and practice are too subjective to be developed into a science of the present century. With joint researchers, I am developing a new pitch scale. It is a semiautomatic auditory grade pitch analysis program. The result of our work is expected to provide a significant breakthrough that advances this component of linguistics.

F0 Perturbation as a Perceptual Cue to Stop Distinction in Busan and Seoul Dialects of Korean

  • Kang, Kyoung-Ho
    • Phonetics and Speech Sciences / v.5 no.4 / pp.137-143 / 2013
  • Recent investigation of the acoustic correlates of Korean stop manner contrasts has reported a diachronic transition in Korean stops: young Seoul speakers depend relatively more on the F0 characteristics of the stops than on the VOT characteristics in distinguishing aspirated and lenis stops. This finding has been examined against tonal dialects of Korean, and the results suggested that speakers of tonal dialects do not share the transition. These results also suggested that the function of F0 in segmental stop classification interferes with its function in lexical tone classification in tonal speech. The current study investigated these findings in terms of perception. The perceptual behavior of Seoul and Busan speakers of Korean was compared by measuring the perceptual cue weights of F0 and VOT. The results from regression and correlation analyses revealed that Busan speakers are closer to older Seoul speakers than to younger Seoul speakers in that the cue weights for VOT and F0 were comparable in the aspirated-lenis stop distinction. This result contrasted with the perceptual behavior of younger Seoul speakers, who showed clear dominance of F0 over VOT for the same distinction. These findings provide perceptual evidence of the dual function of F0 for segmental and lexical distinctions in tonal dialects of Korean.

Classification of Subregions in Yeongnam Region (영남지역 내 하위지역 구분)

  • Son, Myoung Won
    • Journal of the Korean association of regional geographers / v.22 no.1 / pp.25-35 / 2016
  • This paper aims to classify the subregions of the Yeongnam region, to identify their core areas, and to provide a basis for studying the inherent cultural characteristics of the Yeongnam region. To do so, I overlaid the human factors of administrative district and dialect on the physical factors of drainage basin and climate area. The extent of a subregion is the range over which environmental factors are similar to those of its provincial center. There are 27 parcels with an equivalent combination of environmental factors, ranging in size from one city/county to six cities/counties. These parcels are classified into six subregions (Andong, Sangju, Kyeongju, Daegu, Kimhae, Jinju). The boundaries of the subregions are high mountains and large rivers, which are obstacles to communication between subregions; where such obstacles are minor, a transitional zone exists.

A Study on the Engineering Characteristic of scoria in Jeju-Do (제주도산 송이의 공학적 특성에 관한 연구)

  • Chun, Byung-Sik;Kim, Dong-Hoon;Kim, Young-Hun;Lee, Dong-Yeup
    • Proceedings of the Korean Geotechnical Society Conference / 2008.10a / pp.1630-1637 / 2008
  • Jeju-do is an island formed by volcanic activity and has more than 360 volcanic cones distributed widely along the long axis of the elliptically shaped island. The volcanic cones consist mainly of scoria, called "Song-I" in the local dialect. In this study, the chemical and soil mechanical properties of scoria, which differ greatly from those of inland soils, were investigated with various tests. In the sieve-passing test, the scoria had a uniformity coefficient of more than 10 and a gradation coefficient of 1 to 3, showing a relatively homogeneous distribution. Based on the uniformity classification, the scoria was classified as GW. In the large-scale direct shear test used to measure the mechanical strength of scoria, the internal friction angle of red scoria was $37^{\circ}$ and that of black scoria was $36^{\circ}$. This indicated that there was no difference in mechanical strength between the two types of scoria. On the other hand, red and black scoria had k values of $1.24 \times 10^{-3}$ to $3.55 \times 10^{-2}$ cm/sec in the static water level permeability test, placing them in the range of coarse to fine sand when compared with values representative of saturated soil. They also had notably low $r_{dmax}$ values of 1.411 to 1.477 $g/cm^3$ in the compaction test compared with common soil, which was attributed to their low specific gravity and high porosity. In conclusion, the soil mechanical properties of scoria obtained in this study are expected to be very helpful in reducing the trial and error that occurs in civil engineering construction.
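
    The two gradation figures quoted in this abstract follow from standard definitions: the uniformity coefficient is D60/D10 and the gradation (curvature) coefficient is D30²/(D10·D60), where Dn is the particle diameter at n% passing. The sketch below assumes the ASTM D2487 (Unified Soil Classification System) criterion for well-graded gravel, Cu ≥ 4 and 1 ≤ Cc ≤ 3; the diameters in the example are hypothetical and are not data from the paper.

```python
def gradation_coefficients(d10, d30, d60):
    """Return (Cu, Cc) from particle diameters at 10%, 30%, 60% passing."""
    cu = d60 / d10                   # uniformity coefficient
    cc = d30 ** 2 / (d10 * d60)      # coefficient of gradation (curvature)
    return cu, cc


def is_well_graded_gravel(cu, cc):
    """Assumed USCS criterion for GW: Cu >= 4 and 1 <= Cc <= 3."""
    return cu >= 4 and 1 <= cc <= 3


# Hypothetical diameters in mm, chosen only to illustrate the arithmetic.
cu, cc = gradation_coefficients(d10=0.5, d30=2.0, d60=6.0)
print(cu, round(cc, 3), is_well_graded_gravel(cu, cc))
```

    With these example diameters the uniformity coefficient exceeds 10 and the gradation coefficient falls between 1 and 3, the same ranges the abstract reports for scoria.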

Species Identification and Monitoring of Labeling Compliance for Commercial Pufferfish Products Sold in Korean On-line Markets (국내 온라인 유통 복어 제품의 종판별 및 표시사항 모니터링 연구)

  • Ji Young Lee;Kun Hee Kim;Tae Sun Kang
    • Journal of Food Hygiene and Safety / v.38 no.6 / pp.464-475 / 2023
  • In this study, based on an analysis of two DNA barcode markers (cytochrome c oxidase subunit I and cytochrome b genes), we performed species identification and monitored labeling compliance for 50 commercial pufferfish products sold in on-line markets in Korea. Using these barcode sequences as queries for species identification and phylogenetic analysis, we screened the GenBank database. A total of seven pufferfish species (Takifugu chinensis, T. pseudommus, T. xanthopterus, T. alboplumbeus, T. porphyreus, T. vermicularis, and Lagocephalus cheesemanii) were identified, and we detected 35 products (70%) that were non-compliant with the corresponding label information. Moreover, the labels on 12 commercial products contained only the general common name (i.e., pufferfish), but not the scientific or Korean names of the 21 edible pufferfish species. Furthermore, the proportion of mislabeled highly processed products (n = 9, 81.8%) was higher than that of simply processed products (n = 26, 66.7%). With respect to the country of origin, the percentage of mislabeled Chinese products (n = 8, 80%) was higher than that of Korean products (n = 26, 66.7%). In addition, the market and dialect names of different pufferfish species were labeled only as Jolbok or Milbok, whereas two non-edible pufferfish species (T. vermicularis and T. pseudommus) were used in six commercial pufferfish products described as Jolbok and Gumbok on their labels, which could be attributable to the complex classification system used for pufferfish. These monitoring results highlight the necessity of developing genetic methods that can identify the 21 edible pufferfish species, as well as the need for regulatory monitoring of commercial pufferfish products.