• Title/Summary/Keyword: Transcription model

Transcription Mechanism of Minute Surface Pattern in Injection Molding

  • YASUHARA Toshiyuki;KATO Kazunori;IMAMURA Hiroshi;OHTAKE Naoto
    • Proceedings of the Korean Society for Technology of Plasticity Conference / 2003.04a / pp.1-6 / 2003
  • In injection molding of optical disks, toric lenses, and similar parts, performance depends on how precisely the fine surface structure of the mold is transcribed. However, transcription behavior has not yet been clarified, because transcription takes place in a very short time and the structure is very small. In this paper, transcription properties are examined using V-grooves of various sizes machined on mold surfaces, and the following results are obtained. (1) Transcription properties were clarified experimentally: the mold temperature $T_D$ strongly influences the transcription property, and the compression applying time $t_c$ should be longer than 2.0 s for fine transcription. (2) A mechanical model of the transcription process is proposed that takes into account strain recovery due to the viscoelastic properties of the polymer. (3) Simulation results agree fairly well with the experimental ones, which means that the transcription model is useful for estimating the transcription property in advance of actual injection molding.

Phonology of Transcription (음운표기의 음운론)

  • Chung, Kook
    • Speech Sciences / v.10 no.4 / pp.23-40 / 2003
  • This paper examines the transcription of sounds from a phonological perspective. It finds that most transcriptions have been done on a segmental basis alone, without consideration of whole phonological systems and levels, and without a full understanding of the nature of linguistic and phonetic alphabets. In short, sound transcriptions have not been based on the phonology of the language and the alphabet. This study presents a phonological model for transcribing foreign and native sounds, and suggests ways of improving some current transcription systems, such as the Hangeul transcription of loan words and the romanization of Hangeul, as well as the phonetic transcription of English and other foreign languages.

Annotation of a Non-native English Speech Database by Korean Speakers

  • Kim, Jong-Mi
    • Speech Sciences / v.9 no.1 / pp.111-135 / 2002
  • An annotation model of a non-native speech database has been devised, wherein English is the target language and Korean is the native language. The proposed annotation model features overt transcription of predictable linguistic information in native speech by the dictionary entry and several predefined types of error specification found in native language transfer. In that sense, the proposed model differs from annotation models previously explored in the literature, most of which are based on native speech. The validity of the proposed model is shown by its consistent annotation of 1) salient linguistic features of English, 2) contrastive linguistic features of English and Korean, 3) actual errors reported in the literature, and 4) the data newly collected in this study. The annotation method adopts two widely accepted conventions, the Speech Assessment Methods Phonetic Alphabet (SAMPA) and the TOnes and Break Indices (ToBI). In the proposed annotation model, SAMPA is used exclusively for segmental transcription and ToBI for prosodic transcription. The annotation of non-native speech is used to assess the speaking ability of English as a Foreign Language (EFL) learners.

Rich Transcription Generation Using Automatic Insertion of Punctuation Marks (자동 구두점 삽입을 이용한 Rich Transcription 생성)

  • Kim, Ji-Hwan
    • MALSORI / no.61 / pp.87-100 / 2007
  • A punctuation generation system that combines prosodic information with acoustic and language model information is presented. Experiments were first conducted on reference text transcriptions. In these experiments, prosodic information was shown to be more useful than language model information. When these information sources are combined, an F-measure of up to 0.7830 was obtained for adding punctuation to a reference transcription. This method of punctuation generation can also be applied to the 1-best output of a speech recogniser. The 1-best output is first time-aligned, and prosodic features are generated from the time alignment information. As in the approach applied to reference transcriptions, the best sequence of punctuation marks for the 1-best output is found using the prosodic feature model and a language model trained on texts that contain punctuation marks.
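The F-measure quoted in this abstract is the harmonic mean of precision and recall. The following is a minimal sketch of how such a score could be computed, assuming punctuation is compared slot by slot after each word and that an empty string means "no mark". The function name and the toy sequences are hypothetical, and the paper's exact scoring rules may differ.

```python
def punctuation_f_measure(reference, hypothesis):
    """F-measure over per-slot punctuation labels ('' means no mark)."""
    if len(reference) != len(hypothesis):
        raise ValueError("sequences must have the same length")
    # A slot counts as correct only if a mark is present and matches.
    correct = sum(1 for r, h in zip(reference, hypothesis) if r and r == h)
    n_ref = sum(1 for r in reference if r)   # marks in the reference
    n_hyp = sum(1 for h in hypothesis if h)  # marks proposed by the system
    if correct == 0:
        return 0.0
    precision = correct / n_hyp
    recall = correct / n_ref
    return 2 * precision * recall / (precision + recall)


# Toy data (illustrative only): one label per word slot.
ref = [",", "", ".", "", "?"]
hyp = [",", "", "", ".", "?"]
score = punctuation_f_measure(ref, hyp)
print(f"F-measure: {score:.4f}")
```

In this toy case two of the three proposed marks match the reference, so precision and recall are both 2/3.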

Korean Broadcast News Transcription Using Morpheme-based Recognition Units

  • Kwon, Oh-Wook;Alex Waibel
    • The Journal of the Acoustical Society of Korea / v.21 no.1E / pp.3-11 / 2002
  • Broadcast news transcription is one of the hardest tasks in speech recognition because broadcast speech signals vary widely in speech quality, channel, and background conditions. We developed a Korean broadcast news speech recognizer. We used a morpheme-based dictionary and language model to reduce the out-of-vocabulary (OOV) rate. We concatenated original morpheme pairs of short length or high frequency in order to reduce insertion and deletion errors due to short morphemes. We used a lexicon with multiple pronunciations to reflect inter-morpheme pronunciation variations without severe modification of the search tree. By using the merged morphemes as recognition units, we achieved an OOV rate of 1.7% with a 64k vocabulary, comparable to that of European languages. We implemented a hidden Markov model-based recognizer with vocal tract length normalization and online speaker adaptation by maximum likelihood linear regression. Experimental results showed that the recognizer yielded a 21.8% morpheme error rate for anchor speech and 31.6% for mostly noisy reporter speech.
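The OOV rate mentioned in this abstract is the fraction of running tokens in a test text that are missing from the recognizer's vocabulary. The sketch below illustrates the metric only; the function name and the toy English tokens are assumptions for illustration, not the morpheme units or data used in the paper.

```python
def oov_rate(tokens, vocabulary):
    """Fraction of running tokens that are not in the vocabulary."""
    if not tokens:
        return 0.0
    unknown = sum(1 for token in tokens if token not in vocabulary)
    return unknown / len(tokens)


# Toy data (illustrative only).
vocabulary = {"the", "news", "is", "on"}
tokens = ["the", "news", "anchor", "is", "on", "air"]
rate = oov_rate(tokens, vocabulary)  # 2 of 6 tokens are unknown
print(f"OOV rate: {rate:.1%}")
```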

Statistical Analysis Between Size and Balance of Text Corpus by Evaluation of the effect of Interview Sentence in Language Modeling (언어모델 인터뷰 영향 평가를 통한 텍스트 균형 및 사이즈간의 통계 분석)

  • Jung Eui-Jung;Lee Youngjik
    • Proceedings of the Acoustical Society of Korea Conference / spring / pp.87-90 / 2002
  • This paper statistically analyzes the relationship between the size and balance of a text corpus by evaluating the effect of interview sentences on the language model of a Korean broadcast news transcription system. The ultimate purpose of our system is to recognize not interview speech but the speech of anchors and reporters in broadcast news shows. However, interview sentences make up approximately 15% of the text corpus gathered for constructing the language model. Interview sentences differ in character from anchor and reporter sentences, and therefore disturb anchor- and reporter-oriented language modeling. We evaluate the effect of interview sentences on the language model and statistically analyze the relationship between corpus size and balance by repeating the same experimental procedure while varying the size of the corpus.

A study on improving the performance of the machine-learning based automatic music transcription model by utilizing pitch number information (음고 개수 정보 활용을 통한 기계학습 기반 자동악보전사 모델의 성능 개선 연구)

  • Daeho Lee;Seokjin Lee
    • The Journal of the Acoustical Society of Korea / v.43 no.2 / pp.207-213 / 2024
  • In this paper, we study how to improve the performance of a machine learning-based automatic music transcription model by adding musical information to the input data. The added musical information is the number of pitches that occur in each time frame, obtained by counting the number of notes activated in the ground-truth label. The pitch number information is concatenated to the log mel-spectrogram, which is the input of the existing model. We use an automatic music transcription model that includes four types of blocks, each predicting one of four types of musical information, and demonstrate that the simple method of adding, to the existing input, the pitch number information corresponding to the musical information predicted by each block helps in training the model. To evaluate the performance improvement, we conducted an experiment using the MIDI Aligned Piano Sounds (MAPS) data. When all pitch number information was used, performance improved by 9.7 % in frame-based F1 score and by 21.8 % in note-based F1 score including offset.
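The input augmentation described in this abstract can be sketched as follows: count the active notes in each frame of the label and append that count as one extra feature next to the log mel-spectrogram frame. This is a minimal sketch under stated assumptions, not the authors' implementation. The function name, the frame-by-pitch label layout, and the toy sizes are hypothetical, and only a single pitch count per frame is shown rather than one per prediction block.

```python
def append_pitch_count(log_mel, piano_roll):
    """Append the per-frame number of active pitches to each feature frame.

    log_mel:    list of frames, each a list of log mel-spectrogram values
    piano_roll: list of frames, each a list of 0/1 pitch activations
    """
    if len(log_mel) != len(piano_roll):
        raise ValueError("inputs must have the same number of frames")
    features = []
    for mel_frame, label_frame in zip(log_mel, piano_roll):
        # Number of pitches sounding in this frame, taken from the label.
        pitch_count = sum(1 for active in label_frame if active)
        features.append(list(mel_frame) + [float(pitch_count)])
    return features


# Toy data (illustrative only): 3 frames, 4 mel bins, 5 pitches.
log_mel = [[-1.0, -2.0, -0.5, -3.0],
           [-0.2, -1.1, -0.9, -2.5],
           [-4.0, -4.0, -4.0, -4.0]]
piano_roll = [[0, 1, 0, 1, 0],
              [1, 1, 1, 0, 0],
              [0, 0, 0, 0, 0]]
features = append_pitch_count(log_mel, piano_roll)
print([frame[-1] for frame in features])
```

Each output frame has one more value than the input frame, and the last value is the pitch count for that frame.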

An estimation method for stochastic reaction model (확률적 방법에 기반한 화학 반응 모형의 모수 추정 방법)

  • Choi, Boseung
    • Journal of the Korean Data and Information Science Society / v.26 no.4 / pp.813-826 / 2015
  • This research deals with an estimation method for the kinetic reaction model. The kinetic reaction model explains spreading or changing processes based on interactions between species in biochemistry. It can be applied to disease spreading as well as to systems biology. In this research, we assume that the spread of species is stochastic and construct the reaction model based on stochastic movement. We use the Gillespie algorithm to construct the likelihood function, and introduce a Bayesian estimation method using Markov chain Monte Carlo methods that produces more stable results. We applied the Bayesian estimation method to the Lotka-Volterra model and a gene transcription model and obtained more stable estimation results.
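For readers unfamiliar with the Gillespie algorithm named in this abstract, the following minimal sketch simulates one stochastic path of a Lotka-Volterra system. It shows forward simulation only, not the likelihood construction or the Bayesian estimation from the paper. The three-reaction formulation (prey birth, predation, predator death), the function name, and all parameter values are assumptions chosen for illustration.

```python
import random


def gillespie_lotka_volterra(prey, predators, c1, c2, c3, t_end, seed=0):
    """Simulate one path of a stochastic Lotka-Volterra model.

    Assumed reactions and rates:
      prey birth:     X -> 2X       rate c1 * X
      predation:      X + Y -> 2Y   rate c2 * X * Y
      predator death: Y -> (none)   rate c3 * Y
    """
    rng = random.Random(seed)
    t = 0.0
    path = [(t, prey, predators)]
    while t < t_end:
        rates = [c1 * prey, c2 * prey * predators, c3 * predators]
        total = sum(rates)
        if total == 0.0:
            break  # both species extinct, nothing can happen
        # Waiting time to the next reaction is exponential with rate `total`.
        t += rng.expovariate(total)
        if t > t_end:
            break
        # Pick which reaction fires, with probability proportional to its rate.
        u = rng.random() * total
        if u < rates[0]:
            prey += 1
        elif u < rates[0] + rates[1]:
            prey -= 1
            predators += 1
        else:
            predators -= 1
        path.append((t, prey, predators))
    return path


# Illustrative parameters only.
path = gillespie_lotka_volterra(50, 100, c1=1.0, c2=0.005, c3=0.6, t_end=5.0, seed=1)
print(len(path), path[-1])
```

Fixing the seed makes a run reproducible, which is convenient when such simulated paths are used to check an estimation procedure.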

Establishment of a Pancreatic Cancer Stem Cell Model Using the SW1990 Human Pancreatic Cancer Cell Line in Nude Mice

  • Pan, Yan;Gao, Song;Hua, Yong-Qiang;Liu, Lu-Ming
    • Asian Pacific Journal of Cancer Prevention / v.16 no.2 / pp.437-442 / 2015
  • Aim: To establish a pancreatic cancer stem cell model using human pancreatic cancer cells in nude mice to provide a platform for pancreatic cancer stem cell research. Materials and Methods: To establish pancreatic cancer xenografts using the human pancreatic cancer cell line SW1990, nude mice were randomly divided into control and gemcitabine groups. When the tumors grew to a volume of $125\ \mathrm{mm}^3$, mice in the gemcitabine group were treated with gemcitabine at a dose of 50 mg/kg by intraperitoneal injection of 0.2 ml, while mice in the control group were treated with the same volume of normal saline. Gemcitabine was given 2 times a week for 3 times. When the model was established, the proliferation of pancreatic cancer stem cells was observed by clone formation assay, and the protein and/or mRNA expression of pancreatic stem cell surface markers (CD24, CD44, CD133, and ALDH), the transcription factors Oct-4, Sox-2, and Nanog, and Gli, the key nuclear transcription factor in the Sonic Hedgehog signaling pathway, was detected by Western blot and/or RT-PCR to verify the reliability of the model. Results: The model is feasible and safe. During its establishment, no mice died and the weight of the nude mice remained above 16.5 g. The clone forming ability in the gemcitabine group was stronger than that of the control group (p<0.01). In the gemcitabine group, the protein expression of the pancreatic cancer stem cell surface markers CD44 and ALDH was up-regulated, and the protein and mRNA expression of the nuclear transcription factors Oct-4, Sox-2, and Nanog was also significantly increased (p<0.01). In addition, the protein expression of Gli-1, the key nuclear transcription factor in the Sonic Hedgehog signaling pathway, was significantly enhanced (p<0.01). Conclusions: The pancreatic cancer stem cell model was successfully established using the human pancreatic cancer cell line SW1990 in nude mice. Gemcitabine could enrich pancreatic cancer stem cells, accompanied by activation of the Sonic Hedgehog signaling pathway.

Reducing latency of neural automatic piano transcription models (인공신경망 기반 저지연 피아노 채보 모델)

  • Dasol Lee;Dasaem Jeong
    • The Journal of the Acoustical Society of Korea / v.42 no.2 / pp.102-111 / 2023
  • Automatic Music Transcription (AMT) is the task of detecting and recognizing musical note events in a given audio recording. In this paper, we focus on reducing the latency of real-time AMT systems for piano music. Although neural AMT models have been adapted for real-time piano transcription, they suffer from high latency, which hinders their usefulness in interactive scenarios. To tackle this issue, we explore several techniques for reducing the intrinsic latency of a neural network for piano transcription, including reducing the window and hop sizes of the Fast Fourier Transform (FFT), modifying the kernel size of convolutional layers, and shifting the labels along the time axis to train the model to predict onsets earlier. Our experiments demonstrate that combining these approaches can lower latency while maintaining high transcription accuracy. Specifically, our modified model achieved note F1 scores of 92.67 % and 90.51 % with latencies of 96 ms and 64 ms, respectively, compared to the baseline model's note F1 score of 93.43 % with a latency of 160 ms. This methodology has potential for training AMT models for various interactive scenarios, including providing real-time feedback for piano education.