• Title/Summary/Keyword: Audio Data

New Automatic Taxonomy Generation Algorithm for the Audio Genre Classification (음악 장르 분류를 위한 새로운 자동 Taxonomy 구축 알고리즘)

  • Choi, Tack-Sung;Moon, Sun-Kook;Park, Young-Cheol;Youn, Dae-Hee;Lee, Seok-Pil
    • The Journal of the Acoustical Society of Korea / v.27 no.3 / pp.111-118 / 2008
  • In this paper, we propose a new automatic taxonomy generation algorithm for audio genre classification. The proposed algorithm automatically generates a hierarchical taxonomy based on the estimated classification accuracy at all possible nodes. The classification accuracy is estimated by applying the training data to a classifier using k-fold cross validation. The classification accuracy is then tested at every node, each of which consists of two clusters, by applying a one-versus-one support vector machine. To assess the performance of the proposed algorithm, we extracted various features that represent characteristics such as timbre, rhythm, and pitch. We then compared the classification performance of the proposed algorithm with that of previous flat classifiers. The classification accuracy reaches 89 percent with the proposed scheme, which is 5 to 25 percent higher than the previous flat classification methods. With low-dimensional feature vectors in particular, it is 10 to 25 percent higher than previous algorithms.
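
The accuracy-driven choice of a two-cluster node described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: a nearest-centroid classifier stands in for the one-versus-one support vector machine, and the function names (`cv_accuracy`, `best_split`) and the toy two-dimensional features in `demo` are assumptions made only for illustration.

```python
from itertools import combinations
from statistics import fmean


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def cv_accuracy(samples, labels, k=5):
    """Estimate classification accuracy with k-fold cross validation.

    A nearest-centroid classifier is used here as a stand-in for the SVM.
    """
    n = len(samples)
    correct = 0
    for fold in range(k):
        test_idx = range(fold, n, k)  # every k-th sample forms one fold
        held_out = set(test_idx)
        train_idx = [i for i in range(n) if i not in held_out]
        centroids = {}
        for lab in {labels[i] for i in train_idx}:
            rows = [samples[i] for i in train_idx if labels[i] == lab]
            centroids[lab] = [fmean(col) for col in zip(*rows)]
        for i in test_idx:
            pred = min(centroids, key=lambda lab: sq_dist(samples[i], centroids[lab]))
            correct += pred == labels[i]
    return correct / n


def best_split(data, k=5):
    """Try every two-cluster split of the genres; keep the most separable one.

    data maps a genre name to a list of feature vectors.
    """
    genres = sorted(data)
    best = None
    for size in range(1, len(genres) // 2 + 1):
        for left in combinations(genres, size):
            right = tuple(g for g in genres if g not in left)
            samples, labels = [], []
            for g in genres:
                for vec in data[g]:
                    samples.append(vec)
                    labels.append(0 if g in left else 1)
            acc = cv_accuracy(samples, labels, k)
            if best is None or acc > best[0]:
                best = (acc, left, right)
    return best


# Toy features (assumed): "metal" lies far from the other two genres.
demo = {
    "classical": [(0.1 * i, 0.1 * i) for i in range(10)],
    "jazz": [(1 + 0.1 * i, 1 + 0.1 * i) for i in range(10)],
    "metal": [(10 + 0.1 * i, 10 + 0.1 * i) for i in range(10)],
}
accuracy, left, right = best_split(demo)
print(left, right, accuracy)
```

Applying `best_split` recursively to each resulting cluster would yield a hierarchy; how the paper actually builds and prunes its taxonomy is not specified in the abstract.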

Method of extracting context from media data by using video sharing site

  • Kondoh, Satoshi;Ogawa, Takeshi
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2009.01a / pp.709-713 / 2009
  • Recently, much research has applied data acquired from devices such as cameras and RFID tags to context-aware services in the fields of life-logging and sensor networks. Because video and audio data contain a larger volume of information than other sensor data, a variety of analytical techniques have been proposed to recognize information from the raw data. However, because these techniques generally use supervised learning, a huge amount of media data has had to be watched again manually to create supervised data whenever a class is updated or a new class is added. As a result, applications could in most cases use only recognition functions based on fixed supervised data. We therefore propose a method of acquiring supervised data from a video sharing site where users comment on individual video scenes, because such sites are remarkably popular and generate many comments. In the first step of this method, words with a high utility value are extracted by filtering the comments on the video. Second, a set of time-series feature data is calculated by applying functions that extract various features to the media data. Finally, our learning system calculates the correlation coefficient between these two kinds of data and stores it in the system's database. By applying this correlation coefficient to new media data, various other applications can include a recognition function that draws on the collective intelligence of Web comments. In addition, regularly acquiring and learning both media data and comments from a video sharing site enables flexible recognition that adapts to new objects while reducing manual work. As a result, the system can recognize not only the name of the object seen but also indirect information, such as the impression of or action toward the object.

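
The third step of the method above, correlating comment words with time-series features, can be sketched as follows. The abstract does not say which correlation coefficient is used; the Pearson coefficient is assumed here, and the word "goal", the loudness values, and the dictionary standing in for the system's database are all hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant series carries no correlation information
    return cov / sqrt(var_x * var_y)


# Hypothetical data: occurrences of the word "goal" in comments per time
# window, and an audio loudness feature computed over the same windows.
word_counts = [0, 0, 3, 5, 0, 1]
loudness = [0.1, 0.2, 0.8, 0.9, 0.1, 0.3]

# A plain dict stands in for the system's database of coefficients.
correlation_db = {("goal", "loudness"): pearson(word_counts, loudness)}
print(correlation_db)
```

A coefficient near 1 suggests that the feature rises when viewers use the word, which is the kind of association the method stores for later recognition.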

Multimodal Approach for Summarizing and Indexing News Video

  • Kim, Jae-Gon;Chang, Hyun-Sung;Kim, Young-Tae;Kang, Kyeong-Ok;Kim, Mun-Churl;Kim, Jin-Woong;Kim, Hyung-Myung
    • ETRI Journal / v.24 no.1 / pp.1-11 / 2002
  • A video summary abstracts the gist of an entire video and also enables efficient access to the desired content. In this paper, we propose a novel method for summarizing news video based on multimodal analysis of the content. The proposed method exploits the closed caption data to locate semantically meaningful highlights in a news video, and the speech signals in the audio stream to align the closed caption data with the video on a timeline. The detected highlights are then described using the MPEG-7 Summarization Description Scheme, which allows efficient browsing of the content through such functionalities as multi-level abstracts and navigation guidance. Multimodal search and retrieval are also supported within the proposed framework. By indexing synchronized closed caption data, the video clips become searchable with a text query. Intensive experiments with prototypical systems are presented to demonstrate the validity and reliability of the proposed method in real applications.

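
The text-query retrieval over synchronized closed captions mentioned above can be pictured as an inverted index from caption words to time-aligned clips. The sketch below is only an illustration under assumed data: the caption tuples, the function names, and the simple tokenization are hypothetical, and the paper's actual indexing scheme is not described in the abstract.

```python
from collections import defaultdict

PUNCTUATION = ".,!?;:\"'"


def tokenize(text):
    """Lowercase the text and strip surrounding punctuation from each word."""
    words = (w.strip(PUNCTUATION) for w in text.lower().split())
    return [w for w in words if w]


def build_index(captions):
    """Map each caption word to the ids of the clips in which it occurs.

    captions is a list of (start_sec, end_sec, text) tuples, assumed to be
    already aligned with the video timeline.
    """
    index = defaultdict(set)
    for clip_id, (_, _, text) in enumerate(captions):
        for word in tokenize(text):
            index[word].add(clip_id)
    return index


def search(index, captions, query):
    """Return the (start, end) times of clips containing every query word."""
    words = tokenize(query)
    if not words:
        return []
    hits = set.intersection(*(index.get(w, set()) for w in words))
    return [captions[i][:2] for i in sorted(hits)]


# Hypothetical time-aligned captions from a news programme.
captions = [
    (0.0, 12.0, "Good evening, here are the headlines."),
    (12.0, 20.5, "The election results were announced today."),
    (20.5, 31.0, "In sports, the final results are in."),
]
index = build_index(captions)
print(search(index, captions, "election results"))
```

Because each index entry points back to a start and end time, a matching query can jump directly to the corresponding clip.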

Living as Severe COPD Patient - Life of Stepping on the Thin Ice (중증 만성폐쇄성 폐질환 환자로 살아가기 -살얼음판 위를 걸어가는 삶-)

  • Kim, Sung-Reul;Kim, Yun-Ok;Kwon, Kyoung-Min
    • Korean Journal of Adult Nursing / v.22 no.6 / pp.663-675 / 2010
  • Purpose: The purpose of this study was to explore the life experiences of patients with severe chronic obstructive pulmonary disease (COPD). Methods: The data were collected through in-depth interviews with six patients suffering from severe COPD. The interviews were audio-recorded, transcribed verbatim, and checked for accuracy. The Giorgi method of phenomenology was used to analyze the data. Results: Eight themes, forming the units of meaning, emerged: Repeated and Unpredictable Suffering of Dyspnea, Confidence Loss/Exhausting Life due to Inefficient Breathing, Gradually Deprived Liberty, Absolute Being for Sustaining My Life, Source of Burden but Significant Person I Am in the Family, Endless Tug-of-War: Capability/Endeavor to Breathe, Longing for My Life, and Dead-End of Breathing. Conclusion: The study results provide an in-depth understanding of the life experiences of patients suffering from severe COPD. The findings will be useful to nurses caring for this population.

Nurses Experience of Caring for Dying Patients in Hospitals (임종환자를 돌보는 병원간호사의 경험: 감정에 충실하면서 자신 추스르기)

  • 이명선
    • Journal of Korean Academy of Nursing / v.33 no.5 / pp.553-561 / 2003
  • Purpose: To develop a substantive theory that represents hospital nurses' experience of caring for dying patients. Method: The grounded theory method guided the data collection and analysis. A purposeful sample of 15 hospital nurses participated during 2001-2002. The data were collected through semi-structured individual interviews. All interviews were audiotaped and transcribed verbatim. Constant comparative analysis was employed to analyze the data. Result: 'Putting oneself into shape while being faithful to feelings and emotions' emerged as the basic social-psychological process. Three different phases were identified: being faithful to one's own feelings and behaviors; putting oneself into shape; and mourning death. The first phase includes the categories of 'establishing trust relationships' and 'sympathizing with dying patients and their family members.' The second phase consists of 'controlling feelings,' 'adjusting ethical conflicts,' 'providing best patient-care,' and 'helping family accept the death.' The third phase consists of 'overcoming sadness' and 'releasing other negative feelings.' Conclusion: The results of this study will help health professionals develop efficient support programs for nurses caring for dying patients in hospitals. Further study is needed to verify the findings.

An Efficient Guitar Chords Classification System Using Transfer Learning (전이학습을 이용한 효율적인 기타코드 분류 시스템)

  • Park, Sun Bae;Lee, Ho-Kyoung;Yoo, Do Sik
    • Journal of Korea Multimedia Society / v.21 no.10 / pp.1195-1202 / 2018
  • Artificial neural networks are widely used for their excellent performance and ease of implementation. However, a traditional neural network must be trained from scratch whenever new input data are added, the observation environment varies, or the form of the input/output data changes. To resolve this problem, the technique of transfer learning has been proposed. Transfer learning constructs a new target system by partially updating an existing system and hence provides a much more efficient learning process. Until now, transfer learning has mainly been studied in the field of image processing and is not yet widely employed in acoustic data processing. In this paper, focusing on the scalability of transfer learning, we apply it to the problem of guitar chord classification and evaluate its performance. For this purpose, we build a target system, a convolutional neural network (CNN) based classifier for 48 guitar chords, by applying transfer learning to a source system, a CNN-based classifier for 24 guitar chords. We show that the system with transfer learning performs similarly to a conventional system but requires only half the learning time.

Ground station Baseband Controller(GBC) Development of STSAT-2 (과학기술위성2호 관제를 위한 Ground station Baseband Controller(GBC) 개발)

  • Oh, Dae-Soo;Oh, Seung-Han;Park, Hong-Young;Kim, Kyung-Hee;Cha, Won-Ho;Lim, Chul-Woo
    • The Transactions of the Korean Institute of Electrical Engineers D / v.54 no.8 / pp.482-485 / 2005
  • STSAT-2 is the first satellite scheduled to be launched by the first Korean launcher. The Ground station Baseband Controller (GBC) for operating STSAT-2 is now under development. The GBC controls the data flow path between the satellite operation computers and the ground station antennas, counts the received data packets in the demodulated audio signals from three antennas, and automatically sets the data flow path to the antenna with good reception. Two uplink FSK modulators (1.2 kbps, 9.6 kbps) and six downlink FSK demodulators (9.6 kbps, 38.4 kbps) are embedded in the GBC. The STSAT-2 GBC hardware is simpler than that of the STSAT-1 GBC because all digital logic is implemented in an FPGA. Testing and debugging of the GBC hardware and software (FPGA code and GBC Manager Program) are progressing well at SaTReC, KAIST. This paper introduces the GBC structure, functions, and test results.
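
The automatic path selection described above, which counts received packets per antenna and routes data through the one with good reception, might look like the following in software terms. This is a hypothetical sketch only: the real GBC implements its logic in an FPGA, and the antenna names, the tie-breaking rule, and the function name are assumptions.

```python
def select_antenna(packet_counts, current=None):
    """Pick the antenna whose demodulated signal yielded the most packets.

    packet_counts maps an antenna name to the number of data packets
    received in the last counting interval. If the currently selected
    antenna ties for the best count, it is kept to avoid needless switching
    (an assumed policy, not taken from the paper).
    """
    best = max(packet_counts.values())
    if current is not None and packet_counts.get(current) == best:
        return current
    # Otherwise take the first antenna (in dict order) with the best count.
    return next(name for name, count in packet_counts.items() if count == best)


# Hypothetical counts from the three ground station antennas.
counts = {"antenna_1": 118, "antenna_2": 240, "antenna_3": 240}
print(select_antenna(counts))                       # no antenna selected yet
print(select_antenna(counts, current="antenna_3"))  # keeps the tied antenna
```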

Music Genre Classification using Time Delay Neural Network (시간 지연 신경망을 이용한 음악 장르 분류)

  • 이재원;조찬윤;김상균
    • Journal of Korea Multimedia Society / v.4 no.5 / pp.414-422 / 2001
  • This paper proposes a music genre classifier using a time delay neural network (TDNN) for audio data retrieval systems. The classifier considers eight genres: Blues, Country, Hard Core, Hard Rock, Jazz, R&B (Soul), Techno, and Thrash Metal. The unit of comparison for classifying the genres is the melody between bars. The melody pattern is extracted based on the snare drum sound, which effectively represents the periodicity of the rhythm. The classifier is constructed with the TDNN and uses the Fourier-transformed feature vector of the melody as its input pattern. We tested the classifier on eighty training samples (ten pieces of music per genre) and forty test samples (five pieces of music per genre), and obtained correct classification rates of 92.5% and 60%, respectively.

Ground station Baseband Controller(GBC) Development of STSAT-2 (과학기술위성2호 관제를 위한 Ground station Baseband Controller(GBC) 개발)

  • Oh, Dae-Soo;Oh, Seung-Han;Park, Hong-Young;Kim, Kyung-Hee;Cha, Won-Ho;Lim, Chul-Woo;Ryu, Chang-Wan;Hwang, Dong-Hwan
    • Proceedings of the KIEE Conference / 2005.05a / pp.116-118 / 2005
  • STSAT-2 is the first satellite scheduled to be launched by the first Korean launcher. The Ground station Baseband Controller (GBC) for operating STSAT-2 after launch is now under development. The GBC controls the data flow path between the satellite operation computers and the ground station antennas. It also counts the received data packets in the demodulated audio signals from three antennas and automatically sets the data flow path to the antenna with good reception. Two uplink FSK modulators (1.2 kbps, 9.6 kbps) and six downlink FSK demodulators (9.6 kbps, 38.4 kbps) are embedded in the GBC. The STSAT-2 GBC hardware is simpler than that of the STSAT-1 GBC because all digital logic is implemented in an FPGA. Testing and debugging of the GBC hardware and software (FPGA code and GBC Manager Program) are progressing well at SaTReC, KAIST. This paper introduces the GBC structure, functions, and test results.

A Dynamic Packet Recovery Mechanism for Realtime Service in Mobile Computing Environments

  • Park, Kwang-Roh;Oh, Yeun-Joo;Lim, Kyung-Shik;Cho, Kyoung-Rok
    • ETRI Journal / v.25 no.5 / pp.356-368 / 2003
  • This paper analyzes the characteristics of packet losses in mobile computing environments based on the Gilbert model and then describes a mechanism that can recover lost audio packets using redundant data. Using information periodically reported by the receiver, the sender dynamically adjusts the amount and offset values of the redundant data under the constraint of minimizing the bandwidth consumption of wireless links. Since mobile computing environments can often be characterized by frequent and consecutive packet losses, a loss recovery mechanism needs to deal efficiently with both random and consecutive packet losses. To achieve this, the suggested mechanism uses relatively large, discontinuous exponential offset values. This gives the same effect as using both sequential and interleaved redundant information. To verify the effectiveness of the mechanism, we extended and implemented RTP/RTCP and applications. The experimental results show that our mechanism, with an exponential offset, achieves a remarkably low complete packet loss rate and adapts dynamically to fluctuations in the packet loss pattern in mobile computing environments.

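
To make the idea of offset-based redundancy concrete, the sketch below simulates bursty losses with a two-state Gilbert model and measures how many lost packets remain unrecoverable for a given set of offsets. It is a simplified illustration, not the paper's mechanism: the transition probabilities, the offset sets, and the assumption that packet n carries a full copy of packet n - o for each offset o are all hypothetical, and the sender-side adaptation via receiver reports is omitted.

```python
import random


def gilbert_losses(n, p_good_to_bad, p_bad_to_good, seed=0):
    """Simulate n packets under a two-state Gilbert loss model.

    Returns a list of booleans, True where the packet was lost. Packets
    sent while the channel is in the bad state are lost.
    """
    rng = random.Random(seed)
    bad = False
    lost = []
    for _ in range(n):
        if bad:
            bad = rng.random() >= p_bad_to_good  # stay bad unless we recover
        else:
            bad = rng.random() < p_good_to_bad
        lost.append(bad)
    return lost


def residual_loss(lost, offsets):
    """Fraction of packets that stay lost after redundancy recovery.

    Packet i can be recovered if any later packet i + o (o in offsets)
    arrived, because that packet is assumed to carry a copy of packet i.
    """
    n = len(lost)
    unrecovered = 0
    for i, is_lost in enumerate(lost):
        if not is_lost:
            continue
        carriers = (i + o for o in offsets)
        if not any(j < n and not lost[j] for j in carriers):
            unrecovered += 1
    return unrecovered / n


# A burst of three consecutive losses: a single offset of 1 recovers only
# the last packet of the burst, while offsets 1, 2, 4 reach past the burst.
burst = [False, True, True, True, False, False, False, False]
print(residual_loss(burst, (1,)), residual_loss(burst, (1, 2, 4)))

# Simulated channel with assumed transition probabilities.
trace = gilbert_losses(10_000, p_good_to_bad=0.05, p_bad_to_good=0.4)
print(sum(trace) / len(trace), residual_loss(trace, (1, 2, 4)))
```

Spreading the copies over exponentially growing offsets lets some of them land beyond a loss burst, which is the intuition behind combining sequential and interleaved redundancy.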