• Title/Summary/Keyword: Audio information


Spontaneous Speech Emotion Recognition Based On Spectrogram With Convolutional Neural Network (CNN 기반 스펙트로그램을 이용한 자유발화 음성감정인식)

  • Guiyoung Son;Soonil Kwon
    • The Transactions of the Korea Information Processing Society
    • /
    • v.13 no.6
    • /
    • pp.284-290
    • /
    • 2024
  • Speech emotion recognition (SER) is a technique that is used to analyze the speaker's voice patterns, including vibration, intensity, and tone, to determine their emotional state. There has been an increase in interest in artificial intelligence (AI) techniques, which are now widely used in medicine, education, industry, and the military. Nevertheless, existing researchers have attained impressive results by utilizing acted-out speech from skilled actors in a controlled environment for various scenarios. In particular, there is a mismatch between acted and spontaneous speech since acted speech includes more explicit emotional expressions than spontaneous speech. For this reason, spontaneous speech-emotion recognition remains a challenging task. This paper aims to conduct emotion recognition and improve performance using spontaneous speech data. To this end, we implement deep learning-based speech emotion recognition using the VGG (Visual Geometry Group) after converting 1-dimensional audio signals into a 2-dimensional spectrogram image. The experimental evaluations are performed on the Korean spontaneous emotional speech database from AI-Hub, consisting of 7 emotions, i.e., joy, love, anger, fear, sadness, surprise, and neutral. As a result, we achieved an average accuracy of 83.5% and 73.0% for adults and young people using a time-frequency 2-dimension spectrogram, respectively. In conclusion, our findings demonstrated that the suggested framework outperformed current state-of-the-art techniques for spontaneous speech and showed a promising performance despite the difficulty in quantifying spontaneous speech emotional expression.
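
The conversion of a 1-dimensional audio signal into a 2-dimensional time-frequency spectrogram, mentioned in the abstract above, can be illustrated with a minimal sketch. This is not the authors' pipeline: the frame length, hop size, Hann window, test tone, and the function name `spectrogram` are all assumptions chosen for illustration, and only the Python standard library is used.

```python
import cmath
import math

def spectrogram(signal, frame_len=64, hop=32):
    """Log-magnitude spectrogram: a list of frames, each a list of frequency bins."""
    # Hann window (assumed choice) to reduce leakage at the frame edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_len - 1))
              for n in range(frame_len)]
    n_bins = frame_len // 2 + 1  # non-negative frequencies only
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        chunk = [signal[start + n] * window[n] for n in range(frame_len)]
        row = []
        for k in range(n_bins):
            # Naive DFT of bin k; fine for a sketch, too slow for real audio.
            acc = sum(chunk[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n in range(frame_len))
            row.append(20 * math.log10(abs(acc) + 1e-10))  # magnitude in dB
        frames.append(row)
    return frames

# Hypothetical input: a 1000 Hz tone sampled at 8000 Hz.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 1000 * n / sample_rate) for n in range(512)]
spec = spectrogram(tone)
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
```

The resulting 2-D array (frames by frequency bins) is what an image-based network such as VGG would take as input after resizing; a practical implementation would use an FFT library rather than the naive DFT shown here.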

A Feasibility Study of AMT Application to Tidal Flat Sedimentary Layer (갯벌 지역의 하부퇴적층에 대한 AMT 탐사의 적용 가능성 평가)

  • Kwon, Byung-Doo;Lee, Choon-Ki;Park, Gye-Soon;Choi, Su-Young;Yoo, Hee-Young;Choi, Jong-Keun;Eom, Joo-Young
    • Journal of the Korean earth science society
    • /
    • v.28 no.1
    • /
    • pp.64-74
    • /
    • 2007
  • Marine seismic prospecting using a research vessel in the shallow sea near the coast is limited by water depth and the survey environment. Also, an electrical resistivity survey in a seashore area may need a specially designed high-voltage source to penetrate the very conductive surface layer. Therefore, we have conducted a feasibility study on applying the magnetotelluric (MT) method, a passive geophysical method, to investigating the geology of shallow marine environments. Our study involves both theoretical modeling and a field survey at a tidal flat area, which represents a very shallow marine environment. We applied the audio-frequency magnetotelluric (AMT) method to the intertidal deposits of Gunhung Bay, west coast of Korea, and analysed the field data both qualitatively and quantitatively to investigate the morphology and sedimentary stratigraphy of the tidal flat. The inversion of the AMT data clearly reveals the upper sedimentary layer of Holocene intertidal sediments, 13-20 m thick, and the erosional patterns at the unconformable contact boundary. However, the AMT inversion results tend to overestimate the depth of the basement (30-50 m) compared with the seismic section (27-33 m). Since MT responses are not very sensitive to the resistivity of the middle layer or the depth of the basement, the AMT inversion result for the basement may have to be adjusted by comparison with other geophysical information, such as a seismic section or logging data, where available. Nevertheless, the AMT method can be an effective alternative for investigating seashore areas to obtain important basic information such as the depositional environment of the tidal flat, sea-water intrusion, and the basement structure near the shore.

Design and Implementation of Web Based Instruction Based on Constructivism for Self-Directed Learning Ability (구성주의 이론에 기반한 자기주도적 웹 기반 교육의 설계와 구현)

  • Kim Gi-Nam;Kim Eui-Jeong;Kim Chang-Suk
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference
    • /
    • 2006.05a
    • /
    • pp.855-858
    • /
    • 2006
  • Advances in information technology are changing the paradigm in many fields, including education. Students can choose their own learning goals and objects and acquire not an accumulation of knowledge but a method of learning. Teachers become advisers, and students play the key role in learning; that is, students are the subject of learning. Constructivism emphasizes a student-oriented educational environment, which corresponds to the characteristics of hypermedia. The Internet makes it possible to put constructivism into practice: Web Based Instruction provides a suitable environment for practicing constructivism and drives change in the education system. Because Web Based Instruction motivates students to learn more, they can obtain plenty of information regardless of place or time. They can also consult more up-to-date information for their learning using hypermedia such as images, audio, video, and text, and communicate effectively with their instructor through a bulletin board, e-mail, chat, and so on. Schools and instructors have been working to develop new teaching models to cope with this changing environment. In this thesis, 'Design and Implementation of Web Based Instruction Based on Constructivism', we provide learner-oriented, indexed online video lessons that give learners the chance for self-directed learning. Learners do not have to cover all the contents of a lesson but can choose the contents they want from an indexed list, and they can search for contents with a 'Keyword Search' on the main page, which can improve learner achievement.


Development of Music Recommendation System based on Customer Sentiment Analysis (소비자 감성 분석 기반의 음악 추천 알고리즘 개발)

  • Lee, Seung Jun;Seo, Bong-Goon;Park, Do-Hyung
    • Journal of Intelligence and Information Systems
    • /
    • v.24 no.4
    • /
    • pp.197-217
    • /
    • 2018
  • Music is one of the most creative acts of expressing human sentiment through sound. Because music easily evokes empathy, the music people listen to can either lift or dampen their mood. Thus, sentiment is a primary factor in searching for or recommending music. Yet recommendation systems based on customer sentiment are still lacking. The algorithms used in previous music recommendation systems are mostly user-based, relying, for example, on users' play histories and playlists. Based on the play histories or playlists of multiple users, the distance between pieces of music is calculated from basic information such as genre, singer, and beat, and similar music is filtered out and recommended to the user. However, these methodologies have limitations such as the filter bubble. For example, a user who listens only to rock music is unlikely to be recommended hip-hop or R&B music with a similar sentiment. In this study, we focused on the sentiment of the music itself and developed a methodology for defining a new index for music recommendation systems. Concretely, we propose the "SWEMS" index and, using it, extract a "Sentiment Pattern" for each piece of music used in this research. We expect that the "SWEMS" index and "Sentiment Pattern" can serve a variety of purposes, not only music recommendation but also, for instance, building predictive models. To develop a music recommendation system based on the emotional adjectives that people generally feel when listening to music, it was necessary to collect as many emotional adjectives as possible. Emotional adjectives were collected from related previous studies, and more were collected via social metrics and qualitative interviews. In total, we collected 134 individual adjectives, which were narrowed through several steps to a final set of 60. Using the final adjectives as survey items, a music survey was conducted to evaluate the sentiment of each song. The surveys were taken by expert panels of people who like listening to music. All survey questions were based on emotional adjectives; no other information was collected. The evaluated music was divided into popular and unpopular songs, and the variables most relevant to the popularity of music were derived. The derived variables were reclassified through factor analysis, and a weight was assigned to the adjectives belonging to each factor. We define the extracted factors as the "SWEMS" index, which describes the sentiment score of music as a numeric value. To implement the algorithm, we applied the Case Based Reasoning method, which we chose over other methodologies because its problem-solving approach resembles that of humans. Using the "SWEMS" index of each piece of music, the algorithm recommends songs with similar emotion values, as given by the factors, based on the Euclidean distance. The "SWEMS" index can also be used to draw a "Sentiment Pattern" for each song. We found that songs evoking similar emotions show similar "Sentiment Patterns". Through the "Sentiment Pattern", we could also suggest new groupings of music that differ from the existing genre format. This research would help people quantify qualitative data. The algorithms can also be used to quantify the content itself, which would help users search for similar content more quickly.
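
The Euclidean-distance recommendation step described in the abstract above can be sketched as a nearest-neighbour lookup over per-song factor scores. The song names, the three-factor score vectors, and the function `recommend` are hypothetical; the abstract does not give the actual SWEMS factors or weights.

```python
import math

def recommend(query, catalog, k=2):
    """Return the k song titles whose factor scores lie closest to `query`."""
    # Rank every song by Euclidean distance between score vectors.
    ranked = sorted(catalog, key=lambda title: math.dist(query, catalog[title]))
    return ranked[:k]

# Hypothetical sentiment scores (one value per assumed factor).
catalog = {
    "song_a": (0.9, 0.1, 0.3),
    "song_b": (0.8, 0.2, 0.4),
    "song_c": (0.1, 0.9, 0.7),
    "song_d": (0.2, 0.8, 0.6),
}
nearest = recommend((0.88, 0.1, 0.3), catalog)
```

Because the ranking depends only on the sentiment scores, two songs from different genres can be recommended together when their score vectors are close, which is the behaviour the abstract contrasts with genre-based filtering.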

Automatic Speech Style Recognition Through Sentence Sequencing for Speaker Recognition in Bilateral Dialogue Situations (양자 간 대화 상황에서의 화자인식을 위한 문장 시퀀싱 방법을 통한 자동 말투 인식)

  • Kang, Garam;Kwon, Ohbyung
    • Journal of Intelligence and Information Systems
    • /
    • v.27 no.2
    • /
    • pp.17-32
    • /
    • 2021
  • Speaker recognition is generally divided into speaker identification and speaker verification. It plays an important role in automatic voice systems, and its importance is growing as portable devices, voice technology, and audio content fields continue to expand. Previous speaker recognition studies have aimed to automatically determine who the speaker is from voice files and to improve accuracy. Speech style is an important sociolinguistic subject: it contains very useful information that reveals the speaker's attitude, conversational intention, and personality, and this can be an important clue for speaker recognition. The sentence-final ending used in a speaker's utterance determines the type of sentence and carries information such as the speaker's intention, psychological attitude, or relationship to the listener. Because the use of sentence-final endings varies with the characteristics of the speaker, the type and distribution of the endings used by an unidentified speaker can help in recognizing that speaker. However, few studies on existing text-based speaker recognition have considered speech style, and adding speech style information to speech signal-based speaker recognition techniques could further improve accuracy. Hence, the purpose of this paper is to propose a novel method that uses speech style, expressed as sentence-final endings, to improve the accuracy of Korean speaker recognition. To this end, we propose a method called sentence sequencing, which generates vector values from the type and frequency of the sentence-final endings appearing in the utterances of a specific person. To evaluate the performance of the proposed method, training and performance evaluation were conducted with an actual drama script. The method proposed in this study can be used as a means to improve the performance of Korean speech recognition services.

Outdoor/Environmental Education Program Design in the Nature Study Center - The Program Diversification for the Middle School Students - (자연학습원 옥외 환경교육 프로그램 설계를 위한 연구 -중학생을 위한 프로그램 다양화를 중심으로-)

  • 이재영;안동만
    • Hwankyungkyoyuk
    • /
    • v.3 no.1
    • /
    • pp.141-152
    • /
    • 1992
  • The purpose of this study is to find ways to diversify the Outdoor/Environmental Education Program in the Nature Study Center (NSC), especially for middle school students. Various research methods were used, including literature review, questionnaire survey (448 students, 11 middle school teachers, 19 NSC staff), interviews, and participant observation. The study consists of two steps: the first defines research questions through a pilot survey, and the second investigates them, in the form of hypotheses, through the main survey. Nine hypotheses are formulated. Six relate to program elements (educational goals, student characteristics, staff resources, teaching methods, instructional resources, contents), and three relate to the program implementation process (preplan, implementation, post-evaluation). The hypotheses are tested and alternatives for program improvement are proposed. 1. Educational goals: The educational goals of the NSC should focus on Outdoor/Environmental Education, and each NSC should specialize in its own theme. The objectives of every sub-program should be unified toward the educational goals. 2. Student characteristics: The Outdoor/Environmental Education Program should reflect student characteristics: sex, urban/rural origins, normal/handicapped, number of visits, and so on. 3. Staff resources: Provide qualified staff with professional knowledge and positive attitudes, reeducate staff periodically, reduce management staff, and increase teaching staff. Provide permanent and well-paid positions, and encourage and give opportunities to middle school teachers to participate in the program. 4. Teaching method: Increase outdoor classes and two-way communication between teaching staff and students, and adopt more open-ended teaching methods so that students can work together in small groups. 5. Instructional resources: Diversify NSC sites (mountains, coastal areas, urban areas, and so on) and teaching media (audio/visual equipment, graphic design of signs). Consider design for the handicapped and integrate indoor and outdoor educational facilities. Plan nature trails with separate themes, and align each nature trail so that it passes through diverse environments. 6. Content: Reflect the characteristic potential of each site, and specialize in day or night programs, seasonal programs, and site-specific social issues (such as interpreting environmental damage around the NSCs). 7. Preplan: Obtain information about visiting students in advance. Discuss with middle school teachers and adjust the program weeks before the visit; if many or all of the students have already visited an NSC, arrange a visit to another NSC. Provide an introductory class for the teachers and students before they visit an NSC. 8. Implementation: During NSC visits and classes, apply various appropriate techniques to collect information for later evaluation. Improve the NSC-provided evaluation sheet so that it reflects student characteristics. Compare with formal education and investigate the effects of the NSC program. 9. Post-evaluation: Formalize a post-evaluation process and organization. During the winter vacation, develop new programs for the next year based on the post-evaluation. Also, hold comparative evaluation meetings of staff from various NSCs during the winter vacation, when there are no visitors or classes.


Similar sub-Trajectory Retrieval Technique based on Grid for Video Data (비디오 데이타를 위한 그리드 기반의 유사 부분 궤적 검색 기법)

  • Lee, Ki-Young;Lim, Myung-Jae;Kim, Kyu-Ho;Kim, Joung-Joon
    • The Journal of the Institute of Internet, Broadcasting and Communication
    • /
    • v.9 no.5
    • /
    • pp.183-189
    • /
    • 2009
  • Recently, with the spread of PCS, PDA, and other mobile devices, the use of GPS (Global Positioning System), and the rapid development of wireless networks, even ordinary users increasingly use multimedia data such as images, audio, and video. Among multimedia data, video data in particular, unlike text or image data, contains information about the movement of moving objects and has spatiotemporal attributes, in that spatial properties change over time. The continuous movement of a moving object, whose spatial location changes over time, is called a trajectory, and finding data trajectories in a database that are similar to a query trajectory given by the user is called similar sub-trajectory retrieval. Searching for such trajectories must support approximate matching, which finds trajectories similar to the user's query within a given tolerance. In addition, an efficient search method, different from existing research, is needed to find only the desired data quickly in a large multimedia database. To this end, this paper proposes a new grid-based retrieval technique that divides the trajectories of moving objects into a grid and efficiently supports similar sub-trajectory search.
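
One way to picture grid-based similar sub-trajectory retrieval is to quantize each trajectory into a sequence of grid cells and then look for the query's cell sequence inside the data trajectory's sequence. The sketch below is an illustration under that assumption only; the abstract does not specify the paper's actual index or matching rule, and `to_cells`, `contains_sub_trajectory`, and the cell size are hypothetical. The grid cell size plays the role of the tolerance: points that fall in the same cell are treated as equal.

```python
def to_cells(trajectory, cell_size):
    """Map (x, y) points to grid cells, collapsing consecutive repeats."""
    cells = []
    for x, y in trajectory:
        cell = (int(x // cell_size), int(y // cell_size))
        if not cells or cells[-1] != cell:
            cells.append(cell)
    return cells

def contains_sub_trajectory(data, query, cell_size=10.0):
    """True if the query's cell sequence occurs contiguously in the data's."""
    d = to_cells(data, cell_size)
    q = to_cells(query, cell_size)
    return any(d[i:i + len(q)] == q for i in range(len(d) - len(q) + 1))

# Hypothetical trajectories in arbitrary coordinate units.
data = [(1, 1), (4, 6), (12, 8), (25, 9), (27, 18), (33, 31)]
similar_query = [(14, 3), (22, 5), (28, 12)]   # follows part of `data`
different_query = [(14, 3), (35, 35)]          # skips cells `data` visits
```

A coarser grid tolerates larger deviations between query and data but also admits more false matches, so the cell size would need tuning against the data.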


Design and Implementation of a Bluetooth Baseband Module with DMA Interface (DMA 인터페이스를 갖는 블루투스 기저대역 모듈의 설계 및 구현)

  • Cheon, Ik-Jae;O, Jong-Hwan;Im, Ji-Suk;Kim, Bo-Gwan;Park, In-Cheol
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.39 no.3
    • /
    • pp.98-109
    • /
    • 2002
  • Bluetooth technology is a publicly available specification proposed for Radio Frequency (RF) communication for short-range and point-to-multipoint voice and data transfer. It operates in the 2.4 GHz ISM (Industrial, Scientific and Medical) band and offers the potential for low-cost, broadband wireless access for various mobile and portable devices at a range of about 10 meters. In this paper, we describe the structure and test results of a Bluetooth baseband module with a direct memory access (DMA) method that we have developed. The module consists of three blocks: a link controller, a UART interface, and an audio CODEC. It has a bus interface for data communication between the module and the main processor and an RF interface for transmitting the bit-stream between the module and the RF module. The bus interface includes a DMA interface. Compared with a link controller with FIFOs, the module with DMA differs widely in module size and data processing speed. The small module size provides low cost and a wide range of applications. In addition, the module supports firmware upgrades through the UART. FPGA and ASIC implementations of this module, designed as soft IP, were tested for file and bit-stream transfers between PCs.

An Embedded Watermark into Multiple Lower Bitplanes of Digital Image (디지털 영상의 다중 하위 비트플랜에 삽입되는 워터마크)

  • Rhee, Kang-Hyeon
    • Journal of the Institute of Electronics Engineers of Korea CI
    • /
    • v.43 no.6 s.312
    • /
    • pp.101-109
    • /
    • 2006
  • Recently, with the widespread use of the Internet and the development of related application programs, distributing and using multimedia content (text, images, video, audio, etc.) has become very easy. Digital signals are easily duplicated, and the duplicated data can have the same quality as the original, so it is difficult to prove the original owner. Copyright protection methods that address this problem include encryption and watermarking. Digital watermarking is used to protect IP (Intellectual Property) and authenticate the owner of multimedia content. In this paper, the proposed watermarking algorithm embeds a watermark into multiple lower bitplanes of a digital image. In the proposed algorithm, the original and watermark images are each decomposed into bitplanes, and the watermarking operation is executed in the corresponding bitplanes. The position of the watermark image embedded in each bitplane is used as the watermarking key, and embedding is performed in multiple lower bitplanes, which have no influence on human visual perception. Thus, the algorithm can present the watermark image as multiple inherent patterns and needs only a small watermarking quantity. In the experiment, the author confirmed that it is highly robust against JPEG, MEDIAN and PSNR attacks but weak against NOISE, RNDDIST, ROT, SCALE, and SS attacks in the spatial domain when the criterion PSNR of the watermarked image is 40 dB.
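
The basic operation behind lower-bitplane watermarking, writing watermark bits into a chosen low-order bit of each pixel, can be shown with a simplified sketch. It handles a single bitplane and a binary watermark of the same size as the cover image; the position-based key and multi-bitplane scheme of the paper are not reproduced, and `embed`, `extract`, and the sample pixel values are assumptions for illustration.

```python
def embed(cover, mark, plane=0):
    """Write each watermark bit into bit `plane` of the matching cover pixel."""
    mask = 1 << plane
    return [[(c & ~mask) | ((m & 1) << plane) for c, m in zip(crow, mrow)]
            for crow, mrow in zip(cover, mark)]

def extract(stego, plane=0):
    """Read bit `plane` back out of every pixel."""
    return [[(p >> plane) & 1 for p in row] for row in stego]

# Hypothetical 2x2 8-bit grayscale image and binary watermark.
cover = [[200, 13], [77, 255]]
mark = [[1, 0], [0, 1]]
stego = embed(cover, mark, plane=1)
recovered = extract(stego, plane=1)
```

Embedding in bitplane 1 changes a pixel by at most 2 gray levels, which is why lower bitplanes are visually unobtrusive; the same property makes such watermarks fragile to noise and geometric changes.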

Context Adaptive User Interface Generation in Ubiquitous Home Using Bayesian Network and Behavior Selection Network (베이지안 네트워크와 행동 선택 네트워크를 이용한 유비쿼터스 홈에서의 상황 적응적 인터페이스 생성)

  • Park, Han-Saem;Song, In-Jee;Cho, Sung-Bea
    • 한국HCI학회:학술대회논문집
    • /
    • 2008.02a
    • /
    • pp.573-578
    • /
    • 2008
  • Today, operating a home theater system requires controlling various devices such as a TV, audio system, DVD player, video player, and set-top box simultaneously. To execute the desired function in this situation, the user must know the functions and button positions of several remote controllers, which people generally find difficult. Moreover, the number of controllable devices will increase, and people will be even more confused once the ubiquitous home environment is realized. Therefore, a user-adaptive interface that provides summarized functions is required. In addition, a ubiquitous computing environment can contain many mobile and stationary controller devices, so the user interface should adapt both in selecting the functions the user wants and in adjusting the features of the UI to fit a specific controller. To implement an interface adaptive to both user and controller, we modeled the ubiquitous home environment and used the modeled context and device information. We used a Bayesian network (BN) to obtain the degree of necessity in each situation. The behavior selection network uses the predicted user situation and the degree of necessity to select the functions needed in the current situation. The selected functions are used to construct an adaptive interface for each controller using a presentation template. For the experiments, we implemented a ubiquitous home environment and generated a controller usage log in it. By evaluating the inferred controller necessity against generated scenarios, we confirmed that the BN predicted user requirements effectively. Finally, in a comparison of the adaptive home UI with a fixed UI by 14 subjects, we confirmed that the generated adaptive UI was more useful for general tasks than the fixed UI.
