• Title/Summary/Keyword: Speech Web

Same music file recognition method by using similarity measurement among music feature data (음악 특징점간의 유사도 측정을 이용한 동일음원 인식 방법)

  • Sung, Bo-Kyung;Chung, Myoung-Beom;Ko, Il-Ju
    • Journal of the Korea Society of Computer and Information / v.13 no.3 / pp.99-106 / 2008
  • Digital music retrieval is now used in many fields, such as Web portals and audio service sites. Existing systems rely on music metadata for retrieval; when the metadata are incorrect or missing, it is hard to obtain accurate retrieval results. Content-based information retrieval, which uses the music itself, has been studied to solve this problem. In this paper, we propose a same-music recognition method based on similarity measurement. Feature data are extracted from the music waveform using simplified MFCC (Mel Frequency Cepstral Coefficients). Similarity between digital music files is measured using DTW (Dynamic Time Warping), which is used in the vision and speech recognition fields. To verify the proposed method, we ran 500 experiments against 1,000 randomly collected songs of the same genre, and all 500 succeeded. The 500 digital music files were made from 60 digital audio sources by mixing different compression codecs and bit rates. We proved that similarity measurement using DTW can recognize the same music.
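
The DTW comparison described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the feature vectors below are hypothetical toy values rather than real simplified-MFCC output, and the Euclidean local distance is an assumption.

```python
import math

def dtw_distance(a, b):
    """Dynamic time warping distance between two sequences of feature vectors."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])  # Euclidean local distance (assumption)
            cost[i][j] = d + min(cost[i - 1][j],      # a advances, b repeats a frame
                                 cost[i][j - 1],      # b advances, a repeats a frame
                                 cost[i - 1][j - 1])  # both advance
    return cost[n][m]

# Hypothetical toy feature sequences (not real MFCC data).
original = [(0.0, 1.0), (1.0, 2.0), (2.0, 3.0), (3.0, 2.0)]
# Same content with one frame duplicated, as a stand-in for a re-encoded copy.
reencoded = [(0.0, 1.0), (1.0, 2.0), (1.0, 2.0), (2.0, 3.0), (3.0, 2.0)]
other = [(5.0, 5.0), (4.0, 0.0), (0.0, 4.0), (1.0, 1.0)]

same_score = dtw_distance(original, reencoded)
diff_score = dtw_distance(original, other)
```

Because DTW lets one frame align with several, the stretched copy scores lower than the unrelated sequence even though the two sequences differ in length. A real system would still need a decision threshold, which the abstract does not specify.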

The Medial Sural Artery Perforator Flap versus Other Free Flaps in Head and Neck Reconstruction: A Systematic Review

  • Yasser Al Omran;Ellie Evans;Chloe Jordan;Tiffanie-Marie Borg;Samar AlOmran;Sarvnaz Sepehripour;Mohammed Ali Akhavani
    • Archives of Plastic Surgery / v.50 no.3 / pp.264-273 / 2023
  • The medial sural artery perforator (MSAP) flap is a versatile fasciocutaneous flap, yet it is less commonly utilized than other free flaps in microvascular reconstructions of the head and neck. The aim is to conduct a high-quality Preferred Reporting Items for Systematic Reviews and Meta-analyses (PRISMA)- and Assessment of Multiple Systematic Reviews 2 (AMSTAR 2)-compliant systematic review comparing the use of the MSAP flap with other microvascular free flaps in the head and neck. Medline, Embase, and Web of Science databases were searched from inception to February 2021 to identify all original comparative studies comparing head and neck reconstruction with an MSAP flap to reconstruction with the radial forearm free flap (RFFF) or anterolateral thigh (ALT) flap. Outcomes studied were recipient-site and donor-site morbidities as well as speech and swallow function. A total of 473 articles were identified from title and abstract review. Four studies met the inclusion criteria. Compared with the RFFF and the ALT flaps, the MSAP flap had more recipient-site complications (6.0 vs 10.4%) but fewer donor-site complications (20.2 vs 7.8%). The MSAP flap demonstrated better overall donor-site appearance and function than the RFFF and ALT flaps (p = 0.0006) but no statistical difference in speech and swallowing function following reconstruction (p = 0.28). Although higher-quality studies comparing the MSAP flap with other free flaps are needed, the MSAP flap provides a viable and effective reconstructive option and should be strongly considered for reconstruction of head and neck defects.

Expansion of Sensibility Area and Industrial Application in the Convergence Era - With Special Reference to Analysis of the Internet Arts of Sommerer and Mignonneau - (컨버전스시대 감성영역의 확장과 산업활용 -Sommerer와 Mignonneau의 인터넷 아트 분석을 중심으로-)

  • Kim, Hee-Young;Lee, Yong-Jae
    • The Journal of the Korea Contents Association / v.10 no.12 / pp.146-154 / 2010
  • Recently, 'convergence' and 'communication' have become keywords in many areas. Artists and engineers have begun to communicate with each other through collaboration based on new technologies. One exemplary technology of this era of convergence is the fusion of the five senses, used both by Internet Art and by industrial technologies such as car navigation systems and the iPhone. Sommerer and Mignonneau's Internet Art works "Riding the Net", "The Living Room", and "The Living Web" use the Internet and five-sense fusion technology to translate not only sound into visual images but also tactile senses into temporal-spatial representations. Likewise, industrial technologies such as car navigation systems and the iPhone employ the five-sense fusion technology of speech recognition, which expands the realm of the senses in technology, as seen in Internet Art. As examined in this study, the development of art and technology through their convergence will open up a new dimension of digital art and the culture technology industry.

A Corpus-based Analysis of EFL Learners' Use of Hedges in Cross-cultural Communication

  • Min, Su-Jung
    • English Language & Literature Teaching / v.16 no.4 / pp.91-106 / 2010
  • This study examines the use of hedges in cross-cultural communication between EFL learners in an e-learning environment. The study analyzes the use of hedges in a corpus drawn from an interactive web site with a bulletin board system, through which college students of English at Japanese and Korean universities interacted with each other by discussing local and global issues. It compares the use of hedges in the students' corpus with that in a corpus of native English speakers. The results show that EFL learners tend to use a relatively smaller number of hedges than the native speakers in terms of total token frequency. They further reveal that the learners' overuse of a single versatile, high-frequency hedging item, "I think", results in relative underuse of other hedging devices. This indicates that, because of their small repertoire of hedges, EFL learners' overuse of a limited number of hedging items may make their speech or writing appear less competent. Based on the results and interviews with the learners, the study also argues that hedging should be understood in its social contexts and not merely as a lack of conviction or a mark of low proficiency. Suggestions are made for using computer corpora to understand EFL learners' language difficulties and to help them develop communicative and pragmatic competence.
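
The frequency comparison this abstract describes can be sketched roughly as follows. The hedge list, the sample texts, and the per-1,000-token normalization are all illustrative assumptions; the study's actual hedge inventory and corpora are not given here.

```python
import re

# Hypothetical hedge inventory; the study's real list is not stated in the abstract.
HEDGES = ["i think", "maybe", "perhaps", "probably", "sort of"]

def hedge_profile(text, hedges=HEDGES, per=1000):
    """Return (raw counts, rate per `per` tokens) for each hedging item."""
    tokens = re.findall(r"[a-z']+", text.lower())
    normalized = " ".join(tokens)  # punctuation removed so multi-word items match
    counts = {h: len(re.findall(r"\b" + re.escape(h) + r"\b", normalized))
              for h in hedges}
    total = len(tokens)
    rates = {h: (c * per / total if total else 0.0) for h, c in counts.items()}
    return counts, rates

# Made-up sample texts standing in for the two corpora.
learner_text = "I think it is good. I think we should go. Maybe I think so."
native_text = "Perhaps it is good. We should probably go. I think so, sort of."

learner_counts, learner_rates = hedge_profile(learner_text)
native_counts, native_rates = hedge_profile(native_text)
```

In this toy data the learner text relies on one item while the native text spreads across four, which is the kind of repertoire difference the abstract reports.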

A Splog Detection System Using Support Vector Systems (지지벡터기계를 이용한 스팸 블로그(Splog) 판별 시스템)

  • Lee, Song-Wook
    • Journal of the Korea Institute of Information and Communication Engineering / v.15 no.1 / pp.163-168 / 2011
  • Blogs are an easy way to publish information, engage in discussions, and form communities on the Internet. Recently, several varieties of spam blog have appeared whose purpose is to host ads or raise the PageRank of target sites. Our purpose is to develop a system that automatically detects these spam blogs (splogs) among blogs on the Web. After the HTML is removed from the blogs, they are tagged by a part-of-speech (POS) tagger. Words and their POS tags are used as features. We select useful features with the χ² statistic and train an SVM with the selected features. Our system achieved an F1 measure of 90.5% on the SPLOG data set.
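
The χ² feature selection step mentioned in this abstract can be sketched as below. This is an illustration only: the word/POS feature strings and the four toy documents are hypothetical, the 2x2 presence/absence formulation is an assumption about how the statistic was applied, and the SVM training step is omitted.

```python
def chi_square(n11, n10, n01, n00):
    """Chi-square statistic for a 2x2 table of feature presence vs. class.

    n11: splogs containing the feature      n10: normal blogs containing it
    n01: splogs without the feature         n00: normal blogs without it
    """
    n = n11 + n10 + n01 + n00
    denom = (n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00)
    if denom == 0:  # feature appears in every document or in none
        return 0.0
    return n * (n11 * n00 - n10 * n01) ** 2 / denom

def select_features(docs, labels, k):
    """docs: list of feature sets; labels: list of bools (True = splog)."""
    vocab = set().union(*docs)
    scores = {}
    for f in vocab:
        n11 = sum(1 for d, y in zip(docs, labels) if f in d and y)
        n10 = sum(1 for d, y in zip(docs, labels) if f in d and not y)
        n01 = sum(1 for d, y in zip(docs, labels) if f not in d and y)
        n00 = sum(1 for d, y in zip(docs, labels) if f not in d and not y)
        scores[f] = chi_square(n11, n10, n01, n00)
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda f: (-scores[f], f))[:k]

# Hypothetical word/POS features after HTML removal and tagging.
docs = [
    {"buy/VB", "cheap/JJ", "the/DT"},
    {"cheap/JJ", "pills/NNS", "the/DT"},
    {"trip/NN", "the/DT", "photo/NN"},
    {"recipe/NN", "the/DT", "photo/NN"},
]
labels = [True, True, False, False]

top_features = select_features(docs, labels, 2)
```

A feature such as `the/DT`, which occurs in every document, scores zero and drops out, while features concentrated in one class rank highest. The selected features would then be used to build the vectors on which the SVM is trained.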

An Optimized e-Lecture Video Search and Indexing framework

  • Medida, Lakshmi Haritha;Ramani, Kasarapu
    • International Journal of Computer Science & Network Security / v.21 no.8 / pp.87-96 / 2021
  • The demand for e-learning through video lectures is rapidly increasing because of its diverse advantages over traditional learning methods. This has led to massive volumes of web-based lecture videos. Indexing and retrieval of a lecture video or a lecture video topic has thus proved to be an exceptionally challenging problem. Many techniques in the literature are either visual or audio based, but not both. Since the visual and audio components are equally important for content-based indexing and retrieval, the current work focuses on both. A framework is presented for automatic topic-based indexing and search that depends on the innate content of the lecture videos. The text from the slides is extracted using the proposed Merged Bounding Box (MBB) text detector. Text is extracted from the audio component using Google Speech Recognition (GSR) technology. This hybrid approach generates the indexing keywords from the merged transcripts of both the video and audio component extractors. The search within the indexed documents is optimized using Naïve Bayes (NB) classification and K-Means clustering models. This optimized search retrieves results by searching only the relevant document cluster in the predefined categories rather than the whole lecture video corpus. The work is carried out on a dataset generated by assigning categories to lecture video transcripts gathered from e-learning portals. Search performance is assessed based on accuracy and time taken. Further, the improved accuracy of the proposed indexing technique is compared with that of the accepted chain indexing technique.
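
The idea of searching only the relevant category instead of the whole corpus can be sketched as follows. This is a simplified illustration under stated assumptions: a small multinomial Naive Bayes classifier with Laplace smoothing routes the query to a category, the K-Means step is left out, and the transcripts, category names, and keyword-overlap matching are all hypothetical.

```python
import math
from collections import Counter, defaultdict

class NaiveBayes:
    """Minimal multinomial Naive Bayes over whitespace tokens."""

    def fit(self, docs, labels):
        self.word_counts = defaultdict(Counter)
        self.doc_counts = Counter(labels)
        self.vocab = set()
        for text, label in zip(docs, labels):
            words = text.lower().split()
            self.word_counts[label].update(words)
            self.vocab.update(words)
        return self

    def predict(self, text):
        total_docs = sum(self.doc_counts.values())
        best, best_score = None, -math.inf
        for label in sorted(self.doc_counts):
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(self.doc_counts[label] / total_docs)  # class prior
            for w in text.lower().split():
                score += math.log((counts[w] + 1) / denom)  # Laplace smoothing
            if score > best_score:
                best, best_score = label, score
        return best

def search(query, transcripts, categories, model):
    """Route the query to one category, then match only documents in it."""
    target = model.predict(query)
    terms = set(query.lower().split())
    hits = [i for i, (t, c) in enumerate(zip(transcripts, categories))
            if c == target and terms & set(t.lower().split())]
    return target, hits

# Hypothetical lecture transcripts and categories.
transcripts = [
    "gradient descent trains neural networks",
    "neural networks use backpropagation",
    "tcp handshake opens a network connection",
    "routers forward packets across networks",
]
categories = ["ml", "ml", "net", "net"]

model = NaiveBayes().fit(transcripts, categories)
category, hits = search("neural networks", transcripts, categories, model)
```

The fourth transcript also contains the word "networks" but is never examined, because the query is routed to the other category first. That restriction is the source of the time saving the abstract refers to.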

Privacy-Preserving in the Context of Data Mining and Deep Learning

  • Altalhi, Amjaad;AL-Saedi, Maram;Alsuwat, Hatim;Alsuwat, Emad
    • International Journal of Computer Science & Network Security / v.21 no.6 / pp.137-142 / 2021
  • Machine-learning systems have proven their worth in various industries, including healthcare and banking, by assisting in the extraction of valuable inferences. Information in these crucial sectors is traditionally stored in databases distributed across multiple environments, making accessing and extracting data from them a tough job. In addition, these data sources contain sensitive information, implying that the data cannot be shared outside of the head. Using cryptographic techniques, Privacy-Preserving Machine Learning (PPML) helps solve this challenge, enabling information discovery while maintaining data privacy. In this paper, we discuss how to keep data mining private, because data mining has a wide variety of uses, including business intelligence, medical diagnostic systems, image processing, web search, and scientific discovery. We also discuss privacy preservation in deep learning, because deep learning (DL) exhibits exceptional accuracy in image detection, speech recognition, and natural language processing compared with other fields of machine learning, so that it can detect errors in the data, unauthorized access to systems, and the addition of data by unauthorized persons.

A System of Audio Data Analysis and Masking Personal Information Using Audio Partitioning and Artificial Intelligence API (오디오 데이터 내 개인 신상 정보 검출과 마스킹을 위한 인공지능 API의 활용 및 음성 분할 방법의 연구)

  • Kim, TaeYoung;Hong, Ji Won;Kim, Do Hee;Kim, Hyung-Jong
    • Journal of the Korea Institute of Information Security & Cryptology / v.30 no.5 / pp.895-907 / 2020
  • With the recent increasing influence of multimedia content other than text-based content, services that help process the information in such content bring us great convenience. Representative features of these services are searching for and masking sensitive data. It is not difficult to find solutions that provide searching and masking functions for text and images. However, even though the need for technology that searches and masks part of audio data is recognized, such solutions are hard to find because of the technical difficulty. In this study, we propose a web application that provides searching and masking functions for audio data using an audio partitioning method. In pursuing this goal, we evaluated several speech-to-text conversion APIs to choose a suitable one for our purpose and developed regular expressions for searching for sensitive information. Lastly, we evaluated the accuracy of the developed searching and masking features. The contribution of this work is the design and implementation of searching and masking of sensitive information in audio data, supported by experiments that verify the various functions.
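
The combination of audio partitioning, speech-to-text, and regular expressions can be sketched as follows. Everything concrete here is an assumption made for illustration: the `(start, end, text)` segment format stands in for whatever a speech-to-text API returns per audio partition, and the phone-number pattern is a hypothetical example, not one of the authors' expressions.

```python
import re

# Hypothetical pattern for phone numbers such as 010-1234-5678.
PHONE = re.compile(r"\b\d{3}-\d{3,4}-\d{4}\b")

def mask_segments(segments, pattern=PHONE, token="[MASKED]"):
    """Mask sensitive text per segment and report which time ranges to mute.

    segments: list of (start_sec, end_sec, transcript_text) tuples, one per
    audio partition (assumed output of a speech-to-text step).
    """
    masked, mute_ranges = [], []
    for start, end, text in segments:
        new_text, n_hits = pattern.subn(token, text)
        if n_hits:
            mute_ranges.append((start, end))  # audio span to silence or beep
        masked.append((start, end, new_text))
    return masked, mute_ranges

# Made-up transcript segments.
segments = [
    (0.0, 2.5, "hello this is Kim"),
    (2.5, 6.0, "my number is 010-1234-5678"),
    (6.0, 8.0, "thank you"),
]

masked, mute_ranges = mask_segments(segments)
```

Working per partition keeps a time range attached to each piece of text, so a regex hit in the transcript maps directly back to the span of audio that needs masking.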

An Emotion Scanning System on Text Documents (텍스트 문서 기반의 감성 인식 시스템)

  • Kim, Myung-Kyu;Kim, Jung-Ho;Cha, Myung-Hoon;Chae, Soo-Hoan
    • Science of Emotion and Sensibility / v.12 no.4 / pp.433-442 / 2009
  • People increasingly buy products through the Internet rather than from stores. Some consumers give feedback online, such as reviews, replies, comments, and blog posts, after purchasing a product. People are also likely to look for information on the Internet. Because many consumers are interested in others' opinions when they are about to make a purchase, companies and public institutions need to collect and analyze reviews and public opinion. However, reviews on web sites are too numerous, short, and redundant. Under these circumstances, emotion scanning systems for text documents on the web are attracting attention. For extracting writers' opinions or subjective ideas from text, labeled word resources such as GI (General Inquirer) and LKB (Lexical Knowledge Base of near-synonym differences) exist for English, but none is yet available for Korean. In this paper, we labeled words of four parts of speech (POS) in a Korean dictionary (nouns, adjectives, verbs, and adverbs) with positive, negative, or neutral attributes. We extracted and learned the construction patterns of emotional words and the relationships among words in sentences from a large training set. Based on this knowledge, comments and reviews about products are classified into two polarity classes, positive and negative, using SO-PMI, for which the optimal condition was found from combinations of the four POS. Lastly, the system includes a flexible user interface for adding or editing the emotional words, the emotion-related construction patterns, and the relationships among the words.
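
SO-PMI (semantic orientation from pointwise mutual information) scores a word by how much more it co-occurs with positive seed words than with negative ones. The sketch below uses document-level co-occurrence with a small smoothing constant. The English seed words, the toy reviews, and the smoothing value are illustrative assumptions; the paper's Korean lexicon and POS combinations are not reproduced.

```python
import math

def so_pmi(word, docs, pos_seeds, neg_seeds, eps=0.01):
    """Semantic orientation of `word` from document-level co-occurrence.

    docs: list of token sets. A positive result suggests positive polarity.
    eps is a smoothing constant (assumed value) that avoids division by zero.
    """
    def hits(pred):
        return sum(1 for d in docs if pred(d))

    pos = hits(lambda d: bool(d & pos_seeds))
    neg = hits(lambda d: bool(d & neg_seeds))
    w_pos = hits(lambda d: word in d and bool(d & pos_seeds))
    w_neg = hits(lambda d: word in d and bool(d & neg_seeds))
    # PMI(word, positive) - PMI(word, negative), simplified to a single ratio.
    return math.log2(((w_pos + eps) * (neg + eps)) / ((w_neg + eps) * (pos + eps)))

# Hypothetical tokenized product reviews and seed words.
docs = [
    {"fast", "delivery", "good"},
    {"fast", "good", "sturdy"},
    {"slow", "delivery", "bad"},
    {"broken", "bad", "slow"},
    {"fast", "excellent"},
]
pos_seeds = {"good", "excellent"}
neg_seeds = {"bad"}

fast_score = so_pmi("fast", docs, pos_seeds, neg_seeds)
slow_score = so_pmi("slow", docs, pos_seeds, neg_seeds)
```

A review could then be labeled positive or negative from the sign of the summed scores of its words. Which parts of speech to include in that sum is the condition the paper tunes.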

Development of a Web-based Presentation Attitude Correction Program Centered on Analyzing Facial Features of Videos through Coordinate Calculation (좌표계산을 통해 동영상의 안면 특징점 분석을 중심으로 한 웹 기반 발표 태도 교정 프로그램 개발)

  • Kwon, Kihyeon;An, Suho;Park, Chan Jung
    • The Journal of the Korea Contents Association / v.22 no.2 / pp.10-21 / 2022
  • There are few automated methods, other than observation by colleagues or professors, for improving attitudes in formal presentations such as job interviews and project result presentations at a company. Previous studies have reported that a speaker's stable speech and gaze handling affect delivery in a presentation. Other studies show that proper feedback on one's presentation increases the presenter's ability to present. In this paper, considering these positive effects of correction, we developed a program that intelligently corrects the wrong presentation habits and attitudes of college students through facial analysis of videos, and we analyzed the program's performance. The proposed program was developed as a web-based system that checks the use of redundant words, performs facial recognition, and converts the presentation contents to text. To this end, an artificial intelligence model for classification was developed, and after the video object was extracted, facial feature points were recognized based on their coordinates. Then, using 4000 facial data, the performance of the algorithm in this paper was compared with that of facial recognition using a Teachable Machine. The program can help presenters by correcting their presentation attitude.