• Title/Summary/Keyword: language processing

A new approach for overlay text detection from complex video scene (새로운 비디오 자막 영역 검출 기법)

  • Kim, Won-Jun;Kim, Chang-Ick
    • Journal of Broadcast Engineering / v.13 no.4 / pp.544-553 / 2008
  • With the development of video editing technology, overlay text is increasingly inserted into video content to give viewers a better visual understanding. Because inserted text can represent the content of the scene or the editor's intention well, it is useful for video information retrieval and indexing. Most previous approaches are based on low-level features such as edge, color, and texture information. However, existing methods have difficulty handling text with varying contrast or text inserted over a complex background. In this paper, we propose a novel framework to localize the overlay text in a video scene. Based on our observation that transient colors exist between inserted text and its adjacent background, a transition map is generated. Candidate regions are then extracted using the transition map, and overlay text is finally determined based on the density of state in each candidate. The proposed method is robust to the color, size, position, style, and contrast of overlay text. It is also language-independent. Text region updates between frames are also exploited to reduce processing time. Experiments are performed on diverse videos to confirm the efficiency of the proposed method.
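
The transition-map idea in this abstract can be illustrated with a rough sketch. It is not the authors' algorithm: the abstract gives no formulas, so the change measure (absolute intensity difference between the left and right neighbours), the threshold value, and the function names `transition_map` and `density` are all assumptions chosen for illustration.

```python
from typing import List

Gray = List[List[int]]  # grayscale image as rows of 0-255 intensities


def transition_map(img: Gray, threshold: int = 60) -> List[List[int]]:
    """Mark pixels where intensity changes sharply across the horizontal neighbours.

    Assumption: a large left/right intensity difference stands in for the
    "transient colors" between overlay text and background.
    """
    height, width = len(img), len(img[0])
    tmap = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(1, width - 1):  # border pixels stay 0
            if abs(img[y][x + 1] - img[y][x - 1]) >= threshold:
                tmap[y][x] = 1
    return tmap


def density(tmap: List[List[int]], top: int, left: int, bottom: int, right: int) -> float:
    """Fraction of transition pixels inside the half-open box [top:bottom, left:right]."""
    area = (bottom - top) * (right - left)
    hits = sum(sum(row[left:right]) for row in tmap[top:bottom])
    return hits / area


# A bright stroke on a dark background yields a dense run of transition pixels.
row = [0, 0, 255, 255, 0, 0]
tmap = transition_map([row])
print(tmap, density(tmap, 0, 0, 1, len(row)))
```

A candidate box with a high density would be kept as a likely text region under these assumptions; a flat background region scores zero.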

The Design of Keyword Spotting System based on Auditory Phonetical Knowledge-Based Phonetic Value Classification (청음 음성학적 지식에 기반한 음가분류에 의한 핵심어 검출 시스템 구현)

  • Kim, Hack-Jin;Kim, Soon-Hyub
    • The KIPS Transactions:PartB / v.10B no.2 / pp.169-178 / 2003
  • This study addresses two topics: the classification of phone-like units (PLUs), which is the foundation of Korean large-vocabulary speech recognition, and the effectiveness of Chiljongseong (7 final consonants) and Paljongseong (8 final consonants) of the Korean language. A phone-like unit classifies phonemes phonetically according to the place and manner of articulation, and about 50 phone-like units are used in Korean speech recognition. In this study, auditory phonetic knowledge was applied to the classification of phone-like units to present 45 phone-like units. The vowels 'ㅔ, ㅐ' were classified as the phone-like unit (ee); 'ㅒ, ㅖ' as [ye]; and 'ㅚ, ㅙ, ㅞ' as [we]. Second, the Chiljongseong system of the Draft for a Unified Spelling System, which is currently in use, and the Paljongseonggajokyong of the Korean script Haerye were described. Whether 'ㄷ' and 'ㅅ' have the same phonetic value as final consonants in the Korean language has long been debated in academia. In this study, the transition stages of Korean consonants were investigated, Chiljongseong and Paljongseonggajokyong were applied to speech recognition, and their effectiveness was verified. The experiment was divided into isolated word recognition and continuous speech recognition. PBW452 was used to test isolated word recognition; the experiment was conducted on about 50 men and women, divided into 5 groups, who vocalized 50 words each. For the continuous speech recognition experiment, intended for use in the implemented stock exchange system, a sentence corpus of 71 stock exchange sentences and a speech corpus of those sentences were collected and used, with 5 men and women each vocalizing a sentence twice. In the results, when Paljongseonggajokyong was used for the final consonants, recognition performance improved by about 1.45% on average; when phone-like units applying both Paljongseonggajokyong and auditory phonetics were used, the recognition rate increased by 1.5% to 2.02% on average. In the continuous speech recognition experiment, recognition performance improved by about 1% to 2% on average compared with using the existing 49 or 56 phone-like units.

Scenario-Based Implementation Synthesis for Real-Time Object-Oriented Models (실시간 객체 지향 모델을 위한 시나리오 기반 구현 합성)

  • Kim, Sae-Hwa;Park, Ji-Yong;Hong, Seong-Soo
    • The KIPS Transactions:PartD / v.12D no.7 s.103 / pp.1049-1064 / 2005
  • The demands of increasingly complicated software have led to the proliferation of object-oriented design methodologies in embedded systems. To execute a system designed with objects on target hardware, a task set must be derived from the objects, specifying how many tasks reside in the system and which task processes which event arriving at an object. The derived task set greatly influences the responsiveness of the system. Nevertheless, it is very difficult to derive an optimal task set because of the discrepancy between objects and tasks. Therefore, the common method currently used by developers is to try various task sets repeatedly. This paper proposes the Scenario-based Implementation Synthesis Architecture (SISA) to solve this problem. SISA encompasses a method for deriving a task set from a system designed with objects, as well as its supporting development tools and run-time system architecture. A system designed with SISA not only consists of the smallest possible number of tasks, but also guarantees that the response time for each event in the system is minimized. We have fully implemented SISA by extending the ResoRT development tool and applied it to an existing industrial PBX system. The experimental results show that maximum response times were reduced by 30.3% on average compared with task sets derived by the best known existing methods.

Performance Improvement of Web Information Retrieval Using Sentence-Query Similarity (문장-질의 유사성을 이용한 웹 정보 검색의 성능 향상)

  • Park Eui-Kyu;Ra Dong-Yul;Jang Myung-Gil
    • Journal of KIISE:Software and Applications / v.32 no.5 / pp.406-415 / 2005
  • The growth of the Internet has led to a web containing a huge number of documents. Increasing importance is therefore given to web information retrieval technology that can provide users with documents containing the information they want. This paper proposes several techniques that are effective for improving web information retrieval. Similarity between a document and the query is a major source of information exploited by conventional systems. We instead suggest a technique that makes use of similarity between a sentence and the query. We introduce a technique to compute an approximate sentence-query similarity score even without mature natural language processing technology. The amount of computation for this task was shown to be linear in the number of documents in the total collection, which implies that practical systems can make use of this technique. The next important technique proposed in this paper is to use stratification of documents when re-ranking the documents to output. This was shown to lead to a significant improvement in performance. We furthermore showed that using hyperlinks, anchor texts, and titles can enhance performance. To justify the proposed techniques, we developed a large-scale web information retrieval system and used it for experiments.
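
The contrast between document-level and sentence-level matching can be sketched as follows. The abstract does not state the scoring formula, so this example assumes a simple one: the fraction of distinct query terms found in a single sentence, with a document scored by its best sentence. The helper names (`tokenize`, `split_sentences`, `sentence_query_score`, `best_sentence_score`) and the naive punctuation-based sentence splitter are hypothetical.

```python
import re
from typing import List


def tokenize(text: str) -> List[str]:
    """Lowercase and keep alphanumeric runs (a deliberately crude tokenizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def split_sentences(document: str) -> List[str]:
    """Naive split on '.', '!' and '?'; no real language processing is assumed."""
    return [s.strip() for s in re.split(r"[.!?]+", document) if s.strip()]


def sentence_query_score(sentence: str, query: str) -> float:
    """Fraction of distinct query terms that occur in the sentence."""
    query_terms = set(tokenize(query))
    if not query_terms:
        return 0.0
    return len(query_terms & set(tokenize(sentence))) / len(query_terms)


def best_sentence_score(document: str, query: str) -> float:
    """Score a document by its best-matching sentence (an assumed aggregation)."""
    scores = (sentence_query_score(s, query) for s in split_sentences(document))
    return max(scores, default=0.0)


query = "sentence query similarity"
doc_a = "Web retrieval is hard. Sentence query similarity helps ranking."
doc_b = "Sentence structure is studied. Query logs are large. Similarity is a metric."
print(best_sentence_score(doc_a, query), best_sentence_score(doc_b, query))
```

Both sample documents contain every query term at the document level, but only `doc_a` contains them within one sentence, so it scores higher under this assumed measure. Each document is scanned once, which is consistent with the linear cost the abstract mentions.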

A Unified Design Methodology using UML Classes for XML Application based on RDB (관계형 데이터베이스 기반의 XML 응용을 위한, UML 클래스를 이용한 통합 설계 방법론)

  • Bang, Sung-Yoon;Joo, Kyung-Soo
    • The KIPS Transactions:PartD / v.9D no.6 / pp.1105-1112 / 2002
  • Information exchange based on XML, such as B2B electronic commerce, is spreading. A systematic and stable management mechanism for storing the exchanged information is therefore needed. To this end, there is much research on connecting XML applications and relational databases. However, because XML data has a hierarchical structure and relational databases can store only flat-structured data, a conversion rule is needed that changes the hierarchical structure to a 2-dimensional format. Accordingly, a modeling methodology for storing such structured information in relational databases is needed. Modeling is an important first step in building good-quality application systems. In 1997, the OMG adopted UML as its standard modeling language. Since industry has warmly embraced UML, its importance should grow in the future. A design methodology based on UML is therefore needed to develop efficient XML applications. In this paper, we propose a unified design methodology using UML for XML applications based on relational databases. To reach these goals, we first introduce an XML modeling methodology for designing W3C XML Schema using UML, and second, we propose a data modeling methodology for relational database schemas that store XML data efficiently.
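
The hierarchical-to-flat conversion problem described in this abstract can be made concrete with a small sketch. This is not the paper's conversion rule, which the abstract does not give; it assumes a generic "edge table" layout in which each XML element becomes one row `(id, parent_id, tag, text)`. The `flatten` function and the row layout are illustrative assumptions.

```python
import xml.etree.ElementTree as ET
from typing import List, Optional, Tuple

# One relational row per XML element: (id, parent_id, tag, text).
Row = Tuple[int, Optional[int], str, str]


def flatten(xml_text: str) -> List[Row]:
    """Convert an XML tree into flat rows, keeping hierarchy via parent_id."""
    root = ET.fromstring(xml_text)
    rows: List[Row] = []

    def visit(elem: ET.Element, parent_id: Optional[int]) -> None:
        node_id = len(rows) + 1  # ids assigned in document (pre-order) order
        rows.append((node_id, parent_id, elem.tag, (elem.text or "").strip()))
        for child in elem:
            visit(child, node_id)

    visit(root, None)
    return rows


doc = "<order><customer><name>Kim</name></customer><item>pen</item></order>"
for row in flatten(doc):
    print(row)
```

The `parent_id` column is what lets a 2-dimensional table carry the nesting; rows like these could be loaded into a single self-referencing table, though a schema derived from a UML model would normally use typed tables instead.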

Design of a Deep Neural Network Model for Image Caption Generation (이미지 캡션 생성을 위한 심층 신경망 모델의 설계)

  • Kim, Dongha;Kim, Incheol
    • KIPS Transactions on Software and Data Engineering / v.6 no.4 / pp.203-210 / 2017
  • In this paper, we propose an effective neural network model for image caption generation and model transfer. This model is a kind of multimodal recurrent neural network model. It consists of five distinct layers: a convolutional neural network layer for extracting visual information from images, an embedding layer for converting each word into a low-dimensional feature, a recurrent neural network layer for learning caption sentence structure, and a multimodal layer for combining visual and language information. In this model, the recurrent neural network layer is built from LSTM units, which are well known to be effective for learning and transferring sequence patterns. Moreover, this model has a unique structure in which the output of the convolutional neural network layer is linked not only to the input of the initial state of the recurrent neural network layer but also to the input of the multimodal layer, so that visual information extracted from the image is used at each recurrent step to generate the corresponding textual caption. Through various comparative experiments using open data sets such as Flickr8k, Flickr30k, and MSCOCO, we demonstrated that the proposed multimodal recurrent neural network model performs well in terms of caption accuracy and model transfer effect.

An SAO-based Text Mining Approach for Technology Roadmapping Using Patent Information (기술로드맵핑을 위한 특허정보의 SAO기반 텍스트 마이닝 접근 방법)

  • Choi, Sung-Chul;Kim, Hong-Bin;Yoon, Jang-Hyeok
    • Journal of Technology Innovation / v.20 no.1 / pp.199-234 / 2012
  • Technology roadmaps (TRMs) are considered an essential tool for strategic technology planning and management. Rapidly evolving technological trends and severe technological competition have recently made TRMs more important than ever, because a TRM plays the role of a "map" that aligns organizational objectives with their relevant technologies. However, constructing and managing TRMs is costly and time-consuming because they rely on the qualitative and intuitive knowledge of human experts. Enhancing the productivity of developing TRMs is therefore one of the major concerns in technology planning. In this regard, this paper proposes a technology roadmapping approach based on functions, a concept that includes the objectives, structures, and effects of a technology. Functions are represented as Subject-Action-Object (SAO) structures that can be extracted by applying natural language processing to patent text. We expect that the proposed method will broaden experts' technological horizons in the technology planning process and will help to construct TRMs efficiently with reduced time and cost.

DEM_Comp Software for Effective Compression of Large DEM Data Sets (대용량 DEM 데이터의 효율적 압축을 위한 DEM_Comp 소프트웨어 개발)

  • Kang, In-Gu;Yun, Hong-Sik;Wei, Gwang-Jae;Lee, Dong-Ha
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography / v.28 no.2 / pp.265-271 / 2010
  • This paper discusses a new software package, DEM_Comp, developed for effectively compressing large digital elevation model (DEM) data sets based on Lempel-Ziv-Welch (LZW) compression and Huffman coding. DEM_Comp was developed in the C++ language and runs on Windows-series operating systems. DEM_Comp was also tested on various test sites with different territorial attributes, and the results were evaluated. Recently, high-resolution DEMs have been obtained using new equipment and the related technologies of LiDAR (Light Detection And Ranging) and SAR (Synthetic Aperture Radar). DEM compression is useful because it helps reduce disk space or transmission bandwidth. Generally, data compression is divided into two processes: i) analyzing the relationships in the data and ii) deciding on the compression and storage methods. DEM_Comp was developed using a three-step compression algorithm applying a DEM with a regular grid, Lempel-Ziv compression, and Huffman coding. When pre-processing alone was used on high- and low-relief terrain, the efficiency was approximately 83%, but after completing all three steps of the algorithm, this increased to 97%. Compared with general commercial compression software, these results show approximately 14% better performance. DEM_Comp as developed in this research offers a more efficient way of distributing, storing, and managing large high-resolution DEMs.
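
For readers unfamiliar with the LZW stage named in this abstract, a minimal textbook sketch follows. It is not DEM_Comp's implementation: the byte-oriented input, the unbounded dictionary, and the function names `lzw_compress` and `lzw_decompress` are assumptions, and the pre-processing and Huffman coding steps are omitted.

```python
from typing import Dict, List


def lzw_compress(data: bytes) -> List[int]:
    """Textbook LZW: emit dictionary codes for the longest known prefixes."""
    table: Dict[bytes, int] = {bytes([i]): i for i in range(256)}
    current = b""
    codes: List[int] = []
    for value in data:
        candidate = current + bytes([value])
        if candidate in table:
            current = candidate  # keep extending the match
        else:
            codes.append(table[current])
            table[candidate] = len(table)  # learn the new sequence
            current = bytes([value])
    if current:
        codes.append(table[current])
    return codes


def lzw_decompress(codes: List[int]) -> bytes:
    """Rebuild the dictionary on the fly and invert lzw_compress."""
    if not codes:
        return b""
    table: Dict[int, bytes] = {i: bytes([i]) for i in range(256)}
    previous = table[codes[0]]
    out = [previous]
    for code in codes[1:]:
        if code in table:
            entry = table[code]
        elif code == len(table):
            entry = previous + previous[:1]  # code defined by the current step
        else:
            raise ValueError("invalid LZW code")
        out.append(entry)
        table[len(table)] = previous + entry[:1]
        previous = entry
    return b"".join(out)


# Flat terrain produces long runs of identical elevation values.
elevations = bytes([100] * 50 + [101] * 50)
codes = lzw_compress(elevations)
print(len(elevations), "bytes ->", len(codes), "codes")
```

Long runs of equal or slowly varying values, as in low-relief terrain, compress well because the dictionary quickly learns ever longer repeated sequences.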

Relationship between Music Cognitive Skills and Academic Skills (음악의 인지기술과 학습 기술과의 관계)

  • Chong, Hyun Ju
    • Journal of Music and Human Behavior / v.3 no.1 / pp.63-76 / 2006
  • Melody is defined as adding a spatial dimension to rhythm, which is a temporal concept. Understanding a melodic pattern and reproducing it also require cognitive skills. Since 1980, there has been much research on the relationship between academic skills and music cognitive skills, and on how to transfer the skills learned in music work to academic learning. This study aimed to examine various research outcomes dealing with the correlational and causal relationships between musical and academic skills. The two dominant theories explaining the connection between the two skill areas are "neural theory" and "near transfer theory." The theories focus mainly on the transfer of spatial and temporal reasoning, which are reinforced in musical learning. The study reviewed existing meta-analysis studies, which provided evidence for a positive correlation between academic and musical skills and for the significance of musical learning for academic skills. The study further examined specific skill areas with which musical learning is correlated, such as mathematics and reading. The research stated that among many mathematical concepts, proportional topics have the strongest correlation with musical skills. In reading, temporal processing likewise has a strong relationship with auditory and motor skills, and further affects language and literacy ability. The study suggests that skills learned in musical work can be transferred to other areas of learning and that structured music activities may be very efficient for helping children learn academic concepts.

Linguistic Productivity and Chomskyan Grammar: A Critique (언어창조성과 춈스키 문법 비판)

  • Bong-rae Seok
    • Lingua Humanitatis / v.1 no.1 / pp.235-251 / 2001
  • According to Chomskyan grammar, humans can generate and understand an unbounded number of grammatical sentences. This linguistic productivity is argued for and understood against the background of pure and idealized linguistic competence. In actual utterances, however, productivity has many limitations, but these are said to come from general constraints on performance such as the capacity of short-term memory or attention. In this paper I discuss a problem raised against idealized productivity. I argue that linguistic productivity idealizes our linguistic competence too much. By separating idealized competence from the various constraints of performance, Chomskyan theorists can argue for unlimited productivity. However, the absolute distinction between grammar (pure competence) and parser (actual psychological processes) makes little sense when we explain the low acceptability (intelligibility) of center-embedded sentences. Usually, the problem of center-embedded sentences is explained in terms of memory shortage or other performance constraints. To explain the low acceptability, however, we need to assume a specialized memory structure, because the low acceptability occurs only with a specific type of syntactic pattern. I argue that this special memory structure should not be considered a general performance constraint. It is a domain-specific (specifically linguistic) constraint and an intrinsic part of human language processing. A recent development of Chomskyan grammar, the minimalist approach, seems to close the gap between pure competence and this type of specialized constraint. Chomsky's earlier approach to generative grammar focuses on the end result of the generative derivation, whereas the economy principle of the minimalist approach focuses on the actual derivational processes. With a less mathematical or less idealized grammar, we can come closer to the actual computational processes that build the syntactic structure of a sentence. In this way, we can have a more concrete picture of our linguistic competence, a competence that is not detached from actual computational processes.
