• Title/Summary/Keyword: Information Merging

Integration of WFST Language Model in Pre-trained Korean E2E ASR Model

  • Junseok Oh;Eunsoo Cho;Ji-Hwan Kim
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.18 no.6
    • /
    • pp.1692-1705
    • /
    • 2024
  • In this paper, we present a method that integrates a Grammar Transducer as an external language model to enhance the accuracy of the pre-trained Korean End-to-end (E2E) Automatic Speech Recognition (ASR) model. The E2E ASR model utilizes the Connectionist Temporal Classification (CTC) loss function to derive hypothesis sentences from input audio. However, this method reveals a limitation inherent in the CTC approach, as it fails to capture language information from transcript data directly. To overcome this limitation, we propose a fusion approach that combines a clause-level n-gram language model, transformed into a Weighted Finite-State Transducer (WFST), with the E2E ASR model. This approach enhances the model's accuracy and allows for domain adaptation using just additional text data, avoiding the need for further intensive training of the extensive pre-trained ASR model. This is particularly advantageous for Korean, characterized as a low-resource language, which confronts a significant challenge due to limited resources of speech data and available ASR models. Initially, we validate the efficacy of training the n-gram model at the clause-level by contrasting its inference accuracy with that of the E2E ASR model when merged with language models trained on smaller lexical units. We then demonstrate that our approach achieves enhanced domain adaptation accuracy compared to Shallow Fusion, a previously devised method for merging an external language model with an E2E ASR model without necessitating additional training.
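
The Shallow Fusion baseline mentioned in the abstract above is commonly described as a log-linear combination of the ASR score and an external language model score. The sketch below illustrates that combination as n-best rescoring, which is a simplification (fusion is usually applied inside beam search). It is not the paper's WFST method. The function names, the tuple layout, and the default weight are assumptions made for illustration.

```python
def shallow_fusion_score(asr_logprob, lm_logprob, lm_weight=0.3):
    """Combine ASR and external LM log-probabilities (lm_weight is an assumed value)."""
    return asr_logprob + lm_weight * lm_logprob


def rescore(hypotheses, lm_weight=0.3):
    """Return the text of the best hypothesis.

    Each hypothesis is a hypothetical (text, asr_logprob, lm_logprob) tuple.
    """
    best = max(
        hypotheses,
        key=lambda h: shallow_fusion_score(h[1], h[2], lm_weight),
    )
    return best[0]
```

With `lm_weight=0` the ranking falls back to the ASR score alone, so the weight controls how strongly the text-only language model can override the acoustic evidence.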

Skew Correction of Business Card Images for PDA Application (PDA 응용을 위한 명함 영상의 회전 보정)

  • 박준효;장익훈;김남철
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.28 no.12C
    • /
    • pp.1225-1238
    • /
    • 2003
  • We present an efficient algorithm for skew correction of business card images obtained by a PDA (personal digital assistant) camera. The proposed method is composed of four parts: block adaptive binarization (BAB), stripe generation, skew angle calculation, and image rotation. In the BAB, an input image is binarized block by block so as to lessen the effect of irregular illumination and shadow over the input image. In the stripe generation, character string clusters are generated by merging adjacent characters and their strings, and then only clusters useful for skew angle calculation are output as stripes. In the skew angle calculation, the direction angles of the stripes are calculated using their central moments, and then the skew angle of the input image is determined by averaging the direction angles. In the image rotation, the input image is rotated by the skew angle. Experimental results show that the proposed method yields skew correction rates of about 93% for test images of several types of business cards acquired by a PDA under various surrounding conditions.
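
The skew angle calculation step described above can be sketched with the standard orientation formula based on second-order central moments, theta = 0.5 * atan2(2*mu11, mu20 - mu02). The abstract does not give the paper's exact formula, so this is an assumption. The function names and the representation of a stripe as a list of (x, y) pixel coordinates are also hypothetical.

```python
import math


def direction_angle(points):
    """Orientation in radians of one stripe, from its second-order central moments."""
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    mu20 = sum((x - cx) ** 2 for x, _ in points)
    mu02 = sum((y - cy) ** 2 for _, y in points)
    mu11 = sum((x - cx) * (y - cy) for x, y in points)
    return 0.5 * math.atan2(2.0 * mu11, mu20 - mu02)


def skew_angle(stripes):
    """Average the direction angles of all stripes (plain arithmetic mean, assumed)."""
    angles = [direction_angle(stripe) for stripe in stripes]
    return sum(angles) / len(angles)
```

A plain mean is adequate only when the angles are small and do not wrap around plus or minus 90 degrees, which seems a reasonable assumption for lightly skewed card images.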

Framework of File System Robustness Test (FORT : 파일 시스템 강인성 테스트 프레임 워크)

  • Kim, Young-Jin;Won, You-Jip;Kim, Ra-Kie;Lee, Mo-Won;Park, Jae-Seok;Lee, Joo-Wheun
    • Journal of KIISE:Computer Systems and Theory
    • /
    • v.34 no.8
    • /
    • pp.348-366
    • /
    • 2007
  • The capacity of modern storage devices keeps growing, and disks are becoming more densely integrated. As a result, physical errors can damage a large amount of digital information on storage devices. In this paper, we propose a file system test framework for testing the integrity and robustness of file systems. We develop a tool for generating bad sectors on disks and a tool that creates all the physical errors defined for storage devices. We also develop a tool for immediately monitoring the status of read and write operations on storage devices. By integrating these tools, we develop FORT, a test framework for confirming the robustness of file systems. We analyze the robustness of the ext3 file system with FORT. Lastly, we present a draft of an intelligent system architecture that merges the file system and device driver layers.

A Study on the Efficient Task Scheduling by the Reconstructed Task Graph (태스크 그래프의 재구성에 의한 효율적 태스크 스케줄링에 관한 연구)

  • Byun, Seung-Hwan;Yoo, Kwan-Jong
    • The Transactions of the Korea Information Processing Society
    • /
    • v.4 no.9
    • /
    • pp.2235-2246
    • /
    • 1997
  • This paper presents an effective heuristic task scheduling algorithm for multiprocessor systems. Task scheduling is defined as the allocation of m tasks onto n processors (m > n), and carrying it out effectively requires resolving several problems that are close to NP-hard. The purpose of task scheduling is either to obtain the minimum execution time by mapping the tasks onto a system topology, or to reduce the total execution time to yield a minimum system topology. To solve this problem, this paper performs task scheduling by redefining a task graph as a reconstructed task graph (RTG). An RTG is obtained by merging or copying nodes so that the number of nodes on each level of the task graph equals the number of processors of the system topology, and it is then scheduled directly onto the system topology. This method achieves fast scheduling, a simple scheduling procedure, and near-optimal execution time without post-scheduling steps such as refinement and duplication.
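
As a rough illustration of the "nodes on each level" idea above, the sketch below assigns each task a level and groups tasks by level, so that levels holding more or fewer tasks than there are processors can be spotted. It assumes a level is the longest path length from an entry task, which the abstract does not specify. The merging and copying rules of the RTG are not reproduced, and all names are hypothetical.

```python
from collections import defaultdict, deque


def task_levels(nodes, edges):
    """Level of each task = length of the longest path from any entry task (assumed)."""
    successors = defaultdict(list)
    indegree = {n: 0 for n in nodes}
    for u, v in edges:
        successors[u].append(v)
        indegree[v] += 1

    level = {n: 0 for n in nodes}
    queue = deque(n for n in nodes if indegree[n] == 0)
    while queue:  # Kahn-style topological traversal
        u = queue.popleft()
        for v in successors[u]:
            level[v] = max(level[v], level[u] + 1)
            indegree[v] -= 1
            if indegree[v] == 0:
                queue.append(v)
    return level


def nodes_per_level(level):
    """Group task names by level, e.g. to compare each group with the processor count."""
    groups = defaultdict(list)
    for node, lvl in level.items():
        groups[lvl].append(node)
    return dict(groups)
```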

Marker extraction for morphological image segmentation using marker incubator (형태론적 영상 분할을 위한 마커 배양기를 이용한 마커의 추출)

  • Park, Hyun-Sang;Ra Jong-Beom
    • Journal of the Korean Institute of Telematics and Electronics S
    • /
    • v.35S no.11
    • /
    • pp.106-115
    • /
    • 1998
  • The performance of morphological image segmentation depends heavily on proper selection of markers. In this paper, we propose a marker incubator in which only a catchment basin that has grown sufficiently large through flooding simulation is registered as a marker. At each flooding level, the marker incubator does the following: it grows the defined marker regions, finds new marker regions, and postpones irrelevant regions for examination at the next level. Whether a region is a valid marker is examined using two size-oriented criteria derived from the structuring element size of a morphological filter. The simulation results show that image segmentation with the proposed marker incubator achieves image quality comparable to Wang's method with fewer markers, even without region merging. Additionally, since the proposed method also performs better in terms of image quality and information for transmission, it is well suited for region-based image coding.

Cryptic variation, molecular data, and the challenge of conserving plant diversity in oceanic archipelagos: the critical role of plant systematics

  • Crawford, Daniel J.;Stuessy, Tod F.
    • Korean Journal of Plant Taxonomy
    • /
    • v.46 no.2
    • /
    • pp.129-148
    • /
    • 2016
  • Plant species on oceanic islands comprise nearly 25% of described vascular plants on only 5% of the Earth's land surface, yet are among the most rare and endangered plants. Conservation of plant biodiversity on islands poses particular challenges because many species occur in a few and/or small populations, and their habitats on islands are often disturbed by the activity of humans or by natural processes such as landslides and volcanoes. In addition to described species, evidence is accumulating that there are likely significant numbers of "cryptic" species in oceanic archipelagos. Plant systematists, in collaboration with others in the botanical disciplines, are critical to the discovery of the subtle diversity in oceanic island floras. Molecular data will play an ever increasing role in revealing variation in island lineages. However, the input from plant systematists and other organismal biologists will continue to be important in calling attention to morphological and ecological variation in natural populations and in the discovery of "new" populations that can inform sampling for molecular analyses. Conversely, organismal biologists can provide basic information necessary for understanding the biology of the molecular variants, including diagnostic morphological characters, reproductive biology, habitat, etc. Such basic information is important when describing new species and arguing for their protection. Hybridization presents one of the most challenging problems in the conservation of insular plant diversity: the process has the potential to decrease diversity in several ways, including the merging of species into hybrid swarms, or conversely it may generate stable novel recombinants that merit recognition as new species. These processes are often operative in recent radiations in which intrinsic barriers to gene flow have not evolved. The knowledge and continued monitoring of plant populations in the dynamic landscapes on oceanic islands are critical to the preservation of their plant diversity.

Techniques study of IMS/SIP based Lawful Interception in 3G networks (3G 네트워크에서의 IMS/SIP 기반 합법적 감청 기법)

  • Lee, Myoung-rak;Pyo, Sang-Ho;In, Hoh Peter
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.25 no.6
    • /
    • pp.1411-1420
    • /
    • 2015
  • The lawful interception (LI) standard for telephone networks has technical limitations when it comes to lawfully intercepting IMS/SIP-based mobile communication network subscribers who use Android and iPhone devices. In addition, the technical standards for lawful interception of IMS/SIP in wireless networks are insufficient compared with the systematic study devoted to the development of wireless network infrastructure. The architecture proposed in the ETSI (European Telecommunications Standards Institute) standard for seamless LI is insufficient to overcome the limitations of traditional voice-centric LI techniques. This paper proposes an IMS/SIP-based architecture for performing LI in 3G networks, focusing on mobility-supported environments in which cellular networks and the Internet merge. We implemented a simulation to verify the efficiency of the proposed architecture, and the experimental results show that our method achieves a higher lawful interception rate than existing interception methods.

Sketch Map System using Clustering Method of XML Documents (XML 문서의 클러스터링 기법을 이용한 스케치맵 시스템)

  • Kim, Jung-Sook;Lee, Ya-Ri;Hong, Kyung-Pyo
    • The Journal of the Korea Contents Association
    • /
    • v.9 no.12
    • /
    • pp.19-30
    • /
    • 2009
  • Map services that have recently come into the spotlight first take the user to a map and then present various mash-up results through its interface. Such a service can provide precise information to users, but the map is barely reusable. Unlike existing large map systems, the sketch-map system in this paper represents specific spots and routes in XML documents and then clusters the sketch-maps. The map service system is designed to show the optimum route to a destination in a simple outline map. It does so by refining the spots presented on the map into optimum contents. By analyzing, splitting and clustering the input sketch-map XML documents, the system creates a valid sketch-map. It uses the LCS (Longest Common Subsequence) algorithm for splitting and merging sketch-maps during query processing. In addition, a simulation of the system's expected effects is provided. It shows how maps that share information and knowledge assemble into a large map, and thus presents the system's potential role as a new research portal.
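
The LCS algorithm named above can be sketched with the textbook dynamic-programming formulation. How the paper applies it to sketch-map XML documents is not stated in the abstract, so comparing two routes as sequences of spot names is only an illustrative assumption, and the function name is hypothetical.

```python
def longest_common_subsequence(a, b):
    """Return one longest common subsequence of sequences a and b as a list."""
    m, n = len(a), len(b)
    # table[i][j] = LCS length of a[:i] and b[:j]
    table = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m):
        for j in range(n):
            if a[i] == b[j]:
                table[i + 1][j + 1] = table[i][j] + 1
            else:
                table[i + 1][j + 1] = max(table[i][j + 1], table[i + 1][j])

    # Walk back through the table to recover the subsequence itself.
    result = []
    i, j = m, n
    while i > 0 and j > 0:
        if a[i - 1] == b[j - 1]:
            result.append(a[i - 1])
            i -= 1
            j -= 1
        elif table[i - 1][j] >= table[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return result[::-1]
```

The shared subsequence of two routes marks the spots they have in common and in the same order, which is one plausible basis for deciding where two sketch-maps could be split or merged.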

A Study on the Design Tendency of Contemporary Architecture Introducing New Media Art Concept - Focusing on the change for way of information transmission change and media development - (뉴 미디어 아트의 개념을 도입한 현대 건축의 디자인 경향에관한 연구 - 미디어 발전과 정보 전달 방식의 변화를 중심으로 -)

  • Cho, Kyoung-Soo;Woo, Ji-Chang
    • Korean Institute of Interior Design Journal
    • /
    • v.19 no.2
    • /
    • pp.66-72
    • /
    • 2010
  • The purpose of this study is to identify design trends in contemporary architecture that incorporate the concept of new media art. The prevalence of media has changed perspectives on contemporary aesthetics. To demonstrate this, the study categorizes the expressional characteristics of new media art applied in contemporary architectural design, together with an analysis of the typologies and expressional characteristics that appear in new media art. More specifically, the study selects architects who have adopted the expressional characteristics of new media art in their works since 2000 and performs analytical case studies of the effect of new media art on their architectural practice. Following this methodology, the study categorizes the distinct expressional characteristics that appear in contemporary architecture as a result of merging with new media art. These characteristics fall into three groups: design that controls the external environment, design that utilizes the web environment, and design in which users participate. These observations suggest that architects can present interactive design between users and buildings by designing protocols that generate variable forms, colors and patterns in architecture. In particular, architecture that utilizes the web environment is capable of configuring a user's program in virtual space. It is also anticipated to suggest new patterns for generating architectural programs and forms. These patterns would not recognize the city merely as an incident or fragmented image but would configure forms and images constructed by individual notional character. In conclusion, architecture itself is expected to act as a medium that opens up opportunities to expedite interactions among the environment, users, and buildings, departing from the view of architecture as a represented object in modernist architecture or as classical decoration in post-modernist architecture.

An Integrated Genomic Resource Based on Korean Cattle (Hanwoo) Transcripts

  • Lim, Da-Jeong;Cho, Yong-Min;Lee, Seung-Hwan;Sung, Sam-Sun;Nam, Jung-Rye;Yoon, Du-Hak;Shin, Youn-Hee;Park, Hye-Sun;Kim, Hee-Bal
    • Asian-Australasian Journal of Animal Sciences
    • /
    • v.23 no.11
    • /
    • pp.1399-1404
    • /
    • 2010
  • We have created a Bovine Genome Database, an integrated genomic resource for Bos taurus, by merging bovine data from various databases and our own data. We produced 55,213 Korean cattle (Hanwoo) ESTs from cDNA libraries from three tissues. We concentrated on genomic information based on Hanwoo transcripts and provided user-friendly search interfaces within the Bovine Genome Database. The genome browser supports alignment results for various types of data: Hanwoo ESTs, consensus sequences, human genes, and predicted bovine genes. The database also provides transcript data, gene annotation, genomic location, sequence and tissue distribution. Users can also explore bovine disease genes based on comparative mapping of homologous genes and can conduct searches centered on genes within user-selected quantitative trait loci (QTL) regions. The Bovine Genome Database can be accessed at http://bgd.nabc.go.kr.