• Title/Summary/Keyword: Large-scale database

A study on GPS management system on the basis of technology (GPS 도면 지식정보 관리시스템 기술기반에 관한연구)

  • Park, Dong-Heui;Choo, Jun-Sup;Kim, Jong-Min;Gill, Ki-Young
    • Proceedings of the KSR Conference / 2009.05a / pp.1931-1935 / 2009
  • Building a GIS-based information system for the Korean railway network requires considerable cost and time. One difficulty is that such a system needs a feature database for GIS, which is generally built manually from many as-built drawing files. To build the database automatically using GPS coordinates, this study proposes automatic data conversion from electronic drawings into a GIS feature database. The proposed method can be applied to building a large-scale railway facility management system.

Common Speech Database Collection for Telecommunications (통신망환경 한국어 공통음성 DB 구축)

  • Kim Sanghun;Park Moonwhan;Kim Hyunsuk
    • Proceedings of the KSPS conference / 2003.05a / pp.23-26 / 2003
  • This paper presents a common speech database collection for telecommunication applications. Over a three-year project, we will construct very large-scale speech and text databases for speech recognition, speech synthesis, and speaker identification. The common speech database is designed to account for various communication environments and for the distribution of speakers' sex, age, and region. It consists of Korean continuous digits, isolated words, and sentences that reflect Korean phonetic coverage. In addition, it covers various pronunciation styles such as read speech, dialogue speech, and semi-spontaneous speech. The common speech databases prevent duplicated resource investment across the Korean speech industry. They are expected to encourage the domestic speech industry and to activate the domestic speech technology market.

A New Flash-aware Buffering Scheme Supporting Virtual Page Flushing

  • Lim, Seong-Chae
    • International Journal of Internet, Broadcasting and Communication / v.14 no.3 / pp.161-170 / 2022
  • Recently, NAND-type flash memory has been regarded as a promising new storage medium for large-scale database systems. For flash memory to be employed for that purpose, we need to reduce its expensive update cost, which is caused by its inability to perform in-place updates. To remedy this drawback, we propose a new flash-aware buffering scheme that enables virtual flushing of dirty pages. To this end, we slightly alter the traditional algorithms used for logging and buffer management. By using the virtual flushing mechanism, the proposed buffering scheme can efficiently prevent frequent page updates in flash storage. Besides reducing page updates, the virtual flushing mechanism also shortens recovery time after a failure, because it reduces the time spent on redo actions during recovery. Owing to these two benefits, our scheme could be very profitable when incorporated into cutting-edge flash-based database systems.

External vs. Internal: An Essay on Machine Learning Agents for Autonomous Database Management Systems

  • Fatima Khalil Aljwari
    • International Journal of Computer Science & Network Security / v.23 no.10 / pp.164-168 / 2023
  • Database management systems (DBMSs) can be configured in many possible ways, which makes them challenging to manage and tune. The problem grows in large-scale deployments with thousands or millions of individual DBMSs, each with its own configuration requirements. Recent research has explored using machine learning (ML)-based agents to address this problem through automated tuning of DBMSs. These agents extract performance metrics and behavioral information from the DBMS and then train models on this data to select the tuning actions they predict will have the most benefit. This paper discusses two engineering approaches for integrating ML agents into a DBMS. The first is to build an external tuning controller that treats the DBMS as a black box. The second is to incorporate the ML agents natively into the DBMS's architecture.
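
The external, black-box approach described in this abstract can be illustrated with a minimal sketch. Everything named here is hypothetical: `simulated_dbms` stands in for a real benchmark run, the knob names and ranges are invented, and plain random search is used in place of the learned models the paper discusses. The point is only the control flow: the controller sees nothing of the DBMS except a configuration going in and a metric coming out.

```python
import random
from typing import Callable, Dict, Tuple

Config = Dict[str, int]


def simulated_dbms(config: Config) -> float:
    """Hypothetical stand-in for running a benchmark against a real DBMS.

    Returns a throughput-like score that peaks at buffer_mb=512, workers=8.
    """
    return 1000.0 - abs(config["buffer_mb"] - 512) - 20 * abs(config["workers"] - 8)


def external_tuner(
    measure: Callable[[Config], float],
    knobs: Dict[str, Tuple[int, int]],
    trials: int = 200,
    seed: int = 0,
) -> Tuple[Config, float]:
    """Black-box tuning loop: propose a config, observe a metric, keep the best.

    Random search is an assumption of this sketch, not the paper's method.
    """
    rng = random.Random(seed)
    # Start from the lower bound of every knob as the baseline configuration.
    best_config = {name: lo for name, (lo, _hi) in knobs.items()}
    best_score = measure(best_config)
    for _ in range(trials):
        candidate = {name: rng.randint(lo, hi) for name, (lo, hi) in knobs.items()}
        observed = measure(candidate)  # the only interaction with the DBMS
        if observed > best_score:
            best_config, best_score = candidate, observed
    return best_config, best_score


# Hypothetical knob names and ranges, chosen for illustration only.
KNOBS = {"buffer_mb": (64, 1024), "workers": (1, 32)}
best, score = external_tuner(simulated_dbms, KNOBS)
print(best, score)
```

An internal agent, by contrast, would live inside the DBMS and could observe internal state directly instead of only the returned metric; that variant is not sketched here.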

Korea Barcode of Life Database System (KBOL)

  • Kim, Sung-Min;Kim, Chang-Bae;Min, Gi-Sik;Suh, Young-Bae;Bhak, Jong;Woo, Tae-Ha;Koo, Hye-Young;Choi, Jun-Kil;Shin, Mann-Kyoon;Jung, Jong-Woo;Song, Kyo-Hong;Ree, Han-Il;Hwang, Ui-Wook;Park, Yung-Chul;Eo, Hae-Seok;Kim, Joo-Pil;Yoon, Seong-Myeong;Rho, Hyun-Soo;Kim, Sa-Heung;Lee, Hang;Min, Mi-Sook
    • Animal cells and systems / v.16 no.1 / pp.11-19 / 2012
  • A major concern regarding the collection and storage of biodiversity information is the inefficiency of conventional taxonomic approaches in dealing with a large number of species. This inefficiency has increased the demand for automated, rapid, and reliable molecular identification systems and large-scale biological databases. DNA-based taxonomic approaches are now arguably a necessity in biodiversity studies. In particular, DNA barcoding using short DNA sequences provides an effective molecular tool for species identification. We constructed a large-scale database system that holds a collection of 5531 barcode sequences from 2429 Korean species. The Korea Barcode of Life database (KBOL, http://koreabarcode.org) is a web-based database system that is used for compiling a high volume of DNA barcode data and identifying unknown biological specimens. With the KBOL system, users can not only link DNA barcodes and biological information but can also undertake conservation activities, including environmental management, monitoring, and detecting significant organisms.

An Analysis of the Overhead of Multiple Buffer Pool Scheme on InnoDB-based Database Management Systems (InnoDB 기반 DBMS에서 다중 버퍼 풀 오버헤드 분석)

  • Song, Yongju;Lee, Minho;Eom, Young Ik
    • Journal of KIISE / v.43 no.11 / pp.1216-1222 / 2016
  • The advent of large-scale web services has resulted in a gradual increase in the amount of data used by those services. These big data are managed efficiently by DBMSs such as MySQL and MariaDB, which use InnoDB as their storage engine, since InnoDB guarantees ACID properties and is suitable for handling large-scale data. To improve I/O performance, InnoDB caches the data and indexes of its database in a buffer pool. It also supports multiple buffer pools to mitigate lock contention. However, the multiple buffer pool scheme introduces additional data consistency overhead. In this paper, we analyze the overhead of the multiple buffer pool scheme. In our experimental results, although the multiple buffer pool scheme mitigates lock contention by up to 46.3%, DBMS throughput is significantly degraded, by up to 50.6%, due to increased disk I/O and fsync calls.
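
To make the idea of multiple buffer pools concrete, the following is a rough sketch of how partitioning a page cache into independently locked instances can reduce lock contention. It is not InnoDB's implementation: the class names, the LRU policy, and the page-to-instance mapping function are all assumptions made for illustration, and the sketch does not model the flushing and consistency overhead that the paper measures.

```python
import threading
from collections import OrderedDict
from typing import Callable, List, Tuple

PageId = Tuple[int, int]  # (space_id, page_no), a hypothetical page identifier


class BufferPoolInstance:
    """One cache partition with its own lock and a simple LRU list."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.lock = threading.Lock()
        self.pages: "OrderedDict[PageId, str]" = OrderedDict()  # oldest first

    def get(self, page_id: PageId, load: Callable[[PageId], str]) -> str:
        with self.lock:  # only threads hitting this instance contend here
            if page_id in self.pages:
                self.pages.move_to_end(page_id)  # mark as most recently used
                return self.pages[page_id]
            data = load(page_id)  # cache miss: read from "disk"
            self.pages[page_id] = data
            if len(self.pages) > self.capacity:
                self.pages.popitem(last=False)  # evict least recently used
            return data


class MultiBufferPool:
    """Routes each page to one instance so that locks are partitioned."""

    def __init__(self, instances: int, capacity_per_instance: int) -> None:
        self.instances: List[BufferPoolInstance] = [
            BufferPoolInstance(capacity_per_instance) for _ in range(instances)
        ]

    def _instance_for(self, page_id: PageId) -> BufferPoolInstance:
        space_id, page_no = page_id
        # Illustrative mapping only; the real engine's hash function differs.
        return self.instances[(space_id * 31 + page_no) % len(self.instances)]

    def get(self, page_id: PageId, load: Callable[[PageId], str]) -> str:
        return self._instance_for(page_id).get(page_id, load)


pool = MultiBufferPool(instances=4, capacity_per_instance=2)
print(pool.get((0, 1), lambda pid: f"page-{pid}"))
```

Because a given page always maps to the same instance, two threads touching pages in different instances never wait on the same lock; the trade-off reported in the abstract arises elsewhere, from the extra disk I/O and fsync calls.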

Analysis of the Influence Factors of Data Loading Performance Using Apache Sqoop (아파치 스쿱을 사용한 하둡의 데이터 적재 성능 영향 요인 분석)

  • Chen, Liu;Ko, Junghyun;Yeo, Jeongmo
    • KIPS Transactions on Software and Data Engineering / v.4 no.2 / pp.77-82 / 2015
  • Big Data technology has attracted much attention for its fast data processing. Research is also ongoing on applying Big Data technology to process large-scale structured data in relational databases (RDBs) much faster. Although there are many studies on measuring analysis performance, studies on structured data loading performance, the step that precedes analysis, are very rare. Thus, this study tests the performance of loading structured data from an RDB into the distributed processing platform Hadoop using Apache Sqoop. To analyze the factors that influence data loading, the tests were repeated with different data loading options, and loading performance was compared among RDB-based servers. Although the data loading performance of Apache Sqoop was low in the test environment, much better performance can be expected in a large-scale Hadoop cluster environment, where more hardware resources are available. This study is expected to serve as a basis for research on improving data loading performance and on the overall process of analyzing the performance of structured data on the Hadoop platform.

An EJB-Based Database Agent for Workflow Definition (EJB 기반의 워크플로우 정의 데이터베이스 에이전트 설계 및 구현)

  • 오동근;김광훈
    • Journal of Internet Computing and Services / v.2 no.5 / pp.41-47 / 2001
  • This paper deals with an EJB-based database agent (component) used to define workflow processes, a core function of the e-Chautauqua workflow management system, which is an ongoing research product. We describe how to design and implement the EJB-based DB agent, which is deployed on an EJB server as a component. The agent sits between the build-time clients and the database system and manages database accesses, such as retrievals and stores, from the workflow definition components. Through EJB technology, we are able to build a stable database agent characterized by distributed object management, a reliable mechanism for recovery from system failures, reliable large-scale transaction management, and security functions.

A Knowledge-based Question-Answering System: With A View To Constructing A Fact Database (지식기반 (Knowledge-based) 질의응답시스템: 사실 자료 (Fact Database) 구축을 중심으로)

  • 신효필
    • Korean Journal of Cognitive Science / v.13 no.1 / pp.41-51 / 2002
  • In this paper, I describe a knowledge-based question-answering system and its significance with a view to constructing a fact database. The knowledge-based system takes advantage of existing NLP resources, such as the conceptual structures of ontologies, along with morphological, syntactic, and semantic analysis. The use of conceptual structures allows us to select the right answers through inferences made essentially by expanding concepts. However, constructing factual knowledge requires a great deal of acquisition time in large-scale applications because it depends on human intervention. This is why the procedure for acquiring factual knowledge cannot be fully automated. Efficiency considerations aside, the knowledge-based system deserves serious consideration. I point out the benefits of the system and describe the whole procedure of building it in terms of a fact database.

Scale Efficiency and Fishing Capacity Analysis for Large Pair-Trawl Vessels in Korean Waters (한국 근해 쌍끌이 대형기선저인망어선의 규모별 효율성과 어회능력 활용도 평가)

  • Lee, Dong-Woo;Lee, Jae-Bong;Jung, Suk-Geun;Kim, Yeong-Hye
    • Korean Journal of Fisheries and Aquatic Sciences / v.41 no.6 / pp.485-492 / 2008
  • To propose proper vessel characteristics for sustainable fisheries in Korean waters, we analyzed the fishing capacity, scale efficiency, and utilization of large pair-trawl vessels by applying data envelopment analysis (DEA) to a database of catch, effort, and vessel characteristics (gross tonnage and engine power) in 1990. The input factors were gross tonnage, horsepower, and days operated, whereas the output factor was the expected catch by vessel characteristics. The optimal vessel type, selected on the basis of input-oriented technical efficiency and gross tonnage, was 100 GT with engine power <600 HP. The output-oriented unbiased estimate of capacity utilization (CU) decreased with increasing vessel tonnage. For vessels of the same tonnage, CU decreased with increasing engine power.
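
For readers unfamiliar with how an output-oriented, unbiased capacity utilization estimate is usually obtained from DEA, the following sketch writes out one commonly used formulation. It is not taken from the paper. The notation, the constant-returns-to-scale setup, the single output, and the split of inputs into fixed inputs $F$ (for example gross tonnage and engine power) and variable inputs $V$ (for example days operated) are all assumptions made here for illustration.

```latex
% Vessels j = 1, ..., J; vessel k is the one being evaluated.
% u_j: output (catch) of vessel j; x_{jn}: amount of input n used by vessel j.
% F: fixed inputs; V: variable inputs (this split is an assumption of the sketch).
\begin{align*}
\text{Capacity output:}\quad
  & \max_{\theta_1,\, z,\, \lambda}\ \theta_1 \\
  \text{s.t.}\quad
  & \theta_1 u_k \le \sum_{j=1}^{J} z_j u_j, \\
  & \sum_{j=1}^{J} z_j x_{jn} \le x_{kn}, \quad n \in F, \\
  & \sum_{j=1}^{J} z_j x_{jn} = \lambda_n x_{kn}, \quad n \in V, \\
  & z_j \ge 0,\ \lambda_n \ge 0. \\[1ex]
\text{Technically efficient output:}\quad
  & \max_{\theta_2,\, z}\ \theta_2 \\
  \text{s.t.}\quad
  & \theta_2 u_k \le \sum_{j=1}^{J} z_j u_j, \\
  & \sum_{j=1}^{J} z_j x_{jn} \le x_{kn}, \quad n \in F \cup V, \\
  & z_j \ge 0. \\[1ex]
\text{Unbiased capacity utilization:}\quad
  & \mathrm{CU}_k = \frac{\theta_2^{*} u_k}{\theta_1^{*} u_k}
    = \frac{\theta_2^{*}}{\theta_1^{*}} \le 1.
\end{align*}
```

The first program leaves the variable inputs unconstrained, so it is a relaxation of the second and $\theta_1^{*} \ge \theta_2^{*}$. Dividing technically efficient output by capacity output, rather than observed output by capacity output, is what removes the effect of technical inefficiency from the utilization measure.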