• Title/Summary/Keyword: Web Databases

Analysis of Impact Between Data Analysis Performance and Database

  • Kyoungju Min; Jeongyun Cho; Manho Jung; Hyangbae Lee
    • Journal of information and communication convergence engineering / v.21 no.3 / pp.244-251 / 2023
  • Engineering or humanities data are stored in databases and are often used for search services. While the latest deep-learning technologies, such as BART and BERT, are used for data analysis, humanities data still rely on traditional databases. Representative analysis methods include n-gram and lexical statistical extraction. However, when a database is used, performance limitations are often imposed on the result calculations. This study presents an experimental process using MariaDB on a PC, a setup that is easily accessible in a laboratory, to analyze the impact of the database on data analysis performance. The findings show that the database becomes a bottleneck when analyzing large-scale text data, particularly beyond hundreds of thousands of records. To address this issue, a method is proposed for providing real-time humanities data analysis web services by leveraging the open-source database, with a focus on the Seungjeongwon-Ilgy, one of the largest datasets in the humanities.
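
The n-gram extraction named in this abstract can be illustrated with a minimal in-memory sketch. This is not the authors' implementation: the function names `char_ngrams` and `ngram_counts`, the choice of character (rather than word) n-grams, and the sample records are all assumptions made for illustration. It shows only the counting step that a database-backed pipeline would have to compute per record.

```python
from collections import Counter


def char_ngrams(text, n):
    """Return all contiguous character n-grams of text (empty if text is shorter than n)."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def ngram_counts(records, n=2):
    """Count n-gram frequencies across an iterable of text records."""
    counts = Counter()
    for record in records:
        counts.update(char_ngrams(record, n))
    return counts


# Hypothetical records standing in for rows fetched from a database table.
records = ["abab", "abc"]
counts = ngram_counts(records, n=2)
print(counts.most_common(3))
```

Because every record contributes roughly one n-gram per character, the work grows with total text length, which is consistent with the abstract's observation that large record counts stress the data store.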

A Novel Method for Matching between RDBMS and Domain Ontology

  • Lee, Ki-Jung; WhangBo, Taeg-Keun
    • Journal of Korea Multimedia Society / v.9 no.12 / pp.1552-1559 / 2006
  • In a web environment, similar information exists in many different places in diverse formats. Even duplicate information is stored in various databases using different terminologies. However, since most information currently served on the World Wide Web was constructed before the advent of ontologies, it is practically impossible to construct ontologies for all of those web resources. In this paper, we assume that most information in the web environment exists in the form of RDBMS tables, and propose a matching method between a domain ontology and existing RDBMS tables for semantic retrieval. In the process of extracting a local ontology, problems such as the loss of domain information can occur because the correlation with the domain ontology is not considered at all. To prevent these problems, we propose an instance-based matching method that uses relational information between RDBMS tables and relational information between classes in the domain ontology. To verify the efficiency of the proposed method, several experiments are conducted using the digital heritage information currently served by museums across the country. Results show that the proposed method increases retrieval accuracy in terms of user relevance and satisfaction.

A Study on the Development of the Quality Evaluation Standard of Web-Based Databases (웹 기반 데이터베이스의 품질평가 기준 개발에 관한 연구)

  • Hong Hyun-Jin
    • Journal of the Korean Society for Library and Information Science / v.39 no.2 / pp.211-235 / 2005
  • The purpose of this study was to design an extensive evaluation model for the quality of web-based databases and to conduct a comparative analysis of people's awareness of the importance of the assessment indices. The evaluation standard developed in this study consists of 19 indices and 45 elements in three areas: data, service, and effectiveness. A survey was conducted among database experts and nonprofessional users to determine their awareness of the importance of the web-based database evaluation elements. The findings may help lay the foundation for more in-depth research in the future.

Analysis of Internet User Features using Multi-dimensional Association Analysis (다차원 연관 분석을 이용한 인터넷 이용자의 특징 분석)

  • Lee, Su-Eun; Jung, Yong-Gyu
    • Journal of Service Research and Studies / v.1 no.1 / pp.61-69 / 2011
  • Data mining means finding useful, previously unknown knowledge in large databases that cannot be extracted with a simple query; it can be defined as gaining insight into the data on that basis. In this paper, we analyze the characteristics of Internet users by finding useful patterns in data on the Web or in data stored for a target web site. We apply association analysis to general statistical data on Internet users to analyze the user characteristics that affect the amount of time spent using the Internet. Through experiments, we extract association rules from the data, applying the data preprocessing and web mining algorithm that produce optimal results, and analyze the characteristics of Internet users.
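
Association rules of the kind this abstract mentions are conventionally scored by support and confidence. The sketch below computes both measures on a tiny made-up data set; the attribute labels (`age:20s`, `usage:high`, and so on), the function names, and the records are hypothetical and are not taken from the paper.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = frozenset(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimate P(consequent | antecedent) as support(A and C) / support(A)."""
    base = support(transactions, antecedent)
    if base == 0:
        return 0.0
    return support(transactions, set(antecedent) | set(consequent)) / base


# Hypothetical user records: each is a set of attribute labels.
transactions = [
    frozenset({"age:20s", "usage:high"}),
    frozenset({"age:20s", "usage:high"}),
    frozenset({"age:20s", "usage:low"}),
    frozenset({"age:50s", "usage:low"}),
]

rule_support = support(transactions, {"age:20s", "usage:high"})
rule_confidence = confidence(transactions, {"age:20s"}, {"usage:high"})
print(rule_support, rule_confidence)
```

A rule such as "age:20s implies usage:high" would be kept only if both values exceed thresholds chosen by the analyst; the thresholds themselves are not stated in the abstract.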

URL Signatures for Improving URL Normalization (URL 정규화 향상을 위한 URL 서명)

  • Soon, Lay-Ki; Lee, Sang-Ho
    • Journal of KIISE:Databases / v.36 no.2 / pp.139-149 / 2009
  • In the standard URL normalization mechanism, URLs are normalized syntactically by a set of predefined steps. In this paper, we propose to complement the standard URL normalization by incorporating semantically meaningful metadata of the web pages. The metadata taken into consideration are the body text and the page size of the web pages, which can be extracted during HTML parsing. The results of our first exploratory experiment indicate that the body text is effective in identifying equivalent URLs. Hence, in the second experiment, given a URL that has undergone the standard normalization, we construct its URL signature by hashing the body text of the associated web page using the Message-Digest algorithm 5 (MD5). URLs that share identical signatures are considered equivalent in our scheme. The results of the second experiment show that our proposed URL signatures were able to further reduce redundant URLs by 32.94% in comparison with the standard URL normalization.
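
The signature idea described in this abstract, hashing a page's body text with MD5 and treating URLs with identical digests as equivalent, can be sketched as follows. This is an illustration under stated assumptions rather than the authors' code: the function names, the whitespace-collapsing step, and the sample pages are hypothetical, and fetching and HTML parsing are omitted.

```python
import hashlib


def url_signature(body_text):
    """Return the MD5 hex digest of a page's body text."""
    # Assumption: collapse runs of whitespace before hashing.
    # The abstract does not say whether the text is preprocessed.
    normalized = " ".join(body_text.split())
    return hashlib.md5(normalized.encode("utf-8")).hexdigest()


def group_equivalent_urls(pages):
    """Group URLs whose body texts produce the same signature."""
    groups = {}
    for url, body in pages.items():
        groups.setdefault(url_signature(body), []).append(url)
    return groups


# Hypothetical pages: URL (already syntactically normalized) -> extracted body text.
pages = {
    "http://example.com/": "Welcome to the example site",
    "http://example.com/index.html": "Welcome  to the example site",
    "http://example.com/about": "About us",
}

groups = group_equivalent_urls(pages)
for signature, urls in groups.items():
    print(signature, urls)
```

MD5 serves here only as a content fingerprint, not as a security measure, so two pages that differ by a single character receive different signatures.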

GIS AND WEB-BASED DSS FOR PRELIMINARY TMDL DEVELOPMENT

  • Choi, Jin-Yong; Bernard A. Engel; Yoon, Kwang-Sik
    • Water Engineering Research / v.4 no.1 / pp.19-30 / 2003
  • TMDL development and implementation have great potential for use in efforts to improve water quality management, but the TMDL approach still has several difficulties to overcome in terms of cost, time requirements, and suitable methodologies. A well-defined prioritization approach for identifying watersheds of concern among several target locations that would benefit from TMDL development and implementation, based on a simple screening approach, could be a major step in solving some of these difficulties. Therefore, a web-based decision support system (DSS) was developed to help identify areas within watersheds that might be priority areas for TMDL development. The DSS includes a graphical user interface based on the HTML protocol, hydrological models, databases, and geographic information system (GIS) capabilities. The DSS has a hydrological model that can estimate non-point source pollution loading based on over 30 years of daily direct runoff using the curve number method and pollutant event mean concentration data. The DSS provides comprehensive output analysis tools using charts and tables, and also provides probability analysis and best management practice cost estimation. In conclusion, the DSS is a simple, affordable tool for the preliminary study of TMDL development via the Internet, and the DSS web site can also be used as an information web server for education related to TMDL.
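
The loading estimate described in this abstract combines curve-number runoff with event mean concentration (EMC) data. The sketch below uses the conventional SCS curve number equations in metric form, with the commonly used initial abstraction ratio of 0.2 as an assumption; the function names and sample values are hypothetical, and the DSS's actual parameters and data are not given in the abstract.

```python
def runoff_depth_mm(rain_mm, cn, ia_ratio=0.2):
    """Daily direct runoff depth (mm) from the SCS curve number equation."""
    s = 25400.0 / cn - 254.0   # potential maximum retention S, in mm
    ia = ia_ratio * s          # initial abstraction (assumed 0.2 * S)
    if rain_mm <= ia:
        return 0.0             # all rainfall is abstracted; no direct runoff
    return (rain_mm - ia) ** 2 / (rain_mm - ia + s)


def pollutant_load_kg(runoff_mm, area_ha, emc_mg_per_l):
    """Pollutant load (kg) = EMC * runoff volume.

    1 mm of runoff over 1 ha is 10 m^3 (10,000 L), so
    load_kg = EMC[mg/L] * 10,000 L * runoff_mm * area_ha / 1e6 mg/kg.
    """
    return 0.01 * emc_mg_per_l * runoff_mm * area_ha


# Hypothetical inputs: 50 mm of rain on a 100 ha area with CN = 80 and EMC = 2 mg/L.
q = runoff_depth_mm(50.0, 80)
load = pollutant_load_kg(q, area_ha=100.0, emc_mg_per_l=2.0)
print(round(q, 2), round(load, 2))
```

Summing such daily loads over a multi-decade rainfall record would give the long-term loading picture the abstract refers to; how the DSS aggregates them is not specified there.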

Implementation of WebGIS for Integration of GIS Spatial Analysis and Social Network Analysis (GIS 공간분석과 소셜 네트워크 분석의 통합을 위한 WebGIS 구현)

  • Choi, Hyo-Seok; Yom, Jae-Hong
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography / v.32 no.2 / pp.95-107 / 2014
  • In general, topographical phenomena are represented graphically by data in the spatial domain, while attributes of the non-spatial domain are expressed by alphanumeric text. GIS functions for analyzing attributes in the non-spatial domain remain quite simple, such as search methods and simple statistical analysis. Recently, graph modeling and network analysis of social phenomena have been commonly used for understanding various social events and phenomena. In this study, we applied network analysis functions to the non-spatial domain data of GIS to enhance the overall spatial analysis. For this purpose, a novel design was presented to integrate the spatial database and the graph database, and this design was then implemented in a WebGIS system for better decision making. The developed WebGIS, with its underlying synchronized databases, was tested in a simulated application concerning the selection of water supply households during an epidemic of foot-and-mouth disease. The results of this test indicate that the developed WebGIS can contribute to improved decisions by taking into account social proximity factors as well as geospatial factors.

e-Cohesive Keyword based Arc Ranking Measure for Web Navigation (연관 웹 페이지 검색을 위한 e-아크 랭킹 메저)

  • Lee, Woo-Key; Lee, Byoung-Su
    • Journal of KIISE:Databases / v.36 no.1 / pp.22-29 / 2009
  • The World Wide Web has emerged as the largest medium, allowing even a single user to market products and publish information, while also giving users access to abundant information. As a result, the web holds a large amount of related information distributed over multiple web pages. Current search engines search for all the entered keywords in a single web page and rank the resulting set of web pages as the answer to the user query. However, this approach fails to retrieve pairs of web pages that together contain more relevant information for the user's search. We introduce a new search paradigm that gives different weights to the query keywords according to their order of appearance. We propose a new arc weight measure that assigns more relevance to a pair of web pages in which alternate keywords are present, so that pairs of web pages containing related but distributed information can be presented to the user. In our experiments on similarity search, the e-arc ranking measure proved effective and outperformed conventional measures.

Design and Implementation of the KRISTAL-II Web Gateway for Efficiently Processing a Large Number of On-line Retrieval Requests (대규모 온라인 검색 요구를 효율적으로 처리하기 위한 KRISTAL-II웹 게이트웨이의 설계 및 구현)

  • Lee, Ki-Yong; Kwak, Tae-Yeong; Seo, Jung-Hyun; Kim, Myoung-Ho
    • Journal of KIISE:Computing Practices and Letters / v.6 no.5 / pp.496-504 / 2000
  • The Web gateway is a key technology for interoperation between the WWW and databases. The previous KRISTAL-II information retrieval system, developed by KORDIC (Korea Research & Development Information Center), used a web gateway with a simple CGI structure. While such a gateway is easy to implement, it is not suitable for processing a large number of on-line retrieval requests. Given the growth of the Internet and the WWW, it is very important to develop a web gateway that efficiently supports a large number of concurrent users. In this paper, we propose a web gateway with a 3-tier client-server structure for the KRISTAL-II information system. We also evaluate the performance of the proposed web gateway through experiments.

Classification of Web Search Engines and Necessity of a Hybrid Search Engine (웹 검색엔진 분류 및 하이브리드 검색엔진의 필요성)

  • Paik, Juryon
    • Journal of Digital Contents Society / v.19 no.4 / pp.719-729 / 2018
  • In 2017, Google was reported to hold more than 90% of the search-engine market share on desktop and mobile devices. Most people may assume that Google searches the entire web. Surprisingly, however, according to many studies of web data, Google searches less than 10% of it. The larger, unsearched portion is called the Deep Web, and it is indexable by special search engines, which differ from Google in that they focus on a specific segment of interest. These engines build their own deep-web databases and run particular algorithms to provide accurate and professional search results. Currently, no search engine indexes the entire Web. The best approach is to use several search engines together for searches that are as broad and efficient as possible. This paper defines that kind of search engine as a Hybrid Search Engine and presents its characteristics and its differences from conventional search engines, along with a framework for a hybrid search engine.