• Title/Abstract/Keywords: Data journal

On the clustering of huge categorical data

  • Kim, Dae-Hak
    • Journal of the Korean Data and Information Science Society / Vol. 21, No. 6 / pp. 1353-1359 / 2010
  • The basic objective of cluster analysis is to discover natural groupings of items. In general, clustering is based on a similarity (or dissimilarity) matrix or on the original input data. Various measures of similarity between objects have been developed. In this paper, we consider the clustering of a huge real categorical data set that describes the time-location-activity patterns of Korean people. Some useful similarity measures for the data set are developed and adopted for the categorical variables. Hierarchical and nonhierarchical clustering methods are applied to the data set, which is huge and consists of many categorical variables.
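
The abstract does not say which similarity measures the paper develops. As a rough illustration of what a similarity measure for categorical variables can look like, the sketch below uses the simple matching coefficient, which is an assumption here and not necessarily the authors' measure. The record fields (time slot, location, activity) and the function names are hypothetical.

```python
from typing import List, Sequence


def simple_matching_similarity(a: Sequence[str], b: Sequence[str]) -> float:
    """Return the fraction of categorical attributes on which two records agree."""
    if len(a) != len(b):
        raise ValueError("records must have the same number of attributes")
    if not a:
        raise ValueError("records must have at least one attribute")
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return matches / len(a)


def dissimilarity_matrix(records: Sequence[Sequence[str]]) -> List[List[float]]:
    """Build a symmetric matrix of 1 - similarity for every pair of records."""
    n = len(records)
    return [
        [1.0 - simple_matching_similarity(records[i], records[j]) for j in range(n)]
        for i in range(n)
    ]


# Hypothetical time-location-activity records (illustrative values only).
records = [
    ("morning", "home", "sleep"),
    ("morning", "home", "meal"),
    ("evening", "office", "work"),
]
matrix = dissimilarity_matrix(records)
```

A matrix like this could then be passed to a hierarchical clustering routine. For a data set as large as the one described, a full pairwise matrix may be too expensive to build, which is one reason nonhierarchical methods are often considered alongside hierarchical ones.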

Designing Summary Tables for Mining Web Log Data

  • Ahn, Jeong-Yong
    • Journal of the Korean Data and Information Science Society / Vol. 16, No. 1 / pp. 157-163 / 2005
  • On the Web, data is generally gathered automatically by Web servers and collected in server or access logs. However, as users access larger and larger amounts of data, query response times for extracting information inevitably get slower. One way to resolve this issue is to use summary tables. In this short note, we design a prototype of summary tables that can efficiently extract information from Web log data. We also compare the performance of the summary tables with that of a sampling technique and of a method that uses raw data.
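
The idea of a summary table can be sketched as follows: raw log rows are aggregated once, and later queries read the much smaller aggregate instead of rescanning the raw data. This is a minimal sketch under assumed data. The log layout, the grouping key (date, page), and the function names are hypothetical and are not taken from the paper's prototype.

```python
from collections import Counter
from typing import Iterable, Tuple

# Hypothetical raw access-log rows: (date, requested page, client IP).
raw_log = [
    ("2005-01-01", "/index.html", "10.0.0.1"),
    ("2005-01-01", "/index.html", "10.0.0.2"),
    ("2005-01-01", "/about.html", "10.0.0.1"),
    ("2005-01-02", "/index.html", "10.0.0.3"),
]


def build_summary_table(rows: Iterable[Tuple[str, str, str]]) -> Counter:
    """Pre-aggregate hit counts per (date, page) in a single pass over the raw rows."""
    summary: Counter = Counter()
    for date, page, _ip in rows:
        summary[(date, page)] += 1
    return summary


def hits_for_page(summary: Counter, page: str) -> int:
    """Answer a query from the summary table without touching the raw log."""
    return sum(count for (_date, p), count in summary.items() if p == page)


summary = build_summary_table(raw_log)
total_index_hits = hits_for_page(summary, "/index.html")
```

The trade-off is that the summary only answers questions expressible in its grouping key. Here the client IP is discarded, so per-visitor queries would still need the raw log or a different summary.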

Big Data in Smart Tourism: A Perspective Article

  • Park, Sangwon
    • Journal of Smart Tourism / Vol. 1, No. 3 / pp. 3-5 / 2021
  • The advancement of Information Communication Technology has given tourism researchers a golden opportunity to access big data, which plays a critical role in smart tourism. Recognizing this, the paper discusses the evolution of the literature on tourism big data, focusing on the conceptual understanding and types of big data and on insights from big data analytics. The article provides an important research agenda for future tourism researchers who would like to conduct academic research on big data and smart tourism.

A Big Data-Driven Business Data Analysis System: Applications of Artificial Intelligence Techniques in Problem Solving

  • Donggeun Kim;Sangjin Kim;Juyong Ko;Jai Woo Lee
    • 한국빅데이터학회지 / Vol. 8, No. 1 / pp. 35-47 / 2023
  • It is crucial to develop effective and efficient big data analytics methods for problem-solving in business, in order to improve the performance of data analytics and reduce costs and risks in the analysis of customer data. In this study, a big data-driven data analysis system using artificial intelligence techniques is designed to increase the accuracy of big data analytics, in step with the rapid growth of the field of data science. We present a key direction for big data analysis systems through missing value imputation, outlier detection, feature extraction, utilization of explainable artificial intelligence techniques, and exploratory data analysis. Our objective is not only to develop big data analysis techniques for business data with complex structures but also to bridge the gap between the theoretical ideas in artificial intelligence methods and the analysis of real-world data in business.

데이터 융합인재 직무모형 개발 연구 (A Research on Job Model Development for Data Convergent Talent)

  • 엄혜미;유윤형
    • 한국정보시스템학회지:정보시스템연구 / Vol. 33, No. 1 / pp. 207-226 / 2024
  • Purpose: This study aims to develop a job model for data convergent talents to meet the rapidly changing demands of the data industry. To create the job model, we first define and categorize data convergent talents with balanced competencies in data technology and domain knowledge, and then develop the model by investigating job areas, scope, activities, and competencies. Design/methodology/approach: The research follows three steps. First, we survey the current status of data talent demand, data talent policies, data talent programs, and curricula at home and abroad; second, we collect opinions on the jobs and competencies required of data convergent talents and on curricula for talent development through in-depth interviews with experts; and third, we present the job areas and job activities of data convergent talents derived from the status survey and expert opinions, based on the National Competency Standards (NCS). Findings: The findings indicate a total of six job roles for data convergent talents: data scientist, data planner, data architect, data developer, data engineer, and data analyst. Each of these roles requires the development of common competencies within its field, followed by further specialization into specific competencies within each professional domain.

지역 스마트팜 데이터 연계 및 서비스 활용에 대한 연구 (Research on Regional Smart Farm Data Linkage and Service Utilization)

  • 이원구;구현정;채철주
    • 현장농수산연구지 / Vol. 26, No. 2 / pp. 14-24 / 2024
  • To enhance the usability of smart agriculture, methods for utilizing smart farm data are required. Therefore, this study proposes a scheme for utilizing regional smart farm data by linking it to services. The current status of domestic and foreign smart farm data collection and linkage services is analyzed. To collect and link regional smart farm data, necessary data collection, data cleaning, data storage structure and schema, and data storage and linkage systems are proposed. Based on the standards currently being implemented for regional smart farm internal data storage, a farm schema, environmental information schema, facility control information schema, and growth information schema are designed by extending the crop schema and crop main environmental factor information database schema. A data collection and management system structure based on the Hadoop Ecosystem is designed for data collection and management at regional smart farm data centers. Strategies are proposed for utilizing regional smart farm data to provide smart farm productivity improvement and revenue optimization services, image-based crop analysis services, and virtual reality-based smart farm simulation services.

공공데이터 활용성 제고를 위한 권리처리 플랫폼 구축 전략 (Strategy for Establishing a Rights Processing Platform to Enhance the Utilization of Open Data)

  • 심준보;권헌영
    • 한국IT서비스학회지 / Vol. 21, No. 3 / pp. 27-42 / 2022
  • Open Data is an essential resource for the data industry. The 'Act On Promotion Of The Provision And Use Of Public Data', enacted on July 30, 2013, requires public institutions to manage the quality of Open Data and provide it to the public. This legislation establishes the legal basis for public access to Open Data. Furthermore, public institutions are prohibited from developing and providing open data services that duplicate or resemble those of the private sector, and private start-ups using open data are supported. However, as demand for Open Data gradually increases, cases in which public institutions refuse to provide Open Data or interrupt its provision are also increasing. Accordingly, the 'Open Data Mediation Committee' has been established and is operated so that the right to use data can be protected through a simple dispute mediation procedure rather than complicated administrative litigation. The main issues dealt with in dispute settlement so far usually concern the rights of third parties, as with open data that includes personal information, private information such as trade secrets, or copyrighted material. In addition, non-open data cannot be provided without the consent of the information subject. Rather than converting non-open data into open data through de-identification, positive results can be expected if consent is obtained through active rights processing with the personal information subject. Not only can the information subject use the Public MyData service, but Open Data applicants will also be able to secure higher-quality Open Data, which will have a positive impact on fostering the private data industry. This study derives a plan to establish a rights processing platform to enhance the usability of Open Data that includes private information such as personal information, trade secrets, and copyrighted material, which has been an issue in providing Open Data since 2014. The proposals in this study are expected to serve as a stepping stone for revitalizing private start-ups through wider use of Open Data and for improving public convenience through the Public MyData services of information subjects.

링크드 데이터 방식을 통한 서지 정보의 확장에 관한 연구 (Extending Bibliographic Information Using Linked Data)

  • 박지영
    • 정보관리학회지 / Vol. 29, No. 1 / pp. 231-251 / 2012
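  • In this study, linked data was chosen as a means of extending bibliographic information, because linked data provides identifiers, data structures, and link information that can be shared over the Web. In particular, linked data is useful for extending bibliographic data when combined with bibliographic ontologies. Accordingly, we analyzed linked data and bibliographic ontologies and reviewed linked data sets that could be interlinked. On this basis, we linked existing authority data and bibliographic data that had been published as linked data. This experimental linking allowed us to identify tasks for using linked data effectively in the future. Specifically, we proposed that 1) each institution should be able to select suitable data from among the various linked data sets, 2) criteria for linking the selected linked data should be established, and 3) each library should develop its own unique data and share it in turn.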

품질관리시스템을 활용한 태양에너지자원 신뢰성 향상에 관한 연구 (The Study on the Reliability Enhancement for Solar Energy Resources Using the Data quality Management System in Korea (Focused on Data Error Analysis))

  • 조덕기;강용혁
    • 한국태양에너지학회 논문집 / Vol. 27, No. 1 / pp. 19-27 / 2007
  • The Data Quality Management System (DQMS) organizes and helps manage and process time-sequence data usually collected in monitoring networks and programs. DQMS places particular emphasis on data quality while maintaining a highly organized and convenient structure for data. It operates within a flexible and powerful commercial relational database environment that can readily link to other software platforms, from local spreadsheets to network servers. The Korea Institute of Energy Research (KIER) has been collecting solar radiation data since May 1991 at 16 different locations. KIER's new data is expected to be used extensively by designers and researchers of solar systems in lieu of unreliable old data. Unfortunately, the quality of the data has not always been properly mentioned. The purpose of this study is to systematically identify errors in such data sets using DQMS in an effort to rehabilitate error-ridden old data. DET successfully uncovered solar radiation data that had questionable quality.

XMDR 데이터 허브 기반의 Proxy 데이터베이스를 이용한 데이터 상호운용 프레임워크 (Data Interoperability Framework based on XMDR Data Hub using Proxy DataBase)

  • 문석재;정계동;최영근
    • 한국정보통신학회논문지 / Vol. 12, No. 8 / pp. 1463-1472 / 2008
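  • This paper proposes a framework that enables data interoperability among legacy databases by using a Proxy Database based on an XMDR (eXtended Meta-Data Registry) data hub. In collaborative environments, interoperation among legacy databases gives rise to heterogeneity problems in the structure, semantics, and format of data. In addition, it is difficult to keep data that changes in real time continuously consistent regardless of its type and format. We use the XMDR data hub to resolve the heterogeneity problems that can arise in data integration and interoperation among legacy DBs. Using the Proxy Database, we propose a framework in which the data to be interoperated is compatible regardless of type and format, and in which accurate information is provided continuously, consistently, and in real time.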