Development of a Linked Data Creation System for Ordinary People and Application (일반인을 위한 링크드 데이터 생성 시스템 개발 및 활용)

  • Jung, Hyo-Sook; Kim, Hee-Jin; Park, Seong-Bin
    • The Journal of Korean Association of Computer Education / Vol. 14, No. 2 / pp.47-59 / 2011
  • Linked Data is about using the web to link related data that was not previously linked. To publish linked data, people should be able to represent, share, and link pieces of data, information, and knowledge by using URIs and RDF. However, building linked data is not easy for ordinary users who lack knowledge of or skill with URIs and RDF. In this paper, we present a system with which ordinary users can create linked data by connecting data originating from different RDF sources. They build linked data by adding new links between RDF data saved on their computers or found through Swoogle. The proposed system can be applied to creating educational content. For example, teachers can develop a variety of learning content by building linked data that connects different data suited to the learning level of their students.


A layered-wise data augmenting algorithm for small sampling data (적은 양의 데이터에 적용 가능한 계층별 데이터 증강 알고리즘)

  • Cho, Hee-chan; Moon, Jong-sub
    • Journal of Internet Computing and Services / Vol. 20, No. 6 / pp.65-72 / 2019
  • Data augmentation is a method that increases the amount of data through various algorithms, starting from a small amount of sample data. When machine learning and deep learning techniques are used to solve real-world problems, data sets are often too small. A lack of data increases the risk of underfitting and overfitting, and it also means that the characteristics of the data set are poorly reflected when a model is trained. In this paper, we therefore propose a layer-wise data augmentation method applied at each layer of a deep neural network. The proposed method produces augmented data that is substantially meaningful, and experiments measuring classification accuracy show that it is effective for training the model.

Sequence Stream Indexing Method using DFT and Bitmap in Sequence Data Warehouse (시퀀스 데이터웨어하우스에서 이산푸리에변환과 비트맵을 이용한 시퀀스 스트림 색인 기법)

  • Son, Dong-Won; Hong, Dong-Kweon
    • Journal of the Korean Institute of Intelligent Systems / Vol. 22, No. 2 / pp.181-186 / 2012
  • Recently there has been much active research on searching for similar sequences in data generated with the passage of time. Such data are classified as time series data or sequence data and have different semantics from the scalar data of traditional databases. In this paper, a similar sequence search retrieves sequences that have a similar trend of value changes. First, we transform the original sequences by applying the DFT. The converted data are more suitable for trend analysis and require fewer attributes for sequence comparisons. In addition, we have developed a region-based query and applied bitmap indexes, which can show better performance in a data warehouse. We built bitmap indexes with varying numbers of attributes and found the least-cost query plans for efficient similar sequence searches.
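
The DFT step described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the sample sequences, and the choice of three coefficients are all assumptions, and the region-based query and bitmap index are not modelled. It relies only on a general property of the orthonormally scaled DFT: the distance between truncated coefficient vectors never exceeds the Euclidean distance between the original sequences.

```python
import cmath
import math


def dft_features(seq, k):
    """Return the first k DFT coefficients of seq, scaled by 1/sqrt(n)."""
    n = len(seq)
    return [
        sum(x * cmath.exp(-2j * math.pi * f * t / n) for t, x in enumerate(seq))
        / math.sqrt(n)
        for f in range(k)
    ]


def feature_distance(fa, fb):
    """Euclidean distance between two vectors of complex coefficients."""
    return math.sqrt(sum(abs(x - y) ** 2 for x, y in zip(fa, fb)))


def euclidean(a, b):
    """Euclidean distance between two raw sequences of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical sequences: a and b share a trend, c oscillates.
a = [1, 2, 3, 4, 5, 4, 3, 2]
b = [1.1, 2.1, 2.9, 4.2, 5.0, 3.9, 3.1, 2.0]
c = [5, 1, 5, 1, 5, 1, 5, 1]

K = 3  # assumed number of retained coefficients
fa, fb, fc = (dft_features(s, K) for s in (a, b, c))

print("a vs b:", feature_distance(fa, fb))
print("a vs c:", feature_distance(fa, fc))
```

Comparing a few low-frequency coefficients instead of every sample is what reduces the number of attributes per comparison. Candidates that pass the reduced test would still need a check against the full sequences.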

The value and sharing of medical research data (의학연구데이터의 가치와 공유의 의미)

  • Kim, Na Won
    • Proceedings of the Korean Society for Information Management Conference / Proceedings of the 24th Conference of the Korean Society for Information Management, 2017 / pp.104-104 / 2017
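  • Research data means the primary materials used in scientific research and the research results produced directly by researchers, in the form of factual records such as numbers, text, images, and audio. Medicine, the subject field of this study, is a discipline with high potential value and applicability and a public-interest character. This study seeks the value of medical research data and the meaning of sharing it by examining its types and the need for its management. It also reviews the current state of publishing and sharing clinical trial records and research papers, which are representative forms of research data, and considers what role libraries play in this. Medical research data comes in many types and forms, including patient medical records, health examination records, clinical records, death records, clinical trial records, genomic information, and research papers, and is often large in volume. Medical research faces many difficulties in its conduct, such as personal data protection and ethical issues. However, because the field is directly related to the treatment and prevention of disease and, beyond that, to human health, managing medical research data for preservation, disclosure, and sharing is highly significant. Medical research data management not only underpins new research but also gives great opportunities to researchers in mid- and less-developed countries, and can thereby contribute to the advancement of medicine worldwide. It also helps prevent the concealment of clinical trial results and fraudulent research, so regulations not only in the United States but worldwide require that data used in published scholarly research papers be registered. Recognized clinical trial registration sites include the NIH's ClinicalTrials.gov and the ICTRP Primary Registries, and in Korea there is CRIS, managed by the National Institute of Health of the Korea Centers for Disease Control and Prevention. Medical researchers need to consider how research data will be collected, used, preserved, and shared from the start of a study, but they struggle to do so because of time and physical constraints. Services that support them are attracting growing interest from libraries and are being tried at the Virginia Commonwealth University library and the Emory University library, among others. This service is a new opportunity for librarians to provide support during the research process. By organizing step-by-step stories for researchers' data management and by supporting and teaching DMP preparation, librarians can establish themselves in a new role in scholarly communication.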


A Cell-based Compression Technique of the Spatial Data for the Mobile GIS (모바일 GIS를 위한 셀 기반의 공간 데이터 압축 기법)

  • Lee, Ki-Young; Lim, Keun; Choi, Gyoo-Seok
    • The Journal of the Institute of Internet, Broadcasting and Communication / Vol. 8, No. 6 / pp.49-54 / 2008
  • Recently, with the development of wireless communications and GIS, interest in mobile computing has been rising. In this setting, the GIS must run in a more constrained environment than that of server computing. For this reason, large amounts of spatial data must be compressed to fit on the mobile device. Compressing spatial data is difficult, and it must be processed in the correct order because the size of the data is unpredictable. Therefore, this paper presents a cell-based compression technique for spatial data in mobile GIS. The paper covers the process of transforming spatial data from a server to a mobile device using the cell-based compression technique, and the technique is shown to be efficient in practice.


Development of a Method for Analyzing and Visualizing Concept Hierarchies based on Relational Attributes and its Application on Public Open Datasets

  • Hwang, Suk-Hyung
    • Journal of the Korea Society of Computer and Information / Vol. 26, No. 9 / pp.13-25 / 2021
  • In the age of digital innovation based on the Internet, information and communication technologies, and artificial intelligence, huge numbers of datasets are being generated, collected, accumulated, and opened on the web by public institutions that provide useful public information. To analyze data and gain useful insights and information from it, Formal Concept Analysis (FCA) has been successfully used for analyzing, classifying, clustering, and visualizing data based on the binary relation between objects and attributes in a dataset. In this paper, we present an approach for enhancing the analysis of relational attributes of data within the extended framework of FCA, which is designed to classify, conceptualize, and visualize sets of objects described not only by attributes but also by relations between these objects. Several experiments carried out on public open datasets with the proposed tool, RCA wizard, demonstrate the validity and usability of our approach for generating and visualizing conceptual hierarchies in order to extract more useful knowledge from datasets. The proposed approach can be used as a useful tool for effective data analysis, classification, clustering, visualization, and exploration.
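
The binary object-attribute relation that FCA starts from can be illustrated with a small sketch. It covers plain FCA only, not the relational extension or the RCA wizard tool described in the abstract. The dataset names and attributes are hypothetical, and the brute-force enumeration over attribute subsets is an assumption chosen for brevity, suitable only for tiny contexts.

```python
from itertools import combinations


def objects_having(attributes, context):
    """Objects whose attribute set contains every attribute given."""
    return {obj for obj, attrs in context.items() if attributes <= attrs}


def common_attributes(objects, context, all_attributes):
    """Attributes shared by all given objects (all attributes if none given)."""
    shared = set(all_attributes)
    for obj in objects:
        shared &= context[obj]
    return shared


def formal_concepts(context):
    """Enumerate (extent, intent) pairs by closing every attribute subset."""
    all_attributes = set().union(*context.values())
    concepts = set()
    for size in range(len(all_attributes) + 1):
        for subset in combinations(sorted(all_attributes), size):
            extent = objects_having(set(subset), context)
            intent = common_attributes(extent, context, all_attributes)
            concepts.add((frozenset(extent), frozenset(intent)))
    return concepts


# Hypothetical formal context: open datasets and their properties.
context = {
    "bus_stops": {"geo", "csv"},
    "air_quality": {"csv", "time_series"},
    "subway": {"geo", "csv", "time_series"},
}

concepts = formal_concepts(context)
# Print from the most general concept (largest extent) downward.
for extent, intent in sorted(concepts, key=lambda c: -len(c[0])):
    print(sorted(extent), sorted(intent))
```

Each pair is a formal concept: the extent is exactly the set of objects sharing the intent, and the intent is exactly the set of attributes common to the extent. Ordering the concepts by extent inclusion gives the concept hierarchy that such tools visualize.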

The Method of Rule Discovery for Time Series Data (시 계열 데이터에서의 연관성 발견을 위한 기법)

  • 이준호; 차재혁
    • Proceedings of the Korean Information Science Society Conference / Proceedings of the Korean Information Science Society 2004 Spring Conference, Vol.31 No.1 (B) / pp.607-609 / 2004
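  • This paper describes a technique for discovering associations in time series data while effectively reducing complexity and the amount of computation. Existing sequence segmentation methods for time series data use complex clustering techniques and are therefore limited by the large amount of time and resources they require. This paper instead proposes a method that uses an increase/decrease table for effective sequence segmentation.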


Design of Efficient Data Search Function using the Excel VBA DAO (엑셀 VBA DAO 기능을 이용한 효율적인 데이타 검색 기능 설계)

  • Jang, Seung Ju; Ryu, Dae-Hyun
    • Journal of the Korea Institute of Information and Communication Engineering / Vol. 18, No. 1 / pp.217-222 / 2014
  • In this paper, we propose an efficient data search system that uses a data partitioning algorithm in Microsoft Excel. We propose a search algorithm that retrieves data quickly using VBA functions in Excel. The algorithm first specifies the sheet to be searched. Once the sheet is specified, the algorithm finds the beginning and the end of the data in the sheet. It then compares intermediate values with the keyword, starting from the first cell position, and continues in this way until the end of the data. The proposed algorithm was implemented and tested in Excel using a VBA program. The experimental results showed that its performance was better than that of the conventional sequential search method.
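
The abstract does not spell out the partitioning rule, so the sketch below rests on an assumption: that comparing "intermediate values" with the keyword means probing the midpoint of a sorted column, as in a binary search. It is written in Python rather than VBA, a plain list stands in for the sheet column, and all names are hypothetical. It only illustrates why such a scheme needs fewer comparisons than a sequential scan.

```python
def sequential_search(column, key):
    """Scan from the first row and count the comparisons made."""
    comparisons = 0
    for row, value in enumerate(column):
        comparisons += 1
        if value == key:
            return row, comparisons
    return None, comparisons


def midpoint_search(column, key):
    """Probe the middle of the remaining range; assumes a sorted column."""
    low, high = 0, len(column) - 1  # beginning and end of the data
    comparisons = 0
    while low <= high:
        mid = (low + high) // 2
        comparisons += 1
        if column[mid] == key:
            return mid, comparisons
        if column[mid] < key:
            low = mid + 1   # key can only be in the upper part
        else:
            high = mid - 1  # key can only be in the lower part
    return None, comparisons


# Hypothetical sorted column of 1000 values.
column = list(range(0, 2000, 2))
key = 1998

print("sequential:", sequential_search(column, key))
print("midpoint:  ", midpoint_search(column, key))
```

The midpoint version depends on the column being sorted. For unsorted data a sequential scan, or sorting first, would still be needed.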

A Life-Critical Data Transmission Scheme for Wireless Body Area Networks (무선 인체 통신 네트워크를 위한 응급데이터 전송기법)

  • Choi, Won-Suk; Cho, Sung-Rae
    • The Journal of Korean Institute of Communications and Information Sciences / Vol. 34, No. 12B / pp.1329-1335 / 2009
  • In this paper, we propose a new medium access control protocol referred to as DCTW (Dual Channel Transmission Scheme for wireless body area networks). Wireless body area networks (WBANs) require a prioritization mechanism for life-critical data so that it is transmitted as early as possible. The proposed DCTW uses a narrow band for transmitting life-critical data and a broadband channel for transmitting normal data. Since the narrow band is dedicated to life-critical data, the DCTW can effectively reduce the delay of life-critical data transmission. Through extensive simulation, we show that the DCTW outperforms other existing schemes.

Fused Fuzzy Logic System for Corrupted Time Series Data Analysis (훼손된 시계열 데이터 분석을 위한 퍼지 시스템 융합 연구)

  • Kim, Dong Won
    • Journal of Internet of Things and Convergence / Vol. 4, No. 1 / pp.1-5 / 2018
  • This paper is concerned with the modeling and identification of time series data corrupted by noise. A nonsingleton fuzzy logic system (NFLS) is employed as the modeling technique for the corrupted time series. The main characteristic of the NFLS is that its inputs are modeled as fuzzy numbers, so it is especially useful when the available training data or the input data to the fuzzy logic system are corrupted by noise. Simulation results on the Mackey-Glass time series data are presented to show the performance of the modeling methods. The results show that the NFLS models noisy time series data much better than a traditional Mamdani FLS.