• Title/Summary/Keyword: Number Data

A Study for Determining the Best Number of Clusters on Temporal Data (Temporal 데이터의 최적의 클러스터 수 결정에 관한 연구)

  • Cho Young-Hee;Lee Gye-Sung;Jeon Jin-Ho
    • The Journal of the Korea Contents Association, v.6 no.1, pp.23-30, 2006
  • Clustering methods for temporal data commonly take a model-based approach in which each cluster is represented by an automata-based model. To elicit the individual cluster models, global models must first be constructed for the data set as a whole. Preparation for building the individual models is completed by determining the number of clusters inherent in the data set. In this paper, a BIC (Bayesian Information Criterion) approximation is used to determine the number of clusters, and its applicability is confirmed. A search technique that improves efficiency is also suggested, based on an analysis of the relationship between data size and BIC values. Experiments on artificially generated data sets were performed to check its validity. They confirm that the BIC approximation suggests the best number of clusters, provided that the amount of data is relatively large.
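
As a rough illustration of BIC-based selection of the number of clusters, the sketch below scores candidate models with the standard form BIC = -2 ln L + d ln N and picks the lowest score. The function names, the parameter counts, and the log-likelihood values are hypothetical and are not taken from the paper, which fits automata-based models rather than the placeholder numbers used here.

```python
import math

def bic(log_likelihood: float, n_params: int, n_samples: int) -> float:
    """Schwarz's BIC in its 'lower is better' form: -2 ln L + d ln N."""
    return -2.0 * log_likelihood + n_params * math.log(n_samples)

def best_num_clusters(candidates: dict, n_samples: int):
    """candidates maps k -> (log_likelihood, n_params) for a model fitted with k clusters."""
    scores = {k: bic(ll, d, n_samples) for k, (ll, d) in candidates.items()}
    return min(scores, key=scores.get), scores

# Hypothetical fitted log-likelihoods and parameter counts (illustrative only).
candidates = {
    1: (-1500.0, 10),
    2: (-1200.0, 20),
    3: (-1100.0, 30),
    4: (-1090.0, 40),
}
best_k, scores = best_num_clusters(candidates, n_samples=500)
print(best_k, {k: round(v, 1) for k, v in scores.items()})
```

Going from three to four clusters improves the likelihood only slightly, so the penalty term d ln N outweighs the gain and the smaller model is preferred. The penalty grows with ln N while the log-likelihood grows roughly linearly in N, which is one reason the criterion tends to behave better on larger data sets.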

Correlation between Internet Search Query Data and the Health Insurance Review & Assessment Service Data for Seasonality of Plantar Fasciitis (족저 근막염의 계절성에 대한 인터넷 검색어 데이터와 건강보험심사평가원 자료의 연관성)

  • Hwang, Seok Min;Lee, Geum Ho;Oh, Seung Yeol
    • Journal of Korean Foot and Ankle Society, v.25 no.3, pp.126-132, 2021
  • Purpose: This study examined whether the number of plantar fasciitis cases in the database of the Korean Health Insurance Review & Assessment Service and the internet search volume for terms related to plantar fasciitis show seasonal variation, and whether the two are correlated. Materials and Methods: The number of plantar fasciitis cases per month was acquired from the Korean Health Insurance Review & Assessment Service from January 2016 to December 2019. The monthly relative internet search volumes for the keywords "plantar fasciitis" and "heel pain" were collected for the same period from DataLab, an internet search query trend service provided by the Korean portal website Naver. Cosinor analysis was performed to confirm the seasonality of the monthly number of cases and the relative search volumes, and Pearson and Spearman correlation analyses were conducted to assess the correlation between them. Results: The number of plantar fasciitis cases and the relative search volumes for the keywords "plantar fasciitis" and "heel pain" all showed significant seasonality (p<0.001), peaking in the summer and reaching their lowest in the winter. The number of plantar fasciitis cases was correlated significantly with the relative search volumes of the keywords "plantar fasciitis" (r=0.632; p<0.001) and "heel pain" (r=0.791; p<0.001). Conclusion: Both the number of plantar fasciitis cases and the internet search data for related keywords showed seasonality, peaking in summer. The number of cases was significantly correlated with the internet search data in terms of the seasonality of plantar fasciitis. Internet big data could be a complementary resource for researching and monitoring plantar fasciitis.
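
The two analysis steps named in the abstract can be sketched as follows. This is a minimal illustration, not the authors' procedure: it fits a single-component cosinor model y(t) = M + b cos(wt) + g sin(wt) with the closed-form least-squares solution that holds only for equally spaced data spanning whole periods, and computes a Pearson coefficient by hand. The two monthly series are synthetic and the function names are assumptions; significance testing and the Spearman coefficient are omitted.

```python
import math

def cosinor_fit(y, period=12):
    """Fit y[t] = M + b*cos(w t) + g*sin(w t) by least squares.

    The closed form used here is valid only for equally spaced samples
    that cover a whole number of periods.
    """
    n = len(y)
    if n == 0 or n % period != 0:
        raise ValueError("series must cover a whole number of periods")
    w = 2 * math.pi / period
    mesor = sum(y) / n  # M, the rhythm-adjusted mean
    b = 2 / n * sum(v * math.cos(w * t) for t, v in enumerate(y))
    g = 2 / n * sum(v * math.sin(w * t) for t, v in enumerate(y))
    amplitude = math.hypot(b, g)
    peak_index = (math.atan2(g, b) / w) % period  # time index of the fitted maximum
    return mesor, amplitude, peak_index

def pearson_r(x, y):
    """Pearson product-moment correlation coefficient."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Synthetic 48-month series (illustrative only); both are built to peak at index 6.
w = 2 * math.pi / 12
cases = [100 + 20 * math.cos(w * (t - 6)) for t in range(48)]
searches = [50 + 10 * math.cos(w * (t - 6)) + (t % 3) for t in range(48)]

mesor, amplitude, peak = cosinor_fit(cases)
r = pearson_r(cases, searches)
print(f"mesor={mesor:.1f} amplitude={amplitude:.1f} peak_index={peak:.1f} r={r:.3f}")
```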

Hot Data Identification For Flash Based Storage Systems Considering Continuous Write Operation

  • Lee, Seung-Woo;Ryu, Kwan-Woo
    • Journal of the Korea Society of Computer and Information, v.22 no.2, pp.1-7, 2017
  • NAND flash memory is rapidly replacing hard disk drives (HDDs) as a storage medium owing to advantages such as fast access, low power consumption, and portability. To use NAND flash memory in a computer system, a Flash Translation Layer (FTL) is indispensable. The FTL provides features such as address mapping, garbage collection, wear leveling, and hot data identification. In particular, hot data identification is an algorithm that identifies the specific pages where data updates occur frequently. It helps to improve overall performance by identifying hot data and managing it separately. The multi-hash framework (MHF), a known hot data identification technique, records the number of write operations in memory. The recorded value is then evaluated to judge whether the data is hot. However, counting write requests alone is not enough to judge a page as a hot data page. In this paper, we propose a hot data identification method that considers not only the number of write requests but also the persistence of write requests.
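
The idea of combining write counts with write persistence can be sketched as below. This is a hypothetical simplification, not the paper's algorithm and not MHF itself: it uses plain dictionaries instead of hash-based counters, defines persistence as the number of distinct time windows in which a page was written, and all names and thresholds (`window_size`, `min_writes`, `min_windows`) are assumptions chosen for illustration.

```python
from collections import defaultdict

class HotDataTracker:
    """Toy hot-page detector: a page is hot only if it is written often
    AND its writes are spread over several time windows (persistence)."""

    def __init__(self, window_size=4, min_writes=4, min_windows=2):
        self.window_size = window_size   # writes per time window (assumed)
        self.min_writes = min_writes     # count threshold (assumed)
        self.min_windows = min_windows   # persistence threshold (assumed)
        self.write_count = defaultdict(int)
        self.windows_seen = defaultdict(set)
        self.clock = 0                   # global write counter used as logical time

    def record_write(self, page):
        window = self.clock // self.window_size
        self.write_count[page] += 1
        self.windows_seen[page].add(window)
        self.clock += 1

    def is_hot(self, page):
        count = self.write_count.get(page, 0)
        persistence = len(self.windows_seen.get(page, ()))
        return count >= self.min_writes and persistence >= self.min_windows

tracker = HotDataTracker()
# Page "A" is written in one short burst; page "B" is written repeatedly over time.
for page in ["A", "A", "A", "A", "B", "C", "B", "D", "B", "E", "B", "F"]:
    tracker.record_write(page)

print(tracker.is_hot("A"), tracker.is_hot("B"))
```

Both "A" and "B" receive four writes, so a pure counter cannot tell them apart. Only "B" spans two windows, so only "B" is classified as hot under this sketch.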

Efficient Data Scheduling considering number of Spatial query of Client in Wireless Broadcast Environments (무선방송환경에서 클라이언트의 공간질의 수를 고려한 효율적인 데이터 스케줄링)

  • Song, Doohee;Park, Kwangjin
    • Journal of Internet Computing and Services, v.15 no.2, pp.33-39, 2014
  • Spatial data is transferred from server to client in a wireless broadcast environment as follows: the server arranges the data that clients want and transmits it as a one-dimensional array in each broadcast cycle. The client listens to the data transferred by the server and returns the resulting value only to the server. Recently, the number of users of location-based services has been increasing along with the number of objects, and data volumes are becoming large. A large volume of data in a wireless broadcast environment may increase the query time of clients. We therefore propose Client based Data Scheduling (CDS) for efficient data scheduling in wireless broadcast environments. CDS divides the map into grids and calculates a total for each grid by considering the number of objects and the data size within it. It then schedules the data by applying a hot-cold method that considers the total data size of the objects in each grid and the number of clients. CDS is shown to reduce the average query processing time for clients compared with the existing method.
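
A grid-based hot-cold ordering of the kind described can be sketched as follows. The abstract does not give the scoring rule, so the one used here is purely an assumption: grid cells are ranked by the number of client queries falling in them (hot first), with ties broken in favor of cells holding less data. The function names, tuple layouts, and sample coordinates are hypothetical.

```python
from collections import defaultdict

def cell_of(x, y, cell_size):
    """Map a coordinate to the (column, row) index of its grid cell."""
    return (int(x // cell_size), int(y // cell_size))

def hot_first_order(objects, queries, cell_size):
    """Return grid cells in broadcast order, hottest first.

    objects: iterable of (x, y, size_bytes)
    queries: iterable of (x, y) client query locations
    Ranking rule (assumed): more client queries first, then smaller total size.
    """
    total_size = defaultdict(int)
    query_hits = defaultdict(int)
    for x, y, size_bytes in objects:
        total_size[cell_of(x, y, cell_size)] += size_bytes
    for x, y in queries:
        query_hits[cell_of(x, y, cell_size)] += 1
    # Only cells that actually hold data are scheduled.
    return sorted(total_size, key=lambda c: (-query_hits.get(c, 0), total_size[c]))

# Illustrative data only.
objects = [(1, 1, 100), (2, 3, 50), (12, 1, 400), (15, 15, 10)]
queries = [(1, 2), (3, 3), (11, 2), (2, 2)]
order = hot_first_order(objects, queries, cell_size=10)
print(order)
```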

Analysis of Traffic Accident by Circular Intersection Type in Korea Using Count Data Model (가산자료 모형을 이용한 국내 원형교차로 유형별 교통사고 분석)

  • Kim, Tae Yang;Lee, Min Yeong;Park, Byung Ho
    • Journal of the Korean Society of Safety, v.32 no.5, pp.129-134, 2017
  • This study aims to develop traffic accident models by circular intersection type using count data models. The number of accidents, the number of fatal and injured persons (FSI), and EPDO are calculated from the traffic accident data of TAAS. The circular intersection accident models are developed through Poisson and negative binomial regression analysis. The main results of this study are as follows. First, the null hypotheses that there are differences in the number of traffic accidents, FSI, and EPDO by type of circular intersection are rejected. Second, the scale of intersection (median, large), number of approach roads, mean width and length of exit roads, and the areas of the circulating roadway and central island are selected as factors influencing the number of traffic accidents, FSI, and EPDO at rotaries. Third, the scale of intersection (median), guide signs (limited speed, direction, roundabout), number of approach roads, entry angle, and the areas of the intersection and central island are adopted as factors influencing the number of traffic accidents, FSI, and EPDO at roundabouts. Finally, converting rotaries to roundabouts could be expected to reduce accidents.
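
When choosing between Poisson and negative binomial count models, a common preliminary check is whether the counts are overdispersed, that is, whether the variance exceeds the mean. The sketch below computes the dispersion ratio and a method-of-moments estimate of the NB2 dispersion parameter alpha from Var = mu + alpha * mu^2. It is only an illustration of that check, not the regression analysis in the paper, and the accident counts are made up.

```python
import statistics

def dispersion_summary(counts):
    """Return (mean, sample variance, variance/mean ratio, NB2 alpha estimate)."""
    mean = statistics.mean(counts)
    var = statistics.variance(counts)  # sample variance (n - 1 denominator)
    ratio = var / mean
    # Method of moments for NB2: Var = mu + alpha * mu**2; clamp at 0 for underdispersion.
    alpha = max(0.0, (var - mean) / mean ** 2)
    return mean, var, ratio, alpha

# Hypothetical accident counts per intersection (illustrative only).
accidents = [0, 1, 0, 2, 5, 0, 1, 9, 0, 3]
mean, var, ratio, alpha = dispersion_summary(accidents)
print(f"mean={mean:.2f} var={var:.2f} ratio={ratio:.2f} alpha={alpha:.2f}")
```

A ratio close to 1 is consistent with the Poisson assumption. A ratio well above 1, as in this made-up sample, suggests that a negative binomial model may fit better.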

System Capacity Analysis with the Retransmission Limit on ARQ in a Voice/Data DS-CDMA System

  • Lee, Chiho;Gwangzeen Ko;Kim, Kiseon
    • Proceedings of the IEEK Conference, 2000.07a, pp.513-516, 2000
  • In this paper, we investigate the effect of the retransmission limit on both the system capacity and the average number of retransmissions in a voice/data DS-CDMA system. We consider the IS-95 type reverse link of the CDMA system, which supports two kinds of services: a general voice service and a packetized data service. ARQ is used for reliable data transmission. A convolutional code is used for FEC, and the CRC-CCITT code is used for error detection in ARQ. The results show that the number of concurrent data users decreases as we reduce the number of retransmissions. At the same time, however, the average number of retransmissions is also reduced. In conclusion, the retransmission limit can be selected so as to eliminate a large number of retransmissions with a small sacrifice in system capacity.
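
The effect of a retransmission limit on the average number of retransmissions can be illustrated with a simple truncated-ARQ calculation. The sketch assumes that each transmission fails independently with a fixed packet error probability p, which is a simplification: in the DS-CDMA system studied in the paper, the error rate depends on the number of active users. With at most R retransmissions, the expected number of retransmissions is p + p^2 + ... + p^R, and a packet is lost with probability p^(R+1). The function names are hypothetical.

```python
def avg_retransmissions(p: float, limit: int) -> float:
    """Expected retransmissions per packet with at most `limit` retries.

    Retry k happens only if the first k transmissions all failed,
    so E[retries] = p + p**2 + ... + p**limit.
    """
    return sum(p ** k for k in range(1, limit + 1))

def residual_loss(p: float, limit: int) -> float:
    """Probability that a packet is still in error after all allowed retries."""
    return p ** (limit + 1)

# Illustrative packet error probability, not a value from the paper.
p = 0.5
for limit in (0, 1, 3, 8):
    print(limit, avg_retransmissions(p, limit), residual_loss(p, limit))
```

Lowering the limit cuts the average number of retransmissions but leaves more packets undelivered. That is the trade-off the abstract describes between retransmission load and system capacity.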

The Forecasting about the Numbers of the Third Graders in a High-school until 2022 Year in Daegu City

  • Kim, Jong-Tae
    • Journal of the Korean Data and Information Science Society, v.16 no.4, pp.933-942, 2005
  • Recently, the decrease in the number of third-year high-school students has had a serious influence on the matriculation quotas of colleges and universities. The purpose of this paper is to forecast the number of high-school graduates in Daegu city through 2022, based on the resident registration population. Taking 2004 as the base period, most colleges and universities in Daegu city would have to reduce their matriculation quotas by 37.5% by 2022 to match the number of third-year high-school students.

The Forecasting for the Numbers of a High-school Graduate and the Number Limit of Matriculation in Kyungbook

  • Kim, Jong-Tae
    • Journal of the Korean Data and Information Science Society, v.16 no.4, pp.969-977, 2005
  • Recently, the decrease in the number of high-school graduates has had a serious influence on the matriculation quotas of colleges and universities. Based on the resident registration population, we forecast the number of high-school graduates in Kyungbook through 2022. Most colleges and universities in Kyungbook would have to reduce their matriculation quotas by 67.8% by 2022, acting promptly to avert a disaster.

Estimation of Product Reliability with Incomplete Field Warranty Data (불완전한 사용현장 보증 데이터를 이용한 제품 신뢰도 추정)

  • Lim, Tae-Jin
    • Journal of Korean Institute of Industrial Engineers, v.28 no.4, pp.368-378, 2002
  • As more companies are equipped with data acquisition systems for their products, a huge amount of field warranty data has been accumulated. We focus on the case in which the field data for a given product comprise the number of sales and the number of first failures in each period. The number of censored items and their ages are assumed to be given. This type of data is incomplete in the sense that the age of a failed item is unknown. We construct a model for this type of data and propose an algorithm for nonparametric maximum likelihood estimation of the product reliability. Unlike the nonhomogeneous Poisson process (NHPP) model, our method can handle data with censored items as well as data from small populations. A few examples are investigated to characterize our model, and a real field warranty data set is analyzed by the method.

Principles of Multivariate Data Visualization

  • Huh, Moon Yul;Cha, Woon Ock
    • Communications for Statistical Applications and Methods, v.11 no.3, pp.465-474, 2004
  • Data visualization applies automation and discovery processes to data sets in an effort to uncover the underlying information in the data. It provides rich visual depictions of the data. It has distinct advantages over traditional data analysis techniques, such as the ability to explore the structure of data sets that are large in both the number of observations and the number of variables, by allowing rich interaction between the data and the end user. We discuss the principles of data visualization and evaluate the characteristics of various visualization tools according to these principles.