• Title/Summary/Keyword: Zipf Distribution

Analysis of a Cache Management Protocol Using a Back-shifting Approach (백쉬프팅 기법을 이용한 캐쉬 유지 규약의 분석)

  • Cho Sung-Ho
    • The Journal of the Korea Contents Association / v.5 no.6 / pp.49-56 / 2005
  • To reduce server bottlenecks in client-server computing, each client may keep its own cache for later reuse. The pessimistic approach to cache management leads to unnecessary waits because a transaction cannot commit until it obtains all requested locks, whereas the optimistic approach tends to cause needless aborts. This paper proposes an efficient optimistic protocol that overcomes these shortcomings. We present a simulation-based comparison of our scheme with other well-known protocols. The analysis was run under a Zipf workload, which represents the popularity distribution on the Web. The simulation experiments show that our scheme performs as well as or better than the other schemes, with low overhead.
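
As a rough illustration of what a Zipf workload looks like in a simulation of this kind, the sketch below draws item requests whose popularity decays as 1/rank^s. The function name `make_zipf_sampler`, the item count, and the skew parameter `s` are assumptions made for illustration; the abstract does not give the paper's actual simulator settings.

```python
import bisect
import itertools
import random


def make_zipf_sampler(num_items, s=1.0, seed=None):
    """Return a function that draws ranks 1..num_items with P(rank) proportional to 1/rank**s.

    Hypothetical helper for illustration; not taken from the paper.
    """
    weights = [1.0 / (rank ** s) for rank in range(1, num_items + 1)]
    cumulative = list(itertools.accumulate(weights))
    total = cumulative[-1]
    rng = random.Random(seed)

    def sample():
        # Inverse-CDF sampling: find the first cumulative weight >= a uniform draw.
        u = rng.random() * total
        idx = bisect.bisect_left(cumulative, u)
        return min(idx, num_items - 1) + 1

    return sample


if __name__ == "__main__":
    sampler = make_zipf_sampler(num_items=100, s=1.0, seed=42)
    requests = [sampler() for _ in range(10)]
    print(requests)  # low ranks (popular items) should dominate
```

With `s` near 1, a small set of top-ranked items receives a large share of requests, which is the skew that makes client caching attractive in such experiments.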

An Effective Video Block Placement Strategy on VOD Storage Server with MZR Disks (MZR 디스크를 채택한 VOD 저장서버의 효율적인 비디오 블록 배치방법)

  • Lim, Hyoung-Roung;Kim, Jeong-Won;Kim, Young-Ju;Chung, Ki-Dong
    • The Transactions of the Korea Information Processing Society / v.4 no.12 / pp.2971-2984 / 1997
  • In this paper, we propose an efficient video block placement scheme that exploits both the MZR characteristics of current disk products and users' skewed access patterns in VOD. We evaluate its performance through simulation and modeling of a VOD server. The basic placement rule is to place blocks on MZR disks using the LP and SHP methods according to the Zipf distribution of popularity. To verify the proposed scheme, we measured its performance on a workstation with two MZR disks under various skew factors. The proposed placement scheme showed better response time than random placement. To extend the scheme to a disk group, we analyzed the theoretical maximum number of concurrent users and the required buffer size per user. As performance parameters, we considered the disk head scheduling method, the placement method, and the striping unit. The experimental results showed that the proposed scheme was effective.

A New Parameter Estimation Method for a Zipf-like Distribution for Geospatial Data Access

  • Li, Rui;Feng, Wei;Wang, Hao;Wu, Huayi
    • ETRI Journal / v.36 no.1 / pp.134-140 / 2014
  • Many reports have shown that the access pattern for geospatial tiles follows Zipf's law and that its parameter ${\alpha}$ represents the access characteristics. However, visits to geospatial tiles have temporal and spatial popularities, and the ${\alpha}$-value changes as they change. We construct a mathematical model to simulate the user's access behavior by studying the attributes of frequently visited tile objects to determine parameter estimation algorithms. Because the least squares (LS) method in common use cannot obtain an exact ${\alpha}$-value and does not provide a suitable fit to data for frequently visited tiles, we present a new approach, which uses a moment method of estimation to obtain the value of ${\alpha}$ when ${\alpha}$ is close to 1. When ${\alpha}$ is further away from 1, the method uses the associated cache hit ratio for tile access and uses an LS method based on a critical cache size to estimate the value of ${\alpha}$. The decrease in the estimation error is presented and discussed in the section on experimental results. This new method, which provides a more accurate estimate of ${\alpha}$ than earlier methods, promises more effective prediction of requests for frequently accessed tiles for better caching and load balancing.
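
For context, the common least squares baseline that the abstract criticizes can be sketched as an ordinary linear regression of log frequency on log rank, assuming a Zipf-like law f(rank) ≈ C / rank^α. This is only the conventional baseline, not the moment-based or cache-hit-ratio-based method the paper proposes; the function name and the synthetic data are hypothetical.

```python
import math


def estimate_alpha_least_squares(frequencies):
    """Estimate alpha of a Zipf-like law f(rank) ~ C / rank**alpha by OLS on log-log data.

    Illustrative baseline only; the paper's new estimator is not reproduced here.
    """
    ranked = sorted((f for f in frequencies if f > 0), reverse=True)
    if len(ranked) < 2:
        raise ValueError("need at least two positive frequencies")
    xs = [math.log(rank) for rank in range(1, len(ranked) + 1)]
    ys = [math.log(f) for f in ranked]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # log f = log C - alpha * log rank, so alpha is the negated slope.
    return -sxy / sxx


if __name__ == "__main__":
    # Synthetic access counts generated with alpha = 0.8 (an assumed value).
    counts = [1000.0 / rank ** 0.8 for rank in range(1, 201)]
    print(round(estimate_alpha_least_squares(counts), 3))
```

On real access logs the low-frequency tail is noisy, which is one reason a plain log-log fit of this kind can misestimate ${\alpha}$.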

Market Access Approach to Urban Growth

  • MOON, YOON SANG
    • KDI Journal of Economic Policy / v.42 no.3 / pp.1-32 / 2020
  • This paper studies urban growth in Korean cities. First, I document that population growth patterns change over time and that the current population distribution supports random urban growth. I confirm two empirical laws, Zipf's law and Gibrat's law, both of which hold in the period of 1995-2015 but do not hold in the earlier period of 1975-1995. Second, I find a systematic employment growth pattern in Korean cities in spite of the random population growth. I examine market access effects on employment growth. Market access, a geographical advantage, has a significant influence on urban employment growth. The market access effect is higher in the Seoul metropolitan area than in the rest of the country. This effect is stronger on employment growth in the manufacturing industry than in the service industry. These results are robust to various checks (e.g., different definitions of urban areas). The results suggest that policymakers should consider geographical characteristics when they make policy decisions about regional development.

Analysis of Keywords in national river occupancy permits by region using text mining and network theory (텍스트 마이닝과 네트워크 이론을 활용한 권역별 국가하천 점용허가 키워드 분석)

  • Seong Yun Jeong
    • Smart Media Journal / v.12 no.11 / pp.185-197 / 2023
  • This study used text mining and network theory to extract information useful for occupancy applications and permit processing from the permit register, which has been used only for the simple purpose of recording occupancy permit information. Using text mining, we applied normalization steps such as stopword removal and morpheme analysis, then analyzed and compared vocabulary frequency and topic modeling results in five regions, including Seoul, Gyeonggi, Gyeongsang, Jeolla, Chungcheong, and Gangwon. By applying four centrality measures widely used in network theory (degree, closeness, betweenness, and eigenvector), we identified keywords that occupy a central position or act as intermediaries in the network. A comprehensive analysis of vocabulary frequency, topic modeling, and network centrality found that the keyword 'installation' was the most influential in all regions. This is believed to result from the Ministry of Environment's permit management office issuing many permits for constructing facilities or installing structures. In addition, keywords related to road facilities, flood control facilities, underground facilities, power/communication facilities, sports/park facilities, and so on were found to be central or to act as intermediaries in the topic models and networks. Most keywords appeared to follow a Zipf's law statistical distribution, with low frequency of occurrence and low distribution ratio.

Effective Parallel Hash Join Algorithm Based on Histogram Equalization in the Presence of Data Skew (데이터 편재 하에서 히스토그램 변환기법에 기초한 효율적인 병렬 해쉬 결합 알고리즘)

  • Park, Ung-Gyu;Choe, Hwang-Gyu;Kim, Tak-Gon
    • The Transactions of the Korea Information Processing Society / v.4 no.2 / pp.338-348 / 1997
  • In this paper, we first propose a data distribution framework to resolve load imbalance and bucket overflow in parallel hash join. Using the histogram equalization technique, the framework transforms a histogram of skewed data into the desired uniform distribution that corresponds to the relative computing power of the node processors in the system. Next, we propose an efficient parallel hash join algorithm for handling skewed data based on the proposed data distribution methodology. To compare the performance of our algorithm with other hash join algorithms, we perform simulation experiments and actual execution on the COREDB database computer with an 8-node hypercube architecture. In these experiments, the skewed data distribution of the join attribute is modeled using a Zipf-like distribution. The performance studies indicate that our algorithm outperforms other algorithms in the skewed cases.
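
The general idea of balancing skewed data against relative node power can be sketched with a cumulative histogram: each histogram bin is assigned to the node whose cumulative load share covers it. This is a simplified illustration under assumed inputs (a per-bin tuple count and a list of relative node powers), not the paper's actual transform; `assign_bins_to_nodes` is a hypothetical name.

```python
def assign_bins_to_nodes(histogram, node_powers):
    """Map each histogram bin to a node so tuple load tracks relative node power.

    histogram: tuple counts per bin (e.g., per join-attribute value range).
    node_powers: relative computing power of each node.
    Returns a list with the node index assigned to each bin.
    Simplified sketch; the paper's framework may differ in detail.
    """
    total_tuples = sum(histogram)
    total_power = sum(node_powers)

    # Cumulative tuple count that each node should have absorbed by its end.
    boundaries = []
    running = 0.0
    for power in node_powers:
        running += power / total_power * total_tuples
        boundaries.append(running)

    assignment = []
    node = 0
    seen = 0
    for count in histogram:
        # Place the bin according to the midpoint of its cumulative range.
        midpoint = seen + count / 2.0
        while node < len(node_powers) - 1 and midpoint > boundaries[node]:
            node += 1
        assignment.append(node)
        seen += count
    return assignment


if __name__ == "__main__":
    skewed = [50, 25, 12, 8, 3, 2]  # Zipf-like tuple counts per bin (made up)
    print(assign_bins_to_nodes(skewed, node_powers=[1, 1]))
```

In the example, the single heavy bin goes to one node and all remaining bins to the other, so both nodes receive 50 tuples even though the bin counts are highly skewed.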

Research Trends in Global Cruise Industry Using Keyword Network Analysis (키워드 네트워크 분석을 활용한 세계 크루즈산업 연구동향)

  • Jhang, Se-Eun;Lee, Su-Ho
    • Journal of Navigation and Port Research / v.38 no.6 / pp.607-614 / 2014
  • This article explores and discusses research trends in the global cruise industry using keyword network analysis. We visualize keyword networks for each of four periods (1982-1999, 2000-2004, 2005-2009, and 2010-2014) based on the degree centrality and betweenness centrality of the top 20 keyword nodes, two measures selected from four centrality measurements, and compare them with frequency order. The article shows that keyword frequency collected from 240 articles published in international journals follows Zipf's law and that the node degree distribution also exhibits a power law. We trace how important keywords change diachronically by visualizing several networks centered on the top two keywords, cruise and tourism, which appear in all four periods with high degree and betweenness centrality values. Interestingly, a new node, China, connecting the topmost keywords, appears in the most recent period of 2010-2014, when China emerged as one of the rapidly developing countries in the global cruise industry. Keyword network analysis as used in this article is therefore useful for understanding research trends in the global cruise industry, through the increase and decrease in the number of network types across periods and the visual connections between important nodes in giant components.

An Adaptive Batching Scheduling Policy for Efficient User Services (효율적인 사용자 서비스를 위한 적응적 배칭 스케줄링 정책)

  • Choe, Seong-Uk;Kim, Jong-Gyeong;Park, Seung-Gyu;Choe, Gyeong-Hui;Kim, Dong-Yun;Choe, Deok-Gyu
    • Journal of the Institute of Electronics Engineers of Korea CI / v.37 no.2 / pp.44-53 / 2000
  • Under a batching policy, user waiting delays are inevitable because requests are served not immediately but at each scheduling point. Inefficient management of such delays results in unfair service to users and increases the likelihood of higher reneging rates. This paper proposes an adaptive batch scheduling scheme that improves the average waiting time of user requests and reduces the starvation of users requesting less popular movies. The proposed scheme dynamically selects multiple videos in given intervals based on service patterns that reflect the popularity distribution (Zipf distribution) and resource utilization. Simulation results show that the proposed scheme improves average waiting time by about 20-30 percent and significantly reduces the number of starving requesters compared with conventional methods such as FCFS and MQL.
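
The two conventional baselines named in the abstract can be sketched as selection rules over per-video request queues, assuming the usual readings of the terms: FCFS serves the video whose oldest pending request arrived first, and MQL (commonly expanded as maximum queue length) serves the video with the most pending requests. The data layout and function names are assumptions for illustration; the paper's adaptive scheme is not reproduced.

```python
def pick_fcfs(queues):
    """FCFS batching: serve the video whose oldest pending request arrived first.

    queues maps a video id to a list of request arrival times (assumed layout).
    """
    waiting = {video: times for video, times in queues.items() if times}
    if not waiting:
        return None
    return min(waiting, key=lambda video: min(waiting[video]))


def pick_mql(queues):
    """MQL batching: serve the video with the most pending requests."""
    waiting = {video: times for video, times in queues.items() if times}
    if not waiting:
        return None
    return max(waiting, key=lambda video: len(waiting[video]))


if __name__ == "__main__":
    queues = {"popular": [5.0, 6.0, 7.0], "niche": [1.0]}
    print(pick_fcfs(queues))  # the niche video has the oldest request
    print(pick_mql(queues))   # the popular video has the longest queue
```

Under a Zipf-like popularity distribution, a pure queue-length rule keeps choosing popular videos, which is the starvation of less popular requests that the abstract refers to.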

Rank-Size Distribution with Web Document Frequency of City Name: Case study with U.S. incorporated places of 100,000 or more population (인터넷 문서빈도를 통해 본 도시순위규모에 관한 연구 -미국 10만 이상의 인구를 갖는 도시들을 사례로-)

  • Hong, Il-Young
    • Journal of the Korean association of regional geographers / v.13 no.3 / pp.290-300 / 2007
  • This study analyzes the web document frequency of city place names and uses it as the dataset for rank-size analysis. The search keywords are compared in terms of their spatial meaning, and different domain corpora are applied. The search results obtained are then used for further analysis. First, rank-size analysis is applied to compare the results for population and document frequency. Second, correlation analysis reveals significant changes as the spatial criteria in the search keywords are increased. Among the corpora, COM, NET, and ORG show the higher coefficient values. Lastly, cluster analysis is applied to classify cities according to their similarities and differences. These analyses play a significant role in representing the rank-size distribution of city names as reflected in web documents in the information society.

Implementation of a Layer-7 Web Clustering System on Linux with Performance Enhancements via Recognition of User Request Rate Variations (리눅스에서 레이어-7 웹 클러스터링 시스템의 구현 및 사용자 요청률 차이의 인식에 기반한 성능 개선)

  • Hong Il-gu;Noh Sam H.
    • Journal of KIISE:Information Networking / v.32 no.1 / pp.68-79 / 2005
  • The popularity of Web services is ever increasing. As the number of services and clients continues to increase, providing a system that scales with this growth is becoming more difficult. A costly and ineffective method is to buy a new, more powerful system every time the load becomes unbearable. A more cost-effective solution is to expand the system as the need arises. This is the approach taken in Web cluster systems. However, providing effective scalability in a Web cluster system is still an open issue. In this study, we implement a Web cluster system based on the Layer 7 switching technique on Linux. The implementation is based on a design proposed and implemented by Aron et al. on FreeBSD. Though the design is the same, because of the vast differences between FreeBSD and Linux, the implementation presented in this paper is entirely new. We also propose the Dual Scheduling (DS) load distribution algorithm, which distributes requests to system resources by observing variations in the request rate. We show through measurements on our implementation that the DS algorithm performs considerably better than previous algorithms.