• Title/Summary/Keyword: Zipf's distribution

A Study of Zipfian Phenomena in Hangul Literature (한글 문헌에 있어서 Zipfian 현상에 관한 연구)

  • 신강현;이두영
    • Journal of the Korean Society for Information Management / v.5 no.2 / pp.53-98 / 1988
  • The purpose of this study is to investigate the Zipfian distribution in Hangul literature. The result shows that the formulas derived from the Hangul literature are in accordance with the generalized Zipf's first law. The result also shows that the formulas derived from the Hangul literature are not in accordance with Zipf's second law or the generalized Zipf's second law.

On Regularity of Daily Distribution of Queries in Search Engine (검색엔진에서 일간질의 어분포의 정상성에 관한 연구)

  • Park, Sang-Gue;Lee, Chan-Kyu;Yoon, Kyung-Hyun;Kim, Seong-Hee;Lee, Jun-Ho
    • Journal of the Korean Society for Information Management / v.24 no.4 / pp.255-265 / 2007
  • In this paper, we analyze the regularity of daily patterns in the distribution of queries submitted to an internet search engine. We propose the Pareto distribution and Zipf's law as models of the query distribution and apply them to two weeks of daily queries on the search engine. We find some evidence that the Pareto and Zipf laws can be used to evaluate the regularity of daily query distribution patterns. These results can support a better understanding of social interests and trends through query distribution patterns.

Method for Designing Adaptive UI Based on User's Context in the Environment Including Mobile Device and Public Display Device (모바일 장치와 공용 디스플레이 장치를 포함하는 환경에서 사용자의 특성에 기반한 Adaptive UI 설계 방안)

  • Kang, Seung-Soo;Ko, Hyun;Youn, Hee Yong
    • Journal of Information Technology Services / v.11 no.4 / pp.181-194 / 2012
  • One of the most meaningful changes in the recent ubiquitous computing environment is the omnipresence of public digital display devices that provide information. An important issue for public digital displays is providing public information as well as information adapted to each user. This research proposes an approach that ensures fast response by selecting the user's preferred function. The preferred function is selected statistically using a Zipf distribution in a system comprising a mobile device and a digital display device based on NFC (Near Field Communication). The approach is validated with the CPM-GOMS model, which shows that user response can be improved.

User Centric Content Management System for Open IPTV Over SNS

  • Jeon, Seung Hyun;An, Sanghong;Yoon, Changwoo;Lee, Hyun-woo;Choi, Junkyun
    • Journal of Communications and Networks / v.17 no.3 / pp.296-305 / 2015
  • Coupled schemes between service-oriented architecture (SOA) and Web 2.0 have recently been researched. Web-based content providers and telecommunications company (Telecom) based Internet protocol television (IPTV) providers have competed to attract more three-screen service subscribers. Since the advent of Web 2.0, reproduced content can circulate more abundantly. However, because IPTV providers transcode content in advance to match the growing range of device resolutions and content formats, network bandwidth, storage, and operation costs for content management systems (CMSs) are wasted. In this paper, we present a user centric CMS for open IPTV, which integrates SOA and Web 2.0. To address these problems, we model content popularity with a Zipf-like distribution and compare the normalized costs of the user centric CMS and the conventional Web syndication system. Based on the user centric CMS, we implement a social Web TV with a device-aware function, which can aggregate, transcode, and deploy content over social networking services independently.
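The cost argument in this abstract rests on skewed content popularity. As a rough illustration only (not the paper's cost model), the sketch below computes what share of requests the `k` most popular items receive when popularity is proportional to `1 / rank**alpha`. The function name, catalog size, and `alpha` value are hypothetical choices made for this example.

```python
def top_k_request_share(n_items, k, alpha):
    """Share of all requests that go to the k most popular of n_items,
    assuming request probability proportional to 1 / rank**alpha."""
    if n_items < 1 or not 0 <= k <= n_items:
        raise ValueError("need n_items >= 1 and 0 <= k <= n_items")
    # Unnormalized Zipf-like weights for ranks 1..n_items.
    weights = [r ** -float(alpha) for r in range(1, n_items + 1)]
    return sum(weights[:k]) / sum(weights)


if __name__ == "__main__":
    # Hypothetical catalog: 10,000 items, alpha = 0.8.
    for k in (100, 1000, 10000):
        share = top_k_request_share(10_000, k, 0.8)
        print(f"top {k:>5} items -> {share:.1%} of requests")
```

Under these assumptions a small head of the catalog attracts a disproportionate share of requests, which is the intuition behind avoiding advance transcoding of every item.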

A Study on the Behaviors of Complex System Revealed in the Sizes of Public Libraries in Korea (우리나라 공공도서관의 규모에 나타나는 복잡계 현상에 관한 연구)

  • Lee, Soo-Sang
    • Journal of Korean Library and Information Science Society / v.44 no.4 / pp.399-419 / 2013
  • This paper presents an empirical analysis of the behaviors revealed in eight size distributions of public libraries in Korea. Complex-system behavior appeared in all eight size factors, which means that the sizes of public libraries in Korea are highly polarized. In particular, Zipf's law was found in size factors such as gross area, number of staff, number of books, and total budget, while highly uneven distributions occurred in size factors such as membership, number of users, number of borrowers, and number of borrowed books. These results show that a new public library policy is needed to resolve the polarization in the sizes of public libraries in Korea.

A New Parameter Estimation Method for a Zipf-like Distribution for Geospatial Data Access

  • Li, Rui;Feng, Wei;Wang, Hao;Wu, Huayi
    • ETRI Journal / v.36 no.1 / pp.134-140 / 2014
  • Many reports have shown that the access pattern for geospatial tiles follows Zipf's law and that its parameter ${\alpha}$ represents the access characteristics. However, visits to geospatial tiles have temporal and spatial popularities, and the ${\alpha}$-value changes as they change. We construct a mathematical model to simulate the user's access behavior by studying the attributes of frequently visited tile objects to determine parameter estimation algorithms. Because the least squares (LS) method in common use cannot obtain an exact ${\alpha}$-value and does not provide a suitable fit to data for frequently visited tiles, we present a new approach, which uses a moment method of estimation to obtain the value of ${\alpha}$ when ${\alpha}$ is close to 1. When ${\alpha}$ is further away from 1, the method uses the associated cache hit ratio for tile access and uses an LS method based on a critical cache size to estimate the value of ${\alpha}$. The decrease in the estimation error is presented and discussed in the section on experiment results. This new method, which provides a more accurate estimate of ${\alpha}$ than earlier methods, promises more effective prediction of requests for frequently accessed tiles for better caching and load balancing.
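For orientation, the least squares (LS) baseline that this abstract criticizes can be sketched as a straight-line fit of log frequency against log rank, with the estimate of $\alpha$ taken as the negated slope. This is a minimal sketch of that common baseline under idealized, noise-free data, not the moment-based or cache-hit-ratio method the paper proposes. All function names and the synthetic data are assumptions made for illustration.

```python
import math


def zipf_counts(n_items, alpha, total=1_000_000.0):
    """Ideal (noise-free) access counts for ranks 1..n_items,
    proportional to 1 / rank**alpha and summing to `total`."""
    weights = [r ** -float(alpha) for r in range(1, n_items + 1)]
    norm = sum(weights)
    return [total * w / norm for w in weights]


def estimate_alpha_ls(counts):
    """Estimate alpha as the negated slope of an ordinary least squares
    fit of log(count) against log(rank)."""
    ordered = sorted((c for c in counts if c > 0), reverse=True)
    if len(ordered) < 2:
        raise ValueError("need at least two positive counts")
    xs = [math.log(r) for r in range(1, len(ordered) + 1)]
    ys = [math.log(c) for c in ordered]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return -sxy / sxx


if __name__ == "__main__":
    counts = zipf_counts(1000, alpha=0.8)
    print(f"estimated alpha: {estimate_alpha_ls(counts):.4f}")
```

On exact power-law input the fit recovers the generating exponent. Real access logs are noisy and truncated, particularly in the low-frequency tail, so the LS estimate can be biased there; the abstract gives that limitation as its motivation for a different estimator.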

Market Access Approach to Urban Growth

  • MOON, YOON SANG
    • KDI Journal of Economic Policy / v.42 no.3 / pp.1-32 / 2020
  • This paper studies urban growth in Korean cities. First, I document that population growth patterns change over time and that the current population distribution supports random urban growth. I confirm two empirical laws, Zipf's law and Gibrat's law, both of which hold in the period of 1995-2015 but do not hold in the earlier period of 1975-1995. Second, I find a systematic employment growth pattern in Korean cities in spite of the random population growth. I examine market access effects on employment growth. Market access, a geographical advantage, has a significant influence on urban employment growth. The market access effect is higher in the Seoul metropolitan area than in the rest of the country. This effect is stronger on employment growth in the manufacturing industry than on employment growth in the service industry. These results are robust to various checks (e.g., different definitions of urban areas). The results here suggest that policymakers should consider geographical characteristics when they make policy decisions with respect to regional development.

Analysis of Keywords in national river occupancy permits by region using text mining and network theory (텍스트 마이닝과 네트워크 이론을 활용한 권역별 국가하천 점용허가 키워드 분석)

  • Seong Yun Jeong
    • Smart Media Journal / v.12 no.11 / pp.185-197 / 2023
  • This study uses text mining and network theory to extract information useful for occupancy applications and permit processing from the permit register, which has so far been used only to record occupancy permit information. Using text mining, we applied normalization steps such as stopword removal and morpheme analysis, then analyzed and compared vocabulary frequency and topic models across five regions (Seoul/Gyeonggi, Gyeongsang, Jeolla, Chungcheong, and Gangwon). By applying four centrality measures widely used in network theory (degree, closeness, betweenness, and eigenvector), we identified keywords that occupy a central position or act as intermediaries in the network. A comprehensive analysis of vocabulary frequency, topic modeling, and network centrality found that the keyword 'installation' was the most influential in all regions. This is believed to result from the Ministry of Environment's permit management office issuing many permits for constructing facilities or installing structures. In addition, keywords related to road facilities, flood control facilities, underground facilities, power/communication facilities, and sports/park facilities were found to be central or to act as intermediaries in the topic models and networks. Most keywords had a low frequency of occurrence and a low share of the total, consistent with a Zipf's law distribution.

Rank-Size Distribution with Web Document Frequency of City Name: Case study with U.S. incorporated places of 100,000 or more population (인터넷 문서빈도를 통해 본 도시순위규모에 관한 연구 -미국 10만 이상의 인구를 갖는 도시들을 사례로-)

  • Hong, Il-Young
    • Journal of the Korean Association of Regional Geographers / v.13 no.3 / pp.290-300 / 2007
  • In this study, the web document frequency of city place names is analyzed and used as the dataset for rank-size analysis. Search keywords are compared in terms of their spatial meaning, and different domain corpora are applied. The search results are then analyzed further. First, rank-size analysis is applied to compare population with document frequency. Second, correlation analysis reveals significant changes as the spatial criteria in the search keywords are increased. Among the corpora, COM, NET, and ORG show higher coefficient values. Lastly, cluster analysis is applied to classify cities by their similarities and differences. These analyses play a significant role in representing the rank-size distribution of city names as reflected in web documents in the information society.

Research Trends in Global Cruise Industry Using Keyword Network Analysis (키워드 네트워크 분석을 활용한 세계 크루즈산업 연구동향)

  • Jhang, Se-Eun;Lee, Su-Ho
    • Journal of Navigation and Port Research / v.38 no.6 / pp.607-614 / 2014
  • This article explores and discusses research trends in the global cruise industry using keyword network analysis. We visualize keyword networks for each of four periods (1982-1999, 2000-2004, 2005-2009, and 2010-2014) based on the degree centrality and betweenness centrality of the top 20 keyword nodes, two measures selected from four centrality measures, and compare them with frequency order. The article shows that keyword frequencies collected from 240 articles published in international journals follow Zipf's law and that the node degree distribution also exhibits a power law. We trace how important keywords change over time by visualizing several networks centered on the top two keywords, cruise and tourism, which appear in all four periods with high degree and betweenness centrality values. Interestingly, a new node, China, connected to the top keywords, appears in the most recent period of 2010-2014, when China emerged as one of the rapidly developing countries in the global cruise industry. Keyword network analysis as used in this article is therefore useful for understanding research trends in the global cruise industry, because it shows how the number of network types increases and decreases across periods and how important nodes are visually connected in giant components.