• 제목/요약/키워드: Web server system

검색결과 1,272건 처리시간 0.02초

클라우드 환경에서 MongoDB 기반의 비정형 로그 처리 시스템 설계 및 구현 (Design and Implementation of MongoDB-based Unstructured Log Processing System over Cloud Computing Environment)

  • 김명진;한승호;최운;이한구
    • 인터넷정보학회논문지
    • /
    • 제14권6호
    • /
    • pp.71-84
    • /
    • 2013
  • 컴퓨터 시스템 운용 간에 발생하는 많은 정보들이 기록되는 로그데이터는 컴퓨터 시스템 운용 점검, 프로세스의 최적화, 사용자 최적화 맞춤형 제공 등 다방면으로 활용되고 있다. 본 논문에서는 다양한 종류의 로그데이터들 중에서 은행에서 발생하는 대용량의 로그데이터를 처리하기 위한 클라우드 환경 하에서의 MongoDB 기반 비정형 로그 처리시스템을 제안한다. 은행업무간 발생하는 대부분의 로그데이터는 고객의 업무처리 프로세스 간에 발생하며, 고객 업무 프로세스 처리에 따른 로그데이터를 수집, 저장, 분류, 분석하기 위해서는 별도로 로그데이터를 처리하는 시스템을 구축해야만 한다. 하지만 기존 컴퓨팅환경 하에서는 폭발적으로 증가하는 대용량 비정형 로그데이터 처리를 위한 유연한 스토리지 확장성 기능, 저장된 비정형 로그데이터를 분류, 분석 처리할 수 있는 기능을 구현하기가 매우 어렵다. 이에 따라 본 논문에서는 클라우드 컴퓨팅 기술을 도입하여 기존 컴퓨팅 인프라 환경의 분석 도구 및 관리체계에서 처리하기 어려웠던 비정형 로그데이터를 처리하기 위한 클라우드 환경기반의 로그데이터 처리시스템을 제안하고 구현하였다. 제안한 본 시스템은 IaaS(Infrastructure as a Service) 클라우드 환경을 도입하여 컴퓨팅 자원의 유연한 확장성을 제공하며 실제로, 로그데이터가 장기간 축적되거나 급격하게 증가하는 상황에서 스토리지, 메모리 등의 자원을 신속성 있고 유연하게 확장을 할 수 있는 기능을 포함한다. 또한, 축적된 비정형 로그데이터의 실시간 분석이 요구되어질 때 기존의 분석도구의 처리한계를 극복하기 위해 본 시스템은 하둡 (Hadoop) 기반의 분석모듈을 도입함으로써 대용량의 로그데이터를 빠르고 신뢰성 있게 병렬 분산 처리할 수 있는 기능을 제공한다. 게다가, HDFS(Hadoop Distributed File System)을 도입함으로써 축적된 로그데이터를 블록단위로 복제본을 생성하여 저장관리하기 때문에 본 시스템은 시스템 장애와 같은 상황에서 시스템이 멈추지 않고 작동할 수 있는 자동복구 기능을 제공한다. 마지막으로, 본 시스템은 NoSQL 기반의 MongoDB를 이용하여 분산 데이터베이스를 구축함으로써 효율적으로 비정형로그데이터를 처리하는 기능을 제공한다. MySQL과 같은 관계형 데이터베이스는 복잡한 스키마 구조를 가지고 있기 때문에 비정형 로그데이터를 처리하기에 적합하지 않은 구조를 가지고 있다. 또한, 관계형 데이터베이스의 엄격한 스키마 구조는 장기간 데이터가 축적되거나, 데이터가 급격하게 증가할 때 저장된 데이터를 분할하여 여러 노드에 분산시키는 노드 확장이 어렵다는 문제점을 가지고 있다. NoSQL은 관계형 데이터베이스에서 제공하는 복잡한 연산을 지원하지는 않지만 데이터가 빠르게 증가할 때 노드 분산을 통한 데이터베이스 확장이 매우 용이하며 비정형 데이터를 처리하는데 매우 적합한 구조를 가지고 있는 비관계형 데이터베이스이다. NoSQL의 데이터 모델은 주로 키-값(Key-Value), 컬럼지향(Column-oriented), 문서지향(Document-Oriented)형태로 구분되며, 제안한 시스템은 스키마 구조가 자유로운 문서지향(Document-Oriented) 데이터 모델의 대표 격인 MongoDB를 도입하였다. 본 시스템에 MongoDB를 도입한 이유는 유연한 스키마 구조에 따른 비정형 로그데이터 처리의 용이성뿐만 아니라, 급격한 데이터 증가에 따른 유연한 노드 확장, 스토리지 확장을 자동적으로 수행하는 오토샤딩 (AutoSharding) 기능을 제공하기 때문이다. 본 논문에서 제안하는 시스템은 크게 로그 수집기 모듈, 로그 그래프생성 모듈, MongoDB 모듈, Hadoop기반 분석 모듈, MySQL 모듈로 구성되어져 있다. 로그 수집기 모듈은 각 은행에서 고객의 업무 프로세스 시작부터 종료 시점까지 발생하는 로그데이터가 클라우드 서버로 전송될 때 로그데이터 종류에 따라 데이터를 수집하고 분류하여 MongoDB 모듈과 MySQL 모듈로 분배하는 기능을 수행한다. 로그 그래프생성 모듈은 수집된 로그데이터를 분석시점, 분석종류에 따라 MongoDB 모듈, Hadoop기반 분석 모듈, MySQL 모듈에 의해서 분석되어진 결과를 사용자에게 웹 인터페이스 형태로 제공하는 역할을 한다. 실시간적 로그데이터분석이 필요한 로그데이터는 MySQL 모듈로 저장이 되어 로그 그래프생성 모듈을 통하여 실시간 로그데이터 정보를 제공한다. 실시간 분석이 아닌 단위시간당 누적된 로그데이터의 경우 MongoDB 모듈에 저장이 되고, 다양한 분석사항에 따라 사용자에게 그래프화해서 제공된다. MongoDB 모듈에 누적된 로그데이터는 Hadoop기반 분석모듈을 통해서 병렬 분산 처리 작업이 수행된다. 성능 평가를 위하여 로그데이터 삽입, 쿼리 성능에 대해서 MySQL만을 적용한 로그데이터 처리시스템과 제안한 시스템을 비교 평가하였으며 그 성능의 우수성을 검증하였다. 또한, MongoDB의 청크 크기별 로그데이터 삽입 성능평가를 통해 최적화된 청크 크기를 확인하였다.

A Study on the Meaning and Strategy of Keyword Advertising Marketing

  • Park, Nam Goo
    • 유통과학연구
    • /
    • 제8권3호
    • /
    • pp.49-56
    • /
    • 2010
  • At the initial stage of Internet advertising, banner advertising came into fashion. As the Internet developed into a central part of daily lives and the competition in the on-line advertising market was getting fierce, there was not enough space for banner advertising, which rushed to portal sites only. All these factors was responsible for an upsurge in advertising prices. Consequently, the high-cost and low-efficiency problems with banner advertising were raised, which led to an emergence of keyword advertising as a new type of Internet advertising to replace its predecessor. In the beginning of 2000s, when Internet advertising came to be activated, display advertisement including banner advertising dominated the Net. However, display advertising showed signs of gradual decline, and registered minus growth in the year 2009, whereas keyword advertising showed rapid growth and started to outdo display advertising as of the year 2005. Keyword advertising refers to the advertising technique that exposes relevant advertisements on the top of research sites when one searches for a keyword. Instead of exposing advertisements to unspecified individuals like banner advertising, keyword advertising, or targeted advertising technique, shows advertisements only when customers search for a desired keyword so that only highly prospective customers are given a chance to see them. In this context, it is also referred to as search advertising. It is regarded as more aggressive advertising with a high hit rate than previous advertising in that, instead of the seller discovering customers and running an advertisement for them like TV, radios or banner advertising, it exposes advertisements to visiting customers. Keyword advertising makes it possible for a company to seek publicity on line simply by making use of a single word and to achieve a maximum of efficiency at a minimum cost. The strong point of keyword advertising is that customers are allowed to directly contact the products in question through its more efficient advertising when compared to the advertisements of mass media such as TV and radio, etc. The weak point of keyword advertising is that a company should have its advertisement registered on each and every portal site and finds it hard to exercise substantial supervision over its advertisement, there being a possibility of its advertising expenses exceeding its profits. Keyword advertising severs as the most appropriate methods of advertising for the sales and publicity of small and medium enterprises which are in need of a maximum of advertising effect at a low advertising cost. At present, keyword advertising is divided into CPC advertising and CPM advertising. The former is known as the most efficient technique, which is also referred to as advertising based on the meter rate system; A company is supposed to pay for the number of clicks on a searched keyword which users have searched. This is representatively adopted by Overture, Google's Adwords, Naver's Clickchoice, and Daum's Clicks, etc. CPM advertising is dependent upon the flat rate payment system, making a company pay for its advertisement on the basis of the number of exposure, not on the basis of the number of clicks. This method fixes a price for advertisement on the basis of 1,000-time exposure, and is mainly adopted by Naver's Timechoice, Daum's Speciallink, and Nate's Speedup, etc, At present, the CPC method is most frequently adopted. The weak point of the CPC method is that advertising cost can rise through constant clicks from the same IP. If a company makes good use of strategies for maximizing the strong points of keyword advertising and complementing its weak points, it is highly likely to turn its visitors into prospective customers. Accordingly, an advertiser should make an analysis of customers' behavior and approach them in a variety of ways, trying hard to find out what they want. With this in mind, her or she has to put multiple keywords into use when running for ads. When he or she first runs an ad, he or she should first give priority to which keyword to select. The advertiser should consider how many individuals using a search engine will click the keyword in question and how much money he or she has to pay for the advertisement. As the popular keywords that the users of search engines are frequently using are expensive in terms of a unit cost per click, the advertisers without much money for advertising at the initial phrase should pay attention to detailed keywords suitable to their budget. Detailed keywords are also referred to as peripheral keywords or extension keywords, which can be called a combination of major keywords. Most keywords are in the form of texts. The biggest strong point of text-based advertising is that it looks like search results, causing little antipathy to it. But it fails to attract much attention because of the fact that most keyword advertising is in the form of texts. Image-embedded advertising is easy to notice due to images, but it is exposed on the lower part of a web page and regarded as an advertisement, which leads to a low click through rate. However, its strong point is that its prices are lower than those of text-based advertising. If a company owns a logo or a product that is easy enough for people to recognize, the company is well advised to make good use of image-embedded advertising so as to attract Internet users' attention. Advertisers should make an analysis of their logos and examine customers' responses based on the events of sites in question and the composition of products as a vehicle for monitoring their behavior in detail. Besides, keyword advertising allows them to analyze the advertising effects of exposed keywords through the analysis of logos. The logo analysis refers to a close analysis of the current situation of a site by making an analysis of information about visitors on the basis of the analysis of the number of visitors and page view, and that of cookie values. It is in the log files generated through each Web server that a user's IP, used pages, the time when he or she uses it, and cookie values are stored. The log files contain a huge amount of data. As it is almost impossible to make a direct analysis of these log files, one is supposed to make an analysis of them by using solutions for a log analysis. The generic information that can be extracted from tools for each logo analysis includes the number of viewing the total pages, the number of average page view per day, the number of basic page view, the number of page view per visit, the total number of hits, the number of average hits per day, the number of hits per visit, the number of visits, the number of average visits per day, the net number of visitors, average visitors per day, one-time visitors, visitors who have come more than twice, and average using hours, etc. These sites are deemed to be useful for utilizing data for the analysis of the situation and current status of rival companies as well as benchmarking. As keyword advertising exposes advertisements exclusively on search-result pages, competition among advertisers attempting to preoccupy popular keywords is very fierce. Some portal sites keep on giving priority to the existing advertisers, whereas others provide chances to purchase keywords in question to all the advertisers after the advertising contract is over. If an advertiser tries to rely on keywords sensitive to seasons and timeliness in case of sites providing priority to the established advertisers, he or she may as well make a purchase of a vacant place for advertising lest he or she should miss appropriate timing for advertising. However, Naver doesn't provide priority to the existing advertisers as far as all the keyword advertisements are concerned. In this case, one can preoccupy keywords if he or she enters into a contract after confirming the contract period for advertising. This study is designed to take a look at marketing for keyword advertising and to present effective strategies for keyword advertising marketing. At present, the Korean CPC advertising market is virtually monopolized by Overture. Its strong points are that Overture is based on the CPC charging model and that advertisements are registered on the top of the most representative portal sites in Korea. These advantages serve as the most appropriate medium for small and medium enterprises to use. However, the CPC method of Overture has its weak points, too. That is, the CPC method is not the only perfect advertising model among the search advertisements in the on-line market. So it is absolutely necessary that small and medium enterprises including independent shopping malls should complement the weaknesses of the CPC method and make good use of strategies for maximizing its strengths so as to increase their sales and to create a point of contact with customers.

  • PDF