• Title/Summary/Keyword: Web log

Search Result 294, Processing Time 0.028 seconds

Designing Integrated Diagnosis Platform for Heterogeneous Combat System of Surface Vessels (다기종 수상함 전투체계의 통합 진단 플랫폼 설계)

  • Kim, Myeong-hun
    • Proceedings of the Korean Institute of Information and Commucation Sciences Conference
    • /
    • 2021.05a
    • /
    • pp.186-188
    • /
    • 2021
  • The architecture named IDPS is a design concept of web-based integrated platform for heterogeneous naval combat system, which accomplishes efficiency(decreasing complexity) of diagnosis process and reduces time to diagnose system. Each type of surface vessel has its own diagnostic processes and applications, and that means it also requires its own diagnostic engineer(inefficiency in human resource management). In addition, man-based diagnostic causes quality issues such as difference approach of log analysis in accordance with engineer skills. Thus In this paper, we designed integrated diagnostic platform named IDPS with simplified common process regardless of type of surface vessel and we reinforced IDPS with status decision algorithm(SDA) that judges current software status of vessel based on gathered lots of logs. It will enable engineers to diagnose system more efficiently and to use more resources in utilizing SDA-analyzed diagnostic results.

  • PDF

A Multimodal Profile Ensemble Approach to Development of Recommender Systems Using Big Data (빅데이터 기반 추천시스템 구현을 위한 다중 프로파일 앙상블 기법)

  • Kim, Minjeong;Cho, Yoonho
    • Journal of Intelligence and Information Systems
    • /
    • v.21 no.4
    • /
    • pp.93-110
    • /
    • 2015
  • The recommender system is a system which recommends products to the customers who are likely to be interested in. Based on automated information filtering technology, various recommender systems have been developed. Collaborative filtering (CF), one of the most successful recommendation algorithms, has been applied in a number of different domains such as recommending Web pages, books, movies, music and products. But, it has been known that CF has a critical shortcoming. CF finds neighbors whose preferences are like those of the target customer and recommends products those customers have most liked. Thus, CF works properly only when there's a sufficient number of ratings on common product from customers. When there's a shortage of customer ratings, CF makes the formation of a neighborhood inaccurate, thereby resulting in poor recommendations. To improve the performance of CF based recommender systems, most of the related studies have been focused on the development of novel algorithms under the assumption of using a single profile, which is created from user's rating information for items, purchase transactions, or Web access logs. With the advent of big data, companies got to collect more data and to use a variety of information with big size. So, many companies recognize it very importantly to utilize big data because it makes companies to improve their competitiveness and to create new value. In particular, on the rise is the issue of utilizing personal big data in the recommender system. It is why personal big data facilitate more accurate identification of the preferences or behaviors of users. The proposed recommendation methodology is as follows: First, multimodal user profiles are created from personal big data in order to grasp the preferences and behavior of users from various viewpoints. We derive five user profiles based on the personal information such as rating, site preference, demographic, Internet usage, and topic in text. Next, the similarity between users is calculated based on the profiles and then neighbors of users are found from the results. One of three ensemble approaches is applied to calculate the similarity. Each ensemble approach uses the similarity of combined profile, the average similarity of each profile, and the weighted average similarity of each profile, respectively. Finally, the products that people among the neighborhood prefer most to are recommended to the target users. For the experiments, we used the demographic data and a very large volume of Web log transaction for 5,000 panel users of a company that is specialized to analyzing ranks of Web sites. R and SAS E-miner was used to implement the proposed recommender system and to conduct the topic analysis using the keyword search, respectively. To evaluate the recommendation performance, we used 60% of data for training and 40% of data for test. The 5-fold cross validation was also conducted to enhance the reliability of our experiments. A widely used combination metric called F1 metric that gives equal weight to both recall and precision was employed for our evaluation. As the results of evaluation, the proposed methodology achieved the significant improvement over the single profile based CF algorithm. In particular, the ensemble approach using weighted average similarity shows the highest performance. That is, the rate of improvement in F1 is 16.9 percent for the ensemble approach using weighted average similarity and 8.1 percent for the ensemble approach using average similarity of each profile. From these results, we conclude that the multimodal profile ensemble approach is a viable solution to the problems encountered when there's a shortage of customer ratings. This study has significance in suggesting what kind of information could we use to create profile in the environment of big data and how could we combine and utilize them effectively. However, our methodology should be further studied to consider for its real-world application. We need to compare the differences in recommendation accuracy by applying the proposed method to different recommendation algorithms and then to identify which combination of them would show the best performance.

X-tree Diff: An Efficient Change Detection Algorithm for Tree-structured Data (X-tree Diff: 트리 기반 데이터를 위한 효율적인 변화 탐지 알고리즘)

  • Lee, Suk-Kyoon;Kim, Dong-Ah
    • The KIPS Transactions:PartC
    • /
    • v.10C no.6
    • /
    • pp.683-694
    • /
    • 2003
  • We present X-tree Diff, a change detection algorithm for tree-structured data. Our work is motivated by need to monitor massive volume of web documents and detect suspicious changes, called defacement attack on web sites. From this context, our algorithm should be very efficient in speed and use of memory space. X-tree Diff uses a special ordered labeled tree, X-tree, to represent XML/HTML documents. X-tree nodes have a special field, tMD, which stores a 128-bit hash value representing the structure and data of subtrees, so match identical subtrees form the old and new versions. During this process, X-tree Diff uses the Rule of Delaying Ambiguous Matchings, implying that it perform exact matching where a node in the old version has one-to one corrspondence with the corresponding node in the new, by delaying all the others. It drastically reduces the possibility of wrong matchings. X-tree Diff propagates such exact matchings upwards in Step 2, and obtain more matchings downwsards from roots in Step 3. In step 4, nodes to ve inserted or deleted are decided, We aldo show thst X-tree Diff runs on O(n), woere n is the number of noses in X-trees, in worst case as well as in average case, This result is even better than that of BULD Diff algorithm, which is O(n log(n)) in worst case, We experimented X-tree Diff on reat data, which are about 11,000 home pages from about 20 wev sites, instead of synthetic documets manipulated for experimented for ex[erimentation. Currently, X-treeDiff algorithm is being used in a commeercial hacking detection system, called the WIDS(Web-Document Intrusion Detection System), which is to find changes occured in registered websites, and report suspicious changes to users.

Quantitative Microbial Risk Assessment Model for Staphylococcus aureus in Kimbab (김밥에서의 Staphylococcus aureus에 대한 정량적 미생물위해평가 모델 개발)

  • Bahk, Gyung-Jin;Oh, Deog-Hwan;Ha, Sang-Do;Park, Ki-Hwan;Joung, Myung-Sub;Chun, Suk-Jo;Park, Jong-Seok;Woo, Gun-Jo;Hong, Chong-Hae
    • Korean Journal of Food Science and Technology
    • /
    • v.37 no.3
    • /
    • pp.484-491
    • /
    • 2005
  • Quantitative microbial risk assessment (QMRA) analyzes potential hazard of microorganisms on public health and offers structured approach to assess risks associated with microorganisms in foods. This paper addresses specific risk management questions associated with Staphylococcus aureus in kimbab and improvement and dissemination of QMRA methodology, QMRA model was developed by constructing four nodes from retail to table pathway. Predictive microbial growth model and survey data were combined with probabilistic modeling to simulate levels of S. aureus in kimbab at time of consumption, Due to lack of dose-response models, final level of S. aureus in kimbeb was used as proxy for potential hazard level, based on which possibility of contamination over this level and consumption level of S. aureus through kimbab were estimated as 30.7% and 3.67 log cfu/g, respectively. Regression sensitivity results showed time-temperature during storage at selling was the most significant factor. These results suggested temperature control under $10^{\circ}C$ was critical control point for kimbab production to prevent growth of S. aureus and showed QMRA was useful for evaluation of factors influencing potential risk and could be applied directly to risk management.

A Study on the Meaning and Strategy of Keyword Advertising Marketing

  • Park, Nam Goo
    • Journal of Distribution Science
    • /
    • v.8 no.3
    • /
    • pp.49-56
    • /
    • 2010
  • At the initial stage of Internet advertising, banner advertising came into fashion. As the Internet developed into a central part of daily lives and the competition in the on-line advertising market was getting fierce, there was not enough space for banner advertising, which rushed to portal sites only. All these factors was responsible for an upsurge in advertising prices. Consequently, the high-cost and low-efficiency problems with banner advertising were raised, which led to an emergence of keyword advertising as a new type of Internet advertising to replace its predecessor. In the beginning of 2000s, when Internet advertising came to be activated, display advertisement including banner advertising dominated the Net. However, display advertising showed signs of gradual decline, and registered minus growth in the year 2009, whereas keyword advertising showed rapid growth and started to outdo display advertising as of the year 2005. Keyword advertising refers to the advertising technique that exposes relevant advertisements on the top of research sites when one searches for a keyword. Instead of exposing advertisements to unspecified individuals like banner advertising, keyword advertising, or targeted advertising technique, shows advertisements only when customers search for a desired keyword so that only highly prospective customers are given a chance to see them. In this context, it is also referred to as search advertising. It is regarded as more aggressive advertising with a high hit rate than previous advertising in that, instead of the seller discovering customers and running an advertisement for them like TV, radios or banner advertising, it exposes advertisements to visiting customers. Keyword advertising makes it possible for a company to seek publicity on line simply by making use of a single word and to achieve a maximum of efficiency at a minimum cost. The strong point of keyword advertising is that customers are allowed to directly contact the products in question through its more efficient advertising when compared to the advertisements of mass media such as TV and radio, etc. The weak point of keyword advertising is that a company should have its advertisement registered on each and every portal site and finds it hard to exercise substantial supervision over its advertisement, there being a possibility of its advertising expenses exceeding its profits. Keyword advertising severs as the most appropriate methods of advertising for the sales and publicity of small and medium enterprises which are in need of a maximum of advertising effect at a low advertising cost. At present, keyword advertising is divided into CPC advertising and CPM advertising. The former is known as the most efficient technique, which is also referred to as advertising based on the meter rate system; A company is supposed to pay for the number of clicks on a searched keyword which users have searched. This is representatively adopted by Overture, Google's Adwords, Naver's Clickchoice, and Daum's Clicks, etc. CPM advertising is dependent upon the flat rate payment system, making a company pay for its advertisement on the basis of the number of exposure, not on the basis of the number of clicks. This method fixes a price for advertisement on the basis of 1,000-time exposure, and is mainly adopted by Naver's Timechoice, Daum's Speciallink, and Nate's Speedup, etc, At present, the CPC method is most frequently adopted. The weak point of the CPC method is that advertising cost can rise through constant clicks from the same IP. If a company makes good use of strategies for maximizing the strong points of keyword advertising and complementing its weak points, it is highly likely to turn its visitors into prospective customers. Accordingly, an advertiser should make an analysis of customers' behavior and approach them in a variety of ways, trying hard to find out what they want. With this in mind, her or she has to put multiple keywords into use when running for ads. When he or she first runs an ad, he or she should first give priority to which keyword to select. The advertiser should consider how many individuals using a search engine will click the keyword in question and how much money he or she has to pay for the advertisement. As the popular keywords that the users of search engines are frequently using are expensive in terms of a unit cost per click, the advertisers without much money for advertising at the initial phrase should pay attention to detailed keywords suitable to their budget. Detailed keywords are also referred to as peripheral keywords or extension keywords, which can be called a combination of major keywords. Most keywords are in the form of texts. The biggest strong point of text-based advertising is that it looks like search results, causing little antipathy to it. But it fails to attract much attention because of the fact that most keyword advertising is in the form of texts. Image-embedded advertising is easy to notice due to images, but it is exposed on the lower part of a web page and regarded as an advertisement, which leads to a low click through rate. However, its strong point is that its prices are lower than those of text-based advertising. If a company owns a logo or a product that is easy enough for people to recognize, the company is well advised to make good use of image-embedded advertising so as to attract Internet users' attention. Advertisers should make an analysis of their logos and examine customers' responses based on the events of sites in question and the composition of products as a vehicle for monitoring their behavior in detail. Besides, keyword advertising allows them to analyze the advertising effects of exposed keywords through the analysis of logos. The logo analysis refers to a close analysis of the current situation of a site by making an analysis of information about visitors on the basis of the analysis of the number of visitors and page view, and that of cookie values. It is in the log files generated through each Web server that a user's IP, used pages, the time when he or she uses it, and cookie values are stored. The log files contain a huge amount of data. As it is almost impossible to make a direct analysis of these log files, one is supposed to make an analysis of them by using solutions for a log analysis. The generic information that can be extracted from tools for each logo analysis includes the number of viewing the total pages, the number of average page view per day, the number of basic page view, the number of page view per visit, the total number of hits, the number of average hits per day, the number of hits per visit, the number of visits, the number of average visits per day, the net number of visitors, average visitors per day, one-time visitors, visitors who have come more than twice, and average using hours, etc. These sites are deemed to be useful for utilizing data for the analysis of the situation and current status of rival companies as well as benchmarking. As keyword advertising exposes advertisements exclusively on search-result pages, competition among advertisers attempting to preoccupy popular keywords is very fierce. Some portal sites keep on giving priority to the existing advertisers, whereas others provide chances to purchase keywords in question to all the advertisers after the advertising contract is over. If an advertiser tries to rely on keywords sensitive to seasons and timeliness in case of sites providing priority to the established advertisers, he or she may as well make a purchase of a vacant place for advertising lest he or she should miss appropriate timing for advertising. However, Naver doesn't provide priority to the existing advertisers as far as all the keyword advertisements are concerned. In this case, one can preoccupy keywords if he or she enters into a contract after confirming the contract period for advertising. This study is designed to take a look at marketing for keyword advertising and to present effective strategies for keyword advertising marketing. At present, the Korean CPC advertising market is virtually monopolized by Overture. Its strong points are that Overture is based on the CPC charging model and that advertisements are registered on the top of the most representative portal sites in Korea. These advantages serve as the most appropriate medium for small and medium enterprises to use. However, the CPC method of Overture has its weak points, too. That is, the CPC method is not the only perfect advertising model among the search advertisements in the on-line market. So it is absolutely necessary that small and medium enterprises including independent shopping malls should complement the weaknesses of the CPC method and make good use of strategies for maximizing its strengths so as to increase their sales and to create a point of contact with customers.

  • PDF

An Interactive Approach to Categorize Questions on the Internet BBSs (인터넷 게시판 질문 분류를 위한 인터랙티브 접근방법에 관한 연구)

  • Jae-Kwang Lee;Seong-Ho Noh;Ok-Hyun Ryou
    • The Journal of Society for e-Business Studies
    • /
    • v.8 no.3
    • /
    • pp.177-195
    • /
    • 2003
  • In a traditional customer support environment, mainly call centers or service centers are responsible for receiving inquiries from their customers via telephone calls. Due to the rapid growth of Internet with its widespread acceptance and accessibility, means of communication with customers in the traditional customer support center, such as telephones, letters, and direct-visiting, have been replaced by e-mails and bulletin board systems (BBSs) using the Internet constantly. BBSs are basically question and answer systems, they require some lead time to get answer from administrator. To reduce lead time, BBSs enable remote customers or users to log on and tap into a knowledge database that is generally formatted in the form of Frequently Asked Questions (FAQs) that provide answers and solutions to the common problems. And, many different types of the questions are mixed on the BBS. It is a burden to administrator. To build FAQs and to support BBS adminstrator, a supporting tool which is to categorize questions is helpful. In this research, we suggest an interactive question categorizing methodology which consists of steps to present question using keywords, identifying keywords' affinity, computing similarity among questions, and clustering questions. This methodology allows users to interact iteratively for clear manifestation of ambiguous questions. We also developed a prototype system, IQC (interactive question categorizer) and evaluated its performance using the comparison experiments with other systems. IQC is not a general purposed system, but it produces a good result in a given specific domain.

  • PDF

User Perspective Website Clustering for Site Portfolio Construction (사이트 포트폴리오 구성을 위한 사용자 관점의 웹사이트 클러스터링)

  • Kim, Mingyu;Kim, Namgyu
    • Journal of Internet Computing and Services
    • /
    • v.16 no.3
    • /
    • pp.59-69
    • /
    • 2015
  • Many users visit websites every day to perform information retrieval, shopping, and community activities. On the other hand, there is intense competition among sites which attempt to profit from the Internet users. Thus, the owners or marketing officers of each site try to design a variety of marketing strategies including cooperation with other sites. Through such cooperation, a site can share customers' information, mileage points, and hyperlinks with other sites. To create effective cooperation, it is crucial to choose an appropriate partner site that may have many potential customers. Unfortunately, it is exceedingly difficult to identify such an appropriate partner among the vast number of sites. In this paper, therefore, we devise a new methodology for recommending appropriate partner sites to each site. For this purpose, we perform site clustering from the perspective of visitors' similarities, and then identify a group of sites that has a number of common customers. We then analyze the potential for the practical use of the proposed methodology through its application to approximately 140 million actual site browsing histories.

Proxy Caching Scheme Based on the User Access Pattern Analysis for Series Video Data (시리즈 비디오 데이터의 접근 패턴에 기반한 프록시 캐슁 기법)

  • Hong, Hyeon-Ok;Park, Seong-Ho;Chung, Ki-Dong
    • Journal of Korea Multimedia Society
    • /
    • v.7 no.8
    • /
    • pp.1066-1077
    • /
    • 2004
  • Dramatic increase in the number of Internet users want highly qualified service of continuous media contents on the web. To solve these problems, we present two network caching schemes(PPC, PPCwP) which consider the characteristics of continuous media objects and user access pattern in this paper. While there are plenty of reasons to create rich media contents, delivering this high bandwidth contents over the internet presents problems such as server overload, network congestion and client-perceived latency. PPC scheme periodically calculates the popularity of objects based on the playback quantity and determines the optimal size of the initial fraction of a continuous media object to be cached in proportion to the calculated popularity. PPCwP scheme calculates the expected popularity using the series information and prefetches the expected initial fraction of newly created continuous media objects. Under the PPCwP scheme, the initial client-perceived latency and the data transferred from a remote server can be reduced and limited cache storage space can be utilized efficiently. Trace-driven simulation have been performed to evaluate the presented caching schemes using the log-files of iMBC. Through these simulations, PPC and PPCwP outperforms LRU and LFU in terms of BHR and DSR.

  • PDF

Design and Implementation of Picture Archiving and Communication System Component using the RFID for Mobile Web Environments (모바일 웹 환경을 위한 의료영상저장전송시스템 컴포넌트의 설계 및 구현)

  • Kim Chang-Soo;Yim Jae-Hong
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.10 no.6
    • /
    • pp.1124-1131
    • /
    • 2006
  • The recent medical treatment guidelines and the development of information technology make hospitals reduce the expense in surrounding environment and it requires improving the quality of medical treatment of the hospital. Moreover, MIS, PACS(Picture Archiving and Communication System), OCS, EMR are also developing. Medical Information System is evolved toward integration of medical IT and situation is changing with increasing high speed in the ICT convergence. Mobile component refers to construct wireless system of hospital which has constructed in existing environment. Through RFID development in existing system, anyone can log on easily to internet whenever and wherever. It is the core technology to implement automatic medical processing system. This paper provides a basic review of RFID model, PACS application component services. In addition, designed and implemented database server's component program and client program of mobile application that recognized RFID tag and patient data in the ubiquitous environments. This system implemented mobile PACS that performed patient data based db environments, and so reduced delay time of requisition, medical treatment, lab.

Temporal Data Mining Framework (시간 데이타마이닝 프레임워크)

  • Lee, Jun-Uk;Lee, Yong-Jun;Ryu, Geun-Ho
    • The KIPS Transactions:PartD
    • /
    • v.9D no.3
    • /
    • pp.365-380
    • /
    • 2002
  • Temporal data mining, the incorporation of temporal semantics to existing data mining techniques, refers to a set of techniques for discovering implicit and useful temporal knowledge from large quantities of temporal data. Temporal knowledge, expressible in the form of rules, is knowledge with temporal semantics and relationships, such as cyclic pattern, calendric pattern, trends, etc. There are many examples of temporal data, including patient histories, purchaser histories, and web log that it can discover useful temporal knowledge from. Many studies on data mining have been pursued and some of them have involved issues of temporal data mining for discovering temporal knowledge from temporal data, such as sequential pattern, similar time sequence, cyclic and temporal association rules, etc. However, all of the works treated data in database at best as data series in chronological order and did not consider temporal semantics and temporal relationships containing data. In order to solve this problem, we propose a theoretical framework for temporal data mining. This paper surveys the work to date and explores the issues involved in temporal data mining. We then define a model for temporal data mining and suggest SQL-like mining language with ability to express the task of temporal mining and show architecture of temporal mining system.