• Title/Summary/Keyword: Web Log Data


Algorithm for Extracting the General Web Search Path Pattern (일반적인 웹 검색 경로패턴 추출 알고리즘)

  • Jang, Min-Seok;Ha, Eun-Mi
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / v.9 no.1 / pp.771-773 / 2005
  • Prior studies have analyzed the information retrieval patterns recorded in log files in order to efficiently capture users' information search patterns in the web environment. These studies typically propose algorithms that efficiently derive the frequent patterns from path traversal patterns. A common limitation, however, is that they do not provide a proper solution for complex, that is, general topological patterns. This paper therefore defines the general information retrieval pattern and proposes an efficient algorithm for it.

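As a rough illustration of what "deriving frequent patterns from path traversal patterns" can mean, here is a minimal Python sketch. It is not the algorithm of the paper above: it only counts contiguous page sequences of a fixed length across sessions, and the function name, the session format, and the sample pages are all hypothetical.

```python
from collections import Counter

def frequent_paths(sessions, length, min_support):
    """Return contiguous page sequences of `length` pages that occur
    in at least `min_support` sessions, with their session counts."""
    counts = Counter()
    for pages in sessions:
        # Use a set so each sequence is counted once per session.
        seen = {tuple(pages[i:i + length])
                for i in range(len(pages) - length + 1)}
        counts.update(seen)
    return {path: n for path, n in counts.items() if n >= min_support}

# Hypothetical sessions: each list is the ordered pages one user visited.
sessions = [
    ["A", "B", "C"],
    ["A", "B", "D"],
    ["B", "C", "A", "B"],
]
result = frequent_paths(sessions, length=2, min_support=2)
```

This handles only linear, contiguous paths; the general topological patterns that the abstract refers to would need a richer representation than fixed-length tuples.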

An Efficient Algorithm for Mining Interactive Communication Sequence Patterns (대화형 통신 순서열 패턴의 마이닝을 위한 효율적인 알고리즘)

  • Haam, Deok-Min;Song, Ji-Hwan;Kim, Myoung-Ho
    • Journal of KIISE:Databases / v.36 no.3 / pp.169-179 / 2009
  • Communication log data consist of communication events such as sending and receiving e-mail or instant messages and visiting web sites. Many jurisdictions, including the USA and the EU, require communication service providers to retain these data for the purpose of investigating or detecting criminals on the Internet. Because the retained data are very large, law enforcement authorities need an efficient method for extracting valuable information from them. This paper defines Interactive Communication Sequence Patterns (ICSPs), which are important information when each communication event in the communication log data consists of a sender, a receiver, and a timestamp. We also define a mining problem to discover such patterns and propose a method called Fast Discovering Interactive Communication Sequence Pattern (FDICSP) to solve it. FDICSP exploits the characteristics of ICS to reduce the search space when it finds longer sequences by using shorter sequences. Thus, FDICSP can find Interactive Communication Sequence Patterns efficiently.

The Evaluation for Web Mining and Analytics Service from the View of Personal Information Protection and Privacy (개인정보보호 관점에서의 웹 트래픽 수집 및 분석 서비스에 대한 타당성 연구)

  • Kang, Daniel;Shim, Mi-Na;Bang, Je-Wan;Lee, Sang-Jin;Lim, Jong-In
    • Journal of the Korea Institute of Information Security & Cryptology / v.19 no.6 / pp.121-134 / 2009
  • Consumer-centric marketing is one of the most successful emerging businesses, but it poses a threat to personal privacy. Service providers and users hold conflicting positions on many issues. Enterprises assert that using personal data poses no problem as long as the data are anonymous, while individuals are rarely in a position to raise problems that remain latent. Web traffic analysis technology does not create issues by itself, but it may cause concerns when it is applied to data of a personal nature. The most criticized ethical issue involving web traffic analysis is the invasion of privacy. We therefore need to inspect how much and what kind of personal information is being used, and whether any personal information is being handled illegally. In this paper, we inspect the operation of consumer-centric marketing tools such as web log analysis solutions and data gathering services that use web browser toolbars. We also use reverse engineering to inspect a Microsoft Explorer-based toolbar application that records and analyzes personal web browsing patterns. Finally, this study identifies and explores the security and privacy requirements for developing more reliable solutions. This study is important for the balanced development of personal privacy protection and the web traffic analysis industry.

Method of extracting context from media data by using video sharing site

  • Kondoh, Satoshi;Ogawa, Takeshi
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2009.01a / pp.709-713 / 2009
  • Recently, much research has applied data acquired from devices such as cameras and RFIDs to context-aware services in the fields of Life-Log and sensor networks. A variety of analytical techniques have been proposed to recognize various information from the raw data, because video and audio data include a larger volume of information than other sensor data. However, because these techniques generally use supervised learning, a huge amount of media data has had to be watched again manually to create supervised data whenever a class is updated or a new class is added. As a result, in most cases applications could use only recognition functions based on fixed supervised data. We therefore proposed a method of acquiring supervised data from a video sharing site where users comment on any video scene, because such sites are remarkably popular and many comments are generated. In the first step of this method, words with a high utility value are extracted by filtering the comments about the video. Second, a set of time-series feature data is calculated by applying functions that extract various features to the media data. Finally, our learning system calculates the correlation coefficient between these two kinds of data and stores it in the system's DB. By applying this correlation coefficient to new media data, various other applications can include a recognition function that generates collective intelligence based on Web comments. In addition, regularly acquiring and learning both media data and comments from a video sharing site enables flexible recognition that adjusts to new objects while reducing manual work. As a result, it became possible to recognize not only the name of the object seen but also indirect information, e.g. the impression of or the action toward the object.

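The correlation step described in the abstract above can be illustrated with a minimal Python sketch. The abstract does not say which correlation coefficient is used or how the two data sets are aligned, so this sketch assumes a Pearson coefficient, a comment keyword reduced to per-time-window counts, and one media feature sampled on the same windows. The function name and the sample numbers are hypothetical.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have the same length (at least 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # A constant series carries no correlation signal; report 0.0.
        return 0.0
    return cov / sqrt(var_x * var_y)

# Hypothetical data: how often one keyword appears in comments per
# time window, and one feature value (e.g. loudness) per window.
keyword_counts = [0, 1, 4, 5, 1, 0]
feature_values = [0.1, 0.2, 0.8, 0.9, 0.3, 0.1]
r = pearson(keyword_counts, feature_values)
```

A coefficient close to 1 for a (keyword, feature) pair would suggest that the feature is a useful cue for that keyword on new media data.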

Web log Data Analysis Apply to Web Contents Analysis Result data (웹로그 분석을 적용한 웹사이트내의 웹컨텐츠 분석 연구 결과)

  • 정선경;이칠우
    • Proceedings of the Korean Information Science Society Conference / 2003.04c / pp.579-581 / 2003
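  • This paper reports the results of a quantitative evaluation that applies web log analysis data, the traces left by website users, to assess the fitness for purpose and usability of the web content that makes up a website. When a user visits a website, a record of the visit is left as a trace in the form of a log, stored in a web log file. Log analysis extracts various kinds of information from this data. Depending on the practitioner, log analysis may be limited to analyzing the log information itself, or it may be extended to a broader concept that analyzes more diverse information derived from the logs. Using WebTrends, a log analysis tool, this paper applies the quantitative nature of web log analysis results to the analysis of web content within a website and derives new results for website analysis, which had so far relied on qualitative analysis. In addition, responding thoroughly to customer needs in marketing requires a thorough understanding of the customers. Log file analysis can extract data such as the main customer segments, customer purchase patterns, peak purchase times, and purchase navigation paths. Based on log file analysis data, a range of customer-facing marketing activities become possible, such as interface design, product layout design, and improved customer service. This paper combines log analysis, previously a matter of simple numerical analysis, with web content analysis techniques and presents the results of a study on the usability of the content used within a website.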


DESIGN AND IMPLEMENTATION OF REAL-TIME MRTG++

  • Um, Nam-Kyoung;Oh, Chung-Sik;Lee, Sang-Ho
    • Proceedings of the KSRS Conference / 2005.10a / pp.785-788 / 2005
  • We design and implement MRTG++, an enhanced version that eliminates drawbacks of the original MRTG. The existing MRTG makes it easy to monitor network resources on the web, and it can show utilization trends for the resources of remote systems. However, MRTG only provides statistical information with daily/monthly/yearly characteristics in real time, and it periodically deletes stored data. We therefore replace the log file-based storage method with a more effective database-backed one and propose an extended MRTG++ that can retrieve and manage the measured data. It can also be applied in ubiquitous environments.


Ecological Risk Assessment of Residual Petroleum Hydrocarbons using a Foodweb Bioaccumulation Model (먹이연쇄 생물축적 모형을 이용한 잔류유류오염물질의 생태위해성평가)

  • Hwang, Sang-Il;Kwon, Jung-Hwan
    • Journal of Korean Society of Environmental Engineers / v.31 no.11 / pp.947-956 / 2009
  • Residual petroleum hydrocarbons after an oil spill may accumulate in the marine benthic ecosystem due to their high hydrophobicity. A large amount of monitoring data is required to estimate ecosystem exposure to residual petrochemicals in an ecological risk assessment of the affected region. To save time and cost, environmental exposure in the affected ecosystem can also be assessed using a simple food-web bioaccumulation model. In this study, we evaluated residual concentrations of four selected polycyclic aromatic hydrocarbons (phenanthrene, anthracene, pyrene, and benzo[a]pyrene) in a hypothetical benthic ecosystem composed of six species under two exposure scenarios. Body-residue concentrations ranged from 5 to 250 mg/kg body, depending on trophic position, in an extreme scenario in which the aqueous concentrations of PAHs were assumed to be one-tenth of their aqueous solubility. In addition, bioconcentration factors (BCFs) and bioaccumulation factors (BAFs) were evaluated for the model species. The logarithm of the bioconcentration factor (log BCF) increased linearly with the logarithm of the 1-octanol-water partition coefficient (log $K_{OW}$) up to a log $K_{OW}$ of 7.0, followed by a gradual decrease with further increase in log $K_{OW}$ without metabolic degradation. Biomagnification became significant when the log $K_{OW}$ of a pollutant exceeded 5.0 in the model ecosystem. Because the log $K_{OW}$ values of many petrochemicals are higher than 5.0, this indicates that investigating the food-web structure is critical for predicting biomagnification in the affected ecosystem. Although further research is required for better site-specific evaluation of exposure, the model simulation can be used to estimate the level of ecosystem exposure to residual oil contaminants at the screening level.

Development of a Web-Based Program for Cross-Calibration and Record Management of Radiation Measuring Equipment

  • Park, So Hyun;Lee, Rena;Kim, Kyubo;Ahn, Sohyun;Lim, Sangwook;Cho, Samju
    • Progress in Medical Physics / v.30 no.2 / pp.59-63 / 2019
  • Purpose: To manage radiation measurement equipment, a web-based management program has been developed in this study. Materials and Methods: This program is based on a web service and Java Server Pages (JSP) and is designed for compatibility and accessibility. Results: The first step in the workflow has been designed to create accounts for each user or organization and to log in. The program consists of two parts: fields for listed instruments, and measurement information. The instruments for measuring radiation listed in this program are as follows: ionization chambers, survey meters, thermometers, barometers, electrometers, and phantoms. Instrument properties can be entered in the recording fields, and associated instruments can be browsed. The main part of the program is the cross-calibration for each ion chamber. For instance, an ionization chamber to be used as a relative dosimeter can be registered with cross-calibration data against a reference chamber calibrated by an accredited laboratory. This program supports methods using the central axis transfer theory for cross-calibration of the ionization chambers. The reference and field ionization chambers were placed in a solid water phantom along the beam central axis at two different depths, and then the positions were switched. Each measured value was used for calculating the cross-calibration factor. Conclusions: Because many instruments are used and managed in radiation oncology departments, systematic, traceable recording is very important. The web-based program developed in this study is expected to be used effectively in the maintenance of radiation measurement instruments.

A Study on the Development of Internet Purchase Support Systems Based on Data Mining and Case-Based Reasoning (데이터마이닝과 사례기반추론 기법에 기반한 인터넷 구매지원 시스템 구축에 관한 연구)

  • 김진성
    • Journal of the Korean Operations Research and Management Science Society / v.28 no.3 / pp.135-148 / 2003
  • In this paper we introduce Internet-based purchase support systems that use data mining and case-based reasoning (CBR). Internet business activity that involves the end user is undergoing a significant revolution. The ability to track users' browsing behavior has brought vendors and end customers closer than ever before. It is now possible for a vendor to personalize its product message for individual customers at massive scale. Most previous researchers in this area used data mining techniques to predict customers' future behavior and to improve the frequency of repurchase. The area of data mining can be defined as efficiently discovering association rules from large collections of data. However, the basic association rule-based data mining technique is not flexible: if there are no inference rules for tracking a customer's future behavior, association rule-based data mining systems may not be able to present further information. To resolve this problem, we combined association rule-based data mining with a CBR mechanism. CBR is used for reasoning about customer preferences and for training through cases. The hybrid purchase support mechanism based on data mining and CBR can reflect both association rule-based logical inference and case-based information reuse. Web log data gathered from a real-world Internet shopping mall are used to illustrate the quality of the proposed systems.
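
For readers unfamiliar with association rules, the two standard measures behind them (support and confidence) can be sketched in a few lines of Python. This is a generic illustration, not the system described in the abstract above; the function names and the sample sessions are hypothetical.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    items = set(itemset)
    if not transactions:
        return 0.0
    hits = sum(1 for t in transactions if items <= set(t))
    return hits / len(transactions)

def confidence(transactions, antecedent, consequent):
    """Estimated P(consequent | antecedent) for the rule
    antecedent -> consequent."""
    base = support(transactions, antecedent)
    if base == 0:
        return 0.0  # the rule never fires, so report no confidence
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / base

# Hypothetical shopping sessions reconstructed from a web log.
sessions = [
    {"laptop", "mouse"},
    {"laptop", "mouse", "bag"},
    {"laptop", "bag"},
    {"mouse"},
]
rule_support = support(sessions, {"laptop", "mouse"})
rule_confidence = confidence(sessions, {"laptop"}, {"mouse"})
```

When no rule reaches a useful confidence for a given customer, a rule-only system has nothing to recommend, which is the gap the abstract says CBR is meant to fill.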

Korean internet audience centric measurement : Internet index (PcMeter를 이용한 인터넷 접속을 측정)

  • 이상경
    • Survey Research / v.1 no.1 / pp.125-134 / 2000
  • It is very important to adhere to scientific principles in marketing research. In sampling especially, we have to select a scheme that avoids self-selection bias. Internet Index is a ready-made research service produced by analyzing, in accordance with these scientific principles, the log data transferred from PcMeter installed on panelists' PCs, together with panelist profiles and a site database. Through the index we can determine the actual behaviour of Korean netizens surfing various web sites and understand their cybergraphics.
