Design and Implementation of MongoDB-based Unstructured Log Processing System over Cloud Computing Environment (클라우드 환경에서 MongoDB 기반의 비정형 로그 처리 시스템 설계 및 구현)
-
- Journal of Internet Computing and Services
- /
- v.14 no.6
- /
- pp.71-84
- /
- 2013
Log data, which record the multitude of information created when operating computer systems, are utilized in many processes, from carrying out computer system inspection and process optimization to providing customized user optimization. In this paper, we propose a MongoDB-based unstructured log processing system in a cloud environment for processing the massive amount of log data of banks. Most of the log data generated during banking operations come from handling a client's business. Therefore, in order to gather, store, categorize, and analyze the log data generated while processing the client's business, a separate log data processing system needs to be established. However, the realization of flexible storage expansion functions for processing a massive amount of unstructured log data and executing a considerable number of functions to categorize and analyze the stored unstructured log data is difficult in existing computer environments. Thus, in this study, we use cloud computing technology to realize a cloud-based log data processing system for processing unstructured log data that are difficult to process using the existing computing infrastructure's analysis tools and management system. The proposed system uses the IaaS (Infrastructure as a Service) cloud environment to provide a flexible expansion of computing resources and includes the ability to flexibly expand resources such as storage space and memory under conditions such as extended storage or rapid increase in log data. Moreover, to overcome the processing limits of the existing analysis tool when a real-time analysis of the aggregated unstructured log data is required, the proposed system includes a Hadoop-based analysis module for quick and reliable parallel-distributed processing of the massive amount of log data. Furthermore, because the HDFS (Hadoop Distributed File System) stores data by generating copies of the block units of the aggregated log data, the proposed system offers automatic restore functions for the system to continually operate after it recovers from a malfunction. Finally, by establishing a distributed database using the NoSQL-based Mongo DB, the proposed system provides methods of effectively processing unstructured log data. Relational databases such as the MySQL databases have complex schemas that are inappropriate for processing unstructured log data. Further, strict schemas like those of relational databases cannot expand nodes in the case wherein the stored data are distributed to various nodes when the amount of data rapidly increases. NoSQL does not provide the complex computations that relational databases may provide but can easily expand the database through node dispersion when the amount of data increases rapidly; it is a non-relational database with an appropriate structure for processing unstructured data. The data models of the NoSQL are usually classified as Key-Value, column-oriented, and document-oriented types. Of these, the representative document-oriented data model, MongoDB, which has a free schema structure, is used in the proposed system. MongoDB is introduced to the proposed system because it makes it easy to process unstructured log data through a flexible schema structure, facilitates flexible node expansion when the amount of data is rapidly increasing, and provides an Auto-Sharding function that automatically expands storage. The proposed system is composed of a log collector module, a log graph generator module, a MongoDB module, a Hadoop-based analysis module, and a MySQL module. When the log data generated over the entire client business process of each bank are sent to the cloud server, the log collector module collects and classifies data according to the type of log data and distributes it to the MongoDB module and the MySQL module. The log graph generator module generates the results of the log analysis of the MongoDB module, Hadoop-based analysis module, and the MySQL module per analysis time and type of the aggregated log data, and provides them to the user through a web interface. Log data that require a real-time log data analysis are stored in the MySQL module and provided real-time by the log graph generator module. The aggregated log data per unit time are stored in the MongoDB module and plotted in a graph according to the user's various analysis conditions. The aggregated log data in the MongoDB module are parallel-distributed and processed by the Hadoop-based analysis module. A comparative evaluation is carried out against a log data processing system that uses only MySQL for inserting log data and estimating query performance; this evaluation proves the proposed system's superiority. Moreover, an optimal chunk size is confirmed through the log data insert performance evaluation of MongoDB for various chunk sizes.
With the continued globalization of world markets, transfer pricing has become one of the dominant sources of controversy in international taxation. Transfer pricing is the process by which a multinational corporation calculates a price for goods and services that are transferred to affiliated entities. Consider a Korean electronic enterprise that buys supplies from its own subsidiary located in China. How much the Korean parent company pays its subsidiary will determine how much profit the Chinese unit reports in local taxes. If the parent company pays above normal market prices, it may appear to have a poor profit, even if the group as a whole shows a respectable profit margin. In this way, transfer prices impact the taxable income reported in each country in which the multinational enterprise operates. It's importance lies in that around 60% of international trade involves transactions between two related parts of multinationals, according to the OECD. Multinational enterprises (hereafter MEs) exert much effort into utilizing organizational advantages to make global investments. MEs wish to minimize their tax burden. So MEs spend a fortune on economists and accountants to justify transfer prices that suit their tax needs. On the contrary, local governments are not prepared to cope with MEs' powerful financial instruments. Tax authorities in each country wish to ensure that the tax base of any ME is divided fairly. Thus, both tax authorities and MEs have a vested interest in the way in which a transfer price is determined, and this is why MEs' international transfer prices are at the center of disputes concerned with taxation. Transfer pricing issues and practices are sometimes difficult to control for regulators because the tax administration does not have enough staffs with the knowledge and resources necessary to understand them. The authors examine transfer pricing practices to provide relevant resources useful in designing tax incentives and regulation schemes for policy makers. This study focuses on identifying the relevant business and environmental factors that could influence the international transfer pricing of MEs. In this perspective, we empirically investigate how the management perception of related variables influences their choice of international transfer pricing methods. We believe that this research is particularly useful in the design of tax policy. Because it can concentrate on a few selected factors in consideration of the limited budget of the tax administration with assistance of this research. Data is composed of questionnaire responses from foreign firms in Korea with investment balances exceeding one million dollars in the end of 2004. We mailed questionnaires to 861 managers in charge of the accounting departments of each company, resulting in 121 valid responses. Seventy six percent of the sample firms are classified as small and medium sized enterprises with assets below 100 billion Korean won. Reviewing transfer pricing methods, cost-based transfer pricing is most popular showing that 60 firms have adopted it. The market-based method is used by 31 firms, and 13 firms have reported the resale-pricing method. Regarding the nationalities of foreign investors, the Japanese and the Americans constitute most of the sample. Logistic regressions have been performed for statistical analysis. The dependent variable is binary in that whether the method of international transfer pricing is a market-based method or a cost-based method. This type of binary classification is founded on the belief that the market-based method is evaluated as the relatively objective way of pricing compared with the cost-based methods. Cost-based pricing is assumed to give mangers flexibility in transfer pricing decisions. Therefore, local regulatory agencies are thought to prefer market-based pricing over cost-based pricing. Independent variables are composed of eight factors such as corporate tax rate, tariffs, relations with local tax authorities, tax audit, equity ratios of local investors, volume of internal trade, sales volume, and product life cycle. The first four variables are included in the model because taxation lies in the center of transfer pricing disputes. So identifying the impact of these variables in Korean business environments is much needed. Equity ratio is included to represent the interest of local partners. Volume of internal trade was sometimes employed in previous research to check the pricing behavior of managers, so we have followed these footsteps in this paper. Product life cycle is used as a surrogate of competition in local markets. Control variables are firm size and nationality of foreign investors. Firm size is controlled using dummy variables in that whether or not the specific firm is small and medium sized. This is because some researchers report that big firms show different behaviors compared with small and medium sized firms in transfer pricing. The other control variable is also expressed in dummy variable showing if the entrepreneur is the American or not. That's because some prior studies conclude that the American management style is different in that they limit branch manger's freedom of decision. Reviewing the statistical results, we have found that managers prefer the cost-based method over the market-based method as the importance of corporate taxes and tariffs increase. This result means that managers need flexibility to lessen the tax burden when they feel taxes are important. They also prefer the cost-based method as the product life cycle matures, which means that they support subsidiaries in local market competition using cost-based transfer pricing. On the contrary, as the relationship with local tax authorities becomes more important, managers prefer the market-based method. That is because market-based pricing is a better way to maintain good relations with the tax officials. Other variables like tax audit, volume of internal transactions, sales volume, and local equity ratio have shown only insignificant influence. Additionally, we have replaced two tax variables(corporate taxes and tariffs) with the data showing top marginal tax rate and mean tariff rates of each country, and have performed another regression to find if we could get different results compared with the former one. As a consequence, we have found something different on the part of mean tariffs, that shows only an insignificant influence on the dependent variable. We guess that each company in the sample pays tariffs with a specific rate applied only for one's own company, which could be located far from mean tariff rates. Therefore we have concluded we need a more detailed data that shows the tariffs of each company if we want to check the role of this variable. Considering that the present paper has heavily relied on questionnaires, an effort to build a reliable data base is needed for enhancing the research reliability.
1. Introduction Today Internet is recognized as an important way for the transaction of products and services. According to the data surveyed by the National Statistical Office, the on-line transaction in 2007 for a year, 15.7656 trillion, shows a 17.1%(2.3060 trillion won) increase over last year, of these, the amount of B2C has been increased 12.0%(10.2258 trillion won). Like this, because the entry barrier of on-line market of Korea is low, many retailers could easily enter into the market. So the bigger its scale is, but on the other hand, the tougher its competition is. Particularly due to the Internet and innovation of IT, the existing market has been changed into the perfect competitive market(Srinivasan, Rolph & Kishore, 2002). In the early years of on-line business, they think that the main reason for success is a moderate price, they are awakened to its importance of on-line service quality with tough competition. If it's not sure whether customers can be provided with what they want, they can use the Web sites, perhaps they can trust their products that had been already bought or not, they have a doubt its viability(Parasuraman, Zeithaml & Malhotra, 2005). Customers can directly reserve and issue their air tickets irrespective of place and time at the Web sites of travel agencies or airlines, but its empirical studies about these Web sites for reserving and issuing air tickets are insufficient. Therefore this study goes on for following specific objects. First object is to measure service quality and service recovery of Web sites for reserving and issuing air tickets. Second is to look into whether above on-line service quality and on-line service recovery have an impact on overall service quality. Third is to seek for the relation with overall service quality and customer satisfaction, then this customer satisfaction and loyalty intention. 2. Theoretical Background 2.1 On-line Service Quality Barnes & Vidgen(2000; 2001a; 2001b; 2002) had invented the tool to measure Web sites' quality four times(called WebQual). The WebQual 1.0, Step one invented a measuring item for information quality based on QFD, and this had been verified by students of UK business school. The Web Qual 2.0, Step two invented for interaction quality, and had been judged by customers of on-line bookshop. The WebQual 3.0, Step three invented by consolidating the WebQual 1.0 for information quality and the WebQual2.0 for interactionquality. It includes 3-quality-dimension, information quality, interaction quality, site design, and had been assessed and confirmed by auction sites(e-bay, Amazon, QXL). Furtheron, through the former empirical studies, the authors changed sites quality into usability by judging that usability is a concept how customers interact with or perceive Web sites and It is used widely for accessing Web sites. By this process, WebQual 4.0 was invented, and is consist of 3-quality-dimension; information quality, interaction quality, usability, 22 items. However, because WebQual 4.0 is focusing on technical part, it's usable at the Website's design part, on the other hand, it's not usable at the Web site's pleasant experience part. Parasuraman, Zeithaml & Malhorta(2002; 2005) had invented the measure for measuring on-line service quality in 2002 and 2005. The study in 2002 divided on-line service quality into 5 dimensions. But these were not well-organized, so there needed to be studied again totally. So Parasuraman, Zeithaml & Malhorta(2005) re-worked out the study about on-line service quality measure base on 2002's study and invented E-S-QUAL. After they invented preliminary measure for on-line service quality, they made up a question for customers who had purchased at amazon.com and walmart.com and reassessed this measure. And they perfected an invention of E-S-QUAL consists of 4 dimensions, 22 items of efficiency, system availability, fulfillment, privacy. Efficiency measures assess to sites and usability and others, system availability measures accurate technical function of sites and others, fulfillment measures promptness of delivering products and sufficient goods and others and privacy measures the degree of protection of data about their customers and so on. 2.2 Service Recovery Service industries tend to minimize the losses by coping with service failure promptly. This responses of service providers to service failure mean service recovery(Kelly & Davis, 1994). Bitner(1990) went on his study from customers' view about service providers' behavior for customers to recognize their satisfaction/dissatisfaction at service point. According to them, to manage service failure successfully, exact recognition of service problem, an apology, sufficient description about service failure and some tangible compensation are important. Parasuraman, Zeithaml & Malhorta(2005) approached the service recovery from how to measure, rather than how to manage, and moved to on-line market not to off-line, then invented E-RecS-QUAL which is a measuring tool about on-line service recovery. 2.3 Customer Satisfaction The definition of customer satisfaction can be divided into two points of view. First, they approached customer satisfaction from outcome of comsumer. Howard & Sheth(1969) defined satisfaction as 'a cognitive condition feeling being rewarded properly or improperly for their sacrifice.' and Westbrook & Reilly(1983) also defined customer satisfaction/dissatisfaction as 'a psychological reaction to the behavior pattern of shopping and purchasing, the display condition of retail store, outcome of purchased goods and service as well as whole market.' Second, they approached customer satisfaction from process. Engel & Blackwell(1982) defined satisfaction as 'an assessment of a consistency in chosen alternative proposal and their belief they had with them.' Tse & Wilton(1988) defined customer satisfaction as 'a customers' reaction to discordance between advance expectation and ex post facto outcome.' That is, this point of view that customer satisfaction is process is the important factor that comparing and assessing process what they expect and outcome of consumer. Unlike outcome-oriented approach, process-oriented approach has many advantages. As process-oriented approach deals with customers' whole expenditure experience, it checks up main process by measuring one by one each factor which is essential role at each step. And this approach enables us to check perceptual/psychological process formed customer satisfaction. Because of these advantages, now many studies are adopting this process-oriented approach(Yi, 1995). 2.4 Loyalty Intention Loyalty has been studied by dividing into behavioral approaches, attitudinal approaches and complex approaches(Dekimpe et al., 1997). In the early years of study, they defined loyalty focusing on behavioral concept, behavioral approaches regard customer loyalty as "a tendency to purchase periodically within a certain period of time at specific retail store." But the loyalty of behavioral approaches focuses on only outcome of customer behavior, so there are someone to point the limits that customers' decision-making situation or process were neglected(Enis & Paul, 1970; Raj, 1982; Lee, 2002). So the attitudinal approaches were suggested. The attitudinal approaches consider loyalty contains all the cognitive, emotional, voluntary factors(Oliver, 1997), define the customer loyalty as "friendly behaviors for specific retail stores." However these attitudinal approaches can explain that how the customer loyalty form and change, but cannot say positively whether it is moved to real purchasing in the future or not. This is a kind of shortcoming(Oh, 1995). 3. Research Design 3.1 Research Model Based on the objects of this study, the research model derived is shows, Step 1 and Step 2 are significant, and mediation variable has a significant effect on dependent variables and so does independent variables at Step 3, too. And there needs to prove the partial mediation effect, independent variable's estimate ability at Step 3(Standardized coefficient
shows, Step 1 and Step 2 are significant, and mediation variable has a significant effect on dependent variables and so does independent variables at Step 3, too. And there needs to prove the partial mediation effect, independent variable's estimate ability at Step 3(Standardized coefficient
이메일무단수집거부
이용약관
제 1 장 총칙
제 2 장 이용계약의 체결
제 3 장 계약 당사자의 의무
제 4 장 서비스의 이용
제 5 장 계약 해지 및 이용 제한
제 6 장 손해배상 및 기타사항
Detail Search
Image Search
(β)