Search | Korea Science

Detection of Phantom Transaction using Data Mining: The Case of Agricultural Product Wholesale Market (데이터마이닝을 이용한 허위거래 예측 모형: 농산물 도매시장 사례)

Lee, Seon Ah;Chang, Namsik
- Journal of Intelligence and Information Systems / v.21 no.1 / pp.161-177 / 2015
With the rapid evolution of technology, the size, number, and variety of databases have grown, and data mining approaches now face many challenging applications. One such application is the discovery of fraud patterns in agricultural product wholesale transactions. The agricultural product wholesale market in Korea is huge, and vast numbers of transactions take place every day. Demand for agricultural products continues to grow, and electronic auction systems have raised the operational efficiency of the wholesale market. The number of unusual transactions is assumed to increase in proportion to trading volume, and an unusual transaction is often the first sign of fraud. However, these transactions and the corresponding fraud are very difficult to identify and detect because fraud schemes have become more sophisticated than ever. Fraud can be detected by manually verifying all transaction records, but this requires a significant amount of human resources and is ultimately impractical. Fraud can also be revealed by a victim's report or complaint, but agricultural product wholesale fraud usually has no victims because it is committed through collusion between an auction company and an intermediary wholesaler. Nevertheless, transaction records must be monitored continuously and efforts made to prevent fraud, because fraud not only disturbs the fair trade order of the market but also rapidly erodes its credibility. Applying data mining in such an environment is very useful because it can discover previously unknown fraud patterns or features in a large volume of transaction data.
The objective of this research is to empirically investigate the factors needed to detect fraudulent transactions in an agricultural product wholesale market by developing a data mining based fraud detection model. One major type of fraud is the phantom transaction, in which the seller (an auction company or forwarder) and the buyer (an intermediary wholesaler) collude. They pretend to complete a transaction by recording false data in the online transaction processing system without actually selling products, and the seller receives money from the buyer. This leads to overstated sales performance and illegal money transfers, which reduce the credibility of the market. This paper reviews the wholesale market environment, including the types of transactions, the roles of market participants, and the various types and characteristics of fraud, and then introduces the whole process of developing the phantom transaction detection model. The process consists of the following four modules: (1) data cleaning and standardization, (2) statistical data analysis such as distribution and correlation analysis, (3) construction of a classification model using a decision-tree induction approach, and (4) verification of the model in terms of hit ratio. We collected real data from 6 associations of agricultural producers in metropolitan markets. The final model, built with a decision-tree induction approach, revealed that the monthly average trading price of items offered by forwarders is a key variable in detecting phantom transactions. The verification procedure also confirmed the suitability of the results. However, even though the performance of the model is satisfactory, open issues remain in improving classification accuracy and the conciseness of the rules. One such issue is the robustness of the data mining model. Data mining is highly data-oriented, so data mining models tend to be very sensitive to changes in data or circumstances.
Thus, this lack of robustness means that the model must be rebuilt continuously as data or circumstances change. We hope that this paper offers a valuable guideline to organizations and companies that consider introducing or constructing a fraud detection model in the future.
https://doi.org/10.13088/jiis.2015.21.1.161
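
Modules (3) and (4) of the process described in this abstract can be illustrated with a minimal sketch. It is not the authors' model: it fits a one-level decision tree (a single split chosen by Gini impurity) on one feature, the monthly average trading price, and then measures the hit ratio on held-out records. All records, names, and values are invented for illustration.

```python
# Hypothetical toy records: (monthly average trading price, is_phantom).
# These values are invented and are NOT the paper's data.
train = [
    (1200.0, False), (1350.0, False), (1280.0, False), (1420.0, False),
    (2900.0, True), (3100.0, True), (1500.0, False), (2750.0, True),
]
holdout = [(1300.0, False), (3000.0, True), (2000.0, False), (2500.0, False)]


def gini(labels):
    """Gini impurity of a list of boolean labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def majority(labels):
    """Majority class of a list of boolean labels (ties count as True)."""
    return sum(labels) * 2 >= len(labels)


def fit_stump(data):
    """Pick the price threshold that minimizes the weighted Gini impurity."""
    prices = sorted({price for price, _ in data})
    best_score, best_rule = float("inf"), None
    # Candidate thresholds are midpoints between consecutive distinct prices.
    for low, high in zip(prices, prices[1:]):
        threshold = (low + high) / 2.0
        left = [label for price, label in data if price <= threshold]
        right = [label for price, label in data if price > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(data)
        if score < best_score:
            best_score = score
            best_rule = (threshold, majority(left), majority(right))
    return best_rule


def predict(rule, price):
    """Apply the fitted one-split rule to a single price."""
    threshold, left_label, right_label = rule
    return left_label if price <= threshold else right_label


def hit_ratio(rule, data):
    """Share of records whose predicted label matches the true label."""
    hits = sum(predict(rule, price) == label for price, label in data)
    return hits / len(data)


rule = fit_stump(train)
print("threshold:", rule[0])
print("hit ratio on held-out records:", hit_ratio(rule, holdout))
```

A full decision-tree induction would apply the same split search recursively and across several variables; the single split is kept here only to show how a rule and its hit ratio relate.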

Development of Yóukè Mining System with Yóukè's Travel Demand and Insight Based on Web Search Traffic Information (웹검색 트래픽 정보를 활용한 유커 인바운드 여행 수요 예측 모형 및 유커마이닝 시스템 개발)

Choi, Youji;Park, Do-Hyung
- Journal of Intelligence and Information Systems / v.23 no.3 / pp.155-175 / 2017
As social data have come into the spotlight, mainstream web search engines now provide web search traffic data, which indicate how many people searched for a specific keyword. Web search traffic information aggregates the crowd of users who search for a specific keyword. In various areas, web search traffic can serve as a useful variable that represents the attention of ordinary users to specific interests. Many studies use web search traffic data to nowcast or forecast social phenomena, such as epidemic prediction, consumer pattern analysis, product life cycles, and financial investment modeling. Web search traffic data have also begun to be applied to predicting inbound tourism. Proper demand prediction is needed because tourism is a high value-added industry that increases employment and foreign exchange earnings. Among these tourists, the number of Chinese tourists, known as Youke, continues to grow; Youke have for many years been the largest group of inbound tourists to Korea, and in tourism revenue per tourist as well. Research into suitable demand prediction approaches for Youke is therefore important in both the public and private sectors. Accurate tourism demand prediction is important for efficient decision making with limited resources. This study proposes an improved model that reflects the latest social issues by capturing the attention of groups of individuals. Travel abroad is generally a high-involvement activity, so potential tourists are likely to search intensively for information about their trip. Web search traffic data present tourists' attention during trip preparation in an instantaneous and dynamic way. This study therefore attempted to select the keywords that potential Chinese tourists are likely to search for on the internet. Baidu, the largest web search engine in China with a market share of over 80%, gives users access to web search traffic data.
Qualitative interviews with potential tourists helped us understand information search behavior before a trip and identify the keywords for this study. The selected keywords are categorized into three levels according to how directly they relate to "Korean Tourism". This categorization helps identify which keywords explain Youke inbound demand, ordered from the closest category to the most distant. Web search traffic data for each keyword were gathered by a web crawler developed to collect search data from Baidu Index. Using the automatically gathered variables, a linear model was built by multiple regression analysis, which suits operational use in decision and policy making because the relationships among variables are easy to explain. After the regression models were built, a model composed of traditional variables was compared with a model that adds web search traffic variables to the traditional ones, in terms of significance and R squared. After comparing the performance of the models, the final model was composed. The final regression model offers improved explanatory power and the advantages of real-time immediacy and convenience over the traditional model. Furthermore, this study demonstrates a system with intuitive visualization for general use: the Youke Mining solution provides several functions that support tourism decision making and embeds the final regression model. The Youke Mining solution combines an algorithm based on data science with a well-designed, simple interface. Finally, this research has three significant implications, in theoretical, practical, and policy terms. Theoretically, the Youke Mining system and the model in this research are a first step toward predicting Youke inbound demand using an interactive and instant variable: web search traffic information, which represents tourists' attention while they prepare their trip. Baidu accounts for more than 80% of the web search engine market.
Practically, Baidu data can therefore represent, in real time, the attention of potential tourists who are preparing their trips. Finally, in policy terms, the Chinese tourist demand prediction model based on web search traffic can support tourism decision making for efficient resource management and for seizing opportunities for successful policy.
https://doi.org/10.13088/jiis.2017.23.3.155
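
The model comparison described in this abstract (a regression on traditional variables versus the same regression with web search traffic variables added, compared by R squared) can be sketched as follows. This is only an illustration under stated assumptions: the series `traditional`, `search_index`, and `arrivals` are invented numbers, the variable names are hypothetical, and the ordinary least squares fit is done with plain normal equations rather than whatever software the authors used.

```python
# Hypothetical monthly series, invented for illustration only.
traditional = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]    # e.g. an economic index
search_index = [5.0, 3.0, 8.0, 1.0, 9.0, 2.0, 7.0, 4.0]   # e.g. keyword search volume
arrivals = [27.1, 22.9, 40.05, 20.95, 47.1, 27.9, 45.05, 37.95]


def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Partial pivoting: move the largest remaining entry into place.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def r_squared(predictors, y):
    """Fit OLS with an intercept and return the coefficient of determination."""
    rows = [[1.0] + list(values) for values in zip(*predictors)]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(b * x for b, x in zip(beta, r)) for r in rows]
    mean_y = sum(y) / len(y)
    ss_res = sum((t - f) ** 2 for t, f in zip(y, fitted))
    ss_tot = sum((t - mean_y) ** 2 for t in y)
    return 1.0 - ss_res / ss_tot


r2_baseline = r_squared([traditional], arrivals)
r2_extended = r_squared([traditional, search_index], arrivals)
print("R squared, traditional variables only:", round(r2_baseline, 3))
print("R squared, with search traffic added: ", round(r2_extended, 3))
```

Because R squared never decreases when a predictor is added to a nested least squares model, a real comparison would also check coefficient significance, as the abstract notes, or use adjusted R squared.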

A Comparative Case Study on the Adaptation Process of Advanced Information Technology: A Grounded Theory Approach for the Appropriation Process (신기술 사용 과정에 관한 비교 사례 연구: 기술 전유 과정의 근거이론적 접근)

Choi, Hee-Jae;Lee, Zoon-Ky
- Asia Pacific Journal of Information Systems / v.19 no.3 / pp.99-124 / 2009
Many firms in Korea have adopted and used advanced information technology in an effort to boost efficiency. The process of adapting to the new technology, however, can vary from one firm to another. This research therefore focuses on several relevant factors, especially the role of social interaction as a key variable that influences the technology adaptation process and its outcomes. How a firm goes through the process of adapting to new technology has not yet been fully explored. Previous studies on the changes a firm or organization undergoes because of information technology have been pursued from various theoretical points of view, evolving from technological and institutional views to an integrated social technology view. The technology adaptation process has been understood as something that evolves over time, cycling between misalignment and alignment and gradually approaching a stable aligned state. Poole and DeSanctis (1994) defined the process of adapting to new technology as an "appropriation" process. They suggested that this process is not automatically determined by the technology design itself. Rather, people actively select how technology structures should be used; accordingly, adoption practices vary. However, the concepts of the appropriation process in these studies are not precise, and the suggested propositions are not clear enough to apply in practice. Furthermore, these studies do not substantially indicate which factors change during the appropriation process or what should be done to bring about effective outcomes. The objective of this study is therefore to find the causes of the differences in how advanced information technology has been used and adopted among organizations. The study also aims to explore how a firm's interaction with social as well as technological factors leads to different organizational changes.
The detailed objectives of this study are as follows. First, this paper focuses primarily on the appropriation process of advanced information technology over the long run and looks into the reasons for the diverse types of usage. Second, this study categorizes the phases of the appropriation process and clarifies what changes occur and how they evolve during each phase. Third, this study suggests guidelines for determining which strategies are needed at the individual, group, and organizational levels. To this end, a substantive grounded theory that can be applied to organizational practice was developed from a longitudinal comparative case study. The technology appropriation process was explored on the basis of Structuration Theory by Giddens (1984) and Orlikowski and Robey (1991) and Adaptive Structuration Theory by Poole and DeSanctis (1994), which are examples of social technology views on technology-driven organizational change. Data were obtained from interviews, observations of medical treatment tasks, and questionnaires administered to group members who use the technology. Data coding was carried out in three steps following the grounded theory approach. First, in open coding, concepts and categories were developed from the interview and observation data. Next, in axial coding, we related categories to subcategories along the lines of their properties and dimensions through the paradigm model. Finally, in selective coding, the grounded theory of the appropriation process was developed through the conditional/consequential matrix. In this study, eight hypotheses about the adaptation process have been clearly articulated. We also found that the appropriation process evolves through three phases, namely "direct appropriation," "cooperate with related structures," and "interpret and make judgments." The higher the phase of appropriation, the more varied the types of instrumental use and attitude that users exhibit.
Moreover, previous structures such as "knowledge and experience," "belief that other members know and accept the use of technology," "horizontal communication," and "embodiment of opinion collection process" evolve to higher degrees along their property dimensions. Furthermore, users continuously create new spirits and structures while removing some of the previous ones. Thus, from a longitudinal view, faithful and unfaithful appropriation appear recursively, but faithful appropriation gradually takes over. In other words, the concepts of spirits and structures change over time during the adaptation process in order to align the task with other structures. These findings call for a revised or extended model of structural adaptation in the IS (Information Systems) literature, now that the vaguely described adaptation process of previous studies has been clarified through an in-depth qualitative study that identifies each phase with accuracy. In addition, based on these results, guidelines can be established to help determine which strategies are needed at the individual, group, and organizational levels for effective technology appropriation. In practice, managers can focus on changes in spirits and on elevating the structural dimensions to achieve effective technology use.

A Study on the Determinants of Patent Citation Relationships among Companies : MR-QAP Analysis (기업 간 특허인용 관계 결정요인에 관한 연구 : MR-QAP분석)

Park, Jun Hyung;Kwahk, Kee-Young;Han, Heejun;Kim, Yunjeong
- Journal of Intelligence and Information Systems / v.19 no.4 / pp.21-37 / 2013
With the advent of the knowledge-based society, more people are becoming interested in intellectual property. ICT companies leading the high-tech industry in particular are working hard to manage intellectual property systematically. Patent information represents the intellectual capital of a company, and quantitative analysis of the continuously accumulating patent information has now become possible. Patent information also supports analysis at various levels, from the patent level to the enterprise, industry, and country levels. Through patent information, we can identify the status of technology and analyze the impact on performance. We can also trace the flow of knowledge through network analysis, which makes it possible not only to identify changes in technology but also to predict the direction of future research. Two important analyses make use of patent citation information: citation indicator analysis, which uses citation frequency, and network analysis, which is based on citation relationships. This study analyzes whether company size affects patent citation relationships. We selected 74 companies registered in the S&P 500 that provide IT and communication services. To determine the patent citation relationships between the companies, patent citations in 2009 and 2010 were collected, and sociomatrices showing the citation relationships between the companies were created. In addition, the companies' total assets were collected as an index of company size. The distance between companies is defined as the absolute value of the difference between their total assets, and the simple (signed) difference is taken to describe the hierarchy between companies.
QAP correlation analysis and MR-QAP analysis were carried out using the distance and hierarchy between companies together with the sociomatrices of patent citations in 2009 and 2010. In the QAP correlation analysis, the 2009 and 2010 company patent citation networks show the highest correlation. In addition, the patent citation relationships between companies correlate positively with the distance between companies, which indicates that citation relationships increase when companies differ in size. Furthermore, a negative correlation is found between the patent citation relationships and the hierarchy between companies. This suggests that patents of higher-tier companies are relatively highly valued by, and influence, lower-tier companies. The MR-QAP analysis was carried out as follows. The sociomatrix generated from the 2010 patent citation relationships is used as the dependent variable. The 2009 company patent citation network and the distance and hierarchy networks between the companies are used as the independent variables. The MR-QAP analysis was performed to find the main factors influencing the patent citation relationships between the companies in 2010. The results show that all independent variables positively influenced the 2010 patent citation relationships between the companies. In particular, the 2009 patent citation relationships have the most significant impact on those of 2010, which means that patent citation relationships persist over time. Taken together, the QAP correlation and MR-QAP results show that the patent citation relationships between companies are affected by company size.
However, the most significant factor is the patent citation relationships established in the past. Maintaining patent citation relationships between companies matters because examining these relationships can be strategically important for sharing intellectual property, and they can also serve as an important aid in choosing partner companies to cooperate with.
https://doi.org/10.13088/jiis.2013.19.4.021
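
The distance and hierarchy definitions in this abstract, and the idea behind a QAP correlation, can be sketched as below. This is a hedged illustration, not the authors' procedure: the company names, asset figures, and citation counts are invented, the sign convention of the hierarchy matrix (row minus column) is an assumption, and only the QAP correlation step is shown, not the MR-QAP regression.

```python
import random

# Hypothetical total assets and citation counts, invented for illustration.
total_assets = {"A": 120.0, "B": 45.0, "C": 300.0, "D": 80.0}
names = list(total_assets)

# Distance: absolute difference in total assets (symmetric).
distance = [[abs(total_assets[i] - total_assets[j]) for j in names] for i in names]
# Hierarchy: signed difference in total assets (row minus column is an assumption).
hierarchy = [[total_assets[i] - total_assets[j] for j in names] for i in names]
# citations[i][j]: how often company i cites company j (hypothetical counts).
citations = [[0, 1, 5, 0], [0, 0, 3, 1], [2, 0, 0, 0], [1, 0, 4, 0]]


def off_diagonal(matrix, order):
    """Flatten the off-diagonal cells after relabeling nodes by `order`."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(n) if i != j]


def pearson(x, y):
    """Pearson correlation of two equal-length numeric lists."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def qap_correlation(a, b, permutations=1000, seed=0):
    """Observed correlation plus a permutation-based two-sided p-value.

    Rows and columns of `a` are permuted together, which keeps the network
    structure intact while breaking its alignment with `b`.
    """
    rng = random.Random(seed)
    identity = list(range(len(a)))
    b_cells = off_diagonal(b, identity)
    observed = pearson(off_diagonal(a, identity), b_cells)
    extreme = 0
    for _ in range(permutations):
        order = identity[:]
        rng.shuffle(order)
        if abs(pearson(off_diagonal(a, order), b_cells)) >= abs(observed):
            extreme += 1
    return observed, extreme / permutations


r, p = qap_correlation(citations, distance)
print("QAP correlation (citations vs. distance):", round(r, 3), "p =", p)
```

With only four hypothetical companies the p-value is not meaningful; the point is the mechanics of permuting node labels rather than individual cells.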
