• Title/Summary/Keyword: Processing Efficiency


Detection of Phantom Transaction using Data Mining: The Case of Agricultural Product Wholesale Market (데이터마이닝을 이용한 허위거래 예측 모형: 농산물 도매시장 사례)

  • Lee, Seon Ah;Chang, Namsik
    • Journal of Intelligence and Information Systems
    • /
    • v.21 no.1
    • /
    • pp.161-177
    • /
    • 2015
  • With the rapid evolution of technology, the size, number, and types of databases have increased concomitantly, and data mining approaches face many challenging applications. One such application is the discovery of fraud patterns from agricultural product wholesale transaction records. The agricultural product wholesale market in Korea is huge, and vast numbers of transactions are made every day. The demand for agricultural products continues to grow, and the use of electronic auction systems raises the operational efficiency of the wholesale market. The number of unusual transactions can also be assumed to increase in proportion to the trading volume, and an unusual transaction is often the first sign of fraud. However, it is very difficult to identify and detect these transactions and the corresponding fraud in the agricultural product wholesale market because the types of fraud have become more sophisticated than ever before. Fraud can be detected by verifying the overall transaction records manually, but this requires a significant amount of human resources and is ultimately impractical. Fraud can also be revealed by a victim's report or complaint, but there are usually no victims in agricultural product wholesale fraud because it is committed through collusion between an auction company and an intermediary wholesaler. Nevertheless, transaction records must be monitored continuously to prevent fraud, because fraud not only disturbs the fair trading order of the market but also rapidly erodes its credibility. Applying data mining in such an environment is very useful, since it can discover unknown fraud patterns or features from a large volume of transaction data. The objective of this research is to empirically investigate the factors necessary to detect fraudulent transactions in an agricultural product wholesale market by developing a data mining based fraud detection model. One major fraud is the phantom transaction, a colluding transaction in which the seller (auction company or forwarder) and the buyer (intermediary wholesaler) pretend to fulfill a transaction by recording false data in the online transaction processing system without actually selling products, and the seller receives money from the buyer. This leads to overstated sales performance and illegal money transfers, which reduce the credibility of the market. This paper reviews the environment of the wholesale market, such as the types of transactions, the roles of market participants, and the various types and characteristics of fraud, and introduces the whole process of developing the phantom transaction detection model. The process consists of the following four modules: (1) data cleaning and standardization, (2) statistical data analysis such as distribution and correlation analysis, (3) construction of a classification model using decision-tree induction, and (4) verification of the model in terms of hit ratio. We collected real data from six associations of agricultural producers in metropolitan markets. The final decision-tree model revealed that the monthly average trading price of an item offered by forwarders is a key variable in detecting phantom transactions. The verification procedure also confirmed the suitability of the results.
However, even though the performance of the results is satisfactory, sensitive issues remain in improving classification accuracy and rule conciseness. One such issue is the robustness of the data mining model. Data mining is highly data-oriented, so data mining models tend to be very sensitive to changes in the data or the situation. This non-robustness evidently requires continuous remodeling as the data or situation changes. We hope that this paper suggests a valuable guideline to organizations and companies that consider introducing or constructing a fraud detection model in the future.
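A minimal sketch of modules (3) and (4) above — decision-tree induction and hit-ratio verification — in Python with scikit-learn. The feature names and toy records are hypothetical stand-ins for the cleaned transaction data; only the use of a decision-tree classifier and a hit-ratio check follows the paper.

```python
# Hypothetical phantom-transaction records; "monthly_avg_price" echoes the
# paper's key variable (monthly average trading price offered by forwarders).
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_text

df = pd.DataFrame({
    "monthly_avg_price": [1200, 300, 4500, 80, 2500, 150, 3900, 60],
    "trade_volume":      [10,   200, 5,    400, 8,    350, 6,    500],
    "is_phantom":        [1,    0,   1,    0,   1,    0,   1,    0],
})

X, y = df[["monthly_avg_price", "trade_volume"]], df["is_phantom"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

# Module (3): decision-tree induction.
tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X_train, y_train)

# Module (4): verify the model in terms of hit ratio (held-out accuracy).
print("hit ratio:", tree.score(X_test, y_test))
print(export_text(tree, feature_names=list(X.columns)))
```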

Utility Evaluation of Supportive Devices for Interventional Lower Extremity Angiography (인터벤션 하지 혈관조영검사를 위한 보조기구의 유용성 평가)

  • Kong, Chang gi;Song, Jong Nam;Jeong, Moon Taek;Han, Jae Bok
    • Journal of the Korean Society of Radiology
    • /
    • v.13 no.4
    • /
    • pp.613-621
    • /
    • 2019
  • The purpose of this study is to evaluate the effectiveness of supportive devices for minimizing the patient's movement during lower extremity angiography and to verify phantom image quality by analyzing Mask, DSA, and Roadmap images in terms of SNR and CNR. Comparing the SNR and CNR of mask images obtained by the DSA technique for the phantom alone and the phantom placed on the supportive devices, there was no significant difference: about 0~0.06 for SNR and about 0~0.003 for CNR. For DSA images of the blood-vessel-model water phantom alone versus the water phantom placed on the devices, the differences were about 0.11~0.35 for SNR and 0.016~0.031 for CNR. Analyzing the SNR and CNR of the Roadmap technique for the water phantom on the auxiliary devices (hardboard paper, pomax, polycarbonate, acrylic) and the water phantom alone, there was no significant difference: 0.02~0.05 for SNR and 0.002~0.004 for CNR. In conclusion, there was no significant difference in image quality whether or not supportive devices made of hardboard paper, pomax, polycarbonate, or acrylic were used. Supportive devices that minimize the patient's movement may reduce the total amount of contrast medium, examination time, and radiation exposure, and eliminate risk factors during angiography. Supportive devices made of hardboard paper can be applied easily during angiography owing to their reasonable price and simple processing. When making and using supportive devices, operators should consider cost efficiency as well as the types of materials and their properties in accordance with the purpose and method of the examination.
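The abstract does not state the exact ROI definitions behind its SNR and CNR figures, so the sketch below assumes the common definitions SNR = mean(ROI)/std(background) and CNR = |mean(ROI) − mean(background)|/std(background); the pixel samples are synthetic.

```python
# Minimal SNR/CNR comparison between "phantom alone" and "phantom on device"
# images, using synthetic pixel samples in place of real DSA frames.
import numpy as np

def snr(roi: np.ndarray, background: np.ndarray) -> float:
    return float(roi.mean() / background.std())

def cnr(roi: np.ndarray, background: np.ndarray) -> float:
    return float(abs(roi.mean() - background.mean()) / background.std())

rng = np.random.default_rng(0)
roi_alone  = rng.normal(100, 5, 1000)   # phantom alone
roi_device = rng.normal(100, 5, 1000)   # phantom on the supportive device
bg         = rng.normal(20, 5, 1000)    # background region

print("SNR difference:", abs(snr(roi_alone, bg) - snr(roi_device, bg)))
print("CNR difference:", abs(cnr(roi_alone, bg) - cnr(roi_device, bg)))
```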

A Study on the Quality Improvement of Brain Perfusion SPECT Image (뇌혈류 단일광자방출단층촬영 영상 품질 향상에 대한 연구)

  • Kil, Sang-Hyeong;Lim, Yung-Hyun;Park, Gwang-Yeol;Cho, Seong-Mook
    • The Korean Journal of Nuclear Medicine Technology
    • /
    • v.23 no.2
    • /
    • pp.13-19
    • /
    • 2019
  • Purpose: Tc-99m HMPAO is a widely used radiopharmaceutical for brain perfusion SPECT. Tc-99m HMPAO is chemically unstable and liable to show deteriorated labeling efficiency due to a high incidence of secondary Tc-99m HMPAO complex, free pertechnetate, and reduced-hydrolyzed Tc-99m. In this study, we investigated whether sialogogue administration could reduce the impurities of Tc-99m HMPAO. Materials and Methods: In thirty subjects (20 male and 10 female, age range 19~89 years, mean age 60.7 ± 14.5 years), brain perfusion SPECT was performed consecutively at the basal and citric acid stimulation states after injection of 555 MBq of Tc-99m HMPAO. The uptake coefficient in the salivary glands was calculated using the Siemens processing program. Statistical comparison between before and after citric acid stimulation was performed with a paired t-test; a P value less than 0.05 was regarded as statistically significant. Results: Salivary gland uptake was 12900 ± 3101 counts in the basal state and 10677 ± 2742 counts in the citric acid stimulation state. Unnecessary impurities in the body were much decreased after citric acid administration (t = 10.78, P < 0.05). The image quality was much improved after administration of citric acid, and the regional cerebral perfusion was clearly demarcated from the background. Conclusion: Impurities are distributed throughout the body, particularly in the salivary glands and nasal mucosa, when Tc-99m HMPAO brain perfusion SPECT is performed. If these impurities are not removed, image quality may deteriorate, resulting in errors in visual evaluation. The use of sialogogues could help decrease unnecessary impurities in the body.
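A minimal sketch of the paired t-test comparison reported above (basal vs. citric acid stimulation states), using SciPy. The per-subject counts are simulated to roughly match the reported means and spreads; only the test itself follows the study.

```python
# Paired comparison of salivary-gland counts before and after sialogogue
# (citric acid) administration in 30 subjects; counts are simulated.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
basal = rng.normal(12900, 3101, 30)             # basal-state counts
stimulated = basal - rng.normal(2200, 700, 30)  # paired post-stimulation counts

t, p = stats.ttest_rel(basal, stimulated)
print(f"t = {t:.2f}, p = {p:.4f}")  # p < 0.05 indicates a significant decrease
```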

GF/PC Composite Filament Design & Optimization of 3D Printing Process and Structure for Manufacturing 3D Printed Electric Vehicle Battery Module Cover (전기자동차 배터리 모듈 커버의 3D 프린팅 제작을 위한 GF/PC 복합소재 필라멘트 설계와 3D 프린팅 공정 및 구조 최적화)

  • Yoo, Jeong-Wook;Lee, Jin-Woo;Kim, Seung-Hyun;Kim, Youn-Chul;Suhr, Jong-Hwan
    • Composites Research
    • /
    • v.34 no.4
    • /
    • pp.241-248
    • /
    • 2021
  • As the electric vehicle market grows, reducing vehicle weight to increase battery efficiency has become an important issue. This study therefore aims to replace the aluminum battery module cover that protects the battery module of electric vehicles with a lighter, high-strength, high-heat-resistance polymer composite. It also aims to respond to the fast-changing early electric vehicle market by adopting 3D printing technology, which is advantageous for small-volume production of multiple variants without restrictions on complex shapes. Based on composite material mechanics, the critical length of glass fibers in the short glass fiber (GF)/polycarbonate (PC) composite manufactured through an extruder was derived as 453.87 ㎛, and a side-feeding method was adopted to improve the residual fiber length to 365.87 ㎛ and to increase dispersibility. Thus, optimal properties of 135 MPa tensile strength and 7.8 GPa Young's modulus were achieved with a GF/PC composite containing 30 wt% GF. In addition, the filament extrusion conditions (temperature, extrusion speed) were optimized to meet the commercial filament specification of 1.75 mm diameter with a 0.05 mm standard deviation. With the manufactured filaments, the 3D printing process conditions (temperature, printing speed) were optimized by multi-objective optimization to minimize porosity and to maximize tensile strength and printing speed for higher productivity. Through this procedure, tensile strength and elastic modulus were improved by 11% and 56%, respectively, and post-processing further improved tensile strength and Young's modulus by 5% and 18%, respectively. Lastly, using finite element analysis (FEA), the structure of the battery module cover was optimized to meet the mechanical shock test criteria for electric vehicle battery module covers (ISO-12405); the optimized cover satisfied the mechanical shock test while being 37% lighter than the aluminum battery module cover. Based on this research, it is expected that 3D printing of polymer composite materials can be used in various fields in the future.
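The abstract derives a critical glass-fiber length of 453.87 ㎛ "based on composite material mechanics" without stating the formula; the standard Kelly-Tyson relation l_c = σ_f·d/(2τ) is a plausible basis, sketched below with illustrative fiber parameters that are assumptions, not the paper's values.

```python
# Kelly-Tyson critical fiber length: below l_c, fibers pull out instead of
# breaking and carry less load. All input values here are illustrative.
def critical_fiber_length(sigma_f_mpa: float, d_um: float, tau_mpa: float) -> float:
    """Critical fiber length in micrometers."""
    return sigma_f_mpa * d_um / (2.0 * tau_mpa)

# Illustrative glass-fiber values: strength ~2400 MPa, diameter ~13 um,
# fiber/matrix interfacial shear strength ~34 MPa.
print(f"l_c = {critical_fiber_length(2400, 13, 34.4):.2f} um")
```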

Analyzing Different Contexts for Energy Terms through Text Mining of Online Science News Articles (온라인 과학 기사 텍스트 마이닝을 통해 분석한 에너지 용어 사용의 맥락)

  • Oh, Chi Yeong;Kang, Nam-Hwa
    • Journal of Science Education
    • /
    • v.45 no.3
    • /
    • pp.292-303
    • /
    • 2021
  • This study identifies the terms frequently used together with 'energy' in online science news articles, and the topics of those reports, to find out how the term energy is used in everyday life and to draw implications for science curriculum and instruction about energy. A total of 2,171 online news articles in the science category, published by 11 major newspaper companies in Korea for one year from March 1, 2018, were selected using 'energy' as a search term. After natural language processing, a total of 51,224 sentences consisting of 507,901 words were compiled for analysis. Using the R program, term frequency analysis, semantic network analysis, and structural topic modeling were performed. The results show that the terms with exceptionally high frequencies were 'technology', 'research', and 'development', reflecting the character of news articles that report new findings. Terms used more than once per two articles were industry-related terms ('industry', 'product', 'system', 'production', 'market') and expected energy-related terms such as 'electricity' and 'environment'. Meanwhile, 'sun', 'heat', 'temperature', and 'power generation', which are frequently used in energy-related science classes, also appeared among the highest-frequency terms. A network analysis found two clusters: terms related to industry and technology, and terms related to basic science and research. From the analysis of terms paired with energy, it was also found that terms related to the use of energy, such as 'energy efficiency', 'energy saving', and 'energy consumption', were the most frequently used. Out of 16 topics found, four contexts of energy were drawn: 'high-tech industry', 'industry', 'basic science', and 'environment and health'. The results suggest that introducing the concept of energy degradation as a starting point for energy classes can be effective. They also show the need to introduce high-tech industries or the context of environment and health into energy learning.
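A minimal Python analogue of the term-frequency and co-occurrence steps the study performed in R; the two toy "articles" below stand in for the 2,171 preprocessed news texts.

```python
# Term frequency plus co-occurrence counts (the edge weights of a semantic
# network) over a toy corpus of pre-tokenized article texts.
from collections import Counter
from itertools import combinations

articles = [
    "energy technology research development industry",
    "energy efficiency energy saving electricity environment",
]
tokens = [a.split() for a in articles]

# Corpus-wide term frequency.
tf = Counter(t for doc in tokens for t in doc)
print(tf.most_common(5))

# Co-occurrence within an article: candidate edges of the semantic network.
edges = Counter()
for doc in tokens:
    for u, v in combinations(sorted(set(doc)), 2):
        edges[(u, v)] += 1
print(edges.most_common(3))
```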

A Study on the Development Direction of Medical Image Information System Using Big Data and AI (빅데이터와 AI를 활용한 의료영상 정보 시스템 발전 방향에 대한 연구)

  • Yoo, Se Jong;Han, Seong Soo;Jeon, Mi-Hyang;Han, Man Seok
    • KIPS Transactions on Computer and Communication Systems
    • /
    • v.11 no.9
    • /
    • pp.317-322
    • /
    • 2022
  • The rapid development of information technology is bringing about many changes in the medical environment. In particular, it is driving rapid change in medical image information systems that use big data and artificial intelligence (AI). The order communication system (OCS), together with the electronic medical record (EMR) and the picture archiving and communication system (PACS), has rapidly changed the medical environment from analog to digital. When combined with multiple solutions, PACS represents a new direction for advancement in security, interoperability, efficiency, and automation. Among these, the combination with AI using big data to improve image quality is actively progressing. In particular, AI PACS, a system that can assist in reading medical images using deep learning technology, was developed in cooperation between universities and industry and is being used in hospitals. In line with these rapid changes in medical image information systems, structural changes in the medical market and changes in medical policies to cope with them are also necessary. Medical image information is based on the Digital Imaging and Communications in Medicine (DICOM) standard and is divided, according to the generation method, into three-dimensional volume (tomographic) images and two-dimensional cross-sectional images. In addition, many medical institutions have recently been rushing to introduce next-generation integrated medical information systems by promoting smart hospital services. The next-generation integrated medical information system is built as a solution that integrates the EMR, electronic consent, big data, AI, precision medicine, and interworking with external institutions, and aims to support research. Korea's medical image information systems are at a world-class level thanks to advanced IT technology and government policies; in particular, the PACS solution is the only field in which Korea exports medical information technology to the world. In this study, along with an analysis of medical image information systems using big data, current trends were identified based on the historical background of the introduction of medical image information systems in Korea, and future development directions were predicted. In the future, based on DICOM big data accumulated over 20 years, we plan to conduct research that can increase the image reading rate by using AI and deep learning algorithms.
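A minimal sketch of reading a DICOM object with the pydicom library, illustrating the format on which PACS image exchange is based; the file path is hypothetical, and compressed transfer syntaxes need an additional pixel-data handler installed.

```python
# Read one DICOM file and inspect standard attributes and pixel data.
import pydicom

ds = pydicom.dcmread("sample_ct_slice.dcm")  # hypothetical file path
print(ds.Modality, ds.StudyDate)             # standard DICOM data elements
pixels = ds.pixel_array                      # NumPy array of the image
print(pixels.shape, pixels.dtype)
```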

Performance Optimization of Numerical Ocean Modeling on Cloud Systems (클라우드 시스템에서 해양수치모델 성능 최적화)

  • JUNG, KWANGWOOG;CHO, YANG-KI;TAK, YONG-JIN
    • The Sea:JOURNAL OF THE KOREAN SOCIETY OF OCEANOGRAPHY
    • /
    • v.27 no.3
    • /
    • pp.127-143
    • /
    • 2022
  • Recently, many attempts have been made to run numerical ocean models in cloud computing environments. A cloud computing environment can be an effective means of implementing numerical ocean models that require large-scale resources, or of quickly preparing a modeling environment for global or large-scale grids. Many commercial and private cloud computing systems provide technologies such as virtualization, high-performance CPUs and instances, Ethernet-based high-performance networking, and remote direct memory access for High Performance Computing (HPC). These new features facilitate ocean modeling experimentation on commercial cloud computing systems. Many scientists and engineers expect cloud computing to become mainstream in the near future. Analysis of the performance and features of commercial cloud services for numerical modeling is essential in order to select appropriate systems, as this can help to minimize execution time and the amount of resources utilized. Cache memory has a large effect on the processing structure of an ocean numerical model, which reads and writes data in multidimensional array structures, and network speed is important because of the communication patterns through which large amounts of data move. In this study, the performance of the Regional Ocean Modeling System (ROMS), the High Performance Linpack (HPL) benchmarking software package, and the STREAM memory benchmark were evaluated and compared on commercial cloud systems to provide information for migrating other ocean models to the cloud. Through analysis of actual performance data and configuration settings obtained from virtualization-based commercial clouds, we evaluated the efficiency of the computing resources for various model grid sizes in virtualization-based cloud systems. We found that cache hierarchy and capacity are crucial to the performance of ROMS, which uses a large amount of memory; memory latency is also important to performance. Increasing the number of cores to reduce the running time of numerical modeling is more effective with large grid sizes than with small ones. Our analysis results will be helpful as a reference for constructing the best cloud computing system to minimize the time and cost of numerical ocean modeling.
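A rough STREAM-style triad sketch in Python/NumPy for gauging effective memory bandwidth on a cloud instance, in the spirit of the STREAM benchmark the study ran; the array size and single-shot timing are illustrative and not the official benchmark.

```python
# STREAM "triad" kernel a = b + s*c, timed once to estimate memory bandwidth.
import time
import numpy as np

n = 20_000_000                 # three float64 arrays of ~160 MB each
b = np.random.rand(n)
c = np.random.rand(n)
s = 3.0

t0 = time.perf_counter()
a = b + s * c                  # triad kernel
dt = time.perf_counter() - t0

bytes_moved = 3 * n * 8        # read b, read c, write a (temporaries ignored)
print(f"triad bandwidth ~ {bytes_moved / dt / 1e9:.1f} GB/s")
```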

Preliminary Inspection Prediction Model to select the on-Site Inspected Foreign Food Facility using Multiple Correspondence Analysis (차원축소를 활용한 해외제조업체 대상 사전점검 예측 모형에 관한 연구)

  • Hae Jin Park;Jae Suk Choi;Sang Goo Cho
    • Journal of Intelligence and Information Systems
    • /
    • v.29 no.1
    • /
    • pp.121-142
    • /
    • 2023
  • As the number and weight of imported foods steadily increase, safety management of imported food to prevent food safety accidents is becoming more important. The Ministry of Food and Drug Safety conducts on-site inspections of foreign food facilities before customs clearance as well as import inspections at the customs clearance stage. However, a data-based safety management plan for imported food is needed because of limited time, cost, and resources. In this study, we tried to increase the efficiency of on-site inspections by building a machine learning prediction model that pre-selects the companies expected to fail before the on-site inspection. Basic information on 303,272 foreign food facilities and processing businesses from the Integrated Food Safety Information Network and 1,689 on-site inspection records collected from 2019 to April 2022 were gathered. After preprocessing the foreign food facility data, only the records subject to on-site inspection were extracted using the foreign food facility code, yielding a dataset of 1,689 records and 103 variables. Of the 103 variables, those whose Theil's U index was 0 were removed, and after dimensionality reduction with Multiple Correspondence Analysis, 49 characteristic variables were finally derived. We built eight different models, performed hyperparameter tuning through 5-fold cross-validation, and then evaluated the performance of the resulting models. The purpose of selecting companies for on-site inspection is to maximize recall, the probability of judging nonconforming companies as nonconforming. Among the machine learning algorithms applied, the Random Forest model, with the highest Recall_macro, AUROC, Average PR, F1-score, and Balanced Accuracy, was evaluated as the best model. Finally, we applied Kernel SHAP (SHapley Additive exPlanations) to present the reasons individual facilities were selected as nonconforming, and discuss applicability to the on-site inspection facility selection system. Based on the results of this study, it is expected that establishing an imported food management system on a data-based scientific risk management model will contribute to the efficient operation of limited resources such as manpower and budget.
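A minimal sketch of the model-evaluation step described above: a Random Forest scored by 5-fold cross-validation on macro recall, the study's primary metric. The synthetic dataset merely mimics the shape of the real one (1,689 records, 49 features, imbalanced classes).

```python
# Random Forest with 5-fold cross-validation on recall_macro.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for the 1,689 x 49 inspection dataset.
X, y = make_classification(n_samples=1689, n_features=49,
                           weights=[0.8, 0.2], random_state=0)

rf = RandomForestClassifier(n_estimators=300, random_state=0)
scores = cross_val_score(rf, X, y, cv=5, scoring="recall_macro")
print(f"recall_macro: {scores.mean():.3f} +/- {scores.std():.3f}")
# Kernel SHAP (shap.KernelExplainer) could then explain individual selections.
```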

Methods for Integration of Documents using Hierarchical Structure based on the Formal Concept Analysis (FCA 기반 계층적 구조를 이용한 문서 통합 기법)

  • Kim, Tae-Hwan;Jeon, Ho-Cheol;Choi, Joong-Min
    • Journal of Intelligence and Information Systems
    • /
    • v.17 no.3
    • /
    • pp.63-77
    • /
    • 2011
  • The World Wide Web is a very large distributed digital information space. From its origins in 1991, the web has grown to encompass diverse information resources such as personal home pages, online digital libraries, and virtual museums. Some estimates suggest that the web currently includes over 500 billion pages in the deep web. The ability to search and retrieve information from the web efficiently and effectively is an enabling technology for realizing its full potential. With powerful workstations and parallel processing technology, efficiency is not a bottleneck; in fact, some existing search tools sift through gigabyte-size precompiled web indexes in a fraction of a second. But retrieval effectiveness is a different matter. Current search tools retrieve too many documents, of which only a small fraction are relevant to the user query. Furthermore, the most relevant documents do not necessarily appear at the top of the query output order. Also, current search tools cannot retrieve, from a gigantic volume of documents, the documents related to a retrieved document. The most important problem for many current search systems is to increase the quality of search: to provide related documents and to keep the number of unrelated documents in the results as low as possible. For this problem, CiteSeer proposed Autonomous Citation Indexing (ACI) of articles on the World Wide Web. A "citation index" indexes the links between articles that researchers make when they cite other articles. Citation indexes are very useful for a number of purposes, including literature search and analysis of the academic literature. In this scheme, references contained in academic articles are used to give credit to previous work in the literature and provide a link between the "citing" and "cited" articles. A citation index indexes the citations that an article makes, linking the articles with the cited works. Citation indexes were originally designed mainly for information retrieval. The citation links allow navigating the literature in unique ways: papers can be located independently of the language, and of the words in the title, keywords, or document. A citation index allows navigation backward in time (the list of cited articles) and forward in time (which subsequent articles cite the current article). However, CiteSeer cannot index links between articles that researchers do not make, because it only indexes the links created when researchers cite other articles; for the same reason, CiteSeer is also not easy to scale. All these problems motivate the design of a more effective search system. This paper presents a method that extracts a subject and predicate from each sentence in a document. The document is converted into a tabular form in which each extracted predicate is checked against its possible subjects and objects. We build a hierarchical graph of a document using the table and then integrate the graphs of documents. From the graph of the entire document collection, the area of each document is calculated relative to the integrated documents, and relations among the documents are marked by comparing these areas. The paper also proposes a method for the structural integration of documents that retrieves documents from the graph, making it easier for users to find information. We compared the performance of the proposed approaches with the Lucene search engine using ranking formulas.
As a result, the F-measure is about 60%, which is about 15% better.
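A highly simplified sketch of the core idea above: extract per-sentence (subject, predicate, object) triples and merge them into one graph so that related documents share edges. The naive S-V-O splitter and toy documents are illustrative only; the paper's FCA-based hierarchy and area comparison are not reproduced here.

```python
# Merge naive per-sentence triples from several documents into one graph;
# edges contributed by multiple documents indicate related documents.
import networkx as nx

def naive_triples(sentence: str):
    words = sentence.strip(".").split()
    if len(words) >= 3:
        yield (words[0], words[1], words[2])  # naive subject-verb-object guess

docs = {
    "doc1": "CiteSeer indexes citations. Citations link articles.",
    "doc2": "Citations link articles. Indexes improve retrieval.",
}

g = nx.MultiDiGraph()
for name, text in docs.items():
    for sent in text.split(". "):
        for s, p, o in naive_triples(sent):
            g.add_edge(s.lower(), o.lower(), predicate=p.lower(), doc=name)

print(list(g.edges(data=True)))
```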