• Title/Summary/Keyword: Large-Scale Volume Data

Search Result 94

An Efficient Approach for Single-Pass Mining of Web Traversal Sequences (단일 스캔을 통한 웹 방문 패턴의 탐색 기법)

  • Kim, Nak-Min;Jeong, Byeong-Soo;Ahmed, Chowdhury Farhan
    • Journal of KIISE:Databases / v.37 no.5 / pp.221-227 / 2010
  • Web access sequence mining discovers the web pages that users access frequently. Utility-based web access sequence mining handles non-binary occurrences of web pages and extracts more useful knowledge from web logs. However, the existing utility-based approach considers web access sequences from the very beginning of the web logs, so it is not suitable for mining data streams, where the volume of data is huge and unbounded. Nor can it adaptively detect recent changes of knowledge in data streams. The existing approach has other limitations as well: it considers only forward references of web access sequences, it suffers from the level-wise candidate generation-and-test methodology, and it needs several database scans. In this paper, we propose a new approach for high utility web access sequence mining over data streams with a sliding window method. Our approach can not only handle large-scale data but also efficiently discover recently generated information from data streams. Moreover, it overcomes the other limitations of the existing algorithm over data streams. Extensive performance analyses show that our approach is very efficient and outperforms the existing algorithm.
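
The sliding-window idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' algorithm: it tracks utility totals for single pages rather than access sequences, and the class name `SlidingWindowUtility`, the session format, and the use of integer utilities (for example, seconds spent on a page) are all assumptions made for illustration.

```python
from collections import Counter, deque


class SlidingWindowUtility:
    """Per-page utility totals over the most recent `window_size` sessions.

    Hypothetical helper: a simplified stand-in for stream mining with a
    sliding window, not the method proposed in the paper.
    """

    def __init__(self, window_size):
        self.window_size = window_size
        self.sessions = deque()   # sessions currently inside the window
        self.totals = Counter()   # page -> summed utility inside the window

    def add(self, session):
        """Add one session, given as a list of (page, integer_utility) pairs."""
        self.sessions.append(session)
        for page, utility in session:
            self.totals[page] += utility
        # Evict the oldest session once the window is full, so that only
        # recent data contributes to the totals (single pass, no rescans).
        if len(self.sessions) > self.window_size:
            for page, utility in self.sessions.popleft():
                self.totals[page] -= utility
                if self.totals[page] == 0:
                    del self.totals[page]

    def high_utility_pages(self, min_utility):
        """Pages whose in-window utility reaches the threshold, sorted by name."""
        return sorted(p for p, u in self.totals.items() if u >= min_utility)
```

Each session is read once, and old sessions are subtracted as they leave the window, which is what lets the totals follow recent changes in the stream.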

A study on environmental dependence with AGN activity with the SDSS galaxies

  • Kim, Minbae;Choi, Yun-Young;Kim, Sungsoo S.
    • The Bulletin of The Korean Astronomical Society / v.38 no.2 / pp.52.2-52.2 / 2013
  • We explore the relative importance of the small-scale and large-scale environments in triggering nuclear activity in local galaxies, using a volume-limited sample with $M_r < -19.5$ and $0.02 < z < 0.0685$ selected from the Sloan Digital Sky Survey Data Release 7. The active galactic nuclei (AGN) host sample is composed of Type II AGNs identified by flux ratios of narrow emission lines with S/N > 6. The central velocity dispersion of the sample galaxies is limited to the narrow range $130 < \sigma < 200\ \mathrm{km\,s^{-1}}$, corresponding to $7.4 < \log(M_{BH}/M_{\odot}) < 8.1$, in order to fix the mass of the supermassive black hole at the center of the host galaxy. In this study, we find that the AGN fraction ($f_{AGN}$) of late-type galaxies is larger than that of early-type galaxies, and that for a target galaxy with a late-type nearest neighbor, $f_{AGN}$ starts to increase as the target galaxy approaches the virial radius of the nearest neighbor (a scale of a few hundred kpc). The latter result may support the idea that hydrodynamic interaction with the nearest neighbor, as well as tidal interaction and mergers, plays an important role in triggering the nuclear activity of a galaxy. We also find that early-type cluster galaxies show a decline in AGN activity compared with those in lower-density regions, whereas late-type galaxies show the opposite dependence.


Reynolds and Froude number effect on the flow past an interface-piercing circular cylinder

  • Koo, Bonguk;Yang, Jianming;Yeon, Seong Mo;Stern, Frederick
    • International Journal of Naval Architecture and Ocean Engineering / v.6 no.3 / pp.529-561 / 2014
  • The two-phase turbulent flow past an interface-piercing circular cylinder is studied using a high-fidelity orthogonal curvilinear grid solver with a Lagrangian dynamic subgrid-scale model for large-eddy simulation and a coupled level set and volume of fluid method for air-water interface tracking. The simulations cover the sub-critical, critical, and post-critical Reynolds number regimes and the sub-critical and super-critical Froude number regimes in order to investigate the effect of both dimensionless parameters on the flow. Significant changes in flow features near the air-water interface were observed as the Reynolds number was increased from the sub-critical to the critical regime. The interface considerably delays the separation point near the interface for all Reynolds numbers. The separation region at intermediate depths is remarkably reduced in the critical Reynolds number regime. The deep flow resembles the single-phase turbulent flow past a circular cylinder, but includes the effects of the free surface and the limited span length for sub-critical Reynolds numbers. At different Froude numbers, the air-water interface exhibits significantly different structures, including breaking bow waves with splashes and bubbles at high Froude numbers. Instantaneous and mean flow features such as interface structures, vortex shedding, Reynolds stresses, and vorticity transport are also analyzed. The results are compared with reference experimental data available in the literature. The deep flow is also compared with the single-phase turbulent flow past a circular cylinder in similar ranges of Reynolds numbers. The limitations of the current simulations and of the available experimental data are discussed, along with future research.

A Case Study on the Aggregate Planning of Multi-product Small-batch Production Facilities: Focusing on System Dynamics Simulation Modeling (다품종 소량생산 설비의 총괄생산계획에 관한 사례 연구: 시스템다이내믹스 시뮬레이션 모델링을 중심으로)

  • Lee, Seungdoe;Kim, Sang Won
    • Journal of Korean Society for Quality Management / v.50 no.1 / pp.153-167 / 2022
  • Purpose: The purpose of this study is to guide operation managers who plan the daily production of a large mass-processing facility that serves multiple customers with multi-product, small-batch items, by providing the practical best production quantity and the inventory that may be built up. Methods: Close observation of a subcontract paint-shop operator captured the daily decision process, which was reflected in a subcontractor-specific mathematical model and a system dynamics simulation model. Multiple simulations were run to find the practical best production quantity and the maximum allowable inventory level that did not undermine the profit from practical best daily production. Actual data and a few constant values were obtained from the firm under study. Results: While the inventory holding cost for customer-owned material harms the total profit of the subcontractor, the running cost of the processing facility hinders production in small batches. These two effects balance the maximum possible production and result in a practical best daily production, which can be found through simulation runs with actual data. The maximum level of stocked inventory is deduced from the practical best daily production. Conclusion: To build a volume large enough for economy-of-scale production, operators should deal with multi-product, small-batch items from multiple customers. When the planned schedule of the timing and amount of material in-flow tends not to be reliable, operators can find it practical to execute level production across the planning horizon instead of adjusting to day-to-day in-flow fluctuations.

An Efficient Medical Image Compression Considering Brain CT Images with Bilateral Symmetry (뇌 CT 영상의 대칭성을 고려한 관심영역 중심의 효율적인 의료영상 압축)

  • Jung, Jae-Sung;Lee, Chang-Hun
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.12 no.5 / pp.39-54 / 2012
  • The Picture Archiving and Communication System (PACS) has recently become established as a key infrastructure, along with the overall improvement in the standards of medical informatization and the trend toward digital hospitals. The variety and volume of digital medical imagery are also increasing rapidly. This trend highlights the need for medical image compression in storing large-scale medical image data. Digital Imaging and Communications in Medicine (DICOM), the de facto standard in digital medical imagery, specifies Run Length Encoding (RLE), a typical lossless data compression technique, for medical image compression. However, RLE is not an appropriate approach for medical image data that exhibit the bilateral symmetry of the human body. We suggest two preprocessing algorithms: one detects the region of interest, the minimum bounding rectangle, in a medical image to enhance data compression efficiency, and the other re-codes image pixel values to reduce data size according to the symmetry characteristics within the region of interest. We also present an improved image compression technique for brain CT imagery with high bilateral symmetry. In the experiments, the suggested approach shows a higher data compression ratio than the RLE compression in the DICOM standard, which does not detect the region of interest in images.
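
To make the RLE limitation mentioned above concrete, here is a minimal sketch of generic run-length encoding. It is a textbook (value, count) scheme, not the byte-level RLE format defined in the DICOM standard and not the authors' preprocessing; the function names `rle_encode` and `rle_decode` are chosen only for illustration.

```python
def rle_encode(pixels):
    """Collapse consecutive equal values into (value, run_length) pairs."""
    runs = []
    for value in pixels:
        if runs and runs[-1][0] == value:
            runs[-1][1] += 1          # extend the current run
        else:
            runs.append([value, 1])   # start a new run
    return [(value, count) for value, count in runs]


def rle_decode(runs):
    """Expand (value, run_length) pairs back into the original sequence."""
    pixels = []
    for value, count in runs:
        pixels.extend([value] * count)
    return pixels


# A flat background row compresses well ...
background_row = [0, 0, 0, 0, 0, 0]
# ... but a mirror-symmetric row gains almost nothing, because RLE only
# sees adjacent repeats and has no notion of left/right symmetry.
symmetric_row = [1, 2, 3, 3, 2, 1]
```

In this sketch `background_row` encodes to a single run, while `symmetric_row` needs five runs for six pixels, which is the kind of redundancy a symmetry-aware re-coding step could target.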

A Study on Development of Classification Indicators in Transportation Sector Energy Conservation DB (에너지절약 DB 구축을 위한 수송부문 분류지표 설정)

  • Lim, Ki Choo
    • Journal of Energy Engineering / v.25 no.3 / pp.149-156 / 2016
  • This paper surveyed and analyzed overseas cases of DB development in order to set the scope of the DB to be developed for analyzing energy-saving policies in the domestic transportation sector. On that basis, a broad classification system was established, and detailed indicators suited to each broad category were set based on an analysis of domestic and overseas cases, in order to determine the scope of DB development required to analyze domestic energy-saving policies in the transportation sector. Accordingly, six top-level classification items were determined: energy consumption, energy basic unit, greenhouse gas emissions, economic indicators, transportation volume and transportation records, and basic automobile data. The broad categories and sub-items, determined by surveying expert opinions, are proposed as DB classification indicators.

Maximum Simplex Volume based Landmark Selection for Isomap (최대 부피 Simplex 기반의 Isomap을 위한 랜드마크 추출)

  • Chi, Junhwa
    • Korean Journal of Remote Sensing / v.29 no.5 / pp.509-516 / 2013
  • Since traditional linear feature extraction methods are unable to handle the nonlinear characteristics often exhibited in hyperspectral imagery, nonlinear feature extraction, also known as manifold learning, is receiving increased attention in the hyperspectral remote sensing community as well as in other communities. Isomap, one of the most widely used manifold learning methods, generally gives promising results in classification and spectral unmixing tasks, but its high computational overhead is problematic, especially for large-scale remotely sensed data. Using a small subset of distinguishing points, referred to as landmarks, has been proposed as a solution. This study proposes a new robust and controllable landmark selection method based on the maximum volume of the simplex spanned by the landmarks. Experiments are conducted to compare classification accuracies, with standard deviations, across sampling methods, numbers of landmarks, and processing times. The proposed method could achieve both classification accuracy and computational efficiency.
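
The abstract above does not say how the maximum-volume simplex is found, so the following is only a sketch of one generic greedy heuristic, not the study's method. It assumes points are plain coordinate sequences, seeds with the point farthest from the centroid (an arbitrary choice), and at each step adds the point farthest from the affine span of the landmarks chosen so far. The function name `greedy_simplex_landmarks` is hypothetical.

```python
import math


def _sub(a, b):
    return [x - y for x, y in zip(a, b)]


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def greedy_simplex_landmarks(points, k):
    """Greedily pick up to k landmark indices that grow the simplex volume.

    Returns (indices, volume), where volume is that of the simplex spanned
    by the chosen points (length for 2 points, area for 3, and so on).
    """
    n, dim = len(points), len(points[0])
    centroid = [sum(p[d] for p in points) / n for d in range(dim)]
    # Seed with the point farthest from the centroid (assumed seeding rule).
    offsets = [_sub(p, centroid) for p in points]
    first = max(range(n), key=lambda i: _dot(offsets[i], offsets[i]))
    chosen = [first]
    # residual[i]: part of (points[i] - seed) orthogonal to the chosen edges.
    residual = [_sub(p, points[first]) for p in points]
    volume = 1.0
    while len(chosen) < k:
        nxt = max(range(n), key=lambda i: _dot(residual[i], residual[i]))
        height = math.sqrt(_dot(residual[nxt], residual[nxt]))
        if height < 1e-12:
            break  # all remaining points lie in the current affine span
        chosen.append(nxt)
        # V_m = V_(m-1) * height / m for a simplex with m edges from the seed.
        volume *= height / (len(chosen) - 1)
        unit = [x / height for x in residual[nxt]]
        # Gram-Schmidt step: remove the new direction from every residual.
        residual = [_sub(r, [_dot(r, unit) * u for u in unit]) for r in residual]
    return chosen, volume
```

Maximizing the height over the current span at each step maximizes the volume of the enlarged simplex given the landmarks already fixed; the greedy result is not guaranteed to be the global maximum-volume subset.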

Platelet Indices May be Useful in Discrimination of Benign and Malign Endometrial Lesions, and Early and Advanced Stage Endometrial Cancer

  • Kurtoglu, Emel;Kokcu, Arif;Celik, Handan;Sari, Seher;Tosun, Migraci
    • Asian Pacific Journal of Cancer Prevention / v.16 no.13 / pp.5397-5400 / 2015
  • Background: The aim of this study was to investigate the predictive value of white blood cells (WBC), the neutrophil to lymphocyte ratio (NLR), platelet indices including mean platelet volume (MPV), platelet distribution width (PDW), and plateletcrit (PCT), and the platelet to lymphocyte ratio (PLR) in discriminating between benign and malignant endometrial lesions, and between early and advanced stage endometrial adenocarcinomas. Materials and Methods: Data for 105 patients undergoing total abdominal or vaginal hysterectomy for benign uterine diseases and 114 patients surgically staged for endometrial adenocarcinoma at Ondokuz Mayis University, Department of Gynecology and Obstetrics, between 2008 and 2014, were collected. Parameters were preoperative and postoperative complete blood counts in the week prior to surgery with differentials including WBC, platelet count, platelet indices (MPV, PCT, PDW), NLR and PLR. Pathologic evaluations of both benign and malignant endometrial lesions, grade of endometrial adenocarcinoma, tumor stage, and presence of lymphovascular space invasion (LVI) were retrospectively analyzed. Results: Regarding definitive factors in discriminating patients with endometrial cancer from those with benign diseases, MPV was significantly increased in the malignant group, whereas the PDW value was significantly decreased compared with the benign group. In differentiating the benign and malignant groups, the best cut-off values were 7.54 for MPV and 37.8 for PDW: malignant cases were found to increase above 7.54 for MPV and below 37.8 for PDW. When definitive factors in discriminating early stage endometrial cancer from advanced stage disease and LVI in the malignant group were evaluated by ROC analysis, no significant relation was detected between blood parameters and the stage or the LVI of the disease. Conclusions: MPV and PDW may have predictive value in the discrimination of benign and malignant endometrial diseases. Nevertheless, since there have been few reports on this topic, further large-scale prospective studies are necessary.

Profiling Approach for the Choice between Speculation and Postponement Strategy in Supply Chain Management (공급사슬관리의 예측전략과 지연전략 선택을 위한 프로파일링 접근법)

  • Kang, Sung-Wook;Kim, Gyu-Bae
    • Journal of Distribution Science / v.12 no.4 / pp.47-54 / 2014
  • Purpose - The postponement strategy, which delays the form, place, and production of products as late as possible, has been widely considered a competitive supply chain management scheme in an era of mass customization and modular manufacturing. An interesting business phenomenon is that not all manufacturing/logistics firms choose the postponement strategy. Given that postponement is a counter-measure to speculation, which has some advantages under certain environments, the current imprudent inclination toward the postponement strategy may cause firms to lose the potential of the speculation strategy, an alternative strategy in supply chain management. Building on the logistics and manufacturing literature, this study examines the characteristics of two contrasting strategies, postponement and speculation, and the major factors favoring each strategy. Research design, data, and methodology - We apply the profiling approach to two business cases, HP printer and LG mobile phone. The profiling approach is a method of choosing a particular strategy aligned with environmental factors. While various approaches have been used to check the fit between a business strategy and environmental factors, the literature on manufacturing strategy and logistics has commonly adopted the profiling approach. The major factors used as profiling variables are derived from the literature. Two samples, HP printer and LG mobile phone, are selected because they represent major characteristics appropriate for each strategy. The profiling is based on data from semi-organized interviews with managers. Results - The profiling approach shows that the postponement strategy is a suitable one for HP printers. Most factors, such as product life cycle, large production volume, low-price, product value, and monetary density, support delaying end products until as late as possible. Despite some exceptions, such as delivery time and economy of scale, our analysis indicates that the overall profile of HP printer is favorable for the postponement strategy. On the other hand, LG mobile phone may adopt the speculation strategy. Although it has large production volume and low delivery frequency, most characteristics support the speculation strategy for this product. An interesting finding is that, despite the common perception that advanced technology products such as mobile telephones favor the postponement strategy, profiling proposes the speculation strategy for this product. Conclusions - Our analysis shows that speculation is not the universal option for supply chain management, and that, when choosing a specific strategy, one should consider many factors simultaneously. A major implication of our work is to emphasize the role of environmental factors such as supply chain variables in choosing an inventory strategy, and the importance of fit rather than solely strategic orientation. A theoretical contribution is to demonstrate the benefit of the simultaneous consideration of business variables in choosing specific strategies. For practitioners, our work leads us to consider the existence and the potential of speculation as a counter-measure to postponement. In addition, the comprehensive framework in this research may be readily used in examining a practical strategy.

The Contribution of University-business Interaction to Innovation: Bibliometric Analysis (대학과 기업 간 상호협력에 따른 혁신창출 -계량서지학적 분석-)

  • Beck, Yeong Ki
    • Journal of the Economic Geographical Society of Korea / v.15 no.4 / pp.493-514 / 2012
  • Research collaboration between industry and universities is high on many policy agendas nowadays, especially with regard to science-based technological innovation. Nonetheless, there have been few attempts to examine large-scale systematic and quantitative data on the nature and extent of university-industry collaborations. The objective of this paper is to explore the patterns and trends of research collaborations between universities and companies for scientific knowledge production in seven science-based technologies. This paper uses co-authored articles published in major scientific journals worldwide as an indicator of collaborative scientific research between universities, companies and governmental research institutes. Tens of thousands of co-authored papers from the northeast region of the US over the years 2006 to 2010 were analyzed for collaboration patterns and their spatial characteristics. This paper finds that there were increases both in the proportion of multiple-authored papers, particularly those with five or more authors, and in the volume of international collaborations. An examination of the types of collaboration between different institutions shows that research collaboration between universities and companies in this region accounts for a relatively high share at the national level. This suggests that the national or even international scale seems more appropriate for innovation policies.
