• Title/Summary/Keyword: Large-scale experiments

Fast Search with Data-Oriented Multi-Index Hashing for Multimedia Data

  • Ma, Yanping;Zou, Hailin;Xie, Hongtao;Su, Qingtang
    • KSII Transactions on Internet and Information Systems (TIIS) / v.9 no.7 / pp.2599-2613 / 2015
  • Multi-index hashing (MIH) is the state-of-the-art method for indexing binary codes: it divides long codes into substrings and builds multiple hash tables. However, MIH assumes that the dataset codes are uniformly distributed, and it loses efficiency on non-uniformly distributed codes. In addition, many results share the same Hamming distance to a query, which makes the distance measure ambiguous. In this paper, we propose a data-oriented multi-index hashing method (DOMIH). We first compute the covariance matrix of the bits and learn an adaptive projection vector for each binary substring. Instead of using substrings as direct indices into hash tables, we project them with the corresponding projection vectors to generate new indices. With adaptive projection, the indices in each hash table are nearly uniformly distributed. Then, using the covariance matrix, we propose a ranking method for the binary codes. By assigning different bit-level weights to different bits, the returned binary codes are ranked at a finer-grained level. Experiments on reference large-scale datasets show that, compared with MIH, DOMIH improves time performance by 36.9%-87.4% and search accuracy by 22.2%. To demonstrate the potential of DOMIH, we further use near-duplicate image retrieval as an example application to show the good performance of our method.
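
To make the substring-indexing idea concrete, here is a minimal sketch of plain MIH (the baseline, not DOMIH's adaptive projection or bit-level ranking). It assumes integer-encoded codes whose length is divisible by the number of substrings, and it only handles a search radius `r` smaller than the number of tables `m`, where the pigeonhole principle guarantees that any code within distance `r` matches the query exactly in at least one substring. All function names are hypothetical.

```python
from collections import defaultdict


def split_code(code, bits, m):
    """Split a `bits`-bit integer code into m equal-width substrings."""
    width = bits // m  # assumes bits is divisible by m
    mask = (1 << width) - 1
    return [(code >> (i * width)) & mask for i in range(m)]


def build_tables(codes, bits, m):
    """Build one hash table per substring position: substring -> code indices."""
    tables = [defaultdict(list) for _ in range(m)]
    for idx, code in enumerate(codes):
        for t, sub in enumerate(split_code(code, bits, m)):
            tables[t][sub].append(idx)
    return tables


def search(query, codes, tables, bits, m, r):
    """Return sorted (distance, index) pairs for codes within Hamming radius r."""
    if r >= m:
        # Larger radii need probing of neighbouring substrings; omitted here.
        raise ValueError("this sketch only handles r < m")
    candidates = set()
    for t, sub in enumerate(split_code(query, bits, m)):
        candidates.update(tables[t].get(sub, []))
    # Verify each candidate with the full Hamming distance.
    hits = [(bin(codes[i] ^ query).count("1"), i) for i in candidates]
    return sorted((d, i) for d, i in hits if d <= r)
```

For example, `search(0, codes, build_tables(codes, 8, 4), 8, 4, 2)` only inspects codes that share at least one 2-bit substring with the query. When codes are non-uniformly distributed, some buckets become very large, which is the inefficiency the abstract attributes to MIH.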

Implementation of AMGA GUI Client Toolkit : AMGA Manager (AMGA GUI Client 툴킷 구현 : AMGA Manager)

  • Huh, Tae-Sang;Hwang, Soon-Wook;Park, Guen-Chul
    • The Journal of the Korea Contents Association / v.12 no.3 / pp.421-433 / 2012
  • The AMGA service, one of the EMI gLite middleware components, is widely used by scientific and technological researchers as a metadata repository for analyzing distributed large-scale experimental data, and its use is extending to general industries that need a metadata catalogue. However, AMGA, which is based on Unix and the Grid UI, lacks the general-purpose user interfaces offered by commercial database systems, which makes it difficult to use and to disseminate despite its superior functionality. In this paper, we develop an AMGA GUI toolkit, designed with an object-oriented modeling language (UML), to make working with AMGA more convenient. AMGA is currently the main component for many user communities such as Belle II, WISDOM, and MDM, and we expect that this development will not only lower the barrier to entry for AMGA beginners but also expand the use of the AMGA service to more communities.

Scheduling based on Cache Utilization in a Cache Server Cluster for Wireless Internet (무선 인터넷을 위한 캐시 서버 클러스터 환경에서 캐시 이용률 기반의 스케줄링)

  • Kwak, Hu-Keun;Chung, Kyu-Sik
    • Journal of KIISE:Computer Systems and Theory / v.34 no.9 / pp.435-444 / 2007
  • Caching web pages is an important part of web infrastructures. The effects of a caching service are even more pronounced for wireless infrastructures because of their limited bandwidth. Medium- to large-scale infrastructures deploy a cluster of servers to solve the scalability and hot spot problems inherent in caching. In this paper, we present a scheduling scheme based on cache utilization in a wireless internet proxy server cluster environment. The proposed method uses cache utilization to distribute client requests evenly across a cluster of cache servers and to solve the hot spot problem. We have implemented our approach and performed various experiments using publicly available traces. Experimental results on a cluster of 16 cache servers demonstrate that the proposed hashing method gives a 45% to 114% performance improvement over other widely used methods while addressing the hot spot problem.
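
The abstract does not spell out the scheduling rule, so the following is only a minimal sketch of one plausible reading of utilization-based dispatch. It assumes that utilization is the fraction of a server's cache slots in use, that a URL already cached somewhere is routed back to that server, and that a new URL goes to the least-utilized server. The `CacheServer` class and `dispatch` function are hypothetical, and eviction is not modeled.

```python
class CacheServer:
    """Toy cache server that tracks which URLs it holds (no eviction)."""

    def __init__(self, name, capacity):
        self.name = name
        self.capacity = capacity  # number of cache slots (assumed unit)
        self.cached = set()

    @property
    def utilization(self):
        """Fraction of cache slots currently in use."""
        return len(self.cached) / self.capacity


def dispatch(url, servers):
    """Route a request: reuse the server that caches url, else pick the least utilized."""
    for server in servers:
        if url in server.cached:
            return server  # cache hit: keep the request where the object lives
    target = min(servers, key=lambda s: s.utilization)
    target.cached.add(url)  # cache miss: the chosen server fetches and stores it
    return target
```

Compared with hashing a URL to a fixed server, a rule of this kind avoids piling new objects onto a server whose cache is already full, which is one way to relieve a hot spot.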

Transcoding Load Estimation Method for Load Balance on Distributed Transcoding Environments (분산 트랜스코딩 환경에서 부하 균형을 위한 트랜스코딩 부하 예측 기법)

  • Seo, Dong-Mahn;Heo, Nan-Sok;Kim, Jong-Woo;Jung, In-Bum
    • Journal of KIISE:Computer Systems and Theory / v.35 no.9_10 / pp.466-475 / 2008
  • Owing to improved wireless communication technologies, it is possible to provide multimedia streaming services to PDAs and mobile phones in addition to desktop PCs. Since mobile client devices have low computing power and, because of the wireless network, low network bandwidth, transcoding technology is needed to adapt media to the characteristics of these devices. Transcoding servers transcode the source media to the target media within corresponding grades and provide QoS in real time. In particular, an effective load balancing policy for transcoding servers is essential to support QoS for large numbers of mobile users. In this paper, a transcoding load estimation algorithm is proposed for load balancing in distributed transcoding environments. The proposed algorithm estimates transcoding time from transcoding server information, movie information, and the target transcoding bit-rate. The estimated transcoding time is validated through experiments.

Plagiarism Detection Using Dependency Graph Analysis Specialized for JavaScript (자바스크립트에 특화된 프로그램 종속성 그래프를 이용한 표절 탐지)

  • Kim, Shin-Hyong;Han, Tai-Sook
    • Journal of KIISE:Software and Applications / v.37 no.5 / pp.394-402 / 2010
  • JavaScript is one of the most popular languages for developing web sites and web applications. Since applications written in JavaScript are sent to clients as the original source code, they are easily exposed to plagiarists. Therefore, a method to detect plagiarized JavaScript programs is necessary. Conventional program dependency graph (PDG) based approaches are not suitable for analyzing JavaScript programs because they do not reflect the dynamic features of JavaScript. They also generate false positives in some cases and are inefficient over a large-scale search space. We devise a JavaScript-specific PDG (JS PDG) that captures the dynamic features of JavaScript and propose a JavaScript plagiarism detection method for precise and fast detection. We evaluate the proposed method experimentally. Our experiments show that our approach can detect the false positives generated by conventional PDGs and can prune the plagiarism search space.

Enhancing Classification Performance of Temporal Keyword Data by Using Moving Average-based Dynamic Time Warping Method (이동 평균 기반 동적 시간 와핑 기법을 이용한 시계열 키워드 데이터의 분류 성능 개선 방안)

  • Jeong, Do-Heon
    • Journal of the Korean Society for Information Management / v.36 no.4 / pp.83-105 / 2019
  • This study aims to suggest an effective method for automatically classifying keywords with similar patterns by calculating the pattern similarity of temporal data. For this, large-scale news data on the Web were collected, and time series data composed of 120 time segments were built. To build a training data set for the performance test of the proposed model, 440 representative keywords were manually classified into 8 types of trend. This study introduces the Dynamic Time Warping (DTW) method, which has been commonly used in time series analytics, and proposes an application model, MA-DTW, based on the Moving Average (MA) method, which explains the tendency of a trend curve well. In automatic classification with a k-Nearest Neighbor (kNN) algorithm, Euclidean Distance (ED) and DTW achieved maximum micro-averaged F1 scores of 48.2% and 66.6%, respectively, whereas the proposed model achieved a best micro-averaged F1 score of 74.3%. Across all of the comprehensive experiments, the suggested model outperformed ED and DTW.
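
As a rough illustration of the idea of smoothing before warping, the sketch below applies a simple moving average to each series and then computes a textbook DTW distance on the smoothed curves. The window size, the absolute-difference local cost, and the 1-nearest-neighbor classifier (a special case of kNN) are assumptions made for brevity; the abstract does not specify how MA-DTW combines the two steps, and all function names are hypothetical.

```python
def moving_average(series, window):
    """Simple moving average; returns len(series) - window + 1 values."""
    return [sum(series[i:i + window]) / window
            for i in range(len(series) - window + 1)]


def dtw_distance(a, b):
    """Classic dynamic-programming DTW with absolute difference as local cost."""
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed warping steps.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def ma_dtw(a, b, window=3):
    """DTW distance between moving-average-smoothed series (assumed combination)."""
    return dtw_distance(moving_average(a, window), moving_average(b, window))


def classify_1nn(query, labeled, window=3):
    """Label of the closest training series; `labeled` is a list of (series, label)."""
    return min(labeled, key=lambda item: ma_dtw(query, item[0], window))[1]
```

Unlike Euclidean distance, DTW tolerates series of different lengths and shifted peaks, and the smoothing step suppresses short-lived spikes so that the overall trend dominates the comparison.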

Leaching behavior of rhenium and molybdenum from molybdenite roasting dust in NaOH solutions (휘수연석(輝水鉛石)의 배소(焙燒) 중 발생한 분경(粉慶)으로부터 NaOH에 의한 Rhenium과 Molybdenum의 침출(浸出))

  • Kim, Young-Uk;Kang, Jin-Gu;Sohn, Jeong-Soo;Cho, Bong-Gyu;Shin, Shun-Myung
    • Resources Recycling / v.18 no.5 / pp.37-43 / 2009
  • The demand for rhenium has increased considerably in recent years owing to large-scale industrial consumption, and its price has risen owing to limited supply and availability. Dust from the roasting of molybdenite was used to investigate the leaching behavior of rhenium and molybdenum. Leaching experiments were carried out by varying parameters such as reaction time, NaOH concentration, and leaching temperature. The optimum leaching conditions were found to be $4\;mol{\cdot}L^{-1}$ NaOH, 2 hours leaching time, $100\;g{\cdot}L^{-1}$ solid/liquid ratio, $80^{\circ}C$ temperature, and 250 rpm. Under these conditions, the leaching percentages of rhenium and molybdenum were 86.1% and 88.6%, respectively.

Optimization of energy efficiency through comparative analysis of factors affecting the operation with energy recovery devices on SWRO desalination process (역삼투막 해수담수화 공정에서 에너지 회수장치의 운영인자 비교분석을 통한 에너지 효율 최적화 연구)

  • Kim, Pooreum;Kim, Hyungsoo;Park, Junyoung;Kim, Taewoo;Kim, Minjin;Park, Kitae;Kim, Jihoon
    • Journal of Korean Society of Water and Wastewater / v.32 no.1 / pp.1-10 / 2018
  • Recently, interest in developing alternative water resources has been increasing rapidly due to environmental pollution and the depletion of water resources. In particular, seawater desalination has attracted the most attention as an alternative water resource. Because seawater desalination consumes a large amount of energy due to its high operating pressure, much research has been conducted on improving energy efficiency, for example with energy recovery devices (ERDs). Accordingly, this study compares the energy efficiency of the RO process for isobaric-type ERDs applied in scientific control pilot plant processes, each at a scale of $100\;m^3/day$, based on actual RO product water. The results confirmed that efficiency, mixing rate, and permeate conductivity differed depending on the size of the apparatus even though the ERDs operate on the same principle. This is believed to be caused by the difference in the cross-sectional area of the contact portion for pressure transfer inside the ERD. Therefore, further study is needed to confirm the optimum conditions applicable to the actual process, considering the correlation with other factors as well as the factors obtained from the previous experiments.

Parallel Design and Implementation of Shot Boundary Detection Algorithm (샷 경계 탐지 알고리즘의 병렬 설계와 구현)

  • Lee, Joon-Goo;Kim, SeungHyun;You, Byoung-Moon;Hwang, DooSung
    • Journal of the Institute of Electronics and Information Engineers / v.51 no.2 / pp.76-84 / 2014
  • As the number of high-density videos increases, parallel processing approaches are necessary to process large-scale video data. When a video processing method requires thousands of simple operations, GPU-based parallel processing is preferred to CPU-based parallel processing as a way of reducing the time and space complexities of a given computation problem. This paper studies the parallel design and implementation of a shot-boundary detection algorithm. The proposed algorithm uses pixel brightness comparisons and global histogram data among the blocks of frames, and the computation of these data is characterized by high parallelism in the related operations. To maximize the parallel execution of these operations, the computations of pixel brightness and histograms are designed in parallel and implemented on an NVIDIA GPU. The GPU-based shot detection method is tested with 10 videos from the video collection of the National Archives of Korea. In experiments, the detection rate is similar to that of the CPU-based algorithm, but the computation is about 10 times faster.
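
For readers unfamiliar with histogram-based shot detection, the sketch below shows a sequential, CPU-only version of the general technique: compute a brightness histogram per frame and flag a boundary where consecutive histograms differ by more than a threshold. It is not the paper's GPU implementation and omits its block-wise pixel comparisons. Frames are assumed to be 2D lists of 8-bit brightness values of equal size, and the bin count, threshold, and function names are hypothetical.

```python
def histogram(frame, bins=16):
    """Global brightness histogram of a frame given as rows of 0-255 values."""
    counts = [0] * bins
    for row in frame:
        for pixel in row:
            counts[pixel * bins // 256] += 1
    return counts


def histogram_difference(h1, h2):
    """Normalized L1 distance in [0, 1] for frames with equal pixel counts."""
    total = sum(h1)
    return sum(abs(x - y) for x, y in zip(h1, h2)) / (2 * total)


def detect_shot_boundaries(frames, threshold=0.5, bins=16):
    """Indices i where frame i starts a new shot relative to frame i - 1."""
    # Each histogram depends on one frame only, so this step is the part
    # that parallelizes naturally across frames (or across pixels).
    hists = [histogram(f, bins) for f in frames]
    return [i for i in range(1, len(hists))
            if histogram_difference(hists[i - 1], hists[i]) > threshold]
```

The per-pixel binning and the per-pair comparison are independent of one another, which is the kind of fine-grained parallelism the abstract says maps well onto a GPU.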

Deflection of Ultra-high Energy Cosmic Rays by the Galactic Magnetic Field

  • Kim, Jihyun;Kim, Hang Bae;Ryu, Dongsu
    • The Bulletin of The Korean Astronomical Society / v.39 no.2 / pp.73.1-73.1 / 2014
  • We investigate the influence of the galactic magnetic field (GMF) on the arrival directions (AD) of ultra-high energy cosmic rays (UHECRs) by searching for their correlation with the large-scale structure (LSS) of the universe. The deflection of UHECRs from their sources by the GMF is reflected in a source model by introducing a Gaussian smearing angle as a free parameter. Assuming that the deflections by the GMF depend mainly on the galactic latitude, b, we divide the sky into regions by b and analyze the correlation between the AD of UHECRs and the LSS of the universe in each region while varying the smearing angle. Using maximum likelihood estimation, we find that the deflection depends strongly on the galactic latitude. Specifically, the best-fit smearing angles are $9^{\circ}$ and $84^{\circ}$ in the high galactic latitude (HGL) region, $-90^{\circ} < b < -60^{\circ}$, and in the low galactic latitude (LGL) region, $-30^{\circ} < b < 30^{\circ}$, respectively. The GMF becomes stronger from the HGL to the LGL. From these results, we can estimate the strength of the GMF in each region. In the LGL, for example, if we assume that UHECRs are protons, we obtain a GMF on the order of $100\;{\mu}G$, which is much stronger than the value expected from conventional GMF models. However, if the primaries are heavy nuclei, which is consistent with the observational results of mass composition analysis, the GMF strength is on the order of a few ${\mu}G$. More data from future experiments will make it possible to study the GMF between the sources of UHECRs and Earth more accurately.