• Title/Summary/Keyword: Retrieval System


Odysseus/Parallel-OOSQL: A Parallel Search Engine using the Odysseus DBMS Tightly-Coupled with IR Capability (오디세우스/Parallel-OOSQL: 오디세우스 정보검색용 밀결합 DBMS를 사용한 병렬 정보 검색 엔진)

  • Ryu, Jae-Joon;Whang, Kyu-Young;Lee, Jae-Gil;Kwon, Hyuk-Yoon;Kim, Yi-Reun;Heo, Jun-Suk;Lee, Ki-Hoon
    • Journal of KIISE: Computing Practices and Letters / v.14 no.4 / pp.412-429 / 2008
  • As the volume of electronic documents increases rapidly with the growth of the Internet, a parallel search engine capable of handling a large number of documents is becoming ever more important. To implement a parallel search engine, we need to partition the inverted index and search through the partitioned index in parallel. There are two methods of partitioning the inverted index: 1) document-identifier based partitioning and 2) keyword-identifier based partitioning. However, each method alone has drawbacks. The former makes inserting documents convenient and has high throughput, but has poor performance for top-k query processing. The latter has good performance for top-k query processing, but makes inserting documents inconvenient and has low throughput. In this paper, we propose a hybrid partitioning method to compensate for the drawbacks of each method. We design and implement a parallel search engine that supports the hybrid partitioning method using the Odysseus DBMS tightly coupled with information retrieval capability. We first introduce the architecture of the parallel search engine, Odysseus/Parallel-OOSQL. We then show the effectiveness of the proposed system through systematic experiments. The experimental results show that the query processing time of the document-identifier based partitioning method is approximately inversely proportional to the number of blocks in the partition of the inverted index. The results also show that the keyword-identifier based partitioning method has good performance in top-k query processing. The proposed parallel search engine can be optimized for performance by customizing the methods of partitioning the inverted index according to the application environment. The Odysseus/Parallel-OOSQL parallel search engine is capable of indexing, storing, and querying 100 million web documents per node or tens of billions of web documents for the entire system.
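The two partitioning schemes named in this abstract can be illustrated with a toy in-memory index. This is only a minimal sketch, not the Odysseus/Parallel-OOSQL implementation: the function names, the modulo-based shard assignment, and the whitespace tokenizer are all assumptions made for illustration.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each keyword to the sorted list of ids of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return {word: sorted(ids) for word, ids in index.items()}


def keyword_shard(word, n_parts):
    """Deterministic (hypothetical) assignment of a keyword to a partition."""
    return sum(word.encode("utf-8")) % n_parts


def partition_by_document(docs, n_parts):
    """Document-identifier based: each partition indexes its own subset of documents."""
    shards = [dict() for _ in range(n_parts)]
    for doc_id, text in docs.items():
        shards[doc_id % n_parts][doc_id] = text
    return [build_inverted_index(shard) for shard in shards]


def partition_by_keyword(docs, n_parts):
    """Keyword-identifier based: each partition holds whole posting lists for some keywords."""
    parts = [dict() for _ in range(n_parts)]
    for word, postings in build_inverted_index(docs).items():
        parts[keyword_shard(word, n_parts)][word] = postings
    return parts


def search_document_partitioned(parts, word):
    """Every partition must be queried and the partial posting lists merged."""
    hits = []
    for part in parts:
        hits.extend(part.get(word, []))
    return sorted(hits)


def search_keyword_partitioned(parts, word):
    """Only the single partition that owns the keyword is queried."""
    return parts[keyword_shard(word, len(parts))].get(word, [])
```

In this sketch, inserting a document under document-based partitioning touches one partition, whereas under keyword-based partitioning it may touch every partition that owns one of its keywords. That asymmetry is one way to read the insertion trade-off the abstract describes.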

Text Mining-Based Emerging Trend Analysis for the Aviation Industry (항공산업 미래유망분야 선정을 위한 텍스트 마이닝 기반의 트렌드 분석)

  • Kim, Hyun-Jung;Jo, Nam-Ok;Shin, Kyung-Shik
    • Journal of Intelligence and Information Systems / v.21 no.1 / pp.65-82 / 2015
  • Recently, there has been a surge of interest in finding core issues and analyzing emerging trends for the future. This reflects efforts to devise national strategies and policies based on the selection of promising areas that can create economic and social added value. Existing studies, including those dedicated to the discovery of promising future fields, have mostly depended on qualitative research methods such as literature review and expert judgement. Deriving results from large amounts of information under this approach is both costly and time consuming. Efforts have been made in various areas of academic research to make up for the weaknesses of the conventional qualitative approach to selecting key promising areas through the discovery of future core issues and emerging trend analysis. There needs to be a paradigm shift toward using qualitative research methods along with quantitative research methods such as text mining in a mutually complementary manner. The change is meant to ensure objective and practical emerging trend analysis results based on large amounts of data. However, even such studies have had shortcomings related to their dependence on simple keywords for analysis, which makes it difficult to derive meaning from data. Besides, no study has been carried out so far to develop core issues and analyze emerging trends in special domains like the aviation industry. This shift is being witnessed in recent studies in various areas such as the steel industry, the information and communications technology industry, and the construction industry in architectural engineering. This study focused on retrieving aviation-related core issues and emerging trends from research papers pertaining to aviation through text mining, which is one of the big data analysis techniques.
In this manner, the promising future areas for the air transport industry are selected based on objective data from aviation-related research papers. In order to compensate for the difficulties in grasping the meaning of single words in emerging trend analysis at keyword levels, this study will adopt topic analysis, which is a technique used to find out general themes latent in text document sets. The analysis will lead to the extraction of topics, which represent keyword sets, thereby discovering core issues and conducting emerging trend analysis. Based on the issues, it identified aviation-related research trends and selected the promising areas for the future. Research on core issue retrieval and emerging trend analysis for the aviation industry based on big data analysis is still in its incipient stages. So, the analysis targets for this study are restricted to data from aviation-related research papers. However, it has significance in that it prepared a quantitative analysis model for continuously monitoring the derived core issues and presenting directions regarding the areas with good prospects for the future. In the future, the scope is slated to expand to cover relevant domestic or international news articles and bidding information as well, thus increasing the reliability of analysis results. On the basis of the topic analysis results, core issues for the aviation industry will be determined. Then, emerging trend analysis for the issues will be implemented by year in order to identify the changes they undergo in time series. Through these procedures, this study aims to prepare a system for developing key promising areas for the future aviation industry as well as for ensuring rapid response. Additionally, the promising areas selected based on the aforementioned results and the analysis of pertinent policy research reports will be compared with the areas in which the actual government investments are made. 
The results from this comparative analysis are expected to make useful reference materials for future policy development and budget establishment.

Some Instances of Manchurian Naturalization and Settlement in Choson Dynasty (향화인의 조선 정착 사례 연구 - 여진 향화인을 중심으로 -)

  • Won, Chang-Ae
    • (The)Study of the Eastern Classic / no.37 / pp.33-61 / 2009
  • In the late Koryo period, until the 14th century, there were at least two groups of Manchurians who were granted citizenship. One group had long lived as original inhabitants in the coastal area of the northeastern part of the Korean peninsula, and they numbered over one thousand households. The other came down from inland, east of the Yoha River, to settle in the area of the Tuman River, and they numbered at least around one hundred and sixty households, including such tribes as Al-tha-ry, Ol-lyang-hap, Ol-jok-hap and others. From the early days of the Choson dynasty, they were treated courteously through economic, political, and social government policies. They were given, for instance, a house, land, household furniture, and clothes. They were allowed to marry native Koreans in order to settle down. They were taught how to cultivate their lands. It was also possible for them to be given an official position or to be allowed to take the National Civil Official Examination. The fact that they could take such an examination, in particular, means they were treated fairly and equally, because they had the same privilege as common people to improve their social positions through the formal system. Two typical families are scrutinized in this paper, the Chong-hae Lee family and the Chon-ju Ju family. Both settled successfully, though from different backgrounds. The former descended from a headman, Lee Jee-ran, who controlled a tribe of over five hundred households. He was given three titles: meritorious retainer at the founding of the Choson dynasty, meritorious retainer at the retrieval of armies, and enshrined retainer. His son, Lee Wha-yong, was also made a vassal of merit and kept a close tie with the king's family through marriage. On the foundation laid by their ancestors, the grandsons' families, those of Lee Hyo-yang and Lee Hyo-gang, each took solid root in the aristocratic Yang-ban class.
The former became a high officer family, generation after generation, while the latter became a civil official family through the Civil Official Examinations. They lived mainly around Seoul and Kyong-gi Province, and some lived in their place of origin, Ham-kyong Province. Chu-man, the first ancestor, was made a meritorious retainer at the founding of the dynasty, and Chu-in was also given a high officer position by the government. They kept living in their place of origin, Ham-heung, Ham-kyong Province, and became an outstanding local family there. They began to pass the Civil Official Examinations. From the 17th century on, 17 members passed the Civil Official Examinations and 40 passed the lower civil examinations. The government positions they attained were usually remonstrance posts, which at that time were barred in particular to people from the northwest. The Choson dynasty was widely open to Manchurians through the system of envoy, convoy, and naturalization. The intention was to build up an enclosure policy through friendly diplomatic relations with them against any possible invasion from outside. This is one reason why they were so fully supported in various ways.

Improvement and Validation of Convective Rainfall Rate Retrieved from Visible and Infrared Image Bands of the COMS Satellite (COMS 위성의 가시 및 적외 영상 채널로부터 복원된 대류운의 강우강도 향상과 검증)

  • Moon, Yun Seob;Lee, Kangyeol
    • Journal of the Korean earth science society / v.37 no.7 / pp.420-433 / 2016
  • The purpose of this study is to improve the calibration matrices of 2-D and 3-D convective rainfall rates (CRR) using the brightness temperature of the infrared $10.8{\mu}m$ channel (IR), the difference of brightness temperatures between the infrared $10.8{\mu}m$ and vapor $6.7{\mu}m$ channels (IR-WV), and the normalized reflectance of the visible channel (VIS) from the COMS satellite, together with the rainfall rate from the weather radar, for a period of 75 rainy days from April 22, 2011 to October 22, 2011 in Korea. In particular, the rainfall rate data of the weather radar are used to validate the new 2-D and 3-D CRR calibration matrices suitable for the Korean peninsula for a period of 24 rainy days in 2011. The 2-D and 3-D calibration matrices provide the basic and maximum CRR values ($mm\;h^{-1}$) by multiplying the rain probability matrix, which is calculated from the number of rainy and non-rainy pixels with the associated 2-D (IR, IR-WV) and 3-D (IR, IR-WV, VIS) matrices, by the mean and maximum rainfall rate matrices, respectively, which are calculated by dividing the accumulated rainfall rate by the number of rainy pixels and by the product of the maximum rain rate for the calibration period by the number of rain occurrences. Finally, new 2-D and 3-D CRR calibration matrices are obtained experimentally from the regression analysis of both the basic and maximum rainfall rate matrices. As a result, the area of rainfall rate above 10 mm/h is enlarged in the new matrices, and CRR appears in lower class ranges of the matrices between IR brightness temperature and IR-WV brightness temperature difference than in the existing ones. Accuracy and categorical statistics are computed for the CRR events that occurred during the given period.
The mean error (ME), mean absolute error (MAE), and root mean square error (RMSE) of the new 2-D and 3-D CRR calibrations were smaller than those of the existing ones, while the false alarm ratio decreased, the probability of detection increased slightly, and the critical success index scores improved. To take into account the strong rainfall rates in weather events such as thunderstorms and typhoons, a moisture correction factor is applied. This factor is defined as the product of the total precipitable water and the relative humidity (PW RH), a mean value between the surface and the 500 hPa level, obtained from a numerical model or the COMS retrieval data. In this study, when the IR cloud top brightness temperature is lower than 210 K and the relative humidity is greater than 40%, the moisture correction factor is empirically scaled from 1.0 to 2.0 based on PW RH values. Consequently, when this factor is applied to the new 2-D and 3-D CRR calibrations, the ME, MAE, and RMSE are smaller than those of the new calibrations without it.
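The accuracy and categorical statistics named in this abstract have widely used textbook definitions, sketched below for paired estimated and observed rainfall rates. The abstract does not state the exact definitions or the rain/no-rain threshold used in the study, so the `threshold` parameter and the function names here are assumptions.

```python
import math


def continuous_errors(estimated, observed):
    """Return (ME, MAE, RMSE) for paired rainfall rates in mm/h."""
    diffs = [e - o for e, o in zip(estimated, observed)]
    n = len(diffs)
    me = sum(diffs) / n
    mae = sum(abs(d) for d in diffs) / n
    rmse = math.sqrt(sum(d * d for d in diffs) / n)
    return me, mae, rmse


def categorical_scores(estimated, observed, threshold=0.0):
    """Return (POD, FAR, CSI) from a rain / no-rain contingency table.

    A pixel counts as rainy when its rate exceeds `threshold` (assumed value).
    """
    hits = misses = false_alarms = 0
    for e, o in zip(estimated, observed):
        est_rain, obs_rain = e > threshold, o > threshold
        if est_rain and obs_rain:
            hits += 1
        elif obs_rain:
            misses += 1
        elif est_rain:
            false_alarms += 1

    def ratio(num, den):
        # Undefined scores (empty denominator) are reported as NaN.
        return num / den if den else float("nan")

    pod = ratio(hits, hits + misses)                  # probability of detection
    far = ratio(false_alarms, hits + false_alarms)    # false alarm ratio
    csi = ratio(hits, hits + misses + false_alarms)   # critical success index
    return pod, far, csi
```

A lower FAR together with a higher POD and CSI corresponds to the direction of improvement the abstract reports for the new calibrations.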

A Performance Comparison of the Mobile Agent Model with the Client-Server Model under Security Conditions (보안 서비스를 고려한 이동 에이전트 모델과 클라이언트-서버 모델의 성능 비교)

  • Han, Seung-Wan;Jeong, Ki-Moon;Park, Seung-Bae;Lim, Hyeong-Seok
    • Journal of KIISE: Information Networking / v.29 no.3 / pp.286-298 / 2002
  • The Remote Procedure Call (RPC) has traditionally been used for Inter Process Communication (IPC) among processes in distributed computing environments. As distributed applications have become more and more complicated, the Mobile Agent paradigm for IPC has emerged. Because there are several paradigms for IPC, research evaluating and comparing the performance of each paradigm has appeared recently. However, the performance models used in previous research did not reflect real distributed computing environments correctly, because they did not consider the evaluation elements for providing security services. Since a real distributed environment is open, it is very vulnerable to a variety of attacks. In order to execute applications securely in a distributed computing environment, security services that protect applications and information against these attacks must be considered. In this paper, we evaluate and compare the performance of the Remote Procedure Call with that of the Mobile Agent among IPC paradigms. We examine the security services needed to execute applications securely, and propose new performance models that consider those services. Using Petri nets, we design performance models that describe an information retrieval system accessing N database services. We compare the performance of the two paradigms by assigning numerical values to the parameters and measuring the execution time of each. The comparison of the two performance models with security services for secure communication shows that the execution time of the Remote Procedure Call model increases sharply because of the many communications between hosts that use a strong cryptographic mechanism, and that the execution time of the Mobile Agent model increases gradually because the Mobile Agent paradigm can reduce the quantity of communication between hosts.

CO2 Exchange in Kwangneung Broadleaf Deciduous Forest in a Hilly Terrain in the Summer of 2002 (2002년 여름철 경사진 광릉 낙엽 활엽수림에서의 이산화탄소 교환)

  • Choi, Tae-jin;Kim, Joon;Lim, Jong-Hwan
    • Korean Journal of Agricultural and Forest Meteorology / v.5 no.2 / pp.70-80 / 2003
  • We report the first direct measurement of $CO_2$ flux over the Kwangneung broadleaf deciduous forest, one of the tower flux sites in the KoFlux network. An eddy covariance system was installed on a 30 m tower along with other meteorological instruments from June to August in 2002. Although the study site was non-ideal (with valley-like terrain), turbulence characteristics from limited wind directions (i.e., 90$\pm$45$^{\circ}$) were not significantly different from those obtained at simple, homogeneous terrains with an ideal fetch. Despite a very low rate of data retrieval, preliminary results from our analysis are encouraging and worthy of further investigation. Ignoring the role of advection terms, the averaged net ecosystem exchange (NEE) of $CO_2$ ranged from -1.2 to 0.7 mg m$^{-2}$ s$^{-1}$ from June to August in 2002. The effect of weak turbulence on nocturnal NEE was examined in terms of friction velocity (u*) along with the estimation of the storage term. The effect of low u* on NEE was obvious, with a threshold value of about 0.2 m s$^{-1}$. The contribution of the storage term to nocturnal NEE was insignificant, suggesting that the $CO_2$ stored within the forest canopy at night was probably removed by the drainage flow along the hilly terrain. This could also be an artifact of uncertainty in calculations of the storage term based on a single-level concentration. The hyperbolic light response curves explained >80% of the variation in the observed NEE, indicating that $CO_2$ exchange at the site was notably light-dependent. Such a relationship can be used effectively to fill the gaps in NEE data through the season. Finally, a simple scaling analysis based on a linear flow model suggested that advection might play a significant role in NEE evaluation at this site.
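The friction-velocity screening and the light-response gap filling mentioned in this abstract can be sketched as follows. The abstract only says the curves are hyperbolic; the rectangular-hyperbola form below is a common choice and is an assumption here, as are the parameter names (`alpha`, `p_max`, `r_eco`), the record layout, and any parameter values a reader might plug in. Only the 0.2 m/s threshold comes from the abstract.

```python
def filter_low_turbulence(records, ustar_threshold=0.2):
    """Keep nocturnal flux records whose friction velocity u* (m/s) reaches the threshold."""
    return [r for r in records if r["ustar"] >= ustar_threshold]


def light_response(par, alpha, p_max, r_eco):
    """Rectangular hyperbola: NEE = -(alpha * PAR * Pmax) / (alpha * PAR + Pmax) + R_eco.

    Negative NEE denotes uptake by the ecosystem; all parameters are hypothetical
    fit parameters, not values from the study.
    """
    return -(alpha * par * p_max) / (alpha * par + p_max) + r_eco


def fill_gaps(nee_series, par_series, alpha, p_max, r_eco):
    """Replace missing NEE values (None) with the modelled light response."""
    return [
        nee if nee is not None else light_response(par, alpha, p_max, r_eco)
        for nee, par in zip(nee_series, par_series)
    ]
```

In practice the three parameters would be fitted to the measured daytime NEE before being used to fill gaps; the fitting step is omitted from this sketch.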

Video Scene Detection using Shot Clustering based on Visual Features (시각적 특징을 기반한 샷 클러스터링을 통한 비디오 씬 탐지 기법)

  • Shin, Dong-Wook;Kim, Tae-Hwan;Choi, Joong-Min
    • Journal of Intelligence and Information Systems / v.18 no.2 / pp.47-60 / 2012
  • Video data is unstructured and has a complex structure. As efficient management and retrieval of video data become more important, studies on video parsing based on the visual features contained in video content are being conducted to reconstruct video data into a meaningful structure. Early studies on video parsing focused on splitting video data into shots, but detecting a shot boundary defined by a physical boundary does not consider the semantic association of video data. Recently, studies that use clustering methods to organize semantically associated video shots into video scenes defined by semantic boundaries have been actively pursued. Previous studies on video scene detection try to detect scenes with clustering algorithms based on a similarity measure between video shots that depends mainly on color features. However, the correct identification of a video shot or scene and the detection of gradual transitions such as dissolve, fade and wipe are difficult, because the color features of video data contain noise and change abruptly when an unexpected object intervenes. In this paper, to solve these problems, we propose the Scene Detector by using Color histogram, corner Edge and Object color histogram (SDCEO), which detects video scenes by clustering similar shots belonging to the same event based on visual features including the color histogram, the corner edge and the object color histogram. The SDCEO is noteworthy in that it uses the edge feature together with the color feature and, as a result, effectively detects gradual transitions as well as abrupt transitions. The SDCEO consists of the Shot Bound Identifier and the Video Scene Detector. The Shot Bound Identifier comprises the Color Histogram Analysis step and the Corner Edge Analysis step.
In the Color Histogram Analysis step, SDCEO uses the color histogram feature to organize shot boundaries. The color histogram, which records the percentage of each quantized color among all pixels in a frame, is chosen for its good performance, as also reported in other work on content-based image and video analysis. To organize shot boundaries, SDCEO joins associated sequential frames into shot boundaries by measuring the similarity of the color histogram between frames. In the Corner Edge Analysis step, SDCEO identifies the final shot boundaries by using the corner edge feature. SDCEO detects associated shot boundaries by comparing the corner edge feature between the last frame of the previous shot boundary and the first frame of the next shot boundary. In the Key-frame Extraction step, SDCEO compares each frame with all other frames in the same shot boundary, measures the similarity using the histogram Euclidean distance, and then selects as the key-frame the frame most similar to all frames in that shot boundary. The Video Scene Detector clusters associated shots belonging to the same event by using the hierarchical agglomerative clustering method based on visual features including the color histogram and the object color histogram. After detecting video scenes, SDCEO organizes the final video scenes by repeated clustering until the similarity distance between shot boundaries is less than the threshold h. In this paper, we construct a prototype of SDCEO and carry out experiments with manually constructed baseline data. The experimental results are satisfactory: the precision of shot boundary detection is 93.3% and the precision of video scene detection is 83.3%.
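The Color Histogram Analysis and Key-frame Extraction steps described above can be approximated with a toy sketch. It is not the SDCEO implementation: frames are modelled as plain lists of RGB tuples, the corner edge and object color histogram features are left out, and the bin count and distance threshold are assumed values.

```python
import math


def color_histogram(frame, bins_per_channel=4):
    """Fraction of pixels falling in each quantized RGB bin; frame is a list of (r, g, b)."""
    step = 256 // bins_per_channel
    hist = [0.0] * (bins_per_channel ** 3)
    for r, g, b in frame:
        idx = ((r // step) * bins_per_channel + (g // step)) * bins_per_channel + (b // step)
        hist[idx] += 1.0
    return [count / len(frame) for count in hist]


def euclidean(h1, h2):
    """Euclidean distance between two histograms of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(h1, h2)))


def split_into_shots(frames, threshold=0.5):
    """Join consecutive frames into a shot until the histogram distance exceeds the threshold."""
    if not frames:
        return []
    hists = [color_histogram(f) for f in frames]
    shots = [[0]]
    for i in range(1, len(frames)):
        if euclidean(hists[i - 1], hists[i]) > threshold:
            shots.append([i])       # large color change: start a new shot
        else:
            shots[-1].append(i)     # similar colors: extend the current shot
    return shots


def key_frame(frames, shot):
    """Index of the frame with the smallest total histogram distance to the shot's frames."""
    hists = {i: color_histogram(frames[i]) for i in shot}
    return min(shot, key=lambda i: sum(euclidean(hists[i], hists[j]) for j in shot))
```

A purely color-based split like this one reacts to abrupt cuts but tends to miss gradual transitions, which is the weakness the abstract says the corner edge feature is meant to address.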

Retrieval of the Variation of Optical Characteristics of Asian Dust Plume according to their Vertical Distributions using Multi-wavelength Raman LIDAR System (다파장 라만 라이다 관측을 통한 황사의 이동 고도 분포에 따른 광학적 특성 변화 규명)

  • Shin, Sung-Kyun;Park, Young-San;Choi, Byoung-Choel;Lee, Kwonho;Shin, Dongho;Kim, Young J.;Noh, Youngmin
    • Korean Journal of Remote Sensing / v.30 no.5 / pp.597-605 / 2014
  • Continuous observations of atmospheric aerosols were conducted for 3 years (2009 to 2011) using the Gwangju Institute of Science and Technology (GIST) multi-wavelength Raman lidar at Gwangju, Korea ($35.10^{\circ}N$, $126.53^{\circ}E$). The aerosol depolarization ratios calculated from the lidar data were used to identify the Asian dust layer. The optical properties of the Asian dust layer differed according to its vertical distribution. In order to investigate the differences between the optical properties of the individual dust layers, the transport pathway and transport altitude of Asian dust were analyzed with the Hybrid Single Particle Lagrangian Integrated Trajectory (HYSPLIT) model. We consider that the variation of the optical properties was influenced not only by the transport pathway but also by the transport height when the dust passed over anthropogenic pollution source regions in China. Lower particle depolarization ratio values of $0.12{\pm}0.01$, higher lidar ratios of $67{\pm}9$ sr and $68{\pm}9$ sr at 355 nm and 532 nm, respectively, and a higher ${\AA}ngstr\ddot{o}m$ exponent of $1.05{\pm}0.57$, which are considered optical properties of pollution, were found. In contrast, higher particle depolarization ratio values of $0.21{\pm}0.09$, lower lidar ratios of $48{\pm}5$ sr and $46{\pm}4$ sr at 355 nm and 532 nm, respectively, and a lower ${\AA}ngstr\ddot{o}m$ exponent of $0.57{\pm}0.24$, which are considered optical properties of dust, were found. We found that the degree of mixing of anthropogenic pollutant aerosols in mixed Asian dust governs the variation of the optical properties of Asian dust, and that it depends on the altitude of the dust when it passed over the polluted regions of China.
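Two of the quantities reported in this abstract follow from standard definitions, sketched here for readers unfamiliar with them. The abstract does not say which coefficients or wavelength pair the authors used for the Ångström exponent, so treating it as extinction-based between 355 nm and 532 nm is an assumption, as are the function names.

```python
import math


def angstrom_exponent(coeff_1, coeff_2, wavelength_1_nm=355.0, wavelength_2_nm=532.0):
    """Angstrom exponent: -ln(coeff_1 / coeff_2) / ln(wavelength_1 / wavelength_2).

    coeff_1 and coeff_2 are the aerosol extinction (or backscatter) coefficients
    at the two wavelengths; larger exponents indicate smaller particles.
    """
    return -math.log(coeff_1 / coeff_2) / math.log(wavelength_1_nm / wavelength_2_nm)


def lidar_ratio(extinction, backscatter):
    """Lidar ratio in sr: extinction coefficient divided by backscatter coefficient."""
    return extinction / backscatter
```

Under these definitions, a layer with a higher Ångström exponent and a higher lidar ratio points toward smaller, more absorbing particles, consistent with the abstract's reading of such layers as pollution-influenced.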

Probabilistic Anatomical Labeling of Brain Structures Using Statistical Probabilistic Anatomical Maps (확률 뇌 지도를 이용한 뇌 영역의 위치 정보 추출)

  • Kim, Jin-Su;Lee, Dong-Soo;Lee, Byung-Il;Lee, Jae-Sung;Shin, Hee-Won;Chung, June-Key;Lee, Myung-Chul
    • The Korean Journal of Nuclear Medicine / v.36 no.6 / pp.317-324 / 2002
  • Purpose: The use of the statistical parametric mapping (SPM) program has increased for the analysis of brain PET and SPECT images. The Montreal Neurological Institute (MNI) coordinate system is used in the SPM program as a standard anatomical framework. While most researchers look up the Talairach atlas to report the localization of the activations detected in the SPM program, there is a significant disparity between the MNI templates and the Talairach atlas. That disparity between Talairach and MNI coordinates makes the interpretation of SPM results time consuming, subjective and inaccurate. The purpose of this study was to develop a program that provides objective anatomical information for each x-y-z position in the ICBM coordinate system. Materials and Methods: The program was designed to provide the anatomical information for a given x-y-z position in MNI coordinates based on the Statistical Probabilistic Anatomical Map (SPAM) images of ICBM. When an x-y-z position was given to the program, the names of the anatomical structures with non-zero probability and the probabilities that the given position belongs to those structures were tabulated. The program was coded in IDL and Java for easy porting to any operating system or platform. The utility of this program was shown by comparing its results to those of the SPM program. A preliminary validation study was performed by applying this program to the analysis of a PET brain activation study of human memory in which the anatomical information on the activated areas was previously known. Results: Real-time retrieval of probabilistic information with 1 mm spatial resolution was achieved using the programs. The validation study showed the relevance of this program: the probability that the activated area for memory belonged to the hippocampal formation was more than 80%. Conclusion: These programs will be useful for interpreting the results of image analysis performed in MNI coordinates, as done in the SPM program.
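The lookup described in Materials and Methods (tabulating the structures with non-zero probability at a given position) can be sketched as below. The original program was written in IDL and Java; this Python sketch uses a hypothetical in-memory layout in which each structure's probability map is a dictionary keyed by integer voxel coordinates, and the structure names and probabilities in any example data are made up.

```python
def label_position(spam_maps, xyz):
    """Tabulate (structure, probability) pairs with non-zero probability at xyz.

    spam_maps: dict mapping a structure name to a dict {(x, y, z): probability}
    (a hypothetical stand-in for the SPAM volumes). Results are sorted with the
    most probable structure first.
    """
    rows = [(name, volume.get(xyz, 0.0)) for name, volume in spam_maps.items()]
    rows = [(name, prob) for name, prob in rows if prob > 0.0]
    return sorted(rows, key=lambda row: row[1], reverse=True)


def format_table(rows):
    """Render the tabulated result as plain text, one structure per line."""
    return "\n".join(f"{name}\t{prob:.2f}" for name, prob in rows)
```

With real SPAM volumes the dictionaries would be replaced by array lookups after converting the millimetre coordinates to voxel indices, a step this sketch leaves out.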

A Study on the Design of the Grid-Cell Assessment System for the Optimal Location of Offshore Wind Farms (해상풍력발전단지의 최적 위치 선정을 위한 Grid-cell 평가 시스템 개념 설계)

  • Lee, Bo-Kyeong;Cho, Ik-Soon;Kim, Dae-Hae
    • Journal of the Korean Society of Marine Environment & Safety / v.24 no.7 / pp.848-857 / 2018
  • Recently, new and renewable energy sources, including solar power, waves, and fuel cells, have been actively developed around the world. In particular, floating offshore wind farms have been developed to save costs through large-scale production, to use high-quality wind power, and to minimize noise damage in ocean areas. The development of floating wind farms requires an evaluation under the Maritime Safety Audit Scheme of the Maritime Safety Act in Korea. Floating wind farms shall be assessed by applying the line and area concept for the systematic development, management and utilization of specified sea areas. The development of appropriate evaluation methods and standards is also required. In this study, proper standards for marine traffic surveys and assessments were established and a systematic treatment was studied for assessing marine spatial areas. First, a marine traffic data collector using AIS or radar was designed to conduct marine traffic surveys. In addition, assessment methods such as historical tracks, traffic density and marine traffic pattern analysis applying the line and area concept were proposed. Marine traffic density can be evaluated by spatial and temporal means, with an adjusted grid-cell scale. Marine traffic pattern analysis was proposed for assessing ship movement patterns for transit or work in sea areas. Finally, the conceptual design of a Marine Traffic and Safety Assessment Solution (MaTSAS) that can automatically collect and assess marine traffic data was completed. Automated and systematic collection, analysis and retrieval of marine traffic data could minimize inaccurate estimation due to human errors such as data omission or misprints. This study could provide reliable assessment results, reflecting the line and area concept, according to sea area usage.
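The grid-cell traffic density mentioned in this abstract can be illustrated by binning ship positions into latitude/longitude cells. This is only a sketch under assumptions: the abstract does not specify the MaTSAS data model, so the `(ship_id, lat, lon)` record layout, the cell size in degrees, and the choice to count distinct ships per cell are all hypothetical.

```python
import math
from collections import defaultdict


def cell_of(lat, lon, cell_deg):
    """Integer (row, column) index of the grid cell containing a position."""
    return (math.floor(lat / cell_deg), math.floor(lon / cell_deg))


def traffic_density(positions, cell_deg=0.01):
    """Count distinct ships observed in each grid cell.

    positions: iterable of (ship_id, lat, lon) tuples, e.g. decoded AIS reports.
    Changing cell_deg adjusts the grid-cell scale of the assessment.
    """
    ships_per_cell = defaultdict(set)
    for ship_id, lat, lon in positions:
        ships_per_cell[cell_of(lat, lon, cell_deg)].add(ship_id)
    return {cell: len(ids) for cell, ids in ships_per_cell.items()}
```

Filtering the input by timestamp before calling `traffic_density` would give the temporal view of density that the abstract mentions alongside the spatial one.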