• Title/Summary/Keyword: 엔진 성능

Search Result 2,214, Processing Time 0.022 seconds

Methods for Integration of Documents using Hierarchical Structure based on the Formal Concept Analysis (FCA 기반 계층적 구조를 이용한 문서 통합 기법)

  • Kim, Tae-Hwan;Jeon, Ho-Cheol;Choi, Joong-Min
    • Journal of Intelligence and Information Systems
    • /
    • v.17 no.3
    • /
    • pp.63-77
    • /
    • 2011
  • The World Wide Web is a very large distributed digital information space. From its origins in 1991, the web has grown to encompass diverse information resources as personal home pasges, online digital libraries and virtual museums. Some estimates suggest that the web currently includes over 500 billion pages in the deep web. The ability to search and retrieve information from the web efficiently and effectively is an enabling technology for realizing its full potential. With powerful workstations and parallel processing technology, efficiency is not a bottleneck. In fact, some existing search tools sift through gigabyte.syze precompiled web indexes in a fraction of a second. But retrieval effectiveness is a different matter. Current search tools retrieve too many documents, of which only a small fraction are relevant to the user query. Furthermore, the most relevant documents do not nessarily appear at the top of the query output order. Also, current search tools can not retrieve the documents related with retrieved document from gigantic amount of documents. The most important problem for lots of current searching systems is to increase the quality of search. It means to provide related documents or decrease the number of unrelated documents as low as possible in the results of search. For this problem, CiteSeer proposed the ACI (Autonomous Citation Indexing) of the articles on the World Wide Web. A "citation index" indexes the links between articles that researchers make when they cite other articles. Citation indexes are very useful for a number of purposes, including literature search and analysis of the academic literature. For details of this work, references contained in academic articles are used to give credit to previous work in the literature and provide a link between the "citing" and "cited" articles. A citation index indexes the citations that an article makes, linking the articleswith the cited works. Citation indexes were originally designed mainly for information retrieval. The citation links allow navigating the literature in unique ways. Papers can be located independent of language, and words in thetitle, keywords or document. A citation index allows navigation backward in time (the list of cited articles) and forwardin time (which subsequent articles cite the current article?) But CiteSeer can not indexes the links between articles that researchers doesn't make. Because it indexes the links between articles that only researchers make when they cite other articles. Also, CiteSeer is not easy to scalability. Because CiteSeer can not indexes the links between articles that researchers doesn't make. All these problems make us orient for designing more effective search system. This paper shows a method that extracts subject and predicate per each sentence in documents. A document will be changed into the tabular form that extracted predicate checked value of possible subject and object. We make a hierarchical graph of a document using the table and then integrate graphs of documents. The graph of entire documents calculates the area of document as compared with integrated documents. We mark relation among the documents as compared with the area of documents. Also it proposes a method for structural integration of documents that retrieves documents from the graph. It makes that the user can find information easier. We compared the performance of the proposed approaches with lucene search engine using the formulas for ranking. As a result, the F.measure is about 60% and it is better as about 15%.

Development of Yóukè Mining System with Yóukè's Travel Demand and Insight Based on Web Search Traffic Information (웹검색 트래픽 정보를 활용한 유커 인바운드 여행 수요 예측 모형 및 유커마이닝 시스템 개발)

  • Choi, Youji;Park, Do-Hyung
    • Journal of Intelligence and Information Systems
    • /
    • v.23 no.3
    • /
    • pp.155-175
    • /
    • 2017
  • As social data become into the spotlight, mainstream web search engines provide data indicate how many people searched specific keyword: Web Search Traffic data. Web search traffic information is collection of each crowd that search for specific keyword. In a various area, web search traffic can be used as one of useful variables that represent the attention of common users on specific interests. A lot of studies uses web search traffic data to nowcast or forecast social phenomenon such as epidemic prediction, consumer pattern analysis, product life cycle, financial invest modeling and so on. Also web search traffic data have begun to be applied to predict tourist inbound. Proper demand prediction is needed because tourism is high value-added industry as increasing employment and foreign exchange. Among those tourists, especially Chinese tourists: Youke is continuously growing nowadays, Youke has been largest tourist inbound of Korea tourism for many years and tourism profits per one Youke as well. It is important that research into proper demand prediction approaches of Youke in both public and private sector. Accurate tourism demands prediction is important to efficient decision making in a limited resource. This study suggests improved model that reflects latest issue of society by presented the attention from group of individual. Trip abroad is generally high-involvement activity so that potential tourists likely deep into searching for information about their own trip. Web search traffic data presents tourists' attention in the process of preparation their journey instantaneous and dynamic way. So that this study attempted select key words that potential Chinese tourists likely searched out internet. Baidu-Chinese biggest web search engine that share over 80%- provides users with accessing to web search traffic data. Qualitative interview with potential tourists helps us to understand the information search behavior before a trip and identify the keywords for this study. Selected key words of web search traffic are categorized by how much directly related to "Korean Tourism" in a three levels. Classifying categories helps to find out which keyword can explain Youke inbound demands from close one to far one as distance of category. Web search traffic data of each key words gathered by web crawler developed to crawling web search data onto Baidu Index. Using automatically gathered variable data, linear model is designed by multiple regression analysis for suitable for operational application of decision and policy making because of easiness to explanation about variables' effective relationship. After regression linear models have composed, comparing with model composed traditional variables and model additional input web search traffic data variables to traditional model has conducted by significance and R squared. after comparing performance of models, final model is composed. Final regression model has improved explanation and advantage of real-time immediacy and convenience than traditional model. Furthermore, this study demonstrates system intuitively visualized to general use -Youke Mining solution has several functions of tourist decision making including embed final regression model. Youke Mining solution has algorithm based on data science and well-designed simple interface. In the end this research suggests three significant meanings on theoretical, practical and political aspects. Theoretically, Youke Mining system and the model in this research are the first step on the Youke inbound prediction using interactive and instant variable: web search traffic information represents tourists' attention while prepare their trip. Baidu web search traffic data has more than 80% of web search engine market. Practically, Baidu data could represent attention of the potential tourists who prepare their own tour as real-time. Finally, in political way, designed Chinese tourist demands prediction model based on web search traffic can be used to tourism decision making for efficient managing of resource and optimizing opportunity for successful policy.

A Comparative Study on the Acceptability and the Consumption Attitude for Soy Foods between Korean and Canadian University Students (한국과 캐나다 대학생들의 콩가공식품에 대한 수응도 및 소비실태 비교 연구)

  • Ahn Tae-Hyun;Paliyath Gopinadhan
    • KOREAN JOURNAL OF CROP SCIENCE
    • /
    • v.51 no.5
    • /
    • pp.466-476
    • /
    • 2006
  • The objective of this study was to compare and analyze the acceptability and consumption attitude for soy foods between Korean and Canadian university students as young consumers. This survey was carried out by questionnaire and the subjects were n=516 in Korea and n=502 in Canada. Opinions for soy foods in terms of general knowledge were that soy foods are healthy (86.5% in Korean and 53.4% in Canadian) or neutral (11.6% in Korean and 42.8% in Canadian), dairy foods can be substituted by soy foods (51.9% in Korean and 41.8% in Canadian), and soy foods are not only for vegetarians and milk allergy Patients but also for ordinary People (94.2% in Korean and 87.6% in Canadian). In main sources of information about soy foods, the rate by commercials on TV, radio or magazine was the highest (58.0%) for Korean students and the rate by family or friend was the highest(35.7%) for Canadian students. In consumption attitude, all of Korean students have purchased soy foods but only 55.4% of Canadian students have purchased soy foods, and soymilk was remarkably recognized and consumed then soy beverage and margarine in order. 76.4% of Korean students and 65.1% of Canadian students think soy foods are general and popular and can purchase easily, otherwise, in terms of price, soy foods were expensively recognized as 'more expensive than dairy foods' was 59.1% (Korean) and 54.7% (Canadian), and 'similar to dairy foods' was 36.8% (Korean) and 39.9% (Canadian). Major reasons for the rare consumption were 'I am not interested in soy foods' in Korean students (27.3%) and 'I prefer dairy foods to soy foods' in Canadian students (51.7%). However, consumption of soy foods in both countries are very positive and it will be increased.

Design of a Crowd-Sourced Fingerprint Mapping and Localization System (군중-제공 신호지도 작성 및 위치 추적 시스템의 설계)

  • Choi, Eun-Mi;Kim, In-Cheol
    • KIPS Transactions on Software and Data Engineering
    • /
    • v.2 no.9
    • /
    • pp.595-602
    • /
    • 2013
  • WiFi fingerprinting is well known as an effective localization technique used for indoor environments. However, this technique requires a large amount of pre-built fingerprint maps over the entire space. Moreover, due to environmental changes, these maps have to be newly built or updated periodically by experts. As a way to avoid this problem, crowd-sourced fingerprint mapping attracts many interests from researchers. This approach supports many volunteer users to share their WiFi fingerprints collected at a specific environment. Therefore, crowd-sourced fingerprinting can automatically update fingerprint maps up-to-date. In most previous systems, however, individual users were asked to enter their positions manually to build their local fingerprint maps. Moreover, the systems do not have any principled mechanism to keep fingerprint maps clean by detecting and filtering out erroneous fingerprints collected from multiple users. In this paper, we present the design of a crowd-sourced fingerprint mapping and localization(CMAL) system. The proposed system can not only automatically build and/or update WiFi fingerprint maps from fingerprint collections provided by multiple smartphone users, but also simultaneously track their positions using the up-to-date maps. The CMAL system consists of multiple clients to work on individual smartphones to collect fingerprints and a central server to maintain a database of fingerprint maps. Each client contains a particle filter-based WiFi SLAM engine, tracking the smartphone user's position and building each local fingerprint map. The server of our system adopts a Gaussian interpolation-based error filtering algorithm to maintain the integrity of fingerprint maps. Through various experiments, we show the high performance of our system.