• Title/Summary/Keyword: query index

Measuring Hadoop Optimality by Lorenz Curve (로렌츠 커브를 이용한 하둡 플랫폼의 최적화 지수)

  • Kim, Woo-Cheol;Baek, Changryong
    • The Korean Journal of Applied Statistics / v.27 no.2 / pp.249-261 / 2014
  • Ever-increasing "big data" can be processed effectively only by parallel computing. Parallel computing is a high-performance computational method that divides a large query into smaller subtasks and aggregates their results to produce an output. However, it is well known that parallel computing does not automatically achieve scalability, meaning a linear improvement in performance as computers are added, because scalability requires careful assignment of tasks to each node and timely collection of results. Hadoop is one of the most successful platforms for attaining scalability. In this paper, we propose a measure of Hadoop optimization based on the Lorenz curve, which serves as a proxy for the inequality of hardware resources. The proposed index takes into account the intrinsic overhead of Hadoop systems in CPU, disk I/O, and network. It therefore also indicates explicitly whether a given Hadoop system can be improved and in which capacity. The method is illustrated with experimental data and substantiated by Monte Carlo simulations.
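
As a rough illustration of the Lorenz-curve idea in this abstract, the sketch below computes a Lorenz curve and the associated Gini coefficient for per-node resource usage. It is not the index proposed in the paper, whose definition is not given here. The function names and the per-node CPU figures are hypothetical.

```python
def lorenz_curve(values):
    """Return Lorenz curve points as (share of nodes, cumulative share of usage)."""
    xs = sorted(values)
    total = sum(xs)
    if not xs or total <= 0:
        raise ValueError("need at least one node and a positive total")
    n = len(xs)
    points = [(0.0, 0.0)]
    running = 0.0
    for i, x in enumerate(xs, start=1):
        running += x
        points.append((i / n, running / total))
    return points


def gini(values):
    """Gini coefficient: 1 minus twice the area under the Lorenz curve."""
    pts = lorenz_curve(values)
    area = 0.0
    # Trapezoid rule over consecutive Lorenz points.
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return 1.0 - 2.0 * area


# Hypothetical CPU-seconds consumed by each of four nodes for one job.
balanced = [10.0, 10.0, 10.0, 10.0]
skewed = [1.0, 1.0, 1.0, 37.0]

if __name__ == "__main__":
    print("balanced:", gini(balanced))
    print("skewed:  ", gini(skewed))
```

A coefficient near 0 means the load is spread evenly across nodes, and a larger value means a few nodes carry most of the work.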

A Fuzzy Retrieval System to Facilitate Associated Learning in Problem Banks (문제 은행에서 연상학습을 지원하는 퍼지 검색 시스템)

  • Choi, Jae-hun;Kim, ji-Suk;Cho, Gi-Hwan
    • Journal of KIISE:Software and Applications / v.29 no.4 / pp.278-288 / 2002
  • This paper presents the design and implementation of a fuzzy retrieval system that supports associated learning in problem banks. It retrieves problems that are conceptually related to the semantics expressed in a user's query. In particular, the system employs a fuzzy thesaurus that represents relationships between domain-dependent vocabulary terms as fuzzy degrees. It is designed around the characteristics of associated learning, which calls for high recall and acceptable precision. Because the thesaurus resolves vocabulary mismatches between query terms and document index terms, the retrieval system can effectively support the user's associated learning. Finally, we evaluate whether the fuzzy retrieval system is appropriate for associated learning in terms of its precision and recall.
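
The thesaurus-based expansion described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the system from the paper. The thesaurus entries, the threshold, and the max-membership scoring rule are all assumptions.

```python
# Hypothetical fuzzy thesaurus: term -> {related term: degree in [0, 1]}.
THESAURUS = {
    "triangle": {"polygon": 0.8, "angle": 0.6, "circle": 0.2},
    "fraction": {"ratio": 0.7},
}


def expand_query(terms, thesaurus, threshold=0.5):
    """Weight each query term 1.0 and add related terms at their fuzzy degree."""
    weights = {t: 1.0 for t in terms}
    for t in terms:
        for related, degree in thesaurus.get(t, {}).items():
            if degree >= threshold:
                weights[related] = max(weights.get(related, 0.0), degree)
    return weights


def retrieve(problems, terms, thesaurus, threshold=0.5):
    """Rank problems by the highest weight among their index terms."""
    weights = expand_query(terms, thesaurus, threshold)
    scored = []
    for pid, index_terms in problems.items():
        s = max((weights.get(t, 0.0) for t in index_terms), default=0.0)
        if s > 0.0:
            scored.append((pid, s))
    # Highest score first; ties broken by problem id for a stable order.
    return sorted(scored, key=lambda p: (-p[1], p[0]))


# Hypothetical problem bank: problem id -> index terms.
PROBLEMS = {"p1": ["triangle"], "p2": ["polygon"], "p3": ["fraction"]}

if __name__ == "__main__":
    print(retrieve(PROBLEMS, ["triangle"], THESAURUS))
```

Problem `p2` never mentions "triangle", yet it is retrieved through the thesaurus link. This is the vocabulary-mismatch case the abstract refers to, and lowering the threshold trades precision for recall.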

A Tree-Based Indexing Method for Mobile Data Broadcasting (모바일 데이터 브로드캐스팅을 위한 트리 기반의 인덱싱 방법)

  • Park, Mee-Hwa;Lee, Yong-Kyu
    • Journal of the Korea Society of Computer and Information / v.13 no.4 / pp.141-150 / 2008
  • In the mobile computing environment, data broadcasting is widely used to address the limited power and bandwidth of mobile devices. Most previous broadcast indexing methods concentrate on flat data. However, with the growing popularity of XML, an increasing amount of information is being stored and exchanged in XML format. We propose a novel indexing method, called the TOP tree (Tree Ordering based Path summary tree), for indexing XML documents in mobile broadcast environments. The TOP tree is a path summary tree that provides a concise structure summary at the group level using global IDs and element information at the local level using local IDs. Based on the TOP tree representation, we suggest a broadcast stream generation and query processing method that efficiently handles not only simple path queries but also multiple path queries. We compared our indexing method with other indexing methods. The evaluation results show that our approach can effectively improve access time and tune-in time in a wireless broadcasting environment.

Design and Implementation of Active Database Based Query Processing System for Educational Information (능동 데이터베이스 기반 교육 정보 질의 처리 시스템의 설계 및 구현)

  • Lee, Tae-Jung;Lee, Soo-Jung;Lee, Jae-Ho
    • Journal of The Korean Association of Information Education / v.4 no.1 / pp.109-119 / 2000
  • In this paper, we design and implement an educational system using active database techniques in a WWW environment. The developed system consists of three modules: a user group interface, an active-based monitor, and an expert group interface. The user group interface provides system access to users registered in the developed system. The active-based monitor, which acts as the triggering module, consists of three sub-modules: a keyword parser, an exception handler, and a DB access module. It also manages five items such as the keyword index, Q&A DB, user group mailing list, and expert group mailing list. The expert group interface provides an answer form for experts registered in the developed system.

Classification and Retrieval of Object - Oriented Reuse Components with HACM (HACM을 사용한 객체지향 재사용 부품의 분류와 검색)

  • Bae, Je-Min;Kim, Sang-Geun;Lee, Kyung-Whan
    • The Transactions of the Korea Information Processing Society / v.4 no.7 / pp.1733-1748 / 1997
  • In this paper, we propose a classification scheme and retrieval mechanism, applicable to many application domains, for constructing a software reuse library. The classification scheme, which is central to the accessibility of reusable components, is defined as a hierarchical structure built from agglomerative clusters. An agglomerative cluster is a group of reuse components related by function. Functional relationships are measured with HACM, a representation method for software components that is used to calculate the similarities among classes in a particular domain. Clustering information is added to the library structure, which determines the functionality and accuracy of the retrieval system. The system stores the classification results, such as index information with weights, the similarity matrix, and the hierarchical structure, so users can retrieve software components with natural-language queries. This study focuses on the findability of software components in the reuse library. As a result, part of the construction process of the reuse library was automated, and an object-oriented reuse library can be constructed with extensibility and with relationships among the reuse components. The process is also visualized through the browse hierarchy of the retrieval environment, and the retrieval system is integrated into the reuse system CARS 2.1.

A Design of Parallel Processing System for Management of Moving Objects (이동체 관리를 위한 다중 처리 시스템의 설계)

  • 김진덕;강구안;육정수;박연식
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2004.05b / pp.345-349 / 2004
  • To index moving objects (vehicles, mobile phones, PDAs, etc.) exactly in a mobile database, continuous updates of their locations are both inevitable and time-consuming. Studies of pure spatial indices have focused on efficient retrieval. In moving object databases, however, acquiring and managing the terminal location of moving objects is more important than the efficiency of query processing. A parallel processing system is therefore needed for moving object databases, which should maintain each object's current location as precisely as possible. This paper proposes an architecture for spatially indexing mobile objects using multiple processors. More precisely, we propose a new method of splitting buckets that uses the properties of moving objects to minimize the number of database updates. We also propose an acquisition method that gathers the location information of moving objects and passes bucket extent information so as to reduce the number of messages passed between processors.

A Exploratory Study on the Expansion of Academic Information Services Based on Automatic Semantic Linking Between Academic Web Resources and Information Services (웹 정보의 자동 의미연계를 통한 학술정보서비스의 확대 방안 연구)

  • Jeong, Do-Heon;Yu, So-Young;Kim, Hwan-Min;Kim, Hye-Sun;Kim, Yong-Kwang;Han, Hee-Jun
    • Journal of Information Management / v.40 no.1 / pp.133-156 / 2009
  • In this study, we link informal Web resources to the collections of KISTI's NDSL using automatic semantic indexing and tagging, in order to examine the feasibility of a service that recommends related documents based on the similarity between KISTI's formal information resources and informal Web resources. We collect and index Web resources and link them automatically and semantically, through STEAK, with KISTI's collections for NDSL retrieval. The macro precision, which measures retrieval precision per subject category, is 62.6%, and the micro precision, which measures retrieval precision per query, is 66.9%. The experts' evaluation score is 76.7. This study shows that NDSL retrieval results can be semantically linked with Web information resources and that the coverage of information services can be extended to informal information resources.
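
The two averages reported in this abstract can be illustrated with a short sketch. It follows the abstract's own wording (macro is averaged per subject category, micro is averaged per query), which may differ from other definitions of micro-averaging in the literature. The judgment data and function names below are hypothetical and unrelated to the paper's results.

```python
from collections import defaultdict


def micro_precision(results):
    """Average precision over queries: each query counts once."""
    precisions = [relevant / retrieved for _, relevant, retrieved in results]
    return sum(precisions) / len(precisions)


def macro_precision(results):
    """Average precision over categories: each category counts once."""
    by_category = defaultdict(list)
    for category, relevant, retrieved in results:
        by_category[category].append(relevant / retrieved)
    category_means = [sum(v) / len(v) for v in by_category.values()]
    return sum(category_means) / len(category_means)


# Hypothetical judgments: (subject category, relevant hits, retrieved hits) per query.
RESULTS = [
    ("physics", 8, 10),
    ("physics", 6, 10),
    ("biology", 2, 10),
]

if __name__ == "__main__":
    print("micro:", micro_precision(RESULTS))
    print("macro:", macro_precision(RESULTS))
```

The two figures diverge when categories hold different numbers of queries, since a category with few queries weighs as much as a large one in the macro average.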

Strategies to Improve Nutrition for the Elderly in Suwon : Analysis of Dietary Behavior and Food Preferences (수원지역 노인 영양개선 전략 연구 : 식습관 및 식품기호도 분석)

  • 임경숙;민영희;이태영;김영주
    • Korean Journal of Community Nutrition / v.3 no.3 / pp.410-422 / 1998
  • To promote health status, strategies and interventions to improve nutrition should be based on a proper diagnosis of the subject's eating patterns. The elderly usually have traditional food habits and preferences, and it is very difficult to change them. This study was designed to identify the dietary behavior and food preferences of the elderly, in order to provide baseline data for the Elderly Nutrition Intervention Program for the Public Health Center. A survey questionnaire was prepared for trained interviewers to query 151 elderly people from 5 community elderly centers located in Suwon, Korea. The majority of them ate regularly and partook of all available side dishes. Their major dietary problems were frequent consumption of salty foods and eating too quickly. They consumed grains and vegetables regularly, but seldom ate dairy products, fruits, meat, and food prepared with oil. They also tended to eschew ready-made processed food, high-cholesterol food, and fast food, and they did not dine out as much as younger people. Desirable eating habit scores were not significantly influenced by socioeconomic variables and nutrition-related characteristics. These included nutrition knowledge, the Nutritional Risk Index (NRI), and a score of health concerns. However, meal balance scores were significantly higher in the younger group (p<.05) and the higher household income group (p<.05). According to stepwise multiple regression analysis, NRI was the most important determinant of a desirable eating habit score for elderly men, whereas the score of health concerns was most important for elderly women. The greatest predictor of the meal balance score was nutrition knowledge. The elderly liked sweet-tasting food, grains, rice, stews, and Korean-style soups. They disliked sour food, dairy products, processed food, and bread. The results indicate that the Elderly Nutrition Education Program should focus on increasing consumption of dairy products, fruits, and food with oil, prepared by traditional Korean cooking methods. They also suggest that program planning should consider the socioeconomic status of the elderly, such as income and education level, as well as their concern for health.

Big Data Model for Analyzing Plant Growth Environment Informations and Biometric Informations (농작물 생육환경정보와 생체정보 분석을 위한 빅데이터 모델)

  • Lee, JongYeol;Moon, ChangBae;Kim, ByeongMan
    • Journal of Korea Society of Industrial Information Systems / v.25 no.6 / pp.15-23 / 2020
  • While research on climate change is being actively carried out in the agricultural field, smart agriculture using information and communication technology has become a new trend in line with the Fourth Industrial Revolution. Accordingly, research is being conducted to identify and respond to signs of abnormal growth in advance by monitoring crop stress under various outdoor environments and soil conditions. There are also attempts to analyze data collected in real time through various sensors using artificial intelligence techniques or big data technologies. In this paper, we propose a big data model that is effective for analyzing the growth environment information and biometric information of crops, using an existing relational database for big data analysis. The performance of the model was measured by the response time to a query as a function of the amount of data. The results confirmed a maximum time reduction of 23.8%.

Comparison of Search Performance of SQLite3 Database by Linux File Systems (Linux File Systems에 따른 SQLite3 데이터베이스의 검색 성능 비교)

  • Choi, Jin-Oh
    • Journal of the Korea Institute of Information and Communication Engineering / v.26 no.1 / pp.1-6 / 2022
  • IoT sensors are now often used to produce stream data locally, and the data are supplied to edge computing applications. Mass-produced data are stored in the mobile device's database for real-time processing and then synchronized with the server when needed. Many mobile databases have been developed to support such applications, including CloudScape, DB2 Everyplace, ASA, and PointBase Mobile, and the most widely used is SQLite3 on Linux. In this paper, we focus on the performance required for synchronization with the server. The search performance of SQLite3 was compared and analyzed according to the type of Linux file system on which the database is stored. Performance differences were checked for each file system across various search query types, and criteria are presented for choosing the more appropriate Linux file system depending on whether the workload uses an index or performs table scans.
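
The distinction between the "index use environment" and the "table scan environment" in this abstract can be observed directly in SQLite. The sketch below is not the paper's benchmark and does not touch the file system, since it uses an in-memory database. The table, column, and index names are hypothetical. It only shows how the same query switches from a scan to an index search once an index exists.

```python
import sqlite3


def query_plans():
    """Return the query plan text before and after creating an index."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE reading (id INTEGER PRIMARY KEY, sensor INTEGER, value REAL)"
    )
    conn.executemany(
        "INSERT INTO reading (sensor, value) VALUES (?, ?)",
        [(i % 100, i * 0.5) for i in range(10000)],
    )

    def plan(sql):
        # Each EXPLAIN QUERY PLAN row is (id, parent, notused, detail).
        rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
        return " ".join(row[3] for row in rows)

    query = "SELECT value FROM reading WHERE sensor = 42"
    before = plan(query)  # no index on sensor: full table scan
    conn.execute("CREATE INDEX idx_reading_sensor ON reading (sensor)")
    after = plan(query)  # index available: index search
    conn.close()
    return before, after


if __name__ == "__main__":
    before, after = query_plans()
    print("without index:", before)
    print("with index:   ", after)
```

To approximate the comparison the abstract describes, the same queries would be timed against database files placed on different file systems. That step is omitted here because it depends on the test machine.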