• Title/Summary/Keyword: Analysis of Query

Curriculum of Basic Data Science Practices for Non-majors (비전공자 대상 기초 데이터과학 실습 커리큘럼)

  • Hur, Kyeong
    • Journal of Practical Engineering Education / v.12 no.2 / pp.265-273 / 2020
  • In this paper, we propose an educational method that uses the Excel (spreadsheet) data analysis tool to design a basic data science practice curriculum as a liberal arts subject for non-majors. Tools for data collection, data processing, and data analysis include Excel, R, Python, and Structured Query Language (SQL). Practicing data science with R, Python, or SQL requires learners to understand programming languages and data structures as well. Excel, by contrast, is a data analysis tool familiar to the general public and carries no burden of learning a programming language, so practicing basic data science with Excel lets learners concentrate on acquiring the data science content itself. We propose a one-semester basic data science practice curriculum together with weekly Excel practice contents. To demonstrate the substance of the educational content, we present examples of linear regression analysis using Excel data analysis tools.
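The linear regression analysis mentioned in this abstract can be illustrated outside a spreadsheet as well. The following is a minimal sketch of a simple (one-variable) least-squares fit in plain Python; the function name and the sample data are hypothetical and are not taken from the paper, which works in Excel.

```python
def fit_simple_linear_regression(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x.

    Returns (slope, intercept, r_squared).
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long samples with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of cross deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Coefficient of determination: share of the variance in y explained by the fit.
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    r_squared = 1.0 - ss_res / ss_tot if ss_tot else 1.0
    return slope, intercept, r_squared


# Hypothetical practice data: hours of practice vs. quiz score.
hours = [1, 2, 3, 4, 5]
scores = [52, 55, 61, 64, 68]
slope, intercept, r2 = fit_simple_linear_regression(hours, scores)
print(f"score = {intercept:.2f} + {slope:.2f} * hours (R^2 = {r2:.3f})")
```

The slope, intercept, and coefficient of determination computed here are the same quantities a spreadsheet regression reports, so the sketch can serve as a cross-check for a spreadsheet exercise.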

Analysis of 3D Data Structuring and Processing Techniques for 3D GIS (3D GIS를 위한 3차원 구조화 및 처리기술 분석)

  • 구흥대;정동기;유환희
    • Proceedings of the Korean Society of Surveying, Geodesy, Photogrammetry, and Cartography Conference / 2004.04a / pp.375-382 / 2004
  • 3D GIS has recently come into wide use in many application fields. In this research, we present the results of a survey and analysis of research trends in 3D GIS technologies: acquisition of 3D spatial data, 3D feature structuring, 3D visualization, data query, transmission, and so on. The results are expected to provide helpful information for constructing a research roadmap for the development of 3D GIS technologies.

Efficient Processing of Continuous Join Queries between a Data Stream and Multiple Relations for Real-Time Analysis of E-Commerce Data (전자상거래 데이터의 실시간 분석을 위한 데이터 스트림과 다수 릴레이션 간의 효율적인 연속 조인 처리 기법)

  • Kim, Haeri;Lee, Ki Yong
    • The Journal of Society for e-Business Studies / v.18 no.3 / pp.159-175 / 2013
  • Recently, as e-commerce data has become available in real time, the demand for real-time analysis of e-commerce data has increased significantly. In such analysis, it is very important to efficiently process continuous join queries between an e-commerce data stream and large disk-based relations. In this paper, we propose an efficient method for processing a continuous join query between an e-commerce data stream and multiple disk-based relations. The proposed method improves the service rate significantly while substantially reducing the amount of memory required. Through analysis and various experiments, we show the efficiency of the proposed method compared with the previous method in terms of service rate and memory usage.
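To make the notion of a continuous join between a stream and multiple relations concrete, the sketch below joins each arriving stream tuple against hash indexes built over two small relations. It is only an in-memory illustration of the query semantics, not the method proposed in the paper: the relations here fit in memory rather than on disk, and all names and sample records are hypothetical.

```python
from collections import defaultdict


def build_index(relation, key):
    """Hash index from a join-key value to the rows of one relation."""
    index = defaultdict(list)
    for row in relation:
        index[row[key]].append(row)
    return index


def continuous_join(stream, indexed_relations):
    """Inner-join every stream tuple with all relations as it arrives.

    indexed_relations is a list of (stream_key, index) pairs; a stream tuple
    is emitted only if it finds a match in every relation.
    """
    for event in stream:
        partials = [dict(event)]
        for stream_key, index in indexed_relations:
            matches = index.get(event[stream_key], [])
            # Combine each partial result with each matching row.
            partials = [{**p, **row} for p in partials for row in matches]
            if not partials:
                break  # no match in this relation: drop the stream tuple
        yield from partials


# Hypothetical relations and order stream.
products = [{"product_id": 1, "category": "book"},
            {"product_id": 2, "category": "toy"}]
customers = [{"customer_id": 10, "region": "Seoul"}]
orders = [{"order_id": 100, "product_id": 1, "customer_id": 10},
          {"order_id": 101, "product_id": 2, "customer_id": 99}]

relations = [("product_id", build_index(products, "product_id")),
             ("customer_id", build_index(customers, "customer_id"))]
results = list(continuous_join(orders, relations))
print(results)
```

Because `continuous_join` is a generator, it can consume an unbounded stream and emit joined records one at a time. A disk-based setting would replace the in-memory indexes with lookups against stored relations, which is where the cost the abstract refers to arises.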

Analysis of Twitter for 2012 South Korea Presidential Election by Text Mining Techniques (텍스트 마이닝을 이용한 2012년 한국대선 관련 트위터 분석)

  • Bae, Jung-Hwan;Son, Ji-Eun;Song, Min
    • Journal of Intelligence and Information Systems / v.19 no.3 / pp.141-156 / 2013
  • Social media is a representative form of Web 2.0 that shapes changes in users' information behavior by allowing users to produce their own content without any expert skills. In particular, as a new communication medium, it has a profound impact on social change by enabling users to communicate their opinions and thoughts to the masses and to acquaintances. Social media data plays a significant role in the emerging Big Data arena. A variety of research areas, such as social network analysis and opinion mining, have therefore paid attention to discovering meaningful information from the vast amounts of data buried in social media. Social media has recently become a main focus of the fields of Information Retrieval and Text Mining because it not only produces massive unstructured textual data in real time but also serves as an influential channel for opinion leading. However, most previous studies have adopted broad-brush and limited approaches, which have made it difficult to find and analyze new information. To overcome these limitations, we developed a real-time Twitter trend mining system that captures trends by processing large Twitter stream datasets in real time. The system offers term co-occurrence retrieval, visualization of Twitter users by query, similarity calculation between two users, topic modeling to keep track of changes in topical trends, and mention-based user network analysis. In addition, we conducted a case study on the 2012 Korean presidential election. We collected 1,737,969 tweets that mention the candidates' names and the election on Twitter in Korea (http://www.twitter.com/) over one month in 2012 (October 1 to October 31). The case study shows that the system provides useful information and detects social trends effectively. The system also retrieves the list of terms that co-occur with given query terms.
We compare the results of term co-occurrence retrieval using the names of the influential candidates, 'Geun Hae Park', 'Jae In Moon', and 'Chul Su Ahn', as query terms. General terms related to the presidential election, such as 'Presidential Election', 'Proclamation in Support', and 'Public opinion poll', appear frequently. The results also show specific terms that differentiate each candidate, such as 'Park Jung Hee' and 'Yuk Young Su' for the query 'Geun Hae Park', 'a single candidacy agreement' and 'Time of voting extension' for the query 'Jae In Moon', and 'a single candidacy agreement' and 'down contract' for the query 'Chul Su Ahn'. Our system not only extracts 10 topics along with related terms but also shows the topics' dynamic changes over time by employing the multinomial Latent Dirichlet Allocation technique. Each topic shows one of two types of patterns, a rising tendency or a falling tendency, depending on the change in its probability distribution. To determine the relationship between topic trends on Twitter and social issues in the real world, we compare topic trends with related news articles. We find that Twitter can track issues faster than the other medium, newspapers. The user network on Twitter differs from those of other social media because of the distinctive way relationships are formed on Twitter: users can build relationships by exchanging mentions. We visualize and analyze mention-based networks of 136,754 users, using the three candidates' names, 'Geun Hae Park', 'Jae In Moon', and 'Chul Su Ahn', as query terms. The results show that Twitter users mention all candidates' names regardless of their political tendencies. This case study shows that Twitter could be an effective tool for detecting and predicting dynamic changes in social issues, and that mention-based user networks could reveal different aspects of user behavior as a network unique to Twitter.
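One of the functions listed in this abstract, term co-occurrence retrieval, can be sketched in a few lines. The version below counts, for a query term, how many tweets contain both the query and each other term. The whitespace tokenizer, the function name, and the sample tweets are assumptions made for illustration; real Korean tweets would need a proper tokenizer, and the abstract does not describe how the system itself computes co-occurrence.

```python
from collections import Counter


def cooccurring_terms(tweets, query, top_n=5):
    """Return the terms that most often appear in the same tweet as `query`.

    Each tweet counts a term at most once, so the result is a count of
    tweets, not of raw term occurrences.
    """
    query = query.lower()
    counts = Counter()
    for text in tweets:
        terms = set(text.lower().split())  # naive whitespace tokenization
        if query in terms:
            counts.update(terms - {query})
    return counts.most_common(top_n)


# Hypothetical, already-tokenized sample tweets.
tweets = [
    "election poll park",
    "park poll debate",
    "moon debate poll",
    "moon election",
]
print(cooccurring_terms(tweets, "park", top_n=3))
```

Counting each term once per tweet keeps a single repetitive tweet from dominating the ranking. Whether that matches the system described above is not stated in the abstract.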

Analysis on Drug Identification Service and other Drug-related Queries in a Hospital Pharmacy (병원약제부의 약품식별업무와 질의응답업무에 관한 업무분석;한 대학병원의 경우)

  • Choi, Ji-Hong;Kim, Jung-Ae;Shanmugam, Srinivasan;Yong, Chul-Soon;Choi, Han-Gon;Yoo, Bong-Kyu
    • YAKHAK HOEJI / v.52 no.4 / pp.283-287 / 2008
  • Drug identification and other drug-related query services are becoming increasingly important in hospital pharmacy. The goal of this research was to investigate the current state of these services in a hospital pharmacy that had recently implemented them as part of providing advanced hospital pharmacy service in order to assure national health improvement. We investigated the reports of services performed from November 2006 through April 2007 in a university hospital located in Daegu, Korea. The number of drug identification cases was 81 during the first three-month period (period I) and increased to 222 during the second three-month period (period II), which suggests that the service was welcomed by the medical staff of the hospital. The time to process each case was about 30 minutes in period I but only 16 minutes in period II. The proportion of unidentifiable cases remained at about 25% throughout the entire period, which suggests that the system for the identification task has some limitations, such as unsatisfactory support from the Korea Pharmaceutical Association, laws, and regulations. Most drug-related queries came from physicians (60.5%), followed by nurses and pharmacists. The time to process each drug-related query was 10.6 minutes in period I and 6.9 minutes in period II. Queries answered immediately accounted for about 70% of all queries in period I and increased to about 85% in period II.

Analysis and Comparison of Query focused Korean Document Summarization using Word Embedding (워드 임베딩을 이용한 질의 기반 한국어 문서 요약 분석 및 비교)

  • Heu, Jee-Uk
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.19 no.6 / pp.161-167 / 2019
  • Recently, the amount of information created has been growing rapidly with the spread of state-of-the-art technology and the development of various ICT-based web services. In addition, users must spend a great deal of time and effort to find the information they want within this mass of information. Document summarization is a technique that efficiently produces a summary of a given document by analyzing it and extracting its key sentences and words. However, because of the characteristics of the language, it is hard to apply previous word embedding techniques to analyze the contents of documents written in Korean. In this paper, we propose a new query-focused Korean document summarization method that exploits word embedding techniques such as Word2Vec and FastText, and then compare the performance of the two.
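The general idea of query-focused extractive summarization with word embeddings can be sketched as follows: represent the query and each sentence as the average of their word vectors, then keep the sentences most similar to the query. The averaging and cosine scoring are assumptions chosen for illustration, since the abstract does not specify how sentences are scored. The tiny hand-written embedding table stands in for vectors that would normally come from a trained Word2Vec or FastText model.

```python
import math


def average_vector(tokens, embeddings):
    """Mean of the embedding vectors of the known tokens, or None if none are known."""
    vectors = [embeddings[t] for t in tokens if t in embeddings]
    if not vectors:
        return None
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def summarize(sentences, query, embeddings, top_k=1):
    """Pick the top_k sentences closest to the query, kept in document order."""
    query_vec = average_vector(query.lower().split(), embeddings)
    if query_vec is None:
        return []
    scored = []
    for position, sentence in enumerate(sentences):
        vec = average_vector(sentence.lower().split(), embeddings)
        if vec is not None:
            scored.append((cosine(query_vec, vec), position, sentence))
    best = sorted(scored, key=lambda item: item[0], reverse=True)[:top_k]
    return [sentence for _, _, sentence in sorted(best, key=lambda item: item[1])]


# Hypothetical 2-dimensional embeddings; real models use hundreds of dimensions.
embeddings = {
    "stock": [1.0, 0.0], "market": [0.9, 0.1], "price": [0.8, 0.2],
    "rain": [0.0, 1.0], "weather": [0.1, 0.9],
}
sentences = ["stock price rose", "rain is expected", "weather stays mild"]
print(summarize(sentences, "stock market", embeddings, top_k=1))
```

Swapping the embedding table for vectors from two different models and comparing the selected sentences mirrors, in a very reduced form, the kind of comparison the abstract describes.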

Pattern Analysis-Based Query Expansion for Enhancing Search Convenience (검색 편의성 향상을 위한 패턴 분석 기반 질의어 확장)

  • Jeon, Seo-In;Park, Gun-Woo;Nam, Kwang-Woo;Ryu, Keun-Ho
    • Journal of Korea Society of Industrial Information Systems / v.17 no.2 / pp.65-72 / 2012
  • In the information systems of the 21st century, the amount of information resources is ever increasing, and the role of information search systems in easily acquiring required information from the web is becoming critical. Searching the web effectively generally requires the user to have enough prior knowledge and skill to identify the right keywords. However, most users search without sufficient prior knowledge and spend a lot of time coming up with keywords related to the information they need. Furthermore, although many search engines support keyword search, they provide only a collection of similar words and do not give the user search information that is precisely related to the keywords. Therefore, this paper proposes a method that analyzes user query patterns to offer expanded, related search keywords, giving users a system that conveniently supports their information searching.

Efficient Nearest Neighbor Search on Moving Object Trajectories (이동객체궤적에 대한 효율적인 최근접이웃검색)

  • Kim, Gyu-Jae;Park, Young-Hee;Cho, Woo-Hyun
    • Journal of the Korea Institute of Information and Communication Engineering / v.18 no.12 / pp.2919-2925 / 2014
  • Because of the rapid growth of mobile and wireless communication, location-based services are used in many applications, and the management and analysis of spatio-temporal data have become a hot issue in database research. Index structures and query processing for such data are very important for these applications. This paper addresses algorithms that build an index structure using the Douglas-Peucker algorithm and efficiently process nearest neighbor search queries on moving object trajectories. We compare and analyze our algorithms experimentally. Our algorithms produce a small index structure and process queries more efficiently.
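The Douglas-Peucker algorithm named in this abstract simplifies a polyline by keeping only the points that deviate from a straight approximation by more than a tolerance. The sketch below is a standard recursive version for 2D points that measures distance to the segment between the endpoints. It illustrates the simplification step only. How the paper builds its index from the simplified trajectories, and how it handles the time dimension, is not described in the abstract, and the sample trajectory is hypothetical.

```python
import math


def point_segment_distance(p, a, b):
    """Shortest distance from point p to the line segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    # Projection parameter of p onto a-b, clamped to the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def douglas_peucker(points, epsilon):
    """Simplify a polyline, dropping points within epsilon of the approximation."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    # Find the interior point farthest from the segment first-last.
    index, max_dist = 0, -1.0
    for i in range(1, len(points) - 1):
        d = point_segment_distance(points[i], first, last)
        if d > max_dist:
            index, max_dist = i, d
    if max_dist <= epsilon:
        return [first, last]
    # Keep the farthest point and simplify both halves recursively.
    left = douglas_peucker(points[: index + 1], epsilon)
    right = douglas_peucker(points[index:], epsilon)
    return left[:-1] + right  # avoid duplicating the split point


# Hypothetical trajectory of (x, y) positions.
trajectory = [(0, 0), (1, 0.05), (2, 0), (3, 3), (4, 0)]
print(douglas_peucker(trajectory, epsilon=0.5))
```

A larger `epsilon` removes more points and so yields a more compact representation at the cost of positional accuracy, which is the trade-off behind using the simplification to keep an index small.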

Performance Analysis of Default Server Replication Strategy for Query Processing in Mobile Computing (모빌 컴퓨팅 환경에서 중복 디폴트서버를 이용한 쿼리 프로세싱 기법의 성능 분석)

  • 임성화;임성화;김재훈;김성수
    • The Journal of Korean Institute of Communications and Information Sciences / v.25 no.8A / pp.1096-1103 / 2000
  • The default server strategy is commonly used for location and state management of mobile hosts in mobile computing. With this strategy, the cell of a destination mobile host can be found by querying the default server before sending data. In the SDN (Single Default Notification) strategy, which is a kind of default server strategy, a call is established after the location and state of the callee are obtained by querying the default server. However, the communication cost overhead from the default server increases when there are large numbers of cells and query requests, and when the default server is far from a base station. Moreover, no calls can be established to a mobile host when its default server fails. In this paper, we suggest and evaluate a default server replication strategy to reduce the communication cost overhead and to keep the service available.

Design and Implementation of Tag Coupling-based Boolean Query Matching System for Ranked Search Result (태그결합을 이용한 불리언 검색에서 순위화된 검색결과를 제공하기 위한 시스템 설계 및 구현)

  • Kim, Yong;Joo, Won-Kyun
    • Journal of the Korean Society for Information Management / v.29 no.4 / pp.101-121 / 2012
  • Because IR systems that adopt only the Boolean IR model cannot provide ranked search results, users have to check huge result sets one by one, which is time-consuming. This study proposes a method that provides ranked search results in the Boolean IR model by using coupling information between tags instead of index weight information. Because the proposed method uses document queries instead of general user queries, the key tags used as queries are extracted from a relevant document. In the process of extracting queries, a variety of groups of Boolean queries based on tag couplings are created. Ranked search results can then be obtained through a matching process that uses differential information among the query groups and tag significance information. To prove the usability of the proposed method, an experiment was conducted to find research trend analysis information on selected research information. In addition, a service based on the proposed method was provided for a year to collect user feedback. The results showed high user satisfaction.