• Title/Summary/Keyword: Stream Query Processing

Search Result 124, Processing Time 0.019 seconds

Comparison of Search Performance of SQLite3 Database by Linux File Systems (Linux File Systems에 따른 SQLite3 데이터베이스의 검색 성능 비교)

  • Choi, Jin-Oh
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.26 no.1
    • /
    • pp.1-6
    • /
    • 2022
  • Recently, IoT sensors are often used to produce stream data locally and they are provided for edge computing applications. Mass-produced data are stored in the mobile device's database for real-time processing and then synchronized with the server when needed. Many mobile databases are developed to support those applications. They are CloudScape, DB2 Everyplace, ASA, PointBase Mobile, etc, and the most widely used database is SQLite3 on Linux. In this paper, we focused on the performance required for synchronization with the server. The search performance required to retrieve SQLite3 was compared and analyzed according to the type of each Linux file system in which the database is stored. Thus, performance differences were checked for each file system according to various search query types, and criteria for applying the more appropriate Linux file system according to the index use environment and table scan environment were prepared and presented.

Analysis of Twitter for 2012 South Korea Presidential Election by Text Mining Techniques (텍스트 마이닝을 이용한 2012년 한국대선 관련 트위터 분석)

  • Bae, Jung-Hwan;Son, Ji-Eun;Song, Min
    • Journal of Intelligence and Information Systems
    • /
    • v.19 no.3
    • /
    • pp.141-156
    • /
    • 2013
  • Social media is a representative form of the Web 2.0 that shapes the change of a user's information behavior by allowing users to produce their own contents without any expert skills. In particular, as a new communication medium, it has a profound impact on the social change by enabling users to communicate with the masses and acquaintances their opinions and thoughts. Social media data plays a significant role in an emerging Big Data arena. A variety of research areas such as social network analysis, opinion mining, and so on, therefore, have paid attention to discover meaningful information from vast amounts of data buried in social media. Social media has recently become main foci to the field of Information Retrieval and Text Mining because not only it produces massive unstructured textual data in real-time but also it serves as an influential channel for opinion leading. But most of the previous studies have adopted broad-brush and limited approaches. These approaches have made it difficult to find and analyze new information. To overcome these limitations, we developed a real-time Twitter trend mining system to capture the trend in real-time processing big stream datasets of Twitter. The system offers the functions of term co-occurrence retrieval, visualization of Twitter users by query, similarity calculation between two users, topic modeling to keep track of changes of topical trend, and mention-based user network analysis. In addition, we conducted a case study on the 2012 Korean presidential election. We collected 1,737,969 tweets which contain candidates' name and election on Twitter in Korea (http://www.twitter.com/) for one month in 2012 (October 1 to October 31). The case study shows that the system provides useful information and detects the trend of society effectively. The system also retrieves the list of terms co-occurred by given query terms. We compare the results of term co-occurrence retrieval by giving influential candidates' name, 'Geun Hae Park', 'Jae In Moon', and 'Chul Su Ahn' as query terms. General terms which are related to presidential election such as 'Presidential Election', 'Proclamation in Support', Public opinion poll' appear frequently. Also the results show specific terms that differentiate each candidate's feature such as 'Park Jung Hee' and 'Yuk Young Su' from the query 'Guen Hae Park', 'a single candidacy agreement' and 'Time of voting extension' from the query 'Jae In Moon' and 'a single candidacy agreement' and 'down contract' from the query 'Chul Su Ahn'. Our system not only extracts 10 topics along with related terms but also shows topics' dynamic changes over time by employing the multinomial Latent Dirichlet Allocation technique. Each topic can show one of two types of patterns-Rising tendency and Falling tendencydepending on the change of the probability distribution. To determine the relationship between topic trends in Twitter and social issues in the real world, we compare topic trends with related news articles. We are able to identify that Twitter can track the issue faster than the other media, newspapers. The user network in Twitter is different from those of other social media because of distinctive characteristics of making relationships in Twitter. Twitter users can make their relationships by exchanging mentions. We visualize and analyze mention based networks of 136,754 users. We put three candidates' name as query terms-Geun Hae Park', 'Jae In Moon', and 'Chul Su Ahn'. The results show that Twitter users mention all candidates' name regardless of their political tendencies. This case study discloses that Twitter could be an effective tool to detect and predict dynamic changes of social issues, and mention-based user networks could show different aspects of user behavior as a unique network that is uniquely found in Twitter.

An Efficient Technique for Processing Frequent Updates in the R-tree (R-트리에서 빈번한 변경 질의 처리를 위한 효율적인 기법)

  • 권동섭;이상준;이석호
    • Journal of KIISE:Databases
    • /
    • v.31 no.3
    • /
    • pp.261-273
    • /
    • 2004
  • Advances in information and communication technologies have been creating new classes of applications in the area of databases. For example, in moving object databases, which track positions of a lot of objects, or stream databases, which process data streams from a lot of sensors, data Processed in such database systems are usually changed very rapidly and continuously. However, traditional database systems have a problem in processing these rapidly and continuously changing data because they suppose that a data item stored in the database remains constant until It is explicitly modified. The problem becomes more serious in the R-tree, which is a typical index structure for multidimensional data, because modifying data in the R-tree can generate cascading node splits or merges. To process frequent updates more efficiently, we propose a novel update technique for the R-tree, which we call the leaf-update technique. If a new value of a data item lies within the leaf MBR that the data item belongs, the leaf-update technique changes the leaf node only, not whole of the tree. Using this leaf-update manner and the leaf-access hash table for direct access to leaf nodes, the proposed technique can reduce update cost greatly. In addition, the leaf-update technique can be adopted in diverse variants of the R-tree and various applications that use the R-tree since it is based on the R-tree and it guarantees the correctness of the R-tree. In this paper, we prove the effectiveness of the leaf-update techniques theoretically and present experimental results that show that our technique outperforms traditional one.

S-XML Transformation Method for Efficient Distribution of Spatial Information on u-GIS Environment (u-GIS 환경에서 효율적인 공간 정보 유통을 위한 S-XML 변환 기법)

  • Lee, Dong-Wook;Baek, Sung-Ha;Kim, Gyoung-Bae;Bae, Hae-Young
    • Journal of Korea Spatial Information System Society
    • /
    • v.11 no.1
    • /
    • pp.55-62
    • /
    • 2009
  • In u-GIS environment, we collect spatial data needed through sensor network and provide them with information real-time processed or stored. When information through Internet is requested on Web based applications, it is transmitted in XML. Especially, when requested information includes spatial data, GML, S-XML, and other document that can process spatial data are used. In this processing, real-time stream data processed in DSMS is transformed to S-XML document type and spatial information service based on web receive S-XML document through Internet. Because most of spatial application service use existing spatial DBMS as a storage system, The data used in S-XML and SDBMS needs transformation between themselves. In this paper, we propose S-XML a transformation method using caching of spatial data. The proposed method caches the spatial data part of S-XML to transform S-XML and relational spatial database for providing spatial data efficiently and it transforms cached data without additional transformation cost when a transformation between data in the same region is required. Through proposed method, we show that it reduced the cost of transformation between S-XML documents and spatial information services based on web to provide spatial information in u-GIS environment and increased the performance of query processing through performance evaluation.

  • PDF