
Analysis of Twitter for 2012 South Korea Presidential Election by Text Mining Techniques (텍스트 마이닝을 이용한 2012년 한국대선 관련 트위터 분석)

  • Bae, Jung-Hwan;Son, Ji-Eun;Song, Min
    • Journal of Intelligence and Information Systems / v.19 no.3 / pp.141-156 / 2013
  • Social media is a representative form of Web 2.0 that shapes users' information behavior by allowing them to produce their own content without any expert skills. In particular, as a new communication medium, it has a profound impact on social change by enabling users to communicate their opinions and thoughts to the masses and to acquaintances. Social media data plays a significant role in the emerging Big Data arena. A variety of research areas, such as social network analysis and opinion mining, have therefore paid attention to discovering meaningful information from the vast amounts of data buried in social media. Social media has recently become a main focus of the fields of Information Retrieval and Text Mining because it not only produces massive unstructured textual data in real time but also serves as an influential channel for opinion leading. However, most previous studies have adopted broad-brush and limited approaches, which have made it difficult to find and analyze new information. To overcome these limitations, we developed a real-time Twitter trend mining system that captures trends by processing big stream datasets of Twitter in real time. The system offers term co-occurrence retrieval, visualization of Twitter users by query, similarity calculation between two users, topic modeling to keep track of changes in topical trends, and mention-based user network analysis. In addition, we conducted a case study on the 2012 Korean presidential election. We collected 1,737,969 tweets that contain the candidates' names and 'election' on Twitter in Korea (http://www.twitter.com/) for one month in 2012 (October 1 to October 31). The case study shows that the system provides useful information and detects societal trends effectively. The system also retrieves the list of terms that co-occur with given query terms.
We compare the results of term co-occurrence retrieval using the names of the influential candidates, 'Geun Hae Park', 'Jae In Moon', and 'Chul Su Ahn', as query terms. General terms related to the presidential election, such as 'Presidential Election', 'Proclamation in Support', and 'Public opinion poll', appear frequently. The results also show specific terms that differentiate each candidate, such as 'Park Jung Hee' and 'Yuk Young Su' for the query 'Geun Hae Park', 'a single candidacy agreement' and 'Time of voting extension' for the query 'Jae In Moon', and 'a single candidacy agreement' and 'down contract' for the query 'Chul Su Ahn'. Our system not only extracts 10 topics along with related terms but also shows the topics' dynamic changes over time by employing the multinomial Latent Dirichlet Allocation technique. Each topic can show one of two types of patterns, a rising tendency or a falling tendency, depending on the change in its probability distribution. To determine the relationship between topic trends on Twitter and social issues in the real world, we compare topic trends with related news articles. We find that Twitter can track issues faster than the other medium, newspapers. The user network on Twitter differs from those of other social media because of the distinctive way relationships are formed on Twitter: users can build relationships by exchanging mentions. We visualize and analyze mention-based networks of 136,754 users, using the three candidates' names, 'Geun Hae Park', 'Jae In Moon', and 'Chul Su Ahn', as query terms. The results show that Twitter users mention all of the candidates' names regardless of their political tendencies. This case study shows that Twitter could be an effective tool for detecting and predicting dynamic changes in social issues, and that mention-based user networks could reveal different aspects of user behavior as a network that is unique to Twitter.
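
The term co-occurrence retrieval described in this abstract can be illustrated with a minimal sketch. The abstract does not give the system's implementation, so the function name `cooccurring_terms`, the whitespace tokenization, and the sample tweets are all assumptions made for illustration; real Korean tweets would likely need morphological analysis rather than `split()`.

```python
from collections import Counter


def cooccurring_terms(tweets, query, top_n=5):
    """Return the terms that most often share a tweet with `query`.

    Hypothetical helper: tokenizes on whitespace and counts each term
    at most once per tweet.
    """
    query = query.lower()
    counts = Counter()
    for tweet in tweets:
        tokens = set(tweet.lower().split())
        if query in tokens:
            # Count every other term that appears alongside the query.
            counts.update(tokens - {query})
    return counts.most_common(top_n)


# Made-up sample tweets, not data from the study.
sample_tweets = [
    "moon poll election",
    "moon poll debate",
    "park election",
]
top_terms = cooccurring_terms(sample_tweets, "moon")
```

Counting each term once per tweet keeps a single repetitive tweet from dominating the ranking; whether the original system does the same is not stated.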

Spatial Distribution and Dynamics of Vegetation on a Gravel Bar: Case Study in the Bangtae Stream (자갈 하중주에서 식생의 공간 분포 및 동태: 방태천의 사례)

  • Pee, Jung-Hun;Kim, Hye-Soo;Kim, Gyung-Soon;Oh, Woo-Seok;Koo, Bon-Yoel;Lee, Chang-Seok
    • Korean Journal of Ecology and Environment / v.46 no.2 / pp.215-224 / 2013
  • We clarified the background for the establishment of vegetation by comparing spatial distribution maps of vegetation and substrate on a gravel bar in the Bangtae stream, located in Inje-gun, Gangwon-do, in central eastern Korea. Total vegetation coverage was higher in the interior and lower in the marginal parts of the gravel bar. Along the longitudinal section of the gravel bar, vegetation tended to be arranged in the order of shrub-, subtree-, and tree-dominated vegetation types from the front (upstream) toward the rear (downstream) parts. Coverage of herbaceous plants was higher in the central and rear parts and lower in the front and right parts of the gravel bar. Vegetation height was greatest in the rear part and decreased toward the front part. Substrate was distributed in the order of boulder, gravel, sand, and boulder from the front toward the rear parts. In the ordination of stands based on vegetation data, stands were arranged in the order of annual plant-, perennial herb-, shrub-, and tree-dominated vegetation from right to left along axis I. Based on the species rank-abundance curve, species richness was highest in the Pinus densiflora community, followed by the Phragmites japonica community, Salix gracilistyla community, Fraxinus rhynchophylla community, annual plant-dominated vegetation, and Prunus padus for. padus community. The order based on Shannon's index was somewhat different: the diversity of the Phragmites japonica and Salix gracilistyla communities, which showed a higher degree of dominance, was low, unlike their species richness. In conclusion, the gravel bar was evaluated to have been newly established toward the upstream, and the vegetation dynamics of the gravel bar appeared to follow the ecosystem mechanisms of succession. As shown in the above results, the Bangtae stream corresponds to an upstream reach, and the particle size of its substrate is therefore large.
These particles therefore move by rolling and accumulate toward the upstream. Vegetation types were arranged in the order of woodland, shrubland, and grassland from the rear toward the front parts of the gravel bar, thereby reflecting the formation process of the bar. However, as a dynamic space, the gravel bar is disturbed frequently not only by running water but also by suspended sand. Such disturbances create habitat diversity and consequently lead to high biodiversity.
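
The abstract notes that ranking communities by species richness and by Shannon's index gives different orders when a few species dominate. The sketch below illustrates why, using the standard definition H' = -Σ p_i ln p_i. The cover values are hypothetical and are not taken from the study.

```python
import math


def shannon_index(abundances):
    """Shannon's diversity index H' = -sum(p_i * ln(p_i))."""
    total = sum(abundances)
    proportions = [n / total for n in abundances if n > 0]
    return -sum(p * math.log(p) for p in proportions)


def species_richness(abundances):
    """Number of species with non-zero abundance."""
    return sum(1 for n in abundances if n > 0)


# Hypothetical cover values for two stands with equal richness.
even_stand = [25, 25, 25, 25]
dominated_stand = [85, 5, 5, 5]

even_h = shannon_index(even_stand)
dominated_h = shannon_index(dominated_stand)
```

Both stands hold four species, yet the stand dominated by one species has the lower index, which matches the pattern the abstract describes for communities with a high degree of dominance.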

Potential of River Bottom and Bank Erosion for River Restoration after Dam Slit in the Mountain Stream

  • Kang, Ji-Hyun;So, Kazama
    • Proceedings of the Korea Water Resources Association Conference / 2011.05a / pp.46-46 / 2011
  • Severe sediment erosion during floods causes disasters and economic losses, but ordinary sediment erosion is the basic mechanism that moves sediment from the upstream to the downstream reaches of a river. It is also an important process in changing river form. Check dams, which are constructed in mountain streams, play a vital role in functions such as controlling sudden debris flows, but they have negative effects on river ecosystems. Nowadays, open-type check dams are an alternative for recovering river biological diversity and ecosystems through sediment transport while maintaining the disaster-control function. The purpose of this paper is to verify the progress of sediment erosion of the river bottom and bank, as a first step toward river restoration after a dam slit, using cross-sectional shear stress and critical shear stress. The study area is the reach upstream of a slit check dam in a mountain stream named Wasada, in Japan. The check dam was slit with two passages in August 2010. Transects were surveyed in October 2010 at four upstream cross-sections, at distances of 7.4 m, 34 m, 86 m, and 150 m from the dam. Sediment size was surveyed at the river bottom and bank. Cobble-sized sediment was found on the wetted bottom, and small particles ranging from sand to medium gravel composed the river bank. Discharge was $2.5\;m^3/s$ and the bottom slope was 0.027 m/m. Excess shear stress (${\tau}_{ex}$) for hydraulic erosion was calculated by subtracting the critical shear stress (${\tau}_{c}$) from the shear stress (${\tau}$) at the river bottom and bank (${\tau}_{ex}={\tau}-{\tau}_{c}$). The shear stress of the river bottom (${\tau}_{bottom}$) was calculated using the cross-sectional shear stress, and the bank shear stress (${\tau}_{bank}$) was calculated with the method of Flintham and Carling (1988): $${\tau}_{bank}={\tau}\cdot SF_{bank}\cdot\frac{B+P_{bed}}{2P_{bank}}$$ where $SF_{bank}=1.77(P_{bed}/P_{bank}+1.5)^{-1.4}$, $B$ is the water surface width, and $P_{bed}$ and $P_{bank}$ are the wetted perimeters of the bed and bank.
For a flow of $2.5\;m^3/s$, the estimated values of ${\tau}_{bottom}$ were 25.0 (7.5 m cross-section), 25.7 (34 m), 21.3 (86 m), and 19.8 (150 m) N/$m^2$, lower than the critical shear stress (${\tau}_c=62.1\;N/m^2$) for 64 mm cobble. These values were insufficient to erode cobble sediment. In contrast, although the values of ${\tau}_{bank}$ were lower than those of ${\tau}_{bottom}$, at 18.7 (7.5 m), 19.3 (34 m), 16.1 (86 m), and 14.7 (150 m) N/$m^2$, positive excess shear stresses were obtained at the three cross-sections at 7.5 m, 34 m, and 86 m when compared with ${\tau}_c=15.5\;N/m^2$ for 16 mm gravel. Bank shear stresses were thus sufficient to erode medium gravel to sand. Therefore, lateral bank erosion has more potential than downward erosion at a flow of $2.5\;m^3/s$. Undercutting of the wetted bank can cause bank scour or collapse, so this channel also has the potential to become wider. This research concerns the potential for sediment erosion, and the result could not be verified with real data; verification is therefore needed as a next step. In addition, the erosion mechanism for river restoration is not simple, because the discharge distribution varies with the snowmelt and rainy seasons, and the disaster-control function will be recovered by a large precipitation event. It is therefore necessary to consider the relationship between continuous discharge change and sediment erosion.
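
The excess shear stress comparison in this abstract can be reproduced with a short sketch. The shear stress and critical shear stress values below are the ones reported in the abstract; the function names and the dictionary layout are assumptions for illustration. `bank_shear_stress` transcribes the bank formula quoted above, but the abstract gives no channel geometry, so it is not evaluated against the reported values here.

```python
def bank_shear_stress(tau, width, p_bed, p_bank):
    """Bank shear stress following the formula quoted in the abstract.

    width is the water surface width B; p_bed and p_bank are the
    wetted perimeters of the bed and bank.
    """
    sf_bank = 1.77 * (p_bed / p_bank + 1.5) ** -1.4
    return tau * sf_bank * (width + p_bed) / (2 * p_bank)


def excess_shear_stress(tau, tau_c):
    """tau_ex = tau - tau_c; a positive value indicates erosion potential."""
    return tau - tau_c


# Values reported in the abstract (N/m^2), keyed by distance from the dam (m).
tau_bottom = {7.5: 25.0, 34: 25.7, 86: 21.3, 150: 19.8}
tau_bank = {7.5: 18.7, 34: 19.3, 86: 16.1, 150: 14.7}
TAU_C_COBBLE = 62.1  # critical shear stress for 64 mm cobble
TAU_C_GRAVEL = 15.5  # critical shear stress for 16 mm gravel

bottom_excess = {d: excess_shear_stress(t, TAU_C_COBBLE) for d, t in tau_bottom.items()}
bank_excess = {d: excess_shear_stress(t, TAU_C_GRAVEL) for d, t in tau_bank.items()}

# Cross-sections where the bank shear stress exceeds the critical value.
erodible_bank_sections = [d for d, ex in bank_excess.items() if ex > 0]
```

With these inputs every bottom value is negative, while the bank value is positive at the 7.5 m, 34 m, and 86 m sections, consistent with the abstract's conclusion that lateral erosion is more likely than downward erosion at this discharge.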

An Analysis of IT Trends Using Tweet Data (트윗 데이터를 활용한 IT 트렌드 분석)

  • Yi, Jin Baek;Lee, Choong Kwon;Cha, Kyung Jin
    • Journal of Intelligence and Information Systems / v.21 no.1 / pp.143-159 / 2015
  • Predicting IT trends has long been an important subject in information systems research. IT trend prediction makes it possible to recognize emerging eras of innovation and to allocate budgets in preparation for rapidly changing technological trends. Toward the end of each year, various domestic and global organizations predict and announce IT trends for the following year. For example, Gartner predicts the top 10 IT trends for the coming year, and these predictions affect the basic assumptions of IT and industry leaders and organizations about technology and the future of IT, but the accuracy of these reports is difficult to verify. Social media data can be a useful tool for verifying this accuracy. As social media services have gained in popularity, they are used in a variety of ways, from posting about personal daily life to keeping up to date with news and trends. In recent years, rates of social media activity in Korea have reached unprecedented levels. Hundreds of millions of users now participate in online social networks and communicate their opinions and thoughts to colleagues and friends. In particular, Twitter is currently the major microblog service; its key function, the 'tweet', lets users report their current thoughts and actions, comment on news, and engage in discussions. For our analysis of IT trends, we chose tweet data because it not only produces massive unstructured textual data in real time but also serves as an influential channel for opinion leading on technology. Previous studies found that tweet data provides useful information and detects societal trends effectively; these studies also found that Twitter can track issues faster than the other medium, newspapers. Therefore, this study investigates how frequently the IT trends predicted for the following year by public organizations are mentioned on social network services such as Twitter.
IT trend predictions for 2013, announced near the end of 2012 by two domestic organizations, the National IT Industry Promotion Agency (NIPA) and the National Information Society Agency (NIA), were used as the basis for this research. The present study analyzes Twitter data generated in Seoul (Korea) and compares it with the predictions of the two organizations to analyze the differences. Twitter data analysis requires various natural language processing techniques, including stop-word removal and noun extraction, to process various unrefined forms of unstructured data. To overcome these challenges, we used SAS IRS (Information Retrieval Studio), developed by SAS, to capture trends by processing big stream datasets of Twitter in real time. The system offers a framework for crawling, normalizing, analyzing, indexing, and searching tweet data. As a result, we crawled the entire Twitter sphere in the Seoul area and obtained 21,589 tweets from 2013 to review how frequently the IT trend topics announced by the two organizations were mentioned by people in Seoul. The results show that most of the IT trends predicted by NIPA and NIA were frequently mentioned on Twitter, except for some topics such as 'new types of security threat', 'green IT', and 'next generation semiconductor'; because these topics are non-generalized compound words, they may be mentioned on Twitter in other words. To answer whether the IT trend tweets from Korea are related to the following year's IT trends in the real world, we compared Twitter's trending topics with those in Nara Market, Korea's online e-procurement system, a nationwide web-based system that handles the whole procurement process of all public organizations in Korea. The correlation analysis shows that tweet frequencies for the IT trend topics predicted by NIPA and NIA are significantly correlated with the frequencies of IT topics mentioned in project announcements on Nara Market in 2012 and 2013.
The main contributions of our research are as follows: i) the IT topic predictions announced by NIPA and NIA can provide an effective guideline for IT professionals and researchers in Korea who are looking for verified IT topic trends for the following year, and ii) researchers can use Twitter to obtain useful ideas for detecting and predicting dynamic trends in technological and social issues.
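
The correlation analysis between tweet frequencies and procurement announcement frequencies can be sketched as follows. The abstract does not say which coefficient was used, so Pearson's r is an assumption here, and the per-topic counts are made-up numbers rather than data from the study.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-topic counts: tweets vs. procurement announcements.
tweet_counts = [120, 45, 80, 10]
announcement_counts = [30, 12, 25, 4]
r = pearson_r(tweet_counts, announcement_counts)
```

A value of r close to 1 would indicate that topics mentioned often on Twitter also appear often in announcements; judging significance would additionally require a test that this sketch omits.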

Assessment of the Contribution of Weather, Vegetation and Land Use Change for Agricultural Reservoir and Stream Watershed using the SLURP model (II) - Calibration, Validation and Application of the Model - (SLURP 모형을 이용한 기후, 식생, 토지이용변화가 농업용 저수지 유역과 하천유역에 미치는 기여도 평가(II) - 모형의 검·보정 및 적용 -)

  • Park, Geun-Ae;Ahn, So-Ra;Park, Min-Ji;Kim, Seong-Joon
    • KSCE Journal of Civil and Environmental Engineering Research / v.30 no.2B / pp.121-135 / 2010
  • This study assesses the effect of potential future climate change on the inflow of agricultural reservoirs and its impact on downstream streamflow through reservoir operation for paddy irrigation water supply, using the SLURP model. Before the future analysis, the SLURP model was calibrated using 6 years of daily streamflow records (1998-2003) and validated using 3 years of streamflow data (2004-2006) for a 366.5 $km^2$ watershed including two agricultural reservoirs (Geumgwang and Gosam) located in the Anseongcheon watershed. The calibration and validation results showed that the model was able to simulate daily streamflow well when reservoir operation for paddy irrigation and flood discharge was considered, with a coefficient of determination and Nash-Sutcliffe efficiency ranging from 0.7 to 0.9 and 0.5 to 0.8, respectively. Then, the potential future climate change impact was assessed: the future weather data were downscaled by the Change Factor method through bias correction, the future land uses were predicted by a modified CA-Markov technique, and the future vegetation cover information was predicted and considered through a linear regression between monthly NDVI from NOAA AVHRR images and monthly mean temperature. The future (2020s, 2050s, and 2080s) reservoir inflow, the temporal changes of reservoir storage, and the impact on downstream streamflow in the watershed were analyzed for the A2 and B2 climate change scenarios relative to a base year (2005). At an annual temporal scale, the reservoir inflow and storage of each agricultural reservoir were projected to decrease greatly in autumn under all possible combinations of conditions. The future streamflow, soil moisture, and groundwater recharge decreased slightly, whereas evapotranspiration was projected to increase largely for all possible combinations of the conditions.
Finally, this study analysed the contribution of weather, vegetation, and land use change to assess which factor has the biggest impact on the agricultural reservoir and stream watershed. As a result, weather change had the biggest impact on agricultural reservoir inflow, storage, streamflow, evapotranspiration, soil moisture, and groundwater recharge.
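
The two goodness-of-fit measures named in this abstract, the coefficient of determination and the Nash-Sutcliffe efficiency, can be computed as in the sketch below. It uses the standard textbook definitions (R² taken as the squared Pearson correlation), which is an assumption about how the authors computed them, and the streamflow values are hypothetical.

```python
import math


def nash_sutcliffe(observed, simulated):
    """NSE = 1 - sum((obs - sim)^2) / sum((obs - mean(obs))^2)."""
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    return 1 - residual / variance


def coefficient_of_determination(observed, simulated):
    """R^2 computed as the squared Pearson correlation."""
    mean_o = sum(observed) / len(observed)
    mean_s = sum(simulated) / len(simulated)
    cov = sum((o - mean_o) * (s - mean_s) for o, s in zip(observed, simulated))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_s = sum((s - mean_s) ** 2 for s in simulated)
    return cov ** 2 / (var_o * var_s)


# Hypothetical daily streamflow values (m^3/s), not data from the study.
observed_flow = [1.0, 3.0, 2.0, 5.0, 4.0]
doubled_flow = [2 * q for q in observed_flow]  # perfectly correlated but biased

r2_biased = coefficient_of_determination(observed_flow, doubled_flow)
nse_biased = nash_sutcliffe(observed_flow, doubled_flow)
```

The biased series shows why both measures are usually reported together: R² stays at 1 for any linearly related simulation, while the Nash-Sutcliffe efficiency penalizes the systematic overestimate.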

The Effect of Online Multiple Channel Marketing by Device Type (디바이스 유형을 고려한 온라인 멀티 채널 마케팅 효과)

  • Hajung Shin;Kihwan Nam
    • Information Systems Review / v.20 no.4 / pp.59-78 / 2018
  • With the advent of various device types and marketing communications, customers' search and purchase behaviors have become more complex and segmented. However, extant research on the multichannel marketing effects of the purchase funnel has not reflected the specific features of device User Interface (UI) and User Experience (UX). In this study, we analyzed the marketing channel effects of multi-device shoppers using a unique clickstream dataset from global online retailers. We examined the device types that activate online shopping and compared the differences between the marketing channels that promote visits. In addition, we estimated the direct and indirect effects on visits and purchase revenue through customers' accumulated experience and channel conversions. The findings indicate that the same customer selects a different marketing channel depending on the device used. These results can help retailers gain a better understanding of customers' decision-making processes in a multi-channel marketing environment and devise an optimal strategy that takes various device types into account. Our empirical analyses yield business implications based on significant results from global big data analytics and contribute an academically meaningful theoretical framework using an economic model. We also provide strategic insights of practical value to online marketing managers.

A Study on a Quantitative Method in Estimating Forest Effects for Streamflow Regulation (II) - Mainly Dealing with Application of Coefficient for Slope Roughness - (삼림이수기능(森林理水機能)의 정량적(定量的) 평가방법(平價方法)에 관한 연구(硏究)(II) - 조도계수(粗度係數)의 응용(應用)을 중심(中心)으로 -)

  • Lee, Heon Ho
    • Journal of Korean Society of Forest Science / v.81 no.4 / pp.337-345 / 1992
  • In this research, a kinematic wave model was applied to runoff analysis, and the regulation of streamflow was estimated by calibrating the roughness coefficient as a parameter. The data analyzed were obtained from the Ananomiya and Shirasaka experimental basins at the Tokyo University Forest in Aichi. The estimation methods and characteristics of the roughness coefficient as an evaluation method for the hydrological function of forests are summarized as follows: 1. The roughness coefficient ($N_s$) indicates the resistance of the hillslope to the flowing water of surface runoff. It is hypothesized that the resistance of the hillslope to flowing water increases with forest growth and the development of the $A_o$ layer. 2. The roughness coefficient ($N_s$) was estimated as the parameter used when stream direct runoff was calculated with the kinematic wave model. 3. The secular change of $N_s$ in Ananomiya follows a curve that has an upper limit and increases exponentially near the limit. The curve rose quickly from 1935 to 1945, when the results of afforestation for erosion control are thought to have taken effect. On the other hand, the slight increase of $N_s$ in Shirasaka indicates that there was no such large change in the surface soil layer. 4. The increase of $N_s$ was related to a decrease in direct runoff and an increase in base flow. The rate of direct runoff decreased and the rate of base flow increased as forest physiognomy improved. However, the absolute value of water runoff per storm decreased in chronological order.

The Effects of pH Change in Extraction Solution on the Heavy Metals Extraction from Soil and Controversial Points for Partial Extraction in Korean Standard Method (용출액의 pH 변화가 토양내 중금속 용출에 미치는 영향과 그에 따른 국내 토양 오염 공정시험방법의 문제점)

  • 오창환;유연희;이평구;이영엽
    • Economic and Environmental Geology / v.36 no.3 / pp.159-170 / 2003
  • Heavy metals were extracted from Chonju stream sediment, roadside soils and sediments along the Honam expressway, and soils and tailings from a mining area using three different methods (partial extraction in the Standard Method, a partial extraction method that maintains the extraction solution at 0.1 N, and the Sequential Extraction Method). In samples with buffer capacity against acid, pH 1 (0.1 N HCl) of the extraction solution cannot be maintained, and the pH of the extraction solution increases up to 8.0 when partial extraction in the Standard Method is used. The averages and ranges of HPE (heavy metals extracted using partial extraction in the Standard Method)/HPEM (heavy metals extracted using the partial extraction method that maintains the extraction solution at 0.1 N) values are 0.479 and 0.145~0.929 for Cd, 0.534 and 0.078~0.928 for Zn, 0.432 and 0.041~0.992 for Mn, 0.359 and 0.011~0.874 for Cu, 0.150 and 0.018~0.530 for Cr, 0.219 and 0.003~0.853 for Pb, and 0.088 and $1.73{\times}10^{-5}$~0.303 for Fe. These data indicate that the difference between HPE and HPEM is large in the order of Fe, Cr, Pb, Cu, Mn, Cd, and Zn. The amounts of heavy metals extracted decrease in the following order: Sum III (sum of fractions I, II, and III in sequential extraction)>HPEM>Sum II (sum of fractions I and II)>HPE for Zn, Cd, and Mn, and Sum III>HPEM>HPE for Cr and Fe. In the case of Cr, Sum II is lower than HPEM and higher than HPE. In the case of Cu, the amount of extracted heavy metals is large in the order Sum IV>HPEM>Sum III HPE. The HPE/HPEM value decreases as the amount of HCl used to maintain the extraction solution at 0.1 N increases. For samples with high buffer capacity, the HPE/HPEM value for all elements is lower than 0.2. On the other hand, for samples with low buffer capacity, HPE/HPEM values are over 0.2, and many samples have values higher than 0.6 for Zn, Cd, Mn, and Cu owing to the small difference between Sum II and Sum III and their relatively higher mobility.
However, for Fe and Cr, the HPE/HPEM value is below 0.2 even for samples with low buffer capacity because of their low mobility and the large difference between Sum II and Sum III. This study indicates that the partial extraction method in the Korean Standard Method for soil is not suitable for assessing soil contamination in areas where the buffer capacity of soil can be decreased or lost through long-term exposure to environmental damage such as acid rain.