• Title/Summary/Keyword: Data-driven Modeling

Twitter Issue Tracking System by Topic Modeling Techniques (토픽 모델링을 이용한 트위터 이슈 트래킹 시스템)

  • Bae, Jung-Hwan;Han, Nam-Gi;Song, Min
    • Journal of Intelligence and Information Systems / v.20 no.2 / pp.109-122 / 2014
  • People nowadays create a tremendous amount of data on Social Network Services (SNS). In particular, the incorporation of SNS into mobile devices has resulted in massive data generation, thereby greatly influencing society. This phenomenon is unmatched in history, and we now live in the Age of Big Data. SNS data qualifies as Big Data because it satisfies the conditions of data amount (volume), data input and output speed (velocity), and variety of data types (variety). Discovering the trend of an issue in SNS Big Data can serve as an important new source of value creation because this information covers the whole of society. In this study, a Twitter Issue Tracking System (TITS) is designed and built to meet the need for analyzing SNS Big Data. TITS extracts issues from Twitter texts and visualizes them on the web. The proposed system provides the following four functions: (1) providing the topic keyword set that corresponds to the daily ranking; (2) visualizing the daily time-series graph of a topic over a month; (3) conveying the importance of a topic through a treemap based on the score system and frequency; and (4) visualizing the daily time-series graph of a keyword retrieved by keyword search. The present study analyzes the Big Data generated by SNS in real time. SNS Big Data analysis requires various natural language processing techniques, including stop-word removal and noun extraction, to process unrefined, unstructured data. In addition, such analysis requires the latest big data technology to rapidly process large amounts of real-time data, such as the Hadoop distributed system or NoSQL, an alternative to relational databases. We built TITS on Hadoop to optimize the processing of big data because Hadoop is designed to scale from a single node to thousands of machines.
Furthermore, we use MongoDB, which is classified as a NoSQL database. MongoDB is an open-source, document-oriented database that provides high performance, high availability, and automatic scaling. Unlike existing relational databases, MongoDB has no fixed schema or tables, and its most important goals are data accessibility and data processing performance. In the Age of Big Data, visualization is attractive to the Big Data community because it helps analysts examine such data easily and clearly. Therefore, TITS uses the d3.js library as its visualization tool. This library is designed for creating Data-Driven Documents that bind the document object model (DOM) to arbitrary data; interaction with the data is easy, and the library is useful for managing real-time data streams with smooth animation. In addition, TITS uses Bootstrap, a set of pre-configured style sheets and JavaScript plug-ins, to build the web system. The TITS Graphical User Interface (GUI) is designed using these libraries and can detect issues on Twitter in an easy and intuitive manner. The proposed work demonstrates the superiority of our issue detection techniques by matching detected issues with corresponding online news articles. The contributions of the present study are threefold. First, we suggest an alternative approach to real-time big data analysis, which has become an extremely important issue. Second, we apply a topic modeling technique that is used in various research areas, including Library and Information Science (LIS). Based on this, we can confirm the utility of storytelling and time series analysis. Third, we develop a web-based system and make it available for the real-time discovery of topics. The present study conducted experiments with nearly 150 million tweets in Korea during March 2013.
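
The preprocessing and daily-ranking steps described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: plain word-frequency counting stands in for the topic model, and the stop-word list, the function name `daily_keyword_ranking`, and the sample tweets are all assumptions made up for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical stop-word list; a real system would use a language-specific one.
STOP_WORDS = {"the", "a", "is", "in", "on", "of", "and", "to"}


def daily_keyword_ranking(tweets, top_n=3):
    """Rank keywords per day by frequency after stop-word removal.

    `tweets` is an iterable of (date, text) pairs. Frequency counting is a
    simplification that stands in for a topic model here.
    """
    counts = defaultdict(Counter)
    for date, text in tweets:
        # Lowercase each token and strip surrounding punctuation.
        tokens = [t.strip(".,!?#@").lower() for t in text.split()]
        counts[date].update(t for t in tokens if t and t not in STOP_WORDS)
    return {date: c.most_common(top_n) for date, c in counts.items()}


# Made-up sample data, not taken from the study.
tweets = [
    ("2013-03-01", "Election debate tonight, election news everywhere"),
    ("2013-03-01", "The debate is on"),
    ("2013-03-02", "Spring festival in Seoul"),
]
ranking = daily_keyword_ranking(tweets, top_n=2)
```

Each date's list could then feed a daily time-series graph or a treemap such as the ones the abstract mentions.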

Priority for the Investment of Artificial Rainfall Fusion Technology (인공강우 융합기술 개발을 위한 R&D 투자 우선순위 도출)

  • Lim, Jong Yeon;Kim, KwangHoon;Won, DongKyu;Yeo, Woon-Dong
    • The Journal of the Korea Contents Association / v.19 no.3 / pp.261-274 / 2019
  • This paper aims to develop a methodology for establishing an investment strategy for the 'demonstration of artificial rainfall technology using UAV', including a technology classification, a set of indicators for technology evaluation, and the selection of final key technologies across the whole study area. It is designed to complement the latest research trend analysis results and expert committee opinions with quantitative analysis. The key indicators for technology evaluation consisted of three major items (activity, technology, marketability) and 10 detailed indicators. An AHP questionnaire was conducted to analyze the importance of the indicators. The results showed that the attributes of the technology itself are most important, followed in order by closeness to the implementation of the core function (centrality) and feasibility. Among the 16 technology groups, the top investment priority groups were ground seeding, artificial rainfall verification, spreading and diffusion of seeding material, artificial rainfall numerical modeling, and UAV sensor technology.
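
As an illustration of how an AHP questionnaire yields indicator weights, the sketch below derives priority weights from a pairwise comparison matrix with the row geometric-mean method and checks Saaty's consistency ratio. The comparison values are hypothetical and are not the survey results of the paper; only the three criterion names come from the abstract.

```python
import math

# Saaty's random consistency index for small matrix sizes.
RANDOM_INDEX = {3: 0.58, 4: 0.90, 5: 1.12}


def ahp_weights(matrix):
    """Approximate AHP priority weights by the row geometric-mean method."""
    n = len(matrix)
    geo_means = [math.prod(row) ** (1.0 / n) for row in matrix]
    total = sum(geo_means)
    return [g / total for g in geo_means]


def consistency_ratio(matrix, weights):
    """Consistency ratio CR = CI / RI, with lambda_max estimated from A.w."""
    n = len(matrix)
    aw = [sum(matrix[i][j] * weights[j] for j in range(n)) for i in range(n)]
    lambda_max = sum(aw[i] / weights[i] for i in range(n)) / n
    ci = (lambda_max - n) / (n - 1)
    return ci / RANDOM_INDEX[n]


criteria = ["activity", "technology", "marketability"]
# Hypothetical judgments: entry [i][j] says how much more important
# criterion i is than criterion j (reciprocal matrix).
pairwise = [
    [1.0, 1 / 3, 1 / 2],
    [3.0, 1.0, 2.0],
    [2.0, 1 / 2, 1.0],
]
weights = ahp_weights(pairwise)
cr = consistency_ratio(pairwise, weights)
```

A consistency ratio below 0.1 is the usual threshold for accepting a set of judgments.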

A Study on Data-driven Modeling Employing Stratification-related Physical Variables for Reservoir Water Quality Prediction (취수원 수질예측을 위한 성층 물리변수 활용 데이터 기반 모델링 연구)

  • Hyeon June Jang;Ji Young Jung;Kyung Won Joo;Choong Sung Yi;Sung Hoon Kim
    • Proceedings of the Korea Water Resources Association Conference / 2023.05a / pp.143-143 / 2023
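  • Recently, manganese concentrations exceeding the drinking water quality standard (0.05 mg/L or less) have occurred at regional water intake sources such as Daecheong Dam (2017) and Pyeongnim Dam (2019), prompting numerous civil complaints and highlighting the importance of manganese management at intake sources. In particular, high manganese concentrations often occur during the winter turnover period; water treatment plants currently handle manganese by installing filters in the inflow section and replacing them periodically. However, when a large amount of high-concentration manganese flows in over a short period, process control at the plant becomes difficult because of limits on treatment capacity, so the response system needs to be upgraded with advance prediction. This study applied and compared various machine learning techniques for Juam Dam, a regional water intake source, to improve the accuracy of manganese prediction and extend the prediction horizon, and improved model accuracy by optimizing the independent variables and hyperparameters. The machine learning models used measured values as training data, with independent variables including depth-wise turbidity, reservoir water level, pH, water temperature, electrical conductivity, DO, chlorophyll-a, and meteorological and hydrological data, and with the manganese concentration flowing into the Hwasun water treatment plant as the dependent variable. To improve the accuracy of the data-driven model, the Potential Energy Anomaly (PEA) was introduced as an indicator of the degree of stratification and used in the data analysis. The analysis confirmed that the manganese inflow concentration varies with the seasonal cycle, and that high-concentration manganese from the bottom layer flows in at the time of winter turnover and when turbulence is generated during the summer monsoon period. Because the patterns of change in manganese concentration differ between the two periods, building and training a separate prediction model for each season improved prediction accuracy. A performance comparison of various machine learning models showed that Gradient Boosting Machine performed best in winter and eXtreme Gradient Boosting in summer, so these were adopted as the inference models. In short-term water quality prediction with the selected models, the ability to track and predict the timing of turnover events improved prediction efficiency by about 15% or more compared with applying only the existing data model. By predicting manganese concentration with machine learning models, this study is expected to support the establishment of a rapid response system at water treatment plants and help improve the efficiency of water treatment processes; follow-up research will improve model reliability by using historical time-series data and coupling with physical models.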
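
The abstract does not state how PEA is computed. A commonly used definition is φ = (1/H) ∫ (ρ̄ − ρ(z)) g z dz, integrated from the bottom (z = −H) to the surface (z = 0), where ρ̄ is the depth-averaged density. The sketch below evaluates it with the trapezoidal rule; the function names and the density profiles are assumptions made up for illustration, not data from the study.

```python
G = 9.81  # gravitational acceleration, m/s^2


def _trapz(y, x):
    """Trapezoidal integration of samples y over coordinates x."""
    return sum(
        0.5 * (y[i] + y[i + 1]) * (x[i + 1] - x[i]) for i in range(len(x) - 1)
    )


def potential_energy_anomaly(z, rho):
    """PEA in J/m^3 for a density profile.

    z   : elevations in metres, increasing from -H (bottom) to 0 (surface)
    rho : water density in kg/m^3 at each elevation
    """
    depth = z[-1] - z[0]
    rho_mean = _trapz(rho, z) / depth
    integrand = [(rho_mean - r) * G * zi for r, zi in zip(rho, z)]
    return _trapz(integrand, z) / depth


# Hypothetical profiles for a 20 m deep water column.
z = [-20.0, -15.0, -10.0, -5.0, 0.0]
mixed = [998.0] * 5                                # fully mixed column
stratified = [999.8, 999.5, 998.5, 997.5, 997.0]   # denser water at depth

pea_mixed = potential_energy_anomaly(z, mixed)
pea_stratified = potential_energy_anomaly(z, stratified)
```

Under this definition a fully mixed column gives a PEA of zero and a stably stratified column gives a positive value, so a drop toward zero is one possible signal of turnover that a model could use as a feature.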

A Review of Urban Flooding: Causes, Impacts, and Mitigation Strategies (도시 홍수: 원인, 영향 및 저감 전략 고찰)

  • Jin-Yong Lee
    • The Journal of Engineering Geology / v.33 no.3 / pp.489-502 / 2023
  • Urban floods pose significant challenges to cities worldwide, driven by the interplay between urbanization and climate change. This review examines recent studies of urban floods to understand their causes, impacts, and potential mitigation strategies. Urbanization, with its increase in impermeable surfaces and altered drainage patterns, disrupts natural water flow, exacerbating surface runoff during intense rainfall events. The impacts of urban floods are far-reaching, affecting lives, infrastructure, the economy, and the environment. Loss of life, property damage, disruptions to critical services, and environmental consequences underscore the urgency of effective urban flood management. To mitigate urban floods, integrated flood management strategies are crucial. Sustainable urban planning, green infrastructure, and improved drainage systems play pivotal roles in reducing flood vulnerabilities. Early warning systems, emergency response planning, and community engagement are essential components of flood preparedness and resilience. Looking to the future, climate change projections indicate increased flood risks, necessitating resilience and adaptation measures. Advances in research, data collection, and modeling techniques will enable more accurate flood predictions, thus guiding decision-making. In conclusion, urban flooding demands urgent attention and comprehensive strategies to protect lives, infrastructure, and the economy.

Machine Learning Framework for Predicting Voids in the Mineral Aggregation in Asphalt Mixtures (아스팔트 혼합물의 골재 간극률 예측을 위한 기계학습 프레임워크)

  • Hyemin Park;Ilho Na;Hyunhwan Kim;Bongjun Ji
    • Journal of the Korean Geosynthetics Society / v.23 no.1 / pp.17-25 / 2024
  • The Voids in the Mineral Aggregate (VMA) within asphalt mixtures play a crucial role in defining the mixture's structural integrity, durability, and resistance to environmental factors. Accurate prediction and optimization of VMA are essential for enhancing the performance and longevity of asphalt pavements, particularly in varying climatic and environmental conditions. This study introduces a novel machine learning framework that leverages an ensemble machine learning model to predict VMA in asphalt mixtures. By analyzing a comprehensive set of variables, including aggregate size distribution, binder content, and compaction levels, our framework offers a more precise prediction of VMA than traditional single-model approaches. The use of advanced machine learning techniques not only surpasses the accuracy of conventional empirical methods but also significantly reduces the reliance on extensive laboratory testing. Our findings highlight the effectiveness of a data-driven approach in the field of asphalt mixture design, showcasing a path toward more efficient and sustainable pavement engineering practices. This research contributes to the advancement of predictive modeling in construction materials, offering valuable insights for the design and optimization of asphalt mixtures with optimal void characteristics.

Impact of pore fluid heterogeneities on angle-dependent reflectivity in poroelastic layers: A study driven by seismic petrophysics

  • Ahmad, Mubasher;Ahmed, Nisar;Khalid, Perveiz;Badar, Muhammad A.;Akram, Sohail;Hussain, Mureed;Anwar, Muhammad A.;Mahmood, Azhar;Ali, Shahid;Rehman, Anees U.
    • Geomechanics and Engineering / v.17 no.4 / pp.343-354 / 2019
  • The present study demonstrates the application of seismic petrophysics and amplitude versus angle (AVA) forward modeling to identify reservoir fluids and to discriminate their saturation levels and natural gas composition. Two case studies of the Lumshiwal Formation (mainly sandstone) of Lower Cretaceous age are examined, from the Kohat Sub-basin and the Middle Indus Basin of Pakistan. The conventional angle-dependent reflection amplitudes, namely P converted to P ($R_{PP}$) and S ($R_{PS}$) and S converted to S ($R_{SS}$) and P ($R_{SP}$), and newly developed AVA attributes (${\Delta}R_{PP}$, ${\Delta}R_{PS}$, ${\Delta}R_{SS}$ and ${\Delta}R_{SP}$) are analyzed at different gas saturation levels in the reservoir rock. These attributes are generated by taking the difference between the water-wet reflection coefficient and the reflection coefficient at unknown gas saturation. Intercept (A) and gradient (B) attributes are also computed and cross-plotted for different gas compositions and gas/water scenarios to define the AVO class of the reservoir sands. The numerical simulation reveals that ${\Delta}R_{PP}$, ${\Delta}R_{PS}$, ${\Delta}R_{SS}$ and ${\Delta}R_{SP}$ are good indicators that distinguish low from high gas saturation with a higher level of confidence than conventional reflection amplitudes such as P-P, P-S, S-S and S-P. In A-B cross-plots, the gas lines move towards the fluid (wet) lines as the proportion of heavier gases increases in the Lumshiwal Sands. Because of the upper contacts with different sedimentary rocks (shale/limestone) in the two wells, the same reservoir sand exhibits different responses, resembling AVO classes I and IV. This study will help to analyze gas sands by using amplitude-based attributes as direct gas indicators in future gas wells drilled in clastic successions.
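
The abstract does not say how the intercept and gradient are obtained. One common route, assumed here, is a least-squares fit of the two-term approximation R(θ) ≈ A + B sin²θ to the P-P reflectivity. The sketch below also forms a difference attribute in the way the abstract describes (water-wet coefficient minus the coefficient at a given gas saturation). All numbers are synthetic and the function names are hypothetical.

```python
import math


def fit_intercept_gradient(angles_deg, reflectivity):
    """Least-squares fit of R(theta) ~ A + B * sin^2(theta); returns (A, B)."""
    x = [math.sin(math.radians(a)) ** 2 for a in angles_deg]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(reflectivity) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, reflectivity))
    gradient = sxy / sxx
    intercept = mean_y - gradient * mean_x
    return intercept, gradient


def delta_attribute(r_wet, r_gas):
    """Difference attribute: water-wet minus gas-case reflection coefficient."""
    return [w - g for w, g in zip(r_wet, r_gas)]


# Synthetic reflectivity curves generated from assumed (A, B) pairs.
angles = [0, 5, 10, 15, 20, 25, 30]
s2 = [math.sin(math.radians(a)) ** 2 for a in angles]
r_wet = [0.02 - 0.05 * s for s in s2]
r_gas = [-0.05 - 0.10 * s for s in s2]

a_gas, b_gas = fit_intercept_gradient(angles, r_gas)
delta_r_pp = delta_attribute(r_wet, r_gas)
```

Plotting each scenario's (A, B) pair against the others gives the kind of A-B cross-plot used to assign an AVO class.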

Application of InVEST Water Yield Model for Assessing Forest Water Provisioning Ecosystem Service (산림의 수자원 공급 생태계서비스 평가를 위한 InVEST Water Yield 모형의 적용)

  • Song, Chol-Ho;Lee, Woo-Kyun;Choi, Hyun-Ah;Jeon, Seong-Woo;Kim, Jae-Uk;Kim, Joon-Soon;Kim, Jung-Taek
    • Journal of the Korean Association of Geographic Information Studies / v.18 no.1 / pp.120-134 / 2015
  • The InVEST Water Yield model developed by the Natural Capital Project was applied to South Korea to assess the water provisioning services of domestic forest ecosystems. The model requires eight input datasets: six spatial map datasets and two derived from coefficients. Running the model with relatively easily acquired and modified data, the water provisioning service of domestic forest ecosystems was estimated at 9,409,622,083 tons for the reference year 2011. The result showed patterns and a distribution similar to those of rainfall in 2011, but differed from existing studies based on nationwide statistical analyses. This difference is presumed to arise from the different mechanisms of spatially explicit modeling and statistical analysis. Given that the model is still under development, its results should be interpreted qualitatively rather than quantitatively. In addition, to advance the application of the InVEST Water Yield model, suitable input data need to be quantified and results compared using multiple models.
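
For orientation, the per-pixel calculation behind a water yield model of this kind can be sketched as Y = (1 − AET/P) · P, with the evapotranspiration fraction AET/P taken from a Budyko-type curve. The Fu form used below and every input value are assumptions for illustration; they are not the parameters or data of the study.

```python
def aet_fraction(precip, pet, omega):
    """AET/P from a Fu-type Budyko curve.

    precip : annual precipitation (mm)
    pet    : annual potential evapotranspiration (mm)
    omega  : dimensionless shape parameter (assumed given here)
    """
    ratio = pet / precip
    return 1.0 + ratio - (1.0 + ratio ** omega) ** (1.0 / omega)


def water_yield(precip, pet, omega):
    """Annual water yield (mm) for one pixel: Y = (1 - AET/P) * P."""
    return (1.0 - aet_fraction(precip, pet, omega)) * precip


# Hypothetical pixel values.
yield_forest = water_yield(precip=1300.0, pet=800.0, omega=2.5)
yield_no_et = water_yield(precip=1300.0, pet=0.0, omega=2.5)
```

Summing the per-pixel yield multiplied by pixel area over all forest pixels would give a national total comparable in kind to the figure reported in the abstract.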

Acoustic 2-D Full-waveform Inversion with Initial Guess Estimated by Traveltime Tomography (주시 토모그래피와 음향 2차원 전파형 역산의 적용성에 관한 연구)

  • Han Hyun Chul;Cho Chang Soo;Suh Jung Hee;Lee Doo Sung
    • Geophysics and Geophysical Exploration / v.1 no.1 / pp.49-56 / 1998
  • Seismic tomography has been widely used as a high-resolution subsurface imaging technique in engineering applications. Although most techniques rely on traveltime inversion, waveform methods are being pushed forward thanks to progress in computing environments. Full-waveform inversion is known as the best method in terms of model resolving power, free of the high-frequency restriction and the weak-scattering approximation, but it has practical disadvantages: it is apt to get stuck in a local minimum if the initial guess is far from the actual model, and it is computationally expensive. In this study, a 2-D full-waveform inversion algorithm for acoustic media is developed that uses the result of traveltime tomography as the initial model. Application to synthetic data shows that this approach efficiently mitigates the problems of conventional approaches: the algorithm converges much faster and improves model resolution. Application to physical modeling data also shows considerable improvement. The algorithm is expected to be applicable to real data.

Innovation Patterns of Machine Learning and a Birth of Niche: Focusing on Startup Cases in the Republic of Korea (머신러닝 혁신 특성과 니치의 탄생: 한국 스타트업 사례를 중심으로)

  • Kang, Songhee;Jin, Sungmin;Pack, Pill Ho
    • The Journal of Society for e-Business Studies / v.26 no.3 / pp.1-20 / 2021
  • As the Great Reset is discussed at the World Economic Forum in response to the COVID-19 pandemic, artificial intelligence, the driving force of the 4th industrial revolution, is also in the spotlight. However, corporate research in the field of artificial intelligence is still scarce. Since 2000, related research has focused on how existing companies create value by applying artificial intelligence, and research on how startups seize opportunities and enter among existing businesses to create new value can hardly be found. Therefore, this study analyzed startup cases using the comprehensive framework of the multi-level perspective, with the research question of how artificial intelligence-based startups, a sub-industry of software, differ in their innovation patterns from the existing software industry. The target firms are gazelle firms certified as venture firms in South Korea: startups within 7 years of age that specialize in machine learning modeling, purposively sampled from the medical, finance, marketing/advertising, e-commerce, and manufacturing fields. The analysis shows that existing software companies achieved process innovation from an enterprise-wide integration perspective; in contrast, machine learning-based startups identified unit processes that were difficult to automate, or created value by dismantling existing processes, and then automated and optimized those processes based on data. The contribution of this study is to analyze the birth of artificial intelligence-based startups and their innovation patterns while validating the framework of an integrated multi-level perspective. In addition, since innovation is driven by data, the ability to respond to data-related regulations is emphasized even for startups, and the government needs to eliminate uncertainty in related systems to create a predictable and flexible business environment.

Improvement for Impact Assessment of Marine Physical on the Development of Ports and Fishing Harbors in the East Coast (동해안 항만 및 어항 개발사업에 따른 해양물리학적 영향평가 개선방안)

  • Kim, In-Cheol;Kim, Gui-Young;Jeon, Kyeong-Am;Eom, Ki-Hyuk;Yu, Jun;Lee, Dae-In;Kim, Young-Tae;Kim, Hee-Jung
    • Journal of the Korean Society of Marine Environment & Safety / v.19 no.2 / pp.111-118 / 2013
  • This paper suggests improvements to marine environmental impact assessment on the east coast by analyzing three years of consultations on coastal area utilization for the development of ports and fishing harbors there. Field surveys were reported in only 3 cases for ocean currents, 12 cases for waves, and 16 cases for sounding data. For the development of ports and fishing harbors on the east coast, however, marine environmental impact assessment should take into account ocean characteristics that differ from those of the West Sea and the South Sea. On the east coast, where the influences of ocean currents, wind-driven currents, and waves are dominant, the effect of these currents should be considered to improve the reproducibility of tidal currents. Waves should also be treated as an assessment criterion to establish the validity of a project in terms of harbor tranquility, the functionality of breakwaters, and stability. In addition, river sediment inflow and accurate water depth data should be applied in numerical modeling, and wave-induced currents should be set as an external force for sediment transport, in order to predict problems such as harbor siltation and coastal erosion in light of the ocean characteristics of the east coast.