• Title/Summary/Keyword: Model-Based Approach

An Artificial Intelligence Approach to Waterbody Detection of the Agricultural Reservoirs in South Korea Using Sentinel-1 SAR Images (Sentinel-1 SAR 영상과 AI 기법을 이용한 국내 중소규모 농업저수지의 수표면적 산출)

  • Choi, Soyeon;Youn, Youjeong;Kang, Jonggu;Park, Ganghyun;Kim, Geunah;Lee, Seulchan;Choi, Minha;Jeong, Hagyu;Lee, Yangwon
    • Korean Journal of Remote Sensing / v.38 no.5_3 / pp.925-938 / 2022
  • Agricultural reservoirs are an important water resource nationwide and are vulnerable to abnormal climate effects such as drought caused by climate change. Enhanced management is therefore required for their appropriate operation. Although water levels need to be tracked through continuous monitoring, on-site measurement and observation are challenging for practical reasons. This study presents an objective comparison of multiple AI models for water-body extraction using radar images, which have the advantages of wide coverage and frequent revisit times. The proposed methods use Sentinel-1 Synthetic Aperture Radar (SAR) images and, unlike common water-extraction methods based on optical images, are suitable for long-term monitoring because they are less affected by weather conditions. We built four AI models, Support Vector Machine (SVM), Random Forest (RF), Artificial Neural Network (ANN), and Automated Machine Learning (AutoML), using drone images, Sentinel-1 SAR, and DSM data. The study covers a total of 22 reservoirs of less than 1 million tons, including small and medium-sized reservoirs with an effective storage capacity of less than 300,000 tons. Forty-five images from the 22 reservoirs were used for model training and verification. The results show that the AutoML model was 0.01 to 0.03 better in water Intersection over Union (IoU) than the other three models, with Accuracy=0.92 and mIoU=0.81 in a test. AutoML thus performed as well as the classical machine learning methods, and the AutoML-based water-body extraction technique is expected to be applicable to automatic reservoir monitoring.
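
The abstract reports water IoU, mIoU, and Accuracy. As a minimal sketch of how such metrics are commonly computed for a binary water mask, the following uses toy label lists (1 = water, 0 = non-water). The function names and the toy data are illustrative assumptions, not taken from the paper.

```python
def iou(pred, truth, cls):
    """Intersection over Union for one class over two equal-length label lists."""
    inter = sum(1 for p, t in zip(pred, truth) if p == cls and t == cls)
    union = sum(1 for p, t in zip(pred, truth) if p == cls or t == cls)
    return inter / union if union else float("nan")


def mean_iou(pred, truth, classes=(0, 1)):
    """Unweighted mean of the per-class IoU values (mIoU)."""
    scores = [iou(pred, truth, c) for c in classes]
    return sum(scores) / len(scores)


def accuracy(pred, truth):
    """Fraction of pixels whose predicted label equals the reference label."""
    return sum(p == t for p, t in zip(pred, truth)) / len(truth)


# Toy flattened masks (hypothetical data): 1 = water, 0 = non-water.
truth = [1, 1, 1, 0, 0, 0, 0, 1]
pred = [1, 1, 0, 0, 0, 0, 1, 1]

water_iou = iou(pred, truth, 1)
miou = mean_iou(pred, truth)
acc = accuracy(pred, truth)
print(water_iou, miou, acc)
```

In this toy case three water pixels overlap out of five in the union, so the water IoU is 0.6, while six of eight pixels agree overall.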

Development of Traffic Volume Estimation System in Main and Branch Roads to Estimate Greenhouse Gas Emissions in Road Transportation Category (도로수송부문 온실가스 배출량 산정을 위한 간선 및 지선도로상의 교통량 추정시스템 개발)

  • Kim, Ki-Dong;Lee, Tae-Jung;Jung, Won-Seok;Kim, Dong-Sool
    • Journal of Korean Society for Atmospheric Environment / v.28 no.3 / pp.233-248 / 2012
  • National emissions from the energy sector accounted for 84.7% of all domestic emissions in 2007. Of the energy-use emissions, mobile sources, one of the key categories, accounted for 19.4%, and road transport made up the dominant portion of that category. Road transport emissions can be estimated on the basis of either the fuel consumed (Tier 1) or the distance travelled by vehicle type and road type (higher Tiers). The latter approach is suitable for simultaneously estimating $CO_2$, $CH_4$, and $N_2O$ emissions in local administrative districts. The objective of this study was to estimate GHG emissions from road transportation for the 31 municipalities of Gyeonggi Province, Korea. In 2008, the municipalities consisted of 2,014 towns, designated Dong and Ri, the smallest administrative district units. Since mobile sources move across city and province borders, emissions estimated from fuel sold cannot ensure consistency between neighbouring cities and provinces. On the other hand, for emissions estimated from distance travelled, it is practically impossible to acquire key activity data such as traffic volume, vehicle type and model, and road type in small towns. To solve the problem, we applied a hierarchical cluster analysis to separate town-by-town road patterns (clusters) based on a priori activity information, including traffic volume, population, area, and branch road length, obtained from 151 small towns. After identifying 10 road patterns, a rule-building expert system was developed in Visual Basic for Applications (VBA) to assign various unknown road patterns to one of the 10 known patterns. The expert system was self-verified with the original reference information, and the objects in each homogeneous pattern were then used to regress traffic volume on the variables of population, area, and branch road length. 
The program was then applied to assign all the unknown towns to a known pattern and to automatically estimate traffic volumes for each town using the regression equations. Further, VKT (vehicle kilometers travelled) for each vehicle type in each town was calculated and mapped with a GIS (geographic information system), and the road transport emission on the corresponding road section was estimated by multiplying VKT by the emission factor for each vehicle type. Finally, all emissions from local branch roads in Gyeonggi Province could be estimated by summing the emissions from the 1,902 towns where road information was registered. As a result of the study, the average GHG emission rate from branch road transport was 6,101 kilotons of $CO_2$ equivalent per year (kt-$CO_2$ Eq/yr), and the total emissions from both main and branch roads were 24,152 kt-$CO_2$ Eq/yr in Gyeonggi Province. The ratio of branch road emissions to the total was 0.28 in 2008.
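
The distance-travelled calculation described above (VKT per vehicle type multiplied by an emission factor, then summed) can be sketched as follows. This is only an illustration: the emission factors, traffic volumes, and road length are hypothetical placeholders, and computing VKT as daily traffic volume times road length times days is an assumption about the usual definition, not a detail confirmed by the abstract.

```python
# Hypothetical emission factors in kg CO2-eq per vehicle-km (placeholders, not from the study).
EMISSION_FACTORS = {"passenger_car": 0.2, "bus": 1.0, "truck": 0.8}


def vkt(daily_traffic_volume, road_length_km, days=365):
    """Annual vehicle kilometers travelled for one vehicle type on one road section."""
    return daily_traffic_volume * road_length_km * days


def town_emission_kt(vkt_by_type, factors):
    """Annual emission of one town in kilotons of CO2-eq (kg converted to kt)."""
    kg = sum(vkt_by_type[v] * factors[v] for v in vkt_by_type)
    return kg / 1e6


# Hypothetical town: 10 km of branch road with assumed daily volumes per vehicle type.
daily_volume = {"passenger_car": 1000, "bus": 50, "truck": 100}
town_vkt = {v: vkt(n, road_length_km=10) for v, n in daily_volume.items()}
total_kt = town_emission_kt(town_vkt, EMISSION_FACTORS)
print(round(total_kt, 4))
```

A province-level figure would then be the sum of `town_emission_kt` over all towns with registered road information.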

Interpretation of the Umbrella Clause in Investment Treaties (국제투자조약상 포괄적 보호조항(Umbrella Clauses)의 해석에 관한 연구)

  • Jo, Hee-Moon
    • Journal of Arbitration Studies / v.19 no.2 / pp.95-126 / 2009
  • One of the controversial issues in investor-state investment arbitration is the interpretation of the "umbrella clause" found in most BITs and FTAs. This treaty clause requires one Contracting State to observe all investment obligations entered into with foreign investors from the other Contracting State. The clause did not receive in-depth attention until the SGS v. Pakistan and SGS v. Philippines cases produced starkly different conclusions on the relationship between treaty-based jurisdiction and contract-based jurisdiction. More recent decisions by other arbitral tribunals continue to show different approaches to the interpretation of umbrella clauses. Following the SGS v. Philippines decision, some recent decisions hold that all contracts are covered by the umbrella clause, for example, Siemens A.G. v. Argentina, LG&E Energy Corp. v. Argentina, Sempra Energy Int'l v. Argentina and Enron Corp. v. Argentina. However, other recent decisions have taken a different approach, holding that only certain kinds of public contracts are covered by umbrella clauses, for example, El Paso Energy Int'l Co. v. Argentina, Pan American Energy LLC v. Argentina and CMS Gas Transmission Co. v. Argentina. With regard to the exhaustion of domestic remedies, most tribunals take the position that contractual remedies should not affect the jurisdiction of a BIT tribunal. Some tribunals have even considered that there is no need to exhaust contractual remedies before bringing BIT arbitration, raising doubts about the sanctity of contract in the face of treaty obligations. The decision of the Annulment Committee in the CMS case in 2007 was an extraordinarily surprising one and added fuel to the debate. 
The Committee, composed of three respected international lawyers (Gilbert Guillaume and Nabil Elaraby, both from the ICJ, and Professor James Crawford, the Rapporteur of the International Law Commission on the Draft Articles on the Responsibility of States for Internationally Wrongful Acts), observed that the arbitral tribunal had made critical errors of law, while noting that it had limited power to review and overturn the award. The position of the Committee was a direct attack on the ICSID system, amounting to an internal recognition by ICSID itself that the current system of investor-state arbitration is problematic. States are coming to limit the scope of umbrella clauses. For example, the 2004 U.S. Model BIT gives a detailed definition of the types of contracts for which breach-of-contract claims may be submitted to arbitration, to increase certainty and predictability. Latin American countries, Argentina in particular, collectively feel that they are victims of these pro-investor interpretations by ICSID tribunals. In fact, BITs between developed and developing countries are negotiated to protect foreign investment from developing countries. This general characteristic of BITs is naturally reflected in their provisions, making them extremely protective of foreign investors. Naturally, developing countries seek to interpret BIT provisions restrictively, whereas developed countries try to interpret them more expansively. As most cases arising out of alleged violations of BITs are administered by the ICSID, a forum under the auspices of the World Bank, these Latin American countries have been raising the issue of the ICSID's legitimacy deficit. The Argentine cases have raised many issues of international law and point to a looming crisis in the current investor-state arbitration system. 
Some Latin American countries, such as Bolivia, Venezuela, Ecuador, and Argentina, have already shown their dissatisfaction with the ICSID system and are considering withdrawing from it to minimize possible investor-state disputes. Thus the disagreement over the interpretation of umbrella clauses is coming to be seen as a historical reflection of the continued tension between developing and developed countries over foreign investment. There is an academic and political discussion on the possible return of the Calvo Doctrine in Latin America. The paper comments on these problems related to the interpretation of the umbrella clause. It analyses ICSID cases, principally those involving Latin American countries, to identify the critical legal issues arising between developing and developed countries. Finally, it discusses alternatives for improving the current investor-State investment arbitration system, inter alia, the introduction of an appellate system and treaty interpretation rules.

Data Mining Approaches for DDoS Attack Detection (분산 서비스거부 공격 탐지를 위한 데이터 마이닝 기법)

  • Kim, Mi-Hui;Na, Hyun-Jung;Chae, Ki-Joon;Bang, Hyo-Chan;Na, Jung-Chan
    • Journal of KIISE:Information Networking / v.32 no.3 / pp.279-290 / 2005
  • As the damage caused by DDoS attacks grows more serious, rapid detection and proper response mechanisms are urgently needed. However, existing security mechanisms do not defend effectively against these attacks, or their defense capability is limited to specific DDoS attacks. In this paper, we propose a DDoS detection architecture based on data mining technology that can classify the latest types of DDoS attack and can detect modifications of existing attacks as well as novel attacks. The architecture consists of a Misuse Detection Module, which models and classifies existing attacks, and an Anomaly Detection Module, which models and detects novel attacks. It uses models generated off-line to detect DDoS attacks in real-time traffic. We gathered NetFlow data generated at an access router of our network in order to model and test real network traffic. NetFlow provides useful flow-based statistical information without heavy preprocessing. We also ran well-known DDoS attack tools to gather attack traffic. Our experimental results show that the approach performs very well against existing attacks and can potentially detect novel attacks.
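
The two-module structure described above (a misuse stage for known attacks followed by an anomaly stage for novel ones) can be illustrated with a deliberately simple sketch. The abstract does not say which data mining models the authors used, so the nearest-centroid rule, the distance threshold, the two normalized flow features, and the attack label below are all assumptions made for illustration only.

```python
import math


def centroid(rows):
    """Component-wise mean of a list of equal-length feature tuples."""
    return tuple(sum(col) / len(rows) for col in zip(*rows))


class TwoStageDetector:
    """Toy detector: misuse stage (known attacks) first, anomaly stage second."""

    def __init__(self, known_attacks, normal_flows, threshold):
        # Models are built off-line from labelled flow records.
        self.attack_centroids = {k: centroid(v) for k, v in known_attacks.items()}
        self.normal_centroid = centroid(normal_flows)
        self.threshold = threshold

    def classify(self, flow):
        # Misuse stage: report the closest known attack if it is near enough.
        label, dist = min(
            ((k, math.dist(flow, c)) for k, c in self.attack_centroids.items()),
            key=lambda pair: pair[1],
        )
        if dist <= self.threshold:
            return label
        # Anomaly stage: anything far from normal traffic is flagged as novel.
        if math.dist(flow, self.normal_centroid) > self.threshold:
            return "anomaly"
        return "normal"


# Hypothetical features scaled to [0, 1]: (packet rate, mean packet size).
normal = [(0.10, 0.50), (0.20, 0.60), (0.15, 0.55)]
attacks = {"syn_flood": [(0.90, 0.10), (0.95, 0.05)]}
detector = TwoStageDetector(attacks, normal, threshold=0.2)

results = [detector.classify(f) for f in [(0.9, 0.1), (0.15, 0.5), (0.9, 0.9)]]
print(results)
```

The ordering reflects the design idea in the abstract: traffic that matches a known attack model is labelled directly, and only unmatched traffic is checked against the model of normal behaviour.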

A study on Yang Shi Tai Chi Chuan in Bartenieff Fundamentals Perspectives (바티니에프 기본원리를 통해 본 양식 태극권에 관한 연구)

  • Wang, Zhiquan
    • Trans- / v.8 / pp.95-127 / 2020
  • This research uses Bartenieff Fundamentals to analyze the fundamentals of Tai Chi Chuan's movements in order to develop relaxation methods from Tai Chi Chuan's principal movements. It also shows that the two techniques have much in common. First, from a philosophical perspective on body movement, the ultimate goal of both Tai Chi Chuan and Bartenieff is the integration of mind and body. In other words, there is a thread of connection between the East's body-mind monism and the West's Body Awareness. Second, from the standpoint of Breath Support as used in the Bartenieff method, both methods use breathing to move and relax the body naturally. In Tai Chi Chuan, the breath is the basis of life and the strength of the body, so the breathing of Tai Chi Chuan is what makes body and mind communicate, harmonize, and integrate. In other words, breathing in Tai Chi is realized through mental fusion and affects the movements. This is the same as the Breath Support of Bartenieff, which is said to influence movement in every respect and to change both the inner and outer form of the body. Third, with regard to the Core Support used in the Bartenieff method, both methods emphasize the core. When one moves while being conscious of one's core, muscle use becomes deep rather than superficial, and this enables strong and flexible movement. In Tai Chi Chuan, the abdominal muscles used when one coughs are consciously engaged through abdominal breathing, so strength is collected in the core. When one exercises in this way, the core becomes more stable and breathing becomes smoother. Fourth, with regard to the Rotary Factor used in Bartenieff Fundamentals, both methods use rotary movement to reach the goal of physical relaxation. 
The rotary factor of Bartenieff allows movement to be easier and freer because of the characteristic of joint exercise in which the central axis moves in three dimensions; the same is true of Tai Chi Chuan. Through Tai Chi Chuan's circular and spiral movements, relaxation can be achieved by switching into a seamless flow and accessing as much space as possible. Finally, looking at Developmental Patterning through Bonnie Bainbridge Cohen's Body-Mind Centering theory, the stages presented in the Bartenieff developmental model are all similar to the developmental process of Tai Chi Chuan: Breath, Core-Distal Connectivity/Navel Radiation, Head-Tail Connectivity/Spinal Movement, Upper-Lower Connectivity/Homologous, Body-Half Connectivity/Homo-Lateral Connectivity, and Cross-Lateral Connectivity/Contra-Lateral Connectivity. In other words, in Tai Chi Chuan energy is gathered in the core through breathing, and the upper and lower body are connected through the spine, not only homo-laterally but also cross-laterally. Through this study, the expression of dance movements can become more natural. Additionally, relaxation techniques can be developed on the basis of Body Awareness, that is, the balanced use of the central axis, joints, and body.

Suggestion of Community Design for the Efficiency of CPTED - Focused on Community Furniture - (범죄예방환경설계(CPTED)의 효율성 증대를 위한 커뮤니티디자인 제안 - 커뮤니티퍼니쳐를 중심으로 -)

  • Lee, Ho Sang
    • Korea Science and Art Forum / v.29 / pp.305-318 / 2017
  • There is a growing need to recognize crime in urban spaces as a social problem and to find specific approaches to crime prevention, such as studies of space design and various guidelines. In this regard, "Crime Prevention Through Environmental Design" (hereafter "CPTED") is being actively pursued. Yeomri-dong Salt Way is the first place to which the Seoul Crime Prevention Design Project was applied. The project objective of improving the local environment was implemented rationally through cooperation and voluntary participation between the project's implementing bodies and community members. Since its efficiency was proven, the number of sites has expanded, and it has become a benchmark case for local governments. This kind of problem-solving effort shares its purpose and direction with the 'Village Art Project', which has been implemented since 2009 with the aim of promoting culture in underdeveloped areas and encouraging residents' participation by introducing public art. It is noteworthy that this trend centers on the characteristics of community functions and values. The purpose of this study is to propose a method of applying community furniture as a way to increase the efficiency of CPTED and improve residents' 'quality of life'. To do this, we reviewed the literature and prior research on CPTED, community design, and public art, and identified problems and implications based on site visits to Yeomri-dong in Seoul and Gamcheon Village in Pusan, which are the successful models of "Seoul Root out Crime by Design" and the 'Maeulmisul Art Project', respectively. The common elements of the two cases identified in this study are as follows. First, the 'lives' of community residents took center stage through the activation of the community by collaborative activities, in addition to the physical composition of the environment. 
Second, community design and the introduction of public art created a new space, and many people therefore came to visit the village and revitalized the local economy. Third, among the CPTED factors, natural surveillance, territoriality and control, and activity support were strengthened. When the psychological aspect of CPTED and the emotional function of public art are fused in 'community furniture', a vague or overly grand approach to public space can be avoided through a specific local context based on the ways of thinking and emotions of local people, and it becomes possible to create an environment beneficial to all. In this way, the fusion of CPTED and public art is expected to reduce social costs through the construction of crime prevention infrastructure, such as expanding the spaces where CPTED is applied, and to suggest a way to implement visual amenity as a design strategy for urban regeneration.

An Analytical Study on the Seismic Behavior and Safety of Vertical Hydrogen Storage Vessels Under the Earthquakes (지진 시 수직형 수소 저장용기의 거동 특성 분석 및 안전성에 관한 해석적 연구)

  • Sang-Moon Lee;Young-Jun Bae;Woo-Young Jung
    • Journal of the Korea institute for structural maintenance and inspection / v.27 no.6 / pp.152-161 / 2023
  • In general, large-capacity hydrogen storage vessels, typically vertical cylindrical vessels, are constructed of steel. These vessels are anchored to foundation slabs that are specially designed to suit the environmental conditions. This anchoring method uses anchors pre-installed on top of the concrete foundation slab. However, such a design can result in concentrated stresses at the anchoring points under external forces such as seismic events, which may lead to structural damage through failure of the anchors and concrete. For this reason, in this study we selected a vertical hydrogen storage vessel based on site observations and created a 3D finite element model. Artificial seismic motions generated following the procedures specified in ICC-ES AC 156, as well as recorded domestic earthquakes with a magnitude greater than 5.0, were applied to analyze the structural behavior and performance of the target structure. Experiments on a full-scale structure would be ideal but proved difficult to carry out because of practical constraints. We therefore opted for an analytical approach to assess the safety of the target structure. Regarding the structural response characteristics, the acceleration induced by seismic motion was amplified approximately tenfold compared to the input seismic motions. Additionally, the amplification tended to decrease as the response acceleration was transmitted to the point where the centre of gravity is located. For the vulnerable components, specifically the sub-system (support columns and anchorages), the stress levels satisfied the allowable stress criteria. However, the concrete's tensile strength showed only about a 5% margin of safety relative to the allowable stress. This indicates the need for mitigation strategies to address these concerns. 
Based on the research findings presented in this paper, it is anticipated that predictable load information for the design of storage vessels required for future shaking table tests will be provided.

An Empirical Study in Relationship between Franchisor's Leadership Behavior Style and Commitment by Focusing Moderating Effect of Franchisee's Self-efficacy (가맹본부의 리더십 행동유형과 가맹사업자의 관계결속에 관한 실증적 연구 - 가맹사업자의 자기효능감의 조절효과를 중심으로 -)

  • Yang, Hoe-Chang;Lee, Young-Chul
    • Journal of Distribution Research / v.15 no.1 / pp.49-71 / 2010
  • Franchise businesses in South Korea have contributed to economic growth and job creation, and their growth potential remains very high. However, despite such virtues, domestic franchise businesses face many problems, such as the instability of the franchisor's business structure and weak financial conditions. To solve these problems, the government enacted legislation and strengthened franchise-related laws. However, the strengthening of laws regulating franchisors had many side effects that interrupted the development of the franchise business. For example, legal regulations on franchisors have had the effect of suppressing the franchisor's leadership activities (e.g., the ability to advocate the franchisor's policies and strategies to franchisees in order to facilitate change and innovation). One of the main goals of the franchise business is to build cooperation between the franchisor and the franchisee for their combined success. However, franchisees can refuse to follow the franchisor's strategies because of the current state of franchise-related law and government policy. The purpose of this study is to explore the effects of the franchisor's leadership style on the franchisee's commitment in a franchise system. We classified leadership styles according to path-goal theory (House & Mitchell, 1974), and hypothesized and tested that the four leadership styles proposed by path-goal theory (i.e., directive, supportive, participative, and achievement-oriented leadership) have different effects on the franchisee's commitment. Another purpose of this study is to explore how the level of the franchisee's self-efficacy influences both the franchisor's leadership style and the franchisee's commitment in a franchise system. 
The results of the present study are expected to provide important theoretical and practical implications regarding the role of the franchisor's leadership style, as restricted by government regulations, and the franchisee's self-efficacy, which may be needed to improve the quality of the long-term relationship between franchisor and franchisee. According to Northouse (2007), one problem in the investigation of leadership is that there are almost as many different definitions of leadership as there are people who have tried to define it. But despite the multitude of ways in which leadership has been conceptualized, the following components can be identified as central to the phenomenon: (a) leadership is a process, (b) leadership involves influence, (c) leadership occurs in a group context, and (d) leadership involves goal attainment. Based on these components, this study defines leadership as a process whereby a franchisor influences a group of franchisees to achieve a common goal. Under this definition, path-goal theory is about how leaders motivate subordinates to accomplish designated goals. Drawing heavily on research into what motivates employees, path-goal theory first appeared in the leadership literature in the early 1970s in the works of Evans (1970), House (1971), House and Dessler (1974), and House and Mitchell (1974). The stated goal of this leadership theory is to enhance employee performance and employee satisfaction by focusing on employee motivation. In brief, path-goal theory is designed to explain how leaders can help subordinates along the path to their goals by selecting specific behaviors that are best suited to subordinates' needs and to the situation in which subordinates are working (Northouse, 2007). 
House & Mitchell (1974) noted that although many different leadership behaviors could have been selected as part of path-goal theory, the approach has so far examined directive, supportive, participative, and achievement-oriented leadership behaviors. They also suggested that leaders may exhibit any or all of these four styles with various subordinates and in different situations. However, because of restrictive government regulations, franchisors are not in a position to change their leadership style to suit their circumstances. In addition, according to Northouse (2007), subordinate characteristics determine how a leader's behavior is interpreted by subordinates in a given work context. Many researchers have focused on subordinates' needs for affiliation, preferences for structure, desires for control, and self-perceived level of task ability. In this study, we focused on the self-perceived level of task ability, namely, the franchisee's self-efficacy. According to Bandura (1977), self-efficacy is chiefly defined as a person's belief in his or her ability to accomplish concrete tasks. It is therefore not an indicator of one's actual abilities, but an opinion about the extent to which one can use those abilities. Thus, the judgment about maintaining the franchisee's commitment depends on the situation (e.g., government regulation and policy, and the franchisor's leadership style) and how it affects one's ability to mobilize resources to deal with the task, so even people who possess the same ability may differ in self-efficacy. Figure 1 illustrates the model investigated in this study. In this model, it was hypothesized that leadership styles would affect the franchisee's commitment, and that self-efficacy would moderate the relationship between leadership style and the franchisee's commitment. 
Theoretically, according to Northouse (2007), the path-goal approach suggests that leaders need to choose a leadership style that best fits the needs of subordinates and the work they are doing. According to House & Mitchell (1974), the theory predicts that a directive style of leadership is best in situations in which subordinates are dogmatic and authoritarian, the task demands are ambiguous, and the organizational rules and procedures are unclear. In these situations, the franchisor's directive leadership complements the work by providing guidance and psychological structure for franchisees. For work that is structured, unsatisfying, or frustrating, path-goal theory suggests that leaders should use a supportive style. The franchisor's supportive leadership offers a sense of human touch for franchisees engaged in mundane, mechanized activity. The franchisor's participative leadership is considered best when a task is ambiguous, because participation gives greater clarity to how certain paths lead to certain goals; it helps subordinates learn which actions lead to which outcomes. Furthermore, House & Mitchell (1974) predict that achievement-oriented leadership is most effective in settings in which subordinates are required to perform ambiguous tasks. Marsh and O'Neill (1984) tested the idea that organizational members' anger and decline in performance are caused by deficiencies in their level of effort, and found that self-efficacy promotes accomplishment and decreases stress and negative consequences such as depression and emotional instability. Based on the extant empirical findings and theoretical reasoning, we posit positive and strong relationships between the franchisor's leadership styles and the franchisee's commitment. Furthermore, the level of the franchisee's self-efficacy was expected to help maintain their commitment. 
The questionnaires sent to participants consisted of the following measures: leadership style was assessed using a 20-item, 7-point Likert scale developed by Indvik (1985); self-efficacy was assessed using a 24-item, 6-point Likert scale developed by Bandura (1977); and commitment was assessed using a 6-item, 5-point Likert scale developed by Morgan & Hunt (1994). Questionnaires were distributed to Korean optical franchisees in Seoul. Data collection took about 20 days. A total of 140 questionnaires were returned, and complete data were available from 137 respondents. Multiple regression analyses, with each of the four leadership styles shown by the franchisor as independent variables and the franchisee's commitment as the dependent variable, showed that the relationship between supportive leadership style and commitment ($\beta$=.13, p<.001) and the relationship between participative leadership style and commitment ($\beta$=.07, p<.001) were significant. However, when participants were divided into high and low self-efficacy groups, multiple regression analyses showed that only the relationship between achievement-oriented leadership style and commitment ($\beta$=.14, p<.001) was significant in the high self-efficacy group. In the low self-efficacy group, the relationship between supportive leadership style and commitment ($\beta$=.17, p<.001) and the relationship between participative leadership style and commitment ($\beta$=.10, p<.001) were significant. The study focused on the franchisee's self-efficacy in order to explore the possibility that regulation, originally intended to protect the franchisee, may not be the most effective way to maintain relationships in a franchise business. The key results of the data analysis regarding the moderating role of self-efficacy between leadership behavior style, as proposed by path-goal theory, and commitment were as follows. 
First, the franchisor should apply the appropriate type of leadership behavior to strengthen the franchisee's commitment, because the results demonstrated that supportive and participative leadership styles on the part of the franchisor have a positive influence on the franchisee's level of commitment. Second, it is desirable for the franchisor to validate the franchisee's efforts, since franchisee characteristics such as self-efficacy had a substantial, positive effect on the franchisee's commitment and also acted as a meaningful moderator between leadership and commitment. Third, the results as a whole imply that the government should provide institutional support, namely by putting the franchisor in a position to clearly identify the characteristics of its franchisees and by providing reasonable means to manage the franchisees so as to achieve the company's goals.

A Proposal of a Keyword Extraction System for Detecting Social Issues (사회문제 해결형 기술수요 발굴을 위한 키워드 추출 시스템 제안)

  • Jeong, Dami;Kim, Jaeseok;Kim, Gi-Nam;Heo, Jong-Uk;On, Byung-Won;Kang, Mijung
    • Journal of Intelligence and Information Systems / v.19 no.3 / pp.1-23 / 2013
  • To discover significant social issues such as unemployment, economic crisis, and social welfare, which are urgent issues to be solved in modern society, researchers following the existing approach usually collect opinions from professional experts and scholars through online or offline surveys. However, such a method is not always effective. Because of the expense, a large number of survey replies is seldom gathered. In some cases, it is also hard to find professionals who deal with specific social issues. Thus, the sample set is often small and may be biased. Furthermore, several experts may reach totally different conclusions about a social issue because each expert has a subjective point of view and a different background. In this case, it is considerably hard to figure out what the current social issues are and which of them are really important. To overcome the shortcomings of the current approach, in this paper we develop a prototype system that semi-automatically detects keywords representing social issues and problems from about 1.3 million news articles published by about 10 major domestic news outlets in Korea from June 2009 until July 2012. Our proposed system consists of (1) collecting news articles and extracting texts from them, (2) identifying only news articles related to social issues, (3) analyzing the lexical items of Korean sentences, (4) finding a set of topics regarding social keywords over time based on probabilistic topic modeling, (5) matching relevant paragraphs to a given topic, and (6) visualizing social keywords for easy understanding. In particular, we propose a novel matching algorithm relying on generative models. The goal of our proposed matching algorithm is to best match paragraphs to each topic. 
Technically, using a topic model such as Latent Dirichlet Allocation (LDA), we can obtain a set of topics, each of which consists of relevant terms and their probability values. In our problem, given a set of text documents (e.g., news articles), LDA produces a set of topic clusters, and each topic cluster is then labeled by human annotators, where each topic label stands for a social keyword. For example, suppose there is a topic (e.g., Topic1 = {(unemployment, 0.4), (layoff, 0.3), (business, 0.3)}) and a human annotator labels Topic1 as "Unemployment Problem". In this example, it is non-trivial to understand what happened to the unemployment problem in our society. In other words, by looking only at social keywords, we have no idea of the detailed events occurring in our society. To tackle this matter, we develop a matching algorithm that computes the probability value of a paragraph given a topic, relying on (i) topic terms and (ii) their probability values. Specifically, given a set of text documents, we segment each text document into paragraphs. In the meantime, using LDA, we extract a set of topics from the text documents. Based on our matching process, each paragraph is assigned to the topic that it best matches. As a result, each topic has several best-matched paragraphs. For example, suppose there is a topic (e.g., Unemployment Problem) and its best-matched paragraph is "Up to 300 workers lost their jobs in XXX company at Seoul". In this case, we can grasp the detailed information behind the social keyword, such as "300 workers", "unemployment", "XXX company", and "Seoul". In addition, our system visualizes social keywords over time. Therefore, through our matching process and keyword visualization, most researchers will be able to detect social issues easily and quickly. 
Through this prototype system, we have detected various social issues appearing in our society and demonstrated the effectiveness of our proposed methods in our experiments. Our proof-of-concept system is available at http://dslab.snu.ac.kr/demo.html.
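
The abstract describes the matching step only at a high level, so the following is a minimal sketch and not the authors' algorithm. It assumes each topic is a mapping from term to probability (as in the Topic1 example), scores a paragraph by the average log-probability of its tokens under the topic, and assigns the paragraph to the highest-scoring topic. The second topic, the smoothing constant `EPSILON`, the regex tokenizer (a stand-in for Korean lexical analysis), and all function names are hypothetical.

```python
import math
import re

# Hypothetical topics in the form the abstract describes: term -> probability.
# "Unemployment Problem" reuses the Topic1 example; "Social Welfare" is invented.
TOPICS = {
    "Unemployment Problem": {"unemployment": 0.4, "layoff": 0.3, "business": 0.3},
    "Social Welfare": {"welfare": 0.5, "pension": 0.3, "subsidy": 0.2},
}

EPSILON = 1e-6  # assumed smoothing probability for terms outside a topic


def tokenize(text):
    """Lowercase and keep alphabetic runs; a stand-in for real lexical analysis."""
    return re.findall(r"[a-z]+", text.lower())


def score(paragraph, topic_terms):
    """Average per-token log P(token | topic), so length does not dominate."""
    tokens = tokenize(paragraph)
    if not tokens:
        return float("-inf")
    total = sum(math.log(topic_terms.get(tok, EPSILON)) for tok in tokens)
    return total / len(tokens)


def best_topic(paragraph, topics=TOPICS):
    """Return the label of the topic with the highest score for the paragraph."""
    return max(topics, key=lambda label: score(paragraph, topics[label]))


def match_paragraphs(document, topics=TOPICS):
    """Split a document on blank lines and group paragraphs by best topic."""
    matched = {label: [] for label in topics}
    for paragraph in re.split(r"\n\s*\n", document.strip()):
        matched[best_topic(paragraph, topics)].append(paragraph)
    return matched
```

In this sketch a paragraph that contains no term from any topic ties across all topics and falls to the first one listed; a real system would likely apply a minimum-score threshold instead.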

Twitter Issue Tracking System by Topic Modeling Techniques (토픽 모델링을 이용한 트위터 이슈 트래킹 시스템)

  • Bae, Jung-Hwan;Han, Nam-Gi;Song, Min
    • Journal of Intelligence and Information Systems
    • /
    • v.20 no.2
    • /
    • pp.109-122
    • /
    • 2014
  • People nowadays create a tremendous amount of data on Social Network Services (SNS). In particular, the incorporation of SNS into mobile devices has resulted in massive data generation, thereby greatly influencing society. This phenomenon is unprecedented in history, and we now live in the Age of Big Data. SNS data qualifies as Big Data because it satisfies the conditions of data amount (volume), data input and output speed (velocity), and variety of data types (variety). The trend of an issue discovered in SNS Big Data can serve as an important new source of value creation because this information covers the whole of society. In this study, a Twitter Issue Tracking System (TITS) is designed and built to meet the need for analyzing SNS Big Data. TITS extracts issues from Twitter texts and visualizes them on the web. The proposed system provides the following four functions: (1) providing the topic keyword set that corresponds to the daily ranking; (2) visualizing the daily time-series graph of a topic over the duration of a month; (3) showing the importance of a topic through a treemap based on the score system and frequency; and (4) visualizing the daily time-series graph of a keyword found by keyword search. The present study analyzes the Big Data generated by SNS in real time. SNS Big Data analysis requires various natural language processing techniques, including stop-word removal and noun extraction, to process various unrefined forms of unstructured data. In addition, such analysis requires the latest big data technology to rapidly process a large amount of real-time data, such as the Hadoop distributed system or NoSQL, which is an alternative to relational databases. We built TITS on Hadoop to optimize the processing of big data because Hadoop is designed to scale up from a single node to thousands of machines. 
Furthermore, we use MongoDB, which is classified as a NoSQL database. MongoDB is an open-source, document-oriented database that provides high performance, high availability, and automatic scaling. Unlike existing relational databases, MongoDB has no schemas or tables, and its most important goals are data accessibility and data processing performance. In the Age of Big Data, visualization is attractive to the Big Data community because it helps analysts examine such data easily and clearly. Therefore, TITS uses the d3.js library as a visualization tool. This library is designed for creating Data-Driven Documents that bind the document object model (DOM) to arbitrary data; interaction with the data is easy, and the library is useful for managing real-time data streams with smooth animation. In addition, TITS uses Bootstrap, a collection of preconfigured style sheets and JavaScript plug-ins, to build the web system. The TITS Graphical User Interface (GUI) is designed using these libraries, and it is capable of detecting issues on Twitter in an easy and intuitive manner. The proposed work demonstrates the superiority of our issue detection techniques by matching detected issues with corresponding online news articles. The contributions of the present study are threefold. First, we suggest an alternative approach to real-time big data analysis, which has become an extremely important issue. Second, we apply a topic modeling technique that is used in various research areas, including Library and Information Science (LIS). Based on this, we can confirm the utility of storytelling and time series analysis. Third, we develop a web-based system and make it available for the real-time discovery of topics. The present study conducted experiments with nearly 150 million tweets in Korea during March 2013.
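
Two of the functions listed above, the daily keyword ranking and the daily time series of a searched keyword, can be illustrated with a small in-memory sketch. This is not the TITS implementation, which the abstract says runs on Hadoop and MongoDB. The stop-word list, the regex tokenizer (standing in for Korean noun extraction), and all function names here are assumptions made for illustration.

```python
import re
from collections import Counter, defaultdict
from datetime import date

# Assumed, deliberately tiny stop-word list; a real system would use a fuller one.
STOP_WORDS = {"the", "a", "an", "is", "in", "of", "to", "and", "on"}


def keywords(text):
    """Tokenize and drop stop words; stands in for noun extraction."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOP_WORDS]


def daily_counts(tweets):
    """Aggregate keyword frequencies per day from (date, text) pairs."""
    counts = defaultdict(Counter)
    for day, text in tweets:
        counts[day].update(keywords(text))
    return counts


def daily_ranking(counts, day, top_n=3):
    """Return the top_n most frequent keywords for one day."""
    return [word for word, _ in counts.get(day, Counter()).most_common(top_n)]


def keyword_series(counts, keyword):
    """Return (day, frequency) pairs for one keyword, ordered by day."""
    return [(day, counts[day][keyword]) for day in sorted(counts)]


if __name__ == "__main__":
    sample = [
        (date(2013, 3, 1), "The election is on"),
        (date(2013, 3, 1), "election debate"),
        (date(2013, 3, 2), "debate of the election"),
    ]
    counts = daily_counts(sample)
    print(daily_ranking(counts, date(2013, 3, 1), top_n=1))
    print(keyword_series(counts, "debate"))
```

The series returned by `keyword_series` is the kind of data a d3.js line chart would plot, one point per day.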