• Title/Summary/Keyword: Feature Expansion

Subject-Balanced Intelligent Text Summarization Scheme (주제 균형 지능형 텍스트 요약 기법)

  • Yun, Yeoil;Ko, Eunjung;Kim, Namgyu
    • Journal of Intelligence and Information Systems, v.25 no.2, pp.141-166, 2019
  • Channels such as social media and SNS now generate enormous amounts of data, and the share of unstructured data represented as text has grown geometrically. Because it is difficult to read all of this text, it is important to access it quickly and grasp its key points. To meet this need for efficient understanding, many studies have proposed text summarization methods for handling and using large volumes of text. In particular, many recent methods use machine learning and artificial intelligence algorithms to generate summaries objectively and effectively, an approach called "automatic summarization". However, almost all text summarization methods proposed to date build the summary around how frequently content appears in the original documents. Such summaries tend to leave out low-weight subjects that are mentioned less often in the original text. If a summary contains only the major subjects, bias occurs and information is lost, so it is hard to ascertain every subject the documents cover. One way to avoid this bias is to summarize with a balance among the topics in the documents so that every subject can be ascertained, but an imbalance in the distribution of those subjects still remains. To retain a balance of subjects in the summary, it is necessary to consider the proportion of every subject in the original documents and also to allocate portions to the subjects equally, so that even sentences on minor subjects are sufficiently included. In this study, we propose a "subject-balanced" text summarization method that secures balance among all subjects and minimizes the omission of low-frequency subjects. For the subject-balanced summary, we use two summary evaluation metrics, "completeness" and "succinctness". 
Completeness means that the summary should fully include the contents of the original documents, and succinctness means that the summary has minimal duplication within itself. The proposed method has three phases. The first phase constructs subject term dictionaries. Topic modeling is used to calculate topic-term weights, which indicate the degree to which each term is related to each topic. From these weights, the terms highly related to each topic can be identified, and the subjects of the documents can be found from the topics, each composed of terms with similar meanings. A few terms that represent each subject well are then selected; in this method they are called "seed terms". However, the seed terms are too few to explain each subject adequately, so a well-constructed subject dictionary needs enough terms similar to the seed terms. Word2Vec is used for this word expansion to find terms similar to the seed terms. Word vectors are created by Word2Vec modeling, and the similarity between all terms is derived from those vectors using cosine similarity: the higher the cosine similarity between two terms, the stronger the relationship between them. Terms with high similarity to the seed terms of each subject are selected, and after filtering these expanded terms, the subject dictionary is finally constructed. The next phase allocates subjects to every sentence in the original documents. To grasp the content of each sentence, frequency analysis is first conducted using the terms that make up the subject dictionaries. The TF-IDF weight of each subject is calculated after frequency analysis, which shows how much each sentence says about each subject. However, TF-IDF weights have no upper bound, so the TF-IDF weights of every subject in each sentence are normalized to values between 0 and 1. 
Each sentence is then allocated to the subject with the maximum TF-IDF weight among all subjects, and a sentence group is constructed for each subject. The last phase is summary generation. Sen2Vec is used to measure the similarity between subject sentences, and a similarity matrix is formed. By repeatedly selecting sentences, it is possible to generate a summary that fully includes the contents of the original documents while minimizing duplication within the summary itself. To evaluate the proposed method, 50,000 TripAdvisor reviews were used to construct the subject dictionaries and 23,087 reviews were used to generate summaries. A comparison between summaries from the proposed method and frequency-based summaries verified that the proposed method better retains the balance of all subjects originally present in the documents.
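
The three phases described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the toy vectors, the 0.8 similarity threshold, the use of min-max scaling for normalization, and the coverage-minus-redundancy selection rule are all assumptions made for illustration; the abstract does not specify any of them. Real word and sentence vectors would come from trained Word2Vec and Sen2Vec models rather than hand-written lists.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def expand_seed_terms(seeds, vectors, threshold=0.8):
    """Phase 1 (sketch): add every term whose cosine similarity to any
    seed term reaches the threshold. The threshold is an assumed value."""
    expanded = set(seeds)
    for term, vec in vectors.items():
        if any(cosine(vec, vectors[s]) >= threshold for s in seeds if s in vectors):
            expanded.add(term)
    return expanded

def min_max_normalize(scores):
    """Scale scores into [0, 1]. Min-max scaling is an assumption."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]

def allocate_subjects(weights):
    """Phase 2 (sketch): weights maps subject -> one score per sentence.
    Normalize per subject, then give each sentence its highest-scoring subject."""
    normalized = {subj: min_max_normalize(w) for subj, w in weights.items()}
    n = len(next(iter(normalized.values())))
    return [max(normalized, key=lambda s: normalized[s][i]) for i in range(n)]

def greedy_select(sim, k):
    """Phase 3 (sketch): repeatedly pick the sentence with the best
    coverage (mean similarity to all sentences) minus redundancy
    (max similarity to sentences already chosen). This scoring rule is assumed."""
    n = len(sim)
    chosen = []
    while len(chosen) < min(k, n):
        best, best_score = None, float("-inf")
        for i in range(n):
            if i in chosen:
                continue
            coverage = sum(sim[i]) / n
            redundancy = max((sim[i][j] for j in chosen), default=0.0)
            if coverage - redundancy > best_score:
                best, best_score = i, coverage - redundancy
        chosen.append(best)
    return chosen

# Toy data standing in for trained embeddings and TF-IDF scores.
toy_vectors = {"room": [1.0, 0.0], "bed": [0.9, 0.1],
               "pasta": [0.0, 1.0], "pizza": [0.1, 0.9]}
print(sorted(expand_seed_terms(["room"], toy_vectors)))  # ['bed', 'room']

toy_weights = {"room": [2.0, 0.0, 1.0], "food": [0.5, 3.0, 0.0]}
print(allocate_subjects(toy_weights))  # ['room', 'food', 'room']

toy_sim = [[1.0, 0.9, 0.1], [0.9, 1.0, 0.2], [0.1, 0.2, 1.0]]
print(greedy_select(toy_sim, 2))  # [1, 2]
```

In the last call, sentence 0 is skipped even though it has high coverage, because it nearly duplicates sentence 1. That is the trade-off between completeness and succinctness the abstract describes, although the exact selection rule used in the paper may differ.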

Effects of climate change on biodiversity and measures for them (생물다양성에 대한 기후변화의 영향과 그 대책)

  • An, Ji Hong;Lim, Chi Hong;Jung, Song Hie;Kim, A Reum;Lee, Chang Seok
    • Journal of Wetlands Research, v.18 no.4, pp.474-480, 2016
  • This study discusses the background to the formation of biodiversity and its changes over geologic history, and the effects of climate change on biodiversity and humans, and suggests alternatives for reducing the effects of climate change. Biodiversity is 'the variety of life' and refers collectively to variation at all levels of biological organization. That is, biodiversity encompasses genes, species, and ecosystems and their interactions. It provides the basis for ecosystems and the services on which all people fundamentally depend. Nevertheless, biodiversity today is increasingly threatened, usually as a result of human activity. The diverse organisms on Earth, estimated at 10 to 30 million species, are the result of adaptation and evolution in various environments over the four billion years since the birth of life. The countless organisms that make up biodiversity each have specific characteristics and are interrelated through diverse relationships. The environment of the Earth on which we live has also been created over long years through the extensive relationships and interactions of those organisms. Humans, too, are organisms that live through interrelationships with other organisms, and cannot live without the organisms around them. Even so, in recent years human beings have accelerated the mean extinction rate to about 1,000 times that of the past. We have to conserve biodiversity for the plentiful life of future generations and are responsible for the sustainable use of biodiversity. Korea has achieved faster economic growth than any other country in the world. Korea originally held rich biodiversity, as it is a peninsula that stretches far from north to south and is surrounded by sea on three sides. But that biodiversity has increasingly disappeared in the process of fast economic growth. 
Korean people have created a distinctive Korean culture by coexisting with nature through a long history of agriculture, forestry, and fishery. But in recent years, the relationship between Koreans and nature has grown distant with the introduction of western culture and the development of science and technology, and the distinctive natural features born from the harmonious combination of nature and culture are disappearing more and more. The population of Korea is expected to decline, in contrast to the continuously growing world population. At this time, we need to restore the biodiversity damaged during rapid population growth and economic development, in concert with the recovery of natural ecosystems that accompanies population decrease. There have been five mass extinction events since the birth of life on Earth. The modern extinction is very rapid, and human activity is its major causal factor; in these respects it differs from past extinctions. Climate change is real, and biodiversity is very vulnerable to it. If organisms do not find a means of survival in the changed environment, such as 'adaptation through evolution' or 'movement to another place where they can exist', they will go extinct. In this respect, if climate change continues, biodiversity will be damaged greatly. Furthermore, climate change would also affect human life and the socio-economic environment through changes in biodiversity. Therefore, we need to grasp the effects of climate change on biodiversity more actively and, further, prepare alternatives to reduce the damage. The effects of climate change on biodiversity that have emerged include changes in phenology, changes in distribution range including vegetation shift, disharmony in interactions among organisms, reduced reproduction and growth rates due to disrupted food chains, and degradation of coral reefs. 
The effects on humans include the spread of infectious diseases, reduced food production, changes in the cultivation range of crops, and changes in fishing grounds and seasons. To solve the climate change problem, we first need to mitigate climate change by reducing emissions of greenhouse gases. But even if we stopped emitting greenhouse gases now, climate change is expected to continue for the time being. In this respect, preparing an adaptation strategy for climate change may be more realistic. Continuous monitoring of the effects of climate change on biodiversity, and the establishment of a monitoring system, must come before everything else. Securing diverse ecological spaces where biodiversity can establish, assisted migration, and the establishment of ecological networks, horizontal from south to north and vertical from lowland to upland, could be recommended as alternatives to aid the adaptation of biodiversity to the changing climate.