Subject-Balanced Intelligent Text Summarization Scheme (주제 균형 지능형 텍스트 요약 기법)
Journal of Intelligence and Information Systems, v.25 no.2, pp.141-166, 2019
Recently, channels such as social media and social networking services (SNS) have been generating enormous amounts of data. Within this data, the share of unstructured data represented as text has grown geometrically. Because it is difficult to read all of this text, it is important to access it rapidly and grasp its key points. To meet this need for efficient understanding, many studies on text summarization for handling and using huge amounts of text data have been proposed. In particular, many summarization methods that use machine learning and artificial intelligence algorithms to generate summaries objectively and effectively, known as "automatic summarization," have been proposed lately. However, almost all text summarization methods proposed to date construct the summary based on the frequency of contents in the original documents. Such summaries are limited in that they tend to omit low-weight subjects that are mentioned less often in the original text. If a summary includes only the contents of the major subjects, bias occurs and information is lost, so it is hard to ascertain every subject the documents contain. To avoid this bias, one can summarize with a view to balance between the topics a document has, so that all of its subjects can be ascertained, but an imbalance in the distribution of those subjects still remains. To retain the balance of subjects in a summary, it is necessary to consider the proportion of every subject in the original documents and also to allocate the portions of subjects equally, so that even sentences on minor subjects are sufficiently included in the summary. In this study, we propose a "subject-balanced" text summarization method that secures balance among all subjects and minimizes the omission of low-frequency subjects. For the subject-balanced summary, we use two summary evaluation metrics, "completeness" and "succinctness."
Completeness means that a summary should fully include the contents of the original documents, and succinctness means that a summary has minimal duplication within itself. The proposed method consists of three phases. The first phase constructs subject term dictionaries. Topic modeling is used to calculate topic-term weights, which indicate the degree to which each term is related to each topic. From the derived weights, the terms highly related to each topic can be identified, and the subjects of the documents can be found from the various topics composed of terms with similar meanings. Then, a few terms that represent each subject well are selected; in this method they are called "seed terms." However, these terms are too few to explain each subject sufficiently, so enough terms similar to the seed terms are needed for a well-constructed subject dictionary. Word2Vec is used for word expansion, finding terms similar to the seed terms. Word vectors are created by Word2Vec modeling, and from those vectors the similarity between all terms can be derived using cosine similarity. The higher the cosine similarity between two terms, the stronger the relationship between them is taken to be. Terms that have high similarity values with the seed terms of each subject are therefore selected, and after filtering these expanded terms the subject dictionary is finally constructed. The next phase allocates subjects to every sentence in the original documents. To grasp the contents of all sentences, frequency analysis is first conducted on the terms that make up the subject dictionaries. The TF-IDF weight of each subject is calculated after the frequency analysis, which shows how much each sentence says about each subject. However, TF-IDF weights have no upper bound, so the TF-IDF weights of every subject for each sentence are normalized to values between 0 and 1.
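The seed-term expansion step of the first phase can be illustrated with a minimal sketch. It assumes that word vectors are already available (the toy three-dimensional vectors below are hypothetical stand-ins for Word2Vec output) and that a term joins a subject dictionary when its cosine similarity to any seed term reaches a threshold. The function names and the threshold value of 0.8 are illustrative assumptions, not details given in the abstract.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def expand_seed_terms(seed_terms, vectors, threshold=0.8):
    """Return the seed terms plus every term close enough to some seed.

    `vectors` maps term -> word vector (hypothetical Word2Vec output).
    `threshold` is an assumed cut-off, not a value from the paper.
    """
    seeds_in_vocab = [s for s in seed_terms if s in vectors]
    expanded = set(seed_terms)
    if not seeds_in_vocab:
        return sorted(expanded)
    for term, vec in vectors.items():
        if term in expanded:
            continue
        best = max(cosine_similarity(vec, vectors[s]) for s in seeds_in_vocab)
        if best >= threshold:
            expanded.add(term)
    return sorted(expanded)


# Hypothetical toy vectors for a hotel-review vocabulary.
TOY_VECTORS = {
    "room": [0.9, 0.1, 0.0],
    "bed": [0.8, 0.2, 0.1],
    "staff": [0.1, 0.9, 0.0],
    "friendly": [0.2, 0.8, 0.1],
    "pool": [0.0, 0.1, 0.9],
}

room_dictionary = expand_seed_terms(["room"], TOY_VECTORS)
```

With these toy vectors only "bed" is close enough to the seed "room" to be added. In practice the filtering step mentioned in the abstract would follow, and its criteria are not specified there.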
Then, each sentence is allocated to the subject with the maximum TF-IDF weight among all subjects, and a sentence group is finally constructed for each subject. The last phase is summary generation. Sen2Vec is used to compute the similarity between subject sentences, from which a similarity matrix is formed. By repeatedly selecting sentences, it is possible to generate a summary that fully includes the contents of the original documents and minimizes duplication within the summary itself. To evaluate the proposed method, 50,000 TripAdvisor reviews were used to construct the subject dictionaries and 23,087 reviews were used to generate summaries. A comparison between the summary from the proposed method and a frequency-based summary was also performed, and the results verified that the summary from the proposed method better retains the balance of all subjects that the documents originally have.
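The allocation and selection phases can be sketched roughly as follows. The abstract does not state the normalization formula or the exact selection criterion, so this sketch makes two assumptions: weights are scaled to the range 0 to 1 by dividing by each subject's maximum weight, and sentences are picked greedily by trading off average similarity to the whole group (a proxy for completeness) against maximum similarity to sentences already chosen (a proxy for succinctness). All function names and the `trade_off` parameter are hypothetical.

```python
def normalize_by_subject(weights):
    """Scale each subject's weights into [0, 1] by that subject's maximum.

    `weights` is a list with one dict per sentence: subject -> TF-IDF weight.
    Max-scaling is an assumed normalization, not taken from the paper.
    """
    subjects = {s for row in weights for s in row}
    max_w = {s: max(row.get(s, 0.0) for row in weights) for s in subjects}
    return [
        {s: (row.get(s, 0.0) / max_w[s] if max_w[s] > 0 else 0.0) for s in subjects}
        for row in weights
    ]


def allocate_subjects(weights):
    """Group sentence indices under the subject with the largest normalized weight."""
    groups = {}
    for idx, row in enumerate(normalize_by_subject(weights)):
        if not row:
            continue
        subject = max(sorted(row), key=row.get)  # ties resolved alphabetically
        if row[subject] > 0:
            groups.setdefault(subject, []).append(idx)
    return groups


def select_sentences(candidates, sim, k, trade_off=0.5):
    """Greedily pick k sentences that cover the group while avoiding repeats.

    `sim` is a square sentence-similarity matrix (e.g. from sentence vectors).
    """
    selected = []
    remaining = list(candidates)
    while remaining and len(selected) < k:
        def score(i):
            coverage = sum(sim[i][j] for j in candidates) / len(candidates)
            redundancy = max((sim[i][j] for j in selected), default=0.0)
            return trade_off * coverage - (1 - trade_off) * redundancy
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected
```

Running `select_sentences` once per subject group with the same `k` is one simple way to give every subject an equal share of the summary, which is the balance the abstract argues for.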
Most operational uses of wind speed data require measurements at, or estimates generated for, the reference height of 10 m above mean sea level (AMSL). On the Ieodo Ocean Research Station (IORS), wind speed is measured by instruments installed on the lighthouse tower of the roof deck at 42.3 m AMSL. This preliminary study indicates how these data can best be converted into synthetic 10 m wind speed data for operational uses via the Korea Hydrographic and Oceanographic Agency (KHOA) website. We tested three well-known conventional empirical neutral wind profile formulas (a power law (PL); a drag coefficient based logarithmic law (DCLL); and a roughness height based logarithmic law (RHLL)), and compared their results to those generated using a well-known, highly tested and validated logarithmic model (LMS) with a stability function (
This study analyzes the principles of the 'Earthly Paradise' (仙境, the realm of immortals), 'Virtuous Concordance of Yin and Yang' (陰陽合德), and the 'Reordering Works of Heaven and Earth' (天地公事) while combining them with Joseon art. Therefore, this study aims to discover the context wherein the concept of Taiji in 'Daesoon Truth,' deeply penetrates into Joseon art. Doing so reveals how 'Daesoon Thought' is embedded in the lives and customs of the Korean people. In addition, this study follows a review of the sentiments and intellectual traditions of the Korean people based on 'Daesoon Thought' and creative works. Moreover, 'Daesoon Thought' brings all of this to the forefront in academics and art at the cosmological level. The purpose of this research is to vividly reveal the core of 'Daesoon Thought' as a visual image. Through this, the combination of 'Daesoon Thought' and Joseon art will secure both data and reality at the same time. As part of this, this study deals with the world of 'Daesoon Thought' as a cosmological Taiji principle. This concept is revealed in Joseon art, which is analyzed and examined from the viewpoint of art philosophy. First, as a way to make use of 'Daesoon Thought,' 'Daesoon Truth' was developed and directly applied to Joseon art. In this way, reflections on Korean life within 'Daesoon Thought' can be revealed. In this regard, the selection of Joseon art used in this study highlights creative works that have been deeply ingrained into people's lives. For example, as 'Daesoon Thought' appears to focus on the genre painting, folk painting, and landscape painting of the Joseon Dynasty, attention is given to verifying these cases. This study analyzes 'Daesoon Thought,' which borrows from Joseon art, from the perspective of art philosophy. Accordingly, attempts are made to find examples of the 'Virtuous Concordance of Yin and Yang' and Tai-Ji in Joseon art which became a basis by which 'Daesoon Thought' was communicated to people. 
In addition, appreciating 'Daesoon Thought' in Joseon art is an opportunity to vividly examine not only the Joseon art style but also the life, consciousness, and mental world of the Korean people. As part of this, Chapter 2 made several findings related to the formation of 'Daesoon Thought.' In Chapter 3, the structures of the ideas of 'Earthly Paradise' and 'Virtuous Concordance of Yin and Yang' were likewise found to have support. In addition, 'The Reordering Works of Heaven and Earth' and Tai-Ji were found in depictions of metaphysical laws. To this end, the laws of 'The Reordering Works of Heaven and Earth' and the structure of Tai-Ji were combined. In Chapter 4, we analyzed 'Daesoon Thought' in the life and work of the Korean people at the level of the convergence of 'Daesoon Thought' and Joseon art. The analysis of works provides a glimpse into the precise identity of 'Daesoon Thought' as observable in Joseon art, and doing so is useful for generating empirical data. For example, works such as Tai-Jido, Ssanggeum Daemu, Jusachaebujeokdo, Hwajogi Myeonghwabundo, and Gyeongdodo are objects that inspired descriptions of 'Earthly Paradise,' 'Virtuous Concordance of Yin and Yang,' and 'The Reordering Works of Heaven and Earth.' As a result, Tai-Ji, which appears in 'Daesoon Thought,' proved the status of people in Joseon art. Given all of these statements, the Tai-Ji idea pursued by 'Daesoon Thought' is a providence that follows change as all things are mutually created. In other words, it was derived that Tai-Ji ideology sits profoundly in the lives of the Korean people and responds mutually to the providence that converges with 'Mutual Beneficence.'
The recent surge of IT and data acquisition is shifting the paradigm in all aspects of life, and these advances are also affecting academic fields. Research topics and methods are being improved through academic exchange and connections. In particular, data-based research methods are employed in various academic fields, including landscape architecture, where continuous research is needed. Therefore, this study aims to investigate the possibility of developing a landscape preference evaluation and prediction model using machine learning, a branch of artificial intelligence, reflecting the current situation. To achieve this goal, machine learning techniques were applied to the landscaping field to build a landscape preference evaluation and prediction model and to verify the simulation accuracy of the model. For this, landscape images of wind power facilities, which have recently attracted attention as a renewable energy source, were selected as the research objects. For analysis, images of the wind power facility landscapes were collected using web crawling techniques, and an analysis dataset was built. Orange version 3.33, a program from the University of Ljubljana, was used for machine learning analysis to derive a prediction model with excellent performance. A model that integrates the evaluation criteria of machine learning and a separate model structure for the evaluation criteria were used to generate models using kNN, SVM, Random Forest, Logistic Regression, and Neural Network algorithms, which are suitable for machine learning classification models. The performance of the generated models was evaluated to derive the most suitable prediction model. The prediction model derived in this study separately evaluates three evaluation criteria, namely classification by type of landscape, classification by distance between landscape and target, and classification by preference, and then synthesizes the results to make a prediction.
As a result of the study, a prediction model was developed with a high accuracy of 0.986 for the evaluation criterion of landscape type, 0.973 for the evaluation criterion of distance, and 0.952 for the evaluation criterion of preference, and verification through the evaluation of data prediction results showed that the model exceeds the required performance value. As an experimental attempt to investigate the possibility of developing a prediction model using machine learning in landscape-related research, this study confirmed the possibility of creating a high-performance prediction model by building a dataset through the collection and refinement of image data and subsequently utilizing it in landscape-related research fields. Based on the results, implications, and limitations of this study, it is believed that various types of landscape prediction models can be developed, including models for wind power facility, natural, and cultural landscapes. Machine learning techniques can be more useful and valuable in the field of landscape architecture by exploring and applying research methods appropriate to the topic, by reducing the time needed for data classification through a model that classifies images according to landscape type, or by analyzing the importance of landscape planning factors through the analysis of landscape prediction factors using machine learning.
Considerable interest in the baby-boomer generation, those who were born after World War II, has emerged as their retirement has accelerated. The retirement of baby-boomers has caused many health, public welfare, social policy and family relationship problems. However, their increased purchasing power has made them more attractive consumers than any other generation, and they have become a fascinating niche market in the depressed economy. This research selected middle-class women of the baby-boomer generation, who have had powerful effects on society and have emerged as an attractive niche market, and attempted to understand their lives intensively. The purpose of this research is to identify the life values of middle-aged women of the baby-boomer generation. Qualitative research methodology was used to achieve the research objectives, and this research aimed to suggest marketing implications to connected industries based on the research results. The research objectives are as follows.
1. understanding the lives of baby-boomer middle-class women who have powerful effects on socio-economic phenomena
2. identifying the life values of baby-boomer middle-class women
3. generating marketing implications based on an understanding of baby-boomer middle-class women's lives and life values
This research conducted focus group interviews (FGIs), one of the qualitative research methodologies, to examine baby-boomer middle-class women's life values intensively, and selected 10 women living in Seoul for data collection. The qualitative data collected from the FGIs were analyzed with the spiral data analysis methodology proposed by Creswell (2007). The factors that most strongly influenced these middle-class women's lives were 'time' and 'independence'. Their consciousness of the importance of using time affects their life pattern generally, and their independence also impacts greatly on the way they exploit time and on their diverse relationships.
They maximized their self-realization and showed long-term partnership with their surrounding circumstances because of those effective factors. Baby-boomer middle-class women's self-realization was divided into two areas. One was their outside activities and another was perfect management of their physical appearance and home interior. Like the results of this research, their need for social entrance will be reinforced more strongly since their internal and external activities aim for the achievement of self-realization. In addition, this research suggests that baby-boomer middle-class women's activities are connected with their management of their physical appearance and home interior decorations, and that such management is caused not only by a simple interest in fashion and beauty but also a profound desire for self-realization. On account of their consciousness, which is different from other generations, Korean baby-boomer middle-class women are able to maintain positive partnerships with their surrounding circumstances; however, they also show ambivalent emotions to retain effective partnerships. To overcome those stressful situations, they make greater efforts to keep up their health and youth, and also engage in diverse activities to maintain their mental health. Finally, they generate positive attitudes toward their economic situation and extra time to develop self-realization and pursue happy, youthful and healthy lives. Based on those results, this study suggests the following implications. First, industries targeting the baby-boomer generation should develop innovative products and services which help the baby-boomer generation maximize their efficiency of time since time is one of the most important factors powerfully impacting the baby-boomer generation. They will engage in various activities to fill up their extra time and consume helpful products and services. 
Second, such industries should supply the baby-boomer generation with opportunities which propose new ways of self-realization since this generation shows a great desire for self-realization because of their self-efficacy. With customized strategies of satisfying their needs, the baby-boomer generation would discover opportunities to utilize their abilities, relationships and aesthetic senses, and industries would develop a niche market. Third, market segmentations which target the baby-boomer generation's desire to maintain their physical appearance and home interior should be executed since such activities are the main strategies to develop this generation's self-realization. The baby-boomer generation's desire to study those areas would be expanded, and those education systems should produce innovative products and services targeting the baby-boomer generation. This implication also offers to government officials new policies related with the baby-boomer generation. This exploratory study utilized qualitative research methodology to understand baby-boomer middle-class women's lives, and proposed propositions and limitations for further researches. As for the limitations, first, it is hard to generalize the research results so that they may apply to all areas and economic classes of the baby-boomer generation since this research selected only 10 women living in Seoul for the data collection process. To overcome this limitation, extended data collections of subjects from diverse regions and economic classes should be designed. Second, quantitative research should be conducted to supplement the findings with validities. Third, this research focused on only general ideas of the baby-boomer generation's lives since the range of this study was focused on their overall lives. Therefore, intensive research related to specific areas of their lives should be conducted.
Background: Sarcoidosis is a chronic granulomatous inflammatory disease of unknown etiology often involving the lungs and intrathoracic lymph nodes. The natural course of sarcoidosis varies from spontaneous remission to significant morbidity or death. However, neither the mechanisms causing the variable clinical outcomes nor any single parameter that predicts the prognosis is known. In sarcoidosis, the number and the activity of CD4+ lymphocytes are significantly increased at the loci of disease, and their oligoclonality suggests that the hyperreactivity of the CD4+ lymphocytes may be caused by a persistent antigenic stimulus. Recently, it has become known that CD4+ lymphocytes can be subdivided into two distinct populations (Th1 and Th2) defined by the spectrum of cytokines produced by these cells. Th1 cells promote cellular immunity associated with delayed type hypersensitivity reactions by generating IL-2 and IFN-
Log data, which record the multitude of information created when operating computer systems, are utilized in many processes, from carrying out computer system inspection and process optimization to providing customized user optimization. In this paper, we propose a MongoDB-based unstructured log processing system in a cloud environment for processing the massive amount of log data of banks. Most of the log data generated during banking operations come from handling a client's business. Therefore, in order to gather, store, categorize, and analyze the log data generated while processing the client's business, a separate log data processing system needs to be established. However, the realization of flexible storage expansion functions for processing a massive amount of unstructured log data and executing a considerable number of functions to categorize and analyze the stored unstructured log data is difficult in existing computer environments. Thus, in this study, we use cloud computing technology to realize a cloud-based log data processing system for processing unstructured log data that are difficult to process using the existing computing infrastructure's analysis tools and management system. The proposed system uses the IaaS (Infrastructure as a Service) cloud environment to provide a flexible expansion of computing resources and includes the ability to flexibly expand resources such as storage space and memory under conditions such as extended storage or rapid increase in log data. Moreover, to overcome the processing limits of the existing analysis tool when a real-time analysis of the aggregated unstructured log data is required, the proposed system includes a Hadoop-based analysis module for quick and reliable parallel-distributed processing of the massive amount of log data. 
Furthermore, because the HDFS (Hadoop Distributed File System) stores data by generating copies of the block units of the aggregated log data, the proposed system offers automatic restore functions so that the system can continue to operate after it recovers from a malfunction. Finally, by establishing a distributed database using the NoSQL-based MongoDB, the proposed system provides methods of effectively processing unstructured log data. Relational databases such as MySQL have complex schemas that are inappropriate for processing unstructured log data. Further, strict schemas like those of relational databases make it difficult to expand nodes when the stored data must be distributed to various nodes as the amount of data rapidly increases. NoSQL does not provide the complex computations that relational databases may provide but can easily expand the database through node dispersion when the amount of data increases rapidly; it is a non-relational database with an appropriate structure for processing unstructured data. The data models of NoSQL are usually classified as key-value, column-oriented, and document-oriented types. Of these, the representative document-oriented data model, MongoDB, which has a free schema structure, is used in the proposed system. MongoDB is introduced to the proposed system because it makes it easy to process unstructured log data through a flexible schema structure, facilitates flexible node expansion when the amount of data is rapidly increasing, and provides an Auto-Sharding function that automatically expands storage. The proposed system is composed of a log collector module, a log graph generator module, a MongoDB module, a Hadoop-based analysis module, and a MySQL module.
When the log data generated over the entire client business process of each bank are sent to the cloud server, the log collector module collects and classifies data according to the type of log data and distributes it to the MongoDB module and the MySQL module. The log graph generator module generates the results of the log analysis of the MongoDB module, Hadoop-based analysis module, and the MySQL module per analysis time and type of the aggregated log data, and provides them to the user through a web interface. Log data that require a real-time log data analysis are stored in the MySQL module and provided real-time by the log graph generator module. The aggregated log data per unit time are stored in the MongoDB module and plotted in a graph according to the user's various analysis conditions. The aggregated log data in the MongoDB module are parallel-distributed and processed by the Hadoop-based analysis module. A comparative evaluation is carried out against a log data processing system that uses only MySQL for inserting log data and estimating query performance; this evaluation proves the proposed system's superiority. Moreover, an optimal chunk size is confirmed through the log data insert performance evaluation of MongoDB for various chunk sizes.
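The routing performed by the log collector module can be illustrated with a small sketch. It assumes that each log record is a dictionary with `timestamp` and `type` fields, that a fixed set of log types requires real-time analysis, and that all other records are aggregated per minute into schema-free documents. These field names, the type names, and the one-minute unit are hypothetical, and plain Python lists stand in for the MySQL and MongoDB modules so that no database connection is needed.

```python
from collections import Counter, defaultdict

# Hypothetical log types that need real-time analysis (not from the paper).
REALTIME_TYPES = {"login_failure", "transaction_error"}


def route_logs(records, realtime_types=REALTIME_TYPES):
    """Split records into real-time rows and per-minute aggregate documents.

    Returns (realtime_rows, aggregate_docs):
      realtime_rows  - records destined for the relational (MySQL-like) store
      aggregate_docs - per-minute count documents for the document store
    """
    realtime_rows = []
    per_minute = defaultdict(Counter)
    for rec in records:
        if rec["type"] in realtime_types:
            realtime_rows.append(rec)
        else:
            minute = rec["timestamp"][:16]  # "YYYY-MM-DDTHH:MM"
            per_minute[minute][rec["type"]] += 1
    aggregate_docs = [
        {"minute": minute, "counts": dict(counts)}
        for minute, counts in sorted(per_minute.items())
    ]
    return realtime_rows, aggregate_docs


# Hypothetical sample records; extra fields may vary from record to record.
SAMPLE_LOGS = [
    {"timestamp": "2013-01-01T09:00:05", "type": "login_failure", "branch": "A"},
    {"timestamp": "2013-01-01T09:00:20", "type": "transfer", "branch": "A"},
    {"timestamp": "2013-01-01T09:00:40", "type": "transfer", "branch": "B"},
    {"timestamp": "2013-01-01T09:01:10", "type": "inquiry"},
]

realtime_rows, aggregate_docs = route_logs(SAMPLE_LOGS)
```

The aggregate documents have no fixed set of keys under `counts`, which is the kind of flexible structure the abstract gives as the reason for choosing a document-oriented store.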
The important results obtained in the investigation can be summarized as follows.
1. As the experimental results and analyses for the on-ground type mushroom house demonstrate, the side wall and ceiling constructions of the experimental house provided sufficient heat insulation to protect the inside of the house from outside climatic conditions.
2. The experimental results and analyses for the solar type experimental mushroom house, which was constructed as a half basement, showed that it is effective in making use of solar heat. However, two problems must be addressed before the solar house can be put to practical use in farm mushroom growing: (1) the construction of the roof and ceiling should be the same as for the on-ground type house, and (2) the solar heat generating system should be reconstructed properly.
3. Among the several ventilation systems studied in the experiments, the underground earthen pipe and ceiling ventilation system and the vertical side wall and ceiling ventilation system proved to be the most effective for natural ventilation.
4. The experimental results showed that ventilation systems such as the vertical side wall and underground ventilation systems are suitable for practical use as natural ventilation systems for farm mushroom houses. These systems can remarkably improve the temperature of the fresh air introduced into the house through heat transfer within the ventilation passages, bringing it closer to the desired house temperature without any cooling or heating operation. For example, if X is the outside temperature and Y is the amount of temperature adjustment made by the ventilation system, the relationship between X and Y can be expressed by the following regression lines.
Underground iron pipe ventilation system: Y = 0.9X - 12.8
Underground earthen pipe ventilation system: Y = 0.96X - 15.11
Vertical side wall ventilation system: Y = 0.94X - 17.57
5. The experimental results have shown that the relationships existing between the admitted and expelled air and the
Nowadays, social networks are a huge communication platform that lets people connect with one another and brings users together to share common interests, experiences, and daily activities. Users spend hours per day maintaining personal information and interacting with other people through posts, comments, messages, games, social events, and applications. With the growth of user information distributed across social networks, there is great potential to use social data to enhance the quality of recommender systems. Some studies on social network analysis investigate how social networks can be used in the recommendation domain. Among these, we are interested in taking advantage of the interactions between a user and others in a social network, which can be identified and are known as social relationships. Furthermore, users' decisions before purchasing products mostly depend on the suggestions of people who have either the same preferences or a closer relationship. For this reason, we believe that users' relationships in social networks can provide an effective way to improve the quality of predicting users' interests in recommender systems. Therefore, the social relationship between users found in a social network is a common factor for improving the way user preferences are predicted in the conventional approach. Recommender systems are rapidly increasing in popularity and are currently used by many e-commerce sites such as Amazon.com, Last.fm, and eBay.com. Collaborative filtering (CF) is one of the essential and powerful techniques in recommender systems for suggesting appropriate items to a user by learning the user's preferences. CF focuses on user data and automatically generates predictions about a user's interests by gathering information from users who share similar backgrounds and preferences.
Specifically, the intention of CF is to find users who have similar preferences and to suggest to the target user the items that were most preferred by those nearest-neighbor users. CF considers two basic units, the user and the item. Each user needs to provide rating values for items (e.g., movies, products, or books) to indicate interest in those items. CF then uses the user-rating matrix to find a group of users who have ratings similar to those of the target user, and predicts the unknown rating values for items that the target user has not rated. Currently, CF has been successfully implemented in both information filtering and e-commerce applications. However, some important challenges remain, such as cold start, data sparsity, and scalability, which affect the quality and accuracy of prediction. To overcome these challenges, many researchers have proposed various kinds of CF methods such as hybrid CF, trust-based CF, and social network-based CF. To improve the recommendation performance and prediction accuracy of standard CF, in this paper we propose a method that integrates the traditional CF technique with the social relationships between users discovered from their behavior in a social network, namely Facebook. We identify a user's relationships from behavior such as posts and comments exchanged with friends on Facebook. We believe that social relationships implicitly inferred from user behavior can compensate for the limitations of the conventional approach. Therefore, we extract the posts and comments of each user using the Facebook Graph API and calculate a feature score for each term to obtain a feature vector for computing the similarity between users. Then, we combine the result with the similarity value computed using the traditional CF technique. Finally, our system provides a list of recommended items according to the neighbor users who have the largest total similarity value to the target user.
To verify and evaluate the proposed method, we performed an experiment on data collected from our Movies Rating System. Prediction accuracy is evaluated in terms of MAE to show how correct the recommendations given by our algorithm are. The performance of the method is then evaluated in terms of precision, recall, and F1-measure to show its effectiveness. An evaluation of coverage is also included in the experiment to assess the ability to generate recommendations. The experimental results show that the proposed method outperforms the benchmark methods and is more accurate in suggesting items to users. In particular, using user behavior in the social network improves recommendation accuracy by up to 6%. Moreover, the experiment on recommendation performance shows that incorporating social relationships observed from user behavior into CF is beneficial, improving performance by 7% compared with the benchmark methods. Finally, we confirm that interaction between users in a social network can enhance accuracy and give better recommendations than the conventional approach.
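The combination of CF similarity with social similarity, and the MAE used for evaluation, can be sketched as follows. The abstract does not say how the two similarity values are combined, so a weighted sum with a hypothetical weight `alpha` is assumed here. Prediction uses a similarity-weighted average over the top `k` neighbors, which is a common CF formulation rather than a confirmed detail of the paper. All names and sample values are illustrative.

```python
def combine_similarity(cf_sim, social_sim, alpha=0.5):
    """Weighted sum of CF and social similarity per neighbor (assumed rule)."""
    users = set(cf_sim) | set(social_sim)
    return {
        u: alpha * cf_sim.get(u, 0.0) + (1 - alpha) * social_sim.get(u, 0.0)
        for u in users
    }


def predict_rating(item, neighbor_ratings, total_sim, k=2):
    """Similarity-weighted average rating of `item` over the top-k neighbors.

    Returns None when no neighbor with positive similarity rated the item.
    """
    rated = [
        (total_sim.get(u, 0.0), ratings[item])
        for u, ratings in neighbor_ratings.items()
        if item in ratings and total_sim.get(u, 0.0) > 0
    ]
    top = sorted(rated, reverse=True)[:k]
    denom = sum(s for s, _ in top)
    if denom <= 0:
        return None
    return sum(s * r for s, r in top) / denom


def mean_absolute_error(predicted, actual):
    """MAE: mean of absolute differences between predicted and true ratings."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)


# Hypothetical neighbors of one target user.
cf_sim = {"bob": 0.9, "cat": 0.2}      # from the user-rating matrix
social_sim = {"bob": 0.5, "cat": 0.6}  # from post/comment feature vectors
neighbor_ratings = {
    "bob": {"m1": 4, "m2": 2, "m3": 5},
    "cat": {"m1": 1, "m3": 2},
}

total_sim = combine_similarity(cf_sim, social_sim)
predicted_m3 = predict_rating("m3", neighbor_ratings, total_sim)
```

Setting `alpha` to 1.0 reduces the sketch to plain CF, which gives a simple baseline for comparing MAE with and without the social signal.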
The wall shear stress in the vicinity of end-to-end anastomoses under steady flow conditions was measured using a flush-mounted hot-film anemometer (FMHFA) probe. The experimental measurements were in good agreement with numerical results except in flow with low Reynolds numbers. The wall shear stress increased proximal to the anastomosis in flow from the Penrose tubing (simulating an artery) to the PTFE graft. In flow from the PTFE graft to the Penrose tubing, low wall shear stress was observed distal to the anastomosis. Abnormal distributions of wall shear stress in the vicinity of the anastomosis, resulting from the compliance mismatch between the graft and the host artery, might be an important factor in the formation of anastomotic neointimal fibrous hyperplasia (ANFH) and in graft failure. The present study suggests a correlation between regions of low wall shear stress and the development of ANFH in end-to-end anastomoses.
Air pressure decay (APD) rate and ultrafiltration rate (UFR) tests were performed on new and saline-rinsed dialyzers as well as those reused in patients several times. C-DAK 4000 (Cordis Dow) and CF IS-11 (Baxter Travenol) reused dialyzers obtained from the dialysis clinic were used in the present study. The new dialyzers exhibited a relatively flat APD, whereas saline-rinsed and reused dialyzers showed a considerable amount of decay. C-DAK dialyzers had a larger APD (11.70