• Title/Abstract/Keywords: administration information dataset

Search Result 82, Processing Time 0.029 seconds

Estimation of Air Temperature Changes due to Future Urban Growth in the Seoul Metropolitan Area (수도권지역 미래 도시성장에 따른 기온변화 추정)

  • Kim, Yoo-Keun;Kim, Hyun-Su;Jeong, Ju-Hee;Song, Sang-Keun
    • Journal of Environmental Science International / v.19 no.2 / pp.237-245 / 2010
  • The relationship between air temperature and the fraction of urban areas (FUA) in the Seoul metropolitan area (SMA) during 1975 through 2000 was estimated as a linear regression equation, using land-use data from the Water Management Information System (WAMIS) and air temperatures from the Korea Meteorological Administration (KMA). The future FUA in the SMA (from 2000 to 2030) was also predicted with an urban growth model (SLEUTH) in conjunction with several WAMIS datasets (e.g., urban, roads). The estimated future FUA was then used as input to the linear regression equation to estimate the annual mean minimum air temperature in the future (e.g., 2025 and 2030). The FUA in the SMA in 2000 simulated by SLEUTH showed good agreement with the observations (an accuracy of 73%). Urban growth in the SMA was predicted to increase by 16% of the total area in 2025 and by 24% in 2030. According to the linear regression equation, the annual mean minimum air temperature in the SMA increased by about $0.02^{\circ}C$/yr and was expected to increase up to $8.3^{\circ}C$ in 2025 and $8.7^{\circ}C$ in 2030.
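The regression step described in this abstract (fit a line relating FUA to annual mean minimum temperature, then feed a projected FUA into it) can be sketched as follows. This is a minimal illustration only: the function names `fit_line` and `predict` are hypothetical, and the numbers are synthetic placeholders chosen to lie on a straight line, not values from the paper.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope, intercept, x):
    """Evaluate the fitted line at a new x (e.g., a projected FUA)."""
    return intercept + slope * x


# Synthetic placeholder data (NOT from the paper): fraction of urban area
# and annual mean minimum temperature in degrees Celsius.
fua = [0.05, 0.08, 0.11, 0.14, 0.17, 0.20]
tmin = [7.00, 7.15, 7.30, 7.45, 7.60, 7.75]

slope, intercept = fit_line(fua, tmin)

# Hypothetical projected FUA, standing in for an urban growth model output.
projected_tmin = predict(slope, intercept, 0.30)
print(slope, intercept, projected_tmin)
```

In the study itself the projected FUA comes from SLEUTH; here a fixed hypothetical value takes its place.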

Improving an Ensemble Model by Optimizing Bootstrap Sampling (부트스트랩 샘플링 최적화를 통한 앙상블 모형의 성능 개선)

  • Min, Sung-Hwan
    • Journal of Internet Computing and Services / v.17 no.2 / pp.49-57 / 2016
  • Ensemble classification involves combining multiple classifiers to obtain more accurate predictions than those obtained using individual models. Ensemble learning techniques are known to be very useful for improving prediction accuracy. Bagging is one of the most popular ensemble learning techniques. Bagging has been known to be successful in increasing the accuracy of prediction of the individual classifiers. Bagging draws bootstrap samples from the training sample, applies the classifier to each bootstrap sample, and then combines the predictions of these classifiers to get the final classification result. Bootstrap samples are simple random samples selected from the original training data, so not all bootstrap samples are equally informative, due to the randomness. In this study, we proposed a new method for improving the performance of the standard bagging ensemble by optimizing bootstrap samples. A genetic algorithm is used to optimize bootstrap samples of the ensemble for improving prediction accuracy of the ensemble model. The proposed model is applied to a bankruptcy prediction problem using a real dataset from Korean companies. The experimental results showed the effectiveness of the proposed model.
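The standard bagging procedure that this abstract starts from (draw bootstrap samples, train one classifier per sample, combine by majority vote) can be sketched as below. This is a minimal illustration under stated assumptions: the base classifier is a one-dimensional threshold stump chosen for brevity, all function names are hypothetical, and the toy data are synthetic. The genetic-algorithm step the paper proposes for choosing bootstrap samples is not shown.

```python
import random
from collections import Counter


def bootstrap_sample(data, rng):
    """Draw len(data) items from data with replacement."""
    return [rng.choice(data) for _ in data]


def train_stump(sample):
    """Fit a 1-D threshold classifier that predicts 1 when x >= threshold."""
    best_threshold, best_errors = None, None
    for threshold, _ in sample:
        errors = sum((x >= threshold) != bool(y) for x, y in sample)
        if best_errors is None or errors < best_errors:
            best_threshold, best_errors = threshold, errors
    return best_threshold


def bagging_fit(data, n_models, seed=0):
    """Train one stump per bootstrap sample."""
    rng = random.Random(seed)
    return [train_stump(bootstrap_sample(data, rng)) for _ in range(n_models)]


def bagging_predict(thresholds, x):
    """Combine the stumps' predictions by majority vote."""
    votes = Counter(int(x >= t) for t in thresholds)
    return votes.most_common(1)[0][0]


# Synthetic toy data: (feature, label) pairs.
toy = [(0.1, 0), (0.2, 0), (0.3, 0), (0.4, 0),
       (0.6, 1), (0.7, 1), (0.8, 1), (0.9, 1)]

models = bagging_fit(toy, n_models=11)
print(bagging_predict(models, 0.05), bagging_predict(models, 0.95))
```

Because each stump sees a different random sample, the learned thresholds differ from one stump to the next, which is the variation in sample quality that the abstract proposes to optimize.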

Drug-Drug Interaction Prediction Using Krill Herd Algorithm Based on Deep Learning Method

  • Al-Marghilani, Abdulsamad
    • International Journal of Computer Science & Network Security / v.21 no.6 / pp.319-328 / 2021
  • Concurrent administration of multiple drugs increases the chance of drug-drug interaction (DDI), because one drug may affect the activity of another. DDI can have a negative or positive impact on therapeutic outcome, so DDIs need to be identified to make drug use safer. Although several DDI prediction systems exist, it has become impossible for them to keep up with the rapidly growing volume of biomedical text. Most existing DDI systems treat the task as a classification problem and rely on handcrafted features, some of which depend on domain-specific tools. The objective of this paper is to predict DDI so that adverse effects of the consumed drugs can be avoided. Similarity among drugs is measured with a drug pair similarity calculation. The Krill Herd Algorithm (KHA) is used to find the optimal weights, which a long short-term memory (LSTM) network then uses to predict DDI; the combined method is referred to as LSTM-KHA. Experiments were conducted on three datasets taken from the DrugBank database: DS1 (CYP), DS2 (NCYP), and DS3. Performance was evaluated in terms of accuracy, recall, precision, F-measure, AUPR, AUC, and AUROC. The experimental results indicate that the proposed method outperforms existing methods for predicting DDI.

Do Not Just Talk, Show Me in Action: Investigating the Effect of OSSD Activities on Job Change of IT Professional (오픈소스 소프트웨어 개발 플랫폼 활동이 IT 전문직 취업에 미치는 영향)

  • Jang, Moonkyoung;Lee, Saerom;Baek, Hyunmi;Jung, Yoonhyuk
    • The Journal of Society for e-Business Studies / v.26 no.1 / pp.43-65 / 2021
  • With the advancement of information and communications technology, the way IT professionals are recruited has fundamentally changed. Recruiters now search for candidate information on the Web as well as in traditional sources such as résumés and interviews. In particular, open-source software development (OSSD) platforms give developers an opportunity to demonstrate their IT capabilities and give recruiters a way to find the candidates they need. This study therefore investigates the impact of developers' profiles on an OSSD platform on their finding a job. It examines four antecedents in developer information that can accelerate a job search: job-seeking status, personal-information posting, learning activities, and knowledge contribution activities. For the empirical analysis, we developed a Web crawler and gathered a dataset on 4,005 developers from GitHub, a well-known OSSD platform. Proportional hazards regression was used for data analysis because a shorter job-seeking period implies a more successful job change. Our results indicate that developers who explicitly posted their job-seeking status had shorter job-seeking periods than those who did not. The other antecedents (i.e., personal-information posting, learning, and knowledge contribution activities) also contributed to reducing the job-seeking period. These findings imply the value of OSSD platforms both for recruiters seeking suitable candidates and for developers seeking a job.

Selective Word Embedding for Sentence Classification by Considering Information Gain and Word Similarity (문장 분류를 위한 정보 이득 및 유사도에 따른 단어 제거와 선택적 단어 임베딩 방안)

  • Lee, Min Seok;Yang, Seok Woo;Lee, Hong Joo
    • Journal of Intelligence and Information Systems / v.25 no.4 / pp.105-122 / 2019
  • Dimensionality reduction is one of the methods for handling big data in text mining. When reducing dimensionality, we should consider the density of the data, which has a significant influence on the performance of sentence classification. Higher-dimensional data require more computation, which can lead to high computational cost and overfitting in the model. Thus, a dimension reduction process is necessary to improve the performance of the model. Diverse methods have been proposed, ranging from merely reducing noise in the data, such as misspellings or informal text, to incorporating semantic and syntactic information. In addition, the representation and selection of text features affect the performance of the classifier in sentence classification, which is one of the fields of natural language processing. The common goal of dimension reduction is to find a latent space that is representative of the raw data in the observation space. Existing methods utilize various algorithms for dimensionality reduction, such as feature extraction and feature selection. In addition to these algorithms, word embeddings, which learn low-dimensional vector space representations of words and can capture semantic and syntactic information from data, are also utilized. To improve performance, recent studies have suggested methods in which the word dictionary is modified according to the positive and negative scores of pre-defined words. The basic idea of this study is that similar words have similar vector representations. Once the feature selection algorithm identifies words that are not important, we assume that words similar to those words also have no impact on sentence classification. This study proposes two ways to achieve more accurate classification, which conduct selective word elimination under specific rules and construct word embeddings based on Word2Vec. 
To select words of low importance from the text, we use the information gain algorithm to measure importance and cosine similarity to search for similar words. First, we eliminate words that have comparatively low information gain values from the raw text and form the word embedding. Second, we additionally select, for elimination, words that are similar to the words with low information gain values and build the word embedding. Finally, the filtered text and word embeddings are applied to two deep learning models: a Convolutional Neural Network and an Attention-Based Bidirectional LSTM. This study uses customer reviews of Kindle on Amazon.com, IMDB, and Yelp as datasets and classifies each dataset using the deep learning models. Reviews that received more than five helpful votes and whose ratio of helpful votes was over 70% were classified as helpful reviews. Yelp shows only the number of helpful votes. We extracted 100,000 reviews that received more than five helpful votes by random sampling from among 750,000 reviews. Minimal preprocessing, such as removing numbers and special characters from the text, was applied to each dataset. To evaluate the proposed methods, we compared their performance with that of Word2Vec and GloVe word embeddings that used all the words. One of the proposed methods performed better than the embeddings with all the words. Removing unimportant words improved performance; however, removing too many words lowered it. Future research should consider diverse preprocessing methods and an in-depth analysis of word co-occurrence for measuring similarity among words. In addition, we applied the proposed methods only with Word2Vec. Other embedding methods such as GloVe, fastText, and ELMo could be applied with the proposed methods, making it possible to identify the possible combinations of word embedding methods and elimination methods.
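The two selection steps in this abstract (flag words with low information gain, then flag words whose vectors are close to those words by cosine similarity) can be sketched as below. This is a minimal illustration, not the authors' implementation: the function names, both thresholds, the toy documents, and the two-dimensional word vectors are all assumptions made for the example. In the study the vectors come from Word2Vec.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())


def information_gain(docs, labels, word):
    """Reduction in label entropy from knowing whether `word` occurs."""
    with_word = [l for d, l in zip(docs, labels) if word in d]
    without = [l for d, l in zip(docs, labels) if word not in d]
    n = len(labels)
    conditional = (len(with_word) / n) * entropy(with_word) \
        + (len(without) / n) * entropy(without)
    return entropy(labels) - conditional


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def words_to_remove(docs, labels, vectors, ig_threshold, sim_threshold):
    """Return (low-gain words, extra words similar to a low-gain word)."""
    vocab = set().union(*docs)
    low = {w for w in vocab
           if information_gain(docs, labels, w) < ig_threshold}
    similar = {w for w in vocab - low for l in low
               if w in vectors and l in vectors
               and cosine_similarity(vectors[w], vectors[l]) >= sim_threshold}
    return low, similar


# Toy documents as token sets with binary labels (synthetic, illustrative).
docs = [{"great", "the", "that"}, {"good", "the"},
        {"bad", "the"}, {"awful", "the"}]
labels = [1, 1, 0, 0]
# Hypothetical 2-D word vectors standing in for trained embeddings.
vectors = {"the": [1.0, 0.0], "that": [0.9, 0.1], "great": [0.0, 1.0],
           "good": [0.1, 1.0], "bad": [0.0, -1.0], "awful": [0.1, -1.0]}

low, similar = words_to_remove(docs, labels, vectors,
                               ig_threshold=0.1, sim_threshold=0.9)
print(low, similar)
```

Here "the" occurs in every document, so it carries no information about the label and is flagged in the first step. "that" has a higher information gain but a vector close to "the", so only the second step flags it.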

A Topic Modeling-based Recommender System Considering Changes in User Preferences (고객 선호 변화를 고려한 토픽 모델링 기반 추천 시스템)

  • Kang, So Young;Kim, Jae Kyeong;Choi, Il Young;Kang, Chang Dong
    • Journal of Intelligence and Information Systems / v.26 no.2 / pp.43-56 / 2020
  • Recommender systems help users make the best choice among various options. They play an especially important role on internet sites, where vast amounts of digital information are generated every second. Many studies on recommender systems have focused on accurate recommendation. However, some problems must be overcome for a recommender system to be commercially successful. First, recommender systems lack transparency: users cannot know why products are recommended. Second, recommender systems cannot immediately reflect changes in user preferences: although a user's preference for products changes over time, the recommender system must rebuild its model to reflect that change. Therefore, in this study, we propose a recommendation methodology that uses topic modeling and sequential association rule mining on review data to solve these problems. Product reviews provide useful information for recommendations because they include not only product ratings but also various content such as user experiences and emotional states, so reviews imply user preferences for products. Topic modeling is therefore useful for explaining why items are recommended to users, and sequential association rule mining is useful for identifying changes in user preferences. The proposed methodology is divided into two phases. The first phase creates user profiles based on topic modeling: after topics are extracted from user reviews of products, a user profile over the topics is created. The second phase recommends products using sequential rules that appear in users' buying behaviors over time. The buying behaviors are derived from changes in each user's topics. 
A collaborative filtering-based recommender system was developed as a benchmark, and we compared the performance of the proposed methodology with it using Amazon's review dataset. Accuracy, recall, precision, and F1 were used as evaluation metrics. For topic modeling, collapsed Gibbs sampling was conducted, and 15 topics were extracted. Among the main topics, topic 1, topic 3, topic 4, topic 7, topic 9, topic 13, and topic 14 are related to "comedy shows", "high-teen drama series", "crime investigation drama", "horror theme", "British drama", "medical drama", and "science fiction drama", respectively. In the comparative analysis, the proposed methodology outperformed the collaborative filtering-based recommender system. From the results, we found that the time just prior to the recommendation was very important for inferring changes in user preference. Therefore, the proposed methodology can not only secure the transparency of the recommender system but also reflect user preferences that change over time. However, the proposed methodology has some limitations. It cannot make fine-grained recommendations if the number of products included in a topic is large. In addition, the number of sequential patterns is small because the number of topics is small. Future research needs to address these limitations.

Empirical Analysis of the Influence of ICT SMEs' R&D Resources on Corporate Performance (ICT 중소기업의 연구개발 자원이 기업성과에 미치는 영향에 관한 실증연구)

  • Jong Yoon Won;Kun Chang Lee
    • Information Systems Review / v.23 no.3 / pp.1-23 / 2021
  • The national economic policy paradigm is constantly changing with the global business environment. Fostering SMEs is a core policy of many developed countries. The growth of SMEs contributes to job creation and the development of local communities in an era of jobless growth. In particular, the growth of SMEs is the foundation for their growth into mid-sized and large enterprises. Therefore, the growth of SMEs plays an important role in the national economy. Information and communication technology (ICT) has become much more important with the emergence of the 4th industrial revolution, and the growth of ICT SMEs is the nation's future asset. This study therefore examines and verifies the main factors affecting the performance of ICT SMEs from the perspective of their R&D resources. Based on a dataset of 1,999 SMEs, an empirical analysis was performed to investigate the influence of R&D resources on corporate performance. The results are as follows. First, based on the resource-based theory, ICT SMEs' R&D investment, R&D manpower, and government support policies were found to have a positive effect on securing a company's competitive advantage. Second, the level of the product was found to have a positive effect on the company's performance. Finally, M&A and technology acquisition strategies were found to differ according to the growth stage of the company. Therefore, government support policy and investment in internal R&D personnel are the main factors in achieving technological innovation and corporate performance for ICT SMEs. In addition, technology acquisition strategies differ depending on the growth stage of the company.

Investigating the Performance of Bayesian-based Feature Selection and Classification Approach to Social Media Sentiment Analysis (소셜미디어 감성분석을 위한 베이지안 속성 선택과 분류에 대한 연구)

  • Chang Min Kang;Kyun Sun Eo;Kun Chang Lee
    • Information Systems Review / v.24 no.1 / pp.1-19 / 2022
  • Social media-based communication has become a crucial part of our personal and professional lives. It is therefore no surprise that social media sentiment analysis has emerged as an important way for all kinds of companies to detect trends in potential customers' sentiment. However, social media sentiment analysis suffers from the huge number of sentiment features obtained in the course of the analysis. To address this, this study proposes a novel method based on the Bayesian network, in which MBFS (Markov Blanket-based Feature Selection) is used to reduce the number of sentiment features. To show the validity of the proposed model, we used online review data from Yelp, a well-known social media platform for evaluating and recommending restaurants, bars, and beauty salons. We used several benchmark feature selection methods, such as correlation-based feature selection, information gain, and gain ratio. Several machine learning classifiers were also used in the validation tasks, such as TAN, NBN, Sons & Spouses BN (Bayesian Network), and Augmented Markov Blanket. Furthermore, we conducted a Bayesian network-based what-if analysis to see how the knowledge map between the target node and related explanatory nodes could yield meaningful insight into the sentiments underlying the target dataset.

A Study on the Records Management for the National Assembly Members (국회의원 기록관리 방안 연구)

  • Kim, Jang-hwan
    • The Korean Journal of Archival Studies / no.55 / pp.39-71 / 2018
  • The purpose of this study is to examine the reality of records management for National Assembly members and to suggest a desirable alternative. Until the Public Records Management Act was enacted in 1999, the level of records management in the National Assembly was not beyond that of document management in both the administration and the legislature. Rather, the National Assembly has maintained a records management tradition of systematically managing minutes and bills since the Constitutional Assembly. After the Act was legislated in 2000, the National Assembly Records Management Regulation was enacted and enforced, and the Archives was established as a subsidiary organ of the Secretariat of the National Assembly, even though its establishment was not obligatory. In addition, for the first time in Korea, an archivist was assigned as a records and archives researcher, whose role is to respond quickly in accordance with the records schedule of the National Assembly, making its service faster than that of the administration. However, the records management authority of the National Assembly Archives under the Secretariat of the National Assembly was greatly reduced, so the revision of the regulations in accordance with the Act as revised in 2007 was not completed until 2011. In the case of the National Assembly, the direct influence of the executive branch was insignificant. As the National Assembly had little direct influence on the administration, it had little positive influence on records management innovation under the Roh Moo-Hyun administration. Even within the National Assembly, records management by its members is insignificant both in practice and in theory. As National Assembly members are excluded from the Act, there is no legal basis for enforcing a records management method upon them. 
In this study, we analyze the records management problem of the National Assembly members, which mainly concerns the National Assembly records management plan established in the National Archives. Moreover, this study proposes three kinds of records management methods for the National Assembly members, namely, the legislation and revision of regulations, the records management consulting of the National Assembly members, and the transfer of the dataset of administrative information systems and websites.

A Study on the impact of customer to customer interaction on customer value creation behavior (고객과 고객 간의 상호작용이 고객가치창출행동에 미치는 영향에 대한 연구)

  • Seo, Mun-Sik;Cho, Sang-Hyun
    • Management & Information Systems Review / v.37 no.2 / pp.169-185 / 2018
  • Customers are not merely responders but active value creators. As a result, most research on customer value creation behavior focuses on customer participation behavior and the interaction between service provider and customer. This study sets up a research model to examine the relationships among customer-to-customer (C-to-C) interaction, brand attachment, and customer value creation behavior. The relationships among social support, C-to-C social interaction, similarity, brand attachment, and customer value creation behavior were modeled and used to validate our hypotheses. A path model was verified with structural equation modeling using a dataset from a survey. The results are summarized as follows. First, customer-to-customer interaction factors, namely social support, C-to-C social interaction, and similarity, have effects on brand attachment. This effect was statistically significant, although it was dismissed from hypothesis verification. Second, the structural model shows that brand attachment has a positive effect on customer value creation behavior. The findings suggest that managers need to identify and pay attention to positive customer-to-customer interaction in the service encounter, so that it influences customer brand attachment and customer value creation behavior, which are competitive advantages of a service brand.