• Title/Summary/Keyword: amount of learning


Recognition Method of Korean Abnormal Language for Spam Mail Filtering (스팸메일 필터링을 위한 한글 변칙어 인식 방법)

  • Ahn, Hee-Kook;Han, Uk-Pyo;Shin, Seung-Ho;Yang, Dong-Il;Roh, Hee-Young
    • Journal of Advanced Navigation Technology / v.15 no.2 / pp.287-297 / 2011
  • Electronic mail is widely used because it makes communication convenient and fast, but the volume of malicious and advertising spam has grown with it and causes many social and economic problems. A number of approaches have been proposed to alleviate the impact of spam. These approaches can be categorized into pre-acceptance and post-acceptance methods. Post-acceptance methods include Bayesian filters, collaborative filtering, and e-mail prioritization, all of which operate on words or sentences. Spammers, however, alter those characteristics to evade filtering systems. In Korean, abnormal usages can be far more numerous than in other languages because each syllable is composed of a chosung (initial consonant), jungsung (medial vowel), and jongsung (final consonant). Existing formal expressions and learning algorithms are limited in how promptly and efficiently they can respond to such changes. We therefore present a method for recognizing Korean abnormal language (Koral) to improve the accuracy and efficiency of filtering systems. The method operates on syllables rather than words and is based on the Smith-Waterman algorithm. Experiments on filter keywords and e-mail extracted from a mail server confirmed that Koral is recognized accurately according to the similarity level. The required time and space costs are within permitted limits.
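
The syllable-level matching idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the scoring values, the gap penalty, the partial-match rule, and the function names `decompose`, `syllable_score`, `smith_waterman`, and `similarity` are all assumptions made for illustration. Only the Unicode arithmetic for splitting a precomposed Hangul syllable and the Smith-Waterman recurrence itself are standard.

```python
def decompose(ch):
    """Split a precomposed Hangul syllable into (initial, medial, final) indices.

    Uses the standard Unicode layout: 19 initials x 21 medials x 28 finals
    starting at U+AC00. Non-Hangul characters are returned unchanged.
    """
    code = ord(ch) - 0xAC00
    if not 0 <= code < 11172:
        return (ch,)
    return (code // 588, (code % 588) // 28, code % 28)


def syllable_score(a, b):
    """Hypothetical scoring: 2 for identical syllables, 1 when at least
    two of the three jamo positions agree, -1 otherwise."""
    if a == b:
        return 2
    da, db = decompose(a), decompose(b)
    if len(da) == 3 and len(db) == 3:
        shared = sum(x == y for x, y in zip(da, db))
        if shared >= 2:
            return 1
    return -1


def smith_waterman(s, t, gap=-1):
    """Return the best local alignment score between syllable strings s and t."""
    rows, cols = len(s) + 1, len(t) + 1
    H = [[0] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            H[i][j] = max(
                0,
                H[i - 1][j - 1] + syllable_score(s[i - 1], t[j - 1]),
                H[i - 1][j] + gap,
                H[i][j - 1] + gap,
            )
            best = max(best, H[i][j])
    return best


def similarity(keyword, text):
    """Normalize the alignment score by the keyword's perfect-match score."""
    max_score = 2 * len(keyword)
    return smith_waterman(keyword, text) / max_score if max_score else 0.0
```

Under these assumed scores, a filter could flag a message when `similarity(keyword, text)` exceeds a chosen similarity level; a variant spelling that changes one vowel of a keyword still scores well above an unrelated string.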

Investigating Opinion Mining Performance by Combining Feature Selection Methods with Word Embedding and BOW (Bag-of-Words) (속성선택방법과 워드임베딩 및 BOW (Bag-of-Words)를 결합한 오피니언 마이닝 성과에 관한 연구)

  • Eo, Kyun Sun;Lee, Kun Chang
    • Journal of Digital Convergence / v.17 no.2 / pp.163-170 / 2019
  • Over the past decade, the development of the Web has explosively increased the amount of data. Feature selection is an important step in extracting valuable information from large amounts of data. This study proposes a novel opinion mining model that combines feature selection (FS) methods with Word2vec word embeddings and BOW (bag-of-words). The FS methods adopted for this study are CFS (correlation-based FS) and IG (information gain). To select an optimal FS method, a number of classifiers were compared: LR (logistic regression), NN (neural network), NBN (naive Bayesian network), RF (random forest), RS (random subspace), and ST (stacking). Empirical results with the electronics and kitchen datasets showed that the LR and ST classifiers combined with IG applied to BOW features yield the best performance in opinion mining. Results with the laptop and restaurant datasets revealed that the RF classifier using IG applied to Word2vec features gives the best performance.
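
As a rough illustration of information gain (IG) as a feature selection criterion over bag-of-words features, the sketch below scores a term by how much knowing its presence reduces label entropy. It is a textbook formulation, not the study's pipeline; the function names and the toy documents are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(docs, labels, term):
    """IG of a binary 'term present' feature.

    docs is a list of token sets (a bag-of-words reduced to presence/absence).
    """
    present = [y for d, y in zip(docs, labels) if term in d]
    absent = [y for d, y in zip(docs, labels) if term not in d]
    n = len(labels)
    conditional = sum(len(part) / n * entropy(part) for part in (present, absent) if part)
    return entropy(labels) - conditional


def top_terms(docs, labels, k):
    """Rank the vocabulary by IG and keep the k highest-scoring terms."""
    vocab = set().union(*docs)
    return sorted(vocab, key=lambda t: information_gain(docs, labels, t), reverse=True)[:k]


# Toy review data (hypothetical): 1 = positive, 0 = negative.
docs = [{"good", "battery"}, {"good", "screen"}, {"bad", "battery"}, {"bad", "screen"}]
labels = [1, 1, 0, 0]
```

In this toy data, "good" and "bad" separate the classes perfectly and receive the maximum IG, while "battery" and "screen" carry no information about the label; the selected terms would then be passed to whichever classifier is being evaluated.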

A Development of Flood Mapping Accelerator Based on HEC-softwares (HEC 소프트웨어 기반 홍수범람지도 엑셀러레이터 개발)

  • Kim, JongChun;Hwang, Seokhwan;Jeong, Jongho
    • KSCE Journal of Civil and Environmental Engineering Research / v.44 no.2 / pp.173-182 / 2024
  • Recently, there has been a trend toward using data-driven models that employ artificial intelligence technologies, such as machine learning, for flood prediction. These data-driven models offer the advantage of reusing pre-training results, which significantly reduces the required simulation time. However, pre-training a data-driven model still requires a considerable amount of flood data, while the observed data available for this purpose is often insufficient. As an alternative, validated simulation results from physically-based models are being used as pre-training data alongside observed data. In this context, we developed a flood mapping accelerator to generate flood maps for pre-training. The proposed accelerator automates the entire flood mapping process: estimating flood discharge using HEC-1, calculating water surface levels using HEC-RAS, and simulating channel overflow and generating flood maps using RAS Mapper. With the accelerator, users can easily prepare a pre-training database for data-driven models from hundreds to tens of thousands of rainfall scenarios. It provides various convenient menus through a graphical user interface (GUI), and its practical applicability has been validated across 26 test-beds.

A Mobile Dictionary based on a Prefetching Method (선인출 기반의 모바일 사전)

  • Hong, Soon-Jung;Moon, Yang-Sae;Kim, Hea-Suk;Kim, Jin-Ho;Chung, Young-Jun
    • Journal of KIISE:Software and Applications / v.35 no.3 / pp.197-206 / 2008
  • In the mobile Internet environment, frequent communication between a mobile device and a content server is required for searching or downloading learning materials. In this paper, we propose an efficient prefetching technique to reduce the network cost and improve the communication efficiency of a mobile dictionary. Our prefetching-based approach is as follows. First, we propose an overall framework for the prefetching-based mobile dictionary. Second, we present a systematic way of determining the amount of data to prefetch under both packet-based and flat-rate billing. Third, focusing on an English-Korean mobile dictionary for middle and high school students, we propose an intuitive method of determining the words to be prefetched. Fourth, based on these determination methods, we propose an efficient prefetching algorithm. Fifth, through experiments, we show the superiority of our prefetching-based method. The major contributions can be summarized as follows. First, to the best of our knowledge, this is the first attempt to exploit prefetching techniques in mobile applications. Second, we propose a systematic way of applying prefetching techniques to a mobile dictionary. Third, using prefetching techniques, we improve the overall performance of a network-based mobile dictionary. Experimental results show that, compared with the traditional on-demand approach, our prefetching-based approach improves average performance by 9.8% to 33.2%. These results indicate that our framework can be widely used not only in mobile dictionaries but also in other mobile Internet applications that require prefetching.

An Analysis of Empathy Represented in Students' Group Journal of Integrated English Class Using Literature (문학을 활용한 통합영어수업의 학습자 그룹저널에 나타난 공감성 분석)

  • Choi, Minju;Kim, Jeong-ryeol
    • The Journal of the Korea Contents Association / v.18 no.3 / pp.228-234 / 2018
  • The aim of this study was to analyze the empathy represented in the learners' group journals of an integrated English class using literature. Fifteen high school students participated in the class. The integrated English class using literature was designed to address the fact that most English classes using literature have focused on reading activities. In addition to communicative abilities, the class taught learners' empathy toward the main character of the literary work. To analyze the empathy expressed in the learners' group journals, the integrated English class using literature was conducted in the second period and recorded on video. Empathy was defined on the basis of the community competence mentioned in the 2015 revised curriculum, and learners were asked to write a group journal. The results showed that the learners understood the context of the novel and that their group journals expressed empathy toward its main character. The data on empathy represented in the learners' group journals of the integrated English class using literature are expected to be useful in English classes.

Changes in Chemical and Sensory Properties of Dongchimi during Fermentation (동치미의 발효 중 화학적 및 관능적 성질의 변화)

  • Kang, Kun-Og;Sohn, Hyun-Joo;Kim, Woo-Jung
    • Korean Journal of Food Science and Technology / v.23 no.3 / pp.267-271 / 1991
  • The chemical and organoleptic properties of dongchimi were studied for changes during fermentation at 4 to 35°C in 7% NaCl solution with seasonings. The decrease in pH showed three inflection points in its slope, at about pH 5.2, 4.7, and 4.1, during fermentation. A negative linear relationship between total acidity and the logarithm of pH was obtained. The slope of this relationship increased as the fermentation temperature increased. The reducing sugar content increased until the pH reached 4.0 to 4.2 and then decreased slightly, and its amount was lower as the temperature increased from 4 to 25°C. The major nonvolatile organic acids were lactic and citric acids. The increase in lactic acid was more marked at 25°C than at 4°C. As fermentation proceeded, the fresh radish odor decreased significantly while sourness and yeast-moldy odor increased. The crispness of the radish decreased slightly during fermentation.


Predicting The Direction of The Daily KOSPI Movement Using Neural Networks For ETF Trades (신경회로망을 이용한 일별 KOSPI 이동 방향 예측에 의한 ETF 매매)

  • Hwang, Heesoo
    • Journal of the Korea Convergence Society / v.10 no.4 / pp.1-6 / 2019
  • Neural networks have been used to predict the direction of stock index movement from past data. Conventional research on predicting the upward or downward movement of a stock index predicts a rise or fall even for small changes in the index, so losses are highly likely when ETFs are traded on such predictions. This paper presents a neural network model that predicts the direction of the daily Korea Composite Stock Price Index (KOSPI) movement in order to reduce ETF trading losses and earn more than a certain amount per trade. The proposed model has outputs that represent rising (change rate in index $\geq \alpha$), falling (change rate $\leq -\alpha$), and neutral ($-\alpha <$ change rate $< \alpha$). If the forecast is rising, the leveraged exchange-traded fund (ETF) is bought; if it is falling, the inverse ETF is bought. The hit ratio (HR) of PNN1 implemented in this paper is 0.720 in learning and 0.616 in evaluation. ETF trading yields a return of 8.386% to 16.324%. The proposed models show a better ETF trading success rate and return than neural network models that predict the KOSPI.
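
The three-way target described in the abstract above can be sketched as a simple labeling rule. This is an illustrative assumption about how the classes might be derived from closing prices; the threshold value, the use of a fractional change rate, and the function names are hypothetical and not taken from the paper.

```python
def label_direction(prev_close, close, alpha):
    """Classify a daily move as rising, falling, or neutral.

    alpha is a hypothetical threshold on the fractional change rate
    (for example 0.005 for 0.5%); the paper's actual value is not given here.
    """
    rate = (close - prev_close) / prev_close
    if rate >= alpha:
        return "rising"
    if rate <= -alpha:
        return "falling"
    return "neutral"


def choose_trade(label):
    """Map a forecast label to the trade described in the abstract.

    Returns None for a neutral forecast, meaning no position is opened.
    """
    return {"rising": "leveraged ETF", "falling": "inverse ETF"}.get(label)
```

The neutral band is what distinguishes this setup from a plain up/down classifier: small moves inside the band produce no trade, which is the mechanism the abstract credits for reducing losses.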

Study on the Openness of International Academic Papers by Researchers in Library and Information Science Using POI (Practical Openness Index) (POI(Practical Openness Index)를 활용한 문헌정보학 연구자 국제학술논문의 개방성 연구)

  • Cho, Jane
    • Journal of Korean Library and Information Science Society / v.52 no.2 / pp.25-44 / 2021
  • As the number of OA papers increases, the POI, which indexes how open the research activities of individual researchers are, is drawing attention. This study investigated whether papers published in international academic journals by domestic LIS researchers are OA and by which OA method, and derived the researchers' POI from the results. In addition, by examining the relationship between the POI and each researcher's number of research papers, research subfield, and foreign co-authors, it analyzed whether these factors are relevant to the researcher's POI. The results are as follows. First, there were 492 papers by 82 researchers whose OA status and method could be properly identified through Unpaywall. Second, only 20.7% of papers published in international journals were open access, and almost all of these used the gold or green methods. Third, there were many text mining papers in medical journals, and the papers opened by the green method are available in the institutional repositories of foreign co-authors or in transnational subject repositories such as PMC. Fourth, the POI was relatively higher for researchers in the fields of informetrics and machine learning than for those in other fields. In addition, the presence or absence of overseas co-authors was found to be related to OA.

A Study on Customer Review Rating Recommendation and Prediction through Online Promotional Activity Analysis - Focusing on "S" Company Wearable Products - (온라인 판매촉진활동 분석을 통한 고객 리뷰평점 추천 및 예측에 관한 연구 : S사 Wearable 상품중심으로)

  • Shin, Ho-cheol
    • The Journal of the Korea Contents Association / v.22 no.4 / pp.118-129 / 2022
  • The purpose of this report is to study a strategic model of promotional activities through various analyses and sales forecasting, by selecting wearable products sold by a domestic online company and collecting their sales data. Various algorithms are applied to the data, and the one with the best results is selected as the optimal model. The gradient boosting model, selected as the best performer, takes nine independent variables (promotion type, price, amount, gender, model, company, grade, sales date, and region) when predicting the dependent variable through supervised learning. The review values set as dependent variables for each type of sales promotion were studied in more detail through an ensemble analysis technique, with the main purpose of analyzing and predicting review grades. The model achieved an AUC of 95% and an F1 score of about 93%. Finally, it was confirmed that, among the types of sales promotion activities, value-added benefits affected the number of reviews and the review grades, and that the major variables affected the reviews and review grades.

A Study on Random Selection of Pooling Operations for Regularization and Reduction of Cross Validation (정규화 및 교차검증 횟수 감소를 위한 무작위 풀링 연산 선택에 관한 연구)

  • Ryu, Seo-Hyeon
    • Journal of the Korea Academia-Industrial cooperation Society / v.19 no.4 / pp.161-166 / 2018
  • In this paper, we propose a method for the random selection of pooling operations for the regularization and reduction of cross validation in convolutional neural networks. The pooling operation in convolutional neural networks is used to reduce the size of the feature map and for its shift invariant properties. In the existing pooling method, one pooling operation is applied in each pooling layer. Because this method fixes the convolution network, the network suffers from overfitting, which means that it excessively fits the models to the training samples. In addition, to find the best combination of pooling operations to maximize the performance, cross validation must be performed. To solve these problems, we introduce the probability concept into the pooling layers. The proposed method does not select one pooling operation in each pooling layer. Instead, we randomly select one pooling operation among multiple pooling operations in each pooling region during training, and for testing purposes, we use probabilistic weighting to produce the expected output. The proposed method can be seen as a technique in which many networks are approximately averaged using a different pooling operation in each pooling region. Therefore, this method avoids the overfitting problem, as well as reducing the amount of cross validation. The experimental results show that the proposed method can achieve better generalization performance and reduce the need for cross validation.
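
The train/test behavior described in the abstract above can be sketched in plain Python. This is only an illustration under stated assumptions: the candidate operations (max and average), the selection probabilities, the non-overlapping 2x2 regions, and the names `POOL_OPS` and `stochastic_pool` are hypothetical, and a real implementation would live inside a deep learning framework.

```python
import random

# Candidate pooling operations (an assumed set; the paper may use others).
POOL_OPS = {
    "max": max,
    "avg": lambda xs: sum(xs) / len(xs),
}


def stochastic_pool(feature_map, probs, training, size=2, rng=random):
    """Pool a 2-D feature map over non-overlapping size x size regions.

    Training: pick one operation at random per region, weighted by probs.
    Testing: output the probability-weighted sum of all operations,
    which is the expected value of the training-time output.
    """
    names = list(probs)
    weights = [probs[n] for n in names]
    out = []
    for r in range(0, len(feature_map) - size + 1, size):
        row = []
        for c in range(0, len(feature_map[0]) - size + 1, size):
            region = [feature_map[r + i][c + j] for i in range(size) for j in range(size)]
            if training:
                op = rng.choices(names, weights=weights, k=1)[0]
                row.append(POOL_OPS[op](region))
            else:
                row.append(sum(p * POOL_OPS[n](region) for n, p in probs.items()))
        out.append(row)
    return out


fmap = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
probs = {"max": 0.5, "avg": 0.5}
expected = stochastic_pool(fmap, probs, training=False)
sampled = stochastic_pool(fmap, probs, training=True)
```

Because each region samples its operation independently during training, every forward pass effectively uses a different pooling configuration, which is the sense in which the abstract describes many networks being approximately averaged.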