• Title/Summary/Keyword: unbalanced data

Search Result 326, Processing Time 0.028 seconds

Accuracy of Phishing Websites Detection Algorithms by Using Three Ranking Techniques

  • Mohammed, Badiea Abdulkarem;Al-Mekhlafi, Zeyad Ghaleb
    • International Journal of Computer Science & Network Security / v.22 no.2 / pp.272-282 / 2022
  • Between 2014 and 2019, the US lost more than 2.1 billion USD to phishing attacks, according to the FBI's Internet Crime Complaint Center, and COVID-19 scam complaints totaled more than 1,200. Detection of phishing websites (PWs) is widely studied in the literature. Earlier methods maintained a centralized, manually updated blacklist, which cannot detect newly created phishing sites. Several recent studies have applied supervised machine learning (SML) algorithms and schemes, including URL extraction-based ones, to the PW detection problem. These studies show that different classification algorithms are more effective on different data sets; however, no widely accepted classifier has been developed for the phishing site detection problem. This study aims to identify the SML features and schemes that work best against PWs across all publicly available phishing data sets. Eight widely used classification algorithms from the Scikit Learn library were configured and evaluated on three public phishing datasets, and their accuracies were compared for statistically significant differences using the Welch t-test. Ensembles and neural networks outperformed classical algorithms in this study. The results are reported in terms of classification accuracy and classifier ranking, as shown in Tables 4 and 8. On severely unbalanced datasets, some classifiers obtained higher than 99.0 percent classification accuracy. Finally, the results show that this approach could also be adapted and that it outperforms conventional techniques with good precision.
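
The Welch t-test mentioned in this abstract compares two samples without assuming equal variances. A minimal sketch follows, assuming per-fold accuracies are available for two classifiers; the function name `welch_t` and the accuracy values are hypothetical illustrations and are not taken from the paper.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Return (t, df) for Welch's unequal-variance t-test on two samples."""
    # Squared standard error of each sample mean (sample variance / n).
    va = variance(a) / len(a)
    vb = variance(b) / len(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Hypothetical per-fold accuracies for two classifiers (not from the paper).
acc_ensemble = [0.97, 0.96, 0.98, 0.97, 0.96]
acc_classical = [0.91, 0.93, 0.90, 0.92, 0.94]

t, df = welch_t(acc_ensemble, acc_classical)
print(f"t = {t:.2f}, df = {df:.1f}")
```

A large absolute t with the computed degrees of freedom suggests that the accuracy gap is unlikely to be due to fold-to-fold noise alone; a p-value would come from the t distribution with `df` degrees of freedom.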

CEO Overseas Experience and Firm Internationalization: Before and After the Global Financial Crisis

  • Kim, Jiyoon;Park, Jong-Hun;Kim, Changsu
    • Journal of Korea Trade / v.24 no.7 / pp.54-72 / 2020
  • Purpose - This study explores the contextual factors that affect the relationship between CEO overseas experience and firm internationalization. It incorporates a wide range of contextual factors, including mega, macro, and micro variables. In particular, it goes a step further than prior studies by incorporating a higher-order variable, i.e., the global financial crisis, that can constrain the managerial discretion of a CEO. Design/methodology - To structure the balanced data set before and after the 2008 global financial crisis, we used data for the years 2002 to 2014 from a sample of Korean manufacturing firms. Ultimately, 1,101 firm-year unbalanced panel observations from 101 firms were used for the analysis. Findings - Our main findings can be summarized as follows. CEO overseas experience is positively related to firm internationalization. However, this relationship varies depending on the CEO's level of managerial discretion. As for the constraining moderation, the global financial crisis weakened the positive relationship between CEO overseas experience and firm internationalization. As for the enabling moderation, the CEO's tenure strengthened the relationship. Originality/value - This study adopted the knowledge, skills, and abilities (KSA) framework to explain the relationship between CEO overseas experience and firm internationalization. Moreover, we argue that the CEO-internationalization relationship depends on the specific context of managerial discretion, focusing on the 2008 global financial crisis. Empirically, this study adopted the 2SLS procedure to correct for endogeneity. Instead of taking the actual value of prior internationalization as a control, we estimated prior internationalization using instrumental variables at the industry level. This procedure made our estimation more robust.

Dietary Behavior of Students in the Busan Area as Determined Using the Nutritional and Dietary Diagnostic System (어린이 식생활스크리닝(DST)을 이용한 부산지역 초등학생의 식행동 및 영양상태 평가)

  • Jin-seon Song;Youngshin Han;Kyung A Lee
    • Journal of the Korean Dietetic Association / v.29 no.2 / pp.86-99 / 2023
  • In this study, the authors surveyed the dietary habits of all elementary school students registered with the Busan Metropolitan City Office of Education using an online questionnaire called the Dietary Screening Test (DST). The DST consists of 36 items divided into 5 factors: life rhythm, meal quality, eating development, eating temperament characteristics, and eating habit characteristics. Data were collected from 153,017 students attending 304 schools in Busan, and the responses of 4,020 were included in the analysis. The study was undertaken to document growth and development and to diagnose nutrition and dietary problems, providing basic data for the development of customized nutrition education and counseling programs. Results showed that 13.5% and 14.3% of participants were classified as overweight or as requiring weight management for obesity, respectively; 6.7% were underweight. Additionally, 37.0% and 9.5% of children required parental attention regarding bedtime and sleeping hours, respectively, and 14.2% ate too quickly or too slowly. Furthermore, food group consumption was unbalanced: 25.0% and 64.4% of participants ate grains and protein less than twice a day, respectively, and 72.3% and 74.5% ate kimchi and vegetables less than twice a day, respectively. In contrast, 28.8% of respondents consumed sweet snacks daily or 5-6 times weekly. These findings highlight the need for a standardized school nutrition counseling manual and individually customized nutrition counseling programs to address the nutrition and dietary problems of elementary school students in Busan.

Efficient Sign Language Recognition and Classification Using African Buffalo Optimization Using Support Vector Machine System

  • Karthikeyan M. P.;Vu Cao Lam;Dac-Nhuong Le
    • International Journal of Computer Science & Network Security / v.24 no.6 / pp.8-16 / 2024
  • Communication with the deaf has always been crucial. Deaf and hard-of-hearing persons can now express their thoughts and opinions to teachers through sign language, which has become a universal language and a very effective tool. This helps to improve their education and simplifies the interaction between them and their teachers. Sign language uses various bodily movements, including those of the arms, legs, and face. Pure expressiveness, proximity, and shared interests are examples of nonverbal physical communication that is distinct from gestures that convey a particular message. The meanings of gestures vary with social or cultural background and are quite unique. Sign language recognition is a highly popular area of ongoing research, and the SVM has shown value in it. Research in a number of fields where SVMs struggle has encouraged the development of numerous variants, such as SVMs for enormous data sets, SVMs for multi-class classification, and SVMs for unbalanced data sets. Without a precise diagnosis of the signs, the right control measures cannot be applied when they are needed. Image processing is one of the methods frequently used for the identification and categorization of sign languages. African Buffalo Optimization using Support Vector Machine (ABO+SVM) classification is used in this work to help identify and categorize people's sign languages. Segmentation by K-means clustering is first used to identify the sign region, after which color and texture features are extracted. The accuracy, sensitivity, precision, specificity, and F1-score of the proposed ABO+SVM system are validated against the existing classifiers SVM, CNN, and PSO+ANN.

Run-time Memory Optimization Algorithm for the DDMB Architecture (DDMB 구조에서의 런타임 메모리 최적화 알고리즘)

  • Cho, Jeong-Hun;Paek, Yun-Heung;Kwon, Soo-Hyun
    • The KIPS Transactions:PartA / v.13A no.5 s.102 / pp.413-420 / 2006
  • Most vendors of digital signal processors (DSPs) support a Harvard architecture, which has two or more memory buses, one for program and one or more for data, and allows the processor to access multiple words of data from memory in a single instruction cycle. We addressed how to efficiently assign data to multiple memory banks in our previous work. This paper reports on our recent attempt to optimize run-time memory. The run-time environment for dual data memory banks (DDMBs) requires two run-time stacks to control the activation records located in the two memory banks for each called procedure. However, the activation records of a procedure in the two memory banks can have different sizes. As a consequence, the dual run-time stacks can become unbalanced whenever a procedure is called. This imbalance can cause the usage of one memory bank to exceed the on-chip memory area even though there is free space in the other memory bank. In this paper, we attempt to balance the dual run-time stacks to utilize on-chip memory more efficiently. The experimental results reveal that although our algorithm is relatively simple, it still utilizes run-time memory efficiently, enabling our compiler to run extremely fast while minimizing the usage of run-time memory in the target code.

An Empirical Study on Debt Financing of Family Firms : Focused on Packing Order Theory (가족기업의 부채조달에 관한 실증연구 : 자본조달순위이론을 중심으로)

  • Jung, Mingeu;Kim, Dongwook;Kim, Byounggon
    • Journal of the Korea Academia-Industrial cooperation Society / v.19 no.3 / pp.337-345 / 2018
  • The purpose of this study is to analyze the relationship between the characteristics of Korean family firms and their debt financing. The analysis period was the 10 years from 2004 to 2013, and the sample consisted of 4,008 non-financial firms listed on the Korea Exchange. For the analysis, unbalanced panel data combining time-series and cross-section data were constructed and analyzed using panel data regression. The results are as follows. First, Korean family firms use relatively less debt than non-family firms. This suggests that family firms, in which the dominant family owns and controls the corporation, are less likely to increase their debt because the agency problem is alleviated and the need for the control effect of debt described by Jensen (1986) is lower. Second, in the test of the pecking order theory using the model proposed by Shyam-Sunder and Myers (1999), family firms comply with the pecking order theory more closely than non-family firms do: when financing is needed, debt is preferred over equity issuance. However, Korean family firms finance only 24.38% of their deficit through the issuance of net debt, which is relatively low compared to the 75% reported by Shyam-Sunder and Myers (1999). These results reveal the limits of the strong claim that Korean family firms follow the pecking order theory.
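
For reference, the Shyam-Sunder and Myers (1999) test cited in this abstract is commonly written as a regression of net debt issued on the financing deficit. The notation below is the common textbook form and is an assumption here; the paper itself may use different symbols or additional controls.

```latex
\Delta D_{it} = a + b_{PO}\,\mathrm{DEF}_{it} + e_{it}
```

Here $\Delta D_{it}$ is the net debt issued by firm $i$ in year $t$ and $\mathrm{DEF}_{it}$ is its financing deficit. The strict form of the theory predicts $a = 0$ and $b_{PO} = 1$. Under this reading, the 24.38% figure in the abstract would correspond to an estimate of $b_{PO} \approx 0.2438$, set against the value of about 0.75 attributed to the original study.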

Study on the analysis of disproportionate data and hypothesis testing (불균형 자료 분석과 가설 검정에 관한 연구)

  • 장석환;송규문;김장한
    • The Korean Journal of Applied Statistics / v.5 no.2 / pp.243-254 / 1992
  • In the present study, two sets of unbalanced two-way cross-classification data, with and without empty cell(s), were used to evaluate empirically the various sums of squares in the analysis of variance table. Searle (1977) and Searle et al. (1981) developed a method of computing $R(\alpha \mid \mu, \beta)$ and $R(\beta \mid \mu, \alpha)$ by use of a partitioned matrix of $X'X$ for the no-interaction model, interchanging the columns of $X$ into the order $\alpha, \mu, \beta$ and, accordingly, the elements of $b$. An alternative way of computing $R(\alpha \mid \mu, \beta)$, $R(\beta \mid \mu, \alpha)$ and $R(\gamma \mid \mu, \alpha, \beta)$ without interchanging the columns of $X$ has been found by means of $(X'X)^-$, derived using $W_2 = Z_2'Z_2 - Z_2'Z_1(Z_1'Z_1)^-Z_1'Z_2$. It is true that $R(\alpha \mid \mu, \beta, \gamma)_\Sigma = SSA_W$ and $R(\beta \mid \mu, \alpha, \gamma)_\Sigma = SSB_W$, where $SSA_W$ and $SSB_W$ are the sums of squares from the weighted squares of means analysis, and that $R(\gamma \mid \mu, \alpha, \beta) = R(\gamma \mid \mu, \alpha, \beta)_\Sigma$ for the data without empty cells, but not for the data with empty cell(s). It is also noted that for the data with empty cells under the W-restrictions, $R(\alpha \mid \mu, \beta, \gamma)_W = R(\mu, \alpha, \beta, \gamma)_W - R(\mu, \beta, \gamma)_W = R(\alpha \mid \mu)$ and $R(\beta \mid \mu, \alpha, \gamma)_W = R(\mu, \alpha, \beta, \gamma)_W - R(\mu, \alpha, \gamma)_W = R(\beta \mid \mu)$, but $R(\gamma \mid \mu, \alpha, \beta)_W = R(\mu, \alpha, \beta, \gamma)_W - R(\mu, \alpha, \beta)_W \neq R(\gamma \mid \mu, \alpha, \beta)$. The commonly tested hypotheses $H_0 : K'b = 0$ were examined in relation to the corresponding sums of squares $R(\alpha \mid \mu)$, $R(\beta \mid \mu)$, $R(\alpha \mid \mu, \beta)$, $R(\beta \mid \mu, \alpha)$, $R(\alpha \mid \mu, \beta, \gamma)$, $R(\beta \mid \mu, \alpha, \gamma)$, and $R(\gamma \mid \mu, \alpha, \beta)$ under the restrictions.
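
The $R(\cdot \mid \cdot)$ terms in this abstract follow the reduction-in-sums-of-squares notation. As a brief reminder, assuming the usual two-way model with interaction (the symbols $y_{ijk}$ and $e_{ijk}$ do not appear in the abstract and are introduced here only for illustration), each term is the difference between the sums of squares explained by a larger and a smaller model:

```latex
\begin{align*}
y_{ijk} &= \mu + \alpha_i + \beta_j + \gamma_{ij} + e_{ijk} \\
R(\alpha \mid \mu, \beta) &= R(\mu, \alpha, \beta) - R(\mu, \beta) \\
R(\beta \mid \mu, \alpha) &= R(\mu, \alpha, \beta) - R(\mu, \alpha) \\
R(\gamma \mid \mu, \alpha, \beta) &= R(\mu, \alpha, \beta, \gamma) - R(\mu, \alpha, \beta)
\end{align*}
```

With balanced data these reductions do not depend on the order in which effects enter the model; with unbalanced data they generally do, which is why the abstract distinguishes, for example, $R(\alpha \mid \mu)$ from $R(\alpha \mid \mu, \beta)$.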

An Energy-Balancing Technique using Spatial Autocorrelation for Wireless Sensor Networks (공간적 자기상관성을 이용한 무선 센서 네트워크 에너지 균등화 기법)

  • Jeong, Hyo-nam;Hwang, Jun
    • Journal of Internet Computing and Services / v.17 no.6 / pp.33-39 / 2016
  • With recent advances in sensor technology, CMOS-based semiconductor devices, and networking protocols, the application areas of wireless sensor networks have greatly expanded and diversified. This diversification creates a multitude of beneficial possibilities for several industries. In wireless sensor network monitoring systems, data are transmitted from the sensor nodes to the sink node over multi-hop paths, and mobile sink techniques have also been applied. However, high energy costs, unbalanced energy consumption among nodes, and time gaps between the measured data values and the actual values have created a need for improvement. Therefore, this study proposes a new model that alleviates these problems. To reduce the communication costs caused by frequent data exchange, a State Prediction Model has been developed to predict the state of peripheral nodes using the geographic autocorrelation of the sensor nodes constituting the wireless sensor network. In addition, a Risk Analysis Model has been developed to quickly alert the monitoring system to any fatal abnormalities when they occur. Simulation results show that errors were smaller when the State Prediction Model was applied than otherwise, and that data transfer latency was reduced when the Risk Analysis Model was applied. The results of this study are expected to be useful for efficient communication methods in wireless sensor network monitoring systems in which all nodes are able to identify their geographic location.

Art transaction using big data Artist analysis system implementation (미술품 거래 빅데이터를 이용한 작가 분석 시스템 구현)

  • SeungKyung Lee;JongTae Lim
    • Journal of Service Research and Studies / v.11 no.2 / pp.79-93 / 2021
  • As of 2018, the size of the domestic art market had increased 21.9% over the past five years to KRW 448.2 billion, and the number of transactions had increased 31.6% to 39,367 pieces, maintaining growth for the fifth consecutive year. Art distribution platforms are diversifying from offline galleries and auctions to online auctions. The art market consists of three areas: production (creation), distribution (trade), and consumption (buying) of works. As works come to be perceived as having economic as well as artistic value, interest in art as a means of investment is also increasing. Consumers who purchase works as a means of investment have an increased need for objective information about those works, but there is a limit to collecting and analyzing objective and reliable statistics because information provision in the distribution area of the art market is closed and unbalanced. This paper identifies the objective and reliable status of art distribution through big data collection and the analysis of structured and unstructured data on the distribution area of the art market. On this basis, we implement a system that can objectively provide an analysis of artists in the current market. This study collected artist information from art distribution sites, and collected and analyzed articles on each artist from Maeil Business, a daily newspaper, to calculate the frequency of associated words per artist. It aims to provide consumers with objective and reliable information.

The Change of Nearshore Processes due to the Development of Coastal Zone (연안역 개발에 따른 해안과정의 변화)

  • Lee, J.W.;Lee, S.J.;Lee, H.;Jeong, D.D.
    • Journal of Korean Port Research / v.13 no.1 / pp.155-166 / 1999
  • The construction of coastal structures and reclamation work reduces circulation in semi-closed inner water areas, and the resulting unbalanced sediment budget of the beach alters the beach topography. Among the various fluid motions in the nearshore zone, water particle motion due to waves and wave-induced currents is the most responsible for sediment movement. It is therefore necessary to predict the effect of the environmental change caused by development, and this in turn requires the prediction of wave transformation. The purpose of this study is to introduce the relation between waves, wave-induced currents, and sediment movement. In this study, we present a numerical method using an energy conservation equation involving refraction, diffraction, and reflection, with a surf-zone energy dissipation term due to wave breaking included in the basic equation. For the wave-induced current, the momentum equation was combined with radiation stresses, lateral mixing, and friction. Various information is required in the prediction of the wave-induced current, depending on the prediction tool. We can predict changes in the wave-induced current from the distribution of waves, especially near the wave breaking zone. To evaluate these quantities, we have to know the local wave conditions, the mean sea level, and so on. The results from the wave field and wave-induced current field deformation models are used as input data for the sediment transport and bottom change model. The numerical models were established by a finite difference method and then applied to the development plan for the eastern Pusan coastal zone, Yeonhwa-ri and Daebyun fishing port. We present the results with 2-D graphics and compare the conditions before and after development.
