• Title/Summary/Keyword: Weight Mining


A Study on the CBR Pattern using Similarity and the Euclidean Calculation Pattern (유사도와 유클리디안 계산패턴을 이용한 CBR 패턴연구)

  • Yun, Jong-Chan;Kim, Hak-Chul;Kim, Jong-Jin;Youn, Sung-Dae
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.14 no.4
    • /
    • pp.875-885
    • /
    • 2010
  • CBR (Case-Based Reasoning) is a technique for inferring the relationships between existing data and new case data, and calculating similarity and Euclidean distance is the approach most frequently used for it. However, because these methods compare a case against all of the existing data, they have the drawback that data search and filtering take a long time. Various studies have been conducted to solve this problem. This paper proposes the SE (Speed Euclidean-distance) calculation method, which utilizes patterns discovered in the conventional process of computing similarity and Euclidean distance. Because SE calculation applies the patterns and weights found while new cases are entered, it enables fast data extraction and short operation times, enhancing computing speed under temporal or spatial restrictions and eliminating unnecessary computation. Experiments show that the proposed method improves performance and processing rates in various computing environments more efficiently than existing methods that extract data using similarity or Euclidean distance.
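
The abstract does not reproduce the SE calculation itself; as background, here is a minimal sketch of the conventional weighted Euclidean retrieval over a full case base that SE is designed to speed up (the array layout and the optional feature weights are assumptions, not the paper's method):

```python
import numpy as np

def euclidean_retrieve(case_base, query, weights=None, k=5):
    """Rank stored cases by (optionally weighted) Euclidean distance to a query."""
    if weights is None:
        weights = np.ones(case_base.shape[1])
    d = np.sqrt((((case_base - query) ** 2) * weights).sum(axis=1))
    nearest = np.argsort(d)[:k]          # indices of the k most similar cases
    return nearest, d[nearest]

# Example: 1,000 stored cases with 8 numeric features (random placeholder data)
cases = np.random.rand(1000, 8)
idx, dist = euclidean_retrieve(cases, np.random.rand(8), k=3)
```

The full scan over every stored case is exactly the cost that the paper's pattern-and-weight approach aims to avoid.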

A Recommendation System of Exponentially Weighted Collaborative Filtering for Products in Electronic Commerce (지수적 가중치를 적용한 협력적 상품추천시스템)

  • Lee, Gyeong-Hui;Han, Jeong-Hye;Im, Chun-Seong
    • The KIPS Transactions:PartB
    • /
    • v.8B no.6
    • /
    • pp.625-632
    • /
    • 2001
  • Electronic stores have realized that they need to understand their customers and respond quickly to their wants and needs. To succeed in the increasingly competitive Internet marketplace, recommender systems are adopting data mining techniques. One of the most successful recommender technologies is the collaborative filtering (CF) algorithm, which recommends products to a target customer based on information about other customers, employing statistical techniques to find a set of customers known as neighbors. However, such systems are not well suited to seasonal products that are sensitive to time of year, such as refrigerators or seasonal clothes. In this paper, we propose a new adjusted item-based recommendation algorithm, exponentially weighted collaborative filtering recommendation (EWCFR), which computes item-item similarities for seasonal products. Since collaborative filtering systems need to produce high-quality recommendations quickly for very large-scale problems, we also present a recommendation system with competitive computing time built on a main-memory database (MMDB) in XML.
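
The abstract does not give the exponential weighting formula; below is a minimal sketch of one plausible reading, in which each rating is decayed by exp(-λ·Δt) before item-item cosine similarities are computed (the decay form, the `lam` parameter, and the matrix layout are assumptions):

```python
import numpy as np

def exp_weighted_item_similarity(ratings, times, now, lam=0.01):
    """Item-item cosine similarity over exponentially time-decayed ratings.

    ratings: (n_users, n_items) matrix, 0 where unrated.
    times:   (n_users, n_items) rating timestamps, same units as `now`.
    lam:     decay rate; larger values discount old ratings faster.
    """
    decayed = ratings * np.exp(-lam * (now - times))  # down-weight stale ratings
    norms = np.linalg.norm(decayed, axis=0)
    norms[norms == 0] = 1.0                           # guard items with no ratings
    unit = decayed / norms
    return unit.T @ unit                              # (n_items, n_items) cosines
```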


Effect of Pretreatment of Mine Tailings on the Performance of Controlled Low Strength Materials (저강도 고유동 충전재의 성능에 미치는 광미 전처리의 영향)

  • Tafesse, Million;Kim, Hyeong-Ki
    • Resources Recycling
    • /
    • v.26 no.3
    • /
    • pp.32-38
    • /
    • 2017
  • For the massive recycling of mine tailings, an inorganic by-product of the mining process, in the field of civil engineering, pretreatments to extract heavy metals are required. This study focuses on the use of pretreated tailings as substitute fillers for controlled low-strength material (CLSM). As a comparative study, untreated tailings, microwave-treated tailings, and microwave-treated tailings with magnetic separation were used. Cement contents of 10%, 20%, and 30% by weight of the tailings were designed. Both the compressive strength and the flowability of all mixtures satisfied the requirements of American Concrete Institute (ACI) Committee 229, i.e., a compressive strength of 0.3-8.3 MPa and a flowability greater than 200 mm. Furthermore, all mixtures showed settlement of less than 1% by volume of the mix.
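
The quoted ACI Committee 229 limits amount to a simple acceptance test; a sketch with hypothetical values (the function and the numbers are illustrative, not from the paper):

```python
def meets_aci_229(strength_mpa, flow_mm):
    """Check a CLSM mix against the ACI 229 limits cited in the abstract."""
    return 0.3 <= strength_mpa <= 8.3 and flow_mm > 200  # MPa window, mm flow

# Hypothetical mix with 20% cement by weight of tailings
print(meets_aci_229(strength_mpa=2.1, flow_mm=240))      # True
```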

Numerical analysis and fluid-solid coupling model test of filling-type fracture water inrush and mud gush

  • Li, Li-Ping;Chen, Di-Yang;Li, Shu-Cai;Shi, Shao-Shuai;Zhang, Ming-Guang;Liu, Hong-Liang
    • Geomechanics and Engineering
    • /
    • v.13 no.6
    • /
    • pp.1011-1025
    • /
    • 2017
  • The geological conditions surrounding the Jijiapo Tunnel of the Three Gorges Fanba Highway project in Hubei Province are very complex. In this paper, a 3-D physical model test was carried out to study the evolution of filling-type fracture water inrush and mud gush, based on the conditions of the section located between 16.040 km and 16.042 km of the Jijiapo Tunnel. The model was designed to clarify the effect of the self-weight of the groundwater level and of tunnel excavation during water inrush and mud gush. The displacement, stress, and seepage pressure of the fracture and surrounding rock in the physical model were analyzed. The model test results show that, before water inrush, rock displacement suddenly jumped after sustained growth, while rock stress and rock seepage suddenly decreased after continuous growth. Once water inrush occurred, the internal displacement of the filler increased successively from the bottom up, and the stress and seepage pressure of the filler dropped successively from the bottom up, indicating that water inrush and mud gush in a filling-type fracture is an evolving process that proceeds from the bottom up. A numerical study was compared with the model test to demonstrate the effectiveness and accuracy of the model test results.

A Validation of Effectiveness for Intrusion Detection Events Using TF-IDF (TF-IDF를 이용한 침입탐지이벤트 유효성 검증 기법)

  • Kim, Hyoseok;Kim, Yong-Min
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.28 no.6
    • /
    • pp.1489-1497
    • /
    • 2018
  • Web application services have diversified, and research on intrusion detection continues in response to the surge of cyber threats. At the same time, as single-layer defense systems evolve into multi-level security, specific intrusions are countered by correlating the now vast stream of security events. However, it is difficult to check the OS, service, and web application type and version of a target system in real time, and intrusion detection events raised by network-based security devices cannot confirm either the vulnerability of the target system or the success of the attack; blind spots can therefore arise for threats whose relevance is never analyzed. In this paper, we propose a scheme for validating the effectiveness of intrusion detection events using TF-IDF. The proposed scheme extracts response traffic by mapping each attack to the corresponding response of the target system, splits the response traffic into lines, and assigns each line a TF-IDF weight. Valid intrusion detection events are then identified by sequentially examining the lines with the highest weights.
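
A minimal sketch of the line-weighting step, treating each line of the extracted response traffic as its own document for TF-IDF (the tokenization and the per-line scoring are assumptions; scikit-learn stands in for whatever tooling the authors used):

```python
from sklearn.feature_extraction.text import TfidfVectorizer

def rank_response_lines(response_traffic: str, top_n: int = 5):
    """Rank the lines of a captured response by total TF-IDF weight."""
    lines = [ln for ln in response_traffic.splitlines() if ln.strip()]
    tfidf = TfidfVectorizer().fit_transform(lines)  # one "document" per line
    scores = tfidf.sum(axis=1).A1                   # summed weight per line
    ranked = sorted(zip(scores, lines), reverse=True)
    return ranked[:top_n]                           # highest-weight lines first
```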

Shear behavior of non-persistent joints in concrete and gypsum specimens using combined experimental and numerical approaches

  • Haeri, Hadi;Sarfarazi, V.;Zhu, Zheming;Hokmabadi, N. Nohekhan;Moshrefifar, MR.;Hedayat, A.
    • Structural Engineering and Mechanics
    • /
    • v.69 no.2
    • /
    • pp.221-230
    • /
    • 2019
  • In this paper, the shear behavior of a non-persistent joint surrounded by concrete and gypsum layers has been investigated using experimental tests and numerical simulation. Two types of mixture were prepared for this study. The first consisted of water and gypsum mixed at a water/gypsum ratio of 0.6; in the second, water, sand, and cement were mixed at a ratio of 27%, 33%, and 40% by weight. The shear behavior of a non-persistent joint embedded in these specimens was studied. Physical models consisting of two edge concrete layers with dimensions of 160 mm by 130 mm by 60 mm and one internal gypsum layer with dimensions of 16 mm by 13 mm by 6 mm were made. Two horizontal edge joints were embedded in the concrete beams, and one angled joint was created in the gypsum layer. Analyses were conducted for joint angles of $0^{\circ}$, $30^{\circ}$, and $60^{\circ}$. The central fault was placed in three different positions: along the edge joints, 1.5 cm vertically from the edge joint face, and 3 cm vertically from the edge joint face. All samples were tested in compression using a universal loading machine, the shear load being induced by the specimen geometry. Concurrently with the experiments, the extended finite element method (XFEM) was employed in Abaqus, a finite element software platform, to analyze the fracture processes occurring in a non-persistent joint embedded in concrete and gypsum layers. The failure pattern of the non-persistent cracks (faults) was found to be affected mostly by the central crack and its configuration, and the shear strength was found to be related to the failure pattern. Comparison between the experimental and corresponding numerical results showed good agreement. XFEM proved a capable tool for investigating the fracturing mechanism of rock specimens with a non-persistent joint.

Network Anomaly Traffic Detection Using WGAN-CNN-BiLSTM in Big Data Cloud-Edge Collaborative Computing Environment

  • Yue Wang
    • Journal of Information Processing Systems
    • /
    • v.20 no.3
    • /
    • pp.375-390
    • /
    • 2024
  • Edge computing architecture has effectively alleviated the computing pressure on cloud platforms, reduced network bandwidth consumption, and improved the quality of service of the user experience; however, it has also introduced new security issues. Existing anomaly detection methods in big data scenarios with cloud-edge computing collaboration face several challenges, such as sample imbalance, difficulty in dealing with complex network traffic attacks, and difficulty in effectively training large-scale data or overly complex deep-learning network models. A lightweight deep-learning model is proposed to address these challenges. First, normalization on the user side is used to preprocess the traffic data. On the edge side, a trained Wasserstein generative adversarial network (WGAN) is used to supplement the data samples, which effectively alleviates the imbalance of minority sample classes while occupying only a small amount of edge-computing resources. Finally, a trained lightweight deep-learning network model is deployed on the edge side, and the preprocessed and expanded local data are used to fine-tune it, ensuring that the data of each edge node are more consistent with local characteristics and effectively improving the system's detection ability. In the designed lightweight model, two sets of convolutional and pooling layers of a convolutional neural network (CNN) extract spatial features, a bidirectional long short-term memory network (BiLSTM) captures time-sequence features, and an attention mechanism adjusts the weights of traffic features, improving the model's ability to identify abnormal traffic. The proposed model was evaluated on the NSL-KDD, UNSW-NB15, and CIC-ISD2018 datasets, reaching accuracies of 0.974, 0.925, and 0.953, respectively, superior to the comparison models. The proposed lightweight deep-learning network model has good application prospects for anomaly traffic detection in cloud-edge collaborative computing architectures.
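
The abstract does not give exact layer sizes or hyperparameters; here is a minimal PyTorch sketch of the described CNN → BiLSTM → attention pipeline with assumed dimensions, omitting the WGAN augmentation stage:

```python
import torch
import torch.nn as nn

class CnnBiLstmAttention(nn.Module):
    """Two conv-pool blocks -> BiLSTM -> attention pooling -> classifier."""
    def __init__(self, n_features, n_classes, hidden=64):
        super().__init__()
        self.cnn = nn.Sequential(  # spatial feature extraction
            nn.Conv1d(1, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool1d(2),
            nn.Conv1d(32, 64, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool1d(2),
        )
        self.bilstm = nn.LSTM(64, hidden, batch_first=True, bidirectional=True)
        self.att = nn.Linear(2 * hidden, 1)    # scores each time step
        self.fc = nn.Linear(2 * hidden, n_classes)

    def forward(self, x):                      # x: (batch, n_features)
        x = self.cnn(x.unsqueeze(1))           # (batch, 64, n_features // 4)
        h, _ = self.bilstm(x.transpose(1, 2))  # (batch, steps, 2 * hidden)
        w = torch.softmax(self.att(h), dim=1)  # attention weight per step
        return self.fc((w * h).sum(dim=1))     # weighted context -> class logits

# Smoke test with an NSL-KDD-like 41-feature record (random placeholder input)
model = CnnBiLstmAttention(n_features=41, n_classes=5)
print(model(torch.randn(8, 41)).shape)         # torch.Size([8, 5])
```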

Personalized Recommendation System for IPTV using Ontology and K-medoids (IPTV환경에서 온톨로지와 k-medoids기법을 이용한 개인화 시스템)

  • Yun, Byeong-Dae;Kim, Jong-Woo;Cho, Yong-Seok;Kang, Sang-Gil
    • Journal of Intelligence and Information Systems
    • /
    • v.16 no.3
    • /
    • pp.147-161
    • /
    • 2010
  • As broadcasting and communication have converged, communication has been joined to TV, and TV viewing has changed in many ways. IPTV (Internet Protocol Television) provides information services, movie contents, and broadcasts over the Internet, combining live programs with VOD (Video on Demand), and has become a new business built on communication networks. It has also created new technical issues, such as imaging technology for the service, networking technology without video cuts, and security technologies to protect copyright. Through the IPTV network, users can watch the programs they want whenever they want. However, IPTV makes desired programs hard to find, whether by menu or by search. The menu approach spends a lot of time reaching the desired program; the search approach fails when the title, genre, or actors' names are unknown; and entering letters through the remote control is a problem in itself. The bigger problem, though, is that users are often unaware of the services they could use. Thus, to resolve the difficulty of selecting VOD services in IPTV, a personalized recommendation service is proposed, which enhances user satisfaction and uses viewing time efficiently. This paper provides programs fit to individuals, saving search time, through a filtering and recommendation system. The proposed recommendation system collects TV program information and each individual's IPTV viewing records: preferred program genres and detailed genres, channels, watched programs, and viewing times. To find similarities, programs are compared using a TV program ontology, since the distance between programs can be measured through similarity comparison. The TV program ontology used here is extracted from TV-Anytime metadata, which represents the programs' semantics, and it expresses contents and features numerically. Vocabulary similarity is determined through WordNet: all words describing the programs are expanded into upper and lower classes for the word-similarity decision, and the average over the described keywords is measured. Using the calculated distances as the criterion, similar programs are clustered by the K-medoids partitioning method, which divides objects into groups with similar characteristics. K-medoids selects K representative objects, assigns each object to its nearest representative to form provisional clusters, and, when dividing the initial n objects into K groups, repeatedly updates the representatives until the optimal ones are found; through this process similar programs are clustered together (a minimal sketch follows below). When programs are selected through this cluster analysis, weights are given to the recommendation as follows. Within each cluster, programs near the representative object are recommended to users; the distance formula is the same one used to measure similarity, and it serves as the base figure determining the ranking of recommended programs. A further weight counts the programs in the user's watching list: the more programs a cluster holds, the higher the weight it is given, which is defined as the cluster weight. Through this, representative programs of the clusters are selected and a preliminary ranking of TV programs is determined. However, the cluster-representative programs include errors, so weights for TV program viewing preference are added to determine the final ranking, and contents the user is likely to prefer are recommended on that basis. Experiments carried out in a controlled environment show the superiority of the proposed method compared with existing approaches.
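
A minimal sketch of the K-medoids step over a precomputed program-distance matrix (random initialization and the lack of empty-cluster handling are simplifications; the distances themselves would come from the ontology-based similarity described above):

```python
import numpy as np

def k_medoids(dist, k, n_iter=100, seed=0):
    """PAM-style K-medoids on an (n, n) precomputed distance matrix."""
    rng = np.random.default_rng(seed)
    medoids = rng.choice(dist.shape[0], size=k, replace=False)
    for _ in range(n_iter):
        labels = dist[:, medoids].argmin(axis=1)        # nearest-medoid assignment
        # Within each cluster, the member minimizing total intra-cluster
        # distance becomes the new medoid.
        new = np.array([
            np.flatnonzero(labels == c)[
                dist[np.ix_(labels == c, labels == c)].sum(axis=1).argmin()]
            for c in range(k)])
        if np.array_equal(new, medoids):                # converged
            break
        medoids = new
    return medoids, labels
```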

Vacuum Pressure Effect on Thermal Conductivity of KLS-1 (진공압에 따른 한국형 인공월면토(KLS-1)의 열전도도 평가)

  • Jin, Hyunwoo;Lee, Jangguen;Ryu, Byung Hyun;Shin, Hyu-Soung;Chung, Taeil
    • Journal of the Korean Geotechnical Society
    • /
    • v.37 no.8
    • /
    • pp.51-58
    • /
    • 2021
  • South Korea, the 10th country to join the NASA-led Artemis program, is actively supporting various research related to lunar exploration. In particular, the use of water as a resource on the Moon has drawn attention since it was discovered that ice exists at the lunar poles in the form of frozen soil. Information on the thermal conductivity of lunar regolith can be used to assess the prospects for extracting water ice by thermal mining. In this study, the effect of vacuum pressure on the thermal conductivity of KLS-1 was investigated with a DTVC (Dusty Thermal Vacuum Chamber). The reliability of KLS-1 was reconfirmed through comparison with the thermal conductivity of known standard lunar regolith simulants such as JSC-1A. An empirical equation for assessing thermal conductivity in terms of dry unit weight and vacuum pressure was proposed. The results of this study can be used to simulate the lunar cryogenic environment in the DTVC.
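
The proposed empirical equation is not reproduced in the abstract, so the functional form below is purely hypothetical; the sketch only illustrates fitting a thermal-conductivity relation in dry unit weight and vacuum pressure (none of the numbers are from the paper):

```python
import numpy as np
from scipy.optimize import curve_fit

def k_model(X, a, b, c):
    """Hypothetical form: k = a * gamma_d + b * log10(p) + c."""
    gamma_d, p = X                      # dry unit weight (kN/m^3), pressure (Pa)
    return a * gamma_d + b * np.log10(p) + c

# Placeholder measurements (illustrative only)
gamma_d = np.array([13.0, 14.5, 16.0, 13.0, 14.5, 16.0])
p       = np.array([1e5,  1e5,  1e5,  1e0,  1e0,  1e0])
k_meas  = np.array([0.20, 0.24, 0.28, 0.012, 0.015, 0.018])  # W/(m K)

params, _ = curve_fit(k_model, (gamma_d, p), k_meas)
print(dict(zip("abc", params.round(4))))
```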

Subject-Balanced Intelligent Text Summarization Scheme (주제 균형 지능형 텍스트 요약 기법)

  • Yun, Yeoil;Ko, Eunjung;Kim, Namgyu
    • Journal of Intelligence and Information Systems
    • /
    • v.25 no.2
    • /
    • pp.141-166
    • /
    • 2019
  • Recently, channels like social media and SNS create enormous amounts of data, and the portion of unstructured data represented as text has increased geometrically. Since it is difficult to read all of this text, it is important to access the data rapidly and grasp the key points; owing to this need for efficient understanding, many studies on text summarization for handling and using tremendous amounts of text data have been proposed. In particular, many summarization methods using machine learning and artificial intelligence algorithms have been proposed lately to generate summaries objectively and effectively, so-called "automatic summarization". However, almost all text summarization methods proposed to date construct summaries focused on the most frequent contents of the original documents. Such summaries are limited in their coverage of small-weight subjects that are mentioned less in the original text. If a summary includes only the major subjects, bias occurs and information is lost, so it becomes hard to ascertain every subject a document has. To avoid this bias, one can summarize with balance across the topics a document has so that every subject can be ascertained, but an unbalanced distribution between those subjects still remains. To retain the balance of subjects in a summary, it is necessary to consider the proportion of every subject the documents originally have and also to allocate portions to subjects equally, so that even sentences of minor subjects are included in the summary sufficiently. In this study, we propose a "subject-balanced" text summarization method that procures balance between all subjects and minimizes the omission of low-frequency subjects. For subject-balanced summaries we use two summary-evaluation concepts, "completeness" and "succinctness": completeness is the property that a summary should fully include the contents of the original documents, and succinctness means the summary has minimal duplication within itself. The proposed method has three phases (two of which are sketched below). The first phase constructs the subject-term dictionaries. Topic modeling is used to calculate topic-term weights, which indicate the degree to which each term is related to each topic. From the derived weights, the terms highly related to each topic can be identified, and the subjects of the documents emerge from topics composed of terms with similar meanings. A few terms that represent each subject well are then selected; we call these "seed terms". These terms alone are too few to explain each subject, however, so enough terms similar to the seed terms are needed for a well-constructed subject dictionary. Word2Vec is used for word expansion, finding terms similar to the seed terms: word vectors are created by Word2Vec modeling, and from those vectors the similarity between any two terms is derived by cosine similarity; the higher the cosine similarity between two terms, the stronger the relationship between them. Terms with high similarity values to each subject's seed terms are selected, and after filtering the expanded terms, the subject dictionary is finally constructed. The next phase allocates a subject to every sentence of the original documents. To grasp the contents of all sentences, frequency analysis is first conducted with the specific terms composing the subject dictionaries. TF-IDF weights for each subject are then calculated, making it possible to measure how much each sentence explains each subject. However, TF-IDF weights can grow without bound, so the per-subject TF-IDF weights of every sentence are normalized to values between 0 and 1. Each sentence is then allocated to the subject with the maximum normalized TF-IDF weight, and sentence groups are finally constructed for each subject. The last phase is summary generation. Sen2Vec is used to compute the similarity between subject sentences, from which a similarity matrix is formed; by repeatedly selecting sentences, it is possible to generate a summary that fully includes the contents of the original documents while minimizing duplication within the summary itself. For evaluation of the proposed method, 50,000 TripAdvisor reviews were used to construct the subject dictionaries and 23,087 reviews were used to generate summaries. A comparison between the proposed method's summaries and frequency-based summaries verified that summaries from the proposed method better retain the balance of all the subjects the documents originally have.
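
A minimal sketch of two of the described steps, seed-term expansion via Word2Vec cosine similarity and normalized TF-IDF subject allocation (the toy corpus, `topn`, and the max-normalization are assumptions):

```python
import numpy as np
from gensim.models import Word2Vec

# Phase 1 (sketch): expand seed terms into a subject dictionary.
corpus = [["room", "bed", "clean", "pillow"],
          ["staff", "friendly", "helpful", "reception"],
          ["breakfast", "coffee", "buffet", "eggs"]]   # toy tokenized reviews
w2v = Word2Vec(corpus, vector_size=50, min_count=1, seed=0)

def expand(seed_terms, topn=3):
    """Add the terms most cosine-similar to each seed term."""
    out = set(seed_terms)
    for t in seed_terms:
        out |= {w for w, _ in w2v.wv.most_similar(t, topn=topn)}
    return out

# Phase 2 (sketch): allocate each sentence to its highest-weight subject.
def allocate(sent_tfidf):
    """sent_tfidf: (n_sentences, n_subjects) summed TF-IDF per subject."""
    peak = np.maximum(sent_tfidf.max(axis=0, keepdims=True), 1e-12)
    return (sent_tfidf / peak).argmax(axis=1)   # normalize to [0, 1], then pick

print(expand(["room"]))
```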