• Title/Abstract/Keywords: Algorithm Development

7,046 search results (processing time: 0.033 seconds)

User-Perspective Issue Clustering Using Multi-Layered Two-Mode Network Analysis (다계층 이원 네트워크를 활용한 사용자 관점의 이슈 클러스터링)

  • Kim, Jieun;Kim, Namgyu;Cho, Yoonho
    • Journal of Intelligence and Information Systems
    • /
    • Vol. 20, No. 2
    • /
    • pp.93-107
    • /
    • 2014
  • In this paper, we report what we have observed with regard to user-perspective issue clustering based on multi-layered two-mode network analysis. This work is significant in the context of data collection by companies about customer needs. Most companies have failed to uncover such needs for products or services properly in terms of demographic data such as age, income levels, and purchase history. Because of excessive reliance on limited internal data, most recommendation systems do not provide decision makers with appropriate business information for current business circumstances. However, part of the problem is the increasing regulation of personal data gathering and privacy. This makes demographic or transaction data collection more difficult, and is a significant hurdle for traditional recommendation approaches because these systems demand a great deal of personal data or transaction logs. Our motivation for presenting this paper to academia is our strong belief, and evidence, that most customers' requirements for products can be effectively and efficiently analyzed from unstructured textual data such as Internet news text. In order to derive users' requirements from textual data obtained online, the proposed approach in this paper attempts to construct double two-mode networks, such as a user-news network and news-issue network, and to integrate these into one quasi-network as the input for issue clustering. One of the contributions of this research is the development of a methodology utilizing enormous amounts of unstructured textual data for user-oriented issue clustering by leveraging existing text mining and social network analysis. In order to build multi-layered two-mode networks of news logs, we need some tools such as text mining and topic analysis. We used not only SAS Enterprise Miner 12.1, which provides a text miner module and cluster module for textual data analysis, but also NetMiner 4 for network visualization and analysis. 
Our approach for user-perspective issue clustering is composed of six main phases: crawling, topic analysis, access pattern analysis, network merging, network conversion, and clustering. In the first phase, we collect visit logs for news sites with a crawler. After the unstructured news article data are gathered, the topic analysis phase extracts issues from each news article in order to build an article-issue network. For simplicity, 100 topics are extracted from 13,652 articles. In the third phase, a user-article network is constructed from access patterns derived from web transaction logs. The two two-mode networks are then merged into a user-issue quasi-network. Finally, in the user-oriented issue-clustering phase, we classify issues through structural equivalence and compare the clustering results from statistical tools with those from network analysis. An experiment with a large dataset was performed to build the multi-layered two-mode network, and the issue-clustering results from SAS were then compared with those from network analysis. The experimental dataset came from a website ranking site and the biggest portal site in Korea. The sample dataset contains 150 million transaction logs and 13,652 news articles from 5,000 panelists over one year. User-article and article-issue networks are constructed and merged into a user-issue quasi-network using NetMiner. Our issue-clustering results, obtained with the Partitioning Around Medoids (PAM) algorithm and Multidimensional Scaling (MDS), are consistent with the results from SAS clustering. In spite of extensive efforts to provide user information with recommendation systems, most projects are successful only when companies have sufficient data about users and transactions. Our proposed methodology, user-perspective issue clustering, can provide practical support to decision-making in companies because it enhances user-related data from unstructured textual data. 
To overcome the problem of insufficient data from traditional approaches, our methodology infers customers' real interests by utilizing web transaction logs. In addition, we suggest topic analysis and issue clustering as a practical means of issue identification.
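The network-merging phase described in this abstract can be illustrated with a small sketch. It assumes that the user-article and article-issue networks are stored as weighted adjacency dictionaries and that merging means composing the two (equivalent to multiplying the two incidence matrices). The abstract does not state the exact merging rule used in NetMiner, so the function name `merge_two_mode` and the sample data are hypothetical.

```python
def merge_two_mode(user_article, article_issue):
    """Compose user->article and article->issue weights into user->issue weights.

    Assumption: the merged weight is the sum over shared articles of
    (visit count) * (issue weight), i.e. a matrix product of the two networks.
    """
    user_issue = {}
    for user, articles in user_article.items():
        row = {}
        for article, visits in articles.items():
            # Articles with no extracted issue contribute nothing.
            for issue, weight in article_issue.get(article, {}).items():
                row[issue] = row.get(issue, 0.0) + visits * weight
        user_issue[user] = row
    return user_issue


# Hypothetical toy data: visit counts per user and topic weights per article.
user_article = {"u1": {"a1": 2, "a2": 1}, "u2": {"a2": 3}}
article_issue = {"a1": {"economy": 1.0}, "a2": {"economy": 0.5, "sports": 0.5}}

user_issue = merge_two_mode(user_article, article_issue)
```

Each row of `user_issue` could then serve as the profile that a clustering step such as PAM compares across issues.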

Measurement and Quality Control of MIROS Wave Radar Data at Dokdo (독도 MIROS Wave Radar를 이용한 파랑관측 및 품질관리)

  • Jun, Hyunjung;Min, Yongchim;Jeong, Jin-Yong;Do, Kideok
    • Journal of Korean Society of Coastal and Ocean Engineers
    • /
    • Vol. 32, No. 2
    • /
    • pp.135-145
    • /
    • 2020
  • Wave observation methods are broadly divided into direct methods, which measure the water surface elevation with a wave buoy or pressure gauge, and remote-sensing methods. Wave buoys and pressure gauges can produce high-quality wave data, but in offshore areas they carry a high risk of instrument damage and loss and high maintenance costs. Remote observation methods such as radar, on the other hand, are easy to maintain because the equipment is installed on land, but their accuracy is somewhat lower than that of direct observation. This study investigates the data quality of the MIROS Wave and Current Radar (MWR) installed at Dokdo and improves the quality of the remote wave observation data using observations from the wave buoy (CWB) operated by the Korea Meteorological Administration. We developed and applied three types of wave data quality control: 1) the combined use (Optimal Filter) of the filters designed by MIROS (Reduce Noise Frequency, Phillips Check, Energy Level Check), 2) the Spike Test Algorithm (Spike Test) developed by the OOI (Ocean Observatories Initiative), and 3) a new filter (H-Ts QC) based on the relationship between significant wave height and period. As a result, the MWR wave observation data processed with the three quality controls are reasonably reliable for significant wave height. However, errors remain in the significant wave period, so improvements are required. In addition, because the MWR data differ somewhat from the CWB data for high waves of over 3 m, further research is necessary, such as the collection and analysis of long-term remote wave observation data and filter development.

Development of Convertor supporting Multi-languages for Mobile Network (무선전용 다중 언어의 번역을 지원하는 변환기의 구현)

  • Choe, Ji-Won;Kim, Gi-Cheon
    • The KIPS Transactions:PartC
    • /
    • Vol. 9C, No. 2
    • /
    • pp.293-296
    • /
    • 2002
  • UP Link is one of the commercial products that convert HTML to HDML in order to show Internet WWW content in mobile environments. When the UP browser accesses HTML pages, the agent in UP Link controls the converter to change the HTML to HDML. I-Mode, developed by NTT-Docomo of Japan, offers a large amount of content through its long and stable commercial service. Micro Explorer (ME), developed by the Stinger project, also has many additional functions. In this paper, we designed and implemented a WAP converter that can accept C-HTML and mHTML content. The C-HTML format of I-Mode is a subset of HTML, and the mHTML format of ME is similar to C-HTML, so content providers can develop C-HTML content more easily than WAP or other content. Since C-HTML, mHTML, and WML are all used in mobile environments, their limits on the transmission capacity of one page are also similar. We first make a match table and then apply the conversion algorithm to it. If no matching element can be found, we arrange tags that are supported only by WML so that the content is displayed in the best possible shape. As a result, we can convert over 90% of the content.

Development of the Risk Evaluation Model for Rear End Collision on the Basis of Microscopic Driving Behaviors (미시적 주행행태를 반영한 후미추돌위험 평가모형 개발)

  • Chung, Sung-Bong;Song, Ki-Han;Park, Chang-Ho;Chon, Kyung-Soo;Kho, Seung-Young
    • Journal of Korean Society of Transportation
    • /
    • Vol. 22, No. 6
    • /
    • pp.133-144
    • /
    • 2004
  • A model and a measure that can evaluate the risk of rear-end collision are developed. Most traffic accidents involve multiple causes at any given time, such as the human factor, the vehicle factor, and the highway element. Thus, these factors should be considered in analyzing the risk of an accident and in developing safety models. Although most risky situations and accidents on the roads result from a driver's poor response to various stimuli, many researchers have modeled risk or accidents by analyzing only the stimuli, without considering the driver's response. Hence, the reliability of those models turned out to be low. In developing this model, therefore, driver behaviors such as reaction time and deceleration rate are considered. In the past, most studies tried to analyze the relationship between risk and accidents directly, but because of the difficulty of finding the directional relationships between these factors, they developed models based on indirect factors such as volume and speed. However, if the relationship between risk and accidents is examined in detail, it can be seen that the two are linked by driver behavior: depending on the driver, the risk present in the road-vehicle system may be ignored or may draw the driver's attention. An accident therefore depends on how a driver handles risk, so the risk most closely related to accident occurrence is not the risk itself but the risk as responded to by the driver. Thus, in this study, driver behaviors are considered in the model, and three concepts related to accidents are introduced to reflect these behaviors. In addition, safe stopping distance and accident occurrence probability are used for better understanding and more reliable modeling of the risk. 
An index that can represent the risk is also developed, based on measures used in evaluating noise levels. For risk comparison between various situations, an equivalent risk level that considers intensity and duration is developed by means of a weighted average. Validation was performed with field surveys on an expressway in Seoul, using a test vehicle built to collect traffic flow data such as deceleration rate, speed, and spacing. Based on these data, the risk by section, lane, and traffic flow condition was evaluated and compared with accident data and traffic conditions. The evaluated risk level corresponds closely to the patterns of actual traffic conditions and accident counts. The model and method developed in this study can be applied to various fields, such as safety testing of traffic flow, establishment of operation and management strategies for reliable traffic flow, and safety testing of control algorithms in advanced safety vehicles.
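The role of reaction time and deceleration rate in a safe stopping distance can be sketched with the standard kinematic relation d = v·t + v²/(2a). This is a textbook formula and not necessarily the exact formulation of the paper, which the abstract does not give. The function names and sample values are hypothetical.

```python
def stopping_distance(speed_mps, reaction_time_s, decel_mps2):
    """Distance travelled during the reaction time plus the braking distance.

    Textbook kinematics: d = v * t_r + v**2 / (2 * a).
    """
    if decel_mps2 <= 0:
        raise ValueError("deceleration rate must be positive")
    reaction_distance = speed_mps * reaction_time_s
    braking_distance = speed_mps ** 2 / (2 * decel_mps2)
    return reaction_distance + braking_distance


def spacing_is_safe(spacing_m, speed_mps, reaction_time_s, decel_mps2):
    """Assumption: spacing is 'safe' if it covers the follower's stopping distance
    when the lead vehicle stops instantly (a deliberately conservative simplification)."""
    return spacing_m >= stopping_distance(speed_mps, reaction_time_s, decel_mps2)


# Hypothetical driver: 20 m/s, 1.0 s reaction time, 5 m/s^2 deceleration.
distance = stopping_distance(20.0, 1.0, 5.0)
```

A slower reaction or a gentler deceleration increases the required distance, which is one way the same spacing and speed can imply different risk for different drivers.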

Recent Progress in Air Conditioning and Refrigeration Research: A Review of Papers Published in the Korean Journal of Air-Conditioning and Refrigeration Engineering in 2006 (공기조화, 냉동 분야의 최근 연구 동향: 2006년 학회지 논문에 대한 종합적 고찰)

  • Han, Hwa-Taik;Shin, Dong-Sin;Choi, Chang-Ho;Lee, Dae-Young;Kim, Seo-Young;Kwon, Yong-Il
    • Korean Journal of Air-Conditioning and Refrigeration Engineering
    • /
    • Vol. 20, No. 6
    • /
    • pp.427-446
    • /
    • 2008
  • A review on the papers published in the Korean Journal of Air-Conditioning and Refrigeration Engineering in 2006 has been accomplished. Focus has been put on current status of research in the aspect of heating, cooling, ventilation, sanitation and building environments. The conclusions are as follows. (1) The research trends of fluid engineering have been surveyed as groups of general fluid flow, fluid machinery and piping, etc. New research topics include micro heat exchanger and siphon cooling device using nano-fluid. Traditional CFD and flow visualization methods were still popular and widely used in research and development. Studies about diffusers and compressors were performed in fluid machinery. Characteristics of flow and heat transfer and piping optimization were studied in piping systems. (2) The papers on heat transfer have been categorized into heat transfer characteristics, heat exchangers, heat pipes, and two-phase heat transfer. The topics on heat transfer characteristics in general include thermal transport in a cryo-chamber, a LCD panel, a dryer, and heat generating electronics. Heat exchangers investigated include pin-tube type, plate type, ventilation air-to-air type, and heat transfer enhancing tubes. The research on a reversible loop heat pipe, the influence of NCG charging mass on heat transport capacity, and the chilling start-up characteristics in a heat pipe were reported. In two-phase heat transfer area, the studies on frost growth, ice slurry formation and liquid spray cooling were presented. The studies on the boiling of R-290 and the application of carbon nanotubes to enhance boiling were noticeable in this research area. (3) Many studies on refrigeration and air conditioning systems were presented on the practical issues of the performance and reliability enhancement. The air conditioning system with multi indoor units caught attention in several research works. 
The issues of refrigerant charge and control algorithms were treated. Systems with alternative refrigerants were also studied. Carbon dioxide, hydrocarbons, and their mixtures were considered, and heat transfer correlations were proposed. (4) Due to high oil prices, energy consumption has drawn attention in mechanical building systems. Research in this field has been reviewed by grouping it into research on heat and cold sources, air conditioning and cleaning research, ventilation and fire research including tunnel ventilation, and piping system research. The papers involve promoting the efficient or effective use of energy, which helps save energy and reduces environmental pollution and operating costs. (5) Studies on indoor air quality accounted for a large portion of the field of building environments. Various other subjects, such as indoor thermal comfort, were also investigated through computer simulation, case studies, and field experiments. Studies on energy include not only optimization studies and economic analyses of building equipment but also the usability of renewable energy in geothermal and solar systems.

Development of Cloud and Shadow Detection Algorithm for Periodic Composite of Sentinel-2A/B Satellite Images (Sentinel-2A/B 위성영상의 주기합성을 위한 구름 및 구름 그림자 탐지 기법 개발)

  • Kim, Sun-Hwa;Eun, Jeong
    • Korean Journal of Remote Sensing
    • /
    • Vol. 37, No. 5_1
    • /
    • pp.989-998
    • /
    • 2021
  • In the utilization of optical satellite imagery, which is greatly affected by clouds, periodic compositing is a useful method for minimizing the influence of clouds. Recently, a technique has been proposed that selects the optimal pixel least affected by cloud and shadow during a certain period by directly inputting cloud and cloud shadow information during periodic compositing. Accurate extraction of clouds and cloud shadows is essential in order to derive optimal composite results. Also, for surface targets where spectral information is important, such as crops, the loss of spectral information should be minimized during cloud-free compositing. In this study, two spectral indicators (Haze Optimized Transformation, HOT, and MeanVis) were used to derive a detection technique with low loss of spectral information while maintaining high detection accuracy for clouds and cloud shadows over cabbage fields in the highlands of Gangwon-do. These detection results were compared and analyzed with the cloud and cloud shadow information provided with Sentinel-2A/B. In an analysis of data from 2019 to 2021, the cloud information from the Sentinel-2A/B satellites showed a detection accuracy with an F1 value of 0.91, but bright artifacts were falsely detected as clouds. On the other hand, the cloud detection result obtained by applying a threshold (= 0.05) to the HOT showed relatively low detection accuracy (F1 = 0.72), but the loss of spectral information was minimized because of the small number of false positives. In the case of cloud shadows, only minimal shadows were detected in the Sentinel-2A/B additional layer, but when a threshold (= 0.015) was applied to MeanVis, cloud shadows that could be distinguished from topographically generated shadows could be detected. 
By inputting the spectral indicator-based cloud and shadow information, stable monthly cloud-free composite vegetation index results were obtained. In the future, the high-accuracy cloud information of Sentinel-2A/B will be input to periodic cloud-free compositing for comparison.
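The thresholding step described in this abstract can be sketched as follows. The two threshold values come from the abstract. Everything else is an assumption for illustration: HOT and MeanVis are taken as already computed per pixel, cloud is flagged when HOT exceeds its threshold, shadow is flagged when MeanVis falls below its threshold, and the function names are hypothetical.

```python
HOT_CLOUD_THRESHOLD = 0.05        # threshold applied to HOT in the abstract
MEANVIS_SHADOW_THRESHOLD = 0.015  # threshold applied to MeanVis in the abstract


def classify_pixel(hot, mean_vis):
    """Label one observation as 'cloud', 'shadow', or 'clear'.

    Assumed directions: clouds raise HOT, shadows darken the visible bands.
    """
    if hot > HOT_CLOUD_THRESHOLD:
        return "cloud"
    if mean_vis < MEANVIS_SHADOW_THRESHOLD:
        return "shadow"
    return "clear"


def clear_observations(series):
    """Keep only the observations of a pixel that a compositing step may use."""
    return [obs for obs in series
            if classify_pixel(obs["hot"], obs["mean_vis"]) == "clear"]


# Hypothetical one-month series for a single pixel.
series = [
    {"date": "d1", "hot": 0.08, "mean_vis": 0.20},   # cloudy
    {"date": "d2", "hot": 0.01, "mean_vis": 0.010},  # shadowed
    {"date": "d3", "hot": 0.02, "mean_vis": 0.060},  # usable
]
usable = clear_observations(series)
```

A periodic composite would then choose among the `usable` observations, for example the one with the highest vegetation index.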

Development of a Feasibility Evaluation Model for Apartment Remodeling with the Number of Households Increasing at the Preliminary Stage (노후공동주택 세대수증가형 리모델링 사업의 기획단계 사업성평가 모델 개발)

  • Koh, Won-kyung;Yoon, Jong-sik;Yu, Il-han;Shin, Dong-woo;Jung, Dae-woon
    • Korean Journal of Construction Engineering and Management
    • /
    • Vol. 20, No. 4
    • /
    • pp.22-33
    • /
    • 2019
  • The government has steadily revised and developed laws and systems to promote the remodeling of apartments in response to the problems of aged apartments. However, despite such efforts, remodeling has yet to become active. Among the many reasons, this study noted that there were no tools for reasonable profitability judgments and decision making in the preliminary stage of a remodeling project. Generally, profitability judgments are made after the conceptual design, but decisions to pursue remodeling projects are made at the preliminary stage, so a feasibility evaluation model is required at that stage. Accordingly, in this study, a feasibility evaluation model was developed for determining preliminary-stage profitability. Construction costs, business expenses, financial expenses, and general sales revenue were calculated using the initially available information and remodeling variables derived from existing cases. Through this process, we developed an algorithm that can give an overview of the return on investment. In addition, the model was applied to three cases to verify its applicability. Although it was applied to only three cases, the difference between the model's forecasts and the actual case values was less than 5%, so the model is considered highly applicable. If more cases are added in the future, it will be a useful tool for actual work. The feasibility evaluation model developed in this study will support decision making by union members, and if it is applied in different regions, it is expected to help local governments understand the size of possible remodeling projects.

Predicting Crime Risky Area Using Machine Learning (머신러닝기반 범죄발생 위험지역 예측)

  • HEO, Sun-Young;KIM, Ju-Young;MOON, Tae-Heon
    • Journal of the Korean Association of Geographic Information Studies
    • /
    • Vol. 21, No. 4
    • /
    • pp.64-80
    • /
    • 2018
  • In Korea, citizens can obtain only general information about crime, so it is difficult for them to know how much they are exposed to it. If the police can predict crime-risk areas, it will be possible to cope with crime efficiently even with insufficient police and enforcement resources. However, there is no prediction system in Korea, and related research is scarce. Against this background, the final goal of this study is to develop an automated crime prediction system. As a first step, we built a big data set consisting of real local crime information and urban physical and non-physical data. We then developed a crime prediction model using machine learning. Finally, we assumed several possible scenarios, calculated the probability of crime, and visualized the results on a map to improve public understanding. Among the factors affecting crime occurrence identified in previous and case studies, the following data were processed into a big data set for machine learning: real crime information, weather information (temperature, rainfall, wind speed, humidity, sunshine, insolation, snowfall, cloud cover), and local information (average building coverage, average floor area ratio, average building height, number of buildings, average appraised land value, average area of residential buildings, average number of ground floors). Among supervised machine learning algorithms, the decision tree, random forest, and SVM models, which are known to be powerful and accurate in various fields, were used to construct the crime prediction model. As a result, the decision tree model, which had the lowest RMSE, was selected as the optimal prediction model. Based on this model, several scenarios were set for theft and violence cases, which are the most frequent in the case city J, and the probability of crime was estimated on a $250 \times 250$ m grid. 
As a result, we found that high crime-risk areas occur in three patterns in case city J. The probability of crime was divided into three classes and visualized on a map using the $250 \times 250$ m grid. Finally, we developed a crime prediction model using a machine learning algorithm and visualized the crime-risk areas on a map; the model can be recalculated and the result visualized simultaneously as time and urban conditions change.
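Selecting the model with the lowest RMSE, as this abstract describes, can be sketched in a few lines. The observed values and predictions below are made-up illustrative numbers, not results from the study, and the helper names are hypothetical.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def select_best_model(actual, predictions_by_model):
    """Return the name of the model whose predictions have the lowest RMSE."""
    return min(predictions_by_model,
               key=lambda name: rmse(actual, predictions_by_model[name]))


# Hypothetical hold-out data: 1 = crime occurred in the grid cell, 0 = none.
actual = [0, 1, 1, 0]
predictions_by_model = {
    "decision_tree": [0.1, 0.9, 0.8, 0.1],
    "random_forest": [0.2, 0.7, 0.7, 0.3],
    "svm": [0.4, 0.6, 0.5, 0.4],
}
best = select_best_model(actual, predictions_by_model)
```

In practice the predictions would come from models fitted on a training split and scored on held-out grid cells.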

Development of a Storage Level and Capacity Monitoring and Forecasting Techniques in Yongdam Dam Basin Using High Resolution Satellite Image (고해상도 위성자료를 이용한 용담댐 유역 저수위/저수량 모니터링 및 예측 기술 개발)

  • Yoon, Sunkwon;Lee, Seongkyu;Park, Kyungwon;Jang, Sangmin;Rhee, Jinyung
    • Korean Journal of Remote Sensing
    • /
    • Vol. 34, No. 6_1
    • /
    • pp.1041-1053
    • /
    • 2018
  • In this study, a real-time storage level and capacity monitoring and forecasting system for the Yongdam Dam watershed was developed using high-resolution satellite images. Drought indices such as the Standardized Precipitation Index (SPI) derived from satellite data were used for storage level monitoring in case of drought. Moreover, to predict storage volume, we used a statistical method based on Principal Component Analysis (PCA) of Singular Spectrum Analysis (SSA). The correlation coefficient between storage level and SPI (3) was high (CC = 0.78), and the monitoring and predictability of storage level were diagnosed using the drought index calculated from satellite data. In the principal component analysis by SSA, SPI (3) and the Reconstructed Components (RCs) were highly correlated, with CC = 0.87 to 0.99. The RC data were also confirmed to be highly correlated with the Normalized Water Surface Level (N-W.S.L.), with CC = 0.83 to 0.97. For the high-resolution satellite images, we developed a water detection algorithm that applies an exponential method to monitor changes in storage level using the Multi-Spectral Instrument (MSI) sensor of the Sentinel-2 satellite. Satellite images from 2016 to 2018 were used for water surface area detection in the Yongdam Dam watershed. Based on this, we proposed the possibility of a real-time drought monitoring system using high-resolution water surface area detection from Sentinel-2 satellite images. The results of this study can be applied to estimating reservoir volume from various satellite observations, which can be used for monitoring and estimating hydrological droughts in ungauged areas.

Comparison of Housewives' Agricultural Food Consumption Characteristics by Age (주부의 연령대별 농식품 소비 특성 비교)

  • Hong, Jun-Ho;Kim, Jin-Sil;Yu, Yeon-Ju;Lee, Kyung-Hee;Cho, Wan-Sup
    • The Journal of Bigdata
    • /
    • Vol. 6, No. 1
    • /
    • pp.83-89
    • /
    • 2021
  • Lifestyles are changing rapidly, and food consumption patterns vary widely among households as dietary habits and food processing technologies evolve. This paper reclassified the food groups of the consumer panel data established by the Rural Development Administration, which contain information on purchases of agricultural products by household, and compared the consumption characteristics of agricultural products by age group. For age classification, households were divided into a group in their 60s and older, with a metabolic disease prevalence of 20% or more, and a group in their 30s and 40s, with a prevalence of less than 10%. Using the LightGBM algorithm, we classified the differences in food consumption patterns between those in their 30s and 40s and those in their 60s and older, and found that the precision was 0.85, the recall was 0.71, and the F1 score was 0.77. The top five items by variable importance were confectionery, leafy vegetables, seasoned vegetables, fruit vegetables, and marine products, and the top five by the SHAP indicator were confectionery, marine products, seasoned vegetables, fruit vegetables, and leafy vegetables. In a binary classification of consumption patterns based on the median rather than the mean, which is sensitive to outliers, confectionery consumption among those in their 30s and 40s was more than twice that of those in their 60s. Other variables also showed significant differences between those in their 30s and 40s and those in their 60s and older. According to the study, people in their 30s and 40s consumed more than twice as much confectionery as those in their 60s, while those in their 60s consumed more than twice as much marine products, seasoned vegetables, fruit vegetables, and leafy vegetables as those in their 30s and 40s. In addition to the top five items, those in their 30s and 40s consumed large amounts of wheat-processed snacks, breads, and noodles, which differed from the food consumption patterns of those in their 60s.
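The three reported scores can be checked against each other, assuming the 0.71 figure is recall and that F1 is the usual harmonic mean of precision and recall. The function name is hypothetical.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values reported in the abstract.
reported_f1 = f1_score(0.85, 0.71)
```

Rounded to two decimals, `reported_f1` agrees with the 0.77 stated in the abstract.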