• Title/Summary/Keyword: Technology classification


The Effect of Data Size on the k-NN Predictability: Application to Samsung Electronics Stock Market Prediction (데이터 크기에 따른 k-NN의 예측력 연구: 삼성전자주가를 사례로)

  • Chun, Se-Hak
    • Journal of Intelligence and Information Systems / v.25 no.3 / pp.239-251 / 2019
  • Statistical methods such as moving averages, Kalman filtering, exponential smoothing, regression analysis, and ARIMA (autoregressive integrated moving average) have been used for stock market prediction. However, these statistical methods have not produced superior performance. In recent years, machine learning techniques, including artificial neural networks, SVMs, and genetic algorithms, have been widely used in stock market prediction. In particular, a case-based reasoning method known as k-nearest neighbor (k-NN) is also widely used for stock price prediction. Case-based reasoning retrieves several similar cases from previous cases when a new problem occurs, and combines the class labels of the similar cases to create a classification for the new problem. However, case-based reasoning has some problems. First, it searches for a fixed number of neighbors in the observation space and always selects the same number of neighbors rather than the most similar neighbors for the target case. As a result, it may have to take more cases into account even when fewer cases are applicable to the subject. Second, it may select neighbors that are far away from the target case. Thus, case-based reasoning does not guarantee an optimal pseudo-neighborhood for various target cases, and predictability can be degraded by deviation from the desired similar neighbors. This paper examines how the size of the learning data affects stock price predictability through k-nearest neighbor and compares the predictability of k-nearest neighbor with the random walk model according to the size of the learning data and the number of neighbors. In this study, Samsung Electronics stock prices were predicted by dividing the learning dataset into two types. For the prediction of the next day's closing price, we used four variables: opening value, daily high, daily low, and daily close.
In the first experiment, data from January 1, 2000 to December 31, 2017 were used for the learning process. In the second experiment, data from January 1, 2015 to December 31, 2017 were used for the learning process. The test data are from January 1, 2018 to August 31, 2018 for both experiments. We compared the performance of k-NN with the random walk model using the two learning datasets. The mean absolute percentage error (MAPE) was 1.3497 for the random walk model and 1.3570 for k-NN in the first experiment, when the learning data was small. However, the MAPE was 1.3497 for the random walk model and 1.2928 for k-NN in the second experiment, when the learning data was large. These results show that prediction power is higher when more learning data are used than when less learning data are used. This paper also shows that k-NN generally has better predictive power than the random walk model for larger learning datasets but not when the learning dataset is relatively small. Future studies need to consider macroeconomic variables related to stock price forecasting in addition to the opening, low, high, and closing prices. Also, to produce better results, it is recommended that k-nearest neighbor find nearest neighbors using a second-step filtering method that considers fundamental economic variables, as well as a sufficient amount of learning data.
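
The comparison described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the toy price series, the Euclidean distance, the unweighted average over neighbors, and the names `knn_forecast` and `mape` are all assumptions made for the example.

```python
import math

def knn_forecast(train_X, train_y, query, k):
    """Average the next-day closes of the k training days closest to `query`."""
    # Euclidean distance in (open, high, low, close) space, nearest first.
    ranked = sorted((math.dist(x, query), y) for x, y in zip(train_X, train_y))
    nearest = ranked[:k]
    return sum(y for _, y in nearest) / len(nearest)

def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    return 100.0 * sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical toy series of (open, high, low, close); not real Samsung data.
days = [
    (100, 103, 99, 102), (102, 104, 101, 103), (103, 105, 100, 101),
    (101, 102, 98, 99), (99, 101, 97, 100), (100, 104, 99, 103),
    (103, 106, 102, 105),
]
X = days[:-1]                   # features observed on day t
y = [d[3] for d in days[1:]]    # target: close on day t + 1

train_X, train_y = X[:4], y[:4]
test_X, test_y = X[4:], y[4:]

knn_preds = [knn_forecast(train_X, train_y, q, k=2) for q in test_X]
rw_preds = [q[3] for q in test_X]   # random walk: tomorrow's close = today's close

print("k-NN MAPE:       ", mape(test_y, knn_preds))
print("random walk MAPE:", mape(test_y, rw_preds))
```

Re-running the same comparison with a longer and a shorter training window is the kind of experiment the abstract reports; with only a handful of toy rows the numbers printed here carry no meaning beyond showing the mechanics.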

Physical Characteristics and Classification of the Ulleung Warm Eddy in the East Sea (Japan Sea) (동해 울릉 난수성 소용돌이의 물리적 특성 및 분류)

  • SHIN, HONG-RYEOL;KIM, INGWON;KIM, DAEHYUK;KIM, CHEOL-HO;KANG, BOONSOON;LEE, EUNIL
    • The Sea:JOURNAL OF THE KOREAN SOCIETY OF OCEANOGRAPHY / v.24 no.2 / pp.298-317 / 2019
  • The physical characteristics of the Ulleung Warm Eddy (UWE) and its relationship with the East Korea Warm Current (EKWC) were analyzed using CMEMS (Copernicus Marine Environment Monitoring Service) satellite altimetry data and CTD data from the National Institute of Fisheries Science (NIFS) near the Ulleung Basin from 1993 to 2017. UWEs coupled with the EKWC account for 81% of the total number of UWEs. Only 7% of the eddies are completely separated from the EKWC. When the UWE forms from the EKWC, it contains high-temperature, high-salinity water. However, when the UWE overwinters, its internal structure changes greatly. In winter, a homogeneous surface layer of 10°C and 34.2 psu is produced inside the UWE by vertical convection driven by sea-surface cooling, and it deepens to a maximum depth of approximately 250 m in early spring. In summer, the UWE has a stratified structure in the upper layer, within a depth of 100 m, and retains the homogeneous layer formed in winter in the lower layer. A total of 62 UWEs were produced over the 25 years from 1993 to 2017. On average, 2.5 UWEs were formed annually, and the average life span was 259 days (approximately 8.6 months). The average size of the UWEs is 98 km in the east-west direction and 109 km in the north-south direction. The average size of the UWE estimated from satellite altimetry data is 1~25 km smaller than that estimated from water temperature cross-sectional data.

Vegetative Propagation and Morphological Characteristics of Amelanchier spp. with High Value as Fruit Tree for Landscaping (정원용 유실수로서 가치가 높은 채진목속(Amelanchier spp.)의 형태적 특성 및 영양번식방법)

  • Kang, Ho Chul;Hwang, Dae Yul;Ha, Yoo Mi
    • Journal of the Korean Institute of Landscape Architecture / v.46 no.6 / pp.111-119 / 2018
  • This study was carried out to investigate the growth characteristics and propagation methods of the Korean native Amelanchier asiatica, A. arborea, and A. alnifolia as fruit trees for gardens. Due to the lack of recent research on Amelanchier spp., their superficial classification is still unclear and the names are used interchangeably. The results were as follows: A. arborea and A. alnifolia were globular, multi-stemmed shrubs. A 20-year-old A. asiatica tree was 7.8 m in height, with a crown width of 5.2 m and a single trunk. As for the morphological characteristics, the leaves of A. asiatica were oblong with an acuminate apex, 6.1 cm in length and 3.6 cm in width, whereas A. arborea and A. alnifolia had acute obovate leaves. The leaf size of A. alnifolia was the largest among the three species. The flowers of A. asiatica were bigger than those of A. arborea and A. alnifolia. In addition, its petals and flower clusters were also the largest among the three species. A. asiatica began flowering on April 21 and bloomed for 24 days in Osan, while A. arborea and A. alnifolia began flowering on April 12 and bloomed for 22 days at the same location. The fruits of A. arborea and A. alnifolia were green on May 10~12, turned purplish red on May 24~26, and matured on June 1~3. The fruits of A. arborea and A. alnifolia persisted for 48~50 days. On the other hand, A. asiatica bore greenish fruit on May 20, which became red on September 4 and had fallen by October 3. The fruit was largest in A. arborea, at 1.03 cm in height and 1.12 cm in diameter, followed by the big berry of A. alnifolia, and was smallest in the native A. asiatica. Hardwood cuttings of A. arborea were difficult to root, with a rooting rate of 40%. In softwood cuttings, the rooting rate of A. arborea was increased by treatment with concentrated IBA, especially at 5,000 and 7,000 ppm.
The optimum date for cutting was June 27, when the rooting rate was more than 80%. The most effective method for rooting A. arborea was treating softwood cuttings taken on June 27 with Rootone or 7,000 ppm IBA, which gave a rooting rate of over 80%.

Conditional Generative Adversarial Network based Collaborative Filtering Recommendation System (Conditional Generative Adversarial Network(CGAN) 기반 협업 필터링 추천 시스템)

  • Kang, Soyi;Shin, Kyung-shik
    • Journal of Intelligence and Information Systems / v.27 no.3 / pp.157-173 / 2021
  • With the development of information technology, the amount of available information increases daily. However, having access to so much information makes it difficult for users to find the information they seek. Users want a visualized system that reduces information retrieval and learning time, saving them from personally reading and judging all available information. As a result, recommendation systems are an increasingly important technology that is essential to business. Collaborative filtering is used in various fields with excellent performance because recommendations are made based on similar user interests and preferences. However, limitations do exist. Sparsity, which occurs when user-item preference information is insufficient, is the main limitation of collaborative filtering. The evaluation values in the user-item matrix may be distorted depending on the popularity of the product, or there may be new users who have not yet rated any items. The lack of historical data to identify consumer preferences is referred to as data sparsity, and various methods have been studied to address this problem. However, most attempts to solve the sparsity problem are not optimal because they can only be applied when additional data such as users' personal information, social networks, or characteristics of items are included. Another problem is that real-world score data are mostly biased toward high scores, resulting in severe imbalances. One cause of this imbalanced distribution is purchasing bias: only users who rate a product highly purchase it, so those with low ratings are less likely to purchase products and thus do not leave negative product reviews. Because of this, reviews by users who purchase products are more likely to be positive than most users' actual preferences would suggest.
Therefore, because of this bias, models over-learn the high-incidence classes in actual rating data, distorting the market. Applying collaborative filtering to such imbalanced data leads to poor recommendation performance due to excessive learning of the biased classes. Traditional oversampling techniques that address this problem are likely to cause overfitting because they repeat the same data, which acts as noise in learning and reduces recommendation performance. In addition, most existing pre-processing methods for data imbalance are designed and used for binary classes. Binary class imbalance techniques are difficult to apply to multi-class problems because they cannot model multi-class characteristics such as objects at cross-class boundaries or objects overlapping multiple classes. To solve this problem, research has been conducted on converting multi-class problems into binary class problems. However, simplifying multi-class problems can cause classification errors when the results are combined with those of classifiers learned from other sub-problems, resulting in the loss of important information about relationships beyond the selected items. Therefore, it is necessary to develop more effective methods to address multi-class imbalance problems. We propose a collaborative filtering model that uses CGAN to generate realistic virtual data to populate the empty user-item matrix. The conditional vector y identifies the distributions of minority classes and is used to generate data reflecting their characteristics. Collaborative filtering then maximizes the performance of the recommendation system via hyperparameter tuning. This process should improve the accuracy of the model by addressing the sparsity problem of collaborative filtering implementations while mitigating the data imbalances arising from real data. Our model outperforms existing oversampling techniques, as well as the original sparse real-world data, in recommendation performance.
SMOTE, Borderline SMOTE, SVM-SMOTE, ADASYN, and GAN were used as comparative models, and our model demonstrated the highest prediction accuracy on the RMSE and MAE evaluation metrics. This study suggests that oversampling based on deep learning can further refine the performance of recommendation systems using actual data and can be used to build business recommendation systems.
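
The two data problems this abstract targets, sparsity and rating imbalance, can be made concrete with a small sketch. The matrix below and the helper names `sparsity` and `rating_distribution` are hypothetical and chosen only for illustration; the sketch measures the problems and does not implement CGAN or any oversampling method.

```python
from collections import Counter

def sparsity(matrix):
    """Fraction of user-item cells that hold no rating (None)."""
    cells = [r for row in matrix for r in row]
    return sum(r is None for r in cells) / len(cells)

def rating_distribution(matrix):
    """Share of each rating value among the observed ratings."""
    observed = [r for row in matrix for r in row if r is not None]
    counts = Counter(observed)
    return {rating: counts[rating] / len(observed) for rating in sorted(counts)}

# Hypothetical user-item matrix: rows are users, columns are items,
# None marks an item the user has not rated.
ratings = [
    [5, None, 4, None],
    [None, 5, None, None],
    [4, None, None, 5],
    [None, None, 1, None],
]

print("sparsity:", sparsity(ratings))
print("rating shares:", rating_distribution(ratings))
```

In this toy matrix most cells are empty and the observed ratings cluster at 4 and 5, which is the pattern the abstract attributes to purchasing bias. An oversampling method would aim to add plausible ratings for the under-represented low scores.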

A Study on the Application of Physical Soil Washing Technology at Lead-contaminated Shooting Range in a Closed Military Shooting Range Area (폐 공용화기사격장 내 납오염 사격장 군부지의 물리적 토양세척정화기술 적용성 연구)

  • Jung, Jaeyun;Jang, Yunyoung
    • Journal of Environmental Impact Assessment / v.28 no.5 / pp.492-506 / 2019
  • Heavy metal contaminants in shooting ranges are mostly present as intact warheads or as metal fragments in particle form. Because these fine metal particles weather over long periods, their surfaces are very likely present as oxides or carbonates. In particular, lead, a representative contaminant in shooting range soil, is present as finer particles because it is soft and highly ductile. Therefore, in physical washing experiments, we conducted a particle size analysis, measured heavy metal concentrations by particle diameter, analyzed the composition of metallic substances, and assessed the applicability of gravity separation, magnetic separation, and flotation. FESEM analysis and microbalance measurements confirmed that the lead particles, which form thin, flaky structures because of lead's softness, weigh less than soil particles of the same size, and that specific gravity separation is highly applicable. In addition, remediation with a hydrocyclone showed cumulative remediation efficiencies of 71% after one pass, 80% after two passes, and 91% after three passes. On the other hand, magnetic separation showed a low efficiency of 17%. Flotation showed a relatively high efficiency of 39% for -35 mesh (0.5 mm) target soil, but only 16% for -10 mesh (2 mm) soil. The target treatment diameter for soil washing should be 2 mm to 0.075 mm; applying this to actual equipment requires an additional classification step at the input, which must be managed because it adds installation costs and processes. As a result, shooting range soil can be remediated by separating contaminants according to the size of the warhead. Most warheads are 5.56 mm or larger, exceeding the gravel diameter, so they can be separated by specific gravity owing to their high specific gravity.
However, contaminants present as metal fragments have a thin, fragmented structure and weigh less than soil of the same particle size, so they were found to be treatable by hydrocyclone separation during soil washing.

Recommender system using BERT sentiment analysis (BERT 기반 감성분석을 이용한 추천시스템)

  • Park, Ho-yeon;Kim, Kyoung-jae
    • Journal of Intelligence and Information Systems / v.27 no.2 / pp.1-15 / 2021
  • If it is difficult for us to make decisions, we ask for advice from friends or people around us. When we decide to buy products online, we read anonymous reviews and buy them. With the advent of the data-driven era, the development of IT is producing large amounts of data from individuals and objects alike. Companies and individuals have accumulated, processed, and analyzed so much data that they can now make decisions or act directly on data in areas that used to depend on experts. Nowadays, recommender systems play a vital role in determining users' preferences when purchasing goods, and web services (Facebook, Amazon, Netflix, YouTube) use recommender systems to induce clicks. For example, YouTube's recommender system, which is used by 1 billion people worldwide every month, draws on videos that users "like" and videos they have watched. Recommender system research is deeply linked to practical business. Therefore, many researchers are interested in building better solutions. Recommender systems use information obtained from their users to generate recommendations, because developing them requires information on items that the user is likely to prefer. Through recommender systems, we have come to trust patterns and rules derived from data rather than empirical intuition. The growth in data volume has led machine learning to develop into deep learning. However, such recommender systems are not a solution to everything. For recommender systems to work, the data must not be sparse and must be sufficient in amount. They also require detailed information about the individual. Recommender systems work correctly only when these conditions are met. When the interaction log is insufficient, recommender systems become a complex problem for both consumers and sellers.
This is because sellers need to make recommendations to consumers at a personal level, while consumers need to receive appropriate recommendations based on reliable data. In this paper, to improve the accuracy of "appropriate recommendations" to consumers, a recommender system combined with context-based deep learning is proposed. This research combines user-based data to create a hybrid recommender system. The hybrid approach developed is not a purely collaborative type of recommender system, but a collaborative extension that integrates user data with deep learning. Customer review data were used as the data set. Consumers buy products in online shopping malls and then write product reviews. Review ratings come from buyers who have already purchased the product, giving other users confidence before purchasing. However, recommendation systems mainly use scores or ratings rather than reviews to suggest items purchased by many users. In fact, consumer reviews contain product opinions and user sentiment that bear on the evaluation. By incorporating these into the study, this paper aims to improve the recommendation system. The proposed algorithm is intended for situations in which individuals have difficulty selecting an item. Consumer reviews and record patterns make it possible to rely on recommendations appropriately. The algorithm implements a recommendation system through collaborative filtering. Predictive accuracy in this study is measured by Root Mean Squared Error (RMSE) and Mean Absolute Error (MAE). Netflix strategically uses recommender systems in its programs through competitions that reduce RMSE every year, making good use of predictive accuracy. Research on hybrid recommender systems that combine NLP approaches, deep learning, and other techniques for personalized recommendation has been increasing. Among NLP studies, sentiment analysis began to take shape in the mid-2000s as user review data increased.
Sentiment analysis is a text classification task based on machine learning. Machine learning-based sentiment analysis has the disadvantage that it is difficult to identify the information expressed in a review, because it is challenging to take the text's characteristics into account. In this study, we propose a deep learning recommender system that utilizes BERT's sentiment analysis to minimize the disadvantages of machine learning. The comparison models were recommender systems based on Naive-CF (collaborative filtering), SVD (singular value decomposition)-CF, MF (matrix factorization)-CF, BPR-MF (Bayesian personalized ranking matrix factorization)-CF, LSTM, CNN-LSTM, and GRU (gated recurrent units). In the experiment, the recommender system based on BERT performed best.
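
The two accuracy measures named in this abstract, RMSE and MAE, can be written out in a few lines. The rating values below are hypothetical and serve only to show how the two metrics differ; they are not results from the paper.

```python
import math

def rmse(actual, predicted):
    """Root mean squared error: squares each error, so large misses dominate."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def mae(actual, predicted):
    """Mean absolute error: every unit of error counts equally."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical true ratings and model predictions for four user-item pairs.
true_ratings = [5, 3, 4, 1]
predicted_ratings = [4, 3, 2, 1]

print("RMSE:", rmse(true_ratings, predicted_ratings))
print("MAE: ", mae(true_ratings, predicted_ratings))
```

RMSE is never smaller than MAE on the same predictions, and the gap widens when a few predictions are far off. Reporting both, as the abstract does, therefore says something about how the errors are distributed as well as how large they are on average.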

Sea Water Type Classification Around the Ieodo Ocean Research Station Based On Satellite Optical Spectrum (인공위성 광학 스펙트럼 기반 이어도 해양과학기지 주변 해수의 수형 분류)

  • Lee, Ji-Hyun;Park, Kyung-Ae;Park, Jae-Jin;Lee, Ki-Tack;Byun, Do-Seung;Jeong, Kwang-Yeong;Oh, Hyun-Ju
    • Journal of the Korean earth science society / v.43 no.5 / pp.591-603 / 2022
  • The color and optical properties of seawater are determined by the interaction between the dissolved organic and inorganic substances and the plankton it contains. The Ieodo Ocean Research Station (I-ORS), located in the East China Sea, is affected by low-salinity water from the Yangtze River in the west and the Tsushima Warm Current in the south. Thus, it is a suitable site for analyzing fluctuations in circulation and optical properties around the Korean Peninsula. In this study, seawater surrounding the I-ORS was classified according to its optical characteristics using satellite remote sensing reflectance observed with the Moderate Resolution Imaging Spectroradiometer (MODIS)/Aqua and the National Aeronautics and Space Administration (NASA) bio-Optical Marine Algorithm Dataset (NOMAD) from January 2016 to December 2020. Additionally, the seasonal variation of optical water types (OWTs) was presented. A total of 59,532 satellite match-up data (d ≤ 10 km) collected from seawater surrounding the I-ORS were classified into 23 types using the spectral angle mapper. OWTs corresponding to relatively clear waters accounted for more than 50% of the total around the I-ORS. The most frequent OWTs in summer and winter were opposite. In particular, the OWTs corresponding to optically clear seawater were primarily present in summer. However, the same OWTs accounted for less than 1% of the total in winter. Considering the OWT fluctuations in the East China Sea, the I-ORS is inferred to be located in a transition zone of seawater. This study contributes to understanding the optical characteristics of seawater and improving the accuracy of satellite ocean color variables.
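
The spectral angle mapper mentioned in this abstract assigns a spectrum to the reference type with which it forms the smallest angle. A minimal sketch follows; the two reference spectra, their band values, and the names `spectral_angle` and `classify` are hypothetical, whereas the study itself uses 23 optical water types.

```python
import math

def spectral_angle(a, b):
    """Angle in radians between two spectra treated as vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # Clamp to [-1, 1] to guard against floating-point overshoot before acos.
    cos_angle = max(-1.0, min(1.0, dot / (norm_a * norm_b)))
    return math.acos(cos_angle)

def classify(spectrum, references):
    """Return the name of the reference spectrum with the smallest angle."""
    return min(references, key=lambda name: spectral_angle(spectrum, references[name]))

# Hypothetical reflectance values at four wavelengths for two water types.
references = {
    "clear": [0.010, 0.008, 0.004, 0.001],
    "turbid": [0.004, 0.008, 0.012, 0.006],
}

observed = [0.020, 0.016, 0.008, 0.002]  # same shape as "clear", twice as bright
print(classify(observed, references))
```

Because the angle depends only on the shape of the spectrum and not on its overall magnitude, a brighter or dimmer observation with the same spectral shape receives the same label.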

A Basis Study on the Optimal Design of the Integrated PM/NOx Reduction Device (일체형 PM/NOx 동시저감장치의 최적 설계에 대한 기초 연구)

  • Choe, Su-Jeong;Pham, Van Chien;Lee, Won-Ju;Kim, Jun-Soo;Kim, Jeong-Kuk;Park, Hoyong;Lim, In Gweon;Choi, Jae-Hyuk
    • Journal of the Korean Society of Marine Environment & Safety / v.28 no.6 / pp.1092-1099 / 2022
  • Research on exhaust aftertreatment devices to reduce air pollutants and greenhouse gas emissions is being actively conducted. However, particulate matter/nitrogen oxides (PM/NOx) simultaneous reduction devices for ships suffer from problems with back pressure on the diesel engine and replacement of the filter carrier. In this study, for the optimal design of an integrated device that can simultaneously reduce PM/NOx, an appropriate standard was presented by studying the flow inside the device and the change in back pressure through the inlet/outlet pressure. Ansys Fluent was used to apply porous media conditions to a diesel particulate filter (DPF) and selective catalytic reduction (SCR) by setting the porosity to 30%, 40%, 50%, 60%, and 70%. In addition, the effect on back pressure was analyzed by applying inlet velocities of 7.4 m/s, 10.3 m/s, 13.1 m/s, and 26.2 m/s, set according to the engine load, as boundary conditions. The computational fluid dynamics analysis showed that the rate of change in back pressure was greater when the inlet velocity was changed than when the inlet temperature was changed, and the maximum rate of change was 27.4 mbar. The device was evaluated as suitable for ships of 1,800 kW because the back pressure under all boundary conditions did not exceed the classification standard of 68 mbar.

Development of a water quality prediction model for mineral springs in the metropolitan area using machine learning (머신러닝을 활용한 수도권 약수터 수질 예측 모델 개발)

  • Yeong-Woo Lim;Ji-Yeon Eom;Kee-Young Kwahk
    • Journal of Intelligence and Information Systems / v.29 no.1 / pp.307-325 / 2023
  • Due to the prolonged COVID-19 pandemic, the number of people who, tired of living indoors, visit nearby mountains and national parks to relieve depression and lethargy has surged. The mineral spring is a place where thousands of people who have come out into nature stop walking to catch their breath and rest. There are about 600 mineral springs in the metropolitan area, found not only in mountains and national parks but also occasionally in neighborhood parks and along trails. However, because water quality tests are irregular and manual, people drink mineral water without knowing the test results in real time. Therefore, in this study, we intend to develop a model that can predict the quality of spring water in real time by exploring the factors that affect it and collecting data scattered across various places. After limiting the regions to Seoul and Gyeonggi-do because of the limitations of data collection, we obtained water quality test data from 2015 to 2020 for about 300 mineral springs in 18 cities where data management is well performed. A total of 10 factors were finally selected after two rounds of review from among various factors considered to affect the suitability of mineral spring water quality. Using AutoML, an automated machine learning technology that has recently been attracting attention, we derived the top 5 models by prediction performance from among about 20 machine learning methods. Among them, the CatBoost model had the highest performance, with a classification accuracy of 75.26%. In addition, when the absolute influence of the variables on the prediction was examined using the SHAP method, the most important factor was whether the spring had been judged nonconforming in the previous water quality test.
It was also confirmed that the temperature on the day of the inspection and the altitude of the mineral spring influenced whether the water quality was unsuitable.

Community Structure of Natural Monument Forest (Forest of Japanese Torreyas in Pyeongdae-ri, Jeju and Subtropical Forest of Nabeup-ri, Jeju) in Jeju-do (제주도 천연기념물 수림지(제주 평대리 비자나무 숲과 제주 납읍리 난대림)의 군집구조)

  • Jeong Eun Lee;Yo Seob Hwang;Ho Jin Kim;Ju Heung Lee;Chung Weon Yun
    • Journal of Korean Society of Forest Science / v.112 no.4 / pp.393-404 / 2023
  • The Natural Monument Forest (NMF) is a form of natural and cultural heritage that has long symbolized the harmony between nature and culture in Korea. Recently, the NMF has deteriorated due to industrialization and reckless city expansion. Given this situation, it is necessary to preserve and manage the ecosystem of the NMF through preferential research on the forest community structure. Accordingly, this study sought to identify the community structure by analyzing the vegetation classification, stratum structure, and species diversity using vegetation data collected from the Forest of Japanese Torreyas in Pyeongdae-ri, Jeju and the Subtropical Forest of Nabeup-ri, Jeju. The results classified the forest vegetation as a Litsea japonica community group divided into two communities: a Torreya nucifera community and a Quercus glauca community. The T. nucifera community was subdivided into the Idesia polycarpa group and the Dryopteris erythrosora group, while the Q. glauca community was subdivided into the Mercurialis leiocarpa group and the Arachniodes aristata group. T. nucifera showed the highest level of importance in vegetation units 1 (Litsea japonica community group-Torreya nucifera community-Idesia polycarpa group) and 2 (Litsea japonica community group-Torreya nucifera community-Dryopteris erythrosora group), whereas Q. glauca showed the highest level of importance in vegetation units 3 (Litsea japonica community group-Quercus glauca community-Mercurialis leiocarpa group) and 4 (Litsea japonica community group-Quercus glauca community-Arachniodes aristata group). In terms of species diversity, vegetation units 1, 2, 3, and 4 had values of 2.866, 2.716, 2.222, and 2.326, respectively. These findings suggest that it is necessary to prepare a differentiated management plan for each vegetation unit.
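
The abstract does not say which species diversity index produced the values it reports. As an illustration only, the sketch below assumes the Shannon index, one commonly used measure, with hypothetical abundance counts; it should not be read as the authors' method.

```python
import math

def shannon_index(abundances):
    """Shannon diversity H' = -sum(p_i * ln p_i) over species with nonzero abundance."""
    total = sum(abundances)
    return -sum((n / total) * math.log(n / total) for n in abundances if n > 0)

# Hypothetical abundance counts for two plots (not data from the study).
even_plot = [10, 10, 10, 10]      # four equally abundant species
dominated_plot = [37, 1, 1, 1]    # one species dominates

print("even plot:     ", shannon_index(even_plot))
print("dominated plot:", shannon_index(dominated_plot))
```

Under this assumed index, a plot where species are evenly represented scores higher than one dominated by a single species, even when both contain the same number of species.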