• Title/Summary/Keyword: Learning Processing

Development of Market Growth Pattern Map Based on Growth Model and Self-organizing Map Algorithm: Focusing on ICT products (자기조직화 지도를 활용한 성장모형 기반의 시장 성장패턴 지도 구축: ICT제품을 중심으로)

  • Park, Do-Hyung;Chung, Jaekwon;Chung, Yeo Jin;Lee, Dongwon
    • Journal of Intelligence and Information Systems
    • /
    • v.20 no.4
    • /
    • pp.1-23
    • /
    • 2014
  • Market forecasting aims to estimate the sales volume of a product or service that is sold to consumers over a specific selling period. From the perspective of the enterprise, accurate market forecasting assists in determining the timing of new product introduction and product design, and in establishing production plans and marketing strategies, enabling a more efficient decision-making process. Moreover, accurate market forecasting enables governments to establish the national budget efficiently. This study aims to generate market growth curves for ICT (information and communication technology) goods using past time series data; categorize products showing similar growth patterns; understand the markets in the industry; and forecast the future outlook of such products. The study suggests a useful and meaningful process (or methodology) for identifying market growth patterns with a quantitative growth model and a data mining algorithm. The methodology proceeds as follows. At the first stage, past time series data are collected for the target products or services of the categorized industry. The data, such as the volume of sales and domestic consumption for a specific product or service, are collected from the relevant government ministry, the National Statistical Office, and other relevant government organizations. Collected data that cannot be analyzed directly, owing to a lack of past data or the alteration of code names, require pre-processing. At the second stage, an optimal model for market forecasting is selected. This model can vary with the characteristics of each categorized industry. As this study focuses on the ICT industry, in which new technologies appear frequently and change the market structure, the Logistic, Gompertz, and Bass models are selected. A hybrid model that combines different models can also be considered; the hybrid model considered in this study estimates the size of the market potential through the Logistic and Gompertz models, and these figures are then used in the Bass model. The third stage is to evaluate which model explains the data most accurately. To do this, the parameters are estimated from the collected past time series data, each model's predictive values are generated, and the root-mean-squared error (RMSE) is calculated; the model with the lowest average RMSE across product types is considered the best model (sketched below, after this abstract). At the fourth stage, based on the parameter values estimated by the best model, a market growth pattern map is constructed with the self-organizing map algorithm. A self-organizing map is trained with the market growth pattern parameters of all products or services as input data, and the products or services are organized onto an N×N map. The number of clusters increases from 2 to M, depending on the characteristics of the nodes on the map. The clusters are divided into zones, and the clustering that provides the most meaningful explanation is selected. Based on the final selection of clusters, the boundaries between the nodes are drawn and, ultimately, the market growth pattern map is completed. The last step is to determine the final characteristics of the clusters as well as the market growth curves. The average of the market growth pattern parameters within each cluster is taken as a representative figure. Using this figure, a growth curve is drawn for each cluster, and its characteristics are analyzed. Taking into consideration the product types in each cluster, their characteristics can also be described qualitatively. We expect that the process and system this paper suggests can be used as a tool for forecasting demand in the ICT and other industries.
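
The model-selection step described above (fit candidate growth models, then keep the one with the lowest RMSE) can be illustrated with a minimal sketch. The yearly series and starting values below are hypothetical, and the Bass model and the hybrid variant are omitted for brevity:

```python
# Minimal sketch of growth-model fitting and RMSE-based selection.
# The sales series and initial guesses are hypothetical, not from the paper.
import numpy as np
from scipy.optimize import curve_fit

def logistic(t, K, a, b):
    # K: market potential; a, b: shape parameters
    return K / (1 + a * np.exp(-b * t))

def gompertz(t, K, a, b):
    return K * np.exp(-a * np.exp(-b * t))

t = np.arange(10)                                             # years since launch
y = np.array([3, 7, 15, 28, 45, 60, 70, 76, 79, 80], float)   # hypothetical sales

best = None
for name, f, p0 in [("logistic", logistic, (80, 20, 1.0)),
                    ("gompertz", gompertz, (80, 3, 0.5))]:
    params, _ = curve_fit(f, t, y, p0=p0, maxfev=10000)
    rmse = np.sqrt(np.mean((f(t, *params) - y) ** 2))
    if best is None or rmse < best[2]:
        best = (name, params, rmse)

print(f"best model: {best[0]}, RMSE = {best[2]:.2f}")
```

The estimated parameters of the winning model would then serve as input features for the self-organizing map in the fourth stage.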

A Study on Automatic Classification Model of Documents Based on Korean Standard Industrial Classification (한국표준산업분류를 기준으로 한 문서의 자동 분류 모델에 관한 연구)

  • Lee, Jae-Seong;Jun, Seung-Pyo;Yoo, Hyoung Sun
    • Journal of Intelligence and Information Systems
    • /
    • v.24 no.3
    • /
    • pp.221-241
    • /
    • 2018
  • As we enter the knowledge society, the importance of information as a new form of capital is being emphasized, and the importance of information classification is growing for the efficient management of exponentially produced digital information. In this study, we attempt to automatically classify and provide tailored information that can help companies decide on technology commercialization. We therefore propose a method to classify information based on the Korean Standard Industrial Classification (KSIC), which indicates the business characteristics of enterprises. The classification of information or documents has largely been based on machine learning, but there is not enough training data categorized on the basis of KSIC. Therefore, this study applied a method of calculating similarity between documents. Specifically, a method and a model for presenting the most appropriate KSIC code are proposed by collecting the explanatory text of each KSIC code and calculating its similarity to the document to be classified using the vector space model. IPC data were collected and classified by KSIC, and the methodology was then verified by comparison with the KSIC-IPC concordance table provided by the Korean Intellectual Property Office. As a result of the verification, the highest agreement was obtained when the LT method, a variant of the TF-IDF weighting formula, was applied. In this case, the first-ranked KSIC code matched in 53% of cases, and the cumulative match rate within the top five ranks was 76%. This confirms that the technology, industry, and market information that SMEs need can be classified by KSIC more quantitatively and objectively. In addition, the methods and results provided in this study can serve as baseline data to support the qualitative judgment of experts in creating concordance tables between heterogeneous classification systems.
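
The similarity step described above can be sketched briefly: represent each KSIC explanatory text and the target document as TF-IDF vectors and rank codes by cosine similarity. The toy snippets are hypothetical, and scikit-learn's default TF-IDF weighting stands in for the LT variant that the paper found most accurate:

```python
# Minimal sketch: rank KSIC codes by TF-IDF cosine similarity to a document.
# The explanatory snippets and the document text are hypothetical examples.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

ksic_texts = {
    "C26": "manufacture of electronic components computer and communication equipment",
    "J58": "software publishing and information service activities",
    "G46": "wholesale trade on own account or on a fee or contract basis",
}
document = "development and sale of embedded software for communication equipment"

vectorizer = TfidfVectorizer()
matrix = vectorizer.fit_transform(list(ksic_texts.values()) + [document])
scores = cosine_similarity(matrix[-1], matrix[:-1]).ravel()

# present the most appropriate KSIC codes first
for code, score in sorted(zip(ksic_texts, scores), key=lambda x: -x[1]):
    print(code, round(score, 3))
```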

Usefulness of Data Mining in Criminal Investigation (데이터 마이닝의 범죄수사 적용 가능성)

  • Kim, Joon-Woo;Sohn, Joong-Kweon;Lee, Sang-Han
    • Journal of forensic and investigative science
    • /
    • v.1 no.2
    • /
    • pp.5-19
    • /
    • 2006
  • Data mining is an information extraction activity to discover hidden facts contained in databases. Using a combination of machine learning, statistical analysis, modeling techniques, and database technology, data mining finds patterns and subtle relationships in data and infers rules that allow the prediction of future results. Typical applications include market segmentation, customer profiling, fraud detection, evaluation of retail promotions, and credit risk analysis. Law enforcement agencies deal with massive amounts of data in criminal investigations, and the volume is increasing with the development of computerized data processing; the new challenge is to discover knowledge in those data. Data mining can be applied in criminal investigation to find offenders by analyzing complex, relational data structures and free texts such as criminal records or statements. This study aimed to evaluate the possible applications of data mining, and their limitations, in practical criminal investigation. Clustering of criminal cases is possible for habitual crimes such as fraud and burglary, using data mining to identify crime patterns (a minimal sketch follows below). Neural network modelling, one of the tools of data mining, can be applied to matching a suspect's photograph or handwriting against those of convicts, or to criminal profiling. A case study of practical insurance fraud showed that data mining is also useful for organized crime such as gang activity, terrorism, and money laundering. However, the products of data mining in criminal investigation should be evaluated cautiously, because data mining offers clues rather than conclusions. Legal regulation is needed to control abuse by law enforcement agencies and to protect personal privacy and human rights.
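
The clustering idea mentioned above can be sketched under assumed features; every feature and value below is hypothetical, chosen only to show how habitual-crime cases might group by pattern:

```python
# Minimal sketch: cluster crime cases by simple numeric features with k-means.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# columns: hour of offence, loss amount (hypothetical units), entry-method code
cases = np.array([
    [2, 1.2, 0], [3, 0.9, 0], [2, 1.5, 0], [1, 1.1, 0],   # night break-ins
    [14, 30.0, 2], [15, 25.0, 2], [13, 28.0, 2],          # daytime frauds
])
X = StandardScaler().fit_transform(cases)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
print(labels)  # cases sharing a label share a tentative crime pattern
```

As the abstract stresses, such output is a clue for investigators, not a conclusion.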

Effect of Visual Perception by Vision Therapy for Improvement of Visual Function (시각기능 개선을 위한 시기능훈련이 시지각에 미치는 영향)

  • Lee, Seung Wook;Lee, Hyun Mee
    • Journal of Korean Ophthalmic Optics Society
    • /
    • v.20 no.4
    • /
    • pp.491-499
    • /
    • 2015
  • Purpose: This study examined how a decline in visual function affects visual perception, by assessing visual perception after improving visual function through vision training and observing the change in visual perceptual cognitive ability. Methods: The visual perceptual evaluations (TVPS_R) of 23 children under age 13 (8.75±1.66 years) with visual abnormalities were analyzed, and their visual function was improved through vision training (vision therapy). Results: Convergence increased from an average of 3.39±2.52Δ (prism diopters) to 13.87±6.04Δ at the far disparate-point measurement, and from an average of 5.48±3.42Δ to 18.43±7.58Δ at the near disparate-point measurement. The near diplopia point improved from 25.87±7.33 cm to 7.48±2.87 cm, and, for accommodative insufficiency, the near blur point improved from 19.57±7.16 cm to 7.09±1.88 cm. In the visual perceptual evaluation performed before and after improving visual function, six items, all except visual memory, showed statistically significant improvement. In order of improvement, the score difference was largest in visual closure at 17.74±16.94 (p=0.000), followed by visual sequential memory at 15.65±17.11 (p=0.000), visual figure-ground at 13.65±16.63 (p=0.001), visual form constancy at 12.74±18.41 (p=0.003), visual discrimination at 6.48±10.07 (p=0.005), and visual spatial relationships at 4.17±9.33 (p=0.043). In the visual perception quotient summing these scores, the difference was 15.22±8.66 (p=0.000), an even more significant result. Conclusions: Vision training enables efficient visual processing and improves visual perceptual ability. Improvement of visual function through vision training not only corrects abnormal visual function but also affects children's visual perception, including learning, perception, and recognition.
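
The pre/post comparisons above report score differences with p-values. The abstract does not name the statistical test used, so the sketch below assumes a paired t-test on hypothetical scores purely for illustration:

```python
# Minimal sketch of a paired pre/post comparison (assumed paired t-test).
# The scores are illustrative, not the study's data.
import numpy as np
from scipy import stats

before = np.array([85, 90, 78, 88, 92, 75, 83], float)
after = np.array([98, 104, 90, 101, 105, 88, 97], float)

t_stat, p_value = stats.ttest_rel(after, before)
print(f"mean improvement = {np.mean(after - before):.1f}, p = {p_value:.4f}")
```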

KoFlux's Progress: Background, Status and Direction (KoFlux 역정: 배경, 현황 및 향방)

  • Kwon, Hyo-Jung;Kim, Joon
    • Korean Journal of Agricultural and Forest Meteorology
    • /
    • v.12 no.4
    • /
    • pp.241-263
    • /
    • 2010
  • KoFlux is a Korean network of micrometeorological tower sites that use eddy covariance methods to monitor the cycles of energy, water, and carbon dioxide between the atmosphere and the key terrestrial ecosystems in Korea. KoFlux embraces the mission of AsiaFlux, i.e., to bring Asia's key ecosystems under observation to ensure the quality and sustainability of life on earth. The main purposes of KoFlux are to provide (1) an infrastructure to monitor, compile, archive, and distribute data for the science community and (2) a forum and short courses for the application and distribution of knowledge and data among scientists, including practitioners. The KoFlux community pursues the vision of AsiaFlux, i.e., "thinking community, learning frontiers," by creating information and knowledge of ecosystem science on carbon, water, and energy exchanges in key terrestrial ecosystems in Asia, by promoting multidisciplinary cooperation and the integration of scientific research and practice, and by providing local communities with sustainable ecosystem services. Currently, KoFlux has seven sites in key terrestrial ecosystems: five in Korea and two in the Arctic and Antarctic. KoFlux has systematized standardized data processing based on scrutiny of the data observed from these ecosystems and has synthesized the processed data into an open-access database for further use. Through publications, workshops, and training courses held on a regular basis, KoFlux has provided an agora for building networks, exchanging information among flux measurement and modelling experts, and educating scientists in flux measurement and data analysis. Despite such persistent initiatives, collaborative networking is still limited within the KoFlux community. In order to break down the walls between disciplines and boost partnership and ownership of the network, KoFlux will be housed in the National Center for Agro-Meteorology (NCAM) at Seoul National University in 2011 and will provide several core services of NCAM. Such concerted efforts will facilitate the augmentation of the current monitoring network, the education of next-generation scientists, and the provision of sustainable ecosystem services to our society.
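
The eddy covariance method at the heart of KoFlux computes the turbulent flux of a scalar as the time-averaged covariance of fluctuations in vertical wind speed and scalar concentration. A minimal sketch on synthetic data follows; real KoFlux processing adds coordinate rotation, despiking, gap filling, and density (WPL) corrections, all omitted here:

```python
# Minimal sketch of an eddy covariance flux estimate on synthetic 10 Hz data.
import numpy as np

rng = np.random.default_rng(0)
n = 10 * 60 * 30                                   # 30 min of 10 Hz samples
w = rng.normal(0.0, 0.3, n)                        # vertical wind speed (m/s)
c = 15.0 + 0.5 * w + rng.normal(0.0, 0.2, n)       # CO2 density, correlated with w

w_prime = w - w.mean()                             # fluctuations about the mean
c_prime = c - c.mean()
flux = np.mean(w_prime * c_prime)                  # covariance = turbulent flux
print(f"CO2 flux ~ {flux:.4f} (units of w times c)")
```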

The Changes of Short-Term Memory and Autonomic Neurocardiac Function after 4-10Hz Sound and Light Stimulation - A Pilot Study - (4-10 Hz 빛과 소리자극 후 단기기억력 및 자율신경심장기능의 변화 - 예비연구 -)

  • Lee, Seung-Hwan;Kim, Jin-Hwan;Park, Joong-Kyu;Lee, Kyung-Uk;Yang, Dae-Hyun;Hong, Keun-Young;Chae, Jeong-Ho
    • Sleep Medicine and Psychophysiology
    • /
    • v.11 no.1
    • /
    • pp.29-36
    • /
    • 2004
  • Objectives: Sound and light (SL) stimulation has been used as a method to induce useful mental states in psychology and psychiatry. It is believed that a sound and light entrainment device (SLED) has specific effects through synchronization of the EEG in patients who use it, and that theta-frequency stimulation promotes deep relaxation and short-term memory processing. This study was conducted to evaluate whether 4-10 Hz SL stimulation can induce relaxation and improve short-term memory function. Methods: Ten medical students with no medical or psychiatric problems participated in this study and were randomly divided into two groups. A real SLED was applied to one group (R group) and a pseudo SLED to the other (P group). The two groups were exposed to SL stimulation with the SLED for 15 minutes a day for 5 days and, after a two-day rest, the two groups were crossed over. The Korean Wechsler Adult Intelligence Scale (K-WAIS), Academic Motivation Tests (AMT), Test Anxiety Scale (TAS), Korean Auditory Verbal Learning Test (K-AVLT), and digit span were used to evaluate short-term memory. Spielberger's State-Trait Anxiety Inventory and the heart rate variability (HRV) test were used to evaluate the degree of relaxation. Results: Compared with the P group, the R group showed significant improvement in the K-AVLT and digit span after a single application of SL stimulation, but the 5-day application did not reveal any differences between the two groups. A significant change in HRV was observed during the 5-day application of SL stimulation after the crossover. Conclusion: This pilot study suggests that 4-10 Hz SL stimulation has some positive influence on short-term memory and relaxation.

A Study on the Effect of Using Sentiment Lexicon in Opinion Classification (오피니언 분류의 감성사전 활용효과에 대한 연구)

  • Kim, Seungwoo;Kim, Namgyu
    • Journal of Intelligence and Information Systems
    • /
    • v.20 no.1
    • /
    • pp.133-148
    • /
    • 2014
  • Recently, with the advent of various information channels, the amount of available information has continued to grow. The main cause of this phenomenon is the significant increase in unstructured data, as smart devices enable users to create data in the form of text, audio, images, and video. Among the various types of unstructured data, users' opinions and a variety of other information are clearly expressed in text data such as news, reports, papers, and articles, and active attempts have been made to create new value by analyzing these texts. The representative techniques used in text analysis are text mining and opinion mining. These share important characteristics; for example, they not only use text documents as input data, but also rely on many natural language processing techniques such as filtering and parsing. Therefore, opinion mining is usually recognized as a sub-concept of text mining, or, in many cases, the two terms are used interchangeably in the literature. Suppose that the purpose of a certain classification analysis is to predict a positive or negative opinion contained in some documents. If we focus on the classification process, the analysis can be regarded as a traditional text mining case; however, if we observe that the target of the analysis is a positive or negative opinion, the analysis can be regarded as a typical example of opinion mining. In other words, two methods (i.e., text mining and opinion mining) are available for opinion classification, and a precise definition of each is needed to distinguish between them. In this paper, we found that it is very difficult to distinguish the two methods clearly with respect to the purpose of analysis and the type of results, and we conclude that the most definitive criterion to distinguish text mining from opinion mining is whether an analysis utilizes any kind of sentiment lexicon. We established two prediction models, one based on opinion mining and the other on text mining; compared the main processes used by the two models; and compared their prediction accuracy on 2,000 movie reviews. The results revealed that the prediction model based on opinion mining showed higher average prediction accuracy than the text mining model. Moreover, in the lift chart generated by the opinion-mining-based model, the prediction accuracy for documents with strong certainty was higher than for documents with weak certainty. Most of all, opinion mining has a meaningful advantage in that it can reduce learning time dramatically, because a sentiment lexicon generated once can be reused in a similar application domain; additionally, the classification results can be clearly explained using the sentiment lexicon. This study has two limitations. First, the results of the experiments cannot be generalized, mainly because the experiment is limited to a small number of movie reviews. Second, various parameters in the parsing and filtering steps of the text mining may have affected the accuracy of the prediction models. Nevertheless, this research contributes a performance comparison of text mining and opinion mining for opinion classification. In future research, a more precise evaluation of the two methods should be made through intensive experiments.
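
The decisive role of the sentiment lexicon can be shown in a few lines: a document is scored by summing lexicon weights over its tokens, so a lexicon built once is reusable without retraining. The tiny English lexicon below is hypothetical; the study itself works with Korean movie reviews:

```python
# Minimal sketch of lexicon-based opinion classification.
lexicon = {"great": 1.0, "moving": 0.8, "fun": 0.7, "boring": -1.0, "waste": -0.9}

def classify(review: str) -> str:
    # sum lexicon weights over tokens; unknown tokens contribute nothing
    score = sum(lexicon.get(token, 0.0) for token in review.lower().split())
    return "positive" if score >= 0 else "negative"

print(classify("great acting and a fun story"))   # positive
print(classify("a boring waste of two hours"))    # negative
```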

A hybrid algorithm for the synthesis of computer-generated holograms

  • Nguyen The Anh;An Jun Won;Choe Jae Gwang;Kim Nam
    • Proceedings of the Optical Society of Korea Conference
    • /
    • 2003.07a
    • /
    • pp.60-61
    • /
    • 2003
  • A new approach to reducing the computation time of the genetic algorithm (GA) for making binary phase holograms is described. Synthesized holograms having a diffraction efficiency of 75.8% and a uniformity of 5.8% are proven in computer simulation and demonstrated experimentally. Recently, computer-generated holograms (CGHs) having high diffraction efficiency and design flexibility have been widely developed for many applications such as optical information processing, optical computing, and optical interconnection. Among the proposed optimization methods, the GA has become popular due to its capability of reaching nearly global optima. However, there exists a drawback to consider when using the genetic algorithm: the large amount of computation time needed to construct the desired holograms. One of the major reasons the GA's operation is time-intensive is the expense of computing the cost function, which must Fourier transform the parameters encoded on the hologram into the fitness value. To remedy this drawback, the artificial neural network (ANN) has been put forward, allowing CGHs to be created easily and quickly [1], but the quality of the reconstructed images is not high enough for high-precision applications. We therefore attempt a new approach that combines the good properties and performance of both the GA and the ANN to make CGHs of high diffraction efficiency in a short time. The optimization of a CGH using the genetic algorithm is a process of iteration involving selection, crossover, and mutation operators [2]. It is worth noting that the evaluation of the cost function, with the aim of selecting better holograms, plays an important role in the implementation of the GA; however, this evaluation step consumes much time Fourier transforming the encoded parameters on the hologram into the value to be evaluated (a minimal sketch of this step follows below). Depending on the speed of the computer, this process can last up to ten minutes. It is more effective if, instead of merely generating random holograms in the initial step, a set of approximately desired holograms is employed; the initial population then contains fewer random trial holograms, which reduces the GA's computation time. Accordingly, a hybrid algorithm that utilizes a trained neural network to initiate the GA's procedure is proposed: the initial population contains fewer random holograms and is complemented by approximately desired holograms. Figure 1 is the flowchart of the hybrid algorithm in comparison with the classical GA. The procedure of synthesizing a hologram on a computer is divided into two steps. First, holograms are simulated with the ANN method [1] to acquire approximately desired holograms. With a teaching data set of 9 characters obtained from the classical GA, 3 layers, 100 hidden nodes, a learning rate of 0.3, and a momentum of 0.5, the trained artificial neural network attains the approximately desired holograms, in fairly good agreement with theory. In the second step, the effect of several parameters on the operation of the hybrid algorithm is investigated. In principle, the operation of the hybrid algorithm and the GA are the same except for the modified initial step; hence, the parameter values verified in Ref. [2], such as the probabilities of crossover and mutation, the tournament size, and the crossover block size, remain unchanged, apart from the reduced population size. A reconstructed image with 76.4% diffraction efficiency and 5.4% uniformity is achieved when the population size is 30, the iteration number is 2000, the probability of crossover is 0.75, and the probability of mutation is 0.001. A comparison between the hybrid algorithm and the GA in terms of diffraction efficiency and computation time is shown in Fig. 2. With a 66.7% reduction in computation time and a 2% increase in diffraction efficiency compared to the GA method, the hybrid algorithm demonstrates its efficient performance. In the optical experiment, the phase holograms were displayed on a programmable phase modulator (model XGA). Figure 3 shows pictures of the diffraction patterns of the letter "0" from holograms generated using the hybrid algorithm; a diffraction efficiency of 75.8% and a uniformity of 5.8% are measured, in fairly good agreement with the simulation. In this paper, the genetic algorithm and neural network have been successfully combined in designing CGHs. This method gives a significant reduction in computation time compared to the GA method while still achieving holograms of high diffraction efficiency and uniformity. This work was supported by grant No. M01-2001-000-00324-0 (2002) from the Korea Science & Engineering Foundation.
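
The expensive cost-function evaluation discussed above, Fourier transforming a binary phase hologram and scoring the result, can be sketched as follows. The array size, target region, and efficiency definition are assumptions for illustration; a real GA calls such a function for every individual in every generation, which is exactly the cost the ANN-seeded population reduces:

```python
# Minimal sketch: fitness of a binary phase hologram via FFT.
import numpy as np

def fitness(hologram_bits: np.ndarray, target_mask: np.ndarray) -> float:
    phase = np.pi * hologram_bits                  # binary phase: 0 or pi
    field = np.exp(1j * phase)
    recon = np.abs(np.fft.fftshift(np.fft.fft2(field))) ** 2
    return recon[target_mask].sum() / recon.sum()  # fraction of light on target

rng = np.random.default_rng(0)
bits = rng.integers(0, 2, (64, 64))                # one random trial hologram
mask = np.zeros((64, 64), dtype=bool)
mask[28:36, 28:36] = True                          # hypothetical target region
print(f"diffraction efficiency ~ {fitness(bits, mask):.3f}")
```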

Analyzing Different Contexts for Energy Terms through Text Mining of Online Science News Articles (온라인 과학 기사 텍스트 마이닝을 통해 분석한 에너지 용어 사용의 맥락)

  • Oh, Chi Yeong;Kang, Nam-Hwa
    • Journal of Science Education
    • /
    • v.45 no.3
    • /
    • pp.292-303
    • /
    • 2021
  • This study identifies the terms frequently used together with energy in online science news articles, and the topics of the news reports, to find out how the term energy is used in everyday life and to draw implications for science curriculum and instruction about energy. A total of 2,171 online news articles in the science category, published by 11 major newspaper companies in Korea over one year from March 1, 2018, were selected using energy as a search term. After natural language processing, a total of 51,224 sentences consisting of 507,901 words were compiled for analysis. Using the R program, term frequency analysis, semantic network analysis, and structural topic modeling were performed. The results show that the terms with exceptionally high frequencies were technology, research, and development, reflecting the character of news articles that report new findings. Terms used more than once per two articles, on the other hand, were industry-related terms (industry, product, system, production, market) and terms readily expected to relate to energy, such as 'electricity' and 'environment.' Meanwhile, 'sun,' 'heat,' 'temperature,' and 'power generation,' which are frequently used in energy-related science classes, also appeared among the highest-frequency terms. From the network analysis, two clusters were found: one of terms related to industry and technology, and another of terms related to basic science and research. From the analysis of terms paired with energy, it was also found that terms related to the use of energy, such as 'energy efficiency,' 'energy saving,' and 'energy consumption,' were the most frequently used. From the 16 topics found, four contexts of energy emerged: 'high-tech industry,' 'industry,' 'basic science,' and 'environment and health.' The results suggest that introducing the concept of energy degradation as a starting point for energy classes can be effective, and show the need to bring high-tech industry or environment-and-health contexts into energy learning.
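
The semantic network analysis above rests on counting how often terms co-occur within sentences. The study used R; the sketch below shows the same counting in Python with hypothetical English tokens:

```python
# Minimal sketch: term frequencies and sentence-level co-occurrence counts.
from collections import Counter
from itertools import combinations

sentences = [
    ["energy", "technology", "development", "industry"],
    ["energy", "efficiency", "saving", "consumption"],
    ["solar", "energy", "power", "generation", "research"],
]

term_freq = Counter(token for s in sentences for token in s)
cooccur = Counter(frozenset(p) for s in sentences
                  for p in combinations(sorted(set(s)), 2))

print(term_freq.most_common(3))
# network edges: term pairs weighted by co-occurrence count
print([(tuple(pair), n) for pair, n in cooccur.most_common(3)])
```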

Derivation of Inherent Optical Properties Based on Deep Neural Network (심층신경망 기반의 해수 고유광특성 도출)

  • Hyeong-Tak Lee;Hey-Min Choi;Min-Kyu Kim;Suk Yoon;Kwang-Seok Kim;Jeong-Eon Moon;Hee-Jeong Han;Young-Je Park
    • Korean Journal of Remote Sensing
    • /
    • v.39 no.5_1
    • /
    • pp.695-713
    • /
    • 2023
  • In coastal waters, phytoplankton, suspended particulate matter, and dissolved organic matter intricately and nonlinearly alter the reflectivity of seawater. Neural network technology, which has been advancing rapidly, offers the advantage of effectively representing complex nonlinear relationships. Previous studies constructed a three-stage neural network to extract the inherent optical properties of each component; this study instead proposes an algorithm that directly employs a deep neural network. The dataset used in this study consists of synthetic data provided by the International Ocean Colour Coordinating Group, with the input data comprising above-surface remote-sensing reflectance at nine wavelengths. We derived inherent optical properties from this dataset using a deep neural network. To evaluate performance, we compared it with a quasi-analytical algorithm and analyzed the impact of log transformation on the performance of the deep neural network algorithm in relation to the data distribution. As a result, we found that the deep neural network algorithm accurately estimated the inherent optical properties (R² ≥ 0.9), except for the absorption coefficient of suspended particulate matter, and successfully separated the combined absorption coefficient of suspended particulate matter and dissolved organic matter into its two components. We also observed that the algorithm showed little difference in performance when applied directly, without log transformation of the data. To apply the findings of this study effectively to ocean color data processing, further research is needed to train on field data and additional datasets from various marine regions, to compare and analyze empirical and semi-analytical methods, and to assess the strengths and weaknesses of each algorithm appropriately.
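
The direct approach described above maps nine-band remote-sensing reflectance straight to inherent optical properties with a single network. A minimal sketch follows, assuming PyTorch is available; the layer sizes, output dimension, and random stand-in data are hypothetical (the study trains on the IOCCG synthetic dataset):

```python
# Minimal sketch: a small feed-forward network from 9-band Rrs to 3 IOPs.
import numpy as np
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(9, 64), nn.ReLU(),
    nn.Linear(64, 64), nn.ReLU(),
    nn.Linear(64, 3),              # e.g., a_ph, a_dg, b_bp at a reference band
)

rng = np.random.default_rng(0)
rrs = torch.tensor(rng.random((256, 9)), dtype=torch.float32)   # stand-in Rrs
iops = torch.tensor(rng.random((256, 3)), dtype=torch.float32)  # stand-in IOPs

optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()
for _ in range(200):               # brief illustrative training loop
    optimizer.zero_grad()
    loss = loss_fn(model(rrs), iops)
    loss.backward()
    optimizer.step()
print(f"final training MSE: {loss.item():.4f}")
```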