• Title/Summary/Keyword: Normalizing

An Analysis of IT Trends Using Tweet Data (트윗 데이터를 활용한 IT 트렌드 분석)

  • Yi, Jin Baek;Lee, Choong Kwon;Cha, Kyung Jin
    • Journal of Intelligence and Information Systems, v.21 no.1, pp.143-159, 2015
  • Predicting IT trends has long been an important subject in information systems research. Trend prediction helps organizations recognize emerging waves of innovation and allocate budgets to prepare for rapidly changing technologies. Toward the end of each year, various domestic and global organizations predict and announce IT trends for the following year. For example, Gartner predicts the top 10 IT trends for the coming year, and these predictions shape the basic assumptions that IT and industry leaders and organizations hold about technology and the future of IT, but the accuracy of such reports is difficult to verify. Social media data can be a useful tool for verifying that accuracy. As social media services have gained popularity, they are used in a variety of ways, from posting about daily life to keeping up to date with news and trends. In recent years, social media activity in Korea has reached unprecedented levels. Hundreds of millions of users now participate in online social networks and share their opinions and thoughts with colleagues and friends. In particular, Twitter is currently the major microblogging service; through its core function, the 'tweet', users report their current thoughts and actions, comment on news, and engage in discussions. For our analysis of IT trends, we chose tweet data because Twitter not only produces massive unstructured textual data in real time but also serves as an influential channel for opinion leadership on technology. Previous studies found that tweet data provide useful information and detect social trends effectively; these studies also found that Twitter can track issues faster than other media such as newspapers. Therefore, this study investigates how frequently the IT trends predicted for the following year by public organizations are mentioned on social network services such as Twitter.
IT trend predictions for 2013, announced near the end of 2012 by two domestic organizations, the National IT Industry Promotion Agency (NIPA) and the National Information Society Agency (NIA), were used as the basis for this research. The present study compares Twitter data generated in Seoul, Korea, with the predictions of the two organizations and analyzes the differences. Twitter data analysis requires various natural language processing techniques, including stop-word removal and noun extraction, to handle unrefined, unstructured data. To overcome these challenges, we used SAS IRS (Information Retrieval Studio), developed by SAS, to capture trends while processing large streaming Twitter datasets in real time. The system offers a framework for crawling, normalizing, analyzing, indexing, and searching tweet data. We crawled the entire Twitter sphere in the Seoul area and obtained 21,589 tweets from 2013 to review how frequently the IT trend topics announced by the two organizations were mentioned by people in Seoul. The results show that most IT trends predicted by NIPA and NIA were frequently mentioned on Twitter, except for topics such as 'new types of security threat', 'green IT', and 'next generation semiconductor'; because these topics are non-generalized compound words, they may be mentioned on Twitter in other words. To determine whether IT trend tweets from Korea are related to the following year's real-world IT trends, we compared Twitter's trending topics with those in Nara Market, Korea's nationwide web-based e-procurement system, which handles the entire procurement process of all public organizations in Korea. The correlation analysis shows that tweet frequencies for the IT trend topics predicted by NIPA and NIA are significantly correlated with the frequencies of IT topics mentioned in Nara Market project announcements in 2012 and 2013.
The main contributions of our research are as follows: i) the IT topic predictions announced by NIPA and NIA can provide effective guidance to IT professionals and researchers in Korea who are looking for verified IT topic trends for the following year, and ii) researchers can use Twitter to obtain useful ideas for detecting and predicting dynamic trends in technological and social issues.
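
The frequency comparison described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' SAS IRS pipeline: the function names, the case-insensitive substring matching, and the sample tweets are all assumptions made for illustration, and the correlation shown is a plain Pearson coefficient.

```python
import math
from collections import Counter


def count_topic_mentions(tweets, topics):
    """Count how many tweets mention each topic (case-insensitive substring match)."""
    counts = Counter({topic: 0 for topic in topics})
    for text in tweets:
        lowered = text.lower()
        for topic in topics:
            if topic.lower() in lowered:
                counts[topic] += 1
    return counts


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical sample data, not taken from the study.
sample_tweets = ["Big data is everywhere", "cloud and big data", "I like coffee"]
sample_topics = ["big data", "cloud", "green IT"]
mentions = count_topic_mentions(sample_tweets, sample_topics)
```

Substring matching of this kind misses a topic whenever users describe it in different words, which is consistent with the abstract's explanation for the rarely mentioned compound-word topics.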

Electroencephalographic Changes Induced by a Neurofeedback Training : A Preliminary Study in Primary Insomniac Patients (뉴로피드백 훈련에 의한 뇌파 변화 연구 : 일차성 불면증 환자에 대한 예비 연구)

  • Lee, Jin Han;Shin, Hong-Beom;Kim, Jong Won;Suh, Ho-Suk;Lee, Young Jin
    • Sleep Medicine and Psychophysiology, v.26 no.1, pp.44-48, 2019
  • Objectives: Insomnia is one of the most prevalent sleep disorders. Recent studies suggest that cognitive and physical arousal play an important role in the generation of primary insomnia. Studies have also shown that information processing disorders due to cortical hyperactivity might interfere with normal sleep onset and sleep continuity. Therefore, focusing on central nervous system arousal and normalizing information processing have become topics of current interest. It is well known that neurofeedback can reduce brain hyperarousal by modulating patients' brain waves during a sequence of behavior therapy. The purpose of this study was to investigate the effects of neurofeedback therapy on electroencephalography (EEG) characteristics in patients with primary insomnia. Methods: Thirteen subjects who met the criteria for an insomnia diagnosis and 14 control subjects matched for sex and age were included. Neurofeedback and sham treatments were each performed for 30 minutes in random order. EEG spectral power analyses were performed to quantify the effects of neurofeedback therapy on brain waveforms. Results: In patients with primary insomnia, relative spectral theta and sigma power during a therapeutic neurofeedback session were significantly lower than during a sham session ($13.9 \pm 2.6$ vs. $12.2 \pm 3.8$ and $3.6 \pm 0.9$ vs. $3.2 \pm 1.0$ in %, respectively; p < 0.05). There were no statistically significant changes in other EEG spectral bands. Conclusion: For the first time in Korea, EEG spectral power in the theta band was found to increase when a neurofeedback session was applied to patients with insomnia. This outcome might provide some insight into new interventions for improving sleep onset. However, the treatment response of insomniacs was not precisely evaluated due to the limitations of the current pilot study, which requires follow-up studies with larger samples.

Application of Spectral Indices to Drone-based Multispectral Remote Sensing for Algal Bloom Monitoring in the River (하천 녹조 모니터링을 위한 드론 다중분광영상의 분광지수 적용성 평가)

  • Choe, Eunyoung;Jung, Kyung Mi;Yoon, Jong-Su;Jang, Jong Hee;Kim, Mi-Jung;Lee, Ho Joong
    • Korean Journal of Remote Sensing, v.37 no.3, pp.419-430, 2021
  • Remote sensing techniques using drone-based multispectral imagery were studied for fast, two-dimensional monitoring of algal blooms in rivers. Drones are expected to be useful for algal bloom monitoring because of easy field access, high spatial resolution, and reduced atmospheric light scattering. In addition, multispectral sensors could make image processing and analysis procedures simple, fast, and standardized. Spectral indices derived from the active spectrum of photosynthetic pigments in terrestrial plants and phytoplankton were tested for estimating chlorophyll-a concentrations (Chl-a conc.) from drone-based multispectral images. Spectral indices containing the red-edge band showed strong relationships with Chl-a conc.; in particular, the 3-band model (3BM) and the normalized difference chlorophyll index (NDCI) performed well (R² = 0.86, RMSE = 7.5). NDCI uses just two spectral bands, red and red-edge, and provides normalized values, so data processing becomes simple and rapid. The 3BM, which was tuned for accurate prediction of Chl-a conc. in productive water bodies, originally adopts two spectral bands in the red-edge range, 720 and 760 nm; here, the near-infrared band replaced the longer red-edge band because the multispectral sensor used in this study had only the shorter red-edge band. This index is expected to predict Chl-a conc. more accurately with a sensor specialized for the red-edge range.
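
The two indices named in this abstract can be sketched in Python under stated assumptions. The NDCI below uses the usual normalized-difference form on the red and red-edge bands. The three-band form `(1/R1 - 1/R2) * R3` is the commonly published shape of the model and is assumed here, with the near-infrared band standing in for the longer red-edge band as the abstract describes. The function names and sample reflectances are hypothetical, and the regression that maps index values to Chl-a conc. is not given in the abstract, so it is omitted.

```python
def ndci(red, red_edge):
    """Normalized difference chlorophyll index from red and red-edge reflectance."""
    total = red_edge + red
    if total == 0:
        raise ValueError("reflectances sum to zero; index is undefined")
    return (red_edge - red) / total


def three_band_model(red, red_edge, nir):
    """Three-band index (1/red - 1/red_edge) * nir.

    Assumption: nir replaces the longer red-edge band, as the abstract
    describes for a sensor with only one (shorter) red-edge band.
    """
    if red == 0 or red_edge == 0:
        raise ValueError("reflectance must be non-zero")
    return (1.0 / red - 1.0 / red_edge) * nir


# Hypothetical surface reflectances for one pixel (fractions, not percent).
example_ndci = ndci(red=0.04, red_edge=0.06)
example_3bm = three_band_model(red=0.05, red_edge=0.10, nir=0.20)
```

Because NDCI is a ratio of a difference to a sum, it is bounded between -1 and 1 for positive reflectances, which is the normalization the abstract credits for simple and rapid processing.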

Numerical analysis of morphological changes by opening gates of Sejong Weir (보 개방에 의한 하도의 지형변화 과정 수치모의 분석(세종보를 중심으로))

  • Jang, Chang-Lae;Baek, Tae Hyo;Kang, Taeun;Ock, Giyoung
    • Journal of Korea Water Resources Association, v.54 no.8, pp.629-641, 2021
  • In this study, a two-dimensional numerical model (Nays2DH) was applied to analyze morphological changes in the river channel bed under different flood magnitudes after fully opening the Sejong Weir, which was constructed upstream of the Geum River. For this, numerical simulations were performed under assumed flow conditions: a non-uniform flow (NF), an unsteady flow (single flood event, SF), and a continuous flood event (CF). In the SF and CF cases, a normalized hydrograph was calculated from real flood events, rescaled by the peak flow discharge of each scenario, and then applied as the flow discharge at the upstream boundary. To evaluate the morphological changes quantitatively, we analyzed the temporal changes in bed deformation and the bed relief index (BRI), and we compared aerial photographs of the study area with the numerical simulation results. In the NF simulations, as the steady flow discharge increases, the ratio of lower width to depth decreases and the speed of bar migration increases. The BRI initially increases, but its rate of change decreases with time. In addition, the BRI increases as the steady flow discharge increases. In the SF case, the speed of bar migration decreased as the flow discharge changed. A time lag was also observed in the morphological response to the peak flood discharge; in other words, in the SF case, the change in the channel bed shows a phase lag with respect to the hydraulic condition. In the CF simulations, the speed of bar migration for the peak flood discharges decreased exponentially despite the repeated flood occurrences. In addition, as in the SF case, a phase lag was observed, and the speed of bar migration decreased exponentially.
The BRI increased with time, but its rate of increase was modest despite the continuous peak flooding. Through this study, morphological changes driven by the hydrological characteristics of the river were analyzed numerically, and the results suggest that the methodology for quantitatively predicting river bed change according to flow characteristics can be applied in the field.
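
The hydrograph preparation mentioned in this abstract can be sketched as follows. This is an illustration only: it assumes that "normalized" means dividing each discharge by the observed peak and that "reconfigured by the peak flow discharge" means multiplying by the scenario peak. The function names and discharge values are hypothetical and are not taken from the study or from Nays2DH.

```python
def normalize_hydrograph(discharges):
    """Scale a discharge series so that its peak equals 1.0."""
    peak = max(discharges)
    if peak <= 0:
        raise ValueError("peak discharge must be positive")
    return [q / peak for q in discharges]


def rescale_hydrograph(normalized, scenario_peak):
    """Rebuild a hydrograph with the same shape and a chosen peak discharge."""
    return [scenario_peak * q for q in normalized]


# Hypothetical observed flood event (m^3/s) and scenario peak.
observed = [100.0, 400.0, 200.0]
shape = normalize_hydrograph(observed)
scenario = rescale_hydrograph(shape, scenario_peak=1000.0)
```

Keeping the shape fixed while varying only the peak lets each scenario differ in flood magnitude without changing the timing of the rising and falling limbs.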

Subject-Balanced Intelligent Text Summarization Scheme (주제 균형 지능형 텍스트 요약 기법)

  • Yun, Yeoil;Ko, Eunjung;Kim, Namgyu
    • Journal of Intelligence and Information Systems, v.25 no.2, pp.141-166, 2019
  • Recently, channels such as social media and SNS have been creating enormous amounts of data. Among all kinds of data, the portion of unstructured data represented as text has increased geometrically. Because it is difficult to check all text data, it is important to access the data rapidly and grasp the key points of the text. To meet this need for efficient understanding, many studies on text summarization for handling and using tremendous amounts of text data have been proposed. In particular, many summarization methods using machine learning and artificial intelligence algorithms, known as "automatic summarization", have recently been proposed to generate summaries objectively and effectively. However, almost all text summarization methods proposed to date construct summaries based on the frequency of contents in the original documents. Such summaries are limited in that they tend to omit low-weight subjects that are mentioned less often in the original text. If a summary includes only the contents of the major subjects, bias occurs and information is lost, making it hard to ascertain every subject the documents contain. To avoid this bias, one can summarize with a balance between the topics a document has so that all subjects in the document can be ascertained, but an imbalance in the distribution of those subjects still remains. To retain a balance of subjects in the summary, it is necessary to consider the proportion of every subject the documents originally have and also to allocate portions to the subjects equally, so that even sentences on minor subjects are sufficiently included in the summary. In this study, we propose a "subject-balanced" text summarization method that secures balance among all subjects and minimizes the omission of low-frequency subjects. For subject-balanced summaries, we use two summary evaluation metrics, "completeness" and "succinctness".
Completeness means that a summary should fully include the contents of the original documents, and succinctness means that a summary has minimal duplication within itself. The proposed method has three phases. The first phase constructs subject term dictionaries. Topic modeling is used to calculate topic-term weights, which indicate the degree to which each term is related to each topic. From the derived weights, highly related terms for every topic can be identified, and the subjects of the documents can be found from the various topics composed of terms with similar meanings. Then, a few terms that represent each subject well are selected; in this method, they are called "seed terms". However, these terms are too few to explain each subject sufficiently, so enough terms similar to the seed terms are needed for a well-constructed subject dictionary. Word2Vec is used for word expansion, finding terms similar to the seed terms. Word vectors are created by Word2Vec modeling, and from those vectors the similarity between all terms can be derived using cosine similarity. The higher the cosine similarity between two terms, the stronger the relationship between them is defined to be. Terms that have high similarity values with the seed terms of each subject are therefore selected, and after filtering these expanded terms, the subject dictionary is finally constructed. The next phase allocates subjects to every sentence in the original documents. To grasp the contents of all sentences, frequency analysis is first conducted with the specific terms that make up the subject dictionaries. The TF-IDF weight of each subject is calculated after the frequency analysis, which shows how much each sentence explains each subject. However, TF-IDF weights have the limitation that they can increase without bound, so the TF-IDF weights of every subject in each sentence are normalized to values between 0 and 1.
Then, each sentence is allocated to the subject with the maximum TF-IDF weight among all subjects, and a sentence group is finally constructed for each subject. The last phase is summary generation. Sen2Vec is used to determine the similarity between subject sentences, and a similarity matrix is formed. By repeatedly selecting sentences, it is possible to generate a summary that fully includes the contents of the original documents and minimizes duplication within the summary itself. To evaluate the proposed method, 50,000 TripAdvisor reviews were used to construct the subject dictionaries and 23,087 reviews were used to generate summaries. A comparison between summaries from the proposed method and frequency-based summaries was also performed, and the results verified that summaries from the proposed method better retain the balance of all subjects that the documents originally have.
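
Two steps from this abstract, cosine similarity between term vectors and subject allocation from normalized TF-IDF weights, can be sketched in Python. The abstract does not say which normalization is used, so per-subject min-max scaling is an assumption here, as are the function names, the subject labels, and the sample weights. Ties are broken in favor of the first subject listed.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def min_max_normalize_columns(weights):
    """Rescale each subject column of a sentence-by-subject matrix to [0, 1]."""
    normalized_columns = []
    for column in zip(*weights):
        low, high = min(column), max(column)
        span = high - low
        normalized_columns.append(
            [(w - low) / span if span else 0.0 for w in column]
        )
    return [list(row) for row in zip(*normalized_columns)]


def allocate_subjects(weights, subjects):
    """Assign each sentence to the subject with the largest normalized weight."""
    normalized = min_max_normalize_columns(weights)
    return [
        subjects[max(range(len(subjects)), key=row.__getitem__)]
        for row in normalized
    ]


# Hypothetical TF-IDF weights: rows are sentences, columns are subjects.
subjects = ["room", "food"]
tfidf = [[3.0, 1.0], [1.0, 4.0], [2.0, 3.4]]
assigned = allocate_subjects(tfidf, subjects)
```

Normalizing before taking the maximum keeps a subject with a generally larger raw TF-IDF scale from absorbing most sentences, which fits the abstract's stated goal of keeping minor subjects in the summary.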