• Title/Summary/Keyword: Feature Library

A Study on Shot Segmentation and Indexing of Language Education Videos by Content-based Visual Feature Analysis (교육용 어학 영상의 내용 기반 특징 분석에 의한 샷 구분 및 색인에 대한 연구)

  • Han, Heejun
    • Journal of the Korean Society for Information Management / v.34 no.1 / pp.219-239 / 2017
  • As information technology develops rapidly and smart devices become more widespread, video has become an especially common medium of information transmission among audiovisual materials. Video is now an indispensable element of information services and is used in various ways, such as unidirectional delivery through TV, interactive services over the Internet, and borrowing from audiovisual libraries. In the Internet environment in particular, information providers try to reduce the effort and cost of processing the information they provide for video services on smart devices. In addition, users want to use only the parts they need because of the burden of excessive network usage and constraints on time and space. It is therefore necessary to enhance the usability of video by automatically classifying, summarizing, and indexing similar parts of the content. In this paper, we propose a method that automatically segments the shots that make up a video by analyzing the content and characteristics of language education videos, and that indexes the detailed content information of these videos by combining visual features. The accuracy of the semantic-based shot segmentation is high, and the method can be effectively applied to summary services for language education videos.

Research on text mining based malware analysis technology using string information (문자열 정보를 활용한 텍스트 마이닝 기반 악성코드 분석 기술 연구)

  • Ha, Ji-hee;Lee, Tae-jin
    • Journal of Internet Computing and Services / v.21 no.1 / pp.45-55 / 2020
  • With the development of information and communication technology, the number of new and variant malicious codes is increasing rapidly every year, and various types of malicious code are spreading along with Internet of Things and cloud computing technology. In this paper, we propose a malware analysis method based on string information, which can be used regardless of the operating system environment and represents library call information related to malicious behavior. Attackers can easily create malware by reusing existing code or by using automated authoring tools, and the generated malware operates in a similar way to existing malware. Since most of the strings that can be extracted from malicious code consist of information closely related to malicious behavior, the strings are weighted as data features using a text mining based method so that they serve as effective features for malware analysis. Based on the processed data, models are constructed using various machine learning algorithms to perform experiments on malware detection and malware group classification. The data were compared and verified against all files used on Windows and Linux operating systems. The accuracy of malware detection is about 93.5%, and the accuracy of group classification is about 90%. The proposed technique has a wide range of applications because it is relatively simple, fast, and operating system independent, and because a single model suffices: there is no need to build a model for each group when classifying malicious groups. In addition, since the string information is extracted through static analysis, it can be processed faster than analysis methods that directly execute the code.
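
The abstract above does not say which weighting scheme is applied to the extracted strings. As a minimal sketch, assuming a TF-IDF style weighting (a common text mining choice, not confirmed by the source), the idea of down-weighting strings shared by every file can be illustrated as follows. The function name `tfidf_weights` and the sample strings are hypothetical.

```python
import math
from collections import Counter


def tfidf_weights(docs):
    """Weight strings per file with TF-IDF (an assumed scheme, for illustration).

    docs: list of lists, each holding the strings extracted from one file.
    Returns one {string: weight} dict per file.
    """
    n_docs = len(docs)
    # Document frequency: number of files in which each string occurs.
    df = Counter()
    for strings in docs:
        df.update(set(strings))

    weighted = []
    for strings in docs:
        tf = Counter(strings)
        total = len(strings)
        # Term frequency times inverse document frequency.
        weighted.append({
            s: (count / total) * math.log(n_docs / df[s])
            for s, count in tf.items()
        })
    return weighted


# Hypothetical strings, as might be obtained by static extraction.
samples = [
    ["CreateRemoteThread", "WriteProcessMemory", "kernel32.dll"],
    ["kernel32.dll", "printf", "printf"],
    ["kernel32.dll", "socket", "connect"],
]
weights = tfidf_weights(samples)
```

In this toy input, a string that appears in every file receives zero weight, while strings specific to one file are weighted more heavily. The resulting vectors could then be passed to any classifier.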

Metagenomic Approach on the Eukaryotic Plankton Biodiversity in Coastal Water of Busan (Korea) (부산 연안역의 진핵플랑크톤 종다양성에 대한 메타게놈 분석 연구)

  • Yoon, Ji-Mie;Lee, Jee-Eun;Lee, Sang-Rae;Rho, Tae-Keun;Lee, Jin-Ae;Chung, Ik-Kyo;Lee, Tong-Sup
    • The Sea:JOURNAL OF THE KOREAN SOCIETY OF OCEANOGRAPHY / v.17 no.2 / pp.59-75 / 2012
  • Knowledge of the species composition of plankton is essential for understanding material and energy cycling within marine ecosystems. It also provides useful information about the properties of marine environments, because plankton is sensitive to the physicochemical characteristics and variability of water masses. In this study we adopted metagenomics to evaluate eukaryotic plankton species diversity in coastal waters off Busan. The characteristics of the water masses at the sampling sites are expected to be very complex because several water masses mix there: Nakdong River runoff, Changjiang diluted water (CDW), South Sea coastal water, and the Tsushima warm current. 18S rDNA clone libraries were constructed from surface waters at three sites off Busan. The clone libraries revealed 94 unique phylotypes from 370 clones: Dinophyceae (42 phylotypes), Ciliophora (15 phylotypes), Bacillariophyta (7 phylotypes), Chlorophyta (2 phylotypes), Haptophyceae (1 phylotype), Metazoa (Arthropoda (17 phylotypes), Chaetognatha (1 phylotype), Cnidaria (2 phylotypes), Chordata (1 phylotype)), Rhizaria (Acantharea (2 phylotypes), Polycystinea (1 phylotype)), Telonemida (1 phylotype), and Fungi (2 phylotypes). The difference in species diversity among the three closely located sites off Busan may be attributed to the differing physicochemical properties of the water masses at these sites, which result from the mixture of water masses of various origins. Metagenomic study of species composition may provide useful information for understanding the marine ecosystems of coastal waters with various physicochemical properties in the near future.

Comparison of physics-based and data-driven models for streamflow simulation of the Mekong river (메콩강 유출모의를 위한 물리적 및 데이터 기반 모형의 비교·분석)

  • Lee, Giha;Jung, Sungho;Lee, Daeeop
    • Journal of Korea Water Resources Association / v.51 no.6 / pp.503-514 / 2018
  • Recently, the hydrological regime of the Mekong river has been changing drastically due to climate change and haphazard watershed development, including dam construction. Information on hydrologic features such as the streamflow of the Mekong river is required for water disaster prevention and sustainable water resources development in the countries sharing the river. In this study, runoff simulations at the Kratie station on the lower Mekong river are performed using SWAT (Soil and Water Assessment Tool), a physics-based hydrologic model, and LSTM (Long Short-Term Memory), a data-driven deep learning algorithm. The SWAT model was set up using globally available databases (topography: HydroSHED, land use: GLCF-MODIS, soil: FAO-Soil map, rainfall: APHRODITE, etc.) and then used to simulate daily discharge from 2003 to 2007. The LSTM was built using the open-source deep learning library TensorFlow, and its deep-layer neural networks were trained solely on daily water level data from 10 stations upstream of Kratie during two periods: 2000~2002 and 2008~2014. The LSTM then simulated daily discharge for 2003~2007, the same period as the SWAT model. The simulation results show a Nash-Sutcliffe Efficiency (NSE) of 0.9 for SWAT and 0.99 for LSTM. For simple simulation of the hydrological time series of large ungauged watersheds, a data-driven model such as the LSTM is more applicable than a physics-based hydrological model, whose complexity stems from the burden of assembling various databases, because the LSTM is able to memorize preceding time series sequences and reflect them in its predictions.
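
For reference, the Nash-Sutcliffe Efficiency used to compare the two models is conventionally defined as one minus the ratio of the squared simulation error to the variance of the observations around their mean. The sketch below implements that standard definition; the function name and the sample discharge values are hypothetical and are not data from the study.

```python
def nash_sutcliffe(observed, simulated):
    """Nash-Sutcliffe Efficiency: 1 - sum((obs - sim)^2) / sum((obs - mean(obs))^2)."""
    if len(observed) != len(simulated) or not observed:
        raise ValueError("series must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    # Squared error of the simulation against the observations.
    error = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    # Spread of the observations around their own mean.
    spread = sum((o - mean_obs) ** 2 for o in observed)
    if spread == 0:
        raise ValueError("NSE is undefined for a constant observed series")
    return 1 - error / spread


# Hypothetical daily discharge values (illustrative only).
obs = [1.0, 2.0, 3.0]
perfect = nash_sutcliffe(obs, [1.0, 2.0, 3.0])
mean_only = nash_sutcliffe(obs, [2.0, 2.0, 2.0])
```

A value of 1 indicates a perfect match, while a value of 0 means the simulation is no better than predicting the observed mean at every time step.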

Development of a Korean Speech Recognition Platform (ECHOS) (한국어 음성인식 플랫폼 (ECHOS) 개발)

  • Kwon Oh-Wook;Kwon Sukbong;Jang Gyucheol;Yun Sungrack;Kim Yong-Rae;Jang Kwang-Dong;Kim Hoi-Rin;Yoo Changdong;Kim Bong-Wan;Lee Yong-Ju
    • The Journal of the Acoustical Society of Korea / v.24 no.8 / pp.498-504 / 2005
  • We introduce a Korean speech recognition platform (ECHOS) developed for education and research purposes. ECHOS lowers the entry barrier to speech recognition research and can be used as a reference engine by providing elementary speech recognition modules. It has a simple object-oriented architecture, implemented in the C++ language with the standard template library. The input of ECHOS is digital speech data sampled at 8 or 16 kHz. Its output is the 1-best recognition result, N-best recognition results, and a word graph. The recognition engine is composed of MFCC/PLP feature extraction, HMM-based acoustic modeling, n-gram language modeling, and finite state network (FSN)- and lexical tree-based search algorithms. It can handle various tasks from isolated word recognition to large vocabulary continuous speech recognition. We compare the performance of ECHOS with that of the hidden Markov model toolkit (HTK) for validation. In an FSN-based task, ECHOS shows similar word accuracy, while the recognition time is doubled because of the object-oriented implementation. For an 8000-word continuous speech recognition task, ECHOS uses a lexical tree search algorithm different from the algorithm used in HTK; it increases the word error rate by 40% in relative terms but reduces the recognition time by half.

Exploring Usability of Mobile Text Messaging Interfaces (휴대폰 문자메시지 기능의 인터페이스 이용성에 관한 연구)

  • Lee, Jee-Yeon
    • Journal of Information Management / v.35 no.4 / pp.1-16 / 2004
  • In this paper, we outline the user interface problems that text messaging users face, in order to provide an empirical basis for developing improved mobile text messaging systems. Our initial hypothesis was that the majority of the problems that text messaging users face, namely 1) difficulty in correctly understanding the intent of incoming messages and 2) frequently mis-addressing the recipient of outgoing messages, can be accounted for by the poor usability of the text messaging user interface. Our analysis is based on text message-based communication diaries, which were recorded for one week by each of 75 college students, and on a survey taken by the same subjects. The data were collected in 2004. The students listed various difficulties, including the limited message length, an obscure input method, the lack of a means to express emotional content, the lack of receipt confirmation, the lack of an auto-save feature when preparing messages to send, and the lack of a means to save messages permanently. Some of these problems were also identified in previous studies. However, we were able to gather additional problems that users face and to elicit potential solutions to them. From these findings and analysis, we attempted to provide ways to improve the text messaging user interface.

A Study on Analyzing Sentiments on Movie Reviews by Multi-Level Sentiment Classifier (영화 리뷰 감성분석을 위한 텍스트 마이닝 기반 감성 분류기 구축)

  • Kim, Yuyoung;Song, Min
    • Journal of Intelligence and Information Systems / v.22 no.3 / pp.71-89 / 2016
  • Sentiment analysis is used to identify emotions or sentiments embedded in user-generated data such as customer reviews from blogs, social network services, and so on. Various research fields, such as computer science and business management, can take advantage of it to analyze customer-generated opinions. In previous studies, the star rating of a review was regarded as equivalent to the sentiment embedded in the text. However, the star rating does not always correspond to the sentiment polarity, and this assumption limits the accuracy of previous studies. To solve this issue, the present study uses a supervised sentiment classification model to measure sentiment polarity more accurately. This study aims to propose an advanced sentiment classifier and to discover the correlation between movie reviews and box-office success. The advanced sentiment classifier is based on two supervised machine learning techniques, Support Vector Machines (SVM) and a Feedforward Neural Network (FNN). The sentiment scores of the movie reviews are measured by the sentiment classifier, and the statistical correlation between movie reviews and box-office success is analyzed. Movie reviews are collected along with their star ratings. The dataset used in this study consists of 1,258,538 reviews of 175 films gathered from the Naver Movie website (movie.naver.com). The results show that the proposed sentiment classifier outperforms the Naive Bayes (NB) classifier, with an accuracy about 6% higher than that of NB. Furthermore, the results indicate a positive correlation between the star rating and the audience size, which can be regarded as the box-office success of a movie. The study also shows a mild positive correlation between the sentiment scores estimated by the classifier and the audience size. To verify the applicability of the sentiment scores, an independent-samples t-test was conducted. For this, the movies were divided into two groups using the average of the sentiment scores. The two groups differ significantly in terms of their star ratings.
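
The abstract reports correlations between review scores and audience size without naming the coefficient. As a sketch, assuming the Pearson correlation coefficient (an assumption, not stated in the source), the computation looks like this. The function name and the sample values are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("series must have equal length of at least 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (unnormalized covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Root sums of squared deviations (unnormalized standard deviations).
    sx = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sy = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / (sx * sy)


# Hypothetical per-film values: mean sentiment score and audience size.
sentiment = [0.2, 0.5, 0.6, 0.9]
audience = [100_000, 450_000, 300_000, 900_000]
r = pearson_r(sentiment, audience)
```

The coefficient ranges from -1 to 1, and a mild positive correlation, as described in the abstract, corresponds to a small positive value.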

Analysis of Twitter for 2012 South Korea Presidential Election by Text Mining Techniques (텍스트 마이닝을 이용한 2012년 한국대선 관련 트위터 분석)

  • Bae, Jung-Hwan;Son, Ji-Eun;Song, Min
    • Journal of Intelligence and Information Systems / v.19 no.3 / pp.141-156 / 2013
  • Social media is a representative form of Web 2.0 that shapes changes in users' information behavior by allowing users to produce their own content without any expert skills. In particular, as a new communication medium, it has a profound impact on social change by enabling users to communicate their opinions and thoughts to the masses and to acquaintances. Social media data plays a significant role in the emerging Big Data arena. A variety of research areas, such as social network analysis and opinion mining, have therefore paid attention to discovering meaningful information from the vast amounts of data buried in social media. Social media has recently become a main focus of the fields of Information Retrieval and Text Mining, because it not only produces massive unstructured textual data in real time but also serves as an influential channel for opinion leading. However, most previous studies have adopted broad-brush and limited approaches, which have made it difficult to find and analyze new information. To overcome these limitations, we developed a real-time Twitter trend mining system that captures trends by processing big stream datasets from Twitter in real time. The system offers term co-occurrence retrieval, visualization of Twitter users by query, similarity calculation between two users, topic modeling to keep track of changes in topical trends, and mention-based user network analysis. In addition, we conducted a case study on the 2012 Korean presidential election. We collected 1,737,969 tweets containing the candidates' names and the word election from Twitter in Korea (http://www.twitter.com/) for one month in 2012 (October 1 to October 31). The case study shows that the system provides useful information and detects societal trends effectively. The system also retrieves the list of terms that co-occur with given query terms. We compare the results of term co-occurrence retrieval using the names of the influential candidates, 'Geun Hae Park', 'Jae In Moon', and 'Chul Su Ahn', as query terms. General terms related to the presidential election, such as 'Presidential Election', 'Proclamation in Support', and 'Public opinion poll', appear frequently. The results also show specific terms that differentiate each candidate, such as 'Park Jung Hee' and 'Yuk Young Su' for the query 'Geun Hae Park', 'a single candidacy agreement' and 'Time of voting extension' for the query 'Jae In Moon', and 'a single candidacy agreement' and 'down contract' for the query 'Chul Su Ahn'. Our system not only extracts 10 topics along with related terms but also shows the topics' dynamic changes over time by employing the multinomial Latent Dirichlet Allocation technique. Each topic shows one of two types of patterns, a rising tendency or a falling tendency, depending on the change in its probability distribution. To determine the relationship between topic trends on Twitter and social issues in the real world, we compare topic trends with related news articles. We find that Twitter can track issues faster than other media such as newspapers. The user network on Twitter differs from those of other social media because of the distinctive way relationships are formed on Twitter: users build relationships by exchanging mentions. We visualize and analyze mention-based networks of 136,754 users, using the three candidates' names, 'Geun Hae Park', 'Jae In Moon', and 'Chul Su Ahn', as query terms. The results show that Twitter users mention all of the candidates' names regardless of their political tendencies. This case study shows that Twitter could be an effective tool for detecting and predicting dynamic changes in social issues, and that mention-based user networks could reveal different aspects of user behavior as a network unique to Twitter.
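
Term co-occurrence retrieval, as described above, can be sketched as counting the terms that appear in the same tweet as a query term. The following is a minimal illustration under that assumption and is not the system's actual implementation. The function name, the whitespace-free token lists, and the sample tweets are all hypothetical.

```python
from collections import Counter


def cooccurring_terms(tweets, query):
    """Count terms that appear in the same tweet as the query term.

    tweets: list of token lists (tokenization is assumed to be done already).
    Each term is counted at most once per tweet.
    """
    counts = Counter()
    for tokens in tweets:
        unique = set(tokens)
        if query in unique:
            counts.update(unique - {query})
    return counts


# Hypothetical tokenized tweets (illustrative only).
tweets = [
    ["candidate_a", "election", "poll"],
    ["candidate_a", "poll", "poll", "debate"],
    ["candidate_b", "election"],
]
counts = cooccurring_terms(tweets, "candidate_a")
top_terms = counts.most_common(2)
```

Counting each term once per tweet keeps a single repetitive tweet from dominating the ranking. A real system would also need tokenization suited to Korean text and stop-word filtering.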

Mechanisms of Insulinotropic Effect of YHB-2017 [Genistein] Isolated from fermentation Broths of Streptomyces sp. (방선균에서 유래한 YHB-2017 [Genistein]의 인슐린 분비 촉진 작용 기전)

  • Kwag, Won-Jae;Park, You-Hoi;Park, Jun-Chul;Lee, Byung-Kyu;Kang, Yup;Choe, Tae-Boo
    • KSBB Journal / v.21 no.6 s.101 / pp.466-473 / 2006
  • Impaired insulin secretion from pancreatic beta-cells in response to glucose is an important feature in the pathology of non-insulin-dependent diabetes mellitus (NIDDM). In the course of screening for useful insulin secretagogues, we isolated and identified YHB-2017 (Genistein) as an insulin secretion potentiator from fermentation broths of our in-house microbial library. The insulinotropic activity of YHB-2017 in isolated rat pancreatic islets was exerted only at high concentrations of glucose (8.3-16 mM) and not at low concentrations of glucose (3.3-5.5 mM). In a perifusion study with isolated rat pancreatic islets, YHB-2017 also stimulated insulin secretion in a time-dependent manner when it was added to KRB buffer containing 16 mM glucose. In the presence of 200 μM diazoxide and 35 mM KCl, which stimulates maximum $Ca^{2+}$ influx independently of the $K_{ATP}$ channel, YHB-2017 enhanced $K_{ATP}$ channel-independent insulin secretion at a high concentration of glucose (16 mM). To elucidate the mechanisms of the glucose-dependent potentiation effect of YHB-2017, islets were pre-treated with pharmacologic inhibitors of protein kinase A, protein kinase C, and calcium/calmodulin kinase II, and the potentiation effect of YHB-2017 on insulin secretion was then investigated. Pre-treatment with H89, a PKA inhibitor, had a significant inhibitory effect on the YHB-2017-induced potentiation. Furthermore, western immunoblotting analyses revealed that YHB-2017 increased phosphorylation of PKA substrates and cAMP response element-binding protein (CREB) under a high concentration of glucose. These results demonstrate that the insulinotropic effect of YHB-2017 is mediated through the PKA signaling pathway and through activation of the amplifying $K_{ATP}$ channel-independent insulin secretion pathway.