Title/Summary/Keyword: Internet-web based system

GWB: An integrated software system for Managing and Analyzing Genomic Sequences (GWB: 유전자 서열 데이터의 관리와 분석을 위한 통합 소프트웨어 시스템)

  • Kim In-Cheol; Jin Hoon
    • Journal of Internet Computing and Services, v.5 no.5, pp.1-15, 2004
  • In this paper, we explain the design and implementation of GWB (Gene WorkBench), a web-based, integrated system for efficiently managing and analyzing genomic sequences. Most existing software systems that handle genomic sequences rarely provide both management and analysis facilities. The analysis programs also tend to be unit programs that cover only a single function or part of the required functions. Moreover, these programs are widely distributed over the Internet and require different execution environments. Because a great deal of manual work and format conversion is required to use these programs together, many life science researchers suffer great inconvenience. In order to overcome the problems of existing systems and support genomic research more effectively, this paper integrates both management and analysis facilities into a single system called GWB. The most important design issues for GWB are how to integrate many different analysis programs into a single software system and how to provide the data or databases of the different formats required to run these programs. To address these issues, GWB integrates different analysis programs through common input/output interfaces called wrappers, suggests a common format for genomic sequence data, organizes local databases consisting of a relational database and an indexed sequential file, and provides facilities for converting data among several well-known formats and exporting local databases into XML files.
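The wrapper idea lends itself to a short illustration. Below is a minimal Python sketch, not the actual GWB code: the command, the converter callbacks, and the file formats are illustrative assumptions.

```python
import subprocess
import tempfile

class AnalysisWrapper:
    """Common input/output interface around one external analysis program.

    Hypothetical sketch: the command and converters are illustrative
    assumptions, not the actual GWB wrapper interfaces.
    """

    def __init__(self, command, to_native, from_native):
        self.command = command          # e.g. ["some_aligner", "--in"]
        self.to_native = to_native      # common format -> tool's input format
        self.from_native = from_native  # tool's raw output -> common format

    def run(self, record):
        # Write the common-format record in the tool's native input format.
        with tempfile.NamedTemporaryFile("w", suffix=".in", delete=False) as f:
            f.write(self.to_native(record))
            infile = f.name
        # Invoke the external program and normalize its output.
        result = subprocess.run(self.command + [infile],
                                capture_output=True, text=True, check=True)
        return self.from_native(result.stdout)
```

Because every tool is reached through the same `run()` interface, the surrounding system can chain heterogeneous programs without per-tool glue code, which is the integration problem the abstract describes.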


An Algorithm to Detect P2P Heavy Traffic based on Flow Transport Characteristics (플로우 전달 특성 기반의 P2P 헤비 트래픽 검출 알고리즘)

  • Choi, Byeong-Geol; Lee, Si-Young; Seo, Yeong-Il; Yu, Zhibin; Jun, Jae-Hyun; Kim, Sung-Ho
    • Journal of KIISE: Information Networking, v.37 no.5, pp.317-326, 2010
  • Nowadays, the transmission bandwidth of network traffic is increasing and its types are varied, such as peer-to-peer (P2P) and real-time video, because the distributed computing environment has spread and various network-based applications have been developed. However, as P2P traffic occupies much of the volume of Internet backbone traffic, the transmission bandwidth and quality of service (QoS) of other network applications such as web, FTP, and real-time video cannot be guaranteed. In previous research, the port-based technique, which checks well-known port numbers, and the Deep Packet Inspection (DPI) technique, which checks the payload of packets, were suggested for solving the problem of P2P traffic; however, these methods are difficult to apply to the detection of P2P traffic because P2P applications do not use well-known port numbers and the payload of packets may be encrypted. The proposed algorithm for identifying P2P heavy traffic, based on flow transport parameters and behavioral characteristics, can overcome the limitations of the port-based and DPI techniques. The focus of this paper is to identify P2P heavy-traffic flows rather than all P2P traffic. P2P traffic consists of two steps: i) searching for the opposite peer that has some content, and ii) downloading the content from one or more peers. We define P2P flow patterns based on these features of P2P applications and then implement a system to classify P2P heavy traffic.
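As a rough illustration of the two-step flow pattern described above, a detector might look like the following Python sketch; the thresholds and flow fields are illustrative assumptions, not the parameters used in the paper.

```python
from collections import defaultdict

# Illustrative thresholds, not the paper's tuned parameters.
SEARCH_FANOUT = 20          # distinct peers contacted in the search step
HEAVY_BYTES = 50 * 2 ** 20  # sustained download volume (50 MiB)

def detect_p2p_heavy_hosts(flows):
    """flows: iterable of dicts with 'src', 'dst', and 'bytes' keys.

    A host that first contacts many distinct peers with small flows
    (content search) and then moves a large volume (content download)
    matches the two-step P2P pattern and is flagged as heavy.
    """
    fanout = defaultdict(set)
    volume = defaultdict(int)
    for flow in flows:
        fanout[flow["src"]].add(flow["dst"])
        volume[flow["src"]] += flow["bytes"]
    return {host for host in fanout
            if len(fanout[host]) >= SEARCH_FANOUT
            and volume[host] >= HEAVY_BYTES}
```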

Establishment and service of user analysis environment related to computational science and engineering simulation platform

  • Kwon, Yejin; Jeon, Inho; On, Noori; Seo, Jerry H.; Lee, Jongsuk R.
    • Journal of Internet Computing and Services, v.21 no.6, pp.123-132, 2020
  • The EDucation-research Integration through Simulation On the Net (EDISON) platform, a web-based platform that provides computational science and engineering simulation execution environments, can offer various analysis environments to students and general users as well as computational science and engineering researchers. To expand the user base of its simulation environment services, the EDISON platform holds a challenge every year and attempts to increase the competitiveness and excellence of the platform by analyzing user requirements for the various simulation environments offered. The challenge platform system in the field of computational science and engineering is provided to users in connection with the simulation services of the existing EDISON platform. Previously, EDISON challenge services operated independently of the simulation services, and hence services such as end-user review and intermediate simulation results could not be linked. To meet these user requirements, the challenge platform for computational science and engineering currently in service is linked to the existing computational science and engineering services. In addition, it was possible to increase the efficiency of service resources by providing limited services informed by various analyses of all users participating in the challenge. In this study, by analyzing users' simulation and usage environments, we provide an improved challenge platform and analyze ways to improve the simulation execution environment.

Study on Companion Dog Practice and Management System on Animal Hospital in Seoul City (서울시내 동물병원의 애완견 진료와 관리시스템에 관한 연구)

  • Kang, Dong-Joon; Chung, Byung-Hyun; Heo, Jung; Kang, Chung-Boo
    • Journal of agriculture & life science, v.43 no.2, pp.39-46, 2009
  • Ten animal hospitals in Seoul were selected, and a total of 41,305 cases of practice over the most recent year were analyzed based on the International Classification of Diseases (ICD) of the WHO. Of the entire practice, prophylaxis cases accounted for 42.2%. Of the 9,376 cases of internal disease, digestive system diseases accounted for 4,957 cases (52.9%), respiratory system diseases 1,776 cases (18.9%), circulatory system diseases 1,239 cases (13.2%), and neoplasms 413 cases (4.4%). Within real estate investment expenses, the rental guarantee deposit accounted for about 50% of the entire real estate cost, the foregift 48%, and monthly rent 2%. Dividing the income of the animal hospitals into medical service, sale of goods, and pet-dog grooming, medical service accounted for 71.6% of total sales, sale of goods 16.7%, and pet-dog grooming 11.6%. Publicity for the animal hospitals was done mostly through websites on the Internet (47%), followed by local advertising papers (23.5%) and bulletins and telemarketing (11.8%). The staff of the animal hospitals in this study was composed mostly of veterinarians (45.5%), about half of the entire staff, while there was no veterinary technician with specialized education in the area.

Clickstream Big Data Mining for Demographics based Digital Marketing (인구통계특성 기반 디지털 마케팅을 위한 클릭스트림 빅데이터 마이닝)

  • Park, Jiae; Cho, Yoonho
    • Journal of Intelligence and Information Systems, v.22 no.3, pp.143-163, 2016
  • The demographics of Internet users are the most basic and important sources for target marketing or personalized advertisements on digital marketing channels, which include email, mobile, and social media. However, it has gradually become difficult to collect the demographics of Internet users because their activities are anonymous in many cases. Although a marketing department can obtain demographics through online or offline surveys, these approaches are very expensive, take a long time, and are likely to include false statements. Clickstream data is the record an Internet user leaves behind while visiting websites. As the user clicks anywhere in a webpage, the activity is logged in semi-structured website log files. Such data allows us to see what pages users visited, how long they stayed there, how often they visited, when they usually visited, which sites they prefer, what keywords they used to find the site, whether they purchased anything, and so forth. For this reason, some researchers have tried to infer the demographics of Internet users from their clickstream data. They derived various independent variables likely to be correlated with the demographics, including search keywords; frequency and intensity by time, day, and month; variety of websites visited; text information of the web pages visited; and so on. The demographic attributes to predict also vary across papers, covering gender, age, job, location, income, education, marital status, and presence of children. A variety of data mining methods, such as LSA, SVM, decision trees, neural networks, logistic regression, and k-nearest neighbors, were used for prediction model building. However, this research has not yet identified which data mining method is appropriate for predicting each demographic variable. Moreover, it is necessary to review the independent variables studied so far, combine them as needed, and evaluate them for building the best prediction model. The objective of this study is to choose the clickstream attributes most likely to be correlated with the demographics from the results of previous research, and then to identify which data mining method is fit to predict each demographic attribute. Among the demographic attributes, this paper focuses on predicting gender, age, marital status, residence, and job, and from the results of previous research, 64 clickstream attributes are applied to predict these demographic attributes. The overall process of predictive model building is composed of four steps. In the first step, we create user profiles that include the 64 clickstream attributes and 5 demographic attributes. The second step performs dimension reduction of the clickstream variables to address the curse of dimensionality and the overfitting problem; we utilize three approaches, based on decision trees, PCA, and cluster analysis. We build alternative predictive models for each demographic variable in the third step, using SVM, neural networks, and logistic regression for modeling. The last step evaluates the alternative models in terms of model accuracy and selects the best model. For the experiments, we used clickstream data representing 5 demographic attributes and 16,962,705 online activities of 5,000 Internet users. IBM SPSS Modeler 17.0 was used for the prediction process, and 5-fold cross-validation was conducted to enhance the reliability of our experiments. The experimental results verify that there is a specific data mining method well-suited to each demographic variable. For example, age prediction performs best when using decision-tree-based dimension reduction and a neural network, whereas the prediction of gender and marital status is most accurate when applying SVM without dimension reduction. We conclude that the online behaviors of Internet users, captured through clickstream data analysis, can be used to predict their demographics and thereby be utilized in digital marketing.
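The four-step workflow (profile building, dimension reduction, model building, 5-fold evaluation) can be sketched with scikit-learn as below; the study itself used IBM SPSS Modeler 17.0, so this is only an analogous pipeline, and the components shown (PCA, SVM) are just one of the several combinations the paper compares.

```python
from sklearn.pipeline import Pipeline
from sklearn.decomposition import PCA
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

def evaluate_demographic_model(X, y, reduce_dims=True):
    """X: user profiles over the 64 clickstream attributes.
    y: one demographic attribute (e.g. gender) per user.
    """
    steps = []
    if reduce_dims:
        # PCA stands in for one of the paper's three reduction
        # approaches (decision tree, PCA, cluster analysis).
        steps.append(("pca", PCA(n_components=10)))
    # SVM is one of the paper's three classifiers (SVM, neural
    # network, logistic regression).
    steps.append(("clf", SVC()))
    # 5-fold cross-validation, as in the paper's evaluation step.
    return cross_val_score(Pipeline(steps), X, y, cv=5).mean()
```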

Odysseus/Parallel-OOSQL: A Parallel Search Engine using the Odysseus DBMS Tightly-Coupled with IR Capability (오디세우스/Parallel-OOSQL: 오디세우스 정보검색용 밀결합 DBMS를 사용한 병렬 정보 검색 엔진)

  • Ryu, Jae-Joon; Whang, Kyu-Young; Lee, Jae-Gil; Kwon, Hyuk-Yoon; Kim, Yi-Reun; Heo, Jun-Suk; Lee, Ki-Hoon
    • Journal of KIISE: Computing Practices and Letters, v.14 no.4, pp.412-429, 2008
  • As the number of electronic documents increases rapidly with the growth of the Internet, a parallel search engine capable of handling a large number of documents is becoming ever more important. To implement a parallel search engine, we need to partition the inverted index and search the partitioned index in parallel. There are two methods of partitioning the inverted index: 1) document-identifier-based partitioning and 2) keyword-identifier-based partitioning. However, each method alone has drawbacks. The former is convenient for inserting documents and has high throughput, but has poor performance for top-k query processing. The latter has good performance for top-k query processing, but is inconvenient for inserting documents and has low throughput. In this paper, we propose a hybrid partitioning method to compensate for the drawbacks of each method. We design and implement a parallel search engine that supports the hybrid partitioning method using the Odysseus DBMS tightly coupled with information retrieval capability. We first introduce the architecture of the parallel search engine, Odysseus/Parallel-OOSQL. We then show the effectiveness of the proposed system through systematic experiments. The experimental results show that the query processing time of the document-identifier-based partitioning method is approximately inversely proportional to the number of blocks in the partition of the inverted index. The results also show that the keyword-identifier-based partitioning method performs well in top-k query processing. The proposed parallel search engine can be optimized for performance by customizing the method of partitioning the inverted index according to the application environment. The Odysseus/Parallel-OOSQL parallel search engine is capable of indexing, storing, and querying 100 million web documents per node, or tens of billions of web documents for the entire system.
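The hybrid partitioning can be pictured as two-level routing of postings; the sketch below only illustrates the idea and is not the Odysseus implementation.

```python
def route_posting(term, doc_id, n_keyword_groups, n_doc_shards):
    """Two-level routing sketch for a hybrid-partitioned inverted index.

    Keyword-identifier partitioning keeps all postings of a term within
    one keyword group, which helps top-k query processing; inside that
    group, document-identifier partitioning spreads the term's postings
    across shards, which helps insertion throughput.
    """
    keyword_group = hash(term) % n_keyword_groups
    doc_shard = doc_id % n_doc_shards
    return keyword_group, doc_shard

# A top-k query for a term then touches one keyword group (all of its
# shards in parallel), while document insertion spreads across shards.
```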

Development of Android Smartphone App for Corner Point Feature Extraction using Remote Sensing Image (위성영상정보 기반 코너 포인트 객체 추출 안드로이드 스마트폰 앱 개발)

  • Kang, Sang-Goo; Lee, Ki-Won
    • Korean Journal of Remote Sensing, v.27 no.1, pp.33-41, 2011
  • In information and communication technology, a worldwide shift from the web to smartphone apps is apparent, driven by user demand and the developer environment, and the geo-spatial domain needs appropriate technological responses to this trend. However, most smartphone apps offer map services or location recognition services, and uses of geo-spatial content remain somewhat limited or at the prototype development stage. In this study, an app for extracting corner point features from geo-spatial imagery and linking them to a database system is developed. Corner extraction is based on the Harris algorithm, and all processing modules composing the app, spanning the database server, application server, and client interface, are designed and implemented on open source software. An LOD (Level of Detail) process is applied to the extracted corner points to optimize them for the display panel. A further useful function superimposes geo-spatial imagery on the digital map of the same area. It is expected that this app can be utilized for the automatic establishment of POI (Points of Interest) or for point-based land change detection.
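For reference, Harris corner extraction of the kind the app performs can be sketched in a few lines with OpenCV; the parameter values below are illustrative, not those used in the app.

```python
import cv2
import numpy as np

def extract_corner_points(image_path, threshold_ratio=0.01):
    """Harris corner sketch; blockSize, ksize, k, and the threshold are
    illustrative values, not the app's actual settings."""
    gray = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    # Harris response map; peaks correspond to corner-like pixels.
    response = cv2.cornerHarris(np.float32(gray), blockSize=2, ksize=3, k=0.04)
    ys, xs = np.where(response > threshold_ratio * response.max())
    # Candidate corner pixels; an LOD filter would thin these for display.
    return list(zip(xs.tolist(), ys.tolist()))
```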

Location Service Modeling of Distributed GIS for Replication Geospatial Information Object Management (중복 지리정보 객체 관리를 위한 분산 지리정보 시스템의 위치 서비스 모델링)

  • Jeong, Chang-Won; Lee, Won-Jung; Lee, Jae-Wan; Joo, Su-Chong
    • The KIPS Transactions: Part D, v.13D no.7 s.110, pp.985-996, 2006
  • As Internet technologies develop, the geographic information system environment is changing to web-based services. Since the geospatial information of existing Web-GIS services was developed independently, there is no interoperability to support diverse map formats. The same geospatial information object can be used for various purposes and is duplicated across separate GISs. Intelligent strategies are therefore needed for optimal replica selection, that is, the identification of replicated geospatial information objects. For the management of replicated objects, OMG, GLOBE, and GRID computing have suggested related frameworks, but these efforts do not go far enough in the case of geospatial information objects. This paper presents a location service model that supports optimal selection among replicas and management of replicated objects. It consists of three main services. The first is a binding service, which saves the names and properties of objects defined by users according to service offers and enables clients to search them. The second is a location service, which manages location information with contact records and independently obtains performance information through the Load Sharing Facility along with contact addresses. The third is an intelligent selection service, which obtains basic and performance information from the binding and location services and provides both faster access and better performance through rules in an intelligent model based on rough sets. To validate the location service model, this paper presents the execution of the location service through a graphical user interface.
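The intelligent selection step can be illustrated with a much-simplified sketch; the paper derives its selection rules from rough sets, so the plain weighted score below is only a hypothetical stand-in for those rules.

```python
def select_replica(replicas, w_latency=0.6, w_load=0.4):
    """replicas: dicts with 'address', 'latency_ms', 'load' keys, assumed
    to be filled in from binding-service and location-service lookups.
    The weights are arbitrary placeholders for the rough-set rules.
    """
    def score(replica):  # lower is better
        return w_latency * replica["latency_ms"] + w_load * replica["load"]
    return min(replicas, key=score)["address"]
```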

Korean Word Sense Disambiguation using Dictionary and Corpus (사전과 말뭉치를 이용한 한국어 단어 중의성 해소)

  • Jeong, Hanjo; Park, Byeonghwa
    • Journal of Intelligence and Information Systems, v.21 no.1, pp.1-13, 2015
  • As opinion mining in big data applications has been highlighted, a lot of research on unstructured data has been done. Much social media on the Internet generates unstructured or semi-structured data every second, often in the natural or human languages we use in daily life. Many words in human languages have multiple meanings or senses. As a result, it is very difficult for computers to extract useful information from these datasets. Traditional web search engines are usually based on keyword search, resulting in incorrect search results far from users' intentions. Even though much progress in enhancing the performance of search engines has been made over the years to provide users with appropriate results, there is still much room for improvement. Word sense disambiguation can play a very important role in natural language processing and is considered one of the most difficult problems in this area. Major approaches to word sense disambiguation can be classified as knowledge-based, supervised corpus-based, and unsupervised corpus-based approaches. This paper presents a method that automatically generates a corpus for word sense disambiguation by taking advantage of examples in existing dictionaries, avoiding expensive sense tagging processes. The effectiveness of the method is tested with the Naïve Bayes model, a supervised learning algorithm, using the Korean standard unabridged dictionary and the Sejong Corpus. The Korean standard unabridged dictionary has approximately 57,000 sentences. The Sejong Corpus has about 790,000 sentences tagged with both part-of-speech and senses. For the experiment of this study, the Korean standard unabridged dictionary and the Sejong Corpus were tested both combined and as separate entities, using cross-validation. Only nouns, the target subjects of word sense disambiguation, were selected. 93,522 word senses among 265,655 nouns, and 56,914 sentences from related proverbs and examples, were additionally combined into the corpus. The Sejong Corpus was easily merged with the Korean standard unabridged dictionary because the Sejong Corpus was tagged based on the sense indices defined by the Korean standard unabridged dictionary. Sense vectors were formed after the merged corpus was created. Terms used in creating sense vectors were added to the named entity dictionary of a Korean morphological analyzer. Using the extended named entity dictionary, term vectors were extracted from the input sentences, and term vectors for the sentences were created. Given the extracted term vector and the sense vector model built during the pre-processing stage, the sense-tagged terms were determined by vector-space-model-based word sense disambiguation. In addition, this study shows the effectiveness of the corpus merged from examples in the Korean standard unabridged dictionary and the Sejong Corpus: the experiment shows that better precision and recall are achieved with the merged corpus. This study suggests the method can practically enhance the performance of Internet search engines and help us understand the meaning of a sentence more accurately in natural language processing pertinent to search engines, opinion mining, and text mining. The Naïve Bayes classifier used in this study is a supervised learning algorithm based on Bayes' theorem, under the assumption that all senses are independent. Even though this assumption is not realistic and ignores the correlation between attributes, the Naïve Bayes classifier is widely used because of its simplicity, and in practice it is known to be very effective in many applications such as text classification and medical diagnosis. However, further research needs to be carried out to consider all possible combinations and/or partial combinations of all senses in a sentence. Also, the effectiveness of word sense disambiguation may be improved if rhetorical structures or morphological dependencies between words are analyzed through syntactic analysis.
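A toy version of the Naïve Bayes disambiguator described above can be written as follows; the features and smoothing are simplified, and the sketch is not the authors' implementation of the full sense-vector pipeline.

```python
import math
from collections import Counter, defaultdict

class NaiveBayesWSD:
    """Toy Naïve Bayes word-sense disambiguator over sense-tagged examples.

    Simplified sketch: bag-of-words context features with Laplace
    smoothing, assuming senses are independent as the abstract notes.
    """

    def __init__(self):
        self.sense_counts = Counter()
        self.word_counts = defaultdict(Counter)
        self.vocab = set()

    def train(self, examples):
        # examples: iterable of (context_words, sense) pairs.
        for words, sense in examples:
            self.sense_counts[sense] += 1
            self.word_counts[sense].update(words)
            self.vocab.update(words)

    def disambiguate(self, context_words):
        total = sum(self.sense_counts.values())
        v = len(self.vocab)

        def log_posterior(sense):
            n = sum(self.word_counts[sense].values())
            logp = math.log(self.sense_counts[sense] / total)  # prior
            for w in context_words:  # Laplace-smoothed likelihoods
                logp += math.log((self.word_counts[sense][w] + 1) / (n + v))
            return logp

        return max(self.sense_counts, key=log_posterior)
```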

A Study on the Operating Conditions of Lecture Contents in Contactless Online Classes for University Students (대학생 대상 비대면 온라인 수업에서의 강의 콘텐츠 운영 실태 연구)

  • Lee, Jongmoon
    • Journal of the Korean BIBLIA Society for library and Information Science, v.32 no.4, pp.5-24, 2021
  • The purpose of this study was to investigate and analyze the operating conditions of lecture contents in contactless online classes for university students. First, analysis of the responses of 93 respondents showed that 93.3% of them took real-time online lectures (47.7%) or recorded video lectures (45.6%). Second, analysis of the contents used as textbooks found that e-books (materials) and paper books (materials) were used together (36.6%), or e-books or electronic materials alone were used (36.6% and 37.6%, respectively), in both liberal arts (47.3%) and major subjects (39.8%). In addition to textbooks, both major subjects and liberal arts made heavy use of web materials (47.6% and 40.5%, respectively) and YouTube materials (33.3% and 48.0%, respectively) as external materials. Third, both liberal arts and major subjects used 'electronic files in the form of PPT or text organized and written by instructors' (62.9% and 58.1%, respectively), 'Internet materials' (16.7% and 19%, respectively), and 'paper books or materials' (10.4% and 12.3%, respectively) to share lecture contents. Regarding the on-screen presentation of lecture contents, 93.5% of the respondents were satisfied in major subjects, and 90.2% were satisfied in liberal arts. These results suggest developing multimedia-based lecture contents and an evaluation solution capable of real-time exam supervision, developing a task management system capable of AI-based plagiarism detection, task guidance, and task evaluation, and institutionalizing a solution to the copyright problems involved in digitizing lecture materials so that lectures can be given in a ubiquitous environment.