• Title/Summary/Keyword: data scientist (데이터과학자)

Design and Implementation of Web Analyzing System based on User Create Log (사용자 생성 로그를 이용한 웹 분석시스템 설계 및 구현)

  • Go, Young-Dae;Lee, Eun-Bae
    • Proceedings of the Korea Information Processing Society Conference / 2007.11a / pp.264-267 / 2007
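  • As the number of Internet sites grows, service providers use web mining techniques to understand users' needs and behavior patterns. However, mining techniques that rely on web log information stored on the server require considerable preprocessing effort and are limited in how accurately they can capture users' behavior patterns and needs. To overcome this, this paper proposes a method based on user-generated log information. Instead of using the log files stored on the server, the proposed method collects the information needed for web mining each time a web page is loaded as a result of a user's action and stores it in a DB. Loading time, viewing time, and parameter information are added to the conventional log fields in order to capture users' behavior patterns more realistically. Combining the resulting log with pre-registered menu information and query information improves the efficiency of the preprocessing steps essential to web mining, such as data cleaning, user identification, session identification, and transaction identification, and makes it easier to collect information for understanding users' behavior patterns.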
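
The session-identification step mentioned in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the record fields (`user_id`, `loaded_at`, `load_time`, `view_time`, `params`) and the 30-minute inactivity timeout are assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class PageLog:
    """One user-generated log record (hypothetical field names)."""
    user_id: str
    url: str
    loaded_at: float  # page-load timestamp, in seconds
    load_time: float  # seconds taken to load the page
    view_time: float  # seconds the user stayed on the page
    params: str       # request parameters


SESSION_TIMEOUT = 30 * 60  # assumed inactivity gap, in seconds


def identify_sessions(logs: List[PageLog]) -> Dict[str, List[List[PageLog]]]:
    """Group each user's records into sessions split by inactivity gaps."""
    sessions: Dict[str, List[List[PageLog]]] = {}
    for rec in sorted(logs, key=lambda r: (r.user_id, r.loaded_at)):
        user_sessions = sessions.setdefault(rec.user_id, [])
        last = user_sessions[-1][-1] if user_sessions else None
        if last is not None and rec.loaded_at - last.loaded_at <= SESSION_TIMEOUT:
            user_sessions[-1].append(rec)  # continue the current session
        else:
            user_sessions.append([rec])    # start a new session
    return sessions


logs = [
    PageLog("u1", "/home", 0.0, 0.4, 12.0, ""),
    PageLog("u1", "/search", 20.0, 0.9, 45.0, "q=lidar"),
    PageLog("u1", "/home", 4000.0, 0.3, 5.0, ""),
    PageLog("u2", "/home", 10.0, 0.5, 8.0, ""),
]
result = identify_sessions(logs)
print({user: [len(s) for s in ss] for user, ss in result.items()})
```

Because the user identifier and timing fields are recorded at collection time, the grouping needs no separate cleaning or user-identification pass in this sketch.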

Implementation of an Image-based Korean Beef Grade Discrimination Automation Algorithm (이미지 기반 한우 등급 판별 자동화 알고리즘 구현)

  • Minji Kim;Junseok Oh;Eunchae Jeon;Yonghyun Kwon;YoungGyun Kim
    • Proceedings of the Korea Information Processing Society Conference / 2024.05a / pp.444-446 / 2024
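  • As meat consumption in Korea rises, the demand for and supply of Korean beef (Hanwoo) are also gradually increasing. Korean beef is graded by quality grade (QG) and yield grade (YG), taking into account several criteria such as marbling (intramuscular fat), meat color, fat color, texture, maturity, carcass weight, longissimus muscle cross-sectional area, and backfat thickness. At present, grading is mainly done manually by inspecting the longissimus dorsi (back) muscle with the naked eye. However, it is difficult for graders to judge accurately, and the manual process incurs time and cost problems such as contamination of the meat through worker carelessness. To address these problems through automated grading, this study implemented an algorithm that uses cross-sectional images of Korean beef sirloin to calculate the marbling of the longissimus dorsi muscle and automatically determine the grade, achieving an average accuracy of 79.2%.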
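
As a rough illustration of computing marbling from an image, the Python sketch below measures the fraction of bright (fat-like) pixels inside a muscle-region mask. The abstract does not describe the actual algorithm, so the brightness threshold, the mask, and the function name `marbling_ratio` are all assumptions; the mapping from this ratio to a grade is deliberately omitted.

```python
from typing import List


def marbling_ratio(gray: List[List[int]],
                   mask: List[List[bool]],
                   fat_threshold: int = 180) -> float:
    """Fraction of masked (muscle-region) pixels at or above fat_threshold.

    gray holds 0-255 grayscale values; mask marks pixels inside the muscle.
    The threshold value is an illustrative assumption, not a calibrated one.
    """
    region = 0
    fat = 0
    for row_px, row_mask in zip(gray, mask):
        for value, inside in zip(row_px, row_mask):
            if inside:
                region += 1
                if value >= fat_threshold:
                    fat += 1
    return fat / region if region else 0.0


# Toy 3x4 image: the last column lies outside the muscle region.
gray = [
    [60, 200, 70, 10],
    [210, 65, 190, 10],
    [70, 60, 75, 10],
]
mask = [
    [True, True, True, False],
    [True, True, True, False],
    [True, True, True, False],
]
ratio = marbling_ratio(gray, mask)
print(f"marbling ratio: {ratio:.3f}")
```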

Sound event detection model using self-training based on noisy student model (잡음 학생 모델 기반의 자가 학습을 활용한 음향 사건 검지)

  • Kim, Nam Kyun;Park, Chang-Soo;Kim, Hong Kook;Hur, Jin Ook;Lim, Jeong Eun
    • The Journal of the Acoustical Society of Korea / v.40 no.5 / pp.479-487 / 2021
  • In this paper, we propose a Sound Event Detection (SED) model using self-training based on a noisy student model. The proposed SED model consists of two stages. In the first stage, a mean-teacher model based on a Residual Convolutional Recurrent Neural Network (RCRNN) is constructed to provide target labels for weakly labeled or unlabeled data. In the second stage, a self-training-based noisy student model is constructed by applying different noise types: feature noise, such as time-frequency shift, mixup, and SpecAugment, and dropout-based model noise. In addition, a semi-supervised loss function is applied to train the noisy student model, which acts as label noise injection. The performance of the proposed SED model is evaluated on the validation set of the Detection and Classification of Acoustic Scenes and Events (DCASE) 2020 Challenge Task 4. The experiments show that the single model and the ensemble model of the proposed SED based on the noisy student model improve the F1-score by 4.6 % and 3.4 %, respectively, compared to the top-ranked model in DCASE 2020 Challenge Task 4.
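
Of the feature-noise types listed in this abstract, mixup is the simplest to show in code. The Python sketch below applies the standard mixup formulation to flat feature vectors and label vectors; the `alpha` value and the flattened input shape are assumptions, and the paper's own settings are not given in the abstract.

```python
import random
from typing import List, Optional, Tuple


def mixup(x1: List[float], y1: List[float],
          x2: List[float], y2: List[float],
          alpha: float = 0.2,
          rng: Optional[random.Random] = None
          ) -> Tuple[List[float], List[float], float]:
    """Blend two examples with a weight lam drawn from Beta(alpha, alpha)."""
    rng = rng or random.Random()
    lam = rng.betavariate(alpha, alpha)
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x1, x2)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y1, y2)]
    return x, y, lam


# Two toy examples: 4 feature values each, 3 event classes.
x_a, y_a = [0.0, 1.0, 2.0, 3.0], [1.0, 0.0, 0.0]
x_b, y_b = [4.0, 3.0, 2.0, 1.0], [0.0, 0.0, 1.0]
x_mix, y_mix, lam = mixup(x_a, y_a, x_b, y_b, rng=random.Random(0))
print(lam, x_mix, y_mix)
```

The labels are mixed with the same weight as the features, so the student is trained on soft targets rather than hard ones.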

Design of a Video Metadata Schema and Implementation of an Authoring Tool for User Edited Contents Creation (User Edited Contents 생성을 위한 동영상 메타데이터 스키마 설계 및 저작 도구 구현)

  • Song, Insun;Nang, Jongho
    • Journal of KIISE / v.42 no.3 / pp.413-418 / 2015
  • In this paper, we design a new video metadata schema for searching video segments to create UEC (User Edited Contents). The proposed video metadata schema employs hierarchically structured units of 'Title-Event-Place(Scene)-Shot', and defines the fields of semantic information in structured form for each segment unit. Since this video metadata schema is defined by analyzing the structure of existing UECs and by experimenting with tagging and searching video segment units for creating UECs, it helps users search for useful video segments for UEC more easily than MPEG-7 MDS (Multimedia Description Scheme), which is a general-purpose international standard for video metadata.
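
The 'Title-Event-Place(Scene)-Shot' hierarchy can be pictured as nested records. The Python sketch below is only an illustration of that nesting: the per-unit fields (`name`, `start`, `end`, `tags`) and the `find_shots` helper are hypothetical, since the abstract does not list the schema's actual fields.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Shot:
    start: float  # seconds from the beginning of the video
    end: float
    tags: List[str] = field(default_factory=list)  # hypothetical semantic field


@dataclass
class Place:
    name: str
    shots: List[Shot] = field(default_factory=list)


@dataclass
class Event:
    name: str
    places: List[Place] = field(default_factory=list)


@dataclass
class Title:
    name: str
    events: List[Event] = field(default_factory=list)


def find_shots(title: Title, tag: str) -> List[Tuple[str, str, Shot]]:
    """Return (event name, place name, shot) for every shot carrying tag."""
    return [
        (event.name, place.name, shot)
        for event in title.events
        for place in event.places
        for shot in place.shots
        if tag in shot.tags
    ]


video = Title("Trip", [
    Event("Hiking", [
        Place("Trailhead", [Shot(0.0, 8.5, ["group", "walking"])]),
        Place("Summit", [Shot(8.5, 20.0, ["view"]), Shot(20.0, 31.0, ["group"])]),
    ]),
])
hits = find_shots(video, "group")
print([(e, p, s.start, s.end) for e, p, s in hits])
```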

Unmanned Aircraft Platform Based Real-time LiDAR Data Processing Architecture for Real-time Detection Information (실시간 탐지정보 제공을 위한 무인기 플랫폼 기반 실시간 LiDAR 데이터 처리구조)

  • Eum, Junho;Berhanu, Eyassu;Oh, Sangyoon
    • KIISE Transactions on Computing Practices / v.21 no.12 / pp.745-750 / 2015
  • LiDAR (Light Detection and Ranging) technology provides realistic 3-dimensional image information and has been widely used in various fields. However, its use in the military domain, which requires prompt responses to dynamically changing tactical environments, is limited because LiDAR technology requires complex processing before extensive amounts of LiDAR data can be used. In this paper, we introduce an Unmanned Aircraft Platform Based Real-time LiDAR Data Processing Architecture that can provide real-time detection information through parallel processing and off-loading between the UAV processing area and the high-performance data processing area. We also conducted experiments to verify the feasibility of the proposed architecture. Processing with an ARM cluster similar to the UAV platform processing area yields similar or better performance compared to the current method. We determined that the proposed architecture can be used in the military domain for tactical and combat purposes such as unmanned monitoring systems.

A Secure and Practical Encrypted Data De-duplication with Proof of Ownership in Cloud Storage (클라우드 스토리지 상에서 안전하고 실용적인 암호데이터 중복제거와 소유권 증명 기술)

  • Park, Cheolhee;Hong, Dowon;Seo, Changho
    • Journal of KIISE / v.43 no.10 / pp.1165-1172 / 2016
  • In a cloud storage environment, deduplication enables efficient use of storage. In addition, to save network bandwidth, cloud storage service providers have introduced client-side deduplication. Cloud storage service users want to upload encrypted data to ensure confidentiality. However, common encryption methods cannot be combined with deduplication, because each user uses a different private key. Client-side deduplication can also be vulnerable to security threats because a file tag replaces the entire file. Recently, proof of ownership schemes have been suggested to remedy the vulnerabilities of client-side deduplication. Nevertheless, client-side deduplication over encrypted data still causes problems in efficiency and security. In this paper, we propose a secure and practical client-side encrypted data deduplication scheme that is resilient to brute-force attacks and performs proof of ownership over encrypted data.
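
To make concrete why a file tag standing in for the whole file is a weakness, here is a minimal Python sketch of plain client-side deduplication. It does not implement the proposed scheme: it uses an unencrypted SHA-256 tag and no proof of ownership, and `DedupServer`, `client_upload`, and the other names are hypothetical.

```python
import hashlib
from typing import Dict, Set


def file_tag(data: bytes) -> str:
    """Short identifier derived from the file contents."""
    return hashlib.sha256(data).hexdigest()


class DedupServer:
    """Toy storage server that keeps one copy per tag."""

    def __init__(self) -> None:
        self._blobs: Dict[str, bytes] = {}
        self._owners: Dict[str, Set[str]] = {}

    def has_tag(self, tag: str) -> bool:
        return tag in self._blobs

    def upload(self, user: str, tag: str, data: bytes) -> None:
        self._blobs[tag] = data
        self._owners.setdefault(tag, set()).add(user)

    def claim(self, user: str, tag: str) -> bool:
        """Register ownership from the tag alone (no proof of ownership)."""
        if tag not in self._blobs:
            return False
        self._owners[tag].add(user)
        return True

    def download(self, user: str, tag: str) -> bytes:
        if user not in self._owners.get(tag, set()):
            raise PermissionError("not an owner")
        return self._blobs[tag]


def client_upload(server: DedupServer, user: str, data: bytes) -> int:
    """Upload with client-side deduplication; return the bytes actually sent."""
    tag = file_tag(data)
    if server.has_tag(tag):
        server.claim(user, tag)  # only the tag crosses the network
        return 0
    server.upload(user, tag, data)
    return len(data)


server = DedupServer()
document = b"quarterly report"
sent_first = client_upload(server, "alice", document)
sent_second = client_upload(server, "bob", document)

# Someone who learned only the tag, never the file, can still claim it.
leaked_tag = file_tag(document)
server.claim("mallory", leaked_tag)
stolen = server.download("mallory", leaked_tag)
print(sent_first, sent_second, stolen == document)
```

A proof of ownership step would replace the bare `claim` call with a challenge that only a holder of the full file can answer.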

A Scheduling Algorithm for Performance Enhancement of Science Data Center Network based on OpenFlow (오픈플로우 기반의 과학실험데이터센터 네트워크의 성능 향상을 위한 스케줄링 알고리즘)

  • Kong, Jong Uk;Min, Seok Hong;Lee, Jae Yong;Kim, Byung Chul
    • Journal of the Korea Institute of Information and Communication Engineering / v.21 no.9 / pp.1655-1665 / 2017
  • Recently, data centers have been actively constructed by many cloud service providers, enterprises, research institutes, and others. Generally, they are built on a tree topology using the ECMP data forwarding scheme for load balancing. In this paper, we examine data center network topologies such as the tree and fat-tree topologies, and load balancing technologies such as MLAG and ECMP. Then, we propose a scheduling algorithm to efficiently transmit particular files stored on hosts in the data center to a destination node outside the data center, where a fat-tree topology is used and the OpenFlow protocol runs between the infrastructure layer and the control layer. We perform a numerical performance analysis and compare the results with those of ECMP. Through this comparison, we show that the proposed algorithm outperforms ECMP in terms of throughput and file transfer completion time.
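
For readers unfamiliar with the ECMP baseline, the Python sketch below shows the usual hash-based idea: a flow's header fields are hashed and the result selects one of the equal-cost next hops. The choice of SHA-256, the 5-tuple layout, and the switch names are illustrative assumptions; real switches use vendor-specific hash functions, and this is not the scheduling algorithm proposed in the paper.

```python
import hashlib
from typing import List, Tuple

# (source IP, destination IP, source port, destination port, protocol)
Flow = Tuple[str, str, int, int, str]


def ecmp_next_hop(flow: Flow, next_hops: List[str]) -> str:
    """Pick a next hop by hashing the flow's 5-tuple (static, load-unaware)."""
    key = "|".join(str(part) for part in flow).encode()
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:4], "big") % len(next_hops)
    return next_hops[index]


uplinks = ["core-1", "core-2", "core-3", "core-4"]  # hypothetical switch names
flow_a: Flow = ("10.0.1.2", "192.0.2.10", 40000, 443, "tcp")
flow_b: Flow = ("10.0.1.3", "192.0.2.10", 40001, 443, "tcp")

choice_a = ecmp_next_hop(flow_a, uplinks)
choice_b = ecmp_next_hop(flow_b, uplinks)
print(choice_a, choice_b)
```

Because the choice depends only on the header hash, two large transfers can land on the same uplink regardless of current load, which is the kind of inefficiency a controller with a network-wide view can try to avoid.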

Performance Evaluation of Recurrent Neural Network Algorithms for Recommendation System in E-commerce (전자상거래 추천시스템을 위한 순환신경망 알고리즘들의 성능평가)

  • Seo, Jihye;Yong, Hwan-Seung
    • KIISE Transactions on Computing Practices / v.23 no.7 / pp.440-445 / 2017
  • Due to the advance of e-commerce systems, the number of online shoppers and products has increased significantly. Therefore, the need for an accurate recommendation system is becoming increasingly important. A recurrent neural network is a deep-learning algorithm that utilizes sequential information in training. In this paper, we evaluate the application of recurrent neural networks to recommendation systems. We evaluated three commonly used recurrent algorithms (RNN, LSTM and GRU) and three commonly used optimization algorithms (Adagrad, RMSProp and Adam). In the experiments, we used the TensorFlow open source library produced by Google and e-commerce session data from RecSys Challenge 2015. The results obtained with the optimal hyperparameters found in this study are compared with those of RecSys Challenge 2015 participants.
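
As a reminder of what one of the compared optimizers does, the Python sketch below writes out a single Adam update for a scalar parameter using the standard formulation and commonly used default constants. It is a teaching sketch in plain Python, not the TensorFlow code used in the study, and the toy objective and learning rate are assumptions.

```python
import math
from typing import Tuple


def adam_step(param: float, grad: float, m: float, v: float, t: int,
              lr: float = 0.001, beta1: float = 0.9, beta2: float = 0.999,
              eps: float = 1e-8) -> Tuple[float, float, float]:
    """One Adam update; t is the 1-based step count."""
    m = beta1 * m + (1.0 - beta1) * grad         # first-moment estimate
    v = beta2 * v + (1.0 - beta2) * grad * grad  # second-moment estimate
    m_hat = m / (1.0 - beta1 ** t)               # bias correction
    v_hat = v / (1.0 - beta2 ** t)
    param = param - lr * m_hat / (math.sqrt(v_hat) + eps)
    return param, m, v


# Toy objective f(x) = (x - 3)^2 with gradient 2 * (x - 3).
x, m, v = 0.0, 0.0, 0.0
x_after_first = None
for t in range(1, 51):
    x, m, v = adam_step(x, 2.0 * (x - 3.0), m, v, t, lr=0.05)
    if t == 1:
        x_after_first = x
print(x_after_first, x)
```

On the first step the bias-corrected ratio is close to the sign of the gradient, so the parameter moves by roughly the learning rate regardless of the gradient's scale.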

Decentralized Group Key Management for Untrusted Dynamic Networks (신뢰할 수 없는 동적 네트워크 환경을 위한 비중앙화 그룹키 관리 기법)

  • Hur, Jun-Beom;Yoon, Hyun-Soo
    • Journal of KIISE:Information Networking / v.36 no.4 / pp.263-274 / 2009
  • Decentralized group key management mechanisms offer beneficial solutions to enhance the scalability and reliability of a secure multicast framework by confining the impact of a membership change to a local area. However, many of the previous decentralized solutions reveal the plaintext to the intermediate relaying proxies, or require the key distribution center to coordinate secure group communications between subgroups. In this study, we propose a decentralized group key management scheme in which a service provider delivers the group key to valid members in a distributed manner using proxy cryptography. In the proposed scheme, the key distribution center is eliminated, while the confidentiality of the transmitted message is preserved during delivery. The proposed scheme can support secure group communication in dynamic network environments where there is no trusted central controller for the whole network and the network topology changes frequently.

Analysis of privacy issues and countermeasures in neural network learning (신경망 학습에서 프라이버시 이슈 및 대응방법 분석)

  • Hong, Eun-Ju;Lee, Su-Jin;Hong, Do-won;Seo, Chang-Ho
    • Journal of Digital Convergence / v.17 no.7 / pp.285-292 / 2019
  • With the popularization of PCs, SNS, and IoT, a great deal of data is being generated, and the amount is increasing exponentially. Artificial neural network learning, which makes use of huge amounts of data, has attracted attention in many fields in recent years. It has shown tremendous potential in speech recognition and image recognition, and is widely applied to a variety of complex areas such as medical diagnosis, artificial intelligence games, and face recognition. The results of artificial neural networks are accurate enough to surpass humans. Despite these advantages, privacy problems still exist in artificial neural network learning. Training data for artificial neural networks contains various kinds of information, including sensitive personal information, so privacy can be exposed by malicious attackers. Privacy risks arise when an attacker interferes with training to degrade it, or attacks a model that has completed training. In this paper, we analyze recently proposed attack methods on neural network models and the corresponding privacy protection methods.