• Title/Summary/Keyword: Big Data Environment

Export-Import Value Nowcasting Procedure Using Big Data-AIS and Machine Learning Techniques

  • NICKELSON, Jimmy;NOORAENI, Rani;EFLIZA, EFLIZA
    • Asian Journal of Business Environment / v.12 no.3 / pp.1-12 / 2022
  • Purpose: This study investigates whether AIS data can serve as a supporting indicator or an early signal of Indonesia's export-import conditions in real time. Research design, data, and methodology: The study applies several stages of data selection to obtain AIS indicators that truly reflect export-import activity in Indonesia. It also investigates the potential of these AIS indicators for forecasting the value and volume of Indonesian exports and imports using conventional statistical methods and machine learning techniques. Results: The six preprocessing stages defined in this study reduced the AIS data from 661.8 million messages to 73.5 million messages. Seven predictors were formed from the selected AIS data. The AIS indicators can provide an early signal of Indonesia's import-export activity. Each export or import activity has its own predictor. Conventional statistical methods and machine learning techniques perform equally well in forecasting Indonesia's exports and imports. Conclusions: AIS big data can be used as a supporting indicator that signals the condition of export-import values in Indonesia. The right method of building indicators makes the data valuable for the performance of the forecasting model.

A Multimodal Profile Ensemble Approach to Development of Recommender Systems Using Big Data (빅데이터 기반 추천시스템 구현을 위한 다중 프로파일 앙상블 기법)

  • Kim, Minjeong;Cho, Yoonho
    • Journal of Intelligence and Information Systems / v.21 no.4 / pp.93-110 / 2015
  • A recommender system recommends products to customers who are likely to be interested in them. Various recommender systems have been developed based on automated information filtering technology. Collaborative filtering (CF), one of the most successful recommendation algorithms, has been applied in a number of domains, such as recommending Web pages, books, movies, music, and products. However, CF is known to have a critical shortcoming. CF finds neighbors whose preferences are similar to those of the target customer and recommends the products those neighbors have liked most. Thus, CF works properly only when there is a sufficient number of ratings on common products from customers. When customer ratings are scarce, CF forms an inaccurate neighborhood, which results in poor recommendations. To improve the performance of CF-based recommender systems, most related studies have focused on developing novel algorithms under the assumption of a single profile, created from users' item ratings, purchase transactions, or Web access logs. With the advent of big data, companies have come to collect more data and to use a wide variety of large-scale information. Many companies therefore recognize the importance of utilizing big data, because it lets them improve their competitiveness and create new value. In particular, the use of personal big data in recommender systems is an emerging issue, because personal big data facilitate more accurate identification of users' preferences and behaviors. The proposed recommendation methodology is as follows. First, multimodal user profiles are created from personal big data in order to grasp the preferences and behavior of users from various viewpoints. We derive five user profiles based on personal information: rating, site preference, demographic, Internet usage, and topic in text.
Next, the similarity between users is calculated from the profiles, and each user's neighbors are found from the results. One of three ensemble approaches is applied to calculate the similarity: the similarity of the combined profile, the average similarity of each profile, or the weighted average similarity of each profile. Finally, the products most preferred by the neighborhood are recommended to the target users. For the experiments, we used demographic data and a very large volume of Web log transactions for 5,000 panel users of a company that specializes in analyzing Web site rankings. R and SAS E-miner were used to implement the proposed recommender system and to conduct the topic analysis using keyword search, respectively. To evaluate recommendation performance, we used 60% of the data for training and 40% for testing. Five-fold cross validation was also conducted to enhance the reliability of our experiments. For the evaluation we employed the F1 metric, a widely used combination metric that gives equal weight to recall and precision. The proposed methodology achieved a significant improvement over the single-profile CF algorithm. In particular, the ensemble approach using weighted average similarity shows the highest performance: the rate of improvement in F1 is 16.9 percent for the ensemble approach using weighted average similarity and 8.1 percent for the ensemble approach using the average similarity of each profile. From these results, we conclude that the multimodal profile ensemble approach is a viable solution to the problems encountered when customer ratings are scarce. This study is significant in suggesting what kinds of information can be used to create profiles in a big data environment and how they can be combined and utilized effectively.
However, our methodology needs further study before real-world application. We need to compare differences in recommendation accuracy by applying the proposed method to different recommendation algorithms, and then identify which combination shows the best performance.
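
As a rough illustration of the three ensemble approaches named in this abstract, the sketch below computes user-to-user similarity from several profiles. Cosine similarity, the profile names, the example vectors, and the weights are all assumptions chosen for the example; the abstract does not state which similarity measure or weights the authors used.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # treat an empty profile as having no similarity
    return dot / (norm_u * norm_v)


def combined_profile_similarity(profiles_a, profiles_b):
    """Approach 1: concatenate all profiles, then compute one similarity."""
    names = sorted(profiles_a)
    flat_a = [x for name in names for x in profiles_a[name]]
    flat_b = [x for name in names for x in profiles_b[name]]
    return cosine(flat_a, flat_b)


def average_similarity(profiles_a, profiles_b):
    """Approach 2: unweighted mean of the per-profile similarities."""
    sims = [cosine(profiles_a[n], profiles_b[n]) for n in profiles_a]
    return sum(sims) / len(sims)


def weighted_average_similarity(profiles_a, profiles_b, weights):
    """Approach 3: weighted mean of the per-profile similarities."""
    total = sum(weights[n] for n in profiles_a)
    weighted = sum(
        weights[n] * cosine(profiles_a[n], profiles_b[n]) for n in profiles_a
    )
    return weighted / total


# Hypothetical users with two of the five profile types.
user_a = {"rating": [5, 3, 0], "demographic": [1, 0]}
user_b = {"rating": [5, 3, 0], "demographic": [0, 1]}
weights = {"rating": 3.0, "demographic": 1.0}  # assumed weights

print(combined_profile_similarity(user_a, user_b))
print(average_similarity(user_a, user_b))
print(weighted_average_similarity(user_a, user_b, weights))
```

In this toy case the two users agree on ratings and differ on demographics, so the weighted variant scores them as more similar than the plain average because the rating profile is given more weight.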

SuperDepthTransfer: Depth Extraction from Image Using Instance-Based Learning with Superpixels

  • Zhu, Yuesheng;Jiang, Yifeng;Huang, Zhuandi;Luo, Guibo
    • KSII Transactions on Internet and Information Systems (TIIS) / v.11 no.10 / pp.4968-4986 / 2017
  • In this paper, we address the difficulty of automatically generating a plausible depth map from a single image in an unstructured environment. The aim is to extrapolate a depth map with a more correct, rich, and distinct depth order that is both quantitatively accurate and visually pleasing. Our technique, which builds on the existing DepthTransfer algorithm, transfers depth information at the level of superpixels, within an instance-based learning framework that replaces the pixel basis. A key feature that improves superpixel matching precision is the posterior incorporation of predicted semantic labels into the depth extraction procedure. Finally, a modified Cross Bilateral Filter is used to refine the final depth field. Experiments for training and evaluation on the Make3D Range Image Dataset demonstrate that this depth estimation method outperforms state-of-the-art methods on the correlation coefficient, mean log10 error, and root mean squared error metrics, and achieves comparable performance on the average relative error metric, in both efficacy and computational efficiency. The approach can be used to automatically convert 2D images into stereo for 3D visualization, producing anaglyph images that are more realistic and more immersive.
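
The four evaluation metrics named in this abstract can be sketched as follows. These are the definitions commonly used for depth estimation benchmarks; the abstract does not spell out the exact formulas the authors applied, so treat the details (for example, per-pixel averaging over a flat list of depths) as assumptions.

```python
import math


def depth_metrics(pred, truth):
    """Compare predicted and ground-truth depths (flat lists of positive floats)."""
    n = len(truth)
    pairs = list(zip(pred, truth))

    # Average relative error: mean of |d - d*| / d*
    rel = sum(abs(p - t) / t for p, t in pairs) / n

    # Mean log10 error: mean of |log10(d) - log10(d*)|
    log10_err = sum(abs(math.log10(p) - math.log10(t)) for p, t in pairs) / n

    # Root mean squared error
    rmse = math.sqrt(sum((p - t) ** 2 for p, t in pairs) / n)

    # Pearson correlation coefficient between prediction and ground truth
    mean_p = sum(pred) / n
    mean_t = sum(truth) / n
    cov = sum((p - mean_p) * (t - mean_t) for p, t in pairs)
    var_p = sum((p - mean_p) ** 2 for p in pred)
    var_t = sum((t - mean_t) ** 2 for t in truth)
    corr = cov / math.sqrt(var_p * var_t)

    return {"rel": rel, "log10": log10_err, "rmse": rmse, "corr": corr}


# Hypothetical depths in metres for four pixels.
print(depth_metrics([2.1, 4.8, 9.5, 20.0], [2.0, 5.0, 10.0, 18.0]))
```

Lower is better for the three error metrics, while a correlation coefficient closer to 1 indicates that the predicted depth ordering follows the ground truth.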

A Study on Activation of New Mobile Communication Spectrum in the Environment of Mobile Big Data Traffic (모바일 빅 데이터 트래픽 환경에서 새로운 이동통신 주파수의 활성화 방안 연구)

  • Chung, Woo-Ghee
    • Journal of Satellite, Information and Communications / v.7 no.2 / pp.42-46 / 2012
  • This paper analyzes the technical and economic conditions that activate the use of mobile communication spectrum, so that mobile big data traffic does not limit the growth of mobile broadband service, and proposes a method for activating that use. To activate new mobile communication spectrum, investment expenditure and income must be balanced. The activation of new mobile communication spectrum to handle mobile big data traffic depends on technical and economic conditions and on factors internal and external to the service provider. Investment expenditure is related to CAPEX and OPEX, which are internal factors of the service provider, and to the spectrum price, which is an external factor. Investment income is related to the tariff system, which is an internal factor of the service provider, and to spectrum neutrality, which is an external factor. New mobile communication spectrum can be activated when investment expenditure and investment income are in balance, where the expenditure includes the spectrum price and the income includes a tariff system that can fund network expansion and revenue from traffic growth driven by external content.

Methodology on e-Navigation-Assisted Ocean Monitoring and Big Data Analysis (이내비게이션을 활용한 해양환경관측 및 빅데이터 분석방안)

  • LEE, GUAN-HONG;PARK, JAE-HUN;HA, HO KYUNG;KIM, DO WAN;LEE, WOOJOO;KIM, HONGTAE;SHIN, HYUN-JUNG
    • The Sea:JOURNAL OF THE KOREAN SOCIETY OF OCEANOGRAPHY / v.23 no.4 / pp.204-217 / 2018
  • This study proposes a cost-effective method to monitor coastal environments using domestic and international ferries equipped with e-Navigation, and to analyze the big data of records such as wind, temperature, salinity, waves, and currents that are gathered through the e-Navigation system. First, we present the concept and architecture of the e-Navigation operation system based on the General Information Center on Maritime Safety and Security. Then, we discuss the marine observation system that can be applied to ferries operating in our nation's territory. Analytical methods for handling the real-time big data obtained by the e-Navigation observing system, such as the spatio-temporal mixed effects model, the ensemble method, and the meshfree method, are then explained in detail. This study will support the implementation of the Korean e-Navigation project, which focuses on the safety of small vessels such as coasters and fishing vessels.

A Study on Vehicle Big Data-based Micro-scale Segment Speed Information Service for Future Traffic Environment Assistance (미래 교통환경 지원을 위한 차량 빅데이터 기반의 미시구간 속도정보 서비스 방안 연구)

  • Choi, Kanghyeok;Chong, Kyusoo
    • The Journal of The Korea Institute of Intelligent Transport Systems / v.21 no.2 / pp.74-84 / 2022
  • Average vehicle speed measured at a point or over a short section cannot accurately capture speed changes on an actual highway. This study proposes a segment separation method based on vehicle big data for accurate micro-scale speed estimation. To find the points where speed deviations occur, time mean speed and space mean speed functions were applied to location-based individual vehicle big data. Next, points where the micro-scale speed changes are classified through gradual segment separation based on geohash. A comparative evaluation of the results shows that link-based speed cannot accurately represent the speed of micro-scale segments.
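
For readers unfamiliar with the two speed measures mentioned in this abstract, the sketch below uses their standard traffic-engineering definitions: time mean speed is the arithmetic mean of observed speeds, and space mean speed is their harmonic mean. How the authors combine the two to locate speed deviations is not described in the abstract, so the `speed_gap` helper and the sample values are hypothetical.

```python
def time_mean_speed(speeds_kmh):
    """Arithmetic mean of the speeds observed for individual vehicles."""
    return sum(speeds_kmh) / len(speeds_kmh)


def space_mean_speed(speeds_kmh):
    """Harmonic mean of the speeds: total distance divided by total travel
    time when every vehicle covers the same distance."""
    return len(speeds_kmh) / sum(1.0 / v for v in speeds_kmh)


def speed_gap(speeds_kmh):
    """Hypothetical indicator: the gap between the two means, which grows
    as speeds within the segment become more varied."""
    return time_mean_speed(speeds_kmh) - space_mean_speed(speeds_kmh)


# Hypothetical observations (km/h) for two short segments.
uniform_segment = [100.0, 100.0, 100.0]
mixed_segment = [60.0, 120.0]

print(speed_gap(uniform_segment))
print(speed_gap(mixed_segment))
```

When all vehicles travel at the same speed the two means coincide; the more the speeds differ, the further the space mean speed falls below the time mean speed.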

Analysis of Factors Influencing Street Vitality in High-Density Residential Areas Based on Multi-source Data: A Case Study of Shanghai

  • Yuan, Meilun;Chen, Yong
    • International Journal of High-Rise Buildings / v.10 no.1 / pp.1-8 / 2021
  • Big data and open data, together with traditional measured data, now constitute a new data environment that opens new technical paths for quantitative analysis of the street environment. Streets provide precious linear public space in high-density residential areas, and pedestrian activities are the main component of street vitality. In this paper, 441 street segments were selected as cases from 21 residential districts in the high-density downtown area of Shanghai to quantitatively evaluate the factors influencing pedestrian activities. Bivariate analysis showed that street vitality was correlated not only with a highly populated environment but also with other factors. In particular, the density of entrances and exits of residential properties, the proportion of walkable areas, and the density of retail and service facilities were correlated with the vitality of street segments. The magnitude of correlation between street environmental factors and pedestrian traffic differed across trip purposes. Segment connectivity factors were more strongly correlated with walking for leisure than with walking for transportation. Public transportation factors were mainly correlated with walking for transportation, whereas vehicular traffic factors were negatively correlated with walking for leisure.

Parallelization of Genome Sequence Data Pre-Processing on Big Data and HPC Framework (빅데이터 및 고성능컴퓨팅 프레임워크를 활용한 유전체 데이터 전처리 과정의 병렬화)

  • Byun, Eun-Kyu;Kwak, Jae-Hyuck;Mun, Jihyeob
    • KIPS Transactions on Computer and Communication Systems / v.8 no.10 / pp.231-238 / 2019
  • Analyzing next-generation genome sequencing data in the conventional way on a single server may take several tens of hours, depending on the data size. To cope with emergency situations in which results are needed within a few hours, the performance of a single genome analysis must be improved. In this paper, we propose a parallelized method for pre-processing genome sequence data that reduces analysis time by using big data technology and a high-performance computing cluster that is connected by a high-speed network and shares a parallel file system. To preserve the reliability of the analytical data, we chose a strategy of parallelizing the existing analytical tools and algorithms in the new environment. We developed techniques for parallelized processing, data distribution, and parallel merging, and confirmed the performance improvements through experiments.
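
The distribute, process, and merge pattern mentioned in this abstract can be sketched in miniature as follows. This is only a single-machine illustration under assumed details: the record layout, the per-chunk step (sorting by position), and the use of threads are all hypothetical, and the authors' cluster-based tools are not reproduced here.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor


def distribute(records, n_chunks):
    """Split records round-robin into n_chunks lists."""
    return [records[i::n_chunks] for i in range(n_chunks)]


def preprocess(chunk):
    """Hypothetical per-chunk step: sort (position, read_id) records."""
    return sorted(chunk)


def parallel_preprocess(records, n_chunks=4):
    """Distribute the records, process the chunks concurrently, then merge."""
    chunks = distribute(records, n_chunks)
    # Threads only illustrate the control flow; a real deployment would run
    # the chunks on separate processes or cluster nodes.
    with ThreadPoolExecutor(max_workers=n_chunks) as pool:
        sorted_chunks = list(pool.map(preprocess, chunks))
    # heapq.merge combines already-sorted inputs into one sorted stream.
    return list(heapq.merge(*sorted_chunks))


# Hypothetical aligned reads as (position, read_id) pairs.
reads = [(530, "r5"), (12, "r1"), (301, "r3"), (77, "r2"), (410, "r4")]
print(parallel_preprocess(reads, n_chunks=2))
```

Because each chunk is already sorted before the merge step, the final merge only has to interleave the chunks rather than sort the whole dataset again.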

Estimation of ship operational efficiency from AIS data using big data technology

  • Kim, Seong-Hoon;Roh, Myung-Il;Oh, Min-Jae;Park, Sung-Woo;Kim, In-Il
    • International Journal of Naval Architecture and Ocean Engineering / v.12 no.1 / pp.440-454 / 2020
  • To prevent pollution from ships, the Energy Efficiency Design Index (EEDI) is a mandatory guideline for all new ships. The Ship Energy Efficiency Management Plan (SEEMP) has also been applied by MARPOL to all existing ships. SEEMP provides the Energy Efficiency Operational Indicator (EEOI) for monitoring the operational efficiency of a ship. By monitoring the EEOI, the shipowner or operator can establish strategic plans, such as routing, hull cleaning, decommissioning, and new building. The key parameter in calculating the EEOI is Fuel Oil Consumption (FOC), which can be measured on board while a ship is operating. This means that only the shipowner or operator can calculate the EEOI of their own ships. If the EEOI could be calculated without the actual FOC, however, other stakeholders who do not have the measured FOC, such as the shipbuilding company and Class, could check how efficiently their ships are operating compared to other ships. In this study, we propose a method to estimate the EEOI without requiring the actual FOC. Publicly available Automatic Identification System (AIS) data, ship static data, and environment data are used to calculate the EEOI. Because the public data are large in volume, big data technologies, specifically Hadoop and Spark, are used. We verify the proposed method using actual data, and the results show that the proposed method can estimate the EEOI from public data without the actual FOC.
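
To make the role of FOC concrete, the sketch below applies the usual IMO form of the indicator: CO2 emitted (fuel consumed times a fuel-specific carbon factor) divided by transport work (cargo mass times distance sailed), summed over voyages. It shows only this final step, assuming an FOC value is already available, whether measured or estimated; the paper's estimation method is not reproduced. The voyage figures are hypothetical, and the carbon factor is supplied by the caller and should be checked against current IMO guidance.

```python
def eeoi(voyages, carbon_factors):
    """Return EEOI in tonnes of CO2 per (tonne of cargo x nautical mile).

    voyages: list of dicts with keys
        "fuel_tonnes"  - dict mapping fuel type to fuel consumed (tonnes)
        "cargo_tonnes" - cargo carried on the voyage (tonnes)
        "distance_nm"  - distance sailed (nautical miles)
    carbon_factors: dict mapping fuel type to tonnes of CO2 per tonne of fuel
    """
    co2 = 0.0
    transport_work = 0.0
    for voyage in voyages:
        for fuel, tonnes in voyage["fuel_tonnes"].items():
            co2 += tonnes * carbon_factors[fuel]
        transport_work += voyage["cargo_tonnes"] * voyage["distance_nm"]
    if transport_work == 0:
        raise ValueError("transport work is zero; EEOI is undefined")
    return co2 / transport_work


# Hypothetical voyages; the HFO factor is an assumed input, not a value
# taken from the abstract.
factors = {"HFO": 3.114}
voyages = [
    {"fuel_tonnes": {"HFO": 120.0}, "cargo_tonnes": 60000.0, "distance_nm": 1500.0},
    {"fuel_tonnes": {"HFO": 95.0}, "cargo_tonnes": 55000.0, "distance_nm": 1200.0},
]
print(eeoi(voyages, factors))
```

A lower value means less CO2 emitted per unit of transport work, which is why replacing the measured fuel consumption with an estimate is enough to compare ships.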

BDSS: Blockchain-based Data Sharing Scheme With Fine-grained Access Control And Permission Revocation In Medical Environment

  • Zhang, Lejun;Zou, Yanfei;Yousuf, Muhammad Hassam;Wang, Weizheng;Jin, Zilong;Su, Yansen;Kim, Seokhoon
    • KSII Transactions on Internet and Information Systems (TIIS) / v.16 no.5 / pp.1634-1652 / 2022
  • With the increasing need for data sharing in the age of big data, achieving data access control and implementing user permission revocation in a blockchain environment has become an urgent problem. To solve these problems, we propose a novel blockchain-based data sharing scheme (BDSS) with fine-grained access control and permission revocation, taking the medical environment as the application scenario. In this scheme, we separate the public part and the private part of the electronic medical record (EMR). We then use symmetric searchable encryption (SSE) to encrypt the two parts separately, and use attribute-based encryption (ABE) to separately encrypt the symmetric keys used in SSE. This provides finer-grained access control and lets patients share data with ease. In addition, we design a mechanism for granting and revoking EMR permissions in which the hospital verifies the attribute set through the blockchain to determine whether to grant or revoke access permission, so that ciphertext re-encryption and key updates are no longer necessary. Finally, security analysis, security proof, and performance evaluation demonstrate that the proposed scheme is safe and effective in practical applications.