• Title/Summary/Keyword: data consistency

Search results: 699

Generating Sponsored Blog Texts through Fine-Tuning of Korean LLMs (한국어 언어모델 파인튜닝을 통한 협찬 블로그 텍스트 생성)

  • Bo Kyeong Kim;Jae Yeon Byun;Kyung-Ae Cha
    • Journal of Korea Society of Industrial Information Systems / v.29 no.3 / pp.1-12 / 2024
  • In this paper, we fine-tuned KoAlpaca, a large-scale Korean language model, and implemented a blog text generation system based on it. Blogs on social media platforms are widely used as a marketing tool for businesses. We constructed training data of positive reviews by applying sentiment analysis and refinement to collected sponsored blog texts, and applied QLoRA for lightweight training of KoAlpaca. QLoRA is a fine-tuning approach that significantly reduces the memory required for training; in experiments with a 12.8B-parameter model, it used up to 58.8% less memory than LoRA. To evaluate generative performance, we generated texts from 100 inputs not included in the training data: on average, the fine-tuned model produced more than twice as many words as the pre-trained model, and texts with positive sentiment appeared more than twice as often. In a survey conducted for qualitative evaluation, respondents judged the fine-tuned model's outputs to be more relevant to the given topics 77.5% of the time on average. This suggests that the positive-review generation model for sponsored content presented in this paper can make content creation more time-efficient and help ensure consistent marketing effects. However, to reduce the generation of content that falls outside the category of positive reviews because of the pre-trained model, we plan further fine-tuning with augmented training data.
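
As a rough, back-of-the-envelope illustration of why 4-bit quantization lowers memory, the sketch below estimates weight storage only for a 12.8B-parameter model. The bit widths (16-bit base weights for LoRA, 4-bit for QLoRA) are assumptions made for illustration. The estimate ignores adapters, activations, optimizer state, and quantization constants, so it is not comparable to the 58.8% total-memory reduction reported in the abstract.

```python
def weight_memory_gb(num_params: float, bits_per_param: float) -> float:
    """Approximate memory for model weights alone, in gigabytes (10^9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


NUM_PARAMS = 12.8e9  # parameter count quoted in the abstract

# Assumption: base weights held in 16-bit for LoRA and 4-bit for QLoRA.
fp16_gb = weight_memory_gb(NUM_PARAMS, 16)
int4_gb = weight_memory_gb(NUM_PARAMS, 4)
weight_only_saving = 1 - int4_gb / fp16_gb

print(f"16-bit weights: {fp16_gb:.1f} GB")
print(f"4-bit weights:  {int4_gb:.1f} GB")
print(f"weight-only saving: {weight_only_saving:.0%}")
```

The gap between this weight-only figure and a measured end-to-end figure is expected, since training memory also includes components that quantizing the base weights does not shrink.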

Quality Control Scheme of GIS-based Bus Network for Stabilization of BIS - Focusing on Real-Time Public Transportation Information (BIS 안정화를 위한 버스기반정보 GIS DB 품질 관리 방안 - 실시간 환승교통 종합정보 시스템을 사례로)

  • Ju, Yong-Jin;Ham, Chang-Hak
    • Journal of Korean Society for Geospatial Information Science / v.20 no.1 / pp.33-41 / 2012
  • A bus information system (BIS) is an arrival guidance system that tracks bus locations in real time and provides passengers with bus service status through kiosks at bus stops, the internet, and mobile services. To maintain navigational information and operate a reliable BIS, it is important to improve the quality of traffic information through quality control of the GIS-based bus network. This study therefore aims to establish criteria for quantitatively evaluating data quality at each stage of bus network data production and to propose quality control guidelines. To this end, we categorized the geometric and logical errors that occur while building a bus network database, using TAGO as a case study, and set up stage-by-stage guidelines and inspection procedures for systematic and consistent quality control. The results support quality assurance for an objective and reliable bus network database and are expected to yield more accurate public transportation information and a more reliable BIS by preventing a variety of system operation errors in advance.

Unveiling the Potential: Exploring NIRv Peak as an Accurate Estimator of Crop Yield at the County Level (군·시도 수준에서의 작물 수확량 추정: 옥수수와 콩에 대한 근적외선 반사율 지수(NIRv) 최댓값의 잠재력 해석)

  • Daewon Kim;Ryoungseob Kwon
    • Korean Journal of Agricultural and Forest Meteorology / v.25 no.3 / pp.182-196 / 2023
  • Accurate and timely estimation of crop yields is crucial for various purposes, including global food security planning and agricultural policy development. Remote sensing techniques, particularly using vegetation indices (VIs), have shown promise in monitoring and predicting crop conditions. However, traditional VIs such as the normalized difference vegetation index (NDVI) and enhanced vegetation index (EVI) have limitations in capturing rapid changes in vegetation photosynthesis and may not accurately represent crop productivity. An alternative vegetation index, the near-infrared reflectance of vegetation (NIRv), has been proposed as a better predictor of crop yield due to its strong correlation with gross primary productivity (GPP) and its ability to untangle confounding effects in canopies. In this study, we investigated the potential of NIRv in estimating crop yield, specifically for corn and soybean crops in major crop-producing regions in 14 states of the United States. Our results demonstrated a significant correlation between the peak value of NIRv and crop yield/area for both corn and soybean. The correlation was slightly stronger for soybean than for corn. Moreover, most of the target states exhibited a notable relationship between NIRv peak and yield, with consistent slopes across different states. Furthermore, we observed a distinct pattern in the yearly data, where most values were closely clustered together. However, the year 2012 stood out as an outlier in several states, suggesting unique crop conditions during that period. Based on the established relationships between NIRv peak and yield, we predicted crop yield data for 2022 and evaluated the accuracy of the predictions using the Root Mean Square Percentage Error (RMSPE). Our findings indicate the potential of NIRv peak in estimating crop yield at the county level, with varying accuracy across different counties.
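
The quantities named in this abstract can be sketched in a few lines. The example below assumes the common definition NIRv = NDVI × NIR, an ordinary least-squares line relating peak NIRv to yield, and RMSPE expressed in percent. The function names and all numbers are hypothetical and for illustration only; they are not data or code from the study.

```python
import math


def nirv(nir: float, red: float) -> float:
    """NIRv = NDVI * NIR, with NDVI = (NIR - Red) / (NIR + Red)."""
    ndvi = (nir - red) / (nir + red)
    return ndvi * nir


def fit_line(x, y):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def rmspe(observed, predicted) -> float:
    """Root mean square percentage error, in percent."""
    terms = [((p - o) / o) ** 2 for o, p in zip(observed, predicted)]
    return 100 * math.sqrt(sum(terms) / len(terms))


# Made-up values for illustration only (not data from the study).
peak_nirv = [0.20, 0.25, 0.30, 0.35]
yield_obs = [8.0, 10.0, 12.0, 14.0]  # hypothetical yield per unit area

slope, intercept = fit_line(peak_nirv, yield_obs)
yield_pred = [slope * v + intercept for v in peak_nirv]
print(f"slope={slope:.2f}, intercept={intercept:.2f}")
print(f"RMSPE={rmspe(yield_obs, yield_pred):.2f}%")
```

Because RMSPE divides each error by the observed value, it lets prediction accuracy be compared across counties with very different yield levels.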

A Study on Promotion Strategies for Examining Platforms of Convergence Contents (방송.통신 융합 환경에 적합한 다중 플랫폼 융합 콘텐츠 육성 전략)

  • Park, Soo-Ile;Shin, Dong-Pil;Chun, Sang-Kwon
    • Proceedings of the Korean Society of Computer Information Conference / 2009.01a / pp.197-202 / 2009
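  • Changes in social and cultural trends driven by advances in science and technology offer new opportunities and possibilities. Information and communication technology is breaking down the boundaries between domains such as telecommunications and broadcasting, and telecommunications and content, making convergence possible, and it stimulates our sensibility and imagination to open up new cultural possibilities. Under the name of broadcasting-telecommunications convergence, various efforts are under way in every area, including broadcasting and telecommunications, TV and PC, and online and offline. Because the convergence of broadcasting and telecommunications can create new products and new markets, much like the pioneering of a new continent in history, businesses in Korea and around the world are racing toward this land of opportunity. The accompanying convergence of content is also remarkable: beyond convergence between content types such as games and films or documentaries and dramas, films are now being produced on mobile devices, games are being combined with social networks, and even distribution is converging, with music being distributed inside games. This spread of convergence is producing not only new media and platforms but also a media landscape in which platforms can cross, connect, and integrate, and the combination of the internet and TV links transmission mechanisms capable of delivering diverse applications, giving rise to many forms of multi-platform. As a result, an integrated platform environment is being widely established in which broadcasting and internet services are used regardless of network or transmission platform and regardless of the choice of device. Multi-platform convergence content suited to the broadcasting-telecommunications convergence environment must therefore satisfy user needs and the demand for new business models, and interfaces that maintain compatibility across communications and services with consistent technology must be standardized. Such content, linked with multimedia services over high-speed data networks and services that use IP multicast, has a very large ripple effect on related material industries and a substantial impact on related fields; this paper therefore examines appropriate strategies for promoting it.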

A Qualitative Study on the Library Services Policy for the Disabled Person in European Countries (유럽국가의 장애인 도서관서비스 정책에 관한 질적 연구)

  • Lee, Jung-Yeoun
    • Journal of the Korean Society for Information Management / v.27 no.3 / pp.147-168 / 2010
  • The purpose of this study is to present suggestions for Korean libraries for the disabled, based on collecting and analyzing qualitative data on the library policies of national libraries, public libraries, and libraries for the disabled in European countries including Sweden, the UK, and France. The Korean national library support center for the disabled should be independent so that it can set consistent policy and has the ability to carry it out. It should also support private libraries for the disabled so that their professionalism and accumulated history are carried forward. Developing alternative materials, an integrated catalogue, and professional services requires cooperative and systematic arrangements among national and public libraries for the disabled, including school and university libraries. Growth should also rest not only on internal cooperation among libraries but also on the help of external bodies such as social and legal systems, organizations for the disabled, and regional self-governing bodies.

A Data Mining Tool for Massive Trajectory Data (대규모 궤적 데이타를 위한 데이타 마이닝 툴)

  • Lee, Jae-Gil
    • Journal of KIISE:Computing Practices and Letters / v.15 no.3 / pp.145-153 / 2009
  • Trajectory data are ubiquitous in the real world. Recent progress in satellite, sensor, RFID, video, and wireless technologies has made it possible to systematically track object movements and collect huge amounts of trajectory data. Accordingly, there is an ever-increasing interest in performing data analysis over trajectory data. In this paper, we develop a data mining tool for massive trajectory data. This mining tool supports the three most widely used operations: clustering, classification, and outlier detection. Trajectory clustering discovers common movement patterns, trajectory classification predicts the class labels of moving objects based on their trajectories, and trajectory outlier detection finds trajectories that are grossly different from or inconsistent with the remaining set of trajectories. The primary advantage of the mining tool is that it exploits information from partial trajectories during the mining process. The effectiveness of the mining tool is shown using various real trajectory data sets. We believe that we have provided practical software for trajectory data mining which can be used in many real applications.
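
To make the idea of trajectory outlier detection concrete, here is a deliberately simplified sketch. It is not the tool's algorithm: it compares whole, equal-length trajectories by mean point-wise distance and flags those far from the rest, whereas the abstract describes methods that work on partial trajectories. The function names, the threshold `factor`, and the toy data are all hypothetical.

```python
import math


def traj_distance(a, b) -> float:
    """Mean point-wise Euclidean distance between two equal-length trajectories."""
    return sum(math.dist(p, q) for p, q in zip(a, b)) / len(a)


def find_outliers(trajectories, factor: float = 2.0):
    """Return indices whose mean distance to the others exceeds factor * median score."""
    n = len(trajectories)
    scores = []
    for i in range(n):
        others = [traj_distance(trajectories[i], trajectories[j])
                  for j in range(n) if j != i]
        scores.append(sum(others) / len(others))
    # Upper median of the scores (for even n, the higher of the two middle values).
    median = sorted(scores)[n // 2]
    return [i for i, s in enumerate(scores) if s > factor * median]


# Toy trajectories (hypothetical): three run near y = 0, one runs along y = 10.
trajs = [
    [(0, 0.0), (1, 0.0), (2, 0.0)],
    [(0, 0.1), (1, 0.1), (2, 0.1)],
    [(0, -0.1), (1, -0.1), (2, -0.1)],
    [(0, 10.0), (1, 10.0), (2, 10.0)],
]
print(find_outliers(trajs))
```

A whole-trajectory comparison like this can miss a trajectory that deviates only along a short stretch, which is one motivation for working with partial trajectories instead.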

Real-time Implementation of a Multi-channel G.729A Speech Coder on a 16 Bit Fixed-point DSP (16 비트 고정 소수점 DSP를 이용한 다채널 G.729A음성 부호화기의 실시간 구현)

  • 안도건;유승균;최용수;이재성;강태익;박성현
    • The Journal of the Acoustical Society of Korea / v.19 no.4 / pp.45-51 / 2000
  • This paper describes a real-time implementation of a multi-channel G.729A speech coder on a 16-bit fixed-point Digital Signal Processor (DSP) and its application to a Voice Mailing Service (VMS) system. The Texas Instruments TMS320C549 was used as the fixed-point DSP chip, and a 4-channel G.729A coder was implemented on it. The implemented coder required 14.5 MIPS for the encoder and 3.6 MIPS for the decoder per channel. In addition, the coder required 9.88K words of memory for the code section and 1.69K words for the data section. As a result, the developed VMS system, which accommodates two DSP chips, was able to support a total of 8 channels. Experimental results showed that our multi-channel coder passes all of the test vectors provided by ITU-T.
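
The resource figures in this abstract combine with simple arithmetic, sketched below. All input numbers come from the abstract; the variable names are illustrative. The abstract does not state the chip's rated MIPS, so no headroom comparison is attempted.

```python
ENCODER_MIPS = 14.5      # per channel, from the abstract
DECODER_MIPS = 3.6       # per channel, from the abstract
CHANNELS_PER_CHIP = 4    # 4-channel coder on one TMS320C549
NUM_CHIPS = 2            # chips in the VMS system

CODE_KWORDS = 9.88       # code section, K words
DATA_KWORDS = 1.69       # data section, K words

mips_per_channel = ENCODER_MIPS + DECODER_MIPS
mips_per_chip = mips_per_channel * CHANNELS_PER_CHIP
total_channels = CHANNELS_PER_CHIP * NUM_CHIPS
memory_kwords = CODE_KWORDS + DATA_KWORDS

print(f"per channel: {mips_per_channel:.1f} MIPS")
print(f"per chip ({CHANNELS_PER_CHIP} channels): {mips_per_chip:.1f} MIPS")
print(f"system channels: {total_channels}")
print(f"code + data memory: {memory_kwords:.2f}K words")
```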

Design and Implementation of an E-Catalog System for the Efficiency of Electronic Commerce (전자상거래 효율성을 증가시키기 위한 E-Catalog 시스템 설계 및 구현)

  • Choi, Ok-Kyung;Han, Sang-Yong
    • The KIPS Transactions:PartD / v.10D no.1 / pp.167-174 / 2003
  • Today in Korea, various types of B2B and B2C business are carried out on the Internet, and catalog information is the most important factor in persuading customers to purchase a product. However, no case can be found in which information is shared between business partners; more specifically, each catalog supplier holds data that are incompatible with those of others. Although the e-business market has expanded rapidly, it remains difficult for businesses to attract buyers unless an integrated system is provided for faster and more convenient B2B business. Such a systematic and integrated catalog system is in high demand, along with current database management systems. This study therefore proposes an E-Catalog system consisting of a fixed, standardized catalog system that offers product information and a network-based architecture that offers products to customers through a search system. The proposed system also supports CRM (Customer Relationship Management).

A Surface Reconstruction Method from Contours Based on Dividing Virtual Belt (가상벨트 분할에 기반한 등고선으로부터의 표면재구성 방법)

  • Choi, Young-Kyu;Lee, Seung-Ha
    • The KIPS Transactions:PartB / v.14B no.6 / pp.413-422 / 2007
  • This paper presents a new technique for constructing a surface model from a set of wire-frame contours. The most difficult problem in this task, called contour triangulation, arises when the surface has many branches, which cause many ambiguities in the surface definition process. In this paper, the branching problem is reduced to surface reconstruction from a set of virtual belts and virtual canyons. To tile the virtual belts, a tiling technique based on a divide-and-conquer strategy, called the BPA algorithm, is adopted. The virtual canyons are covered naturally by an iterative convex removal algorithm, with a center vertex added for each branching surface. Compared with most previous work, which reduces the multiple branching problem to a set of tiling problems between contours, our method handles the problem more easily by transforming it into simpler topologies, the virtual belt and the virtual canyon. Furthermore, the proposed method does not involve any complicated set of criteria and provides a simple and robust algorithm for surface triangulation. The results show that our method works well even when the object has many complicated branches.

An Application of Support Vector Machines to Personal Credit Scoring: Focusing on Financial Institutions in China (Support Vector Machines을 이용한 개인신용평가 : 중국 금융기관을 중심으로)

  • Ding, Xuan-Ze;Lee, Young-Chan
    • Journal of Industrial Convergence / v.16 no.4 / pp.33-46 / 2018
  • Personal credit scoring is an effective tool that helps banks make profitable decisions on granting loans. Recently, many classification algorithms and models have been used in personal credit scoring. Personal credit scoring techniques are usually divided into statistical and non-statistical methods. Statistical methods include linear regression, discriminant analysis, logistic regression, and decision trees. Non-statistical methods include linear programming, neural networks, genetic algorithms, and support vector machines. However, no consistent conclusion has been drawn about which method is best for developing a credit scoring model. In this paper, we compare the performance of the most common scoring techniques, namely logistic regression, neural networks, and support vector machines, using personal credit data from a financial institution in China. Specifically, we build the three models, classify the customers, and compare the results. According to the results, the support vector machine performs better than logistic regression and neural networks.