• Title/Summary/Keyword: Public Data API

The Blog Ranking Algorithm Reflecting Trend Index (트렌드 지수를 반영한 블로그 랭킹 알고리즘)

  • Lee, Yong-Suk;Kim, Hyoung Joong
    • Journal of Digital Contents Society / v.18 no.3 / pp.551-558 / 2017
  • The growth of blogs has two aspects: providing diverse information and serving as a marketing channel. This study collected the rankings of blog posts on a large portal using its OpenAPI and investigated the features of highly ranked blogs through exploratory data analysis. The analysis found that the blogger's influence and the recency of the post were the most influential factors for a top rank. Because of this weakness in the evaluation algorithm, search results were concentrated on posts by power bloggers. In this study, we propose an algorithm that improves content reliability by adding information from a reliability DB verified by experts, and that applies ranking scores more fairly through a trend index reflecting diverse public interests. The improved algorithm makes it possible to provide more reliable information in search results for the relevant field and makes it harder to manipulate rankings with illegal applications that inflate visitor counts.

Implementation of preventing screen capture modules for privacy (개인 정보 보호를 위한 화면 캡쳐 방지 모듈 구현)

  • Kwak, Dong-uk;Yun, Dong-young;Lee, Jong-hyeok
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2012.10a / pp.787-790 / 2012
  • With the development of the information society and the spread of computers, interest in personal information has increased, and various attempts have been made to protect it as related policy and technology have developed. In this paper, we propose modules that protect important personal data, personal information, and company confidential information for agencies and departments that handle such data on computers. As a result, the modules can prevent the misuse or theft of another person's information within public agencies and private institutions. They can also prevent the exposure of critical data and personal information when information is exchanged with systems inside the institution.

Artificial Intelligence Algorithms, Model-Based Social Data Collection and Content Exploration (소셜데이터 분석 및 인공지능 알고리즘 기반 범죄 수사 기법 연구)

  • An, Dong-Uk;Leem, Choon Seong
    • The Journal of Bigdata / v.4 no.2 / pp.23-34 / 2019
  • Recently, crimes that exploit digital platforms have been increasing continuously: about 140,000 cases occurred in 2015 and about 150,000 in 2016. Traditional investigation techniques are therefore considered limited in handling these online crimes. The manual online searches and cognitive investigation methods widely used by investigators today are not sufficient to cope proactively with rapidly changing civil crimes. In addition, the fact that social media content is posted to unspecified users makes investigation more difficult. Among web content collection methods, this study suggests site-based collection and the Open API, considering the characteristics of the online media where infringement crimes occur. Because illegal content is published and deleted quickly, and new words and variants appear quickly and in many forms, manually registered dictionary-based morphological analysis struggles to recognize them in time. To solve this problem, we propose adding tokenization with WPM (Word Piece Model), a data preprocessing method, to the existing dictionary-based morphological analysis so that illegal content posted in online infringement crimes can be recognized and responded to quickly. In the data analysis, the optimal precision is verified with a vote-based ensemble method using classification models trained by supervised learning for the investigation of illegal content. This study applies a classification algorithm model centered on illegal multilevel business cases to recognize crimes against the public economy proactively, and presents an empirical study on dealing effectively with social data collection and content investigation.
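
The WPM tokenization step mentioned in this abstract can be illustrated with a minimal sketch. It shows the greedy longest-match-first splitting commonly used by WordPiece-style tokenizers at inference time, where `##` marks a piece that continues a word. The function name `wordpiece_tokenize` and the tiny `VOCAB` are hypothetical and chosen only for illustration; the abstract does not describe the paper's vocabulary, training procedure, or implementation.

```python
def wordpiece_tokenize(word, vocab, unk="[UNK]"):
    """Split one word into subword pieces by greedy longest-match-first lookup."""
    pieces = []
    start = 0
    while start < len(word):
        end = len(word)
        match = None
        # Try the longest remaining substring first, then shrink from the right.
        while start < end:
            candidate = word[start:end]
            if start > 0:
                candidate = "##" + candidate  # continuation piece
            if candidate in vocab:
                match = candidate
                break
            end -= 1
        if match is None:
            # No piece matches at this position: the whole word is unknown.
            return [unk]
        pieces.append(match)
        start = end
    return pieces


# Hypothetical vocabulary, for illustration only.
VOCAB = {"multi", "##level", "market", "##ing"}

print(wordpiece_tokenize("multilevel", VOCAB))
print(wordpiece_tokenize("marketing", VOCAB))
print(wordpiece_tokenize("xyz", VOCAB))
```

Because unseen words are broken into known subword pieces rather than looked up whole, a newly coined or altered term can still be partially recognized without a manual dictionary update, which is the motivation the abstract gives for the approach.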

A Study on the New Trends of EDI based Internet (인터넷을 기반으로 하는 EDI 신조류)

  • 조원길
    • The Journal of Information Technology / v.4 no.1 / pp.125-139 / 2001
  • EDI (Electronic Data Interchange) provides a collection of standard message formats and an element dictionary, giving businesses a simple way to exchange data via any electronic messaging service. Open-edi is electronic data interchange among autonomous parties using public standards and aiming towards interoperability over time, business sectors, information technology and data types. The number of Internet services using XML/EDI has grown rapidly because it is easily extensible and exchangeable. To use such a service, the client does not have to install EDI software and needs only an Internet browser. Consequently, handling the trading process in an office has become much easier and faster. eXedi (eBusiness XML (Extensible Markup Language) Electronic Data Interchange) is a service that realizes B2B XML/EDI. eXedi can be used easily by small and medium-sized companies, and companies in any location can access it over an existing Internet connection. XML/EDI provides a standard framework for exchanging different types of data (for example, an invoice, a healthcare claim, or a project status) so that the information, be it in a transaction, exchanged via an Application Program Interface (API), web automation, database portal, catalog, workflow document, or message, can be searched, decoded, manipulated, and displayed consistently and correctly by first implementing EDI dictionaries and extending our vocabulary via on-line repositories to include our business language, rules and objects.

A Process Perspective Event-log Analysis Method for Airport BHS (Baggage Handling System) (공항 수하물 처리 시스템 이벤트 로그의 프로세스 관점 분석 방안 연구)

  • Park, Shin-nyum;Song, Minseok
    • The Journal of Bigdata / v.5 no.1 / pp.181-188 / 2020
  • As airport terminals grow in line with the rapid growth in air passengers, an advanced baggage handling system (BHS) that combines various data technologies has become essential for handling passengers' baggage swiftly and accurately. This study therefore introduces a method for analyzing the baggage handling capacity of domestic airports from the process perspective, applying the latest data analysis methodology to event log data in order to advance the operation of the airport BHS. By presenting an accurate load prediction method, it can lead to advanced BHS operation strategies in the future, such as the preemptive arrangement of resources and the optimization of flight-carousel scheduling. The data used in the analysis came from the APIs that can be found by searching for "Korea Airports Corporation" in the public data portal. Applying the method to a domestic airport BHS simulation model confirmed a high level of predictive performance.

Implementation of anti-screen capture modules for privacy protection (개인 정보 보호를 위한 화면 캡쳐 방지 모듈 구현)

  • Lee, Jong-Hyeok
    • Journal of the Korea Institute of Information and Communication Engineering / v.18 no.1 / pp.91-96 / 2014
  • With the spread of computers and the development of the information society, people have become increasingly concerned about private information. As related policy and technology have developed, various attempts have been made to protect personal information. In this paper, we propose anti-screen capture modules that protect personal information and company confidential information for agencies and departments that maintain top security. As a result, we can prevent the illegal use or theft of another person's information on public agency or personal computers. The modules can also stop the exposure of top-security data and personal information while users communicate with others through their institution's server system.

Building a Korean Sentiment Lexicon Using Collective Intelligence (집단지성을 이용한 한글 감성어 사전 구축)

  • An, Jungkook;Kim, Hee-Woong
    • Journal of Intelligence and Information Systems / v.21 no.2 / pp.49-67 / 2015
  • Recently, the emergence of big data and social media has brought about a big bang of data. Social networking services are widely used by people around the world and have become a major communication tool for all ages. Over the last decade, as online social networking sites have become increasingly popular, companies have tended to focus on advanced social media analysis for their marketing strategies. In addition to social media analysis, companies are mainly concerned about the propagation of negative opinions on social networking sites such as Facebook and Twitter, as well as on e-commerce sites. Online word of mouth (WOM), such as product ratings, product reviews, and product recommendations, is very influential, and negative opinions have a significant impact on product sales. This trend has drawn researchers' attention to natural language processing tasks such as sentiment analysis. Sentiment analysis, also referred to as opinion mining, is the process of identifying the polarity of subjective information and has been applied to various research and practical fields. However, the Korean language (Hangul) poses obstacles for natural language processing because it is an agglutinative language with rich morphology. As a result, Korean natural language processing resources such as a sentiment lexicon are lacking, which significantly limits researchers and practitioners who are considering sentiment analysis. Our study builds a Korean sentiment lexicon with collective intelligence and provides an API (Application Programming Interface) service to open and share the sentiment lexicon data with the public (www.openhangul.com). For pre-processing, we created a Korean lexicon database with over 517,178 words and classified them into sentiment and non-sentiment words.
To classify them, we first identified stop words, which are quite likely to play a negative role in sentiment analysis, and excluded them from our sentiment scoring. In general, sentiment words are nouns, adjectives, verbs, and adverbs, as they carry sentimental expressions such as positive, neutral, and negative. On the other hand, non-sentiment words are interjections, determiners, numerals, postpositions, etc., as they generally carry no sentimental expression. To build a reliable sentiment lexicon, we adopted the concept of collective intelligence as a model for crowdsourcing. In addition, the concept of folksonomy was implemented in the taxonomy process to support collective intelligence. To make up for an inherent weakness of folksonomy, we adopted majority rule by building a voting system. Participants, as voters, were offered three voting options (positivity, negativity, and neutrality), and the voting was conducted on one of the largest social networking sites for college students in Korea. More than 35,000 votes have been cast by college students in Korea, and we keep this voting system open by maintaining the project as a perpetual study. Moreover, any change in the sentiment score of a word can be an important observation because it enables us to keep track of temporal changes in Korean as a natural language. Lastly, our study offers a RESTful, JSON-based API service through a web platform to better support users such as researchers, companies, and developers. Our study makes important contributions to both research and practice. In terms of research, our Korean sentiment lexicon plays an important role as a resource for Korean natural language processing. In terms of practice, practitioners such as managers and marketers can implement sentiment analysis effectively by using the Korean sentiment lexicon we built.
Moreover, our study sheds new light on the value of folksonomy by combining it with collective intelligence, and we expect it to give a new direction and a new start to the development of Korean natural language processing.
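
The majority-rule voting described in this abstract can be sketched as follows. This is a minimal illustration, not the study's implementation: it assumes the label with the most votes wins and that ties are left unresolved, neither of which the abstract specifies. The function name `majority_label`, the `min_votes` parameter, and the sample words are hypothetical.

```python
from collections import Counter

LABELS = ("positive", "neutral", "negative")


def majority_label(votes, min_votes=1):
    """Return the label with the most votes, or None on a tie or too few votes."""
    counts = Counter(v for v in votes if v in LABELS)
    if sum(counts.values()) < min_votes:
        return None
    ranked = counts.most_common()
    # Assumption: a tie for first place leaves the word unlabeled.
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None
    return ranked[0][0]


# Hypothetical crowd votes per word, for illustration only.
votes_by_word = {
    "좋다": ["positive", "positive", "neutral"],
    "그냥": ["neutral", "positive", "negative", "neutral"],
    "애매": ["positive", "negative"],
}

lexicon = {word: majority_label(votes) for word, votes in votes_by_word.items()}
print(lexicon)
```

Recomputing the label whenever new votes arrive would allow changes in a word's sentiment to be tracked over time, in line with the abstract's remark about keeping the voting system open.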

A Study of the Transition Process in Presidential Electronic Records Transfer and Improvement Measures : Focused on the Electronic Records of the 19th President Moon Jae-in's Administration (대통령 전자기록물의 이관방식 변천과 개선방안 연구 19대 문재인 정부 대통령 전자기록물을 중심으로 )

  • Yun, Jeonghun
    • The Korean Journal of Archival Studies / no.75 / pp.41-89 / 2023
  • Since the enactment of the Act on the Management of Presidential Archives in 2007, the cases of electronic records transfer in the 16th President Roh Moo-hyun's administration have played the role of an advance guard in managing public records and served as a test bed for new electronic records management. When transferring the electronic records of the 19th President Moon Jae-in's administration, the electronic records transfer method of President Roh's administration was inherited, while several innovative attempts were made. For instance, the Presidential Archives have for the first time converted the electronic documents from institutions advising the President into a long-term preservation package and transferred them online. In addition, considering the characteristics of the data, the administrative information dataset of the Presidential record creation institutions was transferred to the SIARD standard. Furthermore, the Presidential Archives had websites transferred in the form of OVF as a pilot test and collected social media directly through the API. Thus this study investigated the transition process of the presidential electronic records transfers from the 16th President Roh Moo-hyun's administration to the 19th President Moon Jae-in's. In addition, major achievements and issues were analyzed centering on the transfer method by type of electronic records during President Moon Jae-in's administration, and future improvement plans were presented.

Safety Verification Techniques of Privacy Policy Using GPT (GPT를 활용한 개인정보 처리방침 안전성 검증 기법)

  • Hye-Yeon Shim;MinSeo Kweun;DaYoung Yoon;JiYoung Seo;Il-Gu Lee
    • Journal of the Korea Institute of Information Security & Cryptology / v.34 no.2 / pp.207-216 / 2024
  • As big data has been built up in the course of the 4th Industrial Revolution, personalized services have increased rapidly. As a result, the amount of personal information collected by online services has grown, along with concerns about leakage of users' personal information and privacy infringement. Online service providers publish privacy policies to address users' concerns about privacy infringement, but privacy policies are often misused because they are long and complex, which makes it difficult for users to identify risky items themselves. Therefore, a method is needed that can automatically check whether a privacy policy is safe. However, conventional blacklist-based and machine learning-based safety verification techniques for privacy policies are either difficult to extend or have low accessibility. To solve this problem, we propose a safety verification technique for privacy policies using the GPT-3.5 API, a generative artificial intelligence. Classification can be performed even in a new environment, and the technique shows that members of the general public without expertise could easily inspect a privacy policy. In the experiment, we measured how accurately the blacklist-based and GPT-based techniques classify safe and unsafe sentences, as well as the time spent on classification. According to the experimental results, the proposed technique showed 10.34% higher accuracy on average than the conventional blacklist-based sentence safety verification technique.

Development of Intelligent Job Classification System based on Job Posting on Job Sites (구인구직사이트의 구인정보 기반 지능형 직무분류체계의 구축)

  • Lee, Jung Seung
    • Journal of Intelligence and Information Systems / v.25 no.4 / pp.123-139 / 2019
  • The job classification system of major job sites differs from site to site and is different from the job classification system of the 'SQF (Sectoral Qualifications Framework)' proposed by the SW field. Therefore, a new job classification system is needed that SW companies, SW job seekers, and job sites can all understand. The purpose of this study is to establish a standard job classification system that reflects market demand by analyzing SQF based on job offer information from major job sites and the NCS (National Competency Standards). For this purpose, association analysis is conducted between the occupations of major job sites, and between SQF and those occupations, to derive association rules between occupations. Using these association rules, we propose an intelligent job classification system based on data that maps the job classification systems of major job sites to the SQF job classification system. First, major job sites are selected to obtain information on the job classification system of the SW market. Then we identify ways to collect job information from each site and collect data through open APIs. Focusing on the relationship between the data, we keep only the job information posted on multiple job sites at the same time and delete the other job information. Next, we map the job classification systems between job sites using the association rules derived from the association analysis. We complete the mapping between these market segments, discuss it with experts, further map the SQF, and finally propose a new job classification system. As a result, more than 30,000 job listings were collected in XML format using the open APIs of 'WORKNET,' 'JOBKOREA,' and 'saramin', which are the main job sites in Korea. After filtering down to about 900 job postings simultaneously posted on multiple job sites, 800 association rules were derived by applying the Apriori algorithm, a frequent pattern mining technique.
Based on the 800 association rules, the job classification systems of WORKNET, JOBKOREA, and saramin and the SQF job classification system were mapped and classified into first through fourth levels. In the new job taxonomy, the first primary class, covering jobs related to IT consulting, computer systems, networks, and security, consisted of three secondary classifications, five tertiary classifications, and five fourth-level classifications. The second primary class, covering jobs related to databases and system operation, consisted of three secondary classifications, three tertiary classifications, and four fourth-level classifications. The third primary class, covering Web Planning, Web Programming, Web Design, and Game, consisted of four secondary classifications, nine tertiary classifications, and two fourth-level classifications. The last primary class, covering jobs related to ICT management and computer and communication engineering technology, consisted of three secondary classifications and six tertiary classifications. In particular, the new job classification system has a relatively flexible depth of classification, unlike other existing classification systems. WORKNET divides jobs down to the third level. JOBKOREA divides jobs down to the second level and then subdivides them by keyword, and saramin does the same. The newly proposed standard job classification system accepts some keyword-based jobs and treats some product names as jobs. In this classification system, some jobs stop at the second classification, while others are subdivided down to the fourth. This reflects the idea that not all jobs can be broken down to the same depth. We also combined rules derived by association analysis of the collected market data with experts' opinions.
Therefore, the newly proposed job classification system can be regarded as a data-based intelligent job classification system that reflects market demand, unlike existing job classification systems. This study is meaningful in that it suggests a new job classification system that reflects market demand by mapping occupations based on data, through association analysis between occupations, rather than on the intuition of a few experts. However, this study has a limitation in that it cannot fully reflect market demand that changes over time, because the data were collected at a single point in time. As market demands change over time, including seasonal factors and the timing of major corporate public recruitment, continuous data monitoring and repeated experiments are needed to achieve more accurate matching. The results of this study can be used to suggest directions for improving SQF in the SW industry in the future, and the approach is expected to be transferred to other industries based on the successful experience in the SW industry.
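
The association analysis described in this abstract rests on two measures, support and confidence. The sketch below computes them for pairs of job categories that appear on the same posting. It is a simplified illustration and not the full Apriori algorithm, which grows frequent itemsets level by level with candidate pruning. The function name `pair_rules`, the thresholds, and the category labels are hypothetical; the abstract does not report the thresholds or data layout used in the study.

```python
from collections import Counter
from itertools import combinations


def pair_rules(transactions, min_support=0.5, min_confidence=0.6):
    """Derive rules lhs -> rhs between item pairs, filtered by support and confidence."""
    n = len(transactions)
    item_counts = Counter()
    pair_counts = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n  # share of postings containing both items
        if support < min_support:
            continue
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_counts[lhs]  # P(rhs | lhs)
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return sorted(rules)


# Hypothetical postings: each set holds the categories one posting received
# on different job sites (labels invented for illustration).
TRANSACTIONS = [
    {"worknet:web_dev", "saramin:web_programming"},
    {"worknet:web_dev", "saramin:web_programming"},
    {"worknet:web_dev", "saramin:web_design"},
    {"worknet:dba", "saramin:database"},
]

for lhs, rhs, support, confidence in pair_rules(TRANSACTIONS):
    print(f"{lhs} -> {rhs}  support={support:.2f}  confidence={confidence:.2f}")
```

A rule with high confidence in both directions suggests that two sites' categories describe the same job, which is the kind of evidence the abstract describes using to map one classification system onto another.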