• Title/Summary/Keyword: web data mining


Text Data Analysis Model Based on Web Application (웹 애플리케이션 기반의 텍스트 데이터 분석 모델)

  • Jin, Go-Whan
    • The Journal of the Korea Contents Association / v.21 no.11 / pp.785-792 / 2021
  • Since the Fourth Industrial Revolution, advances in technologies such as artificial intelligence and big data have brought various changes to society as a whole. The amount of data that can be collected in the process of applying these technologies tends to increase rapidly. In academia in particular, existing literature data is analyzed to grasp research trends. Such analysis organizes the flow of research, its methodologies, and its themes, and by identifying the subjects currently being discussed in academia it contributes substantially to setting the direction of future research. However, collecting and analyzing document data is difficult without programming expertise. This paper proposes a text mining-based topic modeling web application model. With the proposed model, even users who lack specialized knowledge of data analysis methods can perform tasks such as collecting, storing, and text-analyzing research papers, and researchers can analyze previous research and research trends. The model is expected to reduce the time and effort required for data analysis when trying to understand research trends.

The Development of Design Knowledge Management System Using Data Mining (Data Mining 기법을 활용한 디자인 지식경영 시스템 구축)

  • 양종열;오민권;최경은
    • Archives of design research / v.16 no.2 / pp.281-290 / 2003
  • In today's knowledge- and information-based age, it is fair to say that the competitiveness of each person, enterprise, and nation can be evaluated by how each of them manages and maintains the knowledge created from data and information. Since the importance and necessity of knowledge management were acknowledged, there have been studies on creating, applying, and evaluating design knowledge. Previous studies on this subject can be divided into three main categories (CRM, online statistical research, and eCRM) according to the materials used to create knowledge. These studies are meaningful in that they create knowledge in their respective fields, but they are somewhat inadequate because designers cannot create as much knowledge as can be applied in business; design-related consumers demand composite knowledge that integrates the characteristics of all three fields. In other words, they want to know ordinary customers' preferences in the existing off-line market from the CRM field, the results of statistical questionnaires on the various elements of design from the statistical research field, and the preference and consumption patterns of an unspecified number of people, regardless of time and place, from the eCRM field. This study proposes to solve the problem of web-based design knowledge maintenance through the combined application of CRM, statistical research, and eCRM. The information proposed in the solution can be expected to help designers working at design-related enterprises, as well as research institutes, to develop the knowledge necessary to design more consumer-oriented products.

Baseline Study to Develop a Consumer Information System (소비자정보시스템 구축을 위한 기반 연구)

  • Nam Su-Jung;Kim Kee-Ok
    • Journal of Families and Better Life / v.23 no.1 s.73 / pp.125-137 / 2005
  • Information technology is an important driving force that has changed consumer information environments. To adapt to these new environments, consumers need an innovative information system. The purpose of this study was to develop a Consumer Information System (CIS). The CIS is a tool that supports consumers' decision-making processes and raises consumer information competence. The CIS was constructed in the following steps: (1) organization of developers, (2) systematization of consumer information, (3) data loading, (4) integration of the consumer database: data warehouse, (5) data distribution, (6) composition of data marts, (7) use of data access tools: data mining, OLAP, statistical analysis, Q+R, (8) data visualization: web server.

EXTENDED ONLINE DIVISIVE AGGLOMERATIVE CLUSTERING

  • Musa, Ibrahim Musa Ishag;Lee, Dong-Gyu;Ryu, Keun-Ho
    • Proceedings of the KSRS Conference / 2008.10a / pp.406-409 / 2008
  • Clustering data streams is important in many applications, such as sensor networks. Existing hierarchical methods follow a semi-fuzzy clustering that yields duplicate clusters. To solve this problem, we propose an extended online divisive agglomerative clustering method for data streams. It builds a tree-like, top-down hierarchy of clusters that evolves with the data stream, using a geometric time frame for snapshots. It enhances Online Divisive Agglomerative Clustering (ODAC) with a pruning strategy that avoids duplicate clusters. Its main feature is that update time and memory usage are independent of the number of examples in the data stream. It can be used for clustering sensor data and for network monitoring, as well as for web click streams.

A Study of Perception of Golfwear Using Big Data Analysis (빅데이터를 활용한 골프웨어에 관한 인식 연구)

  • Lee, Areum;Lee, Jin Hwa
    • Fashion & Textile Research Journal / v.20 no.5 / pp.533-547 / 2018
  • The objective of this study is to examine the perception of golfwear and related trends, based on major keywords and associated words related to golfwear, using big data. For this study, data containing the keywords 'Golfwear' and 'Golf clothes' were collected from blogs, Jisikin and Tips, news articles, and web cafés on two of the most commonly used search engines (Naver and Daum). Frequency and matrix data were extracted through Textom for the period from January 1, 2016 to December 31, 2017. From the matrix created by Textom, degree centrality, closeness centrality, betweenness centrality, and eigenvector centrality were calculated and analyzed using Netminer 4.0. The analysis found that the keyword 'brand' ranked highest in web visibility, followed by 'woman', 'size', 'man', 'fashion', 'sports', 'price', 'store', 'discount', and 'equipment' in the top 10 frequency rankings. For the centrality calculations, only the top 30 keywords were included, because the density was extremely high due to the high frequency of co-occurring keywords. The centrality calculations showed that the top-ranked keywords were similar to those in the raw frequency data. When the frequency was adjusted by subtracting 100 and 500 words, the results differed: keywords that ranked low in the frequency analysis, such as J. Lindberg, ranked high, and the rankings changed for all centrality calculations. These findings provide a basis for marketing strategies and for ways to increase awareness and web visibility for golfwear brands.
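The degree centrality step mentioned in this abstract can be illustrated with a minimal sketch. The study itself used Textom and Netminer 4.0; the plain-Python code below is only a stand-in, and the function names and toy posts are hypothetical. It assumes the common normalized definition: the number of distinct co-occurring keywords divided by n - 1.

```python
from collections import defaultdict
from itertools import combinations


def cooccurrence(docs):
    """Count how often each unordered keyword pair appears in the same document."""
    counts = defaultdict(int)
    for words in docs:
        for a, b in combinations(sorted(set(words)), 2):
            counts[(a, b)] += 1
    return counts


def degree_centrality(counts):
    """Normalized degree centrality: distinct neighbours / (n - 1)."""
    neighbours = defaultdict(set)
    for a, b in counts:
        neighbours[a].add(b)
        neighbours[b].add(a)
    n = len(neighbours)
    if n < 2:
        return {node: 0.0 for node in neighbours}
    return {node: len(adj) / (n - 1) for node, adj in neighbours.items()}


# Hypothetical toy posts, already tokenized into keywords (not data from the study).
docs = [
    ["brand", "size", "price"],
    ["brand", "woman"],
    ["brand", "discount", "price"],
]
centrality = degree_centrality(cooccurrence(docs))
```

In this toy input 'brand' co-occurs with every other keyword, so it receives the maximum score; closeness, betweenness, and eigenvector centrality would need a graph library or further code.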

A Study on Visualization of Digital Preservation Knowledge Domain Using CiteSpace (CiteSpace 적용을 통한 디지털 보존 지식영역 비주얼화 연구)

  • Kim Hee-Jung
    • Journal of the Korean Society for Library and Information Science / v.39 no.4 / pp.89-104 / 2005
  • This article identifies an emerging research paradigm and monitors changes in the digital preservation area using CiteSpace, a Java application that supports visual exploration and knowledge discovery in bibliographic databases. Seventy-four articles in the digital preservation field, covering the period from 1990 to 2005, were extracted from Web of Science. According to the analysis, the core knowledge domains in digital preservation are technical preservation strategies, information networks and preservation systems, knowledge management, and electronic government.

A Research on User's Query Processing in Search Engine for Ocean using the Association Rules (연관 규칙 탐사 기법을 이용한 해양 전문 검색 엔진에서의 질의어 처리에 관한 연구)

  • 하창승;윤병수;류길수
    • Proceedings of the Korea Intelligent Information System Society Conference / 2002.11a / pp.266-272 / 2002
  • Recently, a wide variety of information suppliers provide information via the WWW, so the need for search engines is growing. However, the efficiency of most search engines is comparatively low because they use simple pattern matching between the user's query and web documents. The problem is even worse for queries in specialized expert fields. A specialized search engine returns specialized information according to each user's search goal, and such engines are being developed in many countries. For example, in America there are sites that search only recently updated headline news, federal law, government information, and so on. However, most such engines do not satisfy users' needs. This paper proposes a specialized search engine for ocean information that processes users' ocean-related queries using association rules from web data mining. The specialized search engine therefore provides more ocean-related information by raising recall for the user's query.
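One way association rules can raise recall is query expansion: terms that frequently co-occur with a query term are appended to the query. The abstract does not say how the authors apply the rules, so the sketch below is an assumption; the function names, thresholds, and toy query sessions are all hypothetical.

```python
from collections import Counter
from itertools import permutations


def mine_rules(sessions, min_support=0.3, min_confidence=0.6):
    """Return {(antecedent, consequent): confidence} for single-term rules."""
    n = len(sessions)
    term_count = Counter()
    pair_count = Counter()
    for terms in sessions:
        unique = set(terms)
        term_count.update(unique)
        # Ordered pairs, so a -> b and b -> a are evaluated separately.
        pair_count.update(permutations(sorted(unique), 2))
    rules = {}
    for (a, b), both in pair_count.items():
        support = both / n                  # share of sessions containing a and b
        confidence = both / term_count[a]   # estimate of P(b | a)
        if support >= min_support and confidence >= min_confidence:
            rules[(a, b)] = confidence
    return rules


def expand_query(query_terms, rules):
    """Append every consequent whose antecedent appears in the query."""
    expanded = list(query_terms)
    for a, b in sorted(rules):
        if a in query_terms and b not in expanded:
            expanded.append(b)
    return expanded


# Hypothetical past query sessions (illustrative only, not data from the paper).
sessions = [
    ["tide", "harbor"],
    ["tide", "harbor", "current"],
    ["tide", "current"],
    ["harbor", "ship"],
    ["tide", "harbor"],
]
rules = mine_rules(sessions)
expanded = expand_query(["tide"], rules)
```

Here 'tide' is expanded with 'harbor' because the rule tide -> harbor clears both thresholds, while tide -> current is dropped for low confidence.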

Classification of Web Data Using SASOM+DT for Web Usage Mining (웹 사용 마이닝을 위한 SASOM+DT를 이용한 웹 데이터의 분류)

  • 유시호;김경중;조성배
    • Proceedings of the Korean Information Science Society Conference / 2002.10d / pp.346-348 / 2002
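  • Web mining can be broadly classified into structure mining, content mining, and usage mining. Among these, usage mining extracts the sequence of web pages a user has navigated, or analyzes associations, based on the user's log data. Usage mining is emerging as an important part of web mining, particularly for meeting the demands of web-based applications. This paper addresses the problem of predicting future behavior by analyzing users' web page visit patterns, and proposes a method that classifies users' usage patterns with a DT (Decision Tree) ensemble of SASOM (Structure-Adaptive SOM) classifiers. When a set of SASOM classifiers was combined using a DT on the MS web data, the results were better than with a single classifier, with a low error rate of 3.5% or less.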

Data Analysis Web Application Based on Text Mining (텍스트 마이닝 기반의 데이터 분석 웹 애플리케이션)

  • Gil, Wan-Je;Kim, Jae-Woong;Park, Koo-Rack;Lee, Yun-Yeol
    • Proceedings of the Korean Society of Computer Information Conference / 2021.07a / pp.103-104 / 2021
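  • This paper proposes a text mining-based topic modeling web application model. The goal is to design and implement a web application that uses web crawling so that, when a keyword is entered, summarized paper information can be saved to a file, and research trends can be easily checked through keyword frequency analysis and topic modeling. With the proposed web application, even users who lack knowledge of programming languages and data analysis techniques can try collecting and storing papers and analyzing text. In addition, because the web system was developed with Python libraries rather than relying on conventional languages such as HTML, CSS, and JavaScript, it is expected that it can be adopted as a project-based course when teaching data analysis and machine learning with Python.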
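As an illustration of the keyword frequency analysis this abstract mentions, here is a minimal sketch using only the Python standard library. It is not the authors' implementation: the stopword list and function name are assumptions, and the input is three titles taken from this listing.

```python
import re
from collections import Counter

# Small illustrative stopword list (an assumption, not from the paper).
STOPWORDS = {"a", "an", "and", "based", "for", "in", "of", "on", "the", "to", "using"}


def keyword_frequencies(texts, top_n=5):
    """Lowercase, tokenize on letters, drop stopwords and short tokens, then count."""
    counts = Counter()
    for text in texts:
        tokens = re.findall(r"[a-z]+", text.lower())
        counts.update(t for t in tokens if t not in STOPWORDS and len(t) > 2)
    return counts.most_common(top_n)


# Titles taken from this listing, used as a toy corpus.
titles = [
    "Text Data Analysis Model Based on Web Application",
    "Data Analysis Web Application Based on Text Mining",
    "Classification of Web Data Using SASOM+DT for Web Usage Mining",
]
top = keyword_frequencies(titles, top_n=2)
```

Topic modeling would need a further step, such as fitting an LDA model to the token counts, which is beyond this sketch.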

A Study on Big Data-Based Analysis of Risk Factors for Depression in Adolescents

  • Chun-Ok Jang
    • International Journal of Advanced Culture Technology / v.11 no.4 / pp.449-455 / 2023
  • The purpose of this study is to explore adolescent depression, increase understanding of it as a social problem, and develop prevention and intervention strategies. As a research method, social big data was used to collect information related to 'youth depression', and related factors were identified through data mining and association rule analysis. We used 'Sometrend Biz Tool' to collect and clean data from the web and then analyzed data in various languages. The study found that online articles about depression decreased during the school holidays (January to March), increased from March to the end of June, and decreased again from July. It is therefore important to establish a government-wide depression management monitoring system that can detect risk signs of adolescent depression in real time. In addition, regular stress relief and mental health education are needed during the semester, and measures must be prepared for at-risk youth who share their depressed feelings in cyberspace. The results of this study can be expected to provide important information for investigating and preventing youth depression and to contribute to policy development and intervention.