• Title/Summary/Keyword: web database

Topic Model Augmentation and Extension Method using LDA and BERTopic (LDA와 BERTopic을 이용한 토픽모델링의 증강과 확장 기법 연구)

  • Kim, SeonWook;Yang, Kiduk
    • Journal of the Korean Society for information Management
    • /
    • v.39 no.3
    • /
    • pp.99-132
    • /
    • 2022
  • The purpose of this study is to propose AET (Augmented and Extended Topics), a novel method of synthesizing both LDA and BERTopic results, and to analyze the recently published LIS articles as an experimental approach. To achieve the purpose of this study, 55,442 abstracts from 85 LIS journals within the WoS database, which spans from January 2001 to October 2021, were analyzed. AET first constructs a WORD2VEC-based cosine similarity matrix between LDA and BERTopic results, extracts AT (Augmented Topics) by repeating the matrix reordering and segmentation procedures as long as their semantic relations are still valid, and finally determines ET (Extended Topics) by removing any LDA related residual subtopics from the matrix and ordering the rest of them by F1 (BERTopic topic size rank, Inverse cosine similarity rank). AET, by comparing with the baseline LDA result, shows that AT has effectively concretized the original LDA topic model and ET has discovered new meaningful topics that LDA didn't. When it comes to the qualitative performance evaluation, AT performs better than LDA while ET shows similar performances except in a few cases.
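
The first AET step described above, building a cosine similarity matrix between LDA and BERTopic topics, can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes each topic has already been reduced to a single vector (for example, by averaging the WORD2VEC embeddings of its top words), and the function names and toy vectors are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def similarity_matrix(lda_topics, bertopic_topics):
    """Rows are LDA topics, columns are BERTopic topics."""
    return [[cosine(l, b) for b in bertopic_topics] for l in lda_topics]

# Toy 3-dimensional topic vectors (hypothetical; real ones would be
# derived from word embeddings of each topic's top terms).
lda_topics = [[1.0, 0.0, 0.0], [0.0, 1.0, 1.0]]
bertopic_topics = [[2.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]

matrix = similarity_matrix(lda_topics, bertopic_topics)
for row in matrix:
    print([round(x, 3) for x in row])
```

The reordering and segmentation steps that extract AT and ET would operate on this matrix; the abstract does not specify them in enough detail to sketch here.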

Suitable clothing recommendation system by size and skin color (의류 사이즈별 및 피부톤에 기반을 둔 의류 추천 시스템)

  • Park, Chang-Young;Lim, Byeong-Chan;Lee, Won-Joon;Lee, Chang-Su;Kim, Min-Su;Lee, Sang-Yong
    • Journal of Digital Convergence
    • /
    • v.20 no.3
    • /
    • pp.407-413
    • /
    • 2022
  • Existing clothing recommendation systems go no further than showing suitable photos once a user enters his or her body size and selects a preferred type of clothing. Clothing purchased through such systems often does not suit the user or does not fit the user's body size. In this study, to solve these problems, we implemented a system that takes the user's skin tone as well as size as input and recommends clothing suited to both. In this system, clothing size information for eight types of men's tops, obtained through web crawling, is periodically stored in a database, and every pixel of each clothing image is analyzed to extract color values as text. To evaluate the system, a survey of 100 male college students was conducted, and the satisfaction level was 70%. Most dissatisfied respondents cited the limited range of recommended clothing, so the range of target clothing needs to be expanded in the future.
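
One way to turn the pixels of a clothing image into a color text value, as the abstract describes, is to average the pixels and pick the nearest named color. The sketch below is an assumption about how such a step could work, not the system's actual method; the palette, function names, and sample pixels are hypothetical, and pixels are assumed to be already decoded into (R, G, B) tuples.

```python
# Hypothetical reference palette mapping color names to RGB values.
PALETTE = {
    "black": (0, 0, 0),
    "white": (255, 255, 255),
    "red": (255, 0, 0),
    "navy": (0, 0, 128),
    "beige": (245, 245, 220),
}

def mean_rgb(pixels):
    """Average color over all pixels of an image."""
    if not pixels:
        raise ValueError("image has no pixels")
    n = len(pixels)
    return tuple(sum(p[i] for p in pixels) / n for i in range(3))

def nearest_color_name(rgb, palette=PALETTE):
    """Name of the palette color with the smallest squared RGB distance."""
    return min(
        palette,
        key=lambda name: sum((a - b) ** 2 for a, b in zip(rgb, palette[name])),
    )

# Three sample pixels from a dark blue garment (made-up values).
pixels = [(10, 10, 120), (0, 5, 130), (5, 0, 125)]
label = nearest_color_name(mean_rgb(pixels))
print(label)
```

Averaging is the simplest choice; a production system would more likely mask out the background and use a dominant-color method, since a plain mean blurs multicolored garments.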

The Philippines Coconut Genomics Initiatives: Updates and Opportunities for Capacity Building and Genomics Research Collaboration

  • Hayde Flandez-Galvez;Darlon V. Lantican;Anand Noel C. Manohar;Maria Luz J. Sison;Roanne R. Gardoce;Barbara L. Caoili;Alma O. Canama-Salinas;Melvin P. Dancel;Romnick A. Latina;Cris Q. Cortaga;Don Serville R. Reynoso;Michelle S. Guerrero;Susan M. Rivera;Ernesto E. Emmanuel;Cristeta Cueto;Consorcia E. Reano;Ramon L. Rivera;Don Emanuel M. Cardona;Edward Cedrick J. Fernandez ;Robert Patrick M. Cabangbang;Maria Salve C. Vasquez;Jomari C. Domingo;Reina Esther S. Caro;Alissa Carol M. Ibarra;Frenzee Kroeizha L. Pammit;Jen Daine L. Nocum;Angelica Kate G. Gumpal;Jesmar Cagayan;Ronilo M. Bajaro;Joseph P. Lagman;Cynthia R. Gulay;Noe Fernandez-Pozo;Susan R. Strickler;Lukas A. Mueller
    • Proceedings of the Korean Society of Crop Science Conference
    • /
    • 2022.10a
    • /
    • pp.30-30
    • /
    • 2022
  • The Philippines is the world's second-largest supplier of coconut by-products. As its first major genomics project, the Philippine Genome Center Program for Agriculture (PGC-Agriculture) took on the challenge of sequencing and assembling the whole coconut genome. The project aims to provide advanced genetics tools for collaborating coconut researchers while taking the opportunity to build local capacity. A combination of different NGS platforms was explored, and the Philippine 'Catigan Green Dwarf' (CATD) variety was selected with breeders as the crop's reference genome. A high-quality genome assembly of CATD was generated and used to characterize important coconut genes toward the development of resilient and outstanding varieties, especially for added high-value traits. The talk will present the significant results of the project as published in various papers, including the first report of a whole-genome sequence of a dwarf coconut variety. Updates will include the challenges overcome and specific applications such as gene mining for host insect resistance and screening for the least damaged coconuts (and thus potentially insect-resistant varieties). Published genome-wide DNA markers and genes related to qualitative and quantitative coconut oil traits will also be presented, along with initial molecular and biochemical studies that support nutritional and medicinal claims. A web-based genome database is currently being built for easy access to, and wider use of, these genomics tools. This is a major milestone for the coconut genomics research team, made possible by all-out government support, strong collaboration among multidisciplinary experts, and partnership with advanced research institutes.

Implementation of IoT-Based Irrigation Valve for Rice Cultivation (벼 재배용 사물인터넷 기반 물꼬 구현)

  • Byeonghan Lee;Deok-Gyeong Seong;Young Min Jin;Yeon-Hyeon Hwang;Young-Gwang Kim
    • Journal of Internet of Things and Convergence
    • /
    • v.9 no.6
    • /
    • pp.93-98
    • /
    • 2023
  • In paddy rice farming, water management is a critical task. To suppress weed emergence during the early stages of growth, fields are deeply flooded, and after transplantation, the water level is reduced to promote rooting and stimulate stem generation. Later, water is drained to prevent the production of sterile tillers. The adequacy of water supply is influenced by various factors such as field location, irrigation channels, soil conditions, and weather, requiring farmers to frequently check water levels and control the ingress and egress of water. This effort increases if the fields are scattered in remote locations. Automated irrigation systems have been considered to reduce labor and improve productivity. However, the net income from rice production in 2022 was about KRW 320,000/10a on average, making it financially unfeasible to implement high-cost devices or construct new infrastructure. This study focused on developing an IoT-Based irrigation valve that can be easily integrated into existing agricultural infrastructure without additional construction. The research was carried out in three main areas: Firstly, an irrigation valve was designed for quick and easy installation on existing agricultural pipes. Secondly, a power circuit was developed to connect a low-power Cat M1 communication modem with an Arduino Nano board for remote operation. Thirdly, a cloud-based platform was used to set up a server and database environment and create a web interface that users can easily access.

Implementation of Reporting Tool Supporting OLAP and Data Mining Analysis Using XMLA (XMLA를 사용한 OLAP과 데이타 마이닝 분석이 가능한 리포팅 툴의 구현)

  • Choe, Jee-Woong;Kim, Myung-Ho
    • Journal of KIISE:Computing Practices and Letters
    • /
    • v.15 no.3
    • /
    • pp.154-166
    • /
    • 2009
  • Database query and reporting tools, OLAP tools, and data mining tools are typical front-end tools in a Business Intelligence (BI) environment, which supports gathering, consolidating, and analyzing data produced by business operations and gives enterprise users access to the results. Traditional reporting tools can create sophisticated dynamic reports that embed SQL query result sets, look like word-processor documents, and can be published to the Web, but their data sources are limited to RDBMSs. OLAP tools and data mining tools, on the other hand, each provide powerful analysis functions of their own, but their built-in visualization of analysis results is limited to tables and a few chart types. This paper therefore presents a system that integrates the three typical front-end tools so that they complement one another in a BI environment. Traditional reporting tools have only a query editor for generating SQL statements that retrieve data from an RDBMS. The reporting tool presented in this paper can also extract data from OLAP and data mining servers, because editors for OLAP and data mining query requests have been added to it. Traditional systems produce all documents on the server side, which lets them avoid regenerating a document when many clients access the same dynamic document. Because this system targets a small number of users who generate documents for data analysis, it generates documents on the client side instead. The tool therefore has a processing mechanism for handling large amounts of data despite the limited memory capacity of the client-side report viewer. It also has a data structure for integrating data from the three kinds of data sources into one document. Finally, most traditional BI front-end tools depend on the data source architecture of a specific vendor. To overcome this problem, the system uses XMLA, a web-service-based protocol, to access OLAP and data mining data sources from various vendors.
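
To make the role of XMLA concrete, the sketch below builds the SOAP envelope for an XMLA `Execute` request, the call a client uses to send a query statement to an OLAP or data mining provider. It is an illustration only and says nothing about how the paper's tool is implemented: the cube name, catalog name, and `build_execute_request` helper are hypothetical, and actually sending the request (endpoint URL, authentication) is left out.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"
XMLA_NS = "urn:schemas-microsoft-com:xml-analysis"

def build_execute_request(statement, catalog):
    """Return an XMLA Execute request as an XML string."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    execute = ET.SubElement(body, f"{{{XMLA_NS}}}Execute")

    # The query text (MDX for OLAP, DMX for data mining) goes in Command/Statement.
    command = ET.SubElement(execute, f"{{{XMLA_NS}}}Command")
    ET.SubElement(command, f"{{{XMLA_NS}}}Statement").text = statement

    # Request properties: which catalog to query and the result format.
    properties = ET.SubElement(execute, f"{{{XMLA_NS}}}Properties")
    property_list = ET.SubElement(properties, f"{{{XMLA_NS}}}PropertyList")
    ET.SubElement(property_list, f"{{{XMLA_NS}}}Catalog").text = catalog
    ET.SubElement(property_list, f"{{{XMLA_NS}}}Format").text = "Multidimensional"

    return ET.tostring(envelope, encoding="unicode")

# Hypothetical cube and catalog names, for illustration only.
statement = "SELECT [Measures].[Sales] ON COLUMNS FROM [SalesCube]"
request_xml = build_execute_request(statement, "SalesCatalog")
print(request_xml)
```

Because the same envelope shape is accepted by any XMLA-compliant provider, a client built this way does not need vendor-specific drivers, which is the vendor-independence point the abstract makes.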

Job Preference Analysis and Job Matching System Development for the Middle Aged Class (중장년층 일자리 요구사항 분석 및 인력 고용 매칭 시스템 개발)

  • Kim, Seongchan;Jang, Jincheul;Kim, Seong Jung;Chin, Hyojin;Yi, Mun Yong
    • Journal of Intelligence and Information Systems
    • /
    • v.22 no.4
    • /
    • pp.247-264
    • /
    • 2016
  • With the rapid acceleration of low birth rates and population aging, the employment of neglected groups, including the middle-aged class, is a crucial issue in South Korea. In particular, in the 2010s the number of middle-aged people seeking a new job after retirement age has been increasing significantly as the baby boom generation (born 1955-1963) reaches retirement. Despite the importance of matching jobs to this emerging middle-aged class, neither private job portals nor the Korean government provides an online job service tailored to them. A huge amount of job information is available online; however, current recruiting systems do not meet the needs of the middle-aged class because their primary targets are young workers. A recruiting system designed specifically for the middle-aged is badly needed. Meanwhile, users searching for occupations on the Worknet website, provided by the Korean Ministry of Employment and Labor, find it hard to search for similar jobs because Worknet filters search results by exact match on a preferred job code. Moreover, according to our analysis of Worknet data, only about 24% of job seekers landed a position consistent with their initial preferred job code, while the rest landed a position different from their initial preference. To improve the situation, particularly for the middle-aged class, we investigate a soft job matching technique in three steps. 1) We review the user behavior logs of Worknet, a public job recruiting system set up by the Korean government, and point out key system design implications for the middle-aged. Specifically, we analyze job postings that carry preferential tags for the middle-aged to reveal which types of jobs favor them. 2) We develop a new occupation classification scheme for the middle-aged, the Korea Occupation Classification for the Middle-aged (KOCM), by reorganizing and modifying a general occupation classification scheme based on the similarity between jobs. From the perspective of job placement, an occupation classification scheme connects enterprises and job seekers and is a basic mechanism for job placement. The key features of KOCM include a Simple Labor category, which is the category most requested by enterprises. 3) We design MOMA (Middle-aged Occupation Matching Algorithm), a hybrid job matching algorithm that combines constraint-based reasoning and case-based reasoning. MOMA uses KOCM to expand a query so that similar jobs in the database are also searched, and it uses the cosine similarity between the user's requirements and each job posting to rank postings in terms of preferred job code, salary, distance, and job type. The system developed with MOMA shows about a 20-fold improvement over hard matching. In implementing the algorithm as a web-based recruiting application for the middle-aged, we also considered usability, which is especially important for this class of users. The system asks users to enter only a few simple, core pieces of information, such as preferred job (job code), salary, and (allowable) distance to the workplace, so that the middle-aged can efficiently find a job suited to their needs. The website implemented with MOMA should help improve job search for the middle-aged class. We also expect the overall approach to be applicable to other groups of people to improve job matching results.
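
The difference between hard and soft matching described above can be sketched in a few lines. This is not MOMA itself: the similar-job groups stand in for a scheme like KOCM, and the feature encoding, field names, and sample postings are all assumptions made for illustration. The sketch expands the preferred job code to similar codes, applies the distance constraint, and ranks the remaining postings by cosine similarity to an ideal requirement vector.

```python
import math

# Hypothetical groups of similar job codes (a stand-in for KOCM).
SIMILAR_CODES = {"guard": {"guard", "parking_attendant"}}

def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def soft_match(request, postings):
    """Expand the job code, filter by constraints, rank by cosine similarity."""
    codes = SIMILAR_CODES.get(request["code"], {request["code"]})
    candidates = [
        p for p in postings
        if p["code"] in codes and p["distance_km"] <= request["max_distance_km"]
    ]

    def features(p):
        # Each feature is 1.0 when the posting fully satisfies the requirement.
        return [
            1.0 if p["code"] == request["code"] else 0.5,
            min(p["salary"] / request["salary"], 1.0),
            1.0 - p["distance_km"] / request["max_distance_km"],
        ]

    ideal = [1.0, 1.0, 1.0]
    return sorted(candidates, key=lambda p: cosine(features(p), ideal), reverse=True)

request = {"code": "guard", "salary": 200, "max_distance_km": 10}
postings = [
    {"id": "A", "code": "guard", "salary": 200, "distance_km": 2},
    {"id": "B", "code": "parking_attendant", "salary": 200, "distance_km": 1},
    {"id": "C", "code": "cook", "salary": 250, "distance_km": 1},
    {"id": "D", "code": "guard", "salary": 210, "distance_km": 30},
]
ranked_ids = [p["id"] for p in soft_match(request, postings)]
print(ranked_ids)
```

An exact-code search would return only posting A here, while the expanded query also surfaces B. Note that cosine similarity compares direction rather than magnitude, so a real system would need to choose its feature encoding with care.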

Research Direction for Functional Foods Safety (건강기능식품 안전관리 연구방향)

  • Jung, Ki-Hwa
    • Journal of Food Hygiene and Safety
    • /
    • v.25 no.4
    • /
    • pp.410-417
    • /
    • 2010
  • Various functional foods that advertise health and functional effects are distributed in the market. Because these products come in the form of foods, tablets, and capsules, they are likely to be mistaken for drugs. In addition, non-experts may sell them as foods or use them for therapy. Efforts have been made to create health food regulations and build a regulatory system to improve the current status of functional foods, but these have not yet been communicated to consumers. As a result, problems such as the circulation of functional foods for therapy and the illegal addition of medicinal substances to such products have persisted, and internet media have made them worse. The causes can be categorized as relating to (1) the product itself and (2) its use, but in either case one possible cause is a lack of communication with consumers. Potential problems that can be caused by functional foods include illegal substances, hazardous substances, allergic reactions, considerations when administered to patients, drug interactions, ingredients with purity or concentrations too low to be detected, products with metabolic activation, health risks from overdose or underdose of vitamins and minerals, and products containing alkaloids (Journal of Health Science, 56, Supplement (2010)). Side effects related to functional foods have been increasing because under-qualified functional food companies exaggerate functionality for marketing purposes. KFDA has been informing consumers about these issues through its web pages, but there is still room for improvement in promoting the proper use of functional foods and avoiding drug interactions. Specifically, it is becoming urgent to institutionalize the collection of information on approved products and their side effects, establish reevaluation systems, and standardize preclinical and clinical tests. In addition, unified database systems that seamlessly aggregate heterogeneous data from different domains and offer user interfaces for effective one-stop search are crucial for providing this information.

Finding Weighted Sequential Patterns over Data Streams via a Gap-based Weighting Approach (발생 간격 기반 가중치 부여 기법을 활용한 데이터 스트림에서 가중치 순차패턴 탐색)

  • Chang, Joong-Hyuk
    • Journal of Intelligence and Information Systems
    • /
    • v.16 no.3
    • /
    • pp.55-75
    • /
    • 2010
  • Sequential pattern mining aims to discover interesting sequential patterns in a sequence database. It is one of the essential data mining tasks and is widely used in application fields such as Web access pattern analysis, customer purchase pattern analysis, and DNA sequence analysis. General sequential pattern mining considers only the generation order of data elements in a sequence, so it easily finds simple sequential patterns but is limited in finding the more interesting sequential patterns needed in real-world applications. One essential research topic that addresses this limitation is weighted sequential pattern mining, which considers not only the generation order of data elements but also their weights. Recently, data has increasingly taken the form of continuous data streams rather than finite stored data sets in various application fields, and the database research community has begun to focus on processing data streams. A data stream is a massive unbounded sequence of data elements continuously generated at a rapid rate. In data stream processing, each data element should be examined at most once, and memory usage should remain bounded even though new data elements are continuously generated. Moreover, newly generated data elements should be processed as fast as possible to keep the analysis result up to date, so that it can be used instantly upon request. To satisfy these requirements, data stream processing sacrifices some correctness of the analysis result by allowing some error. Given these changes in the form of data generated in real-world applications, much research has been actively performed to find various kinds of knowledge embedded in data streams. This research mainly focuses on efficient mining of frequent itemsets and sequential patterns over data streams, which have proven useful in conventional data mining of finite data sets. Mining algorithms have also been proposed to efficiently reflect changes of data streams over time in their mining results. However, they target naively interesting patterns such as frequent patterns and simple sequential patterns, which are found intuitively, and pay little attention to mining novel interesting patterns that better express the characteristics of the target data streams. Defining such novel patterns and developing methods for mining them is therefore a valuable research topic in data stream mining. This paper proposes a gap-based weighting approach for sequential patterns and a method for mining weighted sequential patterns over sequence data streams using that approach. A gap-based weight of a sequential pattern can be computed from the gaps between data elements in the pattern, without any predefined weight information. That is, the approach uses the gaps between data elements in each sequential pattern, as well as their generation order, to derive the weight of the pattern, which helps to obtain more interesting and useful sequential patterns. Because most computer application fields now generate data as data streams rather than finite data sets, the proposed method mainly focuses on sequence data streams.
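
The abstract does not give the paper's weighting formula, so the sketch below only illustrates the general idea that a weight can be derived from gaps alone. It assumes, purely for illustration, that a gap is the position difference between consecutive matched elements and that the weight is the mean of the inverse gaps, so tightly packed occurrences score close to 1. The function names and this formula are hypothetical, not the author's definition.

```python
def match_positions(sequence, pattern):
    """Positions of the earliest (greedy) occurrence of pattern as a
    subsequence of sequence, or None if the pattern does not occur."""
    positions = []
    start = 0
    for item in pattern:
        try:
            index = sequence.index(item, start)
        except ValueError:
            return None
        positions.append(index)
        start = index + 1
    return positions

def gap_weight(sequence, pattern):
    """Illustrative gap-based weight: mean of 1/gap over consecutive
    matched elements (an assumed formula, not the paper's)."""
    positions = match_positions(sequence, pattern)
    if positions is None:
        return 0.0
    if len(positions) < 2:
        return 1.0
    gaps = [b - a for a, b in zip(positions, positions[1:])]
    return sum(1.0 / g for g in gaps) / len(gaps)

# In "a x b c", the pattern a -> b -> c matches at positions 0, 2, 3,
# giving gaps 2 and 1 and a weight of (1/2 + 1/1) / 2.
print(gap_weight(list("axbc"), list("abc")))
print(gap_weight(list("abc"), list("abc")))
```

No predefined per-item weights are needed, which matches the property the abstract highlights; a streaming version would additionally have to update these values in one pass with bounded memory.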

A Study on the Problems of Eating Habits of Mordern People and Suggesting Alternatives to Overcome Diseases: A Review of the Five Blue Zones, Based on the Roma Linda Region in the USA (현대인의 식습관 문제점 인지와 발생 질병극복을 위한 대안 제시: 5대 블루존 중 미국 로마린다 지역을 중심으로)

  • Shin, Kyung-Ok;Je, Haejong
    • Journal of Korean Society of Neurocognitive Rehabilitation
    • /
    • v.10 no.2
    • /
    • pp.53-62
    • /
    • 2018
  • The purpose of this study was to propose alternatives to the eating habits of modern people and ways of coping with the resulting diseases by applying the eating principles of people living in Roma Linda to modern dietary life, thereby supporting healthy living and preventing disease. The study period was from May 1, 2016 to February 28, 2018. A literature search was conducted using PubMed and Korean academic websites. Based on the recognition of poor eating habits, we classified diseases according to eating habits. More than 100 papers were selected, and 60 papers and a database were prepared. People living in Roma Linda have eight health principles. They practice balanced nutritional intake, sufficient exercise, adequate water intake, sunlight, temperance (abstinence from alcohol, etc.), fresh air, adequate rest, and trust in their eating habits. They have a high intake of vegetables, fruits, and nuts. They are educated about nutrition, and the prevalence of coronary heart disease and cancer among them is low because most do not smoke or drink alcohol. Unhealthy eating habits and dietary behavior are associated with many diseases, and many chronic degenerative diseases are due to bad eating habits and stress. Adopting the good eating habits of people living in the Roma Linda area and practicing them steadily will have a great effect on disease prevention.

Development of Intelligent Job Classification System based on Job Posting on Job Sites (구인구직사이트의 구인정보 기반 지능형 직무분류체계의 구축)

  • Lee, Jung Seung
    • Journal of Intelligence and Information Systems
    • /
    • v.25 no.4
    • /
    • pp.123-139
    • /
    • 2019
  • The job classification systems of major job sites differ from site to site and also differ from the job classification system of the SQF (Sectoral Qualifications Framework) proposed for the SW field. A new job classification system that SW companies, SW job seekers, and job sites can all understand is therefore needed. The purpose of this study is to establish a standard job classification system that reflects market demand by analyzing the SQF against the job offer information of major job sites and the NCS (National Competency Standards). To this end, association analysis is conducted between the occupations of major job sites, and between the SQF and those occupations, to derive association rules between occupations. Using these association rules, we propose an intelligent job classification system based on data that maps the job classification systems of major job sites to the SQF. First, major job sites are selected to obtain information on the job classification systems of the SW market. Then we identify ways to collect job information from each site and collect data through open APIs. Focusing on the relationships in the data, we keep only the job postings posted on multiple job sites at the same time and delete the rest. Next, we map the job classification systems between job sites using the association rules derived from the association analysis. We complete the mapping between these market segments, discuss it with experts, further map the SQF, and finally propose a new job classification system. As a result, more than 30,000 job listings were collected in XML format through the open APIs of 'WORKNET,' 'JOBKOREA,' and 'saramin', the main job sites in Korea. After filtering these down to about 900 job postings posted simultaneously on multiple job sites, 800 association rules were derived by applying the Apriori algorithm, a frequent pattern mining algorithm. Based on the 800 association rules, the job classification systems of WORKNET, JOBKOREA, and saramin and the SQF job classification system were mapped and classified into first through fourth levels. In the new job taxonomy, the first primary classification, jobs related to IT consulting, computer systems, networks, and security, consists of three secondary classifications, five tertiary classifications, and five fourth-level classifications. The second primary classification, jobs related to databases and system operation, consists of three secondary classifications, three tertiary classifications, and four fourth-level classifications. The third primary classification, Web Planning, Web Programming, Web Design, and Game, consists of four secondary classifications, nine tertiary classifications, and two fourth-level classifications. The last primary classification, jobs related to ICT management and computer and communication engineering technology, consists of three secondary classifications and six tertiary classifications. In particular, the new job classification system has a relatively flexible depth of classification, unlike existing classification systems. WORKNET divides jobs down to the third level; JOBKOREA divides jobs down to the second level and further subdivides them by keyword; saramin likewise divides jobs down to the second level and subdivides them by keyword. The newly proposed standard job classification system accepts some keyword-based jobs and treats some product names as jobs. In this classification system, some jobs stop at the second level, while others are subdivided down to the fourth level. This reflects the idea that not all jobs can be broken down to the same depth. We also combined the rules derived from association analysis of the collected market data with experts' opinions. The newly proposed job classification system can therefore be regarded as a data-based intelligent job classification system that reflects market demand, unlike existing job classification systems. This study is meaningful in that it suggests a new job classification system that reflects market demand by mapping occupations based on data, through association analysis between occupations, rather than on the intuition of a few experts. However, it is limited in that it cannot fully reflect market demand that changes over time, because the data were collected at a single point in time. As market demand changes over time, including seasonal factors and the timing of major corporate public recruitment, continuous data monitoring and repeated experiments are needed to achieve more accurate matching. The results of this study can be used to suggest directions for improving the SQF in the SW industry, and the approach is expected to carry over to other industries once it has proven successful in the SW industry.
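
The association-rule step can be illustrated with a small support and confidence calculation. The sketch below covers only the two-item level of Apriori-style mining, not the full algorithm, and it is not the study's code: it assumes each transaction is the set of category labels that one job posting received on different sites, and the labels, thresholds, and `pairwise_rules` function are hypothetical.

```python
from itertools import combinations

def pairwise_rules(transactions, min_support, min_confidence):
    """Derive rules lhs -> rhs between single items.

    support    = fraction of transactions containing both items
    confidence = fraction of transactions with lhs that also contain rhs
    """
    n = len(transactions)
    item_count = {}
    pair_count = {}
    for transaction in transactions:
        items = sorted(set(transaction))
        for item in items:
            item_count[item] = item_count.get(item, 0) + 1
        for pair in combinations(items, 2):
            pair_count[pair] = pair_count.get(pair, 0) + 1

    rules = []
    for (a, b), count in pair_count.items():
        support = count / n
        if support < min_support:
            continue  # infrequent pairs are pruned, as in Apriori
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_count[lhs]
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return rules

# Each transaction: categories assigned to the same posting on two sites
# (made-up labels for illustration).
transactions = [
    {"WORKNET:web_dev", "JOBKOREA:web_programmer"},
    {"WORKNET:web_dev", "JOBKOREA:web_programmer"},
    {"WORKNET:web_dev", "JOBKOREA:web_designer"},
    {"WORKNET:dba", "JOBKOREA:database"},
]
rules = pairwise_rules(transactions, min_support=0.5, min_confidence=0.6)
for lhs, rhs, support, confidence in rules:
    print(f"{lhs} -> {rhs} (support={support:.2f}, confidence={confidence:.2f})")
```

A high-confidence rule between two sites' categories suggests that those categories describe the same kind of job, which is the signal used to map one classification system onto another.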