• Title/Summary/Keyword: Industry(Occupation) Code Classification

An Automated Industry and Occupation Coding System using Deep Learning (딥러닝 기법을 활용한 산업/직업 자동코딩 시스템)

  • Lim, Jungwoo;Moon, Hyeonseok;Lee, Chanhee;Woo, Chankyun;Lim, Heuiseok
    • Journal of the Korea Convergence Society / v.12 no.4 / pp.23-30 / 2021
  • An automated industry and occupation coding system assigns statistical classification codes to the large volume of natural language responses in which people describe their industry and occupation. Unlike previous studies that applied information retrieval, we propose a system that needs no index database and assigns the proper code regardless of the classification level. We also show that our model, built on KoBERT, a deep learning model that performs well on natural language downstream tasks, outperforms the baseline. Our method achieves 95.65%, 91.51%, and 97.66% on occupation and industry code classification for the Population and Housing Census and on industry code classification for the Census on Basic Characteristics of Establishments, respectively. We also identify directions for future improvement through an error analysis of the data and the model.

An automated Classification System of Standard Industry and Occupation Codes by Using Information Retrieval Techniques (정보검색 기법을 이용한 산업/직업 코드 자동 분류 시스템)

  • Lim, Heui Seok
    • The Journal of Korean Association of Computer Education / v.7 no.4 / pp.51-60 / 2004
  • This paper proposes an automated coding system for Korean standard industry and occupation codes in the census, which reduces the cost and labor of manual coding. The proposed system converts natural language responses on survey questionnaires into the corresponding numeric codes using information retrieval techniques and a document classification algorithm. The system was evaluated on 46,762 industry records and 36,286 occupation records using 10-fold cross-validation. In the experiments, the system shows production rates of 87.08% and 66.08% when classifying industry records into level 2 and level 5 codes, respectively. The system shows slightly lower performance on occupation code classification. Although it has much room for improvement as a fully automated coding system, we expect the system to be good enough to use as a semi-automated coding system that minimizes manual coding or as a verification tool for manual coding results.
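
The 10-fold cross-validation protocol mentioned in this abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function names `k_fold_indices` and `cross_validate`, the `train_fn`/`predict_fn` callables, and the fixed shuffle seed are all assumptions introduced here.

```python
import random


def k_fold_indices(n_records, k=10, seed=0):
    """Split indices 0..n_records-1 into k shuffled folds of near-equal size."""
    indices = list(range(n_records))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index starting at i,
    # so fold sizes differ by at most one record.
    return [indices[i::k] for i in range(k)]


def cross_validate(records, labels, train_fn, predict_fn, k=10):
    """Return mean accuracy over k folds, each held out exactly once."""
    if len(records) < k:
        raise ValueError("need at least k records")
    accuracies = []
    for held_out in k_fold_indices(len(records), k):
        held = set(held_out)
        train_idx = [i for i in range(len(records)) if i not in held]
        # train_fn and predict_fn are hypothetical hooks for any classifier.
        model = train_fn([records[i] for i in train_idx],
                         [labels[i] for i in train_idx])
        correct = sum(predict_fn(model, records[i]) == labels[i]
                      for i in held_out)
        accuracies.append(correct / len(held_out))
    return sum(accuracies) / len(accuracies)


# Usage with a trivial majority-class baseline (illustrative only).
def train_majority(train_records, train_labels):
    return max(set(train_labels), key=train_labels.count)


def predict_majority(model, record):
    return model
```

Every record is used for testing exactly once, so the averaged score reflects all of the data rather than a single held-out split.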

An Automatic Coding System of Korean Standard Industry/Occupation Code Using Example-based Learning (예제기반의 학습을 이용한 한국어 표준 산업/직업 자동 코딩 시스템)

  • Lim Heui-Seok
    • The Journal of the Korea Contents Association / v.5 no.4 / pp.169-179 / 2005
  • Standard industry and occupation codes are usually assigned manually in the Korean census. Manual coding is a labor-intensive and expensive task. Furthermore, coding is inconsistent because it depends on the ability of the human experts and their working environments. This paper proposes an automatic code classification system that converts natural language responses on survey questionnaires into the corresponding numeric codes using a manually constructed rule base and example-based machine learning. The system was trained with 400,000 records to which standard codes had been assigned. It was evaluated with 10-fold cross-validation and tested on three code sets: the population occupation set, the industry set, and the industry survey set. The proposed system showed 76.63%, 82.24%, and 99.68% accuracy on the respective code sets.

Improving the Classification of Population and Housing Census with AI: An Industry and Job Code Study

  • Byung-Il Yun;Dahye Kim;Young-Jin Kim;Medard Edmund Mswahili;Young-Seob Jeong
    • Journal of the Korea Society of Computer and Information / v.28 no.4 / pp.21-29 / 2023
  • In this paper, we propose an AI-based system for automatically classifying industry and occupation codes in the population census. Accurate classification of these codes is crucial for informing policy decisions, allocating resources, and conducting research. However, the task has traditionally been performed by human coders, which is time-consuming, resource-intensive, and prone to error. Our system represents a significant improvement over the existing rule-based system used by the statistics agency, which relies on user-entered data for code classification. We trained and evaluated several models and developed an ensemble model that achieved a match accuracy of 86.76% for industry and 81.84% for occupation, outperforming the best individual model. The ensemble combines transfer learning techniques with pre-trained models. We also propose process improvements based on the classification probabilities output by the model. These results demonstrate the potential of AI-based systems to improve the accuracy and efficiency of population census data classification: automating the process yields more accurate and consistent results while reducing the workload on agency staff.
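
One common way to build such an ensemble and to use its classification probabilities for process routing is soft voting with a confidence threshold. The abstract does not state how the models were combined, so the sketch below is an assumption: the averaging rule, the `threshold` value, the `auto`/`manual_review` labels, and the placeholder codes are all hypothetical.

```python
def soft_vote(prob_dists):
    """Average per-model probability dicts; return (best_code, mean_prob)."""
    if not prob_dists:
        raise ValueError("need at least one model output")
    codes = set().union(*prob_dists)
    averaged = {
        code: sum(dist.get(code, 0.0) for dist in prob_dists) / len(prob_dists)
        for code in codes
    }
    best = max(averaged, key=averaged.get)
    return best, averaged[best]


def route(prob_dists, threshold=0.8):
    """Accept confident predictions; send the rest to a human coder.

    The threshold is an illustrative value, not one taken from the paper.
    """
    code, confidence = soft_vote(prob_dists)
    decision = "auto" if confidence >= threshold else "manual_review"
    return code, decision


# Two hypothetical models scoring one response over placeholder codes.
model_outputs = [
    {"code_X": 0.9, "code_Y": 0.1},
    {"code_X": 0.6, "code_Y": 0.4},
]
predicted_code, decision = route(model_outputs)
```

Here the averaged probability for `code_X` is 0.75, which falls below the assumed threshold, so the response would be queued for manual review rather than coded automatically.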

Assessing the Association Between Emotional Labor and Presenteeism Among Nurses in Korea: Cross-sectional Study Using the 4th Korean Working Conditions Survey

  • Jung, Sung Won;Lee, June-Hee;Lee, Kyung-Jae
    • Safety and Health at Work / v.11 no.1 / pp.103-108 / 2020
  • Background: Presenteeism has emerged as an important health-related issue and has been studied in a variety of occupational groups. This study examines the relationship between emotional labor and presenteeism among nurses in the Republic of Korea. Methods: This cross-sectional study was conducted on 328 female nurses who participated in the fourth Korean Working Conditions Survey (2015). Nurses were identified by the Korean Industry Classification Code. Multivariable logistic regression analysis was performed to explore the association between emotional labor and presenteeism. Results: Female nurses who always or sometimes hide their emotions in the workplace had a higher risk of presenteeism than female nurses who rarely hide their emotions in the workplace {odds ratio [OR] = 2.40 [95% confidence interval (CI) 1.04-5.54]; OR = 4.12 [95% CI 1.72-9.84], respectively}. Furthermore, the risk of presenteeism was higher in nurses who sometimes engaged with complaining customers than in nurses who rarely did so, but the difference was not statistically significant. Conclusion: Presenteeism in nurses can cause various negative secondary effects; therefore, an alternative should be sought to mediate nurses' emotional labor to prevent presenteeism.

Developing Asbestos Job Exposure Matrix Using Occupation and Industry Specific Exposure Data (1984-2008) in Republic of Korea

  • Choi, Sangjun;Kang, Dongmug;Park, Donguk;Lee, Hyunhee;Choi, Bongkyoo
    • Safety and Health at Work / v.8 no.1 / pp.105-115 / 2017
  • Background: The goal of this study is to develop a general population job-exposure matrix (GPJEM) on asbestos to estimate occupational asbestos exposure levels in the Republic of Korea. Methods: Three Korean domestic quantitative exposure datasets collected from 1984 to 2008 were used to build the GPJEM. Exposure groups in the collected data were reclassified based on the current Korean Standard Industrial Classification (9th edition) and the Korean Standard Classification of Occupations code (6th edition) that is in accordance with international standards. All exposure levels were expressed as weighted arithmetic mean (WAM) and minimum and maximum concentrations. Results: Based on the established GPJEM, the 112 exposure groups could be reclassified into 86 industries and 74 occupations. In the 1980s, the highest exposure levels were estimated for "knitting and weaving machine operators," with a WAM concentration of 7.48 fibers/mL (f/mL); in the 1990s, for "plastic products production machine operators," with 5.12 f/mL; and in the 2000s, for "detergents production machine operators" handling talc containing asbestos, with 2.45 f/mL. Of the 112 exposure groups, 44 had WAM concentrations higher than the Korean occupational exposure limit of 0.1 f/mL. Conclusion: The newly constructed GPJEM, which is generated from actual domestic quantitative exposure data, could be useful in evaluating historical exposure levels to asbestos and could contribute to improved prediction of asbestos-related diseases among Koreans.
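
The weighted arithmetic mean used above is the standard $\sum_i w_i x_i / \sum_i w_i$. The sketch below assumes the weights are sample counts per dataset, which the abstract does not state; the function names and input values are hypothetical, and only the 0.1 f/mL exposure limit comes from the abstract.

```python
def weighted_arithmetic_mean(values, weights):
    """Return sum(w_i * x_i) / sum(w_i) for paired values and weights."""
    if not values or len(values) != len(weights):
        raise ValueError("values and weights must be non-empty and equal length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(x * w for x, w in zip(values, weights)) / total_weight


def exceeds_limit(wam, limit=0.1):
    """Compare a WAM concentration (f/mL) with the Korean exposure limit."""
    return wam > limit


# Hypothetical per-dataset mean concentrations (f/mL) and sample counts.
dataset_means = [2.0, 4.0]
sample_counts = [1, 3]
wam = weighted_arithmetic_mean(dataset_means, sample_counts)
```

Weighting by sample count keeps a dataset with few measurements from dominating the estimate for an exposure group.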

Typology of Korean Eco-sumers: Based on Clothing Disposal Behaviors

  • Sung, Hee-Won;Kincade, Doris H.
    • Journal of Global Scholars of Marketing Science / v.20 no.1 / pp.59-69 / 2010
  • Green, or environmental, consciousness has been a major issue for businesses and government offices, as well as consumers, worldwide. In response to this movement, the Korean government announced, in the early 2000s, the era of "Green Growth" as a way to encourage green-related business activities. The Korean fashion industry, at various levels of involvement, presents diverse eco-friendly products as part of the green movement. These apparel products include organic products and recycled clothing. For these companies to be successful, they need information about which consumers consider green issues (e.g., environmental sustainability) as part of their personal values when deciding on product purchase, use, and disposal. These consumers can be considered eco-sumers. Previous studies have examined consumers' purchase intention for eco-friendly products. In addition, studies have examined influential factors used to identify eco-sumers or green consumers. However, limited attention has been paid to eco-sumers' disposal or recycling of clothes in comparison with their green product purchases. Clothing disposal behaviors are the ways in which consumers get rid of unused clothing and include temporarily lending the item or permanently eliminating it by "handing down" (e.g., giving it to a younger sibling), donating, exchanging, selling, or simply throwing it away. Accordingly, examining purchasing behaviors for eco-friendly fashion items in conjunction with clothing disposal behaviors should improve understanding of a consumer's clothing consumption behavior from the environmental perspective. The purpose of this exploratory study is to provide descriptive information about Korean eco-sumers who have ecologically favorable lifestyles and behaviors when buying and disposing of clothes.
The objectives of this study are to (a) categorize Koreans on the basis of clothing disposal behaviors; (b) investigate the differences in demographics, lifestyles, and clothing consumption values among segments; and (c) compare the purchase intention of eco-friendly fashion items and influential factors among segments. A self-administered questionnaire was developed based on previous studies. The questionnaire included 10 items on clothing disposal behavior, 22 items on LOHAS (Lifestyles of Health and Sustainability) characteristics, and 19 items on consumption values, measured on five-point Likert-type scales. In addition, the purchase intention for two eco-friendly fashion items and 11 attributes of each item were measured on seven-point Likert-type scales. Two polyester fleece pullovers, made from fabric created from recycled bottles with the PET identification code, were selected from outdoor sportswear brands: one from a Korean brand and one from an imported US brand. A brief description of each product with a color picture was provided in the survey. Demographic variables (i.e., gender, age, marital status, education level, income, occupation) were also included. The data were collected through a professional web survey agency during May 2009. A total of 600 usable questionnaires were analyzed. The age of respondents ranged from 20 to 49 years, with a mean age of 34 years. Fifty percent of the respondents were male, about 58% were married, and 62% reported having earned university degrees. Principal components factor analysis with varimax rotation was used to identify the underlying dimensions of the clothing disposal behavior scale, and three factors were generated (i.e., reselling behavior, donating behavior, non-recycling behavior). To categorize the respondents on the basis of clothing disposal behaviors, k-means cluster analysis was used, and three segments were obtained.
These consumer segments were labeled the 'Resale Group', 'Donation Group', and 'Non-Recycling Group.' The classification results indicated that approximately 98 percent of the original cases were correctly classified. With respect to demographic characteristics, significant differences among the three segments were found in gender, marital status, occupation, and age. LOHAS characteristics were reduced to the following five factors: self-satisfaction, family orientation, health concern, environmental concern, and voluntary service. Significant differences were found in the LOHAS factors among the three clusters. The Resale Group and Donation Group showed a similar predisposition to LOHAS issues, while the Non-Recycling Group presented the lowest mean scores on the LOHAS factors. The Resale and Donation Groups described themselves as enjoying or being satisfied with their lives and spending spare time with family. In addition, these two groups cared about health and organic foods, and tried to conserve energy and resources. Principal components factor analysis reduced clothing consumption values to the following three factors: personal value, social value, and practical value. The ANOVA test on these factors showed differences primarily between the Resale Group and the other two groups. The Resale Group was more concerned about personal value and social value than the other segments. In contrast, the Non-Recycling Group presented a higher level of social value than did the Donation Group. In a comparison of the intention to purchase eco-friendly products, the Resale Group showed the highest mean score on intent to purchase Product A. On the other hand, the Donation Group presented the highest intention to purchase Product B among the segments. In addition, the mean scores indicated that the Korean product (Product B) was preferred for purchase over the U.S. product (Product A).
Stepwise regression analysis was used to identify the influence of product attributes on the purchase intention for the eco-friendly products. With respect to Product A, design, price, and contribution to environmental preservation were significant predictors of purchase intention for the Resale Group, while price and compatibility with my image were significant for the Donation Group. For the Non-Recycling Group, design, price, compatibility with my image, participation in an eco campaign, and contribution to environmental preservation were significant. Price appropriateness was significant for each of the three clusters. With respect to Product B, design, price, and compatibility with my image were important, but different attributes were significantly associated with purchase intention for each of the three groups. The influence of LOHAS characteristics and clothing consumption values on intention to purchase Products A and B was also examined. The LOHAS factor of health concern and the personal value factor were significantly related to purchase intention; however, the explanatory power was low in the three segments. The findings showed that the groups classified by clothing disposal behaviors differed in the product attributes, personal values, and LOHAS characteristics that influenced their purchase intention for eco-friendly products. These findings would enable organizations to understand eco-friendly behavior and to make appropriate strategic decisions to appeal to eco-sumers.