Research Article

A Classification Model for Addressing Welfare Blind Spots through Progressive Variable Expansion: Focusing on Variable Generation via TVAE and Recall Optimization

Youngsik Park¹ · Dongwon Lee¹ · Hyoung-Yong Lee¹

¹ Hansung University

Published: January 2026 · Vol. 55 No. 2 · pp. 583-611

DOI: https://doi.org/10.17287/kmr.2026.55.2.583


Abstract

This study aims to address the key causes of welfare blind spots—missing variables and prediction omissions—by applying a Tabular Variational AutoEncoder (TVAE)-based structural missing data imputation technique and designing a recall-centered classification model. Using welfare administrative data collected from January 2018 to November 2023, the study systematically verified the effects of variable expansion and synthetic data integration. In Phase 1, recall slightly improved by approximately 0.4–1.0 percentage points through direct variable expansion. In Phase 2, combining TVAE-imputed data led to a substantial increase in recall, from 57.7% to 94.4%. A sensitivity analysis of synthetic data combination ratios (Test 1–4) confirmed that stable predictive performance was maintained even with 10–20% synthetic data inclusion. The time-block validation further demonstrated temporal robustness, with ROC-AUC values remaining between 0.66 and 0.67 and PR-AUC between 0.31 and 0.32 across time periods. These results empirically demonstrate that TVAE-based imputation effectively enhances the quality and predictive reliability of welfare administrative data, supporting the validity of recall-oriented, data-driven decision-making in welfare policy design.

Keywords: welfare blind spots, TVAE, progressive variable expansion, recall optimization, machine learning