PROJECTS
FairTabGen: Unifying Counterfactual and Causal Fairness in Synthetic Tabular Data Generation
This project develops FairTabGen, a fairness-aware, LLM-based framework for generating synthetic tabular data. By integrating counterfactual and causal fairness definitions into both generation and evaluation, FairTabGen aims to produce equitable data while preserving statistical utility. The framework combines in-context learning, prompt refinement, and fairness-aware data curation to balance fairness and performance. Evaluated across diverse datasets, it achieves gains of up to 10% on fairness metrics such as demographic parity and path-specific causal effects while using less than 20% of the original data. FairTabGen offers an efficient, principled approach to producing fair synthetic data in privacy-sensitive and low-data healthcare settings.
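For readers unfamiliar with demographic parity, one of the fairness metrics named above, the following minimal sketch shows how the gap is commonly computed: the difference in positive-prediction rates between groups defined by a sensitive attribute. It is an illustration only, not FairTabGen's evaluation code. The function name `demographic_parity_difference` and the toy data are assumptions made for this example.

```python
from collections import defaultdict


def demographic_parity_difference(predictions, groups):
    """Return (gap, rates): the largest difference in positive-prediction
    rate between any two groups, plus the per-group rates.

    Hypothetical helper for illustration; not part of FairTabGen.
    """
    totals = defaultdict(int)
    positives = defaultdict(int)
    for pred, group in zip(predictions, groups):
        totals[group] += 1
        positives[group] += int(pred == 1)
    # Positive-prediction rate for each value of the sensitive attribute.
    rates = {g: positives[g] / totals[g] for g in totals}
    return max(rates.values()) - min(rates.values()), rates


# Toy data (made up): binary predictions and a two-valued sensitive attribute.
preds = [1, 0, 1, 1, 0, 0, 1, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
gap, rates = demographic_parity_difference(preds, groups)
print(gap, rates)
```

A gap of 0 means every group receives positive predictions at the same rate. A classifier trained on fairer synthetic data would be expected to show a smaller gap than one trained on the unmodified data.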