PROJECTS
FairCauseSyn: Towards Causally Fair LLM-Augmented Synthetic Data Generation
This project develops FairCauseSyn, the first LLM-augmented framework for generating synthetic health data with causal fairness. Unlike existing methods, which address only counterfactual fairness, FairCauseSyn preserves the underlying causal structure of the data, supporting more equitable outcomes across sensitive attributes. Applied to real-world tabular health data, it produces synthetic datasets that deviate by less than 10% from the real data on causal fairness metrics and that reduce bias by up to 70% when used to train predictors. By enabling access to high-quality, fair synthetic data, FairCauseSyn advances equitable health research and supports less biased healthcare delivery.