Dr. Uzair Javaid

Dr. Uzair Javaid is the CEO and Co-Founder of Betterdata AI, a company focused on Programmable Synthetic Data generation using Generative AI and Privacy Engineering. Betterdata’s technology helps data science and engineering teams easily access and share sensitive customer/business data while complying with global data protection and AI regulations.
Previously, Uzair worked as a Software Engineer and Business Development Executive at Merkle Science (Series A $20M+), where he developed taint analysis techniques for blockchain wallets.

Uzair has a strong academic background in Computer Science/Engineering with a Ph.D. from the National University of Singapore (Top 10 in the world). His research focused on designing and analyzing blockchain-based cybersecurity solutions for cyber-physical systems, with a specialization in data security and privacy engineering techniques.

In one of his Ph.D. projects, he reverse engineered the encryption algorithm of the Ethereum blockchain and ethically hacked 670 user wallets. He has been cited 600+ times across 15+ publications in globally reputable conferences and journals, and has also received recognition for his work, including a Best Paper Award and scholarships.

In addition to his work at Betterdata AI, Uzair is an advisor at German Entrepreneurship Asia, providing guidance and expertise to support entrepreneurship initiatives in the Asian region. He has also been actively involved in paying it forward, volunteering as a peer student support group member at the National University of Singapore and serving as a technical program committee member for the International Academy, Research, and Industry Association.

Using Privacy Preserving and Scalable Synthetic Data to Enhance Cyber Security Ops for DHS

Dr. Uzair Javaid
April 29, 2025

Summary:
  • Betterdata’s Large Tabular Model (LTM) generated synthetic data with high fidelity and zero PII exposure.
  • LTM enables zero-shot and few-shot adaptability without constant retraining.
  • Built-in privacy audits ensure every synthetic dataset meets DHS’s strict compliance standards.
  • LTM accelerated cyber defense training, improved anomaly detection, and enabled secure interagency data sharing.
  • For the US Department of Homeland Security (DHS), we solved the problem of training ML models on high-quality data without relying on anonymization.

    The Challenge for DHS:

    • Operational data is often sensitive and cannot be shared across departments due to privacy, security, and regulatory constraints. 
    • Traditional anonymization techniques fail to fully protect against re-identification risks, limiting the ability to conduct effective cybersecurity and infrastructure protection exercises.

    DHS therefore needed high-quality data that it could use safely to train ML models, test critical systems, and simulate real-world scenarios.

    Our Solution - Large Tabular Model (LTM):

    Through our foundational model, ‘Large Tabular Model (LTM)’, DHS can generate high-fidelity synthetic data that mirrors the statistical properties of real datasets, without carrying any sensitive or personally identifiable information (PII).

    • Zero-shot and few-shot adaptability: Unlike rule-based or deep learning-based synthetic data generators that need retraining for each use case, LTM can adapt to new datasets with minimal input.
    • Built-in privacy audits: Every synthetic dataset generated undergoes rigorous privacy assessments, ensuring compliance with DHS’s stringent security and privacy standards.
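
The article does not describe how LTM's privacy audits work internally. As an illustration only, the sketch below shows two checks that are commonly used when auditing synthetic tabular data: counting synthetic rows that are verbatim copies of real rows, and measuring each synthetic row's distance to its closest real record. The function names, the toy data, and the `min_distance` threshold are all hypothetical and are not part of Betterdata's product.

```python
import math

def exact_match_count(real_rows, synthetic_rows):
    """Count synthetic rows that are verbatim copies of a real row."""
    real_set = {tuple(r) for r in real_rows}
    return sum(1 for s in synthetic_rows if tuple(s) in real_set)

def distance_to_closest_record(real_rows, synthetic_rows):
    """For each synthetic row, the Euclidean distance to the nearest real row."""
    return [min(math.dist(s, r) for r in real_rows) for s in synthetic_rows]

def audit(real_rows, synthetic_rows, min_distance=0.5):
    """Toy audit: fail if any synthetic row copies or nearly copies a real row.

    min_distance is an illustrative threshold, not a regulatory value.
    """
    copies = exact_match_count(real_rows, synthetic_rows)
    dcr = distance_to_closest_record(real_rows, synthetic_rows)
    too_close = sum(1 for d in dcr if d < min_distance)
    return {
        "exact_copies": copies,
        "too_close": too_close,
        "passed": copies == 0 and too_close == 0,
    }

# Hypothetical numeric records (made-up values for illustration)
real = [(1.0, 10.0), (2.0, 20.0), (3.0, 30.0)]
safe_synth = [(1.6, 15.0), (2.5, 26.0)]
leaky_synth = [(2.0, 20.0), (2.5, 26.0)]  # first row duplicates a real record

print(audit(real, safe_synth))
print(audit(real, leaky_synth))
```

A production audit would typically scale the columns before computing distances, handle categorical fields, and add further tests such as membership-inference checks. This sketch only conveys the general idea of flagging synthetic records that sit too close to real ones.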

    The Impact:

    By adopting Betterdata’s LTM, DHS is achieving the following outcomes:

    • Faster Cyber Defense Simulations: DHS can now simulate sophisticated cyber-attack scenarios using realistic yet risk-free datasets, accelerating training and strategic planning.
    • Enhanced Anomaly Detection: ML models trained on synthetic data identify anomalies and threats more accurately, without ever accessing real user information.
    • Secure Interagency Collaboration: Agencies can share synthetic datasets freely, breaking down data silos without risking policy violations.
    • Regulatory Compliance: DHS remains fully aligned with national cybersecurity mandates while advancing its AI-driven threat intelligence programs.

    Synthetic data has the potential to transform industries by enabling government agencies and enterprises to innovate without the usual data-access restrictions. It allows them to work with accessible, fair, and scalable data, something that was not possible in the not-so-distant past. With data protected and utility maintained (or even improved in some cases), innovation is not a question of how but when.
