Alexander El-Hajj
Data Scientist · Montreal, QC
(01)
Data scientist & generative-AI builder, based in Montreal.
I like hard problems and the unglamorous engineering that makes systems actually work. Day job: modernizing secure data systems at Statistics Canada. The rest of the time: building, testing, and shipping practical ML tools. Eight years turning research and messy data into things that run.
I research and chase down hard problems, then build the thing that unblocks the team. I'm happiest understanding a system end-to-end and finding the simple version of it.
For eight years I've worked where data meets reality: secure public-sector analytics at Statistics Canada, applied ML and OCR research for federal partners, and reinforcement-learning work that ended up in a Springer volume. More recently that curiosity pulled me into generative AI: retrieval over messy document stores, fine-tuning small models under tight constraints, on-device inference with real safety boundaries, and agents on cloud infrastructure. I care about evaluation, observability, and the integration work that decides whether a prototype survives. I hold a Reliability Status security clearance.
Origin story
My interest in AI started in a very ordinary place: a Carleton University library seat in early 2018, watching Samuel Arzt's Deep Learning Cars. The video is simple: little red cars try to drive through a course, most of them crash almost immediately, and a small neural network in the corner shows the inputs and connections changing as each generation runs.
What got me was watching failure become information. At generation 1, nothing looks intelligent. The cars hit walls, spin out, and disappear. Then one gets a little farther. Another learns a cleaner turn. Over time the movement starts to look less random, until the system has learned enough from the course to make it through. It made neural networks feel visible, not mystical.
After that I started digging: Geoffrey Hinton lectures, the early ideas behind artificial intelligence, and the question of how simple learning rules can produce behavior that feels surprisingly adaptive. Later that year I took Data Mining 1 and got my first hands-on pass at SVMs, Random Forests, and Logistic Regression. That became the hook, and eventually a working style: understand the system, test the assumptions, and build the version that survives contact with real data and real users.
Now · 2021-present
Senior Analyst - Survey of Household Spending (SHS)
Statistics Canada · Hybrid
My current work is the quiet kind of modernization that makes public data systems healthier from the inside. I help move long-running survey production out of legacy SAS habits and into open-source workflows that analysts can review, test, and maintain. A lot of the job is translation: turning institutional knowledge into reproducible code, building validation tools people actually use, and helping teams adopt GitLab practices without making the process feel heavier than the work itself.
2022-2023
Generative AI Engineer
Independent Consultant · Remote
Before GenAI had clean playbooks, I was turning unstable notebooks, diffusion experiments, and video-generation tools into client-ready production workflows. The work sat between creative direction and engineering: setting up GPU environments, adapting open-source code, tuning prompts and model parameters, cleaning temporal artifacts, and helping teams ship campaign, event, and music-video pieces for Vuse, Wunderman Thompson, Standard Chartered Bank, and Taylor Gang Entertainment. Across published client and entertainment work, the outputs reached 5M+ combined views.
2020-2021
Data Scientist - Data Science Division
Statistics Canada · Ottawa
This was my first deep stretch inside applied government ML: messy inputs, real operational constraints, and partners who needed tools that made field work easier. On Project Cyclops, I helped turn smartphone photos of product labels into OCR-assisted compliance checks for Health Canada. Alongside that, I worked on pandemic-policy simulation research for PHAC, combining agent-based modelling, reinforcement learning, and dashboarding to explore how mitigation strategies could behave across a simulated Ontario population.
2018-2021
Analyst & Research Lead - Longitudinal and International Study of Adults (LISA)
Statistics Canada · Ottawa
LISA is where I learned how much good analysis depends on trust: trust in the data, in the release process, and in the people coordinating across teams. I worked across survey production, partner support, workshops, and research, eventually co-authoring food-insecurity papers that connected survey methodology with real social outcomes. I also started StatCan's first Kaggle competitive group because I wanted more people around me experimenting hands-on with machine learning, not just reading about it.
(03)
Selected work
Shipped & recognized
01
Special Recognition · Mila AI Safety Hackathon 2026
GuardAI: a stacked guardrail that protects youth in digital crisis spaces
Built with team MindCraft. Keyword filters miss youth crisis because risk is a trajectory — it hides in coded language and multi-turn subtext. We fine-tuned a bilingual (EN/FR) EuroBERT-610M classifier on 85,000+ synthetic samples from a six-phase data-generation pipeline, using two-stage curriculum learning across 23 risk categories. A fast ~24ms classifier flags obvious signals; gray-zone cases escalate to a cascaded LLM judge. We tuned for recall on purpose — a false alarm gets filtered by a counsellor, a missed crisis does not.
Kept private under hackathon and safety constraints. Happy to walk through the pipeline, training tricks, and evaluation in detail.
EuroBERT · PyTorch · Curriculum learning · Synthetic data · 8-bit Adam · BF16 · Cascaded LLM judge
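The two-stage routing described above can be sketched roughly as follows. This is an illustration only, not the GuardAI code (which stays private): the `cascade` function, the `Decision` type, and the `low`/`high` thresholds are hypothetical names and values chosen for the example, and the LLM judge is stood in for by a plain callable.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Decision:
    label: str  # "flag" or "safe"
    stage: str  # which stage decided: "classifier" or "judge"


def cascade(
    risk_score: float,
    judge: Callable[[], bool],
    low: float = 0.2,   # hypothetical threshold, kept low to favour recall
    high: float = 0.8,  # hypothetical threshold for obvious signals
) -> Decision:
    """Route one message through a fast classifier score, then a slower judge."""
    if risk_score >= high:
        # Obvious signal: the fast classifier decides on its own.
        return Decision("flag", "classifier")
    if risk_score < low:
        # Clearly benign: no need to pay for the judge.
        return Decision("safe", "classifier")
    # Gray zone: escalate to the (stubbed) LLM judge.
    return Decision("flag" if judge() else "safe", "judge")
```

Lowering `low` widens the gray zone, which sends more borderline messages to the judge. That trades latency for recall, in the same spirit as the tuning choice described above.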
03
Shipped · 2026
Gemma Flares: on-device health agent with a CI eval gate & hard safety boundary
A local-first Flutter/iOS app (Gemma via LiteRT-LM) for tracking IBD flare patterns. Deterministic code owns all risk math, routing, and persistence; the on-device model is sandboxed to explaining grounded evidence — it never computes risk or suggests medication. The interesting part is the operational layer: a persona-driven eval gate runs in CI on every PR — it scores each agent turn against safety and RAG-grounding contracts, flags rag_used_when_forbidden violations, and blocks the release if any hard-safety check fails. A runtime benchmark service captures cold-start and P50/P95 on-device generation latency.
persona-eval suite · hard-safety pass · P50/P95 latency · 115 tests · adversarial prompt-injection suite
Flutter · Dart · Gemma · LiteRT-LM · SQLCipher · CI eval gate · Persona suite · Latency benchmarks · Adversarial tests
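A gate of this shape can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the app's actual harness: `TurnResult`, `check_turn`, and `run_gate` are hypothetical names, and the only identifier taken from the description above is the `rag_used_when_forbidden` flag. The sketch assumes that grounding flags are reported while hard-safety failures are what block the release.

```python
from dataclasses import dataclass, field


@dataclass
class TurnResult:
    persona: str
    rag_used: bool
    rag_allowed: bool
    # Names of failed hard-safety checks for this turn (hypothetical labels).
    hard_safety_failures: list = field(default_factory=list)


def check_turn(turn: TurnResult) -> list:
    """Return grounding-contract flags for one agent turn."""
    flags = []
    if turn.rag_used and not turn.rag_allowed:
        flags.append("rag_used_when_forbidden")
    return flags


def run_gate(turns: list) -> tuple:
    """Return (exit_code, report); exit code 1 means the release is blocked."""
    report = []
    blocked = False
    for turn in turns:
        for flag in check_turn(turn):
            report.append(f"{turn.persona}: flag {flag}")
        for failure in turn.hard_safety_failures:
            report.append(f"{turn.persona}: HARD FAIL {failure}")
            blocked = True
    return (1 if blocked else 0), report
```

In CI, the exit code would be passed to `sys.exit` so that a non-zero value fails the pipeline step.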
02
Shipped · 2026
Local RAG Engine: source-grounded retrieval over 20GB+ of PDFs
A retrieval system for document libraries too large to hold in memory. Resumable, idempotent S3 ingestion with multiprocessing and size guards; Qdrant + FastEmbed with MMR retrieval; answers grounded in their sources with an explicit "I don't know" refusal when the context doesn't cover the question. Ships with health-check and vector-store audit scripts so you can see what the index is actually doing.
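For readers unfamiliar with MMR (maximal marginal relevance), here is a library-free sketch of the idea. It is not the engine's retrieval code and does not use Qdrant or FastEmbed; the `mmr` and `cosine` helpers and the `lam` weight are illustrative names. Each step picks the candidate that best balances similarity to the query against similarity to results already chosen.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mmr(query, docs, k, lam=0.5):
    """Return indices of up to k docs chosen by maximal marginal relevance.

    lam=1.0 ranks purely by relevance; lower values penalise redundancy more.
    """
    selected = []
    candidates = list(range(len(docs)))
    while candidates and len(selected) < k:

        def score(i):
            relevance = cosine(query, docs[i])
            # Redundancy is the closest match among already selected docs.
            redundancy = max(
                (cosine(docs[i], docs[j]) for j in selected), default=0.0
            )
            return lam * relevance - (1 - lam) * redundancy

        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected
```

With two near-duplicate chunks and one distinct chunk, a low `lam` returns one of the duplicates plus the distinct chunk, while `lam=1.0` returns both duplicates.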
Momentum: an SMS-first personal finance agent on GCP
Syncs bank transactions through Plaid, classifies spending with Gemini on Vertex AI, and runs budget workflows over Cloud Functions, Cloud Run, and Pub/Sub. Handles the unglamorous part properly: idempotent pending-to-posted transaction state.
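One way to get idempotent pending-to-posted handling is sketched below. It is an in-memory illustration, not Momentum's implementation: the `Ledger` class is hypothetical, and it assumes each record carries `transaction_id`, `pending`, `amount`, and (for posted rows) a `pending_transaction_id` pointing at the pending row it replaces, in the style of Plaid's transaction objects.

```python
class Ledger:
    """Keeps one row per real-world purchase, even if events are replayed."""

    def __init__(self):
        self.transactions = {}   # transaction_id -> latest record
        self.superseded = set()  # pending ids already replaced by a posted row

    def apply(self, txn: dict) -> None:
        txn_id = txn["transaction_id"]
        if txn_id in self.superseded:
            # Stale replay of a pending row whose posted version is stored.
            return
        pending_id = txn.get("pending_transaction_id")
        if not txn["pending"] and pending_id:
            # The posted row replaces its pending predecessor.
            self.transactions.pop(pending_id, None)
            self.superseded.add(pending_id)
        # Keyed write: applying the same event twice leaves the same state.
        self.transactions[txn_id] = txn

    def total(self) -> float:
        return sum(t["amount"] for t in self.transactions.values())
```

Because every write is keyed by id and superseded pending ids are remembered, replaying a webhook batch or receiving events out of order does not double-count spending.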
Currently building and experimenting with Hermes as a multi-agent workflow for real tasks. The focus is practical reliability: planner/executor patterns, tool-call routing, retrieval handoffs, prompt contracts, and tracing so failures are diagnosable and behavior is repeatable.
Ongoing: Hugging Face Agents Course and DeepLearning.AI's Generative AI with LLMs, focused on agent patterns, evaluation, and production LLM architecture. Completed: LangChain for LLM Application Development short project, applied to retrieval chains and tool-enabled workflows.
Heart Disease Prediction: supervised vs. unsupervised, interpretable
Compared supervised and unsupervised methods for binary heart-disease detection on the Cleveland dataset using test accuracy, RMSE, ROC, and runtime. Best overall result was k-NN at 84.85% test accuracy (RMSE 0.3892, ROC 0.9100). My contribution covered Logistic Regression with factor variables, SVM Radial, RF Boosted Tree, RF with Stochastic Gradient Boost, rpart, FFTrees (normal/custom), and FFForest.
Steganalysis: detecting hidden data in digital media
Implemented three image-steganalysis pipelines: K-means, SVM (Gaussian/Linear), and Neural Networks. Using BOSSbase v1.01 + stegosaur, paired cover/stego splits, and 686 SPAM features, the best test result came from Neural Networks (69.40% accuracy, RMSE 0.5531). My contribution focused on image steganalysis and literature review, translating detection methods into clear, practical digital-forensics workflow reasoning.
The same stubbornness that debugs a pipeline at 1am shows up here too. I train hard, I'm in the mountains when I can be, and I cook like it's a build process.
Muay Thai & Boxing · Training with the legend Saenchai
Skiing · Whitewater, Nelson, BC
Hiking · Pedra da Gávea, Brazil
Cooking · My kitchen, Montreal
Always · No camera-roll proof for this one: side projects, papers, repeat.