Where to Start: An FP-Style Engineering Workflow in Python
A TF-IDF vignette for designing around data flow, types, and contracts before code. In my November 9, 2025 post, Functional Programming in Python, I argued that FP is less about adopting a pure functional language and more about adopting a different engineering discipline. The core habits were simple: think in expressions, prefer immutability, isolate side effects, and model your domain explicitly. That line of argument tends to invite a practical question. If a team gives you a vaguely defined task on Monday morning, where does the work begin, and what does “FP-style” look like before the code exists? ...
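The habits the post names (think in expressions, prefer immutability, isolate side effects) can be sketched against its TF-IDF subject. This is a minimal illustration, not code from the post; the function names (`tokenize`, `idf`, `tf_idf`) and the toy corpus are hypothetical.

```python
import math
from collections import Counter

def tokenize(doc: str) -> tuple[str, ...]:
    # Pure function: same input, same output; returns an immutable tuple.
    return tuple(doc.lower().split())

def idf(corpus: tuple[tuple[str, ...], ...]) -> dict[str, float]:
    # Document frequency over the corpus; no mutation of inputs.
    n = len(corpus)
    df = Counter(term for doc in corpus for term in set(doc))
    return {t: math.log(n / df[t]) for t in df}

def tf_idf(doc: tuple[str, ...], idf_map: dict[str, float]) -> dict[str, float]:
    # Term frequency weighted by inverse document frequency.
    tf = Counter(doc)
    return {t: (tf[t] / len(doc)) * idf_map.get(t, 0.0) for t in tf}

# The whole pipeline is an expression over immutable values; any I/O
# (reading files, printing) would live outside these functions.
corpus = tuple(tokenize(d) for d in ("the cat sat", "the dog sat", "the cat ran"))
weights = tf_idf(corpus[0], idf(corpus))
```

Because every step is a pure function over immutable data, each stage can be tested in isolation and the side-effecting edges (file reading, logging) stay at the boundary of the program.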
From Tool-Calling Loops to Repository Contracts: Why Meta Harness Was Born
Coding agents can write code, but reliable multi-step work depends on the harness around the model. Source: r/ClaudeCode

The agent looked competent until the workflow got longer

Picture a small infrastructure team trying to standardize how it uses a coding agent across a growing repository. At first the experience is encouraging. Someone asks the agent to add a CLI flag, wire a metrics endpoint, update a test, and explain the change. The answer is fast, the patch is plausible, and the team starts to feel that a large part of everyday engineering has become compressible. ...
The Claude Code Leak: What the Source Code Actually Reveals
The leak showed how quickly code can be copied once AI turns appropriation into ordinary workflow.

A proprietary codebase became public infrastructure before breakfast

At 4 AM on March 31, 2026, a developer in Seoul woke up to a phone full of notifications. Claude Code’s source had leaked. Within hours the code was mirrored across GitHub, screenshotted across social feeds, mocked as “vibe-coded garbage,” and ported into new repositories by people who had not worked at Anthropic the day before. By the time many engineers on the U.S. East Coast opened Slack, the incident had already passed through several distinct phases: exposure, ridicule, cloning, and legal containment. ...
How to Build a Good NLP Baseline When Small Cues Matter
Good NLP starts with reading the text: define the task, keep meaning-bearing words, then build simple, inspectable baselines.

Some Text Problems Punish Shallow Reading

Suppose you need to flag notes that describe active chest pain right now. The task sounds easy until the notes arrive.

- Reports chest pain x2d, worse on exertion.
- Denies chest pain; shortness of breath improved.
- Family history of CAD. No active chest pain today.
- Chest pain improved after nitroglycerin. Cannot rule out infection.
- No known drug allergies. SOB with exertion. Family hx diabetes.

These fragments share vocabulary, but not meaning. Some describe active symptoms. Some are negated, historical, or about family history. Some express uncertainty. Some hide ordinary clinical meaning inside abbreviations. A model that treats them as ordinary word overlap will make confident mistakes fast. ...
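The failure mode the teaser describes is easy to demonstrate: a naive bag-of-words overlap scores a negated note as highly similar to an active-symptom note. This is a hedged sketch, not code from the post; the helper name `bow_overlap` is hypothetical.

```python
from collections import Counter

def bow_overlap(a: str, b: str) -> float:
    """Fraction of shared tokens between two notes (naive bag-of-words)."""
    ta, tb = Counter(a.lower().split()), Counter(b.lower().split())
    shared = sum((ta & tb).values())  # multiset intersection of token counts
    return shared / max(sum(ta.values()), sum(tb.values()))

active = "reports chest pain worse on exertion"
negated = "denies chest pain today"

# ≈ 0.33: high lexical overlap despite opposite clinical meaning,
# because "denies" carries the negation but only counts as one token.
print(bow_overlap(active, negated))
```

This is exactly why the post argues for reading the text first: negation, history, and uncertainty cues have to be modeled explicitly, not averaged away in word counts.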
Spec vs. Code in the Age of AI Agents: Lessons from Symphony and CodeSpeak
Spec-driven development is older than AI, but agent workflows make the boundary between design artifact and executable logic newly important. (Source: “A very comprehensive and precise spec” from CommitStrip)

An Older Engineering Habit Under New Pressure

Picture a platform team reviewing an internal agent workflow for customer support. They are debating a scheduler that fans out work across multiple agents, retries failed steps, persists thread state, and hands off certain actions to a ticketing extension. One engineer wants to write the logic directly. Another wants a design document first. A third wants an RFC with interface contracts and failure semantics before anyone touches the service loop. ...
The Narrowing Entrance: What Anthropic's Report Actually Means
While the headlines scream of mass layoffs, the data suggests a more subtle and structural shift: the ladder is being pulled up from the bottom. Source: https://www.anthropic.com/research/labor-market-impacts (Figure 2)

The Ghost in the Unemployment Statistics

“AI is coming for your job.” It’s the refrain of every tech influencer and doom-scrolling headline. Two days ago, a report dropped that seemed to confirm the worst. Anthropic, one of the leading labs behind the very tech in question, released Labor Market Impacts of AI: A New Measure and Early Evidence. ...
Introduction to aTPR: Correcting for Risk Heterogeneity in Fairness Evaluation
We disentangle population risk from model-specific performance when subgroups enter with different risk profiles. (Core idea of the adjusted TPR; some image components generated by NanoBanana 2) In part 1, we saw that a raw true-positive-rate gap can be a mirror of baseline risk differences, not just model behavior. In part 2, we make that distinction operational and show how to measure fairness once those risk differences are held constant. ...
Equal Opportunity Under the Microscope: Why Fairness Evaluation Needs Risk Awareness
We examine whether and how forcing equal outcomes in healthcare algorithms can overlook the reality of baseline risk and demographic diversity. Clipart generated by Google NanoBanana 2

A Familiar Evaluation Result — and an Uncomfortable Question

Let’s begin with a short vignette. A healthcare analytics team is reviewing a mortality risk prediction model used to trigger palliative care consultations. The model has been deployed for several months. As part of routine governance, the team evaluates fairness across demographic groups. ...
Why NLP Still Matters in the Age of AI Agents
A language-first view of modern AI systems (2026) (source: Companies Bring AI Agents to Healthcare (WSJ))

A system that sounds simple—until it isn’t

Imagine a health system designing a conversational AI service for telemedicine. Patients describe symptoms, concerns, and fragments of medical history in free text. The system responds conversationally, drawing on prior encounters, internal documentation, and clinical guidelines. It answers routine questions, summarizes relevant context, and—when appropriate—routes cases to clinicians. ...
From Vague to Vivid
Opening

Happy new year! In my previous post, My Winter NLP Journey, I wrote about motivation: why I wanted to build an old model like GPT-2 and why implementation feels like the fastest path to understanding. I am glad to report I completed the journey as planned. This post is a reflection on what actually changed in my head after working through the course on my own. The biggest gain was simple to name but hard to achieve: several ideas that were vague for years finally became clear. ...