Debug & Discover

Why BRR Works: Deep Dive Into Hadamard Matrix

Source: https://mathworld.wolfram.com/HadamardMatrix.html Going Deeper: The Mathematical Engine Behind BRR In our previous posts on Balanced Repeated Replication (BRR) and Fay’s method, we explored how these techniques solve the variance estimation problem in complex surveys. We saw the elegant result: create a set of replicate weights, recompute your statistics using each replicate, and combine the results to get variance estimates that properly account for the survey design. But we left something crucial as a black box: how exactly are these replicates constructed? We said “use a Hadamard matrix” and moved on, focusing instead on the weighting schemes and the variance formulas. For many practitioners, that’s sufficient – major survey data providers like the Medicare Current Beneficiary Survey (MCBS), NHANES, and many state-level behavioral health surveys provide pre-computed replicate weights in their public use files. You load the data, use the supplied weights, trust the mathematics, get your standard errors. ...
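To make the "create replicates, recompute, combine" recipe above concrete, here is a minimal sketch in Python. It assumes the textbook BRR setting of exactly two PSUs per stratum and a Hadamard matrix built with the Sylvester doubling construction, so the order must be a power of two. The function names (`sylvester_hadamard`, `brr_variance`) and the rule that +1 keeps the first PSU are illustrative assumptions, not code from the post.

```python
def sylvester_hadamard(order):
    """Build a Hadamard matrix by Sylvester doubling (order must be a power of two)."""
    if order < 1 or order & (order - 1):
        raise ValueError("order must be a power of two")
    h = [[1]]
    while len(h) < order:
        # Doubling step: [[H, H], [H, -H]]
        h = [row + row for row in h] + [row + [-x for x in row] for row in h]
    return h


def brr_variance(values, weights, n_replicates):
    """Estimate a weighted mean and its BRR variance.

    values[h] and weights[h] are pairs (PSU 0, PSU 1) for stratum h.
    Assumption: one observation per PSU, which keeps the sketch short.
    """
    n_strata = len(values)
    if n_replicates <= n_strata:
        raise ValueError("need more replicates than strata")
    had = sylvester_hadamard(n_replicates)

    def weighted_mean(w):
        num = sum(w[h][i] * values[h][i] for h in range(n_strata) for i in range(2))
        den = sum(w[h][i] for h in range(n_strata) for i in range(2))
        return num / den

    full = weighted_mean(weights)
    squared_dev = 0.0
    for row in had:
        replicate = []
        for h in range(n_strata):
            w0, w1 = weights[h]
            # Skip column 0 (all ones); +1 keeps PSU 0, -1 keeps PSU 1.
            # The kept PSU has its weight doubled, the other is zeroed.
            replicate.append((2 * w0, 0.0) if row[h + 1] == 1 else (0.0, 2 * w1))
        squared_dev += (weighted_mean(replicate) - full) ** 2
    return full, squared_dev / n_replicates


if __name__ == "__main__":
    values = [(1.0, 3.0), (2.0, 6.0), (5.0, 5.0)]
    weights = [(1.0, 1.0)] * 3
    estimate, variance = brr_variance(values, weights, n_replicates=4)
    print(estimate, variance)
```

The orthogonality of the Hadamard columns is what makes the half-samples "balanced": cross-stratum terms cancel when the squared deviations are averaged, so a small number of replicates reproduces what would otherwise require every possible half-sample.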

How I Learned Monads: Not Through Haskell But Through Rust

I approached learning monads in Haskell wrong and failed. Then I discovered I’d been using them in Rust all along without knowing. Introduction About a decade ago, I tried to learn Haskell. I was mesmerized by its elegance – the way types guided you toward correct programs, how pure functions composed so naturally, the terseness that still remained readable. I worked through A Gentle Introduction to Haskell, and everything made sense until I hit the chapter on monads. ...

Building a Modern MICE Imputer: Episode 1

Source: Imputation by Chained Equations (MICE): Bridging the Gap in Incomplete Data Analysis The Missing Data Challenge in Research Missing data is the silent saboteur of data-driven research. Whether you’re analyzing electronic health records, survey responses, or sensor measurements, incomplete observations are not the exception – they’re the rule. A recent systematic review of clinical studies found that over 80% of randomized controlled trials report some form of missing data, with missingness rates often exceeding 20% for key variables. ...

Beyond Bootstrap: Why Survey Data Demands Specialized Variance Estimation

Understanding the statistical machinery behind survey analysis tools and implementing proper variance estimation in general-purpose languages. Disclaimers: (1) The opening story below is hypothetical but reflects the author’s own experience with survey data analysis. (2) Code snippets in this article are for illustrative purposes rather than production-ready implementations. Source: Zinn, S. (2016). Variance Estimation with Balanced Repeated Replication: An Application to the Fifth and Ninth Grader Cohort Samples of the National Educational Panel Study. In: Blossfeld, HP., von Maurice, J., Bayer, M., Skopek, J. (eds) Methodological Issues of Longitudinal Surveys. Springer VS, Wiesbaden. https://doi.org/10.1007/978-3-658-11994-2_4 ...

Beyond Performance: Why Polars Represents a Paradigm Shift from pandas

(source: Pandas vs Polars: Is It Time to Rethink Python’s Trusted DataFrame Library?) Introduction After four years of writing production data science code with pandas, I thought I understood data manipulation in Python. I had memorized the subtle differences between .apply(), .transform(), and .agg(). I knew when to use .loc[] versus .iloc[], when to chain methods versus create intermediate variables, and how to navigate the maze of groupby operations that seemed to change behavior depending on context. ...

Why Memory-Safe Languages Matter: Lessons from C, Python, and Rust

C, Python, and Rust take different paths to memory safety. Here’s why that difference defines the future of secure coding. (Source: ROPdefender: A detection tool to defend against return-oriented programming attacks) In recent years, the U.S. government has made unprecedented recommendations for the adoption of memory-safe programming languages across critical infrastructure and government systems. The National Security Agency, CISA, and other agencies have explicitly called for migration away from C and C++ toward languages like Rust, citing memory safety vulnerabilities as a primary attack vector in modern cyber warfare. ...

Announcement: Rust Security Bootcamp!

(source: https://analyticsindiamag.com/ai-features/rust-provides-the-ultimate-security-against-hackers/) A Rustacean’s journey through security vulnerabilities and a personal experiment in reimagining how we might learn cybersecurity. Disclaimer: I am NOT a security expert. I’d rather position myself as an enthusiastic practitioner. Therefore, please take any strong claims I make with a grain of salt, and ideally double-check them against credible sources. The Cathedral and the Bazaar of Broken Code Picture this: It’s 2025, and yet another critical vulnerability has just been disclosed. The root cause? A buffer overflow in C code written by brilliant engineers at a top-tier company. Sound familiar? If you’ve been in the security space for more than five minutes, this scenario probably triggers a weary sense of déjà vu. ...

AOC 2024 Postmortem: What I Actually Learned Throughout 50 Stars

“It’s fine, I’ll just implement a quick DFS and call it a day.” — Me, 30 minutes before discovering that Day 16 required bidirectional search and completely restructuring my approach for the third time. The Setup: Another July Adventure I have a confession: I started Advent of Code 2024 in late July, just like I did with AOC 2023 (which I tackled in August 2024). There’s something liberating about approaching these puzzles without the December time pressure, when the rest of the world is frantically debugging race conditions while battling holiday shopping deadlines. ...

Rust Handshake Challenge #4: The Postmortem

Where we look back at our journey from simple greeting exchanges to high-performance async servers – and ask the hard questions about learning, complexity, and whether it was all worth it. Source: https://moslehian.com/posts/2023/1-intro-async-rust-tokio/ The Journey’s End Three weeks ago, I embarked on what seemed like a simple Friday night project: implement a 3-way handshake protocol in Rust. What started as curiosity about network programming became an odyssey through the depths of concurrent systems design. ...

Rust Handshake Challenge #3: Event-Driven Handshaking

Where our thread pool server meets its match, and I discover that sometimes the best way to handle thousands of connections is to pretend threads don’t exist. (Source: https://berb.github.io/diploma-thesis/community/042_serverarch.html) The Tantalizing Promise Fresh from the victory of my thread pool implementation, I felt invincible. My server could handle hundreds of concurrent connections with predictable resource usage. Thread creation was under control, context switching was manageable, and memory consumption was reasonable. ...