Working Toward Robustness (F# c-MLE Project Part 2)

How a lucky random number sequence hid a huge error until I tested on a different platform, and what it taught me about production numerical code. A few days ago, I wrote about implementing c-MLE in F# as my first substantial project in the language. The code worked. Tests passed. The optimizer converged to reasonable parameter estimates with constraint violations driven to machine precision. Then, on a whim, I tested it on my M1 Max. Same code. Same seed. Same data. The estimated parameter was off by 38%. ...

November 23, 2025 · 13 min · Sae-Hwan Park

My First F# Project: Implementing Constrained Optimization from Scratch

One week ago, I wrote about why I’m learning F#, a language I may never use professionally but one I believe will change how I think in the languages I do use. The thesis was simple: F# might be the sweet spot for researchers who need readable, concise, and safe code without paying Rust’s memory-management tax or Haskell’s conceptual overhead. That was little more than loosely informed speculation. This is what happened when I tried to prove it. ...

November 21, 2025 · 18 min · Sae-Hwan Park

Why I'm Learning F# (And Why It May Matter For You Data Scientists)

There’s something strange about learning a programming language you may never use professionally. When I tell people I’m learning F#, the responses are almost predictable: Are you switching careers? Is your team moving to .NET? No. I still work in population and behavioral health research as a data scientist and numerical programmer, where Python dominates and Rust handles the performance-critical parts of our pipeline. F# is unlikely to appear in production systems at my job. ...

November 14, 2025 · 9 min · Sae-Hwan Park

Functional Programming in Python

I confess: I gave up on Haskell about ten years ago. It wasn’t for lack of trying. I had spent months wrestling with type classes, drowning in monad transformers, and debugging cryptic compiler errors that felt more like philosophical riddles than helpful feedback. The promise of pure functional programming (FP) was intoxicating – bulletproof correctness, elegant abstractions, programs that composed like mathematical proofs. But the reality was different. The learning curve was steep, the tooling was sparse, and most importantly, I couldn’t use it professionally. Try convincing a team to rewrite a production system in Haskell. Try hiring engineers who know it. Try getting management approval for a language most people have never heard of. ...

November 8, 2025 · 17 min · Sae-Hwan Park

Why BRR Works: Deep Dive Into Hadamard Matrix

Source: https://mathworld.wolfram.com/HadamardMatrix.html Going Deeper: The Mathematical Engine Behind BRR In our previous posts on Balanced Repeated Replication (BRR) and Fay’s method, we explored how these techniques solve the variance estimation problem in complex surveys. We saw the elegant result: create a set of replicate weights, recompute your statistics using each replicate, and combine the results to get variance estimates that properly account for the survey design. But we left something crucial as a black box: how exactly are these replicates constructed? We said “use a Hadamard matrix” and moved on, focusing instead on the weighting schemes and the variance formulas. For many practitioners, that’s sufficient – major survey data providers like the Medicare Current Beneficiary Survey (MCBS), NHANES, and many state-level behavioral health surveys provide pre-computed replicate weights in their public use files. You load the data, use the supplied weights, trust the mathematics, get your standard errors. ...

October 31, 2025 · 30 min · Sae-Hwan Park

How I Learned Monads: Not Through Haskell But Through Rust

I approached learning monads in Haskell the wrong way and failed. Then I discovered I’d been using them in Rust all along without knowing it. Introduction About a decade ago, I tried to learn Haskell. I was mesmerized by its elegance – the way types guided you toward correct programs, how pure functions composed so naturally, the terseness that still remained readable. I worked through A Gentle Introduction to Haskell, and everything made sense until I hit the chapter on monads. ...

October 25, 2025 · 17 min · Sae-Hwan Park

Building a Modern MICE Imputer: Episode 1

Source: Imputation by Chained Equations (MICE): Bridging the Gap in Incomplete Data Analysis The Missing Data Challenge in Research Missing data is the silent saboteur of data-driven research. Whether you’re analyzing electronic health records, survey responses, or sensor measurements, incomplete observations are not the exception – they’re the rule. A recent systematic review of clinical studies found that over 80% of randomized controlled trials report some form of missing data, with missingness rates often exceeding 20% for key variables. ...

September 19, 2025 · 13 min · Sae-Hwan Park

Beyond Bootstrap: Why Survey Data Demands Specialized Variance Estimation

Understanding the statistical machinery behind survey analysis tools and implementing proper variance estimation in general-purpose languages. Disclaimers: The opening story below is hypothetical but reflects the author’s own experience with survey data analysis. Code snippets in this article are for illustrative purposes, rather than production-ready implementations. Source: Zinn, S. (2016). Variance Estimation with Balanced Repeated Replication: An Application to the Fifth and Ninth Grader Cohort Samples of the National Educational Panel Study. In: Blossfeld, HP., von Maurice, J., Bayer, M., Skopek, J. (eds) Methodological Issues of Longitudinal Surveys. Springer VS, Wiesbaden. https://doi.org/10.1007/978-3-658-11994-2_4 ...

September 14, 2025 · 21 min · Sae-Hwan Park

Beyond Performance: Why Polars Represents a Paradigm Shift from pandas

(source: Pandas vs Polars: Is It Time to Rethink Python’s Trusted DataFrame Library?) Introduction After four years of writing production data science code with pandas, I thought I understood data manipulation in Python. I had memorized the subtle differences between .apply(), .transform(), and .agg(). I knew when to use .loc[] versus .iloc[], when to chain methods versus create intermediate variables, and how to navigate the maze of groupby operations that seemed to change behavior depending on context. ...

September 6, 2025 · 24 min · Sae-Hwan Park

Why Memory-Safe Languages Matter: Lessons from C, Python, and Rust

C, Python, and Rust take different paths to memory safety. Here’s why that difference defines the future of secure coding. (Source: ROPdefender: A detection tool to defend against return-oriented programming attacks) In recent years, the U.S. government has made unprecedented recommendations for the adoption of memory-safe programming languages across critical infrastructure and government systems. The National Security Agency, CISA, and other agencies have explicitly called for migration away from C and C++ toward languages like Rust, citing memory safety vulnerabilities as a primary attack vector in modern cyber warfare. ...

August 29, 2025 · 17 min · Sae-Hwan Park