Quest for Better Blogging: #1 Why I'm Evolving Beyond Medium

A personal journey from platform acceptance to technical rebellion. The Unquestioned Years. A decade ago, I lived in blissful ignorance of better alternatives. Windows was my operating system, Microsoft Office was my document suite, and whatever the blog platform provided was my publishing tool. I wrote movie reviews, game critiques, and random thoughts about life, mostly text with occasional images. The web editors worked fine because my needs were simple. ...

July 4, 2025 · 10 min · Sae-Hwan Park

The Other Path: Population-Average Models and GEE

When we need population-level insights without the distributional baggage of random effects. Picture this: We’ve just finished implementing that elegant mixed effects model from last week’s deep dive, and the hospital executives are thrilled with our patient length-of-stay predictions. But then the epidemiologist on our team asks a different question: “What’s the average effect of this new treatment protocol across all our hospitals?” Our mixed effects model gives us subject-specific predictions: how much longer will this particular patient stay at this specific hospital? The policy question requires population-average inference: if we implement this protocol system-wide, what’s the expected change in average length of stay across the entire health system? ...

June 27, 2025 · 18 min · Sae-Hwan Park

Inside Mixed Effect Model

When your neural network treats every hospital the same, but your statistician’s intuition screams that it shouldn’t. Picture this: You’re building a model to predict patient length of stay across 200 hospitals. Your gradient boosting model achieves impressive metrics on your test set, but something feels off. Hospital A consistently shows longer stays than predicted, while Hospital B always runs shorter. Your model treats every hospital identically, missing systematic patterns that could unlock better predictions and deeper insights. ...

June 20, 2025 · 12 min · Sae-Hwan Park

How Factor Analysis and PCA Actually Differ

A deeper look beyond the textbook distinction that’s been confusing practitioners for decades. The Problem We All Face. Picture this: You’re sitting in a committee meeting, and someone suggests using “factor analysis to reduce dimensions” while another colleague insists “PCA will identify the underlying factors.” Both sound reasonable. Both seem to accomplish similar goals. Yet something feels… off. If you’ve found yourself nodding along while internally questioning whether these methods are really as different as your statistics textbook claimed, you’re not alone. The standard explanation, “FA finds latent factors, PCA reduces dimensions,” is technically correct but practically incomplete. It’s like saying “cars move people, planes fly people”: true, but missing the nuanced reality of when and why you’d choose one over the other. Choosing the wrong method can lead to misleading conclusions about underlying mechanisms or inefficient models that fail in production. ...

June 13, 2025 · 10 min · Sae-Hwan Park

A Second Look at Linux: Reflections from 2025

On operating systems, evolution, and the gradual convergence of computing environments. The Return. Ten years ago, I attempted to make Linux my daily driver and failed. Not catastrophically; it was more like learning to ride a bicycle and discovering that while technically possible, the experience required more effort than I was prepared to invest. My motivation wasn’t purely practical; I secretly aspired to be a Unix user because they looked like the hackers and gurus I wanted to become someday. There was an undeniable appeal to the idea of mastering a system that seemed to separate the technically sophisticated from ordinary computer users. ...

May 30, 2025 · 13 min · Sae-Hwan Park

Writing Maintainable Array Code: When NumPy Isn't Enough

Picture this: You’re implementing a complex neural network attention mechanism, and what should be elegant mathematical operations have devolved into a maze of None indexing, cryptic axis parameters, and debugging sessions that last longer than your coffee stays warm. If you’ve been there, you’re not alone. I recently read an article titled “I don’t like NumPy” that articulated some frustrations many of us have experienced when working with multi-dimensional arrays in Python. The author makes compelling points about the cognitive overhead of NumPy’s design choices, particularly when dealing with operations across multiple dimensions. ...

May 23, 2025 · 15 min · Sae-Hwan Park

How I Leveraged C Learning to Understand Rust Better

When I first encountered Rust after years of Python experience, I thought I understood it. “Ok, Rust’s way is interesting,” I told myself, nodding along to the Book’s explanations of ownership. The compiler errors were frustrating, but I got my code working eventually. I believed I had grasped the concepts. I was wrong. It wasn’t until I stepped away to learn C and systems programming that I realized how superficial my understanding had been. Only when I could visualize memory operations – seeing exactly what happened in the stack, heap, and global memory – did Rust’s ownership system transform from a set of arbitrary rules into a coherent mental model. ...

May 17, 2025 · 14 min · Sae-Hwan Park

The Linux-Windows Bridge I Wish I'd Discovered Years Ago

I built my own Linux environment on Windows using WSL2 (and you should, too). As someone deeply immersed in AI and ML development, I’ve often found myself juggling multiple computing environments. My workday typically involves switching between a Windows laptop at work, my personal Mac Studio (along with one Windows desktop) at home, and SSH connections to remote computing clusters for intensive training jobs or for working on data files that cannot be moved (restricted by a data use agreement, or DUA). This fragmentation created friction in my workflow that I was eager to solve. ...

May 9, 2025 · 16 min · Sae-Hwan Park

The Quest for Heterogeneity: Understanding Conditional Average Treatment Effects (CATE)

We unmask heterogeneity, finding out how CATE learners help target interventions to those who will benefit most. The Journey Beyond Average Effects. In the vast landscape of causal inference, we’ve long relied on a simple compass: the Average Treatment Effect (ATE). Like ancient mariners navigating by a single star, researchers across disciplines have used this average to guide important decisions. But what if I told you that this single metric, this lone star, only reveals a fraction of the story? ...

May 3, 2025 · 23 min · Sae-Hwan Park

The Dimensional Odyssey: Navigating the Manifolds of t-SNE and UMAP

Prologue: The Curse of Dimensionality. Imagine yourself as an explorer in a vast, multidimensional wilderness. Each step you take propels you along one of hundreds, perhaps thousands of different dimensions. The terrain stretches beyond what your mind can comprehend: a hyperdimensional landscape where traditional notions of distance and proximity lose their intuitive meaning. This is the world of high-dimensional data, a realm where our human perceptual limitations become painfully apparent. ...

April 18, 2025 · 25 min · Sae-Hwan Park