How Haskell confused me, Rust prepared me, F# made the idea click, and Python made me build the missing piece for data and ML workflows.

A nightly feature pipeline fails while we are sleeping.
Imagine a common failure case. A source table arrived late. One column has a new nullable value. A model metadata file is missing the training window. The pipeline is supposed to handle these cases because they are ordinary failures in data work. They are not developer mistakes in the same category as an index error or a broken invariant.
Let me illustrate what the first version of the job probably looked like:
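A minimal sketch of what that first version can look like. The step names (load_config, load_cohort, and so on) are hypothetical stand-ins, defined inline here so the sketch runs:

```python
# Hypothetical step functions; real versions would hit storage and services.
def load_config(run_date):
    return {"run_date": run_date, "table": "features"}

def load_cohort(config):
    return ["u1", "u2"]

def load_events(config, cohort):
    return [{"user": u, "clicks": 3} for u in cohort]

def build_features(events):
    return [{"user": e["user"], "clicks_x2": e["clicks"] * 2} for e in events]

def write_feature_table(features):
    return f"wrote {len(features)} rows"

# Happy-path version: every step is simply assumed to succeed.
def run_nightly_pipeline(run_date):
    config = load_config(run_date)
    cohort = load_cohort(config)
    events = load_events(config, cohort)
    features = build_features(events)
    return write_feature_table(features)
```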
Then production reality added branches after every step:
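A sketch of the same pipeline after defensive branches accumulate. The failures are simulated inline, and each step uses a different failure dialect on purpose:

```python
# The same pipeline after production reality: every step grows a branch.
def load_config(run_date):
    return {"run_date": run_date} if run_date else None   # None = missing file

def load_cohort(config):
    raise RuntimeError("source table arrived late")        # simulated failure

def load_events(config, cohort):
    return []

def run_nightly_pipeline(run_date):
    config = load_config(run_date)
    if config is None:                       # convention 1: a None check
        return "failed: missing config"
    try:
        cohort = load_cohort(config)         # convention 2: an exception
    except RuntimeError as exc:
        return f"failed: {exc}"
    events = load_events(config, cohort)
    if not events:                           # convention 3: an empty sentinel
        return "failed: no events"
    return "ok"
```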
While it didn't become conceptually harder, the failure convention became louder than the data workflow.
That is the practical reason I care about railway-oriented programming, computation expressions, and monads. They are ways to keep the shape of a workflow intact while making expected failure explicit.
Haskell introduced the abstraction before I had the scar tissue
My first exposure to monads was Haskell a decade ago, and Haskell made them feel like something to survive.
I could follow small examples. Maybe meant a value may be absent. Either meant a computation may return a value or an error. IO meant the computation interacted with the outside world. I could read a do block with enough patience.
Let’s take a look at what shape a Haskell-flavored data workflow may have:
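A sketch of such a workflow in the Either monad, with made-up types and steps (loadCohort, buildFeatures) standing in for real loaders:

```haskell
-- A Haskell-flavored sketch with toy types; the names are illustrative.
data PipelineError = MissingTable String | BadSchema String
  deriving Show

loadCohort :: Bool -> Either PipelineError [Int]
loadCohort ok = if ok then Right [1, 2, 3] else Left (MissingTable "cohort")

buildFeatures :: [Int] -> Either PipelineError Int
buildFeatures xs = Right (length xs)

-- Each line unwraps a Right value; the first Left stops the whole block.
runPipeline :: Bool -> Either PipelineError Int
runPipeline ok = do
  cohort   <- loadCohort ok
  features <- buildFeatures cohort
  return features
```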
Each line unwraps a successful value and gives it a name. If a step returns the failure case for that monad, the block follows the monad’s rule. For Maybe, Nothing stops the computation. For Either error, Left error stops it.
The abstraction was useful, but I met it while also learning type classes, unfamiliar syntax, compiler errors, and functional vocabulary. At that point, “monad” felt less like a workflow tool and more like a password.
The formal shape did not help much at first:
$$\mathbb{M}(\mathcal{A})$$
$$\mathcal{A} \rightarrow \mathbb{M}(\mathcal{A})$$
$$\mathbb{M}(\mathcal{A}) \rightarrow (\mathcal{A} \rightarrow \mathbb{M}(\mathcal{B})) \rightarrow \mathbb{M}(\mathcal{B})$$
A value sits inside some context $\mathbb{M}$. A normal value can be lifted into that context. A contextual value can be sequenced with a function that returns another contextual value. That sequencing operation is usually called bind.
That definition is compact, but it is not how the idea became useful to me. It became useful only after I had seen the same workflow problem in enough places: load a thing, unwrap it if it worked, stop if it failed, and keep the error path visible.
Haskell’s do notation was also my first encounter with what F# later made practical for me: a block where the language handles the sequencing rule. I did not understand it that way at the time, but the genealogy was there.
Still, Haskell left behind a useful suspicion: if every line in a workflow repeats the same error handling, maybe the caller is doing work the language or library could do once.
For a data team, that suspicion matters. If load_config, load_cohort, load_events, and write_feature_table all share a failure convention, the orchestration function should not manually rediscover that convention after every call.
Rust gave me the shape without the vocabulary fight
Rust made the idea feel practical before I had fully named it.
Result<T, E> and Option<T> encode two common facts: a computation may fail with an error, and a value may be absent. The ? operator makes the sequencing rule readable:
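A self-contained sketch of that sequencing rule; the pipeline types and step names here are illustrative, not from a real crate:

```rust
// Hypothetical pipeline error type; the names are illustrative.
#[derive(Debug, PartialEq)]
enum PipelineError {
    MissingTable(String),
}

fn load_cohort(available: bool) -> Result<Vec<u32>, PipelineError> {
    if available {
        Ok(vec![1, 2, 3])
    } else {
        Err(PipelineError::MissingTable("cohort".to_string()))
    }
}

fn build_features(cohort: &[u32]) -> Result<usize, PipelineError> {
    Ok(cohort.len())
}

// `?` unwraps an Ok and continues, or returns the Err early from this function.
fn run_pipeline(available: bool) -> Result<usize, PipelineError> {
    let cohort = load_cohort(available)?;
    let n_features = build_features(&cohort)?;
    Ok(n_features)
}
```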
That is railway-oriented programming in everyday systems code. A successful Result unwraps and continues. An error returns early from the function. The type still advertises failure.
The important part is not syntax alone. Rust changed how I read function signatures. Option<Cohort> means absence is expected. Result<Cohort, PipelineError> means the failure has information the caller should handle. A function that can throw anything asks the caller to learn the contract through documentation, tests, or production incidents.
That distinction transfers directly to Python data work. A missing optional feature can be Option. A failed table load can be Result. A corrupted internal object can still raise an exception. The point is to separate domain outcomes from defects.
F# made computation expressions click
F# connected the ideas for me because it gives the abstraction a practical syntax.
A Result pipeline might look like this:
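A sketch with explicit plumbing; loadCohort and buildFeatures are made-up steps:

```fsharp
// Explicit plumbing with Result.bind; the step names are illustrative.
let loadCohort ok =
    if ok then Ok [1; 2; 3] else Error "cohort table missing"

let buildFeatures (cohort: int list) =
    Ok (List.length cohort)

let runPipeline ok =
    loadCohort ok
    |> Result.bind buildFeatures
    |> Result.bind (fun n -> Ok (sprintf "built %d features" n))
```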
This is readable, but repeated Result.bind still exposes the plumbing. Computation expressions move that rule into a builder:
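The same workflow as a computation expression. FsharpCore does not ship a `result` builder, but libraries such as FsToolkit.ErrorHandling provide one like this; the step functions remain made up:

```fsharp
// A result { } computation expression; the builder supplies the bind rule.
open FsToolkit.ErrorHandling

let loadCohort ok =
    if ok then Ok [1; 2; 3] else Error "cohort table missing"

let buildFeatures (cohort: int list) =
    Ok (List.length cohort)

let runPipeline ok = result {
    let! cohort = loadCohort ok          // an Error short-circuits the block here
    let! n = buildFeatures cohort
    return sprintf "built %d features" n
}
```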
The builder decides what let!, return, and return! mean. For a Result builder, let! unwraps Ok value and stops on Error error. For an Option builder, None stops. For a validation builder, the rule can be different: collect independent validation errors instead of stopping at the first one.
This was the click. A computation expression is a programmable block for a repeated sequencing rule.
That matters in data-heavy work because not every failure pattern should behave the same way. If the training dataset cannot be loaded, the pipeline should stop. If a model card has five missing metadata fields, the validation step should report all five. A single abstraction should not flatten those into the same behavior.
Python’s default tools work until the failure path becomes the workflow
Python already has error handling. It has exceptions, None, sentinel values, booleans, tuples, warnings, and custom objects. That flexibility is useful, but it also means a codebase can develop several failure dialects at once.
In machine learning pipelines, this shows up quickly:
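A sketch of four such steps, each with a different failure dialect. All of the names are hypothetical:

```python
# Four steps, four failure dialects; the names are illustrative.
def load_feature_config(path):
    # Dialect 1: returns None when the file is missing.
    return {"window_days": 30} if path else None

def load_training_frame(table):
    # Dialect 2: raises on failure.
    if table != "events":
        raise LookupError(f"unknown table: {table}")
    return [1, 2, 3]

def validate_schema(frame):
    # Dialect 3: returns a bare boolean, losing the reason.
    return len(frame) > 0

def fit_model(frame, config):
    # Dialect 4: assumes its inputs were already validated.
    return sum(frame) * config["window_days"]
```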
Each function may be reasonable in isolation. Together, they are awkward. One function returns None. One raises. One returns a boolean. One assumes the input is valid. The caller has to remember the convention for every step.
The orchestration code starts accumulating defensive glue:
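A sketch of that glue, with the mismatched steps redefined inline so the example is self-contained:

```python
# Orchestration glue that must remember each step's failure dialect.
def load_feature_config(path):
    return {"window_days": 30} if path else None       # None on failure

def load_training_frame(table):
    if table != "events":
        raise LookupError(f"unknown table: {table}")   # raises on failure
    return [1, 2, 3]

def validate_schema(frame):
    return len(frame) > 0                              # bare boolean

def run_training(path, table):
    config = load_feature_config(path)
    if config is None:                                 # check dialect 1
        return None
    try:
        frame = load_training_frame(table)             # check dialect 2
    except LookupError:
        return None
    if not validate_schema(frame):                     # check dialect 3
        return None
    return sum(frame) * config["window_days"]
```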
This is normal Python code. But it is also the kind of normal Python that wears down development ergonomics.
The happy path is no longer the main shape of the function. The failure path is visible, but only as repeated local ceremony. The reader has to inspect each branch to know whether a failure stops the job, gets transformed, gets logged, gets swallowed, or gets re-raised.
Exceptions are still the right tool for many cases. If a database client fails because the network is unavailable, an exception may be appropriate. If an invariant is violated inside a model object, raising is often better than returning a value and hoping the caller notices.
Expected domain outcomes are different. A missing feature table, an invalid cohort definition, a rejected model card, or a metric below threshold is not necessarily an exceptional program state. It is a normal result of running the workflow against real inputs.
None has a similar problem. It is fine for a small local lookup:
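For example, a dictionary lookup where None simply means "not configured":

```python
# Absence as None is fine for a small local lookup.
defaults = {"learning_rate": 0.01, "max_depth": 6}

value = defaults.get("n_estimators")   # None just means "not configured"
if value is None:
    value = 100
```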
It becomes weaker when it crosses function boundaries:
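A sketch of the cross-boundary case, with a hypothetical load_model_card that collapses several distinct failures into one None:

```python
import json
from pathlib import Path

# One None, several possible causes; the caller cannot tell them apart.
def load_model_card(run_id: str):
    path = Path("registry") / f"{run_id}.json"
    if not path.exists():
        return None            # card absent? unknown run id?
    try:
        text = path.read_text()
    except OSError:
        return None            # file unreadable
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return None            # malformed JSON
```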
Why is it None? Was the card absent? Was the file unreadable? Was the JSON malformed? Was the run ID unknown? The type says only that the value is not there. The reason moved somewhere else, if it exists at all.
Railway-oriented programming gives those outcomes a stable shape:
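A minimal Result shape, sketched as two plain dataclasses (libraries like comp-builders or returns offer richer versions):

```python
from dataclasses import dataclass
from typing import Any

@dataclass
class Ok:
    value: Any

@dataclass
class Err:
    error: Any

def load_model_card(run_id: str):
    cards = {"run-42": {"owner": "ml-team"}}
    if run_id not in cards:
        return Err(f"unknown run id: {run_id}")  # the reason travels with the value
    return Ok(cards[run_id])
```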
Now the workflow can be written once:
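One way to write it once is a small generator-based builder. This is a sketch of the idea, not the actual comp-builders API; everything it needs is defined inline:

```python
from dataclasses import dataclass
from typing import Any

@dataclass
class Ok:
    value: Any

@dataclass
class Err:
    error: Any

# Generator-based Result builder: yield a Result, receive the Ok value back,
# or short-circuit the whole block on the first Err.
def result_block(fn):
    def run(*args, **kwargs):
        gen = fn(*args, **kwargs)
        try:
            step = next(gen)
            while True:
                if isinstance(step, Err):
                    gen.close()
                    return step
                step = gen.send(step.value)
        except StopIteration as stop:
            return Ok(stop.value)
    return run

def load_cohort(ok: bool):
    return Ok(["u1", "u2"]) if ok else Err("cohort table missing")

def build_features(cohort):
    return Ok({u: 1.0 for u in cohort})

@result_block
def run_pipeline(ok: bool):
    cohort = yield load_cohort(ok)        # stops here on Err
    features = yield build_features(cohort)
    return len(features)
```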
The ergonomics are better because the developer does not keep retyping the same control-flow rule. Each line says what the pipeline does. The builder supplies the repeated behavior: unwrap Ok, stop on Err, return the final value as Ok.
The gain compounds during maintenance. Adding a new pipeline step is one line if the step returns the same shape:
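With the same sketched builder, extending the workflow is one added yield line. The step functions here are toy stand-ins:

```python
from dataclasses import dataclass
from typing import Any

@dataclass
class Ok:
    value: Any

@dataclass
class Err:
    error: Any

def result_block(fn):
    # Unwrap Ok, short-circuit on Err, wrap the final return value in Ok.
    def run(*args, **kwargs):
        gen = fn(*args, **kwargs)
        try:
            step = next(gen)
            while True:
                if isinstance(step, Err):
                    gen.close()
                    return step
                step = gen.send(step.value)
        except StopIteration as stop:
            return Ok(stop.value)
    return run

# Toy steps standing in for real loaders.
def load_config(run_date):  return Ok({"run_date": run_date})
def load_cohort(config):    return Ok(["u1", "u2"])
def build_features(cohort): return Ok({u: 1.0 for u in cohort})
def write_feature_table(f): return Ok(f"wrote {len(f)} rows")

@result_block
def run_pipeline(run_date):
    config = yield load_config(run_date)
    cohort = yield load_cohort(config)
    features = yield build_features(cohort)
    uri = yield write_feature_table(features)   # the new step: one added line
    return uri
```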
There is no new try...except block to place incorrectly. No new None check to forget. No boolean failure that loses its explanation. The function’s contract stays stable as the workflow grows.
That is the ergonomic advantage I care about most. ROP does not make errors disappear. It makes the expected error path regular enough that I can stop hand-writing it every time.
ROP is the application developer’s entry point
Railway-oriented programming is the easiest part to use first.
A workflow has a success track and a failure track. Each step receives a successful value and returns either a new success or a failure. Once a failure appears, later success-only steps are skipped.
Let me express this in Python-like terms.
Disclaimer: Python's built-in standard library does not support this as of version 3.14; the code snippets in this and the next sections assume a package like comp-builders, which is introduced later in this essay.
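The two-track idea can be sketched with a single bind function; the Result types and step names below are illustrative, not comp-builders' real surface:

```python
from dataclasses import dataclass
from typing import Any, Callable

@dataclass
class Ok:
    value: Any

@dataclass
class Err:
    error: Any

# The railway switch: stay on the success track, or bypass on the failure track.
def bind(result, step: Callable):
    return step(result.value) if isinstance(result, Ok) else result

def parse_window(cfg):
    return Ok(cfg["window"]) if "window" in cfg else Err("missing window")

def check_positive(window):
    return Ok(window) if window > 0 else Err("window must be positive")

def run(cfg):
    result = Ok(cfg)
    result = bind(result, parse_window)
    result = bind(result, check_positive)   # skipped if already Err
    return result
```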
The machinery is small enough to sketch. A Result builder needs the same decision every time a step returns:
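A sketch of that machinery, with the one repeated decision spelled out in comments and a tiny demo function attached:

```python
from dataclasses import dataclass
from typing import Any

@dataclass
class Ok:
    value: Any

@dataclass
class Err:
    error: Any

def result_block(fn):
    def run(*args, **kwargs):
        gen = fn(*args, **kwargs)
        try:
            step = next(gen)
            while True:
                # The one decision, applied at every step:
                if isinstance(step, Err):
                    gen.close()
                    return step              # stop on failure, preserve the error
                step = gen.send(step.value)  # unwrap success, continue
        except StopIteration as stop:
            return Ok(stop.value)            # wrap the final return value
    return run

@result_block
def demo(fail: bool):
    x = yield Ok(1)
    y = yield (Err("boom") if fail else Ok(2))
    return x + y
```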
That is illustrative rather than production-ready code, but it shows the point. The builder is not inventing a new failure policy at each line. It is applying one policy repeatedly: unwrap success, stop on failure, preserve the error value.
For a dependent feature pipeline, that is exactly the behavior I want. If cohort construction fails, event loading should not run. If event loading fails, feature building should not run. The failure should return as a value with enough information for logging, retrying, or reporting.
The practical benefit is not just shorter code. The function body becomes honest about the work it is doing:
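A sketch of such a dependent pipeline, with toy steps whose errors say which stage failed; the builder is the same generator pattern as above, repeated here so the example is self-contained:

```python
from dataclasses import dataclass
from typing import Any

@dataclass
class Ok:
    value: Any

@dataclass
class Err:
    error: Any

def result_block(fn):
    def run(*args, **kwargs):
        gen = fn(*args, **kwargs)
        try:
            step = next(gen)
            while True:
                if isinstance(step, Err):
                    gen.close()
                    return step
                step = gen.send(step.value)
        except StopIteration as stop:
            return Ok(stop.value)
    return run

def build_cohort(have_table):
    return Ok(["u1", "u2"]) if have_table else Err("cohort: source table missing")

def load_events(cohort):
    return Ok([{"user": u, "clicks": 2} for u in cohort])

def build_features(events):
    return Ok({e["user"]: e["clicks"] * 0.5 for e in events})

@result_block
def feature_pipeline(have_table):
    cohort = yield build_cohort(have_table)   # a failure skips everything below
    events = yield load_events(cohort)
    features = yield build_features(events)
    return features
```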
Each yielded function still returns a Result. The block does not erase failure. It removes repeated short-circuit plumbing.
The team-level rule I would use is narrow: use ROP when three or more dependent steps share the same failure shape and the repeated checks are starting to hide the workflow. For a tiny function, ordinary Python is often better.
Validation is not the same workflow as loading data
A fail-fast rule is right for dependent steps. It is wrong for many validation tasks.
Suppose a model registration step checks metadata before publishing a model artifact:
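A fail-fast sketch of that check, with hypothetical field names, which reports only the first problem it finds:

```python
# Fail-fast validation: only the first problem is reported.
def validate_model_card(card: dict):
    if not card.get("owner"):
        return "missing owner"
    window = card.get("training_window")
    if not (isinstance(window, (list, tuple)) and len(window) == 2):
        return "malformed training window"
    if not card.get("intended_use"):
        return "empty intended-use field"
    return None  # None means the card is publishable
```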
If the owner is missing, the training window is malformed, and the intended-use field is empty, I want all three errors. Returning only the first one creates a slow repair loop.
That calls for a validation builder:
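A sketch of the accumulating rule. Rather than a full builder, this version runs every independent check and collects all failures; the Valid and Invalid names are illustrative:

```python
from dataclasses import dataclass
from typing import Any

@dataclass
class Valid:
    value: Any

@dataclass
class Invalid:
    errors: list

# Accumulating rule: every independent check runs; all failures are collected.
def validate(card, checks):
    errors = [msg for check in checks if (msg := check(card)) is not None]
    return Valid(card) if not errors else Invalid(errors)

def check_owner(card):
    return None if card.get("owner") else "missing owner"

def check_training_window(card):
    window = card.get("training_window")
    if isinstance(window, (list, tuple)) and len(window) == 2:
        return None
    return "malformed training window"

def check_intended_use(card):
    return None if card.get("intended_use") else "empty intended-use field"

outcome = validate({}, [check_owner, check_training_window, check_intended_use])
```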
This looks similar to the Result block, but the rule should differ. Independent validation failures should accumulate. A data engineer fixing a malformed config file wants the full list of problems.
That is why these abstractions matter beyond aesthetics. They encode the decision policy. Stop early for dependent effects. Accumulate errors for independent checks. Use absence for optional values. Use async-result sequencing for networked or service-backed work.
PyMonad showed me Python could carry the idea
After F# clicked, I wanted the same development experience in Python. I tried PyMonad and liked that it brought familiar functional abstractions into Python.
The style I associate with PyMonad is explicit functional composition: values are wrapped in monadic containers, and transformations are chained with monadic operations. In practical Python, the code tends to look more expression-oriented than block-oriented.
A PyMonad-style sketch for a feature configuration workflow might look like this:
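A sketch of that style. To stay self-contained, it defines a toy Maybe wrapper inline rather than importing PyMonad, so the real library's class names and methods may differ:

```python
# A toy monadic wrapper standing in for PyMonad's containers.
class Maybe:
    def __init__(self, value, is_just):
        self.value, self.is_just = value, is_just

    def bind(self, fn):
        # Apply fn to a present value; propagate absence unchanged.
        return fn(self.value) if self.is_just else self

def Just(value):
    return Maybe(value, True)

Nothing = Maybe(None, False)

def parse_config(raw):
    return Just({"window": raw["window"]}) if "window" in raw else Nothing

def to_days(cfg):
    return Just(cfg["window"] * 7)

# Expression-oriented: the value stays inside the wrapper the whole time.
result = Just({"window": 4}).bind(parse_config).bind(to_days)
```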
That style has real value. It makes the wrapper visible. It teaches the underlying idea. It is closer to how monads are often explained: a value in context, followed by functions that preserve the context.
I liked that. It gave me a way to experiment with monadic composition in Python instead of only reading about it in Haskell or F#.
But I also felt a mismatch with my day-to-day Python work. Data workflows often become clearer when intermediate values have names. I may need config, cohort, events, features, and metrics in the same function. I may need one local conditional, one log message, or one small derived value between two effectful steps.
A chain can handle that, and PyMonad-style code has ways to stay cleaner than the worst cases: helper functions, smaller composed steps, and in some contexts do-notation-like tools. The mismatch I felt was not that chaining is bad. It was that my workflows often needed named intermediate values inside one readable block.
A deliberately compressed example shows the pressure point:
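A compressed sketch using the same toy wrapper, defined inline again so it runs. No intermediate value gets a name, and one step smuggles two values through as a tuple just to keep the chain alive:

```python
class Maybe:
    def __init__(self, value, is_just):
        self.value, self.is_just = value, is_just

    def bind(self, fn):
        return fn(self.value) if self.is_just else self

def Just(value):
    return Maybe(value, True)

# Deliberately compressed: anonymous lambda arguments instead of named values.
result = (
    Just({"window": 4, "table": "events"})
    .bind(lambda cfg: Just((cfg, cfg["window"] * 7)))          # tuple smuggling
    .bind(lambda pair: Just(f"{pair[0]['table']}:{pair[1]}d"))
)
```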
This is still monadic composition, but it is no longer the development experience I wanted for ordinary Python data and ML orchestration.
My other concern was ownership. For personal and production-adjacent tools, I care about whether the abstraction is small enough for me to maintain, audit, and adapt. PyMonad felt broader and more academically oriented than my recurring use case.
The maintenance risk was another practical issue. I did not want a core workflow dependency whose future I could not reason about. The repository activity made that concern concrete: the most recent commit I found was from 2023, so as of May 2026 the project looks effectively inactive. For a library that would sit directly in my project workflow, that matters.
All things considered, I wanted something small enough to own, adapt, and keep aligned with modern typed Python instead of betting my everyday ergonomics on an abstraction I liked but could not confidently maintain.
comp-builders is the narrower tool I wanted
That led me to build my own helpers: comp-builders.
The payoff is the block I wanted to write in ordinary Python:
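A sketch of that block. The decorator here is an inline stand-in so the example runs; comp-builders' actual decorator names and surface may differ, and the step functions are toys:

```python
from dataclasses import dataclass
from typing import Any

@dataclass
class Ok:
    value: Any

@dataclass
class Err:
    error: Any

# Inline stand-in for a comp-builders-style result builder.
def result_builder(fn):
    def run(*args, **kwargs):
        gen = fn(*args, **kwargs)
        try:
            step = next(gen)
            while True:
                if isinstance(step, Err):
                    gen.close()
                    return step
                step = gen.send(step.value)
        except StopIteration as stop:
            return Ok(stop.value)
    return run

def load_config(path):    return Ok({"window_days": 30})
def load_cohort(config):  return Ok(["u1", "u2", "u3"])
def load_events(cohort):  return Ok([{"user": u} for u in cohort])
def build_features(ev):   return Ok(len(ev))

@result_builder
def nightly_features(path):
    config = yield load_config(path)     # business objects keep their names
    cohort = yield load_cohort(config)
    events = yield load_events(cohort)
    n_features = yield build_features(events)
    return n_features
```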
This gives me the F# feeling in Python. The business objects have names. The ordering is visible. Failure remains part of the return type.
The project is intentionally small: typed Python, no runtime dependencies, and computation builders for Result, Option, AsyncResult, and Validation. The goal is not to make Python look like Haskell. The goal is to make common Python workflows read like workflows again.
The difference from PyMonad is mostly about interface and scope. PyMonad gives a more general monadic programming style. comp-builders focuses on the handful of computation shapes I repeatedly need in everyday code: fail-fast result workflows, optional lookups, async result orchestration, and validation error accumulation.
The same package can express expected absence:
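A sketch of the absence case, with an Option-style builder defined inline (again a stand-in, not the real package API): yielding None short-circuits the whole block to None.

```python
# Option-style sketch: yield an optional value; None short-circuits the block.
def option_block(fn):
    def run(*args, **kwargs):
        gen = fn(*args, **kwargs)
        try:
            step = next(gen)
            while True:
                if step is None:
                    gen.close()
                    return None
                step = gen.send(step)
        except StopIteration as stop:
            return stop.value
    return run

defaults = {"learning_rate": 0.01}

@option_block
def lookup_scaled_rate(overrides):
    rate = yield overrides.get("learning_rate", defaults.get("learning_rate"))
    scale = yield overrides.get("scale")   # absent -> the whole block is None
    return rate * scale
```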
And validation with accumulation:
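And a sketch of the accumulating case, again with inline stand-in types rather than the package's real names. All independent checks run, and every failure survives:

```python
from dataclasses import dataclass
from typing import Any

@dataclass
class Valid:
    value: Any

@dataclass
class Invalid:
    errors: list

# Stand-in for a validation builder: run every check, collect all failures.
def validate_all(subject, checks):
    errors = [msg for check in checks if (msg := check(subject)) is not None]
    return Valid(subject) if not errors else Invalid(errors)

def check_window(cfg):
    return None if cfg.get("window_days", 0) > 0 else "window_days must be positive"

def check_threshold(cfg):
    return None if 0 <= cfg.get("threshold", -1) <= 1 else "threshold must be in [0, 1]"

result = validate_all({"window_days": 0}, [check_window, check_threshold])
```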
That is the contrast I care about:
| comp-builders | PyMonad |
|---|---|
| Workflow abstraction | Algebraic abstraction |
| Operational ergonomics | Mathematical generality |
| “How do I write reliable workflows?” | “How do I model computations functionally?” |
The monad part became ordinary after the workflow mattered
The formal definition that felt intimidating in Haskell became less mysterious once I had a concrete use for it.
For Result, the context is possible failure. For Option, the context is possible absence. For async result, the context includes awaiting and possible failure. For validation, the context includes validity and error accumulation.
The laws still matter when implementing reusable abstractions because they keep refactoring safe. Application code should not need to recite those laws in every review, but the library should behave predictably enough that extracting a helper function does not change the meaning of the workflow.
That is why I no longer think of monads as a Haskell obstacle. I think of them as one explanation for a practical design move: define the context once, then write the workflow in terms of values.
Why this matters for me (and many others)
The coding domain I belong to has many ordinary failure modes:
A source table is missing. A cohort query returns no rows. A feature is absent. A schema check fails. A metric is below a deployment threshold. A model artifact cannot be registered. A service call times out.
Treating all of these as exceptions blurs the contract. Treating all of them as None loses information. Treating each one with a custom ad hoc convention makes orchestration code noisy.
ROP and computation builders give me a way to sort those outcomes:
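A sketch of such a sorting, with toy steps for the outcomes listed above (missing metrics, a metric below threshold, registration) and the same inline generator builder standing in for a real package:

```python
from dataclasses import dataclass
from typing import Any

@dataclass
class Ok:
    value: Any

@dataclass
class Err:
    error: Any

def result_block(fn):
    def run(*args, **kwargs):
        gen = fn(*args, **kwargs)
        try:
            step = next(gen)
            while True:
                if isinstance(step, Err):
                    gen.close()
                    return step
                step = gen.send(step.value)
        except StopIteration as stop:
            return Ok(stop.value)
    return run

def load_metrics(run_id):
    metrics = {"run-42": {"auc": 0.91}}
    return Ok(metrics[run_id]) if run_id in metrics else Err(f"no metrics for {run_id}")

def check_threshold(metrics, minimum=0.9):
    if metrics["auc"] >= minimum:
        return Ok(metrics)
    return Err(f"auc {metrics['auc']} below {minimum}")

def register_artifact(metrics):
    return Ok(f"registered model with auc={metrics['auc']}")

@result_block
def promote_model(run_id):
    metrics = yield load_metrics(run_id)      # expected: metrics may be absent
    passed = yield check_threshold(metrics)   # expected: metric below threshold
    receipt = yield register_artifact(passed) # expected: registry may reject
    return receipt
```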
This function is readable because the sequencing policy is not repeated after every line. It is still honest because every yielded operation returns a type that can fail.
A team using this style would make different review comments. Instead of saying “add more try/except blocks,” reviewers can ask whether the failure is expected, what type represents it, whether the workflow should stop or accumulate errors, and where the final error value is translated into logs, HTTP responses, dashboard events, or retry decisions.
That is a better conversation.
The distinction that changed my code
My path went through Haskell confusion, Rust recognition, F# clarity, and Python adaptation.
Haskell showed me the abstraction too early. Rust gave me Result and Option as daily tools. F# made computation expressions feel like the missing syntax for workflow code. PyMonad showed me that Python could host these ideas. comp-builders became my narrower answer for the code I actually write: data pipelines, ML utilities, validation boundaries, and service orchestration.
The distinction I now care about is whether failure is part of the program’s shape.
When failure is expected, I want it in the return type. When absence is expected, I want it represented explicitly. When checks are independent, I want accumulated validation errors. When a workflow repeats the same sequencing rule, I want that rule encoded once.
ROP, computation expressions, and monads matter because they make that possible without giving up readable application code. For me, the point was never to make Python look like what it isn't. The point was rather to make a workflow say what it does, while still admitting where it can fail.
Even when a job fails while I am away, I want the error path to be as intentional as the happy path. The next morning, the code should tell me which step failed, why it failed, and which parts of the workflow never ran. That is a practical developer experience goal, beyond the matter of paradigm preferences.
Reference
- [1] J. DeLaat, PyMonad (2013), GitHub Repository https://github.com/jasondelaat/pymonad
- [2] S. Park, comp-builders (2026), GitHub Repository https://github.com/SaehwanPark/comp-builders
- [3] S. Wlaschin, Railway Oriented Programming (2014), https://fsharpforfunandprofit.com/rop/
- [4] P. Wadler, Monads for Functional Programming (1995), Advanced Functional Programming, Springer
- [5] D. Syme, A. Granicz and A. Cisternino, Expert F# 4.0 (2015), Apress