
I confess I gave up on Haskell about ten years ago.

It wasn’t for lack of trying. I had spent months wrestling with type classes, drowning in monad transformers, and debugging cryptic compiler errors that felt more like philosophical riddles than helpful feedback. The promise of pure functional programming (FP) was intoxicating – bulletproof correctness, elegant abstractions, programs that composed like mathematical proofs. But the reality was different. The learning curve was steep, the tooling was sparse, and most importantly, I couldn’t use it professionally. Try convincing a team to rewrite a production system in Haskell. Try hiring engineers who know it. Try getting management approval for a language most people have never heard of.

So I moved on. Back to Python, back to the practical world of getting things done.

But recently, something changed. I started learning Rust – not for its memory safety guarantees (though those are nice), but because I needed performance for a computational biology project. What surprised me wasn’t Rust’s borrow checker or zero-cost abstractions. It was how functional Rust felt. Option<T> and Result<T, E> are monads in everything but name. The Iterator trait encourages transformation pipelines. Pattern matching is first-class. Immutability is the default.

Rust isn’t a functional programming language. But it borrows heavily from FP, and suddenly, a decade after abandoning Haskell, I was rediscovering functional concepts – this time, in a language I could actually use in production.

That realization was the spark: if Rust can blend imperative systems programming with FP principles, why can’t Python? In fact, why can’t most contemporary hybrid-paradigm languages?

Something else had stuck with me. Over the past decade, working on large-scale ML systems and population health analytics at Penn Medicine, I kept noticing the same bugs creeping into codebases: unexpected mutations, hidden state, functions that behave differently on Tuesday than they did on Monday. The kind of bugs that pure FP promised to eliminate – not through clever algorithms, but through fundamental design principles.

The Rust experience crystallized something important: FP isn’t about the language – it’s about the thinking. Rust proved that a systems language could adopt FP principles selectively and reap massive benefits. And if Rust could do it, so could Python.

Let me elaborate.


What is FP (or FP-flavor)?

For a long time, I thought FP meant mastering monads, understanding category theory, and writing code that looked like mathematical notation. That’s intimidating. And, honestly, unnecessary for most practical purposes.

After years of reflection, I’ve come to believe that FP is really centered around two core ideas:

1. Expression-Oriented Programming

Everything should return a value. Instead of writing statements that “do things,” write expressions that “evaluate to things.”

# statement-oriented (imperative)
result = None
if condition:
    result = process_a(data)
else:
    result = process_b(data)

# expression-oriented (functional)
result = process_a(data) if condition else process_b(data)

This might seem trivial, but it changes how you think. Statements execute. Expressions evaluate. When everything is an expression, your code becomes a tree of values flowing through transformations, rather than a sequence of commands changing state.
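Python’s comprehensions are the most idiomatic way to push this style further – the whole loop becomes one expression. A minimal sketch:

```python
# statement-oriented: build a list by mutating it step by step
squares = []
for n in range(5):
    if n % 2 == 0:
        squares.append(n * n)

# expression-oriented: the entire computation is a single expression
squares_expr = [n * n for n in range(5) if n % 2 == 0]

assert squares == squares_expr == [0, 4, 16]
```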

2. Immutability-First

Data doesn’t change – you create new data. Functions don’t modify their inputs. They take values, compute new values, and return them.

# mutation-based
def add_item(items, new_item):
    items.append(new_item)  # modifies in place
    return items

# immutability-based
def add_item(items, new_item):
    return (*items, new_item)  # creates new tuple

The immutable version looks slightly more verbose. But it eliminates entire categories of bugs: no accidental modifications, no spooky action at a distance, no “wait, who changed this list?”

Other FP Concepts (Helpful But Not Essential)

Yes, functional programming includes other ideas:

  • Higher-order functions (functions that take or return functions)
  • Function composition (building complex operations from simple ones)
  • Pattern matching (destructuring and branching on data structure)
  • Currying, partial application, recursion schemes, etc.

These are useful tools. But you don’t need to master them to benefit from FP thinking. Start with expressions and immutability. Everything else follows naturally.
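For a taste of the first two, here’s a tiny, self-contained sketch of a higher-order function (the names are illustrative, not from any library):

```python
from typing import Callable

# a higher-order function: takes a function, returns a new function
def twice(f: Callable[[int], int]) -> Callable[[int], int]:
    return lambda x: f(f(x))

def add_three(x: int) -> int:
    return x + 3

add_six = twice(add_three)  # composition: add_three applied twice

assert add_six(10) == 16
```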


Why It Matters (Especially at Scale)

Here’s what I’ve learned building ML systems that process millions of patient records: functional programming makes programs less prone to bugs at scale.

Not “slightly better.” Not “nice to have.” Fundamentally more reliable.

The Problem with Mutable State

Consider a typical data processing pipeline:

class DataProcessor:
    def __init__(self):
        self.buffer = []
        self.processed_count = 0
    
    def process(self, item):
        # apply some transformation
        transformed = self.transform(item)
        self.buffer.append(transformed)
        self.processed_count += 1
        return transformed
    
    def get_results(self):
        return self.buffer

This looks reasonable. But it’s a bug factory:

  • What happens if two threads call process() simultaneously?
  • What if someone calls get_results() and modifies the returned list?
  • What if transform() fails halfway through – is processed_count still accurate?
  • Can you test process() in isolation, or does it depend on the object’s history?

Now multiply this by a team of five developers, each touching different parts of the pipeline, over months of development. Chaos.

The FP Alternative

from dataclasses import dataclass
from typing import Callable

@dataclass(frozen=True)
class ProcessingResult:
    transformed_items: tuple[Item, ...]
    count: int

def process_items(
    items: tuple[Item, ...],
    transform: Callable[[Item], Item]
) -> ProcessingResult:
    transformed = tuple(map(transform, items))
    return ProcessingResult(
        transformed_items=transformed,
        count=len(transformed)
    )

This version:

  • Cannot have race conditions – no shared mutable state
  • Cannot be corrupted – frozen dataclass prevents modification
  • Is trivially testable – pure function, same input always gives same output
  • Is trivially composable – output is input to next stage

Reasoning About Complex Systems

The human brain can only hold so many things in working memory. When you read imperative code, you need to track:

  • Current state of all variables
  • Order of operations
  • Side effects of each function call
  • Implicit dependencies

With FP-style code, each function is an isolated, self-contained transformation. You can understand it without loading the entire system into your head.


FP as Paradigm, Not Language

Here’s the truth nobody talks about: it’s extremely rare to get organizational buy-in for functional programming languages in large projects. The pitch for FP languages is always compelling: better correctness guarantees, more maintainable code, fewer bugs. And it isn’t wrong. But adoption almost never happens. Why?

The Organizational Reality

Hiring is hard enough. Most companies struggle to find good Python or JavaScript developers. Now try finding engineers who know Haskell or F#. Your hiring pipeline shrinks by 90%.

Risk aversion is real. CTOs don’t get promoted for bold technology choices. They get fired for projects that fail. Betting on an esoteric language is a career risk.

Legacy codebases exist. You can’t rewrite everything. Your new functional service needs to interoperate with years of accumulated imperative code.

Team dynamics matter. Even if you love FP, you need to convince your entire team. Good luck getting consensus on that.

This isn’t to say FPLs are bad. They can be excellent – and still be practically unavailable in most professional contexts.

Why FP-Style Python Matters

This is where Python’s multi-paradigm nature becomes a strength (note: all the points below apply to any contemporary multi-paradigm language, but let’s focus on Python in this article). Python doesn’t force you into any paradigm. You can write imperative code, object-oriented code, or functional code – often in the same file.

This flexibility means:

  • No organizational friction – it’s already Python, your team already uses it
  • Incremental adoption – apply FP principles module by module, function by function
  • Pragmatic compromises – use FP where it helps, imperative where it doesn’t
  • Lower learning curve – team learns gradually, not all at once

We get 80% of FP’s benefits with 20% of the organizational friction.


How to Write Python in FP Style

Let me share the patterns I’ve found most valuable.

Domain Modeling Upfront

Define your data structures first, before writing any logic. Use dataclasses with frozen=True:

from dataclasses import dataclass
from datetime import datetime
from typing import Literal

PatientStatus = Literal['active', 'discharged', 'transferred']

@dataclass(frozen=True)
class Patient:
    id: str
    age: int
    status: PatientStatus
    diagnoses: tuple[str, ...]

@dataclass(frozen=True)
class Encounter:
    patient_id: str
    timestamp: datetime
    duration_minutes: int

You may wonder: why upfront?

  • Forces you to think about data shape before logic
  • Makes illegal states unrepresentable
  • Serves as documentation
  • Enables better IDE support and type checking
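The frozen=True part is enforced at runtime, not just by convention; a quick sketch (with a pared-down Patient):

```python
from dataclasses import dataclass, FrozenInstanceError

@dataclass(frozen=True)
class Patient:
    id: str
    age: int

p = Patient(id='P001', age=42)

try:
    p.age = 43          # assignment on a frozen instance
    mutated = True
except FrozenInstanceError:
    mutated = False     # rejected at runtime

assert not mutated and p.age == 42
```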

Immutability Everywhere

Use immutable collections consistently:

import dataclasses

# prefer tuples over lists
patient_ids: tuple[str, ...] = ('P001', 'P002', 'P003')

# prefer frozenset over set
valid_statuses: frozenset[str] = frozenset(['active', 'inactive'])

# create new collections instead of modifying
def add_diagnosis(patient: Patient, diagnosis: str) -> Patient:
    return dataclasses.replace(
        patient,
        diagnoses=(*patient.diagnoses, diagnosis)
    )

Pure Functions and State Independence

Write functions that depend only on their inputs:

# avoid instance state
class AnalyticsEngine:
    def __init__(self):
        self.cache = {}  # hidden state
    
    def calculate_metric(self, data):
        if data.id in self.cache:
            return self.cache[data.id]
        result = self._compute(data)
        self.cache[data.id] = result
        return result

# prefer pure functions
def calculate_metric(data: Data) -> Metric:
    return compute_metric(data)  # pure, no hidden state

# if you need caching, make it explicit
from functools import lru_cache

@lru_cache(maxsize=1000)
def calculate_metric_cached(data: Data) -> Metric:
    return compute_metric(data)

Static Methods and Module-Level Functions

When you don’t need instance state, avoid classes:

# instead of this
class MathUtils:
    @staticmethod
    def calculate_bmi(weight_kg: float, height_m: float) -> float:
        return weight_kg / (height_m ** 2)

# prefer this
def calculate_bmi(weight_kg: float, height_m: float) -> float:
    return weight_kg / (height_m ** 2)

Simpler, clearer, more testable.

Functional Pipelines with Built-in Tools

Use map, filter, reduce for transformations:

from functools import reduce

# imperative style
high_risk_patients = []
for patient in patients:
    if patient.age > 65 and 'diabetes' in patient.diagnoses:
        high_risk_patients.append(patient)

total_encounters = 0
for patient in high_risk_patients:
    total_encounters += len(patient.encounters)

# functional style
is_high_risk = lambda p: p.age > 65 and 'diabetes' in p.diagnoses
get_encounter_count = lambda p: len(p.encounters)

high_risk_patients = tuple(filter(is_high_risk, patients))
total_encounters = reduce(
    lambda acc, p: acc + get_encounter_count(p),
    high_risk_patients,
    0
)
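Worth noting: in Python, comprehensions and sum often express the same pipeline more idiomatically while staying expression-oriented. A self-contained sketch with a simplified Patient type:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Patient:
    age: int
    diagnoses: tuple[str, ...]
    encounters: tuple[str, ...]

patients = (
    Patient(70, ('diabetes',), ('e1', 'e2')),
    Patient(40, ('diabetes',), ('e3',)),
    Patient(80, (), ('e4',)),
)

# comprehensions: still expression-oriented, arguably more Pythonic
high_risk = tuple(p for p in patients if p.age > 65 and 'diabetes' in p.diagnoses)
total_encounters = sum(len(p.encounters) for p in high_risk)

assert len(high_risk) == 1
assert total_encounters == 2
```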

Pattern Matching (Python 3.10+)

Use match for expression-oriented branching:

def categorize_risk(patient: Patient) -> str:
    match (patient.age, patient.diagnoses):
        case (age, _) if age < 18:
            return 'pediatric'
        case (age, dx) if age >= 65 and any('diabetes' in d for d in dx):
            return 'high_risk_elderly'
        case (age, _) if age >= 65:
            return 'elderly'
        case _:
            return 'standard'

Function Composition

Build complex operations from simple ones:

from functools import reduce
from typing import Callable

def compose(*functions: Callable) -> Callable:
    """compose(f, g, h)(x) = f(g(h(x)))"""
    return reduce(lambda f, g: lambda x: f(g(x)), functions, lambda x: x)

# example: data cleaning pipeline
strip_whitespace = lambda s: s.strip()
lowercase = lambda s: s.lower()
remove_special = lambda s: ''.join(c for c in s if c.isalnum() or c.isspace())

clean_text = compose(remove_special, lowercase, strip_whitespace)

# usage
patient_name = clean_text("  John DOE!@# ")  # "john doe"

Data Processing Pipeline

Here’s how these principles come together:

from dataclasses import dataclass
from datetime import datetime
from functools import reduce

# domain upfront
@dataclass(frozen=True)
class RawRecord:
    id: str
    value: float
    timestamp: datetime

@dataclass(frozen=True)
class ValidatedRecord:
    id: str
    value: float
    timestamp: datetime

@dataclass(frozen=True)
class AggregatedResult:
    record_count: int
    total_value: float
    mean_value: float

# pure transformation functions
def validate_record(record: RawRecord) -> ValidatedRecord | None:
    if record.value < 0 or record.value > 1000:
        return None
    return ValidatedRecord(
        id=record.id,
        value=record.value,
        timestamp=record.timestamp
    )

def aggregate_records(records: tuple[ValidatedRecord, ...]) -> AggregatedResult:
    count = len(records)
    total = reduce(lambda acc, r: acc + r.value, records, 0.0)
    return AggregatedResult(
        record_count=count,
        total_value=total,
        mean_value=total / count if count > 0 else 0.0
    )

# compose pipeline
def process_data(raw_records: tuple[RawRecord, ...]) -> AggregatedResult:
    validated = tuple(
        record for record in map(validate_record, raw_records)
        if record is not None
    )
    return aggregate_records(validated)

# usage
raw_data = (
    RawRecord('R001', 42.5, datetime.now()),
    RawRecord('R002', -10.0, datetime.now()),  # will be filtered
    RawRecord('R003', 87.3, datetime.now()),
)

result = process_data(raw_data)
# AggregatedResult(record_count=2, total_value=129.8, mean_value=64.9)

Notice:

  • All data structures defined upfront
  • Every function is pure – no side effects
  • Immutable throughout – frozen=True, tuples everywhere
  • Easily testable – each function in isolation
  • Easily composable – pipeline is just function calls

Python is NOT Designed for FP-First

Let me be honest: Python is not Haskell. It wasn’t designed with functional programming as the primary paradigm, and that shows. After a decade of using both, here are the limitations you need to know.

No Tail-Call Optimization

Python doesn’t optimize tail recursion. This means recursive functions will hit the recursion limit (default 1000 calls):

# this will crash with RecursionError
def factorial(n: int) -> int:
    if n <= 1:
        return 1
    return n * factorial(n - 1)

factorial(2000)  # RecursionError!

# in Haskell, this would be optimized
# in Python, use iteration or itertools.accumulate

This is a fundamental limitation. Guido van Rossum has explicitly said tail-call optimization won’t be added because it breaks Python’s debugging story.
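The standard workaround is to express the recursion as a loop or a fold; for example:

```python
from functools import reduce
from operator import mul

def factorial(n: int) -> int:
    # a fold instead of recursion: no call-stack growth
    return reduce(mul, range(2, n + 1), 1)

assert factorial(5) == 120
factorial(2000)  # completes without RecursionError
```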

Performance Overhead

Immutability isn’t free. Creating new data structures instead of mutating existing ones costs memory and CPU:

# mutation is fast
items = [1, 2, 3]
items.append(4)  # O(1), modifies in place

# immutability is slower
items = (1, 2, 3)
items = (*items, 4)  # O(n), creates new tuple

For small collections, this doesn’t matter. For large-scale data processing, it can. Know when to compromise.
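A common compromise is to confine mutation inside a function while keeping the interface pure – callers never observe the mutable state. A sketch:

```python
def running_totals(values: tuple[float, ...]) -> tuple[float, ...]:
    # mutation is local to this function; the interface stays pure
    acc: list[float] = []
    total = 0.0
    for v in values:
        total += v
        acc.append(total)  # fast O(1) appends internally
    return tuple(acc)      # only an immutable result escapes

assert running_totals((1.0, 2.0, 3.0)) == (1.0, 3.0, 6.0)
```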

Verbose Syntax

Compare Python to languages designed for FP:

# Python
result = reduce(
    lambda acc, x: acc + x,
    filter(lambda x: x > 0, numbers),
    0
)

# Haskell
result = foldl (+) 0 $ filter (> 0) numbers

# F#
let result = numbers |> List.filter (fun x -> x > 0) |> List.fold (+) 0

Python’s lambda syntax is clunky. Named functions help, but it’s still more verbose than purpose-built FP languages.

Weak Enforcement

Nothing prevents mutation of “immutable” objects if you try hard enough:

from dataclasses import dataclass

@dataclass(frozen=True)
class Person:
    name: str
    age: int

person = Person("Alice", 30)

# this fails as expected
# person.age = 31  # FrozenInstanceError

# but this works
object.__setattr__(person, 'age', 31)  # mutation achieved!

Python’s immutability is conventional, not enforced by the type system. You rely on discipline, not guarantees.
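For dictionaries, the standard library does offer a read-only view, types.MappingProxyType – it blocks casual mutation through the proxy, though whoever holds the underlying dict can still change it:

```python
from types import MappingProxyType

_config = {'threshold': 0.8}
config = MappingProxyType(_config)  # read-only view over _config

assert config['threshold'] == 0.8

try:
    config['threshold'] = 0.9  # proxies reject item assignment
    mutated = True
except TypeError:
    mutated = False

assert not mutated
```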

OOP-Heavy Standard Library

Python’s standard library is designed around object-oriented patterns:

# very OOP
file = open('data.txt', 'w')
file.write('hello')
file.close()

# functional wrappers exist, but feel bolted on
from pathlib import Path
Path('data.txt').write_text('hello')

You’ll often write adapters to make built-in libraries feel functional.

When to Be Pragmatic

Given these limitations, here’s my rule of thumb:

Use FP patterns for:

  • Business logic and domain models
  • Data transformations and pipelines
  • Anything that needs to be tested or reasoned about

Don’t use FP patterns for:

  • Performance-critical tight loops (use mutation)
  • Working with inherently stateful systems (databases, file I/O)
  • When the standard library provides a simpler imperative approach

FP is a tool, not a religion. Use it where it helps.


Why Learning Serious FPLs Still Matters

Here’s something that might seem contradictory: even though Python has limitations for FP, and even though you probably won’t use Haskell or F# professionally, it’s absolutely worth learning a serious functional programming language.

I’m currently teaching myself F#. Not because I expect to deploy it professionally (I don’t). Not because I think it’ll make my resume shine (it won’t). But because the lessons learned from a functional-first language are remarkably transferable to every language I use daily.

What You Learn from Serious FPLs

Type-driven design. F# and Haskell teach you to model your domain with types first, logic second. This habit – defining type PatientStatus = Active | Discharged | Transferred before writing any functions – translates directly to Python’s dataclasses and Literal types.

Principled error handling. FPLs teach you to make errors explicit and type-safe (e.g., Result types). When I return to Python, I can write:

from typing import Literal, TypeAlias

Success: TypeAlias = tuple[Literal[True], Patient]
Failure: TypeAlias = tuple[Literal[False], str]
Result: TypeAlias = Success | Failure

def validate_patient(data: dict) -> Result:
    if 'id' not in data:
        return (False, "missing patient id")
    return (True, Patient(**data))

Pipeline thinking. The pipe operator (|> in F#) trains you to think in data transformations:

// F#
patients
|> List.filter isHighRisk
|> List.map calculateScore
|> List.sortByDescending id

Even though Python lacks |>, the thinking remains:

# Python
result = sorted(
    map(calculate_score, filter(is_high_risk, patients)),
    key=lambda x: x,
    reverse=True
)

Immutability by default. In F#, mutation is opt-in (you must declare mutable). This default shapes how you think. Even in Python where mutation is easy, you start reaching for immutable patterns first.

The Transferability Insight

Learning Rust taught me monads through Option and Result. Learning F# extends this further:

  • Discriminated unions (algebraic data types)
  • Exhaustive pattern matching
  • Computation expressions (like Haskell’s do-notation)
  • Type inference done right

None of these exist in Python with the same elegance. But the principles transfer:

  • Model with types
  • Make illegal states unrepresentable
  • Compose small functions into larger ones
  • Prefer expressions over statements
  • Default to immutability

When I return to Python, I’m not trying to write F# in Python. I’m applying FP principles with Python’s idioms. That’s the difference.

The Latin Analogy

Learning F# is like learning Latin. You may never speak Latin in daily life, but it makes you better at the many languages derived from or influenced by Latin.

Similarly, you may never deploy Haskell or F# professionally. But learning them makes you a better Python developer. A better Rust developer. A better thinker about code.

The organizational constraints haven’t changed – you still can’t force your team to adopt an FPL. But you can apply what you’ve learned from FPLs to every language you use. You write code with FP principles in mind, even when the language doesn’t enforce them.

I think that’s the real value.


Closing Thoughts: Principles Over Purity

A decade ago, I thought functional programming was all-or-nothing. Either you write pure Haskell with monads and type-level programming, or you’re not doing FP.

Looking back, I don’t think that was the right way to think about it.

The journey from Haskell to Rust to Python – and now to F# – has taught me that the real insight isn’t about languages. It’s about principles. Expression-oriented thinking. Immutability-first design. Pure functions. State independence. These ideas transcend syntax. They make code better regardless of whether you’re writing Haskell, Rust, Python, or JavaScript.

Rust showed me that a systems language could embrace FP selectively and become more reliable. Python shows me that a scripting language can do the same. F# is teaching me the pure forms of these ideas, which then inform how I write code in every other language.

Python won’t enforce these principles the way a pure FP language would. But that’s okay. The flexibility is a feature. You can adopt FP where it helps and stay pragmatic where it doesn’t.

Working professionally, I’ve seen these principles prevent entire categories of bugs. Not because we’re using exotic languages, but because we’re applying disciplined thinking to everyday Python code. Immutable data models. Pure transformation functions. Explicit state management. These ideas, learned from serious FPLs, implemented in practical languages.

It works. Not perfectly – Python’s limitations are real. But it works well enough to make a measurable difference in code quality, maintainability, and team velocity.

To conclude: FP isn’t a language choice; it’s a way of thinking. You don’t need permission to use an exotic language. You don’t need to convince your team to rewrite everything. You just need to apply better principles to the code you’re already writing. The paradigm matters more than the language. Learn the principles from languages that embody them purely, then apply them pragmatically everywhere.