In Defense of Colloquial Entropy
purpose
Physicists have long balked at colloquial usages of “entropy” (especially by philosophers). I believe some usages of “entropy” to mean resistance to systematic understanding have precise mathematical grounding in information theory. I am tired of arguing with my physics-major friends over this.
definitions
Shannon entropy H(X) for a discrete random variable X:
H(X) = -∑ p(x) log₂(p(x))
measures the average number of bits needed to encode X’s outcomes.
For any lossless compression scheme C mapping sequences of X to bit strings, with L(C) the expected code length per symbol:
L(C) ≥ H(X)
(Shannon’s source coding theorem)
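As a sanity check on the notation, here is a minimal Python sketch (the four-outcome distribution and the prefix code are invented for illustration) that computes H(X) and the expected length L(C) of a concrete lossless code:

```python
import math

def shannon_entropy(probs):
    """H(X) = -sum p(x) log2 p(x), in bits."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Invented four-outcome source.
p = {"a": 0.4, "b": 0.3, "c": 0.2, "d": 0.1}

# An explicit lossless prefix code for the same outcomes.
code = {"a": "0", "b": "10", "c": "110", "d": "111"}

H = shannon_entropy(p.values())
L = sum(p[x] * len(code[x]) for x in p)  # expected code length, bits per symbol

print(f"H(X) = {H:.3f} bits, L(C) = {L:.3f} bits")  # ≈ 1.846 vs 1.900
```

No choice of lossless code can push L(C) below 1.846 bits per symbol for this source; that is all the theorem says, and all the argument below needs.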
In statistical learning over a data distribution P with a chosen hypothesis class:
Expected Error = Bias² + Variance + Irreducible Error
(the bias-variance decomposition)
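A rough illustration of the decomposition, assuming a made-up “phenomenon” (a sine curve plus noise) and polynomial models of varying flexibility: the degree-0 fit underfits (bias), the degree-15 fit overfits its 20 training points (variance), and nothing beats the noise floor (irreducible error).

```python
import numpy as np

rng = np.random.default_rng(0)

def observe(n):
    """Noisy observations of an underlying process (sine + noise, illustrative)."""
    x = rng.uniform(0, 1, n)
    y = np.sin(2 * np.pi * x) + rng.normal(0, 0.3, n)  # 0.3**2 is the noise floor
    return x, y

x_train, y_train = observe(20)
x_test, y_test = observe(500)

for degree in (0, 3, 15):  # too simple / about right / too flexible
    coeffs = np.polyfit(x_train, y_train, degree)      # may warn at degree 15
    mse = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    print(f"degree {degree:2d}: held-out MSE = {mse:.3f}")
```

Typically the middle degree wins on held-out data, and none of the fits gets below roughly 0.09, the variance of the noise.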
phenomenological mapping
Consider raw experiential phenomena:
- Emotional states (continuous, high-dimensional)
- Complex situations (multiple interacting agents/factors)
- Novel problems (undefined solution spaces)
- Transient physical experiences (direct sensory input)
These can be mapped onto an information-theoretic framework (a toy version is sketched just below):
- Each phenomenon generates a sequence of observations
- Observations are drawn from some probability distribution P
- Mental models attempt to compress the observations into predictable patterns
- Lossless compression is bounded below by the Shannon entropy of P
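Here is the toy version, with a made-up stream of coarse category labels standing in for observations of a recurring situation: the empirical entropy is the floor on how far any lossless “mental model” could compress it, and the naive fixed-length encoding is what you pay with no model at all.

```python
import math
from collections import Counter

def empirical_entropy(observations):
    """Entropy (bits per observation) of the empirical distribution of a sequence."""
    counts = Counter(observations)
    n = len(observations)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

# Hypothetical stream of coarse observations: mostly a/b, occasionally c/d.
stream = list("ababacabadabacabab")

H = empirical_entropy(stream)
naive = math.log2(len(set(stream)))  # fixed-length code over 4 symbols = 2 bits

print(f"empirical entropy ≈ {H:.2f} bits/obs, naive encoding = {naive:.2f} bits/obs")
```

A good model closes the gap between 2.0 and roughly 1.6 bits per observation; nothing lossless gets below the 1.6.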
claim
Consider this tweet I made:
you interpret via self-constructed systems. you can’t out-construct entropy (overfitting) => interpretation/discursive analysis is often overused. sit with anxiety inducing things before subjugating it with your pretty little intellectual frameworks.
Let’s prove this has mathematical meaning:
- Map experiential phenomena to data points from distribution P
- Mental models are compression schemes C with expected description length L(C)
- For an experience with entropy H:
- By Shannon’s theorem, any lossless scheme satisfies L(C) ≥ H
- Any scheme that spends fewer than H bits on average must discard information
- By the bias-variance tradeoff, a model cannot simultaneously:
- Minimize model complexity L(C)
- Capture all patterns in P
- Generalize to new experiences
Therefore: for high-entropy phenomena, any attempt to “out-construct” (compress below H bits) must do at least one of the following (the sketch after this list puts a number on the cost of an over-simple model):
- Lose essential information (underfitting)
- Create brittle, overfitted models
- Accept fundamental incompressibility
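To put a number on that cost: if reality follows a distribution P but the interpretive framework commits to a tidier distribution Q, coding with Q costs the cross-entropy H(P, Q) = H(P) + D_KL(P || Q) ≥ H(P). Both distributions below are invented; the point is only the direction of the inequality.

```python
import math

def entropy(p):
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)

def cross_entropy(p, q):
    """Average bits paid when reality follows p but the code is built for q."""
    return -sum(pi * math.log2(qi) for pi, qi in zip(p, q) if pi > 0)

# Invented 'experience' distribution P and two candidate mental models.
p          = [0.40, 0.30, 0.20, 0.10]
q_tidy     = [0.70, 0.10, 0.10, 0.10]   # overly neat story: "it's basically X"
q_faithful = [0.40, 0.30, 0.20, 0.10]   # model that matches P exactly

print(f"H(P)             = {entropy(p):.3f} bits")
print(f"H(P, Q_tidy)     = {cross_entropy(p, q_tidy):.3f} bits")
print(f"H(P, Q_faithful) = {cross_entropy(p, q_faithful):.3f} bits")
```

The faithful model pays exactly H(P); the tidy one pays about 0.35 extra bits per outcome, and shrinking Q further only widens the gap.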
analogous failure modes (AI)
Just as in machine learning with high-entropy data, we encounter:
In compression:
- Information loss through simplification
- Over-specified models that don’t generalize
- Incompressible sequences requiring full description
In experiential interpretation:
- Oversimplified frameworks missing crucial nuance
- Complex frameworks that fail on new experiences
- Phenomena resisting systematic understanding
(broad/loose/quick) applications
Consider interpreting complex emotion:
- Low-complexity: “I’m just sad” (1-bit classification, high bias)
- High-complexity: “Unique combination of childhood memory #247, recent event #892, weather conditions…” (high variance)
- Optimal: Maintain full complexity when compression would lose the essence (the sketch after this list counts the bits a 1-bit label throws away)
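Here is that bit-count, using an invented distribution over a handful of finer-grained states: a binary sad/not-sad label carries at most 1 bit, so whatever the fine-grained entropy exceeds 1 bit is discarded by construction.

```python
import math

def entropy(probs):
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Invented distribution over finer-grained emotional states.
fine = {"grief": 0.25, "nostalgia": 0.20, "fatigue": 0.20,
        "dread": 0.15, "relief": 0.10, "numbness": 0.10}

# The 1-bit collapse: every state becomes either "sad" or "not sad".
sad_states = {"grief", "nostalgia", "dread", "numbness"}
p_sad = sum(p for state, p in fine.items() if state in sad_states)

print(f"fine-grained H ≈ {entropy(fine.values()):.2f} bits")
print(f"'just sad' H   ≈ {entropy([p_sad, 1 - p_sad]):.2f} bits (≤ 1 by construction)")
```

Roughly 1.6 of the 2.5 bits never make it into the label. The high-variance alternative errs the other way: a description so specific it says nothing about the next experience.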
Similarly for novel problems:
- Low-complexity: Force into known solution patterns
- High-complexity: Enumerate every possible factor
- Optimal: Hold uncertainty while patterns emerge
These applications are analogies, of course: even putting our experiences into words subjugates them to some form or system of human understanding/interpretation.
implications
- For high-entropy experiences (high Kolmogorov complexity relative to the description language), compressed representations must lose information
- We cannot build frameworks that are simultaneously:
- Simple enough to be useful
- Complex enough to capture patterns
- Generalizable to new experiences
- Raw phenomena often contain more information than any compressed representation can retain (a crude compression demo follows)
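The crude demo: run a general-purpose compressor over a patterned block and over a random block. zlib here is a stand-in for, not a measure of, Kolmogorov complexity, but it shows the shape of the claim the conclusion below rests on: some sequences simply do not get shorter.

```python
import os
import zlib

structured = b"abab" * 4096      # 16 KiB of pure pattern
random_ish = os.urandom(16384)   # 16 KiB that is incompressible with overwhelming probability

for name, data in (("structured", structured), ("random", random_ish)):
    ratio = len(zlib.compress(data, 9)) / len(data)
    print(f"{name:10s}: compressed to {ratio:.1%} of original size")
```

The patterned block collapses to a fraction of a percent; the random one comes back at essentially 100% (sometimes a touch larger), and a counting argument says most long sequences behave like the random one.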
conclusion
Not all phenomena permit compression. Just as there exist mathematically incompressible sequences, there exist experiences whose true information content exceeds our capacity for systematic representation. The limit isn’t practical but mathematical.
Sometimes the most accurate model is the experience itself, held in its full entropy without reduction to simpler frameworks. This isn’t an argument against systematic understanding, but a precise delineation of its boundaries. This has long been known to Vedic and Buddhist thinkers.
My physics undergrad friends objecting to colloquial “entropy” ironically demonstrate my thesis: they are attempting to compress a mathematically valid analogy into an overly rigid framework.
references
- wikipedia
- the inner depths of my mind
- this book which is the best resource on entropy I have ever read