In Defense of Colloquial Entropy
purpose
Physicists have long balked at colloquial usages of “entropy” (especially by philosophers). I believe some usages of “entropy” to mean resistance to systematic understanding have precise mathematical grounding in information theory. I am tired of arguing with my physics-major friends over this.
definitions
Shannon entropy H(X) for a discrete random variable X:
H(X) = -∑ p(x) log₂(p(x))
measures the average number of bits needed to encode X’s outcomes.
For any lossless compression scheme C mapping sequences of X to bit strings, with L(C) the expected code length per symbol:
L(C) ≥ H(X)
(Shannon’s source coding theorem)
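As a sanity check on the notation, here is a minimal Python sketch (the four-outcome distribution and the prefix code are invented for illustration) that computes H(X) and the expected length L(C) of a concrete lossless code:

```python
import math

def shannon_entropy(probs):
    """H(X) = -sum p(x) log2 p(x), in bits."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Invented four-outcome source.
p = {"a": 0.4, "b": 0.3, "c": 0.2, "d": 0.1}

# An explicit lossless prefix code for the same outcomes.
code = {"a": "0", "b": "10", "c": "110", "d": "111"}

H = shannon_entropy(p.values())
L = sum(p[x] * len(code[x]) for x in p)  # expected code length, bits per symbol

print(f"H(X) = {H:.3f} bits, L(C) = {L:.3f} bits")  # ≈ 1.846 vs 1.900
```

No choice of lossless code can push L(C) below 1.846 bits per symbol for this source; that is all the theorem says, and all the argument below needs.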
In statistical learning over a data distribution P with a chosen hypothesis class:
Expected Error = Bias² + Variance + Irreducible Error
(the bias-variance decomposition)
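A rough illustration of the decomposition, assuming a made-up “phenomenon” (a sine curve plus noise) and polynomial models of varying flexibility: the degree-0 fit underfits (bias), the degree-15 fit overfits its 20 training points (variance), and nothing beats the noise floor (irreducible error).

```python
import numpy as np

rng = np.random.default_rng(0)

def observe(n):
    """Noisy observations of an underlying process (sine + noise, illustrative)."""
    x = rng.uniform(0, 1, n)
    y = np.sin(2 * np.pi * x) + rng.normal(0, 0.3, n)  # 0.3**2 is the noise floor
    return x, y

x_train, y_train = observe(20)
x_test, y_test = observe(500)

for degree in (0, 3, 15):  # too simple / about right / too flexible
    coeffs = np.polyfit(x_train, y_train, degree)      # may warn at degree 15
    mse = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    print(f"degree {degree:2d}: held-out MSE = {mse:.3f}")
```

Typically the middle degree wins on held-out data, and none of the fits gets below roughly 0.09, the variance of the noise.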
phenomenological mapping
Consider raw experiential phenomena:
- Emotional states (continuous, high-dimensional)
- Complex situations (multiple interacting agents/factors)
- Novel problems (undefined solution spaces)
- Transient physical experiences (direct sensory input)
These can be mapped onto an information-theoretic framework (a toy version is sketched just below):
- Each phenomenon generates a sequence of observations
- Observations are drawn from some probability distribution P
- Mental models attempt to compress the observations into predictable patterns
- Lossless compression is bounded below by the Shannon entropy of P
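Here is the toy version, with a made-up stream of coarse category labels standing in for observations of a recurring situation: the empirical entropy is the floor on how far any lossless “mental model” could compress it, and the naive fixed-length encoding is what you pay with no model at all.

```python
import math
from collections import Counter

def empirical_entropy(observations):
    """Entropy (bits per observation) of the empirical distribution of a sequence."""
    counts = Counter(observations)
    n = len(observations)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

# Hypothetical stream of coarse observations: mostly a/b, occasionally c/d.
stream = list("ababacabadabacabab")

H = empirical_entropy(stream)
naive = math.log2(len(set(stream)))  # fixed-length code over 4 symbols = 2 bits

print(f"empirical entropy ≈ {H:.2f} bits/obs, naive encoding = {naive:.2f} bits/obs")
```

A good model closes the gap between 2.0 and roughly 1.6 bits per observation; nothing lossless gets below the 1.6.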
claim
Consider this tweet I made:
you interpret via self-constructed systems. you can’t out-construct entropy (overfitting) => interpretation/discursive analysis is often overused. sit with anxiety inducing things before subjugating it with your pretty little intellectual frameworks.
Let’s prove this has mathematical meaning:
- Map experiential phenomena to data points from distribution P
- Mental models are compression schemes C with expected description length L(C)
- For an experience with entropy H:
- By Shannon’s theorem, any lossless scheme satisfies L(C) ≥ H
- Any scheme that spends fewer than H bits on average must discard information
- By the bias-variance tradeoff, a model cannot simultaneously:
- Minimize model complexity L(C)
- Capture all patterns in P
- Generalize to new experiences
Therefore: for high-entropy phenomena, any attempt to “out-construct” (compress below H bits) must do at least one of the following (the sketch after this list puts a number on the cost of an over-simple model):
- Lose essential information (underfitting)
- Create brittle, overfitted models
- Accept fundamental incompressibility
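To put a number on that cost: if reality follows a distribution P but the interpretive framework commits to a tidier distribution Q, coding with Q costs the cross-entropy H(P, Q) = H(P) + D_KL(P || Q) ≥ H(P). Both distributions below are invented; the point is only the direction of the inequality.

```python
import math

def entropy(p):
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)

def cross_entropy(p, q):
    """Average bits paid when reality follows p but the code is built for q."""
    return -sum(pi * math.log2(qi) for pi, qi in zip(p, q) if pi > 0)

# Invented 'experience' distribution P and two candidate mental models.
p          = [0.40, 0.30, 0.20, 0.10]
q_tidy     = [0.70, 0.10, 0.10, 0.10]   # overly neat story: "it's basically X"
q_faithful = [0.40, 0.30, 0.20, 0.10]   # model that matches P exactly

print(f"H(P)             = {entropy(p):.3f} bits")
print(f"H(P, Q_tidy)     = {cross_entropy(p, q_tidy):.3f} bits")
print(f"H(P, Q_faithful) = {cross_entropy(p, q_faithful):.3f} bits")
```

The faithful model pays exactly H(P); the tidy one pays about 0.35 extra bits per outcome, and shrinking Q further only widens the gap.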
analogous failure modes (AI)
Just as in machine learning with high-entropy data, we encounter:
In compression:
- Information loss through simplification
- Over-specified models that don’t generalize
- Incompressible sequences requiring full description
In experiential interpretation:
- Oversimplified frameworks missing crucial nuance
- Complex frameworks that fail on new experiences
- Phenomena resisting systematic understanding
(broad/loose/quick) applications
Consider interpreting complex emotion:
- Low-complexity: “I’m just sad” (1-bit classification, high bias)
- High-complexity: “Unique combination of childhood memory #247, recent event #892, weather conditions…” (high variance)
- Optimal: Maintain full complexity when compression would lose the essence (the sketch after this list counts the bits a 1-bit label throws away)
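Here is that bit-count, using an invented distribution over a handful of finer-grained states: a binary sad/not-sad label carries at most 1 bit, so whatever the fine-grained entropy exceeds 1 bit is discarded by construction.

```python
import math

def entropy(probs):
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Invented distribution over finer-grained emotional states.
fine = {"grief": 0.25, "nostalgia": 0.20, "fatigue": 0.20,
        "dread": 0.15, "relief": 0.10, "numbness": 0.10}

# The 1-bit collapse: every state becomes either "sad" or "not sad".
sad_states = {"grief", "nostalgia", "dread", "numbness"}
p_sad = sum(p for state, p in fine.items() if state in sad_states)

print(f"fine-grained H ≈ {entropy(fine.values()):.2f} bits")
print(f"'just sad' H   ≈ {entropy([p_sad, 1 - p_sad]):.2f} bits (≤ 1 by construction)")
```

Roughly 1.6 of the 2.5 bits never make it into the label. The high-variance alternative errs the other way: a description so specific it says nothing about the next experience.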
Similarly for novel problems:
- Low-complexity: Force into known solution patterns
- High-complexity: Enumerate every possible factor
- Optimal: Hold uncertainty while patterns emerge
These applications are analogies, of course: even putting our experiences into words subjugates them to some form or system of human understanding/interpretation.
implications
- For high-entropy experiences (high Kolmogorov complexity relative to the description language), compressed representations must lose information
- We cannot build frameworks that are simultaneously:
- Simple enough to be useful
- Complex enough to capture patterns
- Generalizable to new experiences
- Raw phenomena often contain more information than any compressed representation can retain (a crude compression demo follows)
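The crude demo: run a general-purpose compressor over a patterned block and over a random block. zlib here is a stand-in for, not a measure of, Kolmogorov complexity, but it shows the shape of the claim the conclusion below rests on: some sequences simply do not get shorter.

```python
import os
import zlib

structured = b"abab" * 4096      # 16 KiB of pure pattern
random_ish = os.urandom(16384)   # 16 KiB that is incompressible with overwhelming probability

for name, data in (("structured", structured), ("random", random_ish)):
    ratio = len(zlib.compress(data, 9)) / len(data)
    print(f"{name:10s}: compressed to {ratio:.1%} of original size")
```

The patterned block collapses to a fraction of a percent; the random one comes back at essentially 100% (sometimes a touch larger), and a counting argument says most long sequences behave like the random one.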
conclusion
Not all phenomena permit compression. Just as there exist mathematically incompressible sequences, there exist experiences whose true information content exceeds our capacity for systematic representation. The limit isn’t practical but mathematical.
Sometimes the most accurate model is the experience itself, held in its full entropy without reduction to simpler frameworks. This isn’t an argument against systematic understanding, but a precise delineation of its boundaries. This has long been known to Vedic and Buddhist thinkers.
My physics undergrad friends objecting to colloquial “entropy” ironically demonstrate my thesis: they are attempting to compress a mathematically valid analogy into an overly rigid framework.
references
- wikipedia
- the inner depths of my mind
- this book which is the best resource on entropy I have ever read