have they though? i for one remember the "golden days" being much more about checking specific values for things you already know could happen than gathering telemetry to be able to ask questions you never would have even thought about asking at the time you set it up.
Author of the post here. I definitely agree with that assessment. Having multiple stacks monitor each other is a really good, albeit somewhat resource intensive, way to cover all of the *likely* cases. The cost of covering that last percent very quickly approaches the point of diminishing returns.
You start towards it in your post, but don't explicitly call it out: the central task is decreasing complexity.
A "dumb" heartbeat message, and a check for its absence, is reliable precisely because of its simplicity.
So if the question is "How do I monitor my complicated observability stack?" then the answer, to me, starts with "What is the simplest method of accomplishing the minimal version of my goals?"