Thoughts on Summaries

The summary is a very useful thing. In its most general form, a summary is a one-way transformation that allows one to tell apart two pieces of data: if the summaries differ, so do the data. No summary suffices to confirm that two pieces of data are identical, however, since distinct data may share the same summary.

In distributed computations without trust, summaries are used mainly to describe linked data structures whose links are meant to be unforgeable; such summaries are those generated by cryptographic hash (or digest) functions, and generally provide only one operation: equality. Such summaries are generally meaningless beyond equality, but non-cryptographic summaries beyond checksums remind one of their other uses, like improving the characteristics of computations.

Cryptographic hash functions are fundamentally one-way functions (what I loosely call trapdoor functions), and I believe they do not exist in general. All such functions are secured by ignorance, not mathematical truth, I think. Many systems rely on their existence, and can't exist in any shape without them. BitTorrent makes a fine example: the entire idea is predicated on the ability to describe the data to share without any large cost, and a summary one hundred-thousandth the size of the data is generally small enough to distribute, but it requires that aspect of unforgeability lest lies be shared in place of the data. Proof-of-Work as in Bitcoin is another example, which treats the summaries as integers, since every bit sequence corresponds to a unique integer according to some scheme, and which is able to work only so long as no one knows any way to optimize the chosen function sufficiently well. Mapping data to growing integers, which requires an assigner to exist, is always the best method to uniquely identify data when trust becomes even a little reasonable; I believe it will be vindicated.
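To make the Proof-of-Work remark above concrete, here is a minimal sketch in Python of treating a summary as an integer. It is only an illustration under stated assumptions: it uses a single SHA-256 pass over a payload followed by an 8-byte counter, which is not Bitcoin's actual block format, and the names digest_as_int, find_nonce, verify, and difficulty_bits are hypothetical.

```python
import hashlib


def digest_as_int(data: bytes) -> int:
    """Read a SHA-256 digest as a big-endian unsigned integer."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big")


def find_nonce(payload: bytes, difficulty_bits: int) -> int:
    """Try counters until the digest, read as an integer, falls below the target."""
    # Assumption: the target is 2^(256 - difficulty_bits), so each extra bit
    # of difficulty halves the share of digests that qualify.
    target = 1 << (256 - difficulty_bits)
    nonce = 0
    while digest_as_int(payload + nonce.to_bytes(8, "big")) >= target:
        nonce += 1
    return nonce


def verify(payload: bytes, nonce: int, difficulty_bits: int) -> bool:
    """Checking the work costs one hash, however long the search took."""
    target = 1 << (256 - difficulty_bits)
    return digest_as_int(payload + nonce.to_bytes(8, "big")) < target


# A low difficulty keeps the search short (about 4096 attempts on average).
nonce = find_nonce(b"example payload", 12)
print(nonce, verify(b"example payload", nonce, 12))
```

The search is expensive only because nothing better than trying counters one after another is known for the chosen function; a sufficiently good shortcut would undo the scheme, which is the dependence described above.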
Content-addressable storage can exist in a world without the one-way function, but not in a hostile environment, and not by blindly trusting the function to be anything but confirmation of inequality.

Summaries with more operations than equality can be very useful, I think; the example I've used most so far is a function I simply call Summarize, which accepts a sequence of values in some domain and returns the count of each value in said domain. The summary is very clearly non-cryptographic, but has its uses: if Summarize be considered a function of linear time complexity, and it reasonably can be, then it gives certain problems, such as anagram searching, linear time complexity in the sizes of the input sequences; more interestingly, the same summary also works for the harder problem of almost-anagram searching within a tolerance, since such a summary is useful with more than equality.

I find it useful to think of checksums and the like as a subset of summaries with measly operations; within this framing, one naturally starts to think of other types of summaries useful in other ways.
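The Summarize idea can be sketched in a few lines of Python. This is a guess at the shape of the function rather than the original implementation: collections.Counter stands in for the table of counts, and the helper names difference and find_anagrams, as well as the definition of tolerance as the total mismatch in counts, are assumptions made for illustration.

```python
from collections import Counter


def summarize(values):
    """Count how often each value occurs, in one pass over the input."""
    return Counter(values)


def difference(a, b):
    """Total mismatch in counts between two summaries (hypothetical helper)."""
    # Counter subtraction keeps only positive counts, so add both directions.
    return sum((a - b).values()) + sum((b - a).values())


def find_anagrams(query, candidates, tolerance=0):
    """Return the candidates whose summaries are within tolerance of the query's."""
    target = summarize(query)
    return [c for c in candidates if difference(target, summarize(c)) <= tolerance]


words = ["listen", "silent", "enlist", "listens", "tinsel", "stone"]
exact = find_anagrams("listen", words)
near = find_anagrams("listen", words, tolerance=1)
print(exact)  # ['listen', 'silent', 'enlist', 'tinsel']
print(near)   # ['listen', 'silent', 'enlist', 'listens', 'tinsel']
```

With a tolerance of zero the comparison collapses to plain equality, the only operation a checksum offers; any larger tolerance uses the extra structure of the counts, which a cryptographic digest could not provide.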