It seems this would not be deterministic. Is there a way to do this reliably?
If you're using gzip, you can compare the decompressed contents instead of the compressed files (the `<(...)` syntax needs bash or a similar shell):
# diff <(zcat file1.gz) <(zcat file2.gz)
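If you only need a yes/no answer, or your shell lacks process substitution, the same idea can be sketched as a small function. This is a minimal sketch, not a definitive implementation: the name `gz_same` is made up for illustration, and it assumes `gzip`, `cmp`, and a `/dev/fd` filesystem are available (as on Linux, macOS, and the BSDs). It compares the decompressed streams, so differences in the gzip header (which can record the original file name and timestamp) do not affect the result.

```shell
# gz_same is a hypothetical helper name, not part of gzip.
# Exit status 0: both files decompress to identical bytes.
# Non-zero: the contents differ, or cmp could not read one of the streams.
gz_same() {
    # The first file is decompressed into a pipe, which the brace group
    # receives on stdin and duplicates to file descriptor 3.
    # The second file is decompressed into cmp's stdin ("-").
    # cmp -s prints nothing and reports the result via its exit status.
    gzip -dc -- "$1" | { gzip -dc -- "$2" | cmp -s /dev/fd/3 -; } 3<&0
}
```

Usage: `gz_same file1.gz file2.gz && echo same || echo different`. Keep the `diff` form above when you want to see which lines differ.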