Update Jan 2019: As mentioned in the comments, the information below is true up until version 3.2. Version 3.4+ changed the spec so that the machine ID and process ID were merged into a single random 5-byte value. That may make it harder to figure out where a document came from, but it also simplifies generation and reduces the likelihood of collisions.
Original Answer:
+1 for Sergio's answer. As for whether ObjectIDs can be guessed: they are not hashes and they are predictable, so they can be "brute forced" given enough time. How likely that is depends on how the ObjectIDs were generated and how you go about guessing. To explain, first read the spec here:
Object ID Spec
Let us then break it down piece by piece:
- TimeStamp - completely predictable as long as you have a general idea of when the data was generated
- Machine - this is an MD5 hash of one of several options, some of which are more easily determined than others, but highly dependent on the environment
- PID - again, not a huge number of values here, and could be sleuthed for data generated from a known source
- Increment - if this is a random number rather than an increment (both are allowed), then it is less predictable
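To make the breakdown above concrete, here is a minimal sketch that splits a 24-character hex ObjectID into those four fields. It assumes the pre-3.4 layout of 4 bytes of timestamp, 3 bytes of machine, 2 bytes of PID and 3 bytes of increment. The function name `split_legacy_object_id` and the sample ID are made up for illustration, and it uses only the Python standard library rather than any driver API.

```python
from datetime import datetime, timezone


def split_legacy_object_id(hex_id: str) -> dict:
    """Split an ObjectID into its fields, assuming the pre-3.4 layout."""
    raw = bytes.fromhex(hex_id)
    if len(raw) != 12:
        raise ValueError("an ObjectID is 12 bytes (24 hex characters)")

    # First 4 bytes: seconds since the Unix epoch, big-endian.
    seconds = int.from_bytes(raw[0:4], "big")
    return {
        "timestamp": datetime.fromtimestamp(seconds, tz=timezone.utc),
        "machine": raw[4:7].hex(),   # 3 bytes, kept as hex
        "pid": raw[7:9].hex(),       # 2 bytes, kept as hex
        "counter": int.from_bytes(raw[9:12], "big"),  # last 3 bytes
    }


# Illustrative ID only, not taken from a real deployment.
parts = split_legacy_object_id("507f1f77bcf86cd799439011")
for name, value in parts.items():
    print(name, value)
```

Everything except the last field is either fixed for a given process or recoverable from a rough idea of when the document was written, which is why the increment carries most of the uncertainty.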
To expand a bit on the sources, ObjectIDs can be generated by:
- MongoDB itself (but can be migrated, moved, updated)
- The driver (on any machine that inserts or updates data)
- Your application (you can manually insert your own ObjectID if you wish)
So, there are things you can do to make individual ObjectIDs harder to guess. Without a lot of forethought and safeguards, though, the range of valid ObjectIDs for a normal data set should be fairly easy to work out, since they are all prefixed with a timestamp (unless you are manipulating this in some way).
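As a rough sketch of what "working out the range" means: if you know the window in which documents were created, the timestamp prefix pins down the lowest and highest possible ObjectIDs for that window. The helper name `object_id_bounds` and the dates are hypothetical, and this assumes the timestamp has not been tampered with.

```python
from datetime import datetime, timezone
from typing import Tuple


def object_id_bounds(start: datetime, end: datetime) -> Tuple[str, str]:
    """Return the smallest and largest hex ObjectIDs for a time window."""
    # 4-byte big-endian timestamp, then the remaining 8 bytes
    # padded with all zeros (lower bound) or all ones (upper bound).
    low = int(start.timestamp()).to_bytes(4, "big") + b"\x00" * 8
    high = int(end.timestamp()).to_bytes(4, "big") + b"\xff" * 8
    return low.hex(), high.hex()


# Hypothetical window: data known to be created on 1 Jan 2019 (UTC).
low, high = object_id_bounds(
    datetime(2019, 1, 1, tzinfo=timezone.utc),
    datetime(2019, 1, 2, tzinfo=timezone.utc),
)
print(low)
print(high)
```

Any ObjectID generated in that window sorts between the two values, so an attacker only has to search the remaining 8 bytes, and less than that if the machine and PID are known.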