Basically elaborating on what's outlined here.
Here's how it works: let's say we have a function that takes a number from zero through nine, adds three and, if the result is ten or more, subtracts ten. So f(2) = 5, f(8) = 1, etc. Now, we can make another function, call it f', that goes backwards by adding seven instead of three (again subtracting ten if the result is ten or more). f'(5) = 2, f'(1) = 8, etc.
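To make that concrete, here's a minimal sketch of the two functions in Python. The names `f` and `f_inverse` are just my own labels for f and f', and the "subtract ten" step is written as a modulo.

```python
def f(x: int) -> int:
    """Add three, wrapping around so the result stays in 0..9."""
    return (x + 3) % 10


def f_inverse(y: int) -> int:
    """Undo f by adding seven, which is the same as subtracting three modulo 10."""
    return (y + 7) % 10


# Every input from zero through nine survives the round trip.
for x in range(10):
    assert f_inverse(f(x)) == x

print(f(2), f(8))                  # 5 1
print(f_inverse(5), f_inverse(1))  # 2 8
```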
That's an example of a two-way function and its inverse. Theoretically, any mathematical function that maps each input to its own distinct output can be reversed. In practice, though, you can make a function that scrambles its input so well that it's incredibly difficult to reverse.
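Taking an input and applying a one-way function is called "hashing" the input, and the result is called a "hash". SHA1 is an example of this kind of "one-way" function, and it's also designed to resist attacks that try to reverse it. Note that Amazon has to keep the secret key itself rather than just a hash of it, because the HMAC check described below needs the key.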
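As a rough illustration of what hashing looks like, here's a sketch using Python's standard `hashlib` module. The helper name `sha1_hex` and the sample strings are made up for the example.

```python
import hashlib


def sha1_hex(text: str) -> str:
    """Return the SHA1 hash of the text as 40 hex characters."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


# Two inputs that differ by one character give unrelated-looking hashes,
# and nothing in the output tells you what the input was.
first = sha1_hex("my secret key")
second = sha1_hex("my secret kez")
print(first)
print(second)
print(first == second)  # False
```

Going from the input to the hash is cheap. Going from the hash back to the input is the part that's meant to be impractical.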
The HMAC function builds on established hash functions to use a known key to authenticate a string of text. It works like this:
- You take the text of your request and your secret key and apply the HMAC function. The result is your authentication header.
- You add that authentication header to your request and send it to Amazon.
- Amazon looks up their copy of the secret key, takes the text you just sent, and applies the HMAC function.
- If their result matches your header, they know that you have the same secret key.
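The steps above can be sketched with Python's standard `hmac` module. This is only an illustration of the idea: the function names `sign` and `verify`, the sample key, and the request text are all made up, and Amazon's real scheme defines its own exact string to sign and header encoding, which this doesn't try to reproduce.

```python
import hashlib
import hmac


def sign(secret_key: bytes, request_text: bytes) -> str:
    """Client side: apply HMAC-SHA1 to the request text using the secret key."""
    return hmac.new(secret_key, request_text, hashlib.sha1).hexdigest()


def verify(stored_key: bytes, request_text: bytes, signature: str) -> bool:
    """Server side: recompute the HMAC with the stored key and compare."""
    expected = sign(stored_key, request_text)
    # compare_digest takes the same time whether or not the strings match.
    return hmac.compare_digest(expected, signature)


# Hypothetical values, for illustration only.
secret_key = b"example-secret-key"
request_text = b"GET /my-bucket/photo.jpg"

signature = sign(secret_key, request_text)

print(verify(secret_key, request_text, signature))             # True
print(verify(b"some-other-key", request_text, signature))      # False
print(verify(secret_key, b"DELETE /my-bucket/photo.jpg", signature))  # False
```

The secret key itself never goes over the wire, only the signature does, and a signature made for one request doesn't validate a different one.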
The difference between this and PKI is that this method is RESTful: every request carries its own authentication, so there's no separate handshake and the number of exchanges between your system and Amazon's servers stays at a minimum.
> Isn't that basically the same thing as asking me for my credit card numbers or password and storing that in their own database?
Yes, though the damage someone can do with S3 seems to be limited to draining your account.
> How secret do they need to be? Are these applications that use the secret keys storing it somehow?
At some point, you're going to have to load the secret key, and with most Unix-based systems, if an attacker can get root access they can get the key. If you encrypt the key, you have to have code to decrypt it, and at some point the decryption code has to be in plain text so it can be executed. This is the same problem DRM has, except that you own the computer.
In many cases, I just put secret keys in a file with limited permissions, and take the usual precautions to prevent my system from being rooted. There are a few tricks to make it work properly on a multiuser system, such as avoiding temporary files.
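Here's a minimal sketch of the limited-permissions approach, assuming a Unix-like system. The helper names `write_secret_key` and `load_secret_key` are hypothetical, and refusing to load a key that group or others can access is my own choice of precaution, not something any particular tool requires.

```python
import os
import stat


def write_secret_key(path: str, key: str) -> None:
    """Write the key to a file that only the owner can read or write."""
    # Create the file with mode 0600 from the start, so it is never
    # briefly readable by other users.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as f:
        # If the file already existed, O_CREAT leaves its old mode alone,
        # so set the mode explicitly as well.
        os.fchmod(f.fileno(), 0o600)
        f.write(key)


def load_secret_key(path: str) -> str:
    """Read the key, refusing if group or others have any access to the file."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    if mode & (stat.S_IRWXG | stat.S_IRWXO):
        raise PermissionError(
            f"{path} is accessible to group or others (mode {oct(mode)})"
        )
    with open(path) as f:
        return f.read().strip()
```

Writing straight to the final path with the right mode, rather than writing a temporary file and moving it, avoids leaving a readable copy of the key lying around.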