webSockets and regular sockets are not the same thing. A webSocket runs over a regular socket, but runs its own connection scheme, security scheme and framing protocol on top of the regular socket and both endpoints must follow those additional steps for a connection to even be made. You can see the webSocket protocol here: https://www.rfc-editor.org/rfc/rfc6455
The biggest difference right away is that ALL webSocket connections start with an HTTP request from client to server. The client sends an HTTP request to the exact same server and port that is open for normal web communication (port 80 by default for ws://, or 443 for wss:// over TLS, but if the web server is running on a different port, then the webSocket communication follows it on that other port). The client sets a few custom headers, the most important of which indicates that the client wishes to "upgrade" to the webSocket protocol. In addition, the client sends a random key and the server replies with a value derived from it, which shows that the server actually understood the webSocket request. If the server agrees to the "upgrade", then both client and server switch the protocol being spoken over that original socket from HTTP to webSocket, and from then on the webSocket framing protocol is used.
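As a rough illustration of the client side of that handshake, here is a minimal Python sketch that only builds the text of the upgrade request. The function name `build_upgrade_request` is made up for this example, and a real client would also need to send the request over a socket and validate the server's reply.

```python
import base64
import os


def build_upgrade_request(host: str, path: str = "/") -> tuple[str, str]:
    """Return (request_text, key) for a webSocket opening handshake.

    Hypothetical helper for illustration; not part of any library.
    """
    # The key is 16 random bytes, base64-encoded (RFC 6455, section 4.1).
    key = base64.b64encode(os.urandom(16)).decode("ascii")
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Upgrade: websocket",
        "Connection: Upgrade",
        f"Sec-WebSocket-Key: {key}",
        "Sec-WebSocket-Version: 13",
        "",  # blank line terminates the HTTP headers
        "",
    ]
    return "\r\n".join(lines), key


request_text, key = build_upgrade_request("example.com:8000", "/chat")
print(request_text)
```

The key is different on every connection, so the printed request will vary from run to run.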
In addition, the initial HTTP request can have a request path in it to indicate a "sub-destination" for the webSocket request. This allows all sorts of different webSocket requests to all be initiated with the same server and port.
There is also an optional sub-protocol specifier, the Sec-WebSocket-Protocol header, which lets the request name one or more sub-protocols (a common example is "chat") so that both sides can agree on a specific set of message identifiers and their corresponding meaning.
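To make the sub-protocol negotiation concrete, here is a small sketch of how a server might pick one of the sub-protocols the client offered. The function `choose_subprotocol` and the set of supported names are assumptions for illustration only; the client lists its choices in order of preference and the server echoes back at most one of them.

```python
def choose_subprotocol(header_value, supported):
    """Pick the first client-offered sub-protocol that the server supports.

    header_value: raw Sec-WebSocket-Protocol value, e.g. "chat, superchat".
    supported: set of sub-protocol names this (hypothetical) server speaks.
    Returns the chosen name, or None if there is no overlap.
    """
    offered = [p.strip() for p in header_value.split(",") if p.strip()]
    for proto in offered:
        if proto in supported:
            return proto
    return None


print(choose_subprotocol("chat, superchat", {"superchat", "chat"}))  # chat
print(choose_subprotocol("mqtt", {"chat"}))                          # None
```

If the result is None, the server would simply leave the Sec-WebSocket-Protocol header out of its response.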
The fact that a webSocket connection starts with an HTTP connection is critically important because if you can reach the web server for normal web communication, then you can reach it for a webSocket request without any networking infrastructure anywhere between client and server having to open new holes in the firewall or open new ports or anything like that.
You can see an excellent summary of how a webSocket connection is started here: https://developer.mozilla.org/en-US/docs/WebSockets/Writing_WebSocket_servers.
The webSocket protocol also defines ping and pong frames that help both sides know whether an idle webSocket is still connected. An endpoint that receives a ping is expected to answer with a pong carrying the same payload.
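For a sense of what those look like on the wire, here is a minimal sketch that builds unmasked ping and pong frames, as a server would send them (frames sent by a client must additionally be masked). The helper names are invented for this example; the opcodes 0x9 and 0xA come from RFC 6455.

```python
OPCODE_PING = 0x9
OPCODE_PONG = 0xA


def build_control_frame(opcode: int, payload: bytes = b"") -> bytes:
    """Build an unmasked control frame (hypothetical helper)."""
    # Control frames may carry at most 125 bytes of payload.
    if len(payload) > 125:
        raise ValueError("control frame payload must be 125 bytes or fewer")
    first_byte = 0x80 | opcode  # FIN=1; control frames are never fragmented
    return bytes([first_byte, len(payload)]) + payload


def pong_for(ping_payload: bytes) -> bytes:
    """Build the pong that answers a ping; it echoes the ping's payload."""
    return build_control_frame(OPCODE_PONG, ping_payload)


ping = build_control_frame(OPCODE_PING, b"Hello")
print(ping.hex())               # 890548656c6c6f
print(pong_for(b"Hello").hex())  # 8a0548656c6c6f
```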
One can only assume that the reason it took a while to get webSockets into all common browsers is the same reason that lots of useful capabilities took a while. First, a group of motivated folks has to identify and agree upon a need. Then that group needs to take the lead in developing an approach to solve the problem. The idea gets kicked around for a while, either gathering support and dealing with objections or competing with alternate ways of solving the problem, until it appears to have enough momentum to become a standard. Then someone decides to do a test/trial implementation in a browser along with a matching server implementation (sometimes this step comes much earlier).
If it's still finding momentum and appears to be on a standards track, other browser makers pick up the idea and start on their own implementations. There are usually rounds of standards improvement along the way, as different implementations find holes in the specification, early developers identify problems or missing capabilities, or security issues arise. Eventually it gets to the point where at least two major browsers have the feature in their latest releases, the standard is considered relatively solid, consumers start to adopt those browsers, and some sites start to improve their user experience by using the new capability. At that point, the trailing browsers start to feel pressure to implement it. Then, sometimes years later, all major browsers have the feature, and those browsers have enough overall user adoption that web sites can rely on it (without having to have a major second fallback design that works when a browser doesn't support the feature). This entire process can take many, many years.
Here's an example of the initial HTTP request to initiate a webSocket connection:
GET /chat HTTP/1.1
Host: example.com:8000
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==
Sec-WebSocket-Version: 13
And, the server response:
HTTP/1.1 101 Switching Protocols
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Accept: s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
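The Sec-WebSocket-Accept value in that response is not arbitrary: per RFC 6455, the server appends a fixed GUID to the client's Sec-WebSocket-Key, takes the SHA-1 hash, and base64-encodes the result. A minimal sketch (the function name `compute_accept` is just for this example):

```python
import base64
import hashlib

# Fixed GUID defined by RFC 6455 for the opening handshake.
WEBSOCKET_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def compute_accept(key: str) -> str:
    """Derive the Sec-WebSocket-Accept value from a Sec-WebSocket-Key."""
    digest = hashlib.sha1((key + WEBSOCKET_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


print(compute_accept("dGhlIHNhbXBsZSBub25jZQ=="))
# s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
```

A client can run the same computation on the key it sent and compare the result against the header it gets back.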
And, a data frame example:
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-------+-+-------------+-------------------------------+
|F|R|R|R| opcode|M| Payload len |    Extended payload length    |
|I|S|S|S|  (4)  |A|     (7)     |             (16/64)           |
|N|V|V|V|       |S|             |   (if payload len==126/127)   |
| |1|2|3|       |K|             |                               |
+-+-+-+-+-------+-+-------------+ - - - - - - - - - - - - - - - +
|     Extended payload length continued, if payload len == 127  |
+ - - - - - - - - - - - - - - - +-------------------------------+
|                               |Masking-key, if MASK set to 1  |
+-------------------------------+-------------------------------+
| Masking-key (continued)       |          Payload Data         |
+-------------------------------- - - - - - - - - - - - - - - - +
:                     Payload Data continued ...                :
+ - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - +
|                     Payload Data continued ...                |
+---------------------------------------------------------------+
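To show how those fields are read in practice, here is a simplified Python sketch that decodes a single complete frame held in a bytes object. The function name `decode_frame` is invented for this example, and it assumes the whole frame has already been received; it ignores the RSV bits, fragmentation and partial reads that a real implementation has to handle.

```python
import struct


def decode_frame(data: bytes):
    """Decode one complete webSocket frame; returns (fin, opcode, payload)."""
    fin = bool(data[0] & 0x80)     # top bit of the first byte
    opcode = data[0] & 0x0F        # low 4 bits of the first byte
    masked = bool(data[1] & 0x80)  # top bit of the second byte
    length = data[1] & 0x7F        # 7-bit payload length
    offset = 2
    if length == 126:              # next 2 bytes hold the real length
        (length,) = struct.unpack_from("!H", data, offset)
        offset += 2
    elif length == 127:            # next 8 bytes hold the real length
        (length,) = struct.unpack_from("!Q", data, offset)
        offset += 8
    mask = b""
    if masked:                     # 4-byte masking key (client-to-server)
        mask = data[offset:offset + 4]
        offset += 4
    payload = data[offset:offset + length]
    if masked:
        # Unmask by XOR-ing each byte with the key, cycling every 4 bytes.
        payload = bytes(b ^ mask[i % 4] for i, b in enumerate(payload))
    return fin, opcode, payload


# Masked text frame containing "Hello" (example bytes from RFC 6455).
print(decode_frame(bytes.fromhex("818537fa213d7f9f4d5158")))
# (True, 1, b'Hello')
```

Opcode 1 is a text frame; the same function returns opcode 9 or 10 for the ping and pong frames.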