WebSockets

David Claveau
10 min read · Jun 26, 2021

This post provides a general overview of WebSockets, where and why they’re useful, and how they enable web apps to render data in real time. It covers what a WebSocket is, the differences between WebSockets and HTTP requests, how to set up and use WebSockets, and the efficiency of WebSockets in an app.

Man and woman talk using two cups with a string attaching them. Man is visibly confused.

Communication is inherently difficult.

This is a fact that doesn’t just relate to web applications. There are some 7,000 spoken languages, each with its own characters, syntax, and nuance: English, French, Japanese, Hindi, and Swahili, just to name a few. As if that weren’t hard enough to navigate, these languages are spoken across the globe. Say you want to communicate with someone in your language on the other side of the world. What’s the best way to go about doing this? Do you send letters, call them, or send an email? Over time, technological advancements enabled communication around the world to progress from slower methods (such as letters) to much faster, more reliable, and more efficient methods (telephones and email).

In many ways, the web has undergone similar advancements since its inception. Just as the way we exchange information from person to person across the globe has changed, the way information is exchanged between clients and servers has become faster and more efficient.

HTTP Requests

In the beginning, the HTTP request was all that the web needed. The client requests something; the server responds. HTTP requests are quite straightforward, but here is a quick surface-level overview:

  1. The client makes a request to the server (and makes a connection).
  2. Within this request, the request line and header fields describe what the client is requesting.
  3. The server takes the request and fetches the data or completes the process that the request describes.
  4. The server will send the response back to the client with information related to the request.
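  5. The exchange ends. In the original HTTP model the server closes the connection, and even where HTTP/1.1 keeps the connection open for reuse, the server cannot send anything more until the client asks again (this part is important, as we’ll see).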

When webpages were simple and all we needed from them was text or images, this strictly client-initiated style of communication was more than sufficient.

But as things changed, expectations from the web shifted; more data needed to be transferred and rendered, and webpages or web apps were expected to do more. Developers began to encounter more and more issues with the HTTP request/response structure — specifically when trying to create real-time applications that provided a faster communication style.*

*This isn’t to say that HTTP is falling by the wayside by any means. HTTP is still the standard and won’t be going anywhere anytime soon.

With the basic HTTP setup, every update needs its own request/response cycle: the client has to ask, the server has to respond, and the result has to be rendered by the client, traditionally in a new page load. With real-time updates, such as instant messages, this can prove to be clunky, slow, and inefficient.

A drawing of several cars driving single-file down a road

Long Polling as the Precursor

You may be asking, though: “What’s the problem? Surely techniques like AJAX can handle that for us and render information on the page, right?” You’re not wrong. We can use asynchronous requests (AJAX) to update the page and render the data as the response comes in from the server, basically creating real-time updates.

In fact, AJAX will work, and you’re more than likely to find a setup called long polling on websites. Long polling is similar (on the surface) to WebSockets: the client makes a request, but instead of answering right away, the server holds the request open until it has new data to send (or a timeout expires). As soon as the client receives the response, it immediately sends another request, so there is almost always one waiting on the server. This imitates an open connection using constant HTTP requests. While effective, it means that every update costs a full request, a full set of headers, and a full response.
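The long-polling loop can be sketched in a few lines of JavaScript. This is a minimal sketch rather than production code: `longPoll`, `poll`, and `maxRounds` are hypothetical names introduced for illustration, and `poll` stands in for whatever function performs one HTTP request (for example with `fetch`) and resolves once the server finally answers.

```javascript
// Long-polling client loop (illustrative sketch; all names are hypothetical).
// `poll` performs ONE HTTP request and resolves when the server answers:
// with a message, or with null if the server timed out with nothing new.
async function longPoll(poll, onMessage, maxRounds = Infinity) {
  let requests = 0;
  while (requests < maxRounds) {
    requests += 1;                 // every round is a full HTTP request, headers included
    const message = await poll();  // the server holds the request open until it has data
    if (message !== null) {
      onMessage(message);          // render the update...
    }
    // ...then loop around and immediately ask again
  }
  return requests;
}

// Example with a stand-in for the network: three rounds, one of them empty.
const pending = ['hello', null, 'world'];
const seen = [];
const done = longPoll(async () => pending.shift(), (m) => seen.push(m), 3);
done.then((requests) => console.log(requests, seen)); // 3 requests for 2 messages
```

Even in this toy run, three requests were needed to deliver two messages, which is the overhead the rest of this post is about.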

But let’s finally talk about WebSockets.

Drawing where several people communicate in various manners: instant messaging, presentations, polling.

WebSockets

WebSockets were first mentioned by Michael Carter and Ian Hickson in 2008 after much frustration when using long polling methods. In fact, we can see Hickson suggesting the name for WebSockets in an email to Carter from 2008.

In 2011, the Internet Engineering Task Force (IETF) published the WebSocket Protocol as RFC 6455, and today all major browsers (even our lazy friend, Internet Explorer) support WebSocket functionality. WebSockets are now as instrumental in rendering updates on a webpage as HTTP requests.

The biggest difference between HTTP requests and WebSockets is the manner in which data is transferred from client to server and back to the client. As we saw in the overview of HTTP above, requests and responses use the HTTP protocol on top of the Transmission Control Protocol (TCP). The headers of the HTTP request specify the data that is being transferred. Ultimately, this is summarized as: a request is made, a response is sent, and the exchange is over (in the simplest case, the TCP connection is closed).

WebSockets, however, stop speaking HTTP once the connection is established. They send their own lightweight message frames over the same TCP connection, which creates a bidirectional, full-duplex channel. This is a fancy way of saying that the client and the server can each send and receive information independently of the other, and at the same time.

How Do WebSockets Work?

Two people shaking hands
WebSockets require a “handshake” between client and server

While WebSockets don’t use HTTP requests once a connection is up and running, they were designed to fit into existing HTTP infrastructure for their initial connection. This means the first request from client to server is an ordinary HTTP request that asks to upgrade the connection to a WebSocket. The request will look something like this:

GET /chat HTTP/1.1
Host: server.example.com
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Key: x3JJHMbDL1EzLkh9GBhXDw==
Sec-WebSocket-Protocol: chat, superchat
Sec-WebSocket-Version: 13
Origin: http://example.com

To which the server will respond:

HTTP/1.1 101 Switching Protocols
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Accept: HSmrc0sMlYUkAGmm5OPpG2HaGWk=
Sec-WebSocket-Protocol: chat

When the server receives a WebSocket request from the client, it responds with a 101 message to confirm the change of protocol. When that happens, the Sec-WebSocket-Key and the Sec-WebSocket-Accept headers provide the handshake between client and server. The server concatenates the Sec-WebSocket-Key with a fixed “magic string” (258EAFA5-E914-47DA-95CA-C5AB0DC85B11), takes the SHA-1 hash of the result, and returns the base64 encoding of that hash as the Sec-WebSocket-Accept value.
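The derivation described above can be reproduced with Node’s built-in crypto module. This is a small sketch for illustration only: `computeAccept` is a hypothetical helper name, and in practice a WebSocket library performs this step for you.

```javascript
const { createHash } = require('node:crypto');

// Fixed GUID ("magic string") defined by the WebSocket protocol.
const WEBSOCKET_GUID = '258EAFA5-E914-47DA-95CA-C5AB0DC85B11';

// Derive the Sec-WebSocket-Accept value from the client's Sec-WebSocket-Key.
// `computeAccept` is a hypothetical helper name, not part of any library.
function computeAccept(secWebSocketKey) {
  return createHash('sha1')
    .update(secWebSocketKey + WEBSOCKET_GUID) // key + magic string
    .digest('base64');                        // base64 of the SHA-1 hash
}

console.log(computeAccept('x3JJHMbDL1EzLkh9GBhXDw=='));
// prints HSmrc0sMlYUkAGmm5OPpG2HaGWk=
```

Feeding in the key from the sample request yields exactly the Sec-WebSocket-Accept value shown in the sample response.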

While there’s a lot going on here (and there’s a bit more behind the scenes in Exchanging Data Frames), the key exchange is not encryption or authentication. It simply proves to the client that the server really understood the WebSocket upgrade request, rather than replaying a cached or unrelated HTTP response. Until the connection is terminated by the client or the server, data can be sent and received without separate requests.

Using WebSockets

Incorporating and using WebSockets is pretty straightforward, and there are numerous tutorials and guides for a variety of tech stacks. I chose to follow this YouTube tutorial by Karl Hadwen, which took about 20 minutes to set up and uses the ws package for Node.js. By the end of the tutorial, we had created a very simple messaging app that is shared by any number of users (multiple tabs in the browser).

Once the server is running, we can navigate to the index page of the JavaScript app to see the WebSocket in action. The tutorial uses an immediately invoked function to open the WebSocket. When we inspect the page using the Chrome Dev Tools, the “Network” tab shows that the connection is open and that a successful 101 Switching Protocols response was received from the server. We added a handy console log stating that the connection is open as well:

Inspect element of the webpage shows the websocket was successfully connected
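For readers who want to see the client side in code, here is a minimal sketch using the standard browser WebSocket API. It is not the tutorial’s code: `openChat`, the `WebSocketImpl` parameter, and the `ws://localhost:8080` address are assumptions made for illustration.

```javascript
// Browser-style client (illustrative sketch; `openChat` and the URL are hypothetical).
// `WebSocketImpl` defaults to the standard WebSocket constructor that browsers provide.
function openChat(url, onMessage, WebSocketImpl = globalThis.WebSocket) {
  const socket = new WebSocketImpl(url); // triggers the HTTP Upgrade handshake

  socket.addEventListener('open', () => {
    console.log('WebSocket connection is open');
  });

  // The server can push at any time; no request from the client is needed.
  socket.addEventListener('message', (event) => {
    onMessage(event.data);
  });

  return socket;
}

// In a page served alongside the WebSocket server, usage might look like:
//   const socket = openChat('ws://localhost:8080', (text) => console.log(text));
//   socket.addEventListener('open', () => socket.send('Hello from tab one'));
```

Passing the constructor in as a parameter is only a convenience for trying the function outside a browser; in a real page the default is all you need.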

With each message sent to the server, we can see in the “Messages” section that there’s an update with the information being passed, as well as the time when the message is coming through. In another tab in the browser, we see the same messages being rendered in real-time from the first tab’s input:

Inspect element of the webpage shows each message being sent and received by the websocket.
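The relay behaviour seen across the two tabs can be sketched on the server side as follows. Again, this is not the tutorial’s code: `broadcast` and `startServer` are hypothetical helpers, port 8080 is an assumption, and the wiring assumes version 8 or later of the ws package (installed with `npm install ws`).

```javascript
const OPEN = 1; // readyState value of an open socket (WebSocket.OPEN)

// Send `data` to every connected client whose socket is open.
// `broadcast` is a hypothetical helper, not part of the ws package.
function broadcast(clients, data, isBinary = false) {
  let delivered = 0;
  for (const client of clients) {
    if (client.readyState === OPEN) {
      client.send(data, { binary: isBinary });
      delivered += 1;
    }
  }
  return delivered;
}

// Wiring with the ws package (requires `npm install ws`; port 8080 is an assumption).
function startServer(port = 8080) {
  const { WebSocketServer } = require('ws');
  const wss = new WebSocketServer({ port });
  wss.on('connection', (socket) => {
    // Relay each incoming message to every open tab, the sender included.
    socket.on('message', (data, isBinary) => broadcast(wss.clients, data, isBinary));
  });
  return wss;
}
```

Calling `startServer()` and opening the page in two tabs should reproduce the behaviour in the screenshots: a message typed in one tab shows up in both.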

In fact, you’ll find WebSockets just about everywhere that live data and real-time interactions need to be rendered on screen immediately. One of the best examples is a stock-watching app, such as the one found on MarketWatch. All we need to do is inspect the page again, select “Network”, and we’ll see in “Messages” the changes in stock prices, the switches between green and red arrows, and just about anything else that should be updated dynamically.

Inspecting a stock-watching webpage shows the websocket updating the data in real-time

You’ll find WebSocket solutions and libraries for just about any tech stack.

WebSocket Efficiency

Remember how I kept mentioning that WebSockets don’t require HTTP headers and send information over TCP? That’s quite important to how efficient WebSocket data exchange is!

Because WebSockets don’t keep sending HTTP headers (after the initial request), they save a significant amount of data with each interaction. Why is this important? According to Google’s SPDY research project, HTTP headers vary from 200 bytes to 2 KB in size, and a typical HTTP header is 700–800 bytes. By comparison, a WebSocket frame adds only a handful of bytes (between 2 and 14) on top of the data itself, so each transfer is considerably smaller.
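To make the size difference concrete, here is a rough sketch of how a short text message is framed on the wire. It only covers unmasked text frames of up to 65,535 bytes, as a server would send them; frames sent by a browser additionally carry a 4-byte masking key, which this sketch leaves out. `encodeTextFrame` is a hypothetical helper name, not a library function.

```javascript
// Build a single unmasked WebSocket text frame (server-to-client direction).
// `encodeTextFrame` is a hypothetical helper, shown only to illustrate overhead.
function encodeTextFrame(text) {
  const payload = Buffer.from(text, 'utf8');
  const length = payload.length;
  let header;

  if (length < 126) {
    // Byte 0: FIN bit + text opcode (0x81). Byte 1: payload length.
    header = Buffer.from([0x81, length]);
  } else if (length < 65536) {
    // Length marker 126 means "the real length follows as 16 bits".
    header = Buffer.alloc(4);
    header[0] = 0x81;
    header[1] = 126;
    header.writeUInt16BE(length, 2);
  } else {
    throw new RangeError('64-bit payload lengths are omitted from this sketch');
  }

  return Buffer.concat([header, payload]);
}

const frame = encodeTextFrame('Hello');
console.log(frame.toString('hex'));                      // 810548656c6c6f
console.log(frame.length - Buffer.byteLength('Hello'));  // 2 bytes of framing overhead
```

Two bytes of framing for a five-byte message is a very different cost from several hundred bytes of headers on every HTTP request.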

A great visualization comes from David Luecke’s article “HTTP vs Websockets: A performance comparison”, where data transfer speed and efficacy were benchmarked between HTTP and WebSockets (using the Socket.io library). To summarize Luecke’s findings, using WebSockets was slower with single connections (as a WebSocket library requires time to establish an initial connection), but with even a dozen or more connections, the speed increased when using WebSockets. With 100 concurrent connections and 500 or more requests each, the WebSocket library provided a 400% performance boost.

Courtesy of David Luecke’s article. Benchmark results for 100 concurrent connections and 500 to 2000 requests each.

All in all, the more users and concurrent connections, the more efficient and effective a WebSocket will be in rendering a web app’s data.

How Deep the WebSocket Rabbit Hole Goes

Building a single WebSocket app as rudimentary as my example above is great, gets the idea across, and hopefully shows how useful WebSockets can be. But I’m sure you’re asking “what if I have more than one user and I don’t want everyone to see the information? What if I need to set up WebSockets for messages and notifications, while also making sure the data rendered is showing the right information to the right people? Are WebSockets all I’ll ever need?”

As mentioned briefly before (but worth reiterating), WebSockets don’t replace HTTP. HTTP is here to stay. As we saw, WebSockets are great for showing real-time updates, such as stock prices on a stock-watching homepage. It all depends on the app, but in most cases you’ll see WebSockets being used for messaging, notification, and live-update features, whereas HTTP will still be utilized for your RESTful routes.

Pusher empowers developers to create powerful realtime features at scale.

In fact, using WebSockets for instant messaging, notifications, or live data updates became so common that platforms began popping up to help build these common WebSocket features. Platforms like Pusher or Ably provide you (and/or your team) the ability to manage your WebSockets securely and with confidence using their software as a service. These platforms offer seamless integration to help set up and render common WebSocket features, such as instant messaging and notifications. Using their API, these platforms track and monitor the different WebSocket integrations as they are used by your users. This helps ensure that the right information is being sent out, and that any failures or edge cases you encounter are handled with ease.

As with most things in the tech world, you’ll probably ask yourself again “but do I really need products like Pusher or Ably to establish a WebSocket connection?” And like most things in the tech world, someone already answered it on StackOverflow.

Short answer: no, you don’t.

Long answer: no, you don’t, but it’s reassuring to have a framework, library, or platform watching your app’s back when things don’t work the way you expect them to.

Conclusion

Communication is inherently difficult.

But as technology progresses, humans adapt and alter their methods of communication. You could argue that developers are often hyper-aware of when a systematic change needs to be made. Long polling created barriers for developers like Hickson and Carter, and their suggestion of WebSockets (and its eventual acceptance as a protocol) led to a sweeping change in how features like messaging and notifications are rendered to users.

Of course, as users adapt to features created by WebSockets, expectations will most likely shift once more.
