Websocket transport reliability (Socket.io data loss during reconnection)

Others have hinted at this in other answers and comments, but the root problem is that Socket.IO is just a delivery mechanism, and you cannot depend on it alone for reliable delivery. The only party that knows for sure that a message has been delivered to the client is the client itself. For this kind of system, I would recommend designing around the following rules:

  1. Messages aren’t sent directly to clients; instead, they get sent to the server and stored in some kind of data store.
  2. Clients are responsible for asking “what did I miss” when they reconnect, and will query the stored messages in the data store to update their state.
  3. If a message is sent to the server while the recipient client is connected, that message will be sent in real time to the client.

Of course, depending on your application’s needs, you can tune pieces of this. For example, you can use a Redis list or sorted set for the messages, and clear them out once you know for a fact that a client is up to date.
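
The three rules above can be sketched as a small store. This is only an illustration: it assumes an in-memory `Map` as a stand-in for Redis or a database, and the class name `MessageStore` and its methods (`append`, `since`, `clearUpTo`) are hypothetical names, not part of Socket.IO.

```javascript
// Hypothetical in-memory stand-in for the data store (Redis, SQL, etc.).
class MessageStore {
  constructor() {
    this.nextId = 1;              // sequential ID shared by all messages
    this.byRecipient = new Map(); // recipient -> array of {id, payload}, in ID order
  }

  // Rule 1: every message is stored before any delivery is attempted.
  append(recipient, payload) {
    const message = { id: this.nextId++, payload };
    if (!this.byRecipient.has(recipient)) this.byRecipient.set(recipient, []);
    this.byRecipient.get(recipient).push(message);
    return message;
  }

  // Rule 2: "what did I miss?" returns everything after the last ID seen.
  since(recipient, lastSeenId) {
    const messages = this.byRecipient.get(recipient) || [];
    return messages.filter((m) => m.id > lastSeenId);
  }

  // Optional cleanup once the client is known to be up to date.
  clearUpTo(recipient, lastSeenId) {
    this.byRecipient.set(recipient, this.since(recipient, lastSeenId));
  }
}

const store = new MessageStore();
store.append('U1', 'hello');
store.append('U1', 'are you there?');
const missed = store.since('U1', 1);    // U1 has only seen message 1
store.clearUpTo('U1', 2);               // U1 confirmed everything up to 2
const remaining = store.since('U1', 0);
```

A sequential ID is used here rather than a timestamp so that "everything after X" is unambiguous even when two messages arrive in the same millisecond.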


Here are a couple of examples:

Happy path:

  • U1 and U2 are both connected to the system.
  • U2 sends a message to the server that U1 should receive.
  • The server stores the message in some kind of persistent store, marking it for U1 with some kind of timestamp or sequential ID.
  • The server sends the message to U1 via Socket.IO.
  • U1’s client confirms (perhaps via a Socket.IO callback) that it received the message.
  • The server deletes the persisted message from the data store.

Offline path:

  • U1 loses internet connectivity.
  • U2 sends a message to the server that U1 should receive.
  • The server stores the message in some kind of persistent store, marking it for U1 with some kind of timestamp or sequential ID.
  • The server tries to send the message to U1 via Socket.IO.
  • U1’s client does not confirm receipt, because it is offline.
  • Perhaps U2 sends U1 a few more messages; they all get stored in the data store in the same fashion.
  • When U1 reconnects, it asks the server “The last message I saw was X / I have state X, what did I miss?”
  • The server sends U1 all the messages it missed from the data store, based on U1’s request.
  • U1’s client confirms receipt and the server removes those messages from the data store.
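
Both paths can be sketched as one server-side routine. This is a rough illustration under stated assumptions: a plain array stands in for the persistent store, and `FakeSocket` is a hypothetical stand-in for a Socket.IO socket that only invokes the acknowledgement callback while the client is online. The names `deliver` and `resync` are made up for this example.

```javascript
// Hypothetical stand-in for a Socket.IO socket: the acknowledgement
// callback is invoked only while the client is "online".
class FakeSocket {
  constructor() {
    this.online = true;
    this.received = [];
  }
  emit(event, message, ack) {
    if (!this.online) return; // lost in transit, so no acknowledgement
    this.received.push(message);
    ack();
  }
}

const stored = []; // stand-in for the persistent store (messages for U1)
let nextId = 1;

// Delete a message from the store once the client has confirmed it.
function removeStored(message) {
  const index = stored.indexOf(message);
  if (index !== -1) stored.splice(index, 1);
}

// Store first, then attempt realtime delivery.
function deliver(socket, payload) {
  const message = { id: nextId++, payload };
  stored.push(message);
  socket.emit('message', message, () => removeStored(message));
}

// On reconnect the client reports the last ID it saw; resend what follows.
function resync(socket, lastSeenId) {
  const missed = stored.filter((m) => m.id > lastSeenId);
  for (const message of missed) {
    socket.emit('message', message, () => removeStored(message));
  }
}

const u1 = new FakeSocket();
deliver(u1, 'first');  // happy path: delivered, confirmed, deleted
u1.online = false;
deliver(u1, 'second'); // offline path: stored, never confirmed
deliver(u1, 'third');
u1.online = true;
resync(u1, 1);         // "the last message I saw was 1, what did I miss?"
```

After the resync, U1 has received all three messages in order and the store is empty again.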

If you absolutely want guaranteed delivery, then it’s important to design your system in such a way that being connected doesn’t actually matter, and that realtime delivery is simply a bonus; this almost always involves a data store of some kind. As user568109 mentioned in a comment, there are messaging systems that abstract away the storage and delivery of said messages, and it may be worth looking into such a prebuilt solution. (You will likely still have to write the Socket.IO integration yourself.)

If you’re not interested in storing the messages in the database, you may be able to get away with storing them in a local array; the server tries to send U1 the message, and stores it in a list of “pending messages” until U1’s client confirms that it received it. If the client is offline, then when it comes back it can tell the server “Hey I was disconnected, please send me anything I missed” and the server can iterate through those messages.

Luckily, Socket.IO provides an acknowledgement mechanism: the client can “respond” to a message through what looks like a native JS callback. Here is some pseudocode:

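// server (assumes `socket` is the client's socket and each message has a unique `id`)
// note: a reconnecting client gets a new socket on the server,
// so in practice keep this list per user rather than per socket
var pendingMessagesForSocket = [];

function removePending(message) {
  var index = pendingMessagesForSocket.indexOf(message);
  if (index !== -1) pendingMessagesForSocket.splice(index, 1);
}

function sendMessage(message) {
  pendingMessagesForSocket.push(message);
  // the callback runs only once the client acknowledges the message
  socket.emit('message', message, function() {
    removePending(message);
  });
}

socket.on('reconnection', function(lastKnownMessage) {
  // everything up to and including lastKnownMessage was already received
  var lastIndex = pendingMessagesForSocket.findIndex(function(m) {
    return lastKnownMessage && m.id === lastKnownMessage.id;
  });
  pendingMessagesForSocket.splice(0, lastIndex + 1);
  // you may want to make sure you resend them in order, or one at a time, etc.
  pendingMessagesForSocket.slice().forEach(function(message) {
    socket.emit('message', message, function() {
      removePending(message);
    });
  });
});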

// client
var previouslyConnected = false;
var lastKnownMessage = null;

// on the client the event is 'connect'; it fires again after each reconnect
socket.on('connect', function() {
  if (previouslyConnected) {
    socket.emit('reconnection', lastKnownMessage);
  } else {
    // first connection; any further connections means we disconnected
    previouslyConnected = true;
  }
});

socket.on('message', function(data, callback) {
  // Do something with `data`
  lastKnownMessage = data;
  callback(); // confirm we received the message
});

This is quite similar to the last suggestion, simply without a persistent data store.


You may also be interested in the concept of event sourcing.
