How does the python socket.recv() method know that the end of the message has been reached?

It depends on the protocol. Some protocols, like UDP, are message oriented: each recv returns exactly one message (datagram). Assuming you are talking about TCP specifically, there are several factors involved. TCP is stream oriented, and because of things like the amount of currently outstanding send/recv data, lost or reordered packets on the wire, delayed acknowledgement of data, and the Nagle algorithm (which holds back small sends while earlier data is still unacknowledged), its behavior can change subtly as a conversation between client and server progresses.

All the receiver knows is that it is getting a stream of bytes. Any single recv could return anything from 1 byte up to the full requested buffer size. There is no one-to-one correlation between the send call on one side and the recv call on the other.
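
Since one recv may return fewer bytes than you asked for, the usual pattern is to loop until you have what you need. Here is a minimal sketch; `recv_exact` is a hypothetical helper name, not something the socket module provides.

```python
import socket


def recv_exact(sock: socket.socket, n: int) -> bytes:
    """Read exactly n bytes, looping because recv may return fewer."""
    chunks = []
    remaining = n
    while remaining > 0:
        chunk = sock.recv(remaining)
        if not chunk:
            # An empty result means the peer closed the connection.
            raise ConnectionError(
                f"peer closed with {remaining} bytes still expected"
            )
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)
```

Note that this only helps when you already know how many bytes to expect, which is exactly what a higher-level protocol has to tell you.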

If you need to figure out message boundaries, it's up to the higher-level protocol to define them. Take HTTP for example. It starts with a header made of \r\n-terminated lines and ended by a blank line, and that header typically includes a Content-Length field giving the number of body bytes the client should expect. The client knows where the header ends because of the blank line, and then knows exactly how many bytes are coming next. Part of the charm of RESTful protocols is that they are HTTP based and somebody else already figured this stuff out!
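
To make the HTTP idea concrete, here is a rough sketch of reading one HTTP-style message by hand. It assumes the header carries a Content-Length field and ignores chunked transfer encoding and everything else a real HTTP library handles; `read_http_like_message` is a made-up name for illustration.

```python
import socket


def read_http_like_message(sock: socket.socket):
    """Read a header block ending in a blank line, then Content-Length bytes."""
    buf = b""
    # Keep reading until the blank line that ends the header has arrived.
    while b"\r\n\r\n" not in buf:
        chunk = sock.recv(4096)
        if not chunk:
            raise ConnectionError("peer closed before the header was complete")
        buf += chunk
    head, _, body = buf.partition(b"\r\n\r\n")

    # Look for Content-Length among the header lines (skip the start line).
    length = 0
    for line in head.split(b"\r\n")[1:]:
        name, _, value = line.partition(b":")
        if name.strip().lower() == b"content-length":
            length = int(value.strip())

    # Part of the body may already be in buf; read whatever is still missing.
    while len(body) < length:
        chunk = sock.recv(4096)
        if not chunk:
            raise ConnectionError("peer closed before the body was complete")
        body += chunk
    # Simplification: any bytes past `length` (a following message) are dropped.
    return head, body[:length]
```

In practice you would let an HTTP library do this for you; the point is only that the header tells the reader where the message ends.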

Some protocols use NUL to delimit messages. Others may have a fixed length binary header that includes a count of any variable data to come. I like zeromq which has a robust messaging system on top of TCP.
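
As an illustration of the fixed-length binary header approach, here is a small sketch that prefixes every message with a 4-byte big-endian length. The 4-byte size and the names `send_msg`, `recv_msg`, and `_read_n` are my own choices for the example, not part of any standard.

```python
import socket
import struct

# Assumed framing: 4-byte unsigned big-endian length, then the payload.
HEADER = struct.Struct("!I")


def send_msg(sock: socket.socket, payload: bytes) -> None:
    """Send one length-prefixed message."""
    sock.sendall(HEADER.pack(len(payload)) + payload)


def _read_n(sock: socket.socket, n: int) -> bytes:
    """Read exactly n bytes, since recv may return fewer."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed mid-message")
        buf += chunk
    return bytes(buf)


def recv_msg(sock: socket.socket) -> bytes:
    """Receive one length-prefixed message."""
    (length,) = HEADER.unpack(_read_n(sock, HEADER.size))
    return _read_n(sock, length)
```

With this in place, each `recv_msg` returns exactly what one `send_msg` sent, no matter how TCP split or merged the bytes in between.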

More details on what happens with receive…

When you do recv(1024), there are 6 possibilities:

  1. There is no receive data. recv will wait until there is receive data. You can change that by setting a timeout.

  2. There is partial receive data. You’ll get that part right away. The rest is either buffered or hasn’t been sent yet and you just do another recv to get more (and the same rules apply).

  3. There are more than 1024 bytes available. You’ll get 1024 bytes of that data and the rest stays buffered in the kernel waiting for another receive.

  4. The other side has shut down the socket. You’ll get 0 bytes of data. 0 means you will never get more data on that socket. But if you keep asking for data, you’ll keep getting 0 bytes.

  5. The other side has reset the socket. You’ll get an exception.

  6. Some other strange thing has gone on and you’ll get an exception for that.
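
The cases above map onto code roughly like this. It is only a sketch: `read_once` and the status strings it returns are invented for the example, and the 5 second default timeout is arbitrary.

```python
import socket


def read_once(sock: socket.socket, bufsize: int = 1024, timeout: float = 5.0):
    """Do a single recv and report which of the cases above occurred."""
    sock.settimeout(timeout)  # case 1: don't wait forever for data
    try:
        data = sock.recv(bufsize)
    except socket.timeout:
        return "timeout", b""  # case 1: nothing arrived in time
    except ConnectionResetError:
        return "reset", b""  # case 5: the peer reset the connection
    except OSError:
        return "error", b""  # case 6: some other socket error
    if not data:
        return "closed", b""  # case 4: peer shut down, no more data ever
    # Cases 2 and 3: between 1 and bufsize bytes; call again for the rest.
    return "data", data
```

A real program would usually call something like this in a loop and feed the returned bytes into whatever framing logic its protocol uses.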
