C# TcpClient and NetworkStream usage behaviour

Developing an application on top of TcpClient, TcpListener and the underlying NetworkStream can be confusing at first, because the way data is sent and read may not be what you expect. The MSDN documentation is not very explicit about how it actually works, leaving developers to fall back on frustrating trial and error.

Most people have no problem establishing a connection between a TcpClient and a TcpListener. The problem is trying to NetworkStream.BeginRead() what was NetworkStream.Write()-ed on the other end. NetworkStream.DataAvailable and TcpClient.Available join the fun and, together with a bunch of exceptions, can make things complicated. This post aims to clarify how NetworkStream behaves in general.

Imagine NetworkStream as a river. NetworkStream.Write() is then really about dumping data into the river. As long as the river has enough capacity, NetworkStream.Write() keeps dumping your data; otherwise it waits for the next opportunity and then continues dumping until there is nothing more to dump. Consecutive calls to NetworkStream.Write() just mean that you are adding more data to be dumped. There is no promise of when the data will enter the stream, as this depends on several factors such as connection quality and capacity, but TCP ensures that the data is dumped in such a way that you receive it in the same order downstream.
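To make the river picture concrete, here is a minimal sketch that dumps data with two separate Write() calls over a loopback connection and collects it downstream. The class and method names (WriteDemo, SendTwoWritesAndReceive) are made up for illustration, and the loopback setup is only an assumption to keep the example self-contained.

```csharp
using System;
using System.IO;
using System.Net;
using System.Net.Sockets;
using System.Text;

public static class WriteDemo
{
    // Hypothetical demo: two Write() calls upstream, one ordered run of bytes downstream.
    public static string SendTwoWritesAndReceive()
    {
        // Port 0 asks the OS to pick any free port.
        var listener = new TcpListener(IPAddress.Loopback, 0);
        listener.Start();
        try
        {
            int port = ((IPEndPoint)listener.LocalEndpoint).Port;
            using (var sender = new TcpClient())
            {
                sender.Connect(IPAddress.Loopback, port);
                using (var receiver = listener.AcceptTcpClient())
                {
                    NetworkStream upstream = sender.GetStream();
                    byte[] first = Encoding.ASCII.GetBytes("Hello");
                    byte[] second = Encoding.ASCII.GetBytes("World");

                    // Two separate dumps into the river.
                    upstream.Write(first, 0, first.Length);
                    upstream.Write(second, 0, second.Length);

                    // Downstream there is no marker between the two writes,
                    // only bytes in the order they were sent.
                    NetworkStream downstream = receiver.GetStream();
                    byte[] buffer = new byte[first.Length + second.Length];
                    int total = 0;
                    while (total < buffer.Length)
                    {
                        int n = downstream.Read(buffer, total, buffer.Length - total);
                        if (n == 0)
                            throw new EndOfStreamException("Connection closed early.");
                        total += n;
                    }
                    return Encoding.ASCII.GetString(buffer);
                }
            }
        }
        finally
        {
            listener.Stop();
        }
    }
}
```

Calling WriteDemo.SendTwoWritesAndReceive() gives back "HelloWorld": the order is preserved, but nothing on the receiving side tells you where the first Write() ended and the second began.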

Downstream on the receiving end, there is a dam stopping the data flow. NetworkStream.Read() opens the dam for a while and lets the data flow into a water basin before it shuts again. It has to shut the dam because the water basin has a limited capacity of TcpClient.ReceiveBufferSize. Finally, the workers in NetworkStream.Read() transfer the data from the water basin to the storage buffer you provide so that you can do your own processing. You can specify how long to keep the dam open, measured in bytes rather than time, by using the length parameter in NetworkStream.Read(buffer, offset, length). Note, however, that NetworkStream.Read() only opens the dam once: it returns the number of bytes it actually transferred, which can be less than length, and it will only fill up to TcpClient.ReceiveBufferSize and nothing more even if you have a very large storage buffer. This means you will have to call NetworkStream.Read() several times to fill your storage buffer.
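Opening the dam repeatedly until the storage buffer is full can be sketched as a small loop. The helper below (StreamReadHelper.ReadFully is a made-up name) is written against the base Stream class, which NetworkStream derives from, so it is an illustration under that assumption rather than anything built into TcpClient.

```csharp
using System;
using System.IO;

public static class StreamReadHelper
{
    // Hypothetical helper: calls Read() as many times as needed until
    // exactly `count` bytes have been placed into `buffer` at `offset`.
    public static void ReadFully(Stream stream, byte[] buffer, int offset, int count)
    {
        int total = 0;
        while (total < count)
        {
            // Read() may return fewer bytes than requested, so ask only
            // for what is still missing and advance by what was received.
            int n = stream.Read(buffer, offset + total, count - total);
            if (n == 0)
            {
                // Zero bytes means the other end has closed the connection.
                throw new EndOfStreamException("Connection closed before all data arrived.");
            }
            total += n;
        }
    }
}
```

With a connected client you would call something like StreamReadHelper.ReadFully(client.GetStream(), buffer, 0, buffer.Length). If you are on .NET 7 or later, the built-in Stream.ReadExactly method does the same job.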

In this data transfer model, you can see that there is no automatic coordination between Write() and Read(). While Write() just keeps dumping data down the line, your application has to decide when and how long to open the dam so that you can separate the data into logical units that can be converted back to their original types.

One way to achieve coordination within your application is for the server and client to agree to always send a fixed-length header indicating the size of the upcoming data before sending the actual data. The fixed-length header can simply be a 4-byte Int32. This means that downstream, your application will open the dam for 4 bytes of data and then shut it. With the 4 bytes in your storage buffer, convert them to an Int32 to find out the size of the upcoming payload in bytes, e.g. 12.8 MB, then open the dam again for another 12.8 MB. In most cases, your water basin will not have 12.8 MB of capacity, so you may need to open and close the dam several times, e.g. 4 MB three times, then 0.8 MB once.
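The length-prefix agreement described above could look roughly like this. It is a sketch, not a definitive implementation: the class and method names (LengthPrefixedFraming, WriteMessage, ReadMessage) are invented for illustration, and sending the 4-byte header in network byte order is an added assumption so that both ends agree on how to interpret it.

```csharp
using System;
using System.IO;
using System.Net;

public static class LengthPrefixedFraming
{
    // Hypothetical sender side: a 4-byte Int32 length, then the payload itself.
    public static void WriteMessage(Stream stream, byte[] payload)
    {
        // Assumption: the header travels in network (big-endian) byte order.
        byte[] header = BitConverter.GetBytes(IPAddress.HostToNetworkOrder(payload.Length));
        stream.Write(header, 0, header.Length);
        stream.Write(payload, 0, payload.Length);
    }

    // Hypothetical receiver side: open the dam for 4 bytes, then for the payload.
    public static byte[] ReadMessage(Stream stream)
    {
        byte[] header = new byte[4];
        ReadFully(stream, header, header.Length);
        int length = IPAddress.NetworkToHostOrder(BitConverter.ToInt32(header, 0));
        if (length < 0)
            throw new InvalidDataException("Negative length prefix.");

        byte[] payload = new byte[length];
        ReadFully(stream, payload, length);
        return payload;
    }

    // Keeps calling Read() until `count` bytes have arrived, because a
    // single Read() may deliver only part of what was asked for.
    private static void ReadFully(Stream stream, byte[] buffer, int count)
    {
        int total = 0;
        while (total < count)
        {
            int n = stream.Read(buffer, total, count - total);
            if (n == 0)
                throw new EndOfStreamException("Connection closed mid-message.");
            total += n;
        }
    }
}
```

Both methods accept any Stream, so on a real connection you would pass client.GetStream(). Because each message carries its own size, consecutive messages can be pulled apart again on the receiving end no matter how the bytes were split up in transit.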