
Welcome to this extract from the book Network Programming in Elixir and Erlang.

This is one of the introductory chapters in the book, but formatted as a standalone HTML page. This means that the layout is very different to the PDF and paper versions of the book, but is fairly similar to what you'd see in the book's EPUB version.

Hyperlinks to book content outside this chapter will not work.

To read the full text, please buy the book.

Enjoy!

Chapter 2

TCP: Exploring the Basics

TCP is the most commonly used network protocol around. TCP stands for Transmission Control Protocol, so beware: it’s not the “TCP protocol”—that would be redundant!

TCP powers a lot of the Internet. Browsers talk to servers over HTTP, which is built on top of TCP. Most databases and queue systems also use TCP to communicate with clients. It’s easy to work with, and it’s the foundation you need to know to get started with network applications. Throughout this chapter, you’ll create a TCP client that connects to an existing TCP server, then build a simple TCP server yourself. In the process, we’ll get an idea of how more complex TCP servers work under the hood.

Let’s start things off with a quick overview of this network protocol.

TCP 101

Unless you’re working in less mainstream corners of the Elixir world, it’s likely that TCP is what you’ll directly use the most. TCP is the most ubiquitous transport layer protocol for network applications. In this section, we’ll write a simple TCP client to connect to an existing TCP server to cover the basics of TCP. Let’s start with its client-server architecture.

Clients and Servers

It might be a generalization, but you’ll find that most interactions on networks can be modeled as a server and a client talking to each other. This distinction is important. A server is a computer program that uses a protocol to listen on some sort of network interface. The protocol is generally a transport layer protocol underneath (TCP or UDP), but the interface provided by the server can be one that uses an application layer protocol (such as HTTP).

This is definitely true for TCP. In TCP, a server has to actively listen for incoming connections on a specific address and port combination. A TCP client can then establish a connection to that server to exchange data with it. The following figure is a rough representation of a server talking with multiple clients.

Computer network diagram showing Server connected to multiple Clients through a Network cloud containing routers and switches.

When a client connects to a server, a connection is opened on each side. This connection is persistent and remains open until one of the two sides terminates it. This is possible because TCP is a stateful protocol. Clients and servers have to store some state in the connection. TCP uses this state to keep track of packet order and deal with duplicated packets. Next up, sockets.

Sockets

TCP connections are always represented by an operating-system-level abstraction: the socket. The term socket dates back to RFC 147[4] in 1971, so we’re talking about pretty old stuff here. Sockets are the mechanism that the OS provides to give programs access to the network.

You can imagine a network socket as an actual wall-mounted telephone socket. You plug in your application and send and receive phone calls (data). The phone lines, or the network in our case, transport the data to another socket mounted on another wall, where another phone (a server) is listening for calls (connections).

Socket programming concept showing 'Your Program' (represented by a gear) connecting to 'The Network' (cloud) through an OS Socket interface.

You’ll see the name socket pretty much everywhere when dealing with network-related code. Most libraries and language bindings that provide interfaces for working with network protocols use the same terminology. Elixir and Erlang are no exception.

A language’s standard libraries use the OS socket APIs under the hood. At the OS level, a socket is a data structure that contains a bunch of data related to the connection. It holds information about the source and destination of the traffic, the buffered in-flight data, and the protocol being used. On that note, sockets are not specific to TCP. They’re independent of the network type and the protocol. If you want to learn more about sockets at the OS level, Rutgers University provides a great introduction to socket programming.[5]

In Elixir and Erlang, the socket abstraction is represented by a port. A port is an Erlang primitive data type used to communicate with external programs—not just network sockets. Ports share many traits with processes, such as passing data around through messages. Furthermore, ports are linked to the process that opens them. This is a desirable behavior for network applications: you don’t want your application to leave open sockets behind, and in this case just terminating your application (and all its processes) is enough to ensure all sockets get closed as well. If you want to read more about ports, the Erlang documentation[6] is a great resource.

It might be a little confusing, but an Erlang port has nothing to do with a network port, which is an entirely different concept. We’ll look at that next as we briefly touch on how TCP routes traffic.

IPs and Ports

TCP is also known as TCP/IP. IP stands for Internet Protocol. It’s the most common network protocol (layer 3 of the OSI model; see Appendix 1, The OSI Model). Its job is to route packets from sources to destinations. Together, TCP and IP do that by annotating packets with two pieces of information: an address, carried by IP, and a port, carried by TCP.

Addresses, also known as IP addresses, identify hosts on the network. They look like four-element a.b.c.d strings, where each letter represents an integer from 0 to 255—for example, 140.82.121.3.

In network-speak, ports are a way to route packets to different programs within the same host. They are 16-bit integers, so they range from 0 to 65535, although port 0 is reserved and not used for regular traffic. For example, it’s standard to use port 443 to listen for HTTPS traffic and port 22 for SSH traffic. These ports have nothing to do with the Erlang data type. They just (sadly) share the same name.

You’ll sometimes see address-port combinations represented with a colon separating them, as in 140.82.121.3:443.
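To make the address-port notation concrete, here is a minimal sketch that turns a string such as 140.82.121.3:443 into the tuple form that Erlang's networking modules use for addresses. The AddressPort module and its parse/1 function are hypothetical names made up for this example, and the sketch only handles IPv4 addresses. :inet.parse_address/1 is part of Erlang's standard distribution.

```elixir
defmodule AddressPort do
  # Hypothetical helper for this example (IPv4 only).
  # Splits "a.b.c.d:port" into {{a, b, c, d}, port}.
  def parse(string) do
    with [host, port] <- String.split(string, ":", parts: 2),
         {port, ""} <- Integer.parse(port),
         true <- port in 0..65_535,
         {:ok, address} <- :inet.parse_address(String.to_charlist(host)) do
      {:ok, {address, port}}
    else
      _ -> :error
    end
  end
end

parsed = AddressPort.parse("140.82.121.3:443")
IO.inspect(parsed)
# => {:ok, {{140, 82, 121, 3}, 443}}
```

The four-element tuple is the "raw IP address" form that functions such as :gen_tcp.connect/4 accept in place of a hostname.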

Now, enough talk—let’s see some code.

Writing a TCP Client with the gen_tcp Module

Writing a TCP server is slightly harder than writing a TCP client, so let’s look at clients first. Lucky for us, there are literally millions of TCP servers all around that we can connect to. Let’s start by writing a little program that opens a TCP socket by connecting to a TCP server listening at tcpbin.com on port 4242. This little service built by Harry Bagdi provides a simple TCP server that echoes back lines of data that we send to it. It’s perfect for starting out. You can find more details on the website.[7]

In this chapter (and most of this book), we’ll use modules that are part of Erlang’s standard distribution. When it comes to TCP, Erlang ships with the gen_tcp module.[8] gen_tcp provides a complete API for working with TCP, from creating client and server sockets to sending data and receiving data. To open a new connection, the function you want is :gen_tcp.connect/4.[9] Open an Elixir interactive shell by typing iex in your terminal, and then type the following:

 iex> {:ok, socket} =
 ...>   :gen_tcp.connect(~c"tcpbin.com", 4242, [:binary], 5000)
 {:ok, #Port<0.6>}

The first argument is an Erlang string with the address to connect to. It could also be a raw IP address, but we’ll get to that a bit later. The second argument is the port that the server is listening on. We’re using 4242 here because that’s what tcpbin.com uses. The third argument is a list of options, and the fourth argument is a connection timeout in milliseconds. Milliseconds are the “lingua franca” for timeouts on the BEAM (see helpers such as timer:minutes/1[10] or Kernel.to_timeout/1[11]). The :binary option is important: if we don’t pass it, all data received through the socket will be delivered as charlists rather than binaries. We generally don’t want that, since binaries are more efficient and usually easier to work with.

The return value is an {:ok, socket} tuple. socket is a data structure that wraps the OS-level socket we talked about earlier. This returned socket identifies our client connection, and we can use it to exchange data with the server. Elixir prints the socket as #Port<...> because a socket is an Erlang port.[12] We’ll talk more about ports later in the book. The other possible return value of :gen_tcp.connect/4 is {:error, reason} in the case that something goes wrong. Returning {:ok, value} or {:error, reason} is a common pattern in Elixir and Erlang when dealing with functions that can fail. In the case of :gen_tcp.connect/4 and most other :gen_tcp functions, reason is a representation of a POSIX error code, such as ECONNREFUSED (represented as :econnrefused). The Erlang documentation has a whole section[13] on POSIX error codes that you can use for reference.
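As a rough illustration of the {:ok, socket} and {:error, reason} return values, the following sketch wraps :gen_tcp.connect/4 in a case expression. ConnectExample, connect/3, and describe/1 are hypothetical names for this example. :inet.format_error/1 is a standard Erlang function that turns an error atom into descriptive text. To provoke an error without depending on an external server, the sketch asks the OS for a free local port, closes the listener again, and then tries to connect to it.

```elixir
defmodule ConnectExample do
  # Hypothetical helper, not part of :gen_tcp.
  def connect(host, port, timeout \\ 5_000) do
    options = [:binary, active: false]

    case :gen_tcp.connect(String.to_charlist(host), port, options, timeout) do
      {:ok, socket} ->
        {:ok, socket}

      {:error, reason} ->
        {:error, {reason, describe(reason)}}
    end
  end

  # :inet.format_error/1 returns a charlist describing the error atom.
  def describe(reason) do
    reason |> :inet.format_error() |> List.to_string()
  end
end

# Find a local port with nothing listening on it: bind to an OS-assigned
# port (port 0), remember its number, and close the listener again.
{:ok, listen_socket} = :gen_tcp.listen(0, [])
{:ok, free_port} = :inet.port(listen_socket)
:ok = :gen_tcp.close(listen_socket)

result = ConnectExample.connect("127.0.0.1", free_port, 1_000)
IO.inspect(result)
```

On most systems the reason here is :econnrefused, but the exact atom depends on the operating system, so the sketch doesn't assume a specific one.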

Well, time to send some data. To do that, we can use :gen_tcp.send/2:

 iex> :gen_tcp.send(socket, "Hello, world!\n")
 :ok

The data is a binary and ends with a newline character (\n). Okay, where’s our response? Shouldn’t we get "Hello, world!\n" echoed back somehow? It turns out that sending data is more straightforward than receiving data. Let’s rip the Band-Aid off sooner rather than later and explore socket modes.

Active and Passive Modes for Sockets

The response from the server didn’t show up in our previous IEx session, because it was delivered to the current process as a message. We can verify that with the flush/0 IEx helper,[14] which we can use to print out all the messages that we received in our IEx session:

 iex> flush()
 {:tcp, #Port<0.6>, "Hello, world!\n"}

You could also use Process.info(self(), :messages)[15] to inspect the messages without removing them from the mailbox.

:gen_tcp sockets can generally be in one of two modes: active mode or passive mode. By default, they start in active mode. In this mode, :gen_tcp delivers all the data that the socket receives—as well as some socket-related information—as messages to the process that controls the socket. For our purposes right now, that’s the process that initially created the socket—that is, the IEx session. This way, the :gen_tcp API and the interaction with the socket mimic the way that message passing works on the BEAM. Sending data with :gen_tcp.send/2 is asynchronous and non-blocking, as when sending messages with send/2[16] (or ! in Erlang). Receiving data is the same as receiving any other message.

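The :gen_tcp API defines three possible messages: {:tcp, socket, data}, delivered when the socket receives data; {:tcp_closed, socket}, delivered when the connection is closed; and {:tcp_error, socket, reason}, delivered when an error occurs on the socket.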
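To see active-mode messages outside of IEx, here is a small sketch that matches on the :tcp, :tcp_closed, and :tcp_error messages with a plain receive. ActiveModeExample and next_event/2 are hypothetical names. To keep the example self-contained, it talks to a throwaway listener on the local machine instead of tcpbin.com; passing 0 as the port asks the OS for any free port, and :inet.port/1 reports which one was chosen.

```elixir
defmodule ActiveModeExample do
  # Hypothetical helper: waits for the next message from the given socket.
  def next_event(socket, timeout \\ 1_000) do
    receive do
      {:tcp, ^socket, data} -> {:data, data}
      {:tcp_closed, ^socket} -> :closed
      {:tcp_error, ^socket, reason} -> {:error, reason}
    after
      timeout -> :timeout
    end
  end
end

# Loopback setup: a local listener plus a client socket in active mode.
{:ok, listen_socket} = :gen_tcp.listen(0, [:binary, active: false])
{:ok, port} = :inet.port(listen_socket)
{:ok, client} = :gen_tcp.connect({127, 0, 0, 1}, port, [:binary, active: true])
{:ok, server_side} = :gen_tcp.accept(listen_socket, 1_000)

# Data sent by the other side arrives in our mailbox as a {:tcp, ...} message.
:ok = :gen_tcp.send(server_side, "Hello, world!\n")
first_event = ActiveModeExample.next_event(client)
IO.inspect(first_event)
# => {:data, "Hello, world!\n"}

# Closing the other side results in a {:tcp_closed, socket} message.
:ok = :gen_tcp.close(server_side)
second_event = ActiveModeExample.next_event(client)
IO.inspect(second_event)
# => :closed
```

Pinning the socket with ^socket ensures the process only reacts to messages from the socket it cares about.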

The other possible mode that a socket can be in is passive mode. To actively retrieve data that a socket might have received in passive mode, you have to use the :gen_tcp.recv/3 function.[17] Let’s try it out by starting a socket and setting the :active option to false:

 iex> {:ok, socket} =
 ...>   :gen_tcp.connect(
 ...>     ~c"tcpbin.com",
 ...>     4242,
 ...>     [:binary, active: false],
 ...>     5000
 ...>   )
 iex> :gen_tcp.send(socket, "Hello from a passive socket\n")
 :ok
 iex> :gen_tcp.recv(socket, 0, 5000)
 {:ok, "Hello from a passive socket\n"}

:gen_tcp.recv/3 takes a socket as its first argument. The second argument is the number of bytes that we want to read from the socket. 0 is a special (and commonly used) value that tells the socket to return all available data. The third argument is a timeout (in milliseconds), after which the function returns {:error, :timeout} if it doesn’t receive any data.
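The following sketch illustrates the three arguments of :gen_tcp.recv/3 under the assumption of a local loopback connection (a throwaway listener on an OS-assigned port) rather than tcpbin.com. It reads an exact number of bytes first, then whatever is left, and finally shows the timeout case.

```elixir
# Loopback setup: both sockets are passive, so nothing is delivered as messages.
{:ok, listen_socket} = :gen_tcp.listen(0, [:binary, active: false])
{:ok, port} = :inet.port(listen_socket)
{:ok, client} = :gen_tcp.connect({127, 0, 0, 1}, port, [:binary, active: false])
{:ok, server_side} = :gen_tcp.accept(listen_socket, 1_000)

:ok = :gen_tcp.send(server_side, "Hello from a passive socket\n")

# A positive length asks for exactly that many bytes.
{:ok, first} = :gen_tcp.recv(client, 5, 1_000)
IO.inspect(first)
# => "Hello"

# 0 asks for whatever data is available.
{:ok, rest} = :gen_tcp.recv(client, 0, 1_000)
IO.inspect(rest)

# Nothing else was sent, so this call gives up after 100 milliseconds.
timeout_result = :gen_tcp.recv(client, 0, 100)
IO.inspect(timeout_result)
# => {:error, :timeout}
```

Requesting an exact length is handy for protocols with fixed-size headers, while 0 suits line-oriented or streaming protocols where the application does its own buffering.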

Well, we have a working TCP client. Pretty easy, wasn’t it? There’s more to learn about the :gen_tcp API and about writing efficient and safe clients, but this is a great start. Now, to get a complete view of the basics of TCP, we need to write a TCP server. We’ll start by building a clone of the TCP echo server we’ve been talking to until now.

Building a TCP Echo Server

Our echo server will listen for TCP client connections and will be able to keep multiple connections open at the same time. When a client sends a line of data (characters ending in a newline \n), our server will send that line of data back. That’s all.

First things first—let’s create a new Mix project for our server:

 > mix new tcp_echo_server --sup --module TCPEchoServer
 * creating README.md
 * creating .formatter.exs
 * creating .gitignore
 * creating mix.exs
 * creating lib
 * creating lib/tcp_echo_server.ex
 * creating lib/tcp_echo_server/application.ex
 * creating test
 * creating test/test_helper.exs
 * creating test/tcp_echo_server_test.exs
 
 Your Mix project was created successfully.
 You can use "mix" to compile it, test it, and more:
 
  cd tcp_echo_server
  mix test
 
 Run "mix help" for more commands.

Make sure to pass the --sup flag so that Mix scaffolds a supervision tree for our application and hooks it up to start when the application starts. The --module argument just makes sure that the “TCP” acronym doesn’t get converted to Tcp in code.

To make sure you wired everything correctly, let’s ensure that you can run tests and that they pass:

 > mix test
 Compiling 2 files (.ex)
 Generated tcp_echo_server app
 ..
 Finished in 0.01 seconds (0.00s async, 0.01s sync)
 1 doctest, 1 test, 0 failures
 
 Randomized with seed 404120

Fantastic.

TCP Servers: How Do They Work?

While TCP clients initiate connections and have to specify the address and port of a TCP server, TCP servers have to listen for incoming connections on a given address and port combination. This is also called binding on the address and port.

To listen for TCP connections in Elixir and Erlang, we can use the :gen_tcp.listen/2 function.[18] It takes a port to bind to and a list of options. It looks like this:

 iex> {:ok, listen_socket} =
 ...>   :gen_tcp.listen(4000, [:binary, active: true])
 {:ok, #Port<0.5>}

This code binds on port 4000. The return value of :gen_tcp.listen/2 is similar to that of :gen_tcp.connect/4: it’s a tuple with the atom :ok and a listen socket, if everything goes well, or {:error, reason} if there’s an error. A listen socket is sort of a special TCP socket whose job is to listen for connections and then set up TCP sockets for each new connection. The options that we passed as the second argument to :gen_tcp.listen/2 are applied to all new sockets set up through this listen socket. Now that our listen socket is listening, we can accept new connections using the :gen_tcp.accept/2 function.[19]

 iex> {:ok, socket} = :gen_tcp.accept(listen_socket, 10_000)

:gen_tcp.accept/2 takes the listen socket and a timeout. It blocks until a client connects, and when one does, it returns {:ok, socket}, where socket is the socket for the new connection. The returned socket is the same kind of socket as the client socket we dealt with earlier in the chapter. You can receive and send data through it in the exact same way. If you run this code, it’ll probably return {:error, :timeout} unless you connect a TCP client to the server within the ten-second timeout. In order to have the :gen_tcp.accept/2 call return a socket, let’s write some structured code and accompanying tests.
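Before moving to structured code, here is a throwaway sketch of the listen and accept sequence in a single script. It makes a few assumptions that differ from the IEx session: the listener uses port 0 so that the OS picks a free port (retrieved with :inet.port/1), both sides use passive mode so that the script can call :gen_tcp.recv/3, and the client runs in a separate task so that it can connect while the main process is blocked in :gen_tcp.accept/2.

```elixir
{:ok, listen_socket} =
  :gen_tcp.listen(0, [:binary, active: false, reuseaddr: true])

{:ok, port} = :inet.port(listen_socket)

# The client lives in its own process and expects its line to be echoed back.
client_task =
  Task.async(fn ->
    {:ok, socket} =
      :gen_tcp.connect({127, 0, 0, 1}, port, [:binary, active: false])

    :ok = :gen_tcp.send(socket, "ping\n")
    {:ok, reply} = :gen_tcp.recv(socket, 0, 1_000)
    reply
  end)

# Blocks until the client above connects (or one second passes).
{:ok, socket} = :gen_tcp.accept(listen_socket, 1_000)

# The accepted socket inherits the listen socket's options (:binary, passive).
{:ok, line} = :gen_tcp.recv(socket, 0, 1_000)
:ok = :gen_tcp.send(socket, line)

reply = Task.await(client_task)
IO.inspect(reply)
# => "ping\n"
```

This handles exactly one client and then stops, which is the limitation the accept loop in the next section removes.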

Accepting TCP Connections in a Process

Let’s start with a process that we’ll place under our application’s supervision tree. This process will do the following:

  1. Call :gen_tcp.listen/2 to set up a listen socket
  2. Call :gen_tcp.accept/2 to accept a new connection
  3. Spawn a new process to handle the new connection
  4. Go back to the :gen_tcp.accept/2 call to accept new connections

The following figure shows a visual representation of this accept loop.

Socket server flow diagram showing three steps: Listen, Accept, and Spawn Handler, which hands over the socket to a Handler Process.

We’ll use a GenServer for the listening process. In Elixir, you might also use a task,[20] since we won’t use most features of GenServers. But tasks are not available in Erlang, so we might as well pick an abstraction that is included in both languages. Create the file lib/tcp_echo_server/acceptor.ex.

 defmodule TCPEchoServer.Acceptor do
   use GenServer
 
   require Logger
 
   @spec start_link(keyword()) :: GenServer.on_start()
   def start_link(options) do
     GenServer.start_link(__MODULE__, options)
   end
 
   @impl true
   def init(options) do
     port = Keyword.fetch!(options, :port)
 
     listen_options = [
       :binary,
       active: true,
       exit_on_close: false,
       reuseaddr: true,
       backlog: 25
     ]
 
     case :gen_tcp.listen(port, listen_options) do
       {:ok, listen_socket} ->
         Logger.info("Started TCP server on port #{port}")
         send(self(), :accept)
         {:ok, listen_socket}
 
       {:error, reason} ->
         {:stop, reason}
     end
   end
 
   @impl true
   def handle_info(:accept, listen_socket) do
     case :gen_tcp.accept(listen_socket, 2_000) do
       {:ok, socket} ->
         {:ok, pid} = TCPEchoServer.Connection.start_link(socket)
         :ok = :gen_tcp.controlling_process(socket, pid)
         send(self(), :accept)
         {:noreply, listen_socket}
 
       {:error, :timeout} ->
         send(self(), :accept)
         {:noreply, listen_socket}
 
       {:error, reason} ->
         {:stop, reason, listen_socket}
     end
   end
 end

That’s a lot of code. Let’s take a look at it step by step.

We call use GenServer to define a GenServer, then define a standard start_link/1 function.

We know what the :binary and active: true options do from Active and Passive Modes for Sockets. We also set the :exit_on_close option to false. By default, :gen_tcp closes a socket as soon as it detects that the peer closed its side. Setting this option to false avoids that, so the server can still send data to a client that shut down only its side of the connection. Next, with the :reuseaddr option, we can run and shut down the server multiple times without having to worry about unavailable ports. Last but not least, we use the :backlog option, which controls how many clients can be queued waiting to be accepted by the server. We’ll talk more about this option in Moving to the Server Side.

When initializing the GenServer (inside the init/1 callback), we use the :gen_tcp.listen/2 function we talked about earlier. We are passing along the port from the options argument that was passed to TCPEchoServer.Acceptor.start_link/1. Calling :gen_tcp.listen/2 when initializing the GenServer makes sense, since we want to make sure that our process is already listening once start_link/1 returns. We’re also using a case here to match on the return value of :gen_tcp.listen/2. If it returns an error, we stop our GenServer directly.

We consider initialization complete once the call to :gen_tcp.listen/2 returns. As we saw earlier, we still need to call :gen_tcp.accept/2 to accept new connections, but we want to do that outside of the initialization—otherwise, our GenServer would not finish initializing until a client attempted to connect (and a GenServer that doesn’t initialize would stop its supervisor from starting and cause other issues). To accept connections after initializing, we use the good old trick of sending a message to self(). Doing this allows initialization to complete and accepting to happen asynchronously after that. Remember, messages sent to self() within the init/1 callback get queued in the GenServer’s mailbox and are only processed after init/1 returns.
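For comparison, GenServer also offers the handle_continue/2 callback as another way to run work right after init/1 returns. The sketch below is an alternative to the message-to-self trick, not what the chapter's acceptor uses, and ContinueExample is a hypothetical module that only reports back to the process that started it.

```elixir
defmodule ContinueExample do
  use GenServer

  def start_link(report_to) do
    GenServer.start_link(__MODULE__, report_to)
  end

  @impl true
  def init(report_to) do
    # Return right away so start_link/1 doesn't block;
    # the remaining work is deferred to handle_continue/2.
    {:ok, report_to, {:continue, :post_init}}
  end

  @impl true
  def handle_continue(:post_init, report_to) do
    # In a real acceptor, the first :gen_tcp.accept/2 call could go here.
    send(report_to, {:post_init_done, self()})
    {:noreply, report_to}
  end
end

{:ok, pid} = ContinueExample.start_link(self())

result =
  receive do
    {:post_init_done, ^pid} -> :ok
  after
    1_000 -> :timeout
  end

IO.inspect(result)
# => :ok
```

One design caveat: a callback that keeps returning {:continue, ...} never gets back to its mailbox, so a continue is a fit for the first accept only. The repeated accepts would still be scheduled with a message to self(), as in the acceptor code.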

The state of this GenServer is the TCP listen socket itself, since for now we don’t need anything else.

We define a handle_info/2 callback to handle the :accept message that we send to self().

Here, we finally call :gen_tcp.accept/2 to get a new connection socket. We keep the timeout pretty short, at two seconds, and we’ll see why soon.

Once we have a new TCP socket that represents the connection to a new client, we want to spawn a new process to handle it. We haven’t defined TCPEchoServer.Connection yet, but we’ll do it soon.

We created a new socket, but before handing it over to the connection process, we need to change the socket’s controlling process. We’ll talk more about the controlling process in Understanding the Controlling Process of a Socket.

After spawning a new connection handler, we send :accept to self() again to keep accepting new connections.

If :gen_tcp.accept/2 returns {:error, :timeout}, we send :accept to self() again and go back to accepting. This is why we can keep a short timeout: if accepting times out, we go back to accepting again right away. We do this instead of passing a long timeout to :gen_tcp.accept/2 because it’s generally a bad idea for GenServers to block for long periods of time. They should be able to handle system messages, for example.

Finally, if :gen_tcp.accept/2 returns any other error, we stop the GenServer.

In order to start our newly defined GenServer, let’s add it to the list of children in lib/tcp_echo_server/application.ex.

 @impl true
 def start(_type, _args) do
   children = [
     {TCPEchoServer.Acceptor, port: 4000}
   ]
 
   # See https://hexdocs.pm/elixir/Supervisor.html
   # for other strategies and supported options
   opts = [strategy: :one_for_one, name: TCPEchoServer.Supervisor]
   Supervisor.start_link(children, opts)
 end

Let’s try to run our application as a sanity check.

 > mix run --no-halt
 Compiling 3 files (.ex)
 warning: TCPEchoServer.Connection.start_link/1 is undefined
  (module TCPEchoServer.Connection is not available or
  is yet to be defined)
  lib/tcp_echo_server/acceptor.ex:30: TCPEchoServer.Acceptor.handle_info/2
 
 Generated tcp_echo_server app
 
 10:17:00.712 [info] Started TCP server on port 4000

We get a warning about TCPEchoServer.Connection not being defined, which we expect. After that, however, we can see the log that we emit in the init/1 callback of our acceptor GenServer. No errors in sight. Success! Let’s take a quick detour to talk about the controlling process of a socket before we move on to handling connections.

Understanding the Controlling Process of a Socket

Every :gen_tcp socket has a controlling process. This is a BEAM process that is responsible for the socket itself. The controlling process of a socket starts out as the process that created the socket (via :gen_tcp.connect/4 or :gen_tcp.accept/2). The socket is linked to its controlling process, which means that if the controlling process exits, then the BEAM automatically shuts the socket down and cleans things up. This is a useful behavior, because it avoids potential memory leaks and doesn’t require you to do anything to keep things tidy.

The controlling process of a socket is also the only process that can receive data from the socket when the socket is in active mode. (We discussed active and passive modes in Active and Passive Modes for Sockets.) While any process can call :gen_tcp.send/2 to send data through an open socket, only the controlling process receives the {:tcp, socket, data} messages. This is a sensible choice, since :gen_tcp has to know which process to send these messages to. If the socket is in passive mode, however, any process can call :gen_tcp.recv/3. But there is an important constraint here: only one process can be receiving data from the socket at any given time. If a process calls :gen_tcp.recv/3 on a socket, it will receive data from the socket. Until the call returns, however, any other process that calls :gen_tcp.recv/3 will get the return value {:error, :ealready}. If you think about it, this makes sense: if :gen_tcp allowed multiple processes to call :gen_tcp.recv/3 concurrently on the same socket, it wouldn’t know which process to return the received data to.

The good news is that :gen_tcp provides a function to change the controlling process of a socket, aptly named :gen_tcp.controlling_process/2.[21] It takes a :gen_tcp socket and the PID of the new controlling process. Only a socket’s controlling process can transfer the socket to another process by calling controlling_process/2, which is exactly what we did in the acceptor code.

 iex> :gen_tcp.controlling_process(socket, new_controlling_pid)

Now, if you’ve been burned in the past by stray BEAM messages or message-related race conditions, you might be wondering about sockets in active mode. What if some TCP messages arrive at the controlling process between calling controlling_process/2 and when the transfer happens? Wouldn’t you have leftover TCP messages in the old controlling process and missing messages in the new one? Well, the Erlang team thought of that. From the documentation for :gen_tcp.controlling_process/2:

If the socket is set in active mode, this function will transfer any messages in the mailbox of the caller to the new controlling process.

The pattern we used in TCPEchoServer.Acceptor is common when working with TCP servers in Elixir and Erlang. An acceptor process calls :gen_tcp.accept/2, spawns a process to handle the new client, and changes the controlling process of the accepted socket to the newly spawned process.
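The handoff can be tried out in isolation with a sketch like the following. It assumes a local loopback connection on an OS-assigned port and uses a bare spawned process as a stand-in for the connection handler. The short sleep is only there to make it likely that the {:tcp, ...} message lands in the old controlling process's mailbox first, so that the transfer described in the documentation has something to move.

```elixir
# Active-mode listener: accepted sockets deliver data as messages.
{:ok, listen_socket} = :gen_tcp.listen(0, [:binary, active: true])
{:ok, port} = :inet.port(listen_socket)
{:ok, client} = :gen_tcp.connect({127, 0, 0, 1}, port, [:binary, active: false])

# This process accepts the connection, so it starts as the controlling process.
{:ok, socket} = :gen_tcp.accept(listen_socket, 1_000)

parent = self()

# Stand-in for a connection handler: reports the first chunk of data it gets.
handler =
  spawn_link(fn ->
    receive do
      {:tcp, _socket, data} -> send(parent, {:handler_got, data})
    end
  end)

:ok = :gen_tcp.send(client, "hello\n")

# Give the data time to arrive in *this* process's mailbox.
Process.sleep(50)

# Hand the socket over; pending {:tcp, ...} messages move to the handler.
:ok = :gen_tcp.controlling_process(socket, handler)

received =
  receive do
    {:handler_got, data} -> data
  after
    1_000 -> :timeout
  end

IO.inspect(received)
# => "hello\n"
```

Whether the data arrives before or after the call to controlling_process/2, the handler ends up with it, which is the guarantee the acceptor relies on.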

We’ve talked about the acceptor code enough, so let’s move on to code that handles single client connections.

Handling TCP Clients

We’ll start with a TCPEchoServer.Connection module. Each connection will be handled by a separate process, and we’ll once again use GenServer for these processes. Let’s define the TCPEchoServer.Connection module in lib/tcp_echo_server/connection.ex. The start_link/1 function matches what we used in TCPEchoServer.Acceptor.

 @spec start_link(:gen_tcp.socket()) :: GenServer.on_start()
 def start_link(socket) do
   GenServer.start_link(__MODULE__, socket)
 end

The following code defines an init/1 callback, which doesn’t need to do anything other than store the socket in the state. Let’s also define a struct with defstruct[22] to represent the state of our GenServer. This struct also has a :buffer field that defaults to the empty binary <<>>, but we’ll go over that in a second.

 defstruct [:socket, buffer: <<>>]
 
 @impl true
 def init(socket) do
   state = %__MODULE__{socket: socket}
   {:ok, state}
 end

Structs as State


When working with OTP behaviors such as GenServer or gen_statem in Elixir, I almost always use a struct for the state of the module. I use defstruct/1 in the module itself so that the struct is usable as %__MODULE__{...} throughout the module’s code. Structs in Elixir provide useful compile-time guarantees on field names. For example, if I were to misspell a field name or use a field that is not defined in the struct, the Elixir compiler would raise an error. I also like to use structs because they let me always match on at least %__MODULE__{} in any behavior callback, ensuring that all callbacks return a state with the correct shape.

Structs are not available in Erlang, but a common way to achieve similar results has historically been to use Erlang records. While this works, it seems the community is increasingly using maps as the state of these callback modules. Maps don’t provide the same compile-time guarantees as Elixir structs, but I think they’re usually the right choice in Erlang since they perform well and allow precise pattern matching.

:gen_tcp delivers data to our process the same way it does for TCP clients: with :tcp, :tcp_closed, and :tcp_error messages. So, let’s define a few clauses of the handle_info/2 callback to handle those.

  1: @impl true
     def handle_info(message, state)
 
     # The "socket" variable must be the same in this pattern match!
  5: def handle_info(
       {:tcp, socket, data},
       %__MODULE__{socket: socket} = state
     ) do
       state = update_in(state.buffer, &(&1 <> data))
 10:   state = handle_new_data(state)
       {:noreply, state}
     end
 
     def handle_info(
 15:   {:tcp_closed, socket},
       %__MODULE__{socket: socket} = state
     ) do
       {:stop, :normal, state}
     end
 20: 
     def handle_info(
       {:tcp_error, socket, reason},
       %__MODULE__{socket: socket} = state
     ) do
 25:   Logger.error("TCP connection error: #{inspect(reason)}")
       {:stop, :normal, state}
     end

The clauses for :tcp_closed (line 15) and :tcp_error (line 22) only stop the GenServer, which represents a connection to an ephemeral client. If the connection closes or drops, our process can’t (and probably shouldn’t) reconnect to the client, so shutting down with a :normal reason is the way to go. When handling new data on line 5, we append it to the state’s buffer and then call the handle_new_data/1 helper function with the updated state. Create a new handle_new_data/1 private function and fill it in.

  1: defp handle_new_data(state) do
       case String.split(state.buffer, "\n", parts: 2) do
         [line, rest] ->
           :ok = :gen_tcp.send(state.socket, line <> "\n")
  5:       state = put_in(state.buffer, rest)
           handle_new_data(state)
 
         _other ->
           state
 10:   end
     end

In the handle_new_data/1 function, we start by splitting the state’s buffer on the next newline character (\n) on line 2. We pass parts: 2 to String.split/3 so that even if more than one newline character is present, we still split only on the first newline character. If String.split/3 returns a list with two elements (line 3), it means that there was at least one newline character, so we have at least a complete line. In this case, we use :gen_tcp.send/2 to “echo” the line back to the client (line 4). We then update the state’s buffer and recursively call handle_new_data/1 again (lines 5 and 6) in case there are other lines available. The base case for the recursion is when no newline character is present—we return the state (line 9) and we’re done.

Elixir’s Access to Update the State


I’ve used the update_in/2[23] macro in the code for TCPEchoServer.Connection. update_in/2 is part of a set of accessors (such as get_in and put_in) that Elixir provides. These functions and macros provide a way to access and update nested data. They’re generally based on the Access behavior,[24] but they also work on struct fields.

I tend to use these a lot when working with the states of OTP processes, such as the state of our GenServer. They let me write concise code to update or override deeply nested parts of the state and get the updated state back. If you’re not familiar with these functions, I highly recommend taking a look at the documentation. You’ll see me use them again and again in this book.

Buffering is a common technique in network programs, because application-level programs (such as our GenServer) don’t know how TCP packets are split and delivered through the underlying TCP layer. For example, a client might send the string "hello\n" and then the string "world\n" with two separate TCP send calls, but the OS might buffer the data and send a single TCP packet containing the string "hello\nworld\n". Or it might decide to send data every 8 bytes, resulting in two TCP packets: one containing "hello\nwo" and one "rld\n". Buffering addresses all these problems by reconstructing the data at the application layer according to the agreed-upon protocol. In this case, the protocol specifies that each logical packet is a line.
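The buffering logic can also be looked at in isolation, without any sockets. The sketch below is a pure-function variant of the approach used in handle_new_data/1. LineBuffer and extract/2 are hypothetical names, and the chunks list simulates the "every 8 bytes" delivery from the paragraph above.

```elixir
defmodule LineBuffer do
  # Hypothetical helper: returns {complete_lines, remaining_buffer}.
  def extract(buffer, lines \\ []) do
    case String.split(buffer, "\n", parts: 2) do
      # At least one complete line: keep it and look for more.
      [line, rest] -> extract(rest, [line | lines])
      # No newline left: whatever remains is an incomplete line.
      [rest] -> {Enum.reverse(lines), rest}
    end
  end
end

# Chunks as the network might deliver them.
chunks = ["hello\nwo", "rld\n"]

{lines, leftover} =
  Enum.reduce(chunks, {[], ""}, fn chunk, {lines, buffer} ->
    {new_lines, rest} = LineBuffer.extract(buffer <> chunk)
    {lines ++ new_lines, rest}
  end)

IO.inspect(lines)
# => ["hello", "world"]
IO.inspect(leftover)
# => ""
```

Keeping this logic free of side effects makes it easy to test on its own, with the socket handling layered on top.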

Our TCP echo server is complete. We have an acceptor process that listens on a port and accepts new TCP connections. Once clients connect, the acceptor spawns a process for each client. Next, let’s do some testing before taking a look at a few shortcomings of this approach.

Testing with TCP

When prototyping, it’s common to do interactive manual testing early on—sometimes even before writing any automated tests. Luckily, most operating systems ship with great tools for working with network programs. In our case, we’ll use netcat[25] (often used as nc), a widely known tool from the 1990s that should be available on most Unix-based and Windows systems. It’s a little program that lets you read from and write to TCP (or UDP) connections. First things first, let’s start our shiny new echo server.

 > mix run --no-halt
 19:01:05.073 [info] Started TCP server on port 4000

Our server is listening for connections on the address localhost and port 4000. Now we’re ready to test this out with netcat. The easiest way to do that is to write lines of text to standard output and then pipe them through netcat.

 > echo "Hello world" | nc localhost 4000
 Hello world

This example works because echo adds a newline to the string it echoes by default. If we try to send multiple lines, it still works.

 > echo -en "Hello\nworld\n" | nc localhost 4000
 Hello
 world

Our rudimentary manual testing indicates that our server is working correctly. Let’s also write some automated tests. As we know by now, Erlang’s :gen_tcp supports creating both sides of a TCP connection: the client and the server. To be fair, this is what most TCP bindings for other languages do as well. The nice thing is that it makes testing easier. If we want to test a server, as in this case, we can use :gen_tcp again to write clients for our server.

We’ll start with a simple test for a single TCP client that connects to our server (at localhost:4000) and sends one line of text. Then, we can assert that our server echoes that line back. Create the file test/tcp_echo_server/integration_test.exs.

 test "sends back the received data" do
   {:ok, socket} =
     :gen_tcp.connect(~c"localhost", 4000, [:binary, active: false])

   assert :ok = :gen_tcp.send(socket, "Hello world\n")

   assert {:ok, data} = :gen_tcp.recv(socket, 0, 500)
   assert data == "Hello world\n"
 end

We used the client socket in passive mode in this test to make it easy to receive all available data with :gen_tcp.recv/3. Next, we can add a similar test for fragmented data—that is, data with no newline characters in it or with more than one.

 test "handles fragmented data" do
   {:ok, socket} =
     :gen_tcp.connect(~c"localhost", 4000, [:binary, active: false])

   assert :ok = :gen_tcp.send(socket, "Hello")
   assert :ok = :gen_tcp.send(socket, " world\nand one more\n")

   assert {:ok, data} = :gen_tcp.recv(socket, 0, 500)
   assert data == "Hello world\nand one more\n"
 end

Flaky Test

This test might fail here and there. Oops! That’s because our echo server calls :gen_tcp.send/2 once for each line it extracts from the incoming data, and we have little control over how the operating system’s TCP stack buffers those multiple send calls. Usually, since the strings are small, they’re combined into a single TCP packet and returned from a single recv/3 call. Sometimes, though, there will be two packets (one per send/2), which results in the first recv/3 call returning just "Hello world\n". You could make this test deterministic by checking the result of the first recv/3 and calling recv/3 once more if necessary. This is left as an exercise to the reader.

Kids, the lesson here is this: you never know how the other end of a connection is going to behave.
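If you want a starting point for that exercise, one possible approach is a small helper that keeps calling recv/3 until it has collected the number of bytes the test expects. This is only a sketch: RecvHelper and recv_bytes/4 are hypothetical names, and the helper assumes a passive-mode socket opened with the :binary option.

```elixir
defmodule RecvHelper do
  # Hypothetical test helper; not part of the echo server's code.
  # Assumes a passive-mode (active: false) socket opened with :binary.

  # Keeps receiving until at least `size` bytes have been collected,
  # or until a recv/3 call fails (for example, with :timeout).
  def recv_bytes(socket, size, timeout, acc \\ <<>>)

  def recv_bytes(_socket, size, _timeout, acc) when byte_size(acc) >= size do
    {:ok, acc}
  end

  def recv_bytes(socket, size, timeout, acc) do
    case :gen_tcp.recv(socket, 0, timeout) do
      {:ok, data} -> recv_bytes(socket, size, timeout, acc <> data)
      {:error, reason} -> {:error, reason}
    end
  end
end

# In the fragmented-data test, the single recv/3 call could become:
#
#   expected = "Hello world\nand one more\n"
#   assert {:ok, ^expected} =
#            RecvHelper.recv_bytes(socket, byte_size(expected), 500)
```

With this in place, the test no longer depends on how many packets the echoed lines end up in.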

The last test we’ll add makes sure our server can handle multiple clients simultaneously.

  1: test "handles multiple clients simultaneously" do
       tasks =
         for _ <- 1..5 do
           Task.async(fn ->
  5:         {:ok, socket} =
               :gen_tcp.connect(~c"localhost", 4000, [:binary, active: false])

             assert :ok = :gen_tcp.send(socket, "Hello world\n")

 10:         assert {:ok, data} = :gen_tcp.recv(socket, 0, 500)
             assert data == "Hello world\n"
           end)
         end

 15:   Task.await_many(tasks)
     end

To simulate multiple clients, we used Elixir tasks. We spawned a bunch of tasks with Task.async/1 (line 4), each setting up a socket, sending data, and receiving the echoed data back. Then, we used Task.await_many/1 (line 15) to wait until all the tasks in the list finish. There’s nothing left to do but run the tests.

 > mix test
 
 19:41:08.005 [info] Started TCP server on port 4000
 .....
 Finished in 0.03 seconds (0.03s async, 0.00s sync)
 1 doctest, 4 tests, 0 failures
 
 Randomized with seed 13966

Success! For the last part of this chapter, let’s take a closer look at two :gen_tcp options that have a lot to offer: :active and :packet.

Becoming Socket Pros with Modes and Packet Parsing

Erlang’s :gen_tcp module provides many options you can use when starting sockets. We won’t use most of them in this book, but we’ll use :active and :packet in the next chapters. We’ve already seen active: true (active mode) and active: false (passive mode), but it turns out that we can pass other values to change the behavior of the socket. The :packet option, on the other hand, lets you off-load some parsing or packing of the data to :gen_tcp.

Being More Precise with Active Sockets

The :active option controls how :gen_tcp delivers the data received on a socket to its controlling process. As you learned, active: true means the data is delivered as process messages, while active: false means you need to receive data manually by calling :gen_tcp.recv/3. As a matter of fact, :active can take not only a boolean value but also the atom :once or an integer n. When :active is :once, the socket stays in active mode until it sends one message to the controlling process (for example, a {:tcp, socket, data} message when TCP data arrives). Once the socket sends that message, it automatically goes back into passive mode. To return to active mode, you have to set the option again yourself.

Active-once sockets tend to be common in practice, because they’re often the perfect compromise between active and passive modes. If your socket is in passive mode, you have to call :gen_tcp.recv/3 every time you want to fetch new data into the controlling process. That’s not ideal if the other peer can send data at any time, since the process calling recv/3 blocks while it waits. It’s the opposite of a reactive model: you either fetch often or risk noticing data late. Active mode solves the reactivity issue, but it opens the socket’s controlling process to a sort of denial-of-service attack. If a client sends data frequently, the controlling process might end up receiving a lot of messages. If it can’t keep up with them, its message queue grows without bound and can eventually exhaust the system’s memory. With active: :once, you avoid having to constantly call recv/3, and because you have to explicitly reactivate the socket, it delivers at most one data message each time you do so, which keeps the message queue under control.

Setting the socket back to active: :once every time the controlling process receives a message is a common approach in the real world. It usually looks something like the code sketched out here.

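 defmodule Connection do
   # The usual GenServer code (use GenServer, init/1, and so on) goes here.

   def handle_info({:tcp, socket, data}, %{socket: socket} = state) do
     # Re-arm the socket so that it delivers the next message.
     :ok = :inet.setopts(socket, active: :once)
     # handle_data/1 stands for whatever the application does with the data.
     handle_data(data)
     {:noreply, state}
   end
 end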

The function you need to change a socket’s mode is not in the :gen_tcp module: it’s :inet.setopts/2.[26] This is because :gen_tcp is not the only module that provides a socket-based API in Erlang. For example, we’ll work with sockets using the :ssl module as well. :inet supports sockets from all of these modules, so some common functions (such as setopts/2) live in there.
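To watch the mode flip back and forth, you can try something like the following sketch in a script or IEx session. It sets up a loopback connection on an OS-assigned port and uses :inet.getopts/2 to read the socket's current :active value. ActiveOnceDemo and its run/0 function are hypothetical names made up for this illustration.

```elixir
defmodule ActiveOnceDemo do
  # Hypothetical demo module; the names here are for illustration only.
  def run do
    # Loopback pair: listen on an OS-assigned port (0) and connect to it.
    {:ok, listen_socket} =
      :gen_tcp.listen(0, [:binary, active: false, ip: {127, 0, 0, 1}])

    {:ok, port} = :inet.port(listen_socket)
    {:ok, client} = :gen_tcp.connect({127, 0, 0, 1}, port, [:binary, active: false])
    {:ok, socket} = :gen_tcp.accept(listen_socket, 1_000)

    # Arm the accepted socket for exactly one message.
    :ok = :inet.setopts(socket, active: :once)
    {:ok, [active: mode_before]} = :inet.getopts(socket, [:active])

    :ok = :gen_tcp.send(client, "first")

    data =
      receive do
        {:tcp, ^socket, data} -> data
      after
        1_000 -> raise "no data received"
      end

    # Delivering that one message put the socket back into passive mode.
    {:ok, [active: mode_after]} = :inet.getopts(socket, [:active])

    Enum.each([client, socket, listen_socket], &:gen_tcp.close/1)
    {mode_before, data, mode_after}
  end
end

IO.inspect(ActiveOnceDemo.run())
```

The returned tuple holds the mode before the message arrived, the data itself, and the mode afterward, so you can see that the socket needs another setopts/2 call before it delivers anything else.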

Changing Socket Options

Pay special attention to :inet.setopts/2. We’ll use it often in this book, since it works on TCP sockets (client and server) and UDP sockets.

:active can also take one other value: an integer n. This mode is similar to active: :once, but the socket will deliver n messages before going back to passive mode. When it does go back to passive mode, it delivers one more {:tcp_passive, socket} message to the controlling process. This message allows the controlling process to know when it has to set the socket back to active mode. active: n is slightly more complex than that. In fact, n can even be a negative number. The socket keeps a count of messages it can deliver, and setting active: n with a negative n subtracts from that count. active: n seems to be used less often than the alternatives, but it can be a nifty tool to deal with flow control. For example, the controlling process might regulate and update the number n of allowed messages based on message size, how “busy” it is, and other such factors.
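Here is a small sketch of active: n in action, again over a loopback connection. To make message boundaries predictable, it assumes the receiving socket uses packet: :line (an option covered later in this chapter), so each line arrives as its own message. ActiveNDemo is a hypothetical module name used only for this example.

```elixir
defmodule ActiveNDemo do
  # Hypothetical demo module; the names here are for illustration only.
  def run do
    # packet: :line makes every line arrive as a separate message, so the
    # count of delivered messages is predictable.
    {:ok, listen_socket} =
      :gen_tcp.listen(0, [:binary, active: false, packet: :line, ip: {127, 0, 0, 1}])

    {:ok, port} = :inet.port(listen_socket)
    {:ok, client} = :gen_tcp.connect({127, 0, 0, 1}, port, [:binary, active: false])
    {:ok, socket} = :gen_tcp.accept(listen_socket, 1_000)

    # Allow exactly two data messages.
    :ok = :inet.setopts(socket, active: 2)
    :ok = :gen_tcp.send(client, "one\ntwo\nthree\n")

    # Two data messages, then the notification that the socket went passive.
    messages = for _ <- 1..3, do: next_message(socket)

    # The third line was not delivered as a message; fetch it manually.
    {:ok, rest} = :gen_tcp.recv(socket, 0, 1_000)

    Enum.each([client, socket, listen_socket], &:gen_tcp.close/1)
    {messages, rest}
  end

  defp next_message(socket) do
    receive do
      {:tcp, ^socket, line} -> {:data, line}
      {:tcp_passive, ^socket} -> :passive
    after
      1_000 -> :timeout
    end
  end
end

IO.inspect(ActiveNDemo.run())
```

In a real controlling process, the {:tcp_passive, socket} message is the cue to call :inet.setopts/2 again with a fresh n once the process is ready for more data.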

To put active: :once into practice, let’s modify our TCP echo server from the previous section. Instead of passing active: true when starting the TCP listen socket in code/tcp_echo_server/lib/tcp_echo_server/acceptor.ex, we’ll use active: :once:

 -listen_options = [:binary, active: true, exit_on_close: false]
 +listen_options = [:binary, active: :once, exit_on_close: false]

We then need to set the socket back to active: :once every time we get a {:tcp, socket, data} message in code/tcp_echo_server/lib/tcp_echo_server/connection.ex.

  def handle_info(
  {:tcp, socket, data},
  %__MODULE__{socket: socket} = state
  ) do
 + :ok = :inet.setopts(socket, active: :once)
  state = update_in(state.buffer, &(&1 <> data))
  state = handle_new_data(state)
  {:noreply, state}
  end

active: :once is especially powerful when paired with the :packet option, which we’ll look at next.

Off-Loading Some Parsing with the Packet Option

In the code we’ve written so far, we parsed and packed our own data, and every receive simply handed us whatever bytes happened to be available. :gen_tcp offers a powerful alternative with the :packet option. :packet can take many different values. We won’t explore all of them here, but you can refer to the documentation for :inet.setopts/2[27] for a comprehensive list.

If not specified, the default value for the :packet option is :raw, which means that :gen_tcp (or rather :inet) won’t do anything to incoming or outgoing data. Let’s start with another possible value: :line. This value only affects the received data. If you set it, the socket will only deliver complete lines—that is, sequences of bytes that end with the byte 10, which represents a newline (\n) in ASCII. In active mode, lines are delivered as {:tcp, socket, line} messages. In passive mode, :gen_tcp.recv(socket, 0, timeout) returns {:ok, line} if successful. You’ll still have to add a newline character manually to data you send through the socket, but you won’t have the headache of splitting and buffering incoming data.

If we were to use packet: :line in our TCPEchoServer.Acceptor, we could simplify TCPEchoServer.Connection significantly by not buffering data and just sending back every line received through a {:tcp, socket, data} message. The relevant handle_info/2 clause would look like this:

 def handle_info(
       {:tcp, socket, line},
       %__MODULE__{socket: socket} = state
     ) do
   :ok = :gen_tcp.send(state.socket, line)
   {:noreply, state}
 end
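You can see the effect of packet: :line in isolation with a sketch like the one below, which uses a passive socket over a loopback connection. The client deliberately sends data that doesn't line up with line boundaries. PacketLineDemo is a hypothetical module name for this illustration.

```elixir
defmodule PacketLineDemo do
  # Hypothetical demo module; the names here are for illustration only.
  def run do
    # The accepted socket inherits packet: :line from the listen socket.
    {:ok, listen_socket} =
      :gen_tcp.listen(0, [:binary, active: false, packet: :line, ip: {127, 0, 0, 1}])

    {:ok, port} = :inet.port(listen_socket)

    # The client stays in the default raw mode and sends arbitrary chunks.
    {:ok, client} = :gen_tcp.connect({127, 0, 0, 1}, port, [:binary, active: false])
    {:ok, socket} = :gen_tcp.accept(listen_socket, 1_000)

    :ok = :gen_tcp.send(client, "Hello\nwor")
    :ok = :gen_tcp.send(client, "ld\n")

    # Each recv/3 call returns exactly one complete line.
    {:ok, first} = :gen_tcp.recv(socket, 0, 1_000)
    {:ok, second} = :gen_tcp.recv(socket, 0, 1_000)

    Enum.each([client, socket, listen_socket], &:gen_tcp.close/1)
    {first, second}
  end
end

IO.inspect(PacketLineDemo.run())
```

No buffer and no splitting code on our side: the socket holds on to "wor" until the rest of the line shows up.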

Another powerful value for :packet is one of the integers 1, 2, or 4. In this case, the value represents the number of bytes to use as a header for received and sent data. This header encodes, as a big-endian integer, the number of bytes expected to follow. For example, if :packet equals 2 and you send the data "hello", :gen_tcp will in fact send the data <<0, 5, "hello">> through the socket: two bytes encoding the number 5 (the number of bytes in the string "hello") and then the string itself. When :packet is set to one of the supported integer values, :gen_tcp applies the corresponding header logic to incoming data too, waiting for a complete packet and stripping the header before delivering the data to the controlling process. This :packet mode is especially useful for binary protocols that specify packet length. In those cases, it feels like cheating: your code just sends and receives data without thinking about framing, because :gen_tcp takes care of it for you.
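To appreciate what packet: 2 saves you from writing, here is a rough sketch of the same framing done by hand with binary pattern matching. LengthPrefix and its functions are hypothetical names; this illustrates the wire format described above and is not how :gen_tcp implements it internally.

```elixir
defmodule LengthPrefix do
  # Hypothetical module for illustration. With packet: 2, :gen_tcp does
  # the equivalent of this for you.

  # Prepends a 2-byte big-endian length header to the payload.
  def encode(payload) when byte_size(payload) <= 0xFFFF do
    <<byte_size(payload)::16, payload::binary>>
  end

  # Tries to extract one complete packet from the front of the buffer.
  # Returns {:ok, payload, rest} or :incomplete if more bytes are needed.
  def decode(<<size::16, payload::binary-size(size), rest::binary>>) do
    {:ok, payload, rest}
  end

  def decode(_buffer), do: :incomplete
end

IO.inspect(LengthPrefix.encode("hello"))
IO.inspect(LengthPrefix.decode(<<0, 5, "hello", 0, 5, "wor">>))
IO.inspect(LengthPrefix.decode(<<0, 5, "wor">>))
```

Even in this tiny version, you'd still have to keep the leftover bytes around between receives, which is exactly the bookkeeping the :packet option takes off your hands.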

Wrapping Up

You’ve made it through the introduction to the first protocol. Great job! You’re one step closer to unlocking your full network programming potential. We got a good look at TCP, one of the most widely used transport layer protocols, and we reviewed some important networking concepts, such as sockets, addresses, and ports. You also became familiar with using TCP in Elixir and Erlang through the :gen_tcp standard-library module, writing both a TCP client and a TCP server.

In the next chapter, we’ll step up our TCP game by going over best practices and design patterns for building scalable and reliable TCP clients and servers on the BEAM.