
Cit 303 Lecture Note Part 2

Network programming involves writing software that communicates over a network, utilizing sockets as the primary means of data exchange. Sockets can be classified into network sockets for remote communication and Unix Domain sockets for local communication, with various configurations determining their behavior. The document also discusses socket implementation, configuration, client-server interactions, blocking versus non-blocking modes, and byte ordering issues in network communication.

Uploaded by

mohgydado
Copyright
© All Rights Reserved

Network programming: introduction to sockets


Network programming in a nutshell


Network programming is about writing computer programs that talk to each other over a
computer network. The world is full of such programs: for example, the web browser you
use to read a website is a piece of software that connects to a remote computer where the
data is stored and grabs the text content to display on your screen.

The browser and the web server can do their networking job thanks to the operating systems they
run on, where all the necessary network protocols have been implemented. The operating
system's parts that provide network functionality are called sockets.

Understanding sockets
A socket is an abstraction over a communication flow. Concretely, sockets are programming
objects provided by the operating system that allow your programs to send and receive data.
There are two types of sockets in the programming world:

1. network sockets — they are used to exchange data between programs over a network, or
in other words between two remote hosts;
2. Unix Domain sockets — also known as UNIX sockets, they are used to exchange data
between programs running on the same machine (i.e. in the same host). This is a form
of Inter-Process Communication (IPC).

This article is about network programming. However, sockets in modern operating systems
support both styles: changing the socket type is just a matter of configuration.
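As a minimal sketch of local communication, the Python snippet below uses socket.socketpair(), which returns two sockets that are already connected to each other on the same machine (UNIX sockets by default on UNIX-like systems). The function name local_round_trip and the message are illustrative choices, not part of any standard API.

```python
import socket


def local_round_trip(message: bytes) -> bytes:
    """Send `message` through a pair of connected local sockets and return what arrives."""
    # socketpair() returns two sockets that are already connected to each other.
    left, right = socket.socketpair()
    try:
        left.sendall(message)
        left.shutdown(socket.SHUT_WR)  # tell the other end nothing more will be sent
        received = b""
        while True:
            chunk = right.recv(1024)
            if not chunk:  # empty bytes: the sender has finished
                break
            received += chunk
        return received
    finally:
        left.close()
        right.close()
```

Calling local_round_trip(b"Hi there") should hand back the same bytes, without any network being involved.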

What does a socket look like?


A socket is an object that you create, configure and then call functions on to send or
receive data. For example, the pseudo-code below shows how to send a piece of text (Hi there)
through a fictional socket:

Socket socket(...configuration...)               (1)
socket.connect(...address of a remote host...)   (2)
socket.send("Hi there")                          (3)
socket.close()

A socket is first created and configured (1), then it is used to establish a connection to a remote
host (2) and to send the message (3). Sockets are the foundation of more complex programs that
send and receive data.
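The same three steps can be sketched with Python's socket module. This is only an illustration under stated assumptions: the function name send_text is made up, and the choice of IPv4 and TCP is one possible configuration among several.

```python
import socket


def send_text(host: str, port: int, text: str) -> None:
    """Connect to (host, port) over TCP/IPv4, send `text`, then close the socket."""
    # (1) create and configure: IPv4 family, stream (TCP) type
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.connect((host, port))          # (2) connect to the remote host
        sock.sendall(text.encode("utf-8"))  # (3) sockets carry bytes, so encode the text
    finally:
        sock.close()                        # always release the socket
```

A call such as send_text("127.0.0.1", 5000, "Hi there") only succeeds if some program is listening at that address; both values here are placeholders.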

Socket implementations
The example above is just pseudo-code: actual sockets come with the operating system, so they
are written in low-level languages (C, mostly). Their programming interface (API), however,
is very similar to the pseudo-code above: the socket object layout, how to initialize it, the
function names and so on. In fact, this API style is known as the Berkeley sockets interface,
sometimes also called POSIX sockets or BSD sockets.

All modern operating systems implement the Berkeley sockets interface, but not all of them stick
to the original specifications. Also, using the sockets provided by the operating system forces you to
write low-level code. That's why most of the time you use the networking abstractions provided by
higher-level programming languages. For example, the std::net module in Rust,
the java.net package in Java or the Boost.Asio library for C++.

Python is interesting: its socket module is the translation of the Berkeley sockets interface into
Python's object-oriented style. The advantage is that you work with the original API without all the
headaches of manual memory management required by the C language.

Configuring a socket

As mentioned earlier, a socket must be configured before use. You have to specify the socket family,
the socket type and, optionally, the protocol. Those properties define the nature of the socket and
its behavior.
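In Python's socket module these three properties map onto the arguments of the socket.socket() constructor. The sketch below picks IPv4 and a stream socket purely as an example and passes 0 as the protocol, which leaves the choice to the operating system.

```python
import socket

# family = AF_INET (IPv4), type = SOCK_STREAM (stream), proto = 0 (let the OS decide)
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM, 0)

# The configuration can be read back from the socket object.
config = (sock.family, sock.type, sock.proto)
print(config)

sock.close()
```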

The socket family


The socket family determines whether the socket works over a network or only on the local
machine. More specifically, you can have IPv4-based sockets, IPv6-based sockets or UNIX sockets.

A socket configured as IPv4 or IPv6 can exchange data with remote hosts. The former works
with IP addresses version 4, the latter works with IP addresses version 6. A socket configured as
UNIX is used to exchange data between programs on the same machine. Windows, a non-UNIX
operating system, added support for the UNIX socket family in Windows 10.

The socket type


The socket type determines the kind of communication you want to establish. Here the choice boils
down to three types: stream sockets for connection-oriented protocols such as the Transmission
Control Protocol (TCP); datagram sockets for connectionless protocols such as the User
Datagram Protocol (UDP); raw sockets for low-level communication protocols such as the Internet
Protocol (IP).
With stream or datagram sockets you are using popular protocols already implemented for you
by the operating system. Raw sockets instead allow you to do whatever you want: you can implement
your own protocols, generate custom IP packets, intercept network traffic or just mess around by
sending invalid data to other computers.
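To make the difference between stream and datagram sockets more concrete, here is a small Python sketch of a datagram (UDP) exchange on the loopback interface. The function name udp_round_trip is hypothetical; the point is that no connection is set up, as each datagram is simply addressed to its destination.

```python
import socket


def udp_round_trip(payload: bytes) -> bytes:
    """Send one UDP datagram to a local receiver and return what it received."""
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        receiver.bind(("127.0.0.1", 0))  # port 0: let the OS choose a free port
        receiver.settimeout(2.0)         # do not wait forever if the datagram is lost
        # No connect()/accept() step: the destination travels with each datagram.
        sender.sendto(payload, receiver.getsockname())
        data, _sender_address = receiver.recvfrom(4096)
        return data
    finally:
        sender.close()
        receiver.close()
```

Unlike a stream socket, nothing here guarantees delivery or ordering in general; on the loopback interface a single small datagram normally arrives intact.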

Raw sockets are powerful and might cause harm if used for malicious purposes. This is probably
the reason why Windows restricts them: on its desktop editions, for example, TCP data cannot be
sent over raw sockets.

The protocol

This piece of information is often optional, as sockets can automatically determine how to
behave given the family and the type described above. For example a stream, IPv6-based socket
is automatically prepared for a TCP-over-IPv6 transmission. Even better, the Berkeley sockets
API includes some utility functions to determine the right parameters for the socket
configuration given the address you want to connect to.
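One such utility function is getaddrinfo, which Python exposes as socket.getaddrinfo(). The sketch below wraps it in a hypothetical helper called resolve, which returns the family, type, protocol and address tuples that could be passed to socket.socket() and connect().

```python
import socket


def resolve(host: str, port: int) -> list:
    """Return (family, type, proto, sockaddr) candidates for a stream connection."""
    # Restrict the lookup to stream sockets; family and protocol are left open.
    results = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    return [
        (family, socktype, proto, sockaddr)
        for family, socktype, proto, _canonname, sockaddr in results
    ]
```

For a numeric IPv4 address such as "127.0.0.1" the result contains an AF_INET entry; for a host name it may list IPv4 and IPv6 candidates, depending on how the name resolves.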

What you can do with sockets


As a designer, you can do whatever you want with sockets. However, socket-based programs
usually end up being clients or servers. Clients establish the connection to servers, which in turn
listen to clients and exchange data with them.

For example, the browser you are using to read this article is a client: it talks to a remote server
that contains the web page to be displayed. Both the browser and the server make use of sockets
under the hood for the actual data transmission.

The pseudo-code snippet I've shown you before is a client. Let me put it back here for clarity:

Socket socket(...configuration...)
socket.connect(...address of a remote host...)
socket.send("Hi there")
socket.close()

A server is different in that usually it waits for new clients coming in. Here's a pseudo-code
example of a server that replies back to clients with the Welcome string:

Socket socket(...configuration...)
socket.bind(...address...)          (1)
socket.listen()                     (2)
while (...some condition...):
    client = socket.accept()        (3)
    client.send("Welcome")
socket.close()

Beyond the usual configuration, the Berkeley sockets API wants you to bind the socket object to a
local address (1). This means that your program will react when data is sent to that address, for
example to a certain IP address of the machine it runs on.

What you can define in the ...address... part actually varies with the protocol in use. For
example in a TCP connection you also have to specify the port a client can connect to, and if you
don't pass any IP address in it your server will accept connections on any of the machine's
network interfaces.
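For a TCP/IPv4 socket in Python, the address is a (host, port) pair. The sketch below uses a hypothetical helper, bound_address, to show two choices for the host part; port 0 is used so that the operating system picks any free port, which keeps the example from clashing with other programs.

```python
import socket


def bound_address(host: str) -> tuple:
    """Bind a TCP/IPv4 socket to (host, OS-chosen port) and return the local address."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((host, 0))       # port 0 asks the OS for any free port
        return sock.getsockname()  # the (ip, port) pair actually assigned
    finally:
        sock.close()


loopback_only = bound_address("127.0.0.1")  # reachable only from this machine
all_interfaces = bound_address("")          # empty host: every local IPv4 address
print(loopback_only, all_interfaces)
```

The second call reports the wildcard address 0.0.0.0, which is how the operating system represents "no specific IP address".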

The socket is then instructed to listen (2), to make it accept incoming connection requests from
clients. Finally, the program waits for new clients to come in with the accept() function (3). Once
accept() returns, the server has established a connection with a client and can send data to it. The
picture below shows the API calls involved in a typical scenario of a client and a server talking
to each other.

Figure 1. A client-server conversation and the API calls involved.
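The server flow can be sketched in Python as follows. This is a simplified illustration: the names serve_clients and run_demo are made up, the loop stops after a fixed number of clients in place of "...some condition...", and the client runs in the same program only so that the example is self-contained.

```python
import socket
import threading


def serve_clients(server: socket.socket, max_clients: int) -> None:
    """Greet `max_clients` clients with "Welcome", then return."""
    for _ in range(max_clients):
        client, _address = server.accept()  # (3) blocks until a client connects
        with client:                        # closes the client socket afterwards
            client.sendall(b"Welcome")


def run_demo() -> bytes:
    """Start a one-client server on the loopback interface and talk to it."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # (1) loopback address, OS-chosen port
    server.listen()                # (2) start accepting connection requests
    port = server.getsockname()[1]

    worker = threading.Thread(target=serve_clients, args=(server, 1))
    worker.start()

    reply = b""
    with socket.create_connection(("127.0.0.1", port), timeout=2.0) as client:
        while True:
            chunk = client.recv(1024)
            if not chunk:  # the server closed the connection
                break
            reply += chunk

    worker.join()
    server.close()
    return reply
```

Running run_demo() should return the greeting sent by the server side.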


Blocking versus non-blocking sockets
The accept() function in the server pseudo-code above is blocking: the while loop doesn't make
any progress until a new client arrives. In other words: your program is stuck waiting for new
connections.

The same logic applies to the connect() function in the client example: no progress is made until
a connection is established. Sockets are blocking by default, but you can put them into non-
blocking mode during the configuration stage.

With non-blocking sockets the functions mentioned above return immediately without waiting.
This is especially useful for servers that need to handle thousands of connections at the same
time, or more generally when you want to write programs that don't get stuck waiting for
external events.
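A minimal Python sketch of non-blocking behaviour, assuming a listening socket that nobody is connecting to: in non-blocking mode accept() does not wait and raises BlockingIOError instead. The function name accept_without_waiting is illustrative.

```python
import socket


def accept_without_waiting() -> str:
    """Try to accept a client on a non-blocking listening socket."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        server.bind(("127.0.0.1", 0))
        server.listen()
        server.setblocking(False)  # switch the socket to non-blocking mode
        try:
            client, _address = server.accept()  # returns or fails immediately
        except BlockingIOError:
            return "no client yet"              # nothing to accept right now
        client.close()
        return "client accepted"
    finally:
        server.close()
```

In a real program the "no client yet" case is where other work would be done before trying again.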

Non-blocking programs can serve many connections without stalling on any single one, but your
code becomes a little bit more complicated as you enter the realm of asynchronous programming.
Choosing blocking versus non-blocking mode is usually a trade-off between performance and
programming complexity.
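One common way to manage that complexity in Python is the standard selectors module, which reports which sockets are ready so the program only touches those. The sketch below is a deliberately tiny illustration with a single pair of local sockets; the function name wait_for_data is hypothetical.

```python
import selectors
import socket


def wait_for_data(timeout: float = 1.0) -> bytes:
    """Wait until a non-blocking socket has data to read, then read it."""
    selector = selectors.DefaultSelector()
    writer, reader = socket.socketpair()
    try:
        reader.setblocking(False)
        selector.register(reader, selectors.EVENT_READ)  # watch for incoming data
        writer.sendall(b"ready")
        # select() returns the registered sockets that are ready, or nothing on timeout.
        for key, _events in selector.select(timeout):
            return key.fileobj.recv(1024)
        return b""
    finally:
        selector.close()
        writer.close()
        reader.close()
```

A server would register many sockets with the same selector and loop over whatever select() returns.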

Byte ordering problems

When communicating across a network, you may encounter computers with a different
architecture than yours. Byte ordering is what changes the most: the order in which the bytes of
a multi-byte value are stored in memory.

Take the hexadecimal number 0xA4FFBC01 for example: some computers store the most
significant byte (A4) before the least significant byte (01), so that the number in memory appears
as it is written (A4 FF BC 01). This way of storing numbers matches how we humans write things
down and is called big endian. The standard byte ordering in networking is big endian and for
this reason it is also known as network byte order.

Some computers do the opposite instead: they store the most significant byte (A4) after the least
significant byte (01), so that the number in memory appears flipped (01 BC FF A4). This byte
ordering is called little endian. In networking jargon, the ordering a machine uses natively is
called host byte order, whichever of the two it happens to be.
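The two layouts can be reproduced in Python with int.to_bytes(). The helper as_hex below is just a formatting convenience introduced for this sketch.

```python
import sys


def as_hex(data: bytes) -> str:
    """Format bytes as space-separated hexadecimal pairs."""
    return " ".join(f"{byte:02x}" for byte in data)


value = 0xA4FFBC01
big = value.to_bytes(4, byteorder="big")        # most significant byte first
little = value.to_bytes(4, byteorder="little")  # least significant byte first

print(as_hex(big))     # a4 ff bc 01
print(as_hex(little))  # 01 bc ff a4
print(sys.byteorder)   # the ordering of the machine running this code
```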

So, exchanging data between two computers with different byte ordering requires an adjustment.
The trick is to convert data to network byte order before sending it, and convert it to host byte
order on arrival. This is done through some utility functions that come with the Berkeley sockets
API such as the hton* and ntoh* families: they read as host-to-network and network-to-host and
perform the byte conversion.
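Python's socket module exposes these functions directly, for example socket.htonl() and socket.ntohl() for 32-bit values. The sketch below also shows struct.pack() with the "!" prefix, another standard way to produce network byte order when building a message; the variable names are illustrative.

```python
import socket
import struct

value = 0xA4FFBC01

# Host-to-network and back: the round trip always restores the original value.
on_the_wire = socket.htonl(value)
restored = socket.ntohl(on_the_wire)

# struct: "!" selects network (big-endian) order, "I" an unsigned 32-bit integer.
packed = struct.pack("!I", value)
(unpacked,) = struct.unpack("!I", packed)

print(restored == value, packed.hex(), unpacked == value)
```

On a big-endian host htonl() leaves the value unchanged, while on a little-endian host it swaps the bytes, so the code stays portable across both.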
