History of ‘The Internet’

Feb 24, 2023

Hi Everyone,

This post is dedicated to those curious minds who want to know how the Internet came into existence. How did it come about in the first place, and why?

ARPANET

In 1969, the US Department of Defense funded a project named ARPANET (Advanced Research Projects Agency Network). This project enabled officials to share files and data between connected systems.

The drawback of this project was that it offered no way to publish and link documents. If a person wanted to share a resource as a hyperlink (a link that takes you to the resource when you click it), that wasn’t possible until the WWW came into existence.

World Wide Web

Just as the name suggests, the WWW is a wide network spanning the whole world. It was created by Tim Berners-Lee at CERN (an abbreviation of “European Council for Nuclear Research”, translated from French). It overcame the above drawback so that people could share their scientific research, which later became accessible to anyone on the Internet. It’s interesting to note that the first website was hosted on Berners-Lee’s own computer. On 30 April 1993, CERN put the World Wide Web software in the public domain.

The first Website — http://info.cern.ch/hypertext/WWW/TheProject.html

This page lists documents, but it did not let us search them by keyword. This is where browsers and search engines came in.

Web Browsers and Search Engines

As the number of documents on the Internet increased, it became difficult to find things. Search engines came up to help find documents, from early ones like Yahoo to present ones like Google and Bing. Web browsers, such as Mozilla Firefox, are the programs we use to open those documents.

As more people contributed, more computers were needed to serve websites to browsers, hence the requirement for more servers. These servers needed a basic set of rules telling them how to send data, upload documents, and perform many other actions. These rules are called protocols.

This is how the Internet reached its present state. Now, how exactly do the connections happen internally? Protocols establish the rules for that communication.

Protocols

The most popular network protocol suite in the world, TCP/IP, was designed in the 1970s by Vint Cerf and Bob Kahn as part of DARPA-funded research. The two later co-founded the Internet Society and are the people most often called the fathers of the Internet. The suite gave a general set of rules for communicating data, of any kind, over a network.

TCP vs UDP

TCP (Transmission Control Protocol) is a connection-oriented communication protocol (i.e. it establishes a connection between systems before data is sent). It ensures that all data sent from the client reaches the server, retransmitting packets that get lost, and it delivers the data in the order in which it was sent. Once the exchange is complete, the connection is closed. It is used for sending emails, accessing resources on the WWW, etc.

UDP (User Datagram Protocol) is a connectionless protocol. It doesn’t open a connection between systems, and it doesn’t ensure that data packets reach the server: if some of them are lost along the way, they are not resent. It is used for video conferencing, online games, etc.
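To make the difference concrete, here is a minimal sketch using Python's standard socket module on the loopback address. The function names tcp_roundtrip and udp_roundtrip are my own, made up for illustration. Note how the TCP version has to connect and accept before any data moves, while the UDP version just fires a datagram at an address.

```python
import socket

def tcp_roundtrip(payload: bytes) -> bytes:
    """Send payload over a TCP connection on loopback; return what the server read."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))          # port 0: let the OS pick a free port
    server.listen(1)
    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    client.connect(server.getsockname())   # the handshake happens here
    conn, _ = server.accept()              # server side of the connection
    client.sendall(payload)
    client.close()                         # closing marks the end of the stream
    chunks = []
    while True:
        chunk = conn.recv(4096)
        if not chunk:                      # empty read: the peer closed
            break
        chunks.append(chunk)
    conn.close()
    server.close()
    return b"".join(chunks)

def udp_roundtrip(payload: bytes) -> bytes:
    """Send one datagram on loopback; no connection is ever opened."""
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    receiver.bind(("127.0.0.1", 0))
    receiver.settimeout(2.0)               # UDP gives no delivery guarantee
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sender.sendto(payload, receiver.getsockname())  # no connect, no handshake
    data, _ = receiver.recvfrom(4096)
    sender.close()
    receiver.close()
    return data

print(tcp_roundtrip(b"hello over TCP"))
print(udp_roundtrip(b"hello over UDP"))
```

On loopback the datagram practically always arrives, so this only shows the missing connection step, not actual packet loss. On a real network the recvfrom call could time out.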

SMTP, IMAP, POP3 for Email transfers

These protocols are used to send messages to, and receive messages from, mail servers (like Gmail, Yahoo, etc.).

The email travels from the sender to the mail server using SMTP (Simple Mail Transfer Protocol), which also relays it between mail servers until it reaches the recipient’s server.
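As a rough sketch of the sending side, Python's standard library can build a message and hand it to an SMTP server. The helper names, addresses, and the host passed to send_via_smtp are placeholders I made up. Most real servers also require a login, which I have left out.

```python
import smtplib
from email.message import EmailMessage

def build_message(sender: str, recipient: str, subject: str, body: str) -> EmailMessage:
    """Assemble a plain-text email with the usual headers."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content(body)
    return msg

def send_via_smtp(msg: EmailMessage, host: str, port: int = 587) -> None:
    """Hand the message to an SMTP server (host and port are assumptions).

    Real servers usually need server.login(user, password) after starttls().
    """
    with smtplib.SMTP(host, port) as server:
        server.starttls()            # upgrade the connection to TLS
        server.send_message(msg)     # SMTP carries the mail to the server

# Building the message needs no network; sending it does.
msg = build_message("alice@example.com", "bob@example.com", "Hello", "Hi Bob!")
print(msg["To"], "|", msg["Subject"])
```

Calling send_via_smtp(msg, "smtp.example.com") would only work against a real server you have access to, so it is not called here.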

IMAP (Internet Message Access Protocol) and POP3 (Post Office Protocol v3) are email protocols that deal with managing and retrieving email messages from the receiving server.

IMAP vs POP3

POP3 downloads the email from a server to a single computer and then, by default, deletes the email from the server.
On the other hand, IMAP stores the messages on the server and synchronizes them across multiple devices.
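Here is a toy model of that difference. It does not speak real POP3 or IMAP. The class MailServer and the functions pop3_fetch and imap_sync are invented for illustration and only mimic the download-and-delete versus keep-and-sync behaviour.

```python
class MailServer:
    """A pretend mail server that just holds a list of messages."""
    def __init__(self, messages):
        self.inbox = list(messages)

def pop3_fetch(server: MailServer, device: list) -> None:
    """POP3-style: download everything to one device, then delete it from the server."""
    device.extend(server.inbox)
    server.inbox.clear()

def imap_sync(server: MailServer, device: list) -> None:
    """IMAP-style: messages stay on the server; the device mirrors them."""
    device[:] = server.inbox

# POP3: the first device to fetch gets the mail, the second finds nothing.
pop_server = MailServer(["msg1", "msg2"])
pop_laptop, pop_phone = [], []
pop3_fetch(pop_server, pop_laptop)
pop3_fetch(pop_server, pop_phone)

# IMAP: both devices see the same mailbox.
imap_server = MailServer(["msg1", "msg2"])
imap_laptop, imap_phone = [], []
imap_sync(imap_server, imap_laptop)
imap_sync(imap_server, imap_phone)

print("POP3:", pop_laptop, pop_phone)
print("IMAP:", imap_laptop, imap_phone)
```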

DHCP

Dynamic Host Configuration Protocol

In most home internet setups, the DHCP server is actually the router. It manages requests for IP addresses from devices on the local network and keeps a record of every IP address it assigns and which device it assigned it to. It also maintains a pool of IP addresses to choose from. Each device is identified by its MAC (Media Access Control) address.

This protocol is what lets a device get an IP address automatically when it joins the Wi-Fi.
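The bookkeeping described above (an address pool plus a record of which MAC got which IP) can be sketched as a small class. This is a toy and not the real DHCP message exchange. The class name ToyDhcpServer, the starting host number, and the pool size are all assumptions of mine.

```python
import ipaddress

class ToyDhcpServer:
    """Hands out addresses from a pool and remembers them per MAC address."""
    def __init__(self, network: str, first_host: int = 100, pool_size: int = 3):
        base = ipaddress.ip_network(network).network_address
        # The pool of free addresses, e.g. 192.168.1.100 to 192.168.1.102
        self.pool = [base + first_host + i for i in range(pool_size)]
        self.leases = {}                    # MAC address -> assigned IP

    def request(self, mac: str):
        """Return the device's existing lease, or assign the next free address."""
        if mac in self.leases:
            return self.leases[mac]
        if not self.pool:
            raise RuntimeError("address pool exhausted")
        ip = self.pool.pop(0)
        self.leases[mac] = ip
        return ip

    def release(self, mac: str) -> None:
        """Give the device's address back to the pool."""
        ip = self.leases.pop(mac, None)
        if ip is not None:
            self.pool.append(ip)

router = ToyDhcpServer("192.168.1.0/24")
print(router.request("aa:bb:cc:00:00:01"))   # first device
print(router.request("aa:bb:cc:00:00:02"))   # second device
print(router.request("aa:bb:cc:00:00:01"))   # first device again: same lease
```

A real DHCP server also attaches a lease time to each address, which this sketch leaves out.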

Disadvantages:

Single point of failure: If a network has only one DHCP server and it fails, new clients can’t obtain an IP address and so can’t reach the web.
No static IP: Computers on a DHCP network are awkward to use as servers because their IP address can change, unless an address is reserved for them.

So now we know how the devices in our internal network get their addresses. Suppose I have 3 devices connected to my DHCP server, each uniquely identified by its MAC address. If Device 1 sends a request to the Google server, Google sends its response back to the internal network via routers. But how does the router identify which device the response is for?

Response to the Requested Device using NAT

Network Address Translation (NAT) is the process of mapping one IP address to another by changing the headers of IP packets while they are in transit through a router.

NAT works on a gateway that sits between two networks: the internal network and the outside network.

Systems on the inside network are typically assigned private IP addresses that cannot be routed to external networks.

A few externally valid IP addresses are assigned to the gateway. The gateway makes outbound traffic from an inside system appear to be coming from one of the valid external addresses.

It also takes incoming traffic aimed at a valid external address and sends it to the correct internal system.

This also helps with security, because each outgoing or incoming request must go through a translation process that offers the opportunity to qualify or authenticate incoming streams and match them to outgoing requests.
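A translation table like this can be sketched in a few lines. I am assuming the port-based flavour of NAT that home routers commonly use, where one public address is shared and each inside connection gets its own public port. The class ToyNat, the starting port 40000, and the addresses are all made up for illustration.

```python
class ToyNat:
    """Rewrites (private IP, port) to (public IP, port) and back."""
    def __init__(self, public_ip: str, first_port: int = 40000):
        self.public_ip = public_ip
        self.next_port = first_port
        self.out_map = {}   # (private_ip, private_port) -> public_port
        self.in_map = {}    # public_port -> (private_ip, private_port)

    def outbound(self, src_ip, src_port, dst_ip, dst_port):
        """Translate an outgoing packet so it appears to come from the gateway."""
        key = (src_ip, src_port)
        if key not in self.out_map:
            self.out_map[key] = self.next_port
            self.in_map[self.next_port] = key
            self.next_port += 1
        return (self.public_ip, self.out_map[key], dst_ip, dst_port)

    def inbound(self, src_ip, src_port, dst_port):
        """Translate a reply back to the inside device, or drop it if unknown."""
        if dst_port not in self.in_map:
            return None     # no matching outgoing request: drop the packet
        private_ip, private_port = self.in_map[dst_port]
        return (src_ip, src_port, private_ip, private_port)

nat = ToyNat("203.0.113.5")
# Device 1 (private address) talks to an outside server on port 443.
print(nat.outbound("192.168.1.10", 51000, "198.51.100.7", 443))
# The server's reply arrives at the public port and is mapped back to Device 1.
print(nat.inbound("198.51.100.7", 443, 40000))
# A packet aimed at a port nobody asked for is dropped.
print(nat.inbound("198.51.100.7", 443, 40001))
```

The dropped packet in the last line is the matching of incoming streams to outgoing requests mentioned above.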

Response to the Requested Application using Ports

Now, a device can have multiple applications running on it. How do we identify which application made the request?

Ports enable the device to identify the application to which the response has to be sent.

This is a long story, bear with me :P

A socket connection is uniquely identified by a tuple of [Protocol, Local IP, Local Port, Peer IP, Peer Port].

A TCP server creates a listening socket with a tuple of [TCP, Listen IP, Listen Port, 0, 0]. When a client requests to connect to a server, the network routes the request to the specified IP/Port. The receiving device then routes the request to a matching listening socket, performs a 3-way handshake with the client, and puts the new connection into a queue. Later, when accept() is called, it extracts the next pending client from the queue and returns a new socket identified with a tuple of [TCP, Listen IP, Listen Port, Client IP, Client Port]. Because of this, a single listening socket can accept multiple clients from different Client IP/Port combinations.

A TCP client creates a connecting socket with a tuple of [TCP, Local IP, Local Port, 0, 0]. When the 3-way handshake is complete, the socket's tuple is updated to [TCP, Local IP, Local Port, Server IP, Server Port].

All subsequent data exchanges use these tuples.

Data sent out from a Client’s connecting socket will be sent to the associated Server IP/Port and stored in the buffer of the accepted Server socket whose tuple matches both the Client and Server.

Data cannot be sent out from a Server’s listening socket (the attempt fails), since there is no associated Client.

Data sent out from an accepted Server socket will be sent to the associated Client IP/Port and stored in the buffer of the connected Client socket whose tuple matches the Client and Server.
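The tuples described above can be observed with real sockets. Below is a small sketch on the loopback address in which one listening socket accepts two clients. The function name demo_socket_tuples is mine. getsockname() gives the local IP/Port half of the tuple and getpeername() gives the peer half.

```python
import socket

def demo_socket_tuples():
    """One listening socket, two clients; return each accepted socket's tuple and data."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))          # port 0: let the OS pick a free port
    server.listen(5)
    listen_addr = server.getsockname()     # (Listen IP, Listen Port)

    clients = []
    for _ in range(2):
        c = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        c.connect(listen_addr)             # handshake; the OS picks the client port
        clients.append(c)

    accepted = {}                          # (Client IP, Client Port) -> accepted socket
    for _ in clients:
        conn, peer = server.accept()       # next pending client from the queue
        conn.settimeout(2.0)
        accepted[peer] = conn

    received = []
    for i, c in enumerate(clients):
        c.sendall(f"hello from client {i}".encode())
        conn = accepted[c.getsockname()]   # the accepted socket whose tuple matches
        received.append((conn.getsockname(), conn.getpeername(), conn.recv(1024)))

    for s in [server, *clients, *accepted.values()]:
        s.close()
    return listen_addr, received

listen_addr, received = demo_socket_tuples()
print("listening on", listen_addr)
for local, peer, data in received:
    print("local", local, "peer", peer, "data", data)
```

Both accepted sockets report the same local IP/Port as the listening socket and differ only in the peer half, which is why the data from each client lands in the right buffer.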

And that is how a response from the Internet reaches the right device and application on the internal network, without our private IPs being exposed to the world.