How does the internet work?

This blog is mostly about coding, but it's also important to get to know related areas and tools, for example how the internet works. This post describes what happens in the background from the moment you visit a website until the finished page is shown to you.

The structure of a URL

A Uniform Resource Locator (URL) has three main parts: the protocol, the domain and the path.
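To make the three parts concrete, here is a minimal sketch that splits a URL with Python's standard urllib.parse module. The function name main_parts and the example.com address are made up for illustration only.

```python
from urllib.parse import urlsplit


def main_parts(url: str) -> dict:
    """Return the protocol, domain and path of a URL (hypothetical helper)."""
    parts = urlsplit(url)
    return {
        "protocol": parts.scheme,   # e.g. "https"
        "domain": parts.hostname,   # e.g. "www.example.com"
        "path": parts.path,         # e.g. "/blog/how-the-internet-works"
    }


# example.com is a placeholder address, not a site from this post
print(main_parts("https://www.example.com/blog/how-the-internet-works"))
```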



The protocol

Only one protocol is visible in the URL, 'http' or 'https', but several protocols describe how the participants of the network communicate with each other. They are conventions or standards and they build on each other. Here are some of them:

When you browse a website, the browser accesses the server with the HyperText Transfer Protocol (HTTP). HTTP is the foundation of data communication for the World Wide Web, so it is the protocol of the web. It appears in two forms, http and https, where https is the secured version. The 's' means that the transferred data is encrypted, so it cannot be read by anyone except the recipient.

The Transmission Control Protocol (TCP) lets programs on two different computers talk to each other. It ensures that the data streams are transmitted reliably and correctly.

In the next step our computer packs this data into Internet Protocol (IP) packets, so it can travel through the internet. (IP is the basic protocol of the internet.) After the packets arrive at their destination, the unpacking happens in the opposite direction, and the web server receives the request.
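The layering is easier to see in a small experiment. The sketch below is only an illustration under a few assumptions: it starts a tiny web server on your own machine (so no internet connection is needed), opens a TCP connection to it and writes an HTTP request by hand. The operating system takes care of the IP packets underneath. HelloHandler and fetch_raw are made-up names.

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Tiny demo web server: answers every GET with a short text body."""

    def do_GET(self):
        body = b"hello"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo output quiet


def fetch_raw(host: str, port: int, path: str) -> bytes:
    """Write an HTTP request by hand over a TCP connection, return the raw reply."""
    request = (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Connection: close\r\n"
        "\r\n"
    ).encode("ascii")
    # TCP layer: a reliable byte stream between two programs
    with socket.create_connection((host, port), timeout=5) as conn:
        conn.sendall(request)  # HTTP layer: the request is just text on that stream
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:  # the server closed the connection, the reply is complete
                break
            chunks.append(data)
    return b"".join(chunks)


# Port 0 asks the operating system for any free port on the local machine
server = HTTPServer(("127.0.0.1", 0), HelloHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

reply = fetch_raw("127.0.0.1", server.server_port, "/")
server.shutdown()
server.server_close()

status_line = reply.split(b"\r\n")[0].decode("ascii")
print(status_line)  # the first line of the reply carries the status code
```

A real browser does the same thing with many more headers, and with https it adds an encryption step between the HTTP text and the TCP connection.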

What's 404?

The error message that average users meet most often is 404, but what does it mean exactly? It is possibly the most "infamous" of the HTTP status codes.

HTTP status codes tell the browser whether everything loaded correctly or there was an issue during the page load. Of course users only ever see the error messages, because if everything goes well, the page just appears.

When the page loads successfully, the status code is in the 200s (2xx).

If the page you are looking for has moved, the code is 3xx. In this case the browser can follow the redirect automatically.

The 4xx class of status codes indicates client errors, meaning something was wrong with the request itself. The best known is 404 (Not Found), which you get when you ask for something that does not exist.

Status codes beginning with the digit "5" indicate a server error: the server is incapable of performing the request, for example because it ran out of resources when too many users were on the page, or because the developer made a big mistake. :)
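The classes are easy to tell apart in code, because the first digit decides everything. Here is a small sketch using Python's built-in http.HTTPStatus. The function describe_status is a made-up helper, and the 1xx "informational" class is included only for completeness.

```python
from http import HTTPStatus

# First digit of the code -> class of the response
STATUS_CLASSES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def describe_status(code: int) -> str:
    """Return a readable description such as '404 Not Found: client error'."""
    kind = STATUS_CLASSES.get(code // 100, "unknown class")
    try:
        phrase = HTTPStatus(code).phrase  # standard name of the code
    except ValueError:
        phrase = "Unknown"  # not a code Python knows about
    return f"{code} {phrase}: {kind}"


for code in (200, 301, 404, 503):
    print(describe_status(code))
# 200 OK: success
# 301 Moved Permanently: redirection
# 404 Not Found: client error
# 503 Service Unavailable: server error
```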

If you have never met a 404 page (but I doubt that), check these three. I think they are the best error pages: GitHub, Airbnb and Bitly.

The domain

The domain identifies a computer like a personal ID. Every computer on the internet has an identification number, called the IP address after the Internet Protocol mentioned above. The communication between computers takes place between these addresses.

But because IP addresses are sequences of numbers and too difficult to remember, domain names make it easier for us humans. For example, one of the IP addresses of the IMDb site is 207.171.162.180. It would be confusing if we had to recall these numbers instead of meaningful words, or in this case four letters.

Domains are registered in the Domain Name System (DNS). During browsing, the domain is looked up there and paired with the corresponding IP address. If there is a website behind the domain, the server sends the web page to your computer and the browser displays it on the monitor.
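You can try the pairing of name and address yourself. The sketch below asks the operating system's resolver for the addresses behind a name, using socket.getaddrinfo from Python's standard library. It uses "localhost" so that it runs without a network connection (that name is usually answered from a local file rather than from DNS). Swap in a real domain to see an actual DNS lookup. The function name resolve is made up.

```python
import socket


def resolve(domain: str) -> list[str]:
    """Return the IP addresses the system resolver finds for a domain name."""
    infos = socket.getaddrinfo(domain, None, proto=socket.IPPROTO_TCP)
    # Each entry ends with a socket address whose first item is the IP address
    return sorted({info[4][0] for info in infos})


# "localhost" always points back to your own computer
print(resolve("localhost"))
```

The result can contain both the older IPv4 form (four numbers separated by dots) and the newer IPv6 form, depending on your system.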

The path

A path is usually organized in a hierarchical form and resembles a file system path (which shows where you can find a file on your computer). The path must begin with a single slash (/) and its parts are separated by slashes. It identifies the resource you are requesting on the web server.

Summary

You might think of a URL like a regular postal mail address: the protocol represents the postal service you want to use, the domain name is the city or town, and the port is like the zip code; the path represents the building where your mail should be delivered; the parameters represent extra information such as the number of the apartment in the building; and, finally, the anchor represents the actual person to whom you've addressed your mail. (Source: Mozilla Developer Network)
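The postal analogy also mentions the port, the parameters and the anchor, which a URL can carry in addition to the three main parts. As a rough sketch, the same standard urllib.parse module can pull those out too. The function postal_view and the example.com address with its parameters are invented for this illustration.

```python
from urllib.parse import parse_qs, urlsplit


def postal_view(url: str) -> dict:
    """Label the parts of a URL with the roles from the postal analogy."""
    parts = urlsplit(url)
    return {
        "protocol": parts.scheme,              # the postal service
        "domain": parts.hostname,              # the city or town
        "port": parts.port,                    # the zip code (None if not given)
        "path": parts.path,                    # the building
        "parameters": parse_qs(parts.query),   # the apartment number
        "anchor": parts.fragment,              # the person
    }


# A made-up address that contains every optional part
view = postal_view("https://www.example.com:443/blog/posts?tag=html&page=2#comments")
for role, value in view.items():
    print(f"{role}: {value}")
```

Most URLs you type leave out the port, because the browser assumes 80 for http and 443 for https.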

Page loading step by step

The two main building blocks of a website are HTML and CSS.

HTML (HyperText Markup Language) is a markup language that holds the content or data. CSS (Cascading Style Sheets) gives this data its look and style, while JavaScript lets you add programmed behaviour as well.
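To show how the three pieces sit together in one page, here is a small sketch that assembles a minimal HTML document from separate content, style and behaviour strings. Everything in it (the function build_page and the sample heading, colour and click handler) is invented for illustration. Save the printed output as a .html file and open it in a browser to see the result.

```python
def build_page(content: str, style: str, behaviour: str) -> str:
    """Combine HTML content, CSS rules and JavaScript code into one page."""
    return (
        "<!DOCTYPE html>\n"
        "<html>\n"
        "<head>\n"
        f"<style>{style}</style>\n"        # CSS: how the content looks
        "</head>\n"
        "<body>\n"
        f"{content}\n"                     # HTML: the content itself
        f"<script>{behaviour}</script>\n"  # JavaScript: what the page does
        "</body>\n"
        "</html>\n"
    )


content = "<h1>Hello, internet!</h1>"
style = "h1 { color: teal; font-family: sans-serif; }"
behaviour = "document.querySelector('h1').onclick = () => alert('clicked');"

print(build_page(content, style, behaviour))
```

If you pass an empty string as the style, you get the same content without any design.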

In the video below you can see how one of IMDb's pages loads, step by step. Many things can affect loading, such as the speed of your internet connection. At around 4 seconds the first bits of the page begin to appear, and then slowly the CSS, the images and other assets are loaded too.

(I made this video with WebPagetest. You can test any website with it. It's fun!)

The picture below shows what happens when you intentionally turn off the CSS, or when it was not loaded properly. You can see that without the stylesheet the design does not exist, and the page is not too pretty.



Coming up next on the blog

Now that we understand the basic building blocks of a webpage, we are going to dive deep into them, starting with HTML. The next blog post will be about how to build simple webpages with HTML only.