HTTP stands for Hypertext Transfer Protocol. It is a communication protocol used to transfer information between a client and a web server.
HTTP was invented alongside HTML to create the first interactive, text-based web browser, the original WorldWideWeb.
Today, HTTP remains one of the main ways of using the Internet, and one that Internet users cannot do without.
If you are an Internet user, this article covers what you need to know about the HTTP protocol.
Chapter 1: What is HTTP?
Because HTTP is one of the fundamental elements of the Internet, this chapter covers what you need to understand about the protocol.
1.1. Definition and history of HTTP
HTTP, or Hypertext Transfer Protocol, is a protocol for encoding and transferring information between a web browser and a web server. It is the main protocol for transmitting information on the Web.
The first version, now known as HTTP/0.9, grew out of work begun in the late eighties, but it was very limited: it supported only simple GET requests, with no headers.
Combined with HTML and URLs, this invention is now considered the basis of the World Wide Web global information initiative.
The work was led by Tim Berners-Lee at CERN in Geneva, to allow information to be shared within the physics community.
Given the inadequacies of HTTP/0.9, Berners-Lee and others worked on improvements during the early nineties, which resulted in the first full version, HTTP/1.0.
This version was documented in RFC 1945, published in 1996 through the Internet Engineering Task Force (IETF), the Internet standards body.
With the release of NCSA Mosaic, an easy-to-use graphical browser, the WWW grew in popularity and some of the limitations of version 1.0 of the protocol became evident, in particular:
- The inability to host multiple websites on the same server (so-called virtual hosting),
- The failure to reuse open connections,
- Insufficient security mechanisms,
- Etc.
These limitations led to a new version, HTTP/1.1. It was published as RFC 2068 in 1997 and updated in 1999 by RFC 2616.
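To make the virtual host limitation concrete, here is a minimal sketch of the request text in both versions. It relies on a detail not spelled out above: HTTP/1.1 requires a Host header, which lets one server tell its sites apart. The function name build_request and the host site-a.example are placeholders chosen for this illustration.

```python
def build_request(version: str, host: str, path: str = "/") -> str:
    """Return the text of a minimal GET request for the given HTTP version."""
    lines = [f"GET {path} HTTP/{version}"]
    if version == "1.1":
        # HTTP/1.1 requires a Host header, so a server behind a single
        # IP address can tell which of its sites the request is meant for.
        lines.append(f"Host: {host}")
    # A blank line ends the header section.
    return "\r\n".join(lines) + "\r\n\r\n"


old = build_request("1.0", "site-a.example")
new = build_request("1.1", "site-a.example")
print(repr(old))
print(repr(new))
```

The HTTP/1.0 request carries only the path, so a server hosting several sites has no way of knowing which one is wanted.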
1.2. How the HTTP protocol works
Whenever a user requests a web page, they are using HTTP, not only to send the request to the server hosting the page, but also to receive the data the server sends in response.
This means that HTTP must be present in the application layer of both the client and the server; otherwise the communication cannot take place.
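The request and response exchange can be sketched with Python's standard library alone. This is only an illustration under assumptions: the handler HelloHandler, the helper fetch_once and the HTML body are made up for the example, and both sides run on the local machine so that no real website is contacted.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Server side: answers every GET request with a tiny HTML page."""

    def do_GET(self):
        body = b"<html><body>Hello</body></html>"
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def fetch_once(path: str = "/"):
    """Start a local server, send one GET request, return (status, body)."""
    server = HTTPServer(("127.0.0.1", 0), HelloHandler)  # port 0: any free port
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        # Client side: the request goes out, the response comes back.
        conn = http.client.HTTPConnection(
            "127.0.0.1", server.server_address[1], timeout=5
        )
        conn.request("GET", path)
        response = conn.getresponse()
        result = (response.status, response.read())
        conn.close()
        return result
    finally:
        server.shutdown()
        server.server_close()


status, body = fetch_once()
print(status, body)
```

A browser does the same thing on a larger scale: one request per resource, each answered by a response that carries a status and a body.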
The request is made through the browser, which manages the entire communication and displays the requested resources on screen. The browser sees a web page as a set of objects linked together by hyperlinks.
A web page is thus typically composed of an HTML document and other resources such as scripts, images, Java applets, etc.
HTTP relies on TCP, a transport layer protocol, to transfer data between the server and the client.
The main reason HTTP uses TCP is that TCP guarantees reliable data transfer, unlike UDP.
Although reliable transfer is a great advantage, it also means a longer wait before the requested resource arrives.
This is because TCP must establish a connection, in an operation called a handshake, before any data is transmitted.
Reliable data transfer is essential for HTTP, because if the entire HTML body of a Web page were not transferred to the browser, due to a transmission error, the requested page would be impossible to display or its content would be altered.
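The order of events (connection first, HTTP data second) can be seen by speaking HTTP directly over a TCP socket. The sketch below is an assumption-laden toy: the one-shot server serve_one, the helper raw_get and the fixed response are invented for the example and run only on the local machine.

```python
import socket
import threading

RESPONSE = (
    b"HTTP/1.0 200 OK\r\n"
    b"Content-Type: text/plain\r\n"
    b"Content-Length: 2\r\n"
    b"\r\n"
    b"ok"
)


def serve_one(listener: socket.socket) -> None:
    """Accept a single connection, read the request, send a fixed response."""
    conn, _addr = listener.accept()  # returns once the TCP handshake is done
    with conn:
        conn.recv(4096)  # the request is small enough for one read here
        conn.sendall(RESPONSE)


def raw_get() -> bytes:
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)  # SOCK_STREAM = TCP
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    port = listener.getsockname()[1]
    worker = threading.Thread(target=serve_one, args=(listener,), daemon=True)
    worker.start()
    # create_connection completes the TCP handshake before any HTTP data is sent.
    with socket.create_connection(("127.0.0.1", port), timeout=5) as client:
        client.sendall(b"GET / HTTP/1.0\r\n\r\n")
        chunks = []
        while True:
            data = client.recv(4096)
            if not data:  # the server closed the connection
                break
            chunks.append(data)
    worker.join(timeout=5)
    listener.close()
    return b"".join(chunks)


reply = raw_get()
print(reply.decode())
```

Everything HTTP adds is plain text on top of the byte stream that TCP delivers in order and without loss.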
As mentioned earlier, HTTP belongs to the application layer, while TCP is a transport layer protocol. So how does HTTP exchange data with TCP?
It does so through sockets, which are the contact point between the application layer and the transport layer. Each application on a host has its own socket interface to the transport layer.
For example, if a user requests a web page and sends an e-mail at the same time, there will be at least two sockets:
- One that handles the exchange of data between HTTP and the transport layer,
- And the other between the e-mail protocols and the transport layer.
When the user opens two web pages at the same time, the browser does not need to be started twice: a single process can hold several sockets, typically one per connection.
Each socket is identified separately by the transport layer, so the received packets are handled independently and the two requested pages are displayed separately.
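A small experiment shows how the transport layer keeps exchanges apart. In this sketch (the helper name open_two_connections is made up, and everything stays on the local machine), one process opens two connections to the same destination and each socket receives its own local port number.

```python
import socket


def open_two_connections():
    """Open two TCP connections from this process and return their local addresses."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(2)
    target = listener.getsockname()
    first = socket.create_connection(target, timeout=5)
    second = socket.create_connection(target, timeout=5)
    try:
        # getsockname() gives the (IP address, port) of our end of each connection.
        return first.getsockname(), second.getsockname()
    finally:
        first.close()
        second.close()
        listener.close()


addr_a, addr_b = open_two_connections()
print(addr_a, addr_b)
```

The two local ports differ, which is what allows incoming packets to be matched to the right exchange.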
1.3. The purpose of the HTTP protocol
The name Hypertext Transfer Protocol points directly to the role of HTTP: transmitting website data over the Internet.
Hypertext refers to the standard form of websites, in which one page can send users to another page through clickable hyperlinks, usually simply called links.
The purpose of the HTTP protocol is to provide a standard way for web browsers and servers to communicate with each other.
Web pages are written in Hypertext Markup Language, or HTML, but HTTP is used today to transfer more than just HTML and the Cascading Style Sheets, or CSS, that indicate how pages should be displayed.
HTTP is also used to transfer other content on Web sites, including images, video and audio files.
Computers can also connect to web servers over HTTP simply to request files from particular web addresses.
When a computer is simply retrieving data, it typically sends an HTTP message called a GET request; when it is sending form data or uploading a file, it uses other message types called POST or PUT requests.
You can see the HTTP messages that your web browser sends by using the developer tools built into many browsers.
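The difference between these request types can also be explored without a browser. The sketch below only builds request objects with Python's urllib and never sends them; the URL and the form fields are placeholders invented for the example.

```python
from urllib.parse import urlencode
from urllib.request import Request

url = "http://example.com/form"  # placeholder address, never contacted here

# No body: urllib treats this as a GET request (plain retrieval).
get_req = Request(url)

# Form data in the body: urllib treats this as a POST request.
form = urlencode({"name": "Ada", "lang": "en"}).encode("ascii")
post_req = Request(url, data=form)

# The method can also be stated explicitly, for instance PUT for an upload.
put_req = Request(url, data=b"file contents", method="PUT")

for req in (get_req, post_req, put_req):
    print(req.get_method(), req.full_url, req.data)
```

Passing one of these objects to urllib.request.urlopen would actually send it, which is left out here on purpose.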
Today, HTTP is used by many applications other than web browsers to send messages to servers.
Developers often choose HTTP for their applications because it is widely understood.
Another reason is that HTTP is generally not filtered by network firewalls designed to allow Web traffic, which means that HTTP messages can pass through most home and office networks without a problem.
1.4. The benefits of HTTP
The first thing to know is that HTTP uses a convenient addressing scheme. Resources are addressed by URLs that contain recognizable host names rather than numeric IP addresses, so they can easily be found on the World Wide Web.
Compared with having to type an IP address as a series of numbers, this makes it much easier for the public to interact with the Internet.
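As a small illustration of this addressing scheme, Python's urllib.parse can split a web address into its parts. The address used here is a made-up example.

```python
from urllib.parse import urlsplit

parts = urlsplit("http://www.example.com:8080/docs/index.html?lang=en")

print(parts.scheme)    # http
print(parts.hostname)  # www.example.com
print(parts.port)      # 8080
print(parts.path)      # /docs/index.html
print(parts.query)     # lang=en
```

The host name is the human-friendly part; it is translated into a numeric IP address before the connection is opened.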
HTTP is also extensible. Whenever an application needs additional capabilities, extra functionality can be provided by downloading extensions or plugins that display the relevant data.
In its simplest form, each file is downloaded over its own connection, which is then closed, so no more than one element of a web page is transferred per connection. Each exchange is therefore short and independent of the others.
In addition, when a page is first loaded, its content can be stored in caches, such as the browser cache.
As a result, when the page is visited again, the content loads faster.
1.5. The disadvantages of HTTP
Since HTTP does not encrypt data, it is quite possible for your content to be modified by someone else in transit.
This is why HTTP is considered an insecure method that does not guarantee data integrity, which leaves the data vulnerable to attacks.
Confidentiality is another problem with an HTTP connection. If an attacker manages to intercept the request, they can read all the content of the web page.
They can also very easily collect confidential information such as usernames and passwords.
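To show how little stands between an intercepted request and the credentials inside it, here is a sketch based on HTTP Basic authentication. That scheme is an assumption brought in for the example (it is not discussed in this article), and the function names and the credentials are invented. Basic authentication only encodes the username and password with Base64, which anyone can reverse.

```python
import base64


def basic_auth_header(username, password):
    """Build the Authorization header a client would send over plain HTTP."""
    raw = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return f"Authorization: Basic {token}"


def recover_credentials(header_line):
    """What an eavesdropper can do with the intercepted header: decode it."""
    token = header_line.split("Basic ", 1)[1]
    decoded = base64.b64decode(token).decode("utf-8")
    username, password = decoded.split(":", 1)
    return username, password


header = basic_auth_header("alice", "s3cret")
print(header)
print(recover_credentials(header))
```

No key is needed to decode the header, which is why such credentials should only travel over an encrypted connection.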
In addition, even once the client has received all the data it needs, it may not take any action to close the connection. During this time, the server keeps resources tied up and may not be available to other clients.
Finally, when HTTP has to create multiple connections to transmit a web page, setting up and managing those connections causes overhead.
1.6. How are HTTP and HTTPS different?
While HTTP stands for Hypertext Transfer Protocol, HTTPS stands for Hypertext Transfer Protocol Secure.
You will notice that some URLs start with http and others with https. The "s" stands for secure: the connection is encrypted, and the site's identity is backed by a certificate.
With HTTPS, data is encrypted before it is sent and decrypted on the recipient's side. The server's public key, which makes this possible, is obtained from its SSL certificate.
The SSL certificate acts as an online identity document, indicating that the website is who it claims to be and that the connection to it is protected.
Websites that collect sensitive information, including personal addresses and credit card numbers, should obtain an SSL certificate.
SSL encryption has many benefits for both customers and websites. Key benefits include:
- Protection from hackers: Since the connection is encrypted, hackers and identity thieves will find it much harder to read sensitive information.
- Authenticity and reliability: People want to do business with a secure and trustworthy website. Many will not make purchases on websites that are not verified and encrypted.
- Increased conversion rate: According to one analysis, secure e-commerce websites can see an increase of 18 to 87% in conversion rates.
HTTP, by contrast, is limited to transmitting messages without encryption, which leaves the data very vulnerable to attack.
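On the client side, the difference shows up in a few defaults. The sketch below inspects Python's standard library without opening any connection; it is one way to look at the certificate check, not a description of how browsers are built.

```python
import http.client
import ssl

# Default settings a Python client uses for HTTPS connections.
context = ssl.create_default_context()

# The server must present a certificate, and it must be valid...
print(context.verify_mode == ssl.CERT_REQUIRED)
# ...and it must match the host name the client asked for.
print(context.check_hostname)

# HTTP and HTTPS also use different default ports.
print(http.client.HTTP_PORT, http.client.HTTPS_PORT)
```

If either check fails, the connection is refused before any HTTP message is exchanged.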
Chapter 2: What is an HTTP status code?
The client makes requests to the server, and in return the server responds with status codes and message payloads. The status code is important: it tells the client how to interpret the server's response.
The HTTP specification defines certain number ranges for specific types of responses:
2.1. 1xx: Informational messages
This class of codes was introduced in HTTP/1.1 and is purely provisional. All HTTP/1.1 clients must accept the Transfer-Encoding header.
A client can send an Expect: 100-continue header and wait; the server then replies with 100 Continue, telling the client to continue sending the rest of the request, or to ignore the reply if it has already sent it. HTTP/1.0 clients are not expected to handle these codes.
2.2. 2xx: Successful
This class indicates that the client's request was successfully processed. The most common code is 200 OK. For a GET request, the server sends the resource in the message body. There are other, less frequently used codes:
- 202 Accepted: The request was accepted, but the response may not include the resource. This is useful for asynchronous processing on the server side. The server may choose to send information for monitoring.
- 205 Reset Content: Tells the client to reset its document view.
- 206 Partial Content: Indicates that the response contains only part of the content. Additional headers indicate the exact range and the expiration information of the content.
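The 206 case can be sketched as a small function that answers a Range header. This is a simplified illustration under stated assumptions: the name serve_range is invented, only the single "bytes=start-end" form is handled, and anything else falls back to a full 200 response.

```python
def serve_range(content: bytes, range_header: str):
    """Return (status, headers, body) for a single 'bytes=start-end' range."""
    full = (200, {"Content-Length": str(len(content))}, content)
    unit, _, spec = range_header.partition("=")
    start_text, _, end_text = spec.partition("-")
    if unit != "bytes" or not start_text.isdigit():
        return full  # form not handled by this sketch: send everything
    start = int(start_text)
    if start >= len(content):
        # The requested range starts past the end of the content.
        return 416, {"Content-Range": f"bytes */{len(content)}"}, b""
    # An open-ended range ("bytes=7-") runs to the last byte.
    end = int(end_text) if end_text.isdigit() else len(content) - 1
    end = min(end, len(content) - 1)
    if end < start:
        return full  # invalid range: ignored
    body = content[start:end + 1]
    headers = {
        "Content-Range": f"bytes {start}-{end}/{len(content)}",
        "Content-Length": str(len(body)),
    }
    return 206, headers, body


status, headers, body = serve_range(b"Hello, HTTP!", "bytes=0-4")
print(status, headers, body)
```

The Content-Range header tells the client which bytes it received and how large the whole resource is, so a download can be resumed.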
2.3. 3xx: Redirect
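These codes tell the client that it must take additional action, usually by requesting a different URL or by using a cached copy.
- 303 See Other: The response to the request can be found at another URL, which the client should retrieve with a GET request.
- 304 Not Modified: The server has determined that the resource has not changed and the client should use its cached copy. This relies on the client sending back the ETag (Entity Tag) it received earlier, which is typically a hash of the content. The server compares it with its own ETag calculation to check for changes.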
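The ETag comparison behind 304 can be sketched in a few lines. Several things here are assumptions made for the illustration: the ETag is computed as a SHA-256 hash of the content, the function names are invented, and the value the client sends back is passed in as a plain argument (in a real request it travels in the If-None-Match header).

```python
import hashlib


def make_etag(content: bytes) -> str:
    # Assumption for this sketch: the ETag is a SHA-256 hash in quotes.
    return '"' + hashlib.sha256(content).hexdigest() + '"'


def conditional_get(content: bytes, if_none_match=None):
    """Return (status, headers, body), skipping the body when nothing changed."""
    etag = make_etag(content)
    if if_none_match == etag:
        # The client's cached copy is still current: no body is sent.
        return 304, {"ETag": etag}, b""
    return 200, {"ETag": etag}, content


page = b"<p>version 1</p>"
status1, headers1, body1 = conditional_get(page)                    # first visit
status2, headers2, body2 = conditional_get(page, headers1["ETag"])  # revisit
status3, headers3, body3 = conditional_get(b"<p>version 2</p>", headers1["ETag"])
print(status1, status2, status3)
```

On the revisit only the headers cross the network, which is where the saving comes from.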
2.4. 4xx: Client error
These codes are used when the server thinks the client is at fault, either by requesting an invalid resource or by making a bad request.
The best-known code in this class is 404 Not Found, which I think everyone has come across. Codes in this class include:
- 400 Bad Request: The request was incorrectly formed.
- 403 Forbidden: The server has denied access to the resource.
- 404 Not Found: The resource is invalid and does not exist on the server. This forces the client to take further action, often by requesting a different URL to retrieve the resource.
- 405 Method Not Allowed: An invalid HTTP verb was used in the request line, or the server does not support this verb.
- 409 Conflict: The server could not complete the request because the client is trying to modify a resource that is newer than the client's timestamp. Conflicts occur mainly for PUT requests during collaborative changes to a resource.
2.5. 5xx: Server error
This class of codes is used to indicate a server failure in processing the request. The most commonly used error code is 500 Internal Server Error. The others in this class are:
- 501 Not Implemented: The server does not yet support the requested functionality.
- 503 Service Unavailable: This can occur if an internal system on the server is down or the server is overloaded. Usually the server does not even respond and the request times out.
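The five ranges covered in this chapter can be summed up in a small helper. This is a sketch: the function name describe and the category labels are chosen for the example, while the reason phrases come from Python's http.HTTPStatus.

```python
from http import HTTPStatus

CLASSES = {
    1: "informational",
    2: "successful",
    3: "redirect",
    4: "client error",
    5: "server error",
}


def describe(code: int) -> str:
    """Return the code, its standard phrase and the class it belongs to."""
    category = CLASSES.get(code // 100, "unknown")  # the first digit gives the class
    try:
        phrase = HTTPStatus(code).phrase
    except ValueError:
        phrase = "(unregistered code)"
    return f"{code} {phrase}: {category}"


for code in (200, 304, 404, 405, 503):
    print(describe(code))
```

A client that does not recognise a particular code can still fall back on the first digit to decide how to react.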
Conclusion
From its first version to the most recent, HTTP has been the main data transmission protocol on the Web. It is unavoidable for any Internet user who wants a response to their request.
It is therefore worth knowing the essentials of HTTP if you use the Net. To that end, we have covered the points about HTTP that may seem complex.
I hope this article was useful to you. Don't hesitate to leave a comment if you have any questions.