Hyper Text Transfer Protocol (HTTP)

In our previous article, we saw that for data to pass from one host (client or server) to another, it has to travel through a set of logical network layers (the OSI and TCP/IP models). For those who haven't gone through it yet, please click here.

We noticed that the topmost layer of the model is the Application layer. This layer deals with the applications through which we use the internet (e.g. web browsers). HTTP is one of the protocols of the Application layer. It stands for Hyper Text Transfer Protocol. Hypertext is text that contains links to other text, hence hyperlinks. HTTP is a request/response protocol: whenever we type a web address and hit Enter, our web browser sends a request to the web server, which sends back a response. This is the basic idea behind the whole protocol.

All of us know about "www" (World Wide Web). Many website addresses start with "www", e.g. "www.holymotherpython.blogspot.in". Well, what exactly is "www"?

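In 1989, Tim Berners-Lee, a British scientist at CERN, proposed the World Wide Web, and Robert Cailliau joined him in developing it. Berners-Lee wrote the first web browser in 1990 and named it… "WorldWideWeb"! It was later renamed Nexus. The World Wide Web is therefore a service that runs on top of the internet: a collection of linked documents that browsers fetch using HTTP.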

The web needs a unique address to identify each resource and the server that hosts it. This address is called the Uniform Resource Identifier (URI).

In the figure above, we can find all the pieces that the web browser needs to locate the receiver's machine (usually a server). The scheme tells it that the request is made over HTTP, whose default port is 80, and the remaining parts give the host name and the path of the resource on that host. A Uniform Resource Locator (URL) is a kind of URI: one that not only names a resource but also says where to find it and how to fetch it. As the technology advanced, web browsers became smarter. As a result, we no longer need to type the full address, which is harder to remember; we just type the host name. This does not reduce the importance of the full URI. It is still there; our browser fills in the missing parts, such as the scheme, automatically.
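
To make the parts of such an address concrete, here is a minimal sketch that splits one with Python's standard library. The address and the helper effective_port are made up for illustration; they are not taken from the figure.

```python
from urllib.parse import urlsplit

# Hypothetical address, used only for illustration.
uri = "http://www.example.com:80/articles/http.html?lang=en#intro"

parts = urlsplit(uri)
print(parts.scheme)    # which protocol to speak
print(parts.hostname)  # which host to contact
print(parts.port)      # which TCP port on that host
print(parts.path)      # which resource on that host
print(parts.query)     # extra parameters for the resource
print(parts.fragment)  # a position inside the document

# When the port is left out, the client falls back to the scheme's default.
DEFAULT_PORTS = {"http": 80, "https": 443}

def effective_port(address):
    """Return the explicit port, or the default port for the scheme."""
    pieces = urlsplit(address)
    return pieces.port or DEFAULT_PORTS.get(pieces.scheme)

print(effective_port("http://www.example.com/"))
```

The last call shows the "browser adds it automatically" idea: nothing in the address mentions a port, yet the client still knows to use 80 for plain HTTP.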

But HTTP wasn't always this smart. An early version of HTTP, HTTP/1.0, had some gaps which were later filled by HTTP/1.1. Let's look at them one by one.

HTTP 1.0

This protocol was non-persistent and stateless. Persistence in a network means keeping the connection open until all the data has been transferred. This wasn't the case with HTTP 1.0. To understand better, let's take the following example that resembles HTTP 1.0.

Let's say you wish to call your friend (say his name is Dick Rules) to inquire about something.

You call his phone.

You: "Hello…"

Dick: "Hey, what's up?"

And then he hangs up the phone.

You again call your friend Dick and reply.

You: "Hey I just wanted to check if you're still up for that movie?"

Dick: "Yeah Dude. Totally."

He again hangs up.

You still got to ask him if he could pick you up...

You: "Hey dude, I need a ride..."

Dick: "Yeah, sure man. I'll be there by 9 pm. Is that good for you?"

And before you can answer him, he hangs up again! Well, he's Dick after all! Agreed… but this is exactly what was wrong with HTTP 1.0. It was non-persistent. As soon as the server got a request from the client's computer, it sent a response and immediately terminated the connection. This way the client could send only a single request per connection.
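
The phone calls above can be sketched in Python. This is only an illustration under some assumptions: the server is a tiny local one from the standard library (which speaks HTTP/1.0 by default), and the names OneShotHandler and ask_once as well as the paths are made up for this example.

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

class OneShotHandler(BaseHTTPRequestHandler):
    # BaseHTTPRequestHandler speaks HTTP/1.0 unless told otherwise,
    # so it closes the connection after every response.
    def do_GET(self):
        body = b"Yeah Dude. Totally."
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the example quiet

def ask_once(port, path):
    """Call, ask one question, and listen until the server hangs up."""
    with socket.create_connection(("127.0.0.1", port)) as sock:
        sock.sendall(f"GET {path} HTTP/1.0\r\n\r\n".encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # empty read: the server closed the connection
                break
            chunks.append(data)
    return b"".join(chunks)

# Port 0 asks the operating system for any free port.
server = HTTPServer(("127.0.0.1", 0), OneShotHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
port = server.server_address[1]

first = ask_once(port, "/movie")   # first call
second = ask_once(port, "/ride")   # a brand new call for the next question
server.shutdown()
server.server_close()

print(first.decode("ascii"))
```

Notice that ask_once has to open a fresh connection every time. There is no way to ask a second question on the same call, because the server has already hung up.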

Now, let's understand what stateless connection means.

Assume Dick calls you...

You: Yeah man?

Dick: Hey, could you tell me which movie we were going to?

You: Twilight! I love it (Yeah! That's you...)

You hang up. (Because of the non-persistent connection.)

Dick calls again.

Dick: Hey, what was the name of the movie again?

You: Twilight.

You hang up. He calls again and asks you the same thing and you reply again.

Now, after some time you'll probably get pissed off at him... but this isn't the case with HTTP 1.0. In HTTP 1.0 the server keeps no memory of your last request. So even if the client requests the same thing again and again, the server responds and terminates the connection, again and again. Now let's see the case with HTTP 1.1.

HTTP 1.1

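HTTP 1.1 is persistent by default, but it is still stateless. Persistent means that the connection stays alive across several requests until the client or the server closes it, for example by sending the "Connection: close" header or after an idle timeout. This saves the overhead of setting up a new connection for every request. The protocol itself still does not remember your earlier requests; when a website needs to remember you, it uses add-ons such as cookies.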
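
Here is a minimal sketch of a persistent connection, again against a tiny local server from Python's standard library. The handler name KeepAliveHandler and the paths are invented for the example. The server replies with the client's TCP port number, so we can check whether every request really travelled over the same connection.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

class KeepAliveHandler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # switches on persistent connections

    def do_GET(self):
        # Reply with the TCP port the client is calling from.
        body = str(self.client_address[1]).encode("ascii")
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the example quiet

# Port 0 asks the operating system for any free port.
server = ThreadingHTTPServer(("127.0.0.1", 0), KeepAliveHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

# One connection object, three questions on the same "call".
conn = http.client.HTTPConnection("127.0.0.1", server.server_address[1])
client_ports = []
for path in ("/movie", "/ride", "/time"):
    conn.request("GET", path)
    response = conn.getresponse()
    client_ports.append(response.read().decode("ascii"))

conn.close()  # the client ends the call when it is done
server.shutdown()
server.server_close()

# The same port three times means one TCP connection carried all requests.
print(client_ports)
```

Each of the three requests is still complete on its own; the server does not need to remember "/movie" to answer "/ride". That is the stateless part, and it is unchanged from HTTP 1.0.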

Architecture of HTTP

Every HTTP message starts with a header section that carries relevant data about the message.

There is a slight difference between the client's message and the server's message. The client's message starts with a Request Line and the server's message starts with a Status (Response) Line.

The Request Line indicates the type of request the client makes. The Status Line reports the outcome of the response generated by the server.
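
As a rough illustration, here is what the first line of each message looks like and how it splits into its pieces. The two raw messages and the helper names parse_request_line and parse_status_line are invented for this sketch, and a real parser would have to handle malformed input as well.

```python
def parse_request_line(line):
    """Split 'METHOD target VERSION' into its three parts."""
    method, target, version = line.split(" ", 2)
    return {"method": method, "target": target, "version": version}

def parse_status_line(line):
    """Split 'VERSION code reason' into its three parts."""
    version, code, reason = line.split(" ", 2)
    return {"version": version, "code": int(code), "reason": reason}

# Hypothetical messages as they would travel over the wire.
raw_request = "GET /index.html HTTP/1.1\r\nHost: www.example.com\r\n\r\n"
raw_response = (
    "HTTP/1.1 404 Not Found\r\n"
    "Content-Type: text/html\r\n"
    "\r\n"
    "<html>...</html>"
)

# The first line ends at the first CRLF; header fields follow it.
request_line = parse_request_line(raw_request.split("\r\n", 1)[0])
status_line = parse_status_line(raw_response.split("\r\n", 1)[0])

print(request_line)
print(status_line)
```

Splitting at most twice keeps a reason phrase such as "Not Found" in one piece even though it contains a space.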

Request Lines: Each request line begins with a method, which is the command given by the client's browser.

GET:

Whenever a client wants to look at a document on a webpage, this is the method our browser uses. You can check it yourself in your browser: "Open Browser (I use Edge) -> press F12 -> Network".

POST:

This is used when the client submits data to the server for processing, e.g. a form with a username and password.

PUT:

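This method is used when the client wants to store a document at a given address on the server, creating it or replacing whatever was there. The size of the data does not matter. In practice, browsers submit blog posts and email messages from web forms with POST, while PUT is mostly used by programs talking to web APIs.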

HEAD:

Used whenever the client wants information about the document (its headers) instead of the document itself.

TRACE:

Remember, in our previous article we noticed that between a client and a server there are many intermediate servers which pass the data along. With TRACE, the final recipient echoes the request back exactly as it received it, so the client can see what those intermediate servers changed or added on the way. This is a good method for diagnosing a problem somewhere along a particular network path. However, for security reasons many servers don't respond to this method.
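
To see how some of the methods described above differ on the client side, here is a small sketch that builds (but does not send) requests with Python's standard library. The addresses and form data are hypothetical.

```python
from urllib.request import Request

# Hypothetical addresses and data, used only for illustration.
post_url = "http://www.example.com/posts/42"
login_url = "http://www.example.com/login"

get_req = Request(post_url)                 # fetch the document
head_req = Request(post_url, method="HEAD") # fetch only its headers
post_req = Request(login_url, data=b"user=alice&password=secret")
put_req = Request(post_url, data=b"<p>Replacement text</p>", method="PUT")

# Nothing is sent here; we only inspect what would be sent.
for req in (get_req, head_req, post_req, put_req):
    print(req.get_method(), req.full_url, req.data)
```

Note that the library picks POST on its own as soon as a request carries data and no method is named, which matches how form submissions usually work.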

There are more request methods, but by now you have probably got the idea behind the request lines. Moving on to Status Lines.

Status Lines:

Each status line carries a three-digit number that signifies the status of the response made by the server. The first digit tells which class the code falls in.

100 (INFO):

Codes in this range are interim responses: the server has received the request (or part of it) and tells the client to carry on while it prepares the final response. For example, 100 Continue tells a client that it may go ahead and send the body of its request.

200 (ALL OK):

When a request is answered successfully, this status line is used. Refer to the figure above.

300 (Redirection):

If for some reason the original website URL has changed to another, this code is used to redirect all client requests for the old URL to the new one. If the redirection is permanent, code 301 is used. If the redirection is temporary, code 302 is used.

400 (Client Error):

"Error 404": this code has annoyed me a lot. Codes in this range tell that there is some problem at the client's end, such as asking for a page that does not exist (404). Hence it can be corrected by the client.

500 (Server Error):

This code signifies an error at the server's end. Thus all that a client can do is wait for the site to resolve the error.
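
The five ranges can be summed up in a few lines of Python. This is a minimal sketch: the dictionary STATUS_CLASSES and the function describe are made up for the example, while the reason phrases come from Python's built-in list of standard status codes.

```python
from http import HTTPStatus

# The first digit of the code decides the class.
STATUS_CLASSES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
}

def describe(code):
    """Return the code, its standard reason phrase and its class."""
    status_class = STATUS_CLASSES[code // 100]
    phrase = HTTPStatus(code).phrase
    return f"{code} {phrase} ({status_class})"

for code in (100, 200, 301, 302, 404, 500):
    print(describe(code))
```

Integer division by 100 is enough to find the class, which is why a client can react sensibly even to a code it has never seen before.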

There is an even better version of HTTP now: "HTTP/2".

This was an introduction to HTTP, but there's still more to it, like DNS, cookies, web pages etc. In our next article we'll look at DNS in detail.

I hope you found this article helpful. Please comment and share your thoughts with me.

Till then...Keep Exploring!

The Geek Expo
