In our previous article, we saw that for data to pass from one host (client or server) to another, it has to travel through a set of logical network layers (the OSI and TCP/IP models). If you haven't gone through it yet, please click here.
We noticed that the topmost layer of the model is the "Application" layer. This layer deals with the applications through which we connect to the internet (e.g. web browsers). "HTTP" is a protocol of the Application layer. It stands for Hyper Text Transfer Protocol. Hypertext is text that contains links to other text, hence hyperlinks. HTTP is a request/response protocol. This means that whenever we type a web address and hit Enter, our web browser sends a request to the web server, which sends back a response. This is the basic idea behind the whole protocol.
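To see one request/response round trip for yourself, here is a minimal sketch in Python. It assumes nothing about any real website: it starts a throwaway server on your own machine and sends it a single GET. The names HelloHandler and fetch_once are made up for this illustration.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Hypothetical handler: answers every GET with a short text body."""

    def do_GET(self):
        body = b"Hello from the server"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo output quiet


def fetch_once(path="/"):
    """Start a local server, send one GET, return (status, body)."""
    server = HTTPServer(("127.0.0.1", 0), HelloHandler)  # port 0 = any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        port = server.server_address[1]
        conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
        conn.request("GET", path)        # the request
        response = conn.getresponse()    # the response
        result = (response.status, response.read().decode())
        conn.close()
        return result
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(fetch_once())
```

The client plays the role of the browser here: it asks, the server answers, and that is the whole conversation.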
All of us know about "www" (World Wide Web). Many website addresses start with "www", e.g. "www.holymotherpython.blogspot.in". Well, what exactly is "www"?
In 1989, Tim Berners-Lee, a British scientist at CERN, proposed the World Wide Web, and Robert Cailliau soon joined him on the project. Berners-Lee went on to write the first web browser and named it… "WorldWideWeb"! It was later renamed Nexus. So it is through the services of the www that we browse the pages on the internet.
The "www" service needs a unique address to locate a resource on the server's host. This address is called the Uniform Resource Identifier (URI).
In the figure above, we can find all the pieces that the web browser needs to locate the receiver's machine (usually a server). It tells us that the request is made over HTTP, whose default port is 80, along with various other details. The web address we normally type, the Uniform Resource Locator (URL), is the most common kind of URI. As the technology advanced, web browsers became smarter. As a result, we no longer need to type the full, lengthy address, which is harder to remember; we just type the short form. But this does not decrease the importance of the full URI. It's still there. Our browser fills in the missing parts automatically.
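Here is a small sketch of how an address breaks down into those pieces, using Python's standard urllib.parse module. The describe function and the DEFAULT_PORTS table are names invented for this example, and adding "http://" when the scheme is missing is an assumption that mimics the behaviour described above (real browsers may prefer "https://").

```python
from urllib.parse import urlsplit

# Well-known default ports for the two web schemes.
DEFAULT_PORTS = {"http": 80, "https": 443}


def describe(address):
    """Split a web address into the pieces a browser needs."""
    # Mimic the browser: add the scheme when the user left it out.
    if "://" not in address:
        address = "http://" + address
    parts = urlsplit(address)
    return {
        "scheme": parts.scheme,
        "host": parts.hostname,
        # No explicit port in the address: fall back to the scheme's default.
        "port": parts.port or DEFAULT_PORTS.get(parts.scheme),
        "path": parts.path or "/",
    }


if __name__ == "__main__":
    print(describe("www.example.com/index.html"))
```

Notice that port 80 never appears in the address we typed; it is implied by the "http" scheme.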
But HTTP wasn't always this smart. The early version, HTTP 1.0, had some gaps which were later filled by HTTP 1.1. Let's look at them one by one.
HTTP 1.0
This protocol was non-persistent and stateless. Persistence in a network means keeping the connection open until all the data has been transferred. This wasn't the case with HTTP 1.0. To understand better, let's take the following example that resembles HTTP 1.0.
Let's say you wish to call your friend (say his name is Dick Rules) to inquire about something.
You call his phone.
You: "Hello…"
Dick: "Hey, what's up?"
And then he hangs up the phone.
You call your friend Dick again and reply.
You: "Hey, I just wanted to check if you're still up for that movie?"
Dick: "Yeah dude. Totally."
He hangs up again.
You still have to ask him if he could pick you up...
You: "Hey dude, I need a ride..."
Dick: "Yeah, sure man. I'll be there by 9 pm. Is that good for you?"
And before you can answer him, he hangs up again!!! Well, he's Dick after all!
Agreed… but this is exactly what was wrong with HTTP 1.0. It was non-persistent. As soon as the server got a request from the client's computer, it sent a response and immediately terminated the connection. This way the client could send only a single request per connection.
Now, let's understand what a stateless connection means.
Assume Dick calls you...
You: "Yeah man?"
Dick: "Hey, could you tell me which movie we were going to?"
You: "Twilight! I love it." (Yeah! That's you...)
You hang up (because of the non-persistent connection).
Dick calls again.
Dick: "Hey, what was the name of the movie again?"
You: "Twilight."
You hang up. He calls again and asks you the same thing, and you reply again.
Now, after some time you'll probably get pissed off at him... but this isn't the case with HTTP 1.0. In HTTP 1.0 there was no memory to store your last request. So even if the client requests the same thing again and again, the server responds and terminates the connection, again and again. Now let's see the case with HTTP 1.1.
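HTTP 1.1
HTTP 1.1 is persistent by default. This means that the connection stays alive after a response, so the next request can reuse it, until one side officially terminates it (for example by sending a "Connection: close" header). Reusing the connection saves the overhead of setting up a new one for every request. Note that the protocol itself is still stateless: the server does not remember your earlier requests on its own. That kind of memory is added on top of HTTP with things like cookies.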
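The difference in persistence can be observed with a short experiment. The sketch below, under the assumption that Python's built-in http.server behaves like a typical web server here, runs a local server speaking either HTTP/1.0 or HTTP/1.1, sends it a few GET requests, and counts how many separate connections the server had to accept. The function name count_connections is made up for this example.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


def count_connections(protocol_version, requests=3):
    """Send several GETs and count the TCP connections the server accepted."""
    connections = []

    class Handler(BaseHTTPRequestHandler):
        def setup(self):
            super().setup()
            connections.append(1)  # setup() runs once per new connection

        def do_GET(self):
            body = b"ok"
            self.send_response(200)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, format, *args):
            pass  # keep the demo output quiet

    Handler.protocol_version = protocol_version
    server = ThreadingHTTPServer(("127.0.0.1", 0), Handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    port = server.server_address[1]
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        for _ in range(requests):
            conn.request("GET", "/")   # reconnects by itself if the server hung up
            conn.getresponse().read()  # finish one exchange before the next
    finally:
        conn.close()
        server.shutdown()
        server.server_close()
    return len(connections)


if __name__ == "__main__":
    print("HTTP/1.0:", count_connections("HTTP/1.0"))
    print("HTTP/1.1:", count_connections("HTTP/1.1"))
```

With HTTP/1.0 the server hangs up after every answer, so each request needs a fresh call, just like Dick. With HTTP/1.1 all the requests travel over one connection.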
Architecture of HTTP
An HTTP message has a header which carries the relevant data about it. There is a slight difference between the client's header and the server's header. The client's message starts with a Request Line and the server's message starts with a Status (Response) Line. The Request Line indicates the type of request the client makes, e.g. "GET /index.html HTTP/1.1". The Status Line informs about the response generated by the server, e.g. "HTTP/1.1 200 OK".
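Both of these lines are just three fields separated by spaces, which makes them easy to pull apart. Here is a rough sketch in Python; the function names parse_request_line and parse_status_line are invented for this illustration, and the sample lines in the usage part are typical examples rather than output captured from a real server.

```python
def parse_request_line(line):
    """Split e.g. 'GET /index.html HTTP/1.1' into its three fields."""
    method, target, version = line.strip().split(" ", 2)
    return {"method": method, "target": target, "version": version}


def parse_status_line(line):
    """Split e.g. 'HTTP/1.1 404 Not Found' into version, code and reason."""
    parts = line.strip().split(" ", 2)
    version, code = parts[0], int(parts[1])
    # The reason phrase is optional text for humans and may be missing.
    reason = parts[2] if len(parts) > 2 else ""
    return {"version": version, "code": code, "reason": reason}


if __name__ == "__main__":
    print(parse_request_line("GET /index.html HTTP/1.1\r\n"))
    print(parse_status_line("HTTP/1.1 200 OK\r\n"))
```

The first field of the Request Line is the method and the second field of the Status Line is the status code.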
Request Lines: Each one starts with a method, which is the command given by the client's browser.
GET: Whenever a client wants to look at a document on a web page, this is the method our browser uses. You can check it yourself in your browser: open the browser (I use Edge) -> press F12 -> Network.
POST: This is used when the client submits data to the server for processing, e.g. a login form with a username and password, or a new comment.
PUT: This method is used when the client wants to store a document at a particular address on the server, creating it or replacing whatever was already there, e.g. uploading a new version of a file.
HEAD: Used whenever the client wants the information about the document (its headers) instead of the document itself.
TRACE: Remember, in our previous article we noticed that between a client and a server there are many intermediate servers which pass the data along. With TRACE, the server echoes the request back exactly as it received it, so the client can see what those intermediate servers changed or added on the way. This is a good method to track down a problem somewhere along the path. However, for security reasons many servers don't respond to this method.
There are more request methods, but I think by now you've got the idea behind the Request Lines. Moving on to Status Lines.
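To make the methods a bit more concrete, here is a minimal sketch that writes out the raw text a client would send for a given method. The build_request helper, the host "example.com" and the form data are all made up for illustration; a real browser adds many more header lines.

```python
def build_request(method, path, host, body=b""):
    """Return the raw bytes of a simple HTTP/1.1 request message."""
    lines = [
        f"{method} {path} HTTP/1.1",  # the Request Line
        f"Host: {host}",              # required header in HTTP/1.1
    ]
    if body:
        # Methods like POST and PUT carry data; tell the server its size.
        lines.append(f"Content-Length: {len(body)}")
    head = "\r\n".join(lines) + "\r\n\r\n"  # a blank line ends the header
    return head.encode("ascii") + body


if __name__ == "__main__":
    print(build_request("GET", "/", "example.com").decode())
    print(build_request("POST", "/login", "example.com", b"user=dick").decode())
```

GET and HEAD requests usually have an empty body, while POST and PUT put the submitted data after the blank line.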
Status Lines:
These carry three-digit numbers that signify the status of the response made by the server. The codes are grouped into ranges by their first digit.
100 (INFO): Codes in this range are interim replies. They tell the client that the server has received the request so far and is willing to continue, e.g. "100 Continue". The final response follows later.
200 (ALL OK): When a request is answered successfully, this status is used. Refer to the figure above.
300 (Redirection): If for some reason the original website URL has changed to another one, this range is used to redirect all client requests for the old URL. If the redirection is permanent, code 301 is used. If the redirection is temporary, code 302 is used.
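400 (Client Error): "Error 404"! This code has annoyed me a lot. Codes in this range tell that there is some problem at the client's end; 404, for example, means the requested page was not found at that address. Hence it can often be corrected by the client, say by fixing the URL.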
500 (Server Error): This range signifies an error at the server's end. Thus all that a client can do is wait for the site to resolve the error.
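Since the first digit of the code decides the range, sorting a code into its family takes only an integer division. Here is a small sketch using the HTTPStatus table from Python's standard library; the describe_status function and the STATUS_CLASSES table are names invented for this example.

```python
from http import HTTPStatus

# The five ranges, keyed by the first digit of the status code.
STATUS_CLASSES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
}


def describe_status(code):
    """Return 'code phrase (class)' for a numeric HTTP status code."""
    family = STATUS_CLASSES.get(code // 100, "Unknown")
    try:
        phrase = HTTPStatus(code).phrase
    except ValueError:  # code is not in the standard library's table
        phrase = "Unregistered"
    return f"{code} {phrase} ({family})"


if __name__ == "__main__":
    for code in (100, 200, 301, 302, 404, 500):
        print(describe_status(code))
```

So even if you meet a code you have never seen before, its first digit already tells you whose side the problem is on.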
There is an even newer version of HTTP now: "HTTP/2".
This was the introduction to HTTP, but there's still more to it, like DNS, cookies, web pages etc. In our next article we'll look at DNS in detail.
I hope you found this article helpful. Please comment and share your thoughts with me.
Till then... Keep Exploring!