What is HTTP?
A basic understanding of how the HTTP protocol works will give us a better understanding of how servlets are invoked. In this lesson we talk about the HTTP protocol, the HTTP request/response mechanism, HTTP methods and HTTP response codes.
So what is HTTP? The Hypertext Transfer Protocol, commonly known as HTTP, is a simple request/response protocol that carries the vast majority of traffic on the web. HTTP runs on top of TCP/IP. The job of the Transmission Control Protocol (TCP) is to ensure that data is split into packets and reassembled between the client and server in the correct order and without any data loss. There may be many routers between the client where the request originates and the destination server, and it's the job of the Internet Protocol (IP) to address these endpoints and make sure each packet moves along the network path to its destination.
You need to have a rudimentary understanding of how HTML works to get the most benefit from these tutorials. Visit the HTML section of the site for in-depth tutorials on HTML.
A HTTP request is typically made from a web browser, which we can think of as the client, and a HTTP response is sent back from a web server with the information requested. HTTP is stateless: as far as the protocol is concerned, once the exchange of data has occurred via the request/response mechanism, the conversation between the client and server has ended. The underlying connection may be closed at this point as well, although HTTP/1.1 can keep it open for reuse by later requests.
The following diagram illustrates the client/server exchange via the request/response mechanism.
A HTTP request is initiated when a resource is requested by the client from a server. For instance, when you click on a link within a search results page a HTTP request is initiated, and eventually a HTTP response, such as a file download or a HTML page to be rendered, is sent back to the client. Both HTTP requests and HTTP responses consist of three parts:
- The request or response line, followed by a carriage return and line feed (CRLF).
- A request or response header section, followed by a blank line.
- An optional request or response body.
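The three-part structure can be made concrete with a few lines of Java. The following is a minimal sketch, not production parsing code: the class name `HttpMessageParts` and its fields are hypothetical names chosen for illustration, and it assumes the raw message uses CRLF line endings.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of any servlet API): splits a raw HTTP
// message into its start line, header lines and optional body.
class HttpMessageParts {
    final String startLine;
    final List<String> headerLines;
    final String body;

    private HttpMessageParts(String startLine, List<String> headerLines, String body) {
        this.startLine = startLine;
        this.headerLines = headerLines;
        this.body = body;
    }

    static HttpMessageParts parse(String raw) {
        // A blank line (CRLF CRLF) separates the header section from the body.
        int blank = raw.indexOf("\r\n\r\n");
        String head = blank >= 0 ? raw.substring(0, blank) : raw;
        String body = blank >= 0 ? raw.substring(blank + 4) : "";

        // The first line is the request or response line; the rest are headers.
        String[] lines = head.split("\r\n");
        List<String> headers = new ArrayList<String>();
        for (int i = 1; i < lines.length; i++) {
            headers.add(lines[i]);
        }
        return new HttpMessageParts(lines[0], headers, body);
    }
}
```

Calling `HttpMessageParts.parse(...)` on a request with no body leaves `body` as an empty string, which matches the "optional" third part in the list above.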
HTTP Request Overview
Let's look at a HTTP request in more detail, using the current page as an example.
The request line comes first and consists of a HTTP method, the path to the resource we have requested from the server, and the protocol and version that the web browser is using. We will talk much more about HTTP methods later in the lesson, but for our example of getting to the current page from a link the HTTP method will be a GET. The following code shows what the request line for this page would look like:
GET http://jsptutor.info/servlets25/whatishttp.html HTTP/1.1
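To see the three parts of the request line separately, it can be split on spaces. This is a minimal sketch under the assumption that the line is well formed; `RequestLine` is a hypothetical class name used only for this illustration.

```java
// Hypothetical value class for the three parts of an HTTP request line.
class RequestLine {
    final String method;
    final String target;
    final String version;

    private RequestLine(String method, String target, String version) {
        this.method = method;
        this.target = target;
        this.version = version;
    }

    static RequestLine parse(String line) {
        // Method, target and protocol version are separated by single spaces.
        String[] parts = line.split(" ");
        if (parts.length != 3) {
            throw new IllegalArgumentException("Malformed request line: " + line);
        }
        return new RequestLine(parts[0], parts[1], parts[2]);
    }
}
```

For the request line shown above, `method` would hold `GET`, `target` the URL of this page and `version` the string `HTTP/1.1`.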
Next comes the request header section, which consists of name/value pairs: the name of the request header followed by its value, delimited by a colon and a space. Let's take a look at the request header section for this page. The screenshot below was taken using the Firebug console, looking at the Resources tab within the Page Speed add-on, where we expanded the page to show the HTTP request information:
You can think of the request headers as metadata about the request made by the client, which includes such things as the user agent (a string identifying the browser and its rendering engine), the languages the client prefers and the content encodings it accepts.
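Because each header is a name and a value separated by a colon, the header section maps naturally onto a `Map`. The sketch below is illustrative only: `HeaderParser` is a hypothetical name, the sample header values in the usage note are made up, and header names are treated as case-insensitive, as HTTP defines them to be.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper that turns "Name: value" lines into a lookup table.
class HeaderParser {
    static Map<String, String> parse(String... headerLines) {
        // HTTP header names are case-insensitive, so use a case-insensitive map.
        Map<String, String> headers = new TreeMap<String, String>(String.CASE_INSENSITIVE_ORDER);
        for (String line : headerLines) {
            int colon = line.indexOf(':');
            if (colon <= 0) {
                continue; // skip lines that are not "Name: value"
            }
            String name = line.substring(0, colon).trim();
            String value = line.substring(colon + 1).trim();
            headers.put(name, value);
        }
        return headers;
    }
}
```

For example, `HeaderParser.parse("Host: jsptutor.info", "Accept-Language: en-gb,en;q=0.5")` produces a map in which both `get("Host")` and `get("host")` return `jsptutor.info`.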
There is no request body in this example, as a GET was issued and all information for this HTTP method is passed in the request line and request headers.
HTTP Response Overview
Now that we have seen the anatomy of a HTTP request, it's time to look at a HTTP response in more detail, and again we will use the current page as an example.
The response line comes first and consists of the protocol and version that was actually used and passed back to the client, a HTTP response code and a brief description of the response code. We will talk much more about HTTP response codes later in the lesson but for our example a status of 200 denotes a successful response. The following code shows what the response line for the requested page would look like:
HTTP/1.1 200 OK
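The response line can be taken apart in the same way as the request line. This is a minimal sketch with a hypothetical `StatusLine` class; it splits into at most three parts because the description of the response code may itself contain spaces.

```java
// Hypothetical value class for the parts of an HTTP response (status) line.
class StatusLine {
    final String version;
    final int code;
    final String reason;

    private StatusLine(String version, int code, String reason) {
        this.version = version;
        this.code = code;
        this.reason = reason;
    }

    static StatusLine parse(String line) {
        // Limit of 3 keeps a multi-word description such as "Not Found" intact.
        String[] parts = line.split(" ", 3);
        if (parts.length < 2) {
            throw new IllegalArgumentException("Malformed response line: " + line);
        }
        String reason = parts.length == 3 ? parts[2] : "";
        return new StatusLine(parts[0], Integer.parseInt(parts[1]), reason);
    }
}
```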
Next comes the response header section, which consists of name/value pairs: the name of the response header followed by its value, delimited by a colon and a space. Let's take a look at the response header section for this page. The screenshot below was taken using the Firebug console, looking at the Resources tab within the Page Speed add-on, where we expanded the page to show the HTTP response information:
You can think of the response headers as metadata about the response made by the server which includes such things as the content type, content length, the web server used and the type of encoding.
Finally, following a blank line, comes the response body, which in this example is the HTML required to render this page within the web browser. If the HTML refers to other resources such as images, then further HTTP requests will be made to fetch these so they can be rendered within the page as well.
HTTP Methods
Apart from the GET method we have already seen, there are six other HTTP methods we can use, which are discussed in this section. There is also a CONNECT HTTP method, which isn't in widespread use for Servlets 2.5, and so we will defer discussion of this particular HTTP method to a later servlet version. The following subsections outline the seven HTTP methods and their use.
The DELETE method allows us to remove the resource identified by the target URI from the server and is the opposite of the PUT method. The target URI of the resource to be deleted is given in the Request-URI part of the request line. As we are deleting a resource from the server, the HTTP request body is generally empty.
There are security implications involved when using the DELETE method, so use this HTTP method with caution.
A server can defer the actual deletion but must still respond to the DELETE method. A response code of 200 indicates the resource has been deleted, whilst a response code of 202 indicates that the resource has been marked for deletion at some later point.
We have already seen an example of the GET method above, which is used to request a resource from the server. The GET method makes up a huge proportion of the HTTP requests sent to servers and is typically triggered by clicking on a link, typing a URI into the address bar of the browser and hitting the enter key, or pressing the submit button of a form whose method is GET. As we are requesting a resource from the server, the HTTP request body is normally empty.
When the GET method is triggered from a form, any form parameters are appended to the URI as a query string, beginning with a ? and delimited by &. The following code shows an example of a URL where a resource called nameform will be looked up on the server, and the firstname and lastname parameters have been sent with values of john and smith:
GET http://jsptutor.info/nameform?firstname=john&lastname=smith HTTP/1.1
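A browser builds this query string for us, but it can help to see the steps spelled out. The following is a minimal sketch using the standard `java.net.URLEncoder`; the class name `QueryStringBuilder` is hypothetical. Encoding matters because characters such as spaces and ampersands are not allowed to appear literally inside a parameter value.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.Map;

// Hypothetical helper that appends form parameters to a path as a query string.
class QueryStringBuilder {
    static String build(String path, Map<String, String> params) {
        StringBuilder sb = new StringBuilder(path);
        char separator = '?'; // first parameter follows '?', the rest follow '&'
        for (Map.Entry<String, String> entry : params.entrySet()) {
            sb.append(separator)
              .append(encode(entry.getKey()))
              .append('=')
              .append(encode(entry.getValue()));
            separator = '&';
        }
        return sb.toString();
    }

    private static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on the Java platform.
            throw new IllegalStateException(e);
        }
    }
}
```

Passing the path `/nameform` and an ordered map holding `firstname=john` and `lastname=smith` yields `/nameform?firstname=john&lastname=smith`, the same query string as in the request line above.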
The HEAD method is the same as the GET method, except that the server doesn't return a response body. All the response headers containing metadata about the resource are still returned and can be interrogated. This can be efficient if, for instance, you need to look at file sizes before deciding whether to use the GET method to retrieve the resource. As we are requesting information from the server, the HTTP request body is empty.
The OPTIONS method asks the server which communication options are available for a resource, without retrieving the resource itself. In this respect it is similar to the HEAD method but more minimal, as the response typically just carries an Allow header showing which HTTP methods can be used on the resource. As we are requesting information from the server, the HTTP request body is empty, although unlike the HEAD method this may change in a future release.
The POST method is the most used HTTP method after the GET method and can also be triggered from HTML forms whose method is set to POST. There are several compelling reasons why we may want to use the POST method instead of the GET method. Firstly, when using the GET method, parameter information from a form is visible to the user within the URI query string, and for sensitive information we may want this hidden. Secondly, there may be a lot of information that we want to pass across to the server, and appending it to the URI may prove cumbersome. With the POST method, all data is placed within the request body, which keeps it out of the URI (and therefore out of the address bar and browser history) and also stops the URI from becoming clogged with form parameters. Bear in mind that the body is still sent unencrypted unless HTTPS is used. The other major usage for the POST method is to create, update or delete information on the server, for example to update database information from the parameters supplied. Although we could write code at the server end to do this from a GET method, that's not a good practice to get into.
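To make the difference from GET concrete, the sketch below assembles the text of a POST request with the form parameters in the body rather than in the URI. It is an illustration only: `PostRequestSketch` is a hypothetical name, and the headers shown are the minimum a form submission would usually carry, not a complete list of what a browser sends.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper that builds the raw text of a form POST request.
class PostRequestSketch {
    static String build(String path, String host, String formBody) {
        // Content-Length is the size of the body in bytes, not characters.
        int length = formBody.getBytes(StandardCharsets.UTF_8).length;
        return "POST " + path + " HTTP/1.1\r\n"
             + "Host: " + host + "\r\n"
             + "Content-Type: application/x-www-form-urlencoded\r\n"
             + "Content-Length: " + length + "\r\n"
             + "\r\n"            // blank line ends the header section
             + formBody;         // parameters travel in the body
    }
}
```

Calling `PostRequestSketch.build("/nameform", "jsptutor.info", "firstname=john&lastname=smith")` gives a request line of `POST /nameform HTTP/1.1` with no query string, and the parameters appear only after the blank line.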
The PUT method allows us to add content, typically a file, to the target URI on the server and is the opposite of the DELETE method. The target URI of the resource to be added is given in the Request-URI part of the request line. As we are putting a resource onto the server, the HTTP request body contains the data to put into the resource specified in the Request-URI. The PUT method differs from the POST method in that the resource is always put at the target URI specified in the Request-URI. When using the PUT method we, as programmers, have much more control over the location the resource is being sent to.
Care should be taken when using the PUT method, as an existing resource matching the target URI will be overwritten without warning.
A response code of 201 indicates the resource has been created, whilst a response code of 200 or 204 indicates an existing resource has been replaced; 204 simply means the response carries no body.
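The create-or-replace behaviour of PUT can be modelled with an in-memory map. This is a minimal sketch, not how a real web server stores files: `InMemoryResourceStore` is a hypothetical class, and it assumes the server chooses 200 rather than 204 when replacing.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory model of PUT semantics, keyed by target URI.
class InMemoryResourceStore {
    private final Map<String, String> resources = new HashMap<String, String>();

    // Stores the content at the target URI and returns the response code
    // a server might send: 201 if created, 200 if an existing one was replaced.
    int put(String targetUri, String content) {
        boolean existed = resources.containsKey(targetUri);
        resources.put(targetUri, content); // overwrites without warning
        return existed ? 200 : 201;
    }

    String get(String targetUri) {
        return resources.get(targetUri);
    }
}
```

The first `put` to a given URI returns 201 and every later `put` to the same URI returns 200, with the stored content silently replaced each time.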
The TRACE method echoes the request back to the client as the server received it, which is useful for seeing the state of the request when it reaches the server. As the request may pass through many intermediaries on the way, this HTTP method can be useful to system engineers for debugging purposes and is the least useful method for us programmers.
Idempotency & Safety
First let's clarify the definition of idempotency in the context of HTTP methods. We can think of an idempotent HTTP method as an operation that can be applied multiple times without changing the outcome beyond its initial application. As an example, we would expect the GET method to return a resource without impacting the server, and for the effect to be the same no matter how many times we did it. Whereas using the POST method to, for example, credit an account balance would have a different effect each time the operation was executed, and so the POST method is clearly NOT idempotent.
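The account balance example can be sketched in plain Java, away from HTTP itself. The class `AccountSketch` and its methods are hypothetical: `setBalance` stands in for an idempotent operation (repeating it changes nothing further), while `deposit` stands in for the kind of non-idempotent update a POST might perform.

```java
// Hypothetical account used only to contrast the two kinds of operation.
class AccountSketch {
    private int balance;

    // Idempotent: calling this twice with the same value leaves the same state.
    void setBalance(int newBalance) {
        balance = newBalance;
    }

    // Not idempotent: every call changes the state again.
    void deposit(int amount) {
        balance += amount;
    }

    int getBalance() {
        return balance;
    }
}
```

Setting the balance to 100 twice leaves it at 100, whereas depositing 50 twice afterwards leaves it at 200, which is why repeating a non-idempotent request by accident can be harmful.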
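A safe HTTP method is one that doesn't change state on the server. The HEAD method doesn't change anything on the server and so is safe, whereas using the PUT method can change server state and so is inherently unsafe. The following table shows which HTTP methods are idempotent and/or safe:
|Method|Idempotent|Safe|
|---|---|---|
|DELETE|Yes|No|
|GET|Yes|Yes|
|HEAD|Yes|Yes|
|OPTIONS|Yes|Yes|
|POST|No|No|
|PUT|Yes|No|
|TRACE|Yes|Yes|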
HTTP Response Codes
We saw some response codes when describing the various HTTP methods. Although we are not going to list all the HTTP response codes here, the first digit of a response code identifies a group with a generic meaning, as explained in the following table:
|Code|Group|Description|
|---|---|---|
|1xx|Informational|Reported during a client connection.|
|2xx|Successful|Reported after successful completion of a request.|
|3xx|Redirection|Requested resource has been moved.|
|4xx|Client Error|Client request error.|
|5xx|Server Error|Server response error.|
Remembering the generic meaning of the first digit of HTTP response codes can be useful when you run into problems while developing your own applications. So it's worth taking a few minutes to memorize the above HTTP response code groups for future debugging.
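Since the group is determined by the first digit alone, it can be derived from any three-digit response code with integer division. A minimal sketch, using a hypothetical `StatusCodeGroups` class and the group names from the table above:

```java
// Hypothetical helper mapping a response code to its group by first digit.
class StatusCodeGroups {
    static String groupOf(int code) {
        // Integer division by 100 yields the first digit of a three-digit code.
        switch (code / 100) {
            case 1: return "Informational";
            case 2: return "Successful";
            case 3: return "Redirection";
            case 4: return "Client Error";
            case 5: return "Server Error";
            default:
                throw new IllegalArgumentException("Not a HTTP response code: " + code);
        }
    }
}
```

So `StatusCodeGroups.groupOf(200)` gives `Successful` and `StatusCodeGroups.groupOf(404)` gives `Client Error`.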
There's a lot more to HTTP than we have covered in this whirlwind tour of the protocol, but this is more than enough information to aid us in our use of servlets.
The following links from the World Wide Web Consortium (W3C) cover everything else you would ever want to know about HTTP.
The official HTTP spec can be found at: Hypertext Transfer Protocol -- HTTP/1.1 - World Wide Web Consortium
A full list of HTTP status codes can be found at the official W3C Standards site: HTTP/1.1: Status Code Definitions
Lesson 2 Complete
In this lesson we looked at the HTTP protocol and how information is passed to and from a web server using HTTP methods. We also learnt about idempotency and safety, as well as some useful information on HTTP response codes.
In the next lesson we learn about the Java EE5 Platform and how servlets fit into the architecture.