The Basics of HTTP - Part 2 - The Request

I'm still feeling pretty under the weather, but I just have to get a blog post done or I won't be able to live with myself over the weekend. So here is a continuation of my HTTP series. There is a lot more to learn about this basic protocol than I ever expected. It makes me want to stop calling it "basic". :)

In order for a client (web browser, web service, etc.) to get a resource (HTML page, image, JavaScript file, Flash file, etc.) from an HTTP server (a web server like IIS or Apache), it must request the resource. That request must be made and formatted in a certain way for it to work.

How a request is made

HTTP requests are made by a client that asks a server for a specific resource. This resource is usually a web page, but could also be an image, a text file, a flash movie, or just about anything else. The requested resource is identified to the server through the use of a URI (Uniform Resource Identifier). The type of URI that we usually use is a URL (Uniform Resource Locator), so that is what we will talk about.

Note: the other type of URI is a URN (Uniform Resource Name) which you can read more about here.

A URL is a specially formatted string that indicates where a resource is located on the internet by providing a scheme, a host, an optional port, a path, and an optional query.

A properly formatted URL will look something like this:


http://www.12robots.com:80/archive/index.cfm?archid=1

Note: The above URL is completely fabricated. It will not work.

Here are the components of a URL broken out:

  • http - Is the scheme and is used to identify the protocol that is in use
  • :// - Is used to separate the scheme from the rest of the URL
  • www.12robots.com - Is the host; it could also be an IP address (e.g. 206.55.4.123)
  • : - This colon is used to separate the host name and the port that is used to access the resource
  • 80 - Is the port to use for this request. It is optional; port 80 will be used by default for an HTTP connection
  • / - Indicates the end of the host and the beginning of the path to the requested resource
  • archive/index.cfm - Is the path to the resource, where "archive" is the sub-folder (or server context path) and index.cfm is the requested resource
  • ?archid=1 - Is the query of the request. A query is used to pass additional information along with the request to be used by the resource (we will discuss this more in the future)
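To see the same breakdown done by a program, here is a minimal sketch in Python (not the language used elsewhere in this post) that splits the example URL with the standard library's `urllib.parse`. The variable names are just for illustration.

```python
from urllib.parse import urlsplit, parse_qs

# The fabricated example URL from above
url = "http://www.12robots.com:80/archive/index.cfm?archid=1"
parts = urlsplit(url)

print(parts.scheme)            # http
print(parts.hostname)          # www.12robots.com
print(parts.port)              # 80
print(parts.path)              # /archive/index.cfm
print(parts.query)             # archid=1
print(parse_qs(parts.query))   # {'archid': ['1']}

# When the port is left out, urlsplit reports None for it,
# and a client falls back to the default for the scheme (80 for http).
no_port = urlsplit("http://www.12robots.com/archive/index.cfm?archid=1")
default_port = no_port.port or 80
print(default_port)            # 80
```

Note that Python reports the path with its leading slash included, which is also how the path appears in the request itself.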

How a request is formatted

So having a properly formatted request URL is not enough to request a resource from a server. We also need an HTTP request message. A request message usually consists of two parts: a start line and a header section. There can also be a body section, but we won't look at that just yet.

So here is what the start line of a simple HTTP request will look like:


GET /index.cfm HTTP/1.1

This simple line tells the server that we are making a "GET" request (more on that later), the specific resource we are requesting is the index.cfm file at the root level, and we are using the HTTP v1.1 protocol.

Pretty simple.

Now we can add the Header section:


GET /index.cfm HTTP/1.1
Host: www.12robots.com

Headers are always attribute/value pairs separated by a colon (:). Each header goes on its own line, and each line ends with a carriage return and a line feed (CRLF).

Here we have supplied a Host header. This tells the web server which host we are requesting (in case the server hosts more than one). HTTP/1.1 requires a Host header on every request.

And this is all we need to request that resource. Pretty simple huh?
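If you want to see exactly which bytes make up that message, here is a small Python sketch. On the wire each line ends with a carriage return plus line feed (CRLF), and an empty line marks the end of the headers. The `build_request` helper is a made-up name for this illustration, not part of any library.

```python
def build_request(host, path="/", method="GET"):
    """Assemble a minimal HTTP/1.1 request message as bytes (hypothetical helper)."""
    lines = [
        f"{method} {path} HTTP/1.1",  # start line: method, resource, protocol version
        f"Host: {host}",              # the one header we are sending
        "",                           # empty line that ends the header section
        "",                           # makes the join finish with a trailing CRLF
    ]
    return "\r\n".join(lines).encode("ascii")


message = build_request("www.12robots.com", "/index.cfm")
print(repr(message))
# b'GET /index.cfm HTTP/1.1\r\nHost: www.12robots.com\r\n\r\n'
```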

So how do we actually make that request? Well, there are a few ways. We could simply type the address http://www.12robots.com/index.cfm in a browser, and our browser (IE 5.5, right?) will format that HTTP message for us and send it over TCP port 80 to the web server using the HTTP protocol, and that makes things really simple.

Another option we have is to use Telnet to make the request. This is fun to try, just to see it work. You can launch your favorite Telnet client and type:


> open 12robots.com 80

Here you are telling it to open a telnet connection, but to do it on port 80. You can then start typing the request.


GET /index.cfm HTTP/1.1
Host: 12robots.com

Then hit Enter twice. The request header section is ended by an empty line, that is, a carriage return and line feed (CRLF) on a line of their own. The first Enter finishes the Host header line and the second sends the empty line.

You should instantly receive a lot of HTML scrolling across your screen. What you may have missed at the top, though, is the start of the response message: the status line and the response headers.


HTTP/1.1 200 OK
Date: Fri, 06 Mar 2009 16:17:58 GMT
Server: Apache/2.2.8 (Win32) JRun/4.0 PHP/5.2.6
Set-Cookie: CFID=3290;expires=Sun, 06-MAR-2009 16:18:03 GMT;path=/
Set-Cookie: CFTOKEN=ba4b1ah4h5hd8d535-DC45JKBA0-FFE1-BE4C-436C8CEA89C0F73F;expires=Sun, 06-MAR-2009 16:18:03 GMT;path=/
Content-Type: text/html; charset=UTF-8

We'll talk more about the response message in my next HTTP post. When I am feeling better.
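The Telnet session can also be scripted. Below is a rough Python sketch that opens a plain TCP socket, sends the same request, and splits what comes back at the first empty line so the status line and headers don't scroll out of sight. The function names (`exchange`, `split_response`, `fetch`) are invented for this example. The `Connection: close` header is my own addition: it asks the server to hang up after responding, so the read loop knows when to stop.

```python
import socket


def exchange(sock, request):
    """Send request bytes on a connected socket and read until the peer closes."""
    sock.sendall(request)
    chunks = []
    while True:
        data = sock.recv(4096)
        if not data:  # empty read means the other side closed the connection
            break
        chunks.append(data)
    return b"".join(chunks)


def split_response(raw):
    """Split the status line and headers from the body at the first empty line."""
    head, _, body = raw.partition(b"\r\n\r\n")
    return head.decode("iso-8859-1"), body


def fetch(host, path="/", port=80, timeout=10):
    """Make one GET request over a raw TCP connection (illustrative only)."""
    request = (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Connection: close\r\n"  # assumption: not in the Telnet example above
        "\r\n"
    ).encode("ascii")
    with socket.create_connection((host, port), timeout=timeout) as sock:
        return split_response(exchange(sock, request))
```

Calling `head, body = fetch("12robots.com", "/index.cfm")` and printing `head` would show just the status line and headers, assuming the host is reachable and answers on port 80.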

One other method, which may be more fun, is to use <cfhttp> to make HTTP requests. You can start with a very simple tag, and make changes to try different things. You can capture the results and dump them to your screen.


<cfhttp url="http://www.12robots.com" method="GET" result="httpResult" />

<cfdump var="#httpResult#">

Comments
The "Live HTTP Headers" extension for Firefox sniffs out and displays this transaction, making it trivially easy to study and debug.
# Posted By Rick O | 3/6/09 10:13 AM
@Rick, Thanks for the tip. I'm going to go try that out.
# Posted By Jason Dean | 3/6/09 10:32 AM
Hey, that Telnet thing was really cool :) (just tried it)

I'm finding this series very informative, thanks for putting them together!
# Posted By Brian FitzGerald | 3/16/09 9:01 AM
@Brian, thanks. I'm glad you're getting something out of it. I learned a lot by doing the research.
# Posted By Jason Dean | 3/16/09 9:31 AM