The HTTP Series (Part 1): Overview of the Basic Concepts
We all work with the internet every day. But even if you're a dev, you might not be familiar with the low-level workings of systems like HTTP. Read on to learn more!
In this article, I will present the basics of HTTP.
But Why HTTP?
You may be asking yourself: why should I read about HTTP?
Well, if you are a software developer, you will understand how to write better applications by learning how they communicate. If you are a system architect or network admin, you will gain deeper knowledge for designing complicated network architectures.
REST, which is a very important architectural style nowadays, relies completely on HTTP features, which makes HTTP even more important to understand. If you want to make great RESTful applications, you must understand HTTP first.
So are you willing to pass on the chance to understand and learn the fundamental concepts behind the World Wide Web and network communications?
I hope not!
The focus of the article will be on explaining the most important parts of HTTP as simply as humanly possible. The idea is to organize all the useful information about HTTP in one place, to save you the time of going through books and RFCs to find the information you need.
This is the first article of the HTTP series. It will give you a short introduction to the most important concepts of HTTP.
- The HTTP Series (Part 1): Overview of the Basic Concepts
- The HTTP Series (Part 2): Architectural Aspects
- The HTTP Series (Part 3): Client Identification
- The HTTP Series (Part 4): Authentication Mechanisms
- The HTTP Series (Part 5): Security
Without further ado, let’s dive in.
HTTP Definition
The founder of HTTP is Tim Berners-Lee (the guy also considered to be the inventor of the World Wide Web). Among other names important to the development of HTTP is Roy Fielding, who is also the originator of the REST architectural style.
The Hypertext Transfer Protocol is the protocol that applications use to communicate with each other. In essence, HTTP is in charge of transferring all of the internet's media files between clients and servers. That includes HTML, images, text files, movies, and everything in between. And it does this quickly and reliably.
HTTP is an application protocol and not a transport protocol because it is used for communication in the application layer. To jog your memory, here is what the network stack looks like.
[Image: the network stack, with HTTP in the application layer and TCP in the transport layer]
From this image, you can clearly see that HTTP is the application protocol and that TCP works on the transport layer.
Resources
Everything on the internet is a resource, and HTTP works with resources. That includes files, streams, services, and everything else. An HTML page is a resource, a YouTube video is a resource, your spreadsheet of daily tasks on a web application is a resource… you get the point.
And how do you differentiate one resource from another?
By giving them URLs (Uniform Resource Locators).
A URL points to the unique location where your browser can find the resource.
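To make the idea of a "location" more concrete, here is a small sketch that splits a URL into its parts with Python's standard library. The URL itself is made up for illustration and does not come from this article.

```python
from urllib.parse import urlsplit

# Hypothetical URL, chosen only for illustration.
url = "https://www.example.com:8080/articles/http-basics?page=2#resources"

parts = urlsplit(url)

print(parts.scheme)    # protocol used to reach the resource
print(parts.hostname)  # server that hosts the resource
print(parts.port)      # port on that server
print(parts.path)      # location of the resource on the server
print(parts.query)     # extra parameters sent along with the request
print(parts.fragment)  # position inside the resource, handled by the client
```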
How the Messages Are Exchanged Between Web Client and Web Server
Every piece of content, every resource, lives on some web server (HTTP server). These servers wait for HTTP requests and respond by providing those resources.
But how do you request a resource from a web server?
You need an HTTP client, of course!
You are using an HTTP client right now to read this article. Web browsers are HTTP clients. They communicate with HTTP servers to retrieve resources to your computer. Some of the most popular clients are Google’s Chrome, Mozilla’s Firefox, Opera, Apple’s Safari, and, unfortunately, the still infamous Internet Explorer.
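Browsers are not the only HTTP clients. As a rough sketch, the snippet below starts a tiny HTTP server on the local machine and then requests a resource from it with a client, all using Python's standard library. The handler class, the response text, and the function name are assumptions made for this example only.

```python
import threading
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Hypothetical handler that serves one tiny text resource."""

    def do_GET(self):
        body = b"hello from the server"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example output quiet


def fetch_once(path="/"):
    # Port 0 asks the operating system for any free port.
    server = HTTPServer(("127.0.0.1", 0), HelloHandler)
    port = server.server_address[1]
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        client = HTTPConnection("127.0.0.1", port, timeout=5)
        client.request("GET", path)       # the client sends the request
        response = client.getresponse()   # the server answers with a response
        result = (response.status, response.reason, response.read().decode())
        client.close()
        return result
    finally:
        server.shutdown()
        server.server_close()


print(fetch_once())
```

Both sides run in one process here only to keep the sketch self-contained; in practice the client and the server usually live on different machines.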
Messages and Some Message Examples
So what does an HTTP message look like?
Without talking too much about it, here are some examples of HTTP messages:
GET request
GET /repos/CodeMazeBlog/ConsumeRestfulApisExamples HTTP/1.1
Host: api.github.com
Content-Type: application/json
Authorization: Basic dGhhbmtzIEhhcmFsZCBSb21iYXV0LCBtdWNoIGFwcHJlY2lhdGVk
Cache-Control: no-cache
POST request
POST /repos/CodeMazeBlog/ConsumeRestfulApisExamples/hooks?access_token=5643f4128a9cf974517346b2158d04c8aa7ad45f HTTP/1.1
Host: api.github.com
Content-Type: application/json
Cache-Control: no-cache

{
  "url": "http://www.example.com/example",
  "events": [
    "push"
  ],
  "name": "web",
  "active": true,
  "config": {
    "url": "http://www.example.com/example",
    "content_type": "json"
  }
}
Here is an example of one GET and one POST request. Let’s go quickly through the different parts of these requests.
The first line of the request is reserved for the request line. It consists of a request method name, request URI, and HTTP version.
The next few lines represent the request headers. Request headers provide additional info to the requests, like the content types the request expects in response, authorization information, etc.
For the GET request, the story ends right there. A POST request can also have a body and carry additional info in the form of a body message. In this case, it is a JSON message with additional info on how the GitHub webhook should be created for the repo specified in the URI. That message is required for the webhook creation, so we are using a POST request to provide that information to the GitHub API.
The request line and each request header must end with <CR><LF> (carriage return and line feed, \r\n), and a single empty line containing only CRLF separates the message headers from the message body.
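The layout described above can be sketched in a few lines of Python. This is only an illustration of the request line, headers, empty line, and body; the function names and the repository path are hypothetical, and a real client would also send headers such as Content-Length.

```python
CRLF = "\r\n"


def build_request(method, target, headers, body=""):
    """Serialize a request: request line, headers, empty line, body."""
    lines = [f"{method} {target} HTTP/1.1"]
    lines += [f"{name}: {value}" for name, value in headers.items()]
    # Every line ends with CRLF; one extra CRLF marks the end of the headers.
    return CRLF.join(lines) + CRLF + CRLF + body


def split_request(raw):
    """Split a raw request into request line parts, header lines, and body."""
    head, _, body = raw.partition(CRLF + CRLF)
    request_line, *header_lines = head.split(CRLF)
    method, target, version = request_line.split(" ")
    return (method, target, version), header_lines, body


raw = build_request(
    "POST",
    "/repos/some-owner/some-repo/hooks",  # hypothetical path
    {"Host": "api.github.com", "Content-Type": "application/json"},
    '{"name": "web"}',
)
print(repr(raw))
print(split_request(raw))
```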
Reference for HTTP requests.
And what do we get as a response to these requests?
Response message
HTTP/1.1 200 OK
Server: GitHub.com
Date: Sun, 18 Jun 2017 13:10:41 GMT
Content-Type: application/json; charset=utf-8
Transfer-Encoding: chunked
Status: 200 OK
X-RateLimit-Limit: 5000
X-RateLimit-Remaining: 4996
X-RateLimit-Reset: 1497792723
Cache-Control: private, max-age=60, s-maxage=60

[
  {
    "type": "Repository",
    "id": 14437404,
    "name": "web",
    "active": true,
    "events": [
      "push"
    ],
    "config": {
      "content_type": "json",
      "insecure_ssl": "0",
      "url": "http://www.example.com/example"
    },
    "updated_at": "2017-06-18T12:17:15Z",
    "created_at": "2017-06-18T12:03:15Z",
    "url": "https://api.github.com/repos/CodeMazeBlog/ConsumeRestfulApisExamples/hooks/14437404",
    "test_url": "https://api.github.com/repos/CodeMazeBlog/ConsumeRestfulApisExamples/hooks/14437404/test",
    "ping_url": "https://api.github.com/repos/CodeMazeBlog/ConsumeRestfulApisExamples/hooks/14437404/pings",
    "last_response": {
      "code": 422,
      "status": "misconfigured",
      "message": "Invalid HTTP Response: 404"
    }
  }
]
The response message is structured pretty much the same as the request, except that the first line is called the status line, which, surprising as it is, carries information about the response status.
The status line is followed by the response headers and response body.
Reference for HTTP response.
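As a quick sketch of how the status line breaks down, the hypothetical helper below splits it into the HTTP version, the status code, and the reason phrase. It assumes a well-formed line and does no error handling.

```python
def parse_status_line(line):
    """Split a status line into HTTP version, status code, and reason phrase."""
    # The reason phrase may contain spaces, so split at most twice.
    version, code, reason = line.strip().split(" ", 2)
    return version, int(code), reason


print(parse_status_line("HTTP/1.1 200 OK"))
print(parse_status_line("HTTP/1.1 404 Not Found"))
```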
MIME Types
MIME types are used as a standardized way to describe file types on the internet. Your browser has a list of MIME types, and the same goes for web servers. That way files can be transferred the same way regardless of the operating system.
A fun fact is that MIME stands for Multipurpose Internet Mail Extensions, because the types were originally developed for multimedia emails. They have since been adapted for use in HTTP and several other protocols.
Every MIME type consists of a type, a subtype, and a list of optional parameters in the following format: type/subtype; optional parameters.
Here are a few examples:
Content-Type: application/json
Content-Type: text/xml; charset=utf-8
Accept: image/gif
You can find the list of commonly used MIME types and subtypes in the HTTP reference.
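The type/subtype; parameters format is simple enough to pull apart by hand. The function below is a minimal sketch with a made-up name; it assumes parameter values do not themselves contain semicolons, which a production parser would have to handle.

```python
def parse_media_type(value):
    """Split 'type/subtype; key=value' into (type, subtype, params)."""
    media_type, *raw_params = [part.strip() for part in value.split(";")]
    main_type, _, subtype = media_type.partition("/")
    params = {}
    for raw in raw_params:
        if not raw:
            continue  # tolerate a trailing semicolon
        key, _, val = raw.partition("=")
        params[key.strip().lower()] = val.strip().strip('"')
    return main_type.lower(), subtype.lower(), params


print(parse_media_type("application/json"))
print(parse_media_type("text/xml; charset=utf-8"))
```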
Request Methods
HTTP request methods (also referred to as “verbs”) define the action that will be performed on the resource. HTTP defines several request methods, of which the most commonly known and used are GET and POST.
A request method can be idempotent or not idempotent. This is just a fancy term for whether the method can safely be called several times on the same resource: repeating an idempotent request has the same effect as making it once. The GET method, which has the sole purpose of retrieving information, should therefore be idempotent. Calling GET on the same resource over and over should not change the resource, so it should not result in a different response. The POST method, on the other hand, is not idempotent.
Prior to HTTP/1.1, there were just three methods: GET, POST, and HEAD. The HTTP/1.1 specification brought a few more into play: OPTIONS, PUT, DELETE, TRACE, and CONNECT.
Find out more about what each of these methods does in the HTTP reference.
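Idempotency is easier to see with a toy example. The class below is a hypothetical in-memory stand-in for a server, not a real HTTP API. It contrasts GET and PUT, which are idempotent, with POST, which creates something new on every call.

```python
class TaskStore:
    """Hypothetical in-memory stand-in for a server holding task resources."""

    def __init__(self):
        self.tasks = {}
        self.next_id = 1

    def get(self, task_id):
        # GET only reads, so it never changes the stored tasks.
        return self.tasks.get(task_id)

    def put(self, task_id, text):
        # PUT stores the task under a client-chosen id; repeating it changes nothing.
        self.tasks[task_id] = text
        return task_id

    def post(self, text):
        # POST asks the server to create a new task on every call.
        task_id = self.next_id
        self.next_id += 1
        self.tasks[task_id] = text
        return task_id


store = TaskStore()
store.put("milk", "buy milk")
store.put("milk", "buy milk")   # same state as after the first PUT
count_after_put = len(store.tasks)

store.post("buy milk")
store.post("buy milk")          # a second, separate task is created
count_after_post = len(store.tasks)

print(count_after_put, count_after_post)
```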
Headers
Header fields are colon-separated name-value fields that you can find just after the first line of a request or response message. They provide more context to the HTTP messages and ensure clients and servers are appropriately informed about the nature of the request or response.
There are five types of headers in total:
- General headers: These headers are useful to both server and client. One good example is the Date header field, which provides information about the time the message was created.
- Request headers: Specific to request messages. They provide the server with additional information. For example, the Accept: */* header field informs the server that the client is willing to receive any media type.
- Response headers: Specific to response messages. They provide the client with additional information. For example, the Allow: GET, HEAD, PUT header field informs the client which methods are allowed for the requested resource.
- Entity headers: These headers deal with the entity body. For example, the Content-Type: text/html header lets the application know that the data is an HTML document.
- Extension headers: These are nonstandard headers constructed by application developers. They are not part of HTTP but need to be tolerated.
You can find the list of commonly used request and response headers in the HTTP reference.
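Since header fields are just name-value lines, they are easy to read programmatically. As a sketch, the snippet below parses a small header block with Python's standard email parser, which uses the same colon-separated layout. The header block is an assumption modeled on the response example in this article; note that header names are matched case-insensitively.

```python
from email import message_from_string

# Header block modeled on the response example; the empty line ends the headers.
header_block = (
    "Content-Type: application/json; charset=utf-8\r\n"
    "X-RateLimit-Limit: 5000\r\n"
    "Cache-Control: private, max-age=60, s-maxage=60\r\n"
    "\r\n"
)

headers = message_from_string(header_block)

print(headers["content-type"])        # lookup ignores the case of the name
print(headers.get_content_type())     # type/subtype without parameters
print(headers.get_content_charset())  # charset parameter, if present
print(headers["X-Missing"])           # absent headers give None
```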
Status Codes
The status code is a three-digit number that denotes the result of a request. It is followed by the reason phrase, which is a human-readable explanation of the status code.
Some examples include:
- 200 OK
- 404 Not Found
- 500 Internal Server Error
The status codes are classified into five different groups.
Both the status code classification and the entire list of status codes and their meanings can be found in the HTTP reference.
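The five groups are determined by the first digit of the status code. The sketch below pairs that digit with the class names from the HTTP specification and looks up the reason phrase in Python's standard library. The helper name is made up for this example, and it raises an error for codes the standard library does not know.

```python
from http import HTTPStatus

# The first digit of the code selects one of the five classes.
STATUS_CLASSES = {
    1: "Informational",
    2: "Successful",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
}


def describe(code):
    """Return the code with its reason phrase and status class."""
    status_class = STATUS_CLASSES[code // 100]
    reason = HTTPStatus(code).phrase  # raises ValueError for unknown codes
    return f"{code} {reason} ({status_class})"


for code in (200, 404, 500):
    print(describe(code))
```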
Conclusion
Phew, that was a lot of information.
The knowledge you gain by learning HTTP is not the kind that helps you solve a particular problem directly. But it gives you an understanding of the underlying principles of internet communication, which you can apply to almost every other problem on a higher level than HTTP. Whether it is REST, APIs, web application development, or networks, you can now be at least a bit more confident while solving these kinds of problems.
Of course, HTTP is a pretty large topic to talk about, and there is still a lot more to it than the basic concepts.
Read about the architectural aspects of HTTP in part 2 of the HTTP series.
Was this article helpful to you? Please leave a comment and let me know.
Published at DZone with permission of Vladimir Pecanac. See the original article here.