What is OAuth 2.0 and why should you care?
By Justin Richer and Antonio Sanso
In this article, excerpted from OAuth 2 in Action, we’ll introduce you to OAuth 2.0 and talk about why it’s important.
If you’re a software developer on the web today, chances are you’ve heard of OAuth. It is a security protocol used to protect a large (and growing) number of web APIs all over the world, from large-scale providers like Facebook and Google to small one-off APIs at startups and inside enterprises. It’s used to connect websites to each other and it powers native and mobile applications connecting to cloud services. It’s being used as the security layer for a growing number of standard protocols in a variety of domains, from healthcare to identity, from energy to the social web. OAuth is far and away the dominant security method on the web today, and its ubiquity has leveled the playing field for developers wanting to secure their applications.
But what is it, how does it work, and why do we need it? In this article, we’ll explore these questions.
What is OAuth 2.0?
OAuth 2.0 is a delegation protocol, a means of letting someone who controls a resource allow a software application to access that resource on their behalf without impersonating them. The client does this by requesting authorization from the owner of the resource and receiving a token that it can use to access the resource. This all happens without the client application needing to impersonate the resource owner, since the token explicitly represents a delegated right of access. In many ways, you can think of the OAuth token as a “valet key” for the web. The valet key of a car allows the owner of the car to give limited access to someone, the valet, without handing over full control in the form of the owner’s key. Simple valet keys limit the valet to accessing the ignition and doors but not the trunk or glove box. More complex valet keys can limit the upper speed of the car and even shut the car off if it travels more than a set distance from its starting point, sending an alert to the owner. In much the same way, OAuth tokens can limit access of the client to only the actions that the resource owner has delegated.
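The valet-key idea can be sketched in a few lines of Python. This is only an illustration: the `AccessToken` class, the `is_allowed` function, and the scope names are hypothetical, and OAuth 2.0 itself does not say how a token is represented internally.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessToken:
    """Hypothetical in-memory view of a token: who delegated what to whom."""
    resource_owner: str
    client: str
    scopes: frozenset


def is_allowed(token: AccessToken, action: str) -> bool:
    # The token grants only the actions the resource owner delegated.
    return action in token.scopes


# The car owner hands the valet a key that opens the doors and starts the car.
valet_key = AccessToken(
    resource_owner="car-owner",
    client="valet",
    scopes=frozenset({"doors", "ignition"}),
)

print(is_allowed(valet_key, "ignition"))  # True
print(is_allowed(valet_key, "trunk"))     # False
```

The point of the sketch is that the limits travel with the token, not with the identity of the owner: the valet never holds anything that would open the trunk.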
For example, let’s say that you have a cloud photo storage service and a photo printing service, and you want to be able to print the photos that you have stored in your storage service. Luckily, your cloud storage service has an API that the printing service knows how to talk to. This is great, except that the two services are run by different companies, which means that your account with the storage service has no connection to your account with the printing service. We could use OAuth to solve this problem by letting you delegate access to your photos across the different services.
While OAuth is largely agnostic to what kind of resource it is protecting, it does fit very nicely with today’s RESTful web services, and it works well for both web and native client applications. It can be scaled from a small single-user application all the way to a multi-million-user internet API. It’s just as much at home on the untamed wilds of the web, where it grew up, as it is inside the controlled and monitored boundaries of an enterprise.
And that’s not all: if you’ve used mobile or web technology in the last five years, chances are even higher that you’ve actually used OAuth to delegate your authority to an application. If you’ve ever used a Facebook application or signed in with Google to a website, then you’ve used OAuth. There are also many cases where the use of the OAuth protocol is completely transparent, such as its use in Steam and Spotify’s desktop applications. Unless an end user was looking for the telltale marks of an OAuth transaction, one would never know it was being used. This is of course a very good thing, since a good security system should be nearly invisible when all is functioning properly.
We know that OAuth is a security protocol, but what exactly does it do? Since you’re reading an article purportedly about OAuth 2.0, that’s a fair question. According to the specification which defines it:
The OAuth 2.0 authorization framework enables a third-party application to obtain limited access to an HTTP service, either on behalf of a resource owner by orchestrating an approval interaction between the resource owner and the HTTP service, or by allowing the third-party application to obtain access on its own behalf.
Let’s unpack that a bit: As an authorization framework, OAuth is all about getting the right of access from one component of a system to another. In particular, in the OAuth world, a client application wants to get access to a protected resource on behalf of a resource owner (usually an end user). These are the components that we have so far: the resource owner, the protected resource, and the client.
The goal is to connect the client to the protected resource on behalf of the resource owner. In our printing example, let’s say you’ve uploaded your vacation photos to the photo storage site, and now you want to get them printed. The storage site’s API is the resource, and the printing service is the client of that API. You, the resource owner, need to be able to delegate part of your authority to the printer so that it can read your photos. You probably don’t want the printer to be able to read all of your photos, nor do you want them to be able to delete photos or upload new ones of their own. But, ultimately, what you’re interested in is getting your photos printed and if you’re like most users, you’re not going to be thinking about the security architectures of the systems you’re using to get that done.
Chances are that you’re not like most users and you actually care about security architectures. In the next section, we’ll see how this problem could be solved imperfectly without OAuth, and then we’ll look at how OAuth can solve it better.
The Bad Old Days: Credential Sharing (and Credential Theft)
The problem of wanting to connect multiple disparate services is hardly new, and one could make a compelling argument that it’s been around from the moment there was more than one network-connected service in the world.
One approach, popular in the enterprise space, is to copy the user’s credentials and replay them on another service.
In this case, the photo printer assumes that the user is using the same credentials at the printer that they’re using at the storage site. When the user logs into the printer, the printer simply replays the user’s username and password at the storage site in order to gain access to the user’s account over there, pretending to be the user.
In this scenario, the user needs to authenticate to the client using some kind of credential, usually something that is centrally controlled and agreed upon by both the client and the protected resource. The client then takes that credential, such as a username and password or a domain session cookie, and replays it back to the protected resource, pretending to be the user. The protected resource acts as if the user had just authenticated directly, which does in fact make the connection between the client and protected resource as required above.
This approach requires that the client and protected resource authenticate using the same credentials, which limits the effectiveness of this credential-theft technique to a single security domain.
What if the two services lived in different security domains, like in our photo printing example? Faced with this challenge, these would-be credential thieves could employ an age-old method for stealing something: just ask the user. If the printing service wants to get the user’s photos, it can prompt the user for their username and password on the photo storage site.
Just like before, the printer replays these credentials over on the protected resource and impersonates the user. In this scenario, the credentials that the user uses to log into the client can be different from those used at the protected resource, because the client simply asks the user to provide a username and password for the protected resource. Many users will in fact do this when promised a useful service involving the protected resource, and this is one of the most common approaches to mobile applications accessing a user account today.
However, this approach still works only in a very limited set of circumstances: the client needs to have access to the user’s credentials directly, and those credentials need to be able to be replayed against a service outside of the user’s presence. This rules out a large variety of credential types, including nearly all federated and higher-security logins.
For those situations where it does work, it exposes the user’s primary credentials to a potentially untrustworthy application, the client. In order to continue to act as the user, the client has to store the user’s password in a replayable fashion (often in plaintext or with a reversible encryption mechanism) for later use at the protected resource. If the client application is ever compromised, the attacker gains access not only to the client but also to the protected resource, as well as to any other service where the end user may have used the same password.
Furthermore, in both of these approaches, the client application is impersonating the resource owner, and the protected resource has no way of distinguishing a call made directly by the resource owner from a call directed through a client. Why is that undesirable? Let’s go back to our printing service example. Many of these approaches will work, in limited circumstances, but consider that you don’t want the printing service to be able to upload or delete photos from the storage service, just read the ones you want printed. Furthermore, you want it to be able to read only while you want the photos printed, and you’d like the ability to turn that access off at any time. However, if the printing service needs to impersonate you to access your photos, the storage service has no way to tell if it’s the printer or you asking to do something.
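To make the impersonation problem concrete, here is a minimal sketch of what the storage service sees when a password is replayed. The `PASSWORDS` table and the `authenticate` function are hypothetical, and the password is held in plaintext only to keep the illustration short.

```python
# Hypothetical credential store at the storage service (plaintext for brevity).
PASSWORDS = {"alice": "correct horse battery staple"}


def authenticate(username: str, password: str):
    """Return the identity the storage service believes is calling, or None."""
    if PASSWORDS.get(username) == password:
        return username
    return None


# Alice calling the storage service herself.
direct_call = authenticate("alice", "correct horse battery staple")

# The printing service replaying the password Alice typed into it.
replayed_call = authenticate("alice", "correct horse battery staple")

# Both calls resolve to the same identity, so the storage service cannot
# apply different rules (read-only, time-limited, revocable) to the printer.
print(direct_call, replayed_call)    # alice alice
print(direct_call == replayed_call)  # True
```

Nothing in the request says “this is the printer acting for Alice,” so there is nothing for the storage service to restrict or revoke short of the password itself.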
If the printing service surreptitiously copies your password off in the background (even though it promised not to), it can pretend to be you and grab your photos whenever it wants to. The only way to turn the rogue printing service off is to change your password at the storage service, invalidating their copy of your password in the process. Couple this with the fact that many users re-use passwords across different systems and you have yet another place where passwords can be stolen and accounts correlated with each other.
By now we’ve seen that replaying user passwords is bad. What if instead we gave the printing service universal access to all photos on the storage service on behalf of anyone it chose? Another common approach is to use a developer key issued to the client, which uses this to call the protected resource directly.
In this approach, the developer key acts as a kind of universal key that allows the client to impersonate any user that it chooses, probably through an API parameter. This has the benefit of not exposing the user’s credentials to the client, but at the cost of the client requiring a highly powerful credential. Our printing service could print any photos that it wanted to at any time, for any user, since the client effectively has free rein over the data on the protected resource. This can work to an extent, but only inside of a single security domain where the client can be fully known and trusted by the protected resource. It is vanishingly unlikely that any such relationship would be built across two organizations, such as those in our photo printing scenario. Additionally, the damage done to the protected resource if the client’s credentials are stolen is potentially catastrophic, since all users of the storage service are affected by the breach whether they ever used the printer or not.
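A rough sketch shows why such a key is so powerful. Everything here is hypothetical (the `PHOTOS` data, the key value, and the `list_photos` function with its `on_behalf_of` parameter); it only illustrates the shape of an API where the user is chosen by a parameter.

```python
# Hypothetical data held by the storage service.
PHOTOS = {
    "alice": ["beach.jpg", "sunset.jpg"],
    "bob": ["passport-scan.jpg"],
}
DEVELOPER_KEYS = {"printer-dev-key": "photo-printer"}


def list_photos(developer_key: str, on_behalf_of: str) -> list:
    """Return any user's photos to any caller holding a valid developer key."""
    if developer_key not in DEVELOPER_KEYS:
        raise PermissionError("unknown developer key")
    # No check that `on_behalf_of` ever agreed to this client: the API
    # parameter alone selects the user being impersonated.
    return PHOTOS[on_behalf_of]


# Alice uses the printer; Bob never has, yet his photos are just as reachable.
print(list_photos("printer-dev-key", "alice"))  # ['beach.jpg', 'sunset.jpg']
print(list_photos("printer-dev-key", "bob"))    # ['passport-scan.jpg']
```

Anyone who steals the one key inherits the same reach, which is why a breach of the client affects every user of the storage service.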
Can’t we do better than this?
Secure Delegated Access to Web APIs
Another possible approach is to give users a password that is just for sharing with third party services.
This is starting to get closer to a desirable system, as the user no longer has to share their real password with the client, nor does the protected resource need to implicitly trust the client to act properly on behalf of all users at all times. However, the usability of such a system is, on its own, not very good. This requires the user to generate, distribute, and manage these special credentials in addition to the primary passwords they already must curate. Since it’s the user who must manage these credentials, there is also, generally speaking, no correlation between the client program and the credential itself. This makes it difficult to revoke access to a specific application.
What if we were able to have a limited credential, issued separately for each client and each user combination, to be used at a protected resource? We could then tie limited rights to each of these limited credentials. More importantly, what if there were a network-based protocol that allowed the generation and secure distribution of these limited credentials across security boundaries in a way that’s both user-friendly and scalable to the internet as a whole?
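The first of these questions can be sketched as a small in-memory store. This is an illustration under stated assumptions, not how any particular server works: the function names (`issue_token`, `revoke_client`, `check`), the store, and the scope name are all hypothetical.

```python
import secrets

# Hypothetical in-memory store: token value -> (user, client, scopes).
_tokens = {}


def issue_token(user: str, client: str, scopes: set) -> str:
    """Issue a random credential bound to one user and one client."""
    token = secrets.token_urlsafe(32)
    _tokens[token] = (user, client, frozenset(scopes))
    return token


def revoke_client(user: str, client: str) -> int:
    """Revoke every token this user gave to this client; return the count."""
    doomed = [t for t, (u, c, _) in _tokens.items() if u == user and c == client]
    for t in doomed:
        del _tokens[t]
    return len(doomed)


def check(token: str, scope: str) -> bool:
    """True if the token is still valid and carries the requested scope."""
    entry = _tokens.get(token)
    return entry is not None and scope in entry[2]


printer_token = issue_token("alice", "photo-printer", {"read_photos"})
backup_token = issue_token("alice", "backup-tool", {"read_photos"})

# Alice turns off the printer without touching her password or other clients.
revoke_client("alice", "photo-printer")
print(check(printer_token, "read_photos"))  # False
print(check(backup_token, "read_photos"))   # True
```

What the sketch leaves out is the second question: how such credentials get generated and handed to clients across security boundaries without the user copying them around by hand.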
OAuth is a protocol designed to do exactly that. In OAuth, the Web Authorization Protocol, the end user delegates some part of their authority to access the protected resource to the client application, which then acts on their behalf. In order to make that happen, OAuth introduces another component into the system: the authorization server.
The authorization server (AS) is trusted by the protected resource to issue special purpose security credentials – called OAuth access tokens – to clients. In order to get this token, the client first sends the resource owner to the authorization server in order to request that the resource owner authorize this client. The user authenticates to the authorization server and is generally presented with a choice of whether or not to authorize the client making the request. The client is able to ask for a subset of functionality, or scopes, which the user may be able to further diminish. Once the authorization grant has been made, the client can then request an access token from the authorization server. This access token can be used at the protected resource to access the API, as granted by the resource owner.
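The exchange described above can be sketched from the client’s side. This is a minimal illustration, assuming the authorization code grant type defined in RFC 6749: the parameter names (`response_type`, `client_id`, `redirect_uri`, `scope`, `state`, `grant_type`, `code`) come from that specification, while the endpoint URL, client identifier, redirect URI, and scope name are hypothetical. The sketch only builds the requests and does not send them, and it leaves out how the client authenticates itself at the token endpoint.

```python
import secrets
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical endpoint and client registration, for illustration only.
AUTHORIZATION_ENDPOINT = "https://auth.example.com/authorize"
CLIENT_ID = "photo-printer"
REDIRECT_URI = "https://printer.example.com/callback"


def build_authorization_url(scopes, state):
    """Step 1: the URL the client sends the resource owner's browser to."""
    query = urlencode({
        "response_type": "code",
        "client_id": CLIENT_ID,
        "redirect_uri": REDIRECT_URI,
        "scope": " ".join(scopes),  # space-delimited list of scopes
        "state": state,             # opaque value echoed back to the client
    })
    return f"{AUTHORIZATION_ENDPOINT}?{query}"


def build_token_request(code):
    """Step 2: form parameters the client POSTs to the token endpoint."""
    return {
        "grant_type": "authorization_code",
        "code": code,
        "redirect_uri": REDIRECT_URI,
    }


state = secrets.token_urlsafe(16)
url = build_authorization_url(["read_photos"], state)
params = parse_qs(urlsplit(url).query)
print(params["response_type"], params["scope"])  # ['code'] ['read_photos']
print(build_token_request("code-from-redirect")["grant_type"])  # authorization_code
```

Notice that neither request carries the resource owner’s username or password; the user’s login happens only between the browser and the authorization server.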
At no time in this process are the resource owner’s credentials exposed to the client: the resource owner authenticates to the authorization server separately from anything used to authenticate to the client. Neither does the client have a high-powered developer key: while most OAuth clients do have their own set of client credentials, in this archetypical OAuth process, the client is unable to access anything on its own. Instead, it must be authorized by a valid resource owner before it can access any protected resources.
Also, the user generally never has to see or deal with the access token directly. Instead of requiring them to generate tokens and paste them into clients, the OAuth protocol facilitates this process and makes it relatively simple for the client to request a token and the user to authorize the client. Clients can then manage the tokens, and users can manage the client applications.
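As a rough sketch of the client managing the token on the user’s behalf, the snippet below attaches a token to an API request and shows how a protected resource might read it back. It assumes the token is presented as a bearer token in the `Authorization` header, as described in RFC 6750; the API URL, the token value, and both function names are hypothetical, and the request is built but never sent.

```python
from urllib.request import Request


def build_api_request(url: str, access_token: str) -> Request:
    """Client side: attach the token; no user password is involved."""
    return Request(url, headers={"Authorization": f"Bearer {access_token}"})


def extract_bearer_token(header_value):
    """Protected resource side: pull the token out of the header, if any."""
    scheme, _, token = (header_value or "").partition(" ")
    if scheme.lower() == "bearer" and token:
        return token
    return None


req = build_api_request("https://storage.example.com/api/photos", "example-token")
print(req.get_header("Authorization"))                        # Bearer example-token
print(extract_bearer_token(req.get_header("Authorization")))  # example-token
```

The user never sees `example-token`; the client obtained it, stores it, and presents it on each call.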
OAuth 2.0: the Good, the Bad and the Ugly
OAuth 2.0 is very good at capturing a user delegation decision and expressing that across the network. It allows for multiple different parties to be involved in the security decision process, most notably the end user at run time.
One key assumption in the design of OAuth 2.0 was that there would always be several orders of magnitude more clients in the wild than there would be authorization servers or protected resources. As a consequence, wherever possible, the community decided that complexity should be shifted away from clients and onto servers. This is very good for client developers who no longer have to deal with signature normalizations or parsing complicated security policy documents. OAuth tokens provide a mechanism that is only slightly more complex than passwords but is significantly more secure, when used properly.
However, the flip side of this is that authorization servers and protected resources are now responsible for more of the complexity and security. A client needs to manage securing only its own credentials and tokens, and the breach of a single client would be bad but limited in its damage. An authorization server, on the other hand, needs to manage and secure the credentials and tokens for all clients and all users on a system. While this does make it more of a target for attack, it is significantly easier to make a single authorization server highly secure than it is to make a thousand clients written by independent developers just as secure.
The extensibility and modularity of OAuth 2.0 are among its greatest assets, since they allow the protocol to be used in a wide variety of environments. However, this same flexibility leads to basic incompatibility problems between implementations. OAuth leaves many pieces optional, which can confuse developers who are trying to implement it between two systems.
Even worse, some of the available options in OAuth can be taken in the wrong context or not enforced properly, leading to insecure implementations. Suffice it to say, just because a system implements OAuth, and even implements it correctly according to the spec, does not mean that this system is actually secure in practice.
Ultimately, OAuth 2.0 is a very good protocol but it’s far from perfect, and certainly not a panacea. We will see its replacement at some point in the future, as with all things in technology, but no real contender has yet emerged as of the writing of this article. It’s just as likely that OAuth 2.0’s replacement will end up being a profile or extension of OAuth 2.0 itself.
What OAuth Isn't
OAuth is used for many different kinds of APIs and applications, connecting the online world in ways never before possible. Even though it is approaching ubiquity, there are many things that OAuth is not, and it’s important to understand these boundaries when understanding the protocol itself.
OAuth is not defined outside of the HTTP protocol.
OAuth is not an authentication protocol (though it can be used to build one).
OAuth does not define a mechanism for user-to-user delegation.
OAuth does not define authorization processing mechanisms.
OAuth 2.0 does not define a token format.
OAuth 2.0 defines no cryptographic methods.
OAuth 2.0 is also not a single protocol but rather a framework defining a family of related protocols.
Instead of attempting to be a monolithic protocol that solves all aspects of a security system, OAuth 2.0 focuses on one thing and leaves room for other components to do their pieces where it makes more sense. While there are many things that OAuth is not, OAuth does provide a solid basis that can be built upon by other focused tools to create more comprehensive security architecture designs.
OAuth is a widely used security standard that enables secure access to protected resources in a fashion that’s friendly to web APIs.
OAuth is a delegation protocol that provides authorization across systems.
OAuth replaces the password-sharing anti-pattern with a delegation protocol that is simultaneously more secure and more usable.
OAuth is focused on solving a small set of problems and solving them well, which makes it a suitable component within larger security systems.
For more information on the OAuth 2 protocol, including hands-on guidance and exercises building an OAuth system from end to end, check out the new book OAuth 2 In Action from Manning Publications.