OAuth 2.0 in Web API | Howard Dierking


This week I’ll be pushing a sample to the WebAPI-Prototype branch in our Codeplex repository for doing an OAuth 2.0 dance for Web APIs.  Before I get into how the sample works, let’s quickly define some OAuth 2.0 terms (note that these are all defined in the OAuth 2.0 spec, but I’m defining them here so that you don’t need to read the spec as a prerequisite for this blog post).

Some Context

Resource owner – represents the entity that actually owns the protected resources.  In this sample, the user (that is, you) is the resource owner.

Client – represents the application that accesses a resource owner’s protected resources after the resource owner has explicitly granted it permission.  That last part can also be said as “after the resource owner has authorized the client to access his/her protected resources”.

Resource Server – represents the application that is responsible for hosting the resource owner’s protected resource.  The resource server uses access tokens when processing and responding to requests for protected resources.

Authorization Server – represents the application that is responsible for authenticating a resource owner, enabling the resource owner to authorize a client, and ultimately issuing an access token to the client.

As the OAuth 2.0 spec defines, the abstract flow appears really simple:

[Diagram: the abstract protocol flow from the OAuth 2.0 spec]

However, in practice, nothing is ever as simple as the abstract flow makes it seem (that’s probably why they call it “abstract”).  For each stage of this flow, there are multiple ways of accomplishing the stage.  In this post, I’m going to discuss one way of implementing the OAuth 2.0 flow, but know that this represents only one of several possible options.  Moving forward, I plan on adding more flows into the prototype branch (and of course, blogging about them here).

A Brief Disclaimer

I’m a big believer in the principle of “the right tool for the right job” – and as I’m building out this prototype, one of the thoughts I keep arriving at is that while OAuth gives you authentication as a byproduct of having the resource owner authorize a client to a resource server, the focus of OAuth seems to be much more on authorization than on federated identity.  I think that what I’m getting at lies in the difference between delegated authorization and federated identity, but I’m having a hard time getting to a crisp, non-philosophical articulation of the differences between those two terms.  Judging from a quick Google search, I’m not the only one having trouble coming up with a clean articulation of the differences (I suspect this is because authN is always required at some point to do authZ).

At any rate, you may end up asking yourself a similar question as we go through the discussion of the sample, so I wanted to be up front in saying that I feel the same sense of weirdness and am also actively investigating OAuth patterns that use a federated identity system for authN and OAuth for authZ (or patterns to simply make it easier to integrate with federated identity systems, if simply not managing credentials is really the thing you care about).

The OAuth 2.0 Sample Workflow

For the sample, the goal was to secure a Web API using Facebook’s OAuth 2.0 capabilities so that the Web API didn’t need to maintain any usernames or passwords.  The resulting workflow looks like the following:

[Diagram: the sample’s OAuth 2.0 workflow]

As you can see right off the bat, the concrete example is a good bit more complex than the abstract flow defined by the OAuth 2.0 spec.  Why is that?  As I see it, there are two primary drivers of the added complexity in the end-to-end flow.  First, I decided that my Web API represents my OAuth client.  This means that once authorized by the resource owner (i.e. the user), my Web API can get resources directly from the resource server (in this sample, Facebook) without having to expose that data to the resource owner, or any malicious application pretending to be the resource owner.  However, because the end user experience is still a browser, and because my browser experience is AJAX-based (meaning that I can’t count on the browser to act directly on 302 status codes), I need a mini-protocol between my Web API and my AJAX code.

The second reason for the added complexity relates a bit to the “brief disclaimer” above.  That is, this workflow takes advantage of the fact that authentication is required for authorization – and my Web API client does call into Facebook (in its resource server role) to get some resource owner data.  However, the Facebook access token is used by my Web API only long enough to get a piece of user-identifying data which it then plugs into an existing authorization store – I’ve faked out the ASP.NET membership provider in this case.

I’ll be writing another post on my general thoughts here, but for now, this workflow works (even with the additional complexity) and should set the stage for showing how AuthN/AuthZ can work for Web API.

Plugging into Web API

One of the goals I had for the prototype was to enable the auth mechanism to plug into Web API using the standard extensibility points.  Additionally, I wanted it to use the standard AuthorizeAttribute that MVC uses.  As such, I took advantage of 2 Web API extensibility hooks: operation handlers and message handlers.  Additionally, there’s some logic behind wiring everything up that I encapsulated into an extension method that works with the HttpConfiguration object.

First, we have the OAuthFacebookOperationHandler.  The responsibility of this operation handler is to examine the incoming request to see whether the operation being called needs to be protected and if so, determine whether the user has been authorized (currently making this determination using a cookie).  Based on whether or not the user has been authorized for the operation, the handler will either pass control over to the operation or return a standard Http challenge response containing the Facebook OAuth authorization dialog URL.

I’m sure you’ll notice that there’s a bunch of todo notes still in the code.  I’m actively working on getting this more seamlessly plugged into things like the membership provider, the AuthorizeAttribute, and the forms authentication cookie generator.  However, the purpose of this post is to show the mechanics of initiating an OAuth flow, so hopefully those unfinished aspects won’t be too big of a distraction.  But that aside, the basic workflow looks like this:

  • Check to see whether the user has already gone through the OAuth dance, been authenticated, and has authorized the Web API.  If this process has happened, there will be a cookie in the request
    • If the user has been authenticated, authorize the user based on the AuthorizeAttribute on the service operation
    • If the user has not been authenticated, return a 401 challenge response with a scheme of OAuth and a location parameter of the Facebook auth dialog
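
The decision in the list above can be sketched in a few lines. This is a minimal, framework-free JavaScript sketch and not the operation handler from the sample: the cookie name `oauth_user` and the function names are hypothetical, and it assumes the challenge travels in the standard `WWW-Authenticate` header. The authorization step itself (the AuthorizeAttribute check) is left to the caller.

```javascript
// Hypothetical cookie name; the post does not say what the cookie is called.
const AUTH_COOKIE = "oauth_user";

// Decode a cookie value, falling back to the raw text if it is malformed.
function safeDecode(value) {
  try {
    return decodeURIComponent(value);
  } catch (e) {
    return value;
  }
}

// Turn a raw Cookie header ("a=1; b=2") into a plain object.
function parseCookies(cookieHeader) {
  const cookies = {};
  for (const part of (cookieHeader || "").split(";")) {
    const eq = part.indexOf("=");
    if (eq > 0) {
      cookies[part.slice(0, eq).trim()] = safeDecode(part.slice(eq + 1).trim());
    }
  }
  return cookies;
}

// Decide what to do with a request for a protected operation:
// either hand the known user on to authorization, or issue a 401 challenge
// whose location parameter points at the auth dialog.
function checkRequest(cookieHeader, authDialogUrl) {
  const user = parseCookies(cookieHeader)[AUTH_COOKIE];
  if (user) {
    return { action: "authorize", user: user };
  }
  return {
    action: "challenge",
    status: 401,
    headers: { "WWW-Authenticate": 'OAuth location="' + authDialogUrl + '"' },
  };
}
```

Keeping the check as a pure function of the Cookie header and the dialog URL makes the two branches easy to exercise without a running host.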

In my sample, I have an AJAX client, so the following jQuery is able to take the challenge response, parse out the location and pop up the Facebook auth dialog…
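
A minimal sketch of that client-side step, under some assumptions: the challenge is read from the `WWW-Authenticate` header, the location parameter is double-quoted, and the helper names, window name, and window size are made up for illustration. The window-opening function is passed in so the parsing logic can run outside a browser.

```javascript
// Pull the location="..." parameter out of an OAuth challenge header value.
function parseChallengeLocation(headerValue) {
  const match = /location="([^"]+)"/i.exec(headerValue || "");
  return match ? match[1] : null;
}

// Open the auth dialog if the 401 carried a location parameter.
// openWindow is injected; in a page it would wrap window.open.
function handleChallenge(headerValue, openWindow) {
  const dialogUrl = parseChallengeLocation(headerValue);
  if (dialogUrl !== null) {
    openWindow(dialogUrl, "fbAuth", "width=600,height=400");
  }
  return dialogUrl;
}

// In a page with jQuery loaded, this could be wired up roughly as:
//   $.ajax({
//     url: "/comments",
//     statusCode: {
//       401: function (xhr) {
//         handleChallenge(xhr.getResponseHeader("WWW-Authenticate"),
//                         function (u, n, f) { return window.open(u, n, f); });
//       }
//     }
//   });
```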

Like I said, this jQuery uses a regex to get the value of the location parameter in the response.  It then uses that to open the Facebook auth dialog.  The user will then interact with Facebook to authenticate and then authorize my Web API to a Facebook application that I created (note: you’ll need to register an application with Facebook to use its OAuth authorization server endpoints).  If you look back at the URL that’s passed to the location parameter (line 45 of the operation handler), you’ll notice that one of the query values is the redirect URI.  My operation handler will always generate a value that is the address of the originally requested resource with an additional path segment called “authtoken”.  Therefore, if my resource is localhost/comments, the redirect URL will be localhost/comments/authtoken (this isn’t a requirement of any kind – it was just how I put this code together).  The thing is, I don’t have a resource or service named authtoken.  So how does this get picked up?  Enter the OAuthFacebookMessageHandler.
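
The redirect URI convention is easy to show in isolation. The sketch below is an illustration rather than the sample’s code: the function names are hypothetical, and the dialog endpoint `https://www.facebook.com/dialog/oauth` with its `client_id` and `redirect_uri` query values is an assumption that should be checked against Facebook’s current documentation.

```javascript
// Assumed Facebook auth dialog endpoint; verify before relying on it.
const FACEBOOK_DIALOG = "https://www.facebook.com/dialog/oauth";

// Append the "authtoken" path segment to the originally requested resource.
function buildRedirectUri(requestedUrl) {
  const url = new URL(requestedUrl);
  url.pathname = url.pathname.replace(/\/+$/, "") + "/authtoken";
  return url.toString();
}

// Build the URL that goes into the challenge's location parameter.
function buildAuthDialogUrl(appId, requestedUrl) {
  const dialog = new URL(FACEBOOK_DIALOG);
  dialog.searchParams.set("client_id", appId);
  dialog.searchParams.set("redirect_uri", buildRedirectUri(requestedUrl));
  return dialog.toString();
}
```

For example, `buildRedirectUri("http://localhost/comments")` yields `http://localhost/comments/authtoken`, which is then URL-encoded into the dialog address.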

If you remember from Glenn’s post about Web API extensibility, message handlers give you an opportunity to plug in down in the channel stack and work directly with HttpRequestMessage and HttpResponseMessage.  And while this isn’t terribly helpful if you need details about the operation (parameters, return values, etc.), it’s great if you only need to work at the Http level – and it’s very fast!  In my case, I basically need to capture all requests that have authtoken as the last path segment, get some information about the user from Facebook so that I can correlate that with my local authorization store (e.g. membership provider) and then drop a cookie that can be used in subsequent requests.
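
Recognizing those callback requests is a small matching problem. The following sketch assumes the authorization server appends the authorization code as a `code` query value (the standard OAuth 2.0 authorization code response); the function name and the shape of the returned object are hypothetical.

```javascript
// Return details for an /authtoken callback, or null for any other request.
function matchAuthTokenRequest(requestUrl) {
  const url = new URL(requestUrl);
  const segments = url.pathname.split("/").filter(Boolean);
  const last = segments[segments.length - 1];
  if (!last || last.toLowerCase() !== "authtoken") {
    return null;
  }
  return {
    // Authorization code to exchange for an access token (null if absent).
    code: url.searchParams.get("code"),
    // The resource the user originally asked for, without the extra segment.
    originalResource: url.origin + "/" + segments.slice(0, -1).join("/"),
  };
}
```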

As you would expect, this code takes in an HttpRequestMessage.  However, because it’s task-based, it needs to return a Task<HttpResponseMessage> rather than just an HttpResponseMessage.  With async message handlers, the basic principles are as follows:

  • To continue the flow of execution to the inner message handler and ultimately up through the service operation, call base.SendAsync(..).  If you want to do some processing on the response side only, after the operation has been called, call base.SendAsync, but then add a ContinueWith(..) call.
  • To short circuit all further processing of the request and simply return a response, return your own Task<HttpResponseMessage> using Task.Factory.StartNew(..)
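
The same two principles can be illustrated with JavaScript promises, which play a role similar to .NET tasks here. This is an analogy only, not the Web API message handler API: `inner` stands in for whatever base.SendAsync would reach, `.then(..)` stands in for ContinueWith(..), and every name in the sketch is hypothetical.

```javascript
// Wrap an inner handler (a function from request to Promise of response).
function createAuthTokenHandler(inner) {
  return function sendAsync(request) {
    if (request.path.endsWith("/authtoken")) {
      // Short-circuit: answer directly and never call the inner handler.
      return Promise.resolve({ status: 200, headers: {}, body: "handled early" });
    }
    // Continue down the pipeline, then post-process the response on the way out.
    return inner(request).then(function (response) {
      response.headers["X-Handled-By"] = "sketch";
      return response;
    });
  };
}
```

The short-circuit branch returns an already completed promise, much as the handler described above returns its own task instead of delegating.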

Because there is no actual service operation for /authtoken, I’m really dealing purely at the Http layer, so I’ll just be returning a new task rather than allowing the incoming message to propagate – and within my request processing I’m doing the following things:

  • Exchange the auth code returned to me from Facebook for an access token – this is the thing that is actually used to access Facebook data
  • Use the access token to get my username
  • Drop my username to a cookie so that my operation handler can pick it up on subsequent calls (note: don’t EVER, EVER, EVER actually do this in production – I’m trying to show you the mechanics of the OAuth dance, and only that)
  • Emit some script to close the popup window
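
The first two steps in that list boil down to two HTTP GETs. The sketch below only builds the URLs and parses the token response, leaving the actual network calls to whatever HTTP client is in use. The Graph API endpoints and the form-encoded token response (`access_token=...&expires=...`) are assumptions based on how Facebook’s API behaved when this was written, so check them against current documentation. All function names are hypothetical.

```javascript
// Assumed Graph API endpoints; verify against Facebook's documentation.
const TOKEN_ENDPOINT = "https://graph.facebook.com/oauth/access_token";
const PROFILE_ENDPOINT = "https://graph.facebook.com/me";

// Step 1: URL that exchanges the authorization code for an access token.
// redirectUri must be the same value that was sent to the auth dialog.
function buildTokenExchangeUrl(client, redirectUri, code) {
  const url = new URL(TOKEN_ENDPOINT);
  url.searchParams.set("client_id", client.appId);
  url.searchParams.set("redirect_uri", redirectUri);
  url.searchParams.set("client_secret", client.secret);
  url.searchParams.set("code", code);
  return url.toString();
}

// Assumes a form-encoded body such as "access_token=abc&expires=3600".
function parseAccessToken(body) {
  return new URLSearchParams(body).get("access_token");
}

// Step 2: URL that fetches the current user's profile with the token.
function buildProfileUrl(accessToken) {
  const url = new URL(PROFILE_ENDPOINT);
  url.searchParams.set("access_token", accessToken);
  return url.toString();
}
```

Because the client secret is part of the exchange, this step belongs on the server (the Web API acting as OAuth client), never in the browser script.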

If all is successful, this handler will set future calls up for success, then signal the parent window to both close the popup and retry the newly-authorized request, so that while there are a lot of moving parts here, it feels pretty seamless for the end user.
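
A rough sketch of what such a final response might contain follows. The cookie name `oauth_user` and the `onOAuthComplete` callback on the parent window are hypothetical names chosen for illustration, and the plain username cookie is exactly the shortcut warned against above, shown only to make the mechanics visible.

```javascript
// Build the response that ends the popup leg of the flow.
function buildCompletionResponse(username) {
  // Script run inside the popup: notify the opener (if it registered a
  // callback), then close the popup window.
  const script =
    "<script>" +
    "if (window.opener && window.opener.onOAuthComplete) {" +
    " window.opener.onOAuthComplete(); }" +
    " window.close();" +
    "</script>";
  return {
    status: 200,
    headers: {
      "Content-Type": "text/html",
      // Demo only: a readable username cookie is not safe for production.
      "Set-Cookie": "oauth_user=" + encodeURIComponent(username) + "; Path=/",
    },
    body: "<html><body>" + script + "</body></html>",
  };
}
```

On the parent page, the assumed `onOAuthComplete` callback would simply reissue the original AJAX request, which now carries the cookie.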

Finally, there’s the question of how all of this gets wired up.  I extracted the configuration code into a single, simpler extension method.

This means that I can register for OAuth with code similar to:

config.RegisterOAuth(new FacebookOAuthClient(FB_APP_ID, FB_SECRET));

Where To Next?

There are a bunch of identity and auth related scenarios that I’m currently prototyping, in addition to cleaning up some of the hard-coded stuff in this one.  Here’s how I’ve currently prioritized security-related scenario investigations:

  • Ensure that Http basic over SSL works without any issues
  • Web API as an AuthN + AuthZ + Resource server – this would make the Web API equivalent to a service like Twitter where you could enable users to authorize 3rd party applications to your Web API (along with managing the permissions of those applications) – the key here is that Web API would have to manage its own credential store.
  • Web API as an AuthZ + Resource server – this is similar to the previous with the notable exception that Web API would rely on a federated identity provider for AuthN.  In this scenario, we would integrate with an STS (or broker like ACS).

Do these look like the right scenarios?  The right ordering?  As with everything else, I welcome your feedback!