So at this point we've decided that going with a RESTful Architecture is the right decision. In fact, one might even go so far as to say that using RESTful architecture is such a relatively easy thing to do and with such high rewards that it really should be a no-brainer. In my opinion, if you're exposing a Web Service, there should be a damn good reason for it NOT to be RESTful.
So our services are now being built with the idea that we're transferring state between resources. This brings up the big question: how exactly do we represent that state? What is the wire format of the state transmission? In this blog post, I'll take a look at some of the most commonly used wire formats, and at what I like and don't like about each of them.
I remember back in the good old days when we had to send packets uphill both ways in a snowstorm to get to our web services. Actually, it was before we were calling them web services. I remember chatting with some folks about what I was doing at my day job when I explained that I got around the DCOM routing problem by sending XML payloads as the body of HTTP messages. Then we started calling that web services. Then, everyone decided that we needed to standardize this crap because there was way too much cowboy Web Servicing going on 'round here. This is when SOAP showed up. SOAP was created to get around DCOM limitations, but despite its inheritance from XML it is still a remote method invocation wire format. When you create a SOAP message, you are serializing your intent to remotely invoke a method, and you are receiving the response of that method as SOAP.
SOAP in its early form (1.0) was really, really bloated. Then 1.2 came along and you could, if you tried hard enough, create SOAP messages that were similar in size and bloat to that of standard XML document payloads. This is where the WS-* protocols come in. These were all standardized extensions that used SOAP and SOAP headers to add additional functionality to SOAP-based web services, such as enhanced/token security, encryption, routing, addressing, and much, much more. Eventually this entire bundle of standards became so large and so difficult to grasp in its entirety that people started referring to them as WS-DeathStar instead of WS-*.
Don't get me wrong, I freely admit that there are some cases out there where you simply must use some of these standards to get the job done, and most of them exist within complex enterprise apps. However, I've found that for well over 90% of the services that I've created, I don't need to touch SOAP or WS-*. The additional complexity, message bloat, and difficulty in testing and consuming services cross-platform make SOAP a really, really unattractive option for me.
Why am I talking about WSDL here on a blog post about wire formats? Because unfortunately a lot of people still think WSDL is a wire format. WSDL (Web Services Description Language) is an XML dialect that is used to describe how to invoke remote methods on a SOAP-based web service. I realize I'm inviting a lot of flames when I say this, but: If you find that your service needs WSDL in order to be consumed, perhaps you should rethink the design and attempt to remove some unnecessary complexity. Other than those few cases when I was required (nearly had a gun to my head) to build WS-*/SOAP-based services, I have never, ever, needed to build a service that had a WSDL coupling. Ever.
Ah, POX. Like that trusty friend you had back in high school... never all that popular with the jocks, but you knew you could always count on him to get the job done. POX (or Plain Old XML) isn't really a wire format per se. It is a serialization format, but when a developer says, "I'm using POX", they are doing more than simply stating that their web service communicates via XML. They are taking the philosophical stance that their service is going to communicate using un-enveloped XML, and will use as many of the powerful features already available in the HTTP protocol as possible to get all the functionality they need. They have decided that they don't need to deal with WS-* or SOAP or WSDL or even UDDI... no, they are deciding that simplicity is the best way to go. People who adopt POX as their wire format are almost always proponents of RESTful Architecture.
So how does POX work with RESTful Architectures? Quite simply, actually... and most people are really amazed at what the combination of XML over HTTP can actually accomplish, especially if you utilize a couple of very useful HTTP response codes.
To create a new resource, you would perform an HTTP POST to the URL of the container that will hold that resource (the customers collection, for example), and send as the HTTP body an XML document that describes the resource:
<Customer>
  <FirstName>Kevin</FirstName>
  <LastName>Hoffman</LastName>
</Customer>
The service then responds with the HTTP code 201 (Created) and in the Location HTTP response header, the service has supplied the new URL of the resource, such as http://server/service/customers/2901. If you want to delete a resource, simply send an HTTP DELETE to the URL of the resource. Modify the resource? Send an HTTP PUT containing an XML document that represents the complete state you expect the resource to have after the operation. This is not the same as sending a diffgram-style request. And of course, if you want to get an individual resource or a collection of resources, send a GET to either the container or the individual resource.
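The create / read / update / delete cycle described above can be sketched end to end in a few lines. This is only a minimal illustration under stated assumptions, not the service from this series: it uses Python's standard library, an in-memory dictionary instead of a real data store, a throwaway local server on a random port, and hypothetical names (`PoxHandler`, `call`, `demo`). The starting id 2901 is borrowed from the example URL above.

```python
import threading
import urllib.request
import xml.etree.ElementTree as ET
from http.server import BaseHTTPRequestHandler, HTTPServer

CUSTOMERS = {}      # hypothetical in-memory store: id -> raw XML bytes
NEXT_ID = [2901]    # assumed starting id, taken from the example URL


class PoxHandler(BaseHTTPRequestHandler):
    def _body(self):
        return self.rfile.read(int(self.headers.get("Content-Length", 0)))

    def _id(self):
        return self.path.rsplit("/", 1)[-1]

    def _reply(self, status, body=b"", location=None):
        self.send_response(status)
        if location:
            self.send_header("Location", location)
        self.send_header("Content-Type", "application/xml")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def do_POST(self):
        # Create: POST to the container, answer 201 plus the new URL.
        cid = str(NEXT_ID[0])
        NEXT_ID[0] += 1
        CUSTOMERS[cid] = self._body()
        url = f"http://{self.headers['Host']}/service/customers/{cid}"
        self._reply(201, location=url)

    def do_GET(self):
        body = CUSTOMERS.get(self._id())
        if body is None:
            self._reply(404)
        else:
            self._reply(200, body)

    def do_PUT(self):
        # Update: the body is the complete new state, not a diff.
        if self._id() not in CUSTOMERS:
            self._reply(404)
            return
        CUSTOMERS[self._id()] = self._body()
        self._reply(200, CUSTOMERS[self._id()])

    def do_DELETE(self):
        found = CUSTOMERS.pop(self._id(), None) is not None
        self._reply(204 if found else 404)

    def log_message(self, *args):
        pass  # keep the demo quiet


def call(method, url, xml=None):
    """Send one un-enveloped XML request; return (status, Location, body)."""
    req = urllib.request.Request(
        url, data=xml, method=method,
        headers={"Content-Type": "application/xml"})
    with urllib.request.urlopen(req) as resp:
        return resp.status, resp.headers.get("Location"), resp.read()


def demo():
    server = HTTPServer(("127.0.0.1", 0), PoxHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    base = f"http://127.0.0.1:{server.server_port}"
    try:
        new = (b"<Customer><FirstName>Kevin</FirstName>"
               b"<LastName>Hoffman</LastName></Customer>")
        status, location, _ = call("POST", base + "/service/customers", new)
        # Tweak one element of what GET returned and PUT the whole thing back.
        _, _, body = call("GET", location)
        doc = ET.fromstring(body)
        doc.find("FirstName").text = "Kev"
        put_status, _, _ = call("PUT", location, ET.tostring(doc))
        _, _, updated = call("GET", location)
        del_status, _, _ = call("DELETE", location)
        first_name = ET.fromstring(updated).findtext("FirstName")
        return status, location, put_status, first_name, del_status
    finally:
        server.shutdown()
        server.server_close()
```

Calling `demo()` walks through POST, GET, PUT, GET and DELETE against the local server and returns the status codes it saw along the way. Nothing here needed a generated proxy or an envelope; the client only had to know the container URL and the shape of the XML.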
So why is this useful? The biggest benefit is that the format used to describe the resources is pure, plain, simple XML. This makes it incredibly easy for developers to consume the service. More importantly, it makes the service extremely easy to test with automated methods (if you've got curl on your machine, you can test a web service from the command line... try that with a SOAP/WS-*/blah wrapped service). Another benefit is that it's predictable. If everyone in your organization is using a RESTful architecture, and using POX, then you can be reasonably sure that if you tweak an XML element from the response you got on an HTTP GET and send it back with an HTTP PUT, you should be able to modify the resource... all without having had to consume a WSDL file, generate code on the fly, or wrap everything in complex packaging.
Something that I think a lot of people don't realize is that feed formats are really, really powerful wire formats. The Atom protocol is a fantastic way to publish information from a RESTful web service. If your service is exposing time-series data, event-type data, alerts, notifications, or anything like that - consider exposing a GET endpoint that dumps feed-style data. There are two reasons for this. The first is that any newsreader can then point to your service and you can monitor the alerts without ever having to write your own client application. The second reason is that consuming feeds is something that has become ubiquitous. Just about every language on every platform has access to a library that can parse Atom or RSS feeds. Rather than polling a data-centric service over and over again and then trolling through the data looking for important changes, you could simply subscribe to the feed. When a new item appears in the feed, take the resource URL linked from inside the feed and go get it via POX. This type of scenario can create some really, really powerful enterprise web services with very little effort.
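To make the "subscribe, then go fetch the resource" idea a little more concrete, here is a rough sketch of the consuming side. It assumes an Atom 1.0 feed whose entries each carry an `id` and a `link` pointing at the resource; the sample feed, the `urn:example:...` ids and the function name `new_resource_links` are all made up for illustration, and a real client would download the feed over HTTP rather than read it from a string.

```python
import xml.etree.ElementTree as ET

# Atom 1.0 namespace, mapped to a prefix for ElementTree lookups.
ATOM = {"atom": "http://www.w3.org/2005/Atom"}

# Hypothetical feed a customer service might expose on a GET endpoint.
SAMPLE_FEED = """
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Customer alerts</title>
  <id>urn:example:alerts</id>
  <updated>2009-01-02T00:00:00Z</updated>
  <entry>
    <id>urn:example:alerts:2</id>
    <title>Customer 2902 created</title>
    <updated>2009-01-02T00:00:00Z</updated>
    <link href="http://server/service/customers/2902"/>
  </entry>
  <entry>
    <id>urn:example:alerts:1</id>
    <title>Customer 2901 created</title>
    <updated>2009-01-01T00:00:00Z</updated>
    <link href="http://server/service/customers/2901"/>
  </entry>
</feed>
""".strip()


def new_resource_links(feed_xml, seen_ids):
    """Return the resource URLs of entries whose id is not in seen_ids.

    seen_ids is updated in place, so calling this again with the same
    feed yields nothing new.
    """
    feed = ET.fromstring(feed_xml)
    links = []
    for entry in feed.findall("atom:entry", ATOM):
        entry_id = entry.findtext("atom:id", namespaces=ATOM)
        link = entry.find("atom:link", ATOM)
        if entry_id in seen_ids or link is None:
            continue
        seen_ids.add(entry_id)
        links.append(link.get("href"))
    return links
```

A client that has already handled `urn:example:alerts:1` would pass that id in `seen_ids`, get back only the URL for customer 2902, and then issue a plain GET against that URL to pull the POX representation.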
Coming up in this series of blog posts I am going to talk more about building web services as simply and practically as possible in an enterprise. This will include some discussion of how to secure web services (without using SOAP/WS-*!!), and some sample code written using WCF 3.5 and ASP.NET MVC that publish and consume a secure, practical, reliable, RESTful POX service.
If there are specific WS-related topics that you want me to cover, please comment and let me know, otherwise I may never get off my RESTful POX soapbox :)
When you create the webservice(s), can you show how to hook it up from a
Cocoa App? I am working on doing that right now and find there is not a
lot of others out there doing it. I am using both MVC and WCF :).
I understand the lure of REST. But in a complex Enterprise of several
development teams, from different regions (globally) all writing different
implementations of the same web service, with 100s of different consuming
applications, then you really need a hard specification. A document and
examples won't cut it. This is where SOAP+WSDL+XSD shines and REST fails.
REST is fine for CRUD style operations where little interop is required,
but unfortunately large enterprises often don't live in that world.
I have no problem with specifications... I think if you're doing XML as
your wire format, you need XSD files and a central repository/registration
of web services. But I disagree: I think large enterprises can live in a world
of resource distribution instead of resorting to the easy, quick fix of
RPC-over-HTTP. WSDL is a machine-consumed language for creating
contract-based client consumer code - it has never been used as a good
specification. If you have undisciplined service developers, they are going
to screw up an enterprise WSDL/SOAP service just as easily as they would a
REST/POX. On the other hand, if you have disciplined devs building your
REST/POX service, they can make it as enterprise-savvy as a SOAP/WSDL
service. I encourage someone to give me a real-world enterprise scenario
that couldn't have been solved with proper resource definition and
implemented in REST/POX. (Aside from the remote few places where true WS-*
protocols were needed)
Easy example, order entry on a stock market. These are not CRUD operations
- not shared state. They are real transactions, requesting to amend an
order - rejection, partial amendment. There's a complex state transition
for transactions that *could* be modelled in REST - but then you are using
REST as POX. Besides you need something more than HTTP - you need
WS-Reliable, multiplexing etc...
I was going to save my discussion of WADL for another blog post. But
basically a lot of the benefits you tout of SOAP/WSDL can be fixed with
WADL. I think that, in the right hands, WADL is just that added bit of
structure and discipline that can unify an enterprise RESTful architecture.
WADL - I haven't played with it in depth. But to me it's WSDL Lite where
every operation *must* be an HTTP verb. Basically REST is describing the
end state of a transaction (the resource itself), where the operation must
be Create, Read, Update or Delete (CRUD).
To me, the guidance is simple: If the service can be described using CRUD
then REST is probably the best choice. If the service is more complicated,
where the transactions need to be modelled themselves (rather than just the
end-state) - then something richer like SOAP is probably the best fit.
I just think it's a shame that we've got to this place and SOAP couldn't
have been rescued. REST + WADL is fine, but it doesn't scale when you
start to need to add features that WS-* has tackled.
Maybe Microsoft will attempt to connect the two worlds with Oslo or
something.
Joe - all good points. And, as I said in the blog post, my experience has
been that I've been able to model my needs as RESTful services for over 90%
of the services I've needed to build. In terms of transactional systems, it
is possible to have transactional resources in a RESTful service, in
essence you are enabling POST and GET on a transaction resource. What I've
been seeing lately is that the services that people are claiming cannot be
modelled in REST actually have no business being modelled as WS-* services
either - those services are ones that people should be using messaging
middleware for instead of building actual Web Services. It then becomes an
Enterprise SOA/"bus" model vs. Web Services where the messaging middleware
often wins out.
The usual arguments around machine readable specs and how code gen is evil
are very frustrating. I think this is really missing the point. Schemas
change over time, so you need to design for change. Saying that code gen is
evil because it doesn't cope with change, and that you should use a human
readable document instead, is just bypassing the problem.
I'll be reading your series with interest.
I don't believe that code gen is evil. I don't think that using only a
human-readable spec is good either. I think you need to build your services
around the idea that you're going to change, but if you do change, you do
it on a new endpoint (hence the convention of doing /v1 and /v2 endpoints).
If you have a human-readable doc, then humans can write code to consume
your service. If you have a machine-readable spec, then compilers can make
the job of _humans_ easier by doing code gen. If you have both, then humans
can _avoid screwing up royally_ and use generated code. I think the ideal
situation is when you've got a human-readable spec doc, a machine-readable
spec that is either used for codegen or format validation (schema), or both
(WADL). If your developers can't deal with the fact that service contracts
change, then some education is necessary, but each endpoint is (IMHO) a
fixed contract. If you need more features, you do it on a new contract with
a new endpoint.