Descripción de la Tecnología Dump Flooding

There are three other HTTP methods I consider part of the uniform interface. Two of them are simple utility methods, so I’ll cover them first.

• Retrieve a metadata-only representation: HTTP HEAD

• Check which HTTP methods a particular resource supports: HTTP OPTIONS

You saw the HEAD method exposed by the S3 services’s resources in Chapter 3. An S3 client uses HEAD to fetch metadata about a resource without downloading the possibly enormous entity-body. That’s what HEAD is for. A client can use HEAD to check whether a resource exists, or find out other information about the resource, without fetching its entire representation. HEAD gives you exactly what a GET request would give you, but without the entity-body.

There are two standard HTTP methods I don’t cover in this book: TRACE and CONNECT. TRACE is used to debug proxies, and CON- NECT is used to forward some other protocol through an HTTP proxy.

The OPTIONS method lets the client discover what it’s allowed to do to a resource. The response to an OPTIONS request contains the HTTP Allow header, which lays out the subset of the uniform interface this resource supports. Here’s a sample Allow header:

Allow: GET, HEAD

That particular header means the client can expect the server to act reasonably to a GET or HEAD request for this resource, but that the resource doesn’t support any other HTTP methods. Effectively, this resource is read-only.

The headers the client sends in the request may affect the Allow header the server sends in response. For instance, if you send a proper Authorization header along with an OPTIONS request, you may find that you’re allowed to make GET, HEAD, PUT, and DELETE requests against a particular URI. If you send the same OPTIONS request but omit the Authorization header, you may find that you’re only allowed to make GET and HEAD requests. The OPTIONS method lets the client do simple access control checks.

In theory, the server can send additional information in response to an OPTIONS request, and the client can send OPTIONS requests that ask very specific questions about the server’s capabilities. Very nice, except there are no accepted standards for what a client might ask in an OPTIONS request. Apart from the Allow header there are no accepted standards for what a server might send in response. Most web servers and frameworks feature very poor support for OPTIONS. So far, OPTIONS is a promising idea that nobody uses.

POST

Now we come to that most misunderstood of HTTP methods: POST. This method essentially has two purposes: one that fits in with the constraints of REST, and one that goes outside REST and introduces an element of the RPC style. In complex cases like this it’s best to go back to the original text. Here’s what RFC 2616, the HTTP standard, says about POST (this is from section 9.5):

• Annotation of existing resources;

• Posting a message to a bulletin board, newsgroup, mailing list, or similar group of articles;

• Providing a block of data, such as the result of submitting a form, to a data-handling process;

• Extending a database through an append operation.

The actual function performed by the POST method is determined by the server and is usually dependent on the Request-URI. The posted entity is subordinate to that URI in the same way that a file is subordinate to a directory containing it, a news article is subordinate to a newsgroup to which it is posted, or a record is subordinate to a database. What does this mean in the context of REST and the ROA?

Creating subordinate resources

In a RESTful design, POST is commonly used to create subordinate resources: resources that exist in relation to some other “parent” resource. A weblog program may expose each weblog as a resource (/weblogs/myweblog), and the individual weblog entries as subordinate resources (/weblogs/myweblog/entries/1). A web-enabled database may expose a table as a resource, and the individual database rows as its subordinate resources. To create a weblog entry or a database record, you POST to the parent: the weblog or the database table. What data you post, and what format it’s in, depends on the service, but as with PUT, this is the point where application state becomes resource state. You may see this use of POST called POST(a), for “append”. When I say “POST” in this book, I almost always mean POST(a).

Why can’t you just use PUT to create subordinate resources? Well, sometimes you can. An S3 object is a subordinate resource: every S3 object is contained in some S3 bucket. But we don’t create an S3 object by sending a POST request to the bucket. We send a PUT request directly to the URI of the object. The difference between PUT and POST is this: the client uses PUT when it’s in charge of deciding which URI the new resource should have. The client uses POST when the server is in charge of deciding which URI the new resource should have.

The S3 service expects clients to create S3 objects with PUT, because an S3 object’s URI is completely determined by its name and the name of the bucket. If the client knows enough to create the object, it knows what its URI will be. The obvious URI to use as the target of the PUT request is the one the bucket will live at once it exists. But consider an application in which the server has more control over the URIs: say, a weblog program. The client can gather all the information necessary to create a weblog entry, and still not know what URI the entry will have once created. Maybe the server bases the URIs on ordering or an internal database ID: will the final URI be /weblogs/ myweblog/entries/1 or /weblogs/myweblog/entries/1000? Maybe the final URI is based on the posting time: what time does the server think it is? The client shouldn’t have to know these things.

The POST method is a way of creating a new resource without the client having to know its exact URI. In most cases the client only needs to know the URI of a “parent” or “factory” resource. The server takes the representation from the entity-body and use it to create a new resource “underneath” the “parent” resource (the meaning of “underneath” depends on context).

The response to this sort of POST request usually has an HTTP status code of 201 (“Created”). Its Location header contains the URI of the newly created subordinate resource. Now that the resource actually exists and the client knows its URI, future requests can use the PUT method to modify that resource, GET to fetch a representation of it, and DELETE to delete it.

Table 4-1 shows how a PUT request to a URI might create or modify the underlying resource; and how a POST request to the same URI might create a new, subordinate resource.

Table 4-1. PUT actions

PUT to a new re-

source PUT to an existingresource POST

/weblogs N/A (resource al-

ready exists) No effect Create a new weblog

/weblogs/myweblog Create this weblog Modify this weblog’s

settings Create a new weblog entry

/weblogs/myweblog/ entries/1

N/A (how would you get this URI?)

Edit this weblog entry

Post a comment to this weblog entry

Appending to the resource state

The information conveyed in a POST to a resource doesn’t have to result in a whole new subordinate resource. Sometimes when you POST data to a resource, it appends the information you POSTed to its own state, instead of putting it in a new resource. Consider an event logging service that exposes a single resource: the log. Say its URI is /log. To get the log you send a GET request to /log.

Now, how should a client append to the log? The client might send a PUT request to /log, but the PUT method has the implication of creating a new resource, or overwriting old settings with new ones. The client isn’t doing either: it’s just appending information to the end of the log.

The POST method works here, just as it would if each log entry was exposed as a separate resource. The semantics of POST are the same in both cases: the client adds subordinate information to an existing resource. The only difference is that in the case of the weblog and weblog entries, the subordinate information showed up as a new resource. Here, the subordinate information shows up as new data in the parent’s representation.

Overloaded POST: The not-so-uniform interface

That way of looking at things explains most of what the HTTP standard says about POST. You can use it to create resources underneath a parent resource, and you can use it to append extra data onto the current state of a resource. The one use of POST I haven’t explained is the one you’re probably most familiar with, because it’s the one that drives almost all web applications: providing a block of data, such as the result of submitting a form, to a data-handling process.

What’s a “data-handling process”? That sounds pretty vague. And, indeed, just about anything can be a data-handling process. Using POST this way turns a resource into a tiny message processor that acts like an XML-RPC server. The resource accepts a POST request, examines the request, and decides to do... something. Then it decides to serve to the client... some data.

I call this use of POST overloaded POST, by analogy to operator overloading in a pro- gramming language. It’s overloaded because a single HTTP method is being used to signify any number of non-HTTP methods. It’s confusing for the same reason operator overloading can be confusing: you thought you knew what HTTP POST did, but now it’s being used to achieve some unknown purpose. You might see overloaded POST called POST(p), for “process.”

When your service exposes overloaded POST, you reopen the question: “why should the server do this instead of that?” Every HTTP request has to contain method information, and when you use overloaded POST it can’t go into the HTTP method. The POST method is just a directive to the server, saying: “Look inside the HTTP request for the real method information.” The real information may be in the URI, the HTTP headers, or the entity-body. However it happens, an element of the RPC style has crept into the service.

When the method information isn’t found in the HTTP method, the interface stops being uniform. The real method information might be anything. As a REST partisan I don’t like this very much, but occasionally it’s unavoidable. By Chapter 9 you’ll have seen how just about any scenario you can think of can be exposed through HTTP’s uniform interface, but sometimes the RPC style is the easiest way to express complex operations that span multiple resources.

You may need to expose overloaded POST even if you’re only using POST to create subordinate resources or to append to a resource’s representation. What if a single resource supports both kinds of POST? How does the server know whether a client is POSTing to create a subordinate resource, or to append to the existing resource’s representation? You may need to put some additional method information elsewhere in the HTTP request.

Overloaded POST should not be used to cover up poor resource design. Remember, a resource can be anything. It’s usually possible to shuffle your resource design so that the uniform interface applies, rather than introduce the RPC style into your service.

In document Análisis de factibilidad técnica económica para la implementación de una completación Dump flooding para un proyecto de recuperación secundaria (página 35-55)