Expect: php curl should play nice

Apr 25 2009

Once again, my job has me writing multi-threaded PHP[1] scripts that use PHP’s cURL[2] library to connect to remote servers. (I’m calling an API here!) Without going into too much detail, the networking specifics changed between me and the API server, adding a new, or newly reconfigured, invisible proxy to the data path.

This proxy is running Lighttpd, which, while light in name, is starting to throw its weight around and get in my way.

Warning! We are going to get technical!

Here is the output of curl when trying to access the API. (All IP addresses have been obfuscated to protect the innocent!)

[cc]
* About to connect() to theapi.com port 80 (#0)
* Trying 42.42.42.42… * connected
* Connected to theapi.com (42.42.42.42) port 80 (#0)
> POST /ApiCommand HTTP/1.1
Host: theapi.com
Accept: */*
Content-Length: 1760
Content-Type: application/x-www-form-urlencoded
Expect: 100-continue
< HTTP/1.1 417 Expectation Failed
< Connection: close
< Content-Length: 0
< Date: Mon, 20 Apr 2009 14:16:26 GMT
< Server: lighttpd/1.4.1
<
* Closing connection #0
[/cc]

Wow, that was short-lived. The culprit here is that lighttpd doesn’t handle the Expect: 100-continue header properly. In fact, it chokes on it entirely and answers with 417 Expectation Failed. What you get (if you aren’t looking at headers) is an empty response from curl. Not very fun to debug if you were expecting some kind of response from the API.
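One way to keep a failure like this from looking like a plain empty reply is to check the HTTP status code alongside the body. The following is only a minimal sketch: describeTransfer is a hypothetical helper name, and the commented usage assumes placeholder $url and $fields values rather than the real API.

```php
<?php
// Hypothetical helper: turn a status code and body into a short diagnosis,
// so a header-only failure (such as 417 with Content-Length: 0) is not
// mistaken for a legitimately empty reply.
function describeTransfer(int $httpCode, string $body): string
{
    if ($httpCode === 0) {
        // curl_getinfo() reports 0 when no HTTP response was received at all.
        return 'no HTTP response (connection-level failure)';
    }
    if ($httpCode === 417) {
        return 'Expectation Failed: the server or a proxy rejected the Expect header';
    }
    if ($httpCode >= 400) {
        return "HTTP error $httpCode";
    }
    if ($body === '') {
        return "HTTP $httpCode with an empty body";
    }
    return "HTTP $httpCode with " . strlen($body) . ' bytes';
}

// Usage sketch (not executed here; $url and $fields are placeholders):
// $ch = curl_init($url);
// curl_setopt($ch, CURLOPT_POST, true);
// curl_setopt($ch, CURLOPT_POSTFIELDS, http_build_query($fields));
// curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
// $body = curl_exec($ch);
// $code = curl_getinfo($ch, CURLINFO_HTTP_CODE);
// echo describeTransfer($code, (string) $body);
```

Setting CURLOPT_VERBOSE to true on the handle is what produces a trace like the one shown earlier, which is usually the quickest way to see the offending header.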

It has been suggested[3][4] that this bug will be fixed in version 1.5.x, but that doesn’t help us right here, right now[5]. So what to do? Who will save me? The internet, of course!

gnegg, amongst others, has run into this problem before me and has a fix[6] that solves the problem easily. Just add a blank Expect: header to your curl call. That fixes the problem nice and quick:

[cc lang='php']curl_setopt($ch, CURLOPT_HTTPHEADER, array('Expect:'));[/cc]

And that’s it: by explicitly setting an empty Expect: header, we suppress the Expect: 100-continue header that cURL otherwise adds on its own to larger POST requests like this one.
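For context, here is a rough sketch of where that line could sit in a complete POST. The helper name buildPostOptions and the example field are made up for illustration; only the empty Expect: header comes from the fix itself.

```php
<?php
// Hypothetical helper: build the option array for a form POST that
// suppresses cURL's automatic "Expect: 100-continue" header.
function buildPostOptions(array $fields, array $extraHeaders = []): array
{
    return [
        CURLOPT_POST           => true,
        CURLOPT_POSTFIELDS     => http_build_query($fields),
        CURLOPT_RETURNTRANSFER => true,
        // An "Expect:" header with no value tells cURL to leave the header out.
        CURLOPT_HTTPHEADER     => array_merge(['Expect:'], $extraHeaders),
    ];
}

// Usage sketch (the field name is a placeholder, not a real API parameter):
// $ch = curl_init('http://theapi.com/ApiCommand');
// curl_setopt_array($ch, buildPostOptions(['command' => 'ping']));
// $response = curl_exec($ch);
```

The headers are merged into a single array because each CURLOPT_HTTPHEADER call replaces the previous header list. Setting other custom headers in a second call would silently drop the empty Expect: again.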

Now, I can’t say exactly what Expect: does. I think it helps by chopping up data into smaller chunks, but that is a guess and I don’t feel like looking it up.

Update: I’m so very wrong!

Philip Hofstetter, who wrote the article on gnegg.ch, sent me an email today clearing up my assumption about the Expect: 100-continue HTTP header:

I’m the author of the blog post on gnegg.ch you referred to and I wanted to take the opportunity to explain what the idea behind Expect: 100-continue is:

The idea behind Expect: 100-continue is to give the server a chance to make checks for the request’s validity without the client actually having to send all the data first.

So the client sends the POST request just as if it would just post the data, but it leaves the request body completely empty, but adds the Expect: 100-continue.

The server can now check

- if maybe authentication is needed but not provided
- if the URL in question is even capable of accepting a POST request
- if the remote host is permitted to send the data

or whatever else that can be checked independently of the post body.

Now the server either sends back a fitting error code or it sends the code 100 telling the client that it’s ok to go ahead and resend the request, but this time WITH the request body and WITHOUT the expect header.

So the idea isn’t chopping up the data, it’s making it as sure as possible to detect early failures without the client having to transmit all the data first.

I hope that helps to explain what’s going on.

I knew I should have just gone and looked it up! Thanks Philip!
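To make Philip’s list concrete, here is a toy model of the decision a server could make from the request headers alone, before any body arrives. This is only a sketch: the function name, its parameters, and the choice of 401, 403 and 405 as the "fitting error codes" are my own assumptions, not anything lighttpd or the API actually does.

```php
<?php
// Toy model of a server-side pre-check for a request sent with
// "Expect: 100-continue". All names and rules are illustrative only.
function preflightStatus(bool $hasCredentials, bool $urlAcceptsPost, bool $hostAllowed): int
{
    if (!$hostAllowed) {
        return 403; // the remote host is not permitted to send the data
    }
    if (!$hasCredentials) {
        return 401; // authentication is needed but was not provided
    }
    if (!$urlAcceptsPost) {
        return 405; // the URL cannot accept a POST request
    }
    return 100; // Continue: the client may go ahead and send the body
}

// A client with credentials, posting to a valid URL from an allowed host:
echo preflightStatus(true, true, true), "\n";  // prints 100
// The same request without credentials is rejected before any upload:
echo preflightStatus(false, true, true), "\n"; // prints 401
```

In every rejected case the client learns about the failure without having transmitted the 1760 bytes of form data first, which is the point of the header.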
