FeedBurner Has Header Issues
So, recently I’ve been watching a bit of a FeedBurner problem arise between some friends on IRC, over at Mobitopia. It seems that MobileTech uses FeedBurner for his feed, via http://feeds.feedburner.com/Mobiletech.
However, Erik was having a problem: every time the feed loaded, all items would look new. He dug into it a bit and found what is most likely the root cause: every 30 minutes or so, the “Last-Modified” header that FeedBurner sends changes. The value seems to be cached per IP for about 30 minutes to an hour, but once you get past that, it just snaps to the current time.
Now, I’m not one to criticize over small mistakes like this, but I feel bad for FeedBurner: their conditional GET mechanism is completely broken by this behavior. Rather than returning a 304 Not Modified, FeedBurner returns the full feed to anyone who asks for it again after an hour or so. That’s got to be quite a hit on their bandwidth.
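To make the failure concrete, here is a minimal sketch of the check a server performs on a conditional GET. It is an illustration under my own assumptions, not FeedBurner's code: the function name `conditional_status` and the sample dates are hypothetical.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def conditional_status(if_modified_since, last_modified):
    """Return 304 if the client's copy is still current, else 200.

    if_modified_since: raw If-Modified-Since header value, or None.
    last_modified: timezone-aware datetime the server reports for the feed.
    """
    if if_modified_since is None:
        return 200  # unconditional request: always send the full feed
    try:
        client_time = parsedate_to_datetime(if_modified_since)
    except (TypeError, ValueError):
        return 200  # unparseable header: fall back to a full response
    if client_time.tzinfo is None:
        client_time = client_time.replace(tzinfo=timezone.utc)
    # HTTP dates have one-second resolution, so drop microseconds.
    if last_modified.replace(microsecond=0) <= client_time:
        return 304
    return 200


# Hypothetical timeline: the feed really changed at 12:00 UTC.
feed_changed = datetime(2005, 2, 18, 12, 0, tzinfo=timezone.utc)
client_header = format_datetime(feed_changed, usegmt=True)

# Server keeps reporting the true modification time: the reader gets a 304.
stable = conditional_status(client_header, feed_changed)

# Server's Last-Modified has snapped forward to 13:00: the reader gets a 200.
drifted = conditional_status(
    client_header, datetime(2005, 2, 18, 13, 0, tzinfo=timezone.utc)
)
print(stable, drifted)
```

Once the reported Last-Modified moves forward on its own, the date the reader sends back is always older than the server's value, so the check can never produce a 304 even though the feed content is unchanged.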
In this case, there were a couple of other issues relating to the problem: Tarek’s feed was previously powered by RDF, which FeedBurner seemed to chew up and spit out in a quite ugly way in his case. However, I haven’t seen it happen the same way in other cases. In Tarek’s case, the feed was actually using what looked like uniquely generated local IDs for an rdf:about, which, although fine in RDF parlance, is not allowed according to the RSS 1.0 specification, and doesn’t pass the feed validator. Most likely, Erik’s newsreader, NewzCrawler, saw these funky-looking IDs and didn’t treat them as permalinks, contributing to the problem.
Regardless of other issues though, I’ve checked a few other FeedBurner feeds, and every single one of them has a Last-Modified header dated within the past hour. This is simply not a good plan for your bandwidth, or for RSS in general: you’re dealing with a lot more traffic than you need to. 304 Not Modified is your friend, either via ETags or If-Modified-Since. RSS readers are doing a good job of cleaning up their act and using these headers. If the servers don’t support them, that’s just going to discourage such use in the future, and contribute to the load problem that RSS has become for so many people.
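On the reader side, using these headers mostly means remembering the validators from the last full response and sending them back unchanged. The sketch below is an assumption-laden illustration rather than how any particular newsreader works: the cache layout and the names `conditional_headers` and `update_cache` are hypothetical, and the actual HTTP request is left out.

```python
def conditional_headers(cached):
    """Build request headers from whatever validators were cached last time."""
    headers = {}
    if cached.get("etag"):
        headers["If-None-Match"] = cached["etag"]
    if cached.get("last_modified"):
        # Echo the server's own string back verbatim; do not reformat it.
        headers["If-Modified-Since"] = cached["last_modified"]
    return headers


def update_cache(cached, status, response_headers, body):
    """Keep the old copy on 304; store the new body and validators otherwise."""
    if status == 304:
        return cached
    return {
        "etag": response_headers.get("ETag"),
        "last_modified": response_headers.get("Last-Modified"),
        "body": body,
    }


# First fetch: nothing cached, so the request is unconditional.
first_request = conditional_headers({})

# Hypothetical full response from the server.
cached = update_cache(
    {},
    200,
    {"ETag": '"abc"', "Last-Modified": "Fri, 18 Feb 2005 12:00:00 GMT"},
    "<rss/>",
)

# Second fetch: both validators go back to the server.
second_request = conditional_headers(cached)
```

If the server answers the second request with a 304, `update_cache` returns the existing entry untouched and no feed body crosses the wire, which is the bandwidth saving at stake here.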
I’m going to let FeedBurner know about this in more detail, and this is really not a slight against them. HTTP headers for RSS are hard to get right, and typically not something that just “works” out of the box, so I understand the difficulties attached to them. Getting them wrong, however, has some major consequences for all parties, so I hope they can figure out what’s up and get it fixed, both for their sake and for the sake of the people who use them.
February 18th, 2005 at 8:39 pm
Hi there, thanks for the detailed post. We’re intimately familiar with if-mod-since and 304’s and the various issues and preferred means of dealing with these things. You have discovered a recently introduced bug that (to make a short story shorter) got injected into the system when we were actually doing some performance tuning! We’re working on this right now, and we should have a fix for it checked in soon. We’re investigating your other comments as well, and we should have a lot more detail on that this weekend. Thanks again for the post and the detailed thoughts.
February 18th, 2005 at 9:04 pm
Dick – That’s great to hear. When I was looking into it, I was amazed that such a service could have had the problem for that long without suffering pretty serious effects. It sounds like my concerns weren’t misplaced in this case. I’m sure you guys would have found it, but it seems like maybe I helped point it out.
If you have any questions on the RSS 1.0 format, feel free to drop them to me in email: I tend to be pretty good with these things, from way too much experience with them 🙂
February 18th, 2005 at 11:39 pm
Hey Christopher, you definitely are the reason we found it sooner rather than later. It was a very recent change that caused it and we’re heads down on several things right now, so your post was a red alert for us. Thanks again, and stay tuned.
February 25th, 2005 at 11:36 am
Christopher, thank you again for letting us know about this. We recently updated our load balancing strategy, and in doing so we lost the last-modified synchronization across app servers. We’ve restored the functionality as of last night, so hopefully you’ll be seeing a lot more 304s!