There are significant differences between SUP and the Six Apart update stream: (1) SUP documents are fixed length. SUP clients periodically poll the server for new updates. Six Apart streams updates to the client. (2) SUP documents contain opaque feed identifiers. Six Apart sends a stream of Atom feed entries.
- Gary Burd
SUP can also be used to monitor for updates on any URL regardless of content-type using the X-SUP-ID HTTP header. You are correct though that it's essentially a compact, standardized, generalized, discoverable form of an update feed.
- Paul Buchheit
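To make the polling model in the comments above concrete, here is a minimal Python sketch of the client side. It assumes that the X-SUP-ID header value is the SUP feed URL with the opaque SUP-ID as its fragment, and that the SUP document is JSON with an `updates` list of `[sup_id, update_id]` pairs. The URLs, IDs, and function names are made up for illustration.

```python
import json
from urllib.parse import urldefrag

def parse_sup_header(value):
    """Split an X-SUP-ID header value into (sup_feed_url, sup_id).

    Assumption: the SUP-ID is carried as the URL fragment.
    """
    url, sup_id = urldefrag(value.strip())
    return url, sup_id

def updated_feeds(sup_document, subscriptions):
    """Return the subscribed feed URLs whose SUP-ID appears in the SUP document.

    subscriptions maps sup_id -> feed URL, learned earlier from each
    feed's own X-SUP-ID header.
    """
    doc = json.loads(sup_document)
    seen = {sup_id for sup_id, _update_id in doc.get("updates", [])}
    return sorted(url for sup_id, url in subscriptions.items() if sup_id in seen)

# Hypothetical values, not taken from any real service.
header = "http://example.com/sup.json#4496672d"
sup_url, sup_id = parse_sup_header(header)
subscriptions = {sup_id: "http://example.com/feeds/alice.atom"}
sample_doc = json.dumps(
    {"period": 60, "updates": [["4496672d", "k9"], ["ffffffff", "z1"]]}
)
changed = updated_feeds(sample_doc, subscriptions)
```

The client only refetches the feeds listed in `changed`; IDs it does not recognize (such as `ffffffff` here) are simply ignored.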
As SUP is basically a "service-to-service", behind-the-scenes protocol, a streamed version with persistent connections seems more interesting. Also, Six Apart streams contain the actual data, so the stream consumer does not need to fetch individual feeds.
- Alex Kapranoff
Alex, streamed is less efficient because it delivers data that I don't care about, has privacy problems (what would Google Reader Shared Items stream?), and is much more complex to implement (try doing it in PHP on shared hosting).
- Paul Buchheit
The first and second things you mention are problems of the Six Apart implementation, which is not an unchangeable standard. One can easily add filtering to the Atom Stream protocol and probably do something with secret URLs. Implementation for small sites on shared hosting is, on the other hand, nearly impossible, yes. But why would a small site want to implement SUP? As far as I can see, it should be interesting for moderately big services with thousands of feeds.
- Alex Kapranoff
How do you propose that Google Reader should implement this Atom streaming in such a way that it would preserve feed privacy? My PHP on shared hosting example is somewhat extreme, but PHP is actually very popular, as is Rails, which I think would also have some difficulty managing endless streams of updates. The SUP design is very much in line with how everything else on the web works so that it can be implemented everywhere -- endless streams of Atom data are unique to a single site, I believe.
- Paul Buchheit
I don't see how abstracting the username by a layer is going to help privacy. It should be up to the feed publishers how to handle privacy. They can and should offer better privacy controls than an abstraction. Based on what I've read about this so far, I could easily write a script to crawl Flickr or Twitter, get every single user's ID, and link it to their SUP-ID. I put it in a different comment, but maybe a POLL verb for HTTP would work.
- Shawn McCollum
Shawn, the ID abstraction prevents crawlers from discovering the URLs of private feeds using information in the SUP feed.
- Gary Burd
Private feeds shouldn't be in the SUP feed at all. Obfuscation as a security or privacy model doesn't prevent anything; it just delays the inevitable. The SUP protocol isn't going to be universally accepted, and no matter what, you're going to have to support feeds the current way. Handle private feeds normally, since without authentication a feed is really just hidden, not private. I like the way FF allows you to regenerate your API key; something like that should be used for private feeds rather than an ID and password.
- Shawn McCollum
But Shawn, the SUP ID in the private feed means nothing unless you know the feed it belongs to. And if you know the feed it belongs to, then you deserve to know it (i.e. someone registered it with you). Private feeds and SUP get along just fine. I suppose you could add another layer on top by generating a unique SUP ID for each feed for each client, so the SUP ID only means something to the client who requested it. But that's a lot more work for the server generating the SUP feed.
- Benjamin Golub
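One way the per-client SUP ID idea above could be sketched in Python is with a keyed hash, so the same feed gets a different opaque ID for each client. This is only an illustration under stated assumptions: the function name, the notion of a `client_key`, and the choice of HMAC-SHA256 are all hypothetical and are not part of SUP.

```python
import hashlib
import hmac

def per_client_sup_id(server_secret: bytes, client_key: str,
                      feed_url: str, length: int = 10) -> str:
    """Derive an opaque ID that is stable for one (client, feed) pair.

    Hypothetical scheme: HMAC-SHA256 over the client key and feed URL,
    truncated to `length` hex characters to keep SUP entries small.
    """
    message = f"{client_key}\n{feed_url}".encode("utf-8")
    digest = hmac.new(server_secret, message, hashlib.sha256).hexdigest()
    return digest[:length]

# Made-up inputs for illustration only.
secret = b"server-side-secret"
feed = "http://example.com/feeds/private/alice.atom"
id_for_a = per_client_sup_id(secret, "client-a", feed)
id_for_b = per_client_sup_id(secret, "client-b", feed)
```

The extra server work mentioned above shows up here as well: the server would have to emit a separate SUP document for each client, since the IDs no longer match across clients.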
Shawn, the advantage of SUP is that private feeds can still be protected by whatever mechanism you choose (such as the FriendFeed remote key), but still feed update information into a public SUP feed because the SUP-ID is completely opaque. (you can't discover the SUP-ID unless you already have access to the feed)
- Paul Buchheit
Another thought: why reinvent? Did they look at www.sitemaps.org?
- Dave Hodson
It doesn't mean nothing; it means that there is something you don't know. I can do a lot with "knowing I don't know something".
- Shawn McCollum
I think the SUP-ID concept is interesting, but it over-architects a solution for a small subset of the issue you're trying to solve. It's nice that SUP-ID works to limit the size and help with private feeds. You could also implement something like the HTML base tag for size. Then either separate private and public updates, with the private ones using SUP-ID, or use a marker to identify that the backend needs to jump through an extra hoop on this one.
- Shawn McCollum
"It doesn't mean nothing, it mean that there is something you don't know. I can do alot with "knowing I don't know something"." Fill your SUP feed with random bogus SUP ids and bogus data. Then there will be *a lot* you don't know (including how many accounts the service *actually* has).
- Benjamin Golub
Shawn, can you give an example of something you could do, knowing that an unknown feed within OurDoings was updated at a specific time? I don't see the vulnerability here. This is a real question as I already implemented SUP.
- Bruce Lewis
Just so everyone knows, I'm not trolling and I love FriendFeed. I have a personal, not professional, interest in speeding up feed aggregation. Just chatting, so... Loading up random data will have a negative effect on the size feature of SUP and will cause more processing to be done by the consumer of the feed. Bruce, I'll put something down to answer your question in a bit, but right now I've got to pick up my son from daycare.
- Shawn McCollum
It's funny you mention Netflix personalized feeds indirectly in your blog post, Dare, as I'm not sure a basic SUP feed would help much for our feeds. The personalized Netflix feeds generate about 6 million posts per day (about 2M each of queue adds, shipped DVDs and received DVDs). Given that any given feed consumer is likely only interested in a small fraction of the 8.4M+ subscribers, the signal-to-noise ratio in a SUP feed would be quite low, unless you created a SUP feed per consumer.
- Michael Hart
Michael, SUP is intended for large feed consumers, not people monitoring one or two feeds. If you are doing 6 million feed updates per day, and your SUP feed compresses to about 8 bytes/entry (as the FriendFeed SUP does), then that works out to about 555 bytes/second (or 33k/minute). A compressed Netflix feed appears to be about 22k, so the breakeven point is around 2000 feed fetches/day. For clients who poll feeds every half hour, that is only 45 unique feeds. For those that poll 1000s, it's a big win.
- Paul Buchheit
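The back-of-the-envelope numbers in the comment above can be reproduced with a few lines of Python. All inputs are the figures quoted in the thread (6 million updates per day, about 8 bytes per compressed SUP entry, about 22k per compressed feed, polling every half hour); the variable names are just for illustration.

```python
# Figures quoted in the thread; treated here as rough estimates.
UPDATES_PER_DAY = 6_000_000
BYTES_PER_SUP_ENTRY = 8          # compressed size per SUP entry
COMPRESSED_FEED_BYTES = 22_000   # "about 22k" per compressed feed
SECONDS_PER_DAY = 24 * 60 * 60

# Total SUP traffic a consumer downloads per day.
sup_bytes_per_day = UPDATES_PER_DAY * BYTES_PER_SUP_ENTRY
sup_bytes_per_second = sup_bytes_per_day / SECONDS_PER_DAY
sup_bytes_per_minute = sup_bytes_per_second * 60

# Number of plain feed fetches that cost the same bandwidth as the SUP feed.
breakeven_fetches_per_day = sup_bytes_per_day / COMPRESSED_FEED_BYTES

# A client polling every half hour makes 48 fetches per feed per day.
fetches_per_feed_per_day = 24 * 2
breakeven_feeds = breakeven_fetches_per_day / fetches_per_feed_per_day

print(f"{sup_bytes_per_second:.0f} bytes/s, {sup_bytes_per_minute / 1000:.1f}k/min")
print(f"breakeven: {breakeven_fetches_per_day:.0f} fetches/day, "
      f"{breakeven_feeds:.0f} feeds at half-hour polling")
```

The exact breakeven comes out slightly above the rounded "around 2000" fetches per day, which still gives roughly 45 feeds at half-hour polling.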
Of course the math is different with If-Modified-Since, but from what I can tell Netflix does not currently support that. Also, if you want to be a little more clever, the size of the SUP feed could be reduced substantially by only including info on feeds that have ever been fetched, since the majority of Netflix feeds have never been accessed. You can also have separate SUP feeds for queue adds, shipped, and received, since some clients (such as FriendFeed) are only interested in one category (adds, in our case).
- Paul Buchheit
And of course that's just the bandwidth savings. Using SUP would also reduce feed latency and the overall number of requests.
- Paul Buchheit
Paul, drop me an email and we can take this discussion offline. We have some interesting feed improvements on the way in the very short term including ETag-based cache control and programmatic feed access behind OAuth to get rid of the copy/paste nonsense. You can reach me at mhart at netflix dot com
- Michael Hart
It doesn't go far enough for me in terms of granularity. I'd like a way to decouple feed page sizes from updates, so you can know exactly which entries are updated or new, not just which feeds. This would be fantastic for mobile apps. Right now it's about asking endpoints to support a 'since' parameter on the feed. I wonder if X-SUP-ID will go through mobile networks; what they do sometimes with HTTP is 'interesting'.
- Bill de hÓra