It is possible (given sufficient resources, or even just a web search) to infer the resource from a resource token. The resource might be password protected, but by tracking the SUP document it becomes possible to perform traffic analysis (http://en.wikipedia.org/wiki...) on private resources, because you will know how often and when they update. The spec should point out the potential risk.
- Adewale Oshineye
I've given this issue a lot of thought because it affects OurDoings. The risk is negligible. I don't have time for an extensive comment now, but hopefully can fill in details later.
- Bruce Lewis
via fftogo
Briefly, if you don't already have access to the resource, the resource token won't lead you to it. If you know enough about the resource to do traffic analysis, you might be able to figure out what token goes with said resource, but you won't have any more info than when you started, i.e. you'll know when the resource updates. You already had that knowledge or you couldn't do traffic analysis.
- Bruce Lewis
Given a private resource at example.net/1 (perhaps discovered through the Referer header), all I can tell is that there's something there. Given an Update Document that points to example.net/1 (perhaps because the publisher has decided not to use opaque tokens, since all the URLs are password protected), it becomes possible to tell when this private resource is being updated. Moreover it...
- Adewale Oshineye
If people know your URLs and you use a hash of your URLs as the token, then yes you're telling the world when each was updated, no traffic analysis necessary. Seems obvious to me, but you're probably right that implementors should be warned somewhere. (I take back what I said about having thought a lot about this; I used opaque tokens and didn't think about the risk of non-opaque tokens.)
- Bruce Lewis
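To make the opaque versus non-opaque distinction concrete, here is a minimal Python sketch. It is not taken from the spec: the function names, the 10-character truncation, and the keyed-hash approach are all assumptions chosen for illustration. It shows why a plain hash of a known URL can be matched by anyone, while a keyed hash cannot be recomputed without the publisher's secret.

```python
import hashlib
import hmac
import secrets


def hashed_token(url: str) -> str:
    """Non-opaque: anyone who knows the URL can recompute this token."""
    return hashlib.sha1(url.encode("utf-8")).hexdigest()[:10]


def keyed_token(url: str, secret_key: bytes) -> str:
    """Opaque to outsiders: recomputing it requires the publisher's key."""
    digest = hmac.new(secret_key, url.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:10]


private_url = "http://example.net/1"
publisher_key = secrets.token_bytes(32)  # hypothetical key, kept server-side

# An observer who learned the URL (say, from a Referer header) can rebuild
# the plain hash and look for it in the published updates list.
observer_guess = hashed_token(private_url)
print("plain hash matches:", observer_guess == hashed_token(private_url))

# Without the key, the same observer cannot rebuild the keyed token.
observer_attempt = keyed_token(private_url, b"not the real key")
print("keyed token matches:", observer_attempt == keyed_token(private_url, publisher_key))
```

A random token stored in a lookup table would serve equally well; the keyed hash is used here only because it needs no storage.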
Would adding this paragraph to the Security section of the spec help: "SUP does not require that a site MUST encrypt its resource tokens. The resource tokens may be rendered opaque through strong encryption, hashing or they may be non-opaque. In a scenario where a SUP document is being used to indicate that private or password protected resources have been updated and the resource...
- Adewale Oshineye
Is it possible to get a SUP_ID on entries? I would like to monitor a large set of FriendFeed entries for when they are commented on, liked, etc., i.e. whenever they are updated. (via http://friendfeed.com/friendf...)
Not currently, but you could monitor the SUP_IDs of the feed and re-check the entry when the feed is updated (entry updates are also feed updates).
- Paul Buchheit
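The workaround described above could be sketched roughly as follows in Python. This is an illustration only: the entry identifiers and SUP ID values are made up, and the sketch assumes the updates document carries an "updates" list of [SUP ID, update ID] pairs.

```python
from collections import defaultdict

# Hypothetical bookkeeping: the entries we follow, grouped by the SUP ID
# of the feed each entry belongs to.
entries_by_feed = defaultdict(set)


def follow(feed_sup_id: str, entry_id: str) -> None:
    """Remember that entry_id lives on the feed identified by feed_sup_id."""
    entries_by_feed[feed_sup_id].add(entry_id)


def entries_to_recheck(updates_document: dict) -> set:
    """Return followed entries whose feed appears in the updates list."""
    stale = set()
    for sup_id, _update_id in updates_document.get("updates", []):
        stale |= entries_by_feed.get(sup_id, set())
    return stale


follow("4496672d", "entry-1")
follow("4496672d", "entry-2")
follow("0d0b1bc0", "entry-3")

# Assumed document shape; only the "updates" list is used here.
doc = {"period": 60, "updates": [["4496672d", "Aa"], ["ffffffff", "Bb"]]}
print(sorted(entries_to_recheck(doc)))
```

Every followed entry on an updated feed is re-checked, so some re-checks will find nothing new on the entry itself.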
Thanks Paul. I might be following 1000 entries, and if they are spread over 1000 individual users then I may check to see if my entry is updated when in fact it is another, unrelated entry on that user's feed that has changed, so I effectively polled the feed for no reason. I honestly do like the idea of treating an entry as a feed in its own right.
- Paul Kinlan
I agree, it would be nice. Monitoring the feeds would probably still be helpful though since most feeds don't update very often.
- Paul Buchheit
I am also thinking of using the SUP_ID of users' feeds to invalidate the caches on the FriendFeed version of Amplifeeder (http://paul.kinlan.me as an example). I need to cache for as long as possible, and I hope polling a SUP feed will allow me to use the cache more effectively.
- Paul Kinlan
"At Google I/O 2009, we demoed a nifty sample application that tracks updates to any number of YouTube user activity feeds. The technology behind the application is the Simple Update Protocol (SUP), a simple and compact "ping feed" that enables your application to efficiently monitor changes to a large number of user activity feeds. If you run a social network with tons of users who also happen to be active on YouTube, you should consider using SUP to let your users easily share their updates on YouTube with their friends through their social graph on your site."
- Paul Buchheit
via Bookmarklet
Paul, how important do you believe SUP is vs. other initiatives on the Web today?
- Louis Gray
Louis, SUP provides a very simple way for sites such as YouTube to expose update information to bulk consumers such as FriendFeed. Obviously this is just one part of what is needed to make everything realtime or at least more realtime, but it's a very important part. My hope is that other realtime efforts (such as PubSubHubBub) will also add support so that SUP will also become easier to consume, especially for smaller sites and projects.
- Paul Buchheit
It also removes the need for RSS, or a custom API for many things
- Jesse Stay
Jesse, SUP does not replace RSS, it accelerates it :) (though it can be used for all HTTP resources and media types, not just RSS)
- Paul Buchheit
Oh - I haven't really looked at the spec. So my guess is SUP is what notifies servers to pick up the latest RSS?
- Jesse Stay
Can someone summarize for those of us in a low-bandwidth situation?
- Bruce Lewis
Summary: YouTube has a SUP feed (http://gdata.youtube.com/sup) which was announced at Google I/O. As far as I can tell it hasn't been documented yet but it sounds like they are working on it. FriendFeed has been consuming the feed since a few days before the Google I/O announcement.
- Benjamin Golub
Couldn't see anything immediately, so I'll ask. I want to be able to create my own FriendFeed SUP so that I can monitor, say, 1k feeds every 60 seconds. I know the SUP IDs you assign, so I'd like to optimize my querying by asking my own URL for any updates, e.g. http://friendfeed.com/api...
If you monitor http://friendfeed.com/api..., then you get all changes at FriendFeed including the 1K feeds that you are interested in. We do not provide a way to get a subset of the changes.
- Gary Burd
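Since the subset has to be picked out on the consumer side, a client might filter each poll along these lines. This is a hedged Python sketch, not FriendFeed's code: the function name and sample IDs are invented, and it assumes the updates document holds an "updates" list of [SUP ID, update ID] pairs. Remembering the pairs already handled keeps overlapping polls from reporting the same change twice.

```python
def new_updates(updates_document: dict, watched: set, seen: set) -> list:
    """Return (sup_id, update_id) pairs for watched feeds not handled before.

    `seen` is mutated so that a later, overlapping poll skips these pairs.
    """
    fresh = []
    for sup_id, update_id in updates_document.get("updates", []):
        key = (sup_id, update_id)
        if sup_id in watched and key not in seen:
            seen.add(key)
            fresh.append(key)
    return fresh


watched_ids = {"4496672d", "0d0b1bc0"}  # the feeds we care about
already_seen = set()

first_poll = {"updates": [["4496672d", "Aa"], ["ffffffff", "Bb"]]}
second_poll = {"updates": [["4496672d", "Aa"], ["0d0b1bc0", "Cc"]]}

print(new_updates(first_poll, watched_ids, already_seen))
print(new_updates(second_poll, watched_ids, already_seen))
```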
I've checked the API call per the previous post and can confirm that X-SUP-ID is set (http://feeds.seesmic.com/sup...), that the sup.json file validates, and I even see it being requested in the Apache logs. But I don't see my updates until I tell FF to refresh the service. Any hints?
Hi Mike, there is an issue with the FriendFeed code that crawls Seesmic that I hope to have fixed soon. I see nothing wrong with your SUP implementation. Sorry for the delay.
- Benjamin Golub
It may take up to 1/2 hour before SUP is active on all seesmic feeds though. Also, if you want to make ff find seesmic entries even faster, you should add more available_periods (like 30 or 60 seconds).
- Paul Buchheit
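From the consumer side, choosing among advertised periods might look something like this Python sketch. It rests on an assumption not confirmed in this thread: that available_periods is a JSON object mapping a period in seconds (as a string) to the URL of the updates document covering that window. The URLs and the function name are hypothetical.

```python
def pick_updates_url(updates_document: dict, poll_interval: int) -> str:
    """Pick the shortest advertised period that still covers poll_interval.

    If no advertised period is long enough, fall back to the longest one;
    the caller would then need to poll more often to avoid missing updates.
    """
    periods = {
        int(seconds): url
        for seconds, url in updates_document["available_periods"].items()
    }
    candidates = [p for p in periods if p >= poll_interval]
    chosen = min(candidates) if candidates else max(periods)
    return periods[chosen]


# Assumed document shape with made-up URLs.
doc = {
    "period": 60,
    "available_periods": {
        "30": "http://example.com/sup.json?30",
        "60": "http://example.com/sup.json?60",
        "300": "http://example.com/sup.json?300",
    },
}

print(pick_updates_url(doc, 45))
print(pick_updates_url(doc, 600))
```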
we should definitely optimize it for seesmic videos to show in 30 secs on Friendfeed
- Loic Le Meur
btw we're almost done updating our player too so that it is lighter
- Loic Le Meur
my logs show 6 hits earlier today but then nothing and the sup.json file has already been validated (and is still valid). (update: a single sup.json hit followed by two hits for a user file)
- Mike Taylor
well I'll be - uhhh, hmm - thanks -- I thought it was reading atom files so I added the link tag to them. ok - let me go tweak the header for the user's json response
- Mike Taylor
SUP is designed so that publishers can easily expose feed updates. Large consumers can use that information directly, or make it available through a variety of other channels for other consumers, large and small. See the FAQ: http://code.google.com/p.... Push vs pull is not an important distinction. Also, handling a large number of connections is not a difficult problem -- that's how IM works, for example.
- Paul Buchheit
Not a bad idea -- if the URI is optional. So the Updates Document could, at the publisher's discretion, include the Resource URI in addition to the Resource Token.
- DeWitt Clinton
Consequences: clients still have to maintain a mapping (for handling private or protected resources); Updates Document gets larger; clients now have to decide if they're going to base their mapping on the Updates Document or the resource (this could be problematic if they're out of synch); clients have to be a little more sophisticated as they have to check for optional features in each payload. Did I miss anything?
- Adewale Oshineye
There can be many URIs mapped to the same resource token. How does a service select the URI to include in the updates document?
- Gary Burd
Just one: A list may be the wrong structure for optional values, and we may need to use a dict instead. So it becomes: { "uri": "...", "resource": "...", "update": "..." } which is extremely verbose.
- DeWitt Clinton
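To put a rough number on "extremely verbose", this small Python sketch compares the serialized size of one update in list form and in the dict form quoted above. The token values and URI are placeholders, and compact JSON separators are an assumption about how a publisher would serialize.

```python
import json

# Pair form: a plain [resource token, update token] list.
as_list = ["4496672d", "1a2b"]

# Dict form from the comment above, without and with the optional URI.
as_dict = {"resource": "4496672d", "update": "1a2b"}
as_dict_with_uri = {
    "uri": "http://example.com/feed",
    "resource": "4496672d",
    "update": "1a2b",
}


def compact_size(entry) -> int:
    """Length in characters of the entry serialized without extra whitespace."""
    return len(json.dumps(entry, separators=(",", ":")))


for label, entry in [
    ("list", as_list),
    ("dict", as_dict),
    ("dict with uri", as_dict_with_uri),
]:
    print(label, compact_size(entry))
```

The per-entry overhead of repeated key names is multiplied by every update in the document, which is where the cost shows up for large publishers.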
I can tell already that reaching consensus on this is going to be difficult...
- DeWitt Clinton
Because the URI is optional, clients are required to maintain a resource token -> URI mapping. Given that clients are already maintaining this mapping, what is the value of including the URI in the updates document?
- Gary Burd
@Gary - I believe the idea is that many (most?) publishers don't have private feeds, and would be fine with a URI only in the Updates Document.
- DeWitt Clinton
@DeWitt I was about to suggest a flag to indicate if there are going to be private feeds in an Updates Document, but I now think that this entire proposal may be an unnecessary complication. Optional elements were one of the areas of exploding complexity and interop issues in RSS, and I tend to think that unless there's a strong benefit (e.g. bandwidth savings or greatly simplifying the common case) this should not be added. Adoption will be increased if interop testing of client implementations is made easy.
- Adewale Oshineye
I personally concur, but Mark's suggestion has merits, so I'm glad we're discussing it. That said, unless someone steps forward and takes up the case for including the URI in some form, I'm okay if we consider it shelved for now. It will likely be reopened once the first draft is published, but we can address that when the time comes.
- DeWitt Clinton
@DeWitt, Many important publishers do have private streams. Publishers with private streams at Google include Reader, Blogger and Picasa.
- Gary Burd
Replace the 2nd sentence of Section 4 with: "Each unique underlying resource MUST always map to a single, stable Resource Token. Different representations of the same resource MAY share the same Resource Token for instance if a single user's account is represented with both Atom and RSS feeds."
- Adewale Oshineye
Follow it with "Resource tokens SHOULD be opaque in order to permit the Updates Document to refer to private or password protected resources as well as to avoid exposing information about the total number of active users of a publisher's service."
- Adewale Oshineye
That's not bad. I'm not sure the part about "both Atom and RSS" feeds is comprehensive, so we may need to add "for example". Also, please see the comment from Mark Nottingham about an (optional) URI in addition to the Resource Token: http://friendfeed.com/e...
- DeWitt Clinton
I received more feedback that this discussion will almost certainly need to move to a mailing list to make it through an IETF standardization track. Thoughts on this below.
We have two reasonable choices here that I see: We can either move to a list and continue down the standards path, which, practically speaking, will require Paul and Gary to buy in to that strategy. Or we can take what we have now, package it up as a perfectly reasonable specification, (but not a standard), and iterate on it independently of the IETF or other standards body.
- DeWitt Clinton
I'm a little wary of standardising before we've gained sufficient real-world experience (cf. J2EE or OMB), and I'd argue for a hybrid approach. Take the current spec, stick a version number on it and iterate based on feedback from real-world implementations. Then, in X months' time or when the process of iterative revision seems to have stabilised, take it to the IETF.
- Adewale Oshineye
That could work. Though I'll admit to not being personally excited about continuing to proxy feedback back and forth between my personal inbox and this FriendFeed room.
- DeWitt Clinton
And to be clear, this would live in I-D form for at least a version or two, regardless of whether we ultimately take this to a standards track. I can almost guarantee that it wouldn't be standardized by a broad community in its current form.
- DeWitt Clinton
For the record, I liked the Room for discussion purposes. While it reinforced my sentiment that I would pay cold hard cash to FriendFeed for the ability to insert newlines and write longer comments -- everything else worked perfectly well.
- DeWitt Clinton
Okay, how about this for a plan: I'll copy Eran's and others' existing notes into the room so we can use what we can to improve the current version. We'll take as much of that as we can, produce the first official draft, and then take it to an existing list. Mark Nottingham suggested one w3 list in particular and I'll sync up with him on it.
- DeWitt Clinton
Paul/Gary, did you want to add anything to this version?
- DeWitt Clinton
Notifixious can now "track" your feeds better if you use SUP! We are also setting up a "public ping" and a "public SUP feed". - http://blog.notifixio.us/post...
What should happen in a situation where a consumer is interested in only a subset of the resources being published? Should the spec say whether there's one global Update Document per publisher? Or is that a separate discussion about best practice?
It is usually the case that a consumer is interested in only a subset of the resources. The consumer should ignore resources that the consumer does not care about.
- Gary Burd
A publisher can have an arbitrary number of update documents (one per publisher, one per resource, a shared update document at another service, and so on). It's up to the publisher to decide how to segment resources into update documents. Assuming that each consumer is interested in a random set of resources on a publisher, a single update document will be most efficient.
- Gary Burd
This is a very good question. In particular, I've been worried about the case when a given resource is being monitored at both a collection level (to use the AtomPub terminology) and at a site-wide level. Should we make it clear that there can be multiple Update Link Headers or Update Link Elements in or associated with a given resource?
- DeWitt Clinton
How does a client select the updates document to monitor if a resource has multiple links?
- Gary Burd
Presumably it could monitor any one of them and they'd all be updated when the resource is.
- DeWitt Clinton
Using GData as one example, few consumers would ever want to read or be able to meaningfully handle the Updates Document for all of Google. But some might want to track the updates for a single domain, or a single user, or a single collection.
- DeWitt Clinton
Oh, I see your question now -- how would the client know which is which...
- DeWitt Clinton
Let's make this concrete. Imagine I want to monitor Flickr for the photos of a few hundred people. I don't really want to handle one Updates Document covering every Flickr user and then filter out the ones I'm not interested in, but I also don't want to poll Flickr hundreds of times per minute. If each resource advertised the shard of the global Updates Document that it was in, then I could minimise the number of URLs I need to poll. It would be up to the server to decide how big its shards should be.
- Adewale Oshineye
Adewale, Because a resource advertises the location of the update document, a service can choose to shard resources with one update document per shard. The service would not have a "global" update document in this case.
- Gary Burd
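One way a publisher might implement that sharding is sketched below in Python. Nothing here comes from the spec: the shard count, the hash-based assignment, and the URL pattern are assumptions chosen only to show that a consumer watching a few hundred resources ends up polling at most one URL per shard.

```python
import hashlib

SHARD_COUNT = 16  # hypothetical; the publisher picks whatever size suits it


def shard_for(resource_token: str) -> int:
    """Map a resource token to a stable shard number."""
    digest = hashlib.sha256(resource_token.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % SHARD_COUNT


def updates_document_url(resource_token: str) -> str:
    """URL the resource would advertise for its shard (made-up pattern)."""
    return f"http://example.com/sup/{shard_for(resource_token)}.json"


def urls_to_poll(resource_tokens) -> list:
    """Distinct updates documents a consumer needs for the given resources."""
    return sorted({updates_document_url(token) for token in resource_tokens})


watched = [f"user{n}" for n in range(300)]
print(len(urls_to_poll(watched)), "URLs cover", len(watched), "resources")
```

The consumer never computes the shard itself; it only reads the advertised link from each resource and deduplicates the URLs.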
Another discussion I had with Eran that I want to raise here is the question of whether or not we should be combining the location of the Updates Document with the identification of the Resource Token.
This has made me a bit uncomfortable from the beginning, and I like Eran's suggestions (which I'll copy below) as a solution. His ideas match what I planned to propose, so there are at least two people who independently came to the same conclusion.
- DeWitt Clinton
Eran suggests splitting this into two URI-based Links, one for the Resource Token and one for the Updates Document. I.e.: "Link: <#as78d9a8d7>; rel="resource-token"" and "Link: <http://example.com/sup.json>; rel="updates""
- DeWitt Clinton
One keen insight he had is that the target for the 'resource-token' Link should be a URI itself. In most cases this can be unqualified (e.g. '#as78d9a8d7') and thus relative to the resource itself. Or it could be absolutely qualified relative to the resource base ('/#as78d9a8d7'). But it can also be fully qualified, (e.g. 'http://example.com/#as78d9a8d7').
- DeWitt Clinton
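The three forms of the 'resource-token' target resolve through ordinary relative-URI rules. A short Python sketch using the standard library's urljoin shows this; the resource URL is a made-up example.

```python
from urllib.parse import urljoin

# Hypothetical resource that carries the Link header.
resource = "http://example.com/users/alice/feed.atom"

targets = [
    "#as78d9a8d7",                     # relative to the resource itself
    "/#as78d9a8d7",                    # relative to the site root
    "http://example.com/#as78d9a8d7",  # fully qualified
]

for target in targets:
    print(target, "->", urljoin(resource, target))
```

The first form yields a token URI unique to the resource, while the last two resolve to the same site-wide URI.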
I don't understand the previous comment. Can you give a concrete example where it's useful to split the token and document links?
- Gary Burd
My initial reaction wasn't so much that they needed to be split to be used independently, but rather that the convention of conflating two distinct concepts into a single URL was odd. We're asking clients to parse the fragment out and treat it as something other than part of the URL for the updates document. I understand the motivation (it was compact) but that doesn't make it simple.
- DeWitt Clinton
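For comparison, the combined form asks every client to do something like the following. This is a minimal Python sketch with an example URL; urldefrag is from the standard library.

```python
from urllib.parse import urldefrag

# Combined form: updates document URL with the token in the fragment.
link_target = "http://example.com/sup.json#as78d9a8d7"

# Split the fragment off and treat it as the token, not as part of the URL.
updates_document_url, resource_token = urldefrag(link_target)

print("poll:", updates_document_url)
print("look for:", resource_token)
```

The parsing itself is one call; the oddity being discussed is that the fragment carries a meaning unrelated to the document it is attached to.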
But given that we're also considering the multiple-updates-documents scenario, it is reasonable to think that there would be one Resource Token link, and several Updates Documents links.
- DeWitt Clinton
Another benefit is that this would allow alternate forms of mapping resource to Resource Token. Currently this is tightly coupled with the discovery of the location of the Updates Document, but that doesn't need to be the case.
- DeWitt Clinton
Eran suggested the www-talk@w3.org list. I'm not personally biased toward one list over another, but reusing an existing one makes sense to me, particularly because it would increase the number of eyes on the spec and likely raise the amount of meaningful feedback. If there are problems with SUP we want to know that now. If there are no problems, we want the exposure.
- DeWitt Clinton
Eran Hammer-Lahav sent me a lengthy email with great feedback. How should we handle this? I'll put it up as a Google Doc so we can all read his feedback, but how and where should we discuss it?
He also added, when I asked if I could post his comments: "You can make this as public as you want. I would have sent it to a list if there was one… you can use www-talk@w3.org BTW… We use it for /site-meta and my new Discovery I-D (which I am thinking of proposing for some more advanced discovery for SUP but not sure yet how much value it will add)."
- DeWitt Clinton
r31 (Revised as follows: * The term 'Updates Document' in place...) committed by dclinton - http://code.google.com/p...
I don't feel we've reached consensus on 'SUP Feed', 'SUP ID', or 'Update ID' quite yet. If we don't reach a consensus about changing the language, then we'll publish the status quo.
- DeWitt Clinton
Personally I do think the spec can be improved with different language. However, in my opinion, the spec is implementable and readable with the current names, so I don't consider the status quo to be a show-stopper.
- DeWitt Clinton
Can you give a summary of your preferred names?
- Gary Burd
My preference would be 'SUP Feed' -> 'Updates Document', 'SUP ID' -> 'Resource Token', and 'Update ID' -> 'Revision Token'.
- DeWitt Clinton
I would also like to go back through the informative text and change references to 'feeds' to 'resources'. We can still use examples of RSS and Atom feeds in the text, but I don't want to give the impression that this is only applicable to those formats when it works equally well with any URI-addressable resource.
- DeWitt Clinton
If my suggestions on naming contribute to perceived lack of consensus, consider them withdrawn. I don't see gnarly problems arising from any of the proposed names.
- Bruce Lewis
via fftogo
The word 'document' still seems a bit odd to me, but if you both think that these terms are significantly better, then I'm fine with changing them. (and it does seem that people are confused by 'feed') I do like emphasizing that it applies to all http resources.
- Paul Buchheit
I'm going to turn a version right now that tries the new language on for size. We can review and revert to the existing language if it isn't an improvement.
- DeWitt Clinton
Hum, it seems that at some point our IP is rejected :
curl http://friendfeed.com/api...
{"errorCode":"limit-exceeded"}
Is there any way to be "whitelisted"? (our IP looks like XX.XX.29.74)
Thx