I'm sorry for everyone who noticed Netflix overwhelming their feed recently. RSS GUIDs are supposed to stay unique and constant but some services change them every few months. We have some logic in place at FriendFeed to minimize the impact but it still sucks :(
I have always ignored the GUID, choosing instead to figure out RSS entry uniqueness based on a combination of title, content, author, date published, and, of course, permalink. I found that too many things change to be able to rely on something like a publisher-provided ID. I'm sure GUID works well 99% of the time, but for some reason, I found it a fun problem to just attack entries by the fact that they remain 99% constant over the course of a day (taking edits and permalink changes into account) to determine uniqueness.
- Samuel Clay
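One way the content-based approach described above might look, as a minimal sketch only: the `Entry` fields, the equal weighting of fields, and the 0.9 threshold are all assumptions made for illustration, not the method Samuel or FriendFeed actually uses. It compares two entries field by field with Python's standard `difflib.SequenceMatcher` and treats them as the same entry when they are nearly identical.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher

# Hypothetical entry shape: the five fields named in the comment above.
@dataclass(frozen=True)
class Entry:
    title: str
    content: str
    author: str
    published: str  # date kept as a plain string, e.g. "2008-11-03"
    permalink: str

FIELDS = ("title", "content", "author", "published", "permalink")

def similarity(a: Entry, b: Entry) -> float:
    """Average character-level similarity of the five fields, from 0.0 to 1.0."""
    ratios = [
        SequenceMatcher(None, getattr(a, f), getattr(b, f)).ratio()
        for f in FIELDS
    ]
    return sum(ratios) / len(ratios)

def is_duplicate(a: Entry, b: Entry, threshold: float = 0.9) -> bool:
    """Treat two entries as the same item when they are nearly identical.

    The threshold is an assumed value, not a tuned one.
    """
    return similarity(a, b) >= threshold

original = Entry(
    title="New releases this week",
    content="Here are the new releases for this week.",
    author="Netflix",
    published="2008-11-03",
    permalink="http://example.com/feed/123",
)
# Same item after a small edit and a permalink change.
edited = Entry(
    title="New releases this week",
    content="Here are the new releases for this week!",
    author="Netflix",
    published="2008-11-03",
    permalink="http://example.com/feed/123?ref=rss",
)
# A genuinely different item.
other = Entry(
    title="Weather update",
    content="Rain expected on Tuesday.",
    author="Gabe",
    published="2008-10-31",
    permalink="http://example.org/photos/9",
)

print(is_duplicate(original, edited))
print(is_duplicate(original, other))
```

Raising the threshold makes false positives (a real new entry swallowed as a duplicate) less likely at the cost of letting more duplicates through, and lowering it does the opposite.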
ah - thanks ben (solid community alerting award)...
- mike "glemak" dunn
Samuel (it is hard writing that name, you will always be Rob in my head): It's a tough call. For private things (aggregators, readers) I think it's ok to have a fancy algorithm that tries to be smart. But for FriendFeed I think false positives would annoy users just as much as duplicates. Can you imagine the frustration if your Tweet never appeared but you *know* it's there? The difference is that on FriendFeed the feeds users add are generally filled with stuff they make, so they know when something is missing. But in a reader (where the feeds are out of the user's control), if you have a false positive and detect a duplicate that isn't really one, the user will probably never know.
- Benjamin Golub
Ben: I uploaded some stuff to Flickr on Halloween and it never showed up, so you're right -- it is frustrating!
- Gabe