John Palfrey explains the issues of RSS copyright
Posted on Tuesday, January 17, 2006 at 7:02 PM
John Palfrey has responded to the questioning of Top10Sources' use of RSS feeds exactly the way a law professor should, by turning it into an opportunity to educate the blogosphere on the finer points of copyright law in relation to RSS and blogs. As the Executive Director of the Berkman Center for Internet and Society, he explores the potential risks to the Internet from overly restrictive limits on the use of RSS feeds by aggregators. The Berkman Center currently holds the copyright for the RSS 2.0 specification, and Palfrey handles this responsibility by explaining the best way for RSS to fulfill its potential. Finally, as a founding partner of the RSS Investors LP venture fund and the founder of Top10Sources, Palfrey protects his investment by skillfully deflecting the criticisms levelled against the new aggregation site. He sure is in the middle of RSS, isn't he? Let's take a look at some of his arguments, since they are a blueprint of where RSS and copyright law intersect.
Palfrey contends that aggregators like Top10Sources are not violating copyright law, but acknowledges that this is still an unresolved issue. What I find interesting is the way he casts the opponents of his view:
"The strong form of the pro-copyright argument runs like this: the creator of the RSS feed retains, automatically, all copyrights in the content in the feed and retains all rights in its republication, use as a derivative work, and so forth. Given that those rights have been retained fully by the creator of the site, the argument goes, it is unlawful for someone -- presumably in a commercial context -- to republish that copyrighted context without license to do so. This is the Web 2.0 variant of the argument that is litigated frequently in the context of web-based content, with plaintiffs like the RIAA and the MPAA (in the p2p context), the publishers (like McGraw-Hill, or Perfect 10) who are suing Google, and the like."
I can't judge the legal argument, but I respect his tactics. I don't think there is a single blogger who wants to be on the same side as the RIAA or MPAA.
He warns his readers of the consequences of the "strong form" of copyright being applied to RSS:
"Is the blogosphere arguing itself right into a trainwreck of the sort that has played out over music and movies? Consider the world that A (prominent) VC envisions, here and here, wherein content is micro-chunked and syndicated. This world cannot emerge if every plausible copyright claim is asserted and litigated."
Palfrey's most valuable recommendation is that bloggers should add a copyright statement to their feeds.
"Creative Commons licenses, as I've argued on this blog, are the way to go -- to embed them into the RSS feeds when they go out, with clear instructions for your intent. If you want people to run your feed in private aggregators, but not in public aggregators that are for-profit, to re-offer your content just as you've offered it, and to attribute authorship to you, why not add to your feed a BY-NC-SA license?"
I agree. When examining feeds for inclusion in my aggregator, I was surprised to find that none of them contained a copyright notice. My feed had one, but I've now updated it to match my site's Creative Commons license, which spells out exactly what a republisher is permitted to do.
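For anyone who wants to try this, here is a minimal Python sketch of stamping a license onto an RSS 2.0 feed. It sets the standard channel-level copyright element and adds a creativeCommons:license element from the Creative Commons RSS module. The function name add_license, the owner string, and the BY-NC-SA 2.5 URL are my assumptions for illustration; swap in whichever license matches your site, and check the namespace URI against the module's documentation before relying on it.

```python
import xml.etree.ElementTree as ET

# Namespace of the Creative Commons RSS module (assumed here; verify before use).
CC_NS = "http://backend.userland.com/creativeCommonsRssModule"
# Illustrative license choice: Attribution-NonCommercial-ShareAlike.
LICENSE_URL = "http://creativecommons.org/licenses/by-nc-sa/2.5/"


def add_license(feed_xml: str, owner: str) -> str:
    """Return feed_xml with a channel-level copyright notice and CC license."""
    # Ask ElementTree to serialize the module with a readable prefix.
    ET.register_namespace("creativeCommons", CC_NS)
    root = ET.fromstring(feed_xml)
    channel = root.find("channel")
    if channel is None:
        raise ValueError("not an RSS 2.0 feed: no <channel> element")

    # RSS 2.0 defines an optional <copyright> element on the channel.
    copyright_el = channel.find("copyright")
    if copyright_el is None:
        copyright_el = ET.SubElement(channel, "copyright")
    copyright_el.text = f"Copyright {owner}. Licensed under {LICENSE_URL}"

    # Add (or update) the machine-readable license URL.
    license_tag = f"{{{CC_NS}}}license"
    license_el = channel.find(license_tag)
    if license_el is None:
        license_el = ET.SubElement(channel, license_tag)
    license_el.text = LICENSE_URL

    return ET.tostring(root, encoding="unicode")
```

Calling add_license(feed_text, "Your Name") on an existing feed returns the same feed with both elements in place. Running it twice updates the elements rather than duplicating them.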
How does Top10Sources carry out Palfrey's less restrictive view of RSS copyright?
"As the editor compiles the site, the editor sends out an e-mail to the person who appears to be responsible for the site, or, sometimes, posts a comment to say that the site has been chosen. The site renders a list of those sites offering the feeds as direct links to the page. The site also subscribes to those feeds and renders them all together on a single page."
So the site has adopted an opt-out model for aggregation. Top10Sources notifies the feed owner, and the owner has the responsibility of requesting that a feed be removed. As a practical matter, this is the only way to run an aggregator. As I've mentioned in other posts, my attempts to gain permission from feed owners in advance of launching my RubyRiver aggregator were met with an almost complete lack of response. RSS was built to promote syndication, and an aggregator is a valuable part of that model. Requiring an opt-in model would limit the potential of RSS, and stifle an important avenue for Internet communication. As Palfrey says, "fundamentally, RSS is ads" for the blog and aggregators are a vital channel for these ads.
One question left unanswered by Palfrey's response is the amount of a feed that should be republished, especially in light of the site's opt-out model. He admits that this is an evolving area:
"I expect to take up this issue again with the management team once again. I don't think there's anything being done wrong from the perspective of the law. But we should take up for discussion some of the ethical issues that Mike Rundle and Om Malik raise and suggestions that Adam Green makes about how much of a given feed that the site republishes -- maybe a truncated version of the feeds is the right thing to render."
This debate over aggregation will certainly continue, but for now I find it fascinating to watch Palfrey navigate the current controversy. From a PR perspective I give him an A. I attended his Harvard Extension School class on cyberlaw a few years ago (which probably accounts for the academic tone I find myself adopting here), and frankly, he is a lot more interesting now that he has to apply his legal theories to a company in which he holds an important stake. I wish all Harvard profs had this real-world opportunity. I hope people like Om Malik continue to hold his feet to the fire. The blogosphere will benefit from his involvement.
The fine line between plagiarism and aggregation
Posted on Monday, January 16, 2006 at 5:13 PM
As the publisher of a brand new RSS aggregator I'm sensitive to this issue. I fretted publicly over the issues of publishing excerpts versus full feeds. In the end I decided to err on the side of caution and only publish the first few sentences of each RSS item, and to strip out all HTML tags to make the post less functional. My reasoning was that this would force the users to visit the original site if they were interested in a post. I was also afraid of getting caught up in the current blog/splog controversy. Om Malik has been on the warpath about this issue since he discovered sites that were blatantly copying his feed and claiming it as their own. Now he has set his sights on the new aggregator TopTenSources. In the interests of full disclosure, I should say that I know some of the people involved in this site, including one of the investors, John Palfrey. It is precisely because I know them that I find it hard to believe that they are knowingly engaging in anything disreputable, let alone illegal. Palfrey is about as upright as they come, and along with being the director of the Berkman Center, he is a Harvard Law professor specializing in cyber law. So I'm going to assume that TopTenSources is fully compliant with the law. What remains to be determined is whether they have stepped over the bounds of accepted aggregator behavior.
The first thing to look at is whether they are republishing the RSS feeds as their own content. That was the thing that set Om off in the first place, and that drove others, like John Battelle, into his camp. TopTenSources clearly states their role as an aggregator on the home page:
"Top 10 Sources is a directory of sites that bring you the freshest, most relevant content on the Web. We know it's impossible for anyone to keep track of the 20 million+ online sources of information. So our editors search Web 2.0 -- blogs, podcasts, wikis, news sites, and every kind of syndicated sources online -- by hand."
The pages within the site don't include this statement, but each blog's feed items start with the name of the original blog. This blog name is not a link back to the owner's site, which is something I would change. Each item's headline is a link back to the original post. Overall, I'd say there is no attempt on the part of TopTenSources to blur the true owner of the feed items.
The more troublesome issue is TopTenSources' use of complete feeds, including images and links. In some cases the items republished are quite long. Again, I assume the legality of this use, but it does appear to step over the line of common behavior. I think they would be much safer only reprinting the first paragraph, especially in the current climate.
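To make the cautious alternative concrete, here is a rough Python sketch of the excerpting approach described earlier in this post: strip the HTML from an item and keep only the first few sentences. This is an illustration under my own assumptions, not the code any aggregator mentioned here actually runs. The excerpt function, the three-sentence default, and the naive sentence splitter are all hypothetical choices.

```python
import re
from html.parser import HTMLParser

# Tags that usually end a visual block; treated as whitespace when stripped.
_BLOCK_TAGS = {"p", "br", "div", "li", "blockquote",
               "h1", "h2", "h3", "h4", "h5", "h6"}


class _TextOnly(HTMLParser):
    """Collects text content and discards every tag and attribute."""

    def __init__(self):
        super().__init__(convert_charrefs=True)  # decode &amp; and friends
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in _BLOCK_TAGS:
            self.parts.append(" ")

    def handle_endtag(self, tag):
        if tag in _BLOCK_TAGS:
            self.parts.append(" ")

    def handle_data(self, data):
        self.parts.append(data)


def excerpt(item_html: str, max_sentences: int = 3) -> str:
    """Return the first max_sentences sentences of item_html as plain text."""
    parser = _TextOnly()
    parser.feed(item_html)
    parser.close()  # flush any buffered text
    # Collapse runs of whitespace left behind by removed tags.
    text = " ".join("".join(parser.parts).split())
    # Naive split: a sentence ends at ., ! or ? followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text)
    return " ".join(sentences[:max_sentences])
```

Because links and images are removed along with the rest of the markup, a reader who wants the whole post still has to follow the headline back to the original site. The splitter will stumble on abbreviations such as "Dr." but that matters little for a teaser.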
I hope this doesn't turn into another Tech Memeorandum firestorm, because that will make it harder for any of us who want to work in the area of online aggregation.