Spam: Web of Trust v0.2 Request for Comments

Table of Contents

  1. Brief
  2. Specifications
    1. RSS 2.0 Extension
    2. Rules
  3. Implementations
    1. PHP 4 Reference Implementation
    2. Geeklog 1.4.1 Implementation
  4. Who?
  5. License & Copyright
  6. Standards

Brief

SWOT is an RSS extension for sharing anti-spam blacklists. The idea is that each site in the web publishes an RSS feed of its latest blacklist items for other sites to subscribe to.

On its own, this would result in a few supernodes scattered around, like the old MT Blacklist: a lot of effort from a few people to provide a great service to many people.

People would then subscribe their site's anti-spam engine to the SWOT feeds they trust to publish good-quality blacklists.

We want this to be a distributed system without a single point of failure. We want it to be a web, so each site's feed will include not only the items that particular site has added itself, but also those it has imported from other sites.

This way, we avoid a situation where a node becomes saturated with traffic because it has a good blacklist and everyone subscribes to it. But we don't want to remove the trust element.

To manage this, we will include a "distance" count, starting at 0 for a site's own items and incrementing by one each time another site adds the item to its "pass on" list of blacklist items. We will also record the source. So, when I subscribe to the list at somesite.com, I can tell which items come directly from that site, which items come from other sites, and how far down the web of trust from somesite.com they are.

Then, when I subscribe to a particular site, I say that I trust that one site and any other sites it trusts, up to a specified distance. So, from somesite.com I might trust up to 5 hops away, but for someothersite.com I may only trust that site and one hop further away.

I might also decide that I will never trust items from nastysite.com.
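
As a rough illustration of the trust decision described above, the following Python sketch checks an item against a per-feed hop limit and a list of sources that are never trusted. The class name, method name and the /swot.xml paths are hypothetical and are not part of the specification; only the host names and the example limits of 5 and 1 come from the text.

```python
from dataclasses import dataclass, field


@dataclass
class TrustPolicy:
    """Hypothetical container for per-feed hop limits and blocked sources."""
    # Maps a subscribed feed URL to the largest hop count accepted from it.
    max_hops: dict = field(default_factory=dict)
    # Originating sources whose items are always ignored.
    blocked_sources: set = field(default_factory=set)

    def accepts(self, feed_url, source_url, hops):
        """Return True if an item seen in feed_url should be trusted."""
        if source_url in self.blocked_sources:
            return False
        limit = self.max_hops.get(feed_url)
        if limit is None:
            # Not a feed we have subscribed to, so nothing is trusted.
            return False
        return 0 <= hops <= limit


# The feed paths below are assumed for illustration only.
policy = TrustPolicy(
    max_hops={
        "http://somesite.com/swot.xml": 5,
        "http://someothersite.com/swot.xml": 1,
    },
    blocked_sources={"http://nastysite.com/swot.xml"},
)
```

With this policy, an item 5 hops away is accepted from somesite.com, an item 2 hops away is rejected from someothersite.com, and anything originating at nastysite.com is rejected regardless of which feed carried it.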

That's the idea. How do we implement it?

Specifications

RSS 2.0 Extension

SWOT is implemented as an extension for the RSS 2.0 syndication format. The SWOT namespace is added by a namespace declaration in the <rss> root node.

Blacklisted Regular Expressions

Blacklist regular expressions are passed in the <title> element of the <item> element. The <link> element of the <item> element is re-tasked to indicate the original source of the blacklist item (i.e. the URL of the feed that added it).

Two new elements in the SWOT namespace are added to the item. The first, <hops>, is a child element of the <item> element and indicates the distance between the feed and the site where the item originates: it has a value of 0 if the item was added by the site providing the feed, and greater than 0 for items syndicated into that site.

The second element is <action>, which may contain add, remove or modify. In the case of modify, a third element, <original>, is required to indicate which originally issued expression is being refined.
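
One possible reading of the three actions, sketched in Python: the local blacklist is held as a set of expressions, and modify is assumed to replace the <original> expression with the refined one from <title>. That replacement behaviour is an interpretation, not something this draft spells out, and the function name is hypothetical.

```python
def apply_action(blacklist, action, expression, original=None):
    """Apply one SWOT action to a set of blacklisted expressions.

    Assumption: "modify" swaps the original expression for the refined one.
    """
    if action == "add":
        blacklist.add(expression)
    elif action == "remove":
        # discard() does nothing if the expression was never listed.
        blacklist.discard(expression)
    elif action == "modify":
        if original is None:
            raise ValueError("modify requires the original expression")
        blacklist.discard(original)
        blacklist.add(expression)
    else:
        raise ValueError(f"unknown SWOT action: {action!r}")


blacklist = {"hold'em poker"}
apply_action(blacklist, "add", "pills")
apply_action(blacklist, "modify", "poker", original="hold'em poker")
apply_action(blacklist, "remove", "pills")
```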

Sample


        <rss version="2.0" xmlns:swot="http://swot.fuckingbrit.com">
            <channel>
                <title>SWOT Feed</title>
                <description>A shared, distributed blacklist feed.</description>
                <link>http://swot.fuckingbrit.com</link>
                
                <item>
                    <title>porn</title>
                    <link>http://www.fuckingbrit.com/backend/swot.xml</link>
                    <swot:hops>0</swot:hops>
                    <swot:action>add</swot:action>
                </item>
                <item>
                    <title>pills</title>
                    <link>http://www.geeklog.net/backend/swot.xml</link>
                    <swot:hops>1</swot:hops>
                    <swot:action>add</swot:action>
                </item>
                <item>
                    <title>casinos</title>
                    <link>http://www.fuckingbrit.com/backend/swot.xml</link>
                    <swot:hops>0</swot:hops>
                    <swot:action>add</swot:action>
                </item>
                <item>
                    <title>poker</title>
                    <link>http://www.fuckingbrit.com/backend/swot.xml</link>
                    <swot:hops>0</swot:hops>
                    <swot:action>modify</swot:action>
                    <swot:original>hold'em poker</swot:original>
                </item>
                <item>
                    <title>syndication</title>
                    <link>http://www.geeklog.info/backend/swot.xml</link>
                    <swot:hops>2</swot:hops>
                    <swot:action>remove</swot:action>
                </item>
            </channel>
        </rss>
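
A feed like the sample can be read with Python's standard xml.etree.ElementTree module. The sketch below is only illustrative: the function name and the dictionary keys are hypothetical, and it assumes every item carries a <swot:hops> element (a missing one raises an error rather than being guessed).

```python
import xml.etree.ElementTree as ET

# Namespace URI as declared in the sample feed's <rss> element.
SWOT_NS = {"swot": "http://swot.fuckingbrit.com"}


def parse_swot_feed(xml_text):
    """Return one dict per <item>, with the hop count converted to an int."""
    root = ET.fromstring(xml_text)
    items = []
    for item in root.iterfind("./channel/item"):
        items.append({
            "expression": item.findtext("title"),
            "source": item.findtext("link"),
            # int(None) raises TypeError if <swot:hops> is absent.
            "hops": int(item.findtext("swot:hops", namespaces=SWOT_NS)),
            "action": item.findtext("swot:action", namespaces=SWOT_NS),
            # None unless the item is a "modify".
            "original": item.findtext("swot:original", namespaces=SWOT_NS),
        })
    return items


sample = """<rss version="2.0" xmlns:swot="http://swot.fuckingbrit.com">
  <channel>
    <title>SWOT Feed</title>
    <item>
      <title>pills</title>
      <link>http://www.geeklog.net/backend/swot.xml</link>
      <swot:hops>1</swot:hops>
      <swot:action>add</swot:action>
    </item>
    <item>
      <title>poker</title>
      <link>http://www.fuckingbrit.com/backend/swot.xml</link>
      <swot:hops>0</swot:hops>
      <swot:action>modify</swot:action>
      <swot:original>hold'em poker</swot:original>
    </item>
  </channel>
</rss>"""

items = parse_swot_feed(sample)
```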
    


Rules

It is intended that the following rules are applied when producing your feeds:

  1. The latest items in your SWOT blacklist should be exported, regardless of source, in the order they were added to your site.
  2. The link element must be the URL of the RSS feed originating the regular expression in the title element. In the case of items added to your site by yourself, that will be the URL of the feed you are creating.
  3. The hops element must contain the number of hops from your feed to the originating feed, i.e. 0 for your original items and, for imported items, the hops count given in the feed you imported them from plus one.
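
The three producing rules could be sketched in Python as follows. The record layout (expression, source, received_hops, added), the function name and the default limit of 20 are all hypothetical. The sketch also assumes that "in the order they were added" means oldest first among the latest items, which the rules do not state explicitly.

```python
def export_items(entries, my_feed_url, limit=20):
    """Build outgoing feed items from locally stored blacklist entries.

    Each entry is a dict with:
      expression     the blacklisted regular expression
      source         originating feed URL, or None for our own items
      received_hops  hop count as it appeared in the imported feed
      added          sortable timestamp of when we stored the entry
    """
    # Rule 1: latest items regardless of source, in the order they were added.
    latest = sorted(entries, key=lambda e: e["added"])[-limit:]
    exported = []
    for entry in latest:
        own = entry["source"] is None
        exported.append({
            "title": entry["expression"],
            # Rule 2: our own feed URL for our items, else the originating feed.
            "link": my_feed_url if own else entry["source"],
            # Rule 3: 0 for our items, otherwise the received count plus one.
            "hops": 0 if own else entry["received_hops"] + 1,
        })
    return exported


entries = [
    {"expression": "pills",
     "source": "http://www.geeklog.net/backend/swot.xml",
     "received_hops": 0, "added": 2},
    {"expression": "porn", "source": None,
     "received_hops": None, "added": 1},
]
exported = export_items(entries, "http://www.fuckingbrit.com/backend/swot.xml")
```

Here the imported "pills" entry goes out with a hops value of 1 and its original geeklog.net link, matching the second item of the sample feed.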

It is intended that the following rules are applied when importing feeds:

  1. When adding new items to your site, store them with a datestamp. You do not necessarily need to store them alongside imported items, but if you do, you must be able to tell your own items from imported ones.
  2. When reading in items from a feed, check whether the blacklist item already exists in your lists. If it does, and the new source is more trusted than the original source, update your datestamp, hops and source URL to those of the new source.
  3. When reading in items from a feed, you should check that the hops count is not greater than the trust level that you have defined for the feed source url.
  4. When adding a new feed to your list of feeds, you should check that you are not already receiving blacklist entries from that source and either warn or prevent the addition.
  5. It should be possible to mark a particular source of regular expressions as untrusted. Items originating from that source should be ignored by your system, no matter who proxies them to it.
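
Import rules 2 and 4 might look something like the Python sketch below. The draft does not define "more trusted", so the sketch assumes that a lower hops count means a more trusted source. The store layout and both function names are hypothetical.

```python
def merge_item(store, expression, source_url, hops, seen_at):
    """Rule 2: add a new entry, or update an existing one from a better source.

    store maps expression -> {"source", "hops", "stamp"}.
    Assumption: fewer hops means "more trusted".
    Returns True if the store was changed.
    """
    current = store.get(expression)
    if current is None or hops < current["hops"]:
        store[expression] = {"source": source_url, "hops": hops, "stamp": seen_at}
        return True
    return False


def already_received(store, feed_url):
    """Rule 4: True if we already hold entries that originate from feed_url."""
    return any(entry["source"] == feed_url for entry in store.values())


store = {}
merge_item(store, "pills", "http://www.geeklog.net/backend/swot.xml", 2, "2008-01-01")
# Same expression seen again at a greater distance: ignored.
merge_item(store, "pills", "http://www.geeklog.net/backend/swot.xml", 3, "2008-01-02")
# Seen again at a shorter distance: entry is updated.
merge_item(store, "pills", "http://www.geeklog.net/backend/swot.xml", 1, "2008-01-03")
```

An importer would call already_received before subscribing to a new feed and either warn the user or refuse the subscription if it returns True.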

Implementations

PHP 4 Reference Implementation

Geeklog 1.4.1 Implementation

Who?

This was inspired by Jay Allen's MT Blacklist and Tom Willet's Spam-X plugin for Geeklog. However, the idea and development are so far 100% done by me, Michael Jervis. Any questions, comments, suggestions, etc. go to mike AT fuckingbrit DOT com.

License & Copyright

Copyright © 2006-2008 Michael Jervis. License: Currently no license to do anything with anything is granted to anybody. This is a specification published for comment. When I have something that is suitable for licensing I will issue it under a suitably open and free license.

Standards

This page is written to comply with XHTML 1.1 Strict, and is re-validated each time I update it. You can check here. This page is also written with no style, images or chuff, so it is pretty easy to make it WAI AAA compliant. You can check that here. It is also Section 508 validated; you can check that here.