Example spam data

Download data set

The first value in each row is the class label, in {0,1}, where 0 is "not spam" and 1 is "spam". The remaining values in each row are {0,1} values indicating the absence or presence of words in the message. The top 2000 features were selected using information gain. Examples are randomly ordered. There are 628 positive (spam) examples out of 2000. The data is MATLAB-readable, with whitespace-separated values.
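For readers not using MATLAB, the row layout described above can be parsed with a few lines of Python. This is a minimal sketch, not an official loader: the function name `parse_spam_data` is hypothetical, and the in-memory sample stands in for the real file so the snippet runs on its own. It assumes values may be written either as plain integers or in a floating-point style such as `1.0000000e+00`.

```python
import io


def parse_spam_data(stream):
    """Parse whitespace-separated 0/1 rows into (labels, features).

    The first value of each row is the class label (1 = spam, 0 = not spam);
    the remaining values are binary word-presence features.
    """
    labels, features = [], []
    for line in stream:
        values = line.split()
        if not values:
            continue  # skip blank lines
        # int(float(...)) accepts both "1" and "1.0000000e+00"
        row = [int(float(v)) for v in values]
        labels.append(row[0])
        features.append(row[1:])
    return labels, features


# Tiny made-up sample with 3 features per row (the real data has 2000).
sample = io.StringIO("1 0 1 1\n0 0 0 1\n\n1 1 0 0\n")
labels, features = parse_spam_data(sample)
spam_fraction = sum(labels) / len(labels)
```

To use it on the downloaded file, pass an open file object instead of the sample, for example `parse_spam_data(open(path))` with `path` set to wherever the file was saved. Given the description above, `len(labels)` should then be 2000 and `sum(labels)` should be 628.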