SVMsequel Documentation

Hal Daume III



First Release 31 March 2004

Introduction

SVMsequel is a complete environment for training and using support vector machines. Some familiarity with kernel methods will be helpful (see here for class notes I've used on SVMs for natural language processing, if you need a refresher).

Documentation

There is far too much to document here, so please see svmsequel.ps or svmsequel.pdf for relevant documentation on how to use SVMsequel.

I will say that there is really no reason to ever use SVMseq again. You really should switch over to this new program. It's orders of magnitude faster.

Currently SVMsequel doesn't support ranking (likely to come soon) or regression (unlikely to come soon). It's very fast (did I say that yet?) and handles enormous datasets very nicely. It also supports multiclass classification and probabilistic classification.

Kernels available include:

Linear, Polynomial, RBF, Sigmoid

Information Diffusion on discrete manifolds

Information Diffusion on the n-simplex

String kernels (based on dynamic programming) -- O(n*m)

String kernels (based on suffix-trees) -- O(n+m)

Tree kernels (as above)
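
As a refresher on the four standard kernels in the first entry above, here is a minimal Python sketch of their usual textbook definitions. This is not SVMsequel's own code: the function names and the parameters gamma, coef0 and degree are assumptions chosen for illustration, and SVMsequel's actual parameterization may differ (see svmsequel.ps or svmsequel.pdf).

```python
import math


def dot(x, y):
    """Inner product of two equal-length numeric vectors."""
    return sum(a * b for a, b in zip(x, y))


def linear_kernel(x, y):
    # K(x, y) = <x, y>
    return dot(x, y)


def polynomial_kernel(x, y, degree=2, gamma=1.0, coef0=1.0):
    # K(x, y) = (gamma * <x, y> + coef0) ^ degree
    # (parameter names are illustrative assumptions)
    return (gamma * dot(x, y) + coef0) ** degree


def rbf_kernel(x, y, gamma=1.0):
    # K(x, y) = exp(-gamma * ||x - y||^2)
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


def sigmoid_kernel(x, y, gamma=1.0, coef0=0.0):
    # K(x, y) = tanh(gamma * <x, y> + coef0)
    return math.tanh(gamma * dot(x, y) + coef0)
```

Each function takes two feature vectors and returns a single similarity score; for example, linear_kernel([1, 2], [3, 4]) evaluates 1*3 + 2*4. The string, tree and information diffusion kernels listed above operate on structured inputs and are not covered by this sketch.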

Download

You can download the source, or binaries for i686 Linux or Sun4 Solaris. To compile it you will need an O'Caml compiler.

Bugs

If you observe any bugs (things that say "Internal error"), whether or not they are replicable, please send me the relevant files and the last command you executed. If possible, run save environment and just send me the environment and the command. Also let me know what architecture/OS you are using.

Frequently Asked Questions

Any question that I've received by email is posted here (senders remain anonymous). So far most of these have to do with the nonstandard kernels.