SpamAssassin
sa-learn (Bayes learning), version 3.2.4

sa-learn [options] [file]...

Given a "typical" selection of incoming mail classified as spam or ham (non-spam), feed each mail to SpamAssassin, allowing it to 'learn' what signs are likely to mean spam, and which are likely to mean ham.

Run this command once for each mail folder, and it will 'learn' from the mail it contains.

Globbing of the folder is supported; * will scan every folder that matches. See Mail::SpamAssassin::ArchiveIterator for more details.

SpamAssassin remembers which mail messages it has already learnt and will not re-learn them unless you use the --forget option. Messages learnt as spam will have SpamAssassin markup removed on the fly.

If you make a mistake and scan a mail as ham when it is spam, or vice versa, simply rerun this command with the correct classification and the mistake will be corrected. SpamAssassin will automatically 'forget' the previous indications.
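The training workflow described above can be sketched as a small wrapper. This is only an illustration: the function name train, the SA_LEARN variable and the folder paths are assumptions, while --spam, --ham, --mbox and --showdots are taken from the option list on this page.

```shell
#!/bin/sh
# SA_LEARN is a hypothetical override: set it to "echo" for a dry run
# that prints the arguments instead of calling sa-learn.
SA_LEARN="${SA_LEARN:-sa-learn}"

# train KIND PATH...   (KIND is "spam" or "ham")
train() {
    kind="$1"
    shift
    case "$kind" in
        spam|ham) ;;
        *) echo "train: kind must be spam or ham" >&2; return 2 ;;
    esac
    # Extra options such as --mbox can be passed before the paths.
    $SA_LEARN "--$kind" --showdots "$@"
}

# Hypothetical usage (paths are placeholders):
#   train spam ~/Maildir/.Junk/cur
#   train ham  ~/Maildir/cur
#   train spam --mbox ~/mail/spam.mbox
```

To correct a message that was learnt with the wrong classification, calling train again with the other kind on the same message should be enough, since the previous indication is forgotten automatically.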

Users of spamd who want to perform training remotely, over a network, should see the spamc -L switch.

--ham
    Learn messages as ham (non-spam)
--spam
    Learn messages as spam
--mbox
    Input sources are in mbox format
--mbx
    Input sources are in mbx format
-f file, --folders=file
    Read list of files/directories from file
--forget
    Forget a message from STDIN
--use-ignores
    Use bayes_ignore_from and bayes_ignore_to
--sync
    Synchronize the database and the journal if needed
--no-sync
    Skip synchronizing the database and the journal
--force-expire
    Force a database sync and expiry run
--dump [all|data|magic]
    Display the contents of the Bayes database
--regexp re
    For --dump, specifies which tokens to dump
--showdots
    Show progress using dots
--progress
    Show progress using a progress bar
-L, --local
    Operate locally, no network accesses
--import
    Migrate data from an older version or from non-DB_File based databases
--clear
    Wipe out existing database
--backup
    Back up existing database to STDOUT (redirect with > file)
--restore filename
    Restore a database from filename
--dbpath
    Allows command-line override (in bayes_path form) for where to read the Bayes DB from
-u username, --username=username
    Override username taken from the runtime environment, used with SQL
-C path, --configpath=path, --config-file=path
    Path to standard configuration dir
-p file, --prefspath=file, --prefs-file=file
    Set user preferences file, default: ~/.spamassassin/user_prefs
--siteconfigpath=path
    Path for site configs, default: /etc/mail/spamassassin
--cf='config line'
    Additional line of configuration
-D, --debug [area=n,...]
    If no areas are listed, all debugging information is printed.
    Diagnostic output can also be enabled for each area individually, for example:
        spamassassin -D bayes,learn,dns
    For more information about which areas (also known as channels) are available, please see: wiki.apache.org/spamassassin/DebugChannels
-V, --version
    Print version
-h, --help
    Print usage message
 

 sa-learn --dump magic
 0.000          0          3          0  non-token data: bayes db version
 0.000          0     261396          0  non-token data: nspam
 0.000          0      18089          0  non-token data: nham
 0.000          0     148790          0  non-token data: ntokens 
 0.000          0 1230126517          0  non-token data: oldest atime
 0.000          0 1236139617          0  non-token data: newest atime
 0.000          0 1236140767          0  non-token data: last journal sync atime
 0.000          0 1235651034          0  non-token data: last expiry atime
 0.000          0    5529600          0  non-token data: last expire atime delta
 0.000          0      10952          0  non-token data: last expire reduction count
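
The magic values above are easier to read once reduced to a few figures. The following is a rough sketch, not part of sa-learn: the function name magic_summary is an assumption, and it relies only on the layout shown in the sample (value in the third column, label after "non-token data:").

```shell
#!/bin/sh
# magic_summary: read `sa-learn --dump magic` output on stdin and print
# the message counts, the spam share and the token atime window in days.
magic_summary() {
    awk '
        # The third field holds the value for each non-token line.
        /non-token data: nspam/        { nspam = $3 }
        /non-token data: nham/         { nham = $3 }
        /non-token data: ntokens/      { ntokens = $3 }
        /non-token data: oldest atime/ { oldest = $3 }
        /non-token data: newest atime/ { newest = $3 }
        END {
            printf "spam=%d ham=%d tokens=%d\n", nspam, nham, ntokens
            if (nspam + nham > 0)
                printf "spam share=%.1f%%\n", 100 * nspam / (nspam + nham)
            # atimes are Unix timestamps, so the difference is in seconds.
            printf "atime window=%.1f days\n", (newest - oldest) / 86400
        }'
}

# Hypothetical usage:
#   sa-learn --dump magic | magic_summary
```

For the sample shown here, this reports a spam share of 93.5% and an atime window of about 69.6 days between the oldest and newest token.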
 

 sa-learn --dump data
0.062       2274       2400 1236137441  c0614089c0
0.507      13202        889 1236085607  2dd27dc5f9
0.001          2        226 1236119931  461312c98e
0.003          8        170 1235173556  262e33315c
…     148790 lines !
0.016          0          1 1235685679  da21efbad4
0.016          0          1 1235706793  2f834646e6
0.987          1          0 1235769482  ab8e7006c3
0.987          1          0 1236111118  77f749b43b
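
Judging by the sample above, each data line holds a probability, the token's spam count, its ham count, an atime and a hashed token. The sketch below recomputes the first column from the two counts and the nspam/nham totals. It assumes Robinson-style smoothing with the constants 0.030 and 0.538; these are assumptions that happen to reproduce the sample values, not something stated on this page. The function name token_prob is also hypothetical.

```shell
#!/bin/sh
# token_prob NSPAM NHAM: read `sa-learn --dump data` lines on stdin and
# print the recomputed probability next to the reported one.
token_prob() {
    awk -v ns="$1" -v nn="$2" '
        # Only handle lines that look like: prob spam ham atime token
        NF == 5 && $1 ~ /^[0-9.]+$/ {
            rs = $2 / ns              # fraction of spam messages with this token
            rn = $3 / nn              # fraction of ham messages with this token
            if (rs + rn == 0) next    # token never counted; nothing to compute
            p = rs / (rs + rn)        # raw spam probability
            n = $2 + $3               # total occurrences
            # Assumed smoothing constants: strength 0.030, prior 0.538.
            f = (0.030 * 0.538 + n * p) / (0.030 + n)
            printf "%s reported=%s recomputed=%.3f\n", $5, $1, f
        }'
}

# Hypothetical usage with the totals from `sa-learn --dump magic`:
#   sa-learn --dump data | token_prob 261396 18089
```

Tokens seen only once land near the extremes (0.016 or 0.987 in the sample) rather than at exactly 0 or 1, which is the effect of the smoothing term.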

SpamAssassin database: sa-learn sync

Output from sa-learn --sync -D:
[13046] dbg: locker: safe_lock: link to /home/dauser/.spamassassin/bayes.lock: link ok
[13046] dbg: bayes: tie-ing to DB file R/W /home/dauser/.spamassassin/bayes_toks
[13046] dbg: bayes: tie-ing to DB file R/W /home/dauser/.spamassassin/bayes_seen
[13046] dbg: bayes: found bayes db version 3
[13046] dbg: locker: refresh_lock: refresh /home/dauser/.spamassassin/bayes.lock
[13046] dbg: bayes: DB expiry: tokens in DB: 148790, Expiry max size: 150000, 
                   Oldest atime: 1230126517, 
                   Newest atime: 1236139617, 
                   Last expire:  1235651034,    
                   Current time: 1236140791
[13046] dbg: bayes: expiry completed
[13046] dbg: bayes: untie-ing
[13046] dbg: bayes: files locked, now unlocking lock
[13046] dbg: locker: safe_unlock: unlink /home/dauser/.spamassassin/bayes.lock

Backup format (as written by sa-learn --backup):

v   3   db_version # this must be the first line!!!
v   261603  num_spam
v   19752   num_nonspam
t   2275    2542    1239258238  c0614089c0
t   13207   1334    1239311789  2dd27dc5f9
t   2   275 1239303031  461312c98e
t   8   189 1239254329  262e33315c
t   1   107 1239123759  91919f0fac
t   3   206 1239312110  90775ea219

…

s   s   d6dafc819eb1dcefde08076645bc9f8ed412519b@sa_generated
s   s   f0a5362fbb8e433dd8378e7986bb8d80edffbc03@sa_generated
s   h   266cee79307b372d7104c3dae598dca8acb9e15e@sa_generated
s   s   00ff668488135f17f4ac7c5a2ab7f573abfbef2f@sa_generated
s   s   0691c29973f6f3e9cb873387c3ede6bba6479581@sa_generated
s   s   5f1ec60edd55cb841c6dc14f014363350eb80453@sa_generated
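
A backup file in this format can be sanity-checked before it is restored. The sketch below is a guess at a useful summary based only on the sample: v lines are read as variables, t lines as tokens, and s lines as seen messages flagged s (spam) or h (ham). The function name backup_summary and the file name are hypothetical; --backup and --restore are the options listed on this page.

```shell
#!/bin/sh
# backup_summary [FILE]: summarise a file produced by
# `sa-learn --backup > FILE` (reads stdin when no file is given).
backup_summary() {
    awk '
        $1 == "v"              { vars[$3] = $2 }   # e.g. "v 3 db_version"
        $1 == "t"              { tokens++ }        # one line per token
        $1 == "s" && $2 == "s" { seen_spam++ }     # message learnt as spam
        $1 == "s" && $2 == "h" { seen_ham++ }      # message learnt as ham
        END {
            printf "db_version=%s num_spam=%s num_nonspam=%s\n",
                vars["db_version"], vars["num_spam"], vars["num_nonspam"]
            printf "tokens=%d seen_spam=%d seen_ham=%d\n",
                tokens, seen_spam, seen_ham
        }' "$@"
}

# Hypothetical round trip (the file name is a placeholder):
#   sa-learn --backup > bayes-backup.txt
#   backup_summary bayes-backup.txt
#   sa-learn --restore bayes-backup.txt
```

Comparing the token count against ntokens from sa-learn --dump magic is one quick way to notice a truncated backup.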