Remote Rspamd learn spam and ham

Question asked by Roger - 4/7/2023 at 5:19 AM

Unanswered

Hello everyone,

I have integrated a remote Rspamd into the latest SmarterMail version and enabled it for the SpoolFilter. Scanning works via port 11333 on /checkv2, but how does it work with the following endpoints?

/learnspam

/learnham

Is there any documentation on which path these should use on Rspamd and how they are implemented or set up?

Thanks and greetings

Roger

16 Replies


Webio Replied

4/7/2023 at 12:28 PM

IMHO you should just enter the endpoint names you listed here.

More info in the Rspamd documentation:

https://rspamd.com/doc/architecture/protocol.html

/learnspam - Trains bayes classifier on spam message
/learnham - Trains bayes classifier on ham message
/checkv2 - Checks message and return action (same as normal worker)

This should work when a webmail user marks a message as spam or not spam. IMHO it works just like the Scan/Learn section of the Rspamd GUI, where you provide the message source and press the Upload HAM or Upload SPAM button.

EDIT: But I haven't used it yet since I only use latest SmarterMail builds on my incoming gateways.
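
For readers who want to try the learn endpoints by hand, here is a minimal sketch of posting a raw message with curl. The RSPAMD_URL variable and the learn_message function name are made up for illustration, and the default of 127.0.0.1:11334 assumes the controller worker is listening on its usual port on the same machine. It also assumes the controller accepts requests from your IP without a password.

```shell
#!/bin/sh
# Sketch: POST a raw message to the Rspamd controller for Bayes training.
# RSPAMD_URL is an assumed variable; adjust host and port to your setup.
RSPAMD_URL="${RSPAMD_URL:-http://127.0.0.1:11334}"

# learn_message KIND FILE
#   KIND is "spam" or "ham"; FILE is a raw message (.eml) on disk.
learn_message() {
    case "$1" in
        spam|ham) ;;
        *) echo "usage: learn_message spam|ham FILE" >&2; return 2 ;;
    esac
    # --data-binary sends the file unmodified as the request body.
    curl --silent --show-error --data-binary "@$2" "$RSPAMD_URL/learn$1"
}
```

Example call: learn_message spam /path/to/message.eml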

Roger Replied

4/8/2023 at 11:01 AM

Thank you for the information. If I understood the following page correctly, it requires appropriate privileges for /learnspam and /learnham.

https://rspamd.com/doc/workers/controller.html

Is it correct that I need to create a key pair for /learnspam and /learnham? If so, where do I enter the public key in SmarterMail so that it is authorized to access these two endpoints?

As far as I understand, you can also whitelist the IP address of the system with secure_ip.

/auth
/symbols
/actions
/maps
/getmap
/graph
/pie
/history
/historyreset (priv)
/learnspam (priv)
/learnham (priv)
/saveactions (priv)
/savesymbols (priv)
/savemap (priv)
/scan
/check
/checkv2
/stat
/statreset (priv)
/counters
/metrics

I ran some tests with these commands using Insomnia; /checkv2 and /scan worked:

Commands like /counters and /stat, and especially /learnspam and /learnham, return an error (see screenshot):

greetings

Roger Replied

4/9/2023 at 7:38 AM

I tried a bit more and found that port 11334 works where 11333 does not: /learnspam and /learnham can be reached if the IP address of the SmarterMail server is entered in the Rspamd configuration with secure_ip = "xxx.xxx.xxx.xxx".

Using curl and the Insomnia tool I can now reach them, and it seems to work.

Also on the server, rspamc learn_spam /var/www/folder/spam/ can now be used to learn a directory of spam mails.

However, in the SmarterMail logs I get this error message when a user clicks on the "JunkMail" button on a mail via the webmail:

[2023.04.09] on MailService.Spam.RspamdClient.<ReportSpamOrHam>d__20.MoveNext()

Rspamd then does not seem to have learned this message as spam.

Webio Replied

5/27/2023 at 10:50 PM

Hello Roger,

have you solved this issue? I've updated SmarterMail to the latest version (8545) and added an rspamd instance (using port 11334) to my main server. I use incoming gateways to do all spam checks, but the option "Send user spam feedback to antispam providers" is enabled. When I click Move to Junk I don't see any JS errors, but the Bayesian statistics do not change.

A System Administrator Replied

5/31/2023 at 11:26 AM

Just adding that we have an Ubuntu-based rspamd setup running, built (mostly) from the SmarterTools guide, and we have also seen that nothing is learned when using the "Move to Junk" button. The rspamd web console shows that scoring works fine, but nothing is learned (apart from my manual uploads). I can see the following in the "Error" logs:

[2023.05.31] 07:26:20.629 Response status code does not indicate success: 500 (invalid command).
[2023.05.31] at System.Net.Http.HttpResponseMessage.EnsureSuccessStatusCode()
[2023.05.31] at MailService.Spam.RspamdClient.<ReportSpamOrHam>d__20.MoveNext()

I've confirmed my "Send user spam feedback to antispam providers" is enabled and our rspamd server settings look like this:

Webio Replied

6/1/2023 at 12:19 AM

IMHO, just like Roger S. said, you should use port 11334, since /learnspam is served on the controller port. According to the documentation it is one of the controller's endpoints:

https://rspamd.com/doc/architecture/protocol.html#controller-http-endpoints

but anyway, this alone does not make the rspamd filter learn spam sent by SmarterMail.

Gabriele Maoret - SERSIS Replied

6/1/2023 at 12:42 AM

I think the SmarterTools team should put effort into making a WELL DONE guide on how to set up an RSPAMD server (maybe with Debian, not with Ubuntu...) and connect it properly to SmarterMail...

The current guide takes too many things for granted and also lacks some configuration and information regarding the connection between SmarterMail and RSPAMD...

Gabriele Maoret - Head of SysAdmins and CISO at SERSIS Currently manages 6 SmarterMail installations (1 in the cloud for SERSIS which provides services to a few hundred third-party email domains + 5 on-premise for customers who prefer to have their mail server in-house)

Webio Replied

6/1/2023 at 3:33 AM

I would not set this as a top priority for them. This is a separate product, like SpamAssassin. If they create a KB entry for it, they will have to track possible changes in rspamd. The SmarterMail configuration itself is not difficult. It just does not work for spam learning. Scanning works great.

It is possible that learning ham/spam should use a different port than scanning, according to this doc:

https://rspamd.com/doc/workers/

normal: this worker is designed to scan mail messages
controller: this worker performs configuration actions, such as learning, adding fuzzy hashes and serving web interface requests

Normal is on port 11333 and controller is on port 11334.

But learning is still broken IMHO. I'm using port 11333 on my incoming gateways and 11334 on my main SmarterMail instance. Scanning works, learning does not (and reading the spam score passed from the gateways is also currently broken in build 8545).

A System Administrator Replied

6/1/2023 at 3:49 PM

I'm pretty sure the issue is that the SmarterMail settings page for rspamd servers has only a single field for the server address (which is where you include the port). However, there are two workers that handle tasks, and they do not use the same port (they cannot both listen for HTTP on the same port).

As Webio mentioned, the "normal" worker handles scanning messages and defaults to 11333 while the "controller" worker handles learning + web UI and defaults to 11334. We need some way to specify a unique path (including port) for each.
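
To make the split concrete, here is a small sketch that maps each endpoint mentioned in this thread to the port of the worker that serves it by default. The function names port_for_endpoint and endpoint_url are hypothetical, and the ports are the defaults quoted above; a customised bind_socket would change them.

```shell
#!/bin/sh
# Sketch: map an endpoint to the default port of the worker that serves it.
# Ports are the defaults named in the thread: normal 11333, controller 11334.
port_for_endpoint() {
    case "$1" in
        /learnspam|/learnham) echo 11334 ;;   # controller worker (learning, web UI)
        /checkv2)             echo 11333 ;;   # normal worker (scanning)
        *) echo "unknown endpoint: $1" >&2; return 1 ;;
    esac
}

# endpoint_url HOST ENDPOINT: build the full URL for a host and endpoint.
endpoint_url() {
    port=$(port_for_endpoint "$2") || return 1
    printf 'http://%s:%s%s\n' "$1" "$port" "$2"
}
```

A single address field can only hold one of these host and port pairs, which is the limitation described above.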

Webio Replied

6/5/2023 at 12:08 PM

There is one more thing I would like you to check on your end: can you disable rspamd in the spam checks, keep "Send user spam feedback to antispam filters" enabled, and check whether you see any activity in rspamd from the SmarterMail IP address? Since my incoming gateways perform all spam scanning, I don't do any scanning on the main SmarterMail instance, but even with no checks enabled in the Spam Checks section I see scanning in the rspamd logs.

Webio Replied

6/11/2023 at 4:46 AM

Can any of you who are trying to make this work check the rspamd logs for errors like:

Skip spam sample to keep spam/ham balance; too many spam samples: 1001

I've modified the neural config file according to one of the configs available here:

https://www.rspamd.com/doc/modules/neural.html

and now, when I check the rspamd logs, instead of the error mentioned above I see lines like:

tail -f -n 40 /var/log/rspamd/rspamd.log | grep learned

2023-06-11 09:48:21 #24292(controller) <a85e3b>; csession; rspamd_controller_learn_fin_task: <LOCALIPOFSMARTERMAIL> learned message as spam: fosVXYHK.....
2023-06-11 09:57:02 #24292(controller) <2a4f6f>; csession; rspamd_controller_learn_fin_task: <LOCALIPOFSMARTERMAIL> learned message as spam: 16864336.....
2023-06-11 10:24:22 #24292(controller) <bc1df0>; csession; rspamd_controller_learn_fin_task: <LOCALIPOFSMARTERMAIL> learned message as spam: synerise.....
2023-06-11 10:33:03 #24292(controller) <052d8d>; csession; rspamd_controller_learn_fin_task: <LOCALIPOFSMARTERMAIL> learned message as spam: 20230606.....
2023-06-11 10:33:03 #24292(controller) <003f14>; csession; rspamd_controller_learn_fin_task: <LOCALIPOFSMARTERMAIL> learned message as spam: 20230603.....
2023-06-11 10:35:43 #24292(controller) <7869f4>; csession; rspamd_controller_learn_fin_task: <LOCALIPOFSMARTERMAIL> learned message as spam: 1637fc51.....
2023-06-11 10:35:43 #24292(controller) <44719c>; csession; rspamd_controller_learn_fin_task: <LOCALIPOFSMARTERMAIL> learned message as spam: 6x58309......
2023-06-11 11:11:05 #24292(controller) <fe2471>; csession; rspamd_controller_learn_fin_task: <LOCALIPOFSMARTERMAIL> learned message as spam: 00000000.....
2023-06-11 12:35:17 #24292(controller) <0ba52e>; csession; rspamd_controller_learn_fin_task: <LOCALIPOFSMARTERMAIL> learned message as spam: 6x58309......
2023-06-11 13:22:09 #24292(controller) <cfa69c>; csession; rspamd_controller_learn_fin_task: <LOCALIPOFSMARTERMAIL> learned message as spam: 4eaff4e7.....
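
To get a quick tally instead of scrolling through the log, lines like the ones above can be counted with awk. This is only a sketch: count_learned is a made-up name, and the ham wording ("learned message as ham:") is assumed to mirror the spam lines shown above, since no ham line appears in this thread.

```shell
#!/bin/sh
# Sketch: count controller "learned message as ..." lines read from stdin.
# The ham pattern is an assumption modelled on the spam lines in the thread.
count_learned() {
    awk '
        /learned message as spam:/ { spam++ }
        /learned message as ham:/  { ham++ }
        END { printf "spam=%d ham=%d\n", spam, ham }
    '
}
```

Example call: count_learned < /var/log/rspamd/rspamd.log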

David O'Leary Replied

12/17/2024 at 10:04 AM

Has anyone been able to get learnspam and learnham working? If so, how?

Owner of Efficion Consulting

Roger Replied

2/13/2025 at 5:28 AM

not yet

Ben Rowland Replied

3/12/2025 at 7:33 PM

Yes, I have been able to get this working. I was a little confused by the directions, but the key is understanding that SM needs to communicate with rspamd on port 11334. To do that, the SM IP address must be whitelisted, because it is otherwise blocked. Yes, 11334 is the same port as the web UI, which is why this seems odd.

I found that with SM set to communicate on 11333, it worked on the spool, but the /learnspam and /learnham endpoints did not work. Apparently those are used by the cron job that runs every 5 minutes and sends spam and ham samples to rspamd. So to get this to work, SM must be set to use 11334 and communication from SM to rspamd must be whitelisted.

I am using a private network, so in my case the SM server at 10.0.0.8 communicates with the rspamd server at 10.0.0.9.

I found that I had to get rspamd to listen on more than just localhost so that I could establish communication from my SM server, so I changed the binding in /etc/rspamd/rspamd.conf (which had been "localhost:11334"):

bind_socket = "*:11334";

To check, you can curl your rspamd server from SM:

curl http://10.0.0.9:11334/learnspam

If this communication works, it will say:

{"error":"Empty body is not permitted"}

In my case, it initially said not authorized. So, in /etc/rspamd/local.d/worker-controller.inc you must whitelist the IP address of your SM mail server. Thus add:

secure_ip = "10.0.0.8";

With that done I restarted the rspamd service:

sudo systemctl restart rspamd

Then I marked some spam, and waited while tailing the log file:

tail -f /var/log/rspamd/rspamd.log

tail -f /var/log/rspamd/rspamd.log | grep [spam email here]

Finally, you can confirm that the keys have been added to redis:

redis-cli keys "*"

The web UI also shows them as "Learns" under the Bayesian statistics widget.
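
The curl probe from this post can be wrapped in a small helper that interprets the reply. This is a sketch under assumptions: classify_learn_probe is a hypothetical name, only the "Empty body is not permitted" text is taken from the post, and any other reply is simply reported as a failure because the exact wording of the unauthorized response is not quoted here.

```shell
#!/bin/sh
# Sketch: interpret the body returned by an empty request to /learnspam.
# Only the "Empty body" text comes from the thread; other replies are
# treated as failures without assuming their exact wording.
classify_learn_probe() {
    case "$1" in
        *"Empty body is not permitted"*)
            echo "ok: controller reachable and this client is authorized" ;;
        "")
            echo "fail: no response (check bind_socket and firewall)" ;;
        *)
            echo "fail: unexpected reply (check secure_ip): $1" ;;
    esac
}
```

Example call from the SM server: classify_learn_probe "$(curl -s http://10.0.0.9:11334/learnspam)"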

David O'Leary Replied

3/13/2025 at 2:45 PM

Thank you Ben!!! Your post helped me get this working finally.
The main things I learned as I worked from your post:
1.) From your SmarterMail server, you should be able to open a browser and go to: [yourserver.com]:11334/checkv2 and get a message back that is along the lines of:

{"is_skipped":false,"score":15.0,"required_score":15.0,"action":"reject","thresholds":{"reject":15.0,"add header":6.0,"greylist":4.0},"symbols":{"COMPLETELY_EMPTY":{"name":"COMPLETELY_EMPTY","score":15.0,"metric_score":15.0}},"messages":{},"time_real":0.000653,"milter":{"remove_headers":{"X-Spam":0}}}

2.) For me, I had to change the bindings in /etc/rspamd/rspamd.conf from "localhost:11334" to "*:11334". After I did this, I was getting the "not authorized" message.

3.) To fix the "not authorized" message: in the /etc/rspamd/local.d/worker-controller.inc file, I had a comma-separated list of IPs for the secure_ip entry, and that wasn't working. When I switched it to the single primary IP of my SmarterMail server, it started working.

It is now officially working. Yay!

Owner of Efficion Consulting

Ben Rowland Replied

3/16/2025 at 7:19 AM

Great news that it’s working.

I just came across this documentation, which has some great ideas:

https://github.com/martinschaible/rspamd-installation-for-smartermail/wiki/Installation-and-Configuration-Rspamd

I can see that I have to make some improvements.

I got it working by modifying some of the configuration files but now see I need to modify the local.d configuration files instead. To test that the configuration changes were made, I found that I can use:

rspamadm configdump

It’s helpful to know if the settings change you make actually “take”.
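
One way to see whether a change "took" is to snapshot the dump before and after editing and compare the two. This is a sketch only; snapshot_config and config_diff are hypothetical helper names, and it assumes rspamadm is on the PATH and that you can read the configuration.

```shell
#!/bin/sh
# Sketch: compare the effective Rspamd configuration before and after an edit.

# snapshot_config FILE: save the current effective configuration to FILE.
snapshot_config() {
    rspamadm configdump > "$1"
}

# config_diff BEFORE AFTER: print the differences.
# Returns 0 if something changed, 1 if nothing changed, 2 on a diff error.
config_diff() {
    rc=0
    diff -u "$1" "$2" || rc=$?
    case "$rc" in
        0) echo "no effective change" >&2; return 1 ;;
        1) return 0 ;;
        *) return 2 ;;
    esac
}
```

Typical use: run snapshot_config /tmp/before.conf, edit a file under local.d, run snapshot_config /tmp/after.conf, then config_diff /tmp/before.conf /tmp/after.conf.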

I have now set greylisting to null to disable it since SM is already doing that. His instructions advocate for bumping up the reject threshold, which I will consider.

I am using a local network for server-to-server communication, but it may be better to use a reverse proxy with an SSL configuration. The email is transmitted unsecured from SM to rspamd for scanning, so you need to consider the security implications of this on any given network.

I initially changed the configuration to point to my dns servers, but I thought the suggestion of using unbound dns was a good idea to reduce the load on my primary dns servers, so am trying that.

I also recognize that there are duplicate checks run on SM and rspamd, such as RBL. That may require some tuning.

I realize that rspamd can greylist, add headers, rewrite the subject, and reject the message. What I’m not clear on is how SmarterMail interacts with greylisting and reject actions from rspamd.

My rspamd showed lots of greylisting (before I disabled it). That means the email was already accepted by SM so I can’t see how SM would utilize that action. Same thing with reject. I assume SM doesn’t outright reject but instead wants to use rspamd as one of its factors. It would be helpful to understand the degree of integration SM has with rspamd actions and the recommendations for how to use it most effectively.
