2
Deeper Integration with AI/OpenAI/ChatGPT
Idea shared by James North - 6/10/2025 at 11:17 PM
Proposed
Hi,

I just tried out the AI integration with SmarterMail from the Marketplace. I've found it... underwhelming. Email drafting is all well and good, but it would be really interesting to have much deeper integration with AI.

For example, a chatbot you can ask about your emails:
  • Finding emails: "I need to find that email where I spoke to Brian about [whatever] over a year ago"
  • Asking for insights about your emails

I was wondering if that was on the roadmap.

12 Replies

Reply to Thread
3
Really?? And using your own data to train an LLM? One that you haven't any control over?

If you want to use AI, then please do so with a private AI using only your own data... and not leaking it to everyone else.

It's not only your own info that's getting shared... everyone you ever wrote to or had any correspondence with is also included.
1
According to OpenAI's privacy policy, they don't train their AI on your data when you use their API: https://platform.openai.com/docs/guides/your-data

Your data is your data. As of March 1, 2023, data sent to the OpenAI API is not used to train or improve OpenAI models (unless you explicitly opt in to share data with us).

Additionally, it can be implemented in a way that not all of your data needs to be sent. See Windsurf's Security page (which is actually about privacy):

Within each of these requests, the client machine sends a combination of context, such as relevant snippets of code, recent actions taken within the editor, the conversation history (if relevant), and user-specified signals (ex. rules, memories, context pinning, etc). No single request contains entire codebases or large contiguous pieces of code data. Even for ahead-of-time personalization, any codebase parsing happens on the client machine and individual code snippets are sent to compute the embeddings so that the server is not receiving a single request with the entire codebase.
Of course, ideally you would have integration with a local LLM running on your own server infrastructure. But the SmarterMail integration is already provider-agnostic.

The Thunderbird Assist service is a good example of how deep integration can be much more interesting than just email drafting.
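The context-selection approach described in the Windsurf quote above can be sketched roughly as follows. This is a hypothetical illustration using crude word-overlap scoring; real clients use embeddings or smarter heuristics, but the privacy property is the same: only the top-scoring snippets leave the machine, never the whole corpus.

```python
def score(snippet: str, query: str) -> int:
    # Crude relevance: count query-word overlap. Stands in for a real
    # relevance model purely to show the selection mechanics.
    q = set(query.lower().split())
    return sum(1 for w in snippet.lower().split() if w in q)

def build_context(snippets: list[str], query: str, max_snippets: int = 2) -> list[str]:
    # Rank locally, then keep only the few most relevant snippets.
    ranked = sorted(snippets, key=lambda s: score(s, query), reverse=True)
    return ranked[:max_snippets]

mailbox = [
    "Quarterly report numbers for finance",
    "Brian: here is the signed contract",
    "Team lunch moved to Thursday",
]
# Only the two most relevant snippets would accompany the request,
# never the entire mailbox.
context = build_context(mailbox, "contract from Brian", max_snippets=2)
print(context)
```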
1
Unfortunately, if you want an AI to be able to search your emails, then you have to pass ALL of your email content (ALL of your emails ever...) to that AI so it can analyze the content and then do the search you ask for...

How do you plan to get them to do that without them being able to keep your data?

Assuming we pretend to trust them, do you give them your entire email database EVERY TIME you ask for a search and then have them delete it right after? How many gigabytes of upload do you have to give them each time?


Gabriele Maoret - Head of SysAdmins and CISO at SERSIS. Currently manages 6 SmarterMail installations (1 in the cloud for SERSIS, which provides services to a few hundred third-party email domains, plus 5 on-premise for customers who prefer to have their mail server in-house).
0
None, because they have it already. Data is the new gold.

ERP... they know everyone you invoice. They know your customers, your prices, and what you sell the most. Your vendors and your contacts.

What more do they need to know to compete or sell your data?
1
I was playing Devil's Advocate with trusting these companies :) But a lot of people do actually trust them. And I don't think there have been any privacy-related scandals about them as yet. Good to know people around here care a lot about privacy.

Personally, it unnerves me that I need to give consent for an AI scribe to be used when I go to most of my doctors nowadays.

I'm not personally interested in using AI for my emails. But I've got customers who are. And it seems like using a local LLM on a server you actually control would be the privacy-preserving way to do it, as Brian suggested; Thunderbird Assist uses local models for privacy reasons. I don't see why you couldn't set up a Mistral model and integrate it with SmarterMail. Certainly, it seems like Microsoft and Google are doing deep integration with their own models.

As to uploading the entire database of emails - I don't think that's how it works. As I understand it, the data is processed to create embeddings that are stored in a vector database, which is significantly smaller than the raw mail, and then periodically updated.
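The embedding-search idea can be sketched like this. The "embedder" here is a toy bag-of-words vector over a tiny fixed vocabulary, standing in for a real sentence-embedding model (which could run locally); the vector-database part is just a list of precomputed vectors, updated only when emails change.

```python
import math

# Toy "embedder": stands in for a real sentence-embedding model.
VOCAB = ["invoice", "meeting", "brian", "contract", "lunch"]

def embed(text: str) -> list[float]:
    words = text.lower().split()
    return [float(words.count(w)) for w in VOCAB]

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

# The "vector database": embeddings are computed once per email and
# stored; only new or changed emails need re-embedding later.
emails = [
    "Brian contract discussion and next steps",
    "Lunch on Friday?",
    "Invoice 1042 attached, due end of month",
]
index = [(e, embed(e)) for e in emails]

def search(query: str, top_k: int = 1) -> list[str]:
    # Embed the query and rank stored vectors by cosine similarity;
    # no raw mailbox content needs to be shipped anywhere per search.
    q = embed(query)
    ranked = sorted(index, key=lambda pair: cosine(q, pair[1]), reverse=True)
    return [text for text, _ in ranked[:top_k]]

print(search("that email where I spoke to Brian about the contract"))
```

The point of the sketch: a search query is compared against small stored vectors, so nothing like "gigabytes of upload per search" is involved once the index exists.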

As an aside:
There seems to be a lot of talk about vector databases, embeddings, and such. This is a bit beyond me. But I did find a research article about embeddings and how private they are: https://arxiv.org/html/2411.05034v1

(Which is to say, by default, not very, though this paper shows it can be mitigated, apparently)
1

1. Clearview AI – Facial Recognition Database Scraping

  • What happened: Clearview AI scraped over 3 billion images from social media and public websites without consent to build a facial recognition tool used by law enforcement.
  • Privacy concern: Individuals’ biometric data was collected and used without knowledge or permission.
  • Impact: Lawsuits and bans in several countries; regulators in Canada, Australia, and Europe ruled the company violated privacy laws.

2. Facebook & Cambridge Analytica (with AI-driven profiling)

  • What happened: Data from 87 million Facebook users was harvested via a personality quiz app and used to create AI-driven psychological profiles for targeted political advertising.
  • Privacy concern: Massive unauthorized use of personal data for behavioral prediction and manipulation.
  • Impact: $5 billion fine for Facebook from the FTC; widespread loss of trust in data privacy and social media platforms.

3. Amazon Alexa – Voice Recordings and Human Review

  • What happened: Amazon employees listened to users’ Alexa recordings to improve AI voice recognition accuracy—without clearly informing users.
  • Privacy concern: Private conversations were recorded and reviewed without explicit consent.
  • Impact: Backlash and increased scrutiny of smart home devices; Amazon introduced clearer opt-outs and improved privacy policies.

4. Zoom AI Features – Using User Data Without Consent

  • What happened: In 2023, Zoom faced backlash for updating its terms to allow training AI on customer data (including video, audio, and chat), which many users saw as a breach of trust.
  • Privacy concern: AI training on sensitive communications without clear user consent.
  • Impact: Zoom was forced to clarify and change its terms; damaged trust among enterprise users.

5. Google Bard (Gemini) – Data Misuse and Internal Access

  • What happened: Internal whistleblowers raised concerns that Google engineers accessed users' chat data in AI tools for training and debugging.
  • Privacy concern: Sensitive user data potentially accessed by employees or used without proper safeguards.
  • Impact: Internal investigations and increased calls for transparency in AI model training data practices.

Asked Google... this is what turned up after a millisecond of research...

0
If we're talking scandals, I'd be interested in one relating to OpenAI, which states in no uncertain terms, right at the top of the agreement, that they do not train their models on your data. If they broke that agreement, I would be very interested in seeing reports of it, because OpenAI has staked their reputation on saying they do not do this. I couldn't find any articles saying they have.

As for Google Bard:

It really isn't surprising that Google is training their AI tools on users' data... in fact, I would be surprised if they ever said they wouldn't. Amazon is even less surprising; the Zoom scandal is about the fact that they updated their terms to say they would train their AI on users' data, not about breaching a contract saying they wouldn't (Adobe would be a more recent example of the same thing). Clearview AI isn't offering an API as far as I'm aware, and the Facebook one is a real reach...

But I suppose that's about what you can expect from Google Bard's output 🤷

Now, I had seen recently that in the NYT lawsuit with OpenAI, the plaintiffs asked the judge for an injunction requiring OpenAI to retain all of its users' data so they could peruse it. I figured this had little chance of going through and would be appealed.


To comply with the order, OpenAI must "retain all user content indefinitely going forward, based on speculation" that the news plaintiffs "might find something that supports their case," OpenAI's statement alleged.
The order impacts users of ChatGPT Free, Plus, and Pro, as well as users of OpenAI’s application programming interface (API), OpenAI specified in a court filing this week. But "this does not impact ChatGPT Enterprise or ChatGPT Edu customers," OpenAI emphasized in its more recent statement. It also doesn't impact any user with a Zero Data Retention agreement.
To which my first thought was:

"Uh, but they didn't update the privacy policy..?"

And then I realised: "oh, of course. The privacy policy says they don't train on your data. Not that they don't retain it. And there is that section in there about sharing it with third parties for legal reasons."

Their policy:

By default, abuse monitoring logs are generated for all API feature usage and retained for up to 30 days, unless we are legally required to retain the logs for longer.
Anyway.

Back to my feature request... limiting it to a local LLM where the data never leaves the server/user's device. I don't see any privacy issues with that.
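A local-LLM integration along those lines could look something like the sketch below. This is a hypothetical example assuming an Ollama instance running on localhost with a Mistral model pulled; the endpoint, model name, and prompt format are assumptions, and the key property is that the request never leaves the machine.

```python
import json
import urllib.request

# Assumed local Ollama endpoint; nothing here leaves the server.
OLLAMA_URL = "http://localhost:11434/api/generate"

def build_request(question: str, email_context: list[str]) -> dict:
    # Ground the model in the locally-retrieved emails only.
    prompt = (
        "Answer using only the emails below.\n\n"
        + "\n---\n".join(email_context)
        + f"\n\nQuestion: {question}"
    )
    return {"model": "mistral", "prompt": prompt, "stream": False}

def ask_local_llm(question: str, email_context: list[str]) -> str:
    payload = json.dumps(build_request(question, email_context)).encode()
    req = urllib.request.Request(
        OLLAMA_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

if __name__ == "__main__":
    # Build (but don't send) a request, to show the payload shape.
    body = build_request("When is the invoice due?", ["Invoice due March 3."])
    print(body["model"])
```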
3
Matt Petty Replied
Employee Post

Here's something I've been experimenting with in my own time :)

Matt Petty Senior Software Developer SmarterTools Inc. www.smartertools.com
0
Ok, I would actually love to hear a real answer on this one... I think this conversation got a little too far into the "Tin Foil Hat" realm. The only people who have something to fear from any of these systems are, honestly, people who have something to hide or haven't set up their security correctly...

That being said, we routinely tell our clients that if they are doing anything illegal, their contract is terminated.

If a client wants end-to-end encryption, full security and so forth, forward them to something like Proton Mail or Tuta Mail. Around 80% of the world is using Outlook or Gmail, including Fortune 500 companies... if you think Microsoft and Google aren't using every single email on those systems to train their AIs... ok then.

But my clients, who don't really care, would love to have a more robust system, not integrated into SM but at least as an add-on, so that users/administrators can decide if they want to opt in. The Thunderbird Assist system looks intriguing. I'm thinking this is also similar to what Microsoft's Copilot achieves (albeit with only email instead of the entire 365 suite feeding it info), but both are desktop applications that already integrate with SM for send/receive, and my guess is the original question was more about the web portal than desktop applications?

With all of this though, if people are worried about the contents of their emails, would it not also be worth looking into something like what Proton Mail does, offering end-to-end encryption so that the server only contains encrypted emails? Add the encryption on the way in (after virus and spam scans), then decrypt on the way out (when being retrieved by IMAP/POP)? It would also have to be built as either an add-on or a feature that can be turned on/off by admins, maybe per account?
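The encrypt-on-ingest / decrypt-on-retrieval idea can be sketched like this. This is NOT production crypto (a real add-on would use an audited AEAD library such as AES-GCM from the `cryptography` package); the toy HMAC-based keystream here just illustrates where the two hooks would sit in the mail pipeline.

```python
import hashlib
import hmac
import secrets

def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    # Toy keystream from HMAC-SHA256 in counter mode; illustrative only.
    out = b""
    counter = 0
    while len(out) < length:
        out += hmac.new(key, nonce + counter.to_bytes(8, "big"), hashlib.sha256).digest()
        counter += 1
    return out[:length]

def encrypt_on_ingest(key: bytes, plaintext: bytes) -> bytes:
    # Hook: called after spam/virus scanning, before writing to the store.
    nonce = secrets.token_bytes(16)
    ks = _keystream(key, nonce, len(plaintext))
    return nonce + bytes(a ^ b for a, b in zip(plaintext, ks))

def decrypt_on_retrieval(key: bytes, blob: bytes) -> bytes:
    # Hook: called when the message is fetched via IMAP/POP.
    nonce, ct = blob[:16], blob[16:]
    ks = _keystream(key, nonce, len(ct))
    return bytes(a ^ b for a, b in zip(ct, ks))

key = secrets.token_bytes(32)
stored = encrypt_on_ingest(key, b"Subject: hello\n\nBody text")
assert stored[16:] != b"Subject: hello\n\nBody text"   # the store sees only ciphertext
assert decrypt_on_retrieval(key, stored) == b"Subject: hello\n\nBody text"
```

Per-account keys (as suggested above) would just mean looking up the right `key` per mailbox at each hook.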


3
None of these companies actually understands how their "AI" works. Over the past decade, there have been numerous documented instances in which AIs did unintended things, and in some cases actively worked to obfuscate their behavior. So, when OpenAI makes promises about privacy, or Microsoft says that Copilot will not leak information out of your tenant, you should not only take that with a grain of salt but absolutely consider them to be complete liars.
0
Search for "EchoLeak"...

That should worry you.
0
AI integration with Online Meetings, Contacts, Calendar, Tasks, Notes, and File Storage in various ways including online meeting summaries, transcribing, suggestions, and scheduling would be a great add. 
