SMv17/100 - indexing question/issue
Problem reported by Webio - 1/21/2019 at 4:53 AM
Not A Problem

I upgraded from v16 to v17 over the weekend and now I have a few questions about the indexing process. I'm trying to understand it. I know there was a note that it could take a while, but take a look at this Indexing log entry:

[2019.01.20] 06:37:52.773 [EMAILADDRESS] Retreived from queue
[2019.01.20] 06:37:52.773 [EMAILADDRESS] Marking user as being indexed.
[2019.01.20] 06:37:52.773 [EMAILADDRESS] Should Reindex (g.woloszyn) - User marked for reindex.
[2019.01.20] 06:37:53.835 [EMAILADDRESS] Reindex (EMAILADDRESS) - All indexed items removed
[2019.01.20] 06:37:53.835 [EMAILADDRESS] Found 53 pending deletes.
[2019.01.20] 06:37:53.882 [EMAILADDRESS] Found 2501 items to index. (Max is 2500)
[2019.01.20] 06:38:29.976 [EMAILADDRESS] Index committed to disk
[2019.01.20] 06:38:29.992 [EMAILADDRESS] Index segment counts is greater than 20, optimizing index.
[2019.01.20] 06:38:29.992 [EMAILADDRESS] Unindexed remaining: 0 / 219254 (0%)
[2019.01.20] 06:38:29.992 [EMAILADDRESS] User Indexed
[2019.01.20] 06:38:30.023 [EMAILADDRESS] User removed from indexing queue
So the whole mailbox is marked for reindex, and 30 seconds later we have "Unindexed remaining: 0 / 219254 (0%) User Indexed". When I search for, let's say, DHL in my Inbox (where I'm sure there are various emails from them), I get only a few emails from 2013.

I have also had a few calls and tickets from my customers complaining about searching for their emails in webmail.


EDIT: One more thing about SmarterTrack, which is used here. The search form is completely useless IMHO because it does not allow ordering posts by date. I tried to find anything about index, indexing, or reindex using the search form, but it returned posts from various dates and I could not tell how they were ordered.

4 Replies

Employee Replied
Employee Post

There are a couple of items from your post that I will explain.

The first thing I noticed is the Index Segment Count. By default, this setting is configured for 20. In SmarterMail 16 and earlier versions, we were using an older version of Lucene and this setting worked. In the latest versions we have upgraded the Lucene indexer to the latest version and made significant optimizations throughout the product. We are going to be increasing that default number for new installs to somewhere between 1000 and 2000. Increasing that number will reduce CPU load without negatively impacting indexing.

Second, at 6:37 the log states that it found 53 deletes and 2501 items to index. By default, we chunk indexing into 2500 items per "loop". After it has committed that round of indexing, it will index the next 2500, and so on. At 6:38 the log states the number of unindexed items remaining. Zero items are left to index, and the user is removed from the queue.

I agree that the quantity and percentage shown can be misleading. At first glance you would expect that number to increase over time until you reach 100% indexed, rather than decrease. I will discuss this with the team and see about changing it to lessen confusion.
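The batching described above can be sketched roughly as follows. This is only an illustration, not SmarterMail's actual code: the function name `index_in_passes` is hypothetical, the default of 2500 mirrors the per-pass limit mentioned in this post, and the message format simply imitates the log line quoted in the question.

```python
from typing import List


def index_in_passes(item_ids: List[int], items_per_pass: int = 2500) -> List[str]:
    """Illustrative only: walk items in fixed-size passes and report what remains."""
    total = len(item_ids)
    progress = []
    for start in range(0, total, items_per_pass):
        batch = item_ids[start:start + items_per_pass]
        # A real indexer would add `batch` to the index and commit it here.
        remaining = total - (start + len(batch))
        percent = round(100 * remaining / total)
        progress.append(f"Unindexed remaining: {remaining} / {total} ({percent}%)")
    return progress
```

In this sketch the figure counts down, so a line ending in "0 / N (0%)" means nothing is left to index, not that nothing has been indexed.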

I hope this helps!
Webio Replied
Ok but can I do something now to speed things up?

I can modify these values:

    "indexing_segment_file_count_before_optimization": 20,
    "indexing_items_before_garbage_collection": 5000,
    "indexing_items_per_pass": 2500,
    "indexing_max_threads": 1,
    "indexing_deleted_items_before_optimization": 1000,
    "indexing_seconds_in_queue_before_index": 120,
in the settings file. Also, is it normal that I still don't see any difference in search results? IMHO, if the indexing log says that the user is indexed, then searching should work correctly, right? (I still see only messages from 2013 when searching for "DHL", even though I have received many DHL emails recently.)
Matt Petty Replied
Employee Post
"indexing_segment_file_count_before_optimization": 20
Set this to at least 500.
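If you would rather script the change than edit by hand, something like the sketch below might do. It assumes the settings are a JSON object with this key at the top level, as in the fragment quoted above; the real file may be laid out differently, and the helper name `raise_segment_threshold` is made up for illustration. Back up the settings file before changing anything.

```python
import json

KEY = "indexing_segment_file_count_before_optimization"


def raise_segment_threshold(settings_text: str, minimum: int = 500) -> str:
    """Return the settings JSON with the segment threshold raised to at least `minimum`."""
    settings = json.loads(settings_text)
    # Only raise the value; leave it alone if it is already high enough.
    if settings.get(KEY, 0) < minimum:
        settings[KEY] = minimum
    return json.dumps(settings, indent=4)
```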

Also, if you just recently moved to 17, you need to reindex all the users on your box. If you set your threads to 1, keep an eye on your indexing queue; it might fill up, and users waiting for an index will not get indexed.
Matt Petty
Software Developer
SmarterTools Inc.
(877) 357-6278
Webio Replied
Currently I have threads set to 1. It looks like this setting has been inherited from v15->v16->v17. I've changed the indexing params to the ones from another server which has a fresh configuration, but I've changed indexing threads

The Mailbox Indexing counter in the Troubleshooting section currently shows almost 5k mailboxes. Interestingly, when I open the indexing status I also see Completed mailboxes. Should this table show only mailboxes which are being indexed or will be indexed?

EDIT: So after a night the Mailbox Indexing number is 2.7k. It's hard to tell how many are being indexed or waiting in the queue, since there are a lot of entries with Completed status.
