FTS search string with a dot

Hi,

FTS works fine unless the search string contains a dot. Example:
target: virt.virt_flux-ingest

virt_flux-ingest: OK
virt.virt_flux-ingest: KO (no results)

Do you know why?

Regards, Stef.

The dot (.) symbol is used by Lucene as a token separator, so virt and virt_flux-ingest are separate tokens in this case, and you can search using those parts only. If you want a “phrase search”, you may try wrapping your search criteria in quotes:

"virt.virt_flux-ingest"

Thank you for your response. I’ve tried ftsService.search("\"virt.virt_flux-ingest\"") and ftsService.search("virt?virt_flux-ingest"), with no more success.

I don’t see any reference to the dot in the Apache Lucene Query Parser Syntax documentation.

I see in FTS.java/isTokenChar that the dot is not recognized as a token character.

The CUBA FTS add-on does not use the Lucene query syntax you are referring to. You can see how the search queries are built in the com.haulmont.fts.core.sys.LuceneSearcherBean#createQueryForAllFieldSearch(java.lang.String) method.

The tokenizer doesn’t consider the dot to be part of a token, so a regular term search doesn’t work.

The phrase query ftsService.search("\"virt.virt_flux-ingest\"") should work.
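As a minimal sketch of how such a phrase query string can be built in Java: the class and the toPhraseQuery helper below are hypothetical names used only for illustration and are not part of the FTS add-on. The point is that the search string itself must start and end with plain ASCII double quotes, which have to be escaped inside a Java string literal.

```java
final class PhraseQuerySketch {

    // Hypothetical helper (not part of the FTS add-on): wraps the raw text
    // in plain ASCII double quotes so the result can be used as a phrase query.
    static String toPhraseQuery(String text) {
        return "\"" + text + "\"";
    }

    public static void main(String[] args) {
        String query = toPhraseQuery("virt.virt_flux-ingest");
        // The resulting string is what would be passed to ftsService.search(...).
        System.out.println(query); // prints "virt.virt_flux-ingest" including the quotes
    }
}
```

Typographic quotes (“ ”) pasted from a web page are different characters from the ASCII quote and presumably would not be treated as phrase delimiters.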

No, it doesn’t. Perhaps it’s the indexing? How do I check the Lucene indexes?

Edit:
The target is in fact kafka.groups.sandbox_virt.virt_flux-ingest.recette.routeur_dataqueue.dataqueue-consumer.

ftsService.search("\"kafka.groups.sandbox_virt.virt_flux-ingest.recette.routeur_dataqueue.dataqueue-consumer\"");

returns results, but

ftsService.search("kafka.groups.sandbox_virt.virt_flux-ingest.recette.routeur_dataqueue.dataqueue-consumer");

does not, and

ftsService.search("\"virt.virt_flux-ingest\"");

still doesn’t return results.

Here is my very simple sample project where phrase search works fine: cuba-fts-dot.zip (91.3 KB)

Take a look; maybe you’ll be able to see what you’re doing differently in your app.

Sorry but:

The Lucene index files weren’t included in the zip archive, so you need to reindex the entities.
Open the Administration - JMX Console screen, find the FtsManager MBean there and invoke its methods: reindexAll and then processEntireQueue.

Tired, end of week, sorry. So, after reindexing, it works in your test project.

As I said in my previous post, the real target is

kafka.groups.sandbox_virt.virt_flux-ingest.recette.routeur_dataqueue.dataqueue-consumer

So, in my project, the search string

"kafka.groups.sandbox_virt.virt_flux-ingest.recette.routeur_dataqueue.dataqueue-consumer"

points to one record (that’s ok). But a search string like

"virt.virt_flux-ingest"

doesn’t point to any record. I’m checking my project. Thank you!

Sorry but it’s not (totally) ok with your project:
Orders:

kafka.groups.prod_virt.virt_flux-ingest.prod.routeur_dataqueue.dataqueue-consumer
kafka-consumer-group:virt.virt_flux-ingest.recette.routeur_dataqueue.dataqueue-consumer
virt.virt_flux-ingest

Search string:

"virt.virt_flux-ingest"

Results:

Order
Number: kafka-consumer-group:virt.virt_flux-ingest.recette.routeur_dataqueue…
Number: virt.virt_flux-ingest

FTS search isn’t the same as the String.contains(…) method. It works in a different way.

For example, the string kafka.groups.prod_virt.virt_flux-ingest.prod.routeur_dataqueue.dataqueue-consumer is split into tokens during indexing. Dots and semicolons are token separators, so the following tokens are extracted:

  • kafka
  • groups
  • prod_virt
  • virt_flux-ingest
  • prod
  • routeur_dataqueue
  • dataqueue-consumer

When you perform a regular FTS search (without quotes), it searches among these tokens (it finds the token that equals your search term).

When you type your query in quotes, the search looks for whole tokens that follow one another.

When you search for "virt.virt_flux-ingest", there is no token virt followed by a dot and the token virt_flux-ingest. But there is a token prod_virt followed by a dot and the token virt_flux-ingest, so the query "prod_virt.virt_flux-ingest" will work.
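The behaviour described above can be reproduced with a small stand-alone sketch. This is not the add-on’s actual code: the class name, the methods, and in particular the isTokenChar rule (letters, digits, underscore and hyphen belong to a token, everything else is a separator) are assumptions chosen to match the token list above.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class FtsTokenSketch {

    // Assumption: letters, digits, '_' and '-' are token characters;
    // any other character (dot, colon, space, ...) separates tokens.
    static boolean isTokenChar(char c) {
        return Character.isLetterOrDigit(c) || c == '_' || c == '-';
    }

    // Splits the text into tokens on every non-token character.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (isTokenChar(c)) {
                current.append(c);
            } else if (current.length() > 0) {
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    // Regular search: the term must equal one whole token.
    static boolean matchesTerm(String indexedText, String term) {
        return tokenize(indexedText).contains(term);
    }

    // Phrase search: the phrase tokens must appear as consecutive whole tokens.
    static boolean matchesPhrase(String indexedText, String phrase) {
        List<String> phraseTokens = tokenize(phrase);
        return !phraseTokens.isEmpty()
                && Collections.indexOfSubList(tokenize(indexedText), phraseTokens) >= 0;
    }

    public static void main(String[] args) {
        String first = "kafka.groups.prod_virt.virt_flux-ingest.prod.routeur_dataqueue.dataqueue-consumer";
        String second = "kafka-consumer-group:virt.virt_flux-ingest.recette.routeur_dataqueue.dataqueue-consumer";

        System.out.println(matchesTerm(first, "virt_flux-ingest"));             // true
        System.out.println(matchesPhrase(first, "virt.virt_flux-ingest"));      // false
        System.out.println(matchesPhrase(first, "prod_virt.virt_flux-ingest")); // true
        System.out.println(matchesPhrase(second, "virt.virt_flux-ingest"));     // true
    }
}
```

Under these assumptions the second order matches because the colon ends the token kafka-consumer-group, which leaves virt as a token of its own directly before virt_flux-ingest.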

Thank you.