Fulltext search is not working #3308

Closed
haristariqmage4 opened this issue Jun 29, 2024 · 8 comments

@haristariqmage4

Preconditions

Magento Version : 2.4.0

ElasticSuite Version : 2.1.0

Environment : Developer

Third party modules :
Amasty_Base
Amasty_CronScheduleList
Amasty_Customform
Amasty_InvisibleCaptcha
Amasty_RequestQuote
Amasty_QuoteAttributesManagement
Amasty_RequestAQuoteProSubscriptionPackage
Amasty_QuoteAttributes
Amazon_Core
Amazon_Login
Amazon_Payment
Clarion_CustomerAttribute
Codazon_AjaxCartPro
Codazon_AjaxLayeredNav
Codazon_AjaxLayeredNavPro
Codazon_Core
Codazon_GoogleAmpManager
Codazon_ImproveBundle
Codazon_Lookbookpro
Codazon_MegaMenu
Codazon_OneStepCheckout
Codazon_ProductFilter
Codazon_ThemeOptions
Codazon_QuickShop
Codazon_ShippingCostCalculator
Codazon_Shopbybrandpro
Codazon_Slideshow
Codazon_ProductLabel
Codazon_Utility
Dotdigitalgroup_Email
Dotdigitalgroup_Chat
Harrigo_EverCrumbs
Klarna_Core
Klarna_Ordermanagement
Klarna_Onsitemessaging
Klarna_Kp
Klaviyo_Reclaim
MageMe_HidePrice
MageWorx_SearchSuiteAutocomplete
Magefan_Community
Magefan_Blog
Magefan_WysiwygAdvanced
Magemonkeys_CategoryFilter
Magemonkeys_CompanyName
Magemonkeys_Customerinfo
Magemonkeys_FeaturedProduct
Magemonkeys_HideMyOrders
Magemonkeys_Ordermail
Magemonkeys_Product
Magemonkeys_Quote
Magemonkeys_RemoveQuoteCartPrice
Magemonkeys_RepresentativeAttr
Magemonkeys_RestrictCategory
Magemonkeys_WelcomeEmailCc
Mageplaza_Core
Mageplaza_BannerSlider
Mageplaza_BackendReindex
Mageplaza_MassProductActions
Mageplaza_Smtp
Magestat_SplitOrder
OlegKoval_RegenerateUrlRewrites
PayPal_Braintree
PayPal_BraintreeGraphQl
RapideWeb_ProductListTable
Smile_ElasticsuiteCore
Smile_ElasticsuiteCatalog
Smile_ElasticsuiteCatalogGraphQl
Smile_ElasticsuiteCatalogRule
Smile_ElasticsuiteCatalogOptimizer
Smile_ElasticsuiteTracker
Smile_ElasticsuiteThesaurus
Smile_ElasticsuiteSwatches
Smile_ElasticsuiteIndices
Smile_ElasticsuiteAnalytics
Smile_ElasticsuiteVirtualCategory
Temando_ShippingRemover
Ulmod_Ordernotes
Vertex_Tax
Vertex_AddressValidation
WebShopApps_MatrixRate
Yotpo_Yotpo
Zero1_Patches

How do we make a search for "dextrose 5% water" return the same results as a search for "d5w", given that one is several words and the other is technically a single word?
How can we make sure that items like Sharps Container 26 1/4 ° 20 w * 14 3/4 D Inch 19 BD Gallon are not included in the search results for 'D5W'?

Expected result

More narrow product search that will only allow for exact terms to be fetched
Searches like 'D5W' should not have results that include hits for 'D' '5' 'W'

Actual result

image

@rbayet
Collaborator

rbayet commented Jul 1, 2024

Hello @haristariqmage4,

This is probably due to the "word_delimiter" token filter of the "standard" (text) analyzer, which transforms your product names before indexing them.
This "word_delimiter" component does split words like "D5W" when switching from a letter to a digit and vice versa, so you are correct in assuming that we search for "D", "5" and "W" when searching for "D5W".
The issue is then that you have other product names containing those isolated letters (for example, coming from a product name string like "3/5 H X 10 7/10 W X 6 D").
You can check what's happening on the analyzer side from the admin interface, in Elasticsuite > System > Analysis.
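To make the overlap concrete, here is a minimal Python sketch that imitates the letter/digit splitting described above. It is not the Elasticsearch implementation: `split_letter_digit` and `analyze` are hypothetical helpers, tokenization is reduced to a whitespace split, and the original token is kept alongside its parts.

```python
import re


def split_letter_digit(token: str) -> list[str]:
    """Return the runs of letters and runs of digits found in a token (simplified)."""
    return re.findall(r"[a-z]+|[0-9]+", token.lower())


def analyze(text: str) -> set[str]:
    """Whitespace-tokenize, keep each original token, and add its letter/digit parts."""
    terms = set()
    for raw in text.split():
        terms.add(raw.lower())
        terms.update(split_letter_digit(raw))
    return terms


query_terms = analyze("D5W")
product_terms = analyze("3/5 H X 10 7/10 W X 6 D")

# Terms shared by the query and the unrelated product name.
print(sorted(query_terms & product_terms))  # ['5', 'd', 'w']
```

Under these assumptions, every sub-part of "D5W" also occurs in the dimension string, which is why such products can show up in the results.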

If you can't have "simpler" product names, I would recommend changing the configuration of the "word_delimiter" token filter by disabling "split_on_numerics" in elasticsuite_analysis.xml (through a composer patch or a redefinition of the XML in a custom module).

Then, for the original issue, a thesaurus entry associating "D5W" or "d5w" with "Dextrose 5% Water" should do the trick.

Regards,

@haristariqmage4
Author

image
There is only System > Indices. Please define more.

@rbayet
Collaborator

rbayet commented Jul 1, 2024

Hello @haristariqmage4,

Indeed, I've just seen

Magento Version : 2.4.0

ElasticSuite Version : 2.1.0 (I guess it's 2.10.0)

That's ... old :)
Indeed, you will not have that screen, which was introduced in 2.10.13 and is therefore available for Magento 2.4.1 and above only.
You can install cerebro locally and reproduce what that screen does from cerebro's "analysis" screen:
image

Or use the _analyze endpoint of your Elasticsearch directly from the CLI:

/var/www/html $ curl -H "Content-Type: application/json" -XPOST http://opensearch:9200/magento2_fr_fr_catalog_product/_analyze?pretty -d '{"analyzer":"standard","text":"d5w"}'
{
  "tokens" : [
    {
      "token" : "d5w",
      "start_offset" : 0,
      "end_offset" : 3,
      "type" : "<ALPHANUM>",
      "position" : 0
    },
    {
      "token" : "d",
      "start_offset" : 0,
      "end_offset" : 1,
      "type" : "<ALPHANUM>",
      "position" : 0
    },
    {
      "token" : "d5w",
      "start_offset" : 0,
      "end_offset" : 3,
      "type" : "<ALPHANUM>",
      "position" : 0
    },
    {
      "token" : "5",
      "start_offset" : 1,
      "end_offset" : 2,
      "type" : "<ALPHANUM>",
      "position" : 1
    },
    {
      "token" : "w",
      "start_offset" : 2,
      "end_offset" : 3,
      "type" : "<ALPHANUM>",
      "position" : 2
    }
  ]
}

(Replace http://opensearch:9200/magento2_fr_fr_catalog_product/ with http://[your_elasticsearch_server_address_or_hostname]/[your_catalog_product_index_name])
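The same _analyze endpoint also accepts an inline tokenizer and filter definition, which allows previewing the effect of "split_on_numerics" before touching elasticsuite_analysis.xml. The sketch below is an illustration only: the whitespace + lowercase + word_delimiter chain is an assumption and not Elasticsuite's exact analyzer, and `build_analyze_body`, `analyze` and the base URL are hypothetical names.

```python
import json
import urllib.request


def build_analyze_body(text: str, split_on_numerics: bool) -> dict:
    """Build an _analyze request body with an inline word_delimiter filter."""
    return {
        "tokenizer": "whitespace",
        "filter": [
            "lowercase",
            {
                "type": "word_delimiter",
                "split_on_numerics": split_on_numerics,
                "preserve_original": True,
            },
        ],
        "text": text,
    }


def analyze(base_url: str, body: dict) -> list[str]:
    """POST the body to <base_url>/_analyze and return the emitted tokens."""
    request = urllib.request.Request(
        f"{base_url}/_analyze",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        payload = json.load(response)
    return [token["token"] for token in payload["tokens"]]
```

Calling `analyze("http://opensearch:9200", build_analyze_body("d5w", False))` and then again with `True` (with the host replaced by your own) should let you compare the tokens produced with and without numeric splitting.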

Regards,

@haristariqmage4
Author

image
Hi @rbayet ,
Here is my analysis. Now what should I do?

@rbayet
Collaborator

rbayet commented Jul 2, 2024

Hello @haristariqmage4,

So now that your thesaurus is in place, you have two options (that could be combined, actually)

  1. reducing the score penalty for products matching a synonym
  2. altering the "word_delimiter" token filter in the way I described

1. reducing the score penalty for products matching a synonym
When searching for "d5w" you will now also search for "dextrose 5% water", but by default the products matching only "dextrose 5% water" incur a score penalty and receive a tenth of their expected score.

You can change that by lowering (down to 1, i.e. "no penalty") the setting available at Elasticsuite > Search Relevance > Thesaurus Configuration > Synonyms Configuration > Synonyms Weight Divider
image

The products individually matching "D", "5" and "W" will still be present in the search results, but perhaps low enough in the list for you to be satisfied.
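As a rough illustration of the weight divider arithmetic, the sketch below divides an expected score by the divider. `synonym_score` is a hypothetical helper, the default of 10 is inferred from the "tenth of their expected score" statement above, and the real scoring inside Elasticsearch involves more than this single division.

```python
def synonym_score(expected_score: float, weight_divider: float = 10.0) -> float:
    """Score of a product matched only through a synonym (simplified model)."""
    if weight_divider < 1:
        raise ValueError("weight_divider is expected to be 1 or more")
    return expected_score / weight_divider


# Default divider: a synonym-only match keeps a tenth of its score.
print(synonym_score(8.0))
# Divider set to 1: no penalty.
print(synonym_score(8.0, 1))
```

With a divider of 1, products matching "dextrose 5% water" would compete on equal terms with products matching "d5w" directly.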

2. altering the "word_delimiter" token filter in the way I described
If you're not satisfied, or as an alternative, you can redefine or fine-tune the word_delimiter token filter, which is responsible for splitting "D5W" into "D", "5" and "W".

You probably only need to change "split_on_numerics" from "true" to "false".

You can do that either with a composer patch on that distribution file or by creating a custom module in app/code with a local elasticsuite_analysis.xml containing just the redefined word_delimiter token filter.
In both cases, this will require clearing the Magento cache and performing a full reindex.

Please be aware that this approach could have adverse side effects, for instance preventing products with "L48B" in their name or SKU from being found when searching for "L 48 B".
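The trade-off can be sketched with a simplified Python model, assuming whitespace tokenization and treating a product as a candidate as soon as it shares one term with the query. `analyze` and `shares_term` are hypothetical helpers, not Elasticsuite code, and real matching rules are stricter than "one shared term".

```python
import re


def analyze(text: str, split_on_numerics: bool) -> set[str]:
    """Keep each lowercased token; optionally add its letter and digit runs."""
    terms = set()
    for raw in text.lower().split():
        terms.add(raw)
        if split_on_numerics:
            terms.update(re.findall(r"[a-z]+|[0-9]+", raw))
    return terms


def shares_term(query: str, product: str, split_on_numerics: bool) -> bool:
    """True when the query and the product have at least one term in common."""
    return bool(analyze(query, split_on_numerics) & analyze(product, split_on_numerics))


for flag in (True, False):
    print(
        flag,
        shares_term("D5W", "3/5 H X 10 7/10 W X 6 D", flag),  # unwanted match
        shares_term("L 48 B", "L48B", flag),                  # wanted match
    )
```

In this model, disabling the split removes the unwanted "D5W" match and the wanted "L 48 B" match at the same time, which is the side effect mentioned above.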

Regards,

@haristariqmage4
Author

haristariqmage4 commented Jul 2, 2024

@rbayet
Should I set generate_word_parts to false too in elasticsuite_analysis.xml?

@haristariqmage4
Author

haristariqmage4 commented Jul 2, 2024

Also, what would be the solution to avoid these side effects?

Please be aware that this approach could have adverse side effects, for instance preventing finding products with a "L48B" in their name or their SKU by searching for "L 48 B" for instance.

rbayet assigned rbayet and unassigned haristariqmage4 Jul 8, 2024
@haristariqmage4
Author

@rbayet ?
