Wednesday, March 9, 2016

Relevance Ranking Modules in Endeca

Relevance Ranking controls the order in which search results are displayed to the end user. You configure the Relevance Ranking feature to display the most important search results first.

Relevance ranking can be used to independently control the result ordering for both record search and dimension search queries.

The importance of a search result is generally application-specific. The Relevance Ranking feature provides a flexible, configurable set of result ranking modules for building application-specific ranking strategies.

Relevance ranking can be applied through a search interface or by passing the ranking strategy when building the query.
Site-specific relevance ranking can also be applied, either through the results list cartridge or by passing it at query time.

The following Relevance Ranking modules are available, listed with their ranking behaviors.

Exact

Provides a finer-grained but more computationally expensive alternative to the Phrase module.
Groups the results into three strata.
1.      Highest stratum contains results whose complete text matches the user’s query exactly.
2.      Middle stratum contains results that contain the user’s query as a sub phrase.
3.      Lowest stratum contains other match types such as normal conjunctive matches.
Use only on small text fields such as dimension values or small property values like part IDs.
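
The three strata above can be illustrated with a small sketch. This is not the engine's implementation; the function name and the whitespace tokenization are assumptions made only for illustration.

```python
def exact_stratum(field_text: str, query: str) -> int:
    """Return 0 (highest), 1 (middle) or 2 (lowest) for a field that matched the query."""
    text_words = field_text.lower().split()
    query_words = query.lower().split()
    if text_words == query_words:
        return 0  # the complete text matches the query exactly
    n = len(query_words)
    for i in range(len(text_words) - n + 1):
        if text_words[i:i + n] == query_words:
            return 1  # the query appears as a contiguous sub-phrase
    return 2  # other match types, e.g. terms present but not adjacent


# Hypothetical dimension values, ranked for the query "wine glass".
part_names = ["red wine glass", "wine glass", "glass for red wine"]
ranked = sorted(part_names, key=lambda t: exact_stratum(t, "wine glass"))
```

Because Python's sort is stable, values in the same stratum keep their original order, which is where a later module in the strategy would act as a tie-breaker.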


Field

Ranks documents based on the highest-priority search interface field in which they matched.
Useful in relevance ranking strategies for catalog applications.
Assigns a score to each result based on the static rank of the dimension or property member or members of the search interface that caused the document to match the query.
Valid only for record search operations.
Assigns a score of zero to all results for other types of search requests.
Treats all matches the same, whether or not they are due to query expansion.


First

Designed primarily for use with unstructured data.
Ranks documents by how close the query terms are to the beginning of the document.
Groups its results into variably-sized strata.
Takes advantage of the fact that the closer something is to the beginning of a document, the more likely it is to be relevant.
When the query has a single term, the First module behavior is straightforward:
-        It retrieves the first absolute position of the word in the document, then calculates which stratum contains that position.
When the query has multiple terms, the First module behaves as follows:
-        The first absolute position for each of the query terms is determined, and then the median position of these positions is calculated.
Supports wildcard queries.
Does not work with Boolean searches or cross-field matching.
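
The single-term and multi-term behavior can be sketched as follows. The stratum boundaries here are hypothetical (the source only says the strata are variably sized), and the function name is an assumption.

```python
from statistics import median

# Hypothetical upper bounds (in word positions) for each stratum.
STRATUM_BOUNDS = [10, 50, 200, 1000]


def first_stratum(doc_words, query_terms):
    """Stratum index for a document; lower means closer to the beginning."""
    # First absolute position of each query term that occurs in the document.
    positions = [doc_words.index(t) for t in query_terms if t in doc_words]
    if not positions:
        return len(STRATUM_BOUNDS)
    # One term: its first position. Several terms: the median of the first positions.
    position = median(positions)
    for stratum, bound in enumerate(STRATUM_BOUNDS):
        if position < bound:
            return stratum
    return len(STRATUM_BOUNDS)


# Hypothetical 60-word document with "wine" at position 2 and "glass" at position 40.
doc = ["filler"] * 60
doc[2] = "wine"
doc[40] = "glass"
single = first_stratum(doc, ["wine"])           # position 2
multi = first_stratum(doc, ["wine", "glass"])   # median of 2 and 40 is 21
```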



Freq

Provides result scoring based on the frequency of the user’s query terms in the result text
Score produced for a result record is the sum of the frequencies of all user search terms in all fields that match a sufficient number of terms
Number of terms depends on the match mode
Cross-field match records are assigned a score of zero
Ignores matches due to query expansion
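
A minimal sketch of this scoring rule, assuming whitespace tokenization and a hypothetical `required_terms` parameter standing in for the match mode (for example, all terms in a match-all mode):

```python
def freq_score(record_fields, query_terms, required_terms):
    """Sum term frequencies over the fields that match enough query terms."""
    score = 0
    for text in record_fields.values():
        words = text.lower().split()
        matched = [t for t in query_terms if t in words]
        if len(matched) >= required_terms:
            score += sum(words.count(t) for t in query_terms)
    return score


# Hypothetical record: two fields each contain both terms, one contains neither.
record = {
    "Title": "red wine glass",
    "Description": "a wine glass for red wine",
    "Brand": "Acme",
}
single_field = freq_score(record, ["red", "wine"], required_terms=2)

# Cross-field match: no single field contains both terms, so the score is zero.
cross_field = freq_score({"Title": "red", "Description": "wine"}, ["red", "wine"], 2)
```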



Glom

Ranks single-field matches ahead of cross-field matches and also ahead of records that do not contain the search term at all
Serves as a useful tie-breaker in combination with the MaxField module
Only useful with record search operations
Treats all matches the same, whether or not they are due to query expansion



Interp

General-purpose module that assigns a score to each result record based on the query processing techniques used to obtain the match.
Ranks results as follows:
All non-partial matches are ranked ahead of all partial matches
All single-field matches are ranked ahead of all cross-field matches
All non-spelling-corrected matches are ranked above all spelling-corrected matches
All thesaurus matches are ranked below all non-thesaurus matches
All stemming matches are ranked below all non-stemming matches.
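
One way to picture these rules is as a lexicographic sort key. The sketch below assumes the criteria are applied in the order listed, which the list implies but does not state; the `Match` class and its flags are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Match:
    name: str
    partial: bool = False
    cross_field: bool = False
    spell_corrected: bool = False
    thesaurus: bool = False
    stemmed: bool = False


def interp_key(m: Match):
    # False sorts before True, so matches without a given technique rank first.
    return (m.partial, m.cross_field, m.spell_corrected, m.thesaurus, m.stemmed)


matches = [
    Match("stemmed", stemmed=True),
    Match("partial", partial=True),
    Match("plain"),
    Match("spell", spell_corrected=True),
]
ranked = [m.name for m in sorted(matches, key=interp_key)]
```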



MaxField

Behaves like the Field module, except in how it scores cross-field matches.
Selects the score of the highest-ranked field that contributed to the match.
Valid only for record search operations.
Treats all matches the same, whether or not they are due to query expansion.
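
The cross-field rule can be sketched as taking the maximum over the contributing fields. The field names and static ranks below are hypothetical, and a higher number is assumed to mean a higher-ranked field.

```python
# Hypothetical search interface members with their static ranks.
FIELD_RANK = {"Title": 3, "Brand": 2, "Description": 1}


def maxfield_score(matched_fields):
    """Score of the highest-ranked field that contributed to the match."""
    return max((FIELD_RANK[f] for f in matched_fields), default=0)


# A cross-field match where one term hit Description and another hit Title.
cross_field_score = maxfield_score(["Description", "Title"])
```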



Nterms

Ranks matches according to how many query terms they match.
Only applicable to search modes where results can vary in how many query terms they match.
Treats all matches the same, whether or not they are due to query expansion.
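
As a rough illustration, assuming a partial-match style mode in which records need not contain every term (the function name and tokenization are assumptions):

```python
def nterms_score(record_text, query_terms):
    """Number of distinct query terms that appear in the record text."""
    words = set(record_text.lower().split())
    return sum(1 for t in set(query_terms) if t in words)


query = ["red", "wine", "glass"]
records = ["wine rack", "red wine glass", "red glass vase"]
# Records that match more query terms come first.
ranked = sorted(records, key=lambda r: nterms_score(r, query), reverse=True)
```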



Numfields

Ranks results based on the number of fields in the associated search interface in which a match occurs.
Treats all matches the same, whether or not they are due to query expansion.
Only useful with record search operations.



Phrase

Ranks results according to whether they contain the user’s query as an exact phrase, or a subset of the exact phrase.
Records that have the phrase are ranked higher than records which do not contain the phrase.
Three options that you can use to customize its behavior:
1.   Rank based on length of sub-phrases
2.   Use approximate sub-phrase/phrase matching
3.   Apply spell correction, thesaurus, and stemming




Proximity

Designed primarily for use with unstructured data.
Ranks how close the query terms are to each other in a document by counting the number of intervening words.
Groups its results into variably-sized strata.
When the query has multiple terms, Proximity behaves as follows:
-        All of the absolute positions of each query term are computed
-        The smallest range that includes at least one instance of each of the query terms is calculated
-        This range’s length is given in number of words
-        The score for each document is the stratum that contains the difference between the range’s length and the number of terms in the query
-        Smaller differences are better than larger differences
Under stemming, spelling correction, and the thesaurus, the expanded terms are treated as if they were in the query.
Proximity scores partially matched queries as if the query only contained the matching terms.
Does not work with Boolean searches, cross-field matching, or wildcard searches.
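
The multi-term steps above can be sketched as a smallest-window search. This only computes the difference between the range length and the number of query terms; mapping that difference to a stratum is left out because the stratum sizes are not given. Function names are assumptions.

```python
def smallest_span(doc_words, query_terms):
    """Length in words of the smallest range containing every query term, or None."""
    terms = set(query_terms)
    last_seen = {}  # most recent position of each query term
    best = None
    for i, word in enumerate(doc_words):
        if word in terms:
            last_seen[word] = i
            if len(last_seen) == len(terms):
                # Shortest window ending at i starts at the oldest "last seen" position.
                span = i - min(last_seen.values()) + 1
                if best is None or span < best:
                    best = span
    return best


def proximity_difference(doc_words, query_terms):
    """Range length minus number of query terms; smaller is better."""
    span = smallest_span(doc_words, query_terms)
    if span is None:
        return None
    return span - len(set(query_terms))


adjacent = proximity_difference("a red glass vase".split(), ["red", "glass"])
separated = proximity_difference("red wine glass".split(), ["red", "glass"])
```

A difference of zero means the terms are adjacent; each intervening word adds one.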



Spell

Ranks spelling-corrected matches below other kinds of matches
Assigns a rank of zero to matches from spelling correction, and a rank of one to matches from all other sources
Ignores all other sorts of query expansion.



Static

Assigns a static or constant data-specific value to each search result
For record search operations, the first parameter to the module specifies a property, which defines the sort order assigned by the module
The second parameter can be specified as ascending or descending to indicate the sort order to use for the specified property
In a catalog application, configuring the Static module with Price and descending causes more expensive products to be displayed first.
For dimension search, the first parameter can be specified as nbins, depth, or rank.
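
The record search case amounts to sorting by a property. A minimal sketch, with hypothetical records and a hypothetical `static_rank` helper that mirrors the two parameters described above:

```python
def static_rank(records, prop, order="descending"):
    """Order records by the value of one property, ascending or descending."""
    return sorted(records, key=lambda r: r[prop], reverse=(order == "descending"))


# Hypothetical catalog records.
catalog = [
    {"name": "basic glass", "Price": 4.50},
    {"name": "crystal glass", "Price": 32.00},
    {"name": "travel mug", "Price": 12.99},
]
by_price = [r["name"] for r in static_rank(catalog, "Price", "descending")]
```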


Stem

Ranks stemming matches below other kinds of matches
Assigns a rank of zero to matches from stemming, and a rank of one to matches from all other sources
Ignores all other sorts of query expansion


Stratify

Used to boost or bury records in the result set
Takes one or more Endeca Query Language (EQL) expressions and groups results into strata
Records are placed in the stratum associated with the first EQL expression they match
If an asterisk is specified instead of an EQL expression, unmatched records are placed in the corresponding stratum
Basic component of the record boost and bury feature
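
The stratum assignment can be sketched with plain Python predicates standing in for EQL expressions (no EQL is evaluated here). Where unmatched records go when no asterisk is given is an assumption in this sketch: they are placed after all listed strata.

```python
def stratify(record, expressions):
    """Index of the stratum for a record; lower indexes are ranked first."""
    default = len(expressions)  # assumption: no asterisk means "after everything"
    for i, expr in enumerate(expressions):
        if expr == "*":
            default = i  # stratum reserved for unmatched records
        elif expr(record):
            return i  # first matching expression wins
    return default


# Hypothetical boost and bury rules.
is_house_brand = lambda r: r["Brand"] == "Acme"
is_out_of_stock = lambda r: r["Stock"] == 0
strata = [is_house_brand, "*", is_out_of_stock]

boosted = stratify({"Brand": "Acme", "Stock": 5}, strata)
neutral = stratify({"Brand": "Other", "Stock": 5}, strata)
buried = stratify({"Brand": "Other", "Stock": 0}, strata)
```

Placing the asterisk between the two expressions is what makes the first a boost and the last a bury.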



Thesaurus

Ranks matches due to thesaurus expansion below other sorts of matches.
Assigns a rank of zero to matches from the thesaurus, and a rank of one to matches from all other sources.
Ignores all other sorts of query expansion.



Weighted Freq

Scores results based on the frequency of user query terms in the result
Weights the individual query term frequencies for each result by the overall frequency in the complete data set of each query term
Terms resulting in fewer search results are weighted more heavily than more frequently occurring terms
Ignores matches due to query expansion
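
The weighting idea can be illustrated with an inverse-document-frequency style weight. The exact formula used by the engine is not given in the source, so the logarithmic weight below is an assumption chosen only to show rarer terms counting for more.

```python
import math


def wfreq_score(doc_words, query_terms, corpus):
    """Term frequencies in one document, weighted by how rare each term is overall."""
    n_docs = len(corpus)
    score = 0.0
    for term in set(query_terms):
        doc_freq = sum(1 for d in corpus if term in d)
        if doc_freq == 0:
            continue
        weight = math.log(1 + n_docs / doc_freq)  # assumed weighting, not the engine's
        score += doc_words.count(term) * weight
    return score


# Hypothetical data set: "red" is in every document, "wine" in only one.
corpus = [["red", "wine"], ["red", "glass"], ["red", "red", "chair"]]
query = ["red", "wine"]
rare_term_doc = wfreq_score(corpus[0], query, corpus)
common_term_doc = wfreq_score(corpus[2], query, corpus)
```

The first document scores higher even though both contain two query-term occurrences, because one of its occurrences is the rarer term.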



