How does the bento/single search work?
The search box on the Library website searches our new Blacklight catalog, Summon, our Library websites and a “Best Bet” index for top search hits.
Results are displayed in “bento boxes”:
- Articles returns the top results from a Journal Article search in Summon.
- Library Websites returns the top results of our Library websites using Google Custom Search.
- The rest of the boxes (Books, Journals, Musical Recordings, etc.) return the top results from the Catalog for each of these formats.
- Best Bets highlights resources tied to the top 100 queries on our Library websites. (See our post on Best Bets.)
Position of “bento” boxes
Best Bets (when applicable), Articles and Library Websites are in a fixed position in the layout. The remaining three boxes go to the Catalog formats whose top items have the highest relevancy scores, and they are positioned by that score, with the highest-scoring format in the top left position.
Note: we’re not ranking the relevancy of formats, just individual items. We assign those items to formats and use the relevancy score for each format’s top (i.e., most relevant) item as the determinant for which position the format takes on the page, but we’re not actually ranking the formats themselves.
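To make the positioning rule concrete, here is a minimal sketch in Python. It is not our production code; the function name and the (format, score) input shape are made up for illustration. Each format is scored by its single most relevant item, and the three highest-scoring formats fill the remaining boxes.

```python
def rank_catalog_formats(items, slots=3):
    """Return the `slots` formats whose best item scores highest, best first.

    `items` is an iterable of (format_name, relevancy_score) pairs.
    Both the function name and the input shape are hypothetical.
    """
    best = {}
    for fmt, score in items:
        # Keep only the highest item score seen for each format.
        if fmt not in best or score > best[fmt]:
            best[fmt] = score
    # Order formats by their top item's score, highest first.
    ranked = sorted(best, key=lambda fmt: best[fmt], reverse=True)
    return ranked[:slots]
```

The first format in the returned list would take the top left position. Note that only the top item per format matters here: a format with one very relevant item outranks a format with many moderately relevant ones.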
How Solr is used for relevancy
We are using Solr’s Result Grouping to get Solr to do format-level relevancy ranking for us. It groups the results by format and returns, for each format, a hit count, a relevance score (the score of the top document within that format), and the top five hits. Our actual query arguments to Solr look like this:
group.ngroups=true&q="ivanhoe"&group.limit=5&group.field=format_main_facet&group=true&wt=ruby&defType=edismax&rows=20
(I suspect that the defType=edismax is unnecessary here.) One limitation of Solr’s grouping functions is that you cannot group on a multivalued field, which our format facet field is. So we made a new, single-valued “format_main_facet” field that holds only the “top” format for records with more than one format value. (“Thesis” is really more a genre label than a format, so nearly all of our theses have a second format value – most commonly “Book”.)
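As a rough illustration of the pieces described above, the Python sketch below builds the same grouped query, reduces a multivalued format list to one value, and reads the per-format hit count and top score out of a grouped response. Several things here are assumptions rather than our actual setup: the priority order in FORMAT_PRIORITY is invented, the function names are hypothetical, wt=json is used instead of wt=ruby so the response can be read as a plain dictionary, and maxScore is assumed to be present in each group's doclist (it may be missing if scores are not requested).

```python
from urllib.parse import urlencode

# Hypothetical priority order: the first match wins when a record
# carries more than one format value. The real ordering may differ.
FORMAT_PRIORITY = ["Musical Recording", "Journal", "Book", "Thesis"]


def main_format(formats):
    """Pick a single 'top' format from a record's multivalued format list."""
    for candidate in FORMAT_PRIORITY:
        if candidate in formats:
            return candidate
    # Fall back to the first listed value, or None for an empty list.
    return formats[0] if formats else None


def grouped_query(user_query, per_group=5):
    """Build the URL-encoded query string for a grouped Solr request."""
    params = {
        "q": f'"{user_query}"',
        "defType": "edismax",
        "group": "true",
        "group.field": "format_main_facet",
        "group.limit": per_group,
        "group.ngroups": "true",
        "rows": 20,
        "wt": "json",  # assumption: JSON instead of the post's wt=ruby
    }
    return urlencode(params)


def summarize_groups(response, field="format_main_facet"):
    """Return (format, hit_count, top_score) for each group, in Solr's order."""
    groups = response["grouped"][field]["groups"]
    return [
        (g["groupValue"], g["doclist"]["numFound"], g["doclist"].get("maxScore"))
        for g in groups
    ]
```

With the default sort, Solr returns the groups ordered by the score of each group's top document, so the list from summarize_groups should already be in the order needed for positioning the boxes.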