🚨 Disclaimer: This post walks you through setting up Solr as the backend of Drupal's Search API. The setup is specifically for scenarios where Drupal runs locally on your machine in Docker through DDEV. I mention some concepts along the way, but since I am relatively new to this topic, they are mostly lifted from Google or Perplexity and might not be correct.
Intuition of Solr
To begin with, why Apache Solr? Why not use other options such as “Database Search (MySQL)” or “Elasticsearch”?
According to the official website for Apache Solr:
Apache Solr is an open-source enterprise search platform built on Apache Lucene, designed for high-performance full-text search and indexing. It supports distributed indexing, replication, load balancing, and advanced search features like faceting, filtering, and spell checking. Solr is highly customizable and fault-tolerant, making it suitable for large-scale applications like e-commerce and content management systems
I have also used Perplexity to compare them; here are the results:
| Feature or Aspect | Apache Solr | Elasticsearch | MySQL (Database) |
| --- | --- | --- | --- |
| Purpose | Full-text search engine | Full-text search engine | Relational structured database |
| Architecture | Built on Apache Lucene; supports distributed or non-distributed modes (SolrCloud vs standalone) | Also built on Apache Lucene; real-time indexing with JSON-based queries | Only basic full-text indexing (FULLTEXT indexes); indexes are mainly used for structured queries |
| Query Language | Solr Query Parser; REST-like APIs | JSON-based DSL for complex queries | SQL |
| Scalability | Requires manual configuration (e.g., SolrCloud with ZooKeeper) | Seamless horizontal scaling out of the box | Limited horizontal scaling; better suited for vertical scaling |
| Processing | Batch processing | Real-time processing | Batch processing |
| Performance | Optimized for read-heavy applications; good for complex queries but requires setup effort | Real-time performance; excels in analytics and dynamic environments | Strong for transactional operations but slower for complex searches |
| Use Case | Bigger websites; e-commerce, CMS, large-scale text search | Bigger websites; real-time analytics, monitoring dashboards | Small websites; CRUD operations, structured data storage |
Solr is preferred over MySQL (storing the index in the database) because of:
- Performance and Scalability: efficient querying through a dedicated inverted index, and a search service that can be scaled separately from the site.
- Advanced Search Features: full-text search (stemming, tokenization, phrase matching, wildcards), faceted navigation, auto-complete, and spell check.
- Better Customization: Solr’s extensive configuration options allow the developer to tailor search behavior more precisely to the website’s needs without requiring deep PHP/Java knowledge.
- Decoupled, Reduced Database Load: by off-loading search operations to an external Solr server, a Drupal website reduces the load on its primary database, improving site performance.
Solr is preferred over Elasticsearch because of:
- Mature Ecosystem and Community Support in Drupal: Solr has a long-standing presence in the Drupal world, so it is better integrated and offers more out-of-the-box features, and the community does a much better job of maintaining the Search API Solr module than its Elasticsearch counterpart. (Meanwhile, some of its legacy features, such as XML-based configuration, align well with older systems, Drupal versions, or infrastructure.)
- Ease of Integration with Drupal: generally speaking, configuring Solr in Drupal is much easier than configuring Elasticsearch, which may require additional effort to adapt to its JSON-based query DSL. Solr’s integration with Drupal is straightforward because it is compatible with existing tools and workflows, for instance faceted search, highlighting, and custom field weighting (supported out of the box by the Search API module).
- Static Data Handling: Solr excels at handling static or infrequently updated data thanks to its ability to use an uninverted reader for sorting and faceting, which improves performance. Many government organisations use Solr with relatively static datasets, which makes Solr a better fit for them.
Demonstration
Step-1: Example Drupal Website using DDEV
To begin with, let’s get an example Drupal website up and running with DDEV on our local computer:
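A minimal sketch of what this could look like, assuming DDEV and Docker are already installed. The folder name `drupal-solr-demo`, the `drupal10` project type and the `^10` version constraint are my own assumptions, so adjust them to your case:

```shell
# Create an empty project folder (the name is just a placeholder)
mkdir drupal-solr-demo && cd drupal-solr-demo

# Tell DDEV this is a Drupal project with its docroot in web/
ddev config --project-type=drupal10 --docroot=web

# Start the containers
ddev start

# Download Drupal into the project via Composer
ddev composer create "drupal/recommended-project:^10"

# Open the site in the browser to continue with the installer
ddev launch
```

Depending on your DDEV version, the Composer subcommand may be spelled `create-project` instead of `create`.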
Meanwhile, I might also install and enable some commonly used modules:
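The exact module list is a matter of taste. As an illustration only, assuming you want Drush (needed for the `drush` commands used later in this post) and Admin Toolbar, both of which are my own picks rather than requirements:

```shell
# Drush: the command-line tool for Drupal
ddev composer require drush/drush

# Admin Toolbar: drop-down admin menu (optional, hypothetical pick)
ddev composer require drupal/admin_toolbar

# Modules can only be enabled once Drupal itself is installed,
# so run this after finishing the installer:
# ddev drush en admin_toolbar admin_toolbar_tools -y
```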
After that, let’s initialize the example website using the guided interface to install Drupal:
Then you should have your example Drupal website ready:
Notice that it already has a functioning search bar at the top. This is built with the Drupal core “Search” module.
The “Search” module on its own is enough for most, if not all, small sites. It provides full-text search across content (nodes, taxonomy, users), ranks results by relevance in a basic way, and indexes the content in the site’s database (much like the “Search API” + “Database” server option). It is simple to set up, but it has no support for advanced features such as faceted navigation and auto-complete, it relies on Drupal’s database for indexing, it does not support real-time indexing (the index is only updated when the cron job runs), and it offers relatively limited customization of search behavior and result ranking. (And that is exactly why we need Solr.)
Step-2: Installation of Search API
Next, you’ll need to install the Search API module on the Drupal website. Simply run the following:
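For reference, this usually boils down to two commands, assuming Drush is available in the project:

```shell
# Download the Search API module
ddev composer require drupal/search_api

# Enable it
ddev drush en search_api -y
```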
Now you might ask: what is the Search API module? (Here goes Perplexity again…)
- Framework for Search: The Search API module acts as a framework to define how content is indexed and searched. It decouples the search logic from the backend, allowing you to use different search engines (e.g., database, Solr, Elasticsearch).
- Customizable Indexing: You can define what fields to index, apply preprocessors (e.g., stemming, stop words), and control field weighting to influence search result rankings.
- Integration with Views: The module integrates seamlessly with Drupal’s Views module, enabling you to create custom search result pages and filters.
- Advanced Features: faceted navigation; autocomplete suggestions; highlighting search terms in results; spell-checking and “Did you mean?” functionality.
So Solr needs the interfaces and models provided by the Search API module. These cover how content is indexed into Solr, how the backend is abstracted, and how search preprocessors (such as removing stop words from the query) are connected to the equivalent functions provided by Solr.
Now, notice that when we go to the configuration page for the Search API, there’s an error complaining that no backend is available for Search API. This is because:
- we have not installed the Solr-related plug-in/extension yet
- we have not got the Solr server/service running in DDEV yet
Step-3: Configuration of (Solr) Server
So without further ado, let’s get the Solr server running and the Search API server configured.
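A sketch of the add-on installation, assuming a recent DDEV release (older releases use `ddev get` instead of `ddev add-on get`):

```shell
# Download the Solr add-on into the project's .ddev folder
ddev add-on get ddev/ddev-solr

# Restart so the new solr service gets started
ddev restart

# Check that a solr service is now listed, with its URL and port
ddev describe
```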
For more information on how to configure ddev/ddev-solr, you might also want to have a look at the add-on’s README file: https://github.com/ddev/ddev-solr
Once the Solr service is running, we can install the “Search API Solr” module (and the Search API Solr Admin module) and use that service as the backend server for our Search API:
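A sketch of the install, assuming Drush is available. The admin module ships inside the Search API Solr project, so a single Composer package covers both:

```shell
# Download Search API Solr (includes the Search API Solr Admin submodule)
ddev composer require drupal/search_api_solr

# Enable the backend and the admin submodule
ddev drush en search_api_solr search_api_solr_admin -y
```

After this, add a new server from the Search API configuration page and pick Solr as the backend.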
Here we want to choose (example configuration screenshot: link):
- Solr Connector: Solr Cloud with Basic Auth
- Solr node: solr (found in the URL/Port column of ddev describe)
- Solr port: 8983 (found in the URL/Port column of ddev describe)
- Default Solr collection: example-solr-collection (choose any name you like; the collection will be created later when you run the drush command)
- HTTP Basic Auth (default username/password set by ddev/ddev-solr):
  - Username: solr
  - Password: SolrRocks
Once you are done and click “Save”, this is what you’ll see:
This is because the collection has not been created yet. To create it, either click the “Upload Configset” button, which uses the admin module we just installed to create the collection, or run the following drush command:
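Going by the add-on’s README, the command looks roughly like the following. `SEARCH_API_SERVER_ID` is a placeholder for the machine name of the Search API server you just saved. The second command is my own addition: it checks the result through Solr’s Collections API, using the node, port and credentials from the server settings:

```shell
# Upload the configset and create the collection with a single shard
ddev drush --numShards=1 search-api-solr:upload-configset SEARCH_API_SERVER_ID

# Optional: list the collections known to Solr (runs inside the web container)
ddev exec curl -s -u solr:SolrRocks "http://solr:8983/solr/admin/collections?action=LIST"
```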
(More instructions can be found at: https://github.com/ddev/ddev-solr?tab=readme-ov-file#drupal-and-search-api-solr)
Step-4: Configuration of Index
Now that our search “server” is ready to go, we need to configure the “index”. Here are some explanations of these terms to help with the understanding:
A “server” in Search API refers to the backend system or search engine where data is stored, indexed, and queried. It acts as the engine powering search functionality.
An “index” in Search API defines what content from your Drupal site should be indexed (stored) in the server and how it is structured for search purposes.
Here we’ll just index the “recipe” content type, which has some example instances by default:
Here we have chosen to index the fields: Title, Summary, Tags (taxonomy)
However, when you finish the setup by clicking “Save changes”, you might see an index status of “0/xx indexed”. This is because indexing is only triggered when the cron job runs, unless configured otherwise:
Now all the “article” nodes will be indexed and stored in the Solr server:
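If you would rather not wait for cron, indexing can also be triggered from the command line. A sketch, assuming Drush is installed and the collection is named `example-solr-collection` as in the server settings:

```shell
# Option 1: run cron, which indexes a batch of items
ddev drush cron

# Option 2: index all pending items right away
ddev drush search-api:index

# Show how many items are indexed per index
ddev drush search-api:status

# Optional: ask Solr directly how many documents it holds (look for numFound)
ddev exec curl -s -u solr:SolrRocks "http://solr:8983/solr/example-solr-collection/select?q=*:*&rows=0"
```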
Step-5: Create Search Page using View
Lastly, I want to consume this index on a page. Notice that instead of regular Drupal entities such as “content”, “user” or “taxonomy”, I’ve chosen “Index Article (Solr Index)”, which is the “index” we just created:
(Note that when you click on save, you might see a warning message that says “The selected caching mechanism does not work with views on Search API indexes. Use one of the Search API-specific caching options. The selected caching mechanism was changed accordingly for the view *Search Recipe (Solr)*.” This is because caching for search works differently from caching for regular Drupal pages. You can find the relevant settings under “Advanced > Caching”.)
I’ll use “full-text search” as the exposed filter where users enter their search query; Title, Summary and relevance score (generated by Search API or Solr when you perform the search) as the fields to display; and sort the items by their relevance score:
Now, as you can see below, when I search for the word delicious, two results come up: one with the exact spelling delicious, and one with its adverb form deliciously. Their relevance scores differ slightly because of this:
As we come to an end, I’ll show off some superpowers of having Solr in the equation. For instance, you can do:
Excerpt
- by ticking “Highlight” in the Processors tab of the index’s configuration (screenshot)
- and adding the “Excerpt” field to the view (screenshot)
Auto-Complete
- install the “Search API Autocomplete” and “Search API Solr Autocomplete” modules (screenshot)
- enable it in the “Autocomplete” tab of the Search API index configuration (screenshot)
Facets / Layered Filtering
- install and enable the “Facets” module (screenshot)
- create a new facet using the index field “Tag” (screenshot)
- place the facet block in the sidebar (screenshot)
- turn on replacing “value/key” with the label (screenshot)
(Optionally, you can also have the filter inside the view body via the “Facets Better Exposed” module, which depends on the “Better Exposed Filters” module.)
Searching within Attachment or Document (such as PDF, DOCX, TXT, CSV)
- install and enable the “Search API Attachments” module (link)
- configure the module to use the “Solr extractor” (screenshot)
- enable “file attachment option” in your index processors options (screenshot)
- configure the search api index to include field “Search api attachments: YOUR-FILE-FIELD-NAME” (screenshot)
- change the type of “Search api attachments: YOUR-FILE-FIELD-NAME” from “string” to “full-text” (screenshot)
- reindex by running cron, then search in your view (no special view configuration required)
(For more information, you can refer to the README.md file of the project: https://git.drupalcode.org/project/search_api_attachments)
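For convenience, the contributed modules mentioned in this section can be pulled in with Composer. This is a sketch using what I believe are the project names on drupal.org, so enable only the ones you actually need:

```shell
# Auto-complete
ddev composer require drupal/search_api_autocomplete

# Facets
ddev composer require drupal/facets

# Searching within attachments
ddev composer require drupal/search_api_attachments

# Enable them (search_api_solr_autocomplete ships with Search API Solr)
ddev drush en search_api_autocomplete search_api_solr_autocomplete facets search_api_attachments -y
```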
Reference
- WebWash - How to Build Powerful Search Pages in Drupal: guides you through the Search module (built-in), Search API + Database module (built-in), Search API + Solr, auto-complete, facets, etc. (useful timestamps: screenshot)
- ddev/ddev-solr - README.md: how to configure the DDEV Solr add-on with Drupal Search API.
- here are some other links that might be helpful as well:
- the traditional way of solr without ddev
- drupalize.me
- some drupal con demonstration