Connect Django Haystack to Solr Cloud

At BV FAPESP we use Solr as the search engine backend, and a library called Haystack to tie Solr to Django.

In 2018, my team and I wrote a Python/Django library to use with Apache Solr in cloud mode. We were avoiding the Django/Haystack library because it did not support some features, such as grouping, Streaming Expressions, and Graph Analysis.

So far so good: before the end of the project I had Solr Cloud running smoothly in production. But I still had a standalone Solr running with Haystack, because we didn't re-code the whole system and some legacy code still uses Haystack.

To turn off the standalone Solr, we moved all documents to Solr Cloud and connected Haystack to it. That is what I document here, for myself and maybe for you, if you are trying to do the same.


There is a Solr Cloud Python backend for Haystack, which you can find here:

Copy this file into the Haystack folder of your env/virtualenv, like this:

Use this

Zookeeper / Kazoo

You must use ZooKeeper in a production environment. Install Kazoo, the Python client for ZooKeeper:
pip install kazoo
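Before pointing Haystack at the cluster, it can help to confirm that Kazoo actually reaches your ZooKeeper. Below is a minimal sketch, not part of the backend itself: `read_znode_json` and `decode_znode` are hypothetical helper names, and the host string in the usage note is a placeholder you replace with your own.

```python
import json


def decode_znode(data):
    """Decode the raw bytes stored in a znode as JSON (empty znode -> {})."""
    return json.loads(data.decode("utf-8")) if data else {}


def read_znode_json(zk_hosts, path, timeout=10.0):
    """Connect to ZooKeeper, read one znode and return its JSON payload.

    zk_hosts is a comma-separated "host:port" string for your own cluster.
    """
    # Imported lazily so this module still loads where kazoo is not installed.
    from kazoo.client import KazooClient

    client = KazooClient(hosts=zk_hosts)
    client.start(timeout=timeout)  # raises if the connection cannot be made
    try:
        data, _stat = client.get(path)  # returns (bytes, ZnodeStat)
    finally:
        client.stop()
        client.close()
    return decode_znode(data)
```

For example, `read_znode_json("localhost:2181", "/collections/gettingstarted/state.json")` should return a dictionary describing the collection, assuming a ZooKeeper listening on that (placeholder) address.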


For the configuration below, follow the comments here:

Put these lines in your settings/ with the values for your infrastructure.

    HAYSTACK_CONNECTIONS = {
        'default': {
            'ENGINE': 'solrcloud_backend.SolrCloudEngine',
            'URL': '',  # the ZooKeeper address goes here, not a Solr URL
            'COLLECTION': 'gettingstarted',  # example SolrCloud collection
        },
    }

Then use these settings in:


ZooKeeper.CLUSTER_STATE = '/collections/{}/state.json'.format(connection_options['COLLECTION'])
zookeeper = ZooKeeper(connection_options['URL'])
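To make the two lines above more concrete, here is a small sketch of what the per-collection path looks like and how the active Solr nodes could be picked out of the state document. It assumes the `state.json` layout of Solr versions that record a `base_url` for each replica; the helper names and the sample data (hosts `10.0.0.1` and `10.0.0.2`) are made up for illustration.

```python
def cluster_state_path(collection):
    """Build the ZooKeeper path of a collection's state document."""
    return "/collections/{}/state.json".format(collection)


def active_replica_urls(state, collection):
    """Return the collection URLs of all replicas marked as active."""
    urls = set()
    shards = state.get(collection, {}).get("shards", {})
    for shard in shards.values():
        for replica in shard.get("replicas", {}).values():
            # Skip replicas that are down/recovering or have no base_url.
            if replica.get("state") == "active" and "base_url" in replica:
                base = replica["base_url"].rstrip("/")
                urls.add("{}/{}".format(base, collection))
    return sorted(urls)


# Hypothetical state document with one healthy and one down replica.
SAMPLE_STATE = {
    "gettingstarted": {
        "shards": {
            "shard1": {
                "state": "active",
                "replicas": {
                    "core_node1": {"base_url": "http://10.0.0.1:8983/solr", "state": "active"},
                    "core_node2": {"base_url": "http://10.0.0.2:8983/solr", "state": "down"},
                },
            }
        }
    }
}

if __name__ == "__main__":
    print(cluster_state_path("gettingstarted"))
    print(active_replica_urls(SAMPLE_STATE, "gettingstarted"))
```

This is only to show why the collection name has to be part of the path: each collection keeps its own state document, so the backend needs `COLLECTION` to know which one to watch.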

If you use uwsgi, take care to set PATH and PYTHONPATH correctly whenever you change your environment / virtualenv.

This is a reminder for myself, for when I create a new virtualenv and copy over one that is already configured.
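A quick way to check that a process is running in the virtualenv you expect is to ask Python itself. The sketch below is just a sanity check I would run from the Django shell or log at startup; `describe_environment` is a hypothetical helper, and `solrcloud_backend` is the module name taken from the ENGINE setting above.

```python
import importlib.util
import sys


def describe_environment(module_name):
    """Report which interpreter is running and whether a module is importable."""
    spec = importlib.util.find_spec(module_name)  # None if not on sys.path
    return {
        "executable": sys.executable,  # interpreter actually in use
        "prefix": sys.prefix,          # root of the active env/virtualenv
        "module_found": spec is not None,
        "module_path": spec.origin if spec is not None else None,
    }


if __name__ == "__main__":
    for key, value in describe_environment("solrcloud_backend").items():
        print("{}: {}".format(key, value))
```

If `module_found` is False or `prefix` points to the old virtualenv, the PATH / PYTHONPATH seen by uwsgi is probably still the previous one.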
