Connect Django Haystack to Solr Cloud
At BV FAPESP (www.bv.fapesp.br) we use Solr as the search engine backend, and a library called Haystack to tie Solr to Django.
In 2018, my team and I wrote a Python/Django library to use with Apache Solr in cloud mode. We avoided the Django/Haystack library because it did not support some features we needed, such as grouping, Streaming Expressions, and Graph Analysis.
So far so good: before the end of the project I had Solr Cloud running smoothly in production. But I still had a single Solr instance running with Haystack, because we didn't re-code the whole system and some legacy code still uses Haystack.
To turn off the single Solr instance, we moved all documents to Solr Cloud and connected Haystack to it. This is what I documented here, for myself and maybe for you, if you are trying to do the same.
Step-by-step
There is a Solr Cloud Python backend for Haystack, which you can find here:
https://github.com/django-haystack/django-haystack/pull/1580/commits/13df4a9e69ececd5567636085df4e353ce540a35
Copy this file to your Haystack env/virtualenv folder, like this:
/lib/python2.7/site-packages/haystack/backends/solrcloud_backend.py
Use this pysolr.py:
https://github.com/django-haystack/pysolr/blob/master/pysolr.py
Zookeeper / Kazoo
You must use ZooKeeper in a production environment. Install the Kazoo client:
pip install kazoo
Configuring
For the configuration below, follow the comments here:
https://github.com/django-haystack/django-haystack/pull/1580#issuecomment-399378902
Put these lines in your settings/local_settings.py, adjusted to your infrastructure:
HAYSTACK_CONNECTIONS = {
    'default': {
        'ENGINE': 'solrcloud_backend.SolrCloudEngine',
        'URL': '127.0.0.1:9983',  # this is the ZooKeeper
        'COLLECTION': 'gettingstarted',  # example SolrCloud collection
    },
}
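An easy mistake here is to put the Solr HTTP address in URL instead of the ZooKeeper one. Below is a minimal sketch of a sanity check for the entry above. Note that check_solrcloud_connection is a hypothetical helper of mine, not part of Haystack or the pull request, and it assumes URL is a ZooKeeper connection string of the form host:port[,host:port...][/chroot].

```python
import re

# Keys the settings entry above uses.
REQUIRED_KEYS = ("ENGINE", "URL", "COLLECTION")
# Assumption: each ZooKeeper server is written as host:port.
HOST_PORT = re.compile(r"^[A-Za-z0-9._-]+:\d{1,5}$")


def check_solrcloud_connection(options):
    """Return a list of problems found in one HAYSTACK_CONNECTIONS entry.

    An empty list means the entry looks plausible. Nothing is contacted
    over the network; this only inspects the dictionary.
    """
    problems = []
    for key in REQUIRED_KEYS:
        if not options.get(key):
            problems.append("missing %s" % key)

    url = options.get("URL", "")
    if url:
        # Drop an optional chroot suffix such as "/solr" before checking hosts.
        hosts_part = url.split("/", 1)[0]
        for hostport in hosts_part.split(","):
            if not HOST_PORT.match(hostport.strip()):
                problems.append("URL entry %r is not host:port" % hostport)
    return problems


if __name__ == "__main__":
    entry = {
        "ENGINE": "solrcloud_backend.SolrCloudEngine",
        "URL": "127.0.0.1:9983",
        "COLLECTION": "gettingstarted",
    }
    print(check_solrcloud_connection(entry))
```

Something like this could run from a management command or a test, so a wrong address shows up before the first search request does.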
Change these settings in pysolr.py to:
[...]
ZooKeeper.CLUSTER_STATE = '/collections/{}/state.json'.format(connection_options['COLLECTION'])
zookeeper = ZooKeeper(connection_options['URL'])
[...]
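To see what that CLUSTER_STATE path points at, here is a rough sketch that reads the per-collection state.json from ZooKeeper with Kazoo and lists the active replicas. The function names are mine, not pysolr's, and the JSON layout (collection, then shards, then replicas with base_url, core and state) is an assumption based on the Solr versions I have seen; newer releases may lay it out differently.

```python
import json


def cluster_state_path(collection):
    """Build the ZooKeeper node path used for CLUSTER_STATE above."""
    return "/collections/{}/state.json".format(collection)


def active_replica_urls(state_json, collection):
    """Return the sorted core URLs of replicas marked active in state.json."""
    state = json.loads(state_json)
    urls = []
    for shard in state[collection]["shards"].values():
        for replica in shard["replicas"].values():
            # Skip replicas that are down/recovering or lack a base_url.
            if replica.get("state") == "active" and "base_url" in replica:
                urls.append("{}/{}".format(replica["base_url"], replica["core"]))
    return sorted(urls)


def read_cluster_state(zk_hosts, collection):
    """Fetch state.json from a live ZooKeeper (requires `pip install kazoo`)."""
    from kazoo.client import KazooClient  # imported here so parsing works without kazoo

    client = KazooClient(hosts=zk_hosts)
    client.start()
    try:
        data, _stat = client.get(cluster_state_path(collection))
    finally:
        client.stop()
        client.close()
    return data.decode("utf-8")


# Hand-written sample, shaped like the assumed layout (not real cluster output).
SAMPLE_STATE = json.dumps({
    "gettingstarted": {
        "shards": {
            "shard1": {"replicas": {"core_node1": {
                "core": "gettingstarted_shard1_replica1",
                "base_url": "http://127.0.0.1:8983/solr",
                "state": "active"}}},
            "shard2": {"replicas": {"core_node2": {
                "core": "gettingstarted_shard2_replica1",
                "base_url": "http://127.0.0.1:7574/solr",
                "state": "down"}}},
        }
    }
})

if __name__ == "__main__":
    print(active_replica_urls(SAMPLE_STATE, "gettingstarted"))
```

Against a real cluster, active_replica_urls(read_cluster_state('127.0.0.1:9983', 'gettingstarted'), 'gettingstarted') should list the cores Haystack can reach. If it raises a NoNodeError, the collection name or the ZooKeeper address is probably wrong.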
wsgi.py
If you use uWSGI, take care to set PATH and PYTHONPATH correctly when you change your environment / virtualenv. This is for me to remember when creating a new virtualenv and copying an already configured wsgi.py.
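One thing that can go wrong with a fresh virtualenv is that the files copied by hand in the steps above are no longer there. This is a small sketch of a check for that, not something from the pull request. The helper names are hypothetical, and the layout lib/pythonX.Y/site-packages with a single-file pysolr.py is an assumption about a typical virtualenv.

```python
import os


def site_packages_dir(venv_root, python_version="2.7"):
    """Return the site-packages folder of a virtualenv (assumed Unix layout)."""
    return os.path.join(venv_root, "lib", "python" + python_version, "site-packages")


def missing_backend_files(venv_root, python_version="2.7"):
    """List the hand-copied files that are absent from the virtualenv."""
    base = site_packages_dir(venv_root, python_version)
    expected = [
        os.path.join(base, "haystack", "backends", "solrcloud_backend.py"),
        os.path.join(base, "pysolr.py"),
    ]
    return [path for path in expected if not os.path.isfile(path)]


if __name__ == "__main__":
    # "/path/to/venv" is a placeholder; point it at your own virtualenv.
    for path in missing_backend_files("/path/to/venv"):
        print("missing: " + path)
```

Running it before restarting uWSGI makes it obvious whether the new environment still needs the backend and pysolr.py copied in.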