
free-text searching

An article by Gaspard Bucher

Sphinx Search

The “sphinx” brick improves the search experience by adding full-text indexed search, using the sphinxsearch indexer as the engine.

Sphinx search has some requirements, though. It needs:
  • a running worker (for partial updates),
  • a cron job (for full updates), and
  • the external sphinx daemon.

This feature is important for many intranet applications, but seldom necessary for websites.

We had some issues with delayed deltas when using thinking-sphinx on Debian. If you have problems too, or manage to get the deltas working, please let us know!

  1. install sphinx
  2. install the thinking sphinx gem
  3. configure boot script

boot script

When everything is installed and working, you need to make sure the sphinx daemon is started on reboot. (old boot scripts)

Getting started

Once you have Sphinx and thinking-sphinx installed, the “sphinx” brick is automatically activated and all fulltext searches will go through Sphinx (you can change this setting in config/bricks.yml if you do not want it).

Once sphinx is activated, you can use “sphinx” in sqliss for fulltext search.

nodes where sphinx match #{params[:q]} in site

capistrano on debian server

Just deploy your application and… voilà: the sphinx search daemon is started for you, a cron task is added to reindex your data, and a worker is up and running for the deltas.

In case you run into trouble or need to execute the commands by hand, read on…

sphinx related

Setup:

# rake RAILS_ENV=production sphinx:setup

Start search daemon:

# rake RAILS_ENV=production sphinx:start

Stop search daemon:

# rake RAILS_ENV=production sphinx:stop

To index your data, you have to run:

# rake RAILS_ENV=production sphinx:index

You can use the setup_indexer rake task to install/update the cron job for you.
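
For reference, a hand-written cron entry for the full reindex might look like the line printed by the sketch below. The helper name sphinx_cron_line, the nightly 03:00 schedule and the application path are assumptions made for illustration; the entry that setup_indexer actually installs may differ.

```shell
# Hypothetical helper: print a crontab line that runs the full reindex
# every night at 03:00 (the schedule is an assumption, not the brick's default).
# $1 is the application root, i.e. the directory that contains the Rakefile.
sphinx_cron_line() {
  app_dir="$1"
  printf '0 3 * * * cd %s && rake RAILS_ENV=production sphinx:index\n' "$app_dir"
}
```

Calling sphinx_cron_line /path/to/app prints the entry, which you can review and then add with crontab -e.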

worker (delta indexes)

If you have installed the “delayed_job” gem, the “worker” brick is activated and delta indexes are resolved with the “delayed_job” plugin. You should therefore start a worker to handle the delta indexes (all changes that happen between two full indexes). Something like this starts the worker:

# rake RAILS_ENV=production worker:start

To stop the worker:

# rake RAILS_ENV=production worker:stop
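
If you bring everything up by hand, the sphinx and worker tasks above could be chained as in the following sketch. The function names, the DRY_RUN switch and the order of the tasks (indexing before the daemon is started) are assumptions, not something the bricks prescribe.

```shell
#!/bin/sh
# Sketch only: chain the sphinx and worker rake tasks listed above.
# Assumptions: run from the application root; the task order is a guess.
RAILS_ENV="${RAILS_ENV:-production}"

# Run one rake task, or only print the command when DRY_RUN=1.
run_task() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "rake RAILS_ENV=$RAILS_ENV $1"
  else
    rake "RAILS_ENV=$RAILS_ENV" "$1"
  fi
}

# Stop at the first task that fails.
search_up() {
  run_task sphinx:setup &&
  run_task sphinx:index &&
  run_task sphinx:start &&
  run_task worker:start
}
```

With DRY_RUN=1 set, search_up only prints the four commands, which is a cheap way to check the sequence before running it for real.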

capistrano

When the brick is activated on your development machine, the sphinx tasks are inserted into the deployment process, starting and stopping the search daemon as needed.

If you have enabled sphinx in production mode on your development machine, this will automatically activate sphinx on the server. Make sure sphinx and thinking-sphinx are installed there!

For every rake task listed above, there is an equivalent capistrano task. These tasks are:

  • sphinx_setup, sphinx_start, sphinx_stop, sphinx_setup_indexer
  • worker_start, worker_stop

If you want to see the details, have a look at the capistrano recipes in bricks/sphinx/deploy.rb and bricks/worker/deploy.rb, and at the rake tasks in bricks/sphinx/tasks.rb and bricks/worker/tasks.rb.

comments

  1. Saturday, February 12 2011 15:36 Gaspard

    You need to install delayed_job 1.8.4
    gem install delayed_job --version=1.8.4

  2. Saturday, February 12 2011 15:42 Gaspard

    You also need thinking-sphinx 1.3.16
    gem install thinking-sphinx --version=1.3.16
