Fork me on GitHub

Indexing and Searching

Some missing introductory info here:

Finding what you are looking for by going though all records and checking the fields is slow… even for a computer if there are many records. Just like in books, things can be found much faster you have an nicely ordered index that can point you to the relevant locations.

What gets indexed by default?

What are the idx_ fields in the properties definition for?
Zena provides some predefined fields in the nodes table that are indexed (these fields start with “idx_” , for example, idx_datetime1, idx_integer1, etc). If a custom property is assigned to one of those idx_ fields, the selected index gets populated with the properties’ values, and SQLiss queries can be used to find nodes based on that property. Beware of conflicts, when assigning different properties of a node to the same idx_ field! They will delete and overwrite each other (the classes prop_eval is also run on index rebuilds)

What are the idx_ fields in the class definitions for?
To have your custom fields indexed (or “serialized”?), edit the virtual class definitions and add the field names? to idx_text_[high/medium/low] (or leave blank to use title/summary/text)???

What happens if the same idx field is selected for different properties? The field is overwritten with each property index update (on writes or rebuild_index).

What kind of queries are considered “fulltext search queries” (handed to sphinx if available)? can postgress fulltext indexing be used?

Data modeling rules

Query possibilities and performance depend, to a large extent, on the structure used for the data model. It is therefore good to know some guidelines.

By default, class instance properties are not indexed, but the instances themselves are. Thus, you are all set if you use classes, relations and roles for things you need to filter on in queries.

Klasses
  • Are used to create node instances (entries or records).
  • Every node has a title, summary, text, and administrative properties.
  • Klasses can have a specific template.
  • A node can only be of one class (but it includes the parent classes).
  • A node can have multiple roles.
Relations (links)
  • Are used to create links between node instances (only one per pair?).
  • Every link has a _comment, _status, and a _date? field.
  • Are explicitly set/unset (assigned).
  • Allow to group nodes of different classes (with a generic relation definition).
Roles (property sets)
  • Can be attached to base classes like Node or Document.
  • A node can have multiple roles and it helps grouping properties.
  • Are assigned simply by saving a corresponding property (no drive permissions required like for changing class).

If you need to process the data according to related IDs, use ralation links (because they are indexed?) instead of saving IDs in property fields.

Nodes of the Page class are usually listed in menus (default template), but not Notes.

Some scattered info on index and search facilities in zena

search_box (what does fulltext mean in the match query there? The “match” example produces a rendering error!)

filters (How do “like” and “match” clauses differ?)

free-text searching

indexing (seems not to be linked from anywhere else)

New scope index

query_parse

Select and having clauses in SQLiss

SQL-only Search

With SQLiss queries the idx fields are not accessible, because they are not located in the nodes table, but inside the “versions” table. (? how does the following chage this?)

To still get something close to Sphinx, you can create a property (say “class_search_dat”) and have it indexed in a string index (for example idx_string1). Then, in your models, you use prop_eval to automatically fill the “class_search_dat” field with the values from the properties you want to have in the index. Finally, you use for example

contacts where class_search_dat like "%#{params[:q]}%" in site

to do the query.

We use this technique a lot to index contacts for example. Together with a live filter search, it makes finding persons really easy and fast.

Example query for a multiple words search from a single text field input?

This is really just a way to search for substrings, not a fulltext index search with proper ranking.
Also, the “class_search_dat” index in idx_string1 is only 255 chars long for performance reasons, and it requires you to manually maintain the prop_eval definition when property definitions change, and to save the node again before changes take effect.

Fulltext search with ranking

Install sphinx and this will cause zena to …?

Example fulltext search query