A SearchableModel for App Engine
SearchableModel, a nice little extension to db.Model, can be found in the App Engine SDK. It's a lightweight full-text index that can give your app some search capability. Using it is simple. When declaring your model class, replace db.Model with search.SearchableModel:
    from google.appengine.ext import db
    from google.appengine.ext import search

    class Article(search.SearchableModel):
        some_searchable_prop = db.StringProperty()
        another_big_searchable = db.TextProperty()
        ...
Then in your handler, use the search() method:
    query = Article.all().search('something').filter(...)
SearchableModel is a subclass of db.Model that scans every property whose data type derives from basestring. From those it builds a __searchable_text_index property, which holds the list of words associated with the model instance.
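To make the idea concrete, here is a minimal sketch of that kind of keyword extraction in plain Python. It is not the SDK's code: the names extract_keywords, MIN_WORD_LENGTH and STOP_WORDS are made up for illustration, the stop-word list is a tiny placeholder, and the sketch assumes words of at least the minimum length are kept.

```python
import re
import string

# Hypothetical stand-ins for the SDK's settings; the real ones live in search.py.
MIN_WORD_LENGTH = 3
STOP_WORDS = frozenset(['the', 'and', 'for', 'with'])

# Matches any single ASCII punctuation character.
_PUNCTUATION = re.compile('[' + re.escape(string.punctuation) + ']')


def extract_keywords(properties):
    """Return the set of lowercase words found in string-valued properties."""
    keywords = set()
    for value in properties.values():
        # Only string values contribute words (basestring on Python 2).
        if not isinstance(value, str):
            continue
        text = _PUNCTUATION.sub(' ', value).lower()
        for word in text.split():
            if len(word) >= MIN_WORD_LENGTH and word not in STOP_WORDS:
                keywords.add(word)
    return keywords


# A plain dict stands in for a model instance's property values.
article = {
    'title': 'Searching with App Engine',
    'views': 42,
    'body': 'The datastore stores a list of words.',
}
print(sorted(extract_keywords(article)))
# ['app', 'datastore', 'engine', 'list', 'searching', 'stores', 'words']
```

The integer property is skipped, short words and stop words are dropped, and what remains is the kind of word list that would be stored alongside the entity.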
Some caveats apply:
- If you try to index too much text, you might run into CPU quota issues on put() as it builds the index. Background processing, if it gets offered in the future, would address this issue.
- A single entity (one model instance) can have at most 5000 indexed property values. This can severely limit your keywords if you have a large number of indexed properties, and ListProperties take a particular toll. Text and Blob properties aren't indexed themselves, but SearchableModel still generates keywords from your text properties.
- SearchableModel entities currently don't display well in the App Engine data viewers because of the size of __searchable_text_index.
- Multi-word searches seem broken for now because indexing the same property multiple times is not available until the next SDK release.
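As a rough illustration of the 5000-value cap, the sketch below estimates how many keyword slots are left once other indexed values are counted. The counting model is an assumption for illustration (one value per indexed scalar property, one per list item), and keyword_budget and trim_keywords are hypothetical helpers, not part of the SDK.

```python
MAX_INDEXED_VALUES = 5000  # the cap quoted in the caveat above


def keyword_budget(indexed_scalar_count, list_property_lengths):
    """Estimate how many index slots remain for full-text keywords.

    Assumes each indexed scalar property uses one slot and each
    item in a list property uses one slot.
    """
    used = indexed_scalar_count + sum(list_property_lengths)
    return max(0, MAX_INDEXED_VALUES - used)


def trim_keywords(keywords, budget):
    """Keep at most `budget` keywords, sorted so the result is deterministic."""
    return sorted(keywords)[:budget]


# Ten indexed scalar properties plus two list properties (200 and 50 items).
budget = keyword_budget(10, [200, 50])
print(budget)  # 4740
print(trim_keywords({'engine', 'app', 'search'}, 2))  # ['app', 'engine']
```

Trimming alphabetically is only the simplest deterministic choice; a real app would probably prefer to keep the words most likely to be searched for.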
    from models import search  # This is my search module

    class Article(search.SearchableModel):
        unsearchable_properties = [
            'permalink', 'legacy_id', 'excerpt',
            'article_type', 'html', 'format']
        ...
Any properties listed in unsearchable_properties are ignored for full-text indexing. This is useful if you have properties derived from basestring that are either altered versions of other properties (like html created from a body property) or strings that aren't useful as keywords. You can also raise _FULL_TEXT_MIN_LENGTH (in search.py) from its default of 3 to increase the minimum length of indexed keywords.
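The effect of those two knobs can be sketched in standalone Python. This is not the code in my search module: indexable_words is a hypothetical function, a plain dict stands in for the model's properties, and the sample values are invented.

```python
def indexable_words(properties, unsearchable=(), min_length=3):
    """Collect lowercase words from string properties not in `unsearchable`."""
    words = set()
    for name, value in properties.items():
        # Skip excluded properties and anything that is not a string.
        if name in unsearchable or not isinstance(value, str):
            continue
        words.update(w for w in value.lower().split() if len(w) >= min_length)
    return words


article = {
    'body': 'Bloog runs on App Engine',
    'html': '<p>Bloog runs on App Engine</p>',   # derived from body
    'permalink': '2008/searchable-model',        # not useful as a keyword
}

# With no exclusions, markup and the permalink pollute the keyword set.
print(sorted(indexable_words(article)))

# Excluding the derived and non-keyword properties leaves only real words.
print(sorted(indexable_words(article, unsearchable=['html', 'permalink'])))
# ['app', 'bloog', 'engine', 'runs']

# Raising the minimum length drops the three-letter word as well.
print(sorted(indexable_words(article, ['html', 'permalink'], min_length=4)))
# ['bloog', 'engine', 'runs']
```

Both settings shrink the keyword list, which also helps with the indexed-value cap mentioned in the caveats.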
My TODO list includes filtering code snippets out of the full-text indexing, so for this article, all the code fragments would be ignored. You can see the results by typing single-word searches in the search box to your right. (Try "rails".)

This search feature was incorporated into Bloog as a learning exercise, but the previous Google Ajax Search API is a better fit for this public-facing blog. Earlier versions of Bloog had this Ajax search incorporated and I may resurrect it in the future.
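For the TODO item about keeping code fragments out of the index, one possible approach is to strip them from the HTML before extracting keywords. The sketch below assumes code fragments are wrapped in pre or code tags, which may not match how every article is stored; strip_code_snippets is a hypothetical helper, and a regex is only a rough tool for HTML.

```python
import re

# Assumption: code fragments live inside <pre>...</pre> or <code>...</code>.
_CODE_BLOCK = re.compile(r'<(pre|code)\b[^>]*>.*?</\1>',
                         re.DOTALL | re.IGNORECASE)
_TAG = re.compile(r'<[^>]+>')


def strip_code_snippets(html):
    """Return the text of `html` with code fragments and tags removed."""
    without_code = _CODE_BLOCK.sub(' ', html)   # drop whole code blocks
    text = _TAG.sub(' ', without_code)          # drop the remaining tags
    return ' '.join(text.split())               # collapse whitespace


html = ('<p>Use the search method.</p>'
        '<pre><code>query = Article.all()</code></pre>'
        '<p>Done.</p>')
print(strip_code_snippets(html))
# Use the search method. Done.
```

The cleaned text could then be fed to the keyword extraction step in place of the raw property value.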