Feature #3002
Overhaul CAC search by using an external search appliance
0%
Description
We currently use a Google Custom Search Engine for our primary search. This is bad in a number of ways:
1. It's Google
2. It's not faceted in ways that make sense for the Commons (i.e., you can't distinguish between relevant content types, like Forum Posts vs. Blog Posts)
3. It doesn't index private/hidden content
By using an external search appliance, like Sphinx or Elasticsearch, we can greatly enhance our search experience. In addition to improved discoverability of our content, we could leverage a search appliance in the future for topic modeling, data mining, recommendation engines, etc.
Requirements:
a. Should have a robust API, ideally with a PHP or even a WordPress integration already in the wild for us to start with
b. The search appliance itself should be fairly easy to get up and running, because it's likely that we'll have to do it ourselves given our lack of André
c. We should be able to index content in such a way that it respects differences in content type (groups, group forum posts, blog comments, wiki pages, etc.)
d. We should be able to filter results based on privacy/visibility settings (this could happen either in the index or on the WordPress end)
I'm leaning toward Elasticsearch, as I think it meets all these criteria, and it just generally seems neat.
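If we do go with Elasticsearch, requirements (c) and (d) could probably be handled with a filtered bool query plus a terms aggregation for the facets. The following is only a rough sketch in Python of what the request body might look like. The field names (content_type, visibility, title, body) and the helper build_search_body are hypothetical, not taken from any existing plugin, and the real mapping would be settled when we design the index.

```python
from typing import Any, Dict, List, Optional

# Hypothetical field names: the real index mapping is still to be designed.
# Both fields are assumed to be indexed as exact-match (keyword) values.
CONTENT_TYPE_FIELD = "content_type"
VISIBILITY_FIELD = "visibility"


def build_search_body(
    text: str,
    content_types: Optional[List[str]] = None,
    allowed_visibility: Optional[List[str]] = None,
) -> Dict[str, Any]:
    """Build an Elasticsearch query DSL body as a plain dict.

    content_types      -- restrict results to these types (None means all types)
    allowed_visibility -- visibility levels the current user may see;
                          defaults to public content only
    """
    filters: List[Dict[str, Any]] = []

    # Requirement (c): narrow the results to the selected content types.
    if content_types:
        filters.append({"terms": {CONTENT_TYPE_FIELD: content_types}})

    # Requirement (d): apply privacy filtering inside the index query.
    filters.append(
        {"terms": {VISIBILITY_FIELD: allowed_visibility or ["public"]}}
    )

    return {
        "query": {
            "bool": {
                # Full-text match against the assumed title and body fields.
                "must": [
                    {"multi_match": {"query": text, "fields": ["title", "body"]}}
                ],
                # Filters restrict the result set without affecting scoring.
                "filter": filters,
            }
        },
        # Facet counts per content type, computed over the filtered results.
        "aggs": {
            "by_content_type": {"terms": {"field": CONTENT_TYPE_FIELD}}
        },
    }
```

For example, `json.dumps(build_search_body("zotero", content_types=["forum_post"]), indent=2)` would produce the JSON to send to the index's `_search` endpoint. Filtering on the WordPress end instead would mean dropping the visibility filter here and checking permissions on each hit after the results come back.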
My plan of attack is to set up a test instance somewhere, and then take the existing WP integration plugin (http://wordpress.org/plugins/wp-elasticsearch/) and start building it out to handle BuddyPress and WP Multisite more elegantly. I'll spend a few hours exploring to get a sense of just how big a job it'll be, and then we can think about milestones.
Related issues