April 8th, 2008

Google Search Appliance / Google Mini Integration for Drupal

Google’s enterprise search technology is becoming an increasingly popular choice for IT managers to manage their intranets and pool data from multiple sources.

It provides:

  • Excellent keyword searching (obviously) based on PageRank as well as customizable weighting factors
  • Recommended links
  • Source and Date Biasing
  • Good support for different character sets
  • Incredible speed and performance
  • Support for multiple collections and frontends
  • An easy REST interface for API integration

Leveraging Google’s technology on your CMS (Drupal) gives you the following benefits:

  • Potential for more relevant results
  • Integration with third-party databases and sites
  • Advanced search features like synonyms, stemming and language detection

How the module works

See the Google Appliance for Drupal module page for more details on implementation.

The first thing you need to do is set up the module so it knows where to connect:

[Screenshot: Google Appliance settings page]

At the minimum, you need:

  • Search name (this will appear on a new tab on the search screen).
  • Host name (the URL or IP address where your GSA or Mini is located).
  • Collection (which collection you wish to search).
  • Client (this doesn’t matter much, it just has to be valid; it is equivalent to the “frontend” in the GSA).
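
To make these settings concrete, here is a minimal Python sketch (not the module's actual PHP code) of how the host name, collection and client could end up in a search request. The q, site, client, output, start and num parameters are from Google's search protocol as I understand it; the function name, host and collection values are made up for illustration.

```python
from urllib.parse import urlencode


def build_search_url(host, collection, client, keywords, start=0, num=10):
    """Build a GSA / Mini search URL from the settings above (hypothetical helper)."""
    params = {
        "q": keywords,           # the search keywords
        "site": collection,      # the collection to search
        "client": client,        # the frontend; it only has to be valid
        "output": "xml_no_dtd",  # ask for raw XML instead of rendered HTML
        "start": start,          # offset of the first result
        "num": num,              # number of results per page
    }
    return "http://%s/search?%s" % (host, urlencode(params))


# Example values only; replace them with your own appliance settings.
url = build_search_url(
    "gsa.example.com", "default_collection", "default_frontend", "drupal modules"
)
print(url)
```

Asking for XML output is what lets the calling side theme the results itself rather than showing Google's own result page.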

Okay, done? If you want to, enable caching (which will cache results so you don’t need to re-query for the same search within the timeout period) and set the debug level.
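
The caching idea can be sketched roughly like this in Python. This is an illustration of the technique only, not the module's implementation: the class name, the in-memory dictionary and the injectable clock are all assumptions.

```python
import time


class ResultCache:
    """Keep search results for a fixed timeout so repeat searches skip the appliance."""

    def __init__(self, timeout_seconds, clock=time.time):
        self.timeout = timeout_seconds
        self.clock = clock   # injectable so the timeout can be tested
        self._entries = {}   # search key -> (time stored, results)

    def get(self, key, fetch):
        """Return cached results for key, calling fetch() only on a miss or expiry."""
        now = self.clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self.timeout:
            return entry[1]
        results = fetch()
        self._entries[key] = (now, results)
        return results


# Usage: the second identical search within the timeout does not re-query.
query_count = []


def fake_query():
    query_count.append(1)
    return ["node/123"]


cache = ResultCache(timeout_seconds=300)
cache.get("drupal modules", fake_query)
cache.get("drupal modules", fake_query)
print(len(query_count))
```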

Now you need to tell the Mini where to crawl. Go to your GSA administration screen and enter the URL of your site. For node pages, the module adds meta tags for the following:

  • Taxonomy (Advanced search filter coming soon!)
  • Date Modified and Created (Date sorting coming soon!)
  • Author
  • Status (published/unpublished)
  • Language (if using i18n)
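
As a rough idea of what such tags could look like, here is a small Python sketch that renders them for one node. The tag names and the node fields are hypothetical; the module's real names may differ, so check its source before relying on any of them.

```python
from html import escape


def node_meta_tags(node):
    """Render <meta> tags for a node dictionary (hypothetical field and tag names)."""
    fields = [
        ("category", ", ".join(node.get("taxonomy", []))),
        ("created", node.get("created", "")),
        ("modified", node.get("changed", "")),
        ("author", node.get("author", "")),
        ("status", "published" if node.get("status") else "unpublished"),
        ("language", node.get("language", "")),
    ]
    # Skip empty values and escape the rest so they are safe inside an attribute.
    return "\n".join(
        '<meta name="%s" content="%s" />' % (name, escape(str(value)))
        for name, value in fields
        if value
    )


example_node = {
    "taxonomy": ["Search", "Google"],
    "created": "2008-04-08",
    "changed": "2008-04-08",
    "author": "Jacob Singh",
    "status": 1,
}
print(node_meta_tags(example_node))
```

The appliance picks these tags up while crawling, which is what makes it possible to show or filter on fields such as author later.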

After installing the module, you will see a new tab on the search screen:

[Screenshot: Google tab on the search screen]

Fire off a search and see your results (drupalified).

[Screenshot: Google search results]

In addition, you can enable the recommended links block. If you have configured key matches in Drupal, they will show up in this block.
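
For a feel of where the results and the recommended links come from, here is a Python sketch that pulls both out of an XML response. The element names (GM, GL, GD, RES, R, U, T, S, MT) are as I recall them from Google's XML results format; the sample response is hand-written and the function is an assumption, not the module's code.

```python
import xml.etree.ElementTree as ET

# Hand-written sample, heavily trimmed compared to a real appliance response.
SAMPLE_RESPONSE = """<GSP VER="3.2">
  <GM>
    <GL>http://example.com/handbook</GL>
    <GD>Drupal handbook</GD>
  </GM>
  <RES SN="1" EN="1">
    <R N="1">
      <U>http://example.com/node/123</U>
      <T>Installing modules</T>
      <S>How to install a contributed module</S>
      <MT N="author" V="jacob"/>
    </R>
  </RES>
</GSP>"""


def parse_response(xml_text):
    """Return (key_matches, results) from a GSA-style XML response."""
    root = ET.fromstring(xml_text)
    # Key matches feed the recommended links block.
    key_matches = [
        {"url": gm.findtext("GL", ""), "title": gm.findtext("GD", "")}
        for gm in root.findall("GM")
    ]
    results = []
    for r in root.findall("RES/R"):
        results.append({
            "url": r.findtext("U", ""),
            "title": r.findtext("T", ""),
            "snippet": r.findtext("S", ""),
            # Meta tags found on the crawled page, e.g. author.
            "meta": {mt.get("N"): mt.get("V") for mt in r.findall("MT")},
        })
    return key_matches, results


key_matches, results = parse_response(SAMPLE_RESPONSE)
print(key_matches)
print(results)
```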

This module is still in beta, so expect some issues. Here are a few I know of:

  • Non-node pages get no meta tags, so they will be found but will not show the type, author and other fields in the results
  • Incoming links are not passed through url(), so if the Mini finds node/123 pages, the results will not use their aliases

There are probably lots more, but hopefully, this will get people who are interested in this going and we can work on making it better.

I am available for GSA and Mini consulting and custom integrations. Just contact me.

6 Responses to “Google Search Appliance / Google Mini Integration for Drupal”

  1. Ken H. says:

    This seems really great. The project I’m working on isn’t (yet :) ) large enough to justify having a google box yet. But when it does, I really look forward to trying this out.

    Keep up the good work!

  2. Adam says:

    I am getting an error:
    call to undefined function curl_init() in GoogleMini.php at line 287

    Any help?

    Thanks
    Adam

  3. Jacob Singh says:

    Hi Adam,

    Thanks for writing. I apologize, something is funny with my WordPress / SMTP, and I never got a notification. You should really post this in the issue queue on the project page so others will see it.

    Anyway, the answer is that you need curl enabled in your PHP environment. If you are on Debian (or Ubuntu) this is apt-get install php-curl, on Red Hat it’s probably something similar with yum or up2date, and if you build from source, I believe it is --with-curl as a configure option.

    Best,
    J

  4. Chris says:

    I am getting the error:

    Fatal error: Class 'SimpleXMLIterator' not found in /usr/home/sites/dev.isdmedia.com/trunk/sites/all/modules/google_appliance/GoogleMini.php on line 325

    SimpleXML is listed when viewing phpinfo.php…

    Any advice? (Thanks in advance)

  5. Jacob Singh says:

    Hi Chris,

    Please make support requests on the issue queue. This should be easy enough to solve: do a Google search for the error, there are lots of results.

    thanks!
    Jacob

  6. Brandon Dixon says:

    Hey,

    I think your module is great, but I see it hasn’t been updated in a while. I am using Drupal 6 and started to port it over, but I am having issues with the results. Do you think you could help me out?

How To find me

Telephone: +1 510.277.0891 | Email: jacobsingh at gmail daht calm