August 12th, 2008

250k nodes working to save our habitat

I had the privilege of working with Srijan Technologies this spring, running Drupal and Agile development trainings for their team and helping them get Apache Solr kicking for the India Environment Portal, which launched last week to much fanfare.

The site is based on Drupal 5 and features:

What is an environmental portal?

India Environment Portal is an initiative of the Centre for Science and Environment, one of India’s oldest and most revered environmental NGOs. Here is an excerpt from their about page:

This is the age of environment. And to make a difference, in our lifestyle, in policy and in practice we need information, which is accessible, well categorized and easy to use. The India Environment Portal is our effort to put together a one-stop shop of all that you want to know about environment and development issues. Its politics is overt: to build open, networked and informed societies, who can use knowledge to make change…..

This is a people’s portal. It will actively collate and exchange data, research and information from people working in the field, in campaigns, in scientific institutions, in research and in industry.

I recommend checking out the about page to find out more about this exciting resource:
http://www.indiaenvironmentportal.org.in/content/about-us

Congratulations to Ipsita, Rahul, Syed, Shashank, and all the rest of the excellent team at Srijan!

And a special thanks to drunken monkey and Robert Douglass for their work to integrate Drupal and Apache Solr.

Some press about the launch

http://www.financialexpress.com/news/National-portal-on-environment/347761/

http://in.news.yahoo.com/43/20080811/812/tnl-national-online-portal-on-environmen.html

http://www.thehindu.com/holnus/002200808112067.htm

http://www.ecoearth.info/shared/reader/welcome.aspx?linkid=104697&keybold=climate%20forest%20environment%20warming

http://www.indiaprwire.com/businessnews/20080811/32685.htm

http://alootechie.net/content/indiaenvironmentportalorgin-launched-provide-environmental-information

July 1st, 2008

Solving bad IA using enterprise search (Reverse Advanced Search)

Since I started working with Apache Solr in Drupal, I’ve realized how much client money has been wasted building ill-advised advanced searches. We’ve all gotten the requests for “advanced” searches, and they make any IA-god-fearing developer cringe. For the 1% of users who use them, you blow tons of budget, and the result is often quite poor because the client doesn’t really know their data or their users that well.

For those of you who are unfamiliar with faceted search, compare the following:

I searched two sites for WSXGA because I’m looking for a laptop with decent resolution.

Laptops Direct

vs.

New Egg


The New Egg search lets me filter, so I know that if I’m looking for a laptop between $750 and $1000, I’ll get 5 results. After applying that filter, I can see what’s available, the number of results per manufacturer, etc.

Contrast that with an advanced search form where I have to put in all my criteria, and hope I get a result. I might also miss certain results if my vocabulary is bad, or I don’t understand that the website says “high resolution” instead of WSXGA, so I don’t select it.
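To make the filter-then-count idea concrete, here is a minimal sketch in Python. It is not how New Egg or Solr implement facets; the catalog, the price buckets and every function name are made up for illustration. The point is only that each filter narrows the result set, and the counts per facet value are recomputed from whatever is left.

```python
from collections import Counter

# Made-up catalog for illustration; these are not real listings.
LAPTOPS = [
    {"brand": "Acme", "price": 699, "resolution": "WSXGA+"},
    {"brand": "Acme", "price": 899, "resolution": "WSXGA+"},
    {"brand": "Bolt", "price": 949, "resolution": "WSXGA+"},
    {"brand": "Bolt", "price": 1299, "resolution": "WUXGA"},
    {"brand": "Cask", "price": 799, "resolution": "WXGA"},
]

def brand(item):
    return item["brand"]

def resolution(item):
    return item["resolution"]

def price_range(item):
    """Bucket a price into a coarse, human-readable range (arbitrary cutoffs)."""
    if item["price"] < 750:
        return "under $750"
    if item["price"] <= 1000:
        return "$750 to $1000"
    return "over $1000"

def apply_filters(items, filters):
    """Keep only the items whose facet value matches every chosen filter."""
    return [i for i in items if all(facet(i) == value for facet, value in filters)]

def facet_counts(items, facet):
    """Count how many of the remaining items fall under each facet value."""
    return Counter(facet(i) for i in items)

# Step 1: the user picks a resolution and sees how results split by price.
hits = apply_filters(LAPTOPS, [(resolution, "WSXGA+")])
print(facet_counts(hits, price_range))

# Step 2: the user picks a price range and sees the count per manufacturer.
narrowed = apply_filters(hits, [(price_range, "$750 to $1000")])
print(facet_counts(narrowed, brand))
```

The user never has to guess a combination of criteria that returns nothing, because every option shown comes with the number of results behind it.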

I think it’s obvious to anyone why faceted search is a good thing. In my next post, I’ll be exploring why it hasn’t gotten widespread adoption, particularly in the small business / NGO sector, and how I plan to help change that.

June 30th, 2008

Solving bad IA using enterprise search (Vocabulary)

Long time no blog…

I had a bit of a realization (or rather a resurgence of a recurring realization) that I enjoy writing. It happened this weekend as I was “getting away from it all”.

I’ve been interested in enterprise search for small and medium enterprises for a while now. Having implemented the Google Mini and the GSA, I’ve seen how a good search can really turn your information architecture on its head in a good way. Any conversation between two entities, be they two people or a person and a website, is difficult, and many of the same rules apply:

Vocabulary

You have to speak the same language. This doesn’t mean a Thai can’t talk to a Nigerian, but it does mean that when you are communicating, if the same word doesn’t mean the same thing (which it never does), your intentions, delivery and content are worthless. That is why non-verbal communication and communication over phones are so ineffective compared to face-to-face meetings. The words may be the same, but the interpretation never is.

So what does this have to do with enterprise search? When a user wants something from your website, they are looking for a keyword. Many computer scientists have tried to make linguistically aware search engines which correctly interpret sentences and questions. Some of these results are useful, but generally, I believe people don’t come to a site thinking “Do you have any red Toyota Corollas?” Internally, they are simply thinking “Corolla” and “Red”. For instance, I could speak only two words of English, Red and Corolla, and chances are, I could walk into any American city and rent a red Toyota Corolla.

When one plans information architecture for a site, one usually starts with personas, or stereotypes of users, which have goals. Then you define the actions they take to meet those goals, and try to use the same vocabulary and thought process as these users to make an interface which is organized like their brain. But when you have 10 different personas, how is this possible? Within your 10 stereotypes there can be huge variation, and outside of your assumptions there may be other users you never thought about.

With a good search engine, even if you have one page about red Mustangs buried in your site, people might find it. With an excellent search engine which has synonyms, facets, spell checking, related results, etc., you may be able to help the user find not only what they thought they were looking for, but contextual information about it. What if a mechanic looking for parts types “1988 Corolla Fuel Pump” into a search? Shouldn’t the search engine know which model years share the same Corolla fuel pump, show which ones are available, and allow him to filter? Shouldn’t it know that in the late eighties the Chevy Nova was a clone of the Corolla, and had the same parts cheaper?
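As a rough sketch of what that kind of knowledge could look like, here is a tiny query expansion in Python. The equivalence table is hypothetical: its single entry comes from the Nova example in this post, not from a verified parts database, and the function name is my own. Solr can do similar expansion through a synonyms file; this is just the idea in miniature.

```python
# Hand-maintained equivalences. The one entry here is illustrative and
# taken from the example in the post, not from a real parts database.
EQUIVALENT_MODELS = {
    "corolla": ["nova"],
}

def expand_query(query):
    """Return the normalized query plus variants with equivalent model names."""
    terms = query.lower().split()
    variants = [" ".join(terms)]
    for i, term in enumerate(terms):
        for alt in EQUIVALENT_MODELS.get(term, []):
            # Swap only the matching term and keep the rest of the query.
            variants.append(" ".join(terms[:i] + [alt] + terms[i + 1:]))
    return variants

print(expand_query("1988 Corolla Fuel Pump"))
```

A search run over every variant would surface the Nova part alongside the Corolla one, which is the extra question a good parts counter clerk would have asked anyway.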

This is the type of high-value information which comes from dealing with a real human, and no amount of brilliant forethought in information architecture can anticipate what the person is actually looking for. If I were doing IA for a parts website sans search, I’d have to have categories by model, by year, etc. Even in a straightforward example like that, value is lost. That’s why search engines need to ask the extra question, and the search engines most sites use today do not.

Faceted Search

Next whenever, I’ll write something about faceted search (fancy name for search with fields and filters) and how I think the combination of Apache Solr and open source CMSs like Drupal, Typo3 and Joomla is going to pave the way to an entirely new concept of information architecture and where we spend our usability testing money.

How To find me

Telephone: +1 510.277.0891 | Email: jacobsingh at gmail daht calm
