November 6th, 2009

Spam yourself. Spamalot or a spamalittle with DevelMailLog

I was upgrading The Watcher module to Drupal 7 today and found myself having to test a lot of email sending. Looking around in vain for a fake email system to log emails to the disk instead of sending them out into the interwebs to risk getting called the dreaded meat product, I decided to write one using the new pluggable mail system interface in Drupal 7.

Similar tools have existed before, but I couldn’t find anything in devel currently. Here’s how it works. If you want to save your mails locally to files:

Step 1
Install Devel
Step 2
Apply this patch (until it gets committed) – Review here: http://drupal.org/node/625062.

cd sites/all/modules/devel
curl http://drupal.org/files/issues/develmail-625062-1.patch | patch -p0

Step 3
In your settings.php file:

$conf['mail_system'] = array('default-system' => 'DevelMailLog');

That’s it!

Unless you configure it otherwise, mails are saved to files/mails/$to-$subject-$datetime.mail.txt

Example
[Screenshots: the "Contact | My Site" page and a bash terminal window]

Bonus
You can change the directory with
variable_set('devel_debug_mail_directory', file_directory_path() . '/mails');

Or the file format
variable_set('devel_debug_mail_file_format', '%to-%subject-%datetime.mail.txt');
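To make the file format a bit more concrete, here is a rough Python sketch of how a pattern like the one above could be expanded into a file name. This is not the code in the Devel patch: the function name, the timestamp layout and the character clean-up rule are all assumptions made for illustration.

```python
import re
from datetime import datetime


def mail_log_filename(fmt, to, subject, when):
    """Expand %to, %subject and %datetime in fmt into a file-system-safe name."""
    values = {
        "%to": to,
        "%subject": subject,
        # Assumed timestamp layout; the post does not say which one is used.
        "%datetime": when.strftime("%Y%m%d-%H%M%S"),
    }

    def clean(text):
        # Replace every run of characters outside a conservative whitelist.
        return re.sub(r"[^A-Za-z0-9@._-]+", "_", text)

    name = fmt
    for token, value in values.items():
        name = name.replace(token, clean(value))
    return name
```

Calling it with the default pattern, a recipient, a subject and a `datetime` gives one file name per mail, so repeated test mails do not overwrite each other as long as the timestamps differ.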

Till next time spammers…

October 22nd, 2009

What can we do to make Drupal 7 faster?

Drupal 7’s major API code freeze is behind us, so it is time to take stock of the massive API overhauls and the hotly debated new interfaces, and of how they affect performance. As part of the last sprint at Acquia, I was tasked with comparing the performance of Drupal 6 and Drupal 7 in similar conditions so we know how much work we all have to do before Drupal 7 is ready for release.

So how does D7 match up with D6?

Of men who have a sense of honor, more come through alive than are slain, but from those who flee comes neither glory nor any help.
Homer, The Iliad


The Legend of the Drupal release cycle
In every Drupal development and release cycle, there is a period of rampant innovation by thousands of people distributed throughout the world. They get on IRC, email lists, in the flesh, etc and just bang out a ton of great code. Then the dust settles and the code freezes. Everyone wakes up after 3 days of recovery sleep, finishes their caffeinated beverage of choice, and tries out their new super-duper-fantastic toy to see how it works. It’s full of creaky limbs and flashy lightbulbs attached to misplaced handles. It gets poked and prodded, and then, we take it around the block to see how it runs and what we need to do to harden the security and make it as fast as possible (two areas Drupal has always excelled in). Then we all pitch in to get the bug fixing and performance tuning done.

Drupal 6 was of course a major refinement over Drupal 5 without creating too many waves for developers or user interfaces. It was also a little bit slower. Drupal 7 looks sexy and has much more consistency in its APIs, a kick-ass database abstraction layer, and a powerful ORM in fields. However, as is expected at this pre-release stage, my testing shows that Drupal 7 is currently slower, and now is the time to focus on performance.

Disclaimer: These are very preliminary numbers using a new benchmarking setup (which is described below). Neither the methodology nor the reports are perfect, so please do your own benchmarks (I also cover that later).

Summary (Cliff notes)

  • As expected, Drupal 7 is slower as it is pre-release and more feature rich.
  • For anonymous (cached) browsing, D7 is close on /taxonomy/1 and /node/1, but it is much slower on /node
  • Authenticated user browsing is about 2-3x slower
  • User operations (login, user page, logout) are about 3-4x slower

What / how are we testing – Check the lanes

Target machine: IBM T60p ThinkPad (Ubuntu 9.10, 2.16GHz Core Duo + 2GB dual-channel RAM).
Testing machine: MacBook Pro, 2.5GHz, 4GB of RAM

Testing machine ran jmeter and the two boxes were connected via LAN cables to minimize the network effect.

Testing platform – Set up the pins

To start with, we need a reproducible environment of fake data to test against. As an attempt to create such a standard, I started the NorthDrop project on Drupal.org. It is an install profile which uses devel_generate (a dummy content generation module) to make fake content depending on settings provided during the install.

I got devel_generate mostly working for D7 and I backported NorthDrop to D6 so we could get two installs with almost identical types and amounts of content.

For the purposes of this test and not having someone call the society for the preservation of old and battered laptops, I put it on the “small setting” which includes:

$sample_sizes['small'] = array();
$sample_sizes['small']['nodes'] = 200;
$sample_sizes['small']['comments'] = 4; // Per node
$sample_sizes['small']['users'] = 50;
$sample_sizes['small']['terms'] = 15;
$sample_sizes['small']['vocabs'] = 3;

Testing format – The approach…

For this basic profiling test, I built three thread groups. A thread group is a set of fake users who will do the same routine (hit a few paths / submit forms) for a certain number of loops. In this case, here are my thread groups:

[Image: BowlingThreads]

All three thread groups run simultaneously and the results are saved to XML files which are later viewed / processed.
For verification of results, I used the highly scientific eye-ball method of, “eh… that’s pretty close” after running each test 3-4 times.
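As a rough idea of what "processing" those XML files can look like, here is a small Python sketch that groups sample times by label. It assumes jmeter's XML result format, where each `httpSample` (or `sample`) element carries the elapsed time in milliseconds in its `t` attribute and the sampler label in `lb`; the function name is made up, and this is not the script used for the numbers in this post.

```python
import statistics
import xml.etree.ElementTree as ET
from collections import defaultdict


def summarize_jtl(xml_text):
    """Group elapsed times (ms) by sampler label and summarize each group."""
    root = ET.fromstring(xml_text)
    times = defaultdict(list)
    for element in root.iter():
        if element.tag in ("httpSample", "sample"):
            times[element.get("lb")].append(int(element.get("t")))
    return {
        label: {
            "count": len(values),
            "mean": statistics.mean(values),
            "median": statistics.median(values),
        }
        for label, values in times.items()
    }
```

Running the same summary over the results of three or four runs is a slightly more scientific version of the eye-ball method: if the medians per label are close between runs, the setup is probably stable enough to compare D6 and D7.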

Results – The Scorecard

Although I highlighted the important conclusions in the summary, here are the tables they are derived from:
Benchmark Data for Drupal 6 / Drupal 7

And here is a nice histogram built by this awesome script by the folks at Atlassian (makers of JIRA and Fisheye).

This shows the response times for authenticated users as a percentage of requests.

Drupal 6:

Drupal6-percent-auth

Drupal 7:

Drupal7-percent-stacked
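If you want the same kind of view without the graphing script, the idea behind a "percentage of requests per response-time bucket" chart can be sketched in a few lines of Python. This is only an illustration of the bucketing; the function name and the 250 ms bucket width are assumptions, and it says nothing about how the Atlassian script works internally.

```python
from collections import Counter


def percent_by_bucket(times_ms, bucket_ms=250):
    """Return {bucket_start_ms: percent_of_requests} for the given samples."""
    if not times_ms:
        return {}
    # Round each sample down to the start of its bucket, then count buckets.
    counts = Counter((t // bucket_ms) * bucket_ms for t in times_ms)
    total = len(times_ms)
    return {start: 100.0 * n / total for start, n in sorted(counts.items())}
```

A distribution that is bunched into the first one or two buckets is what you want; a long tail of buckets far to the right is what "2-3x slower" looks like in practice.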

What next? – The long drive home

Profiling

We need to examine the causes in more detail. This type of basic performance testing gives us some clues as to what pages / content cause load. Next we have to open up xcache and start getting into the nitty gritty to identify what needs to change.

Improved tests, isolation tests

I’ve posted the jmx I used for testing to a new github repo for this. If I get time, I will be writing another post to outline how this jmx was built and what it takes to run it (hint: it’s really easy!). It would be great if we built tests which just test one facet at a time, and also if we profiled more write-heavy ops like commenting, content creation, etc.

Automated testing

The framework is in place to automate this, especially in D7 since the NorthDrop profile can be installed from the CLI. jmeter can take params, and the jmetergraph.pl program can give us a good visual. Everyone’s eventual goal is something like testbot to run after every commit, or perhaps on certain patches, to give us an indication of what effect a change will have on general performance. We’re just dev hours and a server farm away from getting this set up.
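A minimal sketch of what the automation glue could look like in Python, assuming jmeter is on the PATH. The `-n` (non-GUI), `-t` (test plan), `-l` (results file) and `-J` (set a jmeter property) options are standard jmeter command-line flags; the file names and the property names in the example are hypothetical and only work if the jmx reads them.

```python
import subprocess


def jmeter_command(plan, results, properties=None, jmeter="jmeter"):
    """Build the argument list for a non-GUI jmeter run."""
    cmd = [jmeter, "-n", "-t", plan, "-l", results]
    # Each property becomes -Jname=value, which the plan can read back.
    for key, value in sorted((properties or {}).items()):
        cmd.append(f"-J{key}={value}")
    return cmd


def run_plan(plan, results, properties=None):
    """Run the plan and raise if jmeter exits with an error."""
    subprocess.run(jmeter_command(plan, results, properties), check=True)
```

Something like `run_plan("drupal.jmx", "d7.jtl", {"host": "d7.local"})` could then be called once per commit, with the resulting file handed to whatever builds the graphs.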

Resource profiling

Looking from the outside, we just get response times. We also need to identify what is making it slow at a system level. Is it MySQL? Is it PHP? RAM or CPU? Are we I/O limited in some operations? Sometimes we can go backwards in performance, but forwards in scalability. This should be accounted for.

Good bye!

If you want to get the test files used for this report they are here.
A special thanks to greg_harvey and Graham Taylor for sending me a starter jmx they had built previously for functional tests. I hope this post is useful in spurring the discussion of D7 performance. Please feel free to leave a comment on pajamadesign.com or on my Acquia blog.

There is still time to fix these performance issues so dive in.

October 22nd, 2009

Measuring Drupal performance with jmeter

As some of you know, I’ve been doing some performance testing of Drupal recently.

I’ve also started a project on github for this work. I’ll be updating the documentation here, there and elsewhere as more time is invested, but for now, here is a brief intro to what jmeter looks like and how it can be used to profile Drupal.

Video on Drupal profiling with jmeter

Apologies for not making zooms, etc. If it is not viewable, let me know and I’ll spend a bit more time.

Best,
Jacob

August 6th, 2009

Plugin Manager in Core (part deux)

Sorry, long time no blog.

It’s been a crazy three months working on the Plugin Manager in Core project.

For those not acquainted, the plan is to make a GUI based installer / updater for Drupal modules and themes.

Available updates | dev7

We were almost done, and even had it all accessible

Then, some concerns were raised in the community about security and reliability. If you would like the US Library of Congress ref number for this discussion and the issues about Plugin Manager in D7, please contact me directly, I’ll notify you when they have finished building a computer fast enough to import them into their collection.

At any rate, here is the gist:

I, Adrian Rossouw, and probably some others are working to get something in by September 1st.

I’ve developed a specification and a backlog, and worked with Dries to finalize its acceptance.
Here is the something we are building:
Plugin manager for D7 code freeze spec.

I’ve also started out on a few of the issues, namely adding chmod support to FileTransfers, and moving the security sensitive operations to a separate file.

But there is a lot of other work to do, and we need all the help we can get. So if you’re interested in volunteering, comment here or on the main specification issue, email me, call me, show up at my house, whatever.

Also, come to my session at DrupalCon. I’ll also be trying to organize a BoF to talk about future plans.

Take care!
Jacob

July 3rd, 2009

Be Drunk

(Received in an email… really sounds like a Sufi, but it’s not AFAICT from the name).

Be Drunk
by Charles Baudelaire
Translated by Louis Simpson

You have to be always drunk. That’s all there is to it—it’s the only
way. So as not to feel the horrible burden of time that breaks your
back and bends you to the earth, you have to be continually drunk.
But on what? Wine, poetry or virtue, as you wish. But be drunk.
And if sometimes, on the steps of a palace or the green grass of a
ditch, in the mournful solitude of your room, you wake again,
drunkenness already diminishing or gone, ask the wind, the wave, the
star, the bird, the clock, everything that is flying, everything that
is groaning, everything that is rolling, everything that is singing,
everything that is speaking. . .ask what time it is and wind, wave,
star, bird, clock will answer you: “It is time to be drunk! So as not
to be the martyred slaves of time, be drunk, be continually drunk! On
wine, on poetry or on virtue as you wish.”

June 25th, 2009

The death of the Drupal programmer

Okay, so that’s going a bit too far. But we’re getting ever closer to the dream: module and theme updates and installs using a GUI in your browser!

Many thanks to cwgordon, Joshua Rogers, dww and especially chx for kicking some serious arse on this issue and getting us very close.

That’s right, in Drupal 7 you will be able to update your modules and themes without learning FTP, SSH or CVS.

Check out my latest screencast

and get involved.

Also PLEASE vote for my session at DrupalCon Paris.

June 10th, 2009

Finding and installing Drupal modules from your site

In my last post, I showed how updating modules may look in Drupal 7. If you didn’t see it, it is here.

In this post, I’m going to throw up some “dream” wireframes which may or may not make the version 1 cut, but are perhaps a good start in the right direction.

Hopefully, at least parts of them will be practical to implement, or at least spur discussion.

Without further ado:

Check out the clickable wireframes

And give feedback here:
Plugin Manager in Core: Part 3 (integration with installation system)

June 10th, 2009

Updating modules and themes in Drupal 7

The problem: Updates in Drupal require FTP / SSH and a bit of know how

When the average Drupal site owner without ssh, cvs and other geek gadgets wants to update modules or themes on their Drupal site, they currently have to do the following:

  1. Go to update status and see the module is out of date
  2. Take the site offline
  3. Make a backup (if they can)
  4. Know where to find the module on d.o., download the tarball
  5. Unzip the tarball
  6. Remove the current directory
  7. Use FTP to upload the new directory
  8. Run update.php
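To show how mechanical the middle of that list is, here is a rough Python sketch of steps 3 and 5 to 7 done on a local file system: back up the old directory, remove it, and unpack the new tarball in its place. It is not the Plugin Manager's code; the function name, arguments and directory layout are assumptions, and it presumes the tarball unpacks into a single directory named after the module.

```python
import shutil
import tarfile
from pathlib import Path


def replace_module(tarball, modules_dir, name, backup_dir):
    """Back up modules_dir/name, then replace it with the tarball contents."""
    modules_dir, backup_dir = Path(modules_dir), Path(backup_dir)
    current = modules_dir / name
    if current.exists():
        backup_dir.mkdir(parents=True, exist_ok=True)
        # Step 3: keep a compressed copy of the old code before touching it.
        shutil.make_archive(str(backup_dir / name), "gztar",
                            root_dir=str(modules_dir), base_dir=name)
        # Step 6: remove the current directory.
        shutil.rmtree(current)
    # Steps 5 and 7: unpack the new release where the old one was.
    # Only do this with tarballs you trust; extractall follows the archive.
    with tarfile.open(tarball) as archive:
        archive.extractall(modules_dir)
    return current
```

Taking the site offline, finding the right download and running update.php are still left to the human, which is exactly the part a GUI updater is supposed to take over.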

We’re trying to give users the same friendliness as a package manager like Synaptic, where updates and new installs are just a few clicks, with no geek gadget belt required.

I’ve entered the D7 ux fray, specifically focusing my generous amount of Acquia community time on getting a project called the Plugin Manager spruced up and into core.

For more background on the effort, see: Plugin Manager in Core (part 1).

The solution: make Drupal update like everything else.

Here is the issue:
Plugin Manager Part 2 : The update status UI

I’ve been working out some wireframes of how the process might look, and I wanted to share them with the planet to see what people thought of them. So without further ado:

Check out the clickable wireframes

Round 2

June 9th, 2009

Wake up and smell the coffee (through an HMAC filter)

Hey, stay out of my index!

So when I first joined Acquia, my fledgling Solr hosting service had IP-based security. You, the customer, could tell me what IPs you were going to connect with, and I would allow access to your search index from those IPs.

One of the first major tasks was to implement HMAC-based authentication for the service, to guard against man-in-the-middle attacks and to provide a way to use it from any IP. It is also standard operating procedure for other Acquia services.
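For anyone who has not met HMAC request signing before, here is a minimal Python sketch of the general idea: both sides share a secret, the client signs a few request fields, and the server recomputes the signature and compares. The choice of fields (timestamp, nonce, body), the separator and the five-minute window are assumptions for illustration and not a description of the Acquia scheme.

```python
import hashlib
import hmac


def sign(secret, timestamp, nonce, body):
    """Return a hex HMAC-SHA256 over the timestamp, nonce and request body."""
    message = f"{timestamp}:{nonce}:".encode("utf-8") + body
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def verify(secret, timestamp, nonce, body, signature, now, max_skew=300):
    """Accept only fresh requests whose signature matches the shared secret."""
    if abs(now - timestamp) > max_skew:
        return False  # too old or too far in the future: possible replay
    expected = sign(secret, timestamp, nonce, body)
    # compare_digest avoids leaking how many characters matched.
    return hmac.compare_digest(expected, signature)
```

Because the signature covers the body, a request altered in transit no longer verifies, and because nothing depends on the caller's address, it works from any IP.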

Fail first!

In the first iteration, we built something on the load balancers (which run nginx) because it provided a central point of access control, the balancers were under-utilized and we didn’t have to mess with the Solr code.

This worked okay for a while, and was decently fast, but was quite flaky as some stupid developer had the brilliant idea to implement it as Python middleware with fcgi (flup). That developer was me.

Don’t fail second!

So to combat the unstable nature of the fcgi protocol, and to make things a little more efficient, I (along with help from Peter Wolanin and Douglas Hubler) rebuilt it in Java using a Servlet Filter. This was a royal pain in the butt, as Java is pretty tricky when it comes to input streams and buffers.
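The stream problem is not unique to Java: any filter that reads the request body to check a signature has consumed the stream, and must hand a fresh copy to whatever sits behind it. Here is that pattern sketched as Python WSGI middleware. To be clear, this is neither the Java filter nor the original flup middleware; the class name and the `X-Signature` header are hypothetical.

```python
import hashlib
import hmac
import io


class HmacGate:
    """WSGI middleware: check an HMAC over the body, then pass the request on."""

    def __init__(self, app, secret, header="HTTP_X_SIGNATURE"):
        self.app = app
        self.secret = secret
        self.header = header  # hypothetical header carrying the hex signature

    def __call__(self, environ, start_response):
        length = int(environ.get("CONTENT_LENGTH") or 0)
        body = environ["wsgi.input"].read(length)
        expected = hmac.new(self.secret, body, hashlib.sha256).hexdigest()
        if not hmac.compare_digest(expected, environ.get(self.header, "")):
            start_response("403 Forbidden", [("Content-Type", "text/plain")])
            return [b"bad signature\n"]
        # The original stream is used up, so give the wrapped app a new one.
        environ["wsgi.input"] = io.BytesIO(body)
        return self.app(environ, start_response)
```

Forgetting the re-wrap step is the classic bug here: the check passes, and then the application behind the filter sees an empty request.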

Thankfully the results are worth it:

It’s hard to tell from this graph because of the peak, but the median (blue line) stayed almost the same, while the average (purple) decreased pretty significantly, as did the 90% line (yellow).

[Graph: Splunk 3.4.8 timechart for the past 3 days. Search: source=solr_nginx_access (eventtype=solr_search_request) | timechart span=2h median(request_time), perc90(request_time), avg(request_time) as avg_request_time]

This graph shows the standard deviation (blue) in addition to the previous numbers and describes more acutely what the previous graph suggests, that is, the previous implementation was not any slower really, but less consistent, causing some of the requests to take much longer than others.

[Graph: Splunk 3.4.8 timechart for the past 3 days. Search: source=solr_nginx_access (eventtype=solr_search_request) | timechart span=2h stdev(request_time), median(request_time), perc90(request_time), avg(request_time) as avg_request_time]
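The "same median, very different average and standard deviation" effect is easy to reproduce with toy numbers. The two lists below are invented for illustration and are not taken from the Acquia Search logs.

```python
import statistics


def spread(times_ms):
    """Summarize a list of request times with median, mean and std deviation."""
    return {
        "median": statistics.median(times_ms),
        "mean": statistics.mean(times_ms),
        "stdev": statistics.stdev(times_ms),
    }


# Made-up samples: both have the same median, but one has a slow outlier.
steady = [100, 105, 110, 115, 120]
flaky = [60, 80, 110, 150, 700]

steady_stats = spread(steady)
flaky_stats = spread(flaky)
```

Both lists have a median of 110 ms, but the single 700 ms request doubles the mean of the flaky list and blows up its standard deviation, which is the pattern the two graphs describe.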

So there you have it: Acquia Search is both secure and fast, and now 200% more reliably fast :)

April 6th, 2009

This Python needs Adult Supervision

A while back, I wrote a daemon. No, I’m not a satanist mom, it’s a program which will basically stick around and manage a bunch of other little minions as they serve content via unix sockets to a webserver (nginx).

The point here is to take in traffic from nginx via python and do something with it. For this I found an excellent tutorial which got me started:

http://www.p16blog.com/p16/2008/11/quick-demo-of-python-wsgi-nginx.html

This worked great, but then you needed to write something to manage all the little unix sockets, start them when they died, etc.

So I had to custom write something (at least I thought) as nothing in existence seemed suited for the task. It has worked “okay” but is having some mysterious problems under heavy real world load, and I needed to find something more robust for the task.

I recently stumbled across:
http://just-another.net/2009/01/18/byteflowdjangosupervisordnginx-win/

This thing looks perfect, but I can’t quite get it to work… Basically, supervisord is a Python application which has a very useful and usable configuration file to specify programs you would like to run as services. It replaces 90% of the init.d scripts in existence, I imagine.

In theory, you create a block like this:

; Production setup
[fcgi-program:gate]
socket=tcp://127.0.0.1:1212  ; We reference this later in nginx
command = /usr/local/solrflare/bin/gate.py  ; Runs the Python script shown below

This means that when I run supervisord, it starts a daemon which will fire up my Python script (which currently looks like this):

#!/usr/bin/python
# Minimal WSGI application served over FastCGI by flup.
from flup.server.fcgi import WSGIServer

def app(environ, start_response):
    # Answer every request with the same plain-text 200 response.
    status = "200 OK"
    response_headers = [('Content-type', 'text/plain')]
    start_response(status, response_headers)
    return ["If a thread dies in the middle of a request, and no one is around to hear it, does it give a status code?\n"]

if __name__ == '__main__':
    WSGIServer(app).run()

Supervisord also provides a nice little web interface to monitor and manage the daemon, as well as a nice interactive shell program and XML-RPC (among many other cool features).
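The XML-RPC side is handy for quick health checks. A small Python sketch, assuming supervisord's HTTP server is enabled and reachable at the URL below (the address and port are assumptions, adjust them to your own config): `supervisor.getAllProcessInfo()` returns one record per managed process, including its `name` and `statename`.

```python
from xmlrpc.client import ServerProxy


def connect(url="http://localhost:9001/RPC2"):
    """Return an XML-RPC proxy for supervisord (the URL is an assumption)."""
    return ServerProxy(url)


def not_running(proxy):
    """Return the names of managed processes that are not in state RUNNING."""
    return [
        info["name"]
        for info in proxy.supervisor.getAllProcessInfo()
        if info["statename"] != "RUNNING"
    ]
```

`not_running(connect())` returning an empty list means every minion is up; anything else is a list of names to go look at in the web interface.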

When this works, it will be awesome because I can throw out a lot of code (which I love to do). However, currently it just kinda sits there when I curl the port… doesn’t do anything, doesn’t log anything.
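When a stack like this goes silent, one way to narrow things down is to call the WSGI app directly, with no nginx, supervisord or flup in the way. Here is a rough Python helper for that, using the standard library's `wsgiref` to fill in a fake request; the helper name is made up for this sketch.

```python
from wsgiref.util import setup_testing_defaults


def call_app(app, path="/"):
    """Call a WSGI app once in-process and return (status, body_bytes)."""
    environ = {"PATH_INFO": path}
    setup_testing_defaults(environ)  # fills in the other keys WSGI requires
    seen = {}

    def start_response(status, headers, exc_info=None):
        seen["status"] = status
        seen["headers"] = headers

    chunks = app(environ, start_response)
    # Accept both bytes and text chunks so older str-returning apps work too.
    body = b"".join(
        c if isinstance(c, bytes) else c.encode("utf-8") for c in chunks
    )
    return seen["status"], body
```

If this returns a `200 OK` and the expected text, the app is fine and the problem is somewhere in the plumbing around it.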

Update: I got it working! And it is awesome. I just needed to deal w/ an nginx config issue I created and do some permissions wrangling. It is running great so far!

How To find me

Telephone: +1 510.277.0891 | Email: jacobsingh at gmail daht calm
