Previous related posts: https://point7.wordpress.com/2011/09/03/the-amazing-story-of-appengine-and-the-two-orders-of-magnitude/ and https://point7.wordpress.com/2011/09/04/appengine-tuning-1/

As I’ve detailed in the previous posts, I’m facing a big Google AppEngine bill for Syyncc, based on the number of running instances. Lots of small developers like me have been struggling with this (and bellowing), but I’m betting that this is in my control, and due to suboptimal coding on my part.

In my last attempt at tuning, I managed to drop the average running instances from over 10 to around 5 or 6. Can I do better?

I already identified a likely culprit for the excess instances, which is that I’m kicking off 50 tasks, all at once, every 2 minutes. That kind of spiky load presumably forces the scheduler to keep more instances running than I really need. All I need is for the tasks to be done within the 2 minute period; I don’t care about latency within that.

Now, these tasks are calls to /ssprocessmonitor, which the original post showed as averaging a bit over a second run time. The longest run time I’ve been able to see by checking my logs is about 4 seconds.

If I were to do these processes in a bit of sequential processing (in a backend?) then they’d finish before the two minutes were up, more or less. But I don’t want to go so far as to rearchitect around that. Can I do something simpler?

Here’s the code that wants optimising, from /socialsyncprocessing2:

    monitors = SSMonitorBase.GetAllOverdueMonitors(50, 0)
    if monitors:
        for monitor in monitors:
            logging.debug("Enqueuing task for monitor=%s" % (monitor.key()))
            # See SSMonitorProcessor.py for ssprocessmonitor handler
            taskqueue.add(
                url='/ssprocessmonitor',
                params={'key': monitor.key()}
            )

The problem here is that the tasks are being added to the taskqueue all at once. Wouldn’t it be nice if there were a simple way to stagger them, spread them out through the 2 minutes? Let’s look at the doco:

http://code.google.com/appengine/docs/python/taskqueue/tasks.html

Here’s a nice little morsel, an optional parameter in taskqueue.add():

countdown: Minimum time to wait before executing this task, in seconds. Defaults to zero.

Using that, I could spread the tasks out, 2 seconds apart. With 50 tasks, that means the last one would begin no earlier than 98 seconds after the tasks were scheduled, which still gives it (and any straggling tasks) around 20 seconds to complete before this whole process starts again.

So, here’s the modified code:

    monitors = SSMonitorBase.GetAllOverdueMonitors(50, 0)
    if monitors:
        lcountdown = 0
        for monitor in monitors:
            logging.debug("Enqueuing task for monitor=%s" % (monitor.key()))
            # See SSMonitorProcessor.py for ssprocessmonitor handler
            taskqueue.add(
                url='/ssprocessmonitor',
                params={'key': monitor.key()},
                countdown=lcountdown
            )
            lcountdown += 2
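To sanity-check the spacing without touching AppEngine at all, here’s a minimal standalone sketch of the countdown values that loop produces. The function name staggered_countdowns and its parameters are made up for illustration; they aren’t part of the Syyncc code or the taskqueue API.

```python
def staggered_countdowns(task_count, spacing_seconds=2):
    """Return one countdown (in seconds) per task, spaced evenly from zero.

    Hypothetical helper: it mirrors the lcountdown variable in the loop
    above, which starts at 0 and grows by 2 for each task.
    """
    return [index * spacing_seconds for index in range(task_count)]


# 50 tasks, 2 seconds apart, as in the modified code.
countdowns = staggered_countdowns(50)
first_three = countdowns[:3]
last_start = countdowns[-1]
```

The first task gets a countdown of 0 and the fiftieth gets 98, so the whole batch is handed to the scheduler well inside the 2 minute window.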

There are a lot of better ways to be doing what I’m doing here, but this might at least get me out of a jam.

So how did it go? Here’s the instances graph, see if you can spot where I uploaded the new code:

Is this success? It’s better, a bit up and down. Maybe I need more time to see how it behaves.

Here’s the view over four days, so you can see it back when it was really bad, then the previous optimisation, then this one.

I’m feeling better about that.

What I’m going to be billed under the new system is:

4 cents / instance hour, less 1 free hour per hour.

So that’s the area under the graph, minus a 1 unit high horizontal strip, times 4 cents.

If I can get the average instances to 3, I’ll be paying 4 cents x (3 - 1) x 24 = US$1.92/day. That’s still a chunk more than I’m currently paying, but it’s doable (as long as everything else stays lowish). Cautious optimism!
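That sum is easy to play with as a few lines of Python. This is only a rough sketch: the 4 cent price and the one free instance hour per hour come from the figures above, while the function and constant names are my own, and it assumes the instance count never dips below one (otherwise some of the free hour goes unused and the average underestimates the bill).

```python
PRICE_CENTS_PER_INSTANCE_HOUR = 4  # from the pricing quoted above
FREE_INSTANCES = 1                 # one free instance hour per hour
HOURS_PER_DAY = 24


def daily_cost_cents(average_instances):
    """Estimate the daily instance bill, in US cents.

    Hypothetical helper: works in cents to avoid floating point noise,
    and never charges a negative amount.
    """
    billable_instances = max(average_instances - FREE_INSTANCES, 0)
    return PRICE_CENTS_PER_INSTANCE_HOUR * billable_instances * HOURS_PER_DAY


cost_at_three = daily_cost_cents(3)  # the target discussed above
cost_at_ten = daily_cost_cents(10)   # roughly where things started
```

At an average of 3 instances that comes to 192 cents a day, versus 864 cents a day at the 10 or so instances I started with.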

—

Now that’s all great, but there’s something else that I might have been able to use, and without a code change.

Apparently, push queues (ie: what I’m using) can be configured:

You can define any number of individual queues by providing a queue name. You can control the rate at which tasks are processed in each queue by defining other directives, such as rate, bucket_size, and max_concurrent_requests. You can read more about these directives in the Queue Definitions section.

The task queue uses token buckets to control the rate of task execution. Each named queue has a token bucket that holds a certain number of tokens, defined by the bucket_size directive. Each time your application executes a task, it uses a token. Your app continues processing tasks in the queue until the queue’s bucket runs out of tokens. App Engine refills the bucket with new tokens continuously based on the rate that you specified for the queue.

(from http://code.google.com/appengine/docs/python/config/queue.html#Defining_Push_Queues_and_Processing_Rates )

So I could have defined a small bucket_size, and a low rate. Can you have a fractional rate? If so, the code changes above could be reverted, and instead I could add this to queue.yaml:

    queue:
    - name: default
      rate: 0.5/s
      bucket_size: 1
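To get a feel for what a rate of 0.5/s with a bucket_size of 1 would do to my 50 tasks, here’s a toy token bucket in plain Python. It’s a simplified model built only from the description quoted above (a bucket that starts full, one token per task, continuous refill), not AppEngine’s actual scheduler, and the function name and parameters are invented for illustration.

```python
def dispatch_times(task_count, rate_per_second, bucket_size):
    """Return the earliest start time (in seconds) for each queued task.

    Toy model, assuming all tasks are enqueued at time zero, the bucket
    starts full, and bucket_size is at least 1.
    """
    tokens = float(bucket_size)
    now = 0.0
    times = []
    for _ in range(task_count):
        if tokens < 1.0:
            # Wait just long enough for the refill to reach one whole token.
            now += (1.0 - tokens) / rate_per_second
            tokens = 1.0
        tokens -= 1.0  # running a task spends one token
        times.append(now)
    return times


# The queue.yaml settings above: 0.5 tasks per second, bucket of 1.
starts = dispatch_times(50, 0.5, 1)
```

In this model the tasks start at 0, 2, 4, ... seconds, with the last one at 98 seconds, which is the same spacing as the countdown change. A larger bucket_size would let that many tasks run immediately as a burst before the rate limit kicks in.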

I’ll keep that up my sleeve. I’ve already made the code change above, so I may as well let it run for a bit, see how it performs.

Update: Here’s the graph from the next morning. It’s full of win!