Thursday, 14 January 2010

Crowd Sourcing tips from uTest

Yesterday I went to a talk on community-based testing given by John Montgomery (VP of uTest) and picked up some interesting thoughts on crowd sourcing. I consider uTest a successful and promising crowd sourcing platform, and I'm really interested in how it has distinguished itself in this area. Here is what I learned:

Grow a professional crowd
Unless it is a generic crowd sourcing platform like Amazon Mechanical Turk, the typical tasks are best done by people with professional training and years of experience. So it is important to make sure the workers are serious professionals rather than random people who just want to kill some time.

Get to know your super stars
Out of the thousands of workers there are going to be super stars. They are just so good at what they do. It's important to keep them interested and motivated and build a relationship with them. They are almost the whole point of using crowd sourcing for the top customers.

Assign project managers to triage issues
It's common to engage hundreds or thousands of workers for a crowd sourcing project. The noise of communication can be a killer. Dividing the workers into groups and having project managers triage issues is a natural solution.

Keep the tasks challenging but small enough to fit into short time slots
Crowd sourcing works best when the tasks are small and can fit into bits of spare time. But make sure they are still interesting and challenging.

Hand pick the super stars for important tasks
Remember the super stars. Assemble a team from them for the tasks that you can't afford to fail. Beginners are not welcome here.

Get newcomers involved as soon as possible
Newcomers get bored easily if they find it hard to get selected for a task. Make sure they feel welcome.

Tuesday, 16 June 2009

Debugging and fixing python code on the fly

As mentioned in the previous post, debugging an application with a long execution time is very time consuming, because the programmer often has to run it several times to check whether a fix works and to fix multiple bugs. In this post I describe a technique for trying out fixes repeatedly, and fixing multiple bugs, within a single execution.
The idea is to use a Python decorator to inject some scaffolding into the code that is to be tested and debugged, and to use Python's ability to dynamically reload source files to fix issues and resume execution. Here is the decorator:
def untrusted(func):
  def wrapper_func(*args, **kwds):
    try:
      return func(*args, **kwds)
    except Exception, e:
      print str(e) or e.__class__
      import pdb
      pdb.set_trace()
  return wrapper_func
It just calls the decorated function and goes into debugging mode if an exception occurs. Here is a function that has a typo; imagine it is going to be called in the middle of an hour-long batch job.

@untrusted
def print_date(d):
  print d.strftim("%Y-%m-%d")

When the function is called, an exception is raised and caught by the decorator and on the command line it looks like:
'datetime.datetime' object has no attribute 'strftim'
--Return--
> /Users/jiayao/examples/jiayao/debug.py(9)wrapper_func()->None
-> pdb.set_trace()
(Pdb) 

Now I can fix the typo in the source code. Then:
(Pdb) import example

(Pdb) example.print_date(*args)
2009-06-16
Then the program can be resumed as if no error had ever happened. "import example" loads the edited source code, so I can execute the corrected implementation of "print_date" with the same arguments as the original invocation. It works the first time because the script is running as "__main__", so "example" has not been imported yet. Note that "import example" works only once: importing a module that is already loaded has no effect, so to load the source again you need to call "reload(example)" the next time.
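The "import works only once" behaviour is easy to see outside the debugger. Below is a minimal sketch, assuming Python 3 (where "reload" lives in importlib) and a throwaway module that I call hotfix_demo purely for illustration; the file contents and helper names are made up for this demo.

```python
import datetime
import importlib
import os
import sys
import tempfile

# Assumption: Python 3. Skip bytecode caching so the edited source is
# always recompiled in this demo.
sys.dont_write_bytecode = True

BROKEN = "def format_date(d):\n    return d.strftim('%Y-%m-%d')\n"
FIXED = "def format_date(d):\n    return d.strftime('%Y-%m-%d')\n"


def write_source(directory, source):
    """Write the hypothetical module hotfix_demo.py into directory."""
    with open(os.path.join(directory, "hotfix_demo.py"), "w") as f:
        f.write(source)


def fails(func, *args):
    """Return True if calling func(*args) raises AttributeError."""
    try:
        func(*args)
    except AttributeError:
        return True
    return False


module_dir = tempfile.mkdtemp()
sys.path.insert(0, module_dir)
write_source(module_dir, BROKEN)
importlib.invalidate_caches()

import hotfix_demo  # first import: loads the broken source

day = datetime.date(2009, 6, 16)
broken_at_first = fails(hotfix_demo.format_date, day)

write_source(module_dir, FIXED)  # "fix the typo in the editor"

import hotfix_demo  # second import: a no-op, the loaded module is reused
still_broken = fails(hotfix_demo.format_date, day)

importlib.reload(hotfix_demo)  # re-executes the edited source
fixed_result = hotfix_demo.format_date(day)
print(fixed_result)  # prints 2009-06-16
```

The second import leaves the old function in place; only the reload picks up the edit.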

Another example: this time the error is using a module without importing it first:
@untrusted
def match(pattern, text):
  return re.search(pattern, text)
match("^abc", "abcde")

Entering debug mode:
global name 're' is not defined
--Return--
> /Users/jiayao/examples/jiayao/debug.py(9)wrapper_func()->None
-> pdb.set_trace()
(Pdb)

Simply calling "import re" and invoking the function again will not work, because the function keeps a reference to the globals of the module where it was defined, including that module's imports. "import re" at the debugger prompt does not change the small world encapsulated in the function object. So we have to inject the import into it:
(Pdb) import re
(Pdb) func.func_globals['re'] = re
(Pdb) func(*args)
<_sre.SRE_Match object at 0x24e4f0>
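Here is the same injection as a standalone sketch, assuming Python 3, where "func_globals" is spelled "__globals__". The separate dictionary stands in for the module that forgot the import; the names module_globals and raises_name_error are made up for this demo.

```python
# Assumption: Python 3, where func_globals is spelled __globals__.
SOURCE = "def match(pattern, text):\n    return re.search(pattern, text)\n"

# Stands in for the globals of the module that forgot "import re".
module_globals = {"__name__": "forgetful_module"}
exec(SOURCE, module_globals)
match = module_globals["match"]

import re  # importing here only affects *this* namespace


def raises_name_error(func, *args):
    """Return True if calling func(*args) raises NameError."""
    try:
        func(*args)
    except NameError:
        return True
    return False


# The function still cannot see "re": it looks names up in its own globals.
still_failing = raises_name_error(match, "^abc", "abcde")

# Inject the module into the globals the function actually uses.
match.__globals__["re"] = re
found = match("^abc", "abcde")
print(found.group(0))  # prints abc
```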

One last example. This one deals with objects and is a bit more complicated:
class A(object):
  def __init__(self, text=None):
    self.text = text

  @untrusted
  def search(self, keyword):
    return keyword in self.text

a = A()
a.search("abc")


>python example.py
argument of type 'NoneType' is not iterable
--Return--
> /Users/jiayao/examples/jiayao/debug.py(9)wrapper_func()->None
-> pdb.set_trace()

We fix the "search" method in the source code:
  @untrusted
  def search(self, keyword):
    if self.text:
      return keyword in self.text
    else:
      return False

And we try out the fix:
(Pdb) import example
(Pdb) example.A.search(*args)
*** TypeError: unbound method wrapper_func() must be called with A instance as first argument (got A instance instead)

This is a strange error at first glance, but the two "A" classes are not the same:
(Pdb) example.A
<class 'example.A'>
(Pdb) type(args[0])
<class '__main__.A'>

So we cannot call a method of example.A on a __main__.A object. We can work around this by giving a new example.A instance the state of the old object:
(Pdb) new_a = example.A()
(Pdb) new_a.__dict__ = args[0].__dict__
(Pdb) example.A.search(new_a, *args[1:])
False
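The class identity issue can be reproduced without the debugger. This is only a sketch, assuming Python 3: two copies of the class are built from source strings to play the roles of __main__.A and example.A, and the names OldA, NewA and load_class are made up. Note that Python 3 no longer has unbound methods, so the TypeError above would not appear there, but the two classes are still distinct.

```python
BROKEN = """
class A(object):
    def __init__(self, text=None):
        self.text = text

    def search(self, keyword):
        return keyword in self.text
"""

FIXED = """
class A(object):
    def __init__(self, text=None):
        self.text = text

    def search(self, keyword):
        if self.text:
            return keyword in self.text
        return False
"""


def load_class(source, module_name):
    """Execute source in a fresh namespace and return its class A."""
    namespace = {"__name__": module_name}
    exec(source, namespace)
    return namespace["A"]


OldA = load_class(BROKEN, "main_copy")     # plays the role of __main__.A
NewA = load_class(FIXED, "example_copy")   # plays the role of example.A

old_a = OldA()
same_class = isinstance(old_a, NewA)  # the old object is not a NewA

# Give a fresh NewA instance the state of the old object.
new_a = NewA()
new_a.__dict__ = old_a.__dict__
result = NewA.search(new_a, "abc")
print(same_class, result)  # prints False False
```

Assigning "__dict__" shares the attribute dictionary rather than copying it, so changes made through either object are visible in both.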

Hope this is useful for some fellow programmers out there. Use the time you saved wisely! :)

Wednesday, 10 June 2009

I love Python, I love dynamic typing, but I also feel the pain

Most people who have written Python programs that take longer than a couple of minutes to run remember the pain: the program crashes after x minutes or hours because of some trivial problem like a typo in a function name. You carefully fix the problem and run it again, and x minutes later you find yourself debugging another crash caused by another trivial problem...

What is the root problem? A lot of people who come from a Java or C++ background will say: you need static type checking! True, static type checking would have caught many, if not most, of the mistakes I have made in Python code, although that is still only a subset of the problems. Generally speaking, the pain comes from errors that are not detectable until runtime.

What are the solutions? For the errors that can be caught by type checking, the natural answer is "Let there be type checking!". That has been discussed extensively in the Python community in the past few years; see Guido's essays on the topic [1]. It has not been introduced in Py3k. PEP 246 was a formal attempt to add optional type checking in Python but it was rejected in 2009 because "Something much better is about to happen. --- GvR". So for now the problem is not fundamentally addressed. One partial solution is static analysis. PyChecker and Pylint are the two most popular tools. They catch a lot of errors that are usually caught by the compiler in statically typed languages. They also catch coding style violations, which in turn helps in writing less error-prone code. Still, a lot of errors go undetected by these tools. For example:
def print_date(d):
  d.strftiem("%Y-%m-%d")
Nobody knows anything about "d", and there is no way to find out that "strftiem" is a typo until we run it. The code may have been executing for hours by the time this function is called.

Another partial solution is unit testing. Writing comprehensive unit tests can really iron out a lot of problems like this. There are a few reasons why unit testing is not enough to catch all the errors, but I won't go into them here.
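To make that concrete, here is a minimal sketch of a test that trips over the "strftiem" typo in well under a second instead of hours into a batch job. It assumes Python 3 and the standard unittest module; format_date and check_format_date are illustrative names, and the function returns the string instead of printing it so the check has something to compare.

```python
import datetime
import unittest


def format_date(d):
    # Deliberate typo, same as in the example above.
    return d.strftiem("%Y-%m-%d")


def check_format_date():
    """The check that exercises the function with a real date object."""
    assert format_date(datetime.date(2009, 6, 10)) == "2009-06-10"


def run_checks():
    """Wrap the check in a unittest test case and run it."""
    case = unittest.FunctionTestCase(check_format_date)
    return unittest.TextTestRunner(verbosity=0).run(case)


result = run_checks()
print(len(result.errors), "error(s)")  # the AttributeError is reported here
```

The test only helps because it actually calls the function with a real datetime.date; any code path that no test exercises can still hide the same kind of typo.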

Running static analysis frequently and writing unit tests can save a lot of pain. But what next? There are still runtime errors falling through the cracks. Logging and a debugger can help find the problem, but you have to run the code again and again to iron out all the problems, which can take hours or days...

In the next post I will talk about a technique that I use to further reduce the pain.

[1] Guido's essays on static type checking

Thursday, 27 November 2008

Configuration

Just read Yang's new blog post about configuration. He has enumerated a list of requirements for a configuration framework. I have some thoughts on this too, because I have been through configuration hell in the past year. To crystallize the concept here, I will try to categorize configuration:
  1. Dependency tree. E.g. A cannot function until B is online, B cannot function until C and D are online, etc.
  2. Wiring. E.g. A can connect to B at address 10.1.1.0:8000
  3. Membership. A and B are in Group 1. This often ties to permission and capability
  4. Variables. From physical knobs to parameters set for a program
  5. Access control.
It is very high level and broad. I don't think a single "framework" can enable effective configuration management across all these categories. It's an important problem to solve, but I believe it is easier to tackle with a divide and conquer approach, i.e. come up with a process and a tool for one clearly defined category. Two things are quite important: 1. Engineers typically attack problems by building powerful tools, but the process of using the tool is often much more important. 2. Try not to be ambitious when defining the target category, because ambition there means complexity hell later on, when more and more clever people try to use the tool to configure things more cleverly.

Tuesday, 8 July 2008

Google App Engine with Django Unit Testing

To my surprise it is extremely easy to develop unit tests for GAE apps. With the help of GoogleAppEngineDjangoHelper, you can write unit tests around your models and application logic against a fake backend store. So you should be able to test most of your code without involving a browser! Here is a list of steps to develop a doctest for GAE apps.
  1. Check out http://code.google.com/p/google-app-engine-django/ and make the necessary changes to your code base according to http://code.google.com/p/google-app-engine-django/source/browse/trunk/README
  2. In your project directory run: python manage.py shell
  3. Explore the APIs and play with your models in the interactive session.
  4. Record the session as a doctest. See an example in the Rietveld project: http://code.google.com/p/rietveld/source/browse/branches/testing/codereview/tests.py
  5. Replay the test by: python manage.py test _your_package_
(Of course you can also write more traditional unit tests after step 1; this is just an example to show how natural it is to write a unit test for GAE apps.)
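For readers who have not recorded a doctest before, the sketch below shows the general shape of steps 3 to 5 using only the standard doctest module. It assumes Python 3 and does not touch App Engine or the Django helper at all: the Issue class is a hypothetical stand-in for a model, and replay is a made-up helper, not what "manage.py test" does internally.

```python
import doctest


class Issue(object):
    """Hypothetical stand-in for a model class; not a real GAE model."""

    def __init__(self, subject):
        self.subject = subject
        self.closed = False

    def close(self):
        self.closed = True


# What a recorded interactive session looks like when pasted as text.
RECORDED_SESSION = """
>>> issue = Issue("Fix typo")
>>> issue.subject
'Fix typo'
>>> issue.close()
>>> issue.closed
True
"""


def replay(session, namespace):
    """Parse the recorded session and run it as a doctest."""
    parser = doctest.DocTestParser()
    test = parser.get_doctest(session, dict(namespace), "recorded_session", None, 0)
    runner = doctest.DocTestRunner(verbose=False)
    runner.run(test)
    return runner.summarize(verbose=False)


results = replay(RECORDED_SESSION, {"Issue": Issue})
print(results)
```

If the behaviour of the class ever drifts from the recorded output, the failed count in the result goes above zero.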

Saturday, 7 June 2008

Hitting a scalability issue in the wild App Engine world

Yesterday we hit a scalability issue on Rietveld: DeadlineExceededError kept showing up on review issues that have a lot of patches. It was caused by a nested loop that accessed the datastore in every iteration. After the panic and patching, I felt pretty good about this "feature" of App Engine. It means that most of the time when we hit DeadlineExceededError, an inefficient algorithm has probably just been checked in. We would have assumed it to be some network delay if App Engine didn't kill the request.
Another lesson learnt is to use the profiler often. If datastore access shows up high in the profiling report, we should look at the pending change again for performance bugs, and consider caching.
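To illustrate the kind of bug this was, here is a rough sketch that counts round trips to a fake store. It is not the Rietveld code and does not use the App Engine datastore API: FakeStore, its get and get_many methods, and the data layout are all made up for this example.

```python
class FakeStore(object):
    """Hypothetical in-memory store that counts round trips."""

    def __init__(self, rows):
        self.rows = rows
        self.calls = 0

    def get(self, key):
        self.calls += 1  # one round trip per entity
        return self.rows[key]

    def get_many(self, keys):
        self.calls += 1  # one round trip for the whole batch
        return [self.rows[key] for key in keys]


def count_lines_nested(store, issue):
    """Nested loop that hits the store in every iteration."""
    total = 0
    for patchset_key in issue["patchsets"]:
        patchset = store.get(patchset_key)
        for patch_key in patchset["patches"]:
            total += store.get(patch_key)["lines"]
    return total


def count_lines_batched(store, issue):
    """Same result with one batched fetch per level."""
    patchsets = store.get_many(issue["patchsets"])
    patch_keys = [key for ps in patchsets for key in ps["patches"]]
    return sum(patch["lines"] for patch in store.get_many(patch_keys))


def make_rows():
    rows = {
        "ps1": {"patches": ["p1", "p2", "p3"]},
        "ps2": {"patches": ["p4", "p5", "p6"]},
    }
    for n in range(1, 7):
        rows["p%d" % n] = {"lines": 10}
    return rows


issue = {"patchsets": ["ps1", "ps2"]}

slow_store = FakeStore(make_rows())
slow_total = count_lines_nested(slow_store, issue)

fast_store = FakeStore(make_rows())
fast_total = count_lines_batched(fast_store, issue)

print(slow_total, slow_store.calls)  # prints 60 8
print(fast_total, fast_store.calls)  # prints 60 2
```

With two patch sets of three patches the difference is small, but the nested version grows with the number of patches while the batched one stays at two calls.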

Tuesday, 13 May 2008

More code review

Since my last post, I have been working with Guido on the Rietveld project: http://code.google.com/p/rietveld/
It's been a great experience. I've added some features here and there. Guido has been very prompt in giving review comments, which made the transatlantic collaboration a bit easier, and I've learned a lot from his thorough reviews. I will continue working on this project for as long as I can keep learning something new and building useful things. Hopefully I can also encourage more people in the open source community to start doing frequent code reviews along the way.