Thursday, 27 November 2008

Configuration

Just read Yang's new blog post about configuration. He has enumerated a list of requirements for a configuration framework. I have some thoughts on this too, because I have been through configuration hell over the past year. To crystallize the concept, I will try to categorize configuration:
  1. Dependency tree. E.g. A cannot function until B is online, B cannot function until C and D are online, etc.
  2. Wiring. E.g. A can connect to B at address 10.1.1.0:8000
  3. Membership. A and B are in Group 1. This often ties to permission and capability
  4. Variables. From physical knobs to parameters set for a program
  5. Access control.
This list is very high level and broad. I don't think a single "framework" can enable effective configuration management across all of these categories. It's an important problem to solve, but I believe it is easier to tackle with a divide-and-conquer approach, i.e. come up with a process and a tool to solve one clearly defined category. Two things are quite important: 1. Engineers typically attack problems by building powerful tools, but the process of using the tool is often much more important. 2. Try not to be ambitious when defining the target category, because ambition there means complexity hell later on, when more and more clever people try to use the tool to configure things more cleverly.
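To make the first category concrete, here is a minimal sketch of the dependency example from the list (A needs B, B needs C and D). It is only an illustration, not a proposal for a configuration tool: the service names and the startup_order helper are made up, and it assumes Python 3.9 or later for the standard-library graphlib module.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical services taken from the example in the list:
# each key depends on every service in its value set.
depends_on = {
    "A": {"B"},
    "B": {"C", "D"},
    "C": set(),
    "D": set(),
}


def startup_order(graph):
    """Return a list in which every service comes after its dependencies."""
    return list(TopologicalSorter(graph).static_order())


order = startup_order(depends_on)
print(order)  # C and D come before B, and B comes before A
```

A tool scoped to just this category could stop here: it answers "what must be online first?" and nothing else.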

Tuesday, 8 July 2008

Google App Engine with Django Unit Testing

To my surprise, it is extremely easy to develop unit tests for GAE apps. With the help of GoogleAppEngineDjangoHelper, you can write unit tests around your models and application logic against a fake backend store. So you should be able to test most of your code without involving a browser! Here is a list of steps to develop a doctest for GAE apps.
  1. Check out http://code.google.com/p/google-app-engine-django/ and make the necessary changes to your code base according to http://code.google.com/p/google-app-engine-django/source/browse/trunk/README
  2. In your project directory run: python manage.py shell
  3. Explore the APIs and play with your models in the interactive session.
  4. Record the session as a doctest. See an example in the Rietveld project : http://code.google.com/p/rietveld/source/browse/branches/testing/codereview/tests.py
  5. Replay the test by: python manage.py test _your_package_
(Of course, you can also write more traditional unit tests after step 1; this is just an example to show how natural it is to write a unit test for GAE apps.)
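For readers who have not seen a doctest before, here is a minimal sketch of what a recorded session looks like. To keep it runnable anywhere, it uses a plain Python class instead of a GAE model; the Issue class and its methods are made up for illustration and are not taken from Rietveld or the helper. It is written for Python 3.

```python
class Issue:
    """A toy stand-in for a model class (hypothetical, not a GAE model).

    The lines below were "recorded" from an interactive session:

    >>> issue = Issue("Fix typo")
    >>> issue.add_patch("README")
    >>> issue.add_patch("tests.py")
    >>> issue.patch_count()
    2
    >>> issue.subject
    'Fix typo'
    """

    def __init__(self, subject):
        self.subject = subject
        self.patches = []

    def add_patch(self, filename):
        self.patches.append(filename)

    def patch_count(self):
        return len(self.patches)


if __name__ == "__main__":
    import doctest

    # Replays every >>> line in this module and compares the output.
    doctest.testmod()
```

In a real project the same docstring would talk to your models, and step 5 would replay it through manage.py instead of doctest.testmod().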

Saturday, 7 June 2008

Hitting a scalability issue in the wild App Engine world

Yesterday we hit a scalability issue on Rietveld: DeadlineExceededError kept showing up on review issues that have a lot of patches. It was caused by a nested loop that accesses the datastore in every iteration. After the panic and the patching, I felt pretty good about this "feature" of App Engine. It means that most of the time when we hit DeadlineExceededError, an inefficient algorithm has probably just been checked in. We would have assumed it was some network delay if App Engine hadn't killed the request.
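The shape of the problem can be sketched as follows. This is not the actual Rietveld code or its fix: FakeDatastore, get_multi and the comment-counting functions are all hypothetical, and the store is an in-memory stand-in that simply counts round trips (Python 3).

```python
class FakeDatastore:
    """In-memory stand-in for a remote store; counts round trips."""

    def __init__(self, rows):
        self._rows = dict(rows)
        self.round_trips = 0

    def get(self, key):
        self.round_trips += 1
        return self._rows[key]

    def get_multi(self, keys):
        # One round trip no matter how many keys are requested.
        self.round_trips += 1
        return [self._rows[key] for key in keys]


def comment_total_slow(store, patchsets):
    """Nested loop with one datastore call per patch."""
    total = 0
    for patch_keys in patchsets:
        for key in patch_keys:
            total += store.get(key)["comments"]
    return total


def comment_total_batched(store, patchsets):
    """Collect all keys first, then fetch them in a single call."""
    keys = [key for patch_keys in patchsets for key in patch_keys]
    return sum(row["comments"] for row in store.get_multi(keys))


rows = {f"patch{i}": {"comments": i} for i in range(6)}
patchsets = [["patch0", "patch1", "patch2"], ["patch3", "patch4", "patch5"]]

slow_store = FakeDatastore(rows)
fast_store = FakeDatastore(rows)
slow_total = comment_total_slow(slow_store, patchsets)
fast_total = comment_total_batched(fast_store, patchsets)
print(slow_total, slow_store.round_trips)
print(fast_total, fast_store.round_trips)
```

Both functions return the same total, but the first one pays for a round trip on every patch, which is exactly what blows the deadline once an issue has enough patches.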
Another lesson learnt is to use the profiler often. If datastore access shows up high in the profiling report, we should look at the pending change again for performance bugs, and consider caching.
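As a rough sketch of what "use the profiler" can look like outside App Engine, the snippet below uses the standard cProfile and pstats modules on a fake datastore call. The function names and the 1 ms sleep are assumptions for illustration only; how you hook a profiler into a real request handler depends on your setup.

```python
import cProfile
import io
import pstats
import time


def fake_datastore_get(key):
    """Hypothetical stand-in for a slow datastore read."""
    time.sleep(0.001)
    return {"key": key}


def render_issue(patch_count):
    """Hypothetical view code that reads one entity per patch."""
    return [fake_datastore_get(i) for i in range(patch_count)]


profiler = cProfile.Profile()
profiler.enable()
render_issue(20)
profiler.disable()

# Write the report into a string, sorted by cumulative time.
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats("cumulative").print_stats(10)
report = buffer.getvalue()
print(report)
```

If a function like fake_datastore_get sits near the top of the report with a call count that grows with the input, that is the hint to go back and look for a loop around the datastore.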

Tuesday, 13 May 2008

More code review

Since my last post, I have been working with Guido on the rietveld project http://code.google.com/p/rietveld/ .
It's been a great experience. I've added some features here and there. Guido has been very prompt in giving review comments. That made the transatlantic collaboration a bit easier. And I've learned a lot from his thorough reviews. I will continue working on this project for as long as I can keep learning something new and build useful stuff. And hopefully encourage more people in the open source community to start doing frequent code reviews along the way.

Wednesday, 7 May 2008

Open source code review system!

I just wrote in my blog yesterday that I want to build an open source code review system on Google App Engine. And today I am already reading Guido van Rossum's Rietveld. It is based on "Mondrian", the code review system that is widely used at Google and was also developed by Guido.
I've just tried it out with a patch I made for WebDriver. It worked pretty well, even though the user interface still looks to be at an early stage. I'm going to check out the code now to learn more about it.

Monday, 5 May 2008

Google App Engine

GAE (Google App Engine) is Google's solution to scalable web hosting. Unlike traditional web hosting, which provides limited bandwidth and data storage, it is backed by the same infrastructure that hosts many of Google's own web applications.
I've been trying out GAE for the past two weeks. It's pretty easy to use. I've built a very simple web-based marketplace kind of application, which I may release as open source once I am happy to demo it. All I used was some Python code for the business logic and data management, plus some HTML and CSS. There is almost no configuration needed. GAE hosts all the web resources (HTML, images, CSS, Python code) and all the data. Data can be retrieved using an SQL-like query language (much simplified, though; it's more like an object persistence layer). There are some problems, like the lack of background processes. That has not been a problem for me yet, and it is already being worked on by the GAE team.
I feel this is the start of many great things to come. Previously, you needed to find a web hosting solution that provided database and CGI support whenever you wanted to set up something that interacts with users on a website. And you would need a data center to back it if it became popular or just happened to get slashdotted or dugg. GAE has definitely lowered the barrier to entry for the web application market. I can't wait to see all the innovative applications that are going to be built by the new players who are riding this tide into the market.