Thursday, April 15, 2010

Content Provider Iterator (or Things That Make Me Nervous)

    Yesterday Tim Bray, Google's newest Android Developer Advocate, shared his Content Provider Iterator. Before he shared it with you, he showed it to me to check "is this sick and twisted?". Here (with some additional colour) is what I told him.
Let me start with an admission. I have a deep mistrust of metaprogramming tricks in statically-typed systems. I've spent far too much of my working life unpicking frameworks created to simplify complex parts of existing frameworks. Often as not, the class descriptions could be summed up as "I don't understand this way of doing things -- this class makes this framework behave the way I expect."

There's often a good reason the original framework works the way it does. In Android that reason is performance. When developing for servers and PCs simplicity and familiarity is often a good trade-off for a marginal performance hit. In mobile it isn’t.

That said, these sorts of helper classes can improve code readability and cement best practices so I owe it to Tim to justify my unease.

The Problem

First off, let’s look at what we’re trying to solve -- iterating over a Cursor.

Tim’s code lets us do this:

Reader calls = new Reader(Call.class, activity, CallLog.Calls.CONTENT_URI);
for (Call call : calls)
  String number = call.getnumber();


The alternative in “normal” Android is this:

Cursor cursor =   

  getContentResolver().query(CallLog.Calls.CONTENT_URI,
                             null, null, null, null);
while (cursor.moveToNext())
  String number = 

   cursor.getString(cursor.getColumnIndex(CallLog.Calls.NUMBER));
cursor.close();


Certainly Tim’s solution is more familiar to those of us who prefer working with Objects to Cursors, and we save a line of code, but what’s the cost?

A Solution?

The two golden rules when designing for performance in Android are:

  • Don't do work that you don't need to do
  • Don't allocate memory if you can avoid it
 This pattern looks expensive.

1) Avoid creating objects

Allocating memory (and performing garbage collection in a resource constrained environment is expensive. Every time you extract information from a Cursor Using this class, you're creating a new object for each row, and allocating memory to store every column value too.

2) Avoid Reflection

Wow, that’s a lot of reflection. Reflection is expensive.

For every iteration you’re using reflection to create a new instance of your row object. During the construction of every object you use reflection to find a Setter for each column, then again to invoke the Setter for each column.

That’s a lot of reflection and it won’t take long to feel that hit on your performance.

3) Avoid Method Calls

Method calls are (comparatively) expensive - enough so that it's generally considered good practice in Android to avoid using Getters and Setters within a class. After creating objects you need virtual method calls to allocate each of the row values used, and again to extract them.

4) Minimize frequency of object creation

What's neat about this solution is it's dynamicsm, it creates your new objects each time you to iterate over the list without needing to maintain a separate collection.

But there is a risk here. Because it's dynamic, the generic solution doesn't offer a solution for monitoring changes to objects or the addition or removal of rows. To handle this you'd re-iterate over the list whenever the underlying dataset changes - reinforcing the performance issues described above.

5) Static state and thread safety

As written, this code isn’t thread safe, and maintains the current position of the cursor as a state variable. Using the same Reader to iterate over a cursor in two threads will come to grief very quickly indeed!

Conclusion

The Call Log has a half dozen columns and maybe a hundred rows, so you might get away with this. But what happens when you do it over contacts? Or email? Or Ocado’s entire grocery catalogue? It’s a dangerous pattern because its performance cost is dependent entirely on the shape of your dataset.

Imagine the case for a retailer with a database of several thousand products, each with a dozen String-based columns. Say 2,000 products with 12 string rows. That’s using reflection over 50,000 times and creating nearing 30,000 objects. That’s an awful lot of reflection and object creation to save a line of code.

Different platforms require different thinking. In my experience, becoming comfortable with how these systems are designed to be used is vital to understanding the best ways of implementing solutions. To quote an Android engineer on the subject, “I understand that coming from Ruby, one would want objects generated for you, but there is such a huge performance cost on mobile devices that it simply doesn't work.”

6 comments:

  1. I have a doubt. Can we do this in better way:

    while (cursor.moveToNext())
    String number =
    cursor.getString(cursor.getColumnIndex(CallLog.Calls.NUMBER));
    cursor.close();

    like this:

    int idxNumber = cursor.getColumnIndex(CallLog.Calls.NUMBER);
    while (cursor.moveToNext())
    String number = cursor.getString(idxNumber );
    cursor.close();

    Is this version faster? How much is the cost of calling cursor.getColumnIndex()?

    To whole article: maybe Tim can do the simply comparison of time execution and memory consumption for both version of code, using few providers like calls, contacts and emails, hmm?

    ReplyDelete
  2. Anonymous1:17 am BST

    Reto you have aptly put into words a point I have often wanted to make. I personally avoid reflection like the plague, and have always been annoyed at how much time I've spent debugging, optimizing and trying to maintain systems that over use it. My design goals are speed to market, maintainability, ease of implementation, and performance. It feels to me like reflection can sometimes help with speed to market and ease of implementation (for those that like reflection), but for me it feels so wrong it's harder, and it reduces maintainability. I believe it can sometimes help, and sometimes really hurt, performance.

    My 2 cents,
    -MK

    ReplyDelete
  3. I don't see any profiling that proves the performance impact of this code is what you suspect, so it's not as useful as one might think. @Krzysztof: nice suggestion.

    ReplyDelete
  4. Poor code. You forgot:
    1. ContentResolver.query() can return NULL!
    2. ContentResolver.query() can throw exception, namely SecurityException.
    3. Query.close() needs to be in a finally, to ensure cleanup.

    Let's worry about correct operation first!

    ReplyDelete
  5. I am starting to learn Android but could it be possible to do the following?

    Make an iterable collection that returns dummy Call objects (generated using the fly weight pattern and depending on the URI) with the index and lazily move the cursor and fetch the data that is being queried.

    It will mean delegating the fetching of the data to the Call object but could be used to provide the OO simplicity of the process.

    The reader can be used as a factory for these collections (can you use generics on Android? I said I am starting (8 )

    Pablisco.

    ReplyDelete
  6. Thank u for this post!
    I want to read more comments on this topic,
    that would be very helpful.So I have bookmarked
    and is waiting for more comments

    ReplyDelete