Dec 5
Cool Python Tricks part II
A new installment in my ever-popular series.
Today’s installment concerns one of the best performance-enhancing tricks in Python: list comprehensions.
Most beginning pythonistas will produce code like this:
lst = []
for i in range(0, 10):
    lst.append(i**2)
Now, that’s pretty okay. It runs… okay. It is the kind of code I would expect from a newcomer to Python. However, one of Python’s weaknesses is its slow running speed. This, by the way, is a weakness of most interpreted languages, such as Ruby.
Now, imagine you have a much larger data set to work with, on the order of thousands or hundreds of thousands of items. Processing all of it at Python speed will drag your program down.
But, you are in luck: there is a faster way.
Unfortunately, this faster method has its own set of issues, and the first and most obvious one is legibility. Done wrong, this technique can and will obscure the meaning of your program and make it harder to maintain.
The technique is the list comprehension. Written as a list comprehension, the code above becomes:
lst=[i**2 for i in range(0, 10)]
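This line runs faster than the loop because the list is built by the interpreter’s internal machinery, rather than by looking up and calling lst.append on every pass. The expression itself still runs at Python speed, so the gain is usually modest rather than dramatic. Measure it on your own data before counting on it.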
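To see how the two versions compare on your own machine, here is a minimal sketch using the standard timeit module. The function names, the list size and the repeat count are my own choices for illustration, and the numbers you get will depend on your Python version and hardware.

```python
import timeit


def squares_loop(n):
    """Build the list of squares with an explicit for loop and append."""
    lst = []
    for i in range(n):
        lst.append(i**2)
    return lst


def squares_comprehension(n):
    """Build the same list of squares with a list comprehension."""
    return [i**2 for i in range(n)]


def time_both(n=10_000, repeats=50):
    """Return (loop_seconds, comprehension_seconds) for `repeats` runs each."""
    loop_time = timeit.timeit(lambda: squares_loop(n), number=repeats)
    comp_time = timeit.timeit(lambda: squares_comprehension(n), number=repeats)
    return loop_time, comp_time


if __name__ == "__main__":
    # Both versions must agree before the timing means anything.
    assert squares_loop(1000) == squares_comprehension(1000)
    loop_time, comp_time = time_both()
    print(f"for loop:      {loop_time:.4f} s")
    print(f"comprehension: {comp_time:.4f} s")
```

Try a few different sizes for n: the gap between the two tends to be a fairly steady ratio rather than something that grows with the data.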
Now, let’s break it down: there are three parts to a list comprehension. The first is an expression. The second is a for clause. The third, which is optional, is any number of additional for or if clauses.
The first for clause in the list comprehension declares a variable, and the expression normally does something with that variable.
Or, for those language lawyers:
test ::= or_test | lambda_form
testlist ::= test ( "," test )* [ "," ]
list_display ::= "[" [listmaker] "]"
listmaker ::= expression ( list_for | ( "," expression )* [","] )
list_iter ::= list_for | list_if
list_for ::= "for" expression_list "in" testlist [list_iter]
list_if ::= "if" test [list_iter]
Basically, a list comprehension consists of:
[(expression using 'x') (for clause declaring 'x') (optionally: if 'x' meets some condition)]
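Here is a small sketch of that template in action, with each comprehension next to the plain loop it stands for. The variable names and ranges are made up for the example. Note how extra for and if clauses nest left to right, in the same order you would write the loops.

```python
# Expression + for clause + if clause: squares of the even numbers below 10.
even_squares = [x**2 for x in range(10) if x % 2 == 0]

# The same list built with an ordinary loop.
even_squares_loop = []
for x in range(10):
    if x % 2 == 0:
        even_squares_loop.append(x**2)

# Two for clauses and an if clause: every pair (x, y) with x < y.
pairs = [(x, y) for x in range(3) for y in range(3) if x < y]

# The equivalent nested loops, written in the same left-to-right order.
pairs_loop = []
for x in range(3):
    for y in range(3):
        if x < y:
            pairs_loop.append((x, y))

print(even_squares)  # [0, 4, 16, 36, 64]
print(pairs)         # [(0, 1), (0, 2), (1, 2)]
```

Once you get past two or three clauses, the loop version is usually the easier one to read, which is the legibility issue mentioned earlier.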
One of the cardinal rules of optimization is that you only optimize once you know where the bottlenecks are. So, when you first write code that builds a list, use the regular for loop. Then, after the program works, you can rewrite it as a list comprehension and test the resulting speed.
This will, in most cases, make your program faster, and make you a Python guru. So, till next time, keep on hissing!
2 Comments so far
I would suggest that generator expressions have become even more important than list comprehensions, due to the value of laziness. Why generate those values before you find out if you even need them?
Thanks wtd. I’ll be thinking about that topic, probably writing a post soon.
An initial consideration is the speed of the construct. List comprehensions have the benefit of being computed at C speed, not Python speed. It’s always the first optimization I make. I’m not sure about the performance of generators, but I will find out.