Tips for golfing in Python
What general tips do you have for golfing in Python? I'm looking for ideas which can be applied to code-golf problems and which are also at least somewhat specific to Python (e.g. "remove comments" is not an answer).
Please post one tip per answer.
Oh, I can see a whole set of questions like this one coming for each language...
@Marthinho I agree. Just started a C++ equivalent. I don't think it's a bad thing though, as long as we don't see the same answers re-posted across many of these question types.
Love the question but I have to keep telling myself "this is ONLY for fun NOT for production code"
@dorukayhan Nope; it's a valid [tag:code-golf] [tag:tips] question, asking for tips on shortening [tag:python] code for CG'ing purposes. Such questions are perfectly valid for the site, and none of these tags explicitly says that the question should be CW'd, unlike SO, which required CG challenges to be CW'd. Also, writing a good answer, and finding such tips always deserves something, that is taken away if the question is community wiki (rep).
@Chris_Rands That simply does not universally hold, as there are cases in which Python 3 allows for shorter submissions.
Note that this will not necessarily work for defining mutable objects that you will be modifying in-place. a=b= is actually different from a=;b=
But NEVER use a=b=c= or any object instantiation since all the variables will point to the same instance. That's probably not what you want.
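To make the warning in the two comments above concrete, here is a small sketch of the aliasing behaviour. The variable names are only illustrative, and the snippet is written to run on Python 3 as well as Python 2.

```python
# Chained assignment binds every name to the SAME object.
a = b = []
a.append(1)              # mutates the one shared list
shared = (a is b, b)

# Separate assignments create two independent lists.
c = []; d = []
c.append(1)
separate = (c is d, d)

# With immutable values such as ints the sharing is harmless,
# because rebinding one name never affects the other.
x = y = 0
x += 1
ints = (x, y)

print(shared, separate, ints)
```

So the chained form is safe for numbers and strings, but not for lists, dicts or other objects you intend to modify in place.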
Conditionals can be lengthy. In some cases, you can replace a simple conditional with `(a,b)[condition]`. If `condition` is true, then `b` is returned.
Compare
if a<b:return a
else:return b
to
return(b,a)[a<b]
These aren't exactly the same. The first one evaluates only the expression that is returned while the second one always evaluates them both. These ones do short-circuit: `a if a
@marinus, they are not equal: just consider `P and A or B` for any A that gives `bool(A)=False`. But `(P and [A] or [B])` will do the job. See http://www.diveintopython.net/power_of_introspection/and_or.html for reference.
Be careful of using this to do recursion, ie. `f = lambda a:(a, f(a-1))[a>1]` because this will evaluate the options *before* the conditional, unlike `f = lambda a: f(a-1) if a>1 else a`, which only executes the recursive `f(a-1)` if the condition `a>1` evaluates to `True`.
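The evaluation-order point raised in these comments can be checked with a short sketch. The helper `noisy` is hypothetical and exists only to record which operands get evaluated; the values of `a`, `b`, `P`, `A` and `B` are arbitrary.

```python
calls = []

def noisy(name, value):
    # Record that this operand was evaluated, then return its value.
    calls.append(name)
    return value

a, b = 2, 5

# Tuple indexing: both operands are built before the index is applied.
calls.clear()
r1 = (noisy("b", b), noisy("a", a))[a < b]
evaluated_by_indexing = list(calls)

# Conditional expression: only the selected operand is evaluated.
calls.clear()
r2 = noisy("a", a) if a < b else noisy("b", b)
evaluated_by_ternary = list(calls)

# `P and A or B` goes wrong when A is falsy; wrapping in lists avoids it.
P, A, B = True, 0, 7
and_or = P and A or B
and_or_wrapped = (P and [A] or [B])[0]

print(r1, evaluated_by_indexing, r2, evaluated_by_ternary)
print(and_or, and_or_wrapped)
```

Both forms return the same value here, but only the indexing form evaluates the operand it throws away, which is what makes it unsafe for recursion or side effects.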
A great thing I did once is:
if 3 > a > 1 < b < 5: foo()
instead of:
if a > 1 and b > 1 and 3 > a and 5 > b: foo()
Python’s comparison operators rock.
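As a quick sanity check that the chained form and the spelled-out form agree, here is a brute-force comparison over a small range of integers. The function names are made up for this sketch.

```python
from itertools import product

def chained(a, b):
    # Reads as: 3 > a and a > 1 and 1 < b and b < 5
    return 3 > a > 1 < b < 5

def spelled_out(a, b):
    return a > 1 and b > 1 and 3 > a and 5 > b

# Compare the two on every pair of small integers.
agree = all(chained(a, b) == spelled_out(a, b)
            for a, b in product(range(-2, 8), repeat=2))
print(agree)
```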
Using the fact that everything is comparable in Python 2, you can also avoid the `and` operator this way. For example, if `a`, `b`, `c` and `d` are integers,
if a<b and c>d:foo()
can be shortened by one character to:
if a<b<[]>c>d:foo()
This relies on every list comparing greater than any integer in Python 2.
If `c` and `d` are lists, this gets even better:
if a<b<c>d:foo()
Love the symmetry. Reminds me of the old Perl golf trick for finding the min of $a and $b: `[$a => $b]->[$b <= $a]` :)
Note that the second example (no lists) can also be done with `if(ad):foo()`
If you're using a built-in function repeatedly, it might be more space-efficient to give it a new name, if using different arguments:
r=range
for x in r(10):
 for y in r(100):print x,y
r=range and the other two r's are 9 characters; using range twice is 10 characters. Not a huge saving in this example but all it would take is one more use of range to see a significant saving.
Indeed two repetitions is too little to save on a length five function name. You need: length 2: 6 reps, length 3: 4 reps, length 4 or 5: 3 reps, length >=6: 2 reps. AKA (length-1)*(reps-1)>4.
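The break-even rule from the comment above can be reproduced with a few lines. This is only a sketch: it assumes a one-character alias and counts the assignment as `r=` plus the name plus one separator (newline or semicolon). The function names are invented for the example.

```python
def alias_saving(length, reps):
    """Characters saved by aliasing a name of `length` used `reps` times."""
    plain = length * reps              # writing the full name every time
    aliased = (2 + length + 1) + reps  # "r=" + name + separator, then 1 char per use
    return plain - aliased

def min_reps(length):
    # Smallest number of uses at which the alias strictly saves characters.
    reps = 1
    while alias_saving(length, reps) <= 0:
        reps += 1
    return reps

for length in range(2, 8):
    print(length, min_reps(length))
```

Under these assumptions a positive saving is equivalent to `(length-1)*(reps-1)>4`, and `range` used twice breaks exactly even.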
Sometimes your Python code requires you to have 2 levels of indentation. The obvious thing to do is use one and two spaces for each indentation level.
However, Python 2 considers the tab and space characters to be different indenting levels.
This means the first indentation level can be one space and the second can be one tab character.
This fails in Python 3: you can no longer mix spaces and tabs (a bad thing for code golf, but a good thing in all other cases).
@trichoplax, In python 3.4.3 I get `TabError: inconsistent use of tabs and spaces in indentation.`
Note that your editor *may* be changing the tab characters into spaces. This happened to me in VSCode. The trick there is to 1) enable whitespace rendering - `"editor.renderWhitespace": "all"` and 2) stop the editor from replacing tabs with spaces - `"editor.insertSpaces": false, "editor.detectIndentation": false`. (VSCode version 1.46.1). If you want these to work only for the current golfing project, you can add a local settings.json file to the current workspace.
Use string substitution and `exec` to deal with long keywords like `lambda` that are repeated often in your code.
a=lambda b:lambda c:lambda d:lambda e:lambda f:0 # 48 bytes (plain)
exec"a=`b:`c:`d:`e:`f:0".replace('`','lambda ') # 47 bytes (replace)
exec"a=%sb:%sc:%sd:%se:%sf:0"%(('lambda ',)*5) # 46 bytes (%)
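The target string is very often `'lambda '`, which is 7 bytes long. Suppose your code snippet contains `n` occurrences of `'lambda '`, and is `s` bytes long. Then:
- The `plain` option is `s` bytes long.
- The `replace` option is `s - 6n + 29` bytes long.
- The `%` option is `s - 5n + 22 + len(str(n))` bytes long.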
From a plot of bytes saved over `plain` for these three options, we can see that:
- For n < 5 lambdas, you're better off not doing anything fancy at all.
- For n = 5, writing `exec"..."%(('lambda ',)*5)` saves 2 bytes, and is your best option.
- For n > 5, writing ``exec"...".replace('`','lambda ')`` is your best option.
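The byte counts for the `'lambda '` case can be checked mechanically. The sketch below encodes the two length formulas quoted in this answer (the function names are mine) and then runs the `replace` variant in its Python 3 spelling, where `exec` is a function and so costs two extra bytes for the parentheses.

```python
def replace_len(s, n):
    # Each 'lambda ' (7 bytes) shrinks to a 1-byte placeholder; the exec
    # wrapper plus .replace('`','lambda ') adds 29 bytes in Python 2.
    return s - 6 * n + 29

def percent_len(s, n):
    # Each 'lambda ' shrinks to '%s' (2 bytes); the wrapper adds 22 bytes
    # plus the digits of n.
    return s - 5 * n + 22 + len(str(n))

plain = "a=lambda b:lambda c:lambda d:lambda e:lambda f:0"
s, n = len(plain), plain.count("lambda ")
print(s, replace_len(s, n), percent_len(s, n))

# Python 3 form of the replace trick, executed in its own namespace.
ns = {}
exec("a=`b:`c:`d:`e:`f:0".replace('`', 'lambda '), ns)
result = ns["a"](1)(2)(3)(4)(5)
print(result)
```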
For other cases, you can index the table below:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 (occurences)
3 | - - - - - - - - - - - - - - r r r r r
4 | - - - - - - - - - r r r r r r r r r r
5 | - - - - - - - r r r r r r r r r r r r
6 | - - - - - r r r r r r r r r r r r r r
7 | - - - - % r r r r r r r r r r r r r r
8 | - - - % % r r r r r r r r r r r r r r
9 | - - - % % r r r r r r r r r r r r r r
10 | - - % % % r r r r r r r r r r r r r r
11 | - - % % % r r r r r r r r r r r r r r
12 | - - % % % r r r r r r r r r r r r r r r = replace
13 | - - % % % r r r r r r r r r r r r r r % = string %
14 | - % % % % r r r r r r r r r r r r r r - = do nothing
15 | - % % % % r r r r r r r r r r r r r r
For example, if the string `lambda x,y:` (length 11) occurs 3 times in your code, you're better off writing `exec"..."%(('lambda x,y:',)*3)`.
I added a new operator for lambda in my language based off of python: `=>` is just the string `= lambda `. For example, `f=>:0` would be `f = lambda: 0`.
The same applies to `import` - you can do `for s in ('module1','module2','etc'):exec"from %s import*"%s`
Use extended slicing to select one string from many
>>> for x in 0,1,2:print"fbboaaorz"[x::3]
>>> for x in 0,1,2:print["foo","bar","baz"][x]
In this Boolean two-string case, one can also write
Unlike interleaving, this works for strings of any length, but can have operator precedence issues if `b` is instead an expression.
Note that the first example is exactly the same length as `for x in ("foo","bar","baz"): print x`
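For anyone wondering how to build the interleaved string in the first place, here is a rough sketch. It assumes all the words have the same length, as in the `foo`/`bar`/`baz` example; the variable names are illustrative.

```python
words = ["foo", "bar", "baz"]

# Interleave equal-length words: all first letters, then all second letters, ...
packed = "".join("".join(column) for column in zip(*words))

# Every len(words)-th character starting at offset x spells word number x.
unpacked = [packed[x::len(words)] for x in range(len(words))]

print(packed, unpacked)
```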
Store lookup tables as magic numbers
Say you want to hardcode a Boolean lookup table, like which of the first twelve English numbers contain an `n`.
Then, you can implement this lookup table concisely as:
3714>>n&1
with the resulting `0` or `1` being equal to `False` or `True`.
The idea is that the magic number stores the table as a bitstring `bin(3714) = 0b111010000010`, with the `n`-th digit (from the end) corresponding to the `n`th table entry. We access the `n`th entry by bitshifting the number `n` places to the right and taking the last digit with `&1`.
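Here is a sketch of how such a magic number could be generated rather than worked out by hand. It assumes the property being looked up is "the English name contains the letter n" and that index 0 stands for "zero"; both are my reading of the bitstring `0b111010000010` quoted above, and the names `names`, `magic` and `contains_n` are made up for the example.

```python
names = ["zero", "one", "two", "three", "four", "five", "six",
         "seven", "eight", "nine", "ten", "eleven", "twelve"]

# Build the magic number: bit i is set when names[i] contains an 'n'.
magic = sum(1 << i for i, name in enumerate(names) if "n" in name)
print(magic, bin(magic))

def contains_n(i):
    # Shift entry i down to the lowest bit, then mask it off.
    return magic >> i & 1

print([i for i in range(13) if contains_n(i)])
```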
This storage method is very efficient. Compare to the alternatives
You can have your lookup table store multibit entries that can be extracted like
to extract the relevant four-bit block.
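To illustrate the multibit variant, here is a sketch with a made-up table of values that each fit in four bits. The table contents and the names `table`, `magic` and `lookup` are purely hypothetical; only the shift-and-mask pattern is the point.

```python
# Hypothetical table of values in the range 0..15 (four bits each).
table = [3, 1, 4, 1, 5, 9, 2, 6]

# Pack entry n into bits 4*n .. 4*n+3 of a single integer.
magic = sum(value << 4 * n for n, value in enumerate(table))

def lookup(n):
    # Shift the wanted block down to the bottom, then keep four bits (15 == 0b1111).
    return magic >> 4 * n & 15

print(magic, [lookup(n) for n in range(len(table))])
```

The same pattern should extend to k-bit entries by shifting by `k*n` and masking with `2**k-1`.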
Could we have an example result for the four-bit block? Did you use a rule for n-bit block?
Use backticks, as in `n`, to convert an integer to a string instead of using `str(n)`.
Note: Only works in Python 2.
Integers smaller than -2**31 or bigger than 2**31-1 (Longs) get an 'L' tacked on at the end.
Collapse two numerical loops into one
Say you're iterating over the cells of an `m*n` grid. Instead of two nested `for` loops, one for the rows and one for the columns, it's usually shorter to use a single loop to iterate over the `m*n` cells of the grid. You can extract the row and column of the cell inside the loop.
Original code:
for i in range(m):
 for j in range(n):
  do_stuff(i,j)
Golfed code:
for k in range(m*n):
 do_stuff(k/n,k%n)
In effect, you're iterating over the Cartesian product of the two ranges, encoding the pair `(i,j)` as `k=i*n+j`. You've saved a costly `range` call and a level of indentation inside the loop. The order of iteration is unchanged.
Use `//` instead of `/` in Python 3. If you refer to `i` and `j` many times, it may be shorter to assign their values `i=k/n`, `j=k%n` inside the loop.
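A short check that the single loop visits the same cells in the same order, written for Python 3 (hence `//`). The grid size is arbitrary, and `divmod` is shown as one possible way to get both coordinates at once.

```python
m, n = 3, 4

# Reference order from two nested loops.
nested = [(i, j) for i in range(m) for j in range(n)]

# Single loop: row is k // n, column is k % n.
flat = [(k // n, k % n) for k in range(m * n)]

# divmod(k, n) returns the pair (k // n, k % n) in one call.
via_divmod = [divmod(k, n) for k in range(m * n)]

print(nested == flat == via_divmod)
```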
For reference, to extend this to 3 loops: `for i in range(m*n*o): do_stuff(i/n/o,i%(n*o)/o,i%o)`