Tips for golfing in Python

  • What general tips do you have for golfing in Python? I'm looking for ideas which can be applied to code-golf problems and which are also at least somewhat specific to Python (e.g. "remove comments" is not an answer).



    Please post one tip per answer.


    Oh, I can see a whole set of questions like this one coming for each language...

    @Marthinho I agree. Just started a C++ equivalent. I don't think it's a bad thing though, as long as we don't see the same answers re-posted across many of these question types.

    Love the question but I have to keep telling myself "this is ONLY for fun NOT for production code"

    Shouldn't this question be a community wiki post?

    @dorukayhan Nope; it's a valid [tag:code-golf] [tag:tips] question, asking for tips on shortening [tag:python] code for CG'ing purposes. Such questions are perfectly valid for the site, and none of these tags explicitly says that the question should be CW'd, unlike SO, which required CG challenges to be CW'd. Also, writing a good answer, and finding such tips always deserves something, that is taken away if the question is community wiki (rep).

    Use Python 2 for golfing not 3

    @Chris_Rands That simply does not universally hold, as there are cases in which Python 3 allows for shorter submissions.

    @JonathanFrech Especially the new `:=` operator in 3.8

  • Use a=b=c=0 instead of a,b,c=0,0,0.



    Use a,b,c='123' instead of a,b,c='1','2','3'.


    that's nice tip in general :)

    Note that this will not necessarily work for defining mutable objects that you will be modifying in-place. a=b=[1] is actually different from a=[1];b=[1]

    The funny thing about the first tip is that it works in Java too.

    @Justin Yes, but only with primitive types

    But NEVER use a=b=c=[] or any object instantiation, since all the variables will point to the same instance. That's probably not what you want.

    First part bad, second part good. a, b, and c will all refer to the same thing; changing one will change them all!
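    To make the aliasing caveat from the comments concrete, here is a small sketch (the variable names are purely illustrative) of when chained assignment is harmless and when it quietly shares one object:

```python
# Chained assignment binds every name to the same object.
a = b = c = 0
a += 1                 # ints are immutable, so only `a` is rebound

x = y = []             # one list, two names
x.append(1)            # the change is visible through `y` as well

p = [1]; q = [1]       # two separate lists
p.append(2)            # `q` is unaffected

d, e, f = '123'        # a string unpacks character by character
```

    So the trick is safe for numbers and strings, and only bites when the shared object is later mutated in place.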

  • Conditionals can be lengthy. In some cases, you can replace a simple conditional with (a,b)[condition]. If condition is true, then b is returned.



    Compare



    if a<b:return a
    else:return b


    To this



    return(b,a)[a<b]

    These aren't exactly the same. The first one evaluates only the expression that is returned, while the second one always evaluates them both. These ones do short-circuit: `a if a<b else b` and `a<b and a or b`.

    `(lambda:b,lambda:a)[a<b]()` make your own short-circuit with lambdas

    @marinus, they are not equal: just consider `P and A or B` for any A that gives `bool(A)=False`. But `(P and [A] or [B])[0]` will do the job. See http://www.diveintopython.net/power_of_introspection/and_or.html for reference.

    Lambdas are way longer than a conditional expression.

    @user2357112 But they make you look so much cooler when you use them. :]

    Be careful when using this for recursion, e.g. `f = lambda a:(a, f(a-1))[a>1]`, because this evaluates both options *before* the conditional and so never stops recursing, unlike `f = lambda a: f(a-1) if a>1 else a`, which only executes the recursive `f(a-1)` if the condition `a>1` evaluates to `True`.

    Of course, in this particular case, it can be golfed further to 'return min(a,b)'
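    A rough sketch of the evaluation-order points raised in these comments. The helper `tracked` and the function names are made up for illustration; they only record which operands actually get evaluated:

```python
evaluated = []

def tracked(label, value):
    """Record that this operand was evaluated, then return it."""
    evaluated.append(label)
    return value

def pick_by_index(a, b):
    # Both tuple items are built before the index is applied.
    return (tracked('b', b), tracked('a', a))[a < b]

def pick_by_conditional(a, b):
    # Only the selected branch is evaluated.
    return tracked('a', a) if a < b else tracked('b', b)

def pick_and_or(cond, a, b):
    # `cond and a or b` falls through to b whenever a is falsy.
    return cond and a or b

def pick_and_or_safe(cond, a, b):
    # One-element lists are always truthy, so the middle operand survives.
    return (cond and [a] or [b])[0]
```

    Calling `pick_by_index(1, 2)` records both operands, while `pick_by_conditional(1, 2)` records only one; `pick_and_or(True, 0, 5)` shows the falsy-operand pitfall that the list-wrapping variant avoids.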

  • A great thing I did once is:



    if 3 > a > 1 < b < 5: foo()


    instead of:



    if a > 1 and b > 1 and 3 > a and 5 > b: foo()


    Python’s comparison operators rock.
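    As a sanity check, the chained form and the spelled-out form can be compared by brute force. This is only a sketch over a small range of integers, and the function names are illustrative:

```python
def chained(a, b):
    # Reads as: 3 > a and a > 1 and 1 < b and b < 5
    return 3 > a > 1 < b < 5

def spelled_out(a, b):
    return a > 1 and b > 1 and 3 > a and 5 > b

# Collect every pair on which the two forms disagree.
mismatches = [(a, b)
              for a in range(-2, 8)
              for b in range(-2, 8)
              if chained(a, b) != spelled_out(a, b)]
```

    An empty `mismatches` list means the two conditions agree on every pair tested.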






    Because almost everything is comparable in Python 2, you can also avoid the and operator this way. For example, if a, b, c and d are integers,



    if a<b and c>d:foo()


    can be shortened by one character to:



    if a<b<[]>c>d:foo()


    This relies on the fact that, in Python 2, every list compares greater than any integer.



    If c and d are lists, this gets even better:



    if a<b<c>d:foo()

    Of course if this were actually golfed it'd be `3>a>1<b<5`

    Love the symmetry. Reminds me of the old Perl golf trick for finding the min of $a and $b: `[$a => $b]->[$b <= $a]` :)

    Note that the second example (no lists) can also be done with `if(a<b)+(c>d):foo()`

    The + should be a `*`. An `or` would be `+`

    Perl 6 allows it too

    @EriktheOutgolfer `foo()if 3>a>1<b<5`

  • If you're using a built-in function repeatedly, it might be more space-efficient to give it a new name, if using different arguments:



    r=range
    for x in r(10):
     for y in r(100):print x,y

    Didn't actually save any bytes, though.

    r=range and the other two r's are 9 characters; using range twice is 10 characters. Not a huge saving in this example but all it would take is one more use of range to see a significant saving.

    @Frank The additional newline is another character.

    Indeed two repetitions is too little to save on a length five function name. You need: length 2: 6 reps, length 3: 4 reps, length 4 or 5: 3 reps, length >=6: 2 reps. AKA (length-1)*(reps-1)>4.

    Note this is applicable to all languages with first-class functions.

    Doesn't work for the `split()` function :/

    @DollarAkshay why ?!?
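    The break-even rule quoted above, (length-1)*(reps-1)>4, can be checked with a few lines. This is a sketch that assumes the alias is a single character and costs one extra newline; the function names are illustrative. The last lines show one plausible reason `split` seems not to work: it is a string method rather than a global name, so it has to be reached through `str`.

```python
def bytes_saved(name_length, uses):
    """Net bytes saved by `r=name` on its own line versus spelling the name out."""
    alias_cost = name_length + 3            # "r=" + name + newline
    return uses * (name_length - 1) - alias_cost

def min_uses(name_length):
    """Smallest number of uses at which the alias starts to pay off."""
    if name_length < 2:
        raise ValueError("a one-character name can never be shortened")
    uses = 1
    while bytes_saved(name_length, uses) <= 0:
        uses += 1
    return uses

# `s=split` would raise NameError; the method must be looked up on str.
s = str.split
words = s("a b c")
```

    With `range` (length 5) used twice, `bytes_saved(5, 2)` comes out as zero, matching the observation that the example in the answer does not actually save anything.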

  • Sometimes your Python code requires you to have 2 levels of indentation. The obvious thing to do is use one and two spaces for each indentation level.


    However, Python 2 considers the tab and space characters to be different indenting levels.


    This means the first indentation level can be one space and the second can be one tab character.


    For example:


    if 1:
     if 1:    # indented with one space
    	pass     # indented with one tab

    Cool, I never thought about this one!

    This fails in Python 3: you can no longer mix spaces and tabs (a bad thing for code golf, but a good thing in all other cases).

    In python 3.4 this seems to work fine.

    @trichoplax, In python 3.4.3 I get `TabError: inconsistent use of tabs and spaces in indentation.`

    For reference, a tab is worth 8 spaces.

    Note that your editor *may* be changing the tab characters into space. This happened to me in VSCode. The trick there is to 1) enable whitespace rendering - `"editor.renderWhitespace": "all"` and 2) stop the editor from replacing it with whitespace - `"editor.insertSpaces": false, "editor.detectIndentation": false`. (VSCode version 1.46.1). If you want these to work only for the current golfing project, you can add a local settings.json file to the current workspace.
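    To see how a given interpreter treats the mixed indentation without fighting an editor, the source can be built as a string with an explicit `\t` and handed to `compile`. This is a sketch assuming it is run under Python 3, where the mixed version is expected to be rejected; the helper name `compiles` is made up here.

```python
def compiles(source):
    """Return True if this interpreter accepts the source text."""
    try:
        compile(source, "<golf>", "exec")
        return True
    except SyntaxError:          # TabError is a subclass of SyntaxError
        return False

mixed = "if 1:\n if 1:\n\tpass\n"         # one space, then one tab
spaces_only = "if 1:\n if 1:\n  pass\n"   # one space, then two spaces

results = {"mixed": compiles(mixed), "spaces_only": compiles(spaces_only)}
```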

  • Use string substitution and exec to deal with long keywords like lambda that are repeated often in your code.



    a=lambda b:lambda c:lambda d:lambda e:lambda f:0   # 48 bytes  (plain)
    exec"a=`b:`c:`d:`e:`f:0".replace('`','lambda ') # 47 bytes (replace)
    exec"a=%sb:%sc:%sd:%se:%sf:0"%(('lambda ',)*5) # 46 bytes (%)


    The target string is very often 'lambda ', which is 7 bytes long. Suppose your code snippet contains n occurrences of 'lambda ', and is s bytes long. Then:




    • The plain option is s bytes long.

    • The replace option is s - 6n + 29 bytes long.

    • The % option is s - 5n + 22 + len(str(n)) bytes long.



    From a plot of bytes saved over plain for these three options, we can see that:




    • For n < 5 lambdas, you're better off not doing anything fancy at all.

    • For n = 5, writing exec"..."%(('lambda ',)*5) saves 2 bytes, and is your best option.

    • For n > 5, writing exec"...".replace('`','lambda ') is your best option.
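    The byte counts above can be reproduced by building the three variants as plain text and measuring them. This sketch only generates the Python 2 sources as strings (it does not execute them), and the function names are illustrative:

```python
def variants(n):
    """Build the three Python 2 sources for n nested lambdas (as text only)."""
    names = "bcdefghijklmnopqrstuvwxyz"[:n]
    plain = "a=" + "".join("lambda " + v + ":" for v in names) + "0"
    body = "".join("`" + v + ":" for v in names)
    replace = 'exec"a=' + body + "0\".replace('`','lambda ')"
    body = "".join("%s" + v + ":" for v in names)
    percent = 'exec"a=' + body + "0\"%(('lambda ',)*" + str(n) + ")"
    return plain, replace, percent

def sizes(n):
    """Byte lengths of the (plain, replace, %) variants."""
    return tuple(len(src) for src in variants(n))

def shortest(n):
    """Labels of the variant(s) with the fewest bytes."""
    labels = ("plain", "replace", "%")
    best = min(sizes(n))
    return [label for label, size in zip(labels, sizes(n)) if size == best]
```

    `sizes(5)` gives the 48, 47 and 46 bytes from the example. At n = 6 the two exec forms come out the same length, and from n = 7 onwards replace is strictly shorter.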



    For other cases, you can index the table below:



              1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19  (occurrences)
           +---------------------------------------------------------
         3 |  -  -  -  -  -  -  -  -  -  -  -  -  -  -  r  r  r  r  r
         4 |  -  -  -  -  -  -  -  -  -  r  r  r  r  r  r  r  r  r  r
         5 |  -  -  -  -  -  -  -  r  r  r  r  r  r  r  r  r  r  r  r
         6 |  -  -  -  -  -  r  r  r  r  r  r  r  r  r  r  r  r  r  r
         7 |  -  -  -  -  %  r  r  r  r  r  r  r  r  r  r  r  r  r  r
         8 |  -  -  -  %  %  r  r  r  r  r  r  r  r  r  r  r  r  r  r
         9 |  -  -  -  %  %  r  r  r  r  r  r  r  r  r  r  r  r  r  r
        10 |  -  -  %  %  %  r  r  r  r  r  r  r  r  r  r  r  r  r  r
        11 |  -  -  %  %  %  r  r  r  r  r  r  r  r  r  r  r  r  r  r
        12 |  -  -  %  %  %  r  r  r  r  r  r  r  r  r  r  r  r  r  r    r = replace
        13 |  -  -  %  %  %  r  r  r  r  r  r  r  r  r  r  r  r  r  r    % = string %
        14 |  -  %  %  %  %  r  r  r  r  r  r  r  r  r  r  r  r  r  r    - = do nothing
        15 |  -  %  %  %  %  r  r  r  r  r  r  r  r  r  r  r  r  r  r
    (length)


    For example, if the string lambda x,y: (length 11) occurs 3 times in your code, you're better off writing exec"..."%(('lambda x,y:',)*3).


    this should get more votes, it's a very useful tip.

    it's *extremely* rare that this works. the cost of `replace` is huge.

    When it does work, though, it helps a lot.

    Interesting, never even thought of this!

    I added a new operator for lambda in my language based off of python: `=>` is just the string `= lambda `. For example, `f=>:0` would be `f = lambda: 0`.

    The same applies to `import` - you can do `for s in ('module1','module2','etc'):exec"from %s import*"%s`

    If you use this often enough for `.replace("R",'.replace("')` to save bytes, many other replacements become cheaper. (It also makes your code entirely unreadable.)

  • Use extended slicing to select one string from many



    >>> for x in 0,1,2:print"fbboaaorz"[x::3]
    ...
    foo
    bar
    baz


    vs



    >>> for x in 0,1,2:print["foo","bar","baz"][x]
    ...
    foo
    bar
    baz


    In the Boolean case, where b selects between just two strings, one can also write



    b*"string"or"other_string"


    for



    ["other_string","string"][b]


    Unlike interleaving, this works for strings of any length, but can have operator precedence issues if b is instead an expression.
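    For reference, a sketch of both tricks in a form that also runs on Python 3. The helper `interleave` is not part of the tip; it is just one way to build the packed string when the words have equal length:

```python
def interleave(words):
    """Interleave equal-length words so that result[i::len(words)] == words[i]."""
    if len(set(map(len, words))) != 1:
        raise ValueError("interleaving needs words of equal length")
    return "".join("".join(chars) for chars in zip(*words))

packed = interleave(["foo", "bar", "baz"])
picked = [packed[x::3] for x in (0, 1, 2)]

def choose(b):
    # A falsy b gives the empty string, so `or` falls through to the second string.
    return b * "string" or "other_string"
```

    Here `packed` reproduces the string used in the answer, and `choose` works for `True`/`False` as well as for 1/0.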


    Note that the first example is exactly the same length as `for x in ("foo","bar","baz"): print x`

    @MateenUlhaq, That's just an example of how the different values of `x` are rendered. The golfed part is the `"fbboaaorz"[x::3]` vs `["foo","bar","baz"][x]` How the `x` value is derived would be another part of your golf solution.

  • Store lookup tables as magic numbers



    Say you want to hardcode a Boolean lookup table, like which of the numbers from zero to twelve contain an n in their English name.



    0: False
    1: True
    2: False
    3: False
    4: False
    5: False
    6: False
    7: True
    8: False
    9: True
    10: True
    11: True
    12: False


    Then, you can implement this lookup table concisely as:



    3714>>n&1


    with the resulting 0 or 1 being equal to False or True.



    The idea is that the magic number stores the table as a bitstring bin(3714) = 0b111010000010, with the n-th digit (counting from the end, starting at 0) corresponding to the nth table entry. We access the nth entry by bitshifting the number n places to the right and taking the last digit with &1.



    This storage method is very efficient. Compare to the alternatives



    n in[1,7,9,10,11]
    '0100000101110'[n]>'0'


    You can have your lookup table store multibit entries that can be extracted like



    340954054>>4*n&15


    to extract the relevant four-bit block.
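    A sketch of how such magic numbers could be generated rather than worked out by hand. The `pack` helper and the four-bit sample table are assumptions made for illustration; only the Boolean table comes from the answer:

```python
def pack(entries, bits=1):
    """Pack small non-negative integers into one number, entry 0 in the lowest bits."""
    magic = 0
    for n, value in enumerate(entries):
        if not 0 <= value < 1 << bits:
            raise ValueError("entry does not fit in the given width")
        magic |= value << (bits * n)
    return magic

# Does the English name of n (zero..twelve) contain the letter "n"?
has_n = [False, True, False, False, False, False, False,
         True, False, True, True, True, False]
magic = pack(has_n)
decoded = [bool(magic >> n & 1) for n in range(len(has_n))]

# Hypothetical table of four-bit values, read back with >>4*n&15.
nibbles = [3, 0, 15, 7]
magic4 = pack(nibbles, bits=4)
decoded4 = [magic4 >> 4 * n & 15 for n in range(len(nibbles))]
```

    Entry n always lives `bits * n` places from the low end, which is why the lookup shifts right by that amount before masking.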


    Could we have an example result for the four-bit block? Did you use a rule for n-bit block?

    Hex might sometimes be even smaller.

    This is useful for a lot of languages.

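    @Joonazan Not quite: the `0x` prefix costs two bytes, so hex can only be shorter than decimal for numbers of at least 10**12; below that it ties or loses.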

    `n in [...]` might be smaller for sparse sets.

  • Use `n` to convert an integer to a string instead of using str(n):


    >>> n=123
    >>> `n`
    '123'

    Note: Only works in Python 2.


    Nice, but doesn't work with Python3.

    Attention: really works for integers, but not for strings, for example.

    btw. `` is short for repr

    Integers smaller than -2**31 or bigger than 2**31-1 (longs) get an 'L' tacked on at the end.

    This can also be used to print floats to full precision
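    Since the backticks are shorthand for repr, the caveat about strings is easy to see with repr itself. A small sketch that also runs on Python 3, where the backtick syntax no longer exists:

```python
n = 123
as_text = repr(n)         # what `n` in backticks gives in Python 2
same_as_str = str(n)      # identical for plain integers

quoted = repr("abc")      # strings come back wrapped in quotes
```

    So the shortcut is a drop-in replacement for str only when the value's repr and str agree, as they do for small integers.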

  • Collapse two numerical loops into one



    Say you're iterating over the cells of an m*n grid. Instead of two nested for loops, one for the rows and one for the columns, it's usually shorter to use a single loop to iterate over the m*n cells of the grid. You can extract the row and column of the cell inside the loop.



    Original code:





    for i in range(m):
     for j in range(n):
      do_stuff(i,j)


    Golfed code:



    for k in range(m*n):
     do_stuff(k/n,k%n)


    In effect, you're iterating over the Cartesian product of the two ranges, encoding the pair (i,j) as k=i*n+j. You've saved a costly range call and a level of indentation inside the loop. The order of iteration is unchanged.



    Use // instead of / in Python 3. If you refer to i and j many times, it may be shorter to assign their values i=k/n, j=k%n inside the loop.
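    A quick sketch confirming that the flattened loop visits the same cells in the same order. It is written with // so that it runs on Python 3; the function names are illustrative, and the three-level version follows the formula given in the comments:

```python
def nested(m, n):
    return [(i, j) for i in range(m) for j in range(n)]

def flattened(m, n):
    # k = i*n + j, so i = k // n and j = k % n
    return [(k // n, k % n) for k in range(m * n)]

def flattened3(m, n, o):
    # Same idea for three levels: k = (i*n + j)*o + l
    return [(k // n // o, k % (n * o) // o, k % o) for k in range(m * n * o)]
```

    For example, `flattened(3, 4)` and `nested(3, 4)` produce identical lists of pairs.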


    This is awesome. I had never realised this was possible!

    I saw this in the tips for JavaScript. It's a pretty useful trick in most languages.

    For reference, to extend this to 3 loops: `for i in range(m*n*o): do_stuff(i/n/o,i%(n*o)/o,i%o)`

    In some cases, `itertools.product` can be much more concise than nested loops, especially when generating cartesian products. `a1, a2, b1, b2` are examples of the cartesian product of `'ab'` and `'12'`

License under CC-BY-SA with attribution


Content dated before 7/24/2021 11:53 AM
