List of first n prime numbers most efficiently and in shortest code

  • Rules are simple:

    • First n primes (not primes below n), should be printed to standard output separated by newlines (primes should be generated within the code)

    • primes cannot be generated by an inbuilt function or through a library, i.e. use of a inbuilt or library function such as, prime = get_nth_prime(n), is_a_prime(number), or factorlist = list_all_factors(number) won't be very creative.

    • Scoring - Say, we define Score = f([number of chars in the code]), O(f(n)) being the complexity of your algorithm where n is the number of primes it finds. So for example, if you have a 300 char code with O(n^2) complexity, score is 300^2 = 90000, for 300 chars with O(n*ln(n)), score becomes 300*5.7 = 1711.13 (let's assume all logs to be natural logs for simplicity)

    • Use any existing programming language, lowest score wins

    Edit: Problem has been changed from finding 'first 1000000 primes' to 'first n primes' because of a confusion about what 'n' in O(f(n)) is, n is the number of primes you find (finding primes is the problem here and so complexity of the problem depends on the number of primes found)

    Note: to clarify some confusions on complexity, if 'n' is the number of primes you find and 'N' is the nth prime found, complexity in terms of n is and N are not equivalent i.e. O(f(n)) != O(f(N)) as , f(N) != constant * f(n) and N != constant * n, because we know that nth prime function is not linear, I though since we were finding 'n' primes complexity should be easily expressible in terms of 'n'.

    As pointed out by Kibbee, you can visit this site to verify your solutions (here, is the old google docs list)

    Please Include these in your solution -

    • what complexity your program has (include basic analysis if not trivial)

    • character length of code

    • the final calculated score

    This is my first CodeGolf Question so, if there is a mistake or loophole in above rules please do point them out.

    Nope I think it is very different this says to find first million primes not primes under million, the two problems are very different.

    also scoring criterion here accounts for how efficiently you do it. I didn't find any ingrained judgement of efficiency there

    My answer for that one was `1[\p:i.78498` my answer for this would be `1[\p:i.1000000`. Even assuming that J's internal prime algorithm is O(n^2) my score would still only be 196.

    Is it a direct function in J that gives primes? if that is so, use of such functions as one that gives you a prime or checks if the number is prime is unfair, and I will modify problem statement to incorporate that

    @Optimus: You could do a md5sum of the primes file and print the handful of bytes, instead of uploading 8M.

    I assume with complexity `f(n)` you mean `n` the number of primes to be calculated. In your question the number is fixed, so every solution would be `O(1)` and thus have the same score, namely `1`. Also basis of the logarithm `log2(n)` in the complexity is not well-specified - but has influence on the proposed scoring.

    @Gareth as I said "prime numbers should be generated within the code" using a direct function really makes it a moot point to ask this in the first place

    @Howard I am a little weak on concepts, I fixed the number to be a million so that I could ask people to write shortest code to give out a specific output, but that shouldn't change the complexity of their algorithms for any other general case

    It is not specified but basis of log is still determinable, say in a binary search analysis determines that the base of log is 2 since the depth of a binary tree is log2(n) (and not ln(n) or log10n), but I do know, log2(n), ln(n) and other bases are obtainable multiplying by a common constant factor), so for this problem we'll just say O(logx(n))==O(log2(n))==O(log10(n)) which is the way it is normally assumed anyway

    @userunknown I think an md5sum would rarely be useful, there are many reasons that would cause the sum to be different for valid answers (one being windows \r\n vs linux \n, another being trailing spaces and endlines, encoding issues and many others)

    Google docs version keeps crashing my browser. Both Firefox and Chrome. Probably best to check out this version.

    @Optimus: If we have one line per prime, we surely have p\n and not \np, so there should be no problem with endlines. I don't see where spaces should occur from - they are out of spec. Encoding issues for decimal numbers? You're kidding! The only argument is MS-LFs, which can be clarified by 2 words like this: "\n EOLs" or healed with 2 md5sums. Another Idea would be a tail of the last 10, but this would simplify the job of finding the upper bound on calculation - depending on the way you do it. Instead of many words: `d21960bf14feb65190e5540c2996d0df` with Unix EOLs. Last ends in `...85863`.

    Nobody seems to manage to calculates their complexity properly. There's confusion about whether `n` is the number of primes or the maximum prime, and everyone ignores the fact that addition of numbers in the range `0..n` is `O(logn)`, and multiplication and division are even more expensive. I suggest that you give some example algorithms along with their correct complexity.

    I was thinking about that right now, should I change the problem statement to say 'find first n primes' instead of a million, That would make thing clearer in terms of complexity

    @ugoren I do plan to manage the scores and complexity myself, any help from anyone else is also welcome. I think after these edits, it should now be less ambiguous.

    The current best known primality test for a k-bit number is `O-tilde(k^6)`. This leads to the implication that anyone who claims a running time better than `O-tilde(n ln n (ln(n ln n))^6)` has misunderstood some part of the problem; and to the question as to how `O-tilde` complexities should be handled in the scoring.

    Nobody has pointed out, that O(n) is equivalent to O(kn) (for constant k) in complexity terms, but not in score terms. For example, suppose my complexity is O(n^10). That is equivalent to O(n^10 * 1E-308), and I may still win the challenge with a huge program with terrible complexity.

    I am voting to close this challenge as "unclear what you're asking", the issue is pointed out by JDL above, and so far there has been an exploit attempt.

  • Python (129 chars, O(n*log log n), score of 203.948)

    I'd say the Sieve of Eratosthenes is the way to go. Very simple and relatively fast.

    for i in x(2,3936):
    if a[i]:
    for j in x(i*i,N,i):a[j]=0
    print [i for i in x(len(a))if a[i]==1][2:]

    Improved code from before.

    Python (191 156 152 chars, O(n*log log n)(?), score of 252.620(?))

    I cannot at all calculate complexity, this is the best approximation I can give.

    from math import log as l
    for i in range(int(n**.5)+1):
    a=filter(lambda x:x%a[i] or x==a[i],a)
    print a[:n]

    n*int(l(n)+l(l(n))) is the top boundary of the nth prime number.

    The complexity (and thus score) calculation is based on the upper bound `n` but not on the number of primes. Thus, I assume the score has to be higher. See my comment above.

    Upper bound `n`? What's that?

    The upper bound here is `N=15485864`. For complexity calculations based on `n=1000000`, you can say `N=n*log(n)` (because of the density of primes).

    If my score needs to be fixed, please fix it for me, I still don't have a good understanding of the scoring system.

    @beary605 would it be ok if I modified the problems to finding first n primes? that would solve a lot of confusion on complexity and what n is in O(f(n))

    Yes, it would be fine.

    @Optimus, the outer loop executes O(n ln n) times. This is equivalent to trial division by primes only, so O((n ln n)^1.5 / ln(n ln n)) divisions, not taking into account the cost of a division. I believe Python has unbounded integers, but I don't know what division algorithm it uses for them.

    @PeterTaylor Can we assume for sanity that each division takes same O(k**1.5) time in every programming language without creating a big exploitable loophole? If so, we could say that, O[(n*ln(n))**1.5 / ln(n*ln(n))] will be the complexity of beary605's solution and hence score becomes 3178.93.. also then, walpen's solution's complexity can be appropriately determined

    @Optimus, not really. That creates a fixed size limit which allows an argument to be made for a O(1) solution with a large constant.

    I have updated my algorithm, what is it now?

    @PeterTaylor As I naively did not expect before floating this problem, the complexity analysis for all of solutions are pretty complicated. I had initially thought that I would personally verify each analysis and score, but my knowledge of computational complexity being very basic, it just seems that someone would then have to verify my verifications. If it's not too much trouble, could you help with this process. (I mean you're already helping a lot, But I don't know what to reply beary605 about his last comment for example.)

    `152*loglog152~=152*3~=450` how did you get 203?

    The upper bound of `n*int(l(n)+l(l(n)))` is incorrect for the general case (e.g. for `n in [6-9,12-18,31-40,…,251762-262020,…]`) as you're taking a rather tight upper bound and lowering it by replacing the log term by its floored value, you'd have to use `int(n*(l(n)+l(l(n))))` or `n*(int(l(n)+l(l(n)))+1)` instead.

  • Haskell, n^1.1 empirical growth rate, 89 chars, score 139 (?)

    The following works at GHCi prompt when the general library that it uses have been previously loaded. Print n-th prime, 1-based:

    let s=3:minus[5,7..](unionAll[[p*p,p*p+2*p..]|p<-s])in getLine>>=(print.((0:2:s)!!).read)

    This is unbounded sieve of Eratosthenes, using a general-use library for ordered lists. Empirical complexity between 100,000 and 200,000 primes O(n^1.1). Fits to O(n*log(n)*log(log n)).

    About complexity estimation

    I measured run time for 100k and 200k primes, then calculated logBase 2 (t2/t1), which produced n^1.09. Defining g n = n*log n*log(log n), calculating logBase 2 (g 200000 / g 100000) gives n^1.12.

    Then, 89**1.1 = 139, although g(89) = 600.       --- (?)

    It seems that for scoring the estimated growth rate should be used instead of complexity function itself. For example, g2 n = n*((log n)**2)*log(log n) is much better than n**1.5, but for 100 chars the two produce score of 3239 and 1000, respectively. This can't be right. Estimation on 200k/100k range gives logBase 2 (g2 200000 / g2 100000) = 1.2 and thus score of 100**1.2 = 251.

    Also, I don't attempt to print out all primes, just the n-th prime instead.

    No imports, 240 chars. n^1.15 empirical growth rate, score 546.

    s n=let s=3:g 5(a[[p*p,p*p+2*p..]|p<-s])in(0:2:s)!!n
    a((x:s):t)=x:u s(a$p t)
    p((x:s):r:t)=(x:u s r):p t
    g k [email protected](x:t)|k<x=k:g(k+2)s|True=g(k+2)t
    u [email protected](x:r)[email protected](y:t)=case(compare x y)of LT->x:u r b;EQ->x:u r t;GT->y:u a t

    the aforementioned library is module `Data.List.Ordered`, from package `data-ordlist`. for "empirical orders of growth" see this.

  • Haskell, 72 89 chars, O(n^2), Score 7921

    Highest score per char count wins right? Modified for first N. Also I apparently can't use a calculator, so my score is not as abysmally bad as I thought. (using the complexity for basic trial division as found in the source below).

    As per Will Ness the below is not a full Haskell program (it actually relies on the REPL). The following is a more complete program with a pseudo-sieve (the imports actually save a char, but I dislike imports in code golf).

    main=getLine>>= \x->print.take(read x).(let s(x:y)=x:s(filter((>0).(`mod`x))y)in s)$[2..]

    This version is undoubtedly (n^2). The algorithm is just a golf version of the naive ``sieve'', as seen here
    Old ghci 1 liner

    getLine>>= \x->print.take(read x)$Data.List.nubBy(\x y->x`mod`y==0)[2..]

    Leaving up the old, cheating answer because the library it links to is pretty nice.


    See here for an implementation and the links for the time complexity. Unfortunately wheels have a log(n) lookup time, slowing us down by a factor.

    •primes cannot be generated by an inbuilt functon or through a library

    @walpen I'am sorry I modified the rules without notification, please make the changes as you see fit

    Wouldn't the complexity be something like O((n ln n)^1.5 ln (n ln n)^0.585)? (Or O((n ln n)^1.5 ln (n ln n)) if Haskell uses naive division rather than, as I've assumed, Karatsuba)

    No, because that gives me a horrendous score :/. But I'm sure you're right. It just looked like trial division, and that's the time complexity of trial division (maybe, according to my poor reading comprehension of a possibly wrong source) so I picked that. For now I'll call my score NaN, that seems safe.

    I'm assuming (my Haskell is negligible, but I know how it would be natural to do it in SML...) that you're only doing trial division by smaller primes, in which case trial division on a P does O(P^0.5 / ln P) divisions. But if P has k bits, a division takes O(k^1.585) (Karatsuba) or O(k^2) (naïve) time, and you need to run through O(n lg n) numbers of length O(ln(n lg n)) bits.

    BTW at 72 chars, if you're on the absolute cutting edge of primality testing, you'd have a score just short of 11 million. The rest of the current answers would be far far worse.

    Oh, so now that makes sense. I tend to pretend that all math takes constant time out of laziness (I program in Haskell after all).

    Hi, I think we can leave out the imports... let's say we use `:m +Module`, how this translates into code? It doesn't. In the mean time I followed your lead and posted something w/out `main=` ;) will add that now to my post.

    I think the whole import thing, and whether an answer can just run in the REPL is a question for meta. Most of the Haskell answers I've seen are self standing, compiled programs, but I do not know if they have to be.

    The following is interesting in that it entirely avoids the use of numbers. So instead it outputs an infinite string of '.' and 'p' characters: `let f='.';o c(x:y)=x:c y;z c(x:y)=f:c y;p n='p':ap fix p(o.n)in f:f:p z`

  • This is so easy even my text editor can do this!

    Vim: 143 keystrokes (115 actions): O(n^2*log(n)): Score: 101485.21


    qpqqdqA^V^Ayyp<Esc>3h"[email protected]<Esc>0C1<Esc>@"ddmpqdmd"bywo^Ra ^Rb 0 0 `[email protected] ^Ra 0 ^[email protected]%@b<Enter> `[email protected] 0 `[email protected]<Esc>0*w*[email protected]"[email protected]@p

    Input: N should be on the first line of a blank document. After this is finished, each prime from 2 to N will be a separate line.

    Running the Commands:

    First, note that any commands with a caret in front of them mean you need to hold Ctrl and type the next letter (e.g. ^V is Ctrl-v and ^R is Ctrl-r).

    This will overwrite anything in your @a, @b, @d, and @p registers.

    Because this uses q commands, it can't just be placed in a macro. However, here are some tips for running it.

    For large N, this is very slow. It took around 27 minutes to run N=5000; consider yourself warned.


    This uses a basic recursive algorithm for finding primes. Given a list of all primes between 1 and A, A+1 is prime if it not divisible by any of the numbers in the list of primes. Start at A = 2 and add primes to the list as they are found. After N recursions, the list will contain all the primes up to N.


    This algorithm has a complexity of O(nN), where N is the input number and n is the number of primes up to N. Each recursion tests n numbers, and N recursions are performed, giving O(nN).

    However, N ~ n*log(n), giving the final complexity as O(n2*log(n))(


    It's not easy to discern the program flow from the vim commands, so I rewrote it in Python following the same flow. Like the Vim code, the python code will error out when it reaches the end. Python doesn't like too much recursion; if you try this code with N > 150 or so, it will reach the max recursion depth

    N = 20
    primes = range(2, N+1)

    # Python needs these defined.
    mark_p = b = a = -1

    # Check new number for factors.
    # This macro could be wrapped up in @d, but it saves space to leave it separate.
    def p():
    global mark_d, mark_p, primes, a
    mark_d = 0
    a = primes[mark_p]

    # Checks factor and determine what to do next
    def d():
    global mark_d, mark_p, a, b, primes
    b = primes[mark_d]
    if(a == b): # Number is prime, check the next number
    mark_p += 1
    if(a%b == 0): # Number is not prime, delete it and check next number
    else: # Number might be prime, try next possible factor
    mark_d += 1

    mark_p = 0 #Start at first number

    Now, to break down the actual keystrokes!

    • qpqqdq Clears out the @d and @p registers. This will ensure nothing runs when setting up these recursive macros.

    • A^V^Ayyp<Esc>3h"[email protected]<Esc>0C1<Esc>@"dd Turns the input into a list of numbers from 2 to N+1. The N+1 entry is deleted as a side effect of setting up the @d macro.

      • Specifically, writes a macro that increments a number, then copies it on the next line, then it writes a 1 and executes this macro N times.

    • mpqdmd"bywo^Ra ^Rb 0 0 `[email protected] ^Ra 0 ^[email protected]%@b<Enter> `[email protected] 0 `[email protected]<Esc>0*w*[email protected] writes the @d macro, which implements the d() function above. "If" statements are interesting to implement in Vim. By using the search operator *, it's possible to choose a certain path to follow. Breaking the command down further we get

    • When this is first run, the `[email protected] command is run, deleting the N+1 line.

    • qpmp"[email protected] writes the @p macro, which saves the number under the cursor, then goes to the first entry and runs @d on that.

    • [email protected] actually executes @p so that it will iterate over the whole file.

  • C#, 447 Characters, Bytes 452, Score ?

    using System;namespace PrimeNumbers{class C{static void GN(ulong n){ulong primes=0;for (ulong i=0;i<(n*3);i++){if(IP(i)==true){primes++;if(primes==n){Console.WriteLine(i);}}}}static bool IP(ulong n){if(n<=3){return n>1;}else if (n%2==0||n%3==0){return false;}for(ulong i=5;i*i<=n;i+=6){if(n%i==0||n%(i+2)==0){return false;}}return true;}static void Main(string[] args){ulong n=Convert.ToUInt64(Console.ReadLine());for(ulong i=0;i<n;i++){GN(i);}}}}

    scriptcs Variant, 381 Characters, 385 Bytes, Score ?

    using System;static void GetN(ulong n){ulong primes=0;for (ulong i=0;i<(n*500);i++){if(IsPrime(i)==true){primes++;if(primes==n){Console.WriteLine(i);}}}}public static bool IsPrime(ulong n){if(n<=3){return n>1;}else if (n%2==0||n%3==0){return false;}for(ulong i=5;i*i<=n;i+=6){if(n%i==0||n%(i+2)==0){return false;}}return true;}ulong n=Convert.ToUInt64(Console.ReadLine());for(ulong i=0;i<n;i++){GetN(i);}

    If you install scriptcs then you can run it.

    P.S. I wrote this in Vim :D

    You can save some characters by removing some unnecessary whitespace. For example, it's not necessary to put whitespace around an `=` and `<` sign. Also, I don't think there's a difference in bytes and characters for this code -- it's 548 characters and 548 bytes.

    Oh thanks, this is my first CodeGolf!

  • Scala 263 characters

    Updated to fit to the new requirements. 25% of the code deal with finding a reasonable upper bound to calculate primes below.

    object P extends App{
    def c(M:Int)={
    val p=collection.mutable.BitSet(M+1)
    (3 to M+1 by 2).map(p(_)=true)
    var j=2*i;
    val i=args(0).toInt

    I got a sieve too.

    Here is an empirical test of the calculation costs, ungolfed for analysis:

    object PrimesTo extends App{
    var cnt=0
    def c(M:Int)={
    val p=(false::false::true::List.range(3,M+1).map(_%2!=0)).toArray
    for (i <- List.range (3, M, 2)
    if (p (i))) {
    var j=2*i;
    while (j < M) {
    if (p (j))
    (1 to M).filter (x => p (x))
    val i = args(0).toInt
    To get the number x with i primes below, it is nearly ln(x)*x. For small numbers
    we need a correction factor 1.13, and to avoid a bigger factor for very small
    numbers we add 666 as an upper bound.
    val x = (math.log(i)*i*1.13).toInt+666
    println (c(x).take (i).mkString("\n"))
    System.err.println (x + "\tcount: " + cnt)
    for n in {1..5} ; do i=$((10**$n)); scala -J-Xmx768M P $i ; done

    leads to following counts:

    List (960, 1766, 15127, 217099, 2988966)

    I'm not sure how to calculate the score. Is it worth to write 5 more characters?

    scala> List(4, 25, 168, 1229, 9592, 78498, 664579, 5761455, 50847534).map(x=>(math.log(x)*x*1.13).toInt+666) 
    res42: List[Int] = List(672, 756, 1638, 10545, 100045, 1000419, 10068909, 101346800, 1019549994)

    scala> List(4, 25, 168, 1229, 9592, 78498, 664579, 5761455, 50847534).map(x=>(math.log(x)*x*1.3)toInt)
    res43: List[Int] = List(7, 104, 1119, 11365, 114329, 1150158, 11582935, 116592898, 1172932855)

    For bigger n it reduces the calculations by about 16% in that range, but afaik for the score formula, we don't consider constant factors?

    new Big-O considerations:

    To find 1 000, 10 000, 100 000 primes and so on, I use a formula about the density of primes x=>(math.log(x)*x*1.3 which determines the outer loop I'm running.

    So for values i in 1 to 6=> NPrimes (10^i) runs 9399, 133768 ... times the outer loop.

    I found this O-function iteratively with help from the comment of Peter Taylor, who suggested a much higher value for exponentiation, instead of 1.01 he suggested 1.5:

    def O(n:Int) = (math.pow((n * math.log (n)), 1.01)).toLong

    O: (n: Int)Long

    val ns = List(10, 100, 1000, 10000, 100000, 1000*1000).map(x=>(math.log(x)*x*1.3)toInt).map(O) 

    ns: List[Long] = List(102, 4152, 91532, 1612894, 25192460, 364664351)

     That's the list of upper values, to find primes below (since my algorithm has to know this value before it has to estimate it), send through the O-function, to find similar quotients for moving from 100 to 1000 to 10000 primes and so on: 

    (ns.head /: ns.tail)((a, b) => {println (b*1.0/a); b})

    This are the quotients, if I use 1.01 as exponent. Here is what the counter finds empirically:

    ns: Array[Int] = Array(1628, 2929, 23583, 321898, 4291625, 54289190, 660847317)

    (ns.head /: ns.tail)((a, b) => {println (b*1.0/a); b})

    The first two values are outliers, because I have make a constant correction for my estimation formular for small values (up to 1000).

    With Peter Taylors suggestion of 1.5 it would look like:


    Now with my value I get to:

    res85: Long = 1576

    But I'm unsure, how close I can come with my O-function to the observed values.

    Sorry I made some changes to the problem statement to reduce some ambiguity related to complexity, (I am sure your solution wouldn't change much)

    This is effectively trial division by primes. The number of times through the inner loop is `O(M^1.5 / ln M)`, and each time through you do `O(ln M)` work (addition), so overall it's `O(M^1.5) = O((n ln n)^1.5)`.

    With ^1.02 instead of ^1.5 `def O(n:Int) = (math.pow((n * math.log (n)), 1.02)).toLong` I get much closer to the values, empirically found with my counter. I insert my findings in my post.

  • GolfScript (45 chars, score claimed ~7708)

    ~[]2{..3${1$\%!}?={[email protected]\+\}{;}if)1$,3$<}do;\;n*

    This does simple trial division by primes. If near the cutting edge of Ruby (i.e. using the arithmetic uses Toom-Cook 3 multiplication, so a trial division is O(n^1.465) and the overall cost of the divisions is O((n ln n)^1.5 ln (n ln n)^0.465) = O(n^1.5 (ln n)^1.965)†. However, in GolfScript adding an element to an array requires copying the array. I've optimised this to copy the list of primes only when it finds a new prime, so only n times in total. Each copy operation is O(n) items of size O(ln(n ln n)) = O(ln n)†, giving O(n^2 ln n).

    And this, boys and girls, is why GolfScript is used for golfing rather than for serious programming.

    O(ln (n ln n)) = O(ln n + ln ln n) = O(ln n). I should have spotted this before commenting on various posts...

  • QBASIC, 98 Chars, Complexity N Sqrt(N), Score 970

    FOR J=2 TO I^.5
    IF K=1e6 THEN GOTO B
    GOTO A

    I've modified the problem statement a bit, its now find first 'n' primes, I am sorry for no notification

    I suppose we can assume "in-source" input for this program; i.e., the input is the numeral right after the `IF K=` (so the program length would not include the numeral). As it stands, the program prints the first n primes not including 2, which can be fixed by adding `?2` at the beginning, and changing `K=...` to `K=...-1`. The program can also be golfed a bit by taking the spaces out of `J=2 TO`, `J=0 THEN`, `K=...-1 THEN`, and by removing the indenting. I believe this results in a 96-character program.

  • Ruby 66 chars, O(n^2) Score - 4356

    lazy is available since Ruby 2.0, and 1.0/0 is a cool trick to get an infinite range:


    You can shave one char off by changing it to `(2..(1.0/0)){|i|!(2...i).any?{|j|i%j<1}}.take(n).to_a`

    Or even: (This makes the solution less efficient, but it does not change the upper O(n²) bound) `(2..(1.0/0)){|i|(2..i).one?{|j|i%j<1}}.take(n).to_a`. This shaves off two more characters.

    Well changing it to `(2..(1.0/0)){|i|!(2...i).any?{|j|i%j<1}}.first(n)` will result in 61 chars.

  • Scala, 124 characters

    object Q extends App{Stream.from(2).filter(p=>(2 to p)takeWhile(i=>i*i<=p)forall{p%_!= 0})take(args(0)toInt)foreach println}

    Simple trial division up to square root. Complexity therefore should be

    124^1.5 < 1381, so that would be my score i guess?

License under CC-BY-SA with attribution

Content dated before 7/24/2021 11:53 AM