Sieve of Eratosthenes

You are encouraged to You are encouraged to solve this task according to the task description, using any language you may know.

This task has been clarified. Its programming examples are in need of review to ensure that they still fit the requirements of the task.



The Sieve of Eratosthenes is a simple algorithm that finds the prime numbers up to a given integer.





Task

Implement the Sieve of Eratosthenes algorithm, with the only allowed optimization that the outer loop can stop at the square root of the limit, and the inner loop may start at the square of the prime just found.

That means especially that you shouldn't optimize by using pre-computed wheels, i.e. don't assume you need only to cross out odd numbers (wheel based on 2), numbers equal to 1 or 5 modulo 6 (wheel based on 2 and 3), or similar wheels based on low primes.

If there's an easy way to add such a wheel based optimization, implement it as an alternative version.





Note

It is important that the sieve algorithm be the actual algorithm used to find prime numbers for the task.





Related tasks







For maximum compatibility, this program uses only the basic instruction set.

* Sieve of Eratosthenes

ERATOST CSECT

USING ERATOST,R12

SAVEAREA B STM-SAVEAREA(R15)

DC 17F'0'

DC CL8'ERATOST'

STM STM R14,R12,12(R13) save calling context

ST R13,4(R15)

ST R15,8(R13)

LR R12,R15 set addessability

* ---- CODE

LA R4,1 I=1

LA R6,1 increment

L R7,N limit

LOOPI BXH R4,R6,ENDLOOPI do I=2 to N

LR R1,R4 R1=I

BCTR R1,0

LA R14,CRIBLE(R1)

CLI 0(R14),X'01'

BNE ENDIF if not CRIBLE(I)

LR R5,R4 J=I

LR R8,R4

LR R9,R7

LOOPJ BXH R5,R8,ENDLOOPJ do J=I*2 to N by I

LR R1,R5 R1=J

BCTR R1,0

LA R14,CRIBLE(R1)

MVI 0(R14),X'00' CRIBLE(J)='0'B

B LOOPJ

ENDLOOPJ EQU *

ENDIF EQU *

B LOOPI

ENDLOOPI EQU *

LA R4,1 I=1

LA R6,1

L R7,N

LOOP BXH R4,R6,ENDLOOP do I=1 to N

LR R1,R4 R1=I

BCTR R1,0

LA R14,CRIBLE(R1)

CLI 0(R14),X'01'

BNE NOTPRIME if not CRIBLE(I)

CVD R4,P P=I

UNPK Z,P Z=P

MVC C,Z C=Z

OI C+L'C-1,X'F0' zap sign

MVC WTOBUF(8),C+8

WTO MF=(E,WTOMSG)

NOTPRIME EQU *

B LOOP

ENDLOOP EQU *

RETURN EQU *

LM R14,R12,12(R13) restore context

XR R15,R15 set return code to 0

BR R14 return to caller

* ---- DATA

I DS F

J DS F

DS 0F

P DS PL8 packed

Z DS ZL16 zoned

C DS CL16 character

WTOMSG DS 0F

DC H'80' length of WTO buffer

DC H'0' must be binary zeroes

WTOBUF DC 80C' '

LTORG

N DC F'100000'

CRIBLE DC 100000X'01'

YREGS

END ERATOST

Output:

00000002 00000003 00000005 00000007 00000011 00000013 00000017 00000019 00000023 00000029 00000031 00000037 00000041 00000043 00000047 00000053 00000059 00000061 00000067 ... 00099767 00099787 00099793 00099809 00099817 00099823 00099829 00099833 00099839 00099859 00099871 00099877 00099881 00099901 00099907 00099923 00099929 00099961 00099971 00099989 00099991

If this subroutine is called with the value of n in the accumulator, it will store an array of the primes less than n beginning at address 1000 hex and return the number of primes it has found in the accumulator.

ERATOS: STA $D0 ; value of n

LDA #$00

LDX #$00

SETUP: STA $1000,X ; populate array

ADC #$01

INX

CPX $D0

BPL SET

JMP SETUP

SET: LDX #$02

SIEVE: LDA $1000,X ; find non-zero

INX

CPX $D0

BPL SIEVED

CMP #$00

BEQ SIEVE

STA $D1 ; current prime

MARK: CLC

ADC $D1

TAY

LDA #$00

STA $1000,Y

TYA

CMP $D0

BPL SIEVE

JMP MARK

SIEVED: LDX #$01

LDY #$00

COPY: INX

CPX $D0

BPL COPIED

LDA $1000,X

CMP #$00

BEQ COPY

STA $2000,Y

INY

JMP COPY

COPIED: TYA ; how many found

RTS

Algorithm somewhat optimized: array omits 1, 2, all higher odd numbers. Optimized for storage: uses bit array for prime/composite flags.

Some of the macro code is derived from the examples included with EASy68K. See 68000 "100 Doors" listing for additional information.

*-----------------------------------------------------------

* Title : BitSieve

* Written by : G. A. Tippery

* Date : 2014 - Feb - 24 , 2013 - Dec - 22

* Description: Prime number sieve

*-----------------------------------------------------------

ORG $1000



** ---- Generic macros ---- **

PUSH MACRO

MOVE . L \ 1 , - ( SP )

ENDM



POP MACRO

MOVE . L ( SP ) + ,\ 1

ENDM



DROP MACRO

ADDQ # 4 , SP

ENDM



PUTS MACRO

** Print a null - terminated string w / o CRLF **

** Usage: PUTS stringaddress

** Returns with D0 , A1 modified

MOVEQ # 14 , D0 ; task number 14 (display null string)

LEA \ 1 , A1 ; address of string

TRAP # 15 ; display it

ENDM



GETN MACRO

MOVEQ # 4 , D0 ; Read a number from the keyboard into D1.L.

TRAP # 15

ENDM



** ---- Application - specific macros ---- **



val MACRO ; Used by bit sieve. Converts bit address to the number it represents.

ADD . L \ 1 ,\ 1 ; double it because odd numbers are omitted

ADDQ # 3 ,\ 1 ; add offset because initial primes (1, 2) are omitted

ENDM



* ** ================================================================================ **

* ** Integer square root routine, bisection method **

* ** IN: D0 , should be 0 < D0 < $10000 ( 65536 ) -- higher values MAY work, no guarantee

* ** OUT: D1

*

SquareRoot:

*

MOVEM . L D2 - D4 , - ( SP ) ; save registers needed for local variables

* DO == n

* D1 == a

* D2 == b

* D3 == guess

* D4 == temp

*

* a = 1 ;

* b = n ;

MOVEQ # 1 , D1

MOVE . L D0 , D2

* do {

REPEAT

* guess = ( a + b ) / 2 ;

MOVE . L D1 , D3

ADD . L D2 , D3

LSR . L # 1 , D3

* if ( guess * guess > n ) { // inverse function of sqrt is square

MOVE . L D3 , D4

MULU D4 , D4 ; guess^2

CMP . L D0 , D4

BLS .else

* b = guess ;

MOVE . L D3 , D2

BRA .endif

* } else {

.else:

* a = guess ;

MOVE . L D3 , D1

* } // if

.endif:

* } while ( ( b - a ) > 1 ) ; ; Same as until (b-a)<=1 or until (a-b)>=1

MOVE . L D2 , D4

SUB . L D1 , D4 ; b-a

UNTIL. L D4 < LE > # 1 DO. S

* return ( a ) ; Result is in D1

* } // LongSqrt ( )

MOVEM . L ( SP ) + ,D2 - D4 ; restore saved registers

RTS

*

* ** ================================================================================ **





** ======================================================================= **

*

** Prime - number Sieve of Eratosthenes routine using a big bit field for flags **

* Enter with D0 = size of sieve ( bit array )

* Prints found primes 10 per line

* Returns # prime found in D6

*

* Register usage:

*

* D0 == n

* D1 == prime

* D2 == sqroot

* D3 == PIndex

* D4 == CIndex

* D5 == MaxIndex

* D6 == PCount

*

* A0 == PMtx [ 0 ]

*

* On return, all registers above except D0 are modified. Could add MOVEMs to save and restore D2 - D6 / A0 .

*



** ------------------------ **



GetBit: ** sub - part of Sieve subroutine **

** Entry: bit # is on TOS

** Exit: A6 holds the byte number, D7 holds the bit number within the byte

** Note: Input param is still on TOS after return. Could have passed via a register, but

** wanted to practice with stack. : )

*

MOVE . L ( 4 , SP ) , D7 ; get value from (pre-call) TOS

ASR . L # 3 , D7 ; /8

MOVEA D7 , A6 ; byte #

MOVE . L ( 4 , SP ) , D7 ; get value from (pre-call) TOS

AND . L # $7 , D7 ; bit #

RTS



** ------------------------ **



Sieve:

MOVE D0 , D5

SUBQ # 1 , D5

JSR SquareRoot ; sqrt D0 => D1

MOVE . L D1 , D2

LEA PArray, A0

CLR . L D3

*

PrimeLoop:

MOVE . L D3 , D1

val D1

MOVE . L D3 , D4

ADD . L D1 , D4

*

CxLoop: ; Goes through array marking multiples of d1 as composite numbers

CMP . L D5 , D4

BHI ExitCx

PUSH D4 ; set D7 as bit # and A6 as byte pointer for D4'th bit of array

JSR GetBit

DROP

BSET D7 , 0 ( A0 , A6 . L ) ; set bit to mark as composite number

ADD . L D1 , D4 ; next number to mark

BRA CxLoop

ExitCx:

CLR . L D1 ; Clear new-prime-found flag

ADDQ # 1 , D3 ; Start just past last prime found

PxLoop: ; Searches for next unmarked (not composite) number

CMP . L D2 , D3 ; no point searching past where first unmarked multiple would be past end of array

BHI ExitPx ; if past end of array

TST . L D1

BNE ExitPx ; if flag set, new prime found

PUSH D3 ; check D3'th bit flag

JSR GetBit ; sets D7 as bit # and A6 as byte pointer

DROP ; drop TOS

BTST D7 , 0 ( A0 , A6 . L ) ; read bit flag

BNE IsSet ; If already tagged as composite

MOVEQ # - 1 , D1 ; Set flag that we've found a new prime

IsSet:

ADDQ # 1 , D3 ; next PIndex

BRA PxLoop

ExitPx:

SUBQ # 1 , D3 ; back up PIndex

TST . L D1 ; Did we find a new prime #?

BNE PrimeLoop ; If another prime # found, go process it

*

; fall through to print routine



** ------------------------ **



* Print primes found

*

* D4 == Column count

*

* Print header and assumed primes ( # 1 , # 2 )

PUTS Header ; Print string @ Header, no CR/LF

MOVEQ # 2 , D6 ; Start counter at 2 because #1 and #2 are assumed primes

MOVEQ # 2 , D4

*

MOVEQ # 0 , D3

PrintLoop:

CMP . L D5 , D3

BHS ExitPL

PUSH D3

JSR GetBit ; sets D7 as bit # and A6 as byte pointer

DROP ; drop TOS

BTST D7 , 0 ( A0 , A6 . L )

BNE NotPrime

* printf ( " %6d" , val ( PIndex )

MOVE . L D3 , D1

val D1

AND . L # $0000FFFF , D1

MOVEQ # 6 , D2

MOVEQ # 20 , D0 ; display signed RJ

TRAP # 15

ADDQ # 1 , D4

ADDQ # 1 , D6

* *** Display formatting ***

* if ( ( PCount % 10 ) == 0 ) printf ( "

" ) ;

CMP # 10 , D4

BLO NoLF

PUTS CRLF

MOVEQ # 0 , D4

NoLF:

NotPrime:

ADDQ # 1 , D3

BRA PrintLoop

ExitPL:

RTS



** ======================================================================= **



N EQU 5000 ; *** Size of boolean (bit) array ***

SizeInBytes EQU ( N + 7 ) / 8

*

START: ; first instruction of program

MOVE . L #N, D0 ; # to test

JSR Sieve

* printf ( "

%d prime numbers found.

" , D6 ) ; ***

PUTS Summary1, A1

MOVE # 3 , D0 ; Display signed number in D1.L in decimal in smallest field.

MOVE . W D6 , D1

TRAP # 15

PUTS Summary2, A1



SIMHALT ; halt simulator



** ======================================================================= **



* Variables and constants here



ORG $2000

CR EQU 13

LF EQU 10

CRLF DC.B CR,LF, $00



PArray: DCB.B SizeInBytes, 0



Header: DC.B CR,LF,LF, ' Primes' ,CR,LF, ' ======' ,CR,LF

DC.B ' 1 2' , $00



Summary1: DC.B CR,LF, ' ' , $00

Summary2: DC.B ' prime numbers found.' ,CR,LF, $00



END START ; last line of source

Works with: as version Raspberry Pi 3B version Buster 64 bits



/* ARM assembly AARCH64 Raspberry PI 3B */

/* program cribleEras64.s */



/*******************************************/

/* Constantes file */

/*******************************************/

/* for this file see task include a file in language AArch64 assembly */

.include "../includeConstantesARM64.inc"



.equ MAXI, 100



/*********************************/

/* Initialized data */

/*********************************/

.data

sMessResult: .asciz "Prime : @

"

szCarriageReturn: .asciz "

"



/*********************************/

/* UnInitialized data */

/*********************************/

.bss

sZoneConv: .skip 24

TablePrime: .skip 8 * MAXI

/*********************************/

/* code section */

/*********************************/

.text

.global main

main: // entry of program

ldr x4,qAdrTablePrime // address prime table

mov x0,#2 // prime 2

bl displayPrime

mov x1,#2

mov x2,#1

1: // loop for multiple of 2

str x2,[x4,x1,lsl #3] // mark multiple of 2

add x1,x1,#2

cmp x1,#MAXI // end ?

ble 1b // no loop

mov x1,#3 // begin indice

mov x3,#1

2:

ldr x2,[x4,x1,lsl #3] // load table élément

cmp x2,#1 // is prime ?

beq 4f

mov x0,x1 // yes -> display

bl displayPrime

mov x2,x1

3: // and loop to mark multiples of this prime

str x3,[x4,x2,lsl #3]

add x2,x2,x1 // add the prime

cmp x2,#MAXI // end ?

ble 3b // no -> loop

4:

add x1,x1,2 // other prime in table

cmp x1,MAXI // end table ?

ble 2b // no -> loop



100: // standard end of the program

mov x0,0 // return code

mov x8,EXIT // request to exit program

svc 0 // perform the system call

qAdrszCarriageReturn: .quad szCarriageReturn

qAdrsMessResult: .quad sMessResult

qAdrTablePrime: .quad TablePrime



/******************************************************************/

/* Display prime table elements */

/******************************************************************/

/* x0 contains the prime */

displayPrime:

stp x1,lr,[sp,-16]! // save registers

ldr x1,qAdrsZoneConv

bl conversion10 // call décimal conversion

ldr x0,qAdrsMessResult

ldr x1,qAdrsZoneConv // insert conversion in message

bl strInsertAtCharInc

bl affichageMess // display message

100:

ldp x1,lr,[sp],16 // restaur 2 registers

ret // return to address lr x30

qAdrsZoneConv: .quad sZoneConv



/********************************************************/

/* File Include fonctions */

/********************************************************/

/* for this file see task include a file in language AArch64 assembly */

.include "../includeARM64.inc"



Prime : 2 Prime : 3 Prime : 5 Prime : 7 Prime : 11 Prime : 13 Prime : 17 Prime : 19 Prime : 23 Prime : 29 Prime : 31 Prime : 37 Prime : 41 Prime : 43 Prime : 47 Prime : 53 Prime : 59 Prime : 61 Prime : 67 Prime : 71 Prime : 73 Prime : 79 Prime : 83 Prime : 89 Prime : 97



PARAMETERS : p_limit TYPE i OBLIGATORY DEFAULT 100 .



AT SELECTION-SCREEN ON p_limit .

IF p_limit LE 1 .

MESSAGE 'Limit must be higher then 1 . ' TYPE 'E' .

ENDIF .



START-OF-SELECTION .

FIELD-SYMBOLS : < fs_prime > TYPE flag .

DATA : gt_prime TYPE TABLE OF flag ,

gv_prime TYPE flag ,

gv_i TYPE i ,

gv_j TYPE i .



DO p_limit TIMES .

IF sy-index > 1 .

gv_prime = abap_true .

ELSE .

gv_prime = abap_false .

ENDIF .



APPEND gv_prime TO gt_prime .

ENDDO .



gv_i = 2 .

WHILE ( gv_i <= trunc ( sqrt ( p_limit ) ) ) .

IF ( gt_prime [ gv_i ] EQ abap_true ) .

gv_j = gv_i ** 2 .

WHILE ( gv_j <= p_limit ) .

gt_prime [ gv_j ] = abap_false .

gv_j = ( gv_i ** 2 ) + ( sy-index * gv_i ) .

ENDWHILE .

ENDIF .

gv_i = gv_i + 1 .

ENDWHILE .



LOOP AT gt_prime INTO gv_prime .

IF gv_prime = abap_true .

WRITE : / sy-tabix .

ENDIF .

ENDLOOP .



( defun nats-to-from ( n i )

( declare ( xargs : measure ( nfix ( - n i ) ) ) )

( if ( zp ( - n i ) )

nil

( cons i ( nats-to-from n ( + i 1 ) ) ) ) )



( defun remove-multiples-up-to-r ( factor limit xs i )

( declare ( xargs : measure ( nfix ( - limit i ) ) ) )

( if ( or ( > i limit )

( zp ( - limit i ) )

( zp factor ) )

xs

( remove-multiples-up-to-r

factor

limit

( remove i xs )

( + i factor ) ) ) )



( defun remove-multiples-up-to ( factor limit xs )

( remove-multiples-up-to-r factor limit xs ( * factor 2 ) ) )



( defun sieve-r ( factor limit )

( declare ( xargs : measure ( nfix ( - limit factor ) ) ) )

( if ( zp ( - limit factor ) )

( nats-to-from limit 2 )

( remove-multiples-up-to factor ( + limit 1 )

( sieve-r ( 1 + factor ) limit ) ) ) )



( defun sieve ( limit )

( sieve-r 2 limit ) )

Works with ActionScript 3.0 (this is utilizing the actions panel, not a separated class file)

function eratosthenes ( limit: int ) : Array

{

var primes: Array = new Array ( ) ;

if ( limit > = 2 ) {

var sqrtlmt: int = int ( Math . sqrt ( limit ) - 2 ) ;

var nums: Array = new Array ( ) ; // start with an empty Array...

for ( var i: int = 2 ; i < = limit; i++ ) // and

nums. push ( i ) ; // only initialize the Array once...

for ( var j: int = 0 ; j < = sqrtlmt; j++ ) {

var p: int = nums [ j ]

if ( p )

for ( var t: int = p * p - 2 ; t < nums. length ; t += p )

nums [ t ] = 0 ;

}

for ( var m: int = 0 ; m < nums. length ; m++ ) {

var r: int = nums [ m ] ;

if ( r )

primes. push ( r ) ;

}

}

return primes;

}

var e : Array = eratosthenes ( 1000 ) ;

trace ( e ) ;

Output:

Output:

2,3,5,7,11,13,17,19,23,29,31,37,41,43,47,53,59,61,67,71,73,79,83,89,97,101,103,107,109,113,127,131,137,139,149,151,157,163,167,173,179,181,191,193,197,199,211,223,227,229,233,239,241,251,257,263,269,271,277,281,283,293,307,311,313,317,331,337,347,349,353,359,367,373,379,383,389,397,401,409,419,421,431,433,439,443,449,457,461,463,467,479,487,491,499,503,509,521,523,541,547,557,563,569,571,577,587,593,599,601,607,613,617,619,631,641,643,647,653,659,661,673,677,683,691,701,709,719,727,733,739,743,751,757,761,769,773,787,797,809,811,821,823,827,829,839,853,857,859,863,877,881,883,887,907,911,919,929,937,941,947,953,967,971,977,983,991,997

with Ada. Text_IO , Ada. Command_Line ;



procedure Eratos is



Last: Positive := Positive'Value ( Ada. Command_Line . Argument ( 1 ) ) ;

Prime: array ( 1 .. Last ) of Boolean := ( 1 => False, others => True ) ;

Base: Positive := 2 ;

Cnt: Positive;

begin

loop

exit when Base * Base > Last;

if Prime ( Base ) then

Cnt := Base + Base;

loop

exit when Cnt > Last;

Prime ( Cnt ) := False;

Cnt := Cnt + Base;

end loop ;

end if ;

Base := Base + 1 ;

end loop ;

Ada. Text_IO . Put ( "Primes less or equal" & Positive'Image ( Last ) & " are:" ) ;

for Number in Prime' Range loop

if Prime ( Number ) then

Ada. Text_IO . Put ( Positive'Image ( Number ) ) ;

end if ;

end loop ;

end Eratos;

Output:

> ./eratos 31 Primes less or equal 31 are : 2 3 5 7 11 13 17 19 23 29 31



-- imports

open import Data.Nat as ℕ using (ℕ; suc; zero; _+_; _∸_)

open import Data.Vec as Vec using (Vec; _∷_; []; tabulate; foldr)

open import Data.Fin as Fin using (Fin; suc; zero)

open import Function using (_∘_; const; id)

open import Data.List as List using (List; _∷_; [])

open import Data.Maybe using (Maybe; just; nothing)



-- Without square cutoff optimization

module Simple where

primes : ∀ n → List (Fin n)

primes zero = []

primes (suc zero) = []

primes (suc (suc zero)) = []

primes (suc (suc (suc m))) = sieve (tabulate (just ∘ suc))

where

sieve : ∀ {n} → Vec (Maybe (Fin (2 + m))) n → List (Fin (3 + m))

sieve [] = []

sieve (nothing ∷ xs) = sieve xs

sieve (just x ∷ xs) = suc x ∷ sieve (foldr B remove (const []) xs x)

where

B = λ n → ∀ {i} → Fin i → Vec (Maybe (Fin (2 + m))) n



remove : ∀ {n} → Maybe (Fin (2 + m)) → B n → B (suc n)

remove _ ys zero = nothing ∷ ys x

remove y ys (suc z) = y ∷ ys z



-- With square cutoff optimization

module SquareOpt where

primes : ∀ n → List (Fin n)

primes zero = []

primes (suc zero) = []

primes (suc (suc zero)) = []

primes (suc (suc (suc m))) = sieve 1 m (Vec.tabulate (just ∘ Fin.suc ∘ Fin.suc))

where

sieve : ∀ {n} → ℕ → ℕ → Vec (Maybe (Fin (3 + m))) n → List (Fin (3 + m))

sieve _ zero = List.mapMaybe id ∘ Vec.toList

sieve _ (suc _) [] = []

sieve i (suc l) (nothing ∷ xs) = sieve (suc i) (l ∸ i ∸ i) xs

sieve i (suc l) (just x ∷ xs) = x ∷ sieve (suc i) (l ∸ i ∸ i) (Vec.foldr B remove (const []) xs i)

where

B = λ n → ℕ → Vec (Maybe (Fin (3 + m))) n



remove : ∀ {i} → Maybe (Fin (3 + m)) → B i → B (suc i)

remove _ ys zero = nothing ∷ ys i

remove y ys (suc j) = y ∷ ys j



Tested with Agena 2.9.5 Win32

# Sieve of Eratosthenes



# generate and return a sequence containing the primes up to sieveSize

sieve := proc( sieveSize :: number ) :: sequence is

local sieve, result;



result := seq(); # sequence of primes - initially empty

create register sieve( sieveSize ); # "vector" to be sieved



sieve[ 1 ] := false;

for sPos from 2 to sieveSize do sieve[ sPos ] := true od;



# sieve the primes

for sPos from 2 to entier( sqrt( sieveSize ) ) do

if sieve[ sPos ] then

for p from sPos * sPos to sieveSize by sPos do

sieve[ p ] := false

od

fi

od;



# construct the sequence of primes

for sPos from 1 to sieveSize do

if sieve[ sPos ] then insert sPos into result fi

od



return result

end; # sieve





# test the sieve proc

for i in sieve( 100 ) do write( " ", i ) od; print();

Output:

2 3 5 7 11 13 17 19 23 29 31 37 41 43 47 53 59 61 67 71 73 79 83 89 97

Based on the 1962 Revised Repport:

comment Sieve of Eratosthenes; begin integer array t[0:1000]; integer i,j,k; for i:=0 step 1 until 1000 do t[i]:=1; t[0]:=0; t[1]:=0; i:=0; for i:=i while i<1000 do begin for i:=i while i<1000 and t[i]=0 do i:=i+1; if i<1000 then begin j:=2; k:=j*i; for k:=k while k<1000 do begin t[k]:=0; j:=j+1; k:=j*i end; i:=i+1 end end; for i:=0 step 1 until 999 do if t[i]≠0 then print(i,ꞌ is primeꞌ) end

An 1964 Implementation:

Works with: ALGOL 60 for OS/360

'BEGIN'

'INTEGER' 'ARRAY' CANDIDATES(/0..1000/);

'INTEGER' I,J,K;

'COMMENT' SET LINE-LENGTH=120,SET LINES-PER-PAGE=62,OPEN;

SYSACT(1,6,120); SYSACT(1,8,62); SYSACT(1,12,1);

'FOR' I := 0 'STEP' 1 'UNTIL' 1000 'DO'

'BEGIN'

CANDIDATES(/I/) := 1;

'END';

CANDIDATES(/0/) := 0;

CANDIDATES(/1/) := 0;

I := 0;

'FOR' I := I 'WHILE' I 'LESS' 1000 'DO'

'BEGIN'

'FOR' I := I 'WHILE' I 'LESS' 1000

'AND' CANDIDATES(/I/) 'EQUAL' 0 'DO'

I := I+1;

'IF' I 'LESS' 1000 'THEN'

'BEGIN'

J := 2;

K := J*I;

'FOR' K := K 'WHILE' K 'LESS' 1000 'DO'

'BEGIN'

CANDIDATES(/K/) := 0;

J := J + 1;

K := J*I;

'END';

I := I+1;

'END'

'END';

'FOR' I := 0 'STEP' 1 'UNTIL' 999 'DO'

'IF' CANDIDATES(/I/) 'NOTEQUAL' 0 'THEN'

'BEGIN'

OUTINTEGER(1,I);

OUTSTRING(1,'(' IS PRIME')');

'COMMENT' NEW LINE;

SYSACT(1,14,1)

'END'

'END'

'END'

BOOL prime = TRUE , non prime = FALSE ;

PROC eratosthenes = ( INT n ) [ ] BOOL :

(

[ n ] BOOL sieve ;

FOR i TO UPB sieve DO sieve [ i ] := prime OD ;

INT m = ENTIER sqrt ( n ) ;

sieve [ 1 ] := non prime ;

FOR i FROM 2 TO m DO

IF sieve [ i ] = prime THEN

FOR j FROM i * i BY i TO n DO

sieve [ j ] := non prime

OD

FI

OD ;

sieve

) ;



print ( ( eratosthenes ( 80 ) , new line ) )

Output:

FTTFTFTFFFTFTFFFTFTFFFTFFFFFTFTFFFFFTFFFTFTFFFTFFFFFTFFFFFTFTFFFFFTFFFTFTFFFFFTF

Standard, non-optimised sieve [ edit ]

begin



% implements the sieve of Eratosthenes %

% s(i) is set to true if i is prime, false otherwise %

% algol W doesn't have a upb operator, so we pass the size of the %

% array in n %

procedure sieve( logical array s ( * ); integer value n ) ;

begin



% start with everything flagged as prime %

for i := 1 until n do s( i ) := true;



% sieve out the non-primes %

s( 1 ) := false;

for i := 2 until truncate( sqrt( n ) )

do begin

if s( i )

then begin

for p := i * i step i until n do s( p ) := false

end if_s_i

end for_i ;



end sieve ;



% test the sieve procedure %



integer sieveMax;



sieveMax := 100;

begin



logical array s ( 1 :: sieveMax );



i_w := 2; % set output field width %

s_w := 1; % and output separator width %



% find and display the primes %

sieve( s, sieveMax );

for i := 1 until sieveMax do if s( i ) then writeon( i );



end



end.

Output:

2 3 5 7 11 13 17 19 23 29 31 37 41 43 47 53 59 61 67 71 73 79 83 89 97

Odd numbers only version [ edit ]

Alternative version that only stores odd numbers greater than 1 in the sieve.

begin

% implements the sieve of Eratosthenes %

% only odd numbers appear in the sieve, which starts at 3 %

% s( i ) is set to true if ( i * 2 ) + 1 is prime %

procedure sieve2( logical array s ( * ); integer value n ) ;

begin

% start with everything flagged as prime %

for i := 1 until n do s( i ) := true;

% sieve out the non-primes %

% the subscripts of s are 1 2 3 4 5 6 7 8 9 10 11 12 13... %

% which correspond to 3 5 7 9 11 13 15 17 19 21 23 25 27... %

for i := 1 until truncate( sqrt( n ) ) do begin

if s( i ) then begin

integer ip;

ip := ( i * 2 ) + 1;

for p := i + ip step ip until n do s( p ) := false

end if_s_i

end for_i ;

end sieve2 ;

% test the sieve2 procedure %

integer primeMax, arrayMax;

primeMax := 100;

arrayMax := ( primeMax div 2 ) - 1;

begin

logical array s ( 1 :: arrayMax);

i_w := 2; % set output field width %

s_w := 1; % and output separator width %

% find and display the primes %

sieve2( s, arrayMax );

write( 2 );

for i := 1 until arrayMax do if s( i ) then writeon( ( i * 2 ) + 1 );

end

end.

Output:

Same as the standard version.



BEGIN



COMMENT

FIND PRIMES UP TO THE SPECIFIED LIMIT (HERE 1,000) USING

CLASSIC SIEVE OF ERATOSTHENES;



% CALCULATE INTEGER SQUARE ROOT %

INTEGER FUNCTION ISQRT(N);

INTEGER N;

BEGIN

INTEGER R1, R2;

R1 := N;

R2 := 1;

WHILE R1 > R2 DO

BEGIN

R1 := (R1+R2) / 2;

R2 := N / R1;

END;

ISQRT := R1;

END;



INTEGER LIMIT, I, J, FALSE, TRUE, COL, COUNT;

INTEGER ARRAY FLAGS[1:1000];



LIMIT := 1000;

FALSE := 0;

TRUE := 1;



WRITE("FINDING PRIMES FROM 2 TO",LIMIT);



% INITIALIZE TABLE %

WRITE("INITIALIZING ... ");

FOR I := 1 STEP 1 UNTIL LIMIT DO

FLAGS[I] := TRUE;



% SIEVE FOR PRIMES %

WRITEON("SIEVING ... ");

FOR I := 2 STEP 1 UNTIL ISQRT(LIMIT) DO

BEGIN

IF FLAGS[I] = TRUE THEN

FOR J := (I * I) STEP I UNTIL LIMIT DO

FLAGS[J] := FALSE;

END;



% WRITE OUT THE PRIMES TEN PER LINE %

WRITEON("PRINTING");

COUNT := 0;

COL := 1;

WRITE("");

FOR I := 2 STEP 1 UNTIL LIMIT DO

BEGIN

IF FLAGS[I] = TRUE THEN

BEGIN

WRITEON(I);

COUNT := COUNT + 1;

COL := COL + 1;

IF COL > 10 THEN

BEGIN

WRITE("");

COL := 1;

END;

END;

END;



WRITE("");

WRITE(COUNT, " PRIMES WERE FOUND.");



END



Output:

FINDING PRIMES FROM 2 TO 1000 INTIALIZING ... SIEVING ... PRINTING 2 3 5 7 11 13 17 19 23 29 31 37 41 43 47 53 59 61 67 71 . . . 877 881 883 887 907 911 919 929 937 941 947 953 967 971 977 983 991 997 168 PRIMES WERE FOUND.

All these versions requires ⎕io←0 (index origin 0).

It would have been better to require a result of the boolean mask rather than the actual list of primes. The list of primes obtains readily from the mask by application of a simple function (here {⍵/⍳⍴⍵} ). Other related computations (such as the number of primes < n) obtain readily from the mask, easier than producing the list of primes.

Non-Optimized Version [ edit ]

sieve2←{

b←⍵⍴1

b[⍳2⌊⍵]←0

2≥⍵:b

p←{⍵/⍳⍴⍵}∇⌈⍵*0.5

m←1+⌊(⍵-1+p×p)÷p

b ⊣ p {b[⍺×⍺+⍳⍵]←0}¨ m

}



primes2←{⍵/⍳⍴⍵}∘sieve2

The required list of prime divisors obtains by recursion ( {⍵/⍳⍴⍵}∇⌈⍵*0.5 ).

Optimized Version [ edit ]

sieve←{

b←⍵⍴{∧⌿↑(×/⍵)⍴¨~⍵↑¨1}2 3 5

b[⍳6⌊⍵]←(6⌊⍵)⍴0 0 1 1 0 1

49≥⍵:b

p←3↓{⍵/⍳⍴⍵}∇⌈⍵*0.5

m←1+⌊(⍵-1+p×p)÷2×p

b ⊣ p {b[⍺×⍺+2×⍳⍵]←0}¨ m

}



primes←{⍵/⍳⍴⍵}∘sieve

The optimizations are as follows:

Multiples of 2 3 5 are marked by initializing b with ⍵⍴{∧⌿↑(×/⍵)⍴¨~⍵↑¨1}2 3 5 rather than with ⍵⍴1 .

are marked by initializing with rather than with . Subsequently, only odd multiples of primes > 5 are marked.

Multiples of a prime to be marked start at its square.

Examples [ edit ]

primes 100

2 3 5 7 11 13 17 19 23 29 31 37 41 43 47 53 59 61 67 71 73 79 83 89 97



primes¨ ⍳14

┌┬┬┬─┬───┬───┬─────┬─────┬───────┬───────┬───────┬───────┬──────────┬──────────┐

││││2│2 3│2 3│2 3 5│2 3 5│2 3 5 7│2 3 5 7│2 3 5 7│2 3 5 7│2 3 5 7 11│2 3 5 7 11│

└┴┴┴─┴───┴───┴─────┴─────┴───────┴───────┴───────┴───────┴──────────┴──────────┘



sieve 13

0 0 1 1 0 1 0 1 0 0 0 1 0



+/∘sieve¨ 10*⍳10

0 4 25 168 1229 9592 78498 664579 5761455 50847534

The last expression computes the number of primes < 1e0 1e1 ... 1e9. The last number 50847534 can perhaps be called the anti-Bertelsen's number (http://mathworld.wolfram.com/BertelsensNumber.html).

on sieveOfEratosthenes ( limit ) -- Return all the prime numbers from 2 to limit.

-- A local "owner" for the potentially long number list. Speeds up references to its items and properties.

script o

-- Initialise the list with a 'missing value' to represent an "already crossed out 1".

property numberList : { missing value }

end script



-- Complete the list with the numbers from 2 up to and including the specified limit.

-- AppleScript indices are 1-based, so each number's position in the list is the same as its value.

repeat with n from 2 to limit

set end of o ' s numberList to n

end repeat



-- Work through the positions up to the square root of the limit.

-- Where a position's contents haven't been "crossed out" themselves, cross out the contents of all positions that are higher multiples of it, starting at its square.

repeat with position from 2 to ( limit ^ 0.5 ) div 1

if ( item position of o ' s numberList is not missing value ) then

repeat with multiple from position * position to limit by position

set item multiple of o ' s numberList to missing value

end repeat

end if

end repeat



-- The surviving numbers are the primes.

return o ' s numberList ' s numbers

end sieveOfEratosthenes



sieveOfEratosthenes ( 1000 )

Output:

{2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97, 101, 103, 107, 109, 113, 127, 131, 137, 139, 149, 151, 157, 163, 167, 173, 179, 181, 191, 193, 197, 199, 211, 223, 227, 229, 233, 239, 241, 251, 257, 263, 269, 271, 277, 281, 283, 293, 307, 311, 313, 317, 331, 337, 347, 349, 353, 359, 367, 373, 379, 383, 389, 397, 401, 409, 419, 421, 431, 433, 439, 443, 449, 457, 461, 463, 467, 479, 487, 491, 499, 503, 509, 521, 523, 541, 547, 557, 563, 569, 571, 577, 587, 593, 599, 601, 607, 613, 617, 619, 631, 641, 643, 647, 653, 659, 661, 673, 677, 683, 691, 701, 709, 719, 727, 733, 739, 743, 751, 757, 761, 769, 773, 787, 797, 809, 811, 821, 823, 827, 829, 839, 853, 857, 859, 863, 877, 881, 883, 887, 907, 911, 919, 929, 937, 941, 947, 953, 967, 971, 977, 983, 991, 997}

Works with: as version Raspberry Pi





/* ARM assembly Raspberry PI */

/* program cribleEras . s */



/* REMARK 1 : this program use routines in a include file

see task Include a file language arm assembly

for the routine affichageMess conversion10

see at end of this program the instruction include */

/* for constantes see task include a file in arm assembly */

/************************************/

/* Constantes */

/************************************/

. include "../constantes.inc"



. equ MAXI , 101





/*********************************/

/* Initialized data */

/*********************************/

. data

sMessResult : . asciz "Prime : @

"

szCarriageReturn : . asciz "

"



/*********************************/

/* UnInitialized data */

/*********************************/

. bss

sZoneConv : . skip 24

TablePrime : . skip 4 * MAXI

/*********************************/

/* code section */

/*********************************/

. text

. global main

main : @ entry of program

ldr r4 , iAdrTablePrime @ address prime table

mov r0 , # 2 @ prime 2

bl displayPrime

mov r1 , # 2

mov r2 , # 1

1 : @ loop for multiple of 2

str r2 , [ r4 , r1 , lsl # 2 ] @ mark multiple of 2

add r1 , # 2

cmp r1 , #MAXI @ end ?

ble 1b @ no loop

mov r1 , # 3 @ begin indice

mov r3 , # 1

2 :

ldr r2 , [ r4 , r1 , lsl # 2 ] @ load table élément

cmp r2 , # 1 @ is prime ?

beq 4f

mov r0 , r1 @ yes - > display

bl displayPrime

mov r2 , r1

3 : @ and loop to mark multiples of this prime

str r3 , [ r4 , r2 , lsl # 2 ]

add r2 , r1 @ add the prime

cmp r2 , #MAXI @ end ?

ble 3b @ no - > loop

4 :

add r1 , # 2 @ other prime in table

cmp r1 , #MAXI @ end table ?

ble 2b @ no - > loop



100 : @ standard end of the program

mov r0 , # 0 @ return code

mov r7 , #EXIT @ request to exit program

svc # 0 @ perform the system call

iAdrszCarriageReturn : . int szCarriageReturn

iAdrsMessResult : . int sMessResult

iAdrTablePrime : . int TablePrime



/******************************************************************/

/* Display prime table elements */

/******************************************************************/

/* r0 contains the prime */

displayPrime :

push { r1 , lr } @ save registers

ldr r1 , iAdrsZoneConv

bl conversion10 @ call décimal conversion

ldr r0 , iAdrsMessResult

ldr r1 , iAdrsZoneConv @ insert conversion in message

bl strInsertAtCharInc

bl affichageMess @ display message

100 :

pop { r1 , lr }

bx lr

iAdrsZoneConv : . int sZoneConv

/***************************************************/

/* ROUTINES INCLUDE */

/***************************************************/

. include "../affichage.inc"



Prime : 2 Prime : 3 Prime : 5 Prime : 7 Prime : 11 Prime : 13 Prime : 17 Prime : 19 Prime : 23 Prime : 29 Prime : 31 Prime : 37 Prime : 41 Prime : 43 Prime : 47 Prime : 53 Prime : 59 Prime : 61 Prime : 67 Prime : 71 Prime : 73 Prime : 79 Prime : 83 Prime : 89 Prime : 97 Prime : 101

Search autohotkey.com: of Eratosthenes

Source: AutoHotkey forum by Laszlo

MsgBox % "12345678901234567890`n" Sieve ( 20 )



Sieve ( n ) { ; Sieve of Eratosthenes => string of 0|1 chars, 1 at position k: k is prime

Static zero := 48 , one := 49 ; Asc("0"), Asc("1")

VarSetCapacity ( S , n , one )

NumPut ( zero , S , 0 , "char" )

i := 2

Loop % sqrt ( n ) - 1 {

If ( NumGet ( S , i - 1 , "char" ) = one )

Loop % n // i

If ( A_Index > 1 )

NumPut ( zero , S , A_Index * i - 1 , "char" )

i += 1 + ( i > 2 )

}

Return S

}

Alternative Version [ edit ]

Sieve_of_Eratosthenes ( n ) {

arr := [ ]

loop % n - 1

if A_Index > 1

arr [ A_Index ] := true



for i , v in arr {

if ( i > Sqrt ( n ) )

break

else if arr [ i ]

while ( ( j := i * 2 + ( A_Index - 1 ) * i ) < n )

arr . delete ( j )

}

return Arr

}

n := 101

Arr := Sieve_of_Eratosthenes ( n )

loop , % n - 1

output .= ( Arr [ A_Index ] ? A_Index : "." ) . ( ! Mod ( A_Index , 10 ) ? "`n" : "`t" )

MsgBox % output

return

Output:

. 2 3 . 5 . 7 . . . 11 . 13 . . . 17 . 19 . . . 23 . . . . . 29 . 31 . . . . . 37 . . . 41 . 43 . . . 47 . . . . . 53 . . . . . 59 . 61 . . . . . 67 . . . 71 . 73 . . . . . 79 . . . 83 . . . . . 89 . . . . . . . 97 . . .

#include <Array.au3>

$M = InputBox ( "Integer" , "Enter biggest Integer" )

Global $a [ $M ] , $r [ $M ] , $c = 1

For $i = 2 To $M - 1

If Not $a [ $i ] Then

$r [ $c ] = $i

$c += 1

For $k = $i To $M - 1 Step $i

$a [ $k ] = True

Next

EndIf

Next

$r [ 0 ] = $c - 1

ReDim $r [ $c ]

_ArrayDisplay ( $r )

Examples:

An initial array holds all numbers 2..max (which is entered on stdin); then all products of integers are deleted from it; the remaining are displayed in the unsorted appearance of a hash table. Here, the script is entered directly on the commandline, and input entered on stdin:

$ awk '{for(i=2;i<=$1;i++) a[i]=1; > for(i=2;i<=sqrt($1);i++) for(j=2;j<=$1;j++) delete a[i*j]; > for(i in a) printf i" "}' 100 71 53 17 5 73 37 19 83 47 29 7 67 59 11 97 79 89 31 13 41 23 2 61 43 3

The following variant does not unset non-primes, but sets them to 0, to preserve order in output:

$ awk '{for(i=2;i<=$1;i++) a[i]=1; > for(i=2;i<=sqrt($1);i++) for(j=2;j<=$1;j++) a[i*j]=0; > for(i=2;i<=$1;i++) if(a[i])printf i" "}' 100 2 3 5 7 11 13 17 19 23 29 31 37 41 43 47 53 59 61 67 71 73 79 83 89 97

Now with the script from a file, input from commandline as well as stdin, and input is checked for valid numbers:



# usage: gawk -v n=101 -f sieve.awk



function sieve ( n ) { # print n,":"

for ( i= 2 ; i <= n; i ++ ) a [ i ] = 1 ;

for ( i= 2 ; i <= sqrt ( n ) ;i ++ ) for ( j= 2 ;j <= n;j ++ ) a [ i * j ] = 0 ;

for ( i= 2 ; i <= n; i ++ ) if ( a [ i ] ) printf i " "

print ""

}



BEGIN { print "Sieve of Eratosthenes:"

if ( n > 1 ) sieve ( n )

}



{ n= $1 + 0 }

n < 2 { exit }

{ sieve ( n ) }



END { print "Bye!" }



Here is an alternate version that uses an associative array to record composites with a prime dividing it. It can be considered a slow version, as it does not cross out composites until needed. This version assumes enough memory to hold all primes up to ULIMIT. It prints out noncomposites greater than 1.



BEGIN { ULIMIT= 100



for ( n= 1 ; ( n ++ ) < ULIMIT ; )

if ( n in S ) {

p = S [ n ]

delete S [ n ]

for ( m = n ; ( m + = p ) in S ; ) { }

S [ m ] = p

}

else print ( S [ ( n + n ) ] = n )

}



Bash [ edit ]

See solutions at UNIX Shell.

Works with: FreeBASIC

Works with: RapidQ

DIM n AS Integer , k AS Integer , limit AS Integer



INPUT "Enter number to search to: " ; limit

DIM flags ( limit ) AS Integer



FOR n = 2 TO SQR ( limit )

IF flags ( n ) = 0 THEN

FOR k = n*n TO limit STEP n

flags ( k ) = 1

NEXT k

END IF

NEXT n



' Display the primes

FOR n = 2 TO limit

IF flags ( n ) = 0 THEN PRINT n; ", " ;

NEXT n

10 INPUT "ENTER NUMBER TO SEARCH TO: ";LIMIT

20 DIM FLAGS(LIMIT)

30 FOR N = 2 TO SQR (LIMIT)

40 IF FLAGS(N) < > 0 GOTO 80

50 FOR K = N * N TO LIMIT STEP N

60 FLAGS(K) = 1

70 NEXT K

80 NEXT N

90 REM DISPLAY THE PRIMES

100 FOR N = 2 TO LIMIT

110 IF FLAGS(N) = 0 THEN PRINT N;", ";

120 NEXT N

100 PROGRAM "Sieve.bas"

110 LET LIMIT=100

120 NUMERIC T(1 TO LIMIT)

130 FOR I=1 TO LIMIT

140 LET T(I)=0

150 NEXT

160 FOR I=2 TO SQR(LIMIT)

170 IF T(I)<>1 THEN

180 FOR K=I*I TO LIMIT STEP I

190 LET T(K)=1

200 NEXT

210 END IF

220 NEXT

230 FOR I=2 TO LIMIT ! Display the primes

240 IF T(I)=0 THEN PRINT I;

250 NEXT

10 DEFINT a-z

20 INPUT "Limit" ;limit

30 DIM f ( limit )

40 FOR n= 2 TO SQR ( limit )

50 IF f ( n ) = 1 THEN 90

60 FOR k=n*n TO limit STEP n

70 f ( k ) = 1

80 NEXT k

90 NEXT n

100 FOR n= 2 TO limit

110 IF f ( n ) = 0 THEN PRINT n; "," ;

120 NEXT

5 REM Tested with MSXPen web emulator

6 REM Translated from Rosetta's ZX Spectrum implementation

10 INPUT "Enter number to search to: ";l

20 DIM p(l)

30 FOR n=2 TO SQR(l)

40 IF p(n)<>0 THEN NEXT n

50 FOR k=n*n TO l STEP n

60 LET p(k)=1

70 NEXT k

80 NEXT n

90 REM Display the primes

100 FOR n=2 TO l

110 IF p(n)=0 THEN PRINT n;", ";

120 NEXT n

If you only have 1k of RAM, this program will work—but you will only be able to sieve numbers up to 101. The program is therefore more useful if you have more memory available.

A note on FAST and SLOW : under normal circumstances the CPU spends about 3/4 of its time driving the display and only 1/4 doing everything else. Entering FAST mode blanks the screen (which we do not want to update anyway), resulting in substantially improved performance; we then return to SLOW mode when we have something to print out.

10 INPUT L

20 FAST

30 DIM N(L)

40 FOR I=2 TO SQR L

50 IF N(I) THEN GOTO 90

60 FOR J=I+I TO L STEP I

70 LET N(J)=1

80 NEXT J

90 NEXT I

100 SLOW

110 FOR I=2 TO L

120 IF NOT N(I) THEN PRINT I;" ";

130 NEXT I

10 INPUT "Enter number to search to: " ;l

20 DIM p ( l )

30 FOR n= 2 TO SQR l

40 IF p ( n ) <> 0 THEN NEXT n

50 FOR k=n*n TO l STEP n

60 LET p ( k ) = 1

70 NEXT k

80 NEXT n

90 REM Display the primes

100 FOR n= 2 TO l

110 IF p ( n ) = 0 THEN PRINT n; ", " ;

120 NEXT n

:: Sieve of Eratosthenes for Rosetta Code - PG

@ echo off

setlocal ENABLEDELAYEDEXPANSION

setlocal ENABLEEXTENSIONS

rem echo on

set /p n=limit:

rem set n=100

for /L %% i in ( 1,1, % n %) do set crible. %% i =1

for /L %% i in ( 2,1, % n %) do (

if ! crible.%% i ! EQU 1 (

set /A w = %% i * 2

for /L %% j in (! w ! , %% i , % n %) do (

set crible. %% j =0

)

)

)

for /L %% i in ( 2,1, % n %) do (

if ! crible.%% i ! EQU 1 echo %% i

)

pause

Output:

limit: 100 2 3 5 7 11 13 17 19 23 29 31 37 41 43 47 53 59 61 67 71 73 79 83 89 97

limit% = 100000

DIM sieve% limit%



prime% = 2

WHILE prime%^2 < limit%

FOR I% = prime%*2 TO limit% STEP prime%

sieve%?I% = 1

NEXT

REPEAT prime% += 1 : UNTIL sieve%?prime%=0

ENDWHILE



REM Display the primes:

FOR I% = 1 TO limit%

IF sieve%?I% = 0 PRINT I%;

NEXT

2>:3g" "-!v\ g30 < |!`"O":+1_:.:03p>03g+:"O"`| @ ^ p3\" ":< 2 234567890123456789012345678901234567890123456789012345678901234567890123456789

This solution does not use an array. Instead, numbers themselves are used as variables. The numbers that are not prime are set (to the silly value "nonprime"). Finally all numbers up to the limit are tested for being initialised. The uninitialised (unset) ones must be the primes.

( ( eratosthenes

= n j i

. !arg:?n

& 1:?i

& whl

' ( (1+!i:?i)^2:~>!n:?j

& ( !!i

| whl

' ( !j:~>!n

& nonprime:?!j

& !j+!i:?j

)

)

)

& 1:?i

& whl

' ( 1+!i:~>!n:?i

& (!!i|put$(!i " "))

)

)

& eratosthenes$100

)

Output:

2 3 5 7 11 13 17 19 23 29 31 37 41 43 47 53 59 61 67 71 73 79 83 89 97

#include <stdlib.h>

#include <math.h>



char *

eratosthenes ( int n , int * c )

{

char * sieve ;

int i , j , m ;



if ( n < 2 )

return NULL ;



* c = n - 1 ; /* primes count */

m = ( int ) sqrt ( ( double ) n ) ;



/* calloc initializes to zero */

sieve = calloc ( n + 1 , sizeof ( char ) ) ;

sieve [ 0 ] = 1 ;

sieve [ 1 ] = 1 ;

for ( i = 2 ; i <= m ; i ++ )

if ( ! sieve [ i ] )

for ( j = i * i ; j <= n ; j += i )

if ( ! sieve [ j ] ) {

sieve [ j ] = 1 ;

-- ( * c ) ;

}

return sieve ;

}

Plain sieve, without any optimizations:Possible optimizations include sieving only odd numbers (or more complex wheels), packing the sieve into bits to improve locality (and allow larger sieves), etc.

Another example:

We first fill ones into an array and assume all numbers are prime. Then, in a loop, fill zeroes into those places where i * j is less than or equal to n (number of primes requested), which means they have multiples! To understand this better, look at the output of the following example.

#include <stdio.h>

#include <malloc.h>

void sieve ( int *, int ) ;



int main ( int argc , char * argv )

{

int * array , n = 10 ;

array = ( int * ) malloc ( ( n + 1 ) * sizeof ( int ) ) ;

sieve ( array , n ) ;

return 0 ;

}



void sieve ( int * a , int n )

{

int i = 0 , j = 0 ;



for ( i = 2 ; i <= n ; i ++ ) {

a [ i ] = 1 ;

}



for ( i = 2 ; i <= n ; i ++ ) {

printf ( "

i:%d" , i ) ;

if ( a [ i ] == 1 ) {

for ( j = i ; ( i * j ) <= n ; j ++ ) {

printf ( "

j:%d" , j ) ;

printf ( "

Before a[%d*%d]: %d" , i , j , a [ i * j ] ) ;

a [ ( i * j ) ] = 0 ;

printf ( "

After a[%d*%d]: %d" , i , j , a [ i * j ] ) ;

}

}

}



printf ( "

Primes numbers from 1 to %d are : " , n ) ;

for ( i = 2 ; i <= n ; i ++ ) {

if ( a [ i ] == 1 )

printf ( "%d, " , i ) ;

}

printf ( "



" ) ;

}

Output:

i:2

j:2

Before a[2*2]: 1

After a[2*2]: 0

j:3

Before a[2*3]: 1

After a[2*3]: 0

j:4

Before a[2*4]: 1

After a[2*4]: 0

j:5

Before a[2*5]: 1

After a[2*5]: 0

i:3

j:3

Before a[3*3]: 1

After a[3*3]: 0

i:4

i:5

i:6

i:7

i:8

i:9

i:10

Primes numbers from 1 to 10 are : 2, 3, 5, 7,

using System ;

using System.Collections ;

using System.Collections.Generic ;



namespace SieveOfEratosthenes

{

class Program

{

static void Main ( string [ ] args )

{

int maxprime = int . Parse ( args [ 0 ] ) ;

var primelist = GetAllPrimesLessThan ( maxprime ) ;

foreach ( int prime in primelist )

{

Console . WriteLine ( prime ) ;

}

Console . WriteLine ( "Count = " + primelist . Count ) ;

Console . ReadLine ( ) ;

}



private static List < int > GetAllPrimesLessThan ( int maxPrime )

{

var primes = new List < int > ( ) ;

var maxSquareRoot = ( int ) Math . Sqrt ( maxPrime ) ;

var eliminated = new BitArray ( maxPrime + 1 ) ;



for ( int i = 2 ; i <= maxPrime ; ++ i )

{

if ( ! eliminated [ i ] )

{

primes . Add ( i ) ;

if ( i <= maxSquareRoot )

{

for ( int j = i * i ; j <= maxPrime ; j += i )

{

eliminated [ j ] = true ;

}

}

}

}

return primes ;

}

}

}

Unbounded [ edit ]

To print this back, we look for ones in the array and only print those spots.

Richard Bird Sieve

Translation of: F#

To show that C# code can be written in somewhat functional paradigms, the following in an implementation of the Richard Bird sieve from the Epilogue of [Melissa E. O'Neill's definitive article](http://www.cs.hmc.edu/~oneill/papers/Sieve-JFP.pdf) in Haskell:

using System ;

using System.Collections ;

using System.Collections.Generic ;

using System.Linq ;

using PrimeT = System . UInt32 ;

class PrimesBird : IEnumerable < PrimeT > {

private struct CIS < T > {

public T v ; public Func < CIS < T >> cont ;

public CIS ( T v, Func < CIS < T >> cont ) {

this . v = v ; this . cont = cont ;

}

}

private CIS < PrimeT > pmlts ( PrimeT p ) {

Func < PrimeT, CIS < PrimeT >> fn = null ;

fn = ( c ) => new CIS < PrimeT > ( c, ( ) => fn ( c + p ) ) ;

return fn ( p * p ) ;

}

private CIS < CIS < PrimeT >> allmlts ( CIS < PrimeT > ps ) {

return new CIS < CIS < PrimeT >> ( pmlts ( ps . v ) , ( ) => allmlts ( ps . cont ( ) ) ) ; }

private CIS < PrimeT > merge ( CIS < PrimeT > xs, CIS < PrimeT > ys ) {

var x = xs . v ; var y = ys . v ;

if ( x < y ) return new CIS < PrimeT > ( x, ( ) => merge ( xs . cont ( ) , ys ) ) ;

else if ( y < x ) return new CIS < PrimeT > ( y, ( ) => merge ( xs, ys . cont ( ) ) ) ;

else return new CIS < PrimeT > ( x, ( ) => merge ( xs . cont ( ) , ys . cont ( ) ) ) ;

}

private CIS < PrimeT > cmpsts ( CIS < CIS < PrimeT >> css ) {

return new CIS < PrimeT > ( css . v . v , ( ) => merge ( css . v . cont ( ) , cmpsts ( css . cont ( ) ) ) ) ; }

private CIS < PrimeT > minusat ( PrimeT n, CIS < PrimeT > cs ) {

var nn = n ; var ncs = cs ;

for ( ; ; ++ nn ) {

if ( nn >= ncs . v ) ncs = ncs . cont ( ) ;

else return new CIS < PrimeT > ( nn, ( ) => minusat ( ++ nn, ncs ) ) ;

}

}

private CIS < PrimeT > prms ( ) {

return new CIS < PrimeT > ( 2 , ( ) => minusat ( 3 , cmpsts ( allmlts ( prms ( ) ) ) ) ) ; }

public IEnumerator < PrimeT > GetEnumerator ( ) {

for ( var ps = prms ( ) ; ; ps = ps . cont ( ) ) yield return ps . v ;

}

IEnumerator IEnumerable . GetEnumerator ( ) { return ( IEnumerator ) GetEnumerator ( ) ; }

}

Tree Folding Sieve

Translation of: F#

The above code can easily be converted to "odds-only" and a infinite tree-like folding scheme with the following minor changes:

using System ;

using System.Collections ;

using System.Collections.Generic ;

using System.Linq ;

using PrimeT = System . UInt32 ;

class PrimesTreeFold : IEnumerable < PrimeT > {

private struct CIS < T > {

public T v ; public Func < CIS < T >> cont ;

public CIS ( T v, Func < CIS < T >> cont ) {

this . v = v ; this . cont = cont ;

}

}

private CIS < PrimeT > pmlts ( PrimeT p ) {

var adv = p + p ;

Func < PrimeT, CIS < PrimeT >> fn = null ;

fn = ( c ) => new CIS < PrimeT > ( c, ( ) => fn ( c + adv ) ) ;

return fn ( p * p ) ;

}

private CIS < CIS < PrimeT >> allmlts ( CIS < PrimeT > ps ) {

return new CIS < CIS < PrimeT >> ( pmlts ( ps . v ) , ( ) => allmlts ( ps . cont ( ) ) ) ;

}

private CIS < PrimeT > merge ( CIS < PrimeT > xs, CIS < PrimeT > ys ) {

var x = xs . v ; var y = ys . v ;

if ( x < y ) return new CIS < PrimeT > ( x, ( ) => merge ( xs . cont ( ) , ys ) ) ;

else if ( y < x ) return new CIS < PrimeT > ( y, ( ) => merge ( xs, ys . cont ( ) ) ) ;

else return new CIS < PrimeT > ( x, ( ) => merge ( xs . cont ( ) , ys . cont ( ) ) ) ;

}

private CIS < CIS < PrimeT >> pairs ( CIS < CIS < PrimeT >> css ) {

var nxtcss = css . cont ( ) ;

return new CIS < CIS < PrimeT >> ( merge ( css . v , nxtcss . v ) , ( ) => pairs ( nxtcss . cont ( ) ) ) ; }

private CIS < PrimeT > cmpsts ( CIS < CIS < PrimeT >> css ) {

return new CIS < PrimeT > ( css . v . v , ( ) => merge ( css . v . cont ( ) , cmpsts ( pairs ( css . cont ( ) ) ) ) ) ;

}

private CIS < PrimeT > minusat ( PrimeT n, CIS < PrimeT > cs ) {

var nn = n ; var ncs = cs ;

for ( ; ; nn += 2 ) {

if ( nn >= ncs . v ) ncs = ncs . cont ( ) ;

else return new CIS < PrimeT > ( nn, ( ) => minusat ( nn + 2 , ncs ) ) ;

}

}

private CIS < PrimeT > oddprms ( ) {

return new CIS < PrimeT > ( 3 , ( ) => minusat ( 5 , cmpsts ( allmlts ( oddprms ( ) ) ) ) ) ;

}

public IEnumerator < PrimeT > GetEnumerator ( ) {

yield return 2 ;

for ( var ps = oddprms ( ) ; ; ps = ps . cont ( ) ) yield return ps . v ;

}

IEnumerator IEnumerable . GetEnumerator ( ) { return ( IEnumerator ) GetEnumerator ( ) ; }

}

The above code runs over ten times faster than the original Richard Bird algorithm.

Priority Queue Sieve

Translation of: F#

First, an implementation of a Min Heap Priority Queue is provided as extracted from the entry at RosettaCode, with only the necessary methods duplicated here:

namespace PriorityQ {

using KeyT = System . UInt32 ;

using System ;

using System.Collections.Generic ;

using System.Linq ;

class Tuple < K, V > { // for DotNet 3.5 without Tuple's

public K Item1 ; public V Item2 ;

public Tuple ( K k, V v ) { Item1 = k ; Item2 = v ; }

public override string ToString ( ) {

return "(" + Item1 . ToString ( ) + ", " + Item2 . ToString ( ) + ")" ;

}

}

class MinHeapPQ < V > {

private struct HeapEntry {

public KeyT k ; public V v ;

public HeapEntry ( KeyT k, V v ) { this . k = k ; this . v = v ; }

}

private List < HeapEntry > pq ;

private MinHeapPQ ( ) { this . pq = new List < HeapEntry > ( ) ; }

private bool mt { get { return pq . Count == 0 ; } }

private Tuple < KeyT, V > pkmn {

get {

if ( pq . Count == 0 ) return null ;

else {

var mn = pq [ 0 ] ;

return new Tuple < KeyT, V > ( mn . k , mn . v ) ;

}

}

}

private void psh ( KeyT k, V v ) { // add extra very high item if none

if ( pq . Count == 0 ) pq . Add ( new HeapEntry ( UInt32 . MaxValue , v ) ) ;

var i = pq . Count ; pq . Add ( pq [ i - 1 ] ) ; // copy bottom item...

for ( var ni = i >> 1 ; ni > 0 ; i >>= 1 , ni >>= 1 ) {

var t = pq [ ni - 1 ] ;

if ( t . k > k ) pq [ i - 1 ] = t ; else break ;

}

pq [ i - 1 ] = new HeapEntry ( k, v ) ;

}

private void siftdown ( KeyT k, V v, int ndx ) {

var cnt = pq . Count - 1 ; var i = ndx ;

for ( var ni = i + i + 1 ; ni < cnt ; ni = ni + ni + 1 ) {

var oi = i ; var lk = pq [ ni ] . k ; var rk = pq [ ni + 1 ] . k ;

var nk = k ;

if ( k > lk ) { i = ni ; nk = lk ; }

if ( nk > rk ) { ni += 1 ; i = ni ; }

if ( i != oi ) pq [ oi ] = pq [ i ] ; else break ;

}

pq [ i ] = new HeapEntry ( k, v ) ;

}

private void rplcmin ( KeyT k, V v ) {

if ( pq . Count > 1 ) siftdown ( k, v, 0 ) ; }

public static MinHeapPQ < V > empty { get { return new MinHeapPQ < V > ( ) ; } }

public static Tuple < KeyT, V > peekMin ( MinHeapPQ < V > pq ) { return pq . pkmn ; }

public static MinHeapPQ < V > push ( KeyT k, V v, MinHeapPQ < V > pq ) {

pq . psh ( k, v ) ; return pq ; }

public static MinHeapPQ < V > replaceMin ( KeyT k, V v, MinHeapPQ < V > pq ) {

pq . rplcmin ( k, v ) ; return pq ; }

}

The following code implements an improved version of the odds-only O'Neil algorithm, which provides the improvements of only adding base prime composite number streams to the queue when the sieved number reaches the square of the base prime (saving a huge amount of memory and considerable execution time, including not needing as wide a range of a type for the internal prime numbers) as well as minimizing stream processing using fusion:

using System ;

using System.Collections ;

using System.Collections.Generic ;

using System.Linq ;

using PrimeT = System . UInt32 ;

class PrimesPQ : IEnumerable < PrimeT > {

private IEnumerator < PrimeT > nmrtr ( ) {

MinHeapPQ < PrimeT > pq = MinHeapPQ < PrimeT >. empty ;

PrimeT bp = 3 ; PrimeT q = 9 ;

IEnumerator < PrimeT > bps = null ;

yield return 2 ; yield return 3 ;

for ( var n = ( PrimeT ) 5 ; ; n += 2 ) {

if ( n >= q ) { // always equal or less...

if ( q <= 9 ) {

bps = nmrtr ( ) ;

bps . MoveNext ( ) ; bps . MoveNext ( ) ; } // move to 3...

bps . MoveNext ( ) ; var nbp = bps . Current ; q = nbp * nbp ;

var adv = bp + bp ; bp = nbp ;

pq = MinHeapPQ < PrimeT >. push ( n + adv, adv, pq ) ;

}

else {

var pk = MinHeapPQ < PrimeT >. peekMin ( pq ) ;

var ck = ( pk == null ) ? q : pk . Item1 ;

if ( n >= ck ) {

do { var adv = pk . Item2 ;

pq = MinHeapPQ < PrimeT >. replaceMin ( ck + adv, adv, pq ) ;

pk = MinHeapPQ < PrimeT >. peekMin ( pq ) ; ck = pk . Item1 ;

} while ( n >= ck ) ;

}

else yield return n ;

}

}

}

public IEnumerator < PrimeT > GetEnumerator ( ) { return nmrtr ( ) ; }

IEnumerator IEnumerable . GetEnumerator ( ) { return ( IEnumerator ) GetEnumerator ( ) ; }

}

The above code is at least about 2.5 times faster than the Tree Folding version.

Dictionary (Hash table) Sieve

The above code adds quite a bit of overhead in having to provide a version of a Priority Queue for little advantage over a Dictionary (hash table based) version as per the code below:

using System ;

using System.Collections ;

using System.Collections.Generic ;

using System.Linq ;

using PrimeT = System . UInt32 ;

class PrimesDict : IEnumerable < PrimeT > {

private IEnumerator < PrimeT > nmrtr ( ) {

Dictionary < PrimeT, PrimeT > dct = new Dictionary < PrimeT, PrimeT > ( ) ;

PrimeT bp = 3 ; PrimeT q = 9 ;

IEnumerator < PrimeT > bps = null ;

yield return 2 ; yield return 3 ;

for ( var n = ( PrimeT ) 5 ; ; n += 2 ) {

if ( n >= q ) { // always equal or less...

if ( q <= 9 ) {

bps = nmrtr ( ) ;

bps . MoveNext ( ) ; bps . MoveNext ( ) ;

} // move to 3...

bps . MoveNext ( ) ; var nbp = bps . Current ; q = nbp * nbp ;

var adv = bp + bp ; bp = nbp ;

dct . Add ( n + adv, adv ) ;

}

else {

if ( dct . ContainsKey ( n ) ) {

PrimeT nadv ; dct . TryGetValue ( n, out nadv ) ; dct . Remove ( n ) ; var nc = n + nadv ;

while ( dct . ContainsKey ( nc ) ) nc += nadv ;

dct . Add ( nc, nadv ) ;

}

else yield return n ;

}

}

}

public IEnumerator < PrimeT > GetEnumerator ( ) { return nmrtr ( ) ; }

IEnumerator IEnumerable . GetEnumerator ( ) { return ( IEnumerator ) GetEnumerator ( ) ; }

}

The above code runs in about three quarters of the time as the above Priority Queue based version for a range of a million primes which will fall even further behind for increasing ranges due to the Dictionary providing O(1) access times as compared to the O(log n) access times for the Priority Queue; the only slight advantage of the PQ based version is at very small ranges where the constant factor overhead of computing the table hashes becomes greater than the "log n" factor for small "n".

Page Segmented Array Sieve

All of the above unbounded versions are really just an intellectual exercise as with very little extra lines of code above the fastest Dictionary based version, one can have an bit-packed page-segmented array based version as follows:

using System ;

using System.Collections ;

using System.Collections.Generic ;

using System.Linq ;

using PrimeT = System . UInt32 ;

class PrimesPgd : IEnumerable < PrimeT > {

private const int PGSZ = 1 << 14 ; // L1 CPU cache size in bytes

private const int BFBTS = PGSZ * 8 ; // in bits

private const int BFRNG = BFBTS * 2 ;

public IEnumerator < PrimeT > nmrtr ( ) {

IEnumerator < PrimeT > bps = null ;

List < uint > bpa = new List < uint > ( ) ;

uint [ ] cbuf = new uint [ PGSZ / 4 ] ; // 4 byte words

yield return 2 ;

for ( var lowi = ( PrimeT ) 0 ; ; lowi += BFBTS ) {

for ( var bi = 0 ; ; ++ bi ) {

if ( bi < 1 ) {

if ( bi < 0 ) { bi = 0 ; yield return 2 ; }

PrimeT nxt = 3 + lowi + lowi + BFRNG ;

if ( lowi <= 0 ) { // cull very first page

for ( int i = 0 , p = 3 , sqr = 9 ; sqr < nxt ; i ++ , p += 2 , sqr = p * p )

if ( ( cbuf [ i >> 5 ] & ( 1 << ( i & 31 ) ) ) == 0 )

for ( int j = ( sqr - 3 ) >> 1 ; j < BFBTS ; j += p )

cbuf [ j >> 5 ] |= 1u << j ;

}

else { // cull for the rest of the pages

Array . Clear ( cbuf, 0 , cbuf . Length ) ;

if ( bpa . Count == 0 ) { // inite secondar base primes stream

bps = nmrtr ( ) ; bps . MoveNext ( ) ; bps . MoveNext ( ) ;

bpa . Add ( ( uint ) bps . Current ) ; bps . MoveNext ( ) ;

} // add 3 to base primes array

// make sure bpa contains enough base primes...

for ( PrimeT p = bpa [ bpa . Count - 1 ] , sqr = p * p ; sqr < nxt ; ) {

p = bps . Current ; bps . MoveNext ( ) ; sqr = p * p ; bpa . Add ( ( uint ) p ) ;

}

for ( int i = 0 , lmt = bpa . Count - 1 ; i < lmt ; i ++ ) {

var p = ( PrimeT ) bpa [ i ] ; var s = ( p * p - 3 ) >> 1 ;

// adjust start index based on page lower limit...

if ( s >= lowi ) s -= lowi ;

else {

var r = ( lowi - s ) % p ;

s = ( r != 0 ) ? p - r : 0 ;

}

for ( var j = ( uint ) s ; j < BFBTS ; j += p )

cbuf [ j >> 5 ] |= 1u << ( ( int ) j ) ;

}

}

}

while ( bi < BFBTS && ( cbuf [ bi >> 5 ] & ( 1 << ( bi & 31 ) ) ) != 0 ) ++ bi ;

if ( bi < BFBTS ) yield return 3 + ( ( ( PrimeT ) bi + lowi ) << 1 ) ;

else break ; // outer loop for next page segment...

}

}

}

public IEnumerator < PrimeT > GetEnumerator ( ) { return nmrtr ( ) ; }

IEnumerator IEnumerable . GetEnumerator ( ) { return ( IEnumerator ) GetEnumerator ( ) ; }

}

The above code is about 25 times faster than the Dictionary version at computing the first about 50 million primes (up to a range of one billion), with the actual enumeration of the result sequence now taking longer than the time it takes to cull the composite number representation bits from the arrays, meaning that it is over 50 times faster at actually sieving the primes. The code owes its speed as compared to a naive "one huge memory array" algorithm to using an array size that is the size of the CPU L1 or L2 caches and using bit-packing to fit more number representations into this limited capacity; in this way RAM memory access times are reduced by a factor of from about four to about 10 (depending on CPU and RAM speed) as compared to those naive implementations, and the minor computational cost of the bit manipulations is compensated by a large factor in total execution time.

The time to enumerate the result primes sequence can be reduced somewhat (about a second) by removing the automatic iterator "yield return" statements and converting them into a "rull-your-own" IEnumerable<PrimeT> implementation, but for page segmentation of odds-only, this iteration of the results will still take longer than the time to actually cull the composite numbers from the page arrays.

In order to make further gains in speed, custom methods must be used to avoid using iterator sequences. If this is done, then further gains can be made by extreme wheel factorization (up to about another about four times gain in speed) and multi-processing (with another gain in speed proportional to the actual independent CPU cores used).

Note that all of these gains in speed are not due to C# other than it compiles to reasonably efficient machine code, but rather to proper use of the Sieve of Eratosthenes algorithm.

All of the above unbounded code can be tested by the following "main" method (replace the name "PrimesXXX" with the name of the class to be tested):

static void Main ( string [ ] args ) {

Console . WriteLine ( PrimesXXX ( ) . ElementAt ( 1000000 - 1 ) ) ; // zero based indexing...

}

To produce the following output for all tested versions (although some are considerably faster than others):

Output:

15485863

Standard Library [ edit ]

This implementation follows the standard library pattern of std::iota. The start and end iterators are provided for the container. The destination container is used for marking primes and then filled with the primes which are less than the container size. This method requires no memory allocation inside the function.

#include <iostream>

#include <algorithm>

#include <vector>



// requires Iterator satisfies RandomAccessIterator

template < typename Iterator >

size_t prime_sieve ( Iterator start, Iterator end )

{

if ( start == end ) return 0 ;

// clear the container with 0

std :: fill ( start, end, 0 ) ;

// mark composites with 1

for ( Iterator prime_it = start + 1 ; prime_it ! = end ; ++ prime_it )

{

if ( * prime_it == 1 ) continue ;

// determine the prime number represented by this iterator location

size_t stride = ( prime_it - start ) + 1 ;

// mark all multiples of this prime number up to max

Iterator mark_it = prime_it ;

while ( ( end - mark_it ) > stride )

{

std :: advance ( mark_it, stride ) ;

* mark_it = 1 ;

}

}

// copy marked primes into container

Iterator out_it = start ;

for ( Iterator it = start + 1 ; it ! = end ; ++ it )

{

if ( * it == 0 )

{

* out_it = ( it - start ) + 1 ;

++ out_it ;

}

}

return out_it - start ;

}



int main ( int argc, const char * argv [ ] )

{

std :: vector < int > primes ( 100 ) ;

size_t count = prime_sieve ( primes. begin ( ) , primes. end ( ) ) ;

// display the primes

for ( size_t i = 0 ; i < count ; ++ i )

std :: cout << primes [ i ] << " " ;

std :: cout << std :: endl ;

return 1 ;

}

Boost [ edit ]

// yield all prime numbers less than limit.

template < class UnaryFunction >

void primesupto ( int limit, UnaryFunction yield )

{

std :: vector < bool > is_prime ( limit, true ) ;



const int sqrt_limit = static_cast < int > ( std :: sqrt ( limit ) ) ;

for ( int n = 2 ; n <= sqrt_limit ; ++ n )

if ( is_prime [ n ] ) {

yield ( n ) ;



for ( unsigned k = n * n, ulim = static_cast < unsigned > ( limit ) ; k < ulim ; k + = n )

//NOTE: "unsigned" is used to avoid an overflow in `k+=n` for `limit` near INT_MAX

is_prime [ k ] = false ;

}



for ( int n = sqrt_limit + 1 ; n < limit ; ++ n )

if ( is_prime [ n ] )

yield ( n ) ;

}

Full program:

Works with: Boost

/**

$ g++ -I/path/to/boost sieve.cpp -o sieve && sieve 10000000

*/

#include <inttypes.h> // uintmax_t

#include <limits>

#include <cmath>

#include <iostream>

#include <sstream>

#include <vector>



#include <boost/lambda/lambda.hpp>



int main ( int argc, char * argv [ ] )

{

using namespace std ;

using namespace boost :: lambda ;



int limit = 10000 ;

if ( argc == 2 ) {

stringstream ss ( argv [ -- argc ] ) ;

ss >> limit ;



if ( limit < 1 or ss. fail ( ) ) {

cerr << "USAGE:

sieve LIMIT



where LIMIT in the range [1, "

<< numeric_limits < int > :: max ( ) << ")" << endl ;

return 2 ;

}

}



// print primes less then 100

primesupto ( 100 , cout << _1 << " " ) ;

cout << endl ;



// find number of primes less then limit and their sum

int count = 0 ;

uintmax_t sum = 0 ;

primesupto ( limit, ( var ( sum ) + = _1, var ( count ) + = 1 ) ) ;



cout << "limit sum pi(n)

"

<< limit << " " << sum << " " << count << endl ;

}

This solution uses nested iterators to create new wheels at run time:

// yield prime and remove all multiples of it from children sieves

iter sieve(prime):int {



yield prime;



var last = prime;

label candidates for candidate in sieve(prime+1) do {

for composite in last..candidate by prime do {



// candidate is a multiple of this prime

if composite == candidate {

// remember size of last composite

last = composite;

// and try the next candidate

continue candidates;

}

}



// candidate cannot need to be removed by this sieve

// yield to parent sieve for checking

yield candidate;

}

}

config const N = 30;

for p in sieve(2) {

write(" ", p);

if p > N {

writeln();

break;

}

}

Alternate Conventional Bit-Packed Implementation [ edit ]

The topmost sieve needs to be started with 2 (the smallest prime):

The following code implements the conventional monolithic (one large array) Sieve of Eratosthenes where the representations of the numbers use only one bit per number, using an iteration for output so as to not require further memory allocation:

compile with the `--fast` option

use Time;

use BitOps;



type Prime = uint(32);



config const limit: Prime = 1000000000; // sieve limit



proc main() {

write("The first 25 primes are: ");

for p in primes(100) do write(p, " "); writeln();



var count = 0; for p in primes(1000000) do count += 1;

writeln("Count of primes to a million is: ", count, ".");



var timer: Timer;

timer.start();



count = 0;

for p in primes(limit) do count += 1;



timer.stop();

write("Found ", count, " primes up to ", limit);

writeln(" in ", timer.elapsed(TimeUnits.milliseconds), " milliseconds.");

}



iter primes(n: Prime): Prime {

const szlmt = n / 8;

var cmpsts: [0 .. szlmt] uint(8); // even number of byte array rounded up



for bp in 2 .. n {

if cmpsts[bp >> 3] & (1: uint(8) << (bp & 7)) == 0 {

const s0 = bp * bp;

if s0 > n then break;

for c in s0 .. n by bp { cmpsts[c >> 3] |= 1: uint(8) << (c & 7); }

}

}



for p in 2 .. n do if cmpsts[p >> 3] & (1: uint(8) << (p & 7)) == 0 then yield p;



}

Output:

The first 25 primes are: 2 3 5 7 11 13 17 19 23 29 31 37 41 43 47 53 59 61 67 71 73 79 83 89 97 Count of primes to a million is: 78498. Found 50847534 primes up to 1000000000 in 11185.0 milliseconds.

Alternate Odds-Only Bit-Packed Implementation [ edit ]

use Time;

use BitOps;



type Prime = int(32);



config const limit: Prime = 1000000000; // sieve limit



proc main() {

write("The first 25 primes are: ");

for p in primes(100) do write(p, " "); writeln();



var count = 0; for p in primes(1000000) do count += 1;

writeln("Count of primes to a million is: ", count, ".");



var timer: Timer;

timer.start();



count = 0;

for p in primes(limit) do count += 1;



timer.stop();

write("Found ", count, " primes up to ", limit);

writeln(" in ", timer.elapsed(TimeUnits.milliseconds), " milliseconds.");

}



iter primes(n: Prime): Prime {

const ndxlmt = (n - 3) / 2;

const szlmt = ndxlmt / 8;

var cmpsts: [0 .. szlmt] uint(8); // even number of byte array rounded up



for i in 0 .. ndxlmt { // never gets to the end!

if cmpsts[i >> 3] & (1: uint(8) << (i & 7)) == 0 {

const bp = i + i + 3;

const s0 = (bp * bp - 3) / 2;

if s0 > ndxlmt then break;

for s in s0 .. ndxlmt by bp do cmpsts[s >> 3] |= 1: uint(8) << (s & 7);

}

}



yield 2;

for i in 0 .. ndxlmt do

if cmpsts[i >> 3] & (1: uint(8) << (i & 7)) == 0 then yield i + i + 3;



}

Output:

The first 25 primes are: 2 3 5 7 11 13 17 19 23 29 31 37 41 43 47 53 59 61 67 71 73 79 83 89 97 Count of primes to a million is: 78498. Found 50847534 primes up to 1000000000 in 5480.91 milliseconds.

As you can see, sieving odds-only is about twice as fast due to the reduced number of operations; however this is still not all that fast at about 20 CPU clock cycles per sieve culling operation due to the size of the array exceeding the CPU cache size(s).

Hash Table Based Odds-Only Version [ edit ]

Translation of: Python

Works with: Chapel version 1.22

use Time;



config const limit = 1000000;



type Prime = uint(32);



class Primes { // needed so we can use next to get successive values

var n: Prime; var obp: Prime; var q: Prime;

var bps: owned Primes?;

var keys: domain(Prime); var dict: [keys] Prime;

proc next(): Prime { // odd primes!

if this.n < 5 { this.n = 5; return 3; }

if this.bps == nil {

this.bps = new Primes(); // secondary odd base primes feed

this.obp = this.bps!.next(); this.q = this.obp * this.obp;

}

while true {

if this.n >= this.q { // advance secondary stream of base primes...

const adv = this.obp * 2; const key = this.q + adv;

this.obp = this.bps!.next(); this.q = this.obp * this.obp;

this.keys += key; this.dict[key] = adv;

}

else if this.keys.contains(this.n) { // found a composite; advance...

const adv = this.dict[this.n]; this.keys.remove(this.n);

var nkey = this.n + adv;

while this.keys.contains(nkey) do nkey += adv;

this.keys += nkey; this.dict[nkey] = adv;

}

else { const p = this.n; this.n += 2; return p; }

this.n += 2;

}

return 0; // to keep compiler happy in returning a value!

}

iter these(): Prime { yield 2; while true do yield this.next(); }

}



proc main() {

var count = 0;

write("The first 25 primes are: ");

for p in new Primes() { if count >= 25 then break; write(p, " "); count += 1; }

writeln();



var timer: Timer;

timer.start();



count = 0;

for p in new Primes() { if p > limit then break; count += 1; }



timer.stop();

write("Found ", count, " primes up to ", limit);

writeln(" in ", timer.elapsed(TimeUnits.milliseconds), " milliseconds.");

}

Output:

The first 25 primes are: 2 3 5 7 11 13 17 19 23 29 31 37 41 43 47 53 59 61 67 71 73 79 83 89 97 Found 78498 primes up to 1000000 in 1307.0 milliseconds.

As you can see, this is much slower than the array based versions, and in fact so slow as to be almost useless; this is due to Chapel's current inefficient implementation of hash tables, both as Association Arrays as used here and in the Map standard library (even slower, likely due to extra indirection). In fact, this is about ten times slower than as the Python code from which it was translated due to Python's extremely well optimized hash table implementation as used internally. Also, this does not have the expected O(1) performance as for usual hash table implementations such as the Python code, but gets much slower with increasing range. This slowness is likely partly due to the Associative Array implementation using complex cryptographic hashing algorithms but there is also something extremely wrong in that it misses the amortized O(1) performance!

This desperately needs some attention in Chapel!

To show that it's not really the fault of the language but rather just this part of a built-in library, the following code implements a specialized BasePrimesTable that works similarly to the way the Python associative arrays work as to hashing algorithm used (none as the hash values for integers are just themselves) and something similar to the Python method of handling hash table collisions is used:

Works with: Chapel version 1.22

Compile with the `--fast` compiler command line option

use Time;



config const limit = 1000000;



type Prime = uint(32);



record BasePrimesTable { // specialized for the use here...

record BasePrimeEntry { var fullkey: Prime; var val: Prime; }

var cpcty: int = 8; var sz: int = 0;

var dom = { 0 .. cpcty - 1 }; var bpa: [dom] BasePrimeEntry;

proc grow() {

const ndom = dom; var cbpa: [ndom] BasePrimeEntry = bpa[ndom];

bpa = new BasePrimeEntry(); cpcty *= 2; dom = { 0 .. cpcty - 1 };

for kv in cbpa do if kv.fullkey != 0 then add(kv.fullkey, kv.val);

}

proc find(k: Prime): int { // internal get location of value or -1

const msk = cpcty - 1; var skey = k: int & msk;

var perturb = k: int; var loop = 8;

do {

if bpa[skey].fullkey == k then return skey;

perturb >>= 5; skey = (5 * skey + 1 + perturb) & msk;

loop -= 1; if perturb > 0 then loop = 8;

} while loop > 0;

return -1; // not found!

}

proc contains(k: Prime): bool { return find(k) >= 0; }

proc add(k, v: Prime) { // if exists then replaces else new entry

const fndi = find(k);

if fndi >= 0 then bpa[fndi] = new BasePrimeEntry(k, v);

else {

sz += 1; if 2 * sz > cpcty then grow();

const msk = cpcty - 1; var skey = k: int & msk;

var perturb = k: int; var loop = 8;

do {

if bpa[skey].fullkey == 0 {

bpa[skey] = new BasePrimeEntry(k, v); return; }

perturb >>= 5; skey = (5 * skey + 1 + perturb) & msk;

loop -= 1; if perturb > 0 then loop = 8;

} while loop > 0;

}

}

proc remove(k: Prime) { // if doesn't exist does nothing

const fndi = find(k);

if fndi >= 0 { bpa[fndi].fullkey = 0; sz -= 1; }

}

proc this(k: Prime): Prime { // returns value or 0 if not found

const fndi = find(k);

if fndi < 0 then return 0; else return bpa[fndi].val;

}

}



class Primes { // needed so we can use next to get successive values

var n: Prime; var obp: Prime; var q: Prime;

var bps: shared Primes?; var dict = new BasePrimesTable();

proc next(): Prime { // odd primes!

if this.n < 5 { this.n = 5; return 3; }

if this.bps == nil {

this.bps = new Primes(); // secondary odd base primes feed

this.obp = this.bps!.next(); this.q = this.obp * this.obp;

}

while true {

if this.n >= this.q { // advance secondary stream of base primes...

const adv = this.obp * 2; const key = this.q + adv;

this.obp = this.bps!.next(); this.q = this.obp * this.obp;

this.dict.add(key, adv);

}

else if this.dict.contains(this.n) { // found a composite; advance...

const adv = this.dict[this.n]; this.dict.remove(this.n);

var nkey = this.n + adv;

while this.dict.contains(nkey) do nkey += adv;

this.dict.add(nkey, adv);

}

else { const p = this.n; this.n += 2; return p; }

this.n += 2;

}

return 0; // to keep compiler happy in returning a value!

}

iter these(): Prime { yield 2; while true do yield this.next(); }

}



proc main() {

var count = 0;

write("The first 25 primes are: ");

for p in new Primes() { if count >= 25 then break; write(p, " "); count += 1; }

writeln();



var timer: Timer;

timer.start();



count = 0;

for p in new Primes() { if p > limit then break; count += 1; }



timer.stop();

write("Found ", count, " primes up to ", limit);

writeln(" in ", timer.elapsed(TimeUnits.milliseconds), " milliseconds.");

}

Output:

The first 25 primes are: 2 3 5 7 11 13 17 19 23 29 31 37 41 43 47 53 59 61 67 71 73 79 83 89 97 Found 78498 primes up to 1000000 in 15.6 milliseconds.

When called with the `--limit=100000000` command line option to the executable file, the following results:

Output:

The first 25 primes are: 2 3 5 7 11 13 17 19 23 29 31 37 41 43 47 53 59 61 67 71 73 79 83 89 97 Found 5761455 primes up to 100000000 in 2591.61 milliseconds.

This last code is quite usable up to a hundred million (as here) or even a billion in a little over ten times the time, but is still slower than the very simple odds-only monolithic array version and is also more complex, although it uses less memory (only for the hash table for the base primes of about eight Kilobytes for sieving to a billion compared to over 60 Megabytes for the monolithic odds-only simple version).

Functional Tree Folding Odds-Only Version [ edit ]

Chapel isn't really a very functional language even though it has some functional forms of code in the Higher Order Functions (HOF's) of zippered, scanned, and reduced, iterations and has first class functions (FCF's) and lambdas (anonymous functions), these last can't be closures (capture variable bindings from external scope(s)), nor can the work around of using classes to emulate closures handle recursive (Y-combinator type) variable bindings using reference fields (at least currently with version 1.22). However, the Tree Folding add-on to the Richard Bird lazy list sieve doesn't require any of the things that can't be emulated using classes, so a version is given as follows:

Translation of: Nim

Works with: 1.22 version - compile with the --fast compiler command line flag for full optimization

use Time;



type Prime = uint(32);



config const limit = 1000000: Prime;



// Chapel doesn't have closures, so we need to emulate them with classes...

class PrimeCIS { // base prime stream...

var head: Prime;

proc next(): shared PrimeCIS { return new shared PrimeCIS(); }

}



class PrimeMultiples: PrimeCIS {

var adv: Prime;

override proc next(): shared PrimeCIS {

return new shared PrimeMultiples(

this.head + this.adv, this.adv): shared PrimeCIS; }

}



class PrimeCISCIS { // base stream of prime streams; never used directly...

var head: shared PrimeCIS;

proc init() { this.head = new shared PrimeCIS(); }

proc next(): shared PrimeCISCIS {

return new shared PrimeCISCIS(); }

}



class AllMultiples: PrimeCISCIS {

var bps: shared PrimeCIS;

proc init(bsprms: shared PrimeCIS) {

const bp = bsprms.head; const sqr = bp * bp; const adv = bp + bp;

this.head = new shared PrimeMultiples(sqr, adv): PrimeCIS;

this.bps = bsprms;

}

override proc next(): shared PrimeCISCIS {

return new shared AllMultiples(this.bps.next()): PrimeCISCIS; }

}



class Union: PrimeCIS {

var feeda, feedb: shared PrimeCIS;

proc init(fda: shared PrimeCIS, fdb: shared PrimeCIS) {

const ahd = fda.head; const bhd = fdb.head;

this.head = if ahd < bhd then ahd else bhd;

this.feeda = fda; this.feedb = fdb;

}

override proc next(): shared PrimeCIS {

const ahd = this.feeda.head; const bhd = this.feedb.head;

if ahd < bhd then

return new shared Union(this.feeda.next(), this.feedb): shared PrimeCIS;

if ahd > bhd then

return new shared Union(this.feeda, this.feedb.next()): shared PrimeCIS;

return new shared Union(this.feeda.next(),

this.feedb.next()): shared PrimeCIS;

}

}



class Pairs: PrimeCISCIS {

var feed: shared PrimeCISCIS;

proc init(fd: shared PrimeCISCIS) {

const fs = fd.head; const sss = fd.next(); const ss = sss.head;

this.head = new shared Union(fs, ss): shared PrimeCIS; this.feed = sss;

}

override proc next(): shared PrimeCISCIS {

return new shared Pairs(this.feed.next()): shared PrimeCISCIS; }

}



class Composites: PrimeCIS {

var feed: shared PrimeCISCIS;

proc init(fd: shared PrimeCISCIS) {

this.head = fd.head.head; this.feed = fd;

}

override proc next(): shared PrimeCIS {

const fs = this.feed.head.next();

const prs = new shared Pairs(this.feed.next()): shared PrimeCISCIS;

const ncs = new shared Composites(prs): shared PrimeCIS;

return new shared Union(fs, ncs): shared PrimeCIS;

}

}



class OddPrimesFrom: PrimeCIS {

var cmpsts: shared PrimeCIS;

override proc next(): shared PrimeCIS {

var n = head + 2; var cs = this.cmpsts;

while true {

if n < cs.head then

return new shared OddPrimesFrom(n, cs): shared PrimeCIS;

n += 2; cs = cs.next();

}

return this.cmpsts; // never used; keeps compiler happy!

}

}



class OddPrimes: PrimeCIS {

proc init() { this.head = 3; }

override proc next(): shared PrimeCIS {

const bps = new shared OddPrimes(): shared PrimeCIS;

const mlts = new shared AllMultiples(bps): shared PrimeCISCIS;

const cmpsts = new shared Composites(mlts): shared PrimeCIS;

return new shared OddPrimesFrom(5, cmpsts): shared PrimeCIS;

}

}



iter primes(): Prime {

yield 2; var cur = new shared OddPrimes(): shared PrimeCIS;

while true { yield cur.head; cur = cur.next(); }

}



// test it...

write("The first 25 primes are: "); var cnt = 0;

for p in primes() { if cnt >= 25 then break; cnt += 1; write(" ", p); }



var timer: Timer; timer.start(); cnt = 0;

for p in primes() { if p > limit then break; cnt += 1; }

timer.stop(); write("

Found ", cnt, " primes up to ", limit);

writeln(" in ", timer.elapsed(TimeUnits.milliseconds), " milliseconds.");

Output:

The first 25 primes are: 2 3 5 7 11 13 17 19 23 29 31 37 41 43 47 53 59 61 67 71 73 79 83 89 97 Found 78498 primes up to 1000000 in 512.612 milliseconds.

The above code is really just a toy example to show that Chapel can handle some tasks functionally (within the above stated limits) although doing so is slower than the Hash Table version above and also takes more memory as the nested lazy list structure consumes more memory in lazy list links and "plumbing" than does the simple implementation of a Hash Table. It also has a worst asymptotic performance with an extra `log(n)` factor where `n` is the sieving range; this can be shown by running the above program with `--limit=10000000` run time command line option to sieve to ten million which takes about 6.7 seconds to count the primes up to ten million (a factor of ten higher range, but much higher than the expected factor of about 10 per cent as per the Hash Table version with about 20 per cent more operations for this version). Other than for the extra operations, this version is generally slower due to the time to do the many small allocations/deallocations of the functional object instances, and this will be highly dependent on the platform on which it is run: cygwin on Windows is particularly slow due to the extra level of indirection, and some on-line IDE's may also be slow due to their level of virtualization.

A Multi-Threaded Page-Segmented Odds-Only Bit-Packed Version [ edit ]

To take advantage of the features that make Chapel shine, we need to use it to do some parallel computations, and to efficiently do that for the Sieve of Eratosthenes, we need to divide the work into page segments where we can assign each largish segment to a separate thread/task; once we have divided the work, Chapel offers lots of means to implement the parallelism but to be a true Sieve of Eratosthenes, we need to have the ability to output the results in order; with many of the convenience mechanisms not doing that, the best/simplest option is likely task parallelism with the output results assigned to an rotary indexed array containing the `sync` results. It turns out that, although the Chapel compiler can sometimes optimize the code so the overhead of creating tasks is not onerous, for this case where the actual tasks are somewhat complex, the compiler can't recognize that an automatically generated thread pool(s) are required and we need to generate the thread pool(s) manually. The code that does this is as follows:

Works with: 1.22 version - compile with the --fast compiler command line flag for full optimization

use Time; use BitOps;



type Prime = uint(64);

type PrimeNdx = int(64);

type BasePrime = uint(32);



config const LIMIT = 1000000000: Prime;



config const L1 = 16; // CPU L1 cache size in Kilobytes (1024);

assert (L1 == 16 || L1 == 32 || L1 == 64,

"L1 cache size must be 16, 32, or 64 Kilobytes!");

config const L2 = 128; // CPU L2 cache size in Kilobytes (1024);

assert (L2 == 128 || L2 == 256 || L2 == 512,

"L2 cache size must be 128, 256, or 512 Kilobytes!");

const CPUL1CACHE: int = L1 * 1024 * 8; // size in bits!

const CPUL2CACHE: int = L2 * 1024 * 8; // size in bits!

config const NUMTHRDS = here.maxTaskPar;

assert(NUMTHRDS >= 1, "NUMTHRDS must be at least one!");



const WHLPRMS = [ 2: Prime, 3: Prime, 5: Prime, 7: Prime,

11: Prime, 13: Prime, 17: Prime];

const FRSTSVPRM = 19: Prime; // past the pre-cull primes!

// 2 eliminated as even; 255255 in bytes...

const WHLPTRNSPN = 3 * 5 * 7 * 11 * 13 * 17;

// rounded up to next 64-bit boundary plus a 16 Kilobyte buffer for overflow...

const WHLPTRNBTSZ = ((WHLPTRNSPN * 8 + 63) & (-64)) + 131072;



// number of base primes within small span!

const SZBPSTRTS = 6542 - WHLPRMS.size + 1; // extra one for marker!

// number of base primes for CPU L1 cache buffer!

const SZMNSTRTS = (if L1 == 16 then 12251 else

if L1 == 32 then 23000 else 43390)

- WHLPRMS.size + 1; // extra one for marker!



// faster than bit twiddling...

const bitmsk = for i in 0 .. 7 do 1:uint(8) << i;



var WHLPTRN: SieveBuffer = new SieveBuffer(WHLPTRNBTSZ); fillWHLPTRN(WHLPTRN);

proc fillWHLPTRN(ref wp: SieveBuffer) {

const hi = WHLPRMS.size - 1;

const rng = 0 .. hi; var whlhd = new BasePrimeArr({rng});

// contains wheel pattern primes skipping the small wheel prime (2)!...

// never advances past the first base prime arr as it ends with a huge!...

for i in rng do whlhd.bparr[i] = (if i != hi then WHLPRMS[i + 1] // skip 2!

else 0x7FFFFFFF): BasePrime; // last huge!

var whlbpas = new shared BasePrimeArrs(); whlbpas.head = whlhd;

var whlstrts = new StrtsArr({rng});

wp.cull(0, WHLPTRNBTSZ, whlbpas, whlstrts);

// elimnate wheel primes from the WHLPTRN buffer!...

wp.cmpsts[0] = 0xFF: uint(8);

}



class PrimeArr { var dom = { 0 .. -1 }; var prmarr: [dom] Prime; }

class BasePrimeArr { var dom = { 0 .. -1 }; var bparr: [dom] BasePrime; }

record StrtsArr { var dom = { 0 .. -1 }; var strtsarr: [dom] int(32); }

record SieveBuffer {

var dom = { 0 .. -1 }; var cmpsts: [dom] uint(8) = 0;

proc init() {}

proc init(btsz: int) { dom = { 0 .. btsz / 8 - 1 }; }

proc deint() { dom = { 0 .. -1 }; }



proc fill(lwi: PrimeNdx) { // fill from the WHLPTRN stamp...

const sz = cmpsts.size; const mvsz = min(sz, 16384);

var mdlo = ((lwi / 8) % (WHLPTRNSPN: PrimeNdx)): int;

for i in 0 .. sz - 1 by 16384 {

c_memcpy(c_ptrTo(cmpsts[i]): c_void_ptr,

c_ptrTo(WHLPTRN.cmpsts[mdlo]): c_void_ptr, mvsz);

mdlo += 16384; if mdlo >= WHLPTRNSPN then mdlo -= WHLPTRNSPN;

}

}



proc count(btlmt: int) { // count by 64 bits using CPU popcount...

const lstwrd = btlmt / 64; const lstmsk = (-2):uint(64) << (btlmt & 63);

const cmpstsp = c_ptrTo(cmpsts: [dom] uint(8)): c_ptr(uint(64));

var i = 0; var cnt = (lstwrd * 64 + 64): int;

while i < lstwrd { cnt -= popcount(cmpstsp[i]): int; i += 1; }

return cnt - popcount(cmpstsp[lstwrd] | lstmsk): int;

}



// most of the time is spent doing culling operations as follows!...

proc cull(lwi: PrimeNdx, bsbtsz: int, bpas: BasePrimeArrs,

ref strts: StrtsArr) {

const btlmt = cmpsts.size * 8 - 1; const bplmt = bsbtsz / 32;

const ndxlmt = lwi: Prime + btlmt: Prime; // can't overflow!

const strtssz = strts.strtsarr.size;

// C pointer for speed magic!...

const cmpstsp = c_ptrTo(cmpsts[0]);

const strtsp = c_ptrTo(strts.strtsarr);



// first fill the strts array with pre-calculated start addresses...

var i = 0; for bp in bpas {

// calculate page start address for the given base prime...

const bpi = bp: int; const bbp = bp: Prime; const ndx0 = (bbp - 3) / 2;

const s0 = (ndx0 + ndx0) * (ndx0 + 3) + 3; // can't overflow!

if s0 > ndxlmt then {

if i < strtssz then strtsp[i] = -1: int(32); break; }

var s = 0: int;

if s0 >= lwi: Prime then s = (s0 - lwi: Prime): int;

else { const r = (lwi: Prime - s0) % bbp;

if r == 0 then s = 0: int; else s = (bbp - r): int; };

if i < strtssz - 1 { strtsp[i] = s: int(32); i += 1; continue; }

if i < strtssz { strtsp[i] = -1; i = strtssz; }

// cull the full buffer for this given base prime as usual...

while s <= btlmt { cmpstsp[s >> 3] |= bitmsk[s & 7]; s += bpi; } // only works up to limit of int(32)**2!!!!!!!!

}



// cull the smaller sub buffers according to the strts array...

for sbtlmt in bsbtsz - 1 .. btlmt by bsbtsz {

i = 0; for bp in bpas { // bp never bigger than uint(32)!

// cull the sub buffer for this given base prime...

var s = strtsp[i]: int; if s < 0 then break;

var bpi = bp: int; var nxt = 0x7FFFFFFFFFFFFFFF;

if bpi <= bplmt { // use loop "unpeeling"...

const slmt = s + bpi * 8 - 1;

while s <= slmt {

const bmi = s & 7; const msk = bitmsk[bmi];

var c = s >> 3; const clmt = sbtlmt >> 3;

while c <= clmt { cmpstsp[c] |= msk; c += bpi; }

nxt = min(nxt, (c << 3): int(64) | bmi: int(64)); s += bpi;

}

strtsp[i] = nxt: int(32); i += 1;

}

else { while s <= sbtlmt { // standard cull loop...

cmpstsp[s >> 3] |= bitmsk[s & 7]; s += bpi; }

strtsp[i] = s: int(32); i += 1; }

}

}

}

}



// a generic class that contains a page result generating function;

// allows manual iteration through the use of the next() method;

// multi-threaded through the use of a thread pool...

class PagedResults {

type resulttype; // must always be a unmanaged class!

var fnc: func(PrimeNdx, SieveBuffer, resulttype);

var lwi: PrimeNdx; var bsbtsz: int;

var bpas: shared BasePrimeArrs? = nil: shared BasePrimeArrs?;

var sbs: [ 0 .. NUMTHRDS - 1 ] SieveBuffer = new SieveBuffer();

var strts: [ 0 .. NUMTHRDS - 1 ] StrtsArr = new StrtsArr();

var qi: int = 0;

var wrkq$: [ 0 .. NUMTHRDS - 1 ] sync PrimeNdx;

var rsltsq$: [ 0 .. NUMTHRDS - 1 ] sync resulttype?;



proc deinit() { // kill the thread pool when out of scope...

if this.bpas == nil then return; // no thread pool!

for i in this.wrkq$.domain { this.wrkq$[i] = -1; while true {

const r = this.rsltsq$[i]; if r == nil then break; else delete(r); }

}

}



proc next() {

proc dowrk(ri: int) { // used internally!...

while true {

const li = this.wrkq$[ri];

if li < 0 { this.rsltsq$[ri] = nil: resulttype?; break; }

this.sbs[ri].fill(li);

this.sbs[ri].cull(li, this.bsbtsz, this.bpas!, this.strts[ri]);

this.rsltsq$[ri] = this.fnc(li, this.sbs[ri]): this.resulttype?;

}

}

if this.bpas == nil { // init on first use; avoids data race!

this.bpas = new BasePrimeArrs();

if this.bsbtsz < CPUL1CACHE {

this.sbs = new SieveBuffer(bsbtsz);

this.strts = new StrtsArr({0 .. SZBPSTRTS - 1});

}

else {

this.sbs = new SieveBuffer(CPUL2CACHE);

this.strts = new StrtsArr({0 .. SZMNSTRTS - 1});

}

// start threadpool and give it inital work...

for i in rsltsq$.domain {

begin with (const in i) dowrk(i);

this.wrkq$[i] = this.lwi; this.lwi += this.sbs[i].cmpsts.size * 8;

}

}

const rslt = shared.create(this.rsltsq$[qi].readFE());

this.wrkq$[qi] = this.lwi;

this.lwi += this.sbs[qi].cmpsts.size * 8;

this.qi = if qi >= NUMTHRDS - 1 then 0 else qi + 1;

return rslt;

}



iter these() { while this.lwi >= 0 do yield this.next(); }

}



// this first class function can't be embedded in another function currently!...

proc sb2bparr(lwi: PrimeNdx, sb: SieveBuffer): unmanaged BasePrimeArr {

const bsprm = (lwi + lwi + 3): BasePrime;

const szlmt = sb.cmpsts.size * 8 - 1; var i, j = 0;

var arr = new unmanaged BasePrimeArr({ 0 .. sb.count(szlmt) - 1 });

while i <= szlmt { if sb.cmpsts[i >> 3] & bitmsk[i & 7] == 0 {

arr.bparr[j] = bsprm + (i + i): BasePrime; j += 1; }

i += 1; }

return arr;

}



// a memoizing lazy list of BasePrimeArr's...

class BasePrimeArrs {

var head: shared BasePrimeArr;

var tail: shared BasePrimeArrs? = nil: shared BasePrimeArrs?;

var lock$: sync bool = true;

var feed: shared PagedResults =

new shared PagedResults(unmanaged BasePrimeArr, sb2bparr, 65536, 65536);



proc init() { // make our own first array to break data race!

var sb = new SieveBuffer(256); sb.fill(0);

this.head = shared.create(sb2bparr(0, sb)): shared BasePrimeArr;

this.complete(); // fake base primes!

sb = new SieveBuffer(65536); sb.fill(0);

// use (completed) self as source of base primes!

var strts = new StrtsArr({ 0 .. 256 });

sb.cull(0, 65536, this, strts);

// replace head with new larger version culled using fake base primes!...

this.head = shared.create(sb2bparr(0, sb)): shared BasePrimeArr;

}



proc init(hd: shared BasePrimeArr, fd: shared PagedResults) {

this.head = hd; this.feed = fd; }



proc next(): shared BasePrimeArrs {

if this.tail == nil { // in case other thread slipped through!

if this.lock$ && this.tail == nil { // empty sync -> block others!

const nhd = shared.create(this.feed.next()): shared BasePrimeArr;

this.tail = new shared BasePrimeArrs(nhd , this.feed);

}

this.lock$ = false; // fill the sync so other threads can do nothing!

}

return this.tail: shared BasePrimeArrs; // necessary cast!

}



iter these(): BasePrime {

for bp in this.head.bparr do yield bp; var cur = this.next();

while true {

for bp in cur.head.bparr do yield bp; cur = cur.next(); }

}

}



// this first class function can't be embedded in another function currently!...

proc sb2prmarr(lwi: PrimeNdx, sb: SieveBuffer): unmanaged PrimeArr {

const bsprm = (lwi + lwi + 3): Prime;

const szlmt = sb.cmpsts.size * 8 - 1; var i, j = 0;

var arr = new unmanaged PrimeArr({0 .. sb.count(szlmt) - 1});

while i <= szlmt { if sb.cmpsts[i >> 3] & bitmsk[i & 7] == 0 then {

arr.prmarr[j] = bsprm + (i + i): Prime; j += 1; }

i += 1; }

return arr;

}



iter primes(): Prime {

for p in WHLPRMS do yield p: Prime;

for pa in new shared PagedResults(unmanaged PrimeArr, sb2prmarr,

0, CPUL1CACHE) do

for p in pa!.prmarr do yield p;

}



// use a class so that it can be used as a generic sync value!...

class CntNxt { var cnt: int; var nxt: PrimeNdx; }



// currently no closures so need to reference this global from FCF!

var NDXLMT = 0: PrimeNdx;



// this first class function (FCF) can't be embedded in another function currently!...

proc sb2cnt(lwi: PrimeNdx, sb: SieveBuffer): unmanaged CntNxt {

const btszlmt = sb.cmpsts.size * 8 - 1; const lstndx = lwi + btszlmt: PrimeNdx;

const btlmt = if lstndx > NDXLMT then max(0, (NDXLMT - lwi): int) else btszlmt;

return new unmanaged CntNxt(sb.count(btlmt), lstndx);

}



proc countPrimesTo(lmt: Prime): int(64) {

NDXLMT = ((lmt - 3) / 2): PrimeNdx; var count = 0: int(64);

for p in WHLPRMS { if p > lmt then break; count += 1; }

if lmt < FRSTSVPRM then return count;

for cn in new shared PagedResults(unmanaged CntNxt, sb2cnt, 0, CPUL1CACHE) {

count += cn!.cnt: int(64); if cn!.nxt >= NDXLMT then break;

}

return count;

}



// test it...

write("The first 25 primes are: "); var cnt = 0;

for p in primes() { if cnt >= 25 then break; cnt += 1; write(" ", p); }



cnt = 0; for p in primes() { if p > 1000000 then break; cnt += 1; }

writeln("

There are ", cnt, " primes up to a million.");



write("Sieving to ", LIMIT, " with ");

write("CPU L1/L2 cache sizes of ", L1, "/", L2, " KiloBytes ");

writeln("using ", NUMTHRDS, " threads.");

var timer: Timer; timer.start();

// the slow way!:

// var count = 0; for p in primes() { if p > LIMIT then break; count += 1; }

const count = countPrimesTo(LIMIT); // the fast way!

timer.stop();

write("Found ", count, " primes up to ", LIMIT);

writeln(" in ", timer.elapsed(TimeUnits.milliseconds), " milliseconds.");

Output:

The first 25 primes are: 2 3 5 7 11 13 17 19 23 29 31 37 41 43 47 53 59 61 67 71 73 79 83 89 97 There are 78498 primes up to a million. Sieving to 1000000000 with CPU L1/L2 cache sizes of 16/128 KiloBytes using 4 threads. Found 50847534 primes up to 1000000000 in 257.015 milliseconds.

Note that the above code does implement some functional concepts as in a memoized lazy list of base prime arrays, but as this is used at the page level, the slowish performance doesn't impact the overall execution time much and the code is much more elegant in using this concept.

Also note that the code is a little bit uglier than it should be in the use of "unmanaged" result type for the generic "PagedResults" class that needs to be converted to a "shared" class result type so that Chapel will automatically destroy these; this is due to a bug/issue in version 1.22 not allowing a "shared" nilable field of a generic record or class to be initialized, requiring this work-around. If one is not careful, this can lead to a memory leak, which this code had but has been fixed.

The tight inner loops for culling composite number representations have been optimized to some extent in using "loop unpeeling" for smaller base primes to simplify the loops down to simple masking by a constant with eight separate loops for the repeating pattern over bytes and culling by CPU L1 cache sizes over the outer sieve buffer size of the CPU L2 cache size in order to make the task work-sized chunks larger for less task context switching overheads and for reduced time lost to culling start address calculations per base prime (which needs to use some integer division that is always slower than other integer operations). This last optimization allows for reasonably efficient culling up to the square of the CPU L2 cache size in bits or 1e12 for the one Megabit CPU L2 cache size many mid-range Intel CPU's have currently when used for multi-threading (half of the actual size for Hyper-Threaded - HT - threads as they share both the L1 and the L2 caches over the pairs of Hyper-Threaded (HT) threads per core).

Although this code can be used for much higher sieving ranges, it is not recommended due to not yet being tuned for better efficiency above 1e12; there are no checks implemented limiting the user to this range, and at some point there will be errors due to integer overflows but these will be for huge sieving ranges taking months -> years to execute on common desktop CPU's.

A further optimization used is to create a pre-culled WHLPTRN SieveBuffer where the odd primes (since we cull odds-only) of 3, 5, 7, 11, 13, and 17 have already been culled and using that to pre-fill the page segment buffers so that no culling by these base prime values is required, this reduces the number of operations by about 45% compared to if it wasn't done but the ratio of better performance is only about 34.5% better as this changes the ratio of (fast) smaller base primes to larger (slower) ones.

All of the improvements to this point allow the shown performance as per the displayed output for the above program; using a command line argument of `--LIMIT=100000000000`, it can count the primes to 1e11 in about 38.25 seconds using just a two core/four thread Intel Core i3-2100 at 3.1 GHz (Sandy Bridge series) as these tests used. It will be about eight times faster than this using a more modern desktop CPU such as the Intel Core i7-9700K which has four times as many effective cores, about a 16% higher CPU clock rate, is about 25% faster due the a more modern CPU architecture which is seven generations newer, and doesn't share its 32 Kilobyte L1 data cache nor its 256 Kilobyte L2 cache over HT threads, which it doesn't have. Of course using a top end AMD Threadripper with its 64/128 cores/threads will be almost eight times faster again except that it will lose about 20% due to its slower clock speed when all cores/threads are used; note that high core CPU's will only give these speed gains for larger sieving ranges such as 1e11 and above since otherwise there aren't enough work chunks to go around for all the threads available!

Incredibly, even run single threaded (argument of `--NUMTHRDS=1`) this implementation is only about 20% slower than the reference Sieve of Atkin "primegen/primespeed" implementation in counting the number of primes to a billion and is about 20% faster in counting the primes to a hundred billion (arguments of `--LIMIT=100000000000 --NUMTHRDS=1`) with both using the same size of CPU L1 cache buffer of 16 Kilobytes; This implementation does not yet have the level of wheel optimization of the Sieve of Atkin as it has only the limited wheel optimization of Odds-Only plus the use of the pre-cull fill. Maximum wheel factorization will reduce the number of operations for this code to less than half the current number, making it faster than the Sieve of Atkin for all ranges, and approach the speed of Kim Walisch's "primesieve". In addition, there are the further optimizations that "primesieve" uses of extreme loop unrolling that will make the code yet another about 25% faster so that it will have about the speed of "primesieve".

This example is incorrect. Please fix the code and remove this message. Details: The first version uses rem testing and so is a trial division algorithm, not a sieve of Eratosthenes.

Note: this is not a Sieve of Eratosthenes; it is just an inefficient version (because it doesn't trial just the found primes <= the square root of the range) trial division.

primes< is a functional interpretation of the Sieve of Eratosthenes. It merely removes the set of composite numbers from the set of odd numbers (wheel of 2) leaving behind only prime numbers. It uses a transducer internally with "into #{}".



( defn primes < [ n ]

( if ( <= n 2 )

( )

( remove ( into # { }

( mapcat # ( range ( * % % ) n % ) )

( range 3 ( Math / sqrt n ) 2 ) )

( cons 2 ( range 3 n 2 ) ) ) ) )



Calculates primes up to and including n using a mutable boolean array but otherwise entirely functional code.



( defn primes - to

"Computes lazy sequence of prime numbers up to a given number using sieve of Eratosthenes"

[ n ]

( let [ root ( -> n Math / sqrt long ) ,

cmpsts ( boolean - array ( inc n ) ) ,

cullp ( fn [ p ]

( loop [ i ( * p p ) ]

( if ( <= i n )

( do ( aset cmpsts i true )

( recur ( + i p ) ) ) ) ) ) ]

( do ( dorun ( map # ( cullp % ) ( filter # ( not ( aget cmpsts % ) )

( range 2 ( inc root ) ) ) ) )

( filter # ( not ( aget cmpsts % ) ) ( range 2 ( inc n ) ) ) ) ) )



Alternative implementation using Clojure's side-effect oriented list comprehension.



( defn primes - to

"Returns a lazy sequence of prime numbers less than lim"

[ lim ]

( let [ refs ( boolean - array ( + lim 1 ) true )

root ( int ( Math / sqrt lim ) ) ]

( do ( doseq [ i ( range 2 lim )

: while ( <= i root )

: when ( aget refs i ) ]

( doseq [ j ( range ( * i i ) lim i ) ]

( aset refs j false ) ) )

( filter # ( aget refs % ) ( range 2 lim ) ) ) ) )



Alternative implementation using Clojure's side-effect oriented list comprehension. Odds only.



( defn primes - to

"Returns a lazy sequence of prime numbers less than lim"

[ lim ]

( let [ max - i ( int ( / ( - lim 1 ) 2 ) )

refs ( boolean - array max - i true )

root ( / ( dec ( int ( Math / sqrt lim ) ) ) 2 ) ]

( do ( doseq [ i ( range 1 ( inc root ) )

: when ( aget refs i ) ]

( doseq [ j ( range ( * ( + i i ) ( inc i ) ) max - i ( + i i 1 ) ) ]

( aset refs j false ) ) )

( cons 2 ( map # ( + % % 1 ) ( filter # ( aget refs % ) ( range 1 max - i ) ) ) ) ) ) )



This implemantation is about twice fast than previous one and use only half memory. From the index of array calculates the value it represents as (2*i + 1), the step between two index that represents the multiples of primes to mark as composite is also (2*i + 1). The index of the square of the prime to start composite marking is 2*i*(i+1).

Alternative very slow entirely functional implementation using lazy sequences



( defn primes - to

"Computes lazy sequence of prime numbers up to a given number using sieve of Eratosthenes"

[ n ]

( letfn [ ( nxtprm [ cs ] ; current candidates

( let [ p ( first cs ) ]

( if ( > p ( Math / sqrt n ) ) cs

( cons p ( lazy - seq ( nxtprm ( -> ( range ( * p p ) ( inc n ) p )

set ( remove cs ) rest ) ) ) ) ) ) ) ]

( nxtprm ( range 2 ( inc n ) ) ) ) )



The reason that the above code is so slow is that it has has a high constant factor overhead due to using a (hash) set to remove the composites from the future composites stream, each prime composite stream removal requires a scan across all remaining composites (compared to using an array or vector where only the culled values are referenced, and due to the slowness of Clojure sequence operations as compared to ite