Doing data science, I often start loops without a clear idea of how long they'll take. With exceptionally large datasets, it can be hours. That's why I wrote this quick-and-dirty progress bar (okay, there's no bar; it's a counter to 100), so I can judge whether to wait, get up to make some coffee, go do laundry, or spin up a large AWS instance to get the job done.

The GitHub repo is here. If you're wondering how I made an IPython notebook look somewhat pretty on a blog, I blogged about it here.



    # code for the progress bar
    import time

    class ProgressBar:
        def __init__(self, loop_length):
            self.start = time.time()
            # each call to increment() advances the counter by this many percent
            self.increment_size = 100.0 / loop_length
            self.curr_count = 0
            self.curr_pct = 0
            self.overflow = False
            print('% complete: ', end='')

        def increment(self):
            self.curr_count += self.increment_size
            # only print when the whole-number percentage changes
            if int(self.curr_count) > self.curr_pct:
                self.curr_pct = int(self.curr_count)
                if self.curr_pct <= 100:
                    print(self.curr_pct, end=' ')
                elif not self.overflow:
                    print("\n* Count has gone over 100%; likely either due to:"
                          "\n- an error in the loop_length specified when "
                          "progress_bar was instantiated"
                          "\n- an error in the placement of the increment() function")
                    print('Elapsed time when progress bar full: {:0.1f} seconds.'.format(time.time() - self.start))
                    self.overflow = True

        def finish(self):
            if 99 <= self.curr_pct <= 100:  # rounding sometimes makes the maximum count 99.
                print("100", end=' ')
                print('\nElapsed time: {:0.1f} seconds.\n'.format(time.time() - self.start))
            elif self.overflow:
                print('Elapsed time after end of loop: {:0.1f} seconds.\n'.format(time.time() - self.start))
            else:
                print('\n* End of loop reached earlier than expected.'
                      '\nElapsed time: {:0.1f} seconds.\n'.format(time.time() - self.start))
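The comment in finish() says rounding sometimes leaves the counter at 99. That happens because adding 100.0 / loop_length over and over accumulates floating-point error, so the running total can land a hair under 100. One way around it, offered as a sketch rather than as part of the original class, is to compute the percentage from an integer iteration count. The function name `integer_milestones` is hypothetical and exists only for illustration; it returns the percentages that would be printed instead of printing them.

```python
def integer_milestones(loop_length):
    """Return the whole-number percentages that would be printed,
    computed from an integer iteration count rather than a running float sum.
    (Hypothetical helper, not part of the original ProgressBar.)"""
    printed = []
    curr_pct = 0
    for count in range(1, loop_length + 1):
        pct = (count * 100) // loop_length  # exact integer arithmetic, no drift
        if pct > curr_pct:
            curr_pct = pct
            printed.append(pct)
    return printed


print(integer_milestones(4))  # [25, 50, 75, 100]
print(integer_milestones(3))  # [33, 66, 100]
```

With this approach the last value is always exactly 100 when the loop runs loop_length times, so finish() would not need the special case for 99.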



    # normal usage, on my slow crappy laptop
    loop_length = 1000000
    pbar = ProgressBar(loop_length)
    for i in range(loop_length):
        # your code goes here
        pbar.increment()
    pbar.finish()

    % complete: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 100
    Elapsed time: 1.7 seconds.



    # here's what happens if the loop lengths are mismatched so that
    # the progress bar expects fewer iterations than there are
    loop_length = 1000000
    pbar = ProgressBar(loop_length / 2)
    for i in range(loop_length):
        # your code goes here
        pbar.increment()
    pbar.finish()

    % complete: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100
    * Count has gone over 100%; likely either due to:
    - an error in the loop_length specified when progress_bar was instantiated
    - an error in the placement of the increment() function
    Elapsed time when progress bar full: 0.8 seconds.
    Elapsed time after end of loop: 1.5 seconds.
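The two runs above cover the matched case and the overflow case. The third branch of finish(), where the loop ends before the counter reaches 99, can be exercised the same way. The sketch below uses a condensed copy of the class (the overflow message is shortened) so it runs on its own; the helper `run_loop` and its parameter names are hypothetical, and it captures the printed text with `contextlib.redirect_stdout` only so the result can be inspected.

```python
import io
import time
from contextlib import redirect_stdout


class ProgressBar:
    """Condensed copy of the class from this post (overflow message shortened)."""

    def __init__(self, loop_length):
        self.start = time.time()
        self.increment_size = 100.0 / loop_length
        self.curr_count = 0
        self.curr_pct = 0
        self.overflow = False
        print('% complete: ', end='')

    def increment(self):
        self.curr_count += self.increment_size
        if int(self.curr_count) > self.curr_pct:
            self.curr_pct = int(self.curr_count)
            if self.curr_pct <= 100:
                print(self.curr_pct, end=' ')
            elif not self.overflow:
                print('\n* Count has gone over 100%.')
                self.overflow = True

    def finish(self):
        elapsed = time.time() - self.start
        if 99 <= self.curr_pct <= 100:
            print('100', end=' ')
            print('\nElapsed time: {:0.1f} seconds.\n'.format(elapsed))
        elif self.overflow:
            print('Elapsed time after end of loop: {:0.1f} seconds.\n'.format(elapsed))
        else:
            print('\n* End of loop reached earlier than expected.\n'
                  'Elapsed time: {:0.1f} seconds.\n'.format(elapsed))


def run_loop(expected, actual):
    """Hypothetical helper: size the bar for `expected` iterations,
    run `actual` iterations, and return everything that was printed."""
    buffer = io.StringIO()
    with redirect_stdout(buffer):
        pbar = ProgressBar(expected)
        for _ in range(actual):
            # your code goes here
            pbar.increment()
        pbar.finish()
    return buffer.getvalue()


# the bar expects 8 iterations but the loop only runs 4
output = run_loop(expected=8, actual=4)
print(output)
```

Here the counter stops at 50, so finish() falls through to the "End of loop reached earlier than expected" message, which usually means loop_length was set too high or increment() is being skipped on some iterations.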