If you are still not amazed by what the Python language is capable of, then in this part we are going to learn how to generate a Bitcoin address, or wallet, in Python. I just love how easy it is to talk to your computer through Python if you run a Linux OS, and how many interesting projects you can make with it.

In this article I am going to analyze the source code of Electrum, a Bitcoin wallet written purely in Python. It should work with any Python 2.x, and I believe even with Python 3.x. As far as I can tell, all the dependencies this software uses are in the default packages, so no additional software is needed; it's self-contained.

Disclaimer: Use this code and information at your own risk, I shall not be responsible for any damages resulting from the use of the modified code, nor the information provided in this article. It's not recommended to modify the code that generates private keys if you don't know what you are doing!





Playing with the Code





I have downloaded the latest version of Electrum's source code from GitHub:

The seed generator lives in the lib folder, in a file named mnemonic.py, and the function is make_seed(). It's this block of code:

You can also call it from the terminal through an internal command. If you have Electrum installed, I think it goes like this:

electrum make_seed --nbits 125

This would create a 125 bit seed for you. You can also call the mnemonic script from another Python file and customize it, for example to generate multiple seeds or to integrate it with some other code.

We will create a new file named testcall.py from which we will call this Mnemonic code. It has to be in the same lib folder, though. It looks like this:

And if we call it from the terminal using the python testcall.py command:

Basically we are importing the Mnemonic class from the mnemonic.py file, referring to the file simply as mnemonic. I haven't talked about classes yet, as they belong to the more advanced parts of the Python language; basically they are objects that bundle data together with the functions that work on it. Here the make_seed() function is contained inside the Mnemonic class and is called through it, together with the other functions it depends on. It could be done with just one function, but organizing it like this is more elegant and less error prone, since related code and state are kept together. I am not much of an expert in classes, so I'm just going to leave it at that.
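To make the idea of a class a bit more concrete, here is a tiny sketch in Python 3 syntax (unlike the Python 2 print statements used elsewhere in this article). ToyMnemonic and its make_phrase method are made up for illustration and are not Electrum's code; they only mirror the pattern of passing a language to the class and then calling a function through it.

```python
class ToyMnemonic:
    """Hypothetical stand-in for illustration, not Electrum's Mnemonic class."""

    # Tiny made-up word lists, keyed by language code.
    WORDLISTS = {
        "en": ["apple", "river", "stone", "cloud"],
        "es": ["manzana", "rio", "piedra", "nube"],
    }

    def __init__(self, lang=None):
        # Passing no language falls back to English.
        self.lang = lang or "en"
        self.wordlist = self.WORDLISTS[self.lang]

    def make_phrase(self, numbers):
        # Methods can use data stored on the object (self.wordlist).
        return " ".join(self.wordlist[n % len(self.wordlist)] for n in numbers)


print(ToyMnemonic("es").make_phrase([0, 3, 5]))  # manzana nube rio
```

The object remembers which word list it was created with, so every later call through it uses the same language without passing it again.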

In the Mnemonic class you can define 1 parameter, the language, which has the following values:

None = English

en = English

es = Spanish

zh = Chinese

ja = Japanese

pt = Portuguese

You can see the language codes in the i18n.py file, but only these have wordlists available for now, visible in the wordlist folder. To create a Chinese seed, just replace that argument with the language code:

print Mnemonic('zh').make_seed('standard', 132, 1)

And this will give out some seed in Chinese:

There are also multiple types of seeds you can generate, which you can see in the version.py file:

standard - Normal wallet

segwit - Support for upcoming Segregated Witness softfork based addresses of Bitcoin

2fa - Two Factor Authentication based Wallets

The next argument is the num_bits variable, which is set from the command line with the --nbits option. It is simply the number of bits of entropy your seed will have (a minimum of 128 is recommended for security).
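Because the seed is written out as words, the bit count ends up as a whole number of words. Here is a minimal sketch of that arithmetic in Python 3, assuming a 2048-word list as in the English wordlist; words_needed is a hypothetical helper name, not a function from Electrum.

```python
import math

WORDLIST_SIZE = 2048  # size of the English list, as stated in this article


def words_needed(num_bits, wordlist_size=WORDLIST_SIZE):
    """Return (bits per word, bit count rounded up to whole words, word count)."""
    bits_per_word = math.log2(wordlist_size)        # 11.0 for 2048 words
    n_words = math.ceil(num_bits / bits_per_word)   # round up to whole words
    return bits_per_word, int(n_words * bits_per_word), n_words


for bits in (125, 128, 132):
    print(bits, words_needed(bits))
```

With these assumptions, any request between 122 and 132 bits lands on 12 words, which is 132 bits.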

The last argument is custom_entropy, basically just an integer by which your seed number gets multiplied, in case you don't trust your RNG. It replaces part of the secret with a number you generate yourself, of the same entropy size.

So if I call it like this, with a custom entropy number of my choosing, it generates a seed this way. Of course the entropy number has to be kept secret as well:

print Mnemonic('en').make_seed('standard', 132, 2349823353453453459428932342349489238)

I don't really recommend using this part of the code. It looks kind of weird to me. I am not a cryptography expert, but I just don't like how it inserts entropy into your number. I have heard that multiplying numbers decreases entropy, so I am not sure about this part of the code. In fact I am going to message the dev about it and see what his response is. However, no worries: the default wallet generation doesn't use the custom entropy part, so if you are generating a wallet through the Electrum GUI, or leaving the value at 1, this is of no concern to you.
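To see how the bit budget shifts when custom entropy is supplied, here is a rough sketch in Python 3. It assumes the behaviour described in this article, where the custom number takes the place of an equal number of random bits. The helper name split_entropy_bits is made up, and the real function may differ in details, for example by enforcing a minimum number of random bits.

```python
import math


def split_entropy_bits(num_bits, custom_entropy=1):
    """Return (bits covered by the custom number, bits left for the RNG)."""
    # log2(1) is 0, so the default value of 1 contributes nothing.
    n_custom = math.ceil(math.log2(custom_entropy))
    n_random = num_bits - n_custom
    return n_custom, n_random


print(split_entropy_bits(132))           # default: every bit comes from the RNG
print(split_entropy_bits(132, 2 ** 40))  # a 40 bit custom number leaves 92 random bits
```

The point of the sketch is only that a larger custom number leaves fewer bits to the random number generator, so the custom number must itself be unpredictable.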









Auditing the Seed Generator





OK, so now that we know how to generate a seed, let's see what exactly the seed generator does. After all, anyone using Electrum has to rely on the security and integrity of this code; if it were badly written you could lose all your money. So we really have to trust this code 100% if we want to store a lot of Bitcoin in Electrum.

Let's analyze the make_seed() function, since this is where the action is. First of all I will put many print statements in it to print out each variable at each step:

We call the make_seed() function from our testcall.py file with the python testcall.py command, where the testcall file looks like this:

print Mnemonic('en').make_seed('standard', 132, 1)

Just a standard seed generation. It prints out these:





Well let’s take it step by step.

First, version.py is imported, which is where the seed type codes are defined. It translates the standard argument into 01, which will later be the required prefix of the seed's hash. So it sets the prefix to the string 01.

Then the bpw (bits per word) variable takes the log2 of the length of the word list, meaning how many words it contains, in this case the English list, english.txt. There are 2048 words in the English list, and log2 of that is 11.

Then num_bits is divided by bpw, rounded up, turned into an integer and multiplied by bpw again. For our input this gives back the same value, since 132 is already 12 × 11. The step matters for inputs that are not a multiple of bpw, which get rounded up to a whole number of words.

n_custom becomes 0 if we leave custom_entropy at its default of 1, so that no extra entropy is added.

n then remains the same as the num_bits input if no custom entropy is added. So if you generate a default wallet with no extra entropy, the n variable becomes the main number holding the amount of entropy you defined initially through num_bits. In our case it stays the same since we don't add anything.

Then my_entropy picks a random number between 0 and 2^n, where n is the n from above. It will be a large number; this is the prototype of the seed.

Then we go into a while loop that searches for a number whose seed hash starts with 01, which serves as a kind of checksum for the seed.

If no custom entropy is used, we basically just keep adding 1 to the my_entropy number until the first 2 characters become 0 and 1. More precisely, these are the first 2 hex characters of its hashed form, not of the number itself. On each pass it encodes the number with mnemonic_encode(i) and right after that decodes it with mnemonic_decode(seed), I guess to test that the number survives the round trip through words. That is what the assert statement does: it stops with an error if the two values differ.

Then it goes into the is_new_seed() function. That is what happens if you generate a seed now; if you import an older seed in the old format, it goes into the old function instead. The code I executed above goes into the new function. This is where the magic happens. The is_new_seed() function is actually located in the bitcoin.py file:

What happens here is interesting. First the seed gets normalized with the normalize_text() function in the mnemonic.py file. I believe this cleans up the Unicode form, letter case and whitespace of the phrase, which matters mostly for languages like Chinese or Japanese. So this function doesn't do much with the English wordlist.

Then things get interesting: it takes the HMAC-SHA512 hash of the seed phrase, which in our case is the English text, keyed with Seed version. And it checks that the first 2 hex characters are 01, since we asked for a standard wallet. Electrum defines a standard wallet as a seed whose HMAC-SHA512 with Seed version starts with 01, a Segwit wallet as one whose hash starts with 02, and so on. So that while loop increments the my_entropy variable by 1 until the word list it produces has an HMAC-SHA512 with Seed version that starts with 01 in our case. Once it has found that number, it exits the loop and returns the seed:

because sister decrease neither cool more car galaxy one upset high allow
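The search described in the steps above can be sketched in a few lines of Python 3. This is a simplified illustration under stated assumptions, not Electrum's code: the word list is a toy list of 2048 made-up words, the names encode, decode, version_hash and grind_seed are hypothetical, and text normalization and custom entropy are left out.

```python
import hashlib
import hmac
import secrets

# Toy 2048-word list ("w0000" ... "w2047"); Electrum ships real word lists.
WORDS = ["w%04d" % k for k in range(2048)]


def encode(number):
    """Write a positive integer as words, least significant word first."""
    out = []
    while number:
        number, index = divmod(number, len(WORDS))
        out.append(WORDS[index])
    return " ".join(out)


def decode(phrase):
    """Inverse of encode()."""
    number = 0
    for word in reversed(phrase.split()):
        number = number * len(WORDS) + WORDS.index(word)
    return number


def version_hash(phrase):
    """HMAC-SHA512 of the phrase, keyed with 'Seed version', as a hex string."""
    return hmac.new(b"Seed version", phrase.encode("utf8"), hashlib.sha512).hexdigest()


def grind_seed(prefix="01", num_bits=132):
    """Increment a random starting number until the phrase hash has the prefix."""
    candidate = secrets.randbits(num_bits)
    while True:
        candidate += 1
        phrase = encode(candidate)
        assert decode(phrase) == candidate  # round-trip check, as in the steps above
        if version_hash(phrase).startswith(prefix):
            return phrase


seed = grind_seed()
print(seed)
print(version_hash(seed)[:2])  # 01
```

On average the loop needs a few hundred tries, because only a small fraction of candidates have a hash with the wanted prefix.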

That's it, that is basically how Electrum generates a seed. This seed's HMAC-SHA512 sum will start with 01, and you can even check it yourself. On Linux you can install a tool called GTKHash to calculate hashes, so let me demonstrate. We take the seed and add the HMAC key Seed version as defined in that function:

As you can see, if we put in the HMAC key Seed version together with the seed, it gives us a 512 bit hash that starts with 01, so this is a valid default seed compatible with Electrum.
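The same check can be done without a GUI tool. Below is a small Python 3 sketch using only the standard library; it assumes Seed version is used as the HMAC key and the seed phrase as the message, and it replaces Electrum's normalization with a simple lowercase and whitespace cleanup, which should be enough for plain English phrases. The function name seed_version_prefix is made up.

```python
import hashlib
import hmac


def seed_version_prefix(phrase, length=2):
    """Return the first hex characters of HMAC-SHA512(key='Seed version', msg=phrase)."""
    normalized = " ".join(phrase.lower().split())  # simplified normalization
    digest = hmac.new(b"Seed version", normalized.encode("utf8"), hashlib.sha512)
    return digest.hexdigest()[:length]


phrase = "because sister decrease neither cool more car galaxy one upset high allow"
print(seed_version_prefix(phrase))
```

If the phrase is a valid standard seed and the assumptions above hold, the printed prefix should match the standard seed code.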

Of course HMAC-SHA512 is a one-way function with no known practical way to invert it, and the 512 bit version is probably resistant to quantum computers as well, so there is no realistic way to reverse engineer the seed from its hash.

However, there is one issue. If we fix the first 2 characters of the hex output of HMAC-SHA512, that loses entropy.

Each hex character encodes 4 bits, so fixing 2 of them pins 8 bits, and only about 1 in 256 candidate seeds passes the check. So we start with 132 bits of entropy and end up with roughly 124 bits. That is still safe to use; in fact it's recommended to only use above 120 bits now, given how powerful computers get.

To brute force this, a supercomputer would have to go through about 2^124 combinations, which is pretty much impossible. It is said that there is not enough energy on Earth to go through that many combinations; some people say that you can't even count that high, not to mention hashing and other memory intensive operations.
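To get a feel for these numbers, here is a back-of-the-envelope estimate in Python 3 for keyspaces in the range discussed here. The attacker speed of one trillion guesses per second is a made-up figure for illustration, and years_to_exhaust is a hypothetical helper name.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def years_to_exhaust(bits, guesses_per_second):
    """Years needed to try every value in a keyspace of 2**bits."""
    return (2 ** bits) / guesses_per_second / SECONDS_PER_YEAR


# Hypothetical attacker speed: one trillion guesses per second.
for bits in (120, 124, 128):
    print(bits, "bits: about %.1e years" % years_to_exhaust(bits, 1e12))
```

Even at that rate, each of these keyspaces takes vastly longer than the age of the universe to search, which is on the order of 10^10 years.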









Conclusion

It looks like Electrum is safe to use. It has passed my audit. I am no crypto expert, but from what I have researched and learned it looks safe to me.

I am still skeptical about that custom_entropy thing, and I should ask the dev what it does exactly. Other than that, I found no flaws in the default wallet generation, and there are no backdoors in my opinion.

After all, many thousands of people use Electrum, including people holding large amounts, so it had damn well better be safe to use, and in my opinion it is.

I have analyzed its main seed generation code in this article. Of course there is a lot more to the code than this, but we now know that if you generate a seed with it on an offline computer, it should be safe. I haven't looked into the network related parts, but I trust them to be safe.





It’s a cool wallet, use it if you want: https://electrum.org





Sources:

Electrum software is the Copyright of Thomas Voegtlin licensed with MIT license.

Python is a trademark of the Python Software Foundation