CoreDev.tech, May 2016, Zurich meeting notes

Short introduction

These are some notes typed during the Zurich gathering and they are almost entirely the words of others from the event, and usually not those of the typist. There is little or no attribution for who said what and different speakers are not indicated so it all runs together as a single stream. The participants can be found on http://coredev.tech/nextmeeting.html for a list. The typist is responsible for practically all errors and mistakes. Please keep in mind there were 3 full days with multiple parallel conversations going on so not everything could be transcribed.

day 1 - 2016-05-20

Jonas introduction

Why did we organize this event? The idea came up when we were at Scaling Bitcoin in Montreal. There's always nice talks but not much time to really work on stuff and code. So slowly the idea came up that we should have time to sit together and work on stuff and discuss stuff. So the idea is to have 3 days to work on whatever you want or whatever we want, and discuss proposals or strategies in person so that we might have elaboration speak. But I don't think we should do any consensus decisions, it would not be appropriate. We can do proposals but no decisions. I recommend no decisions.

Yes I am aware that most developers prefer to work in their own environments. There is clear scientific fact that working together in person has some value for team structure and productivity and stability. Sometimes we are under lots of pressure from various parties. I think to survive the challenge or benefit from it, we need to have a stable team and have clear communications. My goal, or our goal, could be to improve Core during the next few days in a few direct and indirect ways. Indirectly I mean, someone could say after these few days, what did we do during the three days it wasn't really productive? Sometimes that happens when you sit together, you talk a lot but code productivity is not visible. But we should not underestimate the long-term implications of meetings on team stability and knowing the other side and leading to more clear communications.

I would say it's good if we make significant progress on this group, like merging segwit today. That would be awesome. Don't feel too pressured to make code-based progress. That we're able to sit together in one room and try to work together as a team is the value itself. If we make code progress, or merge pull requests, even better.

There's some fixed schedule, like the lunches and dinners, and the breakfast as well. The rest is up to us what we want to do during this time. Every contributor has his own roadmap in open-source, everyone has their own goals, so we ... it's totally up to us what we want to do, we should probably work on segwit maybe get it done, maybe merge it, no pressure, it would also be productive if we want to discuss stuff, someone could bring up a topic, there's some topics floating around like p2p refactor, the idea I had about talking about p2p encryption. Someone could write down a proposed time slot for something to discuss, on the wall there's a list of ... and then a room and code. And then maybe someone shows up and registers for that particular talk. Or if this is too much, we can sit around and code. But maybe the scheduling leads to a bit more productivity.

Write down what you want to talk about, maybe talk with others to arrange it, and then let's see who's writing after that. As an example, I would like to work on the p2p encryption maybe I can pull in Pieter for that, I'm not sure since everyone wants to talk with Pieter, we should not all work together 20 people on the same thing. We should try to have 5 people in a group at a time. We could try to bring up topics to work on, form groups, then try to work in groups.

segwit fuzzing

Jonas Nick was asked to fuzz test the new sighash function in segwit. So far hasn't found anything.

It was taking 4 seconds to sign. When signing a transaction, we start with a regular transaction then convert into multiple transactions many times. Each conversion just hashes it...

travis-ci build infrastructure things

travis-ci bit for caching has been flipped for everyone globally. Push to master, push bitcoin master to your own master, because what happens is that it builds master, it pulls the cache from there for every branch you pull, it pulls the cache from master and tries to use that. If you don't have your own master cache, it... if you push your own branch, it pulls it from master, if the branch has a cache then it pulls it from that. "master" is special", ask travis-ci I guess. Ideally, it would pull from the upstream master branch. I'm not sure what a niche use-case that is. You could argue that we might not want to share our cache with downstream, or the other way around, maybe the downstream doesn't want to use our own cache. This would have to be configurable, and I don't know if they want to add that. The fact that they special-case "master" is a little odd, since github doesn't even do that.

issue 5669 - hostname problem... travis-ci support might be dragging their feet now that we've been switching off of the paid plan.

Maybe a machine is passing the whitelist, and sometimes it's not passing the whitelist. What has to be whitelisted? OSX SDK is not redistributable. So it's a best-effort don't let anyone randomly grab it from our site, but our travis-ci machines will presumably pass. If they change the names of their machines or something... They could add a token to their interface. They had that.

We still have a broken constructor MSVC workaround. We have that in some of the stream stuff. We have some conditionals about many versions of MSVC. There was something in the .. vector operator or string.. vector equality or vector some overload. MSVC default copy constructor for standard allocator.

Maybe we could try to get Apple to make that SDK available. We might not use anything from that file. We basically need the stubs and headers, it's the Oracle v. Google argument. We could get involved in that. Maybe we could go back to Judge Alsup and figure out if this qualifies as "fair use".

Verify commits

verify-commits.sh -- in the doc files repo for the cubes/qubes stuff. It's rather critical. Qubes OS. Master dotfile repo that gets copied around to various VMs. Yes the key needs to be in your local ring in order for the verify-commits script to pass signed commits. We could commit a temporary pgp keyring. Using your own keyring lets you verify the keys separately.

Segwit nsequence

We should only have nsequence id for tips. It's changing an invariant. We could do that as a separate refactor change and then this becomes easy. Because it may have different uh.... (trailing off)

You only want the sequence for tips? That's not exactly right. No, it is. Anything but the nsequence of the tip doesn't matter. We try to converge to is the among the highest longest valid ones, the ones with the lowest nsequence... invalidate block and every... you normally validate it when you connect it, so. So your idea would be that when you... when a block ceases to be the tip, you get rid of the sequence? There's different options. Either we say the sequence value is irrelevant for anything not a tip, which means you remove the check in CheckBlockIndex to only enforce the rule when uh... yeah. That would be one way. And also that means when you switch to a new tip, you can reset the counter to zero so that you don't get any overflows unless you have 2 billion competing tips which is probably a sign of another problem. And what else? An alternative is that there's some separate map from block index, used to get the sequence number only maintained for the tips. Yeah, alright. It may actually be ... we could replace, set the index candidates where the value is the nsequence value because I think it's for exactly those that we need it. I thought ... we insert things... we insert, then we prune, then the only thing that matters is the sequence. But that's a bigger change. That's probably memory wise the smallest, who cares about 4 bytes per block in the index? So that's 1.6 megabytes. For mapblockindex. We could get rid of it and have a block index, we could save 1.6 megabytes there, alright.

Github config for forcing commit signing

There's git config for github for pulling all the pull requests. There's the ?w trick. Any url in github ... it's the whitespace diff. They even have different diff view options, there should be a checkbox there for whitespace. Wladimir suggested this to support@github.com but that's still not implemented in their UI. It used to be much more restricted. You couldn't even extend the range at which you were looking. So maybe they will add this eventually, since they updated the diff functionality at least once before based on requests.

Segwit

What about doing the check blocks before the rewind? Not at level 4, but for the other levels. So how about 4 is skipped when you know you are in that state? Yes it's ugly. You have to pass into checkblocks that you're in an inconsistent state. You end up in a state where your tip doesn't have any recent blocks at all. So we're violating the constraint that we haven't pruned, that one day worth of blocks always remains. Well, two days. I don't know the motivation behind checkblocks. For pruning nodes, we should never go past that data point, then if we're willing to do that the problem is substantially simpler. I have not much evidence of it actually catching that many errors. It caught the version corruption. It still exists. Right now, transaction versions are a signed value and they get serialized into the UTXO set using the u-encoder that is part of the coins database stuff. If you send in a number negative transaction version number, the value stored in the UTXO database is wrong. We don't use that for anything, it was put into the database because maybe it would be used for consensus rules in the future. And maybe a UTXO version number... there's a spending one, but not a creating one. The transaction creating the output is never relevant. And it seems unlikely that we would use that ever in a consensus rule.

You wouldn't want to use it as a consensus rule. It's wrong, and it's storing incorrect data, and the checks actually found that. We solved the problem by bypassing that test. What was the argument for not wanting to use it? When would the semantics of how a UTXOut behaves when you spend it, because of the transaction version of the transaction that created it? It's a layer violation in that you have a global flag of a transaction that changes the semantics of its outputs when spent later. So either it makes sense to put a version number in a particular output (like segwit does), but not in the created transaction, because then you're able to mix things. Segwit puts a version in the output, which makes sense.

There wasn't a reason to do this when it was done, just why not. Taking it out would save like 5 megabytes out of the UTXO set size. It's not that it hurts having it, it hurts right now in the sense that it's wrong. Could we fix it? The reason why it is not fixed is because ... if we fix it, we need to fix everyone's database. In fact, there might be no way to do that than forcing a full reindex. There are a small number of transactions in the blockchain with negative nversion numbers. You can't read the version number and fix in place. If we know about all of them in the fixup script, sure. There's data lost.

If you serialize and read back, what do you get? No clue. Not the right thing. Otherwise we could say that for everything in the future we're going to collapse them. It depends on the implementation specifics of the varint encoder or something.

So in checkblocks, what about a change to just not run checkblocks, if you're pruning, stop complaining about data at that point. Sure. It's independent of segwit. And then that makes the change in segwit easier. We can backoff the intensity of checkblocks in a number of different ways. If you combine rewind with pruning, you will end up with no data. If you update after activation, basically you have a two day window to update. If you wait a couple weeks, you're hosed. The alternative is that you download the blockchain anyway. It's not a security reduction compared to the known pruned case where, not sure if people are aware of this, when you upgrade a node to add segwit support the node that was previously running with non-segwit code after segwit activated, and then it accepts a segwit-invalid-chain during that time, it won't know that it's invalid. That's the historic behavior with soft-forks as well in the past. But you were talking about rewinds? It's always been the case. The difference is the extra data. It violates the constraint that we think we have blocks once we have added them to the database. You might serve blocks that are invalid to your peers. And secondary, you want people to be validating which is a separate part. You need to obtain the extra data. You bring up a node, it's pruned, it's past the horizon, you can't reorg back to the point where segwit started, you need to get rid of the blocks you have so that you don't serve them up, rewind will do that. There's weird edge cases here that we never got to before. This will be an interesting stress testing for the pruning code. That's cute that we just have to get rid of the blocks. UTXO is right, that's what's important. That's ugly. That you can end up with no blocks in pruning mode? Yeah. It's consistent with non-pruning behavior in terms of security. It's just fragile. If you remembered, if we just added a flag that said what ... what script verify flags a block was checked with in the block index, then when you start up with your new block index loading rules saying it was checked with these other flags, it could just reorg back to that point and apply forward. Segwit does this, but for have data not for verify flags. If you are pruning, it is no longer possible to do that, because you can't go back further than your prune horizon. If not for .., we could download blocks and go back further.

Is there any reason that you would have to rescan the wallet? I haven't thought about how the wallet interacts with segwit. There's another issue here for find fork and global index. You may end up with a wallet where the CWalletTransaction does not contain the actual signatures. In our current wallet, there might not be anything that uses. The wallet locator could be ahead of the tip. It shouldn't be a problem, because it should be a descendant of your tip, but that's not how it works because of how locators work with find global index will walk back before your tip and it will try to scan from a point that you don't have data. I fixed this with a simple change where you check in both directions. If chain active has the locator thing you are looking up, then return it. Or the other way around. Well, that sounds wrong. What if you are actually on a fork, and a fork after the last thing that is in your locator, but before the branching point with your chain tip. Your code will say it's good but we're ont. If the locator descends from your tip, the lo.... you go from the latest.... you walk the locator from latest to earliest, so when you encounter something that... Is there a better way to fix this? We should apply the locator to index best headers and find the branch point there. This is always a better estimator than doing it with chain active. Chain active is always an ancestor of index best headers. Actually I'm not sure that's true.

You find the first entry in the locator that is in your block index. Forget best. And this will always be... the code does this. It looks it up in mapblockindex. If you know about it, then you have chain active contains that entry, return it. I added the reverse condition as well. Return the tip if that chain's... eh I think I need to see it. There's a pull request.

I didn't want to ask about segwit wallet behavior. The segwit patch includes wallet support, right? Yes. You have a wallet, it's running on a node, it doesn't have segwit, another copy of your wallet runs on a segwit-running node. If you upgrade the old wallet because you noticed the transactions were invisible. If that other node was pruned, it was possible to hit that. So in that situation you would miss transactions. You deserve both pieces at that point. You're running a wallet, two copies of the same wallet. I could see it with a backup. But yes it requires you to mix a segwit-running node with a non-segwit running node. One option would be to put a big warning label, if you're running a pruning node segwit is different and you might want to upgrade before activation. If you didn't, then you might want to reindex. We have reindex chainstate by the way. There's a reindex chain state. It wouldn't work now because you have to download the blocks anyway.

We did get the reindex to be much faster now, so that's good. And synchronous. Yeah there's no helping that corner case, the only answer is docs and reindexing.

Whenever you're using a wallet, and you upgrade post-soft-fork, and then you do listtransactions or getrawtransaction, it will not contain this witness for anything that pays to you even if it was addresses in your wallet beforehand. Is this a problem? It's not usually data you need to... how would you have non-segwit... there would be non-segwit address. Where do we expose the witness? In getrawtransaction. It's not something you should be looking at, but I'm sure some people are, to try to figure out who's paying them. Oh well, screw em.

If you run the same wallet on two hosts at the same time, bad things happen in general. If you're generating address on another machine, you have to import them. You set your key pool to some really large value. You initially set your key pool to a million addresses, then you move your wallet around, it works almost perfectly when you do that, but it's not a supported config. This will become more of a problem when we merge bip32 stuff, because it will become almost completely correct to run wallets in multiple places. But it still wont work right, because you might still try to spend the same coins at the same time. If you generated the keys in both places, yeah but what if both wallets spend. In segwit, sure that will work fine, but if you do this upgrade thing. Same key but different address, it won't be scanning for them. It's no different than any other wallet duplication. Whenever you create a new address in a new place, you import it into another, you try to import it into a non-segwit wallet. You could even generate the wallet on the segwit host with segwit addresses, copy it into a non-segwit host... the wallet will probably be imported just fine. The wallet stores where it last scanned to. When you upgrade to non-segwit, when we do bip9, it wont go back before the pruning horizon. It will try to fix it. But it might not be able to go back far enough, because of pruning.

We should change the functions-- so that they wont detect bare pubkeys to the same address as... bare pubkeys are used. They should be separate. petertodd used it in one of his example scripts. We shouldn't be scanning for bare pubkeys for everyone in the wallet, only the ones imported as bare pubkeys in the wallet. There should be a way to import a key that imports as a bare pubkey. We do basically support bare pubkeys in the wallet, we just don't generate them. It's ugly. Whenever you have a key, we will treat anything that pays via P2PKH or P2PK to that key as is "mine". None of it has any flags how it's limited, so... [....] We didn't repeat that mistake with P2SH.

We might want to soft-disable the wallet for segwit in the initial release. We need to have the wallet to test it. Right now it's still manual. We should potentially hide that. Because basically even if ... it also risks users associating this behavior with segwit. Had to write out a procedure for miners to testing segnet, there needs to be a switch. We could also just, a switch could change the wallet behavior for getnewaddress now returns a segwit address, which is what we should deploy long-term anyway. We don't want RPC changing based on network conditions. There are mining pools that cannot pay to P2SH-segwit addresses. We should default to generating segwit addresses once segwit is deployed and running pretty well. We should figure out what the native formats are going to be. Separate from the native formats, regardless of where those guys, we should be handing out segwit addresses once it's well deployed. P2SH-segwit addresses. We can accomplish this by having a configure flag that says "use segwit addresses" and it defaults to 0 unless you're in segnet or segnet+testnet, defaulting to 1, and then we change the default to 1 in the next release. Or we can have "getnewaddress" with passing flags. Maybe a pubkey directly... that's going to make integration harder. If you go to deploy, say in 13.1, segwit is well-deployed, we generate segwit adresses as the default type, if you don't like that, then use a config option to usesegwitaddresses=0 or something. Tie it to wallet version? yeah it could be tied to that... We have to handle change as well. Change should always be segwit addresses, we can do "change to segwit address" which we can do aggressively. If you look at the different types of output you have, randomly pick one of them, and mimic their type in the change. P2SH isn't enough, it has to be P2SH-segwit... P2SH is still better than, it's slightly better than... It will get better with Schnorr.

nickler is testing segnet chainparams with 0.9 to make sure segwit is still soft-fork compatible.

Error code and correction for a new address type

In terms of address type stuff, as Pieter was saying he wants to talk about address type stuff. Rusty wrote a blog post a while ago on native segwit address type which consists of a prefix colon and then a base32 encoding of just the scriptpubkey directly. There's some issues with integrating that into a wallet because you would only want to use it for recognized cases. Like say we have an address type that encodes any possible scriptpubkey up to a certain limit in size. Now you do a listtransactions and suddenly all your old transactions, oh they also match this new type -- and then it will be listed like that-- that's not what you want. This is the default for everything we don't have an alternative for, or you only do it for a whitelist of the subset. But that's independent of the encoding. That's more of a UI issue. You check the particular case. And then the question is, what do you do for the checksum? After trying many different possibilities and grinding several I guess weeks of CPU time on this, we found a code that adds 6 characters for any random error it has at least a chance of 1 in a billion to fail at most. It will detect up to 4 transpositions or substitution errors for sure 100%. It doesn't fix anything though-- it can fix 2 transpositions. Stop. No.

In the BIP we would tell people loudly to not fix errors. But the capacity to fix errors means we can make a UI that could point to where the errors are. You would type in an address, it would tell you which spots to look at. If you tell them that it's wrong, then they will go figure it out on their own. The ability to correct errors, the strongest use for it is that it's a strong guarantee that it can also detect it. Using an error correction code as a design criteria, it gives you the properties you want. Whether a UI should make use of it, it's an orthogonal question. And yes there are many suggestions that could be done here including things like, detecting an error and knowing what the error is, but telling the user to look at two other spots to make sure they don't just try things randomly. You definitely don't want a UI that does something like "are you sure this 3 doesn't have to be a 5?" and then they put in a 5 there. That would be horrible. The important property of these is that we guarantee detection of errors that users are likely to make. It guarantees detecting 4 substitutions, 4 transpositions, a substitution, etc. There's a one in 1 billion chance, but outside of those cases. 1 in 1 billion is probably. Right now it's 1 in 1 billion, but no guarantees for any particular errors. 1 in 4 billion is distributed such that common errors by humans are that chance, unfortunately. The error correction coding which sipa has been working on is like 4 lines of code. The error detection recovery code is like 500 lines. Just encoding and checking is like 5 lines of code. What is the length properties? It's longer, but not much. For a pay-to-witness pubkey hash, we have a scriptpubkey of 22 bytes. 22*8/5 .. 36 plus.. it's of course 42 characters. Most of the length change comes from that it is encoding 256 bit data. P2WSH-- those would be 55 characters. Before it was 5.8 bits or something... that's probably the largest downside is the length.

When used in a spoken context, base32 is about 4x faster. It's also about half the size when put into a QR code. QR codes are not efficient for base58. There are digits, alphanumerics, bytes or kanji. base58 is stored as bytes, but base32 can be stored as alphanumeric. It's upper-case only alphanumeric. So you would have to use uppercase base32. But you want to display a lowercase address text. I don't want my key yelling at me, to be fair. But maybe that would prevent some people from copy-pasting this. Maybe we should encode this as emoji.

One of the advantages of having error-protected addresses of any kind, it's not just human error in transcription but also memory error in computers because bitcoin transactions are irreversible. Prudent wallet software should be creating a transaction and then signing it, and then checking the input. If your address that you put in is error-protected in anyway, then it's a strong test that funds are going to the right location. Unfortunately that doesn't protect the amounts because they are not under the address error correction. The amount could be embedded in the address and the amount could be under the address error protection. The full thing would be checkable by wallets. You could also add a signature on top of that, but it's overkill at that point. Buying anything online, you always have to copy-paste two things, you might drop a digit, or you might swap the fee with the value. In the scheme that Rusty described, you have a prefix and then an address, so you could have btc:A:amount, or something, the address would be slightly longer with the address, it would apply the check to the amount. That might be something that we might want to move forward with as well.

I like the prefix idea a lot. It doesn't make sense to have a version byte. How would the transition look like for base58 address types to this other type? It will take years. We still have uncompressed keys. P2SH took so long. And P2SH is multisig, it's not seen as a general case, it's seen as multisig by most people. Multisig is seen as somewhat sexy, it just took a long time to get parties to update. Did we track the adoption of P2SH?

There was javascript input validation that was doing address checking. That took a long time. If people are reimplementing parts of the protocol, well.. oh it sucks so they rewrite it from scratch which is dumb. I would be inclined to pick a fixed-length format. If some addresses have different lengths... every type of address would have the same length. The problem is that the moment someone says oh yeah let's have the bright idea of... if it's a scriptpubkey, it can't be a fixed length. With amount, yeah amount is fine. Whatever we go pick, we probably do want a fixed length format because of what people write. With base58 checks, leading zeroes get stripped, so the addresses are not all the same size. But there's a fixed maximum length, and people do hard-code that in and stuff. Ignoring the fact that it can encode arbitrary scriptpubkeys, it is fixed length. The moment we want to encode something else, it just breaks it, come across another pubkey type and it has another length. The stored field that someone is using might just be enough, with the recovery characters knocked off, for it to still work. So nullifying the ... well all we can do is tell people not to do that.

You feed it an array of base32, it computes the checksum, ... ... It could be part of the spec. Even though it encodes an arbitrary scriptpubkey, it has a maximum size. Nobody is going to implement what the spec says, though. There's an implicit one in the protocol, and it's 10,000 bytes. Pay-to-witness scripthash address, and pay-to-witness pubkeyhash format, but the format actually just encodes a scriptpubkey but it's just the standard for those two but not for arbitrary scriptpubkeys which I think would be dangerous. Other people can later extend it, it becomes easier to extend because all you have to do is remove a limitation. The risk is that what people will implement oh it's a scriptpubkey. If you limit it that much, one of the reasons to have something encoding an arbitrary scriptpubkey, so if in the network we deploy script versiontype 5 with just MAST or something, it will take forever to upgrade the entire industry. We don't want to create that problem. We're not going to want to do this again. If we have a new thing again, like lightning, changing an address type is the most difficult thing. Address type being extensible, lightning.... confidential transactions is a good example-- no I don't think that's a good example either, so what's a good example is MAST. You would have to extend for blinding. CT is not a good example because, the payer will need additional processing infrastructure, I can't create a new address type now that is generic for CT. But for MAST I can encode the raw scriptpubkey. Right. It's unrealistic to say we'll never have to have new address types in the future. We can include a lua interpreter or something. It's a new address. You give someone, when you want them to pay you, you give them a virtual agent who acts on your behalf to negotiate the protocol for you, it can barter and stuff.... <humor>petertodd's on it. Whenever you want someone to pay you, you just send petertodd. P2PT - pay-to-petertodd.</humor>

We can't hope to cover all possible future cases, but we can cover a lot of stuff like a lot of new segwit descriptors, but we can't cover stealth addresses or CT. We should cover the things that can be covered and not design further. Being able to put the amounts under it would be super useful. With a different prefix, it would have to be optional yeah with a different prefix. You can only store 288 bits of data, so change to a different code or something.....

XOR the genesis blockhash into the checksum for the new address type. I think jtimon came up with that idea. It's likely that the prefix will at some point collide with some other software system that uses the same prefix. So it's a prefix plus base32 data, so it's recognizable, and then the prefix can be something like BTC or whatever. But short two-three character strings, if htis gets adopted by many software systems maybe not just bitcoin but maybe tor or something, and they like the same thing, it's not unreasonable that there will be a collision on the prefix. To make sure there's no confusion, you can XOR in a 32 bit context ID into the checksum without breaking its properties. What you do in bitcoin, you take the lower 30 bits from the genesis blockhash so whenever you refer to another chain, your code could never be valid. The higher ones are all zero. That was for compatibility, sure. P2SH problem with litecoin, don't know if you seen this, ... sending bitcoin to a litecoin address.

Using the same pubkeys in bitcoin and litecoin; some people have sent to coinbase.com keys... and Coinbase had to write a blogpost about this saying no we're not going to return your litecoins, because our HSM and yadda yadda.

Litecoin never deployed P2SH. They just rebased on top of bitcoin. At least one altcoin has taken the bitcoin genesis block, but it was attacked immediately because someone inserted the bitcoin blockchain. They changed the p2p peer port, but then someone force-connected to it "and your altcoin is now bitcoin". Consensus worked, right? There are many altcoins that have copied litecoin's genesis block. Feathercoin was one of them. They were too lazy to change checkpoints? No this was before checkpoints. They used checkpoints to prevent reorging back to litecoin. That's effectively the same thing as choosing a different genesis block.

The first upgrade with segwit could include Schnorr signatures, and also the new address type. But this will not be included in the initial deployment of segwit. If it's a long time, ... they are orthogonal. Announcing them at the same time doesn't mean that you can only use it after the new... the addresses are not a consensus thing, but it's harder to upgrade. If there's nothing out there for a long time, it might be like bip38 where people just make up their own standards. We need to talk about this. Rusty's blog post was a sign of the fact that he felt this was a problem. I'm sure that he won't be the only person. Tadge says he uses base58 and it starts with a "k". It's so much easier to just have something that fits with everything else. If I'm doing that, other developers will be like, I want something that I can .. to those addresses. Initially there was bip143, anyway.... I would say, faster is probably in some ways safer in that if there's sort of a void for a while then people will.

bip should include encoding and decoding code in the 5 most popular languages. It should be in the most popular languages so that people can copy/paste. This will be much easier than base58. A lot of people struggled with implementing base58check stuff, especially because they would go and look at it.

It was scope creep in segwit to figure out the serialization structuring. In segwit also, it has implications for fraud proofs but that's a semi-discredited idea anyway. Do you have a proof that fraud proofs are discredited? I wrote an article on that, but it turned into a 20 page blog post that I haven't finished yet. And one of the big cases where reifying the transaction hash would be most valuable, is not needed for segwit in any case. Didn't we fix the input value thing? It signs input values? That would be the big gain, using it to get input values and get input value proofs small but you don't need those for segwit for some cases. It's the 99% solution.

What about merkle proofs for the witnesses? Yeah we should do those. What's the timing on that? What's a good point to do that? I don't think it's hard. You need the witnesses in your wallet. I don't think it's hard. On bip37, I was asking if people need to change the code where we scan, right now we scan scriptsigs for stuff, do we need to scan witnesses for stuff? I think the general conclusion on this, unrelated to segwit, inclusion of signatures in bip37 was a massive attack. Are people using it? As far as I can tell, nobody has been using that in bloom when I looked a few months ago. It causes massive false positive rates in the filters. A call should be made, we should ask if other people are using that. I wasn't able to find anyone using that. bip37 bloom filters scan the signatures and they scan the pushes in the signatures as well. We should make sure that they don't. I don't know whether, in segwit right now, the bip37 code, I don't know what they do with witnesses. They don't scan them. It just looks at scriptsigs. So should we send an email to the dev list and ask if anyone expects this to work? I expect that if we get reply, it will be people cheering that the behavior is being improved by not scanning the witnesses. Maybe some wallet authors might be doing the wrong thing right now? I don't think there will be any response, because I don't think most people have realized that this bip37 signature scanning is screwing them.

This should be an update to bip144 then. It should be an intentional change to say that bip37 should not look at witnesses. Do you want OP_RETURN or do you want OP_DROP? When I made the 4 meg blocks on segnet, it was just OP_DROP and stuff. We can use OP_RETURN but it costs 4x as much, so just do OP_DROP before our pubkeyhash and then they put their junk data in the witness part. Is that better? is that worse? I would prefer the junk data to be in the witness. We have a policy that prevents more than 80 bytes in an output, and there's no policy for preventing anything in witness data right now. That policy is not very effective anyway. Given the outcry about the size about OP_RETURN, I would say the policy is effective until people figure out how to bypass it. As far as why it's preferable to have it in the witness, the only real argument I can give is that if there are lite clients in the future that don't need the signatures on the transactions, which I think is likely, and using them to decide who signed in multisig well that's kind of a special use case. It means that, there's much less data that a lite client will need to download if there's any junk data in the block in witnesses instead. Even with a false positive in there, it would skip over the junk data. If you want filtering of stuff in the witness, that sounds like an extra filtering thing in general. Lite clients generally can't validate signatures, they don't have the UTXO set, they can validate signatures but not the UTXO. They don't have the pubkey. And they don't need to, their security model doesn't rely on that. That's a massive... For example, there was this post to the mailing list a few weeks back with a proposal that sort of inverts how bloom filtering works, where blocks have commitments to a map. And when you get a hit, you just go and download a whole block. And having junk data in the witness is good in that case because you wouldn't download the witness for the whole block. In the case where you want the junk data, you can go get it from the witness data. The payment code stuff is poorly designed right now, but maybe it could use something to search for your payment code in an output. If that data is moved into the witness, if you do that, you lose the privacy advantages. Just for the locator thing? Yeah. The people who are using this weird OP_RETURN stuff, if you move to OP_DROP or to the witness, how to make it easier for them. Many or any of them don't seem to be using bip37 filtering to find their stuff. It's been described, maybe one of them is, generally they aren't. Usually they don't even use the things they put into the blockchain, it's write-only. Even if there was something that people were scribbling into a witness that we want bip37 to trigger on, well maybe there should bed a flag to tell bip37 to scan for that, and turn that on only when you want it, because it's a special case.

bitcoinj wallets currently rely on pushing the outpoints which also happens automatically. Which also contributes a ton to the tainting of the filters. ... Schnorr multisig that lets you use the size 1, is not often an improvement for all applications, because you often want to know who signed. So it sounds like you need a way to get the witnesses for a block. I would like to check for commitments, but I would still like to check signatures. I don't know if the one on the blockchain has the same signature. It would be easy to give you the commitment path for it as well. That code is pretty generic. Witness txid contains the normal txid means you only need to check one tree.

We should discuss policy for just for witness. What's the current behavior? There is none. We need to have that in order to prevent people from doing the parasitic thing. There should be a... the same rules as normal applies. You can't have a transaction whose virtual size is now above a certain limit. So, alright. Say I'm a node relaying to the network, a segwit transaction arrives at my door step, I take its signature, I push BLAH BLAH BLAH DROP. That won’t work, because segwit script validation implicitly has cleanstack. BLAH BLAH BLAH DROP. No you can't do that, it's only pushes in the script.

In the docs, with checkmultisig, you don't put 0 on the first witness item, you just put a new length. You know how with.. you have an empty witness item, not a witness item with a zero. It's not clear in the docs. That's what OP_0 does, it pushes the empty string, the name "0" is misleading. The docs should be clarified. It would make sense to update the docs to that 0 is encoded as an empty byte array. It pushes an empty byte array. It's possible to serialize an explicit 0, but we don't do that anywhere. But it violates minimal encoding or whatever. If you take the script and directly put it in, ... are there any other ways you can take witness and modify it to be longer but still be valid? I think not. But I'm very uncertain. This is the fixing malleability subset about it that we were never quite confident we had answered all those questions.

If your script starts with OP_DROP, then, you're ... right? If you don't do that intentionally, OP_DROP with an empty stack is a NOP? No, it's a failure. It's a stack exception or underflow. If your script starts with an OP_DROP, you send a transaction with an empty length thing, and then someone sticks a megabyte of garbage there. It wont relay because the fee is too low. You can't... if you had many drops, they could put that megabyte of junk. Currently there is a consensus limit that the stack items for execution are limited to 520 bytes which is what they could be before... this is not a restriction on the length of the witness script. This only applies in the currently defined versions, good. But we may want to put a policy that is smaller because there's no use for anything larger than 72 bytes currently. It might make sense to put an 80 byte policy limit. There's a script that publishes arbitrarily padded whitespace text into the blockchain, that's why it's whitespace padded. In spite of showing all this wonderful stuff, they just whine about OP_RETURN not being big enough, they have never done any research or whatever.

One reason to keep using OP_RETURN is because some websites use OP_RETURN to txid mapping. They are using websockets to that for their web API. These people are probably going to keep doing that.

We need to cache the wtxid but we might not need a lookup. In a post-segwit plan, we just communicate the short ID key. Looking up based on short ID, it picks transactions from a discount. We iterate the mempool. We benchmarked this. 50 CPU cycles. 5000 go to iterating the mempool per entry. For currently small mempools, it's a few milliseconds, which is fine.

PDF malware issue

There have been some PDF 0-day exploits targeting Blockstream people and Bitcoin Core people. Patrick also got some. It was from mega.nz. Qubes uses a new VM for every app, including networking, disks, and reading PDFs. Actual VM, using xen. Xserver in its own VM. It does some window manager magic so that it looks like a single system, windows are color coded by which VM they are running in. It's Redhat on top of Xen. You can make all your VMs be debian. They don't have loops reading untrusted data sources in Qubes.

0-day in the UTXO set to go and exploit the virus scanners that are monitoring the blockchain. There was a recent one where the security researcher sent the exploit demo to the antivirus company, they didn't get the report because it crashed their mail server because it auto-scans the attachment using the same software.

Segwit2 review important things

The last commit that I added included some commentary https://github.com/sipa/bitcoin/commit/5b0cd48d9f29f4a839640819230d4e88e077ce40. This was sipa going over everything except the tests. And Tadge did a reimplementation from the bips. He did need to talk on IRC a bit to get it right. And as a result there have been some changes made to the bips. Jonas Nick was doing some coverage analysis and automatic fuzzing of the code as well. Are we still missing seed nodes? We should upgrade some of the seed nodes to only report... or.. petertodd runs the testnet ones. How do you want to approach those tasks?

The least destructive thing is to just add new ones. Add a segwit testnet seed node. Add new ones. Assign to every seed node, add a service flag and the current ones just get no network then you... I already do this for RBF seed nodes. I can take that code and change the numbering. Yeah. Go ahead. Just the refactor of adding service bits to the seed nodes, that's something that we should merge first.

I have a code style question. What do we do for comment blocks? I'm looking at sipa's commentary thing. Adjacent chunks are alternating between slash slash comments and slash star blocks. It depends on whether you want doxygen to pull these up. This is not intended to be merged. It's inconsistent, I noticed now. It's stenography and encoding a help message. If you notice, some are spaces and some are tabs. You alternate it quite nicely.

We should not explain results, but we should explain differences in the comments. Talking about the refactor is not helpful. Convert the text to talking about the differences. Slash slash comments in C99 are not valid comments, so I use those in my own source code so that the text wont get committed. This was fixed in the last year. I think they fixed the variable length arrays.

Child pays for parent

It's done and ready for review. There's a refactor which needs a test to verify that it's identical. There was ancestor tracking, the CMB refactor, then CPFP uses both. So the pre-reqs are all merged? The refactor isn't merged. If you don't care about the refactor, you can just review the whole thing. It's child pays for parent, not children pay for parent- we don't support that, it's too hard. It's only the first born. ANYONECANPAY but they all can't. Together with Chinese policy, that's perfect. One child per parent (paying). How does it determine which one to pay? You look at each child and decide if it kind of makes it sort it to the top. Once you evaluate the ancestor fee rate, it's updated to include something that is potentially a parent. Whenever we include a transaction, we walk its descendants and put them in a special holding area.

What's the best possible fee when you do an RBF fee bump? 1.5x times the prior fee. Fee estimates might be higher, you need to make sure the new transaction pays for its bandwidth. What do we do if the fees are equal or even lower? Then why are you bumping it? Maybe you want to add an output. There are still cases you want to bump it. A multiplicative increase guarantees you can cover any span of fees, anything smaller and you can't do that. Bram's fee estimation stuff? If we were to do more extensive bumping in the Bitcoin Core wallet, we would probably want to do the pre-sign thing where all the bumps are pre-computed in advance so you don't have to come back. Greenaddress does this. It's pre-signed. I thought it was. Pre-signed means you don't have to go back and get access to the private key again. You can also be slick and locktime the updates. There's no technical reason needed to do that. You can send them off to different people to broadcast.

What is the user experience if someone bumps their fee at the same time that a block gets mined with his original transaction? For the end-user, how does that work? From their point of view, they should click and in they notice that the fee is less than they expected. They should not see both transactions. They should just see one transaction. If you bumped a fee, you hide it immediately, you have the risk of the original getting mined. But it should be shown in the UI as one transaction, that's hard I think. It's all txid based at the moment. It's output based.

The correct thing is easy to imagine for the GUI, which is that there should be one entry ever shown on the screen for some transaction. If you go into more advanced details you might see more detail. Implementing that is a little tricky. In listtransactions, it's output oriented, it's not that different. What about a version in RPC to bump or increase the fee, then we try to calculate the optimized fee, not just 2x the fee? On RPC, on GUI it could be a context menu for increase fee or bump fee. No outputs added.. Well you need new outputs for change sometimes. You should probably have a discussion with Lawrence who implemented this at greenaddress. One challenge is that, say you have bumped a transaction and you had change from that transaction, and you don't know which is going to get confirmed, either one might. You want to make another transaction that would spend the change of the first transaction. And so, what do you do? If you bump, you invalidate it and then you don't know which one. So you create descendent transactions for each one? So adding an extra output is sane. An alternative is that you never do that, you add an output. Well you still have to deal with the case where the original transaction... even if you add the output, what happens if the prior transaction got confirmed? Now you have to still, you still have to make a descendant. You do both.

With wallets, you probably want the wallet to include the user intent recording. This is way different. With bumping, what descendants I was going to pay, then bump add it. It's an invasive rewrite. If you bump a transaction fee, ... you also have to make... in that case you already computed the one that didn't get bumped. You actually have to find all the children of the transaction you are bumping, and then merge them too. And then if the person you spend it to is respending the change, well if they are re-spending unconfirmed transactions from a third-party... well that edge case I'm not worried about. Lawrence implemented a first cut of this, and he realized it's more complex than he thought. I think the minimum set of things, with just bump and not adding more outputs, ... as I said, Lawrence and implemented it and then realized his implementation was not right and then went and did more work. It's not easy. It's in some sense, it's easier to do with precompute kind of approach.

Fee estimator seems to be mostly kind of working. It's a little freaky though at times. I haven't been seeing people using the fee estimator also complain that their transactions are stuck and taking forever. I haven't seen that. I see a lot of complaints on stackexchange about transactions not confirming, although I don't know what wallets. Usually that's places with static fees. Someone was complaining about dynamic fees but have not initialized or something, not enough blocks or something. Reasons to bump the fee, if you target to the next 10 blocks, or target to the next 2 blocks, it's useful yeah. The state of the network changes. The fee estimator can't see the future. The more full blocks you see, the more important fee estimators will become. But I saw some users complaining about the fee estimate, there's no estimate fee one.

UTXO MMR and delayed commitments

Delay when you commit to the MMR so that you have some time to do it. And maybe not with every block. That's another one you can do. I nearly wrote up the version with periodicity instead of always. It's reasonable to calculate it every time. It makes it simpler. It could be a lot faster if you calculate it in batches, potentially.

There's an optimization that you can do in these schemes where you have a portion of data in the MMR tree where nodes aren't keeping everything, but they can keep the top levels of the tree, and the proofs are shorter because you don't have to include those parts in the proofs. This adds 10 MB of storage to the nodes and keeps the proof size down. These optimizations are useful because otherwise you end up with really big proofs like 4 megabytes. You could change how the pruning rules work. The data that you go prune is all consensus critical. If you make this hard assumption that you should provide this data in your block... well you can make the pruning rules more liberal.

Like segwit, we're probably not going to have a good handle on what these schemes feel like without implementing it. In segwit, we did an initial implementation in Elements Alpha, and that experience gave us the confidence to go and do the real thing. petertodd has implemented the code that does this type of stuff; when you get it right, it gets nice. There's lots of ways to screw it up. The implications are easy to get wrong. For segwit, when we first talked about doing this separation of signatures and the rest of the transaction data, everyone's initial impression was that this would be a whole rewrite of the codebase that would be terribly complex. When sipa thought about implementing it, the original version, it turned out that you wrap the serializers and it's done pretty much. And that's perhaps something that should be done here, this should be implemented perhaps not in bitcoin but in some test thing just to get an idea. Fortunately some people want this stuff to get implemented, the data structures are already done. This would be necessary to move these ideas forward.

There's subtle issues, especially with pruning. How do you write code that interacts with this insane ways? When you say you can drop something, well what data is where? How do you write rules that capture this stuff? It's not that trivial. There's a lot of languages that can't deal with these data structures reasonably. There's a lot of boilerplate to bring stuff into and out of memory.

How is this different from jl2012's december proposal? petertodd has a few extra details, but it's mostly the same kind of thing. It sounds like... to my understanding of both, which is superficial, is that they are the same proposals. petertodd was talking about delaying the commitments, so it's a small detail. The general pattern is the same for both. I wanted to have an example to show how this is possible. If you can figure out how to make it faster, sure do it without the delayed commitments.

This idea is not a super one. jl2012 posted about it in December. There's been various forms talked about for a number of years. The realizations like, oh you can split and do both. And how the UTXO sets, and have things fall out of it. That wasn't in the initial ideas, but it makes it a more attractive thought. Or the notion that nodes can keep the top of the trees. I think it's a really cool idea and I hope we can have something like that in bitcoin. I'm not sure how to overcome this automatic crazy response that people have had to it, which is that, every time it's come up in a place where the members of the public, someone responds thinking this is stealing their money. Probably because there have been UTXO pruning proposals, without the proof stuff.

There's some nice sort of properties about this kind of system, like a transaction that needs proofs associated with it, is invalid without the proofs. You could say, you happen to have all the data as an archive node. I could connect to you, ask for an address, I could write a transaction that pays you and also whatever I wanted to pay, and then give it to you without the proofs. Then the archival node operator can add proofs and get it mined. So it has built-in the ability to pay to provide for proofs. There's lots of questions to answer like how do you structure wallets for this? We have analyzed the designs to know that it's straightforward to build a wallet that tracks its own proofs. Unless you had coins stored in a vault completely online and you weren't tracking proofs; so you might need an archival node to provide it for you. It would be a long time until bitcoin is scaled up to the point where people wouldn't be able to do that voluntarily as an archival node.

If the way that the data is managed in the system, it should be, if the design is right, it should be possible to make an archival node specialize in a certain subset of the data. UTXO insertion ordered index, do it by time, that's the easiest way. Then you can find archival nodes that take a certain window or something. Right now today, if you want to spend a coin buried in a safe deposit box or whatever, you need database access to the network to write the spending transaction in order to find the txids. Bitcoin nodes do not provide an address index lookup. You have to scan the blockchain which is really expensive. Or you consult a server, like an electrum server, which will tell you the input. The only extra component is needing the proofs, since we already have that electrum component.

Not sure how to deal with the public response.. if it's implemented in a test response. Ignore the naysayers, say it works. "Recoverable" is bad because it implies it could be lost. People like "archive". It's good, it's safe. And what about paying an archival node? What if it becomes too costly to pay for that proof? Well, transaction fees are going to be more than zero, geeze sorry.

Witness data, as well. But it still, yes it could be made in a way that doesn't cost any additional amount. But with efficient block propagation, you could probably do so without too much moral hazard. You don't want extra data in a transaction not costed that is free-form. But if it's highly structured proof data, then maybe it doesn't really change the "size" of the transaction. So the additional cost then might be just the rescanning cost. The scalability benefits are so obvious in this scheme. Right now, we're at a state where the UTXO size is still manageable but we're going over leveldb performance cliff and it's becoming a lot slower to do lookups. It's not just size of memory, it's also about performance of leveldb. We can maintain it. We can make the DB cache more efficient. Plenty of things to be improved. But this data structure is going to continue to grow. And it's performance-critical. It sets the bar for being able to run a full-node. When we introduced pruning, you could run it with 2 GB of space on your system. But now the system is closer to 4 GB right now. Prune 550 with all the undo data in the blocks? Yeah. The UTXO set size itself serialized... my chainstate directory size is .... I had a pruned node running before I came here but I deleted it to free up some space. Okay, right now it's about 2 GB. Maybe I was thinking 4 GB because of a db log... if you run with a huge DB cache and shutdown, it does not merge the log into the database. So that might also be why you saw a larger size. It's not an urgent issue now, but it could become urgent pretty quickly.

You create a proof of coin history by linearizing and dropping some of the data probabilistically based on coin value. It's not feasible to implement without allowing people to create coins out of thin air. On average, you are creating money out of thin air. But there's no way to get that into bitcoin. It's never going to happen.

Local UTXO set corruption

We currently have no way to detect local corruption of UTXO set. There are also other problems with leveldb corruption. We need ways to detect leveldb corruption. The CRC may be helping in a way that is invisible right now maybe. If the last log entries are unreadable, we load it in a mode where it ignores them. So it will not give an error, your last writes... it will undo your last writes. This happens transparently, or at least it should be, I don't know if we tested whether it is transparent. Old files might get corrupted, like a hard disk overrides part of your file. So it can detect file-system level corruption. The initial test, it's quite long, especially on machines with slow storage. The hardest part is reading. It's warming up your OS disk cache. One of the unfortunate things about it is that it doesn't prime the coins cache. It used to, until you optimized it away.... which was intentional because it would blow up... it limits how far back it goes, it will only go far as back, with the cache, to avoid increasing peak memory usage. It had two levels of caches. If you did that, you threw it away and kept your cache prime, yeah that would be a big improvement. At the end of the chain state things, still push it... I think the problem is that it gets update, you rewind using the.. so you cannot use key state, but if you can checkblocks for, you reconnect them, you rewind them and play them back. If you left your cache in the state it's in, that should be the state it is in. In checkblocksfor, you have rewound and then reconnected. But it could be changed to 4 as the default. It could be decreased in depth. I think C4 over 3 isn't that much, because it's just script validation and not the disk latency anymore. And script validation has become more faster. And undo is very slow in the code base and a number of other areas. I wouldn't be surprised if they were implicated by checkblocks. I tend to skip the initial check these days, this is bad right because you're bypassing the suffering of the user now. If you're disabling it, that's a sign that something is going wrong. Does anyone here have a dbcache higher than the default? Well, we should increase the default. Or make it automatic.

There's an advise flag in Linux, that you can lose the contents of the page at any time. It's hard to write a program to repair that situation. You have to copy the data every time you use it. Linux has something like that where you can allocate a page of memory like that. There could be a log entry, "this would be way faster if you are willing to use system resources". When people complain about slowness, oh just turn this knob.

We mostly shouldn't have these configuration options, because users aren't going to know how to set these values anyway. But there are corner cases where we don't know how much memory the user wants the system to use. When the user wants to change the setting, they still don't know how anyway. And miners don't want to change these values with flags, even if they want the better performance. Why didn't we make it run that way? And we wrote the software. If you are using bitcoin-qt and a UI, then there should be a drop-down sure. Whereas if you're running it on a terminal, people can probably deal with a config file. QT idea is adding profiles. Yep when bitcoin-qt boots up, tests your GPUs and then enables sha256 or something to start mining. And then it starts ordering ASICs or something.

Wallet separation and GUI issues

It feels like people are not happy with the GUI anymore because it's slow and crashy sometimes. It's slow when it's syncing up. I hear that a lot. How many people regularly build and use the wallet here? I think it would be more stable if we would have the node, and the GUI in a separate process communicating over RPC. Awesome, great. Welcome to 4 years ago, since 4 years ago. Nobody did that. There was a wallet over RPC at one point. This isn't a cute thing, right? Just having the wallet separate from the rest of the process would have important security component. GUI separate, is not the wallet separate. There can be a node GUI, then you can connect to your node server. Then we have a wallet GUI. Nobody cares about the node GUI. Statistics would matter to some people. Some did a curses monitor for bitcoind. When it builds, it works fine. But the build is often broken. One of the configs for travis is building the QT-level wallet. Nobody knows this at the moment. You lose the tabs with the transactions. First version of pruning did not support the wallet. It was a blank pane, and then a debug window. I don't want to connect to the Core code, untangle it, then go over RPC. Separating wallet from node is important, but not GUI from logic. p2p interaction is then no longer in the same place as the thing that holds your private keys. It's embarrassing that this is not process separated. Working towards wallet separation requires authentication for private RPC commands...... eh, probably not. Because without having those things, we could still have the process separated on the same host. And a different user on your same system, connects to you, that's not a problem at all. I certainly, I consider the p2p encryption stuff to be a good thing that we should do. It's not needed for separation.

There's a lot of stuff that the current wallet does correct. It makes rewriting from scratch more infeasible. More than one person's mania will carry them through. This is a standard failure mode in history of software development, where lots of companies have tried the let's rewrite. There's probably a minimal way, that deattaches it by brute force, then add a few more RPC messages to get the functionality missing back. Tried to get Armory run on top of RPC. I offered to do work to add commands. Don't write your own node backend, just use RPC. I would be happy to promote Armory as the wallet for Bitcoin Core and he didn't want to do that. He favored a design that inherently was impossible to make scalable.

What was the problem with RPC? If you want to sync your wallet, using RPC. If you're talking about detaching, detaching the wallet from the node- you don't want to do that over RPC. Relaying all transactions and blocks? You still need an RPC layer for fee estimations and things like that, which should be optional because they are impossible for SPV. And SPV doesn't work. It's not about SPV with random nodes. We should not call it SPV. We should call it "trusted nodes" or "trusted local lite" or "detached wallets".

p2p encryption

Something actually short-term doable. I recently merged the pull request. The only thing which I had changed is that ... IETF is standard than the OpenSSH standard. They have their own encryption scheme.. it's different. The difference is that they encrypt the length with a separate key, and that's pretty cool, the openssh one designed that one better. We can use code from the openssh implementation. The benefits of encryption, we have a one-time chance to overhaul the message protocol and drop the SHA256 and it will make it faster. Finally, we can drop the worrying about adding fixed line commands in message protocol. No need for 10 bytes of spaces in a tx. Wladimir pointed out that some people are adding in random info. Some nodes were leaking memory into that. There's probably a private key link. libbitcoin has had bugs like that, but they fixed this. How do you end up leaking into that? When you send a 2-byte tx command, some implementations fill in the response with random data from other locations. This might be libbitcoin because they did this in other locations. They were doing it in transaction version numbers. That's how we ended up with negative transaction version numbers in the blockchain. libbitcoin was generating transactions with uncleared memory in their transaction data. That could be fixed as well. Those little fixes would actually end up recovering a non-trivial amount of space lost to the MAC for authentication and encryption. It would recover some of the lost bandwidth.

The only downside to this encryption stuff is that it makes the peer protocol faster. But it will also make it use somewhat more bandwidth. Why faster? Because the sha256... every message in the p2p protocol gets hashed with sha256, and that's pretty slow. It's 4x faster than calling 1305 plus siphash20 (er, chacha20?)... someone on the mailing list pointed out we have a 4 byte length ... ends up in a 4 GB buffer. Because you want to do the MAC afterwards, you need the key. The MAC is of the package table. The length is first, you know the length ahead of time. You need to be able to verify the MAC. The first 3-4 bytes is to decrypt is the length. It's up to the implementation to care about the buffer. In theory there is a message with 4, well let's make it 3 bytes.

We should make an explicit effort to reach out to other implementers to review this proposal. In our past experience, they are more likely to read our code than our bips. But I think they might actually read the bip. So we should reach out to them. This is worth doing even without authentication. It increases privacy. And it makes p2p protocol faster. I would also like to have a good handle on what authentication we hope we would implement before we consider the encryption design done, .... the application is more exposed to export restrictions, unfortunately. Well what about wallet private key encryption? That's only internal to the application and the U.S. export forms only care about communication.

The concrete effect is that if I want to ship a product from the U.S. or Australia or Canada that copies U.S. law, to other places, since I'm shipping a device that contains cryptographic components, I have to fill out some forms with the U.S. government and they ask questions about what kind of cryptographic components it contains. In theory it could delay or mess up, in practice it's not a problem. So the protocol would have to be optional, and have a build time option to disable it. Then you could ship things out. Does it matter if it's NIST or...? Does it make sense to pick one either way? It's not enough of a distinction, unfortunately. They don't care if it's about authentication and authorization, which is what wallet encryption is. I had no problems filling out those forms with respect to that analysis. Yeah, it's authorization to spend from this wallet.

Wrapper around the threading primitives

This gives... we already have wrappers around those. They are currently using boost. We need a wrapper around boost. BOOST_THREAD for example. So standard threads we would be able to drop in, except they aren't interoperable. Boost thread, condition variable, this is done, there's a pull request 7993. Is there interest in moving forward with this knowing that it would cause some code churn? Or should we leave it until later? We should postpone this until 0.14. We should focus on the network stuff. We should use this when we are trying to get rid of boost, but that's not a goal for 0.13. There might be some places where boost is easier to get rid of, so once this is in, it's easier to do this sed to replace things. Except for file system, options, we can't replace boost chrono until we've merged this.

Block propagation things

Right now, we will, if a new transaction arrives, it gets immediately included in a block. It would be useful so that as we build a block, we should look at arrival time. We should assign a fee modifier to counter propagation issues. To account for propagation. This is also the rational behavior for miners. Miners would penalize a transaction they just received. If they had a somewhat lower fee, ... I went and did some propagation studies where I got some computers around the internet with some synchronized clocks. I logged the accept-to-mempool times for all the nodes. This is a binary thing. Unless, with a compact block, you were relaying.... well, what I implemented was something where there was a fee floor based on the time delay. If you've had a transaction for one second, that implies the fee per byte must be greater than x or else you would skip it in the block construction. This is justified by saying if you're going to include this transaction, then you are going to incur an additional 200 ms propagation delay per hop. Once you have done that, you're willing to include anything- not implemented this part. This could be an extra thing to do, to change state after including these things. If someone sends out a 300 BTC transaction fee, to include that, even if it causes a propagation delay.

The numbers that I got when measuring propagation were interesting. It's an S curve shape that a few get around really fast but most of the convergence happens around 2 or 3 seconds, between 2 nodes. If I have synchronized clocks on 2 nodes, and the time from the first node to see a transaction to the second node, most of them get done in 2-3 seconds right now on the network. Even if you go out to 10 or 20 seconds, it only gets you to 90% of transactions.

Due to the descendent count limit, which might be due to the package limit, if I have A&B which are descendants of some common thing, the order that I get them might determine whether they will be accepted. On the subject of propagation in general, right now in the network, orphan transaction handling is really busted. Because of trickle, it's often used, and that will go away with 0.13... in master right now, there's a thing that sorts the transactions before relaying them. This does not completely eliminate orphans because there's the scenario where one peer sends you transaction A and another peer sends transaction A and then A prime which is a descendant, you would go get A from the one peer, and before it's back to you, you might get A prime first, and then you will get that and then it's an orphan. There's a leak in the orphan pool. There's a size-limited orphan pool, transactions get ordered, and it will randomly drop transactions once it reaches a certain size. The pool is always full, even on master, even in a network where all the nodes run master. The reason for this is that if we accept a block off the network, we don't remove transactions that were included or double spent from the orphan pool. We don't do any orphan pool handling at all when we fix a block. This should be fixed and also backported to 0.12 as well. This is really adversely impacting the relay of transactions on the network.

If you have a chain of multiples, you will often always kill the thing. You never make any productive use of the orphan pool because it's dropping too often too fast. The combination of changes in master, and clearing out the orphan pool when a new block comes in, so then the orphan pool should be mostly empty when new blocks come in and the random eviction code won’t really matter anymore. When we receive a transaction that brings the parent of an orphan into view, the chunk of code to go and handle that is sort of a gnarly mess of code that would have to be duplicated in the block handling or something. I see what you're saying. Yeah. There's two things that we need to fix there. When we connect a block, we should keep a list of all the previous outputs that it spent. Then we should take that list of previous outputs and use that to filter the orphan pool. What about move the orphan pool into the mempool? The conflict removal, beyond conflict removal, what orphans have now become mempoolable? They are not conflicted, their parents are now in the block. We don't do either of these things. Sometimes, if it's deep, and the block conflicted with the parent, you have no idea. There's still going to be a leak. We should expire things that have gone into the mempool. The random eviction is moronic, less moronic I guess. We do another moronic thing... we track which peer gave us the orphan, if a peer disconnects, we throw out all the orphans from that node, it's sort of dumb. This limit has shrunk over time. There's several reasons for the limits in the orphan pool. It can be expensive to process orphans, because if an orphan spends lots of inputs, well, I can make a transaction that's not real it just looks like it, it's entire size is spending outputs that don't exist. It will make the map of prior outputs really large, it's a map so it's not slow, but it's going to take time. It should probably be limited in size not number of outputs. So it sounds like we're going to have a leaking orphan pool, and it should have a time limit. From testing code now in master for relay, at least in the lab environment, my finding is that basically that a transaction never needs to stay in the orphan pool for a couple round-trip times to its peers, so like an hour or 10 minutes. You don't want to wipe it on every block. What if it is in the middle of propagation? I did think it might be reasonable to get rid of this "kill all entries after a peer disconnects" and use a time limit. We also have a time limited mapping data structure, we can use that in relay or here and that would probably be strictly better. It's complicated here because there's indexing data structures. Removing from a map and then also remove it from the previous out index. That's easy to do out of all the possible changes. I spent some time thinking whether we could do better than random eviction. The limited map seems better than random eviction. There's some attacks there maybe? The problem is that it costs nothing to make orphan transactions, you just make up gibberish. You can drive someone's orphan pool to be empty, you can keep flooding it. You can't get banned for it, because the peer will never know that it's all jibberish. You can have an orphan map per peer so that they aren't competing with each other for space. Should we have, should the orphan pool be generalized, or should we have a separate rejects pool? There's two reasons for having a pool for having a rejects pool. We should still keep the rejection filter for the bloom filter, which is space efficient. But it's actually easier to keep around rejected transactions. The compound block relay sometimes has transactions that it could have received but we threw them out. The compact block relay could rescue from that when they show up in blocks. Miner had a different policy than you. The other reason it is good is because of child pays for parent. There's a third case, which is locktime. You might not need it for child pays for parent. You should use replace-by-fee to solve the propagation problem. The threshold for propagation is very low. As a miner or something like that, you would increase your fee income with a smaller rejects pool because of child pays for parent. It would also increase your fee income if there was a user tool for stuck transactions. Having a rejects pool and using CPFP.... it's like the least efficient way to make the transaction ... yeah just pray to the transaction reject pool fairy. Blockchain.info produces transactions where the user sets the fee too low, it will produce transactions that don't even relay, and then it will also spend them. So the user will go to the wallet, they will set it to a fee that doesn't relay, they will spend it, then they will be saying man my transaction doesn't work, then they set their fee to high, then they transact again and it spends the output from it. Then they whine that their transaction has taken 4 hours and it hasn't gone through......

Compact block relay can go take transactions from the orphan rejects pool. But there's a problem with the referencing counter there. C++11. Sad that we can't get rid of the orphans pool. The orphan pool is needed. It has an effect. It really hurts propagation to not have it. Or we have to re-architect propagation. Transaction packages? I think basically we don't need it, with the latest changes to the relay, with a small infant pool, it might just work. You have a transaction with inputs and outputs, you can add to it. You just do replacements. There you go, you have a package.

Wladimir was working on assembly optimizations for sha2. Well, there's one way to make the hashing faster than the assembly- which is, don't do it. Pieter has instrumented the hasher to print out the hashes, to see when the hash has been computed multiple times. And there's sometimes hashing that has occurred multiple times, like 20x. He did this 3 years ago and fixed a bunch of places. This is how we got CMutableTransaction vs CTransaction. Transactions that you reject, or receive multiple times, you will address those. No, this is 6 times in a row. Back-to-back. Oh, that's messed up. We need to fix this. Need to wait for a block to see this again.

Another weird behavior with transaction relay, we get an awful lot of Not Founds. What's happening in the cases that I instrumented, if you look at any node running and looking at stats per message, you have sent and received a whole bunch of Not Found messages. Sent, sure. But received? You will almost certainly sent a lot more than you have received. You will also have received them. What happens at east in the cases, yeah there's some garbage that requests weird stuff. But ignoring that, I think someone is trying to track transaction propagation to see if you have them. Bitcoin originally protected against this attack because the only way to get a transaction was if it is in the relay pool. But now it also checks the mempool. If we move the insertion into the relay pool at the point we actually relay (which it's not, it's ahead of it)... the fallback to go to the mempool... bypasses that. Now, it should be easy to make it so that you don't put it in the relay map until... because we already decided to not relay things not in the mempool. That's a one line change. When someone gets data, we still check the mempool too. So it would not improve the privacy of the system with this 1 line change. I can't get rid of that because of the mempool p2p message.... so I thought about just throwing all the mempool p2p results into the relay map, but no that's ugly and it would run you out of memory. Now you would see transactions and not even the serialized..... so the, in any case, the cause that seems to be what's occurring here is that I think that nodes are receiving transactions and for some reason they are getting jammed on receiving them. They end up pulling them from their nth retry, and the 2 minute spacing has a maximum and it's 10 minutes out, and the real relay limit maximum is 15 minutes. In the cases where it's not just junk, what's happening is that Not Found occurs for like 15 minutes in 1 second. A couple ways to improve this. The whole behavior of the 2 minute offset of fetching things, we should rework the request management of that (Wladimir has some code for this). We should make sure that thing's maximum is slightly smaller than the relay pool's maximum, which could be a max of 20 minutes or something. I'm not sure why nodes are taking 15 minutes to fetch transactions, that suggests to me they are getting DoS attacked, it's inv and not replying, it requires a lot of this in many times to get an inv response take 15 minutes, that's a long time. It's not a tremendous thundering rate, but I expected to see none, especially among my own nodes. I definitely get them between my own nodes. That was something I observed while working on other relay improvements.

http://diyhpl.us/wiki/transcripts/gmaxwell-2015-11-09-mining-and-block-size-etc/

Trying to address these privacy leaks related to the p2p network is like, there's so many of them. It's quite difficult. The mempool p2p message makes it very hard to do anything useful here. Pieter improved it slightly by putting the mempool p2p behind the same poisson time delay. It should have a longer one. That helps. My earlier attempts to improve it broke things. So that's good. Because nodes have the ability to just get data to check, they can use that for tracing. And I think some people are using that for tracking. And why should SPV nodes care about the mempool? Since when do they do that? That's pretty broken.

Mempool p2p messaging behavior in a bit of ways can be changed, but it often breaks things. You can filter it with the already-sent map. Maybe you will send 2 MB of data once, and then it's already known and won’t send it again. The only thing that broke was the block tester stuff. The block tester was using the mempool p2p message to find out things. If your mempool is big, it's a Dos vector. It's a privacy problem. And getdata has to go to the mempool not just the relay pool, which is a privacy problem. What stops us from turning it off? Most ... well maybe next release we say we're deprecating something. SPV wallets are using "inv" data ??? Send it through the bloom filter, that probably doesn't break anything, and if it breaks any of them then we should consider that acceptable at that point anyway.

Filtering and rate limiting won’t harm SPV clients. What if the SPV client gets blocked? Should it get disconnected if it breaks the rate limit? My observation is that no SPV clients are sending back-to-back mempool requests. I looked at violations or something, I used to believe that they only sent the requests at the start. But I found out that was untrue. It's any change with their key system. Some say it's only at the start and only once. I see one and then one tens minute later. If you send it within 1 minute, we just drop the connection immediately or something, that would be interesting.

Subsequent mempool requests won’t be needed in the case where you're trying to get descendants and all descendants are given in the mempool response. He's using a bloom filter, he's updating the filter and then does a mempool request. It's not an insane construction. If it's down your bip32 list, they should not have any payments to them anyway. If you had the same wallet in multiple devices at the same time, where something was in use down the bip32 list. The consequences of not doing mempool requests was not seeing zero-conf transactions anyway, but in SPV you should not be showing them.... people just think it looks cool. Seeing it immediately, people think that looks cool. You do need to know that your own coin is spent.

I have another application for something like the mempool command, but not the same thing. It comes up in compact block relay. If you are using this, then your hit rate is almost zero for the first several hours you're running because it has nothing in it. When you first start, prime your mempool with your mempool command. You want 1 MB of it or something. Mempool RPC that only gives you 2 MB of mempool. I have a patch where I added an RPC call that will send a mempool thing to the first node, while it gets results, it's relaying the other one. That's how I glue my mempools together. Patrick has a patch to do something similar to that too. If we want to have that, maybe it's a separate new p2p message. It's mempool synchronization. It's a pretty cool model. Every block, send the command again, it will always keep you a block ahead of the network. You could run a very small mempool like at 2 MB limited mempool on your node and still have a very high hit rate for the compact block stuff. You're not wasting bandwidth in doing this, there's no duplicate data between those calls, you're free-riding off their memory but not really their bandwidth. It's less bad. It's kinda cool. That could just be a new p2p message for mempool syncing. When bloom filters are disabled, we can disable mempool and turn off answering getdata queries out of anything but the relay map. We may actually have bugs in the relay map handling, that we're papering over by falling back to the mempool. We could look over this. This would be a step forward. If we get rid of the block tester stuff, then there's a bunch of other limitations that we can apply to the mempool RPC. We should still limit the filtered mempool excess to 1 per day or however we want to limit it, maybe 1 per 10 minutes, just not back to back. 1 per minute would probably be fine, too.

It should also protect the last N nodes who had relayed you a block, or a transaction that you accepted. The only reason not done yet is that the locking situation around the CNodes stuff is terrible. There's all sorts of unsafe data races already with the node stats. It's unclear what locks what, it's just terrible. For the stuff I would like to have reviewed tomorrow, I encapsulated that so you could see what locks what. The touches are only done in one place, so they can be restructured to be better. What I would do before taking the training wheels off the eviction, I would add a stat which is the last time when a block was accepted from this peer, and I would add a stat like last time a transaction was accepted from this peer and use that as eviction criteria. I think I even wrote a patch to do that. "What lock protects this?" Well... something else in that structure got updated by something a bit earlier, I assume it was protected by a lock. But no, it wasn't.

It's a pity that the bloom option was peer bloom filters instead of bloomspv. I'm just saying it looks bad because it's not the right name for the option. It will get bikeshedded. It's not used otherwise. When I went to complain about the mempool p2p message, Jeff's response was "take it out" and turn it off. The reason he wrote it was because Bitpay was monitoring transaction propagation in the network to determine double spend risks. You can hang up on someone if they immediately come back after disconnecting them from mempool p2p messages; it's really just the spy folks who are doing that. But you don't know if someone is doing "addnode" to you.

Legal agreement bit and hanging up if the other node doesn't agree

A few attorneys were-- where parties agreed not to surveil each other, a protocol handshake where they both agree to not surveil each other, that this would be enforceable. If you required, as part of the protocol, you send a message that you promise to not take the transaction data to monitor it or whatever, and as part of the handshake as part of the requirement would mean that this is enforceable, even if this is automated and there's no human participation. Has tor implemented this? Nope not yet. Many of the companies attacking the Bitcoin network with lots of monitoring, it's something that will concern them and maybe even stop them. It's pretty easy to change the code. But if you have it two ways where it's required, but I'll hang-up. I'll relay blocks, but not transactions because you didn't send the pledge. Make it something where you sign a message with your prune node key. It's probably not something we should do in the Bitcoin protocol right now. But perhaps in lightning or something. Having a separate high-latency mix network for transaction announcement, which is what people are trying to do- they are trying to figure out where it came from. It could be an additional non-technical measure. This could kinda work, if you are some random hacker you don't care, but the people like [...] etc would be gummed up. This can also apply to the "you reconnected too fast" kind of ...d if someone has addnoded you, maybe they also give a promise to not be malicious. I guess it would be "I'm not trying to monitor the network" promise. And if they didn't, you would ban them for reconnecting too quickly. Maybe a "signpdf" message for the p2p protocol, and then we can finally catch Craig Wright badsigning.

Bittorrent DHTs make it hard to poison them unless you have lots of IP addresses in different /8s, and there are companies that sell this as a service to get those IP addresses. Having centralized directories could help. ... there's things we can do to reduce abuse on the p2p network, but the spying attacks are bad.d Connection slot attacks, we have some good defenses. We could add hashcash on connect, that could be an additional boost to not getting disconnected. How about bitcoin on connect? Yeah. That's bound to happen at some point. Maybe. The criteria is... we need lightning there. The criteria of relaying and accepted... what you want to do for bitcoin on connect, is that you want to do a probabilistic payment probably, you pay bitcoin to something like a Chaum server, get a Chaum token, then go to a node and give it the Chaum token, the node goes back to the Chaum server and gets money in a payment channel. This allows you to not have a payment channel between the pair of nodes, not even a payment graph. How is that Chaum server going to comply with KYC? It's access fees. It's fixed access fees. It's de minimis, so it doesn't count.

Tor uses a centralized directory. And for onion routing with lightning, that's sounding like the one option. The way tor works is that they are monitoring the network and they know who are running those nodes. We can setup a transaction relay thing on tor, you connect to it, you give the transaction, it waits and then sends it. Your privacy at that point is no worse than tor. But that isn't solving the problem, it's hoping someone else solves the problem. One of the models thinking about, with the poisson delay stuff we're doing now, we could have lower poisson delays for tor peers, so you would tend to announce transactions first over tor by default. It fits nicely. The way that the code works, thanks for implementing it Pieter, you can have different delays for different peers. That was not a design consideration, but yeah it worked out. It doesn't right now, I wrote some code to do that a while back, shortest delay peer etc. It should prefer tor, then it should prefer longer time outbound connections. Right now it is a bias towards outbound. It relays faster towards outbound because you control those connections. This is increasing privacy at the expense of transaction propagation, it's a tradeoff.

We know when transactions are added to the mempool, we know when someone requested the data, so we can keep that data and not return new information. You can do it when they connect. This is used, the purpose is to .. to get things from before they connected. One of my patches did that, and I found it broke things. I also did a thing where I tried to do a thing where I put the mempool RPC on a 15 second time delay, it would return immediately with some quantizations so it wouldn't grow too rapidly. Pieter's delay for the mempool is better for that. If you have a simple filter like this, one that will have an aggressive response to a spammer mempool request, well you can get rid of the thing. If they keep spamming the request in order to move that timer forward. If you spam more than a few times in a row, well just drop them. When you do a mempool request, at the end of generating the request you set the time. And then you check the .... I suspect that's not going to be totally clean to implement. We know by the timestamp of when the network code received it, even if the processing is much later. In mempool lookup function, you can read the timeout right there and compare that to your mempool request found and return not found. That's good. That's cute. Yeah, I suspect people that are doing "getdata"-based transaction tracking wont realize that happened for a long time and their data will just go to crap.

We have a flag already for when a mempool request has been requested, although it's not latching. That's a good start. It's almost what we wanted. It's not protected by anything. I hope that's protected by something. There's definitely data races. The relay flag thing is probably actually unsafe. We might concurrently check the flag and also set it elsewhere at the same time. It might only be accessed from message processing, maybe some RPC accesses it?

bitcoinj has a thing where you only propagate a transaction to half your peers, then you listen on the other ones and see if it comes back. This also weakly helps with the spying thing too where you don't send it to all the peers. You don't want the originator of a transaction to behave no differently than an arbitrary node. Bitcoin Core should have a similar feature in its wallet. Everything is relayed with delays in Core now. Later on, there should be less of a difference between outbound and inbound peers. The rest of the inbounds can have 20 second delays or longer. There's no reason to have much faster than that for most of the peers, except for a few that are used much more quickly.

PGP things

pgp clearsign has this undocumented feature where if you both sign the message with the same message path, you can concatenate the two signatures and the signatures are valid. They have to sign the same message hash. You can make a single message with a bunch of signatures. If you run gpg verify on it, it checks out for the signatures. We did this for a letter to Debian to tell them to stop packaging our stuff. They wouldn't update bitcoin package, and it made it clear it didn't work on big endian hosts, because the tests were failing. And the earlier version was working because it compiled on big endian and didn't include tests back then. petertodd has done this for a multisig git commit. Fedora turns off tests when they notice them running, and they have done this for the bitcoin package.

Recent Bitcoin Core test performance

For bitcoin, the standard tests... well, something recently made them much slower. It's taking 10s of seconds for me now. "Recently" might be months. It used to be like 1 second total. This was just "make check", not RPC tests. It runs "test bitcoin". My comment was limited to that.

Perhaps there should be an option to run RPC tests from the makefile. It would be nice to parallelize this along with the others. I don't remember why it wasn't pushed.

OP_RANDOM

If you can get something like OP_RANDOM to work, then you don't need to have bitcoin mining in the first place and mining could be replaced.

"Malleability of the blockchain's entropy" https://eprint.iacr.org/2016/370.pdf

day 2 - 2016-05-21

GUI things and longpolling

There are a number of people that say "no, don't keep the GUI" but this is frustrating because there are many users that require the GUI. jonasschnelli wants to use a clone of bitcoin.git to create a separate wallet that uses RPC and the p2p protocol, while keeping most of the same build system.

Longpoll would be an efficient solution. You can go over p2p protocol, but then you have to setup handshakes. Every language has support for longpoll. You don't need zeromq or magic dependencies. I think it's more, hard to say, I would guess the guarantees are higher that they would receive your notification because the state is held in bitcoind. There's a memory limit, at the moment I guess it's a couple of megabytes, which can be configured. In zeromq, it's all behind the scenes and you don't know how to configure it. In RPC, there's sequence numbers. It can handle notify as well.

I would like in time to eliminate the notify calls. Calling into a shell is really ugly, it's insecure and super inefficient and tying up resources, potential for DoS vectors. You could provide a script that consumes zeromq of notify. It restricts how much bitcoind can be sandboxed. If you want to restrict the set of system calls is unsupported for as long as you're calling out to a shell. This is basically what all exploits do for the shell, too. The depreciation strategy could be removing it really quickly. We have to announce the deprecation in the release notes. We first have to use an alternative. Then we have to have both in the same release. With removing it will, always provide a python script that has the same output. It will be a three line script without dependencies. With zeromq you always have some annoying dependencies. It causes strange issues. All dependencies, most dependencies showed up to be kind of painful, like openssl, zeromq, boost is also kind of a showstopper on performance. It's painful for us, but also painful for people who want to use the interface.

We have a nice depends system now that builds all of our dependencies, which is less of an issue for all the people using the dependencies. People will have to learn how zeromq works, just to build... longpoll is really, it's, AJAX is essentially longpolling. Also the push notifications... it's a synchronous handler