DeathAndTaxes
Gerald Davis

What the "average user" needs to know about Transaction Mutability
February 11, 2014, 11:45:49 PM
Last edit: February 19, 2014, 08:40:01 PM by DeathAndTaxes



I am running the stock reference client (Bitcoin QT or bitcoind) and don't externally use tx ids for any backend processing. Am I still affected?



Yes, although how much this affects you will depend on your usage. The Bitcoin reference client as currently implemented (i.e. the QT client & bitcoind) does rely on transaction ids, which can produce some unexpected behavior for end users. Other major clients are likely affected to some degree; you should contact the developers of those clients for more information. To provide some reassurance, these issues do not result in a risk of losing funds, but they can lead to scenarios which are confusing and complicated to resolve. The long term goal should be immutable transaction ids (hashes), but a short term patch is also needed to make the behavior of the client predictable and intuitive.



The two major areas in which a "normal user" will encounter unexpected client behavior relating to mutable transactions are duplicate transactions being reported & unconfirmed change outputs breaking subsequent transactions.



What is the problem with duplicate transactions being reported?

The first issue is that the QT client is blissfully unaware of the fact that a transaction is a duplicate (same inputs, outputs, and fee) of another transaction, so it reports both txs to the user as if they are unique transactions. This can be confusing to the user. It may appear they received two payments, or that they accidentally spent coins twice. Not only is the transaction reported twice, the duplicate(s) are included in the balance reported to the user, which further reinforces the perception that the duplicates are additional unique transactions. This also appears to affect the balances in the "accounts" system of the reference client.
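To make "duplicate" concrete, here is a minimal Python sketch of how a wallet might recognize one. It is not how the reference client works; the dictionary layout, the `duplicate_key` and `group_duplicates` names and the sample data are all made up for illustration. The idea is simply to compare transactions by what they spend and what they create instead of by tx id.

```python
from collections import defaultdict

def duplicate_key(tx):
    """Identity of a transaction that ignores its (mutable) tx id.

    Same inputs and same outputs imply the same fee, so two transactions
    with an equal key are duplicates in the sense used in this post.
    """
    inputs = tuple(tx["inputs"])     # (previous tx id, output index) pairs
    outputs = tuple(tx["outputs"])   # (destination, value in satoshis) pairs
    return (inputs, outputs)

def group_duplicates(txs):
    """Return lists of tx ids that describe the same transfer."""
    groups = defaultdict(list)
    for tx in txs:
        groups[duplicate_key(tx)].append(tx["txid"])
    return [ids for ids in groups.values() if len(ids) > 1]

# Hypothetical wallet history: "Z" is a mutated copy of "A".
wallet_txs = [
    {"txid": "A", "inputs": [("P", 0)],
     "outputs": [("receiver", 60_000_000), ("change", 40_000_000)]},
    {"txid": "Z", "inputs": [("P", 0)],
     "outputs": [("receiver", 60_000_000), ("change", 40_000_000)]},
    {"txid": "C", "inputs": [("Q", 1)],
     "outputs": [("other", 10_000_000)]},
]
duplicates = group_duplicates(wallet_txs)
print(duplicates)
```

A client that only looks at the tx id sees three unrelated transactions here; grouping by the key shows that "A" and "Z" are the same payment.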



How can Transaction Mutability break transactions which use unconfirmed change outputs?

All transactions use as their inputs (money going in) the outputs of prior transactions. The reference client, as well as other major clients, does not allow users to spend the outputs of unconfirmed transactions. This is done for an obvious reason: the incoming transaction may never be confirmed, and that would prevent any subsequent transactions from confirming as well. If the clients allowed it, this could be chained to involve many people. Bob sends a coin to John, who uses it unconfirmed to send a coin to Sarah, who uses it unconfirmed to send a coin to Bill. If Bob's tx doesn't confirm (say Bob double spent John, or the tx was spammy with no fee) then all the other txs break.



Most clients however make an exception for unconfirmed change (i.e. output from a transaction back to the same wallet). In the absence of a third party intentionally mutating transaction hashes this presents no issue, as the wallet can be confident the transaction will eventually be confirmed. The active efforts of malicious third parties to mutate transactions, however, make that assumption flawed, and it should be changed.



Side note on inputs & outputs

If you understand how Bitcoin "really works" you can skip down to the example and conclusion. For those who are operating under the abstracted belief that Bitcoin works on "address balances", the rest won't make much sense without a little background on how Bitcoin "really works". This is a very high level explanation and a lot has been left out or abstracted away, but it isn't important to the discussion of the issue.

Forget about address or wallet balances. Bitcoin works on the concept of inputs and outputs. The input of every transaction is the output of some previous transaction. When a user "sends coins" to you, in that tx he is creating a new output which, until spent, can be used as the input of a new tx. The output is "locked" to your corresponding private key so it can only be spent by you. When you "spend" those "coins" you are creating a new transaction which uses up that output by putting it in the input of this new transaction, and in the process creates one or more new outputs. Those outputs can be used in subsequent transactions by the holder of the corresponding private key.

The process of creating a transaction "spends" a previously unspent output, which is referenced in the input side of the transaction, making it spent. There is no such thing as a half spent output. All outputs are either unspent (never included as an input in another transaction) or spent (included as an input in a transaction). Since an output can't be "half spent", when the value one is attempting to transfer is less than the value of the output(s) being spent, the client will add a new output equal to the difference which goes back to the user's wallet. This is commonly called "change" or a "change output".

If you had a 1 BTC unspent output and wanted to spend 0.6 BTC, your client would create a new transaction which includes a single input that references the 1 BTC unspent output and two outputs: one would have a value of 0.6 BTC to the recipient and the second would have a value of 0.4 BTC back to your wallet. So 1 BTC in and 1 BTC out. Prior to the transaction you had an unspent output with a value of 1 BTC. After the transaction that output would be considered spent by the network. Instead you would have a new unspent output with a value of 0.4 BTC. The receiver would also have an unspent output with a value of 0.6 BTC.
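The arithmetic of the 1 BTC example can be sketched in a few lines of Python. The `build_outputs` helper is hypothetical, and the fee defaults to zero only because the example above ignores fees. Values are kept as integer satoshis (1 BTC = 100,000,000 satoshis) to avoid floating point rounding.

```python
COIN = 100_000_000  # satoshis per BTC

def build_outputs(input_values, payment, fee=0):
    """Return (payment, change) in satoshis for the given inputs.

    The whole value of every input is consumed; whatever is not paid to
    the recipient (or left as fee) has to come back as a change output.
    """
    total_in = sum(input_values)
    change = total_in - payment - fee
    if change < 0:
        raise ValueError("inputs do not cover payment plus fee")
    return payment, change

# One 1 BTC unspent output, spending 0.6 BTC.
payment, change = build_outputs([1 * COIN], 60_000_000)
print(payment / COIN, change / COIN)
```

The 1 BTC output is gone after this transaction; the wallet now holds a brand new 0.4 BTC output instead.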



The key point to take away is that you don't transfer balances. A transaction "spends" one or more specific, unique unspent outputs by referencing them in the input side of the transaction, and then creates one or more new specific, unique outputs by stating the value and conditions for use in the output side of the transaction. All outputs in the bitcoin network are unique and are explicitly referenced in the transaction by their unique tx id and index.

The input portion of a transaction identifies the specific, unique output(s) being "spent" in the transaction by their unique combination of transaction id and index. The input of a transaction doesn't reference a generic value (e.g. "spend one of my BTCs at this address"); it references a specific output of a specific transaction (e.g. "spend output #0 of tx id 0e11592f65c349e44363362fe78791ccf3777da31b6bb0217422140212884356"). If the transaction id of the output being spent changes (and the changed version is included in a block) after the new transaction is created, then the new transaction is referencing a transaction id which, while valid, can never be confirmed (because only one of the transactions in a pair of duplicates can be included in a block and the "other one" already is).



At a protocol level the potential for an issue applies to any unconfirmed unspent output being spent in a new transaction. The protocol makes no distinction between "normal" and "change" outputs. Everything is an output. However all "stock" clients prevent the user from spending unconfirmed outputs received from outside the wallet, so unless you are running a modified or custom client the issue is practically limited to change outputs. The reference client (and AFAIK many others) exempts "change" outputs from the requirement of being confirmed before being spent. The tx id is being relied upon by the client while it is still mutable. The tx id of the change output being spent may be modified, and the chance of that happening is now greatly increased due to ongoing malicious modification of transactions by one or more unknown third parties.
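Why the id can change at all is easier to see with a small Python sketch. As far as I know, the tx id is the double SHA-256 hash of the whole serialized transaction, signature data included, displayed in reversed byte order. The byte strings below are made-up placeholders, NOT real Bitcoin serialization, and `tx_id` is just an illustrative helper.

```python
import hashlib

def tx_id(serialized_tx: bytes) -> str:
    """Double SHA-256 of the serialized bytes, shown in reversed byte order."""
    digest = hashlib.sha256(hashlib.sha256(serialized_tx).digest()).digest()
    return digest[::-1].hex()

# Placeholder bytes standing in for a serialized transaction.
# The inputs and outputs are identical in both versions; only the way
# the signature is encoded differs.
spend_and_create = b"in:P:0|out:receiver=60000000|out:change=40000000"
original = spend_and_create + b"|sig-encoding-1"
mutated = spend_and_create + b"|sig-encoding-2"

id_a = tx_id(original)
id_z = tx_id(mutated)
print(id_a)
print(id_z)
```

Both versions move the same coins to the same places, yet they hash to unrelated ids. Anything that stored or referenced the first id is now pointing at a transaction that may never be in a block.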



An example might help. Let's imagine a hypothetical user who has a single unspent output worth 1 BTC and spends 0.6 BTC.



A very simplified view of the tx would look something like this (non-relevant portions removed):

Code:
Tx Id: "A"
-------
Inputs
[0]: <prior tx hash> <prior tx index> Value: 1.0 BTC

Outputs
[0]: <PubKey Of Receiver> Value: 0.6 BTC <- Payment
[1]: <PubKey Of Change Address> Value: 0.4 BTC <- Change



Now this transaction is signed and hashed. The transaction id is the computed hash, which in this example is "A". The problem occurs if the user makes another spend before tx A is confirmed in a block. There is no method for a user to restrict the client to only use confirmed change outputs in a new transaction. So let's see what happens if the user creates a new tx and spends 0.3 BTC more.



Code:
Tx Id: "B"
-------
Inputs
[0]: Tx_Id: A Tx_Index: 1 Value: 0.4 BTC <- Note the reference to "A".

Outputs
[0]: <PubKey Of Second Receiver> Value: 0.3 BTC
[1]: <PubKey Of Second Change Address> Value: 0.1 BTC

Now this transaction is signed and hashed. The hash is "B".



Contrary to the popular understanding of "spending coins", the user isn't spending some generic 0.4 BTC "balance"; he is spending a specific unique output (id: "A", index: 1). If a third party mutates tx "A" so that it now has a tx id of "Z", and "Z" not "A" ends up in a block first, then transaction "B" will never confirm. The output it references (id: "A", index: 1) in the input side of the transaction can never be included in a block because "Z" already is.
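The A / Z / B scenario can be played through with a toy model in Python. The `Chain` class is a hypothetical stand-in for the network's set of unspent outputs, keyed by (tx id, index); it ignores signatures, values and everything else a real node checks.

```python
class Chain:
    """Tiny stand-in for the network's view of unspent outputs."""

    def __init__(self, unspent):
        self.unspent = set(unspent)   # {(txid, index), ...}
        self.confirmed = []

    def try_confirm(self, tx):
        # Every input must reference an output that exists and is unspent.
        if not all(ref in self.unspent for ref in tx["inputs"]):
            return False
        for ref in tx["inputs"]:
            self.unspent.remove(ref)
        for index in range(tx["n_outputs"]):
            self.unspent.add((tx["txid"], index))
        self.confirmed.append(tx["txid"])
        return True

chain = Chain({("P", 0)})   # "P" is the prior tx holding the 1 BTC output
tx_a = {"txid": "A", "inputs": [("P", 0)], "n_outputs": 2}
tx_z = {"txid": "Z", "inputs": [("P", 0)], "n_outputs": 2}  # mutated copy of A
tx_b = {"txid": "B", "inputs": [("A", 1)], "n_outputs": 2}  # spends A's change

z_ok = chain.try_confirm(tx_z)   # Z gets into a block first
a_ok = chain.try_confirm(tx_a)   # A now tries to spend an already spent output
b_ok = chain.try_confirm(tx_b)   # output (A, 1) never came into existence
print(z_ok, a_ok, b_ok)
```

Note that the change itself is not lost: it exists as output ("Z", 1). It is only tx "B", which was built against ("A", 1), that is stuck.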



Potential short term fixes:

The first issue is really just a reporting issue. The client is reporting data to the user which is confusing or inconsistent. Clients should still record duplicate transactions, but only a single tx in a duplicate pair should be shown to the user. When the client computes the balance of the wallet, duplicates (both received and sent) should be excluded from the total. Ultimately it doesn't matter which one of the duplicates is hidden as long as only one is shown. The client will need to handle the situation where it flags and hides one transaction in a pair of duplicates and that one ends up being the one which gets included in a block. For consistency the confirmed transaction should always be shown to the user and the unconfirmed one hidden. When this happens the user would see no change, other than that the tx id would change when the duplicate is included in a block.
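A rough Python sketch of that reporting rule follows. The data layout and the `visible_transactions` and `received_total` helpers are hypothetical and not taken from any client. Duplicates are recognized by having identical inputs and outputs, one tx per group is shown, and a confirmed tx always wins over an unconfirmed one.

```python
def visible_transactions(txs):
    """Keep one tx per duplicate group, preferring a confirmed one."""
    chosen = {}
    for tx in txs:
        key = (tuple(tx["inputs"]), tuple(tx["outputs"]))
        current = chosen.get(key)
        if current is None or (tx["confirmed"] and not current["confirmed"]):
            chosen[key] = tx
    return list(chosen.values())

def received_total(txs, my_destinations):
    """Sum outputs paid to this wallet, counting each duplicate group once."""
    return sum(
        value
        for tx in visible_transactions(txs)
        for destination, value in tx["outputs"]
        if destination in my_destinations
    )

# Hypothetical history: the wallet saw "A" first, then "Z" was confirmed.
outputs = [("me", 60_000_000), ("sender_change", 40_000_000)]
history = [
    {"txid": "A", "confirmed": False, "inputs": [("P", 0)], "outputs": outputs},
    {"txid": "Z", "confirmed": True, "inputs": [("P", 0)], "outputs": outputs},
]
shown = [tx["txid"] for tx in visible_transactions(history)]
total = received_total(history, {"me"})
print(shown, total)
```

Both transactions stay in `history`, so nothing is thrown away; the user just sees a single payment of 0.6 BTC whose id switched from "A" to "Z" once it confirmed.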



The second issue can't be fixed other than by preventing the use of unconfirmed change. The client could, either by default or by a user set option, block all unconfirmed outputs from being spent. The capability to do this already exists in clients, as they already block the spending of "non-change" unconfirmed outputs. Change would be treated the same as any other output. Just 1 confirm is sufficient to ensure the tx id won't change in most cases. In some situations this may result in the user not being able to make a second transaction until the change from the first one is confirmed. Wallets can preemptively seek to reduce this scenario by monitoring the number of available outputs. When the number of outputs is low, the client could use more than one change output in the next outbound transaction. This requires no protocol changes because the protocol has no concept of "change" and already supports having more than two outputs. A client could even prompt a user to authorize (will require the wallet to be unlocked) the creation of a "splitting" transaction if the number of outputs is very low and the value is very high. A wallet with three outputs with a value of one BTC each is more flexible than a wallet with only one output with a value of three BTC, so the wallet could recommend the user split the output.
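Both halves of that suggestion can be sketched in Python. This is only an illustration under simple assumptions: `select_confirmed` and `split_change` are hypothetical helpers, the selection strategy (largest first) is arbitrary, and fees and minimum output sizes are ignored.

```python
def select_confirmed(utxos, target):
    """Pick confirmed outputs (largest first) until target is covered."""
    confirmed = [u for u in utxos if u["confirmations"] >= 1]
    picked, total = [], 0
    for utxo in sorted(confirmed, key=lambda u: u["value"], reverse=True):
        picked.append(utxo)
        total += utxo["value"]
        if total >= target:
            return picked, total
    raise ValueError("not enough confirmed outputs; wait for a confirmation")

def split_change(change, pieces):
    """Divide change into `pieces` outputs that sum exactly to `change`."""
    base, extra = divmod(change, pieces)
    return [base + (1 if i < extra else 0) for i in range(pieces)]

# Hypothetical wallet: unconfirmed 0.4 BTC change plus a confirmed 3 BTC output.
utxos = [
    {"value": 40_000_000, "confirmations": 0},
    {"value": 300_000_000, "confirmations": 6},
]
picked, total = select_confirmed(utxos, 30_000_000)
change_outputs = split_change(total - 30_000_000, 3)
print(len(picked), change_outputs)
```

The unconfirmed 0.4 BTC change is skipped even though it would cover the payment by itself, and the change from the 3 BTC output comes back as three separate outputs, so the next few spends don't have to wait on each other.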



While these changes wouldn't make transactions immutable they would make the behavior of clients more consistent with the expectations of users and avoid confusing situations that can arise from transactions being mutated.







On edit: Changed title and introduction to reflect recent updates.



On edit2: Yes, mutability is a word. OO languages have a concept of mutable (changeable) vs immutable (unchangeable) objects. You can stop PMing me that the "correct" word is malleable. They are synonyms, and developers working in high level OO languages are more likely to use the word mutable over malleable.

http://en.wikipedia.org/wiki/Immutable_object

Early statements seemed to suggest that this issue was limited to custom implementations and services relying on the unconfirmed transaction id as proof of payment, and that normal users running standard clients were fine. This however isn't exactly the case anymore, with one or more unknown third parties spamming modified versions of any transactions they receive into the network. First let me say, none of this should be considered giving MtGox a pass. Their issues go beyond just transaction mutability and the way they handled the issue was just awful; however, we can't pretend this isn't an issue for average users.