Stephen Gornick



Offline



Activity: 2506

Merit: 1001







LegendaryActivity: 2506Merit: 1001 Re: How to detect the change output? November 27, 2012, 08:49:21 PM #2 Quote from: Meni Rosenfeld on November 27, 2012, 06:04:55 PM Any other suggestions?



The Bitcoin-Qt client will pull a new address from the key pool for change. Thus any amount sent to an address that already has been used in a transaction is unlikely to be change. Of course, with other clients, particularly Blockchain.info, this isn't necessarily the case.



There are some clues from anecdotes as well. For over-the-counter trades, quite often upon receipt of funds the recipient will turn around and spend those funds that were received as soon as there has been a confirmation.



This alone doesn't help much but because the coins chosen by Bitcoin-Qt client will first try to use coins for a transaction where all coins selected already have at least six confirmations, then if a transaction spends a coin with fewer than six confirmations it was only done so because there were insufficient funds in the wallet of coins with six or more confirmations.



So take a transaction A with outputs X and Y.

Then another transaction, (transaction B), spends X after 1 confirmation. More often than not, it would seem, the argument could be made that looking at both transactions together, that Y was the change in transaction A. The odds are much less likely to be that X was the change but that change was then used in another transaction from the same wallet shortly thereafter. It might even be likely that in transaction B, if there is more than one output that the smaller output is the change. A person that spends funds shortly after receiving them will generally be spending all of the funds received or at least a large chunk of it.

The Bitcoin-Qt client will pull a new address from the key pool for change. Thus any amount sent to an address that already has been used in a transaction is unlikely to be change. Of course, with other clients, particularly Blockchain.info, this isn't necessarily the case.There are some clues from anecdotes as well. For over-the-counter trades, quite often upon receipt of funds the recipient will turn around and spend those funds that were received as soon as there has been a confirmation.This alone doesn't help much but because the coins chosen by Bitcoin-Qt client will first try to use coins for a transaction where all coins selected already have at least six confirmations, then if a transaction spends a coin with fewer than six confirmations it was only done so because there were insufficient funds in the wallet of coins with six or more confirmations.So take a transaction A with outputs X and Y.Then another transaction, (transaction B), spends X after 1 confirmation. More often than not, it would seem, the argument could be made that looking at both transactions together, that Y was the change in transaction A. The odds are much less likely to be that X was the change but that change was then used in another transaction from the same wallet shortly thereafter. It might even be likely that in transaction B, if there is more than one output that the smaller output is the change. A person that spends funds shortly after receiving them will generally be spending all of the funds received or at least a large chunk of it. Uni change .me



█

█

█

█

█

█

█

█

█

█

█

█

█

█

█

█



Your trusted partner in e-currency exchange

Bitcoin, Perfect Money, FasaPay, CUP, Alipay, PayPal





kjj



Offline



Activity: 1302

Merit: 1001









LegendaryActivity: 1302Merit: 1001 Re: How to detect the change output? November 27, 2012, 09:18:34 PM #4 Quote from: Meni Rosenfeld on November 27, 2012, 06:04:55 PM 3. (Assuming no anonymity-targeted coin selection): The change must be lower in value than all inputs (otherwise that input would not have been selected). This is especially useful for transactions where multiple coins are combined.



This is a decent assumption most of the time, but once every few weeks, someone pops on IRC complaining that the coin selector overselected. Most recent was, if I recall, something like 0.5 + 0.2 + 0.3 + 0.01 -> 1.0 + 0.01



On the topic of tracking in general, I would just say, "Welcome to the arms race". The raw transaction API lets crazy people like me do silly things like create all transactions with 2 to 5 outputs of nearly equal size. The next step will be a public mix-sender. This is a decent assumption most of the time, but once every few weeks, someone pops on IRC complaining that the coin selector overselected. Most recent was, if I recall, something like 0.5 + 0.2 + 0.3 + 0.01 -> 1.0 + 0.01On the topic of tracking in general, I would just say, "Welcome to the arms race". The raw transaction API lets crazy people like me do silly things like create all transactions with 2 to 5 outputs of nearly equal size. The next step will be a public mix-sender. 17Np17BSrpnHCZ2pgtiMNnhjnsWJ2TMqq8

I routinely ignore posters with paid advertising in their sigs. You should too.

etotheipi





Offline



Activity: 1428

Merit: 1041





Core Armory Developer







LegendaryActivity: 1428Merit: 1041Core Armory Developer Re: How to detect the change output? November 27, 2012, 10:35:32 PM

Last edit: November 27, 2012, 10:52:50 PM by etotheipi #5 Quote from: Meni Rosenfeld on November 27, 2012, 06:04:55 PM Given a transaction, what methods exist to detect which of the outputs is most likely to be the change?



How do different clients choose the order of the outputs? Assuming no order information can be used, what kind of other heuristics can be used?





I spent a lot of time implementing coin selection in Armory. A few of your assumptions will be incorrect if the tx was created with Armory. Here's how Armory creates transactions:



(1) Coin-selection-evaluation method. This is a method that scores a coin-selection based on various factors: How many unique addresses are linked on the input side, the size of the transaction, the required fee, and... the output anonymity of the transaction (explained below). All factors are scored between 0 (unfavorable) and 1 (favorable).



(2) I generate a dozen deterministic coin selection methods that accommodate different, specific wallet structures. Those are added to the list. Then I generate some semi-random solutions (deterministic solutions like above, but with some perturbation).



(3) Score all the solutions in part (2) with the eval method in part (1). Default weights are hard-coded into armoryengine.py, but the user could modify them. It defines how important each factor is: most important is not using zero-conf. Next is whether the transaction can be free (not unrelated to zero-conf). Then number of input addresses linked. Then tx priority, tx size, and output anonymity. However, the user is free to make output anonymity most important (I do! but only for fun, not because I actually care ).



(4) A new address is retrieved from the wallet, a change TxOut is added to the recipient list, then the recipient list is shuffled. Transaction is signed and broadcast.



The parts you care about:



(1) If the transaction is small relative to the wallet balance, it a couple of the deterministic solutions are approx 2x the output value and 3x, so that change and recip are approximately the same or at least same order of magnitude. It may help clean up dust, it may help anonymity (but no guarantee for either). But when the tx amount is small relative to the wallet balance, you will frequently get equal or reversed size outputs, in random order. This violates your assumption that the change address is always smaller. Even if Armory didn't do it, a wallet with one very large TxOut, must have a large change output if you want to send only a small amount to someone. So even without Armory's technique, you'll see that assumption violated with any client.



(2) Recipients are always shuffled, though only with python RNG (don't need cryptographic RNG for this )



(3) The fun part of Armory's "output anonymity" scoring is that it actually takes into account the number of significant digits in both change output and recipient. Even though (recip, change) = (12.4, 11.8320385) are the same order of magnitude, it's pretty obvious which one is change which is output. As such, I actually count the number of significant digits, and give the selection a higher score if it's closer, like (12.4, 11.83). It gets an extra bonus -- a score actually above 1.0 -- if it finds a solution in which the change address has less sigfigs than the recip: (12.4, 10.0). i.e. deceptively-sized outputs is favored.



How often does (3) trigger? I don't know. I've tested the evaluation code, though, and it definitely scores those solutions higher, but I don't know how often such deceptive solutions are ever found/included. But if it ever happens, then your assumption about sig figs would be violated, as well.

I spentimplementing coin selection in Armory. A few of your assumptions will be incorrect if the tx was created with Armory. Here's how Armory creates transactions:(1) Coin-selection-evaluation method. This is a method that scores a coin-selection based on various factors: How many unique addresses are linked on the input side, the size of the transaction, the required fee, and... the output anonymity of the transaction (explained below). All factors are scored between 0 (unfavorable) and 1 (favorable).(2) I generate a dozen deterministic coin selection methods that accommodate different, specific wallet structures. Those are added to the list. Then I generate some semi-random solutions (deterministic solutions like above, but with some perturbation).(3) Score all the solutions in part (2) with the eval method in part (1). Default weights are hard-coded into armoryengine.py, but the user could modify them. It defines how important each factor is: most important is not using zero-conf. Next is whether the transaction can be free (not unrelated to zero-conf). Then number of input addresses linked. Then tx priority, tx size, and output anonymity. However, the user is free to make output anonymity most important (I do! but only for fun, not because I actually care).(4) A new address is retrieved from the wallet, a change TxOut is added to the recipient list,. Transaction is signed and broadcast.(1) If the transaction is small relative to the wallet balance, it a couple of the deterministic solutions are approx 2x the output value and 3x, so that change and recip are approximately the same or at least same order of magnitude. It may help clean up dust, it may help anonymity (but no guarantee for either). But when the tx amount is small relative to the wallet balance, you will frequently get equal or reversed size outputs, in random order. This violates your assumption that the change address is always smaller. Even if Armory didn't do it, a wallet with one very large TxOut,have a large change output if you want to send only a small amount to someone. So even without Armory's technique, you'll see that assumption violated with any client.(2) Recipients are always shuffled, though only with python RNG (don't need cryptographic RNG for this(3) The fun part of Armory's "output anonymity" scoring is that it actually takes into account the number of significant digits in both change output and recipient. Even though (recip, change) = (12.4, 11.8320385) are the same order of magnitude, it's pretty obvious which one is change which is output. As such, I actually count the number of significant digits, and give the selection a higher score if it's closer, like (12.4, 11.83). It gets an extra bonus -- a score actually above 1.0 -- if it finds a solution in which the change address has less sigfigs than the recip: (12.4, 10.0). i.e. deceptively-sized outputs is favored.How often does (3) trigger? I don't know. I've tested the evaluation code, though, and it definitely scores those solutions higher, but I don't know how often such deceptive solutions are ever found/included. But if it ever happens, then your assumption about sig figs would be violated, as well.

Armory Bitcoin Wallet: Bringing

Only use Armory software signed by the



(or donate directly via 1QBDLYTDFHHZAABYSKGKPWKLSXZWCCJQBX -- yes, it's a real address!) Founder and CEO of Armory Technologies, Inc. : Bringing cold storage to the average user!Only use Armory software signed by the Armory Offline Signing Key (0x98832223) Please donate to the Armory project by clicking here! (or donate directly via 1QBDLYTDFHHZAABYSKGKPWKLSXZWCCJQBX -- yes, it's a real address!)

keystroke



Offline



Activity: 897

Merit: 1007





advocate of a cryptographic attack on the globe







Hero MemberActivity: 897Merit: 1007advocate of a cryptographic attack on the globe Re: How to detect the change output? November 29, 2012, 04:51:50 AM #6 Quote from: etotheipi on November 27, 2012, 10:35:32 PM Quote from: Meni Rosenfeld on November 27, 2012, 06:04:55 PM Given a transaction, what methods exist to detect which of the outputs is most likely to be the change?



How do different clients choose the order of the outputs? Assuming no order information can be used, what kind of other heuristics can be used?





I spent a lot of time implementing coin selection in Armory. A few of your assumptions will be incorrect if the tx was created with Armory. Here's how Armory creates transactions:



(1) Coin-selection-evaluation method. This is a method that scores a coin-selection based on various factors: How many unique addresses are linked on the input side, the size of the transaction, the required fee, and... the output anonymity of the transaction (explained below). All factors are scored between 0 (unfavorable) and 1 (favorable).



(2) I generate a dozen deterministic coin selection methods that accommodate different, specific wallet structures. Those are added to the list. Then I generate some semi-random solutions (deterministic solutions like above, but with some perturbation).



(3) Score all the solutions in part (2) with the eval method in part (1). Default weights are hard-coded into armoryengine.py, but the user could modify them. It defines how important each factor is: most important is not using zero-conf. Next is whether the transaction can be free (not unrelated to zero-conf). Then number of input addresses linked. Then tx priority, tx size, and output anonymity. However, the user is free to make output anonymity most important (I do! but only for fun, not because I actually care ).



(4) A new address is retrieved from the wallet, a change TxOut is added to the recipient list, then the recipient list is shuffled. Transaction is signed and broadcast.



The parts you care about:



(1) If the transaction is small relative to the wallet balance, it a couple of the deterministic solutions are approx 2x the output value and 3x, so that change and recip are approximately the same or at least same order of magnitude. It may help clean up dust, it may help anonymity (but no guarantee for either). But when the tx amount is small relative to the wallet balance, you will frequently get equal or reversed size outputs, in random order. This violates your assumption that the change address is always smaller. Even if Armory didn't do it, a wallet with one very large TxOut, must have a large change output if you want to send only a small amount to someone. So even without Armory's technique, you'll see that assumption violated with any client.



(2) Recipients are always shuffled, though only with python RNG (don't need cryptographic RNG for this )



(3) The fun part of Armory's "output anonymity" scoring is that it actually takes into account the number of significant digits in both change output and recipient. Even though (recip, change) = (12.4, 11.8320385) are the same order of magnitude, it's pretty obvious which one is change which is output. As such, I actually count the number of significant digits, and give the selection a higher score if it's closer, like (12.4, 11.83). It gets an extra bonus -- a score actually above 1.0 -- if it finds a solution in which the change address has less sigfigs than the recip: (12.4, 10.0). i.e. deceptively-sized outputs is favored.



How often does (3) trigger? I don't know. I've tested the evaluation code, though, and it definitely scores those solutions higher, but I don't know how often such deceptive solutions are ever found/included. But if it ever happens, then your assumption about sig figs would be violated, as well.



I spentimplementing coin selection in Armory. A few of your assumptions will be incorrect if the tx was created with Armory. Here's how Armory creates transactions:(1) Coin-selection-evaluation method. This is a method that scores a coin-selection based on various factors: How many unique addresses are linked on the input side, the size of the transaction, the required fee, and... the output anonymity of the transaction (explained below). All factors are scored between 0 (unfavorable) and 1 (favorable).(2) I generate a dozen deterministic coin selection methods that accommodate different, specific wallet structures. Those are added to the list. Then I generate some semi-random solutions (deterministic solutions like above, but with some perturbation).(3) Score all the solutions in part (2) with the eval method in part (1). Default weights are hard-coded into armoryengine.py, but the user could modify them. It defines how important each factor is: most important is not using zero-conf. Next is whether the transaction can be free (not unrelated to zero-conf). Then number of input addresses linked. Then tx priority, tx size, and output anonymity. However, the user is free to make output anonymity most important (I do! but only for fun, not because I actually care).(4) A new address is retrieved from the wallet, a change TxOut is added to the recipient list,. Transaction is signed and broadcast.(1) If the transaction is small relative to the wallet balance, it a couple of the deterministic solutions are approx 2x the output value and 3x, so that change and recip are approximately the same or at least same order of magnitude. It may help clean up dust, it may help anonymity (but no guarantee for either). But when the tx amount is small relative to the wallet balance, you will frequently get equal or reversed size outputs, in random order. This violates your assumption that the change address is always smaller. Even if Armory didn't do it, a wallet with one very large TxOut,have a large change output if you want to send only a small amount to someone. So even without Armory's technique, you'll see that assumption violated with any client.(2) Recipients are always shuffled, though only with python RNG (don't need cryptographic RNG for this(3) The fun part of Armory's "output anonymity" scoring is that it actually takes into account the number of significant digits in both change output and recipient. Even though (recip, change) = (12.4, 11.8320385) are the same order of magnitude, it's pretty obvious which one is change which is output. As such, I actually count the number of significant digits, and give the selection a higher score if it's closer, like (12.4, 11.83). It gets an extra bonus -- a score actually above 1.0 -- if it finds a solution in which the change address has less sigfigs than the recip: (12.4, 10.0). i.e. deceptively-sized outputs is favored.How often does (3) trigger? I don't know. I've tested the evaluation code, though, and it definitely scores those solutions higher, but I don't know how often such deceptive solutions are ever found/included. But if it ever happens, then your assumption about sig figs would be violated, as well.

Awesome. Awesome. "The difference between a castle and a prison is only a question of who holds the keys."

casascius

VIP

Legendary



Offline



Activity: 1386

Merit: 1064





The Casascius 1oz 10BTC Silver Round (w/ Gold B)







Mike CaldwellVIPLegendaryActivity: 1386Merit: 1064The Casascius 1oz 10BTC Silver Round (w/ Gold B) Re: How to detect the change output? December 02, 2012, 05:14:33 AM #8 Quote from: etotheipi on December 02, 2012, 05:05:56 AM For instance, I just select all addresses that have balances with more than 2 decimal places, then I create a transaction sending their combined balance: the integer part goes back to me, the non-integer part gets donated to another project.



Funny I'm not the only one who does this. I normally work mostly with whole bitcoins (since my lowest physical coin is 1 BTC anyway), and then I use coin crumbles to make paper banknote wallets worth 0.1 BTC and carry them around to leave in tip jars etc.



Anyway, one useful (but less useful in a deterministic sense) heuristic when following the blockchain is to try and correlate behavior as well. If I notice consistent behavior following a chain one way, it's probably the same person. This is what led to me providing what was later determined to be a critical clue as to Goat's lost 400 BTC a while back: I notice that if I followed the chain a certain way I could see a pattern of a sender who frequently sent bitcoins in amounts like 3.995 1.995 8.995 which led me to suggest that it was a service that charged a 0.005 BTC transaction fee and took it out of the transaction amount. Funny I'm not the only one who does this. I normally work mostly with whole bitcoins (since my lowest physical coin is 1 BTC anyway), and then I use coin crumbles to make paper banknote wallets worth 0.1 BTC and carry them around to leave in tip jars etc.Anyway, one useful (but less useful in a deterministic sense) heuristic when following the blockchain is to try and correlate behavior as well. If I notice consistent behavior following a chain one way, it's probably the same person. This is what led to me providing what was later determined to be a critical clue as to Goat's lost 400 BTC a while back: I notice that if I followed the chain a certain way I could see a pattern of a sender who frequently sent bitcoins in amounts like 3.995 1.995 8.995 which led me to suggest that it was a service that charged a 0.005 BTC transaction fee and took it out of the transaction amount. Companies claiming they got hacked and lost your coins sounds like fraud so perfect it could be called fashionable. I never believe them. If I ever experience the misfortune of a real intrusion, I declare I have been honest about the way I have managed the keys in Casascius Coins. I maintain no ability to recover or reproduce the keys, not even under limitless duress or total intrusion. Remember that trusting strangers with your coins without any recourse is, as a matter of principle, not a best practice. Don't keep coins online. Use paper or hardware wallets instead.

Hal

Sr. Member





Offline



Activity: 314

Merit: 1124









VIPSr. MemberActivity: 314Merit: 1124 Re: How to detect the change output? December 14, 2012, 01:43:57 AM #9 bitcoin-qt tries to randomize the position of the change output, but I believe the code has a flaw:



// Insert change txn at random position:

vector<CTxOut>::iterator position = wtxNew.vout.begin()+GetRandInt(wtxNew.vout.size());

wtxNew.vout.insert(position, CTxOut(nChange, scriptChange));



The problem is that size() is one in the common case of one payee, so GetRandInt will always return 0.The change ends up in the first output.



I think it should be size()+1. Hal Finney

MoonShadow



Offline



Activity: 1708

Merit: 1000









LegendaryActivity: 1708Merit: 1000 Re: How to detect the change output? December 14, 2012, 02:00:06 AM #10 I wonder if having a choice in the automatic selection methods would be beneficial as well. Consider this scenario.



Say I have a full desktop client, as well as a thin hardware client (think bitcoincard) that syncs with the desktop client on a regular basis. So I want my desktop client to automaticly create simple transactions that the thin hardware client can use easily, and preferablely in combinations that don't have a change address at all. (Thin hardware clients most likely would have only one, or only a few, addresses to work with, change addresses would be trivial to identify if they go back to the sending address)



So what I would want my desktop client to do is to regularly produce transactions wherein I have two change transactions; one for the thin client, in the useful denominations, and one with the true change back to the desktop client. One time, the thin client might get a .01 transaction, another time it might get .02, and another .05; up through .1|.2|.5|1|2|5 so that it would have a selction of transactions that would permit it to combine a spending transaction to any exact amount, or almost any exact amount. A full client could also use this same technique to, occasionally, produce a changeless transaction as well. I'm not sure if this would be useful or not, but I would guess that it might complicate 'taint' tracking systems. "The powers of financial capitalism had another far-reaching aim, nothing less than to create a world system of financial control in private hands able to dominate the political system of each country and the economy of the world as a whole. This system was to be controlled in a feudalist fashion by the central banks of the world acting in concert, by secret agreements arrived at in frequent meetings and conferences. The apex of the systems was to be the Bank for International Settlements in Basel, Switzerland, a private bank owned and controlled by the world's central banks which were themselves private corporations. Each central bank...sought to dominate its government by its ability to control Treasury loans, to manipulate foreign exchanges, to influence the level of economic activity in the country, and to influence cooperative politicians by subsequent economic rewards in the business world."



- Carroll Quigley, CFR member, mentor to Bill Clinton, from 'Tragedy And Hope'

Gavin Andresen





Offline



Activity: 1652

Merit: 1066





Chief Scientist







LegendaryActivity: 1652Merit: 1066Chief Scientist Re: How to detect the change output? December 14, 2012, 02:30:16 AM #11 Quote from: Hal on December 14, 2012, 01:43:57 AM bitcoin-qt tries to randomize the position of the change output, but I believe the code has a flaw...

Ay carumba, how did we not notice that for over two years?



I introduced that bug with the 'sendmany' command two years ago (commit b9d1ed85). This is why programmers should not be trusted to test their own code (I probably carefully tested to make sure the change position looked random when I send to more than one destination, and never tested the degenerate send-to-one case; sigh).



Ay carumba, how did we not notice that for over two years?I introduced that bug with the 'sendmany' command two years ago (commit b9d1ed85). This is why programmers should not be trusted to test their own code (I probably carefully tested to make sure the change position looked random when I send to more than one destination, and never tested the degenerate send-to-one case; sigh). How often do you get the chance to work on a potentially world-changing project?

keystroke



Offline



Activity: 897

Merit: 1007





advocate of a cryptographic attack on the globe







Hero MemberActivity: 897Merit: 1007advocate of a cryptographic attack on the globe Re: How to detect the change output? December 14, 2012, 07:26:04 AM #12 Quote from: Gavin Andresen on December 14, 2012, 02:30:16 AM Quote from: Hal on December 14, 2012, 01:43:57 AM bitcoin-qt tries to randomize the position of the change output, but I believe the code has a flaw...

Ay carumba, how did we not notice that for over two years?



I introduced that bug with the 'sendmany' command two years ago (commit b9d1ed85). This is why programmers should not be trusted to test their own code (I probably carefully tested to make sure the change position looked random when I send to more than one destination, and never tested the degenerate send-to-one case; sigh).





Ay carumba, how did we not notice that for over two years?I introduced that bug with the 'sendmany' command two years ago (commit b9d1ed85). This is why programmers should not be trusted to test their own code (I probably carefully tested to make sure the change position looked random when I send to more than one destination, and never tested the degenerate send-to-one case; sigh).

Oh wow, so the change was always the first output if you send bitcoins to 1 address? Awesome find. But yea.. how did we miss it. Oh wow, so the change was always the first output if you send bitcoins to 1 address? Awesome find. But yea.. how did we miss it. "The difference between a castle and a prison is only a question of who holds the keys."

molecular

Legendary



Offline



Activity: 2744

Merit: 1016









DonatorLegendaryActivity: 2744Merit: 1016 Re: How to detect the change output? December 14, 2012, 11:44:52 AM #16 Quote from: Hal on December 14, 2012, 01:43:57 AM bitcoin-qt tries to randomize the position of the change output, but I believe the code has a flaw:



// Insert change txn at random position:

vector<CTxOut>::iterator position = wtxNew.vout.begin()+GetRandInt(wtxNew.vout.size());

wtxNew.vout.insert(position, CTxOut(nChange, scriptChange));



The problem is that size() is one in the common case of one payee, so GetRandInt will always return 0.The change ends up in the first output.



I think it should be size()+1.



Holy moly! That means we can get a much better approximation of transaction volume Holy moly! That means we can get a much better approximation of transaction volume PGP key molecular F9B70769 fingerprint 9CDD C0D3 20F8 279F 6BE0 3F39 FC49 2362 F9B7 0769