Limitations of this methodology
Clustering cannot account for currency that remains unspent, including currency that a new wallet
has received but not yet spent. As long as funds are not spent, no common spend transactions can
be found, and no change heuristics can be applied.
In addition, common spend transactions are limited in number and change heuristics are only
partly effective, so clustering fails in many cases. Following the money trail on the blockchain
then becomes a daunting task whose complexity grows with every additional transaction the
suspect performs.
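The first limitation can be illustrated with a minimal sketch of common spend clustering. It assumes, as the common spend heuristic does, that addresses used together as inputs of one transaction belong to the same wallet. The transaction layout (dicts with "inputs" and "outputs" lists) and the function name are hypothetical, chosen only for illustration.

```python
from collections import defaultdict


def cluster_by_common_spend(transactions):
    """Group addresses that appear together as inputs of a transaction."""
    parent = {}

    def find(addr):
        # Register unseen addresses as their own cluster, then walk to the root.
        parent.setdefault(addr, addr)
        while parent[addr] != addr:
            parent[addr] = parent[parent[addr]]  # path halving
            addr = parent[addr]
        return addr

    def union(a, b):
        parent[find(a)] = find(b)

    for tx in transactions:
        # Every address is registered, but only co-spent inputs are merged.
        for addr in tx["inputs"] + tx["outputs"]:
            find(addr)
        for other in tx["inputs"][1:]:
            union(tx["inputs"][0], other)

    clusters = defaultdict(set)
    for addr in list(parent):
        clusters[find(addr)].add(addr)
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical example: C and E have received funds but never spent them.
example = [
    {"inputs": ["A", "B"], "outputs": ["C"]},
    {"inputs": ["B", "D"], "outputs": ["E"]},
]
print(cluster_by_common_spend(example))
```

In this example A, B and D end up in one cluster, while C and E each remain alone: until they spend, nothing links them to any other address.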
There are additional, more advanced methods for clustering blockchain data, based on “change
heuristics.” Many transactions on the blockchain have two outputs: the intended destination address and
the change address. The change address is an address owned by the sender, to which any change from the
transaction (i.e., the part of the input value that is neither paid to the destination nor spent as a fee) is
sent back, creating a new Unspent Transaction Output (UTXO) in the sender’s wallet. Determining which
output of a transaction is the actual destination and which is the change address helps with clustering,
because the input address and the change address can then be assigned to the same cluster. The most
notable change heuristics are:
• Round amounts - When a transaction with two outputs has one output with a round amount and another
with a non-round amount, it can be assumed that the round amount goes to the destination address,
whereas the non-round amount goes to the change address.
• Large/small amounts - In a transaction with two outputs, one with a very large amount and another with
a very small amount, the large output is likely being sent to the change address, since this pattern
typically arises when a small payment is made out of a large balance.
• Tagging - Using the tagging techniques presented above, it can be assumed, for example, that if the input
address is tagged as one service and one of the output addresses is tagged as a different service, then the
output address tagged as the same service as the input address is the change address, whereas the
output address tagged as the other service is the destination.
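The three heuristics above can be sketched as a simple voting function. This is an illustrative sketch only, not a description of any particular analytics product: the function name, the data layout (amounts in satoshis, an optional "tag" per output), and the thresholds `round_unit` and `size_ratio` are all assumptions, since the text does not define what counts as "round" or "very large."

```python
def guess_change_output(input_tags, outputs, round_unit=100_000, size_ratio=100):
    """Return the index (0 or 1) of the likely change output, or None.

    input_tags: set of service tags attached to the input addresses.
    outputs: two dicts with "amount" (satoshis) and an optional "tag".
    round_unit and size_ratio are assumed thresholds, not standard values.
    """
    if len(outputs) != 2:
        return None
    votes = []

    # Round amounts: the non-round output is taken to be the change.
    is_round = [o["amount"] % round_unit == 0 for o in outputs]
    if is_round[0] != is_round[1]:
        votes.append(1 if is_round[0] else 0)

    # Large/small amounts: the much larger output is taken to be the change.
    a0, a1 = outputs[0]["amount"], outputs[1]["amount"]
    if a0 >= size_ratio * a1:
        votes.append(0)
    elif a1 >= size_ratio * a0:
        votes.append(1)

    # Tagging: an output sharing a service tag with the inputs is the change,
    # provided the other output is tagged as a different service.
    tags = [o.get("tag") for o in outputs]
    same = [t is not None and t in input_tags for t in tags]
    different = [t is not None and t not in input_tags for t in tags]
    if same[0] and different[1]:
        votes.append(0)
    elif same[1] and different[0]:
        votes.append(1)

    # Only answer when at least one heuristic fired and none disagree.
    if votes and all(v == votes[0] for v in votes):
        return votes[0]
    return None


# Hypothetical transactions, amounts in satoshis.
round_case = [{"amount": 50_000_000}, {"amount": 1_234_567}]
conflict_case = [{"amount": 1_000_000_000}, {"amount": 1_234_567}]
tagged_case = [
    {"amount": 7_654_321, "tag": "ExchangeA"},
    {"amount": 2_345_678, "tag": "ExchangeB"},
]
print(guess_change_output(set(), round_case))
print(guess_change_output(set(), conflict_case))
print(guess_change_output({"ExchangeA"}, tagged_case))
```

In the second example the round-amount and large/small heuristics point at different outputs, so the sketch declines to answer. When an answer is returned, the chosen output address would be assigned to the same cluster as the input addresses.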
Blockchain Analytics