wtogami



[FIXED] MacOS X LevelDB Corruption Bounty (10.00 BTC + 200.2 LTC) November 18, 2013, 03:26:40 AM #1

Last edit: December 29, 2013, 04:47:55 AM by wtogami

FIXED.



Can you fix the MacOS X Bitcoin LevelDB data corruption issue?

https://bitcointalk.org/index.php?topic=337294.msg3718821#msg3718821

TEST THESE BUILDS NOW!



Bounty Funding: 10.00 BTC + 200.2 LTC

Gavin Andresen has pledged 5 BTC. BitcoinTalk pledged 4 BTC. Public donations have contributed 1 BTC. Litecoin Dev Team pledges 200 LTC. The public is encouraged to contribute to these addresses to increase the incentive to fix this sooner.



BTC: 1FZ1mSJXj8aJqdpwUcpigLBqJLwtTu46fA

LTC: LS1Rb3bb29TA9PEVGR64bV2cLxC7RdQi8A

Conditions

The bounty may be awarded under the following conditions.



1. Document how anyone can consistently reproduce the data corruption.

2. Explain why it happens.

3. Write a code fix that is acceptable to the Bitcoin core developers and merged into Bitcoin.

The Bitcoin developers have ultimate deciding power over how to apportion the bounty award(s) based upon the merit of the contributions. This may encourage collaboration that leads to a fix rather than hoarding of information. Non-developers may be able to figure out #1.



These terms may be changed at any time for any or no reason.



Background

https://github.com/bitcoin/bitcoin/issues/2770

Since Bitcoin 0.8.x and the introduction of LevelDB, MacOS X users have been experiencing periodic LevelDB data corruption. For some Mac users it has never happened, while for others it happens frequently.



https://github.com/bitcoin/bitcoin/pull/2916

https://github.com/bitcoin/bitcoin/pull/3000

https://github.com/bitcoin/bitcoin/pull/2933

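Bitcoin master now contains two Mac-specific fsync patches and an upgrade to LevelDB 1.13. Bitcoin 0.8.5 OMG3 and Litecoin 0.8.5.2-rc5 contain these same patches. It is possible that a different Mac corruption issue was solved by these earlier patches, but users of these branches have reported continued corruption. Curiously, corruption seems to happen after a clean shutdown and restart of the client. All corruption reports seem to be from MacOS X 10.8.x and 10.9 users. It is unclear if earlier versions of MacOS X are affected. It is unknown if particular hardware or software configurations are involved.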



https://github.com/bitcoin/bitcoin/issues/2785

Corruption with the same error message is apparently capable of fixing itself. It is not clear whether this is true of the recent master branch.



GPG Signed message

Code:

-----BEGIN PGP SIGNED MESSAGE-----

Hash: SHA256



https://bitcointalk.org/index.php?topic=337294

These addresses contain public donations to be added to the Bitcoin MacOS X corruption fix bounty.

BTC: 1FZ1mSJXj8aJqdpwUcpigLBqJLwtTu46fA

LTC: LS1Rb3bb29TA9PEVGR64bV2cLxC7RdQi8A

-----BEGIN PGP SIGNATURE-----

Version: GnuPG v1.4.15 (GNU/Linux)



iQQcBAEBCAAGBQJSiaeFAAoJELEXnrc0fcENmRsf/3c/w53R2EHX62L+QimS96Rj

J+GPSpVQQRFOFr19OM+efjC1ydoZ3N/suYI1FynQ9nX4RzmCW5ZwbxMtl6wnEw7h

oIqv+ufnD0XEpkFr+g32JdoRNN2KprrMH4Cr2oLI0w+Oqv32jLveoRIqSzIArCId

U9ZVPcvFvKa9hWJrnM9KJQW6NgsGsKW3WBk5n/Wcbp4PYUn9ZC0taRMq2NbakSwk

RaNf6yFSC1wWb2dD6eE+1UiXBCidyK0cVUMkjCRoA0eRqZqy2cJwELmOrJ1RHlgP

6K9Y6MuelTPxhXNa/NNq/sVAbhOmtAeyJ5ApuTuvjd1gpKpS14bFEHY7yFf/dv7A

t0Z43xqQ8FVJ9HnYKY0T6d5W30L31bz5EZvhTQsa+IzfrQeBXGu1ecXM6dSlkcpf

KkJQdyLZ2W72roq+RjF5eOsLmlW9+Xyk7pMSn403oMlMY5EpJByAO8znomq0XEkq

UWPqfzjF2ptXGt6JqPdXx2La3w/jd+GNpHFsA65xZlcgYls/LXyq6483jDz3qPUS

L6WZJZh5BrE4yfmIcTh8LUdiVj7fzlZs3r7CKmD8pv3mtsLpqAZGNiFdK8uMuerp

h+2rPreMxGN7AqN28xdo5WOhqCAersoJQuwz3yQcGtXqnqcVTCBUCoaDpFxExlIK

BHKuGW6awyd1akgKz46aWjlDnWuJ94ZY90tkKPXtSe2XhMZHtq5gYzxpv6qEEFo4

ikDpxyaoDMK7GOdUW0FGY9ZSELWjuPSIwjip/5KN5Z51/TaUeiOQmhxQJLIHKNY3

SMj+wNJLb+FTdlOPBEqYAu3WPPG9ye73ADudt1N36ELLqFcvjsB1RzqntpogEHXR

T+I2VOTtbMvCPqbKdy5FijOERfjRIfrfXirovboLb/iP8ouhbuH7JHcj2niFshaL

i6MBAB2eTTh9LlNx3B1w/ESQuYJlR4NsHDiGmWQGHAEHw6LaCVT7MDh2fmag+1Jx

vDF2LdcCnRCgP5mSv+ZeJv7MvpeJ84UL3SlkB6iKZyD1+EJMyTB7f7xLbyWZSp+v

To7lqJBxk1PbqcRl9rYX7jdW4b4ztsr8FNxOvw5jxcPGZ0Mc9eb9ln6Nl+hx4PBv

jg4j4emg9uAPqRZn8KgJ1OL+wYE5Lw74mu3CP63pBmRVSl894janSUhKc4Z3ToF2

9kf81jVWudmRrVzQhiYA8vlrbC1Bc3nhlrX0KlF8VdREvptfV9PMbOAZdW96u4Mt

1lbqv2ZNWqxOon7Q3HKOcOo3uNvhv0sYItXSygZx5Z/chmBBRQrrJDCdHUw+WhR8

UGNsSL+Rz2vFeAc/W6jrlw3dId/wK+H36vDW8X4bSY6rVi+HhxZNoAPihUNNFy4=

=o/b5

-----END PGP SIGNATURE-----

If you appreciate my work please consider making a small donation.

BTC: 1LkYiL3RaouKXTUhGcE84XLece31JjnLc3 LTC: LYtrtYZsVSn5ymhPepcJMo4HnBeeXXVKW9

GPG: AEC1884398647C47413C1C3FB1179EB7347DC10D

Diapolo




Re: MacOS X LevelDB Corruption Bounty (5 BTC + 200.2 LTC) November 18, 2013, 11:17:47 AM #2

I created a pull request (not specific to this problem) which uses std::fstream instead of fopen() and such for reading and writing block/undo files. Perhaps it can help, since it works a little differently from the current code, dunno. I also added somewhat clearer exception error messages.

https://github.com/bitcoin/bitcoin/pull/3277

It's not intended to be merged into the master branch yet, and perhaps it never will be, but you can give it a try.

Dia

Liked my former work for Bitcoin Core? Drop me a donation via: 1PwnvixzVAKnAqp8LCV8iuv7ohzX2pbn5x
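For readers unfamiliar with the approach mentioned in post #2, here is a minimal sketch of reading and writing a binary file through C++ streams with explicit error reporting. It is not the code from the pull request. The function names WriteFile and ReadFile and the wording of the error messages are made up for illustration.

```cpp
#include <fstream>
#include <iterator>
#include <stdexcept>
#include <string>

// Hypothetical helper: write a whole buffer to a file, throwing on any failure.
void WriteFile(const std::string& path, const std::string& data) {
    std::ofstream out(path, std::ios::binary | std::ios::trunc);
    if (!out) throw std::runtime_error("WriteFile: cannot open " + path);
    out.write(data.data(), static_cast<std::streamsize>(data.size()));
    out.flush();  // hands the bytes to the OS; does not force them onto the disk
    if (!out) throw std::runtime_error("WriteFile: write failed for " + path);
}

// Hypothetical helper: read a whole file into memory, throwing on any failure.
std::string ReadFile(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    if (!in) throw std::runtime_error("ReadFile: cannot open " + path);
    std::string data((std::istreambuf_iterator<char>(in)),
                     std::istreambuf_iterator<char>());
    if (in.bad()) throw std::runtime_error("ReadFile: read failed for " + path);
    return data;
}
```

Note that flush() on a stream only passes the data to the operating system. Getting it onto the physical disk is a separate step, which is where the fsync discussion in this thread comes in.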

donal




Re: MacOS X LevelDB Corruption Bounty (5 BTC + 200.2 LTC) November 18, 2013, 10:29:11 PM #6

The Litecoin wallet was crashing for me, reporting DB corruption. If I open Terminal and enter



cd /Applications/Litecoin-Qt.app/Contents/MacOS



./Litecoin-Qt -reindex



it works.

These messages are then displayed in the terminal:



2013-11-18 19:57:36.821 Litecoin-Qt[991:507] CoreText performance note: Client called CTFontCreateWithName() using name "Arial" and got font with PostScript name "ArialMT". For best performance, only use PostScript names when calling this API.



2013-11-18 19:57:36.821 Litecoin-Qt[991:507] CoreText performance note: Set a breakpoint on CTFontLogSuboptimalRequest to debug.



2013-11-18 19:57:37.657 Litecoin-Qt[991:507] CoreText performance note: Client called CTFontCreateWithName() using name "Courier New" and got font with PostScript name "CourierNewPSMT". For best performance, only use PostScript names when calling this API.



Bismarck




Re: MacOS X LevelDB Corruption Bounty (5 BTC + 200.2 LTC) November 19, 2013, 02:27:23 AM #8



I'd like to point everyone's attention to this thread on the Litecoin forums:

https://forum.litecoin.net/index.php/topic,7147.msg55666.html#msg55666

I have an LTC wallet that doesn't play well with others. I have no problem being someone's guinea pig, as I'd really like to get it working again on my laptop.

For the new post: I DO have Time Machine enabled.

Just for consistency, here is the error that Litecoin-Qt keeps throwing:



Code:

Last login: Mon Nov 18 18:27:48 on ttys000

Bismarcks-MacBook-Pro-2:~ Bismarcks$ /Applications/Litecoin-Qt.app/Contents/MacOS/Litecoin-Qt ; exit;

2013-11-18 18:32:21.744 Litecoin-Qt[12289:507] CoreText performance note: Client called CTFontCreateWithName() using name "Arial" and got font with PostScript name "ArialMT". For best performance, only use PostScript names when calling this API.

2013-11-18 18:32:21.745 Litecoin-Qt[12289:507] CoreText performance note: Set a breakpoint on CTFontLogSuboptimalRequest to debug.

2013-11-18 18:32:21.748 Litecoin-Qt[12289:507] *** WARNING: Method userSpaceScaleFactor in class NSView is deprecated on 10.7 and later. It should not be used in new applications. Use convertRectToBacking: instead.

2013-11-18 18:32:27.518 Litecoin-Qt[12289:507] CoreText performance note: Client called CTFontCreateWithName() using name "Courier New" and got font with PostScript name "CourierNewPSMT". For best performance, only use PostScript names when calling this API.

Assertion failed: (pindexFirst), function GetNextWorkRequired, file ../litecoin/src/main.cpp, line 1149.

Abort trap: 6

logout



[Process completed]




whault




Re: MacOS X LevelDB Corruption Bounty (5.51 BTC + 200.2 LTC) November 20, 2013, 12:40:38 AM #11



Some observations. My setup uses two drives, one with the OS and a lower-speed one for general storage. I don't use Time Machine like the poster above, and there's nothing else non-standard about my software.

- only the blockchain stored on the internal SSD boot disk gets corrupted; a blockchain stored on the second SATA HDD is never corrupted

- corruption seems to happen most often after a system sleep (deep or not), though not always

- corruption can happen during the initial sync if it is stopped and then restarted

- corruption can happen with FileVault 2 turned on or off

- it has happened less often since updating to 10.9 (only twice so far instead of every few days), though it could just be chance

That's it really. No other behaviour is specific to corruptions for me. Sometimes they happen twice in a day, sometimes not for weeks.

moderate




Re: MacOS X LevelDB Corruption Bounty (5.51 BTC + 200.2 LTC) November 20, 2013, 02:42:39 AM #12

If anything, this should serve as a warning for picking up cool new shiny things.



I take it there was some discussion about why picking LevelDB was the right choice; surely it wasn't considered only because it performs faster than BDB and is developed at Google? After that, surely there was some good testing on various systems, since this is a very new low-level storage, yes?



I'm just mocking here, obviously. Good luck finding and fixing the issues.

wtogami




Re: MacOS X LevelDB Corruption Bounty (5.51 BTC + 200.2 LTC) November 20, 2013, 03:15:40 AM #13

Quote from: moderate on November 20, 2013, 02:42:39 AM

If anything, this should serve as a warning for picking up cool new shiny things.



I take it there was some discussion about why picking LevelDB was the right choice; surely it wasn't considered only because it performs faster than BDB and is developed at Google? After that, surely there was some good testing on various systems, since this is a very new low-level storage, yes?



I'm just mocking here, obviously. Good luck finding and fixing the issues.



It's working quite well on Linux and Windows. Also the old BDB corrupted on all platforms, although less often than Mac users experience this current issue.

behindtext




Re: MacOS X LevelDB Corruption Bounty (5.51 BTC + 200.2 LTC) November 20, 2013, 11:04:03 AM #14

Quote from: moderate on November 20, 2013, 02:42:39 AM

If anything, this should serve as a warning for picking up cool new shiny things.



I take it there was some discussion about why picking LevelDB was the right choice; surely it wasn't considered only because it performs faster than BDB and is developed at Google? After that, surely there was some good testing on various systems, since this is a very new low-level storage, yes?



the motivation for using leveldb over other dbs is that with large numbers of records, e.g. over roughly 10 mln, most "normal" dbs start to get really sluggish on inserts and selects. you can see the behavior for yourself by stuffing a ton of records into sqlite, mysql, psql, etc.



leveldb is not so much a db as a key-value store, which means that insert speed can be maintained even when there are a massive number of records, e.g. 250 mln. the "level" in leveldb refers to how it stores data: writes go to a log and an in-memory table first, and are later merged into sorted files organized in levels. the price you pay for the fast inserts is episodic compaction by leveldb. however, when doing selects/lookups on data that is already in leveldb, you may need several seeks, similar to more common databases.



the likely reason leveldb was chosen is that there aren't a ton of great choices for key-value stores. many of the key-value stores besides leveldb have only a few devs and may not be actively maintained. there are also many key-value stores that have questionable data integrity. using a dependency that goes unmaintained means having to change that dep out later, a giant PITA.



the reason the issue that is cited in this thread is so nasty is that not only does bitcoind use leveldb, it uses it in conjunction with flat file storage for the blocks. the act of storing data in flat files and referencing them in the db substantially increases the number and severity of error and failure paths in the combined structure (leveldb + flat file storage). as we can now see, hunting these bugs is very difficult.



perhaps something can be inferred from the way in which leveldb + blocks are corrupted. this would require a dev looking at the db and blocks after they have been hosed.
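One failure path of the kind described in post #14 is an index record that points at block data which never made it into the flat file. A minimal sketch of a consistency check for that case follows. The BlockPos struct and the FindDanglingRecords function are hypothetical and do not reflect Bitcoin's actual on-disk format.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical index record: where one block lives inside a flat file.
struct BlockPos {
    std::string key;   // e.g. a block hash
    uint64_t offset;   // byte offset of the block in the flat file
    uint64_t length;   // number of bytes the block occupies
};

// Return the keys of all index records that reach past the end of the flat file.
std::vector<std::string> FindDanglingRecords(const std::vector<BlockPos>& index,
                                             uint64_t fileSize) {
    std::vector<std::string> dangling;
    for (const BlockPos& pos : index) {
        // Compare without computing offset + length, which could overflow.
        const bool inside =
            pos.offset <= fileSize && pos.length <= fileSize - pos.offset;
        if (!inside) dangling.push_back(pos.key);
    }
    return dangling;
}
```

A check like this can only show that the two stores disagree. It cannot say which write was lost or why, which is the hard part of the bounty.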

Mike Hearn






Re: MacOS X LevelDB Corruption Bounty (5.51 BTC + 200.2 LTC) November 20, 2013, 01:15:57 PM #15



You can blame me for LevelDB. We switched to it because it was a large (>2x) speedup over BDB and performance is critical for Bitcoin, for obvious reasons. Also BDB sucks in lots of different ways and LevelDB is very well written.

We already know Apple have made some ... questionable ... decisions in their kernel with regard to fsync (hint: fsync doesn't). That was at least one source of corruptions, which we already fixed.

Given that rather astonishing approach to data integrity, there may well be other equally questionable decisions lurking under the covers. The fact that this only happens on MacOS and not on any other platform strongly suggests that Apple have done more than one bad thing.

I am wondering if there is something going wrong with mmap.

https://code.google.com/p/leveldb/issues/detail?id=196

The behaviour of mmap seems like it can sometimes be broken by kernel developers in subtle ways. I got a bug report for the Android app a few months ago which strongly implies that mmap on Motorola devices is broken in ways that can cause data corruption. I wonder if POSIX specifies its behaviour tightly enough.
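For context on the fsync remark in post #15: Apple's own fsync documentation says that on OS X the call hands data to the drive but does not guarantee the drive has written it to permanent storage, and it points to fcntl() with F_FULLFSYNC for stronger guarantees. A minimal sketch of a flush helper built on that follows. It is not the patch that was merged into Bitcoin, and the names FlushToStableStorage and WriteDurably are hypothetical.

```cpp
#include <cerrno>
#include <fcntl.h>
#include <string>
#include <unistd.h>

// Flush a file descriptor as durably as the platform allows.
// Returns true on success.
bool FlushToStableStorage(int fd) {
#if defined(F_FULLFSYNC)
    // OS X: ask the drive itself to flush its write cache.
    if (fcntl(fd, F_FULLFSYNC, 0) == 0) return true;
    // Not every filesystem supports F_FULLFSYNC, so fall back to fsync().
#endif
    return fsync(fd) == 0;
}

// Hypothetical helper: write a buffer to a file and flush it before closing.
bool WriteDurably(const std::string& path, const std::string& data) {
    const int fd = open(path.c_str(), O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0) return false;
    size_t written = 0;
    while (written < data.size()) {
        const ssize_t n = write(fd, data.data() + written, data.size() - written);
        if (n < 0) {
            if (errno == EINTR) continue;  // interrupted by a signal, retry
            close(fd);
            return false;
        }
        written += static_cast<size_t>(n);
    }
    const bool flushed = FlushToStableStorage(fd);
    return (close(fd) == 0) && flushed;
}
```

On platforms that do not define F_FULLFSYNC the helper simply compiles down to a plain fsync() call, so the same code can be built everywhere.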