Notes / short topics

BlueMatt notes 2 out of 6 of last weeks blocked and review-needed PRs got merged, which can be better. Wumpus created a high-priority for review github project page for the PRs that are mentioned to have priority in the meetings.

There have been 11 merges tagged 0.14.1 and 3 open PRs. Once those are in 0.14.1 should be good to go.

Main topics

Slow unit tests

Dealing with abortnode / ConnectTip / DisconnectTip failures

High priority review

Slow unit tests

background

Bitcoin Core provides a makefile target check that runs the project’s unit tests. The project has over time written more and more integration tests executed through the RPC interface, which are run automatically by the Travis Continuous Integration (CI) server that tests every Bitcoin Core pull request and which can manually be run by executing, qa/pull-tester/rpc-tests.py

As previously discussed in the 2017-03-16 meeting, these tests as a whole might currently take too long.

Wumpus made an overview of the slowest unit tests. Some of them have already been worked on or have PRs to make them faster.

We could also introduce an -extended mode for the unit tests, which does extra thorough tests that shouldn’t run every time. The extended mode should be part of the release process (and ran by gitian) and/or once a day on master.

Jonasschnelli has a build server with a nice web UI which does gitian builds every day at https://bitcoin.jonasschnelli.ch/.

Jnewbery notes the Travis CI service is currently failing because we’ve set it to run the extended tests once per day, so we’re flushing out all the extended tests that have always failed on Travis. Once PR #10114 and #10072 are merged these daily runs should pass Travis.

meeting conclusion

For the standard make check have a guideline of maximum ~1 second per test case, have an extended mode for unit tests with more extensive tests.

Dealing with abortnode / ConnectTip / DisconnectTip failures

background

Sdaftuar has an open PR (#9208) which improves performance after a chain reorg(anization), where a node discovers a new longest valid chain which excludes a block previously thought as being the longest valid chain (which will be orphaned). Currently we try to add transactions from each orphaned block back to the mempool, even though it’s likely many of those transactions will reoccur in the newly discovered blocks. #9208 stores those transactions in a separate “disconnect pool” for later processing.

BlueMatt brought up some edgecases in cases when ConnectTip or DisconnectTip return false where we now assert() instead of AbortNode(). Some broader discussion ensues on when to use AbortNode() and when Abort()/assert() and what’s the best way to alert the user an error has occured. AbortNode() allows us to exit with a message to inform the user, so ideally only critical errors should result in Abort()/Assert().

meeting conclusion

Rename AbortNode() to ShutdownSoon() and make sure disk corruption uses something different.

High priority review

All the PRs tagged with 0.14.1 should have priority

Sipa adds he’d like to see #9792 (FastRandomContext improvements and switch to ChaCha20) get in at some point to further remove the dependency on OpenSSL.

Gmaxwell proposes to re-open his PR #9424 which changes the logging categories to boolean flags instead of strings. This would make the use of PR’s like #10123, which allows you to exclude certain components from the debug log, easier. Cfields adds he’d like to do something simular for net messages.

Comic relief

wumpus if BlueMatt can make it work faster that's great, but don't silently kill the program on every error gmaxwell wumpus: how about every other error? 9:48 BlueMatt so maybe the solution is AbortNode gets renamed to ShutdownSoon() and use make sure disk corruption is something different? ... 9:53 BlueMatt <BlueMatt> so maybe the solution is AbortNode gets renamed to ShutdownSoon() and use make sure disk corruption is something different? ... 9:57 BlueMatt ok, soooo, acks on:<BlueMatt> <BlueMatt> so maybe the solution is AbortNode gets renamed to ShutdownSoon() and use make sure disk corruption is something different? 9:58 jeremyrubin BlueMatt: maybe if you paste it again 9:58 BlueMatt jeremyrubin: ok, <BlueMatt> ok, soooo, acks on:<BlueMatt> <BlueMatt> so maybe the solution is AbortNode gets renamed to ShutdownSoon() and use make sure disk corruption is something different? jtimon it seems it's time to abort the meeting wumpus #endmeeting BlueMatt wumpus: we need to change that to #abort() gmaxwell But I wanted to cleanly flush!

Participants

Disclaimer

This summary was compiled without input from any of the participants in the discussion, so any errors are the fault of the summary author and not the discussion participants.