Note: I wrote this as part of a discussion recently, and I think it makes sense to share it here. This is a lot of text though, so feel free to skip forward.

Indeed I have a private repo that I push to and that only my private CI picks up. Based on Buildbot, I run many more compilations there, basically around the clock on all of my computers, to find regressions from new optimization or code generation changes, and from UI changes too.

Public CI offerings like Travis are not aimed at allowing this many compilations. It will be a while before public cloud infrastructure is donated to Nuitka, although I see it happening some time in the future. This leaves developers with the burden of running tests on their own hardware, of which there is never enough. Casual contributors will never be able to do it themselves.

My scope is running the CPython test suites on Windows and Linux. These are the adapted 26, 27, 32, 33, 34, 35, 36, 37 suites. To cover even more error cases, they are also run with mismatching Python versions, so a lot of exceptions are raised. Often running the 36 tests with 37 and vice versa will extend the coverage, because of the exceptions being raised.
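
The mismatched runs can be pictured as a small matrix of (suite, Python version) pairs. What follows is only a minimal sketch: the suite names are the ones listed above, but the function name `build_test_matrix` and the rule of pairing each suite only with its direct neighbours of the same major version are assumptions made for illustration, not the actual Buildbot configuration.

```python
# Suites are named after the CPython version they were adapted from.
SUITES = ["26", "27", "32", "33", "34", "35", "36", "37"]


def build_test_matrix(suites, include_mismatched=True):
    """Return (suite, python_version) pairs to run.

    Every suite runs with its own version. Optionally it also runs with
    its direct neighbours of the same major version, which is a
    hypothetical pairing rule chosen for this sketch.
    """
    pairs = []
    for index, suite in enumerate(suites):
        pairs.append((suite, suite))
        if not include_mismatched:
            continue
        for neighbour_index in (index - 1, index + 1):
            if 0 <= neighbour_index < len(suites):
                other = suites[neighbour_index]
                # Only mix within one major version, e.g. 36 with 37.
                if other[0] == suite[0]:
                    pairs.append((suite, other))
    return pairs


if __name__ == "__main__":
    for suite, version in build_test_matrix(SUITES):
        print("CPython%s tests with Python %s.%s" % (suite, version[0], version[1]))
```

With this rule the 36 suite is run with 3.6, 3.5 and 3.7, while 27 is never paired with 32.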

On Windows I compile with and without debug mode, for x86 and x64, and it's kind of getting too much. For Linux I have 2 laptops in use, and an ARM CuBox bought from your donations. There it's working better, especially due to ccache being used everywhere, although recent investigations show room for improvement there as well.

For memory usage I still compile Mercurial and observe the memory used, in addition to comparing the results of its test suite to the expected outputs. It's a sad day when the Mercurial tests find changes in behavior, and luckily that has been rare. Running the Mercurial test suite gives some confidence that compiled code is not silently corrupting the data it works with.

Caching the CPython outputs of tests to compare against is something I am going to make operational these days, trying to make things ever faster. There is no point in re-running tests with Python just to get at its output, which will typically not change at all.
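
To illustrate the idea, here is a minimal sketch of such a cache. It assumes that the expected output depends only on the content of the test file and on the Python version running it. The names `cache_key` and `get_expected_output` and the JSON file layout are made up for this example and are not taken from the Nuitka test runner.

```python
import hashlib
import json
import os
import subprocess
import sys


def cache_key(test_path, python_version):
    """Hash the Python version together with the test file's content."""
    digest = hashlib.sha256()
    digest.update(python_version.encode("utf8"))
    digest.update(b"\0")
    with open(test_path, "rb") as test_file:
        digest.update(test_file.read())
    return digest.hexdigest()


def get_expected_output(test_path, cache_dir):
    """Return exit code and output of the test under CPython, cached on disk."""
    cache_file = os.path.join(cache_dir, cache_key(test_path, sys.version) + ".json")

    if os.path.exists(cache_file):
        # Cache hit: no need to start Python for this test again.
        with open(cache_file) as cached:
            return json.load(cached)

    # Cache miss: run the test with the current interpreter and record it.
    process = subprocess.run(
        [sys.executable, test_path],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
    )
    result = {
        "exit_code": process.returncode,
        "output": process.stdout.decode("utf8", "replace"),
    }

    os.makedirs(cache_dir, exist_ok=True)
    with open(cache_file, "w") as cached:
        json.dump(result, cached)
    return result
```

Editing a test changes its hash, so the stale entry is simply never looked up again. A test whose output depends on other files or on the environment would need those added to the key as well.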

For the time being, ccache.exe and clcache.exe seem to have done wonders for Windows too, but I will want to investigate some more to avoid unnecessary cache misses.

Workflow

As for my workflow with Nuitka, I often let some commits settle in my private repo only, until they become trusted. Other times I will make bigger changes and put them out to factory immediately, because it would be hard to split up the changes later, so putting them out makes it easier. I am more conservative with factory right after telling people to try something there. But I also break it on purpose, just trying out something. I really consider it a private branch for interacting with me or public CI. I do not recommend using it; it's like a permanent pull request of mine that is never going to be finished.

Then on occasion I sort all the commits on factory and split them into things that become hotfixes, things that become the current pre-release, and other things that remain in that proving ground. That is why I typically make a hotfix and a pre-release at the same time. The git flow suggests doing that and it's easy, so why not. As a bonus, develop is then practically stable at nearly all times too, with hardly any regressions.

I do however normally not take things as hotfixes that are on develop already, as I hate the duplication of commits. Hotfixes must be small, risk free, and easy to put out. When there is any risk, the change will definitely go to develop. Nuitka stable typically covers nearly all grounds already. There is no need to panic and add missing stuff while breaking other things.

Hunting bugs with bisect

For me, git bisect is very important. My private commit history is basically a total mess and worthless, but on factory I make nicely organized commits that I frequently amend, even for the random PyLint cleanup. When e.g. one test suddenly says "segfault" on Windows, this allows me to easily find the change that triggers it, look at the difference in the C code, spot the bug introduced, then amend the commit and be done with it. It's amazing how much time this can save.

My goal is to always have a workable state that is supposed to pass all tests. Obviously I cannot prove it for every commit, but when I know it not to be the case, I tend to rebase. At times I have been tempted to amend develop and even stable backwards, and have done so. I do that to be sure to keep that bisect ability, but fortunately that kind of bug occurs rarely, and I try not to do it.
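
The search itself can be automated with `git bisect run`, which judges each commit by the exit code of a command: 0 means good, 1 to 127 (except 125) means bad, 125 means "cannot be tested, skip", and anything else aborts the bisection. A crashing test does not exit with a tidy code, so a small wrapper helps. The sketch below is an assumption about how one could do it, not a script from the Nuitka repository; the names `classify` and `run_check` and the file name `bisect_check.py` are hypothetical.

```python
import subprocess
import sys

GOOD, BAD, SKIP = 0, 1, 125  # exit codes that "git bisect run" understands


def classify(returncode):
    """Map a test's exit status to a bisect verdict.

    A crash shows up as a negative value on POSIX (killed by a signal) or
    as a large value on Windows. Both are squeezed into the "bad" code,
    because git aborts the bisection on codes above 127.
    """
    return GOOD if returncode == 0 else BAD


def run_check(command):
    """Run the test command and return the verdict for this commit."""
    try:
        returncode = subprocess.call(command)
    except OSError:
        # The command could not even be started, so this commit tells us nothing.
        return SKIP
    return classify(returncode)


# Only act as a bisect helper when a test command was given.
if __name__ == "__main__" and sys.argv[1:]:
    sys.exit(run_check(sys.argv[1:]))
```

After `git bisect start`, `git bisect bad` and `git bisect good <known good commit>`, running `git bisect run python bisect_check.py <test command>` walks the history without further interaction. This only works well when every commit in the range is in a workable state, which is the reason for keeping the commits clean.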

Experimental Changes

As with recent changes, I sometimes make changes with the isExperimental() marker, activating breaking changes only gradually. The C bool type code generation was there for months in a barely useful form, always guarded by a switch, until it became more polished. Then one day, for 0.6, I finally switched it on, and made the necessary fixes retroactively before that switch, so that it worked while it was still in factory. After that, I remove the experimental code.

I feel it's very important, even ideal, to always be able to compare outputs to a fully working solution. I am willing to postpone some cleanups until a later date as the price. But when something in my mind tells me again "This cannot possibly have ever worked", the answer to compare against is just a command line flag away. Plus, that comparison includes the other changes that happened in the meantime, so they don't add noise to diffs of the generated C code, for example. Looking at that diff, I can tell where the unwanted effect is and fix all the things, and that way find bugs much faster.

Even better, if I decide to do a cleanup as part of making a change more viable, I get to execute it on stable grounds, covered by the full test suite. I can complete that cleanup on its own. For example, using variable identifier objects instead of mere strings was needed to make "heap generators" more workable. I was able to activate that one before "heap generators" were ever fully workable, complete it, and actually reap some of its benefits already.
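
The pattern can be shown with a toy example. Only the name `isExperimental` comes from the text above. The set holding the enabled experiments, the experiment name "c_bool", the helper `diff_experiment`, and the generated C lines are all invented for this sketch and do not reflect the real Nuitka code generator or its option handling.

```python
import difflib

# Hypothetical stand-in for what the command line flag would fill in.
_enabled_experiments = set()


def isExperimental(indication):
    """Tell whether the named experiment is switched on."""
    return indication in _enabled_experiments


def generate_condition_code(value_name):
    """Toy code generator with the old and the new path behind one switch."""
    if isExperimental("c_bool"):
        # New path: keep the truth value in a C bool (made-up C code).
        return ["bool tmp = is_true(%s);" % value_name, "if (tmp) {"]
    # Old, fully working path: keep it as an object (made-up C code).
    return ["object *tmp = truth_object(%s);" % value_name, "if (tmp == TRUE_OBJECT) {"]


def diff_experiment(indication, generator, *args):
    """Generate code with the experiment off and on, and return the diff."""
    _enabled_experiments.discard(indication)
    baseline = generator(*args)
    _enabled_experiments.add(indication)
    try:
        experimental = generator(*args)
    finally:
        # Leave the experiment off again, whatever happened.
        _enabled_experiments.discard(indication)
    return list(
        difflib.unified_diff(baseline, experimental, "stable", "experimental", lineterm="")
    )


if __name__ == "__main__":
    print("\n".join(diff_experiment("c_bool", generate_condition_code, "value")))
```

Because both paths live in the same source tree, the diff contains only what the switch changes, and unrelated changes made in the meantime show up on both sides and cancel out.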