I've actually been thinking about this for almost a year now. Since 2013 I've had a bee in my bonnet about getting to a stack that is easy for newcomers to understand rather than for insiders to maintain. Initially I thought the answer lay in building the perfect programming language, so I built one based on Lisp. However, my language was incredibly slow, and at the same time I started to realize that language mechanisms alone didn't attack the real problem.

There's a blind spot in the way we program: we diligently encode the rules we want to automate, but leave gaping holes in the reasons why we chose the rules we did. When the reason for some design decision is not recorded in code, there's literally not enough information in a codebase for a newcomer to understand it deeply without talking to old hands. Entropy becomes inevitable, as old ways become cargo-culted because people can't decide if some seemingly-vestigial feature is truly obsolete, or a subtle regression waiting to happen.

Tests help encode some of the reasons, and their use has greatly expanded (with good reason) in the past decade. But we're not yet at the point where we can blindly deploy software if all automated tests pass. I decided it would be incredibly valuable to get there as quickly as possible, so that newcomers would be able to ask arbitrary what-if questions about a codebase and learn about it interactively by running it in various situations rather than just passively by trying to read its code. "Why is this line of code written like this?" Change it, run tests, see a failure, go read what the failing test does. "Oh, that's why." Or if no tests fail, then assume with confidence that it doesn't matter, and perhaps you found an improvement.

In chasing after this goal, I ended up creating a new language called Mu (named after the next step after lambda) which is:

a) designed to be easy to compile down to machine code. Statically typed, minimal impedance-mismatch with Assembly.

b) expressive enough to support co-evolving an OS along with the language, just like C co-evolved with Unix.

In the process I found a third benefit:

c) it's easier to teach to kids than either Lisp or C.

Lately, Mu has kinda reached a milestone where I've managed to gain confidence in several key mechanisms. The next step is to apply these lessons in a less "toy" context, something that we can use for building real stuff rather than just teaching kids. I've been at an impasse on how to do this because I get hung up on precisely the same property you mentioned: bootstrapping. I want a hierarchy of languages built using my layers, that starts from machine code and extends all the way up to a high-level language like Lisp. I want each layer to be gossamer-thin and easy to understand. I don't want the language to be "self-hosted"; building a compiler in its source language is great for geek cred but makes understanding it much harder. I want each layer to only use what has come before, allowing readers to go over layers sequentially.

Some days/weeks I think the way to achieve this goal is to start with OpenBSD and gradually reshape the C it's written in, without regard for backwards compatibility (though I'm not aiming for Mu in this thread). Other days/weeks I get obsessed with building a compiler to transform Mu into a simpler VM that can be efficiently interpreted. Mu is just so much safer than C. (Everything is bounds-checked and use-after-free errors are impossible, at a fraction of the implementation complexity of Rust but at the cost of runtime checks.)
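To give a feel for what "safety through runtime checks" can look like, here is a minimal sketch of one common technique: every handle carries an allocation id that is compared against the slot's current id on each access, and every index is checked against the allocation's length. This is an illustration under my own assumptions, not a description of how Mu implements it; the `Heap` class, its method names, and the exception names are all hypothetical.

```python
class UseAfterFree(Exception):
    pass


class OutOfBounds(Exception):
    pass


class Heap:
    """Toy allocator: handles are (slot, alloc_id) pairs checked on every access."""

    def __init__(self):
        self._ids = []      # current allocation id per slot (0 means free)
        self._data = []     # payload per slot
        self._free = []     # slot numbers available for reuse
        self._next_id = 1

    def allocate(self, size):
        alloc_id = self._next_id
        self._next_id += 1
        if self._free:
            slot = self._free.pop()
            self._ids[slot] = alloc_id
            self._data[slot] = [0] * size
        else:
            slot = len(self._ids)
            self._ids.append(alloc_id)
            self._data.append([0] * size)
        return (slot, alloc_id)

    def _check(self, handle):
        slot, alloc_id = handle
        if self._ids[slot] != alloc_id:
            raise UseAfterFree(f"handle {handle} no longer owns slot {slot}")
        return slot

    def _cells(self, handle, index):
        cells = self._data[self._check(handle)]
        if not 0 <= index < len(cells):
            raise OutOfBounds(f"index {index} outside 0..{len(cells) - 1}")
        return cells

    def read(self, handle, index):
        return self._cells(handle, index)[index]

    def write(self, handle, index, value):
        self._cells(handle, index)[index] = value

    def free(self, handle):
        slot = self._check(handle)   # also rejects double frees
        self._ids[slot] = 0
        self._data[slot] = None
        self._free.append(slot)


heap = Heap()
old = heap.allocate(3)
heap.write(old, 2, 7)
heap.free(old)
new = heap.allocate(3)   # reuses the same slot under a fresh id
try:
    heap.read(old, 2)    # stale handle: the ids no longer match
except UseAfterFree as err:
    print("caught:", err)
```

The cost is an extra comparison on every access, which is the runtime-check trade-off mentioned above. In exchange the checking logic is a few lines, with no static analysis needed.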

So: I don't know if this is what you might be asking for :) But I love talking about it, so feel free to ask me more questions and let's see if we can figure that out.

To answer your precise question, here are some interesting bootstrapping projects I've found:

In general, concatenative languages like Forth seem like a very useful milestone for a bootstrapped language.
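Part of the appeal is how little machinery a concatenative evaluator needs: a data stack, a dictionary of words, and a loop over whitespace-separated tokens. The sketch below is a toy in Python, not real Forth and not taken from any of the projects above; the primitive set and the `run` function are my own choices for illustration.

```python
def _binary(op):
    """Wrap a two-argument function as a stack operation."""
    def word(stack):
        right = stack.pop()
        left = stack.pop()
        stack.append(op(left, right))
    return word


def _dup(stack):
    stack.append(stack[-1])


def _swap(stack):
    stack[-1], stack[-2] = stack[-2], stack[-1]


def _drop(stack):
    stack.pop()


PRIMITIVES = {
    "+": _binary(lambda a, b: a + b),
    "-": _binary(lambda a, b: a - b),
    "*": _binary(lambda a, b: a * b),
    "dup": _dup,
    "swap": _swap,
    "drop": _drop,
}


def run(source, stack=None, words=None):
    """Evaluate a Forth-like program and return the resulting data stack."""
    stack = [] if stack is None else stack
    words = {} if words is None else words
    tokens = source.split()
    i = 0
    while i < len(tokens):
        token = tokens[i]
        if token == ":":
            # ": name body ;" defines a new word as a list of tokens.
            end = tokens.index(";", i)
            words[tokens[i + 1]] = tokens[i + 2:end]
            i = end + 1
            continue
        if token in words:
            run(" ".join(words[token]), stack, words)
        elif token in PRIMITIVES:
            PRIMITIVES[token](stack)
        else:
            stack.append(int(token))
        i += 1
    return stack


print(run(": square dup * ; 3 square 4 square +"))  # [25]
```

There is no parser to speak of and no expression tree, and new words are just sequences of existing ones. That is why something in this style looks like a plausible thin layer to put directly on top of machine code.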