
[alpine-devel] Report from Reproducible builds summit 2018

From: Natanael Copa
Message-ID: <20181217133328.4dd1ef26@ncopa-desktop.copa.dup.pw>
Sender timestamp: 1545050008

Hi,

I attended the reproducible builds[1] summit in Paris last week, and wanted to give a short report on what I learned there and share some thoughts on reproducible builds for Alpine.

I went to the summit because I think we should make it a long-term goal to make Alpine reproducibly built, and I wanted to learn from people with experience what to expect, and to make a plan for how Alpine can get there.

The summit in Paris was nicely organized, with zero PowerPoint presentations. Instead, we were divided into smaller groups and had a number of group discussions and work sessions, where everyone was encouraged to participate.

The notes from the sessions are here:
https://pad.riseup.net/p/reproduciblebuilds4-agenda

I tried to start discussions about bootstrapping rust and how to deal with golang packaging, but people didn't seem to be too interested in that.

Some takeaway points for Alpine:

* We need a way to make older packages available, so that it is possible to rebuild the exact same install (or Docker image) later. Different distros solve this in different ways. I was told Fedora has some archive where they save all older packages. I was told Debian uses some sort of (filesystem?) snapshot archive. I have a couple of ideas for how we could provide this.

* In order to make Alpine reproducibly built, it would be good to have a 3rd party rebuild all of our packages and compare them with the official packages. kpcyrd from Arch Linux worked on adding Alpine to https://tests.reproducible-builds.org and promised to follow up on that.

* There are various tools that can compare different binaries to figure out why and what differs. I started to work on packaging diffoscope for Alpine, but bumped into various failures in the test suite. One was a bug in libmagic from file(1), and this is now fixed. There were two other failures, and with some help from the diffoscope developers they are also fixed now.

* The work done by Suse shows that most packages will likely not need any patching. The number I was given is that about 500 of 10000 packages needed patching for Suse. Bernhard from Suse has also documented various common issues[2] (with suggested fixes). He also has a tool[3] to monitor package versions from different distros, similar to release-monitoring.org. Alpine has been added.

I think we should try to focus on the v3.9 release now. Once v3.9 is out I would like to discuss how we can make Alpine reproducibly built.

Just mentioning some points before I forget:

* We may need to store the exact versions and/or hashes of the dependencies used when a package was built. I am not sure where we want to store this. Maybe in the APKINDEX?

* We embed the signature in the .apk, which means it's not possible to re-create the exact same .apk without having access to the private key. I'm not sure how to deal with that.

* I learned about this thing called IPFS[4], which may be worth a closer look.

Now, let's get v3.9 out....

-nc

[1]: https://reproducible-builds.org/events/paris2018/
[2]: https://github.com/bmwiedemann/theunreproduciblepackage
[3]: https://maintainer.zq1.de/
[4]: https://ipfs.io

---
Unsubscribe: alpine-devel+unsubscribe@lists.alpinelinux.org
Help: alpine-devel+help@lists.alpinelinux.org
---
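As a rough illustration of the "why and what differs" comparison mentioned in the report, here is a minimal Python sketch that compares two tar archives member by member. It is not how diffoscope works internally; the function names `describe_members` and `diff_archives` are made up for this example, and it only looks at a few metadata fields plus a content hash.

```python
import hashlib
import io
import tarfile


def describe_members(blob: bytes) -> dict:
    """Map each member name to the metadata fields and content hash we compare."""
    out = {}
    # Mode "r" auto-detects compression (plain tar, gzip, ...).
    with tarfile.open(fileobj=io.BytesIO(blob), mode="r") as tar:
        for member in tar:
            content = tar.extractfile(member) if member.isfile() else None
            out[member.name] = {
                "mode": member.mode,
                "uid": member.uid,
                "gid": member.gid,
                "mtime": member.mtime,
                "size": member.size,
                "sha256": hashlib.sha256(content.read()).hexdigest() if content else None,
            }
    return out


def diff_archives(first: bytes, second: bytes) -> list:
    """Return human-readable lines describing how two archives differ."""
    left, right = describe_members(first), describe_members(second)
    report = []
    for name in sorted(left.keys() | right.keys()):
        if name not in left or name not in right:
            side = "first" if name in left else "second"
            report.append(f"{name}: only in {side} archive")
            continue
        for field in left[name]:
            if left[name][field] != right[name][field]:
                report.append(
                    f"{name}: {field} differs ({left[name][field]} != {right[name][field]})"
                )
    return report
```

An empty report means the two archives agree on every compared field. A report that lists only `mtime` lines points at embedded timestamps rather than at the file contents.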

From: Chloe Kudryavtsev
Message-ID: <1a664e98-3f41-5503-60af-98865c0b785f@toastin.space>
In-Reply-To: <20181217133328.4dd1ef26@ncopa-desktop.copa.dup.pw>
Sender timestamp: 1545106061

On 12/17/18 7:33 AM, Natanael Copa wrote:
> * We may need to store the exact versions and/or hashes of the
>   dependencies used when a package was built. I am not sure where we
>   want to store this. Maybe in the APKINDEX?

I think this is a good idea. Mostly a note in regards to the next comment.

> * We embed the signature in the .apk, which means it's not possible to
>   re-create the exact same .apk without having access to the private
>   key. I'm not sure how to deal with that.

I do not believe we need to allow for that.
Since we want to store exact versions/hashes of dependencies in the .apk, I believe we can also store a hash of the resulting tree, pre-signature (meaning we sign the hash as well).
This hash should be visible using apk(1), to allow people to programmatically verify that two .apks are the same internally, and it guarantees the integrity of the hash in mirrors.
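The "hash of the resulting tree" idea can be sketched in Python as below. This is only an illustration under assumptions: the function name `tree_hash` and the exact set of inputs fed to the hash (relative path, mode bits, file contents or symlink target) are choices made for this example, not something apk(1) or abuild defines. Timestamps and ownership are deliberately left out here.

```python
import hashlib
import os
import stat


def tree_hash(root: str) -> str:
    """Hash a directory tree in a way that does not depend on walk order or mtimes."""
    paths = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            paths.append(os.path.relpath(os.path.join(dirpath, name), root))

    digest = hashlib.sha256()
    # Sorting makes the result independent of the order the filesystem returns entries.
    for rel in sorted(paths):
        full = os.path.join(root, rel)
        st = os.lstat(full)
        # st_mode carries both the file type and the permission bits.
        digest.update(f"{rel}\0{st.st_mode:o}\0".encode())
        if stat.S_ISLNK(st.st_mode):
            digest.update(os.readlink(full).encode() + b"\0")
        elif stat.S_ISREG(st.st_mode):
            with open(full, "rb") as fh:
                digest.update(hashlib.sha256(fh.read()).digest())
    return digest.hexdigest()
```

Two independent builds whose package trees have the same paths, modes, and contents would produce the same value, regardless of who signs the resulting package.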

From: Oliver Smith
Message-ID: <9bc897d8-6527-fa55-0f94-89c0722c4a3f@bitmessage.ch>
In-Reply-To: <20181217133328.4dd1ef26@ncopa-desktop.copa.dup.pw>
Sender timestamp: 1545121620

Hello Natanael and ML,

I'm glad to read about this, thank you for this writeup!

I looked into reproducible builds myself last year, and even had a proof of concept with a few packages. The tooling can't be re-used, as it was based on pmbootstrap from postmarketOS, not Alpine's abuild directly. But maybe I can help with some insights or contribute otherwise.

Natanael Copa:
> * We may need to store the exact versions and/or hashes of the
>   dependencies used when a package was built. I am not sure where we
>   want to store this. Maybe in the APKINDEX?

I had created a .buildinfo.json file, where I placed all dependencies that were installed at build time, with their versions. That file was placed next to the main apk (so no extra buildinfo file for subpackages) in the binary repository directory. Storing the hashes would be even better.

I chose JSON, as it's trivial to parse with Python, but since Alpine's build tools are lightweight and do not depend on Python, using another format probably makes more sense.

The idea for this file was based on Debian's buildinfo file, which is described here:
https://wiki.debian.org/ReproducibleBuilds/BuildinfoFiles

The APKINDEX is generated from the apk files, so we would need to have the information elsewhere already, right?

> * We embed the signature in the .apk, which means it's not possible to
>   re-create the exact same .apk without having access to the private
>   key. I'm not sure how to deal with that.

My cheap workaround for that was: just make all files inside the .apk file reproducible, not the apk itself. It would be better to have the entire apk reproducible of course, but to do that, we would need to store the signature elsewhere (e.g. create a .sig file for each .apk).

Having an extra signature file might also make it easier to allow multiple entities to sign an apk, e.g. after an independent rebuild.

Regards,
Oliver
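A buildinfo file along the lines described above could be produced with something like the following Python sketch. The field names (`pkgname`, `pkgver`, `build_depends`, `version`, `sha256`) and the function `make_buildinfo` are assumptions for illustration; they are not taken from the pmbootstrap proof of concept or from Debian's format.

```python
import json


def make_buildinfo(pkgname: str, pkgver: str, deps: dict) -> str:
    """Serialize the build environment of one package as deterministic JSON.

    `deps` maps each installed dependency name to a dict such as
    {"version": "1.2.3-r0", "sha256": "<hex digest of the dependency's .apk>"}.
    """
    doc = {
        "pkgname": pkgname,
        "pkgver": pkgver,
        "build_depends": deps,
    }
    # sort_keys plus fixed indentation makes the output byte-identical
    # no matter in which order the dependencies were collected.
    return json.dumps(doc, sort_keys=True, indent=2) + "\n"
```

Because the output is deterministic, the buildinfo file itself can be compared byte for byte between the official build and an independent rebuild.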


From: Max Rees
Message-ID: <20181230225258.GB9101@sachiel>
In-Reply-To: <1a664e98-3f41-5503-60af-98865c0b785f@toastin.space>
Sender timestamp: 1546210379

On Dec 17 11:07 PM, Chloe Kudryavtsev wrote:
> On 12/17/18 7:33 AM, Natanael Copa wrote:
> > * We may need to store the exact versions and/or hashes of the
> >   dependencies used when a package was built. I am not sure where we
> >   want to store this. Maybe in the APKINDEX?
>
> I think this is a good idea. Mostly a note in regards to the next comment.
>
> > * We embed the signature in the .apk, which means it's not possible to
> >   re-create the exact same .apk without having access to the private
> >   key. I'm not sure how to deal with that.
>
> I do not believe we need to allow for that.
> Since we want to store exact versions/hashes of dependencies in the .apk, I
> believe we can also store a hash of the resulting tree, pre-signature
> (meaning we sign the hash as well).
> This hash should be visible using apk(1), to allow people to
> programmatically verify that two .apks are the same internally, and
> guarantees the integrity of the hash in mirrors.

[apologies to Chloe - I forgot to list-reply on the first draft of this message]

The "datahash" field of the .PKGINFO file should be able to serve this purpose. It is the SHA256 checksum of the data.tar.gz file (i.e. the actual tree contents), and since it is located in control.tar.gz, it is signed as part of the existing .apk file creation process.

I agree that apk(1) or perhaps a standalone utility should make it easier to get the datahash of an .apk file. As long as data.tar.gz is created reproducibly, the datahash should end up being the same.

Max
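A standalone check of the datahash could look roughly like the Python sketch below. It rests on an assumption not stated in this thread: that an .apk is a concatenation of gzip streams, with control.tar.gz second to last and data.tar.gz last. The helper names are invented for this example, and the `key = value` parsing of .PKGINFO is likewise an assumption about the file layout.

```python
import hashlib
import io
import tarfile
import zlib


def split_gzip_members(blob: bytes) -> list:
    """Split a byte string made of concatenated gzip streams into its members."""
    members = []
    while blob:
        decomp = zlib.decompressobj(31)  # wbits=31: expect a gzip header and trailer
        decomp.decompress(blob)
        if not decomp.eof:
            raise ValueError("truncated gzip stream")
        consumed = len(blob) - len(decomp.unused_data)
        members.append(blob[:consumed])
        blob = blob[consumed:]
    return members


def read_datahash(control_gz: bytes) -> str:
    """Return the datahash value recorded in .PKGINFO inside the control segment."""
    with tarfile.open(fileobj=io.BytesIO(control_gz), mode="r:gz") as tar:
        pkginfo = tar.extractfile(".PKGINFO").read().decode()
    for line in pkginfo.splitlines():
        key, sep, value = line.partition(" = ")
        if sep and key == "datahash":
            return value.strip()
    raise KeyError("datahash")


def datahash_matches(apk_bytes: bytes) -> bool:
    """Check that the SHA256 of the data segment equals the recorded datahash."""
    members = split_gzip_members(apk_bytes)
    if len(members) < 2:
        raise ValueError("expected at least control and data segments")
    control_gz, data_gz = members[-2], members[-1]
    return hashlib.sha256(data_gz).hexdigest() == read_datahash(control_gz)
```

Comparing the datahash of an official package with the datahash of an independent rebuild would then show whether the tree contents match, without needing the private signing key.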