Please consider including repo manifest -r output with distributed images

Added by Riku Saikkonen over 7 years ago

The Replicant build directories contain a git_versions.txt file listing the git commit IDs of the subprojects. (For example, .) However, there doesn't seem to be a way to use this file automatically, for example to rebuild an image using the same versions of the software as in the distributed image. It appears that the build instructions in e.g. SDKBuild produce the current version of most software in the image, not the versions that Replicant distributed, because the Replicant manifest default.xml does not contain exact revision information for most of the subprojects.

However, the repo tool appears to be able to generate a manifest containing the commit IDs: repo manifest -r -o mymanifest.xml
It would be nice if this output were also included somewhere with all distributed images.
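
To illustrate the difference between the two kinds of manifest: a pinned manifest can be told apart from default.xml by checking that every project's effective revision is a full commit ID. The following is only a sketch under stated assumptions. The sample manifest, its remote and the commit ID are made up, and it ignores details of the real manifest format such as a revision set on a <remote> element.

```python
import re
import xml.etree.ElementTree as ET

# A full SHA-1 commit ID is 40 lowercase hexadecimal characters.
COMMIT_ID = re.compile(r"[0-9a-f]{40}")


def unpinned_projects(manifest_xml):
    """Return the paths of projects whose effective revision is not a commit ID."""
    root = ET.fromstring(manifest_xml)
    default = root.find("default")
    default_rev = default.get("revision", "") if default is not None else ""
    unpinned = []
    for project in root.iter("project"):
        # A project without its own revision falls back to <default>.
        revision = project.get("revision", default_rev)
        if not COMMIT_ID.fullmatch(revision):
            # "path" is optional in a manifest and defaults to "name".
            unpinned.append(project.get("path", project.get("name")))
    return unpinned


# Made-up sample: one project follows a branch, one is pinned to a commit.
SAMPLE = """<manifest>
  <remote name="origin" fetch=".."/>
  <default revision="replicant-4.0" remote="origin"/>
  <project path="build" name="build"/>
  <project path="bionic" name="bionic"
           revision="0123456789abcdef0123456789abcdef01234567"/>
</manifest>"""

print(unpinned_projects(SAMPLE))  # ['build']
```

An empty result would mean that every project in the manifest is tied to an exact commit, which is what the output of repo manifest -r is expected to look like.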

I assume that if this generated manifest was included in, e.g., the replicant/manifest repository as, say, sdk_4.0_0001.xml, then one could view the sources and/or rebuild this exact version of the SDK with: repo init -u git:// -m sdk_4.0_0001.xml
(probably also with repo sync -m sdk_4.0_0001.xml, from what I understand of "repo help sync")

There is also a "smart tag" option in repo (repo sync -t), which seems to be designed for fetching a particular version (a "known good build" is the example the help uses), but it seems to require setting up something called a "manifest server" on the network that would respond to these kinds of queries. Storing manifests containing commit IDs in files seems to be a much simpler approach.

(Why did I mention the replicant/manifest repository? I guess it could be more logical to store the generated file on the ftp site together with the images, but the repo tool seems to want to get the manifest from a git repository.)

Another approach would of course be for Replicant to distribute the source code of each image along with the binaries, but the files are rather huge. (repo init + repo sync of replicant-4.0 takes about 21 gigabytes on disk right now. Although I guess you could distribute only the then-current version and not the revision history: a .tar.gz of everything except the .repo subdirectory is "only" 2.5 gigabytes, and I assume that it (or a shallow clone, repo init --depth=1) would be enough for rebuilding.)

(Disclaimer: I did not have time to test that the repo init/sync stuff actually works, but the generated manifest looked correct.)

Possibly it would also sometimes be helpful if the generated manifest or git_versions.txt was included in a file inside the generated image: if people build their own images using current versions a lot, it could help them keep track of which versions they have (e.g., "did the image I built and installed a few months ago include this bugfix that was committed around the same time"), but those who build their own images can of course easily keep track of this by themselves as well.

Replies (3)

RE: Please consider including repo manifest -r output with distributed images - Added by Riku Saikkonen over 7 years ago

Update: I wanted to test this out using some older versions already in the repository, so I wrote a simple Python script to add git commit IDs from git_versions.txt to the distributed default.xml manifest. And I tried "repo sync" using the result...

The Python script is attached, as well as a manifest created by it from the replicant-4.0 default.xml and git_versions.txt from the 4.0 SDK (URL in the above message). The new manifest was created using:
$ python default.xml git_versions.txt >sdk_4.0_0001.xml

I then tried it out by copying the generated sdk_4.0_0001.xml to .repo/manifests/ in a Replicant checkout (.repo/ is created by repo init under the replicant-4.0/ directory when following GettingReplicantSources) - that seems to be where the repo tool stores a checkout of the manifest repository. Then:
$ ../tools/repo sync -m sdk_4.0_0001.xml
worked, at least as far as I could see from looking at "git rev-parse HEAD" in a few of the project repositories. (Add -l to make it faster: then it doesn't try to sync everything again from the network, which is a bit useless when using older revisions.)
Just "repo sync" or "repo sync -m default.xml" got me back to the original situation (current versions in the replicant-4.0 manifest).
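
The spot check with "git rev-parse HEAD" could be extended to every project in the manifest. This is a minimal sketch, not something that was run as part of the test above: the function names are made up, it assumes it is run from the top of the checkout, and it only looks at projects that carry an explicit revision attribute.

```python
import subprocess
import xml.etree.ElementTree as ET


def pinned_revisions(manifest_xml):
    """Map each project's path to the revision given in the manifest."""
    root = ET.fromstring(manifest_xml)
    return {
        p.get("path", p.get("name")): p.get("revision")
        for p in root.iter("project")
        if p.get("revision")
    }


def git_head(project_dir):
    """Return the commit ID that HEAD points to in project_dir."""
    result = subprocess.run(
        ["git", "rev-parse", "HEAD"],
        cwd=project_dir, check=True, capture_output=True, text=True,
    )
    return result.stdout.strip()


def mismatches(expected, actual):
    """Return {path: (expected, actual)} for projects whose HEAD differs."""
    return {
        path: (rev, actual.get(path))
        for path, rev in expected.items()
        if actual.get(path) != rev
    }


# Possible use from the top of the checkout (file name taken from the post):
#   with open(".repo/manifests/sdk_4.0_0001.xml") as f:
#       expected = pinned_revisions(f.read())
#   actual = {path: git_head(path) for path in expected}
#   print(mismatches(expected, actual))
```

An empty dictionary from mismatches() would indicate that every pinned project is checked out at the commit named in the manifest.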

Some further testing found a couple of glitches related to new or removed repositories. The Python script keeps all projects listed in default.xml regardless of whether they appear in git_versions.txt, so the list of projects stays constant and these glitches do not occur with its output. To provoke them, I removed from the generated manifest some repositories that did not yet exist when the SDK 4.0 images were created. Then I found the following problems:

1) Starting from the "up-to-date" state (default.xml), "repo sync -m generated_old_manifest.xml" gives errors about uncommitted changes in bootable/bootloader/goldelico/gta04, because git status in that directory thinks that the x-loader subdirectory is not in the repository (it is a subproject which should also be deleted, but repo is not smart enough to delete the subproject first and the upper-level project afterwards). Simply removing the bootable/bootloader/goldelico/gta04 directory manually and re-running the repo sync command fixes the problem. (Repo sync can even recreate a project whose directory is manually removed, because its git repository is actually stored safely in .repo/projects/.) The other recently added projects (like device/samsung/p5110) do not cause any problems: they don't have other projects in subdirectories.

2) When I tried "repo manifest -r -o test.xml" after syncing using a "historical" manifest where some projects are missing, it gave errors about the missing subdirectories. (I guess it found them in .repo/projects/ and got confused, since the repo manifest command does not have a -m manifest.xml option like repo sync. I think it might work if I used repo init to switch to the required manifest instead of the temporary repo sync -m.) I don't think this matters in practice, as the only use case I can think of is trying to recreate an old manifest when you've already somehow managed to switch to its revisions and delete more recent projects.

I guess problem 1) might reappear if manifests are generated now, new projects like gta04 + gta04/x-loader are added in the future, and someone tries to repo sync -m from the future state back to the old manifest. But it's easy to fix (remove the affected subdirectory) and I guess quite unlikely to occur anyway... (Maybe repo is not designed to store a project inside another project's directory tree? Although this could well be the only place where it causes problems...)
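
Projects that could trigger problem 1) can be listed in advance by looking for project paths that lie inside another project's path. A minimal sketch follows. The two gta04 paths and the p5110 path are the ones mentioned above, but the project names in the sample are made up for illustration.

```python
import xml.etree.ElementTree as ET


def nested_projects(manifest_xml):
    """Return (outer, inner) pairs where inner's path lies inside outer's path."""
    root = ET.fromstring(manifest_xml)
    paths = sorted(p.get("path", p.get("name")) for p in root.iter("project"))
    pairs = []
    for outer in paths:
        # Compare against "outer/" so that "a/b" does not match "a/bc".
        prefix = outer.rstrip("/") + "/"
        for inner in paths:
            if inner.startswith(prefix):
                pairs.append((outer, inner))
    return pairs


# Paths as mentioned in the post; the name attributes are hypothetical.
SAMPLE = """<manifest>
  <project path="bootable/bootloader/goldelico/gta04" name="example_gta04"/>
  <project path="bootable/bootloader/goldelico/gta04/x-loader" name="example_xloader"/>
  <project path="device/samsung/p5110" name="example_p5110"/>
</manifest>"""

for outer, inner in nested_projects(SAMPLE):
    print(f"{inner} is nested inside {outer}")
```

Any pair reported this way is a candidate for the manual "remove the outer directory and re-run repo sync" workaround when switching back to a manifest that predates both projects.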

The manifest generated by my Python script includes projects from default.xml that are not listed in git_versions.txt (the script warns about them on standard error). This avoids the above problems, but it also means that the state produced by repo sync -m includes the new projects in the directory tree. They were of course not present when git_versions.txt was generated, so strictly speaking the result is not exactly the same state. But hopefully the build process does not care about a few extra subdirectories being present in the directory tree...

How does all this affect my original suggestion of using repo manifest -r when releasing new Replicant images? It makes it less necessary, since git_versions.txt is enough to create an (apparently) equivalent manifest, but I still think it would be nice if you could create "proper" manifests with repo manifest -r together with new images: it is at least a cleaner and better-documented way than the Python script. (And the simple script will break, for instance, if you ever split a <project> tag across multiple lines in default.xml, since it doesn't actually parse the XML.)
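
The multi-line <project> limitation could be avoided by parsing the manifest with a real XML parser. The following is a minimal sketch of that idea and not the attached script. It assumes the commit IDs have already been read from git_versions.txt into a dictionary keyed by project path (the file's format is not shown here, so that step is left out), and the sample manifest is made up. Note that xml.etree.ElementTree drops XML comments by default, so comments in default.xml would not survive.

```python
import sys
import xml.etree.ElementTree as ET


def pin_manifest(manifest_xml, commits):
    """Return manifest XML with each project's revision set from commits.

    commits maps project path -> commit ID (an assumed input format).
    Projects missing from commits are kept unchanged and reported on stderr.
    """
    root = ET.fromstring(manifest_xml)
    for project in root.iter("project"):
        path = project.get("path", project.get("name"))
        if path in commits:
            project.set("revision", commits[path])
        else:
            print(f"warning: no commit ID for {path}", file=sys.stderr)
    return ET.tostring(root, encoding="unicode")


# Made-up sample with a <project> tag split across two lines.
SAMPLE = """<manifest>
  <default revision="replicant-4.0" remote="origin"/>
  <project path="bionic"
           name="bionic"/>
  <project path="build" name="build"/>
</manifest>"""

pinned = pin_manifest(SAMPLE, {"bionic": "0123456789abcdef0123456789abcdef01234567"})
print(pinned)
```

As in the behaviour described above, projects without a known commit ID stay in the output, so the list of projects is unchanged and only a warning is printed for them.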

Feel free to use the Python script as you see fit, to create historical manifests for Replicant 4.0 images or whatever. If you wish, I can use the script to create the historical manifest files for you using the various git_versions.txt files in the distributed Replicant 4.0 image directories.

EDIT: Ignore the first two attachments - they are preliminary almost-correct versions that I attached accidentally. (I can't find a way to remove attachments when editing a forum post - reattaching the correct versions created the two latter files with /393/ and /399/ in the URLs.)

RE: Please consider including repo manifest -r output with distributed images - Added by Paul Kocialkowski over 7 years ago

Thanks for your research work! I think we will start distributing repo manifest -r instead of our git revisions file with new releases!

RE: Please consider including repo manifest -r output with distributed images - Added by Paul Kocialkowski over 7 years ago

That's part of the latest Replicant 4.2 release by the way :)