Bluefin - Rebase to nvidia-open and when can I rebase

bluefin - Rebase to nvidia-open

Can I Rebase?

The reference for understanding the rebase operation can be found at:
rpm-ostree - Rebasing a client system

The general points of consideration to rebase between Universal Blue images seem to be (see the discussion in the comments below):

Upstream OS: Fedora, CentOS - not recommended to rebase between

Desktop Environment: GNOME, KDE - not recommended to rebase between

Release Strategy: GTS, Stable, Stable-Daily, etc. - ok to rebase forward (GTS to Stable) but not backwards (Stable to GTS) right now because of the 41 → 42 transition in progress

It is kind of rare that you can’t go back a version. But the composefs enablement makes things messy. -comment by @inffy

See comment by @JohnAtl below for discussion about 41 → 42 considerations.

Extra Variants: no extra variant, nvidia, nvidia-open - some risk associated with inability to pick explicit versions, but should generally be OK

The reasons why some of these combinations are not recommended have to do with what happens (or does not happen) with the /etc and /var filesystems. See the article I wrote to help summarize that part of the process at Did You Know? How ostree update merges changes into etc and var. Note that rebase is just a specialized form of update.

Most of the problems encountered can be worked around, but doing so requires a varying amount of troubleshooting work. -comment by @m2Giles

So it seems I am a go to rebase bluefin-dx-nvidia:stable → bluefin-dx-nvidia-open:stable.
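
The release-strategy rule can be sketched as a small lookup. This is only an illustration of the guidance as I read it from the list above and the comments below, and it is valid only while GTS is Fedora 41 and stable/latest are Fedora 42. `rebase_stream_ok` is a hypothetical helper name, not part of ujust or rpm-ostree.

```shell
# Sketch only: encodes the release-stream guidance quoted in this post.
# rebase_stream_ok FROM_TAG TO_TAG   (hypothetical helper)
rebase_stream_ok() {
    from=$1
    to=$2
    # Same stream (e.g. stable -> stable with a different variant).
    if [ "$from" = "$to" ]; then
        echo "ok: same stream"
        return 0
    fi
    case "$from->$to" in
        "gts->stable" | "gts->latest" | "stable->latest" | "latest->stable")
            echo "ok" ;;
        "stable->gts" | "latest->gts")
            echo "not recommended: Fedora 42 (composefs) back to Fedora 41" ;;
        *)
            echo "unknown: not covered by the guidance in this post" ;;
    esac
}

rebase_stream_ok gts stable      # forward
rebase_stream_ok stable gts      # backward across the 41 -> 42 boundary
rebase_stream_ok stable stable   # my case: only the nvidia variant changes
```

It deliberately says nothing about the upstream OS or desktop environment; those are judgment calls that a tag comparison cannot capture.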

Background

I like to stay in front of impactful changes taking place in the industry, so I set out to test the nvidia open source drivers for fitness for purpose.

I currently am using an Acer Nitro 5 laptop with a GeForce RTX 3050 Ti Laptop GPU. Since this is an Ampere architecture chip set, I felt mildly confident that it would be supported well.

See CUDA GPU Compute Capability, which lists the GeForce RTX 3050 Ti at compute capability 8.6. My CUDA tests below confirm this.

Note that I do not play games, so I do not have that set of system requirements. I only use CUDA; and leverage the GPU in apps that benefit from GPU if available.

I also am on the current (as of today) bluefin-dx-nvidia:stable image.

Current CUDA Operation

I have some custom software that I use to evaluate CUDA readiness. I built this suite a couple of years or so ago to test whether I had the drivers installed and working while on Fedora WS.

The pain I experienced while building that suite is one of the key reasons I adopted bluefin-dx-nvidia in the first place.

# pre rebase

$ pdm start

Python 3.12.10 (main, Apr  9 2025, 04:03:51) [Clang 20.1.0 ]

PyTorch Version: 2.4.1+cu121
GPU is AVAILABLE.
Using architecture: CUDA
torch.version.cuda='12.1'
_CudaDeviceProperties(name='NVIDIA GeForce RTX 3050 Ti Laptop GPU', major=8, minor=6, total_memory=3779MB, multi_processor_count=20)

Pandas 2.2.3
Scikit-Learn 1.5.2

Preparation

To prepare for the rebase I captured a snapshot of /etc into a newly created local git repo. My intention is to repeat that step post-rebase so I can analyze changes. The idea is to be prepared for a potential rollback.

cd etc-snapshot
sudo tar cz /etc | tar xzv

rm -f etc/{g,}shadow{,-}  # git add will not operate on these files

git add .
git commit -m 'before rebase'

Note this is for post-rebase analysis only. I have no intention of using these files as part of rollback procedure. But, never say never …

And, of course, I performed my normal backup regimen just in case.

How to Rebase

Although there is a helper, ujust rebase-helper, I find it lacks the level of control I was looking for.

I simply did this:

rpm-ostree rebase ostree-image-signed:docker://ghcr.io/ublue-os/bluefin-dx-nvidia-open:stable

Look at the rpm-ostree status output to see where I got the first part of the URI.
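
A minimal sketch of deriving the new reference from the current one, so the transport prefix and tag are copied rather than retyped. `swap_image_name` is a hypothetical helper. It assumes the reference has the shape shown in the status output, with the image name after the last `/` and the tag after the last `:`.

```shell
# swap_image_name REF NEW_NAME   (hypothetical helper)
# Replaces only the image name; the prefix and the tag are kept as-is.
swap_image_name() {
    ref=$1
    new_name=$2
    prefix=${ref%/*}   # everything before the last "/"
    tag=${ref##*:}     # everything after the last ":"
    printf '%s/%s:%s\n' "$prefix" "$new_name" "$tag"
}

current='ostree-image-signed:docker://ghcr.io/ublue-os/bluefin-dx-nvidia:stable'
swap_image_name "$current" bluefin-dx-nvidia-open
# prints ostree-image-signed:docker://ghcr.io/ublue-os/bluefin-dx-nvidia-open:stable
```

The printed string is what gets passed to rpm-ostree rebase.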

rpm-ostree status output before reboot:
rpm-ostree status -v
State: idle
AutomaticUpdates: stage; rpm-ostreed-automatic.timer: no runs since boot
Deployments:
  ostree-image-signed:docker://ghcr.io/ublue-os/bluefin-dx-nvidia-open:stable (index: 0)
                   Digest: sha256:e5882e1ea6ca92fb785cb26774ead7d9be0924bcc02a9ec0f5bee5b201e3340f
                  Version: 42.20250522.2 (2025-05-22T13:36:07Z)
                   Commit: e1c1ac47b729a5448b000721dfd4498faf8f8dd9db21d65335e7a8fbc5edfc0f
                   Staged: yes
                StateRoot: default
                 Upgraded: coreutils 9.6-2.fc42 -> 9.6-3.fc42
                           coreutils-common 9.6-2.fc42 -> 9.6-3.fc42
                           ibus-typing-booster 2.27.54-1.fc42 -> 2.27.56-1.fc42
                           libshaderc 2025.1-1.fc42 -> 2025.2-1.fc42
                           nspr 4.36.0-7.fc42 -> 4.36.0-8.fc42
                           nss 3.110.0-2.fc42 -> 3.111.0-2.fc42
                           nss-softokn 3.110.0-2.fc42 -> 3.111.0-2.fc42
                           nss-softokn-freebl 3.110.0-2.fc42 -> 3.111.0-2.fc42
                           nss-sysinit 3.110.0-2.fc42 -> 3.111.0-2.fc42
                           nss-util 3.110.0-2.fc42 -> 3.111.0-2.fc42
                           pcp-conf 6.3.7-2.fc42 -> 6.3.7-4.fc42
                           pcp-libs 6.3.7-2.fc42 -> 6.3.7-4.fc42
                           python3-boto3 1.38.16-1.fc42 -> 1.38.19-1.fc42
                           python3-botocore 1.38.16-1.fc42 -> 1.38.19-1.fc42
                           ublue-os-just 0.45-1.fc42 -> 0.46-1.fc42
                           yelp-libs 2:42.2-8.fc42 -> 2:42.2-9.fc42
                           yelp-xsl 42.1-6.fc42 -> 42.1-7.fc42
                           gnome-shell-extension-logo-menu 0.1.0-0.git89e0e4d.fc42 -> 0.0.0-3.gitbbbc778.fc42

● ostree-image-signed:docker://ghcr.io/ublue-os/bluefin-dx-nvidia:stable (index: 1)
                   Digest: sha256:e5a51fa20cf20ebcb970e44f724b5999b7ed34b2e969ada4117042f831ae257f
                  Version: 42.20250522 (2025-05-22T01:09:53Z)
                   Commit: 6176f17046d8c29e6465fc6184e277c4098047b95780d6f6f43b7b5dbeea9cb0
                StateRoot: default

  ostree-image-signed:docker://ghcr.io/ublue-os/bluefin-dx-nvidia:stable (index: 2)
                   Digest: sha256:55beac968d6e0bddcd46e1bf7fbbe21f4aef2e4794813e3e3b0bcabd36eec7f7
                  Version: 42.20250519.2 (2025-05-19T15:38:53Z)
                   Commit: 600c5a3483497830a7d2729d3ac13be0a76c74ca426fb79d205fec4d3abe155f
                StateRoot: default

Once the new deployment was confirmed I rebooted.

Rebase Experience

Expectations: based on the knowledge I gained from writing Did You Know? How ostree update merges changes into etc and var, I expect a new set of bootc OS layers to be put in place while maintaining my local system state (/etc, /var - especially home). Some items that have not been modified in /etc may be changed during the process. I also anticipate some problems with flatpaks because of the nvidia GL drivers (runtimes) in use.

I will confirm with a combination of my git repo and sudo ostree admin config-diff.


Upon reboot I saw these in dmesg and no new error messages.

[    4.994039] NVRM: loading NVIDIA UNIX Open Kernel Module for x86_64  570.153.02  Release Build  (dvs-builder@U22-A23-20-3)  Tue May 13 16:34:58 UTC 2025
[    5.092092] nvidia-modeset: Loading NVIDIA UNIX Open Kernel Mode Setting Driver for x86_64  570.153.02  Release Build  (dvs-builder@U22-A23-20-3)  Tue May 13 16:23:16 UTC 2025
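
A small sketch for checking this without reading the whole log. `nvidia_module_flavor` is a hypothetical helper that only looks for the "Open Kernel Module" banner seen in the lines above. It makes no attempt to recognise how the proprietary module announces itself.

```shell
# nvidia_module_flavor   (hypothetical helper)
# Reads kernel log text on stdin and reports whether the open module banner is present.
nvidia_module_flavor() {
    if grep -q 'NVIDIA UNIX Open Kernel Module'; then
        echo "open"
    else
        echo "open module not reported"
    fi
}

# Demo using the banner line from this post:
sample='[    4.994039] NVRM: loading NVIDIA UNIX Open Kernel Module for x86_64  570.153.02  Release Build'
printf '%s\n' "$sample" | nvidia_module_flavor
# prints open
```

On a live system the input would come from dmesg (possibly via sudo) piped into the function.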

/etc changes analysis

After the reboot I repeated the steps to capture the /etc snapshot on top of what I had captured before.

The changes due to rebase looked reasonable.

$ git diff --minimal --name-status
M       etc/brlapi.key
M       etc/cdi/nvidia.yaml
M       etc/cups/subscriptions.conf
M       etc/cups/subscriptions.conf.O
M       etc/dnf/versionlock.toml
M       etc/nvidia/kernel.conf
M       etc/pki/ca-trust/extracted/java/cacerts
M       etc/selinux/targeted/active/commit_num
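
For longer lists, a tally by status letter gives a quick overview. This is a sketch, and `summarize_changes` is a hypothetical helper. It assumes each input line starts with a status letter followed by whitespace and a path, which is the shape of the git diff --name-status output above, and as far as I can tell also of ostree admin config-diff.

```shell
# summarize_changes   (hypothetical helper)
# Counts input lines per leading status letter (M, A, D, ...), ignoring blank lines.
summarize_changes() {
    awk 'NF > 0 { count[$1]++ } END { for (s in count) printf "%s %d\n", s, count[s] }' | sort
}

# Demo using three of the entries listed above:
printf 'M\tetc/brlapi.key\nM\tetc/cdi/nvidia.yaml\nM\tetc/nvidia/kernel.conf\n' | summarize_changes
# prints M 3
```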

CUDA Operation

Performing the same test as above produced this output.

Python 3.12.10 (main, Apr  9 2025, 04:03:51) [Clang 20.1.0 ]

PyTorch Version: 2.4.1+cu121
GPU is AVAILABLE.
Using architecture: CUDA
torch.version.cuda='12.1'
_CudaDeviceProperties(name='NVIDIA GeForce RTX 3050 Ti Laptop GPU', major=8, minor=6, total_memory=3779MB, multi_processor_count=20)

Pandas 2.2.3
Scikit-Learn 1.5.2

Perfect!

Flatpak Concerns

I do have concerns about the org.freedesktop.Platform.GL.nvidia-570-153-02 runtime currently installed, and flatpak update does not change it. However, the driver differences should not impact the flatpak runtime at all if the driver team has done its job, and I do not want to rush into anything.

I am not experiencing anything obvious and so I will take my time to dive deep into the apps’ behavior where I do use the GPU.

EDIT: after some time (maybe a couple of weeks?) I did see updates to the flatpak runtimes mentioned above. And still nothing obviously different.
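
To see whether the installed GL runtime matches the driver the kernel loaded, the runtime name can be derived from the driver version. This is a sketch, and `nvidia_flatpak_runtime` is a hypothetical helper. The naming pattern (dots replaced by dashes) is inferred from the single runtime name in this post and the 570.153.02 version in dmesg.

```shell
# nvidia_flatpak_runtime DRIVER_VERSION   (hypothetical helper)
# Maps a driver version such as 570.153.02 to the expected GL runtime name.
nvidia_flatpak_runtime() {
    printf 'org.freedesktop.Platform.GL.nvidia-%s\n' "$(printf '%s' "$1" | tr . -)"
}

nvidia_flatpak_runtime 570.153.02
# prints org.freedesktop.Platform.GL.nvidia-570-153-02
```

The result can then be searched for in the output of flatpak list --runtime.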

Summary

All in all a pretty smooth experience; with just a couple of warts to work around.

I’ll look to file an issue for the defect in the ujust file.

But not today. I have a pot of beans with smoked ham hocks that are ready to come out of the solar oven. :wink:

References

NVK is part of the mesa stack.

It uses Nouveau, not nvidia-open.

Thanks for commenting. Although, I am really hoping for some discussion around the safety boundaries of rebasing … it spooks the heck out of me.

I am not a gamer, and I haven’t written anything in GL since the 1990s and so I am not surprised that I lack familiarity with these terms.

NVK does sound suspiciously related to Nvidia and Vulkan - [just searched for it] and so it is. I thought it stood for something like NVidia Kit.

So I guess this is to help improve support for old hardware? E.g., in wine and perhaps in web browsers? Am I reading the tea leaves correctly here?

So I guess even though toggle-nvk can be used to rebase to bluefin-dx-nvidia-open, that was not your original design (I am assuming this is your work originally @m2Giles). I think I see that now. I’ll clean up the post.

I guess I should have mentioned that I admittedly approached this backwards. I knew I wanted to rebase to bluefin-dx-nvidia-open and went looking for something to help do that. Because, I wanted to make sure I was not missing some important process step.

What safety boundaries? It is just switching you to another image, just like you would install from a different ISO. It just keeps your home and /etc and brings them along.

Well, rebasing from bluefin to aurora or bazzite, for example is not supported.

So, what I am questioning is:

What set of conditions (e.g., stuff in ~/.local/state or somewhere in /var) would lead to an unstable system when rebasing back and forth between different images? What are the guardrails I should consider?

I know there are some. What are they?

I am sure it will become more clear to me over time. But thought it would make for a good discussion to help accelerate learning.

Switching between fedora and centos. Not recommended.

Switching between gnome and kde. Not recommended.

Basically, the general trend is that you can rebase forward (gts to stable) and you should stay on the same DE.

Most of the gotchas can be worked around, but it requires work.

@m2Giles thanks for the input. Any feedback on the new section I added at the top of the article?

Anybody else have thoughts?

If I’m not mistaken, the limitation on rebasing GTS, stable, and latest, is that stable and latest are based on Fedora 42, which uses composefs. So going back from a 42-based release to a 41-based release is not possible without reinstalling. But, rebasing from 41-based to 42-based is possible (e.g. the upgrade to 42 rolled out to 41:stable systems recently).

E.g. GTS → stable :white_check_mark:
stable → latest :white_check_mark:
Latest → stable :white_check_mark:
Stable → GTS :no_entry:

This will change when GTS moves over to 42/composefs in the fall of this year.

All of this should be verified by someone official, if your post is intended to be documentation.

Also, is this already covered in the docs?


Thanks for that clarification (I see your confirmation @inffy). That makes sense - and I lost track of that 41 → 42 boundary.

So aside from the composefs conversion, it should be generally ok to bounce back and forth between GTS, Stable, etc.? I haven’t had a need to do that yet. Seems like the changes should be minor for them - and mostly contained in the delivered layers (versions of binaries, some diffs in .timer units, etc.).

Can I assume that /boot diffs are taken care of during the deployment process? e.g., initramfs, etc.

It is kind of rare that you can’t go back a version. But the composefs enablement makes things messy.

Generally your modifications like kernel args/initramfs stuff are added back after the deployment of the rebase is done.

