I wanted to explore NAT64, primarily by relying on NAT66 instead of any NAT44, for a pile of reasons:
The NAT44-based implementations work like this:
Private or special addresses are not supposed to use 64:ff9b::/96; instead these can go in 64:ff9b:1:fffe::/96.
The state tracking happens when NAT64 has to deal with addresses outside of 64:ff9b:1:fffe::/96.
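The prefix choice above can be sketched in plain C. This is only an illustration, not code from nat64.c: the names `nat64_embed`, `nat64_extract` and `ipv4_is_non_global` are hypothetical, and the list of non-global ranges is deliberately incomplete. It assumes the usual /96 layout, where the IPv4 address occupies the last 32 bits of the IPv6 address.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* 64:ff9b::/96, the well-known prefix, meant for global IPv4 addresses only. */
static const uint8_t WKP_PREFIX[12] = {
    0x00, 0x64, 0xff, 0x9b, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
};

/* 64:ff9b:1:fffe::/96, used here for private or special IPv4 addresses. */
static const uint8_t LOCAL_PREFIX[12] = {
    0x00, 0x64, 0xff, 0x9b, 0x00, 0x01, 0xff, 0xfe, 0x00, 0x00, 0x00, 0x00
};

/* Hypothetical helper: recognises a few non-global ranges, not all of them. */
static bool ipv4_is_non_global(const uint8_t a[4])
{
    return a[0] == 10                               /* 10.0.0.0/8     */
        || (a[0] == 172 && (a[1] & 0xf0) == 16)     /* 172.16.0.0/12  */
        || (a[0] == 192 && a[1] == 168)             /* 192.168.0.0/16 */
        || (a[0] == 100 && (a[1] & 0xc0) == 64)     /* 100.64.0.0/10  */
        || (a[0] == 169 && a[1] == 254)             /* 169.254.0.0/16 */
        || a[0] == 127;                             /* 127.0.0.0/8    */
}

/* Build the 16-byte IPv6 address: 96-bit prefix followed by the IPv4 address. */
void nat64_embed(const uint8_t v4[4], uint8_t v6[16])
{
    assert(v4 != NULL && v6 != NULL);
    memcpy(v6, ipv4_is_non_global(v4) ? LOCAL_PREFIX : WKP_PREFIX, 12);
    memcpy(v6 + 12, v4, 4);
}

/* Recover the IPv4 address; returns false if neither prefix matches. */
bool nat64_extract(const uint8_t v6[16], uint8_t v4[4])
{
    assert(v4 != NULL && v6 != NULL);
    if (memcmp(v6, WKP_PREFIX, 12) != 0 && memcmp(v6, LOCAL_PREFIX, 12) != 0)
        return false;
    memcpy(v4, v6 + 12, 4);
    return true;
}
```

With this sketch 8.8.8.8 maps to 64:ff9b::808:808, while 192.168.1.10 maps to 64:ff9b:1:fffe:c0a8:10a.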
Instead, with a NAT66-based implementation, the story goes like this:
Whilst the NAT66 is itself not stateless, this is core kernel code and is visible in conntrack.
It could then even be possible to migrate filtering in a host to IPv6 and deal with transit as NAT46, then NAT66 and finally NAT64.
Static NAT66 mappings can be more pleasant, for example for multicast NAT.
If I was going to rewrite NAT64, I also wanted to get it inside Linux, without a taint, for performance, rather than outside as before.
Adding NAT64 to Linux without a taint means using BPF. I found some information on how to do that, but unfortunately no complete working module, so I had a go at reconstructing one.
I have got most things working except the checksums. Fortunately the kernel can do those already, although a small performance improvement could be had by doing them in the nat64 itself. There is another NAT44-based implementation which appears much more complete, so I am using the kernel code for guidance.
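To show what correcting a checksum involves, here is a minimal sketch of the Internet checksum and an incremental update, in plain C rather than BPF. It is not taken from nat64.c or from the csum module, and the names `csum_add`, `csum_fold` and `csum_update` are made up for illustration. It relies on the fact that for TCP and UDP the IPv4 and IPv6 pseudo-headers differ only in the addresses, so the existing checksum can be adjusted by the sum of the old addresses and the sum of the new ones instead of re-reading the payload. ICMP needs different handling and is not covered.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Add len bytes, taken as big-endian 16-bit words, to a running 32-bit sum.
 * The sum is not folded here, so keep the total input well under 64 KiB. */
uint32_t csum_add(uint32_t sum, const uint8_t *data, size_t len)
{
    assert(data != NULL || len == 0);
    while (len > 1) {
        sum += ((uint32_t)data[0] << 8) | data[1];
        data += 2;
        len -= 2;
    }
    if (len)                      /* odd trailing byte is padded with zero */
        sum += (uint32_t)data[0] << 8;
    return sum;
}

/* Fold the carries back into 16 bits and return the ones' complement. */
uint16_t csum_fold(uint32_t sum)
{
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)~sum;
}

/* Incremental update: check is the existing checksum, old_sum and new_sum
 * are csum_add() results over the bytes being removed and added, for
 * example the IPv4 address pair and the IPv6 address pair. */
uint16_t csum_update(uint16_t check, uint32_t old_sum, uint32_t new_sum)
{
    uint32_t sum = (uint16_t)~check;          /* back to the plain sum    */
    sum += csum_fold(old_sum);                /* subtract the old fields  */
    sum += (uint16_t)~csum_fold(new_sum);     /* add the new fields       */
    return csum_fold(sum);
}
```

In a translator, old_sum would be csum_add() over the 8 bytes of IPv4 source and destination, and new_sum the same over the 32 bytes of IPv6 source and destination. The IPv4 header checksum is a separate matter, as IPv6 has no header checksum at all.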
There is also something on this topic in Android.
The BPF program is written in nat64.c
Compile it, as it says in tc-bpf
Setup a test environment:
BPF seems not to like unshare -Umrun, so testing has to be done as actual root. The steps are hopefully harmless, but users who do not trust this may use a virtual machine instead.
Early in experimentation I had used mirred, but I gather nat64 can work without it:
#tc filter add dev za matchall action mirred egress mirror dev zb action bpf obj /usr/src/bpf/nat64.o
# (Re)load the BPF code onto za, and push the supplementary csum module to correct the various checksums.
# Linux does fussier validation at load time and may refuse some modules that passed "compilation", so reloading a module after each compile is a useful test.
# I did not think this would work well with ARP involved, so initial tests rely on static MAC addresses.