Mamba stacks mixer layers, which are the equivalent of attention layers. The core logic of Mamba is held in the MambaMixer class.
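To make the stacking idea concrete, here is a minimal sketch in plain Python. It is not the real MambaMixer: the names ToyMixer, ToyBlock, and ToyMambaStack are hypothetical, the mixer is reduced to a scalar linear recurrence, and the residual wrapper is an assumption made for illustration (normalization and the selective, input-dependent parameters are left out).

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ToyMixer:
    """Hypothetical stand-in for a mixer layer (NOT the real MambaMixer).

    Runs a scalar linear recurrence over the sequence:
        h_t = a * h_{t-1} + b * x_t
        y_t = c * h_t
    """
    a: float = 0.5
    b: float = 1.0
    c: float = 1.0

    def __call__(self, xs: List[float]) -> List[float]:
        h = 0.0
        ys = []
        for x in xs:
            h = self.a * h + self.b * x  # update the hidden state
            ys.append(self.c * h)        # read the output from the state
        return ys


@dataclass
class ToyBlock:
    """Assumed residual wrapper around one mixer (illustration only)."""
    mixer: ToyMixer

    def __call__(self, xs: List[float]) -> List[float]:
        return [x + y for x, y in zip(xs, self.mixer(xs))]


class ToyMambaStack:
    """Stack of blocks; each mixer sits where an attention layer would."""

    def __init__(self, num_layers: int):
        self.layers = [ToyBlock(ToyMixer()) for _ in range(num_layers)]

    def __call__(self, xs: List[float]) -> List[float]:
        for layer in self.layers:
            xs = layer(xs)
        return xs


model = ToyMambaStack(num_layers=2)
out = model([1.0, 0.0, 0.0])
print(out)  # [4.0, 2.0, 1.25]
```

Each mixer only reads the current input and its running state, so an output at position t never depends on later positions. That is the role causal attention plays in a Transformer decoder, here filled by a recurrence.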