You can run general-purpose GPU computations on our AMD Radeon RX 6600 XT graphics card.
You can install PyTorch for ROCm using these instructions. The 6600 XT (gfx1032) is not on ROCm's official support list, so run your PyTorch programs with the environment variable HSA_OVERRIDE_GFX_VERSION=10.3.0, which makes the card report itself as the supported gfx1030.
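The override has to be in place before PyTorch initializes ROCm, so either export it in the shell or set it at the very top of your script. A minimal sketch (assuming a ROCm build of PyTorch is installed; the torch check is shown commented out because it only works on the GPU machine):

```python
import os

# Must be set before any ROCm library initializes, i.e. before importing torch.
os.environ["HSA_OVERRIDE_GFX_VERSION"] = "10.3.0"

# With a ROCm build of PyTorch, the GPU is then exposed through the usual
# CUDA-named API:
# import torch
# print(torch.cuda.is_available())   # True when the 6600 XT is visible
# print(torch.cuda.get_device_name(0))
```

Equivalently, from the shell: HSA_OVERRIDE_GFX_VERSION=10.3.0 python3 your_script.py.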
llama.cpp is installed in /opt/llama.cpp. Run it with ./main -ngl 100 -m models/MODEL -p "Your prompt here" (-ngl 100 offloads all model layers to the GPU). For an interactive, ChatGPT-style session, add the flags -i -ins --color. According to the llama.cpp docs, reducing the number of CPU threads can give slightly better performance; -t 4 may produce roughly 10% faster output than the default, but results vary, so try a few values and keep the fastest.
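One simple way to compare thread counts is to run the same short prompt at each value of -t and compare the timing lines llama.cpp prints at the end. A hedged sketch (MODEL is a placeholder as above; this loop only prints the commands, so you can paste each one in /opt/llama.cpp and compare the reported eval times yourself):

```shell
# Print one benchmark command per candidate thread count.
# Run each from /opt/llama.cpp and compare the timing summary at the end.
for t in 2 4 6 8; do
  printf './main -ngl 100 -m models/MODEL -t %d -p "Your prompt here"\n' "$t"
done
```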