Feature request
Support longer context windows (up to 8k tokens) via NTK-aware scaled RoPE; the discussion and notebook linked below show promising results.
Motivation
Discussion: https://www.reddit.com/r/LocalLLaMA/comments/14lz7j5/ntkaware_scaled_rope_allows_llama_models_to_have/
Colab Notebook: https://colab.research.google.com/drive/1VI2nhlyKvd5cw4-zHvAIk00cAVj2lCCC#scrollTo=d2ceb547
Your contribution
Since the change amounts to only about 3 lines of code, it should be straightforward to implement (see the sketch below).
I will start training a model and provide an example demo.
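
Below is a minimal sketch of the NTK-aware base rescaling described in the linked discussion. The class name, constructor signature, and surrounding structure are illustrative assumptions rather than the library's actual API; the core of the proposed change is the single line that rescales `base` by `alpha ** (dim / (dim - 2))` before computing the inverse frequencies.

```python
import torch

class NTKScaledRotaryEmbedding(torch.nn.Module):
    """Illustrative rotary embedding with NTK-aware base scaling (assumed structure)."""

    def __init__(self, dim, max_position_embeddings=8192, base=10000, alpha=8):
        super().__init__()
        # NTK-aware scaling: stretch the base so low-frequency components are
        # interpolated while high-frequency components stay close to standard RoPE.
        base = base * alpha ** (dim / (dim - 2))
        inv_freq = 1.0 / (base ** (torch.arange(0, dim, 2).float() / dim))
        self.register_buffer("inv_freq", inv_freq)
        self.max_position_embeddings = max_position_embeddings

    def forward(self, x, seq_len):
        # x: [batch, num_heads, seq_len, head_dim]; returns cos/sin caches.
        t = torch.arange(seq_len, device=x.device, dtype=self.inv_freq.dtype)
        freqs = torch.einsum("i,j->ij", t, self.inv_freq)
        emb = torch.cat((freqs, freqs), dim=-1)
        return emb.cos()[None, None, :, :], emb.sin()[None, None, :, :]
```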