Another way to run a ChatGPT-like bot, thanks to an app on the Play Store and a framework that's one of the better things that happened to humanity - no seriously, everyone should know about it

October 04, 2024 #llama.cpp #A.I. #Chatbots #huggingface #termux #klog.website #Large Language Models #ollama #ChatGPT #Open Source Large Language Models #K_log Website

A video of me doing it quickly - read the blog for more explanations!

This is running on a better phone than the one in the guide, and it has a different operating system and a different shell for looks - if you want to know more about that, let me know!

Setup

Okay, so in this post we will use another system that you could also set up through Nix, BUT a few things first.

First, you can download this app through the Google Play Store, so most people won't be intimidated about getting it.

Second, Nix is great for reproducibility - not performance - so if you can compile things yourself, it's better.

So this time we will do just that. Also, the app we will use is the same app Nix-on-Droid was built on - I always try to introduce people to as many frameworks and systems as possible.

Okay, so first get Termux from the Google Play Store - or you can even get it from the F-Droid store if you've been here before.

This follows the official guide from the repository. I will add one extra step just to make sure we are all running the same version, so the instructions in this blog stay compatible with the guide. And a few instructions to make Termux better / nicer (at least to me).


apt update && apt upgrade -y

apt install nala -y
# nala is a wrapper for apt (the package manager on Termux) - it makes it easier to track
# packages and it's also nicer looking, to me at least. Completely optional, but two things
# if you skip it: use apt in place of nala in the rest of the guide, and also manually
# install the clang package so that you can compile C code and use the Makefiles in the
# next step.

nala install curl git cmake ccache make jq tmux fish -y # add clang if you didn't install nala
# curl is for downloading the models - the browser's Downloads directory isn't in an
# easy-to-reach place for every Termux setup, so fetching them directly is simpler
# git is for getting the framework and the other files we need

# cmake, make and ccache are so we can play around with and automate the instructions
# for making / compiling the .c/.cpp files

# jq is for parsing the JSON when interacting with the server

# tmux will let us run the server and the chat side by side later, and fish is just a
# nicer shell - more on fish below

git clone https://github.com/ggerganov/llama.cpp ~/llama.cpp # This is the actual framework, probably the best thing that happened to humanity. The other system we used, ollama, is built on top of this one.

cd ~/llama.cpp && git checkout a39ab21 # Pin a specific commit, just to make sure the instructions in this blog stay valid if llama.cpp changes in the future.

make -j 8 # -j 8 runs 8 build jobs in parallel. This was so simple, but trust me, some very smart people made this very easy for everyone.
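# Not sure how many cores your phone has? You could match the job count to them
# instead - nproc prints the number of available cores:
# make -j $(nproc)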

chsh -s $(which fish) # Optional - fish gives you command suggestions, lets you scroll through history based on the few letters you do remember, and similar nice things.

exec fish # Optional
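
If you want to double-check the build before moving on, the compiled binaries land in the repo root (at least at the commit we pinned above) - a quick, optional sanity check:

ls ~/llama.cpp/llama-*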

Okay, we're still just doing the preparation - so far we only have the framework. Getting all of this in one step was the magic of Nix...

Now we need to get the models. You can download them in your browser, figure out where your Downloads directory is, and move the file from there into the models directory. Or open the model's page in the browser, copy the download link, and use it with curl. Remember, you can use any model on huggingface - just make sure it's in the GGUF file format. The latest models are usually released by the company in that format; for older and different models I recommend checking out TheBloke.
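
If you went the browser route, note that Termux can't see your phone's shared storage until you grant it access. A minimal sketch - the Downloads path may differ per device, and the filename here is the one the curl command below saves:

termux-setup-storage # grants Termux access to shared storage and creates the ~/storage symlinks
mv ~/storage/downloads/tinyllama-1.1b-chat-v0.3.Q4_K_M.gguf ~/llama.cpp/models/tinyllama.Q4_K_M.gguf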

Now that we've yapped a bit:


curl -L -o ~/llama.cpp/models/tinyllama.Q4_K_M.gguf https://huggingface.co/TheBloke/TinyLlama-1.1B-Chat-v0.3-GGUF/resolve/main/tinyllama-1.1b-chat-v0.3.Q4_K_M.gguf 

# I used the same model we did in the Nix-on-Droid example, just for performance comparison.
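
Before firing up the full server, you can sanity-check the download with a one-off generation. A minimal sketch - I'm assuming the build put llama-cli in the repo root next to llama-server; -p is the prompt and -n the number of tokens to generate:

~/llama.cpp/llama-cli -m ~/llama.cpp/models/tinyllama.Q4_K_M.gguf -p "Tell me a joke:" -n 64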

Running the models

Okay, we have everything; now we can run the models. Remember we specified where to save the models - now we will be telling llama.cpp where to find them. If you tried being clever and did some tinkering with the commands earlier, keep tinkering now.


# Start a new tmux session named 'llama'
tmux new-session -d -s llama

# Use the first window for the server
tmux rename-window -t llama:0 'server'  # Rename the first window to 'server'
tmux send-keys -t llama:0 '~/llama.cpp/llama-server -m ~/llama.cpp/models/tinyllama.Q4_K_M.gguf ' C-m  # Start the server

# Create a new window for the chat script
tmux new-window -t llama:1 -n 'chat'  # Create a second window named 'chat'
tmux send-keys -t llama:1 'bash ~/llama.cpp/examples/server/chat.sh' C-m  # Start the chat script

# This is just the basic, general script. For the best performance you will have to check
# the parameters on huggingface - every model is a bit different - and then edit the
# chat.sh file accordingly, or create your own more complex system on top of it. Even
# ollama is built on top of this (not this script specifically, but the llama.cpp server).
# That's one of the reasons I'd rather recommend ollama over llama.cpp: it's more
# user-friendly and it has its own repository of models, which makes them easier to load.
# But you need Nix for ollama on a phone - it's actually the Nix people who maintain an
# arm64 (smartphone architecture) build of ollama. Anyone can essentially use the ollama
# files to build the framework with any build system that can interpret them, but that
# would be like compiling llama.cpp with your own make command - very, very hard and
# system specific. The Makefile for llama.cpp has instructions for multiple systems,
# even arm64 - that's why we don't need much.
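
# For example, a hypothetical tuned launch (check the model card for real values;
# -c sets the context window size and -t the number of CPU threads):
# ~/llama.cpp/llama-server -m ~/llama.cpp/models/tinyllama.Q4_K_M.gguf -c 2048 -t 4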

# Attach to the tmux session

tmux attach -t llama
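
By the way, this is where jq earns its keep: the server also speaks plain HTTP, so you can skip chat.sh and query it directly (from a third tmux window, or before attaching). A minimal sketch, assuming the server is still on its default address 127.0.0.1:8080 and using its /completion endpoint:

curl -s http://127.0.0.1:8080/completion \
  -H 'Content-Type: application/json' \
  -d '{"prompt": "Building a website can be done in 10 simple steps:", "n_predict": 64}' \
  | jq -r '.content'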


Conclusion

This way is not guaranteed to build and requires more tinkering in the terminal, setting up models and prompts to get the most out of them. What we did, though, should be sufficient for some minor chatting and testing the models. Again, go to the huggingface model cards and check their specific recommended parameters and prompt formats. I wanted to show everyone llama.cpp, and in case there were people intimidated by downloading things outside of the Google Play Store, now they have something as well! Check out the other blogs! babye