LLM By Examples: Build Llama.cpp with GPU (CUDA) support
As the demand for advanced language models continues to surge, developers increasingly seek high-performance solutions to harness their capabilities. Llama.cpp stands out as a powerful framework designed for efficient execution of large language models. This article aims to provide a comprehensive guide to building Llama.cpp with GPU (CUDA) support, enabling users to maximize computational efficiency.

Building Llama.cpp with GPU (CUDA) support lets it offload model computation to the GPU. By leveraging the parallel processing power of modern GPUs, developers can significantly reduce inference time, which makes real-time or near-real-time applications practical. In this guide, we will look at the prerequisites for setting up your environment, such as compatible GPU hardware and the CUDA toolkit, and then walk through the installation and build process step by step, so that you have the tools needed to run Llama.cpp on a GPU in your own natural language processing projects.
If you are not familiar with the core concepts of Llama.cpp, take a look at the link below first.
Prepare for installation
To demonstrate the build and installation process, we will use a Windows WSL2 environment with an NVIDIA RTX 2070 (8 GB) GPU. This configuration is common and can be found in many gaming laptops and PCs.
If you are interested in how to set up such an environment, take a look at the links below:
Now, let’s check what we have here:
$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Tue_Feb__7_19:32:13_PST_2023
Cuda compilation tools, release 12.1, V12.1.66
Build cuda_12.1.r12.1/compiler.32415258_0
$ nvidia-smi
Mon Oct 21 16:19:03 2024
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 530.30.02              Driver Version: 531.18       CUDA Version: 12.1     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                  Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf            Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce RTX 2070         On | 00000000:01:00.0 Off |                  N/A |
| N/A   48C    P0                34W / N/A|      0MiB /  8192MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|  No running processes found                                                           |
+---------------------------------------------------------------------------------------+
Build and Installation
To build Llama.cpp, we will need:
- cmake and support libraries
- git, to clone the llama.cpp git repo
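Before going further, it can help to confirm that these tools are already on the PATH. The sketch below is one way to do that. The tool list (git, cmake, nvcc) is our assumption based on the commands used in this guide, and missing_tools is a hypothetical helper name, not part of Llama.cpp.

```shell
# Print the name of every tool passed as an argument that is not on PATH.
# missing_tools is a hypothetical helper written for this guide.
missing_tools() (
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
        fi
    done
)

# Assumed tool list, based on the commands used in this guide.
missing=$(missing_tools git cmake nvcc)
if [ -n "$missing" ]; then
    # Left unquoted on purpose so the names are printed on one line.
    echo "Missing tools:" $missing
else
    echo "All required tools found."
fi
```

Any tool reported as missing can be installed with the apt commands shown in the steps that follow (nvcc comes with the CUDA toolkit rather than apt in this setup).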
Now, let’s get started.
$ git clone https://github.com/ggerganov/llama.cpp
Cloning into 'llama.cpp'...
remote: Enumerating objects: 35858, done.
remote: Counting objects: 100% (105/105), done.
remote: Compressing objects: 100% (91/91), done.
remote: Total 35858 (delta 35), reused 44 (delta 11), pack-reused 35753 (from 1)
Receiving objects: 100% (35858/35858), 59.87 MiB | 346.00 KiB/s, done.
Resolving deltas: 100% (26027/26027), done.
$ cd llama.cpp
$ sudo apt-get install libcurl4-openssl-dev
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
Suggested packages:
libcurl4-doc libidn11-dev libkrb5-dev libldap2-dev librtmp-dev libssh2-1-dev libssl-dev
The following NEW packages will be installed:
libcurl4-openssl-dev
0 upgraded, 1 newly installed, 0 to remove and 27 not upgraded.
Need to get 386 kB of archives.
After this operation, 1698 kB of additional disk space will be used.
Get:1 http://archive.ubuntu.com/ubuntu jammy-updates/main amd64 libcurl4-openssl-dev amd64 7.81.0-1ubuntu1.18 [386 kB]
Fetched 386 kB in 1s (468 kB/s)
Selecting previously unselected package libcurl4-openssl-dev:amd64.
(Reading database ... 37472 files and directories currently installed.)
Preparing to unpack .../libcurl4-openssl-dev_7.81.0-1ubuntu1.18_amd64.deb ...
Unpacking libcurl4-openssl-dev:amd64 (7.81.0-1ubuntu1.18) ...
Setting up libcurl4-openssl-dev:amd64 (7.81.0-1ubuntu1.18) ...
Processing triggers for man-db (2.10.2-1) ...
$ sudo apt install build-essential git cmake libopenblas-dev libatlas-base-dev
[sudo] password for wsluser:
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
build-essential is already the newest version (12.9ubuntu3).
build-essential set to manually installed.
git is already the newest version (1:2.34.1-1ubuntu1.11).
git set to manually installed.
The following additional packages will be installed:
cmake-data dh-elpa-helper emacsen-common libatlas3-base libjsoncpp25 libopenblas-pthread-dev libopenblas0
libopenblas0-pthread librhash0
Suggested packages:
cmake-doc ninja-build cmake-format libatlas-doc liblapack-doc
The following NEW packages will be installed:
cmake cmake-data dh-elpa-helper emacsen-common libatlas-base-dev libatlas3-base libjsoncpp25 libopenblas-dev
libopenblas-pthread-dev libopenblas0 libopenblas0-pthread librhash0
0 upgraded, 12 newly installed, 0 to remove and 27 not upgraded.
Need to get 25.5 MB of archives.
After this operation, 175 MB of additional disk space will be used.
Do you want to continue? [Y/n]
Get:1 http://archive.ubuntu.com/ubuntu jammy/main amd64 libjsoncpp25 amd64 1.9.5-3 [80.0 kB]
Get:2 http://archive.ubuntu.com/ubuntu jammy/main amd64 librhash0 amd64 1.4.2-1ubuntu1 [125 kB]
Get:3 http://archive.ubuntu.com/ubuntu jammy/main amd64 dh-elpa-helper all 2.0.9ubuntu1 [7610 B]
Get:4 http://archive.ubuntu.com/ubuntu jammy/main amd64 emacsen-common all 3.0.4 [14.9 kB]
Get:5 http://archive.ubuntu.com/ubuntu jammy-updates/main amd64 cmake-data all 3.22.1-1ubuntu1.22.04.2 [1913 kB]
Get:6 http://archive.ubuntu.com/ubuntu jammy-updates/main amd64 cmake amd64 3.22.1-1ubuntu1.22.04.2 [5010 kB]
Get:7 http://archive.ubuntu.com/ubuntu jammy/universe amd64 libatlas3-base amd64 3.10.3-12ubuntu1 [3340 kB]
Get:8 http://archive.ubuntu.com/ubuntu jammy/universe amd64 libatlas-base-dev amd64 3.10.3-12ubuntu1 [3590 kB]
Get:9 http://archive.ubuntu.com/ubuntu jammy/universe amd64 libopenblas0-pthread amd64 0.3.20+ds-1 [6803 kB]
Get:10 http://archive.ubuntu.com/ubuntu jammy/universe amd64 libopenblas0 amd64 0.3.20+ds-1 [6098 B]
Get:11 http://archive.ubuntu.com/ubuntu jammy/universe amd64 libopenblas-pthread-dev amd64 0.3.20+ds-1 [4634 kB]
Get:12 http://archive.ubuntu.com/ubuntu jammy/universe amd64 libopenblas-dev amd64 0.3.20+ds-1 [18.6 kB]
Fetched 25.5 MB in 1min 1s (419 kB/s)
Selecting previously unselected package libjsoncpp25:amd64.
(Reading database ... 34093 files and directories currently installed.)
Preparing to unpack .../00-libjsoncpp25_1.9.5-3_amd64.deb ...
Unpacking libjsoncpp25:amd64 (1.9.5-3) ...
Selecting previously unselected package librhash0:amd64.
Preparing to unpack .../01-librhash0_1.4.2-1ubuntu1_amd64.deb ...
Unpacking librhash0:amd64 (1.4.2-1ubuntu1) ...
Selecting previously unselected package dh-elpa-helper.
Preparing to unpack .../02-dh-elpa-helper_2.0.9ubuntu1_all.deb ...
Unpacking dh-elpa-helper (2.0.9ubuntu1) ...
Selecting previously unselected package emacsen-common.
Preparing to unpack .../03-emacsen-common_3.0.4_all.deb ...
Unpacking emacsen-common (3.0.4) ...
Selecting previously unselected package cmake-data.
Preparing to unpack .../04-cmake-data_3.22.1-1ubuntu1.22.04.2_all.deb ...
Unpacking cmake-data (3.22.1-1ubuntu1.22.04.2) ...
Selecting previously unselected package cmake.
Preparing to unpack .../05-cmake_3.22.1-1ubuntu1.22.04.2_amd64.deb ...
Unpacking cmake (3.22.1-1ubuntu1.22.04.2) ...
Selecting previously unselected package libatlas3-base:amd64.
Preparing to unpack .../06-libatlas3-base_3.10.3-12ubuntu1_amd64.deb ...
Unpacking libatlas3-base:amd64 (3.10.3-12ubuntu1) ...
Selecting previously unselected package libatlas-base-dev:amd64.
Preparing to unpack .../07-libatlas-base-dev_3.10.3-12ubuntu1_amd64.deb ...
Unpacking libatlas-base-dev:amd64 (3.10.3-12ubuntu1) ...
Selecting previously unselected package libopenblas0-pthread:amd64.
Preparing to unpack .../08-libopenblas0-pthread_0.3.20+ds-1_amd64.deb ...
Unpacking libopenblas0-pthread:amd64 (0.3.20+ds-1) ...
Selecting previously unselected package libopenblas0:amd64.
Preparing to unpack .../09-libopenblas0_0.3.20+ds-1_amd64.deb ...
Unpacking libopenblas0:amd64 (0.3.20+ds-1) ...
Selecting previously unselected package libopenblas-pthread-dev:amd64.
Preparing to unpack .../10-libopenblas-pthread-dev_0.3.20+ds-1_amd64.deb ...
Unpacking libopenblas-pthread-dev:amd64 (0.3.20+ds-1) ...
Selecting previously unselected package libopenblas-dev:amd64.
Preparing to unpack .../11-libopenblas-dev_0.3.20+ds-1_amd64.deb ...
Unpacking libopenblas-dev:amd64 (0.3.20+ds-1) ...
Setting up libopenblas0-pthread:amd64 (0.3.20+ds-1) ...
update-alternatives: using /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3 to provide /usr/lib/x86_64-linux-gnu/libblas.so.3 (libblas.so.3-x86_64-linux-gnu) in auto mode
update-alternatives: using /usr/lib/x86_64-linux-gnu/openblas-pthread/liblapack.so.3 to provide /usr/lib/x86_64-linux-gnu/liblapack.so.3 (liblapack.so.3-x86_64-linux-gnu) in auto mode
update-alternatives: using /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblas.so.0 to provide /usr/lib/x86_64-linux-gnu/libopenblas.so.0 (libopenblas.so.0-x86_64-linux-gnu) in auto mode
Setting up libatlas3-base:amd64 (3.10.3-12ubuntu1) ...
Setting up libatlas-base-dev:amd64 (3.10.3-12ubuntu1) ...
update-alternatives: using /usr/lib/x86_64-linux-gnu/atlas/libblas.so to provide /usr/lib/x86_64-linux-gnu/libblas.so (libblas.so-x86_64-linux-gnu) in auto mode
update-alternatives: using /usr/lib/x86_64-linux-gnu/atlas/liblapack.so to provide /usr/lib/x86_64-linux-gnu/liblapack.so (liblapack.so-x86_64-linux-gnu) in auto mode
Setting up emacsen-common (3.0.4) ...
Setting up dh-elpa-helper (2.0.9ubuntu1) ...
Setting up libjsoncpp25:amd64 (1.9.5-3) ...
Setting up libopenblas0:amd64 (0.3.20+ds-1) ...
Setting up librhash0:amd64 (1.4.2-1ubuntu1) ...
Setting up cmake-data (3.22.1-1ubuntu1.22.04.2) ...
Setting up libopenblas-pthread-dev:amd64 (0.3.20+ds-1) ...
update-alternatives: using /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so to provide /usr/lib/x86_64-linux-gnu/libblas.so (libblas.so-x86_64-linux-gnu) in auto mode
update-alternatives: using /usr/lib/x86_64-linux-gnu/openblas-pthread/liblapack.so to provide /usr/lib/x86_64-linux-gnu/liblapack.so (liblapack.so-x86_64-linux-gnu) in auto mode
update-alternatives: using /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblas.so to provide /usr/lib/x86_64-linux-gnu/libopenblas.so (libopenblas.so-x86_64-linux-gnu) in auto mode
Setting up libopenblas-dev:amd64 (0.3.20+ds-1) ...
Setting up cmake (3.22.1-1ubuntu1.22.04.2) ...
Processing triggers for man-db (2.10.2-1) ...
Processing triggers for libc-bin (2.35-0ubuntu3.8) ...
/sbin/ldconfig.real: /usr/lib/wsl/lib/libcuda.so.1 is not a symbolic link
$ export PATH=/usr/local/cuda-12.1/bin:$PATH
$ export LD_LIBRARY_PATH=/usr/local/cuda-12.1/lib64:$LD_LIBRARY_PATH
$ export CUDA_HOME=/usr/local/cuda-12.1/
$ export CUDA_VERSION=121
$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Tue_Feb__7_19:32:13_PST_2023
Cuda compilation tools, release 12.1, V12.1.66
Build cuda_12.1.r12.1/compiler.32415258_0
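The exports above last only for the current shell session. If you want to keep them (for example in ~/.bashrc), a small guard avoids adding the same directory again every time the file is sourced. This is a minimal sketch, assuming CUDA is installed under /usr/local/cuda-12.1 as in this guide. prepend_once is a hypothetical helper, not a standard shell command.

```shell
# Prepend a directory to a colon-separated list unless it is already there.
# $1 = directory to add, $2 = current list (may be empty).
prepend_once() {
    case ":$2:" in
        *":$1:"*) printf '%s\n' "$2" ;;          # already present, keep as is
        *)
            if [ -n "$2" ]; then
                printf '%s\n' "$1:$2"
            else
                printf '%s\n' "$1"
            fi
            ;;
    esac
}

# Assumed install prefix, taken from the exports in this guide.
CUDA_HOME=/usr/local/cuda-12.1
export CUDA_HOME

PATH=$(prepend_once "$CUDA_HOME/bin" "$PATH")
LD_LIBRARY_PATH=$(prepend_once "$CUDA_HOME/lib64" "${LD_LIBRARY_PATH:-}")
export PATH LD_LIBRARY_PATH
```

Sourcing this snippet several times leaves PATH and LD_LIBRARY_PATH with a single CUDA entry each.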
$ cmake -B build -DGGML_CUDA=ON
-- The C compiler identification is GNU 11.4.0
-- The CXX compiler identification is GNU 11.4.0
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/bin/cc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Found Git: /usr/bin/git (found version "2.34.1")
-- Looking for pthread.h
-- Looking for pthread.h - found
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
-- Found Threads: TRUE
-- Found OpenMP_C: -fopenmp (found version "4.5")
-- Found OpenMP_CXX: -fopenmp (found version "4.5")
-- Found OpenMP: TRUE (found version "4.5")
-- OpenMP found
-- Using llamafile
-- Using AMX
-- Found CUDAToolkit: /usr/local/cuda-12.1/include (found version "12.1.66")
-- CUDA found
-- Using CUDA architectures: 52;61;70;75
-- The CUDA compiler identification is NVIDIA 12.1.66
-- Detecting CUDA compiler ABI info
-- Detecting CUDA compiler ABI info - done
-- Check for working CUDA compiler: /usr/local/cuda-12.1/bin/nvcc - skipped
-- Detecting CUDA compile features
-- Detecting CUDA compile features - done
-- CUDA host compiler is GNU 11.4.0
-- Warning: ccache not found - consider installing it for faster compilation or disable this warning with GGML_CCACHE=OFF
-- CMAKE_SYSTEM_PROCESSOR: x86_64
-- x86 detected
-- Configuring done
-- Generating done
-- Build files have been written to: /work/GitHubs/MEAIDev/poc-ai-tool-llama-cpp/llama.cpp/build
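In the output above, the configure step selected the CUDA architectures 52;61;70;75, so kernels are compiled for several GPU generations. The RTX 2070 has compute capability 7.5, so building only for architecture 75 may shorten the build on this machine. The sketch below prints a configure command that uses the standard CMake variable CMAKE_CUDA_ARCHITECTURES. It assumes your nvidia-smi supports the compute_cap query field and falls back to 7.5 otherwise. cc_to_arch is a hypothetical helper.

```shell
# Convert a compute capability such as "7.5" into the form CMake expects ("75").
# cc_to_arch is a hypothetical helper written for this guide.
cc_to_arch() {
    printf '%s\n' "$1" | tr -d '. '
}

# Ask the driver for the first GPU's compute capability when nvidia-smi is
# available. The compute_cap field is assumed to be supported by your driver.
cap=""
if command -v nvidia-smi >/dev/null 2>&1; then
    cap=$(nvidia-smi --query-gpu=compute_cap --format=csv,noheader 2>/dev/null | head -n 1)
fi

# Fall back to 7.5, the value for the RTX 2070 used in this guide.
arch=$(cc_to_arch "${cap:-7.5}")

# Only print the command, so nothing is reconfigured by accident.
echo "cmake -B build -DGGML_CUDA=ON -DCMAKE_CUDA_ARCHITECTURES=$arch"
```

Running the printed command instead of the plain configure step is optional. The default architecture list also covers this GPU, it just compiles more kernels than one card needs.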
$ cmake --build build --config Release
[ 0%] Building C object ggml/src/CMakeFiles/ggml.dir/ggml.c.o
[ 1%] Building C object ggml/src/CMakeFiles/ggml.dir/ggml-alloc.c.o
[ 1%] Building CXX object ggml/src/CMakeFiles/ggml.dir/ggml-backend.cpp.o
[ 2%] Building C object ggml/src/CMakeFiles/ggml.dir/ggml-quants.c.o
[ 2%] Building CUDA object ggml/src/CMakeFiles/ggml.dir/ggml-cuda/acc.cu.o
[ 2%] Building CUDA object ggml/src/CMakeFiles/ggml.dir/ggml-cuda/arange.cu.o
[ 3%] Building CUDA object ggml/src/CMakeFiles/ggml.dir/ggml-cuda/argmax.cu.o
[ 3%] Building CUDA object ggml/src/CMakeFiles/ggml.dir/ggml-cuda/argsort.cu.o
[ 4%] Building CUDA object ggml/src/CMakeFiles/ggml.dir/ggml-cuda/binbcast.cu.o
[ 4%] Building CUDA object ggml/src/CMakeFiles/ggml.dir/ggml-cuda/clamp.cu.o
[ 4%] Building CUDA object ggml/src/CMakeFiles/ggml.dir/ggml-cuda/concat.cu.o
[ 5%] Building CUDA object ggml/src/CMakeFiles/ggml.dir/ggml-cuda/conv-transpose-1d.cu.o
[ 5%] Building CUDA object ggml/src/CMakeFiles/ggml.dir/ggml-cuda/convert.cu.o
[ 6%] Building CUDA object ggml/src/CMakeFiles/ggml.dir/ggml-cuda/count-equal.cu.o
[ 6%] Building CUDA object ggml/src/CMakeFiles/ggml.dir/ggml-cuda/cpy.cu.o
[ 6%] Building CUDA object ggml/src/CMakeFiles/ggml.dir/ggml-cuda/cross-entropy-loss.cu.o
[ 7%] Building CUDA object ggml/src/CMakeFiles/ggml.dir/ggml-cuda/diagmask.cu.o
[ 7%] Building CUDA object ggml/src/CMakeFiles/ggml.dir/ggml-cuda/dmmv.cu.o
[ 8%] Building CUDA object ggml/src/CMakeFiles/ggml.dir/ggml-cuda/fattn-tile-f16.cu.o
[ 8%] Building CUDA object ggml/src/CMakeFiles/ggml.dir/ggml-cuda/fattn-tile-f32.cu.o
[ 8%] Building CUDA object ggml/src/CMakeFiles/ggml.dir/ggml-cuda/fattn.cu.o
[ 9%] Building CUDA object ggml/src/CMakeFiles/ggml.dir/ggml-cuda/getrows.cu.o
[ 9%] Building CUDA object ggml/src/CMakeFiles/ggml.dir/ggml-cuda/im2col.cu.o
[ 10%] Building CUDA object ggml/src/CMakeFiles/ggml.dir/ggml-cuda/mmq.cu.o
[ 10%] Building CUDA object ggml/src/CMakeFiles/ggml.dir/ggml-cuda/mmvq.cu.o
[ 10%] Building CUDA object ggml/src/CMakeFiles/ggml.dir/ggml-cuda/norm.cu.o
[ 11%] Building CUDA object ggml/src/CMakeFiles/ggml.dir/ggml-cuda/opt-step-adamw.cu.o
[ 11%] Building CUDA object ggml/src/CMakeFiles/ggml.dir/ggml-cuda/out-prod.cu.o
[ 12%] Building CUDA object ggml/src/CMakeFiles/ggml.dir/ggml-cuda/pad.cu.o
[ 12%] Building CUDA object ggml/src/CMakeFiles/ggml.dir/ggml-cuda/pool2d.cu.o
[ 12%] Building CUDA object ggml/src/CMakeFiles/ggml.dir/ggml-cuda/quantize.cu.o
[ 13%] Building CUDA object ggml/src/CMakeFiles/ggml.dir/ggml-cuda/rope.cu.o
[ 13%] Building CUDA object ggml/src/CMakeFiles/ggml.dir/ggml-cuda/rwkv-wkv.cu.o
[ 14%] Building CUDA object ggml/src/CMakeFiles/ggml.dir/ggml-cuda/scale.cu.o
[ 14%] Building CUDA object ggml/src/CMakeFiles/ggml.dir/ggml-cuda/softmax.cu.o
[ 14%] Building CUDA object ggml/src/CMakeFiles/ggml.dir/ggml-cuda/sum.cu.o
[ 15%] Building CUDA object ggml/src/CMakeFiles/ggml.dir/ggml-cuda/sumrows.cu.o
[ 15%] Building CUDA object ggml/src/CMakeFiles/ggml.dir/ggml-cuda/tsembd.cu.o
[ 16%] Building CUDA object ggml/src/CMakeFiles/ggml.dir/ggml-cuda/unary.cu.o
[ 16%] Building CUDA object ggml/src/CMakeFiles/ggml.dir/ggml-cuda/upscale.cu.o
[ 16%] Building CUDA object ggml/src/CMakeFiles/ggml.dir/ggml-cuda.cu.o
[ 17%] Building CUDA object ggml/src/CMakeFiles/ggml.dir/ggml-cuda/template-instances/fattn-wmma-f16-instance-kqfloat-cpb16.cu.o
[ 17%] Building CUDA object ggml/src/CMakeFiles/ggml.dir/ggml-cuda/template-instances/fattn-wmma-f16-instance-kqfloat-cpb32.cu.o
[ 18%] Building CUDA object ggml/src/CMakeFiles/ggml.dir/ggml-cuda/template-instances/fattn-wmma-f16-instance-kqhalf-cpb16.cu.o
[ 18%] Building CUDA object ggml/src/CMakeFiles/ggml.dir/ggml-cuda/template-instances/fattn-wmma-f16-instance-kqhalf-cpb32.cu.o
[ 18%] Building CUDA object ggml/src/CMakeFiles/ggml.dir/ggml-cuda/template-instances/fattn-wmma-f16-instance-kqhalf-cpb8.cu.o
[ 19%] Building CUDA object ggml/src/CMakeFiles/ggml.dir/ggml-cuda/template-instances/mmq-instance-iq1_s.cu.o
[ 19%] Building CUDA object ggml/src/CMakeFiles/ggml.dir/ggml-cuda/template-instances/mmq-instance-iq2_s.cu.o
[ 20%] Building CUDA object ggml/src/CMakeFiles/ggml.dir/ggml-cuda/template-instances/mmq-instance-iq2_xs.cu.o
[ 20%] Building CUDA object ggml/src/CMakeFiles/ggml.dir/ggml-cuda/template-instances/mmq-instance-iq2_xxs.cu.o
[ 20%] Building CUDA object ggml/src/CMakeFiles/ggml.dir/ggml-cuda/template-instances/mmq-instance-iq3_s.cu.o
[ 21%] Building CUDA object ggml/src/CMakeFiles/ggml.dir/ggml-cuda/template-instances/mmq-instance-iq3_xxs.cu.o
[ 21%] Building CUDA object ggml/src/CMakeFiles/ggml.dir/ggml-cuda/template-instances/mmq-instance-iq4_nl.cu.o
[ 22%] Building CUDA object ggml/src/CMakeFiles/ggml.dir/ggml-cuda/template-instances/mmq-instance-iq4_xs.cu.o
[ 22%] Building CUDA object ggml/src/CMakeFiles/ggml.dir/ggml-cuda/template-instances/mmq-instance-q2_k.cu.o
[ 22%] Building CUDA object ggml/src/CMakeFiles/ggml.dir/ggml-cuda/template-instances/mmq-instance-q3_k.cu.o
[ 23%] Building CUDA object ggml/src/CMakeFiles/ggml.dir/ggml-cuda/template-instances/mmq-instance-q4_0.cu.o
[ 23%] Building CUDA object ggml/src/CMakeFiles/ggml.dir/ggml-cuda/template-instances/mmq-instance-q4_1.cu.o
[ 24%] Building CUDA object ggml/src/CMakeFiles/ggml.dir/ggml-cuda/template-instances/mmq-instance-q4_k.cu.o
[ 24%] Building CUDA object ggml/src/CMakeFiles/ggml.dir/ggml-cuda/template-instances/mmq-instance-q5_0.cu.o
[ 24%] Building CUDA object ggml/src/CMakeFiles/ggml.dir/ggml-cuda/template-instances/mmq-instance-q5_1.cu.o
[ 25%] Building CUDA object ggml/src/CMakeFiles/ggml.dir/ggml-cuda/template-instances/mmq-instance-q5_k.cu.o
[ 25%] Building CUDA object ggml/src/CMakeFiles/ggml.dir/ggml-cuda/template-instances/mmq-instance-q6_k.cu.o
[ 26%] Building CUDA object ggml/src/CMakeFiles/ggml.dir/ggml-cuda/template-instances/mmq-instance-q8_0.cu.o
[ 26%] Building CUDA object ggml/src/CMakeFiles/ggml.dir/ggml-cuda/template-instances/fattn-vec-f16-instance-hs128-q4_0-q4_0.cu.o
[ 26%] Building CUDA object ggml/src/CMakeFiles/ggml.dir/ggml-cuda/template-instances/fattn-vec-f32-instance-hs128-q4_0-q4_0.cu.o
[ 27%] Building CUDA object ggml/src/CMakeFiles/ggml.dir/ggml-cuda/template-instances/fattn-vec-f16-instance-hs128-q8_0-q8_0.cu.o
[ 27%] Building CUDA object ggml/src/CMakeFiles/ggml.dir/ggml-cuda/template-instances/fattn-vec-f32-instance-hs128-q8_0-q8_0.cu.o
[ 28%] Building CUDA object ggml/src/CMakeFiles/ggml.dir/ggml-cuda/template-instances/fattn-vec-f16-instance-hs128-f16-f16.cu.o
[ 28%] Building CUDA object ggml/src/CMakeFiles/ggml.dir/ggml-cuda/template-instances/fattn-vec-f16-instance-hs256-f16-f16.cu.o
[ 28%] Building CUDA object ggml/src/CMakeFiles/ggml.dir/ggml-cuda/template-instances/fattn-vec-f16-instance-hs64-f16-f16.cu.o
[ 29%] Building CUDA object ggml/src/CMakeFiles/ggml.dir/ggml-cuda/template-instances/fattn-vec-f32-instance-hs128-f16-f16.cu.o
[ 29%] Building CUDA object ggml/src/CMakeFiles/ggml.dir/ggml-cuda/template-instances/fattn-vec-f32-instance-hs256-f16-f16.cu.o
[ 30%] Building CUDA object ggml/src/CMakeFiles/ggml.dir/ggml-cuda/template-instances/fattn-vec-f32-instance-hs64-f16-f16.cu.o
[ 30%] Building CXX object ggml/src/CMakeFiles/ggml.dir/llamafile/sgemm.cpp.o
[ 30%] Building CXX object ggml/src/CMakeFiles/ggml.dir/ggml-amx/mmq.cpp.o
[ 31%] Building CXX object ggml/src/CMakeFiles/ggml.dir/ggml-amx.cpp.o
[ 31%] Building C object ggml/src/CMakeFiles/ggml.dir/ggml-aarch64.c.o
[ 32%] Linking CXX shared library libggml.so
[ 32%] Built target ggml
[ 32%] Building CXX object src/CMakeFiles/llama.dir/llama.cpp.o
[ 32%] Building CXX object src/CMakeFiles/llama.dir/llama-vocab.cpp.o
[ 33%] Building CXX object src/CMakeFiles/llama.dir/llama-grammar.cpp.o
[ 33%] Building CXX object src/CMakeFiles/llama.dir/llama-sampling.cpp.o
[ 34%] Building CXX object src/CMakeFiles/llama.dir/unicode.cpp.o
[ 34%] Building CXX object src/CMakeFiles/llama.dir/unicode-data.cpp.o
[ 34%] Linking CXX shared library libllama.so
[ 34%] Built target llama
[ 34%] Generating build details from Git
-- Found Git: /usr/bin/git (found version "2.34.1")
[ 34%] Building CXX object common/CMakeFiles/build_info.dir/build-info.cpp.o
[ 34%] Built target build_info
[ 35%] Building CXX object common/CMakeFiles/common.dir/arg.cpp.o
[ 35%] Building CXX object common/CMakeFiles/common.dir/common.cpp.o
[ 36%] Building CXX object common/CMakeFiles/common.dir/console.cpp.o
[ 36%] Building CXX object common/CMakeFiles/common.dir/json-schema-to-grammar.cpp.o
[ 36%] Building CXX object common/CMakeFiles/common.dir/log.cpp.o
[ 37%] Building CXX object common/CMakeFiles/common.dir/ngram-cache.cpp.o
[ 37%] Building CXX object common/CMakeFiles/common.dir/sampling.cpp.o
[ 38%] Building CXX object common/CMakeFiles/common.dir/train.cpp.o
[ 38%] Linking CXX static library libcommon.a
[ 38%] Built target common
[ 38%] Building CXX object tests/CMakeFiles/test-tokenizer-0.dir/test-tokenizer-0.cpp.o
[ 39%] Linking CXX executable ../bin/test-tokenizer-0
[ 39%] Built target test-tokenizer-0
[ 39%] Building CXX object tests/CMakeFiles/test-tokenizer-1-bpe.dir/test-tokenizer-1-bpe.cpp.o
[ 39%] Linking CXX executable ../bin/test-tokenizer-1-bpe
[ 39%] Built target test-tokenizer-1-bpe
[ 40%] Building CXX object tests/CMakeFiles/test-tokenizer-1-spm.dir/test-tokenizer-1-spm.cpp.o
[ 40%] Linking CXX executable ../bin/test-tokenizer-1-spm
[ 40%] Built target test-tokenizer-1-spm
[ 40%] Building CXX object tests/CMakeFiles/test-log.dir/test-log.cpp.o
[ 40%] Building CXX object tests/CMakeFiles/test-log.dir/get-model.cpp.o
[ 41%] Linking CXX executable ../bin/test-log
[ 41%] Built target test-log
[ 41%] Building CXX object tests/CMakeFiles/test-arg-parser.dir/test-arg-parser.cpp.o
[ 42%] Building CXX object tests/CMakeFiles/test-arg-parser.dir/get-model.cpp.o
[ 42%] Linking CXX executable ../bin/test-arg-parser
[ 42%] Built target test-arg-parser
[ 42%] Building CXX object tests/CMakeFiles/test-quantize-fns.dir/test-quantize-fns.cpp.o
[ 43%] Building CXX object tests/CMakeFiles/test-quantize-fns.dir/get-model.cpp.o
[ 43%] Linking CXX executable ../bin/test-quantize-fns
[ 43%] Built target test-quantize-fns
[ 44%] Building CXX object tests/CMakeFiles/test-quantize-perf.dir/test-quantize-perf.cpp.o
[ 44%] Building CXX object tests/CMakeFiles/test-quantize-perf.dir/get-model.cpp.o
[ 44%] Linking CXX executable ../bin/test-quantize-perf
[ 44%] Built target test-quantize-perf
[ 44%] Building CXX object tests/CMakeFiles/test-sampling.dir/test-sampling.cpp.o
[ 44%] Building CXX object tests/CMakeFiles/test-sampling.dir/get-model.cpp.o
[ 45%] Linking CXX executable ../bin/test-sampling
[ 45%] Built target test-sampling
[ 46%] Building CXX object tests/CMakeFiles/test-chat-template.dir/test-chat-template.cpp.o
[ 46%] Building CXX object tests/CMakeFiles/test-chat-template.dir/get-model.cpp.o
[ 47%] Linking CXX executable ../bin/test-chat-template
[ 47%] Built target test-chat-template
[ 47%] Building CXX object tests/CMakeFiles/test-grammar-parser.dir/test-grammar-parser.cpp.o
[ 48%] Building CXX object tests/CMakeFiles/test-grammar-parser.dir/get-model.cpp.o
[ 48%] Linking CXX executable ../bin/test-grammar-parser
[ 48%] Built target test-grammar-parser
[ 49%] Building CXX object tests/CMakeFiles/test-llama-grammar.dir/test-llama-grammar.cpp.o
[ 49%] Building CXX object tests/CMakeFiles/test-llama-grammar.dir/get-model.cpp.o
[ 50%] Linking CXX executable ../bin/test-llama-grammar
[ 50%] Built target test-llama-grammar
[ 50%] Building CXX object tests/CMakeFiles/test-grammar-integration.dir/test-grammar-integration.cpp.o
[ 51%] Building CXX object tests/CMakeFiles/test-grammar-integration.dir/get-model.cpp.o
[ 51%] Linking CXX executable ../bin/test-grammar-integration
[ 51%] Built target test-grammar-integration
[ 51%] Building CXX object tests/CMakeFiles/test-grad0.dir/test-grad0.cpp.o
[ 51%] Building CXX object tests/CMakeFiles/test-grad0.dir/get-model.cpp.o
[ 52%] Linking CXX executable ../bin/test-grad0
[ 52%] Built target test-grad0
[ 53%] Building CXX object tests/CMakeFiles/test-barrier.dir/test-barrier.cpp.o
[ 53%] Building CXX object tests/CMakeFiles/test-barrier.dir/get-model.cpp.o
[ 54%] Linking CXX executable ../bin/test-barrier
[ 54%] Built target test-barrier
[ 55%] Building CXX object tests/CMakeFiles/test-backend-ops.dir/test-backend-ops.cpp.o
[ 55%] Building CXX object tests/CMakeFiles/test-backend-ops.dir/get-model.cpp.o
[ 55%] Linking CXX executable ../bin/test-backend-ops
[ 55%] Built target test-backend-ops
[ 56%] Building CXX object tests/CMakeFiles/test-rope.dir/test-rope.cpp.o
[ 56%] Building CXX object tests/CMakeFiles/test-rope.dir/get-model.cpp.o
[ 57%] Linking CXX executable ../bin/test-rope
[ 57%] Built target test-rope
[ 57%] Building CXX object tests/CMakeFiles/test-model-load-cancel.dir/test-model-load-cancel.cpp.o
[ 58%] Building CXX object tests/CMakeFiles/test-model-load-cancel.dir/get-model.cpp.o
[ 58%] Linking CXX executable ../bin/test-model-load-cancel
[ 58%] Built target test-model-load-cancel
[ 58%] Building CXX object tests/CMakeFiles/test-autorelease.dir/test-autorelease.cpp.o
[ 59%] Building CXX object tests/CMakeFiles/test-autorelease.dir/get-model.cpp.o
[ 59%] Linking CXX executable ../bin/test-autorelease
[ 59%] Built target test-autorelease
[ 60%] Building CXX object tests/CMakeFiles/test-json-schema-to-grammar.dir/test-json-schema-to-grammar.cpp.o
[ 60%] Building CXX object tests/CMakeFiles/test-json-schema-to-grammar.dir/get-model.cpp.o
[ 60%] Linking CXX executable ../bin/test-json-schema-to-grammar
[ 60%] Built target test-json-schema-to-grammar
[ 60%] Building C object tests/CMakeFiles/test-c.dir/test-c.c.o
[ 60%] Linking C executable ../bin/test-c
[ 60%] Built target test-c
[ 61%] Building CXX object examples/cvector-generator/CMakeFiles/llama-cvector-generator.dir/cvector-generator.cpp.o
[ 61%] Linking CXX executable ../../bin/llama-cvector-generator
[ 61%] Built target llama-cvector-generator
[ 62%] Building CXX object examples/baby-llama/CMakeFiles/llama-baby-llama.dir/baby-llama.cpp.o
[ 62%] Linking CXX executable ../../bin/llama-baby-llama
[ 62%] Built target llama-baby-llama
[ 62%] Building CXX object examples/batched-bench/CMakeFiles/llama-batched-bench.dir/batched-bench.cpp.o
[ 63%] Linking CXX executable ../../bin/llama-batched-bench
[ 63%] Built target llama-batched-bench
[ 64%] Building CXX object examples/batched/CMakeFiles/llama-batched.dir/batched.cpp.o
[ 64%] Linking CXX executable ../../bin/llama-batched
[ 64%] Built target llama-batched
[ 65%] Building CXX object examples/convert-llama2c-to-ggml/CMakeFiles/llama-convert-llama2c-to-ggml.dir/convert-llama2c-to-ggml.cpp.o
[ 65%] Linking CXX executable ../../bin/llama-convert-llama2c-to-ggml
[ 65%] Built target llama-convert-llama2c-to-ggml
[ 65%] Building CXX object examples/embedding/CMakeFiles/llama-embedding.dir/embedding.cpp.o
[ 66%] Linking CXX executable ../../bin/llama-embedding
[ 66%] Built target llama-embedding
[ 66%] Building CXX object examples/eval-callback/CMakeFiles/llama-eval-callback.dir/eval-callback.cpp.o
[ 67%] Linking CXX executable ../../bin/llama-eval-callback
[ 67%] Built target llama-eval-callback
[ 67%] Building CXX object examples/export-lora/CMakeFiles/llama-export-lora.dir/export-lora.cpp.o
[ 67%] Linking CXX executable ../../bin/llama-export-lora
[ 67%] Built target llama-export-lora
[ 68%] Building CXX object examples/gbnf-validator/CMakeFiles/llama-gbnf-validator.dir/gbnf-validator.cpp.o
[ 68%] Linking CXX executable ../../bin/llama-gbnf-validator
[ 68%] Built target llama-gbnf-validator
[ 69%] Building C object examples/gguf-hash/CMakeFiles/sha256.dir/deps/sha256/sha256.c.o
[ 69%] Built target sha256
[ 70%] Building C object examples/gguf-hash/CMakeFiles/xxhash.dir/deps/xxhash/xxhash.c.o
[ 70%] Built target xxhash
[ 70%] Building C object examples/gguf-hash/CMakeFiles/sha1.dir/deps/sha1/sha1.c.o
[ 70%] Built target sha1
[ 70%] Building CXX object examples/gguf-hash/CMakeFiles/llama-gguf-hash.dir/gguf-hash.cpp.o
[ 71%] Linking CXX executable ../../bin/llama-gguf-hash
[ 71%] Built target llama-gguf-hash
[ 71%] Building CXX object examples/gguf-split/CMakeFiles/llama-gguf-split.dir/gguf-split.cpp.o
[ 72%] Linking CXX executable ../../bin/llama-gguf-split
[ 72%] Built target llama-gguf-split
[ 73%] Building CXX object examples/gguf/CMakeFiles/llama-gguf.dir/gguf.cpp.o
[ 73%] Linking CXX executable ../../bin/llama-gguf
[ 73%] Built target llama-gguf
[ 73%] Building CXX object examples/gritlm/CMakeFiles/llama-gritlm.dir/gritlm.cpp.o
[ 73%] Linking CXX executable ../../bin/llama-gritlm
[ 73%] Built target llama-gritlm
[ 74%] Building CXX object examples/imatrix/CMakeFiles/llama-imatrix.dir/imatrix.cpp.o
[ 74%] Linking CXX executable ../../bin/llama-imatrix
[ 74%] Built target llama-imatrix
[ 75%] Building CXX object examples/infill/CMakeFiles/llama-infill.dir/infill.cpp.o
[ 75%] Linking CXX executable ../../bin/llama-infill
[ 75%] Built target llama-infill
[ 75%] Building CXX object examples/llama-bench/CMakeFiles/llama-bench.dir/llama-bench.cpp.o
[ 76%] Linking CXX executable ../../bin/llama-bench
[ 76%] Built target llama-bench
[ 77%] Building CXX object examples/llava/CMakeFiles/llava.dir/llava.cpp.o
[ 77%] Building CXX object examples/llava/CMakeFiles/llava.dir/clip.cpp.o
[ 77%] Built target llava
[ 77%] Linking CXX static library libllava_static.a
[ 77%] Built target llava_static
[ 78%] Linking CXX shared library libllava_shared.so
[ 78%] Built target llava_shared
[ 78%] Building CXX object examples/llava/CMakeFiles/llama-llava-cli.dir/llava-cli.cpp.o
[ 79%] Linking CXX executable ../../bin/llama-llava-cli
[ 79%] Built target llama-llava-cli
[ 79%] Building CXX object examples/llava/CMakeFiles/llama-minicpmv-cli.dir/minicpmv-cli.cpp.o
[ 80%] Linking CXX executable ../../bin/llama-minicpmv-cli
[ 80%] Built target llama-minicpmv-cli
[ 80%] Building CXX object examples/lookahead/CMakeFiles/llama-lookahead.dir/lookahead.cpp.o
[ 81%] Linking CXX executable ../../bin/llama-lookahead
[ 81%] Built target llama-lookahead
[ 81%] Building CXX object examples/lookup/CMakeFiles/llama-lookup.dir/lookup.cpp.o
[ 81%] Linking CXX executable ../../bin/llama-lookup
[ 81%] Built target llama-lookup
[ 82%] Building CXX object examples/lookup/CMakeFiles/llama-lookup-create.dir/lookup-create.cpp.o
[ 82%] Linking CXX executable ../../bin/llama-lookup-create
[ 82%] Built target llama-lookup-create
[ 83%] Building CXX object examples/lookup/CMakeFiles/llama-lookup-merge.dir/lookup-merge.cpp.o
[ 83%] Linking CXX executable ../../bin/llama-lookup-merge
[ 83%] Built target llama-lookup-merge
[ 83%] Building CXX object examples/lookup/CMakeFiles/llama-lookup-stats.dir/lookup-stats.cpp.o
[ 84%] Linking CXX executable ../../bin/llama-lookup-stats
[ 84%] Built target llama-lookup-stats
[ 84%] Building CXX object examples/main/CMakeFiles/llama-cli.dir/main.cpp.o
[ 84%] Linking CXX executable ../../bin/llama-cli
[ 84%] Built target llama-cli
[ 84%] Building CXX object examples/parallel/CMakeFiles/llama-parallel.dir/parallel.cpp.o
[ 84%] Linking CXX executable ../../bin/llama-parallel
[ 84%] Built target llama-parallel
[ 85%] Building CXX object examples/passkey/CMakeFiles/llama-passkey.dir/passkey.cpp.o
[ 85%] Linking CXX executable ../../bin/llama-passkey
[ 85%] Built target llama-passkey
[ 86%] Building CXX object examples/perplexity/CMakeFiles/llama-perplexity.dir/perplexity.cpp.o
[ 86%] Linking CXX executable ../../bin/llama-perplexity
[ 86%] Built target llama-perplexity
[ 86%] Building CXX object examples/quantize-stats/CMakeFiles/llama-quantize-stats.dir/quantize-stats.cpp.o
[ 86%] Linking CXX executable ../../bin/llama-quantize-stats
[ 86%] Built target llama-quantize-stats
[ 86%] Building CXX object examples/quantize/CMakeFiles/llama-quantize.dir/quantize.cpp.o
[ 87%] Linking CXX executable ../../bin/llama-quantize
[ 87%] Built target llama-quantize
[ 88%] Building CXX object examples/retrieval/CMakeFiles/llama-retrieval.dir/retrieval.cpp.o
[ 88%] Linking CXX executable ../../bin/llama-retrieval
[ 88%] Built target llama-retrieval
[ 88%] Generating theme-snowstorm.css.hpp
[ 88%] Generating colorthemes.css.hpp
[ 89%] Generating completion.js.hpp
[ 89%] Generating index-new.html.hpp
[ 90%] Generating index.html.hpp
[ 90%] Generating index.js.hpp
[ 90%] Generating json-schema-to-grammar.mjs.hpp
[ 90%] Generating loading.html.hpp
[ 91%] Generating prompt-formats.js.hpp
[ 92%] Generating style.css.hpp
[ 92%] Generating system-prompts.js.hpp
[ 92%] Generating theme-beeninorder.css.hpp
[ 93%] Generating theme-ketivah.css.hpp
[ 93%] Generating theme-mangotango.css.hpp
[ 93%] Generating theme-playground.css.hpp
[ 94%] Generating theme-polarnight.css.hpp
[ 95%] Building CXX object examples/server/CMakeFiles/llama-server.dir/server.cpp.o
[ 95%] Linking CXX executable ../../bin/llama-server
[ 95%] Built target llama-server
[ 96%] Building CXX object examples/save-load-state/CMakeFiles/llama-save-load-state.dir/save-load-state.cpp.o
[ 96%] Linking CXX executable ../../bin/llama-save-load-state
[ 96%] Built target llama-save-load-state
[ 97%] Building CXX object examples/simple/CMakeFiles/llama-simple.dir/simple.cpp.o
[ 97%] Linking CXX executable ../../bin/llama-simple
[ 97%] Built target llama-simple
[ 97%] Building CXX object examples/speculative/CMakeFiles/llama-speculative.dir/speculative.cpp.o
[ 98%] Linking CXX executable ../../bin/llama-speculative
[ 98%] Built target llama-speculative
[ 98%] Building CXX object examples/tokenize/CMakeFiles/llama-tokenize.dir/tokenize.cpp.o
[ 99%] Linking CXX executable ../../bin/llama-tokenize
[ 99%] Built target llama-tokenize
[ 99%] Building CXX object pocs/vdot/CMakeFiles/llama-vdot.dir/vdot.cpp.o
[ 99%] Linking CXX executable ../../bin/llama-vdot
[ 99%] Built target llama-vdot
[ 99%] Building CXX object pocs/vdot/CMakeFiles/llama-q8dot.dir/q8dot.cpp.o
[100%] Linking CXX executable ../../bin/llama-q8dot
[100%] Built target llama-q8dot
After the build, you can find all the command-line tools in the llama.cpp/build/bin directory:
$ ls -l build/bin
total 30280
-rwxr-xr-x 1 wsluser wsluser 420688 Oct 21 14:41 llama-baby-llama
-rwxr-xr-x 1 wsluser wsluser 961096 Oct 21 14:41 llama-batched
-rwxr-xr-x 1 wsluser wsluser 961144 Oct 21 14:41 llama-batched-bench
-rwxr-xr-x 1 wsluser wsluser 487264 Oct 21 14:42 llama-bench
-rwxr-xr-x 1 wsluser wsluser 998336 Oct 21 14:42 llama-cli
-rwxr-xr-x 1 wsluser wsluser 366528 Oct 21 14:41 llama-convert-llama2c-to-ggml
-rwxr-xr-x 1 wsluser wsluser 994784 Oct 21 14:41 llama-cvector-generator
-rwxr-xr-x 1 wsluser wsluser 965616 Oct 21 14:41 llama-embedding
-rwxr-xr-x 1 wsluser wsluser 961504 Oct 21 14:41 llama-eval-callback
-rwxr-xr-x 1 wsluser wsluser 999344 Oct 21 14:41 llama-export-lora
-rwxr-xr-x 1 wsluser wsluser 28344 Oct 21 14:41 llama-gbnf-validator
-rwxr-xr-x 1 wsluser wsluser 28056 Oct 21 14:41 llama-gguf
-rwxr-xr-x 1 wsluser wsluser 103448 Oct 21 14:41 llama-gguf-hash
-rwxr-xr-x 1 wsluser wsluser 48064 Oct 21 14:41 llama-gguf-split
-rwxr-xr-x 1 wsluser wsluser 961832 Oct 21 14:41 llama-gritlm
-rwxr-xr-x 1 wsluser wsluser 1004344 Oct 21 14:42 llama-imatrix
-rwxr-xr-x 1 wsluser wsluser 984752 Oct 21 14:42 llama-infill
-rwxr-xr-x 1 wsluser wsluser 1253696 Oct 21 14:42 llama-llava-cli
-rwxr-xr-x 1 wsluser wsluser 966048 Oct 21 14:42 llama-lookahead
-rwxr-xr-x 1 wsluser wsluser 995376 Oct 21 14:42 llama-lookup
-rwxr-xr-x 1 wsluser wsluser 978248 Oct 21 14:42 llama-lookup-create
-rwxr-xr-x 1 wsluser wsluser 69792 Oct 21 14:42 llama-lookup-merge
-rwxr-xr-x 1 wsluser wsluser 987240 Oct 21 14:42 llama-lookup-stats
-rwxr-xr-x 1 wsluser wsluser 1249088 Oct 21 14:42 llama-minicpmv-cli
-rwxr-xr-x 1 wsluser wsluser 970280 Oct 21 14:42 llama-parallel
-rwxr-xr-x 1 wsluser wsluser 961376 Oct 21 14:42 llama-passkey
-rwxr-xr-x 1 wsluser wsluser 1059232 Oct 21 14:42 llama-perplexity
-rwxr-xr-x 1 wsluser wsluser 21184 Oct 21 14:43 llama-q8dot
-rwxr-xr-x 1 wsluser wsluser 359752 Oct 21 14:42 llama-quantize
-rwxr-xr-x 1 wsluser wsluser 213016 Oct 21 14:42 llama-quantize-stats
-rwxr-xr-x 1 wsluser wsluser 975176 Oct 21 14:42 llama-retrieval
-rwxr-xr-x 1 wsluser wsluser 961656 Oct 21 14:43 llama-save-load-state
-rwxr-xr-x 1 wsluser wsluser 1932464 Oct 21 14:42 llama-server
-rwxr-xr-x 1 wsluser wsluser 26824 Oct 21 14:43 llama-simple
-rwxr-xr-x 1 wsluser wsluser 989720 Oct 21 14:43 llama-speculative
-rwxr-xr-x 1 wsluser wsluser 337904 Oct 21 14:43 llama-tokenize
-rwxr-xr-x 1 wsluser wsluser 21768 Oct 21 14:43 llama-vdot
-rwxr-xr-x 1 wsluser wsluser 966504 Oct 21 14:41 test-arg-parser
-rwxr-xr-x 1 wsluser wsluser 18152 Oct 21 14:41 test-autorelease
-rwxr-xr-x 1 wsluser wsluser 380080 Oct 21 14:41 test-backend-ops
-rwxr-xr-x 1 wsluser wsluser 22088 Oct 21 14:41 test-barrier
-rwxr-xr-x 1 wsluser wsluser 15776 Oct 21 14:41 test-c
-rwxr-xr-x 1 wsluser wsluser 354536 Oct 21 14:41 test-chat-template
-rwxr-xr-x 1 wsluser wsluser 61080 Oct 21 14:41 test-grad0
-rwxr-xr-x 1 wsluser wsluser 602752 Oct 21 14:41 test-grammar-integration
-rwxr-xr-x 1 wsluser wsluser 41072 Oct 21 14:41 test-grammar-parser
-rwxr-xr-x 1 wsluser wsluser 596752 Oct 21 14:41 test-json-schema-to-grammar
-rwxr-xr-x 1 wsluser wsluser 46336 Oct 21 14:41 test-llama-grammar
-rwxr-xr-x 1 wsluser wsluser 34064 Oct 21 14:41 test-log
-rwxr-xr-x 1 wsluser wsluser 16512 Oct 21 14:41 test-model-load-cancel
-rwxr-xr-x 1 wsluser wsluser 17552 Oct 21 14:41 test-quantize-fns
-rwxr-xr-x 1 wsluser wsluser 41632 Oct 21 14:41 test-quantize-perf
-rwxr-xr-x 1 wsluser wsluser 17336 Oct 21 14:41 test-rope
-rwxr-xr-x 1 wsluser wsluser 45368 Oct 21 14:41 test-sampling
-rwxr-xr-x 1 wsluser wsluser 352920 Oct 21 14:41 test-tokenizer-0
-rwxr-xr-x 1 wsluser wsluser 330064 Oct 21 14:41 test-tokenizer-1-bpe
-rwxr-xr-x 1 wsluser wsluser 329840 Oct 21 14:41 test-tokenizer-1-spm
If you see errors during the compile and build process, note the name of the file that failed. Most of the time, an error affects only that file or the related model, not the whole installation.
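As a minimal sketch of how to collect those file names, the helper below scans a saved build log for compiler error lines and prints each affected source file once. The function name failed_files and the log file name build.log are hypothetical, and the sketch assumes the usual error formats of GCC/Clang (file:line:col: error: ...) and nvcc (file(line): error: ...).

```shell
# Hypothetical helper: list the source files named in compiler error lines
# of a saved build log. Assumed error formats:
#   <file>:<line>:<col>: error: <message>    (GCC / Clang)
#   <file>(<line>): error: <message>         (nvcc)
failed_files() {
    log_file="$1"
    # Keep only lines that start with a source path followed by a line number
    # and contain "error:", then cut everything after the path and de-duplicate.
    grep -E '^[^[:space:]:(]+\.(c|cc|cpp|cu|cuh|h|hpp)[:(][0-9]+.*error:' "$log_file" \
        | sed -E 's/[:(][0-9]+.*$//' \
        | sort -u
}

# Example usage (assumes the build is started with "cmake --build build"
# and that its output is captured in build.log):
#   cmake --build build 2>&1 | tee build.log
#   failed_files build.log
```

If the function prints nothing, the log contains no error lines in these formats. Linker failures use a different format and would need a separate check.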
What’s next?
Typically, the next step is to validate the installation. The link below covers not only a hello-world use case but also most common modern use cases.
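Before running a model, a quick sanity check is to confirm that the tools you plan to use were actually built. The sketch below is one way to do that; check_binaries is a hypothetical helper, and the tool names in the usage comment are taken from the directory listing above.

```shell
# Hypothetical helper: report which expected tools are present and executable
# in a build output directory. Returns non-zero if any tool is missing.
check_binaries() {
    bin_dir="$1"
    shift
    missing=0
    for tool in "$@"; do
        if [ -x "$bin_dir/$tool" ]; then
            echo "ok       $tool"
        else
            echo "MISSING  $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Example usage, run from the llama.cpp directory:
#   check_binaries build/bin llama-cli llama-server llama-bench
```

This only confirms that the files exist and are executable. It does not show that inference runs on the GPU, which is what the validation use cases cover.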
If you are interested in building and installing Llama.cpp for a different environment, check out the links below:


