
macOS x86_64 portable release doesn't include llama-cpp-binaries package #7238

@IonoclastBrigham


Describe the bug

The portable release for Intel Macs doesn't include the llama_cpp_binaries package. I logged a previous bug that the Mac Intel binaries weren't even built correctly for that package, so it's maybe not surprising to find they're not being properly pulled into the build here. It seems like a chain of compounding issues obscured things and led us to this point.
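
As a quick check of whether the package is actually bundled, the portable environment's own interpreter can try to resolve the module (a minimal sketch; the script name is hypothetical, and the portable_env path is taken from the layout shown in the reproduction below):

# check_binaries.py -- run with portable_env/bin/python3 (hypothetical helper script)
# Reports whether llama_cpp_binaries is importable and, if so, from where.
import importlib.util

spec = importlib.util.find_spec("llama_cpp_binaries")
if spec is None:
    print("llama_cpp_binaries is NOT present in this environment")
else:
    print(f"llama_cpp_binaries resolves to: {spec.origin}")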

Is there an existing issue for this?

  • I have searched the existing issues

Reproduction

Repro and Diagnosis Steps

$ cd ~/Downloads
$ wget https://github.com/oobabooga/text-generation-webui/releases/download/v3.13/textgen-portable-3.13-macos-x86_64.zip
$ unzip textgen-portable-3.13-macos-x86_64.zip
$ cd text-generation-webui-3.13/
$ ./start_macos.sh --model any-model.gguf
09:32:06-531241 INFO     Starting Text Generation Web UI
09:32:06-533982 INFO     Loading settings from "user_data/settings.yaml"
09:32:06-854936 INFO     Loading "any-model.gguf"
╭───────────────────────────────────────────────── Traceback (most recent call last) ─────────────────────────────────────────────────╮
│ /Users/USER/Code/text-generation-webui-3.13/server.py:312 in <module>                                                               │
│                                                                                                                                     │
│   311         # Load the model                                                                                                      │
│ ❱ 312         shared.model, shared.tokenizer = load_model(shared.model_name)                                                        │
│   313         if shared.args.lora:                                                                                                  │
│                                                                                                                                     │
│ /Users/USER/Code/text-generation-webui-3.13/modules/models.py:43 in load_model                                                      │
│                                                                                                                                     │
│    42     shared.args.loader = loader                                                                                               │
│ ❱  43     output = load_func_map[loader](model_name)                                                                                │
│    44     if type(output) is tuple:                                                                                                 │
│                                                                                                                                     │
│ /Users/USER/Code/text-generation-webui-3.13/modules/models.py:71 in llama_cpp_server_loader                                         │
│                                                                                                                                     │
│    70 def llama_cpp_server_loader(model_name):                                                                                      │
│ ❱  71     from modules.llama_cpp_server import LlamaServer                                                                          │
│    72                                                                                                                               │
│                                                                                                                                     │
│ /Users/USER/Code/text-generation-webui-3.13/modules/llama_cpp_server.py:13 in <module>                                              │
│                                                                                                                                     │
│    12                                                                                                                               │
│ ❱  13 import llama_cpp_binaries                                                                                                     │
│    14 import requests                                                                                                               │
╰─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
ModuleNotFoundError: No module named 'llama_cpp_binaries'
$ find . -iname '*binaries*' # nothing gets printed here
$ portable_env/bin/pip3 install https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.46.0/llama_cpp_binaries-0.46.0-py3-none-macosx_14_0_x86_64.whl
# lots of output ending in success...
$ find . -iname '*binaries*'
./portable_env/lib/python3.11/site-packages/llama_cpp_binaries
./portable_env/lib/python3.11/site-packages/llama_cpp_binaries-0.46.0.dist-info
# The model can be loaded at this point, but inference output is garbled, so something may still not be right.
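
As a follow-up sanity check on the manually installed wheel (a rough sketch; the assumption that native binaries live under a bin/ directory or ship as .so/.dylib files is mine, not confirmed from the package layout), the files recorded for the distribution can be listed and passed to the system `file` command to verify they were built for x86_64 rather than arm64:

# arch_check.py -- hypothetical helper; run with portable_env/bin/python3 after installing the wheel.
# Lists files recorded for the llama_cpp_binaries distribution and runs `file`
# on anything that looks like a native binary to report its architecture.
import subprocess
from importlib.metadata import files

for f in files("llama_cpp_binaries") or []:
    path = f.locate()
    if "bin" in f.parts or str(path).endswith((".so", ".dylib")):
        result = subprocess.run(["file", str(path)], capture_output=True, text=True)
        print(result.stdout.strip())

If `file` reports arm64 rather than x86_64 for the bundled llama.cpp server binary, that would be consistent with both the earlier build issue and the garbled inference output noted above.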

Screenshot

n/a

Logs

Included above

System Info

OS: macOS Sonoma 14.5
Computer: MacBook Pro (15-inch, 2018)
CPU: Intel Core i7
GPUs: Intel UHD Graphics 630 / Radeon Pro Vega 20
