This sample works best on a system with a GPU, but it can run without one if necessary. As we work on setup instructions for this app, we should include details for getting it working on systems with no GPU. These are the steps I have had to take so far, and I'll keep adding to this list until I can run all the sample use cases:
- Update the Ollama bootstrapping code so it does not use a GPU. In the AppHost project's Program.cs, change this code:
var chatCompletion = builder.AddOllama("chatcompletion").WithDataVolume();
to:
var chatCompletion = builder.AddOllama("chatcompletion", enableGpu: false).WithDataVolume();
- In the PythonInference project, change requirements.txt to use versions of Torch libraries without CUDA. Change it from:
--extra-index-url https://download.pytorch.org/whl/cu118
torch==2.3.1+cu118
torchaudio==2.3.1+cu118
torchvision==0.18.1+cu118
to:
torch==2.3.1
torchaudio==2.3.1
torchvision==0.18.1
- Change the PythonInference project so that it does not use CUDA when loading models. In routers/classifier.py, change this line from:
classifier = pipeline('zero-shot-classification', model='cross-encoder/nli-MiniLM2-L6-H768', device='cuda')
to:
classifier = pipeline('zero-shot-classification', model='cross-encoder/nli-MiniLM2-L6-H768')
- When running without a GPU, model responses are slower and the default timeout settings are too short. I had to update Extensions.cs in my ServiceDefaults project to increase the standard resilience timeouts. The following worked for me, but some systems may need different values. In Extensions.AddServiceDefaults, change the StandardResilienceHandler call from:
http.AddStandardResilienceHandler();
to:
http.AddStandardResilienceHandler(options =>
{
    options.AttemptTimeout = new HttpTimeoutStrategyOptions
    {
        Timeout = TimeSpan.FromMinutes(10)
    };
    options.TotalRequestTimeout = new HttpTimeoutStrategyOptions
    {
        Timeout = TimeSpan.FromMinutes(10)
    };
    options.CircuitBreaker.SamplingDuration = TimeSpan.FromMinutes(20);
});
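
As an alternative to hard-coding the device in routers/classifier.py, a small helper could pick the device at startup so the same code runs on systems with and without a GPU. This is only a minimal sketch and not code from the sample: `select_device` is a hypothetical name, and it assumes that PyTorch's `torch.cuda.is_available()` is a good enough check for this app.

```python
def select_device() -> str:
    """Return 'cuda' when PyTorch reports a usable GPU, otherwise 'cpu'."""
    try:
        import torch  # imported lazily so the helper still works if torch is missing
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


if __name__ == "__main__":
    # Hypothetical usage in routers/classifier.py:
    #   classifier = pipeline('zero-shot-classification',
    #                         model='cross-encoder/nli-MiniLM2-L6-H768',
    #                         device=select_device())
    print(f"Selected device: {select_device()}")
```

With something like this in place, the classifier line would not need to be edited by hand on each machine, although the requirements.txt and timeout changes above would still apply.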