
ollama run codellama:34b issue #4519

Open
Iliceth opened this issue May 19, 2024 · 11 comments
Labels
bug Something isn't working

Comments

@Iliceth

Iliceth commented May 19, 2024

What is the issue?

Every model I have tested with Ollama runs fine, but when trying ollama run codellama:34b, I get:

Error: llama runner process has terminated: signal: segmentation fault (core dumped)

I then tried the 13B version, and that works fine.

OS

Linux

GPU

Nvidia

CPU

Intel

Ollama version

0.1.37

@Iliceth Iliceth added the bug Something isn't working label May 19, 2024
@izzy84

izzy84 commented May 19, 2024

same here

@jmorganca
Member

So sorry about this, looking into why this is happening.

@jonz-secops

same

@dhiltgen dhiltgen self-assigned this May 22, 2024
@barrard

barrard commented May 23, 2024

I'm also getting this on 7b-python and 13b-python

@dhiltgen
Collaborator

Could you share your server log and your VRAM size? This is a larger model, so my suspicion is that we're off on our memory prediction.
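
If it helps, something along these lines should capture both, assuming a standard Linux install managed by systemd and an NVIDIA GPU (adjust for your setup):

journalctl -u ollama --no-pager > ollama_server.log
nvidia-smi --query-gpu=name,memory.total --format=csv

The first command dumps the Ollama server log to ollama_server.log; the second reports the GPU model and total VRAM.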

@9SMTM6

9SMTM6 commented May 23, 2024

Same here.

AMD CPU, in contrast to the OP.

16 GB VRAM (RTX 4070 Ti Super)

Log attached.
codellama_crashdump_shortend.log

@jonz-secops

To add some details:

AMD 5950X on an X570E board
32 GB system RAM

6900 XT
16 GB VRAM

The server log doesn't appear to have the events in it anymore, but I'll try to recreate the issue later today if I can.

@Iliceth
Author

Iliceth commented May 23, 2024

Here is my log.

ollama_logs.txt

I can run mixtral and dolphin-mixtral fine, and those are fairly big models too.

Anyhow, my current system is an i7-9700K + 32 GB RAM + RTX 3090 with 24 GB VRAM.

@izzy84

izzy84 commented May 23, 2024

i7-7700K with 32 GB RAM and 6 GPUs with 51 GB VRAM in total.

I can run all models without problems, even llama3:70b, which is already a big model. But I get the error with codellama:34b.

@FukangSun

FukangSun commented May 24, 2024

Same here. I use version 0.1.38 of the official Docker image.
E5-2670 + 96 GB RAM + 1 RTX 4090. I also get the error below when running codellama 34b:

ollama run codellama-34b-instruct:Q4_K_M
Error: llama runner process has terminated: signal: segmentation fault (core dumped)

Gentlemen, I found another way: downgrading to 0.1.32 lets me load codellama and other large models. I tested 0.1.33 through 0.1.39 and none of them worked.
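
Roughly how I pinned the older version (assuming the Docker image is tagged per release, and that the official install script honors an OLLAMA_VERSION variable):

docker pull ollama/ollama:0.1.32

or, for a native Linux install:

curl -fsSL https://ollama.com/install.sh | OLLAMA_VERSION=0.1.32 sh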

@AZ777xx

AZ777xx commented May 26, 2024

Same here. Phind-codellama from the repository works, but loading a GGUF doesn't.
