
Content

Hi all, just a post on what is being worked on:

For stable diffusion:

I'm working on improving the GUI overall, adding some more options and fixing bugs (like multiple LoRAs), and also making ControlNet work with the GUI.

Whisper:
Implementing WhisperX, which in some cases can improve the timestamps of the output. I'm also trying to add an option to capture audio from the computer's input for more of a real-time experience. Perhaps in the future I can also see if this can run on a phone.
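One way a "real-time" mode like this is usually built is to split the incoming audio stream into fixed-size, slightly overlapping chunks and feed each chunk to Whisper as it arrives (the overlap helps avoid cutting words at chunk boundaries). A minimal sketch of just the chunking step, with hypothetical sizes; audio capture and the actual transcription call are not shown:

```python
# Sketch of the chunking step for near-real-time transcription.
# Each yielded window would be handed to Whisper; the chunk/overlap
# sizes here are illustrative assumptions, not the app's real values.

def chunk_stream(samples, chunk_size, overlap):
    """Yield fixed-size windows over a sample buffer, stepping by
    chunk_size - overlap so adjacent windows share `overlap` samples."""
    step = chunk_size - overlap
    for start in range(0, max(len(samples) - overlap, 1), step):
        yield samples[start:start + chunk_size]

# Example: 10 samples, chunks of 4 with overlap of 2 -> windows step by 2.
chunks = list(chunk_stream(list(range(10)), chunk_size=4, overlap=2))
```

With real audio the chunk size would be chosen in seconds of samples (e.g. a few seconds per window) so each transcription call stays fast enough to keep up with the input.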

Language Model Interface: GPT-Neo / GPT-J / LLaMA?

Slowly working on an interface for language models (check attachments).
The main problem is the model itself. GPT-Neo / GPT-J are a little fun, but hardly useful; perhaps if the GUI can handle fine-tuning, it could become something better?

LLaMA is obviously the better choice, but I can't make the weights available, and even if I make the weights an external download, the good weights are still way too big. For now there are some options to handle the (not best) model with 16 GB of VRAM. Let's see if this improves in the next few days.
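To see why the good weights are too big, a back-of-envelope calculation helps: VRAM just for the weights is roughly parameter count times bytes per parameter. This sketch ignores activations, KV cache, and framework overhead (an assumption for illustration), using the published LLaMA sizes:

```python
# Rough VRAM needed just to hold LLaMA weights at different precisions.
# Estimate = parameter count * bytes per parameter; activations and
# other runtime overhead are ignored (assumption for illustration).

def weight_gb(params_billions, bits_per_param):
    return params_billions * 1e9 * bits_per_param / 8 / 1e9

for model, params in [("7B", 7), ("13B", 13), ("30B", 32.5), ("65B", 65.2)]:
    fp16 = weight_gb(params, 16)
    q4 = weight_gb(params, 4)
    print(f"LLaMA-{model}: ~{fp16:.0f} GB at fp16, ~{q4:.1f} GB at 4-bit")
```

At fp16, even 7B needs about 14 GB just for weights, which is why only the smallest model is workable on a 16 GB card, and why 4-bit quantization is so attractive.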

SD should be ready in a few more days. Whisper and the new language interface will take a little more time.

Files

Comments

Alien Anthony

Have you seen the alpaca "finetuned llama" model? Pretty amazing. Can't wait till people can start fine-tuning their own models that they could run at home

cool1

Is there any more info on the large language model GUI being worked on please? Is it Llama or based on it? And you mention about the good weights being too big. Is there a page where it lists the different file sizes of the different weights for the language model being worked on? The demo png attachment showed a chat window where it output one line replies. Will the UI also be able to generate large text responses and take large text inputs (eg. could you ask it to fix a function in some source code by pasting that and it giving suggestions about it and the fixed source code? or ask it to write a particular function?)?

DAINAPP

The model I'm focused on is indeed LLaMA. For now the best bet to make LLaMA work is the 4-bit cpp version: https://github.com/ggerganov/llama.cpp It uses LLaMA-7B and is under heavy development; it's the best bet to run on a consumer GPU for now. Text input/output is limited by the model: if I remember correctly, LLaMA has half the context size of GPT-3.5, so it can't take a lot of text, but at least it's an open-source option.
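For scale: LLaMA's context window is 2048 tokens versus 4096 for GPT-3.5, and that window must hold the prompt and the reply together. A quick back-of-envelope conversion to English words (the ~0.75 words-per-token ratio is a common rough estimate, not an exact figure):

```python
# Back-of-envelope: how much English text fits in a context window.
# 0.75 words per token is a rough rule of thumb (assumption).
WORDS_PER_TOKEN = 0.75

def approx_words(context_tokens):
    return int(context_tokens * WORDS_PER_TOKEN)

llama_words = approx_words(2048)   # LLaMA context window
gpt35_words = approx_words(4096)   # GPT-3.5 context window
```

So pasting a large source file plus asking for a fixed version would quickly exhaust LLaMA's window, which matches the limitation described above.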

DAINAPP

Yeah, I'm starting to experiment with the 4-bit cpp version to see if I can make a GUI out of it.