gradio_client Introduction

With the recent popularity of HuggingGPT and AutoGPT, using an LLM as a controller that calls various domain expert models has become a common pattern. A few days ago GDB (Greg Brockman), one of OpenAI's "three giants", gave a TED talk and once again showed off the ChatGPT Plugin system to the public... Unfortunately, the Quantum Bit article covering it clearly confuses AutoGPT with ChatGPT Plugins...

Calling models can be seen as one form of tool use, but the core feature of AutoGPT is the Auto part. Even Gradio joined in, introducing the gradio_client feature just last week.

Let's see what it can do~!

Before discussing gradio_client in detail, let's take a moment to review where developers stand today. With the popularity of ChatGPT and its extremely cheap pricing, the OpenAI API has become a de facto standard for developers; many even wrap open source LLMs in an OpenAI-API-compatible format. In fact, HuggingFace Inference Endpoints already give you functionality similar to the OpenAI API, and gradio_client goes further: it lets you call any Gradio program remotely, even one hosted outside the HuggingFace platform.

At first glance it looks like some kind of reflection magic... you just construct a Client and call its predict function? But then how is security handled? So it must be something at the engineering level...
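
To make this concrete, here is a minimal sketch of the calling pattern (the Space name and audio file below are illustrative placeholders taken from the Gradio docs, not something this post depends on):

from gradio_client import Client

# Connect to any hosted Gradio app by Space name or URL
client = Client('abidlabs/whisper')

# Each exposed endpoint becomes remotely callable through predict()
result = client.predict('audio_sample.wav', api_name='/predict')
print(result)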

Example#

The Gradio client works with any hosted Gradio app, whether it be an image generator, a text summarizer, a stateful chatbot, a tax calculator, or anything else! The Gradio Client is mostly used with apps hosted on Hugging Face Spaces, but your app can be hosted anywhere, such as your own server.

ChatGPT#

First, let's use ChatGPT to get familiar with a basic Gradio program. There are plenty of ChatGPT instances on HuggingFace... let's pick a few examples at random...

loveleaves2012/ChatGPT#

  • https://huggingface.co/spaces/loveleaves2012/ChatGPT
    This version adds a small pre-prompt for extra functionality, but its code is also the most outrageous: the OpenAI key is hardcoded in the source, even though HuggingFace provides environment variables (Space secrets) for exactly this purpose, as sketched below.
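
A minimal sketch of the environment-variable approach (assuming the key is stored as a Space secret named OPENAI_API_KEY, which is my naming assumption):

import os
import openai

# HuggingFace Spaces exposes secrets as environment variables,
# so the key never has to appear in the repository
openai.api_key = os.environ['OPENAI_API_KEY']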

ysharma/ChatGPTwithAPI#

anzorq/chatgpt-demo#

I think this version is the best of the three... because it also has a feature reminiscent of the Stable Diffusion web UI: installing plugins from gists... We host a version of it here.

ChatGLM#

Of course, although the examples above are built with Gradio, they all call the OpenAI service in the end, so they don't demonstrate the strength of gradio_client... Let's switch to a self-hosted LLM example: three ChatGLM demos implemented with Gradio. The first is Hongye's version, which has the most comprehensive controls, but it is a purely local version: users have to download the model and run it on their own machine, and it doesn't work on HuggingFace.

The second is the multimodalart version, from one of HF's most prolific staff members. It runs on HF's infrastructure, but users have to pay to host the model themselves:

def predict(input, history=None):
    if history is None:
        history = []
    # model and tokenizer are the ChatGLM-6B weights, loaded locally via
    # transformers (AutoModel/AutoTokenizer with trust_remote_code=True);
    # model.chat returns the reply together with the updated conversation history
    response, history = model.chat(tokenizer, input, history)
    return history, history

The third version is a modification of the second: the only change is to the predict function, which skips loading the model entirely and calls the hosted one remotely instead.

import json
from gradio_client import Client

def predict(input, history=None):
    if history is None:
        history = []

    # Call the hosted ChatGLM Space instead of running the model locally
    client = Client('https://multimodalart-chatglm-6b.hf.space/')
    # predict() returns the path of a file holding the JSON-encoded result
    with open(client.predict(input, fn_index=0)) as f:
        text = process_text(f.read())  # process_text: small cleanup helper defined elsewhere
        output = json.loads(text)[0]
        history += [output]
        return history, history

This example is currently running here: https://testing.agentswap.net/models/19/remi-test-app

Stable Diffusion 2.1#

This example is more complex. Although it is hosted on HuggingFace, it is not a traditional HuggingFace service; much like the ChatGPT examples above, it is a public good, maintained jointly by Google and StabilityAI.

But we can still get the predict method using gradio_client, just like the previous example...
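
A rough sketch of what that call might look like (the fn_index, prompt argument, and return format here are my assumptions; use client.view_api() to inspect the Space's real signature):

from gradio_client import Client

# The public Stable Diffusion 2.1 Space, addressed by its hf.space URL
client = Client('https://stabilityai-stable-diffusion.hf.space/')

# fn_index and the argument list are assumptions; check client.view_api()
result = client.predict('an astronaut riding a horse', fn_index=1)
print(result)  # typically a path to the generated image file(s)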

You can see the actual running result here.

Further Discussion#

gradio_client seems to fulfill my ideal of open source software: everyone for me, and I for everyone. I could even self-host a model locally, launch it with share=True, and then wrap it into an external service (such as a Telegram bot) through gradio_client... The possibilities are huge...
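
A minimal sketch of that self-host-and-share loop (the share URL is a placeholder printed by Gradio at launch, and my_local_model stands in for your own inference code):

import gradio as gr

def chat(message):
    return my_local_model(message)  # my_local_model: placeholder for your own inference code

# Launching with share=True prints a public URL like https://xxxx.gradio.live
demo = gr.Interface(fn=chat, inputs='text', outputs='text')
demo.launch(share=True)

Any other process, a Telegram bot included, could then reach it:

from gradio_client import Client

client = Client('https://xxxx.gradio.live')  # the share URL printed above (placeholder)
reply = client.predict('hello', fn_index=0)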

But thinking about it more carefully, there are problems here too, and the biggest victim may be the person who deploys the model, spending their own money to provide a public good for everyone. And if a model becomes very popular, the scalability of this architecture will quickly show its limits compared with proper cloud services. Of course, there is still too little public information: I don't yet understand how it is implemented, what rules an exported function has to satisfy, whether there are security risks, and so on...

Finally, there seem to be more examples here, and there is even gradio_tools built on top of gradio_client... Things are moving too fast...
