yokon


Midjourney | How to integrate into your own platform

Recently, I have been thinking about integrating Midjourney into my platform. However, since Midjourney does not offer a public API, I had to find an alternative. I decided to use Discord's Bot API to create drawing tasks for the Midjourney bot by sending the "/imagine" command. I also monitor messages from the Midjourney bot in real time and forward them back to my platform. This basically meets my needs.

In this article, I will introduce the Midjourney API project that I have implemented and provide a usage tutorial.

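The general process is as follows: my platform calls the API, which sends the "/imagine" command through Discord to the Midjourney bot; a bot listener then watches for the Midjourney bot's messages and sends them back to my platform through a callback.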

midjourney-api#

Open source repository: https://github.com/yokonsan/midjourney-api

This repository is a set of interfaces that I developed to integrate Midjourney into my platform. It has been used by me personally and meets most of my needs. Now I have open sourced it for reference. If you find it useful, you can give it a star. If you find any bugs, you can raise an issue or submit a pull request.

I won't go into the specific development details in this article, but I will explain how to use this repository.

Preparation#

Note: This repository relies on the Discord API for development. Please find your own way to access Discord. Here, it is assumed that you have already created your own server on Discord and added Midjourney to the server.

To use this repository, you need 4 parameters:

  1. User Token (required for API authentication)
  2. Discord bot Token (for real-time monitoring of Midjourney conversations)
  3. Discord self-built server ID
  4. Channel ID where the Midjourney bot is located

If you know how to obtain these parameters, you can skip to the next section.

User Token#

Log in to the Discord web client, press F12 to open the developer tools, and refresh the page. As in the screenshot below, click Network and enter /api/library in the filter field to find the matching request. In the request headers, find the authorization field; its value is the User Token we need.

Note: This Token is private information and should never be exposed in code published on GitHub.

[Screenshot: Network tab filtered by /api/library, showing the authorization field]

Bot Token#

Here, you need to create a Discord bot first. You can do this at https://discord.com/developers/applications

The creation process is simple and will not be described here.

[Screenshot: the bot's page in the Discord developer portal]

Click on Reset Token and then copy the generated Token.

We also need to add some Scopes to the bot so that it can perform its tasks.

[Screenshot: selecting Scopes for the bot]

After selecting the Scopes, an OAuth2 authorization link will be generated at the bottom of the page. Copy the link and open it in a browser.

After opening the link, an OAuth2 authorization page will appear. Add the bot to your server.

Server ID and Channel ID#

This is relatively simple. First, enable developer mode:

[Screenshot: enabling developer mode in Discord]

Then right-click the server icon and copy the Server ID. Channels work the same way: right-click the channel and copy the Channel ID.

Installation and Startup#

git clone https://github.com/yokonsan/midjourney-api
cd midjourney-api
pip install -r requirements.txt

Rename .env.template to .env and fill in the parameter values:

USER_TOKEN=User token
BOT_TOKEN=Bot token
GUILD_ID=Server ID
CHANNEL_ID=Channel ID
CALLBACK_URL=Callback URL, default is http post request
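
As a rough illustration of how these settings might be consumed, the sketch below reads them from an environment mapping and fails early when a required one is missing. The `Settings` class and `load_settings` helper are hypothetical names, not part of midjourney-api. The sketch also assumes that the values from .env have already been exported into the environment and that CALLBACK_URL is optional.

```python
import os
from dataclasses import dataclass
from typing import Mapping

# The four parameters the article lists as required.
REQUIRED_KEYS = ("USER_TOKEN", "BOT_TOKEN", "GUILD_ID", "CHANNEL_ID")


@dataclass(frozen=True)
class Settings:
    user_token: str
    bot_token: str
    guild_id: str
    channel_id: str
    callback_url: str = ""  # assumption: treated as optional in this sketch


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from an environment mapping, failing fast on gaps."""
    missing = [key for key in REQUIRED_KEYS if not env.get(key)]
    if missing:
        raise RuntimeError("missing settings: " + ", ".join(missing))
    return Settings(
        user_token=env["USER_TOKEN"],
        bot_token=env["BOT_TOKEN"],
        guild_id=env["GUILD_ID"],
        channel_id=env["CHANNEL_ID"],
        callback_url=env.get("CALLBACK_URL", ""),
    )
```

Reading the tokens from the environment rather than hard-coding them also keeps them out of any code that ends up in a public repository.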

Direct Startup#

# Start the bot listener
python task_bot.py
# Start the HTTP server
python server.py

Docker Startup#

# Build the image
sh build.sh
# Start the container
sh start.sh

After startup, open the API's Swagger documentation at http://127.0.0.1:8062/docs

midjourney-api provides the following interfaces:

  1. /v1/api/trigger/bot: Trigger a drawing task, already implemented
  2. /v1/api/upload: Upload an image and trigger a task, to be developed

Usage#

Currently, only the interface for triggering drawing tasks has been implemented. Let's use it as an example. This interface requires the following parameters:

from enum import Enum

from pydantic import BaseModel


class TriggerType(str, Enum):
    generate = "generate"  # Generate image based on prompt
    upscale = "upscale"  # Upscale image based on selected index
    variation = "variation"  # Generate variations based on index style
    reset = "reset"  # Redraw


class TriggerBotIn(BaseModel):
    type: TriggerType  # Trigger type
    prompt: str = ""  # Prompt
    msg_id: str = ""  # Message ID
    msg_hash: str = ""  # Message hash
    index: int = 0  # Image index, 1-4

generate#

Call the /v1/api/trigger/bot interface to generate an image with the prompt: "half fish half dragon hybrid, retro screencap --ar 2:3 --niji 5"

curl -X 'POST' \
  'http://127.0.0.1:8062/v1/api/trigger/bot' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
  "type": "generate",
  "prompt": "half fish half dragon hybrid, retro screencap --ar 2:3 --niji 5",
  "msg_id": "",
  "msg_hash": "",
  "index": 0
}'
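
For calling the interface from code rather than from the shell, here is a minimal Python sketch of the same request using only the standard library. `build_payload` and `trigger` are hypothetical helper names, and the client-side check on `index` is an assumption layered on the "1-4" comment in `TriggerBotIn`, not something the server is known to enforce.

```python
import json
from urllib import request

# Address taken from the curl example above.
API_URL = "http://127.0.0.1:8062/v1/api/trigger/bot"


def build_payload(type_: str, prompt: str = "", msg_id: str = "",
                  msg_hash: str = "", index: int = 0) -> dict:
    """Assemble a request body with the same fields as TriggerBotIn."""
    # Assumption: upscale and variation need an image index between 1 and 4.
    if type_ in ("upscale", "variation") and index not in (1, 2, 3, 4):
        raise ValueError("index must be 1-4 for upscale and variation")
    return {"type": type_, "prompt": prompt, "msg_id": msg_id,
            "msg_hash": msg_hash, "index": index}


def trigger(payload: dict, url: str = API_URL) -> str:
    """POST the payload as JSON and return the raw response text."""
    req = request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json",
                 "accept": "application/json"},
        method="POST",
    )
    with request.urlopen(req, timeout=30) as resp:
        return resp.read().decode("utf-8")
```

With the HTTP server running, `trigger(build_payload("generate", prompt="half fish half dragon hybrid, retro screencap --ar 2:3 --niji 5"))` should send the same request as the curl command.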

[Screenshot: result of the generate request]

You can see that the task_bot.py listener service I started has already received the message log:

[Screenshot: message log received by task_bot.py]
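
On the platform side, something has to receive what the listener forwards to CALLBACK_URL. The following is only a sketch under assumptions: it supposes the callback arrives as an HTTP POST with a JSON body carrying the `CallbackData` fields (type, id, content, attachments) listed in the upscale section. `parse_callback`, `CallbackHandler`, `serve`, and port 8000 are all made up for illustration.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def parse_callback(raw: bytes) -> dict:
    """Decode a callback body and keep the fields described by CallbackData."""
    data = json.loads(raw.decode("utf-8"))
    return {
        "type": data.get("type", ""),
        "id": data.get("id"),
        "content": data.get("content", ""),
        # Collect the image URLs from the attachments list, if any.
        "urls": [a["url"] for a in data.get("attachments", [])],
    }


class CallbackHandler(BaseHTTPRequestHandler):
    def do_POST(self) -> None:
        # Read exactly the number of bytes announced by the client.
        length = int(self.headers.get("Content-Length", 0))
        event = parse_callback(self.rfile.read(length))
        print(event["type"], event["id"], event["urls"])
        self.send_response(200)
        self.end_headers()


def serve(port: int = 8000) -> None:
    """Listen for callbacks; the port is an arbitrary choice for this sketch."""
    HTTPServer(("127.0.0.1", port), CallbackHandler).serve_forever()
```

Running `serve()` and setting CALLBACK_URL to http://127.0.0.1:8000 would then print one line per forwarded message, provided the assumed body format holds.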

upscale#

I think the second image is closer to what I expected, so let's call the interface again to upscale it and add more detail. For this we need the msg_id and msg_hash values. msg_id is the id field of CallbackData. msg_hash comes from the attachment's URL: take the file name, drop the file extension, split it on "_", and use the last segment.

from typing import List, TypedDict


class Attachment(TypedDict):
    id: int
    url: str
    proxy_url: str
    filename: str
    content_type: str
    width: int
    height: int
    size: int
    ephemeral: bool


class CallbackData(TypedDict):
    type: str
    id: int
    content: str
    attachments: List[Attachment]

With both values in hand, call the interface:

curl -X 'POST' \
  'http://127.0.0.1:8062/v1/api/trigger/bot' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
  "type": "upscale",
  "prompt": "",
  "msg_id": "1109686524045443093",
  "msg_hash": "c937b5aa-3f58-4ae5-8dd6-932952243034",
  "index": 2
}'

Note: The index here is the index of the image, ranging from 1 to 4, not 0 to 3.
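
The derivation of msg_id and msg_hash described above can be sketched in Python as follows. `extract_msg_hash` and `upscale_args` are hypothetical helpers, and the sketch assumes the first attachment is the image grid to act on.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def extract_msg_hash(attachment_url: str) -> str:
    """Return the last "_"-separated part of the file name, minus its extension."""
    path = urlparse(attachment_url).path   # drops any query string
    stem = PurePosixPath(path).stem        # file name without ".png" etc.
    return stem.rsplit("_", 1)[-1]


def upscale_args(callback: dict, index: int) -> dict:
    """Build the upscale request fields from a CallbackData-shaped dict."""
    return {
        "type": "upscale",
        "prompt": "",
        # The request model expects msg_id as a string, while CallbackData.id is an int.
        "msg_id": str(callback["id"]),
        # Assumption: the first attachment holds the generated image.
        "msg_hash": extract_msg_hash(callback["attachments"][0]["url"]),
        "index": index,
    }
```

For a made-up URL whose file name ends in `_c937b5aa-3f58-4ae5-8dd6-932952243034.png`, `extract_msg_hash` returns `c937b5aa-3f58-4ae5-8dd6-932952243034`, the value used in the request above.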

[Screenshot: result of the upscale request]

variation#

curl -X 'POST' \
  'http://127.0.0.1:8062/v1/api/trigger/bot' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
  "type": "variation",
  "prompt": "",
  "msg_id": "1109686524045443093",
  "msg_hash": "c937b5aa-3f58-4ae5-8dd6-932952243034",
  "index": 2
}'

Here, we generate 4 more images based on the style of the 2nd image.

[Screenshot: result of the variation request]

reset#

Redraw based on the prompt:

curl -X 'POST' \
  'http://127.0.0.1:8062/v1/api/trigger/bot' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
  "type": "reset",
  "prompt": "",
  "msg_id": "1109686524045443093",
  "msg_hash": "c937b5aa-3f58-4ae5-8dd6-932952243034",
  "index": 0
}'

[Screenshot: result of the reset request]

Conclusion#

This article introduced my open source project midjourney-api and showed how to use it to integrate Midjourney into your own platform.

Based on this repository, you can easily integrate Midjourney into platforms such as QQ, WeChat, and DingTalk. If you are interested, I can walk through creating a WeChat bot or connecting it to a personal official account.

Of course, this repository still has plenty of room for improvement, such as generating from an uploaded image (the /v1/api/upload interface is still to be developed). I will continue to work on these features and welcome everyone to contribute. Open source repository: https://github.com/yokonsan/midjourney-api

Compared to Stable-Diffusion, Midjourney is more beginner-friendly. Feel free to try it out.
