[Latest 2025 Edition] Turning the Open-Source Lip-Sync Engine SadTalker into an API and Calling It from an App [1]

Category [Python, Debian]

Posted: 2025-07-08

Running the Open-Source Lip-Sync Engine SadTalker on Debian

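Last time I got as far as a demo through the Gradio WebUI, but for practical use I want to turn this into an API that apps on iOS, Android, and so on can call. I start with the server side: exposing SadTalker as an API.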

The goal is a REST API that receives WAV audio data and a JPEG/PNG image via POST, combines them, and returns an MP4 video.

The quickest route looks like adapting inference.py, the command-line entry point of SadTalker itself.

https://github.com/OpenTalker/SadTalker/blob/main/inference.py

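I cut out the part that actually hands the conversion work from stage to stage and turn it into a module, named for example sadtalker_wrapper.py. The source follows.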

# sadtalker_wrapper.py

import os
import shutil
import torch
from time import strftime
from src.utils.preprocess import CropAndExtract
from src.test_audio2coeff import Audio2Coeff
from src.facerender.animate import AnimateFromCoeff
from src.generate_batch import get_data
from src.generate_facerender_batch import get_facerender_data
from src.utils.init_path import init_path

def generate_talking_video(source_image, driven_audio, checkpoint_dir, result_dir, device='cpu', size=256, pose_style=0):
    save_dir = os.path.join(result_dir, strftime("%Y_%m_%d_%H.%M.%S"))
    os.makedirs(save_dir, exist_ok=True)

    # Initialize model paths
    sadtalker_paths = init_path(checkpoint_dir, os.path.join(os.path.dirname(__file__), 'src/config'), size, False, 'crop')

    preprocess_model = CropAndExtract(sadtalker_paths, device)
    audio_to_coeff = Audio2Coeff(sadtalker_paths, device)
    animate_from_coeff = AnimateFromCoeff(sadtalker_paths, device)

    # Image -> 3DMM coefficients
    first_frame_dir = os.path.join(save_dir, 'first_frame')
    os.makedirs(first_frame_dir, exist_ok=True)
    first_coeff_path, crop_pic_path, crop_info = preprocess_model.generate(source_image, first_frame_dir, 'crop', source_image_flag=True, pic_size=size)
    if first_coeff_path is None:
        raise RuntimeError("Failed to extract coefficients from image.")

    # Audio -> coefficients
    batch = get_data(first_coeff_path, driven_audio, device, None, still=False)
    coeff_path = audio_to_coeff.generate(batch, save_dir, pose_style, ref_pose_coeff_path=None)

    # Generate the animation
    data = get_facerender_data(coeff_path, crop_pic_path, first_coeff_path, driven_audio, batch_size=2,
                               input_yaw_list=None, input_pitch_list=None, input_roll_list=None,
                               expression_scale=1.0, still_mode=False, preprocess='crop', size=size)

    result_temp_path = animate_from_coeff.generate(data, save_dir, source_image, crop_info,
                                                   enhancer=None, background_enhancer=None, preprocess='crop', img_size=size)

    # Fix the output file name and move the result there
    final_path = os.path.join(save_dir, "output.mp4")
    shutil.move(result_temp_path, final_path)

    return final_path

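Next I write main.py, the main code. It uses FastAPI to receive the image and audio from outside as multipart/form-data file uploads (the UploadFile parameters, not a JSON body), passes them to sadtalker_wrapper, and returns the finished video; only errors come back as JSON. FastAPI needs the python-multipart package installed to accept file uploads.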

# main.py

from fastapi import FastAPI, UploadFile, File
from fastapi.responses import FileResponse, JSONResponse
import os
from datetime import datetime
import torch

from sadtalker_wrapper import generate_talking_video

app = FastAPI(root_path="/sadtalker_api")  # serve under a sub-path

@app.post("/generate")
async def generate(source_image: UploadFile = File(...), driven_audio: UploadFile = File(...)):
    try:
        # Save the uploads under ./working_dir/YYYYMMDD_HHMMSS
        base_dir = os.path.join(os.getcwd(), "working_dir")
        os.makedirs(base_dir, exist_ok=True)
        session_dir = os.path.join(base_dir, datetime.now().strftime("%Y%m%d_%H%M%S"))
        os.makedirs(session_dir, exist_ok=True)

        image_path = os.path.join(session_dir, "input.png")
        audio_path = os.path.join(session_dir, "input.wav")

        with open(image_path, "wb") as f:
            f.write(await source_image.read())
        with open(audio_path, "wb") as f:
            f.write(await driven_audio.read())

        # Call SadTalker
        result_path = generate_talking_video(
            source_image=image_path,
            driven_audio=audio_path,
            checkpoint_dir="./checkpoints",
            result_dir=session_dir,
            device="cuda" if torch.cuda.is_available() else "cpu"
        )

        return FileResponse(result_path, media_type="video/mp4", filename="talking_face.mp4")

    except Exception as e:
        return JSONResponse(status_code=500, content={"error": str(e)})
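
Two details of main.py may deserve attention in real use: the session directory name has one-second resolution, so two requests arriving in the same second would share a directory, and every image is saved as input.png even when a JPEG was uploaded. The following is a minimal sketch of one way to handle both with the standard library. The helper names sniff_extension, make_session_dir, and save_upload are hypothetical and not part of FastAPI or SadTalker.

```python
import os
import tempfile
from datetime import datetime


def sniff_extension(data: bytes) -> str:
    """Guess a file extension from the leading magic bytes of an upload."""
    if data.startswith(b"\x89PNG\r\n\x1a\n"):
        return ".png"
    if data.startswith(b"\xff\xd8\xff"):
        return ".jpg"
    if data[:4] == b"RIFF" and data[8:12] == b"WAVE":
        return ".wav"
    raise ValueError("unsupported upload: expected PNG, JPEG or WAV data")


def make_session_dir(base_dir: str) -> str:
    """Create a unique per-request directory under base_dir."""
    os.makedirs(base_dir, exist_ok=True)
    prefix = datetime.now().strftime("%Y%m%d_%H%M%S_")
    # mkdtemp appends a random suffix, so two requests in the same second
    # do not end up sharing a directory.
    return tempfile.mkdtemp(prefix=prefix, dir=base_dir)


def save_upload(session_dir: str, stem: str, data: bytes) -> str:
    """Write the upload as <stem><ext> and return the resulting path."""
    path = os.path.join(session_dir, stem + sniff_extension(data))
    with open(path, "wb") as f:
        f.write(data)
    return path


# Demonstration with a throwaway base directory and a fake JPEG header.
base = tempfile.mkdtemp()
session = make_session_dir(base)
path = save_upload(session, "input", b"\xff\xd8\xff\xe0" + b"\x00" * 16)
print(os.path.basename(path))  # prints: input.jpg
```

In the endpoint, the bytes returned by `await source_image.read()` and `await driven_audio.read()` would be passed to save_upload, and the ValueError could be mapped to a 4xx response instead of the generic 500.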

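Start uvicorn inside the virtual environment created last time. It listens on local port 5000, but I set the root path at startup so that from outside it can be called at a URL such as
https://test.hogeserver.jp/sadtalker_api/generate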

source ./venv/bin/activate

uvicorn main:app --host 127.0.0.1 --port 5000 --root-path /sadtalker_api

INFO:     Started server process [1878498]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
INFO:     Uvicorn running on http://127.0.0.1:5000 (Press CTRL+C to quit)
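
Before putting a proxy in front, it can be worth confirming that uvicorn itself answers on the loopback address. The sketch below is one way to do that with the standard library; probe_status is a hypothetical helper name. Because /generate is registered for POST only, a GET is expected to come back as 405 Method Not Allowed rather than 404.

```python
import urllib.error
import urllib.request


def probe_status(url: str, timeout: float = 5.0) -> int:
    """Send a GET request to url and return the HTTP status code."""
    # An empty ProxyHandler disables any proxy picked up from the
    # environment, which is what we want for a loopback check.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx, but the status code is still available.
        return err.code
```

With the server from the log above running, `probe_status("http://127.0.0.1:5000/generate")` should return 405. A connection error (urllib.error.URLError) instead would suggest uvicorn is not listening on that port.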

If this server is test.hogeserver.jp and is already running Apache with SSL, configure a reverse proxy to bridge uvicorn to the outside.

<Location /sadtalker_api>
    Require all granted
    ProxyPass http://127.0.0.1:5000
    ProxyPassReverse http://127.0.0.1:5000
    RequestHeader set X-Forwarded-Proto "https"
</Location>

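With this configuration in place, open
https://test.hogeserver.jp/sadtalker_api/generate
in a browser from outside. If the uvicorn log shows a line like

INFO:     xxx.yy.zz.aa:0 - "GET /sadtalker_api/generate HTTP/1.1" 405 Method Not Allowed

then the request is getting through and the modified Python module is responding. The 405 is expected here: a browser sends GET, while /generate only accepts POST. What remains is POSTing the data from the client app, which I cover next time.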
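
For a quick end-to-end test without a mobile app, a small Python client can stand in. The following is a sketch using only the standard library, not the app-side code. The field names source_image and driven_audio come from main.py, while build_multipart, post_generate, the file names, and the 600-second timeout are assumptions made for illustration.

```python
import urllib.request
import uuid


def build_multipart(files):
    """Encode {field: (filename, content_type, data)} as multipart/form-data.

    Returns a tuple (body_bytes, content_type_header).
    """
    boundary = uuid.uuid4().hex
    parts = []
    for field, (filename, content_type, data) in files.items():
        header = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{field}"; '
            f'filename="{filename}"\r\n'
            f"Content-Type: {content_type}\r\n\r\n"
        ).encode("utf-8")
        parts.append(header + data + b"\r\n")
    body = b"".join(parts) + f"--{boundary}--\r\n".encode("utf-8")
    return body, f"multipart/form-data; boundary={boundary}"


def post_generate(url, image_path, audio_path, out_path="talking_face.mp4"):
    """POST an image and a WAV file to /generate and save the returned MP4."""
    with open(image_path, "rb") as f:
        image = f.read()
    with open(audio_path, "rb") as f:
        audio = f.read()
    body, content_type = build_multipart({
        "source_image": ("input.png", "image/png", image),
        "driven_audio": ("input.wav", "audio/wav", audio),
    })
    req = urllib.request.Request(
        url, data=body, method="POST",
        headers={"Content-Type": content_type},
    )
    # Assumed timeout: generation may take a while, especially on CPU.
    with urllib.request.urlopen(req, timeout=600) as resp:
        video = resp.read()
    with open(out_path, "wb") as out:
        out.write(video)
    return out_path
```

A call such as `post_generate("https://test.hogeserver.jp/sadtalker_api/generate", "face.png", "voice.wav")` would write talking_face.mp4 on success. If the server answers with its 500 JSON error, urlopen raises urllib.error.HTTPError, and the message can be read from the exception body.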

