Integrating OpenAI's Whisper API for speech recognition: a hand-holding, step-by-step tutorial (the ChatGPT API)

智增增 API, updated 1 year ago (2024), zhizengzeng

Rumor has it this thing is already the strongest speech recognizer on the planet??

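Someone wrote: “Before Whisper, in English speech recognition, if Google called itself second, nobody dared to claim first. Admittedly, I later found that Amazon's English speech recognition is also very accurate, roughly on par with Google's.
In Chinese (Mandarin), iFlytek (讯飞) also holds its own: the iFlytek voice input method handles mixed Chinese-English speech and dialects very well.
But once Whisper appeared, or more precisely once OpenAI released the Whisper API, it knocked all the old monkey kings of Chinese and English speech recognition to the ground. English, Chinese, mixed Chinese-English, even Japanese (you know why) and Turkish: its accuracy is far higher than Google's and iFlytek's.
The language-service companies I know have all quietly moved the money they used to pay iFlytek and Google over to OpenAI.” I can't say whether that is true.
Let's go straight to the code: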

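1. First, get an OpenAI key and URL.

Project GitHub repository: https://github.com/xing61/xiaoyi-robot

Step 1: Log in to 智增增 (zhizengzeng) with your phone number, then copy your key and URL. Address: https://gpt.zhizengzeng.com/#/login

Step 2: Write the code. Note that the base_url to configure is: https://flag.smarttrot.com/v1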
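
Rather than pasting the key into the script, one option is to read both values from environment variables. The sketch below is only an illustration under assumptions: the variable names `ZZZ_API_KEY` and `ZZZ_BASE_URL` and the helper `load_config` are made up for this example and are not defined by 智增增 or by the openai package. The default base URL is the one given in Step 2.

```python
import os

# base_url from Step 2 of this tutorial
DEFAULT_BASE_URL = "https://flag.smarttrot.com/v1"


def load_config(env=None):
    """Return (api_key, base_url) read from environment variables.

    ZZZ_API_KEY and ZZZ_BASE_URL are hypothetical names chosen for this sketch.
    """
    env = os.environ if env is None else env
    api_key = env.get("ZZZ_API_KEY", "").strip()
    if not api_key:
        raise RuntimeError("ZZZ_API_KEY is not set")
    # Drop a trailing slash so the URL matches the form used in the tutorial.
    base_url = env.get("ZZZ_BASE_URL", DEFAULT_BASE_URL).rstrip("/")
    return api_key, base_url
```

The returned pair could then be used wherever the key and base_url are assigned in the script.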

2. Start writing the Python code (other languages are similar):

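import json
import time

# Written against the pre-1.0 openai SDK (openai.Audio, openai.api_base).
import openai

API_SECRET_KEY = "the api_key you obtained from 智增增"
BASE_URL = "https://flag.smarttrot.com/v1"  # 智增增 base_url


# audio_transcriptions: send an audio file to whisper-1 and print the JSON result
def audio_transcriptions(file_name):
    openai.api_key = API_SECRET_KEY
    openai.api_base = BASE_URL
    # Open in binary mode; the with-block closes the file after the request.
    with open(file_name, "rb") as audio_file:
        resp = openai.Audio.transcribe("whisper-1", audio_file)
    # ensure_ascii=False keeps Chinese text readable in the output.
    json_str = json.dumps(resp, ensure_ascii=False)
    print(json_str)


if __name__ == '__main__':
    start = time.time()
    # The original called translation("1.wav"), which is never defined here.
    audio_transcriptions("1.wav")
    end = time.time()
    print('Elapsed time (s): ', end - start)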
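
The script prints the whole response as a JSON string. With the default JSON response format, the transcript is carried in a `text` field, so pulling it out might look like the sketch below. The helper name `extract_text` and the sample string are made up for illustration and are not real API output.

```python
import json


def extract_text(json_str):
    """Return the transcript from a Whisper-style JSON response string."""
    data = json.loads(json_str)
    text = data.get("text") if isinstance(data, dict) else None
    if not isinstance(text, str):
        raise ValueError("response has no 'text' field")
    return text.strip()


if __name__ == "__main__":
    # Hypothetical sample, shaped like the JSON the script prints.
    sample = '{"text": " 你好，世界 "}'
    print(extract_text(sample))
```

One way to use this would be to have `audio_transcriptions` return the JSON string in addition to printing it, then pass that value to `extract_text`.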