通过麦克风识别语音输入
Web Speech API 可以实现语音转文字的功能,目前大部分新款浏览器都可以支持。一个简单示例:
// 客户端
// 判断浏览器是否支持 Web Speech API
if ('SpeechRecognition' in window || 'webkitSpeechRecognition' in window) {
// 创建 SpeechRecognition 对象
const recognition = new (window.SpeechRecognition || window.webkitSpeechRecognition)();
recognition.lang = 'zh-CN'; // 设置语言为中文
// 监听语音识别结果
recognition.addEventListener('result', event => {
const transcript = event.results[0][0].transcript;
console.log(transcript); // 打印识别结果
});
// 开始语音识别
recognition.start();
} else {
console.log('Web Speech API is not supported');
}
当然我们为了更好地支持语音识别的效果以及兼容不同的浏览器,更推荐使用微软的 speech SDK
我们需要预先获取到token,需要在服务端获取校验信息
export const speechCheck = async (): Promise<IAzureSpeechCheckResult> => {
if (!azureSpeechKey || !azureSpeechregion) {
return {
status: false,
error: {
message: 'auth failed',
},
}
}
const params = {
method: 'POST',
headers: {
'Ocp-Apim-Subscription-Key': azureSpeechKey,
'Content-Type': 'application/x-www-form-urlencoded',
},
body: null,
}
try {
const tokenResponse: any = await fetch(
`https://${azureSpeechregion}.api.cognitive.microsoft.com/sts/v1.0/issueToken`,
params
)
const tokenResponseResult = await tokenResponse.text()
if (tokenResponseResult) {
return {
status: true,
token: tokenResponseResult,
region: azureSpeechregion,
// tokenResponse
}
}
return {
status: false,
error: {
message: 'There was an error authorizing your speech key. no tokenResponse data',
},
}
} catch (err) {
console.log(`AzureSpeechCheck`, { err })
return {
status: false,
error: {
message: 'There was an error authorizing your speech key.',
},
}
}
}
在客户端使用 speech SDK实现实时监听并且获取语音转换之后的文字
// 客户端
const sttFromMic = async (
stateRecognizer: any,
speechToken: SpeechToken,
recording: (arg: any) => void,
callback?: (arg?: any) => void
) => {
const speechConfig = speechsdk.SpeechConfig.fromAuthorizationToken(speechToken.authToken, speechToken.region)
speechConfig.speechRecognitionLanguage = 'zh-CN'
let recognizer
if (!stateRecognizer) {
const audioConfig = speechsdk.AudioConfig.fromDefaultMicrophoneInput()
recognizer = new speechsdk.SpeechRecognizer(speechConfig, audioConfig)
} else {
recognizer = stateRecognizer
recognizer.startContinuousRecognitionAsync()
callback && callback()
return
}
recognizer.recognizing = function (s: any, e: AnyObj) {
recording(e?.result)
}
recognizer.startContinuousRecognitionAsync()
if (callback) {
callback(recognizer)
}
}
最后更新于