GoogleCloudDialogflowV2beta1StreamingRecognitionResult
import type { GoogleCloudDialogflowV2beta1StreamingRecognitionResult } from "https://googleapis.deno.dev/v1/dialogflow:v3.ts";
Contains a speech recognition result corresponding to a portion of the audio
that is currently being processed or an indication that this is the end of
the single requested utterance. While end-user audio is being processed,
Dialogflow sends a series of results. Each result may contain a transcript
value. A transcript represents a portion of the utterance. While the
recognizer is processing audio, transcript values may be interim values or
finalized values. Once a transcript is finalized, the `is_final` value is set
to true and processing continues for the next transcript. If
`StreamingDetectIntentRequest.query_input.audio_config.single_utterance` was
true, and the recognizer has completed processing audio, the `message_type`
value is set to `END_OF_SINGLE_UTTERANCE` and the following (last) result
contains the last finalized transcript. The complete end-user utterance is
determined by concatenating the finalized transcript values received for the
series of results. In the following example, single utterance is enabled. In
the case where single utterance is not enabled, result 7 would not occur.
```
Num | transcript              | message_type            | is_final
--- | ----------------------- | ----------------------- | --------
1   | "tube"                  | TRANSCRIPT              | false
2   | "to be a"               | TRANSCRIPT              | false
3   | "to be"                 | TRANSCRIPT              | false
4   | "to be or not to be"    | TRANSCRIPT              | true
5   | "that's"                | TRANSCRIPT              | false
6   | "that is"               | TRANSCRIPT              | false
7   | unset                   | END_OF_SINGLE_UTTERANCE | unset
8   | " that is the question" | TRANSCRIPT              | true
```
Concatenating the finalized transcripts with `is_final` set to true, the
complete utterance becomes "to be or not to be that is the question".
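The concatenation rule described above can be sketched in a few lines. This is a minimal illustration, not part of the generated module: the local `RecognitionResult` interface and the `completeUtterance` helper are hypothetical, and the camelCase field names (`transcript`, `messageType`, `isFinal`) are assumed to mirror the generated TypeScript type.

```typescript
// Local stand-in for the fields used here (assumed names, not the real import).
interface RecognitionResult {
  transcript?: string;
  messageType?: string;
  isFinal?: boolean;
}

// Join the transcripts of finalized TRANSCRIPT results in arrival order.
// Interim results and END_OF_SINGLE_UTTERANCE markers are skipped.
function completeUtterance(results: RecognitionResult[]): string {
  return results
    .filter((r) => r.messageType === "TRANSCRIPT" && r.isFinal === true)
    .map((r) => r.transcript ?? "")
    .join("");
}

// The eight results from the example table.
const results: RecognitionResult[] = [
  { transcript: "tube", messageType: "TRANSCRIPT", isFinal: false },
  { transcript: "to be a", messageType: "TRANSCRIPT", isFinal: false },
  { transcript: "to be", messageType: "TRANSCRIPT", isFinal: false },
  { transcript: "to be or not to be", messageType: "TRANSCRIPT", isFinal: true },
  { transcript: "that's", messageType: "TRANSCRIPT", isFinal: false },
  { transcript: "that is", messageType: "TRANSCRIPT", isFinal: false },
  { messageType: "END_OF_SINGLE_UTTERANCE" },
  { transcript: " that is the question", messageType: "TRANSCRIPT", isFinal: true },
];

console.log(completeUtterance(results));
// prints "to be or not to be that is the question"
```

The finalized transcripts are joined with an empty separator because, as in result 8, a finalized transcript may already carry its own leading space.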
Properties
The Speech confidence between 0.0 and 1.0 for the current portion of
audio. A higher number indicates an estimated greater likelihood that the
recognized words are correct. The default of 0.0 is a sentinel value
indicating that confidence was not set. This field is typically only
provided if `is_final` is true and you should not rely on it being accurate
or even set.
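Because 0.0 is a sentinel rather than a real score, callers may want to normalize it to "unknown" before using it. The helper below is a hypothetical sketch; `knownConfidence` is not part of the module, and the camelCase `confidence` field name is assumed to match the generated type.

```typescript
// Return the confidence only when it was actually set.
// A missing value and the 0.0 sentinel are both reported as undefined.
function knownConfidence(result: { confidence?: number }): number | undefined {
  const c = result.confidence;
  return c !== undefined && c > 0 ? c : undefined;
}

console.log(knownConfidence({ confidence: 0.87 })); // prints 0.87
console.log(knownConfidence({ confidence: 0 })); // prints undefined
console.log(knownConfidence({})); // prints undefined
```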
DTMF digits. Populated if and only if `message_type` = `DTMF_DIGITS`.
If `false`, the `StreamingRecognitionResult` represents an interim result
that may change. If `true`, the recognizer will not return any further
hypotheses about this piece of the audio. May only be populated for
`message_type` = `TRANSCRIPT`.
Type of the result message.
Time offset of the end of this Speech recognition result relative to the
beginning of the audio. Only populated for `message_type` = `TRANSCRIPT`.
Word-specific information for the words recognized by Speech in
transcript. Populated if and only if `message_type` = `TRANSCRIPT` and
[InputAudioConfig.enable_word_info] is set.
An estimate of the likelihood that the speech recognizer will not change
its guess about this interim recognition result:
* If the value is unspecified or 0.0, Dialogflow didn't compute the
  stability. In particular, Dialogflow will only provide stability for
  `TRANSCRIPT` results with `is_final = false`.
* Otherwise, the value is in (0.0, 1.0] where 0.0 means completely unstable
  and 1.0 means completely stable.
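One way to use stability is to show an interim transcript only once it looks unlikely to change. The sketch below is illustrative only: the `isStableInterim` helper, the local `RecognitionResult` interface, and the 0.8 default threshold are all assumptions, and the camelCase field names are assumed to mirror the generated type.

```typescript
// Local stand-in for the fields used here (assumed names, not the real import).
interface RecognitionResult {
  transcript?: string;
  messageType?: string;
  isFinal?: boolean;
  stability?: number;
}

// True when an interim TRANSCRIPT result has a computed stability at or
// above the threshold. A missing or 0.0 stability counts as "not computed".
function isStableInterim(result: RecognitionResult, threshold = 0.8): boolean {
  if (result.messageType !== "TRANSCRIPT" || result.isFinal === true) {
    return false;
  }
  const stability = result.stability ?? 0;
  return stability > 0 && stability >= threshold;
}

const interim: RecognitionResult = {
  transcript: "to be",
  messageType: "TRANSCRIPT",
  isFinal: false,
  stability: 0.9,
};

console.log(isStableInterim(interim)); // prints true
console.log(isStableInterim({ ...interim, stability: 0 })); // prints false
```

Finalized results are excluded because stability is only provided for interim results; they can be handled through the `is_final` flag instead.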