Package 

Class Recognizer

  • All Implemented Interfaces:
    com.sun.jna.NativeMapped , java.lang.AutoCloseable

    
    public class Recognizer
    extends PointerType implements AutoCloseable
                        
    • Nested Class Summary

      Nested Classes 
      Modifier and Type Class Description
      public class Recognizer.EndpointerMode

      Endpointer delay mode

    • Constructor Summary

      Constructors 
      Constructor Description
      Recognizer(Model model, float sampleRate) Creates the recognizer object.
      Recognizer(Model model, float sampleRate, SpeakerModel spkModel) Creates the recognizer object with speaker recognition.
      Recognizer(Model model, float sampleRate, String grammar) Creates the recognizer object with the phrase list. When you want to improve recognition accuracy and don't need to recognize a large vocabulary, you can specify a list of phrases to recognize.
    • Method Summary

      Modifier and Type Method Description
      void setMaxAlternatives(int maxAlternatives) Configures recognizer to output n-best results.
      void setWords(boolean words) Enables words with times in the output
        "result" : [{
            "conf" : 1.000000,
            "end" : 1.110000,
            "start" : 0.870000,
            "word" : "what"
          }, {
            "conf" : 1.000000,
            "end" : 1.530000,
            "start" : 1.110000,
            "word" : "zero"
          }, {
            "conf" : 1.000000,
            "end" : 1.950000,
            "start" : 1.530000,
            "word" : "zero"
          }, {
            "conf" : 1.000000,
            "end" : 2.340000,
            "start" : 1.950000,
            "word" : "zero"
          }, {
            "conf" : 1.000000,
            "end" : 2.610000,
            "start" : 2.340000,
            "word" : "one"
          }],
      
      void setPartialWords(boolean partial_words) Like setWords, but returns words and confidences in partial results.
      void setSpeakerModel(SpeakerModel spkModel) Adds a speaker model to an already initialized recognizer. This helps to initialize speaker recognition for a grammar-based recognizer.
      boolean acceptWaveForm(Array<byte> data, int len) Accept and process new chunk of voice data.
      boolean acceptWaveForm(Array<short> data, int len)
      boolean acceptWaveForm(Array<float> data, int len)
      String getResult() Returns speech recognition result
      String getPartialResult() Returns partial speech recognition.
      String getFinalResult() Returns speech recognition result.
      void setGrammar(String grammar) Reconfigures recognizer to use grammar.
      void reset() Resets the recognizer. Resets current results so the recognition can continue from scratch.
      void setEndpointerMode(int mode) Configures endpointer mode for recognizer
      void setEndpointerDelays(float t_start_max, float t_end, float t_max) Set endpointer delays
      void close() Releases the recognizer object. The underlying model is also unreferenced and, if needed, released.
      • Methods inherited from class com.sun.jna.PointerType

        equals, fromNative, getPointer, hashCode, nativeType, setPointer, toNative, toString
      • Methods inherited from interface java.lang.AutoCloseable

        close
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Constructor Detail

      • Recognizer

        Recognizer(Model model, float sampleRate)
        Creates the recognizer object.
        Parameters:
        model - VoskModel containing static data for recognizer.
        sampleRate - The sample rate of the audio you are going to feed into the recognizer. Make sure this rate matches the audio content; a mismatch is a common cause of accuracy problems.
      • Recognizer

        Recognizer(Model model, float sampleRate, SpeakerModel spkModel)
        Creates the recognizer object with speaker recognition.
        Parameters:
        model - VoskModel containing static data for recognizer.
        sampleRate - The sample rate of the audio you are going to feed into the recognizer. Make sure this rate matches the audio content; a mismatch is a common cause of accuracy problems.
        spkModel - speaker model for speaker identification
      • Recognizer

        Recognizer(Model model, float sampleRate, String grammar)
        Creates the recognizer object with the phrase list. When you want to improve recognition accuracy and don't need to recognize a large vocabulary, you can specify a list of phrases to recognize.
        Parameters:
        model - VoskModel containing static data for recognizer.
        sampleRate - The sample rate of the audio you are going to feed into the recognizer. Make sure this rate matches the audio content; a mismatch is a common cause of accuracy problems.
        grammar - The list of phrases to recognize, as a JSON array of strings, for example ["one two three four five", "[unk]"].
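        The grammar string described above can be assembled from a list of phrases. The following is a minimal sketch; GrammarBuilder and its methods are hypothetical helpers and not part of the library. It only escapes backslashes and double quotes, which is enough for plain phrases.

```java
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical helper (not part of the library): builds the grammar argument. */
final class GrammarBuilder {
    private GrammarBuilder() {}

    /** Escapes backslashes and double quotes so a phrase is a valid JSON string body. */
    static String escape(String phrase) {
        return phrase.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    /** Joins the phrases into a JSON array of strings; an empty list yields "[]". */
    static String toGrammar(List<String> phrases) {
        return phrases.stream()
                .map(p -> "\"" + escape(p) + "\"")
                .collect(Collectors.joining(", ", "[", "]"));
    }
}
```

        The returned string is in the form expected by the grammar parameter of this constructor and of setGrammar.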
    • Method Detail

      • setMaxAlternatives

         void setMaxAlternatives(int maxAlternatives)

        Configures recognizer to output n-best results.

          {
             "alternatives": [
                 { "text": "one two three four five", "confidence": 0.97 },
                 { "text": "one two three for five", "confidence": 0.03 }
             ]
          }
        
        Parameters:
        maxAlternatives - maximum number of alternatives to return in recognition results
      • setWords

         void setWords(boolean words)

        Enables words with times in the output

          "result" : [{
              "conf" : 1.000000,
              "end" : 1.110000,
              "start" : 0.870000,
              "word" : "what"
            }, {
              "conf" : 1.000000,
              "end" : 1.530000,
              "start" : 1.110000,
              "word" : "zero"
            }, {
              "conf" : 1.000000,
              "end" : 1.950000,
              "start" : 1.530000,
              "word" : "zero"
            }, {
              "conf" : 1.000000,
              "end" : 2.340000,
              "start" : 1.950000,
              "word" : "zero"
            }, {
              "conf" : 1.000000,
              "end" : 2.610000,
              "start" : 2.340000,
              "word" : "one"
            }],
        
        Parameters:
        words - true to include per-word times and confidences in the output
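        The word entries shown above can be turned into simple objects, for example to compute word durations. This is only a sketch: WordTiming and WordTimings are hypothetical names, and the regular expressions handle just the flat entries shown in the example. A real application would normally use a JSON library instead.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical value type for one entry of the "result" array. */
final class WordTiming {
    final String word;
    final double start;
    final double end;
    final double conf;

    WordTiming(String word, double start, double end, double conf) {
        this.word = word;
        this.start = start;
        this.end = end;
        this.conf = conf;
    }

    /** Length of the word in the same time unit as start and end. */
    double duration() {
        return end - start;
    }
}

/** Hypothetical parser for the flat word entries shown in the example output. */
final class WordTimings {
    // Matches one innermost {...} object, i.e. a single word entry.
    private static final Pattern ENTRY = Pattern.compile("\\{([^{}]*)\\}");
    private static final Pattern WORD = Pattern.compile("\"word\"\\s*:\\s*\"([^\"]*)\"");

    /** Reads one numeric field such as "start" : 0.870000 from an entry body. */
    private static double number(String body, String key) {
        Matcher m = Pattern.compile("\"" + key + "\"\\s*:\\s*(-?[0-9.]+)").matcher(body);
        if (!m.find()) {
            throw new IllegalArgumentException("missing field: " + key);
        }
        return Double.parseDouble(m.group(1));
    }

    /** Collects every object that has a "word" field, independent of field order. */
    static List<WordTiming> parse(String json) {
        List<WordTiming> out = new ArrayList<>();
        Matcher entry = ENTRY.matcher(json);
        while (entry.find()) {
            String body = entry.group(1);
            Matcher word = WORD.matcher(body);
            if (!word.find()) {
                continue; // not a word entry
            }
            out.add(new WordTiming(word.group(1), number(body, "start"),
                    number(body, "end"), number(body, "conf")));
        }
        return out;
    }
}
```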
      • setPartialWords

         void setPartialWords(boolean partial_words)

        Like setWords, but returns words and confidences in partial results.

        Parameters:
        partial_words - true to include words and confidences in partial results
      • setSpeakerModel

         void setSpeakerModel(SpeakerModel spkModel)

        Adds a speaker model to an already initialized recognizer. This helps to initialize speaker recognition for a grammar-based recognizer.

        Parameters:
        spkModel - Speaker recognition model
      • acceptWaveForm

         boolean acceptWaveForm(Array<byte> data, int len)

        Accept and process new chunk of voice data.

        Parameters:
        data - audio data in PCM 16-bit mono format
        len - length of the audio data
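        When samples are held as 16-bit values but the byte overload is used, they first have to be packed into bytes. The sketch below assumes little-endian byte order, as used in WAV files; this page does not state the expected order, so treat it as an assumption. PcmBytes is a hypothetical helper, not part of the library.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Hypothetical helper (not part of the library): packs 16-bit samples into bytes. */
final class PcmBytes {
    private PcmBytes() {}

    /**
     * Packs the first count samples into a byte array of length count * 2.
     * Assumption: little-endian order, low byte first, as in WAV files.
     */
    static byte[] toLittleEndian(short[] samples, int count) {
        ByteBuffer buf = ByteBuffer.allocate(count * 2).order(ByteOrder.LITTLE_ENDIAN);
        for (int i = 0; i < count; i++) {
            buf.putShort(samples[i]);
        }
        return buf.array();
    }
}
```

        The resulting array and its length could then be passed as the data and len arguments.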
      • getFinalResult

         String getFinalResult()

        Returns speech recognition result. Same as getResult, but doesn't wait for silence. You usually call it at the end of the stream to get the final bits of audio. It flushes the feature pipeline so that all remaining audio chunks are processed.
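        A typical streaming loop feeds audio in chunks and asks for the final result once the stream ends. The sketch below is written against SpeechSink, a hypothetical interface that mirrors four of the documented methods so the example compiles without the library. It also assumes, as in the upstream Vosk C API, that acceptWaveForm returns true when an utterance has ended and getResult can be read; this page does not state that.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in that mirrors four of the documented methods. */
interface SpeechSink {
    boolean acceptWaveForm(byte[] data, int len);
    String getResult();
    String getPartialResult();
    String getFinalResult();
}

final class StreamingLoop {
    private StreamingLoop() {}

    /** Feeds the stream in chunks of at most chunkBytes and collects each result string. */
    static List<String> transcribe(InputStream audio, SpeechSink sink, int chunkBytes) {
        if (chunkBytes <= 0) {
            throw new IllegalArgumentException("chunkBytes must be positive");
        }
        List<String> results = new ArrayList<>();
        byte[] buffer = new byte[chunkBytes];
        try {
            int n;
            while ((n = audio.read(buffer)) != -1) {
                // Assumption: true means the utterance ended and a result is ready.
                if (sink.acceptWaveForm(buffer, n)) {
                    results.add(sink.getResult());
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // End of stream: flush whatever audio is still pending.
        results.add(sink.getFinalResult());
        return results;
    }
}
```

        Between completed utterances, getPartialResult could be polled in the else branch to show interim text.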

      • setGrammar

         void setGrammar(String grammar)

        Reconfigures recognizer to use grammar.

        Parameters:
        grammar - Set of phrases in JSON array of strings or "[]" to use default model graph.
      • reset

         void reset()

        Resets the recognizer. Resets current results so the recognition can continue from scratch.

      • setEndpointerMode

         void setEndpointerMode(int mode)

        Configures endpointer mode for recognizer

      • setEndpointerDelays

         void setEndpointerDelays(float t_start_max, float t_end, float t_max)

        Set endpointer delays

        Parameters:
        t_start_max - timeout for stopping recognition in case of initial silence (usually around 5.)
        t_end - timeout for stopping recognition in milliseconds after we recognized something (usually around 0.5 - 1.)
        t_max - timeout for forcing utterance end in milliseconds (usually around 20-30)
      • close

         void close()

        Releases the recognizer object. The underlying model is also unreferenced and, if needed, released.
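        Because the class implements AutoCloseable, close() can be left to a try-with-resources statement. The sketch below uses NativeHandle, a hypothetical stand-in for a native-backed object, to show the order in which resources are closed. It does not use the library classes themselves.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a native-backed object; records open and close events. */
final class NativeHandle implements AutoCloseable {
    private final String name;
    private final List<String> log;
    private boolean closed;

    NativeHandle(String name, List<String> log) {
        this.name = name;
        this.log = log;
        log.add("open " + name);
    }

    @Override
    public void close() {
        if (closed) {
            return; // a second close is a no-op
        }
        closed = true;
        log.add("close " + name);
    }
}

final class CloseOrderDemo {
    private CloseOrderDemo() {}

    /** Opens two handles and returns the recorded sequence of events. */
    static List<String> run() {
        List<String> log = new ArrayList<>();
        // Resources are closed in reverse order of declaration.
        try (NativeHandle model = new NativeHandle("model", log);
             NativeHandle recognizer = new NativeHandle("recognizer", log)) {
            log.add("use recognizer");
        }
        return log;
    }
}
```

        With this pattern the object declared last is released first, even when the body throws.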