4 changes: 2 additions & 2 deletions .gitmodules
@@ -1,4 +1,4 @@
[submodule "common"]
path = common
url = https://github.com/nvidia-riva/common.git
branch = main
url = https://gitlab-master.nvidia.com/sarane/common.git
branch = nemotron_name_change
20 changes: 10 additions & 10 deletions README.md
@@ -1,19 +1,19 @@
[![License](https://img.shields.io/badge/license-MIT-green)](https://opensource.org/licenses/MIT)
# NVIDIA Riva Clients
# NVIDIA Nemotron Speech Clients

NVIDIA Riva is a GPU-accelerated SDK for building Speech AI applications that are customized for your use
Nemotron Speech is a GPU-accelerated SDK for building Speech AI applications that are customized for your use
case and deliver real-time performance. This repo provides performant example command-line clients.

## Main API

- `riva.client.ASRService` is a class for speech recognition,
- `riva.client.TTSService` is a class for speech synthesis,
- `riva.client.NLPService` is a class for natural language processing.
- `nemotronspeech.client.ASRService` is a class for speech recognition,
- `nemotronspeech.client.TTSService` is a class for speech synthesis,
- `nemotronspeech.client.NLPService` is a class for natural language processing.

## CLI interface

- **Automatic Speech Recognition (ASR)**
- `scripts/asr/riva_streaming_asr_client.py` demonstrates streaming transcription in several threads, can print time stamps.
- `scripts/asr/nemotron_streaming_asr_client.py` demonstrates streaming transcription in several threads, can print time stamps.
- `scripts/asr/transcribe_file.py` performs streaming transcription,
- `scripts/asr/transcribe_file_offline.py` performs offline transcription,
- `scripts/asr/transcribe_mic.py` performs streaming transcription of audio acquired through microphone.
@@ -47,12 +47,12 @@ pip install --force-reinstall dist/*.whl
```
3. `pip`:
```bash
pip install nvidia-riva-client
pip install nvidia-nemotronspeech-client
```

If you would like to use output and input audio devices
(scripts `scripts/asr/transcribe_file_rt.py`, `scripts/asr/transcribe_mic.py`, `scripts/tts/talk.py`, `scripts/asr/realtime_asr_client.py`, `scripts/tts/realtime_tts_client.py` or module
`riva.client/audio_io.py`), you will need to install `PyAudio`.
`nemotronspeech.client/audio_io.py`), you will need to install `PyAudio`.
```bash
conda install -c anaconda pyaudio
```
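
Since the audio-device scripts listed above depend on `PyAudio`, a script could check for it before opening a device. The following is a minimal sketch, not code from this repository; the helper name `is_module_available` is hypothetical, and the only assumption about PyAudio is that its import name is `pyaudio`.

```python
import importlib.util


def is_module_available(name: str) -> bool:
    """Return True if a top-level module called `name` can be found without importing it."""
    return importlib.util.find_spec(name) is not None


# PyAudio is imported as `pyaudio`; report a hint instead of failing later with ImportError.
if not is_module_available("pyaudio"):
    print("PyAudio is not installed; audio input/output scripts will not work.")
```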
@@ -82,7 +82,7 @@ and restart.

### Server

Before running client part of Riva, please set up a server. The simplest
Before running client part of Nemotron Speech, please set up a server. The simplest
way to do this is to follow
[quick start guide](https://docs.nvidia.com/deeplearning/riva/user-guide/docs/quick-start-guide.html#local-deployment-using-quick-start-scripts).

@@ -280,7 +280,7 @@ See tutorial notebooks in directory `tutorials`.

## Documentation

Additional documentation on the Riva Speech Skills SDK can be found [here](https://docs.nvidia.com/deeplearning/riva/user-guide/docs/).
Additional documentation on the Nemotron Speech Skills SDK can be found [here](https://docs.nvidia.com/deeplearning/riva/user-guide/docs/).


## License
2 changes: 1 addition & 1 deletion common
Submodule common updated from 71df98 to 7ad511
20 changes: 10 additions & 10 deletions riva/client/__init__.py → nemotronspeech/client/__init__.py
@@ -1,7 +1,7 @@
# SPDX-FileCopyrightText: Copyright (c) 2022 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
# SPDX-License-Identifier: MIT

from riva.client.asr import (
from nemotronspeech.client.asr import (
AudioChunkFileIterator,
ASRService,
add_audio_file_specs_to_config,
@@ -14,15 +14,15 @@
add_endpoint_parameters_to_config,
add_custom_configuration_to_config,
)
from riva.client.auth import Auth
from riva.client.nlp import (
from nemotronspeech.client.auth import Auth
from nemotronspeech.client.nlp import (
NLPService,
extract_all_text_classes_and_confidences,
extract_all_token_classification_predictions,
extract_most_probable_text_class_and_confidence,
extract_most_probable_token_classification_predictions,
)
from riva.client.package_info import (
from nemotronspeech.client.package_info import (
__contact_emails__,
__contact_names__,
__description__,
@@ -35,9 +35,9 @@
__shortversion__,
__version__,
)
from riva.client.proto.riva_asr_pb2 import RecognitionConfig, StreamingRecognitionConfig, EndpointingConfig
from riva.client.proto.riva_audio_pb2 import AudioEncoding
from riva.client.proto.riva_nlp_pb2 import AnalyzeIntentOptions
from riva.client.proto.riva_nmt_pb2 import StreamingTranslateSpeechToSpeechConfig, TranslationConfig, SynthesizeSpeechConfig, StreamingTranslateSpeechToTextConfig
from riva.client.tts import SpeechSynthesisService
from riva.client.nmt import NeuralMachineTranslationClient
from nemotronspeech.client.proto.nemotron_asr_pb2 import RecognitionConfig, StreamingRecognitionConfig, EndpointingConfig
from nemotronspeech.client.proto.nemotron_audio_pb2 import AudioEncoding
from nemotronspeech.client.proto.nemotron_nlp_pb2 import AnalyzeIntentOptions
from nemotronspeech.client.proto.nemotron_nmt_pb2 import StreamingTranslateSpeechToSpeechConfig, TranslationConfig, SynthesizeSpeechConfig, StreamingTranslateSpeechToTextConfig
from nemotronspeech.client.tts import SpeechSynthesisService
from nemotronspeech.client.nmt import NeuralMachineTranslationClient
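
Because this change renames the top-level package from `riva.client` to `nemotronspeech.client`, downstream code that has to run against both releases may want a fallback import. The sketch below is one way to do that using only the standard library; `import_first_available` is a hypothetical helper and is not part of either package.

```python
import importlib
from types import ModuleType
from typing import Sequence


def import_first_available(candidates: Sequence[str]) -> ModuleType:
    """Return the first module in `candidates` that imports successfully."""
    errors = []
    for name in candidates:
        try:
            return importlib.import_module(name)
        except ImportError as exc:  # also covers ModuleNotFoundError
            errors.append(f"{name}: {exc}")
    raise ImportError("none of the candidate modules could be imported: " + "; ".join(errors))


# Intended use (left commented out because neither package may be installed here):
# client = import_first_available(["nemotronspeech.client", "riva.client"])
```

Listing the new name first means the old package is only used when the new one is absent.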
File renamed without changes.
24 changes: 12 additions & 12 deletions riva/client/asr.py → nemotronspeech/client/asr.py
@@ -16,10 +16,10 @@
from google.protobuf.json_format import MessageToJson
from grpc._channel import _MultiThreadedRendezvous

import riva.client
import riva.client.proto.riva_asr_pb2 as rasr
import riva.client.proto.riva_asr_pb2_grpc as rasr_srv
from riva.client.auth import Auth
import nemotronspeech.client
import nemotronspeech.client.proto.nemotron_asr_pb2 as rasr
import nemotronspeech.client.proto.nemotron_asr_pb2_grpc as rasr_srv
from nemotronspeech.client.auth import Auth


def get_wav_file_parameters(input_file: Union[str, os.PathLike]) -> Dict[str, Union[int, float]]:
@@ -194,7 +194,7 @@ def print_streaming(
Prints streaming speech recognition results to provided files or streams.

Args:
responses (:obj:`Iterable[riva.client.proto.riva_asr_pb2.StreamingRecognizeResponse]`): responses acquired during
responses (:obj:`Iterable[nemotronspeech.client.proto.nemotron_asr_pb2.StreamingRecognizeResponse]`): responses acquired during
streaming speech recognition.
output_file (:obj:`Union[Union[os.PathLike, str, TextIO], List[Union[os.PathLike, str, TextIO]]]`, `optional`):
a path to an output file or a text stream or a list of paths/streams. If contains several elements, then
@@ -394,7 +394,7 @@ def __init__(self, auth: Auth) -> None:
Initializes an instance of the class.

Args:
auth (:obj:`riva.client.auth.Auth`): an instance of :class:`riva.client.auth.Auth` which is used for
auth (:obj:`nemotronspeech.client.auth.Auth`): an instance of :class:`nemotronspeech.client.auth.Auth` which is used for
authentication metadata generation.
"""
self.auth = auth
@@ -420,20 +420,20 @@ def streaming_response_generator(
with wave.open(file_name, 'rb') as wav_f:
raw_audio = wav_f.readframes(n_frames)

streaming_config (:obj:`riva.client.proto.riva_asr_pb2.StreamingRecognitionConfig`): a config for streaming.
streaming_config (:obj:`nemotronspeech.client.proto.nemotron_asr_pb2.StreamingRecognitionConfig`): a config for streaming.
You may find description of config fields in message ``StreamingRecognitionConfig`` in
`common repo
<https://docs.nvidia.com/deeplearning/riva/user-guide/docs/reference/protos/protos.html#riva-proto-riva-asr-proto>`_.
An example of creation of streaming config:

.. code-block:: python

from riva.client import RecognitionConfig, StreamingRecognitionConfig
from nemotronspeech.client import RecognitionConfig, StreamingRecognitionConfig
config = RecognitionConfig(enable_automatic_punctuation=True)
streaming_config = StreamingRecognitionConfig(config=config, interim_results=True)

Yields:
:obj:`riva.client.proto.riva_asr_pb2.StreamingRecognizeResponse`: responses for audio chunks in
:obj:`nemotronspeech.client.proto.nemotron_asr_pb2.StreamingRecognizeResponse`: responses for audio chunks in
:param:`audio_chunks`. You may find description of response fields in declaration of
``StreamingRecognizeResponse``
message `here
@@ -459,21 +459,21 @@ def offline_recognize(
with wave.open(file_name, 'rb') as wav_f:
raw_audio = wav_f.readframes(n_frames)

config (:obj:`riva.client.proto.riva_asr_pb2.RecognitionConfig`): a config for offline speech recognition.
config (:obj:`nemotronspeech.client.proto.nemotron_asr_pb2.RecognitionConfig`): a config for offline speech recognition.
You may find description of config fields in message ``RecognitionConfig`` in
`common repo
<https://docs.nvidia.com/deeplearning/riva/user-guide/docs/reference/protos/protos.html#riva-proto-riva-asr-proto>`_.
An example of creation of config:

.. code-block:: python

from riva.client import RecognitionConfig
from nemotronspeech.client import RecognitionConfig
config = RecognitionConfig(enable_automatic_punctuation=True)
future (:obj:`bool`, defaults to :obj:`False`): whether to return an async result instead of usual
response. You can get a response by calling ``result()`` method of the future object.

Returns:
:obj:`Union[riva.client.proto.riva_asr_pb2.RecognizeResponse, grpc._channel._MultiThreadedRendezvous]`: a
:obj:`Union[nemotronspeech.client.proto.nemotron_asr_pb2.RecognizeResponse, grpc._channel._MultiThreadedRendezvous]`: a
response with results of :param:`audio_bytes` processing. You may find description of response fields in
declaration of ``RecognizeResponse`` message `here
<https://docs.nvidia.com/deeplearning/riva/user-guide/docs/reference/protos/protos.html#riva-proto-riva-asr-proto>`_.
File renamed without changes.
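
The import changes in these files follow a regular pattern: `riva.client` becomes `nemotronspeech.client`, and generated modules `riva_<name>_pb2` become `nemotron_<name>_pb2`. A downstream project could apply the same pattern to its own sources with a small rewrite function. This is only a sketch inferred from the lines shown in this diff; `rewrite_imports` is a hypothetical helper, and it does not cover script file names or the pip package name.

```python
import re

# Order matters: rewrite the generated proto module names first, because that
# pattern relies on the old `riva.client` prefix still being present.
_RULES = [
    (
        re.compile(r"\briva\.client\.proto\.riva_(\w+?)_pb2"),
        r"nemotronspeech.client.proto.nemotron_\1_pb2",
    ),
    (re.compile(r"\briva\.client\b"), "nemotronspeech.client"),
]


def rewrite_imports(source: str) -> str:
    """Apply the package rename seen in this diff to a piece of source text."""
    for pattern, replacement in _RULES:
        source = pattern.sub(replacement, source)
    return source
```

For example, `rewrite_imports("from riva.client.auth import Auth")` yields the corresponding `nemotronspeech.client.auth` import, and a `_pb2_grpc` suffix is preserved because only the part up to `_pb2` is matched.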
4 changes: 2 additions & 2 deletions riva/client/auth.py → nemotronspeech/client/auth.py
@@ -68,14 +68,14 @@ def __init__(
Initialize the Auth class for establishing secure connections with a server.

This class handles SSL/TLS configuration, authentication metadata, and gRPC channel creation
for secure communication with Riva services.
for secure communication with NemotronSpeech services.

Args:
ssl_root_cert (Optional[Union[str, os.PathLike]], optional): Path to the SSL root certificate file.
If provided and use_ssl is False, SSL will still be enabled. Defaults to None.
use_ssl (bool, optional): Whether to use SSL/TLS encryption. If True and ssl_root_cert is None,
SSL will be used with default credentials. Defaults to False.
uri (str, optional): The Riva server URI in format "host:port". Defaults to "localhost:50051".
uri (str, optional): The NemotronSpeech server URI in format "host:port". Defaults to "localhost:50051".
metadata_args (List[List[str]], optional): List of metadata key-value pairs for authentication.
Each inner list should contain exactly 2 elements: [key, value]. Defaults to None.
ssl_client_cert (Optional[Union[str, os.PathLike]], optional): Path to the SSL client certificate file.
File renamed without changes.
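
The `Auth` docstring above describes `metadata_args` as a list of two-element `[key, value]` lists. gRPC's Python API takes call metadata as a sequence of `(key, value)` tuples, so a conversion along the following lines is plausible. This is a minimal sketch under that assumption, not the actual `Auth` implementation; `build_metadata` is a hypothetical name.

```python
from typing import List, Optional, Sequence, Tuple


def build_metadata(metadata_args: Optional[Sequence[Sequence[str]]]) -> List[Tuple[str, str]]:
    """Validate [key, value] pairs and return them as (key, value) tuples."""
    metadata: List[Tuple[str, str]] = []
    for pair in metadata_args or []:
        if len(pair) != 2:
            raise ValueError(f"expected exactly 2 elements [key, value], got {len(pair)}: {pair!r}")
        key, value = pair
        metadata.append((key, value))
    return metadata
```

Passing `None` (the documented default) gives an empty list, so callers need no special case for unauthenticated connections.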
26 changes: 13 additions & 13 deletions riva/client/nlp.py → nemotronspeech/client/nlp.py
@@ -6,9 +6,9 @@
from google.protobuf.message import Message
from grpc._channel import _MultiThreadedRendezvous

import riva.client.proto.riva_nlp_pb2 as rnlp
import riva.client.proto.riva_nlp_pb2_grpc as rnlp_srv
from riva.client import Auth
import nemotronspeech.client.proto.nemotron_nlp_pb2 as rnlp
import nemotronspeech.client.proto.nemotron_nlp_pb2_grpc as rnlp_srv
from nemotronspeech.client import Auth


def extract_all_text_classes_and_confidences(
@@ -103,7 +103,7 @@ def __init__(self, auth: Auth) -> None:
Initializes an instance of the class.

Args:
auth (:obj:`Auth`): an instance of :class:`riva.client.auth.Auth` which is used for
auth (:obj:`Auth`): an instance of :class:`nemotronspeech.client.auth.Auth` which is used for
authentication metadata generation.
"""
self.auth = auth
@@ -125,7 +125,7 @@ def classify_text(
future (:obj:`bool`, defaults to :obj:`False`): whether to return an async result instead of usual
response. You can get a response by calling ``result()`` method of the future object.
Returns:
:obj:`Union[riva.client.proto.riva_nlp_pb2.TextClassResponse, grpc._channel._MultiThreadedRendezvous]`: a
:obj:`Union[nemotronspeech.client.proto.nemotron_nlp_pb2.TextClassResponse, grpc._channel._MultiThreadedRendezvous]`: a
response with :param:`input_strings` classification results. You may find :class:`TextClassResponse`
fields description `here
<https://docs.nvidia.com/deeplearning/riva/user-guide/docs/reference/protos/protos.html#riva-proto-riva-nlp-proto>`_.
@@ -158,7 +158,7 @@ def classify_tokens(
future (:obj:`bool`, defaults to :obj:`False`): whether to return an async result instead of usual
response. You can get a response by calling ``result()`` method of the future object.
Returns:
:obj:`Union[riva.client.proto.riva_nlp_pb2.TokenClassResponse, grpc._channel._MultiThreadedRendezvous]`: a
:obj:`Union[nemotronspeech.client.proto.nemotron_nlp_pb2.TokenClassResponse, grpc._channel._MultiThreadedRendezvous]`: a
response with results. You may find :class:`TokenClassResponse` fields description `here
<https://docs.nvidia.com/deeplearning/riva/user-guide/docs/reference/protos/protos.html#riva-proto-riva-nlp-proto>`_.
If :param:`future` is :obj:`True`, then a future object is returned. You may retrieve a response from a
@@ -189,7 +189,7 @@ def transform_text(
future (:obj:`bool`, defaults to :obj:`False`): whether to return an async result instead of usual
response. You can get a response by calling ``result()`` method of the future object.
Returns:
:obj:`Union[riva.client.proto.riva_nlp_pb2.TextTransformResponse, grpc._channel._MultiThreadedRendezvous]`: a
:obj:`Union[nemotronspeech.client.proto.nemotron_nlp_pb2.TextTransformResponse, grpc._channel._MultiThreadedRendezvous]`: a
model response. You may find :class:`TextTransformResponse` fields description `here
<https://docs.nvidia.com/deeplearning/riva/user-guide/docs/reference/protos/protos.html#riva-proto-riva-nlp-proto>`_.
If :param:`future` is :obj:`True`, then a future object is returned. You may retrieve a response from a
@@ -211,7 +211,7 @@ def analyze_entities(
future (:obj:`bool`, defaults to :obj:`False`): whether to return an async result instead of usual
response. You can get a response by calling ``result()`` method of the future object.
Returns:
:obj:`Union[riva.client.proto.riva_nlp_pb2.TokenClassResponse, grpc._channel._MultiThreadedRendezvous]`: a
:obj:`Union[nemotronspeech.client.proto.nemotron_nlp_pb2.TokenClassResponse, grpc._channel._MultiThreadedRendezvous]`: a
model response. You may find :class:`TokenClassResponse` fields description `here
<https://docs.nvidia.com/deeplearning/riva/user-guide/docs/reference/protos/protos.html#riva-proto-riva-nlp-proto>`_.
If :param:`future` is :obj:`True`, then a future object is returned. You may retrieve a response from a
@@ -233,15 +233,15 @@ def analyze_intent(

Args:
input_string (:obj:`str`): a string which will be classified.
options (:obj:`riva.client.proto.riva_nlp_pb2.AnalyzeIntentOptions`, `optional`,
defaults to :obj:`riva.client.proto.riva_nlp_pb2.AnalyzeIntentOptions()`):
options (:obj:`nemotronspeech.client.proto.nemotron_nlp_pb2.AnalyzeIntentOptions`, `optional`,
defaults to :obj:`nemotronspeech.client.proto.nemotron_nlp_pb2.AnalyzeIntentOptions()`):
an intent options. You may find fields description `here
<https://docs.nvidia.com/deeplearning/riva/user-guide/docs/reference/protos/protos.html#riva-proto-riva-nlp-proto>`_.
Defaults to an instance of :obj:`AnalyzeIntentOptions` created without parameters.
future (:obj:`bool`, defaults to :obj:`False`): whether to return an async result instead of usual
response. You can get a response by calling ``result()`` method of the future object.
Returns:
:obj:`Union[riva.client.proto.riva_nlp_pb2.AnalyzeIntentResponse, grpc._channel._MultiThreadedRendezvous]`: a
:obj:`Union[nemotronspeech.client.proto.nemotron_nlp_pb2.AnalyzeIntentResponse, grpc._channel._MultiThreadedRendezvous]`: a
response with results. You may find fields description `here
<https://docs.nvidia.com/deeplearning/riva/user-guide/docs/reference/protos/protos.html#riva-proto-riva-nlp-proto>`_.
If :param:`future` is :obj:`True`, then a future object is returned. You may retrieve a response from a
@@ -272,7 +272,7 @@ def punctuate_text(
future (:obj:`bool`, defaults to :obj:`False`): whether to return an async result instead of usual
response. You can get a response by calling ``result()`` method of the future object.
Returns:
:obj:`Union[riva.client.proto.riva_nlp_pb2.TextTransformResponse, grpc._channel._MultiThreadedRendezvous]`: a
:obj:`Union[nemotronspeech.client.proto.nemotron_nlp_pb2.TextTransformResponse, grpc._channel._MultiThreadedRendezvous]`: a
response with results. You may find fields description `here
<https://docs.nvidia.com/deeplearning/riva/user-guide/docs/reference/protos/protos.html#riva-proto-riva-nlp-proto>`_.
If :param:`future` is :obj:`True`, then a future object is returned. You may retrieve a response from a
@@ -296,7 +296,7 @@ def natural_query(
future (:obj:`bool`, defaults to :obj:`False`): whether to return an async result instead of usual
response. You can get a response by calling ``result()`` method of the future object.
Returns:
:obj:`Union[riva.client.proto.riva_nlp_pb2.NaturalQueryResult, grpc._channel._MultiThreadedRendezvous]`: a
:obj:`Union[nemotronspeech.client.proto.nemotron_nlp_pb2.NaturalQueryResult, grpc._channel._MultiThreadedRendezvous]`: a
response with a result. You may find fields description `here
<https://docs.nvidia.com/deeplearning/riva/user-guide/docs/reference/protos/protos.html#riva-proto-riva-nlp-proto>`_.
If :param:`future` is :obj:`True`, then a future object is returned. You may retrieve a response from a
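
Every `NLPService` method in this file documents the same `future` flag: with `future=False` the call returns the response directly, and with `future=True` it returns an object whose `result()` method yields the response. The sketch below illustrates that calling pattern only. It uses `concurrent.futures` as a stand-in for the gRPC future, and `classify_text` and `_classify` here are hypothetical local functions, not the library's methods.

```python
from concurrent.futures import Future, ThreadPoolExecutor
from typing import List, Union

_executor = ThreadPoolExecutor(max_workers=1)


def _classify(input_strings: List[str]) -> List[str]:
    # Stand-in for a server call: label each string by its length.
    return ["long" if len(s) > 10 else "short" for s in input_strings]


def classify_text(input_strings: List[str], future: bool = False) -> Union[List[str], Future]:
    """Return the result directly, or a future that resolves to it when `future` is True."""
    if future:
        return _executor.submit(_classify, input_strings)
    return _classify(input_strings)


pending = classify_text(["hi", "a much longer sentence"], future=True)
labels = pending.result()  # blocks until the background call has finished
```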