tikhonp/yandex-speechkit-lib-python

Python SDK for Yandex SpeechKit API. This SDK allows you to use the cloud API for speech recognition and synthesis from Yandex.

For more information, please visit the Yandex SpeechKit API Docs. This library supports short and long audio recognition with SpeechKit.

🛠 Getting Started

Assuming that you have Python and virtualenv installed, set up your environment and install the required dependencies like this:

    $ git clone https://github.com/TikhonP/yandex-speechkit-lib-python.git
    $ cd yandex-speechkit-lib-python
    $ virtualenv venv
    ...
    $ . venv/bin/activate
    $ python -m pip install -r requirements.txt
    $ python -m pip install .

Or install the library using pip:

    $ python -m pip install speechkit

📑 Speechkit documentation

Check out the speechkit docs for more info. PDF docs

🔮 Using speechkit

The library supports recognition of long and short audio, as well as synthesis. For more information, please read the docs.

First, you need to create a session for authorization:

    from speechkit import Session

    oauth_token = '<oauth_token>'
    folder_id = '<folder_id>'
    api_key = '<api-key>'
    jwt_token = '<jwt_token>'

    oauth_session = Session.from_yandex_passport_oauth_token(oauth_token, folder_id)
    api_key_session = Session.from_api_key(
        api_key,
        x_client_request_id_header=True,
        x_data_logging_enabled=True,
    )
    # You can use the `x_client_request_id_header` and `x_data_logging_enabled` params
    # to troubleshoot Yandex recognition.
    # Use the `Session.get_x_client_request_id()` method to get the x_client_request_id value.
    jwt_session = Session.from_jwt(jwt_token)

Use the created session to make other requests.
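Rather than hardcoding tokens, one option is to read them from environment variables and call the matching constructor. The following is only a sketch: the variable names (`YANDEX_API_KEY`, `YANDEX_OAUTH_TOKEN`, `YANDEX_FOLDER_ID`, `YANDEX_JWT`) and both helper functions are hypothetical and not part of the library. It relies solely on the three `Session` constructors shown above.

```python
import os


def pick_credentials(env):
    """Return (kind, args) for the first credential found in `env`.

    The variable names are illustrative assumptions, not defined by speechkit.
    """
    if env.get('YANDEX_API_KEY'):
        return 'api_key', (env['YANDEX_API_KEY'],)
    if env.get('YANDEX_OAUTH_TOKEN') and env.get('YANDEX_FOLDER_ID'):
        return 'oauth', (env['YANDEX_OAUTH_TOKEN'], env['YANDEX_FOLDER_ID'])
    if env.get('YANDEX_JWT'):
        return 'jwt', (env['YANDEX_JWT'],)
    raise RuntimeError('no Yandex credentials found in the environment')


def make_session(env=None):
    """Build a Session with one of the constructors from the example above."""
    # Imported lazily so that pick_credentials() has no third-party dependency.
    from speechkit import Session

    kind, args = pick_credentials(os.environ if env is None else env)
    constructors = {
        'api_key': Session.from_api_key,
        'oauth': Session.from_yandex_passport_oauth_token,
        'jwt': Session.from_jwt,
    }
    return constructors[kind](*args)
```

Keeping the lookup in `pick_credentials` separate from the library call makes the selection order easy to check without network access or real tokens.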

There are also functions for getting credentials (read the Documentation for more info): speechkit.auth.generate_jwt, speechkit.auth.get_iam_token, speechkit.auth.get_api_key

For audio recognition

Short audio:

    from speechkit import ShortAudioRecognition

    # `session` is one of the sessions created above.
    recognizeShortAudio = ShortAudioRecognition(session)
    with open('/Users/tikhon/Desktop/out.wav', 'rb') as f:
        data = f.read()

    print(recognizeShortAudio.recognize(data, format='lpcm', sampleRateHertz='48000'))
    # Will be printed: 'text that need to be recognized'
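The `sampleRateHertz` value has to match the audio that is actually sent. When the source is a WAV file, the rate can be read from its header with the standard `wave` module instead of being typed by hand. The sketch below is not part of the library: `read_lpcm` is a hypothetical helper, and the list of accepted rates and the 16-bit mono requirement are assumptions based on the Yandex documentation that should be checked against the current API reference.

```python
import wave

# Sample rates assumed to be accepted for lpcm (verify against the Yandex docs).
SUPPORTED_RATES = (8000, 16000, 48000)


def read_lpcm(path):
    """Return (raw_pcm_bytes, sample_rate) from a 16-bit mono WAV file."""
    with wave.open(path, 'rb') as wav:
        if wav.getnchannels() != 1:
            raise ValueError('expected mono audio')
        if wav.getsampwidth() != 2:
            raise ValueError('expected 16-bit samples')
        rate = wav.getframerate()
        if rate not in SUPPORTED_RATES:
            raise ValueError(f'unsupported sample rate: {rate}')
        # readframes() returns the samples only, without the WAV header.
        return wav.readframes(wav.getnframes()), rate


# Possible usage with the recognizer from the example above:
# data, rate = read_lpcm('/Users/tikhon/Desktop/out.wav')
# print(recognizeShortAudio.recognize(data, format='lpcm', sampleRateHertz=str(rate)))
```

The rate is converted with `str(rate)` only to mirror the example above, which passes `sampleRateHertz` as a string.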

Look at the example with long audio: long_audio_recognition.py.

Look at the example with streaming audio: streaming_recognize.py.

For synthesis

    from speechkit import SpeechSynthesis

    # `session` is one of the sessions created above.
    synthesizeAudio = SpeechSynthesis(session)
    synthesizeAudio.synthesize(
        '/Users/tikhon/Desktop/out.wav',
        text='Text that will be synthesised',
        voice='oksana',
        format='lpcm',
        sampleRateHertz='16000',
    )
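If the API returns `lpcm` as raw samples without a WAV header, as the Yandex documentation describes (an assumption worth verifying for your API version), the saved file may not open in ordinary audio players even though its name ends in `.wav`. A minimal sketch of wrapping such data in a WAV container with the standard `wave` module follows. `pcm_to_wav` is a hypothetical helper, not part of the library, and it assumes 16-bit mono samples.

```python
import wave


def pcm_to_wav(pcm_path, wav_path, sample_rate=16000):
    """Wrap headerless 16-bit mono PCM in a WAV container."""
    with open(pcm_path, 'rb') as src:
        frames = src.read()
    with wave.open(wav_path, 'wb') as wav:
        wav.setnchannels(1)            # mono (assumed)
        wav.setsampwidth(2)            # 2 bytes per sample = 16-bit (assumed)
        wav.setframerate(sample_rate)  # must match sampleRateHertz used for synthesis
        wav.writeframes(frames)
```

Here `sample_rate` should be the same value that was passed as `sampleRateHertz` during synthesis, otherwise playback will be too fast or too slow.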

🔗 Links

💼 License

MIT

In other words, you can use the code for private and commercial purposes with an author attribution (by including the original license file).

❤️