
Redshift Python Connector. It supports Python Database API Specification v2.0.

aws/amazon-redshift-python-driver


redshift_connector is the Amazon Redshift connector for Python. Easy integration with pandas and numpy, as well as support for numerous Amazon Redshift specific features, helps you get the most out of your data.

Supported Amazon Redshift features include:

  • IAM authentication
  • Identity provider (IdP) authentication
  • Redshift specific data types

This pure Python connector implements the Python Database API Specification 2.0.

Getting Started

Install from Binary

| Package Manager | Installation Command |
| --- | --- |
| PyPI | pip install redshift_connector |
| Conda | conda install -c conda-forge redshift_connector |

Install from Source

You may install from source by cloning this repository.

```
$ git clone https://github.com/aws/amazon-redshift-python-driver.git
$ cd amazon-redshift-python-driver
$ pip install .
```

Tutorials

We are working to add more documentation and would love your feedback. Please reach out to the team by opening an issue or starting a discussion to help us fill in the gaps in our documentation.

Integrations

redshift_connector integrates with various open source projects to provide an interface to Amazon Redshift. Please open an issue with our project to request new integrations or get support for a redshift_connector issue seen in an existing integration.

Basic Example

```python
import redshift_connector

# Connects to Redshift cluster using AWS credentials
conn = redshift_connector.connect(
    host='examplecluster.abc123xyz789.us-west-1.redshift.amazonaws.com',
    database='dev',
    user='awsuser',
    password='my_password'
)

cursor: redshift_connector.Cursor = conn.cursor()
cursor.execute("create Temp table book(bookname varchar,author varchar)")
cursor.executemany("insert into book (bookname, author) values (%s, %s)",
                   [
                       ('One Hundred Years of Solitude', 'Gabriel García Márquez'),
                       ('A Brief History of Time', 'Stephen Hawking')
                   ])
cursor.execute("select * from book")
result: tuple = cursor.fetchall()
print(result)
>> (['One Hundred Years of Solitude', 'Gabriel García Márquez'], ['A Brief History of Time', 'Stephen Hawking'])
```

Enabling autocommit

Following the DB-API specification, autocommit is off by default. It can be turned on by using the autocommit property of the connection.

```python
# Make sure we're not in a transaction
conn.rollback()
conn.autocommit = True
conn.run("VACUUM")
conn.autocommit = False
```

Configuring paramstyle

Paramstyle can be set at both the module and cursor level. When paramstyle is set at the module level, e.g. redshift_connector.paramstyle = 'qmark', the specified paramstyle is used for all subsequent cursors unless it is set on the cursor. When paramstyle is set on the cursor, e.g. cursor.paramstyle = 'qmark', the specified paramstyle is used only for that cursor object.

```python
# setting paramstyle to qmark on a module level
redshift_connector.paramstyle = 'qmark'

with redshift_connector.connect() as conn1:
    with conn1.cursor() as cursor1:
        # this cursor will use qmark paramstyle as it's been set on the module level
        pass
    with conn1.cursor() as cursor2:
        # setting paramstyle to numeric on the cursor level; only this cursor will use numeric paramstyle
        cursor2.paramstyle = 'numeric'
    with conn1.cursor() as cursor3:
        # this cursor will use qmark paramstyle as it's been set on the module level
        pass

with redshift_connector.connect() as conn2:
    with conn2.cursor() as cursor1:
        # this cursor will use qmark paramstyle as it's been set on the module level
        pass
```

The module level default paramstyle is format. Valid values for paramstyle include qmark, numeric, named, format, and pyformat. The example below shows how to use various paramstyles after the paramstyle is set on the cursor.

When paramstyle is set to named or pyformat, parameters must be passed to the execute() method as a Python dictionary. Other paramstyles require parameters to be passed as a Python tuple or list.

```python
# qmark
cursor.paramstyle = 'qmark'
sql = 'insert into foo(bar, jar) VALUES(?, ?)'
cursor.execute(sql, (1, "hello world"))

# numeric
cursor.paramstyle = 'numeric'
sql = 'insert into foo(bar, jar) VALUES(:1, :2)'
cursor.execute(sql, (1, "hello world"))

# named
cursor.paramstyle = 'named'
sql = 'insert into foo(bar, jar) VALUES(:p1, :p2)'
cursor.execute(sql, {"p1": 1, "p2": "hello world"})

# format
cursor.paramstyle = 'format'
sql = 'insert into foo(bar, jar) VALUES(%s, %s)'
cursor.execute(sql, (1, "hello world"))

# pyformat
cursor.paramstyle = 'pyformat'
sql = 'insert into foo(bar, jar) VALUES(%(bar)s, %(jar)s)'
cursor.execute(sql, {"bar": 1, "jar": "hello world"})
```

Exception Handling

redshift_connector follows the guidelines for exception handling specified in the Python DB-API. For exception definitions, please see redshift_connector/error.py.
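Because the DB-API defines a common exception tree rooted at Error, handlers can be ordered from most specific to least specific. The sketch below illustrates the pattern using the standard library's sqlite3 module, which implements the same DB-API 2.0 hierarchy; the equivalent redshift_connector classes live in redshift_connector/error.py and may differ in name.

```python
import sqlite3  # stdlib DB-API 2.0 driver, used here only to illustrate the shared hierarchy

conn = sqlite3.connect(":memory:")
cursor = conn.cursor()
try:
    cursor.execute("select * from missing_table")  # table does not exist
except sqlite3.ProgrammingError as exc:
    print("programming error:", exc)
except sqlite3.Error as exc:
    # Error is the root of the DB-API exception tree, so this branch
    # catches any remaining database-related failure.
    print("database error:", exc)
finally:
    conn.close()
```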

Example using IAM Credentials

IAM credentials can be supplied directly to connect(...) using an AWS profile as shown below:

```python
import redshift_connector

# Connects to Redshift cluster using IAM credentials from default profile defined in ~/.aws/credentials
conn = redshift_connector.connect(
    iam=True,
    database='dev',
    db_user='awsuser',
    password='',
    user='',
    cluster_identifier='examplecluster',
    profile='default'
)
```
```ini
# ~/.aws/credentials
[default]
aws_access_key_id="my_aws_access_key_id"
aws_secret_access_key="my_aws_secret_access_key"
aws_session_token="my_aws_session_token"

# ~/.aws/config
[default]
region=us-west-2
```

If a region is not provided in ~/.aws/config, or you would like to override its value, region may be passed to connect(...).

Alternatively, IAM credentials can be supplied directly to connect(...) using AWS credentials as shown below:

```python
import redshift_connector

# Connects to Redshift cluster using IAM credentials supplied directly
conn = redshift_connector.connect(
    iam=True,
    database='dev',
    db_user='awsuser',
    password='',
    user='',
    cluster_identifier='examplecluster',
    access_key_id="my_aws_access_key_id",
    secret_access_key="my_aws_secret_access_key",
    session_token="my_aws_session_token",
    region="us-east-2"
)
```

Integration with pandas

Retrieving query results as a pandas.DataFrame

```python
import pandas

cursor.execute("create Temp table book(bookname varchar,author varchar)")
cursor.executemany("insert into book (bookname, author) values (%s, %s)",
                   [
                       ('One Hundred Years of Solitude', 'Gabriel García Márquez'),
                       ('A Brief History of Time', 'Stephen Hawking')
                   ])
cursor.execute("select * from book")
result: pandas.DataFrame = cursor.fetch_dataframe()
print(result)
>>                         bookname                  author
>> 0  One Hundred Years of Solitude  Gabriel García Márquez
>> 1       A Brief History of Time         Stephen Hawking
```

Insert data stored in a pandas.DataFrame into an Amazon Redshift table

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    np.array(
        [
            ["One Hundred Years of Solitude", "Gabriel García Márquez"],
            ["A Brief History of Time", "Stephen Hawking"],
        ]
    ),
    columns=["bookname", "author"],
)

with conn.cursor() as cursor:
    cursor.write_dataframe(df, "book")
    cursor.execute("select * from book;")
    result = cursor.fetchall()
```

Integration with numpy

```python
import numpy

cursor.execute("select * from book")
result: numpy.ndarray = cursor.fetch_numpy_array()
print(result)
>> [['One Hundred Years of Solitude' 'Gabriel García Márquez']
>>  ['A Brief History of Time' 'Stephen Hawking']]
```

Query using functions

```python
cursor.execute("SELECT CURRENT_TIMESTAMP")
print(cursor.fetchone())
>> [datetime.datetime(2020, 10, 26, 23, 3, 54, 756497, tzinfo=datetime.timezone.utc)]
```

Connection Parameters

| Name | Type | Description | Default Value | Required |
| --- | --- | --- | --- | --- |
| access_key_id | str | The access key for the IAM role or IAM user configured for IAM database authentication | None | No |
| allow_db_user_override | bool | True specifies the driver uses the DbUser value from the SAML assertion, while False indicates the value in the DbUser connection parameter is used | False | No |
| app_name | str | The name of the IdP application used for authentication | None | No |
| auth_profile | str | The name of an Amazon Redshift Authentication profile having connection properties as JSON. See the RedshiftProperty class to learn how connection parameters should be named. | None | No |
| auto_create | bool | Indicates whether the user should be created if they do not exist | False | No |
| client_id | str | The client id from Azure IdP | None | No |
| client_secret | str | The client secret from Azure IdP | None | No |
| cluster_identifier | str | The cluster identifier of the Amazon Redshift Cluster | None | No |
| credentials_provider | str | The IdP that will be used for authenticating with Amazon Redshift | None | No |
| database | str | The name of the database to connect to | None | No |
| database_metadata_current_db_only | bool | Indicates if the application supports multi-database datashare catalogs. The default value of True indicates the application does not support multi-database datashare catalogs, for backwards compatibility. | True | No |
| db_groups | list | A comma-separated list of existing database group names that the DbUser joins for the current session | None | No |
| db_user | str | The user ID to use with Amazon Redshift | None | No |
| endpoint_url | str | The Amazon Redshift endpoint url. This option is only used by AWS internal teams. | None | No |
| group_federation | bool | Use the IdP Groups in Redshift | False | No |
| host | str | The hostname of the Amazon Redshift cluster | None | No |
| iam | bool | If IAM authentication is enabled | False | No |
| iam_disable_cache | bool | Specifies whether the IAM credentials are cached. By default the IAM credentials are cached. This improves performance when requests to the API gateway are throttled. | False | No |
| idc_client_display_name | str | The client display name to be used in user consent in IdC browser auth. Optional; the default value is "Amazon Redshift Python connector". | None | No |
| idc_region | str | The AWS region where the AWS IdentityCenter instance is located. Required for the IdC browser auth plugin. | None | No |
| identity_namespace | str | The identity namespace to be used for the IdC browser auth plugin and IdP token auth plugin. Optional if there is only one IdC instance or if a default identity namespace is set on the cluster; otherwise required. | None | No |
| idp_response_timeout | int | The timeout for retrieving the SAML assertion from the IdP | 120 | No |
| idp_tenant | str | The IdP tenant | None | No |
| issuer_url | str | The issuer url for the AWS IdC access portal. Required for the IdC browser auth plugin. | None | No |
| listen_port | int | The listen port the IdP will send the SAML assertion to | 7890 | No |
| login_to_rp | str | Only for AdfsCredentialsProvider. Used to specify the loginToRp when performing IdpInitiatedSignOn as part of form based authentication. | urn:amazon:webservices | No |
| login_url | str | The SSO Url for the IdP | None | No |
| max_prepared_statements | int | The maximum number of prepared statements that can be open at once | 1000 | No |
| numeric_to_float | bool | Specifies if NUMERIC datatype values will be converted from decimal.Decimal to float. By default NUMERIC values are received as decimal.Decimal. Enabling this option is not recommended for use cases which prefer the most precision, as results may be rounded. Please reference the Python docs on decimal.Decimal to see the tradeoffs between decimal.Decimal and float before enabling this option. | False | No |
| partner_sp_id | str | The Partner SP Id used for authentication with Ping | None | No |
| password | str | The password to use for authentication | None | No |
| port | int | The port number of the Amazon Redshift cluster | 5439 | No |
| preferred_role | str | The IAM role preferred for the current connection | None | No |
| principal_arn | str | The ARN of the IAM entity (user or role) for which you are generating a policy | None | No |
| profile | str | The name of a profile in an AWS credentials file that contains AWS credentials | None | No |
| provider_name | str | The name of the Redshift Native Auth Provider | None | No |
| region | str | The AWS region where the cluster is located | None | No |
| role_arn | str | The Amazon Resource Name (ARN) of the role that the caller is assuming. Used by JwtCredentialsProvider; for this provider it is mandatory. | None | No |
| role_session_name | str | An identifier for the assumed role session. Typically, you pass the name or identifier associated with the user of your application, so the temporary security credentials your application uses are associated with that user. Used by JwtCredentialsProvider; for this provider it is optional. | jwt_redshift_session | No |
| scope | str | Scope for BrowserAzureOauth2CredentialsProvider authentication | "" | No |
| secret_access_key_id | str | The secret access key for the IAM role or IAM user configured for IAM database authentication | None | No |
| serverless_acct_id | str | The account ID of the serverless endpoint | None | No |
| serverless_work_group | str | The name of the work group for the serverless endpoint | None | No |
| session_token | str | The session token for the IAM role or IAM user configured for IAM database authentication. Not required unless temporary AWS credentials are being used. | None | No |
| ssl | bool | If SSL is enabled | True | No |
| ssl_insecure | bool | Specifies whether to disable verification of the IdP host's server SSL certificate. ssl_insecure=True disables verification. It is NOT recommended to disable verification of an IdP host's server SSL certificate in a production environment. | False | No |
| sslmode | str | The security of the connection to Amazon Redshift. verify-ca and verify-full are supported. | verify-ca | No |
| timeout | int | The number of seconds before the connection to the server times out | None | No |
| token | str | The access token required for the IdP token auth plugin | None | No |
| token_type | str | The token type required for the IdP token auth plugin | ACCESS_TOKEN | No |
| user | str | The username to use for authentication | None | No |
| web_identity_token | str | The OAuth 2.0 access token or OpenID Connect ID token that is provided by the identity provider. Your application must get this token by authenticating the user of your application with a web identity provider. Used by JwtCredentialsProvider; for this provider it is mandatory. | None | No |

Supported Datatypes

redshift_connector supports the following Amazon Redshift datatypes, and will attempt to treat unsupported datatypes as strings. Incoming data from Amazon Redshift is treated as follows:

| Amazon Redshift Datatype | Python Datatype |
| --- | --- |
| ACLITEM | str |
| BOOLEAN | bool |
| INT8 | int |
| INT4 | int |
| INT2 | int |
| VARCHAR | str |
| OID | int |
| REGPROC | int |
| XID | int |
| FLOAT4 | float |
| FLOAT8 | float |
| TEXT | str |
| CHAR | str |
| DATE | datetime.date |
| TIME | datetime.time |
| TIMETZ | datetime.time |
| TIMESTAMP | datetime.datetime |
| TIMESTAMPTZ | datetime.datetime |
| NUMERIC | decimal.Decimal |
| GEOMETRY | str |
| SUPER | str |
| VARBYTE | bytes |
| GEOGRAPHY | str |
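NUMERIC values arrive as decimal.Decimal unless the numeric_to_float connection option (see the Connection Parameters table) is enabled, which trades exactness for float performance. A minimal, standalone sketch of the precision difference, using only the standard library:

```python
from decimal import Decimal

# decimal.Decimal preserves exact decimal representation;
# binary floats cannot represent 0.1 or 0.2 exactly, so sums may round.
exact = Decimal("0.10") + Decimal("0.20")
approx = 0.10 + 0.20

print(exact)   # 0.30
print(approx)  # 0.30000000000000004
```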

Logging

redshift_connector uses logging to provide detailed error messages regarding IdP authentication. A do-nothing handler is enabled by default to prevent logs from being output to sys.stderr.

Enable logging in your application to view logs output by redshift_connector, as described in the documentation for the Python logging module.
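For example, attaching a stream handler surfaces the driver's log output. This sketch assumes the package logs under the logger name "redshift_connector" (a common convention; verify against your installed version):

```python
import logging
import sys

# Route the driver's logs (assumed logger name "redshift_connector")
# to stderr at DEBUG level, overriding the default do-nothing handler.
logger = logging.getLogger("redshift_connector")
logger.setLevel(logging.DEBUG)

handler = logging.StreamHandler(sys.stderr)
handler.setFormatter(logging.Formatter("%(asctime)s %(name)s %(levelname)s %(message)s"))
logger.addHandler(handler)
```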

Client Transfer Protocol

redshift_connector requests that the Amazon Redshift server use the highest transfer protocol version it supports. As of v2.0.879, the binary transfer protocol is requested by default. If necessary, the requested transfer protocol can be modified via the client_protocol_version parameter of redshift_connector.connect(...). Please see the Connection Parameters table for more details.

Getting Help

Contributing

We look forward to collaborating with you! Please read through CONTRIBUTING before submitting any issues or pull requests.

Changelog Generation

An entry in the changelog is generated upon release using gitchangelog. Please use the configuration file .gitchangelog.rc when generating the changelog.

Running Tests

You can run all unit tests with pytest test/unit. Integration tests require providing credentials for an Amazon Redshift cluster as well as IdP attributes in test/config.ini.

Additional Resources
