Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

Singer.io Tap for MySQL

License

NotificationsYou must be signed in to change notification settings

singer-io/tap-mysql

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

607 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

PyPI versionCircleCI Build Status

Singer tap that extracts data from aMySQL database and produces JSON-formatted data following theSinger spec.

$ mkvirtualenv -p python3 tap-mysql$ pip install tap-mysql$ tap-mysql --config config.json --discover$ tap-mysql --config config.json --properties properties.json --state state.json

Usage

This section dives into basic usage oftap-mysql by walking through extractingdata from a table. It assumes that you can connect to and read from a MySQLdatabase.

Install

$ mkvirtualenv -p python3 tap-mysql$ pip install tap-mysql

or

$ git clone git@github.com:singer-io/tap-mysql.git$cd tap-mysql$ mkvirtualenv -p python3 tap-mysql$ python install.

Have a source database

There's some important business data siloed in this MySQL database -- we need toextract it. Here's the table we'd like to sync:

mysql> select * from example_db.animals;+----|----------|----------------------+| id | name     | likes_getting_petted |+----|----------|----------------------+|  1 | aardvark |                    0 ||  2 | bear     |                    0 ||  3 | cow      |                    1 |+----|----------|----------------------+3 rows in set (0.00 sec)

Create the configuration file

Create a config file containing the database connection credentials, e.g.:

{"host":"localhost","port":"3306","user":"root","password":"password"}

These are the same basic configuration properties used by the MySQL command-lineclient (mysql).

Discovery mode

The tap can be invoked in discovery mode to find the available tables andcolumns in the database:

$ tap-mysql --config config.json --discover

A discovered catalog is output, with a JSON-schema description of each table. Asource table directly corresponds to a Singer stream.

{"streams": [    {"tap_stream_id":"example_db-animals","table_name":"animals","schema": {"type":"object","properties": {"name": {"inclusion":"available","type": ["null","string"            ],"maxLength":255          },"id": {"inclusion":"automatic","minimum":-2147483648,"maximum":2147483647,"type": ["null","integer"            ]          },"likes_getting_petted": {"inclusion":"available","type": ["null","boolean"            ]          }        }      },"metadata": [        {"breadcrumb": [],"metadata": {"row-count":3,"table-key-properties": ["id"            ],"database-name":"example_db","selected-by-default":false,"is-view":false,          }        },        {"breadcrumb": ["properties","id"          ],"metadata": {"sql-datatype":"int(11)","selected-by-default":true          }        },        {"breadcrumb": ["properties","name"          ],"metadata": {"sql-datatype":"varchar(255)","selected-by-default":true          }        },        {"breadcrumb": ["properties","likes_getting_petted"          ],"metadata": {"sql-datatype":"tinyint(1)","selected-by-default":true          }        }      ],"stream":"animals"    }  ]}

Field selection

In sync mode,tap-mysql consumes the catalog and looks for tables and fieldshave been marked asselected in their associated metadata entries.

Redirect output from the tap's discovery mode to a file so that it can bemodified:

$ tap-mysql -c config.json --discover> properties.json

Then editproperties.json to make selections. In this example we want theanimals table. The stream's metadata entry (associated with"breadcrumb": [])gets a top-levelselected flag, as does its columns' metadata entries. Additionally,we will mark theanimals table to replicate using aFULL_TABLE strategy. For more,information, seeReplication methods and state file.

[  {"breadcrumb": [],"metadata": {"row-count":3,"table-key-properties": ["id"      ],"database-name":"example_db","selected-by-default":false,"is-view":false,"selected":true,"replication-method":"FULL_TABLE"    }  },  {"breadcrumb": ["properties","id"    ],"metadata": {"sql-datatype":"int(11)","selected-by-default":true,"selected":true    }  },  {"breadcrumb": ["properties","name"    ],"metadata": {"sql-datatype":"varchar(255)","selected-by-default":true,"selected":true    }  },  {"breadcrumb": ["properties","likes_getting_petted"    ],"metadata": {"sql-datatype":"tinyint(1)","selected-by-default":true,"selected":true    }  }]

Sync mode

With a properties catalog that describes field and table selections, the tap can be invoked in sync mode:

$ tap-mysql -c config.json --properties properties.json

Messages are written to standard output following the Singer specification. Theresultant stream of JSON data can be consumed by a Singer target.

{"value": {"currently_syncing":"example_db-animals"},"type":"STATE"}{"key_properties": ["id"],"stream":"animals","schema": {"properties": {"name": {"inclusion":"available","maxLength":255,"type": ["null","string"]},"likes_getting_petted": {"inclusion":"available","type": ["null","boolean"]},"id": {"inclusion":"automatic","minimum":-2147483648,"type": ["null","integer"],"maximum":2147483647}},"type":"object"},"type":"SCHEMA"}{"stream":"animals","version":1509133344771,"type":"ACTIVATE_VERSION"}{"record": {"name":"aardvark","likes_getting_petted":false,"id":1},"stream":"animals","version":1509133344771,"type":"RECORD"}{"record": {"name":"bear","likes_getting_petted":false,"id":2},"stream":"animals","version":1509133344771,"type":"RECORD"}{"record": {"name":"cow","likes_getting_petted":true,"id":3},"stream":"animals","version":1509133344771,"type":"RECORD"}{"stream":"animals","version":1509133344771,"type":"ACTIVATE_VERSION"}{"value": {"currently_syncing":"example_db-animals","bookmarks": {"example_db-animals": {"initial_full_table_complete":true}}},"type":"STATE"}{"value": {"currently_syncing":null,"bookmarks": {"example_db-animals": {"initial_full_table_complete":true}}},"type":"STATE"}

Replication methods and state file

In the above example, we invokedtap-mysql without providing astate fileand without specifying a replication method. The two ways to replicate a giventable areFULL_TABLE andINCREMENTAL.

Full Table

Full-table replication extracts all data from the source table each time the tapis invoked.

Incremental

Incremental replication works in conjunction with a state file to only extractnew records each time the tap is invoked. This requires a replication key to bespecified in the table's metadata as well.

Example

Let's sync theanimals table again, but this time using incrementalreplication. The replication method and replication key are set in thetable's metadata entry in properties file:

{"streams": [    {"tap_stream_id":"example_db-animals","table_name":"animals","schema": {...},"metadata": [        {"breadcrumb": [],"metadata": {"row-count":3,"table-key-properties": ["id"            ],"database-name":"example_db","selected-by-default":false,"is-view":false,"replication-method":"INCREMENTAL","replication-key":"id"          }        },...      ],"stream":"animals"    }  ]}

We have no meaningful state so far, so just invoke the tap in sync mode againwithout a state file:

$ tap-mysql -c config.json --properties properties.json

The output messages look very similar to when the table was replicated using thedefaultFULL_TABLE replication method. One important difference is that theSTATE messages now contain areplication_key_value -- a bookmark orhigh-water mark -- for data that was extracted:

{"type":"STATE","value": {"currently_syncing":"example_db-animals"}}{"stream":"animals","type":"SCHEMA","schema": {"type":"object","properties": {"id": {"type": ["null","integer"],"minimum":-2147483648,"maximum":2147483647,"inclusion":"automatic"},"name": {"type": ["null","string"],"inclusion":"available","maxLength":255},"likes_getting_petted": {"type": ["null","boolean"],"inclusion":"available"}}},"key_properties": ["id"]}{"stream":"animals","type":"ACTIVATE_VERSION","version":1509135204169}{"stream":"animals","type":"RECORD","version":1509135204169,"record": {"id":1,"name":"aardvark","likes_getting_petted":false}}{"stream":"animals","type":"RECORD","version":1509135204169,"record": {"id":2,"name":"bear","likes_getting_petted":false}}{"stream":"animals","type":"RECORD","version":1509135204169,"record": {"id":3,"name":"cow","likes_getting_petted":true}}{"type":"STATE","value": {"bookmarks": {"example_db-animals": {"version":1509135204169,"replication_key_value":3,"replication_key":"id"}},"currently_syncing":"example_db-animals"}}{"type":"STATE","value": {"bookmarks": {"example_db-animals": {"version":1509135204169,"replication_key_value":3,"replication_key":"id"}},"currently_syncing":null}}

Note that the finalSTATE message has areplication_key_value of3,reflecting that the extraction ended on a record that had anid of3.Subsequent invocations of the tap will pick up from this bookmark.

Normally, the target will echo the lastSTATE after it's finished processingdata. For this example, let's manually write astate.json file using theSTATE message:

{"bookmarks": {"example_db-animals": {"version":1509135204169,"replication_key_value":3,"replication_key":"id"    }  },"currently_syncing":null}

Let's add some more animals to our farm:

mysql> insert into animals (name, likes_getting_petted) values ('dog', true), ('elephant', true), ('frog', false);
$ tap-mysql -c config.json --properties properties.json --state state.json

This invocation extracts any data since (and including) thereplication_key_value:

{"type":"STATE","value": {"bookmarks": {"example_db-animals": {"replication_key":"id","version":1509135204169,"replication_key_value":3}},"currently_syncing":"example_db-animals"}}{"key_properties": ["id"],"schema": {"properties": {"name": {"maxLength":255,"inclusion":"available","type": ["null","string"]},"id": {"maximum":2147483647,"minimum":-2147483648,"inclusion":"automatic","type": ["null","integer"]},"likes_getting_petted": {"inclusion":"available","type": ["null","boolean"]}},"type":"object"},"type":"SCHEMA","stream":"animals"}{"type":"ACTIVATE_VERSION","version":1509135204169,"stream":"animals"}{"record": {"name":"cow","id":3,"likes_getting_petted":true},"type":"RECORD","version":1509135204169,"stream":"animals"}{"record": {"name":"dog","id":4,"likes_getting_petted":true},"type":"RECORD","version":1509135204169,"stream":"animals"}{"record": {"name":"elephant","id":5,"likes_getting_petted":true},"type":"RECORD","version":1509135204169,"stream":"animals"}{"record": {"name":"frog","id":6,"likes_getting_petted":false},"type":"RECORD","version":1509135204169,"stream":"animals"}{"type":"STATE","value": {"bookmarks": {"example_db-animals": {"replication_key":"id","version":1509135204169,"replication_key_value":6}},"currently_syncing":"example_db-animals"}}{"type":"STATE","value": {"bookmarks": {"example_db-animals": {"replication_key":"id","version":1509135204169,"replication_key_value":6}},"currently_syncing":null}}

Copyright © 2017 Stitch

About

Singer.io Tap for MySQL

Resources

License

Stars

Watchers

Forks

Packages

No packages published

Contributors17

Languages


[8]ページ先頭

©2009-2026 Movatter.jp