Akulbasov/PGA
This is a library for making batch requests to the Google Analytics Core Reporting API v3 and extracting data from a Google Analytics property into Python 3 data structures.
The package uses:
- OAuth 2.0 client or server access to the Google Analytics API (oauth2client==3.0.0), for connecting to Google Analytics
- Google Analytics Core Reporting API v3, for extracting data
- Google Analytics Metadata API, for the integrated lookup of dimension and metric references
- Google Analytics Management API, to get the Account, Property and View tree
Dependencies:
- Pandas > 0.13.0, for transforming data into a pandas DataFrame object
- Numpy > 1.0.0, for slicing numpy arrays into chunks
- google-api-python-client > 1.5.0, self-explanatory
Best practices usage:
- Interactive shell Jupyter for analyzing data
- Via pip: use the following command: # sudo pip install pga
The latest versions of Pandas, Numpy and oauth2client will be installed automatically as dependencies.
First of all, you will need to get a Google client_secret JSON file from the Google API Console.
You may choose one of the following types of Client ID:
- for Service account client
- for Web application
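The two kinds of credentials file can usually be told apart by their layout. The sketch below is not part of PGA: `guess_connection_type` is a hypothetical helper, and it assumes the standard Google layouts (a service account key carries `"type": "service_account"`, while an OAuth client secret is wrapped in a top-level `"web"` or `"installed"` key). It maps the file to the 'Server' or 'Client' value that the constructor expects.

```python
import json


def guess_connection_type(key_file_text):
    """Hypothetical helper: map a credentials JSON document to 'Server' or 'Client'."""
    data = json.loads(key_file_text)
    # Service account keys declare their type explicitly.
    if data.get("type") == "service_account":
        return "Server"
    # OAuth client secrets nest everything under "web" or "installed".
    if "web" in data or "installed" in data:
        return "Client"
    raise ValueError("Unrecognized credentials file layout")


# Placeholder documents, not real credentials.
service_key = json.dumps({"type": "service_account", "client_email": "placeholder"})
web_secret = json.dumps({"web": {"client_id": "placeholder", "client_secret": "placeholder"}})

print(guess_connection_type(service_key))
print(guess_connection_type(web_secret))
```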
PGA.init(key_file_location=None,type_of_connection=None,facet_chunk=10,count_day_slice=1)
Constructs the instance and sets the parameters for its basic functionality.
Parameters:
- key_file_location : string. Path to the secret JSON file.
- type_of_connection : string. Available values are 'Client' and 'Server'. Use 'Server' for a service account and 'Client' for a web application.
- facet_chunk : int, optional. Number of requests per chunk; the requests in a chunk are executed in parallel. Important: Google Universal Analytics allows only 10 parallel requests per second. If you need more, contact Google through its form to have this limit increased.
- count_day_slice : int, optional. Number of days per slice when splitting the [start-date, end-date] range of your request. For example, the input {'count_day_slice': 2, 'start_date': '2016-12-01', 'end_date': '2016-12-05'} produces the output [{'start_date': '2016-12-01', 'end_date': '2016-12-02'}, {'start_date': '2016-12-03', 'end_date': '2016-12-04'}, {'start_date': '2016-12-05', 'end_date': '2016-12-05'}].
Returns:
- self : the instance itself, with its current behavior.
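The effect of `count_day_slice` and `facet_chunk` can be illustrated with two small standalone functions. This is a minimal sketch of the behavior described above, not PGA's actual implementation; `slice_date_range` and `chunk_requests` are hypothetical names, and the sketch uses plain Python lists rather than numpy.

```python
from datetime import date, timedelta


def slice_date_range(start_date, end_date, count_day_slice=1):
    """Split [start_date, end_date] into consecutive slices of at most count_day_slice days."""
    if count_day_slice < 1:
        raise ValueError("count_day_slice must be at least 1")
    start = date.fromisoformat(start_date)
    end = date.fromisoformat(end_date)
    one_day = timedelta(days=1)
    slices = []
    while start <= end:
        # Each slice covers count_day_slice days, clipped to the end of the range.
        stop = min(start + timedelta(days=count_day_slice) - one_day, end)
        slices.append({"start_date": start.isoformat(), "end_date": stop.isoformat()})
        start = stop + one_day
    return slices


def chunk_requests(requests, facet_chunk=10):
    """Group requests into lists of at most facet_chunk items."""
    return [requests[i:i + facet_chunk] for i in range(0, len(requests), facet_chunk)]


date_slices = slice_date_range("2016-12-01", "2016-12-05", count_day_slice=2)
print(date_slices)

batches = chunk_requests(list(range(25)), facet_chunk=10)
print([len(batch) for batch in batches])
```

With the inputs from the parameter description, the first call yields the same three date slices as the documented example; the second call groups 25 placeholder requests into batches of 10, 10 and 5.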
After the constructor runs, the instance is created and the client is redirected to a browser to authenticate with Google.
Simply add a request to an already instantiated pga object:
Request.add_settings_request(**settings_products)
Parameters:
- **settings_products : kwargs. The request in the Core Reporting v3 JSON format; see the list of query parameters at https://developers.google.com/analytics/devguides/reporting/core/v3/reference?hl=ru#q_summary
Returns:
- self : the instance itself, with its current behavior.
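A request settings dictionary might look like the sketch below. The Core Reporting API v3 requires the `ids`, `start-date`, `end-date` and `metrics` parameters; the underscore spelling follows the `start_date` / `end_date` keys used in the constructor example above. The view ID is a placeholder, `missing_required` is a hypothetical helper that is not part of PGA, and the final call is shown only as a comment because it assumes an already authenticated `pga` instance.

```python
# Parameters that Core Reporting API v3 requires, in keyword-argument spelling.
REQUIRED_PARAMETERS = ("ids", "start_date", "end_date", "metrics")


def missing_required(settings):
    """Hypothetical helper: list the required parameters absent from settings."""
    return [name for name in REQUIRED_PARAMETERS if name not in settings]


settings = {
    "ids": "ga:12345678",  # placeholder view (profile) ID
    "start_date": "2016-12-01",
    "end_date": "2016-12-05",
    "metrics": "ga:sessions,ga:users",
    "dimensions": "ga:date,ga:source",  # optional
}

print(missing_required(settings))

# Assumed usage with an instantiated object:
# pga.add_settings_request(**settings)
```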
You can later update any query parameters you have already set and then make a new request.
Execute all configured requests to get a DataFrame:
PGA.get_dataframe(groupby=True)
Parameters:
- groupby : boolean. Available values are True and False. If True, the DataFrame is grouped by all dimensions, dates and start-index, and every column is cast to the appropriate type based on the Google Analytics Metadata API. If False, the DataFrame is not grouped. This option exists so that another library can aggregate and group the data quickly, because in some cases the data is too large and grouping is very slow. You may want to look at this project: http://dask.pydata.org/en/latest/
Returns:
- data : pandas.DataFrame object
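To show what grouping does to rows collected from several sliced requests, here is a dependency-free sketch. It assumes that metrics are aggregated by summing, which the description above does not state, and it uses plain dictionaries in place of a DataFrame; `group_and_sum` is a hypothetical name. In pandas, a comparable operation is `df.groupby(dimensions, as_index=False)[metrics].sum()`.

```python
from collections import defaultdict


def group_and_sum(rows, dimensions, metrics):
    """Sum each metric over all rows that share the same dimension values."""
    totals = defaultdict(lambda: [0] * len(metrics))
    for row in rows:
        key = tuple(row[name] for name in dimensions)
        for position, name in enumerate(metrics):
            totals[key][position] += row[name]
    # Rebuild one output row per distinct combination of dimension values.
    return [
        {**dict(zip(dimensions, key)), **dict(zip(metrics, values))}
        for key, values in sorted(totals.items())
    ]


# Rows as they might arrive from two date slices of the same request.
rows = [
    {"source": "google", "sessions": 10},
    {"source": "bing", "sessions": 3},
    {"source": "google", "sessions": 5},
]

grouped = group_and_sum(rows, dimensions=["source"], metrics=["sessions"])
print(grouped)
```

Summing is only meaningful for additive metrics such as session counts; ratio metrics would need a different aggregation.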
All settings
Print all current settings of pga:
PGA.get_all_settings()
Returns:
- all settings : pandas.DataFrame object
All products
Print all current product settings of pga:
PGA.get_all_products()
Returns:
- all settings : pandas.DataFrame object
ExtraAppsMetaCdm
Look up Google Analytics dimensions and metrics through their metadata:
ExtraAppsMetaCdm.get_list_cdcm(clarify=None)
Parameters:
- clarify : string. The attribute by which the dimensions and metrics are selected.
Returns:
- Table of information : pandas.DataFrame object
ExtraAppsManagementAPI
Get the list of Google Universal Analytics objects (Account ID, Property ID, View ID) that you have access to:
PGA.get_all_profile()
Returns:
- Table of information with dimensions or metrics : pandas.DataFrame object