The clouds are coming in…
Today we announce the BETA version of our first Cloud destination – BigQuery!
But what does this mean?
Well, it means that from today we extract hit-level, unsampled data from your favorite FREE GA view and upload it automatically to your BigQuery project. Just like the big boys do with GA 360, only less expensive.
It updates the data automatically behind the scenes every morning. And with NO HANDS – YAY!
Now take that fine reporting tool of yours with a BigQuery connector and put it to use. Connect it to your SCITYLANA-transformed, hit-level, unsampled FREE GA data in BigQuery and do your reporting on it.
Let it be Microsoft Power BI, Tableau, Google Data Studio or some other tool.
Let’s start the engine…
To get started, log in to your www.scitylana.com account and, in your data extraction settings, authorize a BigQuery account that has modify permissions (Owner, Editor or Admin). Click the Authorize BigQuery Access button:
Now enter your Google Project ID. In my case it is scitylana-1048.
Save it – and off you go. If you already have extracted data on the disk, these files will immediately be uploaded to BigQuery.
Do note that the app will continue to download the files from GA to your hard drive. From there the SCITYLANA app uploads the files to BigQuery.
While we wait for the files to be uploaded, let’s set up the first report.
Connect to BigQuery from Power BI
You can also build your report from scratch in the following way.
1. Open Power BI Desktop and click Get Data
2. Select the Google BigQuery (Beta) connector
3. Click Connect
4. Expand from your project id to your tables and views.
5. Select the second view from the top and click Load
6. Select DirectQuery
7. Click OK
8. Power BI now creates a model and should end up listing all the dimensions and metrics in the Fields list.
9. Right-click the table called “VIEW ya-da ya-da” and select New Measure
10. Write M_users = DISTINCTCOUNT(VIEW ya-da ya-da…[sl_userId]) where “VIEW ya-da ya-da” is the id of your own view. E.g. M_users = DISTINCTCOUNT(VIEW136604982DAYS007[sl_userId])
11. Check your new measure M_users in the Fields list. (Find it using the search)
12. And VOILÀ – you get your first chart
13. Now search for the date dimension. And check it. Now we get…
14. I hope you can take the rest from here. 🙂
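If you wonder what the M_users measure actually computes, here is a minimal Python sketch of the same idea. The rows, the user ids and the "date" column name are made up for illustration; only sl_userId comes from the real dataset. Because the data is hit-level, a user shows up on many rows, so we count distinct ids instead of rows.

```python
# Toy hit-level rows: one row per hit, so the same user appears many times.
# The values and the "date" key are invented for this illustration.
hits = [
    {"date": "2018-03-01", "sl_userId": "u1"},
    {"date": "2018-03-01", "sl_userId": "u1"},
    {"date": "2018-03-01", "sl_userId": "u2"},
    {"date": "2018-03-02", "sl_userId": "u1"},
    {"date": "2018-03-02", "sl_userId": "u3"},
]


def distinct_users(rows):
    """Count unique sl_userId values, like DISTINCTCOUNT in the DAX measure."""
    return len({row["sl_userId"] for row in rows})


def distinct_users_by_date(rows):
    """The same count split by date, as when the date dimension is checked."""
    users_per_date = {}
    for row in rows:
        users_per_date.setdefault(row["date"], set()).add(row["sl_userId"])
    return {day: len(users) for day, users in sorted(users_per_date.items())}


print(distinct_users(hits))          # 3
print(distinct_users_by_date(hits))  # {'2018-03-01': 2, '2018-03-02': 2}
```

Note that the daily numbers add up to 4 while the total is 3, because u1 visited on both days. That is why the measure has to be evaluated per chart cell rather than summed from daily values.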
Nerdy details for the interested
OK, what have we done with your BigQuery account?
Well in the BigQuery Cloud console you can see we have added the following:
(The screenshot displays the output for Google Analytics View 136604982 – scitylana.com)
For each view you’ll get a date-partitioned table (the one with the blue icon) named after your Google Analytics view id. We partition the data by date because Google recommends it for better query performance. Currently this is the only way BigQuery can partition a table.
Here is a snippet of the schema definition.
The sl_ dimensions (e.g. sl_userId) are some of the extra stuff SCITYLANA adds to the dataset. _PARTITIONTIME is a BigQuery internal column used for date partitioning. Read more about it here.
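To give an idea of how _PARTITIONTIME can be put to work, here is a small Python sketch that builds a standard SQL query limited to a range of daily partitions. The dataset name my_dataset, the table name my_table and the dates are placeholders we made up; use the names you see in your own BigQuery console. The helper only builds the query text, it does not run anything.

```python
from datetime import date


def partition_filter_sql(table, start, end):
    """Build a standard SQL query that filters on daily partitions [start, end]."""
    return (
        "SELECT COUNT(DISTINCT sl_userId) AS users\n"
        f"FROM `{table}`\n"
        f"WHERE _PARTITIONTIME BETWEEN TIMESTAMP('{start.isoformat()}') "
        f"AND TIMESTAMP('{end.isoformat()}')"
    )


# Hypothetical table path: the project id is the one from this post,
# the dataset and table names are placeholders.
query = partition_filter_sql(
    "scitylana-1048.my_dataset.my_table",
    date(2018, 3, 1),
    date(2018, 3, 7),
)
print(query)
```

Filtering on _PARTITIONTIME lets BigQuery skip the partitions outside the range, which is the performance benefit the date partitioning is there for.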
All Google Analytics metrics are renamed from ga: to M_, e.g. M_bounces is the renamed version of ga:bounces. All dimensions simply drop the ga: prefix, e.g. ga:userType is called userType.
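The naming rule is simple enough to sketch in a few lines of Python. This is just an illustration of the convention described above, not the code SCITYLANA runs; the function name rename_ga_field is our own invention, and you have to tell it whether a field is a metric or a dimension.

```python
GA_PREFIX = "ga:"


def rename_ga_field(name, is_metric):
    """Map a Google Analytics field name to the column name used in BigQuery.

    Metrics get the M_ prefix, dimensions just lose the ga: prefix.
    """
    if not name.startswith(GA_PREFIX):
        raise ValueError(f"expected a name starting with {GA_PREFIX!r}: {name!r}")
    bare = name[len(GA_PREFIX):]
    return f"M_{bare}" if is_metric else bare


print(rename_ga_field("ga:bounces", is_metric=True))    # M_bounces
print(rename_ga_field("ga:userType", is_metric=False))  # userType
```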
BigQuery Views to help
Additionally we created 7 BigQuery views. A BigQuery view is a stored SQL query that exposes a subset of the data, if you like. We made these helper views for your convenience.
In the data view you get integration between the SCITYLANA hit-level table and BigQuery’s public date table called bigquery-public-data.common_us.date_greg.
This view comes in 6 variations: the latest 7, 14, 30, 90, 180 and 365 days. This is to get better query performance when using them with a BigQuery DirectQuery connector in e.g. Microsoft Power BI. The views have the following format, VIEWXXXXXXXXXDAYSYYY, where XXXXXXXXX is the view id and YYY is the number of days it queries. E.g. VIEW120558169DAYS030 returns the last 30 days of hit-level data for the GA view with id 120558169.
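If you want to pick the right view from a script, the naming format can be sketched like this in Python. The helper names view_name and parse_view_name are hypothetical; the format and the six day windows are the ones described above.

```python
import re

# The six day windows the helper views come in.
WINDOWS = (7, 14, 30, 90, 180, 365)

# VIEW<view id>DAYS<three-digit day count>
_VIEW_NAME = re.compile(r"^VIEW(\d+)DAYS(\d{3})$")


def view_name(ga_view_id, days):
    """Build the helper view name for a GA view id and a day window."""
    if days not in WINDOWS:
        raise ValueError(f"days must be one of {WINDOWS}, got {days}")
    return f"VIEW{ga_view_id}DAYS{days:03d}"


def parse_view_name(name):
    """Split a helper view name back into (GA view id, number of days)."""
    match = _VIEW_NAME.match(name)
    if match is None:
        raise ValueError(f"not a helper view name: {name!r}")
    return match.group(1), int(match.group(2))


print(view_name(120558169, 30))                 # VIEW120558169DAYS030
print(parse_view_name("VIEW136604982DAYS007"))  # ('136604982', 7)
```

A shorter window means less data to scan, so for a DirectQuery report it usually makes sense to start with the smallest view that covers the period you need.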
Another view lets you see the data partitions behind the partitioned table for your convenience. E.g. LIST120558169PARTITIONS
This view is not super important, but it’s practical when you need to get an idea of how many days have been uploaded to the partitioned table.
Please feel free to comment here or write to us at firstname.lastname@example.org