The past few weeks I developed and deployed a cloud function that is supposed to get called only by authorized users/service accounts and the truth is that the documentation I found wasn’t really helpful.

from google.oauth2 import service_account 
from google.auth.transport.requests import AuthorizedSession
keypath = '.../my_sa_key.json'base_url=''creds=service_account.IDTokenCredentials.from_service_account_file( keypath, target_audience=base_url)authed_session = AuthorizedSession(creds)# make authenticated request and print the response, status_code
resp = authed_session.get(base_url)

Attempt 1

But in my case, the call to the cloud function is going to happen from within a google service (Dataflow) and I don’t have access to the service account file. So I tried to find how can I use target_audience with the default credentials. …


The working implementation is on Third Attempt


A couple of months ago when we started building our data lake, one of the requirements was to try and get real-time data in. Most of our external and internal sources supported that, I had a bit more knowledge on Salesforce so I started researching how we can stream data from Salesforce in real-time.

collection_schema =“mongo”) \ 
.option(“database”, db) \
.option(“collection”, coll) \
.option(‘sampleSize’, 50000) \
.load() \
ingest_df =“mongo”) \
.option(“database”, db) \
.option(“collection”, coll) \ .load(schema=fix_spark_schema(collection_schema))

Alex Fragotsis

Data Engineer @ League Inc.

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store