Alex Fragotsis in Inside League · Streaming Data to BigQuery with Dataflow and Updating the Schema in Real-Time · In our previous story, we saw how to stream data to BigQuery and also add new columns when needed. This solution though is not really… · 3 min read · Dec 26, 2021
Alex Fragotsis · Authenticated calls to cloud functions with Python · The past few weeks I developed and deployed a cloud function that is supposed to get called only by authorized users/service accounts and… · 3 min read · Jun 4, 2021
Alex Fragotsis in Inside League · Loading complex JSON files in RealTime to BigQuery from PubSub using Dataflow and updating the… · In my previous post, I explained how to stream data from Salesforce to PubSub in real-time. The next logical step would be to store the… · 6 min read · Jan 17, 2021
Alex Fragotsis in Inside League · Real-Time Streaming Salesforce Updates to PubSub · 11 min read · Dec 12, 2020
Alex Fragotsis in The Startup · Pyspark: How to Modify a Nested Struct Field · In our adventures trying to build a data lake, we are using a dynamically generated Spark cluster to ingest some data from MongoDB, our… · 3 min read · Aug 29, 2020
Alex Fragotsis · Schedule Dataflow Templates with Airflow · Ok, so, we’ve written our Dataflow Template with Python, now what? We want to schedule it to run daily and we’re going to use Airflow for… · 3 min read · Jul 4, 2020
Alex Fragotsis in Analytics Vidhya · Transform JSON to CSV from Google bucket using a Dataflow Python pipeline · In this article, we will try to transform a JSON file into a CSV file using Dataflow and Python. · 4 min read · May 31, 2020