Schedule DataFlow Job with Google Cloud Scheduler

In this article, we shall see how to schedule a Dataflow batch job so that Google Cloud Scheduler triggers it.

We shall use the Dataflow job template which we created in our previous article.

We shall cover what Cloud Scheduler is, how to create a Cloud Scheduler job, and how to configure that job to launch the Dataflow template.

What is Cloud Scheduler?

Cloud Scheduler is a fully managed cron job service that helps you:

  • Schedule all types of jobs, for example streaming jobs, batch jobs, big data jobs, cloud infrastructure operations, etc.
  • Automate jobs with resiliency, including retries in case of failure.

Create Cloud Scheduler Job

To create a Cloud Scheduler job, please visit the link: https://console.cloud.google.com/cloudscheduler

When you create a Cloud Scheduler job, the console asks you to fill in a form with the scheduler job properties.

Configure scheduler job

Let’s start filling in the form with the required parameters:

  • Name – Name your job

  • Frequency – Define the frequency of the run.

Sample schedules for your reference:

Schedule                  Scheduler pattern
Every Sunday at 23.00     0 23 * * 0
Every Monday at 12        0 12 * * 1
Every Minute              * * * * *
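To make the five fields of these patterns concrete (minute, hour, day of month, month, day of week), here is a minimal Python sketch that checks a pattern against a point in time. It is only an illustration: the helper names `field_matches` and `cron_matches` are hypothetical, only `*` and plain integers are handled, and it is not how Cloud Scheduler itself evaluates schedules.

```python
from datetime import datetime


def field_matches(field: str, value: int) -> bool:
    """Return True if a cron field ('*' or a plain integer) matches value."""
    return field == "*" or int(field) == value


def cron_matches(pattern: str, moment: datetime) -> bool:
    """Check a five-field cron pattern against a datetime.

    Simplified sketch: ranges, lists, and step values are not supported.
    """
    minute, hour, day, month, weekday = pattern.split()
    # Cron counts Sunday as 0, while isoweekday() returns 7 for Sunday.
    cron_weekday = moment.isoweekday() % 7
    return (
        field_matches(minute, moment.minute)
        and field_matches(hour, moment.hour)
        and field_matches(day, moment.day)
        and field_matches(month, moment.month)
        and field_matches(weekday, cron_weekday)
    )


# 7 January 2024 was a Sunday, 8 January 2024 a Monday.
print(cron_matches("0 23 * * 0", datetime(2024, 1, 7, 23, 0)))   # True
print(cron_matches("0 12 * * 1", datetime(2024, 1, 8, 12, 0)))   # True
print(cron_matches("0 12 * * 1", datetime(2024, 1, 7, 12, 0)))   # False
```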

  • URL – Specify the URL that launches the job from a Dataflow template.

URL Pattern:

https://dataflow.googleapis.com/v1b3/projects/{project-id}/locations/{region}/templates:launch?gcsPath={Template Location}
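As a small sketch, the URL pattern above can be filled in programmatically so that the template location is safely encoded as a query parameter. The function name `build_launch_url` and the project, region, and bucket values are hypothetical placeholders, not values from this setup.

```python
from urllib.parse import quote, urlencode


def build_launch_url(project_id: str, region: str, template_path: str) -> str:
    """Fill the templates:launch URL pattern with concrete values."""
    base = (
        "https://dataflow.googleapis.com/v1b3/"
        f"projects/{quote(project_id, safe='')}"
        f"/locations/{quote(region, safe='')}"
        "/templates:launch"
    )
    # urlencode percent-encodes the gs:// path so it is a valid query value.
    return f"{base}?{urlencode({'gcsPath': template_path})}"


# Placeholder values for illustration only.
url = build_launch_url(
    project_id="my-project",
    region="us-central1",
    template_path="gs://my-bucket/templates/my-template",
)
print(url)
```

Pasting the unencoded `gs://` path into the Cloud Scheduler form, as in the pattern above, is the approach this article follows; the encoded form is simply the stricter variant of the same URL.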

  • Body

The body is the JSON request that is sent to Google Dataflow to launch the template. It typically carries the job name, the template parameters, and the runtime environment.

Please add any custom arguments, such as the input file path, to the ‘parameters’ section as per your requirements.
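As a rough illustration, such a request body could be assembled as follows. The top-level fields `jobName`, `parameters`, and `environment` (with `tempLocation` and `zone`) are, to the best of our understanding, the ones the Dataflow launch request accepts; the function name `build_launch_body`, the job name, the `inputFile` parameter, and the bucket paths are hypothetical placeholders that depend on your own template.

```python
import json
from typing import Dict, Optional


def build_launch_body(
    job_name: str,
    parameters: Dict[str, str],
    temp_location: str,
    zone: Optional[str] = None,
) -> dict:
    """Build the JSON body for a templates:launch request."""
    environment = {"tempLocation": temp_location}
    if zone is not None:
        environment["zone"] = zone
    return {
        "jobName": job_name,
        # Custom template arguments, e.g. the input file path, go here.
        "parameters": dict(parameters),
        "environment": environment,
    }


# Placeholder values; replace them with what your template expects.
body = build_launch_body(
    job_name="scheduled-batch-job",
    parameters={"inputFile": "gs://my-bucket/input/data.csv"},
    temp_location="gs://my-bucket/temp",
)
print(json.dumps(body, indent=2))
```

The printed JSON is what would be pasted into the Body field of the scheduler form.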

Service Account

Please specify the service account that Cloud Scheduler should use to authenticate the request. This will be your Dataflow service account, in the following form:

{service-account-name}@{project-id}.iam.gserviceaccount.com

Scope

The scope can be given as below,

https://www.googleapis.com/auth/cloud-platform
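To see how the form fields fit together, the sketch below collects them into a single job definition shaped like the Cloud Scheduler REST `Job` resource (`schedule`, `httpTarget`, `oauthToken`), as far as we understand that resource. The function name `build_scheduler_job` and all concrete values are hypothetical; the sketch only builds a dictionary and does not call any Google API.

```python
import base64
import json

CLOUD_PLATFORM_SCOPE = "https://www.googleapis.com/auth/cloud-platform"


def build_scheduler_job(
    name: str,
    schedule: str,
    launch_url: str,
    launch_body: dict,
    service_account_email: str,
    time_zone: str = "Etc/UTC",
) -> dict:
    """Assemble a job definition mirroring the scheduler form fields."""
    # The REST representation expects the HTTP body as base64-encoded bytes.
    encoded_body = base64.b64encode(
        json.dumps(launch_body).encode("utf-8")
    ).decode("ascii")
    return {
        "name": name,
        "schedule": schedule,
        "timeZone": time_zone,
        "httpTarget": {
            "uri": launch_url,
            "httpMethod": "POST",
            "headers": {"Content-Type": "application/json"},
            "body": encoded_body,
            "oauthToken": {
                "serviceAccountEmail": service_account_email,
                "scope": CLOUD_PLATFORM_SCOPE,
            },
        },
    }


# Placeholder values for illustration only.
job = build_scheduler_job(
    name="projects/my-project/locations/us-central1/jobs/nightly-dataflow",
    schedule="0 23 * * 0",
    launch_url=(
        "https://dataflow.googleapis.com/v1b3/projects/my-project"
        "/locations/us-central1/templates:launch"
        "?gcsPath=gs://my-bucket/templates/my-template"
    ),
    launch_body={"jobName": "scheduled-batch-job", "parameters": {}},
    service_account_email="my-sa@my-project.iam.gserviceaccount.com",
)
print(json.dumps(job, indent=2))
```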

Let’s run the job from the scheduler window.

If you observe the status in the Dataflow runner, you shall see your job along with all its logs and performance metrics.

That’s All!!


Please bookmark this page and share it with your friends. Please Subscribe to the blog to receive notifications on freshly published(2024) best practices and guidelines for software design and development.



Do you have any comments or ideas or any better suggestions to share?

Please sound off your comments below.

Happy Coding !!


