Use Google Content API for Shopping Site – 2

Unlike other data feeds in Google Merchant Center, such as an imported Google Sheet or a feed refreshed directly from the store, an API feed has no scheduled data refresh. In principle, an API feed works like a manual feed, except that it is driven by code. So we need to schedule our Python scripts to run automatically at a regular interval. Depending on the platform that the e-commerce site or the Python scripts run on, there are many ways to schedule the API calls. The solution we use here is Apache Airflow, a powerful ETL/orchestration tool for managing and monitoring our Python/SQL jobs. It can also be run on or migrated to any cloud environment: AWS, GCP or Azure.

Below is a sample Airflow DAG:

from __future__ import annotations
import sys
sys.path.append("/data/pythonprojects/google-api")


from datetime import datetime, timedelta

from airflow import DAG

# ExternalPythonOperator runs the callable with the interpreter of the
# virtual environment that add_new_products.py uses
from airflow.operators.python import ExternalPythonOperator

from add_new_products import get_new_products_list
from add_new_products import update_log

with DAG(
    "google_api",
    description="Call google content api",
    
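    # Run the job at 1:00 am every day.
    # "schedule" replaces the deprecated "schedule_interval" from Airflow 2.4,
    # the same release that introduced ExternalPythonOperator.
    schedule="0 1 * * *",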
    
    start_date=datetime(2022,11,29),
    catchup=False,
    tags=["oaktree"],
) as dag:

    # "python" must point to the interpreter binary inside the venv,
    # not to the bin directory itself.
    task1 = ExternalPythonOperator(
        task_id="update_products",
        python="/pythonprojects/google-api/.venv/bin/python",
        python_callable=get_new_products_list,
    )

    # Task ids must be unique within a DAG, so the logging step gets its own id.
    task2 = ExternalPythonOperator(
        task_id="update_log",
        python="/pythonprojects/google-api/.venv/bin/python",
        python_callable=update_log,
    )

    task1 >> task2
  • Use the Airflow ExternalPythonOperator to launch the ‘google-api’ Python scripts in their own virtual environment.
  • Optionally, add a second step that calls ‘update_log’ (Python with a SQL connector) to record each run in a MySQL table.
  • We can also use email alerts to report failures.
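
To make the email alert idea concrete, here is a minimal sketch, assuming Airflow 2.x with an SMTP server already configured in its settings. The recipient address and the retry values are placeholders for illustration, not part of the original setup.

```python
from datetime import timedelta

# Hypothetical recipient list; replace with a real mailbox.
ALERT_RECIPIENTS = ["alerts@example.com"]

# Settings that every task in the DAG inherits when passed as default_args.
default_args = {
    "email": ALERT_RECIPIENTS,
    "email_on_failure": True,   # send a mail when a task fails
    "email_on_retry": False,    # stay quiet on automatic retries
    "retries": 1,               # retry once before marking the task failed
    "retry_delay": timedelta(minutes=5),
}

# Assumed usage inside the DAG file:
# with DAG("google_api", default_args=default_args, ...) as dag:
#     ...
```

With this in place, a failure in either task should trigger a mail to the listed recipients after the retry is exhausted, provided Airflow can reach the SMTP server.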

The following diagram shows the entire architecture: