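Connect to MSSQL using SQLExecuteQueryOperator¶
This guide shows how to define tasks that interact with an MSSQL database using SQLExecuteQueryOperator.
Use the SQLExecuteQueryOperator to execute SQL commands in an MSSQL database.
Note
Previously, MsSqlOperator was used to perform this kind of operation. Use SQLExecuteQueryOperator instead.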
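Common Database Operations with SQLExecuteQueryOperator¶
To run a SQL query against an MSSQL database with SQLExecuteQueryOperator, two parameters are required: sql and conn_id. Both are eventually passed to the MSSQL hook object, which interacts directly with the MSSQL database.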
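The hand-off of sql and conn_id from operator to hook can be pictured with a small toy model. This is only an illustrative sketch and not Airflow's implementation: RecordingHook and MiniSQLOperator are hypothetical names invented here, and the hook merely records the statement instead of opening a database connection.

```python
class RecordingHook:
    """Hypothetical stand-in for a database hook; it records SQL instead of running it."""

    def __init__(self, conn_id):
        self.conn_id = conn_id
        self.executed = []

    def run(self, sql):
        # A real hook would open the connection named by conn_id and execute the SQL.
        self.executed.append(sql)


class MiniSQLOperator:
    """Hypothetical operator that forwards its two arguments to the hook."""

    def __init__(self, task_id, conn_id, sql):
        self.task_id = task_id
        self.conn_id = conn_id
        self.sql = sql

    def execute(self):
        hook = RecordingHook(self.conn_id)  # conn_id selects the connection
        hook.run(self.sql)                  # sql is what gets executed
        return hook


task = MiniSQLOperator(
    task_id="get_all_countries",
    conn_id="airflow_mssql",
    sql="SELECT * FROM Country;",
)
hook = task.execute()
print(hook.conn_id, hook.executed)
```

The point of the sketch is only that the operator itself holds no database logic: everything it knows about the target database arrives through conn_id, and everything it runs arrives through sql.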
Creating a MSSQL database table¶
The code snippets below are based on Airflow-2.2.
The following is an example of using SQLExecuteQueryOperator to connect to MSSQL:
# Example of creating a task to create a table in MsSql
create_table_mssql_task = SQLExecuteQueryOperator(
    task_id="create_country_table",
    conn_id="airflow_mssql",
    sql=r"""
    CREATE TABLE Country (
        country_id INT NOT NULL IDENTITY(1,1) PRIMARY KEY,
        name TEXT,
        continent TEXT
    );
    """,
    dag=dag,
)
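You can also use an external file to execute the SQL commands. The script folder must be at the same level as the DAG.py file. This makes it easy to maintain the SQL queries separately from the code.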
# Example of creating a task that calls an sql command from an external file.
create_table_mssql_from_external_file = SQLExecuteQueryOperator(
    task_id="create_table_from_external_file",
    conn_id="airflow_mssql",
    sql="create_table.sql",
    dag=dag,
)
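Your dags/create_table.sql should contain the SQL to run, here a CREATE TABLE statement for the Users table that the next section populates.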
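The listing for this file is not reproduced in this page. As a stand-in, the sketch below writes a hypothetical create_table.sql. Its column names (username, description) are inferred from the INSERT statements used in the next section, and the user_id column and the column types are assumptions that mirror the Country table, so adjust them to your actual schema. The helper name write_sql_script is also invented for illustration.

```python
import tempfile
from pathlib import Path

# Hypothetical contents for create_table.sql. The username and description
# columns come from the INSERT statements in this guide; user_id and the
# column types are assumptions modeled on the Country table.
CREATE_USERS_SQL = """\
CREATE TABLE Users (
    user_id INT NOT NULL IDENTITY(1,1) PRIMARY KEY,
    username TEXT,
    description TEXT
);
"""


def write_sql_script(dags_folder, filename="create_table.sql"):
    """Write the script into the DAG folder and return its path."""
    dags_folder = Path(dags_folder)
    dags_folder.mkdir(parents=True, exist_ok=True)
    script_path = dags_folder / filename
    script_path.write_text(CREATE_USERS_SQL, encoding="utf-8")
    return script_path


if __name__ == "__main__":
    # A temporary directory stands in for the real dags/ folder.
    with tempfile.TemporaryDirectory() as tmp:
        path = write_sql_script(Path(tmp) / "dags")
        print(path.name)
        print(path.read_text(encoding="utf-8"))
```

Keeping the statement in its own file means the operator only needs sql="create_table.sql", as in the task above.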
Inserting data into a MSSQL database table¶
We can then create a SQLExecuteQueryOperator task to populate the Users table.
populate_user_table = SQLExecuteQueryOperator(
    task_id="populate_user_table",
    conn_id="airflow_mssql",
    sql=r"""
    INSERT INTO Users (username, description)
    VALUES ( 'Danny', 'Musician');
    INSERT INTO Users (username, description)
    VALUES ( 'Simone', 'Chef');
    INSERT INTO Users (username, description)
    VALUES ( 'Lily', 'Florist');
    INSERT INTO Users (username, description)
    VALUES ( 'Tim', 'Pet shop owner');
    """,
)
Fetching records from your MSSQL database table¶
Fetching records from your MSSQL database table can be as simple as:
get_all_countries = SQLExecuteQueryOperator(
    task_id="get_all_countries",
    conn_id="airflow_mssql",
    sql=r"""SELECT * FROM Country;""",
)
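Passing Parameters into SQLExecuteQueryOperator¶
SQLExecuteQueryOperator provides the parameters attribute, which makes it possible to dynamically inject values into your SQL requests at runtime. The example in this section uses params instead, whose values Jinja renders into the SQL text before the query is sent to the database.
To find the countries on the Asian continent: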
get_countries_from_continent = SQLExecuteQueryOperator(
    task_id="get_countries_from_continent",
    conn_id="airflow_mssql",
    sql=r"""SELECT * FROM Country where {{ params.column }}='{{ params.value }}';""",
    params={"column": "CONVERT(VARCHAR, continent)", "value": "Asia"},
)
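Rendering a value into the SQL text and binding it as a query parameter behave differently when the value is not trusted. The sketch below illustrates the difference under loud assumptions: the standard-library sqlite3 module stands in for MSSQL so the code runs anywhere, str.format stands in for template rendering, and the :continent placeholder is sqlite3's style (an MSSQL driver uses its own placeholder syntax). The function names are invented for illustration.

```python
import sqlite3

# sqlite3 is only a stand-in for MSSQL so that this sketch is runnable.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Country (country_id INTEGER PRIMARY KEY, name TEXT, continent TEXT)"
)
conn.executemany(
    "INSERT INTO Country (name, continent) VALUES (?, ?)",
    [("India", "Asia"), ("Germany", "Europe"), ("Japan", "Asia")],
)


def countries_templated(continent):
    """Mimic template rendering: the value becomes part of the SQL text."""
    sql = "SELECT name FROM Country WHERE continent = '{value}' ORDER BY name".format(
        value=continent
    )
    return [row[0] for row in conn.execute(sql)]


def countries_bound(continent):
    """Bind the value as a query parameter: the driver never parses it as SQL."""
    sql = "SELECT name FROM Country WHERE continent = :continent ORDER BY name"
    return [row[0] for row in conn.execute(sql, {"continent": continent})]


print(countries_templated("Asia"))
print(countries_bound("Asia"))

# A crafted value changes the meaning of the templated query only.
crafted = "x' OR '1'='1"
print(countries_templated(crafted))
print(countries_bound(crafted))
```

For fixed values written in the DAG file, as in the example above, rendering through params is convenient. For values that originate outside the DAG, binding them through parameters is usually the safer choice.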
The complete SQLExecuteQueryOperator DAG to connect to MSSQL¶
When we put everything together, our DAG should look like this:
import os
from datetime import datetime

import pytest

from airflow import DAG

try:
    from airflow.providers.common.sql.operators.sql import SQLExecuteQueryOperator
    from airflow.providers.microsoft.mssql.hooks.mssql import MsSqlHook
except ImportError:
    pytest.skip("MSSQL provider not available", allow_module_level=True)

ENV_ID = os.environ.get("SYSTEM_TESTS_ENV_ID")
DAG_ID = "example_mssql"

with DAG(
    DAG_ID,
    schedule="@daily",
    start_date=datetime(2021, 10, 1),
    tags=["example"],
    catchup=False,
) as dag:
    # Example of creating a task to create a table in MsSql
    create_table_mssql_task = SQLExecuteQueryOperator(
        task_id="create_country_table",
        conn_id="airflow_mssql",
        sql=r"""
        CREATE TABLE Country (
            country_id INT NOT NULL IDENTITY(1,1) PRIMARY KEY,
            name TEXT,
            continent TEXT
        );
        """,
        dag=dag,
    )

    @dag.task(task_id="insert_mssql_task")
    def insert_mssql_hook():
        mssql_hook = MsSqlHook(mssql_conn_id="airflow_mssql", schema="airflow")

        rows = [
            ("India", "Asia"),
            ("Germany", "Europe"),
            ("Argentina", "South America"),
            ("Ghana", "Africa"),
            ("Japan", "Asia"),
            ("Namibia", "Africa"),
        ]
        target_fields = ["name", "continent"]
        mssql_hook.insert_rows(table="Country", rows=rows, target_fields=target_fields)

    # Example of creating a task that calls an sql command from an external file.
    create_table_mssql_from_external_file = SQLExecuteQueryOperator(
        task_id="create_table_from_external_file",
        conn_id="airflow_mssql",
        sql="create_table.sql",
        dag=dag,
    )

    populate_user_table = SQLExecuteQueryOperator(
        task_id="populate_user_table",
        conn_id="airflow_mssql",
        sql=r"""
        INSERT INTO Users (username, description)
        VALUES ( 'Danny', 'Musician');
        INSERT INTO Users (username, description)
        VALUES ( 'Simone', 'Chef');
        INSERT INTO Users (username, description)
        VALUES ( 'Lily', 'Florist');
        INSERT INTO Users (username, description)
        VALUES ( 'Tim', 'Pet shop owner');
        """,
    )

    get_all_countries = SQLExecuteQueryOperator(
        task_id="get_all_countries",
        conn_id="airflow_mssql",
        sql=r"""SELECT * FROM Country;""",
    )

    get_all_description = SQLExecuteQueryOperator(
        task_id="get_all_description",
        conn_id="airflow_mssql",
        sql=r"""SELECT description FROM Users;""",
    )

    get_countries_from_continent = SQLExecuteQueryOperator(
        task_id="get_countries_from_continent",
        conn_id="airflow_mssql",
        sql=r"""SELECT * FROM Country where {{ params.column }}='{{ params.value }}';""",
        params={"column": "CONVERT(VARCHAR, continent)", "value": "Asia"},
    )

    (
        create_table_mssql_task
        >> insert_mssql_hook()
        >> create_table_mssql_from_external_file
        >> populate_user_table
        >> get_all_countries
        >> get_all_description
        >> get_countries_from_continent
    )
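The parenthesized chain at the end of the DAG declares a strictly linear execution order. The toy model below sketches how a chain of >> operators can build such an order. It is an illustration only and not Airflow's implementation: the Task class and the linear_order helper are hypothetical, and the model relies on >> returning its right-hand operand so that the next >> attaches to it.

```python
class Task:
    """Hypothetical minimal task that only tracks its downstream tasks."""

    def __init__(self, task_id):
        self.task_id = task_id
        self.downstream = []

    def __rshift__(self, other):
        # a >> b registers b downstream of a and returns b so chains keep going.
        self.downstream.append(other)
        return other


def linear_order(first):
    """Walk a purely linear chain and return the task ids in execution order."""
    order = []
    current = first
    while current is not None:
        order.append(current.task_id)
        current = current.downstream[0] if current.downstream else None
    return order


create_country = Task("create_country_table")
insert_rows = Task("insert_mssql_task")
create_users = Task("create_table_from_external_file")
populate_users = Task("populate_user_table")
get_countries = Task("get_all_countries")
get_descriptions = Task("get_all_description")
get_by_continent = Task("get_countries_from_continent")

(
    create_country
    >> insert_rows
    >> create_users
    >> populate_users
    >> get_countries
    >> get_descriptions
    >> get_by_continent
)

print(linear_order(create_country))
```

In this DAG the order matters: the Country table must exist before rows are inserted into it, and the Users table must be created before populate_user_table runs.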