Airflow - Specify parallelism per task?
I know that parallelism can be set in the cfg file. Is there a way to set it per task, or at least per DAG?
dag1 =
    task_id: 'download_sftp'
    parallelism: 4  # I am fine with downloading multiple files at once
    task_id: 'process_dimensions'
    parallelism: 1  # I want to make sure dimensions are processed one at a time, to prevent conflicts with 'serial' keys
    task_id: 'process_facts'
    parallelism: 4  # it is fine to have multiple tables processed at once, since there are no conflicts

dag2 (separate file) =
    task_id: 'bcp_query'
    parallelism: 6  # I can run separate bcp commands to download data, since the amounts of data are small
You can create a task pool through the web GUI and limit execution parallelism by specifying that particular tasks use that pool.
Please see: https://airflow.apache.org/concepts.html#pools
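As a rough sketch of how the pool approach could look for dag1 from the question, assuming Airflow 2.4 or later: the pool names (`sftp_pool`, `dimensions_pool`, `facts_pool`), the placeholder `echo` commands, and the task ordering are all made up for illustration. The pools themselves are not created by this code; they would need to exist already (for example, created in the web GUI) with the slot counts listed in `POOL_SLOTS`.

```python
from datetime import datetime

# Hypothetical pool names and the slot counts they are assumed to have.
# Create them beforehand in the Airflow web GUI; this dict only documents them.
POOL_SLOTS = {
    "sftp_pool": 4,        # up to 4 downloads at once
    "dimensions_pool": 1,  # dimensions strictly one at a time
    "facts_pool": 4,       # up to 4 fact tables at once
}

# Which pool each task from the question is assigned to.
TASK_POOLS = {
    "download_sftp": "sftp_pool",
    "process_dimensions": "dimensions_pool",
    "process_facts": "facts_pool",
}


def build_dag():
    """Build dag1 with every task assigned to its pool via the `pool` argument."""
    # Imported here so the mappings above can be inspected without Airflow installed.
    from airflow import DAG
    from airflow.operators.bash import BashOperator

    dag = DAG(
        dag_id="dag1",
        start_date=datetime(2024, 1, 1),
        schedule=None,  # triggered manually; adjust as needed
        catchup=False,
    )
    tasks = {
        task_id: BashOperator(
            task_id=task_id,
            bash_command=f"echo running {task_id}",  # placeholder command
            pool=pool_name,  # task instances wait for a free slot in this pool
            dag=dag,
        )
        for task_id, pool_name in TASK_POOLS.items()
    }
    # Assumed ordering; the question does not state the dependencies.
    tasks["download_sftp"] >> tasks["process_dimensions"] >> tasks["process_facts"]
    return dag


try:
    dag = build_dag()  # Airflow picks up DAG objects defined at module level
except ImportError:
    dag = None  # Airflow is not installed; only the mappings are usable
```

A pool caps how many task instances assigned to it run at the same time, across all DAGs and DAG runs, so anything placed in a 1-slot pool runs one instance at a time. dag2 could do the same in its own file by giving `bcp_query` a pool with 6 slots.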