Does AWS Data Pipeline Support Backfill Based on Fixed-Size Sequential Blocks of Rows?
I need to copy roughly 500M records (each row is 1 KB) from a MySQL table to a Redshift table as part of a backfill process. As I understand it, Data Pipeline typically backfills by creating the tasks that would have occurred had the pipeline been running since a given historical date. If I have understood correctly, the number of backfill tasks Data Pipeline creates is computed by
(current datetime – historical datetime)/task interval
and one provides a SQL query that collects the table rows by date.
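To make the arithmetic concrete, here is a minimal sketch of that calculation. It only illustrates the formula above and is not Data Pipeline's actual scheduling code; the function name `backfill_task_count`, the example dates, and the choice to count only complete intervals are my assumptions.

```python
from datetime import datetime, timedelta


def backfill_task_count(historical: datetime, current: datetime,
                        interval: timedelta) -> int:
    """Complete task intervals between the historical start and now.

    Assumption: a partial trailing interval does not produce a task.
    """
    if interval <= timedelta(0):
        raise ValueError("interval must be positive")
    elapsed = current - historical
    if elapsed <= timedelta(0):
        return 0
    # timedelta // timedelta yields an int (whole intervals only).
    return elapsed // interval


if __name__ == "__main__":
    # Hypothetical dates: a 30-day window with a daily schedule.
    start = datetime(2015, 1, 1)
    now = datetime(2015, 1, 31)
    print(backfill_task_count(start, now, timedelta(days=1)))
    print(backfill_task_count(start, now, timedelta(hours=1)))
```

The task count depends only on elapsed time and the interval, not on how many rows fall into each interval.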
However, this means the number of records each task transfers will vary depending on how many records were updated in the given time interval (which varies). Alternatively, if I want to keep the number of records in each task small, I end up with many more backfill tasks than I would like.
Is there an easy way to tell Data Pipeline to assign workers fixed-size tasks (e.g. sequential blocks of N primary keys), so that I can predefine the number of rows in each backfill task and avoid creating large numbers of empty tasks, or tasks that are too big for the instance?
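For clarity, this is the kind of partitioning I have in mind, sketched in plain Python rather than as a Data Pipeline feature (I do not know whether one exists). The helper names `key_blocks` and `block_query`, and the table and column names `orders` and `id`, are hypothetical. It assumes an integer primary key; if the key sequence has gaps, a block holds at most N rows rather than exactly N.

```python
from typing import Iterator, Tuple


def key_blocks(min_id: int, max_id: int,
               block_size: int) -> Iterator[Tuple[int, int]]:
    """Yield inclusive (low, high) primary-key ranges of at most block_size keys."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    low = min_id
    while low <= max_id:
        high = min(low + block_size - 1, max_id)
        yield low, high
        low = high + 1


def block_query(table: str, key: str, low: int, high: int) -> str:
    """Build the SELECT for one block (identifiers are trusted, not user input)."""
    return (f"SELECT * FROM {table} "
            f"WHERE {key} BETWEEN {low} AND {high} ORDER BY {key}")


if __name__ == "__main__":
    # Small demonstration: keys 1..10 in blocks of 4.
    for low, high in key_blocks(1, 10, 4):
        print(block_query("orders", "id", low, high))
```

With 500M sequential keys and N = 1,000,000 this gives 500 tasks of about 1 GB each at 1 KB per row, regardless of how the rows are distributed over time.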