Skip to content

Bulk Loader

The Bulk Loader transform streams data from inside Deem Integrator to a text file using LOAD DATA INFILE 'FIFO File' INTO TABLE .... into the database. This provides an efficient method for loading large volumes of data into database tables.

The transform uses a FIFO (First In, First Out) named pipe to stream data directly to the database, avoiding the need to write intermediate files to disk. This makes it one of the fastest methods for loading data into MySQL and compatible databases.

Bulk Loader

OptionDescription
Transform nameName of the transform
ConnectionSelect predefined database connection
Target schemaThe name of the Schema for the table to write data to. This is important for data sources that allow for table names with dots ’.’ in it
Target tableName of the target table
Split every … rowsTo split the data loads in chunks of data after which the data load will be
Truncate tableCreate new table before start loading data
Auto generate View on TableThe output are generated as a View. Each time the step are executed a new temporary table are created. Tables are created in a temporary database with “temp” as suffix. Tables in this temp databases have table name with suffix ‘0’ to ‘2_’. Remember to create database for temporary tables
Temp directoryTemp directory
FilenameThis field specifies the filename and location of the output text file
SeparatorThe character used to separate (delimiter) fields
EnclosureThe enclosure used for strings

With the “Get fields” button you can load the fields from the stream.

The output table are generated as a view. Each time the step are executed a new temporary table are created. Tables are created in a temporary database with “temp” as suffix. Tables in this temp databases have table name with suffix ‘0’ to ‘2_’.

Database Setup Required

Remember to create database for temporary tables. If the connection are mapped to a database with the name “dw_prd_datamarts” you need to create a database with the name “dw_prd_datamarts_temp”.

To store current temporary table number a control table with the name “bi_auto_view_control” in the datamarts database. This table have one record per tablename. Each time the loader are running the current table number are updated in the control table.

  1. Temporary Table Creation: A new temporary table is created with suffix ‘0_’, ‘1_’, or ‘2_’
  2. Data Loading: Data is loaded into the temporary table
  3. View Update: The view is updated to point to the new temporary table
  4. Control Table: The control table tracks which temporary table number is currently active

This pattern allows for zero-downtime updates where the view always points to the most recent data while old temporary tables can be cleaned up later.

  • Use for large datasets: Bulk Loader is most effective for datasets larger than 100,000 rows
  • Split large loads: Use “Split every … rows” to break very large loads into manageable chunks
  • Temp directory: Use a fast storage location (preferably SSD) for the temp directory
  • Separator and enclosure: Choose appropriate separator and enclosure characters that don’t conflict with your data

Use Bulk Loader when:

  • Loading very large datasets (> 100K rows)
  • Maximum performance is critical
  • Using MySQL or compatible database
  • You can use LOAD DATA INFILE functionality

Consider Indexed Table Output instead when:

  • You need automatic index creation
  • You need more control over the insert process
  • You’re working with smaller datasets
  • You need upsert functionality
  • File system support: Ensure your file system supports named pipes (FIFO)
  • Permissions: Ensure the process has read/write permissions for the temp directory
  • Cleanup: FIFO files are automatically cleaned up after use
  • Network storage: Avoid using network-mounted storage for temp directory if possible