Indexed Table Output
Overview
The Indexed Table Output step loads data into a database table; it is equivalent to the SQL INSERT statement (DML). The step provides options for selecting the target table, along with housekeeping and performance-related options such as Commit size and Use batch update for inserts.
An index on the selected Key fields is created automatically after the last record has been loaded into the target table.
Configuration Options
| Option | Description |
|---|---|
| Step name | Name of the step; this name has to be unique in a single transformation |
| Connection | The database connection to which data is written |
| Target Schema | The name of the Schema for the table to write data to. This is important for data sources that allow for table names with periods in them |
| Target table | The name of the table to which data is written |
| Commit size | Use transactions when inserting rows into the database table; the connection is committed every N rows when N is greater than zero (0) |
| Field that contains name of table | When the option “Is the name of the table defined in a field?” is enabled, enter the field name to use here |
| SQL | Generates the SQL to create the output table automatically |
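The Commit size behavior described above (commit every N rows, with 0 meaning a single commit at the end) can be sketched in a few lines. This is an illustrative sketch using Python's built-in sqlite3 as a stand-in for the step's database connection; the table and column names are hypothetical, not part of the step itself:

```python
import sqlite3

def load_rows(conn, rows, commit_size=1000):
    """Insert rows into a hypothetical 'target' table, committing every N rows.

    A commit_size of 0 disables intermediate commits (one commit at the end),
    mirroring the step's "commit every N rows if N > 0" behavior.
    """
    cur = conn.cursor()
    for i, row in enumerate(rows, start=1):
        cur.execute("INSERT INTO target (id, name) VALUES (?, ?)", row)
        if commit_size > 0 and i % commit_size == 0:
            conn.commit()
    conn.commit()  # commit any remaining rows
```

Larger commit sizes reduce transaction overhead but increase the amount of work lost if the load fails mid-way; smaller sizes do the opposite.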
Key Fields and Index Creation
After all records are loaded into the target table, an index is automatically created on the selected Key fields. This is useful for:
- Improving query performance on the loaded data
- Creating indexes that match your query patterns
- Ensuring data integrity with unique indexes
Index Creation Timing
The index is created after all records are loaded, not during the load process. This ensures optimal insert performance.
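The load-then-index sequence can be sketched as follows. This is a minimal illustration using Python's sqlite3 (table, column, and index names are hypothetical); the point is the ordering: inserts first, CREATE INDEX only after the load completes:

```python
import sqlite3

def load_then_index(conn, rows, key_fields=("id",)):
    """Load all rows first, then create an index on the key fields.

    Creating the index after the load (rather than before) avoids paying
    index-maintenance cost on every single insert.
    """
    cur = conn.cursor()
    cur.executemany("INSERT INTO target (id, name) VALUES (?, ?)", rows)
    conn.commit()
    cols = ", ".join(key_fields)  # key fields chosen by the user
    cur.execute(f"CREATE INDEX IF NOT EXISTS idx_target_keys ON target ({cols})")
    conn.commit()
```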
Batch Updates
When enabled, batch updates can significantly improve performance by:
- Reducing the number of database round trips
- Allowing the database to optimize insert operations
- Improving overall throughput for large datasets
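The round-trip savings come from sending many rows per driver call instead of one. A rough sketch, again using Python's sqlite3 as a stand-in (the step itself uses the configured database connection; names here are hypothetical), where executemany plays the role of a JDBC-style batch submit:

```python
import sqlite3

def insert_batched(conn, rows, batch_size=500):
    """Send inserts to the database in batches instead of one at a time.

    executemany() submits a whole batch in one driver call, reducing
    per-row round trips, analogous to "Use batch update for inserts".
    """
    cur = conn.cursor()
    batch = []
    for row in rows:
        batch.append(row)
        if len(batch) >= batch_size:
            cur.executemany("INSERT INTO target (id, name) VALUES (?, ?)", batch)
            batch.clear()
    if batch:  # flush the final partial batch
        cur.executemany("INSERT INTO target (id, name) VALUES (?, ?)", batch)
    conn.commit()
```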
Best Practices
Performance Optimization
- Use batch updates: Enable batch updates to reduce database round trips and improve throughput
- Optimize commit size: Balance between transaction safety and performance
- Select appropriate key fields: Choose key fields that match your query patterns
- Pre-create table: For very large loads, consider pre-creating the table with appropriate structure
When to Use
Use Indexed Table Output when:
- You need to insert data into an existing table
- You want automatic index creation after load
- You need row-by-row insert control
- You’re working with moderate data volumes (< 10M rows)
Consider Bulk Loader instead when:
- You’re loading very large datasets (> 10M rows)
- Maximum performance is critical
- You’re using MySQL or compatible database
- You can use LOAD DATA INFILE functionality
Comparison with Other Output Methods
| Method | Use Case | Performance | Features |
|---|---|---|---|
| Indexed Table Output | Standard inserts with index creation | Good | Automatic index creation, batch updates |
| Bulk Loader | Very large datasets | Excellent | Uses LOAD DATA INFILE, fastest method |
| Staging Upsert Output | Incremental loads with updates | Good | Upsert logic, variation number support |
Related
- Transforms Overview - Overview of all transforms
- Staging Upsert Output - Upsert to staging tables
- Bulk Loader - Bulk load to databases