21. Import data using WbImport

21.1. Importing spreadsheet files
21.2. General parameters
21.3. Parameters for the type TEXT
21.4. Text Import Examples
21.5. Parameters for the type XML
21.6. Parameters for spreadsheet import
21.7. Update mode
21.8. Native UPSERT mode
21.9. Native insertIgnore mode

The WbImport command can be used to import data from text, XML or Spreadsheet (ODS, XLS, XLSX) files into a table of the database. WbImport can read the XML files generated by the WbExport command's XML format. It can also read text files created by the WbExport command that escape non-printable characters.

The WbImport command can be used like any other SQL command (such as UPDATE or INSERT), including scripts that are run in batch mode.

During the import of text files, empty lines (i.e. lines which only contain whitespace) will be silently ignored.

WbImport recognizes certain "literals" to identify the current date or time when converting values from text files to the appropriate data type of the DBMS. Thus, input values like now, or current_timestamp for date or timestamp columns are converted correctly. For details on which "literals" are supported, please see the description about editing data.

The DataPumper can also be used to import text files into a database table, though it does not offer all of the possibilities from the WbImport command.

Archives created with the WbExport command using the -compress=true parameter can be imported using WbImport command. You simply need to specifiy the archive file created by WbExport, and WbImport will automatically detect the archive. For an example to create and import compressed exports, please refer to compressing export files

[Note]

If you use continueOnError=true and expect a substantial number of rows to fail, it is highly recommended to also use a "bad file" to log all rejected records. Otherwise the rejected records are stored in memory (until the import finishes) which may lead to an out of memory error.

21.1. Importing spreadsheet files

In order to import Microsoft Excel (XSL, XSLT) or OpenOffice Calc (ODS) files, additional libraries are needed. For Excel the same libraries are needed as for exporting those formats. For OpenOffice additional libraries are needed. All needed libraries are included in the download bundle named with-office-libs.zip If you did not download that bundle, you can download the libraries needed for OpenOffice from here: http://www.sql-workbench.net/odf-add-on3.zip.

You can tell if the needed libraries are installed if you invoke the code-completion after typing the -type= parameter. If the types XLS or ODS are presented in the drop down, the libraries installed.

The Excel import supports XLS and XLSX, it does not support the "SpreadsheetML" format.

[Important]

To import XLS or XLSX files, the entire file needs to be read into memory. When importing large files this will require a substantial amount of memory.

21.2. General parameters

WbImport supports conditional execution

The WbImport supports the following parameters

ParameterDescription
-type

Possible values: xml, text, ods, xls

Defines the type of the input file. This is only needed if the input file has a non-standard file extensions. If this parameter is not specified, the import type is derived from the input file's extension.

This parameter supports code-completion. If invoked, it will only display available import types.

-mode

Defines how the data should be sent to the database. Possible values are 'insert', 'update', 'insert,update' and 'update,insert' For details please refer to the update mode explanation.

For some DBMS, the additional modes: 'upsert' and 'insertIgnore' are supported. For details please refer to the native upsert and native insertIgnore explanation.

This parameter supports code-completion.

-file

Defines the full name of the input file. Alternatively you can also specify a directory (using -sourcedir) from which all files are imported.

-table

Defines the table into which the data should be imported

This parameter is ignored, if the files are imported using the -sourcedir parameter

This parameter supports code-completion.

-sourceDir

Defines a directory which contains import files. All files from that directory will be imported. If this switch is used with text files and no target table is specified, then it is assumed that each filename (without the extension) defines the target table. If a target table is specified using the -table parameter, then all files will be imported into the same table. The -deleteTarget will be ignored if multiple files are imported into a single table.

-extension

When using the -sourcedir switch, the extension for the files can be defined. All files ending with the supplied value will be processed. (e.g. -extension=csv). The extension given is case-sensitive (i.e. TXT is something different than txt

-ignoreOwner

If the file names imported with from the directory specified with -sourceDir contain the owner (schema) information, this owner (schema) information can be ignored using this parameter. Otherwise the files might be imported into a wrong schema, or the target tables will not be found.

-excludeFiles

Using -excludeFiles, files from the source directory (when using -sourceDir) can be excluded from the import. The value for this parameter is a comma separated list of partial names. Each file that contains at least one of the values supplied in this parameter is ignored. -excludeFiles=back,data will exclude any file that contains the value back or data in it, e.g.: backup, to_back, log_data_store etc.

-skipTargetCheck

Normally WbImport will check if the specified target table exists and will retrieve the column definition for the target table through the JDBC API. However, some JDBC drivers do not always return all table information e.g. for temporary tables. If you know that the target table exists, the parameter -skipTargetCheck=true can be used to tell WbImport to not verify the existence of the table through the JDBC API.

However, WbImport needs to know the data types of the table columns. When -skipTargetCheck is enabled, the data types are detected by running a SELECT * statement to fetch the definition of the columns. For JDBC drivers supporting meta data for prepared statements, the query is only prepared, but not executed. For drivers not supporting that, the query is executed and the result is limited to a single row.

-checkDependencies

When importing more than one file (using the -sourcedir switch), into tables with foreign key constraints, this switch can be used to import the files in the correct order (child tables first). When -checkDependencies=true is passed, SQL Workbench/J will check the foreign key dependencies for all tables. Note that this will not check dependencies in the data. This means that e.g. the data for a self-referencing table (parent/child) will not be order so that it can be imported. To import self-referencing tables, the foreign key constraint should be set to "initially deferred" in order to postpone evaluation of the constraint until commit time.

-commitEvery

If your DBMS neeeds frequent commits to improve performance and reduce locking on the import table you can control the number of rows after which a COMMIT is sent to the server.

-commitEveryis numeric value that defines the number of rows after which a COMMIT is sent to the DBMS. If this parameter is not passed (or a value of zero or lower), then the import is run as a single transaction that is committed at the end.

When using batch import and your DBMS requires frequent commits to improve import performance, the -commitBatch option should be used instead.

You can turn off the use of a commit or rollback during import completely by using the option -transactionControl=false.

Using -commitEvery means, that in case of an error the already imported rows cannot be rolled back, leaving the data in a potential invalid state.

-transactionControl

Possible values: true, false

Controls if SQL Workbench/J handles the transaction for the import, or if the import must be committed (or rolled back) manually. If -transactionControl=false is specified, SQL Workbench/J will neither send a COMMIT nor a ROLLBACK at the end. This can be used when multiple files need to be imported in a single transaction. This can be combined with the cleanup and error scripts in batch mode.

-continueOnError

Possible values: true, false

This parameter controls the behavior when errors occur during the import. The default is true, meaning that the import will continue even if an error occurs during file parsing or updating the database. Set this parameter to false if you want to stop the import as soon as an error occurs.

The default value for this parameter can be controlled in the settings file and it will be displayed if you run WbImport without any parameters.

With PostgreSQL continueOnError will only work, if the use of savepoints is enabled using -useSavepoint=true.

-emptyFile

Possible values: ignore, warning, fail

This parameter controls the behavior when an empty file (i.e. with a length of zero bytes) is used for the input file. ignore means the file is ignored, no warning will be shown or written to the log file. warning means the file is ignored, but a warning will be shown and logged. With fail an empty file will be treated as an error unless -continueOnError=true is specified.

The default value is fail

-useSavepoint

Possible values: true, false

Controls if SQL Workbench/J guards every insert or update statement with a savepoint to recover from individual error during import, when continueOnError is set to true.

Using a savepoint for each DML statement can drastically reduce the performance of the import.

-keyColumns

Defines the key columns for the target table. This parameter is only necessary if import is running in UPDATE mode.

It is assumed that the values for the key columns will never be NULL.

This parameter is ignored if files are imported using the -sourcedir parameter.

-ignoreIdentityColumns

Possible values: true, false

Controls if identity or auto-increment columns will be included in the import.

If this is used, the JDBC driver must correctly report the column to be excluded as an AUTOINCREMENT column. This can be verified in the table definition display of the DbExplorer. If the column is reported with YES for the AUTOINCREMENT property, then this column will be excluded during the import.

-schema Defines the schema into which the data should be imported. This is necessary for DBMS that support schemas, and you want to import the data into a different schema, then the current one.
-encoding

Defines the encoding of the input file (and possible CLOB files)

If code-completion is invoked for this parameter, it will show a list of encodings defined through the configuration property workbench.export.defaultencodings This is a comma-separated list that can be changed using WbSetConfig. If this property is not defined, all available encodings will be shown.

-deleteTarget

Possible values: true, false

If this parameter is set to true, data from the target table will be deleted (using DELETE FROM ...) before the import is started. This parameter will only be used if -mode=insert is specified.

This parameter is ignored for spreadsheet imports.

-truncateTable

Possible values: true, false

This is essentially the same as -deleteTarget, but will use the command TRUNCATE to delete the contents of the table. For those DBMS that support this command, deleting rows is usually faster compared to the DELETE command, but it cannot be rolled back. This parameter will only be used if -mode=insert is specified.

-batchSize

A numeric value that defines the size of the batch queue. Any value greater than 1 will enable batch mode. If the JDBC driver supports this, the INSERT (or UPDATE) performance can be increased drastically.

This parameter will be ignored if the driver does not support batch updates or if the mode is not UPDATE or INSERT (i.e. if -mode=update,insert or -mode=insert,update is used).

-commitBatch

Possible values: true, false

If using batch execution (by specifying a batch size using the -batchSize parameter) each batch will be committed when this parameter is set to true. This is slightly different to using -commitEvery with the value of the -batchSize parameter. The latter one will add a COMMIT statement to the batch queue, rather than calling the JDBC commit() method. Some drivers do not allow to add different statements in a batch queue. So, if a frequent COMMIT is needed, this parameter should be used.

When you specify -commitBatch the parameter -commitEvery will be ignored. If no batch size is given (using -batchSize, then -commitBatch will also be ignored.

-updateWhere

When using update mode an additional WHERE clause can be specified to limit the rows that are updated. The value of the -updatewhere parameter will be added to the generated UPDATE statement. If the value starts with the keyword AND or OR the value will be added without further changes, otherwise the value will be added as an AND clause enclosed in brackets. This parameter will be ignored if update mode is not active.

-startRow

A numeric value to define the first row to be imported. Any row before the specified row will be ignored. The header row is not counted to determine the row number. For a text file with a header row, the physical line 2 is row 1 (one) for this parameter.

When importing text files, empty lines in the input file are silently ignored and do not add to the count of rows for this parameter. So if your input file has two lines to be ignored, then one empty line and then another line to be ignored, startRow must be set to 4.

-endRow A numeric value to define the last row to be imported. The import will be stopped after this row has been imported. When you specify -startRow=10 and -endRow=20 11 rows will be imported (i.e. rows 10 to 20). If this is a text file import with a header row, this would correspond to the physical lines 11 to 21 in the input file as the header row is not counted.
-columnFilter

This defines a filter on column level that selects only certain rows from the input file to be sent to the database. The filter has to be defined as column1="regex",column2="regex". Only Rows matching all of the supplied regular expressions will be included by the import.

This parameter is ignored when the -sourcedir parameter is used.

-badFile

If -continueOnError=true is used, you can specify a file to which rejected rows are written. If the provided filename denotes a directory a file with the name of the import table will be created in that directory. When doing multi-table inserts you have to specify a directory name.

If a file with that name exists it will be deleted when the import for the table is started. The fill will not be created unless at least one record is rejected during the import. The file will be created with the same encoding as indicated for the input file(s).

-maxLength

With the parameter -maxLength you can truncate data for character columns (VARCHAR, CHAR) during import. This can be used to import data into columns that are not big enough (e.g. VARCHAR columns) to hold all values from the input file and to ensure the import can finish without errors.

The parameter defines the maximum length for certain columns using the following format: -maxLength='firstname=30,lastname=20' Where firstname and lastname are columns from the target table. The above example will limit the values for the column firstname to 30 characters and the values for the column lastname to 20 characters. If a non-character column is specified this is ignored. Note that you have quote the parameter's value in order to be able to use the "embedded" equals sign.

-booleanToNumber

Possible values: true, false

In case you are importing a boolean column (containing "true", "false") into a numeric column in the target DBMS, SQL Workbench/J will automatically convert the literal true to the numeric value 1 (one) and the literal false to the numeric value 0 (zero). If you do not want this automatic conversion, you have to specify -booleanToNumber=false for the import. The default values for the true/false literals can be overwritten with the -literalsFalse and -literalsTrue switches.

To store different values than 0/1 in the target column, use the parameters -numericFalse and -numericTrue

-numericFalse -numericTrue

These parameters control the conversion of boolean literals into numbers.

If these parameters are used, any text input that is identified as a "false" literal, will be stored with the number specified with -numericFalse. Any text input that is identified as "true" will be stored as the number specified with -numericFalse.

To use -1 for false and 1 for true, use the following parameters: -numericFalse='-1' -numericTrue='1'. Note that '-1' must be quoted due to the dash. If these parameters are used, -booleanToNumber will be assumed true implicitely.

These parameters can be combined with -literalsFalse and -listeralsTrue.

Please note:

  • This conversion is only applied for "text" input values. Valid numbers in the input file will not be converted to the values specified with -numericFalse or -numericTrue. This means that you cannot change a 0 (zero) in the input file into a -1 in the target column.

-literalsFalse -literalsTrue

These parameters control the conversion of boolean literals into boolean values.

These two switches define the text values that represent the (boolean) values false and true in the input file. This conversion is applied when storing the data in a column that is of type boolean in the database.

The value to these switches is a comma separated list of literals that should be treated as the specified value, e.g.: -literalsFalse='false,0' -literalsTrue='true,1' will define the most commonly used values for true/false.

Please note:

  • The definition of the literals is case sensitive!
  • You always have to specify both switches, otherwise the definition will be ignored

-dateFormat

The format for date columns.

This is only used for text imports and spreadsheet imports using -stringDates=true

-timestampFormat

The format for datetime (or timestamp) columns in the input file.

This is only used for text imports and spreadsheet imports using -stringDates=true

-locale

The locale (language) to be used for parsing date and timestamp values.

This is only used for text imports and spreadsheet imports using -stringDates=true

This parameter supports code-completion.

-illegalDateIsNull

If this is set to true, illegal dates (such as February, 31st) or malformed dates inside the input file will be treated as a null value.

This is only used for text and spreadsheet imports

-trimValues

Possible values: true, false

Controls whether leading and trailing whitespace are removed from the input values before they are stored in the database. When used in combination with -emptyStringIsNull=true this means that a column value that contains only whitespace will be stored as NULL in the database.

The default value for this parameter can be controlled in the settings file and it will be displayed if you run WbImport without any parameters.

Note that, input values for non character columns (such as numbers or date columns) are always trimmed before converting them to their target datatype.

This is only used for text and spreadsheet imports

-emptyStringIsNull

Possible values: true, false

Controls whether input values for character type columns with a length of zero are treated as NULL (value true) or as an empty string.

The default value for this parameter is true

Note that, input values for non character columns (such as numbers or date columns) that are empty or consist only of whitespace will always be treated as NULL.

This is only used for text and spreadsheet imports

-constantValues

With this parameter you can supply constant values for one or more columns that will be used when inserting new rows into the database.

The constant values will only be used when inserting rows (e.g. using -mode=insert)

The format of the values is -constantValues="column1=value1,column2=value2". The parameter can be repeated multiple times, to make quoting easier: -constantValues="column1=value1" -constantValues="column2=value2" The values will be converted by the same rules as the input values from the input file. If the value for a character column is enclosed in single quotes, these will be removed from the value before sending it to the database. To include single quotes at the start or end of the input value you need to use two single quotes, e.g.-constantValues="name=''Quoted'',title='with space'" For the field name the value 'Quoted' will be sent to the database. for the field title the value with space will be sent to the database.

To specify a function call to be executed, enclose the function call in ${...}, e.g. ${mysequence.nextval} or ${myfunc()}. The supplied function will be put into the VALUES part of the INSERT statement without further checking (after removing the ${ and } characters, of course). So make sure that the syntax is valid for your DBMS. If you do need to store a literal like ${some.value} into the database, you need to quote it: -constantValues="varname='${some.value}'".

You can also specify a SELECT statement that retrieves information from the database based on values from the input file. This is useful when the input file contains e.g. values from a lookup table (but not the primary key from the lookup table).

The syntax to specify a SELECT statement is similar to a function call: -constantValues="$@{SELECT type_id FROM type_definition WHERE type_name = $4" where $4 references the fourth column from the input file. The first column is $1 (not $0).

The parameter for the SELECT statement do not need to be quoted as internally a prepared statement is used. However the values in the input file must be convertible by the JDBC driver.

In addition to the function call or SELECT statements, WbImport provides three variables that can be used to access the name of the currently imported file. This can be used to store the source file of the data in the target table.

The following three variables are supported

  • _wb_import_file_path this contains the full path and file name of the current import file
  • _wb_import_file_name this contains only the file name (without the path)
  • _wb_import_file_dir this contains only the path of the file without the filename (and without the extension)

Please refer to the examples for more details on the usage.

-insertSQL

Define the statement to be used for inserting rows.

This can be used to use hints or customize the generated INSERT statement. The parameter may only contain the INSERT INTO part of the statement (i.e. INSERT INTO is the default if nothing is specified). This can be used to pass special hints to the database, e.g. to specify an append hint for Oracle:

[Note]
You have to quote the parameter value using single quotes, otherwise comments will be removed from the SQL statement!

-insertSQL='INSERT /*+ append */ INTO'
-adjustSequences

Possible values: true, false

For DBMS that support sequences which are associated with a column, this parameter can be used to adjust the next value for the sequence to the maximum value of the imported data. This can also be used to synchronize identity columns for DBMS that allow overriding the generated values.

Currently this is implemented for PostgreSQL, DB2 (LUW), H2 Database and HyperSQL (aka HSQLDB).

-preTableStatement -postTableStatement

This parameter defines a SQL statement that should be executed before the import process starts inserting data into the target table. The name of the current table (when e.g. importing a whole directory) can be referenced using ${table.name}.

To define a statement that should be executed after all rows have been inserted and have been committed, you can use the -postTableStatement parameter.

These parameters can e.g. be used to enable identity insert for MS SQL Server:

-preTableStatement="set identity_insert ${table.name} on"
-postTableStatement="set identity_insert ${table.name} off"

Errors resulting from executing these statements will be ignored. If you want to abort the import in that case you can specify -ignorePrePostErrors=false and -continueOnError=false.

These statements are only used if more than one table is processed.

-runTableStatementOnError

Possible values: true, false

Controls the execution of the post-table statement in case an error occurred while importing the data. By default the post-table statement is executed even if the import was not successful. If this is should not happen, use -runTableStatementOnError=false.

-ignorePrePostErrors

Possible values: true, false

Controls handling of errors for the SQL statements defined through the -preTableStatement and -postTableStatement parameters. If this is set to true (the default), errors resulting from executing the supplied statements are ignored. If set to false then error handling depends on the parameter -continueOnError.

-showProgress

Valid values: true, false, <numeric value>

Control the update frequence in the status bar (when running in GUI mode). The default is every 10th row is reported. To disable the display of the progress specify a value of 0 (zero) or the value false. true will set the progress interval to 1 (one).

21.3. Parameters for the type TEXT

ParameterDescription
-fileColumns

A comma separated list of the table columns in the import file Each column from the file should be listed with the appropriate column name from the target table. This parameter also defines the order in which those columns appear in the file. If the file does not contain a header line or the header line does not contain the names of the columns in the database (or has different names), this parameter has to be supplied. If a column from the input file has no match in the target table, then it should be specified with the name $wb_skip$. You can also specify the $wb_skip$ flag for columns which are present but that you want to exclude from the import.

This parameter is ignored when the -sourceDir parameter is used.

-importColumns

Defines the columns that should be imported. If all columns from the input file should be imported (the default), then this parameter can be ommited. If only certain columns should be imported then the list of columns can be specified here. The column names should match the names provided with the -filecolumns switch. The same result can be achieved by providing the columns that should be excluded as $wb_skip$ columns in the -filecolumns switch. Which one you choose is mainly a matter of taste. Listing all columns and excluding some using -importcolumns might be more readable because the structure of the file is still "visible" in the -filecolumns switch.

This parameter is ignored when the -sourcedir parameter is used.

-delimiter

Define the character which separates columns in one line. Records are always separated by newlines (either CR/LF or a single a LF character) unless -multiLine=true is specified

Default value: \t (a tab character)

-columnWidths

In order to import files that do not have a delimiter but have a fixed width for each column, this parameters defines the width of each column in the input file. The value for this parameter is a comma separated list, where each element defines the width in characters for each column. If this parameter is given, the -delimiter parameter is ignored. The order of the columns in the input file must still be defined using the -fileColumns parameter.

e.g.: -fileColumns=custid,actcode,regioncd,flag -columnWidths='custid=10,actcode=5,regioncd=3,flag=1'

Note that the whole list must be enclosed in quotes as the parameter value contains the equal sign.

If you want to import only certain columns you have to use -fileColumns and -importColumns to select the columns to import. You cannot use $wb_skip$ in the -fileColumns parameter with a fixed column width import.

-quoteChar

The character which was used to quote values where the delimiter is contained. This parameter has no default value. Thus if this is not specified, no quote checking will take place. If you use -multiLine=true you have to specify a quote character in order for this to work properly.

-quoteAlways

Possible values: true, false

WbImport will always handled quoted values correctly, if a quote character is defined through -quoteChar.

Using -quoteAlways=true enables the distinction between NULL values and empty strings in the import file, but only if -quoteAlways=true has also been used when running WbExport. Remember to also use -emptyStringIsNull=false, as by default empty string values are treated as NULLs

-quoteCharEscaping

Possible values: none, escape, duplicate

Defines how quote characters that appear in the actual data are stored in the input file.

You have to define a quote character in order for this option to have an effect. The character defined with the -quoteChar switch will then be imported according to the setting defined by this switch.

If escape is specified, it is expected that a quote that is part of the data is preceded with a backslash, e.g. the input value here is a \" quote character will be imported as here is a " quote character

If duplicate is specified, it is expected that the quote character is duplicated in the input data. This is similar to the handling of single quotes in SQL literals. The input value here is a "" quote character will be imported as here is a " quote character

-multiLine

Possible values: true, false

Enable support for records spanning more than one line in the input file. These records have to be quoted, otherwise they will not be recognized.

If you create your exports with the WbExport command, it is recommended to encode special characters using the -escapetext switch rather then using multi-line records.

The default value for this parameter can be controlled in the settings file and it will be displayed if you run WbImport without any parameters.

-decimalThe decimal symbol to be used for numbers. The default is a dot
-header

Possible values: true, false

If set to true, indicates that the file contains a header line with the column names for the target table. This will also ignore the data from the first line of the file. If the column names to be imported are defined using the -filecolumns or the -importcolumns switch, this parameter has to be set to true nevertheless, otherwise the first row would be treated as a regular data row.

This parameter is always set to true when the -sourcedir parameter is used.

The default value for this option can be changed in the settings file and it will be displayed if you run WbImport without any parameters. It defaults to true

-decode

Possible values: true, false

This controls the decoding of escaped characters. If the export file was e.g. written with WbExport's escaping enabled then you need to set -decode=true in order to interpret string sequences like \t, \n or escaped Unicode characters properly. This is not enabled by default because applying the necessary checks has an impact on the performance.

-lineFilter

This defines a filter on the level of the whole input row (rather than for each column individually). Only rows matching this regular expression will be included in the import.

The complete content of the row from the input file will be used to check the regular expression. When defining the expression, remember that the (column) delimiter will be part of the input string of the expression.

-nullString

Defines the string value that in the input file to denote a NULL value. The value of this is case-sensitive, so -nullString=NULL is different to -nullString=null

-blobIsFilename

Possible values: true, false

This is a deprecated parameter. Please use -blobType instead.

When exporting tables that have BLOB columns using WbExport into text files, each BLOB will be written into a separate file. The actual column data of the text file will contain the file name of the external file. When importing text files that do not reference external files into tables with BLOB columns setting this parameter to false, will send the content of the BLOB column "as is" to the DBMS. This will of course only work if the JDBC driver can handle the data that in the BLOB columns of the text file. The default for this parameter is true

This parameter is ignored, if -blobType is also specified.

-blobType

Possible values: file, ansi, base64

Specifies how BLOB data is stored in the input file. If file is specified, it is assumed that the column value contains a filename that in turn contains the real blob data. This is the default format when using WbExport.

For the other two type, WbImport assumes that the blob data is stored as encoded character data in the column.

If this parameter is specified, -blobIsFilename is ignored.

-clobIsFilename

Possible values: true, false

When exporting tables that have CLOB columns using WbExport and the parameter -clobAsFile=true the generated text file will not contain the actual CLOB contents, but the a filename indicating the file in which the CLOB content is stored. In this case -clobIsFilename=true has to be specified in order to read the CLOB contents from the external files. The CLOB files will be read using the encoding specified with the -encoding parameter.

-usePgCopy

This parameter has no value, its presence turns the feature on.

If this parameter is specified, then the input file is sent to the PostgreSQL server using PostgreSQL's JDBC support for COPY

The specified file(s) must conform to the format expected by PostgreSQL's COPY command. SQL Workbench/J creates a COPY tablename (column, ...) FROM stdin WITH (format csv, delimiter '|', header true) statement and then executes this, passing the actual file contents through the JDBC API.

As COPY does not support "merging" of data, the only allowed import mode is insert. If a different mode is specified through the -mode parameter, an error will be reported.

The options defined in the WITH (...) part are influenced by the parameters passed to WbImport. However COPY does not support all options that WbImport does. To control the format of the input file(s) only the following parameters are relevant when using -usePgCopy:

  • -header
  • -encoding
  • -delimiter

Especially the formatting options for dates/timestamps and numbers will have no effect. So the input file must be formatted properly.

All parameters controlling the target table(s), the columns, the source directory and so on still work. Including the import directly from a ZIP archive.

21.4. Text Import Examples

21.4.1. Importing date columns

WbImport -file=c:/temp/contacts.txt
         -table=person
         -filecolumns=lastname,firstname,birthday
         -dateformat="yyyy-MM-dd";

This imports a file with three columns into a table named person. The first column in the file is lastname, the second column is firstname and the third column is birthday. Values in date columns are formated as yyyy-MM-dd

[Note]

A special timestamp format millis is availalbe to identify times represented in milliseconds (since January 1, 1970, 00:00:00 GMT).

21.4.2. Excluding input columns from the import

WbImport -file=c:/temp/contacts.txt
         -table=person
         -filecolumns=lastname,firstname,$wb_skip$,birthday
         -dateformat="yyyy-MM-dd";

This will import a file with four columns. The third column in the file does not have a corresponding column in the table person so its specified as $wb_skip$ and will not be imported.

WbImport -file=c:/temp/contacts.txt
         -table=person
         -filecolumns=lastname,firstname,phone,birthday
         -importcolumns=lastname,firstname;

This will import a file with four columns where all columns exist in the target table. Only lastname and firstname will be imported. The same effect could be achieved by specifying $wb_skip$ for the last two columns and leaving out the -importcolumns switch. Using -importcolumns is a bit more readable because you can still see the structure of the input file. The version with $wb_skip$ is mandatory if the input file contains columns that do not exist in the target table.

21.4.3. Importing a file with fixed column widths

WbImport -file=cust_data.txt
         -table=customer
         -filecolumns=custnr,accountid,region_code
         -columnWidths='custnr=10,accountid=10,region_code=2';

This will import a file with three columns. The first column named custnr is taken from the characters 1-10, the second column named accountid is taken from the characters 21-30 and the third the column region_code is taken from characters 31 and 32

21.4.4. Filtering rows during import

If you want to import certain rows from the input file, you can use regular expressions:

WbImport -file=c:/temp/contacts.txt
         -table=person
         -filecolumns=lastname,firstname,birthday
         -columnfilter=lastname="^Bee.*",firstname="^Za.*"
         -dateformat="yyyy-MM-dd";

The above statement will import only rows where the column lastname contains values that start with Bee and the column firstname contains values that start with Za. So Zaphod Beeblebrox would be imported, Arthur Beeblebrox would not be imported.

If you want to learn more about regular expressions, please have a look at http://www.regular-expressions.info/

If you want to limit the rows that are updated but cannot filter them from the input file using -columnfilter or -linefilter, use the -updatewhere parameter:

WbImport -file=c:/temp/contacts.txt
         -table=person
         -filecolumns=id,lastname,firstname,birthday
         -keycolumns=id
         -mode=update
         -updatewhere="source <> 'manual'"

This will update the table PERSON. The generated UPDATE statement would normally be: UPDATE person SET lastname=?, firstname=?, birthday=? WHERE id=?. The table contains entries that are maintained manually (identified by the value 'manual' in the column source) and should not be updated by SQL Workbench/J. By specifying the -updatewhere parameter, the above UPDATE statement will be extended to WHERE id=? AND (source <> 'manual'). Thus skipping records that are flagged as manual even if they are contained in the input file.

21.4.5. Importing several files

WbImport -sourceDir=c:/data/backup
         -extension=txt
         -header=true

This will import all files with the extension txt located in the directory c:/data/backup into the database. This assumes that each filename indicates the name of the target table.

WbImport -sourceDir=c:/data/backup
         -extension=txt
         -table=person
         -header=true

This will import all files with the extension txt located in the directory c:/data/backup into the table person regardless of the name of the input file. In this mode, the parameter -deleteTarget will be ignored.

21.4.6. Storing the name of the source file

The following statement will import all .txt files from the directory /data/import and store them in the appropriate tables. Each table that is being imported has to have a column named source_file and the complete path to the import file will be stored in that column (for each imported row).

WbImport -sourceDir=/data/import
         -header=true
         -schema=staging
         -extension=txt
         -constantValues="source_file=$[_wb_import_file_path]"
         -type=text;

21.4.7. Populating columns from the database

When your input file does not contain the actual values to be stored in the target table, but e.g. lookup values, you can specify a SELECT statement to retrieve the necessary primary key of the lookup table.

Consider the following tables:

contact (contact_id, first_name, last_name, type_id)
contact_type (type_id, type_name)

The table contact_type contains: (1, 'business'), (2, 'private'), (3, 'other').

Your input file only contains contact_id, first_name, last_name, type_name. Where type_name references an entry from the contact_type table.

To import this file, the following statement can be used:

WbImport -file=contacts.txt
         -type=text
         -header=true
         -table=contact
         -importColumns=contact_id, first_name, last_name
         -constantValues="type_id=$@{SELECT type_id FROM contact_type WHERE type_name = $4}"

For every row from the input file, SQL Workbench/J will run the specified SELECT statement. The value of the first column of the first row that is returned by the SELECT, will then be used to populate the type_id column. The SELECT statement will use the value of the third column of the row that is currently being inserted as the value for the WHERE condition.

You must use the -importColumns parameter as well to make sure the type_name column is not processed! As an alternative you can also use -fileColumns=contact_id, first_name, last_name, $wb_skip$ instead of -importColumns.

[Note]

The "placeholders" with the column index must not be quoted (e.g. '$1' for a character column will not work)!

If the column contact_id should be populated by a sequence, the above statement can be extended to include a function call to retrieve the sequence value (PostgreSQL syntax:)

WbImport
  -file=contacts.txt
  -type=text
  -header=true
  -table=contact
  -importColumns=first_name, last_name
  -constantValues="id=${nextval('contact_id_seq'::regclass)}"
  -constantValues="type_id=$@{SELECT type_id FROM contact_type WHERE type_name = $4}"

As the ID column is now populated through a constant expression, it may not appear in the -importColumns list. Again you could alternatively use -fileColumns=$wb_skip$, first_name, last_name, $wb_skip$ to make sure the columns that are populated through the -constantValue parameter are not taken from the input file.

21.5. Parameters for the type XML

The XML import only works with files generated by the WbExport command.

ParameterDescription
-verboseXML

Possible values: true, false

If the XML was generated with -verboseXML=false then this needs to be specified also when importing the file. Beginning with build 78, the SQL Workbench/J writes the information about the used tags into the meta information. So it is no longer necessary to specify whether -verboseXML was true when creating the XML file.

-sourceDir

Specify a directory which contains the XML files. All files in that directory ending with ".xml" (lowercase!) will be processed. The table into which the data is imported is read from the XML file, also the columns to be imported. The parameters -keycolumns, -table and -file are ignored if this parameter is specified. If XML files are used that are generated with a version prior to build 78, then all files need to use either the long or short tag format and the -verboseXML=false parameter has to be specified if the short format was used.

When importing several files at once, the files will be imported into the tables specified in the XML files. You cannot specify a different table (apart from editing the XML file before starting the import).

-importColumns

Defines the columns that should be imported. If all columns from the input file should be imported (the default), then this parameter can be omited. When specified, the columns have to match the column names available in the XML file.

-createTargetIf this parameter is set to true the target table will be created, if it doesn't exist. Valid values are true or false.

21.6. Parameters for spreadsheet import

Both spreadsheet imports (Microsoft Excel, OpenOffice) support a subset of the parameters that are used for flat file imports.

These parameters are:

  • -header
  • -fileColumns
  • -importColumns
  • -nullString
  • -emptyStringIsNull
  • -illegalDateIsNull

The spreadsheet import does not support specifying a date or timestamp format. It is expected that those columns are formatted in such a way that they can be identified as date or timestamps.

The spreadsheet import also does not support importing BLOB files that are referenced from within the spreadsheet. If you want to import this kind of data, you need to convert the spreadsheet into a text file.

The spreadsheet import supports one additional parameter that is not available for the text imports:

ParameterDescription
-sheetNumber

Selects the spread sheet inside the file to be imported. If this is not specified the first sheet is used. The first sheet has the number 1.

All sheets can be imported with a single command when using -sheetNumber=*. In that case it is assumed that each sheet has the same name as the target table.

If all sheets are imported, the parameters -table, -fileColumns and -importColumns are ignored.

-sheetName

Defines the name of the spreedsheet inside the file to be imported. If this is not specified the first sheet is used.

-stringDates

Possible values: true, false

By default WbImport tries to read "native" date and timestamp values from an Excel Worksheet. When this parameter is set to true, the values for date and timestamp values will be retrieved as a (formatted) string value and then converted using the format specified through the -timestampFormat and -dateFormat parameters.

21.7. Update mode

The -mode parameter controls the way the data is sent to the database. The default is INSERT. SQL Workbench/J will generate an INSERT statement for each record. If the INSERT fails no further processing takes place for that record.

If -mode is set to UPDATE, SQL Workbench/J will generate an UPDATE statement for each row. In order for this to work, the table needs to have a primary key defined, and all columns of the primary key need to be present in the import file. Otherwise the generated UPDATE statement will modify rows that should not be modified. This can be used to update existing data in the database based on the data from the export file.

To either update or insert data into the table, both keywords can be specified for the -mode parameter. The order in which they appear as the parameter value, defines the order in which the respective statements are sent to the database. If the first statement fails, the second will be executed. For -mode=insert,update to work properly a primary or unique key has to be defined on the table. SQL Workbench/J will catch any exception (=error) when inserting a record, then it will try updating the record, based on the specified key columns. The -mode=update,insert works the other way. First SQL Workbench/J will try to update the record based on the primary keys. If the DBMS signals that no rows have been updated, it is assumed that the row does not exist and the record will be inserted into the table. This mode is recommended when no primary or unique key is defined on the table, and an INSERT would always succeed.

The keycolumns defined with the -keycolumns parameter don't have to match the real primary key, but they should identify one row uniquely.

You cannot use the update mode, if the tables in question only consist of key columns (or if only key columns are specified). The values from the source are used to build up the WHERE clause for the UPDATE statement.

If you specify a combined mode (e.g.: update,insert) and one of the tables involved consists only of key columns, the import will revert to insert mode. In this case database errors during an INSERT are not considered as real errors and are silently ignored.

For maximum performance, choose the update strategy that will result in a succssful first statement more often. As a rule of thumb:

  • Use -mode=insert,update, if you expect more rows to be inserted then updated.

  • Use -mode=update,insert, if you expect more rows to be updated then inserted.

To use insert/update or update/insert with PostgreSQL, make sure you have enabled savepoints for the import (which is enabled by default).

21.8. Native UPSERT mode

When using a DBMS that supports an "update or insert" functionality directly, this can be selected using -mode=upsert. In this case SQL Workbench/J will only use a single statement instead of two statements as described in the previous chapter. The advantage of using this mode over e.g. insert,update is that fewer statements are sent to the database, and that this mode supports the use of batching, which is not possible when using insert,update.

For the following database systems, native UPSERT is available:

21.9. Native insertIgnore mode

The -mode=insertIgnore will use the built in feature of the DBMS to (silently) ignore inserts that would result in a violation of a unique key constraint but not update existing rows. Using -mode=insertIgnore has the same effect as using -mode=insert -continueOnError=true but will perform better (especially when many collisions are expected) because this can be combined with batching and it does not require the use of savepoints (e.g. for Postgres)

This mode is supported for the following DBMS: