Function Name |
Text File Data Source | ||||
Category |
Text | ||||
Description |
Reads data from a delimited or fixed-width format file (CSV etc.). | ||||
Inputs |
|
||||
Outputs |
|
||||
Properties |
|
When the Text Reader component is dragged onto a transform, a setup wizard is launched to guide you through the creation of the data source.
The first wizard page gives you the option of basing the reader on an existing data file or defining the data source from scratch.
If you chose to infer the structure from an existing file, the next page allows you to define the field delimiters.
There are two forms of file: delimited and fixed width. Delimited files use a special character to separate fields; fixed-width files pad each field so that a given column always takes up the same amount of space.
Some files also contain the column names on the first row.
Select the options that are applicable and press Next.
If your file is delimited, the next screen allows you to define the rules used to read each field.
The delimiters separate the fields. It is possible to select a number of delimiters, but in practice a file normally uses only a tab or a comma.
The "Fields are enclosed in quotes" property means that if the first character within a field is a quote, the value is considered to be everything up to the next quote. This makes it possible for a field to contain the delimiter character. If the field also needs to contain a quote character, two quote characters in a row are interpreted as a single quote.
Data | Field 1 Value | Field 2 Value | Field 3 Value | Field 4 Value |
---|---|---|---|---|
abc,123,xyz | abc | 123 | xyz | |
abc,"123",xyz | abc | 123 | xyz | |
abc,"1,23",xyz | abc | 1,23 | xyz | |
abc,1,23,xyz | abc | 1 | 23 | xyz |
abc,"1""23",xyz | abc | 1"23 | xyz | |
abc,"1"23",xyz | ERROR | | | |
abc,1"23,xyz | abc | 1"23 | xyz | |
abc,"1""2,3",xyz | abc | 1"2,3 | xyz | |
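The quoting rule in the table above can be sketched in a few lines of Python. This is only an illustration of the described behaviour, not the component's actual implementation: the function name `split_fields` is hypothetical, and treating any text directly after a closing quote as an error is an assumption inferred from the ERROR row.

```python
def split_fields(line, delimiters=",", quote='"'):
    """Split one line into fields using the quoting rule described above."""
    fields = []
    i, n = 0, len(line)
    while True:
        if i < n and line[i] == quote:
            # Quoted field: the value runs up to the closing quote.
            i += 1
            buf = []
            while True:
                if i >= n:
                    raise ValueError("unterminated quoted field")
                if line[i] == quote:
                    if i + 1 < n and line[i + 1] == quote:
                        # Two quotes in a row stand for one literal quote.
                        buf.append(quote)
                        i += 2
                        continue
                    i += 1  # step past the closing quote
                    break
                buf.append(line[i])
                i += 1
            # Assumption: only a delimiter or end of line may follow.
            if i < n and line[i] not in delimiters:
                raise ValueError("unexpected text after closing quote")
            fields.append("".join(buf))
        else:
            # Unquoted field: read up to the next delimiter; a quote that
            # is not the first character is ordinary data.
            start = i
            while i < n and line[i] not in delimiters:
                i += 1
            fields.append(line[start:i])
        if i >= n:
            break
        i += 1  # skip the delimiter
    return fields
```

Passing a string of several characters as `delimiters` (for example `",\t"`) treats each of them as a separator, which mirrors selecting more than one delimiter in the wizard.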
If you selected "Fixed Width" on the delimiter page, the delimiter setup page is next. This allows you to determine graphically where the breaks between the fields are: click on the display area where a delimiter should be, and if you make a mistake, click it again to remove it.
Here you can see the delimiters have been added at the appropriate positions.
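Conceptually, the break positions chosen on this page turn each line into a set of slices. A minimal Python sketch of that idea follows; the function name `split_fixed_width` and the example offsets are hypothetical and are not taken from the product.

```python
def split_fixed_width(line, breaks):
    """Cut a line at the given character offsets (the field breaks)."""
    # None as the final edge means "to the end of the line".
    edges = [0, *sorted(breaks), None]
    return [line[start:end] for start, end in zip(edges, edges[1:])]


# Breaks placed after the 5th and 10th characters.
fields = split_fixed_width("abc  12345xyz", [5, 10])
```

Note that the padding is kept in each slice (`"abc  "` in the example), so a trimming step is usually applied afterwards.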
If your source data contained column headings, you may well be finished at this stage. However, if the data contains no column information, or you don't like the names in the file, you can change them on this page.
Simply select the column to edit and set the name (if you are working on a fixed-width file, you also get the option to set the column width).
Once you press Finish, the new Text Data Source is added to the transform and can be used like any other data source.
The name of the component; it must be unique within the transform.
The encoding used to decode the data read from the file into text if the encoding is not specifically identified within the file (see BOM).
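As an illustration of how a byte order mark (BOM) can take precedence over a configured encoding, here is a minimal Python sketch. It follows the common convention for BOM detection; the component's exact behaviour is not documented here, and the names `decode_text` and `fallback_encoding` are hypothetical.

```python
import codecs

# Longer BOMs are checked first: the UTF-32 LE BOM begins with the same
# two bytes as the UTF-16 LE BOM.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def decode_text(raw: bytes, fallback_encoding: str) -> str:
    """Decode file bytes, preferring a BOM over the configured encoding."""
    for bom, name in _BOMS:
        if raw.startswith(bom):
            # The file identifies its own encoding; drop the BOM itself.
            return raw[len(bom):].decode(name)
    # No BOM present: use the encoding set on the component.
    return raw.decode(fallback_encoding)
```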
If checked, the first line in the data file is ignored, as it is assumed to contain the names of the columns.
The transform does not stop when an invalid row is encountered; it ignores the row and moves on to the next line.
If checked, leading and trailing whitespace is removed from the field value, e.g.
Col1 , Col2 , Col3
If checked, the values are "Col1", "Col2", "Col3".
If unchecked, the values are "Col1 ", " Col2 ", " Col3".
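The same effect can be reproduced in Python with the example line above. This is only a sketch of the described behaviour, using plain string methods rather than anything from the product.

```python
line = "Col1 , Col2 , Col3"

# Unchecked: the values are kept exactly as they appear between delimiters.
untrimmed = line.split(",")

# Checked: leading and trailing whitespace is removed from each value.
trimmed = [value.strip() for value in untrimmed]
```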
When checked, quotes are considered to be enclosing characters, e.g.
"col 1 has a comma, which would normally be seen as a separator", col 2
By enclosing column 1 in quotes, the ',' is treated as data, not a separator.
All the selected delimiters are treated as column separators unless they are enclosed in quotes (see Fields are enclosed in quotes).
Allows the column definitions to be manually edited.
If the reader component does not have a connector to 'Filename' or 'Text Source', the filename is read from this property.
If the connection point 'Filename' or 'Text Source' is connected, this property is ignored (and can be blank).
Indicates that the filename is to be resolved relative to the data mapping file (.dm). If the transform is compiled, the filename is resolved relative to the compiled exe.
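The relative resolution described above can be pictured as joining the filename onto the folder that holds the mapping file (or the compiled exe). A minimal Python sketch follows; the function name, its parameters, and the example paths are hypothetical, and leaving absolute paths unchanged is an assumption.

```python
from pathlib import Path


def resolve_data_file(filename, base_file, relative=True):
    """Resolve filename against the folder containing base_file.

    base_file stands for the .dm mapping file, or the compiled exe.
    """
    path = Path(filename)
    if relative and not path.is_absolute():
        return Path(base_file).parent / path
    # Assumption: absolute paths, or relative resolution switched off,
    # leave the filename as given.
    return path


resolved = resolve_data_file("data/in.csv", "/projects/demo/mapping.dm")
```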