IBM Data and AI Ideas Portal for Customers



Status Delivered
Workspace Connectivity
Created by Guest
Created on Aug 6, 2019

Greenplum connector in DataStage: CSV instead of TEXT for external tables

When a field (varchar/text) in the source query (Postgres) contains a carriage return, we cannot use the Greenplum connector in DataStage.

Needed by Date Sep 1, 2019
  • Guest (Nov 30, 2020)

    Development of this enhancement is now complete. It is available as a patch (https://www.ibm.com/support/pages/apar/JR62978).

    You can open a support ticket at https://www.ibm.com/support/home/ to get access to the patch download.

  • Guest (Nov 18, 2020)

    Dear Virginie,

    Thanks a lot for the great news! We're looking forward to it.

    Any idea about the timing?

    Kind regards,

    Philippe

  • Guest (Nov 18, 2020)

    Dear Philippe,

    We are currently working on this enhancement request, so it should be part of the next release of the product.

    Have a great day.

  • Guest (Nov 17, 2020)

    Hi Virginie,

    Any news about the implementation of this idea?

    Kind regards,

    Philippe

  • Guest (Aug 4, 2020)

    Hi Virginie,

    For the Vlaamse Milieumaatschappij, this remains a very important improvement to the Greenplum Connector in DataStage.

    The problem can be described as follows:

    When a newline (linefeed, or carriage return + linefeed) is present in a text/varchar column of the source query, the load procedure using the Greenplum Connector aborts.

    This is caused by the command used to create the external table: the connector uses the TEXT format by default, and we cannot specify the QUOTE character. The <newline> character in the varchar column is therefore treated as the end-of-line delimiter rather than as part of the column value.

    CREATE [READABLE] EXTERNAL TABLE table_name
        ( column_name data_type [, ...] | LIKE other_table )
        LOCATION ('file://seghost[:port]/path/file' [, ...])
          | ('gpfdist://filehost[:port]/file_pattern[#transform=trans_name]' [, ...])
          | ('gpfdists://filehost[:port]/file_pattern[#transform=trans_name]' [, ...])
          | ('gphdfs://hdfs_host[:port]/path/file')
          | ('pxf://path-to-data?PROFILE[&custom-option=value[...]]')
          | ('s3://S3_endpoint[:port]/bucket_name/[S3_prefix]
               [region=S3-region] [config=config_file]')
        [ON MASTER]
        FORMAT 'TEXT'
              [( [HEADER]
                 [DELIMITER [AS] 'delimiter' | 'OFF']
                 [NULL [AS] 'null string']
                 [ESCAPE [AS] 'escape' | 'OFF']
                 [NEWLINE [ AS ] 'LF' | 'CR' | 'CRLF']
                 [FILL MISSING FIELDS] )]
            | 'CSV'
              [( [HEADER]
                 [QUOTE [AS] 'quote']
                 [DELIMITER [AS] 'delimiter']
                 [NULL [AS] 'null string']
                 [FORCE NOT NULL column [, ...]]
                 [ESCAPE [AS] 'escape']
                 [NEWLINE [ AS ] 'LF' | 'CR' | 'CRLF']
                 [FILL MISSING FIELDS] )]
            | 'AVRO'
            | 'PARQUET'
            | 'CUSTOM' (Formatter=<formatter_specifications>)
        [ ENCODING 'encoding' ]
        [ [LOG ERRORS] SEGMENT REJECT LIMIT count [ROWS | PERCENT] ]

    Workaround:

    As a workaround we replaced the <newline> character in the source query with a placeholder string we hope will not be used elsewhere, and substituted the <newline> back after the load.

    But this takes a lot of time and effort to implement, and it is error prone.
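
    A minimal sketch of that workaround, assuming a hypothetical placeholder token '@@NL@@' and made-up table and column names:

        -- In the source query, replace embedded newlines with the placeholder
        -- so the TEXT-format external table does not split the row:
        SELECT id,
               replace(replace(description, E'\r\n', '@@NL@@'),
                       E'\n', '@@NL@@') AS description
        FROM   source_table;

        -- After the load, restore the newlines in the target table:
        UPDATE target_table
        SET    description = replace(description, '@@NL@@', E'\n')
        WHERE  description LIKE '%@@NL@@%';

    The placeholder must never occur in real data, which is exactly why this approach is fragile.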

    Expected solution:

    The best solution would be to add new parameters to the Greenplum connector that expose options of the CREATE EXTERNAL TABLE statement or of the "gpfdist" command, so that every case we could encounter is covered.

    For example, we could specify the FORMAT (TEXT/CSV) and, with CSV, specify the QUOTE parameter so the <newline> is encapsulated (see the QUOTE option in the CSV branch of the syntax above).
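
    For illustration, with such options the connector could generate an external table along these lines (a hypothetical sketch; the table, host, and column names are made up):

        CREATE READABLE EXTERNAL TABLE ext_stage_table
        ( id integer, description varchar )
        LOCATION ('gpfdist://etlhost:8081/stage.csv')
        FORMAT 'CSV' (QUOTE '"' DELIMITER ',' NEWLINE 'LF');

    In CSV format a quoted field may contain embedded newlines, so a <newline> inside the "description" column no longer terminates the row.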

    Don't hesitate to contact me if something is not clear or if you need more information.

  • Guest (Aug 4, 2020)

    We still consider this a high priority.

  • Guest (Aug 4, 2020)

    Thanks for submitting this idea. It was submitted last year and tagged as urgent, so I'd like to know whether you were finally able to proceed with the ODBC option mentioned as a workaround, or whether you still consider this a high priority on your side.

    Thanks for any details you can provide.