IBM Data and AI Ideas Portal for Customers


Status Delivered
Workspace Knowledge Catalog
Created by Guest
Created on Dec 12, 2017

allow schema refresh for data assets in the UI

While testing ibmwdp, I ran the scenario:

1. Create a new project

2. Add a connection to the project for my Cloudant database

3. Add a "connected data" asset to my project, using my Cloudant database connection.

4. Initially, the Cloudant database has only one document, with the form:

{
  "_id": "fd781b5366aa03852221ceddac844d0f",
  "_rev": "7-0e94f68c016f78ae027ac41aefc50ff6",
  "field1": "abc123z",
  "field2": "def123z",
  "field3": "ghi123z",
  "field4": "jkl123z",
  "field5": "mno123z"
}

 

5. When I open the data asset to look at the preview, I see the data as I expect, split into 5 fields.

6. I then changed the document in the Cloudant database to:

{
  "_id": "fd781b5366aa03852221ceddac844d0f",
  "_rev": "8-01b056aaaf579e1c8c1ae24800ad079b",
  "field1": "abc123z-1",
  "field2": "def123z-2",
  "field3": "ghi123z-3",
  "field4": "jkl123z-4",
  "field5": "mno123z-5",
  "field6": "pqr123z-6"
}

 

7. I then refresh the preview for the data asset. 

 

At this point, I would expect to see 6 fields, along with the updates to the existing fields.  I see the latter, but not the former.  Speaking with Lena Wolfe, she mentioned that this was as designed, that the schema is not refreshed for existing data assets. 

 

I find it counter-intuitive that the data in the existing fields is refreshed, but new fields and other schema changes are not.  I understand that automatically refreshing the schema may be undesirable for a large database; however, I think it would be nice to be able to explicitly refresh the schema by clicking a button on the Preview panel, or through one of the actions for the data asset.  It would be even better if the UI could somehow tell me that the schema has changed and that a schema refresh is recommended.  I know that wdp only stores metadata, so it might be tricky to find a query that runs quickly enough to make that feasible.  But at the very least, a way to trigger a schema refresh from the UI would be useful.
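
To illustrate the kind of check I have in mind, here is a minimal sketch in Python. It assumes the asset's stored schema is just the set of field names seen when the asset was created. The helper names (`schema_of`, `schema_diff`) are hypothetical and are not part of wdp or the Cloudant API.

```python
# Hypothetical sketch: compare the field names stored for a data asset
# with the field names found in the current version of a document.

def schema_of(doc):
    """Return the user-defined field names, skipping Cloudant's _id/_rev."""
    return {key for key in doc if not key.startswith("_")}


def schema_diff(stored_fields, doc):
    """Report fields added to or removed from the stored schema."""
    current = schema_of(doc)
    stored = set(stored_fields)
    return {
        "added": sorted(current - stored),
        "removed": sorted(stored - current),
    }


# The two document versions from steps 4 and 6 above.
original = {
    "_id": "fd781b5366aa03852221ceddac844d0f",
    "_rev": "7-0e94f68c016f78ae027ac41aefc50ff6",
    "field1": "abc123z",
    "field2": "def123z",
    "field3": "ghi123z",
    "field4": "jkl123z",
    "field5": "mno123z",
}
updated = {
    "_id": "fd781b5366aa03852221ceddac844d0f",
    "_rev": "8-01b056aaaf579e1c8c1ae24800ad079b",
    "field1": "abc123z-1",
    "field2": "def123z-2",
    "field3": "ghi123z-3",
    "field4": "jkl123z-4",
    "field5": "mno123z-5",
    "field6": "pqr123z-6",
}

stored = schema_of(original)          # schema captured at asset creation
diff = schema_diff(stored, updated)   # schema seen at preview time
needs_refresh = bool(diff["added"] or diff["removed"])
print(diff, needs_refresh)
```

A check like this on the previewed rows would be enough to show a "schema refresh recommended" hint without a separate query against the database.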

 

  • Admin
    Susanna Tai
    Dec 14, 2018

    I've created a github issue for the dev team to investigate --> https://github.ibm.com/dap/dap-planning/issues/2706

  • Guest
    Dec 14, 2018

    Continuing my last comment: see how the data catalog preview shows one document with 8 fields and one with 7 fields, while the project preview shows both with 6 fields.  Initially, when I created both the project data asset and the data catalog data asset, both documents had 6 fields.  I then modified the documents in the Cloudant database to have 7 and 8 fields respectively.  The data catalog data asset preview automatically shows the changes, while the project data asset preview does not.  I have to recreate the project data asset to pick up the schema changes when working with projects, but not with data catalogs.

  • Guest
    Dec 14, 2018

    I confirmed that it is still happening after a browser refresh when using a project, and I confirmed that the flow in the data catalog definitely seems different from how it works with the project.  See the data-catalog-preview and project-preview screen captures I attached.

  • Guest
    Dec 14, 2018

    The behavior described is not how it should work and is not the current design.  We are currently not showing a sample of the data at all.  It is always a full data retrieval, so the schema should always be fresh.

     

    There is also no difference in how catalogs and projects are rendered.  My only thought is that we do cache the results once a document/asset is viewed.  You simply need to refresh the UI to see the latest data.

     

    Can we confirm this is still happening after a browser refresh?

  • Admin
    Susanna Tai
    Dec 14, 2018

    I'm following up with Lena and Dejan to see if this is really for DSX (projects).  From the comments here, it sounds like Data Catalog actually handles schema refresh.

  • Guest
    Dec 14, 2018

    I worked with Mike offline a bit to see what he was doing differently.  If you use the data catalog tool to add a connection, then connected data, the preview seems to work better.  In my database that only has one document, if I add a new field to the document, all I need to do is refresh the preview page in the data catalog for that asset, and I see the new field. 

    My original problem still exists, though, when I am not using the Data Catalog and am instead simply working in a Project, as described in my issue description.  The fact that the preview tool works differently in the context of a project vs. the context of a data catalog is strange.

    I have not tried to create a big enough db with enough documents to test Mike's assertion that the preview is only using a subset of the docs to generate the schema, but if it is, then that is likely an issue as well.

  • Guest
    Dec 14, 2018

    I just tried to recreate this issue and it works as you would like it to.  When I create a data asset for Cloudant and view the schema, then update a document with a new field, go back to Data Catalog, and refresh, I see the new field.  However, it seems that Data Catalog is just taking a sample of documents, so if the document that you update is not part of that sample, then its schema won't be shown in the catalog preview.  This idea came in under Cloudant, but it's the Data Catalog team who would own the experience and engineering for it; therefore I'll change ownership to the Data Catalog team, who can reach out to me if they have questions about how to use this in the context of the Cloudant API.
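
The sampling concern can be shown with a small Python sketch. It assumes, purely for illustration, that the preview infers the schema from the first few documents. The actual sampling strategy used by Data Catalog is not stated in this thread, and `infer_fields` is a hypothetical helper.

```python
# Hypothetical sketch: a schema inferred from a sample of documents can
# miss a field that exists only in a document outside the sample.

def infer_fields(docs):
    """Union of user-defined field names across the given documents."""
    fields = set()
    for doc in docs:
        fields.update(key for key in doc if not key.startswith("_"))
    return fields


# Ten documents with the same two fields; one of them gains a new field.
docs = [{"_id": str(i), "field1": "a", "field2": "b"} for i in range(10)]
docs[7]["field3"] = "c"

sampled = infer_fields(docs[:5])   # assumed sample: first five documents
full = infer_fields(docs)          # schema from a full retrieval
missing = sorted(full - sampled)   # fields the sampled preview would not show
print(missing)
```

With a single-document database, as in the original report, the sample always contains the updated document, so this effect would not explain the project preview behavior.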