While testing ibmwdp, I ran the following scenario:
1. Create a new project
2. Add a connection to the project for my Cloudant database
3. Add a "connected data" asset to my project, using my Cloudant database connection.
4. Initially, the Cloudant database has only one document, with the form:
{
  "_id": "fd781b5366aa03852221ceddac844d0f",
  "_rev": "7-0e94f68c016f78ae027ac41aefc50ff6",
  "field1": "abc123z",
  "field2": "def123z",
  "field3": "ghi123z",
  "field4": "jkl123z",
  "field5": "mno123z"
}
5. When I open the data asset to look at the preview, I see the data as I expect, split into 5 fields.
6. I then changed the document in the Cloudant database to:
{
  "_id": "fd781b5366aa03852221ceddac844d0f",
  "_rev": "8-01b056aaaf579e1c8c1ae24800ad079b",
  "field1": "abc123z-1",
  "field2": "def123z-2",
  "field3": "ghi123z-3",
  "field4": "jkl123z-4",
  "field5": "mno123z-5",
  "field6": "pqr123z-6"
}
7. I then refreshed the preview for the data asset.
At this point, I expected to see 6 fields, along with the updated values in the existing fields. I see the updated values, but not the new field. I spoke with Lena Wolfe, who said this is as designed: the schema is not refreshed for existing data assets.
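The result in step 7 is what one would see if the preview projects each fetched document onto a column list saved when the asset was created. The Python sketch below illustrates that assumption only; `stored_columns` and `render_row` are hypothetical names, not wdp internals.

```python
# Hypothetical: the column list captured when the data asset was created.
stored_columns = ["field1", "field2", "field3", "field4", "field5"]


def render_row(document, columns):
    """Project a document onto a fixed column list; unknown fields are dropped."""
    return {name: document.get(name) for name in columns}


# The document as it looks after the change in step 6.
updated_doc = {
    "_id": "fd781b5366aa03852221ceddac844d0f",
    "_rev": "8-01b056aaaf579e1c8c1ae24800ad079b",
    "field1": "abc123z-1",
    "field2": "def123z-2",
    "field3": "ghi123z-3",
    "field4": "jkl123z-4",
    "field5": "mno123z-5",
    "field6": "pqr123z-6",
}

row = render_row(updated_doc, stored_columns)
print(row["field1"])    # the refreshed value of an existing field
print("field6" in row)  # whether the new field made it into the preview
```

Under this assumption the values are always current because the documents are re-fetched, while the set of columns stays frozen until the stored list is rebuilt.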
I find it counter-intuitive that the data in the existing fields is refreshed, but new fields and other schema changes are not. I understand that automatically refreshing the schema may be undesirable for a large database. However, I think it would be useful to be able to explicitly refresh the schema by clicking a button on the Preview panel, or through one of the actions for the data asset. It would be even better if the tool could tell me that the schema has changed and that a schema refresh is recommended. I know that wdp only stores metadata, so it might be tricky to find a query that runs quickly enough to make this feasible. But at the very least, a way to trigger a schema refresh from the UI would be useful.
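As a rough illustration of the "schema has changed" hint, the Python sketch below compares a stored field list against the fields found in freshly fetched documents. The function name `schema_drift` and the shape of the stored schema are assumptions made for this example, not part of wdp.

```python
def schema_drift(stored_columns, documents):
    """Return the field names added or removed relative to the stored schema."""
    current = set()
    for doc in documents:
        current.update(doc.keys())
    stored = set(stored_columns)
    return {
        "added": sorted(current - stored),
        "removed": sorted(stored - current),
    }


# Hypothetical stored schema from when the asset was created.
stored = ["_id", "_rev", "field1", "field2", "field3", "field4", "field5"]

# Freshly fetched documents (here, the single updated document).
fresh_docs = [
    {
        "_id": "fd781b5366aa03852221ceddac844d0f",
        "_rev": "8-01b056aaaf579e1c8c1ae24800ad079b",
        "field1": "abc123z-1",
        "field2": "def123z-2",
        "field3": "ghi123z-3",
        "field4": "jkl123z-4",
        "field5": "mno123z-5",
        "field6": "pqr123z-6",
    }
]

drift = schema_drift(stored, fresh_docs)
if drift["added"] or drift["removed"]:
    print("Schema refresh recommended:", drift)
```

A check like this only needs the field names of the documents the preview already fetches, so it would not require an extra query in that case. Whether that holds for large databases is an open question.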
I've created a GitHub issue for the dev team to investigate: https://github.ibm.com/dap/dap-planning/issues/2706
Continuing my last comment, see how the data catalog preview shows one document with 8 fields and one with 7 fields, while the project preview shows both with 6 fields. Initially, when I created both the project data asset and the data catalog data asset, both documents had 6 fields. I then modified the documents in the Cloudant database to have 7 and 8 fields respectively. The data catalog data asset preview automatically shows the changes, while the project data asset preview does not. I have to recreate the project data asset to pick up the schema changes when working with projects, but not with data catalogs.
I confirmed that it is still happening after a browser refresh when using a project, and I confirmed that the flow in the data catalog definitely seems different from how it works with the project. See the data-catalog-preview and project-preview screen captures I attached.
The behavior described is not how it should work and is not the current design. We are currently not showing a sample of the data at all. It is always a full data retrieval, so the "schema" should always be fresh.
There is also no difference in how the preview is rendered for catalogs versus projects. My only thought is that we cache the results once a document/asset is viewed. You should only need to refresh the UI to see the latest data.
Can we confirm this is still happening after a browser refresh?
I'm following up with Lena and Dejan to see if this is really for DSX (projects). From the comments here, it sounds like Data Catalog actually handles schema refresh.
I worked with Mike offline a bit to see what he was doing differently. If you use the data catalog tool to add a connection, then connected data, the preview seems to work better. In my database that only has one document, if I add a new field to the document, all I need to do is refresh the preview page in the data catalog for that asset, and I see the new field.
My original problem still exists, though, when not using the Data Catalog and instead simply working in a Project, as described in my issue description. The fact that the preview tool works differently in the context of a project vs. the context of a data catalog is strange.
I have not tried to create a big enough db with enough documents to test Mike's assertion that the preview is only using a subset of the docs to generate the schema, but if it is, then that is likely an issue as well.
I just tried to recreate this issue and it works as you would like it to. When I create a data asset for Cloudant and view the schema, then update a document with a new field, go back to Data Catalog, and refresh, I see the new field. However, it seems that Data Catalog is just taking a sample of documents, so if the document that you update is not part of that sample, its schema won't be shown in the catalog preview. This idea came in under Cloudant, but it's the Data Catalog team who would own the experience and engineering for it. I'll therefore change ownership to the Data Catalog team, who can reach out to me if they have questions about how to use this in the context of the Cloudant API.
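The sampling concern can be illustrated with a small Python sketch. It assumes the schema is the union of field names over the first N documents; how Data Catalog actually samples is not stated in this thread, and `infer_schema` is a hypothetical helper written for this example.

```python
def infer_schema(documents, sample_size=None):
    """Union of field names over all documents, or over the first sample_size."""
    if sample_size is not None:
        documents = documents[:sample_size]
    fields = set()
    for doc in documents:
        fields.update(doc.keys())
    return fields


# 100 documents with the same shape, one of which gains a new field.
docs = [{"_id": str(i), "field1": "abc123z"} for i in range(100)]
docs[42]["field6"] = "pqr123z-6"

full = infer_schema(docs)                     # scans every document
sampled = infer_schema(docs, sample_size=10)  # scans only the first 10

print("field6" in full)
print("field6" in sampled)
```

In this setup the updated document falls outside the sample, so the sampled schema misses the new field while the full scan finds it. In a database with a single document, as in the original report, the sample always contains the updated document, which would explain why the new field shows up there.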