We need further enhancements to add partitions contingent on the row count of a table or view. When a table is exceedingly large (e.g. millions or billions of rows), the system should look for primary and secondary keys, especially date-related ones, and add WHERE conditions that restrict the data being profiled to a more manageable size (e.g. the last 30 days' worth of data).
If this is not done, our IDs will be blocked for excessive use of CPU, memory, and spool space.
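To make the request concrete, here is a minimal sketch of the kind of conditional filtering being asked for. It is not how the product works today; the function name, the `row_threshold` default, and the table and column names are all hypothetical, and the date literal uses ANSI-style syntax that may need adjusting for the actual source.

```python
from datetime import date, timedelta


def build_profiling_query(table, row_count, date_column=None,
                          row_threshold=1_000_000, days=30, today=None):
    """Return a SELECT for profiling; add a date filter only for large tables."""
    base = f"SELECT * FROM {table}"
    # Small tables, or tables with no usable date column, are profiled in full.
    if row_count <= row_threshold or date_column is None:
        return base
    cutoff = (today or date.today()) - timedelta(days=days)
    # ANSI-style date literal (assumption); adjust to the dialect of the source.
    return f"{base} WHERE {date_column} >= DATE '{cutoff.isoformat()}'"


# Hypothetical table and column names, for illustration only.
print(build_profiling_query("txn_log", 500))
print(build_profiling_query("txn_log", 5_000_000_000, "txn_date"))
```

The row count passed in is assumed to come from catalog statistics rather than from a `COUNT(*)` over the table, since a full count would itself be expensive on tables of this size.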
Needed By: Week
Per David:
The issue was that the initial query within an MDI and/or MDE process pulled all of the rows. The request is for the system not to pull all rows during configured jobs for very large tables: it should look through a system table for generic information only, and when data is pulled for loading, it should request only a subset of rows.
What VZ ended up doing was separating the connections that target the same Teradata instance: one connection picked up the very large tables, while a different one was filtered so that it did not see the large tables.
SQL Query Assets can be made to achieve the goal, but they require manual intervention to create such a partition, and the initial scan of the whole asset might still be done. What is desired is that the system not pull the entire table when the number of rows is large.
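The metadata-first behavior requested above can be sketched as follows. This is an illustration under stated assumptions, not the product's implementation: it runs against an in-memory SQLite database, `table_stats` is a hypothetical stand-in for the source's system catalog, and `LIMIT` is SQLite syntax (other databases, Teradata included, use their own row-limiting clauses).

```python
import sqlite3


def fetch_for_profiling(conn, table, max_rows=1000):
    """Read the row count from a stats table, then pull at most max_rows rows if large."""
    if not table.isidentifier():
        raise ValueError(f"unexpected table name: {table!r}")
    # Metadata lookup only: no scan of the data table itself.
    row = conn.execute(
        "SELECT row_count FROM table_stats WHERE name = ?", (table,)
    ).fetchone()
    estimated = row[0] if row else None
    if estimated is not None and estimated <= max_rows:
        return conn.execute(f"SELECT * FROM {table}").fetchall()
    # Large or unknown size: request a bounded subset (LIMIT is SQLite syntax).
    return conn.execute(f"SELECT * FROM {table} LIMIT ?", (max_rows,)).fetchall()


# Demo data; txn_log and table_stats are hypothetical names.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE txn_log (id INTEGER, amount REAL)")
conn.executemany("INSERT INTO txn_log VALUES (?, ?)",
                 [(i, i * 1.5) for i in range(5000)])
conn.execute("CREATE TABLE table_stats (name TEXT, row_count INTEGER)")
conn.execute("INSERT INTO table_stats VALUES ('txn_log', 5000)")

sample = fetch_for_profiling(conn, "txn_log", max_rows=100)
print(len(sample))
```

Treating an unknown row count the same as a large one is a deliberately conservative choice here: a table with no statistics is never pulled in full.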
Use case:
We have large tables with millions or even billions of rows (e.g. transaction logs). When these are profiled without any row-based filtering, the jobs fail or consume too many resources. If we could add conditions such as "last 30 days" using date fields, it would help reduce load, avoid timeouts, and prevent our IDs from getting blocked due to high CPU and memory usage.
Hi Vamshi, This Aha item was spawned from ticket TS014403634.
The "SQL Query asset capability" introduced in IKC v4.8.x or v5.x should address this requirement.
Reference: https://www.ibm.com/docs/en/cloud-paks/cp-data/5.1.x?topic=project-adding-dynamic-view-data-from-connection
Hi Vamshi, the SQL Query asset capability should address this use case.
A query asset is a dynamic view of a data asset that is created based on an SQL query. Such a data asset can contain data from one or more tables in a single data source, or a subset of rows from a single table.
With dynamic views, you have these options:
You can create a data asset that contains a subset or a superset of columns based on column selection, an explicit set of rows based on a conditional expression, or a combination of both.
You can split a table into several smaller data assets based on the values in a selected column.
https://www.ibm.com/docs/en/cloud-paks/cp-data/5.1.x?topic=project-adding-dynamic-view-data-from-connection
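As an illustration of the second option above (splitting a table by the values in a selected column), here is a minimal sketch. It is not the product's implementation of dynamic views; it uses an in-memory SQLite database, and `split_by_column`, `orders`, and `region` are hypothetical names. It assumes a low-cardinality column, and NULL values are not handled.

```python
import sqlite3


def split_by_column(db, table, column):
    """Return one (value, sql, params) triple per distinct value in `column`."""
    for name in (table, column):
        if not name.isidentifier():
            raise ValueError(f"unexpected identifier: {name!r}")
    # One pass over the column to find the distinct values to split on.
    values = [r[0] for r in db.execute(
        f"SELECT DISTINCT {column} FROM {table} ORDER BY {column}")]
    # Each triple describes one smaller, value-specific query.
    return [(v, f"SELECT * FROM {table} WHERE {column} = ?", (v,)) for v in values]


# Demo data with hypothetical names.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE orders (id INTEGER, region TEXT)")
db.executemany("INSERT INTO orders VALUES (?, ?)",
               [(1, "EMEA"), (2, "APAC"), (3, "EMEA"), (4, "AMER")])

parts = split_by_column(db, "orders", "region")
for value, sql, params in parts:
    print(value, len(db.execute(sql, params).fetchall()))
```

Note that finding the distinct values still reads the split column once, which is part of why the requesters describe this approach as needing manual setup for very large tables.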
Could you please verify this and confirm? If it doesn't address the use case, we will need an example and a use case.
The current status is marked as "Need more information", but there is no clarification on what specific details are required. Could you please let us know what particular information you are looking for so that we can provide it?