IBM Data Platform Ideas Portal for Customers





Status: Future consideration
Created by Guest
Created on Mar 9, 2026

Use Query Logs for Lineage and Usage Analysis

Lineage collection today depends on explicitly defined data flows and relationships between data assets, which leaves gaps in the lineage graph. If IKC collected and parsed query logs during Metadata Imports, it could add lineage that is more dynamic and complete. It would also give a better view of consumption by end users and external systems that query the database. This information could fill in missing links in the lineage graph, and it could be used to rank the most frequently used data assets higher in search results.
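To make the request concrete, here is a minimal sketch of what deriving lineage edges and usage counts from query text could look like. It is an illustration only, not how IKC works today: the `QUERY_LOG` entries, table names, and the `analyze` helper are all hypothetical, and the regular expressions are a deliberately naive stand-in for a real SQL parser.

```python
import re
from collections import Counter

# Hypothetical query-log entries as (user, SQL text); real log formats vary by database.
QUERY_LOG = [
    ("etl_svc", "INSERT INTO sales_summary SELECT region, SUM(amount) FROM sales GROUP BY region"),
    ("analyst1", "SELECT * FROM sales_summary"),
    ("analyst2", "SELECT s.id FROM sales s JOIN customers c ON s.cust_id = c.id"),
    ("etl_svc", "CREATE TABLE top_customers AS SELECT * FROM customers"),
]

# Naive patterns (assumption): a production scanner would use a full SQL parser.
TARGET_RE = re.compile(r"\b(?:INSERT\s+INTO|CREATE\s+TABLE)\s+([\w.]+)", re.IGNORECASE)
SOURCE_RE = re.compile(r"\b(?:FROM|JOIN)\s+([\w.]+)", re.IGNORECASE)


def analyze(log):
    """Return (lineage edges, read counts per table) for a list of log entries."""
    edges = set()
    reads = Counter()
    for _user, sql in log:
        targets = [t.lower() for t in TARGET_RE.findall(sql)]
        sources = [s.lower() for s in SOURCE_RE.findall(sql)]
        for src in sources:
            reads[src] += 1  # usage signal: how often a table is read
            for tgt in targets:
                edges.add((src, tgt))  # lineage signal: src feeds tgt
    return edges, reads


edges, reads = analyze(QUERY_LOG)
print(sorted(edges))
print(reads.most_common())
```

The edge set is the part that could fill missing links in the lineage graph, while the read counts are the kind of signal that could feed search ranking.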

Needed By Quarter
  • Admin
    Caroline Fahrenkrog
    Mar 13, 2026

    Thanks for this idea, David. Two benefits you called out are huge:
    (1) getting lineage for runtime queries when the code is dynamic (where static analysis of dynamic code doesn’t work), and
    (2) gaining insight into actual usage patterns.

    For (1), we’ve used this approach with languages like SAS, where parsing execution logs has proven effective for getting lineage for dynamic programs. We are considering incorporating our SAS log-parsing utility into the product, and that approach could potentially expand to other technologies. Even without a built-in capability, customers can extract queries from the logs and submit them to scanners as manual inputs.
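As a rough sketch of the manual route mentioned above, the snippet below pulls data-moving statements out of a log and deduplicates them into a SQL script. The log layout is an assumption loosely modeled on a PostgreSQL-style `LOG:  statement:` line, and `extract_statements` and `to_sql_script` are hypothetical helpers. How a given scanner accepts manual input is product specific and is not shown here.

```python
import re

# Hypothetical log excerpt; the line layout is an assumption, not a documented format.
LOG_TEXT = """\
2026-03-01 10:00:01 UTC [101] LOG:  statement: INSERT INTO sales_summary SELECT region, SUM(amount) FROM sales GROUP BY region
2026-03-01 10:00:02 UTC [102] LOG:  connection received: host=10.0.0.5
2026-03-01 10:00:03 UTC [103] LOG:  statement: SELECT * FROM sales_summary
2026-03-01 10:05:01 UTC [101] LOG:  statement: INSERT INTO sales_summary SELECT region, SUM(amount) FROM sales GROUP BY region
"""

STATEMENT_RE = re.compile(r"LOG:\s+statement:\s+(.*)$")
# Keep only statements that write data, since those carry lineage.
WRITES_RE = re.compile(r"^\s*(INSERT|UPDATE|MERGE|CREATE)\b", re.IGNORECASE)


def extract_statements(log_text):
    """Return unique data-moving SQL statements in first-seen order."""
    seen = set()
    statements = []
    for line in log_text.splitlines():
        match = STATEMENT_RE.search(line)
        if not match:
            continue  # not a statement line (for example, a connection event)
        sql = match.group(1).strip().rstrip(";")
        if WRITES_RE.match(sql) and sql not in seen:
            seen.add(sql)
            statements.append(sql)
    return statements


def to_sql_script(statements):
    """Join statements into a single script, one terminated statement per line."""
    return "".join(f"{sql};\n" for sql in statements)


script = to_sql_script(extract_statements(LOG_TEXT))
print(script)
```

Deduplicating before submission matters in practice because the same scheduled statement typically appears in the log many times. This sketch only handles single-line statements.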

    For (2), we’ve spent considerable time exploring integration possibilities with our observability team in watsonx.data integration. The difference between design lineage and runtime lineage is a bit like the difference between knowing all the roads and knowing the traffic. Having both perspectives together would be extremely powerful. As we lean further into OpenLineage, we will also have opportunities to overlay runtime lineage with design lineage and build insights on top of that, even without a formal observability integration.

    It’s also worth noting that runtime lineage (whether from logs, OpenLineage, or vendor APIs like Databricks) has limitations. It only reflects what actually ran, not all possible dependencies defined in code or pipelines. We may know what ran last week or last month, but not what could run in the future. This incompleteness problem is particularly relevant to query logs, which are often extremely large and require sampling or narrow time windows for analysis. That makes them less reliable for use cases like impact analysis and generally insufficient for regulatory reporting. Design lineage, while describing how data could flow rather than when it did, provides more complete and deterministic dependency coverage.
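The trade-off described above can be illustrated with a small sketch that overlays the two edge sets. All table names and the `overlay` helper are hypothetical; the point is only that each category answers a different question.

```python
# Hypothetical edge sets as (source, target) pairs.
design_edges = {
    ("sales", "sales_summary"),
    ("customers", "top_customers"),
    ("sales", "archive_sales"),  # defined in a pipeline, but it did not run in the window
}
runtime_edges = {
    ("sales", "sales_summary"),
    ("tmp_stage", "sales_summary"),  # dynamic SQL that static analysis missed
}


def overlay(design, runtime):
    """Classify edges by whether design lineage, runtime lineage, or both report them."""
    return {
        "confirmed": design & runtime,     # defined and observed
        "design_only": design - runtime,   # could run, not observed in the sampled window
        "runtime_only": runtime - design,  # observed, missing from design lineage
    }


result = overlay(design_edges, runtime_edges)
for category, edge_set in result.items():
    print(category, sorted(edge_set))
```

Here the `design_only` edges are the ones impact analysis would miss if it relied on runtime lineage alone, and the `runtime_only` edges are the missing links that log-derived lineage could contribute.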

    We’re marking this idea as Future Consideration since it aligns with our broader direction, but we don’t have concrete roadmap items to share in the near term. Would love to discuss the topic!