IBM Data Platform Ideas Portal for Customers



Status Functionality already exists
Workspace Watson Studio
Created by Guest
Created on Dec 12, 2019

Remove Livy from WS, or provide an option to run WS notebooks without Livy

Based on the feedback from our DSX users - 

We would love to stop using Livy for DSX Jupyter notebooks.

There are many disadvantages of using Livy:

  • Code execution is synchronous: you can’t see results while a cell is running
  • Direct Spark connectivity can show output while a job is running (model build progress, Spark job progress, etc.); this is not possible when using Livy/sparkmagic
  • There are a ton of bugs and functional deficiencies in Livy and sparkmagic
    • probably half, if not more, of our support cases revolve around these two components
  • We noticed it’s extremely confusing to some users that their notebooks actually have two independent Python processes running
    (the Jupyter notebook itself is one, and for %%spark it’s a remote Spark driver). When we remove Livy/sparkmagic and run the Spark driver locally, as we currently do in another notebook solution, it’s all very simple
  • Some Jupyter extensions don’t work with sparkmagic
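
To illustrate the first point, here is a minimal sketch of the submit-and-poll pattern a Livy client follows, assuming the statement endpoints of Livy's REST API (POST /sessions/{id}/statements, then GET on the statement until its state is "available"). The `Transport` callable and `make_fake_livy` are hypothetical stand-ins so the example runs without a server; this is not sparkmagic's actual code.

```python
import time
from typing import Callable, List

# Hypothetical transport: takes (method, path, body) and returns decoded JSON.
# A real one would wrap an HTTP client pointed at the Livy server.
Transport = Callable[[str, str, dict], dict]


def run_statement(send: Transport, session_id: int, code: str,
                  poll_seconds: float = 0.0) -> str:
    """Submit code to a Livy session and block until the statement finishes."""
    stmt = send("POST", f"/sessions/{session_id}/statements", {"code": code})
    path = f"/sessions/{session_id}/statements/{stmt['id']}"
    # The text output only comes back once the statement has completed,
    # so the caller sees nothing while the remote driver is still working.
    while stmt["state"] in ("waiting", "running"):
        time.sleep(poll_seconds)
        stmt = send("GET", path, {})
    return stmt["output"]["data"]["text/plain"]


def make_fake_livy(states: List[str]) -> Transport:
    """In-memory stand-in for a Livy server that replays the given states."""
    remaining = list(states)

    def send(method: str, path: str, body: dict) -> dict:
        state = remaining.pop(0)
        output = None
        if state == "available":
            output = {"status": "ok", "data": {"text/plain": "42"}}
        return {"id": 0, "state": state, "output": output}

    return send


fake = make_fake_livy(["waiting", "running", "running", "available"])
result = run_statement(fake, session_id=0, code="6 * 7")
print(result)  # prints 42
```

With a locally running driver there is no such polling loop: the notebook kernel is the driver process, so whatever it prints appears in the cell as it is produced.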

 

There was one small benefit of using Livy: the Spark driver runs remotely in yarn-cluster mode.
In the case of DSX, however, we don’t have issues with running Spark drivers locally, as those run in the K8S cluster,
so we’re not limited on resources. For interactive Spark applications like Jupyter, intermediate
components like Livy/sparkmagic don’t add any value, but they do introduce complexity and a lot of bugs.

Notice, for example, that “sparkmagic” is still considered an incubating Jupyter project, as it doesn’t have a
good community/userbase. Apache Livy also lacks a good release cadence, doesn’t have a strong community,
and is also considered an “incubating” Apache project.

Based on the feedback from our users, we would love to have an option to run Jupyter Spark notebooks
without Livy/sparkmagic, where the Spark driver runs locally in the Jupyter container. I think you were checking
whether DSX 2.1 already removed Livy when DSX migrated over to Enterprise Gateway; please confirm.

From what we can tell, Livy/sparkmagic is a major obstacle to the success of our DSX deployment
and to wider adoption of WSL in our organization.

  • Guest
    Dec 19, 2019

    Thank you for confirming, Snehal!

    In the case of JEG, does the Spark driver run locally in the same pod as the Jupyter process,
    or does it run in some other pod? In other words, is this the yarn-client or the yarn-cluster spark-submit mode? A link to documentation would be awesome to have.

    Ruslan

  • Guest
    Dec 19, 2019

    Hello Ruslan, confirmed with the development team: JEG doesn't involve Livy.

  • Guest
    Dec 18, 2019

    Hello Snehal - 


    Thank you for those details. 

    Based on this, it seems the only option that will work for us is 

    WSL 2.1/CPD 2.5

    • Spark running on Hadoop via JEG

    Can you please confirm that "Hadoop via JEG" doesn't involve running Livy in any way? E.g., we want to make sure JEG doesn't run or use Livy itself.

    We want to completely exclude Livy from the equation.

    Thanks!
    Ruslan

  • Guest
    Dec 18, 2019

    We will continue to support Livy in WS 2.1, but we have also introduced JEG.

    For Spark execution, users have the following options:
    WSL 1.2.3 / CPD 2.1

    • Spark running in local mode within the Jupyter pod

    • Spark running in cluster mode within the WSL/CPD cluster

    • Spark running on Hadoop via Livy

    WSL 2.1/CPD 2.5

    • Spark running in local mode within the Jupyter pod

    • Hummingbird spark

    • Spark running on Hadoop via Livy

    • Spark running on Hadoop via JEG

    Please confirm whether this is helpful for your use case.