We invite you to shape the future of IBM, including product roadmaps, by submitting ideas that matter to you the most. Here's how it works:
Post your ideas
Post ideas and requests to enhance a product or service. Take a look at ideas others have posted and upvote them if they matter to you.
Post an idea
Upvote ideas that matter most to you
Get feedback from the IBM team to refine your idea
Help IBM prioritize your ideas and requests
The IBM team may need your help to refine the ideas, so they may ask for more information or feedback. The product management team will then decide whether they can begin working on your idea. If they can start during the next development cycle, they will put the idea on the priority list. Each team at IBM works on a different schedule: some ideas can be implemented right away, while others may be placed on a different schedule.
Receive notification on the decision
Some ideas can be implemented at IBM, while others may not fit within the development plans for the product. In either case, the team will let you know as soon as possible. In some cases, we may be able to find alternatives for ideas that cannot be implemented in a reasonable time.
Add Classifier Operation in Watson Knowledge Catalog
I have an opportunity at Honda for Knowledge Catalog in their back-office operations team. This team is charged with ad hoc reporting from a number of on-premises data sources (e.g., Oracle, SQL Server, Excel spreadsheets, Lotus databases). These reports require classifying data, among other basic ETL-like functions. As part of the classification step, they need to look through specific columns of data (e.g., vehicle feature) for specific attributes. Unfortunately, each division and organization in Honda uses slightly different terms for the same thing (e.g., 4 wheel drive is referred to as 4WD, 4-wd, awd, AWD, etc.). They often need to classify this data before they can process it further with tools like Watson Studio and Watson Analytics.
We would like a classifier ETL function added to WKC that lets the user supply a csv or xls file containing all the synonyms and their corresponding classifier. The synonyms need to allow for wildcards, so a syntax like regex would be of great value. In addition, the synonyms need to support multiple languages; in the included example, these are English and Japanese.
To improve the process, there needs to be a setting for using Thesaurus.com to enrich synonyms without having to list them all. For example, if I turn on Thesaurus.com for my classifier, then all the synonyms of each synonym entered in the csv file would be used in addition to the synonym specified: U.S. would automatically be extended to include U.S.A., United States of America, etc. If any of the additional synonyms from Thesaurus.com are found in the data set, they too are classified by the specified classifier.
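To make the request concrete, here is a minimal Python sketch of the csv-driven classifier described above. It is only an illustration of the desired behavior, not an existing WKC feature or API. The column names (`pattern`, `classifier`), the sample patterns, the Japanese term, and the function names `load_rules` and `classify` are all assumptions made for this example. The Thesaurus.com enrichment is not covered.

```python
import csv
import io
import re

# Hypothetical synonym file: one regex pattern per row, mapped to a classifier.
# The column names and the patterns themselves are assumptions for illustration.
SYNONYMS_CSV = r"""pattern,classifier
4[\s-]?wd,4WD
awd,4WD
四輪駆動,4WD
u\.?s\.?(a\.?)?,United States
"""

def load_rules(csv_text):
    """Compile each pattern as a case-insensitive regex paired with its classifier."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [(re.compile(row["pattern"], re.IGNORECASE), row["classifier"])
            for row in reader]

def classify(value, rules, default=None):
    """Return the classifier of the first rule whose pattern matches the whole value."""
    text = value.strip()
    for pattern, label in rules:
        if pattern.fullmatch(text):
            return label
    return default

rules = load_rules(SYNONYMS_CSV)

# A sample "vehicle feature" column with the inconsistent spellings from the use case.
column = ["4WD", "4-wd", "awd", "AWD", "四輪駆動", "FWD"]
labels = [classify(v, rules, default="unclassified") for v in column]
print(labels)
```

In this sketch the first matching row wins, and values that match no row fall back to a default. A real implementation would need to decide how to handle overlapping patterns and partial matches within longer text.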
As a nice-to-have, it might be valuable to offer a second classifier algorithm based on Watson Natural Language Classifier (NLC), where the user imports the same xls/csv file but it is loaded into NLC behind the scenes and the NLC machine learning model performs the classification. However, this NLC option is secondary to the primary need for the basic text and wildcard token matching outlined above.
I have demonstrated WKC to both Cisco and Fluor and they also have use cases requiring this same functionality.