Friday, July 10, 2015

Drag-and-Drop support in Spoon

If you've ever played around with Drag-n-Drop in Spoon, you probably know that you can drag a KTR, KJB, or XML file onto the canvas, and it will open that file (if it is a legal PDI artifact) in Spoon for editing.  Under the hood, this is accomplished with "FileListeners"; transformation and job versions of these listeners are registered automatically at Spoon startup.

However, did you know you can register your own FileListeners?  There is a pretty straightforward interface:

public interface FileListener {

  public boolean open( Node transNode, String fname, boolean importfile ) throws KettleMissingPluginsException;

  public boolean save( EngineMetaInterface meta, String fname, boolean isExport );

  public void syncMetaName( EngineMetaInterface meta, String name );

  public boolean accepts( String fileName );

  public boolean acceptsXml( String nodeName );

  public String[] getSupportedExtensions();

  public String[] getFileTypeDisplayNames( Locale locale );

  public String getRootNodeName();

}

You can implement this interface and it will be called at various points, including when a file is dragged onto the canvas. You can use this to add support for currently unsupported file types.  I will show how I implemented a quick CsvListener to add CSV drag-and-drop support to PDI. With this, if you drag a CSV file onto a transformation on the canvas, a "CSV file input" step will be added to the transformation, with the filename already filled in.

I didn't bother with doing a "Get Fields" automatically because I won't know if there's a header row, etc.  Plus, this is just a fun proof-of-concept; hopefully I/we will have a more robust Drag-n-Drop system in the future.

The trick is getting your FileListener registered with Spoon. There is no extension point directly for that purpose, but you can use a LifecycleListener plugin and implement the registration in your onStart() callback.
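
To make the register-then-dispatch idea concrete, here is a toy model in plain Java. It is not Spoon's actual code: SimpleFileListener and ToyHost are hypothetical stand-ins with a trimmed-down interface, and the first-match dispatch order is an assumption made purely for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: these types are hypothetical stand-ins, not PDI classes.
interface SimpleFileListener {
  boolean accepts(String fileName);
  boolean open(String fileName);
}

final class ToyHost {
  private final List<SimpleFileListener> listeners = new ArrayList<>();

  // Analogous in spirit to registering a listener once at startup.
  void addFileListener(SimpleFileListener listener) {
    listeners.add(listener);
  }

  // Offer the dropped file to each listener in registration order;
  // the first listener that accepts it gets to open it.
  boolean fileDropped(String fileName) {
    for (SimpleFileListener listener : listeners) {
      if (listener.accepts(fileName)) {
        return listener.open(fileName);
      }
    }
    return false; // no listener claimed the file
  }
}
```

The point of the sketch is only that registration happens once, up front, and that accepts() is what decides whether a listener ever sees a given file.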

To get this going quickly, I wrote the CsvListener in Groovy, and put that in a file called onStart.groovy. I did that so I could leverage my PDI Extension Point Scripting plugin (available on the Marketplace): I dropped my onStart.groovy file into plugins/pdi-script-extension-points/ and started Spoon.

The Groovy script is as follows, and is also available as a Gist:

import org.w3c.dom.*
import org.pentaho.di.core.*
import org.pentaho.di.core.exception.*
import org.pentaho.di.core.gui.*
import org.pentaho.di.core.plugins.*
import org.pentaho.di.trans.step.*
import org.pentaho.di.ui.spoon.*

class CsvListener implements FileListener {

  public boolean open( Node transNode, String fname, boolean importfile ) throws KettleMissingPluginsException {
    // Look up the "CSV file input" step plugin and create its meta object with default settings
    def csvInputPlugin = PluginRegistry.instance.findPluginWithName(StepPluginType, 'CSV file input')
    def csvInputMetaClass = PluginRegistry.instance.loadClass(csvInputPlugin)
    csvInputMetaClass.setDefault()
    def pid = PluginRegistry.instance.getPluginId(csvInputPlugin.pluginType, csvInputMetaClass)

    // StepMeta takes (id, name, meta); the name passed here is a placeholder that is replaced below
    def csv = new StepMeta(pid, 'CSV file input', csvInputMetaClass)
    csv.stepMetaInterface.setFilename(fname)

    // Name the step after the file (no directory, no extension), falling back to the plugin name
    csv.setName(fname?.substring(fname?.lastIndexOf(File.separator)+1, fname?.lastIndexOf('.')) ?: 'CSV file input')
    csv?.location = new Point(20,20)
    csv?.draw = true

    // Add the step to whichever transformation is active and refresh the canvas
    Spoon.instance.activeTransformation?.addStep(csv)
    Spoon.instance.activeTransGraph?.redraw()
    true
  }

  public boolean save( EngineMetaInterface meta, String fname, boolean isExport ) {
    false
  }

  public void syncMetaName( EngineMetaInterface meta, String name ) {
  }

  public boolean accepts( String fileName ) {
    // Compare the text after the last dot against the supported extensions
    Arrays.asList(getSupportedExtensions()).contains(fileName?.substring(fileName?.lastIndexOf('.')+1))
  }

  public boolean acceptsXml( String nodeName ) {
    false
  }

  public String[] getSupportedExtensions() {
    ['csv'] as String[]
  }

  public String[] getFileTypeDisplayNames( Locale locale ) {
    ['CSV'] as String[]
  }

  public String getRootNodeName() {
    null
  }
}

// Register the listener with the running Spoon instance
Spoon.instance.addFileListener(new CsvListener())

There's lots of PDI voodoo going on in the open() method; I won't explain it all here, as I intend to write a proper plugin to do this for various file types. I just wanted to (again) show off how powerful and flexible PDI can be, and have some fun hacking the APIs :)
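
One piece that can be looked at in isolation is the filename handling behind accepts() and the step name. Below is a minimal sketch in plain Java with no PDI dependencies; CsvFileNames and its methods are hypothetical helpers, not part of any PDI API. Unlike the script, it treats both '/' and '\' as path separators, ignores dots in directory names, and matches the extension case-insensitively; those are assumptions about what a more defensive version might want.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not part of PDI: filename handling for a CSV FileListener.
final class CsvFileNames {

  // Mirrors getSupportedExtensions() in the script.
  private static final List<String> SUPPORTED = Arrays.asList("csv");

  // Index of the last path separator ('/' or '\'), or -1 if there is none.
  private static int lastSeparator(String fileName) {
    return Math.max(fileName.lastIndexOf('/'), fileName.lastIndexOf('\\'));
  }

  // Text after the last dot of the final path segment, lower-cased; "" if there is no such dot.
  static String extensionOf(String fileName) {
    if (fileName == null) {
      return "";
    }
    int dot = fileName.lastIndexOf('.');
    return dot > lastSeparator(fileName)
        ? fileName.substring(dot + 1).toLowerCase(Locale.ROOT)
        : "";
  }

  // True when the file's extension is one of the supported ones.
  static boolean accepts(String fileName) {
    return SUPPORTED.contains(extensionOf(fileName));
  }

  // File name without directory or extension; the fallback is used when that would be empty.
  static String stepNameFor(String fileName, String fallback) {
    if (fileName == null) {
      return fallback;
    }
    int sep = lastSeparator(fileName);
    int dot = fileName.lastIndexOf('.');
    String base = dot > sep ? fileName.substring(sep + 1, dot) : fileName.substring(sep + 1);
    return base.isEmpty() ? fallback : base;
  }
}
```

With this sketch, a path such as /data/sales.2015/orders.csv is accepted and yields the step name orders, while /data/sales.2015/orders (no extension on the file itself) is rejected.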


Tuesday, July 7, 2015

Bring Your Own Marketplace

The Pentaho Data Integration (PDI) Marketplace is a great place to share your PDI/Kettle contributions with the community at large.  To add your plugin, you can pull down the marketplace.xml file (via our GitHub repo) and add your own entry, then submit a pull-request to have the entry added to the master Marketplace.

But did you know you could 'host' your own PDI Marketplace?  The Marketplace is designed to read in locations of marketplaces from anywhere you like, via a file at $KETTLE_HOME/.kettle/marketplaces.xml (where KETTLE_HOME can be your PDI/Kettle install directory and/or your user's home directory).  Here's an example file on Gist.

The file contains a list of marketplace entries, which are locations of various lists (aka marketplaces) of PDI plugins. The URLs provided are used to read Marketplace XML files, which contain the PDI Marketplace entries.

This is how I test incoming pull-requests for PDI Marketplace plugins. I use the marketplaces.xml file from the Gist link above, then check out the pull-request from GitHub. Then I start PDI, go to the Marketplace, find the proposed plugin, try to install it, open its dialog (if appropriate), then uninstall it (NOTE: restarts of PDI are required).  Of course, support for the plugin itself is (perhaps) available via the submitter. These details are available in the PDI Marketplace UI before installation, and all licensing, usage, etc. is provided by the submitter.

The benefit of having a marketplaces.xml is that you can decide the list of PDI plugins available for download.  If your clients have a marketplaces.xml that only points at your own repositories / locations for plugins, then you can control which plugins can be downloaded by those clients.  For developers (as I show above), you can use it for testing before submitting your pull-request.  For consultants / OEMs, you can decide which plugins should show up in the list.  This mechanism is very flexible and should support most use cases.

In closing, I personally review many of the PDI Marketplace entries (aka pull-requests in GitHub), so please let me know if you have any issues with announcing your plugin or otherwise contributing to our community.

- Matt