
Writing a Parser Plugin for Feeds

05.09.2013
Author:

For web developers, importing content is often far from trivial. Writing an import routine from scratch that covers each and every case is rarely practical, so we advise sticking to solutions already in use, such as Migrate or Feeds.

Let us take a close look at the Feeds module. Its architecture is built around plugins. The main types are:
Fetcher: supplies the raw data for further parsing and import;
Parser: parses the data and builds the array of items to be imported;
Processor: takes the parsed items and saves them as Drupal entities such as nodes, users or taxonomy terms.
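
The division of labour between the three plugin types can be sketched in plain PHP. This is only an illustration that runs outside Drupal: the three closures and the sample data are hypothetical stand-ins, not Feeds APIs.

```php
<?php
// Hypothetical stand-ins for the three plugin roles; none of this is Feeds code.

// Fetcher: returns the raw document as a string.
$fetch = function () {
  return "Apple,3\nPear,5\n";
};

// Parser: turns the raw string into an array of items with named keys.
$parse = function ($raw) {
  $items = array();
  foreach (explode("\n", trim($raw)) as $line) {
    list($title, $price) = explode(',', $line);
    $items[] = array('title' => $title, 'price' => $price);
  }
  return $items;
};

// Processor: consumes the items. Here it only formats them, whereas a real
// processor would create or update nodes, users or terms.
$process = function (array $items) {
  $saved = array();
  foreach ($items as $item) {
    $saved[] = $item['title'] . ' (' . $item['price'] . ')';
  }
  return $saved;
};

$result = $process($parse($fetch()));
print implode(', ', $result) . "\n";
```

Because each stage only depends on the output of the previous one, a single stage, such as the parser in this article, can be replaced without touching the other two.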

This article describes how to write a parser plugin. We are going to import an XML file produced by Views Data Export (VDE). One might object that there is no point in writing a plugin, since Feeds provides its own plugin for XML parsing out of the box. That is true. However, to get it working, one has to make some major changes to the VDE output. As it is, it takes far less effort to write a parser of your own. We start with the module's .info file:

name = Custom Parser
description = Contains feeds plugins for XML import.
core = 7.x
version = 7.x-1.0
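files[] = CustomParserXML.inc
; The parser class extends FeedsParser, so the Feeds module must be enabled.
dependencies[] = feeds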

Next, we implement hook_feeds_plugins() in the module file, where we describe our parser plugin:

/**
 * Implements hook_feeds_plugins().
 */
function custom_parser_feeds_plugins() {
  return array(
    'CustomParserXML' => array(
      'name' => t('Custom XML parser'),
      'description' => t('Parses XML as we want.'),
      'handler' => array(
        'parent' => 'FeedsParser',
        'class' => 'CustomParserXML',
        'file' => 'CustomParserXML.inc',
      ),
    ),
  );
}

The keys of the definition array are fairly self-explanatory, so we will not dwell on them. Now for the most interesting part: the parser itself. Every parser extends the FeedsParser class.

/**
 * Parses a given file as an XML file.
 */
class CustomParserXML extends FeedsParser {

  /**
   * Implements FeedsParser::parse().
   */
  public function parse(FeedsSource $source, FeedsFetcherResult $fetcher_result) {
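    // Parses the raw XML string into a SimpleXMLElement.
    @ $xml = simplexml_load_string($fetcher_result->getRaw(), NULL, LIBXML_NOERROR | LIBXML_NOWARNING | LIBXML_NOCDATA);
    // Got a malformed XML: raise an exception so that the import is aborted
    // instead of FALSE being handed on to the processor.
    if ($xml === FALSE || is_null($xml)) {
      throw new Exception(t('The fetched document could not be parsed as XML.'));
    }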
    $items = array();
    foreach ($xml->node as $node) {
      // Object to array conversion. A fresh array is built so that keys which
      // are already normalized are not lost.
      $item = array();
      foreach ((array) $node as $key => $value) {
        // We don't want to work with empty data.
        if (!$value) {
          continue;
        }
        // Converts keys to lower case and hyphens to spaces for consistency.
        $item[drupal_strtolower(str_replace('-', ' ', $key))] = $value;
      }
      $items[] = $item;
    }
    return new FeedsParserResult($items, $source->feed_nid);
  }

  /**
   * Overrides parent::getSourceElement() to look up items by lower-case keys.
   */
  public function getSourceElement(FeedsSource $source, FeedsParserResult $result, $element_key) {
    return parent::getSourceElement($source, $result, drupal_strtolower($element_key));
  }
} 
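
To see what kind of items this logic produces, the key normalization can be tried outside Drupal. The following is a minimal sketch under a few assumptions: the sample XML and its tag names are invented for illustration, strtolower() stands in for drupal_strtolower(), and custom_parser_demo_lookup() is a hypothetical helper that mimics the getSourceElement() override.

```php
<?php
// Hypothetical sample resembling an exported view; the tag names are made up.
$raw = <<<XML
<nodes>
  <node>
    <Title>First post</Title>
    <Post-date>2013-09-05</Post-date>
    <Body></Body>
  </node>
  <node>
    <title>Second post</title>
    <Post-date>2013-09-06</Post-date>
  </node>
</nodes>
XML;

$xml = simplexml_load_string($raw, 'SimpleXMLElement', LIBXML_NOERROR | LIBXML_NOWARNING | LIBXML_NOCDATA);

$items = array();
foreach ($xml->node as $node) {
  $item = array();
  foreach ((array) $node as $key => $value) {
    // Empty elements such as <Body></Body> are skipped.
    if (!$value) {
      continue;
    }
    // Lower-case the key and replace hyphens with spaces.
    $item[strtolower(str_replace('-', ' ', $key))] = $value;
  }
  $items[] = $item;
}

// Hypothetical helper: lower-cases the mapping source before the lookup.
function custom_parser_demo_lookup(array $item, $element_key) {
  $key = strtolower($element_key);
  return isset($item[$key]) ? $item[$key] : '';
}

print custom_parser_demo_lookup($items[0], 'Post date') . "\n";
```

Both "Title" and "title" end up under the key "title", and a mapping source typed as "Post date" still finds the value stored under "post date".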

Our own parser is now up and running. It can be selected on the importer settings page at example.com/admin/structure/feeds/[importer_name]/parser.

You can dig deeper into how plugins work here. Attached to this article are an archive with the parser module and the importer feature with a view. Thank you for your time.
