Difference between revisions of "HowtoWriteAGrabber"

Revision as of 11:32, 1 September 2010

This page describes how to write a new grabber that works in the same way as the other grabbers.

It applies to Xmltv version 0.5.43 and later.

The Importance of Behaving Properly

For the Xmltv-grabbers to be useful to as many people as possible, it is very important that all grabbers work in the same way. It might be tempting to take a shortcut when implementing a grabber and only implement the features that you think that you need right now, or add a special mode of operation that suits your specific grabber. However, grabbers are typically used from lots of different applications and we cannot expect all these applications to know about all different grabbers and their "quirks". Therefore, the XmltvCapabilities page describes exactly what an application can expect from a grabber and HowtoUseGrabbers describes how grabbers shall be used. It is important that your grabber behaves as described on these pages. Otherwise, there is a risk that applications will not support your particular grabber unless someone adds specific support for your grabber to the application.

This page describes how to write a grabber that adheres to all these rules. I hope that by reading this page, you will see that it is actually fairly easy to write a compliant grabber.

The XMLTV::Options module

The XMLTV::Options module is a new addition to the Xmltv framework and most older grabbers do not currently use it. It is used by tv_grab_se_swedb and tv_grab_cz, and my hope is that more and more grabbers will start to use it.

The XMLTV::Options module provides a single method called ParseOptions. It implements almost all code needed for a grabber apart from the actual "grabbing". It implements the --help, --version, --capabilities, --description options for the grabber and it handles the --configure and --configure-api options by calling callbacks supplied by the grabber. Furthermore, it handles the --output option by redirecting STDOUT to the correct file.

A skeleton for a grabber can look like this:

#!/usr/bin/perl -w

=pod

Your documentation here... Copy from tv_grab_cz for example.

=cut

use strict;
use XMLTV::Options qw/ParseOptions/;
use XMLTV::Configure::Writer;   # needed by config_stage below

my( $opt, $conf ) = ParseOptions( { 
     grabber_name => "tv_grab_cz",
     capabilities => [qw/baseline manualconfig apiconfig/],
     stage_sub => \&config_stage,
     listchannels_sub => \&list_channels,
     version => '$Id: HowtoWriteAGrabber.txt,v 1.9 2006/04/24 17:38:45 MattiasHolmlund Exp ed $',
     description => "The Czech Republic (www.webstep.net)",
} );

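# Get the actual data and print it to stdout.
# grab_data is a placeholder for your own code; it must return true on success.
my $is_success = grab_data( $conf, $opt );

if( $is_success ) {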
     exit 0;
}
else {
     exit 1;
}

sub config_stage
{
     my( $stage, $conf ) = @_;

     # Sample stage_sub that only needs a single stage.

     die "Unknown stage $stage" if $stage ne "start";

     my $result;
     my $writer = XMLTV::Configure::Writer->new( OUTPUT => \$result,
                                                 encoding => 'iso-8859-1' );
     $writer->start( { grabber => 'tv_grab_cz' } );
     $writer->write_string( {
          id => 'root-url', 
          title => [ [ 'Root URL for grabbing data', 'en' ] ],
          description => [ 
            [ 'The file at this URL describes which channels are available and ' .
              'where data can be found for them. ', 'en' ] ],
          default => $default_root_url,
      } );

     $writer->end( 'select-channels' );

     return $result;
}

sub list_channels
{
     my( $conf, $opt ) = @_;

     # Return a string containing an xml-document with <channel>-elements
     # for all available channels.

     return $str;
}

To turn the above code into a working grabber, you need to fill in the list_channels and config_stage subroutines with actual code and add code that does the actual grabbing and prints xmltv-data to stdout. Everything else is handled by ParseOptions.

Writing a stage_sub

The stage_sub shall return an xml-string that describes what information the grabber needs from the user in order to grab data. The xml-string shall be in the format described in the XmltvConfigurationDtd. The stage_sub (config_stage in the skeleton above) is called during the configuration process. The configuration process can ask the user questions in several steps if the answer to one question is needed before the next question can be asked. This is done by dividing the configuration into stages. The first stage is always called "start" and the final stage shall always be "select-channels". The "select-channels" stage is handled by calling the listchannels_sub instead of the stage_sub. For each stage, the user's answers from all previous stages are provided in the $conf parameter.

The following stage_sub first asks the user for a username and a password and then, in a second stage, downloads a list of possible regions using the username and password.

sub config_stage
{
     my( $stage, $conf ) = @_;

     my $result;

     my $writer = XMLTV::Configure::Writer->new( OUTPUT => \$result,
                                                 encoding => 'iso-8859-1' );
     # Every stage returns a complete document, so start it before branching.
     $writer->start( { grabber => 'tv_grab_se_swedb' } );
     if( $stage eq 'start' ) {
          $writer->write_string( {
                id => 'username', 
                title => [ [ 'Username', 'en' ] ],
                description => [ 
                 [ 'Your username at tvsite.com', 'en' ] ],
                default => '',
            } );
          $writer->write_secretstring( {
                id => 'password', 
                title => [ [ 'Password', 'en' ] ],
                description => [ 
                 [ 'Your password at tvsite.com', 'en' ] ],
                default => '',
            } );
          $writer->end( 'select-region' );
     }
     elsif( $stage eq 'select-region' ) {
          my $username = $conf->{'username'}->[0];
          my $password = $conf->{'password'}->[0];
          my @regions = fetch_regions( $username, $password );
          $writer->start_selectone( {
                id => 'region', 
                title => [ [ 'Region', 'en' ], ],
                description => [ [ 'The geographic region that you want to download data for.', 'en' ], ],
          } );

          foreach my $reg (@regions) {
                $writer->write_option( { 
                     value=>$reg,
                     text => [ [ $reg, 'en' ], ] 
                } );
          }

          $writer->end_selectone();
          # This is the last question, so hand over to the final stage.
          $writer->end( 'select-channels' );
     }
     else {
        die "Unknown stage $stage";
     }

     return $result;
}
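
To make the flow between stages easier to follow, here is a small self-contained sketch of how a caller might step through the stages. It is a simplified model and not the actual XMLTV::Configure implementation. The toy_stage sub, the canned answers and the hash it returns are all assumptions made for illustration; a real stage_sub returns an xml-string as described above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for a grabber's stage_sub. A real stage_sub returns
# an xml-string; this one returns a plain hash so the flow is easy to follow.
sub toy_stage {
    my( $stage, $conf ) = @_;

    if( $stage eq 'start' ) {
        return { ask => [ 'username', 'password' ], next => 'select-region' };
    }
    elsif( $stage eq 'select-region' ) {
        # Answers from earlier stages are available in $conf here.
        die "username missing" unless defined $conf->{username}->[0];
        return { ask => [ 'region' ], next => 'select-channels' };
    }
    die "Unknown stage $stage";
}

# Canned answers that play the role of the user (assumed values).
my %answers = (
    username => 'alice',
    password => 'secret',
    region   => 'north',
);

my %conf;      # each value is an array reference, like the entries in $conf
my @visited;
my $stage = 'start';

# Keep asking until the stage_sub hands over to the final stage.
while( $stage ne 'select-channels' ) {
    push @visited, $stage;
    my $step = toy_stage( $stage, \%conf );
    foreach my $key ( @{ $step->{ask} } ) {
        $conf{$key} = [ $answers{$key} ];
    }
    $stage = $step->{next};
}

print "Stages visited: @visited\n";
print "Region chosen: $conf{region}->[0]\n";
```

The loop ends only because the last stage names "select-channels" as its successor, which is why a stage_sub must never end its final stage with its own name.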

Writing a listchannels_sub

The listchannels_sub shall return an xml-string listing all channels that the grabber can download data for given the current configuration. The channel listing shall be in the XMLTVFormat but without any programme-elements. The current configuration is supplied in the first parameter and all options passed on the command-line are in the second parameter.
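
As an illustration, a listchannels_sub could build its return value roughly as follows. This is a minimal sketch using only core Perl. The helper names xml_escape and channels_to_xml, the channel ids and the display names are assumptions, and a real grabber would normally download the channel list using settings from $conf rather than hard-coding it.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Escape the characters that are special in XML text and attribute values.
sub xml_escape {
    my( $text ) = @_;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    $text =~ s/"/&quot;/g;
    return $text;
}

# Build an XMLTV document that contains only <channel> elements.
# $channels is a reference to a list of [ id, display name ] pairs.
sub channels_to_xml {
    my( $channels ) = @_;

    my $str = qq{<?xml version="1.0" encoding="UTF-8"?>\n<tv>\n};
    foreach my $ch ( @{$channels} ) {
        my( $id, $name ) = @{$ch};
        $str .= sprintf( qq{  <channel id="%s">\n}, xml_escape( $id ) );
        $str .= sprintf( qq{    <display-name lang="en">%s</display-name>\n},
                         xml_escape( $name ) );
        $str .= "  </channel>\n";
    }
    $str .= "</tv>\n";
    return $str;
}

sub list_channels {
    my( $conf, $opt ) = @_;

    # Assumed data for the sketch; not taken from any real site.
    my @channels = (
        [ 'one.example.com',  'Channel One' ],
        [ 'news.example.com', 'News & Weather' ],
    );
    return channels_to_xml( \@channels );
}

print list_channels( {}, {} );
```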

More Information

For more information, see the documentation for the modules XMLTV::Options, XMLTV::Configure::Writer and XMLTV::Configure as well as the source for tv_grab_se_swedb and tv_grab_cz.

Grabbing data

Most grabbers get their data by parsing webpages. There are many different Perl modules that you can do this with. Have a look at the existing grabbers and pick the method that you think is best.
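
Whatever module you choose, the grabbing step boils down to extracting start time, stop time and title from the page and printing them as programme-elements. The sketch below shows that idea with core Perl only. The page markup, the channel id and the helper names are made up for the example, and the regular expression stands in for a proper HTML parser, which a real grabber should use (it would also decode HTML entities in the titles).

```perl
#!/usr/bin/perl
use strict;
use warnings;
use POSIX qw/strftime/;

# Format an epoch time as an XMLTV timestamp (YYYYMMDDhhmmss +0000) in UTC.
sub xmltv_time {
    my( $epoch ) = @_;
    return strftime( '%Y%m%d%H%M%S +0000', gmtime( $epoch ) );
}

# Escape the characters that are special in XML text and attribute values.
sub xml_escape {
    my( $text ) = @_;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    $text =~ s/"/&quot;/g;
    return $text;
}

# Hypothetical listing markup; real sites will look different.
my $page = <<'END';
<li data-start="1283364000" data-stop="1283365800">Evening News</li>
<li data-start="1283365800" data-stop="1283371200">Feature Film</li>
END

# Collect one record per listing entry.
my @programmes;
while( $page =~ m{<li data-start="(\d+)" data-stop="(\d+)">([^<]+)</li>}g ) {
    push @programmes, { start => $1, stop => $2, title => $3 };
}

# Turn the records into programme-elements for an assumed channel id.
my $xml = '';
foreach my $p ( @programmes ) {
    $xml .= sprintf( qq{<programme start="%s" stop="%s" channel="%s">\n},
                     xmltv_time( $p->{start} ), xmltv_time( $p->{stop} ),
                     xml_escape( 'one.example.com' ) );
    $xml .= sprintf( qq{  <title lang="en">%s</title>\n},
                     xml_escape( $p->{title} ) );
    $xml .= "</programme>\n";
}

print $xml;
```

In a complete grabber these elements would be printed inside the tv root element, after the channel-elements for the configured channels.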

To do: describe some useful modules here.

Validating a grabber

See XmltvValidation.