Difference between revisions of "User:Dekarl"
From XMLTV
(→Potential Data Sources: add potential sources from http://dvblogic.com/phpBB3/viewtopic.php?p=25521)
(→Proper TV Metadata Schema Bits and Pieces (Hi TVBrainz): Sesame Street started with shared seasons to branch off into a localized german branch)
(16 intermediate revisions by the same user not shown)
Latest revision as of 06:13, 14 October 2014
Known Issues with Character Encoding
- Need to verify that we can hand Perl strings to XMLTV::Writer and that it does the right thing, i.e. escapes anything outside $encoding into XML entities.
- tv_grab_hr, tv_grab_no_gfeed, tv_grab_se_swedb: fix committed upstream (http://repo.or.cz/w/nonametv.git/commitdiff/17bfeec55a6cc01adb9db4d8a78f0fb17cfde11d)
- #1910245 should add a test for HTML entities in the generated XML. (Hint: &acute; is invalid XML!)
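A test along the lines of #1910245 could scan the generated output for named entities that XML does not predefine. The following is only a sketch of such a check, not code from XMLTV; the function name find_html_entities is made up for illustration. It relies on XML predefining just amp, lt, gt, quot and apos, and it ignores entities declared in a DTD as well as CDATA sections.

```python
import re

# The five named entities that XML predefines; anything else needs a DTD declaration.
XML_PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}

# Named entity reference such as &acute; (numeric references like &#180; do not match).
NAMED_ENTITY = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def find_html_entities(xml_text):
    """Return the named entities in xml_text that plain XML does not define."""
    return [name for name in NAMED_ENTITY.findall(xml_text)
            if name not in XML_PREDEFINED]
```

A grabber test could then fail whenever the returned list is not empty.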
Issues with Time Zones
- tv_grab_dk_dr DST issues
- tv_grab_il DST issues
- tv_grab_it DST issues
- tv_grab_pt_meo DST issues
- tv_grab_uk_bleb DST issues; a floating start time later than the stop time leads to wrong date calculations and time offsets
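The start>stop case can be illustrated with a small sketch. It assumes the listing gives only wall-clock times for a known day, which is a guess at the general shape of the problem rather than a description of the tv_grab_uk_bleb code; resolve_floating_times is a hypothetical helper and does not deal with DST itself.

```python
from datetime import date, datetime, time, timedelta


def resolve_floating_times(day, start, stop):
    """Turn wall-clock start/stop times on `day` into datetimes with stop after start."""
    start_dt = datetime.combine(day, start)
    stop_dt = datetime.combine(day, stop)
    if stop_dt <= start_dt:
        # The programme runs past midnight, so the stop time belongs to the next day.
        stop_dt += timedelta(days=1)
    return start_dt, stop_dt
```

Doing the rollover before any time zone offset is attached keeps the date arithmetic independent of the offset calculation.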
Memory Leaks??
- #2612996 tv_grab_na_dtv dumps core on Windows when run as .exe, and its memory use grows considerably when run as Perl
Cleanup List
Feel free to take anything from the list
- #1880681 the bug was solved (not a bug), the suggestion for Supplementary Files is turning one can of worms into another...
- tv_grab_huro has no maintainer?
- #2748362 site changes: holes in the collected programs
- #2837668 site changes: unexpected hash references
- #2858285 close it, was "no channels found", the grabber does not fail completely anymore (see status)
- #2910015 close it, was "no programs on channel 1", test_grabbers tests with that channel are successful
Cleanup SourceForge Project
- remove group/status/category examples from the tracker (might check for other unused stuff while there)
Check for Breakage caused by LWP::Simple
- silent uncompression
- silent code page conversion
- silent proxy handling
Maybe it's best to move most uses over to our own Get_nice.
Potential Data Sources
Candidates for wrapping into Static File Grabbers
- _cz_arcao: XMLTV export from arcao.com. Provides explicit time offsets.
- _dk_ontv: XMLTV export from ontv.dk.
- _eu_phazer: XMLTV service from tvprofil.net aka Phazer XMLTV Service. Note that they provide timestamps in their local time as floating time, which is interpreted as UTC...
- _fr_kazer: XMLTV service from kazer.org. (done)
- _it_ambrosa: XMLTV export from ambrosa.net. Explicit about non-commercial use only.
- _ru_teleguide: XMLTV export from teleguide.info. Provides explicit time offsets.
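For sources that deliver floating times (as noted for _eu_phazer), a wrapping grabber would have to attach the offset itself. The sketch below shows one way to do that under the assumption that the provider's local zone is known from elsewhere; add_explicit_offset is a hypothetical helper, not part of XMLTV.

```python
from datetime import datetime, timedelta, timezone


def add_explicit_offset(floating, tz):
    """Convert a 'YYYYMMDDhhmmss' wall-clock time in `tz` to 'YYYYMMDDhhmmss +hhmm'."""
    local = datetime.strptime(floating, "%Y%m%d%H%M%S").replace(tzinfo=tz)
    return local.strftime("%Y%m%d%H%M%S %z")


# Example with a fixed +01:00 offset; a zoneinfo.ZoneInfo object could be passed
# instead to get DST-aware offsets.
stamp = add_explicit_offset("20141014201500", timezone(timedelta(hours=1)))
```

Times that occur twice at the end of DST stay ambiguous with this approach and would need extra information from the source.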
Configuration API
- http://sourceforge.net/mailarchive/forum.php?thread_name=49B6B9F2.80905%40holmlund.se&forum_name=xmltv-devel possible extensions/clarifications from a consumer's POV
- Add a list to the supplementary files that maps DVB/ATSC ids to grabber/channel. Then let --list-channels & co. enrich the channel list with the related ids. MythTV's Browser Based Setup might be a consumer for this, to allow automatic mapping of channels in the guide to channels on the video source.
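What such a mapping could look like is sketched below. Everything here is hypothetical: the file layout, the key (original_network_id, transport_stream_id, service_id), the example ids and the function name enrich_channels are assumptions for illustration, not an existing XMLTV format or API.

```python
# Hypothetical supplementary data:
# (original_network_id, transport_stream_id, service_id) -> XMLTV channel id
DVB_TO_XMLTV = {
    (1, 1101, 28106): "example-one.example.org",
}


def enrich_channels(channels, dvb_map):
    """Return copies of the channel dicts with the DVB triplets that map to each id."""
    by_channel = {}
    for triplet, xmltv_id in dvb_map.items():
        by_channel.setdefault(xmltv_id, []).append(triplet)
    return [dict(ch, dvb_ids=sorted(by_channel.get(ch["id"], [])))
            for ch in channels]
```

A consumer could then match the triplets against the services found during a channel scan.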
Data Sinks
- Check http://www.cse.unsw.edu.au/~willu/w/xmltv/grabbers/index.html and see if all are mentioned here
Best Practices
Consumers of XMLTV Data
- be prepared for xmltv ids that really are similar to an FQDN (255 characters max.); the longest I've seen in the wild is 69 characters (_es_laguiatv)
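On the consumer side, the main risk is silent truncation in a storage field that is too short. A minimal sketch of a guard, assuming the consumer knows its own field width (check_id and column_width are made-up names):

```python
MAX_XMLTV_ID_LENGTH = 255  # FQDN-like upper bound mentioned above


def check_id(xmltv_id, column_width=MAX_XMLTV_ID_LENGTH):
    """Raise ValueError instead of silently truncating an id that does not fit."""
    if not xmltv_id:
        raise ValueError("empty XMLTV id")
    if len(xmltv_id) > column_width:
        raise ValueError(
            f"XMLTV id of {len(xmltv_id)} characters exceeds {column_width}")
    return xmltv_id
```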
Random Pieces of Information
You might receive the same transport stream on multiple frequencies
TS 101 211 - DVB Guidelines on implementation and usage of Service Information (SI)
NOTE 1: The cell_id cannot be used to identify a service. The combination of service_id and original_network_id remains a unique identification of a service.
It is recommended to make all receivable multiplexes with the same transport_stream_id but with different cell_ids available to the user, and only when a service (not a transport stream) is available through multiple multiplexes to select a preferred multiplex based on e.g. reception quality.
Any reference resolution from a transport_stream_id or a service_id (e.g. from a linkage_descriptor transport_stream_id/service_id pair) to a multiplex/frequency requires consideration to handle the potential multiplicity.
Note that in networks deploying the service_availability_descriptor, the unique identification of a transport stream by the tuple (transport_stream_id, original_network_id), can often be sensibly replaced by identification through the triplet (transport_stream_id, original_network_id, cell_id).
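The quoted guidance can be sketched as a small data model: keep every receivable multiplex, key services by (original_network_id, service_id), and choose a preferred multiplex only per service. The class, the field names and the signal_quality metric below are assumptions for illustration and are not taken from TS 101 211 or from any receiver software.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Multiplex:
    frequency_khz: int
    original_network_id: int
    transport_stream_id: int
    cell_id: int
    signal_quality: float      # hypothetical reception metric, higher is better
    service_ids: tuple = ()


def preferred_multiplex_per_service(multiplexes):
    """Map (original_network_id, service_id) to the best-received multiplex carrying it."""
    candidates = defaultdict(list)
    for mux in multiplexes:
        for sid in mux.service_ids:
            # A service is identified by original_network_id + service_id, never by cell_id.
            candidates[(mux.original_network_id, sid)].append(mux)
    return {key: max(muxes, key=lambda m: m.signal_quality)
            for key, muxes in candidates.items()}
```

The input list still contains both multiplexes with the same transport_stream_id, so they can remain visible to the user.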
Proper TV Metadata Schema Bits and Pieces (Hi TVBrainz)
- Some series have multiple sets of episode titles per locale, usually it's one set per broadcasting company
- the Exes http://www.fernsehserien.de/the-exes/episodenguide/staffel-1/16232
- Some series have a different title per season
- Elephant Princess http://de.wikipedia.org/wiki/Elephant_Princess
- Some series have alternate titles
- The Killing http://de.wikipedia.org/wiki/Kommissarin_Lund_%E2%80%93_Das_Verbrechen this series also adds Roman or Arabic numerals to the title to form a season-specific series title
- Usually the episode title is unique per series, but some series have multiple episodes with the same title
- Lindenstraße http://de.wikipedia.org/wiki/Lindenstra%C3%9Fe/Episodenliste#Mehrfache_Folgentitel
- some series started on radio and continued on tv
- Die Hesselbachs http://de.wikipedia.org/wiki/Die_Hesselbachs
- some series got rebranded
- Pusteblume -> Löwenzahn http://en.wikipedia.org/wiki/L%C3%B6wenzahn
- many series don't do seasons
- Eisenbahnromantik http://www.swr.de/eisenbahn-romantik/ (lots of repeats on many TV stations and lots of full episodes on YouTube; good for testing VOD integration because there are no regional restrictions http://www.youtube.com/user/Eisenbahnromantik )
- some episodes belong to multiple series
- it is common for documentary brands to buy unrelated documentary movies and mini series that are run under a bigger brand
- - example of the two episodes that belong to two series goes here -
- some series have a different order per country/station
- Wickie und die starken Männer http://forums.thetvdb.com/viewtopic.php?f=41&t=18059
- some series contain the same episode multiple times / with multiple episode numbers
- Eisenbahnromantik http://www.swr.de/eisenbahn-romantik/
- for some series / collections of movies it is unclear whether they should be a series, a collection of movies, or both
- Varg Veum, has a main cast and seasons, also uses both a translated and the original season title http://de.wikipedia.org/wiki/Der_Wolf_%28Fernsehreihe%29
- Rosamunde Pilcher, shares no cast http://de.wikipedia.org/wiki/Rosamunde_Pilcher#Verfilmungen
- Inga Lindström, shares no cast http://de.wikipedia.org/wiki/Inga_Lindstr%C3%B6m#Inga-Lindstr.C3.B6m-Reihe
- Utta Danella, shares no cast http://de.wikipedia.org/wiki/Utta_Danella#Verfilmungen
- Harry Potter, a feature film series http://en.wikipedia.org/wiki/Harry_Potter_%28film_series%29
- some series are coproduced for multiple locales with most of an episode being shared internationally, but parts being replaced locally (completely different shots/actors)
- Fraggle Rock, see http://muppet.wikia.com/wiki/Fraggle_Rock#Co-Productions
- some series started out internationally as dubs and later branched off into unrelated shows of the same brand
- Sesame Street, see http://muppet.wikia.com/wiki/Sesamstrasse
- some episodic movies are basically short movies pasted together
- http://en.wikipedia.org/wiki/Love_at_Twenty
The set of titles (series/season/episode) should be marked as being either a true alternate or just a typo/search hint. Some titles are international titles (used for all languages without localization); showing a poster with the international title is better than defaulting to a specific locale.
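One way to model these two requirements is sketched below. The Title class, its field names and the fallback order are assumptions for illustration, not an existing TVBrainz or XMLTV schema; the example strings in the usage are placeholders.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Title:
    text: str
    locale: Optional[str] = None   # None marks an international title
    search_hint: bool = False      # True for typos/search aids, not true alternates


def display_title(titles, locale):
    """Pick a title for `locale`, preferring an international title as fallback."""
    real = [t for t in titles if not t.search_hint]
    for wanted in (locale, None):
        for t in real:
            if t.locale == wanted:
                return t.text
    # Last resort: any true title, or None if only search hints exist.
    return real[0].text if real else None


titles = [
    Title("Beispielserie", "de"),
    Title("Example Series"),                          # international title
    Title("Exampel Series", "en", search_hint=True),  # typo kept only for search
]
```

With these placeholders, a French client would be shown the international title rather than the German one, and the typo is never displayed.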