Difference between revisions of "User:Dekarl"

From XMLTV

Revision as of 07:21, 14 October 2010

Known Issues with Character Encoding

Which grabbers' status will turn red if we add character encoding checks?

  • tv_grab_ch_search data source sends windows-1252 as iso-8859-1 (e.g. the Euro symbol). Newer HTML readers are supposed to handle this correctly. Need to verify that we can hand Perl strings to XMLTV::Writer and that it will do the right thing with regard to escaping anything outside $encoding into XML entities.
  • tv_grab_it stores category in utf-8
  • tv_grab_se_swedb data source sends windows-1252 as iso-8859-1 (data source Viasat)
  • #1910245 should add a test for HTML entities in the generated XML. (hint: &acute; is invalid XML!)
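
The two kinds of check discussed above can be sketched roughly as follows. This is an illustration in Python, not XMLTV code: `to_xml_bytes` and `undeclared_entities` are hypothetical helper names, and the sample bytes are made up. It shows how a windows-1252 Euro sign turns into a control character when the iso-8859-1 label is trusted, how characters outside the target encoding can be written as numeric character references, and how named HTML entities such as `&acute;` could be detected in generated XML.

```python
import re

# Made-up sample: 0x80 is the Euro sign in windows-1252, but an
# unprintable C1 control character in iso-8859-1.
raw = b"Price: 5 \x80"

as_latin1 = raw.decode("iso-8859-1")  # trusts the (wrong) label
as_cp1252 = raw.decode("cp1252")      # what the data source actually meant


def to_xml_bytes(text, encoding):
    """Encode text, writing characters that do not exist in `encoding`
    as numeric character references (e.g. &#8364;).

    Note: this does not escape &, < or >; that is a separate step.
    """
    return text.encode(encoding, errors="xmlcharrefreplace")


# The only named entities that XML predefines; any other name needs a DTD.
XML_PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}
NAMED_ENTITY = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def undeclared_entities(xml_text):
    """Return the named entity references that plain XML does not define."""
    return [name for name in NAMED_ENTITY.findall(xml_text)
            if name not in XML_PREDEFINED]
```

A test along these lines would flag any grabber output for which `undeclared_entities` returns a non-empty list, assuming the output does not declare those entities in a DTD.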

Issues with Time Zones

Memory Leaks??

Cleanup List

Feel free to take anything from the list

  • #1880681 the bug was resolved (not a bug), but the suggestion for Supplementary Files would just turn one can of worms into another...
  • tv_grab_huro has no maintainer?
    • #2748362 site changes: holes in the collected programs
    • #2837668 site changes: unexpected hash references
    • #2858285 close it, was "no channels found", the grabber does not fail completely anymore (see status)
    • #2910015 close it, was "no programs on channel 1", test_grabbers runs successfully with that channel

Cleanup SourceForge Project

  • remove group/status/category examples from the tracker (might check for other unused stuff while there)

Check for Breakage caused by LWP::Simple

  • silent uncompression
  • silent code page conversion
  • silent proxy handling

Maybe it's best to move most uses over to our own Get_nice.
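
To make the concern concrete, here is a minimal sketch of what handling these steps explicitly could look like. It is written in Python purely for illustration and is not the LWP::Simple or Get_nice API; `decode_body` and its `fallback_charset` parameter are hypothetical. Decompression and character set conversion each happen in one visible place, and proxy handling is left out.

```python
import gzip
from email.message import Message


def decode_body(raw, content_encoding, content_type,
                fallback_charset="iso-8859-1"):
    """Turn a raw HTTP body (bytes) into text with every step spelled out."""
    # Step 1: decompress only when the Content-Encoding header asks for it.
    if (content_encoding or "").strip().lower() == "gzip":
        raw = gzip.decompress(raw)

    # Step 2: read the charset parameter from the Content-Type header.
    header = Message()
    header["Content-Type"] = content_type
    charset = header.get_content_charset() or fallback_charset

    # Step 3: decode exactly once; invalid bytes raise UnicodeDecodeError
    # instead of being converted silently.
    return raw.decode(charset)
```

For example, `decode_body(body, "gzip", "text/html; charset=UTF-8")` would gunzip and then decode as UTF-8, while a response without a charset parameter falls back to the assumed default. A grabber that knows its source mislabels windows-1252 could override the charset at step 2 rather than fixing up the text afterwards.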

Potential Data Sources