NextPVR Forums

Full Version: Invalid character in EPG data/Unicode error
Hi, I'm having an issue where recordings of a programme fail, and can't be deleted, because of an invalid character in the programme description.

The relevant log lines are variations on the following:
Code:
[ERROR][1]    Unexpected error parsing EPGEvent xml: System.Xml.XmlException: '', hexadecimal value 0x19, is an invalid character. Line 5, position 161.
   at System.Xml.XmlTextReaderImpl.Throw(Exception e)
   at System.Xml.XmlTextReaderImpl.Throw(String res, String[] args)
   at System.Xml.XmlTextReaderImpl.ParseText(Int32& startPos, Int32& endPos, Int32& outOrChars)
   at System.Xml.XmlTextReaderImpl.ParseText()
   at System.Xml.XmlTextReaderImpl.ParseElementContent()
   at System.Xml.XmlTextReaderImpl.Read()
   at System.Xml.XmlLoader.LoadNode(Boolean skipOverWhitespace)
   at System.Xml.XmlLoader.LoadDocSequence(XmlDocument parentDoc)
   at System.Xml.XmlLoader.Load(XmlDocument doc, XmlReader reader, Boolean preserveWhitespace)
   at System.Xml.XmlDocument.Load(XmlReader reader)
   at System.Xml.XmlDocument.LoadXml(String xml)
   at NUtility.EPGEvent.Parse(String xml)

In each case the offending character is 0x19, in place of what should evidently be U+2019 (right single quotation mark). The description is stored in the database with the invalid character, but the UI shows a blank description in the pending recordings list. The recordings fail with "Object reference not set to an instance of an object", and trying to cancel them raises an unhandled exception with the same message.
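
To illustrate why 0x19 points at U+2019, here is a minimal Python sketch. It assumes the character was truncated to its low byte somewhere along the way, which is only a guess about the cause, and the `<event>` and `<description>` element names are made up for the example rather than taken from NextPVR. It also shows that a strict XML parser rejects the resulting control character, much as the .NET parser does in the log above.

```python
import xml.etree.ElementTree as ET

intended = "\u2019"              # RIGHT SINGLE QUOTATION MARK
low_byte = ord(intended) & 0xFF  # keep only the low 8 bits of the code point
print(hex(low_byte))

# Hypothetical event XML; the element names are illustrative only.
template = "<event><description>It{}s a test</description></event>"
bad_xml = template.format(chr(low_byte))   # contains the 0x19 control character
good_xml = template.format(intended)       # contains the intended U+2019


def parses(xml_text):
    """Return True if the text is accepted by the XML parser."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


print(parses(bad_xml), parses(good_xml))
```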

Looking at the database there are examples of the 0x19 getting into the description strings of epg events across many channels.

I first encountered the problem a few days ago while running the previous version, and have since reinstalled with 4.1.1.180410.
Can you supply your npvr.db3, so I can check a couple of things?
sub Wrote:Can you supply your npvr.db3, so I can check a couple of things?

ok
Thanks - what show was it?
Randall and Hopkirk (Deceased), but I've just tried scheduling a one-off recording of another show containing that 0x19 character in the epg_event description field, and it generated exactly the same error in the log.
Sorry to be a pain, but it looks like the DVB EPG parser isn't handling this data correctly and I'd like to fix this the right way.

Is there any chance you could download/install the eval version of TSCapture from http://tscapture.com, and capture a 1 minute full transport stream of that channel, and put the file somewhere like dropbox or onedrive for me to download? That way I can put it through the signal generator here, and watch what's happening when it pulls in these listings.
Hi sub, I got a database from a user, and after a bit of research I figured out how to search for a hex value in SQLite using LIKE '%'||cast(X'19' as text)||'%'
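
For anyone wanting to repeat that search, here is a rough sketch using Python's sqlite3 module. It runs against a throwaway in-memory table rather than a real npvr.db3; the table layout and the show titles are assumptions made up for the example, with only the epg_event table name and its description field taken from this thread.

```python
import sqlite3

# In-memory stand-in for the real database; the schema here is an assumption.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE epg_event (oid INTEGER PRIMARY KEY, title TEXT, description TEXT)"
)
conn.executemany(
    "INSERT INTO epg_event (title, description) VALUES (?, ?)",
    [
        ("Show A", "It\x19s broken"),    # contains the stray 0x19 byte
        ("Show B", "It\u2019s intact"),  # contains a proper U+2019
    ],
)

# Same LIKE pattern as in the post: a text value built from the single byte 0x19.
query = (
    "SELECT title FROM epg_event "
    "WHERE description LIKE '%' || cast(X'19' AS TEXT) || '%'"
)
hits = [row[0] for row in conn.execute(query)]
print(hits)
```

Swapping X'19' for another byte value should find the other stray characters in the same way.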

What I see is that the text actually contains several such characters, and they are the low bytes of 0x2013, 0x2018, 0x2019, 0x201c and 0x201d, which seem to match this table: https://www.cl.cam.ac.uk/~mgk25/ucs/quotes.html. I am not sure if you parse the table as double-byte, but that might help if you never get the raw stream.
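
If that reading is right, descriptions already stored could in principle be cleaned up by mapping each stray byte back to its U+20xx character. The following is only a sketch of that idea in Python, assuming the high byte 0x20 was dropped in every case. The function name is hypothetical, and this is a workaround for existing data rather than a fix for whatever the parser is doing.

```python
# Stray control characters mapped back to the punctuation they presumably were.
DROPPED_HIGH_BYTE = {
    0x13: "\u2013",  # en dash
    0x18: "\u2018",  # left single quotation mark
    0x19: "\u2019",  # right single quotation mark
    0x1C: "\u201c",  # left double quotation mark
    0x1D: "\u201d",  # right double quotation mark
}


def restore_punctuation(text):
    """Replace the five stray control characters with their U+20xx originals."""
    return text.translate(DROPPED_HIGH_BYTE)


sample = "\x1cIt\x19s 9\x135,\x1d she said"
print(restore_punctuation(sample))
```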

There were a few instances of 0x0a and 0x0d in there too. Not sure if you encode them for sending in XML.

Martin
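
On the 0x0a and 0x0d instances: XML 1.0 allows tab, line feed and carriage return as literal characters, so unlike 0x19 they should not make the parser throw (although a literal carriage return is normalised to a line feed on parsing unless it is written as a character reference). A small Python sketch of the XML 1.0 character ranges, with hypothetical function names, can be used to check which characters in a description would be rejected:

```python
def is_xml10_char(ch):
    """True if ch is allowed in an XML 1.0 document (the Char production)."""
    cp = ord(ch)
    return (
        cp in (0x09, 0x0A, 0x0D)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


def invalid_xml_chars(text):
    """Return the distinct disallowed characters in text, as hex strings."""
    return sorted({hex(ord(c)) for c in text if not is_xml10_char(c)})


print(invalid_xml_chars("line one\r\nIt\x19s line two"))
```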