NextPVR Forums
Trying to suppress Chinese Characters in TVxB
mvallevand
Online

Posting Freak

Ontario Canada
Posts: 52,941
Threads: 956
Joined: May 2006
#11
2010-09-04, 04:34 AM
markbb1 Wrote:in hex (I think). It might be that when the sed script takes off the first instance on the line, the remaining characters then appear to our text renderers as extended ascii instead of Chinese. Removing all instances might be the magic required.

sed works character by character, and every byte of the Chinese UTF-8 3-byte codes is 0x80 or greater, which is why this should work in theory. If the file is somehow converted to UTF-16 or Unicode, all bets are off.

Martin
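
To make the byte argument concrete, here is a small Python sketch. Python is not used anywhere in this thread; it is only a convenient way to look at the bytes, and the sample character is my own pick. It shows that one Chinese character encoded as UTF-8 is made up entirely of bytes at 0x80 or above, while the same character in UTF-16 contains bytes below 0x80, so a byte-range filter would no longer cleanly separate it from plain ASCII.

```python
# Illustration only: compare the bytes of one Chinese character in two encodings.
ch = "太"  # U+592A, chosen as a sample

utf8 = ch.encode("utf-8")
utf16 = ch.encode("utf-16-le")

print([hex(b) for b in utf8])   # ['0xe5', '0xa4', '0xaa']
print([hex(b) for b in utf16])  # ['0x2a', '0x59']

# Every UTF-8 byte of the character is "high"; the UTF-16 bytes are not.
all_high_utf8 = all(b >= 0x80 for b in utf8)
all_high_utf16 = all(b >= 0x80 for b in utf16)
print(all_high_utf8, all_high_utf16)  # True False
```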
jksmurf
Offline

Posting Freak

HK (DMBTH)
Posts: 3,590
Threads: 410
Joined: Jul 2005
#12
2010-09-04, 09:58 AM
You guys are both AWESOME, thanks so much for the help. I've been struggling with this for a year!

markbb1 Wrote:1. Get SED for Windows working. Make sure the above sed command syntax does what is desired.
Try this command
Code:
sed -e "s/[\x00-\x1F\x7F-\xFF]//g" tbv1.html > tvb.html
Done (on the XML, I agree much easier on the xml for my limited capabilities!). And it works using the above from the later post! Strips the Chinese Characters right out. Truly GREAT!
markbb1 Wrote:2. Configure TVxb to produce an xmltv file named "temp.xmltv".
Done.
markbb1 Wrote:3. Put a file named "prepost" in the \TVxb\bin directory because, per section 2.3.4 of the TVxb manual, "For security reasons a flag file called prepost (with no extension) must be created in the \TVxb\bin folder before the pre or post commands will work. (The file can be empty.) If this file does not exist, then the pre- and post-commands are deactivated."
Done.
markbb1 Wrote:4. Put a postcommand in the TVxb ini file like this
postcommand="sed -e s/[\x00-\x1F\x7F-\xFF]/ / temp.xmltv > mylistings.xmltv"
(I am not sure of how to correctly put the quotes in that line, or if/where they are necessary.)
Ok, Done.
markbb1 Wrote:5. Let GBPVR (or NPVR) process the "mylistings.xmltv" file as it would normally.
OK, Ran it. Need to play around with a batch file, rather than a command line... I think?

Code:
WARNING: Post-command batch file "xFF]//g" xmltvHKBTVTemp.xml > xmltvHKBTV.xml" can only run from the C:\Utils\TVXB\TVXB04Test\bin\folder.
.......: Edit the TVxb.ini file and correct the postcommand= item
WARNING: Post-command batch file ""C:\Utils\TVXB\TVXB04Test\bin\xFF]//g" xmltvHKBTVTemp.xml > xmltvHKBTV.xml"" was not found.
.......: Post-command did not run.

This is what I used:

Code:
postcommand="C:\Progra~1\GnuWin32\bin\sed -e "s/[\x00-\x1F\x7F-\xFF]//g" xmltvHKBTVTemp.xml > xmltvHKBTV.xml"

markbb1 Wrote:Put a line in your epg update batch file to delete the temp.xmltv file if you want to save some hard drive space.
Not too concerned about this; it recreates a new one every time anyway.

Thanks guys, 99% there!

k.
ASUS STRIX X470-F AMD 2700x 4GHz | Win10Prox64 | 32GB | NVIDIA GEforce GT1030 Fanless | WinTV DMB-TH | WinTV HVR-1280 | Hauppauge Colossus | AC86U/AC68U | USB-UIRT | RPi4 Libreelec | Sony Bravia LCD X9000F Android TV |
mvallevand
Online

Posting Freak

Ontario Canada
Posts: 52,941
Threads: 956
Joined: May 2006
#13
2010-09-04, 11:54 AM
jksmurf Wrote:OK, Ran it. Need to play around with a batch file, rather than a command line... I think?

Code:
WARNING: Post-command batch file "xFF]//g" xmltvHKBTVTemp.xml > xmltvHKBTV.xml" can only run from the C:\Utils\TVXB\TVXB04Test\bin\folder.
.......: Edit the TVxb.ini file and correct the postcommand= item
WARNING: Post-command batch file ""C:\Utils\TVXB\TVXB04Test\bin\xFF]//g" xmltvHKBTVTemp.xml > xmltvHKBTV.xml"" was not found.
.......: Post-command did not run.

I think you are correct that it wants a batch file, so in C:\Utils\TVXB\TVXB04Test\bin create something like striphigh.bat containing just the working command line, update the ini file to use postcommand=striphigh.bat, and finally keep your fingers crossed.

Martin
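
One possible reading of the warning, sketched in Python. This is a guess, not something taken from the TVxb manual, and how TVxb really parses the setting is unknown to me. If the whole postcommand value is treated as the path of a batch file and only the text after the last backslash is kept, the leftover is exactly the odd name shown in the first warning, because the last backslash in the value is the one in \xFF. The helper name is made up for illustration.

```python
def text_after_last_backslash(value: str) -> str:
    """Hypothetical helper: return whatever follows the final backslash."""
    return value.rsplit("\\", 1)[-1]

# The postcommand value from the ini file, without its outermost quotes.
postcommand = r'C:\Progra~1\GnuWin32\bin\sed -e "s/[\x00-\x1F\x7F-\xFF]//g" xmltvHKBTVTemp.xml > xmltvHKBTV.xml'

leftover = text_after_last_backslash(postcommand)
print(leftover)  # xFF]//g" xmltvHKBTVTemp.xml > xmltvHKBTV.xml

# A bare batch file name contains no backslash, so it comes through unchanged.
print(text_after_last_backslash("striphigh.bat"))  # striphigh.bat
```

If that reading is right, a plain file name such as striphigh.bat avoids the problem because there is nothing in it to split on.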
markbb1
Offline

Member

Posts: 155
Threads: 7
Joined: Jul 2006
#14
2010-09-04, 02:36 PM
jksmurf Wrote:Need to play around with a batch file, rather than a command line... I think?

This is what I used:

Code:
postcommand="C:\Progra~1\GnuWin32\bin\sed -e "s/[\x00-\x1F\x7F-\xFF]//g" xmltvHKBTVTemp.xml > xmltvHKBTV.xml"

k.
I think you could remove the quotes around "s/[\x00-\x1F\x7F-\xFF]//g" and eliminate the ambiguity that is causing the errors. Obviously the postcommand is being interpreted as two or three independent strings:

"C:\Progra~1\GnuWin32\bin\sed -e "
s/[\x00-\x1F\x7F-\xFF]//g
" xmltvHKBTVTemp.xml > xmltvHKBTV.xml"

Try it with only the outermost quotes.
markbb1
Offline

Member

Posts: 155
Threads: 7
Joined: Jul 2006
#15
2010-09-04, 02:46 PM
mvallevand Wrote:sed works character by character, and every byte of the Chinese UTF-8 3-byte codes is 0x80 or greater, which is why this should work in theory. If the file is somehow converted to UTF-16 or Unicode, all bets are off.

Martin
It can work character by character, but I read in the documentation somewhere that the search and replace functions, by default, only find/replace the first instance in the line. There are ways to make it only do the first, or only the last, or any of the ones in between, or any combination. The "g" is the simplest way to get all of them in the file.

It appeared from the changes caused by your original syntax that it was only removing the first byte from the line, or possibly the first byte from each 3-byte string. Anyway, I am glad the "g" took care of it.
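
The difference between replacing the first match and replacing every match can be sketched in Python. This is an analogy only: Python's re module stands in for sed here, and the sample line is my own. With count=1 only the first high byte on the line goes, and the bytes left behind are no longer valid UTF-8, which fits the "extended ascii" garbage described earlier. Without a count every match is removed, like the "g" flag.

```python
import re

# Same byte class as the sed expression in this thread, written for Python.
STRIP = re.compile(rb"[\x00-\x1f\x7f-\xff]")

line = "(太子) News".encode("utf-8")

first_only = STRIP.sub(b"", line, count=1)  # like s/.../.../ without g
every_match = STRIP.sub(b"", line)          # like s/.../.../g

print(first_only)   # b'(\xa4\xaa\xe5\xad\x90) News'
print(every_match)  # b'() News'
```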
jksmurf
Offline

Posting Freak

HK (DMBTH)
Posts: 3,590
Threads: 410
Joined: Jul 2005
#16
2010-09-04, 03:35 PM
Success!!

First off, unfortunately none of the combinations I tried with or without the quotes worked (but thanks for the suggestion!)

Code:
postcommand="C:\Progra~1\GnuWin32\bin\sed -e "s/[\x00-\x1F\x7F-\xFF]//g" xmltvHKBTVTemp.xml > xmltvHKBTV.xml"

Code:
postcommand="C:\Progra~1\GnuWin32\bin\sed -e s/[\x00-\x1F\x7F-\xFF]//g xmltvHKBTVTemp.xml > xmltvHKBTV.xml"

Code:
postcommand=C:\Progra~1\GnuWin32\bin\sed -e "s/[\x00-\x1F\x7F-\xFF]//g" xmltvHKBTVTemp.xml > xmltvHKBTV.xml

Code:
postcommand=C:\Progra~1\GnuWin32\bin\sed -e s/[\x00-\x1F\x7F-\xFF]//g xmltvHKBTVTemp.xml > xmltvHKBTV.xml

BUT, the batch file in the bin dir DID work.

striphigh.bat contents (for other future users)
Code:
C:\Progra~1\GnuWin32\bin\sed -e "s/[\x00-\x1F\x7F-\xFF]//g" C:\Utils\TVXB\XML\TVXB04XML\xmltvHKBTVTemp.xml > C:\Utils\TVXB\XML\TVXB04XML\xmltvHKBTV.xml

Code:
Executing: Post-command
Postcommand: C:\Utils\TVXB\TVXB04Test\bin\striphigh.bat
Status: Finished!

Yippee!

As an aside, I am now ironically getting a few oddities like

"()" where there used to be (太子珠寶鐘錶特約), but I'll work it out.
I tried TVxB's left clip on "()" and Substitution="()", but of course the () don't exist until AFTER the SED command and lclip on just ")" takes too much of the title.

Thanks again fellers,

k.
mvallevand
Online

Posting Freak

Ontario Canada
Posts: 52,941
Threads: 956
Joined: May 2006
#17
2010-09-04, 03:54 PM
You can stack up your sed replacements, try adding -e s/()//g; after the first one.

Martin
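
A rough Python equivalent of stacking the two replacements, assuming they run in the order given. Python is only a stand-in for sed here, and the function name and sample title are made up. The order matters: the empty "()" only exists once the high bytes between the parentheses have been stripped.

```python
import re

STRIP_BYTES = re.compile(rb"[\x00-\x1f\x7f-\xff]")  # control and high bytes
EMPTY_PARENS = re.compile(rb"\(\)")  # literal "()"; Python needs the escapes, sed's basic syntax does not

def clean(line: bytes) -> bytes:
    """Apply both substitutions in sequence, like two -e expressions."""
    line = STRIP_BYTES.sub(b"", line)
    return EMPTY_PARENS.sub(b"", line)

title = "News (太子珠寶鐘錶特約)".encode("utf-8")
print(clean(title))  # b'News '
```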
jksmurf
Offline

Posting Freak

HK (DMBTH)
Posts: 3,590
Threads: 410
Joined: Jul 2005
#18
2010-09-04, 04:04 PM
So striphigh.bat would contain?

Code:
C:\Progra~1\GnuWin32\bin\sed -e "s/[\x00-\x1F\x7F-\xFF]//g" C:\Utils\TVXB\XML\TVXB04XML\xmltvHKBTVTemp.xml > C:\Utils\TVXB\XML\TVXB04XML\xmltvHKBTVTemp.xml

C:\Progra~1\GnuWin32\bin\sed -e "s/()//g" C:\Utils\TVXB\XML\TVXB04XML\xmltvHKBTVTemp.xml > C:\Utils\TVXB\XML\TVXB04XML\xmltvHKBTV.xml

k.
mvallevand
Online

Posting Freak

Ontario Canada
Posts: 52,941
Threads: 956
Joined: May 2006
#19
2010-09-04, 04:07 PM
One line should do it.
Code:
C:\Progra~1\GnuWin32\bin\sed -e "s/[\x00-\x1F\x7F-\xFF]//g" -e "s/()//g" C:\Utils\TVXB\XML\TVXB04XML\xmltvHKBTVTemp.xml > C:\Utils\TVXB\XML\TVXB04XML\xmltvHKBTVTemp.xml

Martin
jksmurf
Offline

Posting Freak

HK (DMBTH)
Posts: 3,590
Threads: 410
Joined: Jul 2005
#20
2010-09-04, 04:16 PM
Yep that works

Code:
C:\Progra~1\GnuWin32\bin\sed -e "s/[\x00-\x1F\x7F-\xFF]//g" -e "s/()//g" C:\Utils\TVXB\XML\TVXB04XML\xmltvHKBTVTemp.xml > C:\Utils\TVXB\XML\TVXB04XML\xmltvHKBTV.xml

(just xmltvHKBTVTemp.xml > xmltvHKBTV.xml though)
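
For anyone wondering why the output name has to differ from the input: as far as I know, the ">" redirect opens the output file for writing, which empties it, before sed gets to read anything. The Python sketch below imitates that order of events on a throwaway temp file. It is an analogy with made-up names, not what the batch file runs.

```python
import os
import tempfile

def read_after_opening_for_write(path: str) -> bytes:
    """Open path for writing first (as a '>' redirect does), then read it back."""
    with open(path, "wb"):  # opening in "wb" mode truncates the file immediately
        with open(path, "rb") as src:
            return src.read()

# Create a small stand-in for the temp listings file.
fd, path = tempfile.mkstemp(suffix=".xml")
os.close(fd)
with open(path, "wb") as f:
    f.write(b"<tv>listing</tv>\n")

leftover = read_after_opening_for_write(path)
os.remove(path)
print(leftover)  # b''
```

Writing to a second file, as in the command above, sidesteps this because the input is never opened for writing.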