
UTF-8 Encoding for Text Files


    no bug UTF-8 Encoding for Text Files

    When IrfanView opens a text file, such as a list of file names for use in a slideshow, and the file contains UTF-8 characters but was not saved as UTF-8 with a BOM (byte order mark), IrfanView parses the file incorrectly. Any file whose name contains Unicode characters then becomes inaccessible.

    As a workaround, you can open the file in an editor such as Notepad++, change the encoding from UTF-8 to UTF-8 BOM, and save it again.

    The error is cryptic, and it will cause people to report Unicode bugs, unopenable files, etc.

    You can avoid this by detecting the encoding of the text file from its contents. This is a routine thing done by lots of programs, so there should be something on GitHub that you can reuse for your next update.
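
    To illustrate the kind of detection meant here, below is a minimal Python sketch. It is not IrfanView's code. The function names `detect_encoding` and `read_list` are hypothetical, and the fallback to Windows-1252 is an assumption standing in for whatever ANSI code page the system uses. The idea is to check for a byte order mark first, then test whether the bytes are valid UTF-8, and only then fall back to ANSI.

```python
import codecs

# Byte order marks and the Python codec that consumes each one.
# UTF-32 is deliberately left out of this sketch.
_BOMS = (
    (codecs.BOM_UTF8, "utf-8-sig"),   # EF BB BF
    (codecs.BOM_UTF16_LE, "utf-16"),  # FF FE
    (codecs.BOM_UTF16_BE, "utf-16"),  # FE FF
)


def detect_encoding(data: bytes, fallback: str = "cp1252") -> str:
    """Guess the encoding of a text file from its raw bytes."""
    # 1. A BOM, if present, settles the question.
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    # 2. No BOM: accept UTF-8 only if every byte sequence is valid.
    try:
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        # 3. Otherwise assume the legacy ANSI code page (an assumption).
        return fallback


def read_list(data: bytes) -> list[str]:
    """Decode a file list and return its non-empty lines."""
    text = data.decode(detect_encoding(data), errors="replace")
    return [line for line in text.splitlines() if line.strip()]
```

    With this approach, a list saved as plain UTF-8 (no BOM) and a list saved as ANSI would both load without the user having to re-save the file.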
    Last edited by Bhikkhu Pesala; 05.07.2018, 06:57 PM. Reason: Fixed thread title

    #2
    If you save a slideshow text file from IrfanView that includes Unicode characters, the file will look like this in Notepad or Notepad2:

    ; UNICODE FILE - edit with care ;-)


    C:\TEMP\IrfanView 4.51 32-bit\UTF-8 Filen?me.jpg

    If you edit text files in other apps, it is up to you to make sure that they stay compatible with the encoding used by IrfanView.

    I won't forward this report, as I am sure that Irfan Skiljan will just say that it is not a bug in IrfanView. If you disagree, read the sticky thread and submit your own bug report, then report back to tell us what he said.
    Before you post ... Edit your profile • IrfanView 4.62 • Windows 10 Home 19045.2486



      #3
      A slideshow list saved from IrfanView is a 16-bit Unicode (UTF-16, little-endian) file with a byte order mark. The first two bytes are FF FE. They must be preserved while editing, or IrfanView will not load the list back in. It loads a list as ANSI by default, or loads nothing if the list contains null bytes, as a normal Unicode file does.
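
      For anyone who edits such a list with a script, here is a rough Python sketch of writing and reading a list while keeping the FF FE mark in place. The helper names `encode_list` and `decode_list` are made up for illustration. The CRLF line endings and the treatment of lines starting with `;` as comments are assumptions, based only on the header line shown earlier in this thread.

```python
import codecs

# Header line as it appears in a list saved by IrfanView (quoted in post #2).
HEADER = "; UNICODE FILE - edit with care ;-)"


def encode_list(paths: list[str]) -> bytes:
    """Build the file contents: FF FE, then UTF-16 little-endian text."""
    # CRLF line endings are an assumption for a Windows text file.
    text = "\r\n".join([HEADER, *paths]) + "\r\n"
    # "utf-16-le" never writes a BOM itself, so it is prepended explicitly.
    return codecs.BOM_UTF16_LE + text.encode("utf-16-le")


def decode_list(data: bytes) -> list[str]:
    """Return the file names from a list, refusing input without FF FE."""
    if not data.startswith(codecs.BOM_UTF16_LE):
        raise ValueError("missing FF FE byte order mark")
    text = data[len(codecs.BOM_UTF16_LE):].decode("utf-16-le")
    # Assumption: lines beginning with ';' are comments, like the header.
    return [ln for ln in text.splitlines() if ln and not ln.startswith(";")]
```

      Writing the result of `encode_list` to disk in binary mode (`open(path, "wb")`) avoids any newline or encoding translation that could disturb the first two bytes.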

      Could Irfan reliably distinguish between an ANSI Windows code page and UTF-8 if just one or two special characters are encountered in the list? Filenames are usually kept ANSI-safe for compatibility with old programs, but sometimes a special character slips in.
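
      As a small experiment on that question, the Python sketch below checks whether a byte string is valid UTF-8. The helper `is_valid_utf8` and the sample names are illustrative only, and Windows-1252 is assumed as the ANSI code page. A single accented character saved as ANSI is not valid UTF-8, so even one special character can be enough to tell the two apart. The check is not watertight, though: some ANSI byte pairs happen to form valid UTF-8.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode as UTF-8 without errors."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# The same file name with one special character, in both encodings.
ansi_name = "Filen\u00e4me.jpg".encode("cp1252")  # a-umlaut is the single byte E4
utf8_name = "Filen\u00e4me.jpg".encode("utf-8")   # a-umlaut is the byte pair C3 A4

# An ANSI name containing the two characters U+00C3 and U+00A4 produces
# exactly the bytes C3 A4, so it cannot be told apart from the UTF-8 name.
ambiguous = "Filen\u00c3\u00a4me.jpg".encode("cp1252")

print(is_valid_utf8(ansi_name))   # False: E4 is not followed by continuation bytes
print(is_valid_utf8(utf8_name))   # True
print(ambiguous == utf8_name)     # True: same bytes, two possible readings
```

      In practice the ambiguous case needs an unlikely character combination in a file name, which is why many programs treat "valid UTF-8" as a good enough signal.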



        #4
        See also: "Handling Unicode Text Files from Irfanview v4.50 using a Visual Basic script"
        Last edited by Bhikkhu Pesala; 07.07.2018, 06:21 AM.