'BLOB is not a valid UTF-8 string' error?

  • I'm reading in a document that the user uploads from a Visualforce page:

    <apex:inputFile value="{!contentFile}" filename="{!nameFile}" id="file"/>
    

    It is accessed in Apex like this:

    String nameFile = contentFile.toString();
    

    This works like a charm: I can parse the document and extract all the information I need, but only for English users. For Spanish users it fails.

    Their files contain special characters, which cause a 'BLOB is not a valid UTF-8 string' error.

    I've tried to base64Encode the file contents, but the results come out illegible.

    String nameFile= EncodingUtil.base64Encode(contentFile);
    

    Can you try ensuring the file is saved in UTF-8 format? For example, in Windows Notepad make sure the encoding selected in the Save As dialog is UTF-8.

    Yes, this works, great. But I use the Windows command line (cmd) to generate this file on the user's side, and I can't seem to find a way to save in this format using cmd. I guess my question is now far from Apex and Java, but you just got me one step closer to a solution.

    Great, I've added an answer for future reference and also found an answer regarding command line that might be of some use.

  • This does not appear to be clearly documented, but having scanned a number of forum posts about it, the Blob type's toString() conversion only supports UTF-8 encoded strings. You must ensure that the file you upload uses this encoding; otherwise you will get this error whenever the file contains special characters.

    To ensure your file is UTF-8, in Windows Notepad for example, select UTF-8 as the encoding in the Save As dialog. Other applications will likely have a similar option (for the command line, this answer might be of use).

    Yes, but the problem is how to automate such a process.

    I face a similar problem. Users take a downloadable CSV file as a template, key in details, and upload it from a VF page. Data Loader doesn't work for us, because it's not really meant for end users. But even if the CSV template is saved as UTF-8, when users edit it in Excel and save it, it becomes ANSI again. So yes, the problem is how to automate storing the CSV in UTF-8.
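
    Where the upload cannot be forced into UTF-8 on the user's side, one workaround sometimes suggested is to decode the bytes on the server instead. The following is a minimal sketch, not a confirmed fix from this thread: the class name BlobDecoder and the method decode are hypothetical, and the fallback charset 'ISO-8859-1' is an assumption about what "ANSI" means for these files (Windows ANSI in Western Europe is usually Windows-1252, which agrees with ISO-8859-1 for Spanish accented letters but not for characters such as the euro sign or curly quotes). It first tries the normal UTF-8 conversion, and only if that throws does it percent-encode each byte and let EncodingUtil.urlDecode interpret the bytes in the fallback charset.

    ```apex
    public class BlobDecoder {
        // Hypothetical helper: returns the Blob as text, trying UTF-8 first and
        // falling back to the given single-byte charset (an assumption about the file).
        public static String decode(Blob content, String fallbackCharset) {
            try {
                // Succeeds only when the bytes are valid UTF-8.
                return content.toString();
            } catch (StringException e) {
                // 'BLOB is not a valid UTF-8 string' lands here.
                // Turn every byte into a %XX escape ...
                String hex = EncodingUtil.convertToHex(content);
                List<String> escaped = new List<String>();
                for (Integer i = 0; i < hex.length(); i += 2) {
                    escaped.add('%' + hex.substring(i, i + 2));
                }
                // ... and let urlDecode read those bytes in the fallback charset.
                return EncodingUtil.urlDecode(String.join(escaped, ''), fallbackCharset);
            }
        }
    }
    ```

    In the controller from the question, the call would then look like String nameFile = BlobDecoder.decode(contentFile, 'ISO-8859-1');. The hex string and the list of escapes are several times the size of the file, so this approach is likely to run into heap or CPU limits on large uploads and should be tried on realistic file sizes first.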

Licensed under CC-BY-SA with attribution


Content dated before 7/24/2021 11:53 AM
