
With the changes that Python 3 has brought to the handling of character encodings, I have previously written up some tips that I use in my day-to-day work. It is sometimes useful to determine the character encoding of a file at a much earlier stage. The command line is a perfect tool to help us with this.
The basic syntax you need is the following (the capital -I is the macOS spelling; with GNU file on Linux the equivalent option is a lowercase -i):
$ file -I filename
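If the file command is not at hand, a rough check can be done from Python itself. The snippet below is only a minimal sketch and not what file does internally: guess_encoding is a hypothetical helper, and the list of candidate encodings is my own assumption. It simply tries each candidate in turn and returns the first one that decodes the bytes without error. Bear in mind that iso-8859-1 accepts any byte sequence, so it only makes sense as the last fallback.

```python
def guess_encoding(path, candidates=("ascii", "utf-8", "iso-8859-1")):
    """Return the first candidate encoding that decodes the file cleanly.

    Hypothetical helper: a heuristic, not a real detector. The order of
    the candidates matters, because iso-8859-1 never fails to decode.
    """
    with open(path, "rb") as handle:
        data = handle.read()
    for encoding in candidates:
        try:
            data.decode(encoding)
        except UnicodeDecodeError:
            # These bytes are not valid in this encoding; try the next one.
            continue
        return encoding
    # None of the candidates could decode the file.
    return None
```

A successful decode only tells you the bytes are valid in that encoding, not that it is the one the author actually used, so treat the answer as a hint.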
You can also use the command line to convert a file from one encoding to another; the converted text is written to standard output. The syntax is as follows:
$ iconv -f encoding_source -t encoding_target filename
For instance, if you needed to convert an ISO-8859-1 file called input.txt into UTF-8, you could use the following line:
$ iconv -f iso-8859-1 -t utf-8 < input.txt > output.txt
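For comparison, the same conversion can be sketched in Python 3. This is a minimal illustration rather than a replacement for iconv: convert_encoding is a hypothetical helper name, and the default encodings simply mirror the example above. It reads the raw bytes, decodes them with the source encoding, and writes them back out in the target encoding.

```python
from pathlib import Path


def convert_encoding(src, dst, source_encoding="iso-8859-1",
                     target_encoding="utf-8"):
    """Re-encode the text in src and write the result to dst.

    Hypothetical helper; the defaults mirror the iconv example.
    """
    # Read raw bytes so that no implicit decoding or newline
    # translation happens behind our back.
    raw = Path(src).read_bytes()
    # Decode with the encoding the file is actually in...
    text = raw.decode(source_encoding)
    # ...and encode the resulting string with the encoding we want.
    Path(dst).write_bytes(text.encode(target_encoding))


if __name__ == "__main__":
    # Assumes input.txt exists in the current directory.
    convert_encoding("input.txt", "output.txt")
```

If the source encoding is wrong, decode raises a UnicodeDecodeError for strict encodings, or silently produces the wrong characters for permissive ones such as iso-8859-1, so it pays to check the encoding first.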
If you want to see the list of character encodings that this command can handle, simply type:
$ iconv --list
Et voilà!